In OpenStack Kilo with RabbitMQ, you may periodically run across services that simply do not start and register correctly. There is nothing wrong with the service configuration, and no matter how many times you restart them, they simply don’t come back. (I frequently hit this issue with nova-compute and nova-network on my compute nodes.) If you check the service log, you’ll probably notice a lot of entries like this, recurring roughly every 90 seconds:

If you’re lucky enough to have debug and verbose modes enabled, you might also catch this:

What’s happening is that OpenStack, via the oslo.messaging subsystem, is trying to create a queue that already exists, because it wasn’t cleaned up previously. This procedure has awful, terrible error handling (but you’re running OpenStack, so you already knew that everything in OpenStack has awful, terrible error handling). It assumes the failure must be a race condition and retries the queue creation indefinitely. The solution is to delete the stale queue and then restart the service, so the service can create the queue correctly.

Delivery queues in OpenStack are named <subsystem>.<nodename>, so if your compute node is named compute-1.mydomain.com, your nova-compute and nova-network queues will be named compute.compute-1.mydomain.com and network.compute-1.mydomain.com, respectively. (Nova services don’t prefix their queue names with nova-; other services generally do.)
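To make the naming rule concrete, here is a minimal sketch that derives the queue name from a Nova service name and a node name. The `queue_name` helper is hypothetical (it is not part of OpenStack or RabbitMQ), and it only covers the Nova case described above, where the nova- prefix is dropped.

```shell
#!/bin/sh
# Hypothetical helper: build the delivery queue name for a Nova service.
# Assumes the <subsystem>.<nodename> pattern described above.
queue_name() {
    service="$1"   # e.g. nova-compute
    node="$2"      # e.g. compute-1.mydomain.com
    # Nova services drop the "nova-" prefix from the subsystem part.
    printf '%s.%s\n' "${service#nova-}" "$node"
}

queue_name nova-compute compute-1.mydomain.com
queue_name nova-network compute-1.mydomain.com
```

For the example node, this prints compute.compute-1.mydomain.com and network.compute-1.mydomain.com.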

If you prefer the GUI, which is easy enough for small deployments, you can find the queue under the Queues tab of the RabbitMQ management GUI and delete it there. If you prefer the CLI (the GUI becomes entirely unusable past a few thousand queues), use rabbitmqadmin to delete the queue by name.
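As a rough sketch of the CLI route, the deletion might look like the following. The `delete_stale_queue` wrapper and its `DRY_RUN` switch are hypothetical conveniences for this example, not rabbitmqadmin features; the wrapper prints the command by default so you can check the queue name before deleting anything. It also assumes rabbitmqadmin can reach the broker with its default connection settings; otherwise you would need to add options such as -H, -u, -p, or -V for your host, credentials, and vhost.

```shell
#!/bin/sh
# Sketch: delete a stale delivery queue with rabbitmqadmin.
# DRY_RUN is a hypothetical switch for this example (default: 1, print only).
delete_stale_queue() {
    queue="$1"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Show the command that would be run, without touching the broker.
        echo "rabbitmqadmin delete queue name=$queue"
    else
        rabbitmqadmin delete queue name="$queue"
    fi
}

delete_stale_queue compute.compute-1.mydomain.com
delete_stale_queue network.compute-1.mydomain.com
```

Run it with DRY_RUN=0 in the environment once the printed commands look right for your node.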

Afterwards, restart the service, and you should see it functioning normally.