What do I do if a RabbitMQ instance dies?

April 26, 2017 (tags: instance, rabbitmq)

1 Answer

It depends on how and why it died, of course. First, distinguish between a dead machine and a partitioned network.

One quick fix for a truly dead node is to get a backup machine, reinstall the OS and RabbitMQ, and then restart RabbitMQ with the contents of the old Mnesia data directory (if the disk is still OK, you could just try slotting it into the new machine). Make sure that the backup machine has the same name as the machine that died. If this works, you are in luck.

If not, i.e. Mnesia does not seem to be recovering (it hangs with the waiting_for_tables error message), you can nuke the Mnesia directory, bring the node up as part of the cluster, and let it replicate its state from the other cluster members. Note that this will not restart queue processes that were running on this node before it crashed, but you can simply re-declare the queues.
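The fallback path (clear the Mnesia directory, then rejoin the cluster) could be scripted roughly as follows. This is only a sketch under assumptions: the peer node name `rabbit@node1` and the helper names are hypothetical, the Mnesia directory location depends on your installation, and the node must be stopped before the directory is touched and started again before the `rabbitmqctl` commands are run. The directory is renamed rather than deleted, which is a cautious variation on "nuking" it.

```python
import subprocess
import time
from pathlib import Path


def set_aside_mnesia_dir(mnesia_dir):
    """Rename the stale Mnesia directory so the node starts with empty state.

    Renaming instead of deleting keeps a copy in case it is needed later.
    Run this only while the RabbitMQ node is stopped.
    """
    src = Path(mnesia_dir)
    backup = src.with_name(f"{src.name}.bak-{int(time.time())}")
    src.rename(backup)
    return backup


def rejoin_cluster_commands(peer_node):
    """Return the rabbitmqctl calls that join this (running) node to a cluster.

    peer_node is the name of a healthy cluster member (hypothetical example:
    "rabbit@node1").
    """
    return [
        ["rabbitmqctl", "stop_app"],
        ["rabbitmqctl", "join_cluster", peer_node],
        ["rabbitmqctl", "start_app"],
    ]


def run_commands(commands, dry_run=True):
    """Print each command; execute it as well when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Dry run: only shows what would be executed on the recovered node.
    run_commands(rejoin_cluster_commands("rabbit@node1"))
```

Depending on the RabbitMQ version and on how the cluster remembers the dead node, additional steps may be needed on the surviving members. Queues that lived on the failed node still have to be re-declared afterwards, as noted above.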