RabbitMQ Clustering

In the last post of this series, we learned how to install/configure RMQ for vCloud Director. This post is an extension of my last post where I will be adding one more node to my RMQ setup to form a cluster for high availability.

What data is replicated in an RMQ Cluster?

All data/state required for the operation of a RabbitMQ broker is replicated across all nodes. An exception to this is message queues, which by default reside on one node, though they are visible and reachable from all nodes. To replicate queues across nodes in a cluster, see the documentation on high availability.

Note: Before proceeding with cluster formation, please ensure the following:

1: Use the same versions of Erlang and the RMQ server rpm as are installed on the master node.

2: RMQ nodes address each other using domain names, either short or FQDNs. Therefore hostnames of all cluster members must be resolvable from all cluster nodes, as well as machines on which command line tools such as rabbitmqctl might be used.

Here are the high-level steps for cluster formation:

  • Have a single node running (rmqsrv01).
  • Stop another node (rmqsrv02).
  • Reset the stopped node (rabbit@rmqsrv02).
  • Cluster the other node to the root node.
  • Start the stopped node.

Step 1) Install the same version of Erlang and rabbitmq-server rpm that is installed on the master node. The steps for doing so are documented in my previous post.

Step 2) Copy the Erlang cookie from the master node to the second node.

RabbitMQ nodes and CLI tools (e.g. rabbitmqctl) use a cookie to determine whether they are allowed to communicate with each other. For two nodes to be able to communicate they must have the same shared secret called the Erlang cookie. The cookie is just a string of alphanumeric characters. Every cluster node must have the same cookie.

On Linux systems, the cookie is typically located in /var/lib/rabbitmq/.erlang.cookie or $HOME/.erlang.cookie.

Copy the Erlang cookie file to directory /var/lib/rabbitmq on the 2nd node.

Additionally, you can verify the md5sum of the cookie on both nodes. The two checksums must be identical.
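The check can be sketched with a tiny helper like the one below. This is only an illustration: the function name cookie_md5 is my own, and the default path assumes the usual Linux location mentioned above.

```shell
# Print only the MD5 checksum of an Erlang cookie file.
# Assumption: the cookie lives in the usual Linux location unless a path is given.
cookie_md5() {
    md5sum "${1:-/var/lib/rabbitmq/.erlang.cookie}" | awk '{print $1}'
}
```

Run cookie_md5 on both rmqsrv01 and rmqsrv02 and compare the output. If the values differ, copy the cookie again before going any further.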

Step 3) Stop the RMQ app and reset the node.
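A minimal sketch of this step, assuming you are logged in to the second node (rmqsrv02) as root. The wrapper function reset_node and the RABBITMQCTL variable are my own additions for illustration; stop_app and reset are the standard rabbitmqctl commands.

```shell
# Assumption: run on the node that is about to join the cluster (rmqsrv02).
# RABBITMQCTL can be overridden, e.g. with "echo rabbitmqctl", for a dry run.
RABBITMQCTL="${RABBITMQCTL:-rabbitmqctl}"

reset_node() {
    # Stop the RabbitMQ application; the Erlang VM itself keeps running.
    $RABBITMQCTL stop_app &&
    # Return the node to a blank state. This wipes its local data and config.
    $RABBITMQCTL reset
}
```

Calling reset_node on rmqsrv02 leaves it stopped and ready to be joined to the cluster. Do not run it on the master node, because reset removes the node's existing data.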

At this point, if you check the cluster status, you will see only one node listed as a disc node, and running_nodes will also list only one node, i.e. the node on which you are checking the status.

Step 4) Add the second node to the cluster.
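Joining could look roughly like this. It assumes the master node is named rabbit@rmqsrv01 (the default rabbit@hostname naming) and that the second node has already been stopped and reset. The function name join_node is hypothetical.

```shell
# Assumption: run on rmqsrv02 after stop_app and reset.
# RABBITMQCTL can be overridden, e.g. with "echo rabbitmqctl", for a dry run.
RABBITMQCTL="${RABBITMQCTL:-rabbitmqctl}"

# $1 is the name of a node that is already in the cluster.
join_node() {
    # Join the cluster that the given node belongs to.
    $RABBITMQCTL join_cluster "$1" &&
    # Start the RabbitMQ application again on this node.
    $RABBITMQCTL start_app &&
    # Print the cluster members and the currently running nodes.
    $RABBITMQCTL cluster_status
}
```

For example, join_node rabbit@rmqsrv01 joins the local node to the cluster, starts it and prints the resulting status.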

Now if you check the cluster status, you will see both nodes are listed as disc nodes.

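Note: By default, a node joins the cluster as a disc node. You can instead attach a node as a RAM node by passing the --ram flag to join_cluster. A RAM node keeps the cluster's internal database tables (definitions of queues, exchanges, users and so on) in memory only. The node type does not change where the messages themselves are stored.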
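As a sketch, attaching a node as a RAM node differs from the normal join only by the --ram flag. The function name join_as_ram_node is made up for this example, and the node is assumed to have been reset as in Step 3.

```shell
# Assumption: run on the node that should become a RAM node.
# RABBITMQCTL can be overridden, e.g. with "echo rabbitmqctl", for a dry run.
RABBITMQCTL="${RABBITMQCTL:-rabbitmqctl}"

# $1 is the name of a node that is already in the cluster.
join_as_ram_node() {
    $RABBITMQCTL stop_app &&
    # --ram makes this node a RAM node instead of the default disc node.
    $RABBITMQCTL join_cluster --ram "$1" &&
    $RABBITMQCTL start_app
}
```

Usage would be join_as_ram_node rabbit@rmqsrv01.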

The node type can also be changed later with the rabbitmqctl command change_cluster_node_type, passing either disc or ram as the type. The node must be stopped (stop_app) before its type is changed.
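Here is a rough sketch of changing the type of an existing cluster member. The wrapper change_node_type and its argument check are my own; the three rabbitmqctl commands inside it are the standard ones.

```shell
# Assumption: run on the node whose type should change.
# RABBITMQCTL can be overridden, e.g. with "echo rabbitmqctl", for a dry run.
RABBITMQCTL="${RABBITMQCTL:-rabbitmqctl}"

# $1 must be either "disc" or "ram".
change_node_type() {
    case "$1" in
        disc|ram) ;;
        *) echo "usage: change_node_type disc|ram" >&2; return 1 ;;
    esac
    # The node has to be stopped before its type can be changed.
    $RABBITMQCTL stop_app &&
    $RABBITMQCTL change_cluster_node_type "$1" &&
    $RABBITMQCTL start_app
}
```

For example, change_node_type ram turns the local node into a RAM node, and change_node_type disc turns it back.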

It is recommended to have at least one disc node in the cluster so that the cluster state is kept on persistent disk and is not lost in case of a restart or a disaster.

Set the HA Policy

Setting an HA policy with rabbitmqctl set_policy mirrors and syncs all the queues across all nodes.
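A possible form of that policy is sketched below. The policy name ha-all and the ".*" pattern (match every queue) are my assumptions, so adjust them to your setup. Also note that ha-mode policies apply to classic mirrored queues, which were removed in RabbitMQ 4.0, so check this against the version you are running.

```shell
# Assumption: run on any running cluster node.
# RABBITMQCTL can be overridden, e.g. with "echo rabbitmqctl", for a dry run.
RABBITMQCTL="${RABBITMQCTL:-rabbitmqctl}"

# $1 is the policy name (defaults to the illustrative name "ha-all").
set_ha_policy() {
    # ".*" matches every queue; ha-mode=all mirrors to all nodes and
    # ha-sync-mode=automatic syncs new mirrors without manual action.
    $RABBITMQCTL set_policy "${1:-ha-all}" ".*" \
        '{"ha-mode":"all","ha-sync-mode":"automatic"}'
}
```

After calling set_ha_policy, rabbitmqctl list_policies should show the new policy.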

I hope you found this post informative. Feel free to share it on social media if you think it is worth sharing. Be sociable 🙂
