In the last post of this series we learned how to install and configure RMQ for vCloud Director. This post extends that setup: I will be adding one more node to my RMQ installation to form a cluster for high availability.

What data is replicated in an RMQ cluster?

All data/state required for the operation of a RabbitMQ broker is replicated across all nodes. The exception is message queues, which by default reside on one node, though they are visible and reachable from all nodes. To replicate queues across nodes in a cluster, see the documentation on high availability.

Note: Before proceeding with cluster formation, please ensure the following:

1: Use the same version of the Erlang and RMQ server RPMs that are installed on the master node.

2: RMQ nodes address each other using domain names, either short names or FQDNs. Therefore the hostnames of all cluster members must be resolvable from all cluster nodes, as well as from any machines on which command line tools such as rabbitmqctl are used.
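
A quick way to verify resolution before going further is a getent loop on each node (a sketch; rmqsrv01 and rmqsrv02 are the hostnames used in this post, so substitute your own):

```shell
# Run on every cluster member; each lookup should print an IP address.
# A warning here means clustering will fail for that peer.
for host in rmqsrv01 rmqsrv02; do
    getent hosts "$host" || echo "WARNING: $host does not resolve"
done
```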

The high-level steps for cluster formation are:

  • Have a single node running (rmqsrv01).
  • Stop the RabbitMQ application on the other node (rmqsrv02).
  • Reset the stopped node (rmqsrv02).
  • Join the stopped node to the cluster with the first node.
  • Start the stopped node.
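
On rmqsrv02, the sequence above boils down to a handful of rabbitmqctl calls (a sketch of what the detailed steps below perform; it assumes both nodes already share the same Erlang cookie, as covered in Step 2):

```shell
# Run on rmqsrv02, the node that is joining the cluster.
rabbitmqctl stop_app                        # stop the RabbitMQ application (the Erlang VM keeps running)
rabbitmqctl reset                           # wipe local state so the node can join a cluster
rabbitmqctl join_cluster rabbit@rmqsrv01    # join as a disc node (add --ram for a RAM node)
rabbitmqctl start_app                       # start the application again
rabbitmqctl cluster_status                  # confirm both nodes appear in the cluster
```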

Please follow the steps below for cluster formation.

Step 1) Install the same version of the Erlang and rabbitmq-server RPMs that are installed on the master node. The steps for doing so are documented in my previous post.

Step 2) Copy Erlang cookie from master node to node 2

RabbitMQ nodes and CLI tools (e.g. rabbitmqctl) use a cookie to determine whether they are allowed to communicate with each other. For two nodes to be able to communicate they must have the same shared secret called the Erlang cookie. The cookie is just a string of alphanumeric characters. Every cluster node must have the same cookie.

On Linux systems, the cookie is typically located in /var/lib/rabbitmq/.erlang.cookie or $HOME/.erlang.cookie.

[root@rmqsrv01 ~]# cat /var/lib/rabbitmq/.erlang.cookie
PKATXPIFMSTGSLHMVFPU

scp the .erlang.cookie file to the /var/lib/rabbitmq directory on the second node.
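
A minimal sketch of the copy, assuming root SSH access between the nodes; the ownership and permission fix afterwards matters, because Erlang refuses to use a cookie file that is readable by other users:

```shell
# Copy the cookie from the master node (rmqsrv01) to the second node.
scp /var/lib/rabbitmq/.erlang.cookie root@rmqsrv02:/var/lib/rabbitmq/.erlang.cookie

# On rmqsrv02: restore ownership to the rabbitmq user and lock down permissions.
ssh root@rmqsrv02 'chown rabbitmq:rabbitmq /var/lib/rabbitmq/.erlang.cookie && chmod 400 /var/lib/rabbitmq/.erlang.cookie'
```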

[root@rmqsrv02 ~]# cat /var/lib/rabbitmq/.erlang.cookie
PKATXPIFMSTGSLHMVFPU

Additionally you can verify the md5sum of the cookie:

[root@rmqsrv01 ~]# md5sum /var/lib/rabbitmq/.erlang.cookie
df97d981d5733d95f0c3191969bc234a /var/lib/rabbitmq/.erlang.cookie

[root@rmqsrv02 ~]# md5sum /var/lib/rabbitmq/.erlang.cookie
df97d981d5733d95f0c3191969bc234a /var/lib/rabbitmq/.erlang.cookie

Step 3) Stop RMQ app and reset the node.

[root@rmqsrv02 ~]# rabbitmqctl stop_app
Stopping rabbit application on node rabbit@rmqsrv02 ...

[root@rmqsrv02 ~]# rabbitmqctl reset
Resetting node rabbit@rmqsrv02 ...

At this point, if you check the cluster status, each node reports only itself as a disc node, and running_nodes lists only the node on which you run the command.

[root@rmqsrv01 ~]# rabbitmqctl cluster_status
Cluster status of node rabbit@rmqsrv01 ...
[{nodes,[{disc,[rabbit@rmqsrv01]}]},
{running_nodes,[rabbit@rmqsrv01]},
{cluster_name,<<"rabbit@rmqsrv01.alex.local">>},
{partitions,[]},
{alarms,[{rabbit@rmqsrv01,[]}]}]

[root@rmqsrv02 ~]# rabbitmqctl cluster_status
Cluster status of node rabbit@rmqsrv02 ...
[{nodes,[{disc,[rabbit@rmqsrv02]}]},
{running_nodes,[rabbit@rmqsrv02]},
{cluster_name,<<"rabbit@rmqsrv02.alex.local">>},
{partitions,[]},
{alarms,[{rabbit@rmqsrv02,[]}]}]

Step 4) Add node 2 to the cluster

[root@rmqsrv02 ~]# rabbitmqctl join_cluster --ram rabbit@rmqsrv01
Clustering node rabbit@rmqsrv02 with rabbit@rmqsrv01 ...

Start the application again on node 2 with rabbitmqctl start_app. If you now check the cluster status, you will see node 1 listed as a disc node and node 2 as a RAM node (we joined it with the --ram flag).

[root@rmqsrv01 ~]# rabbitmqctl cluster_status
Cluster status of node rabbit@rmqsrv01 ...
[{nodes,[{disc,[rabbit@rmqsrv01]},{ram,[rabbit@rmqsrv02]}]},
{running_nodes,[rabbit@rmqsrv02,rabbit@rmqsrv01]},
{cluster_name,<<"rabbit@rmqsrv01.alex.local">>},
{partitions,[]},
{alarms,[{rabbit@rmqsrv02,[nodedown]},{rabbit@rmqsrv01,[]}]}]

[root@rmqsrv02 rabbitmq]# rabbitmqctl cluster_status
Cluster status of node rabbit@rmqsrv02 ...
[{nodes,[{disc,[rabbit@rmqsrv01]},{ram,[rabbit@rmqsrv02]}]},
{running_nodes,[rabbit@rmqsrv01,rabbit@rmqsrv02]},
{cluster_name,<<"rabbit@rmqsrv01.alex.local">>},
{partitions,[]},
{alarms,[{rabbit@rmqsrv01,[]},{rabbit@rmqsrv02,[]}]}]

Note: By default, a node joins the cluster as a disc node, which persists cluster state to disk. You can instead attach a node to the cluster as a RAM node, which keeps that state in memory only:

# rabbitmqctl stop_app

# rabbitmqctl join_cluster --ram rabbit@rmqsrv02

The node type can be changed later with the change_cluster_node_type command, passing either disc or ram as the target type:

# rabbitmqctl change_cluster_node_type disc | ram
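
For example, converting rmqsrv02 from a RAM node to a disc node would look like this (a sketch; the RabbitMQ application must be stopped on the node while its type is changed):

```shell
# Run on the node whose type is being changed (rmqsrv02 here).
rabbitmqctl stop_app
rabbitmqctl change_cluster_node_type disc
rabbitmqctl start_app
```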

It is recommended to have at least one disc node in the cluster so that cluster state is stored on persistent disk and is not lost in case of a disaster.

Set the HA Policy

The following command mirrors all queues across all nodes and synchronizes them automatically:

[root@rmqsrv01 ~]# rabbitmqctl set_policy ha-all "" '{"ha-mode":"all","ha-sync-mode":"automatic"}'

Setting policy "ha-all" for pattern [] to "{\"ha-mode\":\"all\",\"ha-sync-mode\":\"automatic\"}" with priority "0" ...
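
You can confirm the policy took effect with list_policies (a sketch; the exact output columns vary by RabbitMQ version):

```shell
# Lists the policies on the default vhost; ha-all should appear
# with ha-mode "all" and ha-sync-mode "automatic".
rabbitmqctl list_policies
```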

Posted in: Vmware.
Last Modified: April 3, 2017
