RabbitMQ Concepts and Environment Setup (3): RabbitMQ Cluster

Test environment: VMS00781, VMS00782, VMS00386 (CentOS 5.8)

1. Install RabbitMQ Server on each of the three machines

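2. Read the cookie from one node and copy it to the other nodes (nodes use the cookie to decide whether they are allowed to communicate with each other)

The cookie is stored in either of these two files: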

sudo vim /var/lib/rabbitmq/.erlang.cookie

sudo vim $HOME/.erlang.cookie
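Because a mismatched cookie makes the later join step fail, it can help to compare the files before going further. The following is a minimal sketch, not part of the original procedure: `cookies_match` is a hypothetical helper name, and it assumes the other node's cookie has already been copied to a local temporary path (for example with scp).

```shell
# Hypothetical helper: succeeds only if the two cookie files are byte-identical.
# Example (the second path is a placeholder for a copy fetched from another node):
#   cookies_match /var/lib/rabbitmq/.erlang.cookie /tmp/cookie_from_other_node \
#       && echo "cookies match"
cookies_match() {
    # cmp -s prints nothing and reports the result through its exit status
    cmp -s "$1" "$2"
}
```

Reading /var/lib/rabbitmq/.erlang.cookie normally requires root or the rabbitmq user, so the helper would typically be run under sudo.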

3. Start the nodes one by one

sudo service rabbitmq-server start

4. Check the RabbitMQ broker on each node

sudo rabbitmqctl cluster_status

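5. Build the cluster

Run one of the following command groups on each of VMS00386 and VMS00782 (with --ram the node joins as a RAM node; without it, as a disc node)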

sudo rabbitmqctl stop_app

sudo rabbitmqctl join_cluster --ram [email protected]

sudo rabbitmqctl start_app

sudo rabbitmqctl stop_app

sudo rabbitmqctl join_cluster [email protected]

sudo rabbitmqctl start_app
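The two command groups above differ only in the --ram flag, so they can be folded into one function. This is a sketch under assumptions: the function name `join_cluster_as` and the `RABBITMQCTL` override variable are hypothetical and exist only so the sequence can be dry-run with `RABBITMQCTL=echo`. The rabbitmqctl subcommands are the ones used above.

```shell
# Hypothetical wrapper around stop_app / join_cluster / start_app.
# Set RABBITMQCTL=echo to print the subcommands instead of running them.
join_cluster_as() {
    jc_target=$1                              # node to join (placeholder name)
    jc_type=${2:-disc}                        # "disc" (default) or "ram"
    jc_ctl=${RABBITMQCTL:-sudo rabbitmqctl}
    case "$jc_type" in
        ram)  jc_flag="--ram" ;;
        disc) jc_flag="" ;;
        *)    echo "node type must be disc or ram" >&2; return 1 ;;
    esac
    # $jc_ctl and $jc_flag are left unquoted on purpose so they split into words
    $jc_ctl stop_app &&
        $jc_ctl join_cluster $jc_flag "$jc_target" &&
        $jc_ctl start_app
}
```

Chaining with && stops the sequence at the first failing step, so start_app is not attempted after a failed join.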

6. Troubleshooting

The following error came up while building the cluster:

sudo rabbitmqctl join_cluster --ram [email protected]

Clustering node [email protected] with [email protected] ...

Error: unable to connect to nodes [[email protected]]: nodedown

DIAGNOSTICS

===========

attempted to contact: [[email protected]]

[email protected]:

  * unable to connect to epmd (port 4369) on VMS00386: nxdomain (non-existing domain)

current node details:

- node name: '[email protected]'

- home dir: /var/lib/rabbitmq

- cookie hash: 50YO3zK+HJHos0tab1vHjg==

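Solution:

Cluster nodes must be able to reach one another, so the hosts file on every node should contain an entry for every node in the cluster so that all of the names resolve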

vim /etc/hosts

781's IP  VMS00781

782's IP  VMS00782

386's IP  vms00386

Then restart RabbitMQ on each node
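Since the nxdomain error above comes from a missing hosts entry, a quick check of the file on each node can catch the problem before joining. This is a minimal sketch: `missing_hosts` is a hypothetical helper, and it assumes names are resolved through the hosts file rather than DNS.

```shell
# Hypothetical helper: print every given hostname that has no entry in a hosts file.
# Usage: missing_hosts /etc/hosts VMS00781 VMS00782 VMS00386
missing_hosts() {
    mh_file=$1
    shift
    for mh_name in "$@"; do
        # Skip comment lines, then compare every name field case-insensitively
        awk -v want="$mh_name" '
            /^[[:space:]]*#/ { next }
            { for (i = 2; i <= NF; i++) if (tolower($i) == tolower(want)) found = 1 }
            END { exit found ? 0 : 1 }
        ' "$mh_file" || printf '%s\n' "$mh_name"
    done
}
```

An empty result means every listed node has an entry; any name printed still needs a line in that node's hosts file.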

7. Other problems

Error: mnesia_unexpectedly_running

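Cause: stop_app was not run first

Fix: sudo rabbitmqctl stop_app

If the hostname cannot be resolved, or has changed, after rabbitmq-server was first started, the server fails to start

In that case, delete the Mnesia directory, where this information is recorded (this also discards the data stored on that node):

sudo rm -rf /var/lib/rabbitmq/mnesia

Then reinstall RabbitMQ Server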

#####################################################

RabbitMQ cluster management

#####################################################

1. Check cluster status

This can be run on any node in the cluster

sudo rabbitmqctl cluster_status

2. Change the node type (RAM or disc); run only one of the two change_cluster_node_type commands

sudo rabbitmqctl stop_app

sudo rabbitmqctl change_cluster_node_type disc

sudo rabbitmqctl change_cluster_node_type ram

sudo rabbitmqctl start_app

3. Restart nodes in the cluster

Stopping a node, or a node going down, does not affect the remaining nodes

[[email protected] ~]$ sudo rabbitmqctl stop

Stopping and halting node [email protected] ...

[[email protected] ~]$ sudo rabbitmqctl cluster_status

Cluster status of node [email protected] ...

[{nodes,[{disc,[[email protected],[email protected],[email protected]]}]},

 {running_nodes,[[email protected],[email protected]]},

 {cluster_name,<<"[email protected]">>},

 {partitions,[]}]

[[email protected] ~]$ sudo rabbitmqctl cluster_status

Cluster status of node [email protected] ...

[{nodes,[{disc,[[email protected],[email protected],[email protected]]}]},

 {running_nodes,[[email protected],[email protected]]},

 {cluster_name,<<"[email protected]">>},

 {partitions,[]}]

[[email protected] ~]$ sudo rabbitmqctl stop

Stopping and halting node [email protected] ...

[[email protected] ~]$ sudo rabbitmqctl cluster_status

Cluster status of node [email protected] ...

[{nodes,[{disc,[[email protected],[email protected],[email protected]]}]},

 {running_nodes,[[email protected]]},

 {cluster_name,<<"[email protected]">>},

 {partitions,[]}]

Once a node is restarted, it automatically catches up with the other nodes

[[email protected] ~]$ sudo service rabbitmq-server start

Starting rabbitmq-server: SUCCESS

rabbitmq-server.

[[email protected] ~]$ sudo rabbitmqctl cluster_status

Cluster status of node [email protected] ...

[{nodes,[{disc,[[email protected],[email protected],[email protected]]}]},

 {running_nodes,[[email protected],[email protected]]},

 {cluster_name,<<"[email protected]">>},

 {partitions,[]}]

[[email protected] ~]$ sudo service rabbitmq-server start

Starting rabbitmq-server: SUCCESS

rabbitmq-server.

[[email protected] ~]$ sudo rabbitmqctl cluster_status

Cluster status of node [email protected] ...

[{nodes,[{disc,[[email protected],[email protected],[email protected]]}]},

 {running_nodes,[[email protected],[email protected],[email protected]]},

 {cluster_name,<<"[email protected]">>},

 {partitions,[]}]

[[email protected] ~]$ sudo rabbitmqctl cluster_status

Cluster status of node [email protected] ...

[{nodes,[{disc,[[email protected],[email protected],[email protected]]}]},

 {running_nodes,[[email protected],[email protected],[email protected]]},

 {cluster_name,<<"[email protected]">>},

 {partitions,[]}]

[[email protected] ~]$ sudo rabbitmqctl cluster_status

Cluster status of node [email protected] ...

[{nodes,[{disc,[[email protected],[email protected],[email protected]]}]},

 {running_nodes,[[email protected],[email protected],[email protected]]},

 {cluster_name,<<"[email protected]">>},

 {partitions,[]}]

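A few notes:

Keep at least one disc node in the cluster to prevent data loss; take particular care with this when changing node types.

If the whole cluster has been stopped, the last node to go down must be the first one started; if that is not possible, use the forget_cluster_node command to remove it from the cluster.

If the nodes went down almost simultaneously in an uncontrolled way, use the force_boot command on one of the nodes to restart it.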
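When following the restart sequence above, the useful information in each cluster_status output is the set of nodes that are cluster members but not currently running. The sketch below extracts that set. `stopped_nodes` is a hypothetical helper, and it assumes the Erlang-term output format shown in this section, with the nodes list and the running_nodes list each on a single line.

```shell
# Hypothetical helper: read `rabbitmqctl cluster_status` output on stdin and
# print the nodes listed under nodes that are missing from running_nodes.
# Usage: sudo rabbitmqctl cluster_status | stopped_nodes
stopped_nodes() {
    sn_status=$(cat)
    sn_pat='[A-Za-z0-9_.-]*@[A-Za-z0-9_.-]*'   # matches name@host tokens
    sn_all=$(printf '%s\n' "$sn_status" | grep -F '{nodes,' | grep -o "$sn_pat")
    sn_running=$(printf '%s\n' "$sn_status" | grep -F '{running_nodes,' | grep -o "$sn_pat")
    for sn_node in $sn_all; do
        # Print the node only if it does not appear as a whole line in the running list
        printf '%s\n' "$sn_running" | grep -qxF -- "$sn_node" || printf '%s\n' "$sn_node"
    done
}
```

Very long node lists may wrap across lines in real output, which this sketch does not handle.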

4. Remove a node from the cluster

[[email protected] ~]$ sudo rabbitmqctl stop_app

Stopping node [email protected] ...

[[email protected] ~]$ sudo rabbitmqctl reset

Resetting node [email protected] ...

[[email protected] ~]$ sudo rabbitmqctl start_app

Starting node [email protected] ...

[[email protected] ~]$ sudo rabbitmqctl cluster_status

Cluster status of node [email protected] ...

[{nodes,[{disc,[[email protected]]}]},

 {running_nodes,[[email protected]]},

 {cluster_name,<<"[email protected]">>},

 {partitions,[]}]

[[email protected] ~]$ sudo rabbitmqctl cluster_status

Cluster status of node [email protected] ...

[{nodes,[{disc,[[email protected],[email protected]]}]},

 {running_nodes,[[email protected],[email protected]]},

 {cluster_name,<<"[email protected]">>},

 {partitions,[]}]

[[email protected] ~]$ sudo rabbitmqctl cluster_status

Cluster status of node [email protected] ...

[{nodes,[{disc,[[email protected],[email protected]]}]},

 {running_nodes,[[email protected],[email protected]]},

 {cluster_name,<<"[email protected]">>},

 {partitions,[]}]

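As shown, [email protected] has become a standalone node, and only [email protected] and [email protected] remain in the original cluster

A node can also be removed from the cluster by running the command on another node

For example, continuing on [email protected], remove [email protected]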

[[email protected] ~]$ sudo rabbitmqctl forget_cluster_node [email protected]

Removing node [email protected] from cluster ...

[[email protected] ~]$ sudo rabbitmqctl cluster_status

Cluster status of node [email protected] ...

[{nodes,[{disc,[[email protected]]}]},

 {running_nodes,[[email protected]]},

 {cluster_name,<<"[email protected]">>},

 {partitions,[]}]

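As shown, only the single node [email protected] remains in the cluster

There is a catch here: a node that was removed remotely from another node still believes it belongs to the cluster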

[[email protected] ~]$ sudo rabbitmqctl start_app

Starting node [email protected] ...

BOOT FAILED

===========

Error description:

  {error,{inconsistent_cluster,"Node [email protected] thinks it's clustered with node [email protected], but [email protected] disagrees"}}

Log files (may contain more information):

  /var/log/rabbitmq/[email protected]

  /var/log/rabbitmq/[email protected]

Stack trace:

  [{rabbit_mnesia,check_cluster_consistency,0},

    {rabbit,'-start/0-fun-0-',0},

    {rabbit,start_it,1},

    {rpc,'-handle_call_call/6-fun-0-',5}]

Error: {rabbit,failure_during_boot,

          {error,

              {inconsistent_cluster,

                  "Node [email protected] thinks it's clustered with node [email protected], but [email protected] disagrees"}}}

The node needs to be reset

[[email protected] ~]$ sudo rabbitmqctl reset

Resetting node [email protected] ...

[[email protected] ~]$ sudo rabbitmqctl start_app

Starting node [email protected] ...

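At this point all three nodes have become standalone nodes

Of these, [email protected] and [email protected] have been reset to fresh RabbitMQ brokers, while [email protected] still retains leftover state from the original cluster; it can be reset with the following steps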

[[email protected] ~]$ sudo rabbitmqctl stop_app

Stopping node [email protected] ...

[[email protected] ~]$ sudo rabbitmqctl reset

Resetting node [email protected] ...

[[email protected] ~]$ sudo rabbitmqctl start_app

Starting node [email protected] ...
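The stop_app, reset, start_app sequence appears several times in this section, both for leaving a cluster and for recovering from the inconsistent_cluster boot failure. A minimal sketch of it as one function follows. The name `reset_node` and the `RABBITMQCTL` override are hypothetical, and the override exists only to allow a dry run. Note that reset returns the node to a blank state, so any data stored on it is lost.

```shell
# Hypothetical wrapper: return the local node to a fresh, standalone state.
# Set RABBITMQCTL=echo to print the subcommands instead of running them.
reset_node() {
    rn_ctl=${RABBITMQCTL:-sudo rabbitmqctl}
    # $rn_ctl is left unquoted on purpose so "sudo rabbitmqctl" splits into words
    $rn_ctl stop_app &&
        $rn_ctl reset &&
        $rn_ctl start_app
}
```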

5. Configure the cluster automatically

This is done through the configuration file rather than the command-line tool

First, reset each node

[[email protected] ~]$ sudo rabbitmqctl stop_app

Stopping node [email protected] ...

[[email protected] ~]$ sudo rabbitmqctl reset

Resetting node [email protected] ...

...

Next, adjust the configuration file

[{rabbit,

  [{cluster_nodes, {['rab[email protected]', '[email protected]', '[email protected]'], disc}}]}].

...
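Every node needs the same cluster_nodes entry, so generating it from one node list avoids typos in hand-edited files. This sketch only prints the Erlang term in the form shown above. `cluster_nodes_config` is a hypothetical helper, and the node names in the usage line are placeholders.

```shell
# Hypothetical helper: print a cluster_nodes config term for the given node type
# ("disc" or "ram") and node names.
# Usage: cluster_nodes_config disc rabbit@host1 rabbit@host2 rabbit@host3
cluster_nodes_config() {
    cn_type=$1
    shift
    cn_list=""
    for cn_node in "$@"; do
        # Append 'node' to the list, adding ", " between entries
        cn_list="${cn_list:+$cn_list, }'$cn_node'"
    done
    printf "[{rabbit,\n  [{cluster_nodes, {[%s], %s}}]}].\n" "$cn_list" "$cn_type"
}
```

The output would be merged into the RabbitMQ configuration file on each node; the file location depends on the installation and is not stated in the original.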

Then start each node

[[email protected] ~]$ sudo service rabbitmq-server start

Starting rabbitmq-server: SUCCESS

rabbitmq-server.

Check the cluster status

[[email protected] ~]$ sudo rabbitmqctl cluster_status

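A few notes:

Whether the cluster is configured from the command line or from the configuration file, make sure the Erlang and RabbitMQ versions are the same on all nodes

The configuration file only takes effect on fresh nodes, that is, nodes that have been reset or are starting for the first time. Automatic clustering therefore does not happen again when a node is restarted. It also means that changes made through rabbitmqctl take precedence over the automatic cluster configuration.

Deploying a cluster on a single machine (generally used to test cluster features)

The key here is to start multiple rabbitmq-server instances with different ports and node names; the rest of the process is the same as deploying a cluster across several machines

Other considerations:

Firewall policy and similar settings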
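As an illustration of distinct ports and node names, the sketch below prints one launch command per local instance using the RABBITMQ_NODE_PORT and RABBITMQ_NODENAME environment variables. `local_node_commands` and the node names rabbit0, rabbit1, ... are hypothetical. It only prints the commands, so they can be reviewed before being run.

```shell
# Hypothetical helper: print launch commands for N local rabbitmq-server instances,
# each with its own AMQP port (starting at 5672) and node name.
# Usage: local_node_commands 3
local_node_commands() {
    ln_count=$1
    ln_i=0
    while [ "$ln_i" -lt "$ln_count" ]; do
        ln_port=$((5672 + ln_i))
        printf 'RABBITMQ_NODE_PORT=%s RABBITMQ_NODENAME=rabbit%s rabbitmq-server -detached\n' \
            "$ln_port" "$ln_i"
        ln_i=$((ln_i + 1))
    done
}
```

Plugins that open their own listeners, such as the management plugin, would also need distinct ports per instance; that configuration is outside this sketch.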

References:

http://www.rabbitmq.com/clustering.html

https://www.linuxidc.com/Linux/2014-12/110449p3.htm

Official OpenStack documentation: https://docs.openstack.org/ha-guide/shared-messaging.html

Troubleshooting reference: https://stackoverflow.com/questions/15720540/ubuntu-rabbitmq-error-unable-to-connect-to-node-rabbitsomename-nodedown
