天天看點

mysql group replication的安裝搭建

2016年12月mysql釋出了5.7.17版本開始支援了group replication首先來看先傳統複制和半同步複制群組複制的原理比較圖

比較圖

1.mysql傳統複制是異步複制

mysql group replication的安裝搭建

2.mysql的半同步複制

mysql group replication的安裝搭建

3.mysql的組複制

mysql group replication的安裝搭建

部署組複制

看完原理我們就部署組複制,要安裝和使用組複制必須要配置正确的mysqlserver執行個體,建議将配置存儲在my.cnf中,以下是組中第一個節點的配置

server_id= #一般定義為你的機器ip的後幾位加上資料庫端口号
gtid_mode=on 
enforce_gtid_consistenct=on
master_info_repository=TABLE
relay_log_info_repository=TABLE
binlog_checksum=NONE
log_slave_updates=ON
log_bin=binlog
binlog_fomat=ROW
           
transaction_write_set_extraction = XXHASH64 #訓示server必須為每個事務收集寫集合,并使用XXHASH64雜湊演算法将其編碼為散列
loose-group_replication_group_name = 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa' #告知插件正在加入或要建立的組要命名為'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaa'
loose-group_replication_start_on_boot = off
#插件在server啟動時不自動啟動組複制
loose-group_replication_local_address = '192.168.1.197:33061'
#告訴插件使用位址為192.168.1.197,端口33061用于接受來自組中其他成員的傳入連接配接
loose-group_replication_group_seeds ='192.168.1.197:33061,192.168.1.198:33062,192.168.1.199:33063'
#當下面這些server需要加入組時,應該連接配接到這些主機和端口上通路他們
loose-group_replication_allow_local_disjoint_gtids_join=on
loose-group_replication_bootstrap_group = off
#訓示插件是否自動引導組
loose-group_replication_single_primary_mode=true
#設定為single-primary模式下,組内隻有一個節點負責寫入,讀可以從任意一個節點讀取,組内資料保持最終一緻(另一種模式為multi-primary模式即為多寫方案,即寫操作會下發到組内所有節點,組内所有節點同時可讀可寫,該模式也是能夠保證組内資料最終一緻性)
loose-group_replication_enforce_update_everywhere_checks=false
#這個參數設定檢查對RP有影響的參數在各個節點是否一緻
           

建立使用者

建立具有REPLICATION-SLAVE權限的使用者,此操作應該記錄到二進制日志中,以避免将更改傳遞到其他server執行個體,以下是建立的過程。
root@mysqldb ::  [(none)]> set sql_log_bin=;
Query OK, 0 rows affected (0.00 sec)

root@mysqldb ::  [(none)]> create user 'repl'@'%';
Query OK, 0 rows affected (0.00 sec)

root@mysqldb ::  [(none)]> grant replication slave on *.* to 'repl'@'%' identified by 'repl';
Query OK, 0 rows affected, 1 warning (0.00 sec)

root@mysqldb ::  [(none)]> flush privileges;
Query OK, 0 rows affected (0.00 sec)

root@mysqldb ::  [(none)]> set sql_log_bin=;
Query OK, 0 rows affected (0.00 sec)
           

複制憑證

使用者進行上述配置後,需要使用change master to語句将server配置為,在下次需要從其他成員恢複其狀态時,使用group_replication_recovery複制通道的給定憑據。執行以下指令
[(none)]> change master to 
    -> master_user='repl',
    -> master_password='repl'
    -> for channel 'group_replication_recovery';
Query OK, 0 rows affected, 2 warnings (0.01 sec)
           

安裝組複制插件

[(none)]> install PLUGIN group_replication SONAME 'group_replication.so';
Query OK,  rows affected ( sec)
           

檢查插件是否成功安裝

請執行SHOW PLUGINS并檢查輸出在最後一行,如下顯示
[(none)]> show plugins;
+----------------------------+----------+--------------------+----------------------+---------+
| Name                       | Status   | Type               | Library              | License |
+----------------------------+----------+--------------------+----------------------+---------+
(...)

| group_replication          | ACTIVE   | GROUP REPLICATION  | group_replication.so | GPL     |
+----------------------------+----------+--------------------+----------------------+---------+
           

啟動第一個節點的group replication

[(none)]> set global group_replication_bootstrap_group=on;
Query OK,  rows affected ( sec)
[(none)]> start group_replication;
ERROR  (HY000): The server is not configured properly to be an active member of the group. Please see more details on error log
這裡報錯了 我們檢視錯誤日志有如下的一段話
[Warning] Plugin group_replication reported: 'Group Replication requires slave-preserve-commit-order to be set to ON when using more than  applier threads.'
從這裡可以看出 有一個參數值需要設定為ON
我們登入資料庫操作
[(none)]> show global variables like '%slave_preserve%';
+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| slave_preserve_commit_order | OFF   |
+-----------------------------+-------+
[(none)]> set global slave_preserve_commit_order=on;
Query OK,  rows affected ( sec)#這裡改好之後我們再啟動group replication試一下
[(none)]> start group_replication;
Query OK,  rows affected ( sec)
[(none)]> set global group_replication_bootstrap_group=off;
Query OK,  rows affected ( sec)
           
這個時候
組已經啟動成功,檢查該組是否已建立,并且其中有一個成員
[(none)]> select * from  performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST           | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+
| group_replication_applier | 211de504-c8be-11e7-8e2c-000c294732bd |  node0       |        3306 | ONLINE       |
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+
1 rows in set (0.00 sec)
           

此時組中已經有一個成員了通過添加配置來添加剩下的成員第二個執行個體的配置檔案與第一個執行個體額配置檔案類似

server_id= #對應該機器的ip的後三位加上mysql的端口号
gtid_mode=on 
enforce_gtid_consistenct=on
master_info_repository=TABLE
relay_log_info_repository=TABLE
binlog_checksum=NONE
log_slave_updates=ON
log_bin=binlog
binlog_fomat=ROW
           
transaction_write_set_extraction = XXHASH64
loose-group_replication_group_name = 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa'
loose-group_replication_start_on_boot = off
loose-group_replication_local_address = '.:' #本機ip和接收的端口
loose-group_replication_group_seeds ='.:,.:,.:'
loose-group_replication_bootstrap_group = off
loose-group_replication_single_primary_mode=true
loose-group_replication_enforce_update_everywhere_checks=false
           

建立使用者和建立恢複憑證

root@mysqldb ::  [(none)]> set sql_log_bin=;
Query OK, 0 rows affected (0.00 sec)

root@mysqldb ::  [(none)]> create user 'repl'@'%';
Query OK, 0 rows affected (0.00 sec)

root@mysqldb ::  [(none)]> grant replication slave on *.* to 'repl'@'%' identified by 'repl';
Query OK, 0 rows affected, 1 warning (0.00 sec)

root@mysqldb ::  [(none)]> flush privileges;
Query OK, 0 rows affected (0.00 sec)

root@mysqldb ::  [(none)]> set sql_log_bin=;
Query OK, 0 rows affected (0.00 sec)

root@mysqldb ::  [(none)]> change master to 
    -> master_user='repl',
    -> master_password='repl'
    -> for channel 'group_replication_recovery';
Query OK, 0 rows affected, 2 warnings (0.01 sec)
           

安裝組複制插件,并啟動将server加入組的程式,與部署第一個節點相同的方式安裝插件

[(none)]> install PLUGIN group_replication SONAME 'group_replication.so';
Query OK,  rows affected ( sec)
[(none)]> start group_replication;
Query OK,  rows affected ( sec)
           
與之前的步驟不同的是,這裡與第一個節點中的步驟有一個差別就是不執行SET GLOBAL group_replication_bootstrap_group=ON的操作;在啟動組複制之前,因為該組已由第一個節點建立和引導。此時,第二結點隻需添加到已經存在的組中即可;再次檢查performance_schema.replication_group_members表,可以看出組中現在有兩個ONLINE的server。
[(none)]> select * from  performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST           | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+
| group_replication_applier | 211de504-c8be-11e7-8e2c-000c294732bd | node0                 |        3306 | ONLINE       |
| group_replication_applier | 8eb4648b-bf0b-11e7-b843-000c29f6d9d2 | node1                 |        3306 | ONLINE       |
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+
2 rows in set (0.00 sec)
           

至此兩個節點已經成功加入現在我們來加入第三個節點,首先是配置檔案

server_id= #本機ip的後三位加上mysql的端口号
gtid_mode=on 
enforce_gtid_consistenct=on
master_info_repository=TABLE
relay_log_info_repository=TABLE
binlog_checksum=NONE
log_slave_updates=ON
log_bin=binlog
binlog_fomat=ROW
           
transaction_write_set_extraction = XXHASH64
loose-group_replication_group_name = 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa'
loose-group_replication_start_on_boot = off
loose-group_replication_local_address = '.:' #本機ip加上接收的端口号
loose-group_replication_group_seeds ='.:,.:,.:'
loose-group_replication_bootstrap_group = off
loose-group_replication_single_primary_mode=true
loose-group_replication_enforce_update_everywhere_checks=false
           

然後與之前兩個節點一樣建立使用者和恢複憑證

root@mysqldb ::  [(none)]> set sql_log_bin=;
Query OK, 0 rows affected (0.00 sec)

root@mysqldb ::  [(none)]> create user 'repl'@'%';
Query OK, 0 rows affected (0.00 sec)

root@mysqldb ::  [(none)]> grant replication slave on *.* to 'repl'@'%' identified by 'repl';
Query OK, 0 rows affected, 1 warning (0.00 sec)

root@mysqldb ::  [(none)]> flush privileges;
Query OK, 0 rows affected (0.00 sec)

root@mysqldb ::  [(none)]> set sql_log_bin=;
Query OK, 0 rows affected (0.00 sec)

root@mysqldb ::  [(none)]> change master to 
    -> master_user='repl',
    -> master_password='repl'
    -> for channel 'group_replication_recovery';
Query OK, 0 rows affected, 2 warnings (0.01 sec)
           

安裝組複制插件并啟動将server加入組的程式,與部署前兩個節點相同的方式安裝

[(none)]> install PLUGIN group_replication SONAME 'group_replication.so';
Query OK,  rows affected ( sec)
[(none)]> start group_replication;
Query OK,  rows affected ( sec)
           

此時,第三個節點正在運作,并且已經加入組且與組中的其他成員同步。通路performance_schema.replication_group_members表再次确認情況

[(none)]> select * from  performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST           | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+
| group_replication_applier | 211de504-c8be-11e7-8e2c-000c294732bd | node0                 |        3306 | ONLINE       |
| group_replication_applier | 8eb4648b-bf0b-11e7-b843-000c29f6d9d2 | node1                 |        3306 | ONLINE       |
| group_replication_applier | d31f9fbc-c8c7-11e7-9a73-000c2984d9aa | node2           |        3306 | ONLINE       |
+---------------------------+--------------------------------------+-----------------------+-------------+--------------+
3 rows in set (0.00 sec)
           

現在我們已經把group replication搭建好了,來簡單說明下group replication有關的表

replication_group_members

該表用于監控在目前視圖中的不同server執行個體的狀态,換句話說,是該組的一部分,用于組員服務追蹤

+--------------+----------+------+-----+---------+-------+
| Field        | Type     | Null | Key | Default | Extra |
+--------------+----------+------+-----+---------+-------+
| CHANNEL_NAME | char(64) | NO   |     | NULL    |       |
| MEMBER_ID    | char(36) | NO   |     | NULL    |       |
| MEMBER_HOST  | char(60) | NO   |     | NULL    |       |
| MEMBER_PORT  | int(11)  | YES  |     | NULL    |       |
| MEMBER_STATE | char(64) | NO   |     | NULL    |       |
+--------------+----------+------+-----+---------+-------+
5 rows in set (0.00 sec)
           

replication_group_member_stats

該表提供與認證過程相關的資訊,對于了解申請隊列增長情況,出發了多少沖突,檢查了多少事務,哪些事務已被所有成員送出等等非常有用。

+------------------------------------+---------------------+------+-----+---------+-------+
| Field                              | Type                | Null | Key | Default | Extra |
+------------------------------------+---------------------+------+-----+---------+-------+
| CHANNEL_NAME                       | char()            | NO   |     | NULL    |       |
| VIEW_ID                            | char()            | NO   |     | NULL    |       |
| MEMBER_ID                          | char()            | NO   |     | NULL    |       |
| COUNT_TRANSACTIONS_IN_QUEUE        | bigint() unsigned | NO   |     | NULL    |       |
| COUNT_TRANSACTIONS_CHECKED         | bigint() unsigned | NO   |     | NULL    |       |
| COUNT_CONFLICTS_DETECTED           | bigint() unsigned | NO   |     | NULL    |       |
| COUNT_TRANSACTIONS_ROWS_VALIDATING | bigint() unsigned | NO   |     | NULL    |       |
| TRANSACTIONS_COMMITTED_ALL_MEMBERS | longtext            | NO   |     | NULL    |       |
| LAST_CONFLICT_FREE_TRANSACTION     | text                | NO   |     | NULL    |       |
+------------------------------------+---------------------+------+-----+---------+-------+
 rows in set ( sec)
           

這些字段對于監控組中的成員的性能很重要。 例如,假設組的成員之一出現延遲,并且不能與該組的其他成員同步。 在這種情況下,您可能會在隊列中看到大量的事務。 基于此資訊,您可以決定從組中删除成員或延遲組中其他成員的事務處理,進而減少排隊的事務的數量。 此資訊還可以幫助您決定如何調整組複制插件的流控制

replication_connection_status

連接配接到組時,此表中的某些字段顯示有關組複制的資訊。例如,已從組中接收并在應用隊列(中繼日志)中排隊的事務

+---------------------------+-------------------------------+------+-----+---------+-------+
| Field                     | Type                          | Null | Key | Default | Extra |
+---------------------------+-------------------------------+------+-----+---------+-------+
| CHANNEL_NAME              | char()                      | NO   |     | NULL    |       |
| GROUP_NAME                | char()                      | NO   |     | NULL    |       |
| SOURCE_UUID               | char()                      | NO   |     | NULL    |       |
| THREAD_ID                 | bigint() unsigned           | YES  |     | NULL    |       |
| SERVICE_STATE             | enum('ON','OFF','CONNECTING') | NO   |     | NULL    |       |
| COUNT_RECEIVED_HEARTBEATS | bigint() unsigned           | NO   |     |        |       |
| LAST_HEARTBEAT_TIMESTAMP  | timestamp                     | NO   |     | NULL    |       |
| RECEIVED_TRANSACTION_SET  | longtext                      | NO   |     | NULL    |       |
| LAST_ERROR_NUMBER         | int()                       | NO   |     | NULL    |       |
| LAST_ERROR_MESSAGE        | varchar()                 | NO   |     | NULL    |       |
| LAST_ERROR_TIMESTAMP      | timestamp                     | NO   |     | NULL    |       |
+---------------------------+-------------------------------+------+-----+---------+-------+
 rows in set ( sec)
           

replication_applier_status

可以使用正常replication_applier_status表觀察組複制相關的通道和線程的狀态。如果有許多不同的工作線程在應用事務,那麼工作表也可以用來監控每個工作線程正在做什麼。

+----------------------------+---------------------+------+-----+---------+-------+
| Field                      | Type                | Null | Key | Default | Extra |
+----------------------------+---------------------+------+-----+---------+-------+
| CHANNEL_NAME               | char()            | NO   |     | NULL    |       |
| SERVICE_STATE              | enum('ON','OFF')    | NO   |     | NULL    |       |
| REMAINING_DELAY            | int() unsigned    | YES  |     | NULL    |       |
| COUNT_TRANSACTIONS_RETRIES | bigint() unsigned | NO   |     | NULL    |       |
+----------------------------+---------------------+------+-----+---------+-------+
 rows in set ( sec)
           

以上的知識點大多來源自京東的資料庫技術部京東資料庫技術部

以上步驟再進行到第一個節點啟動的時候發生錯誤

ERROR 3096 (HY000): The START GROUP_REPLICATION command failed as there was an error when initializing the group communication layer.

是由于本人馬虎在配置檔案中的loose-group_replication_group_seeds這個值設定重複 改正以後就好了