天天看点

mysql主从(一)--搭建(GTID+row+增强半同步)1、主从搭建2、主从不一致时的修复

文章目录

  • 1、主从搭建
    • 1.1、mysql5.7 GTID+row+增强半同步
      • 1.1.1、主从库环境准备
      • 1.1.2、主库(3309)备份数据,在从库(3306)恢复
      • 1.1.3、创建复制用户
      • 1.1.4、从库(3306)执行change master,start slave
    • 1.2、配置为双主模式
  • 2、主从不一致时的修复
    • 2.1、从库已经有T2表,再在主库上创建T2
    • 2.2、主键冲突的错误 1062
    • 2.3、主库上更新后,从库找不到记录 1032
    • 2.4、主库上delete后,从库找不到记录 1032

1、主从搭建

1.1、mysql5.7 GTID+row+增强半同步

实践经验证明GTID+row+增强半同步方式是一种靠谱的方式,已经在金融环境广泛应用。

GTID的限制

1) 不支持非事务引擎,从库报错,stop ,start slave 忽略报错
2) 不支持create table .. select 语句复制
3) 不允许一个SQL同时更新一个事务引擎和非事务引擎的表
4) 在一个复制组中,必须要求统一开启GTID或者关闭GTID。5.7开始支持GTID的在线切换
5) 5.7之前,开启GTID需要重启
6) 开启GTID后,就不在使用原来传统的复制方式。MHA也被GTID功能替代。
7) 对于create temporary table不会同步,但是drop temporary table会同步,要避开这个坑
8) 不支持sql slave skip counter
           

复制格式

推荐使用row格式,statement和mixed格式坑太多。

关于增强半同步

rpl_semi_sync_master_wait_point = AFTER_SYNC #5.7中默认已经是after_sync

测试环境

IP 端口
主库 1.1.3.111 3309
从库 1.1.3.111 3306

1.1.1、主从库环境准备

主从库需要修改的参数:

–主库(3309)

############replication###############
master_info_repository = TABLE
relay_log_info_repository = TABLE

max_binlog_size = 1G
binlog_cache_size = 4M
max_binlog_cache_size = 2G
binlog_format = row
log_bin=/data/mysql/mysql57/logs/mysql-bin
relay_log = /data/mysql/mysql57/logs/mysql-relay-bin
relay_log_index = /data/mysql/mysql57/logs/mysql-relay-bin.index
max_relay_log_size = 1G
relay_log_purge = 1
relay_log_recovery = 1
#除了中继节点
log_slave_updates = 1
sync_binlog = 10

#######gtid####################
gtid_mode = on
enforce_gtid_consistency = 1 
binlog_gtid_simple_recovery = 1
###############################
#并行复制参数
slave_parallel_type=LOGICAL_CLOCK
slave_parallel_workers=8

########semi sync replication settings########
#plugin_dir=/usr/local/mysql-5.7.15/lib/plugin  #不同实例不同的位置
plugin_load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
loose_rpl_semi_sync_master_enabled = 1
loose_rpl_semi_sync_slave_enabled = 1
loose_rpl_semi_sync_master_timeout = 5000
rpl_semi_sync_master_wait_point = AFTER_SYNC   #5.7中默认值是after_sync
           

–从库(3306)的log_bin位置,其它参数和主库的设置一样

log_bin=/data/mysql/mysql3306/logs/mysql-bin
relay_log = /data/mysql/mysql3306/logs/mysql-relay-bin
relay_log_index = /data/mysql/mysql3306/logs/mysql-relay-bin.index
           

1.1.2、主库(3309)备份数据,在从库(3306)恢复

方法一:mysqldump

  • 在备份时候需要指定–master-data
  • 导出的语句中包含 set@@GLOBAL.GTID_PURGED=’’,恢复时,需要先在slave上执行reset master
  • 导入数据后做change master to。

如果主从1:1复制,使用-A参数全库级别备份:

/usr/local/mysql57/bin/mysqldump --single-transaction --master-data=2 -uroot -pbigbangdata.cn  -A -S /usr/local/mysql57/mysql.sock >3309-full.sql
           

如果要排除mysql、sys等系统库使用下面的脚本:

/usr/local/mysql57/bin/mysql -uroot -pbigbangdata.cn -S /usr/local/mysql57/mysql.sock -e 'show databases;'|grep -E -v "Database|information_schema|mysql|performance_schema"|xargs /usr/local/mysql57/bin/mysqldump -uroot -pbigbangdata.cn -S /usr/local/mysql57/mysql.sock --databases > 3309.sql
           

方法二:percona xtrabackup

  • xtrabakcup binlog info 包含了GTID信息
  • 做从库恢复后,需要手工设置:set gbloabl gtid_purged=‘server_uuid:1-N’
  • 恢复后,执行change master to

注意 这里的 N 是特指

从库恢复

source 3309-full.sql

1.1.3、创建复制用户

在主库(3309)上创建repl用户,如果是双主,从库上也要创建

create user repl@'1.1.3.111' identified by '1234.C0m';
grant replication slave on *.* to repl@'1.1.3.111';
           

1.1.4、从库(3306)执行change master,start slave

change master to master_host='1.1.3.111', master_user='repl', master_password='1234.C0m', master_port=3309, master_auto_position=1;
           

如果要设置忽略的库,建议在从库上配置:

CHANGE REPLICATION FILTER REPLICATE_IGNORE_DB = (information_schema,performance_schema,mysql,sys);
           

检查配置无误后,start slave;

1.2、配置为双主模式

基于完成1.1章节的配置,在主库上反过来配置从库的设置即可。

从库(3306)创建repl用户,如果IP相同,上面步骤同步了mysql库,则下面脚本无需执行

create user repl@'1.1.3.111' identified by '1234.C0m';
grant replication slave on *.* to repl@'1.1.3.111';
           

主库(3309)执行change master命令:

change master to master_host='1.1.3.111', master_user='repl', master_password='1234.C0m', master_port=3306, master_auto_position=1;
           

最后,start slave;

至此,实例3309和3306上的库互为主从。

2、主从不一致时的修复

2.1、从库已经有T2表,再在主库上创建T2

处理原则:以主库为准,在从库上drop t2。 然后重启slave。

注意: 在db里的操作都会记录到binlog中,如果不想被记录到binlog中,可以先set sql_log_bin=0.drop完成后,再 set sql_log_bin=1即可。

如果drop 操作被记录到binlog中,在发生主从切换后,主库的t2 表会被删除。

从5.7 开始,有super read only。

Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 1050
                   Last_Error: Error 'Table 't2' already exists' on query. Default database: 'myreptest'. Query: 'create table t2 (id int)'
           

基于GTID错误 跳过

由于mysql 中GTID 在复制中 GTID 要求必须是连续的,主库的GTID,必须在从库上也有。所以不能skip过去一个事务。只能通过注入一个空事务去跳过一个事务。

stop slave;
set gtid_next='xxxx:N';
<<<<< 这里xxxx 指定的是主库的server_uuid 和你想跳过的事务ID
begin;commit;  <<<<<<<<<<< 空事务
set gtid_next='AUTOMATIC';
start slave;
show slave status\G
           

2.2、主键冲突的错误 1062

[email protected]:mysql3307.sock [myreptest]>show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.56.103
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000050
          Read_Master_Log_Pos: 5442
               Relay_Log_File: relay-bin.000011
                Relay_Log_Pos: 2178
        Relay_Master_Log_File: mysql-bin.000050
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 1062
                   Last_Error: Could not execute Write_rows event on table myreptest.myreptest; Duplicate entry '2' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.000050, end_log_pos 5411

<<<<<<<<< 这里很明显提示 duplicate entry 2 冲突了,在从库上把这条记录删掉,然后重启slave即可。
           

2.3、主库上更新后,从库找不到记录 1032

[email protected]:mysql3307.sock [myreptest]>show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.56.103
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000050
          Read_Master_Log_Pos: 5121
               Relay_Log_File: relay-bin.000011
                Relay_Log_Pos: 1849
        Relay_Master_Log_File: mysql-bin.000050
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 1032
                   Last_Error: Could not execute Update_rows event on table myreptest.myreptest; Can't find record in 'myreptest', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000050, end_log_pos 5090
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 4792   <<<<<< 报错事务的开始位置。
              Relay_Log_Space: 4142
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 1032
               Last_SQL_Error: Could not execute Update_rows event on table myreptest.myreptest; Can't find record in 'myreptest', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000050, end_log_pos 5090
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 1033306
                  Master_UUID: d490cfb4-caac-11e6-b1da-080027d6827d
             Master_Info_File: /data/mysql/mysql3307/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: 
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 170215 09:38:48
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: d490cfb4-caac-11e6-b1da-080027d6827d:79-98
            Executed_Gtid_Set: d035c34f-cbaf-11e6-8a78-080027d6827d:1-7,
d490cfb4-caac-11e6-b1da-080027d6827d:1-97
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)
           

这时需要解析主库的binlog,把从库的数据补回来。

mysqlbinlog --base64-output=decode-rows -v --start-position=4792    --stop-position=5090 mysql-bin.000050 > 50.sql

......
......
# at 4982
#170215  9:38:48 server id 1033306  end_log_pos 5032 CRC32 0xc902e88e   Table_map: `wu
bx`.`myreptest` mapped to number 152
# at 5032
#170215  9:38:48 server id 1033306  end_log_pos 5090 CRC32 0x820931f0   Update_rows: t
able id 152 flags: STMT_END_F
### UPDATE `myreptest`.`myreptest`
### WHERE
###   @1=2
###   @2='myreptest'
### SET
###   @1=2
###   @2='python'
ROLLBACK /* added by mysqlbinlog */ /*!*/;
......
......
           

这里就能看到从库丢失的那条记录。然后在从库补充这条记录即可。

2.4、主库上delete后,从库找不到记录 1032

[email protected]:mysql3307.sock [myreptest]>show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.56.103
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000050
          Read_Master_Log_Pos: 2779
               Relay_Log_File: relay-bin.000010
                Relay_Log_Pos: 985
        Relay_Master_Log_File: mysql-bin.000050
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 1032
                   Last_Error: Could not execute Delete_rows event on table myreptest.myreptest; Can't find record in 'myreptest', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000050, end_log_pos 2748  <<<<<< 该事物的结束位置
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 2465   <<<<<< 该事物的开始位置
              Relay_Log_Space: 1800
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 1032
               Last_SQL_Error: Could not execute Delete_rows event on table myreptest.myreptest; Can't find record in 'myreptest', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000050, end_log_pos 2748
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 1033306
                  Master_UUID: d490cfb4-caac-11e6-b1da-080027d6827d
             Master_Info_File: /data/mysql/mysql3307/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: 
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 170214 11:32:39
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: d490cfb4-caac-11e6-b1da-080027d6827d:79-89 
            Executed_Gtid_Set: d035c34f-cbaf-11e6-8a78-080027d6827d:1-6,
d490cfb4-caac-11e6-b1da-080027d6827d:1-88
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)


Retrieved_Gtid_Set: d490cfb4-caac-11e6-b1da-080027d6827d:79-89   <<<<<< 这里为什么会从79开始 而不是1.是因为从库是用mysqldump备份过来的。dump内容包含了1-78这部分所有的内容了。show global variables like 'gtid%';  

+----------------------------------+-------------------------------------------------------------------------------------+
| Variable_name                    | Value                                                                               |
+----------------------------------+-------------------------------------------------------------------------------------+
| gtid_executed                    | d035c34f-cbaf-11e6-8a78-080027d6827d:1-7,
d490cfb4-caac-11e6-b1da-080027d6827d:1-90 |
| gtid_executed_compression_period | 1000                                                                                |
| gtid_mode                        | ON                                                                                  |
| gtid_owned                       |                                                                                     |
| gtid_purged                      | d490cfb4-caac-11e6-b1da-080027d6827d:1-78                                           |
+----------------------------------+-------------------------------------------------------------------------------------+
           

这里可以使用GTID的跳过功能。

stop slave;
set gtid_next='d490cfb4-caac-11e6-b1da-080027d6827d:89';  89是Retrieved_Gtid_Set的位置。Executed_Gtid_Set 里面的ID 都是已经执行过了的。
begin;commit;
set gtid_next='automatic';
start slave;
show slave status\G;
           

想看某段pos内执行过的sql:

主库执行:

mysqlbinlog --base64-output=decode-rows -v --start-position=2465 --stop-position=2748 mysql-bin.000050 > 50.sql
           

输出如下:

### DELETE FROM `myreptest`.`myreptest`
### WHERE
###   @1=2
###   @2='myreptest'
ROLLBACK /* added by mysqlbinlog */ /*!*/;  <<<<< 注意这里的rollback。如果以后基于binlog和时间点的恢复。这条数据会被rollback掉,造成一条数据的丢失。所以如果想保留这条数据,需要找到commit的位置,或者下个pos的位置。
           
[email protected]:mysql3307.sock [myreptest]>show processlist;
+----+-------------+-----------+------+---------+-------+--------------------------------------------------------+------------------+
| Id | User        | Host      | db   | Command | Time  | State                                                  | Info             |
+----+-------------+-----------+------+---------+-------+--------------------------------------------------------+------------------+
| 22 | system user |           | NULL | Connect | 79359 | Waiting for master to send event                       | NULL             |  <<<<< IO_thread,从主库拿日志。
| 32 | system user |           | NULL | Connect |  1420 | Slave has read all relay log; waiting for more updates | NULL      <<<<< SQL therad
| 33 | root        | localhost | myreptest | Query   |     0 | starting                                               | show processlist |
+----+-------------+-----------+------+---------+-------+--------------------------------------------------------+------------------+
           

继续阅读