環境:雙m-m架構,其中一台b因為磁盤損壞,伺服器異常重新開機。重新開機之後b上面的資料庫正常運作,當時a 庫報如下錯誤:
last_io_error: got fatal error 1236 from master when reading data from binary log: 'client requested master to start replication from impossible position
root@rac1 # my 3306
entry port ==== 3306
welcome to the mysql monitor. commands end with ; or \g.
your mysql connection id is 1612406693
server version: 5.1.40-community-log mysql community server (gpl)
type 'help;' or '\h' for help. type '\c' to clear the current input statement.
[email protected] : (none) 20:55:50> show slave status \g
*************************** 1. row ***************************
slave_io_state:
master_host: 10.250.7.3
master_user: replicator
master_port: 3306
connect_retry: 60
master_log_file: mysql-bin.000143
read_master_log_pos: 664526789
relay_log_file: slave-relay.000339
relay_log_pos: 251
relay_master_log_file: mysql-bin.000143
slave_io_running: no
slave_sql_running: yes
replicate_do_db:
replicate_ignore_db:
replicate_do_table:
replicate_ignore_table:
replicate_wild_do_table:
replicate_wild_ignore_table:
last_errno: 0
last_error:
skip_counter: 0
exec_master_log_pos: 664526789
relay_log_space: 445
until_condition: none
until_log_file:
until_log_pos: 0
master_ssl_allowed: no
master_ssl_ca_file:
master_ssl_ca_path:
master_ssl_cert:
master_ssl_cipher:
master_ssl_key:
seconds_behind_master: null
master_ssl_verify_server_cert: no
last_io_errno: 1236
last_io_error: got fatal error 1236 from master when reading data from binary log: 'client requested master to start replication from impossible position'
last_sql_errno: 0
last_sql_error:
1 row in set (0.00 sec)
[email protected] : (none) 20:55:52> exit
bye
檢視a庫的error log日志 發現
root@rac1 # tail -f /home/mysql/data/mysql/master-error.log
120611 19:32:35 [warning] aborted connection 1341365540 to db: 'unconnected' user: 'replicator' host: '10.250.7.3' (got timeout writing communication packets)
120611 20:31:28 [error] error reading packet from server: client requested master to start replication from impossible position ( server_errno=1236)
120611 20:31:28 [error] slave i/o: got fatal error 1236 from master when reading data from binary log: 'client requested master to start replication from impossible position', error_code: 1236
120611 20:31:28 [note] slave i/o thread exiting, read up to log 'mysql-bin.000143', position 664526789
120611 20:39:27 [note] error reading relay log event: slave sql thread was killed
120611 20:39:38 [note] slave sql thread initialized, starting replication in log 'mysql-bin.000143' at position 664526789, relay log '/home/mysql/data/mysql/slave-relay.000339' position: 251
120611 20:39:38 [note] slave i/o thread: connected to master '[email protected]:3306',replication started in log 'mysql-bin.000143' at position 664526789
120611 20:39:38 [error] error reading packet from server: client requested master to start replication from impossible position ( server_errno=1236) 上面的位置錯誤
120611 20:39:38 [error] slave i/o: got fatal error 1236 from master when reading data from binary log: 'client requested master to start replication from impossible position', error_code: 1236
120611 20:39:38 [note] slave i/o thread exiting, read up to log '<b>mysql-bin.000143', position 664526789</b>
<b>slave_io_running線程終止。仔細看上面的報錯資訊,說slave程序試圖從mysql-bin.000143日志的 position 664526789開始啟動恢複,但是該日志中是沒有此position。</b>
根據主庫a 上的錯誤資訊
到從庫b 上檢視其bin log記錄
root@rac2 # mysqlbinlog mysql-bin.000143 > log_20120611.sql
root@rac2 #
root@rac2 # tail -f log_20120611.sql
set timestamp=1339414288/*!*/;
insert into vm_monitor_20120611(time_stamp,name,group_id,user_id,cpu,memory,rx,tx,flow_volume,bandwidth, read_iops, write_iops, gmt_create) values(1339414287,'vm-3157068f','977','21',10.0304,12288,14533,31178,45711,1524,221.4,9.66667,'2012-06-11,19:31:28')
/*!*/;
# at<b> 664521543 <===最後的日志記錄位置明顯小于 a 報錯資訊中的 664526789 是以才報錯說“impossible position” 因為本來就沒有嘛</b>
#120611 19:31:28 server id 2 end_log_pos 664521570 xid = 546104615
commit/*!*/;
delimiter ;
# end of log file
rollback /* added by mysqlbinlog */;
/*!50003 set completion_type=@old_completion_type*/;
回到主庫a 上面進行重新指定應用日志的位置,檢視master.info資訊,從這裡也可以看出 io thread 進行恢複的起始位置為不合理的664526789
root@rac1 #
root@rac1 # more master.info
15
mysql-bin.000143
664526789
10.250.7.3
replicator
xxxxxxxx
3306
60
your mysql connection id is 1612460302
type 'help;' or '\h' for help. type '\c' to clear the current input statement.
[email protected] : (none) 21:11:51> stop slave;
query ok, 0 rows affected (0.00 sec)
<b>這裡可以指定mysql-bin.000144 的第一個post 也可以指定 mysql-bin.000143的664521543位置!</b>
[email protected] : (none) 21:12:27> change master to master_host='10.250.7.3',
-> master_user ='replicator',
-> master_password ='xxxxx',
-> master_log_file ='<b>mysql-bin.000144</b>',
-> master_log_pos =<b>4</b>,
-> master_port =3306;
query ok, 0 rows affected (0.16 sec)
[email protected] : (none) 21:13:32> start slave;
[email protected] : (none) 21:13:38>
[email protected] : (none) 21:13:38> show slave status \g
slave_io_state: waiting for master to send event
master_log_file: mysql-bin.000144
read_master_log_pos: 148515312
relay_log_file: slave-relay.000002
relay_master_log_file: mysql-bin.000144
slave_io_running: yes
exec_master_log_pos: 148515312
relay_log_space: 402
seconds_behind_master: 0
last_io_errno: 0
last_io_error:
至此問題解決 ok!!