在上一篇文章《MySQL 5.7中如何定位DDL被阻塞的問題》中,對于DDL被阻塞問題的定位,我們主要是基于MySQL 5.7新引入的performance_schema.metadata_locks表。提出的定位方法,頗有種"錦上添花"的意味,而且,也隻适用于MySQL 5.7開始的版本。
但在實際生産中,MySQL 5.6還是占絕不多數。雖然MySQL 8.0都已經GA了,但鑒于資料庫的特殊性,在對待更新的這個事情上,相當一部分人還是秉持着一種“不主動”的态度。
既然MySQL 5.6用者衆多,有沒有一種方法,來解決MySQL 5.6的這個痛點呢?
還是之前的測試Demo
會話1開啟了事務并執行了三個操作,但未送出,此時,會話2執行了alter table操作,被阻塞。
session1> begin;
Query OK, 0 rows affected (0.00 sec)
session1> delete from slowtech.t1 where id=2;
Query OK, 1 row affected (0.00 sec)
session1> select * from slowtech.t1;
+------+------+
| id | name |
+------+------+
| 1 | a |
+------+------+
row in set (0.00 sec)
session1> update slowtech.t1 set name='c' where id=1;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
session2> alter table slowtech.t1 add c1 int; ##被阻塞
session3> show processlist;
+----+------+-----------+------+---------+------+---------------------------------+------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+----+------+-----------+------+---------+------+---------------------------------+------------------------------------+
| 2 | root | localhost | NULL | Sleep | 51 | | NULL |
| 3 | root | localhost | NULL | Query | 0 | starting | show processlist |
| 4 | root | localhost | NULL | Query | 9 | Waiting for table metadata lock | alter table slowtech.t1 add c1 int |
+----+------+-----------+------+---------+------+---------------------------------+------------------------------------+
rows in set (0.00 sec)
其實,導緻DDL阻塞的操作,無非兩類:
1. 慢查詢
2. 表上有事務未送出
其中,第一類比較好定位,通過show processlist即能發現。而第二類基本沒法定位,因為未送出事務的連接配接在show processlist中的輸出同空閑連接配接一樣。
如下面Id為2的連接配接,雖然Command顯示為“Sleep”,其實是事務未送出。
mysql> show processlist;
+----+------+-----------+------+---------+------+---------------------------------+------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+----+------+-----------+------+---------+------+---------------------------------+------------------------------------+
| 2 | root | localhost | NULL | Sleep | 77 | | NULL |
| 3 | root | localhost | NULL | Query | 0 | starting | show processlist |
| 4 | root | localhost | NULL | Query | 44 | Waiting for table metadata lock | alter table slowtech.t1 add c1 int |
+----+------+-----------+------+---------+------+---------------------------------+------------------------------------+
3 rows in set (0.00 sec)
是以,網上有kill空閑(Command為Sleep)連接配接的說法,其實也不無道理,但這樣做就太簡單粗暴了,難免會誤殺。
其實,既然是事務,在information_schema. innodb_trx中肯定會有記錄,如會話1中的事務,在表中的記錄如下,
mysql> select * from information_schema.innodb_trx\G
*************************** 1. row ***************************
trx_id: 1050390
trx_state: RUNNING
trx_started: 2018-07-17 08:55:32
trx_requested_lock_id: NULL
trx_wait_started: NULL
trx_weight: 4
trx_mysql_thread_id: 2
trx_query: NULL
trx_operation_state: NULL
trx_tables_in_use: 0
trx_tables_locked: 1
trx_lock_structs: 2
trx_lock_memory_bytes: 1136
trx_rows_locked: 3
trx_rows_modified: 2
trx_concurrency_tickets: 0
trx_isolation_level: REPEATABLE READ
trx_unique_checks: 1
trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
trx_adaptive_hash_latched: 0
trx_adaptive_hash_timeout: 0
trx_is_read_only: 0
trx_autocommit_non_locking: 0
1 row in set (0.00 sec)
其中trx_mysql_thread_id是線程id,結合performance_schema.threads,可以知道目前哪些連接配接上存在着活躍事務,這樣就進一步縮小了可被kill的線程範圍。
但從影響程度上,和kill所有Command為Sleep的連接配接沒太大差別,畢竟,kill真正的空閑連接配接對業務的影響不大。
此時,依然可以借助performance_schema. events_statements_history表。
在上篇MySQL 5.7的分析中,我們是首先知道引發阻塞的線程ID,然後利用events_statements_history表,檢視該線程的相關SQL。
而在MySQL 5.6中,我們并不知道引發阻塞的線程ID,但是,我們可以反其道而行之,利用窮舉法,首先統計出所有線程在目前事務執行過的所有SQL,然後再判斷這些SQL中是否包含目标表。
具體SQL如下,
SELECT
processlist_id,
sql_text
FROM
(
SELECT
c.processlist_id,
substring_index( sql_text, "transaction_begin;",-1 ) sql_text
FROM
information_schema.innodb_trx a,
(
SELECT
thread_id,
group_concat( CASE WHEN EVENT_NAME = 'statement/sql/begin' THEN "transaction_begin" ELSE sql_text END ORDER BY event_id SEPARATOR ";" ) AS sql_text
FROM
performance_schema.events_statements_history
GROUP BY
thread_id
) b,
performance_schema.threads c
WHERE
a.trx_mysql_thread_id = c.processlist_id
AND b.thread_id = c.thread_id
) t
WHERE
sql_text LIKE '%t1%';
+----------------+---------------------------------------------------------------------------------------------------------+
| processlist_id | sql_text |
+----------------+---------------------------------------------------------------------------------------------------------+
| 2 | delete from slowtech.t1 where id=2;select * from slowtech.t1;update slowtech.t1 set name='c' where id=1 |
+----------------+---------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)
從輸出來看,确實也達到了預期效果。
需要注意的是,在MySQL5.6中,events_statements_history預設是沒有開啟的。
mysql> SELECT * FROM performance_schema.setup_consumers WHERE NAME LIKE '%statements%';
+--------------------------------+---------+
| NAME | ENABLED |
+--------------------------------+---------+
| events_statements_current | YES |
| events_statements_history | NO |
| events_statements_history_long | NO |
| statements_digest | YES |
+--------------------------------+---------+
4 rows in set (0.00 sec)