标簽
PostgreSQL , pg_rewind , 時間線 , 變化量 , 業務補齊
https://github.com/digoal/blog/blob/master/201901/20190129_01.md#%E8%83%8C%E6%99%AF 背景
pg_rewind類似Oracle flashback,可以将一個資料庫回退到一個以前的狀态,例如用于:
1、PG實體流複制的從庫,當激活後,可以開啟讀寫,使用pg_rewind可以将從庫回退為隻讀從庫的角色。而不需要重建整個從庫。
2、當異步主從發生角色切換後,主庫的wal目錄中可能還有沒完全同步到從庫的内容,是以老的主庫無法直接切換為新主庫的從庫。使用pg_rewind可以修複老的主庫,使之成為新主庫的隻讀從庫。而不需要重建整個從庫。
如果沒有pg_rewind,遇到以上情況,需要完全重建從庫,如果庫占用空間很大,重建非常耗時,也非常耗費上遊資料庫的資源(讀)。
詳見:
《PostgreSQL pg_rewind,時間線修複,腦裂修複 - 從庫開啟讀寫後,回退為隻讀從庫。異步主從發生角色切換後,主庫rewind為新主庫的從庫》以上解決的是怎麼回退的問題,還有一個問題沒有解,在分歧點到目前狀态下,這些被回退掉的WAL,其中包含了哪些邏輯變化,這些資訊怎麼補齊?
https://github.com/digoal/blog/blob/master/201901/20190129_01.md#%E6%97%B6%E9%97%B4%E7%BA%BF%E5%88%86%E6%AD%A7%E5%8F%98%E5%8C%96%E9%87%8F%E8%A1%A5%E9%BD%90%E5%8E%9F%E7%90%86 時間線分歧變化量補齊原理
1、開啟wal_level=logical
1.1、確定有足夠的slots
2、開啟DDL定義功能,參考:
《PostgreSQL 邏輯訂閱 - DDL 訂閱 實作方法》3、在主庫,為每一個資料庫(或需要做時間線補齊的資料庫)建立一個logical SLOT
4、有更新、删除操作的表,必須有主鍵
5、間歇性移動slot的位置到pg_stat_replication.sent_lsn的位置
6、如果從庫被激活,假設老主庫上還有未發送到從庫的WAL
7、從從庫擷取激活位置LSN
8、由于使用了SLOT,是以從庫激活位點LSN之後的WAL一定存在于老主庫WAL目錄中。
9、将老主庫的slot移動到激活位置LSN
10、從激活位置開始擷取logical變化量
11、業務層根據業務邏輯對這些變化量進行處理,補齊時間線分歧
https://github.com/digoal/blog/blob/master/201901/20190129_01.md#%E7%A4%BA%E4%BE%8B 示例
環境使用:
主庫
port 4001
從庫
port 4000
psql -p 4000
postgres=# alter system set wal_level=logical;
ALTER SYSTEM
psql -p 4001
postgres=# alter system set wal_level=logical;
ALTER SYSTEM
edb=# show max_replication_slots ;
max_replication_slots
-----------------------
16
(1 row)
重新開機資料庫。
postgres=# select pg_create_logical_replication_slot('fix_tl','test_decoding');
pg_create_logical_replication_slot
------------------------------------
(fix_tl,B/73000140)
(1 row)
edb=# select pg_create_logical_replication_slot('fix_tl_edb','test_decoding');
pg_create_logical_replication_slot
------------------------------------
(fix_tl_edb,B/73000140)
(1 row)
連接配接到對應的庫操作
postgres=# select pg_replication_slot_advance('fix_tl',sent_lsn) from pg_stat_replication ;
pg_replication_slot_advance
-----------------------------
(fix_tl,B/73000140)
(1 row)
edb=# select pg_replication_slot_advance('fix_tl_edb',sent_lsn) from pg_stat_replication ;
pg_replication_slot_advance
-----------------------------
(fix_tl,B/73000140)
(1 row)
pg_ctl promote -D /data04/ppas11/pg_root4000
cd /data04/ppas11/pg_root4000
cat pg_wal/00000003.history
1 8/48DE2318 no recovery target specified
2 D/FD5FFFB8 no recovery target specified
psql -p 4001 postgres
postgres=# select pg_replication_slot_advance('fix_tl','D/FD5FFFB8');
psql -p 4001 edb
edb=# select pg_replication_slot_advance('fix_tl_edb','D/FD5FFFB8');
edb=# select * from pg_logical_slot_get_changes('fix_tl_edb',NULL,10,'include-xids', '0');
lsn | xid | data
-----+-----+------
(0 rows)
由于EDB庫沒有變化,是以傳回0條記錄
postgres=# select * from pg_logical_slot_get_changes('fix_tl',NULL,10,'include-xids', '0');
lsn | xid | data
------------+----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
D/FD5FEC60 | 68900576 | BEGIN
D/FD5FEC60 | 68900576 | table public.pgbench_accounts: UPDATE: aid[integer]:44681547 bid[integer]:447 abalance[integer]:-4591 filler[character]:' '
D/FD5FF3A8 | 68900576 | table public.pgbench_tellers: UPDATE: tid[integer]:5091 bid[integer]:510 tbalance[integer]:-160944 filler[character]:null
D/FD5FF9A8 | 68900576 | table public.pgbench_branches: UPDATE: bid[integer]:740 bbalance[integer]:-261044 filler[character]:null
D/FD5FFEF8 | 68900576 | table public.pgbench_history: INSERT: tid[integer]:5091 bid[integer]:740 aid[integer]:44681547 delta[integer]:-4591 mtime[timestamp without time zone]:'29-JAN-19 09:48:14.39739' filler[character]:null
D/FD6001E8 | 68900576 | COMMIT
D/FD5FE790 | 68900574 | BEGIN
D/FD5FE790 | 68900574 | table public.pgbench_accounts: UPDATE: aid[integer]:60858810 bid[integer]:609 abalance[integer]:3473 filler[character]:' '
D/FD5FF1C8 | 68900574 | table public.pgbench_tellers: UPDATE: tid[integer]:8829 bid[integer]:883 tbalance[integer]:60244 filler[character]:null
D/FD5FF810 | 68900574 | table public.pgbench_branches: UPDATE: bid[integer]:33 bbalance[integer]:86295 filler[character]:null
D/FD5FFD80 | 68900574 | table public.pgbench_history: INSERT: tid[integer]:8829 bid[integer]:33 aid[integer]:60858810 delta[integer]:3473 mtime[timestamp without time zone]:'29-JAN-19 09:48:14.397383' filler[character]:null
D/FD600218 | 68900574 | COMMIT
(12 rows)
postgres=# select * from pg_logical_slot_get_changes('fix_tl',NULL,10,'include-xids', '0');
lsn | xid | data
------------+----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
D/FD5FEED0 | 68900578 | BEGIN
D/FD5FEED0 | 68900578 | table public.pgbench_accounts: UPDATE: aid[integer]:15334791 bid[integer]:154 abalance[integer]:-2741 filler[character]:' '
D/FD5FF518 | 68900578 | table public.pgbench_tellers: UPDATE: tid[integer]:2402 bid[integer]:241 tbalance[integer]:191936 filler[character]:null
D/FD5FFB88 | 68900578 | table public.pgbench_branches: UPDATE: bid[integer]:345 bbalance[integer]:-693783 filler[character]:null
D/FD5FFFB8 | 68900578 | table public.pgbench_history: INSERT: tid[integer]:2402 bid[integer]:345 aid[integer]:15334791 delta[integer]:-2741 mtime[timestamp without time zone]:'29-JAN-19 09:48:14.397396' filler[character]:null
D/FD600248 | 68900578 | COMMIT
D/FD5FF438 | 68900579 | BEGIN
D/FD5FF438 | 68900579 | table public.pgbench_accounts: UPDATE: aid[integer]:54259132 bid[integer]:543 abalance[integer]:3952 filler[character]:' '
D/FD5FFEA8 | 68900579 | table public.pgbench_tellers: UPDATE: tid[integer]:9591 bid[integer]:960 tbalance[integer]:-498586 filler[character]:null
D/FD600298 | 68900579 | table public.pgbench_branches: UPDATE: bid[integer]:147 bbalance[integer]:459542 filler[character]:null
D/FD600560 | 68900579 | table public.pgbench_history: INSERT: tid[integer]:9591 bid[integer]:147 aid[integer]:54259132 delta[integer]:3952 mtime[timestamp without time zone]:'29-JAN-19 09:48:14.397464' filler[character]:null
D/FD600938 | 68900579 | COMMIT
(12 rows)
... ...
直到沒有記錄傳回,說明已擷取到所有變化量
直到沒有記錄傳回,說明已擷取到所有變化量
10.1、檢視SLOT狀态,目前WAL位置資訊
psql -p 4001
postgres=# select * from pg_get_replication_slots();
slot_name | plugin | slot_type | datoid | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
------------+---------------+-----------+--------+-----------+--------+------------+------+--------------+-------------+---------------------
fix_tl | test_decoding | logical | 15844 | f | f | | | 67005646 | D/D7959218 | D/FD600218
fix_tl_edb | test_decoding | logical | 15845 | f | f | | | 72528996 | E/71C92B00 | E/71C92B38
(2 rows)
目前WAL位置
postgres=# select pg_current_wal_lsn();
pg_current_wal_lsn
--------------------
E/71C92B38
(1 row)
https://github.com/digoal/blog/blob/master/201901/20190129_01.md#%E5%B0%8F%E7%BB%93 小結
主庫開啟邏輯SLOT,并根據從庫的接收LSN位置,使用pg_replication_slot_advance移動主庫的slot位點到從庫的接收LSN位置。
當從庫激活,老主庫還有未同步到從庫的WAL時,可以通過邏輯decode的方法,擷取到未同步的邏輯變化量。
業務層根據業務邏輯,補齊這些變化量到新的主庫。
注意:
1、開啟logical wal_level,會給資料庫增加較多的WAL日志,請酌情開啟。
2、開啟SLOT後,由于資料庫會保證沒有被訂閱的WAL保留在pg_wal目錄中,那麼如果SLOT沒有及時移動,則可能導緻主庫的pg_wal目錄暴增。
https://github.com/digoal/blog/blob/master/201901/20190129_01.md#%E5%8F%82%E8%80%83 參考
https://www.postgresql.org/docs/11/test-decoding.html https://www.postgresql.org/docs/11/functions-admin.html#FUNCTIONS-REPLICATIONhttps://github.com/digoal/blog/blob/master/201901/20190129_01.md#%E5%85%8D%E8%B4%B9%E9%A2%86%E5%8F%96%E9%98%BF%E9%87%8C%E4%BA%91rds-postgresql%E5%AE%9E%E4%BE%8Becs%E8%99%9A%E6%8B%9F%E6%9C%BA 免費領取阿裡雲RDS PostgreSQL執行個體、ECS虛拟機
