标簽
PostgreSQL , pg_rewind , 主從切換 , 時間線修複 , 腦裂修複 , 從庫開啟讀寫後,回退為隻讀從庫 , 異步主從發生角色切換後,主庫rewind為新主庫的從庫
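Background
1. A standby in PostgreSQL physical streaming replication becomes read-write once it is promoted. pg_rewind can revert such a standby to a read-only standby without rebuilding it from scratch.
2. After a role switch in an asynchronous primary/standby pair, the old primary's WAL directory may still contain records that never reached the standby, so the old primary cannot simply be attached as a standby of the new primary. pg_rewind can repair the old primary so that it becomes a read-only standby of the new primary, again without a full rebuild.
3. Without pg_rewind, both situations require rebuilding the standby completely. Alternatively, a storage-level snapshot or a file-system snapshot can be used to roll back to the state before the split brain.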
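Principle and repair steps
1. Prerequisites for pg_rewind: full_page_writes must be on, and either wal_log_hints or data block checksums must be enabled.
2. On the cluster to be repaired (the target), all WAL generated since the promotion point must be present in its pg_wal directory. If some segments have already been recycled, copy them back into pg_wal from the archive.
3. On the new primary, all WAL generated since the promotion point must still be in pg_wal, or must have been archived where the repaired cluster can fetch it through restore_command.
4. Configure the source db (the new primary or the old primary, depending on the scenario) to accept connections.
5. During the repair, connect to the new primary to obtain the switch point. When the promoted standby is the cluster being repaired, connect to the old primary instead and compare the two clusters' timelines (TL) to obtain the switch point.
6. pg_rewind parses the target's WAL from the switch point to the present (in practice it starts at the last common checkpoint before that point, as the pg_rewind output later in this article shows). While connected to the source db, it rolls the target back: blocks that were modified or deleted are fetched from the source db and overwritten, and newly added blocks are simply discarded. The target is thereby returned to its state at the switch point.
7. Edit recovery.conf and postgresql.conf on the repaired cluster (target db).
8. Start the target db. It receives WAL by connecting to the source db or through restore_command, and applies all WAL from the switch point onward.
9. The target db is now a standby of the source db.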
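The core of steps 5 and 6 can be wrapped in a small shell function. This is only a sketch under assumptions: the name `rewind_target` is hypothetical, `pg_ctl` and `pg_rewind` are expected on `PATH`, and only options that appear in this article are used.

```shell
# Hypothetical helper: stop the target if needed, rehearse, then rewind.
# $1 = target data directory, $2 = libpq connection string of the source db.
rewind_target() {
    # pg_rewind needs a cleanly stopped target; stop it only if it is running.
    if pg_ctl status -D "$1" >/dev/null 2>&1; then
        pg_ctl stop -m fast -D "$1" || return 1
    fi
    # Dry run first: reports the divergence point without modifying any file.
    pg_rewind -n -D "$1" --source-server="$2" || return 1
    # The rehearsal succeeded, so perform the real rewind.
    pg_rewind -D "$1" --source-server="$2"
}
```

A call such as `rewind_target /data04/ppas11/pg_root4001 "hostaddr=127.0.0.1 user=postgres port=4000"` would mirror the first example in this article. Editing recovery.conf and starting the target remain manual steps.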
Walkthrough using EDB PG 11
Environment setup
See also: 《MTK usage: migrating PG, PPAS, Oracle, MySQL, MS SQL, Sybase to PG, PPAS (cross-version upgrades supported)》
export PS1="$USER@`/bin/hostname -s`-> "
export PGPORT=4000
export PGDATA=/data04/ppas11/pg_root4000
export LANG=en_US.utf8
export PGHOME=/usr/edb/as11
export LD_LIBRARY_PATH=$PGHOME/lib:/lib64:/usr/lib64:/usr/local/lib64:/lib:/usr/lib:/usr/local/lib:$LD_LIBRARY_PATH
export DATE=`date +"%Y%m%d%H%M"`
export PATH=$PGHOME/bin:$PATH:.
export MANPATH=$PGHOME/share/man:$MANPATH
export PGHOST=127.0.0.1
export PGUSER=postgres
export PGDATABASE=postgres
alias rm='rm -i'
alias ll='ls -lh'
unalias vi
1. Initialize the database cluster
initdb -D /data04/ppas11/pg_root4000 -E UTF8 --lc-collate=C --lc-ctype=en_US.UTF8 -U postgres -k --redwood-like
2. Configure recovery.done
cd $PGDATA
cp $PGHOME/share/recovery.conf.sample ./
mv recovery.conf.sample recovery.done
vi recovery.done
restore_command = 'cp /data04/ppas11/wal/%f %p'
recovery_target_timeline = 'latest'
standby_mode = on
primary_conninfo = 'host=localhost port=4000 user=postgres'
3. Configure postgresql.conf
To use the rewind feature:
full_page_writes must be on
data_checksums or wal_log_hints must be enabled
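On an existing cluster, these prerequisites can be checked from the control file. The sketch below parses `pg_controldata` output. The function name `rewind_prereqs_ok` is hypothetical, and the field labels are assumed to match what PostgreSQL 11's `pg_controldata` prints.

```shell
# Reads `pg_controldata` output on stdin. Exit status is 0 only when
# full_page_writes is on AND (wal_log_hints is on OR data checksums are enabled).
# Note: the control file reports full_page_writes as of the latest checkpoint.
rewind_prereqs_ok() {
    awk -F': *' '
        /full_page_writes:/            { fpw = $2 }
        /^wal_log_hints setting:/      { hints = $2 }
        /^Data page checksum version:/ { cksum = $2 }
        END { exit !(fpw == "on" && (hints == "on" || cksum + 0 != 0)) }
    '
}
```

Usage: `pg_controldata -D "$PGDATA" | rewind_prereqs_ok && echo "prerequisites met"`.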
postgresql.conf
listen_addresses = '0.0.0.0'
port = 4000
max_connections = 8000
superuser_reserved_connections = 13
unix_socket_directories = '.,/tmp'
unix_socket_permissions = 0700
tcp_keepalives_idle = 60
tcp_keepalives_interval = 10
tcp_keepalives_count = 10
shared_buffers = 16GB
max_prepared_transactions = 8000
maintenance_work_mem = 1GB
autovacuum_work_mem = 1GB
dynamic_shared_memory_type = posix
vacuum_cost_delay = 0
bgwriter_delay = 10ms
bgwriter_lru_maxpages = 1000
bgwriter_lru_multiplier = 10.0
effective_io_concurrency = 0
max_worker_processes = 128
max_parallel_maintenance_workers = 8
max_parallel_workers_per_gather = 8
max_parallel_workers = 24
wal_level = replica
synchronous_commit = off
full_page_writes = on
wal_compression = on
wal_buffers = 32MB
wal_writer_delay = 10ms
checkpoint_timeout = 25min
max_wal_size = 32GB
min_wal_size = 8GB
checkpoint_completion_target = 0.2
archive_mode = on
archive_command = 'cp -n %p /data04/ppas11/wal/%f'
max_wal_senders = 16
wal_keep_segments = 4096
max_replication_slots = 16
hot_standby = on
max_standby_archive_delay = 300s
max_standby_streaming_delay = 300s
wal_receiver_status_interval = 1s
wal_receiver_timeout = 10s
random_page_cost = 1.1
effective_cache_size = 400GB
log_destination = 'csvlog'
logging_collector = on
log_directory = 'log'
log_filename = 'edb-%a.log'
log_truncate_on_rotation = on
log_rotation_age = 1d
log_rotation_size = 0
log_min_duration_statement = 1s
log_checkpoints = on
log_error_verbosity = verbose
log_line_prefix = '%t '
log_lock_waits = on
log_statement = 'ddl'
log_timezone = 'PRC'
autovacuum = on
log_autovacuum_min_duration = 0
autovacuum_max_workers = 6
autovacuum_freeze_max_age = 1200000000
autovacuum_multixact_freeze_max_age = 1400000000
autovacuum_vacuum_cost_delay = 0
statement_timeout = 0
lock_timeout = 0
idle_in_transaction_session_timeout = 0
vacuum_freeze_table_age = 1150000000
vacuum_multixact_freeze_table_age = 1150000000
datestyle = 'redwood,show_time'
timezone = 'PRC'
lc_messages = 'en_US.utf8'
lc_monetary = 'en_US.utf8'
lc_numeric = 'en_US.utf8'
lc_time = 'en_US.utf8'
default_text_search_config = 'pg_catalog.english'
shared_preload_libraries = 'auto_explain,pg_stat_statements,$libdir/dbms_pipe,$libdir/edb_gen,$libdir/dbms_aq'
edb_redwood_date = on
edb_redwood_greatest_least = on
edb_redwood_strings = on
db_dialect = 'redwood'
edb_dynatune = 66
edb_dynatune_profile = oltp
timed_statistics = off
4. Configure pg_hba.conf to allow streaming replication
local all all trust
host all all 127.0.0.1/32 trust
host all all ::1/128 trust
local replication all trust
host replication all 127.0.0.1/32 trust
host replication all ::1/128 trust
host all all 0.0.0.0/0 md5
5. Create the archive directory
mkdir /data04/ppas11/wal
chown enterprisedb:enterprisedb /data04/ppas11/wal
6. Create the standby
pg_basebackup -h 127.0.0.1 -p 4000 -D /data04/ppas11/pg_root4001 -F p -c fast
7. Configure the standby
cd /data04/ppas11/pg_root4001
mv recovery.done recovery.conf
vi postgresql.conf
port = 4001
8. Start the standby
pg_ctl start -D /data04/ppas11/pg_root4001
9. Run a load test against the primary
pgbench -i -s 1000
pgbench -M prepared -v -r -P 1 -c 24 -j 24 -T 300
10. Check archiving
postgres=# select * from pg_stat_archiver ;
archived_count | last_archived_wal | last_archived_time | failed_count | last_failed_wal | last_failed_time | stats_reset
----------------+--------------------------+----------------------------------+--------------+-----------------+------------------+----------------------------------
240 | 0000000100000000000000F0 | 28-JAN-19 15:08:43.276965 +08:00 | 0 | | | 28-JAN-19 15:01:17.883338 +08:00
(1 row)
postgres=# select * from pg_stat_archiver ;
archived_count | last_archived_wal | last_archived_time | failed_count | last_failed_wal | last_failed_time | stats_reset
----------------+--------------------------+----------------------------------+--------------+-----------------+------------------+----------------------------------
248 | 0000000100000000000000F8 | 28-JAN-19 15:08:45.120134 +08:00 | 0 | | | 28-JAN-19 15:01:17.883338 +08:00
(1 row)
11. Check standby lag
postgres=# select * from pg_stat_replication ;
-[ RECORD 1 ]----+---------------------------------
pid | 8124
usesysid | 10
usename | postgres
application_name | walreceiver
client_addr | 127.0.0.1
client_hostname |
client_port | 62988
backend_start | 28-JAN-19 15:07:34.084542 +08:00
backend_xmin |
state | streaming
sent_lsn | 1/88BC2000
write_lsn | 1/88BC2000
flush_lsn | 1/88BC2000
replay_lsn | 1/88077D48
write_lag | 00:00:00.001417
flush_lag | 00:00:00.002221
replay_lag | 00:00:00.097657
sync_priority | 0
sync_state | async
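The LSN columns above can also be turned into a lag measured in bytes. The following is a minimal sketch with hypothetical helper names (`lsn_to_bytes`, `lsn_diff`). It relies on the fact that an LSN is printed as two hexadecimal numbers, the high and the low 32 bits of a 64-bit position.

```shell
# Convert an LSN such as 1/88BC2000 into an absolute byte position.
# The part before the slash is the high 32 bits, the part after it the low 32 bits.
lsn_to_bytes() {
    echo $(( 0x${1%/*} * 4294967296 + 0x${1#*/} ))
}

# Byte distance between two LSNs (first minus second).
lsn_diff() {
    echo $(( $(lsn_to_bytes "$1") - $(lsn_to_bytes "$2") ))
}
```

With the values shown above, `lsn_diff 1/88BC2000 1/88077D48` prints 11838136, so replay trails the sent position by roughly 11 MiB. Inside the database the same number is available from `pg_wal_lsn_diff(sent_lsn, replay_lsn)`.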
Example 1: the standby takes writes after being promoted; use pg_rewind to repair it and revert it to a read-only standby
1. Promote the standby
pg_ctl promote -D /data04/ppas11/pg_root4001
2. Write to the standby
pgbench -M prepared -v -r -P 1 -c 4 -j 4 -T 120 -p 4001
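At this point the standby is no longer on the same timeline as the primary, so it cannot directly become a standby of the current primary again.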
enterprisedb@pg11-test-> pg_controldata -D /data04/ppas11/pg_root4001|grep -i time
Latest checkpoint's TimeLineID: 1
Latest checkpoint's PrevTimeLineID: 1
Time of latest checkpoint: Mon 28 Jan 2019 03:56:38 PM CST
Min recovery ending loc's timeline: 2
track_commit_timestamp setting: off
Date/time type storage: 64-bit integers
enterprisedb@pg11-test-> pg_controldata -D /data04/ppas11/pg_root4000|grep -i time
Latest checkpoint's TimeLineID: 1
Latest checkpoint's PrevTimeLineID: 1
Time of latest checkpoint: Mon 28 Jan 2019 05:11:38 PM CST
Min recovery ending loc's timeline: 0
track_commit_timestamp setting: off
Date/time type storage: 64-bit integers
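When comparing several clusters, it can help to reduce the `pg_controldata` output to its timeline fields. The sketch below does that with awk. The function name `controldata_timelines` is hypothetical, and the field labels are taken from the output shown above.

```shell
# Reads `pg_controldata` output on stdin and prints
# "<latest checkpoint TimeLineID> <min recovery ending loc timeline>".
controldata_timelines() {
    awk -F': *' '
        /^Latest checkpoint.s TimeLineID:/     { ckpt = $2 }
        /^Min recovery ending loc.s timeline:/ { minrec = $2 }
        END { print ckpt, minrec }
    '
}
```

For the two outputs above this would print `1 2` for pg_root4001 and `1 0` for pg_root4000. In this capture, the promotion to timeline 2 is visible only in the second field.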
3. Repair the standby so that it becomes a standby of the current primary again
4. Check the switch point
cd /data04/ppas11/pg_root4001
ll pg_wal/*.history
-rw------- 1 enterprisedb enterprisedb 42 Jan 28 17:15 pg_wal/00000002.history
cat pg_wal/00000002.history
1 6/48C62000 no recovery target specified
5. All WAL generated on the standby since its promotion must be present in its pg_wal directory.
-rw------- 1 enterprisedb enterprisedb 42 Jan 28 17:15 00000002.history
-rw------- 1 enterprisedb enterprisedb 16M Jan 28 17:16 000000020000000600000048
............
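Every WAL segment from 000000020000000600000048 onward must exist in the standby's pg_wal directory. Segments that have already been recycled must be copied back from the archive directory into the standby's pg_wal directory.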
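The segment name can be derived from the timeline ID and the switch LSN recorded in the history file. A minimal sketch follows. The function name `wal_file_for_lsn` is hypothetical, and the default 16 MiB segment size is an assumption that matches the 16M files listed above.

```shell
# Print the name of the WAL segment file that contains a given LSN.
# $1 = timeline ID (decimal), $2 = LSN such as 6/48C62000,
# $3 = WAL segment size in MiB (optional, defaults to 16).
wal_file_for_lsn() (
    seg_bytes=$(( ${3:-16} * 1024 * 1024 ))
    # Number of segments per 4 GiB "xlogid" (the middle part of the file name).
    segs_per_id=$(( 4294967296 / seg_bytes ))
    pos=$(( 0x${2%/*} * 4294967296 + 0x${2#*/} ))
    segno=$(( pos / seg_bytes ))
    printf '%08X%08X%08X\n' "$1" $(( segno / segs_per_id )) $(( segno % segs_per_id ))
)
```

`wal_file_for_lsn 2 6/48C62000` prints 000000020000000600000048, the first segment that has to be present.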
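6. All WAL generated on the primary since the standby was promoted must still be in the primary's pg_wal directory, or must be retrievable by the standby through restore_command (recovery.conf).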
recovery.conf
restore_command = 'cp /data04/ppas11/wal/%f %p'
7. pg_rewind command help
https://www.postgresql.org/docs/11/app-pgrewind.html
pg_rewind --help
pg_rewind resynchronizes a PostgreSQL cluster with another copy of the cluster.
Usage:
pg_rewind [OPTION]...
Options:
-D, --target-pgdata=DIRECTORY existing data directory to modify
--source-pgdata=DIRECTORY source data directory to synchronize with
--source-server=CONNSTR source server to synchronize with
-n, --dry-run stop before modifying anything
-P, --progress write progress messages
--debug write a lot of debug messages
-V, --version output version information, then exit
-?, --help show this help, then exit
Report bugs to <[email protected]>.
8. Stop the cluster that is being repaired
pg_ctl stop -m fast -D /data04/ppas11/pg_root4001
9. Dry run of the repair
pg_rewind -n -D /data04/ppas11/pg_root4001 --source-server="hostaddr=127.0.0.1 user=postgres port=4000"
servers diverged at WAL location 6/48C62000 on timeline 1
rewinding from last common checkpoint at 5/5A8CD30 on timeline 1
Done!
10. The dry run succeeded, so the repair is feasible. Perform the actual repair
pg_rewind -D /data04/ppas11/pg_root4001 --source-server="hostaddr=127.0.0.1 user=postgres port=4000"
servers diverged at WAL location 6/48C62000 on timeline 1
rewinding from last common checkpoint at 5/5A8CD30 on timeline 1
Done!
11. The repair is done. Update the configuration
cd /data04/ppas11/pg_root4001
vi postgresql.conf
port = 4001
mv recovery.done recovery.conf
vi recovery.conf
restore_command = 'cp /data04/ppas11/wal/%f %p'
recovery_target_timeline = 'latest'
standby_mode = on
primary_conninfo = 'host=localhost port=4000 user=postgres'
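12. Remove the files generated on the wrong timeline from the archive (here they are moved aside). Otherwise the repaired standby would follow timeline 00000002 after it starts, which is not what we want.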
mkdir /data04/ppas11/wal/error_tl_2
mv /data04/ppas11/wal/00000002* /data04/ppas11/wal/error_tl_2
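The same cleanup can be written as a reusable function. This is only a sketch: the name `quarantine_timeline` is hypothetical, and the `error_tl_<timeline>` directory name simply follows the convention used above.

```shell
# Move every archived file that belongs to one timeline into a quarantine directory.
# $1 = archive directory, $2 = timeline ID (decimal).
quarantine_timeline() (
    # WAL and history file names start with the timeline as 8 hex digits.
    prefix=$(printf '%08X' "$2")
    dest="$1/error_tl_$2"
    mkdir -p "$dest" || exit 1
    for f in "$1/$prefix"*; do
        # Skip the unexpanded pattern (no match) and anything that is not a regular file.
        if [ -f "$f" ]; then
            mv -- "$f" "$dest/" || exit 1
        fi
    done
    exit 0
)
```

`quarantine_timeline /data04/ppas11/wal 2` would have the same effect as the two commands above.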
13. Start the standby
pg_ctl start -D /data04/ppas11/pg_root4001
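14. It is advisable to run a checkpoint on the primary. Once the standby has received it, a restart of the standby recovers from the new checkpoint instead of replaying a large amount of WAL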
psql
checkpoint;
15. Run a load test against the primary
pgbench -M prepared -v -r -P 1 -c 16 -j 16 -T 200 -p 4000
16. Check the archiver status
postgres=# select * from pg_stat_archiver ;
archived_count | last_archived_wal | last_archived_time | failed_count | last_failed_wal | last_failed_time | stats_reset
----------------+--------------------------+----------------------------------+--------------+-----------------+------------------+----------------------------------
1756 | 0000000100000006000000DC | 28-JAN-19 17:41:57.562425 +08:00 | 0 | | | 28-JAN-19 15:01:17.883338 +08:00
(1 row)
17. Check the standby's health and lag to observe the state after the repair
postgres=# select * from pg_stat_replication ;
-[ RECORD 1 ]----+--------------------------------
pid | 13179
usesysid | 10
usename | postgres
application_name | walreceiver
client_addr | 127.0.0.1
client_hostname |
client_port | 63198
backend_start | 28-JAN-19 17:47:29.85308 +08:00
backend_xmin |
state | catchup
sent_lsn | 7/DDE80000
write_lsn | 7/DC000000
flush_lsn | 7/DC000000
replay_lsn | 7/26A8DCB0
write_lag | 00:00:18.373263
flush_lag | 00:00:18.373263
replay_lag | 00:00:18.373263
sync_priority | 0
sync_state | async
Example 2: after the standby is promoted to new primary, the old primary keeps taking writes; use pg_rewind to repair the old primary and demote it to a standby of the new primary
1. Promote the standby
pg_ctl promote -D /data04/ppas11/pg_root4001
2. Write to the standby (now the new primary)
pgbench -M prepared -v -r -P 1 -c 16 -j 16 -T 200 -p 4001
3. Write to the old primary
pgbench -M prepared -v -r -P 1 -c 16 -j 16 -T 200 -p 4000
At this point the old primary and the new primary are no longer on the same timeline
enterprisedb@pg11-test-> pg_controldata -D /data04/ppas11/pg_root4000|grep -i timeline
Latest checkpoint's TimeLineID: 1
Latest checkpoint's PrevTimeLineID: 1
Min recovery ending loc's timeline: 0
enterprisedb@pg11-test-> pg_controldata -D /data04/ppas11/pg_root4001|grep -i timeline
Latest checkpoint's TimeLineID: 1
Latest checkpoint's PrevTimeLineID: 1
Min recovery ending loc's timeline: 2
enterprisedb@pg11-test-> cd /data04/ppas11/pg_root4001/pg_wal
enterprisedb@pg11-test-> cat 00000002.history
1 8/48DE2318 no recovery target specified
enterprisedb@pg11-test-> ll *.partial
-rw------- 1 enterprisedb enterprisedb 16M Jan 28 17:48 000000010000000800000048.partial
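The switch point can also be read from the history file by a script instead of by eye. The sketch below assumes the format shown above: parent timeline, switch LSN, then a free-text reason, with any `#` lines treated as comments. The function name `history_switch_point` is hypothetical.

```shell
# Print "<parent timeline> <switch LSN>" from the last entry of a timeline history file.
# $1 = path to the .history file.
history_switch_point() {
    awk '!/^#/ && NF >= 2 { tli = $1; lsn = $2 }
         END { if (lsn != "") print tli, lsn }' "$1"
}
```

For the 00000002.history file above, it would print `1 8/48DE2318`.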
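4. Repair the old primary and turn it into a standby
4.1. All WAL generated on the old primary since the standby was promoted must be present in the old primary's pg_wal directory.
Every WAL segment from 000000010000000800000048 onward must exist in pg_wal. Segments that have already been recycled must be copied back from the WAL archive into pg_wal.
4.2. All WAL generated on the new primary since its promotion must be retrievable by the old primary through restore_command (recovery.conf).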
recovery.conf
restore_command = 'cp /data04/ppas11/wal/%f %p'
5. Shut down the old primary
pg_ctl stop -m fast -D /data04/ppas11/pg_root4000
6. Dry run of the repair on the old primary
pg_rewind -n -D /data04/ppas11/pg_root4000 --source-server="hostaddr=127.0.0.1 user=postgres port=4001"
servers diverged at WAL location 8/48DE2318 on timeline 1
rewinding from last common checkpoint at 6/CCCEF770 on timeline 1
Done!
7. The dry run succeeded, so the repair is feasible. Perform the actual repair
pg_rewind -D /data04/ppas11/pg_root4000 --source-server="hostaddr=127.0.0.1 user=postgres port=4001"
8. After the repair completes, update the configuration
cd /data04/ppas11/pg_root4000
vi postgresql.conf
port = 4000
mv recovery.done recovery.conf
vi recovery.conf
restore_command = 'cp /data04/ppas11/wal/%f %p'
recovery_target_timeline = 'latest'
standby_mode = on
primary_conninfo = 'host=localhost port=4001 user=postgres'
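Because the same four settings are written in both examples with only the source port changing, the file can be generated by a function. This is a sketch for the PostgreSQL 11 style recovery.conf used in this article (PostgreSQL 12 and later no longer read recovery.conf). The function name `make_recovery_conf` is hypothetical.

```shell
# Print a PostgreSQL 11 style recovery.conf for a standby.
# $1 = source host, $2 = source port, $3 = WAL archive directory.
# The user name postgres follows this article's setup.
make_recovery_conf() {
    cat <<EOF
restore_command = 'cp $3/%f %p'
recovery_target_timeline = 'latest'
standby_mode = on
primary_conninfo = 'host=$1 port=$2 user=postgres'
EOF
}
```

For example, `make_recovery_conf localhost 4001 /data04/ppas11/wal > /data04/ppas11/pg_root4000/recovery.conf` would produce the file shown above.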
9. Start the old primary
pg_ctl start -D /data04/ppas11/pg_root4000
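10. It is advisable to run a checkpoint on the new primary. Once the standby has received it, a restart of the standby recovers from the new checkpoint instead of replaying a large amount of WAL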
checkpoint;
11. Run a load test against the new primary
pgbench -M prepared -v -r -P 1 -c 16 -j 16 -T 200 -p 4001
12. Check the archiver status
psql -p 4001
postgres=# select * from pg_stat_archiver ;
archived_count | last_archived_wal | last_archived_time | failed_count | last_failed_wal | last_failed_time | stats_reset
----------------+--------------------------+----------------------------------+--------------+-----------------+------------------+----------------------------------
406 | 0000000200000009000000DB | 28-JAN-19 21:18:22.976118 +08:00 | 0 | | | 28-JAN-19 17:47:29.847488 +08:00
(1 row)
13. Check the standby's health and lag
psql -p 4001
postgres=# select * from pg_stat_replication ;
-[ RECORD 1 ]----+---------------------------------
pid | 17675
usesysid | 10
usename | postgres
application_name | walreceiver
client_addr | 127.0.0.1
client_hostname |
client_port | 60530
backend_start | 28-JAN-19 21:18:36.472197 +08:00
backend_xmin |
state | streaming
sent_lsn | 9/E8361C18
write_lsn | 9/E8361C18
flush_lsn | 9/E8361C18
replay_lsn | 9/D235B520
write_lag | 00:00:00.000101
flush_lag | 00:00:00.000184
replay_lag | 00:00:03.028098
sync_priority | 0
sync_state | async
Summary
1 Applicable scenarios
Without pg_rewind, the situations above require a full rebuild of the standby. For a large database a rebuild takes a long time and puts a heavy read load on the upstream database.
2 Prerequisites
1. full_page_writes must be on
2. data_checksums or wal_log_hints must be enabled
initdb -k enables data_checksums
3 Principle and repair procedure
