
Why pg_basebackup or pg_start_backup seems to hang without starting to copy files: checkpoint scheduling modes (checkpoint_completion_target)

Tags

PostgreSQL, checkpoint, scheduling, lazy, immediate, pg_start_backup, pg_basebackup

Background

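PostgreSQL supports online full backups and incremental archive backups. An online full backup is essentially a file copy. Incremental backup comes in two forms: block-level incremental backup, based on changes to each block's LSN, and continuous archiving of WAL files.

A full backup is usually taken with the pg_basebackup client, or by calling the SQL function pg_start_backup() and then copying the files or taking a snapshot.

Before a full backup starts, the database must perform a checkpoint and force full page writes on, so that any partially written block copied during the backup can later be repaired from WAL. When the backup ends, pg_stop_backup signals completion and turns the forced full page writes off again (if the full page writes parameter is already enabled, nothing changes).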

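Sometimes pg_basebackup or pg_start_backup appears to hang without starting to copy any files. It is actually waiting for a checkpoint. So why is this checkpoint slow, when running the CHECKPOINT command directly in SQL is fast?

The reason is that a checkpoint can run in either scheduled (spread) or unscheduled (immediate) mode.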

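A scheduled checkpoint is paced according to checkpoint_completion_target and the configured max_wal_size. The larger checkpoint_completion_target and max_wal_size are, the wider the interval over which the checkpoint is spread, so its total elapsed time can be very long. The benefit is that it avoids the burst of dirty-page writes and fsync calls a checkpoint would otherwise cause, which reduces performance jitter.

The downside is that the checkpoint takes a long time to finish.

An unscheduled checkpoint completes as quickly as possible, flushing dirty pages at full speed with no pacing. The benefit is speed. The downside is that, if there are many dirty pages, the heavy I/O can hurt the performance of other sessions.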

postgres=# show max_wal_size ;  
 max_wal_size   
--------------  
 128GB  
(1 row)  
  
postgres=# show min_wal_size;  
 min_wal_size   
--------------  
 32GB  
(1 row)  
             
postgres=# show checkpoint_completion_target ;  
 checkpoint_completion_target   
------------------------------  
 0.1  
(1 row)  
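
To make the pacing idea concrete, here is a minimal Python sketch loosely modeled on the kind of on-schedule test a spread checkpoint performs between buffer writes. It is a simplification under stated assumptions, not PostgreSQL's code: the function name and all arguments are hypothetical, the WAL budget is taken to be max_wal_size directly, and the time budget is checkpoint_timeout, a parameter the text above does not show (300 seconds is assumed here).

```python
def is_on_schedule(progress, completion_target, elapsed_seconds,
                   checkpoint_timeout, wal_bytes_written, max_wal_bytes):
    """Return True if a spread checkpoint is ahead of schedule and may pause.

    progress is the fraction of dirty buffers already written (0.0 to 1.0).
    Every other argument is a hypothetical stand-in for server state.
    """
    # Scale progress so the work is due at completion_target of the
    # interval rather than at its very end.
    scaled = progress * completion_target

    # Fractions of the time budget and the WAL budget consumed so far.
    elapsed_time = elapsed_seconds / checkpoint_timeout
    elapsed_wal = wal_bytes_written / max_wal_bytes

    # On schedule only if the checkpoint is ahead of both budgets.
    return scaled >= elapsed_time and scaled >= elapsed_wal


GIB = 1024 ** 3

# Half the buffers written, 30 s into an assumed 300 s timeout, little WAL.
# With a target of 0.1 the checkpoint is behind schedule and must keep writing.
print(is_on_schedule(0.5, 0.1, 30, 300, 0, 128 * GIB))
# With a target of 0.9 the same checkpoint is ahead of schedule and may sleep.
print(is_on_schedule(0.5, 0.9, 30, 300, 0, 128 * GIB))
```

In this model a larger target or a larger WAL budget gives the checkpoint more room to pause, which is why a spread checkpoint can take a long time. An immediate checkpoint skips the test altogether.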
           

The source code shows that a checkpoint request accepts the following flags, which control how the checkpoint behaves.

/*
 * RequestCheckpoint  
 *              Called in backend processes to request a checkpoint  
 *  
 * flags is a bitwise OR of the following:  
 *      CHECKPOINT_IS_SHUTDOWN: checkpoint is for database shutdown.  
 *      CHECKPOINT_END_OF_RECOVERY: checkpoint is for end of WAL recovery.  
 *      CHECKPOINT_IMMEDIATE: finish the checkpoint ASAP,  
 *              ignoring checkpoint_completion_target parameter.  
 *      CHECKPOINT_FORCE: force a checkpoint even if no XLOG activity has occurred  
 *              since the last one (implied by CHECKPOINT_IS_SHUTDOWN or  
 *              CHECKPOINT_END_OF_RECOVERY).  
 *      CHECKPOINT_WAIT: wait for completion before returning (otherwise,  
 *              just signal checkpointer to do it, and return).  
 *      CHECKPOINT_CAUSE_XLOG: checkpoint is requested due to xlog filling.  
 *              (This affects logging, and in particular enables CheckPointWarning.)  
 */  
void  
RequestCheckpoint(int flags)  
           

How does starting a backup choose between a fast (unscheduled) checkpoint and a scheduled one?

XLogRecPtr  
do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,  
                                   StringInfo labelfile, DIR *tblspcdir, List **tablespaces,  
                                   StringInfo tblspcmapfile, bool infotbssize,  
                                   bool needtblspcmapfile)  
{  
  
  
  
                         * Since the fact that we are executing do_pg_start_backup()  
                         * during recovery means that checkpointer is running, we can use  
                         * RequestCheckpoint() to establish a restartpoint.  
                         *  
                         * We use CHECKPOINT_IMMEDIATE only if requested by user (via  
                         * passing fast = true).  Otherwise this can take awhile.  
                         */  
                        RequestCheckpoint(CHECKPOINT_FORCE | CHECKPOINT_WAIT |  
                                                          (fast ? CHECKPOINT_IMMEDIATE : 0));  
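
The flag expression in the excerpt above can be mirrored in a few lines of Python. This is only an illustrative sketch: the class and function names are hypothetical, and the bit values are assumptions chosen for the example rather than values copied from the PostgreSQL headers.

```python
from enum import IntFlag


class CheckpointFlag(IntFlag):
    # Bit values are illustrative assumptions, not PostgreSQL's definitions.
    IS_SHUTDOWN = 0x01
    END_OF_RECOVERY = 0x02
    IMMEDIATE = 0x04
    FORCE = 0x08
    WAIT = 0x10
    CAUSE_XLOG = 0x20


def backup_checkpoint_flags(fast):
    """Combine flags the way the do_pg_start_backup() excerpt does."""
    # A backup always forces a checkpoint and waits for it to complete.
    flags = CheckpointFlag.FORCE | CheckpointFlag.WAIT
    # Only fast = true adds IMMEDIATE, which disables the pacing.
    if fast:
        flags |= CheckpointFlag.IMMEDIATE
    return flags


print(CheckpointFlag.IMMEDIATE in backup_checkpoint_flags(True))
print(CheckpointFlag.IMMEDIATE in backup_checkpoint_flags(False))
```

Because CHECKPOINT_WAIT is always set, the caller blocks until the checkpoint finishes. Without CHECKPOINT_IMMEDIATE, that wait lasts as long as the spread checkpoint does, which is the apparent hang.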
           

1. The pg_basebackup client: controlled by the -c option (fast selects an unscheduled checkpoint).

-c, --checkpoint=fast|spread  
                         set fast or spread checkpointing  
           

2. The pg_start_backup SQL function: controlled by the fast argument.

postgres=# \df pg_start_backup  
                                                        List of functions  
   Schema   |      Name       | Result data type |                          Argument data types                           | Type   
------------+-----------------+------------------+------------------------------------------------------------------------+------  
 pg_catalog | pg_start_backup | pg_lsn           | label text, fast boolean DEFAULT false, exclusive boolean DEFAULT true | func  
(1 row)  
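
As a rough illustration of the two entry points listed above, the following Python sketch builds a pg_basebackup argument list and a pg_start_backup() call with the checkpoint mode made explicit. The helper names, the target directory, and the backup label are hypothetical. The sketch only constructs the command and the statement and does not run them. It follows the function signature shown above; on PostgreSQL 15 and later the function is named pg_backup_start, so the statement would need adjusting there.

```python
def basebackup_command(target_dir, fast):
    """Build a pg_basebackup argument list with an explicit checkpoint mode."""
    mode = "fast" if fast else "spread"
    return ["pg_basebackup", "-D", target_dir, "--checkpoint=" + mode]


def start_backup_sql(label, fast):
    """Build a pg_start_backup() call using the label and fast arguments."""
    # Double any single quote so the label is a valid SQL string literal
    # (assumes standard_conforming_strings is on).
    quoted = "'" + label.replace("'", "''") + "'"
    return "SELECT pg_start_backup({}, {});".format(
        quoted, "true" if fast else "false")


# Hypothetical target directory and label, for illustration only.
print(" ".join(basebackup_command("/backup/base", fast=True)))
print(start_backup_sql("nightly", fast=True))
```

Passing the mode explicitly in both places keeps the choice visible in scripts, so it is clear whether a slow start is expected.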
           

Summary

If you need a backup to start quickly, use the fast option (unscheduled mode).
