Why pg_basebackup or pg_start_backup seems to hang without starting to copy files - the checkpoint scheduling modes (checkpoint_completion_target)

Tags

PostgreSQL , checkpoint , scheduling , lazy , immediate , pg_start_backup , pg_basebackup

Background

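PostgreSQL supports online full backups and incremental backups. An online full backup is essentially a copy of the data files. Incremental backups come in two kinds: block-level incremental backup based on changes to each block's LSN, and continuous archiving of WAL files.

A full backup is usually taken with the pg_basebackup client, or by calling the SQL function pg_start_backup() and then copying the files or taking a snapshot.

Before a full backup starts, the database must perform a checkpoint and force full page writes on, so that any partially copied (torn) block can later be repaired from WAL. When the backup ends, pg_stop_backup tells the server so, and the forced full page writes are switched off (if the full_page_writes parameter is already on, nothing changes).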

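Sometimes pg_basebackup or pg_start_backup appears to hang without starting to copy any files. It is actually waiting for the checkpoint. But why is this checkpoint slow, when running the CHECKPOINT command directly in SQL finishes quickly?

The reason is that a checkpoint can run in a scheduled (spread) mode or an unscheduled (immediate) mode.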

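A scheduled checkpoint is paced according to checkpoint_completion_target and the interval to the next checkpoint, which is bounded by checkpoint_timeout and the configured max_wal_size. The larger checkpoint_completion_target and that interval are, the longer the window over which the checkpoint is spread, so its total duration can be very long. The benefit is that the checkpoint does not cause a burst of dirty-page writes and fsync calls, which reduces performance jitter.

The downside is that the checkpoint takes a long time.

An unscheduled checkpoint completes as quickly as possible: it flushes dirty pages at full speed with no pacing. It is fast, but if there are many dirty pages it can generate heavy I/O that hurts the performance of other sessions.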

postgres=# show max_wal_size ;  
 max_wal_size   
--------------  
 128GB  
(1 row)  
  
postgres=# show min_wal_size;  
 min_wal_size   
--------------  
 32GB  
(1 row)  
             
postgres=# show checkpoint_completion_target ;  
 checkpoint_completion_target   
------------------------------  
 0.1  
(1 row)
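
To make the pacing concrete, here is a minimal Python sketch of the on-schedule check, loosely modeled on how the checkpointer decides whether it may pause between writes. It is a simplified model, not the server's code: the function names `may_sleep` and `idle_spread_duration` are hypothetical, and the 300-second `checkpoint_timeout` used in the example is an assumption (the PostgreSQL default), since the settings above do not show it.

```python
def may_sleep(progress, elapsed_time_frac, elapsed_wal_frac,
              completion_target, immediate=False):
    """Return True if the (modeled) checkpointer may pause before the next write.

    progress          -- fraction of dirty buffers already written (0.0 to 1.0)
    elapsed_time_frac -- time since checkpoint start / checkpoint_timeout
    elapsed_wal_frac  -- WAL written since checkpoint start / WAL budget
    completion_target -- checkpoint_completion_target
    immediate         -- True models an unscheduled (fast) checkpoint
    """
    if immediate:
        # Unscheduled mode: never pause, flush at full speed.
        return False
    # Scale progress so that 100% written corresponds to completion_target
    # of the interval, then require being ahead of both budgets.
    scaled = progress * completion_target
    return scaled >= elapsed_time_frac and scaled >= elapsed_wal_frac


def idle_spread_duration(checkpoint_timeout_s, completion_target):
    """Rough duration of a spread checkpoint when little WAL is generated,
    so that only the time budget matters (an assumption of this model)."""
    return checkpoint_timeout_s * completion_target


if __name__ == "__main__":
    # Half the buffers written, 1% of the timeout elapsed, target 0.1.
    print(may_sleep(0.5, 0.01, 0.0, 0.1))
    # Assumed checkpoint_timeout of 300 s with the target shown above.
    print(idle_spread_duration(300, 0.1))
```

In this model, a spread checkpoint on a mostly idle system with checkpoint_completion_target = 0.1 and a 300-second timeout stretches to roughly 30 seconds, and with a target of 0.9 to roughly 270 seconds, however few pages are dirty. An immediate checkpoint never pauses.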

The source code shows that the following flags control checkpoint behavior.

/*  
 * RequestCheckpoint  
 *              Called in backend processes to request a checkpoint  
 *  
 * flags is a bitwise OR of the following:  
 *      CHECKPOINT_IS_SHUTDOWN: checkpoint is for database shutdown.  
 *      CHECKPOINT_END_OF_RECOVERY: checkpoint is for end of WAL recovery.  
 *      CHECKPOINT_IMMEDIATE: finish the checkpoint ASAP,  
 *              ignoring checkpoint_completion_target parameter.  
 *      CHECKPOINT_FORCE: force a checkpoint even if no XLOG activity has occurred  
 *              since the last one (implied by CHECKPOINT_IS_SHUTDOWN or  
 *              CHECKPOINT_END_OF_RECOVERY).  
 *      CHECKPOINT_WAIT: wait for completion before returning (otherwise,  
 *              just signal checkpointer to do it, and return).  
 *      CHECKPOINT_CAUSE_XLOG: checkpoint is requested due to xlog filling.  
 *              (This affects logging, and in particular enables CheckPointWarning.)  
 */  
void  
RequestCheckpoint(int flags)

How does starting a backup choose between a fast (unscheduled) checkpoint and a scheduled one?

XLogRecPtr  
do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,  
                                   StringInfo labelfile, DIR *tblspcdir, List **tablespaces,  
                                   StringInfo tblspcmapfile, bool infotbssize,  
                                   bool needtblspcmapfile)  
{  
        /* ... (earlier part of the function omitted in this excerpt) ... */  
  
                        /*  
                         * ...  
                         * Since the fact that we are executing do_pg_start_backup()  
                         * during recovery means that checkpointer is running, we can use  
                         * RequestCheckpoint() to establish a restartpoint.  
                         *  
                         * We use CHECKPOINT_IMMEDIATE only if requested by user (via  
                         * passing fast = true).  Otherwise this can take awhile.  
                         */  
                        RequestCheckpoint(CHECKPOINT_FORCE | CHECKPOINT_WAIT |  
                                                          (fast ? CHECKPOINT_IMMEDIATE : 0));
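
The flag composition in the call above can be mirrored in a few lines of Python. This is only an illustration of the bitwise OR. The class `CheckpointFlag` and the function `backup_checkpoint_flags` are hypothetical names, and the bit values are placeholders that are not guaranteed to match the definitions in the PostgreSQL headers.

```python
from enum import IntFlag


class CheckpointFlag(IntFlag):
    # Placeholder bit values for illustration only; the real values
    # live in the PostgreSQL source and may differ between versions.
    IS_SHUTDOWN = 0x01
    END_OF_RECOVERY = 0x02
    IMMEDIATE = 0x04
    FORCE = 0x08
    WAIT = 0x10
    CAUSE_XLOG = 0x20


def backup_checkpoint_flags(fast: bool) -> CheckpointFlag:
    """Model of the flags do_pg_start_backup() passes to RequestCheckpoint()."""
    # Always force the checkpoint and wait for it to complete.
    flags = CheckpointFlag.FORCE | CheckpointFlag.WAIT
    if fast:
        # Only a user-requested fast backup skips the pacing.
        flags |= CheckpointFlag.IMMEDIATE
    return flags


if __name__ == "__main__":
    for fast in (False, True):
        flags = backup_checkpoint_flags(fast)
        print(fast, bool(flags & CheckpointFlag.IMMEDIATE))
```

Because CHECKPOINT_WAIT is always set, the caller blocks until the checkpoint is done. Without CHECKPOINT_IMMEDIATE that wait lasts as long as the paced checkpoint, which is the apparent hang.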

1. The pg_basebackup client: controlled with the -c option (fast selects the unscheduled checkpoint)

-c, --checkpoint=fast|spread  
                         set fast or spread checkpointing

2. The pg_start_backup SQL function: controlled with the fast argument

postgres=# \df pg_start_backup  
                                                        List of functions  
   Schema   |      Name       | Result data type |                          Argument data types                           | Type   
------------+-----------------+------------------+------------------------------------------------------------------------+------  
 pg_catalog | pg_start_backup | pg_lsn           | label text, fast boolean DEFAULT false, exclusive boolean DEFAULT true | func  
(1 row)
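
As a sketch of how the two options above might be used from a backup script, the following Python helpers build the pg_basebackup command line and the pg_start_backup call. The helper names, the directory `/backup/base`, and the label `nightly` are hypothetical; only the `-D` and `--checkpoint` options and the `(label, fast)` arguments come from the tools themselves. Note that PostgreSQL 15 and later replace pg_start_backup with pg_backup_start, so the SQL helper applies to earlier versions only.

```python
import shlex


def pg_basebackup_command(target_dir, fast=True):
    """Build a pg_basebackup argument list with an explicit checkpoint mode."""
    mode = "fast" if fast else "spread"
    return ["pg_basebackup", "-D", target_dir, f"--checkpoint={mode}"]


def pg_start_backup_sql(label, fast=True):
    """Build the SQL statement that starts a backup (PostgreSQL 14 and earlier)."""
    escaped = label.replace("'", "''")  # escape single quotes in the label
    fast_literal = "true" if fast else "false"
    return f"SELECT pg_start_backup('{escaped}', {fast_literal});"


if __name__ == "__main__":
    # Print the command and statement; running them is left to the caller.
    print(shlex.join(pg_basebackup_command("/backup/base", fast=True)))
    print(pg_start_backup_sql("nightly", fast=True))
```

The helpers only build strings, so they can be checked without a running server. Executing the command or statement requires the usual replication or superuser privileges.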

Summary

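If you need the backup to start quickly, use the fast (unscheduled checkpoint) option: --checkpoint=fast for pg_basebackup, or fast set to true for pg_start_backup.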
