天天看點

PostgreSQL 誰堵塞了誰(鎖等待檢測)- pg_blocking_pids, pg_safe_snapshot_blocking_pids

标簽

PostgreSQL , 鎖等待 , 隊列

https://github.com/digoal/blog/blob/master/201902/20190201_02.md#%E8%83%8C%E6%99%AF 背景

當一個程序處于等待(被堵塞)狀态時,是誰幹的?可以使用如下函數,快速得到搗蛋(堵塞别人)的PID。

1、請求鎖時被堵,是哪些PID堵的?

pg_blocking_pids(int)	int[]	Process ID(s) that are blocking specified server process ID from acquiring a lock  
           

2、請求safe快照時被堵(SSI隔離級别,請求安全快照沖突),是哪些PID堵的?

pg_safe_snapshot_blocking_pids(int)	int[]	Process ID(s) that are blocking specified server process ID from acquiring a safe snapshot  
           

https://github.com/digoal/blog/blob/master/201902/20190201_02.md#%E4%BE%8B%E5%AD%90 例子

1、會話1

postgres=# begin;  
BEGIN  
postgres=# select * from tbl limit 1;  
   id   | c1 | c2   
--------+----+----  
 918943 |  1 |  0  
(1 row)  
  
postgres=# select pg_backend_pid();  
 pg_backend_pid   
----------------  
          30862  
(1 row)  
           

2、會話2

postgres=# begin;  
BEGIN  
postgres=# select pg_backend_pid();  
 pg_backend_pid   
----------------  
          30928  
(1 row)  
  
postgres=# truncate tbl;  
  
等待中  
           

3、會話3

postgres=# begin;  
BEGIN  
postgres=# select pg_backend_pid();  
 pg_backend_pid   
----------------  
          30936  
(1 row)  
  
postgres=# select * from tbl limit 1;  
  
等待中  
           

4、會話4

postgres=# select pg_backend_pid();  
 pg_backend_pid   
----------------  
          30999  
(1 row)  
  
postgres=# select * from tbl limit 1;  
  
等待中  
           

5、檢視搗蛋PID

postgres=# select pid,pg_blocking_pids(pid),wait_event_type,wait_event,query from pg_stat_activity ;  
  pid  | pg_blocking_pids | wait_event_type |     wait_event      |                                           query                                             
-------+------------------+-----------------+---------------------+-------------------------------------------------------------------------------------------  
 30862 | {}               | Client          | ClientRead          | select pg_backend_pid();  
 30928 | {30862}          | Lock            | relation            | truncate tbl;  
 30936 | {30928}          | Lock            | relation            | select * from tbl limit 1;  
 30999 | {30928}          | Lock            | relation            | select * from tbl limit 1;  
           

https://github.com/digoal/blog/blob/master/201902/20190201_02.md#%E4%BB%A3%E7%A0%81 代碼

src/backend/utils/adt/lockfuncs.c

/*  
 * pg_blocking_pids - produce an array of the PIDs blocking given PID  
 *  
 * The reported PIDs are those that hold a lock conflicting with blocked_pid's  
 * current request (hard block), or are requesting such a lock and are ahead  
 * of blocked_pid in the lock's wait queue (soft block).  
 *  
 * In parallel-query cases, we report all PIDs blocking any member of the  
 * given PID's lock group, and the reported PIDs are those of the blocking  
 * PIDs' lock group leaders.  This allows callers to compare the result to  
 * lists of clients' pg_backend_pid() results even during a parallel query.  
 *  
 * Parallel query makes it possible for there to be duplicate PIDs in the  
 * result (either because multiple waiters are blocked by same PID, or  
 * because multiple blockers have same group leader PID).  We do not bother  
 * to eliminate such duplicates from the result.  
 *  
 * We need not consider predicate locks here, since those don't block anything.  
 */  
Datum  
pg_blocking_pids(PG_FUNCTION_ARGS)  
{  
  
  
...............  
  
/*  
 * pg_safe_snapshot_blocking_pids - produce an array of the PIDs blocking  
 * given PID from getting a safe snapshot  
 *  
 * XXX this does not consider parallel-query cases; not clear how big a  
 * problem that is in practice  
 */  
Datum  
pg_safe_snapshot_blocking_pids(PG_FUNCTION_ARGS)  
{  
...........  
           

src/backend/storage/lmgr/predicate.c

/*  
 * GetSafeSnapshotBlockingPids  
 *              If the specified process is currently blocked in GetSafeSnapshot,  
 *              write the process IDs of all processes that it is blocked by  
 *              into the caller-supplied buffer output[].  The list is truncated at  
 *              output_size, and the number of PIDs written into the buffer is  
 *              returned.  Returns zero if the given PID is not currently blocked  
 *              in GetSafeSnapshot.  
 */  
int  
GetSafeSnapshotBlockingPids(int blocked_pid, int *output, int output_size)  
{  
        int                     num_written = 0;  
        SERIALIZABLEXACT *sxact;  
  
        LWLockAcquire(SerializableXactHashLock, LW_SHARED);  
  
        /* Find blocked_pid's SERIALIZABLEXACT by linear search. */  
        for (sxact = FirstPredXact(); sxact != NULL; sxact = NextPredXact(sxact))  
        {  
                if (sxact->pid == blocked_pid)  
                        break;  
        }  
  
        /* Did we find it, and is it currently waiting in GetSafeSnapshot? */  
        if (sxact != NULL && SxactIsDeferrableWaiting(sxact))  
        {  
                RWConflict      possibleUnsafeConflict;  
  
                /* Traverse the list of possible unsafe conflicts collecting PIDs. */  
                possibleUnsafeConflict = (RWConflict)  
                        SHMQueueNext(&sxact->possibleUnsafeConflicts,  
                                                 &sxact->possibleUnsafeConflicts,  
                                                 offsetof(RWConflictData, inLink));  
  
                while (possibleUnsafeConflict != NULL && num_written < output_size)  
                {  
                        output[num_written++] = possibleUnsafeConflict->sxactOut->pid;  
                        possibleUnsafeConflict = (RWConflict)  
                                SHMQueueNext(&sxact->possibleUnsafeConflicts,  
                                                         &possibleUnsafeConflict->inLink,  
                                                         offsetof(RWConflictData, inLink));  
                }  
        }  
  
        LWLockRelease(SerializableXactHashLock);  
  
        return num_written;  
}  
           

https://github.com/digoal/blog/blob/master/201902/20190201_02.md#%E5%8F%82%E8%80%83 參考

https://www.postgresql.org/docs/11/functions-info.html 《PostgreSQL 鎖等待排查實踐 - 珍藏級 - process xxx1 acquired RowExclusiveLock on relation xxx2 of database xxx3 after xxx4 ms at xxx》 《PostgreSQL 鎖等待監控 珍藏級SQL - 誰堵塞了誰》 《PostgreSQL 鎖等待跟蹤》

https://github.com/digoal/blog/blob/master/201902/20190201_02.md#%E5%85%8D%E8%B4%B9%E9%A2%86%E5%8F%96%E9%98%BF%E9%87%8C%E4%BA%91rds-postgresql%E5%AE%9E%E4%BE%8Becs%E8%99%9A%E6%8B%9F%E6%9C%BA 免費領取阿裡雲RDS PostgreSQL執行個體、ECS虛拟機

PostgreSQL 誰堵塞了誰(鎖等待檢測)- pg_blocking_pids, pg_safe_snapshot_blocking_pids