背景
場景:
内卷現象, 供不應求(高峰期打車、電商秒殺), 熱點資料更新
社會現象: 資源有限而需求無限的情況(春運時期的火車票、學生報補習班、企業裡面的資源地盤争奪等)
挑戰:
當系統中出現熱點row時, 意味着大量的并發請求更新同一行資料, 因為資料庫最小粒度的鎖為行鎖,是以這些并發請求隻能串行執行,
一個會話在更新的時候其他所有會話都處于等待狀态, 可能導緻連接配接打爆, 其他會話連不進來引起雪崩.
如果被秒殺的商品庫存隻有10個, 那麼實際上隻有10個請求能達成交易, 其他等待中的會話都屬于無用功.浪費大量的連接配接和等待時間.
PG 解決方案:
大禹治水(疏導、消滅無用等待):
- SKIP LOCKED,
- advisory lock
例子
測試表, 1條熱點記錄, 庫存1000萬.
id int primary key , -- 商品ID
cnt int , -- 庫存
ts timestamp -- 修改時間
);
insert into a values (1, 10000000, now());
扣減庫存并傳回
id | cnt | ts
----+---------+----------------------------
1 | 9999993 | 2021-06-01 14:41:14.775177
(1 row)
UPDATE 1
postgres=# update a set cnt=cnt-1, ts=clock_timestamp() where id=1 returning *;
id | cnt | ts
----+---------+----------------------------
1 | 9999992 | 2021-06-01 14:41:17.747961
(1 row)
UPDATE 1
并發能力測試
1、傳統方法
update a set cnt=cnt-1, ts=clock_timestamp() where id=1 returning *;
pgbench -M prepared -n -r -P 1 -f ./test.sql -c 12 -j 12 -T 120
pgbench (PostgreSQL) 14.0
transaction type: ./test.sql
scaling factor: 1
query mode: prepared
number of clients: 12
number of threads: 12
duration: 120 s
number of transactions actually processed: 2301279
latency average = 0.625 ms
latency stddev = 0.562 ms
initial connection time = 8.466 ms
tps = 19177.578464 (without initial connection time)
statement latencies in milliseconds:
0.625 update a set cnt=cnt-1, ts=clock_timestamp() where id=1 returning *;
2、skip locked 跳過被鎖的行
ctid =
(select ctid from a where id=1 and cnt>=1 for update skip locked)
returning *;
QUERY PLAN
-----------------------------------------------------------------------------------
Update on a (cost=2.36..3.48 rows=1 width=18)
InitPlan 1 (returns $1)
-> LockRows (cost=0.12..2.36 rows=1 width=12)
-> Index Scan using a_pkey on a a_1 (cost=0.12..2.35 rows=1 width=12)
Index Cond: (id = 1)
Filter: (cnt >= 1)
-> Tid Scan on a (cost=0.00..1.12 rows=1 width=18)
TID Cond: (ctid = $1)
(8 rows)
pgbench (PostgreSQL) 14.0
transaction type: ./test.sql
scaling factor: 1
query mode: prepared
number of clients: 12
number of threads: 12
duration: 120 s
number of transactions actually processed: 7165617
latency average = 0.201 ms
latency stddev = 0.150 ms
initial connection time = 11.126 ms
tps = 59717.700525 (without initial connection time)
statement latencies in milliseconds:
0.202 update a set cnt=cnt-1 , ts=clock_timestamp() where
3、advisory lock, 徹底消除行鎖
QUERY PLAN
-----------------------------------------------------------------------
Update on a (cost=0.12..2.36 rows=1 width=18)
-> Index Scan using a_pkey on a (cost=0.12..2.36 rows=1 width=18)
Index Cond: (id = 1)
Filter: pg_try_advisory_xact_lock((id)::bigint)
(4 rows)
postgres=# begin;
BEGIN
postgres=*# update a set cnt=cnt-1, ts=clock_timestamp() where id=1 and pg_try_advisory_xact_lock(id) returning *;
id | cnt | ts
----+---------+----------------------------
1 | 6839129 | 2021-06-01 14:47:54.232782
(1 row)
UPDATE 1
其他會話, 探測同一個商品ID的advisory鎖, 未擷取則不會進行更新
postgres=# update a set cnt=cnt-1, ts=clock_timestamp() where id=1 and pg_try_advisory_xact_lock(id) returning *;
id | cnt | ts
----+-----+----
(0 rows)
UPDATE 0
transaction type: ./test.sql
scaling factor: 1
query mode: prepared
number of clients: 12
number of threads: 12
duration: 120 s
number of transactions actually processed: 10701637
latency average = 0.134 ms
latency stddev = 0.705 ms
initial connection time = 10.577 ms
tps = 89184.703653 (without initial connection time)
statement latencies in milliseconds:
0.136 update a set cnt=cnt-1, ts=clock_timestamp() where id=1 and pg_try_advisory_xact_lock(id) returning *;
tps 性能提升
12個并發:
19177(傳統方法) -> 59717(skip locked) -> 89184(advisory lock)
800個并發:
374(傳統方法) -> 34495(skip locked) -> 70444(advisory lock)
知識點
1、skip locked
https://www.postgresql.org/docs/14/sql-select.html2、advisory lock (database->session|xact level)
https://www.postgresql.org/docs/14/functions-admin.html#FUNCTIONS-ADVISORY-LOCKS https://www.postgresql.org/docs/14/explicit-locking.html#ADVISORY-LOCKS3、tid scan
https://www.postgresql.org/docs/14/runtime-config-query.html#RUNTIME-CONFIG-QUERY-ENABLE4、ctid
https://www.postgresql.org/docs/14/ddl-system-columns.html5、update delete returning
https://www.postgresql.org/docs/14/dml-returning.html