背景
- 軌迹類業務, 一個軌迹由多個點組成, 每個點的ROW寫入散落到不同的PAGE, 查詢一條軌迹可能要回表通路上百千個PAGE, 号稱IO殺手.
業務:
- 共享單車、巡邏車、公務用車、網約車、金融行業股票資料、物聯網行業傳感器資料等.
PG index include (覆寫索引)功能:
- 重組存儲結構, 按指定次元聚集.
- 葉子結點存儲include column value, 無需回表(軌迹資料都是append only的, VM bit全部都是clean page, 是以無需回表).
文法:
Command: CREATE INDEX
Description: define a new index
Syntax:
CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] name ] ON [ ONLY ] table_name [ USING method ]
( { column_name | ( expression ) } [ COLLATE collation ] [ opclass [ ( opclass_parameter = value [, ... ] ) ] ] [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] )
[ INCLUDE ( column_name [, ...] ) ]
[ WITH ( storage_parameter [= value] [, ... ] ) ]
[ TABLESPACE tablespace_name ]
[ WHERE predicate ]
URL:
https://www.postgresql.org/docs/14/sql-createindex.html例子
共享單車軌迹
create unlogged table tbl_sensor_track
(
id serial8 primary key,
sid int, -- 單車ID
pos point, -- 位置
traceid int, -- 軌迹ID
info text, -- 其他資訊
crt_time timestamp -- 時間
);
create index idx_tbl_sensor_track_1 on tbl_sensor_track (sid,traceid,crt_time);
寫入壓測資料, 1000萬條, 1000量單車, 每輛單車約10個軌迹, 每個軌迹約1000條記錄.
vi test.sql
\set sid random(1,1000)
insert into tbl_sensor_track (sid,pos,traceid,info,crt_time) values
(:sid, point(random(),random()), random()*10, md5(random()::text), clock_timestamp());
pgbench -M prepared -n -r -P 1 -f ./test.sql -c 10 -j 10 -t 1000000
vacuum analyze tbl_sensor_track;
select * from tbl_sensor_track where sid=1 and traceid=1 order by crt_time;
postgres=# explain (analyze,verbose,timing,costs,buffers) select * from tbl_sensor_track where sid=1 and traceid=1 order by crt_time;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using idx_tbl_sensor_track_1 on public.tbl_sensor_track (cost=0.43..1099.13 rows=980 width=73) (actual time=0.042..1.875 rows=973 loops=1)
Output: id, sid, pos, traceid, info, crt_time
Index Cond: ((tbl_sensor_track.sid = 1) AND (tbl_sensor_track.traceid = 1))
Buffers: shared hit=976
Planning Time: 0.144 ms
Execution Time: 2.072 ms
(6 rows)
Time: 3.119 ms
為什麼要通路這麼多資料塊?
傳感器都是活躍的, 大家都在寫, 某個sid的多條資料必定會分散到多個PAGE中.
postgres=# select ctid,* from tbl_sensor_track where sid=1 and traceid=1 order by crt_time;
ctid | id | sid | pos | traceid | info | crt_time
-------------+---------+-----+----------------------------------------------+---------+----------------------------------+----------------------------
(289,70) | 24022 | 1 | (0.3291194292095341,0.7362849779331277) | 1 | dd989480a73b6ccf0988cd2ce4594c41 | 2021-05-30 12:12:35.221626
(316,61) | 24046 | 1 | (0.30092918298355187,0.7515717517477718) | 1 | 0f563c5e6fae4ff5335d3c1a6c6d8af7 | 2021-05-30 12:12:35.221854
(374,24) | 27822 | 1 | (0.7082108853392377,0.8023826990838856) | 1 | cf24c90af1a0cf3211d2be5f6ba09466 | 2021-05-30 12:12:35.26342
(387,23) | 29083 | 1 | (0.6827721329130654,0.6080967981321557) | 1 | 33a812ceeb690147acaab7b2e3602eaf | 2021-05-30 12:12:35.276581
(506,73) | 38479 | 1 | (0.2819613389873581,0.8749003165249825) | 1 | e34d53dccf0b5cb9b7e30366a3535f2c | 2021-05-30 12:12:35.376353
(1096,75) | 82442 | 1 | (0.1818822075073001,0.508428318796188) | 1 | ba3c9669eac963976fcf28f51d33d810 | 2021-05-30 12:12:35.850693
(1429,69) | 107646 | 1 | (0.14452220616886535,0.42080454115547994) | 1 | 74d3995610885086c7b60744c2feebca | 2021-05-30 12:12:36.129595
(1451,54) | 109320 | 1 | (0.13488553655361812,0.9924723125826809) | 1 | 3ea411a5dc265fc56887817ca61580c2 | 2021-05-30 12:12:36.146846
(1566,59) | 118072 | 1 | (0.0021032854500404596,0.5327942934971936) | 1 | bac5b2b038d7e775251df93d65bce9af | 2021-05-30 12:12:36.243577
(1677,4) | 125396 | 1 | (0.11318825440940117,0.46600202700733817) | 1 | f41da3b991c5ad55a5d9c11993a1f396 | 2021-05-30 12:12:36.322615
(1699,1) | 127190 | 1 | (0.1162984003478087,0.47079561917773916) | 1 | ba1be1fc77621faacaa0ca4a2fa289f9 | 2021-05-30 12:12:36.341182
(2195,1) | 157052 | 1 | (0.7798969830712075,0.29867198696907593) | 1 | 346bd94b5be3d30cc749efc939b04c89 | 2021-05-30 12:12:36.658956
(2204,39) | 165303 | 1 | (0.5289847732077746,0.9900722786928533) | 1 | 3a8eb11558768555f3b04001702c80a7 | 2021-05-30 12:12:36.746148
(2284,58) | 171596 | 1 | (0.7207220490941282,0.7717989719965779) | 1 | 90f10c713ec430a938b4a6f2d82ad6c2 | 2021-05-30 12:12:36.810753
(2464,24) | 184864 | 1 | (0.7473279005965843,0.036199381685484866) | 1 | 37705afad1f8be5ab53e89a27c2ddde4 | 2021-05-30 12:12:36.952249
(2492,58) | 186995 | 1 | (0.2527484469419434,0.812076005344526) | 1 | cdbfb7ff0ec5c85e6eb29cebc1604ce9 | 2021-05-30 12:12:36.975931
(2602,64) | 195112 | 1 | (0.3987292028407765,0.5321831181795922) | 1 | ccf72d761d93b6834e5586f69ff7ac97 | 2021-05-30 12:12:37.06301
(2624,29) | 196765 | 1 | (0.7868492865049959,0.3834753392663437) | 1 | 59da61df594812fd801f70361d3b9b0e | 2021-05-30 12:12:37.079775
(2662,36) | 199557 | 1 | (0.4178753519891032,0.6877918734552573) | 1 | 807c6dcfb2aabae2decbce3f3f502d93 | 2021-05-30 12:12:37.11165
(2686,11) | 201309 | 1 | (0.2643845772676734,0.9410469934455321) | 1 | 19bdddf814473489670604e658f60cd1 | 2021-05-30 12:12:37.130968
(2736,52) | 205340 | 1 | (0.052081088208076665,0.05945480161325989) | 1 | 9bdcfc79a023d2e924085bf0d4f46c28 | 2021-05-30 12:12:37.172398
(2786,55) | 209333 | 1 | (0.11154442604792081,0.27523521858215716) | 1 | bdc9393ed3448f6084dd99638a49a576 | 2021-05-30 12:12:37.21326
(2811,67) | 211280 | 1 | (0.764354185520979,0.46827810976283857) | 1 | 7b332c186e088fcd44e616ede4f1f33a | 2021-05-30 12:12:37.235043
(2821,57) | 211800 | 1 | (0.9366284195602432,0.14309645108922453) | 1 | 5268a745de0e8d951fbe6b35b1ae53ed | 2021-05-30 12:12:37.240254
(2886,44) | 216725 | 1 | (0.7654642711557571,0.08884990148832017) | 1 | a19c9ae847d408b51ea5c6171c1ba5a4 | 2021-05-30 12:12:37.291206
...
覆寫索引:
無需回表, 通路的資料塊從976下降到22
create index idx_tbl_sensor_track_2 on tbl_sensor_track (sid,traceid,crt_time) include (id,pos,info);
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
Index Only Scan using idx_tbl_sensor_track_2 on public.tbl_sensor_track (cost=0.56..34.92 rows=1003 width=73) (actual time=0.033..0.542 rows=973 loops=1)
Heap Fetches: 0
Buffers: shared hit=22
Planning Time: 0.166 ms
Execution Time: 0.742 ms
(7 rows)
postgres=# \dt+
List of relations
Schema | Name | Type | Owner | Persistence | Access method | Size | Description
--------+------------------+-------+----------+-------------+---------------+---------+-------------
public | tbl_sensor_track | table | postgres | unlogged | heap | 1042 MB |
(1 row)
postgres=# \di+
List of relations
Schema | Name | Type | Owner | Table | Persistence | Access method | Size | Description
--------+------------------------+-------+----------+------------------+-------------+---------------+--------+-------------
public | idx_tbl_sensor_track_1 | index | postgres | tbl_sensor_track | unlogged | btree | 301 MB |
public | idx_tbl_sensor_track_2 | index | postgres | tbl_sensor_track | unlogged | btree | 994 MB |
public | tbl_sensor_track_pkey | index | postgres | tbl_sensor_track | unlogged | btree | 214 MB |
(3 rows)
并發能力壓測
全記憶體命中的情況下, 差異較小, 但是實際生産環境中資料不可能全部在記憶體中, 此時IO帶來的問題就會凸顯, 性能差異明顯.
非聚集索引
pgbench (PostgreSQL) 14.0
transaction type: ./test.sql
scaling factor: 1
query mode: prepared
number of clients: 12
number of threads: 12
duration: 120 s
number of transactions actually processed: 366951
latency average = 3.924 ms
latency stddev = 10.133 ms
initial connection time = 12.210 ms
tps = 3058.035523 (without initial connection time)
statement latencies in milliseconds:
3.925 select * from tbl_sensor_track where sid=1 and traceid=1 order by crt_time;
聚集覆寫索引
number of transactions actually processed: 460368
latency average = 3.127 ms
latency stddev = 8.477 ms
initial connection time = 12.219 ms
tps = 3836.628407 (without initial connection time)
3.128 select * from tbl_sensor_track where sid=1 and traceid=1 order by crt_time;
include index解決了什麼問題?
1、在某個次元上查詢需要傳回N條記錄, N條記錄在HEAP PAGE中非常分散, 需要耗費大量IO的問題.
- 不需要聯合索引, 減少索引build的時間和複雜度, 并且索引有<1/3 index page的限制, 複合索引會導緻資料超長寫入失敗.
2、支援按任意次元查詢, 每個次元都需要傳回N條, N條記錄在HEAP PAGE中非常分散, 需要耗費大量IO的問題.
- index include 可以按任意次元進行聚集存儲. 滿足不同次元的大範圍搜尋需求. 比聚集表要厲害: 聚集表隻能按1個次元(也就是PK)來進行存儲.
其他
除了使用index include, 在業務側也可以改進來解決離散IO問題, 例如
- 軌迹合并存儲(使用1條記錄, 而非存儲到N條記錄裡面.)
- 按SID分片存儲, 每個SID一個表. 這樣就不會和其他SID混PAGE了.
- 軌迹合并可以使用PG的PostGIS或者阿裡雲ganos的軌迹資料類型(支援軌迹的計算、例如伴随、相似、拟合等分析).
202104/20210406_04.md
《PostgreSQL 14 preview - SP-GiST 索引新增index 葉子結點include column value 功能支援》202011/20201117_01.md
《使用Postgres,MobilityDB和Citus大規模(百億級)實時分析GPS軌迹》202004/20200429_01.md
《PostgreSQL 索引算子下推擴充- 索引内offset - 索引内過濾- include index - 随機偏移》201905/20190503_03.md
《PostgreSQL index include - 類聚簇表與應用(append only, IoT時空軌迹, 離散多行掃描與傳回)》201903/20190331_08.md
《PostgreSQL 12 preview - GiST 索引支援INCLUDE columns - 覆寫索引- 類聚簇索引》201812/20181209_01.md
《PostgreSQL IoT,車聯網- 實時軌迹、行程實踐2 - (含index only scan類聚簇表效果)》201812/20181207_01.md
《PostgreSQL IoT,車聯網- 實時軌迹、行程實踐1》201811/20181101_02.md
《PostgreSQL pipelinedb 流計算插件- IoT應用- 實時軌迹聚合》201806/20180607_02.md
《Greenplum 軌迹相似(伴随分析)》201712/20171231_01.md
《PostgreSQL 實時位置跟蹤+軌迹分析系統實踐- 單機頂千億軌迹/天》201712/20171204_01.md
《GIS術語- POI、AOI、LOI、路徑、軌迹》201708/20170803_01.md
《菜鳥末端軌迹- 電子圍欄(解密支撐每天251億個包裹的資料庫) - 阿裡雲RDS PostgreSQL最佳實踐》201707/20170722_02.md
《車聯網案例,軌迹清洗- 阿裡雲RDS PostgreSQL最佳實踐- 視窗函數》201704/20170418_01.md
《PostgreSQL 物流軌迹系統資料庫需求分析與設計- 包裹俠實時跟蹤與召回》201703/20170312_23.md
《PostgreSQL 10.0 preview 功能增強- 唯一限制+附加字段組合功能索引- 覆寫索引- covering index》201702/20170219_01.md
《PostgreSQL 聚集存儲與BRIN索引- 高并發行為、軌迹類大吞吐資料查詢場景解說》201606/20160611_02.md
《PostgreSQL 如何輕松搞定行駛、運動軌迹合并和切分》