标簽
PostgreSQL , 分區表 , 優化器 , 分區過濾 , hash 分區
https://github.com/digoal/blog/blob/master/201805/20180524_05.md#%E8%83%8C%E6%99%AF 背景
PostgreSQL 10開始内置分區表文法,當時隻支援了range,list兩種分區,實際上可以通過LIST實作HASH分區。
https://github.com/digoal/blog/blob/master/201805/20180524_05.md#postgresql-10-hash-%E5%88%86%E5%8C%BA%E8%A1%A8 PostgreSQL 10 hash 分區表
使用list支援hash分區
postgres=# create table p (id int , info text, crt_time timestamp) partition by list (abs(mod(id,4)));
CREATE TABLE
postgres=# create table p0 partition of p for values in (0);
CREATE TABLE
postgres=# create table p1 partition of p for values in (1);
CREATE TABLE
postgres=# create table p2 partition of p for values in (2);
CREATE TABLE
postgres=# create table p3 partition of p for values in (3);
CREATE TABLE
分區表如下
postgres=# \d+ p
Table "public.p"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
----------+-----------------------------+-----------+----------+---------+----------+--------------+-------------
id | integer | | | | plain | |
info | text | | | | extended | |
crt_time | timestamp without time zone | | | | plain | |
Partition key: LIST (abs(mod(id, 4)))
Partitions: p0 FOR VALUES IN (0),
p1 FOR VALUES IN (1),
p2 FOR VALUES IN (2),
p3 FOR VALUES IN (3)
寫入資料
postgres=# insert into p select generate_series(1,1000),md5(random()::text),now();
INSERT 0 1000
postgres=# select tableoid::regclass,id from p limit 10;
tableoid | id
----------+----
p0 | 4
p0 | 8
p0 | 12
p0 | 16
p0 | 20
p0 | 24
p0 | 28
p0 | 32
p0 | 36
p0 | 40
(10 rows)
普通的查詢,無法做到分區的過濾
postgres=# explain select * from p where id=1 ;
QUERY PLAN
----------------------------------------------------------
Append (cost=0.00..96.50 rows=24 width=44)
-> Seq Scan on p0 (cost=0.00..24.12 rows=6 width=44)
Filter: (id = 1)
-> Seq Scan on p1 (cost=0.00..24.12 rows=6 width=44)
Filter: (id = 1)
-> Seq Scan on p2 (cost=0.00..24.12 rows=6 width=44)
Filter: (id = 1)
-> Seq Scan on p3 (cost=0.00..24.12 rows=6 width=44)
Filter: (id = 1)
(9 rows)
一定要帶上分區條件,才可以做到分區過濾
postgres=# explain select * from p where id=1 and abs(mod(id, 4))=abs(mod(1, 4));
QUERY PLAN
----------------------------------------------------------
Append (cost=0.00..32.60 rows=1 width=44)
-> Seq Scan on p1 (cost=0.00..32.60 rows=1 width=44)
Filter: ((id = 1) AND (abs(mod(id, 4)) = 1))
(3 rows)
https://github.com/digoal/blog/blob/master/201805/20180524_05.md#postgresql-11-hash-%E5%88%86%E5%8C%BA%E8%A1%A8 PostgreSQL 11 hash 分區表
PostgreSQL 11同樣可以使用與10一樣的方法,LIST來實作HASH分區,但是有一個更加優雅的方法,直接使用HASH分區。
postgres=# create table p (id int , info text, crt_time timestamp) partition by hash (id);
CREATE TABLE
postgres=# create table p0 partition of p for values WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE
postgres=# create table p1 partition of p for values WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE
postgres=# create table p2 partition of p for values WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE
postgres=# create table p3 partition of p for values WITH (MODULUS 4, REMAINDER 3);
CREATE TABLE
表結構如下
postgres=# \d+ p
Table "public.p"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
----------+-----------------------------+-----------+----------+---------+----------+--------------+-------------
id | integer | | | | plain | |
info | text | | | | extended | |
crt_time | timestamp without time zone | | | | plain | |
Partition key: HASH (id)
Partitions: p0 FOR VALUES WITH (modulus 4, remainder 0),
p1 FOR VALUES WITH (modulus 4, remainder 1),
p2 FOR VALUES WITH (modulus 4, remainder 2),
p3 FOR VALUES WITH (modulus 4, remainder 3)
表分區定義,内置的限制是一個HASH函數的傳回值
postgres=# \d+ p0
Table "public.p0"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
----------+-----------------------------+-----------+----------+---------+----------+--------------+-------------
id | integer | | | | plain | |
info | text | | | | extended | |
crt_time | timestamp without time zone | | | | plain | |
Partition of: p FOR VALUES WITH (modulus 4, remainder 0)
Partition constraint: satisfies_hash_partition('180289'::oid, 4, 0, id)
postgres=# \d+ p1
Table "public.p1"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
----------+-----------------------------+-----------+----------+---------+----------+--------------+-------------
id | integer | | | | plain | |
info | text | | | | extended | |
crt_time | timestamp without time zone | | | | plain | |
Partition of: p FOR VALUES WITH (modulus 4, remainder 1)
Partition constraint: satisfies_hash_partition('180289'::oid, 4, 1, id)
這個hash函數的定義如下,他一定是一個immutable 函數,是以可以用于分區過濾
postgres=# \x
Expanded display is on.
postgres=# \df+ satisfies_hash_partition
List of functions
-[ RECORD 1 ]-------+--------------------------------------
Schema | pg_catalog
Name | satisfies_hash_partition
Result data type | boolean
Argument data types | oid, integer, integer, VARIADIC "any"
Type | func
Volatility | immutable
Parallel | safe
Owner | postgres
Security | invoker
Access privileges |
Language | internal
Source code | satisfies_hash_partition
Description | hash partition CHECK constraint
PostgreSQL 11終于可以隻輸入分區字段值就可以做到分區過濾了
postgres=# explain select * from p where id=1;
QUERY PLAN
----------------------------------------------------------
Append (cost=0.00..24.16 rows=6 width=44)
-> Seq Scan on p0 (cost=0.00..24.12 rows=6 width=44)
Filter: (id = 1)
(3 rows)
postgres=# explain select * from p where id=0;
QUERY PLAN
----------------------------------------------------------
Append (cost=0.00..24.16 rows=6 width=44)
-> Seq Scan on p0 (cost=0.00..24.12 rows=6 width=44)
Filter: (id = 0)
(3 rows)
postgres=# explain select * from p where id=2;
QUERY PLAN
----------------------------------------------------------
Append (cost=0.00..24.16 rows=6 width=44)
-> Seq Scan on p2 (cost=0.00..24.12 rows=6 width=44)
Filter: (id = 2)
(3 rows)
postgres=# explain select * from p where id=3;
QUERY PLAN
----------------------------------------------------------
Append (cost=0.00..24.16 rows=6 width=44)
-> Seq Scan on p1 (cost=0.00..24.12 rows=6 width=44)
Filter: (id = 3)
(3 rows)
它受控于一個開關,當關閉後,就無法隻通過分區值來過濾分區。
postgres=# set enable_partition_pruning =off;
SET
postgres=# explain select * from p where id=0;
QUERY PLAN
----------------------------------------------------------
Append (cost=0.00..96.62 rows=24 width=44)
-> Seq Scan on p0 (cost=0.00..24.12 rows=6 width=44)
Filter: (id = 0)
-> Seq Scan on p1 (cost=0.00..24.12 rows=6 width=44)
Filter: (id = 0)
-> Seq Scan on p2 (cost=0.00..24.12 rows=6 width=44)
Filter: (id = 0)
-> Seq Scan on p3 (cost=0.00..24.12 rows=6 width=44)
Filter: (id = 0)
(9 rows)
https://github.com/digoal/blog/blob/master/201805/20180524_05.md#postgresql-%E7%BB%A7%E6%89%BF%E8%A1%A8-hash-%E5%88%86%E5%8C%BA%E8%A1%A8%E5%AE%9E%E7%8E%B0 PostgreSQL 繼承表 hash 分區表實作
《PostgreSQL 傳統 hash 分區方法和性能》postgres=# explain select * from tbl where abs(mod(id,4)) = abs(mod(1,4)) and id=1;
QUERY PLAN
--------------------------------------------------------------------------
Append (cost=0.00..979127.84 rows=3 width=45)
-> Seq Scan on tbl (cost=0.00..840377.67 rows=2 width=45)
Filter: ((id = 1) AND (abs(mod(id, 4)) = 1))
-> Seq Scan on tbl1 (cost=0.00..138750.17 rows=1 width=45)
Filter: ((id = 1) AND (abs(mod(id, 4)) = 1))
(5 rows)
https://github.com/digoal/blog/blob/master/201805/20180524_05.md#pg_pathman%E5%88%86%E5%8C%BA%E6%96%B9%E6%B3%95 pg_pathman分區方法
支援9.5以上的版本
《PostgreSQL 9.5+ 高效分區表實作 - pg_pathman》https://github.com/digoal/blog/blob/master/201805/20180524_05.md#%E5%B0%8F%E7%BB%93 小結
1、PostgreSQL 10内置分區表,為了HASH分區,可以使用LIST分區的方法,但是為了讓資料庫可以自動過濾分區,一定要帶上HASH分區條件表達式到SQL中。
2、PostgreSQL 11内置分區表,内置了HASH分區,并且支援隻按照HASH分區條件,自動過濾分區。
3、繼承表的方法,同樣可以實作HASH分區,需要建立觸發器,同時主表在查詢時依舊會被查詢到。
以上三種方法,必須保證
constraint_exclusion
參數設定為partition或者on, 否則無法做到分區自動過濾。
對于PostgreSQL 11,為了實作隻輸入分區字段的值就能夠滿足分區自動過濾,還需要設定
enable_partition_pruning
為on.
索性這些參數預設都是OK的。
4、pg_pathman是通過custom scan接口實作的分區,是目前為止,性能最好的,鎖粒度最低的方法。