一、初識分區表
通常情況下,掃描一個大表會很慢。 例如,如果一個訂單表orders的資料量是50G,統計某個州範圍内訂單的平均額度,往往會消耗幾分鐘的時間。
select avg(total_amount) from orders where state_code=1;
如果能夠把大表分拆成小表,查詢資料的時猴,隻掃描資料所屬的小表,就能大大降低掃描時間,提高查詢速度。
PostgreSQL的分區表(Table Partitioning)可以用來解決此類問題。解決方式是:建立一個表orders,作為分區表(partitionedtable),再建立50個分區(partition),orders_1, orders_2, …, orders_50, 每一個分區對應一個州的資料,分區的資料量平均是1G。分區表和分區都是表。本例中,這50分區聯合在一起,組成分區表orders。在執行查詢語句(如下)的時候:
select avg(total_amount) from orders where state_code=1;
PostgreSQL通過對執行語句的分析處理,最終把掃描的任務定位在分區order_1上,把查詢語句轉換成下面的語句,其他分區根本不需要掃描。
select avg(total_amount) from orders_1;
二、PostgreSQL分區表應用舉例
溫度采集在物聯網應用中非常普遍,通常一個系統中部署大量的溫度傳感器,傳感器按照設定的采集頻率把溫度資料發送到伺服器。 下面是一個溫度采集的例子,表temperature_sensor_data,用于儲存溫度傳感器采集的溫度資料。 如果有10萬個傳感器,每隔一小時采集一次資料,則每一個月會産生3.7G的資料,一年會産生大約43G的資料。
對于這樣量級的資料,通常需要采用特殊的處理方式。一種可能的方式是:按照月建立分區,資料按照所屬的月份,被存儲到較小的分區。
2.1 建立分區表
在下面的例子中,建立了分區表temperature_sensor_data和12分區。分區表代表2017年全年的資料,而每一個分區代表單月的資料。
droptableifexists temperature_sensor_data ;CREATETABLEtemperature_sensor_data ( sensor_id integer NOTNULL, timestamp timestampNOTNULL, temperature decimal(5,2) NOTNULL) PARTITION BY RANGE (timestamp);droptableifexists temperature_sensor_data_2017_1;CREATETABLEtemperature_sensor_data_2017_1 PARTITION OF temperature_sensor_data FORVALUESFROM ('2017-01-01') TO ('2017-02-01');droptableifexists temperature_sensor_data_2017_2;CREATETABLEtemperature_sensor_data_2017_2 PARTITION OF temperature_sensor_data FORVALUESFROM ('2017-02-01') TO ('2017-03-01');droptableifexists temperature_sensor_data_2017_3;CREATETABLEtemperature_sensor_data_2017_3 PARTITION OF temperature_sensor_data FORVALUESFROM ('2017-03-01') TO ('2017-04-01');droptableifexists temperature_sensor_data_2017_4;CREATETABLEtemperature_sensor_data_2017_4 PARTITION OF temperature_sensor_data FORVALUESFROM ('2017-04-01') TO ('2017-05-01');droptableifexists temperature_sensor_data_2017_5;CREATETABLEtemperature_sensor_data_2017_5 PARTITION OF temperature_sensor_data FORVALUESFROM ('2017-05-01') TO ('2017-06-01');droptableifexists temperature_sensor_data_2017_6;CREATETABLEtemperature_sensor_data_2017_6 PARTITION OF temperature_sensor_data FORVALUESFROM ('2017-06-01') TO ('2017-07-01');droptableifexists temperature_sensor_data_2017_7;CREATETABLEtemperature_sensor_data_2017_7 PARTITION OF temperature_sensor_data FORVALUESFROM ('2017-07-01') TO ('2017-08-01');droptableifexists temperature_sensor_data_2017_8;CREATETABLEtemperature_sensor_data_2017_8 PARTITION OF temperature_sensor_data FORVALUESFROM ('2017-08-01') TO ('2017-09-01');droptableifexists temperature_sensor_data_2017_9;CREATETABLEtemperature_sensor_data_2017_9 PARTITION OF temperature_sensor_data FORVALUESFROM ('2017-09-01') TO ('2017-10-01');droptableifexists temperature_sensor_data_2017_10;CREATETABLEtemperature_sensor_data_2017_10 PARTITION OF temperature_sensor_data FORVALUESFROM ('2017-10-01') TO ('2017-11-01');droptableifexists temperature_sensor_data_2017_11;CREATETABLEtemperature_sensor_data_2017_11 PARTITION OF temperature_sensor_data FORVALUESFROM ('2017-11-01') TO ('2017-12-01');droptableifexists temperature_sensor_data_2017_12;CREATETABLEtemperature_sensor_data_2017_12 PARTITION OF temperature_sensor_data FORVALUESFROM ('2017-12-01') TO ('2018-01-01');
2.2 模拟加載資料
- 100000個傳感器
- 每隔1小時采集一次資料
- 總共12個月
with ids as ( select generate_series(1,100000) as sensor_id ), times as ( SELECT generate_series( '2017-01-01 00:00:00'::timestamp,'2017-12-31 23:59:00', '1 hour' ) as timestamp ), samples as ( select sensor_id, timestamp, random()*100::decimal as temperature from ids full join times on 1=1 )insert into temperature_sensor_data select sensor_id, timestamp, round(temperature::decimal,2) as temperature from samples;postgres=# \d+ List of relations Schema | Name | Type | Owner | Size | Description --------+---------------------------------+-------------------+-------------+---------+------------- public | temperature_sensor_data | partitioned table | postgres | 0 bytes | public | temperature_sensor_data_2017_1 | table | postgres | 3703 MB | public | temperature_sensor_data_2017_10 | table | postgres | 3703 MB | public | temperature_sensor_data_2017_11 | table | postgres | 3584 MB | public | temperature_sensor_data_2017_12 | table | postgres | 3703 MB | public | temperature_sensor_data_2017_2 | table | postgres | 3345 MB | public | temperature_sensor_data_2017_3 | table | postgres | 3703 MB | public | temperature_sensor_data_2017_4 | table | postgres | 3584 MB | public | temperature_sensor_data_2017_5 | table | postgres | 3703 MB | public | temperature_sensor_data_2017_6 | table | postgres | 3584 MB | public | temperature_sensor_data_2017_7 | table | postgres | 3703 MB | public | temperature_sensor_data_2017_8 | table | postgres | 3703 MB | public | temperature_sensor_data_2017_9 | table | postgres | 3584 MB | (13 rows)
2.3 統計1月份的平均溫度
- 1月份的資料量是3703M
- 耗時大約33秒
postgres=#selectavg(temperature) from temperature_sensor_data wheretimestampbetween '2017-01-01 00:00:00'and'2017-01-0123:59:00'; avg --------------------- 50.0171680480000000(1 row)Time: 33305.055 ms(00:33.305)postgres=#
2.4 使用一個大表,不使用分區表的查詢結果
- 單個表資料量是43G
- 耗時大約7分51秒
postgres=# \d+ List of relations Schema | Name | Type | Owner | Size | Description --------+-------------------------+-------+-------------+-------+------------- public | temperature_sensor_data | table | postgres | 43 GB | (1 row)postgres=# select avg(temperature) from temperature_sensor_data where timestamp between '2017-01-01 00:00:00' and '2017-01-01 23:59:00'; avg --------------------- 50.0010354000000000(1 row)
Time: 471373.514 ms (07:51.374)
三、使用DeclarativePartitioning定義分區表
PostgreSQL從版本10開始,支援DeclarativePartitioning功能,就是使用create table語句定義分區表和分區。
建立分區表的方式是:create table tablename (…) partition by (…)
建立分區的方式是: create table partitionname partition oftablename for values (…);
其中partition by (…)定義來分區表根據哪些列來分區,使用什麼算法;for values (…)定義一個分區内,落入該分區的資料的取值範圍。
目前PostgreSQL12提供來3種分區算法:
- partition by range(…),pg10引入
- partition by list(…),pg10引入
- parition by hash(…),pg11引入
3.1 使用PARTITION BY RANGE方式定義分區
在建立分區表的時候,需要使用PARTITION BY指明該表是一個分區表,并且定義分區的方式。 以下是PostgreSQL官方文檔中一個例子:
該例子中,根據logdate字段做分區,使用RANGE方式。分區表measurement對應3個分區:measurement_y2006m02,measurement_y2006m03,measurement_def。其中measurement_def是預設分區。
在插入資料的時候,如果logdate的取值在2016年2月份,則資料插入到分區measurement_y2006m02;如果logdate的取值在2016年3月份,則資料被插入到分區measurement_y2006m03;其它的資料,插入到預設分區measurement_def。
CREATE TABLE measurement ( city_id int not null, logdate date not null, peaktemp int, unitsales int) PARTITION BY RANGE (logdate);CREATE TABLE measurement_y2006m02 PARTITION OF measurement FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');
CREATE TABLE measurement_y2006m03 PARTITION OF measurement FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');
CREATE TABLE measurement_def PARTITION OF measurement DEFAULT;
查詢資料的時候,PostgreSQL能夠根據合适的過濾條件,選擇正确的分區做查詢;如果沒有适當的過濾條件,則掃描所有分區。
postgres=# explain select * from measurement where logdate='2006-02-10'; QUERY PLAN ---------------------------------------------------------------------- Seq Scan on measurement_y2006m02 (cost=0.00..33.12 rows=9 width=16) Filter: (logdate = '2006-02-10'::date)(2 rows)
postgres=# explain select * from measurement; QUERY PLAN ------------------------------------------------------------------------------- Append (cost=0.00..113.25 rows=5550 width=16) -> Seq Scan on measurement_y2006m02 (cost=0.00..28.50 rows=1850 width=16) -> Seq Scan on measurement_y2006m03 (cost=0.00..28.50 rows=1850 width=16) -> Seq Scan on measurement_def (cost=0.00..28.50 rows=1850 width=16)(4 rows)
3.2 使用PARTITION BY LIST(column )定義分區
列的取值範圍值是一個小的集合,類似程式設計中的枚舉概念。當列值等于某個特定值的時候,落入指定的分區。
下面的例子中,分區表sale_order包含3個分區:
europe_order,asia_order,default_order。當列country等于'germany'或者'sweden'時,資料落入分區europe_order;當country的值等于india或japan時,行落入分區asia_order;當country等于其它值時,則行資料落入分區default_order。
CREATE TABLE sale_order( order_no integer, store_no integer, country varchar(20), date date, amount decimal(5,2)) PARTITION BY LIST(country);
CREATE TABLE europe_order PARTITION OF sale_order FOR VALUES IN ('germany', 'sweden');
CREATE TABLE asia_order PARTITION OF sale_order FOR VALUES IN ('india', 'japan');
CREATE TABLE default_order PARTITION OF sale_order DEFAULT;
查詢資料的時候,PostgreSQL能夠根據合适的過濾條件,選擇正确的分區做查詢;如果沒有适當的過濾條件,則掃描所有分區。
postgres=#explain select * from sale_order where country='india'; QUERY PLAN ------------------------------------------------------------ Seq Scan on asia_order (cost=0.00..19.25 rows=4 width=82) Filter: ((country)::text = 'india'::text)(2 rows)postgres=#explain select * from sale_order; QUERY PLAN ----------------------------------------------------------------------- Append (cost=0.00..63.30 rows=2220 width=82) -> Seq Scan on europe_order (cost=0.00..17.40 rows=740 width=82) -> Seq Scan on asia_order (cost=0.00..17.40 rows=740 width=82) -> Seq Scan on default_order (cost=0.00..17.40 rows=740 width=82)(4 rows)
3.3 使用PARTITION BY HASH( column )定義分區
對列的值做哈希,哈希值把資料分割成幾個分區。
下面的例子中,分區表orders包含4個分區:orders_p1,orders_p2,orders_p3,orders_p4。
插入資料時,對列o_w_id取餘,結果等于0,1,2,3,行資料分别落入分區orders_p1, orders_p2, orders_p3,orders_p4。
createtableorders ( o_w_id integer notnull, o_d_id integer notnull, o_id integer notnull, o_c_id integer, o_carrier_id integer, o_ol_cnt integer, o_all_local integer, o_entry_d timestamp)PARTITIONBY HASH ( o_w_id );CREATETABLEorders_p1 PARTITION OF orders FORVALUESWITH (MODULUS 4, REMAINDER 0);CREATETABLEorders_p2 PARTITION OF orders FORVALUESWITH (MODULUS 4, REMAINDER 1);CREATETABLEorders_p3 PARTITION OF orders FORVALUESWITH (MODULUS 4, REMAINDER 2);CREATETABLEorders_p4 PARTITION OF orders FORVALUESWITH (MODULUS 4, REMAINDER 3);
3.4 分區的其它特性
- 可以在分區表上建立索引,相應的所有分區都能自動建立索引;或者,可以為分區單獨建立索引。
- 可以根據需要,解除安裝或這增加一個分區。
- 分區可以指定單獨的表空間,能充分利用多個磁盤。
- 分區可以指向一個PG外表,即FDW表。
- 分區表可以根據多個列的值來分區。
- 分區可以再次分區。
四、使用表繼承(Inheritance)方式定義分區表
分區表也可以使用繼承的方式來使用。該方式早在PostgreSQL8就支援了。建立方式舉例如下:
- 1. 建立一個普通表measurement
CREATE TABLE measurement ( city_id int not null, logdate date not null, peaktemp int, unitsales int);
- 2. 建立子表,繼承自measurement
- 每個子表的check限制是為了確定字表隻運作符合條件的資料插入。
CREATE TABLE measurement_y2006m02 ( CHECK ( logdate >= DATE '2006-02-01' AND logdate < DATE '2006-03-01' )) INHERITS (measurement);
CREATE TABLE measurement_y2006m03 ( CHECK ( logdate >= DATE '2006-03-01' AND logdate < DATE '2006-04-01' )) INHERITS (measurement);
CREATE TABLE measurement_y2007m12 ( CHECK ( logdate >= DATE '2007-12-01' AND logdate < DATE '2008-01-01' )) INHERITS (measurement);
CREATE TABLE measurement_y2008m01 ( CHECK ( logdate >= DATE '2008-01-01' AND logdate < DATE '2008-02-01' )) INHERITS (measurement);
- 3. 建立函數和觸發器,用于把資料插入到相應的分區。
CREATE OR REPLACE FUNCTION measurement_insert_trigger()RETURNS TRIGGER AS $$BEGIN IF ( NEW.logdate >= DATE '2006-02-01' AND NEW.logdate < DATE '2006-03-01' ) THEN INSERT INTO measurement_y2006m02 VALUES (NEW.*); ELSIF ( NEW.logdate >= DATE '2006-03-01' AND NEW.logdate < DATE '2006-04-01' ) THEN INSERT INTO measurement_y2006m03 VALUES (NEW.*); ELSIF ( NEW.logdate >= DATE '2007-12-01' AND NEW.logdate < DATE '2008-01-01' ) THEN INSERT INTO measurement_y2007m12 VALUES (NEW.*); ELSIF ( NEW.logdate >= DATE '2008-01-01' AND NEW.logdate < DATE '2008-02-01' ) THEN INSERT INTO measurement_y2008m01 VALUES (NEW.*); ELSE RAISE EXCEPTION 'Date out of range. Fix the measurement_insert_trigger() function!'; END IF; RETURN NULL;END;$$LANGUAGE plpgsql;
CREATE TRIGGER insert_measurement_trigger BEFORE INSERT ON measurement FOR EACH ROW EXECUTE FUNCTION measurement_insert_trigger();
- 4. 插入資料
插入4條資料,應該分别落入4個子表。
insertinto measurement values (1, '2006-02-10', 10, 1), (1, '2006-03-10', 10, 1), (1, '2007-12-10', 10, 1), (1, '2008-01-10', 10, 1);
- 5. 查詢資料
postgres=#select * from measurement; city_id | logdate | peaktemp | unitsales ---------+------------+----------+----------- 1 | 2006-02-10 | 10 | 1 1 | 2006-03-10 | 10 | 1 1 | 2007-12-10 | 10 | 1 1 | 2008-01-10 | 10 | 1(4 rows)postgres=#select * from measurement_y2006m02; city_id | logdate | peaktemp | unitsales ---------+------------+----------+----------- 1 | 2006-02-10 | 10 | 1(1 row)postgres=#select * from measurement_y2006m03; city_id | logdate | peaktemp | unitsales ---------+------------+----------+----------- 1 | 2006-03-10 | 10 | 1(1 row)postgres=#select * from measurement_y2007m12; city_id | logdate | peaktemp | unitsales ---------+------------+----------+----------- 1 | 2007-12-10 | 10 | 1(1 row)postgres=#select * from measurement_y2008m01 ; city_id | logdate | peaktemp | unitsales ---------+------------+----------+----------- 1 | 2008-01-10 | 10 | 1(1 row)postgres=#explain select * from measurement; QUERYPLAN ------------------------------------------------------------------------------- Append (cost=0.00..151.00 rows=7401 width=16) -> Seq Scan on measurement (cost=0.00..0.00 rows=1 width=16) -> Seq Scan on measurement_y2006m02 (cost=0.00..28.50 rows=1850 width=16) -> Seq Scan on measurement_y2006m03 (cost=0.00..28.50 rows=1850 width=16) -> Seq Scan on measurement_y2007m12 (cost=0.00..28.50 rows=1850 width=16) -> Seq Scan on measurement_y2008m01 (cost=0.00..28.50 rows=1850 width=16)(6 rows)postgres=#explain select * from measurement where logdate='2007-01-10'; QUERY PLAN ------------------------------------------------------------ Seq Scan on measurement (cost=0.00..0.00 rows=1 width=16) Filter: (logdate = '2007-01-10'::date)(2 rows)