天天看點

PostgreSQL 并行計算解說 之21 - parallel partition table wise agg

标簽

PostgreSQL , cpu 并行 , smp 并行 , 并行計算 , gpu 并行 , 并行過程支援

https://github.com/digoal/blog/blob/master/201903/20190317_13.md#%E8%83%8C%E6%99%AF 背景

PostgreSQL 11 優化器已經支援了非常多場合的并行。簡單估計,已支援27餘種場景的并行計算。

parallel seq scan                                            
                                            
parallel index scan                                            
                                            
parallel index only scan                                            
                                            
parallel bitmap scan                                            
                                            
parallel filter                                            
                                        
parallel hash agg                                        
                                        
parallel group agg                                        
                                            
parallel cte                                            
                                            
parallel subquery                                            
                                            
parallel create table                                            
                                            
parallel create index                                            
                                            
parallel select into                                            
                                            
parallel CREATE MATERIALIZED VIEW                                            
                                            
parallel 排序 : gather merge                                             
                                            
parallel nestloop join                                            
                                            
parallel hash join                                            
                                            
parallel merge join                                            
                                            
parallel 自定義并行聚合                                            
                                            
parallel 自定義并行UDF                                            
                                            
parallel append                                            
                                            
parallel union                                            
                                            
parallel fdw table scan                                            
                                            
parallel partition join                                            
                                            
parallel partition agg                                            
                                            
parallel gather                                    
                            
parallel gather merge                            
                                            
parallel rc 并行                                            
                                            
parallel rr 并行                                            
                                            
parallel GPU 并行                                            
                                            
parallel unlogged table          
        
lead parallel      
           

接下來進行一一介紹。

關鍵知識請先自行了解:

1、優化器自動并行度算法 CBO

《PostgreSQL 9.6 并行計算 優化器算法淺析》 《PostgreSQL 11 并行計算算法,參數,強制并行度設定》

https://github.com/digoal/blog/blob/master/201903/20190317_13.md#parallel-partition-table-wise-agg parallel partition table wise agg

并行分區表智能聚合(類似MPP)

當分區表聚合操作的分組字段為分區字段時,PostgreSQL優化器會選擇并行分區智能聚合。

《PostgreSQL 11 preview - 分區表智能并行聚合、分組計算(已類似MPP架構,性能暴增)》 《PostgreSQL 11 preview - 分區表智能并行JOIN (已類似MPP架構,性能暴增)》

資料量:10億

場景 資料量 關閉并行 開啟并行 并行度 開啟并行性能提升倍數
10億 191 秒 8 秒 24 23.9 倍

例子,24個分區的HASH分區表。

CREATE unlogged TABLE ccc (      
    order_id     bigint not null,      
    cust_id      bigint not null,      
    status       text      
) PARTITION BY HASH (cust_id);      
    
do language plpgsql $$    
declare    
begin    
  for i in 0..23 loop    
    execute format('CREATE unlogged TABLE %s%s PARTITION OF %s FOR VALUES WITH (MODULUS %s, REMAINDER %s)', 'ccc', i, 'ccc', 24, i);     
    execute format('alter table %s%s set(parallel_workers =64)', 'ccc',i);    
  end loop;    
end;    
$$;    
    
postgres=# \d ccc    
            Unlogged table "public.ccc"    
  Column  |  Type  | Collation | Nullable | Default     
----------+--------+-----------+----------+---------    
 order_id | bigint |           | not null |     
 cust_id  | bigint |           | not null |     
 status   | text   |           |          |     
Partition key: HASH (cust_id)    
Number of partitions: 24 (Use \d+ to list them.)    
           

寫入10億資料

insert into ccc select i, random()*960 from generate_series(1,1000000000) t(i);    
vacuum (analyze,verbose) ccc;    
           
postgres=# show max_worker_processes ;      
 max_worker_processes       
----------------------      
 128      
(1 row)      
postgres=# set min_parallel_table_scan_size =0;      
postgres=# set min_parallel_index_scan_size =0;      
postgres=# set parallel_tuple_cost =0;      
postgres=# set parallel_setup_cost =0;      
postgres=# set max_parallel_workers=128;      
postgres=# set max_parallel_workers_per_gather =24;      
postgres=# set enable_parallel_hash =on;      
postgres=# set enable_parallel_append =off;      
postgres=# set enable_partitionwise_aggregate =on;      
postgres=# set work_mem ='128MB';      
           

https://github.com/digoal/blog/blob/master/201903/20190317_13.md#1%E5%85%B3%E9%97%AD%E5%B9%B6%E8%A1%8C%E8%80%97%E6%97%B6-191-%E7%A7%92 1、關閉并行,耗時: 191 秒。

postgres=# set max_parallel_workers_per_gather =0;      
postgres=# set enable_parallel_append =off;      
postgres=# set enable_partitionwise_aggregate =off;      
      
      
postgres=# explain select cust_id,count(*) from ccc group by cust_id;     
                                 QUERY PLAN                                      
-----------------------------------------------------------------------------    
 HashAggregate  (cost=25405409.32..25405418.93 rows=961 width=16)    
   Group Key: ccc0.cust_id    
   ->  Append  (cost=0.00..20405411.99 rows=999999466 width=8)    
         ->  Seq Scan on ccc0  (cost=0.00..641839.96 rows=41663296 width=8)    
         ->  Seq Scan on ccc1  (cost=0.00..625842.88 rows=40624888 width=8)    
         ->  Seq Scan on ccc2  (cost=0.00..722092.24 rows=46872124 width=8)    
         ->  Seq Scan on ccc3  (cost=0.00..545581.20 rows=35414920 width=8)    
         ->  Seq Scan on ccc4  (cost=0.00..657705.92 rows=42693192 width=8)    
         ->  Seq Scan on ccc5  (cost=0.00..609843.00 rows=39586300 width=8)    
         ->  Seq Scan on ccc6  (cost=0.00..625934.32 rows=40630732 width=8)    
         ->  Seq Scan on ccc7  (cost=0.00..673876.80 rows=43742880 width=8)    
         ->  Seq Scan on ccc8  (cost=0.00..601729.04 rows=39059604 width=8)    
         ->  Seq Scan on ccc9  (cost=0.00..609919.96 rows=39591296 width=8)    
         ->  Seq Scan on ccc10  (cost=0.00..674124.76 rows=43758976 width=8)    
         ->  Seq Scan on ccc11  (cost=0.00..529544.24 rows=34373924 width=8)    
         ->  Seq Scan on ccc12  (cost=0.00..818443.04 rows=53127004 width=8)    
         ->  Seq Scan on ccc13  (cost=0.00..674104.80 rows=43757680 width=8)    
         ->  Seq Scan on ccc14  (cost=0.00..786195.28 rows=51033728 width=8)    
         ->  Seq Scan on ccc15  (cost=0.00..609709.04 rows=39577604 width=8)    
         ->  Seq Scan on ccc16  (cost=0.00..633745.96 rows=41137896 width=8)    
         ->  Seq Scan on ccc17  (cost=0.00..673962.32 rows=43748432 width=8)    
         ->  Seq Scan on ccc18  (cost=0.00..802380.08 rows=52083808 width=8)    
         ->  Seq Scan on ccc19  (cost=0.00..529621.20 rows=34378920 width=8)    
         ->  Seq Scan on ccc20  (cost=0.00..642042.32 rows=41676432 width=8)    
         ->  Seq Scan on ccc21  (cost=0.00..401251.50 rows=26046150 width=8)    
         ->  Seq Scan on ccc22  (cost=0.00..673891.04 rows=43743804 width=8)    
         ->  Seq Scan on ccc23  (cost=0.00..642033.76 rows=41675876 width=8)    
(27 rows)    
    
postgres=# select cust_id,count(*) from ccc group by cust_id;     
 cust_id |  count      
---------+---------    
     652 | 1041702    
     273 | 1041730    
      51 | 1043333    
     951 | 1041738    
      70 | 1042693    
     839 | 1042059    
     350 | 1040677    
     539 | 1042402    
     758 | 1042199    
     874 | 1043176    
     278 | 1041664    
     946 | 1041442    
     176 | 1041523    
     576 | 1041131    
     292 | 1041236    
     929 | 1041554    
     663 | 1041569    
     770 | 1041953    
      22 | 1041484    
..........................    
     790 | 1041575    
     828 | 1042024    
(961 rows)    
    
Time: 191475.978 ms (03:11.476)    
           

https://github.com/digoal/blog/blob/master/201903/20190317_13.md#2%E5%BC%80%E5%90%AF%E5%B9%B6%E8%A1%8C%E8%80%97%E6%97%B6-8-%E7%A7%92 2、開啟并行,耗時: 8 秒。

postgres=# set max_parallel_workers_per_gather =24;      
postgres=# set enable_parallel_append =off;      
postgres=# set enable_partitionwise_aggregate =on;      
      
postgres=# explain select cust_id,count(*) from ccc group by cust_id;     
                                              QUERY PLAN                                                   
-------------------------------------------------------------------------------------------------------    
 Append  (cost=251248.61..6031194.59 rows=961 width=16)    
   ->  Finalize GroupAggregate  (cost=251248.61..251278.60 rows=40 width=16)    
         Group Key: ccc0.cust_id    
         ->  Gather Merge  (cost=251248.61..251273.40 rows=960 width=16)    
               Workers Planned: 24    
               ->  Sort  (cost=251248.03..251248.13 rows=40 width=16)    
                     Sort Key: ccc0.cust_id    
                     ->  Partial HashAggregate  (cost=251246.56..251246.96 rows=40 width=16)    
                           Group Key: ccc0.cust_id    
                           ->  Parallel Seq Scan on ccc0  (cost=0.00..242566.71 rows=1735971 width=8)    
   ->  Finalize GroupAggregate  (cost=244986.56..245015.80 rows=39 width=16)    
         Group Key: ccc1.cust_id    
         ->  Gather Merge  (cost=244986.56..245010.73 rows=936 width=16)    
               Workers Planned: 24    
               ->  Sort  (cost=244985.98..244986.07 rows=39 width=16)    
                     Sort Key: ccc1.cust_id    
                     ->  Partial HashAggregate  (cost=244984.56..244984.95 rows=39 width=16)    
                           Group Key: ccc1.cust_id    
                           ->  Parallel Seq Scan on ccc1  (cost=0.00..236521.04 rows=1692704 width=8)    
   ->  Finalize GroupAggregate  (cost=282669.29..282703.03 rows=45 width=16)    
         Group Key: ccc2.cust_id    
         ->  Gather Merge  (cost=282669.29..282697.18 rows=1080 width=16)    
               Workers Planned: 24    
               ->  Sort  (cost=282668.71..282668.82 rows=45 width=16)    
                     Sort Key: ccc2.cust_id    
                     ->  Partial HashAggregate  (cost=282667.02..282667.47 rows=45 width=16)    
                           Group Key: ccc2.cust_id    
                           ->  Parallel Seq Scan on ccc2  (cost=0.00..272901.68 rows=1953068 width=8)    
   ->  Finalize GroupAggregate  (cost=213567.74..213593.23 rows=34 width=16)    
         Group Key: ccc3.cust_id    
         ->  Gather Merge  (cost=213567.74..213588.81 rows=816 width=16)    
               Workers Planned: 24    
               ->  Sort  (cost=213567.16..213567.25 rows=34 width=16)    
                     Sort Key: ccc3.cust_id    
                     ->  Partial HashAggregate  (cost=213565.96..213566.30 rows=34 width=16)    
                           Group Key: ccc3.cust_id    
                           ->  Parallel Seq Scan on ccc3  (cost=0.00..206187.97 rows=1475597 width=8)    
   ->  Finalize GroupAggregate  (cost=257459.33..257490.07 rows=41 width=16)    
         Group Key: ccc4.cust_id    
         ->  Gather Merge  (cost=257459.33..257484.74 rows=984 width=16)    
               Workers Planned: 24    
               ->  Sort  (cost=257458.75..257458.86 rows=41 width=16)    
                     Sort Key: ccc4.cust_id    
                     ->  Partial HashAggregate  (cost=257457.24..257457.65 rows=41 width=16)    
                           Group Key: ccc4.cust_id    
                           ->  Parallel Seq Scan on ccc4  (cost=0.00..248562.83 rows=1778883 width=8)    
   ->  Finalize GroupAggregate  (cost=238722.97..238751.46 rows=38 width=16)    
         Group Key: ccc5.cust_id    
         ->  Gather Merge  (cost=238722.97..238746.52 rows=912 width=16)    
               Workers Planned: 24    
               ->  Sort  (cost=238722.39..238722.48 rows=38 width=16)    
                     Sort Key: ccc5.cust_id    
                     ->  Partial HashAggregate  (cost=238721.01..238721.39 rows=38 width=16)    
                           Group Key: ccc5.cust_id    
                           ->  Parallel Seq Scan on ccc5  (cost=0.00..230474.01 rows=1649401 width=8)    
   ->  Finalize GroupAggregate  (cost=245023.21..245052.45 rows=39 width=16)    
         Group Key: ccc6.cust_id    
         ->  Gather Merge  (cost=245023.21..245047.38 rows=936 width=16)    
               Workers Planned: 24    
               ->  Sort  (cost=245022.63..245022.72 rows=39 width=16)    
                     Sort Key: ccc6.cust_id    
                     ->  Partial HashAggregate  (cost=245021.21..245021.60 rows=39 width=16)    
                           Group Key: ccc6.cust_id    
                           ->  Parallel Seq Scan on ccc6  (cost=0.00..236556.47 rows=1692947 width=8)    
   ->  Finalize GroupAggregate  (cost=263789.43..263820.92 rows=42 width=16)    
         Group Key: ccc7.cust_id    
         ->  Gather Merge  (cost=263789.43..263815.46 rows=1008 width=16)    
               Workers Planned: 24    
               ->  Sort  (cost=263788.85..263788.96 rows=42 width=16)    
                     Sort Key: ccc7.cust_id    
                     ->  Partial HashAggregate  (cost=263787.30..263787.72 rows=42 width=16)    
                           Group Key: ccc7.cust_id    
                           ->  Parallel Seq Scan on ccc7  (cost=0.00..254674.20 rows=1822620 width=8)    
   ->  Finalize GroupAggregate  (cost=235547.21..235575.70 rows=38 width=16)    
         Group Key: ccc8.cust_id    
         ->  Gather Merge  (cost=235547.21..235570.76 rows=912 width=16)    
               Workers Planned: 24    
               ->  Sort  (cost=235546.63..235546.73 rows=38 width=16)    
                     Sort Key: ccc8.cust_id    
                     ->  Partial HashAggregate  (cost=235545.25..235545.63 rows=38 width=16)    
                           Group Key: ccc8.cust_id    
                           ->  Parallel Seq Scan on ccc8  (cost=0.00..227407.83 rows=1627484 width=8)    
   ->  Finalize GroupAggregate  (cost=238753.52..238782.01 rows=38 width=16)    
         Group Key: ccc9.cust_id    
         ->  Gather Merge  (cost=238753.52..238777.07 rows=912 width=16)    
               Workers Planned: 24    
               ->  Sort  (cost=238752.94..238753.03 rows=38 width=16)    
                     Sort Key: ccc9.cust_id    
                     ->  Partial HashAggregate  (cost=238751.56..238751.94 rows=38 width=16)    
                           Group Key: ccc9.cust_id    
                           ->  Parallel Seq Scan on ccc9  (cost=0.00..230503.37 rows=1649637 width=8)    
   ->  Finalize GroupAggregate  (cost=263886.49..263917.98 rows=42 width=16)    
         Group Key: ccc10.cust_id    
         ->  Gather Merge  (cost=263886.49..263912.52 rows=1008 width=16)    
               Workers Planned: 24    
               ->  Sort  (cost=263885.91..263886.02 rows=42 width=16)    
                     Sort Key: ccc10.cust_id    
                     ->  Partial HashAggregate  (cost=263884.36..263884.78 rows=42 width=16)    
                           Group Key: ccc10.cust_id    
                           ->  Parallel Seq Scan on ccc10  (cost=0.00..254767.91 rows=1823291 width=8)    
   ->  Finalize GroupAggregate  (cost=207290.45..207315.19 rows=33 width=16)    
         Group Key: ccc11.cust_id    
         ->  Gather Merge  (cost=207290.45..207310.90 rows=792 width=16)    
               Workers Planned: 24    
               ->  Sort  (cost=207289.87..207289.95 rows=33 width=16)    
                     Sort Key: ccc11.cust_id    
                     ->  Partial HashAggregate  (cost=207288.70..207289.03 rows=33 width=16)    
                           Group Key: ccc11.cust_id    
                           ->  Parallel Seq Scan on ccc11  (cost=0.00..200127.47 rows=1432247 width=8)    
   ->  Finalize GroupAggregate  (cost=320379.91..320418.15 rows=51 width=16)    
         Group Key: ccc12.cust_id    
         ->  Gather Merge  (cost=320379.91..320411.52 rows=1224 width=16)    
               Workers Planned: 24    
               ->  Sort  (cost=320379.33..320379.46 rows=51 width=16)    
                     Sort Key: ccc12.cust_id    
                     ->  Partial HashAggregate  (cost=320377.38..320377.89 rows=51 width=16)    
                           Group Key: ccc12.cust_id    
                           ->  Parallel Seq Scan on ccc12  (cost=0.00..309309.25 rows=2213625 width=8)    
   ->  Finalize GroupAggregate  (cost=263878.68..263910.17 rows=42 width=16)    
         Group Key: ccc13.cust_id    
         ->  Gather Merge  (cost=263878.68..263904.71 rows=1008 width=16)    
               Workers Planned: 24    
               ->  Sort  (cost=263878.10..263878.21 rows=42 width=16)    
                     Sort Key: ccc13.cust_id    
                     ->  Partial HashAggregate  (cost=263876.55..263876.97 rows=42 width=16)    
                           Group Key: ccc13.cust_id    
                           ->  Parallel Seq Scan on ccc13  (cost=0.00..254760.37 rows=1823237 width=8)    
   ->  Finalize GroupAggregate  (cost=307756.52..307793.26 rows=49 width=16)    
         Group Key: ccc14.cust_id    
         ->  Gather Merge  (cost=307756.52..307786.89 rows=1176 width=16)    
               Workers Planned: 24    
               ->  Sort  (cost=307755.94..307756.07 rows=49 width=16)    
                     Sort Key: ccc14.cust_id    
                     ->  Partial HashAggregate  (cost=307754.08..307754.57 rows=49 width=16)    
                           Group Key: ccc14.cust_id    
                           ->  Parallel Seq Scan on ccc14  (cost=0.00..297122.05 rows=2126405 width=8)    
   ->  Finalize GroupAggregate  (cost=238670.96..238699.45 rows=38 width=16)    
         Group Key: ccc15.cust_id    
         ->  Gather Merge  (cost=238670.96..238694.51 rows=912 width=16)    
               Workers Planned: 24    
               ->  Sort  (cost=238670.38..238670.48 rows=38 width=16)    
                     Sort Key: ccc15.cust_id    
                     ->  Partial HashAggregate  (cost=238669.00..238669.38 rows=38 width=16)    
                           Group Key: ccc15.cust_id    
                           ->  Parallel Seq Scan on ccc15  (cost=0.00..230423.67 rows=1649067 width=8)    
   ->  Finalize GroupAggregate  (cost=248080.23..248110.22 rows=40 width=16)    
         Group Key: ccc16.cust_id    
         ->  Gather Merge  (cost=248080.23..248105.02 rows=960 width=16)    
               Workers Planned: 24    
               ->  Sort  (cost=248079.65..248079.75 rows=40 width=16)    
                     Sort Key: ccc16.cust_id    
                     ->  Partial HashAggregate  (cost=248078.19..248078.59 rows=40 width=16)    
                           Group Key: ccc16.cust_id    
                           ->  Parallel Seq Scan on ccc16  (cost=0.00..239507.79 rows=1714079 width=8)    
   ->  Finalize GroupAggregate  (cost=263822.24..263853.73 rows=42 width=16)    
         Group Key: ccc17.cust_id    
         ->  Gather Merge  (cost=263822.24..263848.27 rows=1008 width=16)    
               Workers Planned: 24    
               ->  Sort  (cost=263821.66..263821.77 rows=42 width=16)    
                     Sort Key: ccc17.cust_id    
                     ->  Partial HashAggregate  (cost=263820.11..263820.53 rows=42 width=16)    
                           Group Key: ccc17.cust_id    
                           ->  Parallel Seq Scan on ccc17  (cost=0.00..254706.07 rows=1822807 width=8)    
   ->  Finalize GroupAggregate  (cost=314097.79..314135.28 rows=50 width=16)    
         Group Key: ccc18.cust_id    
         ->  Gather Merge  (cost=314097.79..314128.78 rows=1200 width=16)    
               Workers Planned: 24    
               ->  Sort  (cost=314097.21..314097.33 rows=50 width=16)    
                     Sort Key: ccc18.cust_id    
                     ->  Partial HashAggregate  (cost=314095.30..314095.80 rows=50 width=16)    
                           Group Key: ccc18.cust_id    
                           ->  Parallel Seq Scan on ccc18  (cost=0.00..303244.20 rows=2170220 width=8)    
   ->  Finalize GroupAggregate  (cost=207320.57..207345.31 rows=33 width=16)    
         Group Key: ccc19.cust_id    
         ->  Gather Merge  (cost=207320.57..207341.02 rows=792 width=16)    
               Workers Planned: 24    
               ->  Sort  (cost=207319.99..207320.07 rows=33 width=16)    
                     Sort Key: ccc19.cust_id    
                     ->  Partial HashAggregate  (cost=207318.83..207319.15 rows=33 width=16)    
                           Group Key: ccc19.cust_id    
                           ->  Parallel Seq Scan on ccc19  (cost=0.00..200156.55 rows=1432455 width=8)    
   ->  Finalize GroupAggregate  (cost=251327.81..251357.81 rows=40 width=16)    
         Group Key: ccc20.cust_id    
         ->  Gather Merge  (cost=251327.81..251352.61 rows=960 width=16)    
               Workers Planned: 24    
               ->  Sort  (cost=251327.23..251327.33 rows=40 width=16)    
                     Sort Key: ccc20.cust_id    
                     ->  Partial HashAggregate  (cost=251325.77..251326.17 rows=40 width=16)    
                           Group Key: ccc20.cust_id    
                           ->  Parallel Seq Scan on ccc20  (cost=0.00..242643.18 rows=1736518 width=8)    
   ->  Finalize GroupAggregate  (cost=157070.25..157089.00 rows=25 width=16)    
         Group Key: ccc21.cust_id    
         ->  Gather Merge  (cost=157070.25..157085.75 rows=600 width=16)    
               Workers Planned: 24    
               ->  Sort  (cost=157069.67..157069.74 rows=25 width=16)    
                     Sort Key: ccc21.cust_id    
                     ->  Partial HashAggregate  (cost=157068.84..157069.09 rows=25 width=16)    
                           Group Key: ccc21.cust_id    
                           ->  Parallel Seq Scan on ccc21  (cost=0.00..151642.56 rows=1085256 width=8)    
   ->  Finalize GroupAggregate  (cost=263795.01..263826.50 rows=42 width=16)    
         Group Key: ccc22.cust_id    
         ->  Gather Merge  (cost=263795.01..263821.04 rows=1008 width=16)    
               Workers Planned: 24    
               ->  Sort  (cost=263794.43..263794.53 rows=42 width=16)    
                     Sort Key: ccc22.cust_id    
                     ->  Partial HashAggregate  (cost=263792.88..263793.29 rows=42 width=16)    
                           Group Key: ccc22.cust_id    
                           ->  Parallel Seq Scan on ccc22  (cost=0.00..254679.58 rows=1822658 width=8)    
   ->  Finalize GroupAggregate  (cost=251324.47..251354.46 rows=40 width=16)    
         Group Key: ccc23.cust_id    
         ->  Gather Merge  (cost=251324.47..251349.26 rows=960 width=16)    
               Workers Planned: 24    
               ->  Sort  (cost=251323.89..251323.99 rows=40 width=16)    
                     Sort Key: ccc23.cust_id    
                     ->  Partial HashAggregate  (cost=251322.42..251322.82 rows=40 width=16)    
                           Group Key: ccc23.cust_id    
                           ->  Parallel Seq Scan on ccc23  (cost=0.00..242639.95 rows=1736495 width=8)    
(217 rows)    
    
postgres=# select cust_id,count(*) from ccc group by cust_id;     
 cust_id |  count      
---------+---------    
      16 | 1041453    
      34 | 1042275    
      54 | 1039714    
      60 | 1040715    
      81 | 1041507    
     108 | 1042387    
     119 | 1041137    
     151 | 1041746    
     153 | 1042255    
     186 | 1041705    
..................................    
     588 | 1040386    
     629 | 1043353    
     686 | 1041974    
     690 | 1040047    
     693 | 1042870    
     725 | 1042487    
     765 | 1040199    
     770 | 1041953    
     795 | 1042219    
     801 | 1042153    
     807 | 1041840    
     821 | 1041773    
     870 | 1041345    
     874 | 1043176    
     895 | 1042988    
     915 | 1040430    
     934 | 1043436    
     951 | 1041738    
(961 rows)    
    
Time: 8142.401 ms (00:08.142)    
           

https://github.com/digoal/blog/blob/master/201903/20190317_13.md#%E5%85%B6%E4%BB%96%E7%9F%A5%E8%AF%86 其他知識

2、function, op 識别是否支援parallel

postgres=# select proparallel,proname from pg_proc;                                            
 proparallel |                   proname                                                                
-------------+----------------------------------------------                                            
 s           | boolin                                            
 s           | boolout                                            
 s           | byteain                                            
 s           | byteaout                                            
           

3、subquery mapreduce unlogged table

對于一些情況,如果期望簡化優化器對非常非常複雜的SQL并行優化的負擔,可以自己将SQL拆成幾段,中間結果使用unlogged table儲存,類似mapreduce的思想。unlogged table同樣支援parallel 計算。

4、vacuum,垃圾回收并行。

5、dblink 異步調用并行

《PostgreSQL VOPS 向量計算 + DBLINK異步并行 - 單執行個體 10億 聚合計算跑進2秒》 《PostgreSQL 相似搜尋分布式架構設計與實踐 - dblink異步調用與多機并行(遠端 遊标+記錄 UDF執行個體)》 《PostgreSQL dblink異步調用實作 并行hash分片JOIN - 含資料交、并、差 提速案例 - 含dblink VS pg 11 parallel hash join VS pg 11 智能分區JOIN》

暫時不允許并行的場景(将來PG會繼續擴大支援範圍):

1、修改行,鎖行,除了create table as , select into, create mview這幾個可以使用并行。

2、query 會被中斷時,例如cursor , loop in PL/SQL ,因為涉及到中間處理,是以不建議開啟并行。

3、paralle unsafe udf ,這種UDF不會并行

4、嵌套并行(udf (内部query并行)),外部調用這個UDF的SQL不會并行。(主要是防止large parallel workers )

5、SSI 隔離級别

https://github.com/digoal/blog/blob/master/201903/20190317_13.md#%E5%8F%82%E8%80%83 參考

https://www.postgresql.org/docs/11/parallel-plans.html 《PostgreSQL 11 preview - 并行計算 增強 彙總》 《PostgreSQL 10 自定義并行計算聚合函數的原理與實踐 - (含array_agg合并多個數組為單個一進制數組的例子)》

https://github.com/digoal/blog/blob/master/201903/20190317_13.md#%E5%85%8D%E8%B4%B9%E9%A2%86%E5%8F%96%E9%98%BF%E9%87%8C%E4%BA%91rds-postgresql%E5%AE%9E%E4%BE%8Becs%E8%99%9A%E6%8B%9F%E6%9C%BA 免費領取阿裡雲RDS PostgreSQL執行個體、ECS虛拟機

PostgreSQL 并行計算解說 之21 - parallel partition table wise agg