天天看點

greenplum 單表 資料掃描

單表查詢

1)seq scan 順序掃描 資料檔案從頭讀到尾,greenplum中壓縮表尤為突出

2)index scan:索引掃描,對于查詢小資料量而言,速度很快

3)bitmap heap scan:位圖堆表掃描,資料在整表占較大的時候

tutorial=> create table pg_class_tmp as select * from pg_class distributed by (relname);

SELECT 468

tutorial=> create index pg_class_tmp_relkind_idx on pg_class_tmp(relkind);

CREATE INDEX

通過參數 enable_seqscan禁止順序掃描,確定執行計劃通過pg_class_tmp_relkind_idx查詢資料

tutorial=> set enable_seqscan = off;

SET

tutorial=> explain select * from pg_class_tmp where relkind='c';

                                          QUERY PLAN                                         

----------------------------------------------------------------------------------------------

-

Gather Motion 2:1  (slice1; segments: 2)  (cost=100.28..143.12 rows=3 width=234)

   ->  Bitmap Heap Scan on pg_class_tmp  (cost=100.28..143.12 rows=3 width=234)

         Recheck Cond: relkind = 'c'::"char"

         ->  Bitmap Index Scan on pg_class_tmp_relkind_idx  (cost=0.00..100.28 rows=3 width=0)

               Index Cond: relkind = 'c'::"char"

Settings:  enable_seqscan=off

(6 rows)

/*建立索引時建立的為普通索引,為什麼會變成位圖索引*/

4)tid scan:隐藏字段ctid掃描(oracle rowid)

ctid是postgresql标記資料位置位置的字段,每個子節點都是一個postgresql資料庫,每一個子節點都單獨維護自己的一套ctid字段

tutorial=> select ctid from pg_class_tmp limit 1;

ctid 

-------

(0,1)

(1 row)

tutorial=> select relkind from pg_class_tmp where ctid = '(0,1)';

NOTICE:  SELECT uses system-defined column "pg_class_tmp.ctid" without the necessary companion

column "pg_class_tmp.gp_segment_id"HINT:  To uniquely identify a row within a distributed table, use the "gp_segment_id" column t

ogether with the "ctid" column. relkind

---------

t

r

(2 rows)

tutorial=> select relkind from pg_class_tmp where ctid = '(0,1)' and gp_segment_id=1;

relkind

tutorial=> select relkind from pg_class_tmp where ctid = '(0,1)' and gp_segment_id=0;

tutorial=>

5) 子查詢

隻要sql中有子查詢,需要對子查詢的結果做順序掃描,就會進行子查詢掃描

6) 函數掃描

tutorial=> explain select * from generate_series(1,20);

                               QUERY PLAN                              

------------------------------------------------------------------------

Function Scan on generate_series  (cost=0.00..12.50 rows=2000 width=4)

(2 rows)