對表進行分析,通常情況下可以對表,索引,列進行單獨分析,或者進行組合分析,但這三者哪些是相對重要的,哪些分析顯得不那麼重要?通過本篇文章的實驗相信大家也會對直方圖有更一步的了解.
1.首先建立測試表,并插入100000條資料
SQL> create table test(id number,nick varchar2(30));
Table created.
SQL> begin
2 for i in 1..100000 loop
3 insert into test(id) values(i);
4 end loop;
5 commit;
6 end;
7 /
PL/SQL procedure successfully completed.
更新nick字段,使資料發生嚴重傾斜
SQL> update test set nick='abc' where rownum<99999;
99998 rows updated.
SQL> commit;
Commit complete.
SQL> create index idx_test_nick on test(nick);
Index created.
SQL> update test set nick='def' where nick is null;
2 rows updated.
SQL> commit;
Commit complete.
--隻對索引進行分析
SQL> analyze index idx_test_nick compute statistics;
Index analyzed.
SQL> select index_name,LEAF_BLOCKS,DISTINCT_KEYS,NUM_ROWS from user_indexes where index_name='IDX_TEST_NICK';
INDEX_NAME LEAF_BLOCKS DISTINCT_KEYS NUM_ROWS
------------------------------ ----------- ------------- ----------
IDX_TEST_NICK 210 2 100000
SQL> select COLUMN_NAME,NUM_BUCKETS,num_distinct from USER_tab_columns where table_name='TEST';
COLUMN_NAME NUM_BUCKETS NUM_DISTINCT
------------------------------ ----------- ------------
ID
NICK
檢視隻分析索引的情況下的執行計劃,從執行計劃可以看出,優化器選擇RBO,都走索引
SQL> set autotrace trace exp
SQL> select * from test where nick ='abc';
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST'
2 1 INDEX (RANGE SCAN) OF 'IDX_TEST_NICK' (NON-UNIQUE)
SQL> select * from test where nick ='def';
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST'
2 1 INDEX (RANGE SCAN) OF 'IDX_TEST_NICK' (NON-UNIQUE)
分析有資料嚴重傾斜的nick列後,也沒有産生正确的執行計劃,此時使用的優化器仍然是RBO
通過此實驗,說明隻分析索引和列,ORACLE不會使用CBO的優化器
SQL> analyze table test compute statistics for columns size 2 nick;
Table analyzed.
SQL> select * from test where nick ='abc';
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST'
2 1 INDEX (RANGE SCAN) OF 'IDX_TEST_NICK' (NON-UNIQUE)
SQL> select * from test where nick ='def';
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST'
2 1 INDEX (RANGE SCAN) OF 'IDX_TEST_NICK' (NON-UNIQUE)
現在分析表後,産生了正确的執行計劃
SQL> analyze table test compute statistics for table;
Table analyzed.
SQL> select * from test where nick ='abc';
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=50 Card=99998 Bytes=
1499970)
1 0 TABLE ACCESS (FULL) OF 'TEST' (Cost=50 Card=99998 Bytes=14
99970)
SQL> select * from test where nick ='def';
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=2 Bytes=30)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST' (Cost=1 Card=2 Byt
es=30)
2 1 INDEX (RANGE SCAN) OF 'IDX_TEST_NICK' (NON-UNIQUE) (Cost
=1 Card=2)
删除所有的統計資料,并隻對表與列進行分析,不分析索引,ORACLE使用CBO的優化器,并産生了正确的執行計劃
SQL> analyze table test delete statistics;
Table analyzed.
SQL> analyze table test compute statistics for table for columns size 2 nick;
Table analyzed.
SQL> select * from test where nick ='abc';
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=50 Card=99998 Bytes=
1499970)
1 0 TABLE ACCESS (FULL) OF 'TEST' (Cost=50 Card=99998 Bytes=14
99970)
SQL> select * from test where nick ='def';
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=2 Bytes=30)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST' (Cost=1 Card=2 Byt
es=30)
2 1 INDEX (RANGE SCAN) OF 'IDX_TEST_NICK' (NON-UNIQUE) (Cost
=1 Card=2)
建立TEST表ID列上的索引,但不對索引進行分析
SQL> create index idx_test_id on test(id);
Index created.
SQL> select * from test where id=1;
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1000 Bytes=15
000)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST' (Cost=1 Card=1000
Bytes=15000)
2 1 INDEX (RANGE SCAN) OF 'IDX_TEST_ID' (NON-UNIQUE) (Cost=1
Card=400)
當條件中即有id,又有nick時,因為nick上有直方圖,ORACLE知道nick='abc'的值特别的多,是以不走IDX_TEST_NICK索引,走IDX_TEST_ID上的索引
SQL> select * from test where id=5 and nick='abc';
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1000 Bytes=15
000)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST' (Cost=1 Card=1000
Bytes=15000)
2 1 INDEX (RANGE SCAN) OF 'IDX_TEST_ID' (NON-UNIQUE) (Cost=1
Card=400)
當條件中即有id,又有nick時,因為nick上有直方圖,ORACLE知道nick='def'的值特别的少,是以走IDX_TEST_NICK上的索引,不走IDX_TEST_ID索引
SQL> select * from test where id=5 and nick='def';
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=15)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST' (Cost=1 Card=1 Byt
es=15)
2 1 INDEX (RANGE SCAN) OF 'IDX_TEST_NICK' (NON-UNIQUE) (Cost
=1 Card=2)
在分析ID列後,ORACLE發現ID列的選擇度更高,是以不再選擇IDX_TEST_NICK索引,而是選擇IDX_TEST_ID
SQL> analyze table test compute statistics for columns size 1 id;
Table analyzed.
SQL> select * from test where id=5 and nick='def';
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=7)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST' (Cost=1 Card=1 Byt
es=7)
2 1 INDEX (RANGE SCAN) OF 'IDX_TEST_ID' (NON-UNIQUE) (Cost=1
Card=1)
下面來看另外一種情況,我們删除所有的統計資料,然後在ID列上建立唯一索引,在此條件下,隻分析表與分析列nick,我們看到ORACLE走了正确的執行計劃,走了UK_TEST_ID,其實從這裡也給我們帶來很多的啟示:在主鍵與唯一鍵限制的列上是否需要直方圖的問題?如果在這些列上有像這樣的查詢where id > 100 and id < 1000,我們還是需要有直方圖的,但除此之外,好像真的沒有直方圖的必要了!
SQL> analyze table test delete statistics;
Table analyzed.
SQL> drop index idx_test_id;
Index dropped.
SQL> create unique index uk_test_id on test(id);
Index created.
SQL> analyze table test compute statistics for table for columns size 2 nick;
Table analyzed.
SQL> select * from test where id=5 and nick='def';
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=15)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST' (Cost=1 Card=1 Byt
es=15)
2 1 INDEX (UNIQUE SCAN) OF 'UK_TEST_ID' (UNIQUE) (Cost=1 Car
d=1)
從以上一系列的實驗可以看出,對ORACLE的優化器CBO來說,表的分析與列的分析才是最重要的,索引的分析次之。還有我們可以考慮我們的哪些列上需要直方圖,對于bucket的個數問題,oracle的預設值是75個,是以根據你的應用規則,選擇合适的桶數對性能也是有幫助的。因為不必要的桶的個數的大量增加,必然會帶來SQL語句硬解析時産生執行計劃的複雜度問題。