天天看點

一次層次查詢相關的sql的調整優化

關于層次查詢最常見的性能問題是:因為優化器在評估層次查詢傳回的資料行時存在嚴重的偏差進而導緻了錯誤的連接配接順序,連接配接方法導緻的糟糕的性能.這似乎是優化器固有的問題(其實也确實是不好評估傳回的行數),與優化器統計資訊的收集方法沒有關系,可以參見老熊的這篇文章:

http://www.laoxiong.net/take-care-of-using-connect-by-query.html,

對于這種問題,因為問題本身是因為優化器錯誤的評估了層次查詢傳回的資料行數導緻的,是以我一般會使用cardinality提示或者是opt_estimate提示使用代表性的傳回行數來糾正優化器評估的資料行數,其它地方不做任何修改,不添加任何的提示,一般就能糾正問題.(這裡動态采樣似乎是不管用的,即使是我使用了提示也改變不了優化器評估的傳回行數).

當然,我這裡說的不是這個問題.是公司的上一個cms版本中的一個sql的問題,是差不多兩年前的一個問題了,具體的資料庫版本不是太清楚了,應該是9206的版本.

select distinct id,fid,name,path,IsCheck,

STATUS,rank,title,type,code,

domainPath,domain,s.idsite,keyword,

comments,contribution,pubrart,

pids,isaddlable,tiid,taid

from CMS_SM s, CMS_USER_RF r

where s.type = 0

and s.status = 1

and s.fid = 1

and (((r.idsite = -1 and r.idchannel = -1) or

(r.idchannel = -888 and s.id = r.idsite) or

(r.idsite > 0 and r.idchannel = -1 and

r.idsite in (select id

from cms_sm

connect by prior fid = id

start with id = s.id)) or

(r.idsite > 0 and r.idchannel > 0 and

(r.idchannel in (select id

from cms_sm

connect by prior fid = id

start with id = s.id) or

(s.fid = -1 and s.id = r.idsite)))) and r.userid = 3535)

order by id desc;

按CBO走,邏輯讀:2.8W   CPU使用:1.77s

這是一個使用者權限檢測的SQL語句,執行頻率還是挺高的,而且這裡的and,or,改起來确實挺麻煩的.

當時的情況就是user cpu占有率偏高,sp報表中大量的hash latch争用,很快就定位到是這個sql引起的.

當時,并沒有儲存之前的sp報表,當然更沒有儲存這個sql之前(更确切的說應該是表現良好時)的執行計劃,執行統計資訊,優化器統計資訊,因為缺少這些對比的基準資訊,是以無從确定這個sql是一直表現這麼差呀還是突然變成這麼差了?也許是一直表現這麼差,隻是因為這時候的執行頻率稍微多了些,或者是因為一些其它的性能上的問題,造成資源的争用,把這個sql的性能問題給凸顯出來了;也許是以前表現良好,隻是因為實體結構的改變(比如說添加了新的索引),重新收集了優化器統計資訊,或者是因為其它的一些原因,重新硬分析時,才選擇了一個錯誤的執行計劃,才突然變得性能這麼差了.關鍵的問題不在于我剛調到這個部門來,對這個資料庫系統還幾乎是一無所知,而在于我當時還沒有形成這樣一套解決性能問題時應該有的思路,是以并沒有之前的一些準備,也沒有按照這樣的思路去處理問題.當然,現在整理之前的一些文檔的時候,我能想到這樣一些處理步驟,還是說明自己是有了一些進步的,呵呵.當時,隻是詢問開發人員對應用是否變更過,他們回應說沒動過什麼東西呀,檢查了一下,最近确實沒有新增過什麼索引(至于開發人員是否删除了什麼索引,就無從确定了,當然他們說沒有變動過),就開始檢查這個sql該怎麼調整了.

這裡是connect by prior fid = id,當然,如果是我寫的話,我更傾向于寫成connect by id=prior fid,立即起來就是新記錄的id字段=之前記錄(也就是上一級level-1記錄)的fid字段.我發現這裡的cms_sm的id列上有主鍵限制,但卻沒有index(id,fid),于是我添加了這樣的索引:create index index_cms_sm_id_fid on cms_sm(id,fid) online;這樣執行計劃變成了這樣:

/

Elapsed: 00:00:01.69 (大部分都是CPU時間)

---------------------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|

---------------------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 1 | 211 | 5 (20)|

| 1 | SORT ORDER BY | | 1 | 211 | |

| 2 | HASH UNIQUE | | 1 | 211 | 5 (20)|

|* 3 | FILTER | | | | |

| 4 | MERGE JOIN CARTESIAN | | 2 | 422 | 4 (0)|

|* 5 | TABLE ACCESS BY INDEX ROWID | CMS_SM | 1 | 200 | 3 (0)|

|* 6 | INDEX RANGE SCAN DESCENDING| CMS_SM_FID | 5 | | 1 (0)|

| 7 | BUFFER SORT | | 3 | 33 | 1 (0)|

|* 8 | INDEX RANGE SCAN | USERID-SITE-CHANNEL | 3 | 33 | 1 (0)|

|* 9 | FILTER | | | | |

|* 10 | CONNECT BY WITH FILTERING | | | | |

| 11 | TABLE ACCESS BY INDEX ROWID| CMS_SM | | | |

|* 12 | INDEX UNIQUE SCAN | SYS_C0010312 | 1 | 4 | 1 (0)|

|* 13 | HASH JOIN | | | | |

| 14 | CONNECT BY PUMP | | | | |

| 15 | INDEX FAST FULL SCAN | INDEX_CMS_SM_ID_FID | 13095 | 102K| 8 (0)|

| 16 | TABLE ACCESS FULL | CMS_SM | 13095 | 102K| 8 (0)|

|* 17 | FILTER | | | | |

|* 18 | CONNECT BY WITH FILTERING | | | | |

| 19 | TABLE ACCESS BY INDEX ROWID| CMS_SM | | | |

|* 20 | INDEX UNIQUE SCAN | SYS_C0010312 | 1 | 4 | 1 (0)|

|* 21 | HASH JOIN | | | | |

| 22 | CONNECT BY PUMP | | | | |

| 23 | INDEX FAST FULL SCAN | INDEX_CMS_SM_ID_FID | 13095 | 102K| 8 (0)|

| 24 | TABLE ACCESS FULL | CMS_SM | 13095 | 102K| 8 (0)|

---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

3 - filter("R"."IDSITE"=(-1) AND "R"."IDCHANNEL"=(-1) OR "R"."IDCHANNEL"=(-888)

AND "S"."ID"="R"."IDSITE" OR "R"."IDCHANNEL"=(-1) AND "R"."IDSITE">0 AND EXISTS

(SELECT /*+ */ 0 FROM "CMS_SM" "CMS_SM" AND ("ID"=:B1)) OR "R"."IDSITE">0 AND

"R"."IDCHANNEL">0 AND ( EXISTS (SELECT /*+ */ 0 FROM "CMS_SM" "CMS_SM" AND

("ID"=:B2)) OR "S"."FID"=(-1) AND "S"."ID"="R"."IDSITE"))

5 - filter("S"."TYPE"=0 AND "S"."STATUS"=1)

6 - access("S"."FID"=1)

8 - access("R"."USERID"=3535)

9 - filter("ID"=:B1)

10 - filter("ID"=:B1)

12 - access("ID"=:B1)

13 - access("ID"=NULL)

17 - filter("ID"=:B1)

18 - filter("ID"=:B1)

20 - access("ID"=:B1)

21 - access("ID"=NULL)

Statistics

----------------------------------------------------------

17915 consistent gets

650 sorts (memory)

5 rows processed

發現邏輯io已經降低為1.8W了,當然,這樣的邏輯io可能還是不可接受的,是以我們還需要看看是否還有優化的餘地.

當然,因為這裡的and,or,業務邏輯似乎還是很複雜的,是以改寫起來還是挺麻煩的.因為我當時剛剛讀完Jonathan Lewis關于CBO的書,知道oracle在計算複雜的and,or的時候,cardinality的估算上可能會存在一些問題,并由此造成一些執行計劃上的問題,是以忽然就想到了加rule提示來嘗試看看能不能有所改變,于是在最外層的select處加上了的提示.說真的,我都沒想到邏輯io,執行時間立刻就降了下來:

/

Elapsed: 00:00:00.02

----------------------------------------------------------------

| Id | Operation | Name |

----------------------------------------------------------------

| 0 | SELECT STATEMENT | |

| 1 | SORT UNIQUE | |

|* 2 | FILTER | |

| 3 | NESTED LOOPS | |

|* 4 | TABLE ACCESS BY INDEX ROWID | CMS_SM |

|* 5 | INDEX RANGE SCAN | CMS_SM_TYPE |

|* 6 | INDEX RANGE SCAN | USERID-SITE-CHANNEL |

|* 7 | FILTER | |

|* 8 | CONNECT BY WITH FILTERING | |

| 9 | TABLE ACCESS BY INDEX ROWID | CMS_SM |

|* 10 | INDEX UNIQUE SCAN | SYS_C0010312 |

| 11 | NESTED LOOPS | |

| 12 | BUFFER SORT | |

| 13 | CONNECT BY PUMP | |

| 14 | TABLE ACCESS BY INDEX ROWID| CMS_SM |

|* 15 | INDEX UNIQUE SCAN | SYS_C0010312 |

|* 16 | TABLE ACCESS FULL | CMS_SM |

|* 17 | FILTER | |

|* 18 | CONNECT BY WITH FILTERING | |

| 19 | TABLE ACCESS BY INDEX ROWID | CMS_SM |

|* 20 | INDEX UNIQUE SCAN | SYS_C0010312 |

| 21 | NESTED LOOPS | |

| 22 | BUFFER SORT | |

| 23 | CONNECT BY PUMP | |

| 24 | TABLE ACCESS BY INDEX ROWID| CMS_SM |

|* 25 | INDEX UNIQUE SCAN | SYS_C0010312 |

|* 26 | TABLE ACCESS FULL | CMS_SM |

----------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - filter("R"."IDSITE"=(-1) AND "R"."IDCHANNEL"=(-1) OR

"R"."IDCHANNEL"=(-888) AND "S"."ID"="R"."IDSITE" OR "R"."IDSITE">0 AND

"R"."IDCHANNEL"=(-1) AND EXISTS (SELECT 0 FROM "CMS_SM" "CMS_SM" WHERE

"ID"=NULL AND ("ID"=:B1)) OR "R"."IDSITE">0 AND "R"."IDCHANNEL">0 AND (

EXISTS (SELECT 0 FROM "CMS_SM" "CMS_SM" WHERE "ID"=NULL AND ("ID"=:B2))

OR "S"."FID"=(-1) AND "S"."ID"="R"."IDSITE"))

4 - filter("S"."FID"=1 AND "S"."STATUS"=1)

5 - access("S"."TYPE"=0)

6 - access("R"."USERID"=3535)

7 - filter("ID"=:B1)

8 - filter("ID"=:B1)

10 - access("ID"=:B1)

15 - access("ID"=NULL)

16 - access("ID"=NULL)

17 - filter("ID"=:B1)

18 - filter("ID"=:B1)

20 - access("ID"=:B1)

25 - access("ID"=NULL)

26 - access("ID"=NULL)

Note

-----

- rule based optimizer used (consider using cbo)

Statistics

----------------------------------------------------------

2159 consistent gets

1081 sorts (memory)

5 rows processed

現在邏輯io,執行時間直接降低了一個數量級,應該來說還是令人滿意的,是以就直接讓開發人員修改應用,添加了rule提示,然後重新開機了應用伺服器,後來系統性能也就沒有再出現問題.

可當我仔細思考這個問題的時候,總覺得rule的威力有這麼大嗎?還有我敢一直這樣使用這個rule提示嗎?(雖然說新的cms正在試用階段,過不了多長時間這個老的cms系統也就要正式退役了吧),于是我嘗試着使用提示來構造一個和這個rule提示一樣的執行計劃.

我去掉rule提示,添加了提示之後,執行計劃變成了這樣:

SQL> /

Elapsed: 00:00:01.64

---------------------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|

---------------------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 1 | 211 | 141 (2)|

| 1 | SORT ORDER BY | | 1 | 211 | 141 (2)|

| 2 | HASH UNIQUE | | 1 | 211 | 140 (1)|

|* 3 | FILTER | | | | |

| 4 | NESTED LOOPS | | 2 | 422 | 139 (0)|

|* 5 | TABLE ACCESS BY INDEX ROWID | CMS_SM | 1 | 200 | 138 (0)|

|* 6 | INDEX RANGE SCAN | CMS_SM_TYPE | 4365 | | 6 (0)|

|* 7 | INDEX RANGE SCAN | USERID-SITE-CHANNEL | 3 | 33 | 1 (0)|

|* 8 | FILTER | | | | |

|* 9 | CONNECT BY WITH FILTERING | | | | |

| 10 | TABLE ACCESS BY INDEX ROWID| CMS_SM | | | |

|* 11 | INDEX UNIQUE SCAN | SYS_C0010312 | 1 | 4 | 1 (0)|

|* 12 | HASH JOIN | | | | |

| 13 | CONNECT BY PUMP | | | | |

| 14 | INDEX FAST FULL SCAN | INDEX_CMS_SM_ID_FID | 13095 | 102K| 8 (0)|

| 15 | TABLE ACCESS FULL | CMS_SM | 13095 | 102K| 8 (0)|

|* 16 | FILTER | | | | |

|* 17 | CONNECT BY WITH FILTERING | | | | |

| 18 | TABLE ACCESS BY INDEX ROWID| CMS_SM | | | |

|* 19 | INDEX UNIQUE SCAN | SYS_C0010312 | 1 | 4 | 1 (0)|

|* 20 | HASH JOIN | | | | |

| 21 | CONNECT BY PUMP | | | | |

| 22 | INDEX FAST FULL SCAN | INDEX_CMS_SM_ID_FID | 13095 | 102K| 8 (0)|

| 23 | TABLE ACCESS FULL | CMS_SM | 13095 | 102K| 8 (0)|

---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

3 - filter("R"."IDSITE"=(-1) AND "R"."IDCHANNEL"=(-1) OR "R"."IDCHANNEL"=(-888)

AND "S"."ID"="R"."IDSITE" OR "R"."IDCHANNEL"=(-1) AND "R"."IDSITE">0 AND EXISTS

(SELECT /*+ */ 0 FROM "CMS_SM" "CMS_SM" AND ("ID"=:B1)) OR "R"."IDSITE">0 AND

"R"."IDCHANNEL">0 AND ( EXISTS (SELECT /*+ */ 0 FROM "CMS_SM" "CMS_SM" AND

("ID"=:B2)) OR "S"."FID"=(-1) AND "S"."ID"="R"."IDSITE"))

5 - filter("S"."FID"=1 AND "S"."STATUS"=1)

6 - access("S"."TYPE"=0)

7 - access("R"."USERID"=3535)

8 - filter("ID"=:B1)

9 - filter("ID"=:B1)

11 - access("ID"=:B1)

12 - access("ID"=NULL)

16 - filter("ID"=:B1)

17 - filter("ID"=:B1)

19 - access("ID"=:B1)

20 - access("ID"=NULL)

Statistics

----------------------------------------------------------

17927 consistent gets

649 sorts (memory)

5 rows processed

這裡我通過在sql語句中添加提示的方式,使得它的執行計劃和上面rule的方式幾乎是一樣的(當然,到現在我也不知道有什麼提示可以單獨的控制層次查詢的執行計劃,是以對于層次查詢的執行計劃,我無法使得它們可以保持一緻),我感覺這裡真正不同的地方在于層次查詢的執行計劃上,rule的時候走的是nl,cbo的時候走的是hash join(層次查詢具體是如何執行的,不清楚,網上也找不到相應的資料),但它的執行時間,邏輯IO還是比rule的方式高出很多.也許性能上的巨大差異就是因為這裡層次查詢執行上的不同造成的

這裡的層次查詢,單獨拿出來執行的時候,就是nl,但放進大的查詢裡就是hash join.rule提示使得它可以走nl,而不是hash join,但也許其它地方比如連接配接順序等使用rule的方式不是太好,但這裡的語句改起來挺費勁的,cbo下又沒有提示使得這裡的層次查詢可以走nl.

我猜測這裡的層次查詢使用hash join的方式比使用nl的方式要差一些,也許執行一次就是多幾十個邏輯讀,也差不到哪裡去,但關鍵是這裡還有兩個filter操作,多次過濾,多次執行這樣的層次查詢,會使得邏輯讀差别就大了

可對于層次查詢,執行計劃看的不是太明白,感覺隻能是建立正确的索引來控制性能了吧!我不知道有什麼提示可以單獨的控制層次查詢的行為.

如何才能大部分使用cbo,隻是層次查詢使用nl呢?上面提到這裡的層次查詢,單獨拿出來執行的時候,就是nl,于是乎,想到通過table函數來實作.

create or replace package zsj_pack

is

t_count number;

end;

/

這裡的這個包完全是為了計數的需要

create or replace type t_tab_id is table of number;

create or replace function get_cms_sm_ids(v_id number) return t_tab_id

is

v_tab_id t_tab_id;

begin

select id

bulk collect into v_tab_id

from cms_sm

start with id = v_id

connect by id=prior fid;

zsj_pack.t_count:=zsj_pack.t_count+1;

return v_tab_id;

end;

/

SQL> explain plan for select id

from cms_sm

start with id = :v_id

connect by id=prior fid;

SQL> select * from table(dbms_xplan.display());

-----------------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|

-----------------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 1 | 8 | 1 (0)|

|* 1 | CONNECT BY WITH FILTERING | | | | |

| 2 | TABLE ACCESS BY INDEX ROWID| CMS_SM | | | |

|* 3 | INDEX UNIQUE SCAN | SYS_C0010312 | 1 | 4 | 1 (0)|

| 4 | NESTED LOOPS | | | | |

| 5 | BUFFER SORT | | | | |

| 6 | CONNECT BY PUMP | | | | |

|* 7 | INDEX RANGE SCAN | INDEX_CMS_SM_ID_FID | 1 | 8 | 1 (0)|

|* 8 | TABLE ACCESS FULL | CMS_SM | 1 | 8 | 1 (0)|

-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - filter("ID"=TO_NUMBER(:V_ID))

3 - access("ID"=TO_NUMBER(:V_ID))

7 - access("ID"=NULL)

8 - access("ID"=NULL)

select distinct id,fid,name,path,IsCheck,

STATUS,rank,title,type,code,

domainPath,domain,s.idsite,keyword,

comments,contribution,pubrart,

pids,isaddlable,tiid,taid

from CMS_SM s, CMS_USER_RF r

where s.type = 0

and s.status = 1

and s.fid = 1

and (((r.idsite = -1 and r.idchannel = -1) or

(r.idchannel = -888 and s.id = r.idsite) or

(r.idsite > 0 and r.idchannel = -1 and

r.idsite in (select column_value id from table(get_cms_sm_ids(s.id)))) or

(r.idsite > 0 and r.idchannel > 0 and

(r.idchannel in (select column_value id from table(get_cms_sm_ids(s.id))) or

(s.fid = -1 and s.id = r.idsite)))) and r.userid = 3535)

order by id desc;

SQL> /

Elapsed: 00:00:00.03

-------------------------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|

-------------------------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 1 | 211 | 5 (20)|

| 1 | SORT ORDER BY | | 1 | 211 | |

| 2 | HASH UNIQUE | | 1 | 211 | 5 (20)|

|* 3 | FILTER | | | | |

| 4 | MERGE JOIN CARTESIAN | | 2 | 422 | 4 (0)|

|* 5 | TABLE ACCESS BY INDEX ROWID | CMS_SM | 1 | 200 | 3 (0)|

|* 6 | INDEX RANGE SCAN DESCENDING | CMS_SM_FID | 5 | | 1 (0)|

| 7 | BUFFER SORT | | 3 | 33 | 1 (0)|

|* 8 | INDEX RANGE SCAN | USERID-SITE-CHANNEL | 3 | 33 | 1 (0)|

|* 9 | COLLECTION ITERATOR PICKLER FETCH| GET_CMS_SM_IDS | | | |

|* 10 | COLLECTION ITERATOR PICKLER FETCH| GET_CMS_SM_IDS | | | |

-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

3 - filter("R"."IDSITE"=(-1) AND "R"."IDCHANNEL"=(-1) OR "R"."IDCHANNEL"=(-888) AND

"S"."ID"="R"."IDSITE" OR "R"."IDCHANNEL"=(-1) AND "R"."IDSITE">0 AND EXISTS (SELECT /*+

*/ 0 FROM TABLE() "KOKBF$" WHERE VALUE(KOKBF$)=:B1) OR "R"."IDSITE">0 AND

"R"."IDCHANNEL">0 AND ( EXISTS (SELECT /*+ */ 0 FROM TABLE() "KOKBF$" WHERE

VALUE(KOKBF$)=:B2) OR "S"."FID"=(-1) AND "S"."ID"="R"."IDSITE"))

5 - filter("S"."TYPE"=0 AND "S"."STATUS"=1)

6 - access("S"."FID"=1)

8 - access("R"."USERID"=3535)

9 - filter(VALUE(KOKBF$)=:B1)

10 - filter(VALUE(KOKBF$)=:B1)

Statistics

----------------------------------------------------------

216 recursive calls

1931 consistent gets

1082 sorts (memory)

0 sorts (disk)

5 rows processed

可以發現邏輯IO比rule的實作還要小

單獨的測試層次查詢執行了多少次:

exec zsj_pack.t_count:=0;

--執行上面修改後的sql語句

SQL> set serveroutput on;

SQL> exec dbms_output.put_line(zsj_pack.t_count);

216

顯示這個層次查詢執行了有216次之多.其實從上面的遞歸調用次數也可以猜出來的

這裡通過table函數的實作,在使用cbo的情況下,對于層次查詢,使得sql可以按照自己想要的執行方式去執行.但就像我上面提到的一樣,有沒有什麼提示可以單獨的控制層次查詢的執行計劃呢?對于層次查詢大家有什麼好的資料嗎?