关于层次查询最常见的性能问题是:因为优化器在评估层次查询返回的数据行时存在严重的偏差进而导致了错误的连接顺序,连接方法导致的糟糕的性能.这似乎是优化器固有的问题(其实也确实是不好评估返回的行数),与优化器统计信息的收集方法没有关系,可以参见老熊的这篇文章:
http://www.laoxiong.net/take-care-of-using-connect-by-query.html,
对于这种问题,因为问题本身是因为优化器错误的评估了层次查询返回的数据行数导致的,所以我一般会使用cardinality提示或者是opt_estimate提示使用代表性的返回行数来纠正优化器评估的数据行数,其它地方不做任何修改,不添加任何的提示,一般就能纠正问题.(这里动态采样似乎是不管用的,即使是我使用了提示也改变不了优化器评估的返回行数).
当然,我这里说的不是这个问题.是公司的上一个cms版本中的一个sql的问题,是差不多两年前的一个问题了,具体的数据库版本不是太清楚了,应该是9206的版本.
select distinct id,fid,name,path,IsCheck,
STATUS,rank,title,type,code,
domainPath,domain,s.idsite,keyword,
comments,contribution,pubrart,
pids,isaddlable,tiid,taid
from CMS_SM s, CMS_USER_RF r
where s.type = 0
and s.status = 1
and s.fid = 1
and (((r.idsite = -1 and r.idchannel = -1) or
(r.idchannel = -888 and s.id = r.idsite) or
(r.idsite > 0 and r.idchannel = -1 and
r.idsite in (select id
from cms_sm
connect by prior fid = id
start with id = s.id)) or
(r.idsite > 0 and r.idchannel > 0 and
(r.idchannel in (select id
from cms_sm
connect by prior fid = id
start with id = s.id) or
(s.fid = -1 and s.id = r.idsite)))) and r.userid = 3535)
order by id desc;
按CBO走,逻辑读:2.8W CPU使用:1.77s
这是一个用户权限检测的SQL语句,执行频率还是挺高的,而且这里的and,or,改起来确实挺麻烦的.
当时的情况就是user cpu占有率偏高,sp报表中大量的hash latch争用,很快就定位到是这个sql引起的.
当时,并没有保存之前的sp报表,当然更没有保存这个sql之前(更确切的说应该是表现良好时)的执行计划,执行统计信息,优化器统计信息,因为缺少这些对比的基准信息,所以无从确定这个sql是一直表现这么差呀还是突然变成这么差了?也许是一直表现这么差,只是因为这时候的执行频率稍微多了些,或者是因为一些其它的性能上的问题,造成资源的争用,把这个sql的性能问题给凸显出来了;也许是以前表现良好,只是因为物理结构的改变(比如说添加了新的索引),重新收集了优化器统计信息,或者是因为其它的一些原因,重新硬分析时,才选择了一个错误的执行计划,才突然变得性能这么差了.关键的问题不在于我刚调到这个部门来,对这个数据库系统还几乎是一无所知,而在于我当时还没有形成这样一套解决性能问题时应该有的思路,所以并没有之前的一些准备,也没有按照这样的思路去处理问题.当然,现在整理之前的一些文档的时候,我能想到这样一些处理步骤,还是说明自己是有了一些进步的,呵呵.当时,只是询问开发人员对应用是否变更过,他们回应说没动过什么东西呀,检查了一下,最近确实没有新增过什么索引(至于开发人员是否删除了什么索引,就无从确定了,当然他们说没有变动过),就开始检查这个sql该怎么调整了.
这里是connect by prior fid = id,当然,如果是我写的话,我更倾向于写成connect by id=prior fid,立即起来就是新记录的id字段=之前记录(也就是上一级level-1记录)的fid字段.我发现这里的cms_sm的id列上有主键约束,但却没有index(id,fid),于是我添加了这样的索引:create index index_cms_sm_id_fid on cms_sm(id,fid) online;这样执行计划变成了这样:
/
Elapsed: 00:00:01.69 (大部分都是CPU时间)
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 211 | 5 (20)|
| 1 | SORT ORDER BY | | 1 | 211 | |
| 2 | HASH UNIQUE | | 1 | 211 | 5 (20)|
|* 3 | FILTER | | | | |
| 4 | MERGE JOIN CARTESIAN | | 2 | 422 | 4 (0)|
|* 5 | TABLE ACCESS BY INDEX ROWID | CMS_SM | 1 | 200 | 3 (0)|
|* 6 | INDEX RANGE SCAN DESCENDING| CMS_SM_FID | 5 | | 1 (0)|
| 7 | BUFFER SORT | | 3 | 33 | 1 (0)|
|* 8 | INDEX RANGE SCAN | USERID-SITE-CHANNEL | 3 | 33 | 1 (0)|
|* 9 | FILTER | | | | |
|* 10 | CONNECT BY WITH FILTERING | | | | |
| 11 | TABLE ACCESS BY INDEX ROWID| CMS_SM | | | |
|* 12 | INDEX UNIQUE SCAN | SYS_C0010312 | 1 | 4 | 1 (0)|
|* 13 | HASH JOIN | | | | |
| 14 | CONNECT BY PUMP | | | | |
| 15 | INDEX FAST FULL SCAN | INDEX_CMS_SM_ID_FID | 13095 | 102K| 8 (0)|
| 16 | TABLE ACCESS FULL | CMS_SM | 13095 | 102K| 8 (0)|
|* 17 | FILTER | | | | |
|* 18 | CONNECT BY WITH FILTERING | | | | |
| 19 | TABLE ACCESS BY INDEX ROWID| CMS_SM | | | |
|* 20 | INDEX UNIQUE SCAN | SYS_C0010312 | 1 | 4 | 1 (0)|
|* 21 | HASH JOIN | | | | |
| 22 | CONNECT BY PUMP | | | | |
| 23 | INDEX FAST FULL SCAN | INDEX_CMS_SM_ID_FID | 13095 | 102K| 8 (0)|
| 24 | TABLE ACCESS FULL | CMS_SM | 13095 | 102K| 8 (0)|
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("R"."IDSITE"=(-1) AND "R"."IDCHANNEL"=(-1) OR "R"."IDCHANNEL"=(-888)
AND "S"."ID"="R"."IDSITE" OR "R"."IDCHANNEL"=(-1) AND "R"."IDSITE">0 AND EXISTS
(SELECT /*+ */ 0 FROM "CMS_SM" "CMS_SM" AND ("ID"=:B1)) OR "R"."IDSITE">0 AND
"R"."IDCHANNEL">0 AND ( EXISTS (SELECT /*+ */ 0 FROM "CMS_SM" "CMS_SM" AND
("ID"=:B2)) OR "S"."FID"=(-1) AND "S"."ID"="R"."IDSITE"))
5 - filter("S"."TYPE"=0 AND "S"."STATUS"=1)
6 - access("S"."FID"=1)
8 - access("R"."USERID"=3535)
9 - filter("ID"=:B1)
10 - filter("ID"=:B1)
12 - access("ID"=:B1)
13 - access("ID"=NULL)
17 - filter("ID"=:B1)
18 - filter("ID"=:B1)
20 - access("ID"=:B1)
21 - access("ID"=NULL)
Statistics
----------------------------------------------------------
17915 consistent gets
650 sorts (memory)
5 rows processed
发现逻辑io已经降低为1.8W了,当然,这样的逻辑io可能还是不可接受的,所以我们还需要看看是否还有优化的余地.
当然,因为这里的and,or,业务逻辑似乎还是很复杂的,所以改写起来还是挺麻烦的.因为我当时刚刚读完Jonathan Lewis关于CBO的书,知道oracle在计算复杂的and,or的时候,cardinality的估算上可能会存在一些问题,并由此造成一些执行计划上的问题,所以忽然就想到了加rule提示来尝试看看能不能有所改变,于是在最外层的select处加上了的提示.说真的,我都没想到逻辑io,执行时间立刻就降了下来:
/
Elapsed: 00:00:00.02
----------------------------------------------------------------
| Id | Operation | Name |
----------------------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | SORT UNIQUE | |
|* 2 | FILTER | |
| 3 | NESTED LOOPS | |
|* 4 | TABLE ACCESS BY INDEX ROWID | CMS_SM |
|* 5 | INDEX RANGE SCAN | CMS_SM_TYPE |
|* 6 | INDEX RANGE SCAN | USERID-SITE-CHANNEL |
|* 7 | FILTER | |
|* 8 | CONNECT BY WITH FILTERING | |
| 9 | TABLE ACCESS BY INDEX ROWID | CMS_SM |
|* 10 | INDEX UNIQUE SCAN | SYS_C0010312 |
| 11 | NESTED LOOPS | |
| 12 | BUFFER SORT | |
| 13 | CONNECT BY PUMP | |
| 14 | TABLE ACCESS BY INDEX ROWID| CMS_SM |
|* 15 | INDEX UNIQUE SCAN | SYS_C0010312 |
|* 16 | TABLE ACCESS FULL | CMS_SM |
|* 17 | FILTER | |
|* 18 | CONNECT BY WITH FILTERING | |
| 19 | TABLE ACCESS BY INDEX ROWID | CMS_SM |
|* 20 | INDEX UNIQUE SCAN | SYS_C0010312 |
| 21 | NESTED LOOPS | |
| 22 | BUFFER SORT | |
| 23 | CONNECT BY PUMP | |
| 24 | TABLE ACCESS BY INDEX ROWID| CMS_SM |
|* 25 | INDEX UNIQUE SCAN | SYS_C0010312 |
|* 26 | TABLE ACCESS FULL | CMS_SM |
----------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("R"."IDSITE"=(-1) AND "R"."IDCHANNEL"=(-1) OR
"R"."IDCHANNEL"=(-888) AND "S"."ID"="R"."IDSITE" OR "R"."IDSITE">0 AND
"R"."IDCHANNEL"=(-1) AND EXISTS (SELECT 0 FROM "CMS_SM" "CMS_SM" WHERE
"ID"=NULL AND ("ID"=:B1)) OR "R"."IDSITE">0 AND "R"."IDCHANNEL">0 AND (
EXISTS (SELECT 0 FROM "CMS_SM" "CMS_SM" WHERE "ID"=NULL AND ("ID"=:B2))
OR "S"."FID"=(-1) AND "S"."ID"="R"."IDSITE"))
4 - filter("S"."FID"=1 AND "S"."STATUS"=1)
5 - access("S"."TYPE"=0)
6 - access("R"."USERID"=3535)
7 - filter("ID"=:B1)
8 - filter("ID"=:B1)
10 - access("ID"=:B1)
15 - access("ID"=NULL)
16 - access("ID"=NULL)
17 - filter("ID"=:B1)
18 - filter("ID"=:B1)
20 - access("ID"=:B1)
25 - access("ID"=NULL)
26 - access("ID"=NULL)
Note
-----
- rule based optimizer used (consider using cbo)
Statistics
----------------------------------------------------------
2159 consistent gets
1081 sorts (memory)
5 rows processed
现在逻辑io,执行时间直接降低了一个数量级,应该来说还是令人满意的,所以就直接让开发人员修改应用,添加了rule提示,然后重启了应用服务器,后来系统性能也就没有再出现问题.
可当我仔细思考这个问题的时候,总觉得rule的威力有这么大吗?还有我敢一直这样使用这个rule提示吗?(虽然说新的cms正在试用阶段,过不了多长时间这个老的cms系统也就要正式退役了吧),于是我尝试着使用提示来构造一个和这个rule提示一样的执行计划.
我去掉rule提示,添加了提示之后,执行计划变成了这样:
SQL> /
Elapsed: 00:00:01.64
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 211 | 141 (2)|
| 1 | SORT ORDER BY | | 1 | 211 | 141 (2)|
| 2 | HASH UNIQUE | | 1 | 211 | 140 (1)|
|* 3 | FILTER | | | | |
| 4 | NESTED LOOPS | | 2 | 422 | 139 (0)|
|* 5 | TABLE ACCESS BY INDEX ROWID | CMS_SM | 1 | 200 | 138 (0)|
|* 6 | INDEX RANGE SCAN | CMS_SM_TYPE | 4365 | | 6 (0)|
|* 7 | INDEX RANGE SCAN | USERID-SITE-CHANNEL | 3 | 33 | 1 (0)|
|* 8 | FILTER | | | | |
|* 9 | CONNECT BY WITH FILTERING | | | | |
| 10 | TABLE ACCESS BY INDEX ROWID| CMS_SM | | | |
|* 11 | INDEX UNIQUE SCAN | SYS_C0010312 | 1 | 4 | 1 (0)|
|* 12 | HASH JOIN | | | | |
| 13 | CONNECT BY PUMP | | | | |
| 14 | INDEX FAST FULL SCAN | INDEX_CMS_SM_ID_FID | 13095 | 102K| 8 (0)|
| 15 | TABLE ACCESS FULL | CMS_SM | 13095 | 102K| 8 (0)|
|* 16 | FILTER | | | | |
|* 17 | CONNECT BY WITH FILTERING | | | | |
| 18 | TABLE ACCESS BY INDEX ROWID| CMS_SM | | | |
|* 19 | INDEX UNIQUE SCAN | SYS_C0010312 | 1 | 4 | 1 (0)|
|* 20 | HASH JOIN | | | | |
| 21 | CONNECT BY PUMP | | | | |
| 22 | INDEX FAST FULL SCAN | INDEX_CMS_SM_ID_FID | 13095 | 102K| 8 (0)|
| 23 | TABLE ACCESS FULL | CMS_SM | 13095 | 102K| 8 (0)|
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("R"."IDSITE"=(-1) AND "R"."IDCHANNEL"=(-1) OR "R"."IDCHANNEL"=(-888)
AND "S"."ID"="R"."IDSITE" OR "R"."IDCHANNEL"=(-1) AND "R"."IDSITE">0 AND EXISTS
(SELECT /*+ */ 0 FROM "CMS_SM" "CMS_SM" AND ("ID"=:B1)) OR "R"."IDSITE">0 AND
"R"."IDCHANNEL">0 AND ( EXISTS (SELECT /*+ */ 0 FROM "CMS_SM" "CMS_SM" AND
("ID"=:B2)) OR "S"."FID"=(-1) AND "S"."ID"="R"."IDSITE"))
5 - filter("S"."FID"=1 AND "S"."STATUS"=1)
6 - access("S"."TYPE"=0)
7 - access("R"."USERID"=3535)
8 - filter("ID"=:B1)
9 - filter("ID"=:B1)
11 - access("ID"=:B1)
12 - access("ID"=NULL)
16 - filter("ID"=:B1)
17 - filter("ID"=:B1)
19 - access("ID"=:B1)
20 - access("ID"=NULL)
Statistics
----------------------------------------------------------
17927 consistent gets
649 sorts (memory)
5 rows processed
这里我通过在sql语句中添加提示的方式,使得它的执行计划和上面rule的方式几乎是一样的(当然,到现在我也不知道有什么提示可以单独的控制层次查询的执行计划,所以对于层次查询的执行计划,我无法使得它们可以保持一致),我感觉这里真正不同的地方在于层次查询的执行计划上,rule的时候走的是nl,cbo的时候走的是hash join(层次查询具体是如何执行的,不清楚,网上也找不到相应的资料),但它的执行时间,逻辑IO还是比rule的方式高出很多.也许性能上的巨大差异就是因为这里层次查询执行上的不同造成的
这里的层次查询,单独拿出来执行的时候,就是nl,但放进大的查询里就是hash join.rule提示使得它可以走nl,而不是hash join,但也许其它地方比如连接顺序等使用rule的方式不是太好,但这里的语句改起来挺费劲的,cbo下又没有提示使得这里的层次查询可以走nl.
我猜测这里的层次查询使用hash join的方式比使用nl的方式要差一些,也许执行一次就是多几十个逻辑读,也差不到哪里去,但关键是这里还有两个filter操作,多次过滤,多次执行这样的层次查询,会使得逻辑读差别就大了
可对于层次查询,执行计划看的不是太明白,感觉只能是建立正确的索引来控制性能了吧!我不知道有什么提示可以单独的控制层次查询的行为.
如何才能大部分使用cbo,只是层次查询使用nl呢?上面提到这里的层次查询,单独拿出来执行的时候,就是nl,于是乎,想到通过table函数来实现.
create or replace package zsj_pack
is
t_count number;
end;
/
这里的这个包完全是为了计数的需要
create or replace type t_tab_id is table of number;
create or replace function get_cms_sm_ids(v_id number) return t_tab_id
is
v_tab_id t_tab_id;
begin
select id
bulk collect into v_tab_id
from cms_sm
start with id = v_id
connect by id=prior fid;
zsj_pack.t_count:=zsj_pack.t_count+1;
return v_tab_id;
end;
/
SQL> explain plan for select id
from cms_sm
start with id = :v_id
connect by id=prior fid;
SQL> select * from table(dbms_xplan.display());
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 8 | 1 (0)|
|* 1 | CONNECT BY WITH FILTERING | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| CMS_SM | | | |
|* 3 | INDEX UNIQUE SCAN | SYS_C0010312 | 1 | 4 | 1 (0)|
| 4 | NESTED LOOPS | | | | |
| 5 | BUFFER SORT | | | | |
| 6 | CONNECT BY PUMP | | | | |
|* 7 | INDEX RANGE SCAN | INDEX_CMS_SM_ID_FID | 1 | 8 | 1 (0)|
|* 8 | TABLE ACCESS FULL | CMS_SM | 1 | 8 | 1 (0)|
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("ID"=TO_NUMBER(:V_ID))
3 - access("ID"=TO_NUMBER(:V_ID))
7 - access("ID"=NULL)
8 - access("ID"=NULL)
select distinct id,fid,name,path,IsCheck,
STATUS,rank,title,type,code,
domainPath,domain,s.idsite,keyword,
comments,contribution,pubrart,
pids,isaddlable,tiid,taid
from CMS_SM s, CMS_USER_RF r
where s.type = 0
and s.status = 1
and s.fid = 1
and (((r.idsite = -1 and r.idchannel = -1) or
(r.idchannel = -888 and s.id = r.idsite) or
(r.idsite > 0 and r.idchannel = -1 and
r.idsite in (select column_value id from table(get_cms_sm_ids(s.id)))) or
(r.idsite > 0 and r.idchannel > 0 and
(r.idchannel in (select column_value id from table(get_cms_sm_ids(s.id))) or
(s.fid = -1 and s.id = r.idsite)))) and r.userid = 3535)
order by id desc;
SQL> /
Elapsed: 00:00:00.03
-------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 211 | 5 (20)|
| 1 | SORT ORDER BY | | 1 | 211 | |
| 2 | HASH UNIQUE | | 1 | 211 | 5 (20)|
|* 3 | FILTER | | | | |
| 4 | MERGE JOIN CARTESIAN | | 2 | 422 | 4 (0)|
|* 5 | TABLE ACCESS BY INDEX ROWID | CMS_SM | 1 | 200 | 3 (0)|
|* 6 | INDEX RANGE SCAN DESCENDING | CMS_SM_FID | 5 | | 1 (0)|
| 7 | BUFFER SORT | | 3 | 33 | 1 (0)|
|* 8 | INDEX RANGE SCAN | USERID-SITE-CHANNEL | 3 | 33 | 1 (0)|
|* 9 | COLLECTION ITERATOR PICKLER FETCH| GET_CMS_SM_IDS | | | |
|* 10 | COLLECTION ITERATOR PICKLER FETCH| GET_CMS_SM_IDS | | | |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("R"."IDSITE"=(-1) AND "R"."IDCHANNEL"=(-1) OR "R"."IDCHANNEL"=(-888) AND
"S"."ID"="R"."IDSITE" OR "R"."IDCHANNEL"=(-1) AND "R"."IDSITE">0 AND EXISTS (SELECT /*+
*/ 0 FROM TABLE() "KOKBF$" WHERE VALUE(KOKBF$)=:B1) OR "R"."IDSITE">0 AND
"R"."IDCHANNEL">0 AND ( EXISTS (SELECT /*+ */ 0 FROM TABLE() "KOKBF$" WHERE
VALUE(KOKBF$)=:B2) OR "S"."FID"=(-1) AND "S"."ID"="R"."IDSITE"))
5 - filter("S"."TYPE"=0 AND "S"."STATUS"=1)
6 - access("S"."FID"=1)
8 - access("R"."USERID"=3535)
9 - filter(VALUE(KOKBF$)=:B1)
10 - filter(VALUE(KOKBF$)=:B1)
Statistics
----------------------------------------------------------
216 recursive calls
1931 consistent gets
1082 sorts (memory)
0 sorts (disk)
5 rows processed
可以发现逻辑IO比rule的实现还要小
单独的测试层次查询执行了多少次:
exec zsj_pack.t_count:=0;
--执行上面修改后的sql语句
SQL> set serveroutput on;
SQL> exec dbms_output.put_line(zsj_pack.t_count);
216
显示这个层次查询执行了有216次之多.其实从上面的递归调用次数也可以猜出来的
这里通过table函数的实现,在使用cbo的情况下,对于层次查询,使得sql可以按照自己想要的执行方式去执行.但就像我上面提到的一样,有没有什么提示可以单独的控制层次查询的执行计划呢?对于层次查询大家有什么好的资料吗?