天天看點

sql trace基礎

一、 基礎.SQL_TRACE

1. 在全局啟用

在參數檔案(pfile/spfile)中指定:

sql_trace =true

在全局啟用SQL_TRACE會導緻所有程序的活動被跟蹤,包括背景程序及所有使用者程序,這通常會導緻比較嚴重的性能問題,是以在生産環境中要謹慎使用.

提示: 通過在全局啟用sql_trace,我們可以跟蹤到所有背景程序的活動,很多在文檔中的抽象說明,通過跟蹤檔案的實時變化,我們可以清晰的看到各個程序之間的緊密協調.

2. 在目前session級設定

大多數時候我們使用sql_trace跟蹤目前程序.通過跟蹤目前程序可以發現目前操作的背景資料庫遞歸活動(這在研究資料庫新特性時尤其有效), 研究SQL執行,發現背景錯誤等.

在session級啟用和停止sql_trace方式如下:

SQL> alter session set sql_trace=true;

SQL> SQL statements ......

SQL> alter session set sql_trace=false;

3. 跟蹤其他使用者程序

在很多時候我們需要跟蹤其他使用者的程序,而不是目前使用者,這可以通過Oracle提供的系統包

DBMS_SYSTEM.SET_SQL_TRACE_IN_SESSION來完成

SET_SQL_TRACE_IN_SESSION過程式要提供三個參數:

SQL> desc dbms_system

......

PROCEDURE SET_SQL_TRACE_IN_SESSION

Argument Name                           Type                    In/Out Default?

------------------------------           ---------------------    ------ --------

SID                                      NUMBER                  IN

SERIAL#                                  NUMBER                  IN

SQL_TRACE                                BOOLEAN                 IN

......

通過v$session我們可以獲得sid、serial#等資訊:

獲得程序資訊,選擇需要跟蹤的程序:

SQL> select sid,serial#,username from v$session where username is not null;

SID    SERIAL# USERNAME

---------- ---------- ------------------------------

10       9068 BIGBEAR

SQL> exec dbms_system.set_sql_trace_in_session(10,9068,true)

SQL> exec dbms_system.set_sql_trace_in_session(10,9068,false)

二、 基礎.10046事件

10046事件是Oracle提供的内部事件,是對SQL_TRACE的增強.

10046事件可以設定以下四個級别:

1 - 啟用标準的SQL_TRACE功能,等價于sql_trace

4 - Level 1 加上綁定值(bind values)

8 - Level 1 + 等待事件跟蹤

12 - Level 1 + Level 4 + Level 8

類似sql_trace,10046事件可以在全局設定,也可以在session級設定。

1. 在全局設定

在參數檔案中增加:

event="10046 trace name context forever,level 12"

此設定對所有使用者的所有程序生效、包括背景程序.

2. 對目前session設定

通過alter session的方式修改,需要alter session的系統權限:

SQL> alter session set events '10046 trace name context forever';

SQL> alter session set events '10046 trace name context forever, level 8';

......

SQL> alter session set events '10046 trace name context off';

3. 對其他使用者session設定

通過DBMS_SYSTEM.SET_EV系統包來實作:

SQL> desc dbms_system

...

PROCEDURE SET_EV

Argument Name                  Type                    In/Out Default?

------------------------------ ----------------------- ------ --------

SI                             BINARY_INTEGER          IN

SE                             BINARY_INTEGER          IN

EV                             BINARY_INTEGER          IN

LE                             BINARY_INTEGER          IN

NM                             VARCHAR2                IN

...

其中的參數SI、SE來自v$session視圖:

查詢獲得需要跟蹤的session資訊:

SQL> select sid,serial#,username from v$session where username is not null;

SID    SERIAL# USERNAME

---------- ---------- ------------------------------

10       9068 BIGBEAR

SQL> exec dbms_system.set_ev(10,9068,10046,8,'bigbear');

......

SQL> exec dbms_system.set_ev(10,9068,10046,0,'bigbear');

三、TKPROF格式化輸出

TKPROF bigbear_ora_27292.trc output-file

  EXPLAIN=scott/tiger

  TABLE=scott.temp_plan_table_a

  INSERT=STOREA.SQL

  SYS=NO

  SORT=(EXECPU,FCHCPU)

  PRINT=5

list description
prscnt

number of times parse was called

解析被調用次數

prscpu

cpu time parsing

解析占用CPU時間(秒)

prsela

elapsed time parsing

解析占用總時間(秒)

prsdsk

number of disk reads during parse

解析期間磁盤讀取的資料塊數

prsqry

number of buffers for consistent read during parse

解析期間一緻讀所通路到的緩沖塊的數量

prscu

number of buffers for current read during parse

解析期間目前讀所通路到的緩沖塊的數量

prsmis

number of misses in library cache during parse

解析期間從庫緩存中丢失的塊的數量

execnt number of execute was called
execpu cpu time spent executing
exeela elapsed time executing
exedsk number of disk reads during execute                   
exeqry number of buffers for consistent read during execute  
execu number of buffers for current read during execute     
exerow number of rows processed during execute               
exemis number of library cache misses during execute         
fchcnt number of times fetch was called                      
fchcpu cpu time spent fetching                               
fchela elapsed time fetching                                 
fchdsk number of disk reads during fetch                     
fchqry number of buffers for consistent read during fetch    
fchcu number of buffers for current read during fetch       
fchrow number of rows fetched                                
userid userid of user that parsed the cursor                 

四、TKPROF輸出說明

list description
PARSE

Translates the SQL statement into an execution plan, including checks for proper security authorization and checks for the existence of tables, columns, and other referenced objects.

将SQL語句轉換為執行計劃,包括檢查安全認證,還有表/列及其它相關對象的存在性

EXECUTE

Actual execution of the statement by Oracle. For INSERT, UPDATE, and DELETE statements, this modifies the data. For SELECT statements, this identifies the selected rows.

實際送出給ORACLE執行。對于INS/UPD/DEL這個名額反映修改後的資料,對于SELECT反映的是選中的資料。

FETCH

Retrieves rows returned by a query. Fetches are only performed for SELECT statements.

對于SELECT取回資料的行數。

column description
COUNT

Number of times a statement was parsed, executed, or fetched.

次數

CPU

Total CPU time in seconds for all parse, execute, or fetch calls for the statement. This value is zero (0) if TIMED_STATISTICS is not turned on.

占CUP的秒數

ELAPSED

Total elapsed time in seconds for all parse, execute, or fetch calls for the statement. This value is zero (0) if TIMED_STATISTICS is not turned on.

整個流逝的時間

DISK

Total number of data blocks physically read from the data files on disk for all parse, execute, or fetch calls.

讀磁盤的塊數

QUERY

Total number of buffers retrieved in consistent mode for all parse, execute, or fetch calls. Usually, buffers are retrieved in consistent mode for queries.

一緻讀所通路的緩沖塊數

CURRENT

Total number of buffers retrieved in current mode. Buffers are retrieved in current mode for statements such as INSERT, UPDATE, and DELETE.

SQL所影響到的目前的緩沖塊數

ROWS

Total number of rows processed by the SQL statement. This total does not include rows processed by sub-queries of the SQL statement.

SQL總處理的行數,但不包括子查詢所通路的行。

Timing statistics have a resolution of one hundredth of a second; therefore, any operation on a cursor that takes a hundredth of a second or less might not be timed accurately. Keep this in mind when interpreting statistics. In particular, be careful when interpreting the results from simple queries that execute very quickly.

基于時間的統計是每百秒一次,是以任何基于指針的操作如果小于100秒将不會被精确統計。特别提醒:對于那些簡單的執行速度很快的query的統計要十分地小心

Sometimes, in order to execute a SQL statement issued by a user, Oracle must issue additional statements. Such statements are called recursive calls or recursive SQL statements. For example, if you insert a row into a table that does not have enough space to hold that row, then Oracle makes recursive calls to allocate the space dynamically. Recursive calls are also generated when data dictionary information is not available in the data dictionary cache and must be retrieved from disk.

有時候為了執行使用者的一條SQL語句,ORACLE需要調用其它的SQL語句。這樣的SQL我們稱為遞歸調用,或者稱為遞歸SQL語句。例如:當為一張表插入一行資料時,發現沒有足夠的空間,這時ORACLE将産生遞歸調用為其配置設定空間。當資料字典不可用時,會從磁盤上讀取,這一過程也産生遞歸調用。

If recursive calls occur while the SQL Trace facility is enabled, then TKPROF produces statistics for the recursive SQL statements and marks them clearly as recursive SQL statements in the output file. You can suppress the listing of Oracle internal recursive calls (for example, space management) in the output file by setting the SYS command-line parameter to NO. The statistics for a recursive SQL statement are included in the listing for that statement, not in the listing for the SQL statement that caused the recursive call. So, when you are calculating the total resources required to process a SQL statement, consider the statistics for that statement as well as those for recursive calls caused by that statement.

在遞歸調用發生在SQL TRACE中時,TKPROF會統計這些資訊,并明确地在輸出檔案中表明。你可以通過設定參數禁止上述遞歸調用資訊的産生,即:隻有産生遞歸調用的SQL被記錄,相關的遞歸調用過程不被記錄。是以,當要計算整個資源消耗時,要考慮是否将上述資訊算在内。

Deciding Which Statements to Tune

You need to find which SQL statements use the most CPU or disk resource. If the TIMED_STATISTICS parameter is on, then you can find high CPU activity in the CPU column. If TIMED_STATISTICS is not on, then check the QUERY and CURRENT columns.

判斷哪些SQL需要調整

如果設定了TIMED_STATISTICS可以檢視CPU占用,如果沒有設定TIMED_STATISTICS,檢視QUERY和CURRNET。

With the exception of locking problems and inefficient PL/SQL loops, neither the CPU time nor the elapsed time is necessary to find problem statements. The key is the number of block visits, both query (that is, subject to read consistency) and current (that is, not subject to read consistency). Segment headers and blocks that are going to be updated are acquired in current mode, but all query and sub-query processing requests the data in query mode. These are precisely the same measures as the instance statistics CONSISTENT GETS and DB BLOCK GETS. You can find high disk activity in the disk column.

當産生鎖異常或者低效率的PL/SQL,不管是CPU time還是elapsed time都不能發現有問題的SQL。這時的關鍵是通路的資料塊的數目,即:QUERY和CURRENT兩項中對應的數值。

The following listing shows TKPROF output for one SQL statement as it appears in the output file:

SELECT *

FROM emp, dept

WHERE emp.deptno = dept.deptno;

call   count   cpu       elapsed  disk      query     current rows

----   ------- ------- ------- -------- -------- ------- ------

Parse   11     0.08      0.18     0          0        0         0

Execute 11     0.23      0.66     0          3        6         0

Fetch   35     6.70      6.83     100        12326    2         824

------------------------------------------------------------------

total  57      7.01      7.67     100        12329    8         826

Misses in library cache during parse: 0

If it is acceptable to have 7.01 CPU seconds and to retrieve 824 rows, then you need not look any further at this trace output. In fact, a major use of TKPROF reports in a tuning exercise is to eliminate processes from the detailed tuning phase.

You can also see that 10 unnecessary parse call were made (because there were 11 parse calls for this one statement) and that array fetch operations were performed. You know this because more rows were fetched than there were fetches performed. A large gap between CPU and elapsed timings indicates Physical I/Os (PIOs).

可以發現有10個無用的解析被執行(因為對于同一個SQL解析了11次),并且執行的是“批取回”,因為取回的行數比結果集中實際存在行數多。如果CPU與elapsed相關比較大,則認為有實體I/O,即:(PIOs)。

五、儲存 SQL_TRACE 統計結果

When you run TKPROF, use the INSERT parameter to specify the name of the generated SQL script. If you omit this parameter, then TKPROF does not generate a script.

如果使用INSERT參數,可以生成相應的CREATE/INSERT語句。

預設條件下不會生成。

下面表中内容是對于TKPROF_TABLE中部分字段的解釋:

column description
SQL_STATEMENT

This is the SQL statement for which the SQL Trace facility collected the row of statistics. Because this column has data-type LONG, you cannot use it in expressions or WHERE clause conditions.

這裡儲存的是SQL語句,注意儲存類型為LONG,是以不能使用WHERE條件對其做為查詢條件

DATE_OF_INSERT

This is the date and time when the row was inserted into the table. This value is not exactly the same as the time the statistics were collected by the SQL Trace facility.

這個值是反映插入這個表的時間,并不是執行TKPROF的時間。

DEPTH

This indicates the level of recursion at which the SQL statement was issued. For example, a value of 0 indicates that a user issued the statement. A value of 1 indicates that Oracle generated the statement as a recursive call to process a statement with a value of 0 (a statement issued by a user). A value of n indicates that Oracle generated the statement as a recursive call to process a statement with a value of n-1.

指嵌套查詢的嵌套深度。

USER_ID

This identifies the user issuing the statement. This value also appears in the formatted output file.

使用者唯一的辨別,同時也會出現在格式輸出檔案的檔案名上

CURSOR_NUM

Oracle uses this column value to keep track of the cursor to which each SQL statement was assigned.

ORACLE用來記錄遊标數,這些遊标是為每個SQL配置設定的

六、TKPROF的缺陷

避免參數(Argument)缺陷

If you are not aware of the values being bound at run time, then it is possible to fall

into the argument trap. EXPLAIN PLAN cannot determine the type of a bind variable from the text of SQL statements, and it always assumes that the type is varchar. If the bind variable is actually a number or a date, then TKPROF can cause implicit data conversions, which can cause inefficient plans to be executed. To avoid this, experiment with different data types in the query.

如果在運作時綁定變量不小心,就有可能遇到這個缺陷。EXPLAIN PLAN從SQL語句中不能判斷綁定變量的類型,并且總是将其認為是varchar類型的。如果SQL語句中實際綁定的變量是數字或者日期型的,則TKPROF将會做隐式的資料轉換,這樣會導緻低效的計劃被執行。為了避免這個不足,可以綁定不同的資料類型的變量做測試。

避免讀一緻性(Read Consistency)缺陷

The next example illustrates the read consistency trap. Without knowing that an uncommitted transaction had made a series of updates to the NAME column, it is very difficult to see why so many block visits would be incurred.

如果不知道一個未送出的事務,其産生一系列的更新在NAME列上,就非常難于明白為什麼有如此多的資料塊被通路。

Cases like this are not normally repeatable: if the process were run again, it is unlikely that another transaction would interact with it in the same way.

像這樣的一些例子不經常出現,如果将上述過程再運作一次,未必另一個事務會出現同樣的結果。

SELECT name_id

FROM cq_names

WHERE name = ’FLOOR’;

call count cpu elapsed disk query current rows

---- ----- --- ------- ---- ----- ------- ----

Parse   1 0.10 0.18 0 0   0 0

Execute 1 0.00 0.00 0 0   0 0

Fetch   1 0.11 0.21 2 101 0 1

Misses in library cache during parse: 1

Parsing user id: 01 (USER1)

Rows Execution Plan

---- --------- ----

0 SELECT STATEMENT

1 TABLE ACCESS (BY ROWID) OF ’CQ_NAMES’

2 INDEX (RANGE SCAN) OF ’CQ_NAMES_NAME’ (NON_UNIQUE)

避免讀計劃(schema)缺陷

This example shows an extreme (and thus easily detected) example of the schema trap. At first, it is difficult to see why such an apparently straightforward indexed query needs to look at so many database blocks, or why it should access any blocks at all in current mode.

SELECT name_id

FROM cq_names

WHERE name = ’FLOOR’;

call count cpu elapsed disk query current rows

-------- ------- -------- --------- ------- ------ ------- ----

Parse   1 0.06 0.10 0  0  0 0

Execute 1 0.02 0.02 0  0  0 0

Fetch   1 0.23 0.30 3131 3 1

Misses in library cache during parse: 0

Parsing user id: 02 (USER2)

Rows Execution Plan

------- ---------------------------------------------------

0 SELECT STATEMENT

2340 TABLE ACCESS (BY ROWID) OF ’CQ_NAMES’

0 INDEX (RANGE SCAN) OF ’CQ_NAMES_NAME’ (NON-UNIQUE)

Two statistics suggest that the query might have been executed with a full table scan. These statistics are the current mode block visits, plus the number of rows originating from the Table Access row source in the execution plan. The explanation is that the required index was built after the trace file had been produced, but before TKPROF had been run.

Generating a new trace file gives the following data:

SELECT name_id

FROM cq_names

WHERE name = ’FLOOR’;

call count cpu elapsed disk query current rows

----- ------ ------ -------- ----- ------ ------- -----

Parse   1 0.01 0.02 0 0 0 0

Execute 1 0.00 0.00 0 0 0 0

Fetch   1 0.00 0.00 0 2 0 1

Misses in library cache during parse: 0

Parsing user id: 02 (USER2)

Rows Execution Plan

------- ---------------------------------------------------

0 SELECT STATEMENT

1 TABLE ACCESS (BY ROWID) OF ’CQ_NAMES’

2 INDEX (RANGE SCAN) OF ’CQ_NAMES_NAME’ (NON-UNIQUE)

One of the marked features of this correct version is that the parse call took 10 milliseconds of CPU time and 20 milliseconds of elapsed time, but the query apparently took no time at all to execute and perform the fetch. These anomalies arise because the clock tick of 10 milliseconds is too long relative to the time taken to execute and fetch the data. In such cases, it is important to get lots of executions of the statements, so that you have statistically valid numbers.

由于在執行TKPORF之前沒有建立索引,一但建立索引,執行TKPROF會是另外一種結果。

避免讀時間(time)缺陷

Sometimes, as in the following example, you might wonder why a particular query has taken so long.

UPDATE cq_names SET ATTRIBUTES = lower(ATTRIBUTES)

WHERE ATTRIBUTES = :att

call count cpu elapsed disk query current rows

-------- ------- -------- --------- -------- -------- ------- ----------

Parse   1 0.06  0.24 0  0   0  0

Execute 1 0.62 19.62 22 526 12 7

Fetch   0 0.00  0.00 0  0   0  0

Misses in library cache during parse: 1

Parsing user id: 02 (USER2)

Rows Execution Plan

------- ---------------------------------------------------

0 UPDATE STATEMENT

2519 TABLE ACCESS (FULL) OF ’CQ_NAMES’

Again, the answer is interference from another transaction. In this case, another transaction held a shared lock on the table cq_names for several seconds before and after the update was issued. It takes a fair amount of experience to diagnose that interference effects are occurring. On the one hand, comparative data is essential when the interference is contributing only a short delay (or a small increase in block visits in the previous example). On the other hand, if the interference is contributing only a modest overhead, and the statement is essentially

efficient, then its statistics might not need to be analyzed.

還是要從事務角度考慮,當執行這個UPDATE之前幾秒,可能有另外一個事務在同一表上配置設定了共享鎖。處理這樣的沖突事件需要有權衡的能力。一方面,當這個沖突在瞬間發生時通路到一定量的資料(或者通路有小量的增長)。另一方面,如果這一事件不是很過份,并且這個SQL語句本質上是有效率的,那麼這個統計資訊可能被忽略。

避免觸發器(trigger)缺陷

The resources reported for a statement include those for all of the SQL issued while the statement was being processed. Therefore, they include any resources used within a trigger, along with the resources used by any other recursive SQL (such as that used in space allocation). With the SQL Trace facility enabled, TKPROF reports these resources twice. Avoid trying to tune the DML statement if the resource is actually being consumed at a lower level of recursion.

SQL語句的資源報告包括它們被執行時全部的SQL聲明。是以上述資源包括在觸發器中調用的任何資源,連同被其它嵌套占用的資源(例如調用空間配置設定的SQL)。在執行SQL_TRACE後,TKPROF會報告這些資源兩次被使用。最好不要調整這些DML語句,如果這些資源确實被以低度嵌套SQL所消耗。

If a DML statement appears to be consuming far more resources than you would expect, then check the tables involved in the statement for triggers and constraints that could be greatly increasing the resource usage.

如果一個DML語句消耗的資料比預期的大,則檢查與這些語句相關的觸發器和限制,他們也許會極大地增加資源的占用。