
Oracle Wait Events, Part 3: enq Locks in the Buffer Cache

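6. enq: TC - contention

Some manually triggered checkpoint operations need to acquire the TC lock (thread checkpoint lock or tablespace checkpoint lock). If contention occurs while the lock is being acquired, the session waits on the enq: TC - contention event. In practice, acquiring the TC lock is somewhat involved: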

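1) The server process first acquires the TC lock in X mode.

2) The server process converts the TC lock it holds to SSX mode. At the same time, the CKPT process acquires the lock in SS mode. Once CKPT holds the lock, it performs the checkpoint.

3) The server process tries to reacquire the TC lock in X mode and has to wait for CKPT to release it. The wait event at this point is enq: TC - contention.

4) When the checkpoint finishes, CKPT releases the TC lock and the server process acquires it. This is how the server process learns that the checkpoint has completed.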
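The four steps above can be pictured with a toy model. This is only an illustrative sketch in Python, not Oracle's implementation: the function name `simulate_tc_handshake`, the event strings, and the use of `threading.Event` to stand in for lock acquisition and release are all assumptions made for the illustration.

```python
import threading
import time


def simulate_tc_handshake(checkpoint_seconds=0.05):
    """Toy model of the TC lock handshake; returns (ordered events, seconds waited)."""
    events = []
    log_lock = threading.Lock()
    ckpt_may_start = threading.Event()   # server has converted its lock to SSX
    checkpoint_done = threading.Event()  # stands in for CKPT releasing the TC lock

    def log(message):
        with log_lock:
            events.append(message)

    def ckpt_process():
        ckpt_may_start.wait()
        log("CKPT: acquire TC in SS mode")
        time.sleep(checkpoint_seconds)   # pretend to write dirty buffers
        log("CKPT: checkpoint complete, release TC")
        checkpoint_done.set()

    ckpt = threading.Thread(target=ckpt_process)
    ckpt.start()

    log("server: acquire TC in X mode")       # step 1
    log("server: convert TC to SSX mode")     # step 2
    ckpt_may_start.set()

    started = time.monotonic()
    checkpoint_done.wait()                    # step 3: the modelled TC wait
    waited = time.monotonic() - started

    log("server: reacquire TC in X mode")     # step 4
    ckpt.join()
    return events, waited


if __name__ == "__main__":
    timeline, waited = simulate_tc_handshake()
    for line in timeline:
        print(line)
    print(f"server waited about {waited:.3f}s with no competing session")
```

Only one "server" and one "CKPT" take part, yet the server still waits. The wait here is a completion signal rather than a symptom of several sessions competing for the lock.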

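The enq: TC - contention wait can occur even when no processes are competing for the lock, which sets it apart from waits caused by contention on other locks. Some waits arise only from contention; others arise without any contention, simply because a session is waiting for a piece of work to finish.

Checkpoints occur in many situations, but not all of them produce waits on the TC lock. The wait occurs only while a checkpoint initiated by a server process is being synchronized.

Two typical cases in which enq: TC - contention waits occur are parallel query and tablespace hot backup.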

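1. Parallel query

Parallel query (PQ) triggers a checkpoint because its slave sessions perform direct path reads. A direct path read reads the data files directly without going through the buffer cache. Oracle uses direct path read (also called physical read direct) in the following three situations.

(1) When a sort cannot be completed in memory, direct path write and direct path read occur while data is stored in and read back from temporary segments. These waits can be observed through the direct path read temp and direct path write temp events.

(2) Slave sessions use direct path read to scan the data files directly. This wait is observed through the direct path read event.

(3) If Oracle determines that degraded I/O system performance prevents it from reading fast enough, it temporarily falls back to direct path read.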

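The target of a slave session's direct path read is the data file. Because reading directly from the data file bypasses the SGA, the version of a block currently in the SGA may differ from the version in the data file. To prevent this, Oracle must perform a checkpoint before it runs direct path read against the data files. Before starting the slave sessions, the coordinator session requests a segment-level checkpoint for the objects that will be read by direct path read, and it stays in the enq: TC - contention wait until the checkpoint completes. The enq: TC - contention wait therefore appears on the coordinator session, while direct path read waits appear on the slave sessions.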

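2. Tablespace hot backup

After ALTER TABLESPACE ... BEGIN BACKUP is executed, all dirty buffers in the buffer cache that belong to the tablespace are written to disk. The session waits on enq: TC - contention during this process.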

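7. enq: CI - contention and enq: RO - contention

The "Cross Instance call Enqueue" is an enqueue lock used when background process actions are invoked across one or more instances. These actions include checkpoints, log file switches, instance shutdown, loading data file headers, and so on. Note that this enqueue lock is not limited to RAC; it is used on a single node as well. The number of CI locks depends on the total number of processes executing cross-instance calls in parallel.

When the system issues a large number of these cross-instance background process calls, contention on the CI enqueue appears.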

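Suppose that in a RAC environment a large number of sessions simultaneously start TRUNCATE operations on different tables. A precondition of TRUNCATE is an object-level checkpoint on all instances, because the table's dirty buffers may be spread across several instances. When the checkpoint occurs, the CKPT process tells DBWR to write out the dirty blocks that belong to the specified table. DBWR has to scan the buffer cache to find those blocks, and if the buffer cache is large the scan takes a long time. Throughout this process the foreground process holds the local CI enqueue exclusively, which causes severe contention on the CI lock.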

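To reduce CI enqueue contention, the first step is to identify the actual type of cross-instance call. Note that before 10g, neither v$session_wait nor Statspack spelled out the enqueue lock type for enqueue wait events; it usually had to be worked out from the p1/p2/p3 columns. Take, for example, "WAIT #1: nam='enqueue' ela= 910796 p1=1128857606 p2=1 p3=4". Here p1 is 1128857606, which is 43490006 in hexadecimal, and the high-order '4349' converts to the ASCII characters 'CI'. p2 and p3 correspond to ID1 and ID2 in V$LOCK: ID1=1 means "Reuse (checkpoint and invalidate) block range", and ID2=4 means "Mounted excl, use to allocate mechanism".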
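The p1 arithmetic above can be checked with a few lines of Python. This is a minimal sketch: the helper name `decode_enqueue_p1` is made up for illustration. It assumes the usual layout of an enqueue wait's p1, with the two ASCII characters of the lock type in the upper 16 bits and the requested lock mode in the lower 16 bits.

```python
def decode_enqueue_p1(p1):
    """Split an enqueue wait's p1 value into (lock type, lock mode)."""
    # Upper 16 bits: two ASCII characters naming the enqueue type.
    first = chr((p1 >> 24) & 0xFF)
    second = chr((p1 >> 16) & 0xFF)
    # Lower 16 bits: the requested lock mode (assumed layout, see lead-in).
    mode = p1 & 0xFFFF
    return first + second, mode


if __name__ == "__main__":
    p1 = 1128857606  # value taken from the trace line quoted above
    lock_type, mode = decode_enqueue_p1(p1)
    print(f"{p1:08X} -> type={lock_type} mode={mode}")
    # prints: 43490006 -> type=CI mode=6
```

In Oracle's lock mode numbering, mode 6 is exclusive (X). That matches the description of the foreground process holding the CI enqueue exclusively.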

The exact meanings of ID1 and ID2 differ between Oracle versions.

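The enq: RO - fast object reuse wait event

I looked into this wait. When it is high, something is usually abnormal. It typically shows up in two situations:

1. When truncating a table or a partitioned table

2. When gathering statistics with degree > 1

This event means the session is waiting for DBWR to clean the cache.

Symptom when the problem occurs: the CKPT background process is the one holding the needed RO enqueue although it is actually doing nothing.

Bug: 7385253.

Tuning this requires weighing several factors together, for example reducing the cache size, adding DBWR processes, or lowering the MTTR target.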

Is it still locked? 

For some reason the truncate is still waiting for CKPT process. When you truncate or drop a table, CKPT does a range flush of the db_cache_size, which seems to be completed according to your alert_log. 

That was an issue in 9i-10g. This looks like bug 4201369 which is supposed to be fixed in 10.1.0.5. I will suggest you open a tar on this! They will make you do a hanganalysis which should clarify the issue.

This wait event is mostly related to bugs.

Bug 7385253 - Slow Truncate / DBWR uses high CPU / CKPT blocks on RO enqueue [ID 7385253.8]

Product (Component)

Oracle Server (Rdbms)

Range of versions believed to be affected

Versions >= 10 but BELOW 11.2

Versions confirmed as being affected

<a href="https://support.oracle.com/CSP/main/article?cmd=show&amp;type=NOT&amp;id=245840.1#AFFECTS_11.1.0.7">11.1.0.7</a>

<a href="https://support.oracle.com/CSP/main/article?cmd=show&amp;type=NOT&amp;id=245840.1#AFFECTS_10.2.0.4">10.2.0.4</a>

<a href="https://support.oracle.com/CSP/main/article?cmd=show&amp;type=NOT&amp;id=245840.1#AFFECTS_10.2.0.3">10.2.0.3</a>

<a href="https://support.oracle.com/CSP/main/article?cmd=show&amp;type=NOT&amp;id=245840.1#AFFECTS_10.2.0.2">10.2.0.2</a>

Platforms affected

Generic (all / most platforms affected)

This issue is fixed in

<a href="https://support.oracle.com/CSP/main/article?cmd=show&amp;type=NOT&amp;id=245840.1#FIXED_11.2.0.1">11.2.0.1 (Base Release)</a>

<a href="https://support.oracle.com/CSP/main/article?cmd=show&amp;type=NOT&amp;id=245840.1#FIXED_11.1.0.7.3">11.1.0.7.3 (Patch Set Update)</a>

<a href="https://support.oracle.com/CSP/main/article?cmd=show&amp;type=NOT&amp;id=245840.1#FIXED_10.2.0.5">10.2.0.5 (Server Patch Set)</a>

<a href="https://support.oracle.com/CSP/main/article?cmd=show&amp;type=NOT&amp;id=245840.1#FIXED_10.2.0.4.1">10.2.0.4.1 (Patch Set Update)</a>

<a href="https://support.oracle.com/CSP/main/article?cmd=show&amp;type=NOT&amp;id=560295.1">11.1.0.7 Patch 25 on Windows Platforms</a>

<a href="https://support.oracle.com/CSP/main/article?cmd=show&amp;type=NOT&amp;id=342443.1">10.2.0.4 Patch 14 on Windows Platforms</a>

<a href="https://support.oracle.com/CSP/main/article?cmd=show&amp;type=NOT&amp;id=8344348.8">10.2.0.4 RAC Recommended Patch Bundle #3</a>

<a href="https://support.oracle.com/CSP/main/article?cmd=show&amp;type=NOT&amp;id=7612639.8">10.2.0.4 Generic Recommended Patch Bundle #3</a>

Symptoms of this bug:

DBWR may use a lot of CPU and seem to spin in or around kcbo_write_q due to a large number of free buffers on the object reuse queue or checkpoint queue.

In some cases the CKPT holds the RO enqueue for very long, blocking other operations with wait event "enq: RO - fast object reuse".

Operations so far reported being affected are:

- Apply Processes in StandBy databases

- Gather stats

- Truncates

- drop/shrink/alter tablespace

Note: This fix was previously incorrectly listed as not affecting 11g.

     The bug itself is present in 11g but it is unlikely to show any significant symptom due to other 11g changes meaning that free buffers are no longer kept on the object queue.

Workarounds attempted for this bug:

setting _db_fast_obj_truncate=FALSE <-- did not fix the issue

enabling async I/O <-- customer refused to implement to avoid corruption risk

applying 7287289 <-- did not fix the issue

'enq: RO - fast object reuse' contention when gathering schema/table statistics in parallel [ID 762085.1]

Symptoms:

(1) Database has been recently upgraded from 10.2.0.1 to 10.2.0.4.

(2) There is 'enq: RO - fast object reuse' contention when gathering schema/table statistics in parallel using DBMS_STATS package (with DEGREE>1).

Workarounds:

1) Flushing the buffer cache.

OR

2) Setting "_db_fast_obj_truncate" = FALSE. This reverts back to the 9i way of invalidating buffers in the buffer cache.

Kindly note that both workarounds could have an impact on the database performance. Instead, it is recommended applying the corresponding patch.

-- Both workarounds can have a significant impact on database performance, so applying the appropriate patch is the recommended course.

Bug 8544896 - Waits for "enq: RO - fast object reuse" with high DBWR CPU [ID 8544896.8]

Versions >= 10.2.0.4 but BELOW 10.2.0.5

   Regression introduced in 10.2.0.4

<a href="https://support.oracle.com/CSP/main/article?cmd=show&amp;type=NOT&amp;id=245840.1#FIXED_10.2.0.4.3">10.2.0.4.3 (Patch Set Update)</a>

<a href="https://support.oracle.com/CSP/main/article?cmd=show&amp;type=NOT&amp;id=342443.1">10.2.0.4 Patch 27 on Windows Platforms</a>

This problem is introduced in 10.2.0.4.

Sessions can wait on "enq: RO - fast object reuse" while DBWR consumes lots of CPU when performing truncate type operations.

Workaround:

(1) Flush the buffer cache before truncating

 OR

(2) set _db_fast_obj_truncate = FALSE.

In my case, both of these wait events were related to TRUNCATE operations.