昨天晚上生産的兩套10.2.0.4的資料庫修改了參數,需要重新開機。在發出shutdown immediate指令後等了大概10分鐘的時間,資料庫還沒有down下來。檢查背景alert日志,發現從開始shutdown到最後隻輸出幾條日志,其中最後一條日志是:SHUTDOWN: Active processes prevent shutdown operation。
<a href="https://s5.51cto.com/wyfs02/M01/92/8E/wKioL1kAjvrhU73jAABw-tRSRfA520.png" target="_blank"></a>
圖為在虛拟機上還原場景時的截圖。
開一個新的會話連接配接顯示已連接配接,但無法查視圖,又提示未連接配接。再次執行shutdown immediate指令得到報錯如下:
<a href="https://s5.51cto.com/wyfs02/M01/92/8F/wKiom1kAj16gsXwWAABNplUV_ok888.png" target="_blank"></a>
立馬百度了一下這個問題,随便點開一個去看,說是因為在sqlplus / as sysdba 連接配接的情況下又執行host指令導緻的。如下圖這種情況
在實際的生産中卻是,把會話kill掉之後又等了幾分種,還是沒有返應,于是隻得強制shutdown abort了。好在業務都是提前停了,不過在執行這條指令是還是有些忐忑的,就怕資料庫起不來。
在看貼子的時候,作者提到了他在停庫前做了切換日志,歸檔和checkpoint的操作來以防萬一,我覺得這也是一個可取的地方,以後在停生産庫時不應該業務停了就立馬停資料庫,也應該做這些操作來保護資料。
俗話說禍不單行,真是沒錯,就在第一個資料庫成功重新開機後,在shutdown第二套資料庫也hang住了,但這次報錯與上次還不一樣!!!(我的小心髒啊。。)
alert輸出資訊如下:
<a href="https://s1.51cto.com/wyfs02/M01/92/8F/wKioL1kAk1vjnbecAAARQrfm-FM569.png" target="_blank"></a>
從輸出資訊可以了解到有進行還沒有執行完,但業務都已經停了,會是什麼進行呢。由于還有其他事情,就沒去管是什麼進行,就直接kill掉了,但資料庫還是停不下來啊。
最後還是隻得shutdown abort了,還是依然的忐忑。最後平安無事。
這個問題最後也沒有在虛拟機上模拟出來,始終不知道是什麼。
總結一下:
1、自己給自己挖坑的滋味真是不好受啊。
2、關閉資料庫前做一些切換日志檔案、歸檔和checkpoint的操作,來保證資料的安全性
3、關閉資料庫前一定做好檢查,是否有未解決的事務、JOB、或其他程序,避免出現類似第二種的情況
4、要有一顆強大的内心,相信自己:)
事後在MOS上查到了一些相關資訊:
MOS文檔
Troubleshooting Shutdown Immediate/Normal Hanging Issues (文檔 ID 1906014.1)
列舉四種shutdown Hang的情況。
第一種情況的原因和給出的解決方案
Shutdown Immediate Hangs / Active Processes Prevent Shutdown (文檔 ID 416658.1)
This is not a bug.
If the DB Control repository is running on the database target against which shutdown immediate was attempted then an incorrect order of events seems used.
You should stop DB Control first to get rid of all connections between DB Control and the repository database and then shutdown the database with 'shutdown immediate'.
Current database sessions may show:
SQL> select SID, USERNAME, PROGRAM from v$session;
SID USERNAME PROGRAM
----- ---------------------- ----------------------------------
243 SYSTEM SQL Developer
246 SYSMAN OMS
247 oracle@lgiora09 (q001)
248 oracle@lgiora09 (q000)
251 DBSNMP emagent@lgiora09 (TNS V1-V3)
252 SYSMAN OMS
253 SYSMAN OMS
254 DBSNMP emagent@lgiora09 (TNS V1-V3)
255 SYSTEM java.exe
256 SYSMAN OMS
Clearly OMS and OEM are connected (Oracle Enterprise Manager Grid Control or DBConsole) via users SYSMAN and DBSNMP.
These sessions should be de-activated (that is to log off any OEM, OMS, SYSMAN and DBSNMP) before the shutdown immediate is attempted.
Oracle Enterprise Manager, Grid Control, Dbconsole and agents keep doing internal processing.
This may include a few PLSQL notification procedures running on the database by database control like
BEGIN EMD_NOTIFICATION.QUEUE_READY(:1, :2, :3); END;
As per internal documentation of the shutdown immediate, if there are active calls then it would wait for all the active calls to finish.
To implement the solution:
1. Given OEM connections are active (SYSMAN and DBSNMP), de-activate these sessions, i.e. by stopping the agent/DBConsole
2. Then shutdown immediate as normal
- OR -
There may be processes still running and holding locks at the time a shutdown is issued.
Sometimes these are failed jobs or transactions, which are effectively 'zombies', which are not able to receive a signal from Oracle.
If this occurs, the only way to shutdown the database is by doing:
sql>
shutdown abort
startup restrict
shutdown normal
The startup does any necessary recovery and cleanup, so that a valid cold backup can be taken afterward.
If this issue occurs frequently, it would be a good practice to see if there are any active user processes running in v$session or v$process before shutting down the instance.
If the problem persists, and no apparent user processes are active, you can set this event prior to issuing the shutdown command in order to see what is happening. This will dump a systemstate every 5 minutes while shutdown is hanging
SQL>
connect / as sysdba
alter session set events '10400 trace name context forever, level 1';
Then issue the shutdown command.
本文轉自hbxztc 51CTO部落格,原文連結:http://blog.51cto.com/hbxztc/1919778,如需轉載請自行聯系原作者