Troubleshooting ORA-17500 ODM err

While running a system check on one of our environments today, I found an ODM error in the alert log.

The relevant part of the log is shown below. The error was reported shortly after 4 a.m. (04:27:37).

Clearing Resource Manager plan via parameter

Fri Aug 22 02:00:52 2014

ALTER SYSTEM ARCHIVE LOG

Thread 1 advanced to log sequence 6934 (LGWR switch)

  Current log# 3 seq# 6934 mem# 0: /u01/oracle/PETCUS1/oracnt01/redolog_A3/redo/redo03A.log

  Current log# 3 seq# 6934 mem# 1: /u01/oracle/PETCUS1/oracnt02/redolog_B3/redo/redo03B.log

Archived Log entry 6933 added for thread 1 sequence 6933 ID 0x4a0d6000 dest 1:

Fri Aug 22 04:27:37 2014

Control file backup creation failed.

Errors in file /u01/oracle/PETCUS1/oradmp/diag/rdbms/petcus1/PETCUS1/trace/PETCUS1_mmon_5584.trc:

ORA-17500: ODM err:ODM ERROR V-41-4-1-83-9 Bad file descriptor

Errors in file /u01/oracle/PETCUS1/oradmp/diag/rdbms/petcus1/PETCUS1/trace/PETCUS1_ora_10695.trc:

ORA-00245: control file backup operation failed

Fri Aug 22 05:03:01 2014

Thread 1 advanced to log sequence 6935 (LGWR switch)

  Current log# 2 seq# 6935 mem# 0: /u01/oracle/PETCUS1/oracnt01/redolog_A2/redo/redo02A.log

  Current log# 2 seq# 6935 mem# 1: /u01/oracle/PETCUS1/oracnt02/redolog_B2/redo/redo02B.log

Archived Log entry 6934 added for thread 1 sequence 6934 ID 0x4a0d6000 dest 1:

Fri Aug 22 08:05:01 2014

Thread 1 advanced to log sequence 6936 (LGWR switch)

  Current log# 4 seq# 6936 mem# 0: /u01/oracle/PETCUS1/oracnt01/redolog_A4/redo/redo04A.log

  Current log# 4 seq# 6936 mem# 1: /u01/oracle/PETCUS1/oracnt02/redolog_B4/redo/redo04B.log

Archived Log entry 6935 added for thread 1 sequence 6935 ID 0x4a0d6000 dest 1:

Fri Aug 22 11:02:05 2014

Thread 1 advanced to log sequence 6937 (LGWR switch)

  Current log# 1 seq# 6937 mem# 0: /u01/oracle/PETCUS1/oracnt01/redolog_A1/redo/redo01A.log

  Current log# 1 seq# 6937 mem# 1: /u01/oracle/PETCUS1/oracnt02/redolog_B1/redo/redo01B.log

Archived Log entry 6936 added for thread 1 sequence 6936 ID 0x4a0d6000 dest 1:

Fri Aug 22 14:06:47 2014

Thread 1 advanced to log sequence 6938 (LGWR switch)

  Current log# 3 seq# 6938 mem# 0: /u01/oracle/PETCUS1/oracnt01/redolog_A3/redo/redo03A.log

  Current log# 3 seq# 6938 mem# 1: /u01/oracle/PETCUS1/oracnt02/redolog_B3/redo/redo03B.log

Archived Log entry 6937 added for thread 1 sequence 6937 ID 0x4a0d6000 dest 1:

Fri Aug 22 17:02:40 2014

Thread 1 advanced to log sequence 6939 (LGWR switch)

  Current log# 2 seq# 6939 mem# 0: /u01/oracle/PETCUS1/oracnt01/redolog_A2/redo/redo02A.log

  Current log# 2 seq# 6939 mem# 1: /u01/oracle/PETCUS1/oracnt02/redolog_B2/redo/redo02B.log

Archived Log entry 6938 added for thread 1 sequence 6938 ID 0x4a0d6000 dest 1:

Fri Aug 22 19:13:42 2014

Thread 1 advanced to log sequence 6940 (LGWR switch)

  Current log# 4 seq# 6940 mem# 0: /u01/oracle/PETCUS1/oracnt01/redolog_A4/redo/redo04A.log

  Current log# 4 seq# 6940 mem# 1: /u01/oracle/PETCUS1/oracnt02/redolog_B4/redo/redo04B.log

Fri Aug 22 19:13:50 2014

Archived Log entry 6939 added for thread 1 sequence 6939 ID 0x4a0d6000 dest 1:

Fri Aug 22 19:18:16 2014

Errors in file /u01/oracle/PETCUS1/oradmp/diag/rdbms/petcus1/PETCUS1/trace/PETCUS1_ckpt_5573.trc:

Fri Aug 22 19:18:25 2014

Errors in file /u01/oracle/PETCUS1/oradmp/diag/rdbms/petcus1/PETCUS1/trace/PETCUS1_ora_12261.trc:
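When an alert log is long, it can help to pull out only the ORA- errors along with the timestamp they were logged under. The following is a minimal sketch, assuming the 11g-style text alert log shown above, where each group of messages follows a line such as `Fri Aug 22 04:27:37 2014`. The function name `find_ora_errors` is my own and not part of any Oracle tool.

```python
import re

# Timestamp line in an 11g-style text alert log, e.g. "Fri Aug 22 04:27:37 2014".
# Assumption: the log uses this English day/month format.
TS_RE = re.compile(
    r"^[A-Z][a-z]{2}\s+[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s+\d{4}$"
)
# Oracle error codes such as ORA-17500 or ORA-00245.
ORA_RE = re.compile(r"\bORA-\d{5}\b")


def find_ora_errors(lines):
    """Return (timestamp, error_code, line) for every ORA- code found.

    The timestamp is the most recent timestamp line seen before the error,
    kept as the raw string from the log (None if no timestamp came first).
    """
    current_ts = None
    hits = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank separator lines
        if TS_RE.match(line):
            current_ts = line
            continue
        for code in ORA_RE.findall(line):
            hits.append((current_ts, code, line))
    return hits
```

To use it on a real file, pass the open file object, for example `find_ora_errors(open(path, errors="replace"))`, where `path` points at the alert log of the instance.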

I discussed ODM in detail in an earlier post: http://blog.itpub.net/23718752/viewspace-1252507/

Enabling ODM gives a clearly noticeable improvement in system I/O.

If ODM is enabled, the alert log shows a message like the following when the database starts:

diagnostic_dest          = "/u01/oracle/PETCUS1/oradmp"

Oracle instance running with ODM: Veritas 6.0.100.000 ODM Library, Version 2.0 

Sat Aug 16 17:25:59 2014

PMON started with pid=2, OS id=5545 

PSP0 started with pid=3, OS id=5547 
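The startup banner above can be checked for mechanically. This is a small sketch, assuming the banner always begins with the text `Oracle instance running with ODM:` as in the excerpt. The helper name `odm_library` is hypothetical. A return value of `None` only means that the banner was not found in the lines scanned, which is not by itself proof that ODM is disabled.

```python
import re

# Startup banner written to the alert log when the instance loads an ODM library.
# Assumption: the wording matches the excerpt from this environment.
ODM_RE = re.compile(r"^Oracle instance running with ODM:\s*(.+)$")


def odm_library(lines):
    """Return the ODM library description from the last startup banner, or None."""
    found = None
    for raw in lines:
        match = ODM_RE.match(raw.strip())
        if match:
            found = match.group(1)  # keep the most recent startup
    return found
```

For the excerpt above, the function returns the string `Veritas 6.0.100.000 ODM Library, Version 2.0`.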

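The error had occurred quite a while before I noticed it. Reading the trace files directly did not suggest a cause, so I searched MetaLink for leads.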

One note there describes a very similar problem:

After Upgrade To 11.2.0.2 We Recieve Ora-00245 During Autobackup Of The Controlfile. (Doc ID 1308378.1)

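Our database had indeed been upgraded to 11.2.0.2, so that condition is met. The note also says that the failing control file backup is one taken by RMAN, which still needed to be confirmed here.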

ASH is unlikely to give any clue on that point, so let's look at the trace files referenced around the error in the alert log.

 less /u01/oracle/PETCUS1/oradmp/diag/rdbms/petcus1/PETCUS1/trace/PETCUS1_ora_10695.trc

Trace file /dbccbsPT1/oracle/PETCUS1/oradmp/diag/rdbms/petcus1/PETCUS1/trace/PETCUS1_ora_10695.trc

Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options

ORACLE_HOME = /opt/app/oracle/dbccbspt1/product/11.2.0

System name:    Linux

Node name:      ccbdbpt3

Release:        2.6.18-308.el5

Version:        #1 SMP Fri Jan 27 17:17:51 EST 2012

Machine:        x86_64

Instance name: PETCUS1

Redo thread mounted by this instance: 1

Oracle process number: 54

Unix process pid: 10695, image: oracle@ccbdbpt3 (TNS V1-V3)

*** 2014-08-22 02:00:53.024

*** SESSION ID:(2655.1985) 2014-08-22 02:00:53.024

*** CLIENT ID:() 2014-08-22 02:00:53.024

*** SERVICE NAME:(SYS$USERS) 2014-08-22 02:00:53.024

*** MODULE NAME:(rman@ccbdbpt3 (TNS V1-V3)) 2014-08-22 02:00:53.024

*** ACTION NAME:(0000001 FINISHED70) 2014-08-22 02:00:53.024

Initial buffer sizes: read 1024K, overflow 832K, change 805K
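The `*** NAME:(value) timestamp` header lines in the trace are what identify the client that owned the session. The sketch below extracts them into a dictionary and checks whether the module looks like RMAN. It assumes the header layout shown in this trace file. The function names `trace_header` and `is_rman_session` are made up for illustration.

```python
import re

# Header lines such as:
#   *** MODULE NAME:(rman@ccbdbpt3 (TNS V1-V3)) 2014-08-22 02:00:53.024
# Assumption: field names are upper-case words and the line ends with a timestamp.
HEADER_RE = re.compile(
    r"^\*\*\* ([A-Z ]+):\((.*)\) \d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+$"
)


def trace_header(lines):
    """Map header field names (e.g. 'MODULE NAME') to their first value."""
    fields = {}
    for raw in lines:
        match = HEADER_RE.match(raw.strip())
        if match:
            # Keep the first occurrence; later sections may repeat the header.
            fields.setdefault(match.group(1), match.group(2))
    return fields


def is_rman_session(fields):
    """True if the MODULE NAME field starts with 'rman'."""
    return fields.get("MODULE NAME", "").startswith("rman")
```

For the trace above, `trace_header` gives a `MODULE NAME` of `rman@ccbdbpt3 (TNS V1-V3)`, so `is_rman_session` returns `True`.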

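The MODULE NAME in the trace shows that the session belonged to RMAN, which matches the scenario in the note. With this information we can be confident that the problem is caused by the bug described there. It does not look like a high-priority issue: it can be handled with the workaround below, or else by applying the patch.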

A workaround is to make the controlfile backup via SQL*Plus instead of RMAN

SQL> alter database backup controlfile to '';

__OR__

Configure the Controlfile backup to a filesystem which is not using ODM.

Example :

RMAN> configure controlfile autobackup format for device type disk to '/tmp/%f';

Some related links:

ORA-17500: ODM err:ODM ERROR V-41-3-2-180-2 Filesystem not mounted with ODM/QIO support (Doc ID 1556149.1)

How to Enable/Disable the Veritas ODM for Oracle database (Doc ID 755159.1)