天天看點

Oracle corrupt block(壞塊) 詳解

一. 壞塊說明

1.1 相關連結

       在看壞塊之前,先看幾個相關的連結,在後面的說明中,會用到連結中的一些内容。

 ​​​

MOS 上的相關文檔:

       RMAN: Block-Level Media Recovery - Concept & Example [ID 144911.1]   

       FAQ:Physical Corruption [ID 403747.1]

       MasterNote for Handling Oracle Database Corruption Issues [ID 1088018.1]

       HandlingOracle Block Corruptions in Oracle7/8/8i/9i/10g/11g [ID 28814.1]

       ExtractingData from a Corrupt Table using ROWID Range Scans in Oracle8 and higher [ID61685.1]

官網的連結:

      ValidatingDatabase Files and Backups

       ​

      Performing Block Media Recovery

1.2  block corruption 分類

For purposes of the paper we will categorize corruption under three general areasand give best practices for prevention, detection and repair for each:

        Memory corruption

        Logical corruption(soft corruption)

       Media corruption(Physicalcorruption)

       Physicalor structural corruption can be defined as damage to internal data structureswhich do not allow Oracle software to find user data within the database.        Logical corruption involves Oracle beingable to find the data, but the data values are incorrect as far as the end useris concerned.

       Physica lcorruption due to hardware or software can occur in two general places -- inmemory (including various IO buffers and the Oracle buffer cache) or on disk.Operator error such as overwriting a file can also be defined as a physicalcorruption.   Logical corruption on theother hand is usually due to end-user error or non-robust(?) applicationdesign. A small physical corruption such as a single bit flip may be mistakenfor a logical error.

1.3 檢視blockcorruption

       可以通過v$database_block_corruption 檢視database 的corruption。 官網對該視圖的定義如下:

       V$DATABASE_BLOCK_CORRUPTION displaysinformation about database blocks that were corrupted after the last backup.

SYS@dave2(db2)>desc v$database_block_corruption

 Name                                     Null?    Type

 ------------------------------------------------- ----------------------------

 FILE#                                             NUMBER

 BLOCK#                                            NUMBER

 BLOCKS                                            NUMBER

 CORRUPTION_CHANGE#                                 NUMBER

 CORRUPTION_TYPE                                    VARCHAR2(9)

SYS@dave2(db2)> select * fromv$database_block_corruption;

no rows selected

       這裡沒有壞塊,為了示範這個效果,我用BBED 制造一個壞塊,然後在用bbed 恢複過來。

先确定block:

SYS@dave2(db2)> select * from dvd;

JOB

--------------------------------------------------------------------------------

DMM  is DBA!

dmme like Oracle!

SYS@dave2(db2)>select

 2  rowid,

 3  dbms_rowid.rowid_relative_fno(rowid)rel_fno,

  4 dbms_rowid.rowid_block_number(rowid)blockno,

 5 dbms_rowid.rowid_row_number(rowid) rowno

 6  from dvd;

ROWID                 REL_FNO    BLOCKNO     ROWNO

------------------ ---------- --------------------

AAAN9hAAGAAAAAcAAA          6         28          0

AAAN9hAAGAAAAAcAAB          6         28          1

用bbed 修改block 28:

[oracle@db2 ~]$ bbed parfile=/u01/bbed.par

Password:

BBED: Release 2.0.0.0.0 - LimitedProduction on Mon Aug 15 22:15:10 2011

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

*************!!! For Oracle Internal Use only !!! ***************

BBED> set dba 6,28

       DBA             0x0180001c(25165852 6,28)

BBED> map                 

 File: /u01/app/oracle/oradata/dave2/dave01.dbf(6)

 Block: 28                                   Dba:0x0180001c

------------------------------------------------------------

 KTBData Block (Table/Cluster)

 struct kcbh, 20 bytes                      @0      

 struct ktbbh, 96 bytes                     @20     

 struct kdbh, 14 bytes                      @124    

 struct kdbt[1], 4 bytes                    @138    

 sb2kdbr[2]                               @142    

 ub1freespace[8005]                       @146    

 ub1rowdata[37]                           @8151   

 ub4tailchk                               @8188   

BBED> d /voffset 0 count 128

 Block: 28     Offsets:    0 to  127 Dba:0x0180001c

-------------------------------------------------------

 06a20000 1c008001 f4a90780 00000104 l.¢......ô©......

 f5b40000 01000000 61df0000 f4a90780 lõ´......aß..ô©..

 00000000 03003200 19008001 03001000 l......2.........

 3b0a0000 00000000 00000000 00800000 l;...............

 dda90780 00000000 00000000 00000000 lÝ©..............

 00000000 00000000 00000000 00000000 l................

 0000000000000000 00000000 00000000 l ................

 00000000 00000000 00000000 00010200 l................

 <16 bytes per line>

BBED> modify /x 12345678 offset 0

 Block: 28               Offsets:    0to  127           Dba:0x0180001c

------------------------------------------------------------------------

 12345678 1c008001 f4a90780 00000104 f5b4000001000000 61df0000 f4a90780

 00000000 03003200 19008001 03001000 3b0a000000000000 00000000 00800000

 dda90780 00000000 00000000 00000000 0000000000000000 00000000 00000000

 00000000 00000000 00000000 00000000 0000000000000000 00000000 00010200

 <32 bytes per line>

BBED> sum apply

Check value for File 6, Block 28:

current = 0x5ab7, required = 0x5ab7

SYS@dave2(db2)>alter system flush buffer_cache;

System altered.

select * from dvd

*

ERROR at line 1:

ORA-01578: ORACLE data block corrupted(file # 6, block # 28)

ORA-01110: data file 6:'/u01/app/oracle/oradata/dave2/dave01.dbf'

這裡提示塊有錯誤,我們檢視下一下:v$database_block_corruption:

       這裡顯示為空,但是我們之前看該視圖的定義的時候,說是自上次備份以來的壞塊。是以這裡我們驗證一下:

RMAN> backup validate datafile 6;

Starting backupat 15-AUG-11

using target database control file insteadof recovery catalog

allocated channel: ORA_DISK_1

channel ORA_DISK_1: sid=141 devtype=DISK

channel ORA_DISK_1: starting full datafilebackupset

channel ORA_DISK_1: specifying datafile(s)in backupset

input datafile fno=00006 name=/u01/app/oracle/oradata/dave2/dave01.dbf

channel ORA_DISK_1: backup set complete,elapsed time: 00:00:02

Finished backup at 15-AUG-11

在次select 查詢:

SYS@dave2(db2)>  select * from v$database_block_corruption;

FILE#    BLOCK#     BLOCKSCORRUPTION_CHANGE# CORRUPTIO

---------- ---------- ---------------------------- ---------

        6         15          1                  0 CORRUPT

         6        28          1                  0 CORRUPT

這次就查詢到結果了。 視圖顯示block的狀态為corrupt。 對于該類型,共有一下幾種:

       (1)ALL ZERO:Block header on disk contained only zeros. The block may be valid ifit was never filled and if it is in an Oracle7 file. The buffer will bereformatted to the Oracle8 standard for an empty block.

       (2)FRACTURED: Block header looks reasonable, but the front and back of the blockare different versions.

       (3)CHECKSUM: optional check value shows that the block is not self-consistent.It is impossible to determine exactly why the check value fails, but itprobably fails because sectors in the middle of the block are from differentversions.

       (4)CORRUPT: Block is wrongly identified or is not a data block (for example,the data block address is missing)

       (5)LOGICAL: Specifies the range is for logically corrupt blocks.CORRUPTION_CHANGE# will have a nonzero value.

現在用BBED還原block:

BBED> revert

All changes made in this session will berolled back. Proceed? (Y/N) y

Reverted file'/u01/app/oracle/oradata/dave2/dave01.dbf', block 28

Warning: contents of previous BIFILE willbe lost. Proceed? (Y/N) y

current = 0xb4f5, required = 0xb4f5

在次查詢,block 正常:

SYS@dave2(db2)> alter system flushbuffer_cache;

但是注意我們的v$database_block_corruption 視圖:

        6         28          1                  0 CORRUPT

corruption block的資訊還存在裡面。 之前經過該視圖和備份有關,我們用rman validate 一下datafile,之後就ok了。

using channel ORA_DISK_1

input datafile fno=00006name=/u01/app/oracle/oradata/dave2/dave01.dbf

channel ORA_DISK_1: backup set complete,elapsed time: 00:00:01

SYS@dave2(db2)>select * from v$database_block_corruption;

--說明,我們修改的是block 28,它已經消失了。block 15 不是我們這本次測試修改的。 它是曆史遺留問題。這裡就不讨論了。

如果用rmanvalidate 之後還沒有消失,可能是oracle的bug,參考:

       V$Database_Block_CorruptionDoes not clear after Block Recover Command [ID 422889.1]

1.4 使用RMAN 驗證和recover corruption

       RMAN的預設validate 隻驗證實體壞塊,不驗證邏輯壞塊。

驗證所有datafile 和歸檔是否有實體壞塊:

RMAN>backup validate database archivelogall;

或者

RMAN>backup validate database;

驗證所有datafile 和歸檔的實體壞塊和邏輯壞塊:

RMAN>backup validate check logicaldatabase archivelog all;

或者:

RMAN>backup validate check logicaldatabase;

       注意: 如果加上了archivelog all,就必須要有歸檔檔案存在。

在Oracle 11g裡還可以單獨驗證某一個資料塊: 

       RMAN>VALIDATE DATAFILE 1 BLOCK 10;

       對于實體壞塊,我們可以通過recover database 或者recover datafile 來解決,但是對于邏輯壞塊這種方法就不行。 在後面單獨講到邏輯壞塊時有說明。對于邏輯壞塊可以嘗試對對象進行重建,如重建索引,重建表在導入資料。

       對于實體壞塊,如果不使用recover,那麼塊上的data 基本是丢失的。 我們可以采用相關的event或者通過rowid來跳過block,導出資料。 要保證塊上資料不丢失,就需要通過RMAN有效的備份來進行recover。

相關的recover 指令如下:

RMAN>blockrecover datafile 6 block 15;

RMAN> blockrecover corruption list;

該指令recover 的所有block 來自v$database_block_corruption視圖。

RMAN : Block-Level Media Recovery - Concept& Example [ID 144911.1]

二. Memory Corruption

2.1 Background

       Oracle allocates both shared and private memory. Shared memory is allocated when an Oracle instance starts and all processes(or threads) connecting to an Oracle database can access it. Oracle softwaredefines how this shared resource is accessed to prevent multiple processes fromsimultaneously writing to the same address. It also has to recover any incomplete changes made to memory by aprocess that dies abnormally.  The amountof shared memory allocated is static in size and is only freed when theinstance is shut down.  Private memory isallocated and freed as needed by each process (or thread) at the OS level.

       Corruptionis more likely to occur within shared memory than private memory so we focusattention to the structures and algorithms used within shared memory (alsoknown as the Oracle SGA).

       The SGA is divided into four portions - fixed, variable,Database Buffer cache, and Redo log buffer.  A diagram appears below.

2.2 Definition

       Memory corruption can be defined as inconsistencies in the data structures that arerelated to handling memory.  Thisinconsistency could appear in any of the different parts of memory discussedabove.  As discussed in the sectionabove, corruption in memory can be caused either in the SGA or the PGA.

       Onlycorruption in the database buffer cache portion of the SGA can potentially leadto data loss.

       Thisis termed as ‘Cache corruption’ and is discussed in detail below.  Corruption in theother parts of SGA do not result in loss of data, but can still cause theinstance to crash.  On the otherhand, a corruption in the PGA causes only the corresponding process tocrash.  If this process is updating ablock in the buffer cache when this happens, then the background process, PMONdoes the necessary recovery on the block being changed by this process.  SMON and otherprocesses will rollback any other uncommitted data.

2.3 Cache corruption

       TheOracle buffer cache is a mechanism where frequently accessed blocks are storedin memory for quicker access. The cache also maintains older versions of blocksfor consistent read purposes.  If thereis a corruption in this part of memory then there is a possibility of loss ofdata.

       Theforeground processes read Oracle blocks from the disk into the buffercache.  There arecertain checks done on the data block when it is read from the disk. 

       For example, one of the checks is to compare the Incarnation Number (INC) andSequence Number (SEQ) data structures from the header of the data block withthe INCSEQ structure in the footer to make sure that the block versions match.This is done to avoid reading a block from disk whose header is corrupt.

The structuresdiscussed above is specific to version 7.

2.4 Causes

       In general, memory corruption is usually identified by abackground or shadow  process whenever ittries to access the part of memory that is corrupted.  Cache corruption, which might cause a loss ofdata, is usually caught by:

       (1)the shadow process when trying to read or update a block in memory,

       (2)by the background process, DBWR, when trying to write a dirtied block frommemory to disk,

       (3)by the background process, PMON, while performing block recovery. 

Common causes include:

       (1)Operatingsystem bugs causing bad reads or bad writes

       (2)Hardware issues

       (3)Oracle bug

       (4)Non-Oracle program attaching and illegally writing to the sameshared memory address

2.5  Prevention

       There is very little that can be done from the user’s perspective to prevent memorycorruption. An INIT.ORA parameter, DB_BLOCK_CHECKSUM can be set to true.  

       This enables another check where the block checksum is calculated and compared withthe checksum stored in the block header. If they aredifferent then the block is considered corrupt on disk, and the block is notread into memory. This prevents corrupt blocks from entering the cache.When a block is changed and being written to disk from memory, DBWR calculatesa new checksum for the block by summing up its contents and writes it in theblock header. 

       There is a CPU overhead associated with this parameter since the checksum is calculated every time a block is read and written.

       A similar parameter, LOG_BLOCK_CHECKSUM, can be set totrue in the INIT.ORA for verifying the records written to the redo log buffer.  This extra test prevents bad redo from beingapplied to a block in cache during recovery. There is CPU overhead for reasons similar to the DB_BLOCK_CHECKSUM.

       The best form of corruption prevention from occurring in a production environmentis to test the hardware, operating system, database, application and so on forbugs before rolling them into production. This is also true when introducing new hardware, patches and upgrades.

2.6  Detection

       Memorycorruption can be detected from errors in the operating system logs indicatingany kind of memory problems.  These canalso be found from the Oracle ALERT.LOG by certain ORA-600 errors.  The first argument ofthe ORA-600 could be from 17000 through 17999 in case of memory corruption. 

       Somecache corruption can be detected by ORA-600s in the range from 2000 through8000.  One thing to remember is that notall ORA-600s imply memory corruption.

       對于memory corruption的錯誤提示,是ORA-600的第一個參數值從17000到17999或者2000到8000. 這裡隻是一種可能,并不完全肯定。​

2.7  Repair

Location Data loss Repair
SGA - Buffer cache Probable data loss ⇒ If the corrupt block has been written to disk, then the object to which the block belongs to has to be repaired by methods discussed under Media corruption. ⇒ SHUTDOWN and STARTUP the instance ⇒ To diagnose the causes for the corruption, call Oracle support with appropriate files listed below the table.
SGA - Redo buffer No data loss ⇒ SHUTDOWN and STARTUP the instance. ⇒ To diagnose the causes for the corruption, call Oracle support with appropriate files listed below the table
SGA - Shared SQL Area
PGA ⇒ To diagnose the causes for the corruption, call Oracle support with appropriate files listed below the table

       To diagnose the cause of the corruption, call Oracle support with the followinginformation:

       (1)INIT.ORA file

       (2)ALERT.LOG

       (3)Trace files for any ORA-600s found in the directory specified by theINIT.ORA parameter, USER_DUMP_DEST

       (4)Heap dump : this can be obtained by executing the following commandin SVRMGR or SQLDBA:

       SQL>ALTER SESSION SET EVENTS ‘IMMEDIATE TRACE NAME HEAPDUMP, LEVEL 10’;

       This creates a trace file in the USER_DUMP_DEST.

       (5)Reproducibletest case

       (6)Thorough history of events that led to the corruption

       (7)Record any noticeable changes to the environment such as newINIT.ORA parameters, new code, patches, and upgrades.

三. Logical Corruption (soft corruption)

       LogicalCorruption can be defined as a situation where the actual data is not corruptedin a data block but a query results in a wrong set of data due to a problem inthe way data was loaded into the database or due to a misexecution of theoptimizer path.  A logical corruption is not an inconsistencyin a data block but an inconsistency in the result of a query.

       邏輯壞塊通常不是data block的不一緻,而是查詢結果的不一緻。

       Forexample, a query that is expected to fetch 5 rows might result in 10 rowsbecause the data in the table was duplicated due to the lack of a primary orunique key.

       比如我們查詢5行記錄,實際卻傳回10行。

3.1 Causes

       Badapplication design that lacks validation or proper integrity checks OptimizerBugs (in rare cases)

       通常是應用設計不完善或者是優化器的bug造成。

3.2 Prevention

       The only kind of prevention methodology for avoiding user errors is to test thatapplications return valid results thoroughly before implementing them inproduction environments.  

       There cannot be a way to prevent logical corruption caused due to optimizer problemsunless the appropriate patches are applied or the database is upgraded to themost recent version.

3.3 Detection

       When logical corruption is caused by user errors, they are more difficult to detectthan those caused by optimizer problems. In case of user errors, the user should have a good knowledge of theapplication that is being run to identify inconsistencies in the queryresults.  In the case of optimizerproblems, inconsistencies may be accompanied by changes in the query executionplan leading to a different response time. 

       Also one may notice, invalid results only when using the cost based optimizer rather than the rule based optimizer. Other changes inthe query path can be caused by changing the optimizer mode, specifying a different optimizer hint, dropping or creatingindexes, or analyzing objects to generate new statistics.

3.4 Repair

       Whenthese problems are caused by user errors, they can be fixed by making theapplication more robust. 

       解決這個問題需要完善系統的設計。

       Fo rexample, if invalid data is successfully inserted into a table because of alack of integrity checking , then constraints should be created and the invaliddata will have to be found and deleted. In case of optimizer problems, contactWorldwide Customer Support Services to determine if this is a bug with anavailable patch. 

       They will need to be provided with the query, EXPLAIN PLAN and potentially theexport dump of the tables involved.

四. Media Corruption(physical corruption )

       Media corruption can be defined as a situation where an inconsistency has occurred inthe data block structures in the physical disk as a result of mediafailures.  Media failures are failuresthat are caused due to hardware problems, operating system problems, controllerproblems, logical volume problems, and so on. As a result of a media corruption, the data in the corrupted block islost.

       實體壞塊通常是block上的不一緻,造成實體壞塊的原因可能是硬體故障,作業系統問題,控制器問題,邏輯卷問題等。 對于實體壞塊,其block上的資料會丢失。

       MediaCorruption could occur in different parts of the database and the detection,prevention and repair are different depending on the object that iscorrupted.  The following are differentobjects that could be corrupted:

       實體壞塊可以存在資料庫的不同部分,對應不同部分的處理方法也不一樣,具體可以以下下幾種:

       ∗ Control file

       ∗ Redo log file

       ∗ Data file The block could belong to one of the following categoriesin case of data  file corruption:

       (1)File header block

       (2)Data dictionary object (SYSTEM tablespace)

       (3)Undo header and Undo blocks (ROLLBACK tablespace)

       (4)Sort blocks (TEMP tablespace)

       (5)Data/Index blocks (DATA/INDEX tablespace)

               =>Tables

               =>Clusters

               =>Indexes

4.1 Control filecorruption

       TheControl file is the file that has the structural information of thedatabase.  The control file hasinformation such as the database name, names and locations of the data filesand the redo log files belonging to the database, the creation timestamp of thedatabase, log sequence information, checkpoint information and so on.

       控制檔案包含了資料庫的一些結構資訊。

The control file can be dumped in asciiformat by executing the command:

       ALTERSESSION SET EVENTS ‘IMMEDIATE TRACE NAME CONTROLF LEVEL 10’;

       This command creates a trace file in the location defined by the INIT.ORA parameter,USER_DUMP_DEST. 

可以dump 控制檔案。 

       ​​​

       Whenany of the mirrors (Oracle’s multiplexing) of the control file is corrupted,critical information about the database cannot be accessed which will result ina database crash.  The following tablesexplain the potential causes, detection, prevention and repair of a corruptionin the current control file.

       雖然控制檔案有備援,但是任何一個控制檔案出現損壞都會導緻db crash。

4.1.1 Causes and Prevention

4.1.2 Detectionand Repair

   控制檔案通常有三個,任何一個出現corrupt都會使db crash。 這種情況下,可以修改初始話參數,使用其他的幾個控制來啟動db,如果能啟動,就ok。不能啟動,就需要重建控制檔案來解決。 關于控制檔案的重建,我上面貼的連結裡有。

4.2  Redo log file corruption

       Redolog files are critical in protecting a database from failures.  The changes made to data in a database arerecorded in the redo log files. 

       redo 是在db failures來進行恢複的,所有的事務操作都會記錄到redolog裡。

       When there is a failure which prevents data from being written to the data files ondisk from memory, the changes can be obtained from the redo log files.  When a data file is corrupted, a backup froman earlier day can be restored and the changes from that day onwards can beapplied to the data file from the redo log files. 

       The redo log files are used only when a database is recovered from a failure.  There are two kinds of redo log files: Onlineredo log files and Archived redo log files. Archived redo log files are the spooled copies of the online redo logfiles. 

       Corruption could occur in an online redo log file or an archived redo log file.  If a corruption occurs on an online redo logfile, the instance crashes if there are no mirrors of the redo log file or ifthe mirrors are corrupted as well.  Ifthe corruption occurs in an archived redo log file, the database is notaffected unless a backup is restored and recovery is being done for which it isneeded.

       corruption 可能發生在online log 或archived log。 如果發生在online log,并且log 沒有鏡像或者鏡像log也出現了壞塊,這種情況下,db 會crash。 如果出現在歸檔檔案上,僅當使用備份進行恢複時才會有影響。

4.2.1  Causes and  Prevention

  4.2.2 Detectionand Repair

Forcing the database tostartup using _ALLOW_RESETLOGS_CORRUPTION:

       This parameter is undocumented and unsupported.

       The_allow_resetlogs_corruptionshould only be done as a last resort. Usually when a database is opened witheither the RESETLOGS or NORESETLOGS option, the status and checkpointstructures in all the file headers of all data files are checked to make surethat they are consistent.  Once this ischecked, the redo logs are zeroed out in case of RESETLOGS.  When the _ALLOW_RESETLOGS_CORRUPTIONparameter is set, the file header checks are bypassed.  This means that we do not make sure that thefiles are consistent and open the database. It will potentially cause some lost data and lost data integrity.  The database should be rebuilt since data andthe data dictionary could be corrupt in ways that are not immediatelydetectable.  This could lead to futureoutages or put the database in a state where it cannot be salvaged at all.  There is no guarantee that this will work.

Advantages

       nsome circumstances, database can be brought up even when no valid backups areavailable

Disadvantages

(1)The database will not be in aconsistent state once the undocumented parameter is used and so it has to berebuilt by doing a full database export, recreate the database and a fulldatabase import.

(2)Not guaranteed to work

 ​

4.3 Data file corruption

       The data files are the physical storage unit of data stored in a database.  Each data file ismade up of data blocks which can be divided into 5 different types (in thecontext of media corruption):

       (1)File header blocks : File header block is the first Oracle block inevery data file in an Oracle database. This block keeps track of a lot of information about the data file thatit belongs to.

       (2)Data Dictionary blocks : SYSTEM tablespace consists of datadictionary objects.  Data dictionaryobjects are objects that keep track of information stored in the database suchas the information about the tablespaces, information about the data files,information of amount of free space in each tablespace.

       (3)Undo header and Undo blocks: ROLLBACK tablespace consists ofrollback segments that are made up of undo header blocks and undo blocks.  These blocks are used to undo a transactionwhen it fails or when the user executes a ROLLBACK command.

       (4)Sort blocks: TEMP tablespace consists of temp segments that are madeup of sort blocks. Sorts are usually done in memory where they are allocated asize specified by the INIT.ORA parameter, SORT_AREA_SIZE amount of space.  If the sort is so huge that it cannot fit inthe allocated space in memory, then temp segments are created in the user’stemporary tablespace for doing the sort on disk.

       (5)Data blocks: DATA tablespace consists of tables, clusters andindexes.

       The following tables explain the potential causes of data file corruption and alsothe different ways to prevent data file corruption.

4.3.1 Causes and Prevention

       Now that we have discussed the causes and the prevention, let us discuss thedetection and repair which is different for corruption in different blocks.

4.3.2 File header block

       Fileheader block is the first Oracle block in a data file.  This block keeps track of information aboutthe data file itself (i.e., file metadata) including different checkpointstructures (explained below), status of the file (e.g., hot backup in progress,hot backup ended, media recovery required, instance recovery required), andresetlogs information (information on the time when the database has beenbrought with resetlogs option most recently).

       File header block 是data file的第一個block。 它儲存了datafile的一些資訊。

       Checkpointingis the process of writing the blocks that have been changed in memory to disk.The control file and the data file headers are updated after every checkpointis done in the database.  This is anautomatic action executed by the background process called the Database Writer(DBWR).  This can also be forced bycertain user commands such as a normal SHUTDOWN, a normal OFFLINE of atablespace and so on. 

       checkpoint程序會将已經變更的block從記憶體寫如磁盤。 每次執行checkpoint時,dbwr 程序會更新controlfile 和 datafileheaders。 當normal shutdown 或者normal offline tablespace時,也會更新control file 和datafile headers.

       Thedata file header block keeps track of the most recent checkpoint informationwhich denotes that all blocks in this data file that has been changed beforethis checkpoint has been written to disk from memory and so a failure in thememory will not affect the data in this data file before the checkpoint.

       datafile header 儲存了最近一次checkpoint的資訊,它表示該checkpint 之前已經改變的block已經從記憶體寫入磁盤了。 當memory 發生錯誤時,不會影響該checkpoint 之前的data。

       Whena file header block is corrupted, the information stored in this block cannotbe accessed which means that the objects in the data file cannot be accessedeither.  The following tables give anidea of the potential causes for a corruption in a file header block and theprevention, detection and repair of the same.

       當 fileheader block 發生corrupted時,存儲在該block裡的資訊無法讀取,那麼的data file 也就無能讀取。

4.3.3 Data dictionary object (SYSTEM tablespace)

       Datadictionary objects are objects that reside in the SYSTEM tablespace which hascritical information about the objects in the database and their relationshipsand attributes.

       Data dictionary 對象存放在system 表空間,它儲存了資料庫中對象的重要資訊,如對象之間的關系和屬性等。

Data dictionary objects found in the systemtablespace are described below:

SYSTEM 表空間下的data dictionary objects 有如下幾種:

       1)Objects created by the script SQL.BSQ run by the CREATEDATABASE command found under $ORACLE_HOME/dbs directory.  There are certain tables, indexes andclusters that cannot be dropped and recreated in the SYSTEM tablespace.  These are the tables that are used whenbringing up the database.  They arecalled the bootstrap objects and are found in the SQL.BSQ.  It is not an easy task to locate datadictionary objects that can be dropped and recreated since the relationsbetween the different objects could result in an inconsistent database if wedrop the wrong object.

       datadictionary 是在建立資料庫時通過$ORACLE_HOME/dbs/sql.bsq腳本建立的,并且這些核心的表,索引和clusters 不能被drop 和recreate。 他們在db 啟動時使用。

       2) Views createdon the fixed data structures (V$ views).

       3)SYSTEM rollback segment created after the databasecreation. If the corruption is in this segment, then the most recentbackup should be restored and a point in time recovery should be done on thedatabase up to the time when the corruption could have occurred.

       system rollback segment 在資料庫建立之後建立。

          ​

       4)Compatibility segment (this is the only segment of type ‘CACHE’ in the SYSTEMtablespace). The Compatibility segment is a segment that keeps track of thefeatures being used in the database which will be used when the database isbeing downgraded to an earlier version. This segment is used to make sure thatthe features being used in the current version are disabled before beingdowngraded to the earlier version.  Ifthe compatibility segment has a corruption then, the database can be brought upby shutting down and starting it up.  Ifthe problem is still not fixed call Oracle support with appropriate trace filesand the alert.log.

       The supported way of fixing data dictionary corruption is to restore from a backupand roll forward using the archived redo logs.

4.3.4 Undo header and Undo blocks (ROLLBACK tablespace)

       Rollback segments are undo segments that have information about the transaction that hasbeen executed so that it can be rolled back in case of failure of thetransaction or when the user asks for the transaction to be rolled backexplicitly. 

       They are made up of undo header blocks and undo blocks which are required to accessundo information to provide for consistent reads and transactionconsistency.  If the rollback segment iscorrupted, the transaction consistency of the data blocks (including datadictionary objects) can be jeopardized.

The corruption found when undoing atransaction could fall under three categories :

在以下三種情況下,可以會出現corruption:

       (1)belongs to the object (table/index/cluster) that has the data onwhich the transaction was executed (data block)

       (2)belongs to the undo block that is being used to undo the transaction(undo header)

       (3)belongs to the undo segment header block of the segment where theundo block is found (undo block)

對于第一種情況,出現corruption 的位置是data block。

       Thefirst case where the corruption is in the object to which the activetransaction belongs to, we have to identify the object first.  This can be done by setting an event in theINIT.ORA as follows:

       event= “10015 trace name context forever, level 10”

       這種情況的壞塊是active transaction,當我們重新開機DB後,會進行相關的transaction recover(Rolling Back)。 當recover 完成,就可以正常通路對應的block了。

       Thisevent traces the undo segment recovery when the database is started, the eventputs out a trace in the directory specified by the INIT.ORA parameter,USER_DUMP_DEST.  This trace file containsa transaction table for each of the rollback segments that are onlined in thedatabase.  The trace file has a messagethat says ‘error recovering tx(#, #) object #’. Tx(#, #) refers to thetransaction information and the object # is the object id of the object thathas a corruption. 

       設定event時候,會dump DB 啟動是的undo segment recovery,在trace裡會有‘error recovering tx(#, #) object#’資訊,其中Tx(#, #)指的是事務資訊,object #是corruption 的對象id。找到對象ID 後可以通過如下SQL查詢:

The following query gives the name of theobject that is corrupted:

       SQL>SELECTOWNER, OBJECT_NAME, OBJECT_TYPE, STATUS  FROM SYS.DBA_OBJECTS WHERE OBJECT_ID = <object #_from_tracefile>;

4.3.5 Sort blocks (TEMP tablespace)

       Sortsare usually done in the part of memory allocated from the SGA.  This is defined by the INIT.ORA parameterSORT_AREA_SIZE.  If the sort space neededfor a sort is so big that it cannot fit in the sort area defined in memory,then it is done on disk by creating segments called the Temporary segments.

       當SORT_AREA_SIZE 指定的sort 空間不夠時,會在磁盤上建立一個temporary segments來作為排序使用。 建議每個使用者指定自己的temporary tablespace。

       Itis advisable to create a separate tablespace called the TEMP tablespace.  After creating this tablespace, alter theusers to use this as their temporary tablespace by executing the followingcommand:

       ALTERUSER user_name TEMPORARY TABLESPACE TEMP;

       This way, the temporary segments created by any user will be in the TEMP tablespaceand it provides easy manageability.

Detection

· Usually thissegment is never corrupted since they are reformatted every time they get used

· Tempsegments are reused frequently

· If problempersists, either move or drop and recreate the temp tablespace

4.3.6 Data/Index blocks (DATA/INDEX tablespace)

       Whena data block is corrupted, when it belongs to a table segment, cluster segmentor an index segment, the detection mechanisms are the same:

       datablock的corrupt可以出現在table segment,cluster segment和index segment。

可以通過以下方法來檢測corruption:

       (1)DBVERIFY can be used to detect the corrupted blocks in thedata file

       (2)ANALYZE command run on the objects give errors (ANALYZE<table/cluster/index> <table/cluster/index_name> VALIDATESTRUCTURE;) When an ANALYZE (with CASCADE option) is run on a table or cluster,it cross verifies the index and data/cluster blocks along with the integritychecks done for the block.

       (3)DB_BLOCK_CHECKSUM can be set to TRUE in the INIT.ORA file.When a block is changed and being written to disk from memory, DBWR calculatesa checksum for the block by summing up its contents and writes it in the blockalong with it on disk.  The next time when the block is being read by theforeground process, it calculates the checksum again for the block that isbeing read and compared with the checksum already written in the block ondisk.  If both are different then theblock has been corrupted on disk and so the block is not read into memory sothat it prevents cache corruption.  There is an overhead associated with this parameter since thechecksum is calculated each time it is read and written.

       (4)Events 10210, 10211, 10212 can be set in the INIT.ORA file todetect software corrupt blocks.  When there is a corruption in a block, it is not detecteduntil the block is being updated. So any SELECTs on a corrupted block is executed until it is marked assoftware corrupt.  When the events areset in the INIT.ORA, the blocks are checked for integrity by comparing certaindata structures and once there is an inconsistency found, the seq or the sequence of the block is set to 0 in theblock header representing that the block is software corrupt.  

The events can be set as follows:

       event= “10210 trace name context forever, level 10”  (for data blocks)

       event= “10211 trace name context forever, level 10”  (for index blocks)

       event= “10212 trace name context forever, level 10”  (for cluster blocks)

       (5)Users receiving ORA-1578 when trying to access an object.  The query from DBA_EXTENTS given in pg# 30shows that the error is on a table, cluster or index in the data tablespace,SQL語句如下:

       SQL>selectsegment_type, segment_name from sys.dba_extents where file_id =<file_id_from_ora-1578> and <block_id_from_ora-1578> betweenblock_id and (block_id+blocks-1);

       (6)ALERT.LOG shows ORA-1578 or ORA-600s with the first argument in therange of 2000 to 8000

4.3.6.1 Tables

       Whena data block is corrupted in a table, it should be understood that the data inthe corrupted block is lost.  The onlyway to not lose any data from the table is to restore from a valid backup andrecover until a point in time before the corruption occurred.

       datablock 壞塊通常意味着資料的丢失,如果要保證沒有資料丢失,需要通過有效的備份進行恢複。這塊參考1.4節。

方法一:Event method

       Event10231 can be set to skip corrupted blocks on full table scans in the INIT.ORAfile.  The object can be exported aftersetting this event.  This is notguaranteed to work for every kind of corruption. 

       This works only when the block is soft corrupted(邏輯壞塊), sequence is set to 0. The event can be set as follows:

       event= “10231 trace name context forever, level 10”

       對于table上的邏輯壞塊,并且sequence 設定為0,可以設定10231 event,當全表掃描時,可以跳過corruption,進而讀取資料。

方法二:ROWID method

       Extractthe data that does not belong to the corrupted block using ROWIDs. Every row inevery table in an Oracle database has a ROWID column which is usually not displayedunless SELECTed explicitly. 

       通過rowid,我們可以抽取出非corrupted block上的資料。​

(1)通過函數建構ROWID

函數參數如下:    

function ROWID_CREATE(rowid_type    IN number,

                           object_number INnumber,

                           relative_fno  IN number,

                           block_number  IN number,

                           row_number    IN number)

              return ROWID;

    -- rowid_type      - type(restricted=0/extended=1)

    -- object_number   - data objectnumber

    -- relative_fno    - relative filenumber

    -- block_number    - block numberin this file

    -- row_number      - row number inthis block

這些參數可以通過如下方式擷取:

ROWID_TYPE:

       Thisis 1 because we are using the extended rowid format.

RELATIVE_FNO:

       Thisshould have been available when you came to this article. It can also be foundfrom the DBA_EXTENTS view given the absolute file number and block number ofthe corrupt block:

       SQL>SELECTtablespace_name, relative_fno, segment_type, owner, segment_name, partition_nameFROM dba_extents WHERE file_id = <AFN>    AND  <BL> between block_id and block_id + blocks-1;

OBJECT_NUMBER:

       For a non-partitioned table, select the DATA_OBJECT_ID from DBA_OBJECTS for theproblem table:

SQL>SELECT data_object_id FROM dba_objectsWHERE object_name = '<TABLE-NAME>' AND owner = '<TABLE-OWNER>' ;

Note that a partitionedtable has an object number for each partition.

--注意,對于分區表,每個分區有一個對象id:

Select the DATA_OBJECT_ID from DBA_OBJECTSthus:

       SQL>SELECTdata_object_id FROM dba_objects WHERE object_name = '<TABLE-NAME>' ANDowner = '<TABLE-OWNER>' AND subobject_name = '<PARTITION-NAME>' ;

       相對與普通表,普通表多了一個subojbect_name 條件。

BLOCK_NUMBER andROW_NUMBER:

       Theblock number of the corrupt block should be available before you came to thisarticle. (Eg: It is reported in an ORA-1578 error, or as a Page Number byDBVerify).

       Fora ROWID range scan we generally want to select all rows BEFORE the corruptblock, then all rows AFTER the corrupt block. The first row in a block is rowzero (0) and so we want all rowids LESS THAN "Block <BL> row 0"and then GREATER THAN OR EQUAL TO "Block <BL>+1 row 0".

(2)建立ROWID

  You can now create the rowid strings to use in a predicate thus:

現在可以建立rowid:

The "LOW_RID"is the lowest rowid INSIDE the corrupt block:

SQL>SELECT dbms_rowid.rowid_create(1,<DATA_OBJECT_ID>,<RFN>,<BL>,0)LOW_RID  from DUAL;

The "HI_RID" isthe first rowid AFTER the corrupt block:

SQL>SELECT  dbms_rowid.rowid_create(1,<DATA_OBJECT_ID>,<RFN>,<BL>+1,0)HI_RID       from DUAL;

(3)重建資料

       Itis now possible to use CREATE TABLE AS SELECT or INSERT ... SELECT to get datawithout accessing the corrupt block using a query of the form:

       根據剛才查詢的rowid,跳過corrupt block來進行createtable 或者insert 操作:

SQL>CREATE TABLE salvage_table AS SELECT/*+ ROWID(A) */ * FROM <owner.tablename> A WHERE rowid <'<low_rid>'   ;

SQL>INSERT INTO salvage_table SELECT /*+ROWID(A) */ * FROM <owner.tablename> A WHERE rowid >='<hi_rid>';

注意hint裡的A是表的别名。

       Fora table partition then only the problem partition need be selected from byusing the PARTITION(xxx) option in the FROM clause:

       對于分區表,僅需要對問題分區進行處理:

SQL>CREATE TABLE salvage_table AS SELECT/*+ ROWID(A) */ * FROM <owner.tablename> PARTITION(<partition_name>) A WHERE rowid < '<lo_rid>';

SQL>INSERT INTO salvage_table SELECT /*+ROWID(A) */ * FROM <owner.tablename> PARTITION (<partition_name>) AWHERE rowid >= '<hi_rid>';

       注意: 采用上面這種rowid的方法,不能處理含有LONG字段的表,對于LONG字段的表,隻能使用帶有where 條件的export/import.

       If the corrupt block is the table segment header, this method won't work. You stillhave the option of using any indexes on the corrupt table to extract the data.

       如果是塊頭出現corrupt,那麼這個方法不使用。 當塊頭出現問題,整個塊的data 都不可讀取。 如果是非塊頭,我們可以使用這種方法挽回更多的資料。

  Use the following query to determine if the affected block is thesegment header :

       可以使用如下SQL判斷是否是塊頭block:

SQL>selectfile_id,block_id,blocks,extent_id from dba_extents where owner='<owner>'and segment_name='<table_name>' and segment_type='TABLE'   order by extent_id;

FILE_ID BLOCK_ID    BLOCKS EXTENT_ID

--------- --------- --------- ---------

8    94854     20780         0 <- EXTENT_IDZERO is segment header

方法三: Index method

       If there are any indexes on the corrupt table then it is possible to get someinformation about what data was in the corrupt block from the index. Thisrequires selecting indexed columns from the table for rowids in the corruptblock. We already know the ROWID range covered by the corrupt block from theSELECT dbms_rowid.rowid_create ... statements above.

       To extract the column data use one of the following forms of select statement:

If the columns requiredat NOT NULLable you can use a fast full scan:

如果列是非空的,可是使用fast full scan:

       SQL>SELECT/*+ INDEX_FFS(X <index_name>) */ <index_column1>,<index_column2> ...  FROM<tablename> X WHERE rowid >= '<low_rid>'          AND rowid <  '<hi_rid>' ;

If the columns required are NULLable thenyou cannot use an index fast full scan and must use a range scan. This requiresyou to know a minimum possible value for the leading index column to ensure youenable the index scan:

如果列是可以null的,那麼必須使用range scan:

SQL>SELECT /*+ INDEX(X<index_name>) */ <index_column1>, <index_column2> ...  FROM <tablename> X  WHERE rowid >= '<low_rid>'        ANDrowid <  '<hi_rid>'  AND <index_column1> >=<min_col1_value>;

       Using this technique for all indexes on the table may be able to retrieve some of thedata. See <View:DBA_IND_COLUMNS> for which columns make up each index.

關于第二和第三中方法,MOS上有示例:

4.3.6.2 Clusters

       Clusters can be defined as a way to store more than one tables physically together sincethe tables have some common columns.  Bystoring the related rows from multiple tables together, the access time can bereduced. 

       For example, if a cluster is formed with the EMP and DEPT tables, the cluster blocklooks like:

       A query that runs across EMP and DEPT will have to access just one block to getboth tables’ rows.  The IO is less andthe access time is less too.

       The detection and repair for a cluster block corruption is similar to that of atable block corruption but the only difference is that when there is acorruption in a cluster block, all the objects that make up the cluster has tobe fixed.

4.3.6.3 indexes

·(1) DBVERIFY gives the corrupted blockinformation

·(2)ALERT.LOG shows corruption errors

·(3)Users running queries against theindex get ORA-1578 on the index

·(4)ORA-600 with first argument in the rangebetween 2000 and 8000

·(1) Drop and recreate the indexsegment

·(2)Restore from a valid backup and recover

       索引上的corrupt block 是比較好處理的,因為可以對索引進行重建,是以不會造成data lose。

五. dbms_repair 包與壞塊

       Oracle從8i 開始提供了DBMS_REPAIR包,該包可以用來發現、辨別并修改資料檔案中的壞塊。dbms_repair包沒有進行授權,隻有sys使用者可以執行。

       任何工具都不是萬能的,使用這個包的同時會帶來資料丢失、表和索引傳回資料不一緻,完整性限制破壞等其他問題。是以當出現錯誤時,應當首先從實體備份或邏輯備份恢複,使用dbms_repair隻是在沒有備份的情況下使用的一種手段,這種方式一般都會造成資料的丢失。

       dbms_repair包的工作原理比較簡單,是将檢查到的壞塊标注出來,使随後的dml操作跳過該塊,同時,dbms_repair包還提供了用于儲存索引中包含的标注為壞塊中的鍵值,以及修複freelist和segment bitmap的過程。

官網的說明:

       DBMS_REPAIR Examples

       Using the DBMS_REPAIR Package

yangtingkun的blog:

       DBMS_REPAIR的使用

5.1 準備工作

create tablespace block datafile '/u01/block.dbf' size 5M;

create table DMM tablespace block as select * from all_tables;

commit;

CREATE INDEX indx_dmm on DMM(TABLE_NAME);

select count(*) from DMM;

  COUNT(*)

----------

12896

5.2.建立管理表

SQL> conn sys/admin as sysdba;

已連接配接。

SQL> execDBMS_REPAIR.ADMIN_TABLES('REPAIR_TABLE',1,1,'USERS');

PL/SQL procedure successfully completed

SQL> execDBMS_REPAIR.ADMIN_TABLES('ORPHAN_TABLE',2,1,'USERS');

5.3 檢查壞塊:dbms_repair.check_object

 Set serveroutput on;

DECLARE

   cc  NUMBER;

BEGIN

   DBMS_REPAIR.check_object (schema_name => 'SYS', -- 注意此處是使用者名

                             object_name     => 'DMM',

                             corrupt_count   => cc);

   DBMS_OUTPUT.put_line ( TO_CHAR (cc));

END;

       正常情況下輸入為0。 如果有壞塊,可以在建立的REPAIR_TABLE中檢視塊損壞資訊:   

/* Formatted on 2009-12-17 13:18:19 (QP5v5.115.810.9015) */

SELECT   object_name,

         relative_file_id,

         block_id,

         marked_corrupt,

         corrupt_description,

         repair_description,

         CHECK_TIMESTAMP

  FROM   repair_table;                                                                                                     

       注意:在8i下,check_object隻會檢查壞塊,MARKED_CORRUPT為false,故需要進行 定位壞塊(fix_corrupt_blocks) ,修改MARKED_CORRUPT為true,同時更新CHECK_TIMESTAMP。

       9i以後經過check_object,MARKED_CORRUPT的值已經辨別為TRUE了。是以可以直接進行第四步了。

5.4 .定位壞塊:dbms_repair.fix_corrupt_blocks     

隻有将壞塊資訊寫入定義的REPAIR_TABLE後,才能定位壞塊。 

/* Formatted on 2009-12-17 13:29:01 (QP5v5.115.810.9015) */

   cc   NUMBER;

   DBMS_REPAIR.fix_corrupt_blocks (schema_name   => 'SYS',

                                   object_name   => 'DMM',

                                   fix_count     => cc);

   DBMS_OUTPUT.put_line (a => TO_CHAR (cc));

 5.5 .跳過壞塊

我們前面雖然定位了壞塊,但是,如果我們通路table:

SQL> select count(*) from SYS.DMM;

ORA-01578: ORACLE 資料塊損壞(檔案号14,塊号154)

ORA-01110: 資料檔案 14: 'D: /BLOCK.DBF'

還是會得到錯誤資訊。 這裡需要用skip_corrupt_blocks來跳過壞塊:

/* Formatted on 2009-12-17 13:30:17 (QP5v5.115.810.9015) */

exec dbms_repair.skip_corrupt_blocks(schema_name => 'SYS',object_name => 'DMM',flags => 1);

SQL> select count(*) from SYS.DMM;

     12850

丢失了12896-12850=46行資料。

5.6.處理index上的無效鍵值;dump_orphan_keys

 /* Formatted on 2009-12-17 13:34:55(QP5 v5.115.810.9015) */

   DBMS_REPAIR.dump_orphan_keys (schema_name         => 'SYS',

                                 object_name         => 'INDX_DMM',

                                 object_type         => 2,

                                 repair_table_name   => 'REPAIR_TABLE',

                                 orphan_table_name   => 'ORPHAN_TABLE',

                                 key_count           => CC);

通過以下指令可以知道丢失行的資訊:

SQL> SELECT * FROM ORPHAN_TABLE;

我們根據這個結果來考慮是否需要rebuild index.

5.7重建freelist:rebuild_freelists

 /* Formatted on 2009-12-17 13:37:57(QP5 v5.115.810.9015) */

exec dbms_repair.rebuild_freelists(schema_name => 'SYS',object_name => 'DMM');

繼續閱讀