
Oracle Data Guard stops synchronizing data after an operating system restart

Environment: Oracle VM virtual machine + Oracle Linux 6.3 64-bit + Oracle 11g 11.2.0.3 64-bit Data Guard

SQL> select * from v$version;

BANNER

--------------------------------------------------------------------------------

Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production

PL/SQL Release 11.2.0.3.0 - Production

CORE    11.2.0.3.0      Production

TNS for Linux: Version 11.2.0.3.0 - Production

NLSRTL Version 11.2.0.3.0 - Production

$ cat /etc/issue

Oracle Linux Server release 6.3

Kernel \r on an \m

Alert log:

Managed Standby Recovery not using Real Time Apply

Parallel Media Recovery started with 4 slaves

Waiting for all non-current ORLs to be archived...

All non-current ORLs have been archived.

Media Recovery Waiting for thread 1 sequence 39953

Fetching gap sequence in thread 1, gap sequence 39953-39953

Completed: alter database recover managed standby database disconnect from session

Wed Mar 12 21:22:16 2014

Creating archive destination file : /u01/ora11g/arch01/1_39953_770961807.dbf (77251 blocks)

Wed Mar 12 21:22:37 2014

Creating archive destination file : /u01/ora11g/arch01/1_39953_770961807.dbf (77251 blocks)

Wed Mar 12 21:22:58 2014

Creating archive destination file : /u01/ora11g/arch01/1_39953_770961807.dbf (77251 blocks)
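In the excerpt above, the same archive file is reported as created three times about 21 seconds apart, which suggests that each attempt failed to complete. A minimal sketch for spotting that pattern in a larger alert log follows; the function name `repeated_archive_creates` is made up for this illustration, and the reading of repeated creation as a failed write is an inference from the excerpt, not documented Oracle behavior.

```shell
#!/bin/sh
# Hypothetical helper: read alert-log text on stdin and print
# "<count> <path>" for every archive file that was reported as
# created more than once.
repeated_archive_creates() {
    awk '
        /Creating archive destination file :/ {
            # The path is the field right after the ":" separator.
            for (i = 1; i <= NF; i++)
                if ($i == ":") { count[$(i + 1)]++; break }
        }
        END {
            for (f in count)
                if (count[f] > 1) print count[f], f
        }
    '
}
```

It would be called as `repeated_archive_creates < /path/to/alert.log`, where the alert log location depends on the installation.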

After the archived log was registered manually, it was applied normally.

ALTER DATABASE REGISTER LOGFILE '/u01/ora11g/arch01/1_39953_770961807.dbf';
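When more than one archived log has to be registered by hand, the statements can be generated rather than typed. This is only a sketch: `register_logfile_sql` is a hypothetical helper that prints SQL text, and piping it into `sqlplus` assumes a working Oracle environment on the standby host.

```shell
#!/bin/sh
# Hypothetical helper: print one ALTER DATABASE REGISTER LOGFILE
# statement for each archive path passed as an argument.
register_logfile_sql() {
    for f in "$@"; do
        printf "ALTER DATABASE REGISTER LOGFILE '%s';\n" "$f"
    done
}

# Assumed usage on the standby (not run here):
#   register_logfile_sql /u01/ora11g/arch01/1_39953_770961807.dbf \
#       | sqlplus -S "/ as sysdba"
```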

/var/log/dmesg:

EXT4-fs warning (device xvda2): ext4_end_bio:258: I/O error writing to inode 11145535 (offset 0 size 4096 starting block 10659634)

JBD2: Detected IO errors while flushing file data on xvda2-8

end_request: I/O error, dev xvda, sector 85277072
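The three kernel messages above are easy to miss in a long log. A small filter such as the following pulls them out; the name `io_error_lines` and the pattern are assumptions chosen to match the lines quoted here, not an exhaustive list of kernel I/O messages.

```shell
#!/bin/sh
# Hypothetical helper: keep only lines from stdin that look like
# kernel I/O errors (pattern based on the messages quoted above).
io_error_lines() {
    grep -E 'I/O error|IO errors'
}

# Assumed usage:
#   dmesg | io_error_lines
#   io_error_lines < /var/log/dmesg
```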

Checking disk usage on the physical host showed that one filesystem was 100% full.

# df -h

/dev/mapper/3600000e00d10000000100377000c0000

1300G 1300G 0 100% /OVS/Repositories/0001fb0000030000d4a4b27cf3835c47
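Because the device name is long, `df -h` wraps the entry over two lines, which is awkward to scan or script against. A sketch of an automated check, assuming `df -P` (POSIX output format, one line per filesystem) is used instead; the function name `full_filesystems` and the default threshold of 95 are illustrative choices.

```shell
#!/bin/sh
# Hypothetical helper: read `df -P` output on stdin and print
# "<use%> <mount point>" for filesystems at or above the threshold.
# Mount points containing spaces are not handled in this sketch.
full_filesystems() {
    threshold=${1:-95}
    awk -v t="$threshold" '
        NR > 1 {                 # skip the header line
            use = $5
            sub(/%/, "", use)    # "100%" -> "100"
            if (use + 0 >= t) print $5, $6
        }
    '
}

# Assumed usage:
#   df -P | full_filesystems 95
```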

Check which files are on this filesystem:

# find /OVS/Repositories/0001fb0000030000d4a4b27cf3835c47 -type f

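This turned up an ISO file occupying about 3.5 GB. Deleting it did not release the space. (On Linux, a file that a process still holds open is not really removed when it is deleted; its blocks are freed only once the last open descriptor is closed.)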
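The behavior is easy to reproduce with a temporary file. The following is a small demonstration, not part of the original troubleshooting; `demo_deleted_open_file` is a made-up name and the file is a throwaway created with `mktemp`.

```shell
#!/bin/sh
# Demonstration: a deleted file stays readable (and keeps its disk
# blocks) for as long as some process has it open.
demo_deleted_open_file() {
    tmp=$(mktemp) || return 1
    printf 'still here\n' > "$tmp"
    exec 3< "$tmp"                        # hold the file open on descriptor 3
    rm -f "$tmp"                          # remove the name from the directory
    [ -e "$tmp" ] || echo "name is gone"  # the path no longer exists
    echo "read via fd 3: $(cat <&3)"      # the data is still reachable
    exec 3<&-                             # closing the last descriptor frees it
}

demo_deleted_open_file
```

The function prints `name is gone` followed by `read via fd 3: still here`, showing that removing the name alone does not discard the data.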

# lsof | grep -i deleted

# ps -elf | grep <process_id_of_result_above>
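If `lsof` is not installed, the same information can be read from `/proc`, where the symlink for an open but deleted file ends in ` (deleted)`. This is a sketch under that assumption; `list_deleted_open` is a hypothetical name, and without root privileges it only sees processes owned by the current user.

```shell
#!/bin/sh
# Hypothetical helper: print "<pid><TAB><path> (deleted)" for every
# open file descriptor that points at an unlinked file.
list_deleted_open() {
    for link in /proc/[0-9]*/fd/*; do
        # Descriptors can vanish or be unreadable; skip those.
        target=$(readlink "$link" 2>/dev/null) || continue
        case $target in
            *' (deleted)')
                pid=${link#/proc/}   # strip the leading "/proc/"
                pid=${pid%%/*}       # keep only the PID component
                printf '%s\t%s\n' "$pid" "$target"
                ;;
        esac
    done
}
```

The PID in the first column can then be inspected with `ps -fp <pid>` before deciding whether the process is safe to stop.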

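A process turned out to still have the ISO file open. Once that process was stopped, the space was released. Take care at this step: the process holding the ISO open may be a running virtual machine, so be careful when shutting it down.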

After the operating system was restarted, Data Guard returned to normal.

This was really an operating-system-level error that kept Oracle Data Guard from synchronizing data.

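When the problem first appeared, attention went to the Oracle Data Guard configuration, and that cost a fair amount of time, even though the operating system had only been restarted and the configuration had not been changed.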

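Once the manually registered archived log applied normally, I began to suspect that the cause was not in Oracle itself and turned to the operating system.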

Only after the errors in dmesg were found was the root cause actually located.

Sometimes a headache is not cured by treating the head; taking a look at the feet may be what fixes it.