Symptom:
A P550 host running AIX came up after boot without its data file system mounted. Mounting it manually produced the following errors:
#mount /data
Replaying log for /dev/lv_tdprd_bak.
mount: 0506-324 Cannot mount /dev/lv_tdprd_bak on /data: The media is not formatted or the format is not correct.
0506-342 The superblock on /dev/datavg is dirty. Run a full fsck to fix.
The system error log shows the following:
# errpt
A6DF45AA 0215180108 I O RMCdaemon The daemon is started.
B38E3397 0215175908 U S SYSDUMP Previous system dump information
C0AA5338 0215175808 U S SYSDUMP System dump
9D035E4D 0215175108 P S SYSVMM Data storage interrupt, processor
9DBCFDEE 0215175908 T O errdemon Error logging turned on
B6DB68E0 0215043408 I O SYSJ2 FILE SYSTEM RECOVERY REQUIRED
49A83216 0215030208 T H hdisk2 Disk operation error
6926ECA8 0215030108 I O SYSJ2 META-DATA I/O ERROR
613E5F38 0215030108 P H LVDD I/O error detected by LVM
425BDD47 0215030108 P H hdisk2 Disk operation error
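When the log is long, it can help to pull out only the hardware-class entries and read their timestamps. The sketch below assumes the usual errpt summary layout (identifier, timestamp in MMDDhhmmYY form, type, class, resource name, description). The helper names `hw_errors` and `decode_ts` are made up for illustration.

```shell
# Sketch only: filter errpt summary lines read from stdin.
# Assumed columns: IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
# hw_errors and decode_ts are hypothetical helper names.

# Print "resource timestamp identifier" for each class-H (hardware) entry.
hw_errors() {
    awk '$4 == "H" { print $5, $2, $1 }'
}

# Turn an errpt timestamp (MMDDhhmmYY) into "MM/DD hh:mm YY".
decode_ts() {
    echo "$1" | sed 's/^\(..\)\(..\)\(..\)\(..\)\(..\)$/\1\/\2 \3:\4 \5/'
}

# On AIX this would typically be fed from the live log:
#   errpt | hw_errors
```

Applied to the listing above, this would single out the hdisk2 and LVDD entries, all logged around 03:01 on 02/15.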
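Cause:
AIX could not restart normally, and the errors above appeared after a forced restart. The forced shutdown damaged the file system, which produces the "The media is not formatted or the format is not correct" error and prevents the file system from being mounted.
Solution:
Run fsck to repair the file system: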
#fsck -p /data
** Phase 1 - Check Blocks and Sizes
....
** Phase 6b - Salvage Block Map
-1 blocks missing
Superblock is marked dirty (FIXED)
1922574 files 375776584 blocks 46344568 free
***** Filesystem was modified *****
After the repair succeeded, the file system was mounted again and the problem was resolved.
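The repair-then-mount sequence can be wrapped so that mount is attempted only when fsck reports success. This is a minimal sketch, not a tested AIX procedure. `repair_and_mount` and its `-n` preview switch are hypothetical names introduced here.

```shell
# Sketch only: run fsck in preen mode, then mount if fsck succeeded.
# repair_and_mount is a hypothetical helper; "-n" previews the commands
# by echoing them instead of running them.
repair_and_mount() {
    run=
    if [ "$1" = "-n" ]; then
        run=echo
        shift
    fi
    fs=$1
    $run fsck -p "$fs" || {
        echo "fsck failed on $fs; not mounting" >&2
        return 1
    }
    $run mount "$fs"
}

# Preview only: prints the two commands and changes nothing.
repair_and_mount -n /data
```

Previewing first is a deliberate choice: fsck modifies the file system, so it is worth confirming the target before running it for real.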
<a>File system cannot be mounted</a>
I. Symptom
# oslevel -r
5200-04
# lsvg
rootvg
datavg
#lspv
hdisk0 0054338ee0b6f496 rootvg active
hdisk1 0054338efa398c64 datavg active
# lsvg -l datavg
datavg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
web jfs 100 100 1 open/syncd /ws
loglv00 jfslog 1 1 1 open/syncd N/A
data jfs2 284 284 1 closed/syncd /data
loglv01 jfs2log 1 1 1 closed/syncd N/A
# mount /data
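Replaying log for /dev/data.
mount: 0506-324 Cannot mount /dev/data2 on /data2: The media is not formatted or the format is not correct.
0506-342 The superblock on /dev/data2 is dirty. Run a full fsck to fix.
The error log indicates that hdisk1 is damaged.
II. Resolution steps
datavg contains two file systems, and /ws can still be mounted, so first back up the data in /ws to a safe location, then proceed as follows:
1 Run fsck to repair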
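The `lsvg -l` listing above already hints at which file systems are affected: their logical volumes are in the closed state but have a mount point. A small filter for that, assuming the column layout shown in the listing (`closed_filesystems` is a hypothetical helper name):

```shell
# Sketch only: read `lsvg -l <vg>` output on stdin and print
# "lv-name mount-point" for logical volumes that are closed but have a
# mount point. closed_filesystems is a hypothetical helper name.
# Assumed data columns: LV-NAME TYPE LPs PPs PVs LV-STATE MOUNT-POINT
closed_filesystems() {
    awk '$6 ~ /^closed/ && $7 != "N/A" { print $1, $7 }'
}

# On AIX this would typically be used as:
#   lsvg -l datavg | closed_filesystems
```

For the listing above this would report only `data /data`; the log volumes are skipped because their mount point is N/A.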
# fsck /data
****************
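The current volume is: /dev/data
**Phase 1 - Check Blocks, Files/Directories, and Directory Entries
fsck: 0507-089 Unrecoverable error reading /dev/rdata. Cannot continue.
fsck: 0507-039 Fatal error (-10015,-1) accessing the file system (1,17360109568,16384,-1).
fsck: 0506-042 Running the helper "/sbin/helpers/jfs2/fsck" failed.
2 Repair the superblock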
If you receive one of the following errors from the fsck or mount commands, the problem may be a corrupted superblock.
fsck: Not an AIX4 file system
fsck: Not an AIXV4 file system
fsck: Not a recognized file system type
0506-342 The superblock is dirty. Run a full fsck to fix.
mount: invalid argument
The backup superblock can be copied over the primary superblock via one of these commands:
dd count=1 bs=4k skip=31 seek=1 if=/dev/lv00 of=/dev/lv00 (JFS)
dd count=1 bs=4k skip=15 seek=8 if=/dev/lv00 of=/dev/lv00 (JFS2) (Version 5 only)
fsck -p /dev/lv00 (works for both JFS and JFS2)
Once the copying over is completed, check the integrity of the file system by issuing:
fsck /dev/lv00
In many cases, copying the backup superblock to the primary superblock will recover the file system. If this does not work, you will have to recreate the file system and restore the data from a backup.
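Because the skip and seek values differ between JFS and JFS2, and dd overwrites the primary superblock in place, it is easy to pick the wrong line. The sketch below only prints the dd command quoted above for a given type so it can be reviewed before anyone runs it. `superblock_restore_cmd` is a hypothetical helper, and the block offsets are taken from the text above, not verified independently.

```shell
# Sketch only: PRINT (never run) the dd command quoted above that copies
# the backup superblock over the primary one.
# superblock_restore_cmd is a hypothetical helper name.
superblock_restore_cmd() {
    fstype=$1
    lv=$2
    case $fstype in
        jfs)  skip=31; seek=1 ;;   # offsets as quoted above for JFS
        jfs2) skip=15; seek=8 ;;   # offsets as quoted above for JFS2
        *)    echo "unknown file system type: $fstype" >&2; return 1 ;;
    esac
    echo "dd count=1 bs=4k skip=$skip seek=$seek if=$lv of=$lv"
}

superblock_restore_cmd jfs2 /dev/lv00
```

The file system type can be read from the TYPE column of `lsvg -l`, as in the listing earlier in this case.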
3 Reformat the log logical volume
# logform /dev/loglv01
logform: destroy /dev/rloglv01 (y)?y
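4 Restore the backed-up data
The customer's error log had already shown that the disk backing datavg was bad, and none of the steps above solved the problem. The only option left was to replace the disk, recreate the /data file system, and restore the backed-up data. Of the customer's 50 GB of data, about 3 GB could not be recovered and had to be re-entered by hand by the customer. Without regular backups nothing could have been done at all, so make sure routine backups are in place.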
hdisk0 000af70d4d50358c rootvg active
hdisk1 000af70dca7aea4d datavg active
hdisk2 000af70dca7ae679 sunvg active
#lsvg -l datavg
datavg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
raw1 raw 5 5 1 closed/syncd N/A
loglv00 jfs2log 1 1 1 closed/syncd N/A
fslv00 jfs2 192 192 1 closed/syncd /solaris
lv00 raw 4 4 1 closed/syncd N/A
#mount /solaris
mount: 0506-324 Cannot mount /dev/fslv00 on /solaris: There is a request to a device or address that does not exist.
This was odd; I had not run into this error before.
errpt showed nothing useful.
/etc/filesystems also looked normal.
Running fsck on /solaris revealed the problem:
#fsck /solaris
The current volume is: /dev/fslv00
Open volume exclusive read or write returned, rc = 6
fsck: 0507-289 Device unavailable or locked by another process.
Cannot continue.
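The device was locked. Only then did I remember that a colleague had asked about varyonvg -s the day before, so the volume group had presumably been varied on in that mode.
Vary the volume group off, then vary it on again: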
#varyoffvg datavg
#varyonvg datavg
#df -k
Filesystem 1024-blocks Free %Used Iused %Iused Mounted on
/dev/hd4 180224 110672 39% 2753 4% /
/dev/hd2 2867200 134992 96% 39835 6% /usr
/dev/hd9var 16384 4996 70% 441 11% /var
/dev/hd3 65536 28484 57% 283 2% /tmp
/dev/hd1 16384 15756 4% 87 3% /home
/proc - - - - - /proc
/dev/hd10opt 114688 4668 96% 2629 10% /opt
/dev/test 327680 326048 1% 5 1% /tst
/dev/fslv00 3145728 1113040 65% 638 1% /solaris
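Problem solved.
I then took a closer look at what the -s flag does:
-s Makes the volume group available in System Management mode only. Logical volume commands can operate on the volume group, but no logical volumes can be opened for input or output.
Note: Logical volume commands also cannot read or write to or from logical volumes in a volume group varied on with the -s flag. Logical volume commands that attempt to write to a logical volume in a volume group varied on with the -s flag (such as chvg or mklvcopy) may display error messages indicating that they were unable to write to and/or read from the logical volume.
When a volume group is activated with -s, its logical volumes remain in the closed state, and LVM commands that need to read or write them fail as well.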
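The three cases in these notes can be told apart by the message ID that mount or fsck prints. As a rough memory aid, the sketch below maps each ID seen above to the step that was taken for it. `next_step` is a hypothetical helper, and the mapping reflects only these cases, not a general AIX diagnostic rule.

```shell
# Sketch only: map a mount/fsck message seen in these notes to the step
# the notes took next. next_step is a hypothetical helper name.
next_step() {
    case $1 in
        *0506-342*) echo "superblock dirty: run a full fsck" ;;
        *0507-289*) echo "device locked: check whether the VG was varied on with -s" ;;
        *0507-089*) echo "unrecoverable read error: check errpt for disk errors" ;;
        *)          echo "not covered here: check errpt and /etc/filesystems" ;;
    esac
}

next_step "fsck: 0507-289 Device unavailable or locked by another process."
```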
For JFS2: check and recover the file system
The fsck utility was enhanced to also handle JFS2-type file systems. This utility checks the file system for consistency and repairs problems found.
# fsck -V jfs2 /myfs
The current volume is: /dev/lv01
File system is clean.
All observed inconsistencies have been repaired.
If the -V flag is not specified, fsck will figure out the JFS type by the VFS type specified for this file system and work in the assumed way:
# fsck /myfs
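To see which VFS type fsck would assume, the `vfs` attribute can be read from the file system's stanza in /etc/filesystems. The sketch below assumes the usual stanza layout (a `/mountpoint:` header line followed by indented `name = value` lines). `vfs_of` is a hypothetical helper name.

```shell
# Sketch only: print the vfs value from the stanza for mount point $1.
# $2 is the stanza file to read (defaults to /etc/filesystems).
# vfs_of is a hypothetical helper name.
vfs_of() {
    awk -v mp="$1" '
        # Stanza header: a single field ending in ":" (for example "/data:")
        NF == 1 && $1 ~ /:$/ { cur = substr($1, 1, length($1) - 1); next }
        # Inside the wanted stanza, look for "vfs = <type>"
        cur == mp && $1 == "vfs" && $2 == "=" { print $3; exit }
    ' "${2:-/etc/filesystems}"
}

# On AIX this would typically be used as:
#   fsck -V "$(vfs_of /myfs)" /myfs
```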
Reposted from Mr_sheng's 51CTO blog. Original link: http://blog.51cto.com/sf1314/2054667