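Handling Linux disk device names that get reordered after a reboot
During a recent on-site health check, a customer raised a problem: when their RAC nodes reboot, the devices that used to be sda1, sdb1 and sdc1 come back as sdd1, sde1 and sdf1. Because they bind their raw devices by device name, they often have to run the following commands by hand after startup: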
[root@ractest1 ~]# raw /dev/raw/raw1 /dev/sda1
[root@ractest1 ~]# raw /dev/raw/raw2 /dev/sdb1
[root@ractest1 ~]# raw /dev/raw/raw3 /dev/sdc1
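Stranger still, the two nodes sometimes see completely different names for the same disks: one node sees sda/sdb/sdc while the other sees sdd/sde/sdf, which breaks the Oracle RAC shared disks.
Once I understood their setup, the cause was fairly clear. This kind of reordering comes from how Linux scans disks: sdX names are assigned in detection order, and that order is not guaranteed to be the same on every boot. We can only work around it from another angle, by binding the raw devices to the disk ID instead of the device name. After I explained this, the customer agreed to let us change the original bindings. The steps were as follows: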
[root@ractest1 ~]# fdisk -l
Disk /dev/sdd: 429.4 GB, 429496729600 bytes
255 heads, 63 sectors/track, 52216 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdd1 1 52216 419424988+ 83 Linux
Disk /dev/sde: 209 MB, 209715200 bytes
7 heads, 58 sectors/track, 1008 cylinders
Units = cylinders of 406 * 512 = 207872 bytes
/dev/sde1 1 1008 204595 83 Linux
Disk /dev/sdf: 209 MB, 209715200 bytes
/dev/sdf1 1 1008 204595 83 Linux
As you can see, node 1, which had just been rebooted, shows the disks as sdd/sde/sdf.
The other node looks like this:
[root@ractest2 ~]# fdisk -l
Disk /dev/sda: 429.4 GB, 429496729600 bytes
/dev/sda1 1 52216 419424988+ 83 Linux
Disk /dev/sdb: 209 MB, 209715200 bytes
/dev/sdb1 1 1008 204595 83 Linux
Disk /dev/sdc: 209 MB, 209715200 bytes
/dev/sdc1 1 1008 204595 83 Linux
Run scsi_id against each shared disk on both machines:
[root@ractest2 ~]# scsi_id -g -s /block/sda
360080e500017ff06000004054c47bd4a
[root@ractest2 ~]# scsi_id -g -s /block/sdb
360080e500017fdd8000004c74c6344ef
[root@ractest2 ~]# scsi_id -g -s /block/sdc
360080e500017ff060000044f4c63446e
[root@ractest1 ~]# scsi_id -g -s /block/sdd
[root@ractest1 ~]# scsi_id -g -s /block/sde
[root@ractest1 ~]# scsi_id -g -s /block/sdf
Comparing the IDs shows that sda corresponds to sdd, sdb to sde, and sdc to sdf. We therefore enable udev and bind the raw devices by ID to avoid the problem.
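The comparison can also be done mechanically rather than by eye. The following is a minimal sketch, not part of the original procedure: it assumes each node's results have been saved to a text file with one "device id" pair per line, and the helper name pair_by_id and the file names are hypothetical.

```shell
# pair_by_id FILE_A FILE_B
# Each input file holds lines of the form "<device> <scsi_id>", which could be
# collected on a node with something like (scsi_id syntax as used in this post):
#   for d in sda sdb sdc; do echo "$d $(scsi_id -g -s /block/$d)"; done > node2.txt
# Prints "<device in A> <device in B> <scsi_id>" for every ID found in both files.
pair_by_id() {
    awk 'NR == FNR { dev[$2] = $1; next }      # first file: remember device per ID
         ($2 in dev) { print dev[$2], $1, $2 } # second file: print matching pairs
        ' "$1" "$2"
}
```

Called as pair_by_id node1.txt node2.txt, it prints one line per shared disk, so a device that appears on only one node is easy to spot because it produces no line at all.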
[root@ractest1 ~]# cd /etc/udev/rules.d/
[root@ractest1 rules.d]# ls -a
. 50-udev.rules 60-pcmcia.rules 61-uinput-wacom.rules 90-hal.rules
.. 51-hotplug.rules 60-raw.rules 85-pcscd_ccid.rules 95-pam-console.rules
05-udev-early.rules 60-libsane.rules 60-wacom.rules 90-alsa.rules 98-kexec.rules
40-multipath.rules 60-net.rules 61-uinput-stddev.rules 90-dm.rules bluetooth.rules
[root@ractest1 rules.d]# vi 60-raw.rules
# Enter raw device bindings here.
#
# An example would be:
# ACTION=="add", KERNEL=="sda", RUN+="/bin/raw /dev/raw/raw1 %N"
# to bind /dev/raw/raw1 to /dev/sda, or
# ACTION=="add", ENV{MAJOR}=="8", ENV{MINOR}=="1", RUN+="/bin/raw /dev/raw/raw2 %M %m"
# to bind /dev/raw/raw2 to the device with major 8, minor 1.
ACTION=="add", KERNEL=="sd*1", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="360080e500017ff060000044f4c63446e", RUN+="/bin/raw /dev/raw/raw1 %N"
ACTION=="add", KERNEL=="sd*1", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="360080e500017fdd8000004c74c6344ef", RUN+="/bin/raw /dev/raw/raw2 %N"
ACTION=="add", KERNEL=="sd*1", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="360080e500017ff06000004054c47bd4a", RUN+="/bin/raw /dev/raw/raw3 %N"
KERNEL=="raw[1-3]", OWNER="oracle", GROUP="dba", MODE="660"
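The three binding rules above differ only in the SCSI ID and the raw device number, so they can be generated instead of typed by hand. This is a rough sketch under that assumption; the function name make_raw_rule is made up for illustration, and the rule text is copied from the rules shown in this post (the scsi_id flags are specific to this older udev release).

```shell
# make_raw_rule SCSI_ID RAW_NUMBER
# Prints one udev rule that binds /dev/raw/raw<RAW_NUMBER> to the first
# partition of whichever sd* disk reports the given SCSI ID.
make_raw_rule() {
    # %%p and %%N are escaped so printf emits the literal udev
    # substitutions %p and %N.
    printf 'ACTION=="add", KERNEL=="sd*1", PROGRAM=="/sbin/scsi_id -g -u -s %%p", RESULT=="%s", RUN+="/bin/raw /dev/raw/raw%s %%N"\n' "$1" "$2"
}

# Example: regenerate the first rule from this post.
make_raw_rule 360080e500017ff060000044f4c63446e 1
```

Generating the lines this way mainly guards against typos in the long IDs; the output still has to be reviewed and appended to 60-raw.rules manually.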
[root@ractest1 rules.d]# start_udev
Starting udev: [ OK ]
[root@ractest1 rules.d]#
[root@ractest1 rules.d]# raw -qa
/dev/raw/raw1: bound to major 8, minor 81
/dev/raw/raw2: bound to major 8, minor 65
/dev/raw/raw3: bound to major 8, minor 49
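raw -qa reports only major and minor numbers, so it helps to translate them back to device names when checking the result. The sketch below assumes the standard numbering for SCSI disks on major 8 (16 minors per disk, covering sda through sdp); the function name sd_name_from_minor is hypothetical.

```shell
# sd_name_from_minor MINOR
# Converts a minor number on major 8 into its sd device name.
# Each disk owns 16 minors: minor/16 selects the disk, minor%16 the partition.
sd_name_from_minor() {
    minor=$1
    disk=$((minor / 16))
    part=$((minor % 16))
    # Major 8 only covers disk indexes 0..15, i.e. letters a..p.
    letter=$(printf 'abcdefghijklmnop' | cut -c "$((disk + 1))")
    if [ "$part" -eq 0 ]; then
        echo "sd$letter"
    else
        echo "sd$letter$part"
    fi
}

for m in 81 65 49; do
    sd_name_from_minor "$m"
done
```

For the output above this gives sdf1, sde1 and sdd1 for raw1, raw2 and raw3 respectively, which can be checked against the SCSI IDs used in the rules.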
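Repeat the same steps on the other machine.
After these changes the problem was resolved: the raw device bindings now follow the disk IDs, so they stay correct across reboots regardless of which sdX names the disks receive.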