
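Replacing disks on a NetApp FAS3220

The NetApp storage stopped working normally, so the database on the midrange server could no longer be reached.

1. Use sysconfig -r to check system and disk status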

SBJYJ-02> sysconfig -r
Aggregate aggr0 (online, raid_dp, degraded, hybrid_enabled) (block checksums)
  Plex /aggr0/plex0 (online, normal, active)
    RAID group /aggr0/plex0/rg0 (degraded, block checksums)

      RAID Disk Device          HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
      --------- ------          ------------- ---- ---- ---- ----- --------------    --------------
      dparity   0b.10.1         0b    10  1   SA:B   0   SAS 15000 560000/1146880000 560208/1147307688 
      parity    0b.20.1         0b    20  1   SA:B   0   SAS 15000 560000/1146880000 560208/1147307688 
      data      0b.20.17        0b    20  17  SA:B   0   SAS 15000 560000/1146880000 560208/1147307688 
      data      0b.10.3         0b    10  3   SA:B   0   SAS 15000 560000/1146880000 560208/1147307688 
      data      0b.20.3         0b    20  3   SA:B   0   SAS 15000 560000/1146880000 560208/1147307688 
      data      4c.30.3         4c    30  3   SA:A   0   SAS 15000 560000/1146880000 560208/1147307688 
      data      0b.10.5         0b    10  5   SA:B   0   SAS 15000 560000/1146880000 560208/1147307688 
      data      0b.20.5         0b    20  5   SA:B   0   SAS 15000 560000/1146880000 560208/1147307688 
      data      4c.30.5         4c    30  5   SA:A   0   SAS 15000 560000/1146880000 560208/1147307688 
      data      0b.10.7         0b    10  7   SA:B   0   SAS 15000 560000/1146880000 560208/1147307688 
      data      FAILED                  N/A                        560000/ -
      data      4c.30.7         4c    30  7   SA:A   0   SAS 15000 560000/1146880000 560208/1147307688 
      data      0b.10.9         0b    10  9   SA:B   0   SAS 15000 560000/1146880000 560208/1147307688 
      data      0b.20.9         0b    20  9   SA:B   0   SAS 15000 560000/1146880000 560208/1147307688 
      data      4c.30.9         4c    30  9   SA:A   0   SAS 15000 560000/1146880000 560208/1147307688 
      data      0b.10.11        0b    10  11  SA:B   0   SAS 15000 560000/1146880000 560208/1147307688 
      data      0b.20.11        0b    20  11  SA:B   0   SAS 15000 560000/1146880000 560208/1147307688 

    RAID group /aggr0/plex0/rg1 (double degraded, block checksums)

      RAID Disk Device          HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
      --------- ------          ------------- ---- ---- ---- ----- --------------    --------------
      dparity   4c.30.11        4c    30  11  SA:A   0   SAS 15000 560000/1146880000 560208/1147307688 
      parity    0b.20.13        0b    20  13  SA:B   0   SAS 15000 560000/1146880000 560208/1147307688 
      data      0b.10.13        0b    10  13  SA:B   0   SAS 15000 560000/1146880000 560208/1147307688 
      data      4c.30.13        4c    30  13  SA:A   0   SAS 15000 560000/1146880000 560208/1147307688 
      data      0b.20.15        0b    20  15  SA:B   0   SAS 15000 560000/1146880000 560208/1147307688 
      data      FAILED                  N/A                        560000/ -
      data      4c.30.15        4c    30  15  SA:A   0   SAS 15000 560000/1146880000 560208/1147307688 
      data      0b.10.23        0b    10  23  SA:B   0   SAS 15000 560000/1146880000 560208/1147307688 
      data      4c.30.1         4c    30  1   SA:A   0   SAS 15000 560000/1146880000 560879/1148681096 
      data      4c.30.17        4c    30  17  SA:A   0   SAS 15000 560000/1146880000 560208/1147307688 
      data      0b.20.19        0b    20  19  SA:B   0   SAS 15000 560000/1146880000 560208/1147307688 
      data      FAILED                  N/A                        560000/ -
      data      4c.30.19        4c    30  19  SA:A   0   SAS 15000 560000/1146880000 560208/1147307688 
      data      0b.20.21        0b    20  21  SA:B   0   SAS 15000 560000/1146880000 560208/1147307688 
      data      0b.10.21        0b    10  21  SA:B   0   SAS 15000 560000/1146880000 560208/1147307688 
      data      0b.10.15        0b    10  15  SA:B   0   SAS 15000 560000/1146880000 560879/1148681096 
      data      0b.20.23        0b    20  23  SA:B   0   SAS 15000 560000/1146880000 560208/1147307688 


Pool1 spare disks (empty)

Pool0 spare disks

RAID Disk       Device          HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
---------       ------          ------------- ---- ---- ---- ----- --------------    --------------
Spare disks for block checksum
spare           0a.00.1         0a    0   1   SA:B   0   SSD   N/A 190532/390209536  190782/390721968 
spare           0a.00.3         0a    0   3   SA:B   0   SSD   N/A 190532/390209536  190782/390721968 

Broken disks

RAID Disk       Device          HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
---------       ------          ------------- ---- ---- ---- ----- --------------    --------------
failed          0b.10.17        0b    10  17  SA:B   0   SAS 15000 560000/1146880000 560208/1147307688 
failed          0b.10.19        0b    10  19  SA:B   0   SAS 15000 560000/1146880000 560208/1147307688 
failed          0b.20.7         0b    20  7   SA:B   0   SAS 15000 560000/1146880000 560208/1147307688 
failed          4c.30.23        4c    30  23  SA:A   0   SAS 15000 560000/1146880000 560208/1147307688 

Partner disks

RAID Disk       Device          HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
---------       ------          ------------- ---- ---- ---- ----- --------------    --------------
partner         0b.10.22        0b    10  22  SA:B   0   SAS 15000 560000/1146880000 560879/1148681096 
partner         4c.30.21        4c    30  21  SA:A   0   SAS 15000 0/0               560879/1148681096 
partner         4c.30.22        4c    30  22  SA:A   0   SAS 15000 0/0               560879/1148681096 
partner         4c.20.10        4c    20  10  SA:A   0   SAS 15000 0/0               560208/1147307688 
partner         0b.10.2         0b    10  2   SA:B   0   SAS 15000 0/0               560208/1147307688 
partner         0b.10.16        0b    10  16  SA:B   0   SAS 15000 0/0               560208/1147307688 
partner         0b.10.20        0b    10  20  SA:B   0   SAS 15000 0/0               560208/1147307688 
partner         0b.10.0         0b    10  0   SA:B   0   SAS 15000 0/0               560208/1147307688 
partner         0b.10.18        0b    10  18  SA:B   0   SAS 15000 0/0               560208/1147307688 
partner         0b.10.8         0b    10  8   SA:B   0   SAS 15000 0/0               560208/1147307688 
partner         4c.30.14        4c    30  14  SA:A   0   SAS 15000 0/0               560208/1147307688 
partner         4c.20.12        4c    20  12  SA:A   0   SAS 15000 0/0               560208/1147307688 
partner         4c.30.6         4c    30  6   SA:A   0   SAS 15000 0/0               560208/1147307688 
partner         4c.20.6         4c    20  6   SA:A   0   SAS 15000 0/0               560208/1147307688 
partner         4c.30.20        4c    30  20  SA:A   0   SAS 15000 0/0               560208/1147307688 
partner         4c.20.2         4c    20  2   SA:A   0   SAS 15000 0/0               560208/1147307688 
partner         4c.20.16        4c    20  16  SA:A   0   SAS 15000 0/0               560208/1147307688 
partner         4c.30.0         4c    30  0   SA:A   0   SAS 15000 0/0               560208/1147307688 
partner         4c.30.16        4c    30  16  SA:A   0   SAS 15000 0/0               560208/1147307688 
partner         4c.20.8         4c    20  8   SA:A   0   SAS 15000 0/0               560208/1147307688 
partner         4c.30.2         4c    30  2   SA:A   0   SAS 15000 0/0               560208/1147307688 
partner         0b.10.12        0b    10  12  SA:B   0   SAS 15000 0/0               560208/1147307688 
partner         0b.10.14        0b    10  14  SA:B   0   SAS 15000 0/0               560208/1147307688 
partner         0b.10.10        0b    10  10  SA:B   0   SAS 15000 0/0               560208/1147307688 
partner         0b.10.4         0b    10  4   SA:B   0   SAS 15000 0/0               560208/1147307688 
partner         0b.10.6         0b    10  6   SA:B   0   SAS 15000 0/0               560208/1147307688 
partner         4c.20.14        4c    20  14  SA:A   0   SAS 15000 0/0               560208/1147307688 
partner         4c.20.22        4c    20  22  SA:A   0   SAS 15000 0/0               560208/1147307688 
partner         4c.30.12        4c    30  12  SA:A   0   SAS 15000 0/0               560208/1147307688 
partner         4c.20.20        4c    20  20  SA:A   0   SAS 15000 0/0               560208/1147307688 
partner         4c.20.18        4c    20  18  SA:A   0   SAS 15000 0/0               560208/1147307688 
partner         4c.30.10        4c    30  10  SA:A   0   SAS 15000 0/0               560208/1147307688 
partner         4c.30.18        4c    30  18  SA:A   0   SAS 15000 0/0               560208/1147307688 
partner         4c.20.0         4c    20  0   SA:A   0   SAS 15000 0/0               560208/1147307688 
partner         4c.20.4         4c    20  4   SA:A   0   SAS 15000 0/0               560208/1147307688 
partner         4c.30.4         4c    30  4   SA:A   0   SAS 15000 0/0               560208/1147307688 
partner         4c.30.8         4c    30  8   SA:A   0   SAS 15000 0/0               560208/1147307688 
partner         4a.00.2         4a    0   2   SA:A   0   SSD   N/A 0/0               190782/390721968 
partner         4a.00.0         4a    0   0   SA:A   0   SSD   N/A 0/0               190782/390721968
           

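The aggregate uses RAID-DP, which tolerates two failed disks per RAID group. rg0 has one failed slot and rg1 has two, so the initial assessment was that disks had failed but the data was not affected. Note that rg1 has no redundancy left.

The output above shows four broken disks in total, with the failed slots spread across two RAID groups. vol status -r shows that the hot spares are used up (the only spares still listed are SSDs, not SAS disks), so the failed disks cannot be replaced automatically. The disks listed further down (partner disks) have not been taken over by controller 2, so their detailed status cannot be seen from here.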
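The reading of the sysconfig -r output above can be sketched as a small parser. This is only an illustration in Python, assuming the column layout shown in the listing; the names summarize and failures_tolerated are hypothetical helpers, and the sample text is an excerpt of the output above.

```python
# Sketch: summarize `sysconfig -r` text. Assumes the column layout shown above.
import re

SAMPLE = """\
    RAID group /aggr0/plex0/rg0 (degraded, block checksums)
      dparity   0b.10.1         0b    10  1   SA:B   0   SAS 15000 560000/1146880000 560208/1147307688
      data      FAILED                  N/A                        560000/ -
    RAID group /aggr0/plex0/rg1 (double degraded, block checksums)
      data      FAILED                  N/A                        560000/ -
      data      0b.10.23        0b    10  23  SA:B   0   SAS 15000 560000/1146880000 560208/1147307688
      data      FAILED                  N/A                        560000/ -
Pool0 spare disks
spare           0a.00.1         0a    0   1   SA:B   0   SSD   N/A 190532/390209536  190782/390721968
spare           0a.00.3         0a    0   3   SA:B   0   SSD   N/A 190532/390209536  190782/390721968
Broken disks
failed          0b.10.17        0b    10  17  SA:B   0   SAS 15000 560000/1146880000 560208/1147307688
failed          0b.10.19        0b    10  19  SA:B   0   SAS 15000 560000/1146880000 560208/1147307688
failed          0b.20.7         0b    20  7   SA:B   0   SAS 15000 560000/1146880000 560208/1147307688
failed          4c.30.23        4c    30  23  SA:A   0   SAS 15000 560000/1146880000 560208/1147307688
"""

GROUP_RE = re.compile(r"^\s*RAID group (\S+) \(")
PARITY_DISKS = 2  # RAID-DP keeps two parity disks per RAID group


def summarize(text):
    """Count FAILED slots per RAID group, spares per disk type, and broken disks."""
    failed_slots, spares, broken = {}, {}, []
    group = None
    for line in text.splitlines():
        match = GROUP_RE.match(line)
        if match:
            group = match.group(1)
            failed_slots[group] = 0
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        if group and fields[1] == "FAILED":
            failed_slots[group] += 1
        elif fields[0] == "spare" and len(fields) >= 8:
            # Field 7 is the Type column (SAS, SSD, ...)
            spares[fields[7]] = spares.get(fields[7], 0) + 1
        elif fields[0] == "failed":
            broken.append(fields[1])
    return {"failed_slots": failed_slots, "spares": spares, "broken": broken}


def failures_tolerated(failed):
    """How many more disks a RAID-DP group can lose before data is at risk."""
    return max(PARITY_DISKS - failed, 0)


report = summarize(SAMPLE)
for name, failed in report["failed_slots"].items():
    print(name, "failed:", failed, "can still lose:", failures_tolerated(failed))
print("spares by type:", report["spares"])
print("broken disks:", report["broken"])
```

A group that can still lose zero disks, like rg1 here, is the one to repair first.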

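2. Replace the disks owned by controller 2

① Pull out the failed disk and wait 30 seconds (the platters keep spinning for a while after power is removed; waiting avoids physical damage that would make data recovery harder), then insert the new disk (confirm that the amber LED is lit and the green LED is not blinking).

② After inserting the disk, run disk show -v to check whether the new disk has been assigned an owner (controller).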

DISK       OWNER                    POOL   SERIAL NUMBER         DR HOME            CHKSUM
------------ -------------            -----  -------------    -------------            -------  
4a.00.0      esad  (22312421)    Pool0  S142NEAD806058        SBJYJ-01  (2017242430)    Block
4a.00.2      SBJYJ-01  (2017242430)    Pool0  S142NEAD803850        SBJYJ-01  (2017242430)    Block

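Here the newly inserted disk does not belong to this controller: it still carries ownership or RAID information from another controller.
Use disk assign  -f  <disk_id>  -s  <owner_id> to force-assign it to a controller (*use with caution). Afterwards, run disk show -v again to confirm that the assignment succeeded.
Use aggr destroy <aggr_name> to delete an aggregate (*use with caution).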
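The ownership check can be sketched as follows. This is an illustrative Python snippet, not a NetApp tool: it assumes rows shaped like the disk show -v sample above, and foreign_disks and its expected_sysid parameter are hypothetical names. The system ID passed in is the one shown in the HOME column of the sample.

```python
# Sketch: list disks whose owner system ID differs from the expected controller.
import re

SAMPLE = """\
DISK       OWNER                    POOL   SERIAL NUMBER         DR HOME            CHKSUM
------------ -------------            -----  -------------    -------------            -------
4a.00.0      esad  (22312421)    Pool0  S142NEAD806058        SBJYJ-01  (2017242430)    Block
4a.00.2      SBJYJ-01  (2017242430)    Pool0  S142NEAD803850        SBJYJ-01  (2017242430)    Block
"""

ROW_RE = re.compile(
    r"^(?P<disk>\S+)\s+(?P<owner>\S+)\s*\((?P<owner_id>\d+)\)\s+"
    r"(?P<pool>\S+)\s+(?P<serial>\S+)\s+"
    r"(?P<home>\S+)\s*\((?P<home_id>\d+)\)"
)


def foreign_disks(text, expected_sysid):
    """Return (disk, owner, owner_id) for rows not owned by expected_sysid."""
    result = []
    for line in text.splitlines():
        match = ROW_RE.match(line.strip())
        # Header and separator lines do not match the row pattern and are skipped.
        if match and match.group("owner_id") != expected_sysid:
            result.append(
                (match.group("disk"), match.group("owner"), match.group("owner_id"))
            )
    return result


for disk, owner, owner_id in foreign_disks(SAMPLE, "2017242430"):
    print(disk, "is owned by", owner, "(" + owner_id + ")")
```

Any disk this reports is a candidate for the disk assign -f step, after checking by hand that it really is the newly inserted disk.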

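③ Use vol status -r to check the disk status. If the disk shows bad label, follow step (1). If the disk is already listed under spare disks and its row ends with not zeroed, follow step (2).
(1)	vol status -r shows a disk tagged bad label even though the new disk has already been installed.
First enter advanced mode with priv set advanced, run disk unfail -s <disk_id 0b.**.**> to clear the bad label, then leave advanced mode with priv set.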
RAID Disk       Device          HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
---------       ------          ------------- ---- ---- ---- ----- --------------    --------------
failed          0b.10.19        0b    10  19  SA:B   0   SAS 15000 560000/1146880000 560208/1147307688 
failed          0b.20.7         0b    20  7   SA:B   0   SAS 15000 560000/1146880000 560208/1147307688 
failed          4c.30.23        4c    30  23  SA:A   0   SAS 15000 560000/1146880000 560208/1147307688 
bad label       0b.10.17        0b    10  17  SA:B   0   SAS 15000 560000/1146880000 560879/1148681096
(2)	A disk in the spare list is followed by not zeroed:

RAID Disk       Device          HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
---------       ------          ------------- ---- ---- ---- ----- --------------    --------------
Spare disks for block checksum
spare           0a.00.1         0a    0   1   SA:B   0   SSD   N/A 190532/390209536  190782/390721968   (not zeroed)
spare           0a.00.3         0a    0   3   SA:B   0   SSD   N/A 190532/390209536  190782/390721968 

           

Use disk zero spares to zero all spare disks.
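The choice between steps (1) and (2) can be sketched as a small classifier over vol status -r rows. This is an illustration only: suggest_commands is a hypothetical helper, the sample rows are taken from the listings above, and the match on "not zero" is deliberately loose so that minor spelling differences in the tag are tolerated.

```python
# Sketch: map `vol status -r` rows to the follow-up command from steps (1) and (2).
SAMPLE = """\
failed          0b.10.19        0b    10  19  SA:B   0   SAS 15000 560000/1146880000 560208/1147307688
bad label       0b.10.17        0b    10  17  SA:B   0   SAS 15000 560000/1146880000 560879/1148681096
spare           0a.00.1         0a    0   1   SA:B   0   SSD   N/A 190532/390209536  190782/390721968   (not zeroed)
spare           0a.00.3         0a    0   3   SA:B   0   SSD   N/A 190532/390209536  190782/390721968
"""


def suggest_commands(text):
    """Return the commands to run, in order, without duplicates."""
    commands = []
    for line in text.splitlines():
        fields = line.split()
        lower = line.lower()
        if fields[:2] == ["bad", "label"] and len(fields) >= 3:
            # Step (1): needs `priv set advanced` first, `priv set` afterwards.
            command = "disk unfail -s " + fields[2]
        elif fields[:1] == ["spare"] and "not zero" in lower:
            # Step (2): zeroes every spare that is not yet zeroed.
            command = "disk zero spares"
        else:
            continue
        if command not in commands:
            commands.append(command)
    return commands


for command in suggest_commands(SAMPLE):
    print(command)
```

The function only prints suggestions; the commands themselves should still be typed and checked on the console by hand.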

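Bringing up controller 1

Because of the disk failures, controller 1 shut itself down to protect the disks. This stops users from continuing to access the storage, which could damage more disks and lose data.

Connect a console cable directly to the storage; the controller will be at the LOADER> prompt. Use help to list the available commands and boot_ontap to boot the controller (if it does not boot, the controller itself may be faulty). Once the controller is up, log in to it and replace its disks in the same way.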

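Summary

Reconstruction and zeroing take some time. Once they finish, the RAID groups leave the degraded state automatically and the storage runs normally.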
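Whether the RAID groups have left the degraded state can be checked from the group header lines of sysconfig -r or aggr status -r. The sketch below is a hedged Python illustration: unhealthy_groups is a hypothetical helper, the BEFORE sample comes from the output earlier in this post, and the AFTER sample is an assumed example of a healthy header. Treating any state that contains "degraded" or "reconstruct" as not yet healthy is an assumption.

```python
# Sketch: report RAID groups whose header still shows a degraded or rebuilding state.
import re

BEFORE = """\
    RAID group /aggr0/plex0/rg0 (degraded, block checksums)
    RAID group /aggr0/plex0/rg1 (double degraded, block checksums)
"""

# Assumed appearance of the same headers once reconstruction has finished.
AFTER = """\
    RAID group /aggr0/plex0/rg0 (normal, block checksums)
    RAID group /aggr0/plex0/rg1 (normal, block checksums)
"""

GROUP_RE = re.compile(r"RAID group (\S+) \(([^)]*)\)")


def unhealthy_groups(text):
    """Map group name -> list of states that indicate it is not yet healthy."""
    result = {}
    for name, states in GROUP_RE.findall(text):
        flags = [state.strip() for state in states.split(",")]
        hits = [f for f in flags if "degraded" in f or "reconstruct" in f]
        if hits:
            result[name] = hits
    return result


print(unhealthy_groups(BEFORE))
print(unhealthy_groups(AFTER))
```

An empty result means no group header reports a degraded or reconstructing state any more.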

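Reference commands
sysconfig -v              show storage system status
sysconfig -r              show disk status
sysconfig -a              show detailed system information
vol status -v             show volume status
vol status -f             check for failed disks
disk show                 show disk assignment
disk show -v              show which controller owns each disk
storage show disk -p      show disk location
disk zero spares          zero all spare disks
environment status        check power supplies, fans, and other environment information
rdfile /etc/messages      read the latest log messages
cf status                 check controller status
df                        check volume space usage
license                   show license information
ifconfig -a               show network configuration
aggr status               show aggregate and RAID status
aggr status -r            show RAID group details
df -Vh                    show volume space
df -Ah                    show aggregate space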

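disk assign  -f <disk_id> -s <owner_id>         force-assign a disk to a controller  *use with caution
disk replace start disk_name spare_disk_name    replace a disk with a spare disk  *use with caution
disk replace stop disk_name                     stop the disk replacement  *use with caution
disk sanitize start disk_name                   erase all data on the disk  *use with caution
disk sanitize abort disk_name                   abort the sanitization  *use with caution
aggr destroy aggrname                           delete an aggregate  *use with caution
disk remove_ownership disk_name                 remove the owner of a disk  *use with caution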
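
The commands marked "use with caution" above could be guarded in a helper script before they are ever sent to a console. The following is a minimal Python sketch under that assumption: needs_confirmation and CAUTION_PREFIXES are hypothetical names, and the list simply mirrors the commands flagged in this post rather than any official NetApp classification.

```python
# Sketch: flag the commands this post marks as "use with caution".
CAUTION_PREFIXES = (
    ("disk", "replace", "start"),
    ("disk", "replace", "stop"),
    ("disk", "sanitize", "start"),
    ("disk", "sanitize", "abort"),
    ("aggr", "destroy"),
    ("disk", "remove_ownership"),
)


def needs_confirmation(command):
    """Return True if the command is one of the flagged, risky operations."""
    tokens = command.split()
    if tokens[:2] == ["disk", "assign"]:
        # Only the forced form is flagged; -f may appear anywhere on the line.
        return "-f" in tokens
    return any(tuple(tokens[: len(prefix)]) == prefix for prefix in CAUTION_PREFIXES)


for command in ("disk show -v", "aggr destroy aggr1", "disk assign  -f 0b.10.17 -s 2017242430"):
    print(command, "->", "confirm first" if needs_confirmation(command) else "ok")
```

Here aggr1 and the disk name are placeholders; a real wrapper would ask the operator to retype the aggregate or disk name before continuing.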