
Fault case: ESXi 6.7 EP13 purple screen (PSOD) analysis

Product version information:

Huawei RH2288H V3 | BIOS: 3.87 | Date (ISO-8601): 2018-02-02

VMware ESXi 6.5.0 build-5969303

ESXi 6.5 U1 | Release date: 7/27/2017 | Build number: 5969303 | Installer build number: N/A

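Below is the stack trace captured when the purple screen occurred. It shows that the PSOD was triggered by a LINT1/NMI, which most likely points to a hardware problem.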

2020-07-22T19:47:32.067Z cpu0:66825)@BlueScreen: LINT1/NMI (motherboard nonmaskable interrupt), undiagnosed. This may be a hardware problem; please contact your hardware vendor.

2020-07-22T19:47:32.068Z cpu0:66825)Code start: 0x41802ca00000 VMK uptime: 127:07:45:14.433

2020-07-22T19:47:32.068Z cpu0:66825)0x4380c0002c60:[0x41802caed451]PanicvPanicInt@vmkernel#nover+0x545 stack: 0x41802caed451

2020-07-22T19:47:32.068Z cpu0:66825)0x4380c0002d00:[0x41802caed4dd]Panic_NoSave@vmkernel#nover+0x4d stack: 0x4380c0002d60

2020-07-22T19:47:32.068Z cpu0:66825)0x4380c0002d60:[0x41802caea7ae]NMICheckLint1@vmkernel#nover+0x19a stack: 0x0

2020-07-22T19:47:32.069Z cpu0:66825)0x4380c0002e20:[0x41802caea844]NMI_Interrupt@vmkernel#nover+0x94 stack: 0x0

2020-07-22T19:47:32.069Z cpu0:66825)0x4380c0002ea0:[0x41802cb2c531]IDTNMIWork@vmkernel#nover+0x99 stack: 0x0

2020-07-22T19:47:32.069Z cpu0:66825)0x4380c0002f20:[0x41802cb2d9c1]Int2_NMI@vmkernel#nover+0x19 stack: 0x418040000000

2020-07-22T19:47:32.069Z cpu0:66825)0x4380c0002f40:[0x41802cb3d044]gate_entry_@vmkernel#nover+0x0 stack: 0x0

2020-07-22T19:47:32.070Z cpu0:66825)0x43916849bcf0:[0x41802ca8b9c2]Power_ArchSetCState@vmkernel#nover+0x106 stack: 0x7fffffffffffffff

2020-07-22T19:47:32.070Z cpu0:66825)0x43916849bd20:[0x41802ccc49d3]CpuSchedIdleLoopInt@vmkernel#nover+0x39b stack: 0x1

2020-07-22T19:47:32.070Z cpu0:66825)0x43916849bd90:[0x41802ccc728a]CpuSchedDispatch@vmkernel#nover+0x114a stack: 0x410000000001

2020-07-22T19:47:32.071Z cpu0:66825)0x43916849bec0:[0x41802ccc8502]CpuSchedWait@vmkernel#nover+0x27a stack: 0x100000000000000

2020-07-22T19:47:32.071Z cpu0:66825)0x43916849bf40:[0x41802ccc85d5]CpuSched_NoEvqWait@vmkernel#nover+0x19 stack: 0x0

2020-07-22T19:47:32.071Z cpu0:66825)0x43916849bf50:[0x41802d5cc345]TcpipDispatch@(tcpip4)#+0x345 stack: 0x6

2020-07-22T19:47:32.071Z cpu0:66825)0x43916849bfe0:[0x41802ccc91b5]CpuSched_StartWorld@vmkernel#nover+0x99 stack: 0x0

2020-07-22T19:47:32.075Z cpu0:66825)base fs=0x0 gs=0x418040000000 Kgs=0x0
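
One way to read the trace above is to split it at gate_entry_: the frames listed before it appear to be the NMI handler path that raised the panic, and the frames after it show what the CPU was doing when the NMI arrived (here, the idle loop entering a C-state). The Python sketch below does that split mechanically. The regular expression and the helper names parse_frames and split_at_nmi_entry are illustrative assumptions derived only from the line format shown in this log; they are not part of any VMware tool.

```python
import re

# One backtrace frame looks like:
#   ...cpu0:66825)0x4380c0002d60:[0x41802caea7ae]NMICheckLint1@vmkernel#nover+0x19a stack: 0x0
# Assumption: this pattern is inferred from the log above, not from a VMware spec.
FRAME_RE = re.compile(
    r"\)(?P<sp>0x[0-9a-f]+):\[(?P<ip>0x[0-9a-f]+)\]"
    r"(?P<symbol>[^@]+)@(?P<module>[^+]+)\+(?P<offset>0x[0-9a-f]+)",
    re.IGNORECASE,
)


def parse_frames(log_text):
    """Return (symbol, module) tuples for every backtrace frame, in log order."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append((match.group("symbol"), match.group("module")))
    return frames


def split_at_nmi_entry(frames, entry_symbol="gate_entry_"):
    """Split frames into (handler path incl. entry, interrupted context)."""
    symbols = [symbol for symbol, _ in frames]
    if entry_symbol not in symbols:
        return frames, []
    index = symbols.index(entry_symbol)
    return frames[: index + 1], frames[index + 1:]


# Four frames copied verbatim from the trace above.
SAMPLE = """\
2020-07-22T19:47:32.068Z cpu0:66825)0x4380c0002d60:[0x41802caea7ae]NMICheckLint1@vmkernel#nover+0x19a stack: 0x0
2020-07-22T19:47:32.069Z cpu0:66825)0x4380c0002f40:[0x41802cb3d044]gate_entry_@vmkernel#nover+0x0 stack: 0x0
2020-07-22T19:47:32.070Z cpu0:66825)0x43916849bcf0:[0x41802ca8b9c2]Power_ArchSetCState@vmkernel#nover+0x106 stack: 0x7fffffffffffffff
2020-07-22T19:47:32.071Z cpu0:66825)0x43916849bf50:[0x41802d5cc345]TcpipDispatch@(tcpip4)#+0x345 stack: 0x6
"""

handler_path, interrupted_path = split_at_nmi_entry(parse_frames(SAMPLE))
print("NMI handler path:   ", [symbol for symbol, _ in handler_path])
print("Interrupted context:", [symbol for symbol, _ in interrupted_path])
```

With these sample frames, the interrupted context starts at Power_ArchSetCState, which is consistent with an idle CPU rather than a driver executing faulty code. That observation alone does not identify the failing component.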

The IPMI log contains the following event at about the same time (roughly 6 seconds after the PSOD timestamp):

162 2020-07-22T19:47:38 2 111 (Unknown) 2 (System Event) 83 Assert + Slot/Connector Fault Status
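
The correlation between the two logs can be checked with a small timestamp comparison. The sketch below is only an illustration: it assumes that the ESXi host clock and the BMC clock are both in UTC and reasonably synchronized, which the logs themselves do not confirm, and the helper names parse_utc and within_window and the 60-second window are hypothetical choices.

```python
from datetime import datetime, timezone

# Timestamps copied from the vmkernel log and the IPMI event above.
PSOD_TIME = "2020-07-22T19:47:32.067Z"
IPMI_TIME = "2020-07-22T19:47:38"


def parse_utc(timestamp):
    """Parse either timestamp style; assumes (unverified) that both are UTC."""
    for fmt in ("%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%S"):
        try:
            return datetime.strptime(timestamp, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError(f"unrecognized timestamp format: {timestamp!r}")


def within_window(first, second, window_seconds=60):
    """True if the two events are at most window_seconds apart."""
    delta = parse_utc(second) - parse_utc(first)
    return abs(delta.total_seconds()) <= window_seconds


delta_seconds = (parse_utc(IPMI_TIME) - parse_utc(PSOD_TIME)).total_seconds()
print(f"IPMI event logged {delta_seconds:.3f} s after the PSOD")
print("Within 60 s window:", within_window(PSOD_TIME, IPMI_TIME))
```

If the BMC clock is known to drift from the host clock, the window should be widened accordingly before treating the two events as related.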

Next step:

The server hardware vendor needs to investigate further.
