
Notes: Using the HP-UX CSTM command to list memory problems

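I have found CSTM to be a very useful tool: it shows the information and status of each hardware component on HP-UX, as well as hardware faults. It works well together with the sl command on the MP (Management Processor) of an HP server. First check whether the event log contains any errors; if it does, use the CSTM command on the HP-UX system to view the details.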

For example, when I ran the SL–sel–l command in the MP, I saw the following error: MEM_SBE_IN_RANK

As the root user, type cstm in a terminal to start the tool.

# cstm
Running Command File (/usr/sbin/stm/ui/config/.stmrc).

-- Information --
Support Tools Manager

Version C.48.05

Product Number B4708AA

(C) Copyright Hewlett Packard Co. 1995-2004
All Rights Reserved

Use of this program is subject to the licensing restrictions described
in "Help-->On Version". HP shall not be liable for any damages resulting
from misuse or unauthorized use of this program.

cstm>
           

Type map to list all the hardware in the host.

cstm>map
                                      PAYDB

  Dev                                                 Last        Last Op
  Num  Path                 Product                   Active Tool Status
  ===  ==================== ========================= =========== =============
*   1  system               system (1016)             Information Successful
*   2  memory               IPF_MEMORY (1016)         Information Successful
*   3  0                    Cell (ffffffff)           Information Successful
*   4  0/0                  Bus Adapter (103c12eb)    Information Successful
*   5  0/0/0                PCI Bus Adapter (103c122e Information Successful
*   6  0/0/0/1/0            PCI-X 1000Base-T Interfac Information Successful
*   7  0/0/0/2/0            MPT SCSI Adapter (MPT SCS Information Successful
*   8  0/0/0/2/0.6.0        SCSI Disk (HP300)         Information Successful
*   9  0/0/0/2/1            MPT SCSI Adapter (MPT SCS Information Successful
*  10  0/0/0/3/0            MPT SCSI Adapter (MPT SCS Information Successful
*  11  0/0/0/3/0.6.0        SCSI Disk (HP300)         Information Successful
*  12  0/0/0/3/1            MPT SCSI Adapter (MPT SCS Information Successful
*  13  0/0/1                PCI Bus Adapter (103c12ee Information Successful
*  14  0/0/2                PCI Bus Adapter (103c12ee Information Successful
*  15  0/0/4                PCI Bus Adapter (103c12ee Information Successful
*  16  0/0/6                PCI Bus Adapter (103c12ee Information Successful
*  17  0/0/8                PCI Bus Adapter (103c12ee Information Successful
*  18  0/0/8/1/0            PCI 1000Base-T LAN Adapte Information Successful
*  19  0/0/8/1/1            PCI 1000Base-T LAN Adapte Information Successful
*  20  0/0/10               PCI Bus Adapter (103c12ee Information Successful
*  21  0/0/10/1/0           PCI 1000Base-T LAN Adapte Information Successful
*  22  0/0/10/1/1           PCI 1000Base-T LAN Adapte Information Successful
*  23  0/0/12               PCI Bus Adapter (103c12ee Information Successful
*  24  0/0/12/1/0           FC Interface (HPAB378B_QL Information Successful
*  25  0/0/12/1/0.11        Fibre Channel Driver (Mas
*  26  0/0/12/1/0.11.3.0.0. EMC Array (EMCSYMMETRIX)
*  27  0/0/12/1/0.11.3.0.0. EMC Array (EMCSYMMETRIX)
*  28  0/0/12/1/0.11.3.0.0. EMC Array (EMCSYMMETRIX)
*  29  0/0/12/1/0.11.3.0.0. EMC Array (EMCSYMMETRIX)
*  30  0/0/12/1/0.11.3.0.0. EMC Array (EMCSYMMETRIX)
*  31  0/0/12/1/0.11.3.255. EMC Array (EMCSYMMETRIX)
*  32  0/0/14               PCI Bus Adapter (103c12ee Information Successful
*  33  0/0/14/1/0           FC Interface (HPAB378B_QL Information Successful
*  34  0/0/14/1/0.21        Fibre Channel Driver (Mas
*  35  0/0/14/1/0.21.3.0.0. EMC Array (EMCSYMMETRIX)
*  36  0/0/14/1/0.21.3.0.0. EMC Array (EMCSYMMETRIX)
*  37  0/0/14/1/0.21.3.0.0. EMC Array (EMCSYMMETRIX)
*  38  0/0/14/1/0.21.3.0.0. EMC Array (EMCSYMMETRIX)
*  39  0/0/14/1/0.21.3.0.0. EMC Array (EMCSYMMETRIX)
*  40  0/0/14/1/0.21.3.255. EMC Array (EMCSYMMETRIX)
*  41  0/120                CPU (1016)
*  42  0/121                CPU (1016)
*  43  0/122                CPU (1016)
*  44  0/123                CPU (1016)
*  45  0/124                CPU (1016)
*  46  0/125                CPU (1016)
*  47  0/126                CPU (1016)
*  48  0/127                CPU (1016)
*  49  0/250                Core I/O Adapter (fffffff
*  50  0/250/0              ACPI Device (41435049)    Information Successful
*  51  0/250/1              IPMI Controller (49504930 Information Successful
*  52  0/250/2              RS-232 Interface (504e503 Information Successful
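
To pick out a device number from a long map listing without reading it by eye, the rows can be parsed with a short script. This is only a sketch: it assumes the map output has been saved as text, and the names `MAP_SAMPLE`, `parse_map` and `device_numbers` are made up for illustration. The sample rows are copied from the listing above.

```python
import re

# A few rows copied from the map output above.
MAP_SAMPLE = """\
  Dev                                                 Last        Last Op
  Num  Path                 Product                   Active Tool Status
  ===  ==================== ========================= =========== =============
*   1  system               system (1016)             Information Successful
*   2  memory               IPF_MEMORY (1016)         Information Successful
*   3  0                    Cell (ffffffff)           Information Successful
*  41  0/120                CPU (1016)
"""

# Optional "*", device number, hardware path, then the rest of the row.
ROW = re.compile(r"^\*?\s*(\d+)\s+(\S+)\s+(.*)$")


def parse_map(text):
    """Return a list of (device number, path, rest of row) tuples."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:  # header and separator lines do not match
            rows.append((int(m.group(1)), m.group(2), m.group(3).strip()))
    return rows


def device_numbers(rows, path):
    """Return every device number whose path equals `path`.

    A list is returned because long paths are truncated in the map
    output, so several rows can show the same path text.
    """
    return [num for num, p, _ in rows if p == path]


rows = parse_map(MAP_SAMPLE)
print(device_numbers(rows, "memory"))
```

The number found for the memory path is the one to pass to sel dev.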
           

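Select the device you want to inspect by its number. Taking memory as an example, type sel dev 2.

Then type info at the prompt to collect the device's information from the system kernel.

Finally, type il at the prompt to list the device's information.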
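
The same three commands can probably be run without the interactive prompt. The sketch below rests on several assumptions that are not in my session: that cstm reads `;`-separated commands from standard input, that the binary lives at `/usr/sbin/cstm`, and that a `wait` command is needed so the information tool finishes before the log is read. The function names are hypothetical, and `run_cstm` requires a Python 3.7+ interpreter on the HP-UX host.

```python
import subprocess


def build_cstm_script(dev_num):
    """Join the commands from the steps above into one ';'-separated line."""
    # "wait" is an assumption: it is added so that the information tool
    # has finished before "il" reads its log.
    return ";".join([f"sel dev {dev_num}", "info", "wait", "il"])


def run_cstm(dev_num, cstm_path="/usr/sbin/cstm"):
    """Feed the command line to cstm on stdin and return its output.

    Assumed batch usage; cstm_path is an assumed default location.
    """
    result = subprocess.run(
        [cstm_path],
        input=build_cstm_script(dev_num) + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Only the command line is built here; run_cstm(2) needs a real HP-UX host.
print(build_cstm_script(2))
```

The interactive session I actually ran follows.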

cstm>sel dev 2
cstm>info
-- Updating Map --
Updating Map...
cstm>il
-- Converting multiple raw log files to text. --
Preparing the Information Tool Log for each selected device...

.... PAYDB  :  10.127.8.181 .... 

-- Information Tool Log for system on path system --

Log creation time: Wed Aug 17 16:30:35 2011

Hardware path: system

Product ID                : ia64 hp server rx8640
Current Product Number    : AB297A
Original Product Number   : AB297A
System Firmware Revision  : 9.022
BMC Revision              : v03.01
System Serial Number:     : xxxxxxxxxxxx

System Software ID           : xxxxxxxxxx

      For additional information about the system and the CPUs, please run the
      following command:

                 /usr/contrib/bin/machinfo

Field Replaceable Unit Identification (FRUID):

=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=

-- Information Tool Log for IPF_MEMORY on path memory --

Log creation time: Wed Aug 17 16:30:35 2011

Hardware path: memory

Basic Memory Description 

   Module Type: MEMORY
   Page Size: 4096 Bytes
   Total Physical Memory: 65536 MB               # 64 GB of memory in total
   Total Configured Memory: 65536 MB
   Total Deconfigured Memory: 0 MB

Memory Board Inventory 

   DIMM Location          Size(MB) State   Serial Num       Part Num
   --------------------   -------- ------- ---------------- ------------------
   Cab 0 Cell 0 DIMM 0A   4096     Config  PRY081636Y       A9849-60301
   Cab 0 Cell 0 DIMM 0B   4096     Config  PRY090561K       A9849-60301
   Cab 0 Cell 0 DIMM 1A   4096     Config  PRY0905288       A9849-60301
   Cab 0 Cell 0 DIMM 1B   4096     Config  PRY08163D7       A9849-60301
   Cab 0 Cell 0 DIMM 2A   4096     Config  PRY08161EP       A9849-60301
   Cab 0 Cell 0 DIMM 2B   4096     Config  PRY08166SL       A9849-60301
   Cab 0 Cell 0 DIMM 3A   4096     Config  PRY0816114       A9849-60301
   Cab 0 Cell 0 DIMM 3B   4096     Config  PRY0816111       A9849-60301
   Cab 0 Cell 0 DIMM 4A   4096     Config  PRY08163D3       A9849-60301
   Cab 0 Cell 0 DIMM 4B   4096     Config  PRY08161ES       A9849-60301
   Cab 0 Cell 0 DIMM 5A   4096     Config  PRY081619T       A9849-60301
   Cab 0 Cell 0 DIMM 5B   4096     Config  PRY08163X5       A9849-60301
   Cab 0 Cell 0 DIMM 6A   4096     Config  PRY081619U       A9849-60301
   Cab 0 Cell 0 DIMM 6B   4096     Config  PRY081630Z       A9849-60301
   Cab 0 Cell 0 DIMM 7A   4096     Config  PRY0816183       A9849-60301
   Cab 0 Cell 0 DIMM 7B   4096     Config  PRY0816371       A9849-60301       

   Cab 0 Cell 0 Total: 65536 (MB)

   ===========================================================================

Memory Error Log Summary        # Error information recorded for the system memory. If any errors are listed here, the memory may be faulty. In my case the DIMM in slot 6A is reporting Single-Bit errors.

   DIMM Location           Error Address     Error Type  Page           Count
   ----------------------  ----------------  ----------  -------------  -----
   Cab 0 Cell 0 DIMM 6A    0xa34077f80       Single-Bit  0xa34077       1
   Cab 0 Cell 0 DIMM 6A    0xb4a727f80       Single-Bit  0xb4a727       1
   Cab 0 Cell 0 DIMM 6A    0xc88f27f80       Single-Bit  0xc88f27       1
   Cab 0 Cell 0 DIMM 6A    0xa324a7f80       Single-Bit  0xa324a7       1
   Cab 0 Cell 0 DIMM 6A    0xc112d7f80       Single-Bit  0xc112d7       1    

   ===========================================================================

-- Information Tool Log for each selected device --
View   - To View the file.
Print - To Print the file.
SaveAs - To Save the file.
Enter Done, Help, Print, SaveAs, or View: [Done]
cstm>
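
When the log is long, it helps to total the errors per DIMM and look up the serial and part number of the suspect module from the inventory table. The following is a minimal sketch, assuming the information log has been saved as text. The column layout is taken from the output above, and the names `LOG_SAMPLE` and `summarize` are made up for illustration.

```python
import re
from collections import Counter

# Rows copied from the Memory Board Inventory and Memory Error Log Summary above.
LOG_SAMPLE = """\
   Cab 0 Cell 0 DIMM 6A   4096     Config  PRY081619U       A9849-60301
   Cab 0 Cell 0 DIMM 6B   4096     Config  PRY081630Z       A9849-60301

   Cab 0 Cell 0 DIMM 6A    0xa34077f80       Single-Bit  0xa34077       1
   Cab 0 Cell 0 DIMM 6A    0xb4a727f80       Single-Bit  0xb4a727       1
   Cab 0 Cell 0 DIMM 6A    0xc88f27f80       Single-Bit  0xc88f27       1
   Cab 0 Cell 0 DIMM 6A    0xa324a7f80       Single-Bit  0xa324a7       1
   Cab 0 Cell 0 DIMM 6A    0xc112d7f80       Single-Bit  0xc112d7       1
"""

LOCATION = r"(Cab \d+ Cell \d+ DIMM \w+)"
# location, size (MB), state, serial number, part number
INVENTORY_ROW = re.compile(LOCATION + r"\s+(\d+)\s+(\S+)\s+(\S+)\s+(\S+)$")
# location, error address, error type, page, count
ERROR_ROW = re.compile(
    LOCATION + r"\s+(0x[0-9a-fA-F]+)\s+(\S+)\s+(0x[0-9a-fA-F]+)\s+(\d+)$"
)


def summarize(text):
    """Return (inventory, errors) parsed from an information log.

    inventory maps DIMM location -> (serial number, part number).
    errors maps (DIMM location, error type) -> summed error count.
    """
    inventory = {}
    errors = Counter()
    for line in text.splitlines():
        line = line.strip()
        m = ERROR_ROW.match(line)
        if m:
            errors[(m.group(1), m.group(3))] += int(m.group(5))
            continue
        m = INVENTORY_ROW.match(line)
        if m:
            inventory[m.group(1)] = (m.group(4), m.group(5))
    return inventory, errors


inventory, errors = summarize(LOG_SAMPLE)
for (location, kind), count in errors.items():
    serial, part = inventory.get(location, ("unknown", "unknown"))
    print(f"{location}: {count} {kind} error(s), serial {serial}, part {part}")
```

For the log above this reports five Single-Bit errors on DIMM 6A, together with that module's serial and part number.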