
Hard Disk Monitoring and Analysis Tool: Smartctl

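smartctl (S.M.A.R.T.: Self-Monitoring, Analysis and Reporting Technology) is a command-line utility for carrying out SMART tasks on Unix-like systems. It prints SMART self-test and error logs, enables and disables SMART automatic testing, and initiates device self-tests.

smartctl is especially useful on physical Linux servers, where it can check SMART-capable disks for errors and extract information about disks attached to a hardware RAID controller.

In this post we will walk through some practical examples of the smartctl command. If smartctl is not yet installed on your Linux system, install it with the steps below.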


For Ubuntu

$ sudo apt-get install smartmontools

For CentOS & RHEL

# yum install smartmontools

Start the SMART monitoring service

For Ubuntu

$ sudo /etc/init.d/smartmontools start

For CentOS & RHEL

# service smartd start ; chkconfig smartd on

Check whether SMART is enabled on a disk:

root@linuxtechi:~# smartctl -i /dev/sdb

smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.13.0-32-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Momentus 5400.6
Device Model:     ST9320325AS
Serial Number:    5VD2V59T
LU WWN Device Id: 5 000c50 020a37ec4
Firmware Version: 0002BSM1
User Capacity:    320,072,933,376 bytes [320 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 2.6, 1.5 Gb/s
Local Time is:    Sun Nov 16 12:32:09 2014 IST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

Here '/dev/sdb' is your hard disk. The last two lines of the output above show that SMART support is available and enabled.
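In a script, this check can be automated by looking for the final "SMART support is" line. The following is a minimal sketch, not something shipped with smartmontools: `smart_is_enabled` is a hypothetical helper name, and it assumes the ATA-style `smartctl -i` output format is supplied on standard input.

```shell
#!/bin/sh
# Hypothetical helper (not part of smartmontools).
# Reads `smartctl -i` output on stdin and returns success only if
# a "SMART support is: Enabled" line is present.
smart_is_enabled() {
    grep -Eiq '^SMART support is:[[:space:]]*Enabled'
}

# Example invocation on a real system (needs root and smartctl installed):
#   smartctl -i /dev/sdb | smart_is_enabled && echo "SMART is enabled"
```

Reading from standard input keeps the helper independent of the device name, so it can also be run against saved output.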

Enable SMART on a disk:

root@linuxtechi:~# smartctl -s on /dev/sdb

=== START OF ENABLE/DISABLE COMMANDS SECTION ===
SMART Enabled.

Disable SMART on a disk:

root@linuxtechi:~# smartctl -s off /dev/sdb

SMART Disabled. Use option -s with argument 'on' to enable it.

Display detailed SMART information for a disk:

root@linuxtechi:~# smartctl -a /dev/sdb          # for an IDE drive

root@linuxtechi:~# smartctl -a -d ata /dev/sdb   # for a SATA drive

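Display the overall health of a disk (note the uppercase -H; lowercase -h only prints the usage summary):

root@linuxtechi:~# smartctl -H /dev/sdb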

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
Warning: This result is based on an Attribute check.
Please note the following marginal Attributes:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
190 Airflow_Temperature_Cel 0x0022   067   045   045    Old_age   Always   In_the_past 33 (Min/Max 25/33)
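For monitoring scripts it can help to pull out just the verdict from that report. Below is a rough sketch under stated assumptions: `smart_health_result` is a hypothetical helper name, and it expects the ATA-style "self-assessment test result" line shown above (other device types may word the health line differently). As the sample shows, a PASSED verdict can still come with marginal attributes, so treat it as a coarse check only.

```shell
#!/bin/sh
# Hypothetical helper (not part of smartmontools).
# Reads `smartctl -H` output on stdin and prints the verdict,
# i.e. the text after "self-assessment test result:".
smart_health_result() {
    awk -F': *' 'tolower($0) ~ /self-assessment test result/ { print $2 }'
}

# Example invocation on a real system (needs root and smartctl installed):
#   result=$(smartctl -H /dev/sdb | smart_health_result)
#   [ "$result" = "PASSED" ] || echo "disk needs attention: $result"
```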

Long test

root@linuxtechi:~# smartctl --test=long /dev/sdb

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 102 minutes for test to complete.
Test will complete after Sun Nov 16 14:29:43 2014

Use smartctl -X to abort test.

Alternatively, we can redirect the test output to a log file, as shown below:

root@linuxtechi:~# smartctl --test=long /dev/sdb > /var/log/long.text

Short test

root@linuxtechi:~# smartctl --test=short /dev/sdb

Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
Please wait 1 minutes for test to complete.
Test will complete after Sun Nov 16 12:51:45 2014

root@linuxtechi:~# smartctl --test=short /dev/sdb > /var/log/short.text

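Note: the short test takes at most 2 minutes, whereas the long test has no fixed time limit because it reads and verifies every segment of the disk. The drive reports its own estimate when the test starts (102 minutes in the run above).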

View the drive's self-test results:

root@linuxtechi:~# smartctl -l selftest /dev/sdb

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       90%       492         210841222
# 2  Extended offline    Completed: read failure       90%       492         210841222
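Both entries in this log report a read failure at the same LBA. To flag such entries automatically, the log can be filtered as in the sketch below. The helper name `selftest_failed_lbas` is hypothetical, and the sketch assumes the ATA self-test log layout shown above, where each entry starts with `#` and ends with the LBA of the first error.

```shell
#!/bin/sh
# Hypothetical helper (not part of smartmontools).
# Reads `smartctl -l selftest` output on stdin and prints the
# LBA_of_first_error column for every entry whose status mentions
# a failure.
selftest_failed_lbas() {
    awk '/^#/ && tolower($0) ~ /failure/ { print $NF }'
}

# Example invocation on a real system (needs root and smartctl installed):
#   smartctl -l selftest /dev/sdb | selftest_failed_lbas | sort -u
```

An empty result means no logged self-test reported a failure.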

Show the drive's capabilities, including the estimated time each self-test takes:

root@linuxtechi:~# smartctl -c /dev/sdb

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 121) The previous self-test completed having
                                        the read element of the test failed.
Total time to complete Offline
data collection:                (    0) seconds.
Offline data collection
capabilities:                    (0x73) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 102) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x103b) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.
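The recommended polling times in this listing are the drive's own estimates of how long each self-test takes. The sketch below extracts them as one line per test. The helper name `selftest_polling_times` is hypothetical, and the sketch assumes the two-line layout shown above, where the routine name appears on the line before "recommended polling time".

```shell
#!/bin/sh
# Hypothetical helper (not part of smartmontools).
# Reads `smartctl -c` output on stdin and prints "<Routine>: <N> min"
# for each "recommended polling time" entry.
selftest_polling_times() {
    awk '
        tolower($0) ~ /recommended polling time/ {
            n = $0
            gsub(/[^0-9]/, "", n)    # keep only the digits, e.g. "102"
            print name ": " n " min"
            next
        }
        NF { name = $1 }             # first word of the last non-empty line
    '
}

# Example invocation on a real system (needs root and smartctl installed):
#   smartctl -c /dev/sdb | selftest_polling_times
```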

Display the disk's error log:

root@linuxtechi:~# smartctl -l error /dev/sdb

Sample output:

SMART Error Log Version: 1
ATA Error Count: 5
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error Register [HEX]
        ST = Status Register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
-- -- -- -- -- -- -- --  ----------------  --------------------
25 da 08 e7 e5 a5 4c 00      00:30:44.515  READ DMA EXT
25 da 08 df e5 a5 4c 00      00:30:44.514  READ DMA EXT
25 da 80 5f e5 a5 4c 00      00:30:44.502  READ DMA EXT
25 da f0 5f e6 a5 4c 00      00:30:44.496  READ DMA EXT
25 da 10 4f e6 a5 4c 00      00:30:44.383  READ DMA EXT

Original publication date: 2015-01-16

This article comes from Yunqi partner "Linux中国" (Linux China).
