
Monitoring NetApp NAS Storage Capacity with Nagios: Volumes and Qtrees

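   As a veteran of coasting along at work, I occasionally need to show what I can do so the boss doesn't mistake me for thin air. More importantly, of course: coast well!

Background: the company uses a NetApp V3240 NAS to serve CIFS shares, with Qtrees providing attachment storage for application systems.

     One day an application suddenly could not upload attachments. Investigation showed the Quota was full; we urgently expanded the capacity and service recovered. Upper management was not happy about this, because users reported the fault to operations and operations had not caught it beforehand. To keep this from happening again, management asked for a solution that monitors Qtree usage and alerts in time. Fine, I volunteered.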

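Monitoring with Nagios:

   The company already runs Nagios. As we all know, Nagios is very powerful and has a huge number of monitoring plugins, so you almost never need to write your own; just search the Nagios Exchange site.

   Nagios Exchange: http://exchange.nagios.org . Look for highly rated plugins to download.

   The plugin I downloaded is this one: http://exchange.nagios.org/directory/Plugins/Hardware/Storage-Systems/SAN-and-NAS/NetApp/check_netapp-2Ddu/details

   Read the English description on that page to learn what the plugin can do, how to use it, and what to watch out for.

   The download contains 4 files: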

File                        Description
check_netapp-du             Plugin code (the source)
netapp-shares.map           Share to volume mapping file (I did not use it)
netapp-gencache             Cache generation (sample script for rebuilding the cache)
NETWORK-APPLIANCE-MIB.txt   NetApp MIBs (the NetApp MIB library)

   A quick walk through the plugin's documentation:

A very fast SNMP based disk space checker for NetApp NAS filers, which supports both Volumes and QTrees, also over 2TB.

In other words: an SNMP-based disk space check for NetApp NAS storage that handles both Volumes and Qtrees, including ones over 2TB.

This is a disk space checker for NetApp filers, based on an older plugin by Rob Hassing. The plugin interrogates the NAS via SNMP and supports both Volumes and subdirectories with user quota aka QTrees, also ones larger than 2TB.

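This credits the plugin's earlier author. The plugin queries the filer over SNMP and supports Volumes as well as subdirectories with user quotas (also known as Qtrees).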

The plugin uses a map file (netapp-shares.map) and some basic heuristics to translate Windows share names to volume/Qtree names (CIFS share names are not exposed via SNMP).

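The plugin uses a mapping file plus some basic heuristics to translate Windows share names into Volume/Qtree names, since CIFS share names cannot be queried directly over SNMP.

My reading of this: to monitor a shared folder such as \\share.company.com\share1, you need the netapp-shares.map file.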

Usage:

check_netapp-du -H host -v volume [-C community] -w warnpct -c critpct [--force-gencache] [--debug]

That is: check_netapp-du -H hostname/IP -v volume/Qtree name [-C community string] -w warning threshold -c critical threshold [--force-gencache] [--debug]. Here --force-gencache refreshes the cache and --debug turns on debug mode.
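To make the -w and -c thresholds concrete, here is a minimal sketch of how a usage percentage could map to the standard Nagios states and exit codes (0 OK, 1 WARNING, 2 CRITICAL). The function name classify_usage is hypothetical and not part of check_netapp-du, and the strict "above" comparison is an assumption based on the wording of the plugin's help text.

```shell
#!/bin/sh
# classify_usage PCT WARN CRIT
# Hypothetical helper (not part of check_netapp-du): prints the Nagios state
# for a usage percentage and returns the matching exit code.
# Assumption: a state is raised only when usage is strictly above the threshold.
classify_usage() {
    pct=$1; warn=$2; crit=$3
    # awk handles fractional percentages such as 4.68
    awk -v p="$pct" -v w="$warn" -v c="$crit" 'BEGIN {
        if (p > c)      { print "CRITICAL"; exit 2 }
        else if (p > w) { print "WARNING";  exit 1 }
        else            { print "OK";       exit 0 }
    }'
}

# Example: 4.68% used, warning at 75%, critical at 90%
classify_usage 4.68 75 90
```

Nagios reads the exit code of the plugin, not the printed text, so the return value is what decides the service state.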

For performance reasons, the SNMP Volume and QTree IDs are looked up from a local cache file rather than retrieved from the NetApp filer every time. It will be necessary to regenerate this cache regularly, as volumes are added or changed on the NAS.

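For performance reasons, the SNMP Volume and Qtree IDs are looked up in a local cache file, so the plugin does not have to query the NetApp filer for them on every run. As a result, whenever a volume is added or changed on the NAS, the cache file must be regenerated.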

Since the running time of this cache generation may be too long for Nagios (depending on the global service check time-out), it is recommended to do this only from an interactive shell or a cron job. Run a check on any volume adding the "--force-gencache" parameter to regenerate the volume ID cache. A sample script is provided which can be run from cron, provided you are running Opsview, and provided all NetApp filers are in the same host group.

My recommendation: use a crontab entry to refresh the cache file periodically.
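A periodic refresh could look roughly like the sketch below. It is not the netapp-gencache sample script shipped with the plugin; it is a hypothetical wrapper that simply runs one check with --force-gencache. The plugin path, the script name, and the schedule in the comment are assumptions to adapt to your own installation.

```shell
#!/bin/sh
# Hypothetical cron wrapper (not shipped with the plugin): force a rebuild of
# the OID cache for one filer by running a single check with --force-gencache.
# Assumption: the plugin lives in the default Nagios libexec directory.
PLUGIN="${PLUGIN:-/usr/local/nagios/libexec/check_netapp-du}"

# refresh_cache HOST VOLUME
# Any existing Volume or Qtree name will do; only the side effect matters.
refresh_cache() {
    host=$1; volume=$2
    "$PLUGIN" -H "$host" -v "$volume" -w 75 -c 90 --force-gencache
}

# Run only when called with a host and a volume, e.g. from /etc/cron.d:
#   30 2 * * * nagios /usr/local/bin/refresh-netapp-cache.sh 10.199.94.11 cljgshare
if [ "$#" -eq 2 ]; then
    refresh_cache "$1" "$2" >/dev/null 2>&1
fi
```

Running it as the same user that Nagios uses keeps the cache file readable by the regular checks.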

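Monitoring steps:

1. Put check_netapp-du in the Nagios libexec plugin directory and make sure the user Nagios runs as can execute it. I won't go into permissions any further.

2. Run less check_netapp-du and you will see the following: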

$PROGNAME = "check_netapp-du";
$REVISION = "2.1b";
$ENV{'PATH'}='';
$ENV{'BASH_ENV'}='';
$ENV{'ENV'}='';

# Path to manual CIFS to volume/qtree map file
my $MAPFILE = "/data/scripts/nagios/NAS/netapp-shares.map";

# Path to NetApp MIB table (get it from http://www.protocolsoftware.com/documents/mibs/netapp.mib.txt)
my $MIBFILE = "/usr/share/snmp/mibs/NETWORK-APPLIANCE-MIB.txt";

# Path to location of cache files
my $CACHEPATH = "/data/scripts/nagios/NAS";

3. Create the cache directory and copy the MIB file to where it belongs:

   mkdir -p /data/scripts/nagios/NAS && cp NETWORK-APPLIANCE-MIB.txt /usr/share/snmp/mibs/

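4. Next, configure SNMP on the NetApp storage:

   First look at the existing Qtrees and pick one to test with in a moment:

[Screenshot: Qtree list in the NetApp management tool]

   Configure SNMP:

[Screenshot: SNMP settings on the filer]

5. Run a test to see whether it returns the expected result:

[Screenshot: output of the plugin test run]

   It worked. The usage shows 4.68%, matching what the NetApp management tool displayed above.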

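   After the command first runs successfully, it generates the cache file shown below. On later runs the plugin first looks up the Qtree's OID in this cache, then uses that OID for the SNMP query. You can think of the OID as a unique numeric identifier for the Volume or Qtree: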

[v_nashengzhi@BJS0-0TH libexec]$ ls -al /data/scripts/nagios/NAS/
total 12
drwxr-xr-x 2 nagios nagios 4096 Sep 13 10:08 .
drwxr-xr-x 3 nagios nagios 4096 Aug  9 15:32 ..
-rw-r--r-- 1 nagios nagios 1160 Sep 13 10:08 .netapp-oidcache.10.199.94.11
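Given the file naming seen in the listing above (.netapp-oidcache.<host>), a small staleness check can decide when a refresh is due. This is only a sketch: the helper names are hypothetical, CACHEPATH defaults to the value set in the plugin source, and the age limit is whatever you pass in.

```shell
#!/bin/sh
# Assumption: CACHEPATH matches the value configured in check_netapp-du.
CACHEPATH="${CACHEPATH:-/data/scripts/nagios/NAS}"

# cache_file_for HOST
# Prints the cache file path, following the ".netapp-oidcache.<host>" pattern.
cache_file_for() {
    printf '%s/.netapp-oidcache.%s\n' "$CACHEPATH" "$1"
}

# cache_is_stale HOST MAX_DAYS
# Succeeds (exit 0) when the cache file is missing or older than MAX_DAYS days.
cache_is_stale() {
    file=$(cache_file_for "$1"); max_days=$2
    if [ ! -f "$file" ]; then
        return 0
    fi
    # find prints the path only if it was modified more than MAX_DAYS days ago
    [ -n "$(find "$file" -mtime +"$max_days" 2>/dev/null)" ]
}
```

For example, `cache_is_stale 10.199.94.11 7 && echo "refresh needed"` would flag a cache that has not been rebuilt for more than a week.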


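6. If a new Qtree share is added on the NetApp NAS, the cache file does not yet contain its OID. The cache is not updated in real time; it is refreshed only on demand, so you need to run this command (or set up a crontab entry yourself):

   [v_nashengzhi@BJS0-0TH libexec]$ sudo ./check_netapp-du -H 10.199.94.11 -v cljgshare -w 75 -c 90 --force-gencache

   This runs an arbitrary query and forces the cache file to be regenerated.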

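7. That basically completes the monitoring setup. At its core this is still SNMP: as long as a device supports SNMP and we have its MIB, we can monitor the objects the MIB defines. I am only introducing the monitoring script here, because I noticed that few blog posts cover monitoring storage devices with Nagios; most are about network devices.

   The remaining work is to add the directories to be monitored as Nagios services. Here is the final monitoring result: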

[Screenshots: the resulting service status in Nagios]
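Adding the check to Nagios could be sketched as follows. The script prints a command definition and one service definition per Qtree, which you can then paste into your object configuration. The command name check_netapp_du, the host name netapp-nas, and the generic-service template are assumptions for illustration; the post does not show the author's actual definitions.

```shell
#!/bin/sh
# Print a Nagios command definition that wraps the plugin.
# Assumption: $USER1$ points at the libexec directory, as in the sample configs.
emit_command() {
    cat <<'EOF'
define command {
    command_name    check_netapp_du
    command_line    $USER1$/check_netapp-du -H $HOSTADDRESS$ -v $ARG1$ -w $ARG2$ -c $ARG3$
}
EOF
}

# emit_service QTREE WARN CRIT
# Print one service definition; host_name and the template are hypothetical.
emit_service() {
    qtree=$1; warn=$2; crit=$3
    cat <<EOF
define service {
    use                     generic-service
    host_name               netapp-nas
    service_description     Qtree ${qtree} usage
    check_command           check_netapp_du!${qtree}!${warn}!${crit}
}
EOF
}

emit_command
emit_service cljgshare 75 90
```

Leaving --force-gencache out of the command line matters here, since the plugin's own help warns against using it from Nagios.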

8. Here are the other parameters the plugin supports:

[v_nashengzhi@BJS0-0TH libexec]$ sudo ./check_netapp-du --help
Password:
Sorry, try again.
check_netapp-du v2.1b (nagios-plugins 1.4.15)
The nagios plugins come with ABSOLUTELY NO WARRANTY. You may redistribute
copies of the plugins under the terms of the GNU General Public License.
For more information about these matters, see the file named COPYING.
Copyright (c) 2009 Rob Hassing, 2012 Peter Mc Aulay

This plugin reports the usage of a Netapp Storage volume

Usage: check_netapp-du -H <host> -v <volume> [-C community] -w <warn> -c <crit> [--force-gencach

-H, --hostname=HOST
  Name or IP address of host to check
-v, --volume=Volume
  Name of the volume to check
-C, --community=community
  SNMP read community (default public)
-w, --warning=X
  Percentage above which a WARNING status will result
-c, --critical=X
  Percentage above which a CRITICAL status will result
--force-gencache
  Force cache file generation (slow, don't use from Nagios)
--debug
  Show lots of debug messages
