I. Introduction
1. For more on how NSClient works, see the official NSClient++ site.
2. NSClient++ and NRPE
How NSClient++ works
<a href="http://s3.51cto.com/wyfs02/M02/4A/E9/wKiom1QnYaewuJNPAACIXWqRWWw312.jpg" target="_blank"></a>
How NRPE works
<a href="http://s3.51cto.com/wyfs02/M00/4A/F0/wKiom1Qnc3zAjECMAAEUsOqv9kM145.jpg" target="_blank"></a>
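Nagios can monitor Windows hosts in three main ways:
The first is NSClient++
The second is NRPE
The third is SNMP (not commonly used)
The biggest difference between NSClient++ and NRPE is:
1. With NRPE, the monitored host runs nrpe together with a set of plugins, and the plugins do the actual checking. When the monitoring server sends a check request to nrpe, nrpe calls the matching plugin to carry it out.
2. NSClient++ is different: only NSClient++ is installed on the monitored host, with no separate plugins. When the monitoring server sends a check request to NSClient++, NSClient++ performs the check itself, so all checks are done by NSClient++.
This also points to a real limitation of NSClient++ when it is queried this way: it is less flexible and less extensible, because it can only run the checks it ships with and cannot be extended through plugins. Fortunately NSClient++ already does a good job and covers nearly all of our monitoring needs.
II. Monitoring a Windows host with check_nt
1. Download NSClient++
2. Install NSClient++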
<a href="http://s3.51cto.com/wyfs02/M01/4A/B2/wKiom1QmswLChu7BAAF5ugAoQLo820.jpg" target="_blank"></a>
<a href="http://s3.51cto.com/wyfs02/M02/4A/B4/wKioL1QmsyqyCXr-AAHsG6bmgZw306.jpg" target="_blank"></a>
<a href="http://s3.51cto.com/wyfs02/M02/4A/B2/wKiom1QmswOwzMYBAAGsdr9E0Pc403.jpg" target="_blank"></a>
<a href="http://s3.51cto.com/wyfs02/M00/4A/B4/wKioL1QmsyqTfejsAAGRfdnH-xU473.jpg" target="_blank"></a>
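Allowed hosts: (this is the IP of the nagios (or other) server)
Allowed host addresses: enter the IP address of the Nagios server.
NSClient password (only used via check_nt)
NSClient password: the password Nagios uses to talk to the NSClient++ process; it may be left unset.
Modules to load:
Modules to install and load: NSClient++ ships with the common check plugins, check_nt, check_nrpe, NSCA, and WMI.
We select all of them here, since they will be used later.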
<a href="http://s3.51cto.com/wyfs02/M00/4A/B2/wKiom1QmswOTN7ZdAAF8CMhGqWs779.jpg" target="_blank"></a>
<a href="http://s3.51cto.com/wyfs02/M02/4A/B4/wKioL1Qmsyvjmrj1AADuL2UQatY830.jpg" target="_blank"></a>
Click [Finish] to complete the NSClient++ installation.
<a href="http://s3.51cto.com/wyfs02/M01/4A/B4/wKioL1QmsyujK89FAAGCBr_MNf0266.jpg" target="_blank"></a>
3. Check that the NSClient++ service is running
<a href="http://s3.51cto.com/wyfs02/M02/4A/E9/wKioL1QnXTmgMzAWAAQEAtOdHlA759.jpg" target="_blank"></a>
<a href="http://s3.51cto.com/wyfs02/M01/4A/E9/wKiom1QnYYexKlUCAACG8_0iqhA152.jpg" target="_blank"></a>
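4. Review the NSClient++ configuration file
NSClient++ installs to C:\Program Files\NSClient++ by default, and NSC.ini is the configuration file for the NSClient service. It normally needs no changes, but if the IP address of the monitoring server changes or the password is forgotten, this is where to update them.
Modules loaded at install time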
<a href="http://s3.51cto.com/wyfs02/M00/4A/E9/wKioL1QnXLeisp6DAAOmUPyr8zQ318.jpg" target="_blank"></a>
Allowed hosts: set this to the IP address of the Nagios server
<a href="http://s3.51cto.com/wyfs02/M00/4A/E7/wKiom1QnXJDxt5BDAAZlbauD00Y308.jpg" target="_blank"></a>
Default NRPE port
<a href="http://s3.51cto.com/wyfs02/M01/4A/E9/wKioL1QnXLnCQe-sAATIjXBJX2M538.jpg" target="_blank"></a>
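The three settings shown in the screenshots can also be checked from a command line. The sketch below is only an illustration. It writes a sample NSC.ini to a temporary file and reads single keys back with a small helper, ini_get (a hypothetical name). The section and key names assume the NSClient++ 0.3.x layout, and the password and IP address are made-up placeholders, so compare it with your own file before relying on it.

```shell
#!/bin/sh
# Sample NSC.ini content. Section and key names are assumed from the
# NSClient++ 0.3.x layout; the values are placeholders, not the author's.
sample_ini=$(mktemp)
cat > "$sample_ini" <<'EOF'
[modules]
NSClientListener.dll
NRPEListener.dll
CheckSystem.dll
CheckDisk.dll

[Settings]
password=secret
allowed_hosts=192.168.0.10

[NSClient]
port=12489

[NRPE]
port=5666
EOF

# ini_get FILE SECTION KEY: print the value of KEY inside [SECTION].
ini_get() {
    awk -F= -v section="[$2]" -v key="$3" '
        /^\[/ { in_section = ($0 == section) }   # track the current section
        in_section && $1 == key { print $2; exit }
    ' "$1"
}

echo "allowed_hosts: $(ini_get "$sample_ini" Settings allowed_hosts)"
echo "check_nt port: $(ini_get "$sample_ini" NSClient port)"
echo "NRPE port:     $(ini_get "$sample_ini" NRPE port)"
```

A file copied straight from Windows will usually have CRLF line endings, which this simple helper does not strip.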
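III. Monitoring with NSClient
NSClient++ communicates with the Nagios server mainly through the check_nt plugin on the Nagios server. The diagram below shows how it works.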
<a href="http://s3.51cto.com/wyfs02/M02/4A/E8/wKiom1QnXc7Rghy-AAC9NsiJ0jA271.jpg" target="_blank"></a>
1. check_nt plugin usage
<code>[root@Nagios ~]</code><code># cd /usr/local/nagios/libexec/</code>
<code>[root@Nagios libexec]</code><code># ./check_nt -h # show the check_nt usage</code>
<code>check_nt v2.0.3 (nagios-plugins 2.0.3)</code>
<code>Copyright (c) 2000 Yves Rubin ([email protected])</code>
<code>Copyright (c) 2000-2014 Nagios Plugin Development Team</code>
<code> </code><code><[email protected]></code>
<code>This plugin collects data from the NSClient service running on a</code>
<code>Windows NT</code><code>/2000/XP/2003</code> <code>server.</code>
<code>Usage:</code>
<code>check_nt -H host -</code><code>v</code> <code>variable [-p port] [-w warning] [-c critical]</code>
<code>[-l params] [-d SHOWALL] [-u] [-t timeout]</code>
<code>Options:</code>
<code> </code><code>-h, --help</code>
<code> </code><code>Print detailed help </code><code>screen</code>
<code> </code><code>-V, --version</code>
<code> </code><code>Print version information</code>
<code> </code><code>--extra-opts=[section][@</code><code>file</code><code>]</code>
<code> </code><code>Read options from an ini </code><code>file</code><code>. See</code>
<code> </code><code>https:</code><code>//www</code><code>.nagios-plugins.org</code><code>/doc/extra-opts</code><code>.html</code>
<code> </code><code>for</code> <code>usage and examples.</code>
<code> </code><code>-H, --</code><code>hostname</code><code>=HOST</code>
<code> </code><code>Name of the host to check</code>
<code> </code><code>-p, --port=INTEGER</code>
<code> </code><code>Optional port number (default: 1248) </code><code># default port of the plugin; NSClient++ itself listens on 12489, so pass -p 12489</code>
<code> </code><code>-s, --secret=<password> </code><code># the password to use</code>
<code> </code><code>Password needed </code><code>for</code> <code>the request</code>
<code> </code><code>-w, --warning=INTEGER</code>
<code> </code><code>Threshold </code><code>which</code> <code>will result </code><code>in</code> <code>a warning status</code>
<code> </code><code>-c, --critical=INTEGER</code>
<code> </code><code>Threshold </code><code>which</code> <code>will result </code><code>in</code> <code>a critical status</code>
<code> </code><code>-t, --timeout=INTEGER</code>
<code> </code><code>Seconds before connection attempt </code><code>times</code> <code>out (default: -l, --params=<parameters></code>
<code> </code><code>Parameters passed to specified check (see below) -d, --display={SHOWALL}</code>
<code> </code><code>Display options (currently only SHOWALL works) -u, --unknown-timeout</code>
<code> </code><code>Return UNKNOWN on timeouts10)</code>
<code> </code><code>Print this help </code><code>screen</code>
<code> </code><code>Print version information</code>
<code> </code><code>-</code><code>v</code><code>, --variable=STRING</code>
<code> </code><code>Variable to check</code>
<code>Valid variables are:</code>
<code> </code><code>CLIENTVERSION = Get the NSClient version</code>
<code> </code><code>If -l <version> is specified, will </code><code>return</code> <code>warning </code><code>if</code> <code>versions differ.</code>
<code> </code><code>CPULOAD =</code>
<code> </code><code>Average CPU load on last x minutes.</code>
<code> </code><code>Request a -l parameter with the following syntax:</code>
<code> </code><code>-l <minutes range>,<warning threshold>,<critical threshold>.</code>
<code> </code><code><minute range> should be </code><code>less</code> <code>than 24*60.</code>
<code> </code><code>Thresholds are percentage and up to 10 requests can be </code><code>done</code> <code>in</code> <code>one shot.</code>
<code> </code><code>ie: -l 60,90,95,120,90,95</code>
<code> </code><code>UPTIME =</code>
<code> </code><code>Get the uptime of the machine.</code>
<code> </code><code>-l <unit> </code>
<code> </code><code><unit> = seconds, minutes, hours, or days. (default: minutes)</code>
<code> </code><code>Thresholds will use the unit specified above.</code>
<code> </code><code>USEDDISKSPACE =</code>
<code> </code><code>Size and percentage of disk use.</code>
<code> </code><code>Request a -l parameter containing the drive letter only.</code>
<code> </code><code>Warning and critical thresholds can be specified with -w and -c.</code>
<code> </code><code>MEMUSE =</code>
<code> </code><code>Memory use.</code>
<code> </code><code>SERVICESTATE =</code>
<code> </code><code>Check the state of one or several services.</code>
<code> </code><code>Request a -l parameters with the following syntax:</code>
<code> </code><code>-l <service1>,<service2>,<service3>,...</code>
<code> </code><code>You can specify -d SHOWALL </code><code>in</code> <code>case</code> <code>you want to see working services</code>
<code> </code><code>in</code> <code>the returned string.</code>
<code> </code><code>PROCSTATE =</code>
<code> </code><code>Check </code><code>if</code> <code>one or several process are running.</code>
<code> </code><code>Same syntax as SERVICESTATE.</code>
<code> </code><code>COUNTER =</code>
<code> </code><code>Check any performance counter of Windows NT</code><code>/2000</code><code>.</code>
<code> </code><code>Request a -l parameters with the following syntax:</code>
<code> </code><code>-l </code><code>"\\<performance object>\\counter"</code><code>,"<description></code>
<code> </code><code>The <description> parameter is optional and is given to a </code><code>printf</code>
<code> </code><code>output </code><code>command</code> <code>which</code> <code>requires a float parameter.</code>
<code> </code><code>If <description> does not include </code><code>"%%"</code><code>, it is used as a label.</code>
<code> </code><code>Some examples:</code>
<code> </code><code>"Paging file usage is %%.2f %%%%"</code>
<code> </code><code>"%%.f %%%% paging file used."</code>
<code> </code><code>INSTANCES =</code>
<code> </code><code>Check any performance counter object of Windows NT</code><code>/2000</code><code>.</code>
<code> </code><code>Syntax: check_nt -H <</code><code>hostname</code><code>> -p <port> -</code><code>v</code> <code>INSTANCES -l <counter object></code>
<code> </code><code><counter object> is a Windows Perfmon Counter object (eg. Process),</code>
<code> </code><code>if</code> <code>it is two words, it should be enclosed </code><code>in</code> <code>quotes</code>
<code> </code><code>The returned results will be a comma-separated list of instances on </code>
<code> </code><code>the selected computer </code><code>for</code> <code>that object.</code>
<code> </code><code>The purpose of this is to be run from </code><code>command</code> <code>line to determine what instances</code>
<code> </code><code>are available </code><code>for</code> <code>monitoring without having to log onto the Windows server</code>
<code> </code><code>to run Perfmon directly.</code>
<code> </code><code>It can also be used </code><code>in</code> <code>scripts that automatically create Nagios service</code>
<code> </code><code>configuration files.</code>
<code> </code><code>check_nt -H 192.168.1.1 -p 1248 -</code><code>v</code> <code>INSTANCES -l Process </code><code># example check_nt invocation</code>
<code> </code>
<code>Notes:</code>
<code> </code><code>- The NSClient service should be running on the server to get any information</code>
<code> </code><code>(http:</code><code>//nsclient</code><code>.ready2run.</code><code>nl</code><code>).</code>
<code> </code><code>- Critical thresholds should be lower than warning thresholds</code>
<code> </code><code>- Default port 1248 is sometimes </code><code>in</code> <code>use by other services. The error</code>
<code> </code><code>output when this happens contains </code><code>"Cannot map xxxxx to protocol number"</code><code>.</code>
<code> </code><code>One fix </code><code>for</code> <code>this is to change the port to something </code><code>else</code> <code>on check_nt </code>
<code> </code><code>and on the client service it's connecting to.</code>
<code>Send email to [email protected] </code><code>if</code> <code>you have questions regarding use</code>
<code>of this software. To submit patches or suggest improvements, send email to</code>
<code>[email protected]</code>
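2. Using the check_nt command
check_nt parameters
-w: warning threshold, as a percentage
-c: critical threshold, as a percentage
IV. Defining commands, hosts, and services
1. Define the command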
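The help text lists the variables but gives few complete invocations, so here is a hedged sketch of some typical calls. The host address 192.168.0.106 and port 12489 are the ones used elsewhere in this article. The thresholds, the drive letter, and the service and process names are placeholder assumptions. The helper nt_check (a hypothetical name) only prints each command line unless RUN_CHECKS=1 is set and check_nt exists at the assumed path.

```shell
#!/bin/sh
# Assumed values: adjust to your environment.
PLUGIN_DIR=/usr/local/nagios/libexec
NT_HOST=192.168.0.106
NT_PORT=12489

# nt_check VARIABLE [extra check_nt options...]
# Prints the command line; runs it only when RUN_CHECKS=1 and check_nt exists.
nt_check() {
    variable=$1
    shift
    echo "check_nt -H $NT_HOST -p $NT_PORT -v $variable $*"
    if [ "${RUN_CHECKS:-0}" = 1 ] && [ -x "$PLUGIN_DIR/check_nt" ]; then
        "$PLUGIN_DIR/check_nt" -H "$NT_HOST" -p "$NT_PORT" -v "$variable" "$@"
    fi
    return 0   # ignore the plugin exit status in this demo
}

nt_check CLIENTVERSION
nt_check UPTIME
nt_check CPULOAD -l 5,80,90                 # 5-minute average, warn 80%, crit 90%
nt_check MEMUSE -w 80 -c 90                 # warn at 80% used, critical at 90%
nt_check USEDDISKSPACE -l c -w 80 -c 90     # drive C:
nt_check SERVICESTATE -d SHOWALL -l W3SVC   # service name is a placeholder
nt_check PROCSTATE -d SHOWALL -l Explorer.exe
```

If a password was set during the NSClient++ installation, each call would also need -s followed by that password.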
<code>[root@Nagios ~]</code><code># vim /usr/local/nagios/etc/objects/commands.cfg</code>
<code># 'check_win' command definition</code>
<code>define </code><code>command</code><code>{</code>
<code> </code><code>command_name check_win</code>
<code> </code><code>command_line $USER1$</code><code>/check_nt</code> <code>-H $HOSTADDRESS$ -p 12489 -</code><code>v</code> <code>$ARG1$ $ARG2$</code>
<code> </code><code>}</code>
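<code>Notes:</code>
<code>$..$ marks a built-in macro, in other words a variable</code>
<code>$USER1$ the directory that holds the plugins </code>
<code>-H the host address</code>
<code>$HOSTADDRESS$ the address of whichever host the check is applied to</code>
<code>$ARG1$, $ARG2$ arguments passed in from the service definition (formal parameters)</code>
<code>-s the password, empty by default (not used in the command_line above)</code>
2. Define the host and services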
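To make the macros concrete, the sketch below mimics by hand how the check_win command line would be filled in for one service. It is plain text substitution with sed, not Nagios itself. The check_command value check_win!CPULOAD!-l 5,80,90 is an assumed example of how a service could call the command, and /usr/local/nagios/libexec is assumed to be the value of $USER1$.

```shell
#!/bin/sh
# command_line as defined for check_win in commands.cfg.
command_line='$USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$'

# expand_macros USER1 HOSTADDRESS CHECK_COMMAND
# CHECK_COMMAND has the form name!arg1!arg2, as written in a service definition.
expand_macros() {
    user1=$1
    hostaddress=$2
    # Split "check_win!CPULOAD!-l 5,80,90" on "!" into name, ARG1, ARG2.
    arg1=$(printf '%s\n' "$3" | cut -d'!' -f2)
    arg2=$(printf '%s\n' "$3" | cut -d'!' -f3)
    # [$] matches a literal dollar sign, so each macro name is replaced as text.
    printf '%s\n' "$command_line" | sed \
        -e "s|[\$]USER1[\$]|$user1|" \
        -e "s|[\$]HOSTADDRESS[\$]|$hostaddress|" \
        -e "s|[\$]ARG1[\$]|$arg1|" \
        -e "s|[\$]ARG2[\$]|$arg2|"
}

expand_macros /usr/local/nagios/libexec 192.168.0.106 'check_win!CPULOAD!-l 5,80,90'
```

The printed line is the command that would be run for that service. Arguments containing | or & would need escaping in this simplified version.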
<code>[root@Nagios objects]</code><code># cp windows.cfg windows106.cfg </code>
<code>[root@Nagios objects]</code><code># sed -i 's/winserver/Windows106/g' windows106.cfg </code>
<code>[root@Nagios objects]</code><code># sed -i 's/192\.168\.1\.2/192.168.0.106/' windows106.cfg</code>
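The copied file only takes effect if the main configuration loads it. The article does not show that step, so what follows is an assumption based on a default source install, where nagios.cfg lists object files with cfg_file= lines. The helper add_cfg_file (a hypothetical name) appends the line only when it is missing, and the demo runs against a scratch file rather than the real /usr/local/nagios/etc/nagios.cfg.

```shell
#!/bin/sh
# add_cfg_file MAIN_CFG OBJECT_FILE
# Append "cfg_file=OBJECT_FILE" to MAIN_CFG unless that exact line exists.
add_cfg_file() {
    line="cfg_file=$2"
    if grep -qxF "$line" "$1"; then
        echo "already present: $line"
    else
        printf '%s\n' "$line" >> "$1"
        echo "added: $line"
    fi
}

# Demo on a scratch file standing in for nagios.cfg.
demo_cfg=$(mktemp)
echo 'cfg_file=/usr/local/nagios/etc/objects/commands.cfg' > "$demo_cfg"

add_cfg_file "$demo_cfg" /usr/local/nagios/etc/objects/windows106.cfg
add_cfg_file "$demo_cfg" /usr/local/nagios/etc/objects/windows106.cfg   # second call changes nothing
```

Because the check is for an exact line match, running it repeatedly does not create duplicate entries.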
3. Check the configuration files for syntax errors
<code>[root@Nagios ~]</code><code># service nagios configtest</code>
<code>Nagios Core 4.0.7</code>
<code>Copyright (c) 2009-present Nagios Core Development Team and Community Contributors</code>
<code>Copyright (c) 1999-2009 Ethan Galstad</code>
<code>Last Modified: 06-03-2014</code>
<code>License: GPL</code>
<code>Website: http:</code><code>//www</code><code>.nagios.org</code>
<code>Reading configuration data...</code>
<code> </code><code>Read main config </code><code>file</code> <code>okay...</code>
<code> </code><code>Read object config files okay...</code>
<code>Running pre-flight check on configuration data...</code>
<code>Checking objects...</code>
<code> </code><code>Checked 8 services.</code>
<code> </code><code>Checked 1 hosts.</code>
<code> </code><code>Checked 1 host </code><code>groups</code><code>.</code>
<code> </code><code>Checked 0 service </code><code>groups</code><code>.</code>
<code> </code><code>Checked 1 contacts.</code>
<code> </code><code>Checked 1 contact </code><code>groups</code><code>.</code>
<code> </code><code>Checked 25 commands.</code>
<code> </code><code>Checked 5 </code><code>time</code> <code>periods.</code>
<code> </code><code>Checked 0 host escalations.</code>
<code> </code><code>Checked 0 service escalations.</code>
<code>Checking </code><code>for</code> <code>circular paths...</code>
<code> </code><code>Checked 1 hosts</code>
<code> </code><code>Checked 0 service dependencies</code>
<code> </code><code>Checked 0 host dependencies</code>
<code> </code><code>Checked 5 timeperiods</code>
<code>Checking global event handlers...</code>
<code>Checking obsessive compulsive processor commands...</code>
<code>Checking misc settings...</code>
<code>Total Warnings: 0</code>
<code>Total Errors: 0</code>
<code>Things </code><code>look</code> <code>okay - No serious problems were detected during the pre-flight check</code>
<code>Object precache </code><code>file</code> <code>created:</code>
<code>/usr/local/nagios/var/objects</code><code>.precache</code>
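Restarting with a broken configuration can take monitoring down, so it can be worth making the restart conditional on this check. A minimal sketch, assuming the verification output ends with the Total Warnings and Total Errors lines shown above; config_ok is a hypothetical helper that reads that output on standard input.

```shell
#!/bin/sh
# config_ok: read configtest-style output on stdin and succeed only
# when it reports "Total Errors: 0" (the spacing may vary between versions).
config_ok() {
    grep -Eq '^Total Errors:[[:space:]]*0[[:space:]]*$'
}

# Demo with captured output instead of a live Nagios install.
sample_output='Checking misc settings...
Total Warnings: 0
Total Errors:   0'

if printf '%s\n' "$sample_output" | config_ok; then
    echo "configuration OK, safe to restart"
else
    echo "configuration has errors, not restarting"
fi

# On the Nagios server the same helper could gate the restart:
#   service nagios configtest | config_ok && service nagios restart
```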
4. Restart the nagios service
<code>[root@Nagios objects]</code><code># service nagios restart</code>
<code>Running configuration check...</code>
<code>Stopping nagios: .</code><code>done</code><code>.</code>
<code>Starting nagios: </code><code>done</code><code>.</code>
V. Viewing the monitoring information in a browser
1. After logging in, click [Hosts] to see the newly monitored Windows host
<a href="http://s3.51cto.com/wyfs02/M00/4A/BB/wKioL1Qmw93SnhVOAAOI_HKVwgw436.jpg" target="_blank"></a>
2. Click [Services] to see the status of the Windows services
<a href="http://s3.51cto.com/wyfs02/M00/4A/B9/wKiom1Qmw7awxW6cAAfMmerFYOc506.jpg" target="_blank"></a>
3. After a few minutes the statuses return to normal, as shown below
<a href="http://s3.51cto.com/wyfs02/M02/4A/BC/wKiom1Qmys7Sl-qRAAVcoYMMwSw515.jpg" target="_blank"></a>
VI. Monitoring a Windows host with NRPE
1. Edit the NSClient++ configuration file
<a href="http://s3.51cto.com/wyfs02/M02/4A/EE/wKioL1QnawbRvD3RAAO2J91L10Q919.jpg" target="_blank"></a>
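The exact edits are only visible in the screenshot, so the following is a hedged guess at what the check_nrpe test in step 3 needs, given that it passes arguments with -a. The key names are the NRPE-related settings of NSC.ini as found in NSClient++ 0.3.x, and every value is an assumption to verify against your own file. It is written as a shell here-document so it can be saved and compared with the real configuration.

```shell
#!/bin/sh
# Assumed NRPE-related part of NSC.ini (NSClient++ 0.3.x key names).
nrpe_ini=$(mktemp)
cat > "$nrpe_ini" <<'EOF'
[modules]
; the NRPE listener must be loaded for check_nrpe to connect
NRPEListener.dll
; CheckSystem provides checks such as CheckCPU and CheckMem
CheckSystem.dll

[NRPE]
; port that check_nrpe connects to (its default is 5666)
port=5666
; needed when arguments are passed with "check_nrpe -a ..."
allow_arguments=1
; keep shell metacharacters in arguments rejected
allow_nasty_meta_chars=0
; leave SSL on unless check_nrpe is run with -n
use_ssl=1
EOF

# Quick sanity check: report whether argument passing is enabled.
if grep -q '^allow_arguments=1' "$nrpe_ini"; then
    echo "NRPE arguments enabled"
else
    echo "NRPE arguments disabled: check_nrpe -a will be rejected"
fi
```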
2. Restart the NSClient++ service
<a href="http://s3.51cto.com/wyfs02/M00/4A/EE/wKioL1Qna8GTPNRKAASz4_IEazg262.jpg" target="_blank"></a>
3. Test NRPE commands from the Nagios server
<code>[root@Nagios libexec]</code><code># ./check_nrpe -h</code>
<code>NRPE Plugin </code><code>for</code> <code>Nagios</code>
<code>Copyright (c) 1999-2008 Ethan Galstad ([email protected])</code>
<code>Version: 2.15</code>
<code>Last Modified: 09-06-2013</code>
<code>License: GPL v2 with exemptions (-l </code><code>for</code> <code>more</code> <code>info)</code>
<code>SSL</code><code>/TLS</code> <code>Available: Anonymous DH Mode, OpenSSL 0.9.6 or higher required</code>
<code>Usage: check_nrpe -H <host> [ -b <bindaddr> ] [-4] [-6] [-n] [-u] [-p <port>] [-t <timeout>] [-c <</code><code>command</code><code>>] [-a <arglist...>]</code>
<code> </code><code>-n = Do no use SSL</code>
<code> </code><code>-u = Make socket timeouts </code><code>return</code> <code>an UNKNOWN state instead of CRITICAL</code>
<code> </code><code><host> = The address of the host running the NRPE daemon</code>
<code> </code><code><bindaddr> = bind to </code><code>local</code> <code>address</code>
<code> </code><code>-4 = user ipv4 only</code>
<code> </code><code>-6 = user ipv6 only</code>
<code> </code><code>[port] = The port on </code><code>which</code> <code>the daemon is running (default=5666)</code>
<code> </code><code>[timeout] = Number of seconds before connection </code><code>times</code> <code>out (default=10)</code>
<code> </code><code>[</code><code>command</code><code>] = The name of the </code><code>command</code> <code>that the remote daemon should run</code>
<code> </code><code>[arglist] = Optional arguments that should be passed to the </code><code>command</code><code>. Multiple</code>
<code> </code><code>arguments should be separated by a space. If provided, this must be</code>
<code> </code><code>the last option supplied on the </code><code>command</code> <code>line.</code>
<code>Note:</code>
<code>This plugin requires that you have the NRPE daemon running on the remote host.</code>
<code>You must also have configured the daemon to associate a specific plugin </code><code>command</code>
<code>with the [</code><code>command</code><code>] option you are specifying here. Upon receipt of the</code>
<code>[</code><code>command</code><code>] argument, the NRPE daemon will run the appropriate plugin </code><code>command</code> <code>and</code>
<code>send the plugin output and </code><code>return</code> <code>code back to *this* plugin. This allows you</code>
<code>to execute plugins on remote hosts and </code><code>'fake'</code> <code>the results to </code><code>make</code> <code>Nagios think</code>
<code>the plugin is being run locally.</code>
<code>check_nrpe syntax:</code>
<code>check_nrpe ... -c <</code><code>command</code><code>> [-a <argument> <argument> <argument>]</code>
<code>Built-in NSClient++ commands that check_nrpe can call:</code>
<code>· CheckAlwaysCRITICAL (check)</code>
<code>· CheckAlwaysOK (check)</code>
<code>· CheckAlwaysWARNING (check)</code>
<code>· CheckCPU (check)</code>
<code>· CheckCRITICAL (check)</code>
<code>· CheckCounter (check)</code>
<code>· CheckEventLog</code><code>/CheckEventLog</code> <code>(check)</code>
<code>· CheckFile (check)</code>
<code>· CheckFileSize (check)</code>
<code>· CheckMem (check)</code>
<code>· CheckMultiple (check)</code>
<code>· CheckOK (check)</code>
<code>· CheckProcState (check)</code>
<code>· CheckServiceState (check)</code>
<code>· CheckTaskSched</code><code>/CheckTaskSched</code> <code>(check)</code>
<code>· CheckUpTime (check)</code>
<code>· CheckVersion (check)</code>
<code>· CheckWARNING (check)</code>
<code>· CheckWMI</code><code>/CheckWMI</code> <code>(check)</code>
<code>· CheckWMIValue (check)</code>
<code>[root@Nagios libexec]</code><code># ./check_nrpe -H 192.168.1.142 -p 5666 -c CheckCPU -a warn=80 crit=90 time=20m time=10s time=4</code>
<code>OK CPU Load ok.|</code><code>'20m'</code><code>=0%;80;90 </code><code>'10s'</code><code>=0%;80;90 </code><code>'4'</code><code>=0%;80;90</code>
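The part of the output after the | is performance data in the form 'label'=value;warn;crit. Here is a small sketch of pulling it apart in the shell, using the output line above as input. parse_perfdata is a hypothetical helper, and the parsing assumes that labels contain no spaces.

```shell
#!/bin/sh
# parse_perfdata: read one line of plugin output on stdin and print
# "label value warn crit" for every performance-data item after the "|".
parse_perfdata() {
    sed -e 's/^[^|]*|//' |          # drop the human-readable part
    tr ' ' '\n' |                   # one perfdata item per line
    sed -e '/^$/d' -e "s/'//g" |    # drop empty lines and the label quotes
    awk -F'[=;]' '{ print $1, $2, $3, $4 }'
}

output="OK CPU Load ok.|'20m'=0%;80;90 '10s'=0%;80;90 '4'=0%;80;90"
printf '%s\n' "$output" | parse_perfdata
```

For the sample line this prints three rows, one per time window, each with the label, the measured value, and the warning and critical thresholds (for example "20m 0% 80 90").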
This article is reposted from zys467754239's 51CTO blog. Original link: http://blog.51cto.com/467754239/1558861. To repost it, please contact the original author.