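Monitoring HP-UX with Nagios

Reposted from http://www.toxingwang.com/management/1100.html

This article explains how to install NRPE on HP-UX and use it for Nagios monitoring.

1. Prepare the NRPE for HP-UX software and user:

1.1 Prepare the software

The package can be downloaded from http://www.mayoxide.com/naghpux/

Because I have several HP-UX hosts, in practice I copy the package from the Nagios server to each monitored host with the following command: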

# scp /var/ftp/nagios/NRPE-2.12.depot HP-UX_IP:/tmp

If there are many servers, the command above can be wrapped in a script to push the package in bulk.
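A minimal sketch of such a batch push, assuming a hypothetical plain-text host list (one HP-UX address per line, `#` for comments) and the depot path used above. The function name `push_depot` and the `DRY_RUN` switch are illustrative only; with `DRY_RUN=1` the function just prints the scp commands it would run.

```shell
#!/bin/sh
# Sketch: push the NRPE depot to every host listed in a file.
# The depot path comes from the article; the host-list format is an assumption.
DEPOT=${DEPOT:-/var/ftp/nagios/NRPE-2.12.depot}

push_depot() {
    hostlist=$1
    while read -r host _rest; do
        # Skip blank lines and comment lines
        case $host in
            ''|'#'*) continue ;;
        esac
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "scp $DEPOT $host:/tmp"
        else
            # Redirect stdin so scp cannot swallow the rest of the host list
            scp "$DEPOT" "$host:/tmp" < /dev/null || echo "push to $host failed" >&2
        fi
    done < "$hostlist"
}

# Example (hosts.txt is a hypothetical file name):
#   DRY_RUN=1; push_depot hosts.txt
```

Running it once in dry-run mode is a cheap way to confirm the host list before any file is copied.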

1.2 Create the NRPE user:

groupadd -g 312 nrpe

useradd -g nrpe -u 312 nrpe

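## GID and UID 312 are specified because the configuration script used later assumes them by default, so I keep them unchanged here

2. Install NRPE on HP-UX:

2.1 HP-UX packages use the depot format and are installed with the swinstall command: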

# swinstall -s /tmp/NRPE-2.12.depot

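2.2 By default swinstall first shows a usage hint; press any key to continue:

2.3 Select the NRPE package with the space bar, press Tab to move to the "Actions" menu, then choose "Mark For Install"

After you press Enter, the NRPE package is flagged with Yes:

2.4 Press Tab again to move to the "Actions" menu, then choose "Install" to start the installation:

The installer first runs an analysis phase; once it passes, select "OK" to begin the actual installation:

When the installation finishes, select "Done" and press Enter, then use Tab to open the "File" menu and choose "Exit":

Check the result; by default nrpe is installed under /opt/nrpe: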

# ls /opt/nrpe

bin etc libexec
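The same check can be scripted when several hosts are installed in a row. A small sketch, assuming only the three directory names shown in the listing above; the function name `check_nrpe_layout` is made up for illustration.

```shell
#!/bin/sh
# Sketch: confirm that bin, etc and libexec exist under the install prefix.
check_nrpe_layout() {
    prefix=${1:-/opt/nrpe}
    missing=0
    for dir in bin etc libexec; do
        if [ ! -d "$prefix/$dir" ]; then
            echo "missing: $prefix/$dir"
            missing=1
        fi
    done
    # Exit status 0 means all three directories were found
    return $missing
}

# Example:
#   check_nrpe_layout /opt/nrpe && echo "layout looks complete"
```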

3. Configure nrpe:

3.1 Edit the main nrpe configuration file /opt/nrpe/etc/nrpe.cfg:

vi /opt/nrpe/etc/nrpe.cfg ## change the following two lines

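server_address=HP-UX_IP ## the local address nrpe binds to, i.e. this host's own IP, not the Nagios server's

allowed_hosts=127.0.0.1,Nagios_Server_IP ## comma-separated list, no spaces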

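## Everything else can stay at its default for now. Note the monitoring command definitions at the bottom; any custom script has to be added there:

... (middle omitted) ...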

command[check_users]=/opt/nrpe/libexec/check_users -w 5 -c 10

command[check_load]=/opt/nrpe/libexec/check_load -w 15,10,5 -c 30,25,20

command[check_zombie_procs]=/opt/nrpe/libexec/check_procs -w 5 -c 10 -s Z

command[check_total_procs]=/opt/nrpe/libexec/check_procs -w 1500 -c 2000

command[check_hpux_disk]=/opt/nrpe/libexec/check_disk -w 20 -c 10

command[check_free_mem]=/opt/nrpe/libexec/check_mem.pl -f -w 10 -c 5 ## custom script; its contents are listed below
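Each `command[name]=...` line defines a name the Nagios server can request. A short sketch for listing the names a given nrpe.cfg exposes, which helps when custom scripts are added; the function name `list_nrpe_commands` is hypothetical, and commented-out definitions are deliberately ignored.

```shell
#!/bin/sh
# Sketch: print the command names defined in an nrpe.cfg file.
list_nrpe_commands() {
    cfg=${1:-/opt/nrpe/etc/nrpe.cfg}
    # Match active lines of the form command[name]=... and print only the name
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$cfg"
}

# Example:
#   list_nrpe_commands /opt/nrpe/etc/nrpe.cfg
```

A listed name can then be requested from the Nagios server with the `-c` option of check_nrpe, for example `/usr/local/nagios/libexec/check_nrpe -H HP-UX_IP -c check_free_mem`.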

3.2 Configure NRPE as a service managed by inetd:

NRPE ships with a configuration script; simply run it:

/opt/nrpe/bin/configure.sh

## The script writes the nrpe entries into /etc/inetd.conf and /etc/services, the same way nrpe startup is configured on Linux
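The article does not show the exact lines configure.sh writes, so the following is only a sketch that checks whether each file contains an entry starting with the service name `nrpe` (an assumption about how the entries are named). The function name `check_inetd_entries` is made up for illustration.

```shell
#!/bin/sh
# Sketch: verify that /etc/services and /etc/inetd.conf each carry an nrpe entry.
check_inetd_entries() {
    services=${1:-/etc/services}
    inetd_conf=${2:-/etc/inetd.conf}
    rc=0
    # Assumption: both entries begin with the service name "nrpe"
    if ! grep -q '^nrpe[[:space:]]' "$services"; then
        echo "no nrpe entry in $services"
        rc=1
    fi
    if ! grep -q '^nrpe[[:space:]]' "$inetd_conf"; then
        echo "no nrpe entry in $inetd_conf"
        rc=1
    fi
    return $rc
}

# Example:
#   check_inetd_entries && echo "inetd entries present"
```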

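4. Write a memory monitoring script:

This NRPE package bundles a large set of monitoring plugins, unlike on Linux, where the plugins have to be installed separately. The bundled plugins cannot monitor memory, however, so I took a script found online and adjusted it to my environment: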

#!/usr/bin/perl -w
# modified by barlow
#use strict;
use Getopt::Std;
use vars qw($opt_c $opt_f $opt_u $opt_w
            $free_memory $used_memory $total_memory
            $crit_level $warn_level
            %exit_codes @memlist
            $percent $fmt_pct
            $verb_err $command_line);

# Predefined exit codes for Nagios (the plugin convention uses 3 for UNKNOWN)
%exit_codes = ('UNKNOWN' , 3,
               'OK'      , 0,
               'WARNING' , 1,
               'CRITICAL', 2,);

# Set to 1 to print the reason before the usage text
$verb_err = 0;

# Note: the command needs its full path and must run with root privileges
$command_line = `/usr/sbin/swapinfo | tail -1 | awk '{print \$3,\$4}'`;
chomp $command_line;
@memlist = split(/ /, $command_line);

# Define the calculating scalars
$used_memory = $memlist[0];
$free_memory = $memlist[1];
$total_memory = $used_memory + $free_memory;

# Get the options ($#ARGV is numeric, so compare with <= rather than le)
if ($#ARGV <= 0)
{
    &usage;
}
else
{
    getopts('c:fuw:');
}

# Shortcircuit the switches
if (!$opt_w or $opt_w == 0 or !$opt_c or $opt_c == 0)
{
    print "*** You must define WARN and CRITICAL levels!" if ($verb_err);
    &usage;
}
elsif (!$opt_f and !$opt_u)
{
    print "*** You must select to monitor either USED or FREE memory!" if ($verb_err);
    &usage;
}

# Check if levels are sane
if ($opt_w <= $opt_c and $opt_f)
{
    print "*** WARN level must not be less than CRITICAL when checking FREE memory!" if ($verb_err);
    &usage;
}
elsif ($opt_w >= $opt_c and $opt_u)
{
    print "*** WARN level must not be greater than CRITICAL when checking USED memory!" if ($verb_err);
    &usage;
}

$warn_level = $opt_w;
$crit_level = $opt_c;

# Avoid a division by zero when swapinfo returned nothing usable
if (!$total_memory)
{
    print "Memory UNKNOWN - could not read swapinfo output\n";
    exit $exit_codes{'UNKNOWN'};
}

if ($opt_f)
{
    $percent = $free_memory / $total_memory * 100;
    $fmt_pct = sprintf "%.1f", $percent;
    if ($percent <= $crit_level)
    {
        print "Memory CRITICAL -FREE $fmt_pct% (FREE:$free_memory kB TOTAL:$total_memory kB)\n";
        exit $exit_codes{'CRITICAL'};
    }
    elsif ($percent <= $warn_level)
    {
        print "Memory WARNING -FREE $fmt_pct% (FREE:$free_memory kB TOTAL:$total_memory kB)\n";
        exit $exit_codes{'WARNING'};
    }
    else
    {
        print "Memory OK -FREE $fmt_pct% (FREE:$free_memory kB TOTAL:$total_memory kB)\n";
        exit $exit_codes{'OK'};
    }
}
elsif ($opt_u)
{
    $percent = $used_memory / $total_memory * 100;
    $fmt_pct = sprintf "%.1f", $percent;
    if ($percent >= $crit_level)
    {
        print "Memory CRITICAL -USED $fmt_pct% (USED:$used_memory kB TOTAL:$total_memory kB)\n";
        exit $exit_codes{'CRITICAL'};
    }
    elsif ($percent >= $warn_level)
    {
        print "Memory WARNING -USED $fmt_pct% (USED:$used_memory kB TOTAL:$total_memory kB)\n";
        exit $exit_codes{'WARNING'};
    }
    else
    {
        print "Memory OK -USED $fmt_pct% (USED:$used_memory kB TOTAL:$total_memory kB)\n";
        exit $exit_codes{'OK'};
    }
}

# Print help
sub usage()
{
    print "\ncheck_mem.pl - Nagios Plugin\n\n";
    print "usage:\n";
    print " check_mem.pl -<f|u> -w <warnlevel> -c <critlevel>\n\n";
    print "options:\n";
    print " -f Check FREE memory\n";
    print " -u Check USED memory\n";
    print " -w PERCENT Percent free/used when to warn\n";
    print " -c PERCENT Percent free/used when critical\n";
    exit $exit_codes{'UNKNOWN'};
}

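Script note: the figures reported by /usr/sbin/swapinfo are neither the true physical memory usage nor plain swap usage. They are HP-UX paging information, which deviates somewhat from the server's real memory usage but is broadly consistent with it.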
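The free-memory branch of the script comes down to one percentage and two comparisons. A minimal shell sketch of that arithmetic, using the hypothetical function name `classify_free` and made-up sample figures; it illustrates the threshold logic only and is not a replacement for the plugin.

```shell
#!/bin/sh
# Sketch: classify free memory the way "check_mem.pl -f" does.
# Arguments: used_kb free_kb warn_pct crit_pct
classify_free() {
    awk -v used="$1" -v free="$2" -v warn="$3" -v crit="$4" 'BEGIN {
        total = used + free
        if (total <= 0) { print "UNKNOWN"; exit 3 }
        pct = free / total * 100
        # Lower free percentage is worse, so test the critical level first
        if (pct <= crit) { printf "CRITICAL %.1f%%\n", pct; exit 2 }
        if (pct <= warn) { printf "WARNING %.1f%%\n", pct; exit 1 }
        printf "OK %.1f%%\n", pct
        exit 0
    }'
}

# Example with invented figures (920 kB used, 80 kB free, warn 10, crit 5):
#   classify_free 920 80 10 5
```

The exit status follows the Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), so the function can be dropped into a wrapper script as is.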

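Because the swapinfo command has to run with administrator privileges, I simply set the setuid bit (u+s) on the script:

chmod 4755 /opt/nrpe/libexec/check_mem.pl ## equivalent to chmod 755 plus chmod u+s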

5. Start NRPE:

5.1 Method one:

# inetd -k && inetd ## restart the inetd daemon so that it starts nrpe

5.2 Method two:

# su - nrpe

/opt/nrpe/bin/nrpe -c /opt/nrpe/etc/nrpe.cfg -i & ## run in inetd mode

/opt/nrpe/bin/nrpe -c /opt/nrpe/etc/nrpe.cfg -d & ## run as a standalone daemon

6. Test NRPE:

6.1 On the monitored server, check that the service has started:

# netstat -an|grep -i tcp |grep 5666 ## check that the monitoring port is open
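A plain `grep 5666` also matches unrelated lines such as port 15666. A slightly stricter sketch that reads `netstat -an` output on standard input and succeeds only if a socket in LISTEN state ends in the given port; the function name `port_listening` is hypothetical, and the two address styles it accepts (`*.5666` and `0.0.0.0:5666`) are assumptions about the netstat output format.

```shell
#!/bin/sh
# Sketch: succeed if the netstat output on stdin shows a listener on the port.
port_listening() {
    port=${1:-5666}
    awk -v port="$port" '
        /LISTEN/ {
            # Accept addresses ending in ".PORT" or ":PORT"
            for (i = 1; i <= NF; i++)
                if ($i ~ ("[.:]" port "$")) found = 1
        }
        END { exit found ? 0 : 1 }
    '
}

# Example:
#   netstat -an | port_listening 5666 && echo "nrpe port is listening"
```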

6.2 On the Nagios server, test the connection to NRPE:

/usr/local/nagios/libexec/check_nrpe -H HP-UX_IP

NRPE v2.12

## Output like the above means NRPE is working

## If not, check the system log on the HP-UX server:

tail -20 /var/adm/syslog/syslog.log
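When many hosts are tested, the reply can be evaluated in a script instead of by eye. A small sketch that treats any reply beginning with `NRPE v` as a successful handshake, based only on the sample output shown above; the function name `nrpe_reply_ok` is made up for illustration.

```shell
#!/bin/sh
# Sketch: decide from the check_nrpe reply whether the handshake worked.
nrpe_reply_ok() {
    case $1 in
        'NRPE v'*) return 0 ;;
        *)         return 1 ;;
    esac
}

# Example (HP-UX_IP is the placeholder used in the article):
#   reply=$(/usr/local/nagios/libexec/check_nrpe -H HP-UX_IP)
#   nrpe_reply_ok "$reply" || echo "no NRPE reply, check syslog on the HP-UX host"
```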

The monitoring configuration on the Nagios server side has been covered at length in earlier articles, so it is not repeated here.
