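Versions: nagios-plugins-1.4.15.tar.gz
nagios-3.3.1.tar.gz
nrpe-2.13.tar.gz
Nagios 3.3 changes a lot: the web interface now needs Apache with PHP in addition to CGI, whereas earlier versions needed only CGI.
The home page used to be /usr/local/nagios/share/index.html;
it is now index.php, although to my eye the pages look less attractive than before.
This chapter builds on the LAMP environment set up earlier; the nginx proxy has been removed, so Apache serves directly on the standard port 80.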
vim /usr/local/apache-2.2.21/conf/httpd.conf
#setting for nagios
ScriptAlias /nagios/cgi-bin /usr/local/nagios/sbin
<Directory "/usr/local/nagios/sbin">
Options ExecCGI
AllowOverride None
Order allow,deny
Allow from all
AuthName "Nagios Access"
AuthType Basic
AuthUserFile /usr/local/nagios/etc/htpasswd
Require valid-user
</Directory>
Alias /nagios /usr/local/nagios/share
<Directory "/usr/local/nagios/share">
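Options None
AllowOverride None
Order allow,deny
Allow from all
AuthName "Nagios Access"
AuthType Basic
AuthUserFile /usr/local/nagios/etc/htpasswd
Require valid-user
</Directory>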
<b>(2) Install Nagios</b>
Add the nagios user and groups:
# groupadd nagios
# useradd -g nagios nagios
#groupadd nagcmd
#usermod -G nagcmd nagios
# id nagios
uid=1000(nagios) gid=1000(nagios) groups=1000(nagios),1001(nagcmd)
#tar zxvf nagios-3.3.1.tar.gz
#cd nagios/
#./configure --with-command-group=nagcmd
#make all
#make install
This step fails with an error:
/usr/bin/install: omitting directory `includes/rss/extlib'
/usr/bin/install: omitting directory `includes/rss/htdocs'
/usr/bin/install: omitting directory `includes/rss/scripts'
make[1]: *** [install] Error 1
make[1]: Leaving directory `/tmp/nagios-3.3.1/nagios/html'
make: *** [install] Error 2
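Fix: patch the two loops in html/Makefile so they match only files. The asterisks in the search pattern must be escaped, otherwise sed treats them as repetition operators and nothing is replaced:
<code>sed -i 's:for file in includes/rss/\*;:for file in includes/rss/*.*;:g' ./html/Makefile
sed -i 's:for file in includes/rss/extlib/\*;:for file in includes/rss/extlib/*.*;:g' ./html/Makefile</code>
Alternatively, open html/Makefile in vim and make the same two changes by hand (<a target="_blank" href="http://blog.51cto.com/attachment/201204/183659493.jpg">figure</a>).
After the fix, run the install step again and continue: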
#make install
# make install-init
# make install-config
# make install-commandmode
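The Makefile fix above can be rehearsed on sample text before touching the real file. The sketch below wraps the two substitutions in a hypothetical helper, patch_rss_glob (the name is made up for illustration), which filters standard input. The sample lines are assumed to mirror the loop headers in html/Makefile. Note the backslash before each asterisk in the search pattern: without it, sed reads `/*` as "zero or more slashes" and the pattern never matches.

```shell
# patch_rss_glob: hypothetical helper that applies the two substitutions
# to standard input instead of editing html/Makefile in place.
patch_rss_glob() {
  sed -e 's:for file in includes/rss/\*;:for file in includes/rss/*.*;:g' \
      -e 's:for file in includes/rss/extlib/\*;:for file in includes/rss/extlib/*.*;:g'
}

# Sample loop headers, assumed to match the ones in html/Makefile.
printf '%s\n' 'for file in includes/rss/*; do' \
              'for file in includes/rss/extlib/*; do' | patch_rss_glob
```

If the printed lines end in `*.*;`, the same expressions should be safe to apply to the Makefile with sed -i.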
# /usr/local/apache-2.2.21/bin/htpasswd -c /usr/local/nagios/etc/htpasswd houzc
New password:
Re-type new password:
Adding password for user houzc
# ll /usr/local/nagios/etc/objects
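commands.cfg defines the commands used to run each check; you can also add your own
contacts.cfg defines the contacts and contact groups to notify when something goes wrong
localhost.cfg defines the monitoring of the local server; use it as a reference when configuring other hosts and services
timeperiods.cfg defines time periods such as 24x7 (around the clock, every day) and any custom periods
hosts.cfg defines the monitored hosts (create it yourself)
services.cfg defines the monitored services (create it yourself)
<b>1) Main configuration file nagios.cfg</b>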
# cd /usr/local/nagios/etc/
# vi nagios.cfg
// add the object configuration files
cfg_file=/usr/local/nagios/etc/objects/commands.cfg
cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg
cfg_file=/usr/local/nagios/etc/objects/templates.cfg
cfg_file=/usr/local/nagios/etc/objects/hosts.cfg
cfg_file=/usr/local/nagios/etc/objects/services.cfg
# check_external_commands=0
check_external_commands=1
// set this to 1 so that the web interface can restart Nagios, stop host/service checks, and so on
# command_check_interval = 10s
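Edits like the ones above can also be scripted. The following is a minimal sketch, not part of the original procedure: set_cfg_value is a hypothetical helper that replaces an existing KEY=VALUE line or appends one, and it is demonstrated on a throwaway file rather than on the real nagios.cfg. It assumes GNU sed (for -i) and values that contain no `|` or `&` characters.

```shell
# set_cfg_value FILE KEY VALUE
# Hypothetical helper: replace an existing "KEY=..." line, or append one.
# Commented-out lines ("# KEY=...") are left untouched.
set_cfg_value() {
  file=$1 key=$2 value=$3
  if grep -q "^${key}=" "$file"; then
    # GNU sed in-place edit, as used elsewhere in this article
    sed -i "s|^${key}=.*|${key}=${value}|" "$file"
  else
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Demonstration on a temporary file, not on the real nagios.cfg
demo_cfg=$(mktemp)
printf '%s\n' '# check_external_commands=0' 'check_external_commands=0' > "$demo_cfg"
set_cfg_value "$demo_cfg" check_external_commands 1
set_cfg_value "$demo_cfg" command_check_interval 10s
cat "$demo_cfg"
rm -f "$demo_cfg"
```

Against the real file the call would be set_cfg_value /usr/local/nagios/etc/nagios.cfg check_external_commands 1, ideally after taking a backup copy.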
<b>2) Edit the configuration file cgi.cfg</b>
# vi cgi.cfg
use_authentication=1
authorized_for_system_information=nagiosadmin,houzc
authorized_for_configuration_information=nagiosadmin,houzc
authorized_for_system_commands=houzc
// separate multiple users with commas
authorized_for_all_services=nagiosadmin,houzc
authorized_for_all_hosts=nagiosadmin,houzc
authorized_for_all_service_commands=nagiosadmin,houzc
authorized_for_all_host_commands=nagiosadmin,houzc
// The user name houzc is the one created earlier with /usr/local/apache-2.2.21/bin/htpasswd -c /usr/local/nagios/etc/htpasswd houzc
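Because the names in cgi.cfg must match accounts in the htpasswd file, a quick cross-check can catch typos. The sketch below is an illustration only: missing_cgi_users is a hypothetical helper that prints every user named in an authorized_for_* line that has no htpasswd entry. It assumes plain comma-separated user names with no trailing comments on those lines.

```shell
# missing_cgi_users CGI_CFG HTPASSWD_FILE
# Hypothetical helper: print each user listed in an authorized_for_* line
# of CGI_CFG that has no "user:hash" entry in HTPASSWD_FILE.
missing_cgi_users() {
  cgi_cfg=$1 htpasswd_file=$2
  grep '^authorized_for_' "$cgi_cfg" | cut -d= -f2 | tr ',' '\n' | sort -u |
  while read -r user; do
    [ -n "$user" ] || continue
    # -x: whole-line match, -F: fixed string, -q: quiet
    cut -d: -f1 "$htpasswd_file" | grep -qxF "$user" || echo "$user"
  done
}

# Usage with the paths from this article:
#   missing_cgi_users /usr/local/nagios/etc/cgi.cfg /usr/local/nagios/etc/htpasswd
```

With the configuration shown above, this would report nagiosadmin unless that account has also been added with htpasswd.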
<b>3) Review and edit the other configuration files</b>
Time periods are defined in timeperiods.cfg:
# ll /usr/local/nagios/etc/objects/timeperiods.cfg
Contacts are defined in contacts.cfg:
# vi /usr/local/nagios/etc/objects/contacts.cfg
define contact{
contact_name houzc
use generic-contact
alias nagiosadmin
service_notification_period 24x7
host_notification_period 24x7
service_notification_options w,u,c,r
host_notification_options d,u,r
service_notification_commands notify-service-by-email
host_notification_commands notify-host-by-email
}
define contactgroup{
contactgroup_name admins
alias Nagios Administrators
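members nagiosadmin,houzc ; members of the contact group
}
Note: the parameters used above mean the following.
service_notification_period 24x7
The time period during which service problems trigger notifications, as defined in timeperiods.cfg.
host_notification_period 24x7
The time period during which host problems trigger notifications, as defined in timeperiods.cfg.
service_notification_options w,u,c,r
Notify the contact when a service enters one of four states: w (warning), u (unknown), c (critical), or r (recovery, i.e. back to normal after a problem).
host_notification_options d,u,r
d (down), u (unreachable), r (recovery, i.e. back to normal after a problem).
service_notification_commands notify-service-by-email
The command used to send service notifications; it is defined in commands.cfg and e-mails the contact.
host_notification_commands notify-host-by-email
The command used to send host notifications; it is defined in commands.cfg and e-mails the contact.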
Monitored hosts are defined in hosts.cfg:
# vim /usr/local/nagios/etc/objects/hosts.cfg
define host{
host_name 192.168.3.107
address 192.168.3.107
check_command check-host-alive
max_check_attempts 5
check_period 24x7
contact_groups admins
notification_interval 10
notification_period 24x7
notification_options d,u,r
}
#vim /usr/local/nagios/etc/objects/services.cfg
#service definition
define service{
host_name 192.168.3.107 ; the monitored host, as defined in hosts.cfg
service_description check-host-alive ; description of the monitored service
check_command check-host-alive ; the check command, as defined in commands.cfg
max_check_attempts 5
normal_check_interval 3
retry_check_interval 2
check_period 24x7
notification_interval 10
notification_period 24x7
notification_options w,u,c,r
contact_groups admins ; the contact group, defined in contacts.cfg
}
# tar -zxvf nagios-plugins-1.4.15.tar.gz
# cd nagios-plugins-1.4.15/
# ./configure --with-nagios-user=nagios --with-nagios-group=nagios
# make
# make install
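Note: after a successful install, a libexec directory appears under /usr/local/nagios/.
# /usr/local/nagios/libexec/check_mrtg -h
Every program in the libexec directory can be run on its own; run "<b>command name</b><b> -h</b>" to see how to use it.
Before starting Nagios, verify the configuration:
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
... (earlier output omitted)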
Checking host dependencies...
Checked 0 host dependencies.
Checking commands...
Checked 24 commands.
Checking time periods...
Checked 5 time periods.
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...
Total Warnings: 0
Total Errors: 0
Things look okay - No serious problems were detected during the pre-flight check
service nagios start
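Monitoring server: 192.168.3.107
Monitored host: 192.168.3.101
Nagios monitors in two ways, which this article calls active and passive. What is the difference?
Active: the monitored host does not have to grant any access, because the thing being checked is a service it already exposes to the outside, for example ping or HTTP.
The figure shows local monitoring:
<a target="_blank" href="http://blog.51cto.com/attachment/201204/180515320.jpg">figure</a>
To monitor a URL with different parameters, you have to adjust the arguments of the plugin yourself.
Passive: for "local information" such as disk usage and CPU load, Nagios can only inspect the host it runs on; it cannot read these values from other machines, because it has no access rights on them. The NRPE add-on solves this problem: with it, Nagios can monitor the "remote local information" of Linux hosts. (In Nagios's own terminology these NRPE checks are still active checks, because the Nagios server initiates them; "passive" here only means that an agent on the monitored host collects the data.)
In this mode the request travels through an SSL tunnel to the NRPE daemon, which runs the monitoring plugins in libexec to collect the data.
<a target="_blank" href="http://blog.51cto.com/attachment/201204/180357718.jpg">figure</a>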
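<b>1. Configuration on the monitored host</b>
<b>(1) Install the nagios-plugins package</b>
Add the nagios user and group, as was done on the monitoring server.
Install nagios-plugins: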
#tar -zxvf nagios-plugins-1.4.15.tar.gz
#cd nagios-plugins-1.4.15
#./configure
#make && make install
# chown nagios.nagios /usr/local/nagios
# chown -R nagios.nagios /usr/local/nagios/libexec
apt-get install libssl-dev libssl0.9.8
On CentOS, install these packages instead:
openssl-
openssl-devel-
tar -zxvf nrpe-2.13.tar.gz
cd nrpe-2.13
./configure
make all
make install-plugin
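Note: the check_nrpe plugin really only needs to be installed on the monitoring server; it is installed on the monitored host here purely for testing.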
make install-daemon
make install-daemon-config
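Afterwards, ls /usr/local/nagios shows the directories bin, etc, libexec and share.
vim /usr/local/nagios/etc/nrpe.cfg
# hosts allowed to query NRPE: localhost and the IP of the monitoring server
allowed_hosts=127.0.0.1,192.168.3.107
/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d # start the daemon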
[root@bogon nrpe-2.13]# /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1
NRPE v2.13
This output confirms that NRPE is installed and running.
<b>(3) NRPE monitoring commands</b>
cd /usr/local/nagios/etc/
vim nrpe.cfg
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_dev]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
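Note: the name in square brackets is the command name; check_users, for example, is shorthand for the /usr/local/nagios/libexec/check_users -w 5 -c 10 after the equals sign. These five commands monitor the number of logged-in users, the CPU load, the usage of the / partition, zombie processes and the total number of processes.
Run each plugin with -h to see its parameters.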
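To see at a glance which command names a monitored host offers, the command[...] lines can be extracted from nrpe.cfg. This is a small sketch under the assumption that every active definition starts at the beginning of a line; list_nrpe_commands is a hypothetical helper name.

```shell
# list_nrpe_commands FILE
# Hypothetical helper: print the name of every active command[...] entry.
# Commented-out entries ("#command[...]") do not match and are skipped.
list_nrpe_commands() {
  sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}

# Usage on the monitored host (path taken from this article):
#   list_nrpe_commands /usr/local/nagios/etc/nrpe.cfg
```

Each printed name is what the monitoring server later passes to check_nrpe with -c.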
[root@bogon libexec]# /usr/local/nagios/libexec/check_nrpe -H localhost -c check_total_procs
PROCS OK: 85 processes
The command above is a local test.
Once NRPE works on the monitored host, the monitoring server needs check_nrpe as well.
Back on 192.168.3.107:
<b>(1) Install check_nrpe</b>
apt-get install libssl-dev libssl0.9.8
# tar -zxvf nrpe-2.13.tar.gz -C /usr/src/
# cd /usr/src/nrpe-2.13
# ./configure
# make all
# make install-plugin
Note: this installs check_nrpe into /usr/local/nagios/libexec (required).
The daemon is installed here as well, but only for testing:
make install-daemon
make install-daemon-config
/usr/local/nagios/libexec/check_nrpe -H 192.168.3.101
This works as well.
vim /usr/local/nagios/etc/objects/commands.cfg
Add the following:
# 'check_nrpe' command definition
define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
$USER1$ is defined in /usr/local/nagios/etc/resource.cfg:
$USER1$=/usr/local/nagios/libexec
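To make the macro substitution concrete, the sketch below mimics what Nagios does internally when it builds the command line for a check_command such as check_nrpe!check_load: the text after the exclamation mark becomes $ARG1$, and $HOSTADDRESS$ comes from the host definition. expand_check_nrpe is a hypothetical helper written for illustration, and it assumes a single argument after the `!`.

```shell
# expand_check_nrpe HOSTADDRESS CHECK_COMMAND
# Hypothetical helper that imitates Nagios macro substitution for the
# check_nrpe command definition; Nagios itself does this internally.
expand_check_nrpe() {
  user1=/usr/local/nagios/libexec   # $USER1$ from resource.cfg
  hostaddress=$1                    # $HOSTADDRESS$ from the host definition
  arg1=${2#*!}                      # $ARG1$: text after the first "!"
  printf '%s/check_nrpe -H %s -c %s\n' "$user1" "$hostaddress" "$arg1"
}

expand_check_nrpe 192.168.3.101 'check_nrpe!check_load'
# prints: /usr/local/nagios/libexec/check_nrpe -H 192.168.3.101 -c check_load
```

The printed line can be pasted into a shell on the monitoring server to reproduce a check by hand.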
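<b>(3) Edit the configuration files</b>
On the monitoring server, edit services.cfg:
# vim /usr/local/nagios/etc/objects/services.cfg
define service{
host_name 192.168.3.101 ; the monitored host: it must run Linux with NRPE and be defined in hosts.cfg
service_description check-load ; name of the monitored item
check_command check_nrpe!check_load
# The check command is check_nrpe, defined in commands.cfg; its argument check_load is defined in nrpe.cfg on the monitored host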
max_check_attempts 5
normal_check_interval 3
retry_check_interval 2
check_period 24x7
notification_interval 10
notification_period 24x7
notification_options w,u,c,r
contact_groups admins
}
Only this one service is added here as a test.
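The other commands defined in nrpe.cfg can be added the same way. As a sketch only, the hypothetical generator below prints one service definition per NRPE command, reusing the settings of the check-load example; it uses the command name as the service description, which is an assumption rather than something this article prescribes.

```shell
# nrpe_service HOST NRPE_COMMAND
# Hypothetical generator: print a service definition that runs NRPE_COMMAND
# on HOST through check_nrpe, with the same settings as the check-load example.
nrpe_service() {
  cat <<EOF
define service{
        host_name               $1
        service_description     $2
        check_command           check_nrpe!$2
        max_check_attempts      5
        normal_check_interval   3
        retry_check_interval    2
        check_period            24x7
        notification_interval   10
        notification_period     24x7
        notification_options    w,u,c,r
        contact_groups          admins
}
EOF
}

# Print definitions for the remaining commands from nrpe.cfg; review the
# output before appending it to services.cfg.
for cmd in check_users check_dev check_zombie_procs check_total_procs; do
  nrpe_service 192.168.3.101 "$cmd"
done
```

Redirecting the output to a scratch file first makes it easy to inspect before merging it into services.cfg.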
The host definition is the same as the one for 192.168.3.107; only the IP address changes.
define host{
host_name 192.168.3.101
address 192.168.3.101
check_command check-host-alive
max_check_attempts 5
check_period 24x7
contact_groups admins
notification_interval 10
notification_period 24x7
notification_options d,u,r
}
Verify the Nagios configuration:
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Restart the Nagios service:
# service nagios restart
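The verify-then-restart sequence can be combined so that a broken configuration never reaches a restart. The wrapper below is a sketch, assuming that nagios -v exits with a non-zero status when the pre-flight check finds errors. check_then_restart and the three override variables (NAGIOS_BIN, NAGIOS_CFG, RESTART_CMD) are hypothetical names; the defaults are the paths used in this article.

```shell
# check_then_restart: hypothetical wrapper around the two steps above.
# NAGIOS_BIN, NAGIOS_CFG and RESTART_CMD may be overridden in the environment.
check_then_restart() {
  nagios_bin=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
  nagios_cfg=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}
  restart_cmd=${RESTART_CMD:-service nagios restart}
  if "$nagios_bin" -v "$nagios_cfg" >/dev/null 2>&1; then
    $restart_cmd          # intentionally unquoted: command plus arguments
  else
    echo "pre-flight check failed, Nagios not restarted" >&2
    return 1
  fi
}

# Usage (as root): check_then_restart
```

When the check fails, rerun the nagios -v command by hand to read the full error output.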
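By default Nagios sends alert e-mails through the local SMTP service; the mail commands are defined in commands.cfg. Using the local mail command therefore requires a running SMTP service on this machine.
sendEmail is a program that sends mail over SMTP from the command line; its installation and use are described below.
<b>1. Set up the mail server</b>
I simply created an account on my company's mail server; setting up a dedicated SMTP server with postfix just for Nagios would be even better.
# tar -zxvf sendEmail-v1.55.tar.gz -C /usr/src
# cp /usr/src/sendEmail-v1.55/sendEmail /usr/local/bin
# chmod +x /usr/local/bin/sendEmail
Send a test mail with sendEmail: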
# /usr/local/bin/sendEmail -f [email protected] -t [email protected] -s mail.wsn.com.cn -u "hello" -xu [email protected] -xp 123456 -m aaa
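Note: the command-line options are:
-f sender address
-t recipient address
-s domain name or IP of the SMTP server
-u subject of the message
-xu user name for SMTP authentication
-xp password for SMTP authentication
-m body of the message
Without -m, sendEmail prompts for the body on standard input; finish with CTRL-D.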
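The options listed above can be bundled into a small wrapper so that test mails need only a recipient, a subject and a body. This is a sketch, not the author's setup: send_alert and the SENDEMAIL, SMTP_USER, SMTP_SERVER and SMTP_PASS variables are hypothetical names, and every default value is a placeholder to be replaced with real account data.

```shell
# send_alert TO SUBJECT BODY
# Hypothetical wrapper around sendEmail using only the options listed above.
send_alert() {
  sender=${SMTP_USER:-nagios@example.com}     # placeholder account
  server=${SMTP_SERVER:-mail.example.com}     # placeholder SMTP host
  password=${SMTP_PASS:-changeme}             # placeholder password
  "${SENDEMAIL:-/usr/local/bin/sendEmail}" -f "$sender" -t "$1" -s "$server" \
    -u "$2" -xu "$sender" -xp "$password" -m "$3"
}

# Usage: send_alert admin@example.com "Nagios test" "hello from Nagios"
```

Keep in mind that a password passed with -xp is visible in the process list while the command runs.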
vim /usr/local/nagios/etc/objects/commands.cfg
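# Comment out the original definitions in this file first, then copy the ones below and change the mail user and credentials to your own.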
# 'notify-host-by-email' command definition
command_name notify-host-by-email
command_line /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\n
363.com -xp hou123..
# 'notify-service-by-email' command definition
command_name notify-service-by-email
command_line /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICE
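Restart the Nagios service and e-mail notification is active.
With max_check_attempts set to 5, Nagios sends an alert only after five consecutive failed checks.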
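As a rough worked example of what this means for alert latency: the first failed check counts as attempt 1, and each further attempt follows one retry interval later, so the alert goes out (max_check_attempts - 1) x retry_check_interval after the first failure. The sketch assumes the default interval_length of 60 seconds, so that intervals are minutes; minutes_until_alert is a hypothetical helper name.

```shell
# minutes_until_alert MAX_CHECK_ATTEMPTS RETRY_INTERVAL_MINUTES
# Time between the first failed check and the notification, assuming each
# retry happens exactly one retry interval after the previous attempt.
minutes_until_alert() {
  echo $(( ($1 - 1) * $2 ))
}

minutes_until_alert 5 2   # service settings used in this article; prints 8
```

Lowering max_check_attempts or retry_check_interval shortens this delay at the cost of more alerts for short glitches.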
As long as the mail server is working, notifications arrive with almost no delay.
Reposted from houzaicunsky's 51CTO blog; original link: http://blog.51cto.com/hzcsky/838778