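For the past two days Nagios has kept alerting that the number of inotifywait processes is 0.
The host express_1 runs two rsync scripts that sync from express_1 to express_2; once they are started, there are two inotifywait processes.
Every few hours they die and have to be restarted by hand. That is too tedious: in a single night Nagios sent more than ten alerts.
So I decided to use monit to monitor the inotifywait processes.
Create the startup script
vi /manage/express_monit.sh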
#!/bin/bash
# Wrapper that monit calls to start and stop the express sync job.
PIDFILE=/var/run/express.pid
case "$1" in
start)
echo "Starting express..."
/manage/rsync/rsync_express.sh &
sleep 1
# Record the PID of the first inotifywait process watching express.
# The [i] keeps grep from matching its own command line.
ps aux | grep '[i]notifywait' | grep express | head -1 | awk '{print $2}' > "$PIDFILE"
;;
stop)
echo "Stopping express..."
kill -9 "$(cat "$PIDFILE")"
;;
restart)
"$0" stop
"$0" start
;;
*)
echo "Usage: $0 {start|stop|restart}"
;;
esac
exit 0
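The PID lookup in the script depends on a chain of grep calls, which can pick up unrelated processes. As a minimal sketch, the same lookup can be done in one awk call that only accepts lines whose command column is inotifywait itself. The function name pid_from_ps is hypothetical and not part of the original script; it assumes `ps aux` column layout (PID in column 2, command in column 11).

```shell
# Hypothetical helper (not from the original script): read `ps aux`-style
# lines on stdin and print the PID of the first inotifywait process whose
# command line contains the given keyword.
pid_from_ps() {
    keyword="$1"
    # $11 is the command column of `ps aux`; requiring it to end in
    # "inotifywait" excludes grep/awk processes that merely mention the name.
    awk -v kw="$keyword" '$11 ~ /inotifywait$/ && index($0, kw) > 0 { print $2; exit }'
}
```

Usage under these assumptions: ps aux | pid_from_ps express > /var/run/express.pid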
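Set the permissions
chmod 755 /manage/express_monit.sh
Install monit. Installing the rpm package is recommended; building from the source package ran into problems.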
yum install -y monit
Edit the configuration file
vim /etc/monit.conf
Set the check interval to 3 seconds, set the id file path, and enable logging
set daemon 3 # check services at 3-second intervals
# set logfile syslog facility log_daemon
set logfile /var/log/monit.log
set idfile /var/.monit.id
set statefile /var/.monit.state
Comment out the third line from the end, so that it does not override the 3-second interval set above
# set daemon mode timeout to 1 minute
#set daemon 60
Go to the configuration directory
cd /etc/monit.d/
Add the monitoring entry for the express sync process
vi express
check process express with pidfile /var/run/express.pid
start program = "/manage/express_monit.sh start"
stop program = "/manage/express_monit.sh stop"
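Since express_1 runs two rsync scripts, the same kind of entry may be needed more than once. The following is a sketch only: monit_stanza is a hypothetical helper, and it assumes each job has a wrapper script accepting start/stop and a pidfile named /var/run/NAME.pid, as the express entry does.

```shell
# Hypothetical helper: print a monit check entry for one sync job.
# $1 = service name (also used for the pidfile name, an assumption)
# $2 = wrapper script that accepts "start" and "stop"
monit_stanza() {
    name="$1"
    script="$2"
    cat <<EOF
check process $name with pidfile /var/run/$name.pid
start program = "$script start"
stop program = "$script stop"
EOF
}
```

For example, monit_stanza express /manage/express_monit.sh > /etc/monit.d/express would reproduce the entry shown above.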
Start monit
/etc/init.d/monit start
Kill the inotifywait processes
pkill inotifywait
Watch the monit log
tail -f /var/log/monit.log
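Before starting the daemon it can help to let monit parse its control file first, since a typo in /etc/monit.conf or /etc/monit.d/express would otherwise only show up at start time. A minimal sketch, where start_monit_checked is a hypothetical wrapper name; monit -t is monit's syntax-check option.

```shell
# Hypothetical wrapper: syntax-check the control file, then start monit.
start_monit_checked() {
    if ! command -v monit >/dev/null 2>&1; then
        echo "monit is not installed" >&2
        return 127
    fi
    # monit -t only parses the control file; it does not start any service.
    monit -t && /etc/init.d/monit start
}
```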
[CST Apr 20 10:41:07] error : 'express' process is not running
[CST Apr 20 10:41:07] info : 'express' trying to restart
[CST Apr 20 10:41:07] info : 'express' start: /manage/express_monit.sh
[CST Apr 20 10:41:12] info : 'express' process is running with pid 14139
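Because the original problem was that the process dies every few hours, it may be useful to count how often monit has had to restart it. A sketch, assuming the log format shown above and the logfile path set earlier; count_restarts is a hypothetical helper name.

```shell
# Hypothetical helper: count "trying to restart" events for one service.
# $1 = service name as it appears in the monit log
# $2 = log file (defaults to the logfile configured in /etc/monit.conf above)
count_restarts() {
    grep -c "'$1' trying to restart" "${2:-/var/log/monit.log}"
}
```

For example: count_restarts express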
Check whether the process has started
[root@iZ23vu75locZ ~]# ps -aux | grep ino
Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.8/FAQ
root 14306 0.0 0.0 6344 776 ? S 10:41 0:00 /usr/local/inotify/bin/inotifywait -mrq --timefmt %d/%m/%y %H:%M --format %T %w%f -e modify,delete,create,attrib /www/express/
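The warning in the output comes from the leading dash in ps -aux; ps aux avoids it. monit itself only looks at the PID stored in the pidfile, so a closer check of what monit sees is to test that PID directly. A minimal sketch, with pidfile_alive as a hypothetical helper name:

```shell
# Hypothetical helper: succeed if the PID stored in the given pidfile
# belongs to a live process.
pidfile_alive() {
    pid=$(cat "$1" 2>/dev/null) || return 1
    # kill -0 sends no signal; it only tests whether the process exists
    # and can be signalled by the current user.
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}
```

For example: pidfile_alive /var/run/express.pid && echo "express is running"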