<b>2.6.2 Statistics Scripts</b>
Statistics work has always been a strength of shell and Python scripting: with sed, awk, and regular expressions we can write powerful statistics scripts to analyze our system logs, security logs, server application logs, and more.
1. An nginx load balancer log summary script
The following script analyzes the logs of an nginx load balancer. As a supplement to AWStats, it can quickly report the top-ranked sites, IPs, and so on. The script is as follows (this script has been tested on both CentOS 5.8 and 6.4 x86_64):
#!/bin/bash
if [ $# -eq 0 ]; then
    echo "Error: please specify logfile."
    exit 0
else
    log=$1
fi
if [ ! -f $1 ]; then
    echo "Sorry, sir, I can't find this apache log file, pls try again!"
    exit 0
fi
####################################################
echo "most of the ip:"
echo "-------------------------------------------"
awk '{ print $1 }' $log | sort | uniq -c | sort -nr | head -10
echo
echo "most of the time:"
echo "--------------------------------------------"
awk '{ print $4 }' $log | cut -c 14-18 | sort | uniq -c | sort -nr | head -10
echo
echo "most of the page:"
echo "--------------------------------------------"
awk '{ print $11 }' $log | sed 's/^.*\(.cn*\)\"/\1/g' | sort | uniq -c | sort -rn | head -10
echo
echo "most of the time / most of the ip:"
echo "--------------------------------------------"
awk '{ print $4 }' $log | cut -c 14-18 | sort -n | uniq -c | sort -nr | head -10 > timelog
for i in `awk '{ print $2 }' timelog`
do
    num=`grep $i timelog | awk '{ print $1 }'`
    echo " $i $num"
    ip=`grep $i $log | awk '{ print $1 }' | sort -n | uniq -c | sort -nr | head -10`
    echo "$ip"
    echo
done
rm -f timelog
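Each `awk ... | sort | uniq -c | sort -nr | head -10` pipeline in the script above is simply a frequency count over one log field. As a minimal Python 3 sketch of the same idea (the sample log lines below are invented for illustration), the top-IP tally can be expressed with `collections.Counter`:

```python
# Tally the most frequent client IPs, mirroring the shell pipeline
# awk '{ print $1 }' | sort | uniq -c | sort -nr | head -10.
# The log lines are made-up samples in combined-log format.
from collections import Counter

sample_log = [
    '192.168.1.10 - - [21/Mar/2013:10:01:02 +0800] "GET / HTTP/1.1" 200 612',
    '192.168.1.11 - - [21/Mar/2013:10:01:03 +0800] "GET /a HTTP/1.1" 200 100',
    '192.168.1.10 - - [21/Mar/2013:10:01:04 +0800] "GET /b HTTP/1.1" 404 169',
]

def top_ips(lines, n=10):
    """Return [(ip, count), ...] sorted by count, descending."""
    # The client IP is the first whitespace-separated field ($1 in awk).
    return Counter(line.split()[0] for line in lines).most_common(n)

print(top_ips(sample_log))
```

The other sections of the shell script work the same way, only counting a different field (the timestamp hour-minute, or the referrer).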
2. Probing the web service quality of multiple nodes
In real work, network latency sometimes causes a program to return data late, and we then need to pinpoint exactly which time period a machine suffered the latency. We can do this with the pycurl module for Python: by calling the methods pycurl provides, we can probe the quality of a web service, e.g. obtain its HTTP status code, request latency, HTTP header information, download speed, and so on. The script is as follows (this script has been tested on Amazon Linux AMI x86_64):
#!/usr/bin/python
#encoding:utf-8
#*/30 * * * * /usr/bin/python /root/dnstime.py >> /root/myreport.txt 2>&1
import os
import time
import sys
import pycurl
#import commands
url = "http://imp-east.example.net"
isotimeformat = "%Y-%m-%d %X"
c = pycurl.Curl()
c.setopt(pycurl.URL, url)
c.setopt(pycurl.CONNECTTIMEOUT, 5)
c.setopt(pycurl.TIMEOUT, 5)
c.setopt(pycurl.FORBID_REUSE, 1)
c.setopt(pycurl.MAXREDIRS, 1)
c.setopt(pycurl.NOPROGRESS, 1)
c.setopt(pycurl.DNS_CACHE_TIMEOUT, 30)
indexfile = open(os.path.dirname(os.path.realpath(__file__)) + "/content.txt", "wb")
c.setopt(pycurl.WRITEHEADER, indexfile)
c.setopt(pycurl.WRITEDATA, indexfile)
try:
    c.perform()
except Exception, e:
    print "Connection error: " + str(e)
    indexfile.close()
    c.close()
    sys.exit()
namelookup_time = c.getinfo(c.NAMELOOKUP_TIME)
connect_time = c.getinfo(c.CONNECT_TIME)
pretransfer_time = c.getinfo(c.PRETRANSFER_TIME)
starttransfer_time = c.getinfo(c.STARTTRANSFER_TIME)
total_time = c.getinfo(c.TOTAL_TIME)
http_code = c.getinfo(c.HTTP_CODE)
size_download = c.getinfo(c.SIZE_DOWNLOAD)
header_size = c.getinfo(c.HEADER_SIZE)
speed_download = c.getinfo(c.SPEED_DOWNLOAD)
print "HTTP status code: %s" % (http_code)
print "DNS resolution time: %.2f ms" % (namelookup_time * 1000)
print "Connect time: %.2f ms" % (connect_time * 1000)
print "Pretransfer time: %.2f ms" % (pretransfer_time * 1000)
print "Start-transfer time: %.2f ms" % (starttransfer_time * 1000)
print "Total time: %.2f ms" % (total_time * 1000)
print "Downloaded data size: %d bytes" % (size_download)
print "HTTP header size: %d bytes" % (header_size)
print "Average download speed: %d bytes/s" % (speed_download)
indexfile.close()
c.close()
print time.strftime(isotimeformat, time.gmtime(time.time()))
print "================================================================"
3. A small script to test whether LAN hosts are alive
When maintaining a LAN, we often need to collect the IPs that are alive on the network, and a Python script can collect those of a given subnet automatically. IT companies nowadays are fairly large, and their network engineers usually plan several VLANs (subnets); we can use the following script to collect the live hosts in one VLAN (this script has been tested on CentOS 6.4 x86_64):
import re
import time
import subprocess
lifeline = re.compile(r"(\d) received")
report = ("No response", "Partial response", "Alive")
print time.ctime()
for host in range(1, 254):
    ip = "192.168.1." + str(host)
    pingaling = subprocess.Popen(["ping", "-q", "-c 2", "-r", ip],
                                 shell=False, stdin=subprocess.PIPE,
                                 stdout=subprocess.PIPE)
    print "Testing ", ip,
    while 1:
        pingaling.stdout.flush()
        line = pingaling.stdout.readline()
        if not line: break
        igot = re.findall(lifeline, line)
        if igot:
            print report[int(igot[0])]
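The key trick is the `(\d) received` regular expression: with `ping -c 2`, the summary line reports 0, 1, or 2 packets received, and that digit indexes directly into the `report` tuple. A minimal Python 3 sketch of just that classification step (the ping summary lines below are made-up samples):

```python
import re

# With "ping -c 2", the summary line contains "N received" where N is
# 0, 1, or 2 -- the captured digit indexes straight into the tuple.
lifeline = re.compile(r"(\d) received")
report = ("No response", "Partial response", "Alive")

def classify(summary_line):
    """Map a ping summary line to a liveness label, or None if no match."""
    igot = lifeline.findall(summary_line)
    return report[int(igot[0])] if igot else None

# Made-up sample summary lines for illustration:
print(classify("2 packets transmitted, 2 received, 0% packet loss"))   # Alive
print(classify("2 packets transmitted, 0 received, 100% packet loss")) # No response
```

This is why the script works without parsing any other part of ping's output: only the summary line matches the pattern, and every other line is skipped.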
Python is very strict about whitespace and indentation, so please pay attention to this. Though short, the script is practical and effective, and saves us from going to a Windows machine to download a LAN-scanning tool. In daily work we should keep collecting and writing scripts like these, so as to simplify our operations work.