1 Preparing the data
$ vi countries
$ cat countries
ussr 8689 275 asia
canada 3852 25 north america
china 3705 1032 asia
usa 3615 237 north america
brazil 3286 134 sourth america
india 1267 746 asia
mexico 762 78 north america
france 211 55 europe
japan 144 120 asia
germany 96 61 europe
england 94 56 europe
2 Patterns
BEGIN runs before awk reads any input; END runs after all input has been read.
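As a minimal sketch of the BEGIN/END behaviour, the snippet below prints a header before the first record and a record count after the last one. The three-record sample file and the variable names `sample` and `report` are assumptions made for illustration; they are not part of the original notes.

```shell
# Hypothetical three-record sample in the same layout as the countries file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
ussr 8689 275 asia
canada 3852 25 north america
china 3705 1032 asia
EOF

# BEGIN prints a header before the first record is read.
# END runs after the last record, when NR holds the total record count.
report=$(awk 'BEGIN { print "COUNTRY AREA" }
                    { print $1, $2 }
              END   { print NR, "records" }' "$sample")
printf '%s\n' "$report"
rm -f "$sample"
```

The header appears once at the top and the count once at the bottom, no matter how many records the input holds.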
2.1 Comparison operators
2.2 String matching
2.3 Regular expressions
2.4 Escape sequences
2.5 Compound patterns
$ awk '$4 ~ /^(asia|europe)$/ {print $0}' countries
ussr 8689 275 asia
china 3705 1032 asia
...
$ awk '$4 ~ /asia/ || /europe/ {print $0}' countries
ussr 8689 275 asia
...
england 94 56 europe
$ awk '$4 ~ /asia|europe/ {print $0}' countries
ussr 8689 275 asia
...
england 94 56 europe
$ awk '$4 !~ /asia|europe/ {print $0}' countries
canada 3852 25 north america
...
mexico 762 78 north america
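The examples above combine patterns with || and alternation. The sketch below adds the other two compound operators, && and !, on a small hypothetical sample; the file contents and the variable names `big_asia` and `not_asia` are assumptions for illustration only.

```shell
# Hypothetical five-record sample in the same layout as the countries file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
ussr 8689 275 asia
china 3705 1032 asia
india 1267 746 asia
france 211 55 europe
japan 144 120 asia
EOF

# && requires both conditions: continent is asia AND population above 500.
big_asia=$(awk '$4 == "asia" && $3 > 500 { print $1 }' "$sample")

# ! negates a pattern: every record whose continent is not asia.
not_asia=$(awk '!($4 == "asia") { print $1 }' "$sample")

printf '%s\n' "$big_asia" "$not_asia"
rm -f "$sample"
```

Parentheses around the negated comparison keep the intent obvious, since ! binds more tightly than ==.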
2.6 Range Patterns
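A range pattern matches every line from a line matching pat1 through the next line matching pat2, inclusive; pat2 may match the same line as pat1, which makes the range a single line.
Matching begins whenever the first pattern of the range matches; if no instance of the second pattern is found afterwards, all lines through the end of the input are matched.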
/Canada/, /USA/
$ awk '$4 ~ /europe/,/africa/ {print $0}' countries
france 211 55 europe
japan 144 120 asia
germany 96 61 europe
england 94 56 europe
# Lines 1 through 2
$ awk 'NR==1,NR==2 {print $0}' countries
ussr 8689 275 asia
canada 3852 25 north america
# Lines 1 through 5
$ awk 'FNR==1,FNR==5 {print FILENAME ":" $0}' countries
countries:ussr 8689 275 asia
countries:canada 3852 25 north america
countries:china 3705 1032 asia
countries:usa 3615 237 north america
countries:brazil 3286 134 sourth america
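The three range behaviours described in this section (an ordinary range, a range that collapses to a single line, and a range that runs to the end of input) can be sketched as follows. The six-record sample and the variable names are assumptions for illustration. The patterns are lowercase because awk regular expressions are case-sensitive and the data is lowercase.

```shell
# Hypothetical six-record sample in the same layout as the countries file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
ussr 8689 275 asia
canada 3852 25 north america
china 3705 1032 asia
usa 3615 237 north america
india 1267 746 asia
mexico 762 78 north america
EOF

# Ordinary range: from the canada line through the usa line, inclusive.
normal=$(awk '/canada/,/usa/ { print $1 }' "$sample")

# Single-line range: the china line also matches /asia/, so the range ends at once.
single=$(awk '/china/,/asia/ { print $1 }' "$sample")

# Open-ended range: /africa/ never matches, so the range runs to the last line.
to_end=$(awk '/india/,/africa/ { print $1 }' "$sample")

printf '%s\n' "$normal" "$single" "$to_end"
rm -f "$sample"
```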
3 Actions
3.1 Built-in variables
3.2 Arithmetic operators
$ awk '{print ($2 >200 ? $2:"$2 less than 200" NR)}' countries
8689
3852
3705
3615
3286
1267
762
211
$2 less than 2009
$2 less than 20010
$2 less than 20011
$ awk '$4=="asia" {pop=pop+$3;n=n+1}
END {print "total population of the",n,
"asian countries is",pop,"million."}' countries
total population of the 4 asian countries is 2173 million.
$ awk '$3>maxpop {maxpop=$3;country=$1}
> END {print "country with largest population:", country,maxpop}' countries
country with largest population: china 1032
3.3 Built-in arithmetic functions
# Strings as regular expressions
$ awk 'BEGIN {digits="^[0-9]+$"}
$2 ~ digits {print $0}' countries
ussr 8689 275 asia
canada 3852 25 north america
...
$ awk 'BEGIN {
sign = "[+-]?"
decimal = "[0-9]+[.]?[0-9]*"
fraction = "[.][0-9]+"
exponent = "([eE]" sign "[0-9]+)?"
number = "^" sign "(" decimal "|" fraction ")" exponent "$"
}
$0 ~ number' countries
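No line of the countries file is a bare number, so the effect of a floating-point pattern assembled from strings is easier to see on hand-written input. The sketch below feeds a few test strings through a pattern built up from the pieces sign, decimal, fraction and exponent; the input strings and the variable name `matches` are assumptions for illustration.

```shell
# Feed candidate strings on stdin; awk prints only those that are whole numbers
# in the sense of the assembled pattern (optional sign, digits, optional exponent).
matches=$(printf '%s\n' '42' '-3.14' '.5' '6.02e+23' '1e' 'abc' '12 34' |
awk 'BEGIN {
    sign     = "[+-]?"
    decimal  = "[0-9]+[.]?[0-9]*"
    fraction = "[.][0-9]+"
    exponent = "([eE]" sign "[0-9]+)?"
    # Concatenation joins the pieces into one anchored regular expression.
    number   = "^" sign "(" decimal "|" fraction ")" exponent "$"
}
$0 ~ number')
printf '%s\n' "$matches"
```

The strings `1e`, `abc` and `12 34` are rejected because the pattern is anchored at both ends and an exponent, when present, must carry digits.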
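In a matching expression, a regular expression written between slashes ( / / ) and the same expression written as a quoted string ( " " ) are interchangeable, except that the string form needs one extra backslash in front of an escaped metacharacter. For example, the following two are equivalent:
$0 ~ /(\+|-)[0-9]+/
$0 ~ "(\\+|-)[0-9]+"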
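The equivalence between a slash-delimited regular expression and a quoted string can be checked directly. In this sketch the same three hypothetical input strings go through both forms; the string form doubles the backslash because awk strips one level of escaping when it reads a string constant. The variable names are illustrative.

```shell
# Regular expression literal: a single backslash makes + a literal plus sign.
with_slashes=$(printf '%s\n' '+12' '-7' 'x9' | awk '$0 ~ /(\+|-)[0-9]+/')

# Quoted string: the backslash itself must be escaped, hence \\+.
with_string=$(printf '%s\n' '+12' '-7' 'x9' | awk '$0 ~ "(\\+|-)[0-9]+"')

printf '%s\n' "$with_slashes" "$with_string"
```

Both commands select the signed values and skip `x9`, which has no sign in front of its digit.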