Logstash conf Parameter Notes

The parameters used in Logstash *.conf configuration files come from a Ruby-derived DSL; they are summarized below:

####################### Main parameters (the overall structure is input/filter/output; the most important filter is grok, which ships with the most commonly used patterns, and with these patterns the needed information can be extracted into fields)

# input

# e.g.

input 
{
	file 
	{
		path => ["/home/logone/testlog3/ora_*.log","/home/logone/testlog3/alert_orcl.log"]
		start_position => "beginning"
		type => "db_log"
		add_field => { "platform" => "oracle" }
	}
}
           
file 
	{
		path => "F:\Temp\TmpLog\mig_20141031.log"
		codec => multiline 
		{
			# pattern => "^%{TIMESTAMP_ISO8601} ^%{DATE} ^%{DATESTAMP}"
			patterns_dir => ["F:\Dev\Logagent_3.142\logagent-3.0.142\mypatterns"]
			pattern => "%{TOMCATDATE}|%{TIME}"
			negate => true
			what => previous
		}
...
           

# filter

# e.g.

filter
{
	if [type] == "TmpLog" { # if [foo] in ["hello", "world", "foo"]
		mutate {
			replace => { "type" => "apache_access" }		# mutate: modify fields (rename, update, replace, split ...)
			split => ["message",":"]
		}
		grok {	# the main parsing filter
			patterns_dir => ["/home/logtools/logstash-1.4.2/mypatterns"]	# directory of custom patterns (mostly regular expressions)
			match => { "message" => "%{UserOnOffLog}" }
		}
		alter {	# change fields (per the official docs this may be merged into mutate later)
			condrewrite => [ "host", "%{host}", "10.0.0.139" ]		# if the field equals the expected value, replace its content: ["field_name", "expected_value", "new_value"]
		}
		date {	# parse dates
			match => [ "create_time" , "yyyy/MM/dd HH:mm:ss" ]
		}
		multiline {								# merge multiple lines into one event, e.g. Java stack traces
			type => "somefiletype"
			pattern => "^\s"				# lines starting with whitespace
			what => "previous"				# merge with the previous line
		}
	}
}
           

# output

# e.g.

output {
  elasticsearch {
    host => "localhost"
  }
  stdout { codec => rubydebug }
}
           
output {
  if "_grokparsefailure" not in [tags] {  # lines that fail to match the grok pattern automatically get "_grokparsefailure" added to the "tags" field; this tag lets the output drop those lines
    elasticsearch { ... }
  }
}
           

####################### Other parameters

# sprintf format, referencing field values

# e.g.

increment => "apache.%{[response][status]}"	# nested field access; array elements: %{[response][0]}, %{[response][1]} ...
path => "/var/log/%{type}.%{+yyyy.MM.dd.HH}"	# +FORMAT formats the event's timestamp
IPORHOST (?:%{HOSTNAME}|%{IP})			# pattern definition using regex alternation: match %{HOSTNAME} if possible, otherwise %{IP}
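A fuller sketch of sprintf references in an output; the file output and the path below are illustrative assumptions, not from the original:

```
output {
  file {
    # one log file per event type and day; %{type} and %{+yyyy.MM.dd}
    # are expanded per event via sprintf formatting
    path => "/var/log/collected/%{type}.%{+yyyy.MM.dd}.log"
  }
}
```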
           

# if

# supported operators: ==, !=, <, >, <=, >=; =~, !~; in, not in; and, or, nand, xor; !

# e.g.

if [action] == "login" {		# field "action" equals "login"
    mutate { remove_field => "secret" }
} else if ...
if [foo] in [foobar] {}
if [foo] in ["hello", "world", "foo"] {}
if "_grokparsefailure" not in [tags] {}
if [message][0] =~ /^ORA-[0-9]{5}/ {}		# ORA-xxxxx format, an Oracle error
           

To remove the _grokparsefailure tag, add the following inside the filter block:

alter
	{
	  remove_tag => "_grokparsefailure"
	}
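Since remove_tag is a common option available on most filters, a plain mutate filter works just as well; a minimal sketch:

```
filter {
  mutate {
    # drop the failure tag from events that carry it
    remove_tag => [ "_grokparsefailure" ]
  }
}
```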
           

####################### Grok basics

# %{SYNTAX:SEMANTIC} — SYNTAX is the name of the pattern to match against; SEMANTIC is the label given to the matched text (i.e. the new field name), e.g. below the text matched by "IP" is stored in the field "client"

# e.g.

%{NUMBER:duration} %{IP:client}
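Used in a grok filter, the two patterns above turn a matching line into two fields; the sample message below is a hypothetical illustration:

```
filter {
  grok {
    match => { "message" => "%{NUMBER:duration} %{IP:client}" }
  }
}
# e.g. a message of "0.043 55.3.244.1" would yield
#   "duration" => "0.043", "client" => "55.3.244.1"
```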
           

# In the example below, the first token is the pattern's name (an alias); the second through fourth are matching expressions, strictly separated by single spaces (a missing space fails the match; if any sub-expression fails to match, the whole pattern fails)

COMBINEDAPACHELOG %{COMMONAPACHELOG} %{QS:referrer} %{QS:agent}
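Definitions like this live in a plain-text file under patterns_dir, one "NAME regex" per line. A sketch of such a file for the custom patterns used earlier; the file name and pattern bodies are assumptions (the ORACLE date layout follows the sample alert-log line shown below):

```
# /home/logagent/mypatterns/custom — one "NAME regex" definition per line
TOMCATDATE %{MONTH} %{MONTHDAY}, %{YEAR} %{TIME}
ORACLE_DATE %{DAY} %{MONTH} %{MONTHDAY} %{TIME} %{YEAR}
```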
           

####################### Merging lines

input 
{
	file 
	{
		codec => multiline 
		{
			patterns_dir => ["/home/logagent/mypatterns"]
			pattern => "^(%{TIMESTAMP1})"
			negate => true		# true: lines NOT matching the pattern are merged; false (default): matching lines are merged
			what => previous	# previous: merge into the preceding event; next: merge into the following event
		}
	}
}
           
filter 
{
	if [platform] in ["oracle","db2","mysql"]
	{
	  mutate
	  {
	    # gsub takes one flat array of field/pattern/replacement triples,
	    # so both substitutions go in a single gsub option
	    gsub => ["message", "\r\n", "#L_B#",
	             "message", "\n", "#L_B#"]
	  }
	}
	if [platform] == "oracle"
	{
	  grok 
	  {
	    patterns_dir => ["/home/logagent/mypatterns"]
	    match => { "message" => "%{ORACLE_LOGLINE_ALL}"}
	  }
...
           

Log input

Wed Apr 01 16:42:36 2015
VKTM started with pid=3, OS id=6836 
VKTM running
           

Merged output

"message" => "Wed Apr 01 16:42:36 2015\nVKTM started with pid=3, OS id=6836 \nVKTM running"
           

Reference: http://blog.chinaunix.net/uid-532511-id-4845841.html

####################### Troubleshooting

Q: After a restart, the log shows the process seems "stuck", emitting only a few unimportant lines.

A: Sometimes moving the input file to a different directory fixes it. This is likely because the file input records read offsets in its sincedb; a moved file is treated as new and read from the start.
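The file input tracks read positions in a sincedb file; for testing, its sincedb_path option can force a full re-read on every restart. A sketch (the log path is an assumption):

```
input {
  file {
    path => "/home/logone/testlog3/alert_orcl.log"
    start_position => "beginning"
    # discard saved offsets so the file is re-read on every restart (testing only)
    sincedb_path => "/dev/null"
  }
}
```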

####################### Reference

References: https://www.elastic.co/guide/en/logstash/current/index.html, http://writequit.org/articles/logstash-intro.html

Grok patterns, official reference: https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html

Grok patterns, source code: https://github.com/logstash-plugins/logstash-patterns-core/blob/master/patterns/grok-patterns

Grok patterns, regex (Oniguruma) reference: http://www.geocities.jp/kosako3/oniguruma/doc/RE.txt

Grok patterns, online tester: http://grokdebug.herokuapp.com/

Lucene - Query Parser Syntax: http://lucene.apache.org/core/3_6_1/queryparsersyntax.html

More examples: http://www.51itong.net/logstash-mutate-555.html
