
Logstash conf Parameters Explained

The parameters used in Logstash *.conf configuration files derive from Ruby. They are summarized below:

####################### Main parameters (the overall structure is input/filter/output; the core of filter is grok, which ships with the most commonly used patterns; these patterns extract the information you need into named fields)

# input

# e.g.

input 
{
	file 
	{
		path => ["/home/logone/testlog3/ora_*.log","/home/logone/testlog3/alert_orcl.log"]
		start_position => "beginning"
		type => "db_log"
		add_field => { "platform" => "oracle" }
	}
}
           
file 
{
	path => "F:\Temp\TmpLog\mig_20141031.log"
	codec => multiline 
	{
		# pattern => "^%{TIMESTAMP_ISO8601} ^%{DATE} ^%{DATESTAMP}"
		patterns_dir => ["F:\Dev\Logagent_3.142\logagent-3.0.142\mypatterns"]
		pattern => "%{TOMCATDATE}|%{TIME}"
		negate => true
		what => "previous"
	}
...
           

# filter

#e.g.

filter
{
	if [type] == "TmpLog" { # cf. if [foo] in ["hello", "world", "foo"]
		mutate {
			replace => { "type" => "apache_access" }		# mutate: modify fields (rename, update, replace, split ...)
			split => ["message",":"]
		}
		grok {	# the main parsing filter
			patterns_dir => ["/home/logtools/logstash-1.4.2/mypatterns"]	# directory of custom patterns (mostly regular expressions)
			match => { "message" => "%{UserOnOffLog}" }
		}
		alter {	# modify fields (officially, this may be merged into mutate in the future)
			condrewrite => [ "host", "%{host}", "10.0.0.139" ]		# if the field equals the expected value, rewrite it: ["field_name", "expected_value", "new_value"]
		}
		date {	# parse dates
			match => [ "create_time" , "yyyy/MM/dd HH:mm:ss" ]
		}
		multiline {								# merge multiple lines into one event, e.g. Java stack traces
			type => "somefiletype"
			pattern => "^\s"				# lines beginning with whitespace
			what => "previous"				# merge with the previous line
		}
	}
}
           

# output

# e.g.

output {
  elasticsearch {
    host => "localhost"
  }
  stdout { codec => rubydebug }
}
           
output {
  if "_grokparsefailure" not in [tags] {  # events whose message fails to match the filter pattern are automatically given a tag "_grokparsefailure"; testing this tag in output lets you filter out those events
    elasticsearch { ... }
  }
}
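
As a complementary sketch (the file path here is illustrative, not from the original), failed events can instead be routed to a separate output for later inspection:

```conf
output {
  if "_grokparsefailure" in [tags] {
    # events that failed grok parsing: keep them in a plain file for debugging
    file { path => "/var/log/logstash/grok_failures.log" }
  } else {
    elasticsearch { host => "localhost" }
  }
}
```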
           

####################### Other parameters

# sprintf format: referencing fields

# e.g.

increment => "apache.%{[response][status]}"	# nested/array fields: %{[response][0]}, %{[response][1]} ...
path => "/var/log/%{type}.%{+yyyy.MM.dd.HH}"	# +FORMAT applies the given date/time format to the event timestamp
IPORHOST (?:%{HOSTNAME}|%{IP})			# regex alternation, like a binary either/or: match A (HOSTNAME) if possible, otherwise B (IP)
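
A minimal sketch tying the sprintf references together (the path and field names are illustrative): an output whose file path is built from the event's type field and timestamp:

```conf
output {
  file {
    # %{type} expands to the event's "type" field;
    # %{+yyyy.MM.dd} formats the event's @timestamp
    path => "/var/log/%{type}.%{+yyyy.MM.dd}.log"
  }
}
```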
           

# if

# Supported operators: ==, !=, <, >, <=, >=; =~, !~; in, not in; and, or, nand, xor; !

# e.g.

if [action] == "login" {		# field "action" equals "login"
    mutate { remove_field => ["secret"] }
} else if ...
if [foo] in [foobar] {}
if [foo] in ["hello", "world", "foo"] {}
if "_grokparsefailure" not in [tags] {}
if [message][0] =~ /^ORA-[0-9]{5}/ {}		# matches ORA-xxxxx, i.e. Oracle error lines
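
The operators above can be combined. For instance (an illustrative sketch, not from the original; loglevel is an assumed field), tagging Oracle error lines while dropping low-value events:

```conf
filter {
  if [platform] == "oracle" and [message] =~ /^ORA-[0-9]{5}/ {
    mutate { add_tag => ["oracle_error"] }	# mark Oracle error lines
  } else if [loglevel] in ["DEBUG", "TRACE"] {
    drop { }					# discard the event entirely
  }
}
```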
           

To remove the _grokparsefailure tag, add the following to the filter block:

alter
{
	remove_tag => "_grokparsefailure"
}
           

####################### Grok basics

# %{SYNTAX:SEMANTIC} — SYNTAX is the name of the matching pattern; SEMANTIC is the label (field name) given to the matched text, i.e. the capture is stored under a new name. Below, the text matched by the "IP" pattern is stored in the field "client".

# e.g.

%{NUMBER:duration} %{IP:client}
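
For instance, given a log line like `55.3.244.1 GET /index.html 15824 0.043` (the classic example from the Logstash documentation), a grok match such as:

```conf
filter {
  grok {
    match => { "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" }
  }
}
```

yields the fields client => 55.3.244.1, method => GET, request => /index.html, bytes => 15824, duration => 0.043.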
           

# In the example below, the first token is the alias for the whole expression; the second through fourth are the matching sub-expressions, strictly separated by single spaces (a missing space means no match, and if any sub-expression fails to match, the whole pattern fails)

COMBINEDAPACHELOG %{COMMONAPACHELOG} %{QS:referrer} %{QS:agent}
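
Custom aliases like this live in a file under patterns_dir, one `NAME regex` pair per line. A hypothetical sketch (file name and pattern name are illustrative, not from the original):

```conf
# contents of /home/logagent/mypatterns/custom (illustrative):
#   ORACLE_ERRCODE ORA-[0-9]{5}
filter {
  grok {
    patterns_dir => ["/home/logagent/mypatterns"]
    match => { "message" => "%{ORACLE_ERRCODE:errcode}" }
  }
}
```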
           

####################### Merging lines

input 
{
	file 
	{
		codec => multiline 
		{
			patterns_dir => ["/home/logagent/mypatterns"]
			pattern => "^(%{TIMESTAMP1})"
			negate => true		# true: lines that do NOT match the pattern are merged; false (default): lines that DO match are merged
			what => "previous"	# previous: append to the previous event; next: prepend to the next event
		}
	}
}
           
filter 
{
	if [platform] in ["oracle","db2","mysql"]
	{
	  mutate
	  {
	    gsub => ["message", "\r\n", "#L_B#",
	             "message", "\n", "#L_B#"]	# gsub takes one flat array of field/pattern/replacement triples; repeating the gsub key would overwrite the first entry
	  }
	}
	if [platform] == "oracle"
	{
	  grok 
	  {
	    patterns_dir => ["/home/logagent/mypatterns"]
	    match => { "message" => "%{ORACLE_LOGLINE_ALL}"}
	  }
...
           

Log input

Wed Apr 01 16:42:36 2015
VKTM started with pid=3, OS id=6836 
VKTM running
           

Merged output

"message" => "Wed Apr 01 16:42:36 2015\nVKTM started with pid=3, OS id=6836 \nVKTM running"
           

Reference: http://blog.chinaunix.net/uid-532511-id-4845841.html

####################### Troubleshooting

Q: After a restart, the log shows the process "hanging", emitting only a few unimportant informational lines.

A: Sometimes moving the input file to a different directory gets it working again. The file input records read positions in a sincedb file, so files it has already seen are not re-read from the beginning; a file at a new path looks like a new file.
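
A related workaround: `sincedb_path` is a real option of the file input, and pointing it at /dev/null is a common trick so that no read offsets are remembered, at the cost of re-reading files on every restart (the path below is illustrative):

```conf
input {
  file {
    path => "/home/logone/testlog3/alert_orcl.log"
    start_position => "beginning"
    sincedb_path => "/dev/null"	# do not persist read offsets; always start over
  }
}
```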

####################### Reference

References: https://www.elastic.co/guide/en/logstash/current/index.html, http://writequit.org/articles/logstash-intro.html

Grok patterns, official reference: https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html

Grok patterns, source code: https://github.com/logstash-plugins/logstash-patterns-core/blob/master/patterns/grok-patterns

Grok patterns, regex (Oniguruma) reference: http://www.geocities.jp/kosako3/oniguruma/doc/RE.txt

Grok patterns, online debugger: http://grokdebug.herokuapp.com/

Lucene - Query Parser Syntax: http://lucene.apache.org/core/3_6_1/queryparsersyntax.html

Other examples: http://www.51itong.net/logstash-mutate-555.html
