
A Hands-On Case: Collecting Nginx Access Logs with ELK

(Figure: application architecture for collecting Nginx access logs with ELK)


Nginx Log Format and Log Variables

Like Apache, Nginx supports customizing its log output format. Before defining the Nginx log format, it is worth understanding a few concepts around obtaining the user's real IP behind multiple layers of proxies.

  • remote_addr: the client address, with one caveat: if no proxy is involved, it is the client's real IP; if the request came through a proxy, it is the IP of the last-hop proxy (the one that connected to the server). Equivalent to the Apache log variable %a.
  • X-Forwarded-For: XFF for short, an HTTP extension header of the form X-Forwarded-For: client, proxy1, proxy2. If a request passes through three proxies Proxy1, Proxy2 and Proxy3 (IPs IP1, IP2, IP3) before reaching the server, and the client's real IP is IP0, then by the XFF convention the server ultimately receives: X-Forwarded-For: IP0, IP1, IP2

Notice that IP3 never makes it into X-Forwarded-For; it is precisely the address that remote_addr ends up holding.

There are also a few easily confused Nginx variables worth spelling out (a concrete comparison follows the list):

  • $remote_addr: if the request arrives through a proxy, this holds the last-hop proxy's IP; without a proxy it is the client's real IP. Equivalent to %a in Apache logs.
  • $http_x_forwarded_for: the value of the X-Forwarded-For request header.
  • $proxy_add_x_forwarded_for: the value of $http_x_forwarded_for with $remote_addr appended after a comma, i.e. a concatenation of the two variables rather than a literal "sum".
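To make the difference concrete, here is what the three variables hold in the two-proxy scenario built later in this article (client 192.168.126.1 → proxy 192.168.126.92 → proxy 192.168.126.91 → nginx at 192.168.126.90):

# Values nginx sees for a request that traversed both proxies:
#   $remote_addr               = 192.168.126.91                                  (last-hop proxy)
#   $http_x_forwarded_for      = 192.168.126.1, 192.168.126.92                   (header built up by the proxies)
#   $proxy_add_x_forwarded_for = 192.168.126.1, 192.168.126.92, 192.168.126.91   (XFF plus $remote_addr)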

The log format nginx defines and references by default is main:

[root@filebeatserver nginx]# grep -A 4 'log_format' /etc/nginx/nginx.conf
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

Customizing the Nginx Log Format

With the meaning of the Nginx log variables covered, we can reshape the output format. As before, Nginx is set to emit JSON. Only the log-format and log-file definitions in the Nginx configuration file nginx.conf are shown below:

[root@filebeatserver ~]# vim /etc/nginx/nginx.conf
map $http_x_forwarded_for $clientRealIp {           # define the variable $clientRealIp
        "" $remote_addr;                            # when $http_x_forwarded_for is empty, fall back to $remote_addr
        ~^(?P<firstAddr>[0-9\.]+),?.*$ $firstAddr;  # when it is non-empty, capture the first IP in $http_x_forwarded_for into $firstAddr and use that
    }       # net effect: this map block resolves the real client IP into $clientRealIp, which the log format below references

# Custom nginx log format
    log_format nginx_log_json '{"accessip_list":"$proxy_add_x_forwarded_for","client_ip":"$clientRealIp",'
                              '"http_host":"$host","@timestamp":"$time_iso8601","method":"$request_method",'
                              '"url":"$request_uri","status":"$status","http_referer":"$http_referer",'
                              '"body_bytes_sent":"$body_bytes_sent","request_time":"$request_time",'
                              '"http_user_agent":"$http_user_agent","total_bytes_sent":"$bytes_sent",'
                              '"server_ip":"$server_addr"}';

    access_log  /var/log/nginx/access.log  nginx_log_json;

Verify the log output

[root@filebeatserver ~]# nginx -t
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
[root@filebeatserver ~]# systemctl restart nginx
[root@filebeatserver ~]# ifconfig ens32 | awk 'NR==2 {print $2}'
192.168.126.90
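Before pointing a browser at it, the map logic can be sanity-checked by forging an X-Forwarded-For header with curl (a quick sketch; the 10.0.0.x addresses are invented for illustration):

# Without the header, client_ip in the JSON log equals the calling host's IP;
# with the forged header, client_ip should become the first address in the list (10.0.0.1).
curl http://192.168.126.90/
curl -H "X-Forwarded-For: 10.0.0.1, 10.0.0.2" http://192.168.126.90/
tail -2 /var/log/nginx/access.log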

Visit http://192.168.126.90 in a browser.


Check the nginx log

[root@filebeatserver ~]# tailf /var/log/nginx/access.log
{"accessip_list":"192.168.126.1","client_ip":"192.168.126.1","http_host":"192.168.126.90","@timestamp":"2021-08-14T22:54:46+08:00","method":"GET","url":"/","status":"200","http_referer":"-","body_bytes_sent":"612","request_time":"0.000","http_user_agent":"Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0","total_bytes_sent":"850","server_ip":"192.168.126.90"}
{"accessip_list":"192.168.126.1","client_ip":"192.168.126.1","http_host":"192.168.126.90","@timestamp":"2021-08-14T22:54:46+08:00","method":"GET","url":"/favicon.ico","status":"404","http_referer":"-","body_bytes_sent":"153","request_time":"0.000","http_user_agent":"Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0","total_bytes_sent":"308","server_ip":"192.168.126.90"}
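Since every log line is now a self-contained JSON object, it can also be validated mechanically, for example with jq (assuming jq is installed; any JSON parser will do):

# jq exits non-zero and prints an error if the line is not valid JSON
tail -1 /var/log/nginx/access.log | jq .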

Set up one layer of reverse proxy in front of the nginx server

[root@192.168.126.91 ~]# ifconfig ens32 | awk 'NR==2 {print $2}'
192.168.126.91
[root@192.168.126.91 ~]# vim /etc/httpd/conf/httpd.conf
ProxyPass / http://192.168.126.90/
ProxyPassReverse / http://192.168.126.90/
[root@192.168.126.91 ~]# systemctl restart httpd
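Nothing extra is needed on the Apache side for the client address to survive the hop: mod_proxy_http appends the X-Forwarded-For header automatically. A quick way to confirm the proxy modules are loaded:

# expect proxy_module and proxy_http_module to appear in the output
httpd -M | grep -E 'proxy_module|proxy_http_module'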

Visit http://192.168.126.91 in a browser.


Check the nginx log

[root@filebeatserver ~]# tailf /var/log/nginx/access.log
{"accessip_list":"192.168.126.1, 192.168.126.91","client_ip":"192.168.126.1","http_host":"192.168.126.90","@timestamp":"2021-08-14T23:02:48+08:00","method":"GET","url":"/","status":"200","http_referer":"-","body_bytes_sent":"612","request_time":"0.000","http_user_agent":"Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0","total_bytes_sent":"850","server_ip":"192.168.126.90"}
# accessip_list now holds two IPs: the first is the real client IP and the second is the proxy's IP; client_ip is the real client IP

Add a second reverse-proxy layer on top of the first

[root@192.168.126.92 ~]# ifconfig ens32 | awk 'NR==2 {print $2}'
192.168.126.92
[root@192.168.126.92 ~]# vim /etc/httpd/conf/httpd.conf
ProxyPass / http://192.168.126.91/
ProxyPassReverse / http://192.168.126.91/
[root@192.168.126.92 ~]# systemctl restart httpd

Visit http://192.168.126.92 in a browser.


Check the nginx log

[root@filebeatserver ~]# tailf /var/log/nginx/access.log
{"accessip_list":"192.168.126.1, 192.168.126.92, 192.168.126.91","client_ip":"192.168.126.1","http_host":"192.168.126.90","@timestamp":"2021-08-14T23:08:11+08:00","method":"GET","url":"/","status":"200","http_referer":"-","body_bytes_sent":"612","request_time":"0.000","http_user_agent":"Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0","total_bytes_sent":"850","server_ip":"192.168.126.90"}
# accessip_list now holds three IPs: the first is the real client IP, the second (192.168.126.92) is the proxy the browser hit, and the third (192.168.126.91) is the next proxy in the chain; client_ip is still the real client IP

These outputs show how client_ip and accessip_list differ: client_ip is always the real client address, while accessip_list is the list of addresses accumulated hop by hop through the proxies. The first log entry came from hitting http://192.168.126.90 directly with no proxy; the second from going through one proxy via http://192.168.126.91; and the third from going through two proxies via http://192.168.126.92.

Getting the real client IP in Nginx is therefore straightforward and needs no special handling, which also cuts down the work of writing the logstash pipeline file later.

Configuring filebeat

filebeat is installed on the Nginx server itself. Here is the finished filebeat.yml:

[root@filebeatserver filebeat]# vim /usr/local/filebeat/filebeat.yml
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/nginx/access.log
  fields:
    log_topic: nginxlogs            # custom field, referenced below to select the Kafka topic
name: "192.168.126.90"              # shipper name reported in each event
output.kafka:
  enabled: true
  hosts: ["192.168.126.91:9092", "192.168.126.92:9092", "192.168.126.93:9092"]
  version: "0.10"
  topic: '%{[fields][log_topic]}'   # expands to nginxlogs via the field defined above
  partition.round_robin:
    reachable_only: true
  worker: 2
  required_acks: 1
  compression: gzip
  max_message_bytes: 10000000
logging.level: debug                # debug logging so published events can be watched in the console

# Start filebeat
[root@filebeatserver filebeat]# nohup /usr/local/filebeat/filebeat -e -c /usr/local/filebeat/filebeat.yml &
[1] 1056
nohup: ignoring input and appending output to ‘nohup.out’

Start the kafka + zookeeper cluster

# Run on each of the three kafka/zookeeper nodes (192.168.126.91-93)
/usr/local/zookeeper/bin/zkServer.sh start
nohup /usr/local/kafka/bin/kafka-server-start.sh /usr/local/kafka/config/server.properties &
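Once the brokers are up and filebeat has published at least one event, the nginxlogs topic should be visible in the cluster:

/usr/local/kafka/bin/kafka-topics.sh --zookeeper 192.168.126.91:2181 --list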

Access nginx from a browser again


Check the nginx access log

[root@filebeatserver filebeat]# tailf /var/log/nginx/access.log
{"accessip_list":"192.168.126.1","client_ip":"192.168.126.1","http_host":"192.168.126.90","@timestamp":"2021-08-15T15:10:23+08:00","method":"GET","url":"/","status":"304","http_referer":"-","body_bytes_sent":"0","request_time":"0.000","http_user_agent":"Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0","total_bytes_sent":"180","server_ip":"192.168.126.90"}

At the same time, verify in filebeat's debug output that the event has been picked up:

2021-08-15T15:10:32.294+0800	DEBUG	[publish]	pipeline/processor.go:308	Publish event: {
  "@timestamp": "2021-08-15T07:10:32.292Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "doc",
    "version": "6.5.4"
  },
  "offset": 2007,
  "message": "{\"accessip_list\":\"192.168.126.1\",\"client_ip\":\"192.168.126.1\",\"http_host\":\"192.168.126.90\",\"@timestamp\":\"2021-08-15T15:10:23+08:00\",\"method\":\"GET\",\"url\":\"/\",\"status\":\"304\",\"http_referer\":\"-\",\"body_bytes_sent\":\"0\",\"request_time\":\"0.000\",\"http_user_agent\":\"Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0\",\"total_bytes_sent\":\"180\",\"server_ip\":\"192.168.126.90\"}",
  "fields": {
    "log_topic": "nginxlogs"
  },
  "prospector": {
    "type": "log"
  },
  "input": {
    "type": "log"
  },
  "beat": {
    "name": "192.168.126.90",
    "hostname": "filebeatserver",
    "version": "6.5.4"
  },
  "host": {
    "name": "192.168.126.90"
  },
  "source": "/var/log/nginx/access.log"
}

Verify that the kafka cluster can consume the messages

[root@192.168.126.91 ~]# /usr/local/kafka/bin/kafka-console-consumer.sh --zookeeper 192.168.126.91:2181,192.168.126.92:2181,192.168.126.93:2181 --topic nginxlogs
{"@timestamp":"2021-08-15T07:10:32.292Z","@metadata":{"beat":"filebeat","type":"doc","version":"6.5.4","topic":"nginxlogs"},"prospector":{"type":"log"},"input":{"type":"log"},"beat":{"name":"192.168.126.90","hostname":"filebeatserver","version":"6.5.4"},"host":{"name":"192.168.126.90"},"source":"/var/log/nginx/access.log","offset":2007,"message":"{\"accessip_list\":\"192.168.126.1\",\"client_ip\":\"192.168.126.1\",\"http_host\":\"192.168.126.90\",\"@timestamp\":\"2021-08-15T15:10:23+08:00\",\"method\":\"GET\",\"url\":\"/\",\"status\":\"304\",\"http_referer\":\"-\",\"body_bytes_sent\":\"0\",\"request_time\":\"0.000\",\"http_user_agent\":\"Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0\",\"total_bytes_sent\":\"180\",\"server_ip\":\"192.168.126.90\"}","fields":{"log_topic":"nginxlogs"}}

Both checks confirm that the log messages are collected correctly.

Configuring logstash

Since the log format is already fully defined on the Nginx side, logstash does not need to do any heavy filtering or parsing. Here is the logstash pipeline file kafka_nginx_into_es.conf:

[root@logstashserver ~]# vim /usr/local/logstash/config/kafka_nginx_into_es.conf
input {
    kafka {
        bootstrap_servers => "192.168.126.91:9092,192.168.126.92:9092,192.168.126.93:9092" 
        topics => "nginxlogs"		#指定輸入源中需要從哪個topic中讀取資料,這裡會自動建立一個名為nginxlogs的topic
        group_id => "logstash"
        codec => json {
           charset => "UTF-8"
        }
        add_field => { "[@metadata][myid]" => "nginxaccess-log" }   #增加一個字段,用于辨別和判斷,在output輸出中會用到。
    }
}

filter {
    if [@metadata][myid] == "nginxaccess-log" {
      mutate {
        gsub => ["message","\\x","\\\x"]  # 這裡的message就是message字段,也就是日志的内容。這個插件的作用是将message字段内容中UTF-8單位元組編碼做替換處理,這是為了應對URL有中文出現的情況。
      }
      if ('method":"HEAD' in [message]) {    # 如果message字段中有HEAD請求,就删除此條資訊。
           drop{}
      }
      json {
            source => "message"
            remove_field => "prospector"
            remove_field => "beat"
            remove_field => "source"
            remove_field => "input"
            remove_field => "offset"
            remove_field => "fields"
            remove_field => "host"
            remove_field => "@version"
            remove_field => "message"
     }
  }
}

output {
    if [@metadata][myid] == "nginxaccess-log" {
        elasticsearch {
            hosts => ["192.168.126.95:9200","192.168.126.96:9200","192.168.126.97:9200"]	   	
            index => "logstash_nginxlogs-%{+YYYY.MM.dd}"   #指定Nginx日志在elasticsearch中索引的名稱,這個名稱會在Kibana中用到。索引的名稱推薦以logstash開頭,後面跟上索引辨別和時間。
        }
    }
}

This logstash pipeline is deliberately simple: it does no special handling of the log format or logic, and is essentially the same as the pipeline used for collecting Apache logs with ELK.
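Before the real start, logstash can validate the file itself; the -t (--config.test_and_exit) flag checks the configuration syntax and exits:

[root@logstashserver ~]# /usr/local/logstash/bin/logstash -f /usr/local/logstash/config/kafka_nginx_into_es.conf -t

If the check passes, start logstash: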

[root@logstashserver ~]# nohup /usr/local/logstash/bin/logstash -f /usr/local/logstash/config/kafka_nginx_into_es.conf &
[1] 1084
nohup: ignoring input and appending output to ‘nohup.out’

Start the es cluster

# Run on each es node (192.168.126.95-97)
su - elasticsearch
/usr/local/elasticsearch/bin/elasticsearch -d
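After the nodes come up, a health check against any one of them confirms the cluster has formed:

curl http://192.168.126.95:9200/_cluster/health?pretty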

Access nginx to generate some log traffic, then check whether the es cluster has created the corresponding index (the index can take a little while to appear).
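The new index can also be checked from the command line; the date suffix will match the day the logs arrived:

curl 'http://192.168.126.95:9200/_cat/indices/logstash_nginxlogs-*?v'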


Configuring Kibana

Filebeat ships the data from nginx to kafka, and logstash pulls it from kafka. Once the data is arriving in elasticsearch correctly, we can configure the index in Kibana.

[root@192.168.126.96 ~]# ifconfig | awk 'NR==2 {print $2}'
192.168.126.96
# Start kibana
[root@192.168.126.96 ~]# nohup /usr/local/kibana/bin/kibana &
[1] 1495
nohup: ignoring input and appending output to ‘nohup.out’

Open http://192.168.126.96:5601 in a browser to access Kibana. First create an index pattern: click the Management menu in Kibana's left-hand navigation, select Index Patterns on the right, then click Create index pattern in the upper left.
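The index pattern to enter is the one matching the indices logstash created above:

logstash_nginxlogs-*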
