Implementing Cluster-Level Logging for Kubernetes with Fluentd and the ElasticSearch Stack

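After a period of exploration, we have in turn completed Kubernetes cluster setup, installation of add-ons such as DNS, Dashboard, and Heapster, cluster security configuration, setup of CephRBD as the Persistent Volume backend, and exploration and implementation of service updates. Now the need for logging at the Kubernetes cluster level has gradually surfaced.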

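As a few small applications were deployed on our Kubernetes cluster, its operation got on track. But a problem followed: how do we find and diagnose problems in the cluster itself and in the applications running in Pods? Logs, of course! We can only rely on the logs emitted by the Kubernetes components and by the applications in Pods. At present, however, we can only view logs with the kubectl logs command or the Kubernetes Dashboard. Without cluster-level logging we have to check each Pod's logs separately, which is tedious and inefficient. We urgently need a cluster-level, centralized log collection and analysis facility for the Kubernetes cluster.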

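Logs are extremely important for any infrastructure or backend service system. Kubernetes, a project inspired by Google's internal container management system Borg, naturally supports logging. In "Logging Overview", the official documentation outlines logging approaches at several levels on Kubernetes and presents a reference architecture for cluster-level logging.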

Kubernetes also provides a reference implementation:

– Logging backend: Elastic Search stack (including Kibana)

– Logging agent: fluentd

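One advantage of cluster-level logging built on the ElasticSearch stack is that it is non-intrusive for the Pods in the cluster: Pods need no accompanying changes. The EFK/ELK approach is also relatively mature and stable in the industry.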

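In this article I will install ElasticSearch, Fluentd, and Kibana on our Kubernetes 1.3.7 cluster. Since 1.3.7 is somewhat old, I honestly did not know whether EFK would run on it. Whether we get a perfect ending like "Resident Evil: The Final Chapter" remains to be seen as we "level up" step by step.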

1. What slipped through the net on the Kubernetes 1.3.7 cluster

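The Kubernetes 1.3.7 cluster was built and initialized with kube-up.sh. Following the introduction to elasticsearch logging in the official K8s documentation, I found the following configuration items in kubernetes/cluster/ubuntu/config-default.sh: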

// kubernetes/cluster/ubuntu/config-default.sh
# Optional: Enable node logging.
ENABLE_NODE_LOGGING=false
LOGGING_DESTINATION=${LOGGING_DESTINATION:-elasticsearch}

# Optional: When set to true, Elasticsearch and Kibana will be setup as part of the cluster bring up.
ENABLE_CLUSTER_LOGGING=false
ELASTICSEARCH_LOGGING_REPLICAS=${ELASTICSEARCH_LOGGING_REPLICAS:-1}
           

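Clearly, had I known what these settings meant when first building the cluster, I would probably have integrated elastic logging into the cluster back then. Now it is too late: many applications are already running on the cluster, so I cannot interrupt it to rerun kube-up.sh and install elastic logging. I have to install it by hand.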
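
For a cluster that has not been brought up yet, the two switches shown above could be flipped before running kube-up.sh. The following is only a sketch under that assumption: the function name `enable_cluster_logging` is hypothetical, the script edits a scratch copy rather than the real config-default.sh, and I have not verified the resulting bring-up on 1.3.7.

```shell
#!/bin/sh
# Sketch: turn on the logging switches in a config-default.sh style file.
# enable_cluster_logging is a hypothetical helper, not part of Kubernetes.
enable_cluster_logging() {
  # $1: path to the config file; the edited content is printed to stdout
  sed -e 's/^ENABLE_NODE_LOGGING=false$/ENABLE_NODE_LOGGING=true/' \
      -e 's/^ENABLE_CLUSTER_LOGGING=false$/ENABLE_CLUSTER_LOGGING=true/' "$1"
}

# Demo on a scratch copy that stands in for
# kubernetes/cluster/ubuntu/config-default.sh
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
ENABLE_NODE_LOGGING=false
LOGGING_DESTINATION=${LOGGING_DESTINATION:-elasticsearch}
ENABLE_CLUSTER_LOGGING=false
ELASTICSEARCH_LOGGING_REPLICAS=${ELASTICSEARCH_LOGGING_REPLICAS:-1}
EOF
enable_cluster_logging "$cfg"
rm -f "$cfg"
```

The two variables are assigned literally in the file rather than through a `${VAR:-default}` fallback, which is why the sketch edits the file instead of exporting environment variables.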

2. Preparing the images

The manifests under kubernetes/cluster/addons/fluentd-elasticsearch in the 1.3.7 source are fairly old, so we use the manifest files from the latest Kubernetes source directly:

k8s.io/kubernetes/cluster/addons/fluentd-elasticsearch$ ls *.yaml
es-controller.yaml es-service.yaml fluentd-es-ds.yaml kibana-controller.yaml kibana-service.yaml
           

Analyzing these yaml files shows that we need three images:

gcr.io/google_containers/fluentd-elasticsearch:1.22
 gcr.io/google_containers/elasticsearch:v2.4.1-1
 gcr.io/google_containers/kibana:v4.6.1-1
           

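All three images are hosted outside the Great Firewall. Because the Docker engine in our production environment has no accelerator (proxy) configured, we have to download them manually. My approach was to pull the three images with the Docker engine on another machine that does have an accelerator configured, retag them, and push them to my account on hub.docker.com. Taking elasticsearch:v2.4.1-1 as an example: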

# docker pull gcr.io/google_containers/elasticsearch:v2.4.1-1
# docker tag gcr.io/google_containers/elasticsearch:v2.4.1-1 bigwhite/elasticsearch:v2.4.1-1
# docker push bigwhite/elasticsearch:v2.4.1-1
           

These are the images we actually use in the rest of the installation:

bigwhite/fluentd-elasticsearch:1.22
bigwhite/elasticsearch:v2.4.1-1
bigwhite/kibana:v4.6.1-1
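
The pull, tag, and push steps are the same for all three images, so they can be generated in a loop. This is a small sketch, not the commands I actually ran: `mirror_cmds` is a hypothetical helper that only prints the docker commands (a dry run), using the source and target registries from this article.

```shell
#!/bin/sh
# Sketch: print the pull/tag/push commands for mirroring the images.
# mirror_cmds is a hypothetical helper; it prints commands, it does not run docker.
SRC=gcr.io/google_containers   # registry the manifests point at
DST=bigwhite                   # Docker Hub account used in this article

mirror_cmds() {
  for image in "$@"; do
    echo "docker pull $SRC/$image"
    echo "docker tag $SRC/$image $DST/$image"
    echo "docker push $DST/$image"
  done
}

mirror_cmds fluentd-elasticsearch:1.22 elasticsearch:v2.4.1-1 kibana:v4.6.1-1
```

On a machine that can reach gcr.io and is logged in to Docker Hub, the printed lines could be piped to `sh` to execute them.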
           

3. Starting fluentd

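fluentd runs on the K8s cluster as a DaemonSet, so k8s ensures that one fluentd Pod is started on every cluster node. (Note: change the image to the address listed above; if you have an accelerator configured, this is unnecessary.)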

# kubectl create -f fluentd-es-ds.yaml --record
daemonset "fluentd-es-v1.22" created
           

Checking how the Pods in the daemonset started, we find:

kube-system fluentd-es-v1.22-as3s5 0/1 CrashLoopBackOff 2 43s 172.16.99.6 10.47.136.60
kube-system fluentd-es-v1.22-qz193 0/1 CrashLoopBackOff 2 43s 172.16.57.7 10.46.181.146
           

The fluentd Pods failed to start. The fluentd log can be viewed at /var/log/fluentd.log:

# tail -100f /var/log/fluentd.log

2017-03-02 02:27:01 +0000 [info]: reading config file path="/etc/td-agent/td-agent.conf"
2017-03-02 02:27:01 +0000 [info]: starting fluentd-0.12.31
2017-03-02 02:27:01 +0000 [info]: gem 'fluent-mixin-config-placeholders' version '0.4.0'
2017-03-02 02:27:01 +0000 [info]: gem 'fluent-mixin-plaintextformatter' version '0.2.6'
2017-03-02 02:27:01 +0000 [info]: gem 'fluent-plugin-docker_metadata_filter' version '0.1.3'
2017-03-02 02:27:01 +0000 [info]: gem 'fluent-plugin-elasticsearch' version '1.5.0'
2017-03-02 02:27:01 +0000 [info]: gem 'fluent-plugin-kafka' version '0.4.1'
2017-03-02 02:27:01 +0000 [info]: gem 'fluent-plugin-kubernetes_metadata_filter' version '0.24.0'
2017-03-02 02:27:01 +0000 [info]: gem 'fluent-plugin-mongo' version '0.7.16'
2017-03-02 02:27:01 +0000 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '1.5.5'
2017-03-02 02:27:01 +0000 [info]: gem 'fluent-plugin-s3' version '0.8.0'
2017-03-02 02:27:01 +0000 [info]: gem 'fluent-plugin-scribe' version '0.10.14'
2017-03-02 02:27:01 +0000 [info]: gem 'fluent-plugin-td' version '0.10.29'
2017-03-02 02:27:01 +0000 [info]: gem 'fluent-plugin-td-monitoring' version '0.2.2'
2017-03-02 02:27:01 +0000 [info]: gem 'fluent-plugin-webhdfs' version '0.4.2'
2017-03-02 02:27:01 +0000 [info]: gem 'fluentd' version '0.12.31'
2017-03-02 02:27:01 +0000 [info]: adding match pattern="fluent.**" type="null"
2017-03-02 02:27:01 +0000 [info]: adding filter pattern="kubernetes.**" type="kubernetes_metadata"
2017-03-02 02:27:02 +0000 [error]: config error file="/etc/td-agent/td-agent.conf" error="Invalid Kubernetes API v1 endpoint https://192.168.3.1:443/api: 401 Unauthorized"
2017-03-02 02:27:02 +0000 [info]: process finished code=256
2017-03-02 02:27:02 +0000 [warn]: process died within 1 second. exit.
           

The error in the log shows that fluentd failed to access the apiserver secure port (443): Unauthorized! After analyzing cluster/addons/fluentd-elasticsearch/fluentd-es-image/build.sh and td-agent.conf, we found that the fluent-plugin-kubernetes_metadata_filter in the fluentd image accesses the API Server to obtain Kubernetes metadata. Without any special configuration, my guess is that fluent-plugin-kubernetes_metadata_filter gets the API Server address from the environment variables Kubernetes passes into the Pod: KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT. But the API Server has authentication enabled on the secure port, so fluentd's direct access is bound to fail.

We found the home page of the fluent-plugin-kubernetes_metadata_filter project on github.com, which lists other options the plugin supports, including ca_file, client_cert, and client_key. These names look very familiar. We need to modify td-agent.conf in the fluentd image and add some options for fluent-plugin-kubernetes_metadata_filter, for example:

// td-agent.conf
... ...
<filter kubernetes.**>
 type kubernetes_metadata
 ca_file /srv/kubernetes/ca.crt
 client_cert /srv/kubernetes/kubecfg.crt
 client_key /srv/kubernetes/kubecfg.key
</filter>
... ...
           

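I did not want to rebuild the image here, so what to do? Kubernetes provides a powerful tool, ConfigMap: we can turn the new td-agent.conf into a Kubernetes configmap resource and mount it at the appropriate location in the fluentd pod to replace the default td-agent.conf in the image.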

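Two points need attention:

* Before creating the configmap from td-agent.conf, delete all comment lines in td-agent.conf; otherwise the content of the generated configmap may be incorrect.

* The fluentd pod is created in kube-system, so the configmap must also be created in the kube-system namespace; otherwise the configmap cannot be found when the pod is created with kubectl create.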
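
Removing the comment lines by hand is error-prone, so the first point can be scripted. A minimal sketch, assuming comments in td-agent.conf are full lines starting with `#`; `strip_comments` is a hypothetical helper and the sample content below is made up for the demo.

```shell
#!/bin/sh
# Sketch: drop full-line comments from a fluentd config file.
# strip_comments is a hypothetical helper name.
strip_comments() {
  # Removes every line whose first non-blank character is '#'.
  grep -v '^[[:space:]]*#' "$1"
}

# Demo on a scratch file (sample content, not the real td-agent.conf)
conf=$(mktemp)
cat > "$conf" <<'EOF'
# Do not collect fluentd's own logs
<match fluent.**>
  type null
</match>
EOF
strip_comments "$conf"
rm -f "$conf"
```

The output can be written to a file named td-agent.conf in a separate directory, so that the configmap key keeps the file name fluentd expects.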

# kubectl create configmap td-agent-config --from-file=./td-agent.conf -n kube-system
configmap "td-agent-config" created

# kubectl get configmaps -n kube-system
NAME DATA AGE
td-agent-config 1 9s

# kubectl get configmaps td-agent-config -n kube-system -o yaml
apiVersion: v1
data:
  td-agent.conf: |
    <match fluent.**>
      type null
    </match>

    <source>
      type tail
      path /var/log/containers/*.log
      pos_file /var/log/es-containers.log.pos
      time_format %Y-%m-%dT%H:%M:%S.%NZ
      tag kubernetes.*
      format json
      read_from_head true
    </source>
... ...
           

fluentd-es-ds.yaml also needs some changes, mainly two additional mounts: one mounts the configmap td-agent-config above, and the other mounts the hostPath /srv/kubernetes to obtain the client-side certificates:

spec:
  containers:
  - name: fluentd-es
    image: bigwhite/fluentd-elasticsearch:1.22
    command:
      - '/bin/sh'
      - '-c'
      - '/usr/sbin/td-agent 2>&1 >> /var/log/fluentd.log'
    resources:
      limits:
        memory: 200Mi
      #requests:
        #cpu: 100m
        #memory: 200Mi
    volumeMounts:
    - name: varlog
      mountPath: /var/log
    - name: varlibdockercontainers
      mountPath: /var/lib/docker/containers
      readOnly: true
    - name: td-agent-config
      mountPath: /etc/td-agent
    - name: tls-files
      mountPath: /srv/kubernetes
  terminationGracePeriodSeconds: 30
  volumes:
  - name: varlog
    hostPath:
      path: /var/log
  - name: varlibdockercontainers
    hostPath:
      path: /var/lib/docker/containers
  - name: td-agent-config
    configMap:
      name: td-agent-config
  - name: tls-files
    hostPath:
      path: /srv/kubernetes
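
Because a hostPath volume is resolved on whichever node the Pod lands on, the three files referenced in td-agent.conf have to be readable on every node that runs fluentd. A quick per-node check might look like the sketch below; `check_tls_files` is a hypothetical helper, and the file names are the ones used in the td-agent.conf example above.

```shell
#!/bin/sh
# Sketch: verify that the certificate files fluentd will read exist on this node.
# check_tls_files is a hypothetical helper name.
check_tls_files() {
  # $1: directory expected to hold ca.crt, kubecfg.crt and kubecfg.key
  dir=$1
  missing=0
  for f in ca.crt kubecfg.crt kubecfg.key; do
    if [ ! -r "$dir/$f" ]; then
      echo "missing: $dir/$f"
      missing=1
    fi
  done
  return $missing
}

# Returns non-zero and lists the gaps if any file is absent or unreadable.
check_tls_files /srv/kubernetes || echo "copy the files to this node or adjust td-agent.conf"
```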
           

Next we recreate the fluentd ds; I will not repeat the steps. This time the creation succeeds:

kube-system fluentd-es-v1.22-adsrx 1/1 Running 0 1s 172.16.99.6 10.47.136.60
kube-system fluentd-es-v1.22-rpme3 1/1 Running 0 1s 172.16.57.7 10.46.181.146
           

But in /var/log/fluentd.log we can still see a "problem":

2017-03-02 03:57:58 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2017-03-02 03:57:59 +0000 error_class="Fluent::ElasticsearchOutput::ConnectionFailure" error="Can not reach Elasticsearch cluster ({:host=>\"elasticsearch-logging\", :port=>9200, :scheme=>\"http\"})!" plugin_id="object:3fd99fa857d8"
 2017-03-02 03:57:58 +0000 [warn]: suppressed same stacktrace
2017-03-02 03:58:00 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2017-03-02 03:58:03 +0000 error_class="Fluent::ElasticsearchOutput::ConnectionFailure" error="Can not reach Elasticsearch cluster ({:host=>\"elasticsearch-logging\", :port=>9200, :scheme=>\"http\"})!" plugin_id="object:3fd99fa857d8"
2017-03-02 03:58:00 +0000 [info]: process finished code=9
2017-03-02 03:58:00 +0000 [error]: fluentd main process died unexpectedly. restarting.
           

This is because ElasticSearch logging has not been created yet, so fluentd cannot connect to elasticsearch.
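
Since most of fluentd.log is [info] noise, it can help to filter for warnings and errors only. A minimal sketch, assuming the log line format shown in this article; `fluentd_problems` is a hypothetical helper, and the sample lines are copied from the first failure log above.

```shell
#!/bin/sh
# Sketch: show only the [warn] and [error] lines of a fluentd log.
# fluentd_problems is a hypothetical helper name.
fluentd_problems() {
  # $1: path to the log file
  grep -E '\[(warn|error)\]:' "$1"
}

# Demo on a scratch file holding three lines from the log shown earlier
log=$(mktemp)
cat > "$log" <<'EOF'
2017-03-02 02:27:01 +0000 [info]: starting fluentd-0.12.31
2017-03-02 02:27:02 +0000 [info]: process finished code=256
2017-03-02 02:27:02 +0000 [warn]: process died within 1 second. exit.
EOF
fluentd_problems "$log"
rm -f "$log"
```

On a node, the same function would be pointed at /var/log/fluentd.log.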

4. Starting elasticsearch

Start elasticsearch:

# kubectl create -f es-controller.yaml
replicationcontroller "elasticsearch-logging-v1" created

# kubectl create -f es-service.yaml
service "elasticsearch-logging" created

Check the pods:

kube-system elasticsearch-logging-v1-3bzt6 1/1 Running 0 7s 172.16.57.8 10.46.181.146
kube-system elasticsearch-logging-v1-nvbe1 1/1 Running 0 7s 172.16.99.10 10.47.136.60
           

Once elasticsearch logging started successfully, the fluentd failure log entries above disappeared!

But is elasticsearch really running OK? Let's check the logs of the elasticsearch Pods:

# kubectl logs -f elasticsearch-logging-v1-3bzt6 -n kube-system
F0302 03:59:41.036697 8 elasticsearch_logging_discovery.go:60] kube-system namespace doesn't exist: the server has asked for the client to provide credentials (get namespaces kube-system)
goroutine 1 [running]:
k8s.io/kubernetes/vendor/github.com/golang/glog.stacks(0x19a8100, 0xc400000000, 0xc2, 0x186)
... ...
main.main()
 elasticsearch_logging_discovery.go:60 +0xb53

[2017-03-02 03:59:42,587][INFO ][node ] [elasticsearch-logging-v1-3bzt6] version[2.4.1], pid[16], build[c67dc32/2016-09-27T18:57:55Z]
[2017-03-02 03:59:42,588][INFO ][node ] [elasticsearch-logging-v1-3bzt6] initializing ...
[2017-03-02 03:59:44,396][INFO ][plugins ] [elasticsearch-logging-v1-3bzt6] modules [reindex, lang-expression, lang-groovy], plugins [], sites []
... ...
[2017-03-02 03:59:44,441][INFO ][env ] [elasticsearch-logging-v1-3bzt6] heap size [1007.3mb], compressed ordinary object pointers [true]
[2017-03-02 03:59:48,355][INFO ][node ] [elasticsearch-logging-v1-3bzt6] initialized
[2017-03-02 03:59:48,355][INFO ][node ] [elasticsearch-logging-v1-3bzt6] starting ...
[2017-03-02 03:59:48,507][INFO ][transport ] [elasticsearch-logging-v1-3bzt6] publish_address {172.16.57.8:9300}, bound_addresses {[::]:9300}
[2017-03-02 03:59:48,547][INFO ][discovery ] [elasticsearch-logging-v1-3bzt6] kubernetes-logging/7_f_M2TKRZWOw4NhBc4EqA
[2017-03-02 04:00:18,552][WARN ][discovery ] [elasticsearch-logging-v1-3bzt6] waited for 30s and no initial state was set by the discovery
[2017-03-02 04:00:18,562][INFO ][http ] [elasticsearch-logging-v1-3bzt6] publish_address {172.16.57.8:9200}, bound_addresses {[::]:9200}
[2017-03-02 04:00:18,562][INFO ][node ] [elasticsearch-logging-v1-3bzt6] started
[2017-03-02 04:01:15,754][WARN ][discovery.zen.ping.unicast] [elasticsearch-logging-v1-3bzt6] failed to send ping to [{#zen_unicast_1#}{127.0.0.1}{127.0.0.1:9300}]
SendRequestTransportException[[][127.0.0.1:9300][internal:discovery/zen/unicast]]; nested: NodeNotConnectedException[[][127.0.0.1:9300] Node not connected];
... ...
Caused by: NodeNotConnectedException[[][127.0.0.1:9300] Node not connected]
 at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:1141)
 at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:830)
 at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:329)
 ... 12 more
           

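To summarize, the log contains two errors:

- The API Server cannot be accessed, which looks the same as fluentd's initial problem;

- The two elasticsearch nodes fail to ping each other.

To find the cause of these two problems we have to go back to the source and analyze how the elasticsearch image is put together.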

From the content of cluster/addons/fluentd-elasticsearch/es-image/run.sh:

/elasticsearch_logging_discovery >> /elasticsearch/config/elasticsearch.yml

chown -R elasticsearch:elasticsearch /data

/bin/su -c /elasticsearch/bin/elasticsearch elasticsearch
           

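we learn that the image actually contains two programs. One is /elasticsearch_logging_discovery, whose output is appended to the configuration file /elasticsearch/config/elasticsearch.yml. That configuration file is then used by the other program, /elasticsearch/bin/elasticsearch.

Let's look at the elasticsearch.yml in the running docker container: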

# docker exec 3cad31f6eb08 cat /elasticsearch/config/elasticsearch.yml
cluster.name: kubernetes-logging

node.name: ${NODE_NAME}
node.master: ${NODE_MASTER}
node.data: ${NODE_DATA}

transport.tcp.port: ${TRANSPORT_PORT}
http.port: ${HTTP_PORT}

path.data: /data

network.host: 0.0.0.0

discovery.zen.minimum_master_nodes: ${MINIMUM_MASTER_NODES}
discovery.zen.ping.multicast.enabled: false
           

This result is missing one item:

discovery.zen.ping.unicast.hosts: ["172.30.0.11", "172.30.192.15"]
           

This is also the cause of the second problem. In short, both elasticsearch logging errors arise because /elasticsearch_logging_discovery cannot access the API Server, so /elasticsearch/config/elasticsearch.yml is not generated correctly. Let's fix that.
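
To make the missing piece concrete, the sketch below formats a list of peer addresses into the line that should have been appended to elasticsearch.yml. It only illustrates the output format: the real /elasticsearch_logging_discovery is a Go program that looks the peers up through the API Server, and `unicast_hosts_line` is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: build the discovery.zen.ping.unicast.hosts line from peer IPs.
# unicast_hosts_line is a hypothetical helper; it does not query Kubernetes.
unicast_hosts_line() {
  list=""
  for ip in "$@"; do
    # Separate entries with ", " once the list is non-empty
    if [ -n "$list" ]; then list="$list, "; fi
    list="$list\"$ip\""
  done
  printf 'discovery.zen.ping.unicast.hosts: [%s]\n' "$list"
}

# Addresses taken from the example line above
unicast_hosts_line 172.30.0.11 172.30.192.15
```

Without this line and with multicast disabled, each node only knows its own loopback address, which matches the failed ping to 127.0.0.1:9300 in the log.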

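I looked at the source code of /elasticsearch_logging_discovery. It is a typical program that uses client-go to access the API Server through a service account. This is clearly the problem I described in the article "Using a Service Account to Access the API Server in a Kubernetes Pod": the default service account does not work.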

Solution: create a new service account resource in the kube-system namespace and explicitly use it in es-controller.yaml.

Create a new serviceaccount in the kube-system namespace:

//serviceaccount.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
 name: k8s-efk

# kubectl create -f serviceaccount.yaml -n kube-system
serviceaccount "k8s-efk" created

# kubectl get serviceaccount -n kube-system
NAME SECRETS AGE
default 1 139d
k8s-efk 1 17s
           

In es-controller.yaml, use the service account "k8s-efk":

//es-controller.yaml
... ...
spec:
  replicas: 2
  selector:
    k8s-app: elasticsearch-logging
    version: v1
  template:
    metadata:
      labels:
        k8s-app: elasticsearch-logging
        version: v1
        kubernetes.io/cluster-service: "true"
    spec:
      serviceAccount: k8s-efk
      containers:
... ...
           

After recreating the elasticsearch logging service, we check the elasticsearch-logging pod log again:

# kubectl logs -f elasticsearch-logging-v1-dklui -n kube-system
[2017-03-02 08:26:46,500][INFO ][node ] [elasticsearch-logging-v1-dklui] version[2.4.1], pid[14], build[c67dc32/2016-09-27T18:57:55Z]
[2017-03-02 08:26:46,504][INFO ][node ] [elasticsearch-logging-v1-dklui] initializing ...
[2017-03-02 08:26:47,984][INFO ][plugins ] [elasticsearch-logging-v1-dklui] modules [reindex, lang-expression, lang-groovy], plugins [], sites []
[2017-03-02 08:26:48,073][INFO ][env ] [elasticsearch-logging-v1-dklui] using [1] data paths, mounts [[/data (/dev/vda1)]], net usable_space [16.9gb], net total_space [39.2gb], spins? [possibly], types [ext4]
[2017-03-02 08:26:48,073][INFO ][env ] [elasticsearch-logging-v1-dklui] heap size [1007.3mb], compressed ordinary object pointers [true]
[2017-03-02 08:26:53,241][INFO ][node ] [elasticsearch-logging-v1-dklui] initialized
[2017-03-02 08:26:53,241][INFO ][node ] [elasticsearch-logging-v1-dklui] starting ...
[2017-03-02 08:26:53,593][INFO ][transport ] [elasticsearch-logging-v1-dklui] publish_address {172.16.57.8:9300}, bound_addresses {[::]:9300}
[2017-03-02 08:26:53,651][INFO ][discovery ] [elasticsearch-logging-v1-dklui] kubernetes-logging/Ky_OuYqMRkm_918aHRtuLg
[2017-03-02 08:26:56,736][INFO ][cluster.service ] [elasticsearch-logging-v1-dklui] new_master {elasticsearch-logging-v1-dklui}{Ky_OuYqMRkm_918aHRtuLg}{172.16.57.8}{172.16.57.8:9300}{master=true}, added {{elasticsearch-logging-v1-vjxm3}{cbzgrfZATyWkHfQYHZhs7Q}{172.16.99.10}{172.16.99.10:9300}{master=true},}, reason: zen-disco-join(elected_as_master, [1] joins received)
[2017-03-02 08:26:56,955][INFO ][http ] [elasticsearch-logging-v1-dklui] publish_address {172.16.57.8:9200}, bound_addresses {[::]:9200}
[2017-03-02 08:26:56,956][INFO ][node ] [elasticsearch-logging-v1-dklui] started
[2017-03-02 08:26:57,157][INFO ][gateway ] [elasticsearch-logging-v1-dklui] recovered [0] indices into cluster_state
[2017-03-02 08:27:05,378][INFO ][cluster.metadata ] [elasticsearch-logging-v1-dklui] [logstash-2017.03.02] creating index, cause [auto(bulk api)], templates [], shards [5]/[1], mappings []
[2017-03-02 08:27:06,360][INFO ][cluster.metadata ] [elasticsearch-logging-v1-dklui] [logstash-2017.03.01] creating index, cause [auto(bulk api)], templates [], shards [5]/[1], mappings []
[2017-03-02 08:27:07,163][INFO ][cluster.routing.allocation] [elasticsearch-logging-v1-dklui] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[logstash-2017.03.01][3], [logstash-2017.03.01][3]] ...]).
[2017-03-02 08:27:07,354][INFO ][cluster.metadata ] [elasticsearch-logging-v1-dklui] [logstash-2017.03.02] create_mapping [fluentd]
[2017-03-02 08:27:07,988][INFO ][cluster.metadata ] [elasticsearch-logging-v1-dklui] [logstash-2017.03.01] create_mapping [fluentd]
[2017-03-02 08:27:09,578][INFO ][cluster.routing.allocation] [elasticsearch-logging-v1-dklui] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[logstash-2017.03.02][4]] ...]).

           

elasticsearch logging is up and running OK!

5. Starting kibana

With the lesson learned from elasticsearch logging, this time we also explicitly assign the newly created serviceaccount k8s-efk in kibana-controller.yaml:

//kibana-controller.yaml
... ...
spec:
  serviceAccount: k8s-efk
  containers:
  - name: kibana-logging
    image: bigwhite/kibana:v4.6.1-1
    resources:
      # keep request = limit to keep this container in guaranteed class
      limits:
        cpu: 100m
      #requests:
      #  cpu: 100m
    env:
      - name: "ELASTICSEARCH_URL"
        value: "http://elasticsearch-logging:9200"
      - name: "KIBANA_BASE_URL"
        value: "/api/v1/proxy/namespaces/kube-system/services/kibana-logging"
    ports:
    - containerPort: 5601
      name: ui
      protocol: TCP
... ...
           

Start kibana and watch the pod log:

# kubectl create -f kibana-controller.yaml
# kubectl create -f kibana-service.yaml
# kubectl logs -f kibana-logging-3604961973-jby53 -n kube-system
ELASTICSEARCH_URL=http://elasticsearch-logging:9200
server.basePath: /api/v1/proxy/namespaces/kube-system/services/kibana-logging
{"type":"log","@timestamp":"2017-03-02T08:30:15Z","tags":["info","optimize"],"pid":6,"message":"Optimizing and caching bundles for kibana and statusPage. This may take a few minutes"}
           

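Kibana's bundle caching really does take a while, so be patient. In this run it took almost ten minutes (588.60 seconds). Afterwards you will see the following log: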

# kubectl logs -f kibana-logging-3604961973-jby53 -n kube-system
ELASTICSEARCH_URL=http://elasticsearch-logging:9200
server.basePath: /api/v1/proxy/namespaces/kube-system/services/kibana-logging
{"type":"log","@timestamp":"2017-03-02T08:30:15Z","tags":["info","optimize"],"pid":6,"message":"Optimizing and caching bundles for kibana and statusPage. This may take a few minutes"}
{"type":"log","@timestamp":"2017-03-02T08:40:04Z","tags":["info","optimize"],"pid":6,"message":"Optimization of bundles for kibana and statusPage complete in 588.60 seconds"}
{"type":"log","@timestamp":"2017-03-02T08:40:04Z","tags":["status","plugin:[email protected]","info"],"pid":6,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2017-03-02T08:40:05Z","tags":["status","plugin:[email protected]","info"],"pid":6,"state":"yellow","message":"Status changed from uninitialized to yellow - Waiting for Elasticsearch","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2017-03-02T08:40:05Z","tags":["status","plugin:[email protected]","info"],"pid":6,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2017-03-02T08:40:05Z","tags":["status","plugin:[email protected]","info"],"pid":6,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2017-03-02T08:40:05Z","tags":["status","plugin:[email protected]","info"],"pid":6,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2017-03-02T08:40:06Z","tags":["status","plugin:[email protected]","info"],"pid":6,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2017-03-02T08:40:06Z","tags":["status","plugin:[email protected]","info"],"pid":6,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2017-03-02T08:40:06Z","tags":["status","plugin:[email protected]","info"],"pid":6,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2017-03-02T08:40:06Z","tags":["listening","info"],"pid":6,"message":"Server running at http://0.0.0.0:5601"}
{"type":"log","@timestamp":"2017-03-02T08:40:11Z","tags":["status","plugin:[email protected]","info"],"pid":6,"state":"yellow","message":"Status changed from yellow to yellow - No existing Kibana index found","prevState":"yellow","prevMsg":"Waiting for Elasticsearch"}
{"type":"log","@timestamp":"2017-03-02T08:40:14Z","tags":["status","plugin:[email protected]","info"],"pid":6,"state":"green","message":"Status changed from yellow to green - Kibana index ready","prevState":"yellow","prevMsg":"No existing Kibana index found"}
           

Next, open the following address in a browser to reach the kibana web page. Note: the Kibana web page also takes a while to load.

https://{API Server external IP}:{API Server secure port}/api/v1/proxy/namespaces/kube-system/services/kibana-logging/app/kibana#/settings/indices/
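
The address is simply the API Server's proxy path for the kibana-logging service, which is also the value given to KIBANA_BASE_URL in kibana-controller.yaml, followed by the Kibana page. A small sketch that assembles it; `kibana_url` is a hypothetical helper and 10.0.0.1:443 is a placeholder, not the address of my cluster.

```shell
#!/bin/sh
# Sketch: assemble the Kibana URL served through the API Server proxy.
# kibana_url is a hypothetical helper name.
BASE_PATH=/api/v1/proxy/namespaces/kube-system/services/kibana-logging

kibana_url() {
  # $1: API Server external IP or host name, $2: API Server secure port
  printf 'https://%s:%s%s/app/kibana#/settings/indices/\n' "$1" "$2" "$BASE_PATH"
}

# Placeholder address for illustration only
kibana_url 10.0.0.1 443
```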
           

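The address above opens the page for creating an index (roughly the equivalent of a database in MySQL).

Uncheck "Index contains time-based events" and click "Create" to create an index.

Click "Setting" -> "Status" on the page to view the overall state of elasticsearch logging and confirm that everything is OK.

After the index is created, the logs aggregated in ElasticSearch logging can be seen under Discover.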

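6. Summary

That is the process of installing Fluentd and the ElasticSearch stack on a Kubernetes 1.3.7 cluster to implement Kubernetes cluster-level logging. In a Kubernetes 1.5.1 environment installed with kubeadm, you would largely not run into the problems above.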

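Also, the volume that ElasticSearch logging mounts by default is an emptyDir, which is fine for experiments. For a production deployment it must be replaced with a Persistent Volume, for example CephRBD.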

This article is reposted from Juejin: Implementing Cluster-Level Logging for Kubernetes with Fluentd and the ElasticSearch Stack
