
Deploying a Fluentd Cluster on Kubernetes to Consume Kafka Messages into Elasticsearch

Table of Contents

    • 1. Build a CentOS Image with Fluentd Installed
      • 1.1. install-fluentd.sh
      • 1.2. fluentd-dockerfile
      • 1.3. td-agent-dockerfile
    • 2. K8S Deployment Process
      • 2.1. Create the Configuration Files as ConfigMaps
      • 2.2. Create the Deployment Workload
      • 2.3. Start the Cluster
    • 3. Test Results

You can refer to my translation of the relevant official documentation: the official Fluentd 1.0 docs. It describes the configuration of the plugins used here in detail.

1. Build a CentOS Image with Fluentd Installed

I originally planned to also install the fluent-plugin-kafka and fluent-plugin-elasticsearch plugins, but it turns out td-agent already bundles both of them. The official site only mentions that td-agent bundles fluent-plugin-elasticsearch, so the documentation is misleading on this point.
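If you want to double-check which plugins a particular td-agent build bundles, a quick sanity check (a sketch, assuming td-agent is already installed in the image or on the host) is to list its gems:

# list the Fluentd plugins shipped with the td-agent package
td-agent-gem list | grep fluent-plugin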

Files involved in this step:

[[email protected] ~]$ cd /home/test/fluentd-image
[[email protected] fluentd-image]$ ll
total 12
-rw-rw-r-- 1 test test  73 Jul 27 18:23 fluentd-dockerfile
-rw-rw-r-- 1 test test  846 Jul 27 18:30 install-fluentd.sh
-rw-rw-r-- 1 test test  81 Jul 29 15:32 td-agent-dockerfile
           

1.1. install-fluentd.sh

Below is the content of install-fluentd.sh. The only change from the official script is that the sudo parts have been removed, because everything inside the image runs directly as root. The script essentially performs the yum installation of td-agent.

echo "=============================="
echo " td-agent Installation Script "
echo "=============================="
echo "This script requires superuser access to install rpm packages."
echo "You will be prompted for your password by sudo."

# clear any previous sudo permission
# sudo -k

# run inside sudo
sh <<SCRIPT

  # add GPG key
  rpm --import https://packages.treasuredata.com/GPG-KEY-td-agent

  # add treasure data repository to yum
  cat >/etc/yum.repos.d/td.repo <<'EOF';
[treasuredata]
name=TreasureData
baseurl=http://packages.treasuredata.com/4/redhat/\$releasever/\$basearch
gpgcheck=1
gpgkey=https://packages.treasuredata.com/GPG-KEY-td-agent
EOF

  # update your sources
  yum check-update

  # install the toolbelt
  yes | yum install -y td-agent

SCRIPT

# message
echo ""
echo "Installation completed. Happy Logging!"
echo ""
           

1.2. fluentd-dockerfile

Below is the content of fluentd-dockerfile. It starts from the CentOS base image, adds the install-fluentd.sh script, and runs it.

FROM centos:latest

ADD install-fluentd.sh /
RUN sh /install-fluentd.sh
           

Then generate an image fluentd:0.0.1 with the build command below. The fluentd:0.0.1 image is essentially a CentOS with td-agent installed and can serve as a base image.
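A minimal sketch of the build command, assuming the Dockerfile above is saved as fluentd-dockerfile in the current directory:

docker build -f fluentd-dockerfile -t fluentd:0.0.1 .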

1.3. td-agent-dockerfile

Below is the content of td-agent-dockerfile. This Dockerfile only specifies the configuration file to use at startup, here the default configuration file. It exists mainly for the deployment on K8S.

FROM fluentd:0.0.1
ENTRYPOINT /usr/sbin/td-agent -c /etc/td-agent/td-agent.conf
           

Then generate an image td-agent:1.0.1 with the build command below:
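Again a sketch, assuming the file above is saved as td-agent-dockerfile in the current directory:

docker build -f td-agent-dockerfile -t td-agent:1.0.1 .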

然後将該鏡像推送到我們的鏡像倉庫 Harbor:

docker tag td-agent:1.0.1 192.168.50.28:30002/gtcom/td-agent:1.0.1
docker push 192.168.50.28:30002/gtcom/td-agent:1.0.1
           

Next we can move on to the K8S configuration.

2. K8S Deployment Process

2.1. Create the Configuration Files as ConfigMaps


First, under the configuration center's config dictionaries in the gtcom-logging namespace, create a fluentd-config configuration file (done through the UI; this creates a ConfigMap):

---
apiVersion: v1
data:
  td-agent.conf: |-
    <source>
      @type kafka_group
      brokers 192.168.50.10:9092,192.168.50.11:9092,192.168.50.12:9092
      consumer_group fluentd-k8s
      format json
      topics fluent-bit-k8s
    </source>

    <match **>
      @type elasticsearch
      hosts 192.168.50.16:9200,192.168.50.17:9200,192.168.50.18:9200,192.168.50.19:9200
      user test
      password 12345
      logstash_format true
      time_key timestamp
      time_key_exclude_timestamp true
      utc_index false
    </match>
kind: ConfigMap
metadata:
  creationTimestamp: '2021-07-29T06:50:37Z'
  labels:
    k8s-app: fluentd
  managedFields:
    - apiVersion: v1
      fieldsType: FieldsV1
      fieldsV1:
        'f:data':
          .: {}
          'f:td-agent.conf': {}
        'f:metadata':
          'f:labels':
            .: {}
            'f:k8s-app': {}
      manager: Mozilla
      operation: Update
      time: '2021-07-29T07:35:09Z'
  name: fluentd-config
  namespace: gtcom-logging
  resourceVersion: '6700711'
  uid: daa57430-6e9d-497b-a582-1451f1fc1812
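If you prefer the command line to the UI, a roughly equivalent way to create this ConfigMap (a sketch, assuming the <source>/<match> configuration above has been saved locally as td-agent.conf) is:

# create the ConfigMap from the local configuration file
kubectl -n gtcom-logging create configmap fluentd-config --from-file=td-agent.conf=./td-agent.conf

# add the same label the UI applied
kubectl -n gtcom-logging label configmap fluentd-config k8s-app=fluentd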
           

To fix the time issue, we also need to create a timezone configuration file (again through the UI, creating a ConfigMap):

---
apiVersion: v1
data:
  TZ: Asia/Shanghai
kind: ConfigMap
metadata:
  creationTimestamp: '2021-07-31T02:10:49Z'
  managedFields:
    - apiVersion: v1
      fieldsType: FieldsV1
      fieldsV1:
        'f:data':
          .: {}
          'f:TZ': {}
      manager: Mozilla
      operation: Update
      time: '2021-07-31T02:10:49Z'
  name: timezone
  namespace: gtcom-logging
  resourceVersion: '6616359'
  uid: 9a2dc843-d199-436c-b579-79e23636e5a0
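The same ConfigMap can also be created in one step from the CLI, for example:

kubectl -n gtcom-logging create configmap timezone --from-literal=TZ=Asia/Shanghai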
           

2.2. Create the Deployment Workload


Next, under Workloads in the application, create a deployment named fluentd-logging (also done through the UI; this creates a Deployment):

---
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: '9'
  creationTimestamp: '2021-07-29T06:59:37Z'
  generation: 14
  labels:
    k8s-app: fluentd
    k8s.kuboard.cn/layer: cloud
    k8s.kuboard.cn/name: fluentd-logging
  managedFields:
    - apiVersion: apps/v1
      fieldsType: FieldsV1
      fieldsV1:
        'f:metadata':
          'f:labels':
            .: {}
            'f:k8s-app': {}
            'f:k8s.kuboard.cn/layer': {}
            'f:k8s.kuboard.cn/name': {}
        'f:spec':
          'f:progressDeadlineSeconds': {}
          'f:replicas': {}
          'f:revisionHistoryLimit': {}
          'f:selector': {}
          'f:strategy':
            'f:rollingUpdate':
              .: {}
              'f:maxSurge': {}
              'f:maxUnavailable': {}
            'f:type': {}
          'f:template':
            'f:metadata':
              'f:annotations':
                .: {}
                'f:kubectl.kubernetes.io/restartedAt': {}
              'f:labels':
                .: {}
                'f:k8s-app': {}
                'f:k8s.kuboard.cn/layer': {}
                'f:k8s.kuboard.cn/name': {}
            'f:spec':
              'f:containers':
                'k:{"name":"fluentd-logging"}':
                  .: {}
                  'f:envFrom': {}
                  'f:image': {}
                  'f:imagePullPolicy': {}
                  'f:name': {}
                  'f:resources':
                    .: {}
                    'f:limits':
                      .: {}
                      'f:memory': {}
                    'f:requests':
                      .: {}
                      'f:memory': {}
                  'f:terminationMessagePath': {}
                  'f:terminationMessagePolicy': {}
                  'f:volumeMounts':
                    .: {}
                    'k:{"mountPath":"/etc/localtime"}':
                      .: {}
                      'f:mountPath': {}
                      'f:name': {}
                      'f:readOnly': {}
                    'k:{"mountPath":"/etc/td-agent/"}':
                      .: {}
                      'f:mountPath': {}
                      'f:name': {}
              'f:dnsPolicy': {}
              'f:restartPolicy': {}
              'f:schedulerName': {}
              'f:securityContext': {}
              'f:terminationGracePeriodSeconds': {}
              'f:volumes':
                .: {}
                'k:{"name":"fluentd-config"}':
                  .: {}
                  'f:configMap':
                    .: {}
                    'f:defaultMode': {}
                    'f:name': {}
                  'f:name': {}
                'k:{"name":"volume-sa44e"}':
                  .: {}
                  'f:hostPath':
                    .: {}
                    'f:path': {}
                    'f:type': {}
                  'f:name': {}
      manager: Mozilla
      operation: Update
      time: '2021-07-31T02:11:56Z'
    - apiVersion: apps/v1
      fieldsType: FieldsV1
      fieldsV1:
        'f:metadata':
          'f:annotations':
            .: {}
            'f:deployment.kubernetes.io/revision': {}
        'f:status':
          'f:availableReplicas': {}
          'f:conditions':
            .: {}
            'k:{"type":"Available"}':
              .: {}
              'f:lastTransitionTime': {}
              'f:lastUpdateTime': {}
              'f:message': {}
              'f:reason': {}
              'f:status': {}
              'f:type': {}
            'k:{"type":"Progressing"}':
              .: {}
              'f:lastTransitionTime': {}
              'f:lastUpdateTime': {}
              'f:message': {}
              'f:reason': {}
              'f:status': {}
              'f:type': {}
          'f:observedGeneration': {}
          'f:readyReplicas': {}
          'f:replicas': {}
          'f:updatedReplicas': {}
      manager: kube-controller-manager
      operation: Update
      time: '2021-08-06T06:39:50Z'
  name: fluentd-logging
  namespace: gtcom-logging
  resourceVersion: '8295654'
  uid: 57370963-234f-4e32-8c41-3f623dd085b0
spec:
  progressDeadlineSeconds: 600
  replicas: 3
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: fluentd
      k8s.kuboard.cn/layer: cloud
      k8s.kuboard.cn/name: fluentd-logging
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      annotations:
        kubectl.kubernetes.io/restartedAt: '2021-07-29T15:39:05+08:00'
      creationTimestamp: null
      labels:
        k8s-app: fluentd
        k8s.kuboard.cn/layer: cloud
        k8s.kuboard.cn/name: fluentd-logging
    spec:
      containers:
        - envFrom:
            - configMapRef:
                name: timezone
          image: '192.168.50.28:30002/gtcom/td-agent:1.0.1'
          imagePullPolicy: IfNotPresent
          name: fluentd-logging
          resources:
            limits:
              memory: 1Gi
            requests:
              memory: 1Gi
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
            - mountPath: /etc/td-agent/
              name: fluentd-config
            - mountPath: /etc/localtime
              name: volume-sa44e
              readOnly: true
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
        - configMap:
            defaultMode: 420
            name: fluentd-config
          name: fluentd-config
        - hostPath:
            path: /etc/localtime
            type: File
          name: volume-sa44e
status:
  availableReplicas: 3
  conditions:
    - lastTransitionTime: '2021-07-29T06:59:37Z'
      lastUpdateTime: '2021-07-31T02:12:01Z'
      message: ReplicaSet "fluentd-logging-76bcfc8db4" has successfully progressed.
      reason: NewReplicaSetAvailable
      status: 'True'
      type: Progressing
    - lastTransitionTime: '2021-08-06T06:39:50Z'
      lastUpdateTime: '2021-08-06T06:39:50Z'
      message: Deployment has minimum availability.
      reason: MinimumReplicasAvailable
      status: 'True'
      type: Available
  observedGeneration: 14
  readyReplicas: 3
  replicas: 3
  updatedReplicas: 3
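For reference, if you strip the read-only metadata and status fields and save the spec above to a file, the Deployment can also be created or updated from the CLI (the file name here is just an assumption):

kubectl -n gtcom-logging apply -f fluentd-logging-deployment.yaml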
           

2.3. Start the Cluster

Once started, the running pods show up in the application's pod group. I started three here, which effectively deploys a three-instance Fluentd service cluster.
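The same view is available from the CLI by filtering on the k8s-app=fluentd label used in the Deployment, for example:

kubectl -n gtcom-logging get pods -l k8s-app=fluentd -o wide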


Later, when tuning the Fluentd configuration, all you need to do is edit the fluentd settings in the ConfigMap and restart the workload, or simply delete the corresponding pods (as long as the workload still exists, deleted pods are recreated automatically).
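Both options map to simple kubectl commands, roughly:

# restart the whole workload after editing the ConfigMap
kubectl -n gtcom-logging rollout restart deployment/fluentd-logging

# or delete the pods and let the Deployment recreate them
kubectl -n gtcom-logging delete pod -l k8s-app=fluentd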

3. Test Results

The index name format is logstash-yyyy.MM.dd. Query the log data ingested into ES from Kibana:

GET logstash-2021.08.11/_search
           

The hits content of the returned data is shown below:

{
    "_index":"logstash-2021.08.11",
    "_type":"_doc",
    "_id":"OzBLMnsBbXDyscXDUuJP",
    "_score":null,
    "_source":{
        "timestamp":1628636399.053576,
        "log_time":"2021-08-11 06:59:59,053",
        "thread":"pool-6-thread-2",
        "level":"DEBUG",
        "class":"com.gtcom.governance.sink.impl.backup.DataBackupSink",
        "message":"正在處理線程[1]隊列中資料,立即?[true]",
        "log_flie_path":"/var/log/containers/gtcom-governance-backup-wechat-by-taskmanager-1-1_gtcom-governance_flink-task-manager-4c211dba58b5d0d92085dd5f6d7597cb3755a2c1e33734975194c229fc61d38d.log",
        "stream":"stdout",
        "time":"2021-08-10T22:59:59.053576295Z",
        "kubernetes":{
            "pod_name":"gtcom-governance-backup-wechat-by-taskmanager-1-1",
            "namespace_name":"gtcom-governance",
            "pod_id":"f94665c0-fe96-4741-9889-5ad7d0156e08",
            "host":"gov-node37",
            "container_name":"flink-task-manager",
            "docker_id":"4c211dba58b5d0d92085dd5f6d7597cb3755a2c1e33734975194c229fc61d38d",
            "container_hash":"192.168.50.28:30002/gtcom/[email protected]:8ba00b84c67e105b8e5102ecdf977805feeef676d831c78be437fa922cc27b16",
            "container_image":"192.168.50.28:30002/gtcom/gtcom-governance:2.3.57"
        },
        "source":"k8s-cluster-gtcom"
    }
}
           
{
    "_index":"logstash-2021.08.11",
    "_type":"_doc",
    "_id":"Hwf9MnsBhJL2ix-ygWvF",
    "_score":null,
    "_source":{
        "timestamp":1628648076.021926,
        "log_time":"2021-08-11 10:14:36,021",
        "thread":"Legacy Source Thread - Source: 消費kafka -> String轉換為JSONObject -> Sink: 備份wechat-by (7/9)#0",
        "level":"DEBUG",
        "class":"com.gtcom.governance.sink.impl.backup.DataBackupSink",
        "message":"Key[20210811]有[1]條資料待寫入",
        "log_flie_path":"/var/log/containers/gtcom-governance-backup-wechat-by-taskmanager-1-1_gtcom-governance_flink-task-manager-4c211dba58b5d0d92085dd5f6d7597cb3755a2c1e33734975194c229fc61d38d.log",
        "stream":"stdout",
        "time":"2021-08-11T02:14:36.021926086Z",
        "kubernetes":{
            "pod_name":"gtcom-governance-backup-wechat-by-taskmanager-1-1",
            "namespace_name":"gtcom-governance",
            "pod_id":"f94665c0-fe96-4741-9889-5ad7d0156e08",
            "host":"gov-node37",
            "container_name":"flink-task-manager",
            "docker_id":"4c211dba58b5d0d92085dd5f6d7597cb3755a2c1e33734975194c229fc61d38d",
            "container_hash":"192.168.50.28:30002/gtcom/[email protected]:8ba00b84c67e105b8e5102ecdf977805feeef676d831c78be437fa922cc27b16",
            "container_image":"192.168.50.28:30002/gtcom/gtcom-governance:2.3.57"
        },
        "source":"k8s-cluster-gtcom"
    }
}
           

You can see that the log timestamps and the date suffix of the index are correct, and the earlier eight-hour time-difference problem has been resolved by the timezone configuration file set up above.