Preface
Metrics, traces, and logs are the three pillars of observability. Logs mainly record traces of code execution, making it easier to locate and troubleshoot problems. Today, mainstream applications run as containers in Kubernetes clusters, and because containers are dynamic, they may be frequently created and destroyed. Log collection and persistence therefore become especially important, ensuring that runtime information is still accessible after a container's lifecycle ends. The following sections describe how to use Guance Cloud to collect Kubernetes container logs, and to parse, query, visualize, analyze, and back up the collected logs.
Integration approach
Deploy the DataKit collector
Collecting Kubernetes container logs requires deploying DataKit first.
Log in to the Guance Cloud console, click Integrations -> DataKit -> Kubernetes, download datakit.yaml, and copy the token shown in step 3.
Edit datakit.yaml and paste the token after "token=" in the value of the ENV_DATAWAY environment variable. Set the value of the environment variable ENV_CLUSTER_NAME_K8S and add the environment variable ENV_NAMESPACE; both values generally correspond to the cluster name, and the cluster name should be unique within a workspace.
- name: ENV_NAMESPACE
value: k8s-prod
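For reference, the relevant environment variables in datakit.yaml might look like the following after editing (the DataWay address, token, and cluster name below are illustrative placeholders):

```yaml
        env:
        - name: ENV_DATAWAY
          value: https://openway.guance.com?token=tkn_xxxxxxxxxxxx   # paste your token here
        - name: ENV_CLUSTER_NAME_K8S
          value: k8s-prod          # cluster name, unique within the workspace
        - name: ENV_NAMESPACE
          value: k8s-prod          # generally the same as the cluster name
```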
Upload datakit.yaml to a host that can connect to the Kubernetes cluster and run the following command:
kubectl apply -f datakit.yaml
kubectl get pod -n datakit
When the status shows "Running", DataKit has been deployed successfully.
Collect logs from the console
By default, DataKit collects the logs (stdout/stderr) that every container writes to the console, the same output you can view with kubectl logs. Log in to the Guance Cloud console and click Logs -> Viewer to see the collected logs; the data source defaults to the container name. A custom data source is used in the next collection method.
DataKit also provides self-monitoring, so you can watch collection status in real time. DataKit is deployed in the datakit namespace by default; run kubectl exec to enter the DataKit container:
kubectl exec -it datakit-6rjjp -n datakit bash
Then run datakit monitor; the lines beginning with logging/ in the lower right corner show the real-time statistics for container log collection.
The default collection method is not very flexible, so we recommend a more targeted approach: turn off the default collection of all console logs, then add annotations to the deployment files of the workloads whose logs you do want to collect. The annotations specify whether to collect logs, change the data source name, and tag the logs.
Add the following environment variable to datakit.yaml, which tells DataKit not to collect any console logs:
- name: ENV_INPUT_CONTAINER_CONTAINER_EXCLUDE_LOG
value: image:*
Then add an annotation to the application's Deployment YAML file:
annotations:
datakit/logs: |
[
{
"disable" : false,
"source": "log_stdout_demo",
"tags": {
"region": "hangzhou"
}
}
]
Field description:
- disable: whether to disable log collection for this container; the default is false.
- source: the log source; optional.
- tags: key/value pairs that add extra tags; optional.
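For context, a minimal Deployment excerpt showing where the annotation lives might look like the sketch below (the application name and image are illustrative). Note that the annotation goes on the Pod template's metadata, not the Deployment's own metadata:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
      annotations:
        datakit/logs: |
          [
            {
              "disable": false,
              "source": "log_stdout_demo",
              "tags": {
                "region": "hangzhou"
              }
            }
          ]
    spec:
      containers:
      - name: demo-app
        image: demo-app:latest
```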
Log in to the Guance Cloud console and click Logs -> Viewer to view the collected logs.
Collect log files in containers
Collecting log files inside a container is likewise achieved by adding annotations:
annotations:
datakit/logs: |
[
{
"disable": false,
"type": "file",
"path":"/data/app/logs/log.log",
"source": "log_file_demo",
"tags": {
"region": "beijing"
}
}
]
Field description:
- disable: whether to disable log collection for this container; the default is false.
- type: empty by default, which collects stdout/stderr; to collect files it must be set to file.
- path: the path of the file to collect. To collect files inside a container, you must fill in the volume path: not the file's path inside the container, but the path accessible outside the container.
- source: the log source; optional.
- tags: key/value pairs that add extra tags; optional.
Note: You need to mount the log directory, here /data/app/logs, on an emptyDir volume.
volumeMounts:
- mountPath: /data/app/logs
name: varlog
......
volumes:
- name: varlog
emptyDir: {}
If the log path contains wildcards, the mounted directory must be above the wildcard portion: for example, if the log file is /tmp/opt/**/*.log, mount /tmp or /tmp/opt.
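Putting the pieces together, a pod template for in-container file collection might look like the following sketch (the application name and image are illustrative):

```yaml
  template:
    metadata:
      annotations:
        datakit/logs: |
          [
            {
              "disable": false,
              "type": "file",
              "path": "/data/app/logs/log.log",
              "source": "log_file_demo",
              "tags": { "region": "beijing" }
            }
          ]
    spec:
      containers:
      - name: demo-app
        image: demo-app:latest
        volumeMounts:
        - mountPath: /data/app/logs   # the application writes its log file here
          name: varlog
      volumes:
      - name: varlog
        emptyDir: {}
```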
Log in to the Guance Cloud console and click Logs -> Viewer to see the collected logs; you can also use the custom tags to retrieve them.
Log parsing
To quickly filter and correlate specific content in logs, you need to use a Pipeline to structure them, for example by extracting the trace_id and the log status.
The following is a business log line and the corresponding Pipeline:
2024-04-11 11:10:17.921 [http-nio-9201-exec-9] INFO c.r.s.c.SysRoleController - [list,48] - ry-system-dd 2350624413051873476 1032190468283316 - 查询角色列表开始
grok(_, "%{TIMESTAMP_ISO8601:time} %{NOTSPACE:thread_name} %{LOGLEVEL:status}%{SPACE}%{NOTSPACE:class_name} - \\[%{NOTSPACE:method_name},%{NUMBER:line}\\] - %{DATA:service} %{DATA:trace_id} %{DATA:span_id} - %{GREEDYDATA:msg}")
default_time(time, "Asia/Shanghai")
The Pipeline parses out fields such as trace_id, span_id, and service, which makes subsequent filtering and correlation analysis straightforward.
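As a sanity check outside of DataKit, the grok pattern above can be approximated with a plain regular expression. The sketch below is a Python approximation, not DataKit's actual grok engine, and the message text in the sample is an English placeholder for the original Chinese:

```python
import re

# Rough regex equivalent of the grok pattern used in the Pipeline above.
LOG_PATTERN = re.compile(
    r"(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<thread_name>\S+) "
    r"(?P<status>[A-Z]+)\s+"
    r"(?P<class_name>\S+) - "
    r"\[(?P<method_name>[^,\]]+),(?P<line>\d+)\] - "
    r"(?P<service>\S+) (?P<trace_id>\S+) (?P<span_id>\S+) - "
    r"(?P<msg>.*)"
)

sample = (
    "2024-04-11 11:10:17.921 [http-nio-9201-exec-9] INFO "
    "c.r.s.c.SysRoleController - [list,48] - ry-system-dd "
    "2350624413051873476 1032190468283316 - query role list start"
)

m = LOG_PATTERN.match(sample)
fields = m.groupdict()
print(fields["status"], fields["service"], fields["trace_id"])
# -> INFO ry-system-dd 2350624413051873476
```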
Log queries
Guance Cloud lets you query and analyze log data through several operations.
Text search
The Log Viewer supports keyword and wildcard queries: * matches zero or more arbitrary characters, and ? matches exactly one arbitrary character. To combine multiple terms into a complex query, join them with the Boolean operators AND, OR, and NOT.
A term can be a single word or a phrase. For example:
- Single word: guance;
- Multiple words: guance test; (equivalent to guance AND test)
- Phrase: "guance test"; (use double quotes to turn a group of words into a phrase)
Example of a search query:
JSON search
The viewer natively supports precise retrieval of JSON-formatted message content. The search format is @key:value; for multi-level JSON, chain keys with ".", i.e. @key1.key2:value, as shown in the figure:
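To summarize the syntax, some example queries are sketched below (the field names in the JSON examples are hypothetical):

```
guance                      # single word
guance test                 # equivalent to: guance AND test
"guance test"               # exact phrase
error AND NOT timeout       # Boolean combination
app*                        # wildcard: matches app, apps, application, ...
@status:error               # JSON field search on message content
@user.name:alice            # nested JSON field (multi-level key)
```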
Log visualization and analysis
Scenes
Guance Cloud has a variety of built-in monitoring view templates, which can be imported to create dashboards and viewers and then edited and configured; alternatively, you can build data-insight scenes from scratch through a series of settings. For example, to count the number of info and error logs by the status field parsed above, create a visual dashboard with the following steps.
Step 1: In Scenes, create a blank dashboard and select the type of chart you want.
Step 2: Select the log data source, set the filter conditions and grouping, and click Create.
Powerful correlation
1. View configuration jump link
Guance Cloud provides a link function that lets you jump from a view to a dashboard or viewer, enabling linked data analysis and comprehensive system observability.
- On the View Settings page, configure the link address.
- Then click the data in the view to jump to the corresponding log viewer and quickly realize the linkage analysis between the view and the viewer.
2. Bind the built-in view
Guance Cloud also supports saving views as built-in views and binding them to viewers, making it easier to view log data and analyze it from other dimensions.
When you view log details, you can see the built-in views bound as described above and also bind views from other dimensions, such as the metrics view of hosts.
Log alerts
Guance Cloud provides out-of-the-box monitoring templates for creating monitors and also supports custom monitors. Detection rules and trigger conditions can be set through more than ten rule types, such as threshold detection, log detection, mutation detection, and range detection. After you enable a monitor, you receive alerts for the abnormal events its detection rules trigger.
Log detection monitors all log data generated by log collectors in a workspace. It supports keyword-based alerting to promptly detect patterns that deviate from expected behavior (such as abnormal tags in log text), and is most suitable for detecting code exceptions or task-scheduling problems in IT monitoring scenarios.
Step 1: In Monitoring, create a log detection monitor.
Step 2: Set detection rules and trigger conditions.
For example, trigger an alert when more than 100 log entries contain "WARN".
Step 3: Edit the event notification content and alarm policy, and click Create.
Log backups
Guance Cloud can forward log data to Guance Cloud object storage or to external storage (including Guance Cloud backup logs, AWS S3, Huawei Cloud OBS, Alibaba Cloud OSS, and Kafka message queues). You can freely select the storage target and flexibly manage backed-up log data.
Back up log data
Step 1: Click Logs -> Data Forwarding.
Step 2: Click Forwarding Rules -> Create Rule.
Step 3: Set the data source to back up and the relevant filter conditions, then click OK.
Note: By default, log data under this rule is stored for a minimum of 180 days; you can modify the data-forwarding storage policy in Management -> Settings -> Change Data Storage Policy.
View the backup data
Step 1: Click Log - > Data Forwarding and select a rule from the drop-down list.
Step 2: Customize the time range for the query: you can select multiple dates and define the start and end times (accurate to the hour), then query the backup data.
For more information about log backup, see the official documentation.
Summary
In the ways described above, you can quickly collect logs from the business systems deployed in Kubernetes clusters into the Guance Cloud platform, implementing a complete solution covering log collection, log parsing, query and analysis, monitoring and alerting, and archiving and backup.