
Create an enterprise-grade, unified log center with Observability Cloud

Author: Observability Cloud

Preface

In the era of digital transformation, a modern, large-scale application can generate hundreds of millions of log entries per day. Log data is a valuable asset in the operation and management of a business: it records the activities and events of systems, applications, and devices. By analyzing it, organizations can gain insight into how the business is performing, identify potential issues and optimization opportunities, and improve system stability, security, and performance. Enterprises therefore need to build a "unified log center" that centrally manages and analyzes all kinds of log data, enabling comprehensive monitoring, rapid response, and in-depth analysis. Such a center helps enterprises with troubleshooting, security auditing, performance optimization, and other goals, improves operational efficiency, reduces risk, and provides strong support for business development.

Adhering to the concept of "unified collection, unified processing, and unified analysis", Observability Cloud has built an efficient observability data analysis platform. This article describes how Observability Cloud helps enterprises build a unified log center quickly and well, covering log collection, processing, storage, and analysis.

Observability Cloud's log collection capabilities

On the collection side, DataKit, Observability Cloud's unified data collector, offers rich collection templates and strong integration and extension capabilities. It can collect logs in a variety of ways, such as reading disk files, capturing container stdout, receiving remotely pushed logs, and sidecar collection, to adapt to enterprises' heterogeneous IT environments and diverse log collection needs.

  • How Observability Cloud handles different types of logs

DataKit has built-in collectors for operating system logs, container logs, and common databases and middleware such as MySQL and Nginx. These collectors can gather monitoring object attributes, metric data, and log data at the same time, following the idea of unified collection to minimize the number of agents deployed on monitored objects.

In a Kubernetes environment, DataKit runs in DaemonSet mode, ensuring that a log collector pod runs on every node and achieving full coverage of the cluster's log data: no matter how many nodes the cluster has, each node's logs are collected in real time. Observability Cloud also provides DataKit Client Agent (DCA), a tool for managing DataKit in batches. With DCA, users can perform batch operations and maintenance on DataKit status, log parsing templates, blacklists, and other configurations, improving the efficiency and accuracy of data collection.
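
As a minimal sketch (not an official Observability Cloud tool), the following Python snippet uses the Kubernetes client to check that a DataKit pod is running on every node; the `app=datakit` label selector and the `datakit` namespace are assumptions and may differ in your deployment.

```python
# Sketch: verify DaemonSet-style coverage of DataKit pods across nodes.
# Assumptions (adjust to your cluster): pods are labeled app=datakit
# and deployed in the "datakit" namespace.
from kubernetes import client, config

def check_datakit_coverage(label_selector="app=datakit", namespace="datakit"):
    config.load_kube_config()  # or config.load_incluster_config() inside a pod
    v1 = client.CoreV1Api()

    all_nodes = {n.metadata.name for n in v1.list_node().items}
    pods = v1.list_namespaced_pod(namespace, label_selector=label_selector).items
    covered = {p.spec.node_name for p in pods if p.status.phase == "Running"}

    missing = all_nodes - covered
    print(f"{len(covered)}/{len(all_nodes)} nodes run a DataKit pod")
    if missing:
        print("Nodes without a running DataKit pod:", ", ".join(sorted(missing)))

if __name__ == "__main__":
    check_datakit_coverage()
```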

Enterprises that already collect logs with an open-source agent usually prefer to forward the logs collected by the existing agent to the newly built unified log center. Observability Cloud can receive logs through Fluentd, Logstash, Kafka, APIs, and other channels, fully preserving the company's past technology investment and reducing replacement costs.
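
For example, a custom application or an existing pipeline can push log lines to DataKit over HTTP. The sketch below is only illustrative: the port (9529) and the log streaming endpoint path are assumptions based on a default DataKit setup with the log streaming input enabled, so check your DataKit version's documentation for the exact URL and parameters.

```python
# Sketch: push a log line to a local DataKit over HTTP.
# Assumptions: DataKit listens on localhost:9529 and its log streaming
# input is enabled at /v1/write/logstreaming; verify against your setup.
import requests

def push_log(message: str, source: str = "myapp") -> None:
    url = "http://localhost:9529/v1/write/logstreaming"
    resp = requests.post(url, params={"source": source},
                         data=message.encode("utf-8"), timeout=5)
    resp.raise_for_status()

if __name__ == "__main__":
    push_log("2024-05-01T12:00:00Z INFO order-service order 10086 created")
```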

Observability Cloud's log processing capabilities

For log processing, Observability Cloud ships more than ten official log parsing templates for common databases and middleware, including Kafka, Elasticsearch, MySQL, Nginx, Redis, Tomcat, and MongoDB, which users can apply directly. In addition, dozens of script functions are provided, so users can quickly reference and debug them in real time for different log processing scenarios.

For example, many enterprises want to desensitize sensitive information in logs at collection time. The masking function in the Observability Cloud Pipeline can mask data within a specified field range, and the work is completed on the collection side, avoiding the compliance issues that arise when sensitive information travels over the Internet.
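
Conceptually, collection-side masking works like the Python sketch below, which redacts values that look like phone numbers or ID numbers inside a chosen field. The actual masking function and its syntax are defined by the Observability Cloud Pipeline, so treat this only as an illustration of the idea.

```python
# Illustration only: redact sensitive substrings in a log field before
# the log ever leaves the collection side.
import re

PATTERNS = [
    re.compile(r"\b1\d{10}\b"),          # e.g. 11-digit mobile numbers
    re.compile(r"\b\d{17}[\dXx]\b"),     # e.g. 18-character ID numbers
]

def mask_field(record: dict, field: str = "message") -> dict:
    value = record.get(field, "")
    for pattern in PATTERNS:
        value = pattern.sub("******", value)
    record[field] = value
    return record

print(mask_field({"message": "user 13800138000 logged in"}))
# {'message': 'user ****** logged in'}
```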

As another example, many enterprises also want to configure collection-side blacklists to save valuable bandwidth and reduce the performance overhead of centralized processing. In Observability Cloud, you can use the drop function (to discard an entire log or a single field), the sample function (to sample logs), or configure a log blacklist.
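
The effect of dropping and sampling can be pictured with a short Python sketch; the real drop and sample functions run inside the Observability Cloud Pipeline, so the code below is only a conceptual stand-in.

```python
# Illustration only: discard noisy logs and keep a fixed fraction of the rest.
import random

def should_keep(record: dict, sample_rate: float = 0.1) -> bool:
    # Drop: debug-level logs never leave the host.
    if record.get("status") == "debug":
        return False
    # Sample: keep roughly sample_rate of everything else.
    return random.random() < sample_rate

logs = [
    {"status": "debug", "message": "cache hit"},
    {"status": "error", "message": "db timeout"},
]
kept = [r for r in logs if should_keep(r, sample_rate=0.5)]
```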


For log formats that no template fits, Observability Cloud also provides several ways to complete log parsing easily. First, it offers one-click sample retrieval and real-time debugging during Pipeline development, so users can check at any time whether the Pipeline processes logs as expected. In addition, to lower the difficulty of writing Pipelines, Observability Cloud provides interactive command-line tools that help users quickly pick functions suited to a given field.

  • Interactive command-line tools reduce the difficulty of log parsing

In addition, Pipelines can be exported and imported through the openAPI, so group organizations can reference them directly when creating workspaces and quickly distribute a tuned Pipeline to different business systems, improving the experience of business teams.

It is also worth mentioning that, for the multi-line logs frequently encountered during collection, Observability Cloud provides an automatic multi-line mode and a custom multi-line mode. The automatic mode identifies multi-line logs based on start tags, end tags, regular expressions, and so on, reducing the burden on users.
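
Multi-line handling is typically driven by a "start of a new entry" pattern. The Python sketch below shows the general idea with an assumed timestamp prefix, so that stack traces are folded into the preceding log line; it is only a conceptual illustration, not Observability Cloud's implementation.

```python
# Illustration only: group raw lines into multi-line log entries.
# Assumption: every new log entry starts with an ISO-like timestamp.
import re

NEW_ENTRY = re.compile(r"^\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}")

def group_multiline(lines):
    entry = []
    for line in lines:
        if NEW_ENTRY.match(line) and entry:
            yield "\n".join(entry)
            entry = []
        entry.append(line.rstrip("\n"))
    if entry:
        yield "\n".join(entry)

raw = [
    "2024-05-01 12:00:00 ERROR unhandled exception",
    "Traceback (most recent call last):",
    '  File "app.py", line 42, in handler',
    "2024-05-01 12:00:01 INFO recovered",
]
for entry in group_multiline(raw):
    print(repr(entry))
```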

Observability Cloud's log storage capabilities

In delivering its SaaS service to users around the world, Observability Cloud must process and store massive volumes of logs while keeping a balance between cost and user experience, and it has invested heavily in this. When we previously used an Elasticsearch/OpenSearch solution, we found problems such as heavy write resource consumption, poor support for schema-less tables, and weak aggregate query performance. After rigorous research, development, and testing, we therefore launched GuanceDB with a new architecture.

GuanceDB meets Observability Cloud's requirement for schema-free storage, solving the pain point of frequently changing data schemas. It improves data write performance, guaranteeing timely writes and real-time queries, and it improves full-text search performance while reducing query resource overhead. All in all, Observability Cloud ultimately achieved about one third of the cost of Elasticsearch with a 2-4x performance improvement, for an overall cost-performance gain of nearly 10x.


To meet enterprises' long-term log retention requirements (usually for log audits) while controlling storage costs, Observability Cloud provides the "Data Forwarding" function, which can store logs and other data in object storage or forward them to external storage. On the Data Forwarding page, you can set the query time range and data forwarding rules; supported destinations include Observability Cloud backup logs, AWS S3, Huawei Cloud OBS, Alibaba Cloud OSS, and Kafka message queues. Historical backup logs and SLS Logstore data can then be viewed in Observability Cloud without additional processing.
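
If logs are forwarded to object storage such as AWS S3, a later audit job can read them back with standard SDKs. The sketch below uses boto3 with an assumed bucket name and key prefix; the actual layout of forwarded objects depends on your forwarding rule.

```python
# Illustration only: list log objects that a forwarding rule wrote to S3.
# Assumptions: bucket name and key prefix are placeholders.
import boto3

def list_forwarded_logs(bucket="my-log-archive", prefix="guance/forwarded/"):
    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            print(obj["Key"], obj["Size"])

if __name__ == "__main__":
    list_forwarded_logs()
```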

Observability Cloud's log analysis capabilities

Debug Query Language (DQL) is a data query language developed by Observability Cloud. With DQL, you can query data in an Observability Cloud workspace or from terminal devices, retrieving and analyzing the various data stored in Observability Cloud, including logs, metrics, and traces.
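
As a rough sketch of what programmatic use might look like, the snippet below submits a DQL statement over HTTP. The endpoint URL, header name, request body shape, and the DQL text itself are all illustrative assumptions rather than the documented API contract; consult the official DQL and openAPI documentation for the exact forms.

```python
# Illustration only: submit a DQL statement over HTTP.
# Assumptions: API_ENDPOINT, the header name, the body shape, and the DQL
# text are placeholders, not the documented openAPI contract.
import requests

API_ENDPOINT = "https://YOUR-GUANCE-ENDPOINT/api/v1/df/query_data"  # hypothetical
API_KEY = "YOUR_API_KEY"                                            # hypothetical

# Hypothetical DQL: error logs from the nginx source over the last hour.
dql = 'L::nginx { status = "error" } [1h]'

resp = requests.post(
    API_ENDPOINT,
    headers={"DF-API-KEY": API_KEY},
    json={"queries": [{"qtext": dql}]},
    timeout=10,
)
resp.raise_for_status()
print(resp.json())
```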


Thanks to the unified DQL query language and the unified "viewer" style, users get a consistent experience when analyzing logs, traces, and RUM (user experience) data in Observability Cloud, with the same ease of use and the same second-level query performance.


Commonly used features such as search history templates, quick filters, and field auto-completion are all available in Observability Cloud. Because teams often need to share data while troubleshooting and analyzing anomalies, Observability Cloud provides a thoughtful "snapshot sharing" feature: team members can create an instant copy of the data over a custom time range and generate a shortcut link with specified viewing permissions. A snapshot is not a static export but an interactive log analysis interface, which greatly reduces the communication overhead of passing screenshots and raw logs between teams. For data security, the sharer can apply field masking, watermarks, expiration dates, link encryption, access IP whitelists, and other controls to a snapshot, fully protecting the data while transmitting information efficiently and facilitating collaboration within teams and with external partners.


Many enterprises are particularly worried that collected logs contain sensitive data, which is extremely hard to find during collection, storage, and analysis. To address this, Observability Cloud developed the "Sensitive Data Scan" feature, with nearly 100 built-in sensitive data rules and support for user-defined rules. It actively scans the log data stored in Observability Cloud against these rules and encrypts sensitive fields as soon as they are found, ensuring data compliance.

In addition, Observability Cloud supports configuring log data query scopes for different roles through the Data Access feature, achieving fine-grained management of who can query what. For example, within the same workspace, teams responsible for different business systems can be flexibly granted different log permissions based on their roles.

With the Intelligent Log Detection feature, Observability Cloud applies intelligent detection algorithms to the log data generated by collectors in a workspace. It automatically identifies anomalies such as sudden increases or drops in log volume and spikes in error logs, detects states that do not meet expectations, and promptly reminds the team to check whether the business is behaving abnormally.

  • Intelligent detection of log bursts
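
The kind of "sudden increase" signal this feature looks for can be approximated with a very simple baseline check, as in the Python sketch below. Observability Cloud's actual detection algorithms are more sophisticated; this is only meant to convey the idea.

```python
# Illustration only: flag a sudden spike in per-minute error-log counts
# by comparing the latest bucket against a recent baseline.
from statistics import mean, stdev

def is_spike(counts, window=30, threshold=3.0):
    """counts: per-minute log counts, oldest first; the last value is 'now'."""
    baseline, latest = counts[-window - 1:-1], counts[-1]
    if len(baseline) < window:
        return False
    mu, sigma = mean(baseline), stdev(baseline)
    return latest > mu + threshold * max(sigma, 1.0)

history = [12, 15, 11, 14, 13] * 6 + [240]   # 30 normal minutes, then a burst
print(is_spike(history))                     # True
```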

Summary

When building a unified log center, an enterprise typically needs to weigh functional requirements, scalability, security, compatibility, performance and efficiency, cost-effectiveness, and user experience. After the introduction above, you should have a more comprehensive understanding of Observability Cloud's log analysis capabilities. Observability Cloud is committed to continuously improving the observability it offers its users, and we believe it is an ideal choice for enterprises building a unified log center and a unified observability platform.