In-depth reveal: the core philosophy behind Observation Cloud

Introduction

I'm Jiang Shuomiao, Chief Technical Architect and product lead of Observation Cloud. Today, I'm honored to share the core philosophy our team follows in designing and building the product. These ideas guide our daily work and motivate the continuous advancement of our technology.

At Observation Cloud, we firmly believe that a product's vitality and competitiveness come from the philosophy behind it. As the leader of the team, I ask every member to adhere to these core ideas. They are the cornerstone of how we design and build our product, and our compass for technical development.

Engineer-centric

Observation Cloud is a monitoring and observability product for the enterprise market, designed to meet the needs of end users. We recognize that for a product to stay competitive over the long term, engineers must accept it, find it easy to use, and get real value from it. We believe that when engineers become more productive, the end customer benefits as well.

Traditional monitoring products are usually designed around the needs of O&M personnel, covering dashboards, alerts, data access, and so on. Observation Cloud is designed differently. Here are a few examples:

Observation Cloud's agent model adopts an architecture similar to OpenTelemetry's, separating the Instrumentor (probe) from the Collector (agent), which is quite different from the approach of traditional APM vendors.

This architecture has several advantages; here are the key ones. Both eBPF probes and traditional bytecode probes can have some impact on the application, and both probes and agents need to be upgraded periodically to keep enhancing monitoring capabilities. With this architecture, the R&D team can independently verify the stability of the probe and the extensibility of its data fields, while the O&M team is responsible for agent upgrades and configuration changes. Because the probe is separated from the agent, upgrading the agent does not interfere with the running application; at worst, data is temporarily unavailable. When the application itself is upgraded, its compatibility with the probe is verified in the test environment first, which avoids the risk of an improper probe upgrade bringing down the entire business system.

In addition, this architecture improves compatibility: the agent supports a variety of instrumentors, including but not limited to probes and log collectors, and integrates with the Prometheus ecosystem.
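The separation described above can be illustrated with a minimal sketch. This is not Observation Cloud's actual code: the `Probe`, `Collector`, and `handle_request` names are hypothetical, and a real probe would talk to the agent over a local socket or HTTP rather than through a direct method call. The point it shows is only that the application keeps working when the collector is unavailable, and the cost is limited to lost telemetry.

```python
import time


class Collector:
    """Hypothetical stand-in for the agent: a separate component that can be restarted."""

    def __init__(self):
        self.received = []
        self.available = True

    def ingest(self, record):
        # Simulate an agent that is down for an upgrade.
        if not self.available:
            raise ConnectionError("collector is restarting")
        self.received.append(record)


class Probe:
    """Hypothetical stand-in for the in-process instrumentor."""

    def __init__(self, collector):
        self.collector = collector
        self.dropped = 0

    def emit(self, record):
        # The probe never lets a collector failure reach the application:
        # it counts the dropped record and moves on.
        try:
            self.collector.ingest(record)
        except ConnectionError:
            self.dropped += 1


def handle_request(probe, order_id):
    """Application code: the business logic does not depend on the collector."""
    start = time.monotonic()
    result = f"order {order_id} processed"
    probe.emit({
        "name": "handle_request",
        "order_id": order_id,
        "duration_s": time.monotonic() - start,
    })
    return result
```

In this sketch, setting `collector.available = False` while requests are being handled only increases `probe.dropped`; `handle_request` still returns normally, which mirrors the "temporary unavailability of data" trade-off.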

Users of Observation Cloud will appreciate the freedom it affords: they can design personalized dashboards, configure data structures flexibly, customize query templates, and design their own alerting strategies. We also understand the psychology of R&D engineers, so we have built "only me" options into many features. These give engineers a private space where they can debug and analyze dashboards without worrying about being watched. This design, which comes from insight into and respect for how developers work, encourages freer experimentation, unlocks the potential of observability, and adds real value to projects.

The examples above not only simplify the R&D workflow, but also let engineers freely extend fields in the probe and deepen business insight, which helps build a more accurate and comprehensive analysis system. Many Observation Cloud users have told us that, compared with some products marketed as AIOps, our solution is stronger in both functionality and AI analysis, yet we deliberately do not make this our core selling point. The reason is that we firmly believe the true value of algorithms goes far beyond operations (Ops): they should support outlier analysis and anomaly detection everywhere, and empower R&D teams as well. This is not only about detecting abnormal CPU behavior, but also about understanding how code calls behave in different scenarios.

Products that overplay the AIOps concept often fall into the trap of exaggeration. The truly great products are the ones that integrate these advanced capabilities seamlessly into day-to-day R&D and operations workflows, where they become an integral part of the engineer's toolbox. We are committed to bringing technology back to its roots and providing tangible value to our users, not empty hype.

Our meticulous polishing of Observation Cloud shows in many places, such as the flexibility of the time controls. We support direct input of Unix timestamps, which spares engineers the tedious work of converting times with Linux commands or picking dates from unintuitive calendar controls. Many comparable products overlook this kind of detail.

Observation Cloud is updated every two weeks, not only to introduce new features but, more importantly, to optimize the product based on real user feedback. These updates often include a large number of small improvements. It is impossible to list them all, but together they make up our commitment to continuous improvement. We believe that people who actually use Observation Cloud will feel our dedication and sincerity in these details.
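To make the time-input idea concrete, here is a minimal sketch of how a time field might accept a raw Unix timestamp alongside a conventional date string. The function name `parse_time_input` and the "13 or more digits means milliseconds" heuristic are assumptions made for illustration; they are not taken from the product.

```python
from datetime import datetime, timezone


def parse_time_input(text):
    """Parse a Unix timestamp (seconds or milliseconds) or an ISO 8601 string into UTC."""
    text = text.strip()
    if text.isdigit():
        value = int(text)
        # Assumed heuristic: 13 or more digits is treated as milliseconds.
        if len(text) >= 13:
            return datetime.fromtimestamp(value / 1000, tz=timezone.utc)
        return datetime.fromtimestamp(value, tz=timezone.utc)
    # Fall back to ISO 8601; naive values are assumed to be UTC.
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


if __name__ == "__main__":
    # A timestamp copied from a log line can be pasted in directly.
    print(parse_time_input("1700000000").isoformat())
```

With a field like this, an engineer can paste a timestamp copied from a log line without first converting it by hand.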

Make an open product

My attitude towards open source is clear: I support the true spirit of open source, which is to open up the code and communicate with developers around the world. I oppose "pseudo-open source" behavior that pursues commercial interests in the name of open source while ignoring the development of the community and the technology. Observation Cloud takes an open stance here: we have published all of our client-side (collection-side) code and maintain dozens of open source projects on GitHub. We encourage team members to take an active part in the open source community and to submit pull requests (PRs) when they find issues, so that the related projects keep improving.

We also continue to contribute to open source projects such as VictoriaMetrics. Observation Cloud is a commercial product, so while we adhere to the principle of openness, that openness is selective. Our goal is to provide an environment that is open but controlled, which ensures the stability and security of our product.

We are committed to integrating open source technologies to enhance Observation Cloud. We fully support existing observability frameworks, for example through deep integration with the Prometheus ecosystem. Our system can collect the various kinds of Prometheus data, including Exporter and Push data, and for Push data we believe our support surpasses Prometheus' official Pushgateway in stability and performance. We have also strengthened support for Prometheus service discovery.

For log collection, we are compatible with a variety of ways of producing logs. For example, Log4j can send logs directly over a socket, so they never need to be written to disk. This lets developers collect large volumes of log data efficiently even in performance-critical scenarios.

We also offer broad support for distributed tracing, including ddtrace, OpenTelemetry, Zipkin, SkyWalking, and Jaeger. These open source solutions are implemented differently, so using them in different applications tends to fragment analysis. Observation Cloud ingests them in a unified way, so data from these different sources is used consistently, as if each tool had been designed specifically for Observation Cloud, even though the content of the data they collect may differ.
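One way to picture this unified handling is a small normalization layer that maps each tracer's span format onto a common record. The sketch below is an illustration under stated assumptions, not Observation Cloud's implementation: `UnifiedSpan`, `normalize`, and the choice of microseconds as the common unit are hypothetical. The input field names follow the Zipkin v2 JSON span format and the OTLP JSON span encoding, simplified to a handful of fields.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UnifiedSpan:
    """Hypothetical common span record shared by all sources."""
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    name: str
    start_us: int      # start time in microseconds since the Unix epoch
    duration_us: int   # duration in microseconds
    source: str


def from_zipkin(span: dict) -> UnifiedSpan:
    # Zipkin v2 reports timestamp and duration in microseconds.
    return UnifiedSpan(
        trace_id=span["traceId"],
        span_id=span["id"],
        parent_id=span.get("parentId"),
        name=span.get("name", ""),
        start_us=int(span["timestamp"]),
        duration_us=int(span["duration"]),
        source="zipkin",
    )


def from_otlp(span: dict) -> UnifiedSpan:
    # OTLP JSON encodes start and end times as nanoseconds, usually as strings.
    start_ns = int(span["startTimeUnixNano"])
    end_ns = int(span["endTimeUnixNano"])
    return UnifiedSpan(
        trace_id=span["traceId"],
        span_id=span["spanId"],
        parent_id=span.get("parentSpanId") or None,
        name=span["name"],
        start_us=start_ns // 1000,
        duration_us=(end_ns - start_ns) // 1000,
        source="opentelemetry",
    )


NORMALIZERS = {"zipkin": from_zipkin, "opentelemetry": from_otlp}


def normalize(source: str, span: dict) -> UnifiedSpan:
    """Convert a raw span from a known source into the common record."""
    return NORMALIZERS[source](span)
```

Once every span has the same shape, queries and dashboards can be written once against `UnifiedSpan` rather than per tracer, and supporting another source such as Jaeger or SkyWalking would mean adding one more function to `NORMALIZERS`.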

Our commitment to open source technology is reflected not only in our support for eBPF tracing, but also in our efforts to promote a unified standard. Unlike other open source vendors in China, we generate spans and traces that comply with the OpenTelemetry standard. This means that eBPF data collected with Observation Cloud can be analyzed alongside data ingested through other technologies, without an additional back-end system or a dedicated database cluster.

We take the same open and transparent approach to technical documentation. We believe Observation Cloud leads the industry in the openness of its technical documentation: we have published a large number of implementation details that industry colleagues can learn from and refer to.

Be cognition-driven rather than demand-driven

Whether they are long-time users or new to the product, many people may not have noticed that Observation Cloud is offered both as a cloud service with seamless global access and as a flexible private deployment that meets users' specific needs. What is even less well known is an important principle we follow in product development: we never customize the product for any single customer. From the edition for China, to the international edition for overseas markets, to private deployments in customers' own environments, some versions differ slightly because of market strategy or release timing, but Observation Cloud has always insisted on providing one unified, standard product experience for all users.

We hold firmly to this principle. Why?

First, Observation Cloud adheres to the principle of being responsible to its customers. In a rapidly iterating product, we understand how important it is to maintain a single main line. A customized product may seem to meet a specific need immediately, but in the long run it tends to drift away from the mainline version. That hurts the customer's long-term interests and makes it harder for us to maintain the product experience. We therefore firmly avoid any form of version forking, so that every customer enjoys continuous, stable, and efficient service.

Second, we focus on responding quickly to valuable needs and standardizing them. We understand how diverse and complex customer needs are. However, not every requirement translates directly into a product feature, especially when a requirement is logically contradictory or not clearly articulated. As product experts, we listen to our customers, combine what we hear with our product architecture and industry insight, and translate specific needs into standardized product features. This approach increases the value of the product and deepens our understanding of the industry, which in turn drives further optimization and creates a virtuous cycle. We ship updates much faster than customers could explore open source solutions on their own, so the platform stays easy and efficient to use for R&D and operations engineers.

Finally, we are committed to keeping the product open and flexible. To meet users' diverse individual needs, we aim for standardized integration with various technology stacks rather than relying on customization. By building in rich flexibility mechanisms, we have created a complete architecture that serves a wide range of business scenarios and needs. This degree of flexibility means we cannot simply cobble together open source products; we need to design core components such as the data engine, the database, and the UI framework ourselves to ensure the product's overall performance and user experience. This control of the stack from bottom to top lets the product respond flexibly to new challenges and create greater value for users.

Observation Cloud is a product our team has built with care. It carries our technical ambitions and reflects our values. We are committed to continuously optimizing and improving it so that it brings real value to our users. We want Observation Cloud to play a key role in the enterprise, especially for engineering teams, and we believe that through continued effort and innovation it can improve engineers' efficiency and strengthen the enterprise's operational capabilities. Our goal is not only to meet users' current needs, but also to anticipate and lead future technology trends, bringing long-term value to the enterprise.
