Flink OLAP helps ByteHTAP debut at VLDB, a top database conference

Author: ByteDance Cloud Native Computing

VLDB 2022 will be held in Sydney, Australia, from 5 to 9 September 2022. ByteDance's infrastructure research paper, "ByteHTAP: ByteDance's HTAP System with High Data Freshness and Strong Data Consistency", was accepted by VLDB 2022, and the team was invited to present it at the conference.

VLDB, the International Conference on Very Large Data Bases, is one of the three long-established top conferences in the database field (SIGMOD, VLDB, ICDE). It is a venue where outstanding database research and engineering results are presented promptly, and it reflects the current frontier of database research, the latest industrial technology, and the level of research and development in different countries. Since it was founded in 1975, VLDB has attracted submissions from the world's top research institutions every year, and it sets very high standards for system innovation, completeness, and experimental design.

Core contributions of the paper

The paper "ByteHTAP: ByteDance's HTAP System with High Data Freshness and Strong Data Consistency" introduces ByteHTAP, an HTAP system with high data freshness and strong data consistency that ByteDance built to serve its business scenarios.

  • ByteHTAP follows a separate-engine, shared-storage architecture, and its modular design builds on ByteDance's existing OLTP and OLAP systems.
  • ByteHTAP provides high data freshness, with a delay of less than 1 second, which opens up many new business opportunities for customers. Customers can also configure different data freshness thresholds to match their business needs.
  • ByteHTAP provides strong data consistency through global timestamps shared by its OLTP and OLAP systems, so developers do not have to handle complex data consistency issues themselves.
  • ByteHTAP uses Flink as its OLAP compute engine and introduces important performance optimizations in both compute and storage, such as refactoring the Flink job scheduling process to raise query QPS, pushing computation down to the storage layer, and using delete bitmaps to handle deletions efficiently.
  • The paper concludes by sharing ByteDance's lessons learned and best practices from developing and running ByteHTAP in production, including cross-database query capabilities for OLAP, efficient data import, and development enhancements to Flink.
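
To make the delete-bitmap idea in the list above concrete, here is a minimal Python sketch. It is not ByteHTAP's implementation: the `ColumnBlock` class, its methods, and the sample rows are hypothetical names chosen for illustration. The sketch only shows the general technique, in which a deletion flips one bit instead of rewriting stored rows, and scans skip rows whose bit is set.

```python
class ColumnBlock:
    """Hypothetical storage block: immutable rows plus a delete bitmap."""

    def __init__(self, rows):
        self.rows = list(rows)
        # A Python int used as a bitmap: bit i set means row i is deleted.
        self.deleted = 0

    def delete(self, row_index):
        """Mark a row as deleted without touching the stored data."""
        if not 0 <= row_index < len(self.rows):
            raise IndexError(row_index)
        self.deleted |= 1 << row_index

    def scan(self, predicate=lambda row: True):
        """Yield live rows that satisfy the predicate, skipping deleted ones."""
        for i, row in enumerate(self.rows):
            if (self.deleted >> i) & 1:
                continue  # row was deleted; ignore it during the scan
            if predicate(row):
                yield row


block = ColumnBlock([("a", 10), ("b", 20), ("c", 30)])
block.delete(1)  # logically delete ("b", 20)

live = list(block.scan())
filtered = list(block.scan(lambda row: row[1] > 10))
print(live)
print(filtered)
```

Because the row data is never rewritten, a deletion in this sketch costs a single bit update, and the cost of honoring it is deferred to a cheap check during the scan.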

The core compute engine, Flink OLAP

As the OLAP compute engine of the ByteHTAP system, Flink is already used by several internal services at the company. ByteDance's Flink team has made many in-depth optimizations to the Flink engine to support OLAP computing, which have substantially improved Flink OLAP performance. At present, on a 1,600-core cluster with small data volumes, scheduling QPS exceeds 1,000 for simple queries at a concurrency of 128 and exceeds 100 for complex queries; with 1,000 concurrent WordCount queries, latency is around 100 ms. Next, we will focus on https://issues.apache.org/jira/browse/FLINK-25318 and gradually contribute our internal optimizations to the community.

  1. Query optimizer. Supports pushdown of TopN, Aggregate, and other operators; supports Plan Cache and parallel DAG construction; supports a cached Catalog. These changes improve TPC-DS SF100 performance by more than 20%.
  2. Query execution optimization. Supports ClassLoader reuse and a cross-job Codegen Cache, reducing CPU usage and Metaspace consumption during execution; implements Runtime Filter to speed up Join computation; adds asynchronous data reading, concurrency optimizations, and more.
  3. Resource management and job scheduling. Simplifies the process of requesting and releasing query resources, optimizes the interaction between the JobMaster and the ResourceManager/TaskManager nodes, and allocates job resources at TaskManager granularity to speed up resource requests; supports batch deployment of compute tasks and optimizes the deployment structure and serialization/deserialization to speed up task deployment.
  4. Query result management. Queries are submitted over the WebSocket protocol, and result delivery is changed from Pull mode to Push mode to avoid time-consuming polling; Dispatcher connection reuse cuts the unnecessary connections and interactions that the JobMaster and TaskManager create when queries and compute tasks are initialized, lowering query latency.
  5. Memory management optimization. Optimizes how the MemoryManager and NetworkBufferPool request and release memory, reducing the number of memory interactions and lock acquisitions when compute tasks start and stop; reduces full GC and young GC (FGC/YGC) on JobManager/TaskManager nodes by trimming unnecessary Metrics, enabling parallel GC, and similar changes, improving query execution performance and production cluster stability.
  6. Network management optimization. Implements network connection reuse across jobs within a TaskManager and optimizes the Partition Request interaction between upstream and downstream compute tasks, reducing the cost of repeatedly initializing the network layer and the number of messages exchanged between upstream and downstream tasks, which speeds up task initialization.
  7. Resource isolation management. Supports resource groups managed at the TaskManager level, providing physical isolation of query jobs across tenants; implements fine-grained scheduling and execution of compute tasks within a TaskManager, supporting a small-query-first policy under high load.
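
As an illustration of the Runtime Filter mentioned in item 2, the following Python sketch shows the general idea on an in-memory hash join. It is not Flink's or ByteDance's implementation: all function names and the sample tables are hypothetical, and an exact set of keys stands in for the compact structure (often a Bloom filter) that a real engine would typically ship to the probe-side scan.

```python
def build_runtime_filter(build_rows, key):
    """Collect the join keys seen on the (smaller) build side."""
    return {key(row) for row in build_rows}


def scan_with_filter(rows, key, runtime_filter):
    """Probe-side scan that drops rows whose key cannot match any build row."""
    for row in rows:
        if key(row) in runtime_filter:
            yield row


def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Plain hash join: build a table on one side, probe it with the other."""
    table = {}
    for row in build_rows:
        table.setdefault(build_key(row), []).append(row)
    return [(b, p) for p in probe_rows for b in table.get(probe_key(p), [])]


# Hypothetical sample data: users(id, name) and orders(order_id, user_id).
users = [(1, "alice"), (2, "bob")]
orders = [(101, 1), (102, 3), (103, 2), (104, 4), (105, 1)]

rf = build_runtime_filter(users, key=lambda u: u[0])
filtered_orders = list(scan_with_filter(orders, key=lambda o: o[1], runtime_filter=rf))
result = hash_join(users, filtered_orders, lambda u: u[0], lambda o: o[1])

print(len(orders), "probe rows scanned,", len(filtered_orders), "reach the join")
```

The join result is the same with or without the filter; the benefit in a distributed engine is that rows discarded at the scan never have to be shuffled over the network or probed against the hash table.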

ByteDance best practices

Inside ByteDance, ByteHTAP currently supports services such as User Growth, e-commerce, Happiness, and Feishu, running 11 clusters with a total of 6,000+ cores of AP resources and serving 500,000+ queries per day.

The capabilities of Flink OLAP, ByteHTAP's core compute engine, are gradually being rolled out in Volcano Engine's commercial product, Stream Computing Flink Edition. As an enterprise-grade unified compute engine that integrates and optimizes ByteDance's internal cloud-native big data solutions, Stream Computing Flink Edition offers out-of-the-box use, elastic deployment, unified stream and batch processing, and OLAP multimodal computing.
