Flink's landing practice in Autohome's real-time computing scenarios

Guest | Wang Gang

Edit | Strict

Powerful enough to develop and run many different kinds of applications, Apache Flink is widely recognized as one of the highest-performing real-time computing engines for big data. Flink has been proven to scale to thousands of cores and terabytes of state while still maintaining high throughput and low latency. Globally, more and more companies have begun to use Flink, and well-known Internet companies in China such as Alibaba, ByteDance, JD.com, and Meituan are using Flink at scale as their distributed big data processing engine.

The industry originally positioned Flink mainly as a stream processor, or stream computing engine. Under the general trend of making big data real time, as practitioners we cannot help asking: what else can Flink do? How can we take full advantage of Flink and build solutions that cover a wider range of real-time problems? And in specific business scenarios, what challenges will we face, and what are the solutions?

With these questions in mind, we invited Wang Gang, a data engineer at Autohome's Smart Data Center, to share Apache Flink's core practices at Autohome. Wang Gang will also present "Flink-based real-time computing platform and real-time data lake ingestion practice" at the QCon+ case study session [Flink's landing practice in real-time computing application scenarios], which we hope will inspire you.

The following is the interview with Wang Gang:

InfoQ: Tell us about what you've been doing lately.

Hello everyone, my name is Wang Gang, and I am currently responsible for the design, development, and maintenance of Autohome's real-time computing platform, real-time access platform, and data lake. After continuous polishing in production, the platforms' usability and job stability have improved greatly; they now serve business lines across the whole company, and the daily processing volume has reached the scale of trillions of records.

InfoQ: What difficulties and pain points did you encounter in achieving these results? What effort did it take to solve them? What takeaways and insights did you gain?

I started working on the real-time computing platform at the end of 2018, and there were indeed quite a few small difficulties along the way. But looking back, even at Autohome's data volume, most of the problems we ran into with Flink arose from how we used it, or were minor troubles caused by misallocated resources; this is thanks to Flink's excellent design and the strong community behind it. For customization requirements, Flink's well-encapsulated compute engine meant we could support them with fairly simple changes, and the harder engine-level problems could be solved with the community's help. There is also a class of environment problems that caused us a lot of trouble, such as a glibc issue that led to a native memory leak in the JVM. We were new to Flink at the time and at one point suspected the Flink engine itself, taking many detours, until we noticed that the number of contiguous 64MB memory segments in the process kept growing over time, which pinpointed the problem.
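As an illustration, here is a minimal hedged sketch (not the exact tooling we used) that counts memory mappings close to 64MB in /proc/&lt;pid&gt;/maps; a count that keeps growing over time is the typical signature of this glibc arena behavior.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

/**
 * Hedged sketch: count memory regions close to 64 MB in /proc/<pid>/maps.
 * In our case, a steadily growing number of such regions pointed to glibc
 * malloc arenas rather than to the Flink engine itself.
 */
public class ArenaRegionCounter {
    public static void main(String[] args) throws IOException {
        String pid = args.length > 0 ? args[0] : "self";
        List<String> lines = Files.readAllLines(Paths.get("/proc/" + pid + "/maps"));
        long suspicious = lines.stream()
                .map(line -> line.split("\\s+")[0])          // "start-end" address range
                .map(range -> range.split("-"))
                .mapToLong(r -> Long.parseUnsignedLong(r[1], 16)
                        - Long.parseUnsignedLong(r[0], 16))  // region size in bytes
                .filter(size -> size >= 60L * 1024 * 1024 && size <= 64L * 1024 * 1024)
                .count();
        System.out.println("Regions close to 64MB: " + suspicious);
    }
}
```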

Another part of the problems comes from the platform work itself. For example, as the number of users grows, the on-call pressure becomes very heavy, so you constantly have to reflect:

What can users currently not do by themselves that we have to do for them? Can the platform be made to empower users to do it on their own?

Can the platform automatically diagnose the problems users run into most often during on-call and suggest solutions?

How should we prepare for critical, high-assurance jobs?

InfoQ: Flink has been emphasizing stream-batch unification in recent years. What practices and explorations have you carried out in actual business scenarios?

In this regard, we have mainly explored two directions:

When users on our platform develop stream computing jobs with Flink SQL, they can reuse the SQL of an existing batch job with only slight changes. This not only greatly reduces users' learning and development costs, but also unifies the computation logic across stream and batch;

We introduced Iceberg as the unified Table Format at the storage layer, so the same storage can be read by both streaming and batch jobs, in full or incremental mode, to serve batch and stream processing alike (see the sketch after this list).
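To make the second direction concrete, here is a hedged sketch of reading one Iceberg table from Flink in both full (batch-style) and incremental (streaming) mode; the catalog, database, and table names are made up, and the read options follow the Iceberg Flink connector's documented hints.

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

/**
 * Hedged sketch: one Iceberg table, two read modes. Names and connection
 * details are placeholders; depending on the Flink version, OPTIONS hints may
 * require table.dynamic-table-options.enabled to be set to true.
 */
public class IcebergUnifiedRead {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // Hypothetical Iceberg catalog backed by a Hive metastore.
        tEnv.executeSql(
                "CREATE CATALOG iceberg_catalog WITH ("
                        + " 'type'='iceberg',"
                        + " 'catalog-type'='hive',"
                        + " 'uri'='thrift://metastore-host:9083')");

        // Full read: scans the table's current snapshot and then finishes.
        tEnv.executeSql("SELECT * FROM iceberg_catalog.db.user_events").print();

        // Incremental read: keeps consuming newly committed snapshots (runs until cancelled).
        tEnv.executeSql(
                "SELECT * FROM iceberg_catalog.db.user_events "
                        + "/*+ OPTIONS('streaming'='true', 'monitor-interval'='10s') */")
                .print();
    }
}
```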

For future planning, we will also try using Flink SQL as a batch compute engine, giving full play to the advantages of Flink's stream-batch unification to further empower users and reduce their development costs.
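As a hedged illustration of that plan, the sketch below runs the same aggregation SQL as either a streaming job or a batch job, switching only the EnvironmentSettings; the source and sink tables are placeholders assumed to be defined elsewhere with real connectors.

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

/**
 * Hedged sketch: identical business SQL executed in streaming or batch mode.
 * The tables ods_page_view and dws_pv_daily are hypothetical and assumed to be
 * registered elsewhere (e.g. via CREATE TABLE with real connectors).
 */
public class SameSqlStreamOrBatch {

    static void run(boolean batchMode) {
        EnvironmentSettings settings =
                batchMode ? EnvironmentSettings.inBatchMode() : EnvironmentSettings.inStreamingMode();
        TableEnvironment tEnv = TableEnvironment.create(settings);

        // The business logic stays the same; only the runtime mode differs.
        tEnv.executeSql(
                "INSERT INTO dws_pv_daily "
                        + "SELECT dt, COUNT(*) AS pv "
                        + "FROM ods_page_view "
                        + "GROUP BY dt");
    }

    public static void main(String[] args) {
        run(false); // continuous streaming job over the live source
        // run(true); // the same SQL replayed as a batch job over historical partitions
    }
}
```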

InfoQ: What are you focusing on right now? What are the new hot spots and trends in Flink? What more needs to be done to take full advantage of Flink?

Recently, I have been focusing on the fine-grained resource management feature for tasks in newer Flink versions. Before Flink 1.14, resource management was coarse-grained: we could make good use of resources through slot sharing, but in certain scenarios it still caused unnecessary waste. I think allocating resources at the granularity of a SlotSharingGroup is a good way to address that waste.
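For reference, here is a hedged sketch of what declaring per-group resources looks like with the fine-grained resource management API introduced around Flink 1.14 (FLIP-156); the group names, operator logic, and resource sizes are placeholders, and the cluster may need fine-grained resource management enabled in its configuration.

```java
import org.apache.flink.api.common.operators.SlotSharingGroup;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

/**
 * Hedged sketch: declaring resource profiles per slot sharing group instead of
 * relying only on coarse-grained slots. Group names, operator logic, and sizes
 * are placeholders; the feature may need to be enabled in the cluster config.
 */
public class FineGrainedResourcesSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // A lightweight group for simple parsing operators.
        SlotSharingGroup light = SlotSharingGroup.newBuilder("light")
                .setCpuCores(0.5)
                .setTaskHeapMemoryMB(256)
                .build();

        // A heavier group for the more expensive downstream operators.
        SlotSharingGroup heavy = SlotSharingGroup.newBuilder("heavy")
                .setCpuCores(2.0)
                .setTaskHeapMemoryMB(1024)
                .setManagedMemoryMB(512)
                .build();

        env.registerSlotSharingGroup(light);
        env.registerSlotSharingGroup(heavy);

        env.fromElements("a", "b", "c")
                .map(String::toUpperCase).slotSharingGroup("light")
                .map(s -> s + "!").slotSharingGroup("heavy")
                .print();

        env.execute("fine-grained-resources-sketch");
    }
}
```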

On the other hand, I am paying close attention to the Flink CDC (Change Data Capture) project. Before Flink CDC was released, we had already built a real-time access and distribution platform on Flink that synchronizes business database data from MySQL, SQL Server, TiDB, and so on. At this year's FFA (Flink Forward Asia) conference, Yun Xie's talk on Flink CDC gave me a lot of inspiration, and I think the usability improvements Flink CDC still needs (schema change handling, whole-database ingestion into the lake) are exactly the problems that our company's data warehouse and business-line users urgently need us to solve.
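For context, below is a minimal hedged sketch of consuming MySQL change data with the Flink CDC connector's MySqlSource; hostnames, credentials, and table names are placeholders, and this is not our platform's actual code.

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

import com.ververica.cdc.connectors.mysql.source.MySqlSource;
import com.ververica.cdc.debezium.JsonDebeziumDeserializationSchema;

/**
 * Hedged sketch: capture MySQL changes (snapshot + binlog) with Flink CDC and
 * print them as Debezium-style JSON. Connection details and table names are
 * placeholders.
 */
public class MySqlCdcSketch {
    public static void main(String[] args) throws Exception {
        MySqlSource<String> source = MySqlSource.<String>builder()
                .hostname("mysql-host")
                .port(3306)
                .databaseList("app_db")                     // databases to capture
                .tableList("app_db.orders")                 // tables to capture
                .username("cdc_user")
                .password("cdc_password")
                .deserializer(new JsonDebeziumDeserializationSchema())
                .build();

        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(60_000);                    // checkpoints drive binlog offset commits

        env.fromSource(source, WatermarkStrategy.noWatermarks(), "MySQL CDC Source")
                .print();

        env.execute("mysql-cdc-sketch");
    }
}
```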

InfoQ: In your exploration of Flink, what problems remain unsolved for various reasons and objective constraints?

We encountered the following two obvious problems:

Flink iterates quickly, and compatibility between versions is limited, which makes it difficult for the platform to integrate new Flink versions;

Flink SQL allows users to complete real-time computation development purely in SQL, but SQL's expressiveness is limited. Sometimes users need to write a lot of SQL to build a single real-time dashboard, so the same data gets computed multiple times, which wastes resources. This is really more of a real-time OLAP scenario; we currently use StarRocks to support it, but we hope Flink will offer a one-stop solution that further reduces our maintenance costs.

InfoQ: Finally, what would you say to readers who are interested in Flink and want to learn more about applying it?

I think the way to approach any new technology stack in software development is universal: first find some scenarios to use it in, then dig into how it works with concrete problems in mind, and finally think about how you would implement it yourself. Beyond that, make a habit of summarizing and reflecting. Many students who are new to software development think that once a problem is solved, it is over; in fact, every problem, big or small, is an excellent opportunity to think. Once a problem is solved, we can look back and consider how to help users locate such problems faster, or avoid them entirely, next time. For example, reflect on whether the problem was so hard to locate because the program design was too complex and the processing chain too long.

Guest Profiles

Wang Gang

Autohome Smart Data Center Data Engineer

Graduated from Shenyang University of Aeronautics and Astronautics with a major in Computer Science and Technology. He joined Autohome in 2018, where he redesigned and developed the log collection platform and designed and built a real-time computing platform and real-time access platform based on Apache Flink from scratch. In 2020 he began exploring and implementing a lakehouse architecture, leading the integration and optimization of Apache Iceberg. He enjoys technical exploration, values user-centered thinking, and is good at locating and solving the various tricky problems encountered in his work.
