
Keeping system architecture six months ahead of the business is the key to solving problems gracefully

Guest 丨 Zhou Xun

Edited by 丨 Xue Liang

As the number of Internet users grows and data volume and traffic explode, conventional data processing and service recommendation methods struggle to keep up with today's complex business scenarios. How to connect products and users precisely, efficiently, and intelligently has become a growing focus for platforms of all kinds. As a result, personalized recommendation systems that tailor results to each individual user have become an indispensable part of nearly every business scenario.

At iQIYI, Zhou Xun's work centered on data and algorithms. He was responsible for personalized recommendation on iQIYI's main traffic entry points and for building big data platforms such as the A/B experimentation platform, the user analysis platform, and the user profile platform. In short, he led a big data middle-platform team that provided data capabilities to businesses across the company, using algorithms and big data technology to help them run and grow more efficiently. Zhou Xun has ten years of experience in big data and intelligent algorithms, with in-depth practical experience in personalized recommendation algorithms and platform architecture, large-scale OLAP systems, user profiles, and data science.

At the 2021 ArchSummit Global Architect Summit, we invited Zhou Xun to serve as producer of the "New Era Recommendation System Technology" track and to plan its sessions exploring big data and recommendation systems.

1

Living through the data platform architecture upgrades

As the former head of iQIYI's recommendation system and big data application team, Zhou Xun personally experienced and led many data platform architecture upgrades, and the construction and transformation of iQIYI's big data architecture left a deep impression on him. When he first joined iQIYI, one of the projects he was in charge of was user profiles. Simply put, this means using a user's behavioral data to attach labels to that user. One of the biggest difficulties at the time was that user data was scattered across many businesses and systems. The data team spent a great deal of effort extracting this information from the various business systems and databases, and it sometimes ran into problems such as inconsistent user IDs and inaccurate event tracking data. This stage was, by his account, very painful.

Around 2017, as the company's data teams were gradually consolidated, standardizing and unifying data became the big data department's top priority. Zhou Xun first led the team in designing a company-wide event tracking specification and user ID specification. Then, over nearly two years of building and promoting the data middle platform, the team resolved the pain points of the first stage. After that, they consolidated the company's data products, such as the user-side analysis system, the content-side analysis system, and the A/B experimentation system, and rebuilt them as SaaS offerings. They focused on three directions (real-time, intelligent, and mobile) and pushed data capabilities to the front line, so that data application products became the primary basis for business analysis and decision-making.

2

Background to building the recommendation middle platform

When many recommendation scenarios need to be optimized and iterated in parallel, improving engineering efficiency and sharing capabilities across teams become urgent problems.

Zhou Xun described the work in two stages. In the first stage, the team brought some of the simplest scenarios online through configuration, mostly by reusing models from other scenarios. The advantage was fast onboarding and launch; the disadvantage was that recommendation quality could not easily be optimized further. This approach was a good choice in the early days of iQIYI's move to personalization.

Once personalization had penetrated user-facing products to a fairly high degree, the data team entered the second stage of the recommendation middle platform. The main goal was to make core components configurable and open them up, so that engineers responsible for different recommendation scenarios could carry out deeper optimization. During this work the team also paid more attention to capturing experience: good "operators" were abstracted and managed centrally, so that teams could learn directly from one another's work. As algorithm maturity and business complexity continue to increase, the recommendation middle platform will become more and more important.
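To make the idea of configurable scenarios built from shared "operators" concrete, here is a minimal sketch in Python. It is not iQIYI's implementation, which the interview does not describe; the registry, the operator names (`drop_watched`, `sort_by_score`, `top_2`), and the `run_scene` helper are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical operator registry: an operator takes a candidate list and
# returns a (filtered, re-ordered, or truncated) candidate list.
Operator = Callable[[List[dict]], List[dict]]
OPERATORS: Dict[str, Operator] = {}


def register(name: str) -> Callable[[Operator], Operator]:
    """Register a reusable operator under a name that scene configs can reference."""
    def decorator(fn: Operator) -> Operator:
        OPERATORS[name] = fn
        return fn
    return decorator


@register("drop_watched")
def drop_watched(items: List[dict]) -> List[dict]:
    # Remove items the user has already watched.
    return [it for it in items if not it.get("watched", False)]


@register("sort_by_score")
def sort_by_score(items: List[dict]) -> List[dict]:
    # Order candidates by model score, highest first.
    return sorted(items, key=lambda it: it["score"], reverse=True)


@register("top_2")
def top_2(items: List[dict]) -> List[dict]:
    # Keep only the first two candidates.
    return items[:2]


def run_scene(config: List[str], candidates: List[dict]) -> List[dict]:
    """Apply the operators named in a scene's config, in order."""
    for name in config:
        candidates = OPERATORS[name](candidates)
    return candidates


# A scene is just an ordered list of operator names (illustrative data).
homepage_config = ["drop_watched", "sort_by_score", "top_2"]
candidates = [
    {"id": "a", "score": 0.9, "watched": True},
    {"id": "b", "score": 0.4},
    {"id": "c", "score": 0.7},
    {"id": "d", "score": 0.1},
]
result = run_scene(homepage_config, candidates)
print([it["id"] for it in result])
```

In a design like this, a new scenario can go live by writing a config that reuses existing operators, while a team that needs deeper optimization can register its own operator and share it with others.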

Of course, building a middle platform inevitably involves some detours. Zhou Xun believes that architecture upgrades are usually driven by both business and technology. More advanced technology exists to solve more complex problems, but an architecture upgrade often puts the business through a painful transition. In his experience, problems can only be solved elegantly when the architecture runs ahead of the business. His requirement for the team is that the architecture should be at least half a year ahead of the business, which in turn demands that architects understand the business well and can anticipate where it is going.

3

Algorithms are the soul of the recommendation system

What matters most to the quality of a big data personalized recommendation system is the recommendation algorithm it uses, which is the soul of the entire system. With approaches based on association rules, content, and collaborative filtering all available, what should guide the choice? Zhou Xun said that a modern recommendation system fuses multiple algorithms while balancing several business goals, so the choice of algorithms must start from the essence of the business. He even considers thinking through business logic a required course for algorithm engineers.
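One simple way to picture the fusion of multiple algorithms is a weighted blend of their scores. The sketch below assumes a linear weighted sum, which is only one of many possible fusion strategies and is not stated in the interview; the function `blend_scores`, the algorithm names, the weights, and the item IDs are all hypothetical.

```python
from typing import Dict


def blend_scores(per_algorithm: Dict[str, Dict[str, float]],
                 weights: Dict[str, float]) -> Dict[str, float]:
    """Weighted sum of per-algorithm scores.

    An item that an algorithm did not score contributes 0 for that algorithm;
    an algorithm without a configured weight is ignored (weight 0).
    """
    blended: Dict[str, float] = {}
    for algo, scores in per_algorithm.items():
        w = weights.get(algo, 0.0)
        for item, score in scores.items():
            blended[item] = blended.get(item, 0.0) + w * score
    return blended


# Illustrative scores from three recall/ranking sources (made-up numbers).
scores = {
    "collaborative_filtering": {"v1": 0.8, "v2": 0.2},
    "content_based": {"v2": 0.9, "v3": 0.5},
    "association_rules": {"v1": 0.4},
}
# The weights encode business priorities and would be tuned per scenario.
weights = {
    "collaborative_filtering": 0.5,
    "content_based": 0.25,
    "association_rules": 0.25,
}

blended = blend_scores(scores, weights)
ranking = sorted(blended, key=blended.get, reverse=True)
print(ranking)
```

The weights are where business logic enters: shifting them changes which goal the final ranking favors, which is why they would normally be validated through A/B experiments rather than set by hand.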

User data also plays an important role in recommendation. Zhou Xun said that beyond the usual user profile capabilities, what matters more is the granularity and timeliness of user behavior data. In iQIYI's recommendation system, user behavior data is divided into three tiers (real-time, nearline, and offline), each of which feeds into the algorithm strategy. The granularity of event tracking also directly affects the upper limit of the recommendation system: signals such as a user dragging the progress bar while watching a video, or the points they skip to, can all be used by the algorithm model for learning.
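The three tiers of behavior data can be sketched as three sources of interest signals that are combined before ranking. The interview does not say how iQIYI combines them; the sketch below assumes the simplest possible rule, in which a fresher tier overrides a staler one for the same tag. The function `merge_tiers` and the tag names are hypothetical.

```python
from typing import Dict, Optional


def merge_tiers(offline: Dict[str, float],
                nearline: Optional[Dict[str, float]] = None,
                realtime: Optional[Dict[str, float]] = None) -> Dict[str, float]:
    """Combine interest signals, letting fresher tiers override staler ones.

    offline  : long-term interests, e.g. recomputed by a daily batch job
    nearline : interests updated with a delay of minutes
    realtime : interests derived from the current session
    """
    merged = dict(offline)          # copy so the caller's dict is not mutated
    merged.update(nearline or {})   # nearline overrides offline
    merged.update(realtime or {})   # realtime overrides both
    return merged


# Illustrative interest weights per content tag (made-up numbers).
offline = {"drama": 0.6, "sports": 0.3}
nearline = {"sports": 0.5}
realtime = {"comedy": 0.9}

profile = merge_tiers(offline, nearline, realtime)
print(profile)
```

A production system would more likely weight or decay the tiers than overwrite them outright, but the override rule is enough to show why timeliness changes what the model sees.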

Although big data personalized recommendation has become popular across the Internet industry, the recommendation quality of many products undeniably falls far short of expectations, and there is still a long way to go. As Zhou Xun put it, the way recommendation quality is measured differs between companies and industries: content products may emphasize user time spent, e-commerce products may emphasize transactions, and social products may emphasize building relationships. How to define the recommendation performance metric is the first problem to solve, and it draws on the business-logic thinking mentioned earlier. The goals of recommendation should also be adjusted strategically as a product moves through different stages of development. There is no fixed formula, but the goal must be grounded in the essence of the business and measurable with data.

Speaker:

Zhou Xun: Director of eBay's China Research and Development Center (CCOE).

He leads the recommendation and advertising team and previously headed iQIYI's main-app recommendation and big data application platform team. He has more than ten years of experience in big data and intelligent algorithms, with in-depth practical experience in recommendation and advertising algorithms and system architecture, user profiles, A/B experimentation, data science, and large-scale OLAP.
