
Baidu Recommendation: Cold-Start Practice for New Resources

Author: A data man's own place

Reading guide: The cold-start strategy for new resources is an important topic in the content distribution ecosystem. This article shares Baidu's practice in cold-starting new resources.

The article mainly consists of four parts:

1. Content Cold Start Concepts and Challenges

2. Content cold-start algorithm practice

3. Content cold start experiment system

4. Q&A

Guest speaker|Li Shantao, Senior R&D Engineer of Baidu

Edited and organized|Wang Lulu

Content proofreading|Li Yao

Produced by|DataFun

01

Content Cold Start Concepts and Challenges


Baidu Feed Recommendation is a comprehensive information-feed recommendation platform with hundreds of millions of monthly active users. The platform covers a variety of content types, such as image-and-text articles, videos, short posts, mini programs, and Q&A. It offers click-through feeds in single-column or double-column layouts as well as other recommendation forms such as immersive video. The recommendation system is also a multi-stakeholder system: it involves more than the consumer-side (C-side) user experience. Content producers play an important role, and Baidu Feed has a large number of active authors who produce massive amounts of content every day.

The essence of a recommendation system on a content platform is to achieve a win-win for all parties. On the user side, the platform needs to keep recommending high-quality, fresh, and diverse content so that it attracts more users who spend more time. On the author side, if the high-quality, fresh content an author publishes does not get exposure quickly and in sufficient volume, the author may leave the platform, which hurts the platform's sustainable development. Several keywords follow from this: freshness, quality, diversity, author publishing, and retention. All of them are closely tied to cold start, the topic of this article. First, cold start lets more resources get sufficient display; collecting feedback on more content enlarges the pool the system can recommend and so increases the diversity of what users consume. Second, ramping up new resources quickly improves users' sense of freshness, which in turn drives overall time spent, DAU, and CTR. On the author side, cold start stimulates authors' enthusiasm, which increases the number of active authors and the volume of content published.


Cold-starting a new resource differs from regular recommendation in several ways. The challenges fall into three main areas:

The first is accurate recommendation. As recommendation algorithms developed over the past decade, from early matrix factorization to the later widespread use of deep learning, ID features have come to play an increasingly prominent role in models. New resources, however, have few or even no samples at cold start, so their ID features are undertrained, which hurts recommendation accuracy.

The second is the Matthew effect, which is common in recommendation systems: resources that users have already validated are more likely to be recommended, so they gain more exposure and clicks and further consolidate their position. New resources, conversely, struggle to get recommended and may be ignored altogether.

Finally, new resources need a certain amount of cold-start support, so the question is how to provide that support efficiently and fairly. This raises two concepts, fairness and justice. Fairness means that every piece of content gets some exposure and a fair chance to compete in the early stage of cold start. Justice means that the value of high-quality content must be reflected: content quality must be able to affect the weight of cold-start support. Finding the right balance between fairness and justice for new resources, so that high-quality resources stand out and the overall benefit is maximized, is therefore also a major challenge.

02

Content cold start algorithm practice

1. Content-based cold start


The common recall methods for new resources are described below. Traditional i-to-i (item-to-item) and u-to-i (user-to-item) recall do not work well because new resources have had few interactions with users, so cold start relies mainly on content-based recommendation. The most basic approach recalls directly from user profiles, content tags, and categories; it offers little personalization and relatively poor recall accuracy.

Second, as more and more authors on the major content platforms take on a personal, persona-like identity, cold start based on follow relationships has become an effective method. However, follow relationships are relatively sparse and cannot cover posts from the many authors with few followers. It is therefore necessary to go a step further and mine an author's potential followers algorithmically, extending the reach of follow-based cold start. Examples include finding users who frequently consume an author's content but do not follow the author, and inferring potential follow relationships from a graph built on user-author follow relationships.
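The first of these examples, users who often consume an author without following, can be sketched in a few lines. This is only an illustration: the log format, the `min_views` threshold, and the function name `potential_followers` are assumptions, not details from the talk.

```python
from collections import Counter

def potential_followers(consumption_log, follows, min_views=3):
    """Return {author: set(users)} for users who consumed an author's content
    at least `min_views` times without following that author.

    consumption_log: iterable of (user, author) pairs, one per consumption.
    follows: set of (user, author) pairs that already exist.
    """
    counts = Counter(consumption_log)  # (user, author) -> number of consumptions
    result = {}
    for (user, author), n in counts.items():
        if n >= min_views and (user, author) not in follows:
            result.setdefault(author, set()).add(user)
    return result

# Hypothetical data: u1 and u2 consume a1 often, but u2 already follows a1.
log = [("u1", "a1")] * 4 + [("u2", "a1")] * 5 + [("u3", "a1")] * 1
follows = {("u2", "a1")}
candidates = potential_followers(log, follows)
```

Here `candidates` contains only `u1` for author `a1`; a new post from `a1` could then be recalled for `u1` in the same way as for real followers.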

In addition, multimodal recall is also effective. With the development of cross-modal, multimodal, and large-model technologies, integrating information from multiple modalities has brought notable gains in recommendation systems, especially for cold start. CLIP is a pre-training method based on contrastive learning over text and images. It has two main modules, a text encoder and an image encoder, which map text and image information into the same space and support downstream tasks. Using these vectors directly for recall has a problem: the vector represents prior information about the content, and prior similarity does not necessarily mean that users will like it. The prior representation needs to be associated with the posterior representation that the recommendation system learns from behavioral data.

To learn the mapping, take resources that have already been widely distributed and whose embeddings are well trained, sample some of them, and use their embeddings as labels to train a projection network. The projection network maps the cross-modal prior representation to the posterior behavioral representation of the recommender system. One advantage of this approach is that it can reuse the recall and ranking models already in the recommender system without adding new models. For the two-tower model, for example, the existing user-side vector is used unchanged, and the projection network projects new resources into the posterior representation space of the two-tower model, so a two-tower recall can be launched simply and quickly. Existing graph-based and tree-based recalls can be brought online at low cost in the same way.

This mapping method has a small drawback: regression is hard to learn. In CB2CF the mapping is posed as a regression problem, and regression is generally more difficult to learn. A pairwise approach can therefore be used instead. Specifically, positive samples can be similar item pairs learned by item CF, negative samples can be obtained by global negative sampling, and the input also includes some prior and dynamic information about the item. The mapping is then learned from these pairs.

With the prior information of the content, the recall methods commonly used on the main feed can largely be reproduced for cold start.
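A minimal sketch of the projection idea follows, assuming a purely linear projection fitted by ridge regression on synthetic data. A production projection network would more likely be a small neural network; the dimensions, the noise level, and the names `fit_projection`, `prior`, and `posterior` are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins (assumptions): 200 well-distributed items, each with a
# 16-d content (prior) vector and an 8-d behavioural (posterior) vector.
prior = rng.normal(size=(200, 16))
true_map = rng.normal(size=(16, 8))
posterior = prior @ true_map + 0.01 * rng.normal(size=(200, 8))

def fit_projection(x, y, l2=1e-3):
    """Closed-form ridge regression: W = (X^T X + l2 * I)^-1 X^T Y."""
    d = x.shape[1]
    return np.linalg.solve(x.T @ x + l2 * np.eye(d), x.T @ y)

W = fit_projection(prior, posterior)

# Project a brand-new item into the behavioural space, then score it against
# an existing user vector with a dot product, as a two-tower model would.
new_item_prior = rng.normal(size=16)
new_item_posterior = new_item_prior @ W
user_vec = rng.normal(size=8)
score = float(user_vec @ new_item_posterior)
```

The user side is untouched: only the item vector is produced by a different path, which is why the existing recall index and user tower can be reused.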
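One way to express the pairwise objective is a BPR-style loss on already-mapped vectors, sketched below. The talk does not name a specific loss, so the choice of loss and the function name `pairwise_loss` are assumptions.

```python
import numpy as np

def pairwise_loss(anchor, positive, negative):
    """BPR-style loss: push sim(anchor, positive) above sim(anchor, negative).

    Each argument is an (n, d) array of mapped item vectors. The positive
    pairs would come from item CF and the negatives from global sampling.
    """
    pos = np.sum(anchor * positive, axis=1)
    neg = np.sum(anchor * negative, axis=1)
    # log(1 + exp(-(pos - neg))) equals -log(sigmoid(pos - neg))
    return float(np.mean(np.logaddexp(0.0, -(pos - neg))))

anchor = np.array([[1.0, 0.0]])
# Correctly ordered pair: the positive is closer to the anchor than the negative.
good = pairwise_loss(anchor, np.array([[1.0, 0.0]]), np.array([[-1.0, 0.0]]))
# Wrongly ordered pair: the loss should be larger.
bad = pairwise_loss(anchor, np.array([[-1.0, 0.0]]), np.array([[1.0, 0.0]]))
```

Unlike regression, this objective only asks the mapped vectors to rank similar items above random ones, which is usually an easier target to fit.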

2. Cold start based on seed users


The earliest cold-start stage mainly targets resources with zero or very few clicks. Once a resource has gone through some early cold-start distribution, it will have collected a certain number of seed users who gave positive feedback. At this point, the lookalike method can be used for recall.

An important advantage of lookalike is its timeliness. The method comes from Internet advertising, where advertisers select a set of potentially interested users as seed users and the system then finds users similar to those seeds to expand the audience. In a recommendation system, we can subscribe to the online real-time log stream to obtain the positive feedback a resource collected during its earlier cold start, such as clicks, plays, interactions, and follows, and even negative feedback, such as users who quickly swiped past. From these seed users, the system derives a representation of the item from the users' embeddings, using various aggregation methods or a self-attention mechanism. This representation can be updated very quickly and is then used to spread the item outward, which makes the method highly time-sensitive.
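The aggregation step can be sketched with plain mean pooling and cosine similarity. This is a simplified illustration: the function names, the toy embeddings, and the use of mean pooling instead of self-attention are assumptions.

```python
import numpy as np

def item_vector_from_seeds(seed_user_vecs, weights=None):
    """Aggregate seed-user embeddings into one L2-normalised item vector.

    `weights` could encode feedback strength (for example follow > click).
    """
    v = np.average(seed_user_vecs, axis=0, weights=weights)
    return v / np.linalg.norm(v)

def lookalike_users(item_vec, user_matrix, k):
    """Return indices of the k users most similar (cosine) to the item vector."""
    normed = user_matrix / np.linalg.norm(user_matrix, axis=1, keepdims=True)
    scores = normed @ item_vec
    return np.argsort(-scores)[:k]

# Toy user embeddings; users 0 and 1 gave positive feedback on the new item.
users = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.8, 0.3], [-1.0, 0.0]])
seeds = users[[0, 1]]
item_vec = item_vector_from_seeds(seeds)
top = lookalike_users(item_vec, users, k=3)
```

In this toy example the two seeds rank first and user 3, whose embedding points in a similar direction, is the first non-seed user found. Because the item vector is just an aggregate of user vectors, it can be recomputed whenever new feedback arrives on the real-time stream.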

03

Content cold start experiment system

1. Optimized ID features


On the model side, cold-start optimizations fall into three paradigms: ID dropout, ID generation, and dynamic model parameters. The three can be combined with one another.

For ID dropout, the background is that the model tends to cater to head resources: their samples are plentiful, so their IDs are learned very thoroughly and ID features carry especially high importance in the model. Cold-start resources, by contrast, appear rarely and their IDs are undertrained. There are two ways to address this: avoid relying on IDs as much as possible, or make better use of IDs.

The first paradigm is dropout-based optimization, and a classic approach is DropoutNet. During training, DropoutNet randomly drops item ID and user ID features so that the model puts more emphasis on non-ID features and generalizes better. This improves the cold-start effect for new users and new resources.

In recent years, contrastive learning methods have also emerged. Contrastive learning is a self-supervised method that does not rely on manual annotation and can construct a large number of samples, which helps with the cold-start problem because additional samples can be constructed to strengthen the position of cold-start data. In a two-tower model, for example, an auxiliary contrastive loss can be added on the item side. The auxiliary tower shares parameters with the item tower, so the contrastive loss influences the item tower's network parameters and embedding features. Samples are built by masking ID features and other cold-start features in different proportions, balancing the model's generalization ability against the particular characteristics of cold-start resources.
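The ID-dropping step described above can be sketched as a masking operation applied to a training batch. The sketch assumes the ID embedding is zeroed for whole rows and then concatenated with content features; the function name `apply_id_dropout` and the 30% rate are illustrative assumptions, not values from the talk or from the DropoutNet paper.

```python
import numpy as np

def apply_id_dropout(id_emb, content_emb, drop_rate, rng):
    """Zero the ID embedding for a random subset of rows, then concatenate.

    Rows whose ID is dropped force the downstream network to predict from
    content features alone, which is the situation a new resource is in.
    """
    keep = rng.random(id_emb.shape[0]) >= drop_rate   # True = keep the ID
    masked_id = id_emb * keep[:, None]
    return np.concatenate([masked_id, content_emb], axis=1), keep

rng = np.random.default_rng(0)
id_emb = np.ones((1000, 4))              # stand-in for learned ID embeddings
content_emb = np.full((1000, 3), 2.0)    # stand-in for content features
features, keep = apply_id_dropout(id_emb, content_emb, drop_rate=0.3, rng=rng)
```

At serving time no dropout is applied; a cold-start item simply arrives with an untrained or zero ID embedding, which the model has already learned to handle.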


Next is generation-based optimization. The dropout paradigm says to rely as little as possible on unreliable ID features; the alternative is to make them more reliable. The general idea is to initialize the ID embedding from the prior features of the resource. Taking the two-tower model as an example, the embedding of a new ID is normally initialized randomly or to all zeros, which leads to inaccurate predictions for new resources and slow convergence. Instead, one can use prior features of the content, such as content tags and author tags, together with similar IDs (such as popular IDs), take the ID embeddings of resources with strong posterior performance and high distribution volume as labels, and train a generator that produces an ID embedding to replace the initial value. Alternatively, one can find the top K popular resources most similar to the new resource and average their ID embeddings to initialize the new resource's embedding. This is relatively stable, very cheap, and widely used in industry.
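The top-K averaging variant is simple enough to sketch directly. The example assumes similarity is measured by cosine distance between content vectors; the function name `init_id_embedding` and the toy data are hypothetical.

```python
import numpy as np

def init_id_embedding(new_content_vec, hot_content_vecs, hot_id_embs, k=3):
    """Average the ID embeddings of the k popular items closest in content space."""
    a = new_content_vec / np.linalg.norm(new_content_vec)
    b = hot_content_vecs / np.linalg.norm(hot_content_vecs, axis=1, keepdims=True)
    top_k = np.argsort(-(b @ a))[:k]     # indices of the k most similar items
    return hot_id_embs[top_k].mean(axis=0)

# Four popular items: content vectors and their well-trained ID embeddings.
hot_content = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
hot_ids = np.array([[1.0, 1.0], [3.0, 3.0], [10.0, 10.0], [20.0, 20.0]])

# The new item is close to items 0 and 1, so it starts from their average.
init = init_id_embedding(np.array([1.0, 0.05]), hot_content, hot_ids, k=2)
```

The result replaces the random or zero initial value; training then continues from this warmer starting point as real feedback arrives.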


To address the model's dependence on ID features, which is dominated by popular IDs, we can apply multi-task and multi-scenario optimization. Using the two-tower model as an example, prediction for cold-start and non-cold-start resources can be split into two separate objectives, and a standard multi-objective model then makes the model pay more attention to new content. A classic approach is the CGC network, shown on the left side of the figure. In this network all tasks share an embedding layer, and the cold-start and non-cold-start tasks each learn their own expert networks, which improves cold-start prediction. Another approach adjusts the network's parameter weights for different resource types through dynamic weighting, shown on the right side of the figure. Here the rightmost network is a cold-start indicator: it receives information about the cold-start resource (such as its current clicks and impressions and its resource type) and outputs weights for each layer of the main network. These weights control how information flows through the network for different resource types, so that the model predicts more accurately in the cold-start case.
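The dynamic-weighting idea can be illustrated with a single gated layer. This sketch assumes a channel-wise multiplicative gate with range (0, 2); the actual network in the talk is only shown in a figure, so the gating form, the indicator features, and the name `gated_layer` are all assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_layer(x, w, gate_w, indicator):
    """One dense layer whose output channels are rescaled by a gate computed
    from cold-start indicator features (e.g. log impressions, log clicks,
    resource type)."""
    hidden = np.maximum(x @ w, 0.0)             # ordinary ReLU dense layer
    gate = 2.0 * sigmoid(indicator @ gate_w)    # per-channel scale in (0, 2)
    return hidden * gate

x = np.array([[1.0, -1.0]])                     # main-network input
w = np.eye(2)                                   # main-network weights
indicator = np.array([[0.0, 3.0, 1.0]])         # hypothetical indicator features

# With an all-zero gate network the gate is exactly 1 and the layer is unchanged.
out = gated_layer(x, w, np.zeros((3, 2)), indicator)
# A non-zero gate network amplifies (or damps) channels for this resource state.
boosted = gated_layer(x, w, np.ones((3, 2)), indicator)
```

Starting the gate at 1 means the gated model initially behaves like the ungated one, and the indicator network only gradually learns where cold-start items need different treatment.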

2. Design of flow control mechanism


New resources need to ramp up distribution as soon as possible, both to improve authors' publishing experience and to get the resources recommended effectively. Because of the Matthew effect, this requires tilting traffic toward new resources to some extent. Cold-start traffic tilt generally comes in two types: base traffic and boost traffic. Base traffic stands for fairness: every resource gets some inclusive traffic for exploration. Boost traffic is allocated based on an estimate of the resource's potential, such as the author's quality, and on how the resource performed on its base traffic.

At an abstract level the cold-start support mechanism has two parameters, time and distribution volume: through forced insertion, weight boosting, and similar means, a resource should reach a given distribution target within a given time. Different businesses set different distribution volumes and time limits. For an ordinary resource, 100 impressions within 24 hours may be enough; a breaking-news resource may need to move faster, for example 3,000 impressions within half an hour. A larger cold-start quota may also be set for new authors.

Specifically, t in the formula is the time since publication divided by the time allowed for the target, that is, the normalized time progress, and x is the current distribution progress. Ideally t and x are equal, which means distribution is on schedule. If x is less than t, the cold start is running behind, and the boost weight or forced-insertion coefficient needs to be increased. The parameter θ in the formula controls how strongly the resource is tilted in the early stage.

However, this formula assumes that the product's traffic is uniform over time, which does not hold in practice. Traffic on typical Internet products has peaks and troughs, so the formula needs to be adjusted to the actual situation. For example, if a piece of content is published at 2 a.m., it might only take 25 distributions by 8 a.m., because traffic is low in the early hours of the morning. Therefore, t in the formula needs to be computed by integrating over the actual traffic distribution.
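The formula itself is not reproduced in this text, so the sketch below is not the speaker's formula. It is a hypothetical linear controller that only illustrates the relationship between t, x, and the boost; in particular, `theta` here acts as a simple gain and its role is an assumption.

```python
def pacing_boost(elapsed, target_time, delivered, target_impressions, theta=1.0):
    """Hypothetical pacing rule: boost grows with how far delivery lags time.

    t = normalised time progress, x = normalised distribution progress.
    Returns 1.0 (no boost) when the resource is on or ahead of schedule.
    """
    t = min(elapsed / target_time, 1.0)
    x = min(delivered / target_impressions, 1.0)
    if x >= t:
        return 1.0
    return 1.0 + theta * (t - x)

# Target: 100 impressions in 24 hours, checked 12 hours after publication.
on_schedule = pacing_boost(12, 24, 50, 100)             # t = x = 0.5
behind = pacing_boost(12, 24, 20, 100, theta=2.0)       # t = 0.5, x = 0.2
```

With these inputs the on-schedule resource gets no boost, while the lagging one gets a multiplier of about 1.6 that could be applied to its ranking score or forced-insertion probability.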
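One way to compute a traffic-aware t is to replace elapsed clock time with the share of expected traffic that has already passed. The sketch below uses hourly buckets as a discrete stand-in for the integral; the traffic profile and the function name `traffic_weighted_progress` are hypothetical.

```python
def traffic_weighted_progress(hourly_share, publish_hour, elapsed_hours, window_hours):
    """Fraction of the window's expected traffic that has already passed.

    hourly_share: 24 relative traffic weights, index 0 = 00:00 to 01:00.
    """
    def traffic(hours):
        return sum(hourly_share[(publish_hour + h) % 24] for h in range(hours))
    return traffic(min(elapsed_hours, window_hours)) / traffic(window_hours)

# Hypothetical profile: quiet from 00:00 to 08:00, five times busier afterwards.
hourly_share = [1] * 8 + [5] * 16

# Content published at 2 a.m. with a 24-hour target, checked at 8 a.m.
t = traffic_weighted_progress(hourly_share, publish_hour=2,
                              elapsed_hours=6, window_hours=24)
uniform_t = 6 / 24
```

Under this profile the traffic-weighted progress at 8 a.m. is well below the uniform value of 0.25, so the resource is not judged as lagging just because it was published during a quiet period.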

3. Serving user selection


Another key question is which users a resource should be shown to at the start of distribution. The most common practice is to recommend new resources to existing users rather than new users: existing users are usually more tolerant, and this avoids hurting new users with inaccurate recommendations of new resources. In addition, if serving a cold-start resource is treated as an intervention, uplift modeling can be used to learn the intervention's effect on a user's time spent and retention, and cold start can then target users on whom the intervention has no negative effect.

Both points above take the perspective of the impact on C-side users. However, the choice of cold-start audience also affects how the resource spreads afterwards. From the perspective of information dissemination, the two-step flow theory of communication divides dissemination into two steps. First, out of the vast amount of information generated every day, certain groups of people are able to sift through it and amplify part of it; these are called opinion leaders. The information amplified by opinion leaders is then disseminated on a large scale.

Today the role of opinion leader still exists on social platforms, in well-known media, on TV stations, and so on. Recommendation systems likewise have key-node users, who influence other users' consumption by picking out high-quality resources and recommending them.

So how can these key users be found? From the discussion above, key users have two characteristics: they are good at judging the quality of resources, and the content they recommend has a high probability of being accepted by other users. This suggests two mining methods:

First, resources are divided into high-quality and low-quality according to their posterior performance, and this split is used as the label. The IDs of the users who clicked on a resource earliest are then used as features to predict the resource's posterior performance. The weight the model learns for each user ID can be treated as that user's key-user index.

Second, the recommendation success rate between users is mined from the online user-based collaborative filtering recommender. Users with a high recommendation success rate can be regarded as key users in the recommendation system. With these two methods, the key users in the graph are identified, and resources are recommended to them during cold start.
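The first mining method can be sketched as a small logistic regression over multi-hot user-ID features. The model choice, the hyperparameters, the toy data, and the name `key_user_weights` are assumptions made for illustration.

```python
import numpy as np

def key_user_weights(first_clickers, labels, n_users, lr=0.5, epochs=500, l2=0.01):
    """Logistic regression on multi-hot user-ID features.

    first_clickers: for each resource, the list of user indices who clicked early.
    labels: 1 if the resource later proved high quality, else 0.
    Returns one learned weight per user, usable as a key-user index.
    """
    x = np.zeros((len(first_clickers), n_users))
    for row, users in enumerate(first_clickers):
        x[row, users] = 1.0
    y = np.asarray(labels, dtype=float)
    w, b = np.zeros(n_users), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))   # predicted P(high quality)
        grad = p - y
        w -= lr * (x.T @ grad / len(y) + l2 * w)  # gradient step with L2 penalty
        b -= lr * grad.mean()
    return w

# Hypothetical data: user 0 only ever clicked early on resources that turned
# out to be high quality; users 1 to 3 mostly clicked on low-quality ones.
first_clickers = [[0, 1], [0, 2], [0, 3], [1, 2], [2, 3], [1, 3]]
labels = [1, 1, 1, 0, 0, 0]
w = key_user_weights(first_clickers, labels, n_users=4)
```

User 0 ends up with the largest weight, so under this sketch it would be the first candidate audience for a newly published resource.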

4. Experimental system


Designing the experiment system for cold-start content requires attention to some particularities. Because training samples in a recommendation system are shared, feedback collected by the experimental group is also learned by the control group, which makes it difficult to measure the effect of a cold-start strategy accurately. Content-isolation experiments are therefore needed to evaluate the strategy's impact on the whole system.

A common experimental design fully isolates both users and resources, as shown in the lower left of the figure. Each 50% of users sees only its own 50% of the content, and different resource groups use different cold-start strategies. This makes it possible to assess a strategy's impact on the whole system. However, the approach can significantly hurt the C-side user experience, because users see only part of the content.

A gentler approach isolates users and resources only during the cold-start phase, for example the first 3,000 impressions, with different groups running different cold-start strategies. Once cold start is over, the resource can be distributed to all users. This design reduces the impact on the C-side user experience.
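The gentler design can be sketched as a visibility rule based on deterministic hashing. The two-bucket split, the salts, and the function names are illustrative assumptions; the 3,000-impression limit is the example figure from the text.

```python
import hashlib

def bucket(key, salt, n_buckets=2):
    """Deterministically hash a user or item ID into one of n buckets."""
    digest = hashlib.md5(f"{salt}:{key}".encode()).hexdigest()
    return int(digest, 16) % n_buckets

def is_visible(user_id, item_id, item_impressions, cold_start_limit=3000):
    """During cold start an item is shown only to users in its own bucket;
    once it passes the limit it is released to every user."""
    if item_impressions >= cold_start_limit:
        return True
    return bucket(user_id, "user") == bucket(item_id, "item")

# Each item bucket runs its own cold-start strategy (hypothetical names).
strategy = {0: "control", 1: "new_boost_policy"}[bucket("item_42", "item")]
released = is_visible("user_7", "item_42", item_impressions=5000)
```

Setting `cold_start_limit` to infinity recovers the fully isolated design, where users only ever see the content in their own bucket.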

Through experiments, we can analyze the following metrics:

  • During cold start: the cold-start target attainment rate, the ramp-up speed, and efficiency metrics such as click-through rate (CTR) and completion rate.
  • In the full distribution stage: the share of high-quality content, the break-out rate, the viral-hit rate, and the number of articles published by the corresponding authors in each resource group.

04

Q&A

Q1: How do you distinguish the hot and cold towers in a two-tower setup, where one is a hot tower and the other a cold tower?

A1: Hot and cold are usually distinguished by how much a resource has been distributed. A resource with low distribution volume is treated as cold, and one with high distribution volume as hot. For example, a resource that has been distributed fewer than 100 times can be considered a cold-start resource. The exact threshold should be set by analyzing the online model's prediction accuracy and the actual situation.

Q2: How is the potential of a resource judged here? By whether it is a new hot topic in its field, or by a value model that makes predictions?

A2: Boost traffic for high-quality cold-start resources typically involves assessing a resource's potential, and that assessment can combine multiple sources. To tell whether something is a new hot topic in its field, you can draw on information from across the web, including the trending lists of various products and the level of discussion and attention in related fields. To estimate a resource's value, you can consider the author's quality together with the resource's early performance, interactions, and other factors. Combining this information gives a more comprehensive estimate of the resource's potential.

Q3: How do you compute the ideal t and the actual t? What about the exposure curve? How do you ensure that actual exposure follows the trend of overall traffic?

A3: The ideal t and the actual t can be read from the exposure curve, which shows a resource's exposure over different time periods. The ideal t is the theoretical exposure progress calculated from the time allowed for the target, and the actual t is the current actual exposure progress. To keep actual exposure in line with the overall traffic trend, the share of overall traffic needs to be monitored continuously so that cold-start progress tracks it. If cold start is running slow, you may need to increase exposure or adjust other recommendation strategies to speed it up; if it is running fast, you may need to slow exposure down to avoid overexposing the resource.

Q4: During the experiment users can see only 50% of the content, but at full launch they see 100%. How do you show that the experimental result is consistent with the full-launch effect?

A4: In practice it is difficult to measure the exact magnitude of a cold-start effect. The usual approach is to compare the experimental group with the control group and see which is better. That's all for this talk. Thank you.


About the speaker

Li Shantao

Senior R&D Engineer, Baidu

Li Shantao holds a master's degree and is a senior R&D engineer at Baidu, responsible for technical work on the distribution ecosystem and recall for Baidu's information-feed recommendation.
