JD Advertising R&D - Efficiency is King: Practice of Unified Advertising Retrieval Platform

Author: JD Cloud developer

1. System overview

Practice has shown that monetizing Internet traffic through online advertising is the most successful business model of the Internet, and e-commerce is a core scenario for online advertising. JD.com serves hundreds of millions of users and a large number of merchants in China, with a massive product pool. Promoting products while balancing user experience, platform revenue, and advertiser revenue is a challenge for the platform. While keeping its services efficient and reliable, the JD advertising retrieval platform must match advertisements to user needs effectively, provide personalized and accurate ad recommendation and retrieval services, and create better interaction and value for advertisers and users.

1.1. Overview of the functions of the retrieval platform

The retrieval platform translates advertisers' delivery requirements into the language of the ad serving system and, as the upstream of the advertising system, completes the preliminary matching of people, goods, and scenarios. From a retrieval space of hundreds of millions of items it returns hundreds of materials to downstream modules. In doing so it must weigh user experience, advertisers' delivery requirements, the relevance of recall results, platform revenue, and more, so it carries most of the advertising business logic. Its effectiveness determines the ceiling of overall advertising performance.

Fig 1: JD.com Advertising Retrieval System Architecture

1.2. Problem definition

Core capabilities of the retrieval platform

This document focuses on the core function of the retrieval system: retrieving relevant advertisements for users, i.e., recall. Other core functions are documented separately and are not repeated here. Without loss of generality, relevance can be abstracted as a scoring function $f$. Recall is then a top-value search problem: for the scoring function $f$, given an input $x$, find in the candidate set $C$ a subset $S$ of fixed size such that the elements of $S$ rank as high as possible in $C$ under $f$. The widespread use of deep learning in online advertising is a watershed moment:

• In the pre-deep-learning period, recall was mainly done by simple algorithms or rules

• In the post-deep-learning period, recall is mainly done by a two-tower model plus vector retrieval

Relevance scoring in the rule-based, pre-deep-learning period is an abstraction of business rules, which has the advantage of being highly interpretable. However, scores from different rules are often not comparable. For example, label-based targeting by rule matching is a special scoring method that returns only Boolean results; its scoring function takes values only in $\{0, 1\}$, which makes it difficult to compare with other branches. In the post-deep-learning period, scoring is done by models. The relevance modeling of the different model branches is similar, the value evaluation method is uniform, and the scores are comparable. However, because of the computing power and latency limits of the retrieval system, the recall model usually adopts a simple two-tower structure, which limits the model's expressive power. As hardware develops and more computing power becomes available, the scoring function is becoming more and more complex.

The core technical difficulties of the retrieval platform

The matching of people, goods, and scenarios in the retrieval system is completed by multiple OPs (operators).

Fig 2: The amount of data processed by the OP inside the retrieval system is funnel-shaped, which is used to balance the contradiction between massive data and limited computing power

Take the filtering OP as an example: it selects, from the set of ads that pass through the recall stage, those that satisfy the constraints of business rules. If filtering is abstracted as a scoring function with Boolean output, it can be stated as follows: for a business scoring function $g$, given the input $x$, find a fixed-size subset of the recalled set $S$ whose elements rank as high as possible in $S$ under $g$. Because the recall and business scoring functions differ, recalled ads may not meet the business requirements. An index that measures the difference between the scoring functions of the two stages also roughly measures how much computing power the recall stage wastes on evaluating invalid ads. In the pre-deep-learning period, scoring in the recall stage consumed little computing power, so the waste did not affect the overall retrieval efficiency of the system. In the post-deep-learning period, scoring models keep getting more complex and the per-candidate computation keeps growing, so the computing power wasted in recall can no longer be ignored.

The core technical difficulty of the retrieval platform: finding a balance between limited computing power and massive data processing. In order to alleviate the contradiction between massive data and limited computing power in the retrieval stage, the retrieval system explores the following aspects:

1. Computing power allocation: save computing power and latency in the computation-intensive stages of the retrieval system

2. Computing power optimization: improve the accuracy of relevance scoring and increase the business revenue generated per unit of computing power

3. Iteration efficiency: a configuration center that serves as a one-stop experiment platform to improve iteration efficiency

The rest of this article covers these three aspects and describes how the core capabilities of the retrieval system were built.

Fig 3: Retrieval Platform Core Capabilities. The blue squares are the links covered in this article

2. Main line 1: Beyond Serverless, a data-driven adaptive computing power optimization framework

The business logic of the retrieval system is complex and computationally intensive, and it spans many modules. Optimizing the computing power of each module individually does not amount to optimizing the computing power of the system as a whole. To optimize computing power allocation, the retrieval system has been upgraded from graph-based execution within a single module to a distributed execution graph that links upstream and downstream services.

2.1. From full graph-based execution to data-driven graph execution, approaching the ceiling of computing power optimization

The distributed execution graph is a data-driven, cross-service graph framework based on RPC calls. It breaks through the upstream and downstream dependencies between physical services, starts from data dependencies, and looks at the whole link to find the globally optimal use of computing power, pushing the automatic computing power allocation capability of JD's advertising retrieval system to an industry-leading level.

Difficulty: "Why manage subgraphs from the perspective of the whole?" The goal is to resolve the fragmentation between services (subgraphs). After the architecture entered the serverless era, the graphs became separated from one another, and iterative optimization inside a single service can hardly take the benefit of the overall architecture into account.

Innovation: "Data, data, data." From a top-down perspective, the advertising system as a whole is a function that relies on its own data sources and produces output for downstream consumers. From the bottom up, each subgraph is composed of multiple OP functions, and each OP is likewise a function of its input data. The key to clarifying the relationships between graphs, and between OPs, is to clarify the data dependencies at each level. The distributed execution graph divides the data dependency between OPs into three types: no dependency, partial dependency, and full dependency. OPs with no dependency on each other execute fully in parallel, and the concept of batches can be introduced between partially dependent OPs.

"One architecture, multiple perspectives." The complete distributed execution graph schedules automatically, executes serially or in parallel according to the business orchestration, and supports data-driven DAG expression, which fully releases computing power and keeps the system in its best state of computing power allocation. By fully exploring the data dependencies between the retrieval system and its upstream and downstream services, and by orchestrating the system's process-driven architecture automatically, the time consumed by the retrieval process is reduced by more than 16%, which frees up a great deal of computing power in the retrieval stage.

Fig 4: The distributed execution graph realizes one architecture with multiple perspectives
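The idea of deriving the execution order from data dependencies, rather than from a hand-written process, can be sketched in a few lines of Python. This is a minimal single-process illustration using the standard library's `graphlib`; the OP names and the `reads`/`writes` fields are hypothetical, and the real framework works across services over RPC.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical OPs, each declaring only the data it reads and writes.
OPS = {
    "parse_request": {"reads": set(), "writes": {"query"}},
    "user_profile": {"reads": set(), "writes": {"profile"}},
    "recall": {"reads": {"query", "profile"}, "writes": {"candidates"}},
    "filter": {"reads": {"candidates"}, "writes": {"valid"}},
    "budget_check": {"reads": {"candidates"}, "writes": {"affordable"}},
    "merge": {"reads": {"valid", "affordable"}, "writes": {"result"}},
}

def execution_batches(ops):
    """Group OPs into batches; OPs in the same batch have no data dependency
    on each other and could run in parallel."""
    producer = {field: name for name, op in ops.items() for field in op["writes"]}
    # An OP depends on whichever OP produces each field it reads.
    deps = {name: {producer[f] for f in op["reads"]} for name, op in ops.items()}
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches

batches = execution_batches(OPS)
```

Here `parse_request` and `user_profile` land in the same batch because neither reads the other's output, while `merge` waits for both `filter` and `budget_check`.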

2.2. Flexible system and intelligent computing power allocation help business grow steadily

JD's advertising retrieval platform manages hundreds of thousands of CPU cores and processes a large number of on-site and off-site requests every day. Traffic doubles during major promotions, and keeping the platform stable under that load is a huge challenge for JD's advertising retrieval system.

Difficulties:

• Diverse platforms. The businesses served by JD's retrieval platform include search advertising, recommendation advertising, first-focus advertising, and off-site advertising. The platforms differ in their retrieval processes and in their traffic volumes, so modeling each business separately carries a high labor cost

• Hardware heterogeneity. The JD Advertising cluster not only manages different kinds of computing hardware (CPU/GPU); performance also varies across purchase batches, brands, and models

Solution: consolidate computing power allocation into a basic capability that serves the advertising system across different periods and scenarios.

Mathematical modeling: After several years of iteration, the goal of JD's advertising elasticity system has shifted from maintaining the stability of the system through traffic peaks to increasing revenue through computing power allocation. The goals of modeling resilient systems have also changed.

Phase 1: PID elastic system to maintain system stability:

The system error is modeled as the difference between the current CPU utilization and the target CPU utilization, and a PID controller drives elastic degradation so that server CPU utilization reaches a preset level. Compared with the common approach of controlling QPS in order to control CPU indirectly, targeting CPU utilization is more direct. Machines with different performance show different CPU utilization under the same QPS, so modeling the CPU target accounts for the heterogeneous hardware behind the JD retrieval service and is more widely applicable.
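A minimal Python sketch of this control loop follows. The controller is a textbook PID; the gains, the degradation level in [0, 1], and the toy plant in `simulate` (CPU falls linearly as degradation rises) are assumptions made for illustration, not the production model.

```python
class CpuPidController:
    """Textbook PID on CPU utilization; output is a degradation level in [0, 1]."""

    def __init__(self, kp, ki, kd, out_min=0.0, out_max=1.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.out_min, self.out_max = out_min, out_max
        self.integral = 0.0
        self.prev_error = None

    def update(self, target_cpu, measured_cpu, dt=1.0):
        error = measured_cpu - target_cpu  # positive when the CPU is too busy
        self.integral += error * dt
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        raw = self.kp * error + self.ki * self.integral + self.kd * derivative
        return min(max(raw, self.out_min), self.out_max)

def simulate(controller, load=0.9, target=0.6, steps=60):
    """Toy plant (an assumption): full degradation halves the CPU cost of the load."""
    degrade, cpu = 0.0, load
    for _ in range(steps):
        cpu = load * (1.0 - 0.5 * degrade)
        degrade = controller.update(target, cpu)
    return cpu, degrade

final_cpu, final_degrade = simulate(CpuPidController(kp=0.5, ki=1.0, kd=0.0))
```

Because the controller observes CPU utilization directly, the same target works on fast and slow machines; only the degradation level it settles on differs.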

Phase 2: Rational use of the system's daily idle computing power to bring benefits to the system:

The traffic of the JD APP shows a two-peak pattern, one in the morning and one in the evening, and a large amount of redundant computing power sits idle during off-peak periods. The goal of Phase 2 is to maximize the value of traffic under the constraints of the system. The control lever is to lengthen or shorten the queue of each stage of the recall system. Under the new definition of system feedback, the goal of the system is to maximize the expected return per unit of computing power at a given time granularity. The modeling challenges are as follows:

• The value of traffic is difficult to define. The value evaluation function is trained with the policy's posterior uplift in revenue as the ground truth. The ad retrieval system uses clicks and spend as revenue metrics.

• A bad strategy can cause irreparable damage to the online system. The elastic policy is therefore pre-trained on offline data. In practice, it takes effect only within the safety boundary specified by the system, and a complete circuit breaker mechanism ensures that a more stable, conservative policy takes over if the elastic policy fails.

The elastic system based on revenue optimization is already used in day-to-day operation. At this stage its value evaluation function is still relatively simple, so it cannot yet be applied during major promotions. The goal of the next phase is to refine the value evaluation and apply the elastic system to maximize revenue during promotions.

Fig 5: Iteration roadmap of the JD Advertising elastic system

3. Main line 2: Race against time, and the efficient retrieval engine opens the ceiling of advertising effect

Maximizing the value of computing power within a limited time is the goal of the retrieval team. With limited hardware resources, it is difficult to traverse a pool of hundreds of millions of products and score each one with a model within 100 ms. The JD retrieval team drew on a large number of excellent public design documents from the industry, combined them with the actual situation of JD Advertising, and planned the iteration route of an efficient algorithmic retrieval engine. The overall plan is divided into 4 phases:

Phase 1: The algorithmic retrieval engine matrix begins to take shape

In the initial stage, the two-tower paradigm common across the Internet industry was reused to support the advertising business quickly. In subsequent iterations, technologies such as PQ index compression, business-based hierarchical retrieval, and full-database retrieval were accumulated step by step, bringing the system in line with the industry's advanced retrieval systems and effectively supporting the growth of JD's advertising business.

Phase 2: Efficiency is king, and data timeliness is greatly improved

How quickly the retrieval system perceives changes to materials has a significant impact on the engine's retrieval quality. The algorithmic retrieval engine was sped up so that the perception delay for hundreds of millions of materials is compressed to minutes, an industry-leading level.

Phase 3: Link target alignment, recall that supports modeling of arbitrary targets

The goal of an ad retrieval system is to select relevant ads for users, usually modeled with a single target such as CTR. The advertising system, however, usually evaluates an ad by its eCPM, i.e., bid x CTR. Inconsistency between the retrieval target and the system target costs the advertising system performance. Starting from the business, the ad retrieval team launched an ANN retrieval paradigm that supports maximizing arbitrary targets.

Phase 4: Scalar-vector hybrid retrieval, realigning the retrieval target

In the old retrieval + filtering architecture, some of the retrieved ad units were filtered out because they did not satisfy the constraints of business rules, which wasted computing power. As model structures grow deeper, the waste caused by this architecture is magnified. To save computing power, the retrieval team built scalar-vector hybrid retrieval and used pre-filtering to align the target of retrieval with that of downstream filtering, improving the revenue per unit of computing power.

3.1. From two towers to deep indexing, from following the industry to joining the first echelon

JD Advertising Retrieval has built an algorithmic retrieval engine product matrix from scratch, evolving from a simple tree index to a business-aware hierarchical index and then to a deep index, supporting efficient retrieval over hundreds of millions of advertisements on the platform.

"What is ANN?" To search a candidate set of hundreds of millions of ads within the allowed latency, the system usually uses Approximate Nearest Neighbor (ANN) search to avoid exhaustively scoring every ad in the candidate set. Commonly used tree indexes place vectors that are close in Euclidean distance near each other in the index. The leaf nodes of such an index correspond to advertisements, and the intermediate nodes are produced by clustering and have no physical meaning. Combined with a beam search of width K, the number of nodes to be scored on each layer of the index is at most K times the branching factor of the tree (2K for a binary tree), which reduces the amount of computation.

Fig 6: A binary-tree example illustrates beam search with k=1: the nodes P2,1 and P2,2 are scored and P2,1, which has the higher score, is selected. The process repeats layer by layer until SKU1 is finally recalled
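The beam search in the figure can be sketched in Python as follows. The tree layout, the node vectors, and the dot-product score are illustrative assumptions; the sketch also assumes a balanced tree in which all leaves sit at the same depth.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def leaf(sku, vec):
    return {"sku": sku, "vec": vec, "children": []}

def inner(vec, children):
    return {"sku": None, "vec": vec, "children": children}

def beam_search(root, user_vec, beam_width):
    """Descend layer by layer, keeping only the beam_width best nodes per layer.
    Returns the recalled SKUs and the number of nodes that were scored."""
    beam, scored = [root], 0
    while any(node["children"] for node in beam):
        frontier = [child for node in beam for child in node["children"]]
        scored += len(frontier)
        frontier.sort(key=lambda n: dot(user_vec, n["vec"]), reverse=True)
        beam = frontier[:beam_width]
    return [node["sku"] for node in beam], scored

# Hypothetical two-layer binary tree shaped like the example in the figure.
p21 = inner([1.0, 0.0], [leaf("SKU1", [0.9, 0.1]), leaf("SKU2", [0.6, 0.4])])
p22 = inner([0.0, 1.0], [leaf("SKU3", [0.1, 0.9]), leaf("SKU4", [0.3, 0.7])])
root = inner([0.0, 0.0], [p21, p22])

recalled, nodes_scored = beam_search(root, [1.0, 0.0], beam_width=1)
```

With `beam_width=1` only two nodes are scored per layer, so four of the six non-root nodes are evaluated instead of all of them; the saving grows quickly with tree depth.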

"Innovation: business-based hierarchical indexing." In JD Advertising's specific scenarios, users show clear intent, which can effectively help the retrieval system narrow down the candidate set. Using this business knowledge, JD Advertising launched a multi-level vector index based on business understanding. In search, for example, a user query carries the user's explicit intent. If ads are partitioned offline by user intent, only the relevant partitions need to be searched online. This not only reduces the amount of retrieval computation but also reduces the bad cases introduced by model generalization. Using tree indexes within each partition further reduces the latency and computational overhead of retrieval.

Fig 7: Hierarchical recall combined with business knowledge: the index is partitioned by business, and the recall phase only needs to search the indexes under the specified partitions
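A minimal Python sketch of partitioned retrieval is shown below. The `intent` field, the partition keys, and the brute-force scoring inside each partition are assumptions for illustration; in the system described here, each partition would hold its own tree index instead.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def build_partitions(ads):
    """Offline step: group ads by the intent they serve (hypothetical field)."""
    partitions = {}
    for ad in ads:
        partitions.setdefault(ad["intent"], []).append(ad)
    return partitions

def retrieve(partitions, intents, user_vec, k):
    """Online step: score only ads in the partitions matching the user's intent."""
    candidates = [ad for intent in intents for ad in partitions.get(intent, [])]
    candidates.sort(key=lambda ad: dot(user_vec, ad["vec"]), reverse=True)
    return [ad["id"] for ad in candidates[:k]], len(candidates)

ads = [
    {"id": "phone_1", "intent": "phone", "vec": [0.9, 0.1]},
    {"id": "phone_2", "intent": "phone", "vec": [0.5, 0.5]},
    {"id": "laptop_1", "intent": "laptop", "vec": [1.0, 0.0]},
    {"id": "shoes_1", "intent": "shoes", "vec": [0.8, 0.2]},
]
partitions = build_partitions(ads)
top_ids, n_scored = retrieve(partitions, ["phone"], [1.0, 0.0], k=2)
```

Note that `laptop_1` has the highest raw score for this user vector but is never scored, because the query intent rules its partition out; this is the "fewer bad cases from model generalization" effect in miniature.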

"Full-database index." The JD retrieval system manages many branches covering hundreds of millions of advertisements, and each advertisement is represented by a multi-dimensional floating-point vector, which occupies considerable memory in the retrieval system. Over the iterations of the retrieval engine, the tree index was gradually replaced by a flat Product Quantization (PQ) index. PQ converts high-dimensional floating-point vectors into low-dimensional integer vectors; the measured memory compression rate is as high as 85%, which greatly increases the capacity of the retrieval system. Thanks to the computing power saved by PQ, the flat index uses brute-force computation instead of the tree index's beam search, achieving full-database retrieval.
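The following NumPy sketch shows the general PQ technique: split each vector into sub-vectors, learn a small codebook per subspace, store one byte per subspace, and score the whole database by table lookup. The dimensions, codebook sizes, and the simple k-means loop are assumptions chosen for brevity; they are not JD's parameters and will not reproduce the 85% figure quoted above.

```python
import numpy as np

def train_codebooks(vectors, n_subspaces, n_centroids, n_iters=10, seed=0):
    """Learn one small k-means codebook per subspace (plain Lloyd iterations)."""
    rng = np.random.default_rng(seed)
    n, dim = vectors.shape
    sub_dim = dim // n_subspaces
    codebooks = np.empty((n_subspaces, n_centroids, sub_dim), dtype=np.float32)
    for m in range(n_subspaces):
        sub = vectors[:, m * sub_dim:(m + 1) * sub_dim]
        centroids = sub[rng.choice(n, n_centroids, replace=False)].copy()
        for _ in range(n_iters):
            dists = ((sub[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
            assign = dists.argmin(1)
            for c in range(n_centroids):
                members = sub[assign == c]
                if len(members):
                    centroids[c] = members.mean(0)
        codebooks[m] = centroids
    return codebooks

def encode(vectors, codebooks):
    """Replace each sub-vector by the index of its nearest centroid (one byte)."""
    n_subspaces, _, sub_dim = codebooks.shape
    codes = np.empty((len(vectors), n_subspaces), dtype=np.uint8)
    for m in range(n_subspaces):
        sub = vectors[:, m * sub_dim:(m + 1) * sub_dim]
        dists = ((sub[:, None, :] - codebooks[m][None, :, :]) ** 2).sum(-1)
        codes[:, m] = dists.argmin(1)
    return codes

def search(query, codes, codebooks, k):
    """Brute-force scan of the whole database using per-subspace lookup tables."""
    n_subspaces, _, sub_dim = codebooks.shape
    # table[m, c] = inner product of query sub-vector m with centroid c
    table = np.einsum("mcd,md->mc", codebooks, query.reshape(n_subspaces, sub_dim))
    scores = table[np.arange(n_subspaces), codes].sum(axis=1)
    return np.argsort(-scores)[:k]

rng = np.random.default_rng(1)
database = rng.standard_normal((1000, 32)).astype(np.float32)
codebooks = train_codebooks(database, n_subspaces=8, n_centroids=16)
codes = encode(database, codebooks)
top10 = search(database[0], codes, codebooks, k=10)
```

In this toy setup each vector shrinks from 128 bytes of float32 to 8 bytes of codes (codebooks excluded), and scoring an ad costs 8 table lookups instead of 32 multiplications, which is what makes a flat full-database scan affordable.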

"Deep indexing." The two-tower model is widely used in the recall stage because the vectors it generates are suited to nearest neighbor retrieval. However, the representation ability of the model, i.e., its scoring ability, is limited in the following ways:

• The independence of the two towers leads to insufficient interaction between user-side and item-side features

• The upper-layer matching function is a vector dot product, which limits the expressive power of the model

To address these shortcomings, JD Advertising Retrieval launched an EM-based deep index. The new index breaks through the structural constraints that traditional indexes place on the two-tower model. The algorithm can iterate not only vertically, with a more complex representation function, but also laterally, with a more complex matching function, and user and ad features can be fused at any stage.

Fig 8: Deep indexing supports a new iterative mode of recall algorithms

It is worth noting that the recall model no longer adheres to the two-tower paradigm: it no longer assumes that users and ads are mapped into the same vector space, so the vectors it generates are no longer suited to approximate nearest neighbor retrieval.

The essence of ad retrieval is to find a path to high-value ads on the index. The tree index of the two-tower model, as a special case of a deep index, determines the path from the root node to a leaf node by vector dot products, without the model taking part. In a deep index the path is determined by model scores, and the goal is to maximize the value of the path to the advertisement.

Fig 9: Vector index building and recall process abstraction

3.2. Recall architecture upgrade, the ultimate pursuit of data timeliness

"Why pursue timeliness?" JD advertising retrieval is the most upstream stage of the advertising link, and the timeliness of its data greatly affects the marketing effect of the whole link.

Difficulty: JD Advertising serves a large number of advertisers and covers hundreds of millions of advertisements, and every minute the status of some advertisements changes because of advertiser operations and other factors. During traffic peaks, when every second counts, the time an advertiser's action takes to go into effect directly affects that advertiser's marketing results. Added to the resource consumption of day-to-day retrieval, this places higher demands on the platform's capabilities and is a huge challenge at JD's scale of hundreds of millions of advertisements.

"Industry-leading: ads take effect within minutes." The JD Advertising Retrieval System supports minute-level updates of advertising information and reflects them in the algorithm index. Index construction follows a full + incremental approach: the full build quickly indexes only the advertisements that are in effect, and changes to advertising information after the full build are reflected in increments. The upstream data pipeline of the index brings the idea of the data lake into index construction, which reduces the time needed to build a full index and shortens the delay before an index takes effect. It is also highly scalable and can efficiently build and manage various forms of indexes for the retrieval system, such as vector indexes and KV indexes.

3.3. Link alignment, retrieval efficiency is further improved

"Why align the link?" From a top-down perspective, target alignment across the link has two levels:

• From the perspective of the advertising system, retrieval is responsible for selecting ads relevant to the user, while the goal of downstream stages such as coarse and fine ranking is to maximize the eCPM of candidate ads. The misalignment of targets across the modules of the advertising system limits its overall revenue

• The targets of the multiple OPs inside the retrieval system are inconsistent, so retrieval results fall into local optima, which affects the iteration of the entire advertising system

"Arbitrary-target recall": one model, multiple uses. Only a small modification of the vectors is required to complete a "painless transformation" of the retrieval target.

Difficulty: Changing the retrieval target would normally require changing how the model is trained, which is extremely costly.

Innovation: Taking a CTR model as an example, the pCTR estimated by the two-tower model is computed from the inner product of the user vector and the ad vector. Appending a single dimension to each original vector is enough to change the retrieval target from maximizing CTR to maximizing eCPM: one extra component is added to the user vector and one to the item vector, chosen so that, as a short mathematical derivation shows, the inner product of the modified vectors approximates eCPM. The dot product of the modified user vector and ad vector is positively correlated with eCPM, so the ads retrieved by ANN are the top-k by eCPM, which aligns the retrieval system with the overall goal of the advertising system.
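One construction consistent with this description can be checked numerically. The sketch below is an assumption, not the article's exact formula: it takes pCTR to be the sigmoid of the inner product, uses the fact that the sigmoid is close to exp(x) when CTR is small, appends 1 to the user vector, and appends log(bid) to each ad vector, so the augmented inner product equals log(bid) + u·v, which is the logarithm of bid x exp(u·v).

```python
import numpy as np

rng = np.random.default_rng(0)
user = rng.standard_normal(8)
ads = rng.standard_normal((100, 8))
bids = rng.uniform(0.1, 5.0, size=100)  # hypothetical bids

# Assumed augmentation: one extra dimension on each side.
user_aug = np.append(user, 1.0)                    # u' = (u, 1)
ads_aug = np.hstack([ads, np.log(bids)[:, None]])  # v' = (v, log bid)

aug_scores = ads_aug @ user_aug          # equals u.v + log(bid)
proxy_ecpm = bids * np.exp(ads @ user)   # bid x exp(u.v), the small-CTR proxy

top_by_aug = np.argsort(-aug_scores)[:10]
top_by_proxy = np.argsort(-proxy_ecpm)[:10]
```

Because the logarithm is monotonic, ranking by the augmented inner product gives the same top-k as ranking by the proxy eCPM, so an unchanged inner-product ANN index can serve the new target. How closely the proxy tracks the true bid x sigmoid(u·v) depends on how small the CTRs are.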

"Vector Scalar Mixed Inspection" builds a first-class retrieval engine with business expression capabilities and industry

Difficulty: It is difficult for a vector retrieval engine to express business filtering requirements during the retrieval stage. To meet advertisers' requirements, retrieval systems often adopt an architecture of vector engine + scalar filtering.

Innovation: JD Advertising abstracts the vector index structure into an interest layer and a business layer. Nodes in the business layer are usually advertisements and have a physical meaning. Nodes in the interest layer are intermediate products of the path and have no physical meaning. Taking the two-tower index as an example, a leaf node represents an advertisement, and the status of the advertisement (online/offline) should directly determine whether the leaf node can be retrieved. An intermediate node represents an implicit interest abstracted by clustering advertisements, and it is not affected by the state of advertisements at the business layer.

Fig 10: Abstraction of index structure and abstraction of the retrieval process

To reduce the computing power wasted on invalid nodes, scalars are introduced at the leaf nodes so that the recall stage avoids computing invalid leaf nodes and the retrieval queue still holds enough valid results. Scalar-vector hybrid retrieval not only improves the revenue per unit of computing power but also aligns the target of the recall OP with those of the downstream OPs, improving the overall efficiency of the retrieval system.
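The difference between filtering after retrieval and filtering before scoring can be sketched with NumPy on a flat list of leaves. The Boolean `valid` array stands in for the scalar attached to each leaf (for example, whether the ad is online); the function names and data are hypothetical.

```python
import numpy as np

def post_filter_topk(user, ad_vecs, valid, k):
    """Old style: score every leaf, take the top-k, then drop invalid ads."""
    scores = ad_vecs @ user
    top = np.argsort(-scores)[:k]
    return [int(i) for i in top if valid[i]], len(ad_vecs)

def pre_filter_topk(user, ad_vecs, valid, k):
    """Hybrid style: check the scalar first, score only valid leaves."""
    valid_ids = np.flatnonzero(valid)
    scores = ad_vecs[valid_ids] @ user
    order = np.argsort(-scores)[:k]
    return valid_ids[order].tolist(), len(valid_ids)

user = np.array([1.0])
ad_vecs = np.array([[0.9], [0.8], [0.7], [0.6], [0.5], [0.4]])
valid = np.array([False, True, False, True, True, True])

post_result, post_scored = post_filter_topk(user, ad_vecs, valid, k=3)
pre_result, pre_scored = pre_filter_topk(user, ad_vecs, valid, k=3)
```

Post-filtering scores all six ads and returns a single valid one, while pre-filtering scores four ads and fills the queue with three valid results, which is the saving in wasted computation and the queue-length guarantee described above.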

4. Main line 3: Platform power: platform infrastructure releases R&D productivity

4.1. Sharpening knives and cutting wood: starting from the challenges faced by JD's advertising business iteration

"Intensive business iteration." The JD advertising retrieval platform carries complex business logic and is both compute-intensive and IO-intensive. It provides online advertising retrieval services for the JD APP, JD Mini Program, JD PC site, and other clients. The platform's importance also determines the intensity of its iteration: the retrieval code base averages 600+ merges per year, and full releases of code or configuration exceed 500 per year on average.

"The challenge of R&D capacity and efficiency." Fast and stable iteration of the retrieval system must be backed by sufficient R&D capacity. Each vertical business line (search/recommendation/first focus/off-site) includes business architecture and algorithm strategy development. The horizontal modules that cut across business lines (recall/creative/bidding) also include their own platform business architecture and algorithm strategy development, as well as system R&D, testing, and so on. The platform supports simultaneous development by a diverse R&D team of nearly 300 people. To keep the business developing healthily and sustainably, it needs an experiment throughput of hundreds to thousands and accurate, easy-to-use insight and analysis tools.

"The challenge of urgent iteration for major promotions." JD.com faces more than two major promotions a year (618, Double 11, etc.), and the logic unique to each promotion must be implemented quickly. Urgent iterations challenge the robustness and readability of the system code.

4.2. Adapting to all scenarios: platform-based support for diversified business development and customization

"Business System Layering" divides the online system into a system architecture layer, a business framework layer and a business algorithm strategy customization layer, and the three layers of iteration are independent of each other. In this way, business R&D focuses on business logic orchestration and strategy itself, while system R&D focuses on infrastructure optimization.

Fig 11: JD.com's Advertising Business System Layered Framework

"Operator-based design of the business framework." The cornerstone of a healthy system is a robust framework. A complex business system is divided into multiple operators (OPs) by function, which gives the system clear boundaries and also classifies and abstracts business strategies. As the atomic unit of the system, an OP has well-defined input and output data and a clear business role. Atomic operators follow these rules:

1. Clear data dependence: each OP has its own input and output, and the input is read-only

2. Observability modeling of OP: the OP records runtime DEBUG/TRACE data for easy debugging, monitoring, and analysis

3. Configurability modeling of OP: the scope of a configuration item is limited to one OP, and it controls an independent function or function parameter

"Pluggable policy customization" provides each business operator with extension points for customizing business policies, which can be plugged in and removed flexibly. The design combines class composition with divide-and-conquer of functions: a single function point is extracted from the OP and managed by a separate extension point class, which makes each function more cohesive.

Fig 12: An example of an extension point for JD Smart Bidding OP
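The operator rules and the extension-point idea in this section can be sketched as a small Python class. Everything here is hypothetical: the `Operator` base class, the `BiddingOp` example, the `bid_ratio` configuration key, and the `adjust_bid` extension point are names invented for illustration.

```python
from types import MappingProxyType

class Operator:
    """Sketch of an atomic OP: read-only input, OP-scoped config, runtime trace."""
    name = "op"

    def __init__(self, config=None, extensions=None):
        self.config = dict(config or {})          # configuration scoped to this OP
        self.extensions = dict(extensions or {})  # pluggable extension points
        self.trace = []                           # runtime DEBUG/TRACE records

    def run(self, inputs):
        # MappingProxyType blocks top-level writes only; nested values stay mutable.
        output = self.process(MappingProxyType(inputs))
        self.trace.append({"op": self.name, "inputs": sorted(inputs), "outputs": sorted(output)})
        return output

    def process(self, inputs):
        raise NotImplementedError

class BiddingOp(Operator):
    name = "bidding"

    def process(self, inputs):
        ratio = self.config.get("bid_ratio", 1.0)
        # Extension point: a business line may plug in its own bid adjustment.
        adjust = self.extensions.get("adjust_bid", lambda ad_id, bid: bid)
        return {"bids": {ad: adjust(ad, bid * ratio) for ad, bid in inputs["base_bids"].items()}}

op = BiddingOp(config={"bid_ratio": 2.0},
               extensions={"adjust_bid": lambda ad_id, bid: min(bid, 3.0)})
result = op.run({"base_bids": {"a": 1.0, "b": 2.0}})
```

Removing the `adjust_bid` entry restores the default behaviour without touching `BiddingOp` itself, which is the plug-in and unplug property the text describes.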

4.3. Beyond the limit: one-stop configuration management and an ultra-fast experiment and release platform with very large capacity

"Configuration modeling from a new perspective." Moving beyond general KV modeling, JD models business configuration on three elements: configuration item (Key), configuration condition (Condition), and configuration value (Value). Compared with a KV configuration component, JD.com's new configuration component is more flexible and more customizable. Taking the bidding OP for the recommendation business as an example, you only need to set the Key to the item that acts on the bidding OP, set the Condition to the recommendation business, and set the Value to 1. In this model, a configuration Key belongs to one operator (OP), and the Condition can be customized and extended with traffic and business identifiers, so it can evolve as the business develops. The configuration carries business semantics and is easy to understand.

All of the above configuration changes can be turned into experiments with one click, which gives the online system a large experiment capacity. The configuration system of online advertising is linked to the layered experiment platform, and each operator can run 20+ layered experiments at the same time. JD Advertising runs more than 400 experiments a day, and in theory layered experiments provide unlimited experiment capacity, which is enough to meet the daily iteration needs of a product and R&D team of more than 300 people. Because a configuration change can directly start an experiment with one click, the burden of configuration development and testing on R&D staff is greatly reduced, and switching experiments becomes more flexible and controllable.

"One-stop configuration management and release." Managing configuration usually requires a full understanding of its current state. Presenting all configurations in a unified management interface makes unified configuration management more convenient and makes the configuration more readable, so that colleagues without development skills can understand how the ad retrieval system processes business at any time and run experiments with one click. One-stop configuration changes are also tracked and hosted through the whole R&D cycle of self-test, joint debugging, small traffic, full rollout, and holdback, which removes the trouble of switching between multiple platforms.

Fig 13: All-in-one configuration management and publishing interface
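The Key/Condition/Value model and its link to layered experiments, as described in this section, can be sketched in Python. The rule format, the "most specific condition wins" tie-break, the hash-based bucketing, and all names (`bidding_op.smart_bid`, `bidding_layer`, `ARMS`) are assumptions for illustration, not the platform's actual schema.

```python
import hashlib

def bucket(user_id, layer, n_buckets=100):
    """Hash a user into a bucket; salting with the layer name keeps layers independent."""
    digest = hashlib.md5(f"{layer}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % n_buckets

def experiment_arm(user_id, layer, arms):
    b = bucket(user_id, layer)
    for name, (low, high) in arms.items():
        if low <= b < high:
            return name
    return None

def resolve(rules, key, traffic):
    """Return the Value of the matching rule with the most specific Condition."""
    best_value, best_specificity = None, -1
    for rule in rules:
        cond = rule["condition"]
        matches = all(traffic.get(field) == want for field, want in cond.items())
        if rule["key"] == key and matches and len(cond) > best_specificity:
            best_value, best_specificity = rule["value"], len(cond)
    return best_value

KEY = "bidding_op.smart_bid"  # hypothetical Key owned by the bidding OP
RULES = [
    {"key": KEY, "condition": {}, "value": 0},
    {"key": KEY, "condition": {"business": "recommendation"}, "value": 1},
    {"key": KEY, "condition": {"business": "recommendation", "bidding_layer": "treatment"}, "value": 2},
]
ARMS = {"control": (0, 50), "treatment": (50, 100)}

def config_for(user_id, business):
    traffic = {"business": business,
               "bidding_layer": experiment_arm(user_id, "bidding_layer", ARMS)}
    return resolve(RULES, KEY, traffic)
```

In this sketch an experiment is just one more field in the Condition, so turning a configuration change into an experiment means adding a rule rather than changing code; each layer hashes users independently, which is what lets many layered experiments run at once.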

4.4. Insight Expert: An online insight system that can be traced and attributed

"Debuggable and traceable." Traceability is a strong requirement for the online advertising retrieval system. For R&D, the online insight system provides a DEBUG mode:

• Selectable debugging: you can choose to track a specific ad or a specific stage

• Real-time: DEBUG data is generated immediately

• Comprehensive: the intermediate data of every business in every module is recorded in full

TRACE mode for operations staff and advertisers:

• Online request tracking: the result data of any online request can be traced across the whole system

• Request replay: requests can be resent in real time to speed up problem locating

• Algorithmic re-analysis: for TRACE log data, algorithms can be embedded in the re-analysis module, and commonly used statistical analysis and attribution scripts can be customized in the system to improve analysis efficiency. For example, low-price diagnosis in search advertising often analyzes the price quantile and category diversity of a SKU within the candidate queue of the same request

In addition to the DEBUG/TRACE modes, the attribution and diagnosis part of the online insight system also provides a funnel insight mode. Funnel statistics over the whole link of the system play an extremely important role in analyzing problems and verifying strategies. Online Insights provides funnel visualization tools under custom traffic filtering, which has brought great benefits to many advertising businesses.

5. Summary and outlook

Through the three main lines above, the advertising retrieval system has maximized business revenue per unit of computing power and effectively supported the business growth of JD Advertising. Going forward, colleagues in the System Technology Department will continue to improve the advertising retrieval architecture along these three main lines: computing power, retrieval efficiency, and iteration efficiency. Interested readers are welcome to join us, grow together, and contribute to the development of JD's advertising business.
