DTC replenishment practice: from algorithm to implementation

"I hope the value of our AI team is measured not by how deep its algorithms are, but by its ability to close the distance between algorithm and implementation."

In recent years, as the e-commerce industry has shifted from a market driven by new demand to a saturated market of existing customers, many stores have reached a certain scale and their revenue growth has begun to slow.

The supply chain is like the supply of provisions in ancient warfare. One or two blitzkriegs may not reveal its importance, but in any protracted, large-scale campaign the rule is that provisions move before the troops do.

The same holds in business today. While a company is small and busy growing fast and capturing market share, the supply chain may get little attention. Once the company reaches a certain scale, however, the supply chain becomes its heart: it sustains a more stable flow of business with less capital tied up, and becomes one of the conditions for the company to remain "invincible".

Fortunately, while working with a large key account (KA) customer, we explored replenishment in DTC scenarios in depth: starting from classic safety stock theory, adapting the design to the scenario, estimating the expected value, and finally rolling the solution out and optimizing the business metrics. This article shares our thinking and practice. We hope to exchange ideas with readers and learn more along the way :)

Algorithm design

Supply chain has been developing as a discipline in China for about 20 years. At the methodological level, we hold in high regard the theory distilled in Mr. Liu Baohong's "Three Lines of Defense of Supply Chain". From forecasting to replenishment to procurement, and from algorithms to processes to organizational structure, it offers deep insights, and it has let us stand on the shoulders of giants to create value.

In replenishment, the classic and most widely used approach in the industry is safety stock theory, which rarely requires complex operations research or reinforcement learning. Assuming the lead time is constant, the general safety stock calculation can be expressed as follows:

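$$ SS = z \cdot \sigma_d \cdot \sqrt{LT} $$

where $SS$ is the safety stock, $z$ is the standard normal quantile for the target service level, $\sigma_d$ is the standard deviation of daily demand, and $LT$ is the lead time in days.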

The underlying logic is illustrated in the figure below. Suppose future sales follow a normal distribution. The output of our AI model is typically the expected value of future sales, so if we replenish exactly to the predicted value, there is a 50% probability of running out of stock.

What we want instead is for inventory not to run out at a specified service level, which amounts to taking the corresponding quantile of the distribution. That covers replenishment based on a one-day forecast. Because a replenishment order placed today may not arrive until several days later, safety stock must cover multiple days, and under the assumption that sales are normally distributed this gives the formula above.

Figure 1: Statistical relationship between forecast, safety stock, and service level
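To make the relationship between forecast, service level and safety stock concrete, here is a minimal sketch of the textbook constant-lead-time calculation, SS = z * sigma_d * sqrt(LT). The function name and the numbers are illustrative assumptions, not the production implementation.

```python
from math import sqrt
from statistics import NormalDist


def safety_stock(service_level: float, sigma_daily: float, lead_time_days: int) -> float:
    """Textbook safety stock, assuming i.i.d. normal daily demand and a constant lead time."""
    z = NormalDist().inv_cdf(service_level)  # standard normal quantile
    return z * sigma_daily * sqrt(lead_time_days)


# Hypothetical inputs: daily standard deviation of 20 units, 3-day lead time.
for level in (0.50, 0.90, 0.95, 0.99):
    print(f"service level {level:.0%}: safety stock {safety_stock(level, 20.0, 3):.1f}")
```

At a 50% service level the safety stock is zero, which matches the observation that replenishing exactly to the forecast leaves a 50% chance of a shortage.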

Furthermore, sales within a decision cycle (for example, replenishing every three days or scheduling production once a week) generally do not follow a normal distribution. But as long as the forecasting model is chosen well, its error is often close to normally distributed, and safety stock can be calculated in the same way from the error distribution. For example, in some scenarios sales follow a weekly cycle from Monday to Sunday, so we can add a day-of-week feature to the model.

Simple as the reasoning is, applying it to the actual business exposes a serious problem with the normal distribution assumption: it requires the prediction error on every future day to follow the same normal distribution, which is hard to achieve in the DTC scenario.

1.1 Peculiarities of DTC

E-commerce has developed to the point where almost everyone feels that there is a promotion every day and a coupon to collect every day. The data readily shows the following characteristics:

Promotions are frequent. The customer in this project runs its own promotions about 4 times a month on average, each lasting 3 to 4 days, not counting platform programs such as Juhuasuan and the Ten Billion Subsidy;
Promotions stimulate sales significantly. Regardless of how frequent or how deep a promotion is, sales rise relative to non-promotion days. Promotions on special days, such as the 38, 88 and 99 promotions, the Qixi Festival and the New Year shopping festival, lift sales significantly more than ordinary promotions. Once a promotion ends, sales return to the everyday level almost immediately;
Promotions resemble one another. Whether periodic, frequent promotions or promotions on special festivals, they are similar in total volume, in the share of sales on each day of the promotion, and in the sales mix across products. The accuracy and error distribution of the same model are also similar across promotions;
Some promotions are plan-driven. Double 11, 618, and livestream events with top anchors are strongly plan-oriented. Brands and partners agree on a clear sales plan in advance, and the entire supply chain is usually steered by the agreed plan rather than by forecasts from historical sales.

Because of the last characteristic, we leave the 618 and Double 11 promotions out of scope in this project. For the rest of the year, it follows that no single forecasting model can make the prediction error follow the same distribution every day, so classic safety stock theory cannot be applied directly. So what do we do?

1.2 Safety stock calculation based on joint error distribution

Given the characteristics of DTC, we can at least model promotion days and non-promotion days separately. When each model is chosen appropriately, we can ensure that the prediction error of each individual model follows a normal distribution. Following the basic idea of safety stock theory above, once we obtain the joint distribution of the models' errors over the lead time and take the safety stock at the required service level, the problem is solved.

Note that the joint error distribution depends on how many promotion and non-promotion days the lead time covers. For example, a lead time covering two promotion days and one non-promotion day yields a different safety stock, at the same service level, than one covering one promotion day and two non-promotion days.
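The dependence on the promotion mix can be sketched as follows. This is a simplified illustration that assumes daily errors are independent and zero-mean, so that the error over the lead time is normal with variance equal to the sum of the daily variances. The function name and the sigma values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def joint_safety_stock(service_level: float,
                       promo_days: int, normal_days: int,
                       sigma_promo: float, sigma_normal: float) -> float:
    """Safety stock from the summed error of two models over the lead time.

    Assumes independent, zero-mean normal errors per day, so variances add.
    """
    sigma_total = sqrt(promo_days * sigma_promo ** 2 + normal_days * sigma_normal ** 2)
    return NormalDist().inv_cdf(service_level) * sigma_total


# Hypothetical error scales: promotion days are much noisier than ordinary days.
two_promo = joint_safety_stock(0.95, promo_days=2, normal_days=1,
                               sigma_promo=60.0, sigma_normal=20.0)
one_promo = joint_safety_stock(0.95, promo_days=1, normal_days=2,
                               sigma_promo=60.0, sigma_normal=20.0)
print(f"2 promo + 1 normal day: {two_promo:.1f}")
print(f"1 promo + 2 normal days: {one_promo:.1f}")
```

If the errors of the two models are correlated or biased, the variance sum no longer holds and the joint distribution would have to be estimated from backtest residuals instead.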

Figure 2: DTC promotions are frequent, and daily sales vary dramatically

Based on this idea, we developed a replenishment system. For the discussion that follows, we define three modules:

Prediction model: a blend of promotion and non-promotion models.
Decision model: currently a safety stock model based on the joint error distribution. Any method that produces a replenishment plan counts as a decision model, including but not limited to operations research optimization and reinforcement learning.
Simulation model: a set of business rules that takes forecasts, a decision model, and historical data or future forecasts as input and simulates historical or future business performance.

An in-depth discussion of these three modules deserves a separate article. Here we focus on how replenishment built this way can be rolled out in DTC.

We verify the model's effectiveness through simulation backtesting and tune it with a simulation-based hyperparameter search. Using historical sales, inventory, and a set of business operating rules, we simulate how the business metrics would have changed had we replenished our way under the same flow of business. Figure 3 below shows the simulated inventory days and out-of-stock rate under our replenishment method over a certain period. Compared with the metrics achieved by manual operation, the improvement exceeds 20%.

Figure 3: Simulated inventory days vs. manual inventory days
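The idea of a simulation backtest can be illustrated with a toy simulator. The sketch below is not the system described here: it assumes a daily review, an order-up-to decision rule with a single error sigma, a lead time of at least one day, and simple metric definitions (share of days with unmet demand, and average closing stock divided by average daily sales). All names are hypothetical.

```python
import random
from collections import deque
from math import sqrt
from statistics import NormalDist, mean


def simulate(sales, forecasts, sigma, service_level, lead_time, start_inventory):
    """Replay historical sales against an order-up-to policy and report two metrics."""
    protection = lead_time + 1  # days an order must cover under daily review
    safety = NormalDist().inv_cdf(service_level) * sigma * sqrt(protection)
    on_hand = float(start_inventory)
    pipeline = deque([0.0] * lead_time)  # open orders, oldest first
    stockout_days = 0
    closing_stock = []
    for day, demand in enumerate(sales):
        on_hand += pipeline.popleft()  # receive the order placed lead_time days ago
        target = sum(forecasts[day:day + protection]) + safety
        position = on_hand + sum(pipeline)  # stock on hand plus stock on order
        pipeline.append(max(0.0, target - position))
        if demand > on_hand:
            stockout_days += 1
        on_hand = max(0.0, on_hand - demand)
        closing_stock.append(on_hand)
    return {
        "stockout_rate": stockout_days / len(sales),
        "inventory_days": mean(closing_stock) / mean(sales),
    }


# Hypothetical history: flat forecast of 100 units/day with noisy actual sales.
random.seed(0)
forecasts = [100.0] * 60
sales = [max(0.0, random.gauss(100, 20)) for _ in forecasts]
for level in (0.90, 0.95, 0.99):
    print(level, simulate(sales, forecasts, 20.0, level, 3, 400.0))
```

Running the same history through several service levels is the simplest form of a simulation-based hyperparameter search: each setting yields a trade-off between out-of-stock rate and inventory days that can be compared against the manual baseline.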

In detail, as Figure 4 shows, the cadence of AI replenishment is fairly steady, and the replenishment quantity follows fluctuations in forecast sales. Inventory also rises and falls in the steady pattern of a textbook safety stock diagram, which looks elegant :)

Figure 4: Replenishment value and inventory performance under simulation

Note that the simulation result is only a theoretical value. Production involves all kinds of abnormal factors, so the theoretical value is usually just the upper bound of what can be achieved. Nevertheless, we have at least verified that the algorithm is worth rolling out. What remains is to keep the loss between the theoretical value and the realized value as small as possible during rollout.

Preparation before rollout

Rolling out an algorithm also disrupts day-to-day business management considerably. To ensure a high-quality rollout, we managed the process with a combination of MLOps and simulation. Before launch, our client raised two concerns:

The model performed well in historical simulation backtests, but how can we ensure its performance stays stable in the future?
DTC carries a large number of products. Is there a principled way to launch in batches and ensure that every batch performs well?

We've got you covered.

2.1 Model performance stability

A basic tenet of MLOps is that development is relatively cheap, while continuously operating and maintaining a model in production is expensive. A model that works well on history is only the first step. The bigger challenges are ongoing O&M after go-live and the inevitable degradation of model performance. For our replenishment model, as for any ML model, we need to monitor continuously and intervene when performance drops significantly, whether by retraining the model or adjusting the input data.

Figure 5: Model performance monitoring concept for MLOps

This is especially true in industries where the business environment changes constantly: any strategy adjustment or market shift may degrade model performance. For example, livestreaming and Taobao affiliate (Taoke) marketing were hot DTC channels a few years ago, but their popularity has declined in recent years and so has their effect on sales.

Figure 6: Model performance assurance system combined with the data centric concept

Accordingly, we take inventory days and out-of-stock rate, the two metrics business users care about most, as the core monitoring indicators. In addition, simulation lets us issue early warnings.
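A monitoring rule on these indicators can be as simple as comparing a recent window with a baseline. The following sketch is one hypothetical way to do it. The window length, tolerance and function name are assumptions, and it applies to metrics where lower is better, such as out-of-stock rate or inventory days.

```python
from statistics import mean


def degradation_alert(history, baseline, window=7, tolerance=0.2):
    """Return True when the recent mean of a lower-is-better metric
    exceeds the baseline by more than the given relative tolerance."""
    if len(history) < window:
        return False  # not enough data to judge
    recent = mean(history[-window:])
    return recent > baseline * (1 + tolerance)


# Hypothetical daily out-of-stock rates after go-live, with a backtest baseline of 5%.
stable = [0.05] * 10
drifting = [0.05] * 3 + [0.09] * 7
print(degradation_alert(stable, baseline=0.05))
print(degradation_alert(drifting, baseline=0.05))
```

An alert of this kind only says that something changed. Whether the response is retraining the model or fixing the input data still needs the attribution work described later.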

As mentioned above, the model obtained through simulation optimization performed well over a certain period. We can use the same hyperparameters to simulate other historical periods and observe how stable the same decision method is.

In Figure 7 below, we simulate another historical period with the same hyperparameters. The AI's inventory days and out-of-stock rate are close to their averages in the previous period, and they remain stable and better than manual performance. This indicates that a model with a fixed set of decision hyperparameters is relatively stable across time periods.

Figure 7: Simulated inventory days for different time periods under the same set of decision hyperparameters

Stability warnings for periodic special events. In the same simulation backtest shown in Figure 7, careful readers may notice three peaks in the AI out-of-stock rate. Checked against DTC's promotion calendar, they correspond to the Qixi Festival, 88 and 99 promotions, all big promotions that run every year. If the forecasting and decision models are left unadjusted, the simulation backtest shows the same rise in out-of-stock rate every year.

Conversely, for a big event that has not yet arrived, we can backtest on past editions of the event to see whether the current model can be expected to get through it this year, which leaves us time to respond.

Take the 88 promotion as an example. We can run a simulation-based hyperparameter search on 88 data from previous years and use the resulting event-specific hyperparameters during the next 88 period. As Figure 8 shows, compared with the hyperparameters that are stable over the long term, the 88-specific hyperparameters improve the out-of-stock rate by 4% during the 88 period. As Figure 9 shows, the corresponding inventory days are barely affected.

Figure 8: Out-of-stock rate with and without hyperparameter adjustment for a special event

Figure 9: Inventory days with and without hyperparameter adjustment for a special event

In principle, we can run such stability warnings every day for the period ahead, learn from history, and adjust the prediction or decision model in time. In the DTC scenario this is especially necessary for periodic events such as the 38 Women's Day, 88 and 99 promotions and the New Year shopping festival.
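One way to wire event-specific hyperparameters and a daily look-ahead into a system is sketched below. The event windows, the service level values and all names are hypothetical placeholders. In practice the overrides would come from the simulation-based search on previous years' data.

```python
from datetime import date, timedelta

DEFAULT_PARAMS = {"service_level": 0.95}  # long-term stable setting (assumed value)

# (name, (start month, start day), (end month, end day), overrides) -- illustrative only
EVENT_OVERRIDES = [
    ("88 promotion", (8, 6), (8, 8), {"service_level": 0.98}),
    ("99 promotion", (9, 7), (9, 9), {"service_level": 0.98}),
]


def params_for(day: date) -> dict:
    """Return the decision hyperparameters to use on a given day."""
    for _name, start, end, overrides in EVENT_OVERRIDES:
        if start <= (day.month, day.day) <= end:
            return {**DEFAULT_PARAMS, **overrides}
    return dict(DEFAULT_PARAMS)


def upcoming_events(today: date, horizon_days: int) -> list:
    """List events that start within the horizon, so their backtests can be rerun."""
    names = []
    for offset in range(horizon_days + 1):
        day = today + timedelta(days=offset)
        for name, start, _end, _overrides in EVENT_OVERRIDES:
            if (day.month, day.day) == start and name not in names:
                names.append(name)
    return names


print(params_for(date(2024, 8, 7)))
print(upcoming_events(date(2024, 8, 1), horizon_days=7))
```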

2.2 Simulation-based launch admission mechanism

Once stability is assured, the next question is how to roll out in batches. Even though simulation shows good overall business metrics, the rollout cannot be done all at once. It has to proceed in batches.

The order matters too. We want the products we are most sure about, and those with the largest expected improvement, to go first. Early wins boost team morale and earn broader authorization once results are in.

If a product's metrics change little after rollout, special factors may be at play and the model needs targeted adjustment. If its metrics get worse, the product may be ill-suited to automated AI replenishment and require manual intervention.

How do we tell these batches apart? At first, we thought the first batch of products should meet the following criteria:

Sales are consistently stable
AI prediction accuracy is consistently good
Forecast error is consistently stable

In practice, although these products did perform well after rollout, there was little room for improvement over manual replenishment. On reflection, we had fallen into a trap: these filters select products with stable sales and well-performing forecasts. The replenishment model handles them well, but so do people.

By contrast, for products whose sales are unstable and hard for people to predict, the prediction model, having learned from similar scenarios, and the decision model, through precise calculation, can make fewer mistakes than people do, so the optimization potential is greater. Isn't AI all about finding certainty in uncertainty to help people make better decisions?

Therefore, we propose a simulation-based launch admission mechanism.

The criterion is the gap between the simulated business metrics and the manual metrics. For cases that fall short, we analyze why they underperform manual replenishment, adjust the model accordingly, and rerun the simulation until the launch standard is met. Cases that meet the standard are ranked by their simulated improvement potential over manual replenishment, so that those with the greatest potential launch first.

Figure 10: Simulation-based launch admission mechanism
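The admission and ranking step can be sketched in a few lines. The admission standard used here (the AI is no worse than manual on both metrics) and the potential score (relative reduction in inventory days) are assumptions chosen for illustration, as are the class name and the SKU data.

```python
from dataclasses import dataclass


@dataclass
class Backtest:
    sku: str
    ai_inventory_days: float
    manual_inventory_days: float
    ai_stockout_rate: float
    manual_stockout_rate: float

    def meets_standard(self) -> bool:
        # Assumed standard: simulated AI metrics are no worse than manual on both axes.
        return (self.ai_inventory_days <= self.manual_inventory_days
                and self.ai_stockout_rate <= self.manual_stockout_rate)

    def potential(self) -> float:
        # Assumed score: relative reduction in inventory days versus manual.
        return (self.manual_inventory_days - self.ai_inventory_days) / self.manual_inventory_days


def plan_rollout(results):
    """Split SKUs into a launch order (highest potential first) and a rework list."""
    admitted = sorted((r for r in results if r.meets_standard()),
                      key=Backtest.potential, reverse=True)
    rework = [r for r in results if not r.meets_standard()]
    return [r.sku for r in admitted], [r.sku for r in rework]


# Hypothetical backtest results for three SKUs.
results = [
    Backtest("A", 20.0, 22.0, 0.04, 0.05),
    Backtest("B", 15.0, 30.0, 0.03, 0.05),
    Backtest("C", 25.0, 20.0, 0.02, 0.05),
]
print(plan_rollout(results))
```

SKUs on the rework list go back through attribution and model adjustment, then through the simulation again, until they pass.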

More broadly, the closed loop of simulation -> attribution -> model adjustment applies not only to admitting SKUs but also to admitting new models. It covers everything from tuning a model's hyperparameters to changing its structure, such as replacing LGB with a Transformer.

Challenges during rollout

After all this preparation, our model finally reached the point of being put to a real test, and it was challenged at every step.

3.1 What to do if the adoption rate is not high

During the trial run, monitoring showed that, without intervention, business users accepted very few of the recommended replenishment quantities. After analyzing typical cases, we traced this to two main root causes: overestimation of sales, and a sense of insecurity caused by insufficient trust in AI replenishment.

Take one product as an example of the first cause. As Figure 11 shows, there were two manual replenishments at times when actual inventory was already above the maximum inventory recommended by the AI, and the daily depletion of actual inventory shows that sales remained sluggish. We can therefore judge both replenishments to be unreasonable.

Figure 11: Replenishment scenario replay 1, based on a real case with anonymized data

Interviews with the business users revealed that the two replenishments coincided with two important promotions. The warehouse managers were under pressure from front-end sales staff and had little sense of the product's historical sales trend, so they ordered extra stock.

AI helps in two ways. First, it forecasts from a more global perspective and balances sales against cost, neither overstocking for the sake of revenue nor losing sales opportunities to cut inventory cost. Second, it forecasts more accurately at the finest granularity: for an important promotion it knows not only that sales will grow, but also which products will rise more and which less.

To reduce such cases, we built dashboards that let business users see the historical sales trend of each product and build trust in the AI's predictions, instead of handing them nothing but a cold recommended replenishment quantity.

Extra replenishment out of insecurity. In this second example, the business user also replenished twice, but the product's sales were relatively stable, so inventory was not much affected, and actual inventory stayed below the maximum recommended by the AI. Comparing the AI's recommended reorder point with actual inventory shows that the user's timing was very close to when the AI would have recommended replenishing. Given the inventory and sales at the time, had the user not replenished at the first point, the AI would already have recommended replenishment by the second.

Figure 12: Replenishment scenario replay 2, based on a real case with anonymized data

In the interview, we learned that the user had been watching this product closely. The AI had not recommended replenishing it, and the user worried that the AI had overlooked the product, so they replenished twice. Mindful of the cooperation with us, however, they did not dare to order much and felt very torn. And because they had replenished over those two days, the AI found inventory sufficient when it pulled the latest figures and again recommended nothing, which deepened the suspicion that the AI had missed the product.

Although such behavior does not hurt the inventory level and out-of-stock rate we are mainly assessed on, replenishing in small batches every time adds logistics and picking costs and raises the cost of our replenishment recommendations as well.

In response, we added two fields to the replenishment recommendation table that users work with: the AI's recommended reorder point and the maximum inventory quantity. The former eases the user's insecurity by showing that, although the AI does not recommend replenishment today, it will certainly do so later. The latter gives the user a reference for how much to order if they insist on replenishing that day.

As Figure 13 shows, business users see not only the recommended replenishment quantity but also the recommended reorder point and maximum inventory, which they can use as references to decide better when special circumstances require replenishing early.
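The logic behind such a table row can be sketched as a reorder-point rule. This is an illustrative assumption about how the three fields relate, not the actual table schema: the recommendation is zero above the reorder point, and the gap up to the maximum inventory is always shown as a ceiling for manual orders. Field and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Recommendation:
    recommended_qty: float  # what the AI suggests ordering today
    reorder_point: float    # inventory level at which the AI will recommend an order
    max_inventory: float    # ceiling the AI replenishes up to
    max_manual_qty: float   # upper reference if the user decides to order anyway


def recommend(inventory_position: float, reorder_point: float,
              max_inventory: float) -> Recommendation:
    """Order up to max_inventory once the position falls to the reorder point."""
    headroom = max(0.0, max_inventory - inventory_position)
    qty = headroom if inventory_position <= reorder_point else 0.0
    return Recommendation(qty, reorder_point, max_inventory, headroom)


# Hypothetical product: reorder at 100 units, replenish up to 300 units.
print(recommend(120.0, reorder_point=100.0, max_inventory=300.0))
print(recommend(80.0, reorder_point=100.0, max_inventory=300.0))
```

In the first case the recommended quantity is zero, but the user can see that the reorder point is only 20 units away and that an early order should not exceed 180 units.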

Figure 13: The replenishment table finally delivered to business users

During rollout we also noticed an interesting phenomenon worth sharing: in companies with mature business processes, imposing AI decisions outright poses a great challenge to those processes. Helping business users make better-founded decisions within the existing process generates business value more effectively.

After launch, the adoption rate rose by more than 60%, and business metrics improved significantly after 3 days.

3.2 Attribution Analysis

While monitoring business metrics during the ongoing rollout, we also need to analyze the cases where business users do not adopt the recommended replenishment quantity, to understand whether the problem lies in the model itself or in the process. In the early stages this is tedious, heavy work.

The process depends on accurate and comprehensive feedback from business users, so collecting that feedback is the most important step. For each piece of feedback on a rejected recommendation, we use the data to reconstruct the situation at the time, confirm the feedback, and attribute it to one of a limited set of labels. This also involves a lot of text analysis: the same problem may be described in different words, similar words may have different underlying causes, and the resulting behavior, ordering more or ordering less, is not consistent either.
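The step of mapping free-text feedback to a limited label set can be sketched with simple keyword rules. The labels, keywords and feedback strings below are hypothetical, and the actual analysis was done in a BI product. Keyword rules also cannot resolve the ambiguity just described, which is why unmatched or doubtful items are routed to manual review.

```python
from collections import Counter

# Hypothetical label set and keyword rules, checked in order.
LABEL_KEYWORDS = {
    "early stocking for big promotion": ("99", "big promotion", "stock up early"),
    "site demand": ("site", "sales team asked"),
    "network-wide stockout": ("out of stock everywhere", "factory"),
    "everyday sales demand": ("daily sales", "selling faster"),
}


def label_feedback(text: str) -> str:
    """Attribute one piece of feedback to the first label whose keyword it contains."""
    lowered = text.lower()
    for label, keywords in LABEL_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return label
    return "needs manual review"


def attribution_counts(feedback):
    """Count non-adoption cases per label."""
    return Counter(label_feedback(text) for text in feedback)


feedback = [
    "Need to stock up early for 99",
    "Site wants more stock in the warehouse",
    "Daily sales are higher than the forecast",
    "Not sure, just felt too low",
]
print(attribution_counts(feedback))
```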

Fortunately, Guanyuan has a strong BI product and an analysis methodology refined through serving hundreds of customers, which allowed us to complete the attribution analysis. The chart below shows the attribution results for low adoption at one stage.

Figure 14: Attribution analysis of non-adopted recommendations

Through attribution analysis, we repeatedly identified shortcomings of the current model and deepened our understanding of the business. For example:

Stocking up early for 99. At the start of modeling, we confirmed with the business that the lead time is 3 days, so for an event on September 9 we could replenish the stock for the event on September 6.

For big promotions, however, business users usually stock up earlier. The volume for a big promotion is not replenished in one shipment, and there are not that many trucks readily available, so the stock is brought in ahead of time in batches. This requires additional adjustments to our replenishment model for big-promotion replenishment.

Meeting everyday sales demand. In essence, our model predicts inaccurately during non-promotional periods for some products. Our analysis showed that these so-called "non-promotional" periods actually contain promotions the model is unaware of: activities such as the Ten Billion Subsidy and Juhuasuan that were never integrated into the input data, or planned activities whose promoted products were changed at short notice without the change reaching the model.

From the customer's IT team we also learned that, in theory, every promotional activity can be collected, but collection has a cost, so collecting small, infrequent activities is not recommended. Our prediction models therefore need to detect the start and end of unplanned activities on their own and adjust forecasts in time for better replenishment.

Beyond problems with the model itself, how AI fits into the business process matters just as much. For example:

Site demand. A site can be thought of as the front-end sales staff. They want the warehouse to hold more stock and tend to overestimate sales, which puts pressure on replenishment staff and leads to overstocking. This disturbs AI replenishment. Mr. Liu Baohong also notes in "Three Lines of Defense of Supply Chain" that getting the whole company to work from a single forecast has always been a huge challenge.
Network-wide stockouts. When a hot-selling product is out of stock across the whole network, business users keep a close eye on factory output and, as soon as goods are produced, stock them in the nearest DTC warehouse. Preventing a nationwide stockout takes priority, even if warehouse inventory levels and cross-warehouse shipping suffer.

The two scenarios affect the business differently: the former drives up inventory cost, while the latter reduces stockouts, which benefits the market to some extent.

Shortcomings and room for improvement

Through these efforts, the inventory days of the experimental group fell by 14.8% relative to the control group and the out-of-stock rate fell by 6.9%, results that the customer recognized.

4.1 Optimization of financial report value

In optimizing business metrics above, we measured the theoretical value before rollout and worked to secure the improvement after launch, tracing the path of business metrics from theoretical value to realized value. From a business perspective, though, we do not want to stop at business metrics. We want financial results. We hope to have the opportunity to invite finance colleagues to calculate the full benefit of our optimization.

How the realized value differs from the financial report value:

The realized value considers only the core KPIs under assessment, not the overall cost. For example, the project does not account for picking and logistics costs, so frequent small-quantity replenishment could be used as an "opportunistic" way to improve the KPIs;
The realized value has not been through financial accounting. For example, reducing the overall inventory level by 10% does not mean storage costs fall by 10%;
The model has yet to stand the test of time.

The financial report value can also feed back into the model, prompting a more comprehensive view of the problem and a better optimization objective. We expect more and more projects to obtain clear financial results in the future, so that the core value of AI to enterprises becomes visible.

4.2 Replenishment products that integrate forecasting, decision-making, and simulation

Behind the great value AI models deliver are strong products and sound processes that allow models to be verified and iterated quickly. In the replenishment scenario, we likewise need a closed loop of prediction, decision-making and simulation to make the replenishment system friendlier, more humane and closer to what management needs.

For algorithm engineers:

They can develop new prediction models, moving from LGB to Transformer or MQ-RNN; change the decision logic from safety stock to an operations research optimization or a reinforcement learning solution; or run a hyperparameter search on the current model for better performance. At every step, when they commit code, the system runs a simulation backtest to verify the potential of the new approach.
For a model about to be rolled out, when communicating with business users, the product can apply the simulation-based launch admission idea to recommend products for rollout in batches and flag which products need adjustment.
During rollout, the product can automatically attribute most non-adoption cases, improving the adoption rate efficiently.

Any new research result can thus be verified quickly through this process, accelerating its conversion into business value.

For business users, the product tells them how to replenish each day, and the decision chain behind it is the best combination of prediction and decision models found through historical simulation. Business users can modify the parameters relevant to them based on their own understanding. For example:

If a user raises the expected service level from 95% to 97%, the product adjusts today's replenishment plan and tells them that the service level will improve by 2 percentage points, that historical simulation backtesting projects an x% increase in inventory cost, and that the overall financial value changes by y%;
A user can also revise future forecasts based on information they have. For example, if a livestream is added at short notice next Friday and is expected to need 20,000 boxes, the product tells them how to pace replenishment over the days leading up to it;
A user who has their own replenishment cadence can enter it, and the simulation tells them the cost difference between the two replenishment plans.
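For the first example, part of the effect can be worked out directly. Under the normal-error assumption used earlier, safety stock is proportional to the standard normal quantile z, so the sketch below computes how much safety stock grows when the service level is raised. The function name is hypothetical, and this covers safety stock only: the change in total inventory cost (the x% above) still has to come from the simulation.

```python
from statistics import NormalDist


def safety_stock_change(old_level: float, new_level: float) -> float:
    """Relative change in safety stock when the service level changes,
    assuming normally distributed errors (safety stock is proportional to z)."""
    z_old = NormalDist().inv_cdf(old_level)
    z_new = NormalDist().inv_cdf(new_level)
    return z_new / z_old - 1.0


change = safety_stock_change(0.95, 0.97)
print(f"Raising the service level from 95% to 97% changes safety stock by {change:+.1%}")
```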

We believe a good replenishment system does not force replenishment orders on the business but helps it make better decisions. The replenishment process should be visually explainable and traceable to visible, tangible costs.

Author: Fan Fei, from express delivery to FMCG, an ordinary supply chain algorithm cultivator.

Source-WeChat public account: Guanyuan Data Technical Team

Source: https://mp.weixin.qq.com/s/wC8cdNn-pBCna4jxYGPStQ