The PID algorithm is widely used in control engineering, for example in industrial control and automotive electronics. This article explains how to apply the PID algorithm effectively in a recommender system.
PID control concept
- Basic deviation
- The basic deviation is e(t) = target value - actual value. A positive deviation means the actual value is below the target and must be adjusted upward; a negative deviation means the actual value is above the target and must be adjusted downward; 0 means no adjustment is needed.
- Steady-state error
- Once the system reaches a steady state, the deviation between the target value and the actual value may settle at a stable but nonzero value; this residual deviation is called the steady-state error.
- PID basic concepts
- P (Proportional): proportional control. It adjusts the controlled object linearly, in proportion to the current deviation, using a constant gain P. The advantage is speed: the output moves toward the target quickly. The disadvantage is that, used alone, it leaves a steady-state error.
- I (Integral): integral control. It accumulates historical deviations and uses their sum to eliminate the steady-state error.
- D (Derivative): derivative control. It reacts to the rate of change of the deviation, which damps overshoot and speeds up the response of systems with large inertia.
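The difference between P-only control and P plus I control can be seen in a small simulation. This is a minimal sketch rather than a production controller: the toy first-order process, the function name `simulate`, and the gains `kp=4.0` and `ki=2.0` are all assumptions chosen for illustration.

```python
def simulate(kp, ki, target=1.0, dt=0.05, steps=2000):
    """Drive a toy first-order process toward `target`.

    Returns the final deviation e = target - actual.
    """
    actual = 0.0
    integral = 0.0
    for _ in range(steps):
        error = target - actual            # basic deviation e(t)
        integral += error * dt             # accumulated historical deviation
        control = kp * error + ki * integral
        # Toy process (assumption): the value decays on its own and is
        # pushed up or down by the control signal.
        actual += dt * (-actual + control)
    return target - actual


p_only_error = simulate(kp=4.0, ki=0.0)    # proportional term only
pi_error = simulate(kp=4.0, ki=2.0)        # proportional + integral

print(f"P only : steady-state error = {p_only_error:.4f}")
print(f"P + I  : steady-state error = {pi_error:.4f}")
```

With these assumed values, the P-only run settles where the process decay balances the proportional push (actual = kp / (1 + kp) = 0.8, so the deviation stays near 0.2), while the integral term keeps accumulating until the deviation is driven to about 0.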
PID formula
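In its standard textbook form, which is assumed here to be what this section refers to, the controller output u(t) is the sum of three terms computed from the deviation e(t) defined above. The gain symbols K_p, K_i, K_d and the sampling interval \Delta t are notation introduced for this sketch:

$$
u(t) = K_p\, e(t) + K_i \int_0^{t} e(\tau)\, d\tau + K_d\, \frac{de(t)}{dt}
$$

Because a recommender system computes the deviation at fixed intervals, the discrete form is usually the one implemented, with e_k the deviation at the k-th calculation:

$$
u_k = K_p\, e_k + K_i \sum_{j=0}^{k} e_j\, \Delta t + K_d\, \frac{e_k - e_{k-1}}{\Delta t}
$$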
Application of PID in recommender systems
In a recommender system, PID can be used for traffic control: the exposure speed of an item is adjusted according to the difference between the actual value and the target value, which raises or lowers the item's exposure probability.
- Goal setting
Because PID regulates on the basis of deviation, the first step is to fix how the deviation is calculated.
- The target value must be defined explicitly, for example exposures, clicks, plays, or exposure ratio.
- The target must be broken down, because the raw target is often too coarse, for example 1 million exposures in a week. Without a breakdown, the algorithm sees a large deviation in the early stage and gives the item a large amount of traffic support, which hurts the exposure experience and does not match actual business requirements. Target breakdown means splitting the long-term goal into short-term goals, for example week -> day -> hour -> minute. The split can follow the platform's historical distribution of traffic over time periods.
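The breakdown described above can be sketched as a proportional split. This is only an illustration under stated assumptions: the function names `split_target` and `cumulative_targets` and the historical traffic numbers are hypothetical, and a real system would read the distribution from its own traffic logs.

```python
from itertools import accumulate


def split_target(total_target, historical_traffic):
    """Split a long-term target into per-slot targets, proportional to
    the historical traffic observed in each slot."""
    total_traffic = sum(historical_traffic)
    if total_traffic <= 0:
        raise ValueError("historical traffic must sum to a positive value")
    return [total_target * t / total_traffic for t in historical_traffic]


def cumulative_targets(slot_targets):
    """Target that should have been reached by the end of each slot."""
    return list(accumulate(slot_targets))


# Hypothetical weekly goal: 1 million exposures.
weekly_target = 1_000_000
# Hypothetical relative traffic for Mon..Sun (weekends are busier).
daily_traffic = [10, 10, 10, 10, 10, 25, 25]

daily_targets = split_target(weekly_target, daily_traffic)
due_by_day = cumulative_targets(daily_targets)

# The same function breaks one day down into coarser or finer slots,
# here four hypothetical time blocks within Monday.
monday_blocks = split_target(daily_targets[0], [1, 3, 4, 2])

print(daily_targets)
print(due_by_day)
print(monday_blocks)
```

At each calculation time, the deviation is then taken against the cumulative target due so far, not against the full weekly goal, so the controller does not see a huge deviation on the first day.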
- Control execution
- Compute the target deviation on a schedule and derive the PID weight from the size of the deviation. The interval depends on traffic volume and real-time computing capacity; in general, the larger the traffic, the shorter the interval should be.
- PID produces a control weight, while recommendation ultimately has to affect exposure. The PID weight is therefore combined with the model's preference score to form a final score, and items are ranked by that final score. A PID weight greater than 1 acts as traffic support and a weight less than 1 acts as traffic suppression, which achieves the final traffic control.
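The two steps above can be sketched as follows. Several choices here are assumptions, not something the text prescribes: the deviation is normalized by the target, the PID output u is mapped to a weight as `1 + u` clamped to `[min_w, max_w]`, the weight is combined with the model score by multiplication, and the names `PIDController`, `pid_weight`, and `rerank` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PIDController:
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: Optional[float] = None

    def update(self, error, dt=1.0):
        """Return the discrete PID output for the latest deviation."""
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0               # no history on the first call
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def pid_weight(controller, target, actual, min_w=0.2, max_w=5.0):
    """Map the PID output to a multiplicative weight centered on 1."""
    error = (target - actual) / max(target, 1e-9)   # relative deviation
    u = controller.update(error)
    return min(max_w, max(min_w, 1.0 + u))


def rerank(model_scores, weights):
    """Final score = model preference score * PID weight; sort descending."""
    final = {item: s * weights.get(item, 1.0) for item, s in model_scores.items()}
    return sorted(final, key=final.get, reverse=True)


# One controller per regulated item, called once per scheduled interval.
controllers = {name: PIDController(kp=1.0, ki=0.1, kd=0.0) for name in ("a", "b")}
progress = {"a": (1000, 500), "b": (1000, 1500)}   # (target so far, actual so far)

weights = {
    name: pid_weight(controllers[name], target, actual)
    for name, (target, actual) in progress.items()
}
ranking = rerank({"a": 0.4, "b": 0.6}, weights)

print(weights)
print(ranking)
```

In this toy run, item "a" is behind its target and receives a weight above 1, item "b" is ahead and receives a weight below 1, so "a" overtakes "b" in the final ranking even though its model score is lower. The clamp is a common safeguard so that one large deviation cannot completely override the model's preference score.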