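I have recently been modifying the rate-control module of AVS3, so I first studied the rate-control model in x265.
To understand the principles of x265 rate control, a good starting point is https://blog.csdn.net/liulinyu1234/article/details/80857652; I learned the x265 rate-control model from that blog's explanation of the principles as well.
Here I mainly describe my own understanding of how x265 updates the VBV.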
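The VBV-related parameters are mostly members of class RateControl. The ones that appear in the code quoted in this post are m_bufferSize, m_bufferFill, m_bufferFillFinal, m_bufferRate, and m_vbvMaxRate.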

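x265 has a mechanism called the VBV, which behaves like a container with a maximum capacity and an initial level, BufferSize and initvbvBuffer. Think of it this way: each time a frame is encoded, the bits spent on that frame are taken out of the container, and then the average per-frame bit budget is added back in. If frames consume too many bits, that is, bits leave the container much faster than they are added, the initial buffer is soon exhausted and the buffer underflows, which shows up as stalls during playback. Conversely, if bits are taken out much more slowly than they are added, the VBV overflows. Overflow usually matters little; as I understand it, it means the encoder was given bits it did not fully use, so the video quality is not as good as it could have been. What must be avoided is underflow, because playback stalls directly hurt the user experience.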
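The container analogy can be sketched in a few lines of C++. This is only an illustration of the take-then-refill idea: VbvModel and its members are hypothetical names chosen for this example and are not x265's data structures, and clamping the fill at zero and at the buffer size is my own simplification.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical, simplified VBV model (not x265 code).
struct VbvModel
{
    double bufferSize;   // maximum capacity, in bits
    double bufferFill;   // current fullness, in bits
    double bitsPerFrame; // average refill per frame (bitrate / frame rate)
    int    underflows;   // number of frames that drained the buffer below zero

    VbvModel(double size, double initialFill, double perFrame)
        : bufferSize(size), bufferFill(initialFill), bitsPerFrame(perFrame), underflows(0)
    {
        assert(size > 0 && initialFill >= 0 && initialFill <= size && perFrame > 0);
    }

    // Account for one encoded frame: take its bits out, then refill.
    void update(double frameBits)
    {
        bufferFill -= frameBits;
        if (bufferFill < 0)
        {
            ++underflows;   // a real decoder would stall here
            bufferFill = 0;
        }
        // Overflow: bits that do not fit are simply left unused.
        bufferFill = std::min(bufferFill + bitsPerFrame, bufferSize);
    }
};
```

With a 1000-bit buffer that starts at 900 bits and refills 100 bits per frame, a 50-bit frame leaves the fill at 950, a following 20-bit frame hits the 1000-bit cap, and a 1200-bit frame after that is counted as an underflow.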
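Rate control in x265 centers on two functions, rateControlStart and rateControlEnd. The first runs before a frame is encoded and computes its parameters, including the QP. The main call chain is rateControlStart -> rateEstimateQscale -> getQScale, tuneAbrQScaleFromFeedback, clipQscale. The effect of the VBV state on Qscale shows up mainly in clipQscale.
Starting from the current frame, clipQscale predicts the bits of the next several frames and then adjusts the current frame's Qscale according to the VBV state, so that the buffer fullness stays between 0.5 and 0.8 of the buffer size. The function has a large effect on the final Qscale, and in my view it is the key function that keeps the VBV from overflowing or underflowing during rate control. Here is clipQscale with my own comments: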
double RateControl::clipQscale(Frame* curFrame, RateControlEntry* rce, double q)
{
    // B-frames are not directly subject to VBV,
    // since they are controlled by referenced P-frames' QPs.
    double lmin = m_lmin[rce->sliceType];
    double lmax = m_lmax[rce->sliceType];
    double q0 = q;
    if (m_isVbv && m_currentSatd > 0 && curFrame)
    {
        if (m_param->lookaheadDepth || m_param->rc.cuTree ||
            (m_param->scenecutThreshold || m_param->bHistBasedSceneCut) ||
            (m_param->bFrameAdaptive && m_param->bframes))
        {
            /* Lookahead VBV: If lookahead is done, raise the quantizer as necessary
             * such that no frames in the lookahead overflow and such that the buffer
             * is in a reasonable state by the end of the lookahead. */
            int loopTerminate = 0;
            /* Avoid an infinite loop. loopTerminate == 3 means both bit 1 (q was raised)
             * and bit 2 (q was lowered) have been set. */
            // Starting from the current frame, look ahead over the planned frames,
            // estimate the bits they need, and project the VBV state.
            for (int iterations = 0; iterations < 1000 && loopTerminate != 3; iterations++)
            {
                double frameQ[3];
                double curBits;
                curBits = predictSize(&m_pred[m_predType], q, (double)m_currentSatd); // predicted bits of the current frame
                double bufferFillCur = m_bufferFill - curBits; // projected buffer fill
                double targetFill;
                double totalDuration = m_frameDuration;
                frameQ[P_SLICE] = m_sliceType == I_SLICE ? q * m_param->rc.ipFactor : (m_sliceType == B_SLICE ? q / m_param->rc.pbFactor : q);
                frameQ[B_SLICE] = frameQ[P_SLICE] * m_param->rc.pbFactor;
                frameQ[I_SLICE] = frameQ[P_SLICE] / m_param->rc.ipFactor;
                /* Loop over the planned future frames. */
                bool iter = true;
                for (int j = 0; bufferFillCur >= 0 && iter ; j++) // predict the bits of the following frames from their SATD
                {
                    int type = curFrame->m_lowres.plannedType[j];
                    if (type == X265_TYPE_AUTO || totalDuration >= 1.0)
                        break;
                    totalDuration += m_frameDuration;
                    double wantedFrameSize = m_vbvMaxRate * m_frameDuration;
                    if (bufferFillCur + wantedFrameSize <= m_bufferSize)
                        bufferFillCur += wantedFrameSize; // buffer after the refill (ideal state)
                    int64_t satd = curFrame->m_lowres.plannedSatd[j] >> (X265_DEPTH - 8);
                    type = IS_X265_TYPE_I(type) ? I_SLICE : IS_X265_TYPE_B(type) ? B_SLICE : P_SLICE;
                    int predType = getPredictorType(curFrame->m_lowres.plannedType[j], type);
                    curBits = predictSize(&m_pred[predType], frameQ[type], (double)satd);
                    bufferFillCur -= curBits; // buffer after subtracting the predicted bits (actual state)
                    if (!m_param->bResetZoneConfig && ((uint64_t)j == (m_param->reconfigWindowSize - 1)))
                        iter = false;
                }
                if (rce->vbvEndAdj) // this branch is not entered
                {
                    bool loopBreak = false;
                    double bufferDiff = m_param->vbvBufferEnd - (m_bufferFill / m_bufferSize);
                    rce->targetFill = m_bufferFill + m_bufferSize * (bufferDiff / (m_param->totalFrames - rce->encodeOrder));
                    if (bufferFillCur < rce->targetFill)
                    {
                        q *= 1.01;
                        loopTerminate |= 1;
                        loopBreak = true;
                    }
                    if (bufferFillCur > m_param->vbvBufferEnd * m_bufferSize)
                    {
                        q /= 1.01;
                        loopTerminate |= 2;
                        loopBreak = true;
                    }
                    if (!loopBreak)
                        break;
                }
                else // keep bufferFillCur between 50% and 80% of the buffer
                {
                    /* Try to get the buffer at least 50% filled, but don't set an impossible goal. */
                    double finalDur = 1;
                    if (m_param->rc.bStrictCbr)
                    {
                        finalDur = x265_clip3(0.4, 1.0, totalDuration);
                    }
                    targetFill = X265_MIN(m_bufferFill + totalDuration * m_vbvMaxRate * 0.5, m_bufferSize * (1 - 0.5 * finalDur));
                    if (bufferFillCur < targetFill)
                    {
                        q *= 1.01;
                        loopTerminate |= 1;
                        continue;
                    }
                    /* Try to get the buffer not more than 80% filled, but don't set an impossible goal. */
                    targetFill = x265_clip3(m_bufferSize * (1 - 0.2 * finalDur), m_bufferSize, m_bufferFill - totalDuration * m_vbvMaxRate * 0.5);
                    if (m_isCbr && bufferFillCur > targetFill && !m_isSceneTransition)
                    {
                        q /= 1.01;
                        loopTerminate |= 2;
                        continue;
                    }
                    break;
                }
            }
            q = X265_MAX(q0 / 2, q);
        }
        else
        {
            /* Fallback to old purely-reactive algorithm: no lookahead. */
            if ((m_sliceType == P_SLICE || m_sliceType == B_SLICE ||
                 (m_sliceType == I_SLICE && m_lastNonBPictType == I_SLICE)) &&
                m_bufferFill / m_bufferSize < 0.5)
            {
                q /= x265_clip3(0.5, 1.0, 2.0 * m_bufferFill / m_bufferSize);
            }
            // Now a hard threshold to make sure the frame fits in VBV.
            // This one is mostly for I-frames.
            double bits = predictSize(&m_pred[m_predType], q, (double)m_currentSatd);
            // For small VBVs, allow the frame to use up the entire VBV.
            double maxFillFactor;
            maxFillFactor = m_bufferSize >= 5 * m_bufferRate ? 2 : 1;
            // For single-frame VBVs, request that the frame use up the entire VBV.
            double minFillFactor = m_singleFrameVbv ? 1 : 2;
            for (int iterations = 0; iterations < 10; iterations++)
            {
                double qf = 1.0;
                if (bits > m_bufferFill / maxFillFactor)
                    qf = x265_clip3(0.2, 1.0, m_bufferFill / (maxFillFactor * bits));
                q /= qf;
                bits *= qf;
                if (bits < m_bufferRate / minFillFactor)
                    q *= bits * minFillFactor / m_bufferRate;
                bits = predictSize(&m_pred[m_predType], q, (double)m_currentSatd);
            }
            q = X265_MAX(q0, q);
        }
        /* Apply MinCR restrictions */
        double pbits = predictSize(&m_pred[m_predType], q, (double)m_currentSatd);
        if (pbits > rce->frameSizeMaximum)
            q *= pbits / rce->frameSizeMaximum;
        /* To detect frames that are more complex in SATD costs compared to prev window, yet
         * lookahead vbv reduces its qscale by half its value. Be on safer side and avoid drastic
         * qscale reductions for frames high in complexity */
        bool mispredCheck = rce->movingAvgSum && m_currentSatd >= rce->movingAvgSum && q <= q0 / 2;
        if (!m_isCbr || (m_isAbr && mispredCheck))
            q = X265_MAX(q0, q);
        if (m_rateFactorMaxIncrement)
        {
            double qpNoVbv = x265_qScale2qp(q0);
            double qmax = X265_MIN(lmax,x265_qp2qScale(qpNoVbv + m_rateFactorMaxIncrement));
            return x265_clip3(lmin, qmax, q);
        }
    }
    if (m_2pass)
    {
        double min = log(lmin);
        double max = log(lmax);
        q = (log(q) - min) / (max - min) - 0.5;
        q = 1.0 / (1.0 + exp(-4 * q));
        q = q*(max - min) + min;
        return exp(q);
    }
    return x265_clip3(lmin, lmax, q);
}
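To make the lookahead branch easier to follow, here is a stripped-down sketch of its core idea: project the buffer over the planned frames, and if the projection ends below the target, raise Qscale by 1% and try again. Everything in it is an assumption made for illustration. predictBits is a hypothetical stand-in for predictSize (it simply assumes that bits are inversely proportional to Qscale), the function and parameter names are mine, and only the "raise q" direction is modeled; the real function also lowers q in CBR mode and applies the per-slice-type factors.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for predictSize(): bits shrink as qscale grows.
inline double predictBits(double complexity, double qscale)
{
    return complexity / qscale;
}

// Project the buffer fill after the current frame and the planned frames.
double projectFill(double q, double bufferFill, double bufferSize, double refillPerFrame,
                   double currentComplexity, const std::vector<double>& plannedComplexity)
{
    double fill = bufferFill - predictBits(currentComplexity, q);
    for (double c : plannedComplexity)
    {
        if (fill < 0)
            break;                        // already underflowing, stop looking ahead
        if (fill + refillPerFrame <= bufferSize)
            fill += refillPerFrame;       // refill only if it fits in the buffer
        fill -= predictBits(c, q);        // then spend the predicted bits
    }
    return fill;
}

// Raise qscale in 1% steps until the projected fill reaches targetFill.
double raiseQscaleForVbv(double q, double bufferFill, double bufferSize, double refillPerFrame,
                         double targetFill, double currentComplexity,
                         const std::vector<double>& plannedComplexity)
{
    assert(q > 0 && bufferSize > 0);
    for (int iterations = 0; iterations < 1000; iterations++) // bounded, like the original loop
    {
        double fill = projectFill(q, bufferFill, bufferSize, refillPerFrame,
                                  currentComplexity, plannedComplexity);
        if (fill >= targetFill)
            break;
        q *= 1.01;
    }
    return q;
}
```

The 1% step means a large correction takes many iterations, which is why the loop carries an iteration cap; in this sketch roughly tripling q takes a little over a hundred steps.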
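Once a frame has actually been encoded, rateControlEnd takes over: based on the bits the frame really consumed, it updates the VBV state and the prediction parameters. The main call chain is rateControlEnd -> updateVbv -> updatePredictor. The VBV state itself is updated in updateVbv: m_bufferFillFinal -= bits subtracts the bits actually spent, and m_bufferFillFinal += m_bufferRate adds the average per-frame bits, which is the take-then-add operation described earlier. This is also where underflow is checked: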
if (m_bufferFillFinal < 0)
    x265_log(m_param, X265_LOG_WARNING, "poc:%d, VBV underflow (%.0f bits)\n", rce->poc, m_bufferFillFinal);
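The warning shows that underflow is the case to guard against. One more note on updatePredictor, which updates the bit-prediction parameters. rateControlStart calls predictSize to predict, per frame type, how many bits a frame will need from its SATD and Qscale; clipQscale calls it many times, so it plays an important role. updatePredictor takes the prediction and the bits actually produced and updates the parameters for the corresponding frame type (B/P/I/BREF), so that the next prediction for a frame of that type is more accurate.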
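The predict-then-correct cycle can be illustrated with a very small model. This is a sketch under my own assumptions and not x265's predictor: BitPredictor is a hypothetical type, it assumes bits are proportional to SATD / Qscale with a single coefficient, and it blends each new observation into the coefficient with an exponential decay. One such predictor would be kept per frame type.

```cpp
#include <cassert>

// Hypothetical simplified bit predictor (not x265 code): bits ~= coeff * satd / qscale.
struct BitPredictor
{
    double coeff; // decayed sum of observed coefficients
    double count; // decayed number of observations
    double decay; // weight kept for history on each update, 0 < decay < 1

    // Predict the bits a frame will need from its SATD and qscale.
    double predict(double qscale, double satd) const
    {
        assert(qscale > 0 && count > 0);
        return (coeff / count) * satd / qscale;
    }

    // Fold the actually produced bits back into the model.
    void update(double qscale, double satd, double actualBits)
    {
        if (satd <= 0)
            return;                                   // nothing to learn from
        double observed = actualBits * qscale / satd; // coefficient implied by this frame
        coeff = coeff * decay + observed;
        count = count * decay + 1.0;
    }
};
```

For example, a predictor initialized as {1.0, 1.0, 0.5} predicts 500 bits for SATD 1000 at Qscale 2. If the frame actually takes 1000 bits, the next prediction for the same inputs moves up toward 1000 without jumping all the way there, which is the kind of gradual correction the per-frame-type update is meant to provide.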