#码率控制系列
-
码率控制之ABR
码率控制之ABR 使用指南与分析
-
码率控制之MBTree
宏块树 mbtree 使用指南与源码分析
x264源码解析:码率控制之mbtree — --- – x264_macroblock_tree
x264源码解析:lookahead之frametype — x264_slicetype_path (Viterbi算法)【本文】
x264源码解析:lookahead之md/mv — --- x264_slicetype_frame_cost
x264源码解析:lookahead之cost计算基础 x264_slicetype_mb_cost等
x264源码解析:码率控制之mbtree — --- – propagate 计算流程
x264源码解析:码率控制之mbtree — --- – qp调整
-
码率控制之AQ
[x264/x265] Adaptive Quantization 使用指南
x264源码解析:码率控制之adaptive quantization
x265源码解析:码率控制之adaptive quantization【待补充】
-
码率控制之VBV
x264源码解析:码率控制之VBV 帧级
x264源码解析:码率控制之VBV 宏块级
viterbi算法在 slicetype decision中的应用
##适用范围
问题模型:
viterbi算法是多步骤每步多选择模型的最优选择问题。前续所有步骤的最小总代价cost以及路径path都将用于当前步骤的选择,在完成所有步骤的计算后,可以回溯找到其中最优的路径。
示例
参考 翻译
健康状态 = { 健康, 发烧 }
感受状态 = { 正常, 冷, 头晕 }
初始健康概率 = { 健康:0.6, 发烧:0.4 }
健康状况变化概率 = {
健康->健康: 0.7,
健康->发烧: 0.3,
发烧->健康: 0.4,
发烧->发烧: 0.6
}
健康与感受对应概率 = {
健康, (正常: 0.5, 冷: 0.4, 头晕: 0.1;
发烧, (正常: 0.1, 冷: 0.3, 头晕: 0.6
}
问题:
连续三天的个人感觉依次是:正常、冷、头晕,则健康状态最可能的变化路径是?
第一天 感觉正常
P(健康) = P(正常|健康)*P(健康|初始情况) = 0.5 * 0.6 = 0.3
P(发烧) = P(正常|发烧)*P(发烧|初始情况) = 0.1 * 0.4 = 0.04
第一天最可能:健康。
第二天 感觉冷
今天健康:
P(前一天健康,今天健康) = P(前一天健康)*P(健康->健康)*P(冷|健康) = 0.3 * 0.7 * 0.4 = 0.084
P(前一天发烧,今天健康) = P(前一天发烧)*P(发烧->健康)*P(冷|健康) = 0.04 * 0.4 * 0.4 = 0.0064
今天发烧
P(前一天健康,今天发烧) = P(前一天健康)*P(健康->发烧)*P(冷|发烧) = 0.3 * 0.3 *.03 = 0.027
P(前一天发烧,今天发烧) = P(前一天发烧)*P(发烧->发烧)*P(冷|发烧) = 0.04 * 0.6 * 0.3 = 0.0072
第二天最可能:健康(对比健康概率0.084和发烧概率0.027)
第三天 感觉头晕
P(前一天健康,今天健康) = P(前一天健康)*P(健康->健康)*P(头晕|健康) = 0.084 * 0.7 * 0.1 = 0.00588
P(前一天发烧,今天健康) = P(前一天发烧)*P(发烧->健康)*P(头晕|健康) = 0.027 * 0.4 * 0.1 = 0.00108
P(前一天健康,今天发烧) = P(前一天健康)*P(健康->发烧)*P(头晕|发烧) = 0.084 * 0.3 *0.6 = 0.01512
P(前一天发烧,今天发烧) = P(前一天发烧)*P(发烧->发烧)*P(头晕|发烧) = 0.027 * 0.6 * 0.6 = 0.00972
第三天最可能:发烧。
路径是:健康 -> 健康 -> 发烧
x264中的应用
--b-adapt <integer> Adaptive B-frame decision method
- 0: Disabled
- 1: Fast
- 2: Optimal (slow with high --bframes)
参数说明:
- b-adapt>0, 会自适应B帧设置(adaptive B-frame placement);
- b-adapt==2时,会采用viterbiho最优路径算法。
实现
/* Viterbi/trellis slicetype decision algorithm. */
/* Uses strings due to the fact that the speed of the control functions is
negligible compared to the cost of running slicetype_frame_cost, and because
it makes debugging easier. */
static void x264_slicetype_path( x264_t *h, x264_mb_analysis_t *a, x264_frame_t **frames, int length, char (*best_paths)[X264_LOOKAHEAD_MAX+1] )
{
char paths[2][X264_LOOKAHEAD_MAX+1]; ///< paths[2]: best/temp path
int num_paths = X264_MIN( h->param.i_bframe+1, length );///< path数 每个长度对应一个best path
int best_cost = COST_MAX;
int best_possible = 0;
int idx = 0;
/* Iterate over all currently possible paths */
for( int path = 0; path < num_paths; path++ ) ///< B placement 组合
{
/* Add suffixes to the current path */ ///< 初始化 0~bframes B
int len = length - (path + 1);
memcpy( paths[idx], best_paths[len % (X264_BFRAME_MAX+1)], len );
memset( paths[idx]+len, 'B', path );
strcpy( paths[idx]+len+path, "P" );
int possible = 1;
for( int i = 1; i <= length; i++ )
{
int i_type = frames[i]->i_type;
if( i_type == X264_TYPE_AUTO )
continue;
if( IS_X264_TYPE_B( i_type ) )
possible = possible && (i < len || i == length || paths[idx][i-1] == 'B');
else
{
possible = possible && (i < len || paths[idx][i-1] != 'B');
paths[idx][i-1] = IS_X264_TYPE_I( i_type ) ? 'I' : 'P';
}
}
if( possible || !best_possible )
{
if( possible && !best_possible )
best_cost = COST_MAX;
/* Calculate the actual cost of the current path */
int cost = x264_slicetype_path_cost( h, a, frames, paths[idx], best_cost ); ///< cost
if( cost < best_cost )
{
best_cost = cost;
best_possible = possible;
idx ^= 1;
}
}
}
/* Store the best path. */
memcpy( best_paths[length % (X264_BFRAME_MAX+1)], paths[idx^1], length );
}
//! path cost
static int x264_slicetype_path_cost( x264_t *h, x264_mb_analysis_t *a, x264_frame_t **frames, char *path, int threshold )
{
int loc = 1;
int cost = 0;
int cur_nonb = 0;
path--; /* Since the 1st path element is really the second frame */
while( path[loc] )
{ //! next_nonb frame cost(I/P)
int next_nonb = loc;
/* Find the location of the next non-B-frame. */
while( path[next_nonb] == 'B' )
next_nonb++;
/* Add the cost of the non-B-frame found above */
if( path[next_nonb] == 'P' )
cost += x264_slicetype_frame_cost( h, a, frames, cur_nonb, next_nonb, next_nonb );
else /* I-frame */
cost += x264_slicetype_frame_cost( h, a, frames, next_nonb, next_nonb, next_nonb );
/* Early terminate if the cost we have found is larger than the best path cost so far */
if( cost > threshold )
break;
//! B frames' cost
if( h->param.i_bframe_pyramid && next_nonb - cur_nonb > 2 )
{
int middle = cur_nonb + (next_nonb - cur_nonb)/2;
cost += x264_slicetype_frame_cost( h, a, frames, cur_nonb, next_nonb, middle );
for( int next_b = loc; next_b < middle && cost < threshold; next_b++ )
cost += x264_slicetype_frame_cost( h, a, frames, cur_nonb, middle, next_b );
for( int next_b = middle+1; next_b < next_nonb && cost < threshold; next_b++ )
cost += x264_slicetype_frame_cost( h, a, frames, middle, next_nonb, next_b );
}
else
for( int next_b = loc; next_b < next_nonb && cost < threshold; next_b++ )
cost += x264_slicetype_frame_cost( h, a, frames, cur_nonb, next_nonb, next_b );
loc = next_nonb + 1;
cur_nonb = next_nonb;
}
return cost;
}
参:x264源码解析:lookahead之 x264_slicetype_frame_cost
//! frame cost
static int x264_slicetype_frame_cost( x264_t *h, x264_mb_analysis_t *a,
x264_frame_t **frames, int p0, int p1, int b )