天天看点

Viterbi算法 及在x264编码中的应用viterbi算法在 slicetype decision中的应用

#码率控制系列

  • 码率控制之ABR

    码率控制之ABR 使用指南与分析

  • 码率控制之MBTree

    宏块树 mbtree 使用指南与源码分析

    x264源码解析:码率控制之mbtree — --- – x264_macroblock_tree

    x264源码解析:lookahead之frametype — x264_slicetype_path (Viterbi算法)【本文】

    x264源码解析:lookahead之md/mv — --- x264_slicetype_frame_cost

    x264源码解析:lookahead之cost计算基础 x264_slicetype_mb_cost等

    x264源码解析:码率控制之mbtree — --- – propagate 计算流程

    x264源码解析:码率控制之mbtree — --- – qp调整

  • 码率控制之AQ

    [x264/x265] Adaptive Quantization 使用指南

    x264源码解析:码率控制之adaptive quantization

    x265源码解析:码率控制之adaptive quantization【待补充】

  • 码率控制之VBV

    x264源码解析:码率控制之VBV 帧级

    x264源码解析:码率控制之VBV 宏块级

viterbi算法在 slicetype decision中的应用

##适用范围

问题模型:

viterbi算法是多步骤每步多选择模型的最优选择问题。前续所有步骤的最小总代价cost以及路径path都将用于当前步骤的选择,在完成所有步骤的计算后,可以回溯找到其中最优的路径。

示例

参考 翻译

健康状态 = { 健康, 发烧 }
感受状态 = { 正常, 冷,  头晕 }
初始健康概率 = { 健康:0.6, 发烧:0.4 }
健康状况变化概率 = {
	健康->健康: 0.7,
	健康->发烧: 0.3,
	发烧->健康: 0.4,
	发烧->发烧: 0.6
}
健康与感受对应概率 = {
	健康, (正常: 0.5, 冷: 0.4, 头晕: 0.1;
	发烧, (正常: 0.1, 冷: 0.3, 头晕: 0.6
}

问题:
连续三天的个人感觉依次是:正常、冷、头晕,则健康状态最可能的变化路径是?
           

第一天 感觉正常

P(健康) = P(正常|健康)*P(健康|初始情况) = 0.5 * 0.6 = 0.3

P(发烧) = P(正常|发烧)*P(发烧|初始情况) = 0.1 * 0.4 = 0.04

第一天最可能:健康。

第二天 感觉冷

今天健康:

P(前一天健康,今天健康) = P(前一天健康)*P(健康->健康)*P(冷|健康) = 0.3 * 0.7 * 0.4 = 0.084

P(前一天发烧,今天健康) = P(前一天发烧)*P(发烧->健康)*P(冷|健康) = 0.04 * 0.4 * 0.4 = 0.0064

今天发烧

P(前一天健康,今天发烧) = P(前一天健康)*P(健康->发烧)*P(冷|发烧) = 0.3 * 0.3 *.03 = 0.027

P(前一天发烧,今天发烧) = P(前一天发烧)*P(发烧->发烧)*P(冷|发烧) = 0.04 * 0.6 * 0.3 = 0.0072

第二天最可能:健康(对比健康概率0.084和发烧概率0.027)

第三天 感觉头晕

P(前一天健康,今天健康) = P(前一天健康)*P(健康->健康)*P(头晕|健康) = 0.084 * 0.7 * 0.1 = 0.00588

P(前一天发烧,今天健康) = P(前一天发烧)*P(发烧->健康)*P(头晕|健康) = 0.027 * 0.4 * 0.1 = 0.00108

P(前一天健康,今天发烧) = P(前一天健康)*P(健康->发烧)*P(头晕|发烧) = 0.084 * 0.3 *0.6 = 0.01512

P(前一天发烧,今天发烧) = P(前一天发烧)*P(发烧->发烧)*P(头晕|发烧) = 0.027 * 0.6 * 0.6 = 0.00972

第三天最可能:发烧。

路径是:健康 -> 健康 -> 发烧

Viterbi算法 及在x264编码中的应用viterbi算法在 slicetype decision中的应用

x264中的应用

--b-adapt <integer>     Adaptive B-frame decision method
                        - 0: Disabled
                        - 1: Fast
                        - 2: Optimal (slow with high --bframes)
           

参数说明:

  • b-adapt>0, 会自适应B帧设置(adaptive B-frame placement);
  • b-adapt==2时,会采用viterbiho最优路径算法。

实现

/* Viterbi/trellis slicetype decision algorithm. */
/* Uses strings due to the fact that the speed of the control functions is
   negligible compared to the cost of running slicetype_frame_cost, and because
   it makes debugging easier. */
static void x264_slicetype_path( x264_t *h, x264_mb_analysis_t *a, x264_frame_t **frames, int length, char (*best_paths)[X264_LOOKAHEAD_MAX+1] )
{
    char paths[2][X264_LOOKAHEAD_MAX+1];                    ///< paths[2]: best/temp path
    int num_paths = X264_MIN( h->param.i_bframe+1, length );///< path数 每个长度对应一个best path
    int best_cost = COST_MAX;
    int best_possible = 0;
    int idx = 0;

    /* Iterate over all currently possible paths */
    for( int path = 0; path < num_paths; path++ ) ///< B placement 组合
    {
        /* Add suffixes to the current path */    ///< 初始化 0~bframes B
        int len = length - (path + 1);
        memcpy( paths[idx], best_paths[len % (X264_BFRAME_MAX+1)], len );
        memset( paths[idx]+len, 'B', path );
        strcpy( paths[idx]+len+path, "P" );

        int possible = 1;
        for( int i = 1; i <= length; i++ )
        {
            int i_type = frames[i]->i_type;
            if( i_type == X264_TYPE_AUTO )
                continue;
            if( IS_X264_TYPE_B( i_type ) )
                possible = possible && (i < len || i == length || paths[idx][i-1] == 'B');
            else
            {
                possible = possible && (i < len || paths[idx][i-1] != 'B');
                paths[idx][i-1] = IS_X264_TYPE_I( i_type ) ? 'I' : 'P';
            }
        }

        if( possible || !best_possible )
        {
            if( possible && !best_possible )
                best_cost = COST_MAX;
            /* Calculate the actual cost of the current path */
            int cost = x264_slicetype_path_cost( h, a, frames, paths[idx], best_cost );   ///< cost
            if( cost < best_cost )
            {
                best_cost = cost;
                best_possible = possible;
                idx ^= 1;
            }
        }
    }

    /* Store the best path. */
    memcpy( best_paths[length % (X264_BFRAME_MAX+1)], paths[idx^1], length );
}
           
//! path cost
static int x264_slicetype_path_cost( x264_t *h, x264_mb_analysis_t *a, x264_frame_t **frames, char *path, int threshold )
{
    int loc = 1;
    int cost = 0;
    int cur_nonb = 0;
    path--; /* Since the 1st path element is really the second frame */
    while( path[loc] )
    {   //! next_nonb frame cost(I/P)
        int next_nonb = loc;
        /* Find the location of the next non-B-frame. */
        while( path[next_nonb] == 'B' )
            next_nonb++;

        /* Add the cost of the non-B-frame found above */
        if( path[next_nonb] == 'P' )
            cost += x264_slicetype_frame_cost( h, a, frames, cur_nonb, next_nonb, next_nonb );
        else /* I-frame */
            cost += x264_slicetype_frame_cost( h, a, frames, next_nonb, next_nonb, next_nonb );
        /* Early terminate if the cost we have found is larger than the best path cost so far */
        if( cost > threshold )
            break;
		//! B frames' cost
        if( h->param.i_bframe_pyramid && next_nonb - cur_nonb > 2 )
        {
            int middle = cur_nonb + (next_nonb - cur_nonb)/2;
            cost += x264_slicetype_frame_cost( h, a, frames, cur_nonb, next_nonb, middle );
            for( int next_b = loc; next_b < middle && cost < threshold; next_b++ )
                cost += x264_slicetype_frame_cost( h, a, frames, cur_nonb, middle, next_b );
            for( int next_b = middle+1; next_b < next_nonb && cost < threshold; next_b++ )
                cost += x264_slicetype_frame_cost( h, a, frames, middle, next_nonb, next_b );
        }
        else
            for( int next_b = loc; next_b < next_nonb && cost < threshold; next_b++ )
                cost += x264_slicetype_frame_cost( h, a, frames, cur_nonb, next_nonb, next_b );

        loc = next_nonb + 1;
        cur_nonb = next_nonb;
    }
    return cost;
}
           

参:x264源码解析:lookahead之 x264_slicetype_frame_cost

//! frame cost
static int x264_slicetype_frame_cost( x264_t *h, x264_mb_analysis_t *a,
                                      x264_frame_t **frames, int p0, int p1, int b )