天天看點

Viterbi算法 及在x264編碼中的應用viterbi算法在 slicetype decision中的應用

#碼率控制系列

  • 碼率控制之ABR

    碼率控制之ABR 使用指南與分析

  • 碼率控制之MBTree

    宏塊樹 mbtree 使用指南與源碼分析

    x264源碼解析:碼率控制之mbtree — --- – x264_macroblock_tree

    x264源碼解析:lookahead之frametype — x264_slicetype_path (Viterbi算法)【本文】

    x264源碼解析:lookahead之md/mv — --- x264_slicetype_frame_cost

    x264源碼解析:lookahead之cost計算基礎 x264_slicetype_mb_cost等

    x264源碼解析:碼率控制之mbtree — --- – propagate 計算流程

    x264源碼解析:碼率控制之mbtree — --- – qp調整

  • 碼率控制之AQ

    [x264/x265] Adaptive Quantization 使用指南

    x264源碼解析:碼率控制之adaptive quantization

    x265源碼解析:碼率控制之adaptive quantization【待補充】

  • 碼率控制之VBV

    x264源碼解析:碼率控制之VBV 幀級

    x264源碼解析:碼率控制之VBV 宏塊級

viterbi算法在 slicetype decision中的應用

##适用範圍

問題模型:

viterbi算法是多步驟每步多選擇模型的最優選擇問題。前續所有步驟的最小總代價cost以及路徑path都将用于目前步驟的選擇,在完成所有步驟的計算後,可以回溯找到其中最優的路徑。

示例

參考 翻譯

健康狀态 = { 健康, 發燒 }
感受狀态 = { 正常, 冷,  頭暈 }
初始健康機率 = { 健康:0.6, 發燒:0.4 }
健康狀況變化機率 = {
	健康->健康: 0.7,
	健康->發燒: 0.3,
	發燒->健康: 0.4,
	發燒->發燒: 0.6
}
健康與感受對應機率 = {
	健康, (正常: 0.5, 冷: 0.4, 頭暈: 0.1;
	發燒, (正常: 0.1, 冷: 0.3, 頭暈: 0.6
}

問題:
連續三天的個人感覺依次是:正常、冷、頭暈,則健康狀态最可能的變化路徑是?
           

第一天 感覺正常

P(健康) = P(正常|健康)*P(健康|初始情況) = 0.5 * 0.6 = 0.3

P(發燒) = P(正常|發燒)*P(發燒|初始情況) = 0.1 * 0.4 = 0.04

第一天最可能:健康。

第二天 感覺冷

今天健康:

P(前一天健康,今天健康) = P(前一天健康)*P(健康->健康)*P(冷|健康) = 0.3 * 0.7 * 0.4 = 0.084

P(前一天發燒,今天健康) = P(前一天發燒)*P(發燒->健康)*P(冷|健康) = 0.04 * 0.4 * 0.4 = 0.0064

今天發燒

P(前一天健康,今天發燒) = P(前一天健康)*P(健康->發燒)*P(冷|發燒) = 0.3 * 0.3 *.03 = 0.027

P(前一天發燒,今天發燒) = P(前一天發燒)*P(發燒->發燒)*P(冷|發燒) = 0.04 * 0.6 * 0.3 = 0.0072

第二天最可能:健康(對比健康機率0.084和發燒機率0.027)

第三天 感覺頭暈

P(前一天健康,今天健康) = P(前一天健康)*P(健康->健康)*P(頭暈|健康) = 0.084 * 0.7 * 0.1 = 0.00588

P(前一天發燒,今天健康) = P(前一天發燒)*P(發燒->健康)*P(頭暈|健康) = 0.027 * 0.4 * 0.1 = 0.00108

P(前一天健康,今天發燒) = P(前一天健康)*P(健康->發燒)*P(頭暈|發燒) = 0.084 * 0.3 *0.6 = 0.01512

P(前一天發燒,今天發燒) = P(前一天發燒)*P(發燒->發燒)*P(頭暈|發燒) = 0.027 * 0.6 * 0.6 = 0.00972

第三天最可能:發燒。

路徑是:健康 -> 健康 -> 發燒

Viterbi算法 及在x264編碼中的應用viterbi算法在 slicetype decision中的應用

x264中的應用

--b-adapt <integer>     Adaptive B-frame decision method
                        - 0: Disabled
                        - 1: Fast
                        - 2: Optimal (slow with high --bframes)
           

參數說明:

  • b-adapt>0, 會自适應B幀設定(adaptive B-frame placement);
  • b-adapt==2時,會采用viterbiho最優路徑算法。

實作

/* Viterbi/trellis slicetype decision algorithm. */
/* Uses strings due to the fact that the speed of the control functions is
   negligible compared to the cost of running slicetype_frame_cost, and because
   it makes debugging easier. */
static void x264_slicetype_path( x264_t *h, x264_mb_analysis_t *a, x264_frame_t **frames, int length, char (*best_paths)[X264_LOOKAHEAD_MAX+1] )
{
    char paths[2][X264_LOOKAHEAD_MAX+1];                    ///< paths[2]: best/temp path
    int num_paths = X264_MIN( h->param.i_bframe+1, length );///< path數 每個長度對應一個best path
    int best_cost = COST_MAX;
    int best_possible = 0;
    int idx = 0;

    /* Iterate over all currently possible paths */
    for( int path = 0; path < num_paths; path++ ) ///< B placement 組合
    {
        /* Add suffixes to the current path */    ///< 初始化 0~bframes B
        int len = length - (path + 1);
        memcpy( paths[idx], best_paths[len % (X264_BFRAME_MAX+1)], len );
        memset( paths[idx]+len, 'B', path );
        strcpy( paths[idx]+len+path, "P" );

        int possible = 1;
        for( int i = 1; i <= length; i++ )
        {
            int i_type = frames[i]->i_type;
            if( i_type == X264_TYPE_AUTO )
                continue;
            if( IS_X264_TYPE_B( i_type ) )
                possible = possible && (i < len || i == length || paths[idx][i-1] == 'B');
            else
            {
                possible = possible && (i < len || paths[idx][i-1] != 'B');
                paths[idx][i-1] = IS_X264_TYPE_I( i_type ) ? 'I' : 'P';
            }
        }

        if( possible || !best_possible )
        {
            if( possible && !best_possible )
                best_cost = COST_MAX;
            /* Calculate the actual cost of the current path */
            int cost = x264_slicetype_path_cost( h, a, frames, paths[idx], best_cost );   ///< cost
            if( cost < best_cost )
            {
                best_cost = cost;
                best_possible = possible;
                idx ^= 1;
            }
        }
    }

    /* Store the best path. */
    memcpy( best_paths[length % (X264_BFRAME_MAX+1)], paths[idx^1], length );
}
           
//! path cost
static int x264_slicetype_path_cost( x264_t *h, x264_mb_analysis_t *a, x264_frame_t **frames, char *path, int threshold )
{
    int loc = 1;
    int cost = 0;
    int cur_nonb = 0;
    path--; /* Since the 1st path element is really the second frame */
    while( path[loc] )
    {   //! next_nonb frame cost(I/P)
        int next_nonb = loc;
        /* Find the location of the next non-B-frame. */
        while( path[next_nonb] == 'B' )
            next_nonb++;

        /* Add the cost of the non-B-frame found above */
        if( path[next_nonb] == 'P' )
            cost += x264_slicetype_frame_cost( h, a, frames, cur_nonb, next_nonb, next_nonb );
        else /* I-frame */
            cost += x264_slicetype_frame_cost( h, a, frames, next_nonb, next_nonb, next_nonb );
        /* Early terminate if the cost we have found is larger than the best path cost so far */
        if( cost > threshold )
            break;
		//! B frames' cost
        if( h->param.i_bframe_pyramid && next_nonb - cur_nonb > 2 )
        {
            int middle = cur_nonb + (next_nonb - cur_nonb)/2;
            cost += x264_slicetype_frame_cost( h, a, frames, cur_nonb, next_nonb, middle );
            for( int next_b = loc; next_b < middle && cost < threshold; next_b++ )
                cost += x264_slicetype_frame_cost( h, a, frames, cur_nonb, middle, next_b );
            for( int next_b = middle+1; next_b < next_nonb && cost < threshold; next_b++ )
                cost += x264_slicetype_frame_cost( h, a, frames, middle, next_nonb, next_b );
        }
        else
            for( int next_b = loc; next_b < next_nonb && cost < threshold; next_b++ )
                cost += x264_slicetype_frame_cost( h, a, frames, cur_nonb, next_nonb, next_b );

        loc = next_nonb + 1;
        cur_nonb = next_nonb;
    }
    return cost;
}
           

參:x264源碼解析:lookahead之 x264_slicetype_frame_cost

//! frame cost
static int x264_slicetype_frame_cost( x264_t *h, x264_mb_analysis_t *a,
                                      x264_frame_t **frames, int p0, int p1, int b )