Author: 葉餘. Source: https://www.cnblogs.com/leisure_chn/p/10506636.html
FFmpeg container format handling is covered in the following articles:
[1]. FFmpeg container format handling: overview
[2]. FFmpeg container format handling: demuxing example
[3]. FFmpeg container format handling: muxing example
[4]. FFmpeg container format handling: remuxing example
These articles are closely related, but putting everything into one article would make it far too long, hence the split. The section numbering is kept unchanged across them. Everything is based on FFmpeg 4.1.
1. Overview
1.1 Introduction to container formats
A container format can be viewed as an outer shell around encoded streams (audio streams, video streams, and so on): the encoded data is stored inside a file of that container format. "Encapsulation format" and "container" mean the same thing, and "container" is the more vivid term: a container is simply something that holds content. Take a bottled drink as an example: the drink is the content, and the bottle holding it is the container.
Different container formats suit different use cases, and different containers support different sets of codecs. A few common container formats are listed below:
The table below is quoted from "視音頻編解碼技術零基礎學習方法" (a beginner's guide to audio/video codec technology):
Container (file extension) | Published by | Streaming | Supported video codecs | Supported audio codecs | Typical uses
AVI (.avi) | Microsoft | No | Almost all formats | | BT movie/TV downloads
Flash Video (.flv) | Adobe | Yes | Sorenson/VP6/H.264 | MP3/ADPCM/Linear PCM/AAC, etc. | Online video sites
MP4 (.mp4) | MPEG | | MPEG-2/MPEG-4/H.264/H.263, etc. | AAC/MPEG-1 Layers I,II,III/AC-3, etc. |
MPEG-TS (.ts) | MPEG | | MPEG-1/MPEG-2/MPEG-4/H.264 | MPEG-1 Layers I,II,III/AAC | IPTV, digital TV
Matroska (.mkv) | CoreCodec | | | |
Real Video (.rmvb) | RealNetworks | | RealVideo 8, 9, 10 | AAC/Cook Codec/RealAudio Lossless |
1.2 Container formats in FFmpeg
FFmpeg's container format handling covers opening an input file, opening an output file, reading encoded frames from the input file, and writing encoded frames to the output file. None of these steps touches the codec layer.
In FFmpeg, mux (short for multiplex) means combining several streams (video, audio, subtitles, and so on) into a single output (a regular file, a network stream, etc.). demux (demultiplex) is the reverse operation: separating the individual streams (video, audio, subtitles, ...) out of a single input. Muxing deals with output formats, demuxing with input formats. For input/output media there are two related notions: the file format and the container format. The file format is indicated by the file extension, which mainly serves as a hint about the file type (that is, the container format). The container format is the actual container in which the media content is stored. Different container formats map to different file extensions, and the file format is often used loosely to refer to the container format; for example, "ts format" (a file format) is commonly used to mean the mpegts format (a container format).
For example, if we rename test.ts to test.mkv, the mkv extension suggests that the file's container format is Matroska, but the file content has not changed at all, and the ffprobe tool will still correctly detect the container format as mpegts.
1.2.1 Listing the container formats FFmpeg supports
The ffmpeg -formats command lists the container formats FFmpeg supports. There are a great many of them; only a few of the most common ones are shown below:
think@opensuse> ffmpeg -formats
File formats:
D. = Demuxing supported
.E = Muxing supported
--
DE flv FLV (Flash Video)
D aac raw ADTS AAC (Advanced Audio Coding)
DE h264 raw H.264 video
DE hevc raw HEVC video
E mp2 MP2 (MPEG audio layer 2)
DE mp3 MP3 (MPEG audio layer 3)
E mpeg2video raw MPEG-2 video
DE mpegts MPEG-TS (MPEG-2 Transport Stream)
1.2.2 The h264/aac raw-stream container formats
The h264 and aac raw-stream container formats are used in the demuxing and muxing example programs later in this series, so let us discuss them first.
In FFmpeg's audio/video processing, an encoding format refers to the compressed format of audio/video frames: encoders and decoders work with encoding formats, whereas muxers and demuxers work with container formats. Normally "h264" refers to the H.264 encoding format; when it is used as a container format name, it denotes the raw H.264 bitstream format. A raw stream is a stream that carries no container information at all, a stream with no clothes on, so to speak. The aac container format and other raw formats follow the same idea.
Let us look at how the h264 encoding format and the h264 container format are defined in the FFmpeg source tree:
h264 encoding format definition:
h264 decoder definition (the FFmpeg tree contains an h264 decoder but no h264 encoder; a third-party encoder such as libx264 is normally used for H.264 encoding, so only the decoder definition exists):
AVCodec ff_h264_decoder = {
.name = "h264",
.long_name = NULL_IF_CONFIG_SMALL("H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10"),
.type = AVMEDIA_TYPE_VIDEO,
.id = AV_CODEC_ID_H264,
......
};
h264 container format definitions:
h264 muxer (output container format) definition:
AVOutputFormat ff_h264_muxer = {
.name = "h264",
.long_name = NULL_IF_CONFIG_SMALL("raw H.264 video"),
.extensions = "h264,264",
.audio_codec = AV_CODEC_ID_NONE,
.video_codec = AV_CODEC_ID_H264,
.write_header = force_one_stream,
.write_packet = ff_raw_write_packet,
.check_bitstream = h264_check_bitstream,
.flags = AVFMT_NOTIMESTAMPS,
};
h264 demuxer (input container format) definition:
FF_DEF_RAWVIDEO_DEMUXER(h264, "raw H.264 video", h264_probe, "h26l,h264,264,avc", AV_CODEC_ID_H264)
#define FF_DEF_RAWVIDEO_DEMUXER(shortname, longname, probe, ext, id)\
FF_DEF_RAWVIDEO_DEMUXER2(shortname, longname, probe, ext, id, AVFMT_GENERIC_INDEX)
#define FF_DEF_RAWVIDEO_DEMUXER2(shortname, longname, probe, ext, id, flag)\
FF_RAWVIDEO_DEMUXER_CLASS(shortname)\
AVInputFormat ff_ ## shortname ## _demuxer = {\
.name = #shortname,\
.long_name = NULL_IF_CONFIG_SMALL(longname),\
.read_probe = probe,\
.read_header = ff_raw_video_read_header,\
.read_packet = ff_raw_read_partial_packet,\
.extensions = ext,\
.flags = flag,\
.raw_codec_id = id,\
.priv_data_size = sizeof(FFRawVideoDemuxerContext),\
.priv_class = &shortname ## _demuxer_class,\
};
将上述三段代碼宏展開,可得到 h264 解複用器(輸入封裝格式)定義如下:
AVInputFormat ff_h264_demuxer = {
.name = "h264",
.long_name = "raw H.264 video",
.read_probe = h264_probe,
.read_header = ff_raw_video_read_header,
.read_packet = ff_raw_read_partial_packet,
.extensions = "h26l,h264,264,avc",
.flags = AVFMT_GENERIC_INDEX,
.raw_codec_id = AV_CODEC_ID_H264,
.priv_data_size = sizeof(FFRawVideoDemuxerContext),
.priv_class = &h264_demuxer_class,
};
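As an aside (not part of the definitions quoted above), the demuxer's .name field is what you pass when forcing an input format at the API level. Below is a minimal sketch, assuming a hypothetical raw H.264 bitstream file whose extension gives probing nothing useful to work with:
#include <libavformat/avformat.h>

/* Force the raw "h264" demuxer by name (matches ff_h264_demuxer.name above)
 * instead of relying on format probing. The URL is supplied by the caller. */
static int open_raw_h264(const char *url, AVFormatContext **ifmt_ctx)
{
    AVInputFormat *fmt = av_find_input_format("h264");
    if (!fmt)
        return AVERROR_DEMUXER_NOT_FOUND;
    return avformat_open_input(ifmt_ctx, url, fmt, NULL);
}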
1.2.3 The mpegts container format
Now look at the mpegts container format definitions. AVInputFormat describes an input container format and AVOutputFormat describes an output container format. The mpegts input format does not specify any file extensions, while the mpegts output format specifies the extensions "ts,m2t,m2ts,mts".
AVInputFormat ff_mpegts_demuxer = {
.name = "mpegts",
.long_name = NULL_IF_CONFIG_SMALL("MPEG-TS (MPEG-2 Transport Stream)"),
.priv_data_size = sizeof(MpegTSContext),
.read_probe = mpegts_probe,
.read_header = mpegts_read_header,
.read_packet = mpegts_read_packet,
.read_close = mpegts_read_close,
.read_timestamp = mpegts_get_dts,
.flags = AVFMT_SHOW_IDS | AVFMT_TS_DISCONT,
.priv_class = &mpegts_class,
};
AVOutputFormat ff_mpegts_muxer = {
.name = "mpegts",
.long_name = NULL_IF_CONFIG_SMALL("MPEG-TS (MPEG-2 Transport Stream)"),
.mime_type = "video/MP2T",
.extensions = "ts,m2t,m2ts,mts",
.priv_data_size = sizeof(MpegTSWrite),
.audio_codec = AV_CODEC_ID_MP2,
.video_codec = AV_CODEC_ID_MPEG2VIDEO,
.init = mpegts_init,
.write_packet = mpegts_write_packet,
.write_trailer = mpegts_write_end,
.deinit = mpegts_deinit,
.check_bitstream = mpegts_check_bitstream,
.flags = AVFMT_ALLOW_FLUSH | AVFMT_VARIABLE_FPS | AVFMT_NODIMENSIONS,
.priv_class = &mpegts_muxer_class,
};
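Likewise, the muxer's .name field can be used to select the output container explicitly through the API, which is the programmatic counterpart of the -f mpegts command-line option used in section 1.2.4 below. A minimal sketch; the output file name is deliberately left to the caller because the format name, not the extension, decides which muxer is used here:
#include <libavformat/avformat.h>

/* Allocate an output context for the "mpegts" muxer by name; the extension
 * of the output file is ignored for format selection in this case. */
static int alloc_mpegts_output(AVFormatContext **ofmt_ctx, const char *url)
{
    return avformat_alloc_output_context2(ofmt_ctx, NULL, "mpegts", url);
}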
1.2.4 File extensions and container formats
On the FFmpeg command line, a wrong extension on the input file does not matter, because FFmpeg reads a small portion of the file to probe the real container format. For the output, however, if the container format is not specified explicitly, it can only be determined from the output file's extension, so the extension must be correct.
Let us run a few experiments to study the relationship between file extensions and container formats in FFmpeg.
Test file to download (right-click and save as): tnhaoxc.flv
Its file information is as follows:
think@opensuse> ffprobe tnhaoxc.flv
ffprobe version 4.1 Copyright (c) 2007-2018 the FFmpeg developers
Input #0, flv, from 'tnhaoxc.flv':
Metadata:
encoder : Lavf58.20.100
Duration: 00:02:13.68, start: 0.000000, bitrate: 838 kb/s
Stream #0:0: Video: h264 (High), yuv420p(progressive), 784x480, 25 fps, 25 tbr, 1k tbn, 50 tbc
Stream #0:1: Audio: aac (LC), 44100 Hz, stereo, fltp
Experiment 1: convert from the flv container format to the mpegts container format
Use remuxing commands to convert from the flv container format to the mpegts container format, running the following two commands in turn:
ffmpeg -i tnhaoxc.flv -map 0 -c copy tnhaoxc.ts
ffmpeg -i tnhaoxc.flv -map 0 -c copy tnhaoxc.m2t
This produces tnhaoxc.ts and tnhaoxc.m2t. Compare the two files for differences:
diff tnhaoxc.ts tnhaoxc.m2t
The command prints nothing, which means the two files are identical: they differ only in their extension, both are in the mpegts container format, and their contents are exactly the same.
Experiment 2: give the output file a wrong extension
Now try a wrong extension (mistakenly using the container format name as the file extension):
ffmpeg -i tnhaoxc.flv -map 0 -c copy tnhaoxc.mpegts
The command prints the following error:
ffmpeg version 4.1 Copyright (c) 2000-2018 the FFmpeg developers
Input #0, flv, from 'tnhaoxc.flv':
Metadata:
encoder : Lavf58.20.100
Duration: 00:02:13.68, start: 0.000000, bitrate: 838 kb/s
Stream #0:0: Video: h264 (High), yuv420p(progressive), 784x480, 25 fps, 25 tbr, 1k tbn, 50 tbc
Stream #0:1: Audio: aac (LC), 44100 Hz, stereo, fltp
[NULL @ 0x1d62e80] Unable to find a suitable output format for 'tnhaoxc.mpegts'
tnhaoxc.mpegts: Invalid argument
It complains that no suitable output format can be found: FFmpeg cannot determine the output file's container format from this extension.
Experiment 3: wrong extension, but with the container format specified explicitly
Explicitly specify the container format as mpegts with the -f mpegts option:
ffmpeg -i tnhaoxc.flv -map 0 -c copy -f mpegts tnhaoxc.mpegts
The command succeeds. Check whether the file content is correct:
diff tnhaoxc.mpegts tnhaoxc.ts
tnhaoxc.mpegts and tnhaoxc.ts turn out to be exactly the same: although tnhaoxc.mpegts carries a wrong file extension, we still get the container format we wanted.
I am not aware of a single command that lists every extension belonging to a container format (although ffmpeg -h muxer=mpegts does print the mpegts muxer's common extensions). You can also search the FFmpeg source for the format name; searching for "mpegts", for example, shows that its extensions are "ts,m2t,m2ts,mts".
2. API overview
The most important APIs are listed below. FFmpeg refers to both encoded and unencoded frames as "frames"; for convenience, this article calls encoded frames packets and unencoded frames frames.
2.1 avformat_open_input()
/**
* Open an input stream and read the header. The codecs are not opened.
* The stream must be closed with avformat_close_input().
*
* @param ps Pointer to user-supplied AVFormatContext (allocated by avformat_alloc_context).
* May be a pointer to NULL, in which case an AVFormatContext is allocated by this
* function and written into ps.
* Note that a user-supplied AVFormatContext will be freed on failure.
* @param url URL of the stream to open.
* @param fmt If non-NULL, this parameter forces a specific input format.
* Otherwise the format is autodetected.
* @param options A dictionary filled with AVFormatContext and demuxer-private options.
* On return this parameter will be destroyed and replaced with a dict containing
* options that were not found. May be NULL.
*
* @return 0 on success, a negative AVERROR on failure.
*
* @note If you want to use custom IO, preallocate the format context and set its pb field.
*/
int avformat_open_input(AVFormatContext **ps, const char *url, AVInputFormat *fmt, AVDictionary **options);
This function opens the input media file, reads the file header, and stores the format information in the AVFormatContext passed as the first parameter.
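A minimal usage sketch (error handling is reduced to a log message; in real code the returned context must eventually be released with avformat_close_input()):
#include <libavformat/avformat.h>

/* Open an input file, letting FFmpeg auto-detect the container format
 * because the AVInputFormat argument is NULL. */
static AVFormatContext *open_input(const char *url)
{
    AVFormatContext *fmt_ctx = NULL;    /* NULL: let FFmpeg allocate the context */
    int ret = avformat_open_input(&fmt_ctx, url, NULL, NULL);
    if (ret < 0) {
        av_log(NULL, AV_LOG_ERROR, "avformat_open_input() failed: %s\n", av_err2str(ret));
        return NULL;
    }
    return fmt_ctx;
}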
2.2 avformat_find_stream_info()
/**
* Read packets of a media file to get stream information. This
* is useful for file formats with no headers such as MPEG. This
* function also computes the real framerate in case of MPEG-2 repeat
* frame mode.
* The logical file position is not changed by this function;
* examined packets may be buffered for later processing.
*
* @param ic media file handle
* @param options If non-NULL, an ic.nb_streams long array of pointers to
* dictionaries, where i-th member contains options for
* codec corresponding to i-th stream.
* On return each dictionary will be filled with options that were not found.
* @return >=0 if OK, AVERROR_xxx on error
*
* @note this function isn't guaranteed to open all the codecs, so
* options being non-empty at return is a perfectly normal behavior.
*
* @todo Let the user decide somehow what information is needed so that
* we do not waste time getting stuff the user does not need.
*/
int avformat_find_stream_info(AVFormatContext *ic, AVDictionary **options);
This function reads a portion of the media file and tries to decode it, filling the stream information it finds into AVFormatContext.streams. AVFormatContext.streams is an array of pointers whose size is AVFormatContext.nb_streams.
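A minimal sketch that combines this call with av_find_best_stream() to locate a video stream; av_find_best_stream() is not covered in this article and is shown only as a common next step:
#include <libavformat/avformat.h>

/* Fill in stream information, then return the index of the "best" video
 * stream, or a negative AVERROR if none is found. */
static int find_video_stream(AVFormatContext *fmt_ctx)
{
    int ret = avformat_find_stream_info(fmt_ctx, NULL);
    if (ret < 0)
        return ret;
    return av_find_best_stream(fmt_ctx, AVMEDIA_TYPE_VIDEO, -1, -1, NULL, 0);
}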
2.3 av_read_frame()
/**
* Return the next frame of a stream.
* This function returns what is stored in the file, and does not validate
* that what is there are valid frames for the decoder. It will split what is
* stored in the file into frames and return one for each call. It will not
* omit invalid data between valid frames so as to give the decoder the maximum
* information possible for decoding.
*
* If pkt->buf is NULL, then the packet is valid until the next
* av_read_frame() or until avformat_close_input(). Otherwise the packet
* is valid indefinitely. In both cases the packet must be freed with
* av_packet_unref when it is no longer needed. For video, the packet contains
* exactly one frame. For audio, it contains an integer number of frames if each
* frame has a known fixed size (e.g. PCM or ADPCM data). If the audio frames
* have a variable size (e.g. MPEG audio), then it contains one frame.
*
* pkt->pts, pkt->dts and pkt->duration are always set to correct
* values in AVStream.time_base units (and guessed if the format cannot
* provide them). pkt->pts can be AV_NOPTS_VALUE if the video format
* has B-frames, so it is better to rely on pkt->dts if you do not
* decompress the payload.
*
* @return 0 if OK, < 0 on error or end of file
*/
int av_read_frame(AVFormatContext *s, AVPacket *pkt);
This function is used in the demuxing process.
It splits the data stored in the input file into packets, returning one packet per call. A packet may be a video frame, an audio frame, or other data; decoders only decode video or audio frames, but non-audio/video data is not discarded, so that the decoder is given as much information as possible.
For video, one packet contains exactly one video frame. For audio, if the format has a fixed frame size, a packet may contain an integer number of audio frames; if the frame size is variable, a packet contains exactly one audio frame.
Every packet obtained this way must be released with av_packet_unref(AVPacket *pkt) once it has been consumed; otherwise memory will leak.
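A minimal demuxing-loop sketch following these rules; the consume callback is hypothetical and stands in for whatever the caller does with each video packet:
#include <libavformat/avformat.h>

/* Read packets until end of file, hand video packets to the caller's
 * callback, and unref every packet to avoid leaking memory. */
static int read_all_packets(AVFormatContext *fmt_ctx, int video_index,
                            void (*consume)(const AVPacket *pkt))
{
    AVPacket pkt;
    int ret;

    av_init_packet(&pkt);
    pkt.data = NULL;
    pkt.size = 0;
    while ((ret = av_read_frame(fmt_ctx, &pkt)) >= 0) {
        if (pkt.stream_index == video_index)
            consume(&pkt);
        av_packet_unref(&pkt);              /* mandatory after each packet */
    }
    return ret == AVERROR_EOF ? 0 : ret;    /* AVERROR_EOF is normal termination */
}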
2.4 av_write_frame()
/**
* Write a packet to an output media file.
*
* This function passes the packet directly to the muxer, without any buffering
* or reordering. The caller is responsible for correctly interleaving the
* packets if the format requires it. Callers that want libavformat to handle
* the interleaving should call av_interleaved_write_frame() instead of this
* function.
*
* @param s media file handle
* @param pkt The packet containing the data to be written. Note that unlike
* av_interleaved_write_frame(), this function does not take
* ownership of the packet passed to it (though some muxers may make
* an internal reference to the input packet).
* <br>
* This parameter can be NULL (at any time, not just at the end), in
* order to immediately flush data buffered within the muxer, for
* muxers that buffer up data internally before writing it to the
* output.
* <br>
* Packet's @ref AVPacket.stream_index "stream_index" field must be
* set to the index of the corresponding stream in @ref
* AVFormatContext.streams "s->streams".
* <br>
* The timestamps (@ref AVPacket.pts "pts", @ref AVPacket.dts "dts")
* must be set to correct values in the stream's timebase (unless the
* output format is flagged with the AVFMT_NOTIMESTAMPS flag, then
* they can be set to AV_NOPTS_VALUE).
* The dts for subsequent packets passed to this function must be strictly
* increasing when compared in their respective timebases (unless the
* output format is flagged with the AVFMT_TS_NONSTRICT, then they
* merely have to be nondecreasing). @ref AVPacket.duration
* "duration") should also be set if known.
* @return < 0 on error, = 0 if OK, 1 if flushed and there is no more data to flush
*
* @see av_interleaved_write_frame()
*/
int av_write_frame(AVFormatContext *s, AVPacket *pkt);
This function is used in the muxing process: it writes a packet to the output media.
It passes the packet straight to the muxer without buffering or reordering anything, and it does not handle interleaving of packets from different streams; that is the caller's responsibility.
Packet interleaving means that audio packets and video packets must be stored in the output media file strictly interleaved in order of increasing packet dts.
Callers that do not want to deal with interleaving themselves should call av_interleaved_write_frame() instead of this function.
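For the special case of a single-stream output (for example the raw h264 muxer shown in section 1.2.2) there is nothing to interleave, so av_write_frame() can be used directly. A minimal sketch, assuming pkt already carries valid data:
#include <libavformat/avformat.h>

/* Write one packet to an output that contains only one stream; the caller
 * keeps ownership of pkt and must unref it afterwards. */
static int write_single_stream_packet(AVFormatContext *ofmt_ctx, AVPacket *pkt)
{
    pkt->stream_index = 0;                  /* index of the only output stream */
    return av_write_frame(ofmt_ctx, pkt);
}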
2.5 av_interleaved_write_frame()
/**
* Write a packet to an output media file ensuring correct interleaving.
*
* This function will buffer the packets internally as needed to make sure the
* packets in the output file are properly interleaved in the order of
* increasing dts. Callers doing their own interleaving should call
* av_write_frame() instead of this function.
*
* Using this function instead of av_write_frame() can give muxers advance
* knowledge of future packets, improving e.g. the behaviour of the mp4
* muxer for VFR content in fragmenting mode.
*
* @param s media file handle
* @param pkt The packet containing the data to be written.
* <br>
* If the packet is reference-counted, this function will take
* ownership of this reference and unreference it later when it sees
* fit.
* The caller must not access the data through this reference after
* this function returns. If the packet is not reference-counted,
* libavformat will make a copy.
* <br>
* This parameter can be NULL (at any time, not just at the end), to
* flush the interleaving queues.
* <br>
* Packet's @ref AVPacket.stream_index "stream_index" field must be
* set to the index of the corresponding stream in @ref
* AVFormatContext.streams "s->streams".
* <br>
* The timestamps (@ref AVPacket.pts "pts", @ref AVPacket.dts "dts")
* must be set to correct values in the stream's timebase (unless the
* output format is flagged with the AVFMT_NOTIMESTAMPS flag, then
* they can be set to AV_NOPTS_VALUE).
* The dts for subsequent packets in one stream must be strictly
* increasing (unless the output format is flagged with the
* AVFMT_TS_NONSTRICT, then they merely have to be nondecreasing).
* @ref AVPacket.duration "duration") should also be set if known.
*
* @return 0 on success, a negative AVERROR on error. Libavformat will always
* take care of freeing the packet, even if this function fails.
*
* @see av_write_frame(), AVFormatContext.max_interleave_delta
*/
int av_interleaved_write_frame(AVFormatContext *s, AVPacket *pkt);
本函數将按需在内部緩存 packet,進而確定輸出媒體中不同流的 packet 能按照 dts 增長的順序正确交織。
2.6 avio_open()
/**
* Create and initialize a AVIOContext for accessing the
* resource indicated by url.
* @note When the resource indicated by url has been opened in
* read+write mode, the AVIOContext can be used only for writing.
*
* @param s Used to return the pointer to the created AVIOContext.
* In case of failure the pointed to value is set to NULL.
* @param url resource to access
* @param flags flags which control how the resource indicated by url
* is to be opened
* @return >= 0 in case of success, a negative value corresponding to an
* AVERROR code in case of failure
*/
int avio_open(AVIOContext **s, const char *url, int flags);
Creates and initializes an AVIOContext for accessing the output media file.
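A minimal sketch; the AVFMT_NOFILE check is there because some muxers handle their own I/O and the caller must not open pb for them:
#include <libavformat/avformat.h>

/* Open the output file's AVIOContext so that avformat_write_header() can
 * write to it, unless the muxer does its own I/O (AVFMT_NOFILE). */
static int open_output_io(AVFormatContext *ofmt_ctx, const char *url)
{
    if (ofmt_ctx->oformat->flags & AVFMT_NOFILE)
        return 0;
    return avio_open(&ofmt_ctx->pb, url, AVIO_FLAG_WRITE);
}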
2.7 avformat_write_header()
/**
* Allocate the stream private data and write the stream header to
* an output media file.
*
* @param s Media file handle, must be allocated with avformat_alloc_context().
* Its oformat field must be set to the desired output format;
* Its pb field must be set to an already opened AVIOContext.
* @param options An AVDictionary filled with AVFormatContext and muxer-private options.
* On return this parameter will be destroyed and replaced with a dict containing
* options that were not found. May be NULL.
*
* @return AVSTREAM_INIT_IN_WRITE_HEADER on success if the codec had not already been fully initialized in avformat_init,
* AVSTREAM_INIT_IN_INIT_OUTPUT on success if the codec had already been fully initialized in avformat_init,
* negative AVERROR on failure.
*
* @see av_opt_find, av_dict_set, avio_open, av_oformat_next, avformat_init_output.
*/
av_warn_unused_result
int avformat_write_header(AVFormatContext *s, AVDictionary **options);
Writes the file header to the output file.
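A minimal stream-copy sketch, assuming an already opened input context and an output context whose pb has been opened as shown above; clearing codec_tag is a common precaution when copying codec parameters between different containers:
#include <libavformat/avformat.h>

/* Create one output stream per input stream, copy the codec parameters,
 * then write the container header. */
static int copy_streams_and_write_header(AVFormatContext *ifmt_ctx,
                                         AVFormatContext *ofmt_ctx)
{
    unsigned int i;

    for (i = 0; i < ifmt_ctx->nb_streams; i++) {
        AVStream *out_stream = avformat_new_stream(ofmt_ctx, NULL);
        int ret;
        if (!out_stream)
            return AVERROR(ENOMEM);
        ret = avcodec_parameters_copy(out_stream->codecpar,
                                      ifmt_ctx->streams[i]->codecpar);
        if (ret < 0)
            return ret;
        out_stream->codecpar->codec_tag = 0;    /* let the muxer choose its own tag */
    }
    return avformat_write_header(ofmt_ctx, NULL);
}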
2.8 av_write_trailer()
/**
* Write the stream trailer to an output media file and free the
* file private data.
*
* May only be called after a successful call to avformat_write_header.
*
* @param s media file handle
* @return 0 if OK, AVERROR_xxx on error
*/
int av_write_trailer(AVFormatContext *s);
Writes the file trailer to the output file.
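A minimal teardown sketch that matches the avio_open() call shown earlier:
#include <libavformat/avformat.h>

/* Finish the output file, close its AVIOContext (if we opened one), and
 * free the output format context. */
static void finish_output(AVFormatContext *ofmt_ctx)
{
    av_write_trailer(ofmt_ctx);
    if (!(ofmt_ctx->oformat->flags & AVFMT_NOFILE))
        avio_closep(&ofmt_ctx->pb);
    avformat_free_context(ofmt_ctx);
}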
6. References
[1] WIKI, Digital_container_format
[2] WIKI, Comparison_of_container_formats
[3] 雷霄骅, 使用 FFMPEG 類庫分離出多媒體檔案中的 H.264 碼流, https://blog.csdn.net/leixiaohua1020/article/details/11800877
[4] 雷霄骅, 最簡單的基于 FFmpeg 的封裝格式處理:視音頻分離器簡化版, https://blog.csdn.net/leixiaohua1020/article/details/39767055
7. Revision history
2019-03-08 V1.0 First draft of the demuxing example
2019-03-09 V1.0 Split the notes into separate articles
2019-03-10 V1.0 Added the muxing and remuxing examples