Recap of the previous article:
The previous post, "Android Hardware Codec MediaCodec Explained — A Story from a Pork Restaurant (Part 1)", covered the MediaCodec workflow and its state machine. Today we move on to practice and dissect MediaCodec at the code level. If you haven't read the previous article, it's worth reading first so this one follows on seamlessly.
A MediaCodec Code Example
The code we'll walk through is grafika, Google's official MediaCodec sample project. grafika is made up of several demos, such as decoding and playing a video, recording video in real time while encoding it to H.264 and saving it locally, and screen recording; each demo focuses on a particular technique.
Below is grafika's home screen; each entry is one demo:
Today we start with the most basic first demo: decoding a local MP4 video.
As the gif shows, this is a very simple video. The feature simply decodes an MP4 video and renders the decoded data to the screen; the corresponding code is in com.android.grafika.PlayMovieActivity, and the basic flow is shown below:
The core decoding code all lives in MoviePlayer.
Demultiplexing Code Walkthrough
The first concept to understand is multiplexing, also called muxing or containerization: packing already-compressed video and audio data together in a defined format. The container formats familiar to any avid movie watcher, such as MP4, MKV, RMVB, TS, FLV, and AVI, are multiplexing formats.
FLV data, for example, packs an H.264-encoded video stream together with an AAC-encoded audio stream.
An FLV file consists of an FLV Header followed by a sequence of Tags, which carry the audio and video data. The FLV structure is shown below (image from "Getting Started with Audio/Video Data Processing: FLV Container Format Explained"):
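To make that layout concrete, here is a minimal header parser (my own sketch, not part of grafika) for the 9-byte FLV header described above: a 3-byte "FLV" signature, a version byte, a flags byte in which bit 2 marks audio and bit 0 marks video, and a big-endian 4-byte DataOffset that is normally 9:

```java
// Minimal FLV header parser: a sketch to illustrate the container layout,
// not production code.
class FlvHeader {
    public final int version;
    public final boolean hasAudio;
    public final boolean hasVideo;
    public final long dataOffset;

    private FlvHeader(int version, boolean hasAudio, boolean hasVideo, long dataOffset) {
        this.version = version;
        this.hasAudio = hasAudio;
        this.hasVideo = hasVideo;
        this.dataOffset = dataOffset;
    }

    public static FlvHeader parse(byte[] b) {
        // The header is exactly 9 bytes and starts with the ASCII signature "FLV".
        if (b.length < 9 || b[0] != 'F' || b[1] != 'L' || b[2] != 'V') {
            throw new IllegalArgumentException("not an FLV stream");
        }
        int version = b[3] & 0xFF;
        int flags = b[4] & 0xFF;
        boolean hasAudio = (flags & 0x04) != 0; // TypeFlagsAudio
        boolean hasVideo = (flags & 0x01) != 0; // TypeFlagsVideo
        // DataOffset: big-endian UI32, normally 9 for version 1.
        long dataOffset = ((b[5] & 0xFFL) << 24) | ((b[6] & 0xFFL) << 16)
                | ((b[7] & 0xFFL) << 8) | (b[8] & 0xFFL);
        return new FlvHeader(version, hasAudio, hasVideo, dataOffset);
    }
}
```

Everything after DataOffset is the Tag stream mentioned above; a real demuxer would keep reading PreviousTagSize/Tag pairs from there.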
So before the video can be decoded, the H.264 data must first be pulled out of the container. The Android platform provides MediaExtractor to make demultiplexing easy.
Below is the MediaExtractor usage template from the official documentation:
MediaExtractor extractor = new MediaExtractor();
extractor.setDataSource(...);
int numTracks = extractor.getTrackCount();
//iterate over every track (audio or video stream) in the container, read its mime type, and select the one we want to process
for (int i = 0; i < numTracks; ++i) {
MediaFormat format = extractor.getTrackFormat(i);
String mime = format.getString(MediaFormat.KEY_MIME);
if (weAreInterestedInThisTrack) {
//select the track whose mime type we want to process
extractor.selectTrack(i);
}
}
ByteBuffer inputBuffer = ByteBuffer.allocate(...)
//repeatedly read the selected audio or video stream into inputBuffer
while (extractor.readSampleData(inputBuffer, ...) >= 0) {
int trackIndex = extractor.getSampleTrackIndex();
long presentationTimeUs = extractor.getSampleTime();
...
extractor.advance();
}
extractor.release();
extractor = null;
The comments are fairly detailed and mostly self-explanatory.
First, a word on MediaFormat: a class dedicated to describing media formats through a set of key/value pairs, for example the generic format keys:
Video-specific format keys:
Audio-specific format keys:
In the template code above, the value of KEY_MIME is read to determine the media type.
Common video mime types include:
"video/x-vnd.on2.vp8" - VP8 video (i.e. video in .webm)
"video/x-vnd.on2.vp9" - VP9 video (i.e. video in .webm)
"video/avc" - H.264/AVC video
"video/hevc" - H.265/HEVC video
"video/mp4v-es" - MPEG4 video
"video/3gpp" - H.263 video
Since the encoding discussed in this series is mainly H.264, the mime type we care about is "video/avc".
In grafika's MoviePlayer constructor, com.android.grafika.MoviePlayer#MoviePlayer, MediaExtractor is used to obtain the video's width and height:
//demultiplexing
MediaExtractor extractor = null;
try {
extractor = new MediaExtractor();
//pass in the video file path
extractor.setDataSource(sourceFile.toString());
int trackIndex = selectTrack(extractor);
if (trackIndex < 0) {
throw new RuntimeException("No video track found in " + mSourceFile);
}
//select the track we found (the video track); all later calls operate on it
extractor.selectTrack(trackIndex);
//get the video's width and height from this track's MediaFormat
MediaFormat format = extractor.getTrackFormat(trackIndex);
Log.d(TAG, "extractor.getTrackFormat format" + format);
//the video's width and height
mVideoWidth = format.getInteger(MediaFormat.KEY_WIDTH);
mVideoHeight = format.getInteger(MediaFormat.KEY_HEIGHT);
if (VERBOSE) {
Log.d(TAG, "Video size is " + mVideoWidth + "x" + mVideoHeight);
}
} finally {
if (extractor != null) {
extractor.release();
}
}
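The selectTrack() helper called above is not shown in this excerpt; grafika's version essentially scans the container's tracks for the first video mime type, along the lines of this sketch (treat the exact body as illustrative):

```java
// Scan the container's tracks and return the index of the first video track,
// or -1 if none is found (sketch of grafika's MoviePlayer#selectTrack).
private static int selectTrack(MediaExtractor extractor) {
    int numTracks = extractor.getTrackCount();
    for (int i = 0; i < numTracks; i++) {
        MediaFormat format = extractor.getTrackFormat(i);
        String mime = format.getString(MediaFormat.KEY_MIME);
        if (mime != null && mime.startsWith("video/")) {
            return i;
        }
    }
    return -1;
}
```

The "video/" prefix check matches any of the video mime types listed earlier, so the same helper works whether the file carries H.264, H.265, or VP9.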
In the actual playback method, com.android.grafika.MoviePlayer#play, a MediaCodec decoder is created from the extracted mime type:
MediaFormat format = extractor.getTrackFormat(trackIndex);
Log.d(TAG, "getTrackFormat format:" + format);
// Create a MediaCodec decoder, and configure it with the MediaFormat from the
// extractor. It's very important to use the format from the extractor because
// it contains a copy of the CSD-0/CSD-1 codec-specific data chunks.
String mime = format.getString(MediaFormat.KEY_MIME);
Log.d(TAG, "createDecoderByType mime:" + mime);
//initialize a decoder from the video's mime type
MediaCodec decoder = MediaCodec.createDecoderByType(mime);
At this point MediaCodec is in the Uninitialized sub-state of the Stopped state. Next we bring it up (the owner tidies up the kitchen, tables and chairs; the restaurant is about to open):
//configure the decoder with the MediaFormat and the output Surface; the decoder enters the Configured state
decoder.configure(format, mOutputSurface, null, 0);
//start the decoder; it enters the Executing state
// Immediately after start() the codec is in the Flushed sub-state, where it holds all the buffers
decoder.start();
//the actual decoding flow
doExtract(extractor, trackIndex, decoder, mFrameCallback);
Note that configure() is passed the Surface object mOutputSurface. As discussed in "Android Hardware Codec MediaCodec Explained — A Story from a Pork Restaurant (Part 1)", for raw video data:
video codecs support three kinds of color formats, the second being the native raw video format COLOR_FormatSurface, which handles Surface-mode input and output. This Surface object is obtained from the Activity's TextureView:
//MoviePlayer renders the decoded raw video data onto the TextureView through this Surface
SurfaceTexture st = mTextureView.getSurfaceTexture();
Surface surface = new Surface(st);
MoviePlayer player = null;
try {
player = new MoviePlayer(
new File(getFilesDir(), mMovieFiles[mSelectedMovie]), surface, callback);
} catch (IOException ioe) {
Log.e(TAG, "Unable to play movie", ioe);
surface.release();
return;
}
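One practical caveat (my note, not spelled out in the article): getSurfaceTexture() returns null until the TextureView is attached and its surface has been created, so playback must only be started after the availability callback fires. A sketch of that wiring, where mSurfaceTextureReady is a hypothetical flag the Activity would check before building the player:

```java
// Wait for the TextureView's SurfaceTexture before creating MoviePlayer;
// before onSurfaceTextureAvailable fires, getSurfaceTexture() returns null.
mTextureView.setSurfaceTextureListener(new TextureView.SurfaceTextureListener() {
    @Override
    public void onSurfaceTextureAvailable(SurfaceTexture st, int width, int height) {
        mSurfaceTextureReady = true;  // now safe to wrap it in a Surface and start playback
    }

    @Override
    public void onSurfaceTextureSizeChanged(SurfaceTexture st, int width, int height) {}

    @Override
    public boolean onSurfaceTextureDestroyed(SurfaceTexture st) {
        return true;  // returning true lets TextureView release the SurfaceTexture
    }

    @Override
    public void onSurfaceTextureUpdated(SurfaceTexture st) {}
});
```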
Decoding Code Walkthrough
At this point MediaCodec has started and enters the big input/output loop (picture the buyer loading raw pork into baskets and handing them to the chef, over and over, while the chef plates each finished dish for the customers). The key code is com.android.grafika.MoviePlayer#doExtract:
/**
* Work loop. We execute here until we run out of video or are told to stop.
*/
private void doExtract(MediaExtractor extractor, int trackIndex, MediaCodec decoder,
FrameCallback frameCallback) {
// We need to strike a balance between providing input and reading output that
// operates efficiently without delays on the output side.
//
// To avoid delays on the output side, we need to keep the codec's input buffers
// fed. There can be significant latency between submitting frame N to the decoder
// and receiving frame N on the output, so we need to stay ahead of the game.
//
// Many video decoders seem to want several frames of video before they start
// producing output -- one implementation wanted four before it appeared to
// configure itself. We need to provide a bunch of input frames up front, and try
// to keep the queue full as we go.
//
// (Note it's possible for the encoded data to be written to the stream out of order,
// so we can't generally submit a single frame and wait for it to appear.)
//
// We can't just fixate on the input side though. If we spend too much time trying
// to stuff the input, we might miss a presentation deadline. At 60Hz we have 16.7ms
// between frames, so sleeping for 10ms would eat up a significant fraction of the
// time allowed. (Most video is at 30Hz or less, so for most content we'll have
// significantly longer.) Waiting for output is okay, but sleeping on availability
// of input buffers is unwise if we need to be providing output on a regular schedule.
//
//
// In some situations, startup latency may be a concern. To minimize startup time,
// we'd want to stuff the input full as quickly as possible. This turns out to be
// somewhat complicated, as the codec may still be starting up and will refuse to
// accept input. Removing the timeout from dequeueInputBuffer() results in spinning
// on the CPU.
//
// If you have tight startup latency requirements, it would probably be best to
// "prime the pump" with a sequence of frames that aren't actually shown (e.g.
// grab the first 10 NAL units and shove them through, then rewind to the start of
// the first key frame).
//
// The actual latency seems to depend on strongly on the nature of the video (e.g.
// resolution).
//
//
// One conceptually nice approach is to loop on the input side to ensure that the codec
// always has all the input it can handle. After submitting a buffer, we immediately
// check to see if it will accept another. We can use a short timeout so we don't
// miss a presentation deadline. On the output side we only check once, with a longer
// timeout, then return to the outer loop to see if the codec is hungry for more input.
//
// In practice, every call to check for available buffers involves a lot of message-
// passing between threads and processes. Setting a very brief timeout doesn't
// exactly work because the overhead required to determine that no buffer is available
// is substantial. On one device, the "clever" approach caused significantly greater
// and more highly variable startup latency.
//
// The code below takes a very simple-minded approach that works, but carries a risk
// of occasionally running out of output. A more sophisticated approach might
// detect an output timeout and use that as a signal to try to enqueue several input
// buffers on the next iteration.
//
// If you want to experiment, set the VERBOSE flag to true and watch the behavior
// in logcat. Use "logcat -v threadtime" to see sub-second timing.
//timeout used when dequeuing decoded output data
final int TIMEOUT_USEC = 0;
//array of input ByteBuffers (newer MediaCodec versions replace this with getInputBuffer, which returns a buffer directly)
ByteBuffer[] decoderInputBuffers = decoder.getInputBuffers();
//counts which chunk of data is being submitted
int inputChunk = 0;
//used to log the first-frame decode latency
long firstInputTimeNsec = -1;
boolean outputDone = false;
boolean inputDone = false;
while (!outputDone) {
if (VERBOSE) Log.d(TAG, "loop");
if (mIsStopRequested) {
Log.d(TAG, "Stop requested");
return;
}
// Feed more data to the decoder.
if (!inputDone) {
//get the index of an available input ByteBuffer
int inputBufIndex = decoder.dequeueInputBuffer(TIMEOUT_USEC);
if (inputBufIndex >= 0) {
if (firstInputTimeNsec == -1) {
firstInputTimeNsec = System.nanoTime();
}
//look up the input ByteBuffer for this index
ByteBuffer inputBuf = decoderInputBuffers[inputBufIndex];
Log.d(TAG, "decoderInputBuffers inputBuf:" + inputBuf + ",inputBufIndex:" + inputBufIndex);
// Read the sample data into the ByteBuffer. This neither respects nor
// updates inputBuf's position, limit, etc.
//size of one sample's worth of data read from the media file
int chunkSize = extractor.readSampleData(inputBuf, 0);
if (chunkSize < 0) {
//end of file: set the flag and send an empty frame so the decoder knows exactly where the stream ends
// End of stream -- send empty frame with EOS flag set.
//When you queue an input buffer with the end-of-stream marker, the codec transitions
// to the End-of-Stream sub-state. In this state the codec no longer accepts further
// input buffers, but still generates output buffers until the end-of-stream is reached
// on the output.
decoder.queueInputBuffer(inputBufIndex, 0, 0, 0L,
MediaCodec.BUFFER_FLAG_END_OF_STREAM);
Log.d(TAG, "queueInputBuffer");
inputDone = true;
if (VERBOSE) Log.d(TAG, "sent input EOS");
} else {
if (extractor.getSampleTrackIndex() != trackIndex) {
Log.w(TAG, "WEIRD: got sample from track " +
extractor.getSampleTrackIndex() + ", expected " + trackIndex);
}
//presentation time of the current sample
long presentationTimeUs = extractor.getSampleTime();
//submit the data in the buffer at inputBufIndex to MediaCodec
decoder.queueInputBuffer(inputBufIndex, 0, chunkSize,
presentationTimeUs, 0 /*flags*/);
Log.d(TAG, "queueInputBuffer inputBufIndex:" + inputBufIndex);
if (VERBOSE) {
Log.d(TAG, "submitted frame " + inputChunk + " to dec, size=" +
chunkSize);
}
//counts which chunk of data was submitted
inputChunk++;
//advance the extractor's read cursor
extractor.advance();
}
} else {
if (VERBOSE) Log.d(TAG, "input buffer not available");
}
}
if (!outputDone) {
//on success, we get the index of the decoded buffer within the output buffers, and its metadata is placed in mBufferInfo;
// on failure, we get a decoder status code instead
int outputBufferIndex = decoder.dequeueOutputBuffer(mBufferInfo, TIMEOUT_USEC);
Log.d(TAG, "dequeueOutputBuffer decoderBufferIndex:" + outputBufferIndex + ",mBufferInfo:" + mBufferInfo);
if (outputBufferIndex == MediaCodec.INFO_TRY_AGAIN_LATER) {
// no output available yet
if (VERBOSE) Log.d(TAG, "no output from decoder available");
} else if (outputBufferIndex == MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED) {
// not important for us, since we're using Surface
if (VERBOSE) Log.d(TAG, "decoder output buffers changed");
} else if (outputBufferIndex == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
MediaFormat newFormat = decoder.getOutputFormat();
if (VERBOSE) Log.d(TAG, "decoder output format changed: " + newFormat);
} else if (outputBufferIndex < 0) {
throw new RuntimeException(
"unexpected result from decoder.dequeueOutputBuffer: " +
outputBufferIndex);
} else { // decoderStatus >= 0
if (firstInputTimeNsec != 0) {
// Log the delay from the first buffer of input to the first buffer
// of output.
long nowNsec = System.nanoTime();
Log.d(TAG, "startup lag " + ((nowNsec - firstInputTimeNsec) / 1000000.0) + " ms");
firstInputTimeNsec = 0;
}
boolean doLoop = false;
if (VERBOSE) Log.d(TAG, "surface decoder given buffer " + outputBufferIndex +
" (output mBufferInfo size=" + mBufferInfo.size + ")");
//check whether the end of the file was reached; the MediaCodec.BUFFER_FLAG_END_OF_STREAM flag set above is detected here
if ((mBufferInfo.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) {
if (VERBOSE) Log.d(TAG, "output EOS");
if (mLoop) {
doLoop = true;
} else {
outputDone = true;
}
}
//if the decoded buffer's size is greater than 0, it needs to be rendered
boolean doRender = (mBufferInfo.size != 0);
// As soon as we call releaseOutputBuffer, the buffer will be forwarded
// to SurfaceTexture to convert to a texture. We can't control when it
// appears on-screen, but we can manage the pace at which we release
// the buffers.
if (doRender && frameCallback != null) {
//pre-render callback; the implementation sleeps for a while as needed to keep the frame rate steady
frameCallback.preRender(mBufferInfo.presentationTimeUs);
}
//get the output buffer array (replaced by getOutputBuffer in newer versions)
ByteBuffer[] decoderOutputBuffers = decoder.getOutputBuffers();
Log.d(TAG, "decoderOutputBuffers.length:" + decoderOutputBuffers.length);
//将輸出buffer數組的第outputBufferIndex個buffer繪制到surface。doRender為true繪制到配置的surface
decoder.releaseOutputBuffer(outputBufferIndex, doRender);
if (doRender && frameCallback != null) {
//post-render callback
frameCallback.postRender();
}
if (doLoop) {
Log.d(TAG, "Reached EOS, looping");
//when looping, reset the extractor's cursor to the starting position
extractor.seekTo(0, MediaExtractor.SEEK_TO_CLOSEST_SYNC);
inputDone = false;
//重置decoder到Flushed狀态,不然就沒法開始新一輪播放
// You can move back to the Flushed sub-state at any time while
// in the Executing state using flush().
//You can move back to the Flushed sub-state at any time while in the Executing state using flush()
decoder.flush(); // reset decoder state
frameCallback.loopReset();
}
}
}
}
}
The code carries both the official comments and my own detailed ones; let me highlight a few key points:
1. The buyer asks the chef whether a basket is free: first, ask MediaCodec whether an input buffer is currently available:
int inputBufIndex = decoder.dequeueInputBuffer(TIMEOUT_USEC);
The method is defined as:
/**
* Returns the index of an input buffer to be filled with valid data
* or -1 if no such buffer is currently available.
* This method will return immediately if timeoutUs == 0, wait indefinitely
* for the availability of an input buffer if timeoutUs < 0 or wait up
* to "timeoutUs" microseconds if timeoutUs > 0.
* @param timeoutUs The timeout in microseconds, a negative timeout indicates "infinite".
* @throws IllegalStateException if not in the Executing state,
* or codec is configured in asynchronous mode.
* @throws MediaCodec.CodecException upon codec error.
*/
public final int dequeueInputBuffer(long timeoutUs) {
int res = native_dequeueInputBuffer(timeoutUs);
if (res >= 0) {
synchronized(mBufferLock) {
validateInputByteBuffer(mCachedInputBuffers, res);
}
}
return res;
}
TIMEOUT_USEC is the wait timeout. If the returned inputBufIndex is greater than or equal to 0, a buffer is available, and inputBufIndex is its index inside MediaCodec. If no buffer turns up within TIMEOUT_USEC, a negative inputBufIndex is returned and we try again on the next loop iteration.
2. The buyer loads raw pork into a basket and hands it to the chef: each call to MediaExtractor's readSampleData reads a chunk of video data into a ByteBuffer, which queueInputBuffer then hands over to MediaCodec for internal processing.
//read one sample's worth of data from the media file into inputBuf
int chunkSize = extractor.readSampleData(inputBuf, 0);
Method definition:
/**
* Retrieve the current encoded sample and store it in the byte buffer
* starting at the given offset.
* <p>
* <b>Note:</b>As of API 21, on success the position and limit of
* {@code byteBuf} is updated to point to the data just read.
* @param byteBuf the destination byte buffer
* @return the sample size (or -1 if no more samples are available).
*/
public native int readSampleData(@NonNull ByteBuffer byteBuf, int offset);
As mentioned in "Android Hardware Codec MediaCodec Explained — A Story from a Pork Restaurant (Part 1)", the official docs state that for video files you should never pass MediaCodec data that is not a complete frame, unless it is flagged with BUFFER_FLAG_PARTIAL_FRAME. From this we can infer that readSampleData reads one frame's worth of data; I will verify that later.
The return value is the number of bytes read, so if it is greater than 0 some data was read, and it is then submitted to MediaCodec:
//presentation time of the current sample
long presentationTimeUs = extractor.getSampleTime();
//submit the data in the buffer at inputBufIndex to MediaCodec
decoder.queueInputBuffer(inputBufIndex, 0, chunkSize,
presentationTimeUs, 0 /*flags*/);
The javadoc for queueInputBuffer is far too long to quote. In short, it submits chunkSize bytes, starting at offset 0, of input buffer number inputBufIndex to MediaCodec, and stamps this frame with the render time presentationTimeUs. As mentioned in "H.264 Video Encoding Principles Explained — Starting from a Son Ye-jin Movie (Part 1)":
the introduction of B-frames means the encoded frame order can differ from the playback order, which is why the two timestamps pts and dts exist (presentation timestamp and decode timestamp).
presentationTimeUs here is the pts: decoded frames may come out in an order different from playback order, so pts is needed to fix the presentation order. The last parameter, flags, is a bitmask describing the submitted data, used for special cases; 0 is fine here.
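The pts/dts distinction can be illustrated without any codec at all. Below is a toy example (my own, not from grafika): frames leave the encoder in decode order I, P, B, B, and sorting by pts recovers the display order:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy illustration of decode order vs presentation order with B-frames.
class PtsDemo {
    public static List<Long> presentationOrder(List<long[]> decodeOrder) {
        // Each element is {dts, pts}; the decoder consumes frames sorted by dts,
        // but the display must consume them sorted by pts.
        List<long[]> frames = new ArrayList<>(decodeOrder);
        frames.sort(Comparator.comparingLong(f -> f[1]));
        List<Long> pts = new ArrayList<>();
        for (long[] f : frames) pts.add(f[1]);
        return pts;
    }
}
```

For example, decode order I(dts 0, pts 0), P(dts 1, pts 3000), B(dts 2, pts 1000), B(dts 3, pts 2000) comes back out in presentation order 0, 1000, 2000, 3000: the two B-frames are decoded after the P-frame they reference but displayed before it.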
If readSampleData's return value, i.e. the number of bytes read, is negative, the end of the video file has been reached. queueInputBuffer is still called, but with special handling:
decoder.queueInputBuffer(inputBufIndex, 0, 0, 0L,
MediaCodec.BUFFER_FLAG_END_OF_STREAM);
This sends an empty frame whose flag is BUFFER_FLAG_END_OF_STREAM, telling MediaCodec that the end of the file has been reached and there is no data left to submit; the buyer tells the chef there is no raw pork left.
After this end-of-stream frame is sent, no more data may be fed to the input side until MediaCodec re-enters the Flushed state, or is stopped and started again.
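That rule can be summed up as a tiny state model (my sketch of the Executing sub-states described in part one; this is illustrative only, not an Android API): after EOS, queueing input is illegal until flush() returns the codec to Flushed:

```java
// Toy model of MediaCodec's Executing sub-states: Flushed -> Running -> EndOfStream,
// with flush() returning to Flushed. Illustrative only; not the real MediaCodec.
class CodecStateModel {
    public enum SubState { FLUSHED, RUNNING, END_OF_STREAM }

    private SubState state = SubState.FLUSHED; // start() leaves the codec Flushed

    public void queueInput(boolean endOfStream) {
        if (state == SubState.END_OF_STREAM) {
            throw new IllegalStateException("no input accepted after EOS");
        }
        state = endOfStream ? SubState.END_OF_STREAM : SubState.RUNNING;
    }

    public void flush() {
        state = SubState.FLUSHED; // legal at any time while Executing
    }

    public SubState state() { return state; }
}
```

This is exactly what the looping branch in doExtract relies on: after the EOS buffer comes back out, calling flush() makes the decoder accept input again for the next playback round.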
That wraps up the input side. Without pausing, we head straight to the output side and try to fetch an output buffer (the customer walks up to the chef and asks whether the pork is done):
int outputBufferIndex = decoder.dequeueOutputBuffer(mBufferInfo, TIMEOUT_USEC);
On failure (the chef tells the customer the pork is not ready yet), a decoder status code is returned instead. The project code handles these common ones:
1. MediaCodec.INFO_TRY_AGAIN_LATER: even after waiting TIMEOUT_USEC there is no successfully decoded data yet. Usually either the wait was too short, or the input was a B-frame that needs a later P-frame as its reference before it can be decoded (see "H.264 Video Encoding Principles Explained — Starting from a Son Ye-jin Movie (Part 1)" for B- and P-frames).
2. MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED: the output buffer array is stale and must be refreshed. Since newer versions fetch output buffers with getOutputBuffer, this status is itself deprecated.
3. MediaCodec.INFO_OUTPUT_FORMAT_CHANGED: the MediaFormat of the output data has changed.
On success, the return value is the index of the decoded buffer within the output buffer array, and the buffer's metadata is placed in mBufferInfo. Then a crucial line of code runs:
decoder.releaseOutputBuffer(outputBufferIndex, doRender);
将輸出buffer數組的第outputBufferIndex個buffer繪制到surface(還記得configure方法傳了的Surface對象麼)。doRender為true,繪制到配置的surface。可以了解這行代碼就類似Android中Canvas的draw方法,調用就繪制一幀,并将Buffer回收。
Summary
Good times always pass quickly; I hope the key decoding code is now reasonably clear~
To keep this post from running so long that readers doze off, I'll stop here. The next post, "Android Hardware Codec MediaCodec Explained — A Story from a Pork Restaurant (Part 3)", will cover key points and details to watch for when running this code. Stay tuned~~
References:
Getting Started with Audio/Video Data Processing: FLV Container Format Explained
MediaCodec official documentation
Android decoder MediaCodec analysis
Author: 半島鐵盒裡的貓 Link: https://juejin.cn/post/7111340889691127815/ Source: Juejin (稀土掘金). Copyright belongs to the author. For commercial reuse, contact the author for authorization; for non-commercial reuse, please credit the source.