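1. Scaling
A higher resolution makes the algorithms slower, and the intermediate transforms get much slower too, because the byte[] grows with the resolution. So in many cases we need to scale the frame down first. The code is below. I found it online and no longer know where it came from; I made a few small optimizations in places. If you can track down the original, all the better, since I don't know much C++ myself.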
#include <stdint.h>

// Nearest-neighbor scaling of an NV12 frame (Y plane followed by interleaved chroma).
// Chroma is copied in 2-byte pairs, so the same code also works for NV21.
// __restrict (a compiler extension in C++) tells the compiler that src and dst do not alias.
void nv12_nearest_scale(uint8_t *__restrict src, uint8_t *__restrict dst,
                        int srcWidth, int srcHeight, int dstWidth, int dstHeight)
{
    const int sw = srcWidth;
    const int sh = srcHeight;
    const int dw = dstWidth;
    const int dh = dstHeight;
    int y, x;
    unsigned long srcy, srcx;
    // 16.16 fixed-point scale ratios; avoids a float division per pixel
    const unsigned long xrIntFloat_16 = ((unsigned long)sw << 16) / dw + 1;
    const unsigned long yrIntFloat_16 = ((unsigned long)sh << 16) / dh + 1;
    uint8_t *dst_uv = dst + dh * dw;       // start of the destination chroma plane
    const uint8_t *src_uv = src + sh * sw; // start of the source chroma plane
    uint8_t *dst_uv_yScanline = dst_uv;
    const uint8_t *src_uv_yScanline = src_uv;
    uint8_t *dst_y_slice = dst;            // current destination luma row
    const uint8_t *src_y_slice;
    const uint8_t *sp;                     // source chroma pair
    uint8_t *dp;                           // destination chroma pair

    // NOTE: 'dh & ~7' and 'dw & ~7' round down to a multiple of 8, so any rows and
    // columns beyond that are left unwritten. Use destination sizes that are multiples of 8.
    for (y = 0; y < (dh & ~7); ++y)
    {
        srcy = (y * yrIntFloat_16) >> 16;
        src_y_slice = src + srcy * sw;
        if ((y & 1) == 0) {
            // one chroma row serves two luma rows
            dst_uv_yScanline = dst_uv + (y / 2) * dw;
            src_uv_yScanline = src_uv + (srcy / 2) * sw;
        }
        for (x = 0; x < (dw & ~7); ++x) {
            srcx = (x * xrIntFloat_16) >> 16;
            dst_y_slice[x] = src_y_slice[srcx];
            if ((y & 1) == 0 && (x & 1) == 0) {
                // even row and even column: copy the 2-byte chroma pair
                sp = src_uv_yScanline + (srcx & ~1UL);
                dp = dst_uv_yScanline + x;
                dp[0] = sp[0];
                dp[1] = sp[1];
            }
        }
        dst_y_slice += dw;
    }
}
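To make the fixed-point trick easier to follow, here is a minimal sketch that isolates the index mapping and the buffer sizing. Both helper names (`nv12_buffer_size`, `nearest_src_index`) are made up for illustration and are not part of the original code; the `+ 1` appears to be there to compensate for the truncation in the integer division.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: bytes needed for one NV12/NV21 frame
// (full-size Y plane plus a half-size interleaved chroma plane).
static inline std::size_t nv12_buffer_size(int width, int height) {
    return static_cast<std::size_t>(width) * height * 3 / 2;
}

// Hypothetical helper: maps a destination coordinate to a source coordinate
// with the same 16.16 fixed-point ratio that nv12_nearest_scale uses.
static inline uint32_t nearest_src_index(uint32_t dstPos, uint32_t srcLen, uint32_t dstLen) {
    // ratio = srcLen / dstLen scaled by 2^16, plus 1 to offset truncation
    const uint64_t ratio = (static_cast<uint64_t>(srcLen) << 16) / dstLen + 1;
    return static_cast<uint32_t>((dstPos * ratio) >> 16);
}
```

For a 1920 to 640 downscale, destination column 1 reads source column 3. For a 100 to 300 upscale, destination column 3 reads source column 1; without the `+ 1` the truncated ratio would send it to column 0 instead.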
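2. Rotation
This rotation can be sped up with OpenMP: include the header and add "#pragma omp parallel for". Running on multiple cores can speed it up considerably, but CPU usage goes up a lot as well. Note that the loops below carry the running index i from one iteration to the next, so the pragma is only safe once each destination index is computed from x and y instead.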
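#include <omp.h> // only needed for omp_* runtime calls; the pragma itself just needs OpenMP enabled at compile time

// Rotates a YUV420sp (NV21/NV12) frame 90 degrees clockwise.
// dst must hold imageWidth * imageHeight * 3 / 2 bytes; the rotated frame is
// imageHeight wide and imageWidth tall.
void YUV420spRotate90(unsigned char *dst, unsigned char *src, int imageWidth, int imageHeight) {
    // Rotate the Y luma: each source column, read bottom to top, becomes a destination row
    int i = 0;
    for (int x = 0; x < imageWidth; x++) {
        for (int y = imageHeight - 1; y >= 0; y--) {
            dst[i] = src[y * imageWidth + x];
            i++;
        }
    }
    // Rotate the U and V color components: fill dst backwards from the last byte,
    // moving each interleaved 2-byte pair as a unit so the byte order is preserved
    i = imageWidth * imageHeight * 3 / 2 - 1;
    for (int x = imageWidth - 1; x > 0; x = x - 2) {
        for (int y = 0; y < imageHeight / 2; y++) {
            dst[i] = src[(imageWidth * imageHeight) + (y * imageWidth) + x];
            i--;
            dst[i] = src[(imageWidth * imageHeight) + (y * imageWidth) + (x - 1)];
            i--;
        }
    }
}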
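As a rough sketch of what an OpenMP-friendly version could look like: the function below performs the same clockwise rotation but derives every destination index from the loop variables, so iterations do not depend on each other. The name `YUV420spRotate90Parallel` is hypothetical, and the code assumes even width and height and a build with OpenMP enabled (for example `-fopenmp` on GCC/Clang); without that flag the pragmas are ignored and it runs serially.

```cpp
// Sketch only: same 90-degree clockwise rotation as above, restructured so
// that no running index is shared between loop iterations.
void YUV420spRotate90Parallel(unsigned char *dst, const unsigned char *src,
                              int imageWidth, int imageHeight) {
    const int ySize = imageWidth * imageHeight;

    // Y plane: source pixel (x, y) lands in destination row x, column (imageHeight - 1 - y).
    #pragma omp parallel for
    for (int x = 0; x < imageWidth; x++) {
        for (int y = 0; y < imageHeight; y++) {
            dst[x * imageHeight + (imageHeight - 1 - y)] = src[y * imageWidth + x];
        }
    }

    // Chroma plane: treat each interleaved 2-byte pair as one unit.
    const int halfW = imageWidth / 2;
    const int halfH = imageHeight / 2;
    #pragma omp parallel for
    for (int px = 0; px < halfW; px++) {
        for (int py = 0; py < halfH; py++) {
            const unsigned char *s = src + ySize + py * imageWidth + 2 * px;
            unsigned char *d = dst + ySize + px * imageHeight + 2 * (halfH - 1 - py);
            d[0] = s[0];
            d[1] = s[1];
        }
    }
}
```

Whether the extra threads pay off depends on the frame size and the device; for small preview frames the thread start-up cost may outweigh the gain, so it is worth measuring both versions.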
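3. NV21 to BGR
This one is probably used a lot. Most image-processing algorithms today are built on OpenCV, so frames often have to be converted to BGR before processing. There is too much code to paste it all here; I'll upload the .h file later and you can just download it and have a look.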
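For readers who only want the idea, here is a minimal, unoptimized sketch of the conversion. It is not the code from the .h file: the names `nv21ToBgr` and `clampToByte` are made up for illustration, and the integer coefficients are the widely used BT.601 video-range approximation scaled by 1024, which is an assumption about the color space of the preview data. It also assumes even width and height. If OpenCV is already linked, `cv::cvtColor` with `cv::COLOR_YUV2BGR_NV21` does the same job.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: clamp an int into the 0..255 range.
static inline uint8_t clampToByte(int v) {
    return static_cast<uint8_t>(std::min(255, std::max(0, v)));
}

// Sketch: converts an NV21 frame (Y plane, then interleaved V/U pairs) to
// packed BGR, 3 bytes per pixel. bgr must hold width * height * 3 bytes.
void nv21ToBgr(const uint8_t *nv21, uint8_t *bgr, int width, int height) {
    const uint8_t *vu = nv21 + width * height; // start of the VU plane
    for (int row = 0; row < height; row++) {
        const uint8_t *yRow = nv21 + row * width;
        const uint8_t *vuRow = vu + (row / 2) * width; // one VU row per two Y rows
        uint8_t *out = bgr + row * width * 3;
        for (int col = 0; col < width; col++) {
            const int y = std::max(0, yRow[col] - 16);
            const int v = vuRow[col & ~1] - 128;       // NV21 stores V first
            const int u = vuRow[(col & ~1) + 1] - 128;
            const int y1192 = 1192 * y;                // 1.164 * 1024
            out[3 * col + 0] = clampToByte((y1192 + 2066 * u) / 1024);           // B
            out[3 * col + 1] = clampToByte((y1192 - 833 * v - 400 * u) / 1024);  // G
            out[3 * col + 2] = clampToByte((y1192 + 1634 * v) / 1024);           // R
        }
    }
}
```

This per-pixel loop favors readability over speed; a production version would typically process two pixels per chroma pair or use SIMD.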
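At this point the image processing is basically done. From getting the preview data, through scaling and rotation, to conversion to BGR, every step in the chain costs time and CPU, and adding a mirror operation costs even more. Later I'll explain how to do it all in a single pass to cut the time.
File: https://download.csdn.net/download/renlei0012/10988868 (a single file, so I didn't bother putting it on git; if you have no download points, just leave your email address)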