
PyTorch -> ONNX -> ncnn model porting

To deploy a face recognition model trained in PyTorch on a Raspberry Pi, this post uses the ncnn forward-inference framework to speed up inference.

pytorch -> onnx

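PyTorch 1.0 and later ship with ONNX export built in, so the conversion is straightforward: torch.onnx.export writes the .onnx file directly (the onnx and onnxruntime packages used for checking are installed separately). To make sure PyTorch and ONNX produce the same output, we feed the same input to both models and compare their outputs, as in the program below.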

import onnx
import onnxruntime
import torch

from model1 import MobileFaceNet  # the author's model definition

device = torch.device("cpu")
model = MobileFaceNet(512)
state_dict = torch.load('./model_mobilefacenet.pth', map_location=device)
model.load_state_dict(state_dict)
model.eval()  # switch to inference mode before running and exporting

dummy_input = torch.randn(1, 3, 112, 112).to(device)

# Reference output from PyTorch
with torch.no_grad():
    out = model(dummy_input)
print(out[0][:10])

torch.onnx.export(model,                     # model being run
                  dummy_input,               # model input (or a tuple for multiple inputs)
                  "my_mobileface.onnx",      # where to save the model (can be a file or file-like object)
                  export_params=True,        # store the trained parameter weights inside the model file
                  do_constant_folding=True,  # whether to execute constant folding for optimization
                  input_names=['input'],     # the model's input names
                  output_names=['output'],   # the model's output names
                  )

onnx_model = onnx.load('./my_mobileface.onnx')  # load the exported model
onnx.checker.check_model(onnx_model)            # check that the graph is well formed

session = onnxruntime.InferenceSession("./my_mobileface.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name
orig_result = session.run(None, {input_name: dummy_input.numpy()})
# session.run returns a list with one array per model output,
# so index the first output and its first row before slicing
print(orig_result[0][0][:10])
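Printing the first ten values only gives a visual check. The following is a minimal sketch of a numeric comparison, assuming both outputs have been converted to NumPy arrays of the same shape. The helper name `compare_outputs` and the default tolerances are illustrative choices, not part of the original program.

```python
import numpy as np


def compare_outputs(reference, candidate, rtol=1e-3, atol=1e-5):
    """Return (max_abs_diff, cosine_similarity, within_tolerance).

    `reference` would be the PyTorch output and `candidate` the ONNX Runtime
    output. The tolerances are illustrative defaults, not prescribed values.
    """
    ref = np.asarray(reference, dtype=np.float64).ravel()
    cand = np.asarray(candidate, dtype=np.float64).ravel()
    if ref.shape != cand.shape:
        raise ValueError(f"shape mismatch: {ref.shape} vs {cand.shape}")
    max_abs_diff = float(np.max(np.abs(ref - cand)))
    denom = np.linalg.norm(ref) * np.linalg.norm(cand)
    cosine = float(ref @ cand / denom) if denom > 0 else 0.0
    within = bool(np.allclose(cand, ref, rtol=rtol, atol=atol))
    return max_abs_diff, cosine, within


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    torch_like = rng.standard_normal(512)  # stand-in for the PyTorch output
    onnx_like = torch_like + 1e-7          # stand-in for the ONNX output
    print(compare_outputs(torch_like, onnx_like))
```

In the export script this could be called as `compare_outputs(out.detach().numpy(), orig_result[0])`. For a face embedding, the cosine similarity is often the more meaningful number, since matching is usually done on the direction of the vector.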
           

onnx->ncnn

To install ncnn, just follow the official instructions at https://github.com/Tencent/ncnn.

On Debian, Ubuntu or Raspberry Pi OS, you can install all required dependencies using:

sudo apt install build-essential git cmake libprotobuf-dev protobuf-compiler libvulkan-dev vulkan-utils libopencv-dev
           

Then git clone ncnn. Since the GPU is not used here, build with -DNCNN_VULKAN=OFF.

$ cd ncnn
$ mkdir -p build
$ cd build
build$ cmake -DCMAKE_BUILD_TYPE=Release -DNCNN_VULKAN=OFF -DNCNN_SYSTEM_GLSLANG=ON -DNCNN_BUILD_EXAMPLES=ON ..
build$ make -j$(nproc)
           

Verify build by running some examples:

build$ cd ../examples
examples$ ../build/examples/squeezenet ../images/256-ncnn.png
[0 AMD RADV FIJI (LLVM 10.0.1)]  queueC=1[4]  queueG=0[1]  queueT=0[1]
[0 AMD RADV FIJI (LLVM 10.0.1)]  bugsbn1=0  buglbia=0  bugcopc=0  bugihfa=0
[0 AMD RADV FIJI (LLVM 10.0.1)]  fp16p=1  fp16s=1  fp16a=0  int8s=1  int8a=1
532 = 0.163452
920 = 0.093140
716 = 0.061584
example$
           

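If the terminal prints output like this, the installation succeeded. The three lines starting with [0 AMD RADV FIJI ...] are Vulkan device information, so with NCNN_VULKAN=OFF you should expect only the class scores.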

Before converting the ONNX model to ncnn, we need to simplify it; otherwise the conversion may fail.

First, install onnx-simplifier:

pip install onnx-simplifier
           

Then simplify the ONNX model:

python3 -m onnxsim my_mobileface.onnx my_mobileface-sim.onnx
           

To convert ONNX to ncnn, use the onnx2ncnn tool produced by the build (under ncnn/build/tools/onnx/ in a default build):

./onnx2ncnn my_mobileface-sim.onnx my_mobileface.param my_mobileface.bin
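The simplify and convert steps can be chained from one script. The sketch below only builds the two command lines shown above and, optionally, runs them. The function names are illustrative, and the default `./onnx2ncnn` path assumes the script is run from the directory that contains the tool.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(onnx_path, onnx2ncnn="./onnx2ncnn"):
    """Return the simplify and convert commands as argument lists."""
    src = Path(onnx_path)
    sim = src.with_name(src.stem + "-sim.onnx")  # simplified model
    param = src.with_suffix(".param")            # ncnn network structure
    weights = src.with_suffix(".bin")            # ncnn weights
    return [
        [sys.executable, "-m", "onnxsim", str(src), str(sim)],
        [onnx2ncnn, str(sim), str(param), str(weights)],
    ]


def run_pipeline(onnx_path, onnx2ncnn="./onnx2ncnn"):
    """Run both steps in order; raises CalledProcessError on the first failure."""
    for cmd in build_commands(onnx_path, onnx2ncnn):
        print("running:", " ".join(cmd))
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Only print the commands here; call run_pipeline(...) to execute them.
    for cmd in build_commands("my_mobileface.onnx"):
        print(" ".join(cmd))
```

Using `check=True` stops the pipeline if onnxsim fails, so onnx2ncnn is never run on a missing or stale simplified model.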
           

The generated .bin and .param files are the ncnn model files we need on the Raspberry Pi.
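The .param file is plain text, so it can be inspected to confirm which blob names to pass to the extractor later. The sketch below assumes the usual text layout: a magic number line (7767517), a line with the layer and blob counts, then one line per layer with its type, name, input count, output count, input blob names, output blob names, and parameters. The function names and the sample text are made up for illustration.

```python
def parse_ncnn_param(text):
    """Parse text-format .param content into a list of layer dicts."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    if len(lines) < 2 or lines[0] != ["7767517"]:
        raise ValueError("not a text .param file (magic number 7767517 missing)")
    layer_count = int(lines[1][0])
    layers = []
    for tokens in lines[2:]:
        n_in, n_out = int(tokens[2]), int(tokens[3])
        layers.append({
            "type": tokens[0],
            "name": tokens[1],
            "inputs": tokens[4:4 + n_in],
            "outputs": tokens[4 + n_in:4 + n_in + n_out],
        })
    if len(layers) != layer_count:
        raise ValueError(f"expected {layer_count} layers, found {len(layers)}")
    return layers


def model_io_names(layers):
    """Return (input blobs, output blobs) of the whole network.

    Inputs are the blobs produced by Input layers; outputs are the blobs
    that no layer consumes.
    """
    inputs = [b for l in layers if l["type"] == "Input" for b in l["outputs"]]
    consumed = {b for l in layers for b in l["inputs"]}
    outputs = [b for l in layers for b in l["outputs"] if b not in consumed]
    return inputs, outputs


if __name__ == "__main__":
    # Made-up three-layer network, only to show the layout
    sample = """7767517
3 3
Input            input    0 1 input
Convolution      conv_0   1 1 input conv_0_out 0=64 1=3
InnerProduct     fc       1 1 conv_0_out output 0=512
"""
    print(model_io_names(parse_ncnn_param(sample)))
```

For a real model, read the file with `open("my_mobileface.param").read()` and check that the names match the `input_names` and `output_names` used at export time.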

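Finally, run the ncnn model in a C++ environment and print the output. Note that the ncnn input must correspond exactly to the PyTorch input (same size, channel order, and normalization); otherwise the ncnn inference results will be badly wrong.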

#include <iostream>
#include <fstream>
#include <stdio.h>
#include <algorithm>
#include <vector>
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include "opencv2/imgproc/imgproc.hpp"

#include "net.h"
using namespace std;

// Helper provided in the official ncnn documentation for printing an output tensor
void pretty_print(const ncnn::Mat& m)
{
    for (int q=0; q<m.c; q++)
    {
        const float* ptr = m.channel(q);
        for (int y=0; y<m.h; y++)
        {
            for (int x=0; x<m.w; x++)
            {
                printf("%f ", ptr[x]);
            }
            ptr += m.w;
            printf("\n");
        }
        printf("------------------------\n");
    }
}
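// Template for main()
int main(){
    string img_path = "xxx.jpg";
    cv::Mat img = cv::imread(img_path, cv::IMREAD_COLOR);
    if (img.empty())
    {
        fprintf(stderr, "failed to read %s\n", img_path.c_str());
        return -1;
    }
    cv::Mat img2;
    int input_width = 112;  // must equal the input size used for the ONNX export (1x3x112x112 above)
    int input_height = 112;
    // resize
    cv::resize(img, img2, cv::Size(input_width, input_height));

    // Load the converted ncnn model (replace with your own .param/.bin files)
    ncnn::Net net;
    //net.opt.num_threads=1;
    if (net.load_param("xxx.param") != 0 || net.load_model("xxx.bin") != 0)
    {
        fprintf(stderr, "failed to load the ncnn model\n");
        return -1;
    }
    // Convert the OpenCV Mat to an ncnn Mat. OpenCV loads images as BGR; use
    // ncnn::Mat::PIXEL_BGR2RGB instead if the PyTorch model expects RGB input.
    ncnn::Mat input = ncnn::Mat::from_pixels(img2.data, ncnn::Mat::PIXEL_BGR, img2.cols, img2.rows);
    // Computes (pixel - mean) * norm per channel; these values must match
    // the preprocessing used on the PyTorch side
    const float mean_vals[3] = {0.f,0.f,0.f};
    const float norm_vals[3] = {1/255.f,1/255.f,1/255.f};
    input.substract_mean_normalize(mean_vals, norm_vals);
    // ncnn forward pass
    ncnn::Extractor extractor = net.create_extractor();
    extractor.input("input", input);      // blob name set by input_names at export time
    ncnn::Mat output;                     // declare one ncnn::Mat per model output
    extractor.extract("output", output);  // blob name set by output_names at export time
    pretty_print(output);
    /*
    // Or flatten the output and copy it into a vector
    ncnn::Mat out_flattened = output.reshape(output.w * output.h * output.c);
    std::vector<float> scores;
    scores.resize(out_flattened.w);
    for (int j=0; j<out_flattened.w; j++)
    {
        scores[j] = out_flattened[j];
    }
    */
    cout<<"done"<<endl;
    return 0;
}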
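To check that the ncnn input really corresponds to the PyTorch input, it can help to reproduce the C++ preprocessing on the Python side. The sketch below mimics what from_pixels with PIXEL_BGR followed by substract_mean_normalize computes, assuming an H x W x 3 uint8 BGR image such as the one returned by cv2.imread. The function name `ncnn_style_preprocess` is illustrative.

```python
import numpy as np


def ncnn_style_preprocess(bgr_image,
                          mean_vals=(0.0, 0.0, 0.0),
                          norm_vals=(1 / 255.0, 1 / 255.0, 1 / 255.0)):
    """Return a 1 x 3 x H x W float32 array: (pixel - mean) * norm per channel.

    The channel order is left as BGR, matching PIXEL_BGR in the C++ code.
    """
    img = np.asarray(bgr_image)
    if img.ndim != 3 or img.shape[2] != 3:
        raise ValueError("expected an H x W x 3 image")
    chw = img.astype(np.float32).transpose(2, 0, 1)  # HWC -> CHW
    mean = np.asarray(mean_vals, dtype=np.float32).reshape(3, 1, 1)
    norm = np.asarray(norm_vals, dtype=np.float32).reshape(3, 1, 1)
    return ((chw - mean) * norm)[np.newaxis, ...]    # add the batch axis


if __name__ == "__main__":
    # Synthetic all-white 112 x 112 image as a stand-in for a resized face crop
    fake = np.full((112, 112, 3), 255, dtype=np.uint8)
    x = ncnn_style_preprocess(fake)
    print(x.shape, x.dtype, float(x.min()), float(x.max()))
```

Feeding `torch.from_numpy(x)` to the PyTorch model and comparing the result with the values printed by the C++ program gives a like-for-like check. If the two disagree, the resize, channel order, and mean/norm values are the first things to look at.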