PyTorch -> ONNX -> ncnn model porting

To deploy a face recognition model trained in PyTorch on a Raspberry Pi, this post uses the ncnn forward-inference framework to speed up model inference.

pytorch -> onnx

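PyTorch 1.0 and later ships with ONNX export built in, so the conversion is straightforward: torch.onnx.export writes the .onnx file directly. To make sure the PyTorch and ONNX outputs agree, we feed the same input to both models and compare their outputs. The program is as follows.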

import numpy as np
import onnx
import onnxruntime
import torch

from model1 import MobileFaceNet

# Build the model on CPU and load the trained weights
device = torch.device("cpu")
model = MobileFaceNet(512)
state_dict = torch.load('./model_mobilefacenet.pth', map_location=device)
model.load_state_dict(state_dict)
model.eval()  # inference mode for BatchNorm/Dropout layers

# One fixed dummy input, shared by the PyTorch and ONNX runs
dummy_input = torch.randn(1, 3, 112, 112).to(device)

with torch.no_grad():
    out = model(dummy_input)
print(out[0][:10])

torch.onnx.export(model,                      # model being run
                  dummy_input,                # model input (or a tuple for multiple inputs)
                  "my_mobileface.onnx",       # where to save the model (can be a file or file-like object)
                  export_params=True,         # store the trained parameter weights inside the model file
                  do_constant_folding=True,   # whether to execute constant folding for optimization
                  input_names=['input'],      # the model's input names
                  output_names=['output'],    # the model's output names
                  )

# Check that the exported file is a well-formed ONNX model
onnx_model = onnx.load('./my_mobileface.onnx')
onnx.checker.check_model(onnx_model)

# Run the same input through ONNX Runtime
session = onnxruntime.InferenceSession("./my_mobileface.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name
orig_result = session.run(None, {input_name: dummy_input.numpy()})
print(orig_result[0][0][:10])  # first output, first batch item, first 10 values

# Fail loudly if the two outputs diverge beyond floating-point noise
np.testing.assert_allclose(out[0].numpy(), orig_result[0][0], rtol=1e-3, atol=1e-5)
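
The same comparison is needed again later, when the ncnn output is checked against PyTorch. A minimal, dependency-free sketch of such a check is given here, assuming both outputs have been flattened to plain lists of floats. The helper name compare_outputs, the tolerance, and the sample values are illustrative and not part of the original program.

```python
import math


def compare_outputs(a, b, atol=1e-4):
    """Return (max_abs_diff, cosine_similarity, ok) for two float sequences."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("outputs must be non-empty and of equal length")
    # Largest element-wise deviation between the two outputs
    max_abs_diff = max(abs(x - y) for x, y in zip(a, b))
    # Cosine similarity, the usual metric for face embeddings
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a > 0 and norm_b > 0:
        cosine = dot / (norm_a * norm_b)
    else:
        cosine = float("nan")
    return max_abs_diff, cosine, max_abs_diff <= atol


# Hypothetical values standing in for the first entries of each output
torch_out = [0.1234, -0.5678, 0.9012]
onnx_out = [0.1234, -0.5679, 0.9012]
diff, cos, ok = compare_outputs(torch_out, onnx_out, atol=1e-3)
print("max abs diff = %.6f, cosine = %.6f, match = %s" % (diff, cos, ok))
```

For embeddings, a cosine similarity very close to 1 together with a small maximum difference is a reasonable sign that the export preserved the model's behaviour.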

onnx->ncnn

To install ncnn, just follow the official instructions at https://github.com/Tencent/ncnn.

On Debian, Ubuntu or Raspberry Pi OS, you can install all required dependencies using:

sudo apt install build-essential git cmake libprotobuf-dev protobuf-compiler libvulkan-dev vulkan-utils libopencv-dev

Then git clone ncnn. The GPU is not used here, so build with -DNCNN_VULKAN=OFF.

$ cd ncnn
$ mkdir -p build
$ cd build
build$ cmake -DCMAKE_BUILD_TYPE=Release -DNCNN_VULKAN=OFF -DNCNN_SYSTEM_GLSLANG=ON -DNCNN_BUILD_EXAMPLES=ON ..
build$ make -j$(nproc)

Verify build by running some examples:

build$ cd ../examples
examples$ ../build/examples/squeezenet ../images/256-ncnn.png
[0 AMD RADV FIJI (LLVM 10.0.1)]  queueC=1[4]  queueG=0[1]  queueT=0[1]
[0 AMD RADV FIJI (LLVM 10.0.1)]  bugsbn1=0  buglbia=0  bugcopc=0  bugihfa=0
[0 AMD RADV FIJI (LLVM 10.0.1)]  fp16p=1  fp16s=1  fp16a=0  int8s=1  int8a=1
532 = 0.163452
920 = 0.093140
716 = 0.061584
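examples$

If the terminal prints class-index and score lines like these, the installation succeeded. The lines starting with [0 AMD RADV FIJI ...] come from a Vulkan-enabled build and will not appear in a CPU-only build.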

Before converting the ONNX model to ncnn, we need to simplify it, to avoid ending up with a model that cannot be converted.

First, install onnx-simplifier:

pip install onnx-simplifier

Then simplify the ONNX model:

python3 -m onnxsim my_mobileface.onnx my_mobileface-sim.onnx

Converting ONNX to ncnn uses the onnx2ncnn tool from the ncnn build tree (typically ncnn/build/tools/onnx/onnx2ncnn):

./onnx2ncnn my_mobileface-sim.onnx my_mobileface.param my_mobileface.bin

The generated .bin and .param files are the ncnn model files we need on the Raspberry Pi.
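
The .param file is plain text, so the blob names that the C++ code has to use can be checked before writing any inference code. A minimal sketch follows, assuming the plain-text layout that ncnn documents: the magic number 7767517, then the layer and blob counts, then one layer per line with its type, name, input count, output count, and blob names. The function name summarize_ncnn_param and the three-layer sample are made up for illustration and are not the real MobileFaceNet graph.

```python
def summarize_ncnn_param(text):
    """List input and output blob names found in a plain-text ncnn .param file."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    if len(lines) < 2 or lines[0] != ["7767517"]:
        raise ValueError("not a plain-text ncnn .param file (magic 7767517 missing)")
    layer_count, blob_count = int(lines[1][0]), int(lines[1][1])

    inputs, produced, consumed = [], [], set()
    for fields in lines[2:]:
        # fields: layer_type layer_name n_in n_out in_blobs... out_blobs... params...
        layer_type = fields[0]
        n_in, n_out = int(fields[2]), int(fields[3])
        in_blobs = fields[4:4 + n_in]
        out_blobs = fields[4 + n_in:4 + n_in + n_out]
        consumed.update(in_blobs)
        produced.extend(out_blobs)
        if layer_type == "Input":
            inputs.extend(out_blobs)

    # Blobs that no layer consumes are the network outputs
    outputs = [b for b in produced if b not in consumed]
    return {"layers": layer_count, "blobs": blob_count,
            "inputs": inputs, "outputs": outputs}


# Made-up sample; a real file would be read with open("my_mobileface.param").read()
sample = """7767517
3 3
Input            input   0 1 input
Convolution      conv1   1 1 input conv1_out 0=8 1=3
InnerProduct     fc      1 1 conv1_out output 0=512
"""
summary = summarize_ncnn_param(sample)
print(summary)
```

The names reported under inputs and outputs are the ones to pass to the extractor's input and extract calls.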

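Finally, run the ncnn model in a C++ environment and print its output. Note that the input given to ncnn must match the PyTorch input exactly (size, channel order, normalization); otherwise the ncnn inference results will be badly off.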

#include <cstdio>
#include <iostream>
#include <string>
#include <vector>
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>

#include "net.h"
using namespace std;

// Helper provided in the official ncnn docs for printing an output tensor
void pretty_print(const ncnn::Mat& m)
{
    for (int q=0; q<m.c; q++)
    {
        const float* ptr = m.channel(q);
        for (int y=0; y<m.h; y++)
        {
            for (int x=0; x<m.w; x++)
            {
                printf("%f ", ptr[x]);
            }
            ptr += m.w;
            printf("\n");
        }
        printf("------------------------\n");
    }
}
// main function template
int main(){
    string img_path = "xxx.jpg";
    cv::Mat img = cv::imread(img_path, cv::IMREAD_COLOR);
    if (img.empty())
    {
        fprintf(stderr, "failed to read %s\n", img_path.c_str());
        return -1;
    }
    cv::Mat img2;
    int input_width = 112; // input size used when exporting to ONNX (1x3x112x112 above)
    int input_height = 112;
    // resize
    cv::resize(img, img2, cv::Size(input_width, input_height));

    // Load the converted ncnn model; both calls return 0 on success
    ncnn::Net net;
    //net.opt.num_threads=1;
    if (net.load_param("xxx.param") != 0 || net.load_model("xxx.bin") != 0)
    {
        fprintf(stderr, "failed to load the ncnn model\n");
        return -1;
    }
    // Convert the OpenCV Mat to an ncnn Mat. OpenCV loads images as BGR;
    // use ncnn::Mat::PIXEL_BGR2RGB instead if the model was trained on RGB input.
    ncnn::Mat input = ncnn::Mat::from_pixels(img2.data, ncnn::Mat::PIXEL_BGR, img2.cols, img2.rows);
    // Computes (pixel - mean) * norm per channel; must match the training preprocessing
    const float mean_vals[3] = {0.f,0.f,0.f};
    const float norm_vals[3] = {1/255.f,1/255.f,1/255.f};
    input.substract_mean_normalize(mean_vals, norm_vals);
    // ncnn forward pass
    ncnn::Extractor extractor = net.create_extractor();
    extractor.input("input", input);   // blob name from input_names at export
    ncnn::Mat output;                  // declare one ncnn::Mat per model output
    extractor.extract("output", output); // blob name from output_names at export
    pretty_print(output);
    /*
    // Or flatten the output first
    ncnn::Mat out_flattened = output.reshape(output.w * output.h * output.c);
    std::vector<float> scores;
    scores.resize(out_flattened.w);
    for (int j=0; j<out_flattened.w; j++)
    {
        scores[j] = out_flattened[j];
    }
    */
    cout<<"done"<<endl;
    return 0;
}
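
A common source of mismatched results is the normalization step. ncnn's substract_mean_normalize computes (pixel - mean) * norm per channel on 0..255 pixel values, whereas PyTorch pipelines usually scale to [0, 1] first and then apply a mean and std. The sketch below, written in Python for brevity, shows how one convention maps to the other. The helper names are hypothetical, and the 0.5/0.5 values are an assumed example, not the training settings of the model in this post.

```python
def ncnn_normalize(pixel, mean_val, norm_val):
    """What ncnn applies to each 0..255 pixel value of a channel."""
    return (pixel - mean_val) * norm_val


def torch_to_ncnn_norm(mean, std):
    """Map per-channel mean/std on [0, 1] inputs to ncnn mean_vals/norm_vals."""
    # (p / 255 - m) / s  ==  (p - 255 * m) * (1 / (255 * s))
    mean_vals = [m * 255.0 for m in mean]
    norm_vals = [1.0 / (s * 255.0) for s in std]
    return mean_vals, norm_vals


# Plain scaling to [0, 1] (mean 0, std 1) gives the values used in the C++ code
print(torch_to_ncnn_norm([0.0, 0.0, 0.0], [1.0, 1.0, 1.0]))

# Assumed example: mean 0.5, std 0.5 per channel
mean_vals, norm_vals = torch_to_ncnn_norm([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
pixel = 200.0
torch_style = (pixel / 255.0 - 0.5) / 0.5
ncnn_style = ncnn_normalize(pixel, mean_vals[0], norm_vals[0])
print(torch_style, ncnn_style)
```

If the model was trained with a mean and std other than 0 and 1, the mean_vals and norm_vals in the C++ code need to be changed accordingly.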
