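Porting a model: PyTorch -> ONNX -> ncnn

To deploy a face recognition model trained in PyTorch on a Raspberry Pi, this post uses the ncnn inference framework to speed up the forward pass.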

pytorch -> onnx

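PyTorch 1.0 and later ship with ONNX export built in, so the conversion is straightforward: torch.onnx.export writes the .onnx file directly. To check that PyTorch and ONNX produce the same output, we feed the same input to both models and compare the results. The script is as follows.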

import onnx
import onnxruntime
import torch

from model1 import MobileFaceNet

device = torch.device("cpu")
model = MobileFaceNet(512)
state_dict = torch.load('./model_mobilefacenet.pth', map_location=device)
model.load_state_dict(state_dict)
model.eval()  # inference mode, so BatchNorm/Dropout behave deterministically during export

dummy_input = torch.randn(1, 3, 112, 112).to(device)
with torch.no_grad():  # no gradients are needed for the comparison
    out = model(dummy_input)
print(out[0][:10])

torch.onnx.export(model,                     # model being run
                  dummy_input,               # model input (or a tuple for multiple inputs)
                  "my_mobileface.onnx",      # where to save the model (can be a file or file-like object)
                  export_params=True,        # store the trained parameter weights inside the model file
                  do_constant_folding=True,  # whether to execute constant folding for optimization
                  input_names=['input'],     # the model's input names
                  output_names=['output'],   # the model's output names
                  )

onnx_model = onnx.load('./my_mobileface.onnx')  # load onnx model
onnx.checker.check_model(onnx_model)            # verify that the exported graph is well formed
session = onnxruntime.InferenceSession("./my_mobileface.onnx",
                                       providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name
orig_result = session.run(None, {input_name: dummy_input.numpy()})
# session.run returns a list with one array per output; print the same slice as above
print(orig_result[0][0][:10])
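Printing the first ten values only allows a comparison by eye. A minimal sketch of a numeric check follows, assuming both outputs have been flattened to plain lists of floats (for example with `out[0].detach().tolist()` on the PyTorch side and `.tolist()` on the onnxruntime array). The helper names, the sample values and the tolerance are illustrative and not part of the original script.

```python
import math


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return max(abs(x - y) for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def outputs_match(a, b, atol=1e-4):
    """True when every element differs by at most atol (an assumed tolerance)."""
    return max_abs_diff(a, b) <= atol


# Hypothetical values standing in for the PyTorch and ONNX embeddings
torch_out = [0.1200, -0.5300, 0.9100]
onnx_out = [0.12001, -0.53002, 0.90999]

print("max abs diff:", max_abs_diff(torch_out, onnx_out))
print("cosine similarity:", cosine_similarity(torch_out, onnx_out))
print("match:", outputs_match(torch_out, onnx_out))
```

For face embeddings the cosine similarity is a useful second number, since recognition usually compares embeddings by angle rather than by raw value.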

onnx->ncnn

To install ncnn, follow the official instructions at https://github.com/Tencent/ncnn.

On Debian, Ubuntu or Raspberry Pi OS, you can install all required dependencies using:

sudo apt install build-essential git cmake libprotobuf-dev protobuf-compiler libvulkan-dev vulkan-utils libopencv-dev

Then git clone ncnn. The GPU is not used here, so build with -DNCNN_VULKAN=OFF:

$ cd ncnn
$ mkdir -p build
$ cd build
build$ cmake -DCMAKE_BUILD_TYPE=Release -DNCNN_VULKAN=OFF -DNCNN_SYSTEM_GLSLANG=ON -DNCNN_BUILD_EXAMPLES=ON ..
build$ make -j$(nproc)

Verify build by running some examples:

build$ cd ../examples
examples$ ../build/examples/squeezenet ../images/256-ncnn.png
[0 AMD RADV FIJI (LLVM 10.0.1)]  queueC=1[4]  queueG=0[1]  queueT=0[1]
[0 AMD RADV FIJI (LLVM 10.0.1)]  bugsbn1=0  buglbia=0  bugcopc=0  bugihfa=0
[0 AMD RADV FIJI (LLVM 10.0.1)]  fp16p=1  fp16s=1  fp16a=0  int8s=1  int8a=1
532 = 0.163452
920 = 0.093140
716 = 0.061584
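examples$

If the terminal prints output like this, the build succeeded. The lines starting with [0 AMD RADV FIJI ...] describe the Vulkan device and come from a Vulkan-enabled build; with -DNCNN_VULKAN=OFF only the "class index = score" lines are expected.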

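Before converting the ONNX model to ncnn, simplify it first; otherwise the converter may hit operators or subgraphs it cannot handle.

First, install onnx-simplifier: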

pip install onnx-simplifier

Then simplify the ONNX model:

python3 -m onnxsim my_mobileface.onnx my_mobileface-sim.onnx

To convert ONNX to ncnn, use the onnx2ncnn tool produced by the build, found under ncnn/build/tools/onnx/:

./onnx2ncnn my_mobileface-sim.onnx my_mobileface.param my_mobileface.bin

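The generated .bin and .param files are the ncnn model files we need on the Raspberry Pi.

Finally, run the ncnn model from C++ and print its output. Note that the input fed to ncnn must be preprocessed exactly like the PyTorch input (size, channel order, normalization); otherwise the ncnn results will be badly off.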
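The .param file is plain text, so it is easy to confirm that the blob names given at export time ('input' and 'output') survived the conversion. The sketch below assumes the usual text layout: the magic number 7767517 on the first line, the layer and blob counts on the second, then one line per layer in the form "type name input_count output_count input_blobs output_blobs params". The sample text is hand-written for illustration and is not real onnx2ncnn output; the function names are hypothetical.

```python
def parse_ncnn_param(text):
    """Parse the text of a plain-text ncnn .param file into a list of layer dicts."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    if not lines or lines[0] != ["7767517"]:
        raise ValueError("not a plain-text ncnn .param file")
    layer_count, blob_count = int(lines[1][0]), int(lines[1][1])
    layers = []
    for tokens in lines[2:]:
        n_in, n_out = int(tokens[2]), int(tokens[3])
        layers.append({
            "type": tokens[0],
            "name": tokens[1],
            "inputs": tokens[4:4 + n_in],
            "outputs": tokens[4 + n_in:4 + n_in + n_out],
        })
    if len(layers) != layer_count:
        raise ValueError("layer count in header does not match the file body")
    return layers, blob_count


def network_io(layers):
    """Return (input blob names, output blob names) of the network."""
    consumed = {b for layer in layers for b in layer["inputs"]}
    inputs = [b for layer in layers if layer["type"] == "Input"
              for b in layer["outputs"]]
    # A blob that no layer consumes is a network output
    outputs = [b for layer in layers for b in layer["outputs"]
               if b not in consumed]
    return inputs, outputs


# Hand-written sample, NOT real converter output
SAMPLE_PARAM = """7767517
3 3
Input            input    0 1 input
Convolution      conv1    1 1 input conv1_out 0=64 1=3
InnerProduct     fc       1 1 conv1_out output 0=512
"""

layers, blob_count = parse_ncnn_param(SAMPLE_PARAM)
print(network_io(layers))
```

For a real model, read the file with `open("my_mobileface.param").read()` and pass the text to `parse_ncnn_param`; the names it reports are the ones to use with the extractor's input and extract calls.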

#include <cstdio>
#include <iostream>
#include <string>
#include <vector>
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>

#include "net.h"
using namespace std;

// Helper from the ncnn documentation that prints every value of a tensor
void pretty_print(const ncnn::Mat& m)
{
    for (int q=0; q<m.c; q++)
    {
        const float* ptr = m.channel(q);
        for (int y=0; y<m.h; y++)
        {
            for (int x=0; x<m.w; x++)
            {
                printf("%f ", ptr[x]);
            }
            ptr += m.w;
            printf("\n");
        }
        printf("------------------------\n");
    }
}
// Template for main()
int main(){
    string img_path = "xxx.jpg";
    cv::Mat img = cv::imread(img_path, cv::IMREAD_COLOR);
    if (img.empty())
    {
        fprintf(stderr, "failed to read %s\n", img_path.c_str());
        return -1;
    }
    cv::Mat img2;
    int input_width = 112;  // must equal the input size used for the ONNX export (1x3x112x112 above)
    int input_height = 112;
    // resize
    cv::resize(img, img2, cv::Size(input_width, input_height));

    // Load the converted ncnn model (both calls return 0 on success)
    ncnn::Net net;
    //net.opt.num_threads=1;
    if (net.load_param("xxx.param") != 0 || net.load_model("xxx.bin") != 0)
    {
        fprintf(stderr, "failed to load the ncnn model\n");
        return -1;
    }
    // Convert the OpenCV Mat to an ncnn Mat.
    // PIXEL_BGR keeps OpenCV's channel order; use PIXEL_BGR2RGB if the PyTorch model was trained on RGB input.
    ncnn::Mat input = ncnn::Mat::from_pixels(img2.data, ncnn::Mat::PIXEL_BGR, img2.cols, img2.rows);
    // These values scale pixels to [0, 1]; they must match the preprocessing used in PyTorch
    const float mean_vals[3] = {0.f,0.f,0.f};
    const float norm_vals[3] = {1/255.f,1/255.f,1/255.f};
    input.substract_mean_normalize(mean_vals, norm_vals);  // "substract" is ncnn's own spelling
    // ncnn forward pass
    ncnn::Extractor extractor = net.create_extractor();
    extractor.input("input", input);      // blob names come from input_names/output_names at export time
    ncnn::Mat output;                     // add one Mat and one extract() call per model output
    extractor.extract("output", output);
    pretty_print(output);
    /*
    // Alternatively, flatten the output and copy it into a vector
    ncnn::Mat out_flattened = output.reshape(output.w * output.h * output.c);
    std::vector<float> scores;
    scores.resize(out_flattened.w);
    for (int j=0; j<out_flattened.w; j++)
    {
        scores[j] = out_flattened[j];
    }
    */
    cout<<"done"<<endl;
    return 0;
}
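Matching the preprocessing mostly comes down to arithmetic. ncnn computes (pixel - mean_vals) * norm_vals on raw 0..255 pixels, while a typical PyTorch pipeline first scales pixels to [0, 1] and then applies (x - mean) / std per channel. The sketch below converts one convention into the other, assuming the PyTorch side really uses that pipeline; the post does not state the training preprocessing, so the 0.5 values are purely hypothetical, as are the function names.

```python
def to_ncnn_norm(mean, std):
    """Convert PyTorch-style per-channel mean/std (on [0, 1] pixels)
    into ncnn-style mean_vals/norm_vals (on 0..255 pixels)."""
    mean_vals = [255.0 * m for m in mean]
    norm_vals = [1.0 / (255.0 * s) for s in std]
    return mean_vals, norm_vals


def torch_style(pixel, m, s):
    """Scale to [0, 1], then normalize: (pixel / 255 - m) / s."""
    return (pixel / 255.0 - m) / s


def ncnn_style(pixel, mean_val, norm_val):
    """What ncnn applies to each pixel: (pixel - mean_val) * norm_val."""
    return (pixel - mean_val) * norm_val


# Hypothetical training normalization: mean 0.5 and std 0.5 on every channel
mean_vals, norm_vals = to_ncnn_norm((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
print(mean_vals, norm_vals)

# Both conventions should give the same value for the same raw pixel
print(torch_style(200, 0.5, 0.5), ncnn_style(200, mean_vals[0], norm_vals[0]))
```

With mean 0 and std 1 the conversion gives mean_vals of 0 and norm_vals of 1/255, which are the values used in the C++ template above.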
