The previous post mentioned that onnx2ncnn did not support the sigmoid and upsample layers, so I decided to read the onnx2ncnn source code. Goals:
- Understand the main workflow of onnx2ncnn in ncnn
- Define a custom upsample layer (stretch goal)
1. References
- Open Neural Network Exchange - ONNX, the ONNX documentation
- https://github.com/Tencent/ncnn; note that the code differs between ncnn versions. This post is based on the 20180704 release.
2. Main workflow
2.1 ncnn.param, the format that stores the network structure parameters
2.2 Key ONNX APIs
- graph
  GraphProto: the graph defines the model's computation logic; together with its parameter-carrying node entries it forms a directed graph structure.
  Key accessor: const onnx::GraphProto& graph = model.graph();
- initializer: appears to hold the pretrained weights
- node
  NodeProto: one OP node of the network's directed graph, usually called a layer, e.g. a conv or relu layer.
  Key accessor: const onnx::NodeProto& node = graph.node(i);
- attribute
  AttributeProto: the parameters of each OP, accessed through this structure, e.g. the stride and dilation of a conv layer.
  const onnx::AttributeProto& attr = node.attribute(i);
- tensor
  TensorProto: a serialized tensor value; constants such as weights and biases are generally stored in this structure.
  // batchnorm
  const onnx::TensorProto& scale = weights[node.input(1)];
  const onnx::TensorProto& B = weights[node.input(2)];
  const onnx::TensorProto& mean = weights[node.input(3)];
  const onnx::TensorProto& var = weights[node.input(4)];
Some open questions
- How is node.attribute determined? conv and batchnorm, for example, take different parameters.
  Guess: is there a concrete definition or piece of code for this in pytorch2onnx?
- Likewise, what fixes the storage order of batchnorm's several pretrained weight tensors?
  Guess: probably also defined in pytorch2onnx.
4. Examples
The layers fall into two categories. The first has no structural parameters but does have pretrained weights, e.g. batchnorm; its weights are written directly to the bin file (mind the order of the individual tensors). The second has structural parameters but no pretrained weights; its parameters must be written into the ncnn.param network structure file.
4.1 batchnorm
float epsilon = get_node_attr_f(node, "epsilon", 1e-5f);
const onnx::TensorProto& scale = weights[node.input(1)];
const onnx::TensorProto& B = weights[node.input(2)];
const onnx::TensorProto& mean = weights[node.input(3)];
const onnx::TensorProto& var = weights[node.input(4)];
int channels = get_tensor_proto_data_size(scale);
fprintf(pp, " 0=%d", channels); // number of batchnorm channels
fwrite_tensor_proto_data(scale, bp); // batchnorm scale factor
fwrite_tensor_proto_data(mean, bp); // mean
4.2 pooling
A pooling layer has no pretrained weights, but it has several types (maxpool, averagepool) and structural parameters such as kernel_size and pads.
std::string auto_pad = get_node_attr_s(node, "auto_pad"); // TODO
std::vector<int> kernel_shape = get_node_attr_ai(node, "kernel_shape");
std::vector<int> strides = get_node_attr_ai(node, "strides");
std::vector<int> pads = get_node_attr_ai(node, "pads");
int pool = op == "AveragePool" ? 1 : 0;
int pad_mode = 1;
if (auto_pad == "SAME_LOWER" || auto_pad == "SAME_UPPER")
{
    // TODO
    pad_mode = 2;
}
fprintf(pp, " 0=%d", pool);
if (kernel_shape.size() == 1) {
    fprintf(pp, " 1=%d", kernel_shape[0]);
} else if (kernel_shape.size() == 2) {
    fprintf(pp, " 1=%d", kernel_shape[1]);
    fprintf(pp, " 11=%d", kernel_shape[0]);
}
if (strides.size() == 1) {
    fprintf(pp, " 2=%d", strides[0]);
} else if (strides.size() == 2) {
    fprintf(pp, " 2=%d", strides[1]);
    fprintf(pp, " 12=%d", strides[0]);
}
if (pads.size() == 1) {
    fprintf(pp, " 3=%d", pads[0]);
} else if (pads.size() == 2) {
    fprintf(pp, " 3=%d", pads[1]);
    fprintf(pp, " 13=%d", pads[0]);
} else if (pads.size() == 4) {
    fprintf(pp, " 3=%d", pads[1]);
    fprintf(pp, " 13=%d", pads[0]);
    fprintf(pp, " 14=%d", pads[3]);
    fprintf(pp, " 15=%d", pads[2]);
}
fprintf(pp, " 5=%d", pad_mode);