PyTorch C++ Inference
step 1:
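Convert the model to a TorchScript module
There are two ways to convert a model:
function 1: tracing -> builds a ScriptModule that has only a forward method
**Usage:** load the trained weights into the model, then run one forward pass on an example input; tracing records the operations executed along that data flow
Suitable when:
① the forward pass contains no control flow (if/else, for loop)
② the exported ScriptModule is used only for inference (the trace only captures the mode (eval/train) the model was in when it was traced)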
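Point ② can be checked with a small experiment. The sketch below is only an illustration: `DropNet` is a made-up toy module, and a recent PyTorch running on CPU is assumed. It traces the module in eval mode and then checks whether dropout stays disabled after `train()` is called on the traced module.

```python
import torch

class DropNet(torch.nn.Module):  # hypothetical toy module
    def __init__(self):
        super().__init__()
        self.drop = torch.nn.Dropout(p=0.5)

    def forward(self, x):
        return self.drop(x)

net = DropNet().eval()             # eval mode: dropout acts as the identity
x = torch.ones(4, 8)
traced = torch.jit.trace(net, x)   # the eval-mode behaviour is what gets recorded

traced.train()                     # changing the mode afterwards does not re-trace
same_after_train = torch.equal(traced(x), x)
print(same_after_train)
```

If the model should run in eval mode in C++, calling `net.eval()` before `torch.jit.trace` is therefore the safe order.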
# here torch.jit.trace is applied to a torch.nn.Module model
# check that our net class is a torch.nn.Module
isinstance(net, torch.nn.Module)
# before tracing:
# load the weights into net and move net to a GPU device
# input must be on the same device as net
save_script_module = torch.jit.trace(net, input)
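To see why condition ① matters, here is a minimal sketch (assumptions: `Gate` is a made-up toy module, everything runs on CPU). Tracing only records the branch taken by the example input, while scripting compiles both branches.

```python
import warnings
import torch

class Gate(torch.nn.Module):  # hypothetical toy module
    def forward(self, x):
        if x.sum() > 0:       # data-dependent branch
            return x * 2
        return x - 1

net = Gate().eval()
pos = torch.ones(3)
neg = -torch.ones(3)

with warnings.catch_warnings():
    warnings.simplefilter("ignore")      # tracing warns about the tensor-to-bool conversion
    traced = torch.jit.trace(net, pos)   # only the `x * 2` branch is recorded
scripted = torch.jit.script(net)         # both branches are compiled

traced_out = traced(neg)      # follows the recorded branch, ignoring the sign of the input
scripted_out = scripted(neg)  # follows the same branch as the eager module
```

The warning that tracing prints in this situation is worth reading rather than silencing; it is suppressed here only to keep the output short.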
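function 2: script (explicitly annotate the model)
Usage:
1. pytorch.version>=1.2: define a torch.nn.Module subclass, instantiate it, then convert the instance to a ScriptModule with torch.jit.script
2. pytorch.version<=1.1: change the original torch.nn.Module subclass so that it derives from torch.jit.ScriptModule, and decorate the original forward method with @torch.jit.script_method
**Suitable when:** the forward pass contains control flow
1.pytorch.version>=1.2
# if a method uses a Python feature that TorchScript does not support, decorate that method with @torch.jit.ignore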
import torch

class MyModule(torch.nn.Module):
    def __init__(self, N, M):
        super(MyModule, self).__init__()
        self.weight = torch.nn.Parameter(torch.rand(N, M))

    def forward(self, input):
        if input.sum() > 0:
            output = self.weight.mv(input)
        else:
            output = self.weight + input
        return output

my_module = MyModule(10, 20)
sm = torch.jit.script(my_module)
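The `@torch.jit.ignore` note above can be sketched as follows. This is an illustration under assumptions: `LoggedNet`, `log_shape` and `SHAPE_LOG` are made-up names, and the ignored method writes to a Python global list, which is not something TorchScript compiles.

```python
import torch

SHAPE_LOG = []  # plain Python state, hypothetical, used only for this demo

class LoggedNet(torch.nn.Module):  # hypothetical toy module
    @torch.jit.ignore
    def log_shape(self, x: torch.Tensor) -> torch.Tensor:
        # Not compiled: this stays a call into the Python interpreter.
        SHAPE_LOG.append(tuple(x.shape))
        return x

    def forward(self, x):
        x = self.log_shape(x)
        return x * 2

sm = torch.jit.script(LoggedNet())
out = sm(torch.ones(2, 3))
print(SHAPE_LOG)
```

One caveat for the C++ workflow: a module that still calls an ignored method needs the Python interpreter, so it cannot be exported with `torch.jit.save`. For code that must be exported, `@torch.jit.unused` (or removing the call) is the documented alternative.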
2.pytorch.version<=1.1
import torch

class MyModule(torch.jit.ScriptModule):
    def __init__(self, N, M):
        super(MyModule, self).__init__()
        self.weight = torch.nn.Parameter(torch.rand(N, M))

    @torch.jit.script_method
    def forward(self, input):
        return self.weight.mv(input)
tips: if the forward pass uses Python features that TorchScript does not support, consider wrapping the affected layers and recording them with trace, leaving the other layers unchanged. Example:
import torch
import torch.nn as nn
import torch.nn.functional as F

class MyScriptModule(torch.jit.ScriptModule):
    def __init__(self):
        super(MyScriptModule, self).__init__()
        # torch.jit.trace produces a ScriptModule's conv1 and conv2
        self.conv1 = torch.jit.trace(nn.Conv2d(1, 20, 5), torch.rand(1, 1, 16, 16))
        self.conv2 = torch.jit.trace(nn.Conv2d(20, 20, 5), torch.rand(1, 20, 16, 16))

    @torch.jit.script_method
    def forward(self, input):
        input = F.relu(self.conv1(input))
        input = F.relu(self.conv2(input))
        return input
step 2:
Save the TorchScript module
#save script module
torch.jit.save(save_script_module,'path/scriptmodulename.pt')
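Before moving to C++, it can help to confirm on the Python side that the saved file loads and gives the same output. A minimal round-trip sketch, assuming a stand-in `torch.nn.Linear` model and a temporary file name (both hypothetical):

```python
import os
import tempfile
import torch

net = torch.nn.Linear(4, 2).eval()   # stand-in for the real model
example = torch.rand(1, 4)
script_module = torch.jit.trace(net, example)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "scriptmodule.pt")   # hypothetical file name
    torch.jit.save(script_module, path)
    # map_location remaps the stored tensors, for example from a GPU to the CPU
    reloaded = torch.jit.load(path, map_location="cpu")

matches = torch.allclose(reloaded(example), script_module(example))
print(matches)
```

`map_location` is the Python counterpart of passing a device to `torch::jit::load` in C++.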
step 3:
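Load the .pt script module in a C++ project
tips:
1. Pay attention to the loading step. When the model was saved, the forward pass had run on a particular GPU, so the saved module carries that default location. Loading it directly without specifying a GPU device fails with "CUDA error: invalid device ordinal".
2. If the Python inference code uses torch.no_grad to reduce GPU memory usage, add the corresponding guard in the C++ code; otherwise inference in C++ uses more GPU memory than inference in Python.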
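What tip 2 refers to can be checked on the Python side. A minimal sketch, assuming a stand-in `torch.nn.Linear` model on CPU: inside `torch.no_grad()` the output is not attached to an autograd graph, so nothing is kept around for a backward pass.

```python
import torch

net = torch.nn.Linear(4, 2)   # stand-in model with trainable weights
x = torch.rand(1, 4)

tracked = net(x)              # autograd records this forward pass
with torch.no_grad():
    untracked = net(x)        # no graph is recorded for this call

print(tracked.requires_grad, untracked.requires_grad)
```

In C++, `torch::NoGradGuard` plays the same role for the scope in which it is declared.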
Model loading example:
#include <torch/script.h> // One-stop header.
#include <torch/torch.h>
#include <iostream>
#include <memory>
#include <string>

int main(int argc, const char* argv[]) {
    if (argc != 3) {
        std::cerr << "usage: example-app <path-to-exported-script-module> <gpu-id>\n";
        return -1;
    }
    // Parse the GPU index from the second argument.
    const int gpu_id = std::stoi(argv[2]);
    // Use the requested GPU when CUDA is available, otherwise fall back to the CPU.
    const torch::Device device = torch::cuda::is_available()
        ? torch::Device(torch::kCUDA, static_cast<c10::DeviceIndex>(gpu_id))
        : torch::Device(torch::kCPU);
    torch::jit::script::Module module;
    // GPU memory optimization: disable gradient tracking in this scope.
    torch::NoGradGuard no_grad;
    try {
        // Deserialize the ScriptModule from a file using torch::jit::load().
        module = torch::jit::load(argv[1], device);
        // Switch to evaluation mode for inference.
        module.eval();
    }
    catch (const c10::Error& e) {
        std::cerr << "error loading the model\n";
        return -1;
    }
    std::cout << "ok\n";
}
step 4: download libtorch and build with CMake
tips: the C++ libtorch version should be greater than or equal to the Python torch version, otherwise errors occur
Reference: https://pytorch.org/tutorials/advanced/cpp_export.html
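The version rule in the tip can be checked with a few lines of Python. This is a hypothetical helper, not part of PyTorch: `release_tuple` and `libtorch_is_new_enough` are made-up names, and it assumes version strings of the form `1.2.0` with an optional suffix such as `+cu92`.

```python
import re

def release_tuple(version: str):
    """Turn a string like '1.2.0+cu92' into the tuple (1, 2, 0)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part or 0) for part in match.groups())

def libtorch_is_new_enough(libtorch_version: str, torch_version: str) -> bool:
    # Compare numerically, so that 1.10 counts as newer than 1.9.
    return release_tuple(libtorch_version) >= release_tuple(torch_version)

print(libtorch_is_new_enough("1.10.0", "1.9.1+cu111"))
```

On the Python side the version used for export is available as `torch.__version__`; the libtorch version has to be read from the package that was downloaded.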
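step 5: run forward inference in the C++ code
tips: the input in this step must also be moved to the device the module is on; otherwise an error is raised because the data is a CPU variable while the module is a CUDA variable.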
// Create a vector of inputs.
at::Tensor input_tensor=torch::ones({1, 3, 224, 224});
input_tensor=input_tensor.to(device);
std::vector<torch::jit::IValue> inputs;
inputs.push_back(input_tensor);
// Execute the model and turn its output into a tensor.
at::Tensor output = module.forward(inputs).toTensor();
std::cout << output.slice(/*dim=*/1, /*start=*/0, /*end=*/5) << '\n';