ONNX Runtime is a cross-platform machine learning model accelerator with a flexible interface to integrate hardware-specific libraries. ONNX Runtime can be used with models from PyTorch, PaddlePaddle, TensorFlow/Keras, TFLite, scikit-learn, and other frameworks.
Recently I have been experimenting with running PaddleOCR model inference on onnxruntime, using the onnxruntime-gpu build to get GPU acceleration. For technical research I like to grab the latest version of everything, so I downloaded and installed CUDA 12.1 (CUDA Toolkit Archive | NVIDIA Developer) and took the latest onnxruntime package, Microsoft.ML.OnnxRuntime.Gpu.Windows 1.18.1.
I ported the C++ PaddleOCR code in my project (https://gitee.com/raoyutian/PaddleOCRSharp) to onnxruntime and it compiled successfully. After testing, GPU inference is significantly faster than the GPU version of the Paddle framework.
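For reference, below is a minimal sketch of how the CUDA execution provider is enabled through the onnxruntime C++ API. The model path `det.onnx` is a placeholder, not the actual project code.

```cpp
#include <onnxruntime_cxx_api.h>

int main() {
    // One global environment per process; the logger name is arbitrary.
    Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "paddleocr-onnx");

    // Route the session to the GPU via the CUDA execution provider.
    Ort::SessionOptions session_options;
    OrtCUDAProviderOptions cuda_options;  // defaults: device 0
    session_options.AppendExecutionProvider_CUDA(cuda_options);

    // Placeholder model path; on Windows the path is a wide string.
    Ort::Session session(env, L"det.onnx", session_options);
    return 0;
}
```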
So I found a machine with a 4090 graphics card for testing, installed CUDA 12.1 on it along with the same cuDNN version as on the development machine, and expected GPU acceleration to work just as smoothly there. But on the test machine, GPU acceleration could not be used!! I ran a dependency tool against the binaries and found that dependencies were missing.
So I ran the same dependency tool on the development machine. No dependency DLLs were missing there, but it turned out that several of the dependencies actually point to the CUDA 11.8 DLLs (the development machine has several CUDA versions installed). The test machine only has CUDA 12.1 installed, so GPU acceleration cannot work there. Maddening!!
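Besides a dependency-walker tool, this kind of failure can also be surfaced at runtime. A minimal diagnostic sketch (the model path is a placeholder): when a CUDA or cuDNN DLL cannot be resolved, appending the CUDA provider or creating the session throws, and the exception text usually names the library that failed to load.

```cpp
#include <onnxruntime_cxx_api.h>
#include <iostream>

int main() {
    // Execution providers compiled into the onnxruntime build that was loaded,
    // e.g. "CUDAExecutionProvider", "CPUExecutionProvider".
    for (const auto& provider : Ort::GetAvailableProviders())
        std::cout << provider << "\n";

    Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "diag");
    Ort::SessionOptions opts;
    try {
        OrtCUDAProviderOptions cuda_options;
        opts.AppendExecutionProvider_CUDA(cuda_options);
        Ort::Session session(env, L"det.onnx", opts);  // placeholder model path
        std::cout << "CUDA execution provider initialized\n";
    } catch (const Ort::Exception& e) {
        // With a missing CUDA/cuDNN DLL this typically reports a load failure.
        std::cerr << "GPU init failed: " << e.what() << "\n";
    }
    return 0;
}
```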
Checking the bin directory of the CUDA installation, I found that the "missing" files do exist, but under different file names: the names onnxruntime references do not match the files on disk.
According to the official onnxruntime documentation on NVIDIA CUDA support, onnxruntime 1.18.1 is supposed to support CUDA 12.x, yet the dependency file names it looks for are wrong. What a pit to fall into!!!
Installing CUDA 12.5 did not help either; the required files still could not be found.
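One way to confirm which version-suffixed CUDA/cuDNN DLL names actually resolve on a machine is to probe them directly with the Win32 loader. The DLL names below are only examples; the exact set a given onnxruntime release expects differs between versions, which is exactly the mismatch described above.

```cpp
#include <windows.h>
#include <iostream>

int main() {
    // Example version-suffixed CUDA/cuDNN DLL names; substitute whatever your
    // dependency tool reports as missing for your onnxruntime version.
    const char* dlls[] = {"cudart64_12.dll", "cublas64_12.dll",
                          "cudnn64_8.dll", "cudnn64_9.dll"};
    for (const char* name : dlls) {
        HMODULE handle = LoadLibraryA(name);
        std::cout << name << (handle ? " -> resolved" : " -> NOT found") << "\n";
        if (handle) FreeLibrary(handle);
    }
    return 0;
}
```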
In the end I switched to onnxruntime-gpu 1.18.0, and everything was solved perfectly! Hahahaha
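After pinning the package version, it is worth double-checking at runtime which onnxruntime build the process actually loaded; a minimal sketch:

```cpp
#include <onnxruntime_cxx_api.h>
#include <iostream>

int main() {
    // Version of the onnxruntime library the process actually loaded, e.g. "1.18.0".
    std::cout << "onnxruntime version: "
              << OrtGetApiBase()->GetVersionString() << "\n";
    return 0;
}
```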
The latest version is not necessarily the best. How many pits there are, and how deep they go, we will just have to find out one at a time.
Finally, here is the inference result of the PaddleOCR model running on onnxruntime-gpu:
After being wrapped, PaddleOCR.dll can be called from Go, Python, Rust, C#, C++, and other languages.
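As an illustration of the cross-language use, a caller can load the wrapper DLL dynamically through its C exports. The export name `Detect` and its signature below are hypothetical placeholders, not the actual PaddleOCR.dll interface.

```cpp
#include <windows.h>
#include <iostream>

// Hypothetical export: the real PaddleOCR.dll function names and signatures may differ.
typedef const char* (*DetectFn)(const char* image_path);

int main() {
    HMODULE dll = LoadLibraryA("PaddleOCR.dll");
    if (!dll) {
        std::cerr << "failed to load PaddleOCR.dll\n";
        return 1;
    }
    // "Detect" is a placeholder name for an exported OCR entry point.
    auto detect = reinterpret_cast<DetectFn>(GetProcAddress(dll, "Detect"));
    if (detect)
        std::cout << detect("test.jpg") << "\n";  // placeholder image path
    FreeLibrary(dll);
    return 0;
}
```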