
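How to upgrade cuDNN and fix a cuDNN version incompatibility

Contents: 1 Check the cuDNN version, 2 Download cuDNN, 3 Remove the old version, 4 Install the new version, 5 Create the symbolic links, 6 Verify

Running the code produced the following error: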

 32/1109 [..............................] - ETA: 12:41 - loss: 3.4072 - accuracy: 0.0000e+00
2020-09-24 02:47:25.341531: E tensorflow/stream_executor/cuda/cuda_dnn.cc:319] Loaded runtime CuDNN library: 7.5.1 but source was compiled with: 7.6.4.  CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version. If using a binary install, upgrade your CuDNN library.  If building from sources, make sure the library loaded at runtime is compatible w
 96/1109 [=>............................] - ETA: 12:31 - loss: 3.3774 - accuracy: 0.0312
Traceback (most recent call last):

  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/rasa/__main__.py", line 103, in <module>
    main()
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/rasa/__main__.py", line 92, in main
    cmdline_arguments.func(cmdline_arguments)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/rasa/cli/train.py", line 76, in train
    additional_arguments=extract_additional_arguments(args),
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/rasa/train.py", line 50, in train
    additional_arguments=additional_arguments,
  File "uvloop/loop.pyx", line 1456, in uvloop.loop.Loop.run_until_complete
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/rasa/train.py", line 101, in train_async
    additional_arguments,
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/rasa/train.py", line 188, in _train_async_internal
    additional_arguments=additional_arguments,
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/rasa/train.py", line 223, in _do_training
    additional_arguments=additional_arguments,
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/rasa/train.py", line 361, in _train_core_with_validated_data
    additional_arguments=additional_arguments,
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/rasa/core/train.py", line 66, in train
    agent.train(training_data, **additional_arguments)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/rasa/core/agent.py", line 742, in train
    self.policy_ensemble.train(training_trackers, self.domain, **kwargs)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/rasa/core/policies/ensemble.py", line 124, in train
    policy.train(training_trackers, domain, **kwargs)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/rasa/core/policies/keras_policy.py", line 197, in train
    **self._train_params,
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 819, in fit
    use_multiprocessing=use_multiprocessing)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 342, in fit
    total_epochs=epochs)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 128, in run_one_epoch
    batch_outs = execution_function(iterator)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py", line 98, in execution_function
    distributed_function(input_fn))
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/tensorflow_core/python/eager/def_function.py", line 568, in __call__
    result = self._call(*args, **kwds)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/tensorflow_core/python/eager/def_function.py", line 599, in _call
    return self._stateless_fn(*args, **kwds)  # pylint: disable=not-callable
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 2363, in __call__
    return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 1611, in _filtered_call
    self.captured_inputs)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 1692, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 545, in call
    ctx=ctx)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/tensorflow_core/python/eager/execute.py", line 67, in quick_execute
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.UnknownError:  [_Derived_]  Fail to find the dnn implementation.
     [[{{node cond_29/then/_0/CudnnRNNV3}}]]
     [[sequential/lstm/StatefulPartitionedCall]] [Op:__inference_distributed_function_6721]

Function call stack:
distributed_function -> distributed_function -> distributed_function

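The code fails because the TensorFlow build and the installed cuDNN library do not match: TensorFlow was compiled against cuDNN 7.6.4, but the version I had installed was 7.5.1. cuDNN therefore needs to be upgraded. The upgrade is simple and does not break the existing environment, and TensorFlow keeps working normally afterwards.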
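The rule quoted in the error message (same major version, and an equal or higher minor version, for cuDNN 7.0 or later) can be written down as a small check. This is only a sketch: `cudnn_compatible` is a hypothetical helper name, not something shipped with TensorFlow or cuDNN, and it assumes both versions are given as MAJOR.MINOR.PATCH strings.

```shell
# Hypothetical helper (not part of TensorFlow or cuDNN).
# Succeeds when the loaded cuDNN version satisfies the rule from the error
# message, assuming cuDNN 7.0 or later.
# Usage: cudnn_compatible LOADED_VERSION COMPILED_VERSION
cudnn_compatible() {
    local loaded_major loaded_minor required_major required_minor rest
    # Split "7.5.1" into major "7" and minor "5" with parameter expansion.
    loaded_major=${1%%.*}
    rest=${1#*.}
    loaded_minor=${rest%%.*}
    required_major=${2%%.*}
    rest=${2#*.}
    required_minor=${rest%%.*}
    # Same major version, and the loaded minor version is equal or higher.
    [ "$loaded_major" -eq "$required_major" ] && [ "$loaded_minor" -ge "$required_minor" ]
}
```

With the versions from the log above, `cudnn_compatible 7.5.1 7.6.4` fails, while `cudnn_compatible 7.6.4 7.6.4` succeeds.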

1 Check the cuDNN version

First, check the currently installed cuDNN version with the following command:

cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
           

The output is:

#define CUDNN_MAJOR 7
#define CUDNN_MINOR 5
#define CUDNN_PATCHLEVEL 1
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)

#include "driver_types.h"
           

This shows that the installed version is 7.5.1.
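If the version is needed as a single string (for example in a script), the three macros can be combined. The following is a minimal sketch; `cudnn_header_version` is a hypothetical helper name, and it assumes the header defines the macros in the `#define NAME VALUE` form shown above.

```shell
# Hypothetical helper: print MAJOR.MINOR.PATCH from a cuDNN header file.
# Usage: cudnn_header_version /usr/local/cuda/include/cudnn.h
cudnn_header_version() {
    awk '
        # Match only the three plain version macros; the CUDNN_VERSION line
        # has a different macro name in field 2 and is skipped.
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { print major "." minor "." patch }
    ' "$1"
}
```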

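2 Download cuDNN

Download the build that matches your CUDA version and operating system from the official site. I recommend the tgz archive rather than the Deb package.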


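The official download requires registering an account. I have uploaded this version to Baidu Netdisk, so anyone who needs the same version can download it from there and skip the registration.

File: cudnn-10.1-linux-x64-v7.6.4.38.solitairetheme8

Link: https://pan.baidu.com/s/1ivxmaE_YUIaIaNsTTqskDQ 

Extraction code: xcp1

After downloading, extract the archive as follows: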

# Rename the file to .tgz
mv cudnn-10.1-linux-x64-v7.6.4.38.solitairetheme8 cudnn-10.1-linux-x64-v7.6.4.38.tgz
# Extract the archive
tar -zxvf cudnn-10.1-linux-x64-v7.6.4.38.tgz
           

Extraction produces a folder named cuda, which contains two subfolders: include and lib64.
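Before touching the existing installation, it can be worth confirming that the extracted folder really has this layout. The sketch below is one way to do that; `check_cudnn_layout` is a hypothetical helper name, and it assumes the header is named cudnn.h and the libraries start with libcudnn, as in the copy commands of step 4.

```shell
# Hypothetical helper: verify that an extracted cuDNN folder contains the
# header and at least one libcudnn library file.
# Usage: check_cudnn_layout ./cuda
check_cudnn_layout() {
    if [ ! -f "$1/include/cudnn.h" ]; then
        echo "missing: $1/include/cudnn.h" >&2
        return 1
    fi
    # ls fails when the glob matches nothing, which signals missing libraries.
    if ! ls "$1"/lib64/libcudnn* >/dev/null 2>&1; then
        echo "missing: $1/lib64/libcudnn*" >&2
        return 1
    fi
}
```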

3 Remove the old version

sudo rm -rf /usr/local/cuda/include/cudnn.h
sudo rm -rf /usr/local/cuda/lib64/libcudnn*
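The removal above cannot be undone, so a cautious reader may want to copy the old files somewhere first. This is a minimal sketch of such a backup, not part of the original procedure; `backup_cudnn` and the backup directory are hypothetical names chosen for illustration.

```shell
# Hypothetical helper: copy the current cuDNN header and libraries into a
# backup directory so they can be restored later.
# Usage: backup_cudnn /usr/local/cuda "$HOME/cudnn-backup"
backup_cudnn() {
    mkdir -p "$2/include" "$2/lib64" || return 1
    cp -P "$1/include/cudnn.h" "$2/include/" || return 1
    # -P copies symbolic links as links instead of following them.
    cp -P "$1"/lib64/libcudnn* "$2/lib64/"
}
```

Run it before the two rm commands; restoring is then a matter of copying the files back with `sudo cp -P`.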
           

4 Install the new version

cd into the cuda folder extracted in step 2, then run:

sudo cp include/cudnn.h /usr/local/cuda/include/
sudo cp lib64/lib* /usr/local/cuda/lib64/
           

5 Create the symbolic links

cd /usr/local/cuda/lib64/
sudo chmod +r libcudnn.so.7.6.4
sudo ln -sf libcudnn.so.7.6.4 libcudnn.so.7
sudo ln -sf libcudnn.so.7 libcudnn.so   
sudo ldconfig
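To confirm that the link chain created above ends at the new library, each link can be resolved and compared with the versioned file. The following is a sketch under two assumptions: `check_cudnn_links` is a hypothetical helper name, and `readlink -f` is available (it is part of GNU coreutils on typical Linux systems).

```shell
# Hypothetical helper: check that libcudnn.so and libcudnn.so.<major> both
# resolve to libcudnn.so.<version> inside the given library directory.
# Usage: check_cudnn_links /usr/local/cuda/lib64 7.6.4
check_cudnn_links() {
    local libdir version target link
    libdir=$1
    version=$2
    target="$libdir/libcudnn.so.$version"
    if [ ! -f "$target" ]; then
        echo "missing: $target" >&2
        return 1
    fi
    # ${version%%.*} keeps only the major number, e.g. "7" from "7.6.4".
    for link in "$libdir/libcudnn.so" "$libdir/libcudnn.so.${version%%.*}"; do
        if [ "$(readlink -f "$link")" != "$(readlink -f "$target")" ]; then
            echo "$link does not resolve to $target" >&2
            return 1
        fi
    done
}
```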
           

6 Verify

Run the command from step 1 again. The output is now:

#define CUDNN_MAJOR 7
#define CUDNN_MINOR 6
#define CUDNN_PATCHLEVEL 4
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)

#include "driver_types.h"
           

The upgrade succeeded, and running the code no longer produces the error.
