
Deep Learning Environment Setup (Ubuntu 18.04 + CUDA 10.0 + cuDNN 7.6.5 + TensorFlow 2.0)


@ Bergen, Norway

The first time I installed CUDA, the process was absolutely maddening: inexplicable bugs at every turn and one pitfall after another. After installing CUDA and rebooting, I couldn't get into the desktop (straight to a black screen); then the mouse and keyboard stopped working; then everything installed fine but TensorFlow-GPU wouldn't... After going through a pile of tutorials online, I found that the official guide really is the best.

For my first post of the new year, here are my installation notes for Ubuntu 18.04 + CUDA 10.0 + cuDNN 7.6.5 + TensorFlow 2.0. I hope they help you avoid some of these pitfalls.

The overall flow is: install the NVIDIA driver -> install CUDA[1] -> install cuDNN[2] -> install tensorflow-gpu and test.

Contents:

  1. Ubuntu installation and updates
  2. Installing the NVIDIA driver
  3. Installing CUDA
  4. Installing cuDNN
  5. Installing TensorFlow 2.0 GPU and testing

1. Ubuntu Installation and Updates

Start with some basic setup and updates of a fresh Ubuntu 18.04 system. The OS installation itself is straightforward and omitted here; there are plenty of tutorials online.

sudo apt-get update # refresh the package lists
sudo apt-get upgrade # upgrade installed packages
sudo apt-get install vim


2. Install the NVIDIA Driver

2.1 Disable the Nouveau Driver

Note: there are two ways to install CUDA on Linux: Package Manager Installation (.deb) and Runfile Installation (.run). This guide uses the first, which is also the officially recommended way. If you install CUDA from the .deb, you can skip this step (I have verified this works). If you install CUDA from the runfile, you must manually disable the Nouveau driver that ships with the system:

lsmod | grep nouveau # make sure this command prints nothing

sudo vim /etc/modprobe.d/blacklist-nouveau.conf
# add the following two lines:
#######################################################
blacklist nouveau
options nouveau modeset=0
#######################################################
# save, then regenerate the initramfs and reboot:
sudo update-initramfs -u
sudo reboot
# run the check again; no output means the setting took effect
lsmod | grep nouveau


2.2 Install a Suitable NVIDIA Driver[3]

# purge any existing NVIDIA drivers and their dependencies, then reboot
sudo apt-get remove --purge nvidia*
sudo apt autoremove
sudo reboot

# add the PPA and install the latest driver
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
ubuntu-drivers devices # lists the drivers recommended for your GPU
sudo apt install nvidia-driver-440
# to avoid compatibility breakage from automatic driver upgrades, pin the version:
sudo apt-mark hold nvidia-driver-440
# nvidia-driver-440 set on hold.


In the Software & Updates menu, under the Additional Drivers tab, the freshly installed nvidia-driver-440 should appear in the list; select it. Run sudo reboot, and after rebooting run nvidia-smi. If it prints the information shown in the figure below, the driver is ready:

[Figure: nvidia-smi output]
lsmod | grep nvidia # output like the figure below means success; no output means something is wrong

[Figure: lsmod | grep nvidia output]

Alternatively, you can download the matching installer from the official site and install the driver manually[4].
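
If you take the manual route, the rough procedure is to stop the display server and run the downloaded installer from a text console. A minimal sketch, assuming GDM as the display manager and a hypothetical runfile name (substitute whatever you actually downloaded):

# switch to a text console first (e.g. Ctrl+Alt+F3), then:
sudo systemctl stop gdm # or lightdm, depending on your display manager
sudo sh NVIDIA-Linux-x86_64-440.118.run # hypothetical filename - use your download
sudo reboot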

# monitor GPU usage dynamically:
watch -n 1 nvidia-smi # refresh every second
watch -n 0.01 nvidia-smi # a faster refresh (watch clamps the interval to a 0.1s minimum)
# gpustat works as well:
pip install gpustat
gpustat -i 1 -P


3. Install CUDA

From Baidu Baike: CUDA (Compute Unified Device Architecture) is the computing platform from GPU vendor NVIDIA[5], a general-purpose parallel computing[6] architecture that enables GPUs[7] to solve complex computational problems.

On Linux there are two ways to install CUDA: Package Manager Installation (.deb) and Runfile Installation (.run). This guide uses the first, which is also the officially recommended method.

Note that CUDA has strict system requirements; the requirements for CUDA 10.0 are shown below. For other versions, check the corresponding Online Documentation[8].

[Table: CUDA 10.0 native Linux distribution support, listing the supported kernel, gcc, and glibc versions per distribution]

3.1 Pre-installation Checks

Before installing CUDA, make sure the environment is ready, to avoid baffling bugs later with no obvious place to start. Quoting the official guide:

Some actions must be taken before the CUDA Toolkit and Driver can be installed on Linux:
  • Verify the system has a CUDA-capable GPU.
  • Verify the system is running a supported version of Linux.
  • Verify the system has gcc installed.
  • Verify the system has the correct kernel headers and development packages installed.
  • Download the NVIDIA CUDA Toolkit.
  • Handle conflicting installation methods.
3.1.1 Verify you have a CUDA-capable GPU
lspci | grep -i nvidia | grep VGA

3.1.2 Verify your Linux version
uname -m && cat /etc/*release
uname -a
# The x86_64 line indicates you are running on a 64-bit system.

3.1.3 Verify the gcc version
gcc --version
# gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0

3.1.4 Install the kernel headers matching your kernel

Check the kernel version:

uname -r
# 5.0.0-37-generic

This is the version of the kernel headers and development packages that must be installed prior to installing the CUDA Drivers.

Install the headers for the running kernel:

sudo apt-get install linux-headers-$(uname -r)

3.1.5 Choose an installation method

Download the matching installer (this guide uses the officially recommended Deb package method)[9]:

The CUDA Toolkit can be installed using either of two different installation mechanisms: distribution-specific packages (RPM and Deb packages), or a distribution-independent package (runfile packages).

(1) The distribution-independent package has the advantage of working across a wider set of Linux distributions, but does not update the distribution's native package management system.

(2) The distribution-specific packages interface with the distribution's native package management system. It is recommended to use the distribution-specific packages, where possible.
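
Before installing, it can be worth verifying the download against the checksums NVIDIA publishes for each release (linked from the download page). A quick sketch using the deb (local) filename from this guide:

md5sum cuda-repo-ubuntu1804-10-0-local-10.0.130-410.48_1.0-1_amd64.deb
# compare the printed hash against the checksum page linked from the download site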

[Screenshots: CUDA 10.0 download page with Linux / x86_64 / Ubuntu / 18.04 / deb (local) selected]
3.1.6 Completely remove any previous installations to avoid conflicts

If this is a fresh Ubuntu install, skip this part and go straight to 3.2.


If previously installed with RPM/Deb packages on Ubuntu:

sudo apt-get --purge remove <package_name>
sudo apt autoremove


If previously installed with a runfile:

sudo /usr/bin/nvidia-uninstall
sudo /usr/local/cuda-X.Y/bin/uninstall_cuda_X.Y.pl


3.2 Installation

首先確定已經下載下傳好對應的.deb 檔案,然後執行:

sudo dpkg -i cuda-repo-ubuntu1804-10-0-local-10.0.130-410.48_1.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-<version>/7fa2af80.pub # use the path printed by the previous command; in my case:
# sudo apt-key add /var/cuda-repo-10-0-local-10.0.130-410.48/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda-toolkit-10-0 # note: cuda-toolkit-10-0, not cuda - the driver was already installed in step 2, so only the toolkit is needed here


3.3 Post-installation Setup

After installation, some manual setup is needed before CUDA works properly.

export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}} # add the CUDA binaries to PATH

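Note that this export only lasts for the current shell session. A minimal sketch, assuming the default install prefix /usr/local/cuda-10.0, to make it persistent (a deb install normally registers the library directory with the loader already, so the LD_LIBRARY_PATH line is only a safeguard):

echo 'export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}}' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}' >> ~/.bashrc
source ~/.bashrc
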
nvcc -V # check that CUDA installed successfully
# OUTPUT:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130

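The toolkit can also be verified end to end by compiling one of the bundled CUDA samples. A sketch, assuming the samples shipped with the toolkit install (the helper script below copies them into your home directory):

cuda-install-samples-10.0.sh ~
cd ~/NVIDIA_CUDA-10.0_Samples/1_Utilities/deviceQuery
make
./deviceQuery # should list your GPU(s) and finish with "Result = PASS"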

It's also best to turn off automatic system updates, so a working environment doesn't suddenly break:

sudo vi /etc/apt/apt.conf.d/10periodic

# change the contents to:
APT::Periodic::Update-Package-Lists "0";
APT::Periodic::Download-Upgradeable-Packages "0";
APT::Periodic::AutocleanInterval "0";


This can also be set from the desktop: System Settings => Software & Updates => Updates.

4. Install cuDNN[10]

NVIDIA cuDNN is a GPU-accelerated library of primitives for deep neural networks. First, register and download the cuDNN package matching your CUDA version: link[11].

For CUDA 10.0, for example, I downloaded cudnn-10.0-linux-x64-v7.6.5.32.tgz. Unpack it and copy the headers and libraries into the CUDA install tree:

tar -zxvf cudnn-10.0-linux-x64-v7.6.5.32.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*


Verify the installation:

cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
# output:
"""
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 6
#define CUDNN_PATCHLEVEL 5
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
#include "driver_types.h"
"""


An even better option is to install from the Debian packages, since the bundled samples let you verify that cuDNN actually works. First download the following three files:

# after downloading each one, install:
sudo dpkg -i libcudnn7_7.6.5.32-1+cuda10.0_amd64.deb
sudo dpkg -i libcudnn7-dev_7.6.5.32-1+cuda10.0_amd64.deb
sudo dpkg -i libcudnn7-doc_7.6.5.32-1+cuda10.0_amd64.deb
# then verify with the mnistCUDNN sample:
cp -r /usr/src/cudnn_samples_v7/ $HOME
cd $HOME/cudnn_samples_v7/mnistCUDNN
make clean && make
./mnistCUDNN
# Test passed!


Alternatively, conda can install cudatoolkit and cuDNN for you, as long as the NVIDIA driver is ready:

conda install cudatoolkit=10.0
conda install -c anaconda cudnn
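
If you go this route, pinning both packages to the versions used in this guide keeps conda from pulling a mismatched pair; a minimal sketch (assuming these versions are available on your conda channels):

conda install cudatoolkit=10.0 cudnn=7.6.5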


5. Install TensorFlow 2.0 GPU and Test

# install conda
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh && bash Miniconda3-latest-Linux-x86_64.sh
source ~/.bashrc
conda create -y -n tf2 python=3.7
conda activate tf2
pip install --upgrade pip
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
pip install tensorflow-gpu==2.0.0 # pin the version: TF 2.1+ builds expect CUDA 10.1, not 10.0
pip install catboost


Test:

import tensorflow as tf
print(tf.__version__)
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))
"""
2.0.0
Num GPUs Available:  2
"""

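Since the machine here has two GPUs, it can also be useful to enable memory growth so TensorFlow allocates GPU memory on demand instead of grabbing it all at start-up. A minimal sketch using the TF 2.0 experimental config API:

import tensorflow as tf

# request on-demand memory allocation on every visible GPU (must run before the GPUs are initialized)
for gpu in tf.config.experimental.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)

# a small matmul forces a kernel launch, confirming the GPU path works end to end
with tf.device('/GPU:0'):
    a = tf.random.normal([1000, 1000])
    b = tf.random.normal([1000, 1000])
    print(tf.reduce_sum(tf.matmul(a, b)).numpy())
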
"""
Test program
Source: https://github.com/dragen1860/TensorFlow-2.x-Tutorials/blob/master/08-ResNet/main.py
"""
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "1" # expose only GPU 1; use "0,1" to expose both
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' # set before importing tensorflow so the C++ log filter takes effect
import tensorflow as tf
import numpy as np
from tensorflow import keras

tf.random.set_seed(22)
np.random.seed(22)
assert tf.__version__.startswith('2.')

(x_train, y_train), (x_test, y_test) = keras.datasets.fashion_mnist.load_data()
x_train, x_test = x_train.astype(np.float32) / 255., x_test.astype(
    np.float32) / 255.
# [b, 28, 28] => [b, 28, 28, 1]
x_train, x_test = np.expand_dims(x_train, axis=3), np.expand_dims(x_test,
                                                                  axis=3)
# one hot encode the labels. convert back to numpy as we cannot use a combination of numpy
# and tensors as input to keras
y_train_ohe = tf.one_hot(y_train, depth=10).numpy()
y_test_ohe = tf.one_hot(y_test, depth=10).numpy()

print(x_train.shape, y_train.shape)
print(x_test.shape, y_test.shape)

# 3x3 convolution
def conv3x3(channels, stride=1, kernel=(3, 3)):
    return keras.layers.Conv2D(
        channels,
        kernel,
        strides=stride,
        padding='same',
        use_bias=False,
        kernel_initializer=tf.random_normal_initializer())

class ResnetBlock(keras.Model):
    def __init__(self, channels, strides=1, residual_path=False):
        super(ResnetBlock, self).__init__()
        self.channels = channels
        self.strides = strides
        self.residual_path = residual_path
        self.conv1 = conv3x3(channels, strides)
        self.bn1 = keras.layers.BatchNormalization()
        self.conv2 = conv3x3(channels)
        self.bn2 = keras.layers.BatchNormalization()
        if residual_path:
            self.down_conv = conv3x3(channels, strides, kernel=(1, 1))
            self.down_bn = tf.keras.layers.BatchNormalization()

    def call(self, inputs, training=None):
        residual = inputs
        x = self.bn1(inputs, training=training)
        x = tf.nn.relu(x)
        x = self.conv1(x)
        x = self.bn2(x, training=training)
        x = tf.nn.relu(x)
        x = self.conv2(x)
        # when the block changes shape, project the shortcut through
        # BN -> ReLU -> 1x1 conv so it can be added to the main path
        if self.residual_path:
            residual = self.down_bn(inputs, training=training)
            residual = tf.nn.relu(residual)
            residual = self.down_conv(residual)
        x = x + residual
        return x

class ResNet(keras.Model):
    def __init__(self, block_list, num_classes, initial_filters=16, **kwargs):
        super(ResNet, self).__init__(**kwargs)
        self.num_blocks = len(block_list)
        self.block_list = block_list
        self.in_channels = initial_filters
        self.out_channels = initial_filters
        self.conv_initial = conv3x3(self.out_channels)
        self.blocks = keras.models.Sequential(name='dynamic-blocks')
        # build all the blocks
        for block_id in range(len(block_list)):
            for layer_id in range(block_list[block_id]):

                if block_id != 0 and layer_id == 0:
                    block = ResnetBlock(self.out_channels,
                                        strides=2,
                                        residual_path=True)
                else:
                    if self.in_channels != self.out_channels:
                        residual_path = True
                    else:
                        residual_path = False
                    block = ResnetBlock(self.out_channels,
                                        residual_path=residual_path)
                self.in_channels = self.out_channels
                self.blocks.add(block)
            self.out_channels *= 2
        self.final_bn = keras.layers.BatchNormalization()
        self.avg_pool = keras.layers.GlobalAveragePooling2D()
        self.fc = keras.layers.Dense(num_classes)

    def call(self, inputs, training=None):
        out = self.conv_initial(inputs)
        out = self.blocks(out, training=training)
        out = self.final_bn(out, training=training)
        out = tf.nn.relu(out)
        out = self.avg_pool(out)
        out = self.fc(out)
        return out

def main():
    num_classes = 10
    batch_size = 128
    epochs = 2
    # build model and optimizer
    model = ResNet([2, 2, 2], num_classes)
    model.compile(optimizer=keras.optimizers.Adam(0.001),
                  loss=keras.losses.CategoricalCrossentropy(from_logits=True),
                  metrics=['accuracy'])
    model.build(input_shape=(None, 28, 28, 1))
    print("Number of variables in the model :", len(model.variables))
    model.summary()
    # train
    model.fit(x_train,
              y_train_ohe,
              batch_size=batch_size,
              epochs=epochs,
              validation_data=(x_test, y_test_ohe),
              verbose=1)

    # evaluate on test set
    scores = model.evaluate(x_test, y_test_ohe, batch_size, verbose=1)
    print("Final test loss and accuracy :", scores)

if __name__ == '__main__':
    main()


Monitor GPU usage:

watch -n 0.01 nvidia-smi

[Figure: nvidia-smi showing GPU utilization during training]

Test CatBoost on the GPU:

from catboost.datasets import titanic
import numpy as np
from sklearn.model_selection import train_test_split
from catboost import CatBoostClassifier, Pool, cv
from sklearn.metrics import accuracy_score

train_df, test_df = titanic()
null_value_stats = train_df.isnull().sum(axis=0)
print(null_value_stats[null_value_stats != 0]) # columns that contain missing values

train_df.fillna(-999, inplace=True)
test_df.fillna(-999, inplace=True)

X = train_df.drop('Survived', axis=1)
y = train_df.Survived

X_train, X_validation, y_train, y_validation = train_test_split(X, y, train_size=0.75, random_state=42)
X_test = test_df

categorical_features_indices = np.where(X.dtypes != np.float)[0] # treat non-float columns as categorical

model = CatBoostClassifier(
    task_type="GPU",
    custom_metric=['Accuracy'],
    random_seed=666,
    logging_level='Silent'
)

model.fit(
    X_train, y_train,
    cat_features=categorical_features_indices,
    eval_set=(X_validation, y_validation),
    logging_level='Verbose',  # you can comment this for no text output
    plot=True
);
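
After training, a quick sanity check on the held-out split (a short sketch reusing the variables defined above; accuracy_score is already imported):

val_preds = model.predict(X_validation) # predicted class labels
print('Validation accuracy:', accuracy_score(y_validation, val_preds))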


Monitor GPU usage:

watch -n 0.01 nvidia-smi

[Figure: nvidia-smi showing GPU utilization during CatBoost training]

REFERENCES

[1] Install CUDA: https://developer.nvidia.com/cuda-toolkit-archive
[2] Install cuDNN: https://developer.nvidia.com/rdp/cudnn-download
[3] Install a suitable NVIDIA driver: http://www.linuxandubuntu.com/home/how-to-install-latest-nvidia-drivers-in-linux
[4] Manual driver downloads (official site): https://www.geforce.cn/drivers
[5] NVIDIA: https://baike.baidu.com/item/NVIDIA
[6] Parallel computing: https://baike.baidu.com/item/并行計算/113443
[7] GPU: https://baike.baidu.com/item/GPU
[8] Online Documentation: https://developer.nvidia.com/cuda-toolkit-archive
[9] Installer download (officially recommended Deb package method): https://developer.nvidia.com/cuda-10.0-download-archive?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1804&target_type=deblocal
[10] Install cuDNN: https://developer.nvidia.com/rdp/cudnn-download
[11] Link: https://developer.nvidia.com/rdp/cudnn-download
[12] Official NVIDIA CUDA Installation Guide for Linux: https://docs.nvidia.com/cuda/archive/10.0/cuda-installation-guide-linux/index.html
[13] CUDA Quick Start Guide (PDF): https://developer.download.nvidia.com/compute/cuda/10.0/Prod/docs/sidebar/CUDA_Quick_Start_Guide.pdf
[14] CUDA Installation Guide for Linux (PDF): https://developer.download.nvidia.com/compute/cuda/10.0/Prod/docs/sidebar/CUDA_Installation_Guide_Linux.pdf
[15] Official cuDNN installation guide: https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html#install-linux
[16] [How To] Install Latest NVIDIA Drivers In Linux: http://www.linuxandubuntu.com/home/how-to-install-latest-nvidia-drivers-in-linux
