
Learning Caffe: Installation (cudnn7.5 + cuda8.0 + Ubuntu 16.04)

Installation environment: cudnn7.5 + cuda8.0 + Ubuntu 16.04

General dependencies, installed with apt:

sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler
sudo apt-get install --no-install-recommends libboost-all-dev
           

sudo apt-get install libatlas-base-dev

sudo apt-get install python-dev

cd /usr/lib/x86_64-linux-gnu

ln -s libhdf5_serial_hl.so.10.0.2 libhdf5_hl.so

ln -s libhdf5_serial.so.10.1.0 libhdf5.so
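
The two ln commands above hard-code the library versions (10.0.2 and 10.1.0), which may differ on another machine. As a minimal sketch, assuming GNU coreutils (sort -V) and root permissions for the real directory, the versioned files could be discovered instead. The helper name link_hdf5 is hypothetical and not part of the original notes.

```shell
# Hypothetical helper: create the unversioned HDF5 symlinks in a library
# directory by discovering the versioned libhdf5_serial files there.
link_hdf5() {
    local libdir="${1:-/usr/lib/x86_64-linux-gnu}" pair src dst target
    for pair in libhdf5_serial_hl.so:libhdf5_hl.so libhdf5_serial.so:libhdf5.so; do
        src="${pair%%:*}"
        dst="${pair##*:}"
        # Pick the highest fully versioned file, e.g. libhdf5_serial.so.10.1.0
        target="$(cd "$libdir" && ls "$src".*.*.* 2>/dev/null | sort -V | tail -n 1)"
        if [ -z "$target" ]; then
            echo "no versioned $src found in $libdir" >&2
            return 1
        fi
        # Relative link, replaced if it already exists
        ln -sfn "$target" "$libdir/$dst"
    done
}

# Usage (as root):
#   link_hdf5 /usr/lib/x86_64-linux-gnu
```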

apt install liblmdb-dev libgflags-dev libgoogle-glog-dev

apt install libgflags-dev

sudo apt-get install --reinstall python-pkg-resources
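
Before moving on to the build, it can help to confirm that the packages listed above are really installed. The following is a rough sketch rather than part of the original procedure: the helper name missing_packages is hypothetical, and it assumes a Debian or Ubuntu system where dpkg-query is available.

```shell
# Hypothetical helper: print each given package that dpkg does not report
# as fully installed (also printed if dpkg-query itself is unavailable).
missing_packages() {
    local pkg
    for pkg in "$@"; do
        if ! dpkg-query -W -f='${Status}' "$pkg" 2>/dev/null \
                | grep -q 'install ok installed'; then
            echo "$pkg"
        fi
    done
}

# Usage, with package names taken from the apt commands above:
#   missing_packages libprotobuf-dev libleveldb-dev libsnappy-dev \
#       libhdf5-serial-dev libatlas-base-dev liblmdb-dev libgflags-dev
```

An empty result means every listed package is installed.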

cuda8.0

compilation with Make

cp Makefile.config.example Makefile.config
           
# Adjust Makefile.config (for example, if using Anaconda Python, or if cuDNN is desired)
           

Edit the config file to match the environment set up above:

To write layers in Python, change

#WITH_PYTHON_LAYER := 1

to

WITH_PYTHON_LAYER := 1
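
The same change can be scripted instead of made by hand. This is a minimal sketch, assuming GNU sed (for -i and -E) and a hypothetical helper name, uncomment_make_var. It only strips the leading "#" from the line that assigns the named variable.

```shell
# Hypothetical helper: remove the leading "#" from a commented-out
# "VAR := value" line in a Makefile-style config file (GNU sed assumed).
uncomment_make_var() {
    local file="$1" var="$2"
    sed -i -E "s/^[[:space:]]*#[[:space:]]*(${var}[[:space:]]*:?=)/\1/" "$file"
}

# Usage:
#   uncomment_make_var Makefile.config WITH_PYTHON_LAYER
```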

An important step:

Below the line # Whatever else you find you need goes here., change

INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include

LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib

to:

INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial

LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu
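
The HDF5 path additions can also be applied with a small script. The sketch below assumes GNU sed and uses a hypothetical helper name, append_make_dir. It appends a directory to the variable's line only when that directory is not already listed, so running it twice is harmless.

```shell
# Hypothetical helper: append a directory to a "VAR := ..." line in a
# Makefile-style config unless it is already listed (GNU sed assumed).
append_make_dir() {
    local file="$1" var="$2" dir="$3"
    # Split the variable's line into words and look for an exact match.
    if grep -E "^${var}[[:space:]]*:=" "$file" | tr ' \t' '\n\n' | grep -qxF "$dir"; then
        return 0
    fi
    sed -i -E "s|^(${var}[[:space:]]*:=.*[^[:space:]])[[:space:]]*$|\1 ${dir}|" "$file"
}

# Usage:
#   append_make_dir Makefile.config INCLUDE_DIRS /usr/include/hdf5/serial
#   append_make_dir Makefile.config LIBRARY_DIRS /usr/lib/x86_64-linux-gnu
```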

Uncomment

USE_NCCL := 1

NCCL (optimized primitives for collective multi-GPU communication) must be installed separately. Download it from http://github.com/NVIDIA/nccl and extract it, then:

cd nccl 

make test

export LD_LIBRARY_PATH=./build/lib:$LD_LIBRARY_PATH

./build/test/single/all_reduce_test 10000000

make install

problem: No module named pkg_resources

solution: sudo apt-get install --reinstall python-pkg-resources

Error when installing Caffe on Ubuntu:

Check failed: status == CURAND_STATUS_SUCCESS (201 vs. 0)  CURAND_STATUS_LAUNCH_FAILURE

solution: update cuda8.0

compilation here:

make all
make test
make runtest
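
The three targets can be chained so that a failure stops the sequence early. This is only a sketch: the wrapper name build_caffe and the overridable MAKE variable are assumptions for illustration, and the parallel job count is taken from the number of online CPUs.

```shell
# Hypothetical wrapper: run the three build targets in order and stop at the
# first failure. MAKE can be overridden, for example for a dry run.
build_caffe() {
    local make_cmd="${MAKE:-make}" jobs target
    # Number of parallel jobs: online CPU count, falling back to 1.
    jobs="$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"
    for target in all test runtest; do
        echo ">>> make $target"
        $make_cmd "$target" -j"$jobs" || { echo "make $target failed" >&2; return 1; }
    done
}

# Usage, from the Caffe source directory:
#   build_caffe
```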
           

After make all, the output looks like this:

CXX/LD -o .build_release/tools/caffe.bin

/usr/bin/ld: warning: libcudart.so.9.0, needed by /usr/local/lib/libnccl.so, may conflict with libcudart.so.7.5

CXX tools/compute_image_mean.cpp

CXX/LD -o .build_release/tools/compute_image_mean.bin

CXX tools/upgrade_solver_proto_text.cpp

CXX/LD -o .build_release/tools/upgrade_solver_proto_text.bin

CXX examples/mnist/convert_mnist_data.cpp

CXX/LD -o .build_release/examples/mnist/convert_mnist_data.bin

CXX examples/siamese/convert_mnist_siamese_data.cpp

CXX/LD -o .build_release/examples/siamese/convert_mnist_siamese_data.bin

CXX examples/cifar10/convert_cifar_data.cpp

CXX/LD -o .build_release/examples/cifar10/convert_cifar_data.bin

CXX examples/cpp_classification/classification.cpp

CXX/LD -o .build_release/examples/cpp_classification/classification.bin
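
The linker warning in the output above reports that /usr/local/lib/libnccl.so needs libcudart.so.9.0 while the build also links libcudart.so.7.5, which means two CUDA runtimes are being mixed. A small sketch for spotting this, using a hypothetical helper name cudart_versions, is to extract the distinct libcudart names from ldd output or from a saved build log.

```shell
# Hypothetical helper: read text on stdin (for example ldd output or a build
# log) and print the distinct libcudart sonames mentioned in it.
cudart_versions() {
    grep -oE 'libcudart\.so\.[0-9]+(\.[0-9]+)*' | sort -u
}

# Usage, with the NCCL path taken from the warning above:
#   ldd /usr/local/lib/libnccl.so | cudart_versions
```

More than one line of output suggests that libraries built against different CUDA versions are being linked together.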

After make test, the output looks like this:

LD .build_release/src/caffe/test/test_bias_layer.o

LD .build_release/src/caffe/test/test_threshold_layer.o

LD .build_release/src/caffe/test/test_spp_layer.o

LD .build_release/src/caffe/test/test_benchmark.o

LD .build_release/src/caffe/test/test_hdf5data_layer.o

LD .build_release/src/caffe/test/test_euclidean_loss_layer.o

LD .build_release/src/caffe/test/test_deconvolution_layer.o

LD .build_release/src/caffe/test/test_data_layer.o

LD .build_release/src/caffe/test/test_softmax_with_loss_layer.o

LD .build_release/src/caffe/test/test_memory_data_layer.o

LD .build_release/src/caffe/test/test_lrn_layer.o

LD .build_release/src/caffe/test/test_image_data_layer.o

LD .build_release/src/caffe/test/test_convolution_layer.o

LD .build_release/cuda/src/caffe/test/test_im2col_kernel.o

After make runtest:

problem: Warning! ***HDF5 library version mismatched error***

The HDF5 header files used to compile this application do not match

the version used by the HDF5 library to which this application is linked.

Data corruption or segmentation faults may occur if the application continues.

Headers are 1.10.1, library is 1.8.16

The HDF5 bundled with Anaconda does not match the HDF5 installed later on the system. Installing the matching version, hdf5 1.8.16, in Anaconda

resolves the problem.
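
To see which header set carries which version, the H5_VERS_MAJOR, H5_VERS_MINOR and H5_VERS_RELEASE macros in H5public.h can be read directly. The following is a sketch with a hypothetical helper name, hdf5_header_version. The header paths in the usage comment are assumptions based on the directories mentioned in these notes.

```shell
# Hypothetical helper: print the HDF5 version declared in an H5public.h header
# as MAJOR.MINOR.RELEASE, read from the H5_VERS_* macros.
hdf5_header_version() {
    awk '$1 == "#define" && $2 == "H5_VERS_MAJOR"   { major = $3 }
         $1 == "#define" && $2 == "H5_VERS_MINOR"   { minor = $3 }
         $1 == "#define" && $2 == "H5_VERS_RELEASE" { release = $3 }
         END { print major "." minor "." release }' "$1"
}

# Usage (paths are assumptions; adjust to the actual install locations):
#   hdf5_header_version /usr/include/hdf5/serial/H5public.h
#   hdf5_header_version /root/anaconda3/include/H5public.h
```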

if you want to use pycaffe:make 

[----------] Global test environment tear-down

[==========] 2175 tests from 285 test cases ran. (751535 ms total)

[  PASSED  ] 2175 tests.

This output means the installation succeeded.
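
When the test run is long, it may be easier to save the output and check the summary afterwards. This sketch assumes the log was captured to a file. The helper name runtest_passed is hypothetical, and it relies only on the "[  PASSED  ]" and "[  FAILED  ]" markers that the test runner prints.

```shell
# Hypothetical helper: succeed only if a saved `make runtest` log contains a
# "[  PASSED  ]" summary line and no "[  FAILED  ]" lines.
runtest_passed() {
    local log="$1"
    grep -qF '[  PASSED  ]' "$log" && ! grep -qF '[  FAILED  ]' "$log"
}

# Usage:
#   make runtest 2>&1 | tee runtest.log
#   runtest_passed runtest.log && echo "all tests passed"
```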

Set the environment variable:

vi ~/.bashrc 

export PYTHONPATH=/usr/caffe/python:$PYTHONPATH

source ~/.bashrc
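
Because ~/.bashrc is sourced every time a shell starts (and again by the source command above), a plain export can add the same directory repeatedly. As a sketch, with the hypothetical helper name prepend_pythonpath, the entry can be added only when it is missing.

```shell
# Hypothetical helper: prepend a directory to PYTHONPATH unless it is
# already one of its entries.
prepend_pythonpath() {
    local dir="$1"
    case ":${PYTHONPATH:-}:" in
        *":$dir:"*) ;;   # already present, nothing to do
        *) PYTHONPATH="$dir${PYTHONPATH:+:$PYTHONPATH}" ;;
    esac
    export PYTHONPATH
}

# Usage in ~/.bashrc, with the path from the export line above:
#   prepend_pythonpath /usr/caffe/python
```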

Configure pycaffe

First install the Python packages listed in requirements.txt

cd caffe

make pycaffe

ImportError: /home/chkusr/gbx/caffe-master/python/caffe/_caffe.so: undefined symbol: _ZN5boost6python6detail11init_moduleER11PyModuleDefPFvvE

solution: in Makefile.config, uncomment the line PYTHON_LIBRARIES := boost_python3 python3.6m

Then, as root in /usr/lib/x86_64-linux-gnu:

ln -s libboost_python-py35.so libboost_python3.so

problem:

cannot find  -lpython3.6m

solution:

cp /root/anaconda3/lib/libpython3.6m.so /usr/lib/libpython3.6m.so

 cp -r /root/anaconda3/lib/python3.6 /usr/lib
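
The linker prints cannot find -lpython3.6m when no libpython3.6m.so sits in any directory it searches. Before copying files, it may be worth checking where the library actually lives. The sketch below uses a hypothetical helper name, find_lib_dir, and the candidate directories in the usage comment are assumptions based on the paths in these notes. An alternative to copying, not taken from the original notes, would be to add the reported directory to LIBRARY_DIRS in Makefile.config.

```shell
# Hypothetical helper: print the first directory, among the given candidates,
# that contains the named library file.
find_lib_dir() {
    local name="$1" dir
    shift
    for dir in "$@"; do
        if [ -e "$dir/$name" ]; then
            echo "$dir"
            return 0
        fi
    done
    return 1
}

# Usage (candidate directories are assumptions):
#   find_lib_dir libpython3.6m.so /usr/lib /usr/lib/x86_64-linux-gnu /root/anaconda3/lib
```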

make clean 

make all

This rebuilds everything from scratch.

Run the Caffe examples:

1. MNIST

cd $CAFFE_ROOT

./data/mnist/get_mnist.sh

./examples/mnist/create_mnist.sh

# train using the LeNet network

./examples/mnist/train_lenet.sh

2. CIFAR-10

cd $CAFFE_ROOT 

./data/cifar10/get_cifar10.sh

./examples/cifar10/create_cifar10.sh

./examples/cifar10/train_quick.sh
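
Both examples follow the same pattern: fetch the data, convert it, then train. A small wrapper can run the steps in order and stop if one fails, so that training never starts on missing data. This is a sketch, and the function name run_caffe_example is hypothetical.

```shell
# Hypothetical wrapper: run the given example scripts in order from
# CAFFE_ROOT, stopping at the first one that fails.
run_caffe_example() {
    local script
    if [ -z "${CAFFE_ROOT:-}" ] || [ ! -d "$CAFFE_ROOT" ]; then
        echo "CAFFE_ROOT is not set to a directory" >&2
        return 1
    fi
    for script in "$@"; do
        echo ">>> $script"
        # Run in a subshell so the caller's working directory is unchanged.
        ( cd "$CAFFE_ROOT" && "./$script" ) || { echo "$script failed" >&2; return 1; }
    done
}

# Usage:
#   run_caffe_example data/mnist/get_mnist.sh examples/mnist/create_mnist.sh \
#       examples/mnist/train_lenet.sh
```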

For multi-GPU parallel training, append the option --gpu all or --gpu 0,1,2,3 after

./build/tools/caffe train --solver=...
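
When only the first few GPUs should be used, the comma-separated ID list can be generated rather than typed. This is a sketch with a hypothetical helper name, gpu_id_list. The solver path stays elided as in the command above and has to be filled in by the reader.

```shell
# Hypothetical helper: print "0,1,...,N-1" for N GPUs, suitable as the
# value of the --gpu option.
gpu_id_list() {
    local n="$1" i=0 ids=""
    while [ "$i" -lt "$n" ]; do
        ids="${ids:+$ids,}$i"
        i=$((i + 1))
    done
    echo "$ids"
}

# Usage, assuming 4 GPUs:
#   ./build/tools/caffe train --solver=... --gpu "$(gpu_id_list 4)"
```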

/lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.25' not found (required by /usr/lib/x86_64-linux-gnu/libpython3.6m.so.1.0)

libpython3.6-dev
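
The GLIBC_2.25 message above means that libpython3.6m.so.1.0 was built against a newer glibc than the one the system provides. A quick way to confirm this is to compare the installed glibc version with the one named in the error. The sketch below assumes GNU sort (for -V), and the helper name version_ge is hypothetical.

```shell
# Hypothetical helper: succeed if version $1 is greater than or equal to
# version $2 (relies on GNU sort -V for version ordering).
version_ge() {
    [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage: getconf GNU_LIBC_VERSION prints a line such as "glibc 2.23".
#   installed="$(getconf GNU_LIBC_VERSION | awk '{print $2}')"
#   version_ge "$installed" 2.25 || echo "glibc $installed is older than 2.25"
```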