Installation environment: cuDNN 7.5 + CUDA 8.0 + Ubuntu 16.04
General dependencies (installed with apt):
sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler
sudo apt-get install --no-install-recommends libboost-all-dev
sudo apt-get install libatlas-base-dev
sudo apt-get install python-dev
cd /usr/lib/x86_64-linux-gnu
ln -s libhdf5_serial_hl.so.10.0.2 libhdf5_hl.so
ln -s libhdf5_serial.so.10.1.0 libhdf5.so
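Running the two ln -s commands above a second time fails with "File exists". A minimal sketch of a repeatable variant follows. link_if_missing is a hypothetical helper name (not part of Caffe or HDF5), and the library file names are the ones used above, which may differ on other HDF5 builds:

```shell
# link_if_missing TARGET LINK
# Create the symlink LINK -> TARGET unless LINK already exists.
# Hypothetical helper for illustration. It assumes it is run from the
# directory that holds both files.
link_if_missing() {
    target=$1
    link=$2
    if [ -e "$link" ] || [ -L "$link" ]; then
        echo "skip: $link already exists"
    elif [ ! -e "$target" ]; then
        echo "error: $target not found" >&2
        return 1
    else
        ln -s "$target" "$link" && echo "linked: $link -> $target"
    fi
}

# Usage, from /usr/lib/x86_64-linux-gnu (typically as root):
#   link_if_missing libhdf5_serial_hl.so.10.0.2 libhdf5_hl.so
#   link_if_missing libhdf5_serial.so.10.1.0 libhdf5.so
```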
sudo apt-get install liblmdb-dev libgflags-dev libgoogle-glog-dev
sudo apt-get install --reinstall python-pkg-resources
cuda8.0
Compilation with Make:
cp Makefile.config.example Makefile.config
# Adjust Makefile.config (for example, if using Anaconda Python, or if cuDNN is desired)
Edit the config file to match the environment set up above.
To write layers in Python, change
#WITH_PYTHON_LAYER := 1
to
WITH_PYTHON_LAYER := 1
An important step: below the line
# Whatever else you find you need goes here.
change
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib
to:
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu
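The Makefile.config edits above can also be applied with sed instead of by hand. This is only a sketch: patch_makefile_config is a hypothetical helper name, and the patterns assume the lines appear exactly as shown above, with no trailing whitespace. A backup is kept in Makefile.config.bak.

```shell
# patch_makefile_config FILE
# Apply the three edits described above in place, keeping FILE.bak.
# Hypothetical helper for illustration. A line that does not match exactly
# is left untouched, so running it twice is harmless.
patch_makefile_config() {
    cfg=$1
    sed -i.bak \
        -e 's|^#[[:space:]]*WITH_PYTHON_LAYER := 1|WITH_PYTHON_LAYER := 1|' \
        -e 's|^INCLUDE_DIRS := \$(PYTHON_INCLUDE) /usr/local/include$|& /usr/include/hdf5/serial|' \
        -e 's|^LIBRARY_DIRS := \$(PYTHON_LIB) /usr/local/lib /usr/lib$|& /usr/lib/x86_64-linux-gnu|' \
        "$cfg"
}

# Usage, from the Caffe source directory:
#   patch_makefile_config Makefile.config
#   diff Makefile.config.bak Makefile.config
```

Reviewing the diff before building is worthwhile, since a silent non-match leaves the original line in place.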
Uncomment the line
USE_NCCL := 1
NCCL (optimized primitives for collective multi-GPU communication) must be installed separately. Download it from http://github.com/NVIDIA/nccl, extract it, and build it:
cd nccl
make test
export LD_LIBRARY_PATH=./build/lib:$LD_LIBRARY_PATH
./build/test/single/all_reduce_test 10000000
make install
problem: No module named pkg_resources
solution: sudo apt-get install --reinstall python-pkg-resources
problem: this error appears when installing Caffe on Ubuntu:
Check failed: status == CURAND_STATUS_SUCCESS (201 vs. 0) CURAND_STATUS_LAUNCH_FAILURE
solution: update to CUDA 8.0
Compile Caffe:
make all
make test
make runtest
After make all, the output looks like this:
CXX/LD -o .build_release/tools/caffe.bin
/usr/bin/ld: warning: libcudart.so.9.0, needed by /usr/local/lib/libnccl.so, may conflict with libcudart.so.7.5
CXX tools/compute_image_mean.cpp
CXX/LD -o .build_release/tools/compute_image_mean.bin
CXX tools/upgrade_solver_proto_text.cpp
CXX/LD -o .build_release/tools/upgrade_solver_proto_text.bin
CXX examples/mnist/convert_mnist_data.cpp
CXX/LD -o .build_release/examples/mnist/convert_mnist_data.bin
CXX examples/siamese/convert_mnist_siamese_data.cpp
CXX/LD -o .build_release/examples/siamese/convert_mnist_siamese_data.bin
CXX examples/cifar10/convert_cifar_data.cpp
CXX/LD -o .build_release/examples/cifar10/convert_cifar_data.bin
CXX examples/cpp_classification/classification.cpp
CXX/LD -o .build_release/examples/cpp_classification/classification.bin
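The ld warning in the output above is easy to miss among the CXX lines. It says that libnccl.so needs libcudart.so.9.0 while the build is linking libcudart.so.7.5. A small sketch for pulling such warnings out of a saved build log follows. find_lib_conflicts and build.log are hypothetical names, and the pattern only covers the warning wording shown above:

```shell
# find_lib_conflicts LOGFILE
# Print one summary line per "may conflict with" linker warning in LOGFILE.
# Hypothetical helper for illustration.
find_lib_conflicts() {
    sed -n 's/^.*warning: \([^,]*\), needed by \([^,]*\), may conflict with \(.*\)$/\2 needs \1, but the build links \3/p' "$1"
}

# Usage:
#   make all 2>&1 | tee build.log
#   find_lib_conflicts build.log
```

If a line is reported, one likely remedy (an assumption, not tested here) is to rebuild NCCL against the same CUDA toolkit that Caffe uses.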
After make test, the output looks like this:
LD .build_release/src/caffe/test/test_bias_layer.o
LD .build_release/src/caffe/test/test_threshold_layer.o
LD .build_release/src/caffe/test/test_spp_layer.o
LD .build_release/src/caffe/test/test_benchmark.o
LD .build_release/src/caffe/test/test_hdf5data_layer.o
LD .build_release/src/caffe/test/test_euclidean_loss_layer.o
LD .build_release/src/caffe/test/test_deconvolution_layer.o
LD .build_release/src/caffe/test/test_data_layer.o
LD .build_release/src/caffe/test/test_softmax_with_loss_layer.o
LD .build_release/src/caffe/test/test_memory_data_layer.o
LD .build_release/src/caffe/test/test_lrn_layer.o
LD .build_release/src/caffe/test/test_image_data_layer.o
LD .build_release/src/caffe/test/test_convolution_layer.o
LD .build_release/cuda/src/caffe/test/test_im2col_kernel.o
After make runtest:
problem: Warning! ***HDF5 library version mismatched error***
The HDF5 header files used to compile this application do not match
the version used by the HDF5 library to which this application is linked.
Data corruption or segmentation faults may occur if the application continues.
Headers are 1.10.1, library is 1.8.16
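solution: the HDF5 bundled with Anaconda does not match the HDF5 installed on the system afterwards. Reinstalling the matching version, HDF5 1.8.16, in Anaconda resolves the problem.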
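The two version numbers in the HDF5 warning identify which side has to change. A minimal sketch that extracts them from captured output follows. hdf5_versions and hdf5_mismatch are hypothetical helper names, and the pattern assumes the "Headers are X, library is Y" wording shown above:

```shell
# hdf5_versions
# Read text on stdin and print "HEADER_VERSION LIBRARY_VERSION" for each
# "Headers are X, library is Y" line. Hypothetical helper for illustration.
hdf5_versions() {
    sed -n 's/^.*Headers are \([0-9.]*\), library is \([0-9.]*\).*$/\1 \2/p'
}

# hdf5_mismatch
# Succeed (exit 0) when stdin contains such a line and the versions differ.
hdf5_mismatch() {
    versions=$(hdf5_versions | head -n 1)
    [ -n "$versions" ] && [ "${versions% *}" != "${versions#* }" ]
}

# Usage:
#   make runtest 2>&1 | tee runtest.log
#   hdf5_versions < runtest.log
# To see what Anaconda currently provides:
#   conda list hdf5
# One possible way to pin the version (assumes the package is available
# in the configured channels):
#   conda install hdf5=1.8.16
```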
If you want to use pycaffe, also run make pycaffe.
Once the tests pass, make runtest ends with:
[----------] Global test environment tear-down
[==========] 2175 tests from 285 test cases ran. (751535 ms total)
[ PASSED ] 2175 tests.
which means the installation succeeded.
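make runtest takes a long time (about 750 seconds in the run above), so the summary is easier to check from a saved log than by watching the terminal. A sketch, assuming the summary lines look like the ones above; runtest_ok and runtest.log are hypothetical names:

```shell
# runtest_ok
# Read `make runtest` output on stdin. Succeed only if a PASSED summary
# line is present and no FAILED line appears.
# Hypothetical helper for illustration.
runtest_ok() {
    log=$(cat)
    printf '%s\n' "$log" | grep -q '^\[ *PASSED *] [0-9][0-9]* tests*\.' || return 1
    if printf '%s\n' "$log" | grep -q '^\[ *FAILED *]'; then
        return 1
    fi
    return 0
}

# Usage:
#   make runtest 2>&1 | tee runtest.log
#   runtest_ok < runtest.log && echo "all tests passed"
```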
Set the environment variable:
vi ~/.bashrc
export PYTHONPATH=/usr/caffe/python:$PYTHONPATH
source ~/.bashrc
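With the export line above, every source ~/.bashrc prepends the same directory to PYTHONPATH again. A sketch of a variant that adds it only once follows. prepend_path is a hypothetical helper name, and /usr/caffe/python is the path used above:

```shell
# prepend_path DIR LIST
# Print the colon-separated LIST with DIR prepended, unless DIR is already
# one of its entries. Hypothetical helper for illustration.
prepend_path() {
    dir=$1
    list=$2
    case ":$list:" in
        *":$dir:"*) printf '%s\n' "$list" ;;
        *) printf '%s\n' "${dir}${list:+:$list}" ;;
    esac
}

# Usage in ~/.bashrc:
#   export PYTHONPATH=$(prepend_path /usr/caffe/python "$PYTHONPATH")
# Quick check that the module is found afterwards:
#   python -c "import caffe; print(caffe.__file__)"
```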
Configure pycaffe
First install the Python packages listed in requirements.txt, then:
cd caffe
make pycaffe
problem: ImportError: /home/chkusr/gbx/caffe-master/python/caffe/_caffe.so: undefined symbol: _ZN5boost6python6detail11init_moduleER11PyModuleDefPFvvE
solution: uncomment this line in Makefile.config:
PYTHON_LIBRARIES := boost_python3 python3.6m
then create the libboost_python3.so symlink (as root):
cd /usr/lib/x86_64-linux-gnu
ln -s libboost_python-py35.so libboost_python3.so
problem:
cannot find -lpython3.6m
solution:
cp /root/anaconda3/lib/libpython3.6m.so /usr/lib/libpython3.6m.so
cp -r /root/anaconda3/lib/python3.6 /usr/lib
Then rebuild from scratch:
make clean
make all
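For the "cannot find -lpython3.6m" error above, the linker is looking for libpython3.6m.so (or a static libpython3.6m.a) in the LIBRARY_DIRS directories. Before copying files around, it can help to see which directories actually hold the shared library. A sketch follows; find_lib is a hypothetical helper name, and the directories in the usage line are the ones mentioned in these notes:

```shell
# find_lib NAME DIR...
# Print every DIR/libNAME.so that exists. Fail if none is found.
# Hypothetical helper for illustration. It checks shared libraries only.
find_lib() {
    name=$1
    shift
    found=1
    for dir in "$@"; do
        if [ -e "$dir/lib$name.so" ]; then
            printf '%s\n' "$dir/lib$name.so"
            found=0
        fi
    done
    return "$found"
}

# Usage:
#   find_lib python3.6m /usr/lib /usr/lib/x86_64-linux-gnu /root/anaconda3/lib
```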
Run the Caffe examples:
1. MNIST
cd $CAFFE_ROOT
./data/mnist/get_mnist.sh
./examples/mnist/create_mnist.sh
# Train with the LeNet network
./examples/mnist/train_lenet.sh
2. CIFAR-10
cd $CAFFE_ROOT
./data/cifar10/get_cifar10.sh
./examples/cifar10/create_cifar10.sh
./examples/cifar10/train_quick.sh
To train on multiple GPUs in parallel, append the option --gpu all or --gpu 0,1,2,3 to the command:
./build/tools/caffe train --solver=...
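As a sketch of how the multi-GPU option fits onto the command line, the helper below only prints the command so it can be inspected before running. caffe_train_cmd is a hypothetical name. The solver path in the usage line is the one used by the MNIST LeNet example in the Caffe source tree.

```shell
# caffe_train_cmd SOLVER [GPUS]
# Print the training command for SOLVER. GPUS is "all" or a comma-separated
# list of device IDs; when omitted, no --gpu option is added.
# Hypothetical helper for illustration.
caffe_train_cmd() {
    solver=$1
    gpus=${2:-}
    cmd="./build/tools/caffe train --solver=$solver"
    if [ -n "$gpus" ]; then
        case $gpus in
            all) ;;
            *[!0-9,]*|,*|*,|*,,*)
                echo "error: bad GPU list: $gpus" >&2
                return 1 ;;
        esac
        cmd="$cmd --gpu $gpus"
    fi
    printf '%s\n' "$cmd"
}

# Usage, from $CAFFE_ROOT:
#   caffe_train_cmd examples/mnist/lenet_solver.prototxt 0,1,2,3
#   eval "$(caffe_train_cmd examples/mnist/lenet_solver.prototxt all)"
```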
problem: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.25' not found (required by /usr/lib/x86_64-linux-gnu/libpython3.6m.so.1.0)
libpython3.6-dev
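The GLIBC_2.25 message means that this libpython3.6m.so.1.0 was built against glibc 2.25 or newer, while the system libc is older (Ubuntu 16.04 ships glibc 2.23). A sketch for checking this before installing a library follows. version_ge and glibc_version are hypothetical helper names, and sort -V is assumed to be available (it is in GNU coreutils):

```shell
# version_ge A B
# Succeed when dotted version A is greater than or equal to B.
# Hypothetical helper for illustration. Relies on `sort -V`.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# glibc_version
# Print the system glibc version, e.g. 2.23 (glibc-based systems only).
glibc_version() {
    getconf GNU_LIBC_VERSION | awk '{print $2}'
}

# Usage:
#   if version_ge "$(glibc_version)" 2.25; then
#       echo "glibc is new enough"
#   else
#       echo "glibc is too old for this libpython3.6m.so.1.0"
#   fi
```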