Copyright notice: This is an original article by the blogger. Please cite the source URL when reposting. https://blog.csdn.net/10km/article/details/52728157
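Over the past few days I have been working through the code structure of the deep learning framework Caffe, and this post records some of what I learned.
Start from the prototxt
As I understand it, viewed as a whole, Caffe is a data-driven system rather than a program-driven one. A rough analogy is the Spring framework for Java applications. My knowledge of Spring is superficial, but I do know that Java applications built on Spring are controlled through XML configuration files, which specify each application class and its parameters in detail.
In this respect Caffe resembles Spring: defining a neural network and defining the training and testing parameters are both done through prototxt files written in the Protocol Buffers text format.
For a training run, for example, the net definition, the parameters of every layer in the net, the training/test data sources, and the training parameters (hyper-parameters) are all defined in .prototxt files: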
lenet_solver.prototxt
# net specifies the network definition file used for training/testing
net: "examples/mnist/lenet_train_test.prototxt"
############ Training parameters #############
test_iter: 100
test_interval: 500
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
lr_policy: "inv"
gamma: 0.0001
power: 0.75
######## Training process control parameters #########
# Print a log line every 100 iterations
display: 100
# Maximum number of iterations
max_iter: 10000
# Snapshot interval (save a snapshot every 5000 iterations)
snapshot: 5000
# Path prefix for saved snapshots
snapshot_prefix: "examples/mnist/lenet"
# Caffe run mode: CPU or GPU
solver_mode: GPU
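The solver file above describes every parameter a training run needs. The one to focus on is the first, net: following it leads us to the neural network definition file.
So when you start learning Caffe's overall code structure, there is no rush to read the cpp/h code. Read the prototxt files first: solver.prototxt and the net prototxt give you the outline and a global view of the code structure.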
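Another file worth mentioning is src/caffe/proto/caffe.proto. Almost all of Caffe's data types are defined there. Taking lenet_solver.prototxt again as an example: after Caffe reads this file, it is parsed into a data object called SolverParameter. Open caffe.proto and you will find it. The code is below (simplified, with unrelated or deprecated fields removed; see caffe.proto for the full definition).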
message SolverParameter {
// File name of the network definition in prototxt format
optional string net = 24;
// The number of iterations for each test net.
repeated int32 test_iter = 3;
// The number of iterations between two testing phases.
optional int32 test_interval = 4 [default = 0];
optional float base_lr = 5; // The base learning rate
// the number of iterations between displaying info. If display = 0, no info
// will be displayed.
optional int32 display = 6;
optional int32 max_iter = 7; // the maximum number of iterations
optional string lr_policy = 8;
optional float gamma = 9; // The parameter to compute the learning rate.
optional float power = 10; // The parameter to compute the learning rate.
optional float momentum = 11; // The momentum value.
optional float weight_decay = 12; // The weight decay.
// the stepsize for learning rate policy "step"
optional int32 stepsize = 13;
// the stepsize for learning rate policy "multistep"
repeated int32 stepvalue = 34;
// Set clip_gradients to >= 0 to clip parameter gradients to that L2 norm,
// whenever their actual L2 norm is larger.
optional float clip_gradients = 35 [default = -1];
optional int32 snapshot = 14 [default = 0]; // The snapshot interval
optional string snapshot_prefix = 15; // The prefix for the snapshot.
optional SnapshotFormat snapshot_format = 37 [default = BINARYPROTO];
// Solver mode enum
enum SolverMode {
CPU = 0;
GPU = 1;
}
// Solver mode: CPU or GPU
optional SolverMode solver_mode = 17 [default = GPU];
}
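By the same reasoning, we can infer that lenet_train_test.prototxt, the net file named in lenet_solver.prototxt, is parsed by Caffe into a data object called NetParameter. Here is the definition of NetParameter (simplified, with unrelated or deprecated fields removed; see caffe.proto for the full definition):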
message NetParameter {
// Network name, e.g. LeNet
optional string name = 1;
repeated int32 input_dim = 4;
optional bool force_backward = 5 [default = false];
optional NetState state = 6;
optional bool debug_info = 7 [default = false];
// Array holding all layer objects; every layer in the prototxt ends up here
repeated LayerParameter layer = 100; // ID 100 so layers are printed last.
}
Now look at lenet_train_test.prototxt: its name and layer entries correspond directly to the fields of NetParameter.
name: "LeNet"
# Each layer is one element of the layer array in NetParameter
layer {
name: "mnist"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
scale: 0.00390625
}
data_param {
source: "examples/mnist/mnist_train_lmdb"
batch_size: 64
backend: LMDB
}
}
# ... (omitted)
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "ip2"
bottom: "label"
top: "loss"
}
Going one step further, the layer field of NetParameter is an array of LayerParameter, so looking up LayerParameter in caffe.proto tells you all of its fields and what they mean:
message LayerParameter {
optional string name = 1; // the layer name
optional string type = 2; // the layer type
repeated string bottom = 3; // the name of each bottom blob
repeated string top = 4; // the name of each top blob
// The train / test phase for computation.
optional Phase phase = 10;
// The amount of weight to assign each top blob in the objective.
// Each layer assigns a default value, usually of either 0 or 1,
// to each top blob.
repeated float loss_weight = 5;
// ... (omitted)
optional MILParameter mil_param = 0x004d494c; //"MIL"
}
By the same reasoning, you can find the definition and description of every data object, and from there the corresponding cpp/h files.
layer factory
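The type field in LayerParameter is a string that names the layer's type. C++ has nothing like Java's Class.forName(), which can load a class directly from its name, so how does the name in type get connected to an actual C++ object? This is the question I had long wanted to answer.
Caffe maintains a map from a name to a function pointer that creates the layer object, so it can create the matching layer object from the name given in the type field. For the implementation, see caffe/layer_factory.hpp and caffe/layer_factory.cpp.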
caffe/layer_factory.hpp (extra annotations added by the author of this post)
#ifndef CAFFE_LAYER_FACTORY_H_
#define CAFFE_LAYER_FACTORY_H_
#include <map>
#include <string>
#include <vector>
#include "caffe/common.hpp"
#include "caffe/layer.hpp"
#include "caffe/proto/caffe.pb.h"
namespace caffe {
template <typename Dtype>
class Layer;
template <typename Dtype>
class LayerRegistry {
public:
// Function pointer type for creating a layer object
typedef shared_ptr<Layer<Dtype> > (*Creator)(const LayerParameter&);
// Map type: type name -> function pointer that creates the layer object
typedef std::map<string, Creator> CreatorRegistry;
static CreatorRegistry& Registry() {
// Static variable holding the map instance
static CreatorRegistry* g_registry_ = new CreatorRegistry();
return *g_registry_;
}
// Adds a creator: inserts one mapping into the map
static void AddCreator(const string& type, Creator creator) {
CreatorRegistry& registry = Registry();
CHECK_EQ(registry.count(type), 0)
<< "Layer type " << type << " already registered.";
registry[type] = creator;
}
// Get a layer using a LayerParameter.
static shared_ptr<Layer<Dtype> > CreateLayer(const LayerParameter& param) {
if (Caffe::root_solver()) {
LOG(INFO) << "Creating layer " << param.name();
}
const string& type = param.type();
CreatorRegistry& registry = Registry();
CHECK_EQ(registry.count(type), 1) << "Unknown layer type: " << type
<< " (known types: " << LayerTypeListString() << ")";
return registry[type](param);
}
static vector<string> LayerTypeList() {
CreatorRegistry& registry = Registry();
vector<string> layer_types;
for (typename CreatorRegistry::iterator iter = registry.begin();
iter != registry.end(); ++iter) {
layer_types.push_back(iter->first);
}
return layer_types;
}
private:
// Layer registry should never be instantiated - everything is done with its
// static variables.
LayerRegistry() {}
static string LayerTypeListString() {
vector<string> layer_types = LayerTypeList();
string layer_types_str;
for (vector<string>::iterator iter = layer_types.begin();
iter != layer_types.end(); ++iter) {
if (iter != layer_types.begin()) {
layer_types_str += ", ";
}
layer_types_str += *iter;
}
return layer_types_str;
}
};
template <typename Dtype>
class LayerRegisterer {
public:
LayerRegisterer(const string& type,
shared_ptr<Layer<Dtype> > (*creator)(const LayerParameter&)) {
// LOG(INFO) << "Registering layer type: " << type;
// Add the given layer-creating function pointer to the map
LayerRegistry<Dtype>::AddCreator(type, creator);
}
};
// Macros that add a layer-creating function pointer to the map
#define REGISTER_LAYER_CREATOR(type, creator) \
static LayerRegisterer<float> g_creator_f_##type(#type, creator<float>); \
static LayerRegisterer<double> g_creator_d_##type(#type, creator<double>)
#define REGISTER_LAYER_CLASS(type) \
template <typename Dtype> \
shared_ptr<Layer<Dtype> > Creator_##type##Layer(const LayerParameter& param) \
{ \
return shared_ptr<Layer<Dtype> >(new type##Layer<Dtype>(param)); \
} \
REGISTER_LAYER_CREATOR(type, Creator_##type##Layer)
} // namespace caffe
#endif // CAFFE_LAYER_FACTORY_H_
caffe/layer_factory.cpp (extra annotations added by the author of this post)
// Make sure we include Python.h before any system header
// to avoid _POSIX_C_SOURCE redefinition
#ifdef WITH_PYTHON_LAYER
#include <boost/python.hpp>
#endif
#include <string>
#include "caffe/layer.hpp"
#include "caffe/layer_factory.hpp"
#include "caffe/layers/conv_layer.hpp"
#include "caffe/layers/lrn_layer.hpp"
#include "caffe/layers/pooling_layer.hpp"
#include "caffe/layers/relu_layer.hpp"
#include "caffe/layers/sigmoid_layer.hpp"
#include "caffe/layers/softmax_layer.hpp"
#include "caffe/layers/tanh_layer.hpp"
#include "caffe/proto/caffe.pb.h"
#ifdef USE_CUDNN
#include "caffe/layers/cudnn_conv_layer.hpp"
#include "caffe/layers/cudnn_lcn_layer.hpp"
#include "caffe/layers/cudnn_lrn_layer.hpp"
#include "caffe/layers/cudnn_pooling_layer.hpp"
#include "caffe/layers/cudnn_relu_layer.hpp"
#include "caffe/layers/cudnn_sigmoid_layer.hpp"
#include "caffe/layers/cudnn_softmax_layer.hpp"
#include "caffe/layers/cudnn_tanh_layer.hpp"
#endif
#ifdef WITH_PYTHON_LAYER
#include "caffe/layers/python_layer.hpp"
#endif
namespace caffe {
// Get convolution layer according to engine.
template <typename Dtype>
shared_ptr<Layer<Dtype> > GetConvolutionLayer(
const LayerParameter& param) {
ConvolutionParameter conv_param = param.convolution_param();
ConvolutionParameter_Engine engine = conv_param.engine();
#ifdef USE_CUDNN
bool use_dilation = false;
for (int i = 0; i < conv_param.dilation_size(); ++i) {
if (conv_param.dilation(i) > 1) {
use_dilation = true;
}
}
#endif
if (engine == ConvolutionParameter_Engine_DEFAULT) {
engine = ConvolutionParameter_Engine_CAFFE;
#ifdef USE_CUDNN
if (!use_dilation) {
engine = ConvolutionParameter_Engine_CUDNN;
}
#endif
}
if (engine == ConvolutionParameter_Engine_CAFFE) {
return shared_ptr<Layer<Dtype> >(new ConvolutionLayer<Dtype>(param));
#ifdef USE_CUDNN
} else if (engine == ConvolutionParameter_Engine_CUDNN) {
if (use_dilation) {
LOG(FATAL) << "CuDNN doesn't support the dilated convolution at Layer "
<< param.name();
}
return shared_ptr<Layer<Dtype> >(new CuDNNConvolutionLayer<Dtype>(param));
#endif
} else {
LOG(FATAL) << "Layer " << param.name() << " has unknown engine.";
}
}
// Register "Convolution" using the REGISTER_LAYER_CREATOR macro from layer_factory.hpp
REGISTER_LAYER_CREATOR(Convolution, GetConvolutionLayer);
// Get pooling layer according to engine.
template <typename Dtype>
shared_ptr<Layer<Dtype> > GetPoolingLayer(const LayerParameter& param) {
PoolingParameter_Engine engine = param.pooling_param().engine();
if (engine == PoolingParameter_Engine_DEFAULT) {
engine = PoolingParameter_Engine_CAFFE;
#ifdef USE_CUDNN
engine = PoolingParameter_Engine_CUDNN;
#endif
}
if (engine == PoolingParameter_Engine_CAFFE) {
return shared_ptr<Layer<Dtype> >(new PoolingLayer<Dtype>(param));
#ifdef USE_CUDNN
} else if (engine == PoolingParameter_Engine_CUDNN) {
if (param.top_size() > 1) {
LOG(INFO) << "cuDNN does not support multiple tops. "
<< "Using Caffe's own pooling layer.";
return shared_ptr<Layer<Dtype> >(new PoolingLayer<Dtype>(param));
}
// CuDNN assumes layers are not being modified in place, thus
// breaking our index tracking for updates in some cases in Caffe.
// Until there is a workaround in Caffe (index management) or
// cuDNN, use Caffe layer to max pooling, or don't use in place
// layers after max pooling layers
if (param.pooling_param().pool() == PoolingParameter_PoolMethod_MAX) {
return shared_ptr<Layer<Dtype> >(new PoolingLayer<Dtype>(param));
} else {
return shared_ptr<Layer<Dtype> >(new CuDNNPoolingLayer<Dtype>(param));
}
#endif
} else {
LOG(FATAL) << "Layer " << param.name() << " has unknown engine.";
}
}
// Register "Pooling" using the REGISTER_LAYER_CREATOR macro from layer_factory.hpp
REGISTER_LAYER_CREATOR(Pooling, GetPoolingLayer);
// Get LRN layer according to engine
template <typename Dtype>
shared_ptr<Layer<Dtype> > GetLRNLayer(const LayerParameter& param) {
LRNParameter_Engine engine = param.lrn_param().engine();
if (engine == LRNParameter_Engine_DEFAULT) {
#ifdef USE_CUDNN
engine = LRNParameter_Engine_CUDNN;
#else
engine = LRNParameter_Engine_CAFFE;
#endif
}
if (engine == LRNParameter_Engine_CAFFE) {
return shared_ptr<Layer<Dtype> >(new LRNLayer<Dtype>(param));
#ifdef USE_CUDNN
} else if (engine == LRNParameter_Engine_CUDNN) {
LRNParameter lrn_param = param.lrn_param();
if (lrn_param.norm_region() ==LRNParameter_NormRegion_WITHIN_CHANNEL) {
return shared_ptr<Layer<Dtype> >(new CuDNNLCNLayer<Dtype>(param));
} else {
// local size is too big to be handled through cuDNN
if (param.lrn_param().local_size() > CUDNN_LRN_MAX_N) {
return shared_ptr<Layer<Dtype> >(new LRNLayer<Dtype>(param));
} else {
return shared_ptr<Layer<Dtype> >(new CuDNNLRNLayer<Dtype>(param));
}
}
#endif
} else {
LOG(FATAL) << "Layer " << param.name() << " has unknown engine.";
}
}
// Register "LRN" using the REGISTER_LAYER_CREATOR macro from layer_factory.hpp
REGISTER_LAYER_CREATOR(LRN, GetLRNLayer);
// Get relu layer according to engine.
template <typename Dtype>
shared_ptr<Layer<Dtype> > GetReLULayer(const LayerParameter& param) {
ReLUParameter_Engine engine = param.relu_param().engine();
if (engine == ReLUParameter_Engine_DEFAULT) {
engine = ReLUParameter_Engine_CAFFE;
#ifdef USE_CUDNN
engine = ReLUParameter_Engine_CUDNN;
#endif
}
if (engine == ReLUParameter_Engine_CAFFE) {
return shared_ptr<Layer<Dtype> >(new ReLULayer<Dtype>(param));
#ifdef USE_CUDNN
} else if (engine == ReLUParameter_Engine_CUDNN) {
return shared_ptr<Layer<Dtype> >(new CuDNNReLULayer<Dtype>(param));
#endif
} else {
LOG(FATAL) << "Layer " << param.name() << " has unknown engine.";
}
}
// Register "ReLU" using the REGISTER_LAYER_CREATOR macro from layer_factory.hpp
REGISTER_LAYER_CREATOR(ReLU, GetReLULayer);
// Get sigmoid layer according to engine.
template <typename Dtype>
shared_ptr<Layer<Dtype> > GetSigmoidLayer(const LayerParameter& param) {
SigmoidParameter_Engine engine = param.sigmoid_param().engine();
if (engine == SigmoidParameter_Engine_DEFAULT) {
engine = SigmoidParameter_Engine_CAFFE;
#ifdef USE_CUDNN
engine = SigmoidParameter_Engine_CUDNN;
#endif
}
if (engine == SigmoidParameter_Engine_CAFFE) {
return shared_ptr<Layer<Dtype> >(new SigmoidLayer<Dtype>(param));
#ifdef USE_CUDNN
} else if (engine == SigmoidParameter_Engine_CUDNN) {
return shared_ptr<Layer<Dtype> >(new CuDNNSigmoidLayer<Dtype>(param));
#endif
} else {
LOG(FATAL) << "Layer " << param.name() << " has unknown engine.";
}
}
// Register "Sigmoid" using the REGISTER_LAYER_CREATOR macro from layer_factory.hpp
REGISTER_LAYER_CREATOR(Sigmoid, GetSigmoidLayer);
// Get softmax layer according to engine.
template <typename Dtype>
shared_ptr<Layer<Dtype> > GetSoftmaxLayer(const LayerParameter& param) {
SoftmaxParameter_Engine engine = param.softmax_param().engine();
if (engine == SoftmaxParameter_Engine_DEFAULT) {
engine = SoftmaxParameter_Engine_CAFFE;
#ifdef USE_CUDNN
engine = SoftmaxParameter_Engine_CUDNN;
#endif
}
if (engine == SoftmaxParameter_Engine_CAFFE) {
return shared_ptr<Layer<Dtype> >(new SoftmaxLayer<Dtype>(param));
#ifdef USE_CUDNN
} else if (engine == SoftmaxParameter_Engine_CUDNN) {
return shared_ptr<Layer<Dtype> >(new CuDNNSoftmaxLayer<Dtype>(param));
#endif
} else {
LOG(FATAL) << "Layer " << param.name() << " has unknown engine.";
}
}
// Register "Softmax" using the REGISTER_LAYER_CREATOR macro from layer_factory.hpp
REGISTER_LAYER_CREATOR(Softmax, GetSoftmaxLayer);
// Get tanh layer according to engine.
template <typename Dtype>
shared_ptr<Layer<Dtype> > GetTanHLayer(const LayerParameter& param) {
TanHParameter_Engine engine = param.tanh_param().engine();
if (engine == TanHParameter_Engine_DEFAULT) {
engine = TanHParameter_Engine_CAFFE;
#ifdef USE_CUDNN
engine = TanHParameter_Engine_CUDNN;
#endif
}
if (engine == TanHParameter_Engine_CAFFE) {
return shared_ptr<Layer<Dtype> >(new TanHLayer<Dtype>(param));
#ifdef USE_CUDNN
} else if (engine == TanHParameter_Engine_CUDNN) {
return shared_ptr<Layer<Dtype> >(new CuDNNTanHLayer<Dtype>(param));
#endif
} else {
LOG(FATAL) << "Layer " << param.name() << " has unknown engine.";
}
}
// Register "TanH" using the REGISTER_LAYER_CREATOR macro from layer_factory.hpp
REGISTER_LAYER_CREATOR(TanH, GetTanHLayer);
#ifdef WITH_PYTHON_LAYER
template <typename Dtype>
shared_ptr<Layer<Dtype> > GetPythonLayer(const LayerParameter& param) {
Py_Initialize();
try {
bp::object module = bp::import(param.python_param().module().c_str());
bp::object layer = module.attr(param.python_param().layer().c_str())(param);
return bp::extract<shared_ptr<PythonLayer<Dtype> > >(layer)();
} catch (bp::error_already_set) {
PyErr_Print();
throw;
}
}
// Register "Python" using the REGISTER_LAYER_CREATOR macro from layer_factory.hpp
REGISTER_LAYER_CREATOR(Python, GetPythonLayer);
#endif
// Layers that use their constructor as their default creator should be
// registered in their corresponding cpp files. Do not register them here.
} // namespace caffe
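The main job of this file is to add these layer types, the ones whose creator has to choose between engines, to the map. As its closing comment notes, layers that use their constructor as the default creator are registered in their own cpp files instead. Once you understand this mechanism, you can create your own layer as needed, add it to the map in the same way, and then use the custom layer in your own network definitions.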