Understanding Caffe's Network Model

1. A first look at the original LeNet model
2. The network structure of Caffe LeNet
3. Caffe LeNet layer by layer

3.1 Data Layer
3.2 Conv1 Layer
3.3 Pool1 Layer
3.4 Conv2 Layer
3.5 Pool2 Layer
3.6 Ip1 Layer
3.7 Relu1 Layer
3.8 Ip2 Layer
3.9 Loss Layer

4. The complete Caffe LeNet definition

1. A first look at the original LeNet model

Fig.1. Architecture of original LeNet-5.

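Image source: LeCun et al., "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, 1998, pp. 2278-2324.

That paper describes the structure of LeNet-5 at length.

The original LeNet-5 model is not discussed further here; see the paper cited above for details.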


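2. The network structure of Caffe LeNet

There is no need to reinvent the wheel. I had planned to draw the Caffe LeNet diagrams myself, but someone had already done so, and done it well, so I borrow those diagrams here to aid understanding.

Image source for the figures in Section 3: http://www.2cto.com/kf/201606/518254.html

Start with the overall topology of Caffe LeNet. Caffe defines a network through bottom and top connections: each layer reads its bottom blobs and writes its top blobs. The figure is therefore drawn bottom-to-top as well, which makes it easier to follow.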

Fig.2. Architecture of caffe LeNet.
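The bottom/top wiring can be checked mechanically. The following is a minimal sketch, not part of Caffe: the `LAYERS` list and the `check_wiring` helper are hypothetical names introduced here, and the layer and blob names are copied from the definitions in Section 3. It verifies that every bottom blob a layer reads was produced as a top blob by an earlier layer.

```python
# Each entry is (layer name, bottom blobs, top blobs), copied from Section 3.
LAYERS = [
    ("mnist", [], ["data", "label"]),
    ("conv1", ["data"], ["conv1"]),
    ("pool1", ["conv1"], ["pool1"]),
    ("conv2", ["pool1"], ["conv2"]),
    ("pool2", ["conv2"], ["pool2"]),
    ("ip1", ["pool2"], ["ip1"]),
    ("relu1", ["ip1"], ["ip1"]),  # in-place: bottom and top are the same blob
    ("ip2", ["ip1"], ["ip2"]),
    ("loss", ["ip2", "label"], ["loss"]),
]


def check_wiring(layers):
    """Return the set of blobs produced; raise if a layer reads a missing blob."""
    available = set()
    for name, bottoms, tops in layers:
        for blob in bottoms:
            if blob not in available:
                raise ValueError(f"layer {name!r} reads unknown blob {blob!r}")
        available.update(tops)
    return available


blobs = check_wiring(LAYERS)
print(sorted(blobs))
```

Reading the list top to bottom corresponds to reading Fig.2 from the bottom up: the data layer has no bottom, and the loss layer is the only one that consumes the label blob.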

3. Caffe LeNet layer by layer
This section walks through the structure of Caffe LeNet layer by layer, pairing each definition with a diagram.

3.1 Data Layer
#============== TRAIN data layer ============================================
layer {
  name: "mnist" # name of this layer
  type: "Data"  # this is a data layer
  top: "data"   # this layer produces a data blob
  top: "label"  # this layer produces a label blob
  include {
    phase: TRAIN # use this layer only in the TRAIN phase
  }
  transform_param {
    scale: 0.00390625 # normalization factor 1/256; maps pixel values into [0,1)
  }
  data_param {
    source: "E:/MyCode/DL/caffe-master/examples/mnist/mnist_train_lmdb" # path to the training data
    batch_size: 64 # number of samples per batch
    backend: LMDB
  }
}
#============== TEST data layer ============================================
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST # use this layer only in the TEST phase
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "E:/MyCode/DL/caffe-master/examples/mnist/mnist_test_lmdb" # path to the test data
    batch_size: 100
    backend: LMDB
  }
}

Fig.3. Architecture of data layer.

Fig.3 shows the TRAIN case: the data layer reads from the LMDB database 64 samples at a time, i.e. N=64.

Caffe represents blobs as 4D arrays of shape N*C*H*W (Num*Channels*Height*Width).
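To make the data layer's output concrete, here is a minimal sketch in plain Python. The helper names `blob_shape` and `scale_pixel` are hypothetical and introduced only for illustration; the 1*28*28 image size is taken from the conv1 shapes quoted later in this post, and the scale value from `transform_param` above.

```python
SCALE = 0.00390625  # transform_param.scale from the data layer, i.e. 1/256


def blob_shape(batch_size, channels=1, height=28, width=28):
    """Shape of the data blob in Caffe's N*C*H*W order."""
    return (batch_size, channels, height, width)


def scale_pixel(value):
    """Apply the data layer's scale factor to one raw pixel value (0..255)."""
    return value * SCALE


print(blob_shape(64))    # TRAIN phase: (64, 1, 28, 28)
print(blob_shape(100))   # TEST phase:  (100, 1, 28, 28)
print(scale_pixel(0), scale_pixel(255))  # 0.0 0.99609375
```

The largest raw value, 255, maps to 255/256, which is why the interval is half-open: [0,1).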

3.2 Conv1 Layer
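#============== convolution layer 1 =============================
layer {
  name: "conv1"       # name of this layer: conv1, i.e. convolution layer 1
  type: "Convolution" # this is a convolution layer
  bottom: "data"      # input: the data blob produced by the data layer
  top: "conv1"        # output: the conv1 blob
  param {
    lr_mult: 1        # learning-rate (lr) multiplier for the weights: 1x the base_lr: 0.01 set in lenet_solver.prototxt
  }
  param {
    lr_mult: 2        # learning-rate multiplier for the biases: 2x the base_lr: 0.01 set in lenet_solver.prototxt
  }
  convolution_param {
    num_output: 20    # produce 20 output channels
    kernel_size: 5    # the convolution kernel is 5*5
    stride: 1         # the kernel moves with a stride of 1
    weight_filler {
      type: "xavier"  # Xavier initialization: sets the scale of the initial weights automatically from the number of input and output neurons
    }
    bias_filler {
      type: "constant"  # initialize the biases to a constant, 0 by default
    }
  }
}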
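The two `lr_mult` values simply scale the solver's base learning rate. A minimal sketch, assuming the `base_lr: 0.01` quoted in the comments above and ignoring any learning-rate schedule the solver may apply over time; `effective_lr` is a hypothetical helper, not a Caffe function.

```python
BASE_LR = 0.01  # base_lr quoted from lenet_solver.prototxt in the comments above


def effective_lr(base_lr, lr_mult):
    """Per-parameter learning rate before any solver schedule is applied."""
    return base_lr * lr_mult


weight_lr = effective_lr(BASE_LR, 1)  # first param block: weights
bias_lr = effective_lr(BASE_LR, 2)    # second param block: biases
print(weight_lr, bias_lr)  # 0.01 0.02
```

The order of the `param` blocks matters: the first applies to the weights and the second to the biases.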

Fig.4. Architecture of conv1 layer.

Shape change through conv1: batch_size*1*28*28 -> batch_size*20*24*24
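The 28 -> 24 change follows from the usual convolution output-size formula, (input + 2*pad - kernel) / stride + 1. A minimal sketch; `conv_output_size` is a hypothetical helper, and `pad=0` is an assumption, since the definition above sets no padding.

```python
def conv_output_size(input_size, kernel_size, stride=1, pad=0):
    """Spatial output size of a convolution along one dimension."""
    return (input_size + 2 * pad - kernel_size) // stride + 1


print(conv_output_size(28, 5))  # conv1: 24
print(conv_output_size(12, 5))  # conv2 (12*12 input from pool1): 8
```

The channel count is not computed by this formula; it is set directly by `num_output` (20 for conv1).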

3.3 Pool1 Layer
#============== pooling layer 1 =============================
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"     # input: the conv1 blob produced by the conv1 layer
  top: "pool1"        # output: the pool1 blob
  pooling_param {
    pool: MAX         # use max pooling
    kernel_size: 2    # the pooling window is 2*2
    stride: 2         # the window moves with a stride of 2, i.e. windows do not overlap
  }
}

Fig.5. Architecture of pool1 layer.

Shape change through pool1: batch_size*20*24*24 -> batch_size*20*12*12
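A minimal sketch of what a 2*2, stride-2 max pooling does, in plain Python. The helper names are hypothetical. The size formula rounds up, which is how Caffe's pooling layer computes its output size when no padding is used; for the sizes in this network rounding up and rounding down give the same result.

```python
import math


def pool_output_size(input_size, kernel_size, stride):
    """Spatial output size of pooling along one dimension (no padding)."""
    return math.ceil((input_size - kernel_size) / stride) + 1


def max_pool_2x2(grid):
    """Non-overlapping 2*2 max pooling over a 2D list with even dimensions."""
    return [
        [max(grid[r][c], grid[r][c + 1], grid[r + 1][c], grid[r + 1][c + 1])
         for c in range(0, len(grid[0]), 2)]
        for r in range(0, len(grid), 2)
    ]


print(pool_output_size(24, 2, 2))  # pool1: 12
print(pool_output_size(8, 2, 2))   # pool2: 4

sample = [[1, 2, 3, 4],
          [5, 6, 7, 8],
          [9, 10, 11, 12],
          [13, 14, 15, 16]]
print(max_pool_2x2(sample))  # [[6, 8], [14, 16]]
```

Pooling acts on each channel independently, so the channel count (20 here) is unchanged.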

3.4 Conv2 Layer
#============== convolution layer 2 =============================
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
The diagram for conv2 is similar to Fig.4. Shape change through conv2: batch_size*20*12*12 -> batch_size*50*8*8.

3.5 Pool2 Layer
#============== pooling layer 2 =============================
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
The diagram for pool2 is similar to Fig.5. Shape change through pool2: batch_size*50*8*8 -> batch_size*50*4*4.

3.6 Ip1 Layer
#============== fully connected layer 1 =============================
layer {
  name: "ip1"
  type: "InnerProduct" # this is a fully connected (inner product) layer
  bottom: "pool2"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 500 # 500 outputs
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}

Fig.6. Architecture of ip1 layer.

Shape change through ip1: batch_size*50*4*4 -> batch_size*500*1*1.

The fully connected layer here flattens C*H*W into a 1D feature vector, i.e. 800 -> 500.
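The 800 comes from flattening 50*4*4, and each of the 500 outputs is a weighted sum of those 800 values plus a bias. A minimal sketch in plain Python with hypothetical helper names, using a tiny 3-input, 2-output layer for the worked example so the numbers are easy to check by hand.

```python
def flatten_size(channels, height, width):
    """Length of the 1D feature vector obtained from a C*H*W blob."""
    return channels * height * width


def inner_product(x, weights, bias):
    """y[j] = sum_i weights[j][i] * x[i] + bias[j], one row of weights per output."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


print(flatten_size(50, 4, 4))  # 800 inputs to ip1

# Tiny worked example: 3 inputs, 2 outputs.
y = inner_product([1, 2, 3], [[1, 0, 0], [0, 1, 1]], [0.5, -1])
print(y)  # [1.5, 4]
```

For ip1 the weight matrix would have 500 rows of 800 values each.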

3.7 Relu1 Layer
#============== ReLU1 layer =============================
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}

Fig.7. Architecture of relu1 layer.

Shape change through relu1: batch_size*500*1*1 -> batch_size*500*1*1
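ReLU leaves the shape unchanged because it acts on each element separately, replacing negative values with zero. A minimal sketch with a hypothetical `relu` helper:

```python
def relu(values):
    """Element-wise max(0, x); the output has the same length as the input."""
    return [max(0.0, v) for v in values]


activations = relu([-1.5, 0.0, 2.0])
print(activations)  # [0.0, 0.0, 2.0]
```

In the definition above, bottom and top are both "ip1", so the layer overwrites its input blob in place instead of allocating a new one.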

3.8 Ip2 Layer
#============== fully connected layer 2 ============================
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10          # 10 outputs, one for each digit 0-9
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
Shape change through ip2: batch_size*500*1*1 -> batch_size*10*1*1

3.9 Loss Layer
#============== loss layer ============================
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}

Fig.8. Architecture of loss layer.

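Shape change through the loss layer: the softmax turns the batch_size*10*1*1 scores into class probabilities of the same shape, and these are combined with the batch_size labels into a single scalar loss.

Note: the full Caffe LeNet definition also contains an Accuracy layer, which reports the classification accuracy in the TEST phase.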
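What a softmax-with-loss computation looks like can be sketched in a few lines of plain Python. This is an illustration of the technique (softmax followed by multinomial logistic loss, averaged over the batch), not Caffe's implementation; the function names are hypothetical, and it assumes every sample has a valid label.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_loss(batch_scores, labels):
    """Mean negative log-probability of the correct class over the batch."""
    losses = [-math.log(softmax(scores)[label])
              for scores, label in zip(batch_scores, labels)]
    return sum(losses) / len(losses)


# With 10 equal scores every class has probability 1/10, so the loss is ln(10).
uniform_loss = softmax_loss([[0.0] * 10, [0.0] * 10], [3, 7])
print(uniform_loss)
```

A loss close to ln(10), about 2.3, is therefore what an untrained 10-class network would be expected to report at the start of training.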
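As a rough illustration of what an accuracy layer reports, the sketch below computes top-1 accuracy from ip2-style scores and labels. The `top1_accuracy` helper is hypothetical and only shows the idea: the predicted class is the index of the largest score.

```python
def top1_accuracy(batch_scores, labels):
    """Fraction of samples whose highest-scoring class equals the label."""
    correct = 0
    for scores, label in zip(batch_scores, labels):
        predicted = max(range(len(scores)), key=lambda i: scores[i])
        if predicted == label:
            correct += 1
    return correct / len(labels)


accuracy = top1_accuracy([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]], [1, 0, 0])
print(accuracy)  # 2 of 3 predictions are correct
```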

4. The complete Caffe LeNet definition
name: "LeNet" # name of the network
#============== TRAIN data layer ============================================
layer {
  name: "mnist" # name of this layer
  type: "Data"  # this is a data layer
  top: "data"   # this layer produces a data blob
  top: "label"  # this layer produces a label blob
  include {
    phase: TRAIN # use this layer only in the TRAIN phase
  }
  transform_param {
    scale: 0.00390625 # normalization factor 1/256; maps pixel values into [0,1)
  }
  data_param {
    source: "E:/MyCode/DL/caffe-master/examples/mnist/mnist_train_lmdb" # path to the training data
    batch_size: 64 # number of samples per batch
    backend: LMDB
  }
}
#============== TEST data layer ============================================
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST # use this layer only in the TEST phase
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "E:/MyCode/DL/caffe-master/examples/mnist/mnist_test_lmdb" # path to the test data
    batch_size: 100
    backend: LMDB
  }
}
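#============== convolution layer 1 =============================
layer {
  name: "conv1"       # name of this layer: conv1, i.e. convolution layer 1
  type: "Convolution" # this is a convolution layer
  bottom: "data"      # input: the data blob produced by the data layer
  top: "conv1"        # output: the conv1 blob
  param {
    lr_mult: 1        # learning-rate (lr) multiplier for the weights: 1x the base_lr: 0.01 set in lenet_solver.prototxt
  }
  param {
    lr_mult: 2        # learning-rate multiplier for the biases: 2x the base_lr: 0.01 set in lenet_solver.prototxt
  }
  convolution_param {
    num_output: 20    # produce 20 output channels
    kernel_size: 5    # the convolution kernel is 5*5
    stride: 1         # the kernel moves with a stride of 1
    weight_filler {
      type: "xavier"  # Xavier initialization: sets the scale of the initial weights automatically from the number of input and output neurons
    }
    bias_filler {
      type: "constant"  # initialize the biases to a constant, 0 by default
    }
  }
} # shape change through conv1: batch_size*1*28*28 -> batch_size*20*24*24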
#============== pooling layer 1 =============================
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"     # input: the conv1 blob produced by the conv1 layer
  top: "pool1"        # output: the pool1 blob
  pooling_param {
    pool: MAX         # use max pooling
    kernel_size: 2    # the pooling window is 2*2
    stride: 2         # the window moves with a stride of 2, i.e. windows do not overlap
  }
} # shape change through pool1: batch_size*20*24*24 -> batch_size*20*12*12
#============== convolution layer 2 =============================
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
} # shape change through conv2: batch_size*20*12*12 -> batch_size*50*8*8
#============== pooling layer 2 =============================
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
} # shape change through pool2: batch_size*50*8*8 -> batch_size*50*4*4
#============== fully connected layer 1 =============================
layer {
  name: "ip1"
  type: "InnerProduct" # this is a fully connected (inner product) layer
  bottom: "pool2"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 500 # 500 outputs
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
} # shape change through ip1: batch_size*50*4*4 -> batch_size*500*1*1
#============== ReLU1 layer =============================
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
} # shape change through relu1: batch_size*500*1*1 -> batch_size*500*1*1
#============== fully connected layer 2 ============================
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10          # 10 outputs, one for each digit 0-9
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
} # shape change through ip2: batch_size*500*1*1 -> batch_size*10*1*1
#============== accuracy layer (reports test accuracy) ============================
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
#============== loss layer ============================
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
} # loss layer: batch_size*10*1*1 scores plus batch_size labels -> a single scalar loss
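One way to sanity-check the complete definition is to count its learnable parameters from the values of `num_output` and `kernel_size` above. This is a minimal sketch with hypothetical helper names; it assumes one bias per output, which matches the two `param` blocks (weights and biases) in each layer.

```python
def conv_params(in_channels, out_channels, kernel_size):
    """Weights (out * in * k * k) plus one bias per output channel."""
    return out_channels * in_channels * kernel_size * kernel_size + out_channels


def ip_params(num_inputs, num_outputs):
    """Weights (out * in) plus one bias per output."""
    return num_outputs * num_inputs + num_outputs


counts = {
    "conv1": conv_params(1, 20, 5),      # 520
    "conv2": conv_params(20, 50, 5),     # 25050
    "ip1": ip_params(50 * 4 * 4, 500),   # 400500
    "ip2": ip_params(500, 10),           # 5010
}
total = sum(counts.values())
print(counts)
print(total)  # 431080
```

Pooling, ReLU, accuracy, and loss layers have no learnable parameters, and ip1 alone accounts for most of the total.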
