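DL: LeNet-5: an illustrated guide to the LeNet-5 algorithm, covering an introduction (paper overview), architecture details, and example applications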

Introduction to the LeNet-5 algorithm (paper overview)

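      The LeNet-5 model was proposed by Professor Yann LeCun and his co-authors in the 1998 paper "Gradient-Based Learning Applied to Document Recognition". It was the first convolutional neural network to be applied successfully to handwritten digit recognition and to deliver real commercial value (in the postal industry).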

Abstract

     Multilayer neural networks trained with the back-propagation algorithm constitute the best example of a successful gradient-based learning technique. Given an appropriate network architecture, gradient-based learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters, with minimal preprocessing. This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task. Convolutional neural networks, which are specifically designed to deal with the variability of 2D shapes, are shown to outperform all other techniques.

     Real-life document recognition systems are composed of multiple modules including field extraction, segmentation, recognition, and language modeling. A new learning paradigm, called graph transformer networks (GTN), allows such multimodule systems to be trained globally using gradient-based methods so as to minimize an overall performance measure.

     Two systems for online handwriting recognition are described. Experiments demonstrate the advantage of global training, and the flexibility of graph transformer networks.

     A graph transformer network for reading a bank cheque is also described. It uses convolutional neural network character recognizers combined with global training techniques to provide record accuracy on business and personal cheques. It is deployed commercially and reads several million cheques per day.

Paper

https://ieeexplore.ieee.org/document/726791
http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf
1998: "Gradient-Based Learning Applied to Document Recognition"

http://yann.lecun.com/exdb/lenet/

The LeNet-5 architecture in detail

DL: LeNet-5: the LeNet-5 architecture in detail
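
As a rough companion to the architecture, the sketch below traces the feature-map sizes of LeNet-5 as described in the 1998 paper: a 32x32 input, 5x5 convolutions without padding, and 2x2 subsampling. The helper names (`conv_out`, `pool_out`, `conv_params`) are made up for this illustration, and the parameter count assumes every output map is connected to every input map. Layer C3 is left out of the parameter counts because the paper connects each C3 map to only a subset of the S2 maps.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution (square input and kernel)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, window=2):
    """Spatial size after non-overlapping subsampling."""
    return size // window


def conv_params(in_maps, out_maps, kernel):
    """Weights plus one bias per output map, assuming full connectivity."""
    return out_maps * (in_maps * kernel * kernel + 1)


size = 32                 # LeNet-5 takes a 32x32 input image
c1 = conv_out(size, 5)    # C1: 6 feature maps, 5x5 kernels
s2 = pool_out(c1)         # S2: 2x2 subsampling
c3 = conv_out(s2, 5)      # C3: 16 feature maps, 5x5 kernels
s4 = pool_out(c3)         # S4: 2x2 subsampling
c5 = conv_out(s4, 5)      # C5: 120 feature maps, 5x5 kernels

lenet5_sizes = [size, c1, s2, c3, s4, c5]
print(lenet5_sizes)       # [32, 28, 14, 10, 5, 1]

c1_params = conv_params(1, 6, 5)       # 6 * (25 + 1)
c5_params = conv_params(16, 120, 5)    # 120 * (16 * 25 + 1)
f6_params = 84 * (120 + 1)             # F6: fully connected, 120 -> 84
print(c1_params, c5_params, f6_params)  # 156 48120 10164
```

Because C5 applies 5x5 kernels to 5x5 maps, its output is 1x1 per map, so it behaves like a fully connected layer for a 32x32 input.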

Example applications of LeNet-5

PyTorch: LeNet-5: implementing the classic LeNet-5 convolutional neural network (CNN) in PyTorch to recognize images of handwritten digits

1. Code implementation of LeNet-5 (LeNet-5 in PyTorch)

PyTorch: building the classic LeNet convolutional neural network (CNN) with PyTorch (by Jason niu)

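import torch
import torch.nn as nn
import torch.nn.functional as F


class LeNet(nn.Module):
    def __init__(self):
        super().__init__()
        # conv1 and conv2: convolutional layers. Each output value is the dot product
        # of a kernel (a small weight tensor) with an input region of the same size.
        # Note: the original LeNet-5 uses 6 and 16 feature maps; this variant uses 10 and 20.
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        # Dropout2d zeroes entire channels at random. When feature maps are strongly
        # correlated, this promotes independence between them.
        self.conv2_drop = nn.Dropout2d()
        self.fc1 = nn.Linear(320, 50)
        self.fc2 = nn.Linear(50, 10)

    def forward(self, x):
        # Max pooling downsamples each region (here 2x2 pixels) by taking its maximum.
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        x = x.view(-1, 320)  # flatten: 20 channels x 4 x 4 for a 28x28 input
        x = F.relu(self.fc1(x))  # ReLU: the element-wise activation max(0, x)
        x = F.dropout(x, training=self.training)  # element-wise dropout, active only in training
        x = self.fc2(x)
        # log_softmax returns log-probabilities over dim 1: every value is <= 0,
        # and exp() of each row sums to 1.
        return F.log_softmax(x, dim=1)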
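
The 320 input features of `fc1` follow from the layer arithmetic. The sketch below assumes a 28x28 single-channel input (the MNIST image size, which the listing does not state explicitly) and uses hypothetical helper names; it mirrors two rounds of an unpadded 5x5 convolution followed by 2x2 max pooling.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution (square input and kernel)."""
    return (size + 2 * padding - kernel) // stride + 1


def flattened_features(input_size=28, conv_channels=(10, 20), kernel=5, pool=2):
    """Number of values left after the conv/pool stack, ready for the first Linear layer."""
    size = input_size
    for _ in conv_channels:
        size = conv_out(size, kernel)  # 5x5 convolution, no padding
        size = size // pool            # 2x2 max pooling (floor division)
    return conv_channels[-1] * size * size


print(flattened_features())    # 28 -> 24 -> 12 -> 8 -> 4, so 20 * 4 * 4 = 320
print(flattened_features(32))  # 32 -> 28 -> 14 -> 10 -> 5, so 20 * 5 * 5 = 500
```

If the input size changes, the hard-coded 320 in both `nn.Linear(320, 50)` and `x.view(-1, 320)` would need to change with it.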

# After defining the LeNet class, create an instance and move it to the GPU if one is available.
model = LeNet()
cuda_gpu = torch.cuda.is_available()
if cuda_gpu:
    model.cuda()
print('MNIST_net model:\n')
print(model)
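
The forward pass ends in `F.log_softmax`, so the model outputs log-probabilities rather than values between 0 and 1. Such outputs are commonly paired with a negative log-likelihood loss (`F.nll_loss` in PyTorch). The following is a minimal pure-Python sketch of both computations for a single sample; the logits are made-up numbers and the helper functions are illustrative stand-ins, not the PyTorch implementations.

```python
import math


def log_softmax(logits):
    """log(softmax(x)), computed stably by subtracting the maximum first."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_sum for v in logits]


def nll_loss(log_probs, target):
    """Negative log-likelihood of the target class for one sample."""
    return -log_probs[target]


logits = [2.0, 0.5, -1.0]  # hypothetical scores for 3 classes
log_probs = log_softmax(logits)
probs = [math.exp(lp) for lp in log_probs]
loss = nll_loss(log_probs, target=0)

print(log_probs)   # every entry is <= 0
print(sum(probs))  # approximately 1.0
print(loss)        # small when the target class has the largest logit
```

Applying log-softmax and then the negative log-likelihood is equivalent to the usual cross-entropy loss on the raw logits.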
