Table of Contents

- 1. Paper Overview
- 2. Computational Cost: Standard vs. Depthwise Separable Convolution
- 3. Removing Some of the Non-linearities
- 4. The Difference Between a Residual Block and an Inverted Residual
- 5. Ablation Experiments
- 6. SSDLite
1. Paper Overview

This paper proposes MobileNetV2, a classification network suited to deployment on mobile devices. It builds on MobileNetV1 and, like its predecessor, relies on depthwise separable convolutions to cut the parameter count and speed up inference. The title already names the two main improvements: Inverted Residuals and Linear Bottlenecks. Inverted Residuals add a shortcut connection in the spirit of ResNet, but with the channel layout reversed: in a ResNet bottleneck the middle feature maps have few channels and the two ends have many, giving an hourglass shape, whereas the inverted residual proposed here has few channels at the two ends and many in the middle, giving a spindle (willow-leaf) shape. Linear Bottlenecks means removing the ReLU6 activation that would otherwise follow the final, low-channel feature map of the bottleneck, i.e., dropping the non-linearity at that point.

Note: in Table 2, t is the expansion factor of the 1×1 convolution used to increase the channel dimension (the authors use t = 6 in most experiments), n is the number of times the unit is repeated, and s is the stride.
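For reference, the bottleneck rows of Table 2 can be written out as a plain configuration list (the values are copied from the paper; the variable name and tuple layout below are just for illustration):

```python
# (t, c, n, s): expansion factor, output channels, number of repeats, stride of the first repeat
mobilenet_v2_bottlenecks = [
    (1,  16, 1, 1),
    (6,  24, 2, 2),
    (6,  32, 3, 2),
    (6,  64, 4, 2),
    (6,  96, 3, 1),
    (6, 160, 3, 2),
    (6, 320, 1, 1),
]
```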
Our main contribution is a novel layer module: the inverted residual with linear bottleneck. This module takes as an input a low-dimensional compressed representation which is first expanded to high dimension and filtered with a lightweight depthwise convolution. Features are subsequently projected back to a low-dimensional representation with a linear convolution. The official implementation is available as part of TensorFlow-Slim model library in [4].
Furthermore, this convolutional module is particularly suitable for mobile designs, because it allows to significantly reduce the memory footprint needed during inference by never fully materializing large intermediate tensors.
The network is fast because standard convolutions are never applied to large, deep feature maps. Inside the bottleneck the channels are first expanded, but the expanded tensor is processed with a depthwise convolution, and the projection back down uses a 1×1 convolution; both contribute few parameters and few multiply-adds.
Our network design is based on MobileNetV1 [27]. It retains its simplicity and does not require any special operators while significantly improves its accuracy, achieving state of the art on multiple image classification and detection tasks for mobile applications.
2. Computational Cost: Standard vs. Depthwise Separable Convolution
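For a k×k convolution over an H×W feature map with M input and N output channels, a standard convolution costs k·k·M·N·H·W multiply-adds, whereas a depthwise separable convolution costs k·k·M·H·W (depthwise) plus M·N·H·W (1×1 pointwise), i.e., roughly 1/N + 1/k² of the standard cost. A minimal sketch of this comparison (the function names and example sizes below are my own, not from the paper):

```python
def conv_macs(k, c_in, c_out, h, w):
    """Multiply-accumulate count of a standard k x k convolution (stride 1, 'same' padding)."""
    return k * k * c_in * c_out * h * w

def dw_separable_macs(k, c_in, c_out, h, w):
    """Depthwise k x k convolution on c_in channels, followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in * h * w
    pointwise = c_in * c_out * h * w
    return depthwise + pointwise

# Example: 3x3 kernel, 112x112 feature map, 32 -> 64 channels
std = conv_macs(3, 32, 64, 112, 112)
sep = dw_separable_macs(3, 32, 64, 112, 112)
print(f"standard: {std:,}  separable: {sep:,}  ratio: {sep / std:.3f}")  # ratio ~ 1/64 + 1/9
```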
3. Removing Some of the Non-linearities

The authors devote considerable space to showing that when depthwise convolutions are used and the feature map has only a few channels, the convolution should not be followed by a non-linear activation: simply removing the ReLU at that point improves performance.
To summarize, we have highlighted two properties that are indicative of the requirement that the manifold of interest should lie in a low-dimensional subspace of the higher-dimensional activation space:
1. If the manifold of interest remains non-zero volume after ReLU transformation, it corresponds to a linear transformation.
2. ReLU is capable of preserving complete information about the input manifold, but only if the input manifold lies in a low-dimensional subspace of the input space.
These two insights provide us with an empirical hint for optimizing existing neural architectures: assuming the manifold of interest is low-dimensional we can capture this by inserting linear bottleneck layers into the convolutional blocks. Experimental evidence suggests that using linear layers is crucial as it prevents nonlinearities from destroying too much information. In Section 6, we show empirically that using non-linear layers in bottlenecks indeed hurts the performance by several percent, further validating our hypothesis. We note that similar reports where non-linearity was helped were reported in [29] where non-linearity was removed from the input of the traditional residual block and that lead to improved performance on CIFAR dataset.
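A small numerical illustration of this claim, in the spirit of the paper's Figure 1 (everything below is my own toy example, not code from the paper): a 2-D "manifold" is embedded into n dimensions with a random matrix, passed through ReLU, and projected back; with small n much of the information is destroyed, while with large n the points are almost perfectly recovered.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 1000))           # points on a 2-D input manifold

for n in (3, 5, 15, 30):
    T = rng.standard_normal((n, 2))          # random expansion to n dimensions
    y = np.maximum(T @ x, 0.0)               # ReLU applied in the high-dimensional space
    x_rec = np.linalg.pinv(T) @ y            # project back to 2-D
    err = np.mean((x - x_rec) ** 2)
    print(f"n={n:2d}  reconstruction MSE={err:.4f}")
```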
4. The Difference Between a Residual Block and an Inverted Residual

A residual block first reduces the channel dimension and then expands it back; an inverted residual first expands the channel dimension and then projects it back down, as in the sketch below.
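A minimal PyTorch sketch of an inverted residual block, assuming the layer layout described above (1×1 expansion with ReLU6, 3×3 depthwise with ReLU6, then a linear 1×1 projection); the class name and exact hyperparameters are mine, not the official implementation:

```python
import torch
import torch.nn as nn

class InvertedResidual(nn.Module):
    """Sketch of a MobileNetV2-style inverted residual with linear bottleneck."""
    def __init__(self, in_ch, out_ch, stride=1, expansion=6):
        super().__init__()
        hidden = in_ch * expansion
        # Shortcut only when the spatial size and channel count are preserved
        self.use_shortcut = stride == 1 and in_ch == out_ch
        self.block = nn.Sequential(
            # 1x1 pointwise conv expands channels (the "wide" middle of the block)
            nn.Conv2d(in_ch, hidden, 1, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU6(inplace=True),
            # 3x3 depthwise conv filters each channel independently
            nn.Conv2d(hidden, hidden, 3, stride, 1, groups=hidden, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU6(inplace=True),
            # 1x1 linear bottleneck: project back down, no ReLU afterwards
            nn.Conv2d(hidden, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
        )

    def forward(self, x):
        out = self.block(x)
        return x + out if self.use_shortcut else out
```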
5. Ablation Experiments
6. SSDLite

SSDLite: In this paper, we introduce a mobile friendly variant of regular SSD. We replace all the regular convolutions with separable convolutions (depthwise followed by 1 × 1 projection) in SSD prediction layers. This design is in line with the overall design of MobileNets and is seen to be much more computationally efficient. We call this modified version SSDLite. Compared to regular SSD, SSDLite dramatically reduces both parameter count and computational cost as shown in Table 5.
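A rough sketch of what such a prediction layer looks like after the replacement, assuming a depthwise 3×3 followed by a 1×1 projection (the function name and the BN/ReLU6 placement are assumptions for illustration, not the official SSDLite code):

```python
import torch.nn as nn

def ssdlite_pred_layer(in_ch, out_ch):
    """Hypothetical SSDLite-style head: SSD's regular 3x3 conv is replaced by
    a 3x3 depthwise conv plus a 1x1 pointwise projection."""
    return nn.Sequential(
        nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch, bias=False),  # depthwise 3x3
        nn.BatchNorm2d(in_ch),
        nn.ReLU6(inplace=True),
        nn.Conv2d(in_ch, out_ch, 1),  # 1x1 projection to num_anchors * (4 or num_classes)
    )
```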