Table of Contents

- 1. Paper Overview
- 2. Computational Cost: Standard vs. Depthwise Separable Convolution
- 3. Removing Some of the Non-linearities
- 4. The Difference Between a Residual Block and an Inverted Residual
- 5. Ablation Experiments
- 6. SSDLite
1. Paper Overview

This paper proposes MobileNetV2, a classification network suited to deployment on mobile devices. It builds on MobileNetV1 and, like its predecessor, relies on depthwise separable convolutions to cut the parameter count and speed up inference. The title already names the two main improvements: Inverted Residuals and Linear Bottlenecks. Inverted Residuals add a shortcut connection in the spirit of ResNet, but with the channel layout reversed: in a ResNet bottleneck the middle feature maps have few channels and the two ends have many, giving an hourglass shape, whereas the inverted residual proposed here has few channels at the two ends and many in the middle, giving a spindle (willow-leaf) shape. Linear Bottlenecks means removing the ReLU6 activation that would otherwise follow the final, low-channel feature map of the bottleneck, i.e., dropping the non-linearity at that point.

Note: in Table 2, t is the expansion factor of the 1×1 convolution used to increase the channel dimension (the authors use t = 6 in most experiments), n is the number of times the unit is repeated, and s is the stride.
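For reference, the bottleneck rows of Table 2 can be written out as a plain configuration list (the values are copied from the paper; the variable name and tuple layout below are just for illustration):

```python
# (t, c, n, s): expansion factor, output channels, number of repeats, stride of the first repeat
mobilenet_v2_bottlenecks = [
    (1,  16, 1, 1),
    (6,  24, 2, 2),
    (6,  32, 3, 2),
    (6,  64, 4, 2),
    (6,  96, 3, 1),
    (6, 160, 3, 2),
    (6, 320, 1, 1),
]
```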
Our main contribution is a novel layer module: the inverted residual with linear bottleneck. This module takes as an input a low-dimensional compressed representation which is first expanded to high dimension and filtered with a lightweight depthwise convolution. Features are subsequently projected back to a low-dimensional representation with a linear convolution. The official implementation is available as part of TensorFlow-Slim model library in [4].
Furthermore, this convolutional module is particularly suitable for mobile designs, because it allows to significantly reduce the memory footprint needed during inference by never fully materializing large intermediate tensors.
The network is fast because standard convolutions are never applied to large, deep feature maps. Inside the bottleneck the channels are first expanded, but the expanded tensor is processed with a depthwise convolution, and the projection back down uses a 1×1 convolution; both contribute few parameters and few multiply-adds.
Our network design is based on MobileNetV1 [27]. It retains its simplicity and does not require any special operators while significantly improves its accuracy, achieving state of the art on multiple image classification and detection tasks for mobile applications.
2. Computational Cost: Standard vs. Depthwise Separable Convolution
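For a k×k convolution over an H×W feature map with M input and N output channels, a standard convolution costs k·k·M·N·H·W multiply-adds, whereas a depthwise separable convolution costs k·k·M·H·W (depthwise) plus M·N·H·W (1×1 pointwise), i.e., roughly 1/N + 1/k² of the standard cost. A minimal sketch of this comparison (the function names and example sizes below are my own, not from the paper):

```python
def conv_macs(k, c_in, c_out, h, w):
    """Multiply-accumulate count of a standard k x k convolution (stride 1, 'same' padding)."""
    return k * k * c_in * c_out * h * w

def dw_separable_macs(k, c_in, c_out, h, w):
    """Depthwise k x k convolution on c_in channels, followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in * h * w
    pointwise = c_in * c_out * h * w
    return depthwise + pointwise

# Example: 3x3 kernel, 112x112 feature map, 32 -> 64 channels
std = conv_macs(3, 32, 64, 112, 112)
sep = dw_separable_macs(3, 32, 64, 112, 112)
print(f"standard: {std:,}  separable: {sep:,}  ratio: {sep / std:.3f}")  # ratio ~ 1/64 + 1/9
```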
3. Removing Some of the Non-linearities

The authors devote considerable space to showing that when depthwise convolutions are used and the feature map has only a few channels, the convolution should not be followed by a non-linear activation: simply removing the ReLU at that point improves performance.
To summarize, we have highlighted two properties that are indicative of the requirement that the manifold of interest should lie in a low-dimensional subspace of the higher-dimensional activation space:
1. If the manifold of interest remains non-zero volume after ReLU transformation, it corresponds to a linear transformation.
2. ReLU is capable of preserving complete information about the input manifold, but only if the input manifold lies in a low-dimensional subspace of the input space.
These two insights provide us with an empirical hint for optimizing existing neural architectures: assuming the manifold of interest is low-dimensional we can capture this by inserting linear bottleneck layers into the convolutional blocks. Experimental evidence suggests that using linear layers is crucial as it prevents nonlinearities from destroying too much information. In Section 6, we show empirically that using non-linear layers in bottlenecks indeed hurts the performance by several percent, further validating our hypothesis. We note that similar reports where non-linearity was helped were reported in [29] where non-linearity was removed from the input of the traditional residual block and that lead to improved performance on CIFAR dataset.
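A small numerical illustration of this claim, in the spirit of the paper's Figure 1 (everything below is my own toy example, not code from the paper): a 2-D "manifold" is embedded into n dimensions with a random matrix, passed through ReLU, and projected back; with small n much of the information is destroyed, while with large n the points are almost perfectly recovered.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 1000))           # points on a 2-D input manifold

for n in (3, 5, 15, 30):
    T = rng.standard_normal((n, 2))          # random expansion to n dimensions
    y = np.maximum(T @ x, 0.0)               # ReLU applied in the high-dimensional space
    x_rec = np.linalg.pinv(T) @ y            # project back to 2-D
    err = np.mean((x - x_rec) ** 2)
    print(f"n={n:2d}  reconstruction MSE={err:.4f}")
```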
4. The Difference Between a Residual Block and an Inverted Residual

A residual block first reduces the channel dimension and then expands it back; an inverted residual first expands the channel dimension and then projects it back down, as in the sketch below.
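A minimal PyTorch sketch of an inverted residual block, assuming the layer layout described above (1×1 expansion with ReLU6, 3×3 depthwise with ReLU6, then a linear 1×1 projection); the class name and exact hyperparameters are mine, not the official implementation:

```python
import torch
import torch.nn as nn

class InvertedResidual(nn.Module):
    """Sketch of a MobileNetV2-style inverted residual with linear bottleneck."""
    def __init__(self, in_ch, out_ch, stride=1, expansion=6):
        super().__init__()
        hidden = in_ch * expansion
        # Shortcut only when the spatial size and channel count are preserved
        self.use_shortcut = stride == 1 and in_ch == out_ch
        self.block = nn.Sequential(
            # 1x1 pointwise conv expands channels (the "wide" middle of the block)
            nn.Conv2d(in_ch, hidden, 1, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU6(inplace=True),
            # 3x3 depthwise conv filters each channel independently
            nn.Conv2d(hidden, hidden, 3, stride, 1, groups=hidden, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU6(inplace=True),
            # 1x1 linear bottleneck: project back down, no ReLU afterwards
            nn.Conv2d(hidden, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
        )

    def forward(self, x):
        out = self.block(x)
        return x + out if self.use_shortcut else out
```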
5. Ablation Experiments
6. SSDLite

SSDLite: In this paper, we introduce a mobile friendly variant of regular SSD. We replace all the regular convolutions with separable convolutions (depthwise followed by 1 × 1 projection) in SSD prediction layers. This design is in line with the overall design of MobileNets and is seen to be much more computationally efficient. We call this modified version SSDLite. Compared to regular SSD, SSDLite dramatically reduces both parameter count and computational cost as shown in Table 5.
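A rough sketch of what such a prediction layer looks like after the replacement, assuming a depthwise 3×3 followed by a 1×1 projection (the function name and the BN/ReLU6 placement are assumptions for illustration, not the official SSDLite code):

```python
import torch.nn as nn

def ssdlite_pred_layer(in_ch, out_ch):
    """Hypothetical SSDLite-style head: SSD's regular 3x3 conv is replaced by
    a 3x3 depthwise conv plus a 1x1 pointwise projection."""
    return nn.Sequential(
        nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch, bias=False),  # depthwise 3x3
        nn.BatchNorm2d(in_ch),
        nn.ReLU6(inplace=True),
        nn.Conv2d(in_ch, out_ch, 1),  # 1x1 projection to num_anchors * (4 or num_classes)
    )
```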