Table of Contents
- MobileNetV3 main contributions
- 1. The NetAdapt search process
- 2. Redesigning the expensive layers
- 3. Nonlinearities: improving the activation functions
- 4. MobileNetV3-Large and MobileNetV3-Small architectures
- 5. Lite R-ASPP
- References
MobileNetV3 main contributions
Paper: https://arxiv.org/abs/1905.02244
Code:
- PyTorch version: https://github.com/xiaolai-sqlai/mobilenetv3
- TensorFlow version (includes a Lite R-ASPP implementation): https://github.com/xiaochus/MobileNetV3
One-sentence summary: MobileNetV3 targets three problems: the NetAdapt architecture-search algorithm optimizes poorly on small mobile models, the swish nonlinearity is expensive to compute, and MobileNetV2's last stage adds latency. The proposed solutions are:
- Complementary search techniques. On small mobile models, accuracy changes far more sharply than latency during search, so the NetAdapt objective is modified: instead of optimizing the accuracy change alone, MobileNetV3 optimizes the ratio of accuracy change to latency change.
- A new, efficient version of the nonlinearity. The sigmoid in $swish(x) = x \cdot \sigma(x)$ is costly to compute, so the paper proposes the approximation $h\text{-}swish[x] = x \cdot \frac{ReLU6(x+3)}{6}$.
- A new, efficient network design. MobileNetV2's inverted bottleneck produces an expensive last stage, so the avg-pooling layer is moved before the final feature-generation layers, cutting latency while preserving accuracy; the filter count of the initial 3×3 convolution is also reduced from 32 to 16.
- A new, efficient segmentation decoder. DeepLab's ASPP head is lightened by combining it with squeeze-and-excitation and skip connections, simplifying the DeepLab v3 decoder.
1. The NetAdapt search process
Each step generates a set of new proposals, where each proposal reduces latency by at least $\delta$ relative to the previous step. For each proposal, the pretrained model from the previous step is used to fill in and prune the new network structure, which is then fine-tuned for T steps until its accuracy is acceptable. The best proposal is then selected with a predefined metric.
NetAdapt's original metric is the change in accuracy; MobileNetV3 instead uses the ratio of accuracy change to latency change, mainly to address the mismatch between how accuracy and latency change for small models on mobile devices.
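The difference between the two selection rules can be sketched in plain Python. Everything below (the helper names, the proposal numbers) is hypothetical illustration, not code from the paper:

```python
# Hypothetical sketch of proposal selection in the NetAdapt loop. Each
# proposal records its accuracy change and latency reduction after
# fine-tuning; the numbers are made up for illustration.

def netadapt_metric(acc_change, latency_change):
    # Original NetAdapt: rank by accuracy change alone.
    return acc_change

def mobilenetv3_metric(acc_change, latency_change):
    # MobileNetV3: ratio of accuracy change to latency reduction, so a
    # proposal trading a tiny accuracy loss for a big latency win scores well.
    return acc_change / latency_change

proposals = [
    # (accuracy change, latency reduction in ms)
    (-0.10, 1.0),
    (-0.30, 8.0),
    (-0.20, 6.0),
]

# The original metric would pick (-0.10, 1.0); the ratio metric prefers
# (-0.20, 6.0), which buys six times the latency for twice the accuracy cost.
best = max(proposals, key=lambda p: mobilenetv3_metric(*p))
```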
2. Redesigning the expensive layers
- 2.1 To reduce the latency of MobileNetV2's inverted-bottleneck last stage, the avg-pooling layer is moved before the final feature-generation layers.
- 2.2 Compared with MobileNetV2, MobileNetV3 incorporates the squeeze-and-excitation (SE-Net) idea.
- 2.3 SE module implementation:
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SeModule(nn.Module):
    def __init__(self, in_size, reduction=4):
        super(SeModule, self).__init__()
        self.se = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),  # squeeze: global average pooling
            nn.Conv2d(in_size, in_size // reduction, kernel_size=1, stride=1, padding=0, bias=False),
            nn.BatchNorm2d(in_size // reduction),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_size // reduction, in_size, kernel_size=1, stride=1, padding=0, bias=False),
            nn.BatchNorm2d(in_size),
            hsigmoid()  # hard-sigmoid gate, defined in Section 3
        )

    def forward(self, x):
        return x * self.se(x)  # excite: rescale each channel by its gate
```
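Stripped of the learned 1×1 convolutions and batch norms, the recalibration `SeModule` performs reduces to: average each channel over space, pass the average through a hard sigmoid, and scale the channel by the result. A simplified, hypothetical pure-Python sketch (not the real module):

```python
# Simplified SE recalibration: squeeze each channel to its spatial mean,
# gate it with hard-sigmoid, and rescale the channel. The learned conv
# layers of the real SeModule are deliberately omitted.

def hard_sigmoid(x):
    # ReLU6(x + 3) / 6, matching the hsigmoid module in this article
    return min(max(x + 3.0, 0.0), 6.0) / 6.0

def se_gate(feature_map):
    """feature_map: list of channels, each a 2D list (H x W)."""
    gated = []
    for channel in feature_map:
        flat = [v for row in channel for v in row]
        mean = sum(flat) / len(flat)   # squeeze: global average pool
        gate = hard_sigmoid(mean)      # excite (learned 1x1 convs omitted)
        gated.append([[v * gate for v in row] for row in channel])
    return gated
```

A channel whose mean activation is high (≥ 3) passes through unchanged (gate = 1), while a strongly negative channel is suppressed toward zero.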
- 2.4 MobileNetV3 block implementation:
```python
class Block(nn.Module):
    '''expand (1x1) + depthwise (kxk) + pointwise-linear (1x1)'''
    def __init__(self, kernel_size, in_size, expand_size, out_size, nolinear, semodule, stride):
        super(Block, self).__init__()
        self.stride = stride
        self.se = semodule
        # 1x1 expansion
        self.conv1 = nn.Conv2d(in_size, expand_size, kernel_size=1, stride=1, padding=0, bias=False)
        self.bn1 = nn.BatchNorm2d(expand_size)
        self.nolinear1 = nolinear
        # kxk depthwise convolution (groups == channels)
        self.conv2 = nn.Conv2d(expand_size, expand_size, kernel_size=kernel_size, stride=stride,
                               padding=kernel_size // 2, groups=expand_size, bias=False)
        self.bn2 = nn.BatchNorm2d(expand_size)
        self.nolinear2 = nolinear
        # 1x1 linear projection
        self.conv3 = nn.Conv2d(expand_size, out_size, kernel_size=1, stride=1, padding=0, bias=False)
        self.bn3 = nn.BatchNorm2d(out_size)

        self.shortcut = nn.Sequential()
        if stride == 1 and in_size != out_size:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_size, out_size, kernel_size=1, stride=1, padding=0, bias=False),
                nn.BatchNorm2d(out_size),
            )

    def forward(self, x):
        out = self.nolinear1(self.bn1(self.conv1(x)))
        out = self.nolinear2(self.bn2(self.conv2(out)))
        out = self.bn3(self.conv3(out))
        if self.se is not None:
            out = self.se(out)
        out = out + self.shortcut(x) if self.stride == 1 else out
        return out
```
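The `groups=expand_size` depthwise convolution is what keeps this block cheap: each filter sees only one input channel. The parameter arithmetic can be checked in plain Python; the example uses `expand_size=72` from one of MobileNetV3-Large's 5×5 blocks purely as an illustration:

```python
# Parameter count of a dense k x k conv versus the depthwise conv used in
# Block.conv2 (groups == channels, so one k x k filter per channel).

def standard_conv_params(k, c_in, c_out):
    return k * k * c_in * c_out

def depthwise_conv_params(k, c):
    return k * k * c  # one k x k filter per channel

k, c = 5, 72
dense = standard_conv_params(k, c, c)  # 5*5*72*72 = 129,600
dw = depthwise_conv_params(k, c)       # 5*5*72   = 1,800
ratio = dense / dw                     # saving factor equals the channel count
```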
3. Nonlinearities: improving the activation functions
The paper proposes $h\text{-}swish[x] = x \cdot \frac{ReLU6(x+3)}{6}$ as an approximation of $swish(x) = x \cdot \sigma(x)$, removing the latency cost of computing the sigmoid. The authors also observe that h-swish is most beneficial in the second half of the network, which is why the early blocks keep ReLU.
PyTorch version:
```python
class hswish(nn.Module):
    def forward(self, x):
        # h-swish(x) = x * ReLU6(x + 3) / 6
        out = x * F.relu6(x + 3, inplace=True) / 6
        return out

class hsigmoid(nn.Module):
    def forward(self, x):
        # h-sigmoid(x) = ReLU6(x + 3) / 6
        out = F.relu6(x + 3, inplace=True) / 6
        return out
```
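How good is the approximation? A quick pure-Python check (using `math.exp` only to stand in for the sigmoid that `hswish` is designed to avoid) shows the two curves agree at 0, saturate identically, and never differ by much in between:

```python
# Numeric comparison of swish and its piecewise-linear h-swish approximation.
import math

def swish(x):
    return x / (1.0 + math.exp(-x))      # x * sigmoid(x)

def h_swish(x):
    relu6 = min(max(x + 3.0, 0.0), 6.0)  # ReLU6(x + 3)
    return x * relu6 / 6.0

# Largest gap over [-8, 8], sampled at steps of 0.1; the worst case sits
# near |x| = 3, where h-swish's linear pieces meet its saturation.
max_gap = max(abs(swish(x / 10.0) - h_swish(x / 10.0)) for x in range(-80, 81))
```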
4. MobileNetV3-Large and MobileNetV3-Small architectures
- 4.1 MobileNetV3-Large
- 4.2 MobileNetV3-Large implementation:
```python
class MobileNetV3_Large(nn.Module):
    def __init__(self, num_classes=1000):
        super(MobileNetV3_Large, self).__init__()
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(16)
        self.hs1 = hswish()
        self.bneck = nn.Sequential(
            # Block(kernel_size, in_size, expand_size, out_size, nonlinearity, SE, stride)
            Block(3, 16, 16, 16, nn.ReLU(inplace=True), None, 1),
            Block(3, 16, 64, 24, nn.ReLU(inplace=True), None, 2),
            Block(3, 24, 72, 24, nn.ReLU(inplace=True), None, 1),
            Block(5, 24, 72, 40, nn.ReLU(inplace=True), SeModule(40), 2),
            Block(5, 40, 120, 40, nn.ReLU(inplace=True), SeModule(40), 1),
            Block(5, 40, 120, 40, nn.ReLU(inplace=True), SeModule(40), 1),
            Block(3, 40, 240, 80, hswish(), None, 2),
            Block(3, 80, 200, 80, hswish(), None, 1),
            Block(3, 80, 184, 80, hswish(), None, 1),
            Block(3, 80, 184, 80, hswish(), None, 1),
            Block(3, 80, 480, 112, hswish(), SeModule(112), 1),
            Block(3, 112, 672, 112, hswish(), SeModule(112), 1),
            Block(5, 112, 672, 160, hswish(), SeModule(160), 1),
            Block(5, 160, 672, 160, hswish(), SeModule(160), 2),
            Block(5, 160, 960, 160, hswish(), SeModule(160), 1),
        )
        self.conv2 = nn.Conv2d(160, 960, kernel_size=1, stride=1, padding=0, bias=False)
        self.bn2 = nn.BatchNorm2d(960)
        self.hs2 = hswish()
        self.linear3 = nn.Linear(960, 1280)
        self.bn3 = nn.BatchNorm1d(1280)
        self.hs3 = hswish()
        self.linear4 = nn.Linear(1280, num_classes)
        self.init_params()  # weight initialization, defined in the linked repo

    def forward(self, x):
        out = self.hs1(self.bn1(self.conv1(x)))
        out = self.bneck(out)
        out = self.hs2(self.bn2(self.conv2(out)))
        out = F.avg_pool2d(out, 7)  # assumes a 224x224 input, which is 7x7 here
        out = out.view(out.size(0), -1)
        out = self.hs3(self.bn3(self.linear3(out)))
        out = self.linear4(out)
        return out
```
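The `F.avg_pool2d(out, 7)` call assumes a 224×224 input. The spatial bookkeeping can be checked with a few lines of plain Python, with the strides copied from the `conv1` and `bneck` definitions above:

```python
# Track the feature-map side length through MobileNetV3-Large for a
# 224x224 input. Each stride-2 layer halves the resolution; the padding
# choices keep stride-1 layers size-preserving.

strides = [2] + [1, 2, 1, 2, 1, 1, 2, 1, 1, 1, 1, 1, 1, 2, 1]  # conv1 + bneck

size = 224
for s in strides:
    size = (size + s - 1) // s  # ceil division matches the 'same'-style padding

# size is now 7, which is why forward() pools with a 7x7 window
```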
- 4.3 MobileNetV3-Small
- 4.4 MobileNetV3-Small implementation:
```python
class MobileNetV3_Small(nn.Module):
    def __init__(self, num_classes=1000):
        super(MobileNetV3_Small, self).__init__()
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(16)
        self.hs1 = hswish()
        self.bneck = nn.Sequential(
            # Block(kernel_size, in_size, expand_size, out_size, nonlinearity, SE, stride)
            Block(3, 16, 16, 16, nn.ReLU(inplace=True), SeModule(16), 2),
            Block(3, 16, 72, 24, nn.ReLU(inplace=True), None, 2),
            Block(3, 24, 88, 24, nn.ReLU(inplace=True), None, 1),
            Block(5, 24, 96, 40, hswish(), SeModule(40), 2),
            Block(5, 40, 240, 40, hswish(), SeModule(40), 1),
            Block(5, 40, 240, 40, hswish(), SeModule(40), 1),
            Block(5, 40, 120, 48, hswish(), SeModule(48), 1),
            Block(5, 48, 144, 48, hswish(), SeModule(48), 1),
            Block(5, 48, 288, 96, hswish(), SeModule(96), 2),
            Block(5, 96, 576, 96, hswish(), SeModule(96), 1),
            Block(5, 96, 576, 96, hswish(), SeModule(96), 1),
        )
        self.conv2 = nn.Conv2d(96, 576, kernel_size=1, stride=1, padding=0, bias=False)
        self.bn2 = nn.BatchNorm2d(576)
        self.hs2 = hswish()
        self.linear3 = nn.Linear(576, 1280)
        self.bn3 = nn.BatchNorm1d(1280)
        self.hs3 = hswish()
        self.linear4 = nn.Linear(1280, num_classes)
        self.init_params()  # weight initialization, defined in the linked repo

    def forward(self, x):
        out = self.hs1(self.bn1(self.conv1(x)))
        out = self.bneck(out)
        out = self.hs2(self.bn2(self.conv2(out)))
        out = F.avg_pool2d(out, 7)  # assumes a 224x224 input, which is 7x7 here
        out = out.view(out.size(0), -1)
        out = self.hs3(self.bn3(self.linear3(out)))
        out = self.linear4(out)
        return out
```
5. Lite R-ASPP
Lite R-ASPP takes features at different resolutions from the last block of MobileNetV3 and combines them with SE-Net-style gating and skip connections, enabling image segmentation on mobile devices.
TensorFlow version: https://github.com/xiaochus/MobileNetV3/blob/master/model/LR_ASPP.py
References
- https://arxiv.org/abs/1905.02244
- https://github.com/xiaolai-sqlai/mobilenetv3
- https://github.com/xiaochus/MobileNetV3