Author: MuggleZero
"魔改Yolov5" (Modified YOLOv5) column: 魔改Yolov5
MobileNet V3 paper translation: MobileNet V3 論文翻譯
h_sigmoid implementation:
common.py
class h_sigmoid(nn.Module):
    def __init__(self, inplace=True):
        super(h_sigmoid, self).__init__()
        self.relu = nn.ReLU6(inplace=inplace)

    def forward(self, x):
        return self.relu(x + 3) / 6
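As a quick sanity check (a hypothetical snippet, assuming the class above has been added to models/common.py), h_sigmoid should map -3 → 0, 0 → 0.5 and 3 → 1, i.e. ReLU6(x + 3) / 6:

import torch
from models.common import h_sigmoid  # assumes the class above was added to common.py

act = h_sigmoid()
print(act(torch.tensor([-3.0, 0.0, 3.0])))  # tensor([0.0000, 0.5000, 1.0000])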
h_swish implementation:
common.py
With h_sigmoid in place, the h-swish formula h_swish(x) = x · h_sigmoid(x) follows directly:
class h_swish(nn.Module):
    def __init__(self, inplace=True):
        super(h_swish, self).__init__()
        self.sigmoid = h_sigmoid(inplace=inplace)

    def forward(self, x):
        y = self.sigmoid(x)
        return x * y
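A minimal check of the formula (again assuming the two classes above are in models/common.py): h_swish(x) = x · h_sigmoid(x), which is the same definition as PyTorch's built-in nn.Hardswish (available since PyTorch 1.6):

import torch
import torch.nn as nn
from models.common import h_sigmoid, h_swish  # assumes the classes above

x = torch.randn(8)
assert torch.allclose(h_swish()(x), x * h_sigmoid()(x))
assert torch.allclose(h_swish()(x), nn.Hardswish()(x))  # same piece-wise formula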
Network structure implementation
Main structure (the MobileNetV3-small specification table from the paper):
NL: the non-linearity used
HS: h-swish
RE: ReLU
NBN: no batch normalization
SE: squeeze-and-excite
The first row of the table means: a 3x3 conv2d with an h-swish non-linearity, stride=2, out channels=16. Implementation:
class Conv3BN(nn.Module):
    """
    Equivalent to:
    def conv_3x3_bn(inp, oup, stride):
        return nn.Sequential(
            nn.Conv2d(inp, oup, 3, stride, 1, bias=False),
            nn.BatchNorm2d(oup),
            h_swish()
        )
    """
    def __init__(self, inp, oup, stride):
        super(Conv3BN, self).__init__()
        self.conv = nn.Conv2d(inp, oup, 3, stride, 1, bias=False)
        self.bn = nn.BatchNorm2d(oup)
        self.act = h_swish()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

    def fuseforward(self, x):
        return self.act(self.conv(x))
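A hedged usage example of Conv3BN (hypothetical input sizes, assuming the class above is in models/common.py): with stride=2 the spatial resolution is halved and the output has 16 channels, matching the first row of the MobileNetV3-small table:

import torch
from models.common import Conv3BN  # assumes the class above

m = Conv3BN(3, 16, 2)               # inp=3, oup=16, stride=2
x = torch.randn(1, 3, 640, 640)     # a typical YOLOv5 input resolution
print(m(x).shape)                   # torch.Size([1, 16, 320, 320])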
squeeze-and-excite
How the Squeeze-and-Excite block used in MobileNet v3 works: it is a stack of
AdaptiveAvgPool2d
Linear
ReLU
Linear
h_sigmoid
The corresponding operations:
- Squeeze (Fsq): AdaptiveAvgPool2d compresses the H × W × C feature map to 1 × 1 × C.
- Excitation (Fex): given the 1 × 1 × C feature map from Squeeze, fully connected (FC) layers predict the importance of each channel. There are two FC layers, one that reduces the dimension and one that restores it.
- Scale: the learned per-channel activations (sigmoid-activated, values between 0 and 1) are multiplied onto the corresponding channels of the original feature map.
class SELayer(nn.Module):
    def __init__(self, channel, reduction=4):
        super(SELayer, self).__init__()
        # Squeeze: compress the H x W x C feature map to 1 x 1 x C
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        # Excitation: FC layers predict the importance of each channel
        # (could equivalently be implemented with 1x1 convolutions)
        self.fc = nn.Sequential(
            nn.Linear(channel, channel // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channel // reduction, channel),
            h_sigmoid()  # h_sigmoid replaces the original sigmoid
        )

    def forward(self, x):
        # x.size() returns (batch, channels, height, width), e.g. b=1, c=16, h=w=64
        b, c, _, _ = x.size()
        # Squeeze: pooled 1 x 1 x C feature map reshaped to a b x c tensor
        y = self.avg_pool(x).view(b, c)
        # Excitation: predict channel weights and reshape to b x c x 1 x 1
        y = self.fc(y).view(b, c, 1, 1)
        # Scale
        return x * y
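A quick shape check of SELayer (hypothetical sizes, assuming the class above): the output keeps the input shape, only the per-channel scaling changes:

import torch
from models.common import SELayer  # assumes the class above

se = SELayer(16)                    # channel=16, reduction=4
x = torch.randn(2, 16, 32, 32)
print(se(x).shape)                  # torch.Size([2, 16, 32, 32]) -- same shape, channels re-weighted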
Filter 4-D tensor:
[height, width, input_channels, output_channels] // filter tensor
[kernel height, kernel width, image channels, number of kernels] // the third dimension, input_channels, equals the fourth dimension of the input tensor
Image 4-D tensor:
[batch_size, height, width, channels] // image tensor
[count, height, width, channels]
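Note that the orderings above follow the TensorFlow-style convention; the PyTorch layers used in this article arrange the dimensions differently, which the short check below illustrates:

import torch
import torch.nn as nn

conv = nn.Conv2d(3, 16, kernel_size=3)
print(conv.weight.shape)            # torch.Size([16, 3, 3, 3]) -> [out_channels, in_channels, kH, kW]
x = torch.randn(1, 3, 64, 64)       # [batch, channels, height, width] (NCHW)
print(conv(x).shape)                # torch.Size([1, 16, 62, 62])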
InvertedResidual
InvertedResidual (called bneck in the paper) was introduced in MobileNetV2; the data path differs between stride=1 and stride=2.
When stride=1:
1. point-wise expansion; 2. depth-wise feature extraction; 3. point-wise projection back down through a linear layer; 4. the input is added to the result (residual connection).
In the network table, the first bneck layer is a 3x3 convolution with ReLU6, while the later 5x5 bnecks use h-swish.
The overall structure is: a 1x1 convolution first expands the channels (with a residual shortcut around the block), then a 3x3 depth-wise separable convolution extracts features, the attention mechanism (SE) re-weights each channel, and a final 1x1 convolution projects back down:
class InvertedResidual(nn.Module):
    def __init__(self, inp, oup, hidden_dim, kernel_size, stride, use_se, use_hs):
        super(InvertedResidual, self).__init__()
        assert stride in [1, 2]

        self.identity = stride == 1 and inp == oup

        # if the input channels equal the hidden (expanded) dim, skip the expansion
        # and go straight to the depth-wise convolution
        if inp == hidden_dim:
            self.conv = nn.Sequential(
                # dw: depth-wise
                nn.Conv2d(hidden_dim, hidden_dim, kernel_size, stride, (kernel_size - 1) // 2,
                          groups=hidden_dim, bias=False),
                nn.BatchNorm2d(hidden_dim),
                h_swish() if use_hs else nn.ReLU(inplace=True),
                # Squeeze-and-Excite
                SELayer(hidden_dim) if use_se else nn.Sequential(),
                # pw-linear: point-wise projection, no activation
                nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False),
                nn.BatchNorm2d(oup),
            )
        else:
            self.conv = nn.Sequential(
                # pw: point-wise 1x1 expansion
                nn.Conv2d(inp, hidden_dim, 1, 1, 0, bias=False),
                nn.BatchNorm2d(hidden_dim),
                h_swish() if use_hs else nn.ReLU(inplace=True),
                # dw: depth-wise
                nn.Conv2d(hidden_dim, hidden_dim, kernel_size, stride, (kernel_size - 1) // 2,
                          groups=hidden_dim, bias=False),
                nn.BatchNorm2d(hidden_dim),
                # Squeeze-and-Excite
                SELayer(hidden_dim) if use_se else nn.Sequential(),
                h_swish() if use_hs else nn.ReLU(inplace=True),
                # pw-linear: point-wise projection, no activation
                nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False),
                nn.BatchNorm2d(oup),
            )

    def forward(self, x):
        y = self.conv(x)
        # when stride=1 and inp == oup, add the input to the output (residual)
        if self.identity:
            return x + y
        else:
            return y
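A hedged shape check of the block (hypothetical sizes, assuming the classes above are in models/common.py): with stride=1 and inp == oup the residual branch is active and the shape is preserved; with stride=2 the feature map is downsampled and no residual is added:

import torch
from models.common import InvertedResidual  # assumes the class above

# stride=1, inp == oup -> residual connection, shape preserved
blk1 = InvertedResidual(40, 40, 240, 5, 1, 1, 1)   # inp, oup, hidden_dim, k, stride, use_se, use_hs
print(blk1(torch.randn(1, 40, 40, 40)).shape)      # torch.Size([1, 40, 40, 40])

# stride=2 -> no residual, spatial size halved
blk2 = InvertedResidual(24, 40, 96, 5, 2, 1, 1)
print(blk2(torch.randn(1, 24, 40, 40)).shape)      # torch.Size([1, 40, 20, 20])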
yaml definition:
# YOLOv5 backbone
backbone:
# MobileNetV3-small
# [from, number, module, args]
[[-1, 1, Conv3BN, [16, 2]], # 0-p1/2
[-1, 1, InvertedResidual, [16, 16, 3, 2, 1, 0]], # 1-p2/4
[-1, 1, InvertedResidual, [24, 72, 3, 2, 0, 0]], # 2-p3/8
[-1, 1, InvertedResidual, [24, 88, 3, 1, 0, 0]], # 3-p3/8
[-1, 1, InvertedResidual, [40, 96, 5, 2, 1, 1]], # 4-p4/16
[-1, 1, InvertedResidual, [40, 240, 5, 1, 1, 1]], # 5-p4/16
[-1, 1, InvertedResidual, [40, 240, 5, 1, 1, 1]], # 6-p4/16
[-1, 1, InvertedResidual, [48, 120, 5, 1, 1, 1]], # 7-p4/16
[-1, 1, InvertedResidual, [48, 144, 5, 1, 1, 1]], # 8-p4/16
[-1, 1, InvertedResidual, [96, 288, 5, 2, 1, 1]], # 9-p5/32
[-1, 1, InvertedResidual, [96, 576, 5, 1, 1, 1]], # 10-p5/32
[-1, 1, InvertedResidual, [96, 576, 5, 1, 1, 1]], # 11-p5/32
]
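Reading the rows against the constructor, the InvertedResidual args appear to be [oup, hidden_dim, kernel_size, stride, use_se, use_hs], with inp filled in by parse_model in models/yolo.py from the previous layer's output channels (an assumption; the required yolo.py changes are not shown here). As a stand-alone sanity check, the backbone can be wired up by hand:

import torch
import torch.nn as nn
from models.common import Conv3BN, InvertedResidual  # assumes the classes above

# args per row below: inp, oup, hidden_dim, kernel_size, stride, use_se, use_hs
backbone = nn.Sequential(
    Conv3BN(3, 16, 2),                           # 0-p1/2
    InvertedResidual(16, 16, 16, 3, 2, 1, 0),    # 1-p2/4
    InvertedResidual(16, 24, 72, 3, 2, 0, 0),    # 2-p3/8
    InvertedResidual(24, 24, 88, 3, 1, 0, 0),    # 3-p3/8
    InvertedResidual(24, 40, 96, 5, 2, 1, 1),    # 4-p4/16
    InvertedResidual(40, 40, 240, 5, 1, 1, 1),   # 5-p4/16
    InvertedResidual(40, 40, 240, 5, 1, 1, 1),   # 6-p4/16
    InvertedResidual(40, 48, 120, 5, 1, 1, 1),   # 7-p4/16
    InvertedResidual(48, 48, 144, 5, 1, 1, 1),   # 8-p4/16
    InvertedResidual(48, 96, 288, 5, 2, 1, 1),   # 9-p5/32
    InvertedResidual(96, 96, 576, 5, 1, 1, 1),   # 10-p5/32
    InvertedResidual(96, 96, 576, 5, 1, 1, 1),   # 11-p5/32
)
x = torch.randn(1, 3, 640, 640)
print(backbone(x).shape)   # torch.Size([1, 96, 20, 20]) -> overall stride 32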