Introduction to the DSSD Algorithm (Paper Overview)
DSSD, the Deconvolutional Single Shot Detector, is an improvement built on top of SSD.
Abstract
The main contribution of this paper is an approach for introducing additional context into state-of-the-art general object detection. To achieve this we first combine a state-of-the-art classifier (Residual-101 [14]) with a fast detection framework (SSD [18]). We then augment SSD+Residual-101 with deconvolution layers to introduce additional large-scale context in object detection and improve accuracy, especially for small objects, calling our resulting system DSSD for deconvolutional single shot detector. While these two contributions are easily described at a high-level, a naive implementation does not succeed. Instead we show that carefully adding additional stages of learned transformations, specifically a module for feed-forward connections in deconvolution and a new output module, enables this new approach and forms a potential way forward for further detection research. Results are shown on both PASCAL VOC and COCO detection. Our DSSD with 513 × 513 input achieves 81.5% mAP on VOC2007 test, 80.0% mAP on VOC2012 test, and 33.2% mAP on COCO, outperforming a state-of-the-art method R-FCN [3] on each dataset.
Conclusion
We propose an approach for adding context to a state-of-the-art object detection framework, and demonstrate its effectiveness on benchmark datasets. While we expect many improvements in finding more efficient and effective ways to combine the features from the encoder and decoder, our model still achieves state-of-the-art detection results on PASCAL VOC and COCO. Our new DSSD model is able to outperform the previous SSD framework, especially on small object or context specific objects, while still preserving comparable speed to other detectors. While we only apply our encoder-decoder hourglass model to the SSD framework, this approach can be applied to other detection methods, such as the R-CNN series methods [12, 11, 24], as well.
Paper
Cheng-Yang Fu, Wei Liu, Ananth Ranga, Ambrish Tyagi, Alexander C. Berg.
DSSD: Deconvolutional Single Shot Detector, CVPR 2017
https://arxiv.org/abs/1701.06659

1. DSSD Architecture
SSD and DSSD networks on a residual-network (ResNet) backbone: the blue modules are the layers added by the SSD framework, referred to as the SSD layers; in the figure below, the red layers are the DSSD layers.

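The red DSSD layers are built from the paper's deconvolution module, which merges a deeper (coarser) DSSD feature map with the next shallower SSD feature map. Below is a minimal PyTorch sketch of one such module following the paper's description; the channel width, kernel sizes, and the spatial-size alignment step are illustrative assumptions, not the authors' reference implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DeconvolutionModule(nn.Module):
    """Fuses a deeper DSSD feature map with a shallower SSD feature map."""

    def __init__(self, deep_channels, shallow_channels, out_channels=512):
        super().__init__()
        # Top-down path: learned 2x upsampling of the deeper feature map.
        self.top_down = nn.Sequential(
            nn.ConvTranspose2d(deep_channels, out_channels, kernel_size=2, stride=2),
            nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_channels),
        )
        # Lateral path: transforms the shallower SSD feature map.
        self.lateral = nn.Sequential(
            nn.Conv2d(shallow_channels, out_channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, deep, shallow):
        d = self.top_down(deep)
        s = self.lateral(shallow)
        # SSD feature-map sizes are not exact powers of two, so align spatially
        # before fusing (an assumption about how to handle the mismatch).
        if d.shape[-2:] != s.shape[-2:]:
            d = F.interpolate(d, size=s.shape[-2:])
        # The paper reports that elementwise product works better than sum here.
        return self.relu(d * s)
```

Stacking these modules from the deepest SSD feature map back toward the shallower ones yields the red DSSD layers in the figure, each of which is then fed to a prediction module.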
2. Features and Contributions of the DSSD Model
The SSD algorithm is not robust enough on small objects (it produces both false and missed detections); the main reason is that the shallow feature maps lack representational power. The core idea of DSSD is to strengthen the representation of these shallow layers.
DSSD makes two main improvements over the original SSD model:
First, it replaces the VGG backbone with ResNet-101 for feature extraction and uses an updated prediction module when predicting default boxes on feature maps of different scales (a sketch of this prediction module follows the list);
Second, it adds several deconvolution layers (the deconvolution modules sketched above) at the back end of the network to enrich the contextual information available to the low-level feature maps, which noticeably improves the detection of small objects.
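The updated prediction unit mentioned in the first point is the paper's residual prediction module: instead of predicting directly from each feature map as SSD does, a small residual block is inserted before the per-layer classification and box-regression convolutions. A hedged PyTorch sketch follows; the 256/1024 channel widths mirror the paper's description, while the 3×3 head kernels and other details are assumptions.

```python
import torch
import torch.nn as nn

class ResidualPredictionModule(nn.Module):
    """One residual block in front of SSD-style classification/localization heads."""

    def __init__(self, in_channels, num_anchors, num_classes):
        super().__init__()
        self.residual = nn.Sequential(
            nn.Conv2d(in_channels, 256, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 1024, kernel_size=1),
        )
        # Skip connection projects the input feature map to the same width.
        self.skip = nn.Conv2d(in_channels, 1024, kernel_size=1)
        self.relu = nn.ReLU(inplace=True)
        # Per default box: class scores plus 4 box-regression offsets.
        self.cls_head = nn.Conv2d(1024, num_anchors * num_classes, kernel_size=3, padding=1)
        self.loc_head = nn.Conv2d(1024, num_anchors * 4, kernel_size=3, padding=1)

    def forward(self, x):
        feat = self.relu(self.residual(x) + self.skip(x))
        return self.cls_head(feat), self.loc_head(feat)
```

Each DSSD feature map (the output of a deconvolution module) is passed through its own copy of such a module to produce the final class scores and box offsets.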
Detailed Explanation of the DSSD Architecture
To be updated……