Object Detection: R-CNN

    • Paper outline
    • Main idea
    • Flowchart
    • Flowchart explanation
    • Key mathematical formulas
    • Key observations
    • References (R-CNN)
    • Comments and discussion welcome

Paper outline

Title: Rich feature hierarchies for accurate object detection and semantic segmentation

Abstract

1 Introduction

2 Object detection with R-CNN

2.1 Module design

2.2 Test-time detection

2.3 Training

2.4 Results on PASCAL VOC 2010-12

2.5 Results on ILSVRC2013 detection

3 Visualization, ablation, and modes of error

3.1 Visualizing learned features

3.2 Ablation studies

3.3 Network architectures

3.4 Detection error analysis

3.5 Bounding-box regression

3.6 Qualitative results

4 The ILSVRC2013 detection dataset

4.1 Dataset overview

4.2 Region proposals

4.3 Training data

4.4 Validation and evaluation

4.5 Ablation study

4.6 Relationship to OverFeat

5 Semantic segmentation

6 Conclusion

Appendix

A. Object proposal transformations

B. Positive vs. negative examples and softmax

C. Bounding-box regression

D. Additional feature visualizations

E. Per-category segmentation results

F. Analysis of cross-dataset redundancy

G. Document changelog

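Main idea

The approach mirrors how people look at a scene: to decide what an object is (classification) and where it is (localization), a person searches different regions of the whole image. The authors therefore generate multiple region proposals for each image, then decide for each one what it contains (classification) and in which direction the search should continue (bounding-box regression).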

Flowchart

[Figure: R-CNN pipeline flowchart]

Flowchart explanation

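(1) Take an input image.

(2) Extract about 2000 region proposals (using selective search).

(3) Resize each proposal and feed it to the CNN; every region proposal yields exactly one fixed-length feature vector (a one-to-one mapping).

(4) Classify each proposal's features with SVMs.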
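Steps (1) to (4) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `warp_nearest`, `detect`, the `extract_features` callback, and the `classifiers` dictionary are hypothetical names; the proposals (from selective search) and the CNN are assumed to be supplied by the caller; and the non-maximum suppression that the paper applies after scoring is left out. The default size of 227 matches the fixed CNN input size reported in the paper.

```python
def warp_nearest(image, box, out_size):
    """Crop box = (x1, y1, x2, y2) from a 2-D list `image` and resize the crop
    to out_size x out_size with nearest-neighbour sampling.
    x2 and y2 are exclusive bounds (an assumption of this sketch)."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    return [
        [image[y1 + (r * h) // out_size][x1 + (c * w) // out_size]
         for c in range(out_size)]
        for r in range(out_size)
    ]


def detect(image, proposals, extract_features, classifiers, out_size=227):
    """Score every proposal with every per-class linear classifier.

    extract_features: hypothetical stand-in for the CNN, patch -> list of floats
    classifiers: hypothetical dict, label -> (weights, bias) of a linear SVM
    Returns a list of (box, label, score) tuples.
    """
    detections = []
    for box in proposals:
        patch = warp_nearest(image, box, out_size)   # step (3): warp the proposal
        feat = extract_features(patch)               # step (3): one vector per proposal
        for label, (weights, bias) in classifiers.items():
            # step (4): linear SVM score = w . f + b
            score = sum(wi * fi for wi, fi in zip(weights, feat)) + bias
            detections.append((box, label, score))
    return detections
```

Because every proposal is warped and pushed through the feature extractor separately, the cost grows linearly with the number of proposals, which is the bottleneck the later discussion of OverFeat refers to.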

Key mathematical formulas

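Computing the predicted bounding box from the predicted regression values, where $P = (P_x, P_y, P_w, P_h)$ is the proposal (center coordinates, width, height) and $d_\star(P)$ are the predicted regression values:

$$
\begin{aligned}
\hat{G}_x &= P_w \, d_x(P) + P_x \\
\hat{G}_y &= P_h \, d_y(P) + P_y \\
\hat{G}_w &= P_w \exp\big(d_w(P)\big) \\
\hat{G}_h &= P_h \exp\big(d_h(P)\big)
\end{aligned}
$$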
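The box transformation from Appendix C of the paper (a scale-invariant shift of the center and a log-space scaling of width and height) can be written as a small function. This is a sketch for illustration; the function name `apply_deltas` and the tuple layout (center x, center y, width, height) are assumptions of this example.

```python
import math


def apply_deltas(P, d):
    """Turn a proposal P = (Px, Py, Pw, Ph) and predicted regression values
    d = (dx, dy, dw, dh) into a predicted box (Gx, Gy, Gw, Gh).
    Boxes are given as center coordinates plus width and height."""
    px, py, pw, ph = P
    dx, dy, dw, dh = d
    gx = pw * dx + px          # shift the center in units of the proposal width
    gy = ph * dy + py          # shift the center in units of the proposal height
    gw = pw * math.exp(dw)     # scale the width in log space
    gh = ph * math.exp(dh)     # scale the height in log space
    return (gx, gy, gw, gh)
```

All-zero regression values leave the proposal unchanged, which is why a regressor that predicts nothing useful does no harm.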

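The regression objective (loss function), where $d_\star(P) = \mathbf{w}_\star^{\mathsf{T}} \phi_5(P)$ is a linear function of the pool5 features $\phi_5(P)$ of proposal $P$ and $\star$ stands for one of $x, y, w, h$:

$$
\mathbf{w}_\star = \operatorname*{argmin}_{\hat{\mathbf{w}}_\star} \sum_{i}^{N} \left( t_\star^i - \hat{\mathbf{w}}_\star^{\mathsf{T}} \phi_5(P^i) \right)^2 + \lambda \left\lVert \hat{\mathbf{w}}_\star \right\rVert^2
$$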
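The objective is ordinary ridge regression (regularized least squares), fitted separately for each of the four coordinates. The sketch below only evaluates the objective for a given weight vector; it does not solve for the minimizer. The name `ridge_objective` and the plain-list data layout are assumptions of this example.

```python
def ridge_objective(w, features, targets, lam):
    """Regularized squared error for one coordinate (x, y, w or h).

    w: candidate weight vector (list of floats)
    features: list of feature vectors, one per training proposal
    targets: list of regression targets t, one per training proposal
    lam: regularization strength (lambda)
    """
    data_term = sum(
        (t - sum(wi * fi for wi, fi in zip(w, phi))) ** 2
        for phi, t in zip(features, targets)
    )
    reg_term = lam * sum(wi * wi for wi in w)
    return data_term + reg_term
```

For example, with `w = [1.0, 2.0]`, features `[[1.0, 0.0], [0.0, 1.0]]`, targets `[1.0, 1.0]` and `lam = 0.5`, the residuals are 0 and -1, so the data term is 1.0 and the penalty is 0.5 * 5 = 2.5.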

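The labels (regression targets) used when training the bounding-box regressor, where $G = (G_x, G_y, G_w, G_h)$ is the ground-truth box matched to proposal $P$:

$$
\begin{aligned}
t_x &= (G_x - P_x) / P_w \\
t_y &= (G_y - P_y) / P_h \\
t_w &= \log(G_w / P_w) \\
t_h &= \log(G_h / P_h)
\end{aligned}
$$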
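The training labels are the inverse of the box transformation: they are the regression values that would map a proposal exactly onto its ground-truth box. A minimal sketch, assuming boxes are given as (center x, center y, width, height) tuples; the function name `regression_targets` is hypothetical.

```python
import math


def regression_targets(P, G):
    """Compute (tx, ty, tw, th) for a proposal P = (Px, Py, Pw, Ph)
    and its matched ground-truth box G = (Gx, Gy, Gw, Gh)."""
    px, py, pw, ph = P
    gx, gy, gw, gh = G
    tx = (gx - px) / pw        # center offset, normalized by proposal width
    ty = (gy - py) / ph        # center offset, normalized by proposal height
    tw = math.log(gw / pw)     # log of the width ratio
    th = math.log(gh / ph)     # log of the height ratio
    return (tx, ty, tw, th)
```

Normalizing the offsets by the proposal size and taking logs of the size ratios makes the targets independent of the absolute scale of the object.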

Key observations

(1) At test time, we score each proposal and predict its new detection window only once. In principle, we could iterate this procedure (i.e., re-score the newly predicted bounding box, and then predict a new bounding box from it, and so on). However, we found that iterating does not improve results.

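So the authors did try iterative bounding-box regression, but it did not improve results. Many later works have pursued this direction, and the most successful so far is Cascade R-CNN ("Cascade R-CNN: Delving into High Quality Object Detection").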

(2) It is worth noting that OverFeat has a significant speed advantage over R-CNN: it is about 9x faster, based on a figure of 2 seconds per image quoted from [34]. This speed comes from the fact that OverFeat’s sliding windows (i.e., region proposals) are not warped at the image level and therefore computation can be easily shared between overlapping windows. Sharing is implemented by running the entire network in a convolutional fashion over arbitrary-sized inputs. Speeding up R-CNN should be possible in a variety of ways and remains as future work.

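The authors also point out that OverFeat, which uses sliding windows as its initial proposals, is about 9x faster than R-CNN, and they suggest that later speed-ups could follow this direction. Sure enough, that is what happened.

For raw speed, look at the one-stage detectors YOLO, SSD, and RetinaNet. The two-stage Faster R-CNN also uses sliding windows as its initial proposals, only under the name "anchors", with several anchors placed at each sliding-window position.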
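To make the link between sliding windows and anchors concrete, the sketch below enumerates anchors over a feature-map grid. The defaults (stride 16, three scales, three aspect ratios, so 9 anchors per position) follow the commonly cited Faster R-CNN configuration; the function name `generate_anchors` and the choice to center each anchor on its cell are assumptions of this example, not taken from the post.

```python
import math


def generate_anchors(feat_h, feat_w, stride=16,
                     scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return anchors as (cx, cy, w, h) tuples in image coordinates.

    Every position of the feat_h x feat_w grid acts as one sliding-window
    location and receives len(scales) * len(ratios) anchors.
    A ratio r means height / width; each anchor has area scale * scale.
    """
    anchors = []
    for iy in range(feat_h):
        for ix in range(feat_w):
            # Assumed convention: anchor centered on the grid cell
            cx = (ix + 0.5) * stride
            cy = (iy + 0.5) * stride
            for s in scales:
                for r in ratios:
                    w = s / math.sqrt(r)
                    h = s * math.sqrt(r)
                    anchors.append((cx, cy, w, h))
    return anchors
```

With these defaults, a 2 x 3 feature map produces 2 * 3 * 9 = 54 anchors, all computed from the same shared convolutional features rather than from separately warped image crops.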

References (R-CNN)

[1]: Rich feature hierarchies for accurate object detection and semantic segmentation

Comments and discussion welcome
