Introduction to the YOLOv2 Algorithm (Paper Overview)
Abstract
We introduce YOLO9000, a state-of-the-art, real-time object detection system that can detect over 9000 object categories. First we propose various improvements to the YOLO detection method, both novel and drawn from prior work. The improved model, YOLOv2, is state-of-the-art on standard detection tasks like PASCAL VOC and COCO. At 67 FPS, YOLOv2 gets 76.8 mAP on VOC 2007. At 40 FPS, YOLOv2 gets 78.6 mAP, outperforming state-of-the-art methods like Faster RCNN with ResNet and SSD while still running significantly faster. Finally we propose a method to jointly train on object detection and classification. Using this method we train YOLO9000 simultaneously on the COCO detection dataset and the ImageNet classification dataset. Our joint training allows YOLO9000 to predict detections for object classes that don't have labelled detection data. We validate our approach on the ImageNet detection task. YOLO9000 gets 19.7 mAP on the ImageNet detection validation set despite only having detection data for 44 of the 200 classes. On the 156 classes not in COCO, YOLO9000 gets 16.0 mAP. But YOLO can detect more than just 200 classes; it predicts detections for more than 9000 different object categories. And it still runs in real-time.
Conclusion
We introduce YOLOv2 and YOLO9000, real-time detection systems. YOLOv2 is state-of-the-art and faster than other detection systems across a variety of detection datasets. Furthermore, it can be run at a variety of image sizes to provide a smooth tradeoff between speed and accuracy.
YOLO9000 is a real-time framework for detection more than 9000 object categories by jointly optimizing detection and classification. We use WordTree to combine data from various sources and our joint optimization technique to train simultaneously on ImageNet and COCO. YOLO9000 is a strong step towards closing the dataset size gap between detection and classification.
Many of our techniques generalize outside of object detection. Our WordTree representation of ImageNet offers a richer, more detailed output space for image classification. Dataset combination using hierarchical classification would be useful in the classification and segmentation domains. Training techniques like multi-scale training could provide benefit across a variety of visual tasks.
For future work we hope to use similar techniques for weakly supervised image segmentation. We also plan to improve our detection results using more powerful matching strategies for assigning weak labels to classification data during training. Computer vision is blessed with an enormous amount of labelled data. We will continue looking for ways to bring different sources and structures of data together to make stronger models of the visual world.
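The multi-scale training mentioned in the conclusion can be illustrated with a short sketch. As described in the paper, the network downsamples by a factor of 32, so the input size is drawn from multiples of 32 between 320 and 608, and a new size is picked every 10 batches. The function name `input_size_schedule` and the `seed` parameter are assumptions made for this illustration; this is not the authors' training code.

```python
import random

STRIDE = 32                      # total downsampling factor of the network
MIN_SIZE, MAX_SIZE = 320, 608    # size range reported in the paper
BATCHES_PER_RESIZE = 10          # a new size is drawn every 10 batches

# All candidate square input resolutions: 320, 352, ..., 608.
CANDIDATE_SIZES = list(range(MIN_SIZE, MAX_SIZE + 1, STRIDE))


def input_size_schedule(num_batches, seed=0):
    """Return the input size used for each batch (hypothetical helper)."""
    rng = random.Random(seed)
    sizes = []
    current = None
    for batch_idx in range(num_batches):
        # Re-draw the resolution at the start of every 10-batch window.
        if batch_idx % BATCHES_PER_RESIZE == 0:
            current = rng.choice(CANDIDATE_SIZES)
        sizes.append(current)
    return sizes


schedule = input_size_schedule(30)
print(CANDIDATE_SIZES)
print(schedule)
```

Because the same weights are trained at every resolution, the model can later be run at a small size for speed or a large size for accuracy, which is the tradeoff the conclusion refers to.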
Paper
Joseph Redmon, Ali Farhadi.
YOLO9000: Better, Faster, Stronger. CVPR 2017 (Best Paper Honorable Mention)
https://arxiv.org/abs/1612.08242
1. YOLOv2: Characteristics, Improvements, Strengths and Weaknesses
1. Characteristics of YOLOv2
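YOLOv2 is the second version of YOLO. Its goal is to improve accuracy significantly while also making the detector faster.
Compared with region-proposal-based detectors, YOLOv1 has higher localization error and lower recall (the fraction of ground-truth objects that the detector actually finds).
SSD is a strong competitor to YOLOv1 and, in some respects, achieves higher accuracy at real-time speeds.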
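To make the recall metric mentioned above concrete, here is a minimal sketch that counts a ground-truth box as found when some detection overlaps it with an IoU of at least 0.5. The box format `(x1, y1, x2, y2)`, the greedy one-to-one matching, and the function names are assumptions for illustration; benchmark evaluation code such as the PASCAL VOC toolkit is more involved.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall(ground_truth, detections, iou_threshold=0.5):
    """Fraction of ground-truth boxes matched by a detection.

    Simplified greedy matching: each detection can match at most one
    ground-truth box, and confidence scores are ignored.
    """
    used = set()
    matched = 0
    for gt in ground_truth:
        best_idx, best_iou = None, iou_threshold
        for idx, det in enumerate(detections):
            if idx in used:
                continue
            overlap = iou(gt, det)
            if overlap >= best_iou:
                best_idx, best_iou = idx, overlap
        if best_idx is not None:
            used.add(best_idx)
            matched += 1
    return matched / len(ground_truth) if ground_truth else 0.0


# Made-up example: the first object is found, the second is missed.
gt_boxes = [(0, 0, 10, 10), (20, 20, 30, 30)]
det_boxes = [(1, 1, 10, 10), (50, 50, 60, 60)]
print(recall(gt_boxes, det_boxes))  # 0.5
```

A detector with low recall, like YOLOv1 relative to proposal-based methods, leaves many ground-truth boxes unmatched in this sense.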
2. Improvements in YOLOv2
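YOLOv2: applies a series of techniques to improve on YOLOv1, raising accuracy while keeping the original speed.
YOLO9000: proposes a method for jointly training object classification and detection. It uses WordTree to mix data from detection and classification datasets and trains simultaneously on COCO and ImageNet, yielding YOLO9000, which detects more than 9000 object categories in real time.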
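The WordTree idea can be sketched in a few lines. In the paper, the model predicts a conditional probability for each node given its parent, using a separate softmax over each group of siblings, and the absolute probability of a label is the product of the conditional probabilities along the path to the root. The tiny hierarchy, the logit values, and the function names below are made up for illustration; the real WordTree is built from WordNet and has more than 9000 nodes.

```python
import math

# Hypothetical toy hierarchy: child -> parent (the root has parent None).
PARENT = {
    "physical object": None,
    "animal": "physical object",
    "vehicle": "physical object",
    "dog": "animal",
    "cat": "animal",
    "terrier": "dog",
    "husky": "dog",
}


def sibling_softmax(logits):
    """Softmax within each sibling group, giving Pr(node | parent)."""
    groups = {}
    for node, parent in PARENT.items():
        if parent is not None:
            groups.setdefault(parent, []).append(node)
    cond = {}
    for siblings in groups.values():
        peak = max(logits[n] for n in siblings)  # for numerical stability
        exps = {n: math.exp(logits[n] - peak) for n in siblings}
        total = sum(exps.values())
        for n in siblings:
            cond[n] = exps[n] / total
    return cond


def absolute_probability(node, cond, root_prob=1.0):
    """Multiply conditional probabilities along the path to the root.

    root_prob stands in for Pr(physical object): 1.0 for classification,
    or the objectness score of a box when used for detection.
    """
    prob = root_prob
    while PARENT[node] is not None:
        prob *= cond[node]
        node = PARENT[node]
    return prob


# Made-up network outputs, one logit per non-root node.
logits = {"animal": 2.0, "vehicle": 0.0, "dog": 1.0, "cat": 1.0,
          "terrier": 0.0, "husky": 0.0}
cond = sibling_softmax(logits)
print(absolute_probability("terrier", cond))
```

This structure is what lets classification-only images from ImageNet contribute to training: a label such as "dog" supervises the nodes on its path without forcing a choice among breeds the image was never labeled with.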