
Introduction to the COCO Dataset

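Contents

  • 1. Introduction to the COCO dataset
  • 2. COCO annotation format
    • 2.1 Instance segmentation: Object Instance file format
      • 2.1.1 Contents of info
      • 2.1.2 Contents of licenses
      • 2.1.3 Contents of images
      • 2.1.4 Contents of annotations
      • 2.1.5 Contents of categories
    • 2.2 Keypoint detection: Object Keypoint file format
      • 2.2.1 Contents of annotations
      • 2.2.2 Contents of categories
    • 2.3 Image captioning: Image Caption file format
      • 2.3.1 Contents of annotations
  • References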

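The main purpose of this article is to get familiar with the COCO dataset.

1. Introduction to the COCO dataset

Two links are worth starting with: the first, and the second.

Between them they cover what is needed to understand COCO.

The dataset is laid out as follows: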

# Step 1: download the dataset
2017 Train images [118K/18GB]
2017 Val images [5K/1GB]
2017 Test images [41K/6GB]
2017 Train/Val annotations [241MB]

# Step 2: arrange the folders in the following structure
coco
  ├── annotations
  │   ├── instances_train2014.json
  │   ├── instances_train2017.json
  │   ├── instances_val2014.json
  │   ├── instances_val2017.json
  │   |   ...
  ├── train2017
  │   ├── 000000000009.jpg
  │   ├── 000000580008.jpg
  │   |   ...
  ├── val2017
  │   ├── 000000000139.jpg
  │   ├── 000000000285.jpg
  │   |   ...
  |   ...
  
           

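2. COCO annotation format

This section mainly follows https://zhuanlan.zhihu.com/p/70878433 and reorganizes its content. Reading the original works too; I just found it a little disorganized, which makes it hard to get an overall grasp of the dataset.

COCO relied heavily on Amazon Mechanical Turk to collect its data. The dataset currently has three main annotation types:

  • object instance: object instances
  • object keypoints: object keypoints
  • image captions: image captioning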

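Annotations are stored as JSON files. The annotation files for the train and val splits of COCO 2017 ship in the archive annotations_trainval2017.zip, which unpacks into an annotations folder. There are three annotation types, each with a training file and a validation file, for six JSON files in total.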

2.1 Instance segmentation: Object Instance file format

Taking instances_val2017.json as an example (the validation file is smaller and opens faster), the overall format is:

{
    "info": info,
    "licenses": [license],
    "images":[image],
    "annotations":[annotation],
    "categories":[category]
}
           
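  • images is a list whose length equals the number of images in the training (or validation) split.
  • annotations is also a list; its length equals the number of bounding boxes in the training (or validation) split.
  • categories is also a list; its length equals the number of categories in the dataset, which is 80 for COCO 2017.

Opened in VS Code, the whole JSON file turns out to be one large dictionary.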
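
The same layout can be checked from Python by loading the file and printing the size of each top-level key. The sketch below is only an illustration: `summarize_coco` is a hypothetical helper name, and the small in-memory sample stands in for the real file, which would normally be read with `json.load` from wherever the annotations folder was unpacked.

```python
import json

def summarize_coco(coco):
    """Return {key: size} for each top-level key of a COCO-style dict."""
    summary = {}
    for key, value in coco.items():
        # Lists and dicts report their length; anything else counts as 1.
        summary[key] = len(value) if isinstance(value, (list, dict)) else 1
    return summary

# Tiny stand-in for the real file (assumption: same top-level keys).
sample_text = json.dumps({
    "info": {"year": 2017, "version": "1.0"},
    "licenses": [{"id": 1}],
    "images": [{"id": 397133}, {"id": 37777}],
    "annotations": [{"id": 1768, "image_id": 397133}],
    "categories": [{"id": 18, "name": "dog"}],
})
coco = json.loads(sample_text)
summary = summarize_coco(coco)
for key, size in summary.items():
    print(f"{key}: {type(coco[key]).__name__} with {size} entries")
```

On the real instances file the images, annotations and categories counts are the list lengths described in the bullets above.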

[Figure: the same file viewed in JupyterLab]

2.1.1 Contents of info

"info": {
        "description": "COCO 2017 Dataset",
        "url": "http://cocodataset.org",
        "version": "1.0",
        "year": 2017,
        "contributor": "COCO Consortium",
        "date_created": "2017/09/01"
    },
           
info holds basic information such as the date, version and contributor. It is of little practical value and can be ignored.

2.1.2 Contents of licenses

The list is short, so here it is in full:

"licenses": [
        {
            "url": "http://creativecommons.org/licenses/by-nc-sa/2.0/",
            "id": 1,
            "name": "Attribution-NonCommercial-ShareAlike License"
        },
        {
            "url": "http://creativecommons.org/licenses/by-nc/2.0/",
            "id": 2,
            "name": "Attribution-NonCommercial License"
        },
        {
            "url": "http://creativecommons.org/licenses/by-nc-nd/2.0/",
            "id": 3,
            "name": "Attribution-NonCommercial-NoDerivs License"
        },
        {
            "url": "http://creativecommons.org/licenses/by/2.0/",
            "id": 4,
            "name": "Attribution License"
        },
        {
            "url": "http://creativecommons.org/licenses/by-sa/2.0/",
            "id": 5,
            "name": "Attribution-ShareAlike License"
        },
        {
            "url": "http://creativecommons.org/licenses/by-nd/2.0/",
            "id": 6,
            "name": "Attribution-NoDerivs License"
        },
        {
            "url": "http://flickr.com/commons/usage/",
            "id": 7,
            "name": "No known copyright restrictions"
        },
        {
            "url": "http://www.usa.gov/copyright.shtml",
            "id": 8,
            "name": "United States Government Work"
        }
    ],
           
There are eight entries in total. These are of little practical value as well and can be ignored.

2.1.3 Contents of images

There are many entries; here are a few:

"images": [
        {
            "license": 4,
            "file_name": "000000397133.jpg",
            "coco_url": "http://images.cocodataset.org/val2017/000000397133.jpg",
            "height": 427,
            "width": 640,
            "date_captured": "2013-11-14 17:02:52",
            "flickr_url": "http://farm7.staticflickr.com/6116/6255196340_da26cf2c9e_z.jpg",
            "id": 397133
        },
        {
            "license": 1,
            "file_name": "000000037777.jpg",
            "coco_url": "http://images.cocodataset.org/val2017/000000037777.jpg",
            "height": 230,
            "width": 352,
            "date_captured": "2013-11-14 20:55:31",
            "flickr_url": "http://farm9.staticflickr.com/8429/7839199426_f6d48aa585_z.jpg",
            "id": 37777
        },
        {
            "license": 4,
            "file_name": "000000252219.jpg",
            "coco_url": "http://images.cocodataset.org/val2017/000000252219.jpg",
            "height": 428,
            "width": 640,
            "date_captured": "2013-11-14 22:32:02",
            "flickr_url": "http://farm4.staticflickr.com/3446/3232237447_13d84bd0a1_z.jpg",
            "id": 252219
        },
           
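[Figure: the images list viewed in Jupyter]

images is a list in which every element is a dictionary holding the information for one image. The fields are:

  • license: id of an entry in licenses; rarely needed
  • file_name: the image file name
  • coco_url: URL of the image on the COCO server; rarely needed
  • height: image height in pixels
  • width: image width in pixels
  • date_captured: capture timestamp; rarely needed
  • flickr_url: the original Flickr URL; rarely needed
  • id: the image's ID, unique to each image

Of these, height, width, file_name and id are the four that matter most.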
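
Since annotations refer to images by id, a common first step is to index the images list by that field. The following is a minimal sketch, assuming the folder layout from section 1; `index_images` is a hypothetical helper name and the two records are trimmed copies of the excerpt above.

```python
from pathlib import PurePosixPath

def index_images(images):
    """Map image id -> image record for constant-time lookup."""
    return {img["id"]: img for img in images}

# Two records from the excerpt above, reduced to the four important fields.
images = [
    {"file_name": "000000397133.jpg", "height": 427, "width": 640, "id": 397133},
    {"file_name": "000000037777.jpg", "height": 230, "width": 352, "id": 37777},
]
by_id = index_images(images)

img = by_id[37777]
# "coco/val2017" is an assumed location, taken from the folder layout in section 1.
path = str(PurePosixPath("coco") / "val2017" / img["file_name"])
print(path, img["width"], img["height"])
```

In the excerpt the file name happens to be the id zero-padded to 12 digits, but reading file_name from the record avoids depending on that.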

2.1.4 Contents of annotations

There are many entries; here are a few:

"annotations": [
        {
            "segmentation": [
                [
                    510.66,
                    423.01,
                    511.72,
                    ...
                    423.01,
                    510.45,
                    423.01
                ]
            ],
            "area": 702.1057499999998,
            "iscrowd": 0,
            "image_id": 289343,
            "bbox": [
                473.07,
                395.93,
                38.65,
                28.67
            ],
            "category_id": 18,
            "id": 1768
        },
        {
            "segmentation": [
                [
                    289.74,
                    443.39,
                    302.29,
                   ...
                    444.27,
                    291.88,
                    443.74
                ]
            ],
            "area": 27718.476299999995,
            "iscrowd": 0,
            "image_id": 61471,
            "bbox": [
                272.1,
                200.23,
                151.97,
                279.77
            ],
            "category_id": 18,
            "id": 1773
        },
        ......
        {
            "segmentation": {
                "counts": [
                    272,
                    2,
                    4,
                    4,
                   ...
                    16,
                    228,
                    8,
                    10250
                ],
                "size": [
                    240,
                    320
                ]
            },
            "area": 18419,
            "iscrowd": 1,
            "image_id": 448263,
            "bbox": [
                1,
                0,
                276,
                122
            ],
            "category_id": 1,
            "id": 900100448263
        },
        
           
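[Figure: the annotations list viewed in Jupyter]

annotations is the most important part of this JSON file. It is an array of annotation instances, and each annotation contains the following fields:

  • segmentation: the segmentation label
  • area: the area of the region
  • iscrowd: whether the annotation covers a group of objects
  • image_id: matches the id of an entry in images
  • bbox: the bounding box
  • category_id: the category
  • id: the identifier of this annotation

Overall, an annotation has the following format: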

annotation{
    "segmentation": RLE or [polygon],
    "area": float,
    "iscrowd": 0 or 1,
    "image_id": int,
    "bbox": [x,y,width,height],
    "category_id": int,
    "id": int
}
           

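Note that a single object (iscrowd=0) may need several polygons, for example when part of it is occluded in the image. When iscrowd=1 the annotation covers a group of objects (a crowd of people, say) and segmentation is in RLE format. In other words, iscrowd=0 means segmentation is a list of polygons, and iscrowd=1 means segmentation is RLE. Whether iscrowd is 0 or 1, every annotation has a rectangular bbox, given as the coordinates of its top-left corner followed by its width and height.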
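
In code, the two segmentation formats can be told apart by their Python type after loading the JSON, and the bbox can be converted to corner coordinates when a library expects those. This is a sketch under the rules just described; `segmentation_kind` and `xywh_to_xyxy` are hypothetical helper names, and both sample annotations are shortened stand-ins rather than real records.

```python
def segmentation_kind(ann):
    """Return 'polygon' for a list of polygons, 'rle' for a counts/size dict."""
    seg = ann["segmentation"]
    if isinstance(seg, list):
        return "polygon"
    if isinstance(seg, dict) and "counts" in seg and "size" in seg:
        return "rle"
    raise ValueError("unrecognized segmentation format")

def xywh_to_xyxy(bbox):
    """Convert [x, y, width, height] to [x_min, y_min, x_max, y_max]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]

# Made-up triangle, for illustration only.
polygon_ann = {"iscrowd": 0, "segmentation": [[10, 10, 20, 10, 15, 20]],
               "bbox": [10, 10, 10, 10]}
# Crowd annotation with a truncated counts list, for illustration only.
crowd_ann = {"iscrowd": 1, "segmentation": {"counts": [272, 2, 4, 4], "size": [240, 320]},
             "bbox": [1, 0, 276, 122]}

for ann in (polygon_ann, crowd_ann):
    print(ann["iscrowd"], segmentation_kind(ann), xywh_to_xyxy(ann["bbox"]))
```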

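In polygon format, segmentation is a two-dimensional list, that is, a list of polygons. The numbers inside are the pixel coordinates of the object's outline, stored as alternating x and y values, so they always come in pairs, as the excerpts above show. The RLE format looks like this: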

segmentation : 
{
    'counts': [272, 2, 4, 4, 4, 4, 2, 9, 1, 2, 16, 43, 143, 24......], 
    'size': [240, 320]
}
           

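The RLE masks in the COCO annotation files all use the uncompressed RLE format (as opposed to compressed RLE). The size of an RLE grows with the number of pixels on the mask boundary. The benefit of RLE is that computing the area of a region, or the union and intersection of two regions, directly on the encoding is very efficient.

The counts and size arrays in the segmentation above together define the segmentation mask for the image. size gives the image's height and width. Every pixel in the image is either inside the annotated region or in the background, which is a boolean value: 1 (true) for a pixel inside the region and 0 (false) for a background pixel. A 240x320 image has 76800 pixels, so recording whether each pixel is inside the region gives 76800 bits, for example (a made-up example, unrelated to the array above): 00000111100111110…

Writing the bits out like this obviously wastes space. It is enough to write down how many 0s or 1s appear in each run (run-length encoding), which turns the example into 5, 4, 2, 5, 1, … That list of run lengths is the counts array shown above.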
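
The run-length idea can be reproduced in a few lines of plain Python using the made-up bit string from the text. This is a sketch of the principle only, not the official implementation: the function names are hypothetical, and it works on an already flattened sequence. To my knowledge the official tools flatten the mask column by column and always start counts with the run of zeros; the second convention is assumed here, the first is ignored.

```python
def rle_encode(bits):
    """Run-length encode a sequence of 0/1 values.

    Assumed convention: counts starts with the run of zeros, so a
    sequence that begins with 1 gets a leading count of 0.
    """
    counts = []
    current = 0  # value of the run being counted; the first run is zeros
    run = 0
    for bit in bits:
        if bit == current:
            run += 1
        else:
            counts.append(run)
            current = bit
            run = 1
    counts.append(run)
    return counts

def rle_decode(counts):
    """Expand run lengths back into a flat list of 0/1 values."""
    bits = []
    value = 0
    for run in counts:
        bits.extend([value] * run)
        value = 1 - value
    return bits

def rle_area(counts):
    """Number of foreground pixels: the sum of the runs of ones."""
    return sum(counts[1::2])

bits = [int(c) for c in "00000111100111110"]
counts = rle_encode(bits)
print(counts)            # [5, 4, 2, 5, 1]
print(rle_area(counts))  # 9
```

Summing every second entry gives the area without decoding the mask, which is the kind of shortcut that makes RLE efficient for area computations.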

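area is the area of the segmentation in pixels. iscrowd=0 marks a single object and iscrowd=1 marks a group of objects. image_id is the id stored in images. bbox is the object's bounding box. category_id is the number of the category, one of 80. id is not the same as the id in images: here it is just the identifier of each annotation.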
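
Because image_id ties each annotation back to an image, training code usually groups the flat annotations list per image. A minimal sketch, with `group_by_image` as a hypothetical helper name; the first three records are trimmed from the excerpt above and the fourth is invented to show an image with two annotations.

```python
from collections import defaultdict

def group_by_image(annotations):
    """Map image_id -> list of annotations belonging to that image."""
    groups = defaultdict(list)
    for ann in annotations:
        groups[ann["image_id"]].append(ann)
    return dict(groups)

annotations = [
    {"id": 1768, "image_id": 289343, "category_id": 18, "iscrowd": 0},
    {"id": 1773, "image_id": 61471, "category_id": 18, "iscrowd": 0},
    {"id": 900100448263, "image_id": 448263, "category_id": 1, "iscrowd": 1},
    # Hypothetical extra record, not taken from the dataset.
    {"id": 999, "image_id": 61471, "category_id": 1, "iscrowd": 0},
]
per_image = group_by_image(annotations)
for image_id, anns in per_image.items():
    print(image_id, [a["id"] for a in anns])
```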

2.1.5 Contents of categories

As follows:

"categories": [
        {
            "supercategory": "person",
            "id": 1,
            "name": "person"
        },
        {
            "supercategory": "vehicle",
            "id": 2,
            "name": "bicycle"
        },
        {
            "supercategory": "vehicle",
            "id": 3,
            "name": "car"
        },
        {
            "supercategory": "vehicle",
            "id": 4,
            "name": "motorcycle"
        },
        {
            "supercategory": "vehicle",
            "id": 5,
            "name": "airplane"
        },
        {
            "supercategory": "vehicle",
            "id": 6,
            "name": "bus"
        },
        {
            "supercategory": "vehicle",
            "id": 7,
            "name": "train"
        },
        {
            "supercategory": "vehicle",
            "id": 8,
            "name": "truck"
        },
        {
            "supercategory": "vehicle",
            "id": 9,
            "name": "boat"
        },
        {
            "supercategory": "outdoor",
            "id": 10,
            "name": "traffic light"
        },
        {
            "supercategory": "outdoor",
            "id": 11,
            "name": "fire hydrant"
        },
        {
            "supercategory": "outdoor",
            "id": 13,
            "name": "stop sign"
        },
        {
            "supercategory": "outdoor",
            "id": 14,
            "name": "parking meter"
        },
        {
            "supercategory": "outdoor",
            "id": 15,
            "name": "bench"
        },
        {
            "supercategory": "animal",
            "id": 16,
            "name": "bird"
        },
        {
            "supercategory": "animal",
            "id": 17,
            "name": "cat"
        },
        {
            "supercategory": "animal",
            "id": 18,
            "name": "dog"
        },
        {
            "supercategory": "animal",
            "id": 19,
            "name": "horse"
        },
        {
            "supercategory": "animal",
            "id": 20,
            "name": "sheep"
        },
        {
            "supercategory": "animal",
            "id": 21,
            "name": "cow"
        },
        {
            "supercategory": "animal",
            "id": 22,
            "name": "elephant"
        },
        {
            "supercategory": "animal",
            "id": 23,
            "name": "bear"
        },
        {
            "supercategory": "animal",
            "id": 24,
            "name": "zebra"
        },
        {
            "supercategory": "animal",
            "id": 25,
            "name": "giraffe"
        },
        {
            "supercategory": "accessory",
            "id": 27,
            "name": "backpack"
        },
        {
            "supercategory": "accessory",
            "id": 28,
            "name": "umbrella"
        },
        {
            "supercategory": "accessory",
            "id": 31,
            "name": "handbag"
        },
        {
            "supercategory": "accessory",
            "id": 32,
            "name": "tie"
        },
        {
            "supercategory": "accessory",
            "id": 33,
            "name": "suitcase"
        },
        {
            "supercategory": "sports",
            "id": 34,
            "name": "frisbee"
        },
        {
            "supercategory": "sports",
            "id": 35,
            "name": "skis"
        },
        {
            "supercategory": "sports",
            "id": 36,
            "name": "snowboard"
        },
        {
            "supercategory": "sports",
            "id": 37,
            "name": "sports ball"
        },
        {
            "supercategory": "sports",
            "id": 38,
            "name": "kite"
        },
        {
            "supercategory": "sports",
            "id": 39,
            "name": "baseball bat"
        },
        {
            "supercategory": "sports",
            "id": 40,
            "name": "baseball glove"
        },
        {
            "supercategory": "sports",
            "id": 41,
            "name": "skateboard"
        },
        {
            "supercategory": "sports",
            "id": 42,
            "name": "surfboard"
        },
        {
            "supercategory": "sports",
            "id": 43,
            "name": "tennis racket"
        },
        {
            "supercategory": "kitchen",
            "id": 44,
            "name": "bottle"
        },
        {
            "supercategory": "kitchen",
            "id": 46,
            "name": "wine glass"
        },
        {
            "supercategory": "kitchen",
            "id": 47,
            "name": "cup"
        },
        {
            "supercategory": "kitchen",
            "id": 48,
            "name": "fork"
        },
        {
            "supercategory": "kitchen",
            "id": 49,
            "name": "knife"
        },
        {
            "supercategory": "kitchen",
            "id": 50,
            "name": "spoon"
        },
        {
            "supercategory": "kitchen",
            "id": 51,
            "name": "bowl"
        },
        {
            "supercategory": "food",
            "id": 52,
            "name": "banana"
        },
        {
            "supercategory": "food",
            "id": 53,
            "name": "apple"
        },
        {
            "supercategory": "food",
            "id": 54,
            "name": "sandwich"
        },
        {
            "supercategory": "food",
            "id": 55,
            "name": "orange"
        },
        {
            "supercategory": "food",
            "id": 56,
            "name": "broccoli"
        },
        {
            "supercategory": "food",
            "id": 57,
            "name": "carrot"
        },
        {
            "supercategory": "food",
            "id": 58,
            "name": "hot dog"
        },
        {
            "supercategory": "food",
            "id": 59,
            "name": "pizza"
        },
        {
            "supercategory": "food",
            "id": 60,
            "name": "donut"
        },
        {
            "supercategory": "food",
            "id": 61,
            "name": "cake"
        },
        {
            "supercategory": "furniture",
            "id": 62,
            "name": "chair"
        },
        {
            "supercategory": "furniture",
            "id": 63,
            "name": "couch"
        },
        {
            "supercategory": "furniture",
            "id": 64,
            "name": "potted plant"
        },
        {
            "supercategory": "furniture",
            "id": 65,
            "name": "bed"
        },
        {
            "supercategory": "furniture",
            "id": 67,
            "name": "dining table"
        },
        {
            "supercategory": "furniture",
            "id": 70,
            "name": "toilet"
        },
        {
            "supercategory": "electronic",
            "id": 72,
            "name": "tv"
        },
        {
            "supercategory": "electronic",
            "id": 73,
            "name": "laptop"
        },
        {
            "supercategory": "electronic",
            "id": 74,
            "name": "mouse"
        },
        {
            "supercategory": "electronic",
            "id": 75,
            "name": "remote"
        },
        {
            "supercategory": "electronic",
            "id": 76,
            "name": "keyboard"
        },
        {
            "supercategory": "electronic",
            "id": 77,
            "name": "cell phone"
        },
        {
            "supercategory": "appliance",
            "id": 78,
            "name": "microwave"
        },
        {
            "supercategory": "appliance",
            "id": 79,
            "name": "oven"
        },
        {
            "supercategory": "appliance",
            "id": 80,
            "name": "toaster"
        },
        {
            "supercategory": "appliance",
            "id": 81,
            "name": "sink"
        },
        {
            "supercategory": "appliance",
            "id": 82,
            "name": "refrigerator"
        },
        {
            "supercategory": "indoor",
            "id": 84,
            "name": "book"
        },
        {
            "supercategory": "indoor",
            "id": 85,
            "name": "clock"
        },
        {
            "supercategory": "indoor",
            "id": 86,
            "name": "vase"
        },
        {
            "supercategory": "indoor",
            "id": 87,
            "name": "scissors"
        },
        {
            "supercategory": "indoor",
            "id": 88,
            "name": "teddy bear"
        },
        {
            "supercategory": "indoor",
            "id": 89,
            "name": "hair drier"
        },
        {
            "supercategory": "indoor",
            "id": 90,
            "name": "toothbrush"
        }
    ]
           

Category ids run from 1 to 90, but some numbers are skipped, so there are only 80 categories.
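
Because of the gaps, category_id cannot be used directly as a class index in a classifier with 80 outputs. One common workaround, sketched here as an assumption rather than anything the dataset prescribes, is to sort the ids and map them to contiguous indices; `build_label_maps` is a hypothetical helper name and the three categories are a subset of the list above.

```python
def build_label_maps(categories):
    """Map sparse category ids to contiguous 0-based indices, and back to names."""
    ordered = sorted(categories, key=lambda c: c["id"])
    id_to_index = {c["id"]: i for i, c in enumerate(ordered)}
    index_to_name = [c["name"] for c in ordered]
    return id_to_index, index_to_name

# Subset of the categories list above; note the jump from 11 to 13.
categories = [
    {"supercategory": "outdoor", "id": 11, "name": "fire hydrant"},
    {"supercategory": "outdoor", "id": 13, "name": "stop sign"},
    {"supercategory": "person", "id": 1, "name": "person"},
]
id_to_index, index_to_name = build_label_maps(categories)
print(id_to_index)
print(index_to_name[id_to_index[13]])
```

Applied to the full list, the same function would yield indices 0 to 79.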

2.2 Keypoint detection: Object Keypoint file format

The files person_keypoints_train2017.json and person_keypoints_val2017.json use this format. The overall file format is:

{
    "info": info,
    "licenses": [license],
    "images": [image],
    "annotations": [annotation],
    "categories": [category]
}
           
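This is the same as instances_val2017.json. The info, licenses and images sections are defined identically across the different JSON files; annotations and categories are the two sections that differ from one file type to another.

  • images is a list whose length equals the number of images in the training (or validation) split.
  • annotations is also a list; its length equals the number of bounding boxes in the training (or validation) split, and here only boxes of the person category are included.
  • categories is also a list; its length equals the number of categories, which here is 1, since person is the only class.

Identical content is not repeated here; only the differences are listed.

2.2.1 Contents of annotations

The annotation structure for this type contains every field of the object instance annotation plus two more. The new keypoints field is an array of length 3*k, where k is the number of keypoints defined for the category. Each keypoint takes three consecutive values: the first two are the x and y coordinates and the third is a visibility flag v. v=0 means the keypoint is not labeled (in which case x=y=v=0), v=1 means it is labeled but not visible (occluded), and v=2 means it is labeled and visible. num_keypoints is the number of labeled keypoints (v>0) on the object; on small objects it may be impossible to label any keypoints.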

annotation{
   "segmentation": RLE or [polygon],
   "num_keypoints": int,    
   "area": float,
   "iscrowd": 0 or 1,
   "keypoints": [x1,y1,v1,...],
   "image_id": int,
   "bbox": [x,y,width,height],
   "category_id": int,
   "id": int 
}
           

One example:

{
      "segmentation": [
          [
              492.38,
              238.33,
              491.91,
              234.15,
              494.47,
              227.65,
              495.17,
              215.1,
              497.02,
              199.54,
              503.53,
              197.22,
              503.3,
              194.43,
              503.3,
              190.95,
              506.08,
              183.51,
              511.89,
              185.84,
              514.21,
              187,
              514.21,
              196.29,
              521.88,
              200.7,
              526.76,
              216.03,
              520.25,
              227.65,
              519.56,
              234.38,
              519.09,
              239.49,
              519.09,
              244.84,
              519.56,
              246.93,
              518.16,
              248.32,
              516.3,
              256.91,
              510.03,
              256.45,
              513.28,
              240.89
          ]
      ],
      "num_keypoints": 13,
      "area": 1394.7431,
      "iscrowd": 0,
      "keypoints": [
          508,
          192,
          2,
          510,
          191,
          2,
          506,
          191,
          2,
          512,
          192,
          2,
          503,
          192,
          1,
          515,
          202,
          2,
          499,
          202,
          2,
          524,
          214,
          2,
          497,
          215,
          2,
          516,
          226,
          2,
          496,
          224,
          2,
          511,
          232,
          2,
          497,
          230,
          2,
          0,
          0,
          0,
          0,
          0,
          0,
          0,
          0,
          0,
          0,
          0,
          0
      ],
      "image_id": 440475,
      "bbox": [
          491.91,
          183.51,
          34.85,
          73.4
      ],
      "category_id": 1,
      "id": 183302
  }
           

The keypoints array holds 51 values, so there are 17 keypoints in total.
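
The flat array is easier to work with once it is split into (x, y, v) triples. The sketch below uses the keypoints values from the example above and checks them against its num_keypoints of 13; `split_keypoints` and `count_labeled` are hypothetical helper names.

```python
def split_keypoints(flat):
    """Split a flat [x1, y1, v1, x2, y2, v2, ...] list into (x, y, v) triples."""
    if len(flat) % 3 != 0:
        raise ValueError("keypoints length must be a multiple of 3")
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]

def count_labeled(flat):
    """Count keypoints with v > 0, which is what num_keypoints stores."""
    return sum(1 for _, _, v in split_keypoints(flat) if v > 0)

# The keypoints array of the example annotation above.
keypoints = [
    508, 192, 2, 510, 191, 2, 506, 191, 2, 512, 192, 2, 503, 192, 1,
    515, 202, 2, 499, 202, 2, 524, 214, 2, 497, 215, 2, 516, 226, 2,
    496, 224, 2, 511, 232, 2, 497, 230, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0,
]
triples = split_keypoints(keypoints)
visible = [t for t in triples if t[2] == 2]
print(len(triples), count_labeled(keypoints), len(visible))
```

The last four triples are all zeros, meaning those keypoints were not labeled on this person.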

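2.2.2 Contents of categories

Compared with the category in the object instance format, two fields are added. keypoints is an array of length k holding the name of each keypoint. skeleton defines which keypoints are connected (for example, a person's left wrist and left elbow are connected, but the left wrist and right wrist are not). At present COCO keypoints are annotated only for the person category. The definition is: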

{
    "supercategory": str,
    "id": int,
    "name": str,    
    "keypoints": [str],
    "skeleton": [edge]
}
           

Concretely:

"categories": [
    {
        "supercategory": "person",
        "id": 1,
        "name": "person",
        "keypoints": [
            "nose",
            "left_eye",
            "right_eye",
            "left_ear",
            "right_ear",
            "left_shoulder",
            "right_shoulder",
            "left_elbow",
            "right_elbow",
            "left_wrist",
            "right_wrist",
            "left_hip",
            "right_hip",
            "left_knee",
            "right_knee",
            "left_ankle",
            "right_ankle"
        ],
        "skeleton": [
            [
                16,
                14
            ],
            [
                14,
                12
            ],
            [
                17,
                15
            ],
            [
                15,
                13
            ],
            [
                12,
                13
            ],
            [
                6,
                12
            ],
            [
                7,
                13
            ],
            [
                6,
                7
            ],
            [
                6,
                8
            ],
            [
                7,
                9
            ],
            [
                8,
                10
            ],
            [
                9,
                11
            ],
            [
                2,
                3
            ],
            [
                1,
                2
            ],
            [
                1,
                3
            ],
            [
                2,
                4
            ],
            [
                3,
                5
            ],
            [
                4,
                6
            ],
            [
                5,
                7
            ]
        ]
    }
]
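
The skeleton pairs are indices into the keypoints name list, and they count from 1 rather than 0. A small sketch that resolves each edge to a pair of names follows; `skeleton_names` is a hypothetical helper name, and the data is copied from the category above.

```python
def skeleton_names(keypoint_names, skeleton):
    """Resolve 1-based skeleton index pairs to pairs of keypoint names."""
    return [(keypoint_names[a - 1], keypoint_names[b - 1]) for a, b in skeleton]

keypoint_names = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]
skeleton = [
    [16, 14], [14, 12], [17, 15], [15, 13], [12, 13], [6, 12], [7, 13],
    [6, 7], [6, 8], [7, 9], [8, 10], [9, 11], [2, 3], [1, 2], [1, 3],
    [2, 4], [3, 5], [4, 6], [5, 7],
]
edges = skeleton_names(keypoint_names, skeleton)
for a, b in edges:
    print(f"{a} - {b}")
```

The first edge, [16, 14], resolves to left_ankle and left_knee, which confirms the 1-based reading: with 0-based indices the value 17 would fall outside the list.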
           
2.3 Image captioning: Image Caption file format

The files captions_train2017.json and captions_val2017.json use this format. An Image Caption file consists of the following sections, in order. It looks like the Object Instance format but has no categories field at the end:

{
    "info": info,
    "licenses": [license],
    "images": [image],
    "annotations": [annotation]
}
           

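The info, licenses and images structures are the same across the different JSON files; their definitions are shared. The annotations structure is not shared and differs from one file type to another.

  • annotations: there are more entries than images, because one image can have several captions.

2.3.1 Contents of annotations

The annotation in this type stores sentences that describe an image. Each sentence describes the content of its image, and every image has at least 5 captions (some have more). annotation is defined as follows: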

annotation{
    "image_id": int,
    "id": int,
    "caption": str
}

           

A concrete excerpt:

{
        "image_id": 546219,
        "id": 396378,
        "caption": "A large group is sitting together and eating at a restaurant."
    },
    {
        "image_id": 546219,
        "id": 397413,
        "caption": "The people are gathered at the table for dinner."
    },
    {
        "image_id": 146155,
        "id": 397604,
        "caption": "Two  men standing near a bar drinking together"
    },
    {
        "image_id": 546219,
        "id": 399732,
        "caption": "A large group of people pose for a photo at dinner."
    },
    {
        "image_id": 546219,
        "id": 400023,
        "caption": "The diners are enjoying their various beverages with their meals.."
    }
           

The image_id here corresponds to the id in images.
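
Collecting all captions of one image is again a matter of grouping by image_id. A minimal sketch using the five entries from the excerpt above; `captions_by_image` is a hypothetical helper name.

```python
from collections import defaultdict

def captions_by_image(annotations):
    """Map image_id -> list of caption strings, in file order."""
    grouped = defaultdict(list)
    for ann in annotations:
        grouped[ann["image_id"]].append(ann["caption"])
    return dict(grouped)

# The five caption annotations from the excerpt above.
annotations = [
    {"image_id": 546219, "id": 396378,
     "caption": "A large group is sitting together and eating at a restaurant."},
    {"image_id": 546219, "id": 397413,
     "caption": "The people are gathered at the table for dinner."},
    {"image_id": 146155, "id": 397604,
     "caption": "Two  men standing near a bar drinking together"},
    {"image_id": 546219, "id": 399732,
     "caption": "A large group of people pose for a photo at dinner."},
    {"image_id": 546219, "id": 400023,
     "caption": "The diners are enjoying their various beverages with their meals.."},
]
grouped = captions_by_image(annotations)
for image_id, captions in grouped.items():
    print(image_id, len(captions))
```

Within this short excerpt image 546219 has four captions and image 146155 has one; the remaining captions for each image appear elsewhere in the file.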

References

  • https://zhuanlan.zhihu.com/p/70878433
  • https://blog.csdn.net/weixin_38293440/article/details/81196428
