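Contents

1. Introduction to the COCO dataset
2. COCO annotation formats
- 2.1 Instance segmentation: Object Instance file format
  - 2.1.1 Contents of info
  - 2.1.2 Contents of licenses
  - 2.1.3 Contents of images
  - 2.1.4 Contents of annotations
  - 2.1.5 Contents of categories
- 2.2 Keypoint detection: Object Keypoint file format
  - 2.2.1 Contents of annotations
  - 2.2.2 Contents of categories
- 2.3 Image captioning: Image Caption file format
  - 2.3.1 Contents of annotations
References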

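This article is mainly a way to get familiar with the COCO dataset.

1. Introduction to the COCO dataset

To start, here are two links: the first and the second.

Those two links are enough to get to know COCO.

The dataset as a whole is organized as follows:

#step1: download the dataset
2017 Train images [118K/18GB]
2017 Val images [5K/1GB]
2017 Test images [41K/6GB]
2017 Train/Val annotations [241MB]

#step2: arrange the folders in the structure below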
coco
  ├── annotations
  │   ├── instances_train2014.json
  │   ├── instances_train2017.json
  │   ├── instances_val2014.json
  │   ├── instances_val2017.json
  │   |   ...
  ├── train2017
  │   ├── 000000000009.jpg
  │   ├── 000000580008.jpg
  │   |   ...
  ├── val2017
  │   ├── 000000000139.jpg
  │   ├── 000000000285.jpg
  │   |   ...
  |   ...

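2. COCO annotation formats

This part mainly follows https://zhuanlan.zhihu.com/p/70878433 and reorganizes it. Reading the original works too, but I found it somewhat disorganized, which makes it hard to get an overall grasp of the dataset.

The COCO dataset relied heavily on Amazon Mechanical Turk for data collection. It currently has three main annotation types:

object instances
object keypoints
image captions

Annotations are stored as JSON files. The train/val annotations for COCO 2017 ship as annotations_trainval2017.zip, which unpacks into an annotations folder. It holds three annotation types, each with a training and a validation file, six JSON files in total: instances_train2017.json, instances_val2017.json, person_keypoints_train2017.json, person_keypoints_val2017.json, captions_train2017.json and captions_val2017.json.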

2.1 Instance segmentation: Object Instance file format

Taking instances_val2017.json as an example (the validation file is smaller and opens faster), the overall format is:

{
    "info": info,
    "licenses": [license],
    "images":[image],
    "annotations":[annotation],
    "categories":[category]
}

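The images field is a list whose length equals the number of images in the training (or validation) set.
The annotations field is also a list; its length equals the number of bounding boxes in the training (or validation) set.
The categories field is also a list; its length equals the number of categories in the dataset, which is 80 for COCO 2017.

Opening the file in VS Code shows that the whole JSON file is one large dictionary.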

[Screenshot: the same file viewed in JupyterLab]
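As a minimal sketch of the structure just described, the snippet below builds a tiny in-memory stand-in for an instances file, parses it with the json module, and reports the top-level keys and list lengths. Only the five key names come from the format; the toy values and the summarize helper are made up for illustration. With the real file you would parse annotations/instances_val2017.json instead of the toy string.

```python
import json

# Toy stand-in for an instances_*.json file. Only the five top-level
# key names come from the COCO format; every value here is illustrative.
toy_text = json.dumps({
    "info": {"description": "toy", "year": 2017},
    "licenses": [{"id": 1, "name": "toy license", "url": ""}],
    "images": [{"id": 1, "file_name": "000000000001.jpg", "height": 4, "width": 6}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 18, "bbox": [0, 0, 2, 2]},
        {"id": 11, "image_id": 1, "category_id": 1, "bbox": [1, 1, 3, 2]},
    ],
    "categories": [{"id": 1, "name": "person"}, {"id": 18, "name": "dog"}],
})


def summarize(coco):
    """Map each top-level key to its list length (None for non-list values)."""
    return {key: (len(value) if isinstance(value, list) else None)
            for key, value in coco.items()}


coco = json.loads(toy_text)  # for a real file: json.load() on an open file handle
summary = summarize(coco)
print(type(coco).__name__, summary)
```

On the real validation file the same summary should show the image count under images and the bounding-box count under annotations.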

2.1.1 Contents of info

"info": {
        "description": "COCO 2017 Dataset",
        "url": "http://cocodataset.org",
        "version": "1.0",
        "year": 2017,
        "contributor": "COCO Consortium",
        "date_created": "2017/09/01"
    },

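info holds some basic information such as the date, version and contributor. It is of little practical use and can be ignored.

2.1.2 Contents of licenses

There is not much content, so all of it is listed here: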

"licenses": [
        {
            "url": "http://creativecommons.org/licenses/by-nc-sa/2.0/",
            "id": 1,
            "name": "Attribution-NonCommercial-ShareAlike License"
        },
        {
            "url": "http://creativecommons.org/licenses/by-nc/2.0/",
            "id": 2,
            "name": "Attribution-NonCommercial License"
        },
        {
            "url": "http://creativecommons.org/licenses/by-nc-nd/2.0/",
            "id": 3,
            "name": "Attribution-NonCommercial-NoDerivs License"
        },
        {
            "url": "http://creativecommons.org/licenses/by/2.0/",
            "id": 4,
            "name": "Attribution License"
        },
        {
            "url": "http://creativecommons.org/licenses/by-sa/2.0/",
            "id": 5,
            "name": "Attribution-ShareAlike License"
        },
        {
            "url": "http://creativecommons.org/licenses/by-nd/2.0/",
            "id": 6,
            "name": "Attribution-NoDerivs License"
        },
        {
            "url": "http://flickr.com/commons/usage/",
            "id": 7,
            "name": "No known copyright restrictions"
        },
        {
            "url": "http://www.usa.gov/copyright.shtml",
            "id": 8,
            "name": "United States Government Work"
        }
    ],

There are 8 entries in total. They are also of little practical use and can be ignored.

2.1.3 Contents of images

There is a lot of content, so only a few entries are listed:

"images": [
        {
            "license": 4,
            "file_name": "000000397133.jpg",
            "coco_url": "http://images.cocodataset.org/val2017/000000397133.jpg",
            "height": 427,
            "width": 640,
            "date_captured": "2013-11-14 17:02:52",
            "flickr_url": "http://farm7.staticflickr.com/6116/6255196340_da26cf2c9e_z.jpg",
            "id": 397133
        },
        {
            "license": 1,
            "file_name": "000000037777.jpg",
            "coco_url": "http://images.cocodataset.org/val2017/000000037777.jpg",
            "height": 230,
            "width": 352,
            "date_captured": "2013-11-14 20:55:31",
            "flickr_url": "http://farm9.staticflickr.com/8429/7839199426_f6d48aa585_z.jpg",
            "id": 37777
        },
        {
            "license": 4,
            "file_name": "000000252219.jpg",
            "coco_url": "http://images.cocodataset.org/val2017/000000252219.jpg",
            "height": 428,
            "width": 640,
            "date_captured": "2013-11-14 22:32:02",
            "flickr_url": "http://farm4.staticflickr.com/3446/3232237447_13d84bd0a1_z.jpg",
            "id": 252219
        },

[Screenshot: the images list viewed in Jupyter]

images is a list in which each element is a dictionary holding the information for one image. The fields are:

license: not needed
file_name: image file name
coco_url: not needed
height: image height
width: image width
date_captured: not needed
flickr_url: not needed
id: the image's ID, unique to each image

Of these, height, width, file_name and id are the four values that matter most.
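Since those four fields are the ones you end up needing, a lookup table keyed by image id is handy. The sketch below uses the three sample entries shown above, trimmed to the four fields; build_image_index is a hypothetical helper name, not part of any COCO tooling. It also checks an observation that holds for these three samples: file_name is the id zero-padded to 12 digits. I would still read file_name from the JSON rather than rely on that pattern.

```python
# The three sample image entries from above, trimmed to the four key fields.
images = [
    {"file_name": "000000397133.jpg", "height": 427, "width": 640, "id": 397133},
    {"file_name": "000000037777.jpg", "height": 230, "width": 352, "id": 37777},
    {"file_name": "000000252219.jpg", "height": 428, "width": 640, "id": 252219},
]


def build_image_index(images):
    """Map image id -> (file_name, height, width). Hypothetical helper."""
    return {img["id"]: (img["file_name"], img["height"], img["width"])
            for img in images}


image_index = build_image_index(images)

# Observation on these samples only: file_name is the id padded to 12 digits.
padded_names_match = all(
    f"{img['id']:012d}.jpg" == img["file_name"] for img in images
)

print(image_index[37777], padded_names_match)
```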

2.1.4 Contents of annotations

There is a lot of content, so only a few entries are listed:

"annotations": [
        {
            "segmentation": [
                [
                    510.66,
                    423.01,
                    511.72,
                    ...
                    423.01,
                    510.45,
                    423.01
                ]
            ],
            "area": 702.1057499999998,
            "iscrowd": 0,
            "image_id": 289343,
            "bbox": [
                473.07,
                395.93,
                38.65,
                28.67
            ],
            "category_id": 18,
            "id": 1768
        },
        {
            "segmentation": [
                [
                    289.74,
                    443.39,
                    302.29,
                   ...
                    444.27,
                    291.88,
                    443.74
                ]
            ],
            "area": 27718.476299999995,
            "iscrowd": 0,
            "image_id": 61471,
            "bbox": [
                272.1,
                200.23,
                151.97,
                279.77
            ],
            "category_id": 18,
            "id": 1773
        },
        ......
        {
            "segmentation": {
                "counts": [
                    272,
                    2,
                    4,
                    4,
                   ...
                    16,
                    228,
                    8,
                    10250
                ],
                "size": [
                    240,
                    320
                ]
            },
            "area": 18419,
            "iscrowd": 1,
            "image_id": 448263,
            "bbox": [
                1,
                0,
                276,
                122
            ],
            "category_id": 1,
            "id": 900100448263
        },

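[Screenshot: the annotations list viewed in Jupyter]

annotations is the most important part of this JSON file. It is an array of annotation instances, and each annotation in turn contains the following fields:

segmentation: segmentation label
area: area
iscrowd: whether the annotation covers multiple objects
image_id: corresponds to id in images
bbox: bounding box
category_id: category
id: a serial number for the annotation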
Overall, an annotation has the following format:

annotation{
    "segmentation": RLE or [polygon],
    "area": float,
    "iscrowd": 0 or 1,
    "image_id": int,
    "bbox": [x,y,width,height],
    "category_id": int,
    "id": int
}

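Note that a single object (iscrowd=0) may need several polygons, for example when it is partly occluded in the image. When iscrowd=1 (the annotation covers a group of objects, such as a crowd of people), segmentation uses RLE. In other words, iscrowd=0 means the segmentation is a list of polygons and iscrowd=1 means it is RLE. Either way, every object has a rectangular bbox, given as the top-left corner coordinates followed by the box's width and height.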
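A minimal sketch of these two points: converting bbox from [x, y, width, height] to corner coordinates, and telling polygon from RLE by the type of the segmentation value (a list versus a dictionary) rather than by trusting iscrowd. The helper names and the two toy annotations are made up for illustration; the bbox [1, 0, 276, 122] is the one from the crowd sample above.

```python
def bbox_to_corners(bbox):
    """Convert [x, y, width, height] to (x_min, y_min, x_max, y_max)."""
    x, y, width, height = bbox
    return (x, y, x + width, y + height)


def segmentation_kind(annotation):
    """Return 'rle' for a dict with counts/size, 'polygon' for a list."""
    segmentation = annotation["segmentation"]
    if isinstance(segmentation, dict):
        return "rle"
    return "polygon"


# Toy annotations (illustrative values, not taken from the dataset).
polygon_ann = {"segmentation": [[0, 0, 4, 0, 4, 3, 0, 3]], "iscrowd": 0,
               "bbox": [0, 0, 4, 3]}
crowd_ann = {"segmentation": {"counts": [1, 2, 3], "size": [2, 3]},
             "iscrowd": 1, "bbox": [0, 0, 2, 2]}

corners = bbox_to_corners([1, 0, 276, 122])  # bbox of the crowd sample above
kinds = [segmentation_kind(polygon_ann), segmentation_kind(crowd_ann)]
print(corners, kinds)
```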

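In the polygon format, segmentation is a two-dimensional list (a list of polygons). The numbers inside are the coordinates of the object's outline obtained from pixel-level segmentation, and as the examples above show, they come in (x, y) pairs. The RLE format looks like this: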

segmentation : 
{
    'counts': [272, 2, 4, 4, 4, 4, 2, 9, 1, 2, 16, 43, 143, 24......], 
    'size': [240, 320]
}

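The RLEs in the COCO annotation files are all uncompressed RLE (as opposed to compressed, or compact, RLE). The size of an RLE in bytes grows with the number of pixels on the mask boundary. The benefit of RLE is that computing the area of a region, or the union and intersection of two regions, is very efficient when done directly on the RLE. The counts and size arrays in the segmentation above together define the segmentation mask for the image. size is the image's height and width, in the order [height, width]. Each pixel of the image is either inside the annotated region or in the background, which is a boolean quantity: 1 (true) inside the region and 0 (false) in the background. A 240x320 image has 76800 pixels, so recording whether each pixel is in the region gives 76800 bits, for example (a made-up example, unrelated to the array above): 00000111100111110… Writing it out this way clearly wastes space, so instead we write down the lengths of the runs of 0s and 1s (run-length encoding), which gives 5, 4, 2, 5, 1, … That is what the counts array above holds.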
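To make the encoding concrete, here is a small pure-Python sketch of both directions. It relies on two conventions the text above does not spell out, which I am taking from how pycocotools treats masks: the first count is always a run of 0s, and pixels are walked column by column (column-major), with size given as [height, width]. The function names are my own. For real work, the pycocotools mask utilities are the usual route.

```python
def rle_encode(bits):
    """Run-length encode a string of '0'/'1'. The first count is for 0s,
    so a string that starts with '1' gets a leading count of 0."""
    counts = []
    current = "0"
    run = 0
    for bit in bits:
        if bit == current:
            run += 1
        else:
            counts.append(run)
            current = bit
            run = 1
    counts.append(run)
    return counts


def rle_decode(counts, size):
    """Expand uncompressed RLE into a mask given as a list of rows.
    Assumes size is [height, width] and column-major pixel order."""
    height, width = size
    flat = []
    value = 0
    for run in counts:
        flat.extend([value] * run)
        value = 1 - value  # runs alternate between 0 and 1
    if len(flat) != height * width:
        raise ValueError("counts do not add up to height * width")
    return [[flat[col * height + row] for col in range(width)]
            for row in range(height)]


encoded = rle_encode("00000111100111110")   # the bit string from the text
mask = rle_decode([1, 2, 3], [2, 3])        # toy 2x3 mask
mask_area = sum([1, 2, 3][1::2])            # odd positions are runs of 1s
print(encoded, mask, mask_area)
```

The last line shows why area is cheap on RLE: it is just the sum of every second count, with no need to expand the mask.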

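area is the area of the segmentation. iscrowd=0 marks a single object and iscrowd=1 marks a group of objects (a crowd). image_id is the id stored in images earlier. bbox is the object's bounding box. category_id is the number of the object's category, of which there are 80. id is different from the id in images: here it is just the identifier of each annotation.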
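For polygon segmentations, the flat coordinate list can be turned into (x, y) pairs and its area estimated with the shoelace formula. This is only a sketch under my own assumptions: the helper names are made up, the rectangle is a toy polygon, and the source does not say how the stored area value was computed, so I would not expect this to reproduce it exactly in every case.

```python
def polygon_points(flat):
    """Turn [x1, y1, x2, y2, ...] into [(x1, y1), (x2, y2), ...]."""
    if len(flat) % 2 != 0:
        raise ValueError("polygon needs an even number of values")
    return list(zip(flat[0::2], flat[1::2]))


def shoelace_area(points):
    """Area of a simple polygon via the shoelace formula."""
    twice_area = 0.0
    count = len(points)
    for i in range(count):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % count]  # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Toy polygon: a 4 x 3 rectangle, written the way segmentation stores it.
rectangle = [0, 0, 4, 0, 4, 3, 0, 3]
points = polygon_points(rectangle)
area = shoelace_area(points)
print(points, area)
```

When one object is described by several polygons, summing the per-polygon areas would be the natural extension.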

2.1.5 Contents of categories

As follows:

"categories": [
        {
            "supercategory": "person",
            "id": 1,
            "name": "person"
        },
        {
            "supercategory": "vehicle",
            "id": 2,
            "name": "bicycle"
        },
        {
            "supercategory": "vehicle",
            "id": 3,
            "name": "car"
        },
        {
            "supercategory": "vehicle",
            "id": 4,
            "name": "motorcycle"
        },
        {
            "supercategory": "vehicle",
            "id": 5,
            "name": "airplane"
        },
        {
            "supercategory": "vehicle",
            "id": 6,
            "name": "bus"
        },
        {
            "supercategory": "vehicle",
            "id": 7,
            "name": "train"
        },
        {
            "supercategory": "vehicle",
            "id": 8,
            "name": "truck"
        },
        {
            "supercategory": "vehicle",
            "id": 9,
            "name": "boat"
        },
        {
            "supercategory": "outdoor",
            "id": 10,
            "name": "traffic light"
        },
        {
            "supercategory": "outdoor",
            "id": 11,
            "name": "fire hydrant"
        },
        {
            "supercategory": "outdoor",
            "id": 13,
            "name": "stop sign"
        },
        {
            "supercategory": "outdoor",
            "id": 14,
            "name": "parking meter"
        },
        {
            "supercategory": "outdoor",
            "id": 15,
            "name": "bench"
        },
        {
            "supercategory": "animal",
            "id": 16,
            "name": "bird"
        },
        {
            "supercategory": "animal",
            "id": 17,
            "name": "cat"
        },
        {
            "supercategory": "animal",
            "id": 18,
            "name": "dog"
        },
        {
            "supercategory": "animal",
            "id": 19,
            "name": "horse"
        },
        {
            "supercategory": "animal",
            "id": 20,
            "name": "sheep"
        },
        {
            "supercategory": "animal",
            "id": 21,
            "name": "cow"
        },
        {
            "supercategory": "animal",
            "id": 22,
            "name": "elephant"
        },
        {
            "supercategory": "animal",
            "id": 23,
            "name": "bear"
        },
        {
            "supercategory": "animal",
            "id": 24,
            "name": "zebra"
        },
        {
            "supercategory": "animal",
            "id": 25,
            "name": "giraffe"
        },
        {
            "supercategory": "accessory",
            "id": 27,
            "name": "backpack"
        },
        {
            "supercategory": "accessory",
            "id": 28,
            "name": "umbrella"
        },
        {
            "supercategory": "accessory",
            "id": 31,
            "name": "handbag"
        },
        {
            "supercategory": "accessory",
            "id": 32,
            "name": "tie"
        },
        {
            "supercategory": "accessory",
            "id": 33,
            "name": "suitcase"
        },
        {
            "supercategory": "sports",
            "id": 34,
            "name": "frisbee"
        },
        {
            "supercategory": "sports",
            "id": 35,
            "name": "skis"
        },
        {
            "supercategory": "sports",
            "id": 36,
            "name": "snowboard"
        },
        {
            "supercategory": "sports",
            "id": 37,
            "name": "sports ball"
        },
        {
            "supercategory": "sports",
            "id": 38,
            "name": "kite"
        },
        {
            "supercategory": "sports",
            "id": 39,
            "name": "baseball bat"
        },
        {
            "supercategory": "sports",
            "id": 40,
            "name": "baseball glove"
        },
        {
            "supercategory": "sports",
            "id": 41,
            "name": "skateboard"
        },
        {
            "supercategory": "sports",
            "id": 42,
            "name": "surfboard"
        },
        {
            "supercategory": "sports",
            "id": 43,
            "name": "tennis racket"
        },
        {
            "supercategory": "kitchen",
            "id": 44,
            "name": "bottle"
        },
        {
            "supercategory": "kitchen",
            "id": 46,
            "name": "wine glass"
        },
        {
            "supercategory": "kitchen",
            "id": 47,
            "name": "cup"
        },
        {
            "supercategory": "kitchen",
            "id": 48,
            "name": "fork"
        },
        {
            "supercategory": "kitchen",
            "id": 49,
            "name": "knife"
        },
        {
            "supercategory": "kitchen",
            "id": 50,
            "name": "spoon"
        },
        {
            "supercategory": "kitchen",
            "id": 51,
            "name": "bowl"
        },
        {
            "supercategory": "food",
            "id": 52,
            "name": "banana"
        },
        {
            "supercategory": "food",
            "id": 53,
            "name": "apple"
        },
        {
            "supercategory": "food",
            "id": 54,
            "name": "sandwich"
        },
        {
            "supercategory": "food",
            "id": 55,
            "name": "orange"
        },
        {
            "supercategory": "food",
            "id": 56,
            "name": "broccoli"
        },
        {
            "supercategory": "food",
            "id": 57,
            "name": "carrot"
        },
        {
            "supercategory": "food",
            "id": 58,
            "name": "hot dog"
        },
        {
            "supercategory": "food",
            "id": 59,
            "name": "pizza"
        },
        {
            "supercategory": "food",
            "id": 60,
            "name": "donut"
        },
        {
            "supercategory": "food",
            "id": 61,
            "name": "cake"
        },
        {
            "supercategory": "furniture",
            "id": 62,
            "name": "chair"
        },
        {
            "supercategory": "furniture",
            "id": 63,
            "name": "couch"
        },
        {
            "supercategory": "furniture",
            "id": 64,
            "name": "potted plant"
        },
        {
            "supercategory": "furniture",
            "id": 65,
            "name": "bed"
        },
        {
            "supercategory": "furniture",
            "id": 67,
            "name": "dining table"
        },
        {
            "supercategory": "furniture",
            "id": 70,
            "name": "toilet"
        },
        {
            "supercategory": "electronic",
            "id": 72,
            "name": "tv"
        },
        {
            "supercategory": "electronic",
            "id": 73,
            "name": "laptop"
        },
        {
            "supercategory": "electronic",
            "id": 74,
            "name": "mouse"
        },
        {
            "supercategory": "electronic",
            "id": 75,
            "name": "remote"
        },
        {
            "supercategory": "electronic",
            "id": 76,
            "name": "keyboard"
        },
        {
            "supercategory": "electronic",
            "id": 77,
            "name": "cell phone"
        },
        {
            "supercategory": "appliance",
            "id": 78,
            "name": "microwave"
        },
        {
            "supercategory": "appliance",
            "id": 79,
            "name": "oven"
        },
        {
            "supercategory": "appliance",
            "id": 80,
            "name": "toaster"
        },
        {
            "supercategory": "appliance",
            "id": 81,
            "name": "sink"
        },
        {
            "supercategory": "appliance",
            "id": 82,
            "name": "refrigerator"
        },
        {
            "supercategory": "indoor",
            "id": 84,
            "name": "book"
        },
        {
            "supercategory": "indoor",
            "id": 85,
            "name": "clock"
        },
        {
            "supercategory": "indoor",
            "id": 86,
            "name": "vase"
        },
        {
            "supercategory": "indoor",
            "id": 87,
            "name": "scissors"
        },
        {
            "supercategory": "indoor",
            "id": 88,
            "name": "teddy bear"
        },
        {
            "supercategory": "indoor",
            "id": 89,
            "name": "hair drier"
        },
        {
            "supercategory": "indoor",
            "id": 90,
            "name": "toothbrush"
        }
    ]

Category ids run from 1 to 90, but some numbers are skipped, so there are only 80 categories.
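Because of the gaps, category_id cannot be used directly as a class index from 0 to 79. A minimal sketch of the usual remapping follows; the set of skipped ids is read off the listing above, and the variable names are my own. In practice I would build the mapping from the categories list in the file rather than hard-code it.

```python
# Ids between 1 and 90 that do not appear in the categories listing above.
skipped_ids = {12, 26, 29, 30, 45, 66, 68, 69, 71, 83}

# The ids that are actually used, in ascending order.
category_ids = [cid for cid in range(1, 91) if cid not in skipped_ids]

# Sparse COCO id -> contiguous index, and the reverse lookup.
id_to_index = {cid: index for index, cid in enumerate(category_ids)}
index_to_id = {index: cid for cid, index in id_to_index.items()}

print(len(category_ids), id_to_index[1], id_to_index[13], id_to_index[90])
```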

2.2 Keypoint detection: Object Keypoint file format

In the COCO dataset, person_keypoints_train2017.json and person_keypoints_val2017.json use this format. The overall file format is:

{
    "info": info,
    "licenses": [license],
    "images": [image],
    "annotations": [annotation],
    "categories": [category]
}

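This is the same as instances_val2017.json. The info, licenses and images parts are identical across the different JSON files and share one definition. What is not shared is annotations and categories, which differ between the types of JSON file.

The images field is a list whose length equals the number of images in the training (or validation) set.
The annotations field is also a list; its length equals the number of bounding boxes in the training (or validation) set, and here only boxes of the person category are included.
The categories field is also a list; its length equals the number of categories, which is 1 here because person is the only class.

Identical content is not repeated here; only the differences are listed.

2.2.1 Contents of annotations

An annotation of this type contains every field of an object instance annotation plus two extra fields, keypoints and num_keypoints. keypoints is an array of length 3*k, where k is the number of keypoints defined for the category. Each keypoint takes three consecutive entries: the first and second are the x and y coordinates and the third is a flag v. v=0 means the keypoint is not labeled (in which case x=y=v=0), v=1 means it is labeled but not visible (occluded), and v=2 means it is labeled and visible. num_keypoints is the number of labeled keypoints (v>0) on the object; on small objects it may be impossible to label any keypoints.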

annotation{
   "segmentation": RLE or [polygon],
   "num_keypoints": int,    
   "area": float,
   "iscrowd": 0 or 1,
   "keypoints": [x1,y1,v1,...],
   "image_id": int,
   "bbox": [x,y,width,height],
   "category_id": int,
   "id": int 
}

One example:

{
      "segmentation": [
          [
              492.38,
              238.33,
              491.91,
              234.15,
              494.47,
              227.65,
              495.17,
              215.1,
              497.02,
              199.54,
              503.53,
              197.22,
              503.3,
              194.43,
              503.3,
              190.95,
              506.08,
              183.51,
              511.89,
              185.84,
              514.21,
              187,
              514.21,
              196.29,
              521.88,
              200.7,
              526.76,
              216.03,
              520.25,
              227.65,
              519.56,
              234.38,
              519.09,
              239.49,
              519.09,
              244.84,
              519.56,
              246.93,
              518.16,
              248.32,
              516.3,
              256.91,
              510.03,
              256.45,
              513.28,
              240.89
          ]
      ],
      "num_keypoints": 13,
      "area": 1394.7431,
      "iscrowd": 0,
      "keypoints": [
          508,
          192,
          2,
          510,
          191,
          2,
          506,
          191,
          2,
          512,
          192,
          2,
          503,
          192,
          1,
          515,
          202,
          2,
          499,
          202,
          2,
          524,
          214,
          2,
          497,
          215,
          2,
          516,
          226,
          2,
          496,
          224,
          2,
          511,
          232,
          2,
          497,
          230,
          2,
          0,
          0,
          0,
          0,
          0,
          0,
          0,
          0,
          0,
          0,
          0,
          0
      ],
      "image_id": 440475,
      "bbox": [
          491.91,
          183.51,
          34.85,
          73.4
      ],
      "category_id": 1,
      "id": 183302
  }

You can see that there are 17 keypoints in total (51 values, three per keypoint).
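The counting can be checked with a few lines of Python. The list below is the keypoints array from the example above; split_keypoints is a hypothetical helper that cuts it into (x, y, v) triples so the flags can be tallied against num_keypoints.

```python
# The keypoints array from the example annotation above.
keypoints = [
    508, 192, 2, 510, 191, 2, 506, 191, 2, 512, 192, 2, 503, 192, 1,
    515, 202, 2, 499, 202, 2, 524, 214, 2, 497, 215, 2, 516, 226, 2,
    496, 224, 2, 511, 232, 2, 497, 230, 2,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
]


def split_keypoints(flat):
    """Cut [x1, y1, v1, x2, y2, v2, ...] into a list of (x, y, v) triples."""
    if len(flat) % 3 != 0:
        raise ValueError("keypoints length must be a multiple of 3")
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


triples = split_keypoints(keypoints)
num_labeled = sum(1 for _, _, v in triples if v > 0)    # v = 1 or 2
num_visible = sum(1 for _, _, v in triples if v == 2)
num_occluded = sum(1 for _, _, v in triples if v == 1)

print(len(triples), num_labeled, num_visible, num_occluded)
```

The labeled count matches the num_keypoints value of 13 stored in the example.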

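2.2.2 Contents of categories

Compared with the category in object instance files, the category here has two extra fields. keypoints is an array of length k holding the name of each keypoint. skeleton defines how the keypoints connect (for example, a person's left wrist and left elbow are connected, but the left wrist and right wrist are not). Currently COCO keypoints are annotated only for the person category. The definition is: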

{
    "supercategory": str,
    "id": int,
    "name": str,    
    "keypoints": [str],
    "skeleton": [edge]
}

Concretely:

"categories": [
    {
        "supercategory": "person",
        "id": 1,
        "name": "person",
        "keypoints": [
            "nose",
            "left_eye",
            "right_eye",
            "left_ear",
            "right_ear",
            "left_shoulder",
            "right_shoulder",
            "left_elbow",
            "right_elbow",
            "left_wrist",
            "right_wrist",
            "left_hip",
            "right_hip",
            "left_knee",
            "right_knee",
            "left_ankle",
            "right_ankle"
        ],
        "skeleton": [
            [
                16,
                14
            ],
            [
                14,
                12
            ],
            [
                17,
                15
            ],
            [
                15,
                13
            ],
            [
                12,
                13
            ],
            [
                6,
                12
            ],
            [
                7,
                13
            ],
            [
                6,
                7
            ],
            [
                6,
                8
            ],
            [
                7,
                9
            ],
            [
                8,
                10
            ],
            [
                9,
                11
            ],
            [
                2,
                3
            ],
            [
                1,
                2
            ],
            [
                1,
                3
            ],
            [
                2,
                4
            ],
            [
                3,
                5
            ],
            [
                4,
                6
            ],
            [
                5,
                7
            ]
        ]
    }
]
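The skeleton entries are 1-based positions in the keypoints name list (they run up to 17 for 17 names), so subtracting 1 gives a Python index. A minimal sketch that turns the edges above into name pairs follows; named_edges is a made-up helper name.

```python
keypoint_names = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]

skeleton = [
    [16, 14], [14, 12], [17, 15], [15, 13], [12, 13], [6, 12], [7, 13],
    [6, 7], [6, 8], [7, 9], [8, 10], [9, 11], [2, 3], [1, 2], [1, 3],
    [2, 4], [3, 5], [4, 6], [5, 7],
]


def named_edges(names, skeleton):
    """Translate 1-based skeleton index pairs into keypoint name pairs."""
    return [(names[a - 1], names[b - 1]) for a, b in skeleton]


edges = named_edges(keypoint_names, skeleton)
for first, second in edges[:3]:
    print(first, "-", second)
```

This confirms the example from the text: left_elbow and left_wrist share an edge, while left_wrist and right_wrist do not.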

2.3 Image captioning: Image Caption file format

captions_train2017.json and captions_val2017.json use this format. From start to end, an Image Caption file consists of the sections below. It looks like the Object Instance format but lacks the final categories field:

{
    "info": info,
    "licenses": [license],
    "images": [image],
    "annotations": [annotation]
}

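The info, licenses and images structures are the same across the different JSON files and share one definition. What is not shared is the annotations structure, which differs between the types of JSON file.

annotations: there are more entries than images, because one image can have several scene descriptions.

2.3.1 Contents of annotations

An annotation of this type stores a sentence describing an image. Each sentence describes the content of the corresponding image, and each image has at least 5 such sentences (some have more). The annotation is defined as: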

annotation{
    "image_id": int,
    "id": int,
    "caption": str
}

A concrete excerpt:

{
        "image_id": 546219,
        "id": 396378,
        "caption": "A large group is sitting together and eating at a restaurant."
    },
    {
        "image_id": 546219,
        "id": 397413,
        "caption": "The people are gathered at the table for dinner."
    },
    {
        "image_id": 146155,
        "id": 397604,
        "caption": "Two  men standing near a bar drinking together"
    },
    {
        "image_id": 546219,
        "id": 399732,
        "caption": "A large group of people pose for a photo at dinner."
    },
    {
        "image_id": 546219,
        "id": 400023,
        "caption": "The diners are enjoying their various beverages with their meals.."
    }

The image_id here corresponds to the id in images.
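Since several captions point at the same image_id, grouping them per image is usually the first step. A minimal sketch using the five entries from the excerpt above; captions_by_image is a hypothetical helper name.

```python
from collections import defaultdict

# The five caption annotations from the excerpt above.
annotations = [
    {"image_id": 546219, "id": 396378,
     "caption": "A large group is sitting together and eating at a restaurant."},
    {"image_id": 546219, "id": 397413,
     "caption": "The people are gathered at the table for dinner."},
    {"image_id": 146155, "id": 397604,
     "caption": "Two  men standing near a bar drinking together"},
    {"image_id": 546219, "id": 399732,
     "caption": "A large group of people pose for a photo at dinner."},
    {"image_id": 546219, "id": 400023,
     "caption": "The diners are enjoying their various beverages with their meals.."},
]


def captions_by_image(annotations):
    """Group caption strings by the image they describe."""
    grouped = defaultdict(list)
    for annotation in annotations:
        grouped[annotation["image_id"]].append(annotation["caption"])
    return dict(grouped)


grouped = captions_by_image(annotations)
for image_id, captions in grouped.items():
    print(image_id, len(captions))
```

The excerpt is only a slice of the file, which is why image 546219 shows four captions here rather than the five or more it has in full.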

References

https://zhuanlan.zhihu.com/p/70878433
https://blog.csdn.net/weixin_38293440/article/details/81196428
