天天看点

COCO数据集介绍1、COCO数据集的介绍2、COCO数据集标注格式本文参考

文章目录

  • 1、COCO数据集的介绍
  • 2、COCO数据集标注格式
    • 2.1实例分割Object Instance文件格式
      • 2.1.1 info中的内容
      • 2.1.2 licenses中的内容
      • 2.1.3 images中的内容
      • 2.1.4 annotations中的内容
      • 2.1.4 categories中的内容
    • 2.2 关键点检测Object Keypoint文件格式
      • 2.2.1 annotations中的内容
      • 2.2.2 categories中的内容
    • 2.3 看图说话Image Caption文件格式
      • 2.3.1 annotation中的内容
  • 本文参考

本文主要是为了熟悉COCO数据集。

1、COCO数据集的介绍

首先上两个链接,第一个 ,第二个

有以上两个链接足够了解COCO

整个数据集的分布如下

#step1: 下载数据集
2017 Train images [118K/18GB]
2017 Val images [5K/1GB]
2017 Test images [41K/6GB]
2017 Train/Val annotations [241MB]

#step2: 按照下面结构存放文件夹
coco
  ├── annotations
  │   ├── instances_train2014.json
  │   ├── instances_train2017.json
  │   ├── instances_val2014.json
  │   ├── instances_val2017.json
  │   |   ...
  ├── train2017
  │   ├── 000000000009.jpg
  │   ├── 000000580008.jpg
  │   |   ...
  ├── val2017
  │   ├── 000000000139.jpg
  │   ├── 000000000285.jpg
  │   |   ...
  |   ...
  
           

2、COCO数据集标注格式

本部分主要是参考https://zhuanlan.zhihu.com/p/70878433,这个链接进行同步整理的,直接看原文也可以,只是觉的原文有点乱,不便于整体掌据该数据集。

COCO数据集大量使用Amazon Mechanical Turk来收集数据。COCO数据集现主要有三种标注类型:

  • object instance 目标实例
  • object keypoints 目标关键点
  • image captions 看图说话。

    标注文件使用JSON文件进行存储。如下为COCO2017数据集中train,val的标注文件:

    COCO数据集介绍1、COCO数据集的介绍2、COCO数据集标注格式本文参考
    原文件是

    annotations_trainval2017.zip

    ,解压后是

    annotations

    文件夹。可以看到一共有三种类型,每种类型包含训练和验证,共有6个JSON文件。

2.1实例分割Object Instance文件格式

以instance_val2017.json为例(验证集文件软小,打开较快),总体格式如下:

{
    "info": info,
    "licenses": [license],
    "images":[image],
    "annotations":[annotation],
    "categories":[category]
}
           
COCO数据集介绍1、COCO数据集的介绍2、COCO数据集标注格式本文参考
  • images字段下是一个列表,列表长度等同于划入训练集(或验证集)的图片数量
  • annotatons字段下也是一个列表,列表长度等同地训练集(或验证集)中bounding box 的数量
  • categories字段下也是一个列表,列表长度等同于数据集类别的数,coco2017分类数是80,用VScode打开看:
    COCO数据集介绍1、COCO数据集的介绍2、COCO数据集标注格式本文参考

可以看到整个JSON文件是一个大的数字典。

通过jupyterlab打开看:

COCO数据集介绍1、COCO数据集的介绍2、COCO数据集标注格式本文参考

2.1.1 info中的内容

"info": {
        "description": "COCO 2017 Dataset",
        "url": "http://cocodataset.org",
        "version": "1.0",
        "year": 2017,
        "contributor": "COCO Consortium",
        "date_created": "2017/09/01"
    },
           
COCO数据集介绍1、COCO数据集的介绍2、COCO数据集标注格式本文参考

info中包括一些基本信息,时间,版本,贡献者等,没什么太大价值,可以忽略。

2.1.2 licenses中的内容

内容较少,这里全部列出:

"licenses": [
        {
            "url": "http://creativecommons.org/licenses/by-nc-sa/2.0/",
            "id": 1,
            "name": "Attribution-NonCommercial-ShareAlike License"
        },
        {
            "url": "http://creativecommons.org/licenses/by-nc/2.0/",
            "id": 2,
            "name": "Attribution-NonCommercial License"
        },
        {
            "url": "http://creativecommons.org/licenses/by-nc-nd/2.0/",
            "id": 3,
            "name": "Attribution-NonCommercial-NoDerivs License"
        },
        {
            "url": "http://creativecommons.org/licenses/by/2.0/",
            "id": 4,
            "name": "Attribution License"
        },
        {
            "url": "http://creativecommons.org/licenses/by-sa/2.0/",
            "id": 5,
            "name": "Attribution-ShareAlike License"
        },
        {
            "url": "http://creativecommons.org/licenses/by-nd/2.0/",
            "id": 6,
            "name": "Attribution-NoDerivs License"
        },
        {
            "url": "http://flickr.com/commons/usage/",
            "id": 7,
            "name": "No known copyright restrictions"
        },
        {
            "url": "http://www.usa.gov/copyright.shtml",
            "id": 8,
            "name": "United States Government Work"
        }
    ],
           
COCO数据集介绍1、COCO数据集的介绍2、COCO数据集标注格式本文参考

一共有8条,也没什么价值,可以忽略。

2.1.3 images中的内容

内容较多,列几条:

"images": [
        {
            "license": 4,
            "file_name": "000000397133.jpg",
            "coco_url": "http://images.cocodataset.org/val2017/000000397133.jpg",
            "height": 427,
            "width": 640,
            "date_captured": "2013-11-14 17:02:52",
            "flickr_url": "http://farm7.staticflickr.com/6116/6255196340_da26cf2c9e_z.jpg",
            "id": 397133
        },
        {
            "license": 1,
            "file_name": "000000037777.jpg",
            "coco_url": "http://images.cocodataset.org/val2017/000000037777.jpg",
            "height": 230,
            "width": 352,
            "date_captured": "2013-11-14 20:55:31",
            "flickr_url": "http://farm9.staticflickr.com/8429/7839199426_f6d48aa585_z.jpg",
            "id": 37777
        },
        {
            "license": 4,
            "file_name": "000000252219.jpg",
            "coco_url": "http://images.cocodataset.org/val2017/000000252219.jpg",
            "height": 428,
            "width": 640,
            "date_captured": "2013-11-14 22:32:02",
            "flickr_url": "http://farm4.staticflickr.com/3446/3232237447_13d84bd0a1_z.jpg",
            "id": 252219
        },
           
COCO数据集介绍1、COCO数据集的介绍2、COCO数据集标注格式本文参考

jupyter中看的效果:

COCO数据集介绍1、COCO数据集的介绍2、COCO数据集标注格式本文参考

images是一个列表,列表中每一个元素是一个字典,存储一张图片中的信息。分别就图片信息做出说明:

  • license: 没用
  • file_name:图片文件名
  • coco_url:没用
  • height:图片高
  • width:图片宽
  • date_captured:没用
  • flickr_url没用
  • id:图片的身份ID,每个图片特有的

    在以上信息中,height,width,file_name,id这四个值非常重要。

2.1.4 annotations中的内容

该内容较多,列几条:

"annotations": [
        {
            "segmentation": [
                [
                    510.66,
                    423.01,
                    511.72,
                    ...
                    423.01,
                    510.45,
                    423.01
                ]
            ],
            "area": 702.1057499999998,
            "iscrowd": 0,
            "image_id": 289343,
            "bbox": [
                473.07,
                395.93,
                38.65,
                28.67
            ],
            "category_id": 18,
            "id": 1768
        },
        {
            "segmentation": [
                [
                    289.74,
                    443.39,
                    302.29,
                   ...
                    444.27,
                    291.88,
                    443.74
                ]
            ],
            "area": 27718.476299999995,
            "iscrowd": 0,
            "image_id": 61471,
            "bbox": [
                272.1,
                200.23,
                151.97,
                279.77
            ],
            "category_id": 18,
            "id": 1773
        },
        ......
                    "segmentation": {
                "counts": [
                    272,
                    2,
                    4,
                    4,
                   ...
                    16,
                    228,
                    8,
                    10250
                ],
                "size": [
                    240,
                    320
                ]
            },
            "area": 18419,
            "iscrowd": 1,
            "image_id": 448263,
            "bbox": [
                1,
                0,
                276,
                122
            ],
            "category_id": 1,
            "id": 900100448263
        },
        
           
COCO数据集介绍1、COCO数据集的介绍2、COCO数据集标注格式本文参考

jupyter中效果:

COCO数据集介绍1、COCO数据集的介绍2、COCO数据集标注格式本文参考

annotations是该JSON文件中最重要的。annotations是包含多个annotation实例的数组,annotation类型本身又包含一系列的字段:

  • segmentation:分割标签
  • area:面积
  • iscrowd: 是否多个目标
  • image_id:与images中的id对应
  • bbox:目标框
  • category_id:类别
  • id:标注框的一个序号

    整体来说annotation的格式如下:

annotation{
    "segmentation": RLE or [polygon],
    "area" :float,
    "iscrowd": 0 or 1,
    "imgae_id": int,
    "bbox": [x,y,width,height],
    "category_id": int,
    "id": int
    
           

注意,单个对像(iscrowd=0)可能需要多个polygon来表示,比如这个对像在图像中被挡住;而iscrow=1时(将标注一组对像,比如一群人),segmentation的格式是RLE格式。也就是说,只要iscrowd=0,那么segmentation格式就是polygon; 而iscrowd=1,则segmentation格式是RLE。另外不论iscrowd是0还是1,每个对像都会有一个矩型框bbox,提供框的左上角坐标以及矩形框的高和宽。

segmentation polygon格式,可以看到,是一个二维的列表,里面的一堆数字是像素级分割得到的物体边缘坐标,从上文中也能看到,坐标是成对出现的;RLE格式如下:

segmentation : 
{
    'counts': [272, 2, 4, 4, 4, 4, 2, 9, 1, 2, 16, 43, 143, 24......], 
    'size': [240, 320]
}
           

COCO数据集的RLE都是uncompressed RLE格式(与之相对的是compact RLE)。 RLE所占字节的大小和边界上的像素数量是正相关的。RLE格式带来的好处就是当基于RLE去计算目标区域的面积以及两个目标之间的unoin和intersection时会非常有效率。 上面的segmentation中的counts数组和size数组共同组成了这幅图片中的分割 mask。其中size是这幅图片的宽高,然后在这幅图像中,每一个像素点要么在被分割(标注)的目标区域中,要么在背景中。很明显这是一个bool量:如果该像素在目标区域中为true那么在背景中就是False;如果该像素在目标区域中为1那么在背景中就是0。对于一个240x320的图片来说,一共有76800个像素点,根据每一个像素点在不在目标区域中,我们就有了76800个bit,比如像这样(随便写的例子,和上文的数组没关系):00000111100111110…;但是这样写很明显浪费空间,我们直接写上0或者1的个数不就行了嘛(Run-length encoding),于是就成了54251…,这就是上文中的counts数组。

area指向该segmentation的面积,iscrowd=0表示没有重叠,iscrowd=1表示有重叠;image_id就是前面images中存储的id.bbox指向的是物体的标注框;category_id指向的数字代表分类,共有80个分类;id不同于images中的id,这里的id只是每个框的身份编号。

2.1.4 categories中的内容

如下:

"categories": [
        {
            "supercategory": "person",
            "id": 1,
            "name": "person"
        },
        {
            "supercategory": "vehicle",
            "id": 2,
            "name": "bicycle"
        },
        {
            "supercategory": "vehicle",
            "id": 3,
            "name": "car"
        },
        {
            "supercategory": "vehicle",
            "id": 4,
            "name": "motorcycle"
        },
        {
            "supercategory": "vehicle",
            "id": 5,
            "name": "airplane"
        },
        {
            "supercategory": "vehicle",
            "id": 6,
            "name": "bus"
        },
        {
            "supercategory": "vehicle",
            "id": 7,
            "name": "train"
        },
        {
            "supercategory": "vehicle",
            "id": 8,
            "name": "truck"
        },
        {
            "supercategory": "vehicle",
            "id": 9,
            "name": "boat"
        },
        {
            "supercategory": "outdoor",
            "id": 10,
            "name": "traffic light"
        },
        {
            "supercategory": "outdoor",
            "id": 11,
            "name": "fire hydrant"
        },
        {
            "supercategory": "outdoor",
            "id": 13,
            "name": "stop sign"
        },
        {
            "supercategory": "outdoor",
            "id": 14,
            "name": "parking meter"
        },
        {
            "supercategory": "outdoor",
            "id": 15,
            "name": "bench"
        },
        {
            "supercategory": "animal",
            "id": 16,
            "name": "bird"
        },
        {
            "supercategory": "animal",
            "id": 17,
            "name": "cat"
        },
        {
            "supercategory": "animal",
            "id": 18,
            "name": "dog"
        },
        {
            "supercategory": "animal",
            "id": 19,
            "name": "horse"
        },
        {
            "supercategory": "animal",
            "id": 20,
            "name": "sheep"
        },
        {
            "supercategory": "animal",
            "id": 21,
            "name": "cow"
        },
        {
            "supercategory": "animal",
            "id": 22,
            "name": "elephant"
        },
        {
            "supercategory": "animal",
            "id": 23,
            "name": "bear"
        },
        {
            "supercategory": "animal",
            "id": 24,
            "name": "zebra"
        },
        {
            "supercategory": "animal",
            "id": 25,
            "name": "giraffe"
        },
        {
            "supercategory": "accessory",
            "id": 27,
            "name": "backpack"
        },
        {
            "supercategory": "accessory",
            "id": 28,
            "name": "umbrella"
        },
        {
            "supercategory": "accessory",
            "id": 31,
            "name": "handbag"
        },
        {
            "supercategory": "accessory",
            "id": 32,
            "name": "tie"
        },
        {
            "supercategory": "accessory",
            "id": 33,
            "name": "suitcase"
        },
        {
            "supercategory": "sports",
            "id": 34,
            "name": "frisbee"
        },
        {
            "supercategory": "sports",
            "id": 35,
            "name": "skis"
        },
        {
            "supercategory": "sports",
            "id": 36,
            "name": "snowboard"
        },
        {
            "supercategory": "sports",
            "id": 37,
            "name": "sports ball"
        },
        {
            "supercategory": "sports",
            "id": 38,
            "name": "kite"
        },
        {
            "supercategory": "sports",
            "id": 39,
            "name": "baseball bat"
        },
        {
            "supercategory": "sports",
            "id": 40,
            "name": "baseball glove"
        },
        {
            "supercategory": "sports",
            "id": 41,
            "name": "skateboard"
        },
        {
            "supercategory": "sports",
            "id": 42,
            "name": "surfboard"
        },
        {
            "supercategory": "sports",
            "id": 43,
            "name": "tennis racket"
        },
        {
            "supercategory": "kitchen",
            "id": 44,
            "name": "bottle"
        },
        {
            "supercategory": "kitchen",
            "id": 46,
            "name": "wine glass"
        },
        {
            "supercategory": "kitchen",
            "id": 47,
            "name": "cup"
        },
        {
            "supercategory": "kitchen",
            "id": 48,
            "name": "fork"
        },
        {
            "supercategory": "kitchen",
            "id": 49,
            "name": "knife"
        },
        {
            "supercategory": "kitchen",
            "id": 50,
            "name": "spoon"
        },
        {
            "supercategory": "kitchen",
            "id": 51,
            "name": "bowl"
        },
        {
            "supercategory": "food",
            "id": 52,
            "name": "banana"
        },
        {
            "supercategory": "food",
            "id": 53,
            "name": "apple"
        },
        {
            "supercategory": "food",
            "id": 54,
            "name": "sandwich"
        },
        {
            "supercategory": "food",
            "id": 55,
            "name": "orange"
        },
        {
            "supercategory": "food",
            "id": 56,
            "name": "broccoli"
        },
        {
            "supercategory": "food",
            "id": 57,
            "name": "carrot"
        },
        {
            "supercategory": "food",
            "id": 58,
            "name": "hot dog"
        },
        {
            "supercategory": "food",
            "id": 59,
            "name": "pizza"
        },
        {
            "supercategory": "food",
            "id": 60,
            "name": "donut"
        },
        {
            "supercategory": "food",
            "id": 61,
            "name": "cake"
        },
        {
            "supercategory": "furniture",
            "id": 62,
            "name": "chair"
        },
        {
            "supercategory": "furniture",
            "id": 63,
            "name": "couch"
        },
        {
            "supercategory": "furniture",
            "id": 64,
            "name": "potted plant"
        },
        {
            "supercategory": "furniture",
            "id": 65,
            "name": "bed"
        },
        {
            "supercategory": "furniture",
            "id": 67,
            "name": "dining table"
        },
        {
            "supercategory": "furniture",
            "id": 70,
            "name": "toilet"
        },
        {
            "supercategory": "electronic",
            "id": 72,
            "name": "tv"
        },
        {
            "supercategory": "electronic",
            "id": 73,
            "name": "laptop"
        },
        {
            "supercategory": "electronic",
            "id": 74,
            "name": "mouse"
        },
        {
            "supercategory": "electronic",
            "id": 75,
            "name": "remote"
        },
        {
            "supercategory": "electronic",
            "id": 76,
            "name": "keyboard"
        },
        {
            "supercategory": "electronic",
            "id": 77,
            "name": "cell phone"
        },
        {
            "supercategory": "appliance",
            "id": 78,
            "name": "microwave"
        },
        {
            "supercategory": "appliance",
            "id": 79,
            "name": "oven"
        },
        {
            "supercategory": "appliance",
            "id": 80,
            "name": "toaster"
        },
        {
            "supercategory": "appliance",
            "id": 81,
            "name": "sink"
        },
        {
            "supercategory": "appliance",
            "id": 82,
            "name": "refrigerator"
        },
        {
            "supercategory": "indoor",
            "id": 84,
            "name": "book"
        },
        {
            "supercategory": "indoor",
            "id": 85,
            "name": "clock"
        },
        {
            "supercategory": "indoor",
            "id": 86,
            "name": "vase"
        },
        {
            "supercategory": "indoor",
            "id": 87,
            "name": "scissors"
        },
        {
            "supercategory": "indoor",
            "id": 88,
            "name": "teddy bear"
        },
        {
            "supercategory": "indoor",
            "id": 89,
            "name": "hair drier"
        },
        {
            "supercategory": "indoor",
            "id": 90,
            "name": "toothbrush"
        }
    ]
           

分类从1到90,但有些数字是跳过的,所以只有80个分类。

COCO数据集介绍1、COCO数据集的介绍2、COCO数据集标注格式本文参考

2.2 关键点检测Object Keypoint文件格式

COCO数据集中person_keypoints_train2017.json、person_keypoints_val2017.json这两个文件就是这种格式。文件整体格式是:

{
    "info": info,
    "licenses": [license],
    "images": [image],
    "annotations": [annotation],
    "categories": [category]
}
           
COCO数据集介绍1、COCO数据集的介绍2、COCO数据集标注格式本文参考

与instance_val2017.json相同。其中,info、licenses、images这三部分在不同的JSON文件中是相同的,定义是共享的,不共享的是annotations和category这两种在不同类型的JSON文件中是不一样的。

  • images字段下是一个列表,列表长度等同于划入训练集(或验证集)的图片数量
  • annotatons字段下也是一个列表,列表长度等同地训练集(或验证集)中bounding box 的数量,这里只有人这个类别的bounding box
  • categories字段下也是一个列表,列表长度等同于数据集类别的数,这里是1,只有person这一个类。

    相同内容这里就不再列了,只列不同的。

2.2.1 annotations中的内容

这个类型中的annotation结构中包含 object instance中annotation所有的字段,再加上两个额外的字段。新增的keypoints是一个长度为3*k的数组,第一个和第二个元素分别是x和y坐标值,第三个是标志位v,v为0时表示这个关键点没有标注(这种情况下x=y=v=0),v为1时表示这个关键点标注了但是不可见(被遮挡了),v为2时表示这个关键点标注了同时也可见。num_keypoints表示这个目标上被标注的关键点的数量(v>0),比较小的目标上可能就无法标注关键点。

annotation{
   "segmentation": RLE or [polygon],
   "num_keypoints": int,    
   "area": float,
   "iscrowd": 0 or 1,
   "keypoints": [x1,y1,v1,...],
   "image_id": int,
   "bbox": [x,y,width,height],
   "category_id": int,
   "id": int 
}
           

列举一个:

{
      "segmentation": [
          [
              492.38,
              238.33,
              491.91,
              234.15,
              494.47,
              227.65,
              495.17,
              215.1,
              497.02,
              199.54,
              503.53,
              197.22,
              503.3,
              194.43,
              503.3,
              190.95,
              506.08,
              183.51,
              511.89,
              185.84,
              514.21,
              187,
              514.21,
              196.29,
              521.88,
              200.7,
              526.76,
              216.03,
              520.25,
              227.65,
              519.56,
              234.38,
              519.09,
              239.49,
              519.09,
              244.84,
              519.56,
              246.93,
              518.16,
              248.32,
              516.3,
              256.91,
              510.03,
              256.45,
              513.28,
              240.89
          ]
      ],
      "num_keypoints": 13,
      "area": 1394.7431,
      "iscrowd": 0,
      "keypoints": [
          508,
          192,
          2,
          510,
          191,
          2,
          506,
          191,
          2,
          512,
          192,
          2,
          503,
          192,
          1,
          515,
          202,
          2,
          499,
          202,
          2,
          524,
          214,
          2,
          497,
          215,
          2,
          516,
          226,
          2,
          496,
          224,
          2,
          511,
          232,
          2,
          497,
          230,
          2,
          0,
          0,
          0,
          0,
          0,
          0,
          0,
          0,
          0,
          0,
          0,
          0
      ],
      "image_id": 440475,
      "bbox": [
          491.91,
          183.51,
          34.85,
          73.4
      ],
      "category_id": 1,
      "id": 183302
  }
           

可以看到一共有17个关键点。

COCO数据集介绍1、COCO数据集的介绍2、COCO数据集标注格式本文参考
COCO数据集介绍1、COCO数据集的介绍2、COCO数据集标注格式本文参考

2.2.2 categories中的内容

对于category,相比object instance中的category,新增了两个字段,keypoints是一个长度为k的数组,包含每个关键点的名称;skeleton定义各关键点的连接性(比如人的左手腕和左肘就是连接的,但是左手腕和右手腕就不是)。目前,COCO的keypoints只标注了person category (分类为人)。定义如下:

{
    "supercategory": str,
    "id": int,
    "name": str,    
    "keypoints": [str],
    "skeleton": [edge]
}
           

具体的:

"categories": [
    {
        "supercategory": "person",
        "id": 1,
        "name": "person",
        "keypoints": [
            "nose",
            "left_eye",
            "right_eye",
            "left_ear",
            "right_ear",
            "left_shoulder",
            "right_shoulder",
            "left_elbow",
            "right_elbow",
            "left_wrist",
            "right_wrist",
            "left_hip",
            "right_hip",
            "left_knee",
            "right_knee",
            "left_ankle",
            "right_ankle"
        ],
        "skeleton": [
            [
                16,
                14
            ],
            [
                14,
                12
            ],
            [
                17,
                15
            ],
            [
                15,
                13
            ],
            [
                12,
                13
            ],
            [
                6,
                12
            ],
            [
                7,
                13
            ],
            [
                6,
                7
            ],
            [
                6,
                8
            ],
            [
                7,
                9
            ],
            [
                8,
                10
            ],
            [
                9,
                11
            ],
            [
                2,
                3
            ],
            [
                1,
                2
            ],
            [
                1,
                3
            ],
            [
                2,
                4
            ],
            [
                3,
                5
            ],
            [
                4,
                6
            ],
            [
                5,
                7
            ]
        ]
    }
]
           
COCO数据集介绍1、COCO数据集的介绍2、COCO数据集标注格式本文参考

2.3 看图说话Image Caption文件格式

captions_train2017.json、captions_val2017.json这两个文件就是这种格式。Image Caption这种格式的文件从头至尾按照顺序分为以下段落,看起来和Object Instance一样,不过没有最后的categories字段:

{
    "info": info,
    "licenses": [license],
    "images": [image],
    "annotations": [annotation]
}
           

其中,info、licenses、images这三个结构体/类型 ,在不同的JSON文件中这三个类型是一样的,定义是共享的。不共享的是annotations这种结构体,它在不同类型的JSON文件中是不一样的。

  • annotations: 数量要多于图片的数量,这是因为一个图片可以有多个场景描述;

2.3.1 annotation中的内容

这个类型中的annotation用来存储描述图片的语句。每个语句描述了对应图片的内容,而每个图片至少有5个描述语句(有的图片更多)。annotation定义如下:

annotation{
    "image_id": int,
    "id": int,
    "caption": str
}

           

取一个具体片段:

{
        "image_id": 546219,
        "id": 396378,
        "caption": "A large group is sitting together and eating at a restaurant."
    },
    {
        "image_id": 546219,
        "id": 397413,
        "caption": "The people are gathered at the table for dinner."
    },
    {
        "image_id": 146155,
        "id": 397604,
        "caption": "Two  men standing near a bar drinking together"
    },
    {
        "image_id": 546219,
        "id": 399732,
        "caption": "A large group of people pose for a photo at dinner."
    },
    {
        "image_id": 546219,
        "id": 400023,
        "caption": "The diners are enjoying their various beverages with their meals.."
    }
           

这里的image_id对应images中的Id.

COCO数据集介绍1、COCO数据集的介绍2、COCO数据集标注格式本文参考

本文参考

  • https://zhuanlan.zhihu.com/p/70878433
  • https://blog.csdn.net/weixin_38293440/article/details/81196428

继续阅读