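YOLOv3 Explained (6)

Detection function

Use a trained YOLO v3 model to detect the objects in an image. The steps are:

Create an instance of the YOLO class, yolo;
Load the image with Image.open();
Call yolo.detect_image() to run detection on the image;
Close yolo's session;
Display the annotated image r_image;

Implementation: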

from PIL import Image  # Pillow; YOLO is the detector class described below


def detect_img_for_test():
    yolo = YOLO()
    img_path = './dataset/img.jpg'
    image = Image.open(img_path)
    r_image = yolo.detect_image(image)
    yolo.close_session()
    r_image.show()

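YOLO parameters

Initialization parameters of the YOLO class:

anchors_path: the anchor box configuration file, holding 9 width-height pairs;
model_path: the trained model; retrained models are also supported;
classes_path: the class file, which must match the model file;
score: the confidence threshold; candidate boxes scoring below it are removed;
iou: the IoU threshold for candidate boxes; within one class, boxes overlapping a higher-scoring box by more than this are removed;
class_names: the class list, read from classes_path;
anchors: the anchor box list, read from anchors_path;
model_image_size: the image size the model detects on; every input image is padded to this size;
colors: a set of colors generated in HSV space and then shuffled, one per class in class_names;
boxes, scores, classes: the core detection outputs, produced by generate() as a wrapper around the model output.

Implementation: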

self.anchors_path = 'configs/yolo_anchors.txt'  # anchors
self.model_path = 'model_data/yolo_weights.h5'  # model file
self.classes_path = 'configs/coco_classes.txt'  # class file

self.score = 0.20
self.iou = 0.20
self.class_names = self._get_class()  # load the class names
self.anchors = self._get_anchors()  # load the anchors
self.sess = K.get_session()
self.model_image_size = (416, 416)  # fixed size or (None, None), hw
self.colors = self.__get_colors(self.class_names)
self.boxes, self.scores, self.classes = self.generate()
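
The helpers _get_anchors() and _get_class() are not shown in this article. As a rough sketch, assuming the anchors file holds a single line of comma-separated numbers (w1,h1, w2,h2, ...) and the class file holds one name per line, the parsing could look like the following. The names parse_anchors and parse_classes are hypothetical, and they take strings rather than file paths to keep the example self-contained.

```python
def parse_anchors(text):
    """Turn 'w1,h1, w2,h2, ...' into a list of (w, h) pairs (assumed format)."""
    values = [float(v) for v in text.split(',')]
    if len(values) % 2 != 0:
        raise ValueError('expected an even number of values')
    return [(values[i], values[i + 1]) for i in range(0, len(values), 2)]


def parse_classes(text):
    """One class name per line (assumed format); blank lines are skipped."""
    return [line.strip() for line in text.splitlines() if line.strip()]


anchors = parse_anchors(
    '10,13,  16,30,  33,23,  30,61,  62,45,  59,119,  116,90,  156,198,  373,326')
class_names = parse_classes('person\nbicycle\ncar\n')
print(len(anchors), anchors[0], anchors[-1])
print(class_names)
```

The number of parsed pairs gives num_anchors and the number of names gives num_classes, which is why the class file has to match the model.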

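In __get_colors():

Divide the H value (position 0 of HSV) evenly by the number of classes, with S and V both fixed at 1, to build a list of HSV tuples;
Call hsv_to_rgb to convert from the HSV color space to RGB;
Multiply the 0~1 RGB values by 255 to get full color values (0~255);
Shuffle the color list randomly, with a fixed seed so the colors are the same on every run;

Implementation: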

@staticmethod
def __get_colors(names):
    # A different color for each class of box
    hsv_tuples = [(float(x) / len(names), 1., 1.)
                  for x in range(len(names))]  # evenly spaced hues
    colors = list(map(lambda x: colorsys.hsv_to_rgb(*x), hsv_tuples))
    colors = list(map(lambda x: (int(x[0] * 255), int(x[1] * 255), int(x[2] * 255)), colors))  # RGB, 0~255
    np.random.seed(10101)  # fixed seed: same colors on every run
    np.random.shuffle(colors)
    np.random.seed(None)  # reset the seed

    return colors

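HSV is used instead of RGB because evenly spaced hues at full saturation and value give colors that are far apart, so the drawn boxes are easier to tell apart.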
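
To see what evenly spaced hues produce, here is a small standard-library sketch of the same idea. It is not the article's method verbatim: make_palette is a hypothetical name, and it shuffles with a private random.Random generator instead of reseeding NumPy's global state.

```python
import colorsys
import random


def make_palette(num_classes, seed=10101):
    """Evenly spaced hues at full saturation/value, as 0-255 RGB tuples."""
    hsv = [(i / num_classes, 1.0, 1.0) for i in range(num_classes)]
    rgb = [colorsys.hsv_to_rgb(*c) for c in hsv]
    palette = [tuple(int(ch * 255) for ch in c) for c in rgb]
    # Shuffle with a private, seeded generator so the global random state
    # is untouched and the class-to-color mapping is the same on every run.
    random.Random(seed).shuffle(palette)
    return palette


palette = make_palette(4)
print(palette)
```

With 4 classes the hues are 0, 0.25, 0.5 and 0.75, which map to red, yellow-green, cyan and violet: four colors that are easy to distinguish.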

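Output wrapping

boxes, scores and classes are a further wrapper on top of the model, produced by generate():

boxes: the four coordinates of each box, (top, left, bottom, right);
scores: the class confidence of each box, combining the box confidence and the class confidence;
classes: the class of each box;

In generate(), the following are set:

num_anchors: the total number of anchor boxes, usually 9;
num_classes: the total number of classes, e.g. 80 for COCO;
yolo_model: the model built by yolo_body, with its weights loaded by load_weights;

Implementation: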

num_anchors = len(self.anchors)  # number of anchors
num_classes = len(self.class_names)  # number of classes

self.yolo_model = yolo_body(Input(shape=(416, 416, 3)), 3, num_classes)
self.yolo_model.load_weights(self.model_path)  # load the model weights

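Next, input_image_shape is defined as a placeholder, i.e. a TF tensor whose value is fed at run time. yolo_eval receives:

yolo_model.output, the model output to be wrapped further;
anchors, the anchor box list;
len(class_names), the total number of classes;
input_image_shape, the size of the input image as (height, width), fed at run time with the size of the original image;
score_threshold, the threshold on the overall box confidence, score;
iou_threshold, the IoU threshold for boxes of the same class, iou;
It returns the box coordinates boxes, the class confidences scores and the box classes classes;

Implementation: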

self.input_image_shape = K.placeholder(shape=(2,))
boxes, scores, classes = yolo_eval(
    self.yolo_model.output, self.anchors, len(self.class_names),
    self.input_image_shape, score_threshold=self.score, iou_threshold=self.iou)
return boxes, scores, classes

Every value in the output scores is at least score_threshold; lower-scoring boxes have already been removed inside yolo_eval().

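YOLO evaluation

The function yolo_eval() wraps the prediction logic. Its inputs are:

yolo_outputs: the output of the YOLO model, a list of 3 scales (13, 26 and 52). The last dimension holds the predictions, 255 = 3 x (5 + 80): 3 is the number of anchors per layer, 5 is the 4 box values xywh plus 1 confidence that the box contains an object, and 80 is the number of COCO classes;
anchors: the values of the 9 anchor boxes;
num_classes: the number of classes, 80 for COCO;
image_shape: a TF placeholder holding the (height, width) of the original image, fed at run time;
max_boxes: the maximum number of detection boxes, 20. It is passed to NMS, which runs once per class, so it caps the boxes kept per class;
score_threshold: the box confidence threshold; boxes scoring below it are removed. Lower it to get more boxes, raise it to get fewer. The default in the signature is 0.6, but this article passes self.score = 0.20;
iou_threshold: the IoU threshold for boxes of the same class; overlapping boxes above it are removed. Raise it when many objects overlap, lower it when few do;

The format of yolo_outputs is: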

[(?, 13, 13, 255), (?, 26, 26, 255), (?, 52, 52, 255)]
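
The shapes above follow from simple arithmetic. The sketch below reproduces them for a square input, assuming the usual YOLOv3 downsampling factors of 32, 16 and 8 for the three scales; yolo_output_shapes is a hypothetical helper, not part of the article's code.

```python
def yolo_output_shapes(input_size=416, num_anchors_per_scale=3, num_classes=80):
    """Grid size and channel count of each output scale for a square input."""
    # 4 box values + 1 objectness confidence + one score per class, per anchor
    channels = num_anchors_per_scale * (5 + num_classes)
    strides = (32, 16, 8)  # assumed downsampling factor of each output scale
    return [(input_size // s, input_size // s, channels) for s in strides]


shapes = yolo_output_shapes()
total_boxes = sum(h * w * 3 for h, w, _ in shapes)
print(shapes)        # [(13, 13, 255), (26, 26, 255), (52, 52, 255)]
print(total_boxes)   # 10647
```

So for a 416x416 input the model proposes 10647 candidate boxes per image, which is why the score threshold and NMS are needed afterwards.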

The anchors list is:

[(10,13), (16,30), (33,23), (30,61), (62,45), (59,119), (116,90), (156,198), (373,326)]

Implementation:

boxes, scores, classes = yolo_eval(
    self.yolo_model.output, self.anchors, len(self.class_names),
    self.input_image_shape, score_threshold=self.score, iou_threshold=self.iou)

def yolo_eval(yolo_outputs, anchors, num_classes, image_shape,
              max_boxes=20, score_threshold=.6, iou_threshold=.5):

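Next, the parameters are prepared:

num_layers: the number of output feature maps, 3;
anchor_mask: splits the anchors across the 3 layers; layer 1 (13x13) uses anchors 6, 7, 8, layer 2 (26x26) uses 3, 4, 5, and layer 3 (52x52) uses 0, 1, 2;
input_shape: the size of the input image, i.e. the size of feature map 0 multiplied by 32 (13 x 32 = 416); this follows from the Darknet network structure.

Implementation:

num_layers = len(yolo_outputs)
anchor_mask = [[6, 7, 8], [3, 4, 5], [0, 1, 2]] if num_layers == 3 else [[3, 4, 5], [1, 2, 3]] # default setting
input_shape = K.shape(yolo_outputs[0])[1:3] * 32

The larger the feature map (13 -> 52), the smaller the objects it detects and the smaller the anchors it needs, which is why the anchors list is assigned in reverse order.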
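
As a quick illustration of this reverse assignment, the following plain-Python sketch applies the 3-layer anchor_mask to the anchors list given earlier and prints which anchors each grid uses. The variable names grid_sizes and per_scale are only for this example.

```python
anchors = [(10, 13), (16, 30), (33, 23), (30, 61), (62, 45), (59, 119),
           (116, 90), (156, 198), (373, 326)]
anchor_mask = [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
grid_sizes = [13, 26, 52]

# Map each grid size to the anchors selected by its mask.
per_scale = {
    grid: [anchors[i] for i in mask]
    for grid, mask in zip(grid_sizes, anchor_mask)
}
for grid, selected in per_scale.items():
    print('{0}x{0}: {1}'.format(grid, selected))
```

The coarse 13x13 grid gets the three largest anchors and the fine 52x52 grid gets the three smallest.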

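Next, for each layer l of yolo_outputs, yolo_boxes_and_scores() is called to extract the boxes _boxes and the confidences _box_scores. The box data of the 3 layers is collected in the lists boxes and box_scores and then flattened with concatenate, so the result holds all boxes and their confidences.

The formats of the output boxes and box_scores are:

boxes: (?, 4)  # ? is the number of boxes
box_scores: (?, 80)

Implementation: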

boxes = []
box_scores = []
for l in range(num_layers):
    _boxes, _box_scores = yolo_boxes_and_scores(
        yolo_outputs[l], anchors[anchor_mask[l]], num_classes, input_shape, image_shape)
    boxes.append(_boxes)
    box_scores.append(_box_scores)
boxes = K.concatenate(boxes, axis=0)
box_scores = K.concatenate(box_scores, axis=0)

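concatenate flattens the data of the separate layers into one list. This works because the boxes have already been restored to real coordinates, so boxes from different scales no longer differ.

In the function yolo_boxes_and_scores():

Outputs of yolo_head: box_xy is the box center, as a (0~1) relative position; box_wh is the box width and height, as (0~1) relative values; box_confidence is the confidence that the box contains an object; box_class_probs is the class confidence;
yolo_correct_boxes converts the (0~1) relative values of box_xy and box_wh to real coordinates; the output boxes holds (y_min, x_min, y_max, x_max);
reshape flattens the values of the different grid cells into a list of boxes, i.e. (?, 13, 13, 3, 4) -> (?, 4);
box_scores is the product of the box confidence and the class confidence, then flattened by reshape to (?, 80);
Return the boxes boxes and the box confidences box_scores.

Implementation: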

def yolo_boxes_and_scores(feats, anchors, num_classes, input_shape, image_shape):
    '''Process Conv layer output'''
    box_xy, box_wh, box_confidence, box_class_probs = yolo_head(
        feats, anchors, num_classes, input_shape)
    boxes = yolo_correct_boxes(box_xy, box_wh, input_shape, image_shape)
    boxes = K.reshape(boxes, [-1, 4])
    box_scores = box_confidence * box_class_probs
    box_scores = K.reshape(box_scores, [-1, num_classes])
    return boxes, box_scores
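
yolo_correct_boxes itself is not shown in this article. The plain-Python sketch below illustrates the idea for a single box, assuming the image was letterboxed, that is, resized with its aspect ratio preserved and centered with padding. The name correct_box and its argument layout are hypothetical; the real function works on whole tensors.

```python
def correct_box(box_xy, box_wh, input_shape, image_shape):
    """Map one box from letterboxed-input coordinates (0~1) to image pixels.

    box_xy, box_wh: (x, y) center and (w, h) size, relative to the model input.
    input_shape, image_shape: (height, width) of the model input and the image.
    Returns (y_min, x_min, y_max, x_max) in pixels of the original image.
    """
    in_h, in_w = input_shape
    img_h, img_w = image_shape
    scale = min(in_h / img_h, in_w / img_w)        # letterbox resize factor
    new_h, new_w = round(img_h * scale), round(img_w * scale)

    # Fraction of the input taken up by padding on the top/left side.
    off_y = (in_h - new_h) / 2.0 / in_h
    off_x = (in_w - new_w) / 2.0 / in_w

    # Remove the padding, then stretch so the image area spans 0~1 again.
    y = (box_xy[1] - off_y) * in_h / new_h
    x = (box_xy[0] - off_x) * in_w / new_w
    h = box_wh[1] * in_h / new_h
    w = box_wh[0] * in_w / new_w

    return ((y - h / 2) * img_h, (x - w / 2) * img_w,
            (y + h / 2) * img_h, (x + w / 2) * img_w)


box = correct_box((0.5, 0.5), (0.5, 0.375), (416, 416), (624, 832))
print([round(v, 3) for v in box])
```

Here an 832x624 image fills the full width of the 416x416 input but only 312 rows of its height, so the vertical coordinates are shifted and stretched while the horizontal ones are only scaled.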

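Next:

mask: a boolean mask that keeps only the boxes whose score is at least the confidence threshold and filters out the rest;
max_boxes_tensor: the maximum number of boxes as a tensor; max_boxes is 20, and because NMS runs per class, this caps the boxes kept per class;

Implementation: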

mask = box_scores >= score_threshold
max_boxes_tensor = K.constant(max_boxes, dtype='int32')
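
To make the score product and the mask concrete, here is a tiny plain-Python example with two boxes and three classes. All numbers are invented for illustration; only the threshold 0.20 comes from this article.

```python
score_threshold = 0.20  # same value as self.score in this article

# Two hypothetical boxes with 3 classes each (made-up numbers).
box_confidence = [0.9, 0.3]
box_class_probs = [[0.8, 0.1, 0.1],
                   [0.5, 0.4, 0.1]]

# Per-class score = box confidence x class probability.
box_scores = [[conf * p for p in probs]
              for conf, probs in zip(box_confidence, box_class_probs)]
mask = [[s >= score_threshold for s in row] for row in box_scores]

for row, keep in zip(box_scores, mask):
    print([round(s, 2) for s in row], keep)
```

The first box survives only for class 0 (0.9 x 0.8 = 0.72), while the second box is dropped for every class because its low box confidence pulls all products below the threshold.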

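Next:

Use the mask and the class index c to select the boxes class_boxes and the confidences class_box_scores of that class;
Apply NMS (non-maximum suppression) to get the indices nms_index of the boxes to keep;
Use gather with those indices to pick the output boxes class_boxes and confidences class_box_scores, then generate the class information classes;
Combine the data of all classes into the final detection boxes and return them.

Implementation: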

boxes_ = []
scores_ = []
classes_ = []
for c in range(num_classes):
    class_boxes = tf.boolean_mask(boxes, mask[:, c])
    class_box_scores = tf.boolean_mask(box_scores[:, c], mask[:, c])
    nms_index = tf.image.non_max_suppression(
        class_boxes, class_box_scores, max_boxes_tensor, iou_threshold=iou_threshold)
    class_boxes = K.gather(class_boxes, nms_index)
    class_box_scores = K.gather(class_box_scores, nms_index)
    classes = K.ones_like(class_box_scores, 'int32') * c
    boxes_.append(class_boxes)
    scores_.append(class_box_scores)
    classes_.append(classes)
boxes_ = K.concatenate(boxes_, axis=0)
scores_ = K.concatenate(scores_, axis=0)
classes_ = K.concatenate(classes_, axis=0)

Output format:

boxes_: (?, 4)
scores_: (?,)
classes_: (?,)
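
The per-class NMS step can be sketched in plain Python as follows. This is a simplified greedy version for illustration, not the implementation behind tf.image.non_max_suppression; the function names iou and nms are chosen for this example only.

```python
def iou(a, b):
    """IoU of two boxes given as (top, left, bottom, right)."""
    inter_h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, max_boxes, iou_threshold):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if len(keep) >= max_boxes:
            break
        # Keep a box only if it does not overlap any kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores, max_boxes=20, iou_threshold=0.5))
```

The first two boxes overlap with an IoU of about 0.68, so at a threshold of 0.5 the lower-scoring one is suppressed, while the distant third box is kept. Raising the threshold above 0.68 would keep all three, which matches the advice to raise iou_threshold when many objects overlap.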

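Detection method

Step 1, image processing:

Resize the image to the detection size while keeping its aspect ratio, and pad the surrounding area; the detection size must be a multiple of 32;
Add one dimension to the image so that it matches the input format;

The code is as follows: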

if self.model_image_size != (None, None):  # 416x416, 416 = 32 * 13; must be a multiple of 32, since the smallest scale divides by 32
    assert self.model_image_size[0] % 32 == 0, 'Multiples of 32 required'
    assert self.model_image_size[1] % 32 == 0, 'Multiples of 32 required'
    boxed_image = letterbox_image(image, tuple(reversed(self.model_image_size)))  # pad the image
else:
    new_image_size = (image.width - (image.width % 32), image.height - (image.height % 32))
    boxed_image = letterbox_image(image, new_image_size)
image_data = np.array(boxed_image, dtype='float32')
print('detector size {}'.format(image_data.shape))
image_data /= 255.  # scale to 0~1
image_data = np.expand_dims(image_data, 0)  # add the batch dimension
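
letterbox_image is not shown in this article. As a sketch of the geometry it is assumed to compute (an aspect-preserving resize centered inside the target size), the helper below returns the resized dimensions and the padding. Both function names are hypothetical, and integer arithmetic is used to avoid rounding surprises.

```python
def letterbox_geometry(image_size, target_size):
    """Return (new_w, new_h, pad_left, pad_top) for an aspect-preserving fit.

    image_size, target_size: (width, height), as used by Pillow.
    """
    iw, ih = image_size
    tw, th = target_size
    # Compare tw/iw with th/ih by cross-multiplying, to stay in integers.
    if tw * ih <= th * iw:      # width is the limiting side
        new_w, new_h = tw, ih * tw // iw
    else:                       # height is the limiting side
        new_w, new_h = iw * th // ih, th
    return new_w, new_h, (tw - new_w) // 2, (th - new_h) // 2


def round_down_to_32(size):
    """Largest multiple of 32 not exceeding each dimension."""
    return tuple(v - v % 32 for v in size)


print(letterbox_geometry((640, 480), (416, 416)))
print(round_down_to_32((500, 375)))
```

A 640x480 image becomes 416x312 with 52 rows of padding above and below. The second helper mirrors the else branch above, where the size is not fixed and each side is rounded down to a multiple of 32.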

Step 2, feed the data: the image and the image size.

out_boxes, out_scores, out_classes = self.sess.run(
    [self.boxes, self.scores, self.classes],
    feed_dict={
        self.yolo_model.input: image_data,
        self.input_image_shape: [image.size[1], image.size[0]],
        K.learning_phase(): 0
    })

Step 3, draw the boxes: the border width is set automatically, and the borders and class labels are drawn with the Pillow library.

font = ImageFont.truetype(font='font/FiraMono-Medium.otf',
                          size=np.floor(3e-2 * image.size[1] + 0.5).astype('int32'))  # font
thickness = (image.size[0] + image.size[1]) // 512  # border thickness
for i, c in reversed(list(enumerate(out_classes))):
    predicted_class = self.class_names[c]  # class
    box = out_boxes[i]  # box
    score = out_scores[i]  # confidence

    label = '{} {:.2f}'.format(predicted_class, score)  # label
    draw = ImageDraw.Draw(image)  # drawing context
    # Note: draw.textsize was removed in Pillow 10; draw.textbbox replaces it there.
    label_size = draw.textsize(label, font)  # size of the label text

    top, left, bottom, right = box
    top = max(0, np.floor(top + 0.5).astype('int32'))
    left = max(0, np.floor(left + 0.5).astype('int32'))
    bottom = min(image.size[1], np.floor(bottom + 0.5).astype('int32'))
    right = min(image.size[0], np.floor(right + 0.5).astype('int32'))
    print(label, (left, top), (right, bottom))  # box corners

    if top - label_size[1] >= 0:  # label above the box if it fits
        text_origin = np.array([left, top - label_size[1]])
    else:
        text_origin = np.array([left, top + 1])

    # My kingdom for a good redistributable image drawing library.
    for t in range(thickness):  # draw the border as nested 1-pixel rectangles
        draw.rectangle(
            [left + t, top + t, right - t, bottom - t],
            outline=self.colors[c])
    draw.rectangle(  # label background
        [tuple(text_origin), tuple(text_origin + label_size)],
        fill=self.colors[c])
    draw.text(text_origin, label, fill=(0, 0, 0), font=font)  # label text
    del draw
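
The sizing rules used above can be checked without Pillow. The sketch below recomputes the font size, the border thickness and the label position in plain Python; drawing_params and label_origin are hypothetical names for this example.

```python
import math


def drawing_params(image_size):
    """Font size and border thickness for a (width, height) image."""
    width, height = image_size
    font_size = int(math.floor(3e-2 * height + 0.5))   # about 3% of the height
    thickness = (width + height) // 512
    return font_size, thickness


def label_origin(left, top, label_height):
    """Put the label above the box if it fits, otherwise just inside the top."""
    if top - label_height >= 0:
        return (left, top - label_height)
    return (left, top + 1)


print(drawing_params((640, 480)))
print(label_origin(50, 100, 14), label_origin(50, 5, 14))
```

One consequence of the thickness formula: when width plus height is below 512, the thickness is 0 and the border loop draws nothing, so very small images only get the filled label rectangle.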

Supplement

concatenate

concatenate joins tensors with the same number of dimensions along the given axis.

Implementation:

from keras import backend as K

sess = K.get_session()

a = K.constant([[2, 4], [1, 2]])
b = K.constant([[3, 2], [5, 6]])
c = [a, b]
c = K.concatenate(c, axis=0)

print(sess.run(c))
"""
[[2. 4.]
 [1. 2.]
 [3. 2.]
 [5. 6.]]
"""

gather

gather selects elements by index.

Implementation:

from keras import backend as K

sess = K.get_session()

a = K.constant([[2, 4], [1, 2], [5, 6]])
b = K.gather(a, [1, 2])

print(sess.run(b))
"""
[[1. 2.]
 [5. 6.]]
"""

Reference:

http://www.jintiankansha.me/t/LMo7OOJA33
