Table of contents
- Debugging setup
- Prerequisites
- Overall structure
- Core functions
Debugging setup
- Debug configuration
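All other parameters keep their default values. 【--workers=0】 means that data is loaded in the main process only, so during debugging the same data is read on every run, which makes it easy to print how key variables change.

{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python: Current File",
            "type": "python",
            "request": "launch",
            "program": "val.py",
            "console": "integratedTerminal",
            "justMyCode": true,
            "args": ["--data", "coco128.yaml", "--workers", "0", "--batch-size", "1"]
        }
    ]
}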
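- Debug data
The data comes from 【coco128.yaml】 (the value passed to --data in the launch configuration). The image is 【000000000308.jpg】, which contains 4 classes and 13 target boxes in total, as shown in the figure below.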
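Prerequisites
To follow how mAP is computed, a few basic concepts and technical terms need to be clear first. Reference blog post: Metric Evaluation: AP & mAP Explained in Detail.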
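For quick reference, the quantities used in the rest of this article can be written as below. This is a summary in the commonly used form, not a quotation from the referenced post; the symbols $N_{cls}$ (number of classes), $T$ (the set of IoU thresholds 0.50, 0.55, ..., 0.95) and $p(r)$ (precision as a function of recall) are notation introduced here for illustration.

```latex
\begin{aligned}
\mathrm{IoU}(A, B) &= \frac{|A \cap B|}{|A \cup B|} \\
P &= \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN} \\
AP &= \int_0^1 p(r)\, dr \\
mAP@0.5 &= \frac{1}{N_{cls}} \sum_{c=1}^{N_{cls}} AP_c^{(0.5)} \\
mAP@0.5{:}0.95 &= \frac{1}{|T|} \sum_{t \in T} \frac{1}{N_{cls}} \sum_{c=1}^{N_{cls}} AP_c^{(t)}
\end{aligned}
```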
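Overall structure
Metric evaluation mainly involves two script files: 【val.py】, which controls the main evaluation flow, and 【metrics.py】, which holds the actual implementation of the core functionality, such as the AP and mAP computation and the plotting of PR curves.
- The rough structure of script 【val.py】 is as follows: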
def save_one_txt(predn, save_conf, shape, file):
    # save txt

def save_one_json(predn, jdict, path, class_map):
    # Save one JSON result {"image_id": 42,
    #                       "category_id": 18,
    #                       "bbox": [258.15, 41.29, 348.26, 243.78],
    #                       "score": 0.236}

def process_batch(detections, labels, iouv):
    """
    Return correct predictions matrix. Both sets of boxes are in (x1, y1, x2, y2) format.
    Arguments:
        detections (Array[N, 6]), x1, y1, x2, y2, conf, class
        labels (Array[M, 5]), class, x1, y1, x2, y2
    Returns:
        correct (Array[N, 10]), for 10 IoU levels
    """
    # One of the key functions for computing the metrics.
    # iou: [0.5:0.95]; for 10 different IoU thresholds, match labels against
    # predictions and store in a matrix whether each prediction is correct.

@torch.no_grad()
def run(
        data,
        weights=None,  # model.pt path(s)
        batch_size=32,  # batch size
        imgsz=640,  # inference size (pixels)
        conf_thres=0.001,  # confidence threshold
        iou_thres=0.6,  # NMS IoU threshold
        task='val',  # train, val, test, speed or study
        device='',  # cuda device, i.e. 0 or 0,1,2,3 or cpu
        workers=8,  # max dataloader workers (per RANK in DDP mode)
        ...
        ...
):
    """
    Processing flow of run():
    1. load the model;
    2. load the data;
    3. run the network and apply NMS;
    4. compute AP and mAP;
    5. plot the metric figures;
    6. save the results;
    """

def parse_opt():
    # definition of the run-time arguments

def main(opt):
    # entry function
    run(**vars(opt))

if __name__ == "__main__":
    opt = parse_opt()
    main(opt)
- The rough structure of script 【metrics.py】 is as follows:
It contains the implementation of the core functionality.
def ap_per_class(tp, conf, pred_cls, target_cls, plot=False, save_dir='.', names=(), eps=1e-16):
    """ Compute the average precision, given the recall and precision curves.
    Source: https://github.com/rafaelpadilla/Object-Detection-Metrics.
    # Arguments
        tp:  True positives (nparray, nx1 or nx10).
        conf:  Objectness value from 0-1 (nparray).
        pred_cls:  Predicted object classes (nparray).
        target_cls:  True object classes (nparray).
        plot:  Plot precision-recall curve at mAP@0.5
        save_dir:  Plot save directory
    # Returns
        The average precision as computed in py-faster-rcnn.
    """
    # Computes AP, F1, Precision and Recall, and plots the metric figures

def compute_ap(recall, precision):
    """ Compute the average precision, given the recall and precision curves
    # Arguments
        recall:    The recall curve (list)
        precision: The precision curve (list)
    # Returns
        Average precision, precision curve, recall curve
    """
    # Computes the AP value of each class from its PR curve
    # Two methods for the area under the curve: 101-point interpolation or continuous integration

class ConfusionMatrix:
    # Updated version of https://github.com/kaanakan/object_detection_confusion_matrix
    def __init__(self, nc, conf=0.25, iou_thres=0.45):
        self.matrix = np.zeros((nc + 1, nc + 1))
        self.nc = nc  # number of classes
        self.conf = conf
        self.iou_thres = iou_thres
    # Computes the confusion matrix

def bbox_iou(box1, box2, xywh=True, GIoU=False, DIoU=False, CIoU=False, eps=1e-7):
    # Returns Intersection over Union (IoU) of box1(1,4) to box2(n,4)

def plot_pr_curve(px, py, ap, save_dir=Path('pr_curve.png'), names=()):
    # Plots the PR curve
Core functions
- process_batch
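Purpose: prepare the inputs needed to compute AP and mAP.
Basic idea: for a set of IoU thresholds, mark every predicted box as either a correct or an incorrect prediction (True for correct, False for incorrect) and store the marks in the 【correct】 matrix. After NMS, predicted boxes and label boxes are not necessarily in one-to-one correspondence, so for the later mAP computation each label box must end up paired with at most one predicted box.
Removing duplicates: when one predicted box matches several label boxes, only the match with the largest IoU is kept and the box's other matches are discarded; when one label box matches several predicted boxes, they are filtered by confidence, so the predicted box with the highest confidence is the correct prediction and the others count as incorrect.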
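The two de-duplication rules can be tried on a toy example. The sketch below assumes a hypothetical `matches` array with rows `[label_id, detection_id, iou]` and assumes, as the article states for the NMS output, that a smaller detection id means a higher confidence; the numbers are made up for illustration.

```python
import numpy as np

# Hypothetical matches at one IoU threshold: [label_id, detection_id, iou].
matches = np.array([
    [0, 0, 0.62],  # label 0 <-> detection 0
    [0, 1, 0.90],  # label 0 <-> detection 1
    [1, 1, 0.55],  # label 1 <-> detection 1 (detection 1 matches two labels)
    [1, 2, 0.70],  # label 1 <-> detection 2
])

# Sort rows by IoU, largest first.
matches = matches[matches[:, 2].argsort()[::-1]]
# Keep one row per detection: np.unique reports the first occurrence of each
# detection id, which after the sort is its largest-IoU match.
matches = matches[np.unique(matches[:, 1], return_index=True)[1]]
# Keep one row per label: the rows are now ordered by detection id, so the
# first occurrence of each label is its lowest-id (highest-confidence) detection.
matches = matches[np.unique(matches[:, 0], return_index=True)[1]]

kept_detections = matches[:, 1].astype(int).tolist()
print(matches)
```

Here detection 1 first drops its weaker match to label 1, and label 0 then keeps detection 0 rather than detection 1 even though detection 1 has the larger IoU, because the second step picks by detection order, not by IoU.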
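The basic flow of the function is as follows:
(1) Initialise the return matrix correct with shape [num_predicts, 10], where num_predicts is the number of predicted boxes left after NMS and 10 is the number of IoU thresholds;
(2) Compute the IoU between every ground-truth box and every predicted box, giving a matrix of shape [num_gt, num_predicts];
(3) Mark where the predicted class equals the label class and store the result in correct_class, shape [num_gt, num_predicts];
(4) Loop over the IoU thresholds and find the correctly predicted boxes at each one; the loop body is steps (5)-(9);
(5) Select the pairs whose class matches and whose IoU is at or above the threshold, and store their indices in x;
(6) If x is not empty, combine the label-box id, the predicted-box id and the corresponding IoU into a new matrix, matches;
(7) Sort matches by IoU in descending order;
(8) Remove duplicated predicted boxes and duplicated ground-truth boxes, so that predicted boxes and ground-truth boxes are in one-to-one correspondence;
(9) Take matches[:, 1] from the remaining rows and set the corresponding positions of correct to True.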
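Steps (2), (3) and (5) rely on broadcasting. The following is a minimal NumPy sketch of those three steps, not the code from 【val.py】: `pairwise_iou` is a hypothetical stand-in written for this example, and the boxes, classes and the 0.6 threshold are invented values.

```python
import numpy as np

def pairwise_iou(a, b, eps=1e-7):
    """IoU of every box in a (M, 4) with every box in b (N, 4), xyxy format."""
    lt = np.maximum(a[:, None, :2], b[None, :, :2])  # top-left of each intersection
    rb = np.minimum(a[:, None, 2:], b[None, :, 2:])  # bottom-right of each intersection
    inter = np.clip(rb - lt, 0, None).prod(axis=2)   # (M, N) intersection areas
    area_a = (a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    return inter / (area_a[:, None] + area_b[None, :] - inter + eps)

# labels: class, x1, y1, x2, y2
labels = np.array([[0, 0, 0, 10, 10],
                   [1, 20, 20, 30, 30]], dtype=float)
# detections: x1, y1, x2, y2, conf, class
detections = np.array([[0, 0, 10, 10, 0.9, 0],      # exact box, right class
                       [0, 0, 10, 5, 0.8, 0],       # half-height box, right class
                       [20, 20, 30, 30, 0.7, 0]],   # exact box, wrong class
                      dtype=float)

iou = pairwise_iou(labels[:, 1:], detections[:, :4])   # step (2): shape (2, 3)
correct_class = labels[:, 0:1] == detections[:, 5]     # step (3): (2, 1) vs (3,) -> (2, 3)
label_idx, det_idx = np.where((iou >= 0.6) & correct_class)  # step (5)
```

Only the pair (label 0, detection 0) survives: detection 1 has the right class but an IoU of about 0.5, and detection 2 overlaps label 1 perfectly but carries the wrong class.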
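Explanation of the returned tensor:
correct: [num_detections, 10], with correctly predicted boxes marked True. Each column corresponds to one IoU threshold and records whether each predicted box is correct at that threshold. Within a column, a row is True only if that predicted box has the right class and was kept as the match for a label box after de-duplication; all other rows are False.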
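As a small illustration of what one row of 【correct】 looks like, the sketch below assumes a single detection whose kept match has a hypothetical IoU of 0.72 and ignores competition from other boxes.

```python
import numpy as np

iouv = np.linspace(0.5, 0.95, 10)  # the ten thresholds 0.50, 0.55, ..., 0.95
matched_iou = 0.72                 # hypothetical IoU of the kept match

row = matched_iou >= iouv          # one row of `correct`, shape (10,)
print(row.astype(int))
```

The row is True for the thresholds 0.50 to 0.70 and False from 0.75 upward, so the same detection counts as a true positive for mAP@0.5 but as a false positive at the stricter thresholds.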
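Some questions to think about:
(1) Why sort matches by IoU? So that, for each predicted box, the match with the largest IoU is the one that is kept.
(2) Why remove duplicated predicted boxes and ground-truth boxes from matches? Because one ground-truth box can have only one correct predicted box.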
import numpy as np
import torch

def process_batch(detections, labels, iouv):
    """
    Return correct predictions matrix. Both sets of boxes are in (x1, y1, x2, y2) format.
    Arguments:
        detections (Array[N, 6]), x1, y1, x2, y2, conf, class
        labels (Array[M, 5]), class, x1, y1, x2, y2
    Returns:
        correct (Array[N, 10]), for 10 IoU levels
    """
    # Shapes for the debug image:
    # detections: [300, 6], iouv: [10] = [0.5, 0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95]
    # labels: [13, 5]
    # correct: [300, 10]; each column says whether a detection is a positive at one IoU threshold
    # 300: number of predicted boxes left after NMS; 10: number of IoU thresholds
    correct = np.zeros((detections.shape[0], iouv.shape[0])).astype(bool)

    # Pairwise IoU between ground-truth and predicted boxes, iou: [13, 300]
    # Each entry is the IoU of one label box with one predicted box
    # (box_iou is the pairwise IoU helper used by val.py)
    iou = box_iou(labels[:, 1:], detections[:, :4])

    # Class agreement, correct_class: [13, 300]
    # Each entry says whether the predicted class equals the label class
    correct_class = labels[:, 0:1] == detections[:, 5]

    # iouv: [0.5:0.05:0.95]; find the correct predictions at each threshold
    for i in range(len(iouv)):
        # x is a tuple of length 2 (example values for i = 0):
        # x[0]: row (label) indices of the True entries; len(x[0]) = 26 means 26 pairs
        #       have IoU at or above the threshold and the same class
        # x[1]: column (detection) indices of the True entries
        x = torch.where((iou >= iouv[i]) & correct_class)  # IoU >= threshold and classes match
        if x[0].shape[0]:
            # matches: [26, 3], the candidate pairs at the current threshold
            # matches[:, 0]: label-box index
            # matches[:, 1]: predicted-box index
            # matches[:, 2]: IoU value
            matches = torch.cat((torch.stack(x, 1), iou[x[0], x[1]][:, None]), 1).cpu().numpy()  # [label, detect, iou]
            if x[0].shape[0] > 1:
                # Sort matches by IoU in descending order
                matches = matches[matches[:, 2].argsort()[::-1]]
                # Each predicted box may appear only once: if a predicted box (same value in
                # matches[:, 1]) matches several labels, keep the row with the largest IoU
                matches = matches[np.unique(matches[:, 1], return_index=True)[1]]
                # matches = matches[matches[:, 2].argsort()[::-1]]
                # Each label may appear only once. matches[:, 1] is now in ascending order
                # (after NMS a higher score means a smaller index), so np.unique on
                # matches[:, 0] keeps, for each label, the detection with the highest confidence
                matches = matches[np.unique(matches[:, 0], return_index=True)[1]]
            # Mark the surviving detections as True in the column of this threshold
            # matches[:, 1]: predicted-box index
            correct[matches[:, 1].astype(int), i] = True
    return torch.tensor(correct, dtype=torch.bool, device=iouv.device)
- ap_per_class
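The basic processing flow of the function is as follows:
(1) Sort the predictions by confidence in descending order;
(2) Loop over the distinct classes present in the labels and process each class separately;
(3) Compute the cumulative TPs and FPs;
(4) Compute the AP of each class at every IoU threshold;
(5) Compute the F1 score.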
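Steps (1) and (3) can be followed by hand on a tiny example. The sketch below assumes one class, one IoU threshold, five hypothetical detections and four ground-truth boxes; it is an illustration of the cumulative-sum idea rather than the function itself.

```python
import numpy as np

# Hypothetical single class: 5 detections, 4 ground-truth boxes, one IoU threshold.
conf = np.array([0.6, 0.9, 0.3, 0.8, 0.5])
tp = np.array([False, True, False, True, True])
n_labels = 4

order = np.argsort(-conf)     # step (1): highest confidence first
tp_sorted = tp[order]

tpc = tp_sorted.cumsum()      # running number of true positives
fpc = (~tp_sorted).cumsum()   # running number of false positives

recall = tpc / n_labels       # fraction of ground-truth boxes found so far
precision = tpc / (tpc + fpc) # fraction of accepted detections that are correct
```

Reading the arrays from left to right corresponds to lowering the confidence threshold one detection at a time: recall can only stay flat or rise, while precision moves up and down.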
Inside the per-class loop, each single PR sequence is passed to compute_ap, which returns the area under the PR curve, that is, the 【ap】 value of the current class. The code is as follows:

from pathlib import Path
import numpy as np

# compute_ap, smooth, plot_pr_curve and plot_mc_curve are helpers from the same script
def ap_per_class(tp, conf, pred_cls, target_cls, plot=False, save_dir='.', names=(), eps=1e-16):
    """ Compute the average precision, given the recall and precision curves.
    Source: https://github.com/rafaelpadilla/Object-Detection-Metrics.
    # Arguments
        tp:  True positives (nparray, nx1 or nx10).
        conf:  Objectness value from 0-1 (nparray).
        pred_cls:  Predicted object classes (nparray).
        target_cls:  True object classes (nparray).
        plot:  Plot precision-recall curve at mAP@0.5
        save_dir:  Plot save directory
    # Returns
        The average precision as computed in py-faster-rcnn.
    """
    # Shapes for the debug image:
    # tp: [300, 10], True where a prediction is correct at the given IoU threshold
    # conf: [300], confidence scores
    # pred_cls: [300], predicted classes
    # target_cls: [13]

    # Sort by objectness
    i = np.argsort(-conf)
    # Reorder the rows of tp, conf and pred_cls by descending confidence
    tp, conf, pred_cls = tp[i], conf[i], pred_cls[i]

    # Find unique classes
    # Distinct classes in the labels and the number of labels of each class
    unique_classes, nt = np.unique(target_cls, return_counts=True)
    nc = unique_classes.shape[0]  # number of classes, number of detections

    # Create Precision-Recall curve and compute AP for each class
    px, py = np.linspace(0, 1, 1000), []  # for plotting
    # ap: AP of every class at every IoU threshold; p, r: precision and recall values
    ap, p, r = np.zeros((nc, tp.shape[1])), np.zeros((nc, 1000)), np.zeros((nc, 1000))
    # Loop over every class present in the labels
    for ci, c in enumerate(unique_classes):
        i = pred_cls == c  # True where the predicted class is c
        n_l = nt[ci]  # number of labels of class c
        n_p = i.sum()  # number of predictions of class c
        if n_p == 0 or n_l == 0:  # no labels or no predictions: skip
            continue

        # Accumulate FPs and TPs, used to compute precision and recall
        # tp[i]: rows of tp predicted as class c, e.g. [48, 10] for 48 such boxes
        fpc = (1 - tp[i]).cumsum(0)
        tpc = tp[i].cumsum(0)

        # Recall
        # recall = tp / all_groundtruths, recall: [num_valid_box, 10]
        recall = tpc / (n_l + eps)  # recall curve
        # Resample onto the grid px: -conf[i] are the original x coordinates (confidence),
        # recall[:, 0] is the recall at iou=0.5
        r[ci] = np.interp(-px, -conf[i], recall[:, 0], left=0)  # negative x, xp because xp decreases

        # Precision
        # precision = tp / (tp + fp)
        precision = tpc / (tpc + fpc)  # precision curve
        # As with recall, use iou=0.5
        p[ci] = np.interp(-px, -conf[i], precision[:, 0], left=1)  # p at pr_score

        # AP from recall-precision curve
        # For the current class, compute the AP at every IoU threshold
        # ci: index of the label class, j: column index, one column per IoU threshold
        for j in range(tp.shape[1]):
            ap[ci, j], mpre, mrec = compute_ap(recall[:, j], precision[:, j])
            # Only the PR curve at iou=0.5 is plotted
            if plot and j == 0:
                py.append(np.interp(px, mrec, mpre))  # precision at mAP@0.5

    # Compute F1 (harmonic mean of precision and recall)
    f1 = 2 * p * r / (p + r + eps)
    names = [v for k, v in names.items() if k in unique_classes]  # list: only classes that have data
    names = dict(enumerate(names))  # to dict
    if plot:
        plot_pr_curve(px, py, ap, Path(save_dir) / 'PR_curve.png', names)
        plot_mc_curve(px, f1, Path(save_dir) / 'F1_curve.png', names, ylabel='F1')
        plot_mc_curve(px, p, Path(save_dir) / 'P_curve.png', names, ylabel='Precision')
        plot_mc_curve(px, r, Path(save_dir) / 'R_curve.png', names, ylabel='Recall')

    i = smooth(f1.mean(0), 0.1).argmax()  # max F1 index
    p, r, f1 = p[:, i], r[:, i], f1[:, i]
    tp = (r * nt).round()  # true positives
    fp = (tp / (p + eps) - tp).round()  # false positives
    return tp, fp, p, r, f1, ap, unique_classes.astype(int)
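The calls `np.interp(-px, -conf[i], ...)` negate both axes because `np.interp` expects its sample points `xp` to be increasing, while the confidences are sorted in decreasing order. The sketch below shows the effect on made-up values, with a 5-point grid standing in for the 1000-point `px` of the source.

```python
import numpy as np

conf = np.array([0.9, 0.8, 0.6, 0.5, 0.3])        # sorted, decreasing
recall = np.array([0.25, 0.5, 0.5, 0.75, 0.75])   # recall after each detection
px = np.linspace(0, 1, 5)                         # confidence grid: 0, 0.25, 0.5, 0.75, 1

# Negating both makes the sample points increasing, as np.interp requires.
# left=0 is returned for grid points above the highest confidence.
r_at_conf = np.interp(-px, -conf, recall, left=0)
print(r_at_conf)
```

The result is recall as a function of the confidence threshold: 0.75 when the threshold is at or below 0.5, 0.5 at a threshold of 0.75, and 0 at a threshold of 1, where no detection is accepted.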
import numpy as np

def compute_ap(recall, precision):
    """ Compute the average precision, given the recall and precision curves
    # Arguments
        recall:    The recall curve (list)
        precision: The precision curve (list)
    # Returns
        Average precision, precision curve, recall curve
    """
    # recall: sequence for one fixed class and one fixed IoU threshold
    # precision: sequence for one fixed class and one fixed IoU threshold

    # Append sentinel values to beginning and end, closing the interval
    # mrec: x coordinates of the AP curve
    # mpre: y coordinates of the AP curve
    mrec = np.concatenate(([0.0], recall, [1.0]))
    mpre = np.concatenate(([1.0], precision, [0.0]))

    # Compute the precision envelope
    # Make the precision curve monotonically non-increasing: each value is replaced
    # by the maximum precision found at the same or any higher recall
    mpre = np.flip(np.maximum.accumulate(np.flip(mpre)))

    # Integrate area under curve
    # Two ways to get the area under the PR curve: interpolation (101 points here) and continuous
    method = 'interp'  # methods: 'continuous', 'interp'
    if method == 'interp':
        x = np.linspace(0, 1, 101)  # 101-point interp (COCO)
        ap = np.trapz(np.interp(x, mrec, mpre), x)  # integrate
    else:  # 'continuous'
        i = np.where(mrec[1:] != mrec[:-1])[0]  # points where x axis (recall) changes
        ap = np.sum((mrec[i + 1] - mrec[i]) * mpre[i + 1])  # area under curve

    return ap, mpre, mrec
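To see what the precision envelope and the two integration methods do, the steps of compute_ap can be replayed on a short hypothetical PR sequence. This is a sketch for illustration: the input values are invented, and the trapezoid rule is written out by hand instead of calling a library integrator.

```python
import numpy as np

# Hypothetical PR sequence for one class at one IoU threshold.
recall = np.array([0.25, 0.5, 0.5, 0.75, 0.75])
precision = np.array([1.0, 1.0, 2 / 3, 0.75, 0.6])

# Sentinel values close the curve at recall 0 and recall 1.
mrec = np.concatenate(([0.0], recall, [1.0]))
mpre = np.concatenate(([1.0], precision, [0.0]))

# Envelope: running maximum taken from the right, so the dip to 2/3 is lifted to 0.75.
envelope = np.flip(np.maximum.accumulate(np.flip(mpre)))

# 'continuous': sum of rectangles wherever recall changes.
i = np.where(mrec[1:] != mrec[:-1])[0]
ap_continuous = np.sum((mrec[i + 1] - mrec[i]) * envelope[i + 1])

# 'interp': resample at 101 recall points, then apply the trapezoid rule.
x = np.linspace(0, 1, 101)
y = np.interp(x, mrec, envelope)
ap_interp = np.sum((y[1:] + y[:-1]) / 2 * np.diff(x))

print(envelope, ap_continuous, ap_interp)
```

The two methods generally give slightly different numbers for the same curve, because the interpolated version draws straight lines between points while the continuous version uses steps.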