FaceNet: A Unified Embedding for Face Recognition and Clustering (Paper Notes)

2023-03-15 21:39:02

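This paper proposes a unified system that uses a CNN to learn a Euclidean embedding of face images, so that Euclidean distance in the embedding space corresponds directly to face similarity.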

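Unlike other methods, FaceNet is trained directly with a triplet loss to output a 128-D embedding. Each triplet consists of two matching face thumbnails and one non-matching face thumbnail. The thumbnails are tight crops of the face area, with no 2D or 3D alignment other than scale and translation.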

Choosing which triplets to use is difficult. The paper proposes an online negative-exemplar mining strategy for training the network, and also explores hard-positive mining techniques.

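The paper explores two network architectures. The first is ZF-Net style: multiple interleaved convolution and non-linear activation layers, with several additional 1×1×d convolution layers. The second is based on the Inception architecture, which uses mixed layers that run several different convolution and pooling operations in parallel and concatenate their outputs.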

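Given the model, the most important part is that the whole system is learned end to end. To this end, the triplet loss is used because it directly reflects what we want to achieve. We strive for an embedding f(x), from an image x into a feature space R^d, such that the Euclidean distance between all faces of the same person is small, while the distance between faces of different people is large.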
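To make the "distance equals similarity" idea concrete, here is a minimal sketch of comparing embeddings by squared Euclidean distance. The 3-D vectors are made-up stand-ins for 128-D embeddings, the unit-length normalization mimics the unit-norm constraint used in the paper, and the `threshold` value and the function names are hypothetical choices for illustration only.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def same_identity(emb_a, emb_b, threshold=1.0):
    """Hypothetical verification rule: same person if the distance is small."""
    return squared_distance(emb_a, emb_b) < threshold


# Toy 3-D stand-ins for real 128-D embeddings (values are invented).
alice_1 = l2_normalize([0.9, 0.1, 0.0])
alice_2 = l2_normalize([0.8, 0.2, 0.1])
bob_1 = l2_normalize([0.0, 0.2, 0.9])

print(same_identity(alice_1, alice_2))  # True
print(same_identity(alice_1, bob_1))    # False
```

With an embedding like this, verification becomes thresholding a distance, and clustering can reuse the same distance with any off-the-shelf method.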

Triplet Loss
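For reference, the triplet constraint and the resulting loss can be written as below, following the notation of the FaceNet paper as I recall it: x_i^a is the anchor, x_i^p a positive (same identity), x_i^n a negative (different identity), \alpha the margin enforced between positive and negative pairs, and N the number of triplets. The first line is the constraint we would like every triplet to satisfy; the second is the loss that is minimized.

```latex
\| f(x_i^a) - f(x_i^p) \|_2^2 + \alpha < \| f(x_i^a) - f(x_i^n) \|_2^2

L = \sum_{i=1}^{N} \left[ \| f(x_i^a) - f(x_i^p) \|_2^2
    - \| f(x_i^a) - f(x_i^n) \|_2^2 + \alpha \right]_+
```

Here [z]_+ denotes max(z, 0), so a triplet that already satisfies the constraint contributes zero loss.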

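Note that generating all possible triplets would produce many triplets that easily satisfy the triplet constraint. These triplets contribute nothing to training and slow down convergence, because they still have to be passed through the network. Selecting hard triplets is therefore crucial, since they are the ones that actually improve the model. How to select triplets is covered next.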
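The point about easy triplets can be seen in a small sketch of the per-triplet loss. This is a plain-Python illustration, not the authors' implementation; the 2-D points are invented and the default `margin=0.2` is an assumed value.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss for a single (anchor, positive, negative)."""
    d_ap = squared_distance(anchor, positive)
    d_an = squared_distance(anchor, negative)
    # Zero once the negative is at least `margin` farther than the positive.
    return max(d_ap - d_an + margin, 0.0)


anchor = [0.0, 0.0]
positive = [0.1, 0.0]
easy_negative = [1.0, 0.0]   # far away: constraint already satisfied
hard_negative = [0.05, 0.0]  # closer to the anchor than the positive

print(triplet_loss(anchor, positive, easy_negative))  # 0.0
print(triplet_loss(anchor, positive, hard_negative))  # roughly 0.2075
```

The easy triplet yields zero loss and therefore zero gradient, yet it would still cost a forward pass, which is why triplet selection matters.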

Triplet Selection

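Ideally, for each anchor one would pick the hardest positive (the argmax of the anchor-positive distance) and the hardest negative (the argmin of the anchor-negative distance). Computing the argmin and argmax across the whole dataset is infeasible, however, and it could also lead to poor training, as mislabelled and poorly imaged faces would dominate the hard positives and negatives. There are two obvious choices that avoid this issue: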

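Generate triplets offline every n steps, using the most recent network checkpoint and computing the argmin and argmax on a subset of the data.
Generate triplets online, by selecting the hard positive/negative exemplars from within a mini-batch.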
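The online option can be sketched as follows: for one anchor, take the argmax of the distance over same-label items and the argmin over different-label items, restricted to the current mini-batch. The function name and the toy batch are hypothetical; this is an illustration under those assumptions rather than the paper's code.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def hardest_positive_and_negative(anchor_idx, embeddings, labels):
    """Return (hardest positive index, hardest negative index) in the batch.

    Hardest positive: same label, largest distance to the anchor (argmax).
    Hardest negative: different label, smallest distance (argmin).
    An entry is None when the batch holds no such candidate.
    """
    anchor = embeddings[anchor_idx]
    anchor_label = labels[anchor_idx]

    def dist_to_anchor(i):
        return squared_distance(anchor, embeddings[i])

    positives = [i for i, lab in enumerate(labels)
                 if lab == anchor_label and i != anchor_idx]
    negatives = [i for i, lab in enumerate(labels) if lab != anchor_label]
    hard_pos = max(positives, key=dist_to_anchor) if positives else None
    hard_neg = min(negatives, key=dist_to_anchor) if negatives else None
    return hard_pos, hard_neg


# Toy mini-batch: three embeddings of identity "a", two of identity "b".
embeddings = [[0.0, 0.0], [0.1, 0.0], [0.4, 0.0], [0.3, 0.0], [2.0, 0.0]]
labels = ["a", "a", "a", "b", "b"]
print(hardest_positive_and_negative(0, embeddings, labels))  # (2, 3)
```

The offline option would run the same search, only over a data subset embedded with a saved checkpoint instead of the live mini-batch.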

The paper focuses on online generation, computing the argmin and argmax within a mini-batch.

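In the experiments, the training data is sampled so that each mini-batch contains around 40 faces per identity. In addition, randomly sampled negative faces are added to each mini-batch.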
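A batch sampler along these lines might look like the sketch below. Everything except the "around 40 faces per identity plus random negatives" idea is an assumption: the function name, its parameters, the toy dataset, and the choice to draw the extra faces only from identities not already in the batch.

```python
import random


def build_minibatch(faces_by_identity, num_identities, faces_per_identity=40,
                    num_random_faces=0, rng=None):
    """Sample a mini-batch as a list of (identity, face_id) pairs."""
    if rng is None:
        rng = random.Random()
    # Pick a few identities and take up to `faces_per_identity` faces of each.
    chosen = rng.sample(sorted(faces_by_identity), num_identities)
    batch = []
    for identity in chosen:
        faces = faces_by_identity[identity]
        k = min(faces_per_identity, len(faces))
        batch.extend((identity, face) for face in rng.sample(faces, k))
    # Extra faces from other identities act as negatives for every anchor.
    pool = [(identity, face)
            for identity, faces in faces_by_identity.items()
            if identity not in chosen
            for face in faces]
    batch.extend(rng.sample(pool, min(num_random_faces, len(pool))))
    return batch


# Toy dataset: 10 hypothetical identities with 50 face ids each.
dataset = {f"person_{i}": list(range(50)) for i in range(10)}
batch = build_minibatch(dataset, num_identities=3, faces_per_identity=40,
                        num_random_faces=20, rng=random.Random(0))
print(len(batch))  # 140
```

Having many faces of the same identity in one batch is what makes the anchor-positive distances meaningful there.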

Instead of picking the hardest positive, the authors use all anchor-positive pairs in a mini-batch while still selecting the hard negatives. They do not give a side-by-side comparison of hard anchor-positive pairs versus all anchor-positive pairs within a mini-batch, but found in practice that the all anchor-positive method was more stable and converged slightly faster at the beginning of training.
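Enumerating all anchor-positive pairs of a mini-batch is simple to sketch. The function name and labels below are hypothetical, and treating (a, p) and (p, a) as two separate pairs is an assumption of this illustration.

```python
def all_anchor_positive_pairs(labels):
    """List every ordered (anchor, positive) index pair sharing a label."""
    return [(a, p)
            for a, label_a in enumerate(labels)
            for p, label_p in enumerate(labels)
            if a != p and label_a == label_p]


labels = ["a", "a", "a", "b", "b"]
pairs = all_anchor_positive_pairs(labels)
print(len(pairs))  # 8
print(pairs[:3])   # [(0, 1), (0, 2), (1, 0)]
```

Each pair would then be matched with a negative chosen from the same mini-batch to form a triplet.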

The paper also mentions offline triplet generation, which might allow smaller batch sizes, but reports the experiments on it as inconclusive.

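Always selecting the hardest negatives can lead to bad local minima early in training, so the paper selects semi-hard negatives instead: negatives that are farther from the anchor than the positive, but still close enough to lie inside the margin.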
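A semi-hard negative for a given anchor-positive pair satisfies d(a, p) < d(a, n) while still lying inside the margin, i.e. d(a, n) < d(a, p) + margin. The sketch below filters the mini-batch by that rule. Picking the closest of the qualifying negatives, returning None when there is none, and the default `margin=0.2` are assumptions of this illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def semi_hard_negative(anchor_idx, positive_idx, embeddings, labels,
                       margin=0.2):
    """Return the index of a semi-hard negative, or None if none exists."""
    anchor = embeddings[anchor_idx]
    d_ap = squared_distance(anchor, embeddings[positive_idx])
    candidates = []
    for i, label in enumerate(labels):
        if label == labels[anchor_idx]:
            continue  # same identity, so not a negative
        d_an = squared_distance(anchor, embeddings[i])
        # Farther than the positive, but still within the margin.
        if d_ap < d_an < d_ap + margin:
            candidates.append((d_an, i))
    if not candidates:
        return None
    return min(candidates)[1]  # closest qualifying negative


embeddings = [[0.0, 0.0], [0.3, 0.0], [0.2, 0.0], [0.4, 0.0], [2.0, 0.0]]
labels = ["a", "a", "b", "b", "b"]
print(semi_hard_negative(0, 1, embeddings, labels))  # 3
```

In the toy batch, index 2 is the hardest negative (closer to the anchor than the positive) and is skipped, index 4 is too easy, and index 3 is the semi-hard one.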

Those are the key points of the paper. The rest covers the training procedure for the Inception and ZF-Net models and the experimental results.
