FaceNet: A Unified Embedding for Face Recognition and Clustering (Paper Notes)

2023-03-15 21:39:02

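This paper proposes a unified system that uses a CNN to learn a Euclidean embedding of face images, so that Euclidean distance in the embedding space corresponds directly to face similarity.

Unlike other approaches, FaceNet is trained with a triplet loss to output a 128-D embedding directly. Each triplet consists of two matching face thumbnails and one non-matching thumbnail. The thumbnails are tight crops of the face region; no 2D or 3D alignment is applied other than scaling and translation.

Choosing which triplets to use is difficult. The paper proposes an online negative-mining strategy for training the network, and also explores hard-positive mining techniques.

The paper explores two network architectures. The first is based on ZF-Net (Zeiler & Fergus): interleaved convolution and non-linear activation layers, with several extra 1×1×d convolution layers added. The second is based on the Inception architecture, which uses mixed layers that run several convolution and pooling operations in parallel and concatenate their outputs.

Treating the model as given, the most important part of the approach is end-to-end learning of the whole system. To this end, the triplet loss is used because it directly reflects the goal: learn an embedding f(x) from an image x into a feature space R^d such that, across all faces, the Euclidean distance between faces of the same person is small and the distance between faces of different people is large.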


Triplet Loss
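For reference, the triplet constraint and the loss derived from it are usually written as below. Here x_i^a is the anchor, x_i^p a positive (same identity), x_i^n a negative (different identity), α the margin, and [·]_+ means max(·, 0). The notation follows the FaceNet paper as I understand it; treat the exact symbols as an assumption rather than a verbatim quote.

```latex
\begin{align}
\|f(x_i^a) - f(x_i^p)\|_2^2 + \alpha &< \|f(x_i^a) - f(x_i^n)\|_2^2 \\
L &= \sum_{i=1}^{N} \Big[ \|f(x_i^a) - f(x_i^p)\|_2^2
     - \|f(x_i^a) - f(x_i^n)\|_2^2 + \alpha \Big]_+
\end{align}
```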


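Note that generating all possible triplets would produce many triplets that already satisfy the triplet constraint easily. These contribute nothing to training and slow convergence, because they still have to be passed through the network. Selecting hard triplets, which actually help improve the model, is therefore crucial. The next section covers how triplets are chosen.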
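To make the "easily satisfied" point concrete, here is a minimal NumPy sketch of the triplet loss on two toy triplets. The function name `triplet_loss`, the toy embeddings, and the default margin of 0.2 are my own assumptions for illustration, not code from the paper.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss over a batch of triplets.

    anchor, positive, negative: arrays of shape (N, d) holding embeddings f(x).
    margin: the alpha in the loss (0.2 is an assumed default here).
    Returns the summed loss and the per-triplet terms.
    """
    d_ap = np.sum((anchor - positive) ** 2, axis=1)  # squared anchor-positive distance
    d_an = np.sum((anchor - negative) ** 2, axis=1)  # squared anchor-negative distance
    per_triplet = np.maximum(d_ap - d_an + margin, 0.0)
    return per_triplet.sum(), per_triplet

# Two toy triplets in a 2-D embedding space (hypothetical values).
a = np.array([[0.0, 0.0], [0.0, 0.0]])
p = np.array([[0.1, 0.0], [0.1, 0.0]])
n = np.array([[1.0, 0.0],   # far-away negative: constraint already satisfied
              [0.3, 0.0]])  # nearby negative: constraint violated
total, per_triplet = triplet_loss(a, p, n)
```

The first triplet contributes exactly zero loss, so it produces no gradient even though it still costs a forward pass. Only the second triplet, whose negative sits close to the anchor, drives learning.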

Triplet Selection


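For a given anchor, the ideal hard positive is the same-identity face farthest from it (an argmax over distance) and the ideal hard negative is the different-identity face closest to it (an argmin). Computing the argmin and argmax over the entire dataset is infeasible. It may also lead to poor training, as mislabelled and poorly imaged faces would dominate the hard positives and negatives. There are two obvious choices that avoid this issue:

Generate triplets offline every n steps, using the most recent network checkpoint and computing the argmin and argmax on a subset of the data;
Generate triplets online, by selecting the hard positive/negative exemplars from within a mini-batch.

The paper focuses on online generation, computing the argmin and argmax within a mini-batch.

In the experiments, the training data is sampled so that each mini-batch contains around 40 faces per identity. In addition, randomly sampled negative faces are added to each mini-batch.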
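The in-batch argmax/argmin can be sketched with a pairwise distance matrix and two masks. This is only an illustration under my own assumptions: the helper names `pairwise_sq_dists` and `hardest_in_batch` are hypothetical, and the code assumes every identity in the batch has at least two faces and that the batch holds at least two identities.

```python
import numpy as np

def pairwise_sq_dists(emb):
    """Squared Euclidean distance between every pair of rows of emb, shape (B, d)."""
    diff = emb[:, None, :] - emb[None, :, :]
    return np.sum(diff ** 2, axis=-1)

def hardest_in_batch(emb, labels):
    """For each anchor, return the index of its hardest positive and hardest negative.

    Hardest positive: same identity, largest distance (argmax).
    Hardest negative: different identity, smallest distance (argmin).
    """
    dists = pairwise_sq_dists(emb)
    same = labels[:, None] == labels[None, :]
    not_self = ~np.eye(len(labels), dtype=bool)
    pos_mask = same & not_self          # same identity, excluding the anchor itself
    neg_mask = ~same                    # different identity
    hardest_pos = np.where(pos_mask, dists, -np.inf).argmax(axis=1)
    hardest_neg = np.where(neg_mask, dists, np.inf).argmin(axis=1)
    return hardest_pos, hardest_neg

# Toy batch: 1-D embeddings for five faces of two identities (hypothetical values).
emb = np.array([[0.0], [1.0], [0.2], [1.5], [3.0]])
labels = np.array([0, 0, 0, 1, 1])
hardest_pos, hardest_neg = hardest_in_batch(emb, labels)
```

Because everything is computed from one (B, B) distance matrix, the cost scales with the mini-batch size rather than the dataset size, which is what makes the online variant practical.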
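A batch sampler along these lines might look like the sketch below. Only the "around 40 faces per identity" figure comes from the notes above; the function name `sample_batch`, the number of identities per batch, and the number of extra random negatives are placeholders I picked for illustration.

```python
import numpy as np

def sample_batch(labels, rng, n_identities=4, faces_per_identity=40, n_random=20):
    """Return dataset indices for one mini-batch.

    labels: identity label of every face in the dataset.
    n_identities, n_random: hypothetical values, not taken from the paper.
    """
    labels = np.asarray(labels)
    ids = rng.choice(np.unique(labels), size=n_identities, replace=False)
    batch = []
    for ident in ids:
        pool = np.flatnonzero(labels == ident)
        k = min(faces_per_identity, len(pool))  # identities with few faces give fewer
        batch.extend(rng.choice(pool, size=k, replace=False))
    # Extra faces drawn at random from all other identities; they act as
    # negatives for every anchor belonging to the selected identities.
    others = np.flatnonzero(~np.isin(labels, ids))
    batch.extend(rng.choice(others, size=min(n_random, len(others)), replace=False))
    return np.array(batch)

# Toy dataset: 10 identities with 50 faces each.
labels = np.repeat(np.arange(10), 50)
batch = sample_batch(labels, np.random.default_rng(0))
```

Guaranteeing many faces per identity is what ensures each anchor has meaningful positives inside the batch; a purely random batch would contain very few same-identity pairs.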

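Instead of picking the hardest positive, the authors use all anchor-positive pairs in a mini-batch while still selecting the hard negatives. They report no side-by-side comparison of hard anchor-positive pairs versus all anchor-positive pairs within a mini-batch, but found in practice that the all anchor-positive method was more stable and converged slightly faster at the beginning of training.

The paper also mentions generating triplets offline in conjunction with online generation, which may allow smaller batch sizes, but reports that the experiments were inconclusive.

Always selecting the hardest negative can lead to bad local minima early in training, so the paper selects semi-hard negatives instead: negatives that are farther from the anchor than the positive, but still close enough to lie inside the margin.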
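Semi-hard selection for a single anchor-positive pair could be sketched as follows. The function name `pick_semi_hard_negative`, the margin value, picking the closest candidate inside the window, and returning `None` when no candidate exists are all my own assumptions; the paper may handle these details differently.

```python
import numpy as np

def pick_semi_hard_negative(d_ap, d_an, is_negative, margin=0.2):
    """Pick a semi-hard negative for one anchor-positive pair.

    d_ap: squared distance from the anchor to the positive.
    d_an: squared distances from the anchor to every face in the batch.
    is_negative: boolean mask, True where the face has a different identity.
    Returns the index of the closest negative with d_ap < d_an < d_ap + margin,
    or None if the batch contains no such negative.
    """
    semi_hard = is_negative & (d_an > d_ap) & (d_an < d_ap + margin)
    if not semi_hard.any():
        return None
    return int(np.where(semi_hard, d_an, np.inf).argmin())

# Distances from one anchor to five batch faces (hypothetical values).
d_an = np.array([0.05, 0.30, 0.50, 0.90, 0.35])
is_negative = np.array([True, True, True, True, False])
choice = pick_semi_hard_negative(d_ap=0.25, d_an=d_an, is_negative=is_negative)
no_choice = pick_semi_hard_negative(0.25, np.array([0.05, 0.90]), np.array([True, True]))
```

In the toy example, index 0 is the hardest negative but is skipped because it is closer to the anchor than the positive. Index 1 falls inside the window and is selected, while index 4 is ignored because it shares the anchor's identity.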


Those are the key points of the paper. The rest covers the training procedure and experimental results for the Inception and ZF-Net models.
