1. Development environment
OS: Ubuntu 16.04
TensorFlow version: 1.12.0
Python version: 3.6.7
2. Download the source code
Official facenet GitHub repository: https://github.com/davidsandberg/facenet.git
git clone https://github.com/davidsandberg/facenet.git
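The required dependencies are listed in requirements.txt; install them with pip (this walkthrough itself was run with TensorFlow 1.12.0, as noted in section 1). The file contains: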
tensorflow==1.14.0
scipy
scikit-learn
opencv-python
h5py
matplotlib
Pillow
requests
psutil
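After installing, it can help to confirm that each package is actually importable. The sketch below is only an illustration: the helper name missing_packages and the mapping from pip names to import names are assumptions added here, not part of facenet.

```python
import importlib.util

# Assumed mapping from pip package names (as listed above) to import names.
PIP_TO_IMPORT = {
    "tensorflow": "tensorflow",
    "scipy": "scipy",
    "scikit-learn": "sklearn",
    "opencv-python": "cv2",
    "h5py": "h5py",
    "matplotlib": "matplotlib",
    "Pillow": "PIL",
    "requests": "requests",
    "psutil": "psutil",
}


def missing_packages(pip_to_import):
    """Return the pip names whose import name cannot be found."""
    return [pip_name for pip_name, module in pip_to_import.items()
            if importlib.util.find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages(PIP_TO_IMPORT)
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

This only checks that a module can be found; it does not check which version is installed.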
3. Download the LFW dataset
Download page: http://vis-www.cs.umass.edu/lfw/
Download steps: Menu -> Download -> All images as gzipped tar file
Put the downloaded archive in the facenet/data/lfw_data directory and extract it there.
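The extraction step can also be done from Python. This is a minimal sketch using the standard tarfile module; the function name extract_archive and the archive file name in the usage comment (lfw.tgz) are assumptions, so adjust them to whatever was actually downloaded.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract a tar archive into dest_dir and return its top-level entry names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # "r:*" lets tarfile detect the compression (gzip here) automatically.
    with tarfile.open(archive_path, "r:*") as tar:
        top_level = sorted({name.split("/")[0] for name in tar.getnames()})
        tar.extractall(dest)
    return top_level


# Hypothetical usage, run from the facenet directory (file name is assumed):
# extract_archive("data/lfw_data/lfw.tgz", "data/lfw_data")
```

Only extract archives from a source you trust, since extractall writes whatever paths the archive contains.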
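- Preprocess the LFW images
The original LFW images are 250×250 pixels. They need to be resized to match the input size of the pretrained model, which is 160×160. The converted dataset is stored in the facenet/data/lfw_data/lfw_160 folder.
- Resize the images
align_dataset_mtcnn.py runs face detection on every image in the dataset, crops each image down to the detected face, and then resizes the face crop to 160×160.
Go to the facenet/src directory and copy align_dataset_mtcnn.py into it: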
cd facenet/src
cp -i align/align_dataset_mtcnn.py ./
python align_dataset_mtcnn.py ../data/lfw_data/lfw ../data/lfw_data/lfw_160 --image_size 160 --margin 32 --random_order --gpu_memory_fraction 0.25
On success, the script prints the total number of images and the number of successfully aligned images.
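To make the --margin 32 option more concrete, here is a rough sketch of the cropping arithmetic, assuming the margin is split evenly between the two sides of the detected box and clipped at the image border. The function name expand_box and the box coordinates are made up for illustration; in the real script the box comes from the MTCNN detector, and the crop is then resized to --image_size.

```python
def expand_box(box, margin, img_width, img_height):
    """Grow (x1, y1, x2, y2) by margin/2 on each side, clipped to the image."""
    x1, y1, x2, y2 = box
    half = margin // 2
    return (
        max(x1 - half, 0),
        max(y1 - half, 0),
        min(x2 + half, img_width),
        min(y2 + half, img_height),
    )


# A hypothetical face box inside a 250x250 LFW image.
crop = expand_box((100, 90, 180, 190), margin=32, img_width=250, img_height=250)
print(crop)  # (84, 74, 196, 206)
```

A box close to the border is simply cut off at the image edge, so the crop is not always the detected box plus the full margin.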
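4. Download the pretrained network model
Download page: https://github.com/davidsandberg/facenet , where two pretrained models are listed, each trained on a different dataset. I downloaded the model trained on the VGGFace2 dataset, put it in the facenet/models directory, and extracted it there.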
5. Test the accuracy of the pretrained model
Run the test with the pretrained model:
python src/validate_on_lfw.py data/lfw_data/lfw_160/ models/20180402-114759/
I am using TensorFlow 1.12, but the author's pretrained model was trained with TensorFlow 1.7, so importing the graph fails with the following error:
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/input.py:734: add_queue_runner (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
Model directory: models/20180402-114759/
Metagraph file: model-20180402-114759.meta
Checkpoint file: model-20180402-114759.ckpt-275
2019-02-28 19:54:02.009422: W tensorflow/core/graph/graph_constructor.cc:1265] Importing a graph with a lower producer version 24 into an existing graph with producer version 27. Shape inference will have run different parts of the graph with different producer versions.
Traceback (most recent call last):
  File "src/validate_on_lfw.py", line 164, in <module>
    main(parse_arguments(sys.argv[1:]))
  File "src/validate_on_lfw.py", line 73, in main
    facenet.load_model(args.model, input_map=input_map)
  File "/home/liguiyuan/study/deep_learning/project/facenet/src/facenet.py", line 381, in load_model
    saver = tf.train.import_meta_graph(os.path.join(model_exp, meta_file), input_map=input_map)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py", line 1674, in import_meta_graph
    meta_graph_or_file, clear_devices, import_scope, **kwargs)[0]
…
KeyError: "The name 'decode_image/cond_jpeg/is_png' refers to an Operation not in the graph."
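Solutions (either one works):
1. Switch to TensorFlow 1.7; or
2. In facenet.py, find create_input_pipeline and add the line with tf.name_scope("tempscope"): to it. This resolves the error (the fix appears to be needed only on TensorFlow 1.10 and later).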
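For option 2, my reading is that the whole body of create_input_pipeline ends up indented under the new with line. The sketch below applies that kind of edit to source text so the resulting shape is visible. It is not part of facenet: the helper wrap_body_in_scope is hypothetical, it assumes a top-level function with a single-line def and 4-space indentation, and the demo uses a stand-in function rather than the real one.

```python
def wrap_body_in_scope(source, func_name, scope_stmt, indent="    "):
    """Indent the body of a top-level function one level under scope_stmt."""
    out = []
    in_body = False
    for line in source.splitlines():
        if line.startswith("def " + func_name + "("):
            out.append(line)
            out.append(indent + scope_stmt)
            in_body = True
            continue
        if in_body and line and not line[0].isspace():
            in_body = False  # the next top-level statement ends the function
        if in_body and line.strip():
            out.append(indent + line)
        else:
            out.append(line)
    return "\n".join(out) + "\n"


# Stand-in source; the real create_input_pipeline body is much longer.
demo = (
    "def create_input_pipeline(input_queue):\n"
    "    images = []\n"
    "    return images\n"
    "\n"
    "def other():\n"
    "    pass\n"
)
patched = wrap_body_in_scope(
    demo, "create_input_pipeline", 'with tf.name_scope("tempscope"):')
print(patched)
```

In practice it is probably simpler to make the same change by hand in an editor; the point is only that every line of the function body moves one indentation level to the right.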
After making the change, rerun the command. The accuracy reaches 0.98500±0.00658.
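Assuming the reported figure is the mean and standard deviation of per-fold accuracies on LFW image pairs, the idea can be sketched as follows. This is a simplification with invented numbers and hypothetical function names: it uses one fixed threshold, whereas the facenet evaluation code selects thresholds through cross-validation.

```python
from statistics import mean, pstdev


def accuracy_at_threshold(distances, is_same, threshold):
    """Fraction of pairs classified correctly when distance < threshold means 'same person'."""
    correct = sum((d < threshold) == same for d, same in zip(distances, is_same))
    return correct / len(distances)


def summarize(fold_accuracies):
    """Mean and population standard deviation of the per-fold accuracies."""
    return mean(fold_accuracies), pstdev(fold_accuracies)


# Invented distances and labels for four image pairs.
acc = accuracy_at_threshold([0.5, 1.4, 0.9, 1.2], [True, False, False, True], 1.0)
print(acc)  # 0.5

# Invented per-fold accuracies.
avg, spread = summarize([0.98, 0.99, 0.985])
print("%.5f+-%.5f" % (avg, spread))
```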
6. Compare the distance between two images
Run the following command:
python src/compare.py models/20180402-114759/20180402-114759.pb data/images/Anthony_Hopkins_0001.jpg data/images/Anthony_Hopkins_0002.jpg
Another error appeared:
2019-03-01 15:53:40.632821: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: GeForce GTX 1060 major: 6 minor: 1 memoryClockRate(GHz): 1.6705
pciBusID: 0000:01:00.0
totalMemory: 5.94GiB freeMemory: 5.50GiB
2019-03-01 15:53:40.632855: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2019-03-01 15:53:40.838198: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-03-01 15:53:40.838230: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2019-03-01 15:53:40.838261: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2019-03-01 15:53:40.838410: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6078 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060, pci bus id: 0000:01:00.0, compute capability: 6.1)
2019-03-01 15:53:40.934468: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 5.94G (6373572608 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-03-01 15:53:41.996521: E tensorflow/stream_executor/cuda/cuda_dnn.cc:373] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
The GPU memory allocation failed. This can be fixed by adjusting the GPU configuration: in compare.py, change the default GPU memory fraction from 1.0 to 0.7:
parser.add_argument('--gpu_memory_fraction', type=float,
    help='Upper bound on the amount of GPU memory that will be used by the process.', default=1.0)
# Change to:
parser.add_argument('--gpu_memory_fraction', type=float,
    help='Upper bound on the amount of GPU memory that will be used by the process.', default=0.7)
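Because --gpu_memory_fraction is an ordinary command-line option, the value can presumably also be overridden at run time instead of editing the default. The stand-alone sketch below reproduces only this one argument to show how argparse treats the default and an explicit value; build_parser is a hypothetical name, and the positional arguments of compare.py are left out.

```python
import argparse


def build_parser():
    """Parser with only the GPU memory option, using the lowered default."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--gpu_memory_fraction', type=float,
        help='Upper bound on the amount of GPU memory that will be used by the process.',
        default=0.7)
    return parser


# No flag given: the default applies.
default_args = build_parser().parse_args([])
# Flag given: the explicit value wins over the default.
override_args = build_parser().parse_args(['--gpu_memory_fraction', '0.5'])
print(default_args.gpu_memory_fraction, override_args.gpu_memory_fraction)

# In TensorFlow 1.x a value like this is typically passed on as
# tf.GPUOptions(per_process_gpu_memory_fraction=...).
```

Under that assumption, appending --gpu_memory_fraction 0.7 to the compare.py command line would have the same effect as the edit above.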
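This time it worked. The result is 0.8396, which is the Euclidean distance between the two images' embeddings and is used to decide whether they show the same person. The more similar two face images are, the smaller the distance; the more they differ, the larger the distance.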
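As a reminder of what that number measures, here is a minimal sketch of the Euclidean distance between two vectors. The short vectors are toy values, not real embeddings, and the remark about the range assumes the embeddings are normalized to unit length.

```python
import math


def euclidean_distance(a, b):
    """Square root of the summed squared differences between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


# Toy stand-ins for two face embeddings.
emb_a = l2_normalize([1.0, 2.0, 2.0])
emb_b = l2_normalize([2.0, 1.0, 2.0])
print(euclidean_distance(emb_a, emb_b))
```

For unit-length vectors the distance lies between 0 (identical direction) and 2 (opposite direction), which gives a scale for reading a value such as 0.8396.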

Reference tutorial:
http://www.cnblogs.com/gmhappy/p/9472388.html