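The model ranks the candidate answers by score, from highest to lowest, and keeps the top K. If the correct answer is among these K, the test sample counts as predicted correctly. The larger K is, the easier the task becomes: the set of K candidates only grows, so the metric can never decrease as K increases.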
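The top-K rule above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not code from the original post: the function name `top_k_hit_rate` and the toy `scores` and `targets` are hypothetical and exist only for this example.

```python
import numpy as np

def top_k_hit_rate(scores, targets, k):
    """Fraction of samples whose correct answer is among the k highest-scoring candidates.

    Hypothetical helper for illustration only.
    scores:  (num_samples, num_candidates) array, higher score = better
    targets: (num_samples,) index of the correct candidate for each sample
    """
    scores = np.asarray(scores, dtype=float)
    targets = np.asarray(targets)
    # Candidate indices sorted by descending score; keep the first k per row.
    top_k = np.argsort(-scores, axis=1)[:, :k]
    # A sample is a hit if its target index appears anywhere in its top k.
    hits = (top_k == targets[:, None]).any(axis=1)
    return float(hits.mean())

# Toy data: 3 samples, 3 candidates each (made up for this sketch).
scores = [[0.1, 0.7, 0.2],
          [0.5, 0.3, 0.2],
          [0.2, 0.3, 0.5]]
targets = [1, 2, 1]

for k in (1, 2, 3):
    print(k, top_k_hit_rate(scores, targets, k))
```

On this toy data the hit rate rises from one sample in three at K=1 to all three at K=3, which matches the observation that a larger K makes the task easier.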
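For image retrieval the metric is computed as follows. For a single query image, search the system for its k nearest-neighbor images. If at least one of the k returned images belongs to the same class as the query image, the score of that query is 1; otherwise it is 0. Recall@K is then the average score over all query images in the test set.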
from bottleneck import argpartition

def evaluate_emb(emb, labels):
    """Evaluate embeddings based on Recall@k."""
    d_mat = get_distance_matrix(emb)  # N x N distance matrix (get_distance_matrix is defined elsewhere)
    d_mat = d_mat.asnumpy()
    labels = labels.asnumpy()

    names = []
    accs = []
    for k in [1, 2, 4, 8, 16]:
        names.append('Recall@%d' % k)
        correct, cnt = 0.0, 0.0
        for i in range(emb.shape[0]):
            d_mat[i, i] = 1e10  # exclude the query itself from its own neighbors
            nns = argpartition(d_mat[i], k)[:k]  # indices of the k nearest neighbors (unordered)
            # The query is a hit if any of its k neighbors shares its label.
            if any(labels[i] == labels[nn] for nn in nns):
                correct += 1
            cnt += 1
        accs.append(correct/cnt)
    return names, accs
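
To try the metric without the rest of the pipeline, here is a self-contained NumPy sketch of the same Recall@K procedure. It is an illustration under stated assumptions rather than the code above: it uses squared Euclidean distance in place of `get_distance_matrix` (whose definition is not shown here), and the name `recall_at_k` and the toy `emb` and `labels` are hypothetical.

```python
import numpy as np

def recall_at_k(emb, labels, ks=(1, 2, 3)):
    """Recall@K for retrieval, assuming squared Euclidean distance.

    Hypothetical helper for illustration only.
    emb:    (N, D) array of embeddings
    labels: (N,) array of class labels
    Returns a dict mapping each k to its Recall@K value.
    """
    emb = np.asarray(emb, dtype=float)
    labels = np.asarray(labels)
    # Pairwise squared Euclidean distances, shape (N, N).
    diff = emb[:, None, :] - emb[None, :, :]
    d_mat = (diff ** 2).sum(axis=-1)
    # A query must not retrieve itself.
    np.fill_diagonal(d_mat, np.inf)
    # Neighbor indices for every query, nearest first.
    order = np.argsort(d_mat, axis=1)
    result = {}
    for k in ks:
        nn_labels = labels[order[:, :k]]                   # labels of the k nearest neighbors
        hits = (nn_labels == labels[:, None]).any(axis=1)  # score 1 if any label matches
        result[k] = float(hits.mean())                     # average score over all queries
    return result

# Toy data (made up): the last point has label 1 but sits next to the label-0 cluster.
emb = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 1.0], [0.25, 0.0]]
labels = [0, 0, 1, 1, 1]

print(recall_at_k(emb, labels))
```

The last point only finds a same-class neighbor once K reaches 3, so Recall@1 and Recall@2 are 4/5 while Recall@3 is 1. A full sort is used here for readability; a partial selection such as the `argpartition` call above avoids sorting all N neighbors when N is large.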