VoxSRC 2020
- Competition page: http://www.robots.ox.ac.uk/~vgg/data/voxceleb/competition2020.html
- Baseline codebase: https://github.com/clovaai/voxceleb_trainer
- Development toolkit: https://github.com/a-nagrani/VoxSRC2020
Baseline evaluation
The baseline is the pretrained model from the voxceleb_trainer project. Its results on the new dev set are:
- Cosine: EER 6.7480% with threshold 0.4959
- 2-norm: EER 6.7541% with threshold -1.0027
The full evaluation procedure is available as a notebook exported to HTML, which can be downloaded here: https://github.com/mechanicalsea/voxsrc2020/blob/master/Baseline.html
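The post does not show how the two scores or the threshold are computed, so the following is only a minimal sketch of one common way to do it. It assumes that the 2-norm score is the negated Euclidean distance between two embeddings (which would be consistent with the negative threshold reported above) and that the threshold is the point where the false-acceptance and false-rejection rates are closest. The function names `cosine_score`, `neg_l2_score`, and `compute_eer` are hypothetical and are not taken from voxceleb_trainer or base.py.

```python
import math


def cosine_score(a, b):
    """Cosine similarity between two embeddings (higher = more similar)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def neg_l2_score(a, b):
    """Negated Euclidean distance, so that higher still means more similar.

    Assumption: this is how the "2-norm" score is defined; the post does not say.
    """
    return -math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def compute_eer(scores, labels):
    """Return (eer, threshold) for scores with binary labels (1 = same speaker).

    A trial is accepted when score >= threshold. The sweep below is quadratic
    in the number of trials, which is fine for a sketch but slow for full lists.
    """
    n_target = sum(labels)
    n_nontarget = len(labels) - n_target
    best = None
    for t in sorted(set(scores)):
        false_accepts = sum(1 for s, l in zip(scores, labels) if s >= t and l == 0)
        false_rejects = sum(1 for s, l in zip(scores, labels) if s < t and l == 1)
        far = false_accepts / n_nontarget
        frr = false_rejects / n_target
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


if __name__ == "__main__":
    # Toy trial scores, not data from the competition.
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
    toy_labels = [1, 1, 0, 1, 0, 0]
    eer, threshold = compute_eer(toy_scores, toy_labels)
    print(f"EER = {eer:.4f} at threshold {threshold}")
```

Because both scores are oriented so that larger means "more likely the same speaker", the same `compute_eer` routine can be applied to either of them.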
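Development toolkit
Because the voxceleb_trainer code is relatively cumbersome, the author extracted and modified parts of it to build a toolkit that makes data augmentation and model design easier:
- Link: https://github.com/mechanicalsea/voxsrc2020/blob/master/base.py
- Example: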
if __name__ == "__main__":
    # Paths to the training list, the trial list, and their root directories
    trainlst = "/workspace/rwang/voxceleb/train_list.txt"
    testlst = "/workspace/rwang/VoxSRC2020/data/verif/trials.txt"
    traindir = "/workspace/rwang/voxceleb/voxceleb2/"
    testdir = "/workspace/rwang/voxceleb/"
    maptrain5994 = "/workspace/rwang/competition/voxsrc2020/maptrain5994.txt"
    # Load the training set
    train = load_train(trainlst=trainlst, traindir=traindir,
                       maptrain5994=maptrain5994)
    # Load the test trials
    trial = load_trial(testlst=testlst, testdir=testdir)
    # Define the speaker-embedding extractor
    net = ResNetSE34L(nOut=512, num_filters=[16, 32, 64, 128])
    # Define the top-level classifier
    top = AMSoftmax(in_feats=512, n_classes=5994, m=0.2, s=30)
    # Build the sklearn-style model
    snet = SpeakerNet(net=net, top=top)
    # Train the model
    modelst, step_num, loss, prec1, prec5 = snet.train(train, num_epoch=1)
    # Evaluate the model
    eer, thresh, all_scores, all_labels, all_trials, trials_feat = snet.eval(
        trial, step_num=0, trials_feat=None)
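The example passes `m=0.2` and `s=30` to `AMSoftmax` without explaining them. In the standard additive-margin softmax formulation, `m` is a margin subtracted from the cosine of the target class and `s` is a scale applied to all cosines before the softmax. The sketch below computes that loss for a single embedding in plain Python. It illustrates the general formulation only and is not the toolkit's `AMSoftmax` class; the function `am_softmax_loss` and the helper `_unit` are hypothetical names.

```python
import math


def _unit(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def am_softmax_loss(embedding, class_weights, target, m=0.2, s=30.0):
    """Additive-margin softmax loss for one sample.

    embedding:     list of floats (the speaker embedding)
    class_weights: one weight vector per class
    target:        index of the true class
    """
    e = _unit(embedding)
    logits = []
    for j, w in enumerate(class_weights):
        cos = sum(a * b for a, b in zip(e, _unit(w)))
        if j == target:
            cos -= m  # the margin applies to the true class only
        logits.append(s * cos)
    # Numerically stable log-sum-exp over the scaled logits.
    mx = max(logits)
    log_sum = mx + math.log(sum(math.exp(z - mx) for z in logits))
    return log_sum - logits[target]


if __name__ == "__main__":
    # Toy 2-D example with two classes, not real speaker data.
    weights = [[1.0, 0.0], [0.0, 1.0]]
    print(am_softmax_loss([1.0, 0.0], weights, target=0, m=0.2, s=1.0))
    print(am_softmax_loss([1.0, 0.0], weights, target=0, m=0.0, s=1.0))
```

With the margin enabled, the loss for a correctly classified sample is larger than without it, which pushes the model to separate classes by more than the bare decision boundary requires.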
Follows and discussion are welcome.