
VoxSRC 2020 Baseline Models and Development Toolkit

VoxSRC 2020

  • Competition link: http://www.robots.ox.ac.uk/~vgg/data/voxceleb/competition2020.html
  • Baseline codebase: https://github.com/clovaai/voxceleb_trainer
  • Development toolkit: https://github.com/a-nagrani/VoxSRC2020

Baseline Evaluation

The baseline is a pretrained model from the voxceleb_trainer project. On the new dev set it gives the following results:

  • Cosine: 6.7480% with threshold 0.4959.
  • 2-norm: 6.7541% with threshold -1.0027.

The full evaluation procedure is available as an HTML export of the notebook: https://github.com/mechanicalsea/voxsrc2020/blob/master/Baseline.html
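To make the two scoring rules concrete, here is a minimal, standard-library sketch of how a trial might be scored and how an error rate and threshold could be read off the scores. It assumes that the 2-norm score is the negative Euclidean distance between L2-normalized embeddings (the negative threshold suggests this, but the source does not state it) and that the percentages are equal error rates. The function names are hypothetical and are not taken from voxceleb_trainer or base.py.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_score(a, b):
    """Cosine similarity: higher means more likely the same speaker."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def neg_l2_score(a, b):
    """Assumed 2-norm score: negative distance between unit vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return -math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def compute_eer(scores, labels):
    """Return (eer, threshold) at the point where the false-accept and
    false-reject rates are closest. labels: 1 = same speaker, 0 = different.
    A trial is accepted when its score is >= the threshold."""
    n_pos = sum(1 for l in labels if l == 1)
    n_neg = len(labels) - n_pos
    best = None
    for t in sorted(set(scores)):
        false_accepts = sum(1 for s, l in zip(scores, labels) if s >= t and l == 0)
        false_rejects = sum(1 for s, l in zip(scores, labels) if s < t and l == 1)
        far = false_accepts / n_neg
        frr = false_rejects / n_pos
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]
```

Under this assumption, the squared distance between unit-length embeddings equals 2 - 2·cos, so both rules rank trials in the same order, which would be consistent with the nearly identical results above.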

Development Toolkit

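Because the voxceleb_trainer code is relatively cumbersome, the author extracted and modified parts of it to build a toolkit that makes data augmentation and model design easier: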

  • Link: https://github.com/mechanicalsea/voxsrc2020/blob/master/base.py
  • Example:
# Assumption: this block runs inside base.py, or after importing
# load_train, load_trial, ResNetSE34L, AMSoftmax and SpeakerNet from it.
if __name__ == "__main__":
    # Paths to the training list, the trial list, and their root directories
    trainlst = "/workspace/rwang/voxceleb/train_list.txt"
    testlst = "/workspace/rwang/VoxSRC2020/data/verif/trials.txt"
    traindir = "/workspace/rwang/voxceleb/voxceleb2/"
    testdir = "/workspace/rwang/voxceleb/"
    maptrain5994 = "/workspace/rwang/competition/voxsrc2020/maptrain5994.txt"
    # Load the training set
    train = load_train(trainlst=trainlst, traindir=traindir,
                       maptrain5994=maptrain5994)
    # Load the test trials
    trial = load_trial(testlst=testlst, testdir=testdir)
    # Define the speaker-embedding extractor
    net = ResNetSE34L(nOut=512, num_filters=[16, 32, 64, 128])
    # Define the top-level classifier
    top = AMSoftmax(in_feats=512, n_classes=5994, m=0.2, s=30)
    # Build the sklearn-style model
    snet = SpeakerNet(net=net, top=top)
    # Train the model
    modelst, step_num, loss, prec1, prec5 = snet.train(train, num_epoch=1)
    # Evaluate the model
    eer, thresh, all_scores, all_labels, all_trials, trials_feat = snet.eval(
        trial, step_num=0, trials_feat=None)
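
The AMSoftmax head in the example is configured with margin m=0.2 and scale s=30. As a rough illustration of what these two parameters do, the sketch below computes the additive-margin softmax loss for a single sample from its cosine similarities to each class center. It is a plain-Python illustration of the general technique, not the code in base.py; the function name and its inputs are hypothetical.

```python
import math


def am_softmax_loss(cosines, target, m=0.2, s=30.0):
    """Additive-margin softmax loss for one sample.

    cosines: cosine similarity between the embedding and each class center.
    target:  index of the true class.
    The margin m is subtracted from the target cosine only, then every
    logit is multiplied by the scale s before the usual cross-entropy.
    """
    logits = [s * (c - m) if i == target else s * c
              for i, c in enumerate(cosines)]
    # Numerically stable log-sum-exp over all classes
    mx = max(logits)
    log_sum = mx + math.log(sum(math.exp(z - mx) for z in logits))
    return log_sum - logits[target]
```

With the margin in place, the true class has to beat every other class by at least m in cosine similarity before the loss becomes small, which pushes embeddings of the same speaker closer together than a plain softmax would.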
           

Follows and discussion are welcome.