Speech Recognition: Revisiting HTK

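Before the new term starts, I reran the hidden Markov model (HMM) training for speech recognition today. Drawing on experience shared by others online, I worked toward a deeper understanding of how the HTK tools run, retrained on freshly recorded data, and updated and refined the process based on what happened in practice.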

I won't cover environment setup; I assume HTK is already installed and configured.

Today's task is still isolated-word recognition. The words are one, two, and three; of course, you can improvise from there.

First, collect the data:

rec -b 8 data/train/speech/01.wav
rec -b 8 data/train/speech/02.wav.....
           

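I recorded ten takes each of one, two, and three, and saved them in the speech folder under train.

Next, update the training data files; read this together with the previous posts in this series.

Edit grammer to list the word classes you need.

Edit codetrain.scp to list each training audio path and the path of the mfc file to generate.

Edit train.scp to list the mfc paths.

Edit wordlist to contain the list of words used in training.

Edit trainprompts to give the text that corresponds to each training recording; this is effectively the annotation.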
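As a rough sketch of what these files might contain, the shell functions below print a grammar, the two script files, and the prompts. Several things here are assumptions rather than the actual setup used in this post: the uppercase spellings ONE, TWO, THREE and the SENT-START/SENT-END wrappers (both have to match the dictionary), the data/train/mfcc output folder, file names running from 01 to 30, and the recordings being ordered ten per word. All function names are hypothetical.

```shell
#!/bin/sh
WAV_DIR=data/train/speech   # recording folder used in this post
MFC_DIR=data/train/mfcc     # assumed feature folder, not stated in the post

# Print an HParse grammar that allows exactly one word per utterance.
# Word spellings and the SENT-START/SENT-END wrappers are assumptions.
write_grammar() {
    cat <<'EOF'
$word = ONE | TWO | THREE;
( SENT-START $word SENT-END )
EOF
}

# Print "source target" pairs, one per line, the form HCopy reads via -S.
# $1 = number of recordings (assumed to be named 01.wav, 02.wav, ...).
make_codetrain_scp() {
    i=1
    while [ "$i" -le "$1" ]; do
        printf '%s/%02d.wav %s/%02d.mfc\n' "$WAV_DIR" "$i" "$MFC_DIR" "$i"
        i=$((i + 1))
    done
}

# Print one feature file per line, the form HCompV and HERest read via -S.
make_train_scp() {
    i=1
    while [ "$i" -le "$1" ]; do
        printf '%s/%02d.mfc\n' "$MFC_DIR" "$i"
        i=$((i + 1))
    done
}

# Print prompt lines "*/ID WORD" for prompts2mlf.
# $1 = takes per word, remaining arguments = words, in recording order (assumed).
make_trainprompts() {
    per_word=$1
    shift
    i=1
    for word in "$@"; do
        n=0
        while [ "$n" -lt "$per_word" ]; do
            printf '*/%02d %s\n' "$i" "$word"
            i=$((i + 1))
            n=$((n + 1))
        done
    done
}

# Usage (not run here):
#   write_grammar                      > ./config/grammer
#   make_codetrain_scp 30              > ./config/codetrain.scp
#   make_train_scp 30                  > ./config/train.scp
#   make_trainprompts 10 ONE TWO THREE > ./labels/trainprompts
```

The IDs in trainprompts have to line up with the base names of the feature files, which is why all three generators share the same numbering.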

Once that is done, run all of the following commands in order:

HParse ./config/grammer ./config/wordnet
HDMan -m -w ./lists/wordlist -n ./lists/monophones -g ./config/global.ded ./dict/dict_color ./dict/beep ./dict/otherDict
perl ./scripts/prompts2mlf ./labels/trainwords.mlf ./labels/trainprompts
HLEd -l '*' -d ./dict/dict_color -i ./labels/phones_color.mlf ./config/mkphones_color.led ./labels/trainwords.mlf 
HCopy -T 1 -C ./config/config_HCopy -S ./config/codetrain.scp
HCompV -C ./config/config_color -f 0.01 -m -S ./config/train.scp -M ./hmm0 ./config/proto
perl scripts/makeMacros hmm0/vFloors hmm0/macros
perl scripts/makeHmmdefs hmm0/proto lists/monophones hmm0/hmmdefs
perl scripts/makeMonoOffsp ./lists/monophones ./lists/monoOffSP
HERest -C ./config/config_color -I ./labels/phones_color.mlf -t 250.0 150.0 1000.0 -S ./config/train.scp -H ./hmm0/macros -H ./hmm0/hmmdefs -M ./hmm1/ ./lists/monoOffSP
HERest -C ./config/config_color -I ./labels/phones_color.mlf -t 250.0 150.0 1000.0 -S ./config/train.scp -H ./hmm1/macros -H ./hmm1/hmmdefs -M ./hmm2/ ./lists/monoOffSP
HERest -C ./config/config_color -I ./labels/phones_color.mlf -t 250.0 150.0 1000.0 -S ./config/train.scp -H ./hmm2/macros -H ./hmm2/hmmdefs -M ./hmm3/ ./lists/monoOffSP
perl ./scripts/fixSil hmm3/hmmdefs hmm4/hmmdefs
cp hmm3/macros ./hmm4/macros
HHEd -H ./hmm4/macros -H ./hmm4/hmmdefs -M hmm5/ config/sil.hed ./lists/monophones
HLEd -l '*' -d ./dict/dict_color -i ./labels/phones_color.mlf ./config/mkphones_color_HLEd.led ./labels/trainwords.mlf
HERest -C ./config/config_color -I ./labels/phones_color.mlf -t 250.0 150.0 1000.0 -S ./config/train.scp -H ./hmm5/macros -H ./hmm5/hmmdefs -M ./hmm6/ ./lists/monophones
HERest -C ./config/config_color -I ./labels/phones_color.mlf -t 250.0 150.0 1000.0 -S ./config/train.scp -H ./hmm6/macros -H ./hmm6/hmmdefs -M ./hmm7/ ./lists/monophones
           

HParse builds a word network that describes the transitions between words; grammer is the edited grammar and wordnet is the generated network.

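HDMan builds the dictionary: working from the beep and otherDict sources, it generates dict_color.

HLEd converts the word-level MLF (trainwords.mlf) into the phone-level MLF (phones_color.mlf).

HCopy extracts the feature parameters.

HCompV scans all of the training data to obtain the global mean and variance.

Training then proceeds through hmm0 to hmm7.

HERest performs the re-estimation.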
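Since the HERest calls differ only in the source folder, the target folder, and the model list, they can be folded into a loop. The following is a minimal sketch, not part of the original workflow: the function names and the HEREST override variable are hypothetical, and the options are copied unchanged from the commands above.

```shell
#!/bin/sh
# One embedded re-estimation pass from ./hmm<src> to ./hmm<dst>.
# HEREST is a hypothetical override: HEREST=echo previews the command line
# instead of running HERest.
reestimate() {
    src=$1
    dst=$2
    list=$3
    "${HEREST:-HERest}" -C ./config/config_color -I ./labels/phones_color.mlf \
        -t 250.0 150.0 1000.0 -S ./config/train.scp \
        -H "./hmm$src/macros" -H "./hmm$src/hmmdefs" -M "./hmm$dst/" "$list"
}

# Run consecutive passes hmm<first> -> ... -> hmm<last>, stopping on failure.
# $1 = first folder index, $2 = last folder index, $3 = model list.
reestimate_range() {
    n=$1
    while [ "$n" -lt "$2" ]; do
        reestimate "$n" "$((n + 1))" "$3" || return 1
        n=$((n + 1))
    done
}

# Usage (not run here):
#   reestimate_range 0 3 ./lists/monoOffSP    # the three passes hmm0 -> hmm3
#   reestimate_range 5 7 ./lists/monophones   # the two passes hmm5 -> hmm7
```

The fixSil, HHEd, and second HLEd steps still sit between the two ranges, so the loop only replaces the repeated HERest lines.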

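When this finishes, the newly generated files appear in the corresponding folders.

Next, run the test.

Here I changed the flow to: record first, then convert to mfc, then recognise, then display the result.

Record: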

rec -b 8 data/test/speech/test.wav
           

Convert:

HCopy -T 1 -C ./config/config_HCopy -S ./config/codetest.scp
           

Recognise:

HVite -H ./hmm7/macros -H ./hmm7/hmmdefs -C ./config/config_color -S ./config/test.scp -l '*' -i ./results/recout.txt -w ./config/wordnet -p 0.0 -s 5.0 ./dict/dict_color ./lists/monophones
           

Display:

cat ./results/recout.txt |tail -n +3|head -n 3
           

In the end, the displayed recognition result is two, which is correct.
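The display command picks lines out of recout.txt by position. A possible alternative, sketched here on the assumption that each label line in the file has the form "start end label score", is to select by field and print only the word labels. The function name is hypothetical, and the filtered names (sil, sp, SENT-START, SENT-END) are guesses at what the dictionary emits for non-word entries.

```shell
#!/bin/sh
# Print the label column of every "start end label score" line in an MLF,
# skipping the header, the file-name line, the closing ".", and
# silence/sentence markers (the marker names are assumptions).
recognized_words() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 3 &&
         $3 != "sil" && $3 != "sp" &&
         $3 != "SENT-START" && $3 != "SENT-END" { print $3 }' "$1"
}

# Usage (not run here):
#   recognized_words ./results/recout.txt
```

Unlike the positional version, this keeps working if the number of label lines per utterance changes.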
