# lattice / output
## dev/test
> [name=鴻欣] Kaldi currently always uses the best case
## decoding
- Once the Kaldi model is trained, use it to recognize new audio files
- [Viterbi algorithm](https://zh.wikipedia.org/wiki/%E7%BB%B4%E7%89%B9%E6%AF%94%E7%AE%97%E6%B3%95)
#### Directory contents
- scoring/text.filt: the reference transcript (the correct answers)
- scoring/X.a.b.txt: Kaldi recognition results with lm-scale=X and word insertion penalty a.b
- wer_X_a.b: error rates for lm-scale=X and word insertion penalty a.b
  - WER inside the file: word error rate
  - SER inside the file: sentence error rate
  - the SER we say in our discussions: syllable error rate
  - CER: character error rate (counted over Han characters)
Example: the same mistaken syllable counts as a 100% error at the word level, but only 50% at the syllable level:
```
0000000倪齊民-tong0000000-ku0000000 sian-sinn|sian-sinn word
0000000倪齊民-tong0000000-ku0000000 sian-sinn|sian-Kinn word 100%
0000000倪齊民-tong0000000-ku0000000 sian|sian sinn|sinn syllable
0000000倪齊民-tong0000000-ku0000000 sian|sian sinn|Kinn syllable 50%
```
```
to|to sia|sia
tse|tse sia|sia
```
Example summary line from a `wer_X_a.b` file: `%WER 28.47 [ 24652 / 86587, 3375 ins, 3721 del, 17556 sub ]` (errors / total words, then insertions, deletions, substitutions; the sketch below shows roughly how it is produced).
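A minimal scoring sketch using Kaldi's `compute-wer`, assuming a decode directory laid out as above; the concrete lm-scale 10 and penalty 0.5 are illustrative:
```
dir=exp/tri4/decode_train_dev
# Compare the filtered reference against one hypothesis file and write the WER summary.
compute-wer --text --mode=present \
  ark:$dir/scoring/text.filt ark:$dir/scoring/10.0.5.txt \
  > $dir/wer_10_0.5
```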
#### Decoding commands
In [kaldi/egs/taiwanese/s5c/走評估.sh](https://github.com/sih4sing5hong5/kaldi/blob/taiwanese/egs/taiwanese/s5c/%E8%B5%B0%E8%A9%95%E4%BC%B0.sh#L54-L59):
```
graph_dir=exp/tri4/graph
# Build the decoding graph (HCLG) from the lang dir and the tri4 model.
$train_cmd $graph_dir/mkgraph.log \
  utils/mkgraph.sh $lang exp/tri4 $graph_dir
# Decode the dev set with fMLLR speaker adaptation; lattices go to
# exp/tri4/decode_train_dev.
steps/decode_fmllr.sh --nj $nj --cmd "$decode_cmd" \
  --config conf/decode.config \
  $graph_dir data/dev exp/tri4/decode_train_dev
```
Decode results for the tri3/tri4 models end up in directories such as `exp/tri3/decode_train_dev/`;
the models themselves are `exp/tri3/final.mdl`, `exp/tri3/topo`, ...
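To read the recognized text back out of such a decode directory, a hedged sketch (the archive number, LM weight, and paths are illustrative):
```
lmwt=12
dir=exp/tri4/decode_train_dev
# Apply the LM weight, take the one-best path, and map word IDs back to words.
lattice-scale --inv-acoustic-scale=$lmwt "ark:gunzip -c $dir/lat.1.gz|" ark:- \
  | lattice-best-path --word-symbol-table=exp/tri4/graph/words.txt ark:- ark,t:- \
  | utils/int2sym.pl -f 2- exp/tri4/graph/words.txt
```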
## prune, rescoring
##### [ Speech activity detection and endpointing in real time ASR ](https://groups.google.com/d/msg/kaldi-help/v9_b_r4r4Og/-g8n7fUDAQAJ)
- Sentence breaks versus segment breaks in the LM: the effect has to be checked experimentally
##### [ prune language model or small text corpus ](https://groups.google.com/d/msg/kaldi-help/SyncOpoXpUQ/O3FJ9oLGBgAJ)
- After pruning, the perplexity may actually improve (see the SRILM sketch below)
- Good-Turing (GT) is, in Dan's experience, the best discount
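A hedged SRILM sketch of pruning plus a before/after perplexity check; the order, pruning threshold, and file names are illustrative:
```
# Prune the LM, then measure perplexity on held-out text before and after.
ngram -order 3 -lm lm.arpa -prune 1e-7 -write-lm lm_pruned.arpa
ngram -order 3 -lm lm.arpa        -ppl dev_text.txt
ngram -order 3 -lm lm_pruned.arpa -ppl dev_text.txt
```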
#### ctm
CTM output from a chain model [has to be multiplied by `frame-subsampling-factor`, default = 3](http://kaldi-asr.org/doc/chain.html):
LOG (nnet3-latgen-faster[5.5.162~1420-ca32c]:CheckAndFixConfigs():nnet-am-decodable-simple.cc:294) Increasing [--frames-per-chunk](http://kaldi-asr.org/doc/dnn3_scripts_context.html) from 50 to 51 to make it a multiple of --frame-subsampling-factor=3
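If a CTM from a chain model was written assuming the usual 10 ms frame shift, a hedged way to apply the factor-of-3 correction mentioned above (the CTM path is illustrative; columns 3 and 4 are start time and duration in seconds):
```
# Multiply start time and duration by --frame-subsampling-factor=3.
awk '{ $3 = $3 * 3; $4 = $4 * 3; print }' exp/chain/tdnn/decode_dev/ctm > ctm.scaled
```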
##### align
`steps/get_train_ctm.sh`
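A hedged invocation, assuming the usual `<data> <lang> <ali-dir>` positional arguments (paths are illustrative); the CTM is written into the alignment directory:
```
steps/get_train_ctm.sh --cmd "$train_cmd" data/train data/lang exp/tri4_ali
```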
##### decoding
###### [ show-alignment at a frame level? ](https://groups.google.com/d/msg/kaldi-help/-H7a9IuhLLE/sUig5eT2AQAJ) show-alignments, ali-to-phones, copy-int-vector
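A hedged sketch of getting frame-level phone alignments with those tools (model and alignment paths are illustrative):
```
# One phone ID per frame, per utterance.
ali-to-phones --per-frame=true exp/tri4/final.mdl \
  "ark:gunzip -c exp/tri4/ali.1.gz|" ark,t:- | head
# show-alignments prints a more human-readable view of the same alignments:
# show-alignments data/lang/phones.txt exp/tri4/final.mdl "ark:gunzip -c exp/tri4/ali.1.gz|"
```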
#### Output
##### [Replacing unk in decode output by best matching phone combination ](https://groups.google.com/d/msg/kaldi-help/uGiExyOiyjc/VkDQfOB0AgAJ)
##### [ the difference between posteriors calculated by forward-backward and lattice-to-post ](https://groups.google.com/d/msg/kaldi-help/iBgeXN4diSY/27CD0v0wCwAJ)
### Other references
* decode
* https://groups.google.com/forum/m/#!msg/kaldi-help/tAGblb-8Hy0/oMWnpgvhAAAJ
* lm, fst
* http://vpanayotov.blogspot.tw/2012/06/kaldi-decoding-graph-construction.html
* acwt parameter
* https://groups.google.com/forum/#!topic/kaldi-developers/IBhn9ndmjQI
* lattice
* https://groups.google.com/forum/#!topic/kaldi-help/cXX7y3Hvf3w
* https://groups.google.com/forum/#!topic/kaldi-help/QqUQoX816GE
* https://groups.google.com/forum/#!topic/kaldi-help/UiVD5WPA8fI
* checking the acoustic model
* https://groups.google.com/forum/m/#!msg/kaldi-help/HU5FT32EguU/cIMjnWRUAQAJ
* real-time decoding
* https://groups.google.com/forum/#!searchin/kaldi-help/recording|sort:relevance/kaldi-help/xogaf6nAF3E/xSfh0gFuLAAJ
### lattices
##### [Query on converting phoneme lattice to word lattice](https://groups.google.com/d/msg/kaldi-help/APrZQNaF-S4/6KYXyGo_AwAJ)
- Use `utils/show-lattice.sh` to draw the lattice
- Costs are only meaningful over a whole path, because the graph carries no frame information (see the sketch after this list)
- Converting a word lattice to a phoneme lattice loses information
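A hedged way to inspect those per-arc costs directly, dumping a lattice archive as text (paths are illustrative):
```
# Each printed arc carries a graph cost and an acoustic cost; they only make
# sense summed along a complete path through the lattice.
lattice-copy "ark:gunzip -c exp/tri4/decode_train_dev/lat.1.gz|" ark,t:- | head -n 40
```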
##### [ how can we compose with syllable lattices to get word lattices](https://groups.google.com/d/msg/kaldi-help/Dx3gqHQd1Hw/kXpPuIPaAwAJ)
Explains how to convert a lattice to an FST and then work on the FST
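A hedged sketch of that lattice => FST step (the scales and paths are illustrative):
```
# Convert lattices to plain FSTs so they can be composed/manipulated with OpenFst tools.
lattice-to-fst --acoustic-scale=0.1 --lm-scale=1.0 \
  "ark:gunzip -c exp/tri4/decode_train_dev/lat.1.gz|" ark:fsts.ark
```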
### decoding
##### [reduce the beams to speed things up](https://groups.google.com/d/msg/kaldi-help/Jj1yVqr-rbQ/SiiOIm-tBwAJ)
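A hedged sketch of what that looks like with the decode script from above; the beam values and output directory name are illustrative, not taken from the thread:
```
# Smaller beams decode faster but may raise the WER.
steps/decode_fmllr.sh --nj $nj --cmd "$decode_cmd" \
  --beam 10.0 --lattice-beam 4.0 --max-active 5000 \
  $graph_dir data/dev exp/tri4/decode_train_dev_fast
```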
##### [ Re: missing words while do-endpointing is true](https://groups.google.com/d/msg/kaldi-help/XIcyX3weV9w/EHQIC4VPEQAJ)
> The --do-endpointing is not really segmentation. It is for the scenario where you want to just stop the recognition at a certain point (e.g. to demonstrate how you might do that for an interactive application). So it will just discard the rest of the file.
##### [Decoding longer utterances of AMI dev](https://groups.google.com/d/msg/kaldi-help/Qraqjz83Tw4/5u9JBSaVBQAJ)
`Determinization finished earlier than the beam for utterance AMI_ES2011b_SDM_FEE041_0107148_0107863`
`LOG (latgen-faster:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:111) Log-like per frame for utterance AMI_ES2011b_SDM_FEE041_0107148_0107863 is 0.713222 over 713 frames.`
This just means "the lattice is not as deep as it might have been" and is nothing to worry about.
##### [ Is it possible to score in kaldi by ignoring the compound word splitting in particular language (Telugu). ](https://groups.google.com/d/msg/kaldi-help/vQetEk1h4m4/5zTYJYGdBQAJ)
Use `local/wer_output_filter` to normalize the output so that hypotheses and references are easier to compare
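A hedged example of such a filter, reading from stdin and writing to stdout; the actual replacements depend on the language and are illustrative, not taken from the thread:
```
#!/usr/bin/env bash
# local/wer_output_filter (hypothetical): normalize both hypotheses and
# references before scoring, e.g. drop <unk> and noise tags.
sed -e 's/<unk>//g' -e 's/\[noise\]//g'
```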
##### [ Recovering "OOV" phones after SAU conversion ](https://groups.google.com/d/msg/kaldi-help/8UV7ngdp72I/yYG4qnBZBwAJ)
How to keep `unk` out of the recognition results: use `fstprint` to dump the FST as text, remove everything related to `unk`, then use `fstcompile` to turn the text back into an FST
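A hedged sketch of that round trip on `G.fst` (symbol-table and FST paths are illustrative):
```
lang=data/lang_test
# Print as text, drop every arc that mentions <unk>, and compile back to binary.
# Optionally run fstconnect afterwards to remove states left unreachable.
fstprint --isymbols=$lang/words.txt --osymbols=$lang/words.txt $lang/G.fst \
  | grep -v '<unk>' \
  | fstcompile --isymbols=$lang/words.txt --osymbols=$lang/words.txt > G_nounk.fst
```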
##### [ The right way of adding new words to existing ngram LM. ](https://groups.google.com/d/msg/kaldi-help/d1KIOgREd84/qBBg98JCBAAJ)
[add_unigrams_arpa.pl](https://github.com/kaldi-asr/kaldi/blob/master/egs/wsj/s5/utils/lang/add_unigrams_arpa.pl) can add unigrams directly to an ARPA LM, and the probability given to each added word can be adjusted
##### [ Replacing <unk> in decode output by best matching phone combination ](https://groups.google.com/d/msg/kaldi-help/uGiExyOiyjc/tXlfTKVKCAAJ)
- Two possible ways to get rid of OOVs
##### [Some Kaldi Notes](http://jrmeyer.github.io/asr/2016/02/01/Kaldi-notes.html)
An introduction to `L.fst`, `L_disambig.fst`, `G.fst`, ...
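As a companion to that post, a hedged sketch of how `G.fst` is typically built from an ARPA LM in Kaldi recipes (paths are illustrative):
```
# Convert the ARPA LM into a WFST over the words.txt symbol table,
# mapping backoff arcs to the #0 disambiguation symbol.
arpa2fst --disambig-symbol='#0' --read-symbol-table=data/lang/words.txt \
  lm.arpa data/lang/G.fst
```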