# tsm-rnnt Installation Guide
[https://github.com/Chung-I/tsm-rnnt](https://github.com/Chung-I/tsm-rnnt)
## Installation
1. conda create --name taiwan python==3.6.8
2. pip install allennlp==0.9.0
3. pip install overrides==3.1.0
4. pip install any other dependencies that pip complains about (use the newest versions)
5. install warp-transducer
```
-- cuda --
1. driver:
sudo apt install nvidia-driver-470
sudo apt install nvidia-cuda-toolkit (default is version 10)
2. use g++-8 / gcc-8:
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-8 30
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-8 30
3. install warp-transducer
export CUDA_HOME=/usr/lib/cuda (where include/cuda.h and lib/libcudart.so live, e.g. /usr/local/cuda)
git clone https://github.com/HawkAaron/warp-transducer
cd warp-transducer
mkdir build
cd build
cmake -DCUDA_TOOLKIT_ROOT_DIR=$CUDA_HOME ..
make
cd ../pytorch_binding
python setup.py install
```
6. modify ```*/lib/python3.6/site-packages/allennlp/nn/beam_search.py``` so the backpointer is cast to ```torch.int64```:
```python=
backpointer = backpointer.type(torch.int64)
```
7. [pretrained model](https://drive.google.com/drive/folders/14mXqSZBGPEMAgYQLPXwhhatJFcO1w0vD) (try this first): 13.8% CER on 台羅數字調 (Tâi-lô with numeric tones)
## Quick start
```
allennlp evaluate ../TAT-tailo-specaug/ /storage/public/TAT-Vol1-train-lavalier-dev --output-file run.out --include-package stt
```
```
echo "../SuiSiann-0.2.1/ImTong/SuiSiann_0002.wav " | allennlp predict --predictor online_stt --output-file run.out --include-package stt ../TAT-tailo-specaug/ -
```
## Reproduce the result
1. Evaluating all 275 files in ```TAT-Vol1-train/IU_IUF0008``` gives **a CER of 10.5%**; some implementation details are listed below:
- separate characters with spaces
- remove the last character of the reference transcripts, because some (*not all*) of them include a trailing punctuation mark
The source code is at ```/home/jiyuntu/prediction-tools/CER.py```.
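The scoring steps above can be sketched in pure Python. This is a hypothetical reimplementation of the idea behind `CER.py`, not the actual script; the unconditional drop of the reference's last character mirrors the detail noted above:

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences (one-row DP)."""
    n = len(hyp)
    dp = list(range(n + 1))
    for i in range(1, len(ref) + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,      # deletion
                        dp[j - 1] + 1,  # insertion
                        prev + (ref[i - 1] != hyp[j - 1]))  # substitution
            prev = cur
    return dp[n]

def cer(reference, hypothesis):
    """Character error rate over space-separated characters.
    The reference's last character is dropped because some (not all)
    references carry a trailing punctuation mark."""
    ref = reference.split()[:-1]
    hyp = hypothesis.split()
    return edit_distance(ref, hyp) / max(len(ref), 1)
```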
## Tips
### Debug allennlp with VS Code
1. make a soft link from site-packages/allennlp to tsm-rnnt/

2. open tsm-rnnt/run.py
(the gear icon in the Run and Debug panel opens the launch file)

3. add "args" to launch.json

4. set breakpoints in the allennlp source code to debug (and see how allennlp works)
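Steps 2–3 above amount to a launch configuration like the following sketch. The `"args"` here simply mirror the Quick start evaluate command and are placeholders; adjust the paths for your machine:

```json
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Debug run.py",
            "type": "python",
            "request": "launch",
            "program": "${workspaceFolder}/run.py",
            "args": [
                "evaluate",
                "../TAT-tailo-specaug/",
                "/storage/public/TAT-Vol1-train-lavalier-dev",
                "--output-file", "run.out",
                "--include-package", "stt"
            ],
            "console": "integratedTerminal"
        }
    ]
}
```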
<!--
## Installation
### Environment
PyTorch 1.10
CUDA 11.3
[kaldi](https://github.com/kaldi-asr/kaldi)
### Steps
1. ```git clone``` the repository
### Problems
1. ```allennlp==0.8.4``` is not supported
2. using ```allennlp==2.8.0```
3. because https://github.com/allenai/allennlp/releases?page=4:
*Tokenizer specification changed because of #3361. Instead of something like "tokenizer": {"word_splitter": {"type": "spacy"}}, you now just do "tokenizer": {"type": "spacy"} (more technically: the WordTokenizer has now been removed, with the things we used to call WordSplitters now just moved up to be top-level Tokenizers themselves).*
WordTokenizer -> SpacyTokenizer
4. TokenIndexer[List[int]] -> Tokenizer
5. deprecated
-->