# DSP Final Project

## Final report

https://docs.google.com/document/d/1VgmEAUuor8mXLuRDJTeQiFOgc1tSf0N_bjBmeiaH4-Q/edit

## TODO

- Convert each sec ... done
- Convert wave file to spectrogram ... done
- Find a female voice data set ... done
- Trim the leading silence from each voice clip ... done
- That's all for now

## Reference

* https://github.com/andabi/deep-voice-conversion
* https://github.com/robmsmt/KerasDeepSpeech
* https://medium.com/@jonathan_hui/speech-recognition-deep-speech-ctc-listen-attend-and-spell-d05e940e9ed1
* THCHS-30 dataset paper: https://arxiv.org/pdf/1512.01882.pdf
* CTC loss paper: https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.75.6306&rep=rep1&type=pdf
* CTC loss illustrations: https://distill.pub/2017/ctc/
* [DeepSpeech2 on PaddlePaddle](https://github.com/PaddlePaddle/DeepSpeech)

## text to speech

[Tacotron-2](https://github.com/Rayhane-mamah/Tacotron-2)

## voice segmentation

* [pyAudioAnalysis](https://github.com/tyiannak/pyAudioAnalysis) (segmentation wiki: https://github.com/tyiannak/pyAudioAnalysis/wiki/5.-Segmentation)
* [inaSpeechSegmenter](https://github.com/ina-foss/inaSpeechSegmenter)

## Data set

`organize_data_info.py`: generates the data set info and stores it in a .csv file

`run_wav2mel.py`: usage is `run_wav2mel.py <filename list> <output dir>`

Voice data (.wav): https://drive.google.com/drive/folders/1m6euURTOeQhEcJI4wguDgvvWy8xptvp9?usp=sharing

Generate file list:

```bash
$ for f in ./*.wav; do echo "$f" >> mylist.txt; done
```
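
The shell loop above appends with `>>`, so running it twice leaves duplicate entries in `mylist.txt`. Below is a minimal Python sketch of the same step that overwrites the list instead. The function name `write_wav_list` and its parameters are hypothetical and not part of this repository. It also writes paths without the leading `./` that the shell loop produces.

```python
from pathlib import Path


def write_wav_list(wav_dir, list_path="mylist.txt"):
    """Write one .wav path per line to list_path and return the paths written."""
    wav_dir = Path(wav_dir)
    # Sort so the list order is stable across runs and machines.
    wav_paths = sorted(p for p in wav_dir.glob("*.wav") if p.is_file())
    lines = [p.as_posix() for p in wav_paths]
    # Mode "w" truncates the file, unlike the shell loop's ">>", which appends.
    with open(list_path, "w", encoding="utf-8") as fh:
        for line in lines:
            fh.write(line + "\n")
    return lines
```

Calling `write_wav_list(".")` from the directory that holds the .wav files should produce a `mylist.txt` that can be passed as the `<filename list>` argument, assuming the script reads one path per line.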