# Whisper Speech-to-Text

# Source
https://github.com/openai/whisper

# Test code
https://github.com/ddasdkimo/RaiWhisper

# Chinese and English speech inference test with models downloaded from Hugging Face
1.run.py

## Comparison log
Machine: MacBook Pro

Input: `chinese2_32.wav`, a 7-second audio clip. Reference text:
```
多元學藝廳，在您的右手邊，請直走後由該側三樓前往此處
```

### openai/whisper-tiny
```
['多元雪一天,在您的右手邊,請枝早後,由該測三樓前往此處。']
time: 0.9937560558319092
```

### openai/whisper-small
```
['多元雪一天,在您的右手邊,請枝早後,由該測三樓前往此處。']
time: 6.127533197402954
```

### openai/whisper-medium
```
['多元學藝廳在您的右手邊請直走後由該側三樓前往此處。']
time: 19.628818035125732
```

### openai/whisper-large
```
['多元學藝廳在您的右手邊請直走後由該側三樓前往此處']
time: 40.77557301521301
```

# Features to verify

## Transfer learning
https://huggingface.co/blog/fine-tune-whisper

## Prompting
Supplying a prompt requires the `initial_prompt` argument:
```
whisper_model.transcribe(
    audio_file_path,
    initial_prompt=" - How are you? - I'm fine, thank you.",
    **other_whisper_options
)
```

## onnxruntime test
https://medium.com/microsoftazure/build-and-deploy-fast-and-portable-speech-recognition-applications-with-onnx-runtime-and-whisper-5bf0969dd56b

https://github.com/onnxruntime/Whisper-HybridLoop-Onnx-Demo

## Jetson test
https://github.com/maxbbraun/whisper-edge#jetson-nano
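
## Scoring the transcripts (sketch)
The per-model transcripts recorded earlier in this README are easier to compare with a number than by eye. The following is a minimal sketch, not part of the RaiWhisper repository, that computes a character error rate (CER) against the reference text using a plain Levenshtein distance. The helper names `normalize`, `edit_distance`, and `cer` are hypothetical, and stripping punctuation and whitespace before comparing is an assumption made here because the models punctuate differently.

```python
import unicodedata


def normalize(text: str) -> str:
    """Drop punctuation and whitespace so only the spoken characters are compared."""
    return "".join(
        ch
        for ch in text
        if not ch.isspace() and not unicodedata.category(ch).startswith("P")
    )


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(
                min(
                    prev[j] + 1,         # delete a character from a
                    curr[j - 1] + 1,     # insert a character into a
                    prev[j - 1] + cost,  # substitute, or keep if equal
                )
            )
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    ref = normalize(reference)
    hyp = normalize(hypothesis)
    if not ref:
        raise ValueError("reference text is empty after normalization")
    return edit_distance(ref, hyp) / len(ref)


# Reference text and transcripts copied from the comparison log.
REFERENCE = "多元學藝廳，在您的右手邊，請直走後由該側三樓前往此處"
TRANSCRIPTS = {
    "openai/whisper-tiny": "多元雪一天,在您的右手邊,請枝早後,由該測三樓前往此處。",
    "openai/whisper-small": "多元雪一天,在您的右手邊,請枝早後,由該測三樓前往此處。",
    "openai/whisper-medium": "多元學藝廳在您的右手邊請直走後由該側三樓前往此處。",
    "openai/whisper-large": "多元學藝廳在您的右手邊請直走後由該側三樓前往此處",
}

for model_name, text in TRANSCRIPTS.items():
    print(f"{model_name}: CER = {cer(REFERENCE, text):.2f}")
```

With the transcripts logged above, this gives a CER of 0.25 for the tiny and small models (6 of 24 characters substituted) and 0.00 for medium and large, once punctuation is ignored.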