Personalized TTS services for Mandarin speakers with speech impairments are rarely reported. Taiwan started the VoiceBanking project in 2020, aiming to build a complete set of services that deliver personalized Mandarin TTS systems to amyotrophic lateral sclerosis patients. This paper reports the corpus design, corpus recording, data purging and correction, and the evaluation of the developed personalized TTS systems for the VoiceBanking project. The resulting corpus is named the VoiceBank-2023 speech corpus after its release year. It contains 29.78 hours of utterances, prompted by short paragraphs and common phrases, spoken by 111 native Mandarin speakers. The corpus is labeled with gender, degree of speech impairment, type of user, transcription, SNR, and speaking rate. VoiceBank-2023 is available on request for non-commercial use, and all parties are welcome to join the VoiceBanking project to improve services for the speech impaired.
針對北京話語音障礙者的個人化文字轉語音系統服務很少被提及。台灣於2020年啟動了語音銀行計劃,旨在建立一套完整的服務,為漸凍人患者提供個人化的北京話文字轉語音系統。本論文報告了語音銀行計劃的語料庫設計、語料庫記錄、語料庫數據清理和校正,以及對所開發的個人化文字轉語音系統的評估。開發的語料庫因其發布年份而命名為VoiceBank-2023語音語料庫。該語料庫包含111位母語為北京話的人所講的29.78小時的短段落的提示和常用片語。語料庫標註了性別、語言障礙程度、用戶類型、轉錄出的文字、訊雜比和語速等信息。VoiceBank-2023可應要求用於非商業用途,也歡迎各方各界加入語音銀行計劃,以改善為語言障礙者提供的服務。
Introduction
The voice of each individual may be regarded as part of his or her identity. Amyotrophic lateral sclerosis (ALS) patients gradually lose the ability to control their muscles. This affects control of the vocal folds and the shape of the vocal tract, so that patients find it increasingly difficult to articulate and communicate smoothly. ALS patients are therefore encouraged to record their voices before they develop dysarthria. The recorded speech can be used to construct personalized text-to-speech (TTS) systems, which serve as speech-generating devices (SGD) for augmentative and alternative communication (AAC).
每個人的聲音都可以視為其身份象徵。肌萎縮性脊髓側索硬化症(ALS)患者會逐漸失去控制肌肉的能力,從而影響控制聲門閉合和聲道形狀,導致發音與溝通困難。我們鼓勵ALS患者在失去語言能力之前錄製他們的聲音。錄製下來的語音可以用來構建個人化的文字轉語音(TTS)系統,作為輔助和替代溝通(AAC)的語音生成設備(SGD)。
In English-speaking countries, many companies and research institutes provide services that build personalized TTS systems for ALS patients. A notable example is ModelTalker \cite{ModelTalker}, the earliest and largest research platform in the US, established by the Nemours Speech Research Laboratory at the Alfred I. duPont Hospital for Children in Delaware, US. With advances in speech technology, several commercial SGD providers have also emerged: CereProc CereVoice Me \cite{CereProc}, VocalID \cite{VOCALiD}, Acapela my-own-voice DNN \cite{Acapela}, The Voice Keeper \cite{VoiceKeeper}, and SpeakUnique\footnote{SpeakUnique powers the user-friendly voice-banking platform, “I Will Always Be Me” (https://www.iwillalwaysbeme.com/)} \cite{SpeakUnique}.