Speech Integration - Web Speech API

# Speech Integration - Web Speech API

[toc]

<br/>

| Type | Web Speech API |
| -------- | -------- |
| STT (speech-to-text) | SpeechRecognition |
| TTS (text-to-speech) | SpeechSynthesis |

<br/>

## SpeechRecognition

### Basic Techniques

![image](https://hackmd.io/_uploads/HJqYzzoQp.png)
<div style="font-size: 12px; text-align: center;">figure from MDN website</div>

* Server-based (not processed locally)
    * In some browsers, such as Chrome, SpeechRecognition uses a server-based recognition engine, so it only works online.
    * To do speech recognition offline, the recognition engine must be embedded within the browser.
* A language should be set (`SpeechRecognition.lang`) to help recognize the voice. If it is not set, it defaults to the page's HTML `lang` attribute, or to the user agent's language setting if that is not set either.

```
On some browsers, like Chrome, using Speech Recognition on a web page involves a server-based recognition engine. Your audio is sent to a web service for recognition processing, so it won't work offline.
```

<br/>

---

### Streaming voice results with the SpeechRecognition settings

* SpeechRecognition.continuous
    * Controls whether continuous results are returned for each recognition, or only a single result. Defaults to single (false).
* SpeechRecognition.interimResults
    * Controls whether interim results should be returned (true) or not (false). Interim results are results that are not yet final (i.e. the SpeechRecognitionResult.isFinal property is false).

```javascript
// Some browsers do not support the Web Speech Recognition API
// (e.g. Firefox; use Firefox Nightly instead), so feature-detect first.
const SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
let speechRecognition;

if (SpeechRecognition) {
  speechRecognition = new SpeechRecognition();
  // Set both properties to true so the voice results arrive as a stream
  speechRecognition.continuous = true;
  speechRecognition.interimResults = true;
}
```

---

<br/>

## SpeechSynthesis

### Basic Techniques

![Group 1.png](https://hackmd.io/_uploads/S1WMs-9m6.png)
<div style="font-size: 12px; text-align: center;">figure from intersec website</div>

* Processed locally
    * SpeechSynthesis runs in the browser and, with locally installed voices, needs no server support.
* Main useful parameters:
    * voice selection (e.g. a male or female voice)
    * rate (speaking speed as a number)
    * language setting

<br/>

## Reference

* [Web Speech API - Speech Recognition (from mozilla wiki)](https://wiki.mozilla.org/Web_Speech_API_-_Speech_Recognition)
* [SpeechRecognition_MDN](https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition)
* [SpeechSynthesis_MDN](https://developer.mozilla.org/en-US/docs/Web/API/SpeechSynthesis)
* [Web Speech API - Creating a web interface with 0 clicks](https://techtalk.intersec.com/2023/03/web-speech-api-creating-a-web-interface-with-0-clicks/)
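
## Example: SpeechSynthesis parameters

The three SpeechSynthesis parameters listed earlier (voice, rate, language) can be tried out with a small sketch like the one below. It is only an illustration, not a definitive implementation: `pickVoice` and `speak` are hypothetical helper names that do not appear in the original notes, and the matching rule (exact language tag first, then the same base language) is an assumption. The browser calls used are `SpeechSynthesisUtterance`, `speechSynthesis.getVoices()` and `speechSynthesis.speak()`.

```javascript
// Hypothetical helper: choose a voice for the requested language tag.
// Tries an exact match first (e.g. "zh-TW"), then any voice that shares
// the base language (e.g. "en" for "en-AU"); returns null if none fits.
function pickVoice(voices, lang) {
  const base = lang.split('-')[0];
  return (
    voices.find((v) => v.lang === lang) ||
    voices.find((v) => v.lang.startsWith(base)) ||
    null
  );
}

// Hypothetical helper: speak `text` with the given language and rate.
// Returns false when speech synthesis is unavailable (e.g. outside a browser).
function speak(text, { lang = 'en-US', rate = 1 } = {}) {
  if (typeof window === 'undefined' || !('speechSynthesis' in window)) {
    return false;
  }
  const utterance = new SpeechSynthesisUtterance(text);
  utterance.lang = lang; // language setting
  utterance.rate = rate; // speaking rate, 1 is the default speed
  // Note: getVoices() may return an empty list until the browser fires
  // the "voiceschanged" event; in that case the default voice is used.
  const voice = pickVoice(window.speechSynthesis.getVoices(), lang);
  if (voice) {
    utterance.voice = voice; // voice selection
  }
  window.speechSynthesis.speak(utterance);
  return true;
}
```

In a browser, a call such as `speak('你好', { lang: 'zh-TW', rate: 0.9 })` would queue the text with a Traditional Chinese voice if one is installed. The voice list differs per browser and operating system, so the selected voice is not guaranteed to be the same everywhere.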