Speech Recognition

tags: Smart-IoT IoT NTA-Lab Speech-Recognition

GitHub: https://github.com/JimboChien/speech-recognition

📦 Installation

  1. Install the required packages

    $ sudo pip3 install SpeechRecognition wave gTTS pygame jieba pyusb click
    $ sudo apt-get install python3-pyaudio portaudio19-dev flac
    
  2. Download the sample code

    $ git clone https://github.com/JimboChien/speech-recognition
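
Before moving on, it can help to confirm that the installed packages are importable. The following is a minimal sketch and not part of the repository; `missing_modules` and `report` are hypothetical helpers, and the import names listed are the module names these pip packages normally install under.

```python
import importlib.util

# Import names for the pip packages installed above. The pip name and the
# import name differ for some of them (SpeechRecognition -> speech_recognition,
# gTTS -> gtts, pyusb -> usb).
REQUIRED_MODULES = ["speech_recognition", "gtts", "pygame", "jieba", "usb", "click"]


def missing_modules(names):
    """Return the module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def report(names=REQUIRED_MODULES):
    """Print a one-line summary and return the list of missing modules."""
    missing = missing_modules(names)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All modules found.")
    return missing
```

Calling `report()` in a `python3` session prints either the missing module names or a confirmation that everything was found.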
    

🔊 Microphone and Speaker Setup

  1. List the microphones (capture devices)

    $ arecord -l
    


  2. List the speakers (playback devices)

    $ aplay -l
    


  3. Open the ALSA configuration file for the microphone and speaker

    $ sudo nano /home/pi/.asoundrc
    
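  4. Fill it in to match the microphone and speaker you are using. In each "hw:X,Y" entry, X is the card number and Y is the device number reported by aplay -l (playback) and arecord -l (capture)

    pcm.!default {
       type asym
       playback.pcm {
         type plug
         slave.pcm "hw:0,0"
       }
       capture.pcm {
         type plug
         slave.pcm "hw:1,0"
       }
    }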
    
  5. Adjust the volume

    $ alsamixer
    
  6. Test the speaker

    $ speaker-test -t wav -c 2
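
The card and device numbers in step 4 come from the `arecord -l` and `aplay -l` listings. As a rough sketch of that mapping, the helpers below pull the numbers out of a listing and render the same `.asoundrc` layout shown above. `list_devices` and `render_asoundrc` are hypothetical names, and the regular expression assumes the usual `card N: ..., device M: ...` line format.

```python
import re

# Matches listing lines such as:
#   card 1: Device [USB PnP Sound Device], device 0: USB Audio [USB Audio]
CARD_LINE = re.compile(r"^card (\d+): .*?, device (\d+):")


def list_devices(listing):
    """Return (card, device) tuples found in `arecord -l` / `aplay -l` output."""
    devices = []
    for line in listing.splitlines():
        match = CARD_LINE.match(line.strip())
        if match:
            devices.append((int(match.group(1)), int(match.group(2))))
    return devices


def render_asoundrc(playback, capture):
    """Build the .asoundrc text for a (card, device) playback/capture pair."""
    return (
        "pcm.!default {\n"
        "   type asym\n"
        "   playback.pcm {\n"
        "     type plug\n"
        f'     slave.pcm "hw:{playback[0]},{playback[1]}"\n'
        "   }\n"
        "   capture.pcm {\n"
        "     type plug\n"
        f'     slave.pcm "hw:{capture[0]},{capture[1]}"\n'
        "   }\n"
        "}\n"
    )
```

For example, `render_asoundrc((0, 0), (1, 0))` reproduces the configuration in step 4, with playback on card 0 and capture on card 1.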
    

Natural Language Processing (NLP)

Introduction to Natural Language Processing

  1. Different species have their own ways of communicating


  2. NLP is the bridge for communication between humans and machines


Typical NLP Applications

  • Sentiment analysis
  • Chatbots
  • Speech recognition
  • Machine translation

Cloud-Based Speech Recognition

Speech-to-Text

Speech is first captured from the microphone and then handed to Google's cloud service, which converts it into a string.

  1. Run Ex_1.py

    $ python3 Ex_1.py
    


  2. Type s and press Enter


  3. Recording starts once the input is entered


  4. When recording has finished and recognition is complete, the result is returned

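
The contents of Ex_1.py are not reproduced here. As a minimal sketch of the flow described above, assuming the script is built on the SpeechRecognition package installed earlier, the recording and recognition step could look like this. `listen_and_transcribe` and `main` are hypothetical names, and the `zh-TW` language code is an assumption for Traditional Chinese input.

```python
def listen_and_transcribe(recognizer, source, language="zh-TW"):
    """Record one utterance from `source` and return Google's transcription."""
    # Sample the background noise briefly so that quiet rooms and noisy
    # rooms both get a sensible energy threshold.
    recognizer.adjust_for_ambient_noise(source, duration=0.5)
    audio = recognizer.listen(source)  # blocks until the speaker pauses
    return recognizer.recognize_google(audio, language=language)


def main():
    # Imported lazily so the helper above can be loaded on a machine
    # without the package or a microphone.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    while input("Type s + Enter to record (anything else quits): ").strip() == "s":
        with sr.Microphone() as source:
            try:
                print(listen_and_transcribe(recognizer, source))
            except sr.UnknownValueError:
                print("The speech could not be understood.")
            except sr.RequestError as err:
                print(f"The request to Google failed: {err}")
```

Calling `main()` on the Raspberry Pi starts the prompt loop. Recognition runs in the cloud, so a network connection is required.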

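Text-to-Speech

The string is uploaded to Google, where the Google text-to-speech voice converts it to speech and returns an audio file. The audio file is then played with the pygame package.

  1. Run Ex_2.py

    $ python3 Ex_2.py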
    


  2. Type a line of Chinese text and press Enter; the text is read aloud

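
The contents of Ex_2.py are not reproduced here. A minimal sketch of the same two steps, assuming the gTTS and pygame packages installed earlier, might look like the following. `text_to_mp3` and `play_mp3` are hypothetical names, the `speech.mp3` file name and the `zh-TW` language code are assumptions, and the `tts_factory` parameter exists only so the synthesis step can be exercised without network access.

```python
import time


def text_to_mp3(text, mp3_path="speech.mp3", lang="zh-TW", tts_factory=None):
    """Synthesize `text` with Google TTS and save the audio to `mp3_path`."""
    if not text.strip():
        raise ValueError("nothing to say")
    if tts_factory is None:
        from gtts import gTTS  # lazy import; needs the package and a network connection
        tts_factory = gTTS
    tts_factory(text=text, lang=lang).save(mp3_path)
    return mp3_path


def play_mp3(mp3_path):
    """Play an MP3 file with pygame and block until playback has finished."""
    import pygame  # lazy import; needs a working audio output

    pygame.mixer.init()
    pygame.mixer.music.load(mp3_path)
    pygame.mixer.music.play()
    while pygame.mixer.music.get_busy():
        time.sleep(0.1)
    pygame.mixer.quit()
```

On the Raspberry Pi, `play_mp3(text_to_mp3(input("> ")))` reads one typed line aloud.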

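Full-Sentence Match Responses

When a string is received, it is compared against the target sentences; if it matches one, the corresponding reply sentence is returned.

  1. Target sentences are stored in keysentence.txt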


  2. Reply sentences are stored in reply.txt


  3. Both files can be edited with gui.py


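    • Insert an empty line on the left side and press the Add button; an empty line is inserted at the corresponding position on the right side
    • When you scroll, both panes scroll together, which makes it easy to edit the corresponding lines
    • After editing, press Save to write the changes back to the original files
  4. Run Ex_3.py

    $ python3 Ex_3.py

    Ex_3.py essentially combines Ex_1.py and Ex_2.py

  5. Type s and press Enter to start recording. When the spoken sentence matches a target sentence, the corresponding reply is given both as speech and as text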

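
The matching step itself can be sketched in a few lines. This is an illustration rather than the code in Ex_3.py: it assumes that line N of keysentence.txt pairs with line N of reply.txt (which is what the side-by-side editor suggests) and that both files are UTF-8 encoded. `load_replies` and `find_reply` are hypothetical names.

```python
def load_replies(key_path="keysentence.txt", reply_path="reply.txt"):
    """Pair line N of the key file with line N of the reply file."""
    with open(key_path, encoding="utf-8") as key_file:
        keys = [line.strip() for line in key_file]
    with open(reply_path, encoding="utf-8") as reply_file:
        replies = [line.strip() for line in reply_file]
    # Blank key lines are skipped; zip() stops at the shorter file.
    return {key: reply for key, reply in zip(keys, replies) if key}


def find_reply(sentence, table):
    """Return the reply for an exact whole-sentence match, otherwise None."""
    return table.get(sentence.strip())
```

Because the lookup is an exact match, a sentence that merely contains a target sentence gets no reply. That limitation is what the word segmentation section later addresses.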

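Controlling GPIO by Voice

  1. Raspberry Pi GPIO pins can be numbered in two modes, BOARD and BCM. Use the pinout command to view them: BOARD mode corresponds to the Pin # column and BCM mode to the Name column

    $ pinout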
    
    Name     Pin #   Pin #   Name
    3V3      (1)     (2)     5V
    GPIO2    (3)     (4)     5V
    GPIO3    (5)     (6)     GND
    GPIO4    (7)     (8)     GPIO14
    GND      (9)     (10)    GPIO15
    GPIO17   (11)    (12)    GPIO18
    GPIO27   (13)    (14)    GND
    GPIO22   (15)    (16)    GPIO23
    3V3      (17)    (18)    GPIO24
    GPIO10   (19)    (20)    GND
    GPIO9    (21)    (22)    GPIO25
    GPIO11   (23)    (24)    GPIO8
    GND      (25)    (26)    GPIO7
    GPIO0    (27)    (28)    GPIO1
    GPIO5    (29)    (30)    GND
    GPIO6    (31)    (32)    GPIO12
    GPIO13   (33)    (34)    GND
    GPIO19   (35)    (36)    GPIO16
    GPIO26   (37)    (38)    GPIO20
    GND      (39)    (40)    GPIO21
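  2. Connect an LED

  3. Run Ex_4.py

    $ python3 Ex_4.py

  4. Type s and press Enter

  5. Recording starts once the input is entered

  6. When 開燈 ("turn on the light") is received, the LED lights up; when 關燈 ("turn off the light") is received, the LED goes dark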
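
The mapping from a recognized sentence to the LED can be sketched as follows. This is not the code in Ex_4.py: it assumes the `RPi.GPIO` library (not in the install list above, but normally present on Raspberry Pi OS) and an LED wired to GPIO17, which is physical pin 11 in the table. `LED_PIN`, `command_to_state`, and `apply_command` are hypothetical names.

```python
LED_PIN = 17  # assumed wiring: BCM 17 (GPIO17), i.e. BOARD pin 11


def command_to_state(text):
    """Map a recognized sentence to True (on), False (off), or None (ignore)."""
    text = text.strip()
    if text == "開燈":  # "turn on the light"
        return True
    if text == "關燈":  # "turn off the light"
        return False
    return None


def apply_command(text, pin=LED_PIN):
    """Drive the LED according to `text` and return the state that was applied."""
    state = command_to_state(text)
    if state is None:
        return None  # not a command, so the GPIO is left untouched
    import RPi.GPIO as GPIO  # lazy import; only available on a Raspberry Pi

    GPIO.setmode(GPIO.BCM)  # use the Name column numbering
    GPIO.setup(pin, GPIO.OUT)
    GPIO.output(pin, GPIO.HIGH if state else GPIO.LOW)
    return state
```

With `GPIO.setmode(GPIO.BOARD)` the same LED would be addressed as pin 11 instead of 17.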

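Automatic Chinese Word Segmentation

In practice, people do not address a voice assistant with bare commands. They use more natural language, for example: XXX,請幫我開燈 ("XXX, please turn on the light for me"). To find the control word in such a sentence, we need the word segmentation step from natural language processing.

  • Word: the smallest unit that carries meaning in natural language processing
    • Single-character words: 水 (water), 火 (fire)
    • Two-character words: 蘋果 (apple), 電腦 (computer)
  • Word segmentation: splitting a sentence into a sequence of words
    • Input: 今天天氣很好 ("The weather is nice today")
    • Output: [今天],[天氣],[很],[好]
  1. Run Ex_5.py

    $ python3 Ex_5.py

  2. Enter the sentence to segment

  3. You can enter further sentences repeatedly

  4. Enter quit to exit

Segmentation results are not always completely correct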
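
A loop with the same behaviour (segment each line, stop at quit) can be sketched with the jieba package installed earlier. This is an illustration, not the code in Ex_5.py. `segment` and `segment_lines` are hypothetical names, and the `cut` parameter is there only so another tokenizer can be swapped in.

```python
def segment(sentence, cut=None):
    """Split `sentence` into a list of words, dropping whitespace tokens."""
    if cut is None:
        import jieba  # lazy import; the dictionary is loaded on first use
        cut = jieba.lcut
    return [word for word in cut(sentence) if word.strip()]


def segment_lines(lines, cut=None):
    """Yield the segmentation of each input line until the line `quit`."""
    for line in lines:
        line = line.strip()
        if line == "quit":
            break
        if line:
            yield segment(line, cut)
```

For interactive use, `for words in segment_lines(sys.stdin): print(words)` (after `import sys`) reads sentences from the terminal until quit is typed.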

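Controlling GPIO with Word Segmentation

  1. Run Ex_06.py

    $ python3 Ex_06.py

  2. Try giving commands in a natural way, for example: 幫我開燈 ("turn on the light for me")

Try a sentence that contains more than one command

  • 開燈後開始閃爍 ("turn on the light, then start blinking")
  • 閃爍,再關燈 ("blink, then turn off the light")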
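
One way to handle several commands in one sentence is to scan the segmented words in order and keep those that are known control words. The sketch below is an assumption about how this could be done, not the code in Ex_06.py. The action labels `on`, `off`, and `blink` and the names `COMMANDS`, `tokenize`, and `extract_commands` are hypothetical.

```python
# Control words and the (hypothetical) action label each one stands for.
COMMANDS = {"開燈": "on", "關燈": "off", "閃爍": "blink"}


def tokenize(sentence):
    """Segment `sentence` with jieba after registering the control words."""
    import jieba  # lazy import; the dictionary is loaded on first use

    for word in COMMANDS:
        # Registering a word makes the segmenter more likely to keep it
        # as a single token instead of splitting it.
        jieba.add_word(word)
    return jieba.lcut(sentence)


def extract_commands(tokens):
    """Return the actions named in `tokens`, in the order they were spoken."""
    return [COMMANDS[token] for token in tokens if token in COMMANDS]
```

Running `extract_commands(tokenize(sentence))` on a recognized sentence gives an ordered list of actions that the GPIO code can then execute one after another.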