# Speech Processing
[toc]
---
## 1. General information and organizational issues
:::info
- Voice-based human-machine communication via an HMI (Human-Machine Interface)

- Dialogue systems such as chatbots

- the middle stage of a dialogue system, between ASR and TTS, is Natural Language Processing (NLP)
- TTS: Text-to-Speech
- ASR: Automatic Speech Recognition
- speech transcription from acoustic to text form

- replacement of keyboard by voice
- automated transcription of audio records/streams
- speaker recognition
- biometric identification systems for authorization purposes and forensic applications
- human articulatory system

- speech production model

- samples of different parts of speech signal

- plosive sounds are hard to handle because of their short duration
- general description of speech signal
- acoustic level: analyzes only the waveform itself, without considering its content
- phonetic level: considers the information content and tries to segment the signal into subword units
- signal sampling and quantization, Pulse Code Modulation (PCM)

- speech sampling

- linear quantization of speech signal

- frequency perception

- intensity

- loudness

> Human hearing is most sensitive to sounds at frequencies from 200 to 5000 Hz.
:::
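
The sampling and linear quantization steps of PCM listed above can be sketched in a few lines. This is a minimal illustration, not the method from the slides: the function names, the 1 kHz test tone, the 8 kHz sampling rate, and the 8-bit word length are all assumptions chosen for the example. The quantizer is a plain uniform (linear) one, so its reconstruction error is bounded by half a quantization step for inputs within full scale.

```python
import math


def sample_sine(freq_hz, sample_rate_hz, duration_s, amplitude=1.0):
    """Sample a sine tone at uniform time intervals (the sampling step of PCM)."""
    n_samples = int(round(sample_rate_hz * duration_s))
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(n_samples)
    ]


def quantize_linear(samples, n_bits, full_scale=1.0):
    """Map each sample to a signed integer code using uniform steps."""
    max_code = 2 ** (n_bits - 1) - 1
    min_code = -(2 ** (n_bits - 1))
    codes = []
    for x in samples:
        code = int(round(x / full_scale * max_code))
        # Clip values that exceed the representable range.
        codes.append(max(min_code, min(max_code, code)))
    return codes


def dequantize_linear(codes, n_bits, full_scale=1.0):
    """Convert integer codes back to amplitudes (the decoder side)."""
    max_code = 2 ** (n_bits - 1) - 1
    return [c / max_code * full_scale for c in codes]


# Hypothetical example values: a 1 kHz tone, 8 kHz sampling, 8-bit codes.
samples = sample_sine(freq_hz=1000, sample_rate_hz=8000, duration_s=0.01)
codes = quantize_linear(samples, n_bits=8)
reconstructed = dequantize_linear(codes, n_bits=8)
max_error = max(abs(a - b) for a, b in zip(samples, reconstructed))
print(len(samples), min(codes), max(codes), max_error)
```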
---
## 2. Basic time-domain and spectral characteristics of speech signal
:::info
- Most of these are mathematical equations; see the lecture slides.
:::
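
The notes do not list the equations themselves. As an assumption about what such a section typically covers, the sketch below computes two common time-domain characteristics per frame: short-time energy (the sum of squared samples) and zero-crossing rate (the fraction of adjacent sample pairs whose signs differ). The function names, frame length, and hop size are hypothetical and not taken from the slides.

```python
def split_frames(signal, frame_len, hop):
    """Cut the signal into overlapping frames of frame_len samples."""
    return [
        signal[start:start + frame_len]
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


def short_time_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


# Hypothetical toy signal: a rapidly alternating part followed by a constant part.
signal = [1.0, -1.0, 1.0, -1.0, 0.5, 0.5, 0.5, 0.5]
for frame in split_frames(signal, frame_len=4, hop=4):
    print(short_time_energy(frame), zero_crossing_rate(frame))
```

In speech, a high zero-crossing rate combined with low energy is often associated with unvoiced sounds, while voiced sounds tend to show the opposite pattern; any thresholds would have to be tuned on real data.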