## 2. NormAdp

This block performs prior-adapted normalization of parameters such as syllable duration and pause duration.

### Input / Output

input: 'plm.al', 'NormPriors.f40.mat'
output: 'NormFactors.f40.mat'

### How to run

Open a terminal in the normadp folder and run:

```terminal
python3 NormAdp.py
```

or

```terminal
python3 -c "import NormAdp; NormAdp.NormAdp('plm.al','NormPriors.f40.mat','NormFactors.f40.json')"
```

Because NormAdp involves rand, its output has to be compared against the MATLAB output manually (a minimal comparison sketch is given at the end of this section).

Parameter descriptions:

| Symbol | Meaning |
| -------- | -------- |
| $g$ | global (corpus-wide) statistics |
| $k$ | the k-th utterance in the corpus |
| $n,k$ | the n-th syllable of the k-th utterance |
| $sd'_{n,k}$ | normalized syllable duration |
| $sd_{n,k}$ | raw (un-normalized) syllable duration |
| $x_k$ | inverse speaking rate of the k-th utterance |
| $\sigma^{sd},\mu^{sd}$ | standard deviation and mean estimated from the syllable duration–speaking rate distribution, which resembles a Gamma distribution |

Normalization for speaking rate, syllable duration: because the pdf of syllable duration is not evenly distributed across speaking rates, the ordinary normalization without the prior is a plain normalization against speaking rate, scaled by its standard deviation (see the sketch at the end of this section):

$$sd'_{n,k}=\frac{sd_{n,k}-x_k}{\tilde{\sigma}^{sd}(x_k)}\cdot \sigma^{sd}_g+\mu^{sd}_g$$

Standard-deviation term: $\tilde{\sigma}^{sd}(x_k)= a_1x_k^2 + b_1x_k + c_1$, which is a smooth curve.

-----------------

Parameter descriptions:

| Symbol | Meaning |
| -------- | -------- |
| $a_{1}^{*}, b_{1}^{*}, c_{1}^{*}$ | standard-deviation curve parameters after MAPLR |
| $a_{1}, b_{1}, c_{1}$ | standard-deviation curve parameters |
| $\mathbf{o}^{sd}=\left\{o_{k}^{sd}\right\}_{k=1 \sim K}$ | observed utterance-wise standard deviations; $o_{k}^{sd}$ is the observed syllable-duration standard deviation of the k-th utterance |
| $w(\mathbf{x})$ | weight function |
| $\bar{\sigma}$ | mean of the standard deviations |
| $\bar{x}$ | mean inverse speaking rate |
| $\lambda$ | Lagrange multiplier, used to locate the extremum |

The MAPLR method is formulated to estimate $a_1$, $b_1$, and $c_1$ with a heuristic constraint that lets the smooth standard-deviation curve pass through the centroid of the adaptation data:

$$
\begin{aligned}
& a_{1}^{*}, b_{1}^{*}, c_{1}^{*} \approx \\
& \arg \max _{a_{1}, b_{1}, c_{1}}\left[\begin{array}{l}
\ln \left(\left(P\left(\mathbf{o}^{sd} \mid a_{1}, b_{1}, c_{1}\right)\right)^{w(\mathbf{x})} P\left(a_{1}, b_{1}, c_{1}\right)\right) \\
+\lambda\left(\bar{\sigma}-a_{1} \bar{x}^{2}-b_{1} \bar{x}-c_{1}\right)
\end{array}\right]
\end{aligned}
$$

$$
P\left(\mathbf{o}^{sd} \mid a_{1}, b_{1}, c_{1}\right)=\prod_{k} N\left(o_{k}^{sd} ; \tilde{\sigma}^{sd}\left(x_{k}\right), v^{sd}\right)
$$

$$
w(\mathbf{x})=\operatorname{std}\left(x_{k}\right) / \operatorname{std}\left(\hat{x}_{k}\right)
$$
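
The speaking-rate normalization above is a direct element-wise mapping, so it can be prototyped in a few lines of NumPy. The sketch below is illustrative only, not the actual NormAdp.py code: the names `sigma_curve`, `normalize_syllable_durations`, `sd_sigma_g`, and `sd_mu_g` are hypothetical, and taking $x_k$ to be the mean syllable duration of the utterance is an assumption.

```python
import numpy as np

def sigma_curve(x, a1, b1, c1):
    """Smooth standard-deviation curve: sigma~_sd(x) = a1*x^2 + b1*x + c1."""
    return a1 * x**2 + b1 * x + c1

def normalize_syllable_durations(sd, x_k, curve, sd_sigma_g, sd_mu_g):
    """Normalize one utterance's syllable durations against its inverse
    speaking rate x_k:  sd' = (sd - x_k) / sigma~_sd(x_k) * sigma_g + mu_g."""
    a1, b1, c1 = curve
    sd = np.asarray(sd, dtype=float)
    return (sd - x_k) / sigma_curve(x_k, a1, b1, c1) * sd_sigma_g + sd_mu_g

# Toy usage (all numbers are made up):
sd = [0.21, 0.18, 0.30]      # syllable durations of utterance k, in seconds
x_k = float(np.mean(sd))     # assumed: inverse speaking rate = mean syllable duration
print(normalize_syllable_durations(sd, x_k, curve=(0.1, 0.05, 0.02),
                                   sd_sigma_g=0.2, sd_mu_g=0.05))
```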
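The constrained MAP estimation of $(a_1, b_1, c_1)$ can likewise be prototyped with a general-purpose solver. This sketch is an assumption-laden illustration rather than the NormAdp implementation: it assumes an independent Gaussian prior over the three curve parameters, takes the weight $w(\mathbf{x})$ and the observation variance $v^{sd}$ as precomputed inputs, and handles the Lagrange-multiplier term as an explicit equality constraint passed to `scipy.optimize.minimize` (SLSQP), which is mathematically equivalent at the optimum.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def fit_sigma_curve_maplr(x, o_sd, v_sd, prior_mean, prior_var, w):
    """Estimate (a1, b1, c1) by maximizing the weighted log-likelihood of the
    observed utterance-wise standard deviations o_sd plus a Gaussian log-prior,
    subject to the curve passing through the centroid (x_bar, sigma_bar)."""
    x = np.asarray(x, dtype=float)
    o_sd = np.asarray(o_sd, dtype=float)
    x_bar, sigma_bar = x.mean(), o_sd.mean()

    def neg_objective(theta):
        a1, b1, c1 = theta
        pred = a1 * x**2 + b1 * x + c1                       # sigma~_sd(x_k)
        loglik = norm.logpdf(o_sd, loc=pred, scale=np.sqrt(v_sd)).sum()
        logprior = norm.logpdf(theta, loc=prior_mean,
                               scale=np.sqrt(prior_var)).sum()
        return -(w * loglik + logprior)

    # Equality constraint: a1*x_bar^2 + b1*x_bar + c1 == sigma_bar
    constraint = {"type": "eq",
                  "fun": lambda t: t[0] * x_bar**2 + t[1] * x_bar + t[2] - sigma_bar}
    theta0 = np.asarray(prior_mean, dtype=float)
    res = minimize(neg_objective, theta0, constraints=[constraint], method="SLSQP")
    return res.x

# Toy usage with made-up numbers:
x = np.array([0.18, 0.22, 0.25, 0.30])      # utterance inverse speaking rates
o_sd = np.array([0.04, 0.05, 0.055, 0.06])  # observed utterance-wise sd's
print(fit_sigma_curve_maplr(x, o_sd, v_sd=1e-3,
                            prior_mean=[0.1, 0.05, 0.02],
                            prior_var=[1.0, 1.0, 1.0], w=1.0))
```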
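Since the Python and MATLAB runs cannot be matched by a shared random seed, a quick numeric diff of the two output files can speed up the manual comparison mentioned above. The sketch below assumes both files store plain numeric arrays under matching variable names, which may not match the real NormFactors layout; the tolerances and key handling are guesses.

```python
import json
import numpy as np
from scipy.io import loadmat

def compare_outputs(mat_path, json_path, rtol=1e-3, atol=1e-6):
    """For every variable name present in both files, report whether the
    MATLAB (.mat) and Python (.json) values agree within tolerance."""
    mat = loadmat(mat_path)
    with open(json_path) as f:
        py = json.load(f)
    for key in sorted(set(mat) & set(py)):
        a = np.asarray(mat[key], dtype=float).squeeze()
        b = np.asarray(py[key], dtype=float).squeeze()
        ok = a.shape == b.shape and np.allclose(a, b, rtol=rtol, atol=atol)
        print(f"{key}: {'OK' if ok else 'MISMATCH'}")

# compare_outputs('NormFactors.f40.mat', 'NormFactors.f40.json')
```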