Baum-Welch Algorithm

--- tags: HMM, python, Bioinformatics --- # Baum-Welch Algorithm >本篇只會介紹程式碼，原理與更多解釋請參考[這裡](https://hackmd.io/@XibGbVrGSw6szTtbuQpNBw/HJdmQtqmj) ```python= def BaumWelch(O, a, b, pi, n_iter=100): N = a.shape[0] T = len(O) # total time for i in range(n_iter): alpha = Forward(O, a, b, pi, True) beta = Backward(O, a, b) xi = np.zeros((N, N, T - 1)) # init xi # compute xi # scaled alpha and beta wont effect xi # xi: probability of being in state i at time t and state j at time t + 1 for t in range(T - 1): denominator = np.dot( np.dot(alpha[t, :].T, a) * b[:, O[t + 1]].T, beta[t + 1, :]) for i in range(N): numerator = alpha[t, i] * a[i, :] * \ b[:, O[t + 1]].T * beta[t + 1, :].T xi[i, :, t] = numerator / denominator # use xi to compute gamma # scaled alpha and beta wont effect xi # gamma : probability of being in state i at time t (fix i and t then sum all j) gamma = np.sum(xi, axis=1) # use xi and gamma to update a # sum xi and gamma over time t # new a = sum(xi) / sum(gamma) # because sum(xi, 2) is two demension and sum(gamma, axis=1) is one demension, gamma need to be expand to two demension # fix row, reshape column a = np.sum(xi, 2) / np.sum(gamma, axis=1).reshape((-1, 1)) # Add additional T'th element in gamma (because xi is T-1) # hsack: stack arrays in sequence horizontally (column wise) # use the origin gamma formula to compute the last gamma denominator = alpha[T - 1] @ beta[T - 1] gamma = np.hstack((gamma, np.expand_dims( (alpha[T - 1] * beta[T - 1]) / denominator, axis=1))) # use gamma to update b # K is the number of observation types (four in this case) K = b.shape[1] denominator = np.sum(gamma, axis=1) for l in range(K): b[:, l] = np.sum(gamma[:, O == l], axis=1) b = np.divide(b, denominator.reshape((-1, 1))) return a, b ``` ## 參數宣告與初始化 ```python= N = a.shape[0] T = len(O) # total time ``` N: 狀態數量，T: 序列長度(時間長度) ## 取得$\alpha$和$\beta$ ```python= alpha = Forward(O, a, b, pi, True) beta = Backward(O, a, b) ``` ## 計算$\xi$ ```python= for t in range(T - 1): denominator = np.dot( np.dot(alpha[t, :].T, a) * b[:, O[t + 1]].T, beta[t + 1, :]) for i in range(N): numerator = alpha[t, i] * a[i, :] * \ b[:, O[t + 1]].T * beta[t + 1, :].T xi[i, :, t] = numerator / denominator ``` * 分母: 因為矩陣型狀的關係，alpha在和a做內積前要先轉方向，之後乘的b也要轉方向(別忘了我們的a跟b都是直的，跟原本解說中畫的表格不一樣。)，最後和beta內積。 * 分子: 分子也一樣要做轉置，而且要把分子分母調整成一樣的方向。 * **特別注意$\xi$只會計算到T-1，因為T沒有下一個狀態可以計算了** ## 計算$\gamma$ ```python= gamma = np.sum(xi, axis=1) ``` 這裡是利用$\xi$和$\gamma$的特殊關係，把$\xi$加起來就是了。不過要注意因為$\xi$只能算到T - 1，所以最後一個$\gamma$要另外補上。 ```python= denominator = alpha[T - 1] @ beta[T - 1] gamma = np.hstack((gamma, np.expand_dims( (alpha[T - 1] * beta[T - 1]) / denominator, axis=1))) ``` 這裡就用回$\gamma$原本的公式，因為是幫陣列多一個值，要用hstack插入。 ## 更新 a 和 b ```python= a = np.sum(xi, 2) / np.sum(gamma, axis=1).reshape((-1, 1)) ``` 套用公式，但因為$\xi$是三維矩陣，$\gamma$要用reshape來把自己的維度調整成和$\xi$一樣才可以計算。 ```python= K = b.shape[1] denominator = np.sum(gamma, axis=1) for l in range(K): b[:, l] = np.sum(gamma[:, O == l], axis=1) b = np.divide(b, denominator.reshape((-1, 1))) ``` K是序列種類，一樣套用公式計算，利用迴圈找出序列中當前所要計算的第K種結果的數量，最後除上$xi$(一樣因為維度關係要reshape)。