# Long Short-Term Memory (LSTM)

:::warning
:notebook_with_decorative_cover: **Summary**

[toc]
:::

$$
\{\boldsymbol{x}: x_t \ \forall t=1,\cdots,T \}
$$

# LSTM Cell Structure

The LSTM architecture consists of the following five components:

- <font color = 'purple'>Cell state</font>: the LSTM's **internal cell state** ($c$), which stores earlier memory.
- <font color = 'brown'>Hidden state</font>: the LSTM's **external hidden state** ($h$), which is used to compute the prediction.
- <font color = 'red'>Input gate</font>: decides how much of the current input ($x_t$) is sent into the <font color = 'purple'>current cell state ($c_t$)</font>.
    - The current input ($x_t$) is first transformed into a <font color = 'green'>candidate value ($\tilde{c_t}$)</font>.
- <font color = 'blue'>Forget gate</font>: decides how much of the <font color = 'purple'>previous cell state ($c_{t-1}$)</font> is sent into the <font color = 'purple'>current cell state ($c_t$)</font>.
- <font color = 'darkorange'>Output gate</font>: decides how much of the <font color = 'purple'>current cell state ($c_t$)</font> is output to the <font color = 'brown'>current hidden state ($h_t$)</font>.

The corresponding equations are:

$$
\color{red}{i_t} = \sigma(\color{red}{W_{ix}}x_t + \color{red}{W_{ih}}h_{t-1} + \color{red}{b_i})
$$

$$
\color{blue}{f_t} = \sigma(\color{blue}{W_{fx}}x_t + \color{blue}{W_{fh}}h_{t-1} + \color{blue}{b_f})
$$

$$
\color{darkorange}{o_t} = \sigma(\color{darkorange}{W_{ox}}x_t + \color{darkorange}{W_{oh}}h_{t-1} + \color{darkorange}{b_o})
$$

$$
\color{green}{\tilde{c_t}} = \tanh(\color{green}{W_{cx}}x_t + \color{green}{W_{ch}}h_{t-1} + \color{green}{b_c})
$$

$$
\sigma(x) = \dfrac{1}{1+e^{-x}}
$$

$$
\color{purple}{c_t} = \color{blue}{f_t} \odot \color{purple}{c_{t-1}} + \color{red}{i_t} \odot \color{green}{\tilde{c_t}}
$$

$$
\color{brown}{h_t} = \color{darkorange}{o_t} \odot \tanh(\color{purple}{c_t})
$$

Here $\odot$ denotes element-wise multiplication, and $\sigma$ and $\tanh$ are applied element-wise.

Graphical representation:

# Tips for Improving Performance

- Greedy sampling: look for the top $k$ candidates with the highest probability
- Beam search: search $m$ timesteps ahead
- Bidirectional LSTM
- Peephole connection: peeks at the previous few

###### tags: `DL`
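
The cell equations above can be traced in code. Below is a minimal sketch of one LSTM time step in plain Python, assuming the weights are stored as nested lists in a dictionary keyed by names such as `W_ix`, `W_ih`, and `b_i`. The helper names (`affine`, `lstm_step`, `init_params`, `run_sequence`) and the small uniform initialization are illustrative assumptions, not part of the original notes.

```python
import math
import random


def sigmoid(z):
    """Logistic function 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W_x, x, W_h, h, b):
    """Compute W_x x + W_h h + b, with matrices stored as lists of rows."""
    return [
        sum(w * v for w, v in zip(row_x, x))
        + sum(w * v for w, v in zip(row_h, h))
        + bias
        for row_x, row_h, bias in zip(W_x, W_h, b)
    ]


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step; returns (h_t, c_t)."""
    # Gates: each entry lies in (0, 1) and acts as a soft switch.
    i_t = [sigmoid(z) for z in affine(p["W_ix"], x_t, p["W_ih"], h_prev, p["b_i"])]
    f_t = [sigmoid(z) for z in affine(p["W_fx"], x_t, p["W_fh"], h_prev, p["b_f"])]
    o_t = [sigmoid(z) for z in affine(p["W_ox"], x_t, p["W_oh"], h_prev, p["b_o"])]
    # Candidate value computed from the current input and previous hidden state.
    c_tilde = [math.tanh(z) for z in affine(p["W_cx"], x_t, p["W_ch"], h_prev, p["b_c"])]
    # Cell state: keep part of the old memory, add part of the candidate.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]
    # Hidden state: the output gate decides how much of tanh(c_t) is exposed.
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


def init_params(input_size, hidden_size, seed=0):
    """Hypothetical initializer: small uniform weights, zero biases."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    p = {}
    for g in "ifoc":  # input gate, forget gate, output gate, candidate
        p[f"W_{g}x"] = mat(hidden_size, input_size)
        p[f"W_{g}h"] = mat(hidden_size, hidden_size)
        p[f"b_{g}"] = [0.0] * hidden_size
    return p


def run_sequence(xs, p, hidden_size):
    """Run the cell over x_1, ..., x_T starting from zero states."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    hs = []
    for x_t in xs:
        h, c = lstm_step(x_t, h, c, p)
        hs.append(h)
    return hs, c


if __name__ == "__main__":
    params = init_params(input_size=3, hidden_size=2)
    hs, c_T = run_sequence([[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]], params, hidden_size=2)
    print(hs[-1], c_T)
```

Because the output gate lies in $(0, 1)$ and $\tanh$ lies in $(-1, 1)$, every entry of $h_t$ stays strictly inside $(-1, 1)$, whereas $c_t$ is not bounded in this way. A practical implementation would typically use a tensor library and fuse the four affine maps into one matrix multiplication; the loop-based version here is only meant to mirror the equations one by one.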