---
type: slide
---

# NLP

---

- RNN
- seq2seq
- transformer

---

## RNN (Recurrent Neural Network)

----

- "I am a student, so I go to school"
- "I am a worker, so I go to the office"

----

- student -> school
- worker -> office

----

$O_t = g(V S_t)$

$S_t = f(U x_t + W S_{t-1})$

----

V, U, and W are weights shared across all time steps

$S_t$ changes with the results of the preceding steps

----

![20001976y5kxBTjmM7](https://hackmd.io/_uploads/SkT970bEp.jpg)

----

previous hidden state + current input = new hidden state

----

Sequence applications

![20150622iFlEGNLQXp](https://hackmd.io/_uploads/Syu8bCWEa.jpg)

---

## Some examples

---

## one to one

----

Image classification and the like (basically the same as having no recurrence at all)

---

## one to many

----

Image captioning (image to text)

---

## many to one

----

Text classification

Text-to-image

---

## many to many

----

Translation, QA, summarization...

----

## a.k.a. sequence to sequence

----

## a.k.a. seq2seq

---

# Limitations of RNNs

----

- Vanishing gradient problem
- input weight conflict
- output weight conflict

---

## Vanishing gradient problem

----

The further back in the sequence, the smaller the gradient

→ an RNN's memory only lasts three seconds

---

## input weight conflict

----

"Not important right now" does not mean "not important"

---

## LSTM (Long Short-Term Memory)

----

- cell state
- gates
  - forget gate
  - input gate
  - output gate

----

![20150622oOc4kL8PdY](https://hackmd.io/_uploads/SJwiG1zNT.png)

---

## Encoder-decoder

![20150622EXk3Ii70ge](https://hackmd.io/_uploads/BJH-ZJGVp.jpg)
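---

## Appendix: RNN forward pass

The update equations $S_t = f(U x_t + W S_{t-1})$ and $O_t = g(V S_t)$ can be sketched directly in NumPy. This is a minimal illustration, not a trainable model: the layer sizes, the random initialization, and the choice of $f = \tanh$ and $g = \mathrm{softmax}$ are all assumptions for the demo.

```python
import numpy as np

# Minimal vanilla RNN forward pass for the slide's equations:
#   S_t = f(U x_t + W S_{t-1})   hidden-state update (f = tanh here)
#   O_t = g(V S_t)               per-step output (g = softmax here)
# U, W, V are the SAME matrices at every time step (shared weights).
rng = np.random.default_rng(0)
n_in, n_hid, n_out = 4, 8, 3          # illustrative sizes
U = rng.normal(0, 0.1, (n_hid, n_in))
W = rng.normal(0, 0.1, (n_hid, n_hid))
V = rng.normal(0, 0.1, (n_out, n_hid))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def rnn_forward(xs):
    """Run the RNN over a sequence, returning one output per step."""
    s = np.zeros(n_hid)               # S_0: initial hidden state
    outputs = []
    for x in xs:                      # U, W, V reused at every step
        s = np.tanh(U @ x + W @ s)    # new state depends on the old one
        outputs.append(softmax(V @ s))
    return outputs

seq = [rng.normal(size=n_in) for _ in range(5)]
outs = rnn_forward(seq)               # 5 outputs, one per input
```

Note how `s` is threaded through the loop: this is the "previous hidden state + current input = new hidden state" slide in code.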
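---

## Appendix: LSTM cell step

The three gates listed on the LSTM slide can be sketched as one update step. This is a simplified sketch: biases are omitted, and the sizes and the single concatenated input `[x_t; h_{t-1}]` per gate are assumptions for the demo.

```python
import numpy as np

rng = np.random.default_rng(2)
n_in, n_hid = 4, 8               # illustrative sizes

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One weight matrix per gate, acting on [x_t; h_{t-1}] (toy init)
Wf, Wi, Wo, Wc = (rng.normal(0, 0.1, (n_hid, n_in + n_hid))
                  for _ in range(4))

def lstm_step(x, h, c):
    """One LSTM step: gates decide what to erase, write, and expose."""
    z = np.concatenate([x, h])
    f = sigmoid(Wf @ z)              # forget gate: erase from cell state
    i = sigmoid(Wi @ z)              # input gate: write new information
    o = sigmoid(Wo @ z)              # output gate: expose as hidden state
    c = f * c + i * np.tanh(Wc @ z)  # cell state: long-term memory
    h = o * np.tanh(c)               # hidden state: short-term output
    return h, c

h = c = np.zeros(n_hid)
for x in rng.normal(size=(5, n_in)):
    h, c = lstm_step(x, h, c)
```

The key design point is the cell-state line: `c` is updated additively (gated copy plus gated write), which is what lets gradients flow much further back than in a vanilla RNN.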