# <center><i class="fa fa-edit"></i> Understanding RNN, LSTM </center>
###### tags: `Internship`
:::info
**Goal:**
To gain a basic understanding of the RNN and LSTM techniques, focusing on vocabulary and a systems overview.
- [x] Overview
- [x] Easy Implementation
**Resources:**
[Towards Data Science Page](https://towardsdatascience.com/understanding-lstm-and-its-quick-implementation-in-keras-for-sentiment-analysis-af410fd85b47)
[Neural Networks, RNN](https://hackmd.io/@j-chen/H1_3Hf_LP)
[Machine Learning](https://hackmd.io/@Derni/HJQkjlnIP)
:::
### Overview
- RNN: Recurrent neural network
    - Conventional NNs don't learn from previous events --> they can't pass information from one step to the next
    - RNN: learns from the **immediate** previous step (see the recurrence below)
        - No long-term dependency: it can't effectively learn from steps other than the immediate previous one
        - Not practical for long sequences
    - More loops = repeated updates to the same weights = error gradients accumulate (explode or vanish) = unstable network
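As a point of reference, the standard (vanilla) RNN recurrence shows why only the immediate previous step is visible: the new hidden state mixes the current input with $h_{t-1}$ through a single layer (weight names $W_{xh}$, $W_{hh}$ are conventional, not from the article):

$$
h_t = \tanh(W_{xh}\, x_t + W_{hh}\, h_{t-1} + b_h)
$$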

- LSTM: Long short-term memory
    - Applications:
        - Speech recognition
        - Language modeling
        - Sentiment analysis
        - Text prediction
    - Variant of RNN
        - Uses gates to control the memorizing process
    - Notation in the LSTM cell diagram:
        - *X*: scaling (element-wise multiplication) of information
        - *+*: adding information
        - *sigma*: sigmoid layer
            - Sigmoid output: between 0 and 1
        - *tanh*: tanh layer
            - Overcomes the vanishing gradient problem
            - Its second derivative sustains over a long range before going to zero
        - *h(t-1)*: output of the last LSTM unit
        - *c(t-1)*: memory from the last LSTM unit
        - *X(t)*: current input
        - *c(t)*: new updated memory
        - *h(t)*: current output

- Three main components (summarized in the equations and sketch after this list):
    - Forget unnecessary information:
        - A sigmoid layer takes inputs X(t) and h(t-1)
        - Removes old output by outputting 0
        - Forget gate f(t)
        - Outputs f(t) * c(t-1)
    - Store information:
        - Takes the new input X(t)
        - Stores it into the cell state
        - Steps:
            - A sigmoid layer decides what to update or ignore
            - A tanh layer creates a vector of all possible values from the new input
            - Multiply the sigmoid and tanh layers to get the cell-state update
            - Add the new memory to the old memory c(t-1) to give c(t)
    - Decide output:
        - Decided by a sigmoid layer
        - Put the cell state through tanh to generate all possible values and multiply it by the output of the sigmoid gate
        - If the sigmoid outputs 0, the multiplication yields 0 (nothing is output)
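Putting the three components together, the standard LSTM cell equations restate the walkthrough above, with $\sigma$ the sigmoid layer, $*$ element-wise multiplication, and $[h_{t-1}, x_t]$ the concatenated previous output and current input:

$$
\begin{aligned}
f_t &= \sigma(W_f \cdot [h_{t-1}, x_t] + b_f) && \text{forget gate} \\
i_t &= \sigma(W_i \cdot [h_{t-1}, x_t] + b_i) && \text{input gate} \\
\tilde{c}_t &= \tanh(W_c \cdot [h_{t-1}, x_t] + b_c) && \text{candidate memory} \\
c_t &= f_t * c_{t-1} + i_t * \tilde{c}_t && \text{new updated memory} \\
o_t &= \sigma(W_o \cdot [h_{t-1}, x_t] + b_o) && \text{output gate} \\
h_t &= o_t * \tanh(c_t) && \text{current output}
\end{aligned}
$$

And a minimal NumPy sketch of one LSTM time step, assuming the four gates share a single stacked weight matrix `W` and bias `b` (the names and toy sizes are illustrative, not from the article):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step; W stacks the four gate weight matrices."""
    z = W @ np.concatenate([h_prev, x_t]) + b
    n = h_prev.shape[0]
    f = sigmoid(z[0*n:1*n])         # forget gate: 0 = erase old memory, 1 = keep it
    i = sigmoid(z[1*n:2*n])         # input gate: what to update vs. ignore
    c_tilde = np.tanh(z[2*n:3*n])   # vector of candidate values from the new input
    o = sigmoid(z[3*n:4*n])         # output gate
    c_t = f * c_prev + i * c_tilde  # new memory added to the (gated) old memory
    h_t = o * np.tanh(c_t)          # current output
    return h_t, c_t

# Toy sizes: hidden state of 4, input of 3
rng = np.random.default_rng(0)
n, m = 4, 3
W, b = rng.standard_normal((4 * n, n + m)), np.zeros(4 * n)
h_t, c_t = lstm_step(rng.standard_normal(m), np.zeros(n), np.zeros(n), W, b)
```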
### Simple implementation
- Tokenizer to vectorize the text and convert it to sequences of integers
- pad_sequences to convert the sequences into a 2D numpy array (see the preprocessing sketch below)
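A minimal preprocessing sketch with Keras; the tiny corpus `texts` and the 2000-word vocabulary cap are placeholders, not values from the article:

```python
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Hypothetical toy corpus standing in for the real dataset
texts = ["the movie was great", "the movie was terrible"]

tokenizer = Tokenizer(num_words=2000)            # keep the 2000 most frequent words
tokenizer.fit_on_texts(texts)                    # build the word -> integer index
sequences = tokenizer.texts_to_sequences(texts)  # each text becomes a list of ints
X = pad_sequences(sequences)                     # 2D numpy array, padded to equal length
```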
- LSTM network (model sketch after this list):
    - Hyperparameters:
        - embed_dim: the embedding layer encodes the input sequence into a sequence of dense vectors of dimension embed_dim
        - lstm_out: the LSTM transforms the vector sequence into a single vector of size lstm_out, containing information about the entire sequence
        - dropout: fraction of units randomly dropped during training to reduce overfitting
        - batch_size: number of samples per gradient update
        - softmax: activation function of the final layer, turning scores into class probabilities
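A minimal Keras model sketch wiring these pieces together; the concrete values (embed_dim=128, lstm_out=196, the dropout rates, and the two output classes) are illustrative assumptions, not prescribed by this note:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, SpatialDropout1D, LSTM, Dense

max_features = 2000  # must match the Tokenizer's num_words
embed_dim = 128      # size of each dense word vector (assumed value)
lstm_out = 196       # size of the LSTM's summary vector (assumed value)

model = Sequential()
model.add(Embedding(max_features, embed_dim))                  # ints -> dense vectors
model.add(SpatialDropout1D(0.4))                               # dropout on embeddings
model.add(LSTM(lstm_out, dropout=0.2, recurrent_dropout=0.2))  # sequence -> one vector
model.add(Dense(2, activation='softmax'))                      # two sentiment classes
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])
model.summary()
```

Training would then run with something like `model.fit(X, Y, batch_size=32, epochs=...)`, which is where the batch_size hyperparameter comes in.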