LSTM and GRU
Brief Outline
Problem
The state of an RNN records information from all previous time steps.
At each new time step, the old information gets morphed by the current input.
After many time steps, the information stored at an earlier time step may get morphed so much that it becomes impossible to extract the original information from the state.
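To make this concrete, here is a small NumPy experiment (illustrative only; the recurrence, dimensions, and the contractive scaling of the recurrent weights are assumptions made for this demo). Two state trajectories that differ only in their first input become nearly indistinguishable after many identical steps, so the first input can no longer be recovered from the final state:

```python
import numpy as np

d, n, T = 8, 4, 100
rng = np.random.default_rng(0)

# Vanilla RNN recurrence s_t = tanh(W s_{t-1} + U x_t),
# with W scaled so the map is contractive (spectral norm 0.9)
W = rng.standard_normal((d, d))
W *= 0.9 / np.linalg.norm(W, 2)
U = rng.standard_normal((d, n))

xs = rng.standard_normal((T, n))          # inputs shared by both runs
x1_a, x1_b = rng.standard_normal((2, n))  # two different FIRST inputs

def run(x_first):
    s = np.zeros(d)
    s = np.tanh(W @ s + U @ x_first)      # only step 1 differs
    for t in range(1, T):
        s = np.tanh(W @ s + U @ xs[t])    # all later steps are identical
    return s

gap_start = np.linalg.norm(np.tanh(U @ x1_a) - np.tanh(U @ x1_b))
gap_end = np.linalg.norm(run(x1_a) - run(x1_b))
print(gap_start, gap_end)  # the gap between the two states shrinks drastically
```

Since $|\tanh'| \le 1$, each step shrinks the gap between the two states by at least the spectral norm of $W$, so the trace of the first input decays geometrically.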
Key Idea
To get more flexibility and a better modelling choice, we make the RNN selective about its state:
Selective write: write only selected portions of information into the state
Selective read: selectively read only the required information
Selective forget: discard information that is no longer relevant
We don't want to write the whole of $s_{t-1}$ into the computation of the next state; we just want to pass on selective portions of it.
We introduce a vector $o_{t-1}$, with each entry in $(0, 1)$, which decides what fraction of each state element should be passed on, i.e. Selective Write: $h_{t-1} = o_{t-1} \odot \sigma(s_{t-1})$.
But how does the RNN know what the values of $o_{t-1}$ should be? We let the network learn them: we introduce parameters $W_o$, $U_o$ and $b_o$.
We compute $o_{t-1}$ and $h_{t-1}$ as
$$o_{t-1} = \sigma(W_o h_{t-2} + U_o x_{t-1} + b_o)$$
$$h_{t-1} = o_{t-1} \odot \sigma(s_{t-1})$$
$o_{t-1}$ is known as the output gate.
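As a minimal NumPy sketch of the selective write step (the dimensions and the random parameter initialisation are assumptions for illustration; in practice the gate parameters are learned):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

d, n = 4, 3                      # assumed state and input sizes
rng = np.random.default_rng(0)

# Output-gate parameters W_o, U_o, b_o (random here, learned in practice)
W_o = rng.standard_normal((d, d))
U_o = rng.standard_normal((d, n))
b_o = np.zeros(d)

s_prev = rng.standard_normal(d)   # s_{t-1}: previous cell state
h_prev2 = rng.standard_normal(d)  # h_{t-2}: previous output
x_prev = rng.standard_normal(n)   # x_{t-1}: previous input

# o_{t-1} = sigma(W_o h_{t-2} + U_o x_{t-1} + b_o): fractions in (0, 1)
o_prev = sigmoid(W_o @ h_prev2 + U_o @ x_prev + b_o)

# Selective write: h_{t-1} = o_{t-1} * sigma(s_{t-1}), elementwise
h_prev = o_prev * sigmoid(s_prev)
```

The elementwise product is what makes the write selective: each state coordinate is scaled by its own learned fraction.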
LSTM
Selective Read
Now we have $h_{t-1}$, which contains only selectively written values from $s_{t-1}$.
We may not want to pass this, along with $x_t$, directly into $s_t$, as $h_{t-1}$ may also contain irrelevant information. Hence, we define an intermediate candidate state
$$\tilde{s}_t = \sigma(W h_{t-1} + U x_t + b)$$
Then Selective Read $= i_t \odot \tilde{s}_t$, where the input gate $i_t$ is defined as
$$i_t = \sigma(W_i h_{t-1} + U_i x_t + b_i)$$
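The selective read step can be sketched the same way (again with assumed dimensions and random stand-ins for the learned parameters $W, U, b$ and $W_i, U_i, b_i$):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

d, n = 4, 3
rng = np.random.default_rng(1)
h_prev = rng.standard_normal(d)   # selectively written h_{t-1}
x_t = rng.standard_normal(n)      # current input x_t

# Candidate state: s~_t = sigma(W h_{t-1} + U x_t + b)
W, U, b = rng.standard_normal((d, d)), rng.standard_normal((d, n)), np.zeros(d)
s_tilde = sigmoid(W @ h_prev + U @ x_t + b)

# Input gate: i_t = sigma(W_i h_{t-1} + U_i x_t + b_i)
W_i, U_i, b_i = rng.standard_normal((d, d)), rng.standard_normal((d, n)), np.zeros(d)
i_t = sigmoid(W_i @ h_prev + U_i @ x_t + b_i)

# Selective read: keep only the gated fraction of the candidate
read = i_t * s_tilde
```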
Selective Forget
We now have $s_{t-1}$ and $\tilde{s}_t$, and have to combine them to get $s_t$, i.e. the new state.
One simple way of doing this is to add the two terms. However, we may want to forget some parts of $s_{t-1}$ instead of passing it on directly. We introduce another gate, the forget gate:
$$f_t = \sigma(W_f h_{t-1} + U_f x_t + b_f)$$
Update new state
Finally, after combining all three gates, we get the new state and output as
$$s_t = f_t \odot s_{t-1} + i_t \odot \tilde{s}_t$$
$$h_t = o_t \odot \sigma(s_t)$$
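Putting the three gates together, one full LSTM step can be sketched as follows (a minimal NumPy implementation; the sizes, parameter names, and random initialisation are assumptions for illustration):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, s_prev, p):
    """One full LSTM step: gates, then state update, then output."""
    o_t = sigmoid(p["W_o"] @ h_prev + p["U_o"] @ x_t + p["b_o"])  # output gate
    i_t = sigmoid(p["W_i"] @ h_prev + p["U_i"] @ x_t + p["b_i"])  # input gate
    f_t = sigmoid(p["W_f"] @ h_prev + p["U_f"] @ x_t + p["b_f"])  # forget gate
    s_tilde = sigmoid(p["W"] @ h_prev + p["U"] @ x_t + p["b"])    # candidate
    s_t = f_t * s_prev + i_t * s_tilde   # selective forget + selective read
    h_t = o_t * sigmoid(s_t)             # selective write
    return h_t, s_t

# Hypothetical sizes and random stand-ins for learned parameters
d, n = 4, 3
rng = np.random.default_rng(2)
p = {}
for g in ["o", "i", "f", ""]:
    suf = f"_{g}" if g else ""
    p[f"W{suf}"] = rng.standard_normal((d, d))
    p[f"U{suf}"] = rng.standard_normal((d, n))
    p[f"b{suf}"] = np.zeros(d)

h_t, s_t = lstm_step(rng.standard_normal(n), np.zeros(d), np.zeros(d), p)
```

Note that these notes use $\sigma$ for the candidate and state nonlinearities; many common implementations use $\tanh$ in those two places instead, with $\sigma$ only for the gates.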