# <center><i class="fa fa-edit"></i> Math-Based Tutorial on Neural Networks Cont. </center>
###### tags: `Internship`
:::info
**Goal:**
- [x] Gradient Descent
- [x] Python Implementation of LSTM
**Resources:**
[Towards Data Science Page](https://towardsdatascience.com/understanding-lstm-and-its-quick-implementation-in-keras-for-sentiment-analysis-af410fd85b47)
[Adventures in Machine Learning](https://adventuresinmachinelearning.com/neural-networks-tutorial/#first-attempt-feed-forward)
[Machine Learning](https://hackmd.io/@Derni/HJQkjlnIP)
:::
## Gradient Descent
Update the weight $w$ according to:

$$w_{new} = w_{old} - \alpha \, \nabla \text{error}(w_{old})$$

Definitions:
- x_old / $w_{old}$: the previous value; its initial value does not matter as long as abs(x_new - x_old) > precision, so the loop starts
- $\nabla \text{error}(w_{old})$: gradient of the error, evaluated at $w_{old}$
- $\alpha$ (alpha): step size
    - Must be well-tuned
Example implementation:
```python
x_old = 0
x_new = 6           # the algorithm starts at x = 6
gamma = 0.01        # step size (the alpha above)
precision = 0.00001

def df(x):
    # derivative of f(x) = x**4 - 3*x**3, the function being minimized
    return 4 * x**3 - 9 * x**2

while abs(x_new - x_old) > precision:
    x_old = x_new
    x_new = x_old - gamma * df(x_old)   # w_new = w_old - alpha * grad error(w_old)

print("The local minimum occurs at %f" % x_new)
```
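The same update rule applies when $w$ is a weight vector rather than a scalar. As a minimal sketch (the data, the linear model, the alpha value, and the iteration count below are illustrative assumptions, not from the tutorial), here is gradient descent on the weights of a single linear neuron with mean squared error:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))               # 50 samples, 2 features (assumed data)
true_w = np.array([2.0, -3.0])
y = X @ true_w + 0.1 * rng.normal(size=50) # noisy targets

w = np.zeros(2)                            # w_old
alpha = 0.1                                # step size; must be well-tuned

for _ in range(200):
    error = X @ w - y                      # predictions minus targets
    grad = 2 * X.T @ error / len(y)        # gradient of mean squared error at w
    w = w - alpha * grad                   # w_new = w_old - alpha * grad error(w_old)

print("learned w:", np.round(w, 3))        # should be close to [2, -3]
```

If alpha is too large the loop diverges; if it is too small, convergence is very slow. That is why the step size must be well-tuned.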
## Python Implementation of LSTM
- RNN: models time- or sequence-dependent behavior
    - Feeds the output of the layer at time t back into the input of the same layer at time t+1
    - Plain RNNs suffer from the vanishing/exploding gradient problem; the LSTM architecture is designed to address it
- 1st diagram:
    - The RNN feeds its outputs back into the same hidden layer
    - The hidden layer has sigmoid activations, as in any densely connected neural network
    - The output passes through a delay block so that h^(t-1) can be fed into the hidden layer at step t
    - This delay is what lets the model capture time or sequence dependence (see the sketch after this list)
- 2nd diagram: the RNN and its unrolled equivalent
    - At each time step, a new word and the output of the previous cell F are supplied
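
As a concrete illustration of the feedback/delay described above, here is a minimal numpy sketch (the shapes, names, and random data are illustrative assumptions, not taken from the resources): the previous hidden state h^(t-1) is combined with the new input x^(t) at every step before being overwritten.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

hidden_size, input_size, seq_len = 4, 3, 5
rng = np.random.default_rng(0)
W_x = rng.normal(size=(hidden_size, input_size))   # input -> hidden weights
W_h = rng.normal(size=(hidden_size, hidden_size))  # hidden -> hidden (feedback) weights
b = np.zeros(hidden_size)

xs = rng.normal(size=(seq_len, input_size))        # one "word" vector per time step
h = np.zeros(hidden_size)                          # h^(0)

for t, x_t in enumerate(xs, start=1):
    # the delay block: the previous hidden state h^(t-1) is reused here
    h = sigmoid(W_x @ x_t + W_h @ h + b)
    print(f"h^({t}) =", np.round(h, 3))
```

Since the goal above is a Python implementation of an LSTM (following the Towards Data Science resource on Keras sentiment analysis), here is a minimal sketch of what such a model can look like; vocab_size, max_len, the layer sizes, and the dummy data are assumptions for illustration only.

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

vocab_size = 5000   # assumed vocabulary size
max_len = 100       # assumed padded review length (in word indices)

model = Sequential([
    Embedding(vocab_size, 64),        # word index -> 64-dim embedding vector
    LSTM(32),                         # LSTM layer processes the sequence step by step
    Dense(1, activation="sigmoid"),   # probability of positive sentiment
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Dummy data only to show the expected shapes; real inputs are padded word-index sequences.
x = np.random.randint(0, vocab_size, size=(8, max_len))
y = np.random.randint(0, 2, size=(8, 1))
model.fit(x, y, epochs=1, verbose=0)
print(model.predict(x[:2]).shape)     # (2, 1): one sentiment score per review
```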

