# <center><i class="fa fa-edit"></i> Math-Based Tutorial on Neural Networks Cont. </center>
###### tags: `Internship`
:::info
**Goal:**
- [x] Gradient Descent
- [x] Python Implementation of LSTM
**Resources:**
[Towards Data Science Page](https://towardsdatascience.com/understanding-lstm-and-its-quick-implementation-in-keras-for-sentiment-analysis-af410fd85b47)
[Adventures in Machine Learning](https://adventuresinmachinelearning.com/neural-networks-tutorial/#first-attempt-feed-forward)
[Machine Learning](https://hackmd.io/@Derni/HJQkjlnIP)
:::
## Gradient Descent
Update the weight $w$ according to:

$$w_{new} = w_{old} - \alpha \, \nabla \text{error}(w_{old})$$

Definitions:
- x_old / $w_{old}$: the previous value; its initial value does not matter as long as abs(x_new - x_old) > precision, so the loop starts
- $\nabla \text{error}(w_{old})$: gradient of the error, evaluated at $w_{old}$
- $\alpha$ (alpha): step size
    - Must be well-tuned
Example implementation:
```python
x_old = 0
x_new = 6           # the algorithm starts at x = 6
gamma = 0.01        # step size (the alpha above)
precision = 0.00001

def df(x):
    # derivative of f(x) = x**4 - 3*x**3, the function being minimized
    return 4 * x**3 - 9 * x**2

while abs(x_new - x_old) > precision:
    x_old = x_new
    x_new = x_old - gamma * df(x_old)   # w_new = w_old - alpha * grad error(w_old)

print("The local minimum occurs at %f" % x_new)
```
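The same update rule applies when $w$ is a weight vector rather than a scalar. As a minimal sketch (the data, the linear model, the alpha value, and the iteration count below are illustrative assumptions, not from the tutorial), here is gradient descent on the weights of a single linear neuron with mean squared error:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))               # 50 samples, 2 features (assumed data)
true_w = np.array([2.0, -3.0])
y = X @ true_w + 0.1 * rng.normal(size=50) # noisy targets

w = np.zeros(2)                            # w_old
alpha = 0.1                                # step size; must be well-tuned

for _ in range(200):
    error = X @ w - y                      # predictions minus targets
    grad = 2 * X.T @ error / len(y)        # gradient of mean squared error at w
    w = w - alpha * grad                   # w_new = w_old - alpha * grad error(w_old)

print("learned w:", np.round(w, 3))        # should be close to [2, -3]
```

If alpha is too large the loop diverges; if it is too small, convergence is very slow. That is why the step size must be well-tuned.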
## Python Implementation of LSTM
- RNN: models time- or sequence-dependent behavior
    - Feeds the output of the layer at time t back into the input of the same layer at time t+1
    - Plain RNNs suffer from the vanishing/exploding gradient problem; the LSTM architecture is designed to address it
- 1st diagram:
    - The RNN feeds its outputs back into the same hidden layer
    - The hidden layer has sigmoid activations, as in any densely connected neural network
    - The output passes through a delay block so that h^(t-1) can be fed into the hidden layer at step t
    - This delay is what lets the model capture time or sequence dependence (see the sketch after this list)
- 2nd diagram: the RNN and its unrolled equivalent
    - At each time step, a new word and the output of the previous cell F are supplied
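
As a concrete illustration of the feedback/delay described above, here is a minimal numpy sketch (the shapes, names, and random data are illustrative assumptions, not taken from the resources): the previous hidden state h^(t-1) is combined with the new input x^(t) at every step before being overwritten.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

hidden_size, input_size, seq_len = 4, 3, 5
rng = np.random.default_rng(0)
W_x = rng.normal(size=(hidden_size, input_size))   # input -> hidden weights
W_h = rng.normal(size=(hidden_size, hidden_size))  # hidden -> hidden (feedback) weights
b = np.zeros(hidden_size)

xs = rng.normal(size=(seq_len, input_size))        # one "word" vector per time step
h = np.zeros(hidden_size)                          # h^(0)

for t, x_t in enumerate(xs, start=1):
    # the delay block: the previous hidden state h^(t-1) is reused here
    h = sigmoid(W_x @ x_t + W_h @ h + b)
    print(f"h^({t}) =", np.round(h, 3))
```

Since the goal above is a Python implementation of an LSTM (following the Towards Data Science resource on Keras sentiment analysis), here is a minimal sketch of what such a model can look like; vocab_size, max_len, the layer sizes, and the dummy data are assumptions for illustration only.

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

vocab_size = 5000   # assumed vocabulary size
max_len = 100       # assumed padded review length (in word indices)

model = Sequential([
    Embedding(vocab_size, 64),        # word index -> 64-dim embedding vector
    LSTM(32),                         # LSTM layer processes the sequence step by step
    Dense(1, activation="sigmoid"),   # probability of positive sentiment
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Dummy data only to show the expected shapes; real inputs are padded word-index sequences.
x = np.random.randint(0, vocab_size, size=(8, max_len))
y = np.random.randint(0, 2, size=(8, 1))
model.fit(x, y, epochs=1, verbose=0)
print(model.predict(x[:2]).shape)     # (2, 1): one sentiment score per review
```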

