Lecture 1: The spelled-out intro to neural networks and backpropagation: building micrograd
===

This is the most step-by-step spelled-out explanation of backpropagation and training of neural networks. It only assumes basic knowledge of Python and a vague recollection of calculus from high school.

- [lecture video](https://www.youtube.com/watch?v=VMj-3S1tku0&list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ&index=1)
- https://github.com/karpathy/micrograd

## micrograd overview

1. micrograd is a tiny autograd (automatic differentiation) engine that operates on scalar values.
2. It implements backpropagation (reverse-mode autodiff), the core algorithm for training neural networks.

## Derivative

![](https://hackmd.io/_uploads/SyRp_ibpn.png)

In mathematics, the derivative measures how sensitive a function's output is to a small change in its input: f'(x) is the limit of (f(x + h) - f(x)) / h as h → 0. The two snippets below approximate that limit numerically for the lecture's example function f(x) = 3x^2 - 4x + 5 at x = 3, where the true derivative is 6x - 4 = 14.

```python=
def f(x):
    return 3*x**2 - 4*x + 5  # the lecture's example function

h = 0.001
x = 3.0
(f(x + h) - f(x))/h  # slope: 14.00300000000243
```

```python=
h = 0.00000000000000001
x = 3.0
(f(x + h) - f(x))/h  # slope: 0.0, because x + h == x in float64 and the numerator vanishes
```

So h has to be small, but not too small: too large and the secant is a poor approximation of the tangent; too small and floating-point rounding wipes out the difference entirely.

## Backpropagation

### dL/dd

![](https://hackmd.io/_uploads/rJCBtabT3.png)

Here L = d * f, so nudging d by h changes L by h * f:

dL/dd = f

```bash=
(L(d+h) - L(d))/h
= ((d+h)*f - d*f)/h
= (d*f + h*f - d*f)/h
= f
```

Similarly, dL/df = d.

### dd/dc

![](https://hackmd.io/_uploads/HJ-QY6b63.png)

Here d = c + e, so:

dd/dc = 1.0

```bash=
(d(c+h) - d(c))/h
= ((c+h+e) - (c+e))/h
= h/h
= 1.0
```

Similarly, dd/de = 1.0.

### How to calculate dL/dc: the chain rule

c does not affect L directly; it only affects d, which in turn affects L. The chain rule multiplies the local derivatives along that path:

dL/dc = dL/dd * dd/dc = -2 * 1.0 = -2

### dL/da

![](https://hackmd.io/_uploads/BJQZs6ba2.png)

Applying the chain rule one more step back (dL/de = dL/dd * dd/de = -2 * 1.0 = -2, and de/da = b = -3 since e = a * b):

dL/da = dL/de * de/da = -2 * -3 = 6

Similarly, dL/db = dL/de * de/db = -2 * 2 = -4.

![](https://hackmd.io/_uploads/Skmg6a-p2.png)

1:27:55 https://colab.research.google.com/drive/1qwE4uCKW6QOtuWubgDgueyzdWFuOnyYb?usp=sharing
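## Putting it together in code

As a companion to the worked derivatives above, the sketch below rebuilds the lecture's small expression graph in plain Python and estimates the same gradients numerically, exactly like the slope snippets in the Derivative section. The leaf values a = 2.0, b = -3.0, c = 10.0, f = -2.0 are assumed from the lecture's running example; only b, f, and (indirectly) a are pinned down by the gradients written above.

```python=
# Forward pass of the expression graph: e = a*b, d = e + c, L = d*f.
# Leaf values are assumed from the lecture's running example.
def forward(a=2.0, b=-3.0, c=10.0, f=-2.0):
    e = a * b      # -6.0
    d = e + c      #  4.0
    return d * f   # L = -8.0

# Forward-difference estimate: nudge one leaf by h and see how much L moves.
h = 0.0001
L0 = forward()
print((forward(a=2.0 + h) - L0) / h)   # ~  6.0  (dL/da)
print((forward(b=-3.0 + h) - L0) / h)  # ~ -4.0  (dL/db)
print((forward(c=10.0 + h) - L0) / h)  # ~ -2.0  (dL/dc)
print((forward(f=-2.0 + h) - L0) / h)  # ~  4.0  (dL/df = d)
```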
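micrograd automates exactly this chain-rule bookkeeping. The class below is a stripped-down sketch in the spirit of micrograd's Value object (addition and multiplication only, no activation functions), not the library's actual implementation: each operation records its inputs and a small _backward closure that applies the local derivative times the output's gradient, and backward() replays those closures in reverse topological order.

```python=
class Value:
    """A scalar that remembers how it was produced, so gradients can flow back (sketch only)."""
    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0                # dL/d(this node), filled in by backward()
        self._backward = lambda: None  # how to push out.grad into the children
        self._prev = set(_children)

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def _backward():
            # local derivative of a sum is 1.0 for each input (the dd/dc = 1.0 case above)
            self.grad += 1.0 * out.grad
            other.grad += 1.0 * out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def _backward():
            # local derivative of a product is the other factor (the dL/dd = f case above)
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # topological sort so a node's grad is complete before it is pushed further back
        topo, visited = [], set()
        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)
        build(self)
        self.grad = 1.0                # dL/dL = 1
        for node in reversed(topo):
            node._backward()
```

Gradients are accumulated with += rather than assigned, so the result stays correct when one node feeds into more than one downstream operation.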
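Running the same expression graph through this sketch reproduces the hand-derived gradients from the chain-rule sections:

```python=
a, b, c, f = Value(2.0), Value(-3.0), Value(10.0), Value(-2.0)
e = a * b
d = e + c
L = d * f
L.backward()

print(L.data)                          # -8.0
print(a.grad, b.grad, c.grad, f.grad)  # 6.0 -4.0 -2.0 4.0, matching dL/da, dL/db, dL/dc, dL/df
```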