[TOC]

# My First Coursera Guided Project

It's almost the end of August, 2021. Ho Chi Minh City is under a complete lockdown due to the current Covid-19 outbreak. My mental health has not been in great shape these days, so I decided to take some online lessons to ease the discomfort. Here is my quick review of my first guided project on Coursera - [The Pytorch basics you need to start your ML projects](https://www.coursera.org/learn/the-pytorch-basics-you-need-to-start-your-ml-projects/)

![There are 6 short video lessons, shown in split view so you can practice while watching the videos](https://i.imgur.com/Ly8L8pA.png)

# Summary

## 1. What PyTorch is and why we use it?
- PyTorch is an open-source machine learning framework developed by Facebook.
- We use PyTorch because it helps machine learning practitioners and researchers accelerate the implementation process.

## 2. How to prepare your ML coding environment?
- Visit the official PyTorch installation page: https://pytorch.org/
- In this project, we use the Linux OS, the Python language, and CUDA 10.2.
- CUDA stands for Compute Unified Device Architecture <-- this is the first time someone has explained the CUDA acronym to me! And it got asked in the final quiz :D
- Install Anaconda for Linux
```
wget https://repo.anaconda.com/archive/Anaconda3-2021.05-Linux-x86_64.sh
bash Anaconda3-2021.05-Linux-x86_64.sh
```
- Restart the terminal, then check that conda is installed
```
conda -h
```
- Create a new environment for each project to avoid library conflicts
```
conda create -n pytorch_env python=3.8
conda activate pytorch_env
```
- Install PyTorch
```
conda install pytorch torchvision -c pytorch  # for machines with no GPU
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch -c nvidia  # for machines with a GPU
```
- Let's test it. First open a Python interpreter
```
python3
```
```python=
import torch
print(torch.rand(5))
```
The result should look like this: `tensor([0.8253, 0.0288, 0.3023, 0.1370, 0.5824])`

That's it for step 2. We have learned how to set up an environment!

## 3. How to initialize and use PyTorch tensors?
- First, let's create a tensor.
```python=3
tnsr = torch.rand(3,2,2)
tnsr
```
>Output
```
tensor([[[0.7048, 0.8351],
         [0.0934, 0.2679]],

        [[0.9654, 0.5296],
         [0.4489, 0.6469]],

        [[0.9701, 0.4278],
         [0.2610, 0.8301]]])
```
- Let's check which device the tensor lives on and its shape.
```python=5
tnsr.device
tnsr.shape
```
>Output
```
device(type='cpu')
torch.Size([3, 2, 2])
```
- We can index the tensor as in numpy arrays. Note that indexing in Python starts from 0.
```python=7
tnsr[2] # last 2x2 block of the 3x2x2 tensor
```
>Output
```
tensor([[0.9701, 0.4278],
        [0.2610, 0.8301]])
```
```python=8
tnsr[2,1] # block 3, row 2 of the 3x2x2 tensor
```
>Output
```
tensor([0.2610, 0.8301])
```
- Some other handy functions
```python=9
torch.ones_like(tnsr)  # create a tensor of all ones with the same shape as tnsr
torch.zeros_like(tnsr) # create a tensor of all zeros with the same shape as tnsr
torch.randn_like(tnsr) # create a tensor of standard-normal random values with the same shape as tnsr
```
>Output
```
tensor([[[1., 1.],
         [1., 1.]],

        [[1., 1.],
         [1., 1.]],

        [[1., 1.],
         [1., 1.]]])
tensor([[[0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.]]])
tensor([[[-1.2878, -0.7696],
         [ 1.1910, -1.7987]],

        [[ 0.9267, -1.1113],
         [-0.5338,  1.1583]],

        [[ 0.4401, -0.7307],
         [-0.3230,  2.5825]]])
```
That's it for step 3! We have learned how to create a tensor with PyTorch and seen some of its properties.
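The project stops at indexing, but a few more everyday tensor operations are worth knowing before moving on. Here is a small sketch of my own (not part of the course) showing reshaping, element-wise arithmetic, and moving a tensor onto the GPU that step 2 set up, when one is available:
```python
import torch

tnsr = torch.rand(3, 2, 2)

# Reshape the 3x2x2 tensor into a 3x4 matrix (same 12 elements, new layout)
flat = tnsr.reshape(3, 4)

# Element-wise arithmetic works just like NumPy
doubled = tnsr * 2
total = tnsr + doubled

# Move the tensor to the GPU only if CUDA is actually available
device = "cuda" if torch.cuda.is_available() else "cpu"
tnsr = tnsr.to(device)
print(flat.shape, total.shape, tnsr.device)
```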
## 4. How to use the PyTorch neural network module?
Finally, the neural network part has come! In this project, we build a basic neural network with a linear layer and a ReLU activation function (step 4), then add an optimizer (step 5).
```python=
import torch
import torch.nn as nn

linear = nn.Linear(10,2)
input = torch.rand(3,10)
input
output = linear(input)
output
relu = nn.ReLU()
relu_output = relu(output)
relu_output
```
>Output
```
>>> input
tensor([[0.4462, 0.0043, 0.2773, 0.1597, 0.9943, 0.0232, 0.6335, 0.4154, 0.9456,
         0.1234],
        [0.4195, 0.9713, 0.9492, 0.2999, 0.3050, 0.2913, 0.3314, 0.8984, 0.4644,
         0.2570],
        [0.8660, 0.9288, 0.4416, 0.9511, 0.4101, 0.8471, 0.1060, 0.6130, 0.7972,
         0.7111]])
>>> output
tensor([[-0.3816,  0.1372],
        [-0.3751, -0.0716],
        [-0.3905, -0.3343]], grad_fn=<AddmmBackward>)
>>> relu_output
tensor([[0.0000, 0.1372],
        [0.0000, 0.0000],
        [0.0000, 0.0000]], grad_fn=<ReluBackward0>)
```
That's a simple neural network with a linear layer and a ReLU layer.

## 5. How to use PyTorch optimizers?
- What are optimizers?
    - Optimizers are algorithms or methods used to solve optimization problems by minimizing a function.
    - Optimizers are used to change the attributes of a neural network, such as its weights and learning rate, in order to reduce the loss.
    - There are different types of optimizers: gradient descent, stochastic gradient descent, mini-batch gradient descent, momentum-based gradient descent, etc.
- Here we use the Adam optimizer
```python=
import torch
import torch.nn as nn
import torch.optim as optim
```
- Let's use `nn.Sequential` to create a series of steps in a neural network.
```python=4
mlp_layer = nn.Sequential(nn.Linear(5,2), nn.BatchNorm1d(2), nn.ReLU())
mlp_layer
```
>Output
```
Sequential(
  (0): Linear(in_features=5, out_features=2, bias=True)
  (1): BatchNorm1d(2, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (2): ReLU()
)
```
- Let's create a random 5x5 input tensor, pass it through the above `mlp_layer`, and set up an Adam optimizer for its parameters.
```python=6
input = torch.rand(5,5) + 1
input
mlp_layer(input)
adam_opt = optim.Adam(mlp_layer.parameters(), lr=1e-1)
```
>Output
```
>>> input
tensor([[1.5023, 1.5349, 1.6604, 1.9086, 1.3933],
        [1.4985, 1.1789, 1.3796, 1.2407, 1.4150],
        [1.5343, 1.5321, 1.6557, 1.9746, 1.5551],
        [1.8356, 1.1646, 1.5396, 1.1645, 1.0196],
        [1.4857, 1.6966, 1.3791, 1.9170, 1.2555]])
>>> mlp_layer(input)
tensor([[0.0000, 0.6137],
        [0.3924, 0.0000],
        [0.0000, 0.3760],
        [1.4502, 0.6307],
        [0.3675, 0.3648]], grad_fn=<ReluBackward0>)
>>> adam_opt
Adam (
Parameter Group 0
    amsgrad: False
    betas: (0.9, 0.999)
    eps: 1e-08
    lr: 0.1
    weight_decay: 0
)
```
- Step 5 is done! We have created a simple neural net and set up an optimizer for it.

## 6. The basic ML training loop with PyTorch.
- Training time! Let's create a simple training loop. There are 4 basic steps in a training loop:
    1. Set all the gradients to zero.
    2. Calculate the loss.
    3. Calculate the gradients of the loss with respect to the parameters.
    4. Update the parameters being optimized.

>Q: Why do we need to set all the gradients to zero first?
>A: Because PyTorch accumulates gradients across backward passes, so whatever is left over from the previous iteration would be added to the new gradients.

```python=10
train_ex = torch.randn(100,5) + 1
adam_opt.zero_grad()
curr_loss = torch.abs(1 - mlp_layer(train_ex)).mean()
curr_loss.backward()
adam_opt.step()
curr_loss
```
>Output
```
tensor(0.8000, grad_fn=<MeanBackward0>)
```
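To see the accumulation behaviour from the Q&A above in isolation, here is a small sketch of my own (not from the course): calling `backward()` twice without zeroing in between simply adds the two gradients together.
```python
import torch
import torch.nn as nn

layer = nn.Linear(5, 1)
x = torch.randn(8, 5)

# First backward pass
loss = layer(x).mean()
loss.backward()
grad_first = layer.weight.grad.clone()

# Second backward pass without zeroing: gradients are accumulated, not replaced
loss = layer(x).mean()
loss.backward()
print(torch.allclose(layer.weight.grad, 2 * grad_first))  # True

# This is what optimizer.zero_grad() does for every parameter before the next step
layer.weight.grad.zero_()
```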
- Let's loop through it 10 times to see how the loss changes.
```python=10
train_ex = torch.randn(100,5) + 1
num_epoch = 10
for i in range(num_epoch):
    adam_opt.zero_grad()
    curr_loss = torch.abs(1 - mlp_layer(train_ex)).mean()
    curr_loss.backward()
    adam_opt.step()
    print(curr_loss)
```
>Output
```
tensor(0.7688, grad_fn=<MeanBackward0>)
tensor(0.7135, grad_fn=<MeanBackward0>)
tensor(0.6520, grad_fn=<MeanBackward0>)
tensor(0.5929, grad_fn=<MeanBackward0>)
tensor(0.5294, grad_fn=<MeanBackward0>)
tensor(0.4543, grad_fn=<MeanBackward0>)
tensor(0.3612, grad_fn=<MeanBackward0>)
tensor(0.2538, grad_fn=<MeanBackward0>)
tensor(0.1469, grad_fn=<MeanBackward0>)
tensor(0.0443, grad_fn=<MeanBackward0>)
```
- The loss gradually decreases. It seems to be working! That's also the end of my first guided project on Coursera!

# Notes on using Rhyme on Coursera
- The workspace times out after a limited number of hours, so it cannot be used for long sessions.
![](https://i.imgur.com/WPcdVdW.png)
- It was pretty confusing at first; I couldn't type anything in the cloud workspace.
- If you have a local Linux machine, use it to practice along instead of Rhyme.

# Thoughts on writing the first blog post
- This is much more time-consuming than I thought: approximately 1 day in total for taking the course and blogging.
- Writing helps me clarify some points that the lecture does not address (e.g. why the layers are Linear-ReLU).
- The theme of hackmd.io is beautiful!

# Appendix - Full Python code in terminal

## Step 2-3
```python=
>>> import torch
>>> print(torch.rand(5))
tensor([0.8253, 0.0288, 0.3023, 0.1370, 0.5824])
>>> tnsr = torch.rand(3,2,2)
>>> tnsr
tensor([[[0.7048, 0.8351],
         [0.0934, 0.2679]],

        [[0.9654, 0.5296],
         [0.4489, 0.6469]],

        [[0.9701, 0.4278],
         [0.2610, 0.8301]]])
>>> tnsr.device
device(type='cpu')
>>> tnsr.shape
torch.Size([3, 2, 2])
>>> tnsr[2]
tensor([[0.9701, 0.4278],
        [0.2610, 0.8301]])
>>> tnsr[2,1]
tensor([0.2610, 0.8301])
>>> torch.ones_like(tnsr)
tensor([[[1., 1.],
         [1., 1.]],

        [[1., 1.],
         [1., 1.]],

        [[1., 1.],
         [1., 1.]]])
>>> torch.zeros_like(tnsr)
tensor([[[0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.]]])
>>> torch.randn_like(tnsr)
tensor([[[-1.2878, -0.7696],
         [ 1.1910, -1.7987]],

        [[ 0.9267, -1.1113],
         [-0.5338,  1.1583]],

        [[ 0.4401, -0.7307],
         [-0.3230,  2.5825]]])
```

## Step 4
```python=
>>> import torch
>>> import torch.nn as nn
>>> linear = nn.Linear(10,2)
>>> input = torch.rand(3,10)
>>> output = linear(input)
>>> output
tensor([[ 0.0553, -0.1094],
        [ 0.4083, -0.0111],
        [ 0.0809, -0.0664]], grad_fn=<AddmmBackward>)
>>> input
tensor([[0.1444, 0.8795, 0.5772, 0.2250, 0.5413, 0.6340, 0.1473, 0.9413, 0.1569,
         0.7012],
        [0.4363, 0.8177, 0.9814, 0.4345, 0.6005, 0.5576, 0.2776, 0.5850, 0.6348,
         0.2550],
        [0.6458, 0.7376, 0.7566, 0.9595, 0.2711, 0.1724, 0.0750, 0.2433, 0.0572,
         0.8343]])
>>> linear = nn.Linear(10,2)
>>> input = torch.rand(3,10)
>>> input
tensor([[0.4462, 0.0043, 0.2773, 0.1597, 0.9943, 0.0232, 0.6335, 0.4154, 0.9456,
         0.1234],
        [0.4195, 0.9713, 0.9492, 0.2999, 0.3050, 0.2913, 0.3314, 0.8984, 0.4644,
         0.2570],
        [0.8660, 0.9288, 0.4416, 0.9511, 0.4101, 0.8471, 0.1060, 0.6130, 0.7972,
         0.7111]])
>>> output = linear(input)
>>> output
tensor([[-0.3816,  0.1372],
        [-0.3751, -0.0716],
        [-0.3905, -0.3343]], grad_fn=<AddmmBackward>)
>>> relu = nn.ReLU()
>>> relu_output = relu(output)
>>> relu_output
tensor([[0.0000, 0.1372],
        [0.0000, 0.0000],
        [0.0000, 0.0000]], grad_fn=<ReluBackward0>)
```
## Step 5
```python=
>>> import torch
>>> import torch.nn as nn
>>> import torch.optim as optim
>>> mlp_layer = nn.Sequential(nn.Linear(5,2), nn.BatchNorm1d(2), nn.ReLU())
>>> mlp_layer
Sequential(
  (0): Linear(in_features=5, out_features=2, bias=True)
  (1): BatchNorm1d(2, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (2): ReLU()
)
>>> input = torch.rand(5,5) + 1
>>> input
tensor([[1.5023, 1.5349, 1.6604, 1.9086, 1.3933],
        [1.4985, 1.1789, 1.3796, 1.2407, 1.4150],
        [1.5343, 1.5321, 1.6557, 1.9746, 1.5551],
        [1.8356, 1.1646, 1.5396, 1.1645, 1.0196],
        [1.4857, 1.6966, 1.3791, 1.9170, 1.2555]])
>>> mlp_layer(input)
tensor([[0.0000, 0.6137],
        [0.3924, 0.0000],
        [0.0000, 0.3760],
        [1.4502, 0.6307],
        [0.3675, 0.3648]], grad_fn=<ReluBackward0>)
>>> adam_opt = optim.Adam(mlp_layer.parameters(), lr=1e-1)
>>> adam_opt
Adam (
Parameter Group 0
    amsgrad: False
    betas: (0.9, 0.999)
    eps: 1e-08
    lr: 0.1
    weight_decay: 0
)
```

## Step 6
Re-run all the needed imports and definitions, then execute one training step.
```python=
>>> import torch
>>> import torch.nn as nn
>>> import torch.optim as optim
>>> mlp_layer = nn.Sequential(nn.Linear(5,2), nn.BatchNorm1d(2), nn.ReLU())
>>> adam_opt = optim.Adam(mlp_layer.parameters(), lr=1e-1)
>>> train_ex = torch.randn(100,5) + 1
>>> adam_opt.zero_grad()
>>> curr_loss = torch.abs(1 - mlp_layer(train_ex)).mean()
>>> curr_loss.backward()
>>> adam_opt.step()
>>> curr_loss
tensor(0.8000, grad_fn=<MeanBackward0>)
```
Track `curr_loss` over 10 iterations of the loop.
```python=14
>>> train_ex = torch.randn(100,5) + 1
>>> num_epoch = 10
>>> for i in range(num_epoch):
...     adam_opt.zero_grad()
...     curr_loss = torch.abs(1 - mlp_layer(train_ex)).mean()
...     curr_loss.backward()
...     adam_opt.step()
...     print(curr_loss)
...
tensor(0.7688, grad_fn=<MeanBackward0>)
tensor(0.7135, grad_fn=<MeanBackward0>)
tensor(0.6520, grad_fn=<MeanBackward0>)
tensor(0.5929, grad_fn=<MeanBackward0>)
tensor(0.5294, grad_fn=<MeanBackward0>)
tensor(0.4543, grad_fn=<MeanBackward0>)
tensor(0.3612, grad_fn=<MeanBackward0>)
tensor(0.2538, grad_fn=<MeanBackward0>)
tensor(0.1469, grad_fn=<MeanBackward0>)
tensor(0.0443, grad_fn=<MeanBackward0>)
```
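If you would rather run steps 4-6 as a plain script instead of typing into the interpreter, here is the same pipeline stitched together into one file. The consolidation and the per-epoch `print` are my own additions; the model, loss, and optimizer are exactly the ones used in the project.
```python
import torch
import torch.nn as nn
import torch.optim as optim

# The same tiny model as in step 5: Linear -> BatchNorm -> ReLU
mlp_layer = nn.Sequential(nn.Linear(5, 2), nn.BatchNorm1d(2), nn.ReLU())
adam_opt = optim.Adam(mlp_layer.parameters(), lr=1e-1)

# Toy training data, shifted away from zero so the initial loss is non-trivial
train_ex = torch.randn(100, 5) + 1

num_epoch = 10
for epoch in range(num_epoch):
    adam_opt.zero_grad()                                   # 1. set the gradients to zero
    curr_loss = torch.abs(1 - mlp_layer(train_ex)).mean()  # 2. calculate the loss
    curr_loss.backward()                                   # 3. calculate the gradients
    adam_opt.step()                                        # 4. update the parameters
    print(f"epoch {epoch}: loss = {curr_loss.item():.4f}")
```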