[TOC]
# My First Coursera Guided Project
It's almost the end of August 2021. Ho Chi Minh City is under a complete lockdown due to the current Covid-19 outbreak. My mental health is not in great shape these days, so I decided to take some online lessons to ease the discomfort.
Here is my quick review of my first guided project on Coursera: [The Pytorch basics you need to start your ML projects](https://www.coursera.org/learn/the-pytorch-basics-you-need-to-start-your-ml-projects/)

# Summary
## 1. What is PyTorch and why do we use it?
- PyTorch is an open-source machine learning framework developed by Facebook.
- We use PyTorch because it helps machine learning practitioners and researchers accelerate the implementation process.
## 2. How to prepare your ML coding environment?
- Visit the official PyTorch installation page: https://pytorch.org/
- In this project, we use Linux, Python, and CUDA 10.2
- CUDA stands for Compute Unified Device Architecture <-- this is the first time someone explained the CUDA acronym to me! And it got asked in the final quiz :D
- Install Anaconda for Linux
```
wget https://repo.anaconda.com/archive/Anaconda3-2021.05-Linux-x86_64.sh
bash Anaconda3-2021.05-Linux-x86_64.sh
```
- Restart the terminal, then check that conda is installed
```
conda -h
```
- Create a new environment for each project to avoid library conflicts
```
conda create -n pytorch_env python=3.8
conda activate pytorch_env
```
- Install PyTorch
```
conda install pytorch torchvision -c pytorch # for machines without a GPU
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch -c nvidia # for machines with a GPU
```
- Let's test it. First, open a Python interpreter
```
python3
```
```python=
import torch
print(torch.rand(5))
```
The result should look something like this: `tensor([0.8253, 0.0288, 0.3023, 0.1370, 0.5824])` (the values are random).
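If you installed the GPU build, you can also check whether PyTorch sees your CUDA device. This quick check is my own addition, using the standard `torch.cuda` API:
```python
import torch

# True if PyTorch was built with CUDA support and a usable GPU is present
print(torch.cuda.is_available())
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. the model name of GPU 0
```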
That's it for step 2. We have learnt how to set up an environment!
## 3. How to initialize and use PyTorch tensors?
- First let's create a tensor.
```python=3
tnsr = torch.rand(3,2,2)
tnsr
```
>Output
```
tensor([[[0.7048, 0.8351],
         [0.0934, 0.2679]],

        [[0.9654, 0.5296],
         [0.4489, 0.6469]],

        [[0.9701, 0.4278],
         [0.2610, 0.8301]]])
```
- Let's check which device the tensor is on, and its shape.
```python=5
tnsr.device
tnsr.shape
```
>Output
```
device(type='cpu')
torch.Size([3, 2, 2])
```
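The tensor lives on the CPU by default. As a side note of mine (not covered at this point in the course), a CUDA-enabled build lets you move it to the GPU with `.to()`:
```python
if torch.cuda.is_available():
    tnsr_gpu = tnsr.to("cuda")   # copies the data to GPU memory
    print(tnsr_gpu.device)       # device(type='cuda', index=0)
```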
- We can index the tensor just like a NumPy array. Note that indexing in Python starts from 0.
```python=7
tnsr[2] # last 2x2 block of the 3x2x2 tensor
```
>Output
```
tensor([[0.9701, 0.4278],
        [0.2610, 0.8301]])
```
```python=8
tnsr[2,1] # block 3 row 2 of the 3x2x2 tensor
```
>Output
```
tensor([0.2610, 0.8301])
```
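Slicing works as in NumPy too. These two examples are my own illustration (not from the lecture), using the same `tnsr`:
```python
tnsr[:, 0]     # first row of every 2x2 block -> shape torch.Size([3, 2])
tnsr[0, :, 1]  # second column of the first block -> shape torch.Size([2])
```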
- Some other handy functions:
```python=9
torch.ones_like(tnsr) # create a tensor of ones with the same shape as tnsr
torch.zeros_like(tnsr) # create a tensor of zeros with the same shape as tnsr
torch.randn_like(tnsr) # create a tensor of random normal values with the same shape as tnsr
```
>Output
```
tensor([[[1., 1.],
         [1., 1.]],

        [[1., 1.],
         [1., 1.]],

        [[1., 1.],
         [1., 1.]]])
tensor([[[0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.]]])
tensor([[[-1.2878, -0.7696],
         [ 1.1910, -1.7987]],

        [[ 0.9267, -1.1113],
         [-0.5338,  1.1583]],

        [[ 0.4401, -0.7307],
         [-0.3230,  2.5825]]])
```
That's it for step 3! We have learnt how to create a tensor with PyTorch and inspect some of its properties.
## 4. How to use the PyTorch neural network module?
Finally, the neural network part has come! In this project, we build a basic neural network with a linear layer and a ReLU activation function (step 4), then add an optimizer (step 5).
```python=
import torch
import torch.nn as nn
linear = nn.Linear(10,2)
input = torch.rand(3,10)
input
output = linear(input)
output
relu = nn.ReLU()
relu_output = relu(output)
relu_output
```
>Output
```
>>> input
tensor([[0.4462, 0.0043, 0.2773, 0.1597, 0.9943, 0.0232, 0.6335, 0.4154, 0.9456,
         0.1234],
        [0.4195, 0.9713, 0.9492, 0.2999, 0.3050, 0.2913, 0.3314, 0.8984, 0.4644,
         0.2570],
        [0.8660, 0.9288, 0.4416, 0.9511, 0.4101, 0.8471, 0.1060, 0.6130, 0.7972,
         0.7111]])
>>> output
tensor([[-0.3816,  0.1372],
        [-0.3751, -0.0716],
        [-0.3905, -0.3343]], grad_fn=<AddmmBackward>)
>>> relu_output
tensor([[0.0000, 0.1372],
        [0.0000, 0.0000],
        [0.0000, 0.0000]], grad_fn=<ReluBackward0>)
```
That's a simple neural network with a linear layer and a ReLU layer.
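Under the hood, `nn.Linear` is just an affine map: `output = input @ weight.T + bias`. Here is a small sanity check I added myself (not part of the project):
```python
import torch
import torch.nn as nn

linear = nn.Linear(10, 2)
x = torch.rand(3, 10)

# nn.Linear stores its weight with shape (out_features, in_features),
# so the forward pass computes x @ weight.T + bias
manual = x @ linear.weight.T + linear.bias
print(torch.allclose(linear(x), manual))  # True
```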
## 5. How to use PyTorch optimizers?
- What are optimizers?
    - Optimizers are algorithms used to solve optimization problems by minimizing a function, in our case the loss.
    - They are used to update the attributes of a neural network, such as the weights and the learning rate, to reduce the loss.
    - There are different types of optimizers: gradient descent, stochastic gradient descent, mini-batch gradient descent, momentum-based gradient descent, etc. A few of them map directly onto `torch.optim` constructors, as sketched below.
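For reference, this is my own sketch (not from the course) of how those variants look in `torch.optim`; `model` is just a placeholder module:
```python
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(5, 2)  # placeholder model for illustration

sgd = optim.SGD(model.parameters(), lr=1e-2)                      # (mini-batch) stochastic gradient descent
momentum = optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)   # momentum-based variant
adam = optim.Adam(model.parameters(), lr=1e-1)                    # Adam, used in this project
```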
- Here we use the Adam optimizer.
```python=
import torch
import torch.nn as nn
import torch.optim as optim
```
- Let's use `nn.Sequential` to chain a series of layers into a network.
```python=4
mlp_layer = nn.Sequential(nn.Linear(5,2), nn.BatchNorm1d(2), nn.ReLU())
mlp_layer
```
>Output
```
Sequential(
  (0): Linear(in_features=5, out_features=2, bias=True)
  (1): BatchNorm1d(2, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (2): ReLU()
)
```
- Let's create a random 5x5 input tensor, pass it through the `mlp_layer` above, and create an Adam optimizer for its parameters.
```python=6
input = torch.rand(5,5) + 1
input
mlp_layer(input)
adam_opt = optim.Adam(mlp_layer.parameters(), lr=1e-1)
adam_opt
```
>Output
```
>>> input
tensor([[1.5023, 1.5349, 1.6604, 1.9086, 1.3933],
        [1.4985, 1.1789, 1.3796, 1.2407, 1.4150],
        [1.5343, 1.5321, 1.6557, 1.9746, 1.5551],
        [1.8356, 1.1646, 1.5396, 1.1645, 1.0196],
        [1.4857, 1.6966, 1.3791, 1.9170, 1.2555]])
>>> mlp_layer(input)
tensor([[0.0000, 0.6137],
        [0.3924, 0.0000],
        [0.0000, 0.3760],
        [1.4502, 0.6307],
        [0.3675, 0.3648]], grad_fn=<ReluBackward0>)
>>> adam_opt
Adam (
Parameter Group 0
    amsgrad: False
    betas: (0.9, 0.999)
    eps: 1e-08
    lr: 0.1
    weight_decay: 0
)
```
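To see exactly which tensors the optimizer will update, you can list the module's parameters. This is a quick check I added, using the standard `named_parameters()` API:
```python
for name, p in mlp_layer.named_parameters():
    print(name, tuple(p.shape))
# 0.weight (2, 5)  <- Linear weight
# 0.bias   (2,)    <- Linear bias
# 1.weight (2,)    <- BatchNorm1d scale (gamma)
# 1.bias   (2,)    <- BatchNorm1d shift (beta)
```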
- Step 5 is done! We have created a simple neural net and an optimizer for it.
## 6. The basic ML training loop with PyTorch.
- Training time! Let's write a simple training loop. There are four basic steps in a training loop:
    1. Set all the gradients to zero.
    2. Calculate the loss.
    3. Calculate the gradients with respect to the loss.
    4. Update the parameters being optimized.
>Q: Why do we need to set all the gradients to zero first?
>A: Because PyTorch accumulates gradients across backward passes; without zeroing, each new `backward()` call adds to the gradients from previous iterations.
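Here is a tiny demonstration of that accumulation behaviour (my own example, not from the course):
```python
import torch

w = torch.ones(2, requires_grad=True)

loss = (2 * w).sum()
loss.backward()
print(w.grad)  # tensor([2., 2.])

# Without zeroing, the next backward pass *adds* to the stored gradient
loss = (2 * w).sum()
loss.backward()
print(w.grad)  # tensor([4., 4.]) -- accumulated, not replaced

w.grad.zero_()  # what optimizer.zero_grad() does for every parameter
```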
```python=11
train_ex = torch.randn(100,5) + 1
adam_opt.zero_grad()
curr_loss = torch.abs(1 - mlp_layer(train_ex)).mean()
curr_loss.backward()
adam_opt.step()
curr_loss
```
>Output
```
tensor(0.8000, grad_fn=<MeanBackward0>)
```
- Let's loop through it 10 times to see how the loss changes.
```python=11
train_ex = torch.randn(100,5) + 1
num_epoch = 10
for i in range(num_epoch):
    adam_opt.zero_grad()
    curr_loss = torch.abs(1 - mlp_layer(train_ex)).mean()
    curr_loss.backward()
    adam_opt.step()
    curr_loss  # echoes the loss each epoch when run in the interpreter
```
>Output
```
tensor(0.7688, grad_fn=<MeanBackward0>)
tensor(0.7135, grad_fn=<MeanBackward0>)
tensor(0.6520, grad_fn=<MeanBackward0>)
tensor(0.5929, grad_fn=<MeanBackward0>)
tensor(0.5294, grad_fn=<MeanBackward0>)
tensor(0.4543, grad_fn=<MeanBackward0>)
tensor(0.3612, grad_fn=<MeanBackward0>)
tensor(0.2538, grad_fn=<MeanBackward0>)
tensor(0.1469, grad_fn=<MeanBackward0>)
tensor(0.0443, grad_fn=<MeanBackward0>)
```
- The loss gradually decreases. It seems to be working!
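Note that the bare `curr_loss` on the last line only echoes values in the interactive interpreter. If you run this as a standalone script, print the loss explicitly; a minimal sketch reusing the same names:
```python
for i in range(num_epoch):
    adam_opt.zero_grad()
    curr_loss = torch.abs(1 - mlp_layer(train_ex)).mean()
    curr_loss.backward()
    adam_opt.step()
    print(f"epoch {i}: loss = {curr_loss.item():.4f}")
```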
That's also the end of my first guided project on Coursera!
# Notes on using Rhyme on Coursera
- The workspace times out after a limited number of hours, so it cannot be used for long stretches.
- It was pretty confusing at first; I couldn't type anything in the cloud workspace.
- If you have a local Linux machine, use it to practice along instead of Rhyme.
# Thoughts on writing the first blog post
- This took much more time than I expected: approximately one day in total for taking the course and blogging.
- Writing helps me clarify points that the lecture does not address (e.g. why the layers are Linear-ReLU).
- The theme of hackmd.io is beautiful!
# Appendix - Full Python code in terminal
## Step 2-3
```python=
>>> import torch
>>> print(torch.rand(5))
tensor([0.8253, 0.0288, 0.3023, 0.1370, 0.5824])
>>> tnsr = torch.rand(3,2,2)
>>> tnsr
tensor([[[0.7048, 0.8351],
         [0.0934, 0.2679]],

        [[0.9654, 0.5296],
         [0.4489, 0.6469]],

        [[0.9701, 0.4278],
         [0.2610, 0.8301]]])
>>> tnsr.device
device(type='cpu')
>>> tnsr.shape
torch.Size([3, 2, 2])
>>> tnsr[2]
tensor([[0.9701, 0.4278],
        [0.2610, 0.8301]])
>>> tnsr[2,1]
tensor([0.2610, 0.8301])
>>> torch.ones_like(tnsr)
tensor([[[1., 1.],
         [1., 1.]],

        [[1., 1.],
         [1., 1.]],

        [[1., 1.],
         [1., 1.]]])
>>> torch.zeros_like(tnsr)
tensor([[[0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.]]])
>>> torch.randn_like(tnsr)
tensor([[[-1.2878, -0.7696],
         [ 1.1910, -1.7987]],

        [[ 0.9267, -1.1113],
         [-0.5338,  1.1583]],

        [[ 0.4401, -0.7307],
         [-0.3230,  2.5825]]])
```
## Step 4
```python=
>>> import torch
>>> import torch.nn as nn
>>> linear = nn.Linear(10,2)
>>> input = torch.rand(3,10)
>>> output = linear(input)
>>> output
tensor([[ 0.0553, -0.1094],
        [ 0.4083, -0.0111],
        [ 0.0809, -0.0664]], grad_fn=<AddmmBackward>)
>>> input
tensor([[0.1444, 0.8795, 0.5772, 0.2250, 0.5413, 0.6340, 0.1473, 0.9413, 0.1569,
         0.7012],
        [0.4363, 0.8177, 0.9814, 0.4345, 0.6005, 0.5576, 0.2776, 0.5850, 0.6348,
         0.2550],
        [0.6458, 0.7376, 0.7566, 0.9595, 0.2711, 0.1724, 0.0750, 0.2433, 0.0572,
         0.8343]])
>>> linear = nn.Linear(10,2)
>>> input = torch.rand(3,10)
>>> input
tensor([[0.4462, 0.0043, 0.2773, 0.1597, 0.9943, 0.0232, 0.6335, 0.4154, 0.9456,
         0.1234],
        [0.4195, 0.9713, 0.9492, 0.2999, 0.3050, 0.2913, 0.3314, 0.8984, 0.4644,
         0.2570],
        [0.8660, 0.9288, 0.4416, 0.9511, 0.4101, 0.8471, 0.1060, 0.6130, 0.7972,
         0.7111]])
>>> output = linear(input)
>>> output
tensor([[-0.3816,  0.1372],
        [-0.3751, -0.0716],
        [-0.3905, -0.3343]], grad_fn=<AddmmBackward>)
>>> relu = nn.ReLU()
>>> relu_output = relu(output)
>>> relu_output
tensor([[0.0000, 0.1372],
        [0.0000, 0.0000],
        [0.0000, 0.0000]], grad_fn=<ReluBackward0>)
```
## Step 5
```python=
>>> import torch
>>> import torch.nn as nn
>>> import torch.optim as optim
>>> mlp_layer = nn.Sequential(nn.Linear(5,2), nn.BatchNorm1d(2), nn.ReLU())
>>> mlp_layer
Sequential(
(0): Linear(in_features=5, out_features=2, bias=True)
(1): BatchNorm1d(2, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU()
)
>>> input = torch.rand(5,5) + 1
>>> input
tensor([[1.5023, 1.5349, 1.6604, 1.9086, 1.3933],
        [1.4985, 1.1789, 1.3796, 1.2407, 1.4150],
        [1.5343, 1.5321, 1.6557, 1.9746, 1.5551],
        [1.8356, 1.1646, 1.5396, 1.1645, 1.0196],
        [1.4857, 1.6966, 1.3791, 1.9170, 1.2555]])
>>> mlp_layer(input)
tensor([[0.0000, 0.6137],
        [0.3924, 0.0000],
        [0.0000, 0.3760],
        [1.4502, 0.6307],
        [0.3675, 0.3648]], grad_fn=<ReluBackward0>)
>>> adam_opt = optim.Adam(mlp_layer.parameters(), lr=1e-1)
>>> adam_opt
Adam (
Parameter Group 0
    amsgrad: False
    betas: (0.9, 0.999)
    eps: 1e-08
    lr: 0.1
    weight_decay: 0
)
```
## Step 6
Re-run all the needed imports and definitions.
```python=
>>> import torch
>>> import torch.nn as nn
>>> import torch.optim as optim
>>> mlp_layer = nn.Sequential(nn.Linear(5,2), nn.BatchNorm1d(2), nn.ReLU())
>>> adam_opt = optim.Adam(mlp_layer.parameters(), lr=1e-1)
>>> train_ex = torch.randn(100,5) + 1
>>> adam_opt.zero_grad()
>>> curr_loss = torch.abs(1 - mlp_layer(train_ex)).mean()
>>> curr_loss.backward()
>>> adam_opt.step()
>>> curr_loss
tensor(0.8000, grad_fn=<MeanBackward0>)
```
Track `curr_loss` over 10 iterations:
```python=14
>>> train_ex = torch.randn(100,5) + 1
>>> num_epoch = 10
>>> for i in range(num_epoch):
...     adam_opt.zero_grad()
...     curr_loss = torch.abs(1 - mlp_layer(train_ex)).mean()
...     curr_loss.backward()
...     adam_opt.step()
...     curr_loss
...
tensor(0.7688, grad_fn=<MeanBackward0>)
tensor(0.7135, grad_fn=<MeanBackward0>)
tensor(0.6520, grad_fn=<MeanBackward0>)
tensor(0.5929, grad_fn=<MeanBackward0>)
tensor(0.5294, grad_fn=<MeanBackward0>)
tensor(0.4543, grad_fn=<MeanBackward0>)
tensor(0.3612, grad_fn=<MeanBackward0>)
tensor(0.2538, grad_fn=<MeanBackward0>)
tensor(0.1469, grad_fn=<MeanBackward0>)
tensor(0.0443, grad_fn=<MeanBackward0>)
```