# PyTorch Learning Notes
###### tags: `python`
## Tensor
1. **Import modules**
```python
import torch
import numpy as np
```
2. **Initialization**
```python
# from list
data = [[1, 2],[3, 4]]
x_data = torch.tensor(data)
# from numpy array
np_array = np.array(data)
x_np = torch.from_numpy(np_array)
# from another tensor
x_ones = torch.ones_like(x_data)
x_rand = torch.rand_like(x_data, dtype=torch.float)
x_zeros = torch.zeros_like(x_data)
# specify shape
shape = (2,3)
rand_tensor = torch.rand(shape)
ones_tensor = torch.ones(shape)
zeros_tensor = torch.zeros(shape)
```
3. **Attributes**
```python
tensor = torch.rand(3,4)
print(f"Shape of tensor: {tensor.shape}")
print(f"Datatype of tensor: {tensor.dtype}")
print(f"Device tensor is stored on: {tensor.device}")
```
+ shape: the shape of the tensor
+ dtype: the data type of the tensor
+ device: the device on which the tensor is stored
4. **Operations**
+ Move to GPU: tensors live on the CPU by default; use the `.to` method to move them to the GPU.
```python
if torch.cuda.is_available():
tensor = tensor.to('cuda')
```
+ Indexing & slicing
```python
tensor = torch.ones(4, 4)
# indexing
print('First row: ', tensor[0])
# slicing
print('First column: ', tensor[:, 0])
print('Last column:', tensor[..., -1])
tensor[:,1] = 0
```
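Since `tensor = torch.ones(4, 4)`, the assignment above zeroes the second column:
```python
print(tensor)
# tensor([[1., 0., 1., 1.],
#         [1., 0., 1., 1.],
#         [1., 0., 1., 1.],
#         [1., 0., 1., 1.]])
```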
+ Concatenation
```python
tensor = torch.ones(4, 4)
t1 = torch.cat([tensor, tensor, tensor], dim=1) # shape = (4, 12)
```
+ Multiplication
```python
## matrix multiplication
y1 = tensor @ tensor.T
y2 = tensor.matmul(tensor.T)
y3 = torch.rand_like(tensor)
torch.matmul(tensor, tensor.T, out=y3)
## element-wise multiplication
z1 = tensor * tensor
z2 = tensor.mul(tensor)
z3 = torch.rand_like(tensor)
torch.mul(tensor, tensor, out=z3)
```
+ Extract a value: pulls the numeric value out of a tensor; this applies when the tensor holds exactly one element.
```python
agg = tensor.sum()
agg_item = agg.item()
```
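With `tensor = torch.ones(4, 4)` from the blocks above, `agg` is a zero-dimensional tensor and `.item()` converts it to a plain Python number:
```python
print(agg)                       # tensor(16.)
print(agg_item, type(agg_item))  # 16.0 <class 'float'>
```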
+ Convert to a NumPy array
```python
t = torch.ones(5)
n = t.numpy()
```
+ In-place operations: methods whose names end with an underscore (e.g. ``add_``) modify the tensor in place.
```python
t.add_(1)
```
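Note that a CPU tensor and the NumPy array created from it share the same underlying memory, so an in-place update on one is visible in the other:
```python
t = torch.ones(5)
n = t.numpy()
t.add_(1)    # in-place: the trailing underscore mutates t
print(t)     # tensor([2., 2., 2., 2., 2.])
print(n)     # [2. 2. 2. 2. 2.]
```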
## Datasets
1. **Import modules**
```python
import torch
from torch.utils.data import Dataset
from torchvision import datasets
from torchvision.transforms import ToTensor
```
2. **Download the data**
```python
training_data = datasets.FashionMNIST(
root="data",
train=True,
download=True,
transform=ToTensor()
)
test_data = datasets.FashionMNIST(
root="data",
train=False,
download=True,
transform=ToTensor()
)
```
+ ``root``: the path where the data is stored.
+ ``train``: selects the training or the test split.
+ ``download``: if ``True`` and the data is not found under ``root``, download it from the internet.
+ ``transform``: a function that preprocesses the data.
+ ``target_transform``: a function that processes the labels.
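Indexing the downloaded dataset returns a single ``(image, label)`` pair; with ``ToTensor()`` each FashionMNIST image becomes a 1x28x28 float tensor:
```python
img, label = training_data[0]
print(img.shape)  # torch.Size([1, 28, 28])
print(label)      # an integer class index in 0-9
```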
3. **Custom datasets**: a custom dataset must inherit from ``torch.utils.data.Dataset`` and implement the ``__init__``, ``__len__``, and ``__getitem__`` methods. ``__len__`` returns the number of samples, and ``__getitem__`` returns a ``(data, label)`` pair.
```python
import os
import pandas as pd
from torchvision.io import read_image
class CustomImageDataset(Dataset):
def __init__(self, annotations_file, img_dir, transform=None, target_transform=None):
self.img_labels = pd.read_csv(annotations_file)
self.img_dir = img_dir
self.transform = transform
self.target_transform = target_transform
def __len__(self):
return len(self.img_labels)
def __getitem__(self, idx):
img_path = os.path.join(self.img_dir, self.img_labels.iloc[idx, 0])
image = read_image(img_path)
label = self.img_labels.iloc[idx, 1]
if self.transform:
image = self.transform(image)
if self.target_transform:
label = self.target_transform(label)
return image, label
```
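A usage sketch (the paths ``labels.csv`` and ``images/`` are hypothetical placeholders, assuming the CSV's first column holds file names and the second column holds labels):
```python
dataset = CustomImageDataset(
    annotations_file="labels.csv",  # hypothetical annotation file
    img_dir="images/",              # hypothetical image directory
)
print(len(dataset))        # number of rows in the CSV
image, label = dataset[0]  # invokes __getitem__(0)
```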
4. **Data splitting**
+ Method 1
```python
train_set_size = int(len(train_set) * 0.8)
valid_set_size = len(train_set) - train_set_size
train_set, valid_set = torch.utils.data.random_split(train_set, [train_set_size, valid_set_size])
```
+ Method 2
```python
from sklearn.model_selection import train_test_split
train_indices, val_indices = train_test_split(list(range(len(dataset.targets))), test_size=0.2, stratify=dataset.targets)
train_dataset = torch.utils.data.Subset(dataset, train_indices)
val_dataset = torch.utils.data.Subset(dataset, val_indices)
```
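For a reproducible split, ``random_split`` also accepts a ``generator`` argument that can be seeded:
```python
generator = torch.Generator().manual_seed(42)
train_set, valid_set = torch.utils.data.random_split(
    train_set, [train_set_size, valid_set_size], generator=generator
)
```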
## DataLoader
1. **Import modules**
```python
from torch.utils.data import DataLoader
```
2. **Create a dataloader**: a ``DataLoader`` serves the data in mini-batches, reshuffles it every epoch, and can use ``multiprocessing`` workers to speed up the training pipeline.
```python
train_dataloader = DataLoader(training_data, batch_size=64, shuffle=True)
test_dataloader = DataLoader(test_data, batch_size=64, shuffle=True)
```
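Iterating the dataloader yields one mini-batch at a time; with FashionMNIST and ``batch_size=64`` the shapes are:
```python
X, y = next(iter(train_dataloader))
print(X.shape)  # torch.Size([64, 1, 28, 28])
print(y.shape)  # torch.Size([64])
```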
## nn.Module
1. **Import modules**
```python
import os
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
```
2. **Build the model**: a model must inherit from ``nn.Module``; its ``forward`` method returns the result of the computation.
```python
device = 'cuda' if torch.cuda.is_available() else 'cpu'
class NeuralNetwork(nn.Module):
def __init__(self):
super(NeuralNetwork, self).__init__()
self.flatten = nn.Flatten()
self.linear_relu_stack = nn.Sequential(
nn.Linear(28*28, 512),
nn.ReLU(),
nn.Linear(512, 512),
nn.ReLU(),
nn.Linear(512, 10),
)
def forward(self, x):
x = self.flatten(x)
logits = self.linear_relu_stack(x)
return logits
model = NeuralNetwork().to(device)
```
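A forward pass on a dummy input, following the same tutorial pattern (``nn.Softmax`` turns the raw logits into class probabilities):
```python
X = torch.rand(1, 28, 28, device=device)
logits = model(X)                        # calls forward() under the hood
pred_probab = nn.Softmax(dim=1)(logits)  # probabilities over the 10 classes
y_pred = pred_probab.argmax(1)
print(f"Predicted class: {y_pred}")
```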
## Gradient
1. **Build the computational graph**: setting ``requires_grad=True`` on a tensor marks it as a variable whose gradient can be computed.
```python
import torch
x = torch.ones(5) # input tensor
y = torch.zeros(3) # expected output
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)
z = torch.matmul(x, w)+b
loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)
```

2. **Compute gradients**
```python
loss.backward()
print(w.grad) # gradient of w
print(b.grad) # gradient of b
```
+ Only leaf tensors with ``requires_grad=True`` get their ``grad`` attribute populated.
+ After ``backward`` is called, the computational graph is freed, so calling ``backward`` a second time raises an error; pass ``retain_graph=True`` to keep the graph, as in the example below:
```python
x = torch.randn((1,4),dtype=torch.float32,requires_grad=True)
y = x ** 2
z = y * 4
output1 = z.mean()
output2 = z.sum()
output1.backward(retain_graph=True)
output2.backward()
```
3. **Disable gradient tracking**
```python
z = torch.matmul(x, w)+b
print(z.requires_grad) # True
# method 1
with torch.no_grad():
z = torch.matmul(x, w)+b
print(z.requires_grad) # False
# method 2
z = torch.matmul(x, w)+b
z_det = z.detach()
print(z_det.requires_grad) # False
```
4. **Use cases**
+ Inference needs no backpropagation, so disabling gradient tracking makes prediction more efficient.
+ Freezing some parameters during training, as in the sketch below.
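A minimal sketch of the second use case, freezing the first linear layer of the ``NeuralNetwork`` defined earlier (index ``0`` of ``linear_relu_stack`` is the first ``nn.Linear``):
```python
for param in model.linear_relu_stack[0].parameters():
    param.requires_grad = False  # this layer no longer accumulates gradients
# hand only the still-trainable parameters to the optimizer
trainable = [p for p in model.parameters() if p.requires_grad]
```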
## Optimization & Model Training
1. **Case 1**: minimize a plain function.
```python
import torch
import torch.optim as optim
def square(x):
return x**2
x = torch.ones((1,1), requires_grad=True)
optimizer = optim.Adam([x], lr=1e-2)
for i in range(200):
optimizer.zero_grad()
output = square(x) # forward
output.backward() # backward
optimizer.step() # optimize
```
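After 200 steps, ``x`` should have moved from 1 toward the minimizer of ``x**2`` at 0 (the exact value depends on Adam's dynamics):
```python
print(x.item())  # close to 0
```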

2. **Case 2**: train a model ([complete code](https://drive.google.com/file/d/1Qo9lG5oWtzKhST7lX-XdBNkWKM9zZ2xr/view?usp=sharing)).
```python
# hyperparameter
learning_rate = 1e-3
epochs = 5
# loss function
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
def train_loop(dataloader, model, loss_fn, optimizer):
size = len(dataloader.dataset)
for batch, (X, y) in enumerate(dataloader):
# Compute prediction and loss
pred = model(X)
loss = loss_fn(pred, y)
# Backpropagation
optimizer.zero_grad()
loss.backward()
optimizer.step()
if batch % 100 == 0:
loss, current = loss.item(), batch * len(X)
print(f"loss: {loss:>7f} [{current:>5d}/{size:>5d}]")
```
+ `nn.CrossEntropyLoss` combines the softmax and cross-entropy steps, so no extra softmax layer is needed on the model output.
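A matching evaluation loop, sketched after the same tutorial pattern (``torch.no_grad()`` from the Gradient section skips graph construction during inference):
```python
def test_loop(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    test_loss, correct = 0, 0
    with torch.no_grad():  # no graph is built during evaluation
        for X, y in dataloader:
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()
    test_loss /= num_batches
    correct /= size
    print(f"Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f}")
```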
## Saving & Loading Models
1. **Method 1**: save only the learned parameters (the ``state_dict``).
```python
from torchvision import models
# save model
model = models.vgg16(pretrained=True)
torch.save(model.state_dict(), 'model_weights.pth')
# load model
model = models.vgg16()
model.load_state_dict(torch.load('model_weights.pth'))
model.eval()
```
2. **Method 2**: save the entire model object; since the whole module is pickled, the model's class definition must be available when loading.
```python
torch.save(model, 'model.pth')
model = torch.load('model.pth')
```
## Learning Rate Scheduling
1. [**LambdaLR**](https://pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.LambdaLR.html): a user-defined schedule; the learning rate is the initial lr multiplied by the factor returned by ``lr_lambda``.
```python
from torch.optim.lr_scheduler import LambdaLR
lambda_fun = lambda epoch: 0.95 ** epoch
scheduler = LambdaLR(optimizer, lr_lambda=lambda_fun)
for epoch in range(100):
train(...)
validate(...)
scheduler.step()
```
+ `optimizer`: the wrapped optimizer.
+ `lr_lambda`: a function (or a list of functions, one per parameter group) that takes the epoch index and returns a multiplicative factor.
+ `verbose`: if `True`, prints a message on every update.
2. [**StepLR**](https://pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.StepLR.html): decays the learning rate every fixed number of epochs.
```python
from torch.optim.lr_scheduler import StepLR
scheduler = StepLR(optimizer, step_size=30, gamma=0.1)
for epoch in range(100):
train(...)
validate(...)
scheduler.step()
```
+ `optimizer`: the wrapped optimizer.
+ `step_size`: the period, in epochs, between decays.
+ `gamma`: the multiplicative decay factor (default 0.1).
+ `verbose`: if `True`, prints a message on every update.
3. [**ReduceLROnPlateau**](https://pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.ReduceLROnPlateau.html#torch.optim.lr_scheduler.ReduceLROnPlateau): reduces the learning rate when a monitored metric has stopped improving.
```python
from torch.optim.lr_scheduler import ReduceLROnPlateau
scheduler = ReduceLROnPlateau(optimizer, 'min')
for epoch in range(100):
train(...)
val_loss = validate(...)
# Note that step should be called after validate()
scheduler.step(val_loss)
```
+ `optimizer`: the wrapped optimizer.
+ `factor`: the factor by which the learning rate is multiplied when reduced (default 0.1).
+ `patience`: the number of epochs with no improvement to wait before reducing (default 10).
+ `verbose`: if `True`, prints a message on every update.
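To check the current learning rate during training, read it from the optimizer's parameter groups (this works with every scheduler, including ``ReduceLROnPlateau``):
```python
current_lr = optimizer.param_groups[0]['lr']
print(f"current learning rate: {current_lr}")
```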
## References
1. https://pytorch.org/tutorials/
2. https://www.youtube.com/watch?v=ZoZHd0Zm3RY
3. https://www.youtube.com/watch?v=P31hB37g4Ak