---
Tags: PyTorch, MIS583, Multi-Tasking
---

# PyTorch & Assignment 4 Hints

Hi, this is Yao-Rong Chen. Here are some tips on PyTorch and Assignment 4 for you.

[toc]

## PyTorch

### Type Conversion

Pay attention to data types.

```python
x = torch.rand(3, 3)
x.numpy()   # to numpy
x.tolist()  # torch to pure python
```

### Recommended usage of device

The recommended way to set up the device:

```python
# device = torch.device('cpu')    # on cpu
device = torch.device('cuda')     # on gpu (default gpu0)
# device = torch.device('cuda:2') # on third GPU

model.to(device)          # move model to GPU
image = image.to(device)  # move data to GPU
label = label.to(device)  # move label to GPU
```

### Test your dataset (multi-tasking)

Test whether your custom dataset parses the data files correctly.

```python
class HelloData(torch.utils.data.Dataset):
    def __init__(self):
        ...
    def __getitem__(self, idx):
        ...
```

Even if the class above runs without errors, that does not mean it is correct. Test it by creating an instance.

```python
train_dataset = HelloData()
```

Works fine? Fix it if you got an error. Warning: this ONLY tests the `__init__` function.

```python
print(train_dataset[0])

# If your __getitem__ returns a tuple of `(image, category, attribute)`
d_img, d_cate, d_attr = train_dataset[1]
print(d_img.shape, d_cate.shape, d_attr.shape)
# torch.Size([3, 224, 224]) torch.Size([]) torch.Size([4])
print(d_cate)
# tensor(1)
print(d_attr)
# tensor([ 3,  4,  5, 13])
```

Accessing `train_dataset[1]` actually calls and tests the `__getitem__` method.

### Test your model (multi-tasking)

Test whether your model works correctly.

```python
import torch

# 32 is batch_size
fake_x = torch.rand(32, 3, 224, 224)  # generate fake data
pred_cate, pred_attr = model(fake_x)  # outputs for category and attribute
print(pred_cate.shape)  # shape should be (batch, 10)
print(pred_attr.shape)  # shape should be (batch, 15)
```

### Test your criterion

Test your criterion with fake targets.

```python
y_cate = torch.rand(32, 10)  # fake true category
y_attr = torch.rand(32, 15)  # fake true attribute

loss1 = criterion1(pred_cate, y_cate)
loss2 = criterion2(pred_attr, y_attr)
print(loss1.item())  # .item() to pure python
print(loss2.item())
```

## Training

### Transfer Learning

Official tutorial (recommended):
https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html#finetuning-the-convnet
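The tutorial covers a single classification head; for the multi-tasking assignment the model needs two outputs. Below is a minimal sketch, assuming a pretrained ResNet-18 from `torchvision` as the shared backbone and the 10-category / 15-attribute head sizes from the shape checks above. Treat it as one possible starting point, not the required architecture.

```python
import torch
import torch.nn as nn
from torchvision import models

class MultiTaskNet(nn.Module):
    """Sketch only: one shared pretrained backbone + two task-specific heads."""
    def __init__(self, n_category=10, n_attribute=15):
        super().__init__()
        backbone = models.resnet18(pretrained=True)  # assumed backbone choice
        n_feat = backbone.fc.in_features             # 512 for resnet18
        backbone.fc = nn.Identity()                  # drop the original ImageNet classifier
        self.backbone = backbone
        self.category_head = nn.Linear(n_feat, n_category)    # single-label head
        self.attribute_head = nn.Linear(n_feat, n_attribute)  # multi-label head

    def forward(self, x):
        feat = self.backbone(x)
        return self.category_head(feat), self.attribute_head(feat)

# quick shape check, same idea as the fake-data test above
model = MultiTaskNet()
pred_cate, pred_attr = model(torch.rand(2, 3, 224, 224))
print(pred_cate.shape, pred_attr.shape)  # torch.Size([2, 10]) torch.Size([2, 15])
```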
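Once the model and the two criteria are defined, the two losses are usually combined into a single scalar before calling `backward()`. Here is a minimal, runnable sketch of one such training step; `TinyNet`, the chosen criteria (CrossEntropyLoss for the category head, BCEWithLogitsLoss for the attribute head), and the unweighted sum are all assumptions for illustration, not the assignment's required setup.

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """Toy stand-in for your real two-head model (kept tiny so it runs anywhere)."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten())  # (B, 3)
        self.category_head = nn.Linear(3, 10)
        self.attribute_head = nn.Linear(3, 15)

    def forward(self, x):
        feat = self.backbone(x)
        return self.category_head(feat), self.attribute_head(feat)

model = TinyNet()
criterion1 = nn.CrossEntropyLoss()   # assumed: expects integer class targets, shape (batch,)
criterion2 = nn.BCEWithLogitsLoss()  # assumed: expects 0/1 float targets, shape (batch, 15)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# one fake batch: image, category index, multi-hot attribute vector
image = torch.rand(8, 3, 224, 224)
category = torch.randint(0, 10, (8,))
attribute = torch.randint(0, 2, (8, 15)).float()

pred_cate, pred_attr = model(image)
loss = criterion1(pred_cate, category) + criterion2(pred_attr, attribute)  # unweighted sum of both tasks
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(loss.item())
```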
### metrics - f1_score

Rather than accuracy, the f1_score metric takes both precision and recall into account, which makes it better suited to multi-label tasks. You can use the `f1_score` function from the `scikit-learn` package.

```bash
pip install scikit-learn  # install from bash
```

And then

```python
from sklearn.metrics import f1_score

torch.manual_seed(1340)
# rand() generates fake confidence scores
# > 0.5 outputs True or False
# .int() converts True/False into 1/0
# .to('cuda') is optional, depending on your environment
x = (torch.rand(4, 4) > 0.5).int()  # .to('cuda')
y = (torch.rand(4, 4) > 0.5).int()  # .to('cuda')
x
# tensor([[1, 0, 0, 0],
#         [1, 1, 1, 1],
#         [1, 1, 1, 1],
#         [0, 1, 0, 1]], dtype=torch.int32)
y
# tensor([[1, 1, 0, 1],
#         [1, 0, 1, 0],
#         [1, 0, 1, 1],
#         [1, 1, 0, 1]], dtype=torch.int32)

f1_score(y, x, average='samples')
# f1_score(y.cpu(), x.cpu(), average='samples')  # .cpu() if you need it
# 0.7059523809523809
```

There are three ways to average the f1 score for multi-label problems: micro, macro, and samples.

- micro: sum up all the tp, fp and fn over every sample and label, then compute one overall f1 score.
- macro: compute an f1 score for each label (each column) and average them.
- samples: compute an f1 score for each sample (each row) and average them.

The micro and macro averages do not account well for the label imbalance in our data, so we use samples as the averaging method. A detailed example is here: https://www.kaggle.com/c/ee448-2019-node-classification/discussion/89348

### How to calculate F1

In the previous assignment we computed accuracy for each batch and averaged the results. That approach does not work for the samples-averaged f1 score: it makes the estimated f1 differ from the true f1. Instead, save all of the predictions and compute the f1 score once at the end. Below is pseudocode.

```python
preds = []
for batch in loader:
    out = model(batch)
    preds.append(out)

test_f1 = f1_score(trues, preds, average='samples')
```
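One practical note before computing the score: `f1_score` with `average='samples'` expects both arguments to be binary indicator (multi-hot) matrices. If your dataset returns attribute indices such as `tensor([ 3,  4,  5, 13])` (as in the dataset test above), they need to be converted first. A small sketch, assuming 15 attribute labels in total and a hypothetical helper name `to_multi_hot`:

```python
import torch

def to_multi_hot(indices, num_labels=15):
    """Convert a tensor of attribute indices into a 0/1 multi-hot vector."""
    multi_hot = torch.zeros(num_labels, dtype=torch.int64)
    multi_hot[indices] = 1
    return multi_hot

print(to_multi_hot(torch.tensor([3, 4, 5, 13])))
# tensor([0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0])
```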
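To make the pseudocode above concrete, here is a minimal, runnable sketch of the collect-everything-then-score-once pattern for the attribute head. `FakeModel` and `fake_loader` are hypothetical stand-ins for your real model and DataLoader (assumed to yield `(image, category, attribute)` with attributes already in multi-hot form), and the 0.5 threshold comes from the toy example above.

```python
import torch
import torch.nn as nn
from sklearn.metrics import f1_score

class FakeModel(nn.Module):
    """Toy stand-in for your real two-head model: returns random logits."""
    def forward(self, x):
        n = x.shape[0]
        return torch.randn(n, 10), torch.randn(n, 15)  # category / attribute logits

model = FakeModel()
# each fake batch: (image, category index, multi-hot attribute vector)
fake_loader = [(torch.rand(8, 3, 224, 224),
                torch.randint(0, 10, (8,)),
                torch.randint(0, 2, (8, 15))) for _ in range(3)]

model.eval()
attr_preds, attr_trues = [], []
with torch.no_grad():
    for image, category, attribute in fake_loader:
        _, attr_logits = model(image)                         # (batch, 15) logits
        attr_pred = (torch.sigmoid(attr_logits) > 0.5).int()  # logits -> 0/1 predictions
        attr_preds.append(attr_pred.cpu())
        attr_trues.append(attribute.cpu())

# concatenate every batch, then compute one overall f1 score
attr_preds = torch.cat(attr_preds).numpy()
attr_trues = torch.cat(attr_trues).numpy()
print(f1_score(attr_trues, attr_preds, average='samples'))
```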