PyTorch Fundamentals

# PyTorch Fundamentals

## Import

![](https://i.imgur.com/gg7WqIo.png)
![](https://i.imgur.com/bNkc3HZ.png)

Features -> result (label, output)

---

## Tensors

PyTorch tensors are created using `torch.tensor`.

![](https://i.imgur.com/EycoSi4.png)
![](https://i.imgur.com/bbiIrEC.png)

```python=
import torch

# Scalar: a single value
scalar = torch.tensor(7)  # tensor(7)
# Check the number of dimensions
scalar.ndim  # 0
# Get tensor back as Python int
scalar.item()

# Vector (1-D)
vector = torch.tensor([7, 7])  # tensor([7, 7])
vector.shape  # torch.Size([2])
vector.ndim  # 1

# Matrix (2-D)
MATRIX = torch.tensor([[7, 7],
                       [8, 8]])
# tensor([[7, 7],
#         [8, 8]])
MATRIX[0]  # tensor([7, 7])
MATRIX.shape  # torch.Size([2, 2])

# Tensor (3-D, or N-D in general)
TENSOR = torch.tensor([[[1, 2, 3],
                        [4, 5, 6],
                        [7, 8, 9]],
                       [[2, 2, 2],
                        [3, 3, 3],
                        [4, 4, 4]]])
TENSOR.ndim  # 3
TENSOR.shape  # torch.Size([2, 3, 3])
```

---

## Random Tensors

Why random tensors? Random tensors are important because many neural networks learn by starting with tensors full of random numbers and then adjusting those numbers to better represent the data.

Example: start with random numbers -> look at data -> update random numbers -> look at data -> update random numbers, **until they better represent the data or image**.

```python=
# Create a random tensor of size (3, 3, 3)
random_t = torch.rand(3, 3, 3)
random_t
```

![](https://i.imgur.com/iVI6ryR.png)

```python=
# Create a random tensor with a shape similar to an image tensor
# (an image can be encoded as a tensor like this)
random_image_size_tensor = torch.rand(size=(224, 224, 3))  # height, width, color channels
```

![](https://i.imgur.com/CVEpyJZ.png)

```python=
# Create a tensor of all zeros
z = torch.zeros(size=(3, 4))
# torch.ones creates a tensor of all ones
```

![](https://i.imgur.com/vH1JTpo.png)

---

## Replicating a Tensor's Shape

```python=
# A tensor over a range
a = torch.arange(start=1, end=11, step=1)
```

![](https://i.imgur.com/dkzZqfv.png)

```python=
# Create a tensor of zeros with the same shape as another tensor
t = torch.zeros_like(input=a)  # input is the tensor whose shape is replicated
```

![](https://i.imgur.com/KTEjnZO.png)

---

## Tensor datatype

**NOTE: the tensor datatype is one of the 3 big errors you'll run into with PyTorch & deep learning**

1. **Tensors not the right datatype (dtypes must match before you can operate on them) (`tensor.dtype`)**
2. **Tensors not the right shape (dimensions must be compatible) (`tensor.shape`)**
3. **Tensors not on the right device (they must be on the same hardware) (`tensor.device`)**

### The default float dtype is float32

```python=
ex = torch.tensor([3.0, 6.0, 9.0],
                  dtype=None,           # what datatype the tensor is (None -> inferred, float32 here)
                  device="cpu",         # what device the tensor is on
                  requires_grad=False)  # whether or not to track gradients for operations on this tensor
ex.dtype
```

![](https://i.imgur.com/i1j7dqU.png)

### Converting dtype

![](https://i.imgur.com/uEAG4yh.png)

---

## Manipulating Tensors

* Addition
  ![](https://i.imgur.com/9eKkPEN.png)
* Subtraction
* Multiplication (the following is element-wise multiplication)
  ![](https://i.imgur.com/27GFexp.png)
  ![](https://i.imgur.com/AUUF3d9.png)
* Division
* Matrix multiplication
  1. The **inner dimensions** must match.
     Example: 2×3 matmul 3×2 is OK.
     Example: ~~3×4 matmul 3×5~~ does not work (shapes are incompatible).
  2. The resulting matrix has the shape of the **outer dimensions**.
     Example: 2×3 matmul 3×2 gives a 2×2 matrix.

![](https://i.imgur.com/dcqn2FO.png)

Use **torch.matmul(t1, t2)** or **torch.mm(t1, t2)** (`torch.mm` only accepts 2-D matrices).

![](https://i.imgur.com/va64P24.png)
![](https://i.imgur.com/80TvoRW.png)

Using `%%time` shows that this built-in method is really fast.

![](https://i.imgur.com/1VYEAxr.png)

---

## Fixing incompatible shapes in matmul

When you hit A[2×3] matmul B[2×3], use the **transpose**, **B.T**, to turn B into a 3×2 matrix so the multiplication works.

![](https://i.imgur.com/EYWVVOl.png)

---

## Tensor aggregation (min, max, ...)
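As a minimal sketch of the aggregation calls this section covers (the tensor `x` and its values are illustrative assumptions, not taken from the screenshots), the following shows `min`, `max`, `sum`, and `mean`, including the dtype conversion that `mean` needs on an integer tensor:

```python
import torch

# Illustrative tensor (assumption): 0, 10, ..., 90 with dtype int64
x = torch.arange(0, 100, 10)

minimum = x.min()   # smallest element
maximum = x.max()   # largest element
total = x.sum()     # sum of all elements

# mean() needs a floating-point (or complex) dtype; calling x.mean() on an
# int64 tensor raises a RuntimeError, so convert first
mean = x.type(torch.float32).mean()

print(minimum, maximum, total, mean)
```

The conversion with `x.type(torch.float32)` returns a new tensor, so `x` itself keeps its integer dtype.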
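![](https://i.imgur.com/k9KrTdW.png)
![](https://i.imgur.com/uw37OJf.png)

Note: to compute the mean, the dtype must be float, not long; here `x.type` is used to convert the dtype.

![](https://i.imgur.com/lGE0Huw.png)
![](https://i.imgur.com/F6Qo0lM.png)

---

## Finding the positions of min and max

![](https://i.imgur.com/mAil5UQ.png)
![](https://i.imgur.com/RDxtJEO.png)

---

## Reshaping, stacking, squeezing, unsqueezing tensors

![](https://i.imgur.com/3J2g9xt.png)

Setup:

![](https://i.imgur.com/IvnIJ3a.png)

### Reshaping

The number of elements must stay the same. For example, below the tensor is reshaped to 9×1.

![](https://i.imgur.com/iUdny5p.png)

### View

A view shares the same memory as the original tensor. For example, changing the first value of z to 5 also changes x.

![](https://i.imgur.com/zKUXCqm.png)
![](https://i.imgur.com/A3ONexC.png)

### Stack

`dim` stands for dimension. There are also `vstack` (vertical) and `hstack` (horizontal).

![](https://i.imgur.com/nI8c93g.png)
![](https://i.imgur.com/7qNlr2d.png)

### Squeeze

Removes all dimensions of size 1 (or one specified dimension). For example, `torch.Size([1, 9])` in the figure below means dim 0 has size 1 and dim 1 has size 9.

![](https://i.imgur.com/X3OFLR4.png)

### Unsqueeze

The examples below add a new dimension at dim 0, and then at dim 1.

![](https://i.imgur.com/s8SH09F.png)
![](https://i.imgur.com/PTLPs1o.png)

### Permute

Rearranges the dimensions into the order you want, e.g. 2×3×4 becomes 4×3×2. Commonly used with **image data**.

![](https://i.imgur.com/xCGHNtt.png)

---

## Indexing (select data from a tensor)

Similar to NumPy.

![](https://i.imgur.com/UcphDbv.png)

Operations:

![](https://i.imgur.com/J20GKH0.png)
![](https://i.imgur.com/WqXndOw.png)
![](https://i.imgur.com/rVMpulM.png)
![](https://i.imgur.com/FRhkJke.png)

---

## PyTorch tensor & NumPy

### Quick conversion between the two

![](https://i.imgur.com/HGC6luh.png)
![](https://i.imgur.com/hB30U4g.png)

Note: whether the two share memory depends on the call. `torch.from_numpy()` and `Tensor.numpy()` share memory with their source on CPU, while `torch.tensor()` always makes a copy. An out-of-place update such as `array = array + 1` also creates a new object and leaves the other side unchanged.

Note: NumPy's default dtype is float64, while PyTorch's default tensor dtype is float32. When converting from NumPy to PyTorch, the tensor's dtype reflects the NumPy dtype.

```python=
# Example: quick conversion to NumPy
torch.tensor(train_loss_count).numpy()
```

To convert the dtype, use the following:

![](https://i.imgur.com/1rXycjx.png)
![](https://i.imgur.com/AgPw3iE.png)

Note: this converts to PyTorch's default dtype.

---

## Reproducibility (seed) (reducing the randomness of the initial random tensors)

![](https://i.imgur.com/nngvNq9.png)

To reduce uncertainty during a demo, use a random seed so that the randomly generated numbers are the same every time.

![](https://i.imgur.com/2SW4E3y.png)

Note: in a notebook, set the seed again before each random call you want to reproduce. In a script (e.g. run from PyCharm), setting it once at the top is enough.

---

## Access GPU

![](https://i.imgur.com/hoK0lyh.png)
![](https://i.imgur.com/bEwMvJv.png)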
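A minimal sketch of device-agnostic code, assuming the common pattern of choosing `"cuda"` when it is available and falling back to `"cpu"` (the tensor `t` and the variable names are illustrative). It also shows moving a tensor back to the CPU before converting it to NumPy, since NumPy arrays live in CPU memory:

```python
import torch

# Pick the GPU if PyTorch can see one, otherwise stay on the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

# Illustrative tensor (assumption)
t = torch.tensor([1, 2, 3])

# .to(device) returns a tensor on the target device (a no-op if already there)
t_on_device = t.to(device)

# NumPy only works with CPU memory, so move the tensor back first
back_on_cpu = t_on_device.cpu().numpy()

print(device, t_on_device.device, back_on_cpu)
```

The same `.to(device)` call works for models (`model.to(device)`), which keeps the rest of the code unchanged whether or not a GPU is present.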
Follow: https://pytorch.org/get-started/locally/

### Check the Colab GPU

`!nvidia-smi`

### Check your own computer's GPU

![](https://i.imgur.com/ksn6viC.png)

### Basic setup

![](https://i.imgur.com/yPC9hjj.png)

---

## Putting a tensor (or model) on the GPU

![](https://i.imgur.com/O4K7UMt.png)

Note: set `device` beforehand using `torch.cuda.is_available()`.

![](https://i.imgur.com/rnHHkce.png)

Note: a tensor must be on the CPU before converting it to NumPy, because NumPy only runs on the CPU.

![](https://i.imgur.com/f5Ek7R6.png)

---

# {Prediction} Linear Regression

![](https://i.imgur.com/gAf5csQ.png)
![](https://i.imgur.com/WS8muvd.png)

## Data preparation and loading

![](https://i.imgur.com/tviTkKi.png)
![](https://i.imgur.com/zY8qwZ5.png)

Create a dataset y generated from y = weight * X + bias, so the relationship between the two is known in advance.

```python=
# Create a dataset
weight = 0.7
bias = 0.3
start = 0
end = 1
step = 0.02
X = torch.arange(start, end, step).unsqueeze(dim=1)  # uppercase usually denotes a matrix
y = weight * X + bias                                # lowercase usually denotes a vector
```

---

## Splitting into test and train sets

![](https://i.imgur.com/cLDZmiG.png)

**The goal: generalization** = the ability of a machine learning model to perform well on data it hasn't seen before.

![](https://i.imgur.com/jf4MRPN.png)

```python=
# Split into train and test sets
split_len = int(0.8 * len(X))  # simplest approach: an 80/20 split
train_X, train_y = X[:split_len], y[:split_len]
test_X, test_y = X[split_len:], y[split_len:]
```

---

## Visualization

Visualization is the best way to understand data. Use `plt` to visualize it.

```python=
import matplotlib.pyplot as plt

def plot_predictions(train_data=train_X,
                     train_labels=train_y,
                     test_data=test_X,
                     test_labels=test_y,
                     predictions=None):
    plt.figure(figsize=(10, 7))  # figure size
    plt.scatter(train_data, train_labels, c="r", s=5, label="Training data")
    plt.scatter(test_data, test_labels, c="b", s=5, label="test data")
    if predictions is not None:
        plt.scatter(test_data, predictions, c="g", s=5, label="prediction")
    # Show the legend (labels of the plotted series)
    plt.legend(prop={"size": 20})
```

## Building the model

![](https://i.imgur.com/7Bnrp13.png)
![](https://i.imgur.com/GsCaRyA.png)

```python=
import torch
from torch import nn

class LR(nn.Module):  # nn.Module is like the LEGO bricks for building neural networks
    def __init__(self):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(1, dtype=torch.float),  # start with random weights (this will get adjusted as the model learns); PyTorch loves float32 by default
                                   requires_grad=True)  # <- can we update this value with gradient descent?
        self.bias = nn.Parameter(torch.randn(1, dtype=torch.float),  # start with random bias (this will get adjusted as the model learns)
                                 requires_grad=True)

    # Forward defines what computation is in your model
    def forward(self, x: torch.Tensor) -> torch.Tensor:  # <- "x" is the input data (e.g. training/testing features)
        return self.weight * x + self.bias  # <- this is the linear regression formula (y = m*x + b)
```

![](https://i.imgur.com/KHqNWTh.png)

---

## Inspecting the model's contents and parameters

![](https://i.imgur.com/jQJY2SR.png)

---

## Commonly used model building blocks

![](https://i.imgur.com/JOxBwfz.png)
![](https://i.imgur.com/vKTixqy.png)
![](https://i.imgur.com/r1rDRrw.png)

See more: https://pytorch.org/tutorials/beginner/ptcheat.html

---

## Predicting with the randomly initialized model

Use the test features to run inference (prediction) for the test labels.

![](https://i.imgur.com/yImbQk0.png)

With **inference_mode()**, gradient tracking is turned off, which **speeds up execution**.

```python=
with torch.inference_mode():
    y_p = model_0(test_X)
```

The predictions are far off:

![](https://i.imgur.com/Hh6QTxA.png)

## Before training: loss function and optimizer

1. The aim is to fit the model's random parameters to the parameters behind the real data
2. **Set up a loss function**
3. **Set up an optimizer**, which adjusts the parameters according to the loss of the model

![](https://i.imgur.com/2LpHkOr.png)
![](https://i.imgur.com/XK3chGU.png)
![](https://i.imgur.com/M2tdfQL.png)

```python=
# Loss function
loss_f = nn.L1Loss()
# Optimizer
op = torch.optim.SGD(params=model_0.parameters(),  # SGD = stochastic gradient descent
                     lr=0.01)  # lr = how much the parameters are adjusted per step; commonly used values usually work well
                               # (e.g. if raising the weight lowers the loss, it is nudged upward in steps scaled by 0.01)
```

---

## Training the model

Think of the gradient as a slope: standing on a hill (slope negative or positive), you walk toward flat ground (slope equal to 0).

![](https://i.imgur.com/V1l8Btp.png)
![](https://i.imgur.com/xXIKlIn.png)
![](https://i.imgur.com/9seXND4.png)

```python=
epochs = 1000

# 0. Loop through the data
for e in range(epochs):
    # Put the model in training mode (affects layers such as Dropout and BatchNorm);
    # parameters with requires_grad=True are tracked by autograd either way
    model_0.train()

    # 1. Forward pass
    y_pre = model_0(train_X)
    # 2. Calculate the loss
    loss = loss_f(y_pre, train_y)  # (prediction, target)
    # 3. Optimizer zero grad
    op.zero_grad()  # reset gradients to zero, because backward() accumulates them
    # 4. Loss backward
    loss.backward()
    # 5. Step the optimizer (perform gradient descent)
    op.step()

    # Evaluation mode turns off training-only behavior such as Dropout and
    # BatchNorm updates, so predictions are stable and consistent
    model_0.eval()
    # eval() alone does not disable gradient tracking; inference_mode() does,
    # which speeds up prediction
    with torch.inference_mode():
        test_pre = model_0(test_X)
        test_loss = loss_f(test_pre, test_y)

    if e % 100 == 0:
        print(f"Epoch={e} Loss={loss} test loss={test_loss}")
        print(model_0.state_dict())
```

Watching the training progress (target: weight=0.7, bias=0.3):

![](https://i.imgur.com/3sYPxR0.png)

Visualizing with NumPy arrays and a plot:

```python=
epochs_c = []
train_loss_count = []
test_loss_count = []

# --- Fragment: add the following inside the training loop ---
    if e % 100 == 0:
        epochs_c.append(e)
        train_loss_count.append(loss.item())       # .item() stores a plain float without the autograd graph
        test_loss_count.append(test_loss.item())
        print(f"Epoch={e} Loss={loss} test loss={test_loss}")
        print(model_0.state_dict())

# --- After the loop: plot (plt works with NumPy arrays) ---
plt.plot(epochs_c, torch.tensor(train_loss_count).numpy(), label="train loss")
plt.plot(epochs_c, torch.tensor(test_loss_count).numpy(), label="test loss")
plt.title("loss curve")
plt.ylabel("loss")
plt.xlabel("epoch")
plt.legend()
```

![](https://i.imgur.com/qGvxJSo.png)

---

## Saving the model

![](https://i.imgur.com/4INgRlg.png)

```python=
# NOTE: LinearRegressionModelV2, MODEL_SAVE_PATH, X_test and y_preds are assumed to be defined earlier
# Instantiate a fresh instance of LinearRegressionModelV2
loaded_model_1 = LinearRegressionModelV2()
# Load model state dict
loaded_model_1.load_state_dict(torch.load(MODEL_SAVE_PATH))
# Put model to target device (if your data is on GPU, model will have to be on GPU to make predictions)
loaded_model_1.to(device)
print(f"Loaded model:\n{loaded_model_1}")
print(f"Model on device:\n{next(loaded_model_1.parameters()).device}")

# Evaluate loaded model
loaded_model_1.eval()
with torch.inference_mode():
    loaded_model_1_preds = loaded_model_1(X_test)
y_preds == loaded_model_1_preds
```

```python=
# In Colab, save the {entire model} as tt.pt
torch.save(model_0, "tt.pt")
# In Colab, save the {model parameters} as qq.pt
torch.save(model_0.state_dict(), "qq.pt")
```

---

## Loading the model

![](https://i.imgur.com/YAXDifz.png)

```python=
# Create a new instance of the linear regression model
new_model = LR()
# Load the saved parameters (the qq.pt file)
new_model.load_state_dict(torch.load("qq.pt"))
```

The original model and the newly created model now have the same parameters.

![](https://i.imgur.com/R1Z3woq.png)

Predicting on test_X with both gives the same result.

![](https://i.imgur.com/DKdzZ4z.png)

---

# {Binary classification} Classification

![](https://i.imgur.com/D6b9LhN.png)
![](https://i.imgur.com/5IdTgqb.png)

**Note: batch_size (how many images or samples the algorithm processes at once) is usually set to 32, which generally works well.**

## Common classification hyperparameters (set by you)

![](https://i.imgur.com/CO704hI.png)
![](https://i.imgur.com/5wAWiDg.png)

## Toy data + visualization

Create the data used for analysis.

```python=
from sklearn.datasets import make_circles

n = 1000
X, y = make_circles(n,                # X holds the features, y the labels (0 or 1)
                    noise=0.03,
                    random_state=42)  # seed
```

Visualizing as a table -> nothing stands out.

```python=
import pandas as pd

# Build the table from a dictionary
test_table = pd.DataFrame({"X1": X[:, 0],
                           "X2": X[:, 1],
                           "Y": y})
```

![](https://i.imgur.com/73lAYre.png)

Visualizing as a plot -> the points clearly fall into two classes.

```python=
import matplotlib.pyplot as plt

plt.scatter(X[:, 0],
            X[:, 1],
            c=y,                  # color each point by its label y (0 or 1)
            cmap=plt.cm.RdYlGn)   # choose the colormap
```

![](https://i.imgur.com/llmI8OX.png)

## [Avoid errors later] Check shape and dtype, then convert to tensors

Both start as NumPy arrays, whose default float dtype is float64.

![](https://i.imgur.com/JV2m7TC.png)

Convert to tensors with dtype=float32, which is used for computation.

```python=
# Turn data into tensors
# Otherwise this causes issues with computations later on
import torch

X = torch.from_numpy(X).type(torch.float)
y = \
torch.from_numpy(y).type(torch.float)
```

![](https://i.imgur.com/DJPKrnW.png)

```python=
from sklearn.model_selection import train_test_split  # utility that randomly splits train and test sets

X_train, X_test, y_train, y_test = train_test_split(
    X,
    y,
    test_size=0.2,    # 20% of the data becomes the test set
    random_state=42)  # seed
```

## Building the model

![](https://i.imgur.com/DBb3LqZ.png)

```python=
from torch import nn

device = "cuda" if torch.cuda.is_available() else "cpu"
```

![](https://i.imgur.com/DTAYIuL.png)

```python=
# 1. Construct a model class that subclasses nn.Module
class classcificationModel_1(nn.Module):
    def __init__(self):
        super().__init__()
        # 2. Create 2 nn.Linear layers capable of handling X and y input and output shapes
        self.layer_1 = nn.Linear(in_features=2, out_features=5)  # takes in 2 features (matching X of shape (800, 2)), produces 5 features (more units give a chance to learn more)
        self.layer_2 = nn.Linear(in_features=5, out_features=1)  # takes in 5 features, produces 1 feature (matching y)

    # 3. Define a forward method containing the forward pass computation
    def forward(self, x):
        # Return the output of layer_2, a single feature, the same shape as y
        return self.layer_2(self.layer_1(x))  # x goes through layer 1, and the result goes through layer 2
```

Move it to the device:

![](https://i.imgur.com/EecfjuA.png)

---

## A simpler way to build the model

**nn.Sequential** = data is passed through the layers one by one, **in order**.

```python=
# Build the model with nn.Sequential
class classcificationModel_2(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer_s = nn.Sequential(
            nn.Linear(in_features=2, out_features=5),
            nn.Linear(in_features=5, out_features=1)
        )

    def forward(self, x):
        return self.layer_s(x)
```

![](https://i.imgur.com/iJ5Bh7E.png)

---

## Visualizing the model with TensorFlow Playground

![](https://i.imgur.com/BxeC6CY.png)

---

## Predicting with the untrained model

```python=
with torch.inference_mode():
    predict_no_train = model_1(X_test.to(device))
```

![](https://i.imgur.com/I4hvrdE.png)

---

## Loss function and optimizer

![](https://i.imgur.com/kBAjx7W.png)
![](https://i.imgur.com/oVduAMw.png)

Logits layer: in the context of deep learning, the logits layer means the layer **that feeds into softmax** (or other such
normalization). The outputs of the softmax are the probabilities for the classification task, and its input is the logits layer.

```python=
loss_fn = nn.BCEWithLogitsLoss()  # BCELoss combined with a sigmoid; numerically more stable, so prefer it whenever you use BCE
optimizer = torch.optim.SGD(params=model_1.parameters(),  # the parameters of this model are adjusted during optimization
                            lr=0.1)                       # learning rate
```

---

## Using an activation function (sigmoid for binary classification) to get probabilities

![](https://i.imgur.com/nnhtdTT.png)
![](https://i.imgur.com/ogz7OL8.png)

The activation function turns logits (the raw output of the model) into prediction probabilities.

![](https://i.imgur.com/inzeRM8.png)
![](https://i.imgur.com/ktbgrkt.png)
![](https://i.imgur.com/DZCo3Zu.png)
![](https://i.imgur.com/NinjQPe.png)
![](https://i.imgur.com/QXo7ym2.png)

```python=
# Condensed into one line
y_pred_labels = torch.round(torch.sigmoid(model_0(X_test.to(device))[:5]))
```

Squeezing out the size-1 dimension makes the comparison easier.

![](https://i.imgur.com/MHtLKXC.png)

---

## Training (flawed: the model has no non-linearity)

```python=
epochs = 100

# First move the train and test data to the device
X_train, y_train = X_train.to(device), y_train.to(device)
X_test, y_test = X_test.to(device), y_test.to(device)

# NOTE: ac is an accuracy function assumed to be defined elsewhere (returning a percentage)
for e in range(epochs):
    # Training
    model_1.train()

    # 1. Forward pass (model outputs raw logits)
    y_logits = model_1(X_train).squeeze()  # squeeze to remove extra `1` dimensions, this won't work unless model and data are on same device
    y_pred = torch.round(torch.sigmoid(y_logits))  # turn logits -> pred probs -> rounded pred labels

    # 2. Calculate loss/accuracy
    loss = loss_fn(y_logits,  # pass logits here because BCEWithLogitsLoss applies the sigmoid itself; with plain BCELoss, pass torch.sigmoid(y_logits)
                   y_train)
    accuracy = ac(y_true=y_train, y_pred=y_pred)

    # 3. Optimizer zero grad
    optimizer.zero_grad()
    # 4. Loss backward
    loss.backward()
    # 5. Optimizer step (adjust the parameters)
    optimizer.step()

    ### Testing: check performance on the test set
    model_1.eval()
    with torch.inference_mode():
        # 1. Forward pass
        test_logits = model_1(X_test).squeeze()
        test_pred = torch.round(torch.sigmoid(test_logits))
        # 2.
        #    Calculate loss/accuracy
        test_loss = loss_fn(test_logits, y_test)
        test_acc = ac(y_true=y_test, y_pred=test_pred)

    if e % 10 == 0:
        print(f"Epoch: {e} | Loss: {loss:.5f}, Accuracy: {accuracy:.2f}% | Test loss: {test_loss:.5f}, Test acc: {test_acc:.2f}%")
```

No better than guessing.

![](https://i.imgur.com/AKVt5Lh.png)
![](https://i.imgur.com/1omdNF4.png)

---

## Visualizing the problem

Use the visualization tools written by the tutorial author: import the ready-made functions from GitHub (use the raw version of the file).

```python=
import requests
from pathlib import Path

# Download helper functions from Learn PyTorch repo (if not already downloaded)
if Path("helper_functions.py").is_file():
    print("helper_functions.py already exists, skipping download")
else:
    print("Downloading helper_functions.py")
    request = requests.get("https://raw.githubusercontent.com/mrdbourke/pytorch-deep-learning/main/helper_functions.py")
    with open("helper_functions.py", "wb") as f:  # "wb" = write in binary mode
        f.write(request.content)

from helper_functions import plot_predictions, plot_decision_boundary
```

**The decision boundary is a straight line, because the model only uses nn.Linear.**

![](https://i.imgur.com/A1rexzC.png)
![](https://i.imgur.com/S5theUW.png)

---

## {Important} How to improve the model

Note: "layer" refers to nn.Linear.
Note: "hidden units" refers to using larger in/out feature counts.
Note: setting lr too high can cause exploding gradients.

![](https://i.imgur.com/f5Z1LHn.png)
![](https://i.imgur.com/52CR8b8.png)
![](https://i.imgur.com/Lu4O1p5.png)

There are two angles for improving a model:

1. Work on the model (as above)
2. Work on the data

Building a more complex model:

1. More layers
2. More hidden units

```python=
class CircleModelV1(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer_1 = nn.Linear(in_features=2, out_features=10)
        self.layer_2 = nn.Linear(in_features=10, out_features=10)  # extra layer
        self.layer_3 = nn.Linear(in_features=10, out_features=1)

    def forward(self, x):  # note: always make sure forward is spelt correctly!
        # Creating a model like this is the same as below, though below
        # generally benefits from speedups where possible.
        # z = self.layer_1(x)
        # z = self.layer_2(z)
        # z = self.layer_3(z)
        # return z
        return self.layer_3(self.layer_2(self.layer_1(x)))

model_1 = CircleModelV1().to(device)
model_1
```

![](https://i.imgur.com/2ti0D6B.png)

---

## Building a model with non-linearity

**Create intricate patterns**

Use an activation function: ReLU (everything below 0 becomes 0).
https://ithelp.ithome.com.tw/articles/10304438?sc=iThelpR

Build the model:

```python=
# 1. Construct a model class that subclasses nn.Module
class classcificationModel_nonL(nn.Module):
    def __init__(self):
        super().__init__()
        self.z = nn.Sequential(
            nn.Linear(in_features=2, out_features=8),
            nn.ReLU(),
            nn.Linear(in_features=8, out_features=64),
            nn.ReLU(),
            nn.Linear(in_features=64, out_features=1)
        )

    def forward(self, x):
        return self.z(x)
```

Loss function and optimizer:

```python=
loss_fn = nn.BCEWithLogitsLoss()  # BCELoss combined with a sigmoid; numerically more stable, so prefer it whenever you use BCE
optimizer = torch.optim.SGD(params=model_2.parameters(),  # the parameters of this model are adjusted during optimization
                            lr=0.1)                       # learning rate
```

Training:

```python=
epochs = 1000

# First move the train and test data to the device
X_train, y_train = X_train.to(device), y_train.to(device)
X_test, y_test = X_test.to(device), y_test.to(device)

# NOTE: ac is an accuracy function assumed to be defined elsewhere (returning a percentage)
for e in range(epochs):
    # Training
    model_2.train()

    # 1. Forward pass (model outputs raw logits)
    y_logits = model_2(X_train).squeeze()  # squeeze to remove extra `1` dimensions, this won't work unless model and data are on same device
    y_pred = torch.round(torch.sigmoid(y_logits))  # turn logits -> pred probs -> rounded pred labels

    # 2. Calculate loss/accuracy
    loss = loss_fn(y_logits,  # pass logits here because BCEWithLogitsLoss applies the sigmoid itself; with plain BCELoss, pass torch.sigmoid(y_logits)
                   y_train)
    accuracy = ac(y_true=y_train, y_pred=y_pred)

    # 3. Optimizer zero grad
    optimizer.zero_grad()
    # 4. Loss backward
    loss.backward()
    # 5. Optimizer step (adjust the parameters)
    optimizer.step()

    ### Testing: check performance on the test set
    model_2.eval()
    with torch.inference_mode():
        # 1. Forward pass
        test_logits = model_2(X_test).squeeze()
        test_pred = torch.round(torch.sigmoid(test_logits))
        # 2. Calculate loss/accuracy
        test_loss = loss_fn(test_logits, y_test)
        test_acc = ac(y_true=y_test, y_pred=test_pred)

    if e % 10 == 0:
        print(f"Epoch: {e} | Loss: {loss:.5f}, Accuracy: {accuracy:.2f}% | Test loss: {test_loss:.5f}, Test acc: {test_acc:.2f}%")
```

![](https://i.imgur.com/NvYkrDI.png)
![](https://i.imgur.com/3boHzy0.png)

---

# {Multi-class classification} multi_classification

![](https://i.imgur.com/9KfklgG.png)

## Create the dataset -> convert to tensors and check dtypes -> split into train and test sets

```python=
# Import dependencies
import torch
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split

# Set the hyperparameters for data creation
NUM_CLASSES = 4
NUM_FEATURES = 2
RANDOM_SEED = 42

# 1. Create multi-class data
X_blob, y_blob = make_blobs(n_samples=1000,
                            n_features=NUM_FEATURES,  # X features
                            centers=NUM_CLASSES,      # y labels
                            cluster_std=1.5,          # give the clusters a little shake up (try changing this to 1.0, the default)
                            random_state=RANDOM_SEED)

# 2. Turn data into tensors
X_blob = torch.from_numpy(X_blob).type(torch.float)
y_blob = torch.from_numpy(y_blob).type(torch.LongTensor)
print(X_blob[:5], y_blob[:5])

# 3. Split into train and test sets
X_blob_train, X_blob_test, y_blob_train, y_blob_test = train_test_split(X_blob,
                                                                        y_blob,
                                                                        test_size=0.2,
                                                                        random_state=RANDOM_SEED)

# 4. Plot data
plt.figure(figsize=(10, 7))
plt.scatter(X_blob[:, 0], X_blob[:, 1], c=y_blob, cmap=plt.cm.RdYlBu);
```

![](https://i.imgur.com/Fz3uo8r.png)

---

## Device code

```python=
device = "cuda" if torch.cuda.is_available() else "cpu"
```

---

## Building the model

My version:

```python=
class M_Classcification(nn.Module):
    def __init__(self):
        super().__init__()
        self.z = nn.Sequential(
            nn.Linear(in_features=2, out_features=8),
            nn.ReLU(),
            nn.Linear(in_features=8, out_features=64),
            nn.ReLU(),
            nn.Linear(in_features=64, out_features=4)
        )

    def forward(self, x):
        return self.z(x)
```

A flexible version configured through arguments:

```python=
from torch import nn

# Build model
class BlobModel(nn.Module):
    def __init__(self, input_features, output_features, hidden_units=8):
        """Initializes all required hyperparameters for a multi-class classification model.

        Args:
            input_features (int): Number of input features to the model.
            output_features (int): Number of output features of the model
                (how many classes there are).
            hidden_units (int): Number of hidden units between layers, default 8.
        """
        super().__init__()
        self.linear_layer_stack = nn.Sequential(
            nn.Linear(in_features=input_features, out_features=hidden_units),
            # nn.ReLU(),  # <- does our dataset require non-linear layers? (try uncommenting and see if the results change)
            nn.Linear(in_features=hidden_units, out_features=hidden_units),
            # nn.ReLU(),  # <- does our dataset require non-linear layers? (try uncommenting and see if the results change)
            nn.Linear(in_features=hidden_units, out_features=output_features),  # how many classes are there?
        )

    def forward(self, x):
        return self.linear_layer_stack(x)

# Create an instance of BlobModel and send it to the target device
model_4 = BlobModel(input_features=NUM_FEATURES,
                    output_features=NUM_CLASSES,
                    hidden_units=8).to(device)
```

![](https://i.imgur.com/3xheZWB.png)

---

## Loss function and optimizer

Multi-class: cross entropy (for an imbalanced dataset, the `weight` argument can help)
Binary: BCE

```python=
loss_fn = nn.CrossEntropyLoss()
op = torch.optim.Adam(model_3.parameters(),
                      lr=0.1)
```

---

## logits -> pred probs -> pred labels

![](https://i.imgur.com/stVFv0Y.png)

Multi-class: softmax
Binary: sigmoid (often written as σ)

![](https://i.imgur.com/v9z30Ls.png)

The softmax `dim` argument: https://blog.csdn.net/Will_Ye/article/details/104994504
How to decide which `dim` to use: https://blog.csdn.net/qq_27261889/article/details/88613932

---

## Train loop

**Troubleshooting: `nn.CrossEntropyLoss` expects class-index labels of type LongTensor (`torch.long`), so convert the labels accordingly.**

```python=
epochs = 1000

X_blob_train, y_blob_train = X_blob_train.to(device), y_blob_train.to(device)
X_blob_test, y_blob_test = X_blob_test.to(device), y_blob_test.to(device)

for e in range(epochs):
    model_3.train()

    y_logits = model_3(X_blob_train)
    y_pred = torch.softmax(y_logits, dim=1).argmax(dim=1)

    loss = loss_fn(y_logits, y_blob_train)
    op.zero_grad()
    loss.backward()
    op.step()

    model_3.eval()
    with torch.inference_mode():
        test_logits = model_3(X_blob_test)
        test_pred = torch.softmax(test_logits, dim=1).argmax(dim=1)
        test_loss = loss_fn(test_logits, y_blob_test)

    if e % 100 == 0:
        print(f"Epoch:{e} || loss:{loss:.5f} || test_loss:{test_loss:.5f}")
```

![](https://i.imgur.com/59zh5fW.png)

---

## Evaluating the model

![](https://i.imgur.com/InVVEPH.png)
![](https://i.imgur.com/8VctVC5.png)

```python=
model_3.eval()
with torch.inference_mode():
    y_p = model_3(X_blob_test)
    y_p = torch.softmax(y_p, dim=1).argmax(dim=1)

# Evaluation
try:
    from torchmetrics import Accuracy
except ImportError:
    # Notebook install; the `task=` argument used below needs torchmetrics >= 0.11
    # (changelog: https://torchmetrics.readthedocs.io/en/stable/generated/CHANGELOG.html#changelog)
    !pip install torchmetrics
    from torchmetrics import Accuracy

# Setup metric and make sure it's on the target device
torchmetrics_accuracy = Accuracy(task='multiclass', num_classes=4).to(device)

# Calculate accuracy
torchmetrics_accuracy(y_p, y_blob_test)
```

---

# Computer Vision

![](https://i.imgur.com/BpllGlh.png)
![](https://i.imgur.com/z9XGPwM.png)
![](https://i.imgur.com/j2X0GQH.png)

## CNN hyperparameters + architecture + libraries + imports

![](https://i.imgur.com/YRUjNQs.png)
![](https://i.imgur.com/3TtpOft.png)

```python=
import torch
from torch import nn

import torchvision
from torchvision import datasets
from torchvision import transforms
from torchvision.transforms import ToTensor  # converts images to tensors

import matplotlib.pyplot as plt

# Check versions
print(torch.__version__)
print(torchvision.__version__)
```

---

## Get Dataset

Note: PyTorch often uses "target" to mean label.

Load FashionMNIST:

```python=
train_data = datasets.FashionMNIST(
    root="data",           # where to download the data to
    train=True,            # True gets the training data, False gets the test data
    download=True,         # download data if it doesn't exist on disk
    transform=ToTensor(),  # convert the images to tensors
    target_transform=None  # whether to transform the targets (labels)
)

test_data = datasets.FashionMNIST(
    root="data",
    train=False,           # True gets the training data, False gets the test data
    download=True,
    transform=ToTensor(),
    target_transform=None
)
```

Images and their labels:

![](https://i.imgur.com/eCzXdHc.png)
![](https://i.imgur.com/zMBM657.png)

Inspect the classes:

![](https://i.imgur.com/oVmSMog.png)
![](https://i.imgur.com/LwLJdCt.png)

---

## Random sampling to visualize the image dataset

Aside: drawing random integers in torch; `size` sets how many are generated.

![](https://i.imgur.com/Pwj4nFT.png)
![](https://i.imgur.com/16x6GYr.png)

```python=
# NOTE: class_name is assumed to be the list of class names shown above
fig = plt.figure(figsize=(9, 9))  # create the base figure
rows, cols = 4, 4
for i in range(1, rows * cols + 1):
    ran_idx = torch.randint(0, len(train_data), size=[1]).item()
    img, label = train_data[ran_idx]
    fig.add_subplot(rows, cols, i)
    plt.imshow(img.squeeze(), cmap="gray")  # imshow expects height × width for grayscale, so squeeze out the size-1 color-channel dimension
    plt.title(class_name[label])
    plt.axis(False)
```

![](https://i.imgur.com/NPAvhNf.png)

---

## Data Loader

Splitting the data into batches makes computation easier: with 32 images per batch, the parameters are updated once per batch rather than once per pass over the whole dataset, which is more efficient.

![](https://i.imgur.com/KUe00cQ.png)
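As a rough sketch of what batching does, the example below uses a small synthetic dataset in place of FashionMNIST (the 100 random "images", their labels, and the variable names are illustrative assumptions). It shows the shape of one batch and how `drop_last` changes the number of batches when the dataset size is not a multiple of the batch size:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for an image dataset (assumption): 100 grayscale 28x28 images
images = torch.rand(100, 1, 28, 28)
labels = torch.randint(0, 10, (100,))
dataset = TensorDataset(images, labels)

BATCH_SIZE = 32

# Keep the final, smaller batch (100 = 32 + 32 + 32 + 4)
loader = DataLoader(dataset, batch_size=BATCH_SIZE, shuffle=True)
# Discard the final batch because it has fewer than BATCH_SIZE samples
loader_drop = DataLoader(dataset, batch_size=BATCH_SIZE, shuffle=True, drop_last=True)

# One batch: [batch, color channels, height, width] and [batch]
batch_images, batch_labels = next(iter(loader))

n_batches = len(loader)            # number of batches including the partial one
n_batches_drop = len(loader_drop)  # number of full batches only

print(batch_images.shape, batch_labels.shape, n_batches, n_batches_drop)
```

In a training loop, one optimizer step is typically taken per batch, so `len(loader)` is also the number of parameter updates per epoch.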
![](https://i.imgur.com/MztDfXL.png)
![](https://i.imgur.com/NKRGhRp.png)

Note: shuffling the training set helps keep the model from memorizing the order; for the test set it makes no difference.
Note: hyperparameters are usually written in all capitals.
Note: `num_workers` => how many worker processes (cores) are used to load data.
Note: `drop_last` => whether to discard the last batch when it has fewer than `batch_size` samples.

```python=
from torch.utils.data import DataLoader

# Hyperparameter
BATCH_SIZE = 32

train_dataloader = DataLoader(train_data, batch_size=BATCH_SIZE, shuffle=True)
test_dataloader = DataLoader(test_data, batch_size=BATCH_SIZE, shuffle=False)
```

![](https://i.imgur.com/Hxy2aeS.png)

Note: `next(iter(train_dataloader))` takes the next batch of data from `train_dataloader`. `iter(train_dataloader)` creates an iterator over the dataloader, and `next()` pulls the next element from that iterator, i.e. the next batch.

![](https://i.imgur.com/0wlZagE.png)

Note: [32, 1, 28, 28] = [batch, color channels, height, width]

![](https://i.imgur.com/fHmvH6k.png)
![](https://i.imgur.com/YkAgRmP.png)

---

## Building a baseline model + training (performs worse than the CNN)

### Baseline model

![](https://i.imgur.com/G8xxQCV.png)

**nn.Flatten():**

1. Flattens a multi-dimensional input into a one-dimensional output; commonly used in the transition from convolutional layers to fully connected layers.
2. nn.Flatten() compresses the dimensions of a tensor into a single vector. The nn.Flatten() layer took our shape from [color_channels, height, width] to [color_channels, height*width]. Why do this? Because we've now turned our pixel data from height and width dimensions into one long feature vector. And **nn.Linear() layers like their inputs to be in the form of feature vectors**.

![](https://i.imgur.com/9I3OoS1.png)
![](https://i.imgur.com/DPm81j4.png)
![](https://i.imgur.com/67lpXok.png)

### loss_fn + optimizer

```python=
# Setup loss function and optimizer
loss_fn = nn.CrossEntropyLoss()  # this is also called "criterion"/"cost function" in some places
optimizer = torch.optim.SGD(params=model_0.parameters(), lr=0.1)
```

### Measuring run time

![](https://i.imgur.com/xCQ55Bv.png)

```python=
from timeit import default_timer as timer

def print_train_time(start: float, end: float, device: torch.device = None):
    """Prints difference between start and end time.

    Args:
        start (float): Start time of computation (preferred in timeit format).
        end (float): End time of computation.
        device ([type], optional): Device that compute is running on. Defaults to None.

    Returns:
        float: time between start and end in seconds (higher is longer).
    """
    total_time = end - start
    print(f"Train time on {device}: {total_time:.3f} seconds")
    return total_time
```

![](https://i.imgur.com/JkW9YqI.png)

### Baseline model train loop

![](https://i.imgur.com/1Is9uVG.png)

```python=
# Import tqdm for progress bar
from tqdm.auto import tqdm

# NOTE: accuracy_fn is assumed to be imported from helper_functions (see the helper download section)

# Set the seed and start the timer
torch.manual_seed(42)
train_time_start_on_cpu = timer()

# Set the number of epochs (we'll keep this small for faster training times)
epochs = 3

# Create training and testing loop
for epoch in tqdm(range(epochs)):
    print(f"Epoch: {epoch}\n-------")
    ### Training
    train_loss = 0
    # train_dataloader holds 1875 batches (32 samples each)
    for batch, (X, y) in enumerate(train_dataloader):
        model_0.train()
        # 1. Forward pass
        y_pred = model_0(X)

        # 2. Calculate loss (per batch)
        loss = loss_fn(y_pred, y)
        train_loss += loss  # accumulatively add up the loss per epoch

        # 3. Optimizer zero grad
        optimizer.zero_grad()

        # 4. Loss backward
        loss.backward()

        # 5. Optimizer step
        optimizer.step()

        # batch * 32 out of 60000 samples in total
        if batch % 400 == 0:
            print(f"Looked at {batch * len(X)}/{len(train_dataloader.dataset)} samples")

    # Divide total train loss by the 1875 batches (average loss per batch per epoch)
    train_loss /= len(train_dataloader)

    ### Testing
    # Setup variables for accumulatively adding up loss and accuracy
    test_loss, test_acc = 0, 0
    model_0.eval()
    with torch.inference_mode():
        for X, y in test_dataloader:
            # 1. Forward pass
            test_pred = model_0(X)

            # 2. Calculate loss (accumulatively)
            test_loss += loss_fn(test_pred, y)  # accumulatively add up the loss per epoch

            # 3. Calculate accuracy (preds need to be same as y_true)
            test_acc += accuracy_fn(y_true=y, y_pred=test_pred.argmax(dim=1))

        # Calculations on test metrics need to happen inside torch.inference_mode()
        # Divide total test loss by length of test dataloader (per batch)
        test_loss /= len(test_dataloader)

        # Divide total accuracy by length of test dataloader (per batch)
        test_acc /= len(test_dataloader)

    ## Print out what's happening
    print(f"\nTrain loss: {train_loss:.5f} | Test loss: {test_loss:.5f}, Test acc: {test_acc:.2f}%\n")

# Calculate training time
train_time_end_on_cpu = timer()
total_train_time_model_0 = print_train_time(start=train_time_start_on_cpu,
                                            end=train_time_end_on_cpu,
                                            device=str(next(model_0.parameters()).device))
```

80% accuracy => acceptable.

![](https://i.imgur.com/zkmWmEZ.png)

---

## Baseline model: making predictions

![](https://i.imgur.com/4vB0P4A.png)

---

## Why the GPU is sometimes not faster than the CPU

![](https://i.imgur.com/caN1DlB.png)

## Building a CNN

https://poloclub.github.io/cnn-explainer/

![](https://i.imgur.com/H8EIoVr.png)
![](https://i.imgur.com/92LTel0.png)
![](https://i.imgur.com/QMVKKU8.png)
![](https://i.imgur.com/6TYbJSU.png)
![](https://i.imgur.com/knFJcgo.png)

```python=
class CNN(nn.Module):
    def __init__(self, input_shape: int, hidden_unit: int, output_shape: int):
        super().__init__()
        self.block1 = nn.Sequential(
            nn.Conv2d(in_channels=input_shape,  # our images are 2-D, so use Conv2d (1-D and 3-D versions also exist)
                      out_channels=hidden_unit,
                      kernel_size=3,  # size of the sliding window
                      stride=1,       # how many pixels to move per step
                      padding=1),     # 1 => pad the border so the image edges are not "trimmed" (edge pixels never sit at the kernel center, so without padding much of the border information is lost)
            nn.ReLU(),
            nn.Conv2d(in_channels=hidden_unit,
                      out_channels=hidden_unit,
                      kernel_size=3,
                      stride=1,
                      padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2,  # the pooling window is 2×2
                         stride=2)       # default stride value is same as kernel_size
        )
        self.block2 = nn.Sequential(
            nn.Conv2d(in_channels=hidden_unit,  # our images are 2-D, so use Conv2d (1-D and 3-D versions also exist)
                      out_channels=hidden_unit,
                      kernel_size=3,  # size of the sliding window
                      stride=1,       # how many pixels to move per step
                      padding=1),     # pad the border so the image edges are not "trimmed"
            nn.ReLU(),
            nn.Conv2d(in_channels=hidden_unit,
                      out_channels=hidden_unit,
                      kernel_size=3,
                      stride=1,
                      padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2,  # pooling window is 2x2
                         stride=2)  # default stride value is same as kernel_size
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(in_features=hidden_unit*7*7,
                      out_features=output_shape)
        )

    def forward(self, x: torch.Tensor):
        x = self.block1(x)
        print(x.shape)  # torch.Size([10, 14, 14]) for a single unbatched image
        x = self.block2(x)
        print(x.shape)  # torch.Size([10, 7, 7])
        # At this point the input to Flatten is [10, 7*7]; multiplying it directly would need [49, 10].
        # If the input is unsqueezed first (adding a batch dimension), it becomes [1, 490] x [490, 10],
        # which is why in_features is hidden_unit*7*7.
        x = self.classifier(x)
        print(x.shape)
        return x
```

Loss function and optimizer:

```python=
loss_fn = nn.CrossEntropyLoss()
op = torch.optim.SGD(params=model_2.parameters(), lr=0.1)
```

## Modular accuracy calculation (helper_fn)

```python=
import requests
from pathlib import Path

# Download helper functions from Learn PyTorch repo (if not already downloaded)
if Path("helper_functions.py").is_file():
    print("helper_functions.py already exists, skipping download")
else:
    print("Downloading helper_functions.py")
    request = requests.get("https://raw.githubusercontent.com/mrdbourke/pytorch-deep-learning/main/helper_functions.py")
    with open("helper_functions.py", "wb") as f:  # "wb" opens the file for writing in binary mode
        f.write(request.content)

from helper_functions import plot_predictions, plot_decision_boundary
from helper_functions import accuracy_fn
```

## Modular evaluation

```python=
# Note: the name `eval` shadows Python's built-in eval(); it is kept because later cells call it.
def eval(model: torch.nn.Module,
         data_loader: torch.utils.data.DataLoader,
         loss_fn: torch.nn.Module,
         accuracy_fn,
         device: torch.device = device):
    loss, acc = 0, 0
    model.eval()
    with torch.inference_mode():
        for X, y in data_loader:
            X, y = X.to(device), y.to(device)
            pre = model(X)
            loss += loss_fn(pre, y)
            acc += accuracy_fn(y_true=y, y_pred=pre.argmax(dim=1))
        loss /= len(data_loader)  # divide by the number of batches (e.g. 1875)
        acc /= len(data_loader)
    return {"model_name": model.__class__.__name__,
            "model_loss": loss.item(),
            "model_acc": acc}
```

## Modular train loop

```python=
def train_step(model: torch.nn.Module,  # type hints are optional, but make the code easier to read
               data_loader: torch.utils.data.DataLoader,
               loss_fn: torch.nn.Module,
               optimizer: torch.optim.Optimizer,
               accuracy_fn,
               device: torch.device = device):
    train_loss, train_acc = 0, 0
    model.train()
    for batch, (X, y) in enumerate(data_loader):
        X, y = X.to(device), y.to(device)
        pre = model(X)
        loss = loss_fn(pre, y)
        train_loss += loss.item()  # accumulate a plain float so the graph is not kept alive
        train_acc += accuracy_fn(y_true=y, y_pred=pre.argmax(dim=1))  # logits -> prediction labels
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    train_loss /= len(data_loader)
    train_acc /= len(data_loader)
    print(f"\nTrain loss: {train_loss:.5f} | Train acc:{train_acc:.2f}%\n")
```

## Modular test loop

```python=
def test_step(model: torch.nn.Module,
              data_loader: torch.utils.data.DataLoader,
              loss_fn: torch.nn.Module,
              accuracy_fn,
              device: torch.device = device):
    test_loss, test_acc = 0, 0
    model.eval()
    with torch.inference_mode():
        for X, y in data_loader:
            X, y = X.to(device), y.to(device)
            pre = model(X)  # outputs raw logits
            test_loss += loss_fn(pre, y)
            test_acc += accuracy_fn(y_true=y, y_pred=pre.argmax(dim=1))  # logits -> prediction labels
        # Tensors created inside inference_mode() can only be modified in place inside it,
        # so the averaging has to happen here.
        test_loss /= len(data_loader)
        test_acc /= len(data_loader)
    print(f"Test loss: {test_loss:.5f} || Test acc:{test_acc:.2f}%\n")
```

---

## Train Loop+eval

Training:

```python=
from tqdm.auto import tqdm

train_time_start_model = timer()
epochs = 3
for epoch in tqdm(range(epochs)):
    print(f"Epoch : {epoch}\n------")
    train_step(model=model_C,
               data_loader=train_dataloader,
               loss_fn=loss_fn,
               optimizer=op,
               accuracy_fn=accuracy_fn,
               device=device)
    test_step(model=model_C,
              data_loader=test_dataloader,
              loss_fn=loss_fn,
              accuracy_fn=accuracy_fn,
              device=device)
train_time_end_model = timer()
total_time = print_train_time(start=train_time_start_model,
                              end=train_time_end_model,
                              device=device)
```

![](https://i.imgur.com/7pZ1S8L.png)

Evaluation:

```python=
model_C_result = eval(
    model=model_C,
    data_loader=test_dataloader,
    loss_fn=loss_fn,
    accuracy_fn=accuracy_fn,
    device=device
)
```

![](https://i.imgur.com/SZ0YwtP.png)

---
## Prediction + confusion-matrix evaluation

A confusion matrix is suited to evaluating classification models.

Reference: https://www.learnpytorch.io/03_pytorch_computer_vision/

![](https://i.imgur.com/hnQKIzY.png)

---

## save&load

https://www.learnpytorch.io/03_pytorch_computer_vision/

---

# Building your own image classifier

https://www.learnpytorch.io/04_pytorch_custom_datasets/

![](https://i.imgur.com/XlQLNkv.png)
![](https://i.imgur.com/aT3e2LM.png)

```python=
import torch
from torch import nn
import matplotlib.pyplot as plt

print(torch.__version__)
device = "cuda" if torch.cuda.is_available() else "cpu"
```

---

## Downloading the dataset (from GitHub)

![](https://i.imgur.com/wR31sXg.png)

```python=
import requests
import zipfile
from pathlib import Path

# Setup path to data folder
data_path = Path("data/")
image_path = data_path / "pizza_steak_sushi"

# If the image folder doesn't exist, download it and prepare it...
if image_path.is_dir():
    print(f"{image_path} directory exists.")
else:
    print(f"Did not find {image_path} directory, creating one...")
    image_path.mkdir(parents=True, exist_ok=True)

    # Download pizza, steak, sushi data
    with open(data_path / "pizza_steak_sushi.zip", "wb") as f:
        request = requests.get("https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi.zip")
        print("Downloading pizza, steak, sushi data...")
        f.write(request.content)

    # Unzip pizza, steak, sushi data
    with zipfile.ZipFile(data_path / "pizza_steak_sushi.zip", "r") as zip_ref:
        print("Unzipping pizza, steak, sushi data...")
        zip_ref.extractall(image_path)
```

---

## Data preparation & exploration

Use `os` to inspect every folder:

```python=
import os

def walk_through_dir(dir_path):
    """
    Walks through dir_path returning its contents.
    Args:
      dir_path (str or pathlib.Path): target directory

    Returns:
      A print out of:
        number of subdirectories in dir_path
        number of images (files) in each subdirectory
        name of each subdirectory
    """
    for dirpath, dirnames, filenames in os.walk(dir_path):
        print(f"There are {len(dirnames)} directories and {len(filenames)} images in '{dirpath}'.")
```

![](https://i.imgur.com/BsTZzYh.png)

Set up the paths:

```python=
train_p = image_path / "train"
test_p = image_path / "test"
```

![](https://i.imgur.com/ERIn6Wb.png)

Visualize the contents:

![](https://i.imgur.com/SlmF4xi.png)

```python=
import random
from PIL import Image

# 1. Get all image paths (* means "any combination"); glob gathers every path matching the pattern
image_path_list = list(image_path.glob("*/*/*.jpg"))
```

![](https://i.imgur.com/6S5RYxN.png)

```python
# 2. Get random image path
random_image_path = random.choice(image_path_list)

# 3. Get image class from path name (the image class is the name of the directory where the image is stored)
image_class = random_image_path.parent.stem

# 4. Open image
img = Image.open(random_image_path)

# 5.
# Print metadata
print(f"Random image path: {random_image_path}")
print(f"Image class: {image_class}")
print(f"Image height: {img.height}")
print(f"Image width: {img.width}")
img
```

![](https://i.imgur.com/kctERza.png)

Visualize with matplotlib:

```python=
import numpy as np
import matplotlib.pyplot as plt

# Turn the image into an array
img_as_array = np.asarray(img)
print(img_as_array)

# Plot the image with matplotlib
plt.figure(figsize=(10, 7))
plt.imshow(img_as_array)
plt.title(f"Image class: {image_class} | Image shape: {img_as_array.shape} -> [height, width, color_channels]")
plt.axis(False);
```

![](https://i.imgur.com/OBELROH.png)

Note: pay attention to the shape to avoid errors.

![](https://i.imgur.com/nOZZJJz.png)

---

## Turning data into tensors

![](https://i.imgur.com/IoSyeOk.png)

**First build a Dataset, then wrap it in a DataLoader (which loads it in batches).**

Image to tensor:

Note: 64x64 images compute faster than 255x255 ones, but they also lose some detail and features.

```python=
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

data_transform = transforms.Compose([
    # 1. Resize image
    transforms.Resize(size=(64, 64)),
    # 2. Flip the images randomly on the horizontal; p is the probability of a flip
    transforms.RandomHorizontalFlip(p=0.5),
    # 3. Turn image to Tensor
    transforms.ToTensor()
])
```

![](https://i.imgur.com/uCeQSbV.png)

At 64x64 the image is somewhat distorted:

![](https://i.imgur.com/vnpeVYJ.png)

At 255x255 it is more complete (this image has been horizontally flipped):

![](https://i.imgur.com/GLix7LZ.png)

---

## Loading Image Data Using ImageFolder

**Turn ALL of our image data into a Dataset that can be used with PyTorch.**

Since our data is in the standard image classification format, we can use the class *torchvision.datasets.ImageFolder*, passing it the file path of a target image directory as well as a series of transforms we'd like to perform on our images.
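The "standard image classification format" means one sub-directory per class, e.g. `train/pizza/xxx.jpg`. As a rough, stdlib-only sketch of the idea (this is not torchvision's actual implementation), class names and integer labels can be derived from the folder names. The helper name `classes_from_dir` and the throwaway demo folders are assumptions made up for illustration.

```python
import os
import tempfile
from pathlib import Path
from typing import Dict, List, Tuple, Union


def classes_from_dir(root: Union[str, Path]) -> Tuple[List[str], Dict[str, int]]:
    """Return sorted class names and a name -> index mapping.

    Hypothetical helper: assumes every sub-directory of `root` is one class.
    """
    classes = sorted(entry.name for entry in os.scandir(root) if entry.is_dir())
    if not classes:
        raise FileNotFoundError(f"No class folders found in {root}")
    class_to_idx = {name: idx for idx, name in enumerate(classes)}
    return classes, class_to_idx


# Demo on a temporary directory that mimics train/pizza, train/steak, train/sushi
with tempfile.TemporaryDirectory() as tmp:
    for name in ("sushi", "pizza", "steak"):
        (Path(tmp) / "train" / name).mkdir(parents=True)
    classes, class_to_idx = classes_from_dir(Path(tmp) / "train")
    print(classes)
    print(class_to_idx)
```

Sorting the folder names keeps the label indices stable between runs, so a predicted index can always be mapped back to the same class name.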
```python=
from torchvision import datasets

train_data = datasets.ImageFolder(root=train_p,  # target folder of images
                                  transform=data_transform,  # transforms to perform on data (images)
                                  target_transform=None)  # transforms to perform on labels (if necessary)
test_data = datasets.ImageFolder(root=test_p,
                                 transform=data_transform)
print(f"Train data:\n{train_data}\nTest data:\n{test_data}")
```

**Note: build a `class_name` list so that predicted labels/targets can be mapped back to names later.**

![](https://i.imgur.com/8HP55lA.png)
![](https://i.imgur.com/r3t14C7.png)
![](https://i.imgur.com/aWXFIrO.png)
![](https://i.imgur.com/iudF8Ev.png)
![](https://i.imgur.com/jxDSX2T.png)

To visualize with matplotlib, use `permute` to rearrange the dimensions. matplotlib expects [height, width, color_channels].

```python=
# Rearrange the order of dimensions
img_permute = img.permute(1, 2, 0)

# Print out different shapes (before and after permute)
print(f"Original shape: {img.shape} -> [color_channels, height, width]")
print(f"Image permute shape: {img_permute.shape} -> [height, width, color_channels]")

# Plot the image
plt.figure(figsize=(10, 7))
plt.imshow(img.permute(1, 2, 0))
plt.axis("off")
plt.title(class_name[label], fontsize=14);
```

![](https://i.imgur.com/kbk6ME8.png)

---

## If the prebuilt ImageFolder cannot be used, build your own

https://www.learnpytorch.io/04_pytorch_custom_datasets/

![](https://i.imgur.com/uNLiAp2.png)

Get the labels from the folder names:

![](https://i.imgur.com/XNRAPBK.png)
![](https://i.imgur.com/IkoSapp.png)

Build your own Dataset class:

![](https://i.imgur.com/I7trCkU.png)

Note: `__getitem__` replaces the following ->

![](https://i.imgur.com/PuLPJMb.png)
![](https://i.imgur.com/EOshqsI.png)

```python=
# Write a custom dataset class (inherits from torch.utils.data.Dataset)
import pathlib
from typing import Tuple

import torch
from PIL import Image
from torch.utils.data import Dataset

# 1. Subclass torch.utils.data.Dataset
class ImageFolderCustom(Dataset):

    # 2. Initialize with a targ_dir and transform (optional) parameter
    def __init__(self, targ_dir: str, transform=None) -> None:
        # 3.
        # Create class attributes
        # Get all image paths
        self.paths = list(pathlib.Path(targ_dir).glob("*/*.jpg"))  # note: you'd have to update this if you've got .png's or .jpeg's
        # Setup transforms
        self.transform = transform
        # Create classes and class_to_idx attributes (find_classes is the helper shown in the screenshots above)
        self.classes, self.class_to_idx = find_classes(targ_dir)

    # 4. Make function to load images
    def load_image(self, index: int) -> Image.Image:
        "Opens an image via a path and returns it."
        image_path = self.paths[index]
        return Image.open(image_path)

    # 5. Overwrite the __len__() method (optional but recommended for subclasses of torch.utils.data.Dataset)
    def __len__(self) -> int:
        "Returns the total number of samples."
        return len(self.paths)

    # 6. Overwrite the __getitem__() method (required for subclasses of torch.utils.data.Dataset)
    def __getitem__(self, index: int) -> Tuple[torch.Tensor, int]:
        "Returns one sample of data, data and label (X, y)."
        img = self.load_image(index)
        class_name = self.paths[index].parent.name  # expects path in data_folder/class_name/image.jpeg
        class_idx = self.class_to_idx[class_name]

        # Transform if necessary
        if self.transform:
            return self.transform(img), class_idx  # return data, label (X, y)
        else:
            return img, class_idx  # return untransformed data, label (X, y)
```

Transform code:

```python=
# Augment train data
train_transforms = transforms.Compose([
    transforms.Resize((64, 64)),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ToTensor()
])

# Don't augment test data, only reshape
test_transforms = transforms.Compose([
    transforms.Resize((64, 64)),
    transforms.ToTensor()
])
```

Usage:

```python=
# train_p and test_p are the train/test folders defined earlier
train_data_custom = ImageFolderCustom(targ_dir=train_p,
                                      transform=train_transforms)
test_data_custom = ImageFolderCustom(targ_dir=test_p,
                                     transform=test_transforms)
train_data_custom, test_data_custom
```

Result:

![](https://i.imgur.com/qZofq3V.png)
![](https://i.imgur.com/5fcD1VZ.png)

Visualizing the images: https://www.learnpytorch.io/04_pytorch_custom_datasets/

---

## Turn loaded images (as tensors) into a DataLoader
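A rough pure-Python sketch of what batching does to a dataset's sample indices may help here. The function `batch_indices` and the sample count of 225 are assumptions for illustration only; the real `torch.utils.data.DataLoader` additionally handles shuffling, worker processes and collating samples into tensors.

```python
import math
from typing import List


def batch_indices(n_samples: int, batch_size: int, drop_last: bool = False) -> List[List[int]]:
    """Split range(n_samples) into consecutive batches of at most batch_size."""
    batches = [
        list(range(start, min(start + batch_size, n_samples)))
        for start in range(0, n_samples, batch_size)
    ]
    if drop_last and batches and len(batches[-1]) < batch_size:
        batches.pop()  # discard the incomplete final batch
    return batches


n_samples, batch_size = 225, 32  # assumed dataset size, not taken from the notes
batches = batch_indices(n_samples, batch_size)
print(len(batches), math.ceil(n_samples / batch_size))  # batch count, computed two ways
print(len(batches[0]), len(batches[-1]))                # a full batch vs. the leftover batch
```

The number of batches is what `len(dataloader)` reports, which is why the train and test steps divide the accumulated loss and accuracy by it.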
**Batch the data and load it one batch at a time to avoid running out of memory.**

Note: this matters here, unlike the earlier classification and linear examples, where the datasets were very small.

Turning our Datasets into DataLoaders makes them iterable, so a model can go through them and learn the relationships between samples and targets (features and labels).

![](https://i.imgur.com/9FwKYug.png)

In the DataLoader, we can use at most 2 `num_workers` to load data on this machine:

![](https://i.imgur.com/BUKWreC.png)

```python=
import os
from torch.utils.data import DataLoader

BATCH_SIZE = 32
train_dataloader = DataLoader(dataset=train_data,
                              batch_size=BATCH_SIZE,
                              num_workers=os.cpu_count(),
                              shuffle=True)
test_dataloader = DataLoader(dataset=test_data,
                             batch_size=BATCH_SIZE,
                             num_workers=os.cpu_count(),
                             shuffle=False)  # no need to shuffle the test data
```

![](https://i.imgur.com/AIBAJN9.png)
![](https://i.imgur.com/84pYIfh.png)

8 batches of 32 samples each (the last batch may be smaller).

### Inspecting a DataLoader and pulling out a batch

```python=
img, label = next(iter(train_dataloader))
print(f"Image shape: {img.shape} -> [batch_size, color_channels, height, width]")
print(f"Label shape: {label.shape}")
```

![](https://i.imgur.com/IL5x9te.png)

---

## Data Augmentation

![](https://i.imgur.com/1DouRIy.png)

(Supplement) Tricks for improving model accuracy: https://pytorch.org/blog/how-to-train-state-of-the-art-models-using-torchvision-latest-primitives/#break-down-of-key-accuracy-improvements

Used this time:

![](https://i.imgur.com/cR4HK4l.png)

```python=
data_transform = transforms.Compose([
    # 1. Resize image
    transforms.Resize(size=(64, 64)),
    # 2. num_magnitude_bins is set between 0 and 31; the larger it is, the stronger the variation
    transforms.TrivialAugmentWide(num_magnitude_bins=31),
    # 3.
    # Turn image to Tensor
    transforms.ToTensor()
])
```

![](https://i.imgur.com/N5jnwIm.png)
![](https://i.imgur.com/74SBHrh.png)

---

## Build Model

Note: writing the whole forward pass as one chained expression can be more computationally efficient, because it leverages the benefits of **operator fusion**: `return self.classifier(self.conv_block_2(self.conv_block_1(x)))`

The full pipeline:

### Data transforms

```python=
# Data transforms
from torchvision import transforms

# Training transform: resize + augmentation
t_transform = transforms.Compose([transforms.Resize((64, 64)),
                                  transforms.TrivialAugmentWide(num_magnitude_bins=31),
                                  transforms.ToTensor()])
# Test transform: resize only (test data should not be augmented)
test_transform = transforms.Compose([transforms.Resize((64, 64)),
                                     transforms.ToTensor()])
```

### Load the data and apply the transforms

```python=
# Load the data and apply the transforms
from torchvision import datasets

train_data = datasets.ImageFolder(root=train_p, transform=t_transform)
test_data = datasets.ImageFolder(root=test_p, transform=test_transform)
```

### Turn into DataLoaders

```python=
# Turn into DataLoaders
import os
from torch.utils.data import DataLoader

BATCH_SIZE = 32
NUM_WORKERS = os.cpu_count()
print(f"Creating DataLoader's with batch size {BATCH_SIZE} and {NUM_WORKERS} workers.")

train_dataloader = DataLoader(train_data,
                              batch_size=BATCH_SIZE,
                              shuffle=True,
                              num_workers=NUM_WORKERS)
test_dataloader = DataLoader(test_data,
                             batch_size=BATCH_SIZE,
                             num_workers=NUM_WORKERS)
```

### Build the model (a not-so-great one)

```python=
# Build the model
from torch import nn

class TinyVGG(nn.Module):
    """
    Model architecture copying TinyVGG from:
    https://poloclub.github.io/cnn-explainer/
    """
    def __init__(self, input_shape: int, hidden_unit: int, output_shape: int):
        super().__init__()
        self.b1 = nn.Sequential(
            nn.Conv2d(in_channels=input_shape,
                      out_channels=hidden_unit,
                      kernel_size=3,
                      stride=1,
                      padding=1),
            nn.ReLU(),
            nn.Conv2d(in_channels=hidden_unit,
                      out_channels=hidden_unit,
                      kernel_size=3,
                      stride=1,
                      padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2,
                         stride=2)
        )
        self.b2 = nn.Sequential(
            nn.Conv2d(in_channels=hidden_unit,
                      out_channels=hidden_unit,
                      kernel_size=3,
                      stride=1,
                      padding=1),
            nn.ReLU(),
            nn.Conv2d(in_channels=hidden_unit,
                      out_channels=hidden_unit,
                      kernel_size=3,
                      stride=1,
                      padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2,
                         stride=2)
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            # A 64x64 input is halved by each of the two MaxPool2d layers: 64 -> 32 -> 16
            nn.Linear(in_features=hidden_unit*16*16,
                      out_features=output_shape)
        )

    def forward(self, x):
        z = self.b1(x)
        print(z.shape)
        z = self.b2(z)
        print(z.shape)
        z = self.classifier(z)
        print(z.shape)
        return z
```

```python=
model_0 = TinyVGG(input_shape=3,  # number of color channels (3 for RGB)
                  hidden_unit=10,
                  output_shape=len(train_data.classes)).to(device)
model_0
```

![](https://i.imgur.com/S4qEu73.png)
![](https://i.imgur.com/mYpUXWj.png)

### Feed real data through the model

```python=
# 1. Get a batch of images and labels from the DataLoader
img_batch, label_batch = next(iter(train_dataloader))

# 2. Get a single image from the batch and unsqueeze the image so its shape fits the model
img_single, label_single = img_batch[0].unsqueeze(dim=0), label_batch[0]
print(f"Single image shape: {img_single.shape}\n")

# 3. Perform a forward pass on a single image
model_0.eval()
with torch.inference_mode():
    pred = model_0(img_single.to(device))

# 4. Print out what's happening and convert model logits -> pred probs -> pred label
print(f"Output logits:\n{pred}\n")
print(f"Output prediction probabilities:\n{torch.softmax(pred, dim=1)}\n")
print(f"Output prediction label:\n{torch.argmax(torch.softmax(pred, dim=1), dim=1)}\n")
print(f"Actual label:\n{label_single}")
```

![](https://i.imgur.com/goekxqf.png)

### Inspect the model with torchinfo

```python=
# Install torchinfo if it's not available, import it if it is
try:
    import torchinfo
except ImportError:
    # "!pip" is a notebook shell command and only works in Jupyter/Colab
    !pip install torchinfo
    import torchinfo

from torchinfo import summary
summary(model_0, input_size=[1, 3, 64, 64])  # do a test pass through of an example input size
```

![](https://i.imgur.com/PMkT4po.png)

### Modular train step

```python=
def train_step(model: torch.nn.Module,
               dataloader: torch.utils.data.DataLoader,
               loss_fn: torch.nn.Module,
               optimizer: torch.optim.Optimizer):
    # Put model in train mode
    model.train()

    # Setup train loss and train accuracy values
    train_loss, train_acc = 0, 0

    # Loop through data loader data batches
    for batch, (X, y) in enumerate(dataloader):
        # Send data to target device
        X, y \
            = X.to(device), y.to(device)

        # 1. Forward pass
        y_pred = model(X)

        # 2. Calculate and accumulate loss
        loss = loss_fn(y_pred, y)
        train_loss += loss.item()

        # 3. Optimizer zero grad
        optimizer.zero_grad()

        # 4. Loss backward
        loss.backward()

        # 5. Optimizer step
        optimizer.step()

        # Calculate and accumulate accuracy metric across all batches
        y_pred_class = torch.argmax(torch.softmax(y_pred, dim=1), dim=1)
        train_acc += (y_pred_class == y).sum().item()/len(y_pred)

    # Adjust metrics to get average loss and accuracy per batch
    train_loss = train_loss / len(dataloader)
    train_acc = train_acc / len(dataloader)
    return train_loss, train_acc
```

### Modular test step

```python=
def test_step(model: torch.nn.Module,
              dataloader: torch.utils.data.DataLoader,
              loss_fn: torch.nn.Module):
    # Put model in eval mode
    model.eval()

    # Setup test loss and test accuracy values
    test_loss, test_acc = 0, 0

    # Turn on inference context manager
    with torch.inference_mode():
        # Loop through DataLoader batches
        for batch, (X, y) in enumerate(dataloader):
            # Send data to target device
            X, y = X.to(device), y.to(device)

            # 1. Forward pass
            test_pred_logits = model(X)

            # 2. Calculate and accumulate loss
            loss = loss_fn(test_pred_logits, y)
            test_loss += loss.item()

            # Calculate and accumulate accuracy
            test_pred_labels = test_pred_logits.argmax(dim=1)
            test_acc += ((test_pred_labels == y).sum().item()/len(test_pred_labels))

    # Adjust metrics to get average loss and accuracy per batch
    test_loss = test_loss / len(dataloader)
    test_acc = test_acc / len(dataloader)
    return test_loss, test_acc
```

### Modular training (train + test combined)

```python=
from tqdm.auto import tqdm

# 1. Take in various parameters required for training and test steps
def train(model: torch.nn.Module,
          train_dataloader: torch.utils.data.DataLoader,
          test_dataloader: torch.utils.data.DataLoader,
          optimizer: torch.optim.Optimizer,
          loss_fn: torch.nn.Module = nn.CrossEntropyLoss(),
          epochs: int = 5):

    # 2.
    # Create empty results dictionary
    results = {"train_loss": [],
               "train_acc": [],
               "test_loss": [],
               "test_acc": []
    }

    # 3. Loop through training and testing steps for a number of epochs
    for epoch in tqdm(range(epochs)):
        train_loss, train_acc = train_step(model=model,
                                           dataloader=train_dataloader,
                                           loss_fn=loss_fn,
                                           optimizer=optimizer)
        test_loss, test_acc = test_step(model=model,
                                        dataloader=test_dataloader,
                                        loss_fn=loss_fn)

        # 4. Print out what's happening
        print(
            f"Epoch: {epoch+1} | "
            f"train_loss: {train_loss:.4f} | "
            f"train_acc: {train_acc:.4f} | "
            f"test_loss: {test_loss:.4f} | "
            f"test_acc: {test_acc:.4f}"
        )

        # 5. Update results dictionary
        results["train_loss"].append(train_loss)
        results["train_acc"].append(train_acc)
        results["test_loss"].append(test_loss)
        results["test_acc"].append(test_acc)

    # 6. Return the filled results at the end of the epochs
    return results
```

### Training run

```python=
# Set random seeds
torch.manual_seed(42)
torch.cuda.manual_seed(42)

# Set number of epochs
NUM_EPOCHS = 5

# Recreate an instance of TinyVGG
model_0 = TinyVGG(input_shape=3,  # number of color channels (3 for RGB)
                  hidden_unit=10,  # keyword must match TinyVGG.__init__ above
                  output_shape=len(train_data.classes)).to(device)

# Setup loss function and optimizer
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(params=model_0.parameters(), lr=0.001)

# Start the timer
from timeit import default_timer as timer
start_time = timer()

# Train model_0 (using the DataLoaders created above)
model_0_results = train(model=model_0,
                        train_dataloader=train_dataloader,
                        test_dataloader=test_dataloader,
                        optimizer=optimizer,
                        loss_fn=loss_fn,
                        epochs=NUM_EPOCHS)

# End the timer and print out how long it took
end_time = timer()
print(f"Total training time: {end_time-start_time:.3f} seconds")
```

### Modular prediction

```python=
from typing import List

import torchvision

def pred_and_plot_image(model: torch.nn.Module,
                        image_path: str,
                        class_names: List[str] = None,
                        transform=None,
                        device: torch.device = device):
    """Makes a prediction on a target image and plots the image with its prediction."""

    # 1.
    # Load in image and convert the tensor values to float32
    target_image = torchvision.io.read_image(str(image_path)).type(torch.float32)

    # 2. Divide the image pixel values by 255 to get them between [0, 1]
    target_image = target_image / 255.

    # 3. Transform if necessary
    if transform:
        target_image = transform(target_image)

    # 4. Make sure the model is on the target device
    model.to(device)

    # 5. Turn on model evaluation mode and inference mode
    model.eval()
    with torch.inference_mode():
        # Add an extra dimension to the image
        target_image = target_image.unsqueeze(dim=0)

        # Make a prediction on image with an extra dimension and send it to the target device
        target_image_pred = model(target_image.to(device))

    # 6. Convert logits -> prediction probabilities (using torch.softmax() for multi-class classification)
    target_image_pred_probs = torch.softmax(target_image_pred, dim=1)

    # 7. Convert prediction probabilities -> prediction labels
    target_image_pred_label = torch.argmax(target_image_pred_probs, dim=1)

    # 8. Plot the image alongside the prediction and prediction probability
    plt.imshow(target_image.squeeze().permute(1, 2, 0))  # make sure it's the right size for matplotlib
    if class_names:
        title = f"Pred: {class_names[target_image_pred_label.cpu()]} | Prob: {target_image_pred_probs.max().cpu():.3f}"
    else:
        title = f"Pred: {target_image_pred_label} | Prob: {target_image_pred_probs.max().cpu():.3f}"
    plt.title(title)
    plt.axis(False);
```

Reference: https://www.learnpytorch.io/04_pytorch_custom_datasets/

Conclusion: the model is not very good; transfer learning is the next thing to use.

---

## Other CNN layers

1. BatchNorm2d(): https://blog.csdn.net/bigFatCat_Tom/article/details/91619977
2. dropout(): https://blog.csdn.net/leviopku/article/details/120786990

---

## Over/underfitting

![](https://i.imgur.com/OmYPUFq.jpg)
![](https://i.imgur.com/K1OBujm.png)
![](https://i.imgur.com/0kjNHuV.png)
![](https://i.imgur.com/xiQN8a9.png)
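As a small numeric companion to the curves above, here is a minimal sketch of how over/underfitting might be flagged from a `results` dict shaped like the one `train()` returns. The function name `diagnose_fit`, both thresholds and the loss values in the demo are invented for illustration; in practice it is usually better to plot the full loss curves and judge them by eye.

```python
from typing import Dict, List


def diagnose_fit(results: Dict[str, List[float]],
                 gap_tol: float = 0.1,
                 high_loss: float = 1.0) -> str:
    """Very rough heuristic based on the final-epoch losses.

    gap_tol and high_loss are arbitrary illustrative thresholds.
    """
    train_loss = results["train_loss"][-1]
    test_loss = results["test_loss"][-1]
    if test_loss - train_loss > gap_tol:
        return "overfitting"   # train loss is clearly lower than test loss
    if train_loss > high_loss:
        return "underfitting"  # both losses are still high
    return "ok"


# Made-up loss histories, one per scenario
overfit = {"train_loss": [1.0, 0.5, 0.2], "test_loss": [1.0, 0.9, 1.1]}
underfit = {"train_loss": [1.2, 1.15, 1.1], "test_loss": [1.2, 1.16, 1.12]}
balanced = {"train_loss": [1.0, 0.5, 0.3], "test_loss": [1.0, 0.55, 0.35]}

for name, res in [("overfit", overfit), ("underfit", underfit), ("balanced", balanced)]:
    print(name, "->", diagnose_fit(res))
```

A growing gap between train and test loss suggests adding regularization or augmentation, while two high, flat curves suggest a larger model or longer training.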