###### tags: `PyTorch`
# PyTorch - Kaggle practice - [Dogs vs. Cats](https://www.kaggle.com/c/dogs-vs-cats/overview) - with a custom CNN model
I have previously practiced cat-vs-dog classification with Keras (see my [earlier notes](https://www.pyexercise.com/2019/01/kaggle-dog-cat.html)), going from the simplest linear NN, to a CNN, to transfer learning (fine-tuning). Working through this problem gives a good feel for the whole deep-learning workflow, so here we use the same problem to learn how PyTorch handles image-classification models.
Goals (this will be a series of posts; this one is the first):
1. Define a CNN model by hand and train it once, to see whether it runs at all!
2. Train with transfer learning and see whether performance improves.
3. Combine transfer learning with regularization and other deep-learning techniques (batch normalization, residual connections, etc.) and see whether performance improves further.
The code below starts the CNN model exercise:
## Data preprocessing
Training & validation data: see [kernel data](https://www.kaggle.com/pocahontas1010/dogs-vs-cats-for-pytorch); download it from the link and put it in the project root.
The test data is the [original data](https://www.kaggle.com/c/dogs-vs-cats/data); download it from that link.
Packages we will use:
```python=
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import datasets, models, transforms
from torch.utils.data.sampler import SubsetRandomSampler
from torch.optim import lr_scheduler
from pathlib import Path
import numpy as np
from matplotlib import pyplot as plt
from sklearn.model_selection import train_test_split
```
In what follows, use the GPU for training whenever one is available; it saves a lot of time!
```python=+
train_on_gpu = torch.cuda.is_available()
if not train_on_gpu:
print('CUDA is not available. Training on CPU ...')
else:
print('CUDA is available! Training on GPU ...')
```
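An alternative to carrying a `train_on_gpu` flag around is the `torch.device` idiom that newer PyTorch code tends to prefer. A small sketch, equivalent in effect to the flag above:

```python
import torch

# Build a device object once; tensors and models then move with .to(device)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)
```

With this pattern, `model.to(device)` and `data.to(device)` replace the scattered `.cuda()` calls used later in this post.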
Set the paths pointing to the training, validation, and test folders.
To keep training time down, we take only 2000 training images (1000 cats, 1000 dogs), 1000 validation images (500 each), and 1000 test images (500 each).
```python=+
PATH_train = "..\\cats_and_dogs_small\\train"
PATH_val = "..\\cats_and_dogs_small\\validation"
PATH_test = "..\\cats_and_dogs_small\\test"
```
```python=+
TRAIN =Path(PATH_train)
VALID = Path(PATH_val)
TEST=Path(PATH_test)
print(TRAIN)
print(VALID)
print(TEST)
```
```
..\cats_and_dogs_small\train
..\cats_and_dogs_small\validation
..\cats_and_dogs_small\test
```
Next, set a few parameters needed by [torch.utils.data.DataLoader](https://pytorch.org/docs/stable/data.html) when loading batches. They are shared by the train, validation, and test loaders, so they are defined once here.
```python=+
# number of subprocesses to use for data loading
num_workers = 0
# how many samples per batch to load
batch_size = 32
# learning rate
LR = 0.01
```
torchvision provides the [TORCHVISION.TRANSFORMS](https://pytorch.org/docs/stable/torchvision/transforms.html) API, which includes resize, crop, and other common data-augmentation operations; essentially all of PyTorch's data augmentation can be done through it. For this exercise we resize each image to (224, 224) pixels, convert it to a tensor, and finally normalize it. The normalization coefficients follow the [example in the docs](https://pytorch.org/docs/stable/torchvision/models.html); alternatively you could use:
```
transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
```
```python=+
# convert data to a normalized torch.FloatTensor
train_transforms = transforms.Compose([transforms.Resize((224,224)),
transforms.ToTensor(),
transforms.Normalize([0.485, 0.456, 0.406],
[0.229, 0.224, 0.225])])
valid_transforms = transforms.Compose([transforms.Resize((224,224)),
transforms.ToTensor(),
transforms.Normalize([0.485, 0.456, 0.406],
[0.229, 0.224, 0.225])])
test_transforms = transforms.Compose([transforms.Resize((224,224)),
transforms.ToTensor(),
transforms.Normalize([0.485, 0.456, 0.406],
[0.229, 0.224, 0.225])])
```
Next, wrap the training, validation, and test sets with [torchvision's ImageFolder dataset](https://pytorch.org/docs/stable/torchvision/datasets.html#imagefolder). It assumes the files are organized into folders, one folder per class, with the folder name as the class name, like this (see the [PyTorch docs](https://pytorch.org/docs/stable/torchvision/datasets.html#imagefolder) for details):
```
root/dog/xxx.png
root/dog/xxy.png
root/dog/xxz.png
root/cat/123.png
root/cat/nsdf3.png
root/cat/asd932_.png
```
```python=+
# choose the training, validation, and test datasets
train_data = datasets.ImageFolder(TRAIN, transform=train_transforms)
valid_data = datasets.ImageFolder(VALID, transform=valid_transforms)
test_data = datasets.ImageFolder(TEST, transform=test_transforms)
```
The label assigned to each folder:
```python=+
print(train_data.class_to_idx)
print(valid_data.class_to_idx)
```
```
{'cats': 0, 'dogs': 1}
{'cats': 0, 'dogs': 1}
```
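Since the model will output integer class indices, it is handy to invert `class_to_idx` so predictions can be mapped back to class names. A tiny sketch using the mapping printed above:

```python
# Invert {'cats': 0, 'dogs': 1} into {0: 'cats', 1: 'dogs'}
class_to_idx = {'cats': 0, 'dogs': 1}  # as printed above
idx_to_class = {v: k for k, v in class_to_idx.items()}
print(idx_to_class[1])  # -> dogs
```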
Next, use the [torch.utils.data.DataLoader](https://pytorch.org/docs/stable/data.html) data loader so the wrapped datasets can be iterated batch by batch during training.
```python=+
# prepare data loaders (combine dataset and sampler)
train_loader = torch.utils.data.DataLoader(train_data, batch_size=batch_size, num_workers=num_workers,shuffle=True)
valid_loader = torch.utils.data.DataLoader(valid_data, batch_size=batch_size, num_workers=num_workers,shuffle=True)
test_loader = torch.utils.data.DataLoader(test_data, batch_size=batch_size, num_workers=num_workers)
```
```python=+
images, labels = next(iter(train_loader))
images.shape, labels.shape
```
```
(torch.Size([32, 3, 224, 224]), torch.Size([32]))
```
Let's plot 20 images from the train set. The data in train_loader has already been normalized, so we have to denormalize it before plotting.
```python=+
import matplotlib.pyplot as plt
%matplotlib inline
classes = ['cat', 'dog']
mean, std = torch.tensor([0.485, 0.456, 0.406]), torch.tensor([0.229, 0.224, 0.225])
def denormalize(image):
image = transforms.Normalize(-mean/std,1/std)(image) #denormalize
image = image.permute(1,2,0) #Changing from 3x224x224 to 224x224x3
image = torch.clamp(image,0,1)
return image
# helper function to un-normalize and display an image
def imshow(img):
img = denormalize(img)
plt.imshow(img)
```
Plot the first 20 images; subplots keep the figure compact.
```python=+
# obtain one batch of training images
dataiter = iter(train_loader)
images, labels = next(dataiter)  # `dataiter.next()` was removed in newer PyTorch
# plot the images in the batch, along with the corresponding labels
fig = plt.figure(figsize=(25, 8))
# display 20 images
for idx in np.arange(20):
    ax = fig.add_subplot(2, 10, idx+1, xticks=[], yticks=[])  # add_subplot needs ints
    imshow(images[idx])
    ax.set_title(classes[labels[idx]])
```
![](https://i.imgur.com/KCktcSh.png)
The train_loader looks good.
## Building the model
The goal here is to build a CNN model by hand and see whether it trains at all.
The comments in the model note how each layer changes the input/output shape.
```python=+
# Create CNN Model
class CNN_Model(nn.Module):
def __init__(self):
super(CNN_Model, self).__init__()
# Convolution 1 , input_shape=(3,224,224)
        self.cnn1 = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=5, stride=1, padding=0) #output_shape=(16,220,220) #floor((224-5+2*0)/1)+1=220 #floor((W-K+2P)/S)+1
self.relu1 = nn.ReLU() # activation
# Max pool 1
self.maxpool1 = nn.MaxPool2d(kernel_size=2) #output_shape=(16,110,110) #(220/2)
# Convolution 2
self.cnn2 = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=5, stride=1, padding=0) #output_shape=(32,106,106)
self.relu2 = nn.ReLU() # activation
# Max pool 2
self.maxpool2 = nn.MaxPool2d(kernel_size=2) #output_shape=(32,53,53)
# Convolution 3
self.cnn3 = nn.Conv2d(in_channels=32, out_channels=16, kernel_size=3, stride=1, padding=0) #output_shape=(16,51,51)
self.relu3 = nn.ReLU() # activation
# Max pool 3
self.maxpool3 = nn.MaxPool2d(kernel_size=2) #output_shape=(16,25,25)
# Convolution 4
self.cnn4 = nn.Conv2d(in_channels=16, out_channels=8, kernel_size=3, stride=1, padding=0) #output_shape=(8,23,23)
self.relu4 = nn.ReLU() # activation
# Max pool 4
self.maxpool4 = nn.MaxPool2d(kernel_size=2) #output_shape=(8,11,11)
        # Fully connected 1, input_shape=(8*11*11)
self.fc1 = nn.Linear(8 * 11 * 11, 512)
self.relu5 = nn.ReLU() # activation
self.fc2 = nn.Linear(512, 2)
self.output = nn.Softmax(dim=1)
def forward(self, x):
out = self.cnn1(x) # Convolution 1
out = self.relu1(out)
out = self.maxpool1(out)# Max pool 1
out = self.cnn2(out) # Convolution 2
out = self.relu2(out)
out = self.maxpool2(out) # Max pool 2
out = self.cnn3(out) # Convolution 3
out = self.relu3(out)
out = self.maxpool3(out) # Max pool 3
out = self.cnn4(out) # Convolution 4
out = self.relu4(out)
out = self.maxpool4(out) # Max pool 4
        out = out.view(out.size(0), -1) # flatten the last CNN feature map for the linear layers
out = self.fc1(out) # Linear function (readout)
out = self.fc2(out)
out = self.output(out)
return out
```
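The shapes in the comments above follow the standard formulas: a convolution maps size W to floor((W - K + 2P)/S) + 1, and `MaxPool2d(kernel_size=2)` halves the size (rounding down). A small plain-Python check (helper names are my own) that reproduces the chain down to the fc1 input size of 8\*11\*11 = 968:

```python
def conv_out(size, kernel, stride=1, padding=0):
    # floor((W - K + 2P) / S) + 1
    return (size - kernel + 2 * padding) // stride + 1

def pool_out(size, kernel=2):
    return size // kernel

s = 224
s = pool_out(conv_out(s, 5))  # cnn1 + maxpool1: 224 -> 220 -> 110
s = pool_out(conv_out(s, 5))  # cnn2 + maxpool2: 110 -> 106 -> 53
s = pool_out(conv_out(s, 3))  # cnn3 + maxpool3: 53 -> 51 -> 25
s = pool_out(conv_out(s, 3))  # cnn4 + maxpool4: 25 -> 23 -> 11
print(8 * s * s)  # -> 968, the in_features of fc1
```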
Next, print the model with the model-summary function from the previous post.
```python=+
model = CNN_Model()
from torchsummary import summary
summary(model.cuda(), (3, 224, 224))
```
![](https://i.imgur.com/RTcR2G2.png)
## Training the model
Training in Keras shows a progress bar; PyTorch has none built in, but [tqdm](https://tqdm.github.io/) achieves the same thing.
```python=+
from tqdm.notebook import tqdm  # tqdm_notebook is deprecated in newer tqdm
```
The training procedure is the same as in the MNIST exercise.
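The loop below uses `criterion` and `optimizer`, which do not appear earlier in the post; they were presumably defined in a cell not shown. A plausible setup, consistent with the `LR = 0.01` set earlier (this is my assumption, not the author's confirmed choice):

```python
import torch.nn as nn
import torch.optim as optim

# Assumed definitions (not shown in the original post): cross-entropy loss
# for the 2-class problem, and plain SGD with the LR = 0.01 set earlier.
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=LR)
```

Note that `nn.CrossEntropyLoss` applies log-softmax internally, so the `nn.Softmax` layer at the end of `CNN_Model` is redundant for training and slightly flattens gradients; feeding raw logits to the loss is the more idiomatic choice.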
```python=+
if train_on_gpu:
model.cuda()
# number of epochs to train the model
n_epochs = 50
valid_loss_min = np.inf # track change in validation loss (np.Inf was removed in NumPy 2.0)
#train_losses,valid_losses=[],[]
for epoch in range(1, n_epochs+1):
# keep track of training and validation loss
train_loss = 0.0
valid_loss = 0.0
print('running epoch: {}'.format(epoch))
###################
# train the model #
###################
model.train()
for data, target in tqdm(train_loader):
# move tensors to GPU if CUDA is available
if train_on_gpu:
data, target = data.cuda(), target.cuda()
# clear the gradients of all optimized variables
optimizer.zero_grad()
# forward pass: compute predicted outputs by passing inputs to the model
output = model(data)
# calculate the batch loss
loss = criterion(output, target)
# backward pass: compute gradient of the loss with respect to model parameters
loss.backward()
# perform a single optimization step (parameter update)
optimizer.step()
# update training loss
train_loss += loss.item()*data.size(0)
######################
# validate the model #
######################
    model.eval()
    with torch.no_grad():  # gradients are not needed for validation
        for data, target in tqdm(valid_loader):
            # move tensors to GPU if CUDA is available
            if train_on_gpu:
                data, target = data.cuda(), target.cuda()
            # forward pass: compute predicted outputs by passing inputs to the model
            output = model(data)
            # calculate the batch loss
            loss = criterion(output, target)
            # update average validation loss
            valid_loss += loss.item()*data.size(0)
# calculate average losses
#train_losses.append(train_loss/len(train_loader.dataset))
#valid_losses.append(valid_loss.item()/len(valid_loader.dataset)
train_loss = train_loss/len(train_loader.dataset)
valid_loss = valid_loss/len(valid_loader.dataset)
# print training/validation statistics
print('\tTraining Loss: {:.6f} \tValidation Loss: {:.6f}'.format(
train_loss, valid_loss))
# save model if validation loss has decreased
if valid_loss <= valid_loss_min:
print('Validation loss decreased ({:.6f} --> {:.6f}). Saving model ...'.format(
valid_loss_min,
valid_loss))
torch.save(model.state_dict(), 'model_CNN.pth')
valid_loss_min = valid_loss
```
The progress bars from the final epoch:
![](https://i.imgur.com/sPALhDT.png)
## Evaluating & testing the model
After training, we measure the model's performance on the test data.
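One detail worth adding first: the training loop saved the weights with the lowest validation loss to `model_CNN.pth`, while `model` in memory holds whatever the last epoch left behind. Reloading the checkpoint (a sketch, assuming the file written by the training cell exists) makes sure the best model is the one being tested:

```python
# Restore the weights with the lowest validation loss before evaluating
model.load_state_dict(torch.load('model_CNN.pth'))
```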
```python=+
def test(loaders, model, criterion, use_cuda):
# monitor test loss and accuracy
test_loss = 0.
correct = 0.
total = 0.
    model.eval()
    with torch.no_grad():  # gradients are not needed for evaluation
        for batch_idx, (data, target) in enumerate(loaders):
            # move to GPU
            if use_cuda:
                data, target = data.cuda(), target.cuda()
            # forward pass: compute predicted outputs by passing inputs to the model
            output = model(data)
            # calculate the loss
            loss = criterion(output, target)
            # update running average of the test loss
            test_loss = test_loss + ((1 / (batch_idx + 1)) * (loss.item() - test_loss))
            # convert output probabilities to predicted class
            pred = output.data.max(1, keepdim=True)[1]
            # compare predictions to true labels
            correct += np.sum(np.squeeze(pred.eq(target.data.view_as(pred))).cpu().numpy())
            total += data.size(0)
print('Test Loss: {:.6f}'.format(test_loss))
print('Test Accuracy: %2d%% (%2d/%2d)' % (
100. * correct / total, correct, total))
```
```python=+
use_cuda = torch.cuda.is_available()
if use_cuda:
    model.cuda()
test(test_loader, model, criterion, use_cuda)
```
```
Test Loss: 0.701602
Test Accuracy: 60% (608/1000)
```
After 50 epochs of training, the hand-built CNN model reaches about 60% accuracy on the test set, roughly the same as the [plain Keras model without any extra processing](https://www.pyexercise.com/2019/01/kaggle-dog-cat.html).