# DLCV HW1
###### tags: `Course`
:::success
湯濬澤
NTUST_M11015117
:::
## Problem 1 - Image Classification
### 1. Architecture of model A
![](https://i.imgur.com/R7s6SoL.png)
### 2. Accuracy on validation dataset
:::success
Model A (CNN)
48%
:::
:::success
Model B (ResNet 50)
86.76%
:::
### 3. Implementation details
The dataset class comes first: it reads each image's label from its file name.
```python!
import os

from PIL import Image
from torch.utils.data import Dataset


class MyDataset(Dataset):
    def __init__(self, root_dir, transform=None):
        self.root_dir = root_dir
        self.transform = transform
        self.datas = []  # list of (file name, label) pairs
        for file in os.listdir(root_dir):
            if file.endswith(".png"):
                # The label is the integer before the first underscore.
                filename_split = file.split("_")
                label = int(filename_split[0])
                self.datas.append((file, label))

    def __len__(self):
        return len(self.datas)

    def __getitem__(self, idx):
        img_path = os.path.join(self.root_dir, self.datas[idx][0])
        label = self.datas[idx][1]
        image = Image.open(img_path)
        if self.transform:
            image = self.transform(image)
        return image, label
```
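The label-parsing step can be tried on its own. The sketch below assumes file names follow a `<label>_<index>.png` pattern; the example names and the helper `parse_label` are hypothetical and only illustrate the rule used in `MyDataset`.

```python
def parse_label(filename: str) -> int:
    """Return the integer class label encoded before the first underscore."""
    return int(filename.split("_")[0])


# Hypothetical file names following a "<label>_<index>.png" pattern.
files = ["0_12.png", "49_3.png", "7_450.png", "notes.txt"]

# Keep only PNG files, exactly as the dataset class does.
samples = [(f, parse_label(f)) for f in files if f.endswith(".png")]
print(samples)
```

Non-PNG files are skipped before parsing, so a stray file in the folder does not raise a `ValueError`.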
The training set is then augmented with random horizontal flips and normalized; the normalization parameters are computed from the training dataset.
```python
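import torch
from torchvision import transforms

transform = transforms.Compose([
    transforms.Resize((32, 32)),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.5077, 0.4813, 0.4312],
        std=[0.2627, 0.2547, 0.2736]
    ),
])

# Compute the normalization statistics. For this step, train_dataset should
# use a transform without Normalize so the raw pixel statistics are measured.
# Stacking on dim=3 gives shape (3, 32, 32, N), so channels stay on dim 0.
imgs = torch.stack([img_t for img_t, _ in train_dataset], dim=3)
print("Dataset shape:", imgs.shape)
print("Dataset mean:", imgs.view(3, -1).mean(dim=1))
print("Dataset std:", imgs.view(3, -1).std(dim=1))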
```
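Stacking every image into one tensor can use a lot of memory on larger datasets. As a rough alternative, the sketch below accumulates running sums batch by batch. It uses NumPy and random data as a stand-in for real images, and `channel_stats` is a hypothetical helper, not part of the original code. It computes the population standard deviation, which differs very slightly from the unbiased estimate that `torch.Tensor.std` returns by default.

```python
import numpy as np


def channel_stats(batches):
    """Per-channel mean and std over an iterable of (N, C, H, W) arrays."""
    count = 0
    total = None
    total_sq = None
    for batch in batches:
        batch = np.asarray(batch, dtype=np.float64)
        # Sum over every axis except the channel axis.
        s = batch.sum(axis=(0, 2, 3))
        sq = (batch ** 2).sum(axis=(0, 2, 3))
        total = s if total is None else total + s
        total_sq = sq if total_sq is None else total_sq + sq
        count += batch.shape[0] * batch.shape[2] * batch.shape[3]
    mean = total / count
    # Population variance: E[x^2] - E[x]^2.
    std = np.sqrt(total_sq / count - mean ** 2)
    return mean, std


# Random stand-in for 8 RGB images of size 32x32, split into two batches.
rng = np.random.default_rng(0)
data = rng.random((8, 3, 32, 32))
mean, std = channel_stats([data[:4], data[4:]])
print(mean, std)
```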
The loss is cross-entropy and the optimizer is SGD. A scheduler manages the learning rate, dividing it by 10 every 20 epochs.
```python
import torch
from torch import nn
from torch.optim.lr_scheduler import StepLR
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
scheduler = StepLR(optimizer, step_size=20, gamma=0.1)
```
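To make the schedule concrete, here is a small sketch of the closed-form learning rate that a step schedule with `step_size=20` and `gamma=0.1` produces. The base learning rate of `0.01` is an assumption for illustration only, since the report does not state the value used, and `step_lr` is a hypothetical helper.

```python
def step_lr(base_lr: float, epoch: int, step_size: int = 20, gamma: float = 0.1) -> float:
    """Learning rate at a given epoch under a step decay schedule."""
    return base_lr * gamma ** (epoch // step_size)


base_lr = 0.01  # hypothetical value; the report does not state it
for epoch in (0, 19, 20, 40, 60):
    print(epoch, step_lr(base_lr, epoch))
```

The rate stays constant within each 20-epoch window and drops by a factor of 10 at epochs 20, 40, 60, and so on.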
Finally, PCA and t-SNE are computed with scikit-learn.
```python
from sklearn import decomposition, manifold


def get_tsne(data, n_components=2, n_images=None):
    # Optionally embed only the first n_images samples to save time.
    if n_images is not None:
        data = data[:n_images]
    tsne = manifold.TSNE(n_components=n_components, random_state=0)
    tsne_data = tsne.fit_transform(data)
    return tsne_data


def get_pca(data, n_components=2):
    pca = decomposition.PCA(n_components=n_components)
    pca_data = pca.fit_transform(data)
    return pca_data
```
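For intuition about what the PCA projection does, the following is a minimal NumPy sketch that centers the data and projects it onto the top singular directions. It is not the scikit-learn routine used in the report; `pca_svd` and the random feature matrix are hypothetical, and the signs of the components may differ from scikit-learn's output.

```python
import numpy as np


def pca_svd(data, n_components=2):
    """Project data onto its top principal directions using an SVD."""
    data = np.asarray(data, dtype=np.float64)
    centered = data - data.mean(axis=0)
    # Rows of vt are principal directions, sorted by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# Random stand-in for 200 feature vectors of dimension 64.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 64))
projected = pca_svd(features)
print(projected.shape)  # (200, 2)
```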
### 4. Alternative model (ResNet 50)
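The biggest difference between my own model and ResNet 50 is network depth. ResNet 50 is organized into 5 stages to handle 224×224 images, whereas the inputs in this dataset are only 32×32, so I did not make my own CNN very deep. Simply stacking more layers, however, can increase the error rate. ResNet avoids this degradation by introducing residual connections: each block learns the difference between the previous layer and the current one instead of the full mapping, which leads to better results.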
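The residual idea can be sketched in a few lines. The example below is a simplified dense version written in NumPy, assuming a two-layer branch `F(x)`; ResNet 50 itself uses convolutional bottleneck blocks with batch normalization, so this is only an illustration. When the branch weights are zero, the block passes a non-negative input through unchanged, which is why extra blocks need not make the network worse.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def residual_block(x, w1, w2):
    """Two-layer residual block: relu(x + F(x)) with F(x) = w2 @ relu(w1 @ x)."""
    return relu(x + w2 @ relu(w1 @ x))


x = np.array([0.5, 1.0, 2.0])
zeros = np.zeros((3, 3))

# With zero branch weights, F(x) = 0 and the block acts as the identity on x.
out = residual_block(x, zeros, zeros)
print(out)
```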
### 5. PCA
![](https://i.imgur.com/w0myhlO.png)
### 6. t-SNE
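Perhaps because there are 50 classes to distinguish, the colors in the t-SNE plots are too similar to tell apart easily. Still, compared with epoch 5, the epoch 100 result does show small clusters more clearly.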
* **Epoch 5**
![](https://i.imgur.com/Qw99ep6.png)
* **Epoch 50**
![](https://i.imgur.com/61aGroq.png)
* **Epoch 100**
![](https://i.imgur.com/qLdGgXd.png)
## Problem 2 - Semantic Segmentation
### 1. **Architecture of model A (VGG16-FCN32s)**
![](https://i.imgur.com/1PghOBq.png)
### 2. **Architecture of model B (VGG16-FCN8s)**
![](https://i.imgur.com/VOG2kbN.png)
:::info
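Like FCN32s, FCN8s starts with a stack of convolutions. Because these convolutions match those of VGG-16, the convolutional part is replaced directly with a pre-trained VGG16. Compared with FCN32s, FCN8s adds several upsampling layers and fuses information from multiple layers before producing the output, so in theory it should perform considerably better than FCN32s.
In the outputs, the FCN32s masks are visibly blocky, while FCN8s, with its extra upsampling and multi-layer information, does not show this artifact.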
:::
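The difference between the two decoders can be traced through tensor shapes. The sketch below uses nearest-neighbour upsampling in NumPy as a stand-in for the upsampling layers of the actual models; the input size of 512, the 7 classes, and the random score maps are all assumptions made for illustration.

```python
import numpy as np


def upsample(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) array."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)


num_classes, size = 7, 512  # hypothetical values, not taken from the report
rng = np.random.default_rng(0)

# Class score maps at 1/8, 1/16 and 1/32 of the input resolution.
score8 = rng.normal(size=(num_classes, size // 8, size // 8))
score16 = rng.normal(size=(num_classes, size // 16, size // 16))
score32 = rng.normal(size=(num_classes, size // 32, size // 32))

# FCN32s: a single 32x upsampling of the coarsest map.
fcn32s_out = upsample(score32, 32)

# FCN8s: upsample 2x and fuse with the 1/16 map, then again with the 1/8 map,
# and finally upsample 8x back to the input resolution.
fused16 = upsample(score32, 2) + score16
fused8 = upsample(fused16, 2) + score8
fcn8s_out = upsample(fused8, 8)

print(fcn32s_out.shape, fcn8s_out.shape)
```

In this sketch every 32×32 patch of `fcn32s_out` is constant, which mirrors the blocky masks seen with FCN32s, while `fcn8s_out` varies at the finer 8×8 scale.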
### 3. mIoU
:::success
Model A (FCN32s)
66.1319%
:::
:::success
Model B (FCN8s)
72.0480%
:::
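For reference, here is a minimal sketch of how a mean IoU score can be computed from predicted and ground-truth label maps. It averages over the classes that appear in either map, which is an assumption; the evaluation script used for the numbers above may handle classes differently, and `mean_iou` is a hypothetical helper.

```python
import numpy as np


def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes present in pred or target."""
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            continue  # class absent from both maps, so skip it
        inter = np.logical_and(pred_c, target_c).sum()
        ious.append(inter / union)
    return float(np.mean(ious))


# Tiny 2x2 label maps: class 0 has IoU 1/2 and class 1 has IoU 2/3.
pred = np.array([[0, 0], [1, 1]])
target = np.array([[0, 1], [1, 1]])
print(mean_iou(pred, target, num_classes=3))
```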
### 4. Segmentation results
:::warning
:::spoiler {state=open} **FCN32s**
#### **FCN32s**
**0013_sat**
| Epoch 5 | Epoch 45 | Epoch 70 |
| -------- | -------- | -------- |
| ![](https://i.imgur.com/GF3ZbVD.png) | ![](https://i.imgur.com/fSLYiKD.png) | ![](https://i.imgur.com/MIaDkYE.png) |
**0062_sat**
| Epoch 5 | Epoch 45 | Epoch 70 |
| -------- | -------- | -------- |
| ![](https://i.imgur.com/2kwcK2y.png) | ![](https://i.imgur.com/F6hNgfZ.png) | ![](https://i.imgur.com/TE7AFvI.png) |
**0104_sat**
| Epoch 5 | Epoch 45 | Epoch 70 |
| -------- | -------- | -------- |
| ![](https://i.imgur.com/tKDZsxS.png) | ![](https://i.imgur.com/mMyf5Ge.png) | ![](https://i.imgur.com/qKlQRYf.png) |
:::
:::warning
:::spoiler {state=open} **FCN8s**
#### **FCN8s**
**0013_sat**
| Epoch 5 | Epoch 80 | Epoch 160 |
| -------- | -------- | -------- |
| ![](https://i.imgur.com/iUqtyWk.png) | ![](https://i.imgur.com/e4xtDjx.png) | ![](https://i.imgur.com/KVtE2og.png) |
**0062_sat**
| Epoch 5 | Epoch 80 | Epoch 160 |
| -------- | -------- | -------- |
| ![](https://i.imgur.com/JE72mfS.png) | ![](https://i.imgur.com/ueYFlPR.png) | ![](https://i.imgur.com/lQK3bgs.png) |
**0104_sat**
| Epoch 5 | Epoch 80 | Epoch 160 |
| -------- | -------- | -------- |
| ![](https://i.imgur.com/a5fQQJF.png) | ![](https://i.imgur.com/lv3nASR.png) | ![](https://i.imgur.com/xzPofZq.png) |
:::