---
title: 2023/07/19 # presentation title
tags: meeting # presentation tags
slideOptions: # slide settings
  theme: black # color theme
  transition: 'fade' # page transition animation
  spotlight:
    enabled: true
---

# This Week's Progress

1. Set up the training environment for Pascal VOC multi-label classification

## Pascal VOC 2007

[Pascal VOC](http://host.robots.ox.ac.uk/pascal/VOC/voc2007/) is an early online image recognition challenge. It covers the following tasks:

1. Classification
2. Segmentation
3. Object Detection

The dataset contains 9963 images, split into trainval and test:

- trainval: 5011 images
- test: 4952 images

These 9963 images cover 20 classes:

- Person: person
- Animal: bird, cat, cow, dog, horse, sheep
- Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train
- Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor

The only task we care about is Classification.

## Co-Pattern Matrix

Before setting up the training environment, we first check whether the labels in this dataset are clearly correlated. We present this as a heatmap, where position $(i,j)$ is the probability that class-$j$ appears given that class-$i$ appears (the probability is estimated by simply dividing the counts).

- Training Heatmap
  ![](https://hackmd.io/_uploads/ryZ66Kz5h.png)
- Testing Heatmap
  ![](https://hackmd.io/_uploads/S1SxkcG93.png)

The heatmaps show that person is by far the most frequent class (2095 of the 5011 trainval images), and that chair and diningtable also co-occur with high probability. The other labels do not appear to be clearly correlated.

## Statistics

Some descriptive statistics of the dataset follow. The imbalance ratio of a class is its count divided by the total number of labels in the split.

- Training Labels per Image: 1.58
- Testing Labels per Image: 1.54

| Class        | Train Count | Train Imbalance Ratio | Test Count | Test Imbalance Ratio |
|--------------|-------------|-----------------------|------------|----------------------|
| aeroplane    | 240         | 0.0303                | 205        | 0.0269               |
| bicycle      | 255         | 0.0322                | 250        | 0.0328               |
| bird         | 333         | 0.0421                | 289        | 0.0379               |
| boat         | 188         | 0.0238                | 176        | 0.0231               |
| bottle       | 262         | 0.0331                | 240        | 0.0314               |
| bus          | 197         | 0.0249                | 183        | 0.0240               |
| car          | 761         | 0.0962                | 775        | 0.1015               |
| cat          | 344         | 0.0435                | 332        | 0.0435               |
| chair        | 572         | 0.0723                | 545        | 0.0714               |
| cow          | 146         | 0.0185                | 127        | 0.0166               |
| diningtable  | 263         | 0.0332                | 247        | 0.0324               |
| dog          | 430         | 0.0543                | 433        | 0.0567               |
| horse        | 294         | 0.0372                | 279        | 0.0366               |
| motorbike    | 249         | 0.0315                | 233        | 0.0305               |
| person       | 2095        | 0.2648                | 2097       | 0.2748               |
| pottedplant  | 273         | 0.0345                | 254        | 0.0333               |
| sheep        | 97          | 0.0123                | 98         | 0.0128               |
| sofa         | 372         | 0.0470                | 355        | 0.0465               |
| train        | 263         | 0.0332                | 259        | 0.0339               |
| tvmonitor    | 279         | 0.0353                | 255        | 0.0334               |

## Result

We start with the simplest possible training procedure as the baseline. The experiment settings are:

```yaml=
seed: 42
model: EfficientNet-B4
augmentation: HorizontalFlip
criterion: BCEWithLogitsLoss
resolution: 448 * 448
epochs: 100
optimizer: Adam
scheduler: Polynomial decreasing
batchsize: 16
lr: 1e-4
wd: 0
```

The results are as follows (OV denotes the average):

![](https://hackmd.io/_uploads/SkkpxZSq2.png)
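
As a supplement to the Co-Pattern Matrix and Statistics sections, here is a minimal sketch of how the co-occurrence estimate and the per-class statistics could be computed from a multi-hot label matrix. The function name `label_statistics` and the toy data are hypothetical and not taken from the actual training code. The imbalance ratio is assumed to be the class count divided by the total number of labels in the split.

```python
from typing import List, Tuple


def label_statistics(
    labels: List[List[int]],
) -> Tuple[List[int], float, List[float], List[List[float]]]:
    """Compute label statistics; labels[n][c] is 1 if class c is in image n."""
    num_images = len(labels)
    num_classes = len(labels[0])

    # Number of images in which each class appears.
    counts = [sum(row[c] for row in labels) for c in range(num_classes)]
    total_labels = sum(counts)

    # Average number of labels per image.
    labels_per_image = total_labels / num_images

    # Assumed definition: share of all labels that belong to each class.
    imbalance_ratio = [count / total_labels for count in counts]

    # cooc[i][j] estimates P(class j present | class i present)
    # as (#images with both i and j) / (#images with i).
    cooc = [[0.0] * num_classes for _ in range(num_classes)]
    for i in range(num_classes):
        if counts[i] == 0:
            continue  # class never appears, leave its row as zeros
        for j in range(num_classes):
            both = sum(1 for row in labels if row[i] and row[j])
            cooc[i][j] = both / counts[i]

    return counts, labels_per_image, imbalance_ratio, cooc


# Toy data: 4 images, 3 classes in the order (person, chair, diningtable).
toy_labels = [
    [1, 0, 0],
    [1, 1, 1],
    [0, 1, 1],
    [1, 0, 0],
]
counts, labels_per_image, imbalance_ratio, cooc = label_statistics(toy_labels)
print(counts)            # [3, 2, 2]
print(labels_per_image)  # 1.75
```

Note that the matrix is not symmetric in general: in the toy data, chair appears in one of the three person images, while person appears in one of the two chair images. The diagonal is 1 for every class that appears at least once.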