2020 YOLOv4
YOLOv4 breaks object detection down into four components: Input, Backbone (the CNN architecture), Neck (how features from different scales are fused), and Head (how bounding boxes are proposed, plus the loss function).
YOLOv4 makes the following architectural choices:
- Backbone: CSPDarknet53
- Neck: SPP, PAN (PANet)
- Head: YOLOv3
Together these make up the overall network architecture.
Backbone: CSPDarknet53
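Applying the CSP architecture to Darknet53 gives CSPDarknet53.
What is CSP? CSPNet (Cross Stage Partial Network) was proposed mainly to address three problems:
- Strengthen the learning ability of a CNN, so that it stays accurate even when made lightweight.
- Reduce computational bottlenecks.
- Reduce memory cost.
As an example, the CSP architecture can be applied to DenseNet.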
With the CSP architecture added, the parameter count goes down and accuracy even improves.
Neck: SPP
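Figure (a) is the original SPP design: the 2D feature map is turned into a 1D feature vector, so fully-connected layers have to follow it.
Figure (b) instead concatenates the pooled feature maps, so convolutional layers can still follow; this is the variant used with YOLOv3. One way to picture the benefit is that it effectively adds one more layer to the network.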
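The concatenating variant can be sketched in plain Python. This is a minimal illustration, not the actual Darknet layer: the helper names `max_pool_same` and `spp_block` are hypothetical, feature maps are nested lists, and the kernel sizes 5, 9 and 13 are the values usually quoted for this block rather than something stated in this note.

```python
def max_pool_same(fmap, k):
    """Stride-1 max pooling over a k x k window; output size equals input size.

    Positions outside the map are ignored, which matches padding with -inf.
    """
    h, w = len(fmap), len(fmap[0])
    r = k // 2
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            window = [
                fmap[y][x]
                for y in range(max(0, i - r), min(h, i + r + 1))
                for x in range(max(0, j - r), min(w, j + r + 1))
            ]
            row.append(max(window))
        out.append(row)
    return out


def spp_block(channels, kernel_sizes=(5, 9, 13)):
    """Concatenate the input channels with their pooled copies (channel axis)."""
    out = list(channels)  # identity branch keeps the original features
    for k in kernel_sizes:
        out.extend(max_pool_same(c, k) for c in channels)
    return out


# One 4x4 channel with a strong activation and a weak one.
fmap = [[0, 0, 0, 0],
        [0, 9, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 1]]
out = spp_block([fmap])
print(len(out), len(out[0]), len(out[0][0]))  # channels, height, width
```

Because every branch keeps the spatial size, the result is still a 2D feature map with more channels, which is why convolutions can follow it.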
Neck: PAN (PANet)
FPN was introduced earlier; PANet is an enhanced version of FPN.
YOLOv4 changes one thing here: the addition used to fuse features is replaced with concatenation.
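The difference between the two fusion styles can be shown with a toy example. This sketch treats a feature as a flat list of per-channel values at one spatial position; `fuse_add` and `fuse_concat` are hypothetical names used only for illustration.

```python
def fuse_add(a, b):
    """Element-wise addition: both inputs must have the same channel count."""
    if len(a) != len(b):
        raise ValueError("addition needs matching channel counts")
    return [x + y for x, y in zip(a, b)]


def fuse_concat(a, b):
    """Concatenation: channels are stacked, so nothing is summed away."""
    return list(a) + list(b)


top_down = [1.0, 2.0]
bottom_up = [10.0, 20.0]
print(fuse_add(top_down, bottom_up))     # same number of channels as the inputs
print(fuse_concat(top_down, bottom_up))  # channel count is the sum of both
```

Concatenation keeps both sources separate and leaves it to the following convolution to decide how to combine them, at the cost of more channels.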
Head: YOLOv3
YOLOv4 reuses the YOLOv3 head.
Bag of Freebies & Bag of Specials
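The paper also groups the improvement techniques into two categories:
- Bag of Freebies (BoF)
Methods that improve model accuracy without increasing inference time.
- Bag of Specials (BoS)
Methods that add a little inference time but improve model accuracy.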
YOLOv4 uses the following techniques:
BoF for backbone
- CutMix
Mixes images and adjusts the labels to match the mix.
- Mosaic data augmentation
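Combines four images into one training image, so a single image does the work of four; to some extent this has the effect of a larger mini-batch size.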
- DropBlock regularization
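Because image content is spatially continuous, randomly dropping a few isolated pixels does very little; to have an effect, a whole contiguous region has to be dropped.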
- Class label smoothing
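The last layer is trained with cross entropy, so to push the predicted probability toward 1 the logit for the correct label has to become very large. Suppose the last layer outputs [0.1, 0.1, 0.1, 0.1, 6]; after softmax this becomes roughly [0.003, 0.003, 0.003, 0.003, 0.989]. Note that 6 is 60 times 0.1, and this kind of severe imbalance easily leads to overfitting. The fix is simple: soften the ground truth to [0.02, 0.02, 0.02, 0.02, 0.92]. In other words, an overly perfect answer is not a good target.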
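The numbers in the label smoothing example can be reproduced with a few lines of Python. This is a minimal sketch assuming the common uniform smoothing rule (1 - eps) * y + eps / K with eps = 0.1; the function names are illustrative.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def smooth_labels(one_hot, eps=0.1):
    """Uniform label smoothing: move eps of the mass evenly onto all K classes."""
    k = len(one_hot)
    return [(1.0 - eps) * y + eps / k for y in one_hot]


probs = softmax([0.1, 0.1, 0.1, 0.1, 6.0])
target = smooth_labels([0, 0, 0, 0, 1], eps=0.1)
print([round(p, 3) for p in probs])
print([round(t, 2) for t in target])
```

With five classes and eps = 0.1, the true class gets 0.92 and every other class gets 0.02, which matches the softened ground truth quoted above.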
BoS for backbone
- Cross-Stage Partial connections (CSP)
- Mish activation
Besides the familiar ReLU, leaky ReLU (which keeps a small slope for inputs below 0) is also mainstream. YOLOv4 uses the Mish activation, which is a continuously differentiable function.
The experimental results show that Mish performs better.
- Multi-input Weighted Residual Connections (MiWRC)
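For the Mish activation listed above, a scalar sketch makes the shape concrete. It uses the standard definition mish(x) = x * tanh(softplus(x)); the leaky ReLU slope of 0.1 is an assumed value for comparison only.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    """Mish: smooth everywhere, slightly negative for negative inputs."""
    return x * math.tanh(softplus(x))


def leaky_relu(x, slope=0.1):
    """Leaky ReLU: has a kink at 0, unlike Mish."""
    return x if x > 0 else slope * x


for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(x, round(mish(x), 4), round(leaky_relu(x), 4))
```

For large positive inputs Mish behaves like the identity, while for negative inputs it dips slightly below zero and then returns toward zero instead of decreasing linearly.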
BoF for detector
- CIoU-loss
IoU-loss:
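IoU-loss has an obvious drawback: when the predicted bbox and the ground truth do not intersect, IoU is 0 and says nothing about how near or far apart the two boxes are. The gradient direction is lost, so the loss cannot be optimized. This led to GIoU-loss, DIoU-loss, and CIoU-loss, each of which improves on IoU-loss by adding a penalty term.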
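The drawback is easy to see numerically. The sketch below assumes boxes in (x1, y1, x2, y2) corner format and the usual loss definition 1 - IoU; the example coordinates are made up.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


gt = (0, 0, 2, 2)
near = (3, 0, 5, 2)    # just next to the ground truth
far = (30, 0, 32, 2)   # far away from it
print(1 - iou(gt, near), 1 - iou(gt, far))
```

Both predictions get the same loss even though one is much closer to the target, so the loss gives no signal about which direction to move.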
GIoU-loss:
GIoU-loss adds a penalty term to IoU-loss, computed from the smallest region that encloses both boxes; the farther apart the boxes are, the larger the penalty.
Here we can see that the farther apart the two boxes are, the larger the loss.
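A small sketch of GIoU-loss shows this behavior. It assumes the standard form 1 - IoU + (|C| - |A ∪ B|) / |C|, where C is the smallest enclosing box, and boxes in (x1, y1, x2, y2) format; the coordinates are made up.

```python
def giou_loss(a, b):
    """GIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    # Smallest box that encloses both inputs.
    c_area = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    return 1 - inter / union + (c_area - union) / c_area


gt = (0, 0, 2, 2)
near = (3, 0, 5, 2)
far = (30, 0, 32, 2)
print(giou_loss(gt, near), giou_loss(gt, far))
```

Unlike plain IoU-loss, the two disjoint predictions now get different losses, and the more distant one is penalized more.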
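GIoU still has the following drawbacks, which slow down convergence:
1. When one box fully contains the other, GIoU-loss degenerates to IoU-loss, so the value is the same wherever the inner box sits.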
2. When the two boxes are aligned horizontally or vertically, the GIoU-loss values also come out the same.
DIoU-loss:
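DIoU fits the bbox regression mechanism better than GIoU: besides borrowing GIoU's idea, it also takes the distance between the two box centers into account, which speeds up convergence. Its penalty term is d²/c².
Here c is the diagonal length of the smallest region that encloses both boxes,
and d is the Euclidean distance between the center points of the two boxes.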
DIoU-loss changes as the center point moves, which overcomes the first drawback.
For the same reason, DIoU-loss changing with the center point also overcomes the second drawback.
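The containment case can be checked numerically. The sketch below compares GIoU-loss and DIoU-loss for two predictions that both lie inside the ground truth; it assumes the standard DIoU form 1 - IoU + d²/c² and boxes in (x1, y1, x2, y2) format, and the helper `_stats` is a hypothetical name.

```python
def _stats(a, b):
    """Return IoU, union area, and enclosing-box width and height."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    cw = max(a[2], b[2]) - min(a[0], b[0])
    ch = max(a[3], b[3]) - min(a[1], b[1])
    return inter / union, union, cw, ch


def giou_loss(a, b):
    iou, union, cw, ch = _stats(a, b)
    return 1 - iou + (cw * ch - union) / (cw * ch)


def diou_loss(a, b):
    iou, _, cw, ch = _stats(a, b)
    c2 = cw ** 2 + ch ** 2  # squared diagonal of the enclosing box
    d2 = (((a[0] + a[2]) - (b[0] + b[2])) / 2) ** 2 \
        + (((a[1] + a[3]) - (b[1] + b[3])) / 2) ** 2  # squared center distance
    return 1 - iou + d2 / c2


gt = (0, 0, 4, 4)
centered = (1, 1, 3, 3)  # inside gt, same center
corner = (0, 0, 2, 2)    # inside gt, pushed into a corner
print(giou_loss(gt, centered), giou_loss(gt, corner))
print(diou_loss(gt, centered), diou_loss(gt, corner))
```

GIoU-loss cannot tell the two predictions apart, while DIoU-loss is lower for the centered one, which gives the optimizer a direction to move in.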
The losses were also compared in terms of convergence speed.
CIoU-loss:
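Building on DIoU, CIoU additionally takes the aspect ratio into account. In its penalty term, 𝛼 is a weighting function and 𝜐 measures how similar the aspect ratios of the two boxes are.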
When the two boxes have different aspect ratios, CIoU-loss is affected as well.
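The aspect-ratio effect can be isolated with two predictions that share the same IoU and the same center. This sketch assumes the commonly used CIoU definition: loss = 1 - IoU + d²/c² + 𝛼𝜐, with 𝜐 = (4/π²)(arctan(w_gt/h_gt) - arctan(w/h))² and 𝛼 = 𝜐 / ((1 - IoU) + 𝜐). In the code, 𝛼 and 𝜐 are named `alpha` and `v`.

```python
import math


def ciou_loss(pred, gt):
    """CIoU loss for boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(pred[2], gt[2]) - max(pred[0], gt[0]))
    ih = max(0.0, min(pred[3], gt[3]) - max(pred[1], gt[1]))
    inter = iw * ih
    w, h = pred[2] - pred[0], pred[3] - pred[1]
    w_gt, h_gt = gt[2] - gt[0], gt[3] - gt[1]
    iou = inter / (w * h + w_gt * h_gt - inter)

    # DIoU part: squared center distance over squared enclosing diagonal.
    cw = max(pred[2], gt[2]) - min(pred[0], gt[0])
    ch = max(pred[3], gt[3]) - min(pred[1], gt[1])
    c2 = cw ** 2 + ch ** 2
    dx = ((pred[0] + pred[2]) - (gt[0] + gt[2])) / 2
    dy = ((pred[1] + pred[3]) - (gt[1] + gt[3])) / 2
    d2 = dx ** 2 + dy ** 2

    # Aspect-ratio term v and its weight alpha (alpha is 0 when v is 0).
    v = (4 / math.pi ** 2) * (math.atan(w_gt / h_gt) - math.atan(w / h)) ** 2
    alpha = v / ((1 - iou) + v) if v > 0 else 0.0
    return 1 - iou + d2 / c2 + alpha * v


gt = (0, 0, 4, 4)
square = (1, 1, 3, 3)     # 2 x 2, same aspect ratio as gt
wide = (0, 1.5, 4, 2.5)   # 4 x 1, same area and center as `square`
print(ciou_loss(square, gt), ciou_loss(wide, gt))
```

Both predictions have IoU 0.25 and zero center distance, so DIoU-loss would treat them alike; only the aspect-ratio term makes the wide box cost more.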
- Cross mini-Batch Normalization (CmBN)
- DropBlock regularization
- Mosaic data augmentation
- Self-Adversarial Training (SAT)
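The authors are still developing this technique; it uses adversarial training to avoid overfitting.
The procedure is:
1. Feed the image into the model, which outputs intermediate feature values.
2. Subtract these values, that is, modify the image along the gradient (the model's parameters are left unchanged).
3. Feed the modified image back into the model for training.
e.g. If the model classifies a dog by looking only at its eyes, SAT removes the dog's eye features from the image in step 2. The model then has to learn the dog's other features in order to classify it.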
- Cosine annealing scheduler
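Since the model should converge as training proceeds, the learning rate should become smaller and smaller. This smooth decay works better than the original schedule, which lowers the learning rate only once every fixed interval (the blue dashed line).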
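The cosine annealing schedule can be sketched as follows. It uses the standard single-cycle formula lr = lr_min + 0.5 * (lr_max - lr_min) * (1 + cos(π * t / T)), without warm restarts; `lr_max`, `lr_min` and the step counts are made-up values for illustration.

```python
import math


def cosine_annealing_lr(step, total_steps, lr_max=0.01, lr_min=0.0):
    """Learning rate at `step`, decaying from lr_max to lr_min along a half cosine."""
    progress = min(max(step, 0), total_steps) / total_steps
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * progress))


schedule = [cosine_annealing_lr(s, 100) for s in range(0, 101, 25)]
print([round(lr, 5) for lr in schedule])
```

The rate drops slowly at first, fastest in the middle, and flattens out near the end, in contrast to a step schedule that stays constant and then falls abruptly.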
BoS for detector
- Mish activation
- SPP-block
- SAM-block
Applies attention over spatial regions: it produces an H×W×1 attention map, and the regions that matter get weighted up.
- PAN path-aggregation block
- DIoU-NMS
In object detection, the model proposes many bounding boxes for the same object. NMS deduplicates them by discarding boxes that overlap a higher-scoring box.
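This can of course delete boxes by mistake (for example, a box with lower confidence that sits very close to another one), so DIoU-NMS adds a distance criterion on top of the overlap.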
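The effect of the extra distance criterion can be sketched with a greedy NMS loop. This is an illustration under assumptions, not the Darknet implementation: a box is suppressed when IoU - d²/c² with a kept box reaches the threshold, boxes use (x1, y1, x2, y2) format, and the threshold of 0.3 and the example boxes are made up.

```python
def overlap_score(a, b, use_distance):
    """IoU, optionally reduced by the DIoU center-distance penalty d^2 / c^2."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    iou = inter / union
    if not use_distance:
        return iou
    cw = max(a[2], b[2]) - min(a[0], b[0])
    ch = max(a[3], b[3]) - min(a[1], b[1])
    dx = ((a[0] + a[2]) - (b[0] + b[2])) / 2
    dy = ((a[1] + a[3]) - (b[1] + b[3])) / 2
    return iou - (dx ** 2 + dy ** 2) / (cw ** 2 + ch ** 2)


def nms(boxes, scores, threshold=0.3, use_distance=True):
    """Greedy NMS; returns indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order
                 if overlap_score(boxes[best], boxes[i], use_distance) < threshold]
    return keep


boxes = [(0, 0, 4, 4), (2, 0, 6, 4), (0.2, 0, 4.2, 4)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores, use_distance=False))  # plain IoU-based NMS
print(nms(boxes, scores, use_distance=True))   # DIoU-NMS
```

In this example the second box overlaps the first but its center is clearly offset, so the distance penalty lets it survive, while the third box, a near duplicate of the first, is removed either way.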