
YOLOv1-v4 + PP-YOLO (Part 2)

Full name: You Only Look Once

V4
YOLOv4: Optimal Speed and Accuracy of Object Detection

PP-YOLO
PP-YOLO: An Effective and Efficient Implementation of Object Detector

Download the compiled PPT slides: link

tags: papper


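  • CBM: the smallest component of the YOLOv4 network, made up of Conv + BN + Mish activation.
  • CBL: made up of Conv + BN + Leaky ReLU activation.
  • Res unit: borrows the residual structure from ResNet so that the network can be built deeper.
  • CSPX: borrows the CSPNet structure; made up of convolution layers and X Res unit modules joined by Concat.
  • SPP: applies max pooling with 1×1, 5×5, 9×9, and 13×13 kernels for multi-scale fusion.

The convolution in front of every CSP module uses a 3x3 kernel with stride=2.

Because the backbone has 5 CSP modules,
the feature map size changes as: 608->304->152->76->38->19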
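The size sequence above follows from applying a stride-2 3x3 convolution five times. A minimal sketch, assuming a padding of 1 (the padding is not stated in these notes) and a hypothetical helper name `feature_map_sizes`:

```python
def feature_map_sizes(input_size: int, num_downsamples: int) -> list[int]:
    """Spatial size after each 3x3, stride-2 downsampling convolution."""
    kernel, stride, padding = 3, 2, 1  # padding=1 is an assumption
    sizes = [input_size]
    for _ in range(num_downsamples):
        # Standard convolution output size: floor((in + 2p - k) / s) + 1
        sizes.append((sizes[-1] + 2 * padding - kernel) // stride + 1)
    return sizes


print(feature_map_sizes(608, 5))  # [608, 304, 152, 76, 38, 19]
```

Each of the 5 CSP modules is preceded by one such convolution, so a 608x608 input ends at 19x19.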

The backbone uses the Mish activation function,
while the layers after it still use the Leaky ReLU activation function.


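YOLOv4 consists of three parts:
Backbone: CSPDarknet53
Neck: SPP, PAN
Head: YOLOv3


Specifically, YOLOv4 uses:

Bag of Freebies (BoF): methods that improve detection accuracy by adding only training cost, not inference cost; this usually means data augmentation.

Bag of Specials (BoS): methods that improve object detection accuracy by changing the network structure, at the price of some extra inference cost.

Bag of Freebies (BoF) for the backbone: CutMix and Mosaic data augmentation, DropBlock regularization, and class label smoothing;

Bag of Specials (BoS) for the backbone: Mish activation, CSP, and multi-input weighted residual connections (MiWRC);

Bag of Freebies (BoF) for the detector: CIoU-loss, CmBN, DropBlock regularization, Mosaic data augmentation, self-adversarial training (SAT), eliminating grid sensitivity, using multiple anchors for a single ground truth, cosine annealing, optimized hyperparameters, and random training shapes;

Bag of Specials (BoS) for the detector: Mish activation, SPP block, SAM block, PAN path-aggregation block, and DIoU-NMS.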


Mosaic data augmentation


Mish activation function

The backbone uses the Mish activation function, while the rest of the network still uses Leaky ReLU.
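For reference, the two activations can be compared on scalars. This is a minimal sketch using the published definition Mish(x) = x * tanh(softplus(x)); the Leaky ReLU slope of 0.1 is an assumption (the value commonly used in Darknet configs), not something stated in these notes.

```python
import math


def softplus(x: float) -> float:
    # Numerically stable form of log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    # Mish(x) = x * tanh(softplus(x)): smooth and non-monotonic near zero.
    return x * math.tanh(softplus(x))


def leaky_relu(x: float, negative_slope: float = 0.1) -> float:
    # negative_slope=0.1 is an assumed default, not taken from the notes.
    return x if x >= 0 else negative_slope * x


for value in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(value, round(mish(value), 4), round(leaky_relu(value), 4))
```

Unlike Leaky ReLU, Mish has no kink at zero, and for large positive inputs it approaches the identity.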


DropBlock


SPP module


Max pooling is applied with k={1x1,5x5,9x9,13x13}, and the feature maps from the different scales are then joined with a Concat operation.
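The pooling-then-concat step can be sketched in plain Python. This is an illustration only, assuming stride 1 and "same" padding so that every pooled map keeps the input's spatial size (needed for the Concat); `max_pool_same` and `spp` are hypothetical helper names, and a feature map is a list of 2D channels.

```python
def max_pool_same(channel, k):
    """k x k max pooling, stride 1, 'same' padding, on one 2D channel."""
    h, w = len(channel), len(channel[0])
    r = k // 2
    pooled = []
    for i in range(h):
        row = []
        for j in range(w):
            # Clipping the window at the border equals padding with -inf.
            window = [channel[y][x]
                      for y in range(max(0, i - r), min(h, i + r + 1))
                      for x in range(max(0, j - r), min(w, j + r + 1))]
            row.append(max(window))
        pooled.append(row)
    return pooled


def spp(channels, kernel_sizes=(1, 5, 9, 13)):
    """Pool every channel at each kernel size, then concat along channels."""
    out = []
    for k in kernel_sizes:
        out.extend(max_pool_same(c, k) for c in channels)
    return out


feature_map = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]  # C=1, H=W=3
result = spp(feature_map)
print(len(result), "channels of size", len(result[0]), "x", len(result[0][0]))
```

The 1x1 branch is the identity, so the output holds the original features plus three progressively larger receptive-field views, giving 4x the input channels.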

FPN+PAN


In the PAN of the original PANet, two feature maps are merged with a shortcut (element-wise addition) operation, whereas YOLOv4 merges them with concat.
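The practical difference between the two merge operations is the output channel count. A minimal sketch with feature maps stored as nested lists of shape (C, H, W); the function names are hypothetical:

```python
def shortcut_add(a, b):
    """Element-wise sum; both maps must have the same (C, H, W) shape."""
    assert len(a) == len(b), "shortcut needs matching channel counts"
    return [[[x + y for x, y in zip(row_a, row_b)]
             for row_a, row_b in zip(ch_a, ch_b)]
            for ch_a, ch_b in zip(a, b)]


def concat(a, b):
    """Channel-wise concatenation; spatial sizes must match, channels add up."""
    return a + b


a = [[[1, 2], [3, 4]]]      # C=1, H=W=2
b = [[[10, 20], [30, 40]]]  # C=1, H=W=2
print(len(shortcut_add(a, b)))  # 1 channel: values are summed
print(len(concat(a, b)))        # 2 channels: both inputs are kept
```

Addition keeps the channel count and blends the two inputs, while concat keeps both inputs intact and leaves the mixing to the following convolution.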


GIOU to DIOU to CIOU_loss

GIOU


Problem: predicted boxes of the same size that lie inside the target box all get the same GIoU, wherever they sit.
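This failure case is easy to reproduce. The sketch below uses the standard definition GIoU = IoU - (C - U) / C, where C is the area of the smallest enclosing box and U is the union area; boxes are (x1, y1, x2, y2) tuples, and the example coordinates are made up for illustration.

```python
def box_area(box):
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def giou(a, b):
    inter = box_area((max(a[0], b[0]), max(a[1], b[1]),
                      min(a[2], b[2]), min(a[3], b[3])))
    union = box_area(a) + box_area(b) - inter
    # Smallest axis-aligned box that encloses both a and b.
    enclosing = box_area((min(a[0], b[0]), min(a[1], b[1]),
                          max(a[2], b[2]), max(a[3], b[3])))
    return inter / union - (enclosing - union) / enclosing


target = (0, 0, 10, 10)
centered = (3, 3, 7, 7)  # 4x4 box in the middle of the target
corner = (0, 0, 4, 4)    # same 4x4 box pushed into a corner
print(giou(target, centered), giou(target, corner))  # 0.16 0.16
```

Because the enclosing box equals the target in both cases, the penalty term vanishes and GIoU degrades to plain IoU, giving no signal about which position is better.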


DIOU:


Problem: it carries no width/height (aspect ratio) information.
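A small sketch makes both sides visible, using the standard definition DIoU = IoU - d² / c² (d is the distance between box centers, c the diagonal of the smallest enclosing box). Boxes are (x1, y1, x2, y2) tuples and the coordinates are made up for illustration.

```python
def box_area(box):
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def iou(a, b):
    inter = box_area((max(a[0], b[0]), max(a[1], b[1]),
                      min(a[2], b[2]), min(a[3], b[3])))
    return inter / (box_area(a) + box_area(b) - inter)


def diou(a, b):
    # Squared distance between the two box centers.
    d2 = (((a[0] + a[2]) - (b[0] + b[2])) / 2) ** 2 \
       + (((a[1] + a[3]) - (b[1] + b[3])) / 2) ** 2
    # Squared diagonal of the smallest enclosing box.
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 \
       + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return iou(a, b) - d2 / c2


target = (0, 0, 10, 10)
square = (3, 3, 7, 7)  # 4x4, centered
corner = (0, 0, 4, 4)  # 4x4, off-center: DIoU penalizes it
wide = (1, 4, 9, 6)    # 8x2, centered, same area as `square`
print(diou(target, square), diou(target, corner), diou(target, wide))
```

DIoU separates `square` from `corner`, but `square` and `wide` share the same center and area, so they score the same even though only one matches the target's shape.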

Adding width/height information: CIOU
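As a sketch of how the aspect-ratio term works, the code below follows the published CIoU definition: CIoU = IoU - d² / c² - αv, with v = (4 / π²) (arctan(w_gt / h_gt) - arctan(w / h))² and α = v / (1 - IoU + v). The small `eps` guard and the example boxes are assumptions added for illustration.

```python
import math


def box_area(box):
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def ciou(pred, gt, eps=1e-9):
    inter = box_area((max(pred[0], gt[0]), max(pred[1], gt[1]),
                      min(pred[2], gt[2]), min(pred[3], gt[3])))
    iou = inter / (box_area(pred) + box_area(gt) - inter)
    # Center-distance penalty, as in DIoU.
    d2 = (((pred[0] + pred[2]) - (gt[0] + gt[2])) / 2) ** 2 \
       + (((pred[1] + pred[3]) - (gt[1] + gt[3])) / 2) ** 2
    c2 = (max(pred[2], gt[2]) - min(pred[0], gt[0])) ** 2 \
       + (max(pred[3], gt[3]) - min(pred[1], gt[1])) ** 2
    # Aspect-ratio consistency term v and its trade-off weight alpha.
    w_p, h_p = pred[2] - pred[0], pred[3] - pred[1]
    w_g, h_g = gt[2] - gt[0], gt[3] - gt[1]
    v = (4 / math.pi ** 2) * (math.atan(w_g / h_g) - math.atan(w_p / h_p)) ** 2
    alpha = v / (1 - iou + v + eps)  # eps avoids 0/0 for a perfect match
    return iou - d2 / c2 - alpha * v


target = (0, 0, 10, 10)
square = (3, 3, 7, 7)  # same aspect ratio as the target
wide = (1, 4, 9, 6)    # same center and area, different aspect ratio
print(ciou(square, target), ciou(wide, target))
```

The two predictions are now ranked differently: `wide` is penalized for its mismatched aspect ratio. The training loss is then taken as 1 - CIoU.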


DIOU_nms

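The IoU term in the original NMS is replaced with DIoU.
CIoU is not used because no ground-truth (GT) information is available at test time.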
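A minimal sketch of greedy NMS with DIoU as the overlap criterion follows. The threshold of 0.5 and the example boxes are assumptions for illustration; boxes are (x1, y1, x2, y2) tuples.

```python
def box_area(box):
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def diou(a, b):
    inter = box_area((max(a[0], b[0]), max(a[1], b[1]),
                      min(a[2], b[2]), min(a[3], b[3])))
    iou = inter / (box_area(a) + box_area(b) - inter)
    d2 = (((a[0] + a[2]) - (b[0] + b[2])) / 2) ** 2 \
       + (((a[1] + a[3]) - (b[1] + b[3])) / 2) ** 2
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 \
       + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return iou - d2 / c2


def diou_nms(boxes, scores, threshold=0.5):
    """Greedy NMS that suppresses by DIoU instead of IoU; returns kept indices."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Drop every remaining box whose DIoU with the kept box is too high.
        order = [i for i in order if diou(boxes[best], boxes[i]) <= threshold]
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(diou_nms(boxes, scores))  # [0, 2]
```

Since DIoU subtracts a center-distance penalty, two boxes with high IoU but clearly separated centers are less likely to suppress each other, which helps with occluded objects.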




PP-YOLOv1


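Backbone:
ResNet50-vd, with the 3x3 convolutions in the last stage replaced by DCN (deformable convolution).

Model accuracy (mAP) rises from 38.9% to 39.1%, and speed rises from 58.2 FPS to 79.2 FPS.

Detection neck:
FPN is still used, fed by the outputs of the last three convolution stages C3, C4, C5.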

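Selection of tricks:

  • Larger batch size: the batch size is raised from 64 to 192
  • EMA (exponential moving average of the model parameters): λ is set to 0.9998
  • DropBlock
  • Matrix NMS: runs faster than traditional NMS.
  • CoordConv
  • SPP
  • Grid Sensitive: alpha is set to 1.05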
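Two of the tricks above come down to one-line formulas. The sketch below assumes the forms given in the PP-YOLO paper, neither of which is spelled out in these notes: the moving average is W_ema = λ W_ema + (1 - λ) W over model weights, and Grid Sensitive decodes a box center as x = s (g_x + α σ(p_x) - (α - 1) / 2). The function names are hypothetical.

```python
import math


def ema_update(ema_weights, weights, lam=0.9998):
    """One EMA step over a flat list of weights (assumed form, see lead-in)."""
    return [lam * e + (1.0 - lam) * w for e, w in zip(ema_weights, weights)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode_center(p, grid_index, stride, alpha=1.05):
    """Grid Sensitive decoding of one box-center coordinate (assumed form).

    With alpha=1 this reduces to the YOLOv3 rule stride * (g + sigmoid(p)).
    """
    return stride * (grid_index + alpha * sigmoid(p) - (alpha - 1.0) / 2.0)


print(ema_update([1.0], [0.0]))   # the EMA moves only slightly toward the new weight
print(decode_center(0.0, 3, 32))  # center of grid cell 3 at stride 32
print(decode_center(20.0, 3, 32)) # can now reach past the cell's right edge
```

Scaling the sigmoid by α slightly above 1 lets the predicted center land exactly on a grid-cell boundary, which the plain sigmoid can only approach asymptotically.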

Image Mixup


The loss used during training is the sum of the two images' losses, each multiplied by its own mixing weight.
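The weighted sum can be sketched as follows. Images are flat lists of pixel values for brevity; drawing the mixing weight from a Beta distribution is the usual Mixup practice, but the Beta(1.5, 1.5) parameters and the loss values here are assumptions for illustration.

```python
import random


def mixup(image_a, image_b, lam):
    """Blend two equally sized images pixel by pixel with weight lam."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(image_a, image_b)]


def mixup_loss(loss_a, loss_b, lam):
    """Weighted sum of the per-image losses, using the same weight lam."""
    return lam * loss_a + (1.0 - lam) * loss_b


rng = random.Random(0)
lam = rng.betavariate(1.5, 1.5)  # assumed Beta parameters
mixed = mixup([0.0, 10.0], [10.0, 0.0], lam)
# loss_a / loss_b stand in for the detection loss of the mixed image
# computed against each original image's labels.
total = mixup_loss(loss_a=2.0, loss_b=4.0, lam=lam)
print(lam, mixed, total)
```

The same `lam` that blends the pixels also weights the losses, so the image that contributes more to the mix also contributes more to the gradient.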


DropBlock



Matrix NMS

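Soft NMS: instead of discarding an overlapping box outright, Soft NMS lowers the box's original confidence score, so that score is still taken into account.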
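Matrix NMS applies the same soft score decay, but computes the decay of every box from the pairwise IoU matrix in one pass instead of a sequential loop. The sketch below assumes the Gaussian form from the SOLOv2 paper, decay_j = min over i < j of f(iou_ij) / f(cmax_i) with f(x) = exp(-x² / σ), where cmax_i is the largest IoU of box i with any higher-scoring box. σ = 0.5 and the example boxes are assumptions.

```python
import math


def box_area(box):
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def iou(a, b):
    inter = box_area((max(a[0], b[0]), max(a[1], b[1]),
                      min(a[2], b[2]), min(a[3], b[3])))
    return inter / (box_area(a) + box_area(b) - inter)


def matrix_nms(boxes, scores, sigma=0.5):
    """Return (index, decayed_score) pairs, highest original score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    n = len(order)
    # ious[i][j]: IoU between the i-th and j-th ranked boxes, for i < j only.
    ious = [[iou(boxes[order[i]], boxes[order[j]]) if i < j else 0.0
             for j in range(n)] for i in range(n)]
    # cmax[i]: how strongly box i itself overlaps a higher-scoring box.
    cmax = [max((ious[k][i] for k in range(i)), default=0.0) for i in range(n)]
    result = []
    for j in range(n):
        # f(iou_ij) / f(cmax_i), written as a single exponential.
        decay = min((math.exp(-(ious[i][j] ** 2 - cmax[i] ** 2) / sigma)
                     for i in range(j)), default=1.0)
        result.append((order[j], scores[order[j]] * decay))
    return result


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
for index, score in matrix_nms(boxes, [0.9, 0.8, 0.7]):
    print(index, round(score, 4))
```

Every decay factor depends only on the IoU matrix, so on a GPU the whole computation is a handful of matrix operations, which is where the speed advantage over sequential NMS comes from. A score threshold is applied afterwards to drop the heavily decayed boxes.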



CoordConv


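Because a filter has no idea where it is while it slides over the input, the Uber team proposed a new method called CoordConv, which feeds the network two extra channels that encode coordinates along with the input.

If the coordinate channels of CoordConv learn nothing, it is equivalent to a traditional convolution; if the coordinate channels do learn some information, CoordConv gains a degree of translation dependence.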
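The extra input channels can be sketched in a few lines. Feature maps are nested lists of shape (C, H, W); normalizing the coordinates to [-1, 1] is an assumption (a common choice), and `add_coord_channels` is a hypothetical helper name.

```python
def add_coord_channels(channels):
    """Append an x-coordinate and a y-coordinate channel to a (C, H, W) map."""
    h, w = len(channels[0]), len(channels[0][0])

    def norm(i, n):
        # Map index 0..n-1 to [-1, 1]; a single row or column maps to 0.
        return 2.0 * i / (n - 1) - 1.0 if n > 1 else 0.0

    x_channel = [[norm(j, w) for j in range(w)] for _ in range(h)]
    y_channel = [[norm(i, h) for _ in range(w)] for i in range(h)]
    return channels + [x_channel, y_channel]


feature_map = [[[0, 0, 0], [0, 0, 0]]]  # C=1, H=2, W=3
out = add_coord_channels(feature_map)
print(len(out))  # 3 channels: the original plus x and y
print(out[1])    # x channel: [[-1.0, 0.0, 1.0], [-1.0, 0.0, 1.0]]
print(out[2])    # y channel: [[-1.0, -1.0, -1.0], [1.0, 1.0, 1.0]]
```

An ordinary convolution applied to this augmented input can zero the weights on the two new channels, which recovers the traditional translation-invariant behavior described above.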


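SPP (Spatial Pyramid Pooling)

A method proposed by SPPNet that mainly addresses two problems:

1. It avoids the incomplete objects caused by the cropping and warping that R-CNN applies to images.
2. It removes the repeated feature extraction that the CNN performs over the same image.
SPP architecture diagram: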


Experiments


Summary:
PP-YOLO does not adopt existing SOTA network structures the way YOLOv4 does;
instead, it focuses on a sensible stacking of tricks.