YOLOv1-v4 + PP-YOLO (Part 2)

Full name: You Only Look Once

V4
YOLOv4: Optimal Speed and Accuracy of Object Detection

PP-YOLO
PP-YOLO: An Effective and Efficient Implementation of Object Detector

Download the compiled PPT slides: link

tags: `papper`

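CBM: the smallest component in the YOLOv4 network structure, made up of Conv + BN + the Mish activation function.
CBL: made up of Conv + BN + the Leaky_relu activation function.
Res unit: borrows the residual structure from ResNet so that the network can be built deeper.
CSPX: borrows the CSPNet structure; made up of convolutional layers and X Res unit modules joined by Concat.
SPP: applies 1×1, 5×5, 9×9 and 13×13 max pooling to fuse features at multiple scales.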

The convolution in front of each CSP module uses a 3x3 kernel with stride=2.

Because the Backbone has 5 CSP modules, the feature map is halved five times.
Feature map sizes: 608->304->152->76->38->19
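
The size sequence above follows from the standard convolution output-size formula. A minimal sketch, assuming each stride-2 3x3 convolution uses padding 1 (the padding value and the function name `feature_map_sizes` are assumptions, not taken from the notes):

```python
def feature_map_sizes(input_size, num_downsamples, kernel=3, stride=2, padding=1):
    """Return the side length of the feature map after each stride-2 convolution."""
    sizes = [input_size]
    for _ in range(num_downsamples):
        # Standard convolution output size: floor((n + 2p - k) / s) + 1
        sizes.append((sizes[-1] + 2 * padding - kernel) // stride + 1)
    return sizes


print(feature_map_sizes(608, 5))  # [608, 304, 152, 76, 38, 19]
```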

The Backbone uses the Mish activation function,
while the rest of the network still uses the Leaky_relu activation function.

YOLOv4 consists of the following three parts:
Backbone: CSPDarknet53
Neck: SPP, PAN
Head: YOLOv3

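Specifically, YOLOv4 uses:

Bag of Freebies (BoF): methods that raise detection accuracy by increasing only the training cost, not the inference cost; this usually means data augmentation.

Bag of Specials (BoS): methods that raise detection accuracy by changing the network structure, at the price of some added inference cost.

BoF for the backbone: CutMix and Mosaic data augmentation, DropBlock regularization, and class label smoothing (Label Smoothing);

BoS for the backbone: Mish activation, CSP, and multi-input weighted residual connections (MiWRC);

BoF for the detector: CIoU-loss, CmBN, DropBlock regularization, Mosaic data augmentation, self-adversarial training (SAT), eliminating grid sensitivity (Eliminate grid sensitivity), using multiple anchors for a single ground truth (Multiple Anchor), cosine annealing, optimal hyperparameters, and random training shapes (Random training shape);

BoS for the detector: Mish activation, SPP block, SAM block, PAN path-aggregation block, and DIoU-NMS.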

Mosaic data augmentation

Mish activation function

The Backbone uses the Mish activation function; the rest of the network still uses the Leaky_relu activation function.
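
For reference, Mish is defined as x · tanh(softplus(x)). The sketch below compares it with Leaky ReLU on scalars; the negative slope of 0.1 is an assumption (the notes do not state the value used), and the function names are illustrative only.

```python
import math


def softplus(x):
    # Numerically stable ln(1 + e^x).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    # Mish(x) = x * tanh(softplus(x)); smooth and non-monotonic near zero.
    return x * math.tanh(softplus(x))


def leaky_relu(x, negative_slope=0.1):
    # negative_slope=0.1 is an assumed value, not taken from the notes.
    return x if x > 0 else negative_slope * x


for value in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(value, round(mish(value), 4), leaky_relu(value))
```

Unlike Leaky ReLU, Mish has no kink at zero, and for negative inputs it dips slightly below zero before approaching zero again.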

DropBlock

SPP module

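Max pooling with kernel sizes k = {1x1, 5x5, 9x9, 13x13} is applied, and the resulting feature maps of different scales are then joined with a Concat operation.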
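
A minimal sketch of this block on plain nested lists, assuming stride 1 and implicit border padding so that every pooled map keeps the input's height and width (both are assumptions, as are the names `max_pool_same` and `spp_block`):

```python
def max_pool_same(fmap, k):
    """k x k max pooling with stride 1; windows are clipped at the border,
    so the output has the same height and width as the input."""
    h, w, r = len(fmap), len(fmap[0]), k // 2
    return [[max(fmap[y][x]
                 for y in range(max(0, i - r), min(h, i + r + 1))
                 for x in range(max(0, j - r), min(w, j + r + 1)))
             for j in range(w)]
            for i in range(h)]


def spp_block(channels, kernel_sizes=(1, 5, 9, 13)):
    """channels: list of 2D maps. Returns the pooled maps concatenated channel-wise."""
    out = []
    for k in kernel_sizes:
        out.extend(max_pool_same(c, k) for c in channels)
    return out


fmap = [[0] * 5 for _ in range(5)]
fmap[0][0] = 1
pooled = spp_block([fmap])
print(len(pooled))  # 4 output channels for 1 input channel
```

With C input channels the block outputs 4C channels, and the 1x1 branch is simply the identity.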

FPN+PAN

In the original PANet, PAN merges two feature maps with a shortcut (element-wise addition) operation, whereas YOLOv4 uses concat.
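
The practical difference between the two merge operations is the channel count of the result. A small sketch on nested lists (channels x height x width); the helper names are illustrative only:

```python
def shortcut(a, b):
    """Element-wise addition: both inputs need the same shape, and the
    channel count stays the same."""
    assert len(a) == len(b)
    return [[[p + q for p, q in zip(row_a, row_b)]
             for row_a, row_b in zip(ch_a, ch_b)]
            for ch_a, ch_b in zip(a, b)]


def concat(a, b):
    """Channel-wise concatenation: the channel counts add up."""
    return a + b


a = [[[1, 2]], [[3, 4]]]        # 2 channels, 1 x 2 each
b = [[[10, 20]], [[30, 40]]]    # 2 channels, 1 x 2 each
print(len(shortcut(a, b)), len(concat(a, b)))  # 2 4
```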

GIOU to DIOU to CIOU_loss

GIOU

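Problem: when predicted boxes of the same size lie inside the ground-truth box, GIoU gives the same value wherever they sit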
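
The problem can be reproduced numerically. The sketch below uses the usual definition GIoU = IoU - (C - U) / C, where C is the area of the smallest enclosing box and U is the union area; boxes are (x1, y1, x2, y2), and the function names are illustrative.

```python
def area(box):
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou_and_giou(a, b):
    """Boxes are (x1, y1, x2, y2). Returns (IoU, GIoU)."""
    inter = area((max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3])))
    union = area(a) + area(b) - inter
    iou = inter / union
    # Area of the smallest box that encloses both a and b.
    enclosing = area((min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3])))
    return iou, iou - (enclosing - union) / enclosing


gt = (0, 0, 10, 10)
# Three 4x4 predictions, all inside the ground truth but at different positions.
for pred in [(0, 0, 4, 4), (3, 3, 7, 7), (6, 6, 10, 10)]:
    print(pred, iou_and_giou(gt, pred))
```

All three predictions get the same IoU and the same GIoU, because the enclosing box equals the ground-truth box and the penalty term vanishes.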

DIOU:

Problem: it carries no width/height (aspect ratio) information
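
A sketch of DIoU under its usual definition, IoU - d² / c², where d is the distance between the two box centers and c is the diagonal of the smallest enclosing box. Boxes are (x1, y1, x2, y2); the function name is illustrative.

```python
def diou(a, b):
    """DIoU = IoU - (squared center distance) / (squared enclosing-box diagonal)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    # Squared distance between the two box centers.
    d2 = ((a[0] + a[2]) - (b[0] + b[2])) ** 2 / 4 + ((a[1] + a[3]) - (b[1] + b[3])) ** 2 / 4
    # Squared diagonal of the smallest enclosing box.
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return inter / union - d2 / c2


gt = (0, 0, 10, 10)
print(diou(gt, (0, 0, 4, 4)))   # off-center box is penalized
print(diou(gt, (3, 3, 7, 7)))   # centered 4x4 box
print(diou(gt, (1, 4, 9, 6)))   # centered 8x2 box, same area
```

The position problem of GIoU is gone, but the last two boxes (same center, same area, different shape) still score the same, which is the missing width/height information.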

Adding width/height information: CIOU
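
A sketch of CIoU under its usual definition: DIoU minus an aspect-ratio term α·v, with v = (4/π²)·(arctan(w_gt/h_gt) - arctan(w/h))² and α = v / (1 - IoU + v). The loss is then 1 - CIoU. Boxes are (x1, y1, x2, y2) and are assumed to have positive width and height; the function name is illustrative.

```python
import math


def ciou(gt, pred):
    """CIoU = IoU - d^2 / c^2 - alpha * v for boxes with positive width and height."""
    inter_w = max(0.0, min(gt[2], pred[2]) - max(gt[0], pred[0]))
    inter_h = max(0.0, min(gt[3], pred[3]) - max(gt[1], pred[1]))
    inter = inter_w * inter_h
    wg, hg = gt[2] - gt[0], gt[3] - gt[1]
    wp, hp = pred[2] - pred[0], pred[3] - pred[1]
    iou = inter / (wg * hg + wp * hp - inter)
    # Center-distance penalty, as in DIoU.
    d2 = ((gt[0] + gt[2]) - (pred[0] + pred[2])) ** 2 / 4 \
        + ((gt[1] + gt[3]) - (pred[1] + pred[3])) ** 2 / 4
    c2 = (max(gt[2], pred[2]) - min(gt[0], pred[0])) ** 2 \
        + (max(gt[3], pred[3]) - min(gt[1], pred[1])) ** 2
    # Aspect-ratio consistency term.
    v = (4 / math.pi ** 2) * (math.atan(wg / hg) - math.atan(wp / hp)) ** 2
    alpha = v / (1 - iou + v) if v > 0 else 0.0
    return iou - d2 / c2 - alpha * v


gt = (0, 0, 10, 10)
print(ciou(gt, (3, 3, 7, 7)))  # same aspect ratio as gt: no extra penalty
print(ciou(gt, (1, 4, 9, 6)))  # 8x2 box: penalized for its shape
```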

DIOU_nms

The IoU term in standard NMS is replaced with DIoU.
CIoU is not used because no GT information is available at test time.
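
A minimal sketch of greedy NMS with DIoU as the suppression criterion. The threshold of 0.5 and the function names are assumptions for illustration; boxes are (x1, y1, x2, y2).

```python
def diou(a, b):
    """IoU minus squared center distance over squared enclosing-box diagonal."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    d2 = ((a[0] + a[2]) - (b[0] + b[2])) ** 2 / 4 + ((a[1] + a[3]) - (b[1] + b[3])) ** 2 / 4
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return inter / union - d2 / c2


def diou_nms(boxes, scores, threshold=0.5):
    """Return the indices of the kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Suppress a box only if its DIoU with the kept box exceeds the threshold.
        order = [i for i in order if diou(boxes[best], boxes[i]) <= threshold]
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(diou_nms(boxes, scores))  # [0, 2]
```

Because the center-distance penalty lowers DIoU for boxes whose centers are far apart, two overlapping boxes that belong to different nearby objects are less likely to suppress each other than with plain IoU.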

PP-YOLOv1

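Backbone:
ResNet50-vd, with the 3x3 convolution in the last layer replaced by DCN (deformable convolution)
Model accuracy rises from 38.9% to 39.1%, and speed rises from 58.2 FPS to 79.2 FPS.

DetectionNeck:
FPN is still used, taking the last three convolutional stages C3, C4, C5 as input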

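Selection of tricks:

Larger batch size: the batch size is raised from 64 to 192
Moving average (EMA), similar to the one in BN: λ is set to 0.9998
DropBlock
Matrix NMS: runs faster than traditional NMS.
CoordConv
SPP
Grid Sensitive: alpha is set to 1.05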
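
Two of the tricks above reduce to one-line formulas. The sketch below assumes the common formulations: an exponential moving average W_ema = λ·W_ema + (1 - λ)·W, and a grid-sensitive center decode x = s·(g + α·σ(p) - (α - 1)/2). Both formulas and the function names are assumptions for illustration, not taken from the notes.

```python
import math


def ema_update(ema_weights, weights, decay=0.9998):
    """One EMA step over a flat list of parameters."""
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema_weights, weights)]


def decode_center(p, grid_index, stride, alpha=1.05):
    """Grid-sensitive decode of one box-center coordinate.

    With alpha > 1 the sigmoid output is stretched slightly beyond [0, 1],
    so a center exactly on a grid-cell border becomes reachable.
    """
    sig = 1.0 / (1.0 + math.exp(-p))
    return stride * (grid_index + alpha * sig - (alpha - 1.0) / 2.0)


print(ema_update([1.0], [0.0]))
print(decode_center(0.0, grid_index=3, stride=32))   # cell center
print(decode_center(20.0, grid_index=3, stride=32))  # slightly past the right border
```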

Image Mixup

The loss used during training is the sum of the two images' losses, each multiplied by its own mixing weight.
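
A minimal sketch of that weighting on grayscale images stored as nested lists. Drawing the weight from a Beta(1.5, 1.5) distribution is an assumption for illustration, as are the function names.

```python
import random


def mixup_images(img_a, img_b, lam):
    """Pixel-wise blend of two equally sized images with weights lam and 1 - lam."""
    return [[lam * p + (1.0 - lam) * q for p, q in zip(row_a, row_b)]
            for row_a, row_b in zip(img_a, img_b)]


def mixup_loss(loss_a, loss_b, lam):
    """Weighted sum of the losses computed against each image's own labels."""
    return lam * loss_a + (1.0 - lam) * loss_b


lam = random.betavariate(1.5, 1.5)  # assumed sampling distribution
mixed = mixup_images([[0, 10]], [[10, 0]], lam)
print(lam, mixed, mixup_loss(2.0, 4.0, lam))
```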

DropBlock

Matrix NMS

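Soft-NMS: instead of discarding an overlapping box outright, Soft-NMS takes the box's original confidence score into account and lowers it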
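
A sketch of Soft-NMS with Gaussian score decay, score ← score · exp(-IoU² / σ). The values σ = 0.5 and the score threshold of 0.001 are assumptions for illustration. This is the sequential Soft-NMS idea, not Matrix NMS itself, which computes a comparable decay for all boxes in parallel.

```python
import math


def iou(a, b):
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    return inter / ((a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter)


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Return (index, decayed score) pairs in the order the boxes are selected."""
    scores = list(scores)
    remaining = list(range(len(boxes)))
    keep = []
    while remaining:
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        if scores[best] < score_threshold:
            break
        keep.append((best, scores[best]))
        # Decay the neighbours' scores instead of deleting them.
        for i in remaining:
            scores[i] *= math.exp(-iou(boxes[best], boxes[i]) ** 2 / sigma)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
print(soft_nms(boxes, [0.9, 0.8, 0.7]))
```

The heavily overlapping second box survives, but with a much lower score, so it is ranked after the non-overlapping third box.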

CoordConv

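Because a filter has no idea where it is located as it slides over the input, the Uber team proposed a new method called CoordConv: two extra channels that encode the coordinates are fed to the network together with the input.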

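If the coordinate channels of CoordConv learn nothing, it is equivalent to a traditional convolution; if the coordinate channels do learn some information, CoordConv gains a degree of translation dependence.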
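
A minimal sketch of the input side of CoordConv: two coordinate channels are appended before the ordinary convolution runs. Normalizing the coordinates to [-1, 1] is an assumption for illustration, as is the function name.

```python
def add_coord_channels(channels):
    """channels: list of H x W maps. Returns the maps plus an x and a y channel."""
    h, w = len(channels[0]), len(channels[0][0])

    def norm(i, n):
        # Map index 0..n-1 to the range [-1, 1].
        return 2.0 * i / (n - 1) - 1.0 if n > 1 else 0.0

    x_channel = [[norm(j, w) for j in range(w)] for _ in range(h)]
    y_channel = [[norm(i, h) for _ in range(w)] for i in range(h)]
    return channels + [x_channel, y_channel]


feature = [[[0, 0, 0], [0, 0, 0]]]  # 1 channel, 2 x 3
for channel in add_coord_channels(feature):
    print(channel)
```

If the convolution weights attached to the two extra channels are zero, the layer behaves exactly like a regular convolution, which matches the equivalence described above.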

SPP ( Spatial Pyramid Pooling )

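A method proposed in SPPNet, mainly to solve two problems

1. Avoid the incomplete or distorted objects caused by R-CNN's cropping and scaling of images
2. Avoid the repeated feature extraction that a CNN performs on overlapping image regions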
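
Both points rely on pooling a feature map of any size into a fixed-length vector, so the image needs no cropping and the convolutional features can be computed once per image. A minimal single-channel sketch; the pyramid levels (1, 2, 4) and the function name are assumptions for illustration.

```python
def spatial_pyramid_pool(fmap, levels=(1, 2, 4)):
    """Max-pool an H x W map into n x n bins for each level and flatten.

    The output length is sum(n * n for n in levels), whatever H and W are.
    """
    h, w = len(fmap), len(fmap[0])
    out = []
    for n in levels:
        for bi in range(n):
            # Bin borders: floor for the start, ceil for the end, so no bin is empty.
            y0, y1 = (bi * h) // n, -(-(bi + 1) * h // n)
            for bj in range(n):
                x0, x1 = (bj * w) // n, -(-(bj + 1) * w // n)
                out.append(max(fmap[y][x] for y in range(y0, y1) for x in range(x0, x1)))
    return out


small = [[y * 7 + x for x in range(7)] for y in range(5)]     # 5 x 7 map
large = [[y * 9 + x for x in range(9)] for y in range(13)]    # 13 x 9 map
print(len(spatial_pyramid_pool(small)), len(spatial_pyramid_pool(large)))  # 21 21
```
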
SPP architecture diagram:

Experiments

Summary:
PP-YOLO does not adopt existing SOTA network structures the way YOLOv4 does;
instead it focuses on stacking tricks sensibly
