# YOLOv4
###### tags: `lab` `ntu`
#### DIoU
\begin{equation}
\begin{aligned}
DIoU &= IoU - R_{DIoU}(M,B_i) \\
&= IoU - \frac{\rho^2(b,b^{gt})}{c^2} \\
R_{DIoU}(M,B_i) &= \frac{\rho^2(b,b^{gt})}{c^2} \\
\end{aligned}
\end{equation}
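A minimal Python sketch of the DIoU formula above, assuming axis-aligned boxes given as `(x1, y1, x2, y2)` corners. The box format and the function name `diou` are assumptions for illustration. Here ρ is the distance between the two box centers and c is the diagonal of the smallest box enclosing both.

```python
def diou(box_a, box_b):
    """DIoU = IoU - rho^2 / c^2 for boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection over union.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union

    # rho^2: squared distance between the two box centers.
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
         + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    # c^2: squared diagonal of the smallest enclosing box.
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 \
       + (max(ay2, by2) - min(ay1, by1)) ** 2
    return iou - rho2 / c2
```

Identical boxes give 1.0, while disjoint boxes give a negative value that grows in magnitude as the centers move apart, which is what lets DIoU still rank non-overlapping boxes.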
#### Mish
\begin{equation}
\begin{aligned}
\operatorname{softplus}(x) &= \log(1+e^x) \\
\tanh(x) &= \frac{e^x-e^{-x}}{e^x+e^{-x}} \\
\operatorname{mish}(x) &= x \cdot \tanh(\operatorname{softplus}(x)) \\
\end{aligned}
\end{equation}
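The three definitions above translate almost directly into Python. This is a scalar sketch using only the standard library; the rewritten form of softplus (to avoid overflow for large inputs) is an addition here, not part of the formula in the notes.

```python
import math

def softplus(x):
    """softplus(x) = log(1 + e^x), rewritten so exp() never overflows."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def mish(x):
    """mish(x) = x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))
```

For large positive x, `mish(x)` approaches x; for large negative x it approaches 0 from below, so the function is smooth and slightly non-monotonic around zero.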
#### CIoU
\begin{equation}
\begin{aligned}
L_{CIoU} = 1 - IoU + \frac{\rho^2(b,b^{gt})}{c^2} + \alpha \frac{4}{\pi^2}\left(\arctan\frac{w^{gt}}{h^{gt}}-\arctan\frac{w}{h}\right)^2
\end{aligned}
\end{equation}
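A sketch of the CIoU loss above for a single pair of boxes, assuming `(x1, y1, x2, y2)` corner format. The formula in the notes leaves α undefined; the code uses the trade-off weight from the CIoU paper, α = v / ((1 − IoU) + v), where v is the aspect-ratio term. The function name and the small `eps` guard against 0/0 are additions for illustration.

```python
import math

def ciou_loss(box, box_gt, eps=1e-9):
    """L_CIoU = 1 - IoU + rho^2/c^2 + alpha * v, boxes as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = box_gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # IoU of the predicted and ground-truth boxes.
    inter = max(0.0, min(x2, gx2) - max(x1, gx1)) \
          * max(0.0, min(y2, gy2) - max(y1, gy1))
    iou = inter / (w * h + gw * gh - inter)

    # Center-distance penalty (same term as in DIoU).
    rho2 = ((x1 + x2 - gx1 - gx2) / 2) ** 2 + ((y1 + y2 - gy1 - gy2) / 2) ** 2
    c2 = (max(x2, gx2) - min(x1, gx1)) ** 2 + (max(y2, gy2) - min(y1, gy1)) ** 2

    # Aspect-ratio penalty v and its weight alpha (eps only guards 0/0).
    v = (4 / math.pi ** 2) * (math.atan(gw / gh) - math.atan(w / h)) ** 2
    alpha = v / ((1 - iou) + v + eps)
    return 1 - iou + rho2 / c2 + alpha * v
```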
<br><br><br>
---
# Questions
---
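1. Are the backbone and the detector fine-tuned together?
Or is the backbone kept fixed (frozen)? My guess for now: it is.
2. ResNeXt is better at classification, yet Darknet is better at detection.
3. What is the backbone trained on? ImageNet?
Yes, ImageNet.
4. AP vs. mAP?
For AP: compute the IoU, then sort the detections, ...
5. What do AP50 and AP75 mean?
6. Why modify SAM and PAN? The paper does not seem to explain it.
They tried it and it worked better. Nice.
7. So what does the YOLOv4 architecture actually look like?
8. Why is it faster (higher FPS)?
9. Self-adversarial training?
Self-adversarial training is also a new data augmentation method, and it gives some resistance to adversarial attacks. It has two stages, each consisting of one forward pass and one backward pass.
Stage 1: through backpropagation, the CNN alters the image content instead of the network weights. In this way the CNN carries out an adversarial attack, altering the original image to create the illusion that there is no object in it.
Stage 2: the modified image is used for object detection in the normal way.
10. Which data augmentations are used? Only Mosaic?
Mosaic and SAT (Self-Adversarial Training).
11. Why did they pick ResNeXt for the comparison?
Because of the receptive field.
12. (CSPResNeXt) What is CSP?
CSPNet: A New Backbone that can Enhance Learning Capability of CNN
13. DropBlock, DropConnect?
14. PAN (path aggregation): what are the advantages?
15. Label smoothing
The target label is not set to exactly 1.
ok
16.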
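
The label smoothing note above says the target is never set to exactly 1. A minimal sketch of the common formulation, which mixes the one-hot target with a uniform distribution over the K classes. The function name `smooth_labels` and the value `epsilon=0.1` are assumptions for illustration, not values taken from these notes.

```python
def smooth_labels(one_hot, epsilon=0.1):
    """Return (1 - epsilon) * one_hot + epsilon / K for a K-class target."""
    k = len(one_hot)
    return [(1.0 - epsilon) * y + epsilon / k for y in one_hot]

# With K = 4 and epsilon = 0.1, the true class gets about 0.925
# and every other class about 0.025; the targets still sum to 1.
smoothed = smooth_labels([0, 1, 0, 0])
```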
<br><br><br>
---
# Hey
---
## Related articles
https://towardsdatascience.com/yolo-v4-optimal-speed-accuracy-for-object-detection-79896ed47b50
## features used
### Weighted-Residual-Connections (WRC)
### Cross-Stage-Partial-connection (CSP)
### Cross mini-Batch Normalization (CmBN)
### Self-adversarial-training (SAT)
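SAT runs in two stages: first the network alters the image through backpropagation while its weights stay fixed, then it trains normally on the altered image. The toy sketch below mimics that loop with a single logistic "objectness" unit so the gradients can be written by hand. Everything here is a hypothetical illustration: the model, the function `sat_step`, the step sizes, and in particular the sign-of-gradient perturbation (borrowed from FGSM) are assumptions, since these notes do not say how the image is altered. A real detector would rely on an autograd framework over the full CNN.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sat_step(w, b, x, y, eps=0.1, lr=0.5):
    """One toy SAT iteration for the model p = sigmoid(w . x + b)."""
    # Stage 1: forward + backward with respect to the INPUT; weights are fixed.
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    grad_x = [(p - y) * wi for wi in w]  # d(cross-entropy)/dx
    # Push each input value in the direction that increases the loss
    # (sign step, an assumed choice), so the object looks "less present".
    x_adv = [xi + eps * math.copysign(1.0, g) if g != 0 else xi
             for xi, g in zip(x, grad_x)]

    # Stage 2: ordinary gradient step on the weights, using the altered
    # input and the ORIGINAL label.
    p_adv = sigmoid(sum(wi * xi for wi, xi in zip(w, x_adv)) + b)
    w_new = [wi - lr * (p_adv - y) * xi for wi, xi in zip(w, x_adv)]
    b_new = b - lr * (p_adv - y)
    return w_new, b_new, x_adv, p, p_adv
```

With a positive label, stage 1 lowers the predicted objectness (`p_adv < p`), and stage 2 then updates the weights so the model recovers the object on the altered input.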
### Mish-activation
### Mosaic data augmentation
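Mosaic mixes four training images into one. The sketch below shows only the core idea under strong simplifying assumptions: the four images are equally sized 2D lists, they are tiled in a fixed 2x2 grid, and boxes are `(x1, y1, x2, y2)` in pixel coordinates. Actual implementations also randomly scale and crop the tiles around a random center; that part is omitted, and the function name `mosaic` is hypothetical.

```python
def mosaic(images, boxes_per_image):
    """Tile four equally sized images into a 2x2 grid and shift their boxes."""
    assert len(images) == 4 and len(boxes_per_image) == 4
    h, w = len(images[0]), len(images[0][0])

    # Concatenate rows: images 0 and 1 on top, images 2 and 3 below.
    top = [row_a + row_b for row_a, row_b in zip(images[0], images[1])]
    bottom = [row_a + row_b for row_a, row_b in zip(images[2], images[3])]
    canvas = top + bottom

    # (dx, dy) offset of each tile inside the combined canvas.
    offsets = [(0, 0), (w, 0), (0, h), (w, h)]
    boxes = []
    for (dx, dy), tile_boxes in zip(offsets, boxes_per_image):
        for x1, y1, x2, y2 in tile_boxes:
            boxes.append((x1 + dx, y1 + dy, x2 + dx, y2 + dy))
    return canvas, boxes
```

Each combined sample then carries the objects of four images at once, so a single batch element exposes the detector to four different contexts.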
### DropBlock regularization
### CIoU loss
### modify
* CBN
* PAN
* SAM
<br><br><br>
---
# PPT
---
## Modern Object-detector

* stages
    * two-stage
        * R-CNN
        * Fast R-CNN
        * Faster R-CNN
    * one-stage
        * YOLO
        * SSD
* backbone
    * GPU
        * VGG16
        * ResNet-50
        * ResNeXt-101
        * DenseNet
    * CPU
        * MobileNet
        * ShuffleNet
* neck
    * FPN (feature pyramid network)
    * PANet (path aggregation network)
    * BiFPN
* head
    This network performs the actual detection: classification and regression of bounding boxes. Depending on the implementation, a single output may consist of 4 coordinates describing the predicted box (x, y, h, w) plus the probabilities of k + 1 classes (one extra for background). Anchor-based object detectors, like YOLO, apply the head network to each anchor box. Other popular anchor-based one-stage detectors are Single Shot Detector[6] and RetinaNet[4].
*
## Bag of freebies
* focal loss
    * deals with the imbalance between classes
* IoU loss
    * GIoU
    * DIoU
    * CIoU
        converges faster and reaches better accuracy
*
*
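A small sketch of the binary focal loss listed above, FL(p_t) = −α_t (1 − p_t)^γ log(p_t). The defaults γ = 2 and α = 0.25 are the values commonly quoted from the RetinaNet paper and are assumptions here, not settings taken from these notes.

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for predicted probability p and label y in {0, 1}."""
    p_t = p if y == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    # (1 - p_t)^gamma shrinks the loss of well-classified (easy) examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

With `gamma=0` the modulating factor disappears and the result is plain α-weighted cross-entropy. With `gamma=2`, a confident correct prediction (p_t = 0.9) contributes over a thousand times less than a badly wrong one (p_t = 0.1), which keeps the many easy examples from dominating training.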
## Bag of specials
Slightly increases inference time, but improves performance noticeably.
* Enhance receptive field
    * SPP
    * ASPP
    * RFB
        dilated convolution; costs 7% extra inference time, increases MS COCO AP by 5.7%
* attention module
    * Squeeze-and-Excitation (SE) (channel-wise)
    * Spatial Attention Module (SAM) (point-wise)
* post-processing
    * NMS
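
A sketch of greedy NMS, the post-processing step named above: keep the highest-scoring box, discard boxes that overlap it too much, and repeat. The `(x1, y1, x2, y2)` box format, the function names, and the 0.5 threshold are assumptions for illustration. Replacing the plain `iou` call with the DIoU defined at the top of these notes would give the DIoU-NMS variant.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union

def nms(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes kept by greedy non-maximum suppression."""
    # Visit boxes from highest to lowest score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```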
<br><br><br>
---
# Methodology
---
## Selection of architecture
<font color="red">A reference model which is optimal for classification is not always optimal for a detector</font>
* Objective 1
    * resolution
    * number of conv layers
    * number of parameters (f_size^2 x filters x channels/groups)
* Objective 2
    * increase receptive field
        FPN, PAN, ASFF, BiFPN
    * what is the receptive field?

* In contrast to a classifier, a detector needs:
    1. Higher input network size (resolution) - for detecting multiple small-sized objects
    2. More layers - for a higher receptive field to cover the increased size of the input network
```
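2.
Images for an image classifier are already nicely cropped, but that is not the case for the detection task.
The objects come at different scales, so the receptive field has to be increased.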
```
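Two of the quantities listed above can be computed by hand. The sketch below implements the parameter formula from the notes (f_size^2 x filters x channels/groups, bias ignored) and the standard recursion for the receptive field of a plain stack of convolutions. The function names are hypothetical, and dilation is not modeled.

```python
def conv_params(f_size, filters, channels, groups=1):
    """Weights in one conv layer: f_size^2 * filters * channels / groups."""
    return f_size * f_size * filters * channels // groups

def receptive_field(layers):
    """Receptive field of a stack of (kernel_size, stride) conv layers."""
    rf, jump = 1, 1  # jump: spacing, in input pixels, between adjacent outputs
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each layer widens the field by (k - 1) * jump
        jump *= stride             # striding spreads later layers further apart
    return rf

# Three stacked 3x3 stride-1 convs see a 7x7 input window;
# grouping a 3x3 conv into 32 groups cuts its weights by 32x.
rf_three_convs = receptive_field([(3, 1), (3, 1), (3, 1)])
weights_dense = conv_params(3, 64, 32)
weights_grouped = conv_params(3, 64, 32, groups=32)
```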
<br><br><br>
---
# Others
---
YOLO evolution: https://mropengate.blogspot.com/2018/06/yolo-yolov3.html
YOLOv4 introduction: https://towardsdatascience.com/yolo-v4-optimal-speed-accuracy-for-object-detection-79896ed47b50
YOLOv4, seen on Facebook: https://bangqu.com/rrhsI5.html?fbclid=IwAR2fchCn5dGhQAfMmmDPUh-kKNFICnEnSndnTpKGHRJIsaSicJem7Lfyv8w
IoU: https://blog.csdn.net/donkey_1993/article/details/104006474