---
title: CNN Architecture # presentation title
tags: Meeting # presentation tags
slideOptions: # slide settings
theme: midnight
slideNumber: true
---
# CNN Architecture
---
## Diagrams
**Convolutional layer**
Normal: 
With padding: 
---
**Max pooling**

(Source: cs231n)
---
**Fully connected layer**

[playground](https://reurl.cc/5goWoM)
---
## LeNet Architecture (1994)

---
## VGGNet
> [K. Simonyan et al. 2014]
[paper](https://arxiv.org/abs/1409.1556)
Key difference: three stacked 3x3 conv layers cover the same receptive field as a single 7x7 layer, but with fewer parameters: $3\times (3^2\times C^2)$ vs. $7^2\times C^2$
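Concretely, assuming $C$ input and output channels per layer, the counts work out to
$3\times(3^2\times C^2)=27C^2$ vs. $7^2\times C^2=49C^2$;
for example, with $C=64$ that is $110{,}592$ vs. $200{,}704$ weights, roughly a $1.8\times$ saving for the same $7\times 7$ receptive field.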

---
## Inception
> [C Szegedy et. al 2014]
[paper](https://arxiv.org/abs/1409.4842)
```graphviz
digraph {
node[shape=box]
"previous layer"-> "1x1 convolution"->"Filter concatnation";
"previous layer" -> "1x1convolution"->"3x3 convolutions"->"Filter concatnation";
"previous layer" -> "1x1 convolution"->"5x5 convolutions"->"Filter concatnation";
"previous layer" -> "MaxPooling"->"1x1 convolution "->"Filter concatnation";
}
```
[GoogLeNet結構圖](https://i.imgur.com/rBIXwcL.jpg)
----
#### What is a 1x1 convolution kernel good for?
* Dimensionality reduction
* Example: if the previous layer outputs 100x100x128 and is followed by a 5x5 convolutional layer with 256 channels (stride=1, pad=2), the output is 100x100x256 and that layer has 128x5x5x256 = 819,200 parameters.
* If the previous output instead first passes through a 1x1 convolutional layer with 32 channels and then a 5x5 convolutional layer with 256 outputs, the output is still 100x100x256, but the parameter count drops to 128x1x1x32 + 32x5x5x256 = 208,896, roughly a 4x reduction.
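As a quick check of these numbers, a minimal PyTorch sketch (not from the original slides; bias terms omitted) that counts the weights of both options:

```python
import torch.nn as nn

def n_params(module):
    # Total number of learnable weights in a module.
    return sum(p.numel() for p in module.parameters())

# Direct 5x5 convolution: 128 -> 256 channels (stride=1, pad=2 keeps 100x100).
direct = nn.Conv2d(128, 256, kernel_size=5, stride=1, padding=2, bias=False)

# 1x1 bottleneck down to 32 channels, then the 5x5 convolution.
bottleneck = nn.Sequential(
    nn.Conv2d(128, 32, kernel_size=1, bias=False),
    nn.Conv2d(32, 256, kernel_size=5, stride=1, padding=2, bias=False),
)

print(n_params(direct))      # 819200 = 128*5*5*256
print(n_params(bottleneck))  # 208896 = 128*1*1*32 + 32*5*5*256
```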
---
## ResNet
> [He et al. 2015]
[paper](https://arxiv.org/pdf/1512.03385.pdf)

[ResNet結構圖](https://i.imgur.com/AG6eRti.png)
---
## DenseNet
> [G Huang et al. 2016]
[paper](https://arxiv.org/abs/1608.06993)
---
First, suppose an image $x_0$ is fed into the neural network.
The network has $L$ layers, each implementing a non-linear transformation $H_l(\cdot)$, where the subscript $l$ indexes the layer.
$H_l$ is a composite of several operations (BN, Conv, ReLU, Pool), and $x_l$ denotes the output of the $l$-th layer.
---
* Traditional networks:
the output of layer $l-1$ is the input of layer $l$: $x_l=H_l(x_{l-1})$
* ResNets:
$x_l=H_l(x_{l-1})+x_{l-1}$, i.e. an identity skip connection is added
* DenseNets:
$x_l=H_l([x_0,x_1,\dots ,x_{l-1}])$, i.e. the feature maps of all preceding layers are concatenated and fed into $H_l$ (see the sketch below)
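A minimal PyTorch sketch of a dense block under these definitions (the exact composition of $H_l$ and the growth rate here are illustrative assumptions, not the paper's full recipe):

```python
import torch
import torch.nn as nn

class DenseLayer(nn.Module):
    """One H_l: BN -> ReLU -> 3x3 Conv, applied to the
    concatenation of all earlier feature maps."""
    def __init__(self, in_channels, growth_rate):
        super().__init__()
        self.bn = nn.BatchNorm2d(in_channels)
        self.conv = nn.Conv2d(in_channels, growth_rate,
                              kernel_size=3, padding=1, bias=False)

    def forward(self, x):
        return self.conv(torch.relu(self.bn(x)))

class DenseBlock(nn.Module):
    def __init__(self, in_channels, growth_rate, n_layers):
        super().__init__()
        # Layer l sees in_channels + l * growth_rate input channels.
        self.layers = nn.ModuleList(
            DenseLayer(in_channels + i * growth_rate, growth_rate)
            for i in range(n_layers)
        )

    def forward(self, x0):
        features = [x0]                              # [x_0, x_1, ..., x_{l-1}]
        for layer in self.layers:
            x_l = layer(torch.cat(features, dim=1))  # x_l = H_l([x_0, ..., x_{l-1}])
            features.append(x_l)
        return torch.cat(features, dim=1)

x = torch.randn(1, 16, 32, 32)
print(DenseBlock(16, growth_rate=12, n_layers=4)(x).shape)  # (1, 64, 32, 32)
```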
---
## MobileNet
> [AG Howard et al. 2017]
[paper](https://arxiv.org/abs/1704.04861)
* Depth-wise Separable Convolution (see the sketch below)

Computational cost: $D_k\times D_k\times M\times D_F\times D_F+ M \times N \times D_F \times D_F$
where $D_k$ is the kernel size, $M$ and $N$ are the input/output channel counts, and $D_F$ is the feature-map width/height.
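A minimal PyTorch sketch of the factorization (hypothetical sizes $M=32$, $N=64$, $D_k=3$; since $D_F^2$ multiplies both costs, the parameter ratio equals the cost ratio):

```python
import torch.nn as nn

M, N, Dk = 32, 64, 3  # hypothetical channel counts and kernel size

# Standard convolution: every output channel sees every input channel.
standard = nn.Conv2d(M, N, kernel_size=Dk, padding=1, bias=False)

# Depthwise separable: a per-channel Dk x Dk convolution (groups=M),
# then a 1x1 pointwise convolution that mixes channels.
dwsc = nn.Sequential(
    nn.Conv2d(M, M, kernel_size=Dk, padding=1, groups=M, bias=False),
    nn.Conv2d(M, N, kernel_size=1, bias=False),
)

count = lambda m: sum(p.numel() for p in m.parameters())
print(count(standard))                # 18432 = 3*3*32*64
print(count(dwsc))                    # 2336  = 3*3*32 + 32*64
print(count(dwsc) / count(standard))  # 0.1267... = 1/N + 1/Dk^2
```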
---
* Width Multiplier $\alpha$: Thinner Models
controls the number of input and output channels; the cost becomes
$D_k\times D_k\times\alpha M \times D_F \times D_F+\alpha M \times \alpha N \times D_F \times D_F$
* Resolution Multiplier $\rho$: Reduced Representation
controls the input resolution; the cost becomes
$D_k\times D_k\times\alpha M \times \rho D_F \times \rho D_F+\alpha M \times \alpha N \times \rho D_F \times \rho D_F$
* ReLU6: [Colab notebook](https://colab.research.google.com/drive/10pYIgS-_XUFpeWf9MMSqEGYUrrodPfR2)
---
Standard convolution cost: $D_k\times D_k\times M \times N \times D_F \times D_F$
Cost ratio with DWSC: $\dfrac{1}{N}+\dfrac{1}{D_k^2}$
Cost ratio with $\alpha$: $\dfrac{\alpha}{N}+\dfrac{\alpha ^2}{D_k^2}$
Cost ratio with $\rho$: $\dfrac{\rho ^2}{N}+\dfrac{\rho ^2}{D_k^2}$ (since $\rho$ scales both spatial dimensions; see the numeric check below)
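A quick numeric check of these three ratios (all sizes and multiplier values here are illustrative assumptions):

```python
M, N, Dk, DF = 32, 64, 3, 112   # hypothetical layer sizes
alpha, rho = 0.75, 0.5          # hypothetical multipliers

standard = Dk*Dk * M * N * DF*DF                                   # standard conv
dwsc  = Dk*Dk * M * DF*DF + M * N * DF*DF                          # depthwise separable
thin  = Dk*Dk * (alpha*M) * DF*DF + (alpha*M) * (alpha*N) * DF*DF  # width multiplier
small = Dk*Dk * M * (rho*DF)**2 + M * N * (rho*DF)**2              # resolution multiplier

print(dwsc / standard,  1/N + 1/Dk**2)             # 0.1267 0.1267
print(thin / standard,  alpha/N + alpha**2/Dk**2)  # 0.0742 0.0742
print(small / standard, rho**2/N + rho**2/Dk**2)   # 0.0317 0.0317
```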
[MobileNet結構圖](https://i.imgur.com/MkMJnph.png)
---
| Model | Size | Top-1 Accuracy | Top-5 Accuracy | Parameters |
| -------- | -------- | -------- | -------- | -------- |
| VGG16 | 528 MB | 0.713 | 0.901 | 138,357,544 |
| ResNet50 | 98 MB | 0.749 | 0.921 | 25,636,712 |
| InceptionV3 | 92 MB | 0.779 | 0.937 | 23,851,784 |
| MobileNet | 16 MB | 0.704 | 0.895 | 4,253,864 |
---
Top-1 Acc: the model's single highest-scoring prediction is the correct answer
Top-5 Acc: the correct answer appears among the model's five highest-scoring predictions
Human-assigned labels are not always precise, so top-5 accuracy is also a useful metric
Example: [Pug vs. Bulldog](https://i2.kknews.cc/SIG=2vmjkb0/ctp-vzntr/1539256797239r0r573qo21.jpg)
> Data from: [Keras documentation](https://keras.io/applications/)
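A small sketch of how top-1 and top-5 accuracy are computed (random logits and labels, purely illustrative):

```python
import torch

logits = torch.randn(8, 1000)           # hypothetical model outputs for 8 images
labels = torch.randint(0, 1000, (8,))   # hypothetical ground-truth classes

top5 = logits.topk(5, dim=1).indices    # indices of the 5 highest scores
top1_acc = (top5[:, 0] == labels).float().mean()
top5_acc = (top5 == labels.unsqueeze(1)).any(dim=1).float().mean()
print(top1_acc.item(), top5_acc.item())
```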
----
# ACCURACY
**Error rate (Error)** $=\dfrac{a}{m}$
where $m$ is the total number of samples and $a$ is the number of misclassified samples
**Accuracy** $=1-Error$
----
$Acc=\dfrac{TP+TN}{TP+TN+FP+FN}$
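A tiny sketch relating these formulas, using hypothetical confusion-matrix counts:

```python
# Hypothetical confusion-matrix counts for a binary classifier.
TP, TN, FP, FN = 40, 45, 5, 10

m = TP + TN + FP + FN      # total number of samples
a = FP + FN                # misclassified samples
error = a / m
accuracy = (TP + TN) / m   # equals 1 - error

print(error, accuracy)     # 0.15 0.85
```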