---
title: CNN Architecture # presentation title
tags: Meeting # presentation tags
slideOptions: # slide settings
  theme: midnight
  slideNumber: true
---

# CNN Architecture

---

## Illustrations

**Convolutional layer**

Normal:
![](https://i.imgur.com/4RwARL7.gif =280x)
With padding:
![](https://github.com/vdumoulin/conv_arithmetic/raw/master/gif/padding_strides.gif =280x)

---

**Max pooling**

![](https://i1.kknews.cc/SIG=38eb8nk/2s850000ns51o64s140o.jpg =800x)
(Source: cs231n)

---

**Fully connected layer**

![](https://ver217-1253339008.cos.ap-shanghai.myqcloud.com/blog-img/fc/mlp.png)
[playground](https://reurl.cc/5goWoM)

---

## LeNet Architecture (1994)

![https://hackernoon.com/visualizing-parts-of-convolutional-neural-networks-using-keras-and-cats-5cc01b214e59](https://hackernoon.com/hn-images/1*8Ut7fQHswfO2zZngh6BYfg.png =1000x)

---

## VGGNet
> [K Simonyan et al. 2014] [paper](https://arxiv.org/abs/1409.1556)

Main difference: a stack of three 3x3 convolutions costs $3\times (3^2\times C^2)$ parameters vs. $7^2\times C^2$ for a single 7x7 convolution with the same effective receptive field.
![](https://i.imgur.com/w8H4Dwj.png =x650)

---

## Inception
> [C Szegedy et al. 2014] [paper](https://arxiv.org/abs/1409.4842)

```graphviz
digraph {
  node [shape=box]
  prev   [label="Previous layer"]
  concat [label="Filter concatenation"]
  b1  [label="1x1 convolutions"]
  b2a [label="1x1 convolutions"]
  b2b [label="3x3 convolutions"]
  b3a [label="1x1 convolutions"]
  b3b [label="5x5 convolutions"]
  b4a [label="Max pooling"]
  b4b [label="1x1 convolutions"]
  prev -> b1 -> concat
  prev -> b2a -> b2b -> concat
  prev -> b3a -> b3b -> concat
  prev -> b4a -> b4b -> concat
}
```

[GoogLeNet architecture diagram](https://i.imgur.com/rBIXwcL.jpg)

----

#### What is a 1x1 convolution kernel good for?

* Reducing dimensionality (the channel depth), which cuts the parameter count
  * Ex: the previous layer's output is 100x100x128. After a 5x5 convolutional layer with 256 channels (stride=1, pad=2), the output is 100x100x256, and the convolutional layer has 128x5x5x256 = 819,200 parameters.
  * If that output instead first passes through a 1x1 convolutional layer with 32 channels and then a 5x5 convolutional layer with 256 outputs, the output is still 100x100x256, but the parameter count drops to 128x1x1x32 + 32x5x5x256 = 208,896, roughly a 4x reduction.

---

## ResNet
> [He et al. 2015] [paper](https://arxiv.org/pdf/1512.03385.pdf)

![](https://i.imgur.com/j0PcoPP.png)

[ResNet architecture diagram](https://i.imgur.com/AG6eRti.png)

---

## DenseNet
> [G Huang et al. 2016] [paper](https://arxiv.org/abs/1608.06993)

---

![](https://cloud.githubusercontent.com/assets/8370623/17981494/f838717a-6ad1-11e6-9391-f0906c80bc1d.jpg)

---

Suppose an image $x_0$ enters the neural network.
The network has $L$ layers, each of which is a non-linear transformation $H_l(\cdot)$, where the subscript $l$ indexes the layer.
$H$ is a composite of several operations (BN, Conv, ReLU, Pool). $x_l$ denotes the output of the $l^{th}$ layer.

---

* Traditional network: the $l^{th}$ layer's output is used as the input of the $(l+1)^{th}$ layer, $x_l=H_l(x_{l-1})$
* ResNets: $x_l=H_l(x_{l-1})+x_{l-1}$
* DenseNets: $x_l=H_l([x_0,x_1,\dots ,x_{l-1}])$, i.e. the feature maps of all preceding layers are concatenated and fed into $H_l$

---

## MobileNet
> [AG Howard et al. 2017] [paper](https://arxiv.org/abs/1704.04861)

* Depthwise Separable Convolution
![](https://i.imgur.com/L47Xomn.png =380x)
Computational cost: $D_k\times D_k\times M\times D_F\times D_F+ M \times N \times D_F \times D_F$
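To sanity-check the arithmetic behind the two reduction tricks above (the Inception 1x1 bottleneck and the depthwise separable factorization), here is a minimal plain-Python sketch; the helper `conv_params` and the sample values `Dk, M, N, Df` are illustrative names and numbers of my own, not from the papers.

```python
def conv_params(c_in, k, c_out):
    """Weight count of a k x k convolution (bias terms ignored)."""
    return c_in * k * k * c_out

# -- Inception: 1x1 bottleneck before a 5x5 convolution (numbers from the slide above) --
direct = conv_params(128, 5, 256)                                # 819,200
bottleneck = conv_params(128, 1, 32) + conv_params(32, 5, 256)   # 208,896
print(direct, bottleneck, direct / bottleneck)                   # ~3.9x fewer parameters

# -- MobileNet: standard vs. depthwise separable convolution cost --
Dk, M, N, Df = 3, 128, 256, 56   # kernel size, in/out channels, feature-map size (my choice)
standard = Dk * Dk * M * N * Df * Df             # D_k^2 * M * N * D_F^2
dwsc = Dk * Dk * M * Df * Df + M * N * Df * Df   # depthwise term + pointwise term
print(dwsc / standard, 1 / N + 1 / Dk**2)        # the two ratios agree
```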
---

* Width Multiplier $\alpha$: Thinner Models
  Controls the number of input and output channels; the cost becomes
  $D_k\times D_k\times\alpha M \times D_F \times D_F+\alpha M \times \alpha N \times D_F \times D_F$
* Resolution Multiplier $\rho$: Reduced Representation
  Controls the input resolution; the cost becomes
  $D_k\times D_k\times\alpha M \times \rho D_F \times \rho D_F+$
  $\alpha M \times \alpha N \times \rho D_F \times \rho D_F$
* ReLU6: [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg =200x)](https://colab.research.google.com/drive/10pYIgS-_XUFpeWf9MMSqEGYUrrodPfR2)

---

Standard convolution cost: $D_k\times D_k\times M \times N \times D_F \times D_F$

Cost ratio of DWSC to standard convolution: $\dfrac{1}{N}+\dfrac{1}{D_k^2}$

Cost ratio with $\alpha$: $\dfrac{\alpha}{N}+\dfrac{\alpha ^2}{D_k^2}$

Cost ratio with $\rho$: $\dfrac{\rho ^2}{N}+\dfrac{\rho ^2}{D_k^2}$

[MobileNet architecture diagram](https://i.imgur.com/MkMJnph.png)

----

![](https://img-blog.csdnimg.cn/20181220125832456.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3UwMTE5NzQ2Mzk=,size_16,color_FFFFFF,t_70)

----

![](https://img-blog.csdnimg.cn/20181220125858427.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3UwMTE5NzQ2Mzk=,size_16,color_FFFFFF,t_70)

---

| Model       | Size   | Top-1 Accuracy | Top-5 Accuracy | Parameters  |
| ----------- | ------ | -------------- | -------------- | ----------- |
| VGG16       | 528 MB | 0.713          | 0.901          | 138,357,544 |
| ResNet50    | 98 MB  | 0.749          | 0.921          | 25,636,712  |
| InceptionV3 | 92 MB  | 0.779          | 0.937          | 23,851,784  |
| MobileNet   | 16 MB  | 0.704          | 0.895          | 4,253,864   |

---

Top-1 Acc: the model's single highest-ranked prediction is the correct answer.
Top-5 Acc: the correct answer is among the model's five highest-ranked predictions.
Human-assigned labels are not always that accurate, so top-5 is also a meaningful metric.
For example: [pug vs. bulldog](https://i2.kknews.cc/SIG=2vmjkb0/ctp-vzntr/1539256797239r0r573qo21.jpg)

> Data from: [Keras documentation](https://keras.io/applications/)

----

# ACCURACY

**Error rate (Error)** $=\dfrac{a}{m}$, where $m$ is the number of samples and $a$ is the number of misclassified samples

**Accuracy** $=1-Error$

----

$Acc=\dfrac{TP+TN}{TP+TN+FP+FN}$
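To make the Top-1/Top-5 definitions concrete, here is a minimal NumPy sketch; the `top_k_accuracy` helper and the toy scores are my own illustration, not the evaluation code used for the table above.

```python
import numpy as np

def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    top_k = np.argsort(scores, axis=1)[:, -k:]   # indices of the k best classes per sample
    hits = [labels[i] in top_k[i] for i in range(len(labels))]
    return float(np.mean(hits))

# Toy scores for 3 samples over 6 classes; true labels are [0, 2, 3].
scores = np.array([
    [0.90, 0.05, 0.02, 0.01, 0.01, 0.01],  # class 0 ranked first: top-1 hit
    [0.40, 0.30, 0.20, 0.05, 0.03, 0.02],  # class 2 ranked third: top-5 hit only
    [0.50, 0.20, 0.10, 0.01, 0.09, 0.10],  # class 3 ranked last: miss for both
])
labels = [0, 2, 3]

print(top_k_accuracy(scores, labels, 1))  # 0.333... (1 of 3)
print(top_k_accuracy(scores, labels, 5))  # 0.666... (2 of 3)
```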
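And the confusion-matrix form of accuracy from the last slide, checked against $1-Error$; the counts are made up for illustration.

```python
# Confusion-matrix counts (hypothetical): 100 samples in total.
TP, TN, FP, FN = 40, 45, 8, 7

error = (FP + FN) / (TP + TN + FP + FN)   # misclassified / total, i.e. a/m above
acc = (TP + TN) / (TP + TN + FP + FN)     # Acc = (TP+TN)/(TP+TN+FP+FN)
assert abs(acc - (1 - error)) < 1e-12     # Accuracy = 1 - Error
print(acc)                                # 0.85
```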