{%hackmd @themes/dracula %}
## General Design Principles
==Avoid over-compressing early in the network==
For any cut separating the inputs from the outputs, one can assess the amount of information passing through the cut. One should avoid bottlenecks with extreme compression.
In other words, splitting one layer into two lets you measure the intermediate representation and, more importantly, avoid extreme compression; the representation size should decrease gradually from input to output.
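As a rough sketch of "decrease gradually", one can compute the total representation size (height × width × channels) at each layer and check that no single step compresses it drastically. The layer sizes below are hypothetical examples, and the 4× threshold is an arbitrary illustration, not a rule from the paper:

```python
# Hypothetical (spatial_size, channels) for each layer of a small network.
layers = [(299, 3), (149, 32), (147, 64), (73, 80), (71, 192), (35, 288)]

# Total representation size at each cut between layers.
sizes = [h * h * c for h, c in layers]

# Ratio of consecutive sizes: a very large ratio marks an extreme bottleneck.
ratios = [a / b for a, b in zip(sizes, sizes[1:])]
print(all(r < 4 for r in ratios))  # True: no step compresses by more than 4x
```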
==Are higher dimensions easier to process?==

Adding more activations in a convolutional network lets features disentangle more easily, which speeds up training.
==Do spatial aggregation in lower dimensions==
Spatial aggregation can be done over lower dimensional embeddings without much or any loss in representational power.
Convolving in a lower-dimensional space loses little information, so the dimensionality can be reduced before the spatial convolution.
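The payoff is easy to see in weight counts. A minimal sketch, assuming hypothetical channel sizes (256 in/out, reduced to 64), comparing a 3×3 convolution applied directly versus after a 1×1 dimensionality reduction:

```python
def conv_params(k_h, k_w, c_in, c_out):
    """Number of weights in a k_h x k_w convolution mapping c_in -> c_out channels (no bias)."""
    return k_h * k_w * c_in * c_out

c_in, c_out, c_mid = 256, 256, 64  # hypothetical channel counts

direct = conv_params(3, 3, c_in, c_out)                                # 3x3 on full dimension
reduced = conv_params(1, 1, c_in, c_mid) + conv_params(3, 3, c_mid, c_out)  # 1x1 reduce, then 3x3

print(direct)   # 589824
print(reduced)  # 163840 -- under a third of the weights
```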
==Balance the width and depth of the network==
## Factorizing Convolutions with Large Filter Size
==Factorization into smaller convolutions==
Replace a larger convolutional layer with several smaller ones.

Ex. Replace a conv5 with two stacked conv3 layers, which increases depth while improving speed.
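A quick arithmetic check of why this works: two stride-1 3×3 convolutions cover the same 5×5 receptive field as one 5×5 convolution, with 18 instead of 25 weights per channel pair (a 28% reduction):

```python
def stacked_receptive_field(kernel_sizes):
    """Receptive field of a stack of stride-1 convolutions."""
    rf = 1
    for k in kernel_sizes:
        rf += k - 1
    return rf

# Same receptive field:
print(stacked_receptive_field([5]), stacked_receptive_field([3, 3]))  # 5 5

# Fewer weights per input/output channel pair:
weights_5x5 = 5 * 5          # 25
weights_two_3x3 = 2 * 3 * 3  # 18
print(weights_5x5, weights_two_3x3)  # 25 18
```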
==Spatial Factorization into Asymmetric Convolutions==
Replace square convolutions with asymmetric ones. Experiments show this does not work well in the early layers; it works best on m\*m feature maps with m = 12~20.

Ex. Replace a conv3 with a 1\*n followed by an n\*1 convolution.
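The saving can be sketched in one line of arithmetic: an n×n convolution uses n² weights per channel pair, while the 1×n plus n×1 factorization uses only 2n, and the saving grows with n:

```python
def asymmetric_savings(n):
    """Fraction of weights saved by factorizing an n x n conv into 1 x n + n x 1."""
    return 1 - (2 * n) / (n * n)

print(asymmetric_savings(3))  # ~0.33 saving for a 3x3
print(asymmetric_savings(7))  # ~0.71 saving for a 7x7
```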