Various Residual Blocks
Neural networks keep moving toward deeper architectures, but simply stacking more layers does not by itself make the training error go down.
ResNet later appeared, stacking on the order of 100 layers, and its authors introduced the Residual Block to solve the problem that deeper networks were not performing better.
Residual Block
The benefit of this block is that a simple identity shortcut "skips" over two conv. layers, so those layers only have to learn the residual F(x) = H(x) - x rather than the full mapping H(x). This lowers the amount of adjustment the conv. layers need during training, which in turn makes it feasible to increase the depth of the network.
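To make this concrete, here is a minimal PyTorch sketch of such a block (my own illustration, not the paper's code; the class name, the fixed channel count, and the omission of stride/downsampling on the shortcut are simplifying assumptions):

```python
import torch.nn as nn

class BasicResidualBlock(nn.Module):
    """Minimal two-layer residual block: out = ReLU(F(x) + x)."""
    def __init__(self, channels):
        super().__init__()
        # F(x): two 3x3 conv layers that only need to learn the residual
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        identity = x                      # the shortcut that "skips" the two conv layers
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out = out + identity              # H(x) = F(x) + x
        return self.relu(out)
```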
A biological analogy: saltatory conduction
I have not read the ResNet paper in full yet, but my personal guess is that this was inspired by the saltatory conduction of action potentials in biology.
Bottleneck Residual Block
Compared with the plain Residual Block, the Bottleneck Residual Block has roughly the same number of parameters, so the time complexity is about the same, but it has three layers (1×1, 3×3, 1×1) where the plain Residual Block has only two (3×3, 3×3).
The ResNet authors use the Bottleneck Residual Block once the network exceeds 50 layers, the reason being that they wanted to increase depth without increasing training time (see the description of "Deeper Bottleneck Architectures" on p. 6 of the paper). That said, although the assumption that "more layers yield higher accuracy" is widely supported by experiments, it has never been proven mathematically, so it is not beyond being overturned.
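A minimal PyTorch sketch of the 1×1-reduce, 3×3, 1×1-expand structure (again my own simplified illustration; the `reduction=4` factor follows the paper's 4× bottleneck, and stride/projection shortcuts are omitted):

```python
import torch.nn as nn

class BottleneckResidualBlock(nn.Module):
    """1x1 reduce -> 3x3 -> 1x1 expand, with an identity shortcut."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        mid = channels // reduction       # reduce the channel count first ...
        self.block = nn.Sequential(
            nn.Conv2d(channels, mid, 1, bias=False),
            nn.BatchNorm2d(mid),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid, mid, 3, padding=1, bias=False),
            nn.BatchNorm2d(mid),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid, channels, 1, bias=False),   # ... then expand it back
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.block(x) + x)
```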
Inverted Residual Block
With that background, when we come back to the Inverted Residual Block in MobileNetV2, it becomes clear what it is doing.
The difference between the Inverted Residual Block and the other Residual Blocks is that it first raises the channel count and then lowers it, whereas an ordinary (bottleneck) Residual Block first lowers the channel count and then raises it.
Many articles online claim this is done to preserve more feature information, but if you go back to the paper, the motivation is to be memory efficient! There is a proof of this in Section 5, though I have not fully understood it yet.
So now the MobileNetV2 code on PyTorch Hub is easy to follow; below is the implementation of the Inverted Residual Block.
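What follows is a simplified sketch modeled on the torchvision/PyTorch Hub `InvertedResidual` module (the real code additionally handles configurable kernel sizes, norm layers, and helper wrappers, so treat this as an outline rather than the exact source):

```python
import torch.nn as nn

class InvertedResidual(nn.Module):
    """Expand (1x1) -> depthwise 3x3 -> project (1x1); skip only when shapes match."""
    def __init__(self, inp, oup, stride, expand_ratio):
        super().__init__()
        hidden_dim = int(round(inp * expand_ratio))
        # the shortcut is only used when stride == 1 and input/output channels match
        self.use_res_connect = stride == 1 and inp == oup

        layers = []
        if expand_ratio != 1:
            # 1x1 pointwise conv that *raises* the channel count
            layers += [nn.Conv2d(inp, hidden_dim, 1, bias=False),
                       nn.BatchNorm2d(hidden_dim),
                       nn.ReLU6(inplace=True)]
        layers += [
            # 3x3 depthwise conv (groups == channels)
            nn.Conv2d(hidden_dim, hidden_dim, 3, stride, 1,
                      groups=hidden_dim, bias=False),
            nn.BatchNorm2d(hidden_dim),
            nn.ReLU6(inplace=True),
            # 1x1 pointwise conv that *lowers* the channel count; no activation (linear bottleneck)
            nn.Conv2d(hidden_dim, oup, 1, bias=False),
            nn.BatchNorm2d(oup),
        ]
        self.conv = nn.Sequential(*layers)

    def forward(self, x):
        if self.use_res_connect:
            return x + self.conv(x)
        return self.conv(x)
```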
Reference
- https://zhuanlan.zhihu.com/p/28413039
- https://www.jianshu.com/p/e502e4b43e6d
- https://arxiv.org/pdf/1512.03385.pdf
- https://www.cnblogs.com/hejunlin1992/p/9395345.html
- https://zhuanlan.zhihu.com/p/33075914