RegNet: Regular Network

tags: `Research`

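Earlier work on finding good models relied on NAS (Neural Architecture Search), which searches a given, fixed design space for the single best set of parameters. This paper instead focuses on how to design the design space itself, rather than only searching one for the best parameters.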

Chapter 1: Introduction

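Earlier architectures such as LeNet, AlexNet, VGG, and ResNet each taught us something about network design: convolution, data size, depth, and residual connections. NAS, by contrast, finds a good model inside a given design space, but it does not tell us how to arrive at a good network architecture.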

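Model search has so far been dominated by two approaches, manual design and NAS. Their respective strengths are:

Manual design: interpretable, simple, and generalizes well
NAS: semi-automatically reaches the model performance we ask for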

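This paper aims to combine the strengths of both approaches. It works with classic architectures in the style of VGG and ResNet, starts from a design space called AnyNet, and refines it step by step into the target design space, RegNet.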

The headline conclusions first:

RegNet models are very efficient to run
Compared with EfficientNet, RegNet performs better and is up to 5 times faster on GPUs

Chapter 2: Design Space Design

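The paper's overview figure illustrates the approach: instead of searching for one best model, it studies populations of models and derives model design principles that generalize.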

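We sample a set of parameter combinations from a given design space to obtain its distribution of models, then analyze the properties of the design space with statistics. These steps lead to simpler and better-performing models, and analyzing the distribution of model errors yields richer and more robust information.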

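The goals are to:

Simplify the structure of the design space
Improve the interpretability of the design space
Improve or at least maintain the quality of the design space
Maintain model diversity within the design space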

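The concrete steps are:

Sample a set of models from the design space
Plot the error empirical distribution function (EDF) and use the empirical bootstrap to gain further insight
Refine the result into a new design space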

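The paper samples 500 models at random and trains each for 10 epochs. From there it identifies design principles and builds one design space after another, narrowing toward good models step by step.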
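The error EDF used in these steps can be sketched in a few lines. This is a minimal illustration, assuming the usual definition $F(e) = \frac{1}{n} \sum_{i} \mathbf{1}[e_{i} < e]$; the function name `error_edf` and the error values are made up for the example and do not come from the paper.

```python
def error_edf(errors, thresholds):
    """Return, for each threshold e, the fraction of models with error below e."""
    n = len(errors)
    return [sum(e_i < e for e_i in errors) / n for e in thresholds]


# Hypothetical top-1 errors (%) of models sampled from two design spaces.
space_a = [38.2, 41.5, 44.0, 47.3, 52.8, 60.1]
space_b = [36.9, 39.4, 40.8, 43.2, 45.5, 51.0]
thresholds = [40, 45, 50, 55]

# A curve that sits higher at every threshold indicates a better design space.
print("space A:", error_edf(space_a, thresholds))
print("space B:", error_edf(space_b, thresholds))
```

In this toy data, space B has a larger fraction of models below every threshold, which is the kind of comparison the EDF plots are meant to support.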

2.1 AnyNetXA

Like many deep learning architectures, the network consists of a stem, a body, and a head:

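stem: the input layer
body: the main part of the network, which extracts features. It consists of several stages (most papers call these blocks), and each stage consists of blocks (most papers call these layers)
head: the output layer, whose output depends on the task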

All of the optimization happens in the body. It has 4 stages, and each stage $i$ has the following hyperparameters:

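Number of blocks $d_{i}$: $1 \leq d_{i} \leq 16$
Width (number of channels) $w_{i}$: $w_{i} = 8k$ with $1 \leq k \leq 128$
Bottleneck ratio $b_{i}$ (compare the bottleneck ratio in EfficientNet): $b_{i} \in \{1, 2, 4\}$
Group width $g_{i}$ of the group convolution (channels per parallel group): $g_{i} \in \{1, 2, \ldots, 32\}$, 6 options in total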

In total, AnyNetXA contains $(16 \cdot 128 \cdot 3 \cdot 6)^{4} \approx 10^{18}$ possible model configurations.
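Drawing one configuration from this space can be sketched as follows. This is only an illustration: the function name `sample_anynetxa` is hypothetical, the six group-width options are assumed to be the powers of two from 1 to 32 (which matches the factor 6 in the count), and the sketch skips any compatibility checks between width, bottleneck ratio, and group width.

```python
import random


def sample_anynetxa(rng, num_stages=4):
    """Draw independent per-stage hyperparameters from the AnyNetXA ranges."""
    return [
        {
            "d": rng.randint(1, 16),                # number of blocks, 1..16
            "w": 8 * rng.randint(1, 128),           # width, multiple of 8 up to 1024
            "b": rng.choice([1, 2, 4]),             # bottleneck ratio
            "g": rng.choice([1, 2, 4, 8, 16, 32]),  # group width (assumed options)
        }
        for _ in range(num_stages)
    ]


config = sample_anynetxa(random.Random(0))
for i, stage in enumerate(config, start=1):
    print(f"stage {i}: {stage}")
```

Because every stage is sampled independently, nothing ties the four stages together yet. The following sections add exactly those ties.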

2.2 AnyNetXB

Sharing one bottleneck ratio $b_{i} = b$ across all stages leaves the error EDF nearly unchanged.

The design space, however, becomes much smaller: $(16 \cdot 128 \cdot 6)^{4} \times 3 \approx 6.8 \cdot 10^{16}$.

2.3 AnyNetXC

Also sharing one group width $g_{i} = g$ across all stages again leaves the error EDF nearly unchanged.

With 6 options for the shared group width, the size becomes $(16 \cdot 128)^{4} \times 3 \times 6 \approx 3.2 \cdot 10^{14}$.

2.4 AnyNetXD

Requiring stage widths that increase with depth, $w_{i+1} \geq w_{i}$, improves the performance of the models in the design space.

The design space now has size $\frac{(16 \cdot 128)^{4} \times 3 \times 6}{4!} \approx 1.3 \cdot 10^{13}$.

2.5 AnyNetXE

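Requiring the number of blocks per stage to increase as well, $d_{i+1} \geq d_{i}$, improves the performance of the models in the design space further.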

The design space now has size $\frac{(16 \cdot 128)^{4} \times 3 \times 6}{(4!)^{2}} \approx 5.5 \cdot 10^{11}$.
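The sizes quoted for AnyNetXA through AnyNetXE can be checked with plain integer arithmetic. The sketch below only restates the counting formulas from these sections; the constant and dictionary names are chosen for the example.

```python
from math import factorial

STAGES = 4
DEPTHS, WIDTHS, BOTTLENECKS, GROUPS = 16, 128, 3, 6  # options per hyperparameter

sizes = {
    # Every stage chooses d, w, b, and g independently.
    "AnyNetXA": (DEPTHS * WIDTHS * BOTTLENECKS * GROUPS) ** STAGES,
    # One bottleneck ratio shared by all stages.
    "AnyNetXB": (DEPTHS * WIDTHS * GROUPS) ** STAGES * BOTTLENECKS,
    # One group width shared by all stages as well.
    "AnyNetXC": (DEPTHS * WIDTHS) ** STAGES * BOTTLENECKS * GROUPS,
}
# Ordering the widths, then also the depths, each divides the count by 4!.
sizes["AnyNetXD"] = sizes["AnyNetXC"] // factorial(STAGES)
sizes["AnyNetXE"] = sizes["AnyNetXC"] // factorial(STAGES) ** 2

for name, size in sizes.items():
    print(f"{name}: {size:.1e}")
print(f"A -> E reduction: {sizes['AnyNetXA'] / sizes['AnyNetXE']:.1e}")
```

Dividing by $4!$ treats each ordering constraint as keeping one of the $4!$ possible orderings of the four stages, so these figures are order-of-magnitude estimates rather than exact counts.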

Overall, the process shrinks the design space by a factor of $O(10^{7})$. To recap the steps above:

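Share the bottleneck ratio across stages
Share the group width across stages
Increase the width from stage to stage
Increase the number of blocks from stage to stage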
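The four rules can be read as a membership test on a per-stage configuration. The sketch below is one hedged way to express them; the function names are hypothetical, and "increase" is implemented as non-decreasing, which is an assumption about how ties between neighboring stages are treated.

```python
def is_non_decreasing(values):
    """True if each value is at least as large as the one before it."""
    return all(a <= b for a, b in zip(values, values[1:]))


def in_anynetxe(depths, widths, bottlenecks, group_widths):
    """Check the four refinement rules on per-stage hyperparameter lists."""
    return (
        len(set(bottlenecks)) == 1       # shared bottleneck ratio (AnyNetXB)
        and len(set(group_widths)) == 1  # shared group width (AnyNetXC)
        and is_non_decreasing(widths)    # widths grow with depth (AnyNetXD)
        and is_non_decreasing(depths)    # block counts grow with depth (AnyNetXE)
    )


# Example configurations with made-up values.
print(in_anynetxe([1, 2, 4, 8], [32, 64, 128, 256], [1, 1, 1, 1], [8, 8, 8, 8]))
print(in_anynetxe([1, 2, 4, 8], [256, 64, 128, 32], [1, 1, 1, 1], [8, 8, 8, 8]))
```

The first configuration satisfies all four rules, while the second fails because its widths are not ordered.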

2.6 RegNet

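The previous section showed that increasing the width benefits model performance. The next question is how the width should grow for the best results. Model depth is studied in this part as well.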

For the 20 best models sampled from AnyNetXE, the paper draws a line chart of how the width grows as the network gets deeper.

The trend is approximated by a linear function, $u_{j} = w_{0} + w_{a} \cdot j$ for $0 \leq j < d$, where $d$ is the depth of the network, $w_{0} > 0$ is the initial width, and $w_{a} > 0$ is the slope.

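In practice, the paper quantizes this linear function so that blocks fall into stages of constant width. With a width multiplier $w_{m} > 0$, it first solves $u_{j} = w_{0} \cdot w_{m}^{s_{j}}$ for $s_{j}$ and then rounds $s_{j}$ to obtain the width of block $j$:

$w_{j} = w_{0} \cdot w_{m}^{\mathrm{Round}(s_{j})}$

Blocks with the same rounded exponent form stage $i$, which has width $w_{i} = w_{0} \cdot w_{m}^{i}$ and the following number of blocks:

$d_{i} = \sum_{j} \mathbf{1}[\mathrm{Round}(s_{j}) = i]$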

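Here $d < 64$, $w_{0}, w_{a} < 256$, and $1.5 \leq w_{m} \leq 3$. This is the RegNet design space, and its size is about $3.0 \times 10^{8}$. The paper reports the error EDF of this space, and RegNet is then further optimized with grid search.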
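The quantized width rule can be sketched in plain Python. This is an illustration under stated assumptions rather than the paper's implementation: the function name `regnet_stages` is hypothetical, the final snap to multiples of `q = 8` is assumed from the width grid in Section 2.1, and the parameter values in the usage line are made up so that the arithmetic is easy to follow.

```python
import math
from collections import Counter


def regnet_stages(w_0, w_a, w_m, d, q=8):
    """Turn (w_0, w_a, w_m, d) into per-stage widths and block counts."""
    block_widths = []
    for j in range(d):
        u_j = w_0 + w_a * j                        # linear width of block j
        s_j = math.log(u_j / w_0) / math.log(w_m)  # solve u_j = w_0 * w_m ** s_j
        w_j = w_0 * w_m ** round(s_j)              # quantized width of block j
        block_widths.append(int(round(w_j / q)) * q)  # snap to a multiple of q (assumption)
    counts = Counter(block_widths)
    stage_widths = sorted(counts)                  # one stage per distinct width
    stage_depths = [counts[w] for w in stage_widths]
    return stage_widths, stage_depths


# Made-up parameters: widths double from stage to stage.
print(regnet_stages(w_0=32, w_a=32, w_m=2.0, d=11))
# prints ([32, 64, 128, 256], [1, 1, 3, 6])
```

In this example the 11 blocks collapse into four stages whose widths and depths both increase, so the resulting network also satisfies the ordering rules found for AnyNetXD and AnyNetXE.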

Many further details are left out here; interested readers can consult the original paper.

Chapter 3: Result

Compared with ResNet, RegNet performs better.

Compared with EfficientNet, RegNet performs better and is up to 5 times faster at inference.

Chapter 4: Conclusion

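What sets this work apart from NAS is that it does not search for a single best model. Instead, by understanding design spaces, it builds a procedure for finding better design spaces and then finds better models within them. The resulting models outperform ResNet and EfficientNet and run inference faster, which brings AutoML closer to practical use while keeping models lightweight.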

Reference

Radosavovic, I., Kosaraju, R. P., Girshick, R., He, K., & Dollár, P. (2020). Designing network design spaces. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 10428-10436).
Radosavovic, I., Johnson, J., Xie, S., Lo, W. Y., & Dollár, P. (2019). On network design spaces for visual recognition. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 1882-1890).
