[TOC]
---
# When <span><!-- .element: class="fragment highlight-blue" -->ONNC</span> meets <span><!-- .element: class="fragment highlight-green" -->NVDLA</span>
The light that shines when open source projects meet
a127a127 - 蔡德育
a127a127@skymizer.com
---
## Speaker Background
- IOI / ICPC
Algorithms
- Linker / Loader
Compiler-related toolchain
- JVM / Clang / LLVM
Process VM / Compiler
<!-- .slide: data-background="https://intern-md.skymizer.com/uploads/upload_3c8c34d502c5d84c4a2d11553c256e8e.png"-->
---
## <span style="color:lime">NVDLA</span>
<span>**NV**IDIA **D**eep **L**earning **A**ccelerator
<!-- .element: class="fragment" data-fragment-index="1" -->
</span>
<span>[http://nvdla.org/](http://nvdla.org/)
<!-- .element: class="fragment" data-fragment-index="2" -->
</span>
---
### What is a DLA?
**D**eep **L**earning **A**ccelerator
---
### Hardware that accelerates this kind of crazy stuff
![](https://i.imgur.com/kH0V8kf.png)
---
### [N][C][H][W]
![](https://i.imgur.com/o8HZUCO.png)
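---
As a minimal sketch of what `[N][C][H][W]` means for addressing, the linear offset of one element in a dense NCHW tensor can be computed as below. The function name `nchw_offset` is made up for this slide.

```python
def nchw_offset(n, c, h, w, C, H, W):
    """Linear element offset of (n, c, h, w) in a dense [N][C][H][W] tensor."""
    return ((n * C + c) * H + h) * W + w

# A 1x3x2x2 tensor: W varies fastest, then H, then C.
print(nchw_offset(0, 0, 0, 1, C=3, H=2, W=2))  # 1
print(nchw_offset(0, 1, 0, 0, C=3, H=2, W=2))  # 4
```

Neighbouring pixels in a row are adjacent in memory, while neighbouring channels are `H * W` elements apart.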
---
## What exactly is NVDLA?
> The NVIDIA Deep Learning Accelerator (NVDLA) is a <span><!-- .element: class="fragment highlight-red" -->free and open</span> architecture that promotes a standard way to design deep learning inference accelerators. With its modular architecture, NVDLA is <span><!-- .element: class="fragment highlight-red" -->scalable, highly configurable</span>, and designed to simplify integration and portability.
<!-- .element: class="fragment" data-fragment-index="1" -->
---
> Delivered as an <span style="color:red">open source</span> project under the NVIDIA Open NVDLA License, all of the software, hardware, and documentation will be available on GitHub.
https://github.com/nvdla/
---
<!-- .slide: data-transition="slide-in fade-out" -->
![](https://i.imgur.com/Qc5hq5A.png)
---
<!-- .slide: data-transition="fade-in" -->
![](https://i.imgur.com/lV4fjNV.png)
---
### CONV
<span>![](https://media.giphy.com/media/i4NjAwytgIRDW/giphy.gif)
<!-- .element: class="fragment" data-fragment-index="1" -->
</span>
<span>https://media.giphy.com/media/i4NjAwytgIRDW/giphy.gif
<!-- .element: class="fragment" data-fragment-index="2" -->
</span>
<span>http://cs231n.github.io/convolutional-networks/
<!-- .element: class="fragment" data-fragment-index="3" -->
</span>
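---
The animation corresponds to the loop nest below. This is a plain-Python sketch of a stride-1, unpadded convolution for one image, written for illustration only; it says nothing about how the NVDLA CONV engine is actually implemented.

```python
def conv2d(x, w):
    """x: [C][H][W] input, w: [K][C][R][S] kernels -> [K][H-R+1][W-S+1]."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    K, R, S = len(w), len(w[0][0]), len(w[0][0][0])
    out = [[[0.0] * (W - S + 1) for _ in range(H - R + 1)] for _ in range(K)]
    for k in range(K):                      # one output channel per kernel
        for i in range(H - R + 1):          # slide the window over rows
            for j in range(W - S + 1):      # ... and over columns
                acc = 0.0
                for c in range(C):          # accumulate across input channels
                    for r in range(R):
                        for s in range(S):
                            acc += x[c][i + r][j + s] * w[k][c][r][s]
                out[k][i][j] = acc
    return out

x = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]   # C=1, 3x3
w = [[[[1, 0], [0, 1]]]]                  # K=1, 2x2 kernel
print(conv2d(x, w))  # [[[6.0, 8.0], [12.0, 14.0]]]
```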
---
### SDP
> The <span><!-- .element: class="fragment highlight-red" -->Single Data</span> Point Processor (SDP) allows for the application of both linear and non-linear functions onto individual data points.
Simply put, it applies operations to individual data points
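---
A rough sketch of a "single data point" operation: each value is transformed on its own, with no neighbours involved. The choice of scale, bias and ReLU here is only an example of a linear plus a non-linear function, and `sdp_like` is a hypothetical name.

```python
def sdp_like(values, scale=1.0, bias=0.0):
    """Apply a linear function (scale, bias) then a non-linear one (ReLU) per element."""
    return [max(v * scale + bias, 0.0) for v in values]

print(sdp_like([-2.0, 0.5, 3.0], scale=2.0, bias=1.0))  # [0.0, 2.0, 7.0]
```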
---
### PDP
> <span><!-- .element: class="fragment highlight-red" -->Planar Data</span> Processor
Simply put, it applies operations within each individual channel
Currently used for pooling layers
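---
To make "per channel" concrete, here is a small max-pooling sketch: each channel plane is pooled independently and channels never mix. Function names are invented for this example.

```python
def max_pool_plane(plane, k, stride):
    """Max pooling over one [H][W] plane with a k x k window."""
    H, W = len(plane), len(plane[0])
    return [[max(plane[i + r][j + s] for r in range(k) for s in range(k))
             for j in range(0, W - k + 1, stride)]
            for i in range(0, H - k + 1, stride)]

def max_pool(x, k, stride):
    """x: [C][H][W]; every channel is processed on its own."""
    return [max_pool_plane(plane, k, stride) for plane in x]

plane = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
print(max_pool([plane], k=2, stride=2))  # [[[6, 8], [14, 16]]]
```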
---
### CDP
> <span><!-- .element: class="fragment highlight-red" -->Cross-channel Data</span> Processor
Simply put, it performs operations on a single pixel across channels
Currently used for LRN, i.e. normalization along the channel direction
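---
A sketch of LRN for one pixel, following the commonly used definition (the one the ONNX `LRN` operator describes, as far as I know): each channel value is divided by a term built from the squared values of its neighbouring channels. Parameter names and defaults are taken from that definition, not from NVDLA.

```python
import math

def lrn_pixel(channels, size=5, alpha=1e-4, beta=0.75, bias=1.0):
    """Normalize one pixel's channel vector across a window of `size` channels."""
    C = len(channels)
    out = []
    for c, x in enumerate(channels):
        lo = max(0, c - (size - 1) // 2)
        hi = min(C - 1, c + math.ceil((size - 1) / 2))
        square_sum = sum(v * v for v in channels[lo:hi + 1])
        out.append(x / (bias + alpha / size * square_sum) ** beta)
    return out

print(lrn_pixel([1.0, 2.0, 3.0], size=3, alpha=3.0, beta=1.0))
```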
---
### RUBIK
![](https://media.giphy.com/media/Zc8c0DRlusDWU/giphy.gif)
> The data reshape engine performs <span><!-- .element: class="fragment highlight-red" -->data format transformations</span>
Simply put, it changes the data layout
---
### BDMA
> The bridge DMA (BDMA) module provides a <span><!-- .element: class="fragment highlight-red" -->data copy</span> engine to move data between the system DRAM and the dedicated high-performance memory interface
Simply put, it moves data between DRAM and the dedicated memory
---
![](https://i.imgur.com/3WDiA9e.png)
![](https://i.imgur.com/pnSrOpf.png)
from https://software.intel.com/en-us/articles/intel-processors-for-deep-learning-training
---
## NVDLA projects
- Verilog<span style="font-size: 0.6em"><!-- .element: class="fragment" data-fragment-index="1" --> - https://github.com/nvdla/hw/tree/nvdlav1/vmod</span>
- C-model<span style="font-size: 0.6em"><!-- .element: class="fragment" data-fragment-index="2" --> - https://github.com/nvdla/hw/tree/nvdlav1/cmod</span>
- Linux drivers<span style="font-size: 0.6em"><!-- .element: class="fragment" data-fragment-index="3" --> - https://github.com/nvdla/sw/tree/master/kmd</span>
- test benches and test suites<span style="font-size: 0.6em"><!-- .element: class="fragment" data-fragment-index="4" --> - https://github.com/nvdla/sw/tree/master/regression</span>
- kernel- and user-mode software<span style="font-size: 0.6em"><!-- .element: class="fragment" data-fragment-index="5" --> - https://github.com/nvdla/sw</span>
- software development tools
- Virtual Platform<span style="font-size: 0.6em"><!-- .element: class="fragment" data-fragment-index="6" --> - https://github.com/nvdla/vp</span>
- Compiler (nvdla_compiler)<span style="font-size: 0.6em"><!-- .element: class="fragment" data-fragment-index="7" --> - not open source :scream:</span>
---
## <span style="color:DodgerBlue">ONNC</span>
**O**pen **N**eural **N**etwork **C**ompiler
https://onnc.ai/
> ONNC (Open Neural Network Compiler) is a collection of open source, modular, and reusable <span><!-- .element: class="fragment highlight-red" -->compiler</span> algorithms/toolchains <span><!-- .element: class="fragment highlight-red" -->targeted on deep learning accelerators (DLAs)</span>
---
### What is a compiler?
> A compiler is a computer program that <span><!-- .element: class="fragment highlight-red" -->translates</span> computer code written in <span><!-- .element: class="fragment highlight-red" -->one programming language</span> (the source language) <span><!-- .element: class="fragment highlight-red" -->into another programming language</span> (the target language)
from https://en.wikipedia.org/wiki/Compiler
---
#### A compiler usually translates a high-level language into a lower-level one
---
#### Centered on the IR (intermediate representation)
A compiler can usually be divided into three parts
- Front-end
- Source language -> IR
- Middle-end
- IR -> IR
- Back-end
- IR -> Target language
---
#### The core of it is the middle-end
An IR -> IR function is usually called a `Pass` in the compiler world, and a single compiler contains a great many passes
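---
A toy sketch of the "pass = IR -> IR function" idea. The IR here is just a list of `(op, input, output)` tuples invented for this slide; it is not ONNC's real IR or pass API.

```python
def remove_identity(ir):
    """Pass: drop Identity nodes and rewire their consumers to the original value."""
    alias, out = {}, []
    for op, src, dst in ir:
        src = alias.get(src, src)       # follow any earlier rewiring
        if op == "Identity":
            alias[dst] = src            # dst is just another name for src
        else:
            out.append((op, src, dst))
    return out

def run_passes(ir, passes):
    """The middle-end: feed the IR through each pass in turn."""
    for p in passes:
        ir = p(ir)
    return ir

ir = [("Conv", "x", "a"), ("Identity", "a", "b"), ("Relu", "b", "y")]
print(run_passes(ir, [remove_identity]))
# [('Conv', 'x', 'a'), ('Relu', 'a', 'y')]
```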
---
### ONNX
> **O**pen **N**eural **N**etwork e**X**change format
https://onnx.ai/
---
#### With ONNX, the front-end is taken care of
![](https://media2.giphy.com/media/nXxOjZrbnbRxS/giphy.webp?cid=790b7611095c50120845995242525f8c0f14ce245c69d570&rid=giphy.webp)
---
### ONNC also has many middle-end passes of its own
---
#### Is Gemm hard to implement in your backend?
<span>We have a pass that converts Gemm into Conv for you
<!-- .element: class="fragment" data-fragment-index="1" -->
</span>
<span>From then on, you only need to implement Conv
<!-- .element: class="fragment" data-fragment-index="2" -->
</span>
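---
Why this rewrite is possible at all: a fully connected layer computes the same numbers as a 1x1 convolution over a `[C][1][1]` input. The sketch below only checks that equivalence on plain lists; it is not ONNC's pass, and all function names are invented.

```python
def gemm(x, w, b):
    """y[k] = sum_c w[k][c] * x[c] + b[k] for a single input vector x."""
    return [sum(wk[c] * x[c] for c in range(len(x))) + bk for wk, bk in zip(w, b)]

def conv1x1(x_chw, w_kc11, b):
    """1x1 convolution over a [C][H][W] input with [K][C][1][1] weights."""
    C, H, W = len(x_chw), len(x_chw[0]), len(x_chw[0][0])
    return [[[sum(w_kc11[k][c][0][0] * x_chw[c][i][j] for c in range(C)) + b[k]
              for j in range(W)] for i in range(H)] for k in range(len(w_kc11))]

def gemm_as_conv(x, w, b):
    x_chw = [[[v]] for v in x]                      # vector -> [C][1][1]
    w_kc11 = [[[[v]] for v in row] for row in w]    # [K][C] -> [K][C][1][1]
    y = conv1x1(x_chw, w_kc11, b)                   # -> [K][1][1]
    return [plane[0][0] for plane in y]

x, w, b = [1, 2, 3], [[1, 0, 1], [2, 1, 0]], [0.5, -1]
print(gemm(x, w, b), gemm_as_conv(x, w, b))  # [4.5, 3] [4.5, 3]
```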
---
#### Does Conv have a size limit on your hardware?
<span>We have a pass that splits Conv into smaller pieces for you
<!-- .element: class="fragment" data-fragment-index="1" -->
</span>
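---
One possible way to split a convolution, shown only as an illustration: run the kernels in groups of at most `max_kernels` and concatenate the output channels. Which dimension ONNC's pass actually splits is not stated here, so treat the strategy and the names as assumptions.

```python
def conv2d(x, w):
    """x: [C][H][W], w: [K][C][R][S]; stride 1, no padding."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    R, S = len(w[0][0]), len(w[0][0][0])
    return [[[sum(x[c][i + r][j + s] * wk[c][r][s]
                  for c in range(C) for r in range(R) for s in range(S))
              for j in range(W - S + 1)] for i in range(H - R + 1)] for wk in w]

def split_conv(x, w, max_kernels):
    """Same result as conv2d(x, w), computed in chunks of <= max_kernels kernels."""
    out = []
    for start in range(0, len(w), max_kernels):
        out.extend(conv2d(x, w[start:start + max_kernels]))
    return out

x = [[[1, 2], [3, 4]]]
w = [[[[1]]], [[[2]]], [[[3]]]]           # three 1x1 kernels
print(split_conv(x, w, max_kernels=2) == conv2d(x, w))  # True
```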
---
![image alt](https://media.giphy.com/media/d7c56SbLa9PRC/giphy.gif)
---
### ONNC's NVDLA back-end
<span>It converts ONNC IR into an NVDLA Loadable file
<!-- .element: class="fragment" data-fragment-index="1" -->
</span>
---
## How ONNC helps NVDLA
---
### Case 1: Support `GlobalAveragePool`
---
#### `AveragePool`
![](https://i.imgur.com/k81qrJS.png)
---
#### NVDLA PDP
![](https://i.imgur.com/W3octwc.png)
> Max, min, and mean pooling methods are supported.
---
#### `GlobalAveragePool`
![](https://i.imgur.com/84mjMlz.png)
---
#### Just set the kernel width and height to match the input
![](https://media.giphy.com/media/Z54VPIMUNTnyg/giphy.gif)
---
#### But ...
---
NVDLA's PDP
**does not support** pooling layers whose kernel height or width is > 8
![](http://giphygifs.s3.amazonaws.com/media/FyzkKAKnDUI8g/giphy.gif)
---
#### Solution
![](https://i.imgur.com/I2cSOCX.png)
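---
My reading of the solution is a cascade of small average pools whose kernel sizes multiply up to the input size. The sketch below assumes exactly that (kernel == stride at every stage); function names are invented and this is not ONNC's code.

```python
from itertools import zip_longest

LIMIT = 8  # largest kernel height/width the PDP accepts

def factor_kernel(size, limit=LIMIT):
    """Split size into factors <= limit; return None if that is impossible."""
    factors = []
    while size > limit:
        for f in range(limit, 1, -1):
            if size % f == 0:
                factors.append(f)
                size //= f
                break
        else:
            return None  # e.g. a prime larger than limit
    factors.append(size)
    return factors

def avg_pool(plane, kh, kw):
    """Average pooling with kernel == stride == (kh, kw), no padding."""
    H, W = len(plane), len(plane[0])
    return [[sum(plane[i + r][j + s] for r in range(kh) for s in range(kw)) / (kh * kw)
             for j in range(0, W, kw)]
            for i in range(0, H, kh)]

def staged_global_average(plane):
    hs, ws = factor_kernel(len(plane)), factor_kernel(len(plane[0]))
    if hs is None or ws is None:
        raise ValueError("input size cannot be factored into kernels <= 8")
    for kh, kw in zip_longest(hs, ws, fillvalue=1):
        plane = avg_pool(plane, kh, kw)   # each stage shrinks the plane
    return plane[0][0]

print(factor_kernel(14), factor_kernel(13))  # [7, 2] None
```

Because every window in a stage has the same size, the average of the stage averages equals the global average.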
---
#### But ... ...
---
The input width or height may not factor cleanly
For example, when it is a prime number
![](https://i.imgur.com/6SG6zw4.png)
---
#### Use the padding feature
![](https://i.imgur.com/tpCF1xO.png)
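---
One way padding could help, sketched in one dimension. The zero padding and the final rescale are my assumptions for illustration, not a description of what ONNC or the PDP actually does: pad up to the next size that factors into kernels <= 8, average over the padded size, then correct for the extra zeros.

```python
def is_factorable(n, limit=8):
    """True if n is a product of factors <= limit."""
    for f in range(2, limit + 1):
        while n % f == 0:
            n //= f
    return n == 1

def next_factorable(size, limit=8):
    """Smallest n >= size that can be split into kernels <= limit."""
    while not is_factorable(size, limit):
        size += 1
    return size

def padded_average(row):
    n = len(row)
    m = next_factorable(n)
    padded = row + [0.0] * (m - n)         # zeros do not change the sum
    mean_over_padded = sum(padded) / m     # what pooling over m elements yields
    return mean_over_padded * m / n        # rescale so the zeros do not dilute the mean

print(next_factorable(13))  # 14
```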
---
### Case 2: Support Concat
---
The inception module in the inception network.
![](https://i.imgur.com/yEz79r0.png)
---
#### Just concatenate the data in order along the channel direction
---
#### But ... ... ...
---
#### Actually, the memory layout in NVDLA is not NCHW
![](https://media.giphy.com/media/DowKEtWnLZcru/giphy.gif)
---
#### It is NCHWc instead: [N][C][H][W] -> [N][C/c][H][W][c]
![](https://i.imgur.com/tMlUc6s.png)
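---
A sketch of the `[N][C/c][H][W][c]` addressing. The block size `cb` is a parameter here; the value NVDLA really uses depends on the hardware configuration and is not given on these slides.

```python
def nchwc_offset(n, c, h, w, C, H, W, cb):
    """Linear element offset of (n, c, h, w) in a [N][C/cb][H][W][cb] layout."""
    blocks = -(-C // cb)  # ceil(C / cb): the last block is padded if C % cb != 0
    return (((n * blocks + c // cb) * H + h) * W + w) * cb + c % cb

# C=3 with blocks of 2 channels: channels 0 and 1 sit side by side,
# channel 2 starts a new (half-empty) block.
print(nchwc_offset(0, 1, 0, 0, C=3, H=2, W=2, cb=2))  # 1
print(nchwc_offset(0, 0, 0, 1, C=3, H=2, W=2, cb=2))  # 2
print(nchwc_offset(0, 2, 0, 0, C=3, H=2, W=2, cb=2))  # 8
```

With `C=3` and `cb=2` the buffer holds 16 slots for 12 real elements; the unused slots are the padding that matters on the next slides.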
---
The official compiler's current Concat implementation does not account for bubbles along the channel direction, so it ends up reading wrong values
---
When ONNC detects that a Concat would leave bubbles, it adds new hardware operations to remove them
![](https://media.giphy.com/media/3o7P4F86TAI9Kz7XYk/giphy.gif)
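---
What a "bubble" looks like, for a single pixel and a made-up block size of 4. This only illustrates the data problem; how ONNC removes the bubbles with hardware operations is not shown here.

```python
PAD = None  # marks an unused slot in a channel block

def pack(channels, cb):
    """Pack one pixel's channels into blocks of cb, padding the last block."""
    padded = channels + [PAD] * (-len(channels) % cb)
    return [padded[i:i + cb] for i in range(0, len(padded), cb)]

def read(blocks, c, cb):
    """Read logical channel c from a blocked layout."""
    return blocks[c // cb][c % cb]

def naive_concat(a, b, cb):
    return pack(a, cb) + pack(b, cb)   # keeps a's padding in the middle

def compacting_concat(a, b, cb):
    return pack(a + b, cb)             # bubble removed, padding only at the end

a, b = ["a0", "a1", "a2"], ["b0", "b1", "b2"]
print(read(naive_concat(a, b, 4), 3, 4))       # None (the bubble, not b0)
print(read(compacting_concat(a, b, 4), 3, 4))  # b0
```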
---
## ONNC + NVDLA Supported Models in ONNX Model Zoo
| | |
|:--------------:|:--------------:|
| AlexNet | GoogLeNet |
| CaffeNet | R-CNN ILSVRC13 |
| DenseNet-121 | Inception v1 |
| Inception v2 | ResNet-50 |
| ShuffleNet | SqueezeNet |
| VGG-19 | ZFNet-512 |
---
## How NVDLA helps ONNC
<span style="font-size:3em">OPEN SOURCE !!
<!-- .element: class="fragment" data-fragment-index="1" -->
</span>
---
#### But ... ... ... ...
![](https://media3.giphy.com/media/wJfRt2hMBRQT9xzfEt/200w.webp?cid=790b7611920074880023538c332ef44d5c697ee434b48401&rid=200w.webp)
---
### Beyond open source
---
### Good documentation is also needed
---
![](https://i.imgur.com/ftyuDLo.png)
---
![](https://i.imgur.com/XOCTePf.png)
---
![](https://i.imgur.com/iF4vKIa.png)
---
![](https://i.imgur.com/cEIBbxG.png)
---
## <span style="color:DodgerBlue">ONNC</span> + <span style="color:Lime">NVDLA</span>
<span>You can add support yourself for layers that <span style="color:DodgerBlue">ONNC</span> does not support
<!-- .element: class="fragment" data-fragment-index="1" -->
</span>
<span>You can even add support yourself for layers that <span style="color:Lime">NVDLA</span> does not support
<!-- .element: class="fragment" data-fragment-index="2" -->
</span>
---
### Relu6
<span>`min(max(features, 0), 6)`
<!-- .element: class="fragment" data-fragment-index="1" -->
</span>
<span>`min(Relu(features), 6)`
<!-- .element: class="fragment" data-fragment-index="2" -->
</span>
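---
The two formulas above, written out as a small runnable sketch: Relu6 is an ordinary Relu followed by a clamp at 6.

```python
def relu(x):
    return max(x, 0.0)

def relu6(x):
    """min(Relu(x), 6): reuse Relu, then clamp the result at 6."""
    return min(relu(x), 6.0)

print([relu6(v) for v in (-1.0, 3.0, 10.0)])  # [0.0, 3.0, 6.0]
```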
---
![](https://i.imgur.com/cGowwYD.png)
---
### ROI Pooling
<span>Patches are welcome!
<!-- .element: class="fragment" data-fragment-index="1" -->
</span>
---
## Q & A
<span>![](https://i.imgur.com/vLYHmkA.jpg)
<span style="position:relative;top:-4em;right:-5em;">The cat that lives in the RD department https://www.facebook.com/hachu.cat/</span>
<!-- .element: class="fragment" data-fragment-index="1" -->
</span>
---
If you have more questions you would like to ask privately, email
support@skymizer.com
If you would like to send us your résumé, email
recruiting@skymizer.com