[TOC]

---

# When <span><!-- .element: class="fragment highlight-blue" -->ONNC</span> meets <span><!-- .element: class="fragment highlight-green" -->NVDLA</span>

The light that shines when open source projects meet

a127a127 - 蔡德育
a127a127@skymizer.com

---

## Speaker Background

- IOI / ICPC
  Algorithms
- Linker / Loader
  Compiler-related toolchains
- JVM / Clang / LLVM
  Process VM / Compiler

<!-- .slide: data-background="https://intern-md.skymizer.com/uploads/upload_3c8c34d502c5d84c4a2d11553c256e8e.png"-->

---

## <span style="color:lime">NVDLA</span>

<span>**NV**IDIA **D**eep **L**earning **A**ccelerator <!-- .element: class="fragment" data-fragment-index="1" --> </span>

<span>[http://nvdla.org/](http://nvdla.org/) <!-- .element: class="fragment" data-fragment-index="2" --> </span>

---

### What is a DLA?

**D**eep **L**earning **A**ccelerator

---

### Hardware that accelerates this kind of beast

![](https://i.imgur.com/kH0V8kf.png)

---

### [N][C][H][W]

![](https://i.imgur.com/o8HZUCO.png)

---

## What exactly is NVDLA?

> The NVIDIA Deep Learning Accelerator (NVDLA) is a <span><!-- .element: class="fragment highlight-red" -->free and open</span> architecture that promotes a standard way to design deep learning inference accelerators. With its modular architecture, NVDLA is <span><!-- .element: class="fragment highlight-red" -->scalable, highly configurable</span>, and designed to simplify integration and portability. <!-- .element: class="fragment" data-fragment-index="1" -->

---

> Delivered as an <span style="color:red">open source</span> project under the NVIDIA Open NVDLA License, all of the software, hardware, and documentation will be available on GitHub.

https://github.com/nvdla/

---

<!-- .slide: data-transition="slide-in fade-out" -->

![](https://i.imgur.com/Qc5hq5A.png)

---

<!-- .slide: data-transition="fade-in" -->

![](https://i.imgur.com/lV4fjNV.png)

---

### CONV

<span>![](https://media.giphy.com/media/i4NjAwytgIRDW/giphy.gif) <!-- .element: class="fragment" data-fragment-index="1" --> </span>

<span>https://media.giphy.com/media/i4NjAwytgIRDW/giphy.gif <!-- .element: class="fragment" data-fragment-index="2" --> </span>

<span>http://cs231n.github.io/convolutional-networks/ <!-- .element: class="fragment" data-fragment-index="3" --> </span>

---

### SDP

> The <span><!-- .element: class="fragment highlight-red" -->Single Data</span> Point Processor (SDP) allows for the application of both linear and non-linear functions onto individual data points.

Simply put, it performs operations on individual data points.

---

### PDP

> <span><!-- .element: class="fragment highlight-red" -->Planar Data</span> Processor

Simply put, it performs operations on individual channels.

It is currently used for pooling layers.

---

### CDP

> <span><!-- .element: class="fragment highlight-red" -->Cross-channel Data</span> Processor

Simply put, it performs operations on a single pixel across channels.

It is currently used for LRN, which normalizes along the channel direction.

---

### RUBIK

![](https://media.giphy.com/media/Zc8c0DRlusDWU/giphy.gif)

> The data reshape engine performs <span><!-- .element: class="fragment highlight-red" -->data format transformations</span>

Simply put, it changes the data layout.

---

### BDMA

> The bridge DMA (BDMA) module provides a <span><!-- .element: class="fragment highlight-red" -->data copy</span> engine to move data between the system DRAM and the dedicated high-performance memory interface

Simply put, it moves data between DRAM and the dedicated memory.

---

![](https://i.imgur.com/3WDiA9e.png)

![](https://i.imgur.com/pnSrOpf.png)

from https://software.intel.com/en-us/articles/intel-processors-for-deep-learning-training

---

## NVDLA projects

- Verilog<span style="font-size: 0.6em"><!-- .element: class="fragment" data-fragment-index="1" --> - https://github.com/nvdla/hw/tree/nvdlav1/vmod</span>
- C-model<span 
style="font-size: 0.6em"><!-- .element: class="fragment" data-fragment-index="2" --> - https://github.com/nvdla/hw/tree/nvdlav1/cmod</span>
- Linux drivers<span style="font-size: 0.6em"><!-- .element: class="fragment" data-fragment-index="3" --> - https://github.com/nvdla/sw/tree/master/kmd</span>
- test benches and test suites<span style="font-size: 0.6em"><!-- .element: class="fragment" data-fragment-index="4" --> - https://github.com/nvdla/sw/tree/master/regression</span>
- kernel- and user-mode software<span style="font-size: 0.6em"><!-- .element: class="fragment" data-fragment-index="5" --> - https://github.com/nvdla/sw</span>
- software development tools
    - Virtual Platform<span style="font-size: 0.6em"><!-- .element: class="fragment" data-fragment-index="6" --> - https://github.com/nvdla/vp</span>
    - Compiler (nvdla_compiler)<span style="font-size: 0.6em"><!-- .element: class="fragment" data-fragment-index="7" --> - Not open source :scream:</span>

---

## <span style="color:DodgerBlue">ONNC</span>

**O**pen **N**eural **N**etwork **C**ompiler

https://onnc.ai/

> ONNC (Open Neural Network Compiler) is a collection of open source, modular, and reusable <span><!-- .element: class="fragment highlight-red" -->compiler</span> algorithms/toolchains <span><!-- .element: class="fragment highlight-red" -->targeted on deep learning accelerators (DLAs)</span>

---

### What is a compiler?

> A compiler is a computer program that <span><!-- .element: class="fragment highlight-red" -->translates</span> computer code written in <span><!-- .element: class="fragment highlight-red" -->one programming language</span> (the source language) <span><!-- .element: class="fragment highlight-red" -->into another programming language</span> (the target language)

from https://en.wikipedia.org/wiki/Compiler

---

#### A compiler usually translates a high-level language into a lower-level one

---

#### Centered on the IR (intermediate representation)

A compiler can usually be divided into three parts:

- Front-end
    - Source language -> IR
- Middle-end
    - IR -> IR
- Back-end
    - IR -> Target language

---

#### The core of it all is the middle-end

An IR -> IR function is usually called a `Pass` in the compiler world, and a single compiler contains a great many passes.

---

### ONNX

> **O**pen **N**eural **N**etwork e**X**change format

https://onnx.ai/

---

#### With ONNX, the front-end is taken care of

![](https://media2.giphy.com/media/nXxOjZrbnbRxS/giphy.webp?cid=790b7611095c50120845995242525f8c0f14ce245c69d570&rid=giphy.webp)

---

### ONNC also has many middle-end passes of its own

---

#### Is Gemm hard to implement in your backend?

<span>We have a pass that converts Gemm into Conv for you <!-- .element: class="fragment" data-fragment-index="1" --> </span>

<span>From then on, you only need to implement Conv <!-- .element: class="fragment" data-fragment-index="2" --> </span>

---

#### Does your hardware limit the size of a Conv?

<span>We have a pass that splits a Conv into smaller ones for you <!-- .element: class="fragment" data-fragment-index="1" --> </span>

---

![image alt](https://media.giphy.com/media/d7c56SbLa9PRC/giphy.gif)

---

### ONNC's NVDLA Back-end

<span>It translates ONNC IR into an NVDLA Loadable file <!-- .element: class="fragment" data-fragment-index="1" --> </span>

---

## How ONNC helps NVDLA

---

### Case 1: Support `GlobalAveragePool`

---

#### `AveragePool`

![](https://i.imgur.com/k81qrJS.png)

---

#### NVDLA PDP

![](https://i.imgur.com/W3octwc.png)

> Max, min, and mean pooling methods are supported.

---

#### `GlobalAveragePool`

![](https://i.imgur.com/84mjMlz.png)

---

#### Just set the kernel width and height equal to the input's

![](https://media.giphy.com/media/Z54VPIMUNTnyg/giphy.gif)

---

#### But ...

---

NVDLA's PDP does **not support** pooling layers whose kernel height or width is greater than 8

![](http://giphygifs.s3.amazonaws.com/media/FyzkKAKnDUI8g/giphy.gif)

---

#### Solution

![](https://i.imgur.com/I2cSOCX.png)

---

#### But ... ...

---

The input width or height may not factor cleanly,
for example when it is a prime number

![](https://i.imgur.com/6SG6zw4.png)

---

#### Use the padding feature

![](https://i.imgur.com/tpCF1xO.png)

---

### Case 2: Support Concat

---

The inception module in the inception network.

![](https://i.imgur.com/yEz79r0.png)

---

#### Just join the data in order along the channel direction

---

#### But ... ... ...

---

#### In fact, the memory layout in NVDLA is not NCHW

![](https://media.giphy.com/media/DowKEtWnLZcru/giphy.gif)

---

#### It is NCHWc: [N][C][H][W] -> [N][C/c][H][W][c]

![](https://i.imgur.com/tMlUc6s.png)

---

The official compiler's current Concat implementation does not account for bubbles in the channel direction, so it ends up reading wrong values

---

When ONNC detects that a Concat would produce bubbles, it adds new hardware operations to remove them

![](https://media.giphy.com/media/3o7P4F86TAI9Kz7XYk/giphy.gif)

---

## ONNC + NVDLA Supported Models in ONNX Model Zoo

|                |                |
|:--------------:|:--------------:|
| AlexNet        | GoogLeNet      |
| CaffeNet       | R-CNN ILSVRC13 |
| DenseNet-121   | Inception v1   |
| Inception v2   | ResNet-50      |
| ShuffleNet     | SqueezeNet     |
| VGG-19         | ZFNet-512      |

---

## How NVDLA helps ONNC

<span style="font-size:3em">OPEN SOURCE !! <!-- .element: class="fragment" data-fragment-index="1" --> </span>

---

#### But ... ... ... ...

![](https://media3.giphy.com/media/wJfRt2hMBRQT9xzfEt/200w.webp?cid=790b7611920074880023538c332ef44d5c697ee434b48401&rid=200w.webp)

---

### Beyond open source

---

### You also need good documentation

---

![](https://i.imgur.com/ftyuDLo.png)

---

![](https://i.imgur.com/XOCTePf.png)

---

![](https://i.imgur.com/iF4vKIa.png)

---

![](https://i.imgur.com/cEIBbxG.png)

---

## <span style="color:DodgerBlue">ONNC</span> + <span style="color:Lime">NVDLA</span>

<span>You can add support yourself for layers that <span style="color:DodgerBlue">ONNC</span> does not support <!-- .element: class="fragment" data-fragment-index="1" --> </span>

<span>You can even add support yourself for layers that <span style="color:Lime">NVDLA</span> does not support <!-- .element: class="fragment" data-fragment-index="2" --> </span>

---

### Relu6

<span>`min(max(features, 0), 6)` <!-- .element: class="fragment" data-fragment-index="1" --> </span>

<span>`min(Relu(features), 6)` <!-- .element: class="fragment" data-fragment-index="2" --> </span>

---

![](https://i.imgur.com/cGowwYD.png)

---

### ROI Pooling

<span>Patches are welcome! <!-- .element: class="fragment" data-fragment-index="1" --> </span>

---

## Q & A

<span>![](https://i.imgur.com/vLYHmkA.jpg) <span style="position:relative;top:-4em;right:-5em;">The cat that lives in the R&D department https://www.facebook.com/hachu.cat/</span> <!-- .element: class="fragment" data-fragment-index="1" --> </span>

---

If you have more questions you would rather ask privately, email support@skymizer.com

If you would like to send us a résumé, email recruiting@skymizer.com
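
---

#### Sketch: channel bubbles in an NCHWc Concat

Case 2 says NVDLA stores tensors as [N][C/c][H][W][c] and that a Concat which ignores the channel padding reads wrong values. The following is a minimal pure-Python model of that effect, offered as an illustration only. It is not ONNC or NVDLA code. The function names are hypothetical, the block size `c = 4` is an assumed value chosen for readability, and the "fix" simply repacks the data in software, whereas the slides say ONNC inserts extra hardware operations.

```python
def pack_nchwc(x, c):
    """Pack nested lists [N][C][H][W] into [N][ceil(C/c)][H][W][c].

    Channels that do not fill the last block are zero-padded; those
    padding slots are the "bubbles".
    """
    packed = []
    for image in x:
        channels, height, width = len(image), len(image[0]), len(image[0][0])
        blocks = (channels + c - 1) // c  # ceil(C / c)
        packed.append([
            [[[image[b * c + i][y][col] if b * c + i < channels else 0
               for i in range(c)]
              for col in range(width)]
             for y in range(height)]
            for b in range(blocks)
        ])
    return packed


def unpack_nchwc(packed, channels):
    """Read the first `channels` logical channels back as [N][C][H][W]."""
    c = len(packed[0][0][0][0])
    return [
        [[[image[k // c][y][col][k % c] for col in range(len(image[0][0]))]
          for y in range(len(image[0]))]
         for k in range(channels)]
        for image in packed
    ]


def naive_concat(packed_a, packed_b):
    """Place the two packed buffers back to back along the block axis.

    The padding at the end of `packed_a` stays in the middle of the result.
    """
    return [ia + ib for ia, ib in zip(packed_a, packed_b)]


def bubble_free_concat(a, b, c):
    """Concatenate logical channels first, then pack once (software model)."""
    return pack_nchwc([ia + ib for ia, ib in zip(a, b)], c)


# Assumed toy data: N=1, H=W=1, three channels in `a`, two in `b`.
C_BLOCK = 4  # assumed block size, not taken from NVDLA documentation
a = [[[[1]], [[2]], [[3]]]]
b = [[[[4]], [[5]]]]

naive = unpack_nchwc(
    naive_concat(pack_nchwc(a, C_BLOCK), pack_nchwc(b, C_BLOCK)), 5)
fixed = unpack_nchwc(bubble_free_concat(a, b, C_BLOCK), 5)

print(naive)  # [[[[1]], [[2]], [[3]], [[0]], [[4]]]]  channel 3 is a bubble
print(fixed)  # [[[[1]], [[2]], [[3]], [[4]], [[5]]]]
```

In the naive result the fourth channel is padding and the last channel of `b` is lost, which matches the "wrong values" symptom described in Case 2. When every input has a channel count that is a multiple of `c`, both paths give the same answer, so the extra work is only needed when a bubble is detected.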