# Optimizing implementation of CNN inferences: change the *model* or the *architecture*?
_by Alexandre Honorat (IETR - Vaader) - 2021.04.08_
###### tags: `VAADER` `Reading Group`
## Abstract
<div style="text-align: justify">
CNNs are now ubiquitous, so it is necessary to implement them efficiently.
To do so, CNNs are most commonly run on GPUs, and to a lesser extent on FPGAs.
In this talk, without going into the details, we will
list some of the problems that arise when implementing CNN inference, especially on FPGAs.
We will also relate these problems to the CNN models themselves, and
we will highlight a few general recommendations extracted from the following papers.
</div>
## Slides
{%pdf https://florianlemarchand.github.io/ressources/pdfs/VAADER_Reading_Group/2021-08-04-Honorat-Hardware.pdf %}
## Presentation and Discussions
<iframe src="https://videos.insa-rennes.fr/video/0203-vaader-reading-group-6-alexandre-honorat-optimizing-implementation-of-cnn-inferences-change-the-model-or-the-architecture/?is_iframe=true" width="640" height="360" style="padding: 0; margin: 0; border:0" allowfullscreen ></iframe>
## Related Material
[Accelerating Very Deep Convolutional Networks for Classification and Detection](https://arxiv.org/pdf/1505.06798.pdf), IEEE Trans. PAMI 2016, cited 211 (IEEE) or 467 (Google) times
[FINN-R: An End-to-End Deep-Learning Framework for Fast Exploration of Quantized Neural Networks](https://dl.acm.org/doi/pdf/10.1145/3242897), ACM Trans. Reconfigurable Technol. Syst. 2018, cited 32 (ACM) or 83 (Google) times
[Parallel Multi Channel convolution using General Matrix Multiplication](https://arxiv.org/pdf/1704.04428.pdf), ASAP 2017, cited 22 (IEEE) or 66 (Google) times
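The third paper above discusses implementing convolution as a general matrix multiplication (GEMM). As a rough illustration of the classic im2col + GEMM approach that such work builds on (this is a minimal NumPy sketch of the general technique, not the specific algorithm proposed in the paper; function names, stride 1, and the absence of padding are assumptions for brevity):

```python
import numpy as np

def im2col(x, kh, kw):
    """Unfold (C, H, W) input into a (C*kh*kw, out_h*out_w) matrix.

    Assumes stride 1 and no padding (simplifying assumptions).
    """
    C, H, W = x.shape
    out_h, out_w = H - kh + 1, W - kw + 1
    cols = np.empty((C * kh * kw, out_h * out_w))
    row = 0
    for c in range(C):
        for i in range(kh):
            for j in range(kw):
                # Each row gathers one (channel, kernel-offset) slice
                # across all output positions.
                cols[row] = x[c, i:i + out_h, j:j + out_w].reshape(-1)
                row += 1
    return cols

def conv2d_gemm(x, w):
    """Multi-channel 2D convolution expressed as a single GEMM.

    x: input feature map, shape (C, H, W)
    w: filters, shape (F, C, kh, kw)
    Returns output feature maps, shape (F, out_h, out_w).
    """
    F, C, kh, kw = w.shape
    out_h = x.shape[1] - kh + 1
    out_w = x.shape[2] - kw + 1
    cols = im2col(x, kh, kw)       # (C*kh*kw, out_h*out_w)
    wmat = w.reshape(F, -1)        # (F, C*kh*kw)
    return (wmat @ cols).reshape(F, out_h, out_w)
```

The point of this reformulation is that the whole convolution becomes one large matrix product, which maps well onto highly optimized BLAS routines (and onto GPU/FPGA GEMM accelerators), at the cost of the memory duplication introduced by im2col.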