---
tags: Model Compression
---

# Technical Research Directions

https://zhuanlan.zhihu.com/p/58805980

![](https://i.imgur.com/8ZgYEQ4.png)

## Redundancy in Weights

https://www.cnblogs.com/wujianming-110117/p/12702802.html

### 1. Pruning

In the 1990s, when only shallow neural networks existed:

- **Magnitude-based**: minimizes the number of hidden units by applying to each hidden unit a weight decay tied to the absolute values of its weights.
- **Optimal brain damage (OBD) / Optimal brain surgeon (OBS)**: measures the importance of each weight via the second derivative of the loss function with respect to the weights (i.e., the Hessian matrix over the weight vector), then prunes the least important ones.

After 2012, with the rise of deep neural networks:

- **Unstructured pruning**: most early methods were unstructured, removing individual weights. The kernels become very sparse (matrices with many zero entries), so it is hard to turn the sparsity into real performance gains.
- **Structured pruning**: recent research focuses on structured pruning, which can be further divided into:
    - channel-wise
    - filter-wise
    - shape-wise

![](https://i.imgur.com/Jmsd1L1.png)

Detailed introductions:
[闲话模型压缩之网络剪枝(Network Pruning)篇](https://jinzhuojun.blog.csdn.net/article/details/100621397?utm_medium=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-7.control&dist_request_id=aa310a43-6bc3-468a-a8da-4f03824399e7&depth_1-utm_source=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-7.control)
[閒話模型壓縮(Model compression)之網絡剪枝](https://www.twblogs.net/a/5d7eb791bd9eee5327ffdf31)

**Filter pruning: (JN)**
paper: [PRUNING FILTERS FOR EFFICIENT CONVNETS](https://arxiv.org/pdf/1608.08710.pdf)
[Chinese walkthrough 1](https://www.itread01.com/yqxyc.html) [Chinese walkthrough 2](https://zhuanlan.zhihu.com/p/63779916) (many more Chinese write-ups exist; a toy sketch of the method appears after the classic-papers list below)
code: [Pruning Filters for Efficient Convnets](https://github.com/Eric-mingjie/rethinking-network-pruning) $\to$ :star: 1200 (many people have reimplemented this paper on GitHub)
[GitHub search: pruning filters for efficient convnets](https://github.com/search?q=Pruning+Filters+for+Efficient+ConvNets)

**Channel pruning:**
paper: [Learning Efficient Convolutional Networks through Network Slimming](https://openaccess.thecvf.com/content_ICCV_2017/papers/Liu_Learning_Efficient_Convolutional_ICCV_2017_paper.pdf)
code: [Network Slimming](https://github.com/Eric-mingjie/network-slimming) $\to$ :star: 564
[YOLOv3-model-pruning](https://github.com/Lam1360/YOLOv3-model-pruning) $\to$ :star: 1400

---

**Code:**

Pruning (Keras):
1. [Trim insignificant weights](https://www.tensorflow.org/model_optimization/guide/pruning?hl=zh-tw)
2. [Pruning in Keras example](https://www.tensorflow.org/model_optimization/guide/pruning/pruning_with_keras?hl=zh-tw)

[Deep-Compression-PyTorch](https://github.com/mightydeveloper/Deep-Compression-PyTorch) $\to$ :star: 260 / implements Deep Compression on a simple net
[List of Weight and Filter pruning](https://github.com/he-y/Awesome-Pruning) $\to$ many papers with their corresponding GitHub repos

package: [Distiller](https://intellabs.github.io/distiller/index.html) $\to$ Distiller is an open-source Python package for neural network compression research. (PyTorch)
- Pruning: includes filter-wise / channel-wise
- Quantization

---

**:star: Classic papers:**
1. [A Survey of Model Compression and Acceleration for Deep Neural Networks](https://arxiv.org/abs/1710.09282) $\to$ surveys the background and full landscape of model-compression methods
2. [Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding](https://arxiv.org/abs/1510.00149) $\to$ combines pruning, quantization, and Huffman coding to cut storage by 35x (AlexNet) to 49x (VGG-19) without significantly hurting accuracy
3. [Learning both Weights and Connections for Efficient Neural Networks](https://arxiv.org/abs/1506.02626) $\to$ the pruning method used in Deep Compression
4. [Single Shot Structured Pruning Before Training](https://arxiv.org/abs/2007.00389)
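To make the pruning step of papers 2 and 3 above concrete, here is a minimal sketch of magnitude-based unstructured pruning, assuming PyTorch; `magnitude_prune` and the 90% sparsity level are illustrative choices, not code from any of the linked repos. In the actual Deep Compression pipeline, this step is followed by retraining with the pruned positions held at zero.

```python
import torch
import torch.nn as nn

def magnitude_prune(model: nn.Module, sparsity: float = 0.9) -> None:
    """Zero out the smallest-magnitude weights of every Conv2d/Linear layer.

    `sparsity` is the fraction of weights to remove (0.9 = drop 90%).
    Deep Compression retrains the network after this step, keeping the
    pruned positions fixed at zero.
    """
    for layer in model.modules():
        if isinstance(layer, (nn.Conv2d, nn.Linear)):
            w = layer.weight.data
            # Per-layer threshold: the `sparsity`-quantile of |w|.
            threshold = w.abs().flatten().quantile(sparsity)
            mask = (w.abs() > threshold).to(w.dtype)
            w.mul_(mask)  # in place: pruned weights become exact zeros

# Example: prune a toy CNN and check the resulting sparsity.
model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Flatten(),
                      nn.Linear(16 * 30 * 30, 10))
magnitude_prune(model, sparsity=0.9)
total = sum(p.numel() for p in model.parameters())
zeros = sum((p == 0).sum().item() for p in model.parameters())
print(f"sparsity: {zeros / total:.2%}")  # close to 90% (biases are untouched)
```

Note that the zeroed weights still occupy dense storage, which is exactly why the unstructured-pruning bullet above says the sparsity is hard to turn into real performance gains without sparse formats or hardware support.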
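Structured pruning avoids that problem by shrinking whole tensors. The Filter pruning entry above (Pruning Filters for Efficient ConvNets) ranks each convolution filter by the L1 norm of its weights and removes the lowest-ranked ones. A sketch under that assumption, again in PyTorch; `prune_filters_l1` is a made-up name, and shrinking the downstream layer's input channels to match is left out for brevity.

```python
import torch
import torch.nn as nn

def prune_filters_l1(conv: nn.Conv2d, keep_ratio: float = 0.5) -> nn.Conv2d:
    """Return a thinner Conv2d keeping the filters with the largest L1 norms.

    Follows the ranking criterion of "Pruning Filters for Efficient
    ConvNets": a filter's importance is the L1 norm of its weights. The
    layer consuming this conv's output must be shrunk to match (not shown).
    """
    # L1 norm of each output filter: sum |w| over (in_channels, kH, kW).
    norms = conv.weight.data.abs().sum(dim=(1, 2, 3))
    n_keep = max(1, int(conv.out_channels * keep_ratio))
    keep = torch.argsort(norms, descending=True)[:n_keep]

    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    pruned.weight.data = conv.weight.data[keep].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep].clone()
    return pruned

# Example: shrink a 64-filter conv down to its 32 most important filters.
conv = nn.Conv2d(3, 64, kernel_size=3, padding=1)
thin = prune_filters_l1(conv, keep_ratio=0.5)
print(thin.weight.shape)  # torch.Size([32, 3, 3, 3])
```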
---

**Resources:**

Introduction to pruning: https://kknews.cc/zh-tw/science/g8kmo8e.html
[Three Dimensional Convolutional Neural Network Pruning with Regularization-Based Method](https://openreview.net/pdf/4303cafc585e6d9e0738a283a208faea6c74f36e.pdf)

---

### 2. Quantization

Concept:

> Quantization converts a neural network's floating-point arithmetic to fixed-point. Several similar terms are in use; low precision is probably the most general. Full precision normally stores model weights in FP32 (32-bit single-precision floating point), while low precision refers to numeric formats such as FP16 (half-precision floating point) and INT8 (8-bit fixed-point integers). These days, however, low precision usually means INT8. (A toy FP32 $\to$ INT8 sketch appears at the end of this note.)

Reference: https://jackwish.net/2019/neural-network-quantization-introduction-chn.html

---

Papers with code:
[What Do Compressed Deep Neural Networks Forget?](https://paperswithcode.com/paper/selective-brain-damage-measuring-the) $\to$ :star: 16,032
[Training with Quantization Noise for Extreme Model Compression](https://paperswithcode.com/paper/training-with-quantization-noise-for-extreme) $\to$ :star: 11,300

### 3. Low-rank factorization

### 4. Knowledge distillation

### Reference compilation
https://github.com/memoiry/Awesome-model-compression-and-acceleration/blob/master/README.md
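---

To ground the quantization concept quoted in section 2 above, here is a toy sketch of affine (asymmetric) FP32 $\to$ INT8 quantization in plain NumPy. The helper names and the simple min/max calibration are illustrative, not any framework's API; real INT8 inference engines also quantize activations and fold the scales into the matrix multiplies.

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Affine (asymmetric) FP32 -> INT8: x is approximated by scale * (q - zero_point)."""
    x_min, x_max = float(x.min()), float(x.max())
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)  # the range must contain 0
    scale = (x_max - x_min) / 255.0 or 1.0           # avoid div-by-zero for an all-zero x
    zero_point = int(round(-128 - x_min / scale))    # maps x_min onto q = -128
    q = np.clip(np.round(x / scale) + zero_point, -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize_int8(q, scale, zero_point):
    return scale * (q.astype(np.float32) - zero_point)

# Example: round-trip a random weight tensor and measure the error.
w = np.random.randn(64, 64).astype(np.float32)
q, s, z = quantize_int8(w)
w_hat = dequantize_int8(q, s, z)
print("max abs error:", np.abs(w - w_hat).max())  # on the order of scale / 2
```

The round-trip error is bounded by half a quantization step, which is why INT8 usually preserves accuracy for weights whose dynamic range is modest.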