---
tags: Model Compression
---
# Technical Research Directions
https://zhuanlan.zhihu.com/p/58805980

## Redundancy in Weights
https://www.cnblogs.com/wujianming-110117/p/12702802.html
### 1. Pruning
In the 1990s, only (shallow) neural networks existed:
- **Magnitude-based**: apply a weight-decay penalty tied to the absolute values of each hidden unit's weights, driving unimportant weights toward zero so that the number of hidden units can be minimized
- **Optimal Brain Damage (OBD) / Optimal Brain Surgeon (OBS)**: measure each weight's importance using the second derivative of the loss with respect to the weights (i.e., the Hessian of the weight vector), then prune the least important ones
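The magnitude-based idea above reduces to "zero out the smallest-|w| fraction of weights". A minimal numpy sketch, assuming nothing from the cited papers beyond that idea (the function name, example matrix, and 50% sparsity are illustrative):

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest |w|."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)          # how many weights to remove
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude
    mask = np.abs(weights) > threshold     # keep only weights above it
    return weights * mask

W = np.array([[0.1, -0.8], [0.05, 1.2]])
print(magnitude_prune(W, 0.5))
# the two smallest-magnitude entries (0.1 and 0.05) are zeroed
```

OBD/OBS replace the |w| saliency here with a Hessian-based one, but the prune-by-ranking loop is the same.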
After 2012, deep neural networks took off:
- **Unstructured pruning**: most early methods were unstructured, removing individual weights. The kernels become very sparse, leaving matrices full of zero entries, so it is hard to translate the sparsity into a real speedup on general-purpose hardware.
- **Structured pruning**: recent research concentrates on structured pruning, which can be further divided into:
- channel-wise
- filter-wise
- shape-wise
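Filter-wise structured pruning, as in the Pruning Filters for Efficient ConvNets paper linked below, ranks whole filters by their L1 norm and drops the weakest. A hedged numpy sketch (function name and toy shapes are mine, not from the paper's code):

```python
import numpy as np

def prune_filters_l1(conv_weight, n_keep):
    """conv_weight has shape (out_ch, in_ch, kH, kW).
    Keep the n_keep filters with the largest L1 norm; removing the rest
    shrinks the layer's output channels from out_ch to n_keep."""
    l1_norms = np.abs(conv_weight).sum(axis=(1, 2, 3))  # one L1 norm per filter
    keep = np.sort(np.argsort(l1_norms)[-n_keep:])      # kept indices, in order
    return conv_weight[keep], keep

# toy layer: 8 filters over 3 input channels, prune half of them
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 3, 3, 3))
W_pruned, kept = prune_filters_l1(W, 4)
```

In a real network, the matching input channels of the *next* convolution must also be removed with `kept`, which is why structured pruning yields dense, directly faster layers.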

Detailed introductions:
[On Model Compression: Network Pruning (Chinese)](https://jinzhuojun.blog.csdn.net/article/details/100621397?utm_medium=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-7.control&dist_request_id=aa310a43-6bc3-468a-a8da-4f03824399e7&depth_1-utm_source=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-7.control)
[On Model Compression: Network Pruning (Chinese, mirror)](https://www.twblogs.net/a/5d7eb791bd9eee5327ffdf31)
**Filter pruning (JN):**
paper:
[Pruning Filters for Efficient ConvNets](https://arxiv.org/pdf/1608.08710.pdf)
[Paper walkthrough in Chinese (1)](https://www.itread01.com/yqxyc.html)
[Paper walkthrough in Chinese (2)](https://zhuanlan.zhihu.com/p/63779916) (many other Chinese walkthroughs exist)
code:
[Pruning Filters for Efficient Convnets](https://github.com/Eric-mingjie/rethinking-network-pruning) $\to$ :star: 1200 (many reimplementations of this paper on GitHub)
[GitHub search: Pruning Filters for Efficient ConvNets](https://github.com/search?q=Pruning+Filters+for+Efficient+ConvNets)
**Channel pruning:**
paper:
[Learning Efficient Convolutional Networks through Network Slimming](https://openaccess.thecvf.com/content_ICCV_2017/papers/Liu_Learning_Efficient_Convolutional_ICCV_2017_paper.pdf)
code:
[Network Slimming](https://github.com/Eric-mingjie/network-slimming) $\to$ :star: 564
[YOLOv3-model-pruning](https://github.com/Lam1360/YOLOv3-model-pruning) $\to$ :star: 1400
---
**Code:**
Pruning (Keras):
1. [修剪非必要權重](https://www.tensorflow.org/model_optimization/guide/pruning?hl=zh-tw)
2. [Pruning in Keras example](https://www.tensorflow.org/model_optimization/guide/pruning/pruning_with_keras?hl=zh-tw)
[Deep-Compression-PyTorch](https://github.com/mightydeveloper/Deep-Compression-PyTorch) $\to$ :star: 260 / implements Deep Compression on a simple net
[List of Weight and Filter pruning](https://github.com/he-y/Awesome-Pruning) $\to$ curated list of papers with their corresponding GitHub repositories
package:
[Distiller](https://intellabs.github.io/distiller/index.html) $\to$ Distiller is an open-source Python package for neural network compression research.(PyTorch)
- Pruning: includes filter- and channel-level methods
- Quantization
---
**:star: Classic papers:**
1. [A Survey of Model Compression and Acceleration for Deep Neural Networks](https://arxiv.org/abs/1710.09282) $\to$ surveys the background and evolution of model-compression methods
2. [Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding](https://arxiv.org/abs/1510.00149) $\to$ combines pruning, quantization, and Huffman coding to cut storage by 35x (AlexNet) to 49x (VGG-19) without significantly hurting accuracy
3. [Learning both Weights and Connections for Efficient Neural Networks](https://arxiv.org/abs/1506.02626) $\to$ the pruning method used in Deep Compression
4. [Single Shot Structured Pruning Before Training](https://arxiv.org/abs/2007.00389)
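As a rough illustration of Deep Compression's middle stage (trained quantization via weight sharing), here is a numpy-only sketch: nonzero weights are clustered into a small codebook so only cluster indices need storing. The function name, bit width, and the simple 1-D k-means are illustrative; the pruning and Huffman-coding stages are omitted:

```python
import numpy as np

def share_weights(weights, bits=2, iters=20):
    """Cluster the nonzero weights into at most 2**bits shared values,
    in the spirit of Deep Compression's weight-sharing stage."""
    nz = weights[weights != 0]
    k = min(2 ** bits, nz.size)
    centroids = np.linspace(nz.min(), nz.max(), k)  # linear initialization
    for _ in range(iters):                          # plain 1-D k-means
        idx = np.argmin(np.abs(nz[:, None] - centroids[None, :]), axis=1)
        for j in range(k):
            if np.any(idx == j):
                centroids[j] = nz[idx == j].mean()
    # rebuild the weight matrix from the shared codebook
    out = np.zeros_like(weights)
    out[weights != 0] = centroids[idx]
    return out, centroids
```

After this step each nonzero weight is one of at most `2**bits` values, so storing `bits`-wide indices plus the tiny codebook (and then Huffman-coding the indices) is what yields the paper's 35x-49x compression.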
---
**Resource:**
Pruning introduction:
https://kknews.cc/zh-tw/science/g8kmo8e.html
[Three Dimensional Convolutional Neural Network Pruning with Regularization-Based Method](https://openreview.net/pdf/4303cafc585e6d9e0738a283a208faea6c74f36e.pdf)
---
### 2. Quantization
Concept:
> Quantization converts a neural network's floating-point arithmetic to fixed-point. Several similar terms exist; low precision is probably the most general. Conventional precision stores model weights as FP32 (32-bit single-precision floating point), while low precision refers to formats such as FP16 (half-precision floating point) and INT8 (8-bit fixed-point integer). In current practice, low precision usually means INT8.

Reference:
https://jackwish.net/2019/neural-network-quantization-introduction-chn.html
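The FP32-to-INT8 conversion described above is usually an affine mapping, x ≈ scale · (q − zero_point). A minimal numpy sketch of post-training quantization of a single tensor (function names are mine; assumes the tensor is not constant):

```python
import numpy as np

def quantize_int8(x):
    """Affine (asymmetric) quantization of FP32 values to INT8."""
    qmin, qmax = -128, 127
    scale = (x.max() - x.min()) / (qmax - qmin)         # FP32 step per INT8 step
    zero_point = int(round(qmin - x.min() / scale))     # where FP32 0 lands
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map INT8 codes back to approximate FP32 values."""
    return scale * (q.astype(np.float32) - zero_point)
```

The reconstruction error is bounded by the step size `scale`, which is why INT8 works well when a tensor's dynamic range is modest.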
---
Papers with code:
[What Do Compressed Deep Neural Networks Forget?](https://paperswithcode.com/paper/selective-brain-damage-measuring-the) $\to$ :star: 16,032
[Training with Quantization Noise for Extreme Model Compression](https://paperswithcode.com/paper/training-with-quantization-noise-for-extreme) $\to$ :star: 11,300
### 3. Low-rank Factorization
### 4. Knowledge Distillation
### Reference compendium
https://github.com/memoiry/Awesome-model-compression-and-acceleration/blob/master/README.md