---
# System prepended metadata

title: Technical Research Directions
tags: [Model Compression]

---

# Technical Research Directions

https://zhuanlan.zhihu.com/p/58805980
![](https://i.imgur.com/8ZgYEQ4.png)

## Redundancy in Weights


https://www.cnblogs.com/wujianming-110117/p/12702802.html
### 1. Pruning
  In the 1990s, when there were only shallow neural networks:
  - **Magnitude-based**: apply a weight-decay penalty tied to the absolute value of each hidden unit's weights, driving small weights toward zero and thereby minimizing the number of hidden units.
  - **Optimal Brain Damage (OBD) / Optimal Brain Surgeon (OBS)**: measure the importance of each weight using the second derivative of the loss with respect to the weights (i.e., the Hessian of the weight vector), then prune the least important ones.
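A minimal NumPy sketch of the magnitude-based idea in its modern form (zero out the smallest-magnitude weights); the function name and threshold rule here are mine, not from any of the linked papers:

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.9):
    """Zero out the smallest-|w| entries until `sparsity` fraction are zero."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest |w|
    mask = np.abs(weights) > threshold            # keep only weights above it
    return weights * mask

w = np.random.randn(64, 64)
pruned = magnitude_prune(w, sparsity=0.9)
print(np.mean(pruned == 0))  # ≈ 0.9
```

This is the unstructured variant: the surviving weights stay where they are, so the resulting matrix is sparse rather than smaller.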

After 2012, with the rise of deep neural networks:

  - **Unstructured pruning**: most early methods are unstructured — they remove individual neurons/weights, leaving very sparse kernels (matrices full of zeros), so it is hard to get a real speedup on standard hardware.
  - **Structured pruning**: recent research focuses on structured pruning, which can be further divided into:
    - channel-wise
    - filter-wise
    - shape-wise
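The filter-wise case can be sketched with the L1-norm ranking criterion from *Pruning Filters for Efficient ConvNets* (linked below); this is a minimal NumPy version, and the function name is mine:

```python
import numpy as np

def prune_filters_l1(conv_w, keep_ratio=0.5):
    """Structured filter pruning: rank each output filter of a conv
    weight tensor (out_ch, in_ch, kH, kW) by its L1 norm and keep
    only the top `keep_ratio` fraction of filters."""
    out_ch = conv_w.shape[0]
    norms = np.abs(conv_w).reshape(out_ch, -1).sum(axis=1)  # L1 per filter
    n_keep = max(1, int(out_ch * keep_ratio))
    keep = np.sort(np.argsort(norms)[::-1][:n_keep])        # top-n, in order
    return conv_w[keep], keep

w = np.random.randn(64, 32, 3, 3)
pruned, kept_idx = prune_filters_l1(w, keep_ratio=0.5)
print(pruned.shape)  # (32, 32, 3, 3)
```

Because whole filters are removed, the next layer's input channels shrink accordingly — the network gets genuinely smaller instead of merely sparse.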

![](https://i.imgur.com/Jmsd1L1.png)

Detailed introductions: [Casual Talk on Model Compression: Network Pruning (Chinese)](https://jinzhuojun.blog.csdn.net/article/details/100621397?utm_medium=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-7.control&dist_request_id=aa310a43-6bc3-468a-a8da-4f03824399e7&depth_1-utm_source=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-7.control)
[Casual Talk on Model Compression: Network Pruning (Chinese mirror)](https://www.twblogs.net/a/5d7eb791bd9eee5327ffdf31)

**Filter pruning: (JN)**
paper:
[PRUNING FILTERS FOR EFFICIENT CONVNETS](https://arxiv.org/pdf/1608.08710.pdf)
[Chinese walkthrough of the paper (1)](https://www.itread01.com/yqxyc.html)
[Chinese walkthrough of the paper (2)](https://zhuanlan.zhihu.com/p/63779916) (many other Chinese walkthroughs exist)
code:
[Pruning Filters for Efficient Convnets](https://github.com/Eric-mingjie/rethinking-network-pruning) $\to$ :star: 1200 (many GitHub implementations of this paper)

[GitHub search: pruning filters for efficient convnets](https://github.com/search?q=Pruning+Filters+for+Efficient+ConvNets)

**Channel pruning:**
paper: 
[Learning Efficient Convolutional Networks through Network Slimming](https://openaccess.thecvf.com/content_ICCV_2017/papers/Liu_Learning_Efficient_Convolutional_ICCV_2017_paper.pdf)
code:
[Network Slimming](https://github.com/Eric-mingjie/network-slimming) $\to$ :star: 564
[YOLOv3-model-pruning](https://github.com/Lam1360/YOLOv3-model-pruning) $\to$ :star: 1400


---
**Code:**
Pruning (Keras):
1. [Trim insignificant weights](https://www.tensorflow.org/model_optimization/guide/pruning?hl=zh-tw)
2. [Pruning in Keras example](https://www.tensorflow.org/model_optimization/guide/pruning/pruning_with_keras?hl=zh-tw)

[Deep-Compression-PyTorch](https://github.com/mightydeveloper/Deep-Compression-PyTorch) $\to$ :star: 260 / implements Deep Compression on a simple net


[List of Weight and Filter pruning](https://github.com/he-y/Awesome-Pruning) $\to$ a large list of papers with their corresponding GitHub repos

package:
[Distiller](https://intellabs.github.io/distiller/index.html) $\to$ Distiller is an open-source Python package for neural network compression research. (PyTorch)
- Pruning: includes filter/channel pruning
- Quantization



---
**:star: Classic papers:**
1. [A Survey of Model Compression and Acceleration for Deep Neural Networks](https://arxiv.org/abs/1710.09282) $\to$ a broad survey of model compression methods and how they came about
2. [Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding](https://arxiv.org/abs/1510.00149) $\to$ combines pruning, quantization, and Huffman coding to reduce storage by 35x (AlexNet) to 49x (VGG-19) without significantly hurting accuracy
3. [Learning both Weights and Connections for Efficient Neural Networks](https://arxiv.org/abs/1506.02626) $\to$ the pruning method used in Deep Compression
4. [Single Shot Structured Pruning Before Training](https://arxiv.org/abs/2007.00389)
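The "trained quantization" (weight-sharing) step of Deep Compression (paper 2 above) can be sketched with 1-D k-means: cluster the weights and store only a small codebook plus per-weight indices. This is a simplified version — the paper additionally fine-tunes the shared centroids with gradients — and the function name is mine:

```python
import numpy as np

def share_weights_kmeans(w, n_clusters=16, n_iter=20):
    """Replace each weight by its 1-D k-means centroid, so only
    log2(n_clusters) bits per weight (plus the codebook) are stored."""
    flat = w.ravel()
    # initialize centroids linearly over the weight range (as in the paper)
    centroids = np.linspace(flat.min(), flat.max(), n_clusters)
    for _ in range(n_iter):
        idx = np.argmin(np.abs(flat[:, None] - centroids[None, :]), axis=1)
        for c in range(n_clusters):
            members = flat[idx == c]
            if members.size:                 # skip empty clusters
                centroids[c] = members.mean()
    idx = np.argmin(np.abs(flat[:, None] - centroids[None, :]), axis=1)
    return centroids[idx].reshape(w.shape), idx.reshape(w.shape)

w = np.random.randn(32, 32)
w_shared, codes = share_weights_kmeans(w, n_clusters=16)
print(len(np.unique(w_shared)))  # at most 16 distinct values
```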

---
**Resource:**
Pruning introduction:
https://kknews.cc/zh-tw/science/g8kmo8e.html

[Three Dimensional Convolutional Neural Network Pruning with Regularization-Based Method](https://openreview.net/pdf/4303cafc585e6d9e0738a283a208faea6c74f36e.pdf)


---

### 2. Quantization
Concept:
> Quantization converts a neural network's floating-point arithmetic to fixed-point. Several related terms exist; "low precision" is probably the most general. Full precision usually stores model weights in FP32 (32-bit single-precision floating point), while low precision refers to formats such as FP16 (half-precision floating point) or INT8 (8-bit fixed-point integer). In practice, "low precision" today usually means INT8.

References:
https://jackwish.net/2019/neural-network-quantization-introduction-chn.html

---
Paper with code:
[What Do Compressed Deep Neural Networks Forget?](https://paperswithcode.com/paper/selective-brain-damage-measuring-the) $\to$ :star: 16,032

[Training with Quantization Noise for Extreme Model Compression](https://paperswithcode.com/paper/training-with-quantization-noise-for-extreme) $\to$ :star: 11,300

### 3. Low-rank Factorization
### 4. Knowledge Distillation

### Reference Collection
https://github.com/memoiry/Awesome-model-compression-and-acceleration/blob/master/README.md
