
# AI Compiler

## Survey

* [Pruning and Quantization for Deep Neural Network Acceleration: A Survey](https://arxiv.org/pdf/2101.09671.pdf)
* [A Survey of Quantization Methods for Efficient Neural Network Inference](https://arxiv.org/pdf/2103.13630.pdf)
* [Dynamic Neural Networks: A Survey](https://arxiv.org/pdf/2102.04906.pdf)

## Filter Pruning

* [Pruning Filters for Efficient ConvNets](https://arxiv.org/pdf/1608.08710.pdf)

## Channel Pruning

* [Channel Pruning for Accelerating Very Deep Neural Networks](https://openaccess.thecvf.com/content_ICCV_2017/papers/He_Channel_Pruning_for_ICCV_2017_paper.pdf)

## N:M Pruning

* [Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch](https://openreview.net/pdf?id=K9bw7vqp_s)
* [DominoSearch: Find layer-wise fine-grained N:M sparse schemes from dense neural networks](https://openreview.net/pdf?id=IGrC6koW_g)
* [Accelerated Sparse Neural Training: A Provable and Efficient Method to Find N:M Transposable Masks](https://arxiv.org/pdf/2102.08124.pdf)
* [Optimal Fine-Grained N:M sparsity for Activations and Neural Gradients](https://arxiv.org/pdf/2203.10991.pdf)
* [Accelerating DNN Training with Structured Data Gradient Pruning](https://arxiv.org/pdf/2202.00774.pdf)

## Column Combining

* [Packing Sparse Convolutional Neural Networks for Efficient Systolic Array Implementations: Column Combining Under Joint Optimization](https://dl.acm.org/doi/pdf/10.1145/3297858.3304028)

## MISC

* [Accelerating Sparse CNN Inference on GPUs with Performance-Aware Weight Pruning](https://dl.acm.org/doi/pdf/10.1145/3410463.3414648)
* [Neural Pruning via Growing Regularization](https://arxiv.org/pdf/2012.09243.pdf)
* [Sparse Tensor Core: Algorithm and Hardware Co-Design for Vector-wise Sparse Neural Networks on Modern GPUs](https://par.nsf.gov/servlets/purl/10203997)
* [Channel Pruning via Automatic Structure Search](https://arxiv.org/pdf/2001.08565.pdf)
* [Scalable training of artificial neural networks with adaptive sparse connectivity inspired by network science](https://www.nature.com/articles/s41467-018-04316-3.pdf)
* [Towards Fully Sparse Training: Information Restoration with Spatial Similarity](https://ojs.aaai.org/index.php/AAAI/article/view/20198/19957)
* [Efficient Design Space Exploration for Sparse Mixed Precision Neural Architectures](https://dl.acm.org/doi/pdf/10.1145/3502181.3531463)
* [1xN Pattern for Pruning Convolutional Neural Networks](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9847369)
* [GEMM](https://blog.csdn.net/just_sort/article/details/124458249)
* [DSC: Dense-Sparse Convolution for Vectorized Inference of Convolutional Neural Networks](https://openaccess.thecvf.com/content_CVPRW_2019/papers/SAIAD/Frickenstein_DSC_Dense-Sparse_Convolution_for_Vectorized_Inference_of_Convolutional_Neural_Networks_CVPRW_2019_paper.pdf)
* [Sparta: High-Performance, Element-Wise Sparse Tensor Contraction on Heterogeneous Memory](https://dl.acm.org/doi/pdf/10.1145/3437801.3441581)
* [SparTA: Deep-Learning Model Sparsity via Tensor-with-Sparsity-Attribute](https://www.usenix.org/system/files/osdi22-zheng-ningxin.pdf)
* [Accelerating Sparse DNN Models without Hardware-Support via Tile-Wise Sparsity](https://dl.acm.org/doi/pdf/10.5555/3433701.3433722)
* [MetaPruning: Meta Learning for Automatic Neural Network Channel Pruning](https://arxiv.org/pdf/1903.10258.pdf)
* [Efficient Tensor Core-Based GPU Kernels for Structured Sparsity under Reduced Precision](https://dl.acm.org/doi/pdf/10.1145/3458817.3476182)
* [Exploring the Granularity of Sparsity in Convolutional Neural Networks](https://openaccess.thecvf.com/content_cvpr_2017_workshops/w29/papers/Mao_Exploring_the_Granularity_CVPR_2017_paper.pdf)
* [SparseRT: Accelerating Unstructured Sparsity on GPUs for Deep Learning Inference](https://arxiv.org/pdf/2008.11849.pdf)
* [Ps and Qs: Quantization-Aware Pruning for Efficient Low Latency Neural Network Inference](https://arxiv.org/pdf/2102.11289.pdf)

## Dynamic Computation

* [BlockDrop: Dynamic Inference Paths in Residual Networks](https://openaccess.thecvf.com/content_cvpr_2018/papers/Wu_BlockDrop_Dynamic_Inference_CVPR_2018_paper.pdf)
* [Slimmable Neural Networks](https://arxiv.org/pdf/1812.08928.pdf)
* [Multi-scale Dense Networks for Resource Efficient Image Classification](https://arxiv.org/pdf/1703.09844.pdf)
* [Runtime Neural Pruning](https://papers.nips.cc/paper/2017/file/a51fb975227d6640e4fe47854476d133-Paper.pdf)
* [SkipNet: Learning Dynamic Routing in Convolutional Networks](https://arxiv.org/pdf/1711.09485.pdf)
* [Dynamic Deep Neural Networks: Optimizing Accuracy Efficiency Trade-Offs by Selective Execution](https://ojs.aaai.org/index.php/AAAI/article/view/11630)
* [DyFiP: Explainable AI-based Dynamic Filter Pruning of Convolutional Neural Networks](https://euromlsys.eu/pdf/euromlsys22-final20.pdf)
* [Manifold Regularized Dynamic Network Pruning](Tang_Manifold_Regularized_Dynamic_Network_Pruning_CVPR_2021_paper)
* [Dynamic Channel Pruning: Feature Boosting and Suppression](https://arxiv.org/pdf/1810.05331.pdf)
* [Dynamic Domain Adaptation for Efficient Inference](https://openaccess.thecvf.com/content/CVPR2021/html/Li_Dynamic_Domain_Adaptation_for_Efficient_Inference_CVPR_2021_paper.html)
* [Learning to Weight Samples for Dynamic Early-Exiting Networks](https://link.springer.com/content/pdf/10.1007/978-3-031-20083-0_22.pdf)
* [Spatially Adaptive Feature Refinement for Efficient Inference](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9609974)
* [HAQ: Hardware-Aware Automated Quantization with Mixed Precision](https://openaccess.thecvf.com/content_CVPR_2019/papers/Wang_HAQ_Hardware-Aware_Automated_Quantization_With_Mixed_Precision_CVPR_2019_paper.pdf)
* [Channel Gating Neural Networks](https://proceedings.neurips.cc/paper/2019/file/68b1fbe7f16e4ae3024973f12f3cb313-Paper.pdf)
* [Stop or Forward: Dynamic Layer Skipping for Efficient Action Recognition](https://openaccess.thecvf.com/content/WACV2023/papers/Seon_Stop_or_Forward_Dynamic_Layer_Skipping_for_Efficient_Action_Recognition_WACV_2023_paper.pdf)
* [Adaptive Neural Networks for Efficient Inference](http://proceedings.mlr.press/v70/bolukbasi17a/bolukbasi17a.pdf)
* [Resolution Adaptive Networks for Efficient Inference](https://openaccess.thecvf.com/content_CVPR_2020/papers/Yang_Resolution_Adaptive_Networks_for_Efficient_Inference_CVPR_2020_paper.pdf)
* [Bi-volution: A Static and Dynamic Coupled Filter](https://doi.org/10.1609/aaai.v36i1.19979)
* [DeeCap: Dynamic Early Exiting for Efficient Image Captioning](https://openaccess.thecvf.com/content/CVPR2022/papers/Fei_DeeCap_Dynamic_Early_Exiting_for_Efficient_Image_Captioning_CVPR_2022_paper.pdf)
* [PAME: Precision-Aware Multi-Exit DNN Serving for Reducing Latencies of Batched Inferences](https://dl.acm.org/doi/pdf/10.1145/3524059.3532366)
* [QoS-Aware Irregular Collaborative Inference for Improving Throughput of DNN Services](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10046047)
* [Glance and Focus: a Dynamic Approach to Reducing Spatial Redundancy in Image Classification](https://proceedings.neurips.cc/paper_files/paper/2020/file/1963bd5135521d623f6c29e6b1174975-Paper.pdf)
* [Adaptive Inference through Early-Exit Networks: Design, Challenges and Directions](https://dl.acm.org/doi/pdf/10.1145/3469116.3470012)
* [Learning Dynamic Routing for Semantic Segmentation](https://openaccess.thecvf.com/content_CVPR_2020/papers/Li_Learning_Dynamic_Routing_for_Semantic_Segmentation_CVPR_2020_paper.pdf)
* [Deep Learning Through the Lens of Example Difficulty](https://proceedings.neurips.cc/paper/2021/file/5a4b25aaed25c2ee1b74de72dc03c14e-Paper.pdf)
* [SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9407232)
* [Multi-exit DNN Inference Acceleration based on Multi-Dimensional Optimization for Edge Intelligence](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9769868&tag=1)
* [Explainable AI-based Dynamic Filter Pruning of Convolutional Neural Networks](https://openreview.net/pdf?id=vQmIksuciu2)
* [Frequency-Domain Dynamic Pruning for Convolutional Neural Networks](https://proceedings.neurips.cc/paper/2018/file/a9a6653e48976138166de32772b1bf40-Paper.pdf)
* [Dynamic Network Pruning with Interpretable Layerwise Channel Selection](https://ojs.aaai.org/index.php/AAAI/article/view/6098/5954)
* [Dynamic Model Pruning with Feedback](https://arxiv.org/pdf/2006.07253.pdf)
* [Network Pruning via Performance Maximization](https://openaccess.thecvf.com/content/CVPR2021/papers/Gao_Network_Pruning_via_Performance_Maximization_CVPR_2021_paper.pdf)
* [WISE: Predicting the Performance of Sparse Matrix Vector Multiplication with Machine Learning](https://dl.acm.org/doi/pdf/10.1145/3572848.3577506)
* [Efficient Direct Convolution Using Long SIMD Instructions](https://dl.acm.org/doi/pdf/10.1145/3572848.3577435)
* [Dynamic Deep Neural Networks: Optimizing Accuracy-Efficiency Trade-offs by Selective Execution](https://arxiv.org/pdf/1701.00299.pdf)
