---
tags: AI accelerators
---

# 2022q1 Lecture 3 (quiz3)

## Question `1`

We consider weight pruning. Which of the following are true?

- [ ] `(a)` We learned in the previous lecture that a convolutional layer is an MMM. Under structured pruning, where whole filters are pruned from a convolutional layer, the MMM for this pruned convolution layer will have a reduced size for that layer's weight matrix. However, the subsequent layer's input tensor will not have a reduced size.

:::info
- Unstructured pruning: only sets the pruned weights to `0`, so it has no effect on the model's parameter count (size).
- Structured pruning: the figure below shows the effect of pruning filter $F_{i,j}$ from kernel $n_{i+1}$ on feature map $X_{i+1}$ and on the next kernel $n_{i+2}$.

![](https://i.imgur.com/MAnNyp5.png)

A NumPy sketch contrasting the two pruning styles is given at the end of this note.
:::

- [ ] `(b)` For a given pre-trained full model, to achieve the same level of sparsity after pruning (e.g., 50% of weights are zeros), structural pruning usually results in a higher test accuracy than unstructured pruning, where a filter may have only a subset of its weights pruned.

:::info
- Unstructured (fine-grained) sparsity: achieves high accuracy, but is inefficient to implement on hardware.
- Structured (coarse-grained) sparsity: efficient to implement, but achieves lower accuracy.
:::

- [x] `(c)` In the "DRAWING EARLY-BIRD TICKETS" 2020 paper, it is shown that winning tickets can be identified at early training epochs. This result means that we can do the remaining training for a smaller network and thus speed up the training.
- [ ] `(d)` None of the above.

## Question `2`

As we learned from today's lecture, under uniform quantization (UQ), the scale factor bounds the quantization error for an input x. Under logarithmic quantization (LQ), the quantization error for an input x will be related to a power-of-two quantized value. Which of the following are true?

- [x] `(a)` Under UQ with a fixed scale factor, increasing the bit-width by 1 bit for quantized values (e.g., using 8 bits instead of 7 bits) roughly doubles the size of the dynamic range that UQ covers.
- [x] `(b)` Under LQ, quantization intervals have different sizes, and the maximum quantization error for an input x depends on the size of the interval in which x resides.
- [ ] `(c)` None of the above.

:::info
A short UQ/LQ quantization sketch is given at the end of this note.
:::

## Question `3`

We learned binarized neural networks (BNNs) in today's class. We note that the dot product of two binary vectors A and B computed by a binarized neural network has the value $2 \times popcnt - N$, where $popcnt$ is the number of 1's in $XNOR(A, B)$. Which of the following are true?

- [x] `(a)` $N$ is the length of vectors A and B.
- [x] `(b)` The dot product of two binary vectors A and B will never be greater than $N$.
- [x] `(c)` The dot product may attain an increased maximum value when the length of vectors A and B increases.
- [ ] `(d)` After activation, the dot product assumes a value $0$ or $+1$ before they are encoded into $-1$ or $+1$.

:::info
The correct statement, as announced by the teaching assistant:

`(d)` After activation, the dot product assumes a value $-1$ or $+1$ before they are encoded into $0$ or $+1$.

An XNOR-popcount sketch verifying the dot-product identity is given at the end of this note.
:::

- [ ] `(e)` None of the above.
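## Supplementary sketches

The sketches below are minimal NumPy illustrations of the ideas behind the three questions. The array shapes, pruning ratios, scale factors, and random seeds are arbitrary choices for demonstration, not values from the lecture.

For Question `1`, this sketch contrasts unstructured pruning (zeroing individual weights, shapes unchanged) with structured pruning (dropping whole filters, which also shrinks the next layer's input-channel dimension, so option `(a)` is false). Ranking filters by L1 norm is just one common heuristic assumed here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical conv weights: layer i has 8 filters of shape (4, 3, 3);
# layer i+1 consumes the 8 feature maps produced by layer i.
W_i   = rng.standard_normal((8, 4, 3, 3))    # (out_ch, in_ch, kH, kW)
W_ip1 = rng.standard_normal((16, 8, 3, 3))

# Unstructured pruning: zero out the ~50% smallest-magnitude weights.
# Tensor shapes (and thus the stored model size) are unchanged.
threshold = np.median(np.abs(W_i))
W_i_unstructured = np.where(np.abs(W_i) < threshold, 0.0, W_i)
print(W_i_unstructured.shape)                # still (8, 4, 3, 3)

# Structured pruning: drop the 4 filters with the smallest L1 norm.
# Layer i's weight tensor shrinks, and so does layer i+1's input-channel
# axis, because the corresponding feature maps no longer exist.
keep = np.sort(np.argsort(np.abs(W_i).sum(axis=(1, 2, 3)))[4:])
W_i_structured   = W_i[keep]                 # (4, 4, 3, 3)
W_ip1_structured = W_ip1[:, keep]            # (16, 4, 3, 3)
print(W_i_structured.shape, W_ip1_structured.shape)
```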
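For Question `2`, this sketch assumes a symmetric signed-integer grid with round-to-nearest for UQ and nearest-power-of-two rounding for LQ; the scale of `0.1` and the sample inputs are arbitrary.

```python
import numpy as np

def uniform_quantize(x, scale, bits):
    # Symmetric uniform quantizer: round to the nearest multiple of `scale`,
    # clipped to the signed range of the given bit-width.
    qmax = 2 ** (bits - 1) - 1
    q = np.clip(np.round(np.asarray(x) / scale), -qmax - 1, qmax)
    return q * scale

def log_quantize(x):
    # Power-of-two quantizer: snap |x| to the nearest power of two in the
    # log domain, keeping the sign.
    x = np.asarray(x, dtype=float)
    mag = np.where(x == 0, np.finfo(float).tiny, np.abs(x))
    return np.sign(x) * 2.0 ** np.round(np.log2(mag))

scale = 0.1

# (a) With a fixed scale, one extra bit roughly doubles the covered range.
print(scale * (2 ** (7 - 1) - 1), scale * (2 ** (8 - 1) - 1))   # ~6.3 vs ~12.7

# Inside the range, the UQ error is bounded by scale / 2.
x = 0.237
print(abs(x - uniform_quantize(x, scale, 8)))                   # <= 0.05

# (b) LQ intervals between adjacent powers of two widen with magnitude,
# so the worst-case error depends on which interval x falls in.
for x in (0.6, 5.0):
    print(x, abs(x - log_quantize(x)))
```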
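For Question `3`, this sketch checks the identity $\text{dot}(A, B) = 2 \times popcnt - N$ for $\pm 1$ vectors, where $popcnt$ counts the 1 bits of $XNOR(A, B)$; the encoding $+1 \to 1$, $-1 \to 0$ and the vector length are assumptions for the demo.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 16

# Binary weights/activations in {-1, +1}, encoded as bits: +1 -> 1, -1 -> 0.
A = rng.choice([-1, 1], size=N)
B = rng.choice([-1, 1], size=N)
a_bits = (A > 0).astype(np.uint8)
b_bits = (B > 0).astype(np.uint8)

# XNOR is 1 exactly where the encoded bits agree, i.e. where A[i] * B[i] = +1.
xnor = 1 - (a_bits ^ b_bits)
popcnt = int(xnor.sum())

# BNN dot-product identity: 2 * popcnt - N equals the ordinary +-1 dot product
# and its magnitude can never exceed N (options (a), (b), (c)).
print(2 * popcnt - N, int(A @ B))
```

Running this prints the same value twice: if the two vectors agree in $m$ positions, the $\pm 1$ dot product is $m - (N - m) = 2m - N$, which is exactly $2 \times popcnt - N$.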