2025/1/20 kevinchou
===
## BBS: Bi-directional Bit-level Sparsity for Deep Learning Acceleration
### Abstract
- Bit-level sparsity methods skip ineffectual zero-bit operations and are typically applicable within bit-serial deep learning accelerators. They are orthogonal to and compatible with other deep neural network (DNN) efficiency methods such as quantization and pruning.
- In this work, we enhance the practicality and efficiency of bit-level sparsity.
- On the algorithmic side, we introduce bi-directional bit-level sparsity (BBS). Building on BBS, we also propose two bit-level binary pruning methods that require no retraining and can be seamlessly applied to quantized DNNs.
- On the hardware side, we demonstrate the potential of BBS through BitVert, a bit-serial architecture with an efficient processing element (PE) design, aimed at accelerating DNNs with low overhead by exploiting our proposed binary pruning methods.
---
### Related Works

- Bit-Parallel PE

- Pragmatic

Pragmatic requires a variable shifter after every bit-serial multiplier to synchronize the significance of essential bits.
- Bitlet

Bitlet digests multiple weights and activations, and computes every bit-significance independently. However, since every bit lane can absorb the essential bit from an arbitrary weight, Bitlet requires a large multiplexer.
- Bitwave

---
### Method - BBS: BI-DIRECTIONAL BIT-LEVEL SPARSITY
#### BBS Theorem


- From Eqs. 2 and 3, we can infer that instead of adding the effectual activations indicated by non-zero weight bits, the same result can be obtained by subtracting the activations indicated by zero weight bits from the sum of all activations. For each bit column, the hardware can therefore process whichever bit direction (ones or zeros) contains fewer essential bits.
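This identity is easy to check numerically; a minimal sketch with made-up activation values and one weight bit-column:

```python
# Check the BBS identity for one weight bit-column: summing activations at
# the '1' bit positions equals the sum of all activations minus the
# activations at the '0' bit positions. (Example values are made up.)
activations = [3, -1, 4, 1, 5, -9, 2, 6]
weight_bits = [1, 0, 1, 1, 0, 0, 1, 0]  # one bit-column across 8 weights

sum_ones = sum(a for a, b in zip(activations, weight_bits) if b == 1)
sum_all = sum(activations)
sum_zeros = sum(a for a, b in zip(activations, weight_bits) if b == 0)

assert sum_ones == sum_all - sum_zeros  # add-the-ones == subtract-the-zeros
```

Because either direction gives the same result, a bit-serial accelerator can always pick the side with fewer essential bits.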
#### Bit-level Binary Pruning
- BBS with Rounded Averaging

- Step 1: Identify redundant bit columns that immediately follow the most-significant column and share the same content.
- Step 2: Compute the rounded average of the values represented by the three least-significant bits of the original weights. Essentially, this replaces the three least-significant bits of all weights with a single 3-bit constant while minimizing the mean squared error (MSE).
- Step 3: Compress the original weight group by storing only the remaining four bit columns plus 8-bit encoding metadata.
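The averaging step can be sketched as follows; this is an assumed pseudocode rendering of the idea, not the paper's exact implementation, and it treats weights as unsigned integers:

```python
# Sketch of BBS with rounded averaging: replace the n_low_bits
# least-significant bits of every weight in a group with one shared
# constant (their rounded average), which minimizes the MSE over the group.
def prune_rounded_average(weights, n_low_bits=3):
    """Return (pruned_weights, shared_constant) for one weight group."""
    mask = (1 << n_low_bits) - 1
    low_parts = [w & mask for w in weights]
    # The rounded average of the low parts becomes the shared constant.
    const = round(sum(low_parts) / len(low_parts))
    return [(w & ~mask) | const for w in weights], const

# Example group of three 8-bit weights (values chosen for illustration).
pruned, const = prune_rounded_average([0b0110_1011, 0b0110_0101, 0b0110_1110])
```

After pruning, every weight in the group shares the same three low bits, so only the shared constant and the upper bit columns need to be stored.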
- BBS with Zero-point Shifting

- Step 1: Add a constant zero-point (−14 in this example) to the original weights, which changes the binary content of every number.
- Step 2: When pruning the four least-significant bit columns, each number either zeroes out its four lower bits directly or rounds up to the next bit significance, whichever minimizes the MSE.
- Step 3: Recover the actual values after binary pruning, and store the new zero-point in the encoding metadata.
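A sketch of these three steps (the zero-point of −14 follows the notes; the rounding rule and example values are assumptions):

```python
# Sketch of BBS with zero-point shifting: shift each weight by a constant
# zero-point, then prune the n_low_bits least-significant bit columns by
# either zeroing them out or rounding up to the next multiple of
# 2**n_low_bits, whichever gives the smaller per-weight error.
def prune_zero_point(weights, zero_point=-14, n_low_bits=4):
    step = 1 << n_low_bits
    shifted = [w + zero_point for w in weights]
    pruned = []
    for s in shifted:
        down = s & ~(step - 1)  # zero out the low bits
        up = down + step        # round up to the next bit significance
        pruned.append(down if abs(s - down) <= abs(s - up) else up)
    # Actual values are recovered by subtracting the zero-point, which is
    # stored in the encoding metadata.
    return [p - zero_point for p in pruned]

# Example weights (illustrative values only).
recovered = prune_zero_point([20, 17, 30])
```

After the shift, the pruned values need only the upper bit columns plus the stored zero-point to reconstruct their magnitudes.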

#### Hardware-aware Global Binary Pruning

---
### Method - BITVERT HARDWARE ARCHITECTURE
- BitVert Processing Element

- Step 1 receives 16 activations A0 ~ A15 and selects 8 of them based on sel0 ~ sel7, which indicate the positions of effectual bits in the weight bit-vector.
- Step 2 performs bit-serial multiplication using valid signals val0 ~ val7, in case there are fewer than 8 effectual bits. A subtractor subtracts the adder-tree result from the sum of activations (Eq. 2), followed by a mux that selects the partial sum.
- Step 3 then shifts the partial sum based on the column index col_idx, which specifies the significance of the current weight bits.
- Step 4 multiplies the pruning constant with the sum of activations. Finally, the product and the bit-serial partial sum are accumulated in Step 5.
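The PE datapath above can be modeled functionally; the names sel/val/col_idx follow the notes, while the exact datapath behavior (especially the constant term in Step 4) is an assumption, not the RTL:

```python
# Functional sketch of one BitVert PE cycle for a single weight bit-column.
def pe_cycle(activations, sel, val, col_idx, use_zero_bits, const=0):
    # Steps 1-2: gather the selected activations and sum the valid ones.
    gathered = sum(activations[s] for s, v in zip(sel, val) if v)
    total = sum(activations)
    # BBS mux: if this column encodes its zero bits, the subtractor computes
    # (sum of all activations) - (adder-tree result), per Eq. 2.
    partial = total - gathered if use_zero_bits else gathered
    # Step 3: align the partial sum to the current bit-column significance.
    partial <<= col_idx
    # Steps 4-5: multiply the pruning constant with the activation sum and
    # accumulate it with the bit-serial partial sum.
    return partial + const * total

# Example: 16 activations, 8 effectual bits at even positions, column 1.
psum = pe_cycle(list(range(16)), [0, 2, 4, 6, 8, 10, 12, 14], [1] * 8, 1, False)
```

Modeling the PE this way makes the BBS direction choice explicit: the same hardware serves both the "add ones" and "subtract zeros" paths through a single subtractor and mux.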
- BitVert Scheduler

- BitVert Accelerator

---