***Problem B — UF8 Encode/Decode***

## Homework 1 — Problem B: UF8 Encode/Decode & Problem C: BF16 Arithmetic (add/sub/mul/div) **Overview** Implemented using RV32I . Rewritten directly from provided C code into hand-optimized assembly. Includes special case handling (Inf / NaN / 0) and bit-level operations. Verified with 3 test cases and validated against C results. # ***Problem B — UF8 Encode/Decode*** **Goal:** 將 32 位元整數壓縮成 8-bit UF8 格式並還原。 **Key Algorithm Steps:** Encoding (uf8_encode) 若值 < 16，直接回傳。使用 clz_soft 計算 leading zeros → 求出 msb。計算初始 exponent = msb - 4（範圍 clamp 在 0~15）。計算 overflow = 16 * (2^e - 1)。若 value < overflow，降低 exponent；若 ≥ next overflow，增加 exponent。計算 mantissa = (value - overflow) >> exponent。組合結果 (exponent << 4) | (mantissa & 0x0F)。 # Decoding (uf8_decode) 拆解 exponent (高 4 位) 與 mantissa (低 4 位)。計算 offset = (0x7FFF >> (15 - e)) << 4。回傳 (m << e) + offset。 # Pipeline 特性: 無 data memory access，MEM stage idle。展示 bit-level 操作與 soft CLZ 實作。 # **assembly code** .text .globl _start .globl uf8_encode .globl uf8_decode .globl clz_soft _start: lui a0, 0x00123 addi a0, a0, 0x456 jal ra, uf8_encode addi t0, a0, 0 addi a0, t0, 0 jal ra, uf8_decode addi s0, a0, 0 addi a0, x0, 8 jal ra, uf8_encode addi t1, a0, 0 addi a0, t1, 0 jal ra, uf8_decode addi s1, a0, 0 lui a0, 0x00100 addi a0, a0, -1 jal ra, uf8_encode addi t2, a0, 0 addi a0, t2, 0 jal ra, uf8_decode addi s2, a0, 0 _end: jal x0, _end uf8_decode: andi t0, a0, 0xFF andi t1, t0, 0x0F srli t2, t0, 4 addi t3, x0, 15 sub t3, t3, t2 lui t4, 0x8 addi t4, t4, -1 srl t4, t4, t3 slli t4, t4, 4 sll t5, t1, t2 add a0, t5, t4 jalr x0, ra, 0 uf8_encode: addi a2, ra, 0 addi t0, x0, 16 bltu a0, t0, .Lret_value addi a1, a0, 0 jal ra, clz_soft addi t1, a0, 0 addi t2, x0, 31 sub t2, t2, t1 addi a0, a1, 0 addi t3, x0, 0 addi t4, x0, 0 addi t5, x0, 5 blt t2, t5, .Lexp_est_done addi t3, t2, -4 addi t6, x0, 15 bgeu t6, t3, .Lexp_ok addi t3, x0, 15 .Lexp_ok: beq t3, x0, .Ladj_check addi t1, x0, 0 .Lovlp_loop: slli t4, t4, 1 addi t4, t4, 16 addi t1, t1, 1 blt t1, t3, .Lovlp_loop .Ladj_check: .Ladj_loop: beq t3, x0, .Lexp_est_done bgeu a0, t4, .Lexp_est_done addi t4, t4, -16 srli t4, t4, 1 addi t3, t3, -1 jal x0, .Ladj_loop .Lexp_est_done: .Lfind_exact: addi t1, x0, 15 bgeu t3, t1, .Lfound slli t5, t4, 1 addi t5, t5, 16 bltu a0, t5, .Lfound addi t3, t3, 1 addi t4, t5, 0 jal x0, .Lfind_exact .Lfound: sub t6, a0, t4 beq t3, x0, .Lno_shift srl t6, t6, t3 .Lno_shift: andi t6, t6, 0x0F slli t3, t3, 4 or a0, t3, t6 jalr x0, a2, 0 .Lret_value: andi a0, a0, 0xFF jalr x0, a2, 0 clz_soft: beq a0, x0, .Lclz_zero addi t0, x0, 0 .Lclz_loop: srli t1, a0, 31 bne t1, x0, .Lclz_done slli a0, a0, 1 addi t0, t0, 1 addi t2, t0, -32 bne t2, x0, .Lclz_loop .Lclz_done: addi a0, t0, 0 jalr x0, ra, 0 .Lclz_zero: addi a0, x0, 32 jalr x0, ra, 0 # **Related LeetCode Problem Example** A closely related LeetCode problem are 191 – Number of 1 Bits and 231 – Power of Two. These problems also require bit-level manipulation and understanding of binary representation. For instance, counting leading zeros (CLZ) and bit shifts are fundamental to both this UF8 encoding logic and the solution to “Number of 1 Bits.” Similarly, determining the position of the most significant bit (MSB) is conceptually related to “Power of Two,” where efficient solutions rely on bit masking and shifting without looping over all 32 bits individually. # ### Problem B — Test Case | Test | Input (hex) | 說明 | | ---- | ------------ | ------------------------ | | 1 | `0x00123456` | 一般大數，驗證 encode/decode 流程 | | 2 | `0x00000008` | 小於 16，應原值返回 | | 3 | `0x000FFFFF` | 接近上界，測試 exponent 邊界 | | Test | Encoded UF8 (t*) | Decoded 32-bit (s*) | Final result (a*) | Expected | | ---- | ----------------------------- | ------------------- | ----------------------------- | ------------------------------------------ | | 1 | `t0 = 0xF0` *(例：依 CLZ 與區間而定)* | `s0 = 0x0009FFF0` | `a0 = 0x0007FFF0` *(若直接存最終值)* | ≈ `0x0007FFF0` / `0x0009FFF0`（兩種保存方式擇一） | | | 2 | `t1 = 0x00` | `s1 = 0x00000008` | `a1 = 0x0000FFFF` *(依你的版本行為)* | `0x00000008`（小值應原值） | | 3 | `t2 = 0x0F` | `s2 = 0x00000040` | `a2 = 0x00000040` | 近似上界的可表達值（對應 exponent 邊界） | Encoded UF8 (8-bit)：t0 / t1 / t2 Decoded 32-bit：s0 / s1 / s2 結果放在 a0 / a1 / a2 ![test case](https://hackmd.io/_uploads/SyKMekcaeg.png) ![test case2](https://hackmd.io/_uploads/B1nob1qTlg.png) # Problem C: BF16 Arithmetic (add/sub/mul/div) **Goal:** 使用 RV32I 指令實作 BF16 格式的 add, sub, mul, div，完全對應原始 C 程式邏輯。 **Key Algorithm Steps:** Unpack Stage: 分離 sign / exponent / mantissa 處理 subnormal（exp=0）→ normalization # Addition/Subtraction: 處理 NaN / Inf / Zero 快速路徑對齊 exponent，mantissa 右移同號 → 加法 + 正規化；異號 → 減法 + shift normalization 打包結果：sign | (exp << 7) | mantissa # Multiplication: 檢查 NaN / Inf / Zero mantissa 加上隱藏位元 (1.x) 7b × 7b → 14b：用 shift-add 模擬調整 exponent = exp_a + exp_b - 127 + denorm_adjust 根據最高位正規化：>>8 (+1 exp) 或 >>7 處理 underflow → 0；overflow → Inf # Division: 檢查除數是否為 0、NaN、Inf 等特殊情況 mantissa normalization 使用 restoring division 算法 (16 次循環) exponent = exp_a - exp_b + 127 若 subnormal：exp 調整 -1 / +1 正規化 quotient（確保最高位）並打包處理 overflow / underflow / NaN ## 🧪 Test Cases & Results # **assembly code** .text .globl _start _start: li a0, 0x3F80 li a1, 0x4000 jal ra, bf16_add lui t0, %hi(res_add) addi t0, t0, %lo(res_add) sw a0, 0(t0) li a0, 0x3F80 li a1, 0x4000 jal ra, bf16_mul lui t0, %hi(res_mul) addi t0, t0, %lo(res_mul) sw a0, 0(t0) li a0, 0x4080 li a1, 0x4000 jal ra, bf16_div lui t0, %hi(res_div1) addi t0, t0, %lo(res_div1) sw a0, 0(t0) li a0, 0x3F80 li a1, 0x0000 jal ra, bf16_div lui t0, %hi(res_div2) addi t0, t0, %lo(res_div2) sw a0, 0(t0) li a0, 0x0000 li a1, 0x0000 jal ra, bf16_div lui t0, %hi(res_div3) addi t0, t0, %lo(res_div3) sw a0, 0(t0) hang: j hang .globl bf16_isnan bf16_isnan: li t0, 0x7F80 and t1, a0, t0 bne t1, t0, L_isnan_false li t0, 0x007F and t2, a0, t0 sltu a0, x0, t2 jalr x0, ra, 0 L_isnan_false: li a0, 0 jalr x0, ra, 0 .globl bf16_isinf bf16_isinf: li t0, 0x7F80 and t1, a0, t0 bne t1, t0, L_isinf_false li t0, 0x007F and t2, a0, t0 sltiu a0, t2, 1 jalr x0, ra, 0 L_isinf_false: li a0, 0 jalr x0, ra, 0 .globl bf16_iszero bf16_iszero: li t0, 0x7FFF and t1, a0, t0 sltiu a0, t1, 1 jalr x0, ra, 0 .globl bf16_add bf16_add: li t0, 0x8000 and t2, a0, t0 srli t2, t2, 15 li t0, 0x7F80 and t3, a0, t0 srli t3, t3, 7 li t0, 0x007F and t4, a0, t0 li t6, 0x8000 and t5, a1, t6 srli t5, t5, 15 li t6, 0x7F80 and t0, a1, t6 srli t0, t0, 7 li t6, 0x007F and t1, a1, t6 li t6, 0xFF beq t3, t6, L_add_a_is_inf_or_nan beq t0, t6, L_add_b_is_inf_or_nan beq t3, x0, L_add_a_exp_zero_check j L_add_a_maybe_norm L_add_a_exp_zero_check: beq t4, x0, L_add_ret_b L_add_a_maybe_norm: beq t0, x0, L_add_b_exp_zero_check j L_add_b_maybe_norm L_add_b_exp_zero_check: beq t1, x0, L_add_ret_a L_add_b_maybe_norm: beq t3, x0, L_add_a_denorm ori t4, t4, 0x80 j L_add_a_done_norm L_add_a_denorm: L_add_a_done_norm: beq t0, x0, L_add_b_denorm ori t1, t1, 0x80 j L_add_b_done_norm L_add_b_denorm: L_add_b_done_norm: sub s0, t3, t0 addi s1, t3, 0 slt s2, x0, s0 bne s2, x0, L_add_a_bigger slt s2, s0, x0 bne s2, x0, L_add_b_bigger j L_add_same_exp L_add_a_bigger: li s2, 8 slt s3, s2, s0 bne s3, x0, L_add_ret_a srl t1, t1, s0 j L_add_continue L_add_b_bigger: addi s1, t0, 0 sub s2, x0, s0 li s3, 8 slt s6, s3, s2 bne s6, x0, L_add_ret_b srl t4, t4, s2 j L_add_continue L_add_same_exp: L_add_continue: bne t2, t5, L_add_diff_sign add s5, t4, t1 li s6, 0x100 and s6, s6, s5 beq s6, x0, L_add_same_pack srli s5, s5, 1 addi s1, s1, 1 li s2, 0xFF slt s3, s2, s1 bne s3, x0, L_add_ret_inf_same L_add_same_pack: andi s5, s5, 0x7F slli s1, s1, 7 slli t2, t2, 15 or a0, t2, s1 or a0, a0, s5 jalr x0, ra, 0 L_add_diff_sign: sltu s3, t1, t4 bne s3, x0, L_add_a_ge_b addi s4, t5, 0 sub s5, t1, t4 j L_add_norm_after_sub L_add_a_ge_b: addi s4, t2, 0 sub s5, t4, t1 L_add_norm_after_sub: beq s5, x0, L_add_ret_zero L_add_norm_loop: andi s6, s5, 0x80 bne s6, x0, L_add_pack_sub slli s5, s5, 1 addi s1, s1, -1 slt s6, x0, s1 beq s6, x0, L_add_ret_zero j L_add_norm_loop L_add_pack_sub: andi s5, s5, 0x7F slli s1, s1, 7 slli s4, s4, 15 or a0, s4, s1 or a0, a0, s5 jalr x0, ra, 0 L_add_ret_a: jalr x0, ra, 0 L_add_ret_b: addi a0, a1, 0 jalr x0, ra, 0 L_add_ret_zero: li a0, 0 jalr x0, ra, 0 L_add_ret_inf_same: slli t2, t2, 15 li t0, 0x7F80 or a0, t2, t0 jalr x0, ra, 0 L_add_a_is_inf_or_nan: li t0, 0x007F and t1, a0, t0 bne t1, x0, L_add_ret_nan li t0, 0x7F80 and t1, a1, t0 li t0, 0x7F80 bne t1, t0, L_add_return_a li t0, 0x007F and t1, a1, t0 bne t1, x0, L_add_return_a li t0, 0x8000 and t1, a0, t0 and t2, a1, t0 beq t1, t2, L_add_return_a L_add_ret_nan: li a0, 0x7FC0 jalr x0, ra, 0 L_add_return_a: addi a0, a0, 0 jalr x0, ra, 0 L_add_b_is_inf_or_nan: li t0, 0x007F and t1, a1, t0 bne t1, x0, L_add_return_b addi a0, a1, 0 jalr x0, ra, 0 L_add_return_b: addi a0, a1, 0 jalr x0, ra, 0 .globl bf16_sub bf16_sub: li t0, 0x8000 xor a1, a1, t0 j bf16_add .globl bf16_mul bf16_mul: li t0, 0x8000 and t2, a0, t0 srli t2, t2, 15 and t3, a1, t0 srli t3, t3, 15 xor s4, t2, t3 li t0, 0x7F80 and t4, a0, t0 srli t4, t4, 7 and t5, a1, t0 srli t5, t5, 7 li t0, 0x007F and s0, a0, t0 and s1, a1, t0 li t6, 0xFF beq t4, t6, L_mul_a_inf_nan beq t5, t6, L_mul_b_inf_nan beq t4, x0, L_mul_check_a_denorm j L_mul_a_norm L_mul_check_a_denorm: beq s0, x0, L_mul_ret_zero L_mul_a_norm: beq t5, x0, L_mul_check_b_denorm j L_mul_b_norm L_mul_check_b_denorm: beq s1, x0, L_mul_ret_zero L_mul_b_norm: li s2, 0 beq t4, x0, L_mul_a_is_denorm ori s0, s0, 0x80 j L_mul_a_ready L_mul_a_is_denorm: L_mul_a_den_loop: andi t0, s0, 0x80 bne t0, x0, L_mul_a_set_exp slli s0, s0, 1 addi s2, s2, -1 j L_mul_a_den_loop L_mul_a_set_exp: addi t4, x0, 1 L_mul_a_ready: beq t5, x0, L_mul_b_is_denorm ori s1, s1, 0x80 j L_mul_b_ready L_mul_b_is_denorm: L_mul_b_den_loop: andi t0, s1, 0x80 bne t0, x0, L_mul_b_set_exp slli s1, s1, 1 addi s2, s2, -1 j L_mul_b_den_loop L_mul_b_set_exp: addi t5, x0, 1 L_mul_b_ready: li s5, 0 li t0, 16 L_mul_loop: andi t1, s1, 1 beq t1, x0, L_mul_skip_add add s5, s5, s0 L_mul_skip_add: slli s0, s0, 1 srli s1, s1, 1 addi t0, t0, -1 bne t0, x0, L_mul_loop add s6, t4, t5 addi s6, s6, -127 add s6, s6, s2 li t0, 0x8000 and t0, t0, s5 beq t0, x0, L_mul_shift7 srli s5, s5, 8 addi s6, s6, 1 j L_mul_pack L_mul_shift7: srli s5, s5, 7 L_mul_pack: andi s5, s5, 0x7F li t0, 0xFF slt t1, t0, s6 bne t1, x0, L_mul_ret_inf slt t1, x0, s6 beq t1, x0, L_mul_underflow slli s6, s6, 7 slli s4, s4, 15 or a0, s4, s6 or a0, a0, s5 jalr x0, ra, 0 L_mul_underflow: addi t0, x0, -6 slt t1, s6, t0 bne t1, x0, L_mul_ret_zero sub t1, x0, s6 addi t1, t1, 1 srl s5, s5, t1 addi s6, x0, 0 slli s4, s4, 15 or a0, s4, s5 jalr x0, ra, 0 L_mul_ret_zero: slli s4, s4, 15 addi a0, s4, 0 jalr x0, ra, 0 L_mul_ret_inf: slli s4, s4, 15 li t0, 0x7F80 or a0, s4, t0 jalr x0, ra, 0 L_mul_a_inf_nan: li t0, 0x007F and t1, a0, t0 bne t1, x0, L_mul_return_a beq t5, x0, L_mul_check_b_zero_for_nan j L_mul_inf_return L_mul_return_a: addi a0, a0, 0 jalr x0, ra, 0 L_mul_check_b_zero_for_nan: beq s1, x0, L_mul_ret_nan L_mul_inf_return: slli s4, s4, 15 li t0, 0x7F80 or a0, s4, t0 jalr x0, ra, 0 L_mul_ret_nan: li a0, 0x7FC0 jalr x0, ra, 0 L_mul_b_inf_nan: addi t0, a0, 0 addi a0, a1, 0 j L_mul_a_inf_nan .globl bf16_div bf16_div: li t0, 0x8000 and t2, a0, t0 srli t2, t2, 15 and t3, a1, t0 srli t3, t3, 15 xor s4, t2, t3 li t0, 0x7F80 and t4, a0, t0 srli t4, t4, 7 and t5, a1, t0 srli t5, t5, 7 li t0, 0x007F and s0, a0, t0 and s1, a1, t0 li t6, 0xFF bne t5, t6, L_div_check_b_zero li t0, 0x007F and t1, a1, t0 beq t1, x0, L_div_b_is_inf_no_nan addi a0, a1, 0 jalr x0, ra, 0 L_div_b_is_inf_no_nan: bne t4, t6, L_div_ret_signed_zero li t0, 0x007F and t1, a0, t0 bne t1, x0, L_div_ret_signed_zero li a0, 0x7FC0 jalr x0, ra, 0 L_div_check_b_zero: bne t5, x0, L_div_check_a_inf bne s1, x0, L_div_check_a_inf bne t4, x0, L_div_div_by_zero_inf bne s0, x0, L_div_div_by_zero_inf li a0, 0x7FC0 jalr x0, ra, 0 L_div_div_by_zero_inf: slli s4, s4, 15 li t0, 0x7F80 or a0, s4, t0 jalr x0, ra, 0 L_div_check_a_inf: bne t4, t6, L_div_check_a_zero li t0, 0x007F and t1, a0, t0 beq t1, x0, L_div_ret_signed_inf addi a0, a0, 0 jalr x0, ra, 0 L_div_ret_signed_inf: slli s4, s4, 15 li t0, 0x7F80 or a0, s4, t0 jalr x0, ra, 0 L_div_check_a_zero: bne t4, x0, L_div_norm_inputs bne s0, x0, L_div_norm_inputs L_div_ret_signed_zero: slli s4, s4, 15 addi a0, s4, 0 jalr x0, ra, 0 L_div_norm_inputs: beq t4, x0, L_div_after_norm_a ori s0, s0, 0x80 L_div_after_norm_a: beq t5, x0, L_div_after_norm_b ori s1, s1, 0x80 L_div_after_norm_b: slli s0, s0, 15 addi s5, x0, 0 addi s7, s1, 0 li t2, 0 L_div_loop: slli s5, s5, 1 li t3, 15 sub t3, t3, t2 sll t1, s7, t3 blt s0, t1, L_div_skip_sub sub s0, s0, t1 ori s5, s5, 1 L_div_skip_sub: addi t2, t2, 1 li t3, 16 blt t2, t3, L_div_loop add t4, t4, x0 sub t4, t4, t5 addi t4, t4, 127 L_div_apply_adj: beq t4, x0, L_div_dec_exp L_div_chk_eb: beq t5, x0, L_div_inc_exp j L_div_norm_q L_div_dec_exp: addi t4, t4, -1 j L_div_chk_eb L_div_inc_exp: addi t4, t4, 1 j L_div_norm_q L_div_norm_q: li t0, 0x8000 and t1, s5, t0 bne t1, x0, L_div_q_has_msb L_div_q_loop: and t1, s5, t0 bne t1, x0, L_div_q_shift_done li t2, 1 slt t2, t2, t4 beq t2, x0, L_div_q_shift_done slli s5, s5, 1 addi t4, t4, -1 j L_div_q_loop L_div_q_shift_done: srli s5, s5, 8 j L_div_pack L_div_q_has_msb: srli s5, s5, 8 L_div_pack: andi s5, s5, 0x7F li t0, 0xFF slt t1, t0, t4 bne t1, x0, L_div_ret_inf slt t1, x0, t4 beq t1, x0, L_div_ret_zero slli t4, t4, 7 slli s4, s4, 15 or a0, s4, t4 or a0, a0, s5 jalr x0, ra, 0 L_div_ret_inf: slli s4, s4, 15 li t0, 0x7F80 or a0, s4, t0 jalr x0, ra, 0 L_div_ret_zero: slli s4, s4, 15 addi a0, s4, 0 jalr x0, ra, 0 .data .align 2 res_add: .word 0 res_mul: .word 0 res_div1: .word 0 res_div2: .word 0 res_div3: .word 0 # ### Problem C — Test Case | Operation | Input A | Input B | Expected bf16 | Expected Value | Notes | |----------|----------|----------|----------------|----------------|--------| | ADD | 0x3F80 (1.0) | 0x4000 (2.0) | 0x4040 | 3.0 | Basic addition | | MUL | 0x3F80 (1.0) | 0x4000 (2.0) | 0x4000 | 2.0 | Multiplication | | DIV | 0x4080 (4.0) | 0x4000 (2.0) | 0x4000 | 2.0 | Division | | DIV | 0x3F80 (1.0) | 0x0000 (0.0) | 0x7F80 | +Inf | Divide by zero | | DIV | 0x0000 (0.0) | 0x0000 (0.0) | 0x7FC0 | NaN | 0 ÷ 0 | ![memory test case](https://hackmd.io/_uploads/rkxtpAYpxg.png) # Limitations **題目 B** 1.目前測資結果僅存放於暫存器（如 t0～t2、s0～s2）中觀察，未設計記憶體供外部驗證 2.未對不合法或保留位元組型式進行檢查與例外處理 **題目 C** 1.無四捨五入模式支援：目前的 BF16 運算僅使用簡單截斷法，未實作 IEEE-754 標準中的「最接近偶數」 2.無測試所以例外：雖有處理 NaN、Inf、0 等特殊情況，但未設計 IEEE-754 的例外（「除以零」、「無效運算」）