# Assignment 1: RISC-V Assembly and Instruction Pipeline contributed by <`Nsly0204`> [homework1](https://hackmd.io/@sysprog/2025-arch-homework1) ## Problem B > [Commit 9d4e89a](https://github.com/sysprog21/ca2025-quizzes/commit/9d4e89ad09b3f2856ac516648c655f1c9ddef661) Fundemental Logic to improve performance: 1. Store reusable constants (e.g. 0xFF) in specific register avoid repetitively using `li` instruction. 2. Using loop instead of using recursive function (e.g. clz, multiply) to avoid caller save and callee save. ==However, whether the miss in branch prediction or saving register causes more resourse is up to discuss.)== ## Problem C > >* convertion & bf16_add & bf16_sub >[Commit 1b2689a](https://github.com/sysprog21/ca2025-quizzes/commit/1b2689a964e13024c70986d7518e72e40e714c98) >* bf16_div and bf16_mul >[Commit 86e4b5d](https://github.com/sysprog21/ca2025-quizzes/commit/86e4b5d1a6134f8ede4545e341c021ffbdfc8c3a) >* bf16_sqrt and test function >[Commit e38b9fd](https://github.com/sysprog21/ca2025-quizzes/commit/e38b9fd5559516e32614499a80956a91a7aae2f8) ![Screenshot from 2025-10-13 22-00-58](https://hackmd.io/_uploads/HylBIYc6lx.png) ```c= static inline bf16_t f32_to_bf16(float val) { uint32_t f32bits; memcpy(&f32bits, &val, sizeof(float)); if (((f32bits >> 23) & 0xFF) == 0xFF) return (bf16_t) {.bits = (f32bits >> 16) & 0xFFFF}; f32bits += ((f32bits >> 16) & 1) + 0x7FFF; return (bf16_t) {.bits = f32bits >> 16}; } ``` The overall process can be broken down into three parts: * exception detection * rounding (carrier calculation) * f32 to bf16 convertion During exception detection, fp16 is store in a 32 bits risc-v register; therefore, the `& 0xFFFF` operation can be ignored in assembly code. `(f32bits >> 16) & 1)` in line 7 calculates the 17-th bit of `f32bits` the result will be either 1 or 0. This is the implementation of round-to-nearest-even, meaning that when the truncated-part of mentissa is exactly half the percision `0x8000`, it should be round to the nearest even: * if the number after truncation is even, `(f32bits >> 16) & 1)` = 0.When the truncated-part of mentissa is `0x8000`, the carrier equals to 0 because `0x8000 + (f32bits >> 16) & 1) + 0x7FFF` = `0x0FFFF` * if the number after truncation is odd $\rightarrow$ `(f32bits >> 16) & 1)` = 1. When the truncated-part of mentissa is `0x8000`, the carrier equals to 1 because `(f32bits >> 16) & 1) + 0x7FFF` = `0x10000` There are underlying exceptions in the provided convertion method. For instance, `0x7F800001`, which represents NaN in fp32, will be convert into `0x7F80`, which represents infinity in bf16. However, such cases are inevitable due to information loss during convertion. The test program should take this phonomenon into consider. #### assembly code :::spoiler ```asm f32_to_bf16: #### ## input argument # a0 = f32 ## output argument # a0 = traslated bf16 #### # callee save addi sp, sp, -8 sw ra, 0(sp) sw s0, 4(sp) # get input argument mv s0, a0 # NaN or Inf check li t0, 0xFF # t0 = 0xFF (exp mask) srli t2, s0, 23 # (f32bits >> 23) and t1, t2, t0 # (exp of fp32) t1 = (f32bits >> 23) & 0xFF beq t1, t0, exception # if (f32bits >> 23) & 0xFF == 0xFF # convert srli t0, s0, 16 # t0 = fp32 >> 16 # round to even addi t1, x0, 1 # t1 = 1 and t0, t0, t1 # t0 = (fp32 >> 16) & 1, the 16-th bit in mantissa li t1, 0x7FFF # t1 = 0x7FFF add t0, t0, t1 # t0 = ((fp32 >> 16) & 1) + 0x7FFF, t0 = carry add t0, s0, t0 # t0 = fp32 + carry srli a0, t0, 16 # a0 = (fp32 + carry) >> 16 beq x0, x0, end_f32_to_bf16 exception: srli a0, s0, 16 # a0 = fp32 >> 16 end_f32_to_bf16: # retrieve ra and callee save lw ra, 0(sp) lw s0, 4(sp) addi sp, sp, 8 ret ``` ::: ### TODO: using sqrt function with bit operation only to enhance performance The following is a common impelment of square root function with bit operation only. It's a digit-by-digit algorithm origined from long division square root method. For more detail one can refer [Wikipedia-Square root algorithms](https://en.wikipedia.org/wiki/Square_root_algorithms) ```c ix0 += ix0; int32_t q = s = 0; /* q = sqrt(x) */ uint32_t r = 0x01000000; /* r = moving bit from right to left */ while(r!=0) { t = s+r; // temp = sqrt + 1 bit if(t <= ix0) { // if remain > add s = t+r; // s = (s+r) + r = s + 2r ix0 -= t; // remain -add q += r; // quotient add one bit } ix0 += ix0; // append one more bits r >>= 1; } ```