---
tags: jserv, 2023-arch, RISC-V
---
# Assignment1: RISC-V Assembly and Instruction Pipeline
contributed by [freshLiver](https://gist.github.com/freshLiver/1b2300a91d466a7f2cc0a78b53fa5075)
## Quiz 1 Problem C
According to <https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc#integer-calling-convention>:
> Scalars that are 2×XLEN bits wide are passed in a pair of argument registers, with the low-order XLEN bits in the lower-numbered register and the high-order XLEN bits in the higher-numbered register. If no argument registers are available, the scalar is passed on the stack by value.
### Mask the trailing 1s
The function `mask_lowest_zero()` takes a 64-bit integer as input. To make its idea easier to follow, we first look at an 8-bit version:
```c
uint8_t mask_lowest_zero(uint8_t x)
{
uint8_t mask = x;
mask &= (mask << 1) | 0x1;
mask &= (mask << 2) | 0x3;
mask &= (mask << 4) | 0xF;
return mask;
}
```
And the 8-bit input value `x`, i.e. the initial value of `mask`, can be written as:
$$
mask = (b_7, b_6, ..., b_1, b_0), \quad b_i \in \{0,1\}
$$
Then, the expression `mask &= (mask << 1) | 0x1` (`mask & ((mask << 1) | 0x1)`) can be viewed as:
```text
b7 b6 b5 b4 b3 b2 b1 b0
&) b6 b5 b4 b3 b2 b1 b0 1
```
And the result (the new `mask`) could be represented as:
$$
mask = (\Pi^{7}_{i=6}b_i, \Pi^{6}_{i=5}b_i, ..., \Pi^{1}_{i=0}b_i, \Pi^{0}_{i=0}b_i)
$$
Similarly, the next expression `mask &= (mask << 2) | 0x3` (`mask & ((mask << 2) | 0x3)`) could be viewed as:
```text
b7&b6 b6&b5 b5&b4 b4&b3 b3&b2 b2&b1 b1&b0 b0&1
&) b5&b4 b4&b3 b3&b2 b2&b1 b1&b0 b0&1 1 1
```
And its result can be represented as:
$$
mask = (\Pi^{7}_{i=4}b_i, \Pi^{6}_{i=3}b_i, \Pi^{5}_{i=2}b_i, \Pi^{4}_{i=1}b_i, \Pi^{3}_{i=0}b_i, \Pi^{2}_{i=0}b_i, \Pi^{1}_{i=0}b_i, \Pi^{0}_{i=0}b_i)
$$
Then, the next operation `mask &= (mask << 4) | 0xF` (`mask & ((mask << 4) | 0xF)`) is:
```text
b7&b6&b5&b4 b6&b5&b4&b3 b5&b4&b3&b2 b4&b3&b2&b1 b3&b2&b1&b0 b2&b1&b0&1 b1&b0&1 b0&1&1
&) b3&b2&b1&b0 b2&b1&b0&1 b1&b0&1 b0&1&1 1 1 1 1
```
So, the result of the 8-bit version is:
$$
mask = (\Pi^{7}_{i=0}b_i, \Pi^{6}_{i=0}b_i, \Pi^{5}_{i=0}b_i, \Pi^{4}_{i=0}b_i, \Pi^{3}_{i=0}b_i, \Pi^{2}_{i=0}b_i, \Pi^{1}_{i=0}b_i, \Pi^{0}_{i=0}b_i)
$$
Now, back to the 64-bit version of `mask_lowest_zero`, the output is:
$$
mask = (b'_{63}, b'_{62}, ..., b'_1, b'_0)
= (\Pi^{63}_{i=0}b_i, \Pi^{62}_{i=0}b_i, ..., \Pi^{1}_{i=0}b_i, \Pi^{0}_{i=0}b_i)
$$
For each bit $b'_i$ of the result, $b'_i$ is 1 only if all of the bits $b_0, ..., b_i$ are 1.
In other words, this function returns a mask that covers all the trailing 1s of the given `x` ($b'_0, ..., b'_{k-1}$, where $b_k$ is the lowest zero bit).
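The whole derivation can be checked directly by running the 8-bit version (the same code as above, made self-contained):

```c
#include <stdint.h>

/* 8-bit version: returns a mask covering the trailing 1s of x */
uint8_t mask_lowest_zero(uint8_t x)
{
    uint8_t mask = x;
    mask &= (mask << 1) | 0x1;
    mask &= (mask << 2) | 0x3;
    mask &= (mask << 4) | 0xF;
    return mask;
}
```

For example, `x = 0b10110111` has its first zero at bit 3, so the function returns `0b00000111`; for `x = 0b00000110` (bit 0 is already 0) it returns `0`.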
#### Implement `mask_lowest_zero` with rv32i
```asm
mask_lowest_zero:
# prologue (1 64 bits arg, no func call)
# uint64_t mask = x
mv t0, a0 # lower
mv t1, a1 # upper
li t5, 0xFFFF
li t6, 0xFFFFFFFF
# mask &= (mask << 1) | 0x1
srli t2, t0, 31 # move MSB to LSB (other bits are 0)
slli t3, t0, 1
slli t4, t1, 1
or t4, t4, t2 # apply MSB of lower part
ori t3, t3, 0x1
and t0, t0, t3
and t1, t1, t4
# mask &= (mask << 2) | 0x3
srli t2, t0, 30
slli t3, t0, 2
slli t4, t1, 2
or t4, t4, t2
ori t3, t3, 0x3
and t0, t0, t3
and t1, t1, t4
# mask &= (mask << 4) | 0xF
srli t2, t0, 28
slli t3, t0, 4
slli t4, t1, 4
or t4, t4, t2
ori t3, t3, 0xF
and t0, t0, t3
and t1, t1, t4
# mask &= (mask << 8) | 0xFF
srli t2, t0, 24
slli t3, t0, 8
slli t4, t1, 8
or t4, t4, t2
ori t3, t3, 0xFF
and t0, t0, t3
and t1, t1, t4
# mask &= (mask << 16) | 0xFFFF
srli t2, t0, 16
slli t3, t0, 16
slli t4, t1, 16
or t4, t4, t2
or t3, t3, t5
and t0, t0, t3
and t1, t1, t4
# mask &= (mask << 32) | 0xFFFFFFFF
li t3, 0
mv t4, t1
or t3, t3, t6
and a0, t0, t3
and a1, t1, t4
# epilogue (return mask)
ret
```
To test this function, add the following code:
```asm
.data
str_true: .string "assertion passed!"
str_false: .string "assertion failed..."
.text
main:
li a0, 0xFFFFFFFF # input lower
li a1, 0xFFFFFFFF # input upper
jal mask_lowest_zero
li a2, 0xFFFFFFFF # expected lower
li a3, 0xFFFFFFFF # expected upper
jal assert_result
li a7, 93
ecall
assert_result:
mv t0, a0 # actual result lower
mv t1, a1 # actual result upper
mv t2, a2 # expected result lower
mv t3, a3 # expected result upper
bne t0, t2, assert_result_false # not expected lower
bne t1, t3, assert_result_false # not expected upper
assert_result_true:
li a0, 1 # file descriptor
la a1, str_true # address of string
li a2, 17 # length of string
li a7, 64 # syscall number for write
ecall
li a0, 0
ret
assert_result_false:
li a0, 1 # file descriptor
la a1, str_false # address of string
li a2, 19 # length of string
li a7, 64 # syscall number for write
ecall
li a0, 1
ret
```
### Increment an integer with bitwise operations only
```c
int64_t inc(int64_t x)
{
if (~x == 0)
return 0;
/* TODO: Carry flag */
int64_t mask = mask_lowest_zero(x);
int64_t z1 = mask ^ ((mask << 1) | 1);
return (x & ~mask) | z1;
}
```
Regardless of whether the input value is positive or negative, incrementing it just sets the lowest zero bit ($b_k$) and clears all the bits below it ($b_{[0,k)}$).
1. Set the lowest zero bit $b_k$
Since `mask_lowest_zero()` only masks the bits below $b_k$, the expression `mask ^ ((mask << 1) | 1)` masks exactly the lowest zero bit $b_k$.
2. Clear all the bits below $b_k$
All the bits below $b_k$ can be cleared by ANDing with `~mask`.
:::warning
**Is `~x == 0` redundant ?**
This function returns right away when `~x == 0` (i.e. `x == 0xFFFFFFFFFFFFFFFF`, the 64-bit `-1`).
However, when the value of `x` is `-1`, `mask` and `z1` evaluate to `0xFFFFFFFFFFFFFFFF` and `0`, respectively, so `(x & ~mask) | z1` becomes `(-1 & 0) | 0 = 0`, which is the correct incremented value. Therefore, the branch condition should be redundant.
:::
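To sanity-check the two steps (and the redundancy claim in the note above), here is a minimal C sketch of the 64-bit `inc` built on a written-out 64-bit `mask_lowest_zero64` (the name, and the use of unsigned arithmetic internally to avoid shifting a negative value, are choices made here, not from the original):

```c
#include <stdint.h>

/* 64-bit version of mask_lowest_zero, written out in full */
uint64_t mask_lowest_zero64(uint64_t x)
{
    uint64_t mask = x;
    mask &= (mask << 1)  | 0x1;
    mask &= (mask << 2)  | 0x3;
    mask &= (mask << 4)  | 0xF;
    mask &= (mask << 8)  | 0xFF;
    mask &= (mask << 16) | 0xFFFF;
    mask &= (mask << 32) | 0xFFFFFFFF;
    return mask;
}

/* increment with bitwise operations; the ~x == 0 early return is
 * deliberately dropped, matching the argument that it is redundant */
int64_t inc(int64_t x)
{
    uint64_t mask = mask_lowest_zero64((uint64_t) x);
    uint64_t z1 = mask ^ ((mask << 1) | 1);       /* only the lowest zero bit */
    return (int64_t) (((uint64_t) x & ~mask) | z1); /* set it, clear bits below */
}
```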
#### Implement `inc` with rv32i
```asm
inc:
# prologue
addi sp, sp, -12
sw s0, -8(sp)
sw s1, -4(sp)
sw ra, 0(sp)
mv s0, a0
mv s1, a1
# int64_t mask = mask_lowest_zero(x);
jal mask_lowest_zero
# int64_t z1 = mask ^ ((mask << 1) | 1);
slli t0, a0, 1 # a0 is mask's lower
slli t1, a1, 1 # a1 is mask's upper
srli t2, a0, 31
or t1, t1, t2
ori t0, t0, 1
xor t0, a0, t0 # t0 is z1's lower
xor t1, a1, t1 # t1 is z1's upper
# return (x & ~mask) | z1;
li t2, 0xFFFFFFFF
xor t3, a0, t2 # t3 is ~mask's lower
xor t4, a1, t2 # t4 is ~mask's upper
and t3, t3, s0
and t4, t4, s1
or a0, t0, t3
or a1, t1, t4
# epilogue:
lw ra, 0(sp)
lw s1, -4(sp)
lw s0, -8(sp)
addi sp, sp, 12
ret
```
To verify this, replace the main function with the following code:
```asm
main:
li a0, 0xFFFFFFFF # input lower
li a1, 0xFFFFFFFF # input upper
jal inc
li a2, 0 # expected lower
li a3, 0 # expected upper
jal assert_result
li a7, 93
ecall
```
### Get n-th Bit
```c
static inline int64_t getbit(int64_t value, int n)
{
return (value >> n) & 1;
}
```
#### Implement `getbit` with rv32i
```asm
getbit:
addi t0, a2, -32
bltz t0, getbit_lower
getbit_upper: # target at upper part
srl t1, a1, t0
andi a0, t1, 1
li a1, 0
ret
getbit_lower: # target at lower part
srl t1, a0, a2
andi a0, t1, 1
li a1, 0
ret
```
If the given `n` is less than 32, the target bit lies in the lower word, so we shift the lower 32 bits by `n`; otherwise we shift the upper 32 bits by `n - 32`.
`a1` is always set to zero, since the result can only be 0 or 1.
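The same split can be written in C over the two 32-bit halves, mirroring how the 64-bit value arrives in `a0`/`a1` (the helper name `getbit_halves` is made up for illustration):

```c
#include <stdint.h>

/* Illustrative helper mirroring the rv32i getbit: the 64-bit value is
 * split into two 32-bit halves, as it is across a0/a1. */
uint32_t getbit_halves(uint32_t lower, uint32_t upper, int n)
{
    if (n < 32)
        return (lower >> n) & 1;    /* bit lives in the lower word */
    return (upper >> (n - 32)) & 1; /* bit lives in the upper word */
}
```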
To verify the implementation, replace the main function with the following code:
```asm
main:
li a0, 0x7FFFFFFF # input lower
li a1, 0x80000000 # input upper
li a2, 63 # input n
jal getbit
li a2, 1 # expected lower
li a3, 0 # expected upper
jal assert_result
li a7, 93
ecall
```
### Int32 Multiplication
```c
int64_t imul32(int32_t a, int32_t b)
{
int64_t r = 0, a64 = (int64_t) a, b64 = (int64_t) b;
for (int i = 0; i < 32; i++) {
if (getbit(b64, i))
r += a64 << i;
}
return r;
}
```
This function performs long multiplication: it loops over the bits of the multiplier (`b64`) from LSB to MSB, and for each set bit it adds the correspondingly shifted multiplicand (`a64 << i`) to the result.
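On rv32i the 64-bit accumulation `r += a64 << i` has to be carried out on two 32-bit words, detecting the carry out of the low word the way `sltu` does in the assembly below. A C sketch of that add-with-carry step (the helper name `add64` is illustrative):

```c
#include <stdint.h>

/* Illustrative add-with-carry over two 32-bit words:
 * (hi:lo) += (addhi:addlo), detecting the low-word carry the same way
 * the assembly does with sltu. */
void add64(uint32_t *lo, uint32_t *hi, uint32_t addlo, uint32_t addhi)
{
    uint32_t old = *lo;
    *lo += addlo;
    *hi += addhi + (*lo < old); /* carry is set iff the low word wrapped */
}
```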
#### Implement `imul32` with rv32i
```asm
imul32:
# prologue
addi sp, sp, -40
sw s0, -36(sp) # s0 for a
sw s1, -32(sp) # s1 for b
sw s2, -28(sp) # s2 for lower r
sw s3, -24(sp) # s3 for upper r
sw s4, -20(sp) # s4 for lower a64
sw s5, -16(sp) # s5 for upper a64
sw s6, -12(sp) # s6 for lower b64
sw s7, -8(sp) # s7 for upper b64
sw s8, -4(sp) # s8 for i
sw ra, 0(sp)
mv s0, a0
mv s1, a1
li s2, 0
li s3, 0
li s8, 0
mv s4, s0
slti s5, s0, 0 # test sign for upper a64
neg s5, s5 # use 2's complement to sign-extend
mv s6, s1
slti s7, s1, 0 # test sign for upper b64
neg s7, s7 # use 2's complement to sign-extend
imul32_loop:
mv a0, s6
mv a1, s7
mv a2, s8
jal getbit
beqz a0, imul32_loop_cont
li t2, 31
sub t2, t2, s8 # 31 - i
srl t2, s4, t2 # shift in two steps so that i == 0 yields 0
srli t2, t2, 1 # t2 = lower a64 >> (32 - i)
sll t0, s4, s8 # t0 for shifted lower a64
sll t1, s5, s8
or t1, t1, t2 # t1 for shifted upper a64
mv t3, s2 # keep old lower r for testing overflow
add s2, s2, t0
sltu t0, s2, t3 # set carry if lower overflow
add s3, s3, t1
add s3, s3, t0 # add carry bit from lower
imul32_loop_cont:
addi s8, s8, 1
slti t0, s8, 32
bnez t0, imul32_loop
# epilogue
mv a0 ,s2
mv a1 ,s3
lw ra, 0(sp)
lw s8, -4(sp)
lw s7, -8(sp)
lw s6, -12(sp)
lw s5, -16(sp)
lw s4, -20(sp)
lw s3, -24(sp)
lw s2, -28(sp)
lw s1, -32(sp)
lw s0, -36(sp)
addi sp, sp, 40
ret
```
To verify this implementation, replace the main function with:
```asm
main:
li a0, 0x81234567 # input a
li a1, 0x90ABCDEF # input b
jal imul32
li a2, 0x1A4E4629 # expected lower
li a3, 0xB84EB38C # expected upper
jal assert_result
li a7, 93
ecall
```
### Float32 Multiplication
```c
float fmul32(float a, float b)
{
/* TODO: Special values like NaN and INF */
int32_t ia = *(int32_t *) &a, ib = *(int32_t *) &b;
/* sign */
int sa = ia >> 31;
int sb = ib >> 31;
/* mantissa */
int32_t ma = (ia & 0x7FFFFF) | 0x800000;
int32_t mb = (ib & 0x7FFFFF) | 0x800000;
/* exponent */
int32_t ea = ((ia >> 23) & 0xFF);
int32_t eb = ((ib >> 23) & 0xFF);
```
The first part simply extracts the 1-bit sign, the 8-bit exponent, and the 23-bit fraction from each of the two 32-bit floating-point inputs.
Note that in the IEEE 754 single-precision format, a normalized mantissa has the form $1.fraction$ with the leading 1 left implicit, so the lower 23 bits alone are not the full mantissa; the code therefore ORs the hidden 1 back into `ma` and `mb`.
```c
/* 'r' = result */
int64_t mrtmp = imul32(ma, mb) >> 23;
int mshift = getbit(mrtmp, C01);
int64_t mr = mrtmp >> mshift;
int32_t ertmp = ea + eb - C02;
int32_t er = mshift ? inc(ertmp) : ertmp;
/* TODO: Overflow ^ */
int sr = sa ^ sb;
int32_t r = (sr << C03) | ((er & 0xFF) << C04) | (mr & 0x7FFFFF);
return *(float *) &r;
}
```
And then, to multiply the two floating-point values: the product is negative exactly when one of the two inputs is negative, so the result sign bit is simply `sr = sa ^ sb`.
For the exponent part, we need to add the two exponents together. However, the exponent fields extracted from the inputs are biased:
$$
biased\ exponent = real\ exponent + 127
$$
If we simply add the two exponent fields `ea` and `eb` together, the bias is counted twice, and we get
$$
real\ exponent_a + 127 + real\ exponent_b + 127 = real\ exponent_a + real\ exponent_b + 254
$$
Instead of:
$$
real\ exponent_a + real\ exponent_b + 127
$$
Therefore, we need to additionally subtract 127 from the result of `ea + eb`, which means `C02` should be 127.
And for the mantissa part, we need to multiply the two mantissas (including the hidden 1s). Since the hidden 1s were already ORed back into `ma` and `mb`, we can multiply them directly to get `mrtmp`.
:::info
**Ignore the lower 23 bits**
After multiplying the two 24-bit mantissas, the result can be up to 48 bits wide, far more than the 23 fraction bits a `float` can hold.
Therefore, the code simply discards the lower 23 bits and uses the remaining 25 bits to compute the normalized result mantissa.
:::
Then, since the result mantissa (`mrtmp`) represents $1.frac_a \times 1.frac_b$, its value must be smaller than 4, so we can simply test bit 24 (zero-based; the second bit to the left of the binary point) to determine whether the result mantissa is already normalized, which means `C01` should be 24. If that bit is 1, we increment the result exponent `ertmp` and shift one more bit off the result mantissa `mrtmp`.
Finally, we combine the result sign bit (`sr`), the result exponent (`er`, already biased and incremented if needed), and the result mantissa (`mr`, normalized). Therefore, `C03` and `C04` should be 31 (the MSB, for the sign bit) and 23 (the exponent field occupies the 8 bits below the MSB) respectively.
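Putting the constants together (`C01 = 24`, `C02 = 127`, `C03 = 31`, `C04 = 23`), here is a self-contained sketch of the function. Note the substitutions made here: a native 64-bit multiply stands in for `imul32`, and native `>>`/`+` stand in for `getbit` and `inc`; special values such as NaN and Inf remain unhandled, as in the original TODO.

```c
#include <stdint.h>
#include <string.h>

/* Sketch of fmul32 with C01..C04 filled in; native 64-bit arithmetic
 * replaces the imul32/getbit/inc helpers used in the original. */
float fmul32(float a, float b)
{
    int32_t ia, ib;
    memcpy(&ia, &a, 4);
    memcpy(&ib, &b, 4);

    uint32_t sa = ((uint32_t) ia) >> 31;        /* sign bits */
    uint32_t sb = ((uint32_t) ib) >> 31;
    int32_t ma = (ia & 0x7FFFFF) | 0x800000;    /* restore the hidden 1 */
    int32_t mb = (ib & 0x7FFFFF) | 0x800000;
    int32_t ea = (ia >> 23) & 0xFF;
    int32_t eb = (ib >> 23) & 0xFF;

    int64_t mrtmp = ((int64_t) ma * mb) >> 23;  /* drop the low 23 bits */
    int mshift = (int) ((mrtmp >> 24) & 1);     /* C01 = 24: normalized yet? */
    int64_t mr = mrtmp >> mshift;
    int32_t er = ea + eb - 127 + mshift;        /* C02 = 127: remove double bias */

    uint32_t r = ((sa ^ sb) << 31)              /* C03 = 31: sign */
               | (((uint32_t) er & 0xFF) << 23) /* C04 = 23: exponent */
               | ((uint32_t) mr & 0x7FFFFF);
    float out;
    memcpy(&out, &r, 4);
    return out;
}
```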
#### Implement `fmul32` with rv32i
```asm
fmul32:
# prologue
addi sp, sp, -36
sw s0, -32(sp) # for a
sw s1, -28(sp) # for b
sw s2, -24(sp) # for sa
sw s3, -20(sp) # for sb
sw s4, -16(sp) # for ma
sw s5, -12(sp) # for mb
sw s6, -8(sp) # for ea
sw s7, -4(sp) # for eb
sw ra, 0(sp)
mv s0, a0
mv s1, a1
srli s2, s0, 31 # uint32_t sa = ia >> 31
srli s3, s1, 31 # uint32_t sb = ib >> 31
li t0, 0x7FFFFF # & 0x7FFFFF
and s4, s0, t0
and s5, s1, t0
li t0, 0x800000 # | 0x800000
or s4, s4, t0 # ma
or s5, s5, t0 # mb
srli s6, s0, 23 # int32_t ea = ((ia >> 23) & 0xFF)
andi s6, s6, 0xFF
srli s7, s1, 23 # int32_t eb = ((ib >> 23) & 0xFF)
andi s7, s7, 0xFF
mv a0, s4
mv a1, s5
jal imul32
li t0, 0x7FFFFF # tX is not safe after func call!
and t0, a1, t0 # int64_t mrtmp = imul32(ma, mb) >> 23
slli t0, t0, 9
srli a0, a0, 23
srli a1, a1, 23
or a0, a0, t0
addi sp, sp, -8 # use stack to save the 64b mrtmp
sw a0, -4(sp)
sw a1, 0(sp)
li a2, 24
jal getbit # int mshift = getbit(mrtmp, C01)
lw t0, -4(sp) # lower mrtmp
lw t1, 0(sp) # upper mrtmp
beqz a0, fmul32_normalize
andi t2, t1, 1 # LSB of upper, mshift must be 1 or 0!
slli t2, t2, 31 # move to MSB
srli t0, t0, 1 # mrtmp >> mshift
srli t1, t1, 1
or t0, t0, t2
fmul32_normalize: # t1 t0 for mr
mv t2, a0 # t2 for mshift
add a0, s6, s7 # int32_t ertmp = ea + eb - C02
addi a0, a0, -127 # a0 for ertmp (or er if branch)
beqz t2, fmul32_no_inc_exp
sw t1, -4(sp)
sw t0, 0(sp)
jal inc # int32_t er = mshift ? inc(ertmp) : ertmp
lw t0, 0(sp)
lw t1, -4(sp)
fmul32_no_inc_exp: # a0 for er, t1 t0 for mr (upper, t1, not used)
xor t2, s2, s3 # int sr = sa ^ sb
slli t2, t2, 31 # (sr << C03)
andi a0, a0, 0xFF # | ((er & 0xFF) << C04)
slli a0, a0, 23
or t2, t2, a0
li t1, 0x7FFFFF # | (mr & 0x7FFFFF), lower mr only
and t0, t0, t1
or a0, t0, t2
# epilogue
addi sp, sp, 8
lw ra, 0(sp)
lw s7, -4(sp) # for eb
lw s6, -8(sp) # for ea
lw s5, -12(sp) # for mb
lw s4, -16(sp) # for ma
lw s3, -20(sp) # for sb
lw s2, -24(sp) # for sa
lw s1, -28(sp) # for b
lw s0, -32(sp) # for a
addi sp, sp, 36
ret
```
To test the implementation, use the following main function:
```asm
main:
li a0, 0x47F12064 # input a
li a1, 0x4940BD02 # input b
jal fmul32
li a2, 0x51B58A51 # expected
li a3, 0 # no upper
li a1, 0 # only 32b
jal assert_result
li a7, 93
ecall
```
## Assignment 1
Multiplication of a floating-point and an integer (in limited value range).
### Integer to Float
```c
uint32_t bits_before_frac(uint32_t x) // FIXME: 0 is undefined
{
int n = 1;
if ((x >> 16) == 0) { n += 16; x <<= 16; }
if ((x >> 24) == 0) { n += 8; x <<= 8; }
if ((x >> 28) == 0) { n += 4; x <<= 4; }
if ((x >> 30) == 0) { n += 2; x <<= 2; }
n = n - (x >> 31);
return n + 1; // the leading 1
}
```
To get the fraction part of the converted floating-point value (with the leading 1 excluded), we need a function similar to `clz`, except that it also counts the leading 1 itself.
```c
uint32_t itof(const int32_t i) // FIXME: only valid for range -2^24 ~ 2^24
{
if (i == 0)
return 0;
uint32_t neg = !!(i < 0) << 31;
uint32_t ans = neg ? -i : i;
uint32_t bbf = bits_before_frac(ans);
ans <<= bbf;
ans >>= 9;
ans = neg | ((32 - bbf + 127) << 23) | ans;
return ans;
}
```
Then, we can combine the three parts (sign, exponent, and fraction) to get the resulting floating-point value.
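Putting the two functions together, the conversion can be checked against concrete bit patterns (this is the same C code as above, made self-contained):

```c
#include <stdint.h>

uint32_t bits_before_frac(uint32_t x) /* x must be nonzero */
{
    int n = 1;
    if ((x >> 16) == 0) { n += 16; x <<= 16; }
    if ((x >> 24) == 0) { n += 8;  x <<= 8; }
    if ((x >> 28) == 0) { n += 4;  x <<= 4; }
    if ((x >> 30) == 0) { n += 2;  x <<= 2; }
    n = n - (x >> 31);     /* n is now the number of leading zeros... */
    return n + 1;          /* ...plus one for the leading 1 */
}

uint32_t itof(const int32_t i) /* valid for the limited range -2^24 ~ 2^24 */
{
    if (i == 0)
        return 0;
    uint32_t neg = (uint32_t) (i < 0) << 31;
    uint32_t ans = neg ? -(uint32_t) i : (uint32_t) i;
    uint32_t bbf = bits_before_frac(ans);
    ans <<= bbf;                         /* shift the leading 1 out... */
    ans >>= 9;                           /* ...and place the fraction in bits 22..0 */
    return neg | ((32 - bbf + 127) << 23) | ans;
}
```

For example, `itof(124)` yields `0x42F80000`, matching the value that appears in the multiplication test further below.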
#### Implement `bits_before_frac` in rv32i
```asm
bits_before_frac:
# no prologue
li t0, 1 # int n = 1;
srli t1, a0, 16 # if ((x >> 16) == 0) { n += 16; x <<= 16; }
bnez t1, upper16
addi t0, t0, 16
slli a0, a0, 16
upper16:
srli t1, a0, 24 # if ((x >> 24) == 0) { n += 8; x <<= 8; }
bnez t1, upper8
addi t0, t0, 8
slli a0, a0, 8
upper8:
srli t1, a0, 28 # if ((x >> 28) == 0) { n += 4; x <<= 4; }
bnez t1, upper4
addi t0, t0, 4
slli a0, a0, 4
upper4:
srli t1, a0, 30 # if ((x >> 30) == 0) { n += 2; x <<= 2; }
bnez t1, upper2
addi t0, t0, 2
slli a0, a0, 2
upper2:
srli a0, a0, 31 # n = n - (x >> 31);
neg a0, a0
add t0, t0, a0
addi a0, t0, 1 # return n + 1;
# no epilogue
ret
```
To verify this implementation, replace main with the following instructions:
```asm
main:
li a0, 0x003FFF00 # input x
li a1, 0
jal bits_before_frac
li a2, 11 # expected
mv a3, a1 # no upper
jal assert_result
li a7, 93
ecall
```
#### Implement `itof` in rv32i
```asm
itof:
addi sp, sp, -8 # prologue
sw s1, -8(sp) # s1 for ans
sw s0, -4(sp) # s0 for neg
sw ra, 0(sp)
beqz a0, itof_out # if (i == 0) return 0;
li t0, 0x80000000 # uint32_t neg = !!(i < 0) << 31;
and s0, a0, t0
beqz s0, itof_pos # uint32_t ans = neg ? -i : i;
neg a0, a0
itof_pos:
mv s1, a0
jal bits_before_frac # uint32_t bbf = bits_before_frac(ans);
sll s1, s1, a0 # ans <<= bbf;
srli s1, s1, 9 # ans >>= 9;
or s1, s1, s0 # ans = neg | ((32 - bbf + 127) << 23) | ans;
neg a0, a0
addi a0, a0, 159
slli a0, a0, 23
or a0, s1, a0
lw ra, 0(sp) # epilogue
lw s0, -4(sp)
lw s1, -8(sp)
itof_out:
addi sp, sp, 8
ret
```
To verify this implementation:
```asm
main:
li a0, 0x00F12064 # input i, limited range (+-16777216)
li a1, 0
jal itof
li a2, 0x4b712064 # expected
mv a3, a1 # only 32b
jal assert_result
li a7, 93
ecall
```
### Multiply
```c
float ifmul(uint32_t i, uint32_t f)
{
    // convert the integer to a float (itof returns the raw bit pattern)
    uint32_t fi = itof(*(int32_t *) &i);
    return fmul32(*(float *) &fi, *(float *) &f);
}
```
To multiply an integer and a floating-point value, we first convert the integer to a floating-point value, and then reuse `fmul32` to perform the floating-point multiplication.
#### Implement `ifmul` in rv32i
```asm
ifmul:
addi sp, sp, -4 # prologue
sw s0, -4(sp) # s0 for f
sw ra, 0(sp)
mv s0, a1
jal itof
mv a1, s0
jal fmul32
lw ra, 0(sp) # epilogue
lw s0, -4(sp) # s0 for f
addi sp, sp, 4
ret
```
And to verify this implementation:
```asm
main:
li a0, 124 # input i, limited in +-16777216
li a1, 0x42f6e979 # input f
jal ifmul
li a2, 0x466f322d # expected
mv a3, a1 # only 32b
jal assert_result
li a7, 93
ecall
```
FIXME: use array for multiple test data
### Multiplication Error
```c
typedef union
{
float f;
int32_t i;
uint32_t raw;
} value_t;
int main(int argc, char const *argv[])
{
value_t i, fi, f, o, a;
i.i = atoi(argv[1]);
fi.f = (float)atoi(argv[1]);
i.raw = itof(i.i);
f.f = atof(argv[2]);
if (i.raw != fi.raw)
printf("Expected itof=%f(0x%08x), but got 0x%08x...\n", i.f, i.raw, fi.raw);
o.f = fmul32(i.f, f.f);
a.f = f.f * i.f;
printf("0x%x x 0x%x = 0x%08x (ans: 0x%08x)\n", i.raw, f.raw, o.raw, a.raw);
printf("%s\n", o.raw == a.raw ? "correct!" : "wrong...");
return 0;
}
```
However, the result of `fmul32` and that of the native `*` operator may differ slightly, because `fmul32` truncates the mantissa product instead of rounding it to nearest:
```text
$ ./a.out 123 123.456
0x42f60000 x 0x42f6e979 = 0x466d445a (ans: 0x466d445a)
correct!
$ ./a.out 124 123.456
0x42f80000 x 0x42f6e979 = 0x466f322d (ans: 0x466f322d)
correct!
$ ./a.out 12345 123.456
0x4640e400 x 0x42f6e979 = 0x49ba0b02 (ans: 0x49ba0b03)
wrong...
```
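The gap is a rounding artifact: `fmul32` discards the low 23 bits of the 48-bit mantissa product, while hardware multiplication rounds to nearest, so results can differ by one ulp. A minimal illustration (round-half-up is used here for simplicity, and the helper names are made up):

```c
#include <stdint.h>

/* trunc23: keep only the top bits of a mantissa product, as fmul32 does.
 * round23: round on the discarded low 23 bits instead (half-up, for brevity). */
uint32_t trunc23(uint64_t prod) { return (uint32_t) (prod >> 23); }
uint32_t round23(uint64_t prod) { return (uint32_t) ((prod + (1u << 22)) >> 23); }
```

Whenever the discarded low bits amount to half an ulp or more, the two results differ by one, which is exactly the kind of `0x49ba0b02` vs `0x49ba0b03` mismatch shown above.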
ongoing...
### Analysis
#### Execution Info

## References
- [Lab1: RV32I Simulator](https://hackmd.io/@sysprog/H1TpVYMdB#Lab1-RV32I-Simulator)
- [IEEE-754 Floating Point Converter](https://www.h-schmidt.net/FloatConverter/IEEE754.html)