# Assignment1: RISC-V Assembly and Instruction Pipeline contributed by <[`shhung`](https://github.com/shhung/Implement-transformation-from-integer-to-float)> ## Implement transformation from integet to float This assignment implements the conversion from a 64-bit integer to a 64 bit floating-point data type. ## Implementation ### C code ```c= // naive implementation union DoubleConverter { unsigned long long intValue; double doubleValue; }; double ll_to_double(unsigned long long significand) { // Only support 0 < significand < 1 << 53. if (significand == 0 || significand >= 1ULL << 53) return -1.0; int shifts = 0; while ((significand & (1ULL << 52)) == 0) { significand <<= 1; shifts++; } unsigned long long exponent = 1023 + 52 - shifts; unsigned long long merged = (exponent << 52) | (significand & 0xFFFFFFFFFFFFF); // Use union for type conversion union DoubleConverter converter; converter.intValue = merged; return converter.doubleValue; } ``` Instead of accumulating shifts through iteration, using clz is more efficient ```diff=11 -int shifts = 0; -while ((significand & (1ULL << 52)) == 0) -{ - significand <<= 1; - shifts++; -} +int shifts = __builtin_clzll(significand) - 11; +significand <<= shifts; ``` ### Assembly code ```rv32 .data num: .dword 0xBBFFFFFFFF, 0x84f2, 0x811111111 mask: .word 0xFFFFF maskclz: .word 0x55555555, 0x33333333, 0x0f0f0f0f .text la t0, num lw a0, 12(t0) lw a1, 8(t0) jal itof jal exit # ====== subroutines ====== # cast int64 to double # input uint64[a0, a1] # output double[a0, a1] itof: addi sp, sp, -4 sw ra, 0(sp) bnez a0, inrange bnez a1, inrange li t0, 1 slli t0, t0, 21 blt a0, t0, inrange # overrange, set msb 1 li t0, 1 slli t0, t0, 31 or a0, a0, t0 ret inrange: addi sp, sp, -8 sw a0, 0(sp) sw a1, 4(sp) call clz addi a2, a0, -11 lw a0, 0(sp) lw a1, 4(sp) addi sp, sp 8 li a3, 32 bge a2, a3, ge32 # lt32 sub a3, a3, a2 sll a0, a0, a2 srl t0, a1, a3 sll a1, a1, a2 or a0, a0, t0 j merged ge32: sub a3, a2, a3 sll a1, a1, a3 mv a0, a1 li a1, 0 # exponent = 1023 + 52 - shifts merged: li a3, 1075 sub a3, a3, a2 slli a3, a3, 20 la t0, mask lw t0, 0(t0) and a0, a0, t0 or a0, a0, a3 lw ra, 0(sp) addi sp, sp, 4 ret clz: # input int64[a0, a1] # output int32[a0] # x |= (x >> {1, 2, 4, 8, 16}) addi sp, sp, -4 sw ra, 0(sp) addi t1, x0, 0x1 Loop1: addi t2, x0, 32 srl t0, a0, t1 or a0, a0, t0 srl t0, a1, t1 or a1, a1, t0 sub t2, t2, t1 sll t0, a0, t2 or a1, a1, t0 slli t1, t1, 1 addi t2, x0, 32 bgt t2, t1, Loop1 # x |= (x >> 32) or a1, a1, a0 # x -= ((x >> 1) & 0x5555555555555555); la t6, maskclz srli t0, a0, 1 lw t5, 0(t6) and t0, t0, t5 # t0 = (a0 >> 1) & 0x55555555 srli t1, a1, 1 slli t2, a0, 31 or t1, t1, t2 and t1, t1, t5 # t1 = (a1 >> 1) & 0x55555555 sub a0, a0, t0 sltu t3, a1, t1 bne t3, x0, Borrow sub a1, a1, t1 j Done Borrow: addi a0, a0, -1 sub a1, t1, a1 addi t3, x0, -1 sub a1, t3, a1 Done: # x = ((x >> 2) & 0x3333333333333333) + (x & 0x3333333333333333) lw t5, 4(t6) srli t0, a0, 2 srli t1, a1, 2 slli t2, a0, 30 or t1, t1, t2 # [t0, t1] = x >> 2 and t0, t0, t5 and t1, t1, t5 and a0, a0, t5 and a1, a1, t5 mv a2, t0 mv a3, t1 jal ra, Add64 # ((x >> 4) + x) & 0x0f0f0f0f0f0f0f0f srli t0, a0, 4 srli t1, a1, 4 slli t2, a0, 28 or t1, t1, t2 mv a2, t0 mv a3, t1 jal ra, Add64 lw t5, 8(t6) and a0, a0 ,t5 and a1, a1 ,t5 # x += (x >> 8) srli t0, a0, 8 srli t1, a1, 8 slli t2, a0, 24 or t1, t1, t2 mv a2, t0 mv a3, t1 jal ra, Add64 # x += (x >> 16) srli t0, a0, 16 srli t1, a1, 16 slli t2, a0, 16 or t1, t1, t2 mv a2, t0 mv a3, t1 jal ra, Add64 # x += (x >> 32) mv a2, x0 mv a3, a0 jal ra, Add64 # return (64 - (x & 0x7f)) andi a0, a1, 0x7f addi a1, x0, 64 sub a0, a1, a0 lw ra, 0(sp) addi sp, sp, 4 ret Add64: add a0, a0, a2 add a1, a1, a3 sltu s0, a1, a3 bne s0, x0, Carry ret Carry: addi a0, a0, 1 ret exit: # Exit program li a7, 10 ecall ``` ### Explanation #### Data section ```rv32 .data num: .dword 0x2187654321 mask: .word 0xFFFFF maskclz: .word 0x55555555, 0x33333333, 0x0f0f0f0f ``` 64 bits value is represented using `dword`, `num` represents test data. `mask` is used to filter the mantisa of integer. Although under the [IEEE 754](https://en.wikipedia.org/wiki/Double-precision_floating-point_format) standard, Double-precision floating-point has a 53-bit mantissa(52 explicitly stored), in the context of rv32, two registers are used to store the value. Therefore, processing only needs to be done on the upper half of the bits. `maskclz` is used to implement [popCount](https://www.chessprogramming.org/Population_Count#The_Constants) in `clz` #### Subroutines * `itof` * Input: uint64 [a0, a1] * Output: double[a0, a1] Convert the input integer to the IEEE 754 double format * `clz` * Input: uint64 [a0, a1] * Output: uint32 [a0] Count leading zero of the input value #### Program flow The program stores a 64-bit value in two registers, with the upper half in `a0` and the lower half in `a1`. `itof` first checks if it exceeds the range that double storage allows, and if it does, the conversion process begins. It starts by determining the shifts to calculate the exponent, and then aligns and retrieves the mantissa based on the shifts value. Finally, it combines these components to obtain the IEEE 754 double format value. #### 64 bits operations Since two registers are used to store the value, operations such as addition, subtraction, bit shifting, etc., need to take into account the crossing of bits between the two registers. * Addition For addition, it's relatively straightforward—just need to check whether there will be a carry from the lower half of the registers to the upper half. If the result after addition is less than either of the operands, it means there's a need for carrying over ```rv32 # [a0, a1] + [a2, a3] Add64: add a0, a0, a2 add a1, a1, a3 sltu s0, a1, a3 bne s0, x0, Carry ret Carry: addi a0, a0, 1 ret ``` * Subtraction For subtraction, in this assignment, the minuend is guaranteed to be greater than the subtrahend. Therefore, it's only necessary to check whether borrowing is needed from the upper half of the registers to the lower half. If borrowing is needed, then swap the minuend and subtrahend, and subtract the difference with the borrow. ```rv32 # suppose a - b sub a0, a0, t0 sltu t3, a1, t1 # if a < b, Borrow bne t3, x0, Borrow sub a1, a1, t1 j Done Borrow: # t = b - a # res = borrow - t addi a0, a0, -1 sub a1, t1, a1 # The borrow value is stored in t3, and since the immediate value has limitations, it is assigned through t3 = 0 - 1 addi t3, x0, -1 sub a1, t3, a1 Done: ``` * Shifting Shifting, whether left or right, will always encounter cases where the upper half crosses over to the lower half of the registers, or vice versa. In the case of the left shift used in this assignment, if the shift is less than 32, it means the bits will move to the lower bits of the upper half of the registers. Therefore, reverse shifting is performed to the correct position, and then the or operation is used to set the bits in the upper half of the registers. When the shift is greater than or equal to 32, it means the bits will move to the higher bits of the upper half of the registers. In this case, the shift direction aligns with the original operation, but it needs to subtract 32 to obtain the correct bits. These corrected bits are then directly assigned to the upper half of the registers, and the lower half is set to zero. ```rv32 li a3, 32 bge a2, a3, ge32 # lt32 sub a3, a3, a2 sll a0, a0, a2 srl t0, a1, a3 sll a1, a1, a2 or a0, a0, t0 j merged ge32: sub a3, a2, a3 sll a1, a1, a3 mv a0, a1 li a1, 0 ``` #### Result Since ripes doesn't provide output for 64-bit integers and 64-bit floating-point numbers, it's necessary to inspect the values in registers to verify the results. The results are stored in two registers, with `a0` holding the upper half and `a1` holding the lower half. Here are two tools for quick format conversion. The [Hexadecimal to Decimal converter](https://www.rapidtables.com/convert/number/hex-to-decimal.html) is used to obtain the decimal representation, while [FractionConvert](https://www.toolhelper.cn/Digit/FractionConvert) provides the IEEE 754 standard representation. To have a clear comparison of results, I've stored the input values in s0 and s1 registers. ```rv32 # test case 0xBBFFFFFFFF = 0x42677FFFFFFFE000 (IEEE 754) 0x84f2 = 0x40E09E4000000000 (IEEE 754) 0x11 = 4031000000000000 (IEEE 754) ``` * num1 ![test1](https://hackmd.io/_uploads/S17EimzWT.png) * num2 ![test2](https://hackmd.io/_uploads/SJ0RiXzWp.png) * num3 ![test3](https://hackmd.io/_uploads/SJFPnmfWp.png) #### Analysis Comparing the execution results of different test cases, it can be observed that for larger values, the performance of conversion based on clz is inferior to the naive approach. However, for smaller values, it outperforms the naive conversion, and the performance difference becomes more pronounced as the values decrease. The main difference lies in the fact that the clz method can obtain the shifts value in a fixed number of cycles, while the naive approach involves iterative calculations, leading to an increase in iterations as the values decrease. * 0xBBFFFFFFFF | | itof_clz | itof | |:-------------- |:--------:|:-----:| | cycles | 234 | 156 | | Insrs: retired | 182 | 111 | | CPI | 1.29 | 1.41 | | IPC | 0.774 | 0.712 | * 0x84f2 | | itof_clz | itof | |:-------------- |:--------:|:----:| | cycles | 234 | 342 | | Insrs: retired | 181 | 253 | | CPI | 1.29 | 1.35 | | IPC | 0.774 | 0.74 | * 0x11 | | itof_clz | itof | |:-------------- |:--------:|:-----:| | cycles | 234 | 430 | | Insrs: retired | 181 | 319 | | CPI | 1.29 | 1.35 | | IPC | 0.774 | 0.742 | ## Summarize Thanks to `population count`, we were able to achieve constant time complexity for CLZ. In this assignment, we leveraged this optimization to improve the conversion from integers to floating-point numbers.