Assignment1: RISC-V Assembly and Instruction Pipeline

contributed by <david070889>

I chose Problem B from Quiz 1.

Problem B

According to the assignment requirements, the first task is to translate the C code into RISC-V assembly.

fp32_to_bf16

C

bf16_t fp32_to_bf16(float s)
{
    bf16_t h;
    union {
        float f;
        uint32_t i;
    } u = {.f = s};
    if ((u.i & 0x7fffffff) > 0x7f800000) { /* NaN */
        h.bits = (u.i >> 16) | 64;         /* force to quiet */
        return h;
    }
    h.bits = (u.i + (0x7fff + ((u.i >> 0x10) & 1))) >> 0x10;
    return h;
}

Assembly

.data
    input_float_val:    .word 0x3f9e0419   # FP32 bit pattern (hexadecimal) of approximately 1.2345
    #input_float_val:    .word 0x7FC00000    #NaN in FP32
    output_float_val:    .word 0x00

.text
.global main
 
main: 
    la    t0, input_float_val
    la    s10, output_float_val
    lw    t1, 0(t0) 

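# 1. Check whether the input is NaN
    li    s1, 0x7fffffff      # mask that clears the sign bit
    and    s2, s1, t1        # u.i & 0x7fffffff
    li    s3, 0x7f800000    # exponent all ones, mantissa zero (infinity)
    bgtu    s2, s3, NaN    # if (u.i & 0x7fffffff) > 0x7f800000, branch to the NaN handler
# 2. Round to nearest even and keep the upper 16 bits
    srli    s1, t1, 16      # u.i >> 16
    andi    s1, s1, 1       # lowest bit that survives the truncation
    li    s2, 0x7fff
    add    s1, s1, s2       # rounding bias: 0x7fff + that bit
    add    s1, s1, t1       # u.i + bias
    srli    s1, s1, 16      # h.bits
    sw    s1, 0(s10)        # store the result in output_float_val
    ret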
    
NaN:
    srli    s4, t1, 16      # u.i >> 16
    ori    s4, s4, 64      # force to quiet NaN
    sw    s4, 0(s10)      # store the result in output_float_val, not over the input
    ret
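
To check the assembly against known values, it can help to have a bit-level model of the same steps in C. The following is a minimal sketch, not part of the original assignment code; the name `bf16_bits_from_fp32_bits` is hypothetical, and it takes and returns raw bit patterns so that it matches what the assembly loads and stores.

```c
#include <stdint.h>

/* Bit-level model of the assembly above (hypothetical helper name):
 * FP32 bit pattern in, BF16 bit pattern out. */
uint16_t bf16_bits_from_fp32_bits(uint32_t u)
{
    if ((u & 0x7fffffffu) > 0x7f800000u)        /* NaN */
        return (uint16_t)((u >> 16) | 64);      /* force to quiet */

    /* Round to nearest even: add 0x7fff plus the lowest kept bit,
     * then drop the lower 16 bits. */
    uint32_t bias = 0x7fffu + ((u >> 16) & 1u);
    return (uint16_t)((u + bias) >> 16);
}
```

With the two inputs from the `.data` section, this model gives 0x3f9e for 0x3f9e0419 and 0x7fc0 for 0x7FC00000, which is what `output_float_val` should hold after the program runs. Exact ties show the round-to-even behavior: 0x3f9e8000 stays at 0x3f9e, while 0x3f9f8000 rounds up to 0x3fa0.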

Calculating the n-th Root of x

The original C source code probably uses exponential and logarithm functions to compute the root of a number.
I couldn't figure out a way to implement these two functions using only RISC-V RV32I instructions.
Instead, I decided to use the Newton-Raphson method to compute the n-th root of x.

Newton's method gives the following update rule for the n-th root of x:

$$
y_{k+1} = \frac{1}{n}\left[(n-1)\,y_k + \frac{x}{y_k^{\,n-1}}\right]
$$

Here n is the degree of the root, x is the number whose root is taken, y_k is the estimate produced by Newton's method, and k is the iteration index.
For this assignment, x is a positive number and n is a positive integer.
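
For reference, the update rule follows from applying Newton's method to the function whose positive root is the n-th root of x. The derivation below uses the same symbols (x, n, y, k) as the text above:

$$
\begin{aligned}
f(y) &= y^n - x, \qquad f'(y) = n\,y^{n-1} \\
y_{k+1} &= y_k - \frac{f(y_k)}{f'(y_k)} = y_k - \frac{y_k^n - x}{n\,y_k^{n-1}} \\
&= \frac{1}{n}\left[(n-1)\,y_k + \frac{x}{y_k^{\,n-1}}\right]
\end{aligned}
$$

As a quick check with x = 8, n = 3 and the initial guess y_0 = x / n ≈ 2.6667, one step gives y_1 = (2 · 2.6667 + 8 / 7.1111) / 3 ≈ 2.1528, which is already closer to the true root 2.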

nth_root.c

#include <math.h>   // fabs, pow
#include <stdio.h>  // printf

float nthRoot(float x, int n) {
    float guess = x / n; // initial guess
    float epsilon = 0.00001; // precision

    while (fabs(pow(guess, n) - x) > epsilon) {
        // Newton-Raphson update
        float new_guess = ((n - 1) * guess + x / pow(guess, n - 1)) / n;
        if (new_guess == guess){ // no further progress at float precision
            printf("%lf\n",new_guess);
            return guess;
        }
        guess = new_guess;
        printf("%lf\n",guess);
    }
    return guess;
}

This function relies on C library functions (pow, fabs) and on hardware floating-point arithmetic. RV32I provides neither, so I need to replace them with routines that manipulate the FP32 bit pattern directly.

nth_root_translate.c

float nthRoot(float x, int n) {
    float f_n = (float)n;
    float guess = fdiv32(x, f_n); // initial guess
    float epsilon = 0.0001; // precision
    float n_One = -1;

    while (fcomparison(fabsf(fadd32(my_power(guess, f_n), -x)), epsilon)) {
        float new_guess = fdiv32(fadd32(fmul32(fadd32(f_n, n_One), guess), fdiv32(x, my_power(guess, fadd32(f_n, n_One)))), f_n);
        if (new_guess == guess){
            // printf("%f\n",new_guess);
            return guess;
        }
        guess = new_guess;
        // printf("%f\n",guess);
    }
    return guess;
}

In this version, nearly every arithmetic operation has been replaced by a helper function that operates on the FP32 bit pattern.
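
The helper bodies are not listed here. To illustrate what operating on the bit pattern looks like, here is a rough sketch of an FP32 multiply built from integer operations. It is not the author's `fmul32`: the name `fmul32_sketch` and the helpers `fp32_bits` / `fp32_from_bits` are hypothetical, inputs are assumed to be finite (no infinity or NaN), subnormals are flushed to zero, and the result is truncated rather than rounded. It also uses C's 64-bit `*` for the mantissa product; on RV32I without the M extension that product would have to be built from shifts and adds.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its bit pattern and back (hypothetical helpers). */
static uint32_t fp32_bits(float f) { uint32_t u; memcpy(&u, &f, sizeof u); return u; }
static float fp32_from_bits(uint32_t u) { float f; memcpy(&f, &u, sizeof f); return f; }

/* Simplified FP32 multiply: finite inputs only, truncating, flush-to-zero. */
float fmul32_sketch(float a, float b)
{
    uint32_t ia = fp32_bits(a), ib = fp32_bits(b);
    uint32_t sign = (ia ^ ib) & 0x80000000u;      /* sign of the product */
    int32_t ea = (int32_t)((ia >> 23) & 0xff);    /* biased exponents */
    int32_t eb = (int32_t)((ib >> 23) & 0xff);

    if (ea == 0 || eb == 0)                       /* zero or subnormal operand */
        return fp32_from_bits(sign);

    /* Restore the implicit leading 1 to get 24-bit significands. */
    uint64_t ma = (ia & 0x7fffffu) | 0x800000u;
    uint64_t mb = (ib & 0x7fffffu) | 0x800000u;
    uint64_t prod = ma * mb;                      /* value in [2^46, 2^48) */
    int32_t e = ea + eb - 127;                    /* remove one bias */

    if (prod & (1ull << 47)) {                    /* significand product >= 2: renormalize */
        prod >>= 1;
        e += 1;
    }
    uint32_t m = (uint32_t)(prod >> 23) & 0x7fffffu;  /* drop low bits (truncate) */

    if (e >= 255) return fp32_from_bits(sign | 0x7f800000u);  /* overflow to infinity */
    if (e <= 0)   return fp32_from_bits(sign);                /* underflow to zero */
    return fp32_from_bits(sign | ((uint32_t)e << 23) | m);
}
```

For exactly representable products such as 1.5 × 2.0 or 1.5 × 1.5 the sketch agrees with the hardware result; for inexact products it may differ in the last bit because it truncates.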

Finally, I translated the entire nth_root.c into nth_root_FP32.s.

Improvement from FP32 to BF16

This version combines nth_root_FP32 and FP32_to_BF16.

[Image: FP32 vs. BF16 comparison]
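
To get a feel for how much precision BF16 gives up relative to FP32, one option is to round a value through BF16 and widen it back. The sketch below is only an illustration under that assumption, not the combined assembly itself; all function names in it are hypothetical. The narrowing step uses the same rounding as fp32_to_bf16, and widening simply places the 16 BF16 bits in the upper half of an FP32 word.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its bit pattern and back (hypothetical helpers). */
static uint32_t fp32_bits(float f) { uint32_t u; memcpy(&u, &f, sizeof u); return u; }
static float fp32_from_bits(uint32_t u) { float f; memcpy(&f, &u, sizeof f); return f; }

/* Narrow FP32 to BF16 bits with the same rounding as fp32_to_bf16. */
static uint16_t narrow_to_bf16(float f)
{
    uint32_t u = fp32_bits(f);
    if ((u & 0x7fffffffu) > 0x7f800000u)                 /* NaN */
        return (uint16_t)((u >> 16) | 64);               /* force to quiet */
    return (uint16_t)((u + (0x7fffu + ((u >> 16) & 1u))) >> 16);
}

/* Widen BF16 bits to FP32: the low 16 mantissa bits become zero. */
static float widen_from_bf16(uint16_t h)
{
    return fp32_from_bits((uint32_t)h << 16);
}

/* Round an FP32 value to BF16 precision while keeping it in a float. */
float round_through_bf16(float f)
{
    return widen_from_bf16(narrow_to_bf16(f));
}
```

For the sample input used earlier, 1.2345 (0x3f9e0419) becomes 1.234375 (0x3f9e0000) after the round trip, since BF16 keeps only 7 explicit mantissa bits.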

Reference

Sample4