
Assignment1: RISC-V Assembly and Instruction Pipeline

contributed by < burhou >

Check Add or edit the title to ensure the consistency of given title.

In addition, computer architecture is not computer structure.

Problem C in Quiz1

Function 1. fabsf:

C code

static inline float fabsf(float x) {
    uint32_t i = *(uint32_t *)&x;  // Read the bits of the float into an integer
    i &= 0x7FFFFFFF;               // Clear the sign bit to get the absolute value
    x = *(float *)&i;              // Write the modified bits back into the float
    return x;
}
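
The pointer casts above read a float object through a uint32_t pointer, which the C aliasing rules do not guarantee to work. Below is a minimal sketch of the same sign-bit trick using memcpy, assuming a 32-bit IEEE 754 float. The name fabsf_bits is hypothetical, chosen so it does not clash with fabsf from <math.h>.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical name; same sign-bit masking as the fabsf above. */
static inline float fabsf_bits(float x) {
    uint32_t i;
    memcpy(&i, &x, sizeof i);   /* copy the float's bits into an integer */
    i &= 0x7FFFFFFFu;           /* clear bit 31, the sign bit */
    memcpy(&x, &i, sizeof x);   /* copy the modified bits back */
    return x;
}
```

Only bit 31 changes, so the exponent and mantissa are untouched: fabsf_bits(-3.5f) gives 3.5f, and a positive input comes back unchanged.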

Don't code snip without comprehensive discussions.

Convert to RISC-V

RISC-V

fabsf:
    # Unlike the C version, which takes x by value, this version expects
    # a0 to hold the address of the float x and updates it in place.
    
    lw t0, 0(a0)          # Load the float as an integer (read the bits)
    li t1, 0x7FFFFFFF     # andi only accepts a 12-bit immediate, so load the mask first
    and t0, t0, t1        # Clear the sign bit to get the absolute value
    sw t0, 0(a0)          # Store the modified bits back to memory
    # a0 still holds the address of x; the result is read from memory
    ret    

Function 2. my_clz:

C code

static inline int my_clz(uint32_t x) {
    int count = 0;
    for (int i = 31; i >= 0; --i) {
        if (x & (1U << i))
            break;
        count++;
    }
    return count;
}
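
As a cross-check on the loop above, here is a sketch that compares it with a binary-search variant. The clz_binary function is a hypothetical alternative, not part of the quiz code. It halves the search range at each step, so it needs five tests instead of up to 32 loop iterations.

```c
#include <stdint.h>

/* Loop version from the text, repeated so this block is self-contained. */
static inline int my_clz(uint32_t x) {
    int count = 0;
    for (int i = 31; i >= 0; --i) {
        if (x & (1U << i))
            break;
        count++;
    }
    return count;
}

/* Hypothetical alternative: test the upper half, then quarter, and so on. */
static inline int clz_binary(uint32_t x) {
    if (x == 0) return 32;
    int count = 0;
    if ((x & 0xFFFF0000u) == 0) { count += 16; x <<= 16; }
    if ((x & 0xFF000000u) == 0) { count += 8;  x <<= 8;  }
    if ((x & 0xF0000000u) == 0) { count += 4;  x <<= 4;  }
    if ((x & 0xC0000000u) == 0) { count += 2;  x <<= 2;  }
    if ((x & 0x80000000u) == 0) { count += 1; }
    return count;
}
```

Both functions return 32 for an input of 0, 31 for 1, and 0 for 0x80000000.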

Convert to RISC-V

RISC-V

my_clz:
    li      t0, 0            # Initialize the counter t0 = 0
    li      t1, 31           # Initialize the bit index t1 = 31

clz_loop:
    srl     t2, a0, t1       # Shift bit t1 of a0 down to bit 0
    andi    t2, t2, 1        # Isolate that bit
    bnez    t2, clz_end      # If the bit is 1, exit the loop
    addi    t0, t0, 1        # Increment the counter by 1
    addi    t1, t1, -1       # Decrement the index by 1
    bgez    t1, clz_loop     # If the index is >= 0, continue the loop

clz_end:
    mv      a0, t0           # Move the result to a0 (return value)
    ret                      # Return from the function

Function 3. fp16_to_fp32:

C code

static inline uint32_t fp16_to_fp32(uint16_t h) {
    /*
     * Extends the 16-bit half-precision floating-point number to 32 bits
     * by shifting it to the upper half of a 32-bit word:
     *      +---+-----+------------+-------------------+
     *      | S |EEEEE|MM MMMM MMMM|0000 0000 0000 0000|
     *      +---+-----+------------+-------------------+
     * Bits  31  26-30    16-25            0-15
     *
     * S - sign bit, E - exponent bits, M - mantissa bits, 0 - zero bits.
     */
    const uint32_t w = (uint32_t) h << 16;
    
    /*
     * Isolates the sign bit from the input number, placing it in the most
     * significant bit of a 32-bit word:
     *
     *      +---+----------------------------------+
     *      | S |0000000 00000000 00000000 00000000|
     *      +---+----------------------------------+
     * Bits  31                 0-30
     */
    const uint32_t sign = w & UINT32_C(0x80000000);
    
    /*
     * Extracts the mantissa and exponent from the input number, placing
     * them in bits 0-30 of the 32-bit word:
     *
     *      +---+-----+------------+-------------------+
     *      | 0 |EEEEE|MM MMMM MMMM|0000 0000 0000 0000|
     *      +---+-----+------------+-------------------+
     * Bits  31  26-30     16-25            0-15
     */
    const uint32_t nonsign = w & UINT32_C(0x7FFFFFFF);
    
    /*
     * The renorm_shift variable indicates how many bits the mantissa
     * needs to be shifted to normalize the half-precision number. 
     * For normalized numbers, renorm_shift will be 0. For denormalized
     * numbers, renorm_shift will be greater than 0. Shifting a 
     * denormalized number will move the mantissa into the exponent,
     * normalizing it.
     */
    uint32_t renorm_shift = my_clz(nonsign);
    renorm_shift = renorm_shift > 5 ? renorm_shift - 5 : 0;
    
    /*
     * If the half-precision number has an all-ones exponent field (31),
     * adding 0x04000000 carries into bit 31, and the arithmetic right
     * shift then turns the upper 9 bits into ones. Thus:
     *   inf_nan_mask ==
     *                   0x7F800000 if the half-precision number is 
     *                   NaN or infinity (exponent field of 31)
     *                   0x00000000 otherwise
     */
    const int32_t inf_nan_mask = ((int32_t)(nonsign + 0x04000000) >> 8) &
                                 INT32_C(0x7F800000);
    
    /*
     * If nonsign equals 0, subtracting 1 will cause overflow, setting
     * bit 31 to 1. Otherwise, bit 31 will be 0. Shifting this result
     * propagates bit 31 across all bits in zero_mask. Thus:
     *   zero_mask ==
     *                0xFFFFFFFF if the half-precision number is 
     *                zero (+0.0h or -0.0h)
     *                0x00000000 otherwise
     */
    const int32_t zero_mask = (int32_t)(nonsign - 1) >> 31;
    
    /*
     * 1. Shifts nonsign left by renorm_shift to normalize it (for denormal
     *    inputs).
     * 2. Shifts nonsign right by 3, adjusting the exponent to fit in the
     *    8-bit exponent field and moving the mantissa into the correct
     *    position within the 23-bit mantissa field of the single-precision
     *    format.
     * 3. Adds 0x70 to the exponent to account for the difference in bias
     *    between half-precision and single-precision.
     * 4. Subtracts renorm_shift from the exponent to account for any
     *    renormalization that occurred.
     * 5. ORs with inf_nan_mask to set the exponent to 0xFF if the input
     *    was NaN or infinity.
     * 6. ANDs with the inverted zero_mask to set the mantissa and exponent
     *    to zero if the input was zero.
     * 7. Combines everything with the sign bit of the input number.
     */
    return sign | ((((nonsign << renorm_shift >> 3) +
            ((0x70 - renorm_shift) << 23)) | inf_nan_mask) & ~zero_mask);
}
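
To see the masks at work, the sketch below repeats the conversion without comments and adds a hypothetical helper, fp32_bits_to_float, that reinterprets the result as a float. It assumes a 32-bit IEEE 754 float and an arithmetic right shift for signed integers, as the original code also does.

```c
#include <stdint.h>
#include <string.h>

static inline int my_clz(uint32_t x) {
    int count = 0;
    for (int i = 31; i >= 0; --i) {
        if (x & (1U << i))
            break;
        count++;
    }
    return count;
}

/* Same arithmetic as fp16_to_fp32 above, comments removed for brevity. */
static inline uint32_t fp16_to_fp32(uint16_t h) {
    const uint32_t w = (uint32_t) h << 16;
    const uint32_t sign = w & UINT32_C(0x80000000);
    const uint32_t nonsign = w & UINT32_C(0x7FFFFFFF);
    uint32_t renorm_shift = my_clz(nonsign);
    renorm_shift = renorm_shift > 5 ? renorm_shift - 5 : 0;
    const int32_t inf_nan_mask = ((int32_t)(nonsign + 0x04000000) >> 8) &
                                 INT32_C(0x7F800000);
    const int32_t zero_mask = (int32_t)(nonsign - 1) >> 31;
    return sign | ((((nonsign << renorm_shift >> 3) +
            ((0x70 - renorm_shift) << 23)) | inf_nan_mask) & ~zero_mask);
}

/* Hypothetical helper: view the 32-bit pattern as a float. */
static inline float fp32_bits_to_float(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Tracing a few inputs by hand: 0x3C00 (1.0 in half precision) has renorm_shift 0 and both masks 0, giving 0x07800000 + 0x38000000 = 0x3F800000. 0x7C00 (infinity) sets inf_nan_mask and yields 0x7F800000. The smallest denormal, 0x0001, has renorm_shift 10 and yields 0x33800000, which is 2^-24.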

Convert to RISC-V

RISC-V

.section .text
.global fp16_to_fp32

fp16_to_fp32:
    # Arguments:
    # a0: uint16_t h (input half-precision float)
    # Returns:
    # a0: uint32_t (single-precision bit pattern)

    # Extend h to 32 bits (w = (uint32_t) h << 16)
    slli    a1, a0, 16      # a1 = w = h << 16

    # Extract sign bit (sign = w & 0x80000000)
    # andi only accepts a 12-bit immediate, so load the mask first
    li      t0, 0x80000000
    and     a2, a1, t0      # a2 = sign

    # Extract nonsign bits (nonsign = w & 0x7FFFFFFF)
    li      t0, 0x7FFFFFFF
    and     a3, a1, t0      # a3 = nonsign

    # Calculate renorm_shift using clz (my_clz equivalent)
    # Note: clz belongs to the Zbb extension; on a plain RV32I target,
    # use the my_clz loop instead.
    clz     t1, a3          # t1 = clz(nonsign)
    addi    t1, t1, -5      # renorm_shift = clz - 5
    bgez    t1, renorm_shift_done
    mv      t1, zero        # If renorm_shift < 0, set it to 0
renorm_shift_done:

    # inf_nan_mask calculation
    li      t3, 0x04000000
    add     t4, a3, t3      # t4 = nonsign + 0x04000000
    srai    t4, t4, 8       # Arithmetic shift propagates bit 31
    li      t5, 0x7F800000
    and     t4, t4, t5      # t4 = inf_nan_mask

    # zero_mask calculation
    addi    t6, a3, -1      # t6 = nonsign - 1
    srai    t6, t6, 31      # t6 = zero_mask

    # Normalize if necessary and adjust exponent/mantissa
    # (only t0-t6 exist as temporaries, so a4 and a5 are used here)
    sll     a4, a3, t1      # Shift left by renorm_shift
    srli    a4, a4, 3       # Logical shift: nonsign is unsigned
    li      a5, 0x70
    sub     a5, a5, t1      # 0x70 - renorm_shift
    slli    a5, a5, 23      # Shift to exponent position
    add     a4, a4, a5      # Combine adjusted exponent and mantissa

    # Apply inf_nan_mask and zero_mask
    or      a4, a4, t4      # OR with inf_nan_mask
    not     t6, t6          # Invert zero_mask
    and     a4, a4, t6      # AND with inverted zero_mask

    # Combine with sign bit
    or      a0, a4, a2      # Final result with sign bit

    ret

Don't paste the code without discussions!

Use case

LeetCode 93. Restore IP Addresses

A valid IP address consists of exactly four integers separated by single dots. Each integer is between 0 and 255 (inclusive) and cannot have leading zeros.

For example, "0.1.2.201" and "192.168.1.1" are valid IP addresses, but "0.011.255.245", "192.168.1.312" and "192.168@1.1" are invalid IP addresses.

Given a string s containing only digits, return all possible valid IP addresses that can be formed by inserting dots into s. You are not allowed to reorder or remove any digits in s. You may return the valid IP addresses in any order.

Example 1:

Input: s = "25525511135"
Output: ["255.255.11.135","255.255.111.35"]

Example 2:

Input: s = "0000"
Output: ["0.0.0.0"]

Example 3:

Input: s = "101023"
Output: ["1.0.10.23","1.0.102.3","10.1.0.23","10.10.2.3","101.0.2.3"]

Constraints

1 <= s.length <= 20
s consists of digits only.

Solution

  1. Understand the Constraints of a Valid IP Address:

An IP address has four parts separated by dots. Each part must be between 0 and 255 (inclusive). No part can have leading zeros (e.g., "01" is invalid but "0" is valid). All characters in the string must be used without reordering or skipping.

  1. Plan the Approach:

Use backtracking to explore all possible ways of inserting 3 dots to divide the string into 4 parts. Validate each part: it must be between 0 and 255, and it must not have a leading zero unless it is exactly "0".

  1. Constraints to Optimize the Code:

If the length of s is less than 4 or more than 12, return an empty result (since the smallest valid IP is "0.0.0.0" with 4 digits, and the largest is "255.255.255.255" with 12 digits).

  1. Backtracking Approach:

We will place three dots in the string to create 4 segments. For each segment, check whether it is a valid integer: if it is, proceed to the next segment; if not, backtrack.

  1. Implementing the Solution:

C code (version 1)

Always write in English!

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>

#define MAX_IPS 100

// Check whether s[start..end] is a valid IP segment
bool isValidSegment(const char *s, int start, int end) {
    if (start > end) return false;

    // Reject leading zeros
    if (s[start] == '0' && start != end) return false;

    // Convert to a number and check the range 0-255
    int num = 0;
    for (int i = start; i <= end; i++) {
        if (s[i] < '0' || s[i] > '9') return false;
        num = num * 10 + (s[i] - '0');
    }
    return num >= 0 && num <= 255;
}

// Append a valid IP address to the result array
void addIP(char **result, int *returnSize, const char *s, int p1, int p2, int p3) {
    char ip[16]; // An IP is at most 15 characters (including dots), plus the null terminator
    snprintf(ip, sizeof(ip), "%.*s.%.*s.%.*s.%s",
             p1, s, p2 - p1, s + p1, p3 - p2, s + p2, s + p3);
    result[(*returnSize)++] = strdup(ip);
}

// Use backtracking to find all possible IP addresses
void backtrack(char **result, int *returnSize, const char *s, int len, int start, int dots, int positions[]) {
    if (dots == 3) { // Three dots have been placed
        if (isValidSegment(s, start, len - 1)) { // The last segment must be valid
            addIP(result, returnSize, s, positions[0], positions[1], positions[2]);
        }
        return;
    }

    // Try each possible split point (a segment has at most 3 characters)
    for (int i = start; i < len && i < start + 3; ++i) {
        if (isValidSegment(s, start, i)) { // This segment is valid
            positions[dots] = i + 1; // Record the dot position
            backtrack(result, returnSize, s, len, i + 1, dots + 1, positions); // Continue splitting
        }
    }
}

// Entry point: take a digit string and return all valid IP addresses
char **restoreIpAddresses(char *s, int *returnSize) {
    char **result = malloc(sizeof(char *) * MAX_IPS); // Result storage
    *returnSize = 0;
    int len = strlen(s);
    if (len < 4 || len > 12) return result; // Invalid length: return an empty result

    int positions[3]; // Position of each dot
    backtrack(result, returnSize, s, len, 0, 0, positions); // Start backtracking
    return result;
}

// Generate a random string of digits
void generateRandomString(char *s, int len) {
    for (int i = 0; i < len; i++) {
        s[i] = '0' + rand() % 10; // Random digit from '0' to '9'
    }
    s[len] = '\0'; // Null terminator
}

// Test driver
int main() {
    srand(7); // Seed the random number generator
    int returnSize;

    // Generate and test three random inputs
    for (int i = 0; i < 3; i++) {
        int len = 4 + rand() % 9; // Random length from 4 to 12
        char s[len + 1]; // Room for the '\0' terminator
        generateRandomString(s, len);

        printf("Test %d - Random Input: %s\n", i + 1, s);
        char **ips = restoreIpAddresses(s, &returnSize);

        if (returnSize == 0) {
            printf("No valid IP addresses found.\n");
        } else {
            printf("Valid IP addresses:\n");
            for (int j = 0; j < returnSize; ++j) {
                printf("%s\n", ips[j]);
                free(ips[j]); // Free each string
            }
        }
        free(ips); // Free the result array
        printf("\n");
    }

    return 0;
}

I encountered an issue when converting C code to RISC-V regarding function calls. I'm not sure how to call other functions.
Therefore, I tried modifying my C code by adding static inline before each function.
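Since I did not know how to convert library functions such as srand, rand, and malloc into RISC-V instructions, I replaced them: fixed test strings instead of random input, a statically allocated result array instead of malloc, and a hand-written length loop instead of strlen.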

C code (version 2)

Always write in English!

Instead of pasting the whole source code, you can use diff -up to compare the files and then put the notable difference here.

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

#define MAX_IPS 100
#define MAX_IP_LENGTH 16 // IP address length: 15 characters + 1 null terminator

// Compute the string length (replaces strlen)
static inline int stringLength(const char *s) {
    int length = 0;
    while (s[length] != '\0') {
        length++;
    }
    return length;
}

// Convert an integer to a string and return the resulting length
static inline int intToString(int num, char *buffer) {
    int length = 0;
    if (num == 0) {
        buffer[length++] = '0';
    } else {
        char temp[4]; // An IP segment has at most three digits, plus one '\0'
        int i = 0;
        while (num > 0) {
            temp[i++] = (num % 10) + '0';
            num /= 10;
        }
        // Reverse temp into buffer
        while (i > 0) {
            buffer[length++] = temp[--i];
        }
    }
    buffer[length] = '\0'; // Null terminator
    return length;
}

// Append a valid IP address to the result array
static inline void addIP(char result[MAX_IPS][MAX_IP_LENGTH], int *returnSize, const char *s, int p1, int p2, int p3) {
    char *ip = result[*returnSize]; // Pointer to the current storage slot
    int index = 0;

    // Assemble the four segments into one string
    for (int i = 0; i < p1; ++i) ip[index++] = s[i];
    ip[index++] = '.';

    for (int i = p1; i < p2; ++i) ip[index++] = s[i];
    ip[index++] = '.';

    for (int i = p2; i < p3; ++i) ip[index++] = s[i];
    ip[index++] = '.';

    for (int i = p3; s[i] != '\0'; ++i) ip[index++] = s[i];

    ip[index] = '\0'; // Null terminator
    (*returnSize)++;
}

// Check whether s[start..end] is a valid IP segment
static inline bool isValidSegment(const char *s, int start, int end) {
    if (start > end) return false;

    // Reject leading zeros (invalid when the segment is longer than 1 and starts with '0')
    if (s[start] == '0' && start != end) return false;

    int num = 0;
    for (int i = start; i <= end; i++) {
        if (s[i] < '0' || s[i] > '9') return false;
        num = num * 10 + (s[i] - '0');
    }
    return num <= 255;
}

// Use backtracking to find all possible IP addresses
static inline void backtrack(char result[MAX_IPS][MAX_IP_LENGTH], int *returnSize, const char *s, int len, int start, int dots, int positions[]) {
    if (dots == 3) {
        if (isValidSegment(s, start, len - 1)) {
            addIP(result, returnSize, s, positions[0], positions[1], positions[2]);
        }
        return;
    }

    for (int i = start; i < len && i < start + 3; ++i) {
        if (isValidSegment(s, start, i)) {
            positions[dots] = i + 1;
            backtrack(result, returnSize, s, len, i + 1, dots + 1, positions);
        }
    }
}

// Entry point: take a digit string and store all valid IP addresses in result
static inline void restoreIpAddresses(char *s, int *returnSize, char result[MAX_IPS][MAX_IP_LENGTH]) {
    *returnSize = 0;
    int len = stringLength(s); // Use the custom stringLength function
    if (len < 4 || len > 12) return; // Invalid length: return immediately

    int positions[3];
    backtrack(result, returnSize, s, len, 0, 0, positions);
}

// Test driver
int main() {
    int returnSize;

    // Use the fixed test data
    char *testCases[] = {
        "25016247489",
        "12539220",
        "524816033"
    };

    char result[MAX_IPS][MAX_IP_LENGTH]; // Preallocated array for the IP strings

    for (int i = 0; i < 3; i++) {
        char *s = testCases[i];
        printf("Test %d - Random Input: %s\n", i + 1, s);

        restoreIpAddresses(s, &returnSize, result);

        if (returnSize == 0) {
            printf("No valid IP addresses found.\n");
        } else {
            printf("Valid IP addresses:\n");
            for (int j = 0; j < returnSize; ++j) {
                printf("%s\n", result[j]);
            }
        }
        printf("\n");
    }

    return 0;
}

Next, I realized adjustments were needed. Even after adding static inline, the RISC-V code still generated call instructions to invoke other functions. Therefore, I removed the functions and wrote everything directly in the main function to avoid function calls.
Later, I extracted parameters from the main function that could be defined as global variables to make the conversion to RISC-V easier. These parameters can be defined at the beginning.
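
One likely reason the call instructions remained is that static inline is only a hint, and GCC does not inline such functions when compiling without optimization. Below is a sketch, assuming GCC or Clang, that forces inlining with the always_inline attribute. The macro name FORCE_INLINE and the function name is_valid_segment are hypothetical. A recursive function such as backtrack cannot be fully inlined, so this approach only applies to non-recursive helpers.

```c
#include <stdbool.h>

/* Hypothetical macro; always_inline is a GCC/Clang extension. */
#define FORCE_INLINE static inline __attribute__((always_inline))

/* Same check as isValidSegment: s[start..end] must be 0..255
 * with no leading zero. */
FORCE_INLINE bool is_valid_segment(const char *s, int start, int end) {
    if (start > end) return false;
    if (s[start] == '0' && start != end) return false;
    int num = 0;
    for (int i = start; i <= end; i++) {
        if (s[i] < '0' || s[i] > '9') return false;
        num = num * 10 + (s[i] - '0');
    }
    return num <= 255;
}
```

With this attribute the compiler expands the body at each call site even at -O0, so the generated assembly for the caller contains no call to the helper.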

C code (version 3)

Always write in English!

#include <stdio.h>
#include <stdbool.h>

#define MAX_IPS 100
#define MAX_IP_LENGTH 16 // IP address length: 15 characters + 1 null terminator

int returnSize;
char *testCases[] = {
        "25016247489",
        "12539220",
        "524816033"
    };
char result[MAX_IPS][MAX_IP_LENGTH]; // Preallocated array for the IP strings
int main() {

    for (int t = 0; t < 3; t++) {
        char *s = testCases[t];
        printf("Input: %s\n", s);
        returnSize = 0;

        int len = 0;
        while (s[len] != '\0') {
            len++;
        }
        if (len < 4 || len > 12) {
            printf("No valid IP addresses found.\n\n");
            continue; // Invalid length: skip this input
        }

        // Enumerate the IP addresses
        for (int i = 1; i < len && i <= 3; i++) { // First segment
            // Check whether the first segment is valid
            if (s[0] == '0' && i > 1) continue; // Reject leading zeros
            int num1 = 0;
            for (int j = 0; j < i; j++) num1 = num1 * 10 + (s[j] - '0');
            if (num1 > 255) continue;
            int p1 = i; // End position of the first segment

            for (int j = p1 + 1; j < len && j <= p1 + 3; j++) { // Second segment
                // Check whether the second segment is valid
                if (s[p1] == '0' && j > p1 + 1) continue; // Reject leading zeros
                int num2 = 0;
                for (int k = p1; k < j; k++) num2 = num2 * 10 + (s[k] - '0');
                if (num2 > 255) continue;
                int p2 = j; // End position of the second segment

                for (int k = p2 + 1; k < len && k <= p2 + 3; k++) { // Third segment
                    // Check whether the third segment is valid
                    if (s[p2] == '0' && k > p2 + 1) continue; // Reject leading zeros
                    int num3 = 0;
                    for (int l = p2; l < k; l++) num3 = num3 * 10 + (s[l] - '0');
                    if (num3 > 255) continue;
                    int p3 = k; // End position of the third segment

                    // Fourth segment
                    if (p3 < len) {
                        if (s[p3] == '0' && p3 + 1 < len) continue; // Reject leading zeros
                        int num4 = 0;
                        for (int l = p3; l < len; l++) num4 = num4 * 10 + (s[l] - '0');
                        if (num4 > 255) continue; // Check whether the fourth segment is valid

                        // Build the valid IP address
                        char *ip = result[returnSize];
                        int index = 0;
                        for (int j = 0; j < p1; j++) ip[index++] = s[j];
                        ip[index++] = '.';
                        for (int j = p1; j < p2; j++) ip[index++] = s[j];
                        ip[index++] = '.';
                        for (int j = p2; j < p3; j++) ip[index++] = s[j];
                        ip[index++] = '.';
                        for (int j = p3; j < len; j++) ip[index++] = s[j];
                        ip[index] = '\0'; // Null terminator
                        returnSize++;
                    }
                }
            }
        }

        if (returnSize == 0) {
            printf("No valid IP addresses found.\n\n");
        } else {
            printf("Valid IP addresses:\n");
            for (int j = 0; j < returnSize; ++j) {
                printf("%s\n", result[j]);
            }
            printf("\n");
        }
    }

    return 0;
}

RISC-V

  1. Output result

on C
[Screenshot 2024-10-14 230722]

on Ripes
[Screenshot 2024-10-15 040700]

  1. Example Walkthrough:

For s = "12539220": start with the first part, which can be "1", "12", or "125". For each valid first part, try each candidate for the next part, and so on. The valid IPs are "1.25.39.220", "1.253.9.220", "1.253.92.20", "12.5.39.220", "12.53.9.220", "12.53.92.20", "125.3.9.220", "125.3.92.20", "125.39.2.20", and "125.39.22.0".

  1. Edge Cases:

If s = "1111", the only valid IP is "1.1.1.1". If s = "0000", the only valid IP is "0.0.0.0". If s = "010010", valid IPs are "0.10.0.10" and "0.100.1.0".
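
The walkthrough and edge cases above can be checked with a small host-side sketch. The helper names valid_segment and restore_ips are hypothetical. For brevity the sketch uses strlen and snprintf, which the RISC-V-oriented versions above deliberately avoid. It enumerates the lengths of the first three segments and derives the fourth from what is left.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MAX_IPS 100
#define MAX_IP_LENGTH 16 /* 15 characters + 1 null terminator */

/* True if the len characters starting at s[start] form a value 0..255
 * with no leading zero. */
static bool valid_segment(const char *s, int start, int len) {
    if (len < 1 || len > 3) return false;
    if (s[start] == '0' && len > 1) return false;
    int num = 0;
    for (int i = 0; i < len; i++) {
        if (s[start + i] < '0' || s[start + i] > '9') return false;
        num = num * 10 + (s[start + i] - '0');
    }
    return num <= 255;
}

/* Hypothetical helper: writes every valid IP into out, returns the count. */
static int restore_ips(const char *s, char out[MAX_IPS][MAX_IP_LENGTH]) {
    int n = (int)strlen(s);
    int count = 0;
    if (n < 4 || n > 12) return 0;
    for (int a = 1; a <= 3; a++) {
        for (int b = 1; b <= 3; b++) {
            for (int c = 1; c <= 3; c++) {
                int d = n - a - b - c;          /* length of the last segment */
                if (d < 1 || d > 3) continue;   /* also keeps all reads in bounds */
                if (!valid_segment(s, 0, a) || !valid_segment(s, a, b) ||
                    !valid_segment(s, a + b, c) ||
                    !valid_segment(s, a + b + c, d))
                    continue;
                snprintf(out[count++], MAX_IP_LENGTH, "%.*s.%.*s.%.*s.%.*s",
                         a, s, b, s + a, c, s + a + b, d, s + a + b + c);
            }
        }
    }
    return count;
}
```

At most 27 length combinations exist, so MAX_IPS is never exceeded. For "010010" the function stores "0.10.0.10" and then "0.100.1.0", and for "12539220" it returns 10, matching the walkthrough.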

Explain the above.
