
# Assignment 1: RISC-V Assembly and Instruction Pipeline

contributed by < `JoshuaLee0321` >

This project records the trouble I ran into while writing the first [homework](https://hackmd.io/@sysprog/2023-arch-homework1) in this class.

## Installation and Environment

:::info
Currently, my CPU's info is as follows:
```
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Address sizes:       39 bits physical, 48 bits virtual
Byte Order:          Little Endian
```
As you can see, it is not a `RISC-V` CPU. Its architecture is `x86_64`, so it cannot run `RISC-V` instructions directly. Thus, the idea of an ==emulator== struck my mind.
:::

## RISC-V GNU Toolchain
Generally, when we install `gcc` on `linux`, we get a compiler that targets the host CPU. To compile C/asm code into `RISC-V` binaries, I have to install a separate cross compiler.

```bash
# installation.sh
mkdir RISCV
cd RISCV
git clone --recursive https://github.com/riscv-collab/riscv-gnu-toolchain
# Replace with your own RISCV install path
export RISCV=/home/ubuntu/RISC-V/riscv-gnu-toolchain_install
export PATH=$PATH:$RISCV/bin
cd riscv-gnu-toolchain
mkdir build
cd build
../configure --prefix=$RISCV --with-arch=rv32i
make
```
This script will help you get the job done!

## Proxy Kernel and Spike
### pk (Proxy Kernel)
According to chatgpt:
> proxy kernel is a critical software layer used to manage and isolate different instances of operating systems in virtualization or containerization environments. It facilitates resource sharing and isolation while ensuring security and performance. Different virtualization and containerization technologies may implement proxy kernels in various ways.


That description is about virtualization in general, not `riscv-pk`. In short, pk is a lightweight execution environment that hosts statically linked `RISC-V` ELF binaries and proxies their I/O-related system calls to the host, which gives us a place to run the `RISC-V` programs we develop.

```bash
# installation.sh
git clone https://github.com/riscv-software-src/riscv-pk
cd riscv-pk
mkdir build
cd build
../configure --prefix=$RISCV --with-arch=rv64gc_zifencei --host=riscv64-unknown-elf
make
make install
```

### spike

```bash
# installation.sh
git clone https://github.com/riscv-software-src/riscv-isa-sim
cd riscv-isa-sim
mkdir build
cd build
../configure --prefix=$RISCV --enable-histogram
make
make install
```

## To sum up
In summary, Spike and pk are used together to provide a complete RISC-V environment: Spike simulates the CPU and its instruction set, while pk supplies the minimal operating-system services a program expects, such as system calls. Together, the two tools let developers test and debug RISC-V software without RISC-V hardware.
Hence, to build and run `RISC-V` programs, we need both a cross compiler and a simulated environment.

# Ripes Simulator
Little did I know, this project requires us to make sure the code runs in [Ripes](https://github.com/mortbopet/Ripes), so I just downloaded it, and it works properly.
![](https://hackmd.io/_uploads/H1PMmtclp.png)
As you can see, we successfully ran it.

# clz implementation
Since we aim to solve the bitwise AND of numbers range problem, which only needs 32-bit inputs, this section reproduces only the 32-bit version of clz.

```asm
.data
d1_33: .word 0x33333333
d2_0f: .word 0x0f0f0f0f
d3_55: .word 0x55555555
.text
count_leading_zeros:
    # prologue
    addi sp, sp, -12
    sw   ra, 0(sp)
    sw   s0, 4(sp)
    sw   s1, 8(sp)
    # start shifting
    # input : a0
    mv   s0, a0        # x = input
    srli s1, s0, 1     # x >> 1
    or   s0, s1, s0    # x |= (x >> 1)
    srli s1, s0, 2     # x >> 2
    or   s0, s1, s0    # x |= (x >> 2)
    srli s1, s0, 4     # x >> 4
    or   s0, s1, s0    # x |= (x >> 4)
    srli s1, s0, 8     # x >> 8
    or   s0, s1, s0    # x |= (x >> 8)
    srli s1, s0, 16    # x >> 16
    or   s0, s1, s0    # x |= (x >> 16)

    la   t0, d3_55
    lw   t0, 0(t0)
    srli s1, s0, 1
    and  s1, s1, t0
    sub  s0, s0, s1    # x -= ((x >> 1) & 0x55555555)

    la   t0, d1_33
    lw   t0, 0(t0)
    srli s1, s0, 2
    and  s1, s1, t0
    and  t1, s0, t0
    add  s0, t1, s1    # x = ((x >> 2) & 0x33333333) + (x & 0x33333333);

    la   t0, d2_0f
    lw   t0, 0(t0)
    srli s1, s0, 4
    add  s1, s1, s0
    and  s0, s1, t0    # x = ((x >> 4) + x) & 0x0f0f0f0f;

    srli s1, s0, 8
    add  s0, s1, s0    # x += (x >> 8)
    srli s1, s0, 16
    add  s0, s1, s0    # x += (x >> 16)

    addi t0, zero, 32
    andi s0, s0, 0x7f
    sub  a0, t0, s0    # return (32 - (x & 0x7f));
    # epilogue
    lw   ra, 0(sp)
    lw   s0, 4(sp)
    lw   s1, 8(sp)
    addi sp, sp, 12
    jr   ra
```

Most of the code is self-explanatory, but the prologue and epilogue deserve attention. According to a relevant Stack Overflow answer ([link](https://stackoverflow.com/questions/9268586/what-are-callee-and-caller-saved-registers)), a callee must hand back the callee-saved registers with the values they had on entry. In the RISC-V calling convention `s0` and `s1` are callee-saved, so because this routine uses them as scratch space, the prologue spills them to the stack and the epilogue restores them. `ra` is also saved here; that is only strictly necessary in functions that make further calls, since a call overwrites `ra`.

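For reference, the C-style comments in the assembly above can be collected into one function. This is a minimal sketch reconstructed from those comments, not necessarily the exact C source the assembly was derived from; the names `clz32` and `clz32_spot_checks` are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Count leading zeros of a 32-bit value; returns 32 when x == 0. */
unsigned clz32(uint32_t x)
{
    /* Smear the highest set bit into every lower bit position. */
    x |= x >> 1;
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;

    /* Population count of the smeared value, summed in parallel fields. */
    x -= (x >> 1) & 0x55555555u;
    x = ((x >> 2) & 0x33333333u) + (x & 0x33333333u);
    x = ((x >> 4) + x) & 0x0f0f0f0fu;
    x += x >> 8;
    x += x >> 16;

    /* Leading zeros = 32 - (number of bits at or below the highest set bit). */
    return 32u - (x & 0x7fu);
}

/* Spot checks on inputs whose answer is easy to verify by hand. */
void clz32_spot_checks(void)
{
    assert(clz32(0u) == 32u);
    assert(clz32(1u) == 31u);
    assert(clz32(0x00010000u) == 15u);
    assert(clz32(0x80000000u) == 0u);
}
```

After the smearing step the value has ones from the highest set bit downward, so its population count is exactly `32 - clz`, which is why the routine ends with a subtraction from 32.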
# [bitwise-and-of-numbers-range](https://leetcode.com/problems/bitwise-and-of-numbers-range/description/)
## previous solution

```c
int rangeBitwiseAnd(int left, int right){
    int res = 0;
    while(left != right){
        left>>=1;
        right>>=1;
        res+=1;
    }
    return left << res;
}
```

My previous submission runs a loop whose conditional branch executes once per shifted bit, up to 31 times for 32-bit inputs. Considering [branch prediction and recovery](https://www.educative.io/answers/what-is-branch-prediction), every misprediction costs pipeline cycles, so this might not be a good idea, and there should be a faster implementation.

Here is another solution, which relies on bit manipulation instead of a loop:

```c
#include <limits.h> /* INT_MAX */

/* clz() is the 32-bit count-leading-zeros routine from the previous section */
int rangeBitwiseAnd(int L, int R) {
    return L == R ? L : L & ~(INT_MAX >> clz(L ^ R));
}
```

`clz(L ^ R)` finds the highest bit in which `L` and `R` differ. Every bit at or below that position is 0 in the final result, so masking `L` leaves only the prefix the two bounds share.

Based on this solution, we can translate it into RISC-V assembly:

```asm
.text
rangeBitwiseAnd:
    # a0, a1 as input (32-bit words)
    # prologue
    addi sp, sp, -12
    sw   ra, 0(sp)
    sw   s0, 4(sp)
    sw   s1, 8(sp)
    # s0 = a0, s1 = a1
    mv   s0, a0
    mv   s1, a1
    beq  s0, s1, rangedBitWiseAndEnd   # L == R: the answer is L
    xor  a0, s0, s1
    call count_leading_zeros           # a0 = clz(L ^ R)
    mv   t1, a0
    # load INT_MAX (li is used so no int_max label is needed in .data)
    li   t0, 0x7fffffff
    srl  t0, t0, t1                    # INT_MAX >> clz(L ^ R)
    not  t0, t0
    and  s0, s0, t0                    # L & ~(INT_MAX >> clz(L ^ R))
rangedBitWiseAndEnd:
    # epilogue
    mv   a0, s0                        # return value
    lw   ra, 0(sp)
    lw   s0, 4(sp)
    lw   s1, 8(sp)
    addi sp, sp, 12
    jr   ra
```
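Before trusting the assembly, it can help to confirm on the host that the two C solutions agree. The sketch below assumes the LeetCode constraint `0 <= left <= right <= INT_MAX`; the names `range_and_loop`, `range_and_clz`, `check_agreement`, and the loop-based `clz` stand-in are hypothetical and only serve this comparison.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for the clz routine: a plain loop keeps the sketch short. */
static int clz(uint32_t x)
{
    int n = 0;
    for (uint32_t bit = 0x80000000u; bit != 0 && (x & bit) == 0; bit >>= 1)
        n++;
    return n;
}

/* Loop version: shift both bounds right until they agree, then shift back. */
int range_and_loop(int left, int right)
{
    int res = 0;
    while (left != right) {
        left >>= 1;
        right >>= 1;
        res += 1;
    }
    return left << res;
}

/* Mask version: clear every bit below the highest differing bit. */
int range_and_clz(int left, int right)
{
    return left == right
               ? left
               : left & ~(INT_MAX >> clz((uint32_t)(left ^ right)));
}

/* Assert that both versions agree for one pair of bounds. */
void check_agreement(int left, int right)
{
    assert(range_and_loop(left, right) == range_and_clz(left, right));
}
```

For example, with `left = 5` (`101`) and `right = 7` (`111`), the bounds differ first at bit 1, so both versions keep only the shared prefix and return 4.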