# Assignment 1: RISC-V Assembly and Instruction Pipeline
contributed by < [chihenliu](https://hackmd.io/@chihenliu) >

## Linked list
A linked list is a common data structure.
![](https://hackmd.io/_uploads/SyTmRzgWa.png)
![](https://hackmd.io/_uploads/Sks4Cfxbp.png)

As shown in the two diagrams above, the concept is to use nodes to record, represent, and store data. Each node has three components: Data, Pointer, and Address. Each node's pointer holds the address of the next node, and the chain continues until a pointer is Null, which marks the end of this simple linked list. Traversing a list of N nodes takes **O(N)** time.

## Count leading zeros
Count leading zeros (CLZ) counts the consecutive zero bits of a binary number, starting from the Most Significant Bit (MSB) and moving right until the first '1' is encountered.

Ex: **0000000000000010 = 14** (a 16-bit value with 14 leading zeros)

## Motivation
Before taking this course, I had no prior knowledge of data structures. Linked lists were the first data structure I learned about. Therefore, I wanted to apply the 32-bit Count Leading Zeros operation to the nodes of a linked list in RISC-V to understand both better.

# Implementation

---

## C code
```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h> // malloc, free

// linked list node
typedef struct Node {
    uint32_t data;      // 32-bit unsigned integer
    struct Node* next;
} Node;

// count the leading zeros of a 32-bit unsigned integer
uint16_t count_leading_zeros(uint32_t x) {
    // copy the highest set bit into every lower bit position
    x |= (x >> 1);
    x |= (x >> 2);
    x |= (x >> 4);
    x |= (x >> 8);
    x |= (x >> 16);

    /* count ones (population count) */
    x -= ((x >> 1) & 0x55555555);
    x = ((x >> 2) & 0x33333333) + (x & 0x33333333);
    x = ((x >> 4) + x) & 0x0f0f0f0f;
    x += (x >> 8);
    x += (x >> 16);

    // the population count can be 32, so mask with 0x3f (0x1f would turn 32 into 0)
    return (32 - (x & 0x3f));
}

// sum the leading-zero counts of every node in the linked list
uint64_t sum_of_leading_zeros(Node* head) {
    uint64_t sum = 0;
    while (head != NULL) {
        uint16_t leadingZeros = count_leading_zeros(head->data);
        sum += leadingZeros;
        head = head->next;
    }
    return sum;
}

int main() {
    // create a simple linked list (allocation checks omitted for brevity)
    Node* head = NULL;
    Node* node1 = malloc(sizeof(Node));
    node1->data = 23;
    node1->next = NULL;
    head = node1;

    Node* node2 = malloc(sizeof(Node));
    node2->data = 15;
    node2->next = NULL;
    node1->next = node2;

    Node* node3 = malloc(sizeof(Node));
    node3->data = 8;    // list_1 of case 1: 23, 15, 8
    node3->next = NULL;
    node2->next = node3;

    // sum of the leading zeros of all nodes (32-bit unsigned integers)
    uint64_t totalLeadingZeros = sum_of_leading_zeros(head);
    printf("Sum of Leading Zeros: %llu\n", (unsigned long long)totalLeadingZeros);

    // release linked list node memory
    while (head != NULL) {
        Node* temp = head;
        head = head->next;
        free(temp);
    }
    return 0;
}
```

First, I defined the structure of a linked list and created a simple linked list with three nodes. Then, I used a 32-bit CLZ (Count Leading Zeros) function to compute the leading-zero count of each node and summed the three counts.

case1:
Input list_1: 23, 15, 8
Output1: ![](https://hackmd.io/_uploads/rJ4hXXlb6.png)

case2:
Input list_2: 10, 32, 56
Output2: ![](https://hackmd.io/_uploads/rJNqHmg-p.png)

case3:
Input list_3: 89, 125, 256
Output3: ![](https://hackmd.io/_uploads/rkZDDmlbT.png)

Table
| Input_1 | Input_2 | Input_3 |
| -------- | -------- | -------- |
| 23, 15, 8 | 10, 32, 56 | 89, 125, 256 |

| Output_1 | Output_2 | Output_3 |
| -------- | -------- | -------- |
| 83 | 80 | 73 |

## Assembly Code(RISC-V)
I implemented the 32-bit Count Leading Zeros routine in RISC-V as a function, `clz`, which mirrors the C version statement by statement.
```c
clz:
    #Prologue
    addi sp,sp,-16
    sw ra,0(sp)
    sw s0,4(sp)
    sw s1,8(sp)
    sw s2,12(sp)
    #s0 is x
    add s0,x0,a0
    #x|=(x>>1)
    srli t0,s0,1
    or s0,s0,t0
    #x|=(x>>2)
    srli t0,s0,2
    or s0,s0,t0
    #x|=(x>>4)
    srli t0,s0,4
    or s0,s0,t0
    #x|=(x>>8)
    srli t0,s0,8
    or s0,s0,t0
    #x|=(x>>16)
    srli t0,s0,16
    or s0,s0,t0
    #x -= ((x>>1) & 0x55555555)
    li t1,0x55555555
    srli t0,s0,1
    and t0,t0,t1
    sub s0,s0,t0
    #x = ((x>>2) & 0x33333333)+(x & 0x33333333)
    li t1,0x33333333
    srli t0,s0,2
    and t0,t0,t1
    and t1,s0,t1
    add s0,t0,t1
    #x = ((x>>4) + x) & 0x0f0f0f0f
    srli t0,s0,4
    add t0,t0,s0
    li t1,0x0f0f0f0f
    and s0,t0,t1
    #x += (x>>8)
    srli t0,s0,8
    add s0,t0,s0
    #x += (x>>16)
    srli t0,s0,16
    add s0,t0,s0
    #(32-(x&0x3f)), the count can be 32 so the mask must be 0x3f
    li a0,32
    andi t0,s0,0x3f
    sub a0,a0,t0
    #Epilogue
    lw ra,0(sp)
    lw s0,4(sp)
    lw s1,8(sp)
    lw s2,12(sp)
    addi sp,sp,16
    jr ra
```

This function walks the list, calls `clz` on each node, and returns the sum of the leading-zero counts. The loop stops at the first zero word, so the list must end with a 0.
```cpp
sum_clz_zeros:
    #Prologue
    addi sp, sp, -16
    sw ra, 0(sp)
    sw s0, 4(sp)
    sw s1, 8(sp)
    sw s2, 12(sp)
    li s0, 0                #s0 = running sum
    la s1, list             #load address for list
loop:
    lw t0, 0(s1)
    beqz t0,done            #a zero word ends the list
    mv a0, t0
    jal ra, clz
    add s0, s0, a0
    addi s1,s1,4
    j loop
done:
    mv a0,s0                #s0->a0
    #Epilogue
    lw ra,0(sp)
    lw s0,4(sp)
    lw s1,8(sp)
    lw s2,12(sp)
    addi sp,sp,16
    jr ra
```

Full RISC-V code
```cpp
.data
list: .word 10,32,56,0      # the trailing 0 terminates the list for sum_clz_zeros
.text
# 32-bit unsigned CLZ computation for the three nodes of list
main:
    #load address for list
    la s0,list
    lw a0,0(s0)
    call clz
    jal ra, print_result
    lw a0,4(s0)
    call clz
    jal ra, print_result
    lw a0,8(s0)
    call clz
    jal ra, print_result
    #calculate the sum of the leading zeros of the three nodes
    call sum_clz_zeros
    jal ra,print_result
    j exit_program

print_result:
    li a7,1                 #print the integer in a0
    ecall
    jr ra

clz:
    #Prologue
    addi sp,sp,-16
    sw ra,0(sp)
    sw s0,4(sp)
    sw s1,8(sp)
    sw s2,12(sp)
    #s0 is x
    add s0,x0,a0
    #x|=(x>>1)
    srli t0,s0,1
    or s0,s0,t0
    #x|=(x>>2)
    srli t0,s0,2
    or s0,s0,t0
    #x|=(x>>4)
    srli t0,s0,4
    or s0,s0,t0
    #x|=(x>>8)
    srli t0,s0,8
    or s0,s0,t0
    #x|=(x>>16)
    srli t0,s0,16
    or s0,s0,t0
    #x -= ((x>>1) & 0x55555555)
    li t1,0x55555555
    srli t0,s0,1
    and t0,t0,t1
    sub s0,s0,t0
    #x = ((x>>2) & 0x33333333)+(x & 0x33333333)
    li t1,0x33333333
    srli t0,s0,2
    and t0,t0,t1
    and t1,s0,t1
    add s0,t0,t1
    #x = ((x>>4) + x) & 0x0f0f0f0f
    srli t0,s0,4
    add t0,t0,s0
    li t1,0x0f0f0f0f
    and s0,t0,t1
    #x += (x>>8)
    srli t0,s0,8
    add s0,t0,s0
    #x += (x>>16)
    srli t0,s0,16
    add s0,t0,s0
    #(32-(x&0x3f)), the count can be 32 so the mask must be 0x3f
    li a0,32
    andi t0,s0,0x3f
    sub a0,a0,t0
    #Epilogue
    lw ra,0(sp)
    lw s0,4(sp)
    lw s1,8(sp)
    lw s2,12(sp)
    addi sp,sp,16
    jr ra

sum_clz_zeros:
    #Prologue
    addi sp, sp, -16
    sw ra, 0(sp)
    sw s0, 4(sp)
    sw s1, 8(sp)
    sw s2, 12(sp)
    li s0, 0                #s0 = running sum
    la s1, list             #load address for list
loop:
    lw t0, 0(s1)
    beqz t0,done            #a zero word ends the list
    mv a0, t0
    jal ra, clz
    add s0, s0, a0
    addi s1,s1,4
    j loop
done:
    mv a0,s0                #s0->a0
    #Epilogue
    lw ra,0(sp)
    lw s0,4(sp)
    lw s1,8(sp)
    lw s2,12(sp)
    addi sp,sp,16
    jr ra

exit_program:
    #list lives in .data (static storage), so there is nothing to free;
    #treating the stored values as node pointers would access invalid addresses
    li a7,10                #exit
    ecall
```

For each test case, I check the leading-zero count returned by the CLZ (Count Leading Zeros) operation for each node and then sum the counts.
![](https://hackmd.io/_uploads/r13I-El-a.png)
![](https://hackmd.io/_uploads/HkHDZVxW6.png)
![](https://hackmd.io/_uploads/rk1_ZNl-6.png)

| OutPut_case1 | OutPut_case2 | OutPut_case3 |
| -------- | -------- | -------- |
| 83 | 80 | 73 |

## 5-stage Pipeline Analysis
5-stage pipeline generated by Ripes
![](https://hackmd.io/_uploads/Bygda_f-p.jpg)
![](https://hackmd.io/_uploads/Sy_d6_zbT.jpg)

The figures above show my analysis of the pipeline while it executes the first instructions under the `main` label.

### IF stage
![](https://hackmd.io/_uploads/SJ2ATuf-p.jpg)
* This instruction, `jalr x1, x1, 72`, sets the PC (Program Counter) to (x1 + 72) and stores the address of the next instruction in the `x1` register.
This operation is typically used for function calls and returns.
* The Program Counter is 0x00000014, which is the address of the next instruction.
* `jalr` is an `I-type` instruction.
* According to the RISC-V Green Card, this instruction performs `R[rd]=PC+4;PC=R[rs1]+imm`. The LSB of the computed target address is set to zero. The fields of the `jalr` instruction are:

| Imm | opcode | Funct3 |
| -------- | -------- | -------- |
| Imm[11:0] | 1100111 | 000 |

* The next `PC` is `PC` + 4 if no branching occurs.

### ID stage
![](https://hackmd.io/_uploads/SkmlHNx-T.png)
* This is an `auipc` instruction with immediate `0x0` and destination `x1`. It writes `PC + (0x0 << 12)`, that is, the address of the `auipc` itself, to `x1`. Together with the `jalr` that follows it, it forms the `call clz` pseudo-instruction in `main`.
* `auipc` is a `U-type` instruction.
* According to the RISC-V Green Card, this instruction performs `R[rd]=PC+{imm,12'b0}`.

### Ex stage
![](https://hackmd.io/_uploads/BJXXBExZp.png)
* `lw x10 0 x8` reads the word at the memory address held in `x8` (offset 0) and stores it in register `x10`.
* In the EX stage the ALU takes two operands, the base register `x8` and the immediate 0, and adds them to form the load address.
* `lw` is an `I-type` instruction. According to the RISC-V Green Card it performs `R[rd]={32'bM[](31),M[R[rs1]+imm](31:0)}`, and its core instruction format is `imm[11:0],rs1,funct3,rd,opcode`.

### Mem stage
![](https://hackmd.io/_uploads/rkdqLYf-a.jpg)
* `addi x8,x8,0` is an `I-type` instruction. According to the RISC-V Green Card it performs `R[rd]=R[rs1]+imm`.

| Imm | opcode | Funct3 |
| -------- | -------- | -------- |
| Imm[11:0] | 0010011 | 000 |

* `addi` only adds an immediate value to a register, so it does not access memory and passes through the MEM stage without doing any work.

### WB stage
![](https://hackmd.io/_uploads/S1Etr4eba.png)
* In this stage the `auipc` instruction with immediate `0x10000` writes its result, `PC + (0x10000 << 12)`, back to register `x8`. It is the first half of `la s0, list`.

## CPU analysis
![](https://hackmd.io/_uploads/HJaecKz-p.jpg)

### Conclusion
This first assignment took me quite a while, as I constantly went back and forth between C and RISC-V assembly.
Implementing the CLZ function deepened my understanding of RISC-V instructions. It also made my shortcomings clear: I need to spend more time strengthening my background knowledge for this course. Thanks to the fellow students who discussed the assignment with me, I also became aware of how complex it can be to recreate a linked list and manage memory in RISC-V.

## Reference
* [Assignment1: RISC-V Assembly and Instruction Pipeline](https://hackmd.io/@sysprog/2023-arch-homework1)
* [The RISC-V Instruction Set Manual Volume I: Unprivileged ISA](https://content.riscv.org/wpcontent/uploads/2019/06/riscv-spec.pdf)
* [RISC-V Assembly Programmer's Manual](https://github.com/riscv-non-isa/riscv-asm-manual/blob/master/riscv-asm.md)
* [Linked List: Intro](http://alrightchiu.github.io/SecondRound/linked-list-introjian-jie.html)
* [RISC-V Datapath Part4: Pipeline](https://zhuanlan.zhihu.com/p/267576239)
* [RISC-V Greensheet](https://inst.eecs.berkeley.edu/~cs61c/fa17/img/riscvcard.pdf)
* [Find first set](https://en.wikipedia.org/wiki/Find_first_set)