Lab1: RV32I Simulator

contributed by < qwe661234 >

GitHub: ComputerArchitectureHW/HW1-LeetCode-203-RISCV

203. Remove Linked List Elements

This problem is based on LeetCode 203. Given the head of a linked list and an integer variable val, remove all the nodes of the linked list that has Node.val == val, and return the new head.

ListNode Structure

struct ListNode {
    int val;
    struct ListNode *next;
};

Assembly code

.data
length:
    .word 7
array: 
    .word 1, 2, 6, 3, 4, 5, 6
target:
    .word 6 
answer:
    .word 1, 2, 3, 4, 5
err_msg:
    .string "malloc fail"
assert_fail:
    .string "assert fail"
assert_success:
    .string "assert success"
.text
main:
    jal create_default_list
    mv t1, s1 # store list head to t1
    jal print_list
    mv t1, s1 # store list head to t1
    jal remove_elements
    mv t1, s1 # store list head to t1
    jal print_list
    mv t1, s1 # store list head to t1
    jal assert_list
exit:
    addi a7, x0 10 # exit
    ecall
# ------ create_default_list ------
create_default_list:
    addi sp, sp, 8
    li s2, 0          # i = 0
    la t1, array      # load array
    la t0, length     # load length
    lw t0, 0(t0)      
    bne s2, t0, create
    jr ra
create:
    li s0, 0x0fff0000 # address of list head
    sw ra, 0(sp)      # store return address
    sw s0, 4(sp)      # store the address of list head
    jal ra, malloc    # get memory for the next node
    bne a0, x0, exit_error  
    lw s1, 0(t1)      # load array[i]
    addi t1, t1, 4    # i++
    sw s1, 0(s0)      # node->value = array[i]
    sw x0, 4(s0)      # node->next = NULL
    mv t2, s0         # prev = node
    addi s2, s2, 1    
    addi s0, s0, 8    
    beq s2, t0, ret   # check end condition in for loop
loop:   
    jal ra, malloc    # get memory for the next node
    bne a0, x0, exit_error  
    lw s1, 0(t1)      # load array[i]
    addi t1, t1, 4    # i++
    sw s1, 0(s0)      # node->value = array[i]
    sw x0, 4(s0)      # node->next = NULL
    sw s0, 4(t2)      # prev->next = node
    mv t2, s0         # prev = node
    addi s2, s2, 1    
    addi s0, s0, 8    
    bne s2, t0, loop  # check end condition in for loop
ret:
    lw ra, 0(sp)      # load return address
    lw s1, 4(sp)      # load the address of list head
    addi sp, sp, 8
    jr ra
# ------ malloc ------
malloc:     # allocates 8 bytes on the heap, returns pointer to start in a0
    addi a0, s0, 8   # move heap limit to current + 8, 4 byte for integer, 4 byte for pointer 
    addi a7, x0, 214 # brk systemcall
    ecall
    jr  ra
# ------ remove_elements ------
remove_elements:
    la s0, target
    lw s0, 0(s0)
iterate:
    beq t1, x0, end_iterate 
    lw t2, 0(t1) # load value store in node 
    beq t2, s0, remove
    mv t3, t1    # prev = cur
    lw t1, 4(t1) # load address store in node->next
    j iterate
remove:
    beq t1, s1, remove_head # if cur == head
    lw t1, 4(t1) # load cur->next to t0
    sw t1, 4(t3) # prev->next = cur->next
    j iterate
remove_head:
    lw t1, 4(t1) # load address store in cur->next
    mv s1, t1 # head = head->next
    j iterate
end_iterate:
    jr ra 
# ------ print_list ------
print_list:
    beq t1, x0, end
    lw a0, 0(t1)   # load value store in node
    addi a7, x0, 1 # print number
    ecall
    lw t1, 4(t1)   # load address store in node->next
    j print_list
end:      
    addi a0, x0, 10 # '\n' acsii number
    addi a7, x0, 11 # print char
    ecall
    jr ra
exit_error:
    la a0, err_msg 
    addi a7, x0 4  # print error message
    ecall
    addi a7, x0 10 # exit
    ecall   
# ------ assert_list ------
assert_list:
    la t0, answer
assert_loop:    
    beq t1, x0, end_aseert
    lw a0, 0(t1)   # load value store in node
    lw a1, 0(t0)   # load value store in answer[i]
    lw t1, 4(t1)   # load address store in node->next
    addi t0, t0, 4 # i++
    beq a0, a1 assert_loop
    la a0, assert_fail
    addi a7, x0 4  # print message
    ecall
    jr ra
end_aseert:  
    la a0, assert_success
    addi a7, x0 4  # print message
    ecall 
    jr ra

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

Can you use fewer instructions?

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

jserv

Interpret the assembly code

malloc

C code

struct ListNode *node = malloc(sizeof(struct ListNode));

Assembly code

# ------ malloc ------
malloc:     # allocates 8 bytes on the heap, returns pointer to start in a0
    addi a0, s0, 8   # move heap limit to current + 8, 4 byte for integer, 4 byte for pointer 
    addi a7, x0, 214 # brk systemcall
    ecall
    jr  ra

register s0 represent the origin heap limit address, registera0 represent the new heap limit address. We want to allocate 8 bytes for ListNode, 4 bytes for node->value and 4 bytes for node->next, so we add register s0 with 8 and store the result in register a0. Then we use brk systemcall to set new heap limit address to address stored in register a0.
It is more convient to use sbrk systemcall to implement malloc, but Ripes only support brk systemcall.

create_default_list

C code

struct ListNode *head = NULL, *cur = head;
int array[7] = {1, 2, 6, 3, 4, 5, 6}
int length = 7;
for (int i = 0; i < len; i++) {
    struct ListNode *node = malloc(sizeof(struct ListNode));
    node->val = array[i];
    node->next = NULL;
    if (!head) 
        head = node;
    else {
        cur->next = node;
        cur = cur->next;
    }
}

Assembly code

create_default_list:
    addi sp, sp, 8
    li s2, 0           # i = 0
    la t1, array       # load array
    la t0, length      # load length
    lw t0, 0(t0)      
    bne s2, t0, create # if length == 0, return
    jr ra

First, we store list head address and return address in stack and load array in data section.

create:
    li s0, 0x0fff0000 # address of list head
    sw ra, 0(sp)      # store return address
    sw s0, 4(sp)      # store the address of list head
    jal ra, malloc    # get memory for the next node
    bne a0, x0, exit_error  
    lw s1, 0(t1)      # load array[i]
    addi t1, t1, 4    # i++
    sw s1, 0(s0)      # node->value = array[i]
    sw x0, 4(s0)      # node->next = NULL
    mv t2, s0         # prev = node
    addi s2, s2, 1    
    addi s0, s0, 8    
    beq s2, t0, ret   # check end condition in for loop

Two instruction sw ra, 0(sp) and sw s0, 4(sp) store return address and the address of list head (0x0fff0000) in stack.
This label is for head == NULL, because head is NULL when i = 0. Instruction sw s1, 0(s0) store arr[0] to node->val, instruction sw x0, 4(s0) set node->next to NULL and instuction mv t2, s0 record previous node.
Instruction addi t1, t1, 4 means i++.

loop:   
    jal ra, malloc    # get memory for the next node
    bne a0, x0, exit_error  
    lw s1, 0(t1)      # load array[i]
    addi t1, t1, 4    # i++
    sw s1, 0(s0)      # node->value = array[i]
    sw x0, 4(s0)      # node->next = NULL
    sw s0, 4(t2)      # prev->next = node
    mv t2, s0         # prev = node
    addi s2, s2, 1    
    addi s0, s0, 8    
    bne s2, t0, loop  # check end condition in for loop

Only one instruction in this label loop is different from label create, sw s0, 4(t2), this instruction store node to previous_node->next.

remove_elements

C code

int target = 1;
struct ListNode *cur = head, *prev = NULL;
while (cur) {
    if (cur->val == target) {
        if (cur == head) {
            head = head->next;
            cur = head;
            continue;
        } else {
            prev->next = cur->next;
        }
    } 
    prev = cur;
    cur = cur->next;
}

Assembly code

iterate:
    beq t1, x0, end_iterate 
    lw t2, 0(t1) # load value store in node 
    beq t2, s0, remove
    mv t3, t1    # prev = cur
    lw t1, 4(t1) # load address store in node->next
    j iterate

Instruction beq t2, s0 is to check if node->val == target.

remove:
    beq t1, s1, remove_head # if cur == head
    lw t1, 4(t1) # load cur->next to t0
    sw t1, 4(t3) # prev->next = cur->next
    j iterate

Instruction beq t1, s1 is to check if current_node == head.
This label is to deal with remove node from list. Register t1 store current node address and register t3 store previous node address.

remove_head:
    lw t1, 4(t1) # load address store in cur->next
    mv s1, t1 # head = head->next
    j iterate

This label is to deal with remove head from list. Register t1 store current node address, instruction lw t1, 4(t1) means we load head->next and instruction mv s1, t1 means we store head->next to head.

assert_list

C code

void assert_list(int *answer, ListNode *head) {
    ListNode *cur = head;
    int count = 0;
    while (cur) {
        if (cur->val != answer[count++]) {
            printf("assert fail");
            return;
        }
        cur = cur->next;
    }
    printf("assert success");
}

Assembly code

assert_list:
    la t0, answer
assert_loop:    
    beq t1, x0, end_aseert
    lw a0, 0(t1)   # load value store in node
    lw a1, 0(t0)   # load value store in answer[i]
    lw t1, 4(t1)   # load address store in node->next
    addi t0, t0, 4 # i++
    beq a0, a1 assert_loop
    la a0, assert_fail
    addi a7, x0 4  # print message
    ecall
    jr ra
end_aseert:  
    la a0, assert_success
    addi a7, x0 4  # print message
    ecall 
    jr ra

This function is for testing results automatically, we load answers which we has defined and value store in list node, then we assert whether these two values are equal or not.

Result

The original linked list is 1->2->6->3->4->5->6. After removing target elements in linked list, the linked list would be 1->2->3->4->5.

1263456
12345
assert success

Program exited with code: 0

Reduce instruction usage

create_default_list

If we create list reversely(from last node to first node), we can just store current_node to pervious_node->next, instead of set current_node->next = NULL. Also, we don't need to care about this creating is for head or node. Therefore, we can save instruction sw x0, 4(s0) # node->next = NULL and merge label create to label loop. But, we should reverse the arrange of array in data section, because you create list reversely.

assembly code after modify

create_default_list:
    addi sp, sp, 8
    li s0, 0x0fff0000 # address of list head
    sw ra, 0(sp)      # store return address
    sw s0, 4(sp)      # store the address of list head
    li s2, 0          # i = 0
    la t1, array      # load array
    la t0, length     # load length
    lw t0, 0(t0)      
    bne s2, t0, loop
    jr ra
loop:   
    jal ra, malloc    # get memory for the next node
    bne a0, x0, exit_error  
    lw s1, 0(t1)      # load array[i]
    addi t1, t1, 4    # i++
    sw s1, 0(s0)      # node->value = array[i]
    sw t2, 4(s0)      # prev->next = node
    mv t2, s0         # prev = node
    addi s2, s2, 1    
    addi s0, s0, 8    
    bne s2, t0, loop  # check end condition in for loop
ret:
    addi s1, s0, -8
    lw ra, 0(sp)      # load return address
    addi sp, sp, 8
    jr ra

Pipeline feature in Ripes

Instruction pipeline is a technique for implementing instruction-level parallelism within a single processor. We partition an instruction to five stages.

Stage	Full name
IF	Instruciton Fetch
ID	Instruction decode & read register
EX	Execution or Address Calculation
MEM	Data memory access (load data from memory)
WB	write back (write data back to register)

Also, we have four pipeline register to store the data of pervious stage. Next stage would load data from pipeline register and store new data to next pipeline register.

Pipeline stage

We take instruction lw t0, 4(t1) as example

IF

In this stage, load the instruction which address is PC, and store PC + 4 to IF/ID pipeline register.

ID

In this stage, decode the instruction loaded from IF/ID pipeline register. the instruction is lw t0, 4(t1), so R1 is register t1(x6) , R2 in this instruction is useless, imm is 4 and register t0 (x5) would store in ID/EX pipeline register.
Reg 1 means the address store in register t1 (x6), here is 0x00000020.

EX

In this stage, we pass immediate and the address store in register t1 (x6) to ALU, ALU will execute add operation to calculate right memory address that we want to access value.
Here is 0x00000020 + 4 (immediate) = 0x00000024.

MEM

In this stage, we will access data store in memory address that ALU calculate in previous stage.

WB

In last stage, we will write data which we access in pervious stage back to the target register.
We can notice the below image, red line represent the target register number, and blue line represent the value that access from memory.

Pipeline Hazard

The pipeline design might encounter three types of hazard

1. Structure hazards

A required resource is busy, and instruction need to wait until the resource is ready.

Solution

Load/store and Instruction fetch both require memory access
Separating instruction memory and data memory or separating instruction cache and data cache.

2. Data hazard

An instruction depends on completion of data access by a previous instruction.
There are two scenarioes for data hazard

RAW(Read After Write)

There is an example of RAW, before the first instruction add s0, t0, t1 write back data to registerx8, the second instruction sub t2, s0, t3 has read data from register x8.

In this Image, Ripe use forwarding to avoid data hazard.

Load-use

There is an example of RAW, before the first instruction lw s0, 20(t1) access memory and write back data to registers0, the second instruction sub t2, s0, t3 has read data from register s0.

In this Image, Ripe use stall and forwarding to avoid data hazard.

Solution

there are three solution for data hazard:

Reodering code: reorder code to avoid use of load result in the next instruction.
Stall: add no operation instruction to wait data write back.
Forwarding: bypass the result to next instruction, this desgin requires extra connections in the datapath.

But, we still need a stall to avoid data hazard in load-use case.

3. Control hazard

Branch determines flow of control, and fetching next instruction depends on branch outcome.

Solution

Result of branch is known at the EX stage.

Stall: wait until the branch determined, then fetching next instruction.
Branch Prediction:
- Correct prediction: no stall required.
- Incorrect prediction: clean incorrect instruction and stall one cycle to wait previous instruction finish EX stage.

Lab1: RV32I Simulator

203. Remove Linked List Elements

ListNode Structure

Assembly code

Interpret the assembly code

malloc

C code

Assembly code

create_default_list

C code

Assembly code

remove_elements

C code

Assembly code

assert_list

C code

Assembly code

Result

Reduce instruction usage

create_default_list

assembly code after modify

Pipeline feature in Ripes

Pipeline stage

IF

ID

EX

MEM

WB

Pipeline Hazard

1. Structure hazards

Solution

2. Data hazard

RAW(Read After Write)

Load-use

Solution

3. Control hazard

Solution

Reference

Read more

實作 Thread package

Cache Coherency & I/O ordering

CO Final

New EBB Strategy