Assignment1: RISC-V Assembly and Instruction Pipeline
contributed by < shauming1020 >
Power Function
Given an integer x and a positive number y, write a function that computes .
I referenced the Write an iterative O(Log y) function for pow(x, y) to implement a function with time complexity and extra space.
C-code
Obviously, the function contains loops and conditional branches.
RISC-V Assembly-code
.data
x: .word 3
y: .word 5
str1: .string " ^ "
str2: .string " = "
.text
main:
lw s0, x
lw s1, y
mv a0, s0
mv a1, s1
jal power
mv s2, a0
j print
power:
addi sp, sp, -12
sw ra, 0(sp)
sw s0, 4(sp)
sw s1, 8(sp)
mv s0, a0
mv s1, a1
li s2, 1
loop:
bge x0, s1, Ret
andi t0, s1, 0x1
li t1, 0x1
li ra, 0x54
beq t0, t1, odd
srli s1, s1, 1
mul s0, s0, s0
j loop
odd:
mul s2, s2, s0
jr ra
Ret:
mv a0, s2
lw ra, 0(sp)
lw s0, 4(sp)
lw s1, 8(sp)
addi sp, sp, 12
jr ra
print:
mv t0, a0
mv a0, s0
li a7, 1
ecall
la a0, str1
li a7, 4
ecall
mv a0, s1
li a7, 1
ecall
la a0, str2
li a7, 4
ecall
mv a0, s2
li a7, 1
ecall
Show

Instruction Pipeline
Pipeline contains main five stages, IF, ID, EX, MEM and WB
ISA: RV32IM
Instruction fetch (IF)
- The Program Counter (PC), is a register that holds the address that is presented to the instruction memory.
0x00000000 present the address of instruction 'auipc x8, 0x10000'
0x00000004 present the address of next instruction 'lw x8, 0(x8)'
- Next PC is calculated by PC + 4, or take the result of a branch / jump calculation as the next PC.
Instruction Decode (ID)
- Instructions are presented with 4 bytes, and it has four types in RV32I ISA.
- The indexes of these two registers are identified within the instruction, and the indexes are presented to the register memory, as the address.
EXE
- Consists of an ALU and a bit shifter, the ALU for performing boolean operations (and, or, not, nand, nor, xor, xnor) and for perfoming integer addition and subtraction.
MEM
- Access the data memory with load / store operations.
- Use MemWrite / MemRead signal to control the behavor.
WB
- Save the result to register file.
Data Hazards
- When instructions that exhibit data dependence modify data in different stages of a pipeline.
- e.g. RAW (Read After Write)
- i1. R2 <- R5 + R3
- i2. R4 <- R2 + R3
- A data dependency between i1 and i2, that may use the wrong value at R2 to compute in i2, because i1 not yet finish in WB stage.
Detection
- Forwarding detection, check the register of ID/EX and EX/MEM (MEM/WB) whether have the same register file.
- If data hazard happens, then it will take the forwarding or stall mechanism.
Forwarding (bypassing)
- When we compute the ALU result value in EXE stage, then pass the value to the next stage.
Stall
- load-use data hazard, when a destination register of a load instruction equal to the source register of a instruction that is following this load instruction, then we cannot use the forwarding to solve this type of data hazard.
- Stall one cycle to keep the forwarding mechanism working
load-use with x9 register
stall one cycle, then ALU res can pass to add x9, x8, x9
Description
main
-
lw x8, 0(x8)
-
ID, set wr_reg_idx: 0x08
-
EXE, when auipc x8 0x10000 done in EX/MEM, the alures_out will be set equal to 0x10000000(x8) and forwarding to EXE(lw), then we can compute the ALU res by x0 immediately to avoid the data hazard suffering.
-
MEM, access Data memory.
-
WB, the target register had been set to wr_reg_idx: 0x08 in ID stage, we just assign the value to the target register in this stage.


-
After WB, register x8 be set equal to the parameter x.
-
jal x1 0x24 < power >
- EXE, we can fetch the target instruction address by ALU res in this stage, and two unuseful instruction had been fetch in ID, IF, so we need to flush them in next stage.

- MEM, flush them.


power
We need to store the return address (ra) and store the saved register because we may use it.
- sw x1, 0(x2)
- IF, get the value of source register x1
- EXE, forwarding x2 computed by addi then we can get the 0(x2) in this stage.

- MEM, activate the Wr_en signal and write the value to stack memory.

loop
-
bge x0 x9 40 < Ret >
- EXE, when branch condition is met, PC will be set to ALU res, so that can fetch the correct instruction in next stage, avoid the branch hazard.

- MEM, flush two unuseful instruction with nop.

-
li ra, 0x54
- Need to set the return address with address of 'srli s1, s1, 1' before go into odd, so that can execute the correct instruction after doing the mul s2, s2, s0.

here we can see the address of 'srli x9, x9, 1' is 0x54.
Ret
Before return, we need to set the return value to a0 and restore ra, saved registers.