owned this note
owned this note
Published
Linked with GitHub
# Lab1: Assignment1: RISC-V Assembly and Instruction
## Leetcode problem - Perfect Number
[[Leetcode 507](https://leetcode.com/problems/perfect-number/)] A **perfect number** is a positive integer that is equal to the sum of its positive divisors, excluding the number itself. A divisor of an integer x is an integer that can divide x evenly.
Given an integer n, return true if n is a perfect number, otherwise return false.
#### Example1

#### Example2

## C code
There are three parts of the code. ***MAIN*** is setting variable and keep the whole program running. ***CHECKPERFECTNUMBER*** is the main logic of checking whether the target number is a perfect number. ***FLOORSQRT*** is a function being used by ***CHECKPERFECTNUMBER***, whose purpose is to calculate the square root of a number and floor it.
```clike=
#include <stdio.h>
int floorSqrt(int x)
{
for(long int i=0;i<=x;i++)
if(((i+1)*(i+1)) > x) #find the first i+1 value, which (i+1)^2 is greater than the target number and return i
return i;
return 0;
}
int checkPerfectNumber(int num) {
if ((num & 1 == 1) || (num < 2)) {
return 0;
}
int sum_factor = 1; #1 is factor for every number, so we preadd it to the sum, and start i in the below for loop from 2
for (int i = 2; i < floorSqrt(num) + 1; i++) { #find factor of the number (the maximum i number doesnt need to exceed floorSqrt(num))
if (num % i == 0) { #remainder is 0 =>factor of num
sum_factor += i; #add the factor and its pair factor to sum
sum_factor += num / i;
}
}
if (sum_factor == num){ #if sum equals to original number, it is a perfect number and return True(1)
return 1;
}
else
{
return 0;
}
}
int main(void) #main
{
int n=28; #set variable
printf("1 is True, 0 is False:%d",checkPerfectNumber(n));
}
```
## Assembly code
```asm=
.data
str1: .string "\n 1 is Perfect Number, 0 is not ==> " #String to print
num: .word 28 #Set int equals to 28
.text
main:
lw a0,num #set value into a0
jal ra,checkPerfectNumber #jump to the checkPerfectNumber tag and link return address to next line
la a0,str1 #set string value into output string for print
li a7,4
ecall
mv a0,a1 #set result value into output string for print
li a7,1
j Exit #jump to Exit tag to end the program
floorSqrt:
add a7,zero,zero #j=0 (use register a7)
Loopfs:
bgt a7,a0,Donefs #j>n, condition of exiting for loop(didnt trigger the if statement inside)
addi t1,a7,1 #j+1
mul t1,t1,t1 #(j+1)*(j+1)
bgt t1,a0,DoneIf #(j+1)*(j+1)>num, if statement is True, jump to tag Doneif and return the value j
addi a7,a7,1 #j++ in for loop
j Loopfs #keep for loop running
Donefs:
add a7,zero,zero #a7=0 (a7 is floorSqrt(num) value)
j Exitfs #jump to exit tag of the floorsqrt function(a7=0)
DoneIf: #when (j+1)*(j+1)>num
add a7,a7,zero #assign a7 as j
Exitfs:
jr ra #back to line 47(Loop tag)
checkPerfectNumber:
addi a3,zero,1 #assign sumfactor=1
mv t2, a0 #assign parameter num value into t2 register
andi t0,t2,1 #see if num is odd (num&1), because odd number can't be perfect number
bne t0,zero,check #num & 1!=0(t0=1 is odd number, so jump back to)
slti t0,a0,2 #num<2 (t0=1)
beq t0,zero,calc #num>=2
check:
add a1,zero,zero #set a1=0(register a1 is 1 or 0, representing the result True or False)
jr ra #jump back to line 9(return address preserve by line 8 jal)
calc: #if statement is false
addi a2,zero,2 #set i=2 in register a2
addi sp,sp,-4 #preserve return address of line 9, because we are going to use another function which will overwrite the previous ra, so have to save the old address into stack
sw ra,0(sp) #save ra value into stack
jal floorSqrt #prepare the floorSqrt value for the for loop
Loop:
bge a2,a7,Donecpn #a7 is sqrt value, compare with i(a2)
rem t1,a0,a2 #get num%i value into t1
bne t1,zero,notDv #if statement num%i!=0(t1 equal to 1 means we have to skip the calculation in if)
add a3,a3,a2 #sumfactor+=i
div t1,a0,a2 #num/i
add a3,a3,t1 #sumfactor+=num/i
#get the current sum in variable sumfactor(a3)
notDv: #tag to skip the calculation in if statement
addi a2,a2,1 #i++
j Loop #keep the for loop going
Donecpn: #tag for exit for loop
lw ra,0(sp) #load line 9 address back
addi sp,sp,4 #resume the stack pointer
ResultTrue:
bne a3,a0,ResultFalse #check if sumfactor(a3) equals to num(a0)
addi a1,zero,1 #a1=1(result is True)
ret
ResultFalse:
add a1,zero,zero #a1=0(result is False)
ret
Exit:
ecall
```
## Result
**Input=28** (Input is perfect number)

**Input=27** (Input an odd number)

**Input=498** (Input an even number, which is not a perfect number)

## Psuedo Instruction
For example:
```
a4: 00012083 lw x1 0 x2
```
Here **lw x1 0 x2** is a **I-format instruction**, so the 32-bit instruction format in RISC-V looks like below:

And in the instruction we show above, **a4** represents the address in memory 000000a4. We have instruction **lw x1 0 x2**. From lw we can imply fun3=010(3-bit) and opcode(load)=0000011(7-bit). Moreover, we both use 5-bit to represent rs1 and rd, because there are 32 register in total. So from x1 we can imply rd=00001 and from x2 we can imply rs1=00010. Finally, the 12-bit of immediate is all zeros since we havn't used this section in this instruction.
Therefore, from above we can know that the 32-bit of **lw x1 0 x2**
is: 000000000000|00010|010|00001|0000011
which we convert it into hex
==>0b0000|0000|0000|0001|0010|0000|1000|0011
==>**0x00012083**
So we can know that the middle section in the format graph above, is the hex representation of 32-bit instruction.
-----------------------------------------------------------------------------
------**Memory**----------**32-bit Word**-----------------------**Psuedo**---------------------------------
------**Address**-----------**Instruction**----------------------**Instruction**-------------------------------
```
0000000 <main>:
0: 10000517 auipc x10 0x10000
4: 02552503 lw x10 37 x10
8: 04c000ef jal x1 76 <checkPerfectNumber>
c: 10000517 auipc x10 0x10000
10: ff450513 addi x10 x10 -12
14: 00400893 addi x17 x0 4
18: 00000073 ecall
1c: 00058513 addi x10 x11 0
20: 00100893 addi x17 x0 1
24: 09c0006f jal x0 156 <Exit>
00000028 <floorSqrt>:
28: 000008b3 add x17 x0 x0
0000002c <Loopfs>:
2c: 01154c63 blt x10 x17 24 <Donefs>
30: 00188313 addi x6 x17 1
34: 02630333 mul x6 x6 x6
38: 00654a63 blt x10 x6 20 <DoneIf>
3c: 00188893 addi x17 x17 1
40: fedff06f jal x0 -20 <Loopfs>
00000044 <Donefs>:
44: 000008b3 add x17 x0 x0
48: 0080006f jal x0 8 <Exitfs>
0000004c <DoneIf>:
4c: 000888b3 add x17 x17 x0
00000050 <Exitfs>:
50: 00008067 jalr x0 x1 0
00000054 <checkPerfectNumber>:
54: 00100693 addi x13 x0 1
58: 00050393 addi x7 x10 0
5c: 0013f293 andi x5 x7 1
60: 00029663 bne x5 x0 12 <check>
64: 00252293 slti x5 x10 2
68: 00028663 beq x5 x0 12 <calc>
0000006c <check>:
6c: 000005b3 add x11 x0 x0
70: 00008067 jalr x0 x1 0
00000074 <calc>:
74: 00200613 addi x12 x0 2
78: ffc10113 addi x2 x2 -4
7c: 00112023 sw x1 0 x2
80: fa9ff0ef jal x1 -88 <floorSqrt>
00000084 <Loop>:
84: 03165063 bge x12 x17 32 <Donecpn>
88: 02c56333 rem x6 x10 x12
8c: 00031863 bne x6 x0 16 <notDv>
90: 00c686b3 add x13 x13 x12
94: 02c54333 div x6 x10 x12
98: 006686b3 add x13 x13 x6
0000009c <notDv>:
9c: 00160613 addi x12 x12 1
a0: fe5ff06f jal x0 -28 <Loop>
000000a4 <Donecpn>:
a4: 00012083 lw x1 0 x2
a8: 00410113 addi x2 x2 4
000000ac <ResultTrue>:
ac: 00a69663 bne x13 x10 12 <ResultFalse>
b0: 00100593 addi x11 x0 1
b4: 00008067 jalr x0 x1 0
000000b8 <ResultFalse>:
b8: 000005b3 add x11 x0 x0
bc: 00008067 jalr x0 x1 0
000000c0 <Exit>:
c0: 00000073 ecall
```
## 5-stage Pipelined Processor Analysis
The "5-stage" means this processor using five-stage pipeline to parallelize instructions. The stages are:
1. Instruction fetch (IF)
2. Instruction decode and register fetch (ID)
3. Execute (EX)
4. Memory access (MEM)
5. Register write back (WB)

In a 5-stage processor, there are several scenarios that will change the pipeline status and register value. Here, I will explain three scenarios that I observed while my program are running.
That is:
1. **How variable i changes in for loop**
2. **What happened when a function calls another function**
3. **What happened when ecall**
### 1. How variable i changes in for loop
In this part, we'll show what happened in proccessor when variable i increments while for loop proceeding.

Now in 9c of psuedo instruction, the instruction `addi x12 x12 1` comes into the first stage of the pipeline IF. When `addi x12 x12 1` reach to EX stage, which 90: `add x13 x13 x12` just went out of WB stage. Register x13's value turn form 0x00000001 to 0x00000003, because when i=2, i is a factor of 28, so we add the value to the sum. Meanwhile, at this moment `addi x12 x12 1` is in the EX stage, which means the instruction will be executed next moment, but the value in X12 will not be change until the next moment when `addi x12 x12 1` is in the WB stage.
Moreover, when `addi x12 x12 1` is in the WB stage, the proccessor just executed the `jal x0 -28 <Loop>`, so the now in IF stage will be `bge x12 x17 32 <Donecpn>` instead of `lw x1 0 x2` in a4. And the proccessor will empty(flush) the EXE and ID stage. Finally, next moment `addi x12 x12 1` went out of the WB stage and the value in x12 will be increment by 1. This is how for loop controls its variable.

### 2. What happened when a function calls another function
In this part, we'll show what happened in code and register when we want to call a function inside another function.


Here we can see the for loop in the C code. We use floorSqrt function's value as a boundary of for loop. So we have to call floorSqrt function to get the value when we are in the checkPerfectNumber function.
Now we can see when `addi x12 x0 2` comes into pipeline, there is a `jal` on the way, which will substitute the current addressA(x2) to return the next line of the instruction where the checkPerfectNumber function is called, to another addressB that is the address of `bge x12 x17 32 <Donecpn>` when we return from the floorSqrt function. So here we use the following two instuctions `addi x2 x2 -4` and `sw x1 0 x2` to push the addressA value into stack. And When the for loop is over(we won't call the floorSqrt function), we use instructions `lw x1 0 x2` and `addi x2 x2 4` to get the addressA back in ra and resume the stack pointer.
### 3. What happened when ecall
In the last part, we'll see what happened when the ecall instruction is being executed.

Now we can see when **ecall** comes into the EX stage, it's like seeing a red light for the instructions before the ecall and green light for instructions after the ecall. Until the former 2 instructions went out of the pipeline, the pipeline resume moving forward. This can help make sure the former 2 instructions are properly done then pipeline proceeds.

## Reference
https://leetcode.com/problems/perfect-number/
https://leetcode.com/problems/sqrtx/
https://hackmd.io/@sysprog/H1TpVYMdB
https://hackmd.io/@8bFA57f7SRG-K7AAT8s62g/ryv1NT3S#%E7%AC%AC18%EF%BD%9E21%E8%AC%9B-Pipelining-75