# Final Project - Analyze ASFRV32IM and explain how it works!
###### tags: `computer architure 2021`
## Objectives
1. ASFRV32IM is a simple implementation of RISC-V. We will compile and simulate it with Icarus Verilog.
2. Review this semester's homework, make sure its output can be sent through the UART, and run it on ASFRV32IM.
3. Discuss the datapath of ASFRV32IM and visualize its function blocks.
## Introduction
### The ASFRV32IM
ASFRV32IM is a small RISC-V RV32IM implementation written in 250 lines for iverilog. It is compliant with the RISC-V Unprivileged ISA 20191213. It has a simple character output function (UART) and a program cycle counter, both accessed through memory-mapped I/O, and it can run the CoreMark and Dhrystone benchmarks.
## Set up the development environment
First, we get the package from GitHub:
```
$ git clone https://github.com/asfdrwe/ASFRV32IM.git
```
and compile it with iverilog:
```
$ iverilog -o RV32IM RV32IM.v RV32IM_test.v
```
Run it in CPU dump mode:
```
$ ./RV32IM
```
We can see the result shown as follows:

We can also view the output waveform by loading the .vcd file into GTKWave:

This is our basic test of ASFRV32IM, and everything seems to work.
The instructions executed here were loaded from `test.hex`:

This file is in the ASFRV32IM folder.
Next, we will write machine code, assembly code, and C code in this environment.
## Running code on this platform
We can load our code onto it in several ways.
### Hand-written machine code
Example: `lui x1, 0x20000`
We can look up the encoding in the RISC-V instruction set manual (listed in the references),
and it will look like this:
```
0b_0010_0000_0000_0000_0000_00001_0110111 (binary)
```
We convert it into hex:
```
0x_20_00_00_B7 (hexadecimal)
```
And we write `test.hex`, one byte per line, as
B7
00
00
20
Note that the bytes are stored in little-endian order.
The output is:
`0: PC = 00000000, OPCODE = 200000b7, ALU_DATA = 20000000, UART = 00`
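The hand encoding above can be cross-checked with a few lines of Python. This is only a minimal sketch: the helper names `encode_lui` and `to_hex_lines` are made up for illustration and are not part of ASFRV32IM.

```python
def encode_lui(rd, imm20):
    """U-type layout: imm[31:12] | rd | opcode (0110111 for LUI)."""
    assert 0 <= rd < 32 and 0 <= imm20 < (1 << 20)
    return (imm20 << 12) | (rd << 7) | 0b0110111


def to_hex_lines(word):
    """Split a 32-bit word into little-endian bytes, one per test.hex line."""
    return [f"{b:02X}" for b in word.to_bytes(4, "little")]


word = encode_lui(rd=1, imm20=0x20000)
print(f"{word:#010x}")                # 0x200000b7
print("\n".join(to_hex_lines(word)))  # B7, 00, 00, 20 on separate lines
```

The lowest byte (`B7`) comes first, which is why the file looks reversed compared with the hexadecimal word.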
But writing machine code by hand is too tedious for anything we really want to implement, so next we write assembly code by hand and convert it into machine code with `riscv-gnu-toolchain`.
### Put assembly into ASFRV32IM
Here we put the assembly code from homework 1 into our `ASFRV32IM`.
First, we use the [RISC-V GNU Compiler Toolchain](https://github.com/riscv-collab/riscv-gnu-toolchain) and install it as described on its webpage.
ASFRV32IM is a bare-metal target, so we need a linker script such as `link.ld`, because an ASFRV32IM program starts at address 0.
```
OUTPUT_ARCH( "riscv" )
ENTRY(_start)
SECTIONS
{
. = 0x00000000;
.text : { *(.text) }
. = ALIGN(0x0100);
.sbss : { *(.sbss) }
.bss : { *(.bss) }
.sdata : { *(.sdata) }
.data : { *(.data) }
_end = .;
}
```
and we assemble as follows (the output ELF is named `test1`, which is the file `objcopy` reads in the next step):
`$ riscv64-unknown-elf-gcc -o test1 test1.S -march=rv32g -mabi=ilp32 -nostdlib -nostartfiles -T ./link.ld`
We then have to convert it into hex text, so we need [freedom-bin2hex.py](https://github.com/sifive/elf2hex/blob/master/freedom-bin2hex.py).
`$ riscv64-unknown-elf-objcopy -O binary test1 test1.bin`
`$ python freedom-bin2hex.py -w 8 test1.bin test.hex`
Then we run our `ASFRV32IM`:
`$ ./RV32IM`
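To make the binary-to-hex step less of a black box, here is a minimal stand-in for the conversion. It is not the SiFive script: the function name `bin_to_hex_lines` is hypothetical, and the output format (one little-endian word of the given width per line) is an assumption based on the `test.hex` example shown earlier.

```python
def bin_to_hex_lines(data, width_bits=8):
    """Format raw bytes as hex text, one word of width_bits per line.

    Assumption: each word is read as little-endian, so width 8 gives
    one byte per line in memory order.
    """
    step = width_bits // 8
    lines = []
    for i in range(0, len(data), step):
        chunk = data[i:i + step].ljust(step, b"\x00")  # pad the last word
        value = int.from_bytes(chunk, "little")
        lines.append(f"{value:0{step * 2}x}")
    return lines


raw = bytes.fromhex("b7000020")      # lui x1, 0x20000 as stored in memory
print(bin_to_hex_lines(raw, 8))      # ['b7', '00', '00', '20']
print(bin_to_hex_lines(raw, 32))     # ['200000b7']
```

With a width of 8, the result has the same one-byte-per-line shape as the hand-written `test.hex`.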
### Put C into `ASFRV32IM`
Here is the C code that we want to put into `ASFRV32IM`.
The C code from homework 1:
```=c
#include <limits.h> // for INT_MAX

int maxProfit(int* prices, int pricesSize){
    int min_price=INT_MAX;
    int p_gap=0;
    for (int i=0;i<pricesSize;i++){
        if(prices[i]<=min_price)
            min_price=prices[i]; //update min_price value
        if(prices[i]-min_price>=p_gap)
            p_gap=prices[i]-min_price; //update the newest price gap
    }
    return p_gap;
}
```
We then add a `main` function to it, just like in our homework 1.
```=c
#include <stdio.h>
#include <limits.h>
int maxProfit(int* prices, int pricesSize){
    int min_price=INT_MAX;
    int p_gap=0;
    for (int i=0;i<pricesSize;i++){
        if(prices[i]<=min_price)
            min_price=prices[i]; //update min_price value
        if(prices[i]-min_price>=p_gap)
            p_gap=prices[i]-min_price; //update the newest price gap
    }
    return p_gap;
}
int main(){
    int prices[] = {7,1,5,3,6,4};
    printf("%d\n", maxProfit(prices,6));
    return 0;
}
```
And this is the assembly code:
```=asm
main:
    lui  s1, 0x0
    addi t0, x0, 7
    addi t1, x0, 1
    addi t2, x0, 5
    addi t3, x0, 3
    addi t4, x0, 6
    addi t5, x0, 4
    addi ra, x0, 0
    sw   t0, 0(s1)       # s1 = pointer to prices[0]
    sw   t1, 4(s1)
    sw   t2, 8(s1)
    sw   t3, 12(s1)
    sw   t4, 16(s1)
    sw   t5, 20(s1)
    li   s4, -1          # s4 = -1
    srli s4, s4, 1       # logical shift right by 1 gives the largest signed value (min_price = INT_MAX)
    addi s3, x0, 6       # s3 = pricesSize
    add  t0, x0, x0      # i = 0
    add  s2, x0, x0      # s2 = p_gap = 0
    jal  ra, loop        # jump to loop and save the address of "jal ra, print" in ra
    jal  ra, print       # jump to print and save the address of "lw t0, 0(s2)" in ra
    lw   t0, 0(s2)
    li   a7, 10          # end the program
    ecall
loop:
    lw   t1, 0(s1)       # t1 = prices[i]
    bgt  t1, s4, if2     # if (prices[i] > min_price) then branch
    add  s4, x0, t1      # min_price = prices[i]
if2:
    sub  t2, t1, s4      # prices[i] - min_price
    blt  t2, s2, if3     # if (prices[i] - min_price < p_gap) then branch
    sub  s2, t1, s4      # p_gap = prices[i] - min_price
if3:
    addi s1, s1, 4       # prices++ (move the address forward)
    addi t0, t0, 1       # i++
    blt  t0, s3, loop    # loop while i < array length
    jr   ra              # else, return to main
print:
    addi t0, s2, 0       # copy the result into t0 (visible in the waveform)
    add  a0, s2, x0      # load result
    li   a7, 1           # print integer
    ecall
    jr   ra              # go back to main
```
And we compile it with the following command:
`$ riscv64-unknown-elf-gcc -o test1 test1.S -march=rv32g -mabi=ilp32 -nostdlib -nostartfiles -T ./link.ld`
We also have to convert it into hexadecimal text.
```
$ riscv64-unknown-elf-objcopy -O binary test1 test1.bin
$ python freedom-bin2hex.py -w 8 test1.bin test.hex
```
Run it!
`$ ./RV32IM`
And this is the result:

In the output waveform, when we execute `addi t0, s2, 0` (which can be identified from the opcode signal), the write data is 5:

This is the same as the solution on LeetCode.
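As a quick sanity check of the expected value, the same algorithm can be sketched on the host in Python. The function name `max_profit` is ours, and `float("inf")` stands in for `INT_MAX`.

```python
def max_profit(prices):
    """Same logic as the C maxProfit: track the lowest price and the best gap."""
    min_price = float("inf")  # stands in for INT_MAX
    p_gap = 0
    for p in prices:
        if p <= min_price:
            min_price = p                # update min_price value
        if p - min_price >= p_gap:
            p_gap = p - min_price        # update the newest price gap
    return p_gap


print(max_profit([7, 1, 5, 3, 6, 4]))  # 5
```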

## Observation
We will use the example code to analyze the function blocks of this RISC-V machine.
This is our ASFRV32IM block diagram:

### Example 1 : addi instruction
First, we will discuss an I-type instruction.
For example: `addi x5 x0 7`
The machine code is laid out like this:
| imm[11:0] | rs1 | funct3 | rd | opcode |
| -------- | -------- | -------- | -------- | -------- |
| 000000000111 (immediate 7) | 00000 (x0) | 000 | 00101 (x5) | 0010011 (0x13) |
Let us see what happens when we execute this instruction. First, we write the machine code in hex:
`0x0070_0293`
We put it into our `ASFRV32IM` datapath and look at the result.
To analyze the result, we discuss the I-format in the RISC-V architecture.
The figure below shows the I-format instruction dataflow and the control signals in our `ASFRV32IM` project.
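The I-type field layout can be checked by packing the fields in Python. This is a small illustrative sketch; `encode_i_type` is a hypothetical helper, not code from the project.

```python
def encode_i_type(imm12, rs1, funct3, rd, opcode):
    """I-type layout: imm[11:0] | rs1 | funct3 | rd | opcode."""
    return ((imm12 & 0xFFF) << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


# addi x5, x0, 7: funct3 = 000, opcode = 0010011 (0x13)
word = encode_i_type(imm12=7, rs1=0, funct3=0b000, rd=5, opcode=0b0010011)
print(f"{word:#010x}")  # 0x00700293
```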

Because the `ASFRV32IM` datapath is a single-cycle machine, every instruction executes in one cycle, which means everything described below happens within a single cycle!
First, the `pc` (program counter) holds the address of the current instruction in instruction memory. In this example it is `0x00000004`.
The output of the instruction memory is the machine code of the instruction that the pc points to. In this example, the machine code is stored in the opcode register, and its value is `0x0070_0293`.


The decoder then generates the control signals that drive components such as the register file, op1sel, op2sel, the ALU, and so on.
The decoder also extracts the register address of `rs` and the immediate value.


After the register file receives the address of `rs`, it outputs that register's value, which is selected by op1sel; on the other side, the immediate value passes through op2sel.


With the operation selected by the alucon signal, the ALU executes the `add` operation. Because this instruction does not store anything into data memory, we just pass the value through `wb_sel` and write it back to the register file.


The PC (program counter) part:
Because this instruction does not branch, the pc value simply increases by 4 (a word equals 4 bytes here), with pc_sel2 = 00.
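The steps above (fetch, decode, operand select, ALU, write back, pc + 4) can be summarized in a small behavioural model. This is only a sketch of the idea in Python, not the Verilog in `RV32IM.v`; the function names and the dictionary used as instruction memory are assumptions made for illustration.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def step_addi(pc, regs, imem):
    """Execute one ADDI in a single cycle and return next_pc."""
    instr = imem[pc]                        # instruction fetch
    opcode = instr & 0x7F                   # decode the fields
    rd = (instr >> 7) & 0x1F
    funct3 = (instr >> 12) & 0x7
    rs1 = (instr >> 15) & 0x1F
    imm = sign_extend(instr >> 20, 12)
    assert opcode == 0b0010011 and funct3 == 0b000, "only ADDI is modelled"
    op1 = regs[rs1]                         # op1sel picks the register value
    op2 = imm                               # op2sel picks the immediate
    alu_data = (op1 + op2) & 0xFFFFFFFF     # ALU performs the add
    if rd != 0:                             # x0 is hard-wired to zero
        regs[rd] = alu_data                 # write back to the register file
    return pc + 4                           # no branch: next_pc = pc + 4


regs = [0] * 32
imem = {0x4: 0x00700293}                    # addi x5, x0, 7 at pc = 4
next_pc = step_addi(0x4, regs, imem)
print(regs[5], hex(next_pc))                # 7 0x8
```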


### Example 2 : store word instruction
Another instruction example is store word:
`sw x5 0(x9)`
The machine code of this instruction is as follows:
`0x0054a023`
| imm[11:5] | rs2 | rs1 | funct3 | imm[4:0] | opcode |
| -------- | -------- | -------- | -------- | -------- | -------- |
| 0000000 (offset [11:5] part) | 00101 (rs2) | 01001 (rs1) | 010 (sw) | 00000 (offset [4:0] part) | 0100011 (store) |
The dataflow in the datapath:

Waveform output:

The difference between the immediate instruction and a load/store instruction is that here the ALU calculates the address of the memory location we want to store to.
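A short Python sketch can illustrate both the S-type encoding and the address calculation. The helpers `encode_s_type` and `store_word` are hypothetical, and the register values (x9 = 0, x5 = 7) are taken from the first `sw t0, 0(s1)` in the program above.

```python
def encode_s_type(imm12, rs2, rs1, funct3, opcode=0b0100011):
    """S-type layout: imm[11:5] | rs2 | rs1 | funct3 | imm[4:0] | opcode."""
    imm12 &= 0xFFF
    return ((imm12 >> 5) << 25) | (rs2 << 20) | (rs1 << 15) \
        | (funct3 << 12) | ((imm12 & 0x1F) << 7) | opcode


def store_word(regs, dmem, rs1, rs2, imm):
    """The ALU adds base and offset; the result is used as the store address."""
    addr = (regs[rs1] + imm) & 0xFFFFFFFF
    dmem[addr] = regs[rs2]
    return addr


word = encode_s_type(imm12=0, rs2=5, rs1=9, funct3=0b010)  # sw x5, 0(x9)
regs = [0] * 32
regs[5] = 7                                # t0 holds prices[0]
dmem = {}
addr = store_word(regs, dmem, rs1=9, rs2=5, imm=0)
print(f"{word:#010x}", addr, dmem)         # 0x0054a023 0 {0: 7}
```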
### Example 3 : branch instruction
Here we discuss the branch instruction:
`blt x5 x19 -32 <loop>`
The distance between `blt` and `loop` is -32 (pc-relative).


The machine code in hex format:
`0xff32_c0e3`
And this is `-32` in two's-complement binary, which we store in the immediate field (bit 0 of a branch offset is always 0 and is not encoded):
`imm[12:0]=1_1111_1110_0000`
| {imm[12],imm[10:5]} | rs2 | rs1 | BLT | {imm[4:1],imm[11]} | Branch |
| -------- | -------- | -------- |-------- | -------- | -------- |
| 1111111 | 10011 | 00101 | 100 | 00001 | 1100011 |
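The scrambled immediate bits of the B-type format are easy to get wrong by hand, so here is a minimal Python sketch that packs them. The helper name `encode_b_type` is an assumption for illustration only.

```python
def encode_b_type(offset, rs2, rs1, funct3, opcode=0b1100011):
    """B-type layout: imm[12|10:5] | rs2 | rs1 | funct3 | imm[4:1|11] | opcode."""
    assert offset % 2 == 0 and -4096 <= offset < 4096
    imm = offset & 0x1FFF                      # 13-bit two's complement
    return (((imm >> 12) & 0x1) << 31) \
        | (((imm >> 5) & 0x3F) << 25) \
        | (rs2 << 20) | (rs1 << 15) | (funct3 << 12) \
        | (((imm >> 1) & 0xF) << 8) \
        | (((imm >> 11) & 0x1) << 7) \
        | opcode


# blt x5, x19, -32: funct3 = 100
word = encode_b_type(offset=-32, rs2=19, rs1=5, funct3=0b100)
print(f"{word:#010x}")  # 0xff32c0e3
```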
What makes a branch instruction different from the other instructions is how it handles the pc value.
A branch instruction can change the pc value, and we will go through this in detail.
Here is the datapath and controller diagram we mentioned before, with the data flow marked:

First, the `pc` value comes from the `pc register`, as mentioned before. The pc value is the address of the instruction we are looking for, so we use it to fetch the instruction (the `opcode[31:0]` part).

Second, after the instruction is decoded, the data and the control signals start their work.
Let us check our instruction again:
`blt x5 x19 -32 <loop>`
This instruction has to compare the values of the x5 and x19 registers, so we have to feed those values into the branch controller. After decoding the instruction we know which registers to use: for `blt x5 x19 -32 <loop>`, `r_addr1[5:0]` and `r_addr2[5:0]` carry the addresses of x5 and x19, which are 00101 and 10011. We can verify this by inspecting the machine code `0xff32_c0e3`:
| rs2 | rs1 |
| -------- |-------- |
| 10011 | 00101 |
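The register fields and the branch offset can be read back out of `0xff32_c0e3` with a few shifts and masks. This is a sketch of what the decoder does logically; `decode_b_type` is a hypothetical name.

```python
def decode_b_type(instr):
    """Return (rs1, rs2, offset) of a B-type instruction word."""
    rs1 = (instr >> 15) & 0x1F
    rs2 = (instr >> 20) & 0x1F
    imm = (((instr >> 31) & 0x1) << 12) \
        | (((instr >> 7) & 0x1) << 11) \
        | (((instr >> 25) & 0x3F) << 5) \
        | (((instr >> 8) & 0xF) << 1)
    if imm & 0x1000:                 # sign bit set: make it negative
        imm -= 0x2000
    return rs1, rs2, imm


rs1, rs2, offset = decode_b_type(0xFF32C0E3)
print(f"rs1 = x{rs1} ({rs1:05b}), rs2 = x{rs2} ({rs2:05b}), offset = {offset}")
# rs1 = x5 (00101), rs2 = x19 (10011), offset = -32
```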
The data flow:

Then the branch controller, which receives the values of x5 and x19, decides whether to branch or not.
Now we discuss the pc value. Using `opsel1` and `opsel2`, we select the current `pc value` and the `imm. value` as the inputs of our ALU.

If the branch controller decides to branch, the pc value calculated by the ALU is driven onto the `next_pc` wire by setting `pc_sel2` to 1.
If the controller decides not to branch, `next_pc` remains the current pc value plus 4 as usual, with `pc_sel2` set to 0.

This is the waveform of this instruction:

Finally, we analyze the waveform. This is a blt instruction, so it branches when the `r_data1` value is less than `r_data2`. Here `r_data1` is `1` and `r_data2` is `6`, so `next_pc` takes the value of `alu_data` (the calculated result), where `alu_data` is the sum of `s_data1` and `s_data2`. As shown in the waveform, the pc value at the next rising edge will be `0x0000_0060`, which is our branch target.
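The `next_pc` selection described above can be sketched as follows. This is a behavioural illustration in Python, not the project's Verilog. The pc of the `blt` instruction is not stated in the text; `0x80` is derived here from the target `0x60` and the offset -32, and the register values are treated as signed integers.

```python
def next_pc_blt(pc, imm, r_data1, r_data2):
    """Model of the branch decision for blt and the resulting next_pc."""
    taken = r_data1 < r_data2                 # branch controller: signed less-than
    alu_data = (pc + imm) & 0xFFFFFFFF        # ALU adds pc and the immediate
    pc_sel2 = 1 if taken else 0               # 1 selects the ALU result
    return alu_data if pc_sel2 else (pc + 4) & 0xFFFFFFFF


# Values from the waveform: r_data1 = 1, r_data2 = 6, offset = -32
print(hex(next_pc_blt(0x80, -32, 1, 6)))      # 0x60 (branch taken)
print(hex(next_pc_blt(0x80, -32, 6, 6)))      # 0x84 (fall through to pc + 4)
```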
## Conclusion and some feedback on this semester
In this term project, we ran our C homework on `ASFRV32IM` after compiling it with `riscv-gnu-toolchain`.
We also discussed three main RISC-V instruction types: the immediate type, the load/store type, and the branch type. Walking through them on the datapath diagram helps reinforce the concepts behind RISC-V instructions.
Finally, thanks to Jserv and TA Steven. For us EE students, this was a different kind of Computer Architecture course, taught from the software perspective, and it really helped us a lot!
(Although the quizzes and the homework were sooo difficult QQ).
## Reference
https://inst.eecs.berkeley.edu/~cs61c/resources/su18_lec/Lecture7.pdf
https://github.com/asfdrwe/ASFRV32IM
https://github.com/riscv-collab/riscv-gnu-toolchain
https://riscv.org/wp-content/uploads/2017/05/riscv-spec-v2.2.pdf
Some course materials in the course page
http://wiki.csie.ncku.edu.tw/arch/schedule