# Assignment3: SoftCPU
###### tags: `Computer Architecture` `RISC-V` `jserv`
> Contributed by <[tonych1997](https://github.com/tonych1997/Computer-Architecture)>
[Assignment3 Requirements](https://hackmd.io/@sysprog/2022-arch-homework3)
[Lab3: srv32 - RISCV RV32IM Soft CPU](https://hackmd.io/@sysprog/S1Udn1Xtt)
[Github - srv32](https://github.com/sysprog21/srv32)
:::info
2022.12.12 This assignment is almost complete =)
2022.12.07 My environment installation was successful! I am in the process of completing my subsequent content.
2022.11.30 I spent A LOT OF TIME building the environment, but now it's still a failure, and I'll add the content after I succeed.
:::
## Old Environment Installation
**Originally I used `Windows wsl Ubuntu 20.04`** to install the environment, but I ran into many problems in the process.
On the TA's advice, I **switched to `Linux Ubuntu 20.04`**, where I still hit many errors.
I found that the order of the installation steps, and whether certain packages are installed at all, affects the result. The steps that finally succeeded are recorded below, and the failed attempts are recorded in another HackMD note: [Assignment3: RISC-V Enviroment Building old (failed) version](https://hackmd.io/@Vgwl_uixQFasIvsDbsFlvA/risc-v-hw3_old).
## Install environment
I use `Linux Ubuntu 20.04` to build the environment.
### Set environment variables
:::info
* If an unexpected error occurs during execution or installation, refer to this section to resolve it.
* If this is your FIRST installation, start at the [first step](https://hackmd.io/LJeLMHbrRnqaLhL7LCQNkw?both#1-Update-Ubuntu).
:::
Confirm the Environment Variables and PATH settings.
```
$ export # Check PATH, CROSS_COMPILE, VERILATOR_ROOT etc.
$ echo $PATH # check PATH only
```
Exported variables only live in the current shell session, so shutting down, rebooting, closing or restarting the terminal, or opening a different terminal loses the environment variables and the PATH setting.
If an environment variable is missing, run the following commands to set it again.
```
# Set Cross Compile and Verilator root
$ export CROSS_COMPILE=riscv-none-elf-
$ export VERILATOR_ROOT=$HOME/verilator
```
If the PATH is missing, run the following command to reset it.
```
# Set RISC-V toolchain PATH
$ cd $HOME/riscv-none-elf-gcc
$ echo "export PATH=`pwd`/bin:$PATH" > setenv
$ cd $HOME
$ source riscv-none-elf-gcc/setenv
```
```
# Set verilator PATH
$ export PATH=$VERILATOR_ROOT/bin:$PATH
```
### 1. Update Ubuntu
First, update the Ubuntu package lists.
```
$ sudo apt-get update
```
### 2. RISC-V toolchain
The RISC-V toolchain (riscv-none-elf-gcc) was already installed in Assignment 2, so the toolchain installation procedure follows my [notes](https://hackmd.io/@Vgwl_uixQFasIvsDbsFlvA/risc-v-hw2) from Assignment 2. Alternatively, refer to [Lab2: RISC-V RV32I[MACF] emulator with ELF support](https://hackmd.io/@sysprog/SJAR5XMmi).
In other words, DO NOT run the commands that install xpack, riscv-gnu-toolchain, etc. again in Assignment 3.
First, check that the cross compiler setting and the PATH are still present in the current shell.
If either is missing, set it again with the commands below.
```
$ export # Check PATH, CROSS_COMPILE, VERILATOR_ROOT etc.
$ echo $PATH # check PATH only
```
Set the cross compiler environment variable.
:::warning
1. Your toolchain may be different. Verify that the variables match the toolchain you actually installed (e.g. riscv-gnu-toolchain).
2. The toolchain was renamed to [riscv-none-elf-gcc](https://xpack.github.io/blog/2022/08/30/riscv-none-elf-gcc-v12-2-0-1-released/#risc-none-elf-gcc) (older versions were named riscv-none-embed-gcc).
:::
```
# Set cross compiler
$ export CROSS_COMPILE=riscv-none-elf-
```
```
# Set RISC-V toolchain PATH
$ cd $HOME/riscv-none-elf-gcc
$ echo "export PATH=`pwd`/bin:$PATH" > setenv
$ cd $HOME
$ source riscv-none-elf-gcc/setenv
```
### 3. Install all required packages
```
$ sudo apt-get install lcov
$ sudo apt install curl
$ curl -sL https://deb.nodesource.com/setup_14.x | sudo -E bash -
$ sudo apt-get install gcc g++ make
$ sudo apt-get update && sudo apt-get install yarn
$ sudo apt install -y nodejs
$ sudo apt install build-essential
```
```
$ sudo apt install build-essential lcov ccache libsystemc-dev
```
### 4. Install verilator
Install Verilator according to [srv32 - RISCV RV32IM Soft CPU](https://hackmd.io/@sysprog/S1Udn1Xtt) and the [Verilator official installation document](https://verilator.org/guide/latest/install.html).
The Lab3 document omits the `autoconf` command. It should be executed before `` export VERILATOR_ROOT=`pwd` ``.
The `make` command takes a while to finish. When it is done, the terminal shows the messages `build complete` and `Now type 'make test' to test`.
```
$ cd $HOME
$ git clone https://github.com/verilator/verilator
$ cd verilator
$ git checkout stable # do not run this with sudo
$ sudo apt-get install autoconf
$ autoconf # create ./configure script
$ export VERILATOR_ROOT=`pwd`
$ sudo apt-get install flex bison # package needed in ./configure
$ ./configure
$ make
```
Running `make test` afterwards shows these messages in the terminal.
```
$ make test
Tests passed!
Now type 'make install' to install.
Or type 'make' inside an examples subdirectory.
```
After `make test`, do `make install`.
```
$ make install
```
Then set the environment variables.
```
$ export VERILATOR_ROOT=$HOME/verilator
$ export PATH=$VERILATOR_ROOT/bin:$PATH
```
And check the Verilator version.
```
$ verilator --version
Verilator 5.002 2022-10-29 rev v5.002-29-gdb39d70c7
```
### 5. Get srv32
I referred to [srv32's Github](https://hackmd.io/@wIVnCcUaTouAktrkMVLEMA/Hy6vD5DtF) to get srv32 and ran `make` in its directories.
#### Clone srv32 files.
```
$ cd ~/
$ git clone https://github.com/sysprog21/srv32
```
:::warning
Run only the commands described below, i.e. there is NO NEED to install or set up the RISC-V toolchain again by following the [building toolchain](https://github.com/sysprog21/srv32#building-toolchains) steps in the srv32 README.
:::
Build the simulator.
```
$ cd ~/srv32/tools
$ make
$ cd ~/srv32/sim
$ make
```
Then build the test cases.
:::warning
1. DO NOT execute `make tests` in `~/srv32/tests`, or you will get error messages.
2. DO NOT add `sudo` to the `make tests` command. I found that with `sudo` a different `gcc` is called instead of `riscv-none-elf-gcc`, most likely because `sudo` does not pass the current `PATH` and `CROSS_COMPILE` settings on to the command.
:::
```
$ cd ~/srv32
$ make tests
```

```
$ cd ~/srv32
$ make tests-sw
```

### 6. Install GTKWave
Install GTKWave with the following command, then run the simulation with waveform dumping enabled.
```
$ sudo apt-get install gtkwave
$ cd sim && ./sim +dump
```
Then use Ubuntu's file manager to go to the `sim` directory (`~/srv32/sim`) and double-click the `wave.fst` file to open it in GTKWave.
## Q1: Assignment 2 - Search Insert Position
Here is the code I used in my [Assignment 2](https://hackmd.io/@Vgwl_uixQFasIvsDbsFlvA/risc-v-hw2).
### Code
The C code is in the collapsed Details block below.
:::spoiler
```clike=
#include <stdio.h>
int searchInsert(int *nums, int numsSize, int target) {
    int left = 0;
    int right = numsSize - 1;
    while (left <= right)
    {
        int mid = (left + right) / 2;
        if (target == nums[mid])
            return mid;
        else if (target < nums[mid])
            right = mid - 1;
        else
            left = mid + 1;
    }
    return left;
}
int main()
{
    int data[] = {1, 3, 5, 6};
    int size = 4;
    int tar1 = 5, tar2 = 2, tar3 = 7;
    int index1 = searchInsert(data, size, tar1);
    int index2 = searchInsert(data, size, tar2);
    int index3 = searchInsert(data, size, tar3);
    printf("The target1 insert position is %d\n", index1);
    printf("The target2 insert position is %d\n", index2);
    printf("The target3 insert position is %d\n", index3);
    return 0;
}
```
:::
---
The assembly code generated for srv32 is in the collapsed Details blocks below.
Code of `searchInsert`:
:::spoiler
```
0000003c <searchInsert>:
3c: fff58593 addi a1,a1,-1
40: 00050693 mv a3,a0
44: 0405c263 bltz a1,88 <searchInsert+0x4c>
48: 00000713 li a4,0
4c: 00b70533 add a0,a4,a1
50: 40155513 srai a0,a0,0x1
54: 00251793 slli a5,a0,0x2
58: 00f687b3 add a5,a3,a5
5c: 0007a783 lw a5,0(a5)
60: 00c78a63 beq a5,a2,74 <searchInsert+0x38>
64: 00f65a63 bge a2,a5,78 <searchInsert+0x3c>
68: fff50593 addi a1,a0,-1
6c: fee5d0e3 bge a1,a4,4c <searchInsert+0x10>
70: 00070513 mv a0,a4
74: 00008067 ret
78: 00150713 addi a4,a0,1
7c: fce5d8e3 bge a1,a4,4c <searchInsert+0x10>
80: 00070513 mv a0,a4
84: ff1ff06f j 74 <searchInsert+0x38>
88: 00000513 li a0,0
8c: 00008067 ret
```
:::
Code of `main`:
:::spoiler
```
00000090 <main>:
90: 000207b7 lui a5,0x20
94: 0fc78793 addi a5,a5,252 # 200fc <environ+0x70>
98: 0007a603 lw a2,0(a5)
9c: 0047a683 lw a3,4(a5)
a0: 0087a703 lw a4,8(a5)
a4: 00c7a783 lw a5,12(a5)
a8: fe010113 addi sp,sp,-32
ac: 00c12023 sw a2,0(sp)
b0: 00d12223 sw a3,4(sp)
b4: 00112e23 sw ra,28(sp)
b8: 00812c23 sw s0,24(sp)
bc: 00912a23 sw s1,20(sp)
c0: 00e12423 sw a4,8(sp)
c4: 00f12623 sw a5,12(sp)
c8: 00300693 li a3,3
cc: 00000593 li a1,0
d0: 00500613 li a2,5
d4: 00b687b3 add a5,a3,a1
d8: 4017d793 srai a5,a5,0x1
dc: 00279713 slli a4,a5,0x2
e0: 01070713 addi a4,a4,16
e4: 00270733 add a4,a4,sp
e8: ff072703 lw a4,-16(a4)
ec: 12c70c63 beq a4,a2,224 <main+0x194>
f0: 10e65863 bge a2,a4,200 <main+0x170>
f4: fff78693 addi a3,a5,-1
f8: fcb6dee3 bge a3,a1,d4 <main+0x44>
fc: 00300693 li a3,3
100: 00000493 li s1,0
104: 00200613 li a2,2
108: 00d487b3 add a5,s1,a3
10c: 4017d793 srai a5,a5,0x1
110: 00279713 slli a4,a5,0x2
114: 01070713 addi a4,a4,16
118: 00270733 add a4,a4,sp
11c: ff072703 lw a4,-16(a4)
120: 0cc70c63 beq a4,a2,1f8 <main+0x168>
124: 0ae65863 bge a2,a4,1d4 <main+0x144>
128: fff78693 addi a3,a5,-1
12c: fc96dee3 bge a3,s1,108 <main+0x78>
130: 00000413 li s0,0
134: 00300693 li a3,3
138: 00700613 li a2,7
13c: 00d407b3 add a5,s0,a3
140: 4017d793 srai a5,a5,0x1
144: 00279713 slli a4,a5,0x2
148: 01070713 addi a4,a4,16
14c: 00270733 add a4,a4,sp
150: ff072703 lw a4,-16(a4)
154: 06c70c63 beq a4,a2,1cc <main+0x13c>
158: 04e65863 bge a2,a4,1a8 <main+0x118>
15c: fff78693 addi a3,a5,-1
160: fc86dee3 bge a3,s0,13c <main+0xac>
164: 00020537 lui a0,0x20
168: 09050513 addi a0,a0,144 # 20090 <environ+0x4>
16c: 100000ef jal ra,26c <printf>
170: 00020537 lui a0,0x20
174: 00048593 mv a1,s1
178: 0b450513 addi a0,a0,180 # 200b4 <environ+0x28>
17c: 0f0000ef jal ra,26c <printf>
180: 00020537 lui a0,0x20
184: 00040593 mv a1,s0
188: 0d850513 addi a0,a0,216 # 200d8 <environ+0x4c>
18c: 0e0000ef jal ra,26c <printf>
190: 01c12083 lw ra,28(sp)
194: 01812403 lw s0,24(sp)
198: 01412483 lw s1,20(sp)
19c: 00000513 li a0,0
1a0: 02010113 addi sp,sp,32
1a4: 00008067 ret
1a8: 00178413 addi s0,a5,1
1ac: fa86cce3 blt a3,s0,164 <main+0xd4>
1b0: 00d407b3 add a5,s0,a3
1b4: 4017d793 srai a5,a5,0x1
1b8: 00279713 slli a4,a5,0x2
1bc: 01070713 addi a4,a4,16
1c0: 00270733 add a4,a4,sp
1c4: ff072703 lw a4,-16(a4)
1c8: f8c718e3 bne a4,a2,158 <main+0xc8>
1cc: 00078413 mv s0,a5
1d0: f95ff06f j 164 <main+0xd4>
1d4: 00178493 addi s1,a5,1
1d8: f496cce3 blt a3,s1,130 <main+0xa0>
1dc: 00d487b3 add a5,s1,a3
1e0: 4017d793 srai a5,a5,0x1
1e4: 00279713 slli a4,a5,0x2
1e8: 01070713 addi a4,a4,16
1ec: 00270733 add a4,a4,sp
1f0: ff072703 lw a4,-16(a4)
1f4: f2c718e3 bne a4,a2,124 <main+0x94>
1f8: 00078493 mv s1,a5
1fc: f35ff06f j 130 <main+0xa0>
200: 00178593 addi a1,a5,1
204: eeb6cce3 blt a3,a1,fc <main+0x6c>
208: 00b687b3 add a5,a3,a1
20c: 4017d793 srai a5,a5,0x1
210: 00279713 slli a4,a5,0x2
214: 01070713 addi a4,a4,16
218: 00270733 add a4,a4,sp
21c: ff072703 lw a4,-16(a4)
220: ecc718e3 bne a4,a2,f0 <main+0x60>
224: 00078593 mv a1,a5
228: ed5ff06f j fc <main+0x6c>
```
:::
---
### Modify Makefile
I created the C source file for Assignment 2 (search insert position) under `~/srv32/sw/sip`.
Then I wrote its `Makefile` based on the one in `~/srv32/sw/hello`, changing `SRC` and `TARGET` to my file name `sip`.
Note: The directory name is the same as the source file name (`sip`).
```
include ../common/Makefile.common
EXE = .elf
SRC = sip.c
CFLAGS += -L../common
LDFLAGS += -T ../common/default.ld
TARGET = sip
OUTPUT = $(TARGET)$(EXE)
.PHONY: all clean
all: $(TARGET)

$(TARGET): $(SRC)
	$(CC) $(CFLAGS) -o $(OUTPUT) $(SRC) $(LDFLAGS)
	$(OBJCOPY) -j .text -O binary $(OUTPUT) imem.bin
	$(OBJCOPY) -j .data -O binary $(OUTPUT) dmem.bin
	$(OBJCOPY) -O binary $(OUTPUT) memory.bin
	$(OBJDUMP) -d $(OUTPUT) > $(TARGET).dis
	$(READELF) -a $(OUTPUT) > $(TARGET).symbol

clean:
```
---
### Result in srv32
Run `make sip` in the srv32 root directory (`~/srv32`). The result is shown below.

The full log is in the collapsed Details block.
:::spoiler
```
t123@t123-BM6875-BM6675-BP6375:~/srv32$ make sip
make[1]: Entering directory '/home/t123/srv32/sw'
make -C common
make[2]: Entering directory '/home/t123/srv32/sw/common'
make[2]: Nothing to be done for 'all'.
make[2]: Leaving directory '/home/t123/srv32/sw/common'
make[2]: Entering directory '/home/t123/srv32/sw/sip'
riscv-none-elf-gcc -O3 -Wall -march=rv32im_zicsr -mabi=ilp32 -misa-spec=2.2 -march=rv32im -nostartfiles -nostdlib -L../common -o sip.elf sip.c -lc -lm -lgcc -lsys -T ../common/default.ld
riscv-none-elf-objcopy -j .text -O binary sip.elf imem.bin
riscv-none-elf-objcopy -j .data -O binary sip.elf dmem.bin
riscv-none-elf-objcopy -O binary sip.elf memory.bin
riscv-none-elf-objdump -d sip.elf > sip.dis
riscv-none-elf-readelf -a sip.elf > sip.symbol
make[2]: Leaving directory '/home/t123/srv32/sw/sip'
make[1]: Leaving directory '/home/t123/srv32/sw'
make[1]: Entering directory '/home/t123/srv32/sim'
Excuting 1267 instructions, 1883 cycles, 1.486 CPI
Program terminate
- ../rtl/../testbench/testbench.v:434: Verilog $finish
Simulation statistics
=====================
Simulation time : 0.138 s
Simulation cycles: 1894
Simulation speed : 0.0137246 MHz
make[1]: Leaving directory '/home/t123/srv32/sim'
make[1]: Entering directory '/home/t123/srv32/tools'
./rvsim --memsize 128 -l trace.log ../sw/sip/sip.elf
The target1 insert position is 2
The target2 insert position is 1
The target3 insert position is 4
Excuting 7094 instructions, 9752 cycles, 1.375 CPI
Program terminate
Simulation statistics
=====================
Simulation time : 0.006 s
Simulation cycles: 9752
Simulation speed : 1.701 MHz
make[1]: Leaving directory '/home/t123/srv32/tools'
Compare the trace between RTL and ISS simulator
Files sim/trace.log and tools/trace.log differ
make: *** [Makefile:121: sip] Error 1
```
:::
Results in tabular form.

| Metric | sim (verilog) | rvsim (c++) |
|-------------------|---------------|-------------|
| Instructions | 1267 | 7094 |
| Cycles | 1883 | 9752 |
| CPI | 1.486 | 1.375 |
| Simulation time | 0.138 s | 0.006 s |
| Simulation cycles | 1894 | 9752 |
| Simulation Speed | 0.0137246 MHz | 1.701 MHz |
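
The CPI row can be reproduced from the two counters above it, since CPI is simply cycles divided by instructions (for example 1883 / 1267 is about 1.486). Below is a minimal sketch in C. The helper names `cpi` and `report` are my own for illustration and are not part of srv32.

```c
#include <assert.h>
#include <stdio.h>

/* Cycles per instruction from the two counters printed by the simulators.
 * `cpi` is an illustrative helper, not an srv32 function. */
double cpi(unsigned long instructions, unsigned long cycles)
{
    assert(instructions > 0); /* CPI is undefined for an empty run */
    return (double)cycles / (double)instructions;
}

/* Print one summary line with the CPI rounded to three decimals. */
void report(const char *name, unsigned long instructions,
            unsigned long cycles)
{
    printf("%s: %lu instructions, %lu cycles, %.3f CPI\n",
           name, instructions, cycles, cpi(instructions, cycles));
}
```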
---
### Waveform analysis
[`srv32` is a 3-stage pipeline architecture with IF/ID, EX, WB stages. The following diagram marks some important signals for later discussion.](https://hackmd.io/@sysprog/S1Udn1Xtt#Pipeline-architecture)

After `make`, double-click `wave.fst` under `~/srv32/sim` in the Ubuntu file manager to start GTKWave and open the waveform window.

Next, open `sip.dis` in `~/srv32/sw/sip`. This file is the disassembly of the compiled `sip` C program. Match a `PC` value in this file against the `PC` signals in the waveform to find the part of the waveform that corresponds to a given piece of code.
Clicking `Search` -> `Pattern Search` opens the `Waveform Display Search` window, which can be used to search for a `PC` value.

I chose to observe these signals.

We can see that each instruction address passes through `fetch_pc` -> `if_pc` -> `ex_pc` -> `wb_pc` in that order.

In `imem_rdata` and `ex_insn`, we can see the instruction word being passed from the IF/ID stage to the EX stage.

#### [Data hazard](https://hackmd.io/@sysprog/S1Udn1Xtt#Data-hazard)
According to [[srv32 data hazard]](https://hackmd.io/@sysprog/S1Udn1Xtt#Data-hazard), `srv32` supports full forwarding, so it does not need to stall to resolve a RAW data hazard. RAW is also the only kind of data hazard in `srv32`, because the other kinds (WAW, WAR) are not possible on a single-issue processor.
For example, in `main` the branches at `ec` and `f0` both read `a4`, which the preceding instructions (`add a4,a4,sp` at `e4` and `lw a4,-16(a4)` at `e8`) write. The code and the waveform where the RAW data hazard occurs are as follows.
```
00000090 <main>:
...
ec: 12c70c63 beq a4,a2,224 <main+0x194>
f0: 10e65863 bge a2,a4,200 <main+0x170>
...
```
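In this part of the waveform, the data hazard shows up where `ex_mem2reg=1` and `wb_alu2reg=1`. Going by the signal names, a load is in EX while an instruction that writes back an ALU result is in WB.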
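
To make the forwarding decision concrete, here is a small software model in C. It is only a sketch built from the signal names that appear in this note (`wb_dst_sel`, `ex_src1_sel`, `wb_alu2reg`, `wb_mem2reg`). The struct, the function names, and the choice to also pass loaded data through the same path are my assumptions, and this is not the srv32 RTL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Write-back stage signals, named after the waveform signals in this note.
 * The struct itself is hypothetical. */
struct wb_stage {
    uint8_t  wb_dst_sel;  /* destination register index (0..31) */
    bool     wb_alu2reg;  /* WB instruction writes an ALU result */
    bool     wb_mem2reg;  /* WB instruction writes loaded data */
    uint32_t wb_result;   /* ALU result waiting to be written back */
};

/* True when the EX-stage source register must take its value from the
 * WB stage instead of the register file (a RAW dependency). */
bool needs_forward(const struct wb_stage *wb, uint8_t ex_src_sel)
{
    assert(ex_src_sel < 32 && wb->wb_dst_sel < 32);
    if (ex_src_sel == 0)
        return false; /* x0 always reads as zero */
    if (!(wb->wb_alu2reg || wb->wb_mem2reg))
        return false; /* WB instruction writes no register */
    return wb->wb_dst_sel == ex_src_sel;
}

/* Operand seen by EX: forwarded ALU result, loaded data, or the
 * register file value when there is no dependency. */
uint32_t read_operand(const uint32_t regs[32], const struct wb_stage *wb,
                      uint32_t load_data, uint8_t ex_src_sel)
{
    if (!needs_forward(wb, ex_src_sel))
        return ex_src_sel == 0 ? 0 : regs[ex_src_sel];
    return wb->wb_mem2reg ? load_data : wb->wb_result;
}
```

In this model, an `add a4,a4,sp` in WB followed by an instruction in EX that reads `a4` (register index 14) makes `needs_forward` return true, and `read_operand` returns `wb_result` rather than the stale register file entry.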

#### [Load-use hazard](https://hackmd.io/@sysprog/S1Udn1Xtt#Load-use-hazard)
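In a classic five-stage pipeline, the result of a load is not available until the memory-access stage, so it cannot be forwarded to the EX stage of the immediately following instruction without a stall. In the three-stage `srv32`, the timing table below shows the dependent instruction proceeding without a stall cycle.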
```
00000090 <main>:
...
1f0: ff072703 lw a4,-16(a4)
1f4: f2c718e3 bne a4,a2,124 <main+0x94>
1f8: 00078493 mv s1,a5
1fc: f35ff06f j 130 <main+0xa0>
...
```
There is a load-use dependency between `1f0` and `1f4`: `bne` reads the `a4` that `lw` loads.
When `1f8` is in `IF/ID`, `1f4` is in `EX`, and `1f0` is in `WB`, `(wb_dst_sel == ex_src1_sel)` is true and `wb_mem2reg` is true.

| PC | Instruction | cycle 1 | c2 | c3 | c4 | c5 |
| -- | -------------- | ------- | ----- | ----- | -- | -- |
| 1f0 | lw a4,-16(a4) | IF/ID | EX | WB | | |
| 1f4 | bne a4,a2,124 | | IF/ID | EX | WB | |
| 1f8 | mv s1,a5 | | | IF/ID | EX | WB |
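
The timing table above follows a simple rule: without stalls, instruction number `index` is in IF/ID in cycle `index + 1`, in EX one cycle later, and in WB one cycle after that. The sketch below encodes that rule in C. The `stage` enum and `stage_at` are names I made up for illustration, and the model assumes one instruction enters the pipeline per cycle.

```c
#include <assert.h>

/* Pipeline stages of a 3-stage pipeline. STAGE_NONE means the instruction
 * is not in the pipeline during that cycle. */
enum stage { STAGE_NONE, STAGE_IF_ID, STAGE_EX, STAGE_WB };

/* Stage occupied by instruction `index` (0 = first row of the table)
 * in cycle `cycle` (1 = first cycle), assuming no stalls. */
enum stage stage_at(unsigned index, unsigned cycle)
{
    assert(cycle >= 1);
    unsigned first = index + 1; /* cycle in which it is in IF/ID */
    if (cycle < first || cycle > first + 2)
        return STAGE_NONE;
    return (enum stage)(STAGE_IF_ID + (cycle - first));
}
```

For cycle 3 this gives WB for `lw` (index 0), EX for `bne` (index 1), and IF/ID for `mv` (index 2), which is the situation described above for `1f0`, `1f4`, and `1f8`.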
#### [Branch penalty](https://hackmd.io/@sysprog/S1Udn1Xtt#Branch-penalty)
The branch penalty is the number of instructions killed after a branch instruction when the branch is TAKEN.
In `srv32`, the branch penalty is 2, which means the 2 instructions following a taken branch are killed. In other words, they are turned into 2 NOPs.
In this section of `main`, we can see the branch instruction `bge`.
```
00000090 <main>:
...
160: fc86dee3 bge a3,s0,13c <main+0xac>
164: 00020537 lui a0,0x20
168: 09050513 addi a0,a0,144 # 20090 <environ+0x4>
...
```
Next, find the corresponding waveform.

The order of execution is shown below.
| | | IF/ID | EX | WB |
| ------- | ---------------------- | ------- | ----- | ----- |
| next_pc | fetch_pc (imem_addr) | if_pc | ex_pc | wb_pc |
| xxx | addi a0,a0,144 | lui a0,0x20 | bge a3,s0,13c | |

In terms of cycles, the execution looks as shown below.
| PC | Instruction | cycle 1 | c2 | c3 | c4 | c5 | c6 |
| -- | --------------- | ------- | ----- | ----- | --- | --- | -- |
| 160 | bge a3,s0,13c | IF/ID | EX | WB | | | |
| 164 | lui a0,0x20 | | NOP | NOP | NOP | | |
| 168 | addi a0,a0,144 | | | NOP | NOP | NOP | |
| xxx | exec if branch taken | | | |IF/ID| EX | WB |
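
The cycle table above can be turned into a rough cost model. The sketch below is my own simplification in C, not something taken from srv32: `BRANCH_PENALTY`, `target_fetch_cycle`, and `estimate_cycles` are hypothetical names, and the estimate ignores every source of delay other than taken branches. For the table above (2 executed instructions, 1 taken branch) the estimate is 2 + 2 + 2 = 6 cycles, and the target is fetched in cycle 1 + 1 + 2 = 4.

```c
#include <assert.h>

/* Instructions killed after a taken branch, as stated in this note. */
#define BRANCH_PENALTY 2

/* Cycle in which the branch target is in IF/ID, given the cycle in which
 * the taken branch itself was in IF/ID. */
unsigned target_fetch_cycle(unsigned branch_fetch_cycle)
{
    assert(branch_fetch_cycle >= 1);
    return branch_fetch_cycle + 1 + BRANCH_PENALTY;
}

/* Rough cycle estimate for a 3-stage pipeline: one cycle per executed
 * instruction, two more cycles until the last one leaves WB, plus the
 * penalty for each taken branch. Killed instructions are not counted
 * in `instructions`. */
unsigned long estimate_cycles(unsigned long instructions,
                              unsigned long taken_branches)
{
    assert(taken_branches <= instructions);
    if (instructions == 0)
        return 0;
    return instructions + 2 + BRANCH_PENALTY * taken_branches;
}
```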
---
### Software Optimizations
I tried to modify the C code to use fewer cycles.
I shortened `main` by putting the targets in an array and calling `searchInsert` in a loop, and then built the new program as `sip2`.
The new C program:
```clike=
#include <stdio.h>
int searchInsert(int *nums, int numsSize, int target) {
    int left = 0;
    int right = numsSize - 1;
    while (left <= right)
    {
        int mid = (left + right) / 2;
        if (target == nums[mid])
            return mid;
        else if (target < nums[mid])
            right = mid - 1;
        else
            left = mid + 1;
    }
    return left;
}
int main()
{
    int data[] = {1, 3, 5, 6};
    int size = 4;
    int tar[] = {5, 2, 7};
    for (int i = 0; i < sizeof(tar) / sizeof(tar[0]); i++) {
        printf("The target%d insert position is %d\n", i + 1, searchInsert(data, size, tar[i]));
    }
    return 0;
}
```
After `make`, we get the result shown below.

:::spoiler
```
t123@t123-BM6875-BM6675-BP6375:~/srv32$ make sip2
make[1]: Entering directory '/home/t123/srv32/sw'
make -C common
make[2]: Entering directory '/home/t123/srv32/sw/common'
make[2]: Nothing to be done for 'all'.
make[2]: Leaving directory '/home/t123/srv32/sw/common'
make[2]: Entering directory '/home/t123/srv32/sw/sip2'
riscv-none-elf-gcc -O3 -Wall -march=rv32im_zicsr -mabi=ilp32 -misa-spec=2.2 -march=rv32im -nostartfiles -nostdlib -L../common -o sip2.elf sip2.c -lc -lm -lgcc -lsys -T ../common/default.ld
riscv-none-elf-objcopy -j .text -O binary sip2.elf imem.bin
riscv-none-elf-objcopy -j .data -O binary sip2.elf dmem.bin
riscv-none-elf-objcopy -O binary sip2.elf memory.bin
riscv-none-elf-objdump -d sip2.elf > sip2.dis
riscv-none-elf-readelf -a sip2.elf > sip2.symbol
make[2]: Leaving directory '/home/t123/srv32/sw/sip2'
make[1]: Leaving directory '/home/t123/srv32/sw'
make[1]: Entering directory '/home/t123/srv32/sim'
Excuting 1266 instructions, 1874 cycles, 1.480 CPI
Program terminate
- ../rtl/../testbench/testbench.v:434: Verilog $finish
Simulation statistics
=====================
Simulation time : 0.152 s
Simulation cycles: 1885
Simulation speed : 0.0124013 MHz
make[1]: Leaving directory '/home/t123/srv32/sim'
make[1]: Entering directory '/home/t123/srv32/tools'
./rvsim --memsize 128 -l trace.log ../sw/sip2/sip2.elf
The target1 insert position is 2
The target2 insert position is 1
The target3 insert position is 4
Excuting 8419 instructions, 11581 cycles, 1.376 CPI
Program terminate
Simulation statistics
=====================
Simulation time : 0.004 s
Simulation cycles: 11581
Simulation speed : 2.724 MHz
make[1]: Leaving directory '/home/t123/srv32/tools'
Compare the trace between RTL and ISS simulator
Files sim/trace.log and tools/trace.log differ
make: *** [Makefile:121: sip2] Error 1
```
:::
Results in tabular form.

| Metric | sim (verilog) | rvsim (c++) |
|-------------------|---------------|-------------|
| Instructions | 1266 | 8419 |
| Cycles | 1874 | 11581 |
| CPI | 1.480 | 1.376 |
| Simulation time | 0.152 s | 0.004 s |
| Simulation cycles | 1885 | 11581 |
| Simulation Speed | 0.0124013 MHz | 2.724 MHz |

For comparison, the results of the original program are repeated below.

| Metric | sim (verilog) | rvsim (c++) |
|-------------------|---------------|-------------|
| Instructions | 1267 | 7094 |
| Cycles | 1883 | 9752 |
| CPI | 1.486 | 1.375 |
| Simulation time | 0.138 s | 0.006 s |
| Simulation cycles | 1894 | 9752 |
| Simulation Speed | 0.0137246 MHz | 1.701 MHz |
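
In `sim`, the cycle count drops from 1883 to 1874 and the CPI from 1.486 to 1.480, which is a small step in the right direction. In `rvsim`, however, the instruction count rises from 7094 to 8419 and the cycle count from 9752 to 11581 while the CPI stays almost unchanged (1.375 to 1.376), so there the modified program is slower. Note that both logs end with `Files sim/trace.log and tools/trace.log differ`, so the two simulators did not produce the same trace, and the `sim` and `rvsim` columns should be compared with caution.
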
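
To put the two runs on the same scale, the change in a counter can be expressed as a percentage of the original value. The following is a minimal sketch in C, and `percent_change` is a helper name I chose for illustration. With the cycle counts from the tables, the `rvsim` result is roughly +18.8 % (9752 to 11581) and the `sim` result roughly -0.5 % (1883 to 1874).

```c
#include <assert.h>

/* Relative change from `before` to `after`, in percent.
 * A negative value means the counter went down. */
double percent_change(unsigned long before, unsigned long after)
{
    assert(before > 0); /* the original run must have a non-zero count */
    return ((double)after - (double)before) * 100.0 / (double)before;
}
```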
---
## Q2: Leetcode [122. Best Time to Buy and Sell Stock II](https://leetcode.com/problems/best-time-to-buy-and-sell-stock-ii/)
I picked [張瑞甫's program in Assignment 2](https://hackmd.io/ADNQPiEFSPC_daP2GJ_sSQ) to complete my Assignment 3.
### Code
C code
```clike=
#include <stdio.h>
#include <stdlib.h>
int maxProfit(int *prices, int pricesSize) {
    int totalProfit = 0;
    int i = 0;
    int increases = 0;
    for (i = 1; i < pricesSize; i++) {
        if (prices[i] <= prices[i - 1]) {
            totalProfit += increases;
            increases = 0;
        }
        else
            increases += prices[i] - prices[i - 1];
    }
    totalProfit += increases;
    return totalProfit;
}
int main() {
    int arr[] = {99, 32, 3, 56, 0, 2, 56, 99};
    int size = 8;
    int a = maxProfit(arr, size);
    printf("%d\n", a);
}
```
The assembly code is in the collapsed Details blocks below.
`maxProfit`:
:::spoiler
```
0000003c <maxProfit>:
3c: 00100793 li a5,1
40: 0cb7d263 bge a5,a1,104 <maxProfit+0xc8>
44: 00300793 li a5,3
48: 0cb7d263 bge a5,a1,10c <maxProfit+0xd0>
4c: ffc58713 addi a4,a1,-4
50: 00052783 lw a5,0(a0)
54: ffe77713 andi a4,a4,-2
58: 00450693 addi a3,a0,4
5c: 00370713 addi a4,a4,3
60: 00000813 li a6,0
64: 00100313 li t1,1
68: 00000893 li a7,0
6c: 0006a603 lw a2,0(a3)
70: 00000e13 li t3,0
74: 40f60eb3 sub t4,a2,a5
78: 06c7d863 bge a5,a2,e8 <maxProfit+0xac>
7c: 0046a783 lw a5,4(a3)
80: 010e8e33 add t3,t4,a6
84: 00000813 li a6,0
88: 40c78eb3 sub t4,a5,a2
8c: 06f65863 bge a2,a5,fc <maxProfit+0xc0>
90: 01ce8833 add a6,t4,t3
94: 00230313 addi t1,t1,2
98: 00868693 addi a3,a3,8
9c: fce318e3 bne t1,a4,6c <maxProfit+0x30>
a0: 00271793 slli a5,a4,0x2
a4: 00f507b3 add a5,a0,a5
a8: 0180006f j c0 <maxProfit+0x84>
ac: 00170713 addi a4,a4,1
b0: 010888b3 add a7,a7,a6
b4: 00478793 addi a5,a5,4
b8: 00000813 li a6,0
bc: 02b75263 bge a4,a1,e0 <maxProfit+0xa4>
c0: 0007a603 lw a2,0(a5)
c4: ffc7a683 lw a3,-4(a5)
c8: 40d60533 sub a0,a2,a3
cc: fec6d0e3 bge a3,a2,ac <maxProfit+0x70>
d0: 00170713 addi a4,a4,1
d4: 00a80833 add a6,a6,a0
d8: 00478793 addi a5,a5,4
dc: feb742e3 blt a4,a1,c0 <maxProfit+0x84>
e0: 01088533 add a0,a7,a6
e4: 00008067 ret
e8: 0046a783 lw a5,4(a3)
ec: 010888b3 add a7,a7,a6
f0: 00000813 li a6,0
f4: 40c78eb3 sub t4,a5,a2
f8: f8f64ce3 blt a2,a5,90 <maxProfit+0x54>
fc: 01c888b3 add a7,a7,t3
100: f95ff06f j 94 <maxProfit+0x58>
104: 00000513 li a0,0
108: 00008067 ret
10c: 00000813 li a6,0
110: 00100713 li a4,1
114: 00000893 li a7,0
118: f89ff06f j a0 <maxProfit+0x64>
```
:::
`main`:
:::spoiler
```
0000011c <main>:
11c: 00020537 lui a0,0x20
120: ff010113 addi sp,sp,-16
124: 09800593 li a1,152
128: 09050513 addi a0,a0,144 # 20090 <environ+0x4>
12c: 00112623 sw ra,12(sp)
130: 054000ef jal ra,184 <printf>
134: 00c12083 lw ra,12(sp)
138: 00000513 li a0,0
13c: 01010113 addi sp,sp,16
140: 00008067 ret
```
:::
---
### Modify Makefile
Copy the `Makefile` from `~/srv32/sw/sip` to `~/srv32/sw/122`, and change `SRC` and `TARGET` from `sip` to `122`.

---
### Result in srv32
Run `make 122` in the srv32 root directory (`~/srv32`). The result is shown below.

:::spoiler
```
t123@t123-BM6875-BM6675-BP6375:~/srv32$ make 122
make[1]: Entering directory '/home/t123/srv32/sw'
make -C common
make[2]: Entering directory '/home/t123/srv32/sw/common'
make[2]: Nothing to be done for 'all'.
make[2]: Leaving directory '/home/t123/srv32/sw/common'
make[2]: Entering directory '/home/t123/srv32/sw/122'
riscv-none-elf-gcc -O3 -Wall -march=rv32im_zicsr -mabi=ilp32 -misa-spec=2.2 -march=rv32im -nostartfiles -nostdlib -L../common -o 122.elf 122.c -lc -lm -lgcc -lsys -T ../common/default.ld
riscv-none-elf-objcopy -j .text -O binary 122.elf imem.bin
riscv-none-elf-objcopy -j .data -O binary 122.elf dmem.bin
riscv-none-elf-objcopy -O binary 122.elf memory.bin
riscv-none-elf-objdump -d 122.elf > 122.dis
riscv-none-elf-readelf -a 122.elf > 122.symbol
make[2]: Leaving directory '/home/t123/srv32/sw/122'
make[1]: Leaving directory '/home/t123/srv32/sw'
make[1]: Entering directory '/home/t123/srv32/sim'
Excuting 797 instructions, 1253 cycles, 1.572 CPI
Program terminate
- ../rtl/../testbench/testbench.v:434: Verilog $finish
Simulation statistics
=====================
Simulation time : 0.144 s
Simulation cycles: 1264
Simulation speed : 0.00877778 MHz
make[1]: Leaving directory '/home/t123/srv32/sim'
make[1]: Entering directory '/home/t123/srv32/tools'
./rvsim --memsize 128 -l trace.log ../sw/122/122.elf
152
Excuting 2110 instructions, 2946 cycles, 1.396 CPI
Program terminate
Simulation statistics
=====================
Simulation time : 0.001 s
Simulation cycles: 2946
Simulation speed : 2.300 MHz
make[1]: Leaving directory '/home/t123/srv32/tools'
Compare the trace between RTL and ISS simulator
Files sim/trace.log and tools/trace.log differ
make: *** [Makefile:121: 122] Error 1
```
:::
---
### Waveform analysis
The steps are the same as in Q1.

#### [Data hazard](https://hackmd.io/@sysprog/S1Udn1Xtt#Data-hazard)
There is a data hazard in `printf`.
The assembly code of `printf` is in the collapsed Details block below.
:::spoiler
```
00000184 <printf>:
184: fc010113 addi sp,sp,-64
188: 02c12423 sw a2,40(sp)
18c: 02d12623 sw a3,44(sp)
190: 00020317 auipc t1,0x20
194: ef032303 lw t1,-272(t1) # 20080 <_impure_ptr>
198: 02b12223 sw a1,36(sp)
19c: 02e12823 sw a4,48(sp)
1a0: 02f12a23 sw a5,52(sp)
1a4: 03012c23 sw a6,56(sp)
1a8: 03112e23 sw a7,60(sp)
1ac: 00832583 lw a1,8(t1)
1b0: 02410693 addi a3,sp,36
1b4: 00050613 mv a2,a0
1b8: 00030513 mv a0,t1
1bc: 00112e23 sw ra,28(sp)
1c0: 00d12623 sw a3,12(sp)
1c4: 010000ef jal ra,1d4 <_vfprintf_r>
1c8: 01c12083 lw ra,28(sp)
1cc: 04010113 addi sp,sp,64
```
:::
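We can see that when the instruction at `PC` `184` (`addi sp,sp,-64`) is in WB and the `sw` at `188`, which uses `sp` as its base register, is in EX, `(wb_dst_sel == ex_src1_sel)` is `true` and `wb_mem2reg` is `false`. This means that a RAW data hazard has occurred and is resolved by forwarding.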

---
### Software Optimizations
I would like to try modifying the C code to reduce the cycle count.

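
One possible direction, which I have not measured on srv32, is to drop the `increases` accumulator and simply add up every positive day-to-day difference. This gives the same total as the original `maxProfit` (152 for the test array) with less state in the loop. The function name `max_profit_greedy` is hypothetical. Note also that the `main` listing above already contains the constant `li a1,152`, which suggests the compiler evaluated `maxProfit` at compile time for this fixed input, so any measurement of the function itself would need input that the optimizer cannot see through.

```c
#include <assert.h>
#include <stddef.h>

/* Sum of all positive day-to-day price differences.
 * Equivalent in result to accumulating each rising run separately. */
int max_profit_greedy(const int *prices, size_t prices_size)
{
    assert(prices != NULL || prices_size == 0);
    int total = 0;
    for (size_t i = 1; i < prices_size; i++) {
        int delta = prices[i] - prices[i - 1];
        if (delta > 0)
            total += delta; /* only rising days contribute */
    }
    return total;
}
```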
---
## Reference
* Assignment 3 last year
* [tobychui](https://hackmd.io/@wIVnCcUaTouAktrkMVLEMA/Hy6vD5DtF)
* [Jack](https://hackmd.io/@jackli/arch_hw3)
* [chinghongfang](https://hackmd.io/@chinghongfang/HJuNqq-cF)
* [陳韋綸](https://hackmd.io/@_UHs74UQS7uNne9_7SwQFQ/S113vvkct)
* [Xiaokan Lua](https://hackmd.io/@E4b6eQ9-RWSAX-9mP_FLhA/HJwz8FgOK)
* Assignment 3 this year
* [nlnlOeO](https://hackmd.io/lMHf_NxVQGeO-VRIYvUV5w?view)
* [wanghanchi](https://hackmd.io/@wanghanchi/H1AxxO9ri)
* [srv32 學習紀錄 (srv32 study notes)](https://hackmd.io/Jrr1J1YDR_CtGR46DYotiQ)
* [eecheng's Assignment 3 建置教學 (build tutorial)](https://hackmd.io/@eecheng/B1fEgnQwF)
* [echo, export commands](https://www.cnblogs.com/xiaopiyuanzi/p/11910107.html)
* [Welcome to GTKWave](http://www.cs.ucf.edu/courses/cda4150/fall05/wave/wave.html)
* [淺談分支預測與 Hazards 議題 (A brief discussion of branch prediction and hazards)](https://ithelp.ithome.com.tw/m/articles/10265705)