# [Assignment3: SoftCPU](https://hackmd.io/@sysprog/2021-arch-homework3)
Reference : [Lab3: srv32 - RISCV RV32IM Soft CPU](https://hackmd.io/@sysprog/S1Udn1Xtt)
###### tags: `RISC-V`, `jserv`
[toc]
## Installation
### Get riscv-none-embed-gcc
```bash=
$ cd /tmp
$ wget https://github.com/xpack-dev-tools/riscv-none-embed-gcc-xpack/releases/download/v10.1.0-1.1/xpack-riscv-none-embed-gcc-10.1.0-1.1-linux-x64.tar.gz
$ tar zxvf xpack-riscv-none-embed-gcc-10.1.0-1.1-linux-x64.tar.gz
$ cp -af xpack-riscv-none-embed-gcc-10.1.0-1.1 $HOME/riscv-none-embed-gcc
# Configure $PATH
$ cd $HOME/riscv-none-embed-gcc
$ echo "export PATH=`pwd`/bin:$PATH" > setenv
$ cd $HOME
$ source riscv-none-embed-gcc/setenv
# Check
$ echo $PATH
```
### Get [srv32](https://github.com/sysprog21/srv32)
```bash=
$ git clone https://github.com/sysprog21/srv32
```
> Setup reference: [Build tutorial](https://hackmd.io/@eecheng/B1fEgnQwF)
### Get [GTKWave](http://gtkwave.sourceforge.net/)
```bash=
$ sudo apt install gtkwave
$ gtkwave
```
### Code Execution
:::spoiler {state="open"}1. Rewrite the C program [(Leetcode268)](https://leetcode.com/problems/missing-number/)
```c=
#include <stdio.h>
int main(){
    int nums[] = {3,0,1,5,7,9,10,8,6,11,4,12};
    int numsSize = 12;
    int res = 0;
    for(int i = 0; i < numsSize; i++){
        res += nums[i];
    }
    printf("%d\n", numsSize * (numsSize+1) /2 - res);
    return 0;
}
```
:::
2. Run the code with srv32
```bash=
# In srv32/
$ mkdir sw/hw3
# Copy existing make file to your directory
$ cp sw/hello/Makefile sw/hw3/
$ cd sw/hw3/
# Write the C code (we are already in sw/hw3/)
$ vim hw3_1.c
# Modify the Makefile, then build
$ vim Makefile
$ make
> riscv-none-embed-gcc -O3 -Wall -march=rv32im -mabi=ilp32 -nostartfiles -nostdlib -L../common -o hw3_1.elf hw3_1.c -lc -lm -lgcc -lsys -T ../common/default.ld
> riscv-none-embed-objcopy -j .text -O binary hw3_1.elf imem.bin
> riscv-none-embed-objcopy -j .data -O binary hw3_1.elf dmem.bin
> riscv-none-embed-objcopy -O binary hw3_1.elf memory.bin
> riscv-none-embed-objdump -d hw3_1.elf > hw3_1.dis
> riscv-none-embed-readelf -a hw3_1.elf > hw3_1.symbol
# Run it
$ cd ../../tools
$ ./rvsim --memsize 128 -l trace.log ../sw/hw3/hw3_1.elf
> 2
>
> Excuting 1473 instructions, 1853 cycles, 1.258 CPI
> Program terminate
> Simulation statistics
> =====================
> Simulation time : 0.001 s
> Simulation cycles: 1853
> Simulation speed : 3.162 MHz
```
---
## Analysis of the srv32 CPU
- srv32 is a three-stage (IF/ID, EX, WB) pipelined processor that passes the RV32IM compliance test.
- srv32 implements full forwarding to avoid data hazards, which means that even a value just read from memory can be forwarded to the execution stage.
- A failed branch prediction, detected at the execution stage, costs 2 stall cycles while the pipeline flushes the wrongly fetched instructions.
- The following picture shows the srv32 processor architecture.

---
## Use GTKWave to analyze the waveform
### GTKWave execution
1. Execute GTKWave
```bash=
$ gtkwave
```
2. Open `./sim/wave.fst` to view the corresponding waveform.

### Waveform analysis
- The penalty for each failed branch prediction is 2 cycles.

### Optimization
To eliminate unnecessary stalls, we rewrote the loop as a repeated sequence of similar, independent statements. The repeated sequence executes fewer branches, so it avoids the flush penalty that each failed branch prediction would incur.
#### Loop unrolling
- Advantages
    + Fewer loop branches are executed, which minimizes the branch penalty.
    + If the instructions in the loop body are independent, they can be executed in parallel.
- Disadvantages
    + The program code may become less readable.
    + A larger program size may increase instruction cache misses.
    + Each iteration may use more registers to hold additional temporary variables.
#### Result
```
./rvsim --memsize 128 -l trace.log ../sw/hw3/hw3_pro.elf
> 2
>
> Excuting 1413 instructions, 1773 cycles, 1.255 CPI
> Program terminate
>
> Simulation statistics
> =====================
> Simulation time : 0.003 s
> Simulation cycles: 1773
> Simulation speed : 0.687 MHz
```
Loop unrolling saved 1853 - 1773 = 80 cycles, about 4.3% of the original cycle count.