---
tags: homework
---
# Assignment 3: SoftCPU
contributed by <:t-rex: `shangrex` >
Still working :hammer_and_pick:
## Set up the environment
1. Install the toolchain build dependencies
```
sudo apt install build-essential ccache
```
2. Download the [RISC-V toolchain](https://xpack.github.io/riscv-none-embed-gcc/install/)
1. First check your machine's architecture
```
uname -m
```
2. Go to the [GitHub release page](https://github.com/xpack-dev-tools/riscv-none-embed-gcc-xpack/releases/), download the RISC-V toolchain, and extract it under `~/opt` (see [what `/opt` means](https://stackoverflow.com/questions/12649355/what-does-opt-mean-as-in-the-opt-directory-is-it-an-abbreviation/12649407))
```
mkdir -p ~/opt
cd ~/opt
tar xvf ~/Downloads/xpack-riscv-none-embed-gcc-10.2.0-1.2-linux-x64.tar.gz
sudo chmod -R -w ~/opt/xpack-riscv-none-embed-gcc-10.2.0-1.2/bin/riscv-none-embed-gcc-10.2.0
```
3. Clone the [simulator repo](https://github.com/sysprog21/srv32) and set the `CROSS_COMPILE` environment variable
```
git clone git@github.com:sysprog21/srv32.git
cd srv32/
export CROSS_COMPILE=~/opt/xpack-riscv-none-embed-gcc-10.2.0-1.2/bin/riscv-none-embed-
```
4. Install the coverage tool
```
sudo apt install lcov
```
5. Build and run everything
```
make all
```
## Requirement 1
Create an `a1` directory under `sw` and put my C code in `a1.c`. Create a Makefile (refer to the one in the `hello` directory), then go to the repo root directory and run `make a1`.
:::spoiler The Running result
```shell=
make[1]: Entering directory '/home/shangrex/program/srv32/sw'
make -C common
make[2]: Entering directory '/home/shangrex/program/srv32/sw/common'
make[2]: Nothing to be done for 'all'.
make[2]: Leaving directory '/home/shangrex/program/srv32/sw/common'
make[2]: Entering directory '/home/shangrex/program/srv32/sw/a1'
/home/shangrex/opt/xpack-riscv-none-embed-gcc-10.2.0-1.2/bin/riscv-none-embed-gcc -O3 -Wall -march=rv32im -mabi=ilp32 -nostartfiles -nostdlib -L../common -o a1.elf a1.c -lc -lm -lgcc -lsys -T ../common/default.ld
/home/shangrex/opt/xpack-riscv-none-embed-gcc-10.2.0-1.2/bin/riscv-none-embed-objcopy -j .text -O binary a1.elf imem.bin
/home/shangrex/opt/xpack-riscv-none-embed-gcc-10.2.0-1.2/bin/riscv-none-embed-objcopy -j .data -O binary a1.elf dmem.bin
/home/shangrex/opt/xpack-riscv-none-embed-gcc-10.2.0-1.2/bin/riscv-none-embed-objcopy -O binary a1.elf memory.bin
/home/shangrex/opt/xpack-riscv-none-embed-gcc-10.2.0-1.2/bin/riscv-none-embed-objdump -d a1.elf > a1.dis
/home/shangrex/opt/xpack-riscv-none-embed-gcc-10.2.0-1.2/bin/riscv-none-embed-readelf -a a1.elf > a1.symbol
make[2]: Leaving directory '/home/shangrex/program/srv32/sw/a1'
make[1]: Leaving directory '/home/shangrex/program/srv32/sw'
make[1]: Entering directory '/home/shangrex/program/srv32/sim'
The result is 1. (1 is true and 0 is false.)
Excuting 2134 instructions, 2856 cycles, 1.338 CPI
Program terminate
- ../rtl/../testbench/testbench.v:418: Verilog $finish
Simulation statistics
=====================
Simulation time : 0.038 s
Simulation cycles: 2867
Simulation speed : 0.0754474 MHz
make[1]: Leaving directory '/home/shangrex/program/srv32/sim'
make[1]: Entering directory '/home/shangrex/program/srv32/tools'
./rvsim --memsize 128 -l trace.log ../sw/a1/a1.elf
The result is 1. (1 is true and 0 is false.)
Excuting 2134 instructions, 2856 cycles, 1.338 CPI
Program terminate
Simulation statistics
=====================
Simulation time : 0.004 s
Simulation cycles: 2856
Simulation speed : 0.760 MHz
make[1]: Leaving directory '/home/shangrex/program/srv32/tools'
Compare the trace between RTL and ISS simulator
=== Simulation passed ===
```
:::
:::spoiler My C code
```c=
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

/* Returns true when no two adjacent bits of n are equal. */
bool hasAlternatingBits(volatile int n){
    volatile bool t = n & 1;    /* the previously seen (lower) bit */
    while(n != 0){
        n = n >> 1;
        if(t == (n & 1)) return false;
        t = n & 1;
    }
    return true;
}

int main(void){
    volatile int input = 10;
    volatile bool rst = hasAlternatingBits(input);
    printf("The result is %d. (1 is true and 0 is false.)\n ", rst);
    return 0;
}
```
:::
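
The CPI figure in the running result is the cycle count divided by the instruction count. As a quick check of those numbers, here is a minimal sketch; the helper name `cycles_per_instruction` is made up for this note and is not part of srv32.

```c
#include <stdint.h>

/* Hypothetical helper, not part of srv32: CPI = cycles / instructions. */
double cycles_per_instruction(uint64_t cycles, uint64_t instructions)
{
    if (instructions == 0)
        return 0.0; /* avoid dividing by zero for an empty run */
    return (double)cycles / (double)instructions;
}
```

With the values from the log, `cycles_per_instruction(2856, 2134)` is about 1.338, which matches the reported CPI.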
## Requirement 2
First install GTKWave:
```shell=
sudo apt install gtkwave
```
Then load `wave.fst` from the `sim` directory.
With the waveform open, observe the PC, branches, instruction memory (I-MEM), data memory (D-MEM), and instruction internals.

### Observation
1. Clock trigger

Signals are updated on the rising edge of the clock. `srv32` is a three-stage pipeline, which is quite different from the classic five-stage pipeline (IF/ID/EX/MEM/WB).

> `if_pc` is the address of the instruction in the IF/ID stage. `ex_pc` is the address of `ex_insn` in the EX stage.

| IF/ID | EX | WB |
| -------- | -------- | -------- |
| if_pc | ex_pc | wb_pc |

The three-stage pipeline
:::spoiler The beginning of the disassembly
0: 00014297 auipc t0,0x14
4: 58028293 addi t0,t0,1408 # 14580 <trap_handler>
8: 30529073 csrw mtvec,t0
c: 3050e073 csrsi mtvec,1
10: 00022297 auipc t0,0x22
14: 86428293 addi t0,t0,-1948 # 21874 <_PathLocale>
18: 00022317 auipc t1,0x22
1c: 89c30313 addi t1,t1,-1892 # 218b4 <_bss_end>
:::
2. Two-cycle instructions

One kind of instruction occupies two clock cycles. After checking several two-cycle cases in the waveform, that instruction turns out to be `j`.
3. I-MEM (instruction memory) & D-MEM (data memory)

The instruction memory supplies the current instruction to the first stage of the pipeline.
4. Branch penalty

> if_inst=`auipc`
> ex_inst=`bltu`
> wb_inst=`addi`

The `bltu` instruction causes a two-cycle branch penalty. The waveform shows that after the branch is taken, `ex_flush` is asserted for two cycles.
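
The two-cycle penalty gives a rough way to read the cycle counts. The sketch below assumes an ideal rate of one instruction per cycle and assumes that every extra cycle comes from a flush of fixed cost. Neither assumption is verified in this note, and the function names are invented for illustration.

```c
#include <stdint.h>

/* Cycles beyond the ideal one-instruction-per-cycle rate. */
uint64_t extra_cycles(uint64_t cycles, uint64_t instructions)
{
    return cycles > instructions ? cycles - instructions : 0;
}

/* Assumption: each flush costs `penalty` cycles and nothing else stalls. */
uint64_t estimated_flushes(uint64_t cycles, uint64_t instructions, uint64_t penalty)
{
    if (penalty == 0)
        return 0; /* no penalty means no estimate is possible */
    return extra_cycles(cycles, instructions) / penalty;
}
```

For the Requirement 1 run (2134 instructions, 2856 cycles) this gives 722 extra cycles, or roughly 361 flushes under those assumptions.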
## Requirement 3
Optimization goals: fewer instructions, fewer cycles, and no unnecessary stalls.
* Remove the `volatile` qualifiers. The instruction count drops from 2134 to 2122.
:::spoiler Running result
```shell=
Excuting 2122 instructions, 2844 cycles, 1.340 CPI
Program terminate
Simulation statistics
=====================
Simulation time : 0.001 s
Simulation cycles: 2844
Simulation speed : 2.418 MHz
```
:::
* Eliminate the function call by moving the logic into `main`. The instruction count drops further to 2093.
:::spoiler My new C code
```c=
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

int main(void){
    int input = 10;
    bool t = input & 1, rst;
    while(input != 0){
        input = input >> 1;
        if(t == (input & 1)){
            rst = false;
            goto ans;
        }
        t = input & 1;
    }
    rst = true;
ans:
    printf("The result is %d. (1 is true and 0 is false.)\n ", rst);
    return 0;
}
```
:::
:::spoiler Running result
```shell=
The result is 1. (1 is true and 0 is false.)
Excuting 2093 instructions, 2803 cycles, 1.339 CPI
Program terminate
Simulation statistics
=====================
Simulation time : 0.001 s
Simulation cycles: 2803
Simulation speed : 2.509 MHz
```
:::
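
Another direction, not measured in this note, is to drop the loop entirely. If the bits of `n` alternate, then `n ^ (n >> 1)` is a run of ones starting at bit 0, and such a value `x` satisfies `(x & (x + 1)) == 0`. A minimal sketch of that idea follows. The function name is hypothetical, and the effect on instruction and cycle counts in srv32 would have to be checked by rerunning the simulation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Loop-free check; assumes n is treated as an unsigned 32-bit value. */
bool has_alternating_bits_xor(uint32_t n)
{
    uint32_t x = n ^ (n >> 1);  /* all ones from bit 0 upward when bits alternate */
    return (x & (x + 1)) == 0;  /* true only if x has the form 0...01...1 */
}
```

For the input used above, `has_alternating_bits_xor(10)` returns `true`, the same answer as the loop version.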
## Requirement 4
Compliance tests verify whether a RISC-V implementation conforms to the [basic standard](https://github.com/riscv-non-isa/riscv-arch-test/blob/master/doc/README.adoc).
**Test suite**: a basic check of the important aspects of the specification, without focusing on details.
**Signature**: a defined memory area where the result of a test suite is stored. If the test signature does not match the reference, something went wrong during execution.
**Verilator**: a simulator that translates Verilog HDL into C++/SystemC code. It can simulate the execution of a RISC-V binary at the RTL level.
According to `sim/Makefile` and the execution log, **Verilator** runs after make enters the `sim` directory (triggered by the command `make a2.run`).
:::spoiler execution process
```shell=
make[1]: Entering directory '/home/shangrex/program/srv32/sw'
make -C common
make[2]: Entering directory '/home/shangrex/program/srv32/sw/common'
make[2]: Nothing to be done for 'all'.
make[2]: Leaving directory '/home/shangrex/program/srv32/sw/common'
make[2]: Entering directory '/home/shangrex/program/srv32/sw/a1'
/home/shangrex/opt/xpack-riscv-none-embed-gcc-10.2.0-1.2/bin/riscv-none-embed-gcc -O3 -Wall -march=rv32im -mabi=ilp32 -nostartfiles -nostdlib -L../common -o a1.elf a1.c -lc -lm -lgcc -lsys -T ../common/default.ld
/home/shangrex/opt/xpack-riscv-none-embed-gcc-10.2.0-1.2/bin/riscv-none-embed-objcopy -j .text -O binary a1.elf imem.bin
/home/shangrex/opt/xpack-riscv-none-embed-gcc-10.2.0-1.2/bin/riscv-none-embed-objcopy -j .data -O binary a1.elf dmem.bin
/home/shangrex/opt/xpack-riscv-none-embed-gcc-10.2.0-1.2/bin/riscv-none-embed-objcopy -O binary a1.elf memory.bin
/home/shangrex/opt/xpack-riscv-none-embed-gcc-10.2.0-1.2/bin/riscv-none-embed-objdump -d a1.elf > a1.dis
/home/shangrex/opt/xpack-riscv-none-embed-gcc-10.2.0-1.2/bin/riscv-none-embed-readelf -a a1.elf > a1.symbol
make[2]: Leaving directory '/home/shangrex/program/srv32/sw/a1'
make[1]: Leaving directory '/home/shangrex/program/srv32/sw'
make[1]: Entering directory '/home/shangrex/program/srv32/sim'
The result is 1. (1 is true and 0 is false.)
Excuting 2134 instructions, 2856 cycles, 1.338 CPI
Program terminate
- ../rtl/../testbench/testbench.v:418: Verilog $finish
Simulation statistics
=====================
Simulation time : 0.034 s
Simulation cycles: 2867
Simulation speed : 0.0843235 MHz
make[1]: Leaving directory '/home/shangrex/program/srv32/sim'
make[1]: Entering directory '/home/shangrex/program/srv32/tools'
./rvsim --memsize 128 -l trace.log ../sw/a1/a1.elf
The result is 1. (1 is true and 0 is false.)
Excuting 2134 instructions, 2856 cycles, 1.338 CPI
Program terminate
Simulation statistics
=====================
Simulation time : 0.001 s
Simulation cycles: 2856
Simulation speed : 2.320 MHz
make[1]: Leaving directory '/home/shangrex/program/srv32/tools'
Compare the trace between RTL and ISS simulator
=== Simulation passed ===
```
:::
:::spoiler sim/Makefile
```makefile=
verilator ?= 1
top ?= 0
rv32c ?= 0
debug ?= 0
coverage ?= 0
memsize ?= 128

# Run flags
RFLAGS = +trace $(if $(debug), +dump)

TARGET = sim

ifeq (, $(shell which stdbuf))
    STDBUF =
else
    STDBUF = stdbuf -o0 -e0
endif

TARGET_SIM = iverilog

ifeq ($(top),1)
    _top := 1
endif

ifeq ($(coverage),1)
    _coverage := 1
endif

ifeq ($(rv32c),1)
    _rv32c := 1
endif

ifeq ($(verilator),1)
    BFLAGS = -O3 -cc -Wall -Wno-STMTDLY -Wno-UNUSED \
             +define+MEMSIZE=$(memsize) \
             $(if $(_top), +define+SINGLE_RAM) \
             $(if $(_rv32c), +define+RV32C_ENABLED) \
             $(if $(_coverage), --coverage) \
             --trace-fst --Mdir sim_cc --build --exe sim_main.cpp getch.cpp
    UNAME_S := $(shell uname -s)
    ifeq ($(UNAME_S),Linux)
        UNAME_M := $(shell uname -m)
        ifeq ($(UNAME_M),x86_64)
            export VERILATOR_ROOT=$(CURDIR)/../verilator
            TARGET_SIM = $(VERILATOR_ROOT)/bin/verilator
            SHELL_HACK := $(shell ln -sf $(TARGET_SIM)_bin-linux-x86_64 $(TARGET_SIM)_bin)
        endif
    else
        TARGET_SIM = verilator
    endif
else
    BFLAGS = $(if $(_top), -D SINGLE_RAM=1) $(if $(_rv32c), -DRV32C_ENABLED=1)
endif

FILELIST = -f filelist.txt $(if $(_top), ../rtl/top_s.v, ../rtl/top.v)

all: $(TARGET)

$(TARGET):
	$(TARGET_SIM) $(BFLAGS) -o $(TARGET) $(FILELIST)
	@if [ "$(verilator)" = "1" ]; then \
		mv sim_cc/sim .; \
	fi

%.run: $(TARGET) checkcode.awk
	@if [ ! -f ../sw/$*/memory.bin ]; then \
		make -C ../sw $*; \
	fi
	@cp ../sw/$*/*.bin .
	@$(STDBUF) ./$(TARGET) $(RFLAGS) | awk -f $(filter %.awk, $^)
	@if [ -f coverage.dat ]; then \
		mv coverage.dat $*_cov.dat; \
	fi

clean:
	@$(RM) $(TARGET) wave.* trace.log *.bin dump.txt
	@$(RM) -rf sim_cc *_cov.dat

distclean: clean
	@$(RM) $(TARGET_SIM)_bin
```
:::
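
To make the signature idea concrete: after a test runs, the words in the signature region are compared against a reference, and any difference means the test failed. The sketch below shows only that comparison on two in-memory arrays. The function name and the array-based interface are assumptions for illustration and do not reflect how the riscv-arch-test scripts or srv32 actually implement the check.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the index of the first differing word, or -1 if all n words match. */
long first_signature_mismatch(const uint32_t *dut, const uint32_t *ref, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (dut[i] != ref[i])
            return (long)i;
    }
    return -1;
}
```

A return value of `-1` corresponds to a passing test. Any other value points at the first signature word worth inspecting.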
## Reference
* [Instruction Set Simulator (ISS)](https://en.wikipedia.org/wiki/Instruction_set_simulator)
* [Register-Transfer Level (RTL)](https://en.wikipedia.org/wiki/Register-transfer_level)
* [GTKWave tutorial](https://hackmd.io/@sysprog/S1Udn1Xtt)
* [RISC-V specification (v2.2)](https://riscv.org/wp-content/uploads/2017/05/riscv-spec-v2.2.pdf)

Fun fact from the reference: the value `0x00000013` represents a NOP instruction (`addi x0, x0, 0`).
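
The NOP value can be checked by hand from the I-type layout in the RISC-V specification: `imm[11:0]` in bits 31:20, `rs1` in bits 19:15, `funct3` in bits 14:12, `rd` in bits 11:7, and the opcode in bits 6:0. A small sketch follows; the helper name `encode_itype` and the macro name are made up for this note.

```c
#include <stdint.h>

#define OPCODE_OP_IMM 0x13u /* opcode shared by addi and the other register-immediate ops */

/* Packs an I-type instruction; imm is truncated to its low 12 bits. */
uint32_t encode_itype(int32_t imm, uint32_t rs1, uint32_t funct3, uint32_t rd, uint32_t opcode)
{
    return (((uint32_t)imm & 0xFFFu) << 20) |
           ((rs1 & 0x1Fu) << 15) |
           ((funct3 & 0x7u) << 12) |
           ((rd & 0x1Fu) << 7) |
           (opcode & 0x7Fu);
}
```

`encode_itype(0, 0, 0, 0, OPCODE_OP_IMM)` gives `0x00000013`, and `encode_itype(1408, 5, 0, 5, OPCODE_OP_IMM)` gives `0x58028293`, the `addi t0,t0,1408` word in the disassembly from Requirement 2 (`t0` is `x5`).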