---
tags: homework
---
# Assignment3: SoftCPU
contributed by <:t-rex: `shangrex` >
> Still working :hammer_and_pick:

## Set up the environment
1. Build the toolchain dependencies
```
sudo apt install build-essential ccache
```
2. Download the [RISC-V toolchain](https://xpack.github.io/riscv-none-embed-gcc/install/)
    1. First check your host machine's architecture
    ```
    uname -m
    ```
    2. Go to the [GitHub release page](https://github.com/xpack-dev-tools/riscv-none-embed-gcc-xpack/releases/), download the RISC-V toolchain, and unpack it under `~/opt` (see [/opt](https://stackoverflow.com/questions/12649355/what-does-opt-mean-as-in-the-opt-directory-is-it-an-abbreviation/12649407) for the naming convention). Adjust the archive name to the release you downloaded; the paths in the rest of this note use version 10.2.0-1.2.
    ```
    $ mkdir -p ~/opt
    $ cd ~/opt
    $ tar xvf ~/Downloads/xpack-riscv-none-embed-gcc-8.2.1-3.1-linux-x64.tgz
    $ sudo chmod -R -w ~/opt/xpack-riscv-none-embed-gcc-10.2.0-1.2/bin/riscv-none-embed-gcc-10.2.0
    ```
3. Clone the [simulator repo](https://github.com/sysprog21/srv32) and set up the environment variable
```
git clone git@github.com:sysprog21/srv32.git
cd srv32/
export CROSS_COMPILE=~/opt/xpack-riscv-none-embed-gcc-10.2.0-1.2/bin/riscv-none-embed-
```
4. Install the coverage tool
```
sudo apt install lcov
```
5. Finished! Build and run everything with
```
make all
```

## Requirement 1
Create an `a1` directory under `sw` and add my C code as `a1.c`. Create a Makefile (modeled on the one in the `hello` directory), then go to the repo root directory and run `make a1`.

:::spoiler The Running result
```shell=
make[1]: Entering directory '/home/shangrex/program/srv32/sw'
make -C common
make[2]: Entering directory '/home/shangrex/program/srv32/sw/common'
make[2]: Nothing to be done for 'all'.
make[2]: Leaving directory '/home/shangrex/program/srv32/sw/common'
make[2]: Entering directory '/home/shangrex/program/srv32/sw/a1'
/home/shangrex/opt/xpack-riscv-none-embed-gcc-10.2.0-1.2/bin/riscv-none-embed-gcc -O3 -Wall -march=rv32im -mabi=ilp32 -nostartfiles -nostdlib -L../common -o a1.elf a1.c -lc -lm -lgcc -lsys -T ../common/default.ld
/home/shangrex/opt/xpack-riscv-none-embed-gcc-10.2.0-1.2/bin/riscv-none-embed-objcopy -j .text -O binary a1.elf imem.bin
/home/shangrex/opt/xpack-riscv-none-embed-gcc-10.2.0-1.2/bin/riscv-none-embed-objcopy -j .data -O binary a1.elf dmem.bin
/home/shangrex/opt/xpack-riscv-none-embed-gcc-10.2.0-1.2/bin/riscv-none-embed-objcopy -O binary a1.elf memory.bin
/home/shangrex/opt/xpack-riscv-none-embed-gcc-10.2.0-1.2/bin/riscv-none-embed-objdump -d a1.elf > a1.dis
/home/shangrex/opt/xpack-riscv-none-embed-gcc-10.2.0-1.2/bin/riscv-none-embed-readelf -a a1.elf > a1.symbol
make[2]: Leaving directory '/home/shangrex/program/srv32/sw/a1'
make[1]: Leaving directory '/home/shangrex/program/srv32/sw'
make[1]: Entering directory '/home/shangrex/program/srv32/sim'
The result is 1. (1 is true and 0 is false.)
 Excuting 2134 instructions, 2856 cycles, 1.338 CPI
Program terminate
- ../rtl/../testbench/testbench.v:418: Verilog $finish
Simulation statistics
=====================
Simulation time : 0.038 s
Simulation cycles: 2867
Simulation speed : 0.0754474 MHz
make[1]: Leaving directory '/home/shangrex/program/srv32/sim'
make[1]: Entering directory '/home/shangrex/program/srv32/tools'
./rvsim --memsize 128 -l trace.log ../sw/a1/a1.elf
The result is 1. (1 is true and 0 is false.)
 Excuting 2134 instructions, 2856 cycles, 1.338 CPI
Program terminate
Simulation statistics
=====================
Simulation time : 0.004 s
Simulation cycles: 2856
Simulation speed : 0.760 MHz
make[1]: Leaving directory '/home/shangrex/program/srv32/tools'
Compare the trace between RTL and ISS simulator
=== Simulation passed ===
```
:::

:::spoiler My C code
```c=
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

/* Returns true when every pair of adjacent bits in n differs. */
bool hasAlternatingBits(volatile int n){
    volatile bool t = n & 1;   /* previous (lower) bit */
    while(n != 0){
        n = n >> 1;
        if(t == (n & 1)) return false;
        t = n & 1;
    }
    return true;
}

int main(void){
    volatile int input = 10;
    volatile bool rst = hasAlternatingBits(input);
    printf("The result is %d. (1 is true and 0 is false.)\n ", rst);
    return 0;
}
```
:::

## Requirement 2
First install GTKWave
```shell=
sudo apt install gtkwave
```
Then load `wave.fst` from the `sim` directory and observe the PC, branches, instruction memory (I-MEM), data memory (D-MEM), and instruction internals.
![](https://i.imgur.com/6fsHXkQ.png)

### Observation
1. Clock trigger
![](https://i.imgur.com/2lqhIax.png)
The signals change on the rising clock edge. `srv32` is a three-stage pipeline, which differs from the classic five-stage pipeline (IF/ID/EX/MEM/WB).
![](https://i.imgur.com/23CyOlO.png)
> `if_pc` is the address of the instruction in the IF/ID stage. `ex_pc` is the address of `ex_insn` in the EX stage.

| IF       | EX       | WB       |
| -------- | -------- | -------- |
| if_pc    | ex_pc    | wb_pc    |

The three-stage pipeline

:::spoiler the head of machine code

    0:  00014297  auipc t0,0x14
    4:  58028293  addi  t0,t0,1408 # 14580 <trap_handler>
    8:  30529073  csrw  mtvec,t0
    c:  3050e073  csrsi mtvec,1
    10: 00022297  auipc t0,0x22
    14: 86428293  addi  t0,t0,-1948 # 21874 <_PathLocale>
    18: 00022317  auipc t1,0x22
    1c: 89c30313  addi  t1,t1,-1892 # 218b4 <_bss_end>

:::

2. Two-cycle instructions
![](https://i.imgur.com/by9UNCl.png)
One instruction occupies two clock cycles. After checking several two-cycle instructions, this special instruction turned out to be `j`.
3. I-MEM (instruction memory) & D-MEM (data memory)
![](https://i.imgur.com/PFyx3rU.png)
The instruction memory supplies the current instruction to the first stage of the pipeline.
4. Branch penalty
![](https://i.imgur.com/eu8e2b1.png)
> if_inst=`auipc`
> ex_inst=`bltu`
> wb_inst=`addi`

The `bltu` instruction incurs a two-cycle branch penalty. The picture above shows that once the branch is taken, `ex_flush` is asserted for two cycles.

## Requirement 3
Optimization goal: fewer instructions, shorter cycle counts, eliminate unnecessary stalls.
* Remove the `volatile` qualifiers. The instruction count drops from 2134 to 2122, and the cycle count from 2856 to 2844.

:::spoiler Running result
```shell=
Excuting 2122 instructions, 2844 cycles, 1.340 CPI
Program terminate
Simulation statistics
=====================
Simulation time : 0.001 s
Simulation cycles: 2844
Simulation speed : 2.418 MHz
```
:::
* Eliminate the function call by moving the loop into `main`. The instruction count drops further to 2093, and the cycle count to 2803.

:::spoiler My new C code
```c=
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

int main(void){
    int input = 10;
    bool t = input & 1, rst;   /* t holds the previous (lower) bit */
    while(input != 0){
        input = input >> 1;
        if(t == (input & 1)){
            rst = false;       /* two equal adjacent bits found */
            goto ans;
        }
        t = input & 1;
    }
    rst = true;
ans:
    printf("The result is %d. (1 is true and 0 is false.)\n ", rst);
    return 0;
}
```
:::
:::spoiler Running result
```shell=
The result is 1. (1 is true and 0 is false.)
 Excuting 2093 instructions, 2803 cycles, 1.339 CPI
Program terminate
Simulation statistics
=====================
Simulation time : 0.001 s
Simulation cycles: 2803
Simulation speed : 2.509 MHz
```
:::

## Requirement 4
Compliance tests verify whether a RISC-V implementation conforms to the [basic standard](https://github.com/riscv-non-isa/riscv-arch-test/blob/master/doc/README.adoc).

**Test suite**: a basic check of the important aspects of the specification, without focusing on details.

**Signature**: a defined memory area where the result of a test suite is stored. If the test signature does not match the expected one, something went wrong during execution.

**Verilator**: a simulator that translates Verilog HDL into C++/SystemC code. It can simulate the execution of a RISC-V binary at the RTL level.

According to `sim/Makefile` and the execution process, **Verilator** runs after `make` enters the `sim` directory (when executing the command `make a2.run`).

:::spoiler execution process
```shell=
make[1]: Entering directory '/home/shangrex/program/srv32/sw'
make -C common
make[2]: Entering directory '/home/shangrex/program/srv32/sw/common'
make[2]: Nothing to be done for 'all'.
make[2]: Leaving directory '/home/shangrex/program/srv32/sw/common'
make[2]: Entering directory '/home/shangrex/program/srv32/sw/a1'
/home/shangrex/opt/xpack-riscv-none-embed-gcc-10.2.0-1.2/bin/riscv-none-embed-gcc -O3 -Wall -march=rv32im -mabi=ilp32 -nostartfiles -nostdlib -L../common -o a1.elf a1.c -lc -lm -lgcc -lsys -T ../common/default.ld
/home/shangrex/opt/xpack-riscv-none-embed-gcc-10.2.0-1.2/bin/riscv-none-embed-objcopy -j .text -O binary a1.elf imem.bin
/home/shangrex/opt/xpack-riscv-none-embed-gcc-10.2.0-1.2/bin/riscv-none-embed-objcopy -j .data -O binary a1.elf dmem.bin
/home/shangrex/opt/xpack-riscv-none-embed-gcc-10.2.0-1.2/bin/riscv-none-embed-objcopy -O binary a1.elf memory.bin
/home/shangrex/opt/xpack-riscv-none-embed-gcc-10.2.0-1.2/bin/riscv-none-embed-objdump -d a1.elf > a1.dis
/home/shangrex/opt/xpack-riscv-none-embed-gcc-10.2.0-1.2/bin/riscv-none-embed-readelf -a a1.elf > a1.symbol
make[2]: Leaving directory '/home/shangrex/program/srv32/sw/a1'
make[1]: Leaving directory '/home/shangrex/program/srv32/sw'
make[1]: Entering directory '/home/shangrex/program/srv32/sim'
The result is 1. (1 is true and 0 is false.)
 Excuting 2134 instructions, 2856 cycles, 1.338 CPI
Program terminate
- ../rtl/../testbench/testbench.v:418: Verilog $finish
Simulation statistics
=====================
Simulation time : 0.034 s
Simulation cycles: 2867
Simulation speed : 0.0843235 MHz
make[1]: Leaving directory '/home/shangrex/program/srv32/sim'
make[1]: Entering directory '/home/shangrex/program/srv32/tools'
./rvsim --memsize 128 -l trace.log ../sw/a1/a1.elf
The result is 1. (1 is true and 0 is false.)
 Excuting 2134 instructions, 2856 cycles, 1.338 CPI
Program terminate
Simulation statistics
=====================
Simulation time : 0.001 s
Simulation cycles: 2856
Simulation speed : 2.320 MHz
make[1]: Leaving directory '/home/shangrex/program/srv32/tools'
Compare the trace between RTL and ISS simulator
=== Simulation passed ===
```
:::

:::spoiler sim/Makefile
```makefile=
verilator ?= 1
top       ?= 0
rv32c     ?= 0
debug     ?= 0
coverage  ?= 0
memsize   ?= 128

# Run flags
RFLAGS = +trace $(if $(debug), +dump)

TARGET = sim

ifeq (, $(shell which stdbuf))
STDBUF =
else
STDBUF = stdbuf -o0 -e0
endif

TARGET_SIM = iverilog

ifeq ($(top),1)
_top := 1
endif

ifeq ($(coverage),1)
_coverage := 1
endif

ifeq ($(rv32c),1)
_rv32c := 1
endif

ifeq ($(verilator),1)
BFLAGS = -O3 -cc -Wall -Wno-STMTDLY -Wno-UNUSED \
         +define+MEMSIZE=$(memsize) \
         $(if $(_top), +define+SINGLE_RAM) \
         $(if $(_rv32c), +define+RV32C_ENABLED) \
         $(if $(_coverage), --coverage) \
         --trace-fst --Mdir sim_cc --build --exe sim_main.cpp getch.cpp
UNAME_S := $(shell uname -s)
ifeq ($(UNAME_S),Linux)
UNAME_M := $(shell uname -m)
ifeq ($(UNAME_M),x86_64)
export VERILATOR_ROOT=$(CURDIR)/../verilator
TARGET_SIM = $(VERILATOR_ROOT)/bin/verilator
SHELL_HACK := $(shell ln -sf $(TARGET_SIM)_bin-linux-x86_64 $(TARGET_SIM)_bin)
endif
else
TARGET_SIM = verilator
endif
else
BFLAGS = $(if $(_top), -D SINGLE_RAM=1) $(if $(_rv32c), -DRV32C_ENABLED=1)
endif

FILELIST = -f filelist.txt $(if $(_top), ../rtl/top_s.v, ../rtl/top.v)

all: $(TARGET)

$(TARGET):
	$(TARGET_SIM) $(BFLAGS) -o $(TARGET) $(FILELIST)
	@if [ "$(verilator)" = "1" ]; then \
		mv sim_cc/sim .; \
	fi

%.run: $(TARGET) checkcode.awk
	@if [ ! -f ../sw/$*/memory.bin ]; then \
		make -C ../sw $*; \
	fi
	@cp ../sw/$*/*.bin .
	@$(STDBUF) ./$(TARGET) $(RFLAGS) | awk -f $(filter %.awk, $^)
	@if [ -f coverage.dat ]; then \
		mv coverage.dat $*_cov.dat; \
	fi

clean:
	@$(RM) $(TARGET) wave.* trace.log *.bin dump.txt
	@$(RM) -rf sim_cc *_cov.dat

distclean: clean
	@$(RM) $(TARGET_SIM)_bin
```
:::

## Reference
* [Instruction Set Simulator (ISS)](https://en.wikipedia.org/wiki/Instruction_set_simulator)
* [Register-Transfer Level (RTL)](https://en.wikipedia.org/wiki/Register-transfer_level)
* [GTKWave tutorial](https://hackmd.io/@sysprog/S1Udn1Xtt)
* [RISC-V specification](https://riscv.org/wp-content/uploads/2017/05/riscv-spec-v2.2.pdf)

Fun fact from the reference: the value `0x00000013` represents a NOP instruction (`addi x0, x0, 0`).