---
tags: Computer Architecture, RISCV, srv32, jserv, best tech Youtuber
---
# Assignment3: SoftCPU
contributed by <[`tobychui`](https://github.com/tobychui)>
:::spoiler
I just put them here for easy access :)
[Lab3](https://hackmd.io/@sysprog/S1Udn1Xtt)
[srv32](https://github.com/sysprog21/srv32)
[A1-src](https://gist.github.com/tobychui/032db06f31c7b72be1df5b30ab8d937c)
:::
### Setting up the Environment
The first problem I hit while following the Lab3 steps was that ```make all``` failed.

Then I noticed that the README of srv32 states:
> The default tools uses riscv64-unknown-elf-. If would like to use others toolchains, you can define a environment to override it. For example,
```bash
export CROSS_COMPILE=riscv-none-embed-
```
:::warning
You should check [sysprog21/srv32](https://github.com/sysprog21/srv32) which contains the latest information about GNU Toolchain.
:notes: jserv
:::
So I decided to reuse the last assignment's ```setenv``` file to configure the runtime environment. Then I went back into the `sim` directory and executed ```sudo make all``` again, and everything worked.
Next, I tried to run the tests to see whether they work. I cd into the tests directory and executed ```make tests``` as suggested by the Lab3 instructions, and the build broke again.

:::warning
Never show plaintext with screenshots! Use Markdown syntax.
:notes: jserv
:::
I noticed the ISS simulator was not built correctly. So I cd into the `tools` directory and ran ```make```. A few linker object messages popped up from the compiler once the make process finished.
Then I went back and ran `make tests` again, and the built-in tests executed successfully.

### OK, Let's Start Again
Based on jserv's comments, I reinstalled the environment.
First, you need the latest RISC-V toolchain.
I tried to use the pre-compiled version; however, the download and installation failed.
```
Downloading https://github.com/xpack-dev-tools/riscv-none-embed-gcc-xpack/releases/download/v10.2.0-1.2/xpack-riscv-none-embed-gcc-10.2.0-1.2-linux-x64.tar.gz...
Extracting 'xpack-riscv-none-embed-gcc-10.2.0-1.2-linux-x64.tar.gz'...
Killed
# Retry
pi@debian:~$ xpm install @xpack-dev-tools/riscv-none-embed-gcc@latest
@xpack-dev-tools/riscv-none-embed-gcc@10.2.0-1.2.1...
error: installing standalone binary xPack not supported
```
:::warning
Don't do that! You should download the tarball and extract.
:notes: jserv
:::
So I decided to build from source instead.
>When in doubt, build from source
> -- <cite>Probably no one</cite>
```
# Install all building requirements
sudo apt install autoconf automake autotools-dev curl gawk git build-essential bison flex texinfo gperf libtool patchutils bc libmpc-dev libmpfr-dev libgmp-dev gawk zlib1g-dev libexpat1-dev
# Clone the source repo
git clone --recursive https://github.com/riscv/riscv-gnu-toolchain
# Build it
cd riscv-gnu-toolchain
mkdir -p build && cd build
../configure --prefix=/opt/riscv --enable-multilib
sudo make -j$(nproc)
# Without sudo will cause permission denied error during compilation
```
Then the building process failed as follows:
```
libcommon.a ../libcpp/libcpp.a ../libbacktrace/.libs/libbacktrace.a ../libiberty/libiberty.a ../libdecnumber/libdecnumber.a -lmpc -lmpfr -lgmp -rdynamic -ldl -lz
collect2: fatal error: ld terminated with signal 9 [Killed]
compilation terminated.
make[2]: *** [../../../riscv-gcc/gcc/c/Make-lang.in:87: cc1] Error 1
make[2]: *** Waiting for unfinished jobs....
rm gfdl.pod gcc.pod gcov-dump.pod gcov-tool.pod fsf-funding.pod gpl.pod cpp.pod gcov.pod lto-dump.pod
make[2]: Leaving directory '/home/pi/riscv-gnu-toolchain/build/build-gcc-newlib-stage1/gcc'
make[1]: *** [Makefile:4426: all-gcc] Error 2
make[1]: Leaving directory '/home/pi/riscv-gnu-toolchain/build/build-gcc-newlib-stage1'
make: *** [Makefile:521: stamps/build-gcc-newlib-stage1] Error 2
```
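After some checking, the compilation failure turned out to be caused by running out of storage, as indicated by the 100% used /dev/sda1 shown by `df -h`. (A linker killed by signal 9 can also point to the kernel's out-of-memory killer, so memory is worth checking too, but here the disk was full.)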
```
pi@debian:~/riscv-gnu-toolchain/build$ df -h
Filesystem Size Used Avail Use% Mounted on
udev 969M 0 969M 0% /dev
tmpfs 199M 984K 198M 1% /run
/dev/sda1 15G 14G 132M 100% /
tmpfs 992M 15M 977M 2% /dev/shm
tmpfs 5.0M 4.0K 5.0M 1% /run/lock
tmpfs 199M 52K 199M 1% /run/user/1000
/dev/sr0 57M 57M 0 100% /media/cdrom0
tmpfs 199M 44K 199M 1% /run/user/115
```
Unfortunately, I have no more space left on my laptop to assign to the VM.
After switching to my desktop and migrating the installation from Debian to Ubuntu 20.04, I tried to clone the source again and this error occurred:
```
Cloning into '/home/tc/riscv-gnu-toolchain/qemu'...
fatal: unable to access 'https://git.qemu.org/git/qemu.git/': server certificate verification failed. CAfile: none CRLfile: none
fatal: clone of 'https://git.qemu.org/git/qemu.git' into submodule path '/home/tc/riscv-gnu-toolchain/qemu' failed
Failed to clone 'qemu' a second time, aborting
```
It seems that the certificate of the QEMU repo had some issue. Hence, I disabled git's SSL verification for this session with
```
export GIT_SSL_NO_VERIFY=1
```
and tried again to update the qemu submodule using
```
git submodule update --init --recursive
```
After this, I ran the configure and make commands and set up the compiler paths, and everything started working.

:::spoiler
Later on, after I messed up my Ubuntu and needed another fresh installation, I found out that the following works just as well:
1. Download the release from the toolchain release page using wget
https://github.com/xpack-dev-tools/riscv-none-embed-gcc-xpack/releases/
2. Then follow the instructions on the xPack website to install it manually
```
$ mkdir -p ~/opt
$ cd ~/opt
$ tar xvf ~/Downloads/xpack-riscv-none-embed-gcc-10.2.0-1.2-linux-x64.tar.gz
$ chmod -R -w xpack-riscv-none-embed-gcc-10.2.0-1.2
```
3. Set the environment variable
```
export CROSS_COMPILE=~/opt/xpack-riscv-none-embed-gcc-10.2.0-1.2/bin/riscv-none-embed-
```
:::
Next, clone srv32 into your home directory
```
cd ~/
git clone https://github.com/sysprog21/srv32
```
And build the simulators with:
```
cd ~/srv32/tools/
make
cd ~/srv32/sim/
make
```
Then try the built-in test cases
```
cd ../tests/
make tests-sw
```
Output as follows:
```
riscv-test-env/verify.sh
Compare to reference files ...
Check I-CSRRC-01 ... OK
Check I-CSRRCI-01 ... OK
Check I-CSRRS-01 ... OK
Check I-CSRRSI-01 ... OK
Check I-CSRRW-01 ... OK
Check I-CSRRWI-01 ... OK
--------------------------------
OK: 6/6 RISCV_TARGET=srv32 RISCV_DEVICE=rv32Zicsr RISCV_ISA=rv32Zicsr
```
## Requirement 1
First, I modified the A1 code to make it work with the RISC-V compiler by changing every int variable to volatile int in the source code.
New source code:
:::spoiler
```
#include <stdio.h>
#include <stdlib.h>

volatile int inputNums[6] = {2,5,1,3,4,7};
volatile int inputNumberCount = 6;

/* Interleave the first n elements of nums with the last n elements.
   returnSize is used as the output buffer. */
volatile int* shuffle(volatile int* nums, volatile int numsSize, volatile int n, volatile int* returnSize){
    if (numsSize <= 2){
        return nums;
    }
    volatile int counter = 0;
    for (volatile int i = 0; i < n; i++){
        returnSize[counter] = nums[i];
        counter++;
        returnSize[counter] = nums[i+n];
        counter++;
    }
    return returnSize;
}

int main(){
    volatile int pointCounts = inputNumberCount / 2;
    volatile int *result = (int*)malloc(inputNumberCount * sizeof(int));
    //Print the original array
    printf("Original Array is \n");
    for (volatile int i = 0; i < inputNumberCount; i++){
        printf("%d ", inputNums[i]);
    }
    //Shuffle the array
    volatile int* outputNums = shuffle(inputNums, inputNumberCount, pointCounts, result);
    //Print the shuffled array
    printf("\nShuffled Array is \n");
    for (volatile int j = 0; j < inputNumberCount; j++){
        printf("%d ", outputNums[j]);
    }
    return 0;
}
```
:::
Then the program was uploaded with WinSCP to ./sw/a1, relative to the srv32 root directory. After modifying the Makefile and executing make, this error occurred:
```
/home/tc/opt/xpack-riscv-none-embed-gcc-10.2.0-1.2/bin/riscv-none-embed-gcc -O3 -Wall -march=rv32im -mabi=ilp32 -nostartfiles -nostdlib -L../common -o a1.elf a1.c -lc -lm -lgcc -lsys -T ../common/default.ld
/home/tc/opt/xpack-riscv-none-embed-gcc-10.2.0-1.2/bin/../lib/gcc/riscv-none-embed/10.2.0/../../../../riscv-none-embed/bin/ld: cannot find -lsys
collect2: error: ld returned 1 exit status
make: *** [Makefile:16: a1] Error 1
```
After reading the source repo from [kuopinghsu](https://github.com/kuopinghsu/srv32) and investigating the Makefile, I noticed that you cannot just call make inside the sw/ directory. Instead, you have to run the Makefile in the root directory with ```make a1```. The A1 code was tested and the execution result was as follows:
```
./rvsim --memsize 128 -l trace.log ../sw/a1/a1.elf
Original Array is
2 5 1 3 4 7
Shuffled Array is
1 3 5 4 1 7
Excuting 9408 instructions, 12012 cycles, 1.277 CPI
Program terminate
Simulation statistics
=====================
Simulation time : 0.004 s
Simulation cycles: 12012
Simulation speed : 3.417 MHz
make[1]: Leaving directory '/home/tc/srv32/tools'
Compare the trace between RTL and ISS simulator
=== Simulation passed ===
```
### Requirement 2
In this requirement, I need to open the generated wave.fst file in GTKWave. The first thing to do is to locate the wave.fst file.
```
find . -name wave.fst
> ./sim/wave.fst
```
Next, download and start GTKWave on Windows
https://sourceforge.net/projects/gtkwave/
As there is no "import fst file" option in GTKWave, following the instructions from the [GTKWave User Guide](http://gtkwave.sourceforge.net/gtkwave.pdf), you need to start the application from a terminal such as PowerShell
```
cd gtkwave/bin
gtkwave.exe -f ../../sim/wave.fst
```
Then, you will be greeted by the GTKWave interface

After setting up the correct fields, we can now see the waveform from the assembly instructions

Let's take this snapshot as an example

I think this part corresponds to the for loop of the original C code, where the logic constantly updates the array position; hence, jump and dmem keep changing through these clock periods.
We can also observe the pipeline in operation if we change the display order of the waveform as follows:

From the bottom two lanes, we can see the shifting characteristic of pipelined processor operations (which corresponds to the DEC and EXEC stages of the pipeline)

### Requirement 3
After simulating the assembly generated by the RISC-V gcc toolchain, I noticed a few locations in my original assembly code that can be improved to gain better performance.
#### Fewer instructions
- In the loading period, where the initial inputNums array is filled, the stores can be replaced by a word array defined in the .data section instead
::: spoiler
Loading was done using li and sw
```
#Start filling the inputNums array
li t0, 2
sw t0, 0(s0)
li t0, 5
sw t0, 4(s0)
li t0, 1
sw t0, 8(s0)
li t0, 3
sw t0, 12(s0)
li t0, 4
sw t0, 16(s0)
li t0, 7
sw t0, 20(s0)
```
Actually this can be done with .data, e.g.
```
.data
nums: .word 2,5,1,3,4,7
```
:::
#### Eliminate unnecessary stalls
- The original assembly was written with function reusability in mind, so many jumps are involved in the result printing sections
- A lot of stalls / NOP operations
- Constantly changing PC
::: spoiler
Original implementation of code reusability in assembly, turns out to be a bad idea :-1:
```
# Print function helpers, print the word / int value in a0
# Call with jal opcode
printInt:
li a7, 1
ecall
jr ra
printString:
li a7, 4
ecall
jr ra
printSepeartor:
la a0, str2
li a7, 4
ecall
jr ra
```
:::
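Solution: remove all the jumps and print sequentially instead.
- Part of the code can also use a second temporary register so that a store does not immediately follow the load it depends on, which reduces data hazards. For example, the following lines:
```
lw t3, 0(s0)
sw t3, 0(s1)
```
can be interleaved with the next element (the offsets assume n = 3, as in the input above):
```
lw t3, 0(s0)
lw t4, 12(s0)
sw t3, 0(s1)
sw t4, 4(s1)
```
Each store still has to use the register that its own load wrote.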
After applying the optimization to the C file by unrolling all the loops in the source code, the new performance of the program is as follows.
```
Original Array is
2 5 1 3 4 7
Shuffled Array is
1 3 5 4 1 7
Excuting 9334 instructions, 11924 cycles, 1.277 CPI
Program terminate
- ../rtl/../testbench/testbench.v:418: Verilog $finish
Simulation statistics
=====================
Simulation time : 0.069 s
Simulation cycles: 11935
Simulation speed : 0.172971 MHz
make[1]: Leaving directory '/home/tc/srv32/sim'
make[1]: Entering directory '/home/tc/srv32/tools'
./rvsim --memsize 128 -l trace.log ../sw/a1/a1.elf
Original Array is
2 5 1 3 4 7
Shuffled Array is
1 3 5 4 1 7
Excuting 9334 instructions, 11924 cycles, 1.277 CPI
Program terminate
Simulation statistics
=====================
Simulation time : 0.003 s
Simulation cycles: 11924
Simulation speed : 3.430 MHz
make[1]: Leaving directory '/home/tc/srv32/tools'
Compare the trace between RTL and ISS simulator
=== Simulation passed ===
```
Thus, we reduced the cycle count from 12012 to 11924, saving 88 cycles in total, and the instruction count from 9408 to 9334.
### Requirement 4
RISC-V compliance tests matter because they check that a hardware implementation (i.e. a RISC-V core) follows the specification, so that software written for RISC-V can run on every implementation that complies with the same profile. As stated in their README:
>The result that the architecture tests provide to the user is an assurance that the specification has been interpreted correctly and the implementation under test (DUT) can be declared as RISC-V Architecture Test compliant.

In other words, the tests only check that the implementation under test conforms to the RISC-V specification; they do not test or verify the design itself.
srv32 is a 3-stage RISC-V core and Verilator is a simulator for Verilog. Hence, running the tests on srv32 inside Verilator checks its implementation and estimates the speed of the design through digital simulation.