# Assignment3: SoftCPU
###### tags: `Computer Architecture` `RISC-V` `jserv`

> Contributed by <[tonych1997](https://github.com/tonych1997/Computer-Architecture)>

[Assignment3 Requirements](https://hackmd.io/@sysprog/2022-arch-homework3)
[Lab3: srv32 - RISCV RV32IM Soft CPU](https://hackmd.io/@sysprog/S1Udn1Xtt)
[Github - srv32](https://github.com/sysprog21/srv32)

:::info
2022.12.12 This assignment is almost complete =)
2022.12.07 My environment installation was successful! I am in the process of completing my subsequent content.
2022.11.30 I spent A LOT OF TIME building the environment, but it still fails. I'll add the content after I succeed.
:::

## Old Environment Installation
**Originally I used `Windows wsl Ubuntu 20.04`** to install the environment, but I encountered many problems in the process. Therefore, on the advice of the TA, I **switched to `Linux Ubuntu 20.04`**, where I still ran into many errors. I found that the order of the installation steps, and whether certain packages are installed at all, affects the result. I therefore recorded the final successful steps below and kept the failed attempts in another HackMD note: [Assignment3: RISC-V Environment Building old (failed) version](https://hackmd.io/@Vgwl_uixQFasIvsDbsFlvA/risc-v-hw3_old).

## Install environment
I use `Linux Ubuntu 20.04` to build the environment.

### Set environment variables
:::info
* If an unintended error occurs during execution or installation, refer to this paragraph to resolve it.
* If this is the FIRST time to install, start at the [first step](https://hackmd.io/LJeLMHbrRnqaLhL7LCQNkw?both#1-Update-Ubuntu).
:::

Confirm the environment variables and PATH settings.
```
$ export        # Check PATH, CROSS_COMPILE, VERILATOR_ROOT etc.
$ echo $PATH    # check PATH only
```
Shutting down, rebooting, closing or restarting the terminal, or using a different terminal can cause the environment variables or PATH to be lost.
If an environment variable is missing, run the following commands to set it again.
```
# Set Cross Compile and Verilator root
$ export CROSS_COMPILE=riscv-none-elf-
$ export VERILATOR_ROOT=$HOME/verilator
```
If the PATH is missing, run the following commands to reset it.
```
# Set RISC-V toolchain PATH
$ cd $HOME/riscv-none-elf-gcc
$ echo "export PATH=`pwd`/bin:$PATH" > setenv
$ cd $HOME
$ source riscv-none-elf-gcc/setenv
```
```
# Set verilator PATH
$ export PATH=$VERILATOR_ROOT/bin:$PATH
```

### 1. Update Ubuntu
First, update Ubuntu.
```
$ sudo apt-get update
```

### 2. RISC-V toolchain
The RISC-V toolchain (riscv-none-elf-gcc) was already installed in Assignment 2, so the toolchain installation procedure follows my [notes](https://hackmd.io/@Vgwl_uixQFasIvsDbsFlvA/risc-v-hw2) from Assignment 2. Alternatively, refer to [Lab2: RISC-V RV32I[MACF] emulator with ELF support](https://hackmd.io/@sysprog/SJAR5XMmi). In other words, DO NOT run the commands that install xpack, riscv-gnu-toolchain, etc. again in Assignment 3.

First, check that the cross compiler and PATH are still set. If they are missing, set them again.
```
$ export        # Check PATH, CROSS_COMPILE, VERILATOR_ROOT etc.
$ echo $PATH    # check PATH only
```
Set the cross compiler environment variables.
:::warning
1. The toolchain used may be different. Verify that the installed toolchain matches the variables being set (e.g. riscv-gnu-toolchain).
2. The name of the toolchain was updated to [riscv-none-elf-gcc](https://xpack.github.io/blog/2022/08/30/riscv-none-elf-gcc-v12-2-0-1-released/#risc-none-elf-gcc) (the old version used riscv-none-embed-gcc).
:::
```
# Set cross compiler
$ export CROSS_COMPILE=riscv-none-elf-
```
```
# Set RISC-V toolchain PATH
$ cd $HOME/riscv-none-elf-gcc
$ echo "export PATH=`pwd`/bin:$PATH" > setenv
$ cd $HOME
$ source riscv-none-elf-gcc/setenv
```

### 3. Install all required packages
```
$ sudo apt-get install lcov
$ sudo apt install curl
$ curl -sL https://deb.nodesource.com/setup_14.x | sudo -E bash -
$ sudo apt-get install gcc g++ make
$ sudo apt-get update && sudo apt-get install yarn
$ sudo apt install -y nodejs
$ sudo apt install build-essential
```
```
$ sudo apt install build-essential lcov ccache libsystemc-dev
```

### 4. Install verilator
Install the Verilator package according to [srv32 - RISCV RV32IM Soft CPU](https://hackmd.io/@sysprog/S1Udn1Xtt) and the [verilator official installation document](https://verilator.org/guide/latest/install.html).
The `autoconf` command is missing from the lab3 document; it should be executed before ``export VERILATOR_ROOT=`pwd` ``.
The `make` command takes a while to finish. When it is done, the terminal shows the messages `build complete` and `Now type 'make test' to test`.
```
$ cd $HOME
$ git clone https://github.com/verilator/verilator
$ cd verilator
$ git checkout stable    # can't sudo
$ sudo apt-get install autoconf
$ autoconf    # create ./configure script
$ export VERILATOR_ROOT=`pwd`
$ sudo apt-get install flex bison    # package needed in ./configure
$ ./configure
$ make
```
If you then execute `make test`, the terminal shows these messages.
```
$ make test
Tests passed!
Now type 'make install' to install.
Or type 'make' inside an examples subdirectory.
```
After `make test`, run `make install`.
```
$ make install
```
Then set the environment variables.
```
$ export VERILATOR_ROOT=$HOME/verilator
$ export PATH=$VERILATOR_ROOT/bin:$PATH
```
And check the Verilator version.
```
$ verilator --version
Verilator 5.002 2022-10-29 rev v5.002-29-gdb39d70c7
```

### 5. Get srv32
I followed [srv32's Github](https://hackmd.io/@wIVnCcUaTouAktrkMVLEMA/Hy6vD5DtF) to get srv32 and run `make` in its directories.
#### Clone srv32 files.
```
$ cd ~/
$ git clone https://github.com/sysprog21/srv32
```
:::warning
Just do the commands described below.
That is, there is NO NEED to install or set up the RISC-V toolchain again by following the [building toolchain](https://github.com/sysprog21/srv32#building-toolchains) steps in the srv32 README.
:::

Build the simulator.
```
$ cd ~/srv32/tools
$ make
$ cd ~/srv32/sim
$ make
```
Then build the test cases.
:::warning
1. DO NOT execute `make tests` in `~/srv32/tests`, or you will get error messages.
2. DO NOT add `sudo` to the `make tests` command, because I found that with `sudo` another `gcc` is called to compile instead of `riscv-none-elf-gcc`.
:::
```
$ cd ~/srv32
$ make tests
```
![](https://i.imgur.com/X6mbQqe.png)
```
$ cd ~/srv32
$ make tests-sw
```
![](https://i.imgur.com/qLdWSW6.png)

### 6. Install GTKWave
Install GTKWave with the following commands.
```
$ sudo apt-get install gtkwave
$ cd sim && ./sim +dump
```
Then use Ubuntu's GUI to go to the `sim` directory (`~/srv32/sim`) and double-click the `wave.fst` file to open GTKWave.

## Q1: Assignment 2 - Search Insert Position
Here is the code I used in my [Assignment 2](https://hackmd.io/@Vgwl_uixQFasIvsDbsFlvA/risc-v-hw2).

### Code
C code in Details.
:::spoiler
```clike=
#include <stdio.h>

int searchInsert(int *nums, int numsSize, int target)
{
    int left = 0;
    int right = numsSize - 1;
    while (left <= right) {
        int mid = (left + right) / 2;
        if (target == nums[mid])
            return mid;
        else if (target < nums[mid])
            right = mid - 1;
        else
            left = mid + 1;
    }
    return left;
}

int main()
{
    int data[] = {1, 3, 5, 6};
    int size = 4;
    int tar1 = 5, tar2 = 2, tar3 = 7;
    int index1 = searchInsert(data, size, tar1);
    int index2 = searchInsert(data, size, tar2);
    int index3 = searchInsert(data, size, tar3);
    printf("The target1 insert position is %d\n", index1);
    printf("The target2 insert position is %d\n", index2);
    printf("The target3 insert position is %d\n", index3);
    return 0;
}
```
:::

---
Assembly code in Details.
The srv32 assembly code is as follows.

Code of `searchInsert` in Details.
:::spoiler
```
0000003c <searchInsert>:
  3c: fff58593  addi a1,a1,-1
  40: 00050693  mv a3,a0
  44: 0405c263  bltz a1,88 <searchInsert+0x4c>
  48: 00000713  li a4,0
  4c: 00b70533  add a0,a4,a1
  50: 40155513  srai a0,a0,0x1
  54: 00251793  slli a5,a0,0x2
  58: 00f687b3  add a5,a3,a5
  5c: 0007a783  lw a5,0(a5)
  60: 00c78a63  beq a5,a2,74 <searchInsert+0x38>
  64: 00f65a63  bge a2,a5,78 <searchInsert+0x3c>
  68: fff50593  addi a1,a0,-1
  6c: fee5d0e3  bge a1,a4,4c <searchInsert+0x10>
  70: 00070513  mv a0,a4
  74: 00008067  ret
  78: 00150713  addi a4,a0,1
  7c: fce5d8e3  bge a1,a4,4c <searchInsert+0x10>
  80: 00070513  mv a0,a4
  84: ff1ff06f  j 74 <searchInsert+0x38>
  88: 00000513  li a0,0
  8c: 00008067  ret
```
:::

Code of `main` in Details.
:::spoiler
```
00000090 <main>:
  90: 000207b7  lui a5,0x20
  94: 0fc78793  addi a5,a5,252 # 200fc <environ+0x70>
  98: 0007a603  lw a2,0(a5)
  9c: 0047a683  lw a3,4(a5)
  a0: 0087a703  lw a4,8(a5)
  a4: 00c7a783  lw a5,12(a5)
  a8: fe010113  addi sp,sp,-32
  ac: 00c12023  sw a2,0(sp)
  b0: 00d12223  sw a3,4(sp)
  b4: 00112e23  sw ra,28(sp)
  b8: 00812c23  sw s0,24(sp)
  bc: 00912a23  sw s1,20(sp)
  c0: 00e12423  sw a4,8(sp)
  c4: 00f12623  sw a5,12(sp)
  c8: 00300693  li a3,3
  cc: 00000593  li a1,0
  d0: 00500613  li a2,5
  d4: 00b687b3  add a5,a3,a1
  d8: 4017d793  srai a5,a5,0x1
  dc: 00279713  slli a4,a5,0x2
  e0: 01070713  addi a4,a4,16
  e4: 00270733  add a4,a4,sp
  e8: ff072703  lw a4,-16(a4)
  ec: 12c70c63  beq a4,a2,224 <main+0x194>
  f0: 10e65863  bge a2,a4,200 <main+0x170>
  f4: fff78693  addi a3,a5,-1
  f8: fcb6dee3  bge a3,a1,d4 <main+0x44>
  fc: 00300693  li a3,3
 100: 00000493  li s1,0
 104: 00200613  li a2,2
 108: 00d487b3  add a5,s1,a3
 10c: 4017d793  srai a5,a5,0x1
 110: 00279713  slli a4,a5,0x2
 114: 01070713  addi a4,a4,16
 118: 00270733  add a4,a4,sp
 11c: ff072703  lw a4,-16(a4)
 120: 0cc70c63  beq a4,a2,1f8 <main+0x168>
 124: 0ae65863  bge a2,a4,1d4 <main+0x144>
 128: fff78693  addi a3,a5,-1
 12c: fc96dee3  bge a3,s1,108 <main+0x78>
 130: 00000413  li s0,0
 134: 00300693  li a3,3
 138: 00700613  li a2,7
 13c: 00d407b3  add a5,s0,a3
 140: 4017d793  srai a5,a5,0x1
 144: 00279713  slli a4,a5,0x2
 148: 01070713  addi a4,a4,16
 14c: 00270733  add a4,a4,sp
 150: ff072703  lw a4,-16(a4)
 154: 06c70c63  beq a4,a2,1cc <main+0x13c>
 158: 04e65863  bge a2,a4,1a8 <main+0x118>
 15c: fff78693  addi a3,a5,-1
 160: fc86dee3  bge a3,s0,13c <main+0xac>
 164: 00020537  lui a0,0x20
 168: 09050513  addi a0,a0,144 # 20090 <environ+0x4>
 16c: 100000ef  jal ra,26c <printf>
 170: 00020537  lui a0,0x20
 174: 00048593  mv a1,s1
 178: 0b450513  addi a0,a0,180 # 200b4 <environ+0x28>
 17c: 0f0000ef  jal ra,26c <printf>
 180: 00020537  lui a0,0x20
 184: 00040593  mv a1,s0
 188: 0d850513  addi a0,a0,216 # 200d8 <environ+0x4c>
 18c: 0e0000ef  jal ra,26c <printf>
 190: 01c12083  lw ra,28(sp)
 194: 01812403  lw s0,24(sp)
 198: 01412483  lw s1,20(sp)
 19c: 00000513  li a0,0
 1a0: 02010113  addi sp,sp,32
 1a4: 00008067  ret
 1a8: 00178413  addi s0,a5,1
 1ac: fa86cce3  blt a3,s0,164 <main+0xd4>
 1b0: 00d407b3  add a5,s0,a3
 1b4: 4017d793  srai a5,a5,0x1
 1b8: 00279713  slli a4,a5,0x2
 1bc: 01070713  addi a4,a4,16
 1c0: 00270733  add a4,a4,sp
 1c4: ff072703  lw a4,-16(a4)
 1c8: f8c718e3  bne a4,a2,158 <main+0xc8>
 1cc: 00078413  mv s0,a5
 1d0: f95ff06f  j 164 <main+0xd4>
 1d4: 00178493  addi s1,a5,1
 1d8: f496cce3  blt a3,s1,130 <main+0xa0>
 1dc: 00d487b3  add a5,s1,a3
 1e0: 4017d793  srai a5,a5,0x1
 1e4: 00279713  slli a4,a5,0x2
 1e8: 01070713  addi a4,a4,16
 1ec: 00270733  add a4,a4,sp
 1f0: ff072703  lw a4,-16(a4)
 1f4: f2c718e3  bne a4,a2,124 <main+0x94>
 1f8: 00078493  mv s1,a5
 1fc: f35ff06f  j 130 <main+0xa0>
 200: 00178593  addi a1,a5,1
 204: eeb6cce3  blt a3,a1,fc <main+0x6c>
 208: 00b687b3  add a5,a3,a1
 20c: 4017d793  srai a5,a5,0x1
 210: 00279713  slli a4,a5,0x2
 214: 01070713  addi a4,a4,16
 218: 00270733  add a4,a4,sp
 21c: ff072703  lw a4,-16(a4)
 220: ecc718e3  bne a4,a2,f0 <main+0x60>
 224: 00078593  mv a1,a5
 228: ed5ff06f  j fc <main+0x6c>
```
:::

---
### Modify Makefile
I created the C source file for the Assignment 2 search insert position program under `sw/sip` (`~/srv32/sw/sip`).
Then I wrote the `Makefile` by referring to the one in `sw/hello` (`~/srv32/sw/hello`) and changed `SRC` and `TARGET` to my file name `sip`.
Note: The directory name is the same as the file name.
```
include ../common/Makefile.common

EXE      = .elf
SRC      = sip.c
CFLAGS  += -L../common
LDFLAGS += -T ../common/default.ld
TARGET   = sip
OUTPUT   = $(TARGET)$(EXE)

.PHONY: all clean

all: $(TARGET)

$(TARGET): $(SRC)
	$(CC) $(CFLAGS) -o $(OUTPUT) $(SRC) $(LDFLAGS)
	$(OBJCOPY) -j .text -O binary $(OUTPUT) imem.bin
	$(OBJCOPY) -j .data -O binary $(OUTPUT) dmem.bin
	$(OBJCOPY) -O binary $(OUTPUT) memory.bin
	$(OBJDUMP) -d $(OUTPUT) > $(TARGET).dis
	$(READELF) -a $(OUTPUT) > $(TARGET).symbol

clean:
```

---
### Result in srv32
Run `make sip` in the srv32 root directory (`~/srv32`). The result is shown below.
![](https://i.imgur.com/Xa0AnSl.png)

For more details, please refer to Details.
:::spoiler
```
t123@t123-BM6875-BM6675-BP6375:~/srv32$ make sip
make[1]: Entering directory '/home/t123/srv32/sw'
make -C common
make[2]: Entering directory '/home/t123/srv32/sw/common'
make[2]: Nothing to be done for 'all'.
make[2]: Leaving directory '/home/t123/srv32/sw/common'
make[2]: Entering directory '/home/t123/srv32/sw/sip'
riscv-none-elf-gcc -O3 -Wall -march=rv32im_zicsr -mabi=ilp32 -misa-spec=2.2 -march=rv32im -nostartfiles -nostdlib -L../common -o sip.elf sip.c -lc -lm -lgcc -lsys -T ../common/default.ld
riscv-none-elf-objcopy -j .text -O binary sip.elf imem.bin
riscv-none-elf-objcopy -j .data -O binary sip.elf dmem.bin
riscv-none-elf-objcopy -O binary sip.elf memory.bin
riscv-none-elf-objdump -d sip.elf > sip.dis
riscv-none-elf-readelf -a sip.elf > sip.symbol
make[2]: Leaving directory '/home/t123/srv32/sw/sip'
make[1]: Leaving directory '/home/t123/srv32/sw'
make[1]: Entering directory '/home/t123/srv32/sim'
Excuting 1267 instructions, 1883 cycles, 1.486 CPI
Program terminate
- ../rtl/../testbench/testbench.v:434: Verilog $finish
Simulation statistics
=====================
Simulation time : 0.138 s
Simulation cycles: 1894
Simulation speed : 0.0137246 MHz
make[1]: Leaving directory '/home/t123/srv32/sim'
make[1]: Entering directory '/home/t123/srv32/tools'
./rvsim --memsize 128 -l trace.log ../sw/sip/sip.elf
The target1 insert position is 2
The target2 insert position is 1
The target3 insert position is 4
Excuting 7094 instructions, 9752 cycles, 1.375 CPI
Program terminate
Simulation statistics
=====================
Simulation time : 0.006 s
Simulation cycles: 9752
Simulation speed : 1.701 MHz
make[1]: Leaving directory '/home/t123/srv32/tools'
Compare the trace between RTL and ISS simulator
Files sim/trace.log and tools/trace.log differ
make: *** [Makefile:121: sip] Error 1
```
:::

Tabular Results.
| Compare           | sim (verilog) | rvsim (c++) |
|-------------------|---------------|-------------|
| Instructions      | 1267          | 7094        |
| Cycles            | 1883          | 9752        |
| CPI               | 1.486         | 1.375       |
| Simulation time   | 0.138 s       | 0.006 s     |
| Simulation cycles | 1894          | 9752        |
| Simulation speed  | 0.0137246 MHz | 1.701 MHz   |

---
### Waveform analysis
[`srv32` is a 3-stage pipeline architecture with IF/ID, EX, WB stages. The following diagram marks some important signals for later discussion.](https://hackmd.io/@sysprog/S1Udn1Xtt#Pipeline-architecture)
![](https://i.imgur.com/iPWjBBI.png)

After `make`, double-click `wave.fst` under `~/srv32/sim` in the Ubuntu GUI to start GTKWave. The following window appears.
![](https://i.imgur.com/lLMLZDS.png)

Then open `sip.dis` in `~/srv32/sw/sip`. This file is the disassembly of the `sip` C program. Match the `PC` values in this file against the `PC` signals in the waveform to find the waveform of the corresponding instructions.
Click `Search` -> `Pattern Search` to open the `Waveform Display Search` window, where a `PC` value can be searched.
![](https://i.imgur.com/b4Mr2PK.png =50%x)

I chose to observe these signals.
![](https://i.imgur.com/IWsoMi9.png)

We can see that each instruction address moves through the pipeline in the order `fetch_pc` -> `if_pc` -> `ex_pc` -> `wb_pc`.
![](https://i.imgur.com/sAZq1N5.png)

In `imem_rdata` and `ex_insn`, we can see that the instruction word is transferred from the IF/ID stage to the EX stage.
![](https://i.imgur.com/UFrztFO.png)

#### [Data hazard](https://hackmd.io/@sysprog/S1Udn1Xtt#Data-hazard)
According to [[srv32 data hazard]](https://hackmd.io/@sysprog/S1Udn1Xtt#Data-hazard), `srv32` supports full forwarding, so it does not need to stall to resolve a RAW data hazard. Moreover, `srv32` only has RAW data hazards, because the other hazards (WAW, WAR) cannot occur on a single-issue in-order processor.
For example, I found a RAW data hazard in `main`. The waveform where it occurs is shown below.
```
00000090 <main>:
 ...
  ec: 12c70c63  beq a4,a2,224 <main+0x194>
  f0: 10e65863  bge a2,a4,200 <main+0x170>
 ...
```
In this part, a data hazard occurs when `ex_mem2reg=1` and `wb_alu2reg=1`. Note that both branches only read `a4`; in the full listing, the value is produced by the preceding `lw a4,-16(a4)` at `e8`.
![](https://i.imgur.com/WSmgQta.png)

#### [Load-use hazard](https://hackmd.io/@sysprog/S1Udn1Xtt#Load-use-hazard)
If the first instruction is a load, its result is not available until the data memory has been read, so it is forwarded to the EX stage of the next instruction from the WB stage.
```
00000090 <main>:
 ...
 1f0: ff072703  lw a4,-16(a4)
 1f4: f2c718e3  bne a4,a2,124 <main+0x94>
 1f8: 00078493  mv s1,a5
 1fc: f35ff06f  j 130 <main+0xa0>
 ...
```
There is a load-use hazard between `1f0` and `1f4`, because `bne` reads the `a4` that `lw` loads. When `1f8` is in `IF`, `1f4` is in `EX`, and `1f0` is in `WB`, `(wb_dst_sel == ex_src1_sel)` is true and `wb_mem2reg` is true.
![](https://i.imgur.com/zMDeo5l.png)

| PC  | Instruction    | cycle 1 | c2    | c3    | c4 | c5 |
| --- | -------------- | ------- | ----- | ----- | -- | -- |
| 1f0 | lw a4,-16(a4)  | IF/ID   | EX    | WB    |    |    |
| 1f4 | bne a4,a2,124  |         | IF/ID | EX    | WB |    |
| 1f8 | mv s1,a5       |         |       | IF/ID | EX | WB |

#### [Branch penalty](https://hackmd.io/@sysprog/S1Udn1Xtt#Branch-penalty)
The branch penalty is the number of instructions killed after a branch instruction when the branch is TAKEN. In `srv32`, the branch penalty is 2, which means the 2 instructions fetched after the branch are killed. In other words, they behave like 2 NOPs inserted after the branch instruction.
In this section of `main`, we can see the branch instruction `bge`.
```
00000090 <main>:
 ...
 160: fc86dee3  bge a3,s0,13c <main+0xac>
 164: 00020537  lui a0,0x20
 168: 09050513  addi a0,a0,144 # 20090 <environ+0x4>
 ...
```
Next, find the corresponding waveform.
![](https://i.imgur.com/jeOZQoE.png)

The order of execution is shown below.
|         |                       | IF/ID       | EX            | WB    |
| ------- | --------------------- | ----------- | ------------- | ----- |
| next_pc | fetch_pc (imem_addr)  | if_pc       | ex_pc         | wb_pc |
| xxx     | addi a0,a0,144        | lui a0,0x20 | bge a3,s0,13c |       |

In terms of cycles, the execution looks as follows.

| PC  | Instruction          | cycle 1 | c2    | c3    | c4    | c5  | c6 |
| --- | -------------------- | ------- | ----- | ----- | ----- | --- | -- |
| 160 | bge a3,s0,13c        | IF/ID   | EX    | WB    |       |     |    |
| 164 | lui a0,0x20          |         | NOP   | NOP   | NOP   |     |    |
| 168 | addi a0,a0,144       |         |       | NOP   | NOP   | NOP |    |
| xxx | exec if branch taken |         |       |       | IF/ID | EX  | WB |

---
### Software Optimizations
I tried to modify the C code so that it uses fewer cycles.
I shortened the C program by replacing the three separate calls with a loop, and then built the new program.

New C program.
```clike=
#include <stdio.h>

int searchInsert(int *nums, int numsSize, int target)
{
    int left = 0;
    int right = numsSize - 1;
    while (left <= right) {
        int mid = (left + right) / 2;
        if (target == nums[mid])
            return mid;
        else if (target < nums[mid])
            right = mid - 1;
        else
            left = mid + 1;
    }
    return left;
}

int main()
{
    int data[] = {1, 3, 5, 6};
    int size = 4;
    int tar[] = {5, 2, 7};
    for (int i = 0; i < sizeof(tar) / sizeof(tar[0]); i++) {
        printf("The target%d insert position is %d\n", i + 1, searchInsert(data, size, tar[i]));
    }
    return 0;
}
```
After `make`, we get the result shown below.
![](https://i.imgur.com/oFibbgu.png)

:::spoiler
```
t123@t123-BM6875-BM6675-BP6375:~/srv32$ make sip2
make[1]: Entering directory '/home/t123/srv32/sw'
make -C common
make[2]: Entering directory '/home/t123/srv32/sw/common'
make[2]: Nothing to be done for 'all'.
make[2]: Leaving directory '/home/t123/srv32/sw/common'
make[2]: Entering directory '/home/t123/srv32/sw/sip2'
riscv-none-elf-gcc -O3 -Wall -march=rv32im_zicsr -mabi=ilp32 -misa-spec=2.2 -march=rv32im -nostartfiles -nostdlib -L../common -o sip2.elf sip2.c -lc -lm -lgcc -lsys -T ../common/default.ld
riscv-none-elf-objcopy -j .text -O binary sip2.elf imem.bin
riscv-none-elf-objcopy -j .data -O binary sip2.elf dmem.bin
riscv-none-elf-objcopy -O binary sip2.elf memory.bin
riscv-none-elf-objdump -d sip2.elf > sip2.dis
riscv-none-elf-readelf -a sip2.elf > sip2.symbol
make[2]: Leaving directory '/home/t123/srv32/sw/sip2'
make[1]: Leaving directory '/home/t123/srv32/sw'
make[1]: Entering directory '/home/t123/srv32/sim'
Excuting 1266 instructions, 1874 cycles, 1.480 CPI
Program terminate
- ../rtl/../testbench/testbench.v:434: Verilog $finish
Simulation statistics
=====================
Simulation time : 0.152 s
Simulation cycles: 1885
Simulation speed : 0.0124013 MHz
make[1]: Leaving directory '/home/t123/srv32/sim'
make[1]: Entering directory '/home/t123/srv32/tools'
./rvsim --memsize 128 -l trace.log ../sw/sip2/sip2.elf
The target1 insert position is 2
The target2 insert position is 1
The target3 insert position is 4
Excuting 8419 instructions, 11581 cycles, 1.376 CPI
Program terminate
Simulation statistics
=====================
Simulation time : 0.004 s
Simulation cycles: 11581
Simulation speed : 2.724 MHz
make[1]: Leaving directory '/home/t123/srv32/tools'
Compare the trace between RTL and ISS simulator
Files sim/trace.log and tools/trace.log differ
make: *** [Makefile:121: sip2] Error 1
```
:::

Tabular Results.

| Compare           | sim (verilog) | rvsim (c++) |
|-------------------|---------------|-------------|
| Instructions      | 1266          | 8419        |
| Cycles            | 1874          | 11581       |
| CPI               | 1.480         | 1.376       |
| Simulation time   | 0.152 s       | 0.004 s     |
| Simulation cycles | 1885          | 11581       |
| Simulation speed  | 0.0124013 MHz | 2.724 MHz   |

Compare with the original as shown below.
| Compare           | sim (verilog) | rvsim (c++) |
|-------------------|---------------|-------------|
| Instructions      | 1267          | 7094        |
| Cycles            | 1883          | 9752        |
| CPI               | 1.486         | 1.375       |
| Simulation time   | 0.138 s       | 0.006 s     |
| Simulation cycles | 1894          | 9752        |
| Simulation speed  | 0.0137246 MHz | 1.701 MHz   |

In `sim`, the CPI drops from 1.486 to 1.480 and the cycle count from 1883 to 1874, which is a step in the right direction. In `rvsim`, however, the CPI rises slightly from 1.375 to 1.376, and the cycle count grows from 9752 to 11581, so the change is not an improvement there.

---
## Q2: Leetcode [122. Best Time to Buy and Sell Stock II](https://leetcode.com/problems/best-time-to-buy-and-sell-stock-ii/)
I picked [張瑞甫's program in Assignment 2](https://hackmd.io/ADNQPiEFSPC_daP2GJ_sSQ) to complete my Assignment 3.

### Code
C code
```clike=
#include<stdio.h>
#include<stdlib.h>

int maxProfit(int* prices, int pricesSize){
    int totalProfit = 0;
    int i=0;
    int increases = 0;
    for (i = 1; i < pricesSize; i++) {
        if (prices[i] <= prices[i - 1]) {
            totalProfit += increases;
            increases = 0;
        }
        else
            increases += prices[i] - prices[i - 1];
    }
    totalProfit += increases;
    return totalProfit;
}

int main(){
    int arr[]={99,32,3,56,0,2,56,99};
    int size=8;
    int a= maxProfit(arr,size);
    printf("%d\n",a);
}
```
Assembly code in Details.

`maxProfit` assembly code in Details.
:::spoiler
```
0000003c <maxProfit>:
  3c: 00100793  li a5,1
  40: 0cb7d263  bge a5,a1,104 <maxProfit+0xc8>
  44: 00300793  li a5,3
  48: 0cb7d263  bge a5,a1,10c <maxProfit+0xd0>
  4c: ffc58713  addi a4,a1,-4
  50: 00052783  lw a5,0(a0)
  54: ffe77713  andi a4,a4,-2
  58: 00450693  addi a3,a0,4
  5c: 00370713  addi a4,a4,3
  60: 00000813  li a6,0
  64: 00100313  li t1,1
  68: 00000893  li a7,0
  6c: 0006a603  lw a2,0(a3)
  70: 00000e13  li t3,0
  74: 40f60eb3  sub t4,a2,a5
  78: 06c7d863  bge a5,a2,e8 <maxProfit+0xac>
  7c: 0046a783  lw a5,4(a3)
  80: 010e8e33  add t3,t4,a6
  84: 00000813  li a6,0
  88: 40c78eb3  sub t4,a5,a2
  8c: 06f65863  bge a2,a5,fc <maxProfit+0xc0>
  90: 01ce8833  add a6,t4,t3
  94: 00230313  addi t1,t1,2
  98: 00868693  addi a3,a3,8
  9c: fce318e3  bne t1,a4,6c <maxProfit+0x30>
  a0: 00271793  slli a5,a4,0x2
  a4: 00f507b3  add a5,a0,a5
  a8: 0180006f  j c0 <maxProfit+0x84>
  ac: 00170713  addi a4,a4,1
  b0: 010888b3  add a7,a7,a6
  b4: 00478793  addi a5,a5,4
  b8: 00000813  li a6,0
  bc: 02b75263  bge a4,a1,e0 <maxProfit+0xa4>
  c0: 0007a603  lw a2,0(a5)
  c4: ffc7a683  lw a3,-4(a5)
  c8: 40d60533  sub a0,a2,a3
  cc: fec6d0e3  bge a3,a2,ac <maxProfit+0x70>
  d0: 00170713  addi a4,a4,1
  d4: 00a80833  add a6,a6,a0
  d8: 00478793  addi a5,a5,4
  dc: feb742e3  blt a4,a1,c0 <maxProfit+0x84>
  e0: 01088533  add a0,a7,a6
  e4: 00008067  ret
  e8: 0046a783  lw a5,4(a3)
  ec: 010888b3  add a7,a7,a6
  f0: 00000813  li a6,0
  f4: 40c78eb3  sub t4,a5,a2
  f8: f8f64ce3  blt a2,a5,90 <maxProfit+0x54>
  fc: 01c888b3  add a7,a7,t3
 100: f95ff06f  j 94 <maxProfit+0x58>
 104: 00000513  li a0,0
 108: 00008067  ret
 10c: 00000813  li a6,0
 110: 00100713  li a4,1
 114: 00000893  li a7,0
 118: f89ff06f  j a0 <maxProfit+0x64>
```
:::

`main` assembly code in Details.
:::spoiler
```
0000011c <main>:
 11c: 00020537  lui a0,0x20
 120: ff010113  addi sp,sp,-16
 124: 09800593  li a1,152
 128: 09050513  addi a0,a0,144 # 20090 <environ+0x4>
 12c: 00112623  sw ra,12(sp)
 130: 054000ef  jal ra,184 <printf>
 134: 00c12083  lw ra,12(sp)
 138: 00000513  li a0,0
 13c: 01010113  addi sp,sp,16
 140: 00008067  ret
```
:::

---
### Modify Makefile
Copy the `Makefile` from `~/srv32/sw/sip` to `~/srv32/sw/122`, and change `SRC` and `TARGET` from `sip` to `122`.

---
### Result in srv32
Run `make 122` in the srv32 root directory (`~/srv32`). The result is shown below.
![](https://i.imgur.com/717PWJm.png)

:::spoiler
```
t123@t123-BM6875-BM6675-BP6375:~/srv32$ make 122
make[1]: Entering directory '/home/t123/srv32/sw'
make -C common
make[2]: Entering directory '/home/t123/srv32/sw/common'
make[2]: Nothing to be done for 'all'.
make[2]: Leaving directory '/home/t123/srv32/sw/common'
make[2]: Entering directory '/home/t123/srv32/sw/122'
riscv-none-elf-gcc -O3 -Wall -march=rv32im_zicsr -mabi=ilp32 -misa-spec=2.2 -march=rv32im -nostartfiles -nostdlib -L../common -o 122.elf 122.c -lc -lm -lgcc -lsys -T ../common/default.ld
riscv-none-elf-objcopy -j .text -O binary 122.elf imem.bin
riscv-none-elf-objcopy -j .data -O binary 122.elf dmem.bin
riscv-none-elf-objcopy -O binary 122.elf memory.bin
riscv-none-elf-objdump -d 122.elf > 122.dis
riscv-none-elf-readelf -a 122.elf > 122.symbol
make[2]: Leaving directory '/home/t123/srv32/sw/122'
make[1]: Leaving directory '/home/t123/srv32/sw'
make[1]: Entering directory '/home/t123/srv32/sim'
Excuting 797 instructions, 1253 cycles, 1.572 CPI
Program terminate
- ../rtl/../testbench/testbench.v:434: Verilog $finish
Simulation statistics
=====================
Simulation time : 0.144 s
Simulation cycles: 1264
Simulation speed : 0.00877778 MHz
make[1]: Leaving directory '/home/t123/srv32/sim'
make[1]: Entering directory '/home/t123/srv32/tools'
./rvsim --memsize 128 -l trace.log ../sw/122/122.elf
152
Excuting 2110 instructions, 2946 cycles, 1.396 CPI
Program terminate
Simulation statistics
=====================
Simulation time : 0.001 s
Simulation cycles: 2946
Simulation speed : 2.300 MHz
make[1]: Leaving directory '/home/t123/srv32/tools'
Compare the trace between RTL and ISS simulator
Files sim/trace.log and tools/trace.log differ
make: *** [Makefile:121: 122] Error 1
```
:::

---
### Waveform analysis
The steps are the same as in Q1.
![](https://i.imgur.com/GjDBHxs.png)

#### [Data hazard](https://hackmd.io/@sysprog/S1Udn1Xtt#Data-hazard)
There is a data hazard in `printf`.

Assembly code of `printf` in Details.
:::spoiler
```
00000184 <printf>:
 184: fc010113  addi sp,sp,-64
 188: 02c12423  sw a2,40(sp)
 18c: 02d12623  sw a3,44(sp)
 190: 00020317  auipc t1,0x20
 194: ef032303  lw t1,-272(t1) # 20080 <_impure_ptr>
 198: 02b12223  sw a1,36(sp)
 19c: 02e12823  sw a4,48(sp)
 1a0: 02f12a23  sw a5,52(sp)
 1a4: 03012c23  sw a6,56(sp)
 1a8: 03112e23  sw a7,60(sp)
 1ac: 00832583  lw a1,8(t1)
 1b0: 02410693  addi a3,sp,36
 1b4: 00050613  mv a2,a0
 1b8: 00030513  mv a0,t1
 1bc: 00112e23  sw ra,28(sp)
 1c0: 00d12623  sw a3,12(sp)
 1c4: 010000ef  jal ra,1d4 <_vfprintf_r>
 1c8: 01c12083  lw ra,28(sp)
 1cc: 04010113  addi sp,sp,64
```
:::

When `PC` is `184`, `(wb_dst_sel == ex_src1_sel)` is `true` and `wb_mem2reg` is `false`, which means that a data hazard has occurred. This matches the RAW dependence on `sp` between `addi sp,sp,-64` at `184` and `sw a2,40(sp)` at `188`. Because the value comes from the ALU rather than from a load, it is a plain data hazard resolved by forwarding.
![](https://i.imgur.com/5itu6y5.png)

---
### Software Optimizations
I would like to try modifying the C code to reduce the number of cycles used.
---
## Reference
* Assignment 3 last year
    * [tobychui](https://hackmd.io/@wIVnCcUaTouAktrkMVLEMA/Hy6vD5DtF)
    * [Jack](https://hackmd.io/@jackli/arch_hw3)
    * [chinghongfang](https://hackmd.io/@chinghongfang/HJuNqq-cF)
    * [陳韋綸](https://hackmd.io/@_UHs74UQS7uNne9_7SwQFQ/S113vvkct)
    * [Xiaokan Lua](https://hackmd.io/@E4b6eQ9-RWSAX-9mP_FLhA/HJwz8FgOK)
* Assignment 3 this year
    * [nlnlOeO](https://hackmd.io/lMHf_NxVQGeO-VRIYvUV5w?view)
    * [wanghanchi](https://hackmd.io/@wanghanchi/H1AxxO9ri)
* [srv32 learning notes](https://hackmd.io/Jrr1J1YDR_CtGR46DYotiQ)
* [eecheng's Assignment 3 build tutorial](https://hackmd.io/@eecheng/B1fEgnQwF)
* [echo, export commands](https://www.cnblogs.com/xiaopiyuanzi/p/11910107.html)
* [Welcome to GTKWave](http://www.cs.ucf.edu/courses/cda4150/fall05/wave/wave.html)
* [A brief discussion of branch prediction and hazards](https://ithelp.ithome.com.tw/m/articles/10265705)