# Assignment3: Single-cycle RISC-V CPU
contributed by < [`hungyuhang`](https://github.com/hungyuhang) >
## Environment Setup
### Issue Encountered when Running Chisel Bootcamp
I ran the Chisel Bootcamp locally on my Windows laptop using Docker.
However, the Docker image provided by Chisel had an issue: after I started the container, the code cells in the bootcamp's Jupyter notebook could not be executed.
To solve this, I rebuilt the Docker image using the method described in this [article](https://github.com/freechipsproject/chisel-bootcamp/issues/140). After the fix, the bootcamp ran normally.
### GTKWave Installation
To install the tool, I downloaded the source tarball `gtkwave-3.3.117.tar.gz` from [GTKWave](https://gtkwave.sourceforge.net).
The README file lists the following steps for building and installing GTKWave from source:
1) Type `./configure`
2) `make`
3) `make install` (as root)
First, I ran `./configure`, but it failed.
The README file notes that some packages may need to be installed first, using the commands below:
```
sudo apt-get install libjudy-dev
sudo apt-get install libbz2-dev
sudo apt-get install liblzma-dev
sudo apt-get install libgconf2-dev
sudo apt-get install libgtk2.0-dev
sudo apt-get install tcl-dev
sudo apt-get install tk-dev
sudo apt-get install gperf
sudo apt-get install gtk2-engines-pixbuf
```
After installing the packages above, I installed GTKWave successfully with `./configure`, `make`, and `make install`.
## Explanation of [Hello World in Chisel](https://hackmd.io/@sysprog/r1mlr3I7p#Hello-World-in-Chisel)
```scala
// Hello World in Chisel
class Hello extends Module {
val io = IO(new Bundle {
val led = Output(UInt(1.W))
})
val CNT_MAX = (50000000 / 2 - 1).U;
val cntReg = RegInit(0.U(32.W))
val blkReg = RegInit(0.U(1.W))
cntReg := cntReg + 1.U
when(cntReg === CNT_MAX) {
cntReg := 0.U
blkReg := ~blkReg
}
io.led := blkReg
}
```
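The code above declares a hardware module with a single 1-bit output, `led`.
Within the module, there are two registers:
- `cntReg` is a counter.
- `blkReg` holds the current state of the LED.

`blkReg` toggles each time `cntReg` wraps around, so the LED state can be derived from the counter value alone. If the counter runs over the full blink period (50000000 cycles) instead of half of it, the LED is off during the first half of the period and on during the second half. This lets us eliminate the `blkReg` register, as shown below: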
```scala
// Hello World in Chisel, after eliminating register blkReg
class Hello extends Module {
val io = IO(new Bundle {
val led = Output(UInt(1.W))
})
val CNT_MAX = (50000000 - 1).U;
val TOGGLE_THRESHOLD = (50000000 / 2).U;
val cntReg = RegInit(0.U(32.W))
cntReg := cntReg + 1.U
when(cntReg === CNT_MAX) {
cntReg := 0.U
}
when(cntReg < TOGGLE_THRESHOLD) {
io.led := 0.U
}.otherwise {
io.led := 1.U
}
}
```
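To check that the two versions blink identically, here is a minimal plain-Scala software model of both designs (no Chisel dependency). This is only a sketch: the object `BlinkModel` and its function names are hypothetical, and `half` stands in for `50000000 / 2`, scaled down so the comparison finishes quickly.

```scala
// Plain-Scala model of the two Hello designs (not Chisel hardware).
object BlinkModel {
  // Original design: the counter wraps at half - 1 and toggles blkReg.
  def withBlkReg(half: Int, cycles: Int): Vector[Int] =
    Iterator
      .iterate((0, 0)) { case (cnt, blk) =>
        if (cnt == half - 1) (0, 1 - blk) else (cnt + 1, blk)
      }
      .map { case (_, blk) => blk } // io.led := blkReg
      .take(cycles)
      .toVector

  // Rewritten design: the counter wraps at 2 * half - 1 and the LED
  // is derived from a comparison against the toggle threshold.
  def withoutBlkReg(half: Int, cycles: Int): Vector[Int] =
    Iterator
      .iterate(0)(cnt => if (cnt == 2 * half - 1) 0 else cnt + 1)
      .map(cnt => if (cnt < half) 0 else 1)
      .take(cycles)
      .toVector

  def main(args: Array[String]): Unit = {
    val a = withBlkReg(half = 4, cycles = 20)
    val b = withoutBlkReg(half = 4, cycles = 20)
    println(a.mkString)
    println(s"identical: ${a == b}")
  }
}
```

The trade-off in hardware is that the rewritten version saves the 1-bit register but adds a less-than comparator on the counter value.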
## Complete the Code of [Lab3](https://hackmd.io/@sysprog/r1mlr3I7p)
### Problems Encountered with GitHub
At first, I used the following command to push my commits to GitHub:
```bash
$ git push origin main
```
I typed in my GitHub username and password, but the terminal returned the following message:
```
remote: Support for password authentication was removed on August 13, 2021.
remote: Please see https://docs.github.com/en/get-started/getting-started-with-git/about-remote-repositories#cloning-with-https-urls for information on currently recommended modes of authentication.
fatal: Authentication failed for 'https://github.com/hungyuhang/ca2023-lab3/'
```
To solve the issue, I followed the instructions in the [GitHub documentation](https://docs.github.com/en/get-started/getting-started-with-git/caching-your-github-credentials-in-git) and installed Git Credential Manager (GCM), which fixed the issue.
### My Code of Lab3
Here is my [repository](https://github.com/hungyuhang/ca2023-lab3) for Lab3, which is forked from [ca2023-lab3](https://github.com/sysprog21/ca2023-lab3).
### Running Unit Tests
To run the unit tests, I use the command:
```
$ sbt test
```
And here is the output:
```
[info] welcome to sbt 1.9.7 (Eclipse Adoptium Java 11.0.21)
[info] loading settings for project ca2023-lab3-build from plugins.sbt ...
[info] loading project definition from /home/hungyuhang/ca2023-lab3/project
[info] loading settings for project root from build.sbt ...
[info] set current project to mycpu (in build file:/home/hungyuhang/ca2023-lab3/)
[info] ExecuteTest:
[info] Execution of Single Cycle CPU
[info] - should execute correctly
[info] ByteAccessTest:
[info] Single Cycle CPU
[info] - should store and load a single byte
[info] InstructionFetchTest:
[info] InstructionFetch of Single Cycle CPU
[info] - should fetch instruction
[info] QuicksortTest:
[info] Single Cycle CPU
[info] - should perform a quicksort on 10 numbers
[info] InstructionDecoderTest:
[info] InstructionDecoder of Single Cycle CPU
[info] - should produce correct control signal
[info] RegisterFileTest:
[info] Register File of Single Cycle CPU
[info] - should read the written content
[info] - should x0 always be zero
[info] - should read the writing content
[info] FibonacciTest:
[info] Single Cycle CPU
[info] - should recursively calculate Fibonacci(10)
[info] Run completed in 30 seconds, 3 milliseconds.
[info] Total number of tests run: 9
[info] Suites: completed 7, aborted 0
[info] Tests: succeeded 9, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[success] Total time: 32 s, completed Nov 24, 2023, 10:28:59 AM
```
The sections below are descriptions of each unit test:
#### ExecuteTest
This unit test tests the Execute module using the `add` and `beq` instructions.
Specifically, the test supplies an instruction, varies the input register values, and checks whether the module's outputs match the expected values.
#### ByteAccessTest
This unit test loads the program `sb.asmbin`, runs it on our CPU, and checks the values of specific registers (`t0`, `t1`, `ra`) after execution. `sb.asmbin` is a simple program that contains memory instructions such as `lw` and `sb`.
To make our CPU run a program, the testing code first instantiates a `TestTopModule`. This module acts as an entire system: besides the CPU module, it contains the following modules:
- `InstructionROM`
This module is a read-only memory (ROM). It takes the program's binary (`.asmbin`) file as input and stores the content of the file in its memory space.
- `Memory`
This module is the RAM of the system. The CPU loads data from and stores data to this module during program execution.
- `ROMLoader`
This module will load the contents in `InstructionROM` to `Memory` before program execution. `Parameters.EntryAddress` specifies where the data should be put into `Memory`.
The block diagram of `TestTopModule` looks like this:

To test register and memory values, `TestTopModule` also provides debug ports that read them directly. The code below defines the debug ports:
```scala
val mem_debug_read_address = Input(UInt(Parameters.AddrWidth))
val regs_debug_read_address = Input(UInt(Parameters.PhysicalRegisterAddrWidth))
val regs_debug_read_data = Output(UInt(Parameters.DataWidth))
val mem_debug_read_data = Output(UInt(Parameters.DataWidth))
```
For example, to check whether register `ra` contains the value `0x15ef`, use the testing code below:
```scala
c.io.regs_debug_read_address.poke(1.U) // ra
c.io.regs_debug_read_data.expect(0x15ef.U)
```
#### InstructionFetchTest
This unit test tests the InstructionFetch module. It checks whether the module generates the correct PC address. Here is the testing logic:
- If `jump_flag_id` is false, the PC should increase by 4 at the next clock cycle.
- If `jump_flag_id` is true, the PC should change to the specified jump address at the next clock cycle.
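The two rules above can be sketched as a plain-Scala next-PC function. This is a software model under my own assumptions, not the lab's Chisel code: the object `PcModel`, the function `nextPc`, and the 32-bit masking are hypothetical illustrations.

```scala
// Software model of the next-PC rule checked by InstructionFetchTest.
object PcModel {
  // Keep values within 32 bits, like a 32-bit PC register.
  private val Mask32 = 0xFFFFFFFFL

  def nextPc(pc: Long, jumpFlag: Boolean, jumpAddress: Long): Long =
    if (jumpFlag) jumpAddress & Mask32 // take the jump
    else (pc + 4) & Mask32             // fall through to the next instruction

  def main(args: Array[String]): Unit = {
    val start = 0x1000L // hypothetical starting address
    val sequential = nextPc(start, jumpFlag = false, jumpAddress = 0x1000L)
    val jumped = nextPc(sequential, jumpFlag = true, jumpAddress = 0x1000L)
    println(f"sequential: 0x$sequential%08X, jumped: 0x$jumped%08X")
  }
}
```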
#### QuicksortTest
This unit test loads the program `quicksort.asmbin`, runs it on our CPU, and checks whether the data at specific memory addresses matches what we expect after execution. `quicksort.asmbin` is a program that runs the quicksort algorithm and stores the result in memory.
#### InstructionDecoderTest
This unit test tests the InstructionDecode module. It checks whether the module generates the correct control signals for different instructions such as `sw`, `lui`, and `add`.
#### RegisterFileTest
This unit test tests the RegisterFile module. It contains the three tests below:
1. Tests whether a register returns the correct value when read after being written.
2. Tests whether register `zero` always returns 0.
3. Tests whether a register returns the correct value when read while the `write_enable` bit is asserted.
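The first two properties can be illustrated with a small plain-Scala model of a register file. This is a sketch under my own assumptions rather than the lab's `RegisterFile` implementation: the names `RegFileModel` and `RegFile` are hypothetical, and the model does not capture the clock-cycle timing that the third test exercises.

```scala
// Software model of a 32-entry register file with x0 hardwired to zero.
object RegFileModel {
  final class RegFile {
    private val regs = Array.fill(32)(0L)

    // Writes take effect only when writeEnable is set; x0 is never written.
    def write(address: Int, data: Long, writeEnable: Boolean): Unit =
      if (writeEnable && address != 0) regs(address) = data & 0xFFFFFFFFL

    // Reading x0 always yields 0, regardless of earlier writes.
    def read(address: Int): Long =
      if (address == 0) 0L else regs(address)
  }

  def main(args: Array[String]): Unit = {
    val rf = new RegFile
    rf.write(1, 0xDEADBEEFL, writeEnable = true)
    rf.write(0, 0xDEADBEEFL, writeEnable = true)
    println(f"x1 = 0x${rf.read(1)}%08X, x0 = 0x${rf.read(0)}%08X")
  }
}
```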
#### FibonacciTest
This unit test loads the program `fibonacci.asmbin`, runs it on our CPU, and checks whether the data at a specific memory address matches what we expect after execution. `fibonacci.asmbin` is a program that recursively calculates a Fibonacci number and stores the result in memory.
### Waveform Analysis
#### InstructionFetchTest

From the waveform graph, we can see that the instruction address adds 4 on each clock cycle, and jumps to the jump address (which is 0x1000) when `io_jump_flag_id` is set to high.
#### InstructionDecoderTest

The signals I added in `InstructionDecode.scala` are `io_memory_read_enable` and `io_memory_write_enable`. In the waveform graph, we can see that when `opcode` is 0x23, which is an S-type (store) instruction, `io_memory_write_enable` goes high.
The test does not include an instruction that triggers `io_memory_read_enable`, so that signal stays low.
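As a rough illustration of how the two enables depend on the opcode, here is a plain-Scala sketch. It is not the code from `InstructionDecode.scala`: the object and function names are hypothetical, and the load opcode `0x03` comes from the standard RV32I encoding rather than from this report.

```scala
// Software model: derive memory read/write enables from the opcode field.
object MemControlModel {
  val OpcodeLoad = 0x03  // RV32I loads (lb, lh, lw, lbu, lhu)
  val OpcodeStore = 0x23 // RV32I stores (sb, sh, sw)

  // The opcode is the lowest 7 bits of the instruction word.
  def opcode(instruction: Long): Int = (instruction & 0x7F).toInt

  // Returns (memoryReadEnable, memoryWriteEnable).
  def memoryEnables(instruction: Long): (Boolean, Boolean) = {
    val op = opcode(instruction)
    (op == OpcodeLoad, op == OpcodeStore)
  }

  def main(args: Array[String]): Unit = {
    println(memoryEnables(0x00112023L)) // sw ra, 0(sp)
    println(memoryEnables(0x00012083L)) // lw ra, 0(sp)
    println(memoryEnables(0x00AE6E33L)) // or t3, t3, a0
  }
}
```

A load such as `lw ra, 0(sp)` is the kind of instruction the test would need in order to drive `io_memory_read_enable` high.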
#### ExecuteTest

In the test of the `add` instruction, we can see that:
- The value of ALU input `io_op1` is equal to `io_reg1_data`.
- The value of ALU input `io_op2` is equal to `io_reg2_data`.
- The value of ALU input `io_func` is equal to 1, which is `ALUFunctions.add`.
- The value of ALU output `io_result` is equal to the sum of `io_op1` and `io_op2`.

And in the test of the `beq` instruction, we can see that:
- `io_aluop1_source` and `io_aluop2_source` are now high.
- The value of ALU input `io_op1` is now equal to `io_instruction_address`.
- The value of ALU input `io_op2` is now equal to `io_immediate`.
- The value of ALU input `io_func` is equal to 1, which is still `ALUFunctions.add`.
- The value of ALU output `io_result` is equal to the sum of `io_op1` and `io_op2`.
- When `io_reg1_data` is not equal to `io_reg2_data`, `io_if_jump_flag` is 0.
- When `io_reg1_data` is equal to `io_reg2_data`, `io_if_jump_flag` will change to 1.
- And the value of `io_result` is now the jump address.
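The observations for `add` and `beq` can be summarized in a plain-Scala sketch. This is a software model under my own assumptions, not the lab's Execute module: `ExecuteModel`, `ExecuteOut`, and the function names are hypothetical.

```scala
// Software model of what the Execute stage produces for add and beq.
object ExecuteModel {
  private val Mask32 = 0xFFFFFFFFL

  final case class ExecuteOut(aluResult: Long, jumpFlag: Boolean)

  // add: both ALU operands come from the register file; no jump.
  def add(reg1Data: Long, reg2Data: Long): ExecuteOut =
    ExecuteOut((reg1Data + reg2Data) & Mask32, jumpFlag = false)

  // beq: the ALU still adds, but its operands are the instruction address
  // and the immediate, so the result is the branch target. The jump flag
  // comes from comparing the two register values.
  def beq(instructionAddress: Long, immediate: Long,
          reg1Data: Long, reg2Data: Long): ExecuteOut =
    ExecuteOut((instructionAddress + immediate) & Mask32,
               jumpFlag = reg1Data == reg2Data)

  def main(args: Array[String]): Unit = {
    println(add(3L, 4L))
    println(beq(8L, -4L, reg1Data = 5L, reg2Data = 5L))
    println(beq(8L, -4L, reg1Data = 5L, reg2Data = 6L))
  }
}
```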
## HW2 Assembly Code Adaptation
This part adapts the code in [homework2](https://hackmd.io/@hungyuhang/risc-v-hw2) to lab3.
You can find the commit history of the adaptation in my [repository](https://github.com/hungyuhang/ca2023-lab3) on GitHub.
### Adapts the Assembly Code to Lab3
First, I did the following:
1. Put `hw2_asm.S` in the `csrc` directory.
2. Remove the code related to `rdcycle` and `rdcycleh` in `hw2_asm.S`.
3. Add `hw2_asm.asmbin` to `Makefile` in the `csrc` directory.
   - Append `hw2_asm.asmbin` at the end of the `BINS` variable.
4. Run `$ make update` to generate `hw2_asm.asmbin`.
5. Add a blank test for `hw2_asm.asmbin` in `CPUTest.scala`.
At this point, the code worked fine when I ran the following command:
```
$ sbt "testOnly riscv.singlecycle.HW2Test"
```
Then I did the following:
1. Modify `hw2_asm.S`.
   - Make the code store its results at memory addresses `0x0000000C`, `0x00000008`, and `0x00000004`.
2. Write corresponding test code in `CPUTest.scala`.

The modified code still passes the test. Here is the output of the test:
```
[info] welcome to sbt 1.9.7 (Eclipse Adoptium Java 11.0.21)
[info] loading settings for project ca2023-lab3-build from plugins.sbt ...
[info] loading project definition from /home/hungyuhang/ca2023-lab3/project
[info] loading settings for project root from build.sbt ...
[info] set current project to mycpu (in build file:/home/hungyuhang/ca2023-lab3/)
[info] HW2Test:
[info] Single Cycle CPU
[info] - should calculate the leftmost-zero-byte of 3 64bit numbers
[info] Run completed in 10 seconds, 514 milliseconds.
[info] Total number of tests run: 1
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[success] Total time: 14 s, completed Nov 24, 2023, 10:33:10 PM
```
### Using Verilator to Run the Assembly
Use the following command to generate the simulation executable file of the CPU:
```
$ make verilator
```
And use the following command to run `hw2_asm.asmbin` on the simulated CPU:
```
$ ./run-verilator.sh -instruction src/main/resources/hw2_asm.asmbin -time 4000 -vcd dump01.vcd
```
Output:
```
-time 4000
-memory 1048576
-instruction src/main/resources/hw2_asm.asmbin
[-------------------->] 100%
```
Use GTKWave to view the output waveform file `dump01.vcd`:
Case 1

- Instruction `0x00AE6E33` is equal to `or t3, t3, a0`.
- `alu_io_func` is 6, which stands for `ALUFunctions.or` in the CPU code.
- The value of `alu_io_result` is equal to `0x90A1B2C3 | 0x55007700`.

Case 2
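The decoding in Case 1 can be reproduced with a short plain-Scala sketch that splits the instruction word into R-type fields following the standard RV32I layout. The object `RTypeDecode` and its helper names are hypothetical and only serve this illustration.

```scala
// Decode the R-type fields of an RV32I instruction word.
object RTypeDecode {
  final case class RType(opcode: Int, rd: Int, funct3: Int,
                         rs1: Int, rs2: Int, funct7: Int)

  // Extract bits hi..lo (inclusive) of a 32-bit word.
  private def bits(word: Long, hi: Int, lo: Int): Int =
    ((word >>> lo) & ((1L << (hi - lo + 1)) - 1)).toInt

  def decode(instruction: Long): RType = RType(
    opcode = bits(instruction, 6, 0),
    rd = bits(instruction, 11, 7),
    funct3 = bits(instruction, 14, 12),
    rs1 = bits(instruction, 19, 15),
    rs2 = bits(instruction, 24, 20),
    funct7 = bits(instruction, 31, 25)
  )

  def main(args: Array[String]): Unit = {
    // x28 is t3 and x10 is a0 in the RISC-V register naming.
    println(decode(0x00AE6E33L))
    println(f"0x${0x90A1B2C3L | 0x55007700L}%08X")
  }
}
```

For `0x00AE6E33` this yields opcode `0x33`, `rd` = 28 (`t3`), `funct3` = 6 (OR), `rs1` = 28 (`t3`), and `rs2` = 10 (`a0`), and the OR of the two operand values is `0xD5A1F7C3`.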

- Instruction `0x00112023` is equal to `sw ra, 0(sp)`.
- Since it is a store word instruction, `io_memory_write_enable` is 1.
- And the instruction will write the value `0x00001020` to memory address `0xFFFFFFF8`.
- The value of `alu_io_op1` is the base memory address, which is the value of `sp`.
- The value of `alu_io_op2` is the offset of the memory address, which is 0.
- The address input of the memory `io_memory_bundle_address` is `0xFFFFFFF8`, which is equal to the ALU output `alu_io_result`.
- The data input of the memory `io_memory_bundle_write_data` is `0x00001020`, which is equal to the value of `io_reg2_data`.
- And since `io_read_address2` is 1, the value of `io_reg2_data` is now the value of register `ra`.
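The store address in Case 2 can be recomputed with a plain-Scala sketch that decodes the S-type fields following the standard RV32I layout. The object `STypeDecode` and its function names are hypothetical, and the value of `sp` is taken from the waveform description above.

```scala
// Decode an S-type (store) instruction and compute its memory address.
object STypeDecode {
  private def bits(word: Long, hi: Int, lo: Int): Int =
    ((word >>> lo) & ((1L << (hi - lo + 1)) - 1)).toInt

  def rs1(instruction: Long): Int = bits(instruction, 19, 15) // base register
  def rs2(instruction: Long): Int = bits(instruction, 24, 20) // data register

  // S-type immediate: imm[11:5] sits in bits 31:25, imm[4:0] in bits 11:7.
  def immediate(instruction: Long): Int = {
    val raw = (bits(instruction, 31, 25) << 5) | bits(instruction, 11, 7)
    (raw << 20) >> 20 // sign-extend the 12-bit value
  }

  // Address = base register value + immediate, truncated to 32 bits.
  def storeAddress(base: Long, instruction: Long): Long =
    (base + immediate(instruction)) & 0xFFFFFFFFL

  def main(args: Array[String]): Unit = {
    val sw = 0x00112023L // sw ra, 0(sp)
    println(s"rs1 = x${rs1(sw)}, rs2 = x${rs2(sw)}, imm = ${immediate(sw)}")
    println(f"address = 0x${storeAddress(0xFFFFFFF8L, sw)}%08X")
  }
}
```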
Case 3

- Instruction `0x084000EF` is equal to `jal ra, 132`.
- `io_if_jump_flag` is set to 1.
- The value of `io_if_jump_address` is equal to the sum of the current instruction address and the immediate field of the instruction, which are `0x0000101C` and `0x84` respectively.
- And from the waveform graph, we can see that the instruction address at the next clock cycle changed to `0x000010A0`, which is equal to `io_if_jump_address`.
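The jump target in Case 3 can be verified with a plain-Scala sketch that reassembles the J-type immediate following the standard RV32I layout. The object `JalDecode` and its function names are hypothetical.

```scala
// Decode the J-type immediate of a jal instruction and compute the target.
object JalDecode {
  // J-type layout: imm[20] = bit 31, imm[10:1] = bits 30:21,
  // imm[11] = bit 20, imm[19:12] = bits 19:12. imm[0] is always 0.
  def immediate(instruction: Long): Int = {
    val i = instruction.toInt
    val imm20 = (i >>> 31) & 0x1
    val imm10to1 = (i >>> 21) & 0x3FF
    val imm11 = (i >>> 20) & 0x1
    val imm19to12 = (i >>> 12) & 0xFF
    val raw = (imm20 << 20) | (imm19to12 << 12) | (imm11 << 11) | (imm10to1 << 1)
    (raw << 11) >> 11 // sign-extend the 21-bit value
  }

  // Target = address of the jal instruction + immediate, truncated to 32 bits.
  def jumpTarget(instructionAddress: Long, instruction: Long): Long =
    (instructionAddress + immediate(instruction)) & 0xFFFFFFFFL

  def main(args: Array[String]): Unit = {
    val jal = 0x084000EFL // jal ra, 132
    println(s"imm = ${immediate(jal)}")
    println(f"target = 0x${jumpTarget(0x0000101CL, jal)}%08X")
  }
}
```

With the instruction address `0x0000101C` from the waveform, the model gives an immediate of 132 (`0x84`) and a target of `0x000010A0`, matching `io_if_jump_address`.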