# Assignment 3: Single-Cycle RISC-V CPU

Contributed by [ollieni](https://github.com/ollieni/ca2023-lab3)

## Explanation of Hello World in Chisel

```scala
import chisel3._

class Hello extends Module {
  val io = IO(new Bundle {
    val led = Output(UInt(1.W))
  })
  val CNT_MAX = (50000000 / 2 - 1).U

  val cntReg = RegInit(0.U(32.W)) // cycle counter
  val blkReg = RegInit(0.U(1.W))  // current LED state

  cntReg := cntReg + 1.U
  when(cntReg === CNT_MAX) {
    cntReg := 0.U
    blkReg := ~blkReg
  }
  io.led := blkReg
}
```

The Hello World code generates a square wave on the output ```led```, which makes an LED flash. The frequency of the wave is determined by ```CNT_MAX```: `blkReg` toggles (from 0 to 1, or from 1 to 0) each time ```cntReg``` counts up to ```CNT_MAX```. For instance, assuming a 50 MHz clock, `blkReg` toggles every 25,000,000 cycles (0.5 s), which gives a 1 Hz square wave. The waveform should look like this.

![LykY8](https://hackmd.io/_uploads/Sk77DMXH6.png)

Given that the output is named `led`, I take it to represent a flashing LED saying hello.

### Enhancement of Hello World in Chisel

The assignment asks to "enhance it by incorporating logic circuit", so I replace the `when` with `Mux`. In my view, using a `Mux` (a multiplexer, a basic logic circuit) in place of `when` (a conditional construct rather than an explicit logic circuit) meets the assignment's requirement. Below is the code for the method I propose:

```scala
import chisel3._

class Hello extends Module {
  val io = IO(new Bundle {
    val led = Output(UInt(1.W))
  })
  val CNT_MAX = (50000000 / 2 - 1).U

  val cntReg = RegInit(0.U(32.W))
  val blkReg = RegInit(0.U(1.W))

  // Replacing when with Mux: wrap to 0 at CNT_MAX, otherwise keep counting.
  // The increment is folded into the Mux because a later `:=` to the same
  // register overrides an earlier one (last connect wins).
  cntReg := Mux(cntReg === CNT_MAX, 0.U, cntReg + 1.U)
  blkReg := Mux(cntReg === CNT_MAX, ~blkReg, blkReg)
  io.led := blkReg
}
```

## Test case explanation

This section describes my understanding of the test cases.

### InstructionFetch Test

Inside the test case, a loop runs for 100 cycles. In each cycle, it randomly chooses to either perform a jump or not.

* For the case of no jump (jump_flag_id = 0), it advances the program counter (pre) by 4, indicating the sequential execution of instructions.
* For the case of a jump (jump_flag_id = 1), it simulates a jump instruction. It then checks that the instruction fetch module correctly updates the instruction address based on the jump.

### InstructionDecode Test

In this test, we use ```poke``` to set ```c.io.instruction``` to an S-type, a LUI, and an ADD instruction in turn. Then we use ```expect``` to check that the decoded control signals ```ex_aluop1_source```, ```ex_aluop2_source```, ```regs_reg1_read_address```, and ```regs_reg2_read_address``` match the instruction we input.

### Execute Test

This test checks that the Execute module correctly performs ALU (```add```) operations and handles branch (```beq```) instructions in a single-cycle RISC-V CPU. The ```beq``` test covers both the equal and the not-equal cases by setting ```c.io.reg1_data``` and ```c.io.reg2_data``` and checking the corresponding ```c.io.if_jump_flag``` and ```c.io.if_jump_address```.

### RegisterFile Test

This test uses three test cases to confirm the register file's functionality.

* Test Case 1: "read the written content": This test checks that the RegisterFile correctly writes data to a register and reads it back. It sets write_enable to true, writes 0xdeadbeefL to register 1, and then reads from register 1 to check that the data matches.
* Test Case 2: "x0 always be zero": This test checks that register 0 (x0) always reads as zero, even after a write. It sets write_enable to true, writes 0xdeadbeefL to register 0, and then reads from register 0 to check that the data is zero.
* Test Case 3: "read the writing content": This test checks that the RegisterFile correctly writes and reads data from a different register (register 2). It first reads from register 2 to ensure it is initially zero. It then sets write_enable to true, writes 0xdeadbeefL to register 2, reads from register 2 to check that the data matches, and advances the clock to see whether the value is retained.
  The final read from register 2 checks that the data is still 0xdeadbeefL after the clock advances.

### CPU Test

This code includes tests for calculating a Fibonacci number, performing quicksort, and testing byte access operations.

#### Fibonacci Test

This test uses ```mem_debug_read_data``` to read the memory at address 4 and checks that it equals 55 (the result of Fibonacci(10)).

#### Quicksort Test

It reads the first 10 numbers from memory and checks that they are sorted.

#### Byte Access Test

This test aims to verify that the CPU can correctly store and load byte-sized data. It checks the values in the registers ```t0```, ```t1```, and ```ra```.

## Waveform Analysis

### Instruction Fetch

![screenshot-2023-11-29 103723](https://hackmd.io/_uploads/SJU9OmVBT.png)

In the IF (Instruction Fetch) stage, there are several critical signals, namely `io_instruction_address`, `io_jump_flag_id`, and `io_jump_address_id`. In the waveform, we observe that the instruction address increments by 4 in every clock cycle. Additionally, when `io_jump_flag_id` is set to 1, `io_instruction_address` jumps to the address specified in `io_jump_address_id`.

### Instruction Decode

![screenshot-2023-11-29 104036](https://hackmd.io/_uploads/SJCqOQVH6.png)

In the image, we observe that `io_instruction` is `00A02223`, which encodes the instruction `sw x10, 4(x0)`. Consequently, `io_memory_write_enable` is set to 1.

### Execute

![screenshot-2023-11-29 104518](https://hackmd.io/_uploads/BJpsdXVHT.png)

In the image, we observe that `io_instruction` is `001101B3`, which encodes the instruction `add x3, x2, x1`. This indicates a test of the addition operation at this point. Further, we note that `io_reg1_data` is `0089866A` and `io_reg2_data` is `10EEDD7B`. Adding them together gives `117863E5`, which is the value of `io_mem_alu_result` shown in the waveform.
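The encodings and the sum quoted in the waveform analysis can be re-checked by hand. Below is a minimal sketch in plain Scala (no Chisel required) that splits the two instruction words into their standard RV32I fields and recomputes the 32-bit sum. The object name `WaveformCheck` and the helper `bits` are hypothetical names introduced for illustration; they are not part of the lab code.

```scala
// Hypothetical helper, not part of the lab repository.
object WaveformCheck {
  // Extract bits hi..lo (inclusive) of a 32-bit word held in a Long.
  def bits(word: Long, hi: Int, lo: Int): Long =
    (word >> lo) & ((1L << (hi - lo + 1)) - 1)

  // 32-bit wrap-around addition, as a 32-bit ALU would compute it.
  def add32(a: Long, b: Long): Long = (a + b) & 0xFFFFFFFFL

  def main(args: Array[String]): Unit = {
    // S-type store word from the Instruction Decode waveform.
    val sw = 0x00A02223L
    // S-type immediate = inst[31:25] ++ inst[11:7]; bit 31 is 0 here,
    // so no sign extension is needed for this particular word.
    val swImm = (bits(sw, 31, 25) << 5) | bits(sw, 11, 7)
    println(s"sw : opcode=0x${bits(sw, 6, 0).toHexString} funct3=${bits(sw, 14, 12)} " +
      s"rs1=x${bits(sw, 19, 15)} rs2=x${bits(sw, 24, 20)} imm=$swImm")

    // R-type add from the Execute waveform.
    val add = 0x001101B3L
    println(s"add: opcode=0x${bits(add, 6, 0).toHexString} rd=x${bits(add, 11, 7)} " +
      s"rs1=x${bits(add, 19, 15)} rs2=x${bits(add, 24, 20)}")

    // Operand values read off the waveform.
    val sum = add32(0x0089866AL, 0x10EEDD7BL)
    println(s"sum: 0x${sum.toHexString.toUpperCase}")
  }
}
```

Running `WaveformCheck.main` should report `rs1=x0 rs2=x10 imm=4` for the store, `rd=x3 rs1=x2 rs2=x1` for the add, and a sum of `0x117863E5`, in agreement with the values above.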
## HW2 code modification

I remove the ```RDCYCLE/RDCYCLEH``` instructions and put the assembly code of HW2 into the ```csrc``` directory. To generate the .o and .asmbin files from my assembly code, I add ```hw2yh.asmbin``` to ```BINS = ``` in the Makefile. To make hw2yh.S testable in Chisel, I store the results at specific memory addresses, and I add a test for ```hw2yh.asmbin``` to ```CPUTest.scala```.

```scala
class HW2Test extends AnyFlatSpec with ChiselScalatestTester {
  behavior.of("Single Cycle CPU")
  it should "Multiplication Overflow Prediction" in {
    test(new TestTopModule("hw2yh.asmbin")).withAnnotations(TestAnnotations.annos) { c =>
      // Let the program run to completion.
      for (i <- 1 to 500) {
        c.clock.step(1000)
        c.io.mem_debug_read_address.poke((i * 4).U) // Avoid timeout
      }

      // Check the four results stored at 0x4, 0x8, 0xC, and 0x10.
      c.io.mem_debug_read_address.poke(4.U)
      c.clock.step()
      c.io.mem_debug_read_data.expect(0.U)
      c.io.mem_debug_read_address.poke(8.U)
      c.clock.step()
      c.io.mem_debug_read_data.expect(0.U)
      c.io.mem_debug_read_address.poke(12.U)
      c.clock.step()
      c.io.mem_debug_read_data.expect(1.U)
      c.io.mem_debug_read_address.poke(16.U)
      c.clock.step()
      c.io.mem_debug_read_data.expect(1.U)
    }
  }
}
```

My code checks the values in memory at addresses `0x4`, `0x8`, `0xC`, and `0x10`, with the expected values being 0, 0, 1, 1.

Test output:

```shell
$ sbt "testOnly riscv.singlecycle.HW2Test"
[info] welcome to sbt 1.9.7 (OpenLogic Java 11.0.21)
[info] loading settings for project ca2023-lab3-build from plugins.sbt ...
[info] loading project definition from /home/ollieni/ca2023-lab3/project
[info] loading settings for project root from build.sbt ...
[info] set current project to mycpu (in build file:/home/ollieni/ca2023-lab3/)
[info] HW2Test:
[info] Single Cycle CPU
[info] - should Multiplication Overflow Prediction
[info] Run completed in 15 seconds, 742 milliseconds.
[info] Total number of tests run: 1
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[success] Total time: 17 s, completed Nov 28, 2023, 10:23:29 PM
```

## Execute code with verilator

Execute the following command in the project’s root directory to generate Verilog files:

```shell
$ make verilator
```

Then load `hw2yh.asmbin`, simulate for 1000 cycles, and save the waveform to `dump1.vcd`.

```shell
$ ./run-verilator.sh -instruction src/main/resources/hw2yh.asmbin -time 2000 -vcd dump1.vcd
```

Check the waveform with `gtkwave dump1.vcd`.

To verify the hexadecimal representation, I use the command below.

```shell
$ hexdump src/main/resources/hello.asmbin | head -1
```

Output:

```
0000000 0a93 0005 0113 ff01 0297 0000 8293 25c2
```

This is consistent with my waveform.

## Description of key signals when different instructions are executed

**Instruction 1:**

![image](https://hackmd.io/_uploads/HkzNGsNSp.png)

At this point, `io_instruction` represents `sw x5, 0(x2)`, so `io_memory_bundle_write_enable` is set to 1. We can observe that `io_memory_bundle_write_data` is 00001264, which is the value in x5, and the memory address that x5 is stored to is 1FFFFFF0.

**Instruction 2:**

![image](https://hackmd.io/_uploads/ByCc4hEB6.png)

The `io_instruction` (00155293) encodes `srli x5, x10, 1`. I do not yet understand why my waveform has signals like this.
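When comparing the `hexdump` output with `io_instruction` in the waveform, note that `hexdump` prints 16-bit units by default, so each 32-bit instruction appears as two units with the low half first. The sketch below, in plain Scala, joins the units from the dump line above into 32-bit words and also splits `00155293` into its I-type shift fields. It assumes `hexdump` ran on a little-endian host; `HexdumpWords`, `toWords`, and `bits` are hypothetical names introduced for illustration.

```scala
// Hypothetical helper, not part of the lab repository.
object HexdumpWords {
  // Join hexdump's 16-bit units (low half first) into 32-bit words.
  // The first column of the line is the file offset and is dropped.
  def toWords(line: String): Seq[Long] = {
    val halves = line.trim.split("\\s+").drop(1).map(java.lang.Long.parseLong(_, 16))
    halves.grouped(2).collect { case Array(lo, hi) => (hi << 16) | lo }.toSeq
  }

  // Extract bits hi..lo (inclusive) of a 32-bit word held in a Long.
  def bits(word: Long, hi: Int, lo: Int): Long =
    (word >> lo) & ((1L << (hi - lo + 1)) - 1)

  def main(args: Array[String]): Unit = {
    val words = toWords("0000000 0a93 0005 0113 ff01 0297 0000 8293 25c2")
    // prints: 00050a93 ff010113 00000297 25c28293
    println(words.map(w => f"$w%08x").mkString(" "))

    // I-type shift: rd = inst[11:7], rs1 = inst[19:15], shamt = inst[24:20].
    val srli = 0x00155293L
    println(s"rd=x${bits(srli, 11, 7)} rs1=x${bits(srli, 19, 15)} " +
      s"funct3=${bits(srli, 14, 12)} shamt=${bits(srli, 24, 20)}")
  }
}
```

The reassembled words are the values to look for on `io_instruction`, and the field split of `00155293` gives `rd=x5`, `rs1=x10`, `funct3=5`, `shamt=1`, which agrees with `srli x5, x10, 1`.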