
Assignment 3: Single-Cycle RISC-V CPU

Contributed by ollieni

Explanation of Hello World in Chisel

class Hello extends Module {
  val io = IO(new Bundle {
    val led = Output(UInt(1.W))
  })
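  val CNT_MAX = (50000000 / 2 - 1).U // cycles between toggles, minus one
  val cntReg  = RegInit(0.U(32.W))   // free-running cycle counter
  val blkReg  = RegInit(0.U(1.W))    // current LED state
  cntReg := cntReg + 1.U
  when(cntReg === CNT_MAX) {
    cntReg := 0.U       // wrap the counter (overrides the increment above)
    blkReg := ~blkReg   // toggle the LED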
  }
  io.led := blkReg
}

The Hello World code generates a square wave on the output led, which makes the LED flash.
The frequency of the wave is determined by CNT_MAX: blkReg flips (from 0 to 1, or from 1 to 0) each time cntReg counts up to CNT_MAX.
The signal should look like this.

image

Since the output is named led, I take it to represent a flashing LED that says hello.
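
To make the timing concrete, here is a minimal software model in plain Scala (not Chisel). The names BlinkModel, step, and cyclesUntilToggle are made up for this sketch, and the 50 MHz clock mentioned afterwards is only an assumption suggested by the constant 50000000.

```scala
// Software model of the Hello module's two registers (plain Scala, no Chisel).
// All names here are hypothetical and exist only for illustration.
object BlinkModel {
  // One clock edge: returns the next (cnt, blk) pair.
  def step(cnt: Long, blk: Int, cntMax: Long): (Long, Int) =
    if (cnt == cntMax) (0L, blk ^ 1) // wrap the counter and toggle the LED
    else (cnt + 1, blk)              // otherwise just count up

  // Number of clock edges from reset until blk toggles for the first time.
  def cyclesUntilToggle(cntMax: Long): Long = {
    var cnt = 0L
    var blk = 0
    var cycles = 0L
    while (blk == 0) {
      val (nextCnt, nextBlk) = step(cnt, blk, cntMax)
      cnt = nextCnt
      blk = nextBlk
      cycles += 1
    }
    cycles
  }
}
```

In this model the LED toggles every CNT_MAX + 1 clock edges. With CNT_MAX = 50000000 / 2 - 1 that is 25,000,000 cycles per toggle, so if the clock runs at 50 MHz (an assumption) the LED completes one on/off period per second.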

Enhancement of Hello World in Chisel

The assignment asks to "enhance it by incorporating logic circuit", so I replace the when block with Mux.
In my view, using a Mux (multiplexer, a combinational logic circuit) in place of when (a conditional construct rather than an explicit circuit element) meets this requirement.
Below is the code for the approach I propose:

import chisel3._

class Hello extends Module {
  val io = IO(new Bundle {
    val led = Output(UInt(1.W))
  })

  val CNT_MAX = (50000000 / 2 - 1).U
  val cntReg = RegInit(0.U(32.W))
  val blkReg = RegInit(0.U(1.W))

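  // Replacing when with Mux.
  // The increment lives inside the Mux: with a separate `cntReg := cntReg + 1.U`
  // followed by `Mux(..., 0.U, cntReg)`, the later connection would win and the
  // counter would never advance.
  cntReg := Mux(cntReg === CNT_MAX, 0.U, cntReg + 1.U)
  blkReg := Mux(cntReg === CNT_MAX, ~blkReg, blkReg)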

  io.led := blkReg
}

Test case explanation

The section below describes my understanding of the test cases.

InstructionFetch Test

Inside the test case, a loop runs for 100 cycles. In each cycle, it randomly chooses to either perform a jump or not.

  • For the case of no jump (jump_flag_id = 0), it advances the program counter (pre) by 4, indicating the sequential execution of instructions.

  • For the case of a jump (jump_flag_id = 1), it simulates a jump instruction. It then checks if the instruction fetch module correctly updates the instruction address based on the jump.
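
The expected address in both cases can be sketched as a small reference function. This is plain Scala, not the test's actual code; FetchModel and nextPc are hypothetical names.

```scala
// Reference model of the address the fetch stage should produce next.
// FetchModel and nextPc are illustrative names, not part of the lab code.
object FetchModel {
  def nextPc(pc: Long, jumpFlag: Boolean, jumpAddress: Long): Long =
    if (jumpFlag) jumpAddress       // jump: take the supplied target
    else (pc + 4) & 0xFFFFFFFFL     // no jump: next sequential 32-bit instruction
}
```

For example, nextPc(0x1000L, false, 0x2000L) gives 0x1004, while nextPc(0x1000L, true, 0x2000L) gives 0x2000.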

InstructionDecode Test

In this test, we use poke to set c.io.instruction to an S-type, a LUI, and an ADD instruction.
Then we use expect to check that the decoded control signals ex_aluop1_source, ex_aluop2_source, regs_reg1_read_address, and regs_reg2_read_address match the instruction we supplied.

Execute Test

This test checks whether the Execute module correctly performs ALU (add) operations and handles branch (beq) instructions in a single-cycle RISC-V CPU.
The beq test covers both the equal and not-equal cases by setting c.io.reg1_data and c.io.reg2_data and then checking the corresponding c.io.if_jump_flag and c.io.if_jump_address.
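
As a rough reference for what the beq checks expect, here is a sketch in plain Scala. It assumes the branch target is the instruction address plus the immediate, which is the standard RISC-V rule but is not spelled out in the text above; BranchModel and beq are hypothetical names.

```scala
// Reference model of the beq outcome (names are illustrative only).
object BranchModel {
  // Returns (jumpFlag, jumpAddress).
  // Assumption: the target is pc + imm, wrapped to 32 bits.
  def beq(reg1: Long, reg2: Long, pc: Long, imm: Long): (Boolean, Long) =
    (reg1 == reg2, (pc + imm) & 0xFFFFFFFFL)
}
```

With equal operands the flag is set and the target is pc + imm; with different operands the flag stays clear.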

RegisterFile Test

This test uses three test cases to confirm the register file's functionality.

  • Test Case 1: "read the written content": This test checks if the RegisterFile correctly writes data to a register and reads it back.
    It sets the write_enable to true, writes data 0xdeadbeefL to register 1, and then reads from register 1 to check if the data matches.

  • Test Case 2: "x0 always be zero": This test checks if writing to register 0 (x0) always results in zero.
    It sets the write_enable to true, writes data 0xdeadbeefL to register 0, and then reads from register 0 to check if the data is zero.

  • Test Case 3: "read the writing content": This test checks if the RegisterFile correctly writes and reads data from a different register (register 2).
    It first reads from register 2 to ensure it is initially zero. It then sets the write_enable to true, writes data 0xdeadbeefL to register 2, reads from register 2 to check if the data matches, and advances the clock to see if the value is retained.
    The final read from register 2 is performed to check if the data is still 0xdeadbeefL after clock advancement.
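
The first two behaviours can be summarised with a small software model in plain Scala. RegFileModel and RegFile are hypothetical names, and the model deliberately ignores clock timing, so it does not capture the same-cycle read checked in Test Case 3.

```scala
// Software model of a 32-entry register file with x0 hardwired to zero.
// Names are illustrative; clock timing is not modelled.
object RegFileModel {
  final class RegFile {
    private val regs = Array.fill(32)(0L)

    // Writes are ignored when disabled or when the target is x0.
    def write(enable: Boolean, addr: Int, data: Long): Unit =
      if (enable && addr != 0) regs(addr) = data & 0xFFFFFFFFL

    // Reading x0 always yields zero.
    def read(addr: Int): Long = if (addr == 0) 0L else regs(addr)
  }
}
```

Writing 0xdeadbeefL to register 1 and reading it back returns the same value, while the same write to register 0 still reads back as zero.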

CPU Test

This code includes tests for calculating Fibonacci numbers, performing quicksort, and byte access operations.

Fibonacci Test

This test uses mem_debug_read_data to read memory at address 4 and checks that it equals 55 (the result of Fibonacci(10)).
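
The expected value can be confirmed with a short plain-Scala check. FibCheck and fib are hypothetical names, and the indexing fib(0) = 0, fib(1) = 1 is assumed.

```scala
// Recursive Fibonacci, assuming fib(0) = 0 and fib(1) = 1.
object FibCheck {
  def fib(n: Int): Int = if (n <= 1) n else fib(n - 1) + fib(n - 2)
}
```

Under that indexing, FibCheck.fib(10) evaluates to 55, the value the test expects at address 4.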

Quicksort Test

It reads the memory for the initial 10 numbers and checks if they are sorted.

Byte Access Test

This test aims to verify that the CPU can correctly store and load byte-sized data.
It checks the values in the registers t0, t1, and ra.

Waveform Analysis

Instruction Fetch

image

In the IF (Instruction Fetch) stage, there are several critical signals, namely io_instruction_address, io_jump_flag_id, and io_jump_address_id.
In the waveform graph, we observe that the instruction address increments by 4 in every clock cycle.
Additionally, when io_jump_flag_id is set to 1, io_instruction_address jumps to the address specified in io_jump_address_id.

Instruction Decode

image

In the image, we observe that the io_instruction is 00A02223, which encodes the instruction sw x10, 4(x0). Consequently, io_memory_write_enable is set to 1.
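
The encoding can be checked by hand with a small S-type field extractor. This is a plain-Scala sketch following the RV32I S-type layout; StoreDecode, SType, and decode are hypothetical names and not part of the lab's decoder.

```scala
// Extracts the S-type fields of a 32-bit RISC-V instruction word.
// Names are illustrative only.
object StoreDecode {
  final case class SType(opcode: Int, funct3: Int, rs1: Int, rs2: Int, imm: Int)

  def decode(inst: Long): SType = {
    val w = inst & 0xFFFFFFFFL
    val immLo = ((w >> 7) & 0x1F).toInt   // imm[4:0]  in bits 11:7
    val immHi = ((w >> 25) & 0x7F).toInt  // imm[11:5] in bits 31:25
    val raw = (immHi << 5) | immLo
    val imm = (raw << 20) >> 20           // sign-extend the 12-bit immediate
    SType(
      opcode = (w & 0x7F).toInt,
      funct3 = ((w >> 12) & 0x7).toInt,
      rs1 = ((w >> 15) & 0x1F).toInt,
      rs2 = ((w >> 20) & 0x1F).toInt,
      imm = imm
    )
  }
}
```

Decoding 0x00A02223L yields opcode 0x23 (store), funct3 2 (sw), rs1 0, rs2 10, and immediate 4, that is, sw x10, 4(x0).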

Execute

image

In the image, we observe that the io_instruction is 001101B3, representing the instruction add x3, x2, x1.
This indicates a test of the addition operation at this point.
Further, we note that io_reg1_data is 0089866A and io_reg2_data is 10EEDD7B.
Adding them together gives 117863E5, which matches the io_mem_alu_result shown in the waveform graph.
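
The sum can be reproduced with a one-line 32-bit addition in plain Scala. AluAddCheck and add32 are hypothetical names for this sketch.

```scala
// 32-bit wrapping addition, mirroring what the ALU add should produce.
object AluAddCheck {
  def add32(a: Long, b: Long): Long = (a + b) & 0xFFFFFFFFL
}
```

Here AluAddCheck.add32(0x0089866AL, 0x10EEDD7BL) equals 0x117863E5L.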

HW2 code modification

I remove the RDCYCLE/RDCYCLEH instructions and put the hw2 assembly code into the csrc directory.
To generate the .o and .asmbin files from my assembly code, I add hw2yh.asmbin to the BINS = list in the Makefile.

To make hw2yh.S testable in Chisel, I store the results at specific memory addresses.

I also add a test class for hw2yh.asmbin in CPUTest.scala.

class HW2Test extends AnyFlatSpec with ChiselScalatestTester {
  behavior.of("Single Cycle CPU")
  it should "Multiplication Overflow Prediction" in {
    test(new TestTopModule("hw2yh.asmbin")).withAnnotations(TestAnnotations.annos) { c =>
      for (i <- 1 to 500) {
        c.clock.step(1000)
        c.io.mem_debug_read_address.poke((i * 4).U) // Avoid timeout
      }
      
      c.io.mem_debug_read_address.poke(4.U)
      c.clock.step()
      c.io.mem_debug_read_data.expect(0.U)
      
      c.io.mem_debug_read_address.poke(8.U)
      c.clock.step()
      c.io.mem_debug_read_data.expect(0.U)
      
      c.io.mem_debug_read_address.poke(12.U)
      c.clock.step()
      c.io.mem_debug_read_data.expect(1.U)
      
      c.io.mem_debug_read_address.poke(16.U)
      c.clock.step()
      c.io.mem_debug_read_data.expect(1.U)
      
    }
  }
}

My test checks the values in memory at addresses 0x4, 0x8, 0xC, and 0x10, with expected values 0, 0, 1, and 1 respectively.

Test output:

$ sbt "testOnly riscv.singlecycle.HW2Test"
[info] welcome to sbt 1.9.7 (OpenLogic Java 11.0.21)
[info] loading settings for project ca2023-lab3-build from plugins.sbt ...
[info] loading project definition from /home/ollieni/ca2023-lab3/project
[info] loading settings for project root from build.sbt ...
[info] set current project to mycpu (in build file:/home/ollieni/ca2023-lab3/)
[info] HW2Test:
[info] Single Cycle CPU
[info] - should Multiplication Overflow Prediction
[info] Run completed in 15 seconds, 742 milliseconds.
[info] Total number of tests run: 1
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[success] Total time: 17 s, completed Nov 28, 2023, 10:23:29 PM

Execute code with verilator

Execute the following command in the project’s root directory to generate Verilog files:
$ make verilator

Then load hw2yh.asmbin, simulate for 1000 cycles, and save the waveform to dump1.vcd.

$ ./run-verilator.sh -instruction src/main/resources/hw2yh.asmbin -time 2000 -vcd dump1.vcd

Check the waveform with gtkwave dump1.vcd.

To verify the hexadecimal representation, I use the command below.

$ hexdump src/main/resources/hello.asmbin | head -1

Output:

0000000 0a93 0005 0113 ff01 0297 0000 8293 25c2

This is consistent with my waveform.
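
When comparing against io_instruction, note that hexdump's default output shows 16-bit groups, so each 32-bit instruction spans two groups with the low half first (assuming a little-endian host). A minimal plain-Scala sketch of that regrouping follows; HexdumpWords and toWords are hypothetical names.

```scala
// Rebuilds 32-bit instruction words from hexdump's default 16-bit groups.
// Assumes a little-endian host, so the first group of each pair is the low half.
object HexdumpWords {
  def toWords(groups: Seq[String]): Seq[Long] =
    groups.grouped(2).collect {
      case Seq(lo, hi) =>
        (java.lang.Long.parseLong(hi, 16) << 16) | java.lang.Long.parseLong(lo, 16)
    }.toList
}
```

Applied to the line above, the groups 0a93 0005 0113 ff01 0297 0000 8293 25c2 become the words 00050a93, ff010113, 00000297, and 25c28293, which is the form the instruction appears in on the waveform.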

Description of key signals when different instructions are executed.

Instruction 1:
image
At this point, io_instruction represents sw x5, 0(x2).
So io_memory_bundle_write_enable is set to 1.
We can observe that io_memory_bundle_write_data is 00001264, which is the value in x5.
The address at which x5 is stored is 1FFFFFF0 (the value of x2 plus the offset 0).

Instruction 2:
image
The io_instruction (00155293) encodes srli x5, x10, 1.
I can't understand why my waveform has signals like this.