
Assignment3: Single-cycle RISC-V CPU

contributed by < hungyuhang >

Environment Setup

Issue Encountered when Running Chisel Bootcamp

I ran the Chisel Bootcamp locally on my Windows laptop using Docker.
However, the Docker image provided by Chisel had some issues: after I started the container, the code cells in the bootcamp's Jupyter notebook could not be executed.

To solve the issue, I rebuilt the Docker image using the method in this article. After the fix, the bootcamp ran normally.

GTKWave Installation

To install the tool, I downloaded the source tarball gtkwave-3.3.117.tar.gz from GTKWave.

Below are the instructions for building and installing GTKWave from source, as listed in the README file:

  1. Type ./configure
  2. make
  3. make install (as root)

First, I ran ./configure, but it failed.
The README file provides a solution: some packages may need to be installed first, using the commands below:

sudo apt-get install libjudy-dev
sudo apt-get install libbz2-dev
sudo apt-get install liblzma-dev
sudo apt-get install libgconf2-dev
sudo apt-get install libgtk2.0-dev
sudo apt-get install tcl-dev
sudo apt-get install tk-dev
sudo apt-get install gperf
sudo apt-get install gtk2-engines-pixbuf

After installing the packages above, I installed GTKWave successfully by running ./configure, make, and make install.

Explanation of Hello World in Chisel

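// Hello World in Chisel
import chisel3._

class Hello extends Module {
  val io = IO(new Bundle {
    val led = Output(UInt(1.W))
  })
  // Number of clock cycles in half a blink period, minus one
  val CNT_MAX = (50000000 / 2 - 1).U
  val cntReg  = RegInit(0.U(32.W)) // free-running cycle counter
  val blkReg  = RegInit(0.U(1.W))  // current LED state
  cntReg := cntReg + 1.U
  // Every half period, restart the counter and toggle the LED
  when(cntReg === CNT_MAX) {
    cntReg := 0.U
    blkReg := ~blkReg
  }
  io.led := blkReg
}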

The code above declares a hardware module with a single 1-bit output, led.

Within the module, there are two registers:

  • cntReg is a counter.
  • blkReg is the current state of the LED.

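The LED state can be derived from the counter alone: if the counter runs over the full blink period instead of half of it, the LED is off during the first half of the period and on during the second half. The blkReg register can therefore be eliminated, as shown below: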

// Hello World in Chisel, after eliminating register blkReg
class Hello extends Module {
  val io = IO(new Bundle {
    val led = Output(UInt(1.W))
  })
  val CNT_MAX          = (50000000 - 1).U
  val TOGGLE_THRESHOLD = (50000000 / 2).U
  val cntReg           = RegInit(0.U(32.W))
  cntReg := cntReg + 1.U
  when(cntReg === CNT_MAX) {
    cntReg := 0.U
  }
  when(cntReg < TOGGLE_THRESHOLD) {
    io.led := 0.U
  }.otherwise {
    io.led := 1.U
  }
}
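
To check that the two variants really drive the LED identically, the following is a minimal software sketch in plain Scala (no Chisel). The object name BlinkModel and the period parameter are hypothetical; period stands in for the 50000000 cycles used above and is assumed to be even.

```scala
// Plain-Scala software model of the two Hello variants (illustration only).
object BlinkModel {
  // Variant 1: the counter wraps every half period and toggles a separate LED register.
  def withBlkReg(period: Int, cycles: Int): Seq[Int] = {
    val cntMax = period / 2 - 1
    var cnt = 0
    var blk = 0
    (0 until cycles).map { _ =>
      val led = blk // io.led := blkReg
      if (cnt == cntMax) { cnt = 0; blk = 1 - blk } else cnt += 1
      led
    }
  }

  // Variant 2: the counter wraps every full period; the LED is a comparison on the counter.
  def withoutBlkReg(period: Int, cycles: Int): Seq[Int] = {
    val cntMax    = period - 1
    val threshold = period / 2
    var cnt = 0
    (0 until cycles).map { _ =>
      val led = if (cnt < threshold) 0 else 1
      cnt = if (cnt == cntMax) 0 else cnt + 1
      led
    }
  }
}
```

With a small period such as 8, both functions produce four cycles of 0 followed by four cycles of 1, repeating. The second variant trades one register for a comparator on the counter value.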

Complete the Code of Lab3

Problems Encountered with GitHub

At first, I used the following command to push my commits to GitHub:

$ git push origin main

I typed in my GitHub username and password, but the terminal returned the following message:

remote: Support for password authentication was removed on August 13, 2021.
remote: Please see https://docs.github.com/en/get-started/getting-started-with-git/about-remote-repositories#cloning-with-https-urls for information on currently recommended modes of authentication.
fatal: Authentication failed for 'https://github.com/hungyuhang/ca2023-lab3/'

To solve the issue, I followed the instructions in the GitHub documentation and installed Git Credential Manager (GCM), which fixed the issue.

My Code of Lab3

Here is my repository for lab3, which is forked from ca2023-lab3.

Running Unit Tests

To run the unit tests, I use the command:

$ sbt test

And here is the output:

[info] welcome to sbt 1.9.7 (Eclipse Adoptium Java 11.0.21)
[info] loading settings for project ca2023-lab3-build from plugins.sbt ...
[info] loading project definition from /home/hungyuhang/ca2023-lab3/project
[info] loading settings for project root from build.sbt ...
[info] set current project to mycpu (in build file:/home/hungyuhang/ca2023-lab3/)
[info] ExecuteTest:
[info] Execution of Single Cycle CPU
[info] - should execute correctly
[info] ByteAccessTest:
[info] Single Cycle CPU
[info] - should store and load a single byte
[info] InstructionFetchTest:
[info] InstructionFetch of Single Cycle CPU
[info] - should fetch instruction
[info] QuicksortTest:
[info] Single Cycle CPU
[info] - should perform a quicksort on 10 numbers
[info] InstructionDecoderTest:
[info] InstructionDecoder of Single Cycle CPU
[info] - should produce correct control signal
[info] RegisterFileTest:
[info] Register File of Single Cycle CPU
[info] - should read the written content
[info] - should x0 always be zero
[info] - should read the writing content
[info] FibonacciTest:
[info] Single Cycle CPU
[info] - should recursively calculate Fibonacci(10)
[info] Run completed in 30 seconds, 3 milliseconds.
[info] Total number of tests run: 9
[info] Suites: completed 7, aborted 0
[info] Tests: succeeded 9, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[success] Total time: 32 s, completed Nov 24, 2023, 10:28:59 AM

The sections below are descriptions of each unit test:

ExecuteTest

This unit test checks the Execute module using the add and beq instructions.
Specifically, the test supplies an instruction, varies the input register values, and checks whether the module's output matches the expected value.

ByteAccessTest

This unit test loads the program sb.asmbin, runs it on our CPU, and checks the values of specific registers (t0, t1, ra) after execution. sb.asmbin is a simple program that contains memory instructions such as lw and sb.

To make our CPU run a program, the testing code first instantiates a TestTopModule module. This module acts as an entire system: besides the CPU module, it also contains the following modules:

  • InstructionROM
    This module is a read-only memory (ROM). It takes the program's binary (.asmbin) file as input and stores the content of the file in its memory space.
  • Memory
    This module is the RAM of the system. The CPU loads data from and stores data to this module during program execution.
  • ROMLoader
    This module copies the contents of InstructionROM into Memory before program execution. Parameters.EntryAddress specifies where the data is placed in Memory.

The block diagram of TestTopModule looks like this:

[Figure: block diagram of TestTopModule]

To test register and memory values, TestTopModule also provides debug ports that read them directly. The code below shows the definitions of the debug ports:

val mem_debug_read_address  = Input(UInt(Parameters.AddrWidth))
val regs_debug_read_address = Input(UInt(Parameters.PhysicalRegisterAddrWidth))
val regs_debug_read_data    = Output(UInt(Parameters.DataWidth))
val mem_debug_read_data     = Output(UInt(Parameters.DataWidth))

For example, to check whether register ra contains the value 0x15ef, use the testing code below:

c.io.regs_debug_read_address.poke(1.U) // ra
c.io.regs_debug_read_data.expect(0x15ef.U)
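
To illustrate how byte stores can build up a value such as 0x15ef, here is a small plain-Scala sketch of a little-endian memory with sb-like and lw-like operations. The names ByteStoreModel and Memory are hypothetical, and the store sequence in the usage note is an assumption chosen to reproduce 0x15ef; the actual contents of sb.asmbin are not shown in this note.

```scala
import scala.collection.mutable

// Software model of byte stores into a little-endian, word-organized memory (illustration only).
object ByteStoreModel {
  final class Memory {
    // Maps a word index (address / 4) to a 32-bit word held in a Long.
    private val words = mutable.Map.empty[Long, Long]

    // Models sb: only the lowest byte of `value` is written, at the byte lane given by address % 4.
    def storeByte(address: Long, value: Long): Unit = {
      val index = address >> 2
      val shift = ((address & 3L) * 8).toInt
      val mask  = 0xffL << shift
      val old   = words.getOrElse(index, 0L)
      words(index) = (old & ~mask) | ((value & 0xffL) << shift)
    }

    // Models lw on a word-aligned address; unwritten memory reads as zero.
    def loadWord(address: Long): Long = words.getOrElse(address >> 2, 0L)
  }
}
```

Under these assumptions, storing the low byte of 0xdeadbeef at address 4 and then the byte 0x15 at address 5 makes a word load from address 4 return 0x15ef.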

InstructionFetchTest

This unit test checks the InstructionFetch module, verifying that it generates the correct PC address. Here is the testing logic:

  • If jump_flag_id is false, the PC should increase by 4 at the next clock cycle.
  • If jump_flag_id is true, the PC should change to the specified jump address at the next clock cycle.
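
The two rules above can be sketched as a pure next-PC function in plain Scala. This is a software model, not the Chisel module itself; the names NextPc and nextPc are hypothetical, and the PC is assumed to be 32 bits wide.

```scala
// Software model of the next-PC selection checked by InstructionFetchTest (illustration only).
object NextPc {
  private val Mask32 = 0xffffffffL

  // Returns the PC value for the next clock cycle.
  def nextPc(pc: Long, jumpFlag: Boolean, jumpAddress: Long): Long =
    if (jumpFlag) jumpAddress & Mask32 // jump taken: use the supplied address
    else (pc + 4) & Mask32             // otherwise: next sequential instruction
}
```

For example, nextPc(0x1000, jumpFlag = false, jumpAddress = 0x2000) gives 0x1004, while the same call with jumpFlag = true gives 0x2000.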

QuicksortTest

This unit test loads the program quicksort.asmbin, runs it on our CPU, and checks whether the data at specific memory addresses matches the expected values after execution. quicksort.asmbin runs the quicksort algorithm and stores the result in memory.

InstructionDecoderTest

This unit test checks the InstructionDecode module, verifying that it generates the correct control signals for different input instructions such as sw, lui, and add.

RegisterFileTest

This unit test checks the RegisterFile module. It contains the three tests below:

  1. Tests whether a register reads out the correct value after that register has been written.
  2. Tests whether register zero always returns 0.
  3. Tests whether a register reads out the correct value while the write_enable bit is asserted.
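
As a rough illustration of the first two checks, here is a plain-Scala software model of a 32-entry register file whose register 0 is hard-wired to zero. The class name RegFileModel is hypothetical, and the cycle-level behavior exercised by the third test is not modeled here.

```scala
// Software model of a RISC-V register file: 32 registers of 32 bits, x0 always reads as zero.
final class RegFileModel {
  private val regs = Array.fill(32)(0L)

  // Writes to x0 are silently ignored.
  def write(address: Int, data: Long): Unit =
    if (address != 0) regs(address) = data & 0xffffffffL

  // Reads of x0 always return zero.
  def read(address: Int): Long =
    if (address == 0) 0L else regs(address)
}
```

Writing 0xdeadbeef to register 1 and reading it back returns 0xdeadbeef, while any write to register 0 leaves its read value at 0.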

FibonacciTest

This unit test loads the program fibonacci.asmbin, runs it on our CPU, and checks whether the data at a specific memory address matches the expected value after execution. fibonacci.asmbin computes a Fibonacci number and stores the result in memory.

Waveform Analysis

InstructionFetchTest

Screenshot from 2023-11-24 16-17-13
From the waveform, we can see that the instruction address increases by 4 on each clock cycle and changes to the jump address (which is 0x1000) when io_jump_flag_id is set high.

InstructionDecoderTest

Screenshot from 2023-11-24 16-47-15
The signals I added in InstructionDecode.scala are io_memory_read_enable and io_memory_write_enable. In the waveform, we can see that when the opcode is 0x23, which is an S-type instruction, io_memory_write_enable goes high.

As for io_memory_read_enable, the test does not include an instruction that triggers it, so the signal stays low.
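
The decoding rule behind these two signals can be sketched in plain Scala as follows. This is a software model rather than the code in InstructionDecode.scala; the object and function names are hypothetical. The opcode values are the standard RV32I ones: 0x03 for loads and 0x23 for stores.

```scala
// Software model of the memory enable signals derived from the opcode (illustration only).
object MemEnableModel {
  val OpcodeLoad  = 0x03 // I-type loads such as lw, lb
  val OpcodeStore = 0x23 // S-type stores such as sw, sb

  // The opcode is the lowest 7 bits of the instruction word.
  def opcode(instruction: Long): Int = (instruction & 0x7f).toInt

  def memoryReadEnable(instruction: Long): Boolean  = opcode(instruction) == OpcodeLoad
  def memoryWriteEnable(instruction: Long): Boolean = opcode(instruction) == OpcodeStore
}
```

For instance, 0x00112023 (sw ra, 0(sp)) asserts only the write enable, and 0x00052303 (lw t1, 0(a0)) asserts only the read enable.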

ExecuteTest

Screenshot from 2023-11-24 17-28-08
In the test of the add instruction, we can see that:

  • The value of ALU input io_op1 is equal to io_reg1_data.
  • The value of ALU input io_op2 is equal to io_reg2_data.
  • The value of ALU input io_func is equal to 1, which is ALUFunctions.add.
  • The value of ALU output io_result is equal to the sum of io_op1 and io_op2.

Screenshot from 2023-11-24 17-37-03
And in the test of the beq instruction, we can see that:

  • io_aluop1_source and io_aluop2_source are now high.
  • The value of ALU input io_op1 is now equal to io_instruction_address.
  • The value of ALU input io_op2 is now equal to io_immediate.
  • The value of ALU input io_func is equal to 1, which is still ALUFunctions.add.
  • The value of ALU output io_result is equal to the sum of io_op1 and io_op2.
  • When io_reg1_data is not equal to io_reg2_data, io_if_jump_flag is 0.
  • When io_reg1_data is equal to io_reg2_data, io_if_jump_flag will change to 1.
    • And the value of io_result is now the jump address.
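
The behavior observed in the two waveforms can be summarized by a small plain-Scala model. It is a sketch under the assumption of 32-bit wrap-around arithmetic; ExecuteModel, Output, and the parameter names are hypothetical and only mirror the signal names above.

```scala
// Software model of the Execute stage for add and beq (illustration only).
object ExecuteModel {
  private val Mask32 = 0xffffffffL

  final case class Output(aluResult: Long, jumpFlag: Boolean)

  // add: both ALU operands come from the register file; no jump is requested.
  def add(reg1Data: Long, reg2Data: Long): Output =
    Output((reg1Data + reg2Data) & Mask32, jumpFlag = false)

  // beq: the ALU adds the instruction address and the immediate to form the branch target;
  // the jump flag is set only when the two register values are equal.
  def beq(instructionAddress: Long, immediate: Long, reg1Data: Long, reg2Data: Long): Output =
    Output((instructionAddress + immediate) & Mask32, jumpFlag = reg1Data == reg2Data)
}
```

In both cases the ALU performs an addition; only the operand sources and the jump flag differ.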

HW2 Assembly Code Adaptation

This part adapts the code in homework2 to lab3.
You can find the commit history of the adaptation in my repository on GitHub.

Adapting the Assembly Code to Lab3

First, I performed the following steps:

  1. Put hw2_asm.S in the csrc directory.
  2. Remove the code related to rdcycle and rdcycleh in hw2_asm.S.
  3. Add hw2_asm.asmbin to Makefile in the csrc directory.
    • Append hw2_asm.asmbin at the end of the BINS variable.
  4. Run $ make update to generate hw2_asm.asmbin.
  5. Add a blank test for hw2_asm.asmbin in CPUTest.scala.

At this point, the code worked fine when I ran the following command:

$ sbt "testOnly riscv.singlecycle.HW2Test"

Then I performed the following steps:

  1. Modify hw2_asm.S.
    • Make the code store its results at memory addresses 0x0000000C, 0x00000008, and 0x00000004.
  2. Write the corresponding test code in CPUTest.scala.

The modified code still passes the test. Here is the test output:

[info] welcome to sbt 1.9.7 (Eclipse Adoptium Java 11.0.21)
[info] loading settings for project ca2023-lab3-build from plugins.sbt ...
[info] loading project definition from /home/hungyuhang/ca2023-lab3/project
[info] loading settings for project root from build.sbt ...
[info] set current project to mycpu (in build file:/home/hungyuhang/ca2023-lab3/)
[info] HW2Test:
[info] Single Cycle CPU
[info] - should calculate the leftmost-zero-byte of 3 64bit numbers
[info] Run completed in 10 seconds, 514 milliseconds.
[info] Total number of tests run: 1
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[success] Total time: 14 s, completed Nov 24, 2023, 10:33:10 PM

Using Verilator to Run the Assembly

Use the following command to generate the simulation executable file of the CPU:

$ make verilator

And use the following command to run hw2_asm.asmbin on the simulated CPU:

$ ./run-verilator.sh -instruction src/main/resources/hw2_asm.asmbin -time 4000 -vcd dump01.vcd

Output:

-time 4000
-memory 1048576
-instruction src/main/resources/hw2_asm.asmbin
[-------------------->] 100%

Use GTKWave to view the output waveform file dump01.vcd:

Case 1
Screenshot from 2023-11-25 00-10-12

  • Instruction 0x00AE6E33 is equal to or t3, t3, a0.
  • alu_io_func is 6, which stands for ALUFunctions.or in the CPU code.
  • The value of alu_io_result is equal to 0x90A1B2C3 | 0x55007700.

Case 2
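
The decoding of 0x00AE6E33 can be reproduced with a short plain-Scala sketch that extracts the R-type fields defined by the RISC-V base ISA. The names RTypeDecode and RType are hypothetical.

```scala
// Field extraction for RISC-V R-type instructions (illustration only).
object RTypeDecode {
  final case class RType(opcode: Int, rd: Int, funct3: Int, rs1: Int, rs2: Int, funct7: Int)

  def decode(instruction: Long): RType = RType(
    opcode = (instruction & 0x7f).toInt,         // bits 6..0
    rd     = ((instruction >> 7) & 0x1f).toInt,  // bits 11..7
    funct3 = ((instruction >> 12) & 0x7).toInt,  // bits 14..12
    rs1    = ((instruction >> 15) & 0x1f).toInt, // bits 19..15
    rs2    = ((instruction >> 20) & 0x1f).toInt, // bits 24..20
    funct7 = ((instruction >> 25) & 0x7f).toInt  // bits 31..25
  )
}
```

Decoding 0x00AE6E33 yields opcode 0x33, funct3 6 (or), rd = rs1 = 28 (t3), and rs2 = 10 (a0). The OR of the two operand values, 0x90A1B2C3 | 0x55007700, evaluates to 0xD5A1F7C3.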
Screenshot from 2023-11-25 09-15-29

  • Instruction 0x00112023 is equal to sw ra, 0(sp).
  • Since it is a store word instruction, io_memory_write_enable is 1.
  • And the instruction will write the value 0x00001020 to memory address 0xFFFFFFF8.
  • The value of alu_io_op1 is the base memory address, which is the value of sp.
  • The value of alu_io_op2 is the offset of the memory address, which is 0.
  • The address input of the memory io_memory_bundle_address is 0xFFFFFFF8, which is equal to the ALU output alu_io_result.
  • The data input of the memory io_memory_bundle_write_data is 0x00001020, which is equal to the value of io_reg2_data.
  • And since io_read_address2 is 1, the value of io_reg2_data is now the value of register ra.
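
The address calculation for this store can be sketched in plain Scala by decoding the S-type fields and adding the sign-extended immediate to the base register value. STypeDecode and its function names are hypothetical; the bit positions follow the RISC-V base ISA.

```scala
// Field extraction and address calculation for RISC-V S-type instructions (illustration only).
object STypeDecode {
  def rs1(instruction: Long): Int = ((instruction >> 15) & 0x1f).toInt // base address register
  def rs2(instruction: Long): Int = ((instruction >> 20) & 0x1f).toInt // register holding the data

  // imm[11:5] is in bits 31..25 and imm[4:0] is in bits 11..7; the result is sign-extended.
  def immediate(instruction: Long): Int = {
    val raw = (((instruction >> 25) & 0x7f) << 5) | ((instruction >> 7) & 0x1f)
    ((raw << 52) >> 52).toInt // sign-extend from 12 bits
  }

  // Effective address = base + immediate, wrapped to 32 bits.
  def storeAddress(base: Long, instruction: Long): Long =
    (base + immediate(instruction)) & 0xffffffffL
}
```

For 0x00112023 this gives rs1 = 2 (sp), rs2 = 1 (ra), and an immediate of 0, so with sp equal to 0xFFFFFFF8 the store address is 0xFFFFFFF8.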

Case 3
Screenshot from 2023-11-25 01-02-17

  • Instruction 0x084000EF is equal to jal ra, 132.
  • io_if_jump_flag is set to 1.
  • The value of io_if_jump_address is equal to the sum of the current instruction address and the immediate field of the instruction, which are 0x0000101C and 0x84 respectively.
  • And from the waveform graph, we can see that the instruction address at the next clock cycle changed to 0x000010A0, which is equal to io_if_jump_address.
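
The jump target arithmetic can be checked with a plain-Scala sketch that reassembles the J-type immediate from its scattered bit fields, as defined by the RISC-V base ISA. JTypeDecode and its function names are hypothetical.

```scala
// J-type immediate decoding and jump target calculation for jal (illustration only).
object JTypeDecode {
  def immediate(instruction: Long): Int = {
    val imm20    = (instruction >> 31) & 0x1   // imm[20]    from bit 31
    val imm10_1  = (instruction >> 21) & 0x3ff // imm[10:1]  from bits 30..21
    val imm11    = (instruction >> 20) & 0x1   // imm[11]    from bit 20
    val imm19_12 = (instruction >> 12) & 0xff  // imm[19:12] from bits 19..12
    val raw = (imm20 << 20) | (imm19_12 << 12) | (imm11 << 11) | (imm10_1 << 1)
    ((raw << 43) >> 43).toInt // sign-extend from 21 bits
  }

  // Jump target = address of the jal instruction + immediate, wrapped to 32 bits.
  def jumpTarget(instructionAddress: Long, instruction: Long): Long =
    (instructionAddress + immediate(instruction)) & 0xffffffffL
}
```

For 0x084000EF the immediate decodes to 132 (0x84), and adding it to the instruction address 0x0000101C gives 0x000010A0, matching the waveform.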