Assignment3: Single-Cycle RISC-V CPU

Contributed by Ray Huang (coding-ray), 2023.

Set up the Environment

Primary references of this section:

  1. System Software Programming & jserv. (2023). Lab3: Construct a single-cycle RISC-V CPU with Chisel.
  2. System Software Programming & jserv. (2023). sysprog21/ca2023-lab3: Lab3: Construct a single-cycle CPU with Chisel | GitHub.

Operating system: Debian 12.2 (Bookworm)

  1. Install Docker. (Ref: Install Docker Engine on Debian | Docker Docs)
    # remove all conflicting packages
    for pkg in docker.io docker-doc docker-compose podman-docker containerd runc; do
      if dpkg -s "$pkg" > /dev/null 2>&1; then
        sudo apt purge -y --quiet "$pkg"
      else
        echo "Not installed: $pkg"
      fi
    done

    # allow apt to use a repository over HTTPS
    sudo apt update && sudo apt install -y ca-certificates curl gnupg

    # add Docker's official GPG key
    sudo install -m 0755 -d /etc/apt/keyrings && \
    curl -fsSL https://download.docker.com/linux/debian/gpg | \
    sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg && \
    sudo chmod a+r /etc/apt/keyrings/docker.gpg

    # set up the repository
    echo \
    "deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/debian \
    "$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" | \
    sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

    # install the latest Docker engine
    sudo apt update && sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

    # add the current user to the `docker` group
    sudo usermod -aG docker "$USER"

    # activate the group change (this starts a new shell)
    newgrp docker

    # check that `docker version` gives no error messages
    docker version
    
  2. Install GTKWave.
    sudo apt install gtkwave

    Installed packages: libjudydebian1 gtkwave
  3. Install the GNU toolchain for RISC-V.
    # download and extract the toolchain
    cd /tmp
    wget https://github.com/xpack-dev-tools/riscv-none-elf-gcc-xpack/releases/download/v13.2.0-2/xpack-riscv-none-elf-gcc-13.2.0-2-linux-x64.tar.gz
    tar -xzf xpack-riscv-none-elf-gcc-13.2.0-2-linux-x64.tar.gz

    # create a version memo
    echo 13.2.0-2 > xpack-riscv-none-elf-gcc-13.2.0-2/version.txt

    # move the toolchain to ~/.local/share/, and add it to PATH
    mkdir -p ~/.local/share
    mv xpack-riscv-none-elf-gcc-13.2.0-2 ~/.local/share/riscv-none-elf-gcc
    echo "export PATH=\"\$HOME/.local/share/riscv-none-elf-gcc/bin:\$PATH\"" >> ~/.bashrc
    . ~/.bashrc

    # make sure the toolchain is installed successfully
    riscv-none-elf-gcc -v

    # clean up the downloaded archive
    rm -f xpack-riscv-none-elf-gcc-13.2.0-2-linux-x64.tar.gz
    
  4. Run or stop the Chisel bootcamp (a fork of freechipsproject/chisel-bootcamp:dev). Wait about 15 seconds after it starts (for the engine to load), then connect to it via http://127.0.0.1:8888/.
    # first run
    docker run -d -it --name chisel-bootcamp -p 8888:8888 sysprog21/chisel-bootcamp

    # stop with progress saved
    docker stop chisel-bootcamp

    # later run with progress restored
    docker start chisel-bootcamp
    
  5. Run or stop Lab 3 (a fork of sysprog21/ca2023-lab3). Attach to the running container with docker exec -it ca-lab3 /bin/bash.

    The processes inside the container run as the root user and can create files in the current directory, which is bind-mounted into the container. To delete these root-owned files, attach to the container first.

    # first run
    git clone https://github.com/coding-ray/2023-ca-lab-3 lab3
    cd lab3
    cp -r ~/.local/share/riscv-none-elf-gcc .
    docker build -t ca-lab3 .
    docker run -d -it --name ca-lab3 \
      --mount type=bind,src="$(pwd)",dst=/app \
      ca-lab3

    # stop with progress saved
    docker stop ca-lab3

    # later run with progress restored
    docker start ca-lab3
    

Lab 3: "MyCPU"

In this part, all waveforms are generated by the following command. Drop the prefix WRITE_VCD=1 to run the test cases faster. (On my old PC, the tests take 25 seconds with the VCD and 22 seconds without it.)

WRITE_VCD=1 sbt test

Instruction Fetching Test

In the instruction fetching (IF) part (src/main/scala/riscv/core/InstructionFetch.scala), the missing code assigns the program counter pc one of the following values.

  1. "pc + 4" if no branch is taken.
  2. "jump_address_id" (the address specified by the jump instruction) if a branch is taken.

A branch is taken when the input flag jump_flag_id is set.

In addition, the IF part does the following things.

  1. Link instruction_read_data, which is the instruction read from memory, to the output signal instruction.
  2. If instruction_valid is not set, hold pc at its current value (pc = pc) to implement a stall.
  3. Drive the output signal instruction_address with the value of pc at all times.
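
The update rule for pc described above can be sketched as a small C model. This is only an illustration under my own assumptions, not the Chisel code of the lab: the function name next_pc and the constant ENTRY_ADDRESS are hypothetical, the signal names are taken from the text, and 0x1000 is the entry address that shows up in the waveform.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constant: the reset value of pc seen in the waveform. */
#define ENTRY_ADDRESS 0x1000u

/* Value that pc takes at the next rising clock edge (hypothetical helper). */
uint32_t next_pc(uint32_t pc, bool reset, bool instruction_valid,
                 bool jump_flag_id, uint32_t jump_address_id)
{
    if (reset)
        return ENTRY_ADDRESS;   /* registers take their default value */
    if (!instruction_valid)
        return pc;              /* stall: hold the current value */
    if (jump_flag_id)
        return jump_address_id; /* branch taken */
    return pc + 4;              /* next sequential instruction */
}
```

For example, next_pc(0x1000, false, true, false, 0) gives 0x1004, while clearing instruction_valid keeps pc unchanged whatever the jump inputs are.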

Observations from the following waveform:

  1. 0-2 ps: When the processor boots up, the reset signal is set (pulled high), so registers (here, pc) are initialized to their default values (pc = entry address = 0x1000).
  2. 0-2 ps: Since instruction_valid is not set, pc = pc implements a stall.
  3. 0-2 ps: Before and after the rising (triggering) edge of the clock, the input signals (jump_flag_id, jump_address_id, instruction_valid, io_instruction) stay stable. These intervals are the setup time (before the edge) and the hold time (after it), which prevent undefined behavior.
  4. 2 ps: At the falling edge of the clock, input signals may change.
  5. 3 ps: instruction_valid and jump_flag_id are set, so pc = jump_address_id = 0x1000. Because it branches from 0x1000 to 0x1000, it looks like a stall.
    image

Observation from the following waveform: No branch is taken (jump_flag_id = 0), so pc = pc + 4 (the instruction width is 4 bytes).

image

Observation from the following waveform: A branch is taken, so pc = jump_address_id = 0x1000.

image

The observations above show that the IF part works as designed, though the output signal instruction is always 0 because the memory contains nothing.

Instruction Decoding Test

In the instruction decoding (ID) part (src/main/scala/riscv/core/InstructionDecode.scala), the missing code does the following two things.

  1. If the decoded instruction is L-type (lw, lh, lb, lhu, lbu, whose opcode is 0x03), the output flag memory_read_enable is true (1). Otherwise, it is false (0).
  2. If the instruction is S-type (sw, sh, sb, whose opcode is 0x23), the output flag memory_write_enable is true. Otherwise, it is false.
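
The two rules above depend only on the opcode, that is, the lowest 7 bits of the instruction. A minimal C sketch of that decision follows; the helper names opcode_of, memory_read_enable() and memory_write_enable() are hypothetical and merely mirror the output flags. It is not the Chisel code of the lab.

```c
#include <stdbool.h>
#include <stdint.h>

enum {
    OPCODE_LOAD  = 0x03, /* lb, lh, lw, lbu, lhu */
    OPCODE_STORE = 0x23  /* sb, sh, sw */
};

/* The opcode is the lowest 7 bits of a 32-bit RISC-V instruction. */
uint32_t opcode_of(uint32_t instruction)
{
    return instruction & 0x7Fu;
}

/* Hypothetical helper mirroring the output flag memory_read_enable. */
bool memory_read_enable(uint32_t instruction)
{
    return opcode_of(instruction) == OPCODE_LOAD;
}

/* Hypothetical helper mirroring the output flag memory_write_enable. */
bool memory_write_enable(uint32_t instruction)
{
    return opcode_of(instruction) == OPCODE_STORE;
}
```

With sw a0, 4(zero) (0x00A02223), only memory_write_enable() is true; with lw a0, 4(zero) (0x00402503), only memory_read_enable() is true.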

Observations from the following waveform:

  1. When the instruction is sw a0, 4(zero) (0x00A02223), its lower 7 bits give opcode = 0x23, so this instruction is S-type and memory_write_enable is true.
  2. No instruction in this test has opcode = 0x03, so the test does not cover L-type instructions. As a result, memory_read_enable is always false.
    image
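
To double-check the machine code of sw a0, 4(zero), the S-type layout of the RISC-V base ISA (imm[11:5], rs2, rs1, funct3, imm[4:0], opcode) can be assembled by hand. The helper encode_s_type() below is a hypothetical name used only for this sketch; a0 is register x10, zero is x0, and funct3 of sw is 2.

```c
#include <stdint.h>

/* Encode an S-type instruction (hypothetical helper):
 * imm[11:5] | rs2 | rs1 | funct3 | imm[4:0] | opcode */
uint32_t encode_s_type(int32_t imm, uint32_t rs2, uint32_t rs1, uint32_t funct3)
{
    uint32_t uimm = (uint32_t)imm & 0xFFFu;  /* keep the 12-bit immediate */
    return ((uimm >> 5) << 25)               /* imm[11:5] */
         | ((rs2 & 0x1Fu) << 20)
         | ((rs1 & 0x1Fu) << 15)
         | ((funct3 & 0x7u) << 12)
         | ((uimm & 0x1Fu) << 7)             /* imm[4:0] */
         | 0x23u;                            /* store opcode */
}
```

encode_s_type(4, 10, 0, 2) yields 0x00A02223, whose lowest 7 bits are 0x23.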

Instruction Execution Test

In the instruction execution (EXE) part (src/main/scala/riscv/core/Execute.scala), the missing code does the following three things.

  1. Connect the output alu_funct from the ALU control unit (alu_ctrl) to the input funct of the ALU (alu).
  2. Set op1 of the ALU to instruction_address when aluop1_source is set high. Otherwise, set it to the content of source register 1 (reg1_data).
  3. Set op2 of the ALU to the immediate when aluop2_source is set high. Otherwise, set it to the content of source register 2 (reg2_data).
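
The operand selection in items 2 and 3 is a pair of two-way multiplexers. A small C model is given below as a sketch; the struct alu_operands and the function select_alu_operands() are hypothetical names, while the signal names follow the text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the two ALU inputs. */
typedef struct {
    uint32_t op1;
    uint32_t op2;
} alu_operands;

/* Model of the two multiplexers in front of the ALU. */
alu_operands select_alu_operands(bool aluop1_source, bool aluop2_source,
                                 uint32_t instruction_address, uint32_t reg1_data,
                                 uint32_t immediate, uint32_t reg2_data)
{
    alu_operands ops;
    ops.op1 = aluop1_source ? instruction_address : reg1_data;
    ops.op2 = aluop2_source ? immediate : reg2_data;
    return ops;
}
```

With both select signals low, the ALU sees (reg1_data, reg2_data); with both high, it sees (instruction_address, immediate).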

The following waveform shows the case in which op1 is reg1_data and op2 is reg2_data.

image

The following waveform shows the case in which op1 is instruction_address and op2 is the immediate.

image

Register Reading and Writing Test

image

Byte Loading and Writing Test

For this test (byte loading and writing) and the following three tests (quick sorting, 10th Fibonacci number, palindrome checker), the external assembly code is loaded into memory by the class TestTopModule in the file src/test/scala/riscv/singlecycle/CPUTest.scala. The class loads the binary content of the file named by its argument, exeFilename, into the instruction ROM (src/main/scala/peripheral/InstructionROM.scala).
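
As a rough illustration of what loading a binary into a word-addressed ROM involves, the C sketch below packs raw bytes into 32-bit words. It assumes little-endian byte order (as RISC-V uses) and zero padding of the last word; both the helper name pack_words() and these details are my assumptions, not taken from TestTopModule or InstructionROM.scala.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack raw bytes into 32-bit little-endian words (hypothetical helper).
 * A trailing partial word is padded with zeros.
 * Returns the number of words written, at most max_words. */
size_t pack_words(const uint8_t *bytes, size_t n_bytes,
                  uint32_t *words, size_t max_words)
{
    size_t n_words = (n_bytes + 3) / 4;
    if (n_words > max_words)
        n_words = max_words;
    for (size_t w = 0; w < n_words; w++) {
        uint32_t word = 0;
        for (size_t b = 0; b < 4; b++) {
            size_t i = w * 4 + b;
            if (i < n_bytes)
                word |= (uint32_t)bytes[i] << (8 * b);
        }
        words[w] = word;
    }
    return n_words;
}
```

For instance, the bytes 0x23 0x22 0xA0 0x00 become the single word 0x00A02223, which is the encoding of sw a0, 4(zero).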

Incorporate Homework 2 to "MyCPU"

I make some minor changes to the C code from Homework 2 so that it works properly and is testable in "MyCPU". The code is in csrc/ispalindrome.c.

  1. At the end of the function is_palindrome(), I change the return values for palindrome and non-palindrome from (1, 0) to (1, 2). Otherwise, a non-palindrome result (0) would be indistinguishable from the initial content of the memory of MyCPU, which is also 0.
    if (a == b)
      return 1; // palindrome
    else
      return 2; // not palindrome
    
  2. After each is_palindrome() returns, I save its result in a separate local fixed-size array (results), which is located on the stack of the program.
  3. After all is_palindrome() calls finish, I write the results to the memory located in bytes 4-20 (the four words at addresses 4, 8, 12, and 16), which is in the code section of the program.
    for (int i = 1; i <= 4; i++) {
        *(volatile int *) (i * 4) = results[i - 1];
    }
    
  4. If I wrote the result of each is_palindrome() to the code section right after it returned, I would observe that the memory located in bytes 8-20 is entirely 0. I don't know the reason, but the workaround above avoids the problem.
  5. After the program terminates, I read the results from the memory located in bytes 4-20. They are as expected: the first two words are 1 (palindrome), and the last two words are 2 (not palindrome). This test is the class IsPalinedrome in the file src/test/scala/riscv/singlecycle/CPUTest.scala.
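
The reason for returning (1, 2) instead of (1, 0) can be seen in a small host-side C model, in which an ordinary array stands in for the memory of MyCPU. The names store_results(), all_written() and the NOT_WRITTEN constant are hypothetical; the sketch only mirrors the word addresses 4, 8, 12 and 16 used above.

```c
#include <stdint.h>

enum { NOT_WRITTEN = 0, PALINDROME = 1, NOT_PALINDROME = 2 };

#define N_RESULTS 4

/* Copy the results into words 1..4 of `memory`
 * (byte addresses 4, 8, 12, 16 in the real program). */
void store_results(uint32_t *memory, const int results[N_RESULTS])
{
    for (int i = 1; i <= N_RESULTS; i++)
        memory[i] = (uint32_t)results[i - 1];
}

/* Returns 1 when every result word differs from the initial value 0. */
int all_written(const uint32_t *memory)
{
    for (int i = 1; i <= N_RESULTS; i++)
        if (memory[i] == NOT_WRITTEN)
            return 0;
    return 1;
}
```

Because neither valid result equals 0, a word that is still 0 after the program ends can only mean that the store did not happen.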

The cursor in the first waveform marks the moment that the program counter leaves 0x1000. It is at around 2.7 ns.
image

The cursor in the last waveform is the moment that the program returns. It is at around 2716 ns.
image

Since the clock period is 2 ps, the program takes around (2716 ns - 2.7 ns) / 2 ps ≈ 1,357k clock cycles to finish.
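
The cycle estimate is a plain division of the elapsed simulation time by the clock period. A one-function C sketch, with the hypothetical name cycles_between() and all times expressed in picoseconds:

```c
#include <stdint.h>

/* Clock cycles elapsed between two timestamps; all values in picoseconds. */
uint64_t cycles_between(uint64_t start_ps, uint64_t end_ps, uint64_t period_ps)
{
    return (end_ps - start_ps) / period_ps;
}
```

With the cursor positions above, cycles_between(2700, 2716000, 2) is 1,356,650, that is, about 1,357k cycles. The inputs are read off the waveform by eye, so the figure is only approximate.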

Miscellaneous Learning Notes

  1. Functions in Scala work like C functions that are always static inline. (Ref. Lab 3)
  2. Best practices in Chisel development (Ref. Chisel Best Practices Intensive):
    1. Start from templates.
    2. Incorporate tests from the beginning. That is, test-driven development (TDD). (Ref. Chisel Introduction Intensive).
    3. Document your code. (With ScalaDoc tools, GitHub wiki, etc.)
    4. Follow coding styles. Use style guidelines.
    5. Have a code management strategy. (With git, branch early and often.)
    6. Don't repeat yourself. Utilize functions, objects and classes.
    7. Do more than you think you can in Scala. Notice the difference between Scala and Chisel.
    8. Use the collection library (featured in Scala).
    9. Use an IDE.
    10. Share your knowledge.
    11. Have your own Chisel support team.