Pipelined RISC-V core with RV32IMZbaZbbZbcZbs

黃丞漢, 江冠德

GitHub

Goal of the Task

Our goal is to extend the 5-stage pipelined RISC-V CPU to support the RV32IM instruction set together with the bit-manipulation extensions (Zba, Zbb, Zbc, and Zbs).

Prerequisites

Environment

Ubuntu 24.04.1 LTS on WSL2

Install JDK

$ sudo apt update
$ sudo apt -y install openjdk-11-jdk-headless

Install sbt

$ sudo apt update
$ sudo apt install apt-transport-https curl gnupg -yqq
$ echo "deb https://repo.scala-sbt.org/scalasbt/debian all main" | sudo tee /etc/apt/sources.list.d/sbt.list
$ echo "deb https://repo.scala-sbt.org/scalasbt/debian /" | sudo tee /etc/apt/sources.list.d/sbt_old.list
$ curl -sL "https://keyserver.ubuntu.com/pks/lookup?op=get&search=0x2EE0EA64E40A89B84B2DF73499E82A75642AC823" | sudo -H gpg --no-default-keyring --keyring gnupg-ring:/etc/apt/trusted.gpg.d/scalasbt-release.gpg --import
$ sudo chmod 644 /etc/apt/trusted.gpg.d/scalasbt-release.gpg
$ sudo apt-get update
$ sudo apt-get install sbt

Check if sbt works

$ sbt run
[info] welcome to sbt 1.9.6 (Ubuntu Java 11.0.25)
[info] loading settings for project riscv-core-build from plugins.sbt ...
[info] loading project definition from /mnt/c/Users/User/Desktop/riscv-core/project
[info] loading settings for project root from build.sbt ...
[info] set current project to yatcpu (in build file:/mnt/c/Users/User/Desktop/riscv-core/)
[info] running board.verilator.VerilogGenerator
[success] Total time: 16 s, completed Jan 17, 2025, 9:11:41 PM

Install verilator

$ sudo apt update
$ sudo apt install verilator

Install GTKWave

$ sudo apt update
$ sudo apt install gtkwave

RISC-V core

How to execute

Run make test in the riscv-core folder:

[info] FiveStageCPUFinalTest:
[info] Five-stage Pipelined CPU with Reduced Branch Delay
[info] - should calculate recursively fibonacci(10)
[info] - should quicksort 10 numbers
[info] - should store and load single byte
[info] - should solve data and control hazards
[info] Run completed in 23 seconds, 689 milliseconds.
[info] Total number of tests run: 4
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 4, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[success] Total time: 30 s, completed Jan 19, 2025, 8:40:37 PM

RISC-V B-extension

Group the B-extension instructions by opcode

opcode : 0010011

funct7 shamt rs1 funct3 rd opcode Instruction(s)
0010100 5 bits 5 bits 001 5 bits 0010011 bseti
0100100 5 bits 5 bits 001 5 bits 0010011 bclri
0110100 5 bits 5 bits 001 5 bits 0010011 binvi
0110000 00000 5 bits 001 5 bits 0010011 clz
0110000 00001 5 bits 001 5 bits 0010011 ctz
0110000 00010 5 bits 001 5 bits 0010011 cpop
0110000 00100 5 bits 001 5 bits 0010011 sext.b
0110000 00101 5 bits 001 5 bits 0010011 sext.h
0100100 5 bits 5 bits 101 5 bits 0010011 bexti
0110000 5 bits 5 bits 101 5 bits 0010011 rori
0010100 00111 5 bits 101 5 bits 0010011 orc.b
0110100 11000 5 bits 101 5 bits 0010011 rev8

opcode : 0110011

funct7 rs2 rs1 funct3 rd opcode instruction
0000101 5 bits 5 bits 001 5 bits 0110011 clmul
0010100 5 bits 5 bits 001 5 bits 0110011 bset
0100100 5 bits 5 bits 001 5 bits 0110011 bclr
0110000 5 bits 5 bits 001 5 bits 0110011 rol
0110100 5 bits 5 bits 001 5 bits 0110011 binv
0000101 5 bits 5 bits 010 5 bits 0110011 clmulr
0010000 5 bits 5 bits 010 5 bits 0110011 sh1add
0000101 5 bits 5 bits 011 5 bits 0110011 clmulh
0000100 00000 5 bits 100 5 bits 0110011 zext.h
0000101 5 bits 5 bits 100 5 bits 0110011 min
0010000 5 bits 5 bits 100 5 bits 0110011 sh2add
0100000 5 bits 5 bits 100 5 bits 0110011 xnor
0000101 5 bits 5 bits 101 5 bits 0110011 minu
0100100 5 bits 5 bits 101 5 bits 0110011 bext
0110000 5 bits 5 bits 101 5 bits 0110011 ror
0000101 5 bits 5 bits 110 5 bits 0110011 max
0010000 5 bits 5 bits 110 5 bits 0110011 sh3add
0100000 5 bits 5 bits 110 5 bits 0110011 orn
0000101 5 bits 5 bits 111 5 bits 0110011 maxu
0100000 5 bits 5 bits 111 5 bits 0110011 andn
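
The rows above can be turned into 32-bit machine words by packing the fields in the usual order (funct7, rs2/shamt, rs1, funct3, rd, opcode). The following is a minimal plain-Scala sketch of that packing. It is only an illustration and is not part of the CPU sources; the object name BExtEncode and its helper names are hypothetical.

```scala
// Hypothetical helper (not from the project): packs table rows into machine words.
object BExtEncode {
  val OpImm = 0x13 // opcode 0010011
  val Op    = 0x33 // opcode 0110011

  // Field layout: funct7[31:25] | rs2 or shamt[24:20] | rs1[19:15] | funct3[14:12] | rd[11:7] | opcode[6:0]
  def pack(funct7: Int, rs2OrShamt: Int, rs1: Int, funct3: Int, rd: Int, opcode: Int): Int =
    ((funct7 & 0x7f) << 25) | ((rs2OrShamt & 0x1f) << 20) | ((rs1 & 0x1f) << 15) |
      ((funct3 & 0x7) << 12) | ((rd & 0x1f) << 7) | (opcode & 0x7f)

  // clz rd, rs1: funct7 = 0110000, bits [24:20] = 00000, funct3 = 001
  def clz(rd: Int, rs1: Int): Int = pack(0x30, 0, rs1, 1, rd, OpImm)

  // bseti rd, rs1, shamt: funct7 = 0010100, funct3 = 001
  def bseti(rd: Int, rs1: Int, shamt: Int): Int = pack(0x14, shamt, rs1, 1, rd, OpImm)

  // sh1add rd, rs1, rs2: funct7 = 0010000, funct3 = 010
  def sh1add(rd: Int, rs1: Int, rs2: Int): Int = pack(0x10, rs2, rs1, 2, rd, Op)

  def main(args: Array[String]): Unit = {
    println(f"clz    x5, x5       -> 0x${clz(5, 5)}%08x")
    println(f"bseti  x6, x6, 8    -> 0x${bseti(6, 6, 8)}%08x")
    println(f"sh1add x10, x11, x12 -> 0x${sh1add(10, 11, 12)}%08x")
  }
}
```

Such a helper can be handy for cross-checking a disassembly listing against the table when a test fails.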

RV32-IMB

How to decode

In InstructionDecode.scala, the opcode is first used to determine the type of the instruction, and funct3 is then used to identify the individual RV32I instructions.
However, from the table above, we can see that the B-extension instructions require additional checks on funct7, and some instructions even need to examine bits [24:20]. Therefore, we need to add new objects in InstructionDecode.scala to decode these complex instructions.
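
The decoding order described here (opcode, then funct3, then funct7, then bits [24:20]) can be sketched in plain Scala for the 0010011 opcode. This is a software illustration only, not the Chisel code used in the CPU; the object BExtDecode, the Fields case class, and the returned mnemonic strings are assumptions made for the example. Unlike a funct7-only lookup, this sketch also checks bits [24:20] for orc.b and rev8.

```scala
// Hypothetical software decoder (not from the project) for opcode 0010011.
object BExtDecode {
  final case class Fields(opcode: Int, funct3: Int, funct7: Int, rs2OrShamt: Int)

  // Slice the fields out of a 32-bit instruction word.
  def fields(inst: Int): Fields = Fields(
    opcode     = inst & 0x7f,
    funct3     = (inst >>> 12) & 0x7,
    funct7     = (inst >>> 25) & 0x7f,
    rs2OrShamt = (inst >>> 20) & 0x1f
  )

  def decodeOpImm(inst: Int): String = {
    val f = fields(inst)
    require(f.opcode == 0x13, "expected opcode 0010011")
    (f.funct3, f.funct7) match {
      // funct3 = 001: funct7 selects the instruction
      case (1, 0x00) => "slli"
      case (1, 0x14) => "bseti"
      case (1, 0x24) => "bclri"
      case (1, 0x34) => "binvi"
      // funct7 = 0110000: bits [24:20] select the instruction
      case (1, 0x30) =>
        f.rs2OrShamt match {
          case 0 => "clz"
          case 1 => "ctz"
          case 2 => "cpop"
          case 4 => "sext.b"
          case 5 => "sext.h"
          case _ => "unknown"
        }
      // funct3 = 101
      case (5, 0x00) => "srli"
      case (5, 0x20) => "srai"
      case (5, 0x24) => "bexti"
      case (5, 0x30) => "rori"
      case (5, 0x14) if f.rs2OrShamt == 7  => "orc.b"
      case (5, 0x34) if f.rs2OrShamt == 24 => "rev8"
      // Remaining RV32I instructions are identified by funct3 alone
      case (0, _) => "addi"
      case (2, _) => "slti"
      case (3, _) => "sltiu"
      case (4, _) => "xori"
      case (6, _) => "ori"
      case (7, _) => "andi"
      case _      => "unknown"
    }
  }

  def main(args: Array[String]): Unit = {
    println(decodeOpImm(0x60029293)) // encoding of clz x5, x5
    println(decodeOpImm(0x6982d293)) // encoding of rev8 x5, x5
  }
}
```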

/src/main/scala/riscv/core/fivestage_final/InstructionDecode.scala

...
 //funct3
 object InstructionsTypeI {
   val addi  = 0.U
+  val slli_and_B_extension = 1.U
   val slti  = 2.U
   val sltiu = 3.U
   val xori  = 4.U
+  val srli_srai_and_B_extension = 5.U
   val ori   = 6.U
   val andi  = 7.U
 }

 //funct7
+object InstructionsTypeI_funct3is1_funct7 {
+  val slli  = "b0000000".U
+  val bseti = "b0010100".U
+  val bclri = "b0100100".U
+  val binvi = "b0110100".U
+  val other = "b0110000".U
+}

 //funct7
+object InstructionsTypeI_funct3is5_funct7 {
+  val srli  = "b0000000".U
+  val srai  = "b0100000".U
+  val bexti = "b0100100".U
+  val rori  = "b0110000".U
+  val orcb  = "b0010100".U
+  val rev8  = "b0110100".U
+}

 //shamt
+object InstructionsTypeI_funct3is1_funct7is48_shamt {
+  val clz   = "b00000".U
+  val ctz   = "b00001".U
+  val cpop  = "b00010".U
+  val sextb = "b00100".U
+  val sexth = "b00101".U
+}

 //funct3
 object InstructionsTypeR {
   val add_sub = 0.U
+  val sll_and_B_extension     = 1.U // ++
+  val slt_and_B_extension     = 2.U // +++
+  val sltu_and_B_extension    = 3.U // +++
+  val xor_and_B_extension     = 4.U // +++
+  val srl_sra_and_B_extension = 5.U // +++
+  val or_and_B_extension      = 6.U // +++
+  val and_and_B_extension     = 7.U // +++
 }

 // sll_and_B_extension
+object InstructionsTypeR_funct3is1_funct7 {
+  val sll   = "b0000000".U
+  val clmul = "b0000101".U
+  val bset  = "b0010100".U
+  val bclr  = "b0100100".U
+  val rol   = "b0110000".U
+  val binv  = "b0110100".U
+}

 // slt_and_B_extension
+object InstructionsTypeR_funct3is2_funct7 {
+  val slt    = "b0000000".U
+  val clmulr = "b0000101".U
+  val sh1add = "b0010000".U
+}

 // sltu_and_B_extension
+object InstructionsTypeR_funct3is3_funct7 {
+  val sltu   = "b0000000".U
+  val clmulh = "b0000101".U
+}

 // xor_and_B_extension
+object InstructionsTypeR_funct3is4_funct7 {
+  val xor    = "b0000000".U
+  val zexth  = "b0000100".U
+  val min    = "b0000101".U
+  val sh2add = "b0010000".U
+  val xnor   = "b0100000".U
+}

 // srl_sra_and_B_extension
+object InstructionsTypeR_funct3is5_funct7 {
+  val srl  = "b0000000".U
+  val sra  = "b0100000".U
+  val minu = "b0000101".U
+  val bext = "b0100100".U
+  val ror  = "b0110000".U
+}

 // or_and_B_extension
+object InstructionsTypeR_funct3is6_funct7 {
+  val or     = "b0000000".U
+  val max    = "b0000101".U
+  val sh3add = "b0010000".U
+  val orn    = "b0100000".U
+}

 // and_and_B_extension
+object InstructionsTypeR_funct3is7_funct7 {
+  val and  = "b0000000".U
+  val maxu = "b0000101".U
+  val andn = "b0100000".U
+}
...

To decode instructions that use bits [24:20], we added the variable rs2_or_shamt in Execute.scala for further decoding.

/src/main/scala/riscv/core/fivestage_final/Execute.scala

...
   val opcode = io.instruction(6, 0)
   val funct3 = io.instruction(14, 12)
   val funct7 = io.instruction(31, 25)
+  val rs2_or_shamt = io.instruction(24, 20) // +++
   val uimm = io.instruction(19, 15)

   val alu = Module(new ALU)
   val alu_ctrl = Module(new ALUControl)

   alu_ctrl.io.opcode := opcode
   alu_ctrl.io.funct3 := funct3
   alu_ctrl.io.funct7 := funct7
+  alu_ctrl.io.rs2_or_shamt := rs2_or_shamt // +++
   alu.io.func := alu_ctrl.io.alu_funct
...

With the newly added objects and rs2_or_shamt, we modified the original ALUControl so that it can decode all RV32-IMB instruction encodings correctly.

/src/main/scala/riscv/core/ALUControl.scala

...
 class ALUControl extends Module {
   val io = IO(new Bundle {
     val opcode = Input(UInt(7.W))
     val funct3 = Input(UInt(3.W))
     val funct7 = Input(UInt(7.W))
+    val rs2_or_shamt = Input(UInt(5.W))
     val alu_funct = Output(ALUFunctions())
   })

   io.alu_funct := ALUFunctions.zero

   switch(io.opcode) {
     is(InstructionTypes.I) {
       io.alu_funct := MuxLookup(
         io.funct3,
         ALUFunctions.zero,
         IndexedSeq(
           InstructionsTypeI.addi  -> ALUFunctions.add,
           InstructionsTypeI.slti  -> ALUFunctions.slt,
           InstructionsTypeI.sltiu -> ALUFunctions.sltu,
           InstructionsTypeI.xori  -> ALUFunctions.xor,
           InstructionsTypeI.ori   -> ALUFunctions.or,
           InstructionsTypeI.andi  -> ALUFunctions.and,
+          InstructionsTypeI.slli_and_B_extension -> MuxLookup(
+            io.funct7,
+            ALUFunctions.zero,
+            IndexedSeq(
+              InstructionsTypeI_funct3is1_funct7.slli  -> ALUFunctions.sll,
+              InstructionsTypeI_funct3is1_funct7.bseti -> ALUFunctions.bseti,
+              InstructionsTypeI_funct3is1_funct7.bclri -> ALUFunctions.bclri,
+              InstructionsTypeI_funct3is1_funct7.binvi -> ALUFunctions.binvi,
+              InstructionsTypeI_funct3is1_funct7.other -> MuxLookup(
+                io.rs2_or_shamt,
+                ALUFunctions.zero,
+                IndexedSeq(
+                  InstructionsTypeI_funct3is1_funct7is48_shamt.clz   -> ALUFunctions.clz,
+                  InstructionsTypeI_funct3is1_funct7is48_shamt.ctz   -> ALUFunctions.ctz,
+                  InstructionsTypeI_funct3is1_funct7is48_shamt.cpop  -> ALUFunctions.cpop,
+                  InstructionsTypeI_funct3is1_funct7is48_shamt.sextb -> ALUFunctions.sextb,
+                  InstructionsTypeI_funct3is1_funct7is48_shamt.sexth -> ALUFunctions.sexth
+                )
+              )
+            )
+          ),
+          InstructionsTypeI.srli_srai_and_B_extension -> MuxLookup(
+            io.funct7,
+            ALUFunctions.zero,
+            IndexedSeq(
+              InstructionsTypeI_funct3is5_funct7.srli  -> ALUFunctions.srl,
+              InstructionsTypeI_funct3is5_funct7.srai  -> ALUFunctions.sra,
+              InstructionsTypeI_funct3is5_funct7.bexti -> ALUFunctions.bexti,
+              InstructionsTypeI_funct3is5_funct7.rori  -> ALUFunctions.rori,
+              InstructionsTypeI_funct3is5_funct7.orcb  -> ALUFunctions.orcb,
+              InstructionsTypeI_funct3is5_funct7.rev8  -> ALUFunctions.rev8
+            )
+          )
         )
       )
     }
     is(InstructionTypes.RM) {
       io.alu_funct := MuxLookup(
         io.funct3,
         ALUFunctions.zero,
         IndexedSeq(
           InstructionsTypeR.add_sub -> Mux(io.funct7(5), ALUFunctions.sub, ALUFunctions.add),
+          InstructionsTypeR.sll_and_B_extension -> MuxLookup(
+            io.funct7,
+            ALUFunctions.zero,
+            IndexedSeq(
+              InstructionsTypeR_funct3is1_funct7.sll   -> ALUFunctions.sll,
+              InstructionsTypeR_funct3is1_funct7.clmul -> ALUFunctions.clmul,
+              InstructionsTypeR_funct3is1_funct7.bset  -> ALUFunctions.bset,
+              InstructionsTypeR_funct3is1_funct7.bclr  -> ALUFunctions.bclr,
+              InstructionsTypeR_funct3is1_funct7.rol   -> ALUFunctions.rol,
+              InstructionsTypeR_funct3is1_funct7.binv  -> ALUFunctions.binv
+            )
+          ),
+          InstructionsTypeR.slt_and_B_extension -> MuxLookup(
+            io.funct7,
+            ALUFunctions.zero,
+            IndexedSeq(
+              InstructionsTypeR_funct3is2_funct7.slt    -> ALUFunctions.slt,
+              InstructionsTypeR_funct3is2_funct7.clmulr -> ALUFunctions.clmulr,
+              InstructionsTypeR_funct3is2_funct7.sh1add -> ALUFunctions.sh1add
+            )
+          ),
+          InstructionsTypeR.sltu_and_B_extension -> MuxLookup(
+            io.funct7,
+            ALUFunctions.zero,
+            IndexedSeq(
+              InstructionsTypeR_funct3is3_funct7.sltu   -> ALUFunctions.sltu,
+              InstructionsTypeR_funct3is3_funct7.clmulh -> ALUFunctions.clmulh
+            )
+          ),
+          InstructionsTypeR.xor_and_B_extension -> MuxLookup(
+            io.funct7,
+            ALUFunctions.zero,
+            IndexedSeq(
+              InstructionsTypeR_funct3is4_funct7.xor    -> ALUFunctions.xor,
+              InstructionsTypeR_funct3is4_funct7.zexth  -> ALUFunctions.zexth,
+              InstructionsTypeR_funct3is4_funct7.min    -> ALUFunctions.min,
+              InstructionsTypeR_funct3is4_funct7.sh2add -> ALUFunctions.sh2add,
+              InstructionsTypeR_funct3is4_funct7.xnor   -> ALUFunctions.xnor
+            )
+          ),
+          InstructionsTypeR.srl_sra_and_B_extension -> MuxLookup(
+            io.funct7,
+            ALUFunctions.zero,
+            IndexedSeq(
+              InstructionsTypeR_funct3is5_funct7.srl  -> ALUFunctions.srl,
+              InstructionsTypeR_funct3is5_funct7.sra  -> ALUFunctions.sra,
+              InstructionsTypeR_funct3is5_funct7.minu -> ALUFunctions.minu,
+              InstructionsTypeR_funct3is5_funct7.bext -> ALUFunctions.bext,
+              InstructionsTypeR_funct3is5_funct7.ror  -> ALUFunctions.ror
+            )
+          ),
+          InstructionsTypeR.or_and_B_extension -> MuxLookup(
+            io.funct7,
+            ALUFunctions.zero,
+            IndexedSeq(
+              InstructionsTypeR_funct3is6_funct7.or     -> ALUFunctions.or,
+              InstructionsTypeR_funct3is6_funct7.max    -> ALUFunctions.max,
+              InstructionsTypeR_funct3is6_funct7.sh3add -> ALUFunctions.sh3add,
+              InstructionsTypeR_funct3is6_funct7.orn    -> ALUFunctions.orn
+            )
+          ),
+          InstructionsTypeR.and_and_B_extension -> MuxLookup(
+            io.funct7,
+            ALUFunctions.zero,
+            IndexedSeq(
+              InstructionsTypeR_funct3is7_funct7.and  -> ALUFunctions.and,
+              InstructionsTypeR_funct3is7_funct7.maxu -> ALUFunctions.maxu,
+              InstructionsTypeR_funct3is7_funct7.andn -> ALUFunctions.andn
+            )
+          )
         )
       )
     }
     is(InstructionTypes.B) {
       io.alu_funct := ALUFunctions.add
     }
     ...
   }
 }

/src/main/scala/riscv/core/ALU.scala

...
 object ALUFunctions extends ChiselEnum {
+  val zero, add, sub, sll, slt, xor, or, and, srl, sra, sltu,
+      clz, ctz, cpop, sextb, sexth, bseti, bclri, binvi, bexti, rori, orcb, rev8,
+      clmul, bset, bclr, rol, binv, clmulr, sh1add, clmulh, zexth, min, sh2add, xnor,
+      minu, bext, ror, max, sh3add, orn, maxu, andn = Value
 }
...
   switch(io.func) {
     ...
+    is(ALUFunctions.clz) {
+      io.result := PriorityEncoder(Reverse(io.op1))
+    }
+    is(ALUFunctions.ctz) {
+      io.result := PriorityEncoder(io.op1)
+    }
+    is(ALUFunctions.cpop) {
+      io.result := PopCount(io.op1)
+    }
+    ...
...
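
For checking expected values, the three counting operations can be modelled in plain Scala. The sketch below is a software reference only (the object name CountBits is hypothetical) and is not the hardware implementation. It follows the Zbb definition in which clz and ctz return 32 when the operand is zero. A priority encoder over a 32-bit input produces an index in the range 0 to 31, so the zero-operand case may need separate handling in hardware; this is an observation, not something the tests in this note exercise.

```scala
// Hypothetical software reference model (not from the project).
object CountBits {
  // Count leading zeros of a 32-bit word; yields 32 when x == 0.
  def clz(x: Int): Int = {
    var n = 0
    var bit = 31
    while (bit >= 0 && ((x >>> bit) & 1) == 0) {
      n += 1
      bit -= 1
    }
    n
  }

  // Count trailing zeros of a 32-bit word; yields 32 when x == 0.
  def ctz(x: Int): Int = {
    var n = 0
    while (n < 32 && ((x >>> n) & 1) == 0) n += 1
    n
  }

  // Count the set bits of a 32-bit word.
  def cpop(x: Int): Int = (0 until 32).count(i => ((x >>> i) & 1) == 1)

  def main(args: Array[String]): Unit = {
    println(s"clz(10)    = ${clz(10)}")
    println(s"ctz(8)     = ${ctz(8)}")
    println(s"cpop(0xff) = ${cpop(0xff)}")
  }
}
```

With this model, clz(10) evaluates to 28 because 10 is 0b1010 and occupies the lowest four bits of the word.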

Test

Add new test cases

We first verify the CPU with a new test case, clz.c:

/csrc/clz.c

int clz(int x)
{
    return __builtin_clz(x);
}

int main()
{
    /* Store the result at address 4 so the testbench can read it back. */
    *(int *) (4) = clz(10);
}

First, we need to modify /csrc/Makefile so that the compiler and assembler accept B-extension instructions.

/csrc/Makefile

- ASFLAGS = -march=rv32i_zicsr -mabi=ilp32
- CFLAGS = -O0 -Wall -march=rv32i_zicsr -mabi=ilp32
+ ASFLAGS = -march=rv32i_zicsr_zba_zbb_zbc_zbs -mabi=ilp32
+ CFLAGS = -O0 -Wall -march=rv32i_zicsr_zba_zbb_zbc_zbs -mabi=ilp32

  BINS = \
      fibonacci.asmbin \
      hello.asmbin \
      mmio.asmbin \
      quicksort.asmbin \
      sb.asmbin \
      hazard.asmbin \
+     clz.asmbin

Then we compile it to clz.asmbin and move the result to /src/main/resources/. This requires riscv-none-elf-gcc, which we install through xPack.

Install xPack

$ sudo apt update
$ sudo apt install nodejs
$ sudo apt install npm
$ sudo npm install --location=global xpm@latest
$ xpm install @xpack-dev-tools/riscv-none-elf-gcc@latest --global --verbose

Finally, modify FiveStageCPUFinalTest.scala so that the testbench runs the newly added test case.

/src/test/scala/riscv/FiveStageCPUFinalTest.scala

+  it should "test clz" in {
+    test(new TestTopModule("clz.asmbin", ImplementationType.FiveStageFinal))
+      .withAnnotations(TestAnnotations.annos) { c =>
+        for (i <- 1 to 50) {
+          c.clock.step(1000)
+          c.io.mem_debug_read_address.poke((i * 4).U) // Avoid timeout
+        }
+        c.io.mem_debug_read_address.poke(4.U)
+        c.clock.step()
+        c.io.mem_debug_read_data.expect(28.U)
+      }
+  }

Now we can run the tests and check the results.

Verify results

To verify that our CPU really decodes clz as a B-extension instruction, we intentionally changed the clz mapping in ALUControl.scala to an incorrect one. As expected, the clz test fails.

/src/main/scala/riscv/core/ALUControl.scala

InstructionsTypeI_funct3is1_funct7is48_shamt.clz -> ALUFunctions.zero
$ make test
sbt test
[info] FiveStageCPUFinalTest:
[info] Five-stage Pipelined CPU with Reduced Branch Delay
[info] - should calculate recursively fibonacci(10)
[info] - should quicksort 10 numbers
[info] - should store and load single byte
[info] - should solve data and control hazards
[info] - should test clz *** FAILED ***
[info]   io_mem_debug_read_data=0 (0x0) did not equal expected=28 (0x1c) (lines in FiveStageCPUFinalTest.scala: 73, 65) (FiveStageCPUFinalTest.scala:73)
[info] Run completed in 54 seconds, 781 milliseconds.
[info] Total number of tests run: 5
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 4, failed 1, canceled 0, ignored 0, pending 0
[info] *** 1 TEST FAILED ***
[error] Failed tests:
[error]         riscv.FiveStageCPUFinalTest
[error] (Test / test) sbt.TestsFailedException: Tests unsuccessful
[error] Total time: 63 s (01:03), completed Jan 20, 2025, 7:32:49 PM

After changing the code back to the correct version, all tests pass.

/src/main/scala/riscv/core/ALUControl.scala

InstructionsTypeI_funct3is1_funct7is48_shamt.clz -> ALUFunctions.clz
$ make test
sbt test
[info] FiveStageCPUFinalTest:
[info] Five-stage Pipelined CPU with Reduced Branch Delay
[info] - should calculate recursively fibonacci(10)
[info] - should quicksort 10 numbers
[info] - should store and load single byte
[info] - should solve data and control hazards
[info] - should test clz
[info] Run completed in 55 seconds, 207 milliseconds.
[info] Total number of tests run: 5
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 5, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[success] Total time: 64 s (01:04), completed Jan 20, 2025, 7:34:07 PM

Add the following additional test cases:

/csrc/bseti.S

.globl _start
_start:
    li t0, 0x2
    bseti t0, t0, 0
    li t1, 0x0
    bseti t1, t1, 8
    li t2, 0x3
    bseti t2, t2, 2

/csrc/orcb.S

.globl _start
_start:
    li t0, 0x0000
    orc.b t0, t0
    li t1, 0x1000
    orc.b t1, t1
    li t2, 0x1010
    orc.b t2, t2
    li t3, 0x40000
    orc.b t3, t3
    li t4, 0x20000000
    orc.b t4, t4

/csrc/rev8.S

    li t0, 0x00001234
    rev8 t0, t0
    li t1, 0x10100101
    rev8 t1, t1
    li t2, 0x00001111
    rev8 t2, t2

Modify FiveStageCPUFinalTest.scala for the new test cases

/src/test/scala/riscv/FiveStageCPUFinalTest.scala

  it should "test bseti" in {
    test(new TestTopModule("bseti.asmbin", ImplementationType.FiveStageFinal)).withAnnotations(TestAnnotations.annos) {
      c =>
        c.clock.step(1000)
        c.io.regs_debug_read_address.poke(5.U)
        c.io.regs_debug_read_data.expect(3.U)
        c.io.regs_debug_read_address.poke(6.U)
        c.io.regs_debug_read_data.expect(256.U)
        c.io.regs_debug_read_address.poke(7.U)
        c.io.regs_debug_read_data.expect(7.U)
    }
  }
  it should "test orcb" in {
    test(new TestTopModule("orcb.asmbin", ImplementationType.FiveStageFinal)).withAnnotations(TestAnnotations.annos) {
      c =>
        c.clock.step(1000)
        c.io.regs_debug_read_address.poke(5.U)
        c.io.regs_debug_read_data.expect(0.U)
        c.io.regs_debug_read_address.poke(6.U)
        c.io.regs_debug_read_data.expect(65280.U)
        c.io.regs_debug_read_address.poke(7.U)
        c.io.regs_debug_read_data.expect(65535.U)
        c.io.regs_debug_read_address.poke(28.U)
        c.io.regs_debug_read_data.expect(16711680.U)
        c.io.regs_debug_read_address.poke(29.U)
        c.io.regs_debug_read_data.expect("b11111111000000000000000000000000".U)
    }
  }
  it should "test rev8" in {
    test(new TestTopModule("rev8.asmbin", ImplementationType.FiveStageFinal)).withAnnotations(TestAnnotations.annos) {
      c =>
        c.clock.step(1000)
        c.io.regs_debug_read_address.poke(5.U)
        c.io.regs_debug_read_data.expect(0x34120000.U)
        c.io.regs_debug_read_address.poke(6.U)
        c.io.regs_debug_read_data.expect(0x01011010.U)
        c.io.regs_debug_read_address.poke(7.U)
        c.io.regs_debug_read_data.expect(0x11110000.U)
    }
  }
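
The expected values in these tests can be reproduced with a small plain-Scala reference model of the three instructions. This is a minimal sketch for deriving test expectations and is not taken from the project sources; the object name BitOpsModel is hypothetical.

```scala
// Hypothetical software reference model (not from the project).
object BitOpsModel {
  // bseti: set the bit selected by the 5-bit immediate.
  def bseti(x: Int, shamt: Int): Int = x | (1 << (shamt & 31))

  // orc.b: each byte becomes 0xff if any of its bits is set, otherwise 0x00.
  def orcb(x: Int): Int =
    (0 until 4).foldLeft(0) { (acc, i) =>
      val byte = (x >>> (8 * i)) & 0xff
      if (byte != 0) acc | (0xff << (8 * i)) else acc
    }

  // rev8: reverse the order of the four bytes in the word.
  def rev8(x: Int): Int = Integer.reverseBytes(x)

  def main(args: Array[String]): Unit = {
    println(f"bseti(0x0, 8)    = 0x${bseti(0x0, 8)}%08x")
    println(f"orcb(0x1010)     = 0x${orcb(0x1010)}%08x")
    println(f"rev8(0x00001234) = 0x${rev8(0x00001234)}%08x")
  }
}
```

For example, orcb(0x1010) has non-zero bytes 0 and 1, which gives 0x0000ffff, the 65535 expected for register x7 in the orcb test.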

$ make test
sbt test
[info] welcome to sbt 1.9.6 (Ubuntu Java 11.0.25)
[info] loading settings for project rv32-imb-build from plugins.sbt ...
[info] loading project definition from /mnt/c/Users/User/Desktop/GitHub/RV32-IMB/project
[info] loading settings for project root from build.sbt ...
[info] set current project to yatcpu (in build file:/mnt/c/Users/User/Desktop/GitHub/RV32-IMB/)
[info] compiling 2 Scala sources to /mnt/c/Users/User/Desktop/GitHub/RV32-IMB/target/scala-2.13/classes ...
[info] FiveStageCPUFinalTest:
[info] Five-stage Pipelined CPU with Reduced Branch Delay
[info] - should calculate recursively fibonacci(10)
[info] - should quicksort 10 numbers
[info] - should store and load single byte
[info] - should solve data and control hazards
[info] - should test clz
[info] - should test bseti
[info] - should test orcb
[info] - should test rev8
[info] Run completed in 1 minute, 18 seconds.
[info] Total number of tests run: 8
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 8, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[success] Total time: 90 s (01:30), completed Jan 21, 2025, 12:15:05 AM

We further added test cases for the zba, zbb, zbkb, zbc, and zbs instructions, which are available on GitHub. All of them pass.

$ make test
sbt test
[info] welcome to sbt 1.9.6 (Ubuntu Java 11.0.25)
[info] loading settings for project rv32-imb-build from plugins.sbt ...
[info] loading project definition from /mnt/c/Users/User/Desktop/GitHub/RV32-IMB/project
[info] loading settings for project root from build.sbt ...
[info] set current project to yatcpu (in build file:/mnt/c/Users/User/Desktop/GitHub/RV32-IMB/)
[info] compiling 1 Scala source to /mnt/c/Users/User/Desktop/GitHub/RV32-IMB/target/scala-2.13/test-classes ...
make[1]: Warning: File 'VTestTopModule__ver.d' has modification time 0.17 s in the future
make[1]: warning:  Clock skew detected.  Your build may be incomplete.
make[1]: Warning: File 'VTestTopModule__ver.d' has modification time 0.28 s in the future
make[1]: warning:  Clock skew detected.  Your build may be incomplete.
make[1]: Warning: File 'VTestTopModule__ver.d' has modification time 0.1 s in the future
make[1]: warning:  Clock skew detected.  Your build may be incomplete.
make[1]: Warning: File 'VTestTopModule__ver.d' has modification time 0.21 s in the future
make[1]: warning:  Clock skew detected.  Your build may be incomplete.
[info] FiveStageCPUFinalTest:
[info] Five-stage Pipelined CPU with Reduced Branch Delay
[info] - should calculate recursively fibonacci(10)
[info] - should quicksort 10 numbers
[info] - should store and load single byte
[info] - should solve data and control hazards
[info] - should test zba instructions
[info] - should test zbb instructions
[info] - should test zbkb instructions
[info] - should test zbc instructions
[info] - should test zbs instructions
[info] Run completed in 1 minute, 9 seconds.
[info] Total number of tests run: 9
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 9, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[success] Total time: 78 s (01:18), completed Jan 22, 2025, 6:24:04 PM

Problems encountered in implementation

We needed to add a new input to ALUControl, driven from Execute.scala, because some B-extension instructions can only be decoded by also looking at the shamt/rs2 field.

/src/main/scala/riscv/core/fivestage_final/Execute.scala

 val opcode = io.instruction(6, 0)
 val funct3 = io.instruction(14, 12)
 val funct7 = io.instruction(31, 25)
+val rs2_or_shamt = io.instruction(24, 20)
 val uimm = io.instruction(19, 15)

 val alu = Module(new ALU)
 val alu_ctrl = Module(new ALUControl)

 alu_ctrl.io.opcode := opcode
 alu_ctrl.io.funct3 := funct3
 alu_ctrl.io.funct7 := funct7
+alu_ctrl.io.rs2_or_shamt := rs2_or_shamt
 alu.io.func := alu_ctrl.io.alu_funct

/src/main/scala/riscv/core/ALUControl.scala

 val io = IO(new Bundle {
   val opcode = Input(UInt(7.W))
   val funct3 = Input(UInt(3.W))
   val funct7 = Input(UInt(7.W))
+  val rs2_or_shamt = Input(UInt(5.W))
   val alu_funct = Output(ALUFunctions())
 })

After adding the code above and running the tests, we got the following error:

$ make test

[info]   firrtl.passes.CheckInitialization$RefNotInitializedException: @[src/main/scala/riscv/core/fivestage_final/Execute.scala 43:24] : [module Execute]  Reference alu_ctrl is not fully initialized.
[info]    : alu_ctrl.io.rs2_or_shamt <= VOID

The message reports that alu_ctrl.io.rs2_or_shamt is not initialized. At this point rs2_or_shamt was not yet used inside ALUControl. In our case, adding the following line resolved the issue.

/src/main/scala/riscv/core/ALUControl.scala

+val temp = io.rs2_or_shamt
