Pipelined RISC-V core with RV32IMZbaZbbZbcZbs
黃丞漢, 江冠德
GitHub
Goal of the Task
Our goal is to extend the 5-stage pipelined RISC-V CPU to support the RV32IM instruction set along with the bit-manipulation extensions (Zba, Zbb, Zbc, and Zbs).
Prerequisites
Environment
Ubuntu 24.04.1 LTS on WSL2
Install JDK
Install sbt
Check if sbt works
Install verilator
Install GTKWave
RISC-V core
How to execute
Run $ make test in the riscv-core folder:
RISC-V B-extension
Group the B-extension instructions by opcode
opcode : 0010011
| funct7 | shamt | rs1 | funct3 | rd | opcode | Instruction(s) |
|---|---|---|---|---|---|---|
| 0010100 | 5 bits | 5 bits | 001 | 5 bits | 0010011 | bseti |
| 0100100 | 5 bits | 5 bits | 001 | 5 bits | 0010011 | bclri |
| 0110100 | 5 bits | 5 bits | 001 | 5 bits | 0010011 | binvi |
| 0110000 | 00000 | 5 bits | 001 | 5 bits | 0010011 | clz |
| 0110000 | 00001 | 5 bits | 001 | 5 bits | 0010011 | ctz |
| 0110000 | 00010 | 5 bits | 001 | 5 bits | 0010011 | cpop |
| 0110000 | 00100 | 5 bits | 001 | 5 bits | 0010011 | sext.b |
| 0110000 | 00101 | 5 bits | 001 | 5 bits | 0010011 | sext.h |
| 0100100 | 5 bits | 5 bits | 101 | 5 bits | 0010011 | bexti |
| 0110000 | 5 bits | 5 bits | 101 | 5 bits | 0010011 | rori |
| 0010100 | 00111 | 5 bits | 101 | 5 bits | 0010011 | orc.b |
| 0110100 | 11000 | 5 bits | 101 | 5 bits | 0010011 | rev8 |
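To make the bit positions in this table concrete, here is a minimal C sketch that packs the table's fields into an instruction word and extracts them again. The helper names (`encode_op_imm`, `funct7_of`, and so on) are illustrative assumptions and are not part of the project; register numbers follow the standard ABI (a0 = x10, a1 = x11).

```c
#include <stdint.h>

/* Field extractors for a 32-bit RISC-V instruction word. */
static inline uint32_t opcode_of(uint32_t insn) { return insn & 0x7Fu; }
static inline uint32_t funct3_of(uint32_t insn) { return (insn >> 12) & 0x7u; }
static inline uint32_t funct7_of(uint32_t insn) { return (insn >> 25) & 0x7Fu; }
/* Bits [24:20]: shamt for shifts, or a sub-opcode for clz/ctz/cpop/sext.b/sext.h. */
static inline uint32_t shamt_of(uint32_t insn)  { return (insn >> 20) & 0x1Fu; }

/* Hypothetical helper: assemble an OP-IMM (opcode 0010011) word
 * from the columns of the table, left to right. */
static inline uint32_t encode_op_imm(uint32_t funct7, uint32_t shamt,
                                     uint32_t rs1, uint32_t funct3,
                                     uint32_t rd) {
    return (funct7 << 25) | (shamt << 20) | (rs1 << 15) |
           (funct3 << 12) | (rd << 7) | 0x13u;
}
```

For example, `encode_op_imm(0x30, 0, 11, 1, 10)` builds `clz a0, a1`, which is the word `0x60059513`.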
opcode : 0110011
| funct7 | rs2 | rs1 | funct3 | rd | opcode | Instruction |
|---|---|---|---|---|---|---|
| 0000101 | 5 bits | 5 bits | 001 | 5 bits | 0110011 | clmul |
| 0010100 | 5 bits | 5 bits | 001 | 5 bits | 0110011 | bset |
| 0100100 | 5 bits | 5 bits | 001 | 5 bits | 0110011 | bclr |
| 0110000 | 5 bits | 5 bits | 001 | 5 bits | 0110011 | rol |
| 0110100 | 5 bits | 5 bits | 001 | 5 bits | 0110011 | binv |
| 0000101 | 5 bits | 5 bits | 010 | 5 bits | 0110011 | clmulr |
| 0010000 | 5 bits | 5 bits | 010 | 5 bits | 0110011 | sh1add |
| 0000101 | 5 bits | 5 bits | 011 | 5 bits | 0110011 | clmulh |
| 0000100 | 00000 | 5 bits | 100 | 5 bits | 0110011 | zext.h |
| 0000101 | 5 bits | 5 bits | 100 | 5 bits | 0110011 | min |
| 0010000 | 5 bits | 5 bits | 100 | 5 bits | 0110011 | sh2add |
| 0100000 | 5 bits | 5 bits | 100 | 5 bits | 0110011 | xnor |
| 0000101 | 5 bits | 5 bits | 101 | 5 bits | 0110011 | minu |
| 0100100 | 5 bits | 5 bits | 101 | 5 bits | 0110011 | bext |
| 0110000 | 5 bits | 5 bits | 101 | 5 bits | 0110011 | ror |
| 0000101 | 5 bits | 5 bits | 110 | 5 bits | 0110011 | max |
| 0010000 | 5 bits | 5 bits | 110 | 5 bits | 0110011 | sh3add |
| 0100000 | 5 bits | 5 bits | 110 | 5 bits | 0110011 | orn |
| 0000101 | 5 bits | 5 bits | 111 | 5 bits | 0110011 | maxu |
| 0100000 | 5 bits | 5 bits | 111 | 5 bits | 0110011 | andn |
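The R-type rows can also be read as a lookup keyed on (funct7, funct3). The following C sketch transcribes the table into an array and searches it; it is only a software illustration, and the names `r_entry`, `decode_op_b`, and `decodes_to` are hypothetical. Base RV32I/M instructions are deliberately left out, so they decode to `NULL` here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct r_entry { uint8_t funct7; uint8_t funct3; const char *name; };

/* One entry per row of the opcode 0110011 table (funct7 written in hex). */
static const struct r_entry R_TABLE[] = {
    {0x05, 1, "clmul"},  {0x14, 1, "bset"},   {0x24, 1, "bclr"},
    {0x30, 1, "rol"},    {0x34, 1, "binv"},
    {0x05, 2, "clmulr"}, {0x10, 2, "sh1add"},
    {0x05, 3, "clmulh"},
    {0x04, 4, "zext.h"}, {0x05, 4, "min"},    {0x10, 4, "sh2add"},
    {0x20, 4, "xnor"},
    {0x05, 5, "minu"},   {0x24, 5, "bext"},   {0x30, 5, "ror"},
    {0x05, 6, "max"},    {0x10, 6, "sh3add"}, {0x20, 6, "orn"},
    {0x05, 7, "maxu"},   {0x20, 7, "andn"},
};

/* Returns the B-extension mnemonic for an R-type word, or NULL if the word
 * is not one of the instructions listed in the table. */
const char *decode_op_b(uint32_t insn) {
    uint32_t funct7 = (insn >> 25) & 0x7Fu;
    uint32_t funct3 = (insn >> 12) & 0x7u;
    uint32_t rs2    = (insn >> 20) & 0x1Fu;

    if ((insn & 0x7Fu) != 0x33u) return NULL;
    for (size_t i = 0; i < sizeof R_TABLE / sizeof R_TABLE[0]; i++) {
        if (R_TABLE[i].funct7 == funct7 && R_TABLE[i].funct3 == funct3) {
            /* zext.h is the only row that also fixes rs2 (to 00000). */
            if (funct7 == 0x04u && rs2 != 0u) return NULL;
            return R_TABLE[i].name;
        }
    }
    return NULL;
}

/* Convenience check: does the word decode to the given mnemonic? */
int decodes_to(uint32_t insn, const char *name) {
    const char *got = decode_op_b(insn);
    return got != NULL && strcmp(got, name) == 0;
}
```

For instance, `andn a0, a1, a2` is the word `0x40C5F533`, and `decodes_to(0x40C5F533, "andn")` is true.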
RV32-IMB
How to decode
In InstructionDecode.scala, the opcode is first used to determine the type of the instruction, and funct3 is then used to identify RV32I instructions.
However, the table above shows that the B-extension instructions require additional checks on funct7, and some instructions even need to examine bits [24:20]. Therefore, we need to add new objects in InstructionDecode.scala to decode these more complex instructions.
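The decode order described here (opcode, then funct3, then funct7, then bits [24:20] where needed) can be modelled in software as nested switches. The C sketch below covers only the OP-IMM cases with funct3 = 001 and 101; the `alu_fn` enum and `decode_op_imm` are hypothetical names chosen for illustration. Unlike a funct7-only check, it also verifies bits [24:20] for orc.b and rev8, following the encoding table.

```c
#include <stdint.h>

typedef enum {
    ALU_ZERO, ALU_SLL, ALU_BSETI, ALU_BCLRI, ALU_BINVI,
    ALU_CLZ, ALU_CTZ, ALU_CPOP, ALU_SEXTB, ALU_SEXTH,
    ALU_SRL, ALU_SRA, ALU_BEXTI, ALU_RORI, ALU_ORCB, ALU_REV8
} alu_fn;

/* Three-level decode of an OP-IMM word: funct3 -> funct7 -> bits [24:20].
 * Anything not recognised falls back to ALU_ZERO. */
alu_fn decode_op_imm(uint32_t insn) {
    uint32_t funct3 = (insn >> 12) & 0x7u;
    uint32_t funct7 = (insn >> 25) & 0x7Fu;
    uint32_t shamt  = (insn >> 20) & 0x1Fu;  /* bits [24:20] */

    if ((insn & 0x7Fu) != 0x13u) return ALU_ZERO;

    switch (funct3) {
    case 1:
        switch (funct7) {
        case 0x00: return ALU_SLL;
        case 0x14: return ALU_BSETI;
        case 0x24: return ALU_BCLRI;
        case 0x34: return ALU_BINVI;
        case 0x30:                       /* shared funct7: look at [24:20] */
            switch (shamt) {
            case 0:  return ALU_CLZ;
            case 1:  return ALU_CTZ;
            case 2:  return ALU_CPOP;
            case 4:  return ALU_SEXTB;
            case 5:  return ALU_SEXTH;
            default: return ALU_ZERO;
            }
        default: return ALU_ZERO;
        }
    case 5:
        switch (funct7) {
        case 0x00: return ALU_SRL;
        case 0x20: return ALU_SRA;
        case 0x24: return ALU_BEXTI;
        case 0x30: return ALU_RORI;
        case 0x14: return shamt == 7u  ? ALU_ORCB : ALU_ZERO;
        case 0x34: return shamt == 24u ? ALU_REV8 : ALU_ZERO;
        default:   return ALU_ZERO;
        }
    default:
        return ALU_ZERO;  /* addi, slti, ... need only funct3; omitted here */
    }
}
```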
/src/main/scala/riscv/core/fivestage_final/InstructionDecode.scala
To decode instructions that use bits [24:20], we added the variable rs2_or_shamt in Execute.scala for further decoding.
/src/main/scala/riscv/core/fivestage_final/Execute.scala
With the newly added objects and rs2_or_shamt, we modified the original program so that it decodes all RV32-IMB machine codes correctly.
/src/main/scala/riscv/core/ALUControl.scala
...
class ALUControl extends Module {
  val io = IO(new Bundle {
    val opcode = Input(UInt(7.W))
    val funct3 = Input(UInt(3.W))
    val funct7 = Input(UInt(7.W))
+   val rs2_or_shamt = Input(UInt(5.W))
    val alu_funct = Output(ALUFunctions())
  })
  io.alu_funct := ALUFunctions.zero
  switch(io.opcode) {
    is(InstructionTypes.I) {
      io.alu_funct := MuxLookup(
        io.funct3,
        ALUFunctions.zero,
        IndexedSeq(
          InstructionsTypeI.addi -> ALUFunctions.add,
          InstructionsTypeI.slti -> ALUFunctions.slt,
          InstructionsTypeI.sltiu -> ALUFunctions.sltu,
          InstructionsTypeI.xori -> ALUFunctions.xor,
          InstructionsTypeI.ori -> ALUFunctions.or,
          InstructionsTypeI.andi -> ALUFunctions.and,
+         InstructionsTypeI.slli_and_B_extension -> MuxLookup(
+           io.funct7,
+           ALUFunctions.zero,
+           IndexedSeq(
+             InstructionsTypeI_funct3is1_funct7.slli -> ALUFunctions.sll,
+             InstructionsTypeI_funct3is1_funct7.bseti -> ALUFunctions.bseti,
+             InstructionsTypeI_funct3is1_funct7.bclri -> ALUFunctions.bclri,
+             InstructionsTypeI_funct3is1_funct7.binvi -> ALUFunctions.binvi,
+             InstructionsTypeI_funct3is1_funct7.other -> MuxLookup(
+               io.rs2_or_shamt,
+               ALUFunctions.zero,
+               IndexedSeq(
+                 InstructionsTypeI_funct3is1_funct7is48_shamt.clz -> ALUFunctions.clz,
+                 InstructionsTypeI_funct3is1_funct7is48_shamt.ctz -> ALUFunctions.ctz,
+                 InstructionsTypeI_funct3is1_funct7is48_shamt.cpop -> ALUFunctions.cpop,
+                 InstructionsTypeI_funct3is1_funct7is48_shamt.sextb -> ALUFunctions.sextb,
+                 InstructionsTypeI_funct3is1_funct7is48_shamt.sexth -> ALUFunctions.sexth
+               )
+             )
+           )
+         ),
+         InstructionsTypeI.srli_srai_and_B_extension -> MuxLookup(
+           io.funct7,
+           ALUFunctions.zero,
+           IndexedSeq(
+             InstructionsTypeI_funct3is5_funct7.srli -> ALUFunctions.srl,
+             InstructionsTypeI_funct3is5_funct7.srai -> ALUFunctions.sra,
+             InstructionsTypeI_funct3is5_funct7.bexti -> ALUFunctions.bexti,
+             InstructionsTypeI_funct3is5_funct7.rori -> ALUFunctions.rori,
+             InstructionsTypeI_funct3is5_funct7.orcb -> ALUFunctions.orcb,
+             InstructionsTypeI_funct3is5_funct7.rev8 -> ALUFunctions.rev8
+           )
+         )
        )
      )
    }
    is(InstructionTypes.RM) {
      io.alu_funct := MuxLookup(
        io.funct3,
        ALUFunctions.zero,
        IndexedSeq(
          InstructionsTypeR.add_sub -> Mux(io.funct7(5), ALUFunctions.sub, ALUFunctions.add),
+         InstructionsTypeR.sll_and_B_extension -> MuxLookup(
+           io.funct7,
+           ALUFunctions.zero,
+           IndexedSeq(
+             InstructionsTypeR_funct3is1_funct7.sll -> ALUFunctions.sll,
+             InstructionsTypeR_funct3is1_funct7.clmul -> ALUFunctions.clmul,
+             InstructionsTypeR_funct3is1_funct7.bset -> ALUFunctions.bset,
+             InstructionsTypeR_funct3is1_funct7.bclr -> ALUFunctions.bclr,
+             InstructionsTypeR_funct3is1_funct7.rol -> ALUFunctions.rol,
+             InstructionsTypeR_funct3is1_funct7.binv -> ALUFunctions.binv
+           )
+         ),
+         InstructionsTypeR.slt_and_B_extension -> MuxLookup(
+           io.funct7,
+           ALUFunctions.zero,
+           IndexedSeq(
+             InstructionsTypeR_funct3is2_funct7.slt -> ALUFunctions.slt,
+             InstructionsTypeR_funct3is2_funct7.clmulr -> ALUFunctions.clmulr,
+             InstructionsTypeR_funct3is2_funct7.sh1add -> ALUFunctions.sh1add
+           )
+         ),
+         InstructionsTypeR.sltu_and_B_extension -> MuxLookup(
+           io.funct7,
+           ALUFunctions.zero,
+           IndexedSeq(
+             InstructionsTypeR_funct3is3_funct7.sltu -> ALUFunctions.sltu,
+             InstructionsTypeR_funct3is3_funct7.clmulh -> ALUFunctions.clmulh
+           )
+         ),
+         InstructionsTypeR.xor_and_B_extension -> MuxLookup(
+           io.funct7,
+           ALUFunctions.zero,
+           IndexedSeq(
+             InstructionsTypeR_funct3is4_funct7.xor -> ALUFunctions.xor,
+             InstructionsTypeR_funct3is4_funct7.zexth -> ALUFunctions.zexth,
+             InstructionsTypeR_funct3is4_funct7.min -> ALUFunctions.min,
+             InstructionsTypeR_funct3is4_funct7.sh2add -> ALUFunctions.sh2add,
+             InstructionsTypeR_funct3is4_funct7.xnor -> ALUFunctions.xnor
+           )
+         ),
+         InstructionsTypeR.srl_sra_and_B_extension -> MuxLookup(
+           io.funct7,
+           ALUFunctions.zero,
+           IndexedSeq(
+             InstructionsTypeR_funct3is5_funct7.srl -> ALUFunctions.srl,
+             InstructionsTypeR_funct3is5_funct7.sra -> ALUFunctions.sra,
+             InstructionsTypeR_funct3is5_funct7.minu -> ALUFunctions.minu,
+             InstructionsTypeR_funct3is5_funct7.bext -> ALUFunctions.bext,
+             InstructionsTypeR_funct3is5_funct7.ror -> ALUFunctions.ror
+           )
+         ),
+         InstructionsTypeR.or_and_B_extension -> MuxLookup(
+           io.funct7,
+           ALUFunctions.zero,
+           IndexedSeq(
+             InstructionsTypeR_funct3is6_funct7.or -> ALUFunctions.or,
+             InstructionsTypeR_funct3is6_funct7.max -> ALUFunctions.max,
+             InstructionsTypeR_funct3is6_funct7.sh3add -> ALUFunctions.sh3add,
+             InstructionsTypeR_funct3is6_funct7.orn -> ALUFunctions.orn
+           )
+         ),
+         InstructionsTypeR.and_and_B_extension -> MuxLookup(
+           io.funct7,
+           ALUFunctions.zero,
+           IndexedSeq(
+             InstructionsTypeR_funct3is7_funct7.and -> ALUFunctions.and,
+             InstructionsTypeR_funct3is7_funct7.maxu -> ALUFunctions.maxu,
+             InstructionsTypeR_funct3is7_funct7.andn -> ALUFunctions.andn
+           )
+         )
        )
      )
    }
    is(InstructionTypes.B) {
      io.alu_funct := ALUFunctions.add
    }
    ...
  }
}
/src/main/scala/riscv/core/ALU.scala
...
object ALUFunctions extends ChiselEnum {
+ val zero, add, sub, sll, slt, xor, or, and, srl, sra, sltu, clz, ctz, cpop, sextb, sexth, bseti, bclri, binvi, bexti, rori, orcb, rev8, clmul, bset, bclr, rol, binv, clmulr, sh1add, clmulh, zexth, min, sh2add, xnor, minu, bext, ror, max, sh3add, orn, maxu, andn = Value
}
...
  switch(io.func) {
    ...
+   is(ALUFunctions.clz) {
+     // Zbb defines clz(0) = 32; the 5-bit PriorityEncoder output cannot represent 32.
+     io.result := Mux(io.op1 === 0.U, 32.U, PriorityEncoder(Reverse(io.op1)))
+   }
+   is(ALUFunctions.ctz) {
+     // Zbb defines ctz(0) = 32 as well, so the all-zero input is handled separately.
+     io.result := Mux(io.op1 === 0.U, 32.U, PriorityEncoder(io.op1))
+   }
+   is(ALUFunctions.cpop) {
+     // Number of set bits in op1.
+     io.result := PopCount(io.op1)
+   }
+   ...
...
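When checking the ALU output for these three instructions, a plain software model is a handy cross-reference. The C sketch below is such a model, written for illustration only (the `ref_*` names are assumptions, not project code). It follows the Zbb definition, in which `clz` and `ctz` return 32 when the operand is zero.

```c
#include <stdint.h>

/* Count leading zeros: number of 0 bits above the most significant 1. */
uint32_t ref_clz(uint32_t x) {
    uint32_t n = 0;
    if (x == 0) return 32;               /* Zbb: clz(0) = XLEN */
    while ((x & 0x80000000u) == 0) {     /* shift left until bit 31 is set */
        x <<= 1;
        n++;
    }
    return n;
}

/* Count trailing zeros: number of 0 bits below the least significant 1. */
uint32_t ref_ctz(uint32_t x) {
    uint32_t n = 0;
    if (x == 0) return 32;               /* Zbb: ctz(0) = XLEN */
    while ((x & 1u) == 0) {              /* shift right until bit 0 is set */
        x >>= 1;
        n++;
    }
    return n;
}

/* Population count: number of set bits. */
uint32_t ref_cpop(uint32_t x) {
    uint32_t n = 0;
    while (x != 0) {
        x &= x - 1;                      /* clear the lowest set bit */
        n++;
    }
    return n;
}
```

The zero operand is the case most worth including in a test, since it is the only input for which `clz` and `ctz` need a result wider than five bits.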
Test
Add new test cases
We want to verify the new test case clz.c
/csrc/clz.c
First, we modify /csrc/Makefile so that the compiler supports the B-extension instructions.
/csrc/Makefile
Then we compile it to clz.asmbin and move it to /src/main/resources/. This step requires Xpack, which provides riscv-none-elf-gcc.
Install Xpack
Finally, modify FiveStageCPUFinalTest.scala so that the testbench picks up the newly added test cases.
/src/test/scala/riscv/FiveStageCPUFinalTest.scala
Now we can add new tests and check the test results.
Verify results
To verify that our CPU correctly decodes clz as a B-extension instruction, we intentionally changed the code in ALUControl.scala to an incorrect version. As expected, the test fails.
/src/main/scala/riscv/core/ALUControl.scala
Now, we change the code back to the correct version and the tests pass successfully.
/src/main/scala/riscv/core/ALUControl.scala
Add the following additional test cases:
/csrc/bseti.S
/csrc/orcb.S
/csrc/rev8.S
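Expected values for assembly tests like these can be computed with a small software model. The C sketch below gives reference versions of `bseti`, `orc.b`, and `rev8` for RV32; the function names are hypothetical, and the sketch is not taken from the test files themselves.

```c
#include <stdint.h>

/* bseti: set the bit of rs1 selected by the 5-bit immediate shamt. */
uint32_t ref_bseti(uint32_t rs1, uint32_t shamt) {
    return rs1 | (1u << (shamt & 31u));
}

/* orc.b: each result byte is 0xFF if the matching input byte is
 * non-zero, otherwise 0x00. */
uint32_t ref_orcb(uint32_t x) {
    uint32_t result = 0;
    for (int i = 0; i < 4; i++) {
        if ((x >> (8 * i)) & 0xFFu) {
            result |= 0xFFu << (8 * i);
        }
    }
    return result;
}

/* rev8: reverse the order of the four bytes in a 32-bit word. */
uint32_t ref_rev8(uint32_t x) {
    return (x >> 24) |
           ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) |
           (x << 24);
}
```

For example, `ref_rev8(0x12345678)` yields `0x78563412`, and `ref_orcb(0x00120034)` yields `0x00FF00FF`.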
Modify FiveStageCPUFinalTest.scala for the new test cases:
/src/test/scala/riscv/FiveStageCPUFinalTest.scala
$ make test
We further added test cases for the Zba, Zbb, Zbkb, Zbc, and Zbs instructions (available on GitHub), and all of them pass.
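For the remaining extensions, the carry-less multiply family (Zbc) is the least intuitive to check by hand. The C sketch below is an illustrative reference model, with hypothetical `ref_*` names, that forms the full 64-bit carry-less product once and slices it three ways. It also includes one representative each from Zba and Zbb.

```c
#include <stdint.h>

/* 64-bit carry-less product: XOR together shifted copies of a,
 * one for every set bit of b (no carries between bit positions). */
static uint64_t clmul64(uint32_t a, uint32_t b) {
    uint64_t p = 0;
    for (int i = 0; i < 32; i++) {
        if ((b >> i) & 1u) {
            p ^= (uint64_t)a << i;
        }
    }
    return p;
}

/* clmul: low 32 bits of the product. */
uint32_t ref_clmul(uint32_t a, uint32_t b)  { return (uint32_t)clmul64(a, b); }
/* clmulh: high 32 bits of the product (bits 63..32). */
uint32_t ref_clmulh(uint32_t a, uint32_t b) { return (uint32_t)(clmul64(a, b) >> 32); }
/* clmulr: bits 62..31 of the product. */
uint32_t ref_clmulr(uint32_t a, uint32_t b) { return (uint32_t)(clmul64(a, b) >> 31); }

/* Zba sh1add: rs2 + (rs1 << 1). */
uint32_t ref_sh1add(uint32_t rs1, uint32_t rs2) { return rs2 + (rs1 << 1); }

/* Zbb ror: rotate right by the low 5 bits of n. */
uint32_t ref_ror(uint32_t x, uint32_t n) {
    n &= 31u;
    return n ? (x >> n) | (x << (32u - n)) : x;  /* avoid a shift by 32 */
}

/* Zbb andn: rs1 AND (NOT rs2). */
uint32_t ref_andn(uint32_t rs1, uint32_t rs2) { return rs1 & ~rs2; }
```

As a quick sanity check, `ref_clmul(3, 3)` is 5, because (x + 1)(x + 1) = x^2 + 1 when coefficients are added modulo 2.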
Problems encountered in implementation
We need to add a new input in Execute.scala, since some B-extension instructions depend on the shamt/rs2 field (bits [24:20]) to decode.
/src/main/scala/riscv/core/fivestage_final/Execute.scala
/src/main/scala/riscv/core/ALUControl.scala
After adding the code above and running the tests, the following errors are reported:
$ make test
This is because rs2_or_shamt has not been used yet. Simply adding the following line can resolve the issue.
/src/main/scala/riscv/core/ALUControl.scala
Reference
- Building and Testing Scala Projects with sbt
- RISC-V Bit-manipulation A, B, C and S Extensions