contributed by chihenliu
My OS is Ubuntu 22.04.3 LTS
$ sudo apt install build-essential verilator gtkwave
Follow the instructions to install SDKMAN!:
$ curl -s "https://get.sdkman.io" | bash
$ source "$HOME/.sdkman/bin/sdkman-init.sh"
$ sdk version
Follow the instructions to install sbt:
$ sdk install java $(sdk list java | grep -o "\b8\.[0-9]*\.[0-9]*\-tem" | head -1)
$ sdk install sbt
The installation of sbt is complete.
Following the instructions in Lab3: Construct a single-cycle RISC-V CPU with Chisel, install JDK 11:
$ sdk install java 11.0.21-tem
The installation of JDK is complete.
For a general install of GTKWave from source:
1. ./configure
2. make
3. sudo make install
However, these steps produced errors on my Ubuntu system, so I first installed the dependencies listed in README.md:
$ sudo apt-get install libjudy-dev
$ sudo apt-get install libbz2-dev
$ sudo apt-get install liblzma-dev
$ sudo apt-get install libgconf2-dev
$ sudo apt-get install libgtk2.0-dev
$ sudo apt-get install tcl-dev
$ sudo apt-get install tk-dev
$ sudo apt-get install gperf
$ sudo apt-get install gtk2-engines-pixbuf
After installing the packages above, I installed GTKWave with ./configure, make, and sudo make install.
Follow the chisel-tutorial instructions:
$ git clone https://github.com/ucb-bar/chisel-tutorial
$ cd chisel-tutorial
$ git checkout release
$ sbt run
Output:
test Hello Success: 1 tests passed in 6 cycles taking 0.004980 seconds
[info] [0.002] RAN 1 CYCLES PASSED
[success] Total time: 2 s, completed
You can also run all the examples:
$ ./run-examples.sh all
I will go through all the steps before Dec. 1.
class Hello extends Module {
  val io = IO(new Bundle {
    val led = Output(UInt(1.W))
  })
  val CNT_MAX = (50000000 / 2 - 1).U
  val cntReg = RegInit(0.U(32.W))
  val blkReg = RegInit(0.U(1.W))
  cntReg := cntReg + 1.U
  when(cntReg === CNT_MAX) {
    cntReg := 0.U
    blkReg := ~blkReg
  }
  io.led := blkReg
}
led is an unsigned integer with a bit width of 1.
cntReg is a 32-bit unsigned integer register, initialized to 0.
blkReg is a 1-bit unsigned integer register, initialized to 0 and used to control the state of the LED.
CNT_MAX is a constant with a value of 24999999. This value is typically set based on the system's clock frequency and is used to control the blinking frequency of the LED.
cntReg increases by 1 every clock cycle. When cntReg reaches the value of CNT_MAX, cntReg is reset to 0 and the value of blkReg is inverted.
We can achieve another LED functionality by eliminating blkReg:
// Hello in Chisel, after eliminating blkReg
class Hello extends Module {
  val io = IO(new Bundle {
    val led = Output(UInt(1.W))
  })
  val cntMax = (50000000 / 2 - 1).U
  val cntReg = RegInit(0.U(32.W))
  cntReg := Mux(cntReg === cntMax, 0.U, cntReg + 1.U)
  io.led := cntReg === cntMax
}
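To compare this version with the blkReg version shown earlier, here is a minimal cycle-level sketch in plain Scala (no Chisel required). It is only a software model, not the lab's code: the object and method names are hypothetical, and the counter limit is passed in as a small number instead of 24999999 so the trace stays readable.

```scala
// Software model of the two Hello variants, one list entry per clock cycle.
// BlinkModel, toggleTrace and pulseTrace are hypothetical names for illustration.
object BlinkModel {
  // Variant with blkReg: the LED level toggles each time the counter wraps.
  def toggleTrace(cntMax: Int, cycles: Int): Seq[Int] = {
    var cnt = 0
    var blk = 0
    (0 until cycles).map { _ =>
      val led = blk // io.led := blkReg (value visible during this cycle)
      if (cnt == cntMax) { cnt = 0; blk = 1 - blk } else cnt += 1
      led
    }
  }

  // Variant without blkReg: the LED is high only while cnt == cntMax.
  def pulseTrace(cntMax: Int, cycles: Int): Seq[Int] = {
    var cnt = 0
    (0 until cycles).map { _ =>
      val led = if (cnt == cntMax) 1 else 0 // io.led := cntReg === cntMax
      cnt = if (cnt == cntMax) 0 else cnt + 1
      led
    }
  }
}
```

With cntMax = 2, the first variant holds each LED level for three cycles, while the second produces a one-cycle pulse every three cycles.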
When cntReg reaches cntMax, we directly set the LED to ON (1), while at other times, the LED is turned off (0). The LED therefore lights up only in the cycle in which cntReg reaches cntMax, rather than remaining illuminated until the next counting cycle is completed.
MyCPU code
We need to add code to four Scala files to complete the modules in src/main/scala/riscv/core.
By completing the Instruction Fetch, Instruction Decode, and Execute stages, and then using the aforementioned components, I have completed the CPU section.
Here is my repository for Lab 3, which was forked from ca2023-lab3.
Test command:
$ sbt test
Because the CPU code is initially incomplete, the first run produces the following output:
[info] *** 6 TESTS FAILED ***
[error] Failed tests:
[error] riscv.singlecycle.InstructionDecoderTest
[error] riscv.singlecycle.ByteAccessTest
[error] riscv.singlecycle.InstructionFetchTest
[error] riscv.singlecycle.ExecuteTest
[error] riscv.singlecycle.FibonacciTest
[error] riscv.singlecycle.QuicksortTest
[error] (Test / test) sbt.TestsFailedException: Tests unsuccessful
After completing the missing code for the Instruction Fetch, Instruction Decode, and Execute stages as well as the CPU, I ran the tests again with the command provided in Lab 3.
$ sbt test
This time we get the following output:
[info] ExecuteTest:
[info] Execution of Single Cycle CPU
[info] - should execute correctly
[info] InstructionFetchTest:
[info] InstructionFetch of Single Cycle CPU
[info] - should fetch instruction
[info] QuicksortTest:
[info] Single Cycle CPU
[info] - should perform a quicksort on 10 numbers
[info] ByteAccessTest:
[info] Single Cycle CPU
[info] - should store and load a single byte
[info] FibonacciTest:
[info] Single Cycle CPU
[info] - should recursively calculate Fibonacci(10)
[info] InstructionDecoderTest:
[info] InstructionDecoder of Single Cycle CPU
[info] - should produce correct control signal
[info] RegisterFileTest:
[info] Register File of Single Cycle CPU
[info] - should read the written content
[info] - should x0 always be zero
[info] - should read the writing content
[info] Run completed in 9 seconds, 325 milliseconds.
[info] Total number of tests run: 9
[info] Suites: completed 7, aborted 0
[info] Tests: succeeded 9, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[success] Total time: 10 s, completed Nov 28, 2023, 5:41:06 PM
To run a single test case, you can use the following command:
$ sbt "testOnly riscv.singlecycle.XXXTest"
The PC is initialized to ProgramCounter.EntryAddress. The control signal jump_flag_id determines whether a jump should be executed. If it is true, a jump is executed, and the PC is updated to the memory location provided by jump_address_id. If it is false, the PC is incremented by 4 to execute the next instruction.
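The update rule described above can be sketched as a pure function in plain Scala. This is a software model under stated assumptions, not the lab's InstructionFetch module: PcModel and nextPc are hypothetical names, the entry address 0x1000 is taken from the `entry` value in the lab's test, and the instruction_valid input is ignored.

```scala
// Software model of the next-PC selection (names are hypothetical).
object PcModel {
  // Assumed entry address, matching `entry` in InstructionFetchTest.
  val EntryAddress: Long = 0x1000L

  // jumpFlag corresponds to jump_flag_id, jumpAddress to jump_address_id.
  def nextPc(pc: Long, jumpFlag: Boolean, jumpAddress: Long): Long =
    if (jumpFlag) jumpAddress // taken jump: load the target address
    else pc + 4               // sequential execution: next 32-bit instruction
}
```

For example, PcModel.nextPc(0x1008L, jumpFlag = true, PcModel.EntryAddress) yields 0x1000, while the same call with jumpFlag = false yields 0x100C.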
class InstructionFetchTest extends AnyFlatSpec with ChiselScalatestTester {
  behavior.of("InstructionFetch of Single Cycle CPU")
  it should "fetch instruction" in {
    test(new InstructionFetch).withAnnotations(TestAnnotations.annos) { c =>
      val entry = 0x1000
      var pre = entry // PC before the clock step
      var cur = pre   // expected PC after the clock step
      c.io.instruction_valid.poke(true.B)
      for (x <- 0 to 100) {
        Random.nextInt(2) match {
          case 0 => // no jump
            cur = pre + 4
            c.io.jump_flag_id.poke(false.B)
            c.clock.step()
            c.io.instruction_address.expect(cur)
            pre = pre + 4
          case 1 => // jump
            c.io.jump_flag_id.poke(true.B)
            c.io.jump_address_id.poke(entry)
            c.clock.step()
            c.io.instruction_address.expect(entry)
            pre = entry
        }
      }
    }
  }
}
In the given example, a random number is generated. If this random number is 0, the program continues without any jump, and the Program Counter (PC) simply increments by 4 (to pre + 4). Conversely, if the random number is 1, the program executes a jump to the entry address.
When jump_flag_id is set to 1, you can observe that instead of incrementing the PC by 4 to become 0x100C, it jumps directly to 0x1000 from its original address 0x1008.
When jump_flag_id is set to 0, you can observe that the PC transitions from 0x1000 to 0x1004 after the next clock cycle, following PC+4.
In the ID stage, the input signal instruction is decoded by the ID unit, generating various control signals for the circuit. After completing the ID module, you will obtain a total of 10 complete outputs.
class InstructionDecoderTest extends AnyFlatSpec with ChiselScalatestTester {
  behavior.of("InstructionDecoder of Single Cycle CPU")
  it should "produce correct control signal" in {
    test(new InstructionDecode).withAnnotations(TestAnnotations.annos) { c =>
      c.io.instruction.poke(0x00a02223L.U) // S-type
      c.io.ex_aluop1_source.expect(ALUOp1Source.Register)
      c.io.ex_aluop2_source.expect(ALUOp2Source.Immediate)
      c.io.regs_reg1_read_address.expect(0.U)
      c.io.regs_reg2_read_address.expect(10.U)
      c.clock.step()

      c.io.instruction.poke(0x000022b7L.U) // lui
      c.io.regs_reg1_read_address.expect(0.U)
      c.io.ex_aluop1_source.expect(ALUOp1Source.Register)
      c.io.ex_aluop2_source.expect(ALUOp2Source.Immediate)
      c.clock.step()

      c.io.instruction.poke(0x002081b3L.U) // add
      c.io.ex_aluop1_source.expect(ALUOp1Source.Register)
      c.io.ex_aluop2_source.expect(ALUOp2Source.Register)
      c.clock.step()
    }
  }
}
The above code verifies three instructions: S-type, lui, and add. I added two signals, memory_read_enable and memory_write_enable, in the InstructionDecoder.scala file, and the above test case lacks testing for memory_write_enable. Perhaps additional test cases can be added for memory_write_enable as part of completing Assignment 3.
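As a starting point for such a test case, the expected values of the two memory signals can be worked out from the RISC-V opcode alone: loads use opcode 0x03 and stores use opcode 0x23. The sketch below is a plain-Scala reference model, not the decoder itself, and MemEnableModel and its method names are hypothetical.

```scala
// Reference model for the expected memory enable signals (hypothetical names).
object MemEnableModel {
  // The opcode occupies bits [6:0] of every RISC-V instruction.
  private def opcode(inst: Long): Int = (inst & 0x7f).toInt

  def memoryReadEnable(inst: Long): Boolean = opcode(inst) == 0x03  // LOAD
  def memoryWriteEnable(inst: Long): Boolean = opcode(inst) == 0x23 // STORE
}

// A ChiselTest check could then mirror the model, assuming the output port is
// named memory_write_enable as in the text (not verified against the lab):
//   c.io.instruction.poke(0x00a02223L.U)          // S-type
//   c.io.memory_write_enable.expect(true.B)
```

Under this model, only the S-type instruction 0x00a02223 among the three tested instructions should assert memory_write_enable.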
S-type Waveform
lui Waveform
add Waveform
Based on Execute.scala, this stage is primarily composed of two modules: ALU and ALU Control. ALU Control takes the opcode, funct3, and funct7 fields of the instruction and generates the function code for the ALU. The ALU then performs the operation selected by that code, and the stage produces the output signals if_jump_flag and if_jump_address.
class ExecuteTest extends AnyFlatSpec with ChiselScalatestTester {
  behavior.of("Execution of Single Cycle CPU")
  it should "execute correctly" in {
    test(new Execute).withAnnotations(TestAnnotations.annos) { c =>
      c.io.instruction.poke(0x001101b3L.U) // x3 = x2 + x1
      for (x <- 0 to 100) {
        val op1 = scala.util.Random.nextInt(429496729)
        val op2 = scala.util.Random.nextInt(429496729)
        val result = op1 + op2
        c.io.reg1_data.poke(op1.U)
        c.io.reg2_data.poke(op2.U)
        c.clock.step()
        c.io.mem_alu_result.expect(result.U)
        c.io.if_jump_flag.expect(0.U)
      }

      // beq test
      c.io.instruction.poke(0x00208163L.U) // pc + 2 if x1 === x2
      c.io.instruction_address.poke(2.U)
      c.io.immediate.poke(2.U)
      c.io.aluop1_source.poke(1.U)
      c.io.aluop2_source.poke(1.U)
      c.clock.step()

      // equ
      c.io.reg1_data.poke(9.U)
      c.io.reg2_data.poke(9.U)
      c.clock.step()
      c.io.if_jump_flag.expect(1.U)
      c.io.if_jump_address.expect(4.U)

      // not equ
      c.io.reg1_data.poke(9.U)
      c.io.reg2_data.poke(19.U)
      c.clock.step()
      c.io.if_jump_flag.expect(0.U)
      c.io.if_jump_address.expect(4.U)
    }
  }
}
I have added the signal assignments for alu.io.func, alu.io.op1, and alu.io.op2 in Execute that were previously incomplete. This test verifies three cases: the addition x3 = x2 + x1, beq with equal operands (equ), and beq with unequal operands (not equ).
X3=X1+X2
beq
not beq
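The expected values in the beq part of the test can be reproduced with a small plain-Scala model. This is a sketch of the arithmetic only, not the lab's Execute module, and BranchModel is a hypothetical name: the jump flag is the result of comparing the two register values, and the jump address is the instruction address plus the immediate regardless of whether the branch is taken.

```scala
// Software model of the beq outputs checked in ExecuteTest (hypothetical names).
object BranchModel {
  // Returns (if_jump_flag, if_jump_address) for a beq instruction.
  def beq(pc: Long, imm: Long, rs1: Long, rs2: Long): (Boolean, Long) =
    (rs1 == rs2, pc + imm)
}
```

With instruction_address = 2 and immediate = 2 as in the test, the address is 4 in both cases; only the flag differs between the equal (9, 9) and unequal (9, 19) operands.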
Because the single-cycle CPU lacks system calls, I removed the ecall, rdcycle, and rdcycleh instructions and added the _start and loop labels instead.
.global itof_clz
.global _start
_start:
    la t0, num
    lw a0, 12(t0)
    lw a1, 8(t0)
    jal itof_clz
    li t0, 1
    li t1, 2
    li t2, 3
loop:
    j loop
I wrote my test in CPUtest; here is my test program:
class itof_clzTest extends AnyFlatSpec with ChiselScalatestTester {
  behavior.of("Single Cycle CPU")
  it should "convert integer to floating point" in {
    test(new TestTopModule("itof_clz.asmbin")).withAnnotations(TestAnnotations.annos) { c =>
      for (i <- 1 to 500) {
        c.clock.step(1000) // Avoid timeout
        c.io.mem_debug_read_address.poke((i * 4).U) // Assume the converted result is stored in memory sequentially
      }
      c.io.regs_debug_read_address.poke(10.U)
      println(s"${c.io.regs_debug_read_data.peek()}")
      c.io.regs_debug_read_data.expect(1088462400.U)
      c.io.regs_debug_read_address.poke(11.U)
      println(s"${c.io.regs_debug_read_data.peek()}")
      c.io.regs_debug_read_data.expect(0.U)
    }
  }
}
The main goal is to test whether my integer can be converted into IEEE-754 floating point.
Command to run this single test:
$ sbt "testOnly riscv.singlecycle.itof_clzTest"
Running this test program gives the following success message:
[info] welcome to sbt 1.9.7 (Eclipse Adoptium Java 11.0.21)
[info] loading settings for project ca2023-lab3-build from plugins.sbt ...
[info] loading project definition from /home/chihen/ca2023-lab3/project
[info] loading settings for project root from build.sbt ...
[info] set current project to mycpu (in build file:/home/chihen/ca2023-lab3/)
UInt<32>(1088462400)
UInt<32>(0)
[info] itof_clzTest:
[info] Single Cycle CPU
[info] - should convert integer to floating point
[info] Run completed in 18 seconds, 664 milliseconds.
[info] Total number of tests run: 1
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[success] Total time: 19 s, completed Nov 29, 2023, 8:36:22 PM
| Input | Output |
|---|---|
| 0x84f2 | UInt<32>(1088462400) |
I checked my output with an IEEE-754 converter, and it is correct.
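The same check can be done locally with the JVM's own conversion. The sketch below assumes the program produces a double-precision result with the high 32-bit word in a0 (x10) and the low word in a1 (x11); this layout is inferred from the expected values 1088462400 and 0 in the test, not stated elsewhere in these notes, and Ieee754Check is a hypothetical name.

```scala
// Cross-check of the expected register values against the JVM's IEEE-754
// double conversion (hypothetical helper, assumes high word in a0, low in a1).
object Ieee754Check {
  // Returns (high 32 bits, low 32 bits) of n converted to an IEEE-754 double.
  def doubleWords(n: Long): (Long, Long) = {
    val bits = java.lang.Double.doubleToRawLongBits(n.toDouble)
    ((bits >>> 32) & 0xffffffffL, bits & 0xffffffffL)
  }
}
```

For the input 0x84f2 (34034), the bit pattern is 0x40E09E4000000000, whose high word 0x40E09E40 equals 1088462400 and whose low word is 0.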
Use the following commands to generate the simulation executable of the CPU and run it:
$ make verilator
$ ./run-verilator.sh -instruction src/main/resources/itof_clz.asmbin -time 4000 -vcd itofclz01.vcd
Output:
-time 4000
-memory 1048576
-instruction src/main/resources/itof_clz.asmbin
[-------------------->] 100%
Using an online RISC-V instruction encoder/decoder, we can quickly see which registers and fields an instruction encodes, which makes it easier to follow the corresponding signals in the waveform.
sub a3,a3,a2
Assembly = sub x13, x13, x12
Binary = 0100 0000 1100 0110 1000 0110 1011 0011
Hexadecimal = 0x40c686b3
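The decoding done by the online tool can also be reproduced by hand. The following plain-Scala sketch extracts the R-type fields from a 32-bit instruction word using the standard RISC-V bit positions; RvFields, RType, and decode are hypothetical names chosen for this example.

```scala
// Minimal R-type field extractor (hypothetical helper, standard RV32I layout).
object RvFields {
  final case class RType(opcode: Int, rd: Int, funct3: Int, rs1: Int, rs2: Int, funct7: Int)

  def decode(inst: Long): RType = {
    // Extract bits [hi:lo] of the instruction word.
    def bits(hi: Int, lo: Int): Int =
      ((inst >>> lo) & ((1L << (hi - lo + 1)) - 1)).toInt
    RType(
      opcode = bits(6, 0),
      rd     = bits(11, 7),
      funct3 = bits(14, 12),
      rs1    = bits(19, 15),
      rs2    = bits(24, 20),
      funct7 = bits(31, 25)
    )
  }
}
```

Decoding 0x40c686b3 gives opcode 0x33, rd = 13, rs1 = 13, rs2 = 12, funct3 = 0, and funct7 = 0x20, which corresponds to sub x13, x13, x12.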
ID stage
io_reg_write_enable indicates whether the instruction writes its result back to the register file, which is the case for R-type instructions such as sub.
EX stage
alu_op has successfully retrieved the values from the registers and is ready to perform the operation on them.
Reg stage
After the next clock edge, the value in the destination register changes.
Assembly = lw x10, 12(x5)
Binary = 0000 0000 1100 0010 1010 0101 0000 0011
Hexadecimal = 0x00c2a503
ID stage
mem_read_enable and reg_write_enable are set, so that data is read from the memory address and then written into a register.
EX stage
We can see the memory address 00001308, computed from the value read from the register.
Reg stage
After the next clock edge, the value in the destination register changes.
Assembly = sw x10, 0(x2)
Binary = 0000 0000 1010 0001 0010 0000 0010 0011
Hexadecimal = 0x00a12023
ID stage
io_reg_write_enable indicates whether the instruction writes to the register file. An S-type instruction such as sw stores to memory and does not write a register.
EX stage
We can observe that alu_op is determined by alu_op_source, which in turn affects the data used by the instruction.
Through this practical assignment, I have come to realize my own shortcomings and have learned a new programming language. Going through Lab 3 step by step to understand the architecture of a single-cycle CPU has given me a deeper understanding of the essence of computer architecture and its design. Perhaps in the future, there may be assignments related to GPU design that will allow us to delve even further into the implications and principles behind computer components. I also look forward to continuously learning through the guidance of our teacher and pushing myself to bridge the significant gap between myself and those who excel in the field.
Construct a single-cycle RISC-V CPU with Chisel
Single-Cycle Processor
Building a RISC-V Processor
Datapath Control
Chisel Breakdown 3