contributed by freshLiver
My machines run Ubuntu 20.04 and Debian 12, but lab3 recommends Ubuntu 22.04, so I use the provided Dockerfile to build the development environment in Docker.
Create a XXX.yml file under ca2023-lab3/.github/workflows with the following content:
name: MyCPU Testing
run-name: ${{ github.actor }} is testing out GitHub Actions 🚀
on: [push]
jobs:
  Explore-GitHub-Actions:
    runs-on: ubuntu-22.04 # https://github.com/actions/runner-images/tree/main
    steps:
      - name: Check out repository code
        uses: actions/checkout@v4
      - name: Setup Chisel Env
        run: |
          sudo apt-get update && sudo apt-get -y install build-essential curl wget zip unzip verilator
          sbt --version
      - name: Setup RISC-V Env
        run: |
          wget https://github.com/xpack-dev-tools/riscv-none-elf-gcc-xpack/releases/download/v13.2.0-2/xpack-riscv-none-elf-gcc-13.2.0-2-linux-x64.tar.gz
          tar zxvf xpack-riscv-none-elf-gcc-13.2.0-2-linux-x64.tar.gz
          cp -af xpack-riscv-none-elf-gcc-13.2.0-2 $HOME/riscv-none-elf-gcc
          export PATH="$HOME/riscv-none-elf-gcc/bin:$PATH"
          riscv-none-elf-gcc -v
      - name: Build and Test
        run: |
          export PATH="$HOME/riscv-none-elf-gcc/bin:$PATH"
          make -C csrc update
          sbt test
To complete the missing parts, we must refer to the wiring between the components in the CPU:
The missing part in IF is the MUX that determines the PC of the next cycle. In most cases the next PC is PC + 4; however, when the EXE stage reports that a branch/jump instruction was taken, IF must instead change the PC to the address specified by the branch/jump instruction.
In the IF stage, whether a branch/jump was taken is indicated by the jump_flag_id input signal, so the missing MUX can be implemented with the Mux provided by Chisel.
:warning: Refrain from copying and pasting your solution directly into the HackMD note. Instead, provide a concise summary of the various test cases, outlining the aspects of the CPU they evaluate, the techniques employed for loading test program instructions, and the outcomes of these test cases.
In the ID stage, two output signals (memory_read_enable and memory_write_enable) are not handled, so we just raise each signal based on the type of the input instruction (L type for reading memory, S type for writing memory).
In the EXE stage, the missing parts are the inputs of the ALU: its output is already wired to the stage output (io.mem_alu_result), but its inputs (the outputs of the two MUXs and the ALUFunct from ALU Control) are not connected.
The output of ALU Control (alu_ctrl.io.alu_funct) can simply be wired to the ALU input (alu.io.func), but the ALU operands (alu.io.op1 and alu.io.op2) come from the two MUXs.
The first (upper) MUX determines which data is passed to the ALU as the first operand. From the implementation of the ID stage, we can see that when the signal ex_aluop1_source is 0 (ALUOp1Source.Register), the MUX should pass the register data (reg1_data in the code, Reg1RD in the image above); otherwise (ALUOp1Source.InstructionAddress), as the name of the constant indicates, the MUX should pass the instruction address to the ALU.
object ALUOp1Source {
  val Register = 0.U(1.W)
  val InstructionAddress = 1.U(1.W)
}
...
class InstructionDecode extends Module {
  ...
  io.ex_aluop1_source := Mux(
    opcode === Instructions.auipc || opcode === InstructionTypes.B || opcode === Instructions.jal,
    ALUOp1Source.InstructionAddress,
    ALUOp1Source.Register
  )
  ...
  io.ex_aluop2_source := Mux(
    opcode === InstructionTypes.RM,
    ALUOp2Source.Register,
    ALUOp2Source.Immediate
  )
  ...
}
In the above image, the MUX passes Reg1RD to the ALU when ALUOp1Src is set.
However, in the given implementation, Reg1RD is passed to the ALU when the signal ex_aluop1_source is 0, which differs from the wiring in the image.
Similarly, the second MUX passes the register data (reg2_data) to the ALU when ex_aluop2_source is ALUOp2Source.Register; otherwise it passes the immediate.
Finally, we need to connect the signals and data each component needs, which is handled by the CPU module.
In this module, the only missing part is wiring the signals and data needed by the EXE stage.
.asmbin and .txt Files
To understand how the assembly code is executed by MyCPU, we can first trace the test functions.
When we test MyCPU with the command sbt "testOnly riscv.singlecycle.XXXXTest", the corresponding test class defined in ca2023-lab3/src/test/scala/riscv/singlecycle/XXXXTest.scala will be executed. Take ByteAccessTest for example:
$ sbt "testOnly riscv.singlecycle.ByteAccessTest"
[info] welcome to sbt 1.9.7 (Eclipse Adoptium Java 11.0.21)
[info] loading settings for project lab3-build from plugins.sbt ...
[info] loading project definition from /lab3/project
[info] loading settings for project root from build.sbt ...
[info] set current project to mycpu (in build file:/lab3/)
[info] ByteAccessTest:
[info] Single Cycle CPU
[info] - should store and load a single byte
[info] Run completed in 6 seconds, 446 milliseconds.
[info] Total number of tests run: 1
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[success]
Then, look at the implementation of the executed test: it uses the module TestTopModule to load the assembly:
class ByteAccessTest extends AnyFlatSpec with ChiselScalatestTester {
  behavior.of("Single Cycle CPU")
  it should "store and load a single byte" in {
    test(new TestTopModule("sb.asmbin")).withAnnotations(TestAnnotations.annos) { c =>
      for (i <- 1 to 500) {
        c.clock.step()
        c.io.mem_debug_read_address.poke((i * 4).U) // Avoid timeout
      }
      c.io.regs_debug_read_address.poke(5.U) // t0
      c.io.regs_debug_read_data.expect(0xdeadbeefL.U)
      c.io.regs_debug_read_address.poke(6.U) // t1
      c.io.regs_debug_read_data.expect(0xef.U)
      c.io.regs_debug_read_address.poke(1.U) // ra
      c.io.regs_debug_read_data.expect(0x15ef.U)
    }
  }
}
If we dive into the implementation of TestTopModule, we find that it uses peripheral.InstructionROM to read the specified program (a *.asmbin file under the ca2023-lab3/src/main/resources directory) into memory and to convert it into a .txt file under the ca2023-lab3/verilog directory:
class InstructionROM(instructionFilename: String) extends Module {
  ...
  val (instructionsInitFile, capacity) = readAsmBinary(instructionFilename)
  ...
  def readAsmBinary(filename: String) = {
    ...
    var instructions = new Array[BigInt](0)
    val arr = new Array[Byte](4)
    ...
    instructions = instructions :+ BigInt(0x00000013L)
    instructions = instructions :+ BigInt(0x00000013L)
    instructions = instructions :+ BigInt(0x00000013L)
    val currentDir = System.getProperty("user.dir")
    val exeTxtPath = Paths.get(currentDir, "verilog", f"${instructionFilename}.txt")
    val writer = new FileWriter(exeTxtPath.toString)
    for (i <- instructions.indices) {
      writer.write(f"@$i%x\n${instructions(i)}%08x\n")
    }
    writer.close()
    (exeTxtPath, instructions.length)
  }
}
We can see that the function readAsmBinary appends 3 nop instructions (0x00000013) to the end of the array instructions, then writes the instructions to the txt file as hexadecimal plain text, preceding each instruction with @${instruction-number}.
Why add 3 nop instead of 4 (ID, EXE, MEM, WB)?
Now, check the content of the created txt file sb.asmbin.txt (the # comments are annotations added here for readability; the code above does not write them):
@0
00400513 # addi x10, x0, 4
@1
deadc2b7 # lui x5, -136484
@2
eef28293 # addi x5, x5, -273
@3
00550023 # sb x5, 0(x10)
@4
00052303 # lw x6, 0(x10)
@5
01500913 # addi x18, x0, 21
@6
012500a3 # sb x18, 1(x10)
@7
00052083 # lw x1, 0(x10)
@8
0000006f # jal x0, 0
@9
00000013 # nop
@a
00000013 # nop
@b
00000013 # nop
Comparing it with the disassembly of the source assembly code, we can see that it contains all the instructions of the source:
$ riscv-none-elf-objdump -d sb.o
sb.o: file format elf32-littleriscv
Disassembly of section .text:
00000000 <_start>:
0: 00400513 li a0,4
4: deadc2b7 lui t0,0xdeadc
8: eef28293 add t0,t0,-273 # deadbeef <loop+0xdeadbecf>
c: 00550023 sb t0,0(a0)
10: 00052303 lw t1,0(a0)
14: 01500913 li s2,21
18: 012500a3 sb s2,1(a0)
1c: 00052083 lw ra,0(a0)
00000020 <loop>:
20: 0000006f j 20 <loop>
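The annotations in the txt listing can be cross-checked against the objdump output by decoding a word by hand. The following is a minimal plain-Scala sketch (not Chisel, and not part of the lab sources) that extracts the fields of an RV32I I-type instruction; the names ITypeDecode and IType are made up for illustration.

```scala
// Illustrative field extractor for RV32I I-type words (hypothetical helper).
object ITypeDecode {
  final case class IType(opcode: Int, rd: Int, funct3: Int, rs1: Int, imm: Int)

  def decode(word: Long): IType = {
    val w = word.toInt // keep only the low 32 bits
    IType(
      opcode = w & 0x7f,          // bits 6..0
      rd     = (w >>> 7) & 0x1f,  // bits 11..7
      funct3 = (w >>> 12) & 0x7,  // bits 14..12
      rs1    = (w >>> 15) & 0x1f, // bits 19..15
      imm    = w >> 20            // bits 31..20; arithmetic shift sign-extends
    )
  }
}
```

For example, decoding 0x00400513 gives rd = 10, rs1 = 0 and imm = 4 (addi x10, x0, 4), and decoding 0xeef28293 gives rd = 5, rs1 = 5 and imm = -273, matching both listings.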
.txt to RAM
Imagine that there is a ROM beside our CPU: we first need to load our instructions from the ROM into RAM, and then start the CPU, which fetches the instructions from RAM to execute our program.
However, the above only shows that the specified assembly code is converted into a txt file; we still need to know how the assembly is loaded by the CPU module. So next, let's look at how the outputs of readAsmBinary are used during the simulation.
After the assembly is converted to a txt file, the loadMemoryFromFileInline function is used to load the content of the given txt file into the ROM:
class InstructionROM(instructionFilename: String) extends Module {
  val io = IO(new Bundle {
    val address = Input(UInt(Parameters.AddrWidth))
    val data = Output(UInt(Parameters.InstructionWidth))
  })
  val (instructionsInitFile, capacity) = readAsmBinary(instructionFilename)
  val mem = Mem(capacity, UInt(Parameters.InstructionWidth))
  annotate(new ChiselAnnotation {
    override def toFirrtl =
      MemorySynthInit
  })
  loadMemoryFromFileInline(mem, instructionsInitFile.toString.replaceAll("\\\\", "/"))
  io.data := mem.read(io.address)
  ...
}
According to the Chisel documentation:
loadMemoryFromFileInline is an annotation generator that helps with loading a memory from a text file inlined in the Verilog module. This relies on Verilator and Verilog's $readmemh or $readmemb.
The content of the generated txt file?
This function uses $readmemh to load the hexadecimal values, so we can check the Verilog spec (another source if the IEEE source is too slow) for the input format of $readmemh:
The numbers shall have neither the length nor the base format specified. For $readmemb, each number shall be binary. For $readmemh, the numbers shall be hexadecimal. […] White space and/or comments shall be used to separate the numbers.
In the following discussion, the term address refers to an index into the array that models the memory.
As the file is read, each number encountered is assigned to a successive word element of the memory. Addressing is controlled both by specifying start and/or finish addresses in the system task invocation and by specifying addresses in the data file. When addresses appear in the data file, the format is an at character (@) followed by a hexadecimal number as follows:
@hh…h
Both uppercase and lowercase digits are allowed in the number. No white space is allowed between the @ and the number. As many address specifications as needed within the data file can be used. When the system task encounters an address specification, it loads subsequent data starting at that memory address.
And now we understand why the assembly must be converted into a txt file in this special format (@${instruction-number}).
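To make the quoted rules concrete, here is a minimal plain-Scala sketch of how such a file maps to memory indices. It is only a model of the subset used by readAsmBinary (an @ address line followed by one hexadecimal word); comments and the start/finish arguments of $readmemh are not handled, and the name ReadmemhModel is made up for illustration.

```scala
// Hypothetical model of a small subset of the $readmemh text format.
object ReadmemhModel {
  // "@<hex>" sets the current index; any other token is a hexadecimal word
  // stored at the current index, which then advances by one.
  def parse(text: String): Map[Int, Long] = {
    var index = 0
    val mem = scala.collection.mutable.Map.empty[Int, Long]
    for (token <- text.split("\\s+") if token.nonEmpty) {
      if (token.startsWith("@")) {
        index = Integer.parseInt(token.drop(1), 16)
      } else {
        mem(index) = java.lang.Long.parseLong(token, 16)
        index += 1
      }
    }
    mem.toMap
  }
}
```

With this model, parsing "@0\n00400513\n@1\ndeadc2b7\n" places 0x00400513 at index 0 and 0xdeadc2b7 at index 1, while a file that starts at @1 leaves index 0 without any entry.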
Tweak the addresses to verify the understanding
To verify our understanding of the addresses in the input file, let's make two modifications to the InstructionROM module:
Change the format string f"@$i%x\n${instructions(i)}%08x\n" to f"@${i+1}%x\n${instructions(i)}%08x\n"
Increase the length returned by readAsmBinary by 1 (instructions.length + 1)
Then, run the CPU, and we can see that the first data output by the InstructionROM module is now 00000000:
$ sbt "testOnly riscv.singlecycle.ByteAccessTest"
...
[warn] one warning found
InstructionROM output 00000000
Memory Poke 0, output instruction -> 00000000
InstructionROM output 00400513
Memory Poke 1, output instruction -> 00000000
InstructionROM output deadc2b7
Memory Poke 2, output instruction -> 00000000
InstructionROM output eef28293
Memory Poke 3, output instruction -> 00000000
CPU input instruction -> 00000000 (0)
InstructionROM output 00550023
Memory Poke 4, output instruction -> 00000000
InstructionROM output 00052303
Memory Poke 5, output instruction -> 00000000
InstructionROM output 01500913
Memory Poke 6, output instruction -> 00000000
InstructionROM output 012500a3
Memory Poke 7, output instruction -> 00000000
From the code, we can see that in each cycle the InstructionROM module outputs the instruction at the specified address on the io.data port.
In TestTopModule, we can find that this output (instruction_rom.io.data) is connected to rom_loader.io.rom_data, with the load_address port of ROMLoader set to Parameters.EntryAddress (0x1000):
class TestTopModule(exeFilename: String) extends Module {
  ...
  val mem = Module(new Memory(8192))
  val instruction_rom = Module(new InstructionROM(exeFilename))
  val rom_loader = Module(new ROMLoader(instruction_rom.capacity))
  rom_loader.io.rom_data := instruction_rom.io.data
  rom_loader.io.load_address := Parameters.EntryAddress
  instruction_rom.io.address := rom_loader.io.rom_address
  ...
}
And from the implementation of ROMLoader, we can see that the input instructions are written to the RAM (through RAMBundle), starting from the given address load_address:
class ROMLoader(capacity: Int) extends Module {
  val io = IO(new Bundle {
    val bundle = Flipped(new RAMBundle)
    ...
  })
  val address = RegInit(0.U(32.W))
  val valid = RegInit(false.B)
  ...
  when(address <= (capacity - 1).U) {
    io.bundle.write_enable := true.B
    io.bundle.write_data := io.rom_data
    io.bundle.address := (address << 2.U).asUInt + io.load_address
    io.bundle.write_strobe := VecInit(Seq.fill(Parameters.WordSize)(true.B))
    address := address + 1.U
    when(address === (capacity - 1).U) {
      valid := true.B
    }
  }
  io.load_finished := valid
  io.rom_address := address
}
Once all the instructions are loaded into the RAM (determined by comparing the register address with the capacity given by InstructionROM), the ROMLoader sets load_finished to signal the CPU to start fetching instructions from the RAM.
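The address arithmetic of ROMLoader can be sketched in plain Scala as follows. This is only a software model of the expression (address << 2) + load_address from the code above, not Chisel; the object name RomLoaderModel and its method names are made up for illustration.

```scala
// Hypothetical software model of the ROMLoader write addresses.
object RomLoaderModel {
  val EntryAddress = 0x1000 // value of Parameters.EntryAddress quoted above

  // Byte address in RAM for the ROM word at `index`.
  def ramAddress(index: Int, loadAddress: Int = EntryAddress): Int =
    (index << 2) + loadAddress

  // The loader writes while address <= capacity - 1, i.e. `capacity` words.
  def writes(capacity: Int): Seq[Int] =
    (0 until capacity).map(i => ramAddress(i))
}
```

For sb.asmbin (9 instructions plus 3 nop, so a capacity of 12), the first word goes to 0x1000 and the last one (index 0xb) goes to 0x102c.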
Now, back to the test module: we can suppose that each iteration of the loop (one c.clock.step() followed by a c.io.mem_debug_read_address.poke()) advances the default clock by one cycle, in which the InstructionROM outputs one instruction to the ROMLoader and the ROMLoader writes it into RAM. After all the instructions are loaded into RAM, the CPU starts fetching instructions from RAM:
class ByteAccessTest extends AnyFlatSpec with ChiselScalatestTester {
  ...
    test(new TestTopModule("sb.asmbin")).withAnnotations(TestAnnotations.annos) { c =>
      for (i <- 1 to 500) {
        c.clock.step()
        c.io.mem_debug_read_address.poke((i * 4).U) // Avoid timeout
      }
      ...
    }
  }
}
Next, let's check how the instructions are executed.
In the TestTopModule module, we can find that the CPU runs on a separate clock that ticks only once every 4 cycles of the default clock (the one driving InstructionROM and ROMLoader).
Why does the CPU module need another clock?
class TestTopModule(exeFilename: String) extends Module {
  val io = IO(new Bundle {
    val mem_debug_read_address = Input(UInt(Parameters.AddrWidth))
    val regs_debug_read_address = Input(UInt(Parameters.PhysicalRegisterAddrWidth))
    val regs_debug_read_data = Output(UInt(Parameters.DataWidth))
    val mem_debug_read_data = Output(UInt(Parameters.DataWidth))
  })
  ...
  val CPU_clkdiv = RegInit(UInt(2.W), 0.U)
  val CPU_tick = Wire(Bool())
  val CPU_next = Wire(UInt(2.W))
  CPU_next := Mux(CPU_clkdiv === 3.U, 0.U, CPU_clkdiv + 1.U)
  CPU_tick := CPU_clkdiv === 0.U
  CPU_clkdiv := CPU_next
  withClock(CPU_tick.asClock) {
    val cpu = Module(new CPU)
    cpu.io.debug_read_address := 0.U
    cpu.io.instruction_valid := rom_loader.io.load_finished
    mem.io.instruction_address := cpu.io.instruction_address
    cpu.io.instruction := mem.io.instruction
    when(!rom_loader.io.load_finished) {
      rom_loader.io.bundle <> mem.io.bundle
      cpu.io.memory_bundle.read_data := 0.U
    }.otherwise {
      rom_loader.io.bundle.read_data := 0.U
      cpu.io.memory_bundle <> mem.io.bundle
    }
    cpu.io.debug_read_address := io.regs_debug_read_address
    io.regs_debug_read_data := cpu.io.debug_read_data
  }
  mem.io.debug_read_address := io.mem_debug_read_address
  io.mem_debug_read_data := mem.io.debug_read_data
}
How can ROMLoader access the Memory instance created in TestTopModule?
The trick seems to be inside the when statement:
when(!rom_loader.io.load_finished) {
  rom_loader.io.bundle <> mem.io.bundle
  cpu.io.memory_bundle.read_data := 0.U
}.otherwise {
  rom_loader.io.bundle.read_data := 0.U
  cpu.io.memory_bundle <> mem.io.bundle
}
According to the official documentation, the bulk connect operator <> seems to connect the matching fields of two bundles.
So, while the ROMLoader is still loading the instructions, the RAM (mem) in TestTopModule is connected to the RAMBundle of the ROMLoader; once the instructions have been loaded into RAM, the RAM stops accepting data from the ROMLoader and accepts data from the CPU instead.
From the code, we can also see that the input port io.mem_debug_read_address is wired to the RAM's mem.io.debug_read_address, meaning that the test can read the data at the specified RAM address (io.mem_debug_read_address) whenever it pokes this port.
In addition to the io.mem_debug_read_address port, cpu.io.instruction_address is also wired to the RAM, and the RAM outputs the instruction (mem.io.instruction) at that address to the CPU.
So at the end of every 4 default cycles (4 iterations of the test loop, each with one c.clock.step() and one poke), the CPU is clocked and fetches the instruction at instruction_address (if the instruction_valid signal is also set).
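To double-check the claim about every 4 default cycles, here is a minimal plain-Scala model of the clock divider (not Chisel, and not part of the lab sources). It assumes that the CPU is clocked on the rising edge of CPU_tick; the object name ClockDividerModel and the method name risingEdges are made up for illustration.

```scala
// Hypothetical software model of CPU_clkdiv / CPU_tick in TestTopModule.
object ClockDividerModel {
  // Returns the 1-based default-clock cycles after which CPU_tick rises.
  def risingEdges(defaultCycles: Int): Seq[Int] = {
    var clkdiv = 0             // RegInit(0.U): the counter starts at 0
    var tick = clkdiv == 0     // CPU_tick := CPU_clkdiv === 0.U
    val edges = scala.collection.mutable.ArrayBuffer.empty[Int]
    for (n <- 1 to defaultCycles) {
      clkdiv = if (clkdiv == 3) 0 else clkdiv + 1 // CPU_next
      val nextTick = clkdiv == 0
      if (!tick && nextTick) edges += n           // low -> high transition
      tick = nextTick
    }
    edges.toSeq
  }
}
```

Under this model the CPU clock rises after default cycles 4, 8, 12, and so on, which means the 500 iterations of the test loop give the CPU 125 clock edges in total, including those spent waiting for the ROMLoader.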
Now, add some printf calls to inspect the data during simulation; the output supports our assumption:
$ sbt "testOnly riscv.singlecycle.ByteAccessTest"
...
[warn] one warning found
(default cycle) InstructionROM: output 00400513 from 00000000
(default cycle) ROMLoader: write 00400513 to 00001000 // first instruction loaded
(default cycle) InstructionROM: output deadc2b7 from 00000001
(default cycle) ROMLoader: write deadc2b7 to 00001004
(default cycle) InstructionROM: output eef28293 from 00000002
(default cycle) ROMLoader: write eef28293 to 00001008
(default cycle) InstructionROM: output 00550023 from 00000003
(default cycle) ROMLoader: write 00550023 to 0000100c
(CPU cycle) CPU: fetch 00400513 from 00001000 (valid: 0)
(default cycle) InstructionROM: output 00052303 from 00000004
(default cycle) ROMLoader: write 00052303 to 00001010
(default cycle) InstructionROM: output 01500913 from 00000005
(default cycle) ROMLoader: write 01500913 to 00001014
(default cycle) InstructionROM: output 012500a3 from 00000006
(default cycle) ROMLoader: write 012500a3 to 00001018
(default cycle) InstructionROM: output 00052083 from 00000007
(default cycle) ROMLoader: write 00052083 to 0000101c
(CPU cycle) CPU: fetch 00400513 from 00001000 (valid: 0)
(default cycle) InstructionROM: output 0000006f from 00000008
(default cycle) ROMLoader: write 0000006f to 00001020
(default cycle) InstructionROM: output 00000013 from 00000009
(default cycle) ROMLoader: write 00000013 to 00001024
(default cycle) InstructionROM: output 00000013 from 0000000a
(default cycle) ROMLoader: write 00000013 to 00001028
(default cycle) InstructionROM: output 00000013 from 0000000b
(default cycle) ROMLoader: write 00000013 to 0000102c // last instruction loaded
(CPU cycle) CPU: fetch 00400513 from 00001000 (valid: 0)
(default cycle) InstructionROM: output 00000000 from 0000000c
(default cycle) ROMLoader: write 00000000 to 00000000
(default cycle) InstructionROM: output 00000000 from 0000000c
(default cycle) ROMLoader: write 00000000 to 00000000
(default cycle) InstructionROM: output 00000000 from 0000000c
(default cycle) ROMLoader: write 00000000 to 00000000
(default cycle) InstructionROM: output 00000000 from 0000000c
(default cycle) ROMLoader: write 00000000 to 00000000
(CPU cycle) CPU: fetch 00400513 from 00001000 (valid: 1) // CPU started
(default cycle) InstructionROM: output 00000000 from 0000000c
(default cycle) ROMLoader: write 00000000 to 00000000
(default cycle) InstructionROM: output 00000000 from 0000000c
(default cycle) ROMLoader: write 00000000 to 00000000
(default cycle) InstructionROM: output 00000000 from 0000000c
(default cycle) ROMLoader: write 00000000 to 00000000
(default cycle) InstructionROM: output 00000000 from 0000000c
(default cycle) ROMLoader: write 00000000 to 00000000
(CPU cycle) CPU: fetch deadc2b7 from 00001004 (valid: 1)
(default cycle) InstructionROM: output 00000000 from 0000000c
(default cycle) ROMLoader: write 00000000 to 00000000
(default cycle) InstructionROM: output 00000000 from 0000000c
(default cycle) ROMLoader: write 00000000 to 00000000
(default cycle) InstructionROM: output 00000000 from 0000000c
(default cycle) ROMLoader: write 00000000 to 00000000
(default cycle) InstructionROM: output 00000000 from 0000000c
(default cycle) ROMLoader: write 00000000 to 00000000
... // running
(CPU cycle) CPU: fetch 00052083 from 0000101c (valid: 1)
(default cycle) InstructionROM: output 00000000 from 0000000c
(default cycle) ROMLoader: write 00000000 to 00000000
(default cycle) InstructionROM: output 00000000 from 0000000c
(default cycle) ROMLoader: write 00000000 to 00000000
(default cycle) InstructionROM: output 00000000 from 0000000c
(default cycle) ROMLoader: write 00000000 to 00000000
(default cycle) InstructionROM: output 00000000 from 0000000c
(default cycle) ROMLoader: write 00000000 to 00000000
(CPU cycle) CPU: fetch 0000006f from 00001020 (valid: 1)
(default cycle) InstructionROM: output 00000000 from 0000000c
(default cycle) ROMLoader: write 00000000 to 00000000
(default cycle) InstructionROM: output 00000000 from 0000000c
(default cycle) ROMLoader: write 00000000 to 00000000
(default cycle) InstructionROM: output 00000000 from 0000000c
(default cycle) ROMLoader: write 00000000 to 00000000
(default cycle) InstructionROM: output 00000000 from 0000000c
(default cycle) ROMLoader: write 00000000 to 00000000
(CPU cycle) CPU: fetch 0000006f from 00001020 (valid: 1)
(default cycle) InstructionROM: output 00000000 from 0000000c
(default cycle) ROMLoader: write 00000000 to 00000000
(default cycle) InstructionROM: output 00000000 from 0000000c
(default cycle) ROMLoader: write 00000000 to 00000000
(default cycle) InstructionROM: output 00000000 from 0000000c
(default cycle) ROMLoader: write 00000000 to 00000000
(default cycle) InstructionROM: output 00000000 from 0000000c
(default cycle) ROMLoader: write 00000000 to 00000000
... // repeat `j loop`...
After the loop has finished, the remainder of the test checks the values of the registers selected via the regs_debug_read_address port and compares them with the expected values:
class ByteAccessTest extends AnyFlatSpec with ChiselScalatestTester {
  ...
    test(new TestTopModule("sb.asmbin")).withAnnotations(TestAnnotations.annos) { c =>
      ...
      c.io.regs_debug_read_address.poke(5.U) // t0
      c.io.regs_debug_read_data.expect(0xdeadbeefL.U)
      c.io.regs_debug_read_address.poke(6.U) // t1
      c.io.regs_debug_read_data.expect(0xef.U)
      c.io.regs_debug_read_address.poke(1.U) // ra
      c.io.regs_debug_read_data.expect(0x15ef.U)
    }
  }
}
.data Section
Now, create a simple custom test that runs our handwritten assembly code on MyCPU to verify our understanding of the overall workflow of MyCPU.
First, define some data in the .data section and load the first word into the a0 register:
# csrc/simple.S
.data
mdata:
    .word 0x12345678
    .word 0x22345678
    .word 0x32345678
    .word 0x42345678
.text
main:
    li s1, 1234
    la s0, .data
    lw a0, 0(s0)
loop:
    j loop
Then run make and check the generated asmbin using the xxd command (with the -e option); we can see that the .data section is placed right after the instructions:
00000000: 4d200493 00001417 ffc40413 00042503 .. M.........%..
00000010: 0000006f 12345678 22345678 32345678 o...xV4.xV4"xV42
00000020: 42345678 xV4B
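The -e option makes xxd group the bytes into little-endian words, which is why the dump lines up with the instruction words in the generated txt file. A minimal plain-Scala sketch of that grouping is shown below; the object name AsmbinWords is made up for illustration, and it is not claimed to be how readAsmBinary is actually implemented.

```scala
import java.nio.{ByteBuffer, ByteOrder}

// Hypothetical helper: view raw bytes as 32-bit little-endian words.
object AsmbinWords {
  def words(bytes: Array[Byte]): Seq[Long] = {
    val buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN)
    // getInt() is signed, so mask it to get the unsigned 32-bit value.
    Seq.fill(bytes.length / 4)(buf.getInt() & 0xffffffffL)
  }
}
```

For example, the first four bytes of the file (0x93, 0x04, 0x20, 0x4d) form the word 0x4d200493, the first instruction in the dump above.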
Then, create a custom test class in ca2023-lab3/src/test/scala/riscv/singlecycle/CPUTest.scala. This test checks the values of the registers (s0, s1, a0) after all instructions have been executed:
class SimpleTest extends AnyFlatSpec with ChiselScalatestTester {
  behavior.of("Simple Test")
  it should "Test Handwritten Assembly on MyCPU" in {
    test(new TestTopModule("simple.asmbin")).withAnnotations(TestAnnotations.annos) { c =>
      for (i <- 1 to 50) {
        c.clock.step()
        c.io.mem_debug_read_address.poke((i * 4).U) // Avoid timeout
      }
      c.io.mem_debug_read_address.poke(0x1000.U) // first instruction
      c.clock.step()
      println(f"mem[${c.io.mem_debug_read_address.peek().litValue}%08x]: ${c.io.mem_debug_read_data.peek().litValue}%08x")
      c.io.regs_debug_read_address.poke(8.U) // s0
      println(f"s0: ${c.io.regs_debug_read_data.peek().litValue}%08x")
      c.io.regs_debug_read_address.poke(9.U) // s1
      c.io.regs_debug_read_data.expect(1234.U) // s1
      c.io.regs_debug_read_address.poke(10.U) // a0
      c.io.regs_debug_read_data.expect(0x12345678L.U)
    }
  }
}
Then, execute this test on MyCPU by running sbt "testOnly riscv.singlecycle.SimpleTest", and we get the result:
$ sbt "testOnly riscv.singlecycle.SimpleTest"
[info] welcome to sbt 1.9.7 (Temurin Java 1.8.0_392)
[info] loading settings for project lab3-build from plugins.sbt ...
[info] loading project definition from /lab3/project
[info] loading settings for project root from build.sbt ...
[info] set current project to mycpu (in build file:/lab3/)
mem[00001000]: 4d200493
s0: 00000ffc
[info] SimpleTest:
[info] Simple Test
[info] - should Test Handwritten Assembly on MyCPU *** FAILED ***
[info] io_regs_debug_read_data=0 (0x0) did not equal expected=305419896 (0x12345678) (lines in CPUTest.scala: 136, 120) (CPUTest.scala:136)
[info] Run completed in 6 seconds, 923 milliseconds.
[info] Total number of tests run: 1
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 0, failed 1, canceled 0, ignored 0, pending 0
[info] *** 1 TEST FAILED ***
...
The debug message shows that the address of the .data section (stored in s0) is 0x0FFC. However, according to the loading flow explained above and the generated txt file:
@0
4d200493
@1
00001417
@2
ffc40413
@3
00042503
@4
0000006f
@5
12345678
@6
22345678
@7
32345678
@8
42345678
@9
00000013
@a
00000013
@b
00000013
The .text section should begin at 0x1000, and the .data section comes right after the .text section (5 instructions, 20 bytes), meaning that the .data section should start at 0x1014. We can verify this by hard-coding the address in the assembly:
...
.text
main:
    li s1, 1234
    li s0, 0x1014
    lw a0, 0(s0)
loop:
    j loop
And the result shows that our assumption is correct:
$ sbt "testOnly riscv.singlecycle.SimpleTest"
...
mem[00001000]: 4d200493
s0: 00001014
[info] SimpleTest:
[info] Simple Test
[info] - should Test Handwritten Assembly on MyCPU
[info] Run completed in 7 seconds, 434 milliseconds.
[info] Total number of tests run: 1
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[success]
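The hard-coded 0x1014 follows from the load address and the number of instructions in .text. As a small plain-Scala sketch of that arithmetic (the object name DataAddressModel is made up for illustration, and it only holds when .data is placed directly after .text in the asmbin file, as the xxd dump showed):

```scala
// Hypothetical helper for the address arithmetic used in this section.
object DataAddressModel {
  val EntryAddress = 0x1000 // load address used by ROMLoader in the test harness
  val WordBytes    = 4      // each instruction occupies one 32-bit word

  // Address of the first word after `textWords` instructions.
  def dataStart(textWords: Int): Int = EntryAddress + textWords * WordBytes
}
```

With the 5 instructions of simple.S, dataStart(5) is 0x1014, which is the value s0 holds in the passing run.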
TODO: solve the .data address problem