蕭力文
contributed by <liball>
The goal is to modify the project https://github.com/sysprog21/ca2023-lab3 and extend it, using Chisel, to support the full RV32I instruction set along with the CSR instructions (the Zicsr extension).
The implementation must be compatible with the test programs provided in https://github.com/sysprog21/rv32emu/tree/master/tests/perfcounter. Additionally, select at least three RISC-V programs from the course exercises, rewrite them, and ensure they run successfully on the enhanced processor.
Ubuntu Linux 24.04
Fix the permissions of the uploaded pictures.
After validating the instruction, if a jump is required, the PC is updated to the jump target address. Otherwise, it is incremented to PC + 4.
...
when(io.jump_flag_id) {
  pc := io.jump_address_id
}.otherwise {
  pc := pc + 4.U
}
...
Execute the command sbt "testOnly riscv.singlecycle.InstructionFetchTest" for testing.
The figure below shows that we pass the test successfully.
In the figure below, the initial instruction is set to 0x00000013, which represents the NOP instruction. The pc is initialized to 0x1000.
At the next positive clock edge, io_instruction_valid is set to HIGH, while io_jump_flag_id remains LOW. As a result, the pc increments to 0x1004 (pc + 4).
Next, we set both io_instruction_valid and io_jump_flag_id to HIGH. As shown, the pc returns to 0x1000, indicating that the jump was successfully executed.
The following code snippet demonstrates how the control signals for memory read and write operations are generated based on the instruction's opcode during the decode stage:
...
io.memory_read_enable := (opcode === InstructionTypes.L)
io.memory_write_enable := (opcode === InstructionTypes.S)
...
- io.memory_read_enable: This signal is asserted (true) when the instruction is a load (L) type, which indicates that the operation will read data from memory.
- io.memory_write_enable: This signal is asserted (true) when the instruction is a store (S) type, which means the operation will write data to memory.
By comparing the opcode with predefined constants (e.g., InstructionTypes.L and InstructionTypes.S), the decoder generates the appropriate control signals for memory operations. This lets the processor distinguish between read and write operations and handle them accordingly in the subsequent stages.
Execute the command sbt "testOnly riscv.singlecycle.InstructionDecoderTest" for testing.
The figure below shows that we pass the test successfully.
io_instruction is 0x00A02223 (0000 0000 1010 0000 0010 0010 0010 0011 in binary). From this we can see that:
- opcode = 0x23 = 0b0100011, so it is an S-type instruction.
- funct3 = 0b010 indicates that it is a sw instruction.
- imm = 0x04 = 4
- rs1 = 0x00 = x0; rs2 = 0x0A = x10
- io_memory_read_enable and io_reg_write are set to LOW; io_memory_write_enable is set to HIGH.
- io_ex_aluop1_source = 0 and io_ex_aluop2_source = 1, which select Register for the first ALU operand and Immediate for the second, as the code below shows.
io.ex_aluop1_source := Mux(
  opcode === Instructions.auipc || opcode === InstructionTypes.B || opcode === Instructions.jal,
  ALUOp1Source.InstructionAddress,
  ALUOp1Source.Register
)
io.ex_aluop2_source := Mux(
  opcode === InstructionTypes.RM,
  ALUOp2Source.Register,
  ALUOp2Source.Immediate
)
This instruction is sw x10, 4(x0).
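The field extraction walked through above can be checked with a small software model. This is a minimal sketch in plain Scala, not the Chisel decoder itself; the names STypeDecode, SType, and decode are hypothetical and chosen only for illustration.

```scala
// Software model of S-type field extraction (illustrative only, not the Chisel module).
object STypeDecode {
  final case class SType(opcode: Int, funct3: Int, rs1: Int, rs2: Int, imm: Int)

  def decode(inst: Long): SType = {
    val w     = (inst & 0xFFFFFFFFL).toInt // keep the low 32 bits
    val immLo = (w >>> 7) & 0x1F           // inst[11:7]  holds imm[4:0]
    val immHi = (w >>> 25) & 0x7F          // inst[31:25] holds imm[11:5]
    val imm12 = (immHi << 5) | immLo
    val imm   = (imm12 << 20) >> 20        // sign-extend the 12-bit immediate
    SType(
      opcode = w & 0x7F,
      funct3 = (w >>> 12) & 0x7,
      rs1    = (w >>> 15) & 0x1F,
      rs2    = (w >>> 20) & 0x1F,
      imm    = imm
    )
  }
}
```

For example, STypeDecode.decode(0x00A02223L) yields opcode 0x23, funct3 2, rs1 0, rs2 10, and imm 4, which matches sw x10, 4(x0).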
The following code snippet demonstrates how the ALU inputs are set in the Execute stage:
...
alu.io.func := alu_ctrl.io.alu_funct
alu.io.op1 := Mux(io.aluop1_source === 1.U, io.instruction_address, io.reg1_data)
alu.io.op2 := Mux(io.aluop2_source === 1.U, io.immediate, io.reg2_data)
...
- alu.io.func: This sets the ALU's function code, which determines what operation the ALU will perform. The function code is provided by the ALU control unit, which derives it from the decoded instruction's opcode and function fields.
- alu.io.op1: This selects the first operand for the ALU. The value is determined by the aluop1_source signal:
  - If aluop1_source is set to 1, it uses the instruction address (io.instruction_address) as the first operand.
  - Otherwise, it uses the register value (io.reg1_data).
- alu.io.op2: This selects the second operand for the ALU. Similarly to op1, the value is determined by the aluop2_source signal:
  - If aluop2_source is set to 1, it uses the immediate value (io.immediate) as the second operand.
  - Otherwise, it uses the register value (io.reg2_data).
Execute the command sbt "testOnly riscv.singlecycle.ExecuteTest" for testing.
The figure below shows that we pass the test successfully.
This test checks two main functionalities:
ADD instruction testing:
- io_instruction is 0x001101B3, which represents x3 = x2 + x1.
- The ADD test runs 100 times. Each time, it:
  - Sets the operands (reg1_data and reg2_data).
  - Checks that no jump is raised (if_jump_flag remains 0).
BEQ instruction testing:
- io_instruction is 0x00208163, which represents beq x1, x2, 2.
- Sets up the test conditions:
  - Sets instruction_address to 2
  - Sets immediate to 2
  - Selects the instruction address and the immediate as the ALU operands (aluop1_source and aluop2_source set to 1)
- Tests two scenarios:
  - Equal case:
    - Sets reg1_data and reg2_data to 9
    - Expects if_jump_flag to be 1 (branch taken)
    - Expects if_jump_address to be 4 (PC + 2)
  - Not equal case:
    - Sets reg1_data and reg2_data to 9 and 19 respectively
    - Expects if_jump_flag to be 0 (branch not taken)
    - Expects if_jump_address to be 4
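The BEQ expectations can be reproduced with a small software model, which also shows why if_jump_address is 4 in both scenarios: the target is always computed as instruction address plus immediate, and only the flag depends on the comparison. This is a sketch in plain Scala under that assumption; BeqModel and execute are hypothetical names, not part of the project.

```scala
// Software model of BEQ in the Execute stage (illustrative only).
object BeqModel {
  /** Returns (jump flag, jump address) for beq with the given inputs. */
  def execute(instructionAddress: Long, immediate: Long, reg1: Long, reg2: Long): (Boolean, Long) = {
    val target = (instructionAddress + immediate) & 0xFFFFFFFFL // ALU adds op1 = PC and op2 = immediate
    val taken  = reg1 == reg2                                   // branch condition for BEQ
    (taken, target)
  }
}
```

With instructionAddress = 2 and immediate = 2, the equal case (9, 9) gives (true, 4) and the not-equal case (9, 19) gives (false, 4).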
Connect the inputs of the Execute module to the outputs of the other modules, following the CPU architecture diagram below.
...
ex.io.instruction := inst_fetch.io.instruction
ex.io.instruction_address := inst_fetch.io.instruction_address
ex.io.reg1_data := regs.io.read_data1
ex.io.reg2_data := regs.io.read_data2
ex.io.immediate := id.io.ex_immediate
ex.io.aluop1_source := id.io.ex_aluop1_source
ex.io.aluop2_source := id.io.ex_aluop2_source
...
Execute the command sbt test to test the whole single-cycle CPU.
The figure below shows that we pass the test successfully.
The Control and Status Register (CSR) is a key feature of the RISC-V architecture. It provides a mechanism for managing system-level configuration, monitoring, and exception handling. CSRs are special-purpose registers used for tasks such as storing control bits, enabling interrupts, tracking performance counters, or handling trap and exception states. Unlike general-purpose registers, CSRs are accessed via special CSR instructions, which allow reading, writing, and modifying these registers atomically. A CSR is addressed with 12 bits, which means that there are up to 4096 (2^12) CSRs.
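The CSRs relevant to this work are:
- mstatus (Machine Status Register): holds the machine-mode status, including the global interrupt-enable bit.
- mtvec (Machine Trap Vector Register): holds the base address of the trap handler.
- mepc (Machine Exception Program Counter): holds the address of the instruction that was interrupted or that raised the exception. The hardware writes mepc during both interrupt and exception handling.
- mcause (Machine Cause Register): records the cause of the most recent trap.
- Performance counters (mcycle and minstret): count elapsed cycles (mcycle) and instructions retired (minstret), aiding in profiling and debugging.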
CSR instructions provide flexible control over these registers:
CSRRW:
Reads the old value of the CSR, zero-extends the value to XLEN bits, then writes it to integer register rd. The initial value in rs1 is written to the CSR. If rd = x0, then the instruction shall not read the CSR and shall not cause any of the side effects that might occur on a CSR read.
CSRRS:
Reads the value of the CSR, zero-extends the value to XLEN bits, and writes it to integer register rd. The initial value in integer register rs1 is treated as a bit mask that specifies bit positions to be set in the CSR. Any bit that is high in rs1 will cause the corresponding bit to be set in the CSR, if that CSR bit is writable.
CSRRC:
Reads the value of the CSR, zero-extends the value to XLEN bits, and writes it to integer register rd. The initial value in integer register rs1 is treated as a bit mask that specifies bit positions to be cleared in the CSR. Any bit that is high in rs1 will cause the corresponding bit to be cleared in the CSR, if that CSR bit is writable. Other bits in the CSR are unaffected.
CSRRWI:
Similar to CSRRW, except it updates the CSR using an XLEN-bit value obtained by zero-extending a 5-bit unsigned immediate (uimm[4:0]) field encoded in the rs1 field instead of a value from an integer register.
CSRRSI:
Similar to CSRRS, except it updates the CSR using an XLEN-bit value obtained by zero-extending a 5-bit unsigned immediate (uimm[4:0]) field encoded in the rs1 field instead of a value from an integer register.
CSRRCI:
Similar to CSRRC, except it updates the CSR using an XLEN-bit value obtained by zero-extending a 5-bit unsigned immediate (uimm[4:0]) field encoded in the rs1 field instead of a value from an integer register.
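The read-modify-write behavior of the six instructions can be summarized in a small software model. This is a sketch in plain Scala for XLEN = 32, not the hardware implementation; CsrOps and execute are hypothetical names, and the model deliberately ignores the rd = x0 and rs1 = x0 side-effect rules and treats every CSR bit as writable.

```scala
// Software model of Zicsr read-modify-write semantics for XLEN = 32 (illustrative only).
object CsrOps {
  private val Mask = 0xFFFFFFFFL

  // funct3 encodings defined by the Zicsr extension
  final val CSRRW  = 1
  final val CSRRS  = 2
  final val CSRRC  = 3
  final val CSRRWI = 5
  final val CSRRSI = 6
  final val CSRRCI = 7

  /** Returns (value written to rd, new CSR value).
    * `src` is the value of rs1 for register forms, or uimm for immediate forms.
    */
  def execute(funct3: Int, csrOld: Long, src: Long): (Long, Long) = {
    // Immediate forms zero-extend the 5-bit uimm field.
    val operand = if (funct3 >= CSRRWI) src & 0x1F else src & Mask
    val csrNew = funct3 match {
      case CSRRW | CSRRWI => operand                   // overwrite
      case CSRRS | CSRRSI => csrOld | operand          // set the masked bits
      case CSRRC | CSRRCI => csrOld & ~operand & Mask  // clear the masked bits
      case _              => csrOld                    // not a CSR instruction: no change
    }
    (csrOld & Mask, csrNew & Mask)
  }
}
```

For instance, CsrOps.execute(CsrOps.CSRRSI, 15L, 16L) returns (15, 31): rd receives the old value and the CSR gains bit 4.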
The Core Local Interrupt (CLINT) handles interrupts, such as timer and software-generated interrupts. It plays a critical role in managing timer-based events and enabling communication between cores in multi-core systems.
Interrupt Management:
CLINT coordinates local interrupts specific to each core and processes requests from software or timers.
Timer Facilities:
CLINT includes programmable timers to support time-sensitive tasks like scheduling and context switching.
Software-Generated Interrupts:
Enables inter-core communication and task signaling by allowing software to trigger interrupts.
Timer Registers:
- mtime: A global timer that holds the current machine time.
- mtimecmp: Stores a compare value. When mtime is greater than or equal to mtimecmp, a timer interrupt is generated.
Interrupt Registers:
- msip (Machine Software Interrupt Pending): Indicates pending software interrupts for the core.
Memory-Mapped Registers: CLINT registers are exposed via memory-mapped I/O, allowing software and peripherals to configure them.
Timer Interrupt Generation:
When the mtime value reaches or exceeds mtimecmp, the CLINT signals the core by raising an interrupt.
Interrupt Handling Process:
Upon an interrupt:
Software Interrupts:
Software can trigger interrupts by writing to msip, enabling communication and synchronization between cores.
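The timer comparison can be sketched as a small software model. It follows the privileged specification, where the timer interrupt is pending whenever mtime is greater than or equal to mtimecmp, and it compares the two 64-bit registers as unsigned values. The names ClintTimer and timerInterruptPending are hypothetical, and this is not the CLINT module used in this project.

```scala
// Software model of the machine timer comparison (illustrative only).
object ClintTimer {
  /** mtime and mtimecmp are 64-bit unsigned registers, stored here in signed Longs. */
  def timerInterruptPending(mtime: Long, mtimecmp: Long): Boolean =
    java.lang.Long.compareUnsigned(mtime, mtimecmp) >= 0
}
```

An unsigned comparison matters once either register has its top bit set; a plain signed comparison would treat such values as negative.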
The figure above shows the target CPU architecture. Reference
To handle interrupts, additional signals are introduced: interrupt_assert and interrupt_handler_address.
- interrupt_assert: When interrupt_assert is set to 1, it signifies that the CPU must handle an interrupt.
- interrupt_handler_address: When interrupt_assert is set to 1, the pc jumps to the interrupt_handler_address, redirecting execution to the interrupt handler routine.
Interrupt handling is given the highest priority. If both a jump and an interrupt occur simultaneously, the interrupt takes precedence, as it is checked before the jump condition.
The implementation is shown below.
...
when(io.interrupt_assert) {
  pc := io.interrupt_handler_address
}.elsewhen(io.jump_flag_id) {
  pc := io.jump_address_id
}.otherwise {
  pc := pc + 4.U
}
...
The test has been modified to include the ability to verify interrupt handling functionality.
...
case 2 => // interrupt
  c.io.interrupt_assert.poke(true.B)
  c.io.interrupt_handler_address.poke(interruptHandlerAddress)
  c.clock.step()
  c.io.instruction_address.expect(interruptHandlerAddress)
  pre = interruptHandlerAddress
  c.io.interrupt_assert.poke(false.B) // clear interrupt after handling
...
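The priority order of the fetch stage (interrupt first, then jump, then sequential) can be checked with a software model of the next-PC selection. This is a sketch in plain Scala that mirrors the when / elsewhen / otherwise chain; FetchPriority and nextPc are hypothetical names.

```scala
// Software model of next-PC selection with interrupt priority (illustrative only).
object FetchPriority {
  def nextPc(
      pc: Long,
      interruptAssert: Boolean,
      interruptHandlerAddress: Long,
      jumpFlag: Boolean,
      jumpAddress: Long
  ): Long =
    if (interruptAssert) interruptHandlerAddress // highest priority
    else if (jumpFlag) jumpAddress               // taken branch or jump
    else (pc + 4) & 0xFFFFFFFFL                  // sequential fetch, wrapped to 32 bits
}
```

When both interruptAssert and jumpFlag are true, the result is the handler address, which is the case the modified test exercises.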
csr[31:20] | rs1/uimm[19:15] | funct3[14:12] | rd[11:7] | opcode[6:0]
When a CSR instruction is decoded, csr_address specifies the address of the target CSR register, and csr_write_enable indicates whether the instruction needs to write to the specified CSR. The execution stage uses these signals to determine whether to read, modify, or write to the CSR register according to the instruction's requirements.
- csr_address: csr_address is bits [31:20] of io.instruction.
- csr_write_enable: When opcode === Instructions.csr and funct3 is one of csrrw, csrrwi, csrrs, csrrsi, csrrc, csrrci, it is set to 1.
...
val csr_write_enable = Output(Bool())
val csr_address = Output(UInt(Parameters.CSRRegisterAddrWidth))
...
...
io.csr_address := io.instruction(31, 20)
io.csr_write_enable := (opcode === Instructions.csr) && (
  funct3 === InstructionsTypeCSR.csrrw || funct3 === InstructionsTypeCSR.csrrwi ||
  funct3 === InstructionsTypeCSR.csrrs || funct3 === InstructionsTypeCSR.csrrsi ||
  funct3 === InstructionsTypeCSR.csrrc || funct3 === InstructionsTypeCSR.csrrci
)
...
The RISC-V Instruction Set Manual Volume I (p.46~p.48) specifies that:
- CSRRW and CSRRWI: If rd == x0, then the instruction shall not read the CSR and shall not cause any of the side effects that might occur on a CSR read.
- CSRRS and CSRRC: If rs1 == x0, then the instruction will not write to the CSR at all, and so shall not cause any of the side effects that might otherwise occur on a CSR write, such as raising illegal instruction exceptions on accesses to read-only CSRs.
- CSRRSI and CSRRCI: If the uimm[4:0] field is zero, then these instructions will not write to the CSR, and shall not cause any of the side effects that might otherwise occur on a CSR write. Both instructions always read the CSR and cause any read side effects regardless of the rd and rs1 fields.
The decoder above asserts csr_write_enable without checking for the rs1 == x0 situation.
TODO: Handle the rd == x0 and rs1 == x0 situations.
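One possible way to approach the TODO is to derive separate read and write enables from funct3, rd, and the rs1/uimm field. The rules above depend on the register specifier (instruction bits [19:15]), not on the value held in the register. This is a sketch in plain Scala, not the project's decoder; CsrAccessRules, Access, and access are hypothetical names.

```scala
// Software model of the Zicsr read/write side-effect rules (illustrative only).
object CsrAccessRules {
  final case class Access(readCsr: Boolean, writeCsr: Boolean)

  /** funct3 is inst[14:12], rd is inst[11:7], rs1OrUimm is inst[19:15]. */
  def access(funct3: Int, rd: Int, rs1OrUimm: Int): Access = funct3 match {
    // CSRRW / CSRRWI: always write; skip the read when rd is x0
    case 1 | 5 => Access(readCsr = rd != 0, writeCsr = true)
    // CSRRS / CSRRC / CSRRSI / CSRRCI: always read; skip the write when the field is zero
    case 2 | 3 | 6 | 7 => Access(readCsr = true, writeCsr = rs1OrUimm != 0)
    // Any other funct3 is not a CSR access
    case _ => Access(readCsr = false, writeCsr = false)
  }
}
```

In hardware, the same conditions could be ANDed into csr_write_enable, with a matching read enable added if CSR reads ever gain side effects.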
Create a test of the instruction csrrw x0, mtvec, x0.
...
c.io.instruction.poke(0x30501073L.U) // csrrw x0, mtvec, x0
c.io.wb_reg_write_source.expect(RegWriteSource.CSR)
c.io.regs_reg1_read_address.expect(0.U)
c.io.csr_address.expect(0x305.U) // CSR address mtvec
c.io.csr_write_enable.expect(true.B) // CSR write enable should be enabled
c.clock.step()
...
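The instruction words poked in the tests can be derived from the field layout csr[31:20] | rs1/uimm[19:15] | funct3[14:12] | rd[11:7] | opcode[6:0]. The encoder below is a sketch in plain Scala for building such test vectors; CsrEncode and encode are hypothetical names, and 0x73 is the SYSTEM opcode shared by all CSR instructions.

```scala
// Encoder for CSR instruction words, useful for writing test vectors (illustrative only).
object CsrEncode {
  final val SystemOpcode = 0x73

  def encode(csr: Int, rs1OrUimm: Int, funct3: Int, rd: Int): Long = {
    val word =
      ((csr & 0xFFF) << 20) |       // csr[31:20]
      ((rs1OrUimm & 0x1F) << 15) |  // rs1 or uimm[19:15]
      ((funct3 & 0x7) << 12) |      // funct3[14:12]
      ((rd & 0x1F) << 7) |          // rd[11:7]
      SystemOpcode                  // opcode[6:0]
    word.toLong & 0xFFFFFFFFL       // present the 32-bit word as an unsigned value
  }
}
```

With mtvec at address 0x305, encode(0x305, 0, 1, 0) gives 0x30501073 (csrrw x0, mtvec, x0) and encode(0x305, 16, 6, 1) gives 0x305860f3 (csrrsi x1, mtvec, 0x10).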
Implement each of the CSR instructions.
csrrw rd, csr, rs1:
- Read the old value of the csr to rd.
- Write the value of rs1 to the csr.
- If rd == x0, the csr is not read; the write to the csr still takes place.
csrrs rd, csr, rs1:
- Read the value of the csr to rd.
- The value of rs1 is bitwise ORed with the current value in the csr, and the result is written back to the csr.
- If rs1 == x0, the csr remains unchanged.
csrrc rd, csr, rs1:
- Read the value of the csr to rd.
- The complement of the value of rs1 is bitwise ANDed with the current value in the csr, and the result is written back to the csr.
- If rs1 == x0, the csr remains unchanged.
csrrwi rd, csr, uimm:
- Read the old value of the csr to rd.
- Write uimm to the csr. (uimm is only 5 bits wide, so it must be zero-extended to 32 bits.)
- If rd == x0, the csr is not read; the write to the csr still takes place.
csrrsi rd, csr, uimm:
- Read the value of the csr to rd.
- uimm is bitwise ORed with the current value in the csr, and the result is written back to the csr. (uimm is only 5 bits wide, so it must be zero-extended to 32 bits.)
- If uimm == 0, the csr remains unchanged.
csrrci rd, csr, uimm:
- Read the value of the csr to rd.
- The complement of uimm is bitwise ANDed with the current value in the csr, and the result is written back to the csr. (uimm is only 5 bits wide, so it must be zero-extended to 32 bits.)
- If uimm == 0, the csr remains unchanged.
...
val csr_reg_read_data = Input(UInt(Parameters.DataWidth))
val csr_reg_write_data = Output(UInt(Parameters.DataWidth))
...
...
io.csr_reg_write_data := MuxLookup(
  funct3,
  0.U,
  IndexedSeq(
    InstructionsTypeCSR.csrrw  -> io.reg1_data,
    InstructionsTypeCSR.csrrs  -> (io.csr_reg_read_data | io.reg1_data),
    InstructionsTypeCSR.csrrc  -> (io.csr_reg_read_data & ~(io.reg1_data)),
    InstructionsTypeCSR.csrrwi -> io.immediate,
    InstructionsTypeCSR.csrrsi -> (io.csr_reg_read_data | io.immediate),
    InstructionsTypeCSR.csrrci -> (io.csr_reg_read_data & ~(io.immediate))
  )
)
...
Create a test of the instruction csrrsi x1, mtvec, 0x10.
...
c.io.csr_reg_read_data.poke(15.U)
c.io.immediate.poke(16.U)
c.io.instruction.poke(0x305860f3L.U)
c.clock.step()
c.io.csr_reg_write_data.expect(31.U)
...
In the write-back multiplexer, input 0 is alu_result; 1 is memory_read_data; 2 is csr_reg_read_data; 3 is instruction_address + 4; the control signal is regs_write_source.
...
val csr_read_data = Input(UInt(Parameters.DataWidth))
...
...
io.regs_write_data := MuxLookup(
  io.regs_write_source,
  io.alu_result,
  IndexedSeq(
    RegWriteSource.Memory                 -> io.memory_read_data,
    RegWriteSource.CSR                    -> io.csr_read_data,
    RegWriteSource.NextInstructionAddress -> (io.instruction_address + 4.U)
  )
)
...
Connect the components together according to the single-cycle CPU architecture diagram.
...
val csr = Module(new CSR)
val clint = Module(new CLINT)
...
...
inst_fetch.io.jump_address_id := Mux(clint.io.interrupt_assert === 1.U, clint.io.interrupt_handler_address, ex.io.if_jump_address)
inst_fetch.io.jump_flag_id := (ex.io.if_jump_flag | clint.io.interrupt_assert)
inst_fetch.io.interrupt_assert := clint.io.interrupt_assert
inst_fetch.io.interrupt_handler_address := clint.io.interrupt_handler_address
...
csr.io.reg_read_address_id := id.io.csr_address
csr.io.reg_write_enable_id := id.io.csr_write_enable
csr.io.reg_write_address_id := id.io.csr_address
csr.io.reg_write_data_ex := ex.io.csr_reg_write_data
clint.io.Interrupt_Flag := io.Interrupt_Flag
clint.io.Instruction := inst_fetch.io.instruction
clint.io.IF_Instruction_Address := inst_fetch.io.instruction_address
clint.io.jump_flag := ex.io.if_jump_flag
clint.io.jump_address := ex.io.if_jump_address
...
ex.io.csr_reg_read_data := csr.io.reg_read_data
...
wb.io.csr_read_data := csr.io.reg_read_data
...
Execute the command sbt test.
Failed tests:
The RISC-V Instruction Set Manual Volume I: Unprivileged ISA
The RISC-V Instruction Set Manual Volume II: Privileged Architecture
RISC-V Architecture Instruction Encoding
YatCPU