# Assignment 3: RISC-V CPU

contributed by < [`Shih-Hsuan`](https://github.com/Shih-Hsuan) >

Full Code < [`Shih-Hsuan/ca2025-mycpu`](https://github.com/Shih-Hsuan/ca2025-mycpu) >

---

## Chisel Bootcamp

### Chisel Module vs Verilog Module

* Chisel Module
    * `class ... extends Module` : All hardware modules must extend `Module`.
    * `val io = IO(...)` : Declares the input and output ports. The input/output bundle must be assigned to a variable explicitly named `io`.
    * `new Bundle` : Contains the Input and Output signals.
    * `UInt(4.W)` : Unsigned integer of width 4.
    * `:=` : The right-hand signal drives the left-hand signal.

```scala=
import chisel3._

class Passthrough extends Module {
  val io = IO(new Bundle {
    val in  = Input(UInt(4.W))
    val out = Output(UInt(4.W))
  })
  io.out := io.in
}
```

* Verilog Module
    * Elaboration : Use Scala to call the Chisel compiler, which translates the Chisel module into Verilog.
    * The Verilog listing below is simplified: the Verilog that Chisel actually emits also has `clock` and `reset` ports and names the signals `io_in` and `io_out`.

```scala
println(getVerilog(new Passthrough))
```
```verilog=
module Passthrough(
  input  [3:0] in,
  output [3:0] out
);
  assign out = in;
endmodule
```

### Testing Hardware

* `test` : Runs the unit test.
* `poke` : Sets an input value.
* `expect` : Compares an output to an expected value.

```scala=
test(new Passthrough()) { c =>
  c.io.in.poke(0.U)     // Set our input to value 0
  c.io.out.expect(0.U)  // Assert that the output correctly has 0
}
println("SUCCESS!!") // Scala Code: if we get here, our tests passed!
```

### Combinational Logic

#### Three of the basic Chisel types

These types may be connected and operated upon:

* `UInt` - unsigned integer.
* `SInt` - signed integer.
* `Bool` - true or false.

#### Multiplexer

**Latch Prevention**: Unlike Verilog, where incomplete conditional assignments in combinational logic implicitly infer latches, Chisel enforces strict initialization rules. If a `Wire` is assigned within a `when` block without a default value or a corresponding `.otherwise` block, the compiler throws a "not fully initialized" error at compile time. This safety mechanism forces designers to explicitly define behavior for all conditions, typically by setting a default value before the conditional logic, resulting in clean combinational logic (multiplexers) instead of unintended latches.

**Conditional Logic Abstraction**: Chisel offers two primary ways to express conditional combinational logic. For simple 2-to-1 selections, the functional `Mux(select, true_val, false_val)` is preferred for its conciseness. For more complex control flow, the `when` and `.otherwise` constructs are used, which resemble Verilog's `if-else` blocks. Unlike Verilog, Chisel needs no explicit `always @(*)` sensitivity list: combinational dependencies are inferred from signal connectivity, which eliminates a common source of bugs.

* Mux in Chisel

Unlike Scala `if`, the blocks associated with `when` do not return values.

```scala=
when(io.select) {
  io.out := io.a
} .otherwise {
  io.out := io.b
}
// A when / .elsewhen / .otherwise chain extends this to 3-to-1 and wider selections
```
```scala=
// true.B and false.B are the preferred ways to create Chisel Bool literals.
val select = true.B
io.out := Mux(select, io.a, io.b) // select is true.B, so io.out = io.a
```

* Mux in Verilog ( There are three ways to describe a 2-to-1 multiplexer )

```verilog=
always @(a or b or select) begin
  if (select)
    out = a;
  else
    out = b;
end
```
```verilog=
always @(a or b or select) begin
  out = b;
  if (select)
    out = a;
end
```
```verilog=
assign out = select ? a : b;
```

#### Concatenation

Chisel replaces Verilog's brace syntax ``{msb, lsb}`` with the `Cat(msb, lsb)` utility for signal concatenation. The argument order follows the same structure: the first argument becomes the most significant bits (MSB) and the last argument becomes the least significant bits (LSB). `Cat` simplifies combining multiple signals into a single wider bus and automatically handles output width calculation.
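To make the MSB/LSB ordering concrete, here is a minimal plain-Scala sketch of the bit arithmetic that a two-argument concatenation performs. `CatModel` and `cat` are hypothetical names introduced only for this illustration; they are a software model and not part of the Chisel API.

```scala
object CatModel {
  /** Software model of a two-input concatenation (hypothetical helper, not Chisel).
    * `high` lands in the upper bits, `low` in the lower `lowWidth` bits.
    */
  def cat(high: Int, low: Int, lowWidth: Int): Int = {
    val lowMask = (1 << lowWidth) - 1      // keep only the low field's bits
    (high << lowWidth) | (low & lowMask)   // shift the MSB part above the LSB part
  }

  def main(args: Array[String]): Unit = {
    // 4'b1010 followed by 4'b0011 gives 8'b1010_0011
    val out = cat(0xA, 0x3, 4)
    println(f"0x$out%02X") // prints 0xA3
  }
}
```

The result is 8 bits wide because the widths of the two fields add up (4 + 4), which is the width calculation that `Cat` performs automatically in hardware.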
* Concatenation in Chisel

```scala=
val high = "b1010".U(4.W)
val low  = "b0011".U(4.W)
val out  = Wire(UInt(8.W))
out := Cat(high, low)
```

* Concatenation in Verilog

```verilog=
wire [3:0] high = 4'b1010;
wire [3:0] low  = 4'b0011;
wire [7:0] out;
assign out = {high, low};
```

### Finite State Machine

In Verilog, implementing an FSM typically requires verbose boilerplate code, including explicit local parameter definitions for states and separate `always` blocks for sequential state updates and combinational next-state logic. Chisel simplifies this through the `Enum` constructor, which automatically handles state encoding, and `RegInit`, which implicitly manages clock and reset connectivity. The `switch` and `is` constructs in Chisel allow a more readable and concise description of state transitions than Verilog's `case` statements.

* FSM in Chisel

```scala=
// 1. Define states
val idle :: s1 :: s2 :: Nil = Enum(3)

// 2. Define the state register, initialized to idle
//    (RegInit connects clock and reset implicitly)
val state = RegInit(idle)

// 3. Next state logic
switch (state) {
  is (idle) {
    when(io.in) { state := s1 }
  }
  is (s1) {
    state := s2
  }
  is (s2) {
    state := idle
  }
}

// 4. Output logic
io.out := (state === s2)
```

* FSM in Verilog

```verilog=
// 1. Define states
localparam IDLE = 2'b00;
localparam S1   = 2'b01;
localparam S2   = 2'b10;

// 2. Define registers
reg [1:0] state, next_state;

// 3. State register (Sequential Logic)
always @(posedge clk) begin
  if (reset)
    state <= IDLE;
  else
    state <= next_state;
end

// 4. Next state logic
always @(*) begin
  case (state)
    IDLE: begin
      if (in)
        next_state = S1;
      else
        next_state = IDLE;
    end
    S1: begin
      next_state = S2;
    end
    S2: begin
      next_state = IDLE;
    end
    default: next_state = IDLE;
  endcase
end

// 5. Output logic
always @(*) begin
  out = (state == S2);
end
```

### Sequential Logic

#### Registers

`Reg` holds its output value until the rising edge of its clock, at which time it takes on the value of its input.
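The hold-then-update behavior can be sketched as a small cycle-level software model in plain Scala. `Reg12Model`, `out`, and `tick` are hypothetical names for this illustration only, and the model assumes an initial value of 0, whereas an uninitialized `Reg` has no defined reset value.

```scala
/** Software model of a 12-bit rising-edge register (hypothetical, not a Chisel API). */
final class Reg12Model {
  private var q: Int = 0 // assumed initial value for the model

  /** Value currently visible on the register output. */
  def out: Int = q

  /** Models one rising clock edge: the output takes on the input, truncated to 12 bits. */
  def tick(in: Int): Unit = {
    q = in & 0xFFF
  }
}

object Reg12Demo {
  def main(args: Array[String]): Unit = {
    val r = new Reg12Model
    println(r.out)    // output holds its old value until an edge arrives
    r.tick(5 + 1)     // models `register := io.in + 1.U` with io.in = 5
    println(r.out)
    r.tick(0xFFF + 1) // the 12-bit result wraps around to 0
    println(r.out)
  }
}
```

Between calls to `tick`, `out` never changes, which mirrors a register ignoring its input until the next clock edge.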
* Chisel vs Verilog

```scala=
val register = Reg(UInt(12.W))
register := io.in + 1.U
```
```verilog=
reg [11:0] register;
always @(posedge clock) begin
  register <= io_in + 12'h1;
end
```

* `RegInit()` : The register is initialized to zero on reset.

```scala=
val register = RegInit(0.U(12.W))
register := io.in + 1.U
```
```verilog=
reg [11:0] register;
always @(posedge clock) begin
  if (reset)
    register <= 0;
  else
    register <= io_in + 12'h1;
end
```

---

## `1-single-cycle`

### Test Case Analysis

* `WRITE_VCD=1 sbt` : Generates waveform files during testing.
* `gtkwave test_run_dir/... .vcd` : Opens the waveform.

#### 1. Instruction Fetch

This test validates the priority logic of the PC multiplexer (Sequential vs. Jump).

Mechanism:

* Input: A randomized `jump_flag` (0 or 1) is generated in each cycle.
* Condition A (Flag = 0): Expect `PC_next = PC_current + 4` (Sequential).
* Condition B (Flag = 1): Expect `PC_next = jump_target`.

Note : For validation purposes, the `jump_target` is fixed to the entry address (`0x1000`) throughout this test loop.

* `jump_flag` = `0` : Sequential

![image](https://hackmd.io/_uploads/rkRbW4tGWe.png)

* `jump_flag` = `1` : At `14` ps, `jump_flag` is asserted and the PC updates to `0x1000` on the next cycle.

![image](https://hackmd.io/_uploads/SJ3lfVKM-g.png)

> The `jump_flag` updates at the **negedge** to satisfy setup time requirements, guaranteeing signal stability prior to the subsequent rising edge.

#### 2. Instruction Decode

The decoder maps each instruction to the following control signals.

* `aluop1_source` :
    * `0` : `reg1`
    * `1` : `pc`
* `aluop2_source` :
    * `0` : `reg2`
    * `1` : `immediate`
* `wb_reg_write_source` :
    * `00` : ALU result
    * `01` : memory read data
    * `10` : `pc + 4`, the next instruction address

##### 1. Load instruction (I-type): `lw x3, 4(x1)`

* **Instruction Code:** `0x0040a183`
* **Behavior:** Loads data from memory address `x1 + 4` and stores it into `x3`.
* **Key Checks:**
    * **ALU Sources:** `Reg` (x1) + `Imm` (4) used for address calculation.
    * **Write-back Source:** Must be `Memory`.
        * `wb_reg_write_source` = `01`.
        * `reg_write_enable` = `1`.
    * **Memory Signals:** `read_enable` = `1`, `write_enable` = `0`.

![image](https://hackmd.io/_uploads/SJ8P1JcGZg.png =500x)

##### 2. Store Instruction (S-type): `sw x10, 4(x0)`

* **Instruction Code:** `0x00a02223`
* **Behavior:** Stores the value of `x10` into memory address `x0 + 4`.
* **Key Checks:**
    * **ALU Sources:** `Reg` (x0) + `Imm` (4) used for address calculation.
    * **Register Read:** Reads two registers (`x0` as base, `x10` as data).
    * **Memory Signals:** `write_enable` is **True**.
    * **Register Write:** `reg_write_enable` is **False** (stores do not write back to registers).

![image](https://hackmd.io/_uploads/HJFiD1qz-e.png =500x)

##### 3. ALU Immediate Instruction (I-type): `andi x3, x9, 24`

* **Instruction Code:** `0x0184f193`
* **Behavior:** Calculates `x9 AND 24` and stores the result in `x3`.
* **Key Checks:**
    * **ALU Sources:** `Reg` (x9) and `Imm` (24, 0x18).
    * **Write-back Source:** `ALUResult`.
        * `wb_reg_write_source` = `00`.
        * `reg_write_enable` = `1`.

![image](https://hackmd.io/_uploads/HyH4_1cM-x.png =500x)

##### 4. Branch Instruction (B-type): `bge x2, x4, 16`

* **Instruction Code:** `0x00415863`
* **Behavior:** If `x2 >= x4`, the PC jumps to `PC + 16`.
* **Key Checks:**
    * **ALU Sources:** Configured as `InstructionAddress` (PC) + `Imm` (16, 0x10) to calculate the **jump target address**.
    * **Register Read:** Reads `x2` and `x4` for comparison.

![image](https://hackmd.io/_uploads/Bk4YcJ5GWl.png =500x)

##### 5. Register Operation Instruction (R-type): `add x3, x1, x2`

* **Instruction Code:** `0x002081b3`
* **Behavior:** `x3 = x1 + x2`.
* **Key Checks:**
    * **ALU Sources:** Both sources are `Register` (`aluop1_source` = `0`, `aluop2_source` = `0`).

![image](https://hackmd.io/_uploads/BklKnJ9zbg.png =500x)

##### 6. LUI Instruction (U-type): `lui x5, 2`

* **Instruction Code:** `0x000022b7`
* **Behavior:** Left-shifts the immediate value `2` by 12 bits and stores it in `x5`.
* **Key Checks:**
    * **ALU Handling:** Treated as `x0 + Imm`. Source 1 is `Register` (reading x0 yields 0), and Source 2 is `Imm`.
    * **Immediate Value:** The decoded immediate should be `2 << 12`.

![image](https://hackmd.io/_uploads/B1VahkcMbg.png =500x)

##### 7. JAL Instruction (J-type): `jal x5, 8`

* **Instruction Code:** `0x008002ef`
* **Behavior:** Jumps to `PC + 8` and stores the next instruction address (`PC + 4`) in `x5`.
* **Key Checks:**
    * **ALU Sources:** `PC` + `Imm` (calculates the jump target).
    * **Write-back Source:** `NextInstructionAddress` (this performs the link operation).
        * `wb_reg_write_source` = `10`.
        * `reg_write_enable` = `1`.

![image](https://hackmd.io/_uploads/rk1vaycMZe.png =500x)

##### 8. JALR Instruction (I-type): `jalr x5, x1, 8`

* **Instruction Code:** `0x008082e7`
* **Behavior:** Jumps to `x1 + 8` and stores the next instruction address in `x5`.
* **Key Checks:**
    * **ALU Sources:** `Register` (x1) + `Imm` (8). (**Note:** This differs from JAL; JALR uses a register as the base address.)
    * **Write-back Source:** `NextInstructionAddress`.
        * `wb_reg_write_source` = `10`.
        * `reg_write_enable` = `1`.

![image](https://hackmd.io/_uploads/H1xapy5f-g.png =550x)

##### 9. AUIPC Instruction (U-type): `auipc x2, 7`

* **Instruction Code:** `0x00007117`
* **Behavior:** `x2 = PC + (7 << 12)`.
* **Key Checks:**
    * **ALU Sources:** `InstructionAddress` (PC) + `Imm`.

![image](https://hackmd.io/_uploads/B13mAk9GZl.png =400x)

#### 3. Execute

This test primarily validates two core functions of the CPU during the execution phase: **ALU Operations** and **Branch Logic**.

##### 1. ALU Arithmetic Test (ADD Instruction)

This section uses **Random Testing** to verify the adder logic.

* **Instruction Setup:** `poke(0x001101b3L.U)` corresponds to the instruction `ADD x3, x2, x1`.
* **Test Logic:**
    * Runs a loop **100 times**.
    * In each iteration, two random integers, `op1` and `op2`, are generated.
    * **Software Calculation:** The correct result is pre-calculated in Scala: `result = op1 + op2`.
    * **Hardware Execution:** These two numbers are fed into the inputs of the Execute module (`reg1_data`, `reg2_data`).
* **Verification:**
    * `c.io.mem_alu_result.expect(result.U)`: Checks that the hardware-calculated result matches the software calculation.
    * `c.io.if_jump_flag.expect(0.U)`: Confirms that the jump signal is **not** falsely triggered during an addition operation.

##### 2. Branch Logic Test (BEQ Instruction)

This section tests the conditional branch instruction (**Branch Equal**), verifying whether the CPU can correctly distinguish between "taken" and "not taken" branches.

* **Instruction Setup:** `poke(0x00208163L.U)` corresponds to `BEQ` (branch if equal).
* **Environment Setup:** Sets the current PC (`instruction_address`) to 2, and the jump offset (`immediate`) to 2.
* **Test Scenario A: Equal (Should Jump)**
    * **Input:** `reg1 = 9`, `reg2 = 9` (values are equal).
    * **Expected Result:**
        * `if_jump_flag` is **1** (True, notifying the IF stage to jump).
        * `if_jump_address` is **4** (Calculation: `PC + Imm = 2 + 2 = 4`).

![image](https://hackmd.io/_uploads/SyhwkxqMZg.png =550x)

* **Test Scenario B: Not Equal (Should Not Jump)**
    * **Input:** `reg1 = 9`, `reg2 = 19` (values are not equal).
    * **Expected Result:**
        * `if_jump_flag` is **0** (False, continuing sequential execution).
        * *(Note: Even if the branch is not taken, the hardware typically still calculates the target address. Therefore, the test verifies the address is 4, but the flag being 0 indicates this address will be disregarded.)*

![image](https://hackmd.io/_uploads/rJB0lg9GZe.png =550x)

#### 4. Register File

**RegisterFile Unit Test Summary :**

* **Basic Read/Write Verification:** Confirms that data written to general-purpose registers is correctly stored and accurately retrieved.
* **Hardwired Zero Compliance (x0):** Ensures register `x0` remains constant at 0 and ignores all write attempts, strictly adhering to the RISC-V ISA.
* **Data Retention:** Verifies that registers preserve their stored state correctly after the write-enable signal is de-asserted.

---

## `2-mmio-trap`

### Test Result

**1. `sbt test`**

* All tests passed
    * ByteAccessTest
    * CLINTCSRTest
    * UartMMIOTest
    * ExecuteTest
    * FibonacciTest
    * TimerTest
    * InterruptTrapTest
    * QuicksortTest

:::spoiler show sbt test result
```bash
sbt:mycpu-mmio-trap> test
[info] compiling 2 Scala sources to /home/p4/ca2025-mycpu/2-mmio-trap/target/scala-2.13/test-classes ...
[info] ByteAccessTest:
[info] [CPU] Byte access program
[info] - should store and load single byte
[info] CLINTCSRTest:
[info] [CLINT] Machine-mode interrupt flow
[info] - should handle external interrupt
[info] - should handle environmental instructions
[info] UartMMIOTest:
[info] [UART] Comprehensive TX+RX test
[info] - should pass all TX and RX tests
[info] ExecuteTest:
[info] [Execute] CSR write-back
[info] - should produce correct data for csr write
[info] FibonacciTest:
[info] [CPU] Fibonacci program
[info] - should calculate recursively fibonacci(10)
[info] TimerTest:
[info] [Timer] MMIO registers
[info] - should read and write the limit
[info] InterruptTrapTest:
[info] [CPU] Interrupt trap flow
[info] - should jump to trap handler and then return
[info] QuicksortTest:
[info] [CPU] Quicksort program
[info] - should quicksort 10 numbers
[info] Run completed in 1 minute, 2 seconds.
[info] Total number of tests run: 9
[info] Suites: completed 8, aborted 0
[info] Tests: succeeded 9, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
```
:::

**2. VGA Display Demo**

* Ubuntu Linux : `sudo apt install libsdl2-dev`
* `make demo` :

![image](https://hackmd.io/_uploads/H1oGFsZfbx.png =500x)

---

## `3-pipeline`

### [CA25: Exercise 21] Hazard Detection Summary and Analysis

#### Q1 : Why do we need to stall for load-use hazards?

* **A :** The data retrieved by a `LW` instruction is only available after the MEM stage completes. If the immediately following instruction requires this data in its EX stage (or even earlier, in the ID stage), a timing gap exists. Even with data forwarding, the data cannot be delivered to the ALU input in time for the next clock cycle. Therefore, the pipeline must stall for one cycle to wait for the data to become available from memory.
* Without stall : `ADD` enters EX in C4, but `LW` is still reading memory in C4, so the value of `x1` does not exist yet.

| Instruction | C1 | C2 | C3 | C4 | C5 | C6 |
| ---- | --- | --- | --- | :---: | :---: | --- |
| `LW x1, 0(x2)` | IF | ID | EX | MEM | WB | |
| `ADD x3, x1, x4` | | IF | ID | EX | MEM | WB |

* Using stall : `ADD` can obtain the `x1` value in C5 through forwarding.

| Instruction | C1 | C2 | C3 | C4 | C5 | C6 | C7 |
| ---- | --- | --- | --- | :---: | :---: | --- | --- |
| `LW x1, 0(x2)` | IF | ID | EX | MEM | WB | | |
| `stall` | | IF | ID | EX | MEM | WB | |
| `ADD x3, x1, x4` | | | IF | ID | EX | MEM | WB |

#### Q2: What is the difference between "stall" and "flush" operations?

* **A :**
    * **Stall (Freeze)**: This operation holds the pipeline state constant. The PC and IF/ID pipeline registers retain their current values to prevent fetching new instructions. This is typically accompanied by inserting a **NOP (Bubble)** into the subsequent stage to delay execution.
    * **Flush (Clear)**: This operation invalidates the current stage. It forcibly clears the contents of a pipeline register to 0 (converting the instruction to a NOP). This is primarily used to discard speculatively fetched instructions that are incorrect due to Control Hazards (e.g., taken branches).

#### Q3: Why does a jump instruction with a register dependency need a stall?
* **A :** In this design, the **jump target address is calculated during the ID stage**. This requires the source register data to be valid immediately in the ID stage. If the dependent data is still being computed in the EX stage of the previous instruction and has not been written back, the pipeline must stall. This is typically because the forwarding path from EX output back to ID input is either not implemented or would violate critical path timing constraints.
* Without stall : `JALR` is in ID in C3, while `ADD` is still computing `x1` in EX.

| Instruction | C1 | C2 | C3 | C4 | C5 | C6 |
| ---- | --- | --- | --- | :---: | :---: | --- |
| `ADD x1, x2, x3` | IF | ID | EX | MEM | WB | |
| `JALR x0, x1, 0` | | IF | ID | EX | MEM | WB |

* Using stall : `JALR` can obtain the `x1` value in C4 (its ID stage) through forwarding.

| Instruction | C1 | C2 | C3 | C4 | C5 | C6 | C7 |
| ---- | --- | --- | --- | :---: | :---: | --- | --- |
| `ADD x1, x2, x3` | IF | ID | EX | MEM | WB | | |
| `stall` | | IF | ID | EX | MEM | WB | |
| `JALR x0, x1, 0` | | | IF | ID | EX | MEM | WB |

#### Q4: In this design, why is branch penalty only 1 cycle instead of 2?

* **A :** This design implements **Early Branch Resolution**, where the branch condition check (register comparison) and target address calculation are completed in the ID stage.
    * Standard MIPS (Resolution in EX) : Fetches incorrect instructions in both IF and ID stages → Penalty = 2 cycles.
    * This Design (Resolution in ID) : Only fetches the incorrect instruction in the IF stage → Penalty = 1 cycle.

#### Q5: What would happen if we removed the hazard detection logic entirely?

* **Data Corruption** : In Load-Use scenarios, dependent instructions would read stale (old) values from the register file because the new data has not been written back yet, leading to incorrect computation results.
    * MEM[`x2`] = 100, REG[`x1`] = 10 (old), REG[`x4`] = 5
    * `LW x1, 0(x2)` : Expected to load `100` from memory into `x1`.
    * `ADD x3, x1, x4` : Expected to calculate `x1 + x4`, which is `100 + 5 = 105`.
    * If we removed the Hazard Detection Unit :
        * In the `ADD` EX stage : `x3 = x1 + x4 = 10 + 5 = 15`. Because `LW` has not written back yet, `x1` still holds the old value.
* **Control Flow Failure** : In Branch/Jump scenarios, the CPU would execute instructions that should have been skipped (because they were not flushed), so the program would follow the wrong control flow and could crash.

#### Q6: Complete the stall condition summary:

* **Stall is needed when:**
    1. **EX stage** has a Load instruction (Load-Use) OR ID has a Jump instruction dependent on EX result.
    2. **MEM stage** has a Load instruction AND ID has a Jump instruction dependent on it.
* **Flush is needed when:**
    1. **Branch/Jump is taken** (Control Hazard).

### Test Result

**1. `sbt test`**

* All tests passed

:::spoiler show sbt test result
```bash
sbt:mycpu-pipeline> test
[info] compiling 1 Scala source to /home/p4/ca2025-mycpu/3-pipeline/target/scala-2.13/classes ...
[info] PipelineProgramTest:
[info] Three-stage Pipelined CPU
[info] - should calculate recursively fibonacci(10)
[info] - should quicksort 10 numbers
[info] - should store and load single byte
[info] - should solve data and control hazards
[info] - should handle all hazard types comprehensively
[info] - should handle machine-mode traps
[info] Five-stage Pipelined CPU with Stalling
[info] - should calculate recursively fibonacci(10)
[info] - should quicksort 10 numbers
[info] - should store and load single byte
[info] - should solve data and control hazards
[info] - should handle all hazard types comprehensively
[info] - should handle machine-mode traps
[info] Five-stage Pipelined CPU with Forwarding
[info] - should calculate recursively fibonacci(10)
[info] - should quicksort 10 numbers
[info] - should store and load single byte
[info] - should solve data and control hazards
[info] - should handle all hazard types comprehensively
[info] - should handle machine-mode traps
[info] Five-stage Pipelined CPU with Reduced Branch Delay
[info] - should calculate recursively fibonacci(10)
[info] - should quicksort 10 numbers
[info] - should store and load single byte
[info] - should solve data and control hazards
[info] - should handle all hazard types comprehensively
[info] - should handle machine-mode traps
[info] PipelineUartTest:
[info] Three-stage Pipelined CPU UART Comprehensive Test
[info] - should pass all TX and RX tests
[info] Five-stage Pipelined CPU with Stalling UART Comprehensive Test
[info] - should pass all TX and RX tests
[info] Five-stage Pipelined CPU with Forwarding UART Comprehensive Test
[info] - should pass all TX and RX tests
[info] Five-stage Pipelined CPU with Reduced Branch Delay UART Comprehensive Test
[info] - should pass all TX and RX tests
[info] PipelineRegisterTest:
[info] Pipeline Register
[info] - should be able to stall and flush
[info] Run completed in 2 minutes, 50 seconds.
[info] Total number of tests run: 29
[info] Suites: completed 3, aborted 0
[info] Tests: succeeded 29, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[success] Total time: 171 s (02:51), completed Dec 7, 2025, 11:52:56 PM
```
:::

**2. `make compliance`**

* 76 Passed, 0 Failed

:::spoiler show make compliance result
```
(.venv) p4@p4dev:~/ca2025-mycpu/3-pipeline$ make compliance
Validating RISCOF installation...
RISCOF found: /home/p4/ca2025-mycpu/.venv/bin/riscof
Version: RISC-V Architectural Test Framework., version 1.25.3
Running RISCOF compliance tests for 3-pipeline (RV32I + Zicsr)...
Running RISCOF compliance tests for 3-pipeline...
Using config: config-3-pipeline.ini
Using RISCOF: /home/p4/ca2025-mycpu/.venv/bin/riscof
Using toolchain:
Starting compliance test run at Sun Dec 7 23:59:26 PST 2025
This may take 10-15 minutes for the full test suite...
...
✅ Compliance tests complete. Results in riscof_work_3pl/
Completion time: Mon Dec 8 00:15:22 PST 2025
Copying results to results/ directory...
Cleaning up auto-generated RISCOF test files...
✅ Compliance tests complete. Results in results/
📊 View report: results/report.html
```
:::

#### Environment

| Riscof | 1.25.3 |
| -------- | -------- |
| Riscv-arch-test Version/Commit Id | - |
| DUT | mycpu |
| Reference | rv32emu |
| ISA | RV32IZicsr |
| User Spec Version | 2.3 |
| Privilege Spec Version | 1.10 |

#### Results in report.html

![image](https://hackmd.io/_uploads/HyL5tW4fbg.png =200x)

#### Issue and resolution

##### 1. `error: externally-managed-environment`

Ubuntu 24.04 restricts the direct installation of Python packages via pip under the "externally-managed-environment" policy.

* Create a local Python virtual environment : `python3 -m venv .venv`
* Activate the virtual environment : `source .venv/bin/activate`
* Reinstall RISCOF : `python3 -m pip install git+https://github.com/riscv/riscof`

##### 2. `ERROR | Error evaluating verify condition (PMP['implemented']): name 'PMP' is not defined`

Undefined PMP ( Physical Memory Protection )

* The PMP error was caused by a missing configuration in the test framework's YAML file, not a hardware defect. Since PMP is not implemented in this design, I explicitly disabled it in the configuration (`implemented: false`).