# 2025-12-02

## [Zhian66](https://hackmd.io/@zhian66/ca25-homework3)

How do we evaluate and verify the efficiency improvements when upgrading a 5-stage pipelined CPU from using **bubbles** to implementing **bypassing** from ALU to Register file?

::: success
We can evaluate the performance by examining the **CPU cycle counts** and the **waveform**!
:::

### Cycle Count

Key Concept: The bypassing version should have a lower cycle count because it eliminates the wasted cycles caused by bubbles (NOPs).

1. Get the initial cycle count with `CSRRS rd, 0xC00, x0` (`0xC00` is the `cycle` CSR).
2. Execute instructions with data hazards under both the bubble and the bypassing mechanisms.
3. Get the final cycle count with `CSRRS`.
4. Compute the total cycle count as the difference between the final and initial readings.

### Waveform

- Bubbles ver.: Look for "Bubbles" (NOP instructions) or the PC holding its value between dependent instructions.
- Bypassing ver.: Look for continuous execution (no gaps). Verify that the data is correctly bypassed from the EX or MEM stage directly to the next instruction's input.

1. Generate `.vcd` files using **Verilator** (e.g., `make verilator`, then run with `-vcd dump.vcd`).
2. Inspect signals by opening the `.vcd` file using **Surfer**.
3. Compare the `id_*` and `ex_*` signals between the two versions to confirm the removal of stall cycles.
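
As a rough sanity check for the measured cycle counts, a toy model can estimate how many cycles each version should need for a given hazard sequence. This is only a sketch in Python, not the homework's Chisel design: `Instr` and `total_cycles` are hypothetical names, the code is assumed to be branch-free, and the bubble version is assumed to read a register in ID no earlier than the cycle in which the producer writes it back in WB.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    rd: Optional[int]        # destination register, None if the instruction writes nothing
    srcs: Tuple[int, ...]    # source registers
    is_load: bool = False    # loads produce their value one stage later than ALU ops


def total_cycles(program, forwarding):
    """Cycles until the last instruction leaves WB in a 5-stage pipeline (toy model)."""
    if not program:
        return 0
    ready = {}   # register -> earliest EX cycle in which a consumer may use it
    ex = 2       # the first instruction reaches EX in cycle 3 (IF=1, ID=2, EX=3)
    for ins in program:
        earliest = ex + 1
        for r in ins.srcs:
            if r != 0 and r in ready:          # x0 never causes a hazard
                earliest = max(earliest, ready[r])
        ex = earliest
        if ins.rd not in (None, 0):
            if forwarding:
                # ALU result forwarded from EX/MEM, load result from MEM/WB
                ready[ins.rd] = ex + (2 if ins.is_load else 1)
            else:
                # assumption: consumer's ID must coincide with or follow producer's WB
                ready[ins.rd] = ex + 3
    return ex + 2    # MEM and WB of the last instruction


# addi x1, x0, 1 ; add x2, x1, x1 ; add x3, x2, x1
dependent_chain = [Instr(1, (0,)), Instr(2, (1, 1)), Instr(3, (2, 1))]

print(total_cycles(dependent_chain, forwarding=False))  # bubbles version
print(total_cycles(dependent_chain, forwarding=True))   # bypassing version
```

If the difference between the two `CSRRS` readings on the real design is far from what such a model predicts, the waveform is the place to look for unexpected stalls.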
<!-- Keep editing and will paste my testing result before 12/5 -->

---

## [Winstonllllai](https://hackmd.io/@Winstonllllai/ca25-homework3)

### The purpose and functionality of CSR

As we dig into ```init.S```, we can see how CSRs are manipulated:

```
# ==============================================================================
# CSR (Control and Status Register) Operations
# ==============================================================================
# Best Practices:
# - csrs/csrsi: Set specific bits (read-modify-write, preserves other bits)
# - csrc/csrci: Clear specific bits (read-modify-write, preserves other bits)
# - csrw/csrrw: Full register write (use only when controlling entire register)
# - Avoid magic numbers - use explicit bit shifts for clarity
# - Preserve reserved bits per RISC-V Privileged Spec (WARL contract)
# ==============================================================================
```

The comment emphasizes safe, atomic manipulation of CSRs. Unlike general-purpose registers, which hold data for computation, CSRs control how the CPU behaves and expose its status.

Core Functions of CSRs:

- Trap and interrupt configuration
- Context preservation
- System identification and state

### Relationship Between CSR and Trap

When a trap occurs, the hardware performs a series of actions according to the contents of the CSRs.

1. Before trap: Configure the CSRs to tell the hardware how to handle a trap.
    - ```mtvec```: The address of the trap handler. The PC jumps to this address when a trap occurs.
    - ```mstatus```: Globally enables or disables interrupts (via the ```MIE``` bit).
    - ```mscratch```: Scratch register for swapping from the user stack to the kernel stack.
2. During trap: Record information about the trap as it occurs.
    - ```mepc```: The PC at the time of the trap.
    - ```mtval```: Additional trap information, such as a faulting address.
    - ```mcause```: What caused the trap.
3. After trap: Return to the PC saved before the trap occurred.
    - ```mret``` instruction: Returns from trap handling according to ```mepc``` & ```mstatus```.
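
Going back to the ```init.S``` comment, the difference between the read-modify-write forms and a full write can be illustrated with a small Python model. This is a sketch of the instruction semantics only; the function names mirror the instruction mnemonics, the sample ```mstatus``` value is made up for illustration, and WARL filtering of reserved bits is not modelled.

```python
MASK32 = 0xFFFF_FFFF
MSTATUS_MIE = 1 << 3    # mstatus.MIE, written as an explicit shift instead of a magic number
MSTATUS_MPIE = 1 << 7   # mstatus.MPIE
MSTATUS_MPP = 0b11 << 11  # mstatus.MPP field


def csrrw(old, src):
    """Full write: returns (value read into rd, new CSR value)."""
    return old, src & MASK32


def csrrs(old, src):
    """Set the bits given in src, preserve all others."""
    return old, (old | src) & MASK32


def csrrc(old, src):
    """Clear the bits given in src, preserve all others."""
    return old, old & ~src & MASK32


# Illustrative starting state: MPP = 0b11 and MPIE = 1, MIE = 0
mstatus = MSTATUS_MPP | MSTATUS_MPIE

_, after_set = csrrs(mstatus, MSTATUS_MIE)     # enables interrupts, keeps MPP and MPIE
_, after_write = csrrw(mstatus, MSTATUS_MIE)   # enables interrupts but wipes MPP and MPIE
_, after_clear = csrrc(after_set, MSTATUS_MIE)  # disables interrupts again

print(hex(after_set), hex(after_write), hex(after_clear))
```

The full write loses the other fields, which is why the comment recommends ```csrs/csrc``` unless the whole register is being controlled.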
### How can we utilize it?

1. Set ```mtvec``` with ```csrw``` during the initialization stage.
2. Use ```csrs/csrc``` to set or clear interrupt-enable bits atomically.
3. Check ```mcause``` to differentiate interrupts from exceptions.
4. Return with the ```mret``` instruction.

---

## [jningmin](https://github.com/jningmin)

Csr + 40 single cycle check out riscv spec

---

## [Alex-Amedro](https://hackmd.io/@K0u7M9pjQwW-yF5gwSpQlA/SkUXP3Ib-l)

### VGA initialization and frame updating procedure

VGA Controller Simplified Explanation

The VGA module’s job is simply to display a small animated image on a screen. The CPU sends frames (64×64 pixels) and a color palette to the controller. The VGA controller then reads this data and displays it enlarged, centered on a 640×480 screen.

Two parts run in parallel:

- The CPU, which uploads frames and selects which one to display
- The VGA controller, which continuously outputs the image to the screen

To change the image, the CPU just selects a different frame. The switch happens cleanly, without flickering.

In simulation, a PC program captures the VGA output signals and shows the result in a window.

More in my homework3 report.

---

## [Potassium-chromate](https://hackmd.io/@Potassium-chromate/SyJLV12Wbg)

### How CLINT Interacts with the Timer

When the timer peripheral expires, it raises an interrupt request by setting `io.interrupt_flag` to `InterruptCode.Timer0` (otherwise it stays `None`). CLINT doesn’t immediately force the CPU into the handler; instead, it first checks the CSR enable conditions. Specifically, it requires global interrupts to be enabled (`mstatus.MIE = 1`) and the timer interrupt to be enabled (`mie.MTIE = 1`). CLINT accepts the interrupt only if the interrupt flag is non-`None` and both `MIE` and the source’s enable bit are set.
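
The acceptance check described above can be sketched as a small Python predicate. This is not the Chisel implementation: `clint_accepts` is a hypothetical name, the numeric values chosen for `INTERRUPT_NONE` and `INTERRUPT_TIMER0` are assumptions standing in for `InterruptCode.None` and `InterruptCode.Timer0`, and only the timer source is modelled.

```python
MSTATUS_MIE = 1 << 3   # global machine interrupt enable
MIE_MTIE = 1 << 7      # machine timer interrupt enable in the mie CSR

INTERRUPT_NONE = 0     # assumed encoding, for illustration only
INTERRUPT_TIMER0 = 1   # assumed encoding, for illustration only


def clint_accepts(interrupt_flag, mstatus, mie):
    """True only when a request is pending and both enable bits are set."""
    pending = interrupt_flag != INTERRUPT_NONE
    globally_enabled = bool(mstatus & MSTATUS_MIE)
    timer_enabled = bool(mie & MIE_MTIE)
    return pending and globally_enabled and timer_enabled


# A pending timer request is ignored while mstatus.MIE is clear
print(clint_accepts(INTERRUPT_TIMER0, mstatus=0, mie=MIE_MTIE))
# and accepted once both enables are set
print(clint_accepts(INTERRUPT_TIMER0, mstatus=MSTATUS_MIE, mie=MIE_MTIE))
```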
After accepting a timer interrupt, CLINT performs the standard RISC-V trap entry flow: it redirects execution to the handler address in `mtvec`, saves the current instruction address into `mepc` for later return, and writes `mcause` to indicate an interrupt with cause code 7 (machine timer interrupt). It also updates `mstatus` to prevent nested interrupts by copying the old `MIE` into `MPIE` and clearing `MIE` to 0, and it commits these CSR updates by enabling a direct write. Finally, when the ISR completes and executes `mret`, CLINT reverses the state change: it jumps back to `mepc`, restores `MIE` from `MPIE`, and resets `MPIE` to 1 as required by the spec.
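
The trap entry and `mret` updates described above can be traced with a short Python sketch. It is a simplified model under stated assumptions: `trap_enter` and `mret` are hypothetical helper names, the CSRs live in a plain dictionary, `mtvec` is assumed to be in direct mode, and the `MPP` field is ignored because only machine mode is considered.

```python
MIE = 1 << 3                 # mstatus.MIE
MPIE = 1 << 7                # mstatus.MPIE
MCAUSE_INTERRUPT = 1 << 31   # top bit of mcause marks an interrupt (RV32)
MACHINE_TIMER = 7            # cause code for the machine timer interrupt


def trap_enter(csr, pc):
    """Return (updated CSRs, next PC) for an accepted timer interrupt."""
    ms = csr["mstatus"]
    old_mie = (ms >> 3) & 1
    ms = (ms & ~(MIE | MPIE)) | (old_mie << 7)   # MPIE <- MIE, MIE <- 0
    new = dict(csr, mstatus=ms, mepc=pc, mcause=MCAUSE_INTERRUPT | MACHINE_TIMER)
    return new, csr["mtvec"]


def mret(csr):
    """Return (updated CSRs, next PC) when the handler executes mret."""
    ms = csr["mstatus"]
    old_mpie = (ms >> 7) & 1
    ms = (ms & ~MIE) | (old_mpie << 3) | MPIE    # MIE <- MPIE, MPIE <- 1
    return dict(csr, mstatus=ms), csr["mepc"]


# Illustrative values: interrupts enabled, handler at 0x100, interrupted at 0x2000
csr = {"mstatus": MIE, "mtvec": 0x100, "mepc": 0, "mcause": 0}
in_handler, handler_pc = trap_enter(csr, pc=0x2000)
after_return, resume_pc = mret(in_handler)

print(hex(in_handler["mstatus"]), hex(in_handler["mcause"]), hex(handler_pc))
print(hex(after_return["mstatus"]), hex(resume_pc))
```

Inside the handler `MIE` is 0, so a second timer request would fail the acceptance check until `mret` restores it.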