# 2025-12-02
## [Zhian66](https://hackmd.io/@zhian66/ca25-homework3)
How do we evaluate and verify the efficiency improvements when upgrading a 5-stage pipelined CPU from using **bubbles** to implementing **bypassing** from ALU to Register file?
::: success
We can evaluate the performance by examining the **CPU cycle counts** and the **waveform**!
:::
### Cycle Count
Key Concept: The bypassing version should have a lower cycle count because it eliminates the wasted cycles caused by bubbles (NOPs).
1. Get the initial cycle counts by `CSRRS rd, 0xC00, x0`
2. Execute the instruction sequence containing data hazards under both the bubble and bypassing mechanisms
3. Get the final cycle counts by `CSRRS`
4. Compute the elapsed cycle count for each version (final count minus initial count) and compare the two
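
To make the expected difference concrete, here is a minimal Python sketch of an idealized 5-stage pipeline. It is only a model under stated assumptions, not the course CPU: every instruction is an ALU operation, the register file can be read in the same cycle it is written, and the bubble version holds a dependent instruction in ID until its producer reaches WB. The function name `total_cycles` and the three-instruction program are hypothetical.

```python
def total_cycles(program, forwarding):
    """Return the cycle in which the last instruction completes WB.

    program: list of (dest, src1, src2) register names, ALU instructions only.
    """
    if not program:
        return 0
    last_writer_id = {}   # register -> cycle in which its latest producer was in ID
    prev_id = 1           # first instruction: IF in cycle 1, ID in cycle 2
    for dest, *srcs in program:
        id_cycle = prev_id + 1            # in-order, one instruction per cycle
        if not forwarding:
            for src in srcs:
                if src in last_writer_id:
                    # Assumption: wait in ID until the producer is in WB (ID + 3).
                    id_cycle = max(id_cycle, last_writer_id[src] + 3)
        if dest != "x0":                  # x0 is hard-wired to zero
            last_writer_id[dest] = id_cycle
        prev_id = id_cycle
    return prev_id + 3                    # ID -> EX -> MEM -> WB


program = [
    ("x1", "x2", "x3"),   # add x1, x2, x3
    ("x4", "x1", "x5"),   # sub x4, x1, x5   (needs x1)
    ("x6", "x4", "x1"),   # and x6, x4, x1   (needs x4 and x1)
]
bubbles = total_cycles(program, forwarding=False)
bypass = total_cycles(program, forwarding=True)
print(bubbles, bypass, bubbles - bypass)   # 11 7 4
```

In this model the difference between the two counts equals the number of bubble cycles removed, which is the quantity the `CSRRS` measurements are meant to expose on the real design.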
### Waveform
- Bubbles ver.: Look for "Bubbles" (NOP instructions) or PC holding its value between dependent instructions.
- Bypassing ver.: Look for continuous execution (no gaps). Verify that the data is correctly bypassed from the EX or MEM stage directly to the next instruction's input.
1. Generate `.vcd` files using **Verilator** (e.g., `make verilator`, then run with `-vcd dump.vcd`).
2. Inspect signals by opening the `.vcd` file using **Surfer**.
3. Compare the `id_*` and `ex_*` signals between the two versions to confirm the removal of stall cycles.
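
The "PC holding its value" check can also be done mechanically once the PC has been sampled once per clock cycle. The sketch below assumes such a per-cycle list has already been exported from the waveform by some means; the helper name `count_stall_cycles` and both traces are hypothetical illustrations, not real simulation output.

```python
def count_stall_cycles(pc_trace):
    """Count cycles in which the PC keeps the value it had in the previous cycle."""
    return sum(1 for prev, cur in zip(pc_trace, pc_trace[1:]) if cur == prev)


# Hypothetical per-cycle PC samples for the same code on both versions.
bubbles_trace = [0x1000, 0x1004, 0x1008, 0x1008, 0x1008, 0x100C]
bypass_trace = [0x1000, 0x1004, 0x1008, 0x100C]

print(count_stall_cycles(bubbles_trace))  # 2
print(count_stall_cycles(bypass_trace))   # 0
```

A loop that jumps to its own address would also be counted as a hold, so this check is most useful on straight-line test code.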
---
## [Winstonllllai](https://hackmd.io/@Winstonllllai/ca25-homework3)
### The purpose and functionality of CSR
As we dig into ```init.S```, we can see the way to manipulate CSR:
```
# ==============================================================================
# CSR (Control and Status Register) Operations
# ==============================================================================
# Best Practices:
# - csrs/csrsi: Set specific bits (read-modify-write, preserves other bits)
# - csrc/csrci: Clear specific bits (read-modify-write, preserves other bits)
# - csrw/csrrw: Full register write (use only when controlling entire register)
# - Avoid magic numbers - use explicit bit shifts for clarity
# - Preserve reserved bits per RISC-V Privileged Spec (WARL contract)
# ==============================================================================
```
The comment emphasizes manipulating CSRs safely and atomically. Unlike general-purpose registers, which hold data for computation, CSRs control how the CPU behaves and report its status.
Core Functions of CSRs:
- Trap and interrupt configuration
- Context Preservation
- System Identification and State
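
The difference between the set, clear, and full-write forms listed in the `init.S` comment can be modeled with plain bit operations. This is a minimal Python sketch of the resulting register value only (it ignores the value read back into `rd` and any WARL filtering); the function names mirror the pseudo-instructions, and the starting `mstatus` value is a hypothetical example.

```python
MASK32 = 0xFFFFFFFF
MSTATUS_MIE = 1 << 3          # mstatus.MIE is bit 3


def csrw(old, value):
    """Full write: every bit of the CSR is replaced."""
    return value & MASK32


def csrs(old, mask):
    """Set bits: only the bits in mask change, the rest are preserved."""
    return (old | mask) & MASK32


def csrc(old, mask):
    """Clear bits: only the bits in mask change, the rest are preserved."""
    return old & ~mask & MASK32


mstatus = 0x00001880          # hypothetical: MPP = 0b11 (bits 12:11), MPIE = 1 (bit 7)
print(hex(csrs(mstatus, MSTATUS_MIE)))   # 0x1888, MPP and MPIE untouched
print(hex(csrw(mstatus, MSTATUS_MIE)))   # 0x8, MPP and MPIE overwritten
```

The second result shows why the comment recommends `csrw` only when the whole register is meant to be controlled.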
### Relationship Between CSR and Trap
When a trap occurs, the hardware performs a series of actions according to the information in the CSRs.
1. Before trap:
Software configures CSRs to tell the hardware how to handle traps.
- ```mtvec```: The address of the trap handler. The PC jumps to this address when a trap occurs.
- ```mstatus```: Globally enables or disables interrupts through its ```MIE``` bit.
- ```mscratch```: Scratch register, typically used to swap from the user stack to the kernel stack on trap entry.
2. During trap:
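The hardware records information about the trap as it occurs.
- ```mepc```: The PC of the instruction that was interrupted or that raised the exception.
- ```mtval```: Additional trap-specific information, such as a faulting address.
- ```mcause```: The cause of the trap.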
3. After trap:
Restore the PC to where execution was before the trap occurred.
- ```mret```: Instruction that returns from the trap handler using ```mepc``` and ```mstatus```.
### How can we utilize it?
1. Set ```mtvec``` with ```csrw``` during the initialization stage.
2. Use ```csrs/csrc``` to atomically set or clear the interrupt-enable bits.
3. Check ```mcause``` to differentiate interrupts and exceptions.
4. Return by using ```mret``` instruction.
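
Step 3 relies on the layout of `mcause`: on RV32 the top bit marks an interrupt and the remaining bits hold the cause code. A minimal Python sketch of that decoding follows, assuming a 32-bit `mcause`; the helper name `decode_mcause` is hypothetical.

```python
def decode_mcause(mcause):
    """Split a 32-bit mcause value into (is_interrupt, cause_code)."""
    is_interrupt = bool((mcause >> 31) & 1)   # bit 31: 1 = interrupt, 0 = exception
    code = mcause & 0x7FFFFFFF                # bits 30:0: cause code
    return is_interrupt, code


print(decode_mcause(0x80000007))  # (True, 7): machine timer interrupt
print(decode_mcause(0x0000000B))  # (False, 11): environment call from M-mode
```

A real handler would perform the same test in assembly or C, usually by checking the sign of the value read from `mcause`.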
---
## [jningmin](https://github.com/jningmin)
CSR + 40 single cycle
Check out the RISC-V spec.
---
## [Alex-Amedro](https://hackmd.io/@K0u7M9pjQwW-yF5gwSpQlA/SkUXP3Ib-l)
### VGA initialization and frame updating procedure
VGA Controller Simplified Explanation
The VGA module’s job is simply to display a small animated image on a screen.
The CPU sends frames (64×64 pixels) and a color palette to the controller. The VGA controller then reads this data and displays it enlarged, centered on a 640×480 screen.
Two parts run in parallel:
- The CPU, which uploads frames and selects which one to display
- The VGA controller, which continuously outputs the image to the screen

To change the image, the CPU just selects a different frame. The switch happens cleanly, without flickering.
In simulation, a PC program captures the VGA output signals and shows the result in a window.
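
The "enlarged, centered" display can be pictured as a mapping from each screen pixel back to a frame pixel and then through the palette. The Python sketch below illustrates that idea only. The integer scale factor of 6 is an assumption (the notes do not state it), as are the names `screen_to_frame` and `pixel_color` and the border color index.

```python
SCREEN_W, SCREEN_H = 640, 480
FRAME_W, FRAME_H = 64, 64
SCALE = 6  # assumed enlargement factor, not taken from the notes

# Offsets that center the enlarged frame on the screen.
OFFSET_X = (SCREEN_W - FRAME_W * SCALE) // 2
OFFSET_Y = (SCREEN_H - FRAME_H * SCALE) // 2


def screen_to_frame(x, y):
    """Map a screen pixel to frame coordinates, or None if it lies in the border."""
    fx, fy = x - OFFSET_X, y - OFFSET_Y
    if 0 <= fx < FRAME_W * SCALE and 0 <= fy < FRAME_H * SCALE:
        return fx // SCALE, fy // SCALE
    return None


def pixel_color(frame, palette, x, y, border=0):
    """Look up the color for a screen pixel; frame[row][col] holds palette indices."""
    pos = screen_to_frame(x, y)
    if pos is None:
        return border
    fx, fy = pos
    return palette[frame[fy][fx]]


print(OFFSET_X, OFFSET_Y)          # 128 48
print(screen_to_frame(320, 240))   # (32, 32)
```

Because the lookup depends only on the selected frame and the palette, switching frames changes what every subsequent pixel reads without touching the output timing.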
More details are in my homework 3 report.

---
## [Potassium-chromate](https://hackmd.io/@Potassium-chromate/SyJLV12Wbg)
### How CLINT Interacts with the Timer
When the timer peripheral expires, it raises an interrupt request by setting `io.interrupt_flag` to `InterruptCode.Timer0` (otherwise it stays `None`). CLINT doesn’t immediately force the CPU into the handler; instead, it first checks the CSR enable conditions. Specifically, it requires global interrupts to be enabled (`mstatus.MIE = 1`) and the timer interrupt to be enabled (`mie.MTIE = 1`). CLINT accepts the interrupt only when the interrupt flag is not `None` and both `mstatus.MIE` and the source’s enable bit are set.
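
The acceptance condition described above is a three-way AND. A minimal Python sketch of that check, using the bit positions from the RISC-V privileged spec (`mstatus.MIE` is bit 3, `mie.MTIE` is bit 7); the function name and the boolean `timer_pending` argument are hypothetical stand-ins for the Chisel signals.

```python
MSTATUS_MIE = 1 << 3   # global machine interrupt enable
MIE_MTIE = 1 << 7      # machine timer interrupt enable


def timer_interrupt_accepted(timer_pending, mstatus, mie):
    """True only if a timer request is pending and both enable bits are set."""
    return (bool(timer_pending)
            and bool(mstatus & MSTATUS_MIE)
            and bool(mie & MIE_MTIE))


print(timer_interrupt_accepted(True, MSTATUS_MIE, MIE_MTIE))  # True
print(timer_interrupt_accepted(True, 0, MIE_MTIE))            # False: MIE cleared
```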
After accepting a timer interrupt, CLINT performs the standard RISC-V trap entry flow: it redirects execution to the handler address in `mtvec`, saves the current instruction address into `mepc` for later return, and writes `mcause` to indicate an interrupt with cause code 7 (machine timer interrupt). To prevent nested interrupts, it also updates `mstatus` by copying the old `MIE` into `MPIE` and clearing `MIE` to 0, and it commits these CSR updates through a direct write. Finally, when the ISR completes and executes `mret`, CLINT reverses the state change: it jumps back to `mepc`, restores `MIE` from `MPIE`, and sets `MPIE` to 1 as required by the spec.
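
The entry and return sequences above can be traced with a small state model. This Python sketch is an illustration under assumptions, not the Chisel implementation: `mtvec` is in direct mode (low bits zero), only the `MIE` and `MPIE` fields of `mstatus` are modeled, and the `Hart` class and function names are hypothetical.

```python
from dataclasses import dataclass

MSTATUS_MIE = 1 << 3
MSTATUS_MPIE = 1 << 7
MCAUSE_MACHINE_TIMER = 0x80000007   # interrupt bit set, cause code 7


@dataclass
class Hart:
    pc: int = 0
    mstatus: int = MSTATUS_MIE
    mtvec: int = 0
    mepc: int = 0
    mcause: int = 0


def trap_enter(h):
    """Timer interrupt entry: save state, disable interrupts, jump to the handler."""
    h.mepc = h.pc
    h.mcause = MCAUSE_MACHINE_TIMER
    old_mie = h.mstatus & MSTATUS_MIE
    h.mstatus &= ~(MSTATUS_MIE | MSTATUS_MPIE)   # MIE <- 0
    if old_mie:
        h.mstatus |= MSTATUS_MPIE                # MPIE <- old MIE
    h.pc = h.mtvec


def mret(h):
    """Return from the handler: restore MIE from MPIE, set MPIE, jump to mepc."""
    old_mpie = h.mstatus & MSTATUS_MPIE
    h.mstatus &= ~MSTATUS_MIE
    if old_mpie:
        h.mstatus |= MSTATUS_MIE                 # MIE <- MPIE
    h.mstatus |= MSTATUS_MPIE                    # MPIE <- 1
    h.pc = h.mepc


h = Hart(pc=0x1040, mtvec=0x2000)
trap_enter(h)
print(hex(h.pc), hex(h.mepc), hex(h.mstatus))   # 0x2000 0x1040 0x80
mret(h)
print(hex(h.pc), hex(h.mstatus))                # 0x1040 0x88
```

While the handler runs, `MIE` is 0, so the acceptance check fails for any further request, which is how nesting is prevented.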