## Lab4-0 Caravel Simulation
### 1. spiflash access & code execution (observe CPU trace)
#### Intro:
In the Caravel SoC, the CPU has very little internal memory, so it fetches its firmware (the program code) from an external storage device such as SPI flash in order to start running.
#### Data flow:
* `spiflash.v` role: simulates the behavior of a real **SPI Flash** chip
* Flow:
1️⃣ The CPU wants to fetch instructions
2️⃣ The CPU sends SPI commands to spiflash
3️⃣ spiflash reads the corresponding program code from the BRAM
4️⃣ spiflash returns the code to the CPU
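
The fetch flow above can be sketched as a small host-side C model. This is only an illustration, not the `spiflash.v` implementation: the `flash_model_t` type and both function names are made up here, the model assumes the standard SPI flash READ opcode (`0x03`) followed by a 24-bit address, and it assumes the returned bytes are assembled little-endian into a 32-bit instruction word.

```c
#include <stddef.h>
#include <stdint.h>

#define SPI_CMD_READ 0x03u /* standard SPI flash READ opcode (assumed here) */

/* Hypothetical flash model: a byte array standing in for the BRAM contents. */
typedef struct {
    const uint8_t *mem;
    size_t size;
} flash_model_t;

/* Master side: build the command frame, opcode followed by a 24-bit address. */
static void spi_read_frame(uint32_t addr, uint8_t frame[4])
{
    frame[0] = SPI_CMD_READ;
    frame[1] = (uint8_t)(addr >> 16);
    frame[2] = (uint8_t)(addr >> 8);
    frame[3] = (uint8_t)addr;
}

/* Flash side: decode the frame and return one little-endian 32-bit word.
 * Returns 0 on success, -1 for an unknown opcode or an out-of-range address. */
static int flash_fetch_word(const flash_model_t *f, const uint8_t frame[4],
                            uint32_t *word)
{
    if (frame[0] != SPI_CMD_READ)
        return -1;

    uint32_t addr = ((uint32_t)frame[1] << 16) |
                    ((uint32_t)frame[2] << 8) |
                    (uint32_t)frame[3];
    if ((size_t)addr + 4 > f->size)
        return -1;

    *word = (uint32_t)f->mem[addr] |
            ((uint32_t)f->mem[addr + 1] << 8) |
            ((uint32_t)f->mem[addr + 2] << 16) |
            ((uint32_t)f->mem[addr + 3] << 24);
    return 0;
}
```

In the waveform, the same steps show up as a command/address phase on the SPI pins followed by the data bytes being shifted back.
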
#### CPU access to spiflash:
`caravel-soc_fpga-main\rtl\soc-efabless\VexRiscv_MinDebugCache.v`
The CPU accesses spiflash through two buses:
| Purpose | Bus |
| -------- | -------- |
| Instruction fetch | iBus |
| Data access | dBus |
**iBus signals**:
| Signal name | Description |
| -------------------------- | ------------------------------------------------------ |
| `iBus_cmd_valid` | High when the CPU requests a new instruction fetch |
| `iBus_cmd_ready` | High when SPI Flash is ready to accept the request |
| `iBus_cmd_payload_address` | The address of the instruction to fetch |
| `iBus_rsp_valid` | High when the data is ready and being returned |
| `iBus_rsp_payload_data` | The data returned (typically a 32-bit instruction)|

**dBus signals**:
| Signal name | Description |
| -------------------------- | ------------------------------------------------------ |
| `dBus_cmd_valid` | High when the CPU issues a data access request |
| `dBus_cmd_ready` | High when the target is ready to accept the request |
| `dBus_cmd_payload_address` | The memory address to access |
| `dBus_cmd_payload_wr` | 1 = write operation, 0 = read operation |
| `dBus_cmd_payload_data` | Data to write (for write operations) |
| `dBus_cmd_payload_mask` | Which bytes are valid (used for partial writes) |
| `dBus_rsp_valid` | High when data is returned (for reads) |
| `dBus_rsp_payload_data` | The returned data (for reads) |
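
To make the role of `dBus_cmd_payload_mask` concrete, here is a minimal C sketch of a partial write. It assumes that bit `i` of the mask enables byte lane `i` (bits `8*i+7 : 8*i`) of the 32-bit word; the function name `apply_byte_mask` is made up for this example.

```c
#include <stdint.h>

/* Merge new_data into old_word, byte lane by byte lane.
 * Assumption: mask bit i enables byte lane i (bits 8*i+7 .. 8*i). */
static uint32_t apply_byte_mask(uint32_t old_word, uint32_t new_data,
                                uint8_t mask)
{
    uint32_t result = old_word;
    for (int i = 0; i < 4; i++) {
        if (mask & (1u << i)) {
            uint32_t lane = 0xFFu << (8 * i);
            result = (result & ~lane) | (new_data & lane);
        }
    }
    return result;
}
```

For example, a byte store to the lowest byte would use mask `0x1`, while a full word store would use mask `0xF`.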

### 2. CPU WB cycles interaction with user project area.
#### Intro:
The CPU and the user project exchange data over the Wishbone bus.
#### Signals
| Signal Name | Description |
| ----------------------- | ---------------------------------------------------------- |
| `wb_clk_i` | Wishbone bus clock |
| `wb_rst_i` | Reset signal |
| `wb_adr_i[31:0]` | Address bus, the target address specified by the CPU |
| `wb_dat_i[31:0]` | Data written from the CPU |
| `wb_dat_o[31:0]` | Data read from the user project |
| `wb_cyc_i` | Indicates the start of a valid Wishbone transfer cycle |
| `wb_stb_i` | Data transfer request, asserted to initiate a transfer |
| `wb_we_i` | Write enable, high for write, low for read |
| `wb_ack_o` | Acknowledgment from user project indicating transfer is done |
| `wb_sel_i[3:0]` | Byte select, marks which bytes of the data bus are valid |
#### Data Flow
**CPU write to user project:**
* CPU setup
    * `wb_adr_i`: address to write
    * `wb_dat_i`: data to write
    * `wb_we_i = 1`: write operation
    * `wb_cyc_i = 1` and `wb_stb_i = 1`: start the transfer
* User project response
    * writes the data to the addressed location
    * `wb_ack_o = 1`: transfer done

**CPU read from user project:**
* CPU setup
    * `wb_adr_i`: address to read
    * `wb_we_i = 0`: read operation
    * `wb_cyc_i = 1` and `wb_stb_i = 1`: start the transfer
* User project response
    * fetches the data at the address and puts it on `wb_dat_o`
    * `wb_ack_o = 1`: data is ready
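
The write and read sequences above can be modelled on the host with a small C sketch of a Wishbone slave. It is only an illustration under stated assumptions: the slave is a hypothetical bank of four 32-bit registers, it acknowledges in the same step as the request, and the type and function names (`wb_request_t`, `wb_slave_step`, and so on) are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

#define WB_REGS 4 /* hypothetical: four 32-bit registers in the user project */

/* Signals driven by the CPU (wb_cyc_i, wb_stb_i, wb_we_i, wb_adr_i, wb_dat_i, wb_sel_i). */
typedef struct {
    bool cyc, stb, we;
    uint32_t adr, dat;
    uint8_t sel;
} wb_request_t;

/* Signals driven by the user project (wb_ack_o, wb_dat_o). */
typedef struct {
    bool ack;
    uint32_t dat;
} wb_response_t;

typedef struct {
    uint32_t regs[WB_REGS];
} wb_slave_t;

/* Handle one transfer: ack only when cyc and stb are both high. */
static wb_response_t wb_slave_step(wb_slave_t *s, const wb_request_t *req)
{
    wb_response_t rsp = { false, 0 };
    if (!(req->cyc && req->stb))
        return rsp; /* no active transfer, so no ack */

    uint32_t idx = (req->adr >> 2) % WB_REGS; /* word index, wrapped */
    if (req->we) {
        /* Write: update only the byte lanes selected by sel. */
        for (int i = 0; i < 4; i++) {
            if (req->sel & (1u << i)) {
                uint32_t lane = 0xFFu << (8 * i);
                s->regs[idx] = (s->regs[idx] & ~lane) | (req->dat & lane);
            }
        }
    } else {
        /* Read: put the addressed register on the data output. */
        rsp.dat = s->regs[idx];
    }
    rsp.ack = true;
    return rsp;
}
```

A write followed by a read of the same address should return the written value with `ack` set, which is the same pattern to look for on `wb_ack_o` and `wb_dat_o` in the waveform.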

### 3. CPU interface with user project via LA
#### Intro:
The LA (logic analyzer) acts as a set of programmable, bit-level I/O lines that carry data between the CPU and the user project.
#### Signals:
| Signal | Direction | Function |
| ------------- | ------- | -------------------- |
| `la_data_in[127:0]` | CPU -> user project | data sent from the CPU to the user project |
| `la_data_out[127:0]` | user project -> CPU | data sent from the user project to the CPU |
| `la_oenb[127:0]` | CPU -> user project | 1: CPU reads from user project; 0: CPU writes to user project |
#### Data flow:
**CPU part:**
```C=
// Now, apply the configuration
reg_mprj_xfer = 1;
while (reg_mprj_xfer == 1);
// Configure LA probes [31:0], [127:64] as inputs to the cpu
// Configure LA probes [63:32] as outputs from the cpu
reg_la0_oenb = reg_la0_iena = 0x00000000; // [31:0]
reg_la1_oenb = reg_la1_iena = 0xFFFFFFFF; // [63:32]
reg_la2_oenb = reg_la2_iena = 0x00000000; // [95:64]
reg_la3_oenb = reg_la3_iena = 0x00000000; // [127:96]
// Flag start of the test
reg_mprj_datal = 0xAB400000;
// Set Counter value to zero through LA probes [63:32]
reg_la1_data = 0x00000000;
// Configure LA probes from [63:32] as inputs to disable counter write
reg_la1_oenb = reg_la1_iena = 0x00000000;
while (1) {
if (reg_la0_data_in > 0x1F4) {
reg_mprj_datal = 0xAB410000;
break;
}
}
//print("\n");
//print("Monitor: Test 1 Passed\n\n"); // Makes simulation very long!
reg_mprj_datal = 0xAB510000;
```
* Initialization: writing 1 to `reg_mprj_xfer` applies the configuration; the firmware waits until the bit clears back to 0, which signals that the configuration has been applied.
* LA configuration: LA probes [63:32] are outputs from the CPU to the user project; the others are inputs to the CPU.
* Flags: `reg_mprj_datal` serves as a progress flag. Here, `0xAB400000` marks the start of the test, `0xAB410000` marks that the counter check has passed, and `0xAB510000` is written at the very end.
* Termination: the CPU polls LA[31:0] (the user project's counter output) and ends the test once the value exceeds `0x1F4` (500), writing `0xAB410000` to `reg_mprj_datal`.

**User project part:**
```verilog=
// LA
assign la_data_out = {{(127-BITS){1'b0}}, count};
// Assuming LA probes [63:32] are for controlling the count register
assign la_write = ~la_oenb[63:32] & ~{BITS{valid}};
// Assuming LA probes [65:64] are for controlling the count clk & reset
assign clk = (~la_oenb[64]) ? la_data_in[64]: wb_clk_i;
assign rst = (~la_oenb[65]) ? la_data_in[65]: wb_rst_i;
```
```verilog=
if (~|la_write) begin
count <= count + 1;
end
if (valid && !ready) begin
ready <= 1'b1;
rdata <= count;
if (wstrb[0]) count[7:0] <= wdata[7:0];
if (wstrb[1]) count[15:8] <= wdata[15:8];
if (wstrb[2]) count[23:16] <= wdata[23:16];
if (wstrb[3]) count[31:24] <= wdata[31:24];
end else if (|la_write) begin
count <= la_write & la_input;
end
```
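* Initialization: the firmware sets `reg_la1_oenb = 0xFFFFFFFF` (LA[63:32] driven by the CPU) and `reg_la1_data = 0x00000000`. For `la_write` to be non-zero, the user-side `la_oenb[63:32]` must be 0 in this state, i.e. inverted relative to the firmware register → `la_write != 0` → `|la_write` holds → `count <= la_write & la_input` loads 0 into `count`
* Start counting: the firmware sets `reg_la1_oenb = 0x00000000` (LA[63:32] no longer driven by the CPU) → user-side `la_oenb[63:32]` is all ones → `la_write = 0` → `~|la_write` holds → enter `count <= count + 1`

**Summary**
| Operation | CPU | User project |
| ----------- | --------------------------- | ------------------------- |
| CPU writes `count` in user project | `reg_la1_oenb = 0xFFFFFFFF`, `reg_la1_data = 0` | `count <= la_write & la_input` |
| Activate `count + 1` | `reg_la1_oenb = 0x00000000` | `count <= count + 1` |
| CPU reads `count` from user project | `reg_la0_data_in` | `la_data_out = count` |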
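
The counter behaviour in the Verilog above can be mirrored by a short host-side C model. This is a sketch under assumptions: `BITS` is taken to be 32, no Wishbone access is in progress (`valid = 0`), and the arguments are the user-side `la_oenb[63:32]` and `la_data_in[63:32]` values; the names `counter_t` and `counter_clock` are made up for this example.

```c
#include <stdint.h>

typedef struct {
    uint32_t count;
} counter_t;

/* One clock edge of the counter with no Wishbone access (valid = 0).
 * la_oenb_63_32: user-side la_oenb[63:32]
 * la_in_63_32:   la_data_in[63:32] (called la_input in the Verilog) */
static void counter_clock(counter_t *c, uint32_t la_oenb_63_32,
                          uint32_t la_in_63_32)
{
    /* valid = 0, so ~{BITS{valid}} is all ones and la_write = ~la_oenb[63:32] */
    uint32_t la_write = ~la_oenb_63_32;

    if (la_write != 0)
        c->count = la_write & la_in_63_32; /* CPU loads the counter via LA */
    else
        c->count = c->count + 1;           /* free-running count */
}
```

Calling `counter_clock` once with `la_oenb_63_32 = 0` and `la_in_63_32 = 0` clears the counter; every later call with `la_oenb_63_32 = 0xFFFFFFFF` increments it, so the firmware's `> 0x1F4` check becomes true after 501 such clocks.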

### 4. User project/RISC-V uses mprj pin, and interacts with Testbench
#### Intro:
The user project communicates with the testbench via `mprj_io[37:0]`.
* On a real chip: `pin <--> padframe <--> mprj_io <--> user project`
* In simulation: `testbench --(assign)-- mprj_io <--> user project`
#### Data flow and Signals
```
user_project_wrapper.v
  │
  └─► user_io_out
        │
        └─► gpio_control_block.v (user_gpio_out)
              │
              └─► pad_gpio_out
                    │
                    └─► mprj_io_out (chip_io.v)
                          │
                          └─► io_out (mprj_io.v)
                                │
                                └─► io
                                      │
                                      └─► mprj_io (chip_io.v and caravel.v)
```
* `user_project_wrapper.v`: Defines the user logic; it drives `io_out` and `io_oeb` and reads `io_in` to control the GPIO pins.
* `gpio_control_block.v` : Translates user signals into pad-level control signals (`pad_gpio_out`, `pad_gpio_outenb`, `pad_gpio_in`)
* `chip_io.v`: Connects all GPIO control blocks to the physical padframe, organizing routing between Caravel and I/Os.
* `mprj_io.v`: Implements bidirectional I/O buffers that connect internal control signals to the actual I/O pads.
* `caravel.v`: Top-level module that integrates everything, including the user project, management SoC, and chip I/O.
#### What to observe in testbench
`checkbits = mprj_io[31:16]`

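The checkbits correspond to the upper 16 bits of `reg_mprj_datal`, so a firmware write of `0xAB400000` appears to the testbench as `checkbits == 16'hAB40`.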
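
As a quick sanity check of that mapping, the following C sketch computes the value the testbench would see on `mprj_io[31:16]` for a given firmware write. It assumes bit `n` of `reg_mprj_datal` drives `mprj_io[n]` for pins configured as management outputs; the function name is made up for this example.

```c
#include <stdint.h>

/* checkbits = mprj_io[31:16], i.e. the upper half of reg_mprj_datal
 * (assuming bit n of the register drives mprj_io[n]). */
static uint16_t checkbits_from_datal(uint32_t reg_mprj_datal)
{
    return (uint16_t)(reg_mprj_datal >> 16);
}
```

With the flags used in this lab, `0xAB400000`, `0xAB410000` and `0xAB510000` map to checkbits `0xAB40`, `0xAB41` and `0xAB51` respectively.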