# Lab4_0

[1. How the Caravel SoC is analyzed (counter_la as an example)](https://hackmd.io/EAO1FQAhSGSEiZolsk8pmg?view)

[2. The whole code package for simulation (caravel-soc_fpga-lab)](https://github.com/bol-edu/caravel-soc_fpga/tree/main)

[3. Waveform summarized](https://github.com/Raywang908/lab4/tree/master/lab4_0)

## View the whole system with WV (counter_la)

### System initialization

`ENTRY(_start)` in `section.lds` makes execution begin at the assembly code in `crt0_vex.S`, and `*crt0*(.text)` / `KEEP(*crt0*(.text))` in `section.lds` place `crt0_vex.S` in the `.text` section. Therefore, the system starts fetching instructions at `flash : ORIGIN = 0x10000000`.

:::info
`counter_la.out` shows the contents of every section.
:::

![image](https://hackmd.io/_uploads/r1bfIckxxe.png)
![image](https://hackmd.io/_uploads/ryiuI51exe.png)
![image](https://hackmd.io/_uploads/Hk-x1j1lel.png)

* When the CPU wants to fetch instructions from SPI flash, it asserts `iBus_cmd_valid` along with a starting address, telling the memory controller that it intends to read data to fill the instruction cache. Once the external memory interface (e.g., the SPI flash controller) is ready to accept the request, it asserts `iBus_cmd_ready`, completing the handshake. Afterward, the flash begins sending instruction data, with each valid word indicated by `iBus_rsp_valid == 1`. This typically happens 8 times (i.e., 8 valid 32-bit words) to fill a full cache line. For example, the instruction `0x0b00006f` (the first instruction in `crt0_vex.S`) is fetched from SPI flash, transferred over the iBus while `iBus_rsp_valid` is high, and then stored in the instruction cache for the CPU to execute.
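The line fill described above can be checked by hand. The following is a minimal sketch in C, assuming a 32-byte cache line (8 words of 4 bytes, as observed in the waveform) and the standard RISC-V J-type encoding; the names `jal_offset`, `line_base`, and `ICACHE_LINE_BYTES` are hypothetical helpers for illustration, not part of the lab code.

```c
#include <stdint.h>

/* Assumption: 8 words x 4 bytes = one 32-byte instruction cache line. */
#define ICACHE_LINE_BYTES 32u

/* Sign-extended J-type immediate (byte offset) of a RISC-V JAL instruction. */
int32_t jal_offset(uint32_t insn)
{
    uint32_t imm = (((insn >> 31) & 0x1u)   << 20)   /* imm[20]    */
                 | (((insn >> 12) & 0xFFu)  << 12)   /* imm[19:12] */
                 | (((insn >> 20) & 0x1u)   << 11)   /* imm[11]    */
                 | (((insn >> 21) & 0x3FFu) << 1);   /* imm[10:1]  */

    /* Bit 20 is the sign bit of the 21-bit immediate. */
    return (imm & 0x100000u) ? (int32_t)imm - 0x200000 : (int32_t)imm;
}

/* Byte address of the first word of the cache line that contains addr. */
uint32_t line_base(uint32_t addr)
{
    return addr & ~(ICACHE_LINE_BYTES - 1u);
}
```

With `insn = 0x0B00006F`, `jal_offset` yields `0xB0`, so the jump from `0x10000000` targets `0x100000B0`, and `line_base` places that target in the line starting at `0x100000A0`.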
### dBus and iBus

![image](https://hackmd.io/_uploads/Sy0OiTyxlx.png)
![image](https://hackmd.io/_uploads/rkhqipJxlg.png)

:::info
iBus and dBus have different control flows because dBus has to deal with more complex situations: once `dBus_cmd_valid` and `dBus_cmd_ready` complete a handshake, `dBusWishbone_CYC` is pulled high. `iBus_cmd_valid` and `iBus_cmd_ready` do not behave the same way.
:::

* iBus: handles instruction fetch, so in the waveform we only see `iBusWishbone_ADR << 2` ranging from `flash : ORIGIN = 0x10000000` to the end of the assembly code at `0x1000065c`.

| iBus Signal Name | Description |
|------------------|-------------|
| `iBus_cmd_valid` | Asserted by the CPU to request an instruction fetch from external memory. |
| `iBus_cmd_ready` | Asserted by the memory/slave to acknowledge that it can accept the request. |
| `iBus_cmd_payload_address` | The address of the instruction fetch request sent by the CPU. |
| `iBus_rsp_valid` | Asserted by the memory/slave when instruction data is available. |
| `iBus_rsp_payload_data` | The actual 32-bit instruction word fetched from memory. |
| `iBus_rsp_payload_error` | Indicates if an error occurred during instruction fetch. |
| `iBusWishbone_CYC` | Wishbone bus cycle signal; high during an active instruction fetch transfer. |
| `iBusWishbone_ACK` | Wishbone acknowledge signal; one ACK per 32-bit word successfully transferred. |

* dBus: handles data transfer. `dBusWishbone_WE == 1` means the CPU is writing data, which is sent out on `dBusWishbone_DAT_MOSI`.
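The `iBusWishbone_ADR << 2` conversion mentioned for the iBus can be sketched in C as follows. This is only an illustration: `wb_word_to_byte_addr`, `is_code_fetch`, and the two constants are hypothetical names, and treating `0x1000065c` as the last instruction address is an assumption taken from the note above.

```c
#include <stdbool.h>
#include <stdint.h>

#define FLASH_ORIGIN   0x10000000u  /* flash : ORIGIN in section.lds */
#define CODE_LAST_ADDR 0x1000065Cu  /* assumed last instruction address */

/* The Wishbone address counts 32-bit words; shift left by 2 to get bytes. */
uint32_t wb_word_to_byte_addr(uint32_t wb_adr)
{
    return wb_adr << 2;
}

/* True if a Wishbone word address falls inside the code region in flash. */
bool is_code_fetch(uint32_t wb_adr)
{
    uint32_t byte_addr = wb_word_to_byte_addr(wb_adr);
    return byte_addr >= FLASH_ORIGIN && byte_addr <= CODE_LAST_ADDR;
}
```

For example, the word address `0x04000000` seen on the bus corresponds to the byte address `0x10000000`, the start of flash.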
Referring to `section.lds`: if `dBus_cmd_halfPipe_payload_address` equals `hk : ORIGIN = 0x26000000`, the access targets a configuration register; if it equals `dff : ORIGIN = 0x00000000`, it targets `.bss` or `.data`; if it equals `dff2 : ORIGIN = 0x00000400`, it targets the stack (`PROVIDE(_fstack = ORIGIN(dff2) + LENGTH(dff2))` in `section.lds`).

| dBus Signal | Description |
|-------------|-------------|
| `dBus_cmd_valid` | Set to 1 when the CPU issues a data read/write command. |
| `dBus_cmd_ready` | Set by the slave device when it is ready to accept a command from the CPU. |
| `dBusWishbone_CYC` | Wishbone bus cycle signal. A value of 1 indicates an ongoing bus transaction. |
| `dBusWishbone_ACK` | Acknowledgment from the slave indicating data has been successfully sent or received. |
| `dBus_cmd_rValid` | Pulled high one cycle after a handshake (`valid` & `ready`); indicates that a read command was issued and is waiting for a response. |
| `dBus_rsp_ready` | Indicates the CPU is ready to receive the read data. |
| `dBus_rsp_data` | The data read from the bus (readback result). |

![image](https://hackmd.io/_uploads/ByJe2p1xll.png)

* This is probably a write to `.bss`, since `MOSI == 0`: uninitialized variables are being set to zero.

![image](https://hackmd.io/_uploads/rk7z_Wmgee.png)

* `dBus_rsp_ready` means that the CPU receives the data; the CPU then stores the data in the cache or some buffer. **`dBus_rsp_ready` and `dBusWishbone_WE` cannot be high at the same time.**

![image](https://hackmd.io/_uploads/SJgi2TJell.png)

* We see that the CPU is writing `1` to `0x26000000`, which is defined in `caravel.h`: `#define reg_mprj_xfer (*(volatile uint32_t*)0x26000000)`. It is used as `reg_mprj_xfer = 1;` in `counter_la.c` to control `la_en`.
* `while (reg_mprj_xfer == 1);` in `counter_la.c` indicates that the CPU is busy-waiting; it continues with the rest of `counter_la.c` once `reg_mprj_xfer == 0`.

:::warning
It is unclear why `reg_mprj_xfer` is pulled low.
:::

### Cache

![image](https://hackmd.io/_uploads/r1AJ6Z7gxe.png)

| Signal Name | Stage | Direction | Meaning / Function |
|-------------|-------|-----------|--------------------|
| `...io_cpu_fetch_isValid` | Fetch | CPU → Cache | Indicates if the CPU is requesting an instruction (the fetch stage is valid, may hit or miss in cache). |
| `...io_cpu_fetch_physicalAddress[31:0]` | Fetch | CPU → Cache | The physical address the CPU is trying to fetch. The cache checks whether it hits or misses based on this address. |
| `...io_cpu_fetch_data[31:0]` | Fetch | Cache → CPU | If a cache hit occurs, this is the 32-bit instruction data provided to the CPU by the cache. |
| `...io_cpu_decode_cacheMiss` | Decode | Cache → CPU | If set to 1, indicates that the fetched data was not in the cache (cache miss), and a memory fetch is required. |
| `...io_cpu_decode_data[31:0]` | Decode | Cache → CPU | Instruction data provided by the cache, usually the same as fetch_data, but delayed by one stage (used in Decode). |
| `...io_mem_cmd_valid` | Memory Access | Cache → Memory | The cache sends a data request to the external memory (due to a cache miss). |
| `...io_mem_cmd_payload_address[31:0]` | Memory Access | Cache → Memory | The address from which the cache fetches data from external memory, usually the starting address of a cache-aligned line. |

| Signal Name | Stage | Description |
|-------------|-------|-------------|
| `IBusCachedPlugin_iBusRsp_stages_0_input_payload` | Prefetch | CPU is requesting the cache to fetch a specific instruction address |
| `IBusCachedPlugin_iBusRsp_stages_1_input_payload` | Fetch | Instruction has been fetched from the cache but not yet decoded |
| `IBusCachedPlugin_iBusRsp_stages_2_input_payload` | Decode | CPU is decoding the instruction and preparing to execute it |

### housekeeping wb

![image](https://hackmd.io/_uploads/B1nQxGXllg.png)

| Signal Name | Description |
|-------------|-------------|
| **`cyc`** | **Cycle Signal**: Indicates that a valid transaction is in progress. |
| **`stb`** | **Strobe Signal**: Indicates that the data on the bus is valid and should be processed. |
| **`cyc = 1` and `stb = 1`** | Both signals are high, meaning there is a valid transaction and the data is also valid. |
| **`cyc = 1` and `stb = 0`** | `cyc` is high indicating a valid transaction, but `stb` is low meaning the data is not valid. |
| **`cyc = 0` and `stb = 0`** | Both signals are low, indicating no transaction is taking place, and no data is being processed. |

### counter_la.c

```cpp
reg_mprj_xfer = 1;
while (reg_mprj_xfer == 1);

// Configure LA probes [31:0], [127:64] as inputs to the cpu
// Configure LA probes [63:32] as outputs from the cpu
reg_la0_oenb = reg_la0_iena = 0x00000000;    // [31:0]
reg_la1_oenb = reg_la1_iena = 0xFFFFFFFF;    // [63:32]
reg_la2_oenb = reg_la2_iena = 0x00000000;    // [95:64]
reg_la3_oenb = reg_la3_iena = 0x00000000;    // [127:96]

// Flag start of the test
reg_mprj_datal = 0xAB400000;

// Set Counter value to zero through LA probes [63:32]
reg_la1_data = 0x00000000;

// Configure LA probes from [63:32] as inputs to disable counter write
reg_la1_oenb = reg_la1_iena = 0x00000000;

while (1) {
    if (reg_la0_data_in > 0x1F4) {
        reg_mprj_datal = 0xAB410000;
        break;
    }
}
```

![image](https://hackmd.io/_uploads/SytTXY4exg.png)
![image](https://hackmd.io/_uploads/Hkt07t4lxg.png)

* The marked line is the same in both waveforms. According to the .c code, the counter should be reset to 0 after `0xAB400000`, but it seems that `count == 0` before `0xAB400000` is asserted. The reason is that `la_write` is asserted before `0xAB400000` and `la_data_in` is reset at the beginning, so the waveform may look wrong at first glance.

## counter_la

### spiflash access & code execution (observe CPU trace)

![image](https://hackmd.io/_uploads/SJrK9UzCke.png)
![image](https://hackmd.io/_uploads/rkzXj8fRye.png)

* The first ACK of the iBus, i.e. the first instruction fetch, is shown above. The address is `0x04000000 << 2 = 0x10000000`, where the flash is located.
  The instruction is `0x0B00006F = JAL zero 0xB0`, i.e. a jump to `0x100000B0`.

| Instruction Addr / 4 | MISO (fetched data) | Decoded assembly |
|:----------------------- |:------------------:|:------------------------:|
| 0x04000000 | 0B00006F | JAL zero 0xB0 |
| 0x04000001 | 00000013 | ADDI zero zero 0 (nop) |
| 0x04000002 | 00000013 | nop |
| 0x04000003 | 00000013 | nop |
| 0x04000004 | 00000013 | nop |
| 0x04000005 | 00000013 | nop |
| 0x04000006 | 00000013 | nop |
| 0x04000007 | 00000013 | nop |
| 0x04000028 (0x100000A0) | 00412F03 | LW t5 sp 4 |
| 0x04000029 | 00012F83 | LW t6 sp 0 |
| 0x0400002A | 04010113 | ADDI sp sp 0x40 |
| 0x0400002B | 30200073 | MRET |
| 0x0400002C | 60000113 | ADDI sp zero 0x600 |
| 0x0400002D | 00000517 | AUIPC a0 0 [U-type] |
| 0x0400002E | F6C50513 | ADDI a0 a0 -148 |
| 0x0400002F | 30551073 | CSRRW zero a0 0x305 (mtvec) |

* The fetched sequence is not sequential: after the first eight words, the fetch continues at word address `0x04000028`. This is probably caused by cache line fills: the `JAL` targets `0x100000B0` (word address `0x0400002C`), which lies in the eight-word line starting at `0x100000A0`. To track the instruction fetch, compare the waveform against the assembly code.

### cpu wb cycles interaction with user project area

* In `caravel.v`, for `user_project_wrapper mprj`:

```verilog
.wbs_cyc_i(mprj_cyc_o_user),
.wbs_stb_i(mprj_stb_o_user),
.wbs_we_i(mprj_we_o_user),
.wbs_sel_i(mprj_sel_o_user),
.wbs_adr_i(mprj_adr_o_user),
.wbs_dat_i(mprj_dat_o_user),
.wbs_ack_o(mprj_ack_i_user),
.wbs_dat_o(mprj_dat_i_user),
```

![image](https://hackmd.io/_uploads/ryrGC8MCkx.png)

* From the waveform above we can see that `wbs_cyc_i == 0`, meaning that the Wishbone master (CPU) doesn't transfer data to the slave (user_proj).

### cpu interface with user project with la

![image](https://hackmd.io/_uploads/S1S_evGRyx.png)

* As already known, if `la_write == 0` the counter works as usual, but when one or more bits of `la_write` are 1, the counter is updated with `la_data_in`.

### User project/RISC-V uses mprj pin, and interacts with Testbench

![image](https://hackmd.io/_uploads/S1z8PvGAJl.png)

* `mprj_io` is the port from the Caravel SoC, `io_out` is from user_project, and `checkbits` is what the testbench is testing.
* `assign checkbits = mprj_io[31:16];`
* `io_out` in user_project_wrapper.v -> `user_io_out = user_gpio_out` in gpio_control_block.v -> `pad_gpio_out = mprj_io_out` in chip_io.v -> `mprj_io_out = io_out` in mprj_io.v -> `io_out` is assigned to `io` -> `io = mprj_io` in chip_io.v and in caravel.v
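
The `checkbits` slice can be modeled in a few lines of C. This is a minimal sketch, assuming the value written to `reg_mprj_datal` appears unchanged on `mprj_io[31:0]`; the function name `checkbits_of` is hypothetical and only mirrors the testbench `assign`.

```c
#include <stdint.h>

/* Mirrors `assign checkbits = mprj_io[31:16];` from the testbench.
 * Assumption: the argument holds the low 32 bits of mprj_io. */
uint16_t checkbits_of(uint32_t mprj_io_low32)
{
    return (uint16_t)(mprj_io_low32 >> 16);
}
```

Under that assumption, the firmware flags `0xAB400000` and `0xAB410000` would show up as `checkbits` values `0xAB40` and `0xAB41`.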