# Caravel SoC Summary
[Reference of SoC](https://github.com/bol-edu/caravel-soc_fpga/tree/main)
## File Types: .v, .S/.s, .c, .ld/.lds, Makefile
### Hardware Design (.v):
* Role: Describes the hardware, including the SoC's structure and peripherals.
* Relationship: These files define the SoC's architecture and hardware components, which will later interface with the software components (such as the processor, UART, GPIO).
* Flow: These files are synthesized into logic gates and layout designs, and they determine how hardware behaves during execution.
### Assembly Code (.S, .s):
* Role: Contains low-level boot code that initializes the system, setting up the processor, memory, and peripherals.
* Relationship: These files interact directly with the hardware. The `.S` files may include macros or higher-level assembly for initializing the SoC's processor (e.g., setting up the stack pointer, jumping to the main program).
* Flow: These files are the first to execute upon power-up, before jumping to C code for further system setup.
* **`.S` file**: An assembly file that still contains macros and preprocessor directives. During the build it is first run through the C preprocessor (for example, to expand macros) and only then assembled.
* **`.s` file**: An assembly file that needs no preprocessing. It is pure assembly language with no macros or preprocessor directives, ready to be passed directly to the assembler.
### C Source Code (.c):
* Role: Contains higher-level software that controls peripherals, implements drivers, and runs application logic.
* Relationship: The C code interacts with the hardware through register mappings and peripheral drivers. It works with the system initialization performed by assembly code (e.g., `.S`), and it depends on the hardware layout defined in `.v`.
* Flow: The C code is compiled and linked into an executable, which will run on the processor once the boot process completes.
### Linker Script (.ld, .lds):
* Role: Defines memory mapping for various code sections, including where the text (code), data, and stack will reside in memory.
* Relationship: The linker script is used during the build process to organize the program's memory layout. It ensures that the C code and assembly code are placed in the correct locations in memory.
* Flow: The linker script is used by the linker to generate the final executable, dictating where different sections (e.g., .text, .data) will be located in memory.
* `.ld` and `.lds` are essentially the same: both are GNU linker scripts.
### Makefile (.mak):
* Role: Manages the build process by specifying how to compile the code, which files to compile, and how to link everything together.
* Relationship: The Makefile ties everything together, specifying the dependencies between .c, .S, .v, and linker scripts. It ensures that the proper compilation and linking steps are taken.
* Flow: The Makefile executes the build process, compiling source files and linking them into a final executable.
### Flowchart

## Analysis Flow (ex. counter_la)
### run_xsim
#path : testbench/counter_la/run_xsim
```
rm -f counter_la.hex
rm -rf xsim.dir/ *.log *.pb *.jou *.wdb
riscv32-unknown-elf-gcc -Wl,--no-warn-rwx-segments -g \
-I../../firmware \
-march=rv32i -mabi=ilp32 -D__vexriscv__ \
-Wl,-Bstatic,-T,../../firmware/sections.lds,--strip-discarded \
-ffreestanding -nostdlib -o counter_la.elf ../../firmware/crt0_vex.S ../../firmware/isr.c counter_la.c
riscv32-unknown-elf-objcopy -O verilog counter_la.elf counter_la.hex
# to fix flash base address
sed -ie 's/@10/@00/g' counter_la.hex
rm -f counter_la.elf counter_la.hexe
xvlog -d FUNCTIONAL -d SIM -d DUNIT_DELAY=#1 -d USE_POWER_PINS -f ./include.rtl.list.xsim counter_la_tb.v
xelab -top counter_la_tb -snapshot counter_la_tb_elab
xsim counter_la_tb_elab -R
```
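* `rm -f` and `rm -rf` delete the outputs of any previous run (the old `.hex` file, the `xsim.dir/` directory, and the simulator logs) so the build starts clean.
* The `riscv32-unknown-elf-gcc` command invokes the **RISC-V GCC cross-compiler** to compile and link the firmware for the RISC-V architecture.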
* `-I../../firmware`: Adds `../../firmware` directory to the list of include paths for header files.
* `-T,../../firmware/sections.lds`: Specifies a custom linker script (`sections.lds`) to define memory sections for the link process.
* `-ffreestanding`: Tells the compiler that the program is freestanding, meaning it is not running in a standard operating system environment.
* `-nostdlib`: Tells the compiler driver not to link the standard startup files and libraries, so the startup code has to come from `crt0_vex.S`.
* `-o counter_la.elf`: Specifies the output file name as `counter_la.elf`.
* `../../firmware/crt0_vex.S ../../firmware/isr.c counter_la.c`: These are source files; `crt0_vex.S` for providing the startup code to initialize the system; `isr.c` for providing the interrupt service routines; `counter_la.c` contains the main application logic.
### include.rtl.list.xsim
#path : testbench/counter_la/include.rtl.list.xsim
```
## Headers
../../rtl/header/defines.v
../../rtl/header/user_defines.v
## User project
../../rtl/user/user_project_wrapper.v
../../rtl/user/user_proj_example.counter.v
## VIP
../../vip/tbuart.v
../../vip/bram.v
##../../vip/spiflash.v
## DFFRAM Behavioral Model
../../vip/RAM256.v
../../vip/RAM128.v
## Mgmt Core Wrapper
../../rtl/soc-efabless/mgmt_core.v
../../rtl/soc-efabless/mgmt_core_wrapper.v
../../rtl/soc-efabless/VexRiscv_MinDebugCache.v
## These blocks need to stay in RTL
../../rtl/soc/mprj_io.v
## These blocks only needed for RTL sims
../../rtl/soc-efabless/housekeeping_spi.v
../../rtl/soc/spiflash.v
../../rtl/soc/chip_io.v
../../rtl/soc/gpio_control_block.v
../../rtl/soc/gpio_defaults_block.v
../../rtl/soc/housekeeping.v
../../rtl/soc/caravel.v
```

### sections.lds
#path : firmware/sections.lds
```
/* INCLUDE ../../generated/output_format.ld */
OUTPUT_FORMAT("elf32-littleriscv")
ENTRY(_start)
__DYNAMIC = 0;
/* INCLUDE ../../generated/regions.ld */
MEMORY {
vexriscv_debug : ORIGIN = 0xf00f0000, LENGTH = 0x00000100
dff : ORIGIN = 0x00000000, LENGTH = 0x00000400
dff2 : ORIGIN = 0x00000400, LENGTH = 0x00000200
flash : ORIGIN = 0x10000000, LENGTH = 0x01000000
mprj : ORIGIN = 0x30000000, LENGTH = 0x00100000
hk : ORIGIN = 0x26000000, LENGTH = 0x00100000
csr : ORIGIN = 0xf0000000, LENGTH = 0x00010000
}
SECTIONS
{
.text :
{
_ftext = .;
/* Make sure crt0 files come first, and they, and the isr */
/* don't get disposed of by greedy optimisation */
*crt0*(.text)
KEEP(*crt0*(.text))
KEEP(*(.text.isr))
*(.text .stub .text.* .gnu.linkonce.t.*)
_etext = .;
} > flash
.rodata :
{
. = ALIGN(8);
_frodata = .;
*(.rodata .rodata.* .gnu.linkonce.r.*)
*(.rodata1)
. = ALIGN(8);
_erodata = .;
} > flash
.data :
{
. = ALIGN(8);
_fdata = .;
*(.data .data.* .gnu.linkonce.d.*)
*(.data1)
_gp = ALIGN(16);
*(.sdata .sdata.* .gnu.linkonce.s.*)
. = ALIGN(8);
_edata = .;
} > dff AT > flash
.bss :
{
. = ALIGN(8);
_fbss = .;
*(.dynsbss)
*(.sbss .sbss.* .gnu.linkonce.sb.*)
*(.scommon)
*(.dynbss)
*(.bss .bss.* .gnu.linkonce.b.*)
*(COMMON)
. = ALIGN(8);
_ebss = .;
_end = .;
} > dff
}
PROVIDE(_fstack = ORIGIN(dff2) + LENGTH(dff2));
PROVIDE(_fdata_rom = LOADADDR(.data));
PROVIDE(_edata_rom = LOADADDR(.data) + SIZEOF(.data));
```
:::info
In the `MEMORY` block, the region names can be changed freely, but every region must correspond to a physical address range that actually exists in the SoC.
:::
* `.text` section: holds the program's executable code.
* `> flash` means this section will be loaded into the flash memory region, as defined earlier: `flash : ORIGIN = 0x10000000, LENGTH = 0x01000000`
* `*crt0*(.text)` means "Include any .text section content from any file whose name matches `crt0*`."
* `KEEP(*crt0*(.text))`: This forces the linker to keep the matched sections, even if they appear unused. Normally, unused sections might be optimized away by the linker.
* `*(.text .stub .text.* .gnu.linkonce.t.*)`: Collects all executable code, including normal functions (`.text`), startup helpers or trampolines (`.stub`), specialized sub-sections of code (`.text.*`, e.g. `.text.main`), and deduplicated functions (e.g., C++ templates) from GNU's link-once mechanism (`.gnu.linkonce.t.*`).
* `. = ALIGN(8)`: Advances the location counter (`.`) to the next 8-byte boundary. For example, if the location counter is at `0x1003`, it moves to `0x1008`.
* `_fstack = 0x00000600`: This follows from `PROVIDE(_fstack = ORIGIN(dff2) + LENGTH(dff2))`, i.e. `0x00000400 + 0x00000200`.
* `dff AT > flash` means that the `.data` section lives in the `dff` region at run time, but its initial contents are stored in the `flash` region. The data is kept in `flash` (external storage), and when the program starts, the startup code copies it to `dff` (internal memory) so that it can be read and modified.
* Because of `dff AT > flash`, the values of `_fdata_rom` and `_edata_rom` are decided by the linker. To see the actual addresses, inspect a disassembly dump by adding this command to `run_xsim` (before the line that deletes `counter_la.elf`): `riscv32-unknown-elf-objdump -D counter_la.elf > dump.out`
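The `ALIGN(8)` and `_fstack` arithmetic described above can be checked with a small C sketch. The helper names `align_up` and `region_end` are hypothetical and exist only for illustration; the constants used to exercise them come from the `MEMORY` block of `sections.lds`.

```c
#include <stdint.h>

/* Hypothetical helper: round addr up to the next multiple of align
 * (align must be a power of two), like `. = ALIGN(align)` in a linker script. */
static uint32_t align_up(uint32_t addr, uint32_t align)
{
    return (addr + align - 1u) & ~(align - 1u);
}

/* Hypothetical helper: ORIGIN(region) + LENGTH(region). */
static uint32_t region_end(uint32_t origin, uint32_t length)
{
    return origin + length;
}
```

With these helpers, `align_up(0x1003, 8)` gives `0x1008`, and `region_end(0x00000400, 0x00000200)` gives `0x00000600`, the value of `_fstack`.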
:::info
`PROVIDE(_fdata_rom = LOADADDR(.data));` defines `_fdata_rom` as the load address of `.data`, i.e. where its initial contents are stored in `flash`.
:::
:::info
Functions or executable code → `.text` section
Constant values (e.g., const variables) → `.rodata` section
Initialized global or static variables → `.data` section
Uninitialized global or static variables → `.bss` section
:::
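As a rough illustration of the mapping above, the following C sketch annotates where each object would normally be placed. The names (`magic`, `threshold`, `hits`, `over_threshold`) are made up for this example and do not come from the firmware. The exact placement can also vary with compiler options: on RISC-V, small objects often go to `.sdata`/`.sbss`, which `sections.lds` folds into the `.data` and `.bss` output sections.

```c
#include <stdint.h>

const uint32_t magic = 0xAB400000u; /* constant             -> .rodata (stays in flash)        */
uint32_t threshold = 0x1F4u;        /* initialized global   -> .data (copied flash -> dff)     */
uint32_t hits;                      /* uninitialized global -> .bss (zeroed by startup code)   */

/* The function body itself is placed in .text (flash). */
uint32_t over_threshold(uint32_t value)
{
    static uint32_t calls;          /* uninitialized static -> .bss */
    calls++;
    if (value > threshold)
        hits++;
    return hits;
}
```

Running `riscv32-unknown-elf-objdump -D` on the linked ELF is the reliable way to confirm where each symbol actually ends up.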
### crt0_vex.S
#path : firmware/crt0_vex.S
```
.global main
.global isr
.global _start
_start:
j crt_init
nop
nop
nop
nop
nop
nop
nop
.global trap_entry
trap_entry:
sw x1, - 1*4(sp)
sw x5, - 2*4(sp)
sw x6, - 3*4(sp)
sw x7, - 4*4(sp)
sw x10, - 5*4(sp)
sw x11, - 6*4(sp)
sw x12, - 7*4(sp)
sw x13, - 8*4(sp)
sw x14, - 9*4(sp)
sw x15, -10*4(sp)
sw x16, -11*4(sp)
sw x17, -12*4(sp)
sw x28, -13*4(sp)
sw x29, -14*4(sp)
sw x30, -15*4(sp)
sw x31, -16*4(sp)
addi sp,sp,-16*4
call isr
lw x1 , 15*4(sp)
lw x5, 14*4(sp)
lw x6, 13*4(sp)
lw x7, 12*4(sp)
lw x10, 11*4(sp)
lw x11, 10*4(sp)
lw x12, 9*4(sp)
lw x13, 8*4(sp)
lw x14, 7*4(sp)
lw x15, 6*4(sp)
lw x16, 5*4(sp)
lw x17, 4*4(sp)
lw x28, 3*4(sp)
lw x29, 2*4(sp)
lw x30, 1*4(sp)
lw x31, 0*4(sp)
addi sp,sp,16*4
mret
.text
crt_init:
la sp, _fstack
la a0, trap_entry
csrw mtvec, a0
data_init:
la a0, _fdata
la a1, _edata
la a2, _fdata_rom
data_loop:
beq a0,a1,data_done
lw a3,0(a2)
sw a3,0(a0)
add a0,a0,4
add a2,a2,4
j data_loop
data_done:
bss_init:
la a0, _fbss
la a1, _ebss
bss_loop:
beq a0,a1,bss_done
sw zero,0(a0)
add a0,a0,4
#ifndef SIM
j bss_loop
#endif
bss_done:
li a0, 0x880 //880 enable timer + external interrupt sources (until mstatus.MIE is set, they will never trigger an interrupt)
csrw mie,a0
call main
infinit_loop:
j infinit_loop
```
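1. `_start` jumps to `crt_init`, which loads `sp` with `_fstack` (`0x00000600`, the end of `dff2`) and writes the address of `trap_entry` to `mtvec`.
2. When an interrupt occurs, execution goes to `trap_entry`, which saves the caller-saved registers, calls `isr`, restores the registers, and returns with `mret`.
3. `data_init` copies `.data` word by word from `_fdata_rom` (in `flash`, base `0x10000000`) to `_fdata ~ _edata` (in `dff`) until the destination pointer reaches `_edata`.
4. `bss_init` zero-initializes the uninitialized global variables between `_fbss` and `_ebss`.
5. After the timer and external interrupt sources are enabled in `mie`, the `call main` instruction enters `counter_la.c`.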
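Steps 3 and 4 can be sketched in C as follows. This is only a model of what the assembly loops do, not code from the repository: `copy_data` and `zero_bss` are hypothetical names, and the pointer arguments stand in for the linker symbols `_fdata`, `_edata`, `_fdata_rom`, `_fbss`, and `_ebss`. The two `MIE_*` macros are also introduced here; they name the standard RISC-V `mie` bit positions that make up the `0x880` constant.

```c
#include <stdint.h>

#define MIE_MTIE (1u << 7)   /* machine timer interrupt enable    */
#define MIE_MEIE (1u << 11)  /* machine external interrupt enable */

/* Models data_loop: copy words from src (flash) until dst reaches dst_end. */
static void copy_data(uint32_t *dst, const uint32_t *dst_end, const uint32_t *src)
{
    while (dst != dst_end)
        *dst++ = *src++;
}

/* Models bss_loop: store zero until p reaches end. */
static void zero_bss(uint32_t *p, const uint32_t *end)
{
    while (p != end)
        *p++ = 0u;
}
```

Both loops compare pointers for equality, just like the `beq a0,a1,...` instructions, so they rely on the section bounds being word-aligned, which the `ALIGN(8)` statements in `sections.lds` provide.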
### counter_la.c
#path : testbench/counter_la/counter_la.c
```clike
// Set UART clock to 64 kbaud (enable before I/O configuration)
// reg_uart_clkdiv = 625;
reg_uart_enable = 1;
// Now, apply the configuration
reg_mprj_xfer = 1;
while (reg_mprj_xfer == 1);
// Configure LA probes [31:0], [127:64] as inputs to the cpu
// Configure LA probes [63:32] as outputs from the cpu
reg_la0_oenb = reg_la0_iena = 0x00000000; // [31:0]
reg_la1_oenb = reg_la1_iena = 0xFFFFFFFF; // [63:32]
reg_la2_oenb = reg_la2_iena = 0x00000000; // [95:64]
reg_la3_oenb = reg_la3_iena = 0x00000000; // [127:96]
// Flag start of the test
reg_mprj_datal = 0xAB400000;
// Set Counter value to zero through LA probes [63:32]
reg_la1_data = 0x00000000;
// Configure LA probes from [63:32] as inputs to disable counter write
reg_la1_oenb = reg_la1_iena = 0x00000000;
while (1) {
if (reg_la0_data_in > 0x1F4) {
reg_mprj_datal = 0xAB410000;
break;
}
}
//print("\n");
//print("Monitor: Test 1 Passed\n\n"); // Makes simulation very long!
reg_mprj_datal = 0xAB510000;
```
* The upper part of the C code is omitted here. It configures the upper GPIO pins as outputs accessible to the management SoC and the lower GPIO pins as outputs accessible to the user project:
* `reg_mprj_io[31:16] = GPIO_MODE_MGMT_STD_OUTPUT`
* `reg_mprj_io[15:7 & 5:0] = GPIO_MODE_USER_STD_OUTPUT`
* `reg_mprj_io[6] = UART Tx line`
* `#define reg_mprj_datal (*(volatile uint32_t*)0x2600000c)`
* The `b` in `reg_la0_oenb` stands for "bar" or "negated", which means the signal is **active low**. When a bit in `oenb` is 0, the corresponding LA line is enabled for output (i.e., user_project drives it). When the bit is 1, the output is disabled (tri-stated).
* Similarly, the `iena` register is **active high**, meaning when a bit is 1, the corresponding LA line is enabled for input (i.e., user_project reads from it).
### counter_la_tb.v
#path : testbench/counter_la/counter_la_tb.v
```verilog
module counter_la_tb;
reg clock;
reg RSTB;
reg CSB;
reg power1, power2;
wire gpio;
wire uart_tx;
wire [37:0] mprj_io;
wire [15:0] checkbits;
assign checkbits = mprj_io[31:16];
assign uart_tx = mprj_io[6];
always #12.5 clock <= (clock === 1'b0);
initial begin
clock = 0;
end
```
* `mprj_io[31:16]` carries the upper 16 bits of `reg_mprj_datal`, but the connection between the `.c` side and the `.v` side has not been traced yet.
```verilog
initial begin
wait(checkbits == 16'hAB40);
$display("LA Test 1 started");
wait(checkbits == 16'hAB41);
wait(checkbits == 16'hAB51);
$display("LA Test 2 passed");
#10000;
$finish;
end
```
* `checkbits` tracks the upper 16 bits of `reg_mprj_datal` written in `counter_la.c`: `0xAB40`, then `0xAB41`, then `0xAB51`.
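The relation between the firmware writes and the testbench `wait` conditions can be sketched in C. This assumes, as the notes above do, that `mprj_io[31:0]` mirrors `reg_mprj_datal` bit for bit; `checkbits_of` is a hypothetical helper, not part of the firmware.

```c
#include <stdint.h>

/* Hypothetical helper: the 16 bits the testbench observes on mprj_io[31:16]
 * for a given value written to reg_mprj_datal. */
static uint16_t checkbits_of(uint32_t datal)
{
    return (uint16_t)(datal >> 16);
}
```

Under that assumption, the three writes `0xAB400000`, `0xAB410000`, and `0xAB510000` map to the three values the testbench waits for.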
```verilog
caravel uut (
.clock (clock),
.gpio (gpio),
.mprj_io (mprj_io),
.flash_csb(flash_csb),
.flash_clk(flash_clk),
.flash_io0(flash_io0),
.flash_io1(flash_io1),
.resetb (RSTB)
);
spiflash spiflash (
.ap_clk(clock),
.ap_rst(RSTB),
.romcode_Addr_A(romcode_Addr_A),
.romcode_EN_A(romcode_EN_A),
.romcode_WEN_A(romcode_WEN_A),
.romcode_Din_A(romcode_Din_A),
.romcode_Dout_A(romcode_Dout_A),
.romcode_Clk_A(romcode_Clk_A),
.romcode_Rst_A(romcode_Rst_A),
.csb(flash_csb),
.spiclk(flash_clk),
.io0(flash_io0),
.io1(flash_io1)
);
```
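* This shows the interaction between the CPU (inside `caravel`) and `spiflash`: the two are connected through `flash_csb`, `flash_clk`, `flash_io0`, and `flash_io1`.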
```verilog
bram #(
.FILENAME("counter_la.hex")
) bram (
.CLK(clock),
.WE0(romcode_WEN_A),
.EN0(romcode_EN_A),
.Di0(romcode_Din_A),
.Do0(romcode_Dout_A),
.A0(romcode_Addr_A)
);
```
:::info
Very important: in the Caravel Verilog simulation, the `spiflash` module emulates an external SPI flash that holds the compiled program (here `counter_la.hex`, loaded into `bram`). Because `sections.lds` places `.text` in the `flash` region, the CPU fetches its instructions from this flash over SPI; only `.data` is copied into RAM by `crt0_vex.S`.
:::
### spiflash.v
#path : rtl/soc/spiflash.v
```verilog
assign io1 = outbuf[7];
assign romcode_Addr_A = {8'b0, spi_addr};
assign romcode_Din_A = 32'b0;
assign romcode_EN_A = (bytecount >= 4);
assign romcode_WEN_A = 4'b0;
assign romcode_Clk_A = ap_clk;
assign romcode_Rst_A = ap_rst;
```
* `romcode_WEN_A = 4'b0` and `romcode_Din_A = 32'b0` indicate that the BRAM is only ever read, never written.
:::warning
Here the BRAM serves as the storage inside the spiflash on the FPGA.
:::
```verilog
wire [7:0] memory;
assign memory = (spi_addr[1:0] == 2'b00) ? romcode_Dout_A[7:0] :
(spi_addr[1:0] == 2'b01) ? romcode_Dout_A[15:8] :
(spi_addr[1:0] == 2'b10) ? romcode_Dout_A[23:16] :
romcode_Dout_A[31:24] ;
```
* `spi_addr[1:0]` selects which byte of the 32-bit word `romcode_Dout_A` is presented as `memory`, the byte that spiflash sends out next.

| `spi_addr[1:0]` | Selected byte | Source bits             |
|:---------------:|:-------------:| ----------------------- |
| `2'b00`         | Byte0         | `romcode_Dout_A[7:0]`   |
| `2'b01`         | Byte1         | `romcode_Dout_A[15:8]`  |
| `2'b10`         | Byte2         | `romcode_Dout_A[23:16]` |
| `2'b11`         | Byte3         | `romcode_Dout_A[31:24]` |
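The byte selection in the table can be expressed compactly in C. `select_byte` is a hypothetical name used only for this sketch; it models the `memory` multiplexer, not any function in the repository.

```c
#include <stdint.h>

/* Models the `memory` mux: pick byte (spi_addr % 4) of the 32-bit BRAM word,
 * with byte 0 in the least significant position. */
static uint8_t select_byte(uint32_t dout, uint32_t spi_addr)
{
    return (uint8_t)(dout >> (8u * (spi_addr & 3u)));
}
```

Consecutive byte addresses therefore walk through a word from its least significant byte upward, which matches a little-endian layout of the program image.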
```verilog
always @(negedge spiclk or posedge csb) begin
if(csb) begin
outbuf <= 0;
end else begin
outbuf <= {outbuf[6:0],1'b0};
if(bitcount == 0 && bytecount >= 4) begin
outbuf <= memory;
end
end
end
```
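* `csb = 1` means that spiflash is deselected, so `outbuf` is cleared.
* On each falling edge of `spiclk`, `outbuf` shifts left by one bit so that its MSB (`outbuf[7]`) presents the next bit on `io1`. At a byte boundary (`bitcount == 0` and `bytecount >= 4`) it is reloaded from `memory` instead.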
```verilog
wire [7:0] buffer_next = {buffer[6:0], io0};
always @(posedge spiclk or posedge csb) begin // csb deassert -> reset internal states
if (csb) begin
buffer <= 0;
bitcount <= 0;
bytecount <= 0;
end else begin // csb active -> count bit, byte
buffer <= buffer_next;
bitcount <= bitcount + 1;
if (bitcount == 7) begin
bitcount <= 0;
bytecount <= bytecount + 1;
// spi_action;
if(bytecount == 0) spi_cmd <= buffer_next; // command
if(bytecount == 1) spi_addr[23:16] <= buffer_next;
if(bytecount == 2) spi_addr[15:8] <= buffer_next;
if(bytecount == 3) spi_addr[7:0] <= buffer_next;
if(bytecount >= 4 && spi_cmd == 'h03) begin
// buffer <= memory;
spi_addr <= spi_addr + 1;
end
end
end
end
```
* The first byte shifted into `buffer` is the command; the bytes received at `bytecount == 1~3` form the 24-bit address used to fetch data from BRAM.
* Because `bytecount` never wraps back to 0 while `csb` stays low, **the first four `if` statements** run only once per transaction: they capture the command and the start address in BRAM. After that, each completed byte advances the address through `spi_addr <= spi_addr + 1;`.
| Name | Direction | Description | Clock edge |
|:------:|:---------:|:------------------------------------------:|:----------------------------:|
| buffer | input | Used to receive bits sent from the master | Received on `posedge spiclk` |
| outbuf | output | Used to send data bit-by-bit to the master | Sent on `negedge spiclk` |
* `io0` is the bit sent from the master. It is meaningful only during the initial command and address bytes.
* `spi_cmd == 'h03` is the read command.
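The receive side described above can be modeled in C to make the byte sequencing concrete. This is a behavioral sketch only: the type `spi_rx_t` and the functions `spi_rx_reset`, `spi_rx_clock`, and `spi_rx_byte` are hypothetical names, and the model assumes `spi_addr` is 24 bits wide, as implied by `{8'b0, spi_addr}`.

```c
#include <stdint.h>

typedef struct {
    uint8_t  buffer;     /* shift register fed from io0       */
    uint8_t  bitcount;   /* bits received in the current byte */
    uint32_t bytecount;  /* bytes received since csb went low */
    uint8_t  spi_cmd;    /* first byte of the transaction     */
    uint32_t spi_addr;   /* 24-bit byte address               */
} spi_rx_t;

/* csb high: clear the counters and the shift register. */
static void spi_rx_reset(spi_rx_t *s)
{
    s->buffer = 0;
    s->bitcount = 0;
    s->bytecount = 0;
}

/* One rising edge of spiclk with csb low. */
static void spi_rx_clock(spi_rx_t *s, unsigned io0)
{
    uint8_t next = (uint8_t)((s->buffer << 1) | (io0 & 1u));
    s->buffer = next;
    if (s->bitcount == 7u) {
        uint32_t bc = s->bytecount;          /* value before this edge */
        s->bitcount = 0;
        s->bytecount = bc + 1u;
        if (bc == 0u) s->spi_cmd = next;
        if (bc == 1u) s->spi_addr = (s->spi_addr & 0x00FFFFu) | ((uint32_t)next << 16);
        if (bc == 2u) s->spi_addr = (s->spi_addr & 0xFF00FFu) | ((uint32_t)next << 8);
        if (bc == 3u) s->spi_addr = (s->spi_addr & 0xFFFF00u) | (uint32_t)next;
        if (bc >= 4u && s->spi_cmd == 0x03u)
            s->spi_addr = (s->spi_addr + 1u) & 0xFFFFFFu;
    } else {
        s->bitcount++;
    }
}

/* Convenience: shift in one byte, MSB first. */
static void spi_rx_byte(spi_rx_t *s, uint8_t byte)
{
    for (int i = 7; i >= 0; i--)
        spi_rx_clock(s, (unsigned)(byte >> i) & 1u);
}
```

Feeding the bytes `0x03, 0x00, 0x01, 0x02` leaves `spi_cmd` at `0x03` and `spi_addr` at `0x000102`; every further byte time increments the address by one, which is what lets the master stream consecutive bytes without resending the address.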
```verilog
if(bitcount == 0 && bytecount >= 4) begin
outbuf <= memory;
end
```
* The `if` statement above is gated by `bytecount >= 4`, matching `assign romcode_EN_A = (bytecount >= 4);`, so the output buffer starts loading 8-bit data from BRAM only after the command and address have been received. The `bitcount == 0` condition ensures that `outbuf` is reloaded exactly when `memory` moves on to the next byte of data.
### user_proj_example.counter.v
#path : rtl/user/user_proj_example.counter.v
#### module counter
```verilog
module counter #(
parameter BITS = 32
)(
input clk,
input reset,
input valid,
input [3:0] wstrb,
input [BITS-1:0] wdata,
input [BITS-1:0] la_write,
input [BITS-1:0] la_input,
output reg ready,
output reg [BITS-1:0] rdata,
output reg [BITS-1:0] count
);
//reg ready;
//reg [BITS-1:0] count;
//reg [BITS-1:0] rdata;
always @(posedge clk) begin
if (reset) begin
count <= 0;
ready <= 0;
end else begin
ready <= 1'b0;
if (~|la_write) begin
count <= count + 1;
end
if (valid && !ready) begin
ready <= 1'b1;
rdata <= count;
if (wstrb[0]) count[7:0] <= wdata[7:0];
if (wstrb[1]) count[15:8] <= wdata[15:8];
if (wstrb[2]) count[23:16] <= wdata[23:16];
if (wstrb[3]) count[31:24] <= wdata[31:24];
end else if (|la_write) begin
count <= la_write & la_input;
end
end
end
endmodule
```
* `|la_write` is a reduction OR over all bits of `la_write`. So `~|la_write` is equivalent to `la_write == 0`, and `|la_write` is equivalent to `la_write != 0`.
* `la_write` is the control signal for `count <= la_write & la_input;`: when `la_write` is nonzero, `count` is loaded with `la_input` masked by `la_write`, so bits where `la_write` is 1 take the `la_input` value and bits where it is 0 become 0.
* It is important that **`valid`, `wstrb`, and `wdata` come from Wishbone**, while **`la_write` and `la_input` come from the Logic Analyzer (CPU)**.
* To summarize: when the LA is not writing, `count` behaves as a free-running counter. When a Wishbone handshake occurs (Wishbone can be thought of as similar to AXI-Lite), the current `count` is returned on `rdata`, and if any `wstrb` bit is set (a write), the selected bytes of `count` are updated from `wdata` at the next clock edge.
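The behavior summarized above can be approximated with a cycle-level C model of the `always` block. This is a sketch for reasoning about the logic, not a replacement for simulation: `counter_t` and `counter_step` are hypothetical names, and each call represents one rising clock edge with non-blocking assignment semantics (next-state values are computed from the old state).

```c
#include <stdint.h>

typedef struct {
    uint32_t count;
    uint32_t rdata;
    unsigned ready;
} counter_t;

/* One posedge clk of module counter (BITS = 32). */
static void counter_step(counter_t *s, unsigned reset, unsigned valid,
                         unsigned wstrb, uint32_t wdata,
                         uint32_t la_write, uint32_t la_input)
{
    if (reset) {
        s->count = 0;
        s->ready = 0;
        return;
    }
    uint32_t next_count = s->count;
    unsigned next_ready = 0;
    if (la_write == 0u)                      /* ~|la_write: free-running count */
        next_count = s->count + 1u;
    if (valid && !s->ready) {                /* Wishbone access */
        next_ready = 1;
        s->rdata = s->count;                 /* read returns the old count */
        for (unsigned i = 0; i < 4u; i++) {
            if (wstrb & (1u << i)) {         /* byte-wise write overrides */
                uint32_t mask = 0xFFu << (8u * i);
                next_count = (next_count & ~mask) | (wdata & mask);
            }
        }
    } else if (la_write != 0u) {             /* LA write */
        next_count = la_write & la_input;
    }
    s->count = next_count;
    s->ready = next_ready;
}
```

The later assignments win over the increment, mirroring how the last non-blocking assignment to `count` in the Verilog block takes effect.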
#### module user_proj_example
```verilog
assign valid = wbs_cyc_i && wbs_stb_i;
assign wstrb = wbs_sel_i & {4{wbs_we_i}};
assign wbs_dat_o = rdata;
assign wdata = wbs_dat_i;
// IO
assign io_out = count;
assign io_oeb = {(`MPRJ_IO_PADS-1){rst}};
// IRQ
assign irq = 3'b000; // Unused
// LA
assign la_data_out = {{(127-BITS){1'b0}}, count};
// Assuming LA probes [63:32] are for controlling the count register
assign la_write = ~la_oenb[63:32] & ~{BITS{valid}};
// Assuming LA probes [65:64] are for controlling the count clk & reset
assign clk = (~la_oenb[64]) ? la_data_in[64]: wb_clk_i;
assign rst = (~la_oenb[65]) ? la_data_in[65]: wb_rst_i;
```
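* A bit of `la_write` is 1 only when the corresponding bit of `la_oenb[63:32]` is 0 and `valid == 0`, meaning the LA is performing a write and no Wishbone access is in progress.
* `clk` and `rst` are driven by either the LA or Wishbone: when `la_oenb[64]` (or `la_oenb[65]`) is 0, the LA value is used; otherwise `wb_clk_i` (or `wb_rst_i`) is used.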
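The `la_write` expression can be checked with a one-line C equivalent. `la_write_mask` is a hypothetical helper for this note; its first argument stands for the 32-bit slice `la_oenb[63:32]`.

```c
#include <stdint.h>

/* Models: assign la_write = ~la_oenb[63:32] & ~{BITS{valid}}; (BITS = 32) */
static uint32_t la_write_mask(uint32_t la_oenb_63_32, unsigned valid)
{
    return ~la_oenb_63_32 & (valid ? 0u : 0xFFFFFFFFu);
}
```

For example, with `la_oenb[63:32]` all zero and no Wishbone access, every bit of `la_write` is 1; as soon as `valid` is asserted, `la_write` drops to 0 so the Wishbone path takes priority.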