# SoC Lab Workload-optimized SoC
Source code: [GitHub <i class="fa fa-external-link"></i>](https://github.com/dqrengg/SoC_Laboratory/tree/main/Lab-wlos)
## Overview
* HW/SW co-design to optimize SoC performance and workload under specific task assignments
* Implement [FIR <i class="fa fa-external-link"></i>](/rkfLNCYoye) accelerator, AXI bus, DMA, arbiter, and SDRAM controller
## Block Diagram

## SDRAM Controller
* The original SDRAM controller had only a basic FSM that followed the SDRAM timing.
* To improve memory bandwidth, the controller was redesigned to support **burst read** and **bank interleaving**.
### Controller Design
### Address Mapping Optimization for Interleaving
* Location of code and data in physical memory

* Memory address mapping for DRAM bank, row and column

* bit 0-1: `column[1:0]`. The memory is word-addressable, so these bits are ignored.
* bit 2-3: `column[3:2]`. The memory is set to burst read mode, so these 2 bits determine the read order within a burst.
* bit 4: `bank[0]`. Banks are interleaved every 4 words (= burst length).
* bit 5-8: `column[7:4]`
* bit 9-13: `row[4:0]`
* bit 14: `bank[1]`. Used for code/data interleaving. Defined in the firmware linker script.
* bit 15-22: `row[12:5]`. The bits are unused due to SDRAM capacity.
* During firmware compilation, the linker script places the code (`mprjram`) and data (`dataram`) segments in different banks.
```c
/* file name: ./firmware/section.lds */
/* line 11 to 21 */
MEMORY {
    /* ... */
    mprj    : ORIGIN = 0x30000000, LENGTH = 0x00100000
    mprjram : ORIGIN = 0x38000000, LENGTH = 0x00004000
    dataram : ORIGIN = 0x38004000, LENGTH = 0x00004000
    /* ... */
}
```
Global variables are then assigned to the `.dataram` section with a section attribute.
```c
// file name: ./testbench/wlos/fir.c
// line 5 to 27
int taps[16] __attribute__((section(".dataram"))) = { /* ... */ };
int X[N] __attribute__((section(".dataram"))) = { /* ... */ };
int Y[N] __attribute__((section(".dataram")));
```
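The bit layout listed in this section can be sketched as a small decode function, to check that the two linker origins really fall in different banks. This is only an illustrative host-side model, not the controller's RTL: the names `sdram_addr_t` and `sdram_decode` are hypothetical, and the sketch assumes that bit 14 is the second bank bit (`bank[1]`) and that bits 0-1 are kept as the low column bits even though the controller ignores them.

```c
#include <assert.h>
#include <stdint.h>

/* Field widths taken from the bit list in this section. */
enum { COL_BITS = 8, BANK_BITS = 2, ROW_BITS = 13 };
static_assert(COL_BITS + BANK_BITS + ROW_BITS == 23,
              "fields must cover address bits 0-22");

/* Hypothetical container for a decoded address. */
typedef struct {
    uint8_t  bank;    /* bank[1:0] */
    uint16_t row;     /* row[12:0] */
    uint8_t  column;  /* column[7:0]; column[1:0] is the byte offset */
} sdram_addr_t;

/* Split a byte address into bank, row and column fields. */
static sdram_addr_t sdram_decode(uint32_t addr)
{
    uint32_t col_lo = addr & 0xFu;           /* bits 0-3   -> column[3:0] */
    uint32_t bank0  = (addr >> 4) & 0x1u;    /* bit 4      -> bank[0]     */
    uint32_t col_hi = (addr >> 5) & 0xFu;    /* bits 5-8   -> column[7:4] */
    uint32_t row_lo = (addr >> 9) & 0x1Fu;   /* bits 9-13  -> row[4:0]    */
    uint32_t bank1  = (addr >> 14) & 0x1u;   /* bit 14     -> bank[1] (assumed) */
    uint32_t row_hi = (addr >> 15) & 0xFFu;  /* bits 15-22 -> row[12:5]   */

    sdram_addr_t a;
    a.bank   = (uint8_t)((bank1 << 1) | bank0);
    a.column = (uint8_t)((col_hi << 4) | col_lo);
    a.row    = (uint16_t)((row_hi << 5) | row_lo);
    return a;
}
```

Under these assumptions, `sdram_decode(0x38000000)` (the `mprjram` origin) and `sdram_decode(0x38004000)` (the `dataram` origin) differ only in `bank[1]`, while two consecutive 4-word bursts such as `0x38004000` and `0x38004010` differ only in `bank[0]`.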
* In addition, the `taps` array is extended to 16 entries, with the unused entries filled with 0s. This keeps the start address of the following array at a multiple of 4 words, so that it can take advantage of burst read.
```c
// file name: ./testbench/wlos/fir.c
// line 5 to 8
int taps[16] __attribute__((section(".dataram"))) = {
    0, -10, -9, 23, 56, 63, 56, 23, -9, -10, 0,
    /* 11-15 unused, filled with 0 */ 0, 0, 0, 0, 0
};
```
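The effect of the padding on alignment can be checked with a little address arithmetic. The sketch below is a host-side illustration only: the helper names are hypothetical, and it assumes that the linker places the arrays back to back in declaration order, starting at the `dataram` origin, with 4-byte `int`s and a burst length of 4 words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BURST_WORDS 4u  /* burst length in words, from the address mapping */
#define WORD_BYTES  4u  /* assumed sizeof(int) on the target */

static_assert((BURST_WORDS & (BURST_WORDS - 1u)) == 0u,
              "burst length must be a power of two");

/* Round an element count up to the next multiple of the burst length. */
static size_t round_up_to_burst(size_t n_words)
{
    return (n_words + BURST_WORDS - 1u) / BURST_WORDS * BURST_WORDS;
}

/* True if a byte address is the first word of a burst. */
static int is_burst_aligned(uint32_t byte_addr)
{
    return (byte_addr & (BURST_WORDS * WORD_BYTES - 1u)) == 0u;
}

/* Address of the array that directly follows an n_words-long array at base. */
static uint32_t next_array_addr(uint32_t base, size_t n_words)
{
    return base + (uint32_t)(n_words * WORD_BYTES);
}
```

With an 11-entry `taps` array at `0x38004000`, `next_array_addr` gives `0x3800402C`, which is not burst-aligned. With 16 entries it gives `0x38004040`, which is. Any multiple of 4 would satisfy this check (`round_up_to_burst(11)` is 12); the source pads to 16.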
## DMA
* The DMA transfers data directly between the FIR accelerator and memory, which offloads the CPU.
* Workflow without and with DMA

* Data stream without and with DMA

## Future Improvement