---
title: Group 10 FPGA Lab 4 HW
---

# FPGA HW4 BRAM
###### tags: `FPGA2021`

Members in Group 10:
* E24066226 魏晉成
* E24066470 余采蓴

[TOC]

## Problem 1 Block RAM Utilization

In this lab, the RAMB36E1 primitive must be configured to meet the requirements listed below. This section explains which parameters have to be set to do so.

### Requirement Overview

1. Structure Diagram

```graphviz
digraph {
    rankdir="LR";
    zynq[label="ZYNQ\nProcessor" shape="box"];
    axi[label="AXI\nBUS" shape="component"];
    bram_ctrl[label="AXI\nBRAM\nController" shape="box"];
    bram[label="BRAM\nImplemented\nby Verilog\nTemplate" shape="box"];
    zynq->axi[dir="both"];
    axi->bram_ctrl[dir="both"];
    bram_ctrl->bram[dir="both"];
}
```

2. Design Details

| Data Width | Memory Size | RAM Mode       |
| ---------- | ----------- | -------------- |
| 32 bits    | 32 Kb       | True Dual Port |

3. Initial Contents

| Offset (bytes, decimal) | Content |
| ----------------------- | ------- |
| 0                       | 0x2330  |
| 4                       | 0x2454  |
| 28                      | 0x2379  |
| 64                      | 0x3034  |

### Details on Verilog Template to Meet the Requirements

1. Data Width: 32 bits

Because the module must support 32-bit reads and writes, the I/O ports of the wrapper named `BRAMSelf` are designed as below.

```verilog
module BRAMSelf(
    input         clk,
    input         rst,
    input         en_A,
    input  [3:0]  we_A,
    input  [11:0] addr,
    input  [31:0] data_in,
    output [31:0] data_out
);
```

The BRAM instance named `RAMB36E1_inst` is configured with the parameters below. The width is set to 36 rather than 32 because the primitive counts the 4 parity bits together with the 32 data bits.

```verilog
...
// READ_WIDTH_A/B, WRITE_WIDTH_A/B: Read/write width per port
.READ_WIDTH_A(36),   // 0-72
.READ_WIDTH_B(0),    // 0-36
.WRITE_WIDTH_A(36),  // 0-36
.WRITE_WIDTH_B(0),   // 0-72
...
```

2. Memory Size: 32 Kb

Instantiating only **one** RAMB36E1 is enough to meet the 32 Kb requirement: a single block provides 32 Kb of data storage (36 Kb including parity bits).

3. RAM Mode: TDP

To select the true dual port configuration, the BRAM instance named `RAMB36E1_inst` is configured with the parameter below.

```verilog
...
// RAM Mode: "SDP" or "TDP"
.RAM_MODE("TDP"),
...
```

4. Initial Content

To preset the initial contents of the BRAM, the `INIT_xx` parameters are set as below. Each `INIT_xx` value holds 256 bits, that is, eight 32-bit words with the lowest address on the right. Byte offset 28 is therefore the leftmost word of `INIT_00`, and byte offset 64 is the rightmost word of `INIT_02`.

```verilog
...
.INIT_00(256'h00002379_00000000_00000000_00000000_00000000_00000000_00002454_00002330),
.INIT_01(256'h0000000000000000000000000000000000000000000000000000000000000000),
.INIT_02(256'h0000000000000000000000000000000000000000000000000000000000003034),
...
```

### Verilog Template Implementation Details

The wrapper module `BRAMSelf` encloses `RAMB36E1_inst` and forwards its external connections to it. The connections from the wrapper to the RAMB36E1 instance are discussed below.

1. clk, rst

The clock and reset signals need no special handling. The signals declared in `BRAMSelf` as

```verilog
module BRAMSelf(
    input clk,
    input rst,
    ...
);
```

are forwarded to the instance `RAMB36E1_inst` as

```verilog
RAMB36E1 #(
    ...
) RAMB36E1_inst(
    ...
    .CLKARDCLK(clk),      // 1-bit input: A port clock/Read clock
    ...
    .RSTRAMARSTRAM(rst),  // 1-bit input: A port set/reset
    .RSTREGARSTREG(rst),  // 1-bit input: A port register set/reset
    ...
);
```

2. en_A, we_A

The enable signals are also forwarded directly to the RAMB36E1 instance. Note that the write enable signal is 4 bits wide, one bit per byte of the 32-bit word. The enable signals are declared in `BRAMSelf` as

```verilog
module BRAMSelf(
    ...
    input        en_A,
    input  [3:0] we_A,
    ...
);
```

and connected to `RAMB36E1_inst` as

```verilog
RAMB36E1 #(
    ...
) RAMB36E1_inst(
    ...
    .ENARDEN(en_A),      // 1-bit input: A port enable/Read enable
    .REGCEAREGCE(en_A),  // 1-bit input: A port register enable/Register enable
    ...
    .WEA(we_A),          // 4-bit input: A port write enable
    ...
);
```

3. addr

The wrapper's `addr` is a 12-bit byte address (4096 bytes = 32 Kb), whereas RAMB36E1 requires a 16-bit address. In the 36-bit-wide configuration the primitive uses address bits [14:5] as the word index, so the byte address is shifted left by three bits and the top bit is tied to 0 at instantiation. The address is received in `BRAMSelf` as

```verilog
module BRAMSelf(
    ...
    input [11:0] addr,
    ...
);
```

and connected to `RAMB36E1_inst` as

```verilog
RAMB36E1 #(
    ...
) RAMB36E1_inst(
    ...
    .ADDRARDADDR({1'b0, addr[11:0], 3'b000}),  // 16-bit input: A port address/Read address
    ...
);
```

4. data_in, data_out

The input and output data of the BRAM wrapper are forwarded as well. They are declared in `BRAMSelf` as

```verilog
module BRAMSelf(
    ...
    input  [31:0] data_in,
    output [31:0] data_out
);
```

and connected to `RAMB36E1_inst` as

```verilog
RAMB36E1 #(
    ...
) RAMB36E1_inst(
    ...
    .DOADO(data_out),  // 32-bit output: A port data/LSB data
    ...
    .DIADI(data_in),   // 32-bit input: A port data/LSB data
    ...
);
```

### Software Testbench for Read/Write Test

1. Preset Value

The statements below read the preset values from the BRAM. Note that the labels `0x28` and `0x64` printed by the last two statements are the decimal byte offsets 28 and 64 (0x1C and 0x40 in hexadecimal).

```c=7
printf("Designated preset data:\n");
printf("[TestR] 0x00: %x\n", *((uint32_t*)(XPAR_AXI_BRAM_CTRL_0_S_AXI_BASEADDR)));
printf("[TestR] 0x04: %x\n", *((uint32_t*)(XPAR_AXI_BRAM_CTRL_0_S_AXI_BASEADDR + 4)));
printf("[TestR] 0x28: %x\n", *((uint32_t*)(XPAR_AXI_BRAM_CTRL_0_S_AXI_BASEADDR + 28)));
printf("[TestR] 0x64: %x\n", *((uint32_t*)(XPAR_AXI_BRAM_CTRL_0_S_AXI_BASEADDR + 64)));
```

2. Write Test

The loop below reads each of the 4096 bytes, overwrites it with the low 8 bits of its own offset, and reads it back.

```c=12
printf("Sequential Per Byte R/W test:====================================================================\n");
int index = 0;
for(; index < 4096; index ++)
{
    printf("[TestR] Original %04x: %x\n", index, *((uint8_t*)(XPAR_AXI_BRAM_CTRL_0_S_AXI_BASEADDR + index)));
    *((uint8_t*)(XPAR_AXI_BRAM_CTRL_0_S_AXI_BASEADDR + index)) = index;
    printf("[TestR] Written %04x: %x\n", index, *((uint8_t*)(XPAR_AXI_BRAM_CTRL_0_S_AXI_BASEADDR + index)));
}
```

### The Suggested Way to Test This Project

* Prerequisite
  The bitstream **"......\FPGA_HW4_Group10\Problem1\bit\SelfBRAM.bit"** has been written to the PYNQ board.

1. Create Vitis Project

![](https://i.imgur.com/Li6FT8O.png)

![](https://i.imgur.com/WwD4aE4.png)

Create the project as shown above.

![](https://i.imgur.com/NB010CK.png)

Choose the xsa file in **"......\FPGA_HW4_Group10\Problem1\xsa\SelfBRAM.xsa"**.

![](https://i.imgur.com/ofG09cD.png)

![](https://i.imgur.com/xaLZg4l.png)

Naming this project **SelfBRAMTest** is suggested.
![](https://i.imgur.com/fuUGzns.png)

Following the standard process, the project is created successfully.

2. Import Testbench Program

![](https://i.imgur.com/6GOo5qX.png)

Next, import the testbench program we wrote.

![](https://i.imgur.com/zIs1l6N.png)

![](https://i.imgur.com/uNKJMNP.png)

![](https://i.imgur.com/UaPUSNu.png)

First, choose **"......\FPGA_HW4_Group10\Problem1\src"** as the import directory.

![](https://i.imgur.com/b67ncEw.png)

Then check the **testRW.c** file to import it.

![](https://i.imgur.com/JL0YNG4.png)

The file then appears in the editor.

3. Build and Execute Testbench Program

![](https://i.imgur.com/4CaeRNN.png)

First, build the C program to generate the ELF file.

![](https://i.imgur.com/aAQlXHZ.png)

The ELF file is generated successfully.

![](https://i.imgur.com/I0YBIUO.png)

![](https://i.imgur.com/AisTFcq.png)

Configure the project as a GDB debug application and run it.

![](https://i.imgur.com/nNkMiyX.png)

Afterward, the test output streams into the terminal.

4. Expected Output

* Read Preset Data

```=
Designated preset data:
[TestR] 0x00: 2330
[TestR] 0x04: 2454
[TestR] 0x28: 2379
[TestR] 0x64: 3034
```

* Read Previous Data -> Set -> Read Data after Setting (per byte)

```=6
Sequential Per Byte R/W test:====================================================================
[TestR] Original 0000: 30
[TestR] Written 0000: 0
[TestR] Original 0001: 23
[TestR] Written 0001: 1
[TestR] Original 0002: 0
[TestR] Written 0002: 2
[TestR] Original 0003: 0
[TestR] Written 0003: 3
[TestR] Original 0004: 54
[TestR] Written 0004: 4
[TestR] Original 0005: 24
```

* Read Previous Data -> Set -> Read Data after Setting (per half word)

```=8199
Sequential Per Half Word R/W test:====================================================================
[TestR] Original 0000: 100
[TestR] Written 0000: 0
[TestR] Original 0002: 302
[TestR] Written 0002: 2
[TestR] Original 0004: 504
[TestR] Written 0004: 4
[TestR] Original 0006: 706
[TestR] Written 0006: 6
[TestR] Original 0008: 908
[TestR] Written 0008: 8
```

* Read Previous Data -> Set -> Read Data after Setting (per word)

```=12296
Sequential Per Word R/W test:====================================================================
[TestR] Original 0000: 20000
[TestR] Written 0000: 0
[TestR] Original 0004: 60004
[TestR] Written 0004: 4
[TestR] Original 0008: a0008
[TestR] Written 0008: 8
[TestR] Original 000c: e000c
[TestR] Written 000c: c
```

## Problems

1. How much Block RAM capacity does the PYNQ-Z2 have in total?

According to the product selection guide, the Block RAM capacity of the Z-7020 is **4.9 Mbit** ([ref](https://www.xilinx.com/support/documentation/selection-guides/zynq-7000-product-selection-guide.pdf)). The exact capacity can be calculated as

$(32_{\text{data}} + 4_{\text{parity}}) \times 1024 \times 140_{\text{BRAMs in Z-7020}} = 5{,}160{,}960\ \text{bits} = 4.921875\ \text{Mibit}$

2. Following the previous question, how many RAMB36E1 blocks are there in total?

The number of RAMB36E1 blocks in the Z-7020 is **140** ([ref](https://www.xilinx.com/support/documentation/selection-guides/zynq-7000-product-selection-guide.pdf)).

3. To configure a RAMB36E1 as a 36 Kb FIFO, which Verilog template should be used?

Below we assume that **the width of the FIFO data is k, where k is a power of two** ($k=2^m$). We also assume that data is **written through port A and read from port B**, as in the diagram and the template that follow.

```graphviz
digraph {
    rankdir="LR";
    Input->PortA->BRAM->PortB->Output
}
```

```verilog
module FIFO #(
    parameter M = 5,        // log2 of the data width (m)
    parameter K = 1 << M    // data width in bits (k = 2^m, at most 32)
)(
    input          clk,
    input          rst,
    input  [3:0]   wr_en,
    input          rd_en,
    input  [K-1:0] data_in,
    output [K-1:0] data_out
);

    // RAMB36E1 widths include parity: 8/16/32 data bits map to 9/18/36.
    localparam WIDTH = (K <= 4) ? K : K + K / 8;

    // head is the write pointer, tail is the read pointer (in k-bit entries).
    reg [14-M:0] head, tail;

    always @(posedge clk or posedge rst) begin
        if (rst) begin
            head <= {(15-M){1'b0}};
            tail <= {(15-M){1'b0}};
        end
        else begin
            head <= (&wr_en) ? head + 1'b1 : head;
            tail <= rd_en    ? tail + 1'b1 : tail;
        end
    end

    RAMB36E1 #(
        .READ_WIDTH_A(0),       // Configured as 0 (port A only writes)
        .READ_WIDTH_B(WIDTH),   // Configured as k
        .WRITE_WIDTH_A(WIDTH),  // Configured as k
        .WRITE_WIDTH_B(0),      // Configured as 0 (port B only reads)
        .RAM_MODE("TDP")
    ) RAMB36E1_inst (
        .DOBDO(data_out),                       // Configured for FIFO: B port data output
        .ADDRARDADDR({1'b0, head, {M{1'b0}}}),  // Configured for FIFO: 16-bit A port (write) address
        .CLKARDCLK(clk),
        .WEA(wr_en),                            // 4-bit input: A port write enable
        .DIADI(data_in),                        // 32-bit input: A port data/LSB data
        .ADDRBWRADDR({1'b0, tail, {M{1'b0}}}),  // 16-bit input: B port (read) address
        .CLKBWRCLK(clk),
        .ENBWREN(rd_en)
    );
endmodule
```
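
The FIFO template above advances `head` and `tail` but does not show how a full or empty FIFO would be detected. The following is a minimal software sketch of the pointer behavior in C, under the assumption that a separate occupancy counter is added. The counter is not part of the template, and the names `fifo_model`, `fifo_init`, `fifo_push`, `fifo_pop`, and `fifo_bram_addr` are hypothetical, introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BRAM_DATA_BITS 32768u /* data bits in one RAMB36E1 (parity excluded) */

/* Hypothetical model of the FIFO pointers; not taken from the template. */
typedef struct {
    uint32_t depth; /* number of k-bit entries: 32768 / k */
    uint32_t head;  /* write pointer, advanced on each write */
    uint32_t tail;  /* read pointer, advanced on each read */
    uint32_t count; /* assumed occupancy counter (absent from the RTL) */
} fifo_model;

/* Set up a FIFO whose entries are k = 2^m bits wide. */
static void fifo_init(fifo_model *f, unsigned m)
{
    assert(m <= 5); /* k is at most 32 bits */
    f->depth = BRAM_DATA_BITS >> m;
    f->head = 0;
    f->tail = 0;
    f->count = 0;
}

static bool fifo_empty(const fifo_model *f) { return f->count == 0; }
static bool fifo_full(const fifo_model *f)  { return f->count == f->depth; }

/* Returns the entry index that is written, or -1 if the FIFO is full. */
static int32_t fifo_push(fifo_model *f)
{
    if (fifo_full(f))
        return -1;
    int32_t slot = (int32_t)f->head;
    f->head = (f->head + 1u) % f->depth; /* wraps like a (15-m)-bit register */
    f->count++;
    return slot;
}

/* Returns the entry index that is read, or -1 if the FIFO is empty. */
static int32_t fifo_pop(fifo_model *f)
{
    if (fifo_empty(f))
        return -1;
    int32_t slot = (int32_t)f->tail;
    f->tail = (f->tail + 1u) % f->depth;
    f->count--;
    return slot;
}

/* BRAM address for a pointer: the pointer shifted left by m bits, top bit 0. */
static uint16_t fifo_bram_addr(uint32_t ptr, unsigned m)
{
    return (uint16_t)((ptr << m) & 0x7FFFu);
}
```

With k = 32 (m = 5) this model holds 1024 entries, and entry 1 maps to BRAM address 32. That is the same address Problem 1 produces for byte offset 4 through its `{1'b0, addr[11:0], 3'b000}` connection (4 × 8 = 32).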