PYNQ Tutorial 2: DMA
Objective
After you complete this tutorial, you should be able to:
- Integrate an RTL design with the Zynq PS using AXI DMA
Source Code
This repository contains all of the code required in order to follow this tutorial: https://github.com/yohanes-erwin/pemrograman_zynq/tree/main/pynq_part_2
1. Introduction
In the previous tutorial, we built a simple design using AXI GPIO.

In this tutorial, we create a system that consists of the Zynq PS and a PL design. The PL design is a simple processing element (PE) module that performs a multiply-and-add operation. When the design is complete, we can send inputs to the PE module in the PL from a Jupyter Notebook and read back the result.
How do we add a custom RTL design to the system?
This is the block diagram of the PE module. It performs a multiply-and-add operation.

This is the Verilog implementation of the PE module.
How do we connect the PE to PS? We use direct memory access (DMA).
What is DMA?
Direct memory access (DMA) is a feature of computer systems that allows certain hardware subsystems to access main system memory independently of the central processing unit (CPU).
Which protocol does the Xilinx AXI DMA use to talk to our module? The AXI Stream protocol.
An AXI Stream (AXIS) module has two kinds of ports: master (M_AXIS) and slave (S_AXIS). In this design, every port carries four signals: tready, tdata, tvalid, and tlast. A data word is transferred on each clock cycle in which tvalid and tready are both high.
- tready: driven by the receiving side to indicate that it is ready to accept data.
- tdata: the data itself.
- tvalid: driven by the sending side to indicate that tdata holds valid data.
- tlast: marks the last data word of a packet.
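The handshake described by these signals can be modeled in a few lines of Python. This is only a behavioral sketch, not the RTL: the function name simulate_stream and the ready_pattern argument are made up for illustration, and the sender is assumed to keep tvalid high until all words are sent.

```python
def simulate_stream(words, ready_pattern):
    """Return (accepted_beats, cycles) for one packet sent over an AXIS link.

    words: payload words the sender presents on tdata, in order.
    ready_pattern: the receiver's tready value per clock cycle (repeats cyclically).
    """
    if not any(ready_pattern):
        raise ValueError("receiver is never ready, the packet would never finish")
    accepted = []
    index = 0   # next word the sender presents on tdata
    cycle = 0
    while index < len(words):
        tvalid = True                                    # sender always has data here
        tready = ready_pattern[cycle % len(ready_pattern)]
        tdata = words[index]
        tlast = index == len(words) - 1                  # flag the final word
        if tvalid and tready:                            # handshake: word is transferred
            accepted.append((tdata, tlast))
            index += 1
        cycle += 1
    return accepted, cycle


beats, cycles = simulate_stream([10, 20, 30], ready_pattern=[True, False])
print(beats)   # [(10, False), (20, False), (30, True)]
print(cycles)  # 5
```

With a receiver that is ready only every other cycle, the three words need five cycles, and only the last word carries tlast.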


The AXI stream modules can be connected in chains. Every block can do a specific process, then send it to another block as shown in this example.

The slave port receives input, and the master port sends output. So, in our design, the slave port receives the inputs a_in, b, and y_in, and the master port sends the result y_out.
The PE module needs to be wrapped in a top module called axis_pe.v. This top module simply implements the AXI Stream protocol. Because the PE circuit is purely combinational, the AXI Stream protocol implementation is simple.
This is the timing of our AXIS PE design.

Because the PE is a combinational circuit, the output is available in the same clock cycle as the input. In a more complex design, this is not always the case. Internal memory and state machines are often required.
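As a software reference for checking results later, the PE and its stream wrapper can be sketched in Python. The sketch assumes that "multiply and add" means y_out = a_in * b + y_in; bit widths, overflow, and the way the three inputs are packed into tdata are defined by the Verilog in the repository and are not modeled here.

```python
def pe(a_in, b, y_in):
    """Reference model of the PE, assumed to compute a_in * b + y_in."""
    return a_in * b + y_in


def axis_pe(beats):
    """Combinational AXIS wrapper model: one output beat per input beat.

    beats: iterable of ((a_in, b, y_in), tlast) tuples arriving on the slave port.
    Yields (y_out, tlast) on the master port, with tlast passed through unchanged.
    """
    for (a_in, b, y_in), tlast in beats:
        yield pe(a_in, b, y_in), tlast


packet = [((1, 2, 3), False), ((4, 5, 6), False), ((7, 8, 9), True)]
result = list(axis_pe(packet))
print(result)  # [(5, False), (26, False), (65, True)]
```

Each output beat is produced from exactly one input beat, which mirrors the fact that the combinational PE needs no internal state.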
To connect an AXIS module to PS, we can use a Xilinx IP named AXI DMA. The AXI DMA translates the memory-mapped data (from DDR memory) to stream data and vice versa.
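The translation between memory-mapped data and stream data can be pictured with a toy model. This is a conceptual sketch only: the names mm2s and s2mm follow the channel names of the AXI DMA, memory is a word-addressed Python list, and bursts, alignment, and byte lengths are ignored.

```python
def mm2s(memory, address, length):
    """Memory-map to stream: read `length` words and emit them as (tdata, tlast) beats."""
    for offset in range(length):
        tlast = offset == length - 1
        yield memory[address + offset], tlast


def s2mm(beats, memory, address):
    """Stream to memory-map: store incoming beats from `address` on, stop after tlast."""
    count = 0
    for tdata, tlast in beats:
        memory[address + count] = tdata
        count += 1
        if tlast:
            break
    return count


ddr = list(range(100, 116))            # toy "DDR" holding 16 words
written = s2mm(mm2s(ddr, address=2, length=4), ddr, address=10)
print(written)     # 4
print(ddr[10:14])  # [102, 103, 104, 105]
```

In the real system, the AXIS PE sits between the two halves, so the words written back are the PE results rather than a copy of the input.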

Jupyter Notebook runs on the PS CPU, so we can access the PE module from a notebook: https://github.com/yohanes-erwin/pemrograman_zynq/blob/main/pynq_part_2/part_2.ipynb
How does the AXI DMA work?
- The Python code sends an instruction to the AXI DMA to move a certain amount of data from a specified address.
- The data for a DMA transfer must be contiguous in physical memory. So, we need to allocate a physically contiguous buffer.
- The AXI DMA then moves the data without further involvement of the CPU.
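To make the "instruction" in the first step concrete, the following sketch shows roughly what a driver writes to the AXI DMA to start a memory-to-stream transfer. It is an illustration under assumptions, not the PYNQ driver: the register offsets are the direct register mode offsets as we understand them from the AXI DMA product guide (PG021) and should be verified there, and write_reg and read_reg are hypothetical callables standing in for memory-mapped register access.

```python
MM2S_DMACR = 0x00    # MM2S control register (assumed offset, see PG021)
MM2S_SA = 0x18       # MM2S source address, lower 32 bits (assumed offset)
MM2S_LENGTH = 0x28   # MM2S transfer length in bytes (assumed offset)
DMACR_RS = 0x1       # run/stop bit in the control register


def start_mm2s(write_reg, read_reg, source_address, num_bytes):
    """Program one memory-to-stream transfer of num_bytes from source_address."""
    write_reg(MM2S_DMACR, read_reg(MM2S_DMACR) | DMACR_RS)  # start the channel
    write_reg(MM2S_SA, source_address & 0xFFFFFFFF)         # where the data lives
    write_reg(MM2S_LENGTH, num_bytes)                       # writing the length starts the move


# Dry run against a dictionary instead of real hardware registers.
regs = {}
start_mm2s(regs.__setitem__, lambda offset: regs.get(offset, 0), 0x10000000, 64)
print({hex(offset): hex(value) for offset, value in regs.items()})
# {'0x0': '0x1', '0x18': '0x10000000', '0x28': '0x40'}
```

Because the DMA receives only a start address and a byte count, the buffer has to be one contiguous block of physical memory, which is why an ordinary Python list or NumPy array cannot be used directly.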

2. Custom IP Project
2.1. Create Hardware Design
Follow these steps to create the custom IP project:
- Create a new Vivado project for your board.
- Add the ZYNQ7 Processing System IP, and then click Run Block Automation.

- From the Add IP menu, add an AXI Direct Memory Access (AXI DMA) IP.

- After the AXI DMA is added, double-click the AXI DMA IP to configure it.

- Configure the AXI DMA as shown in this window, then click OK.

- Return to the block design and click Run Connection Automation. Check the S_AXI_LITE port of the AXI DMA and then click OK.

- The following figure shows the block design after the AXI DMA and ZYNQ7 PS are connected.

- Next, we need to connect the DMA to the DDR memory. Double-click the ZYNQ7 IP.
- In the Page Navigator, go to PS-PL Configuration. Enable the S AXI HP0 interface and the S AXI HP2 interface. Then, click OK.

- Return to the block design and click Run Connection Automation.
- Check the S_AXI_HP0 port of the ZYNQ7 PS, then in the Options set the Master to /axi_dma_0/M_AXI_MM2S.

- Check the S_AXI_HP2 port of the ZYNQ7 PS, then in the Options set the Master to /axi_dma_0/M_AXI_S2MM. Then, click OK.

- After Run Connection Automation, the block design looks like the following figure. The M_AXI_MM2S and M_AXI_S2MM are connected to ZYNQ7 PS.

- On the left side (the Flow Navigator), select the Add Sources menu. Then, select Add or create design sources.

- Create a new file named axis_pe.v.

- Create another new file named pe.v.

- Return to the block design, right-click on it, and select the Add Module menu.

- Add the axis_pe module to the block design.

- Connect the AXIS PE module to the AXI DMA. Connect the s_axis, m_axis, aclk, and aresetn.

- From the Add IP menu, add a Constant IP.

- Connect the Constant IP to the en port of the AXIS PE.

- In the Sources section, right-click on the design_1 design block, then select the Create HDL Wrapper menu.
- After that, right-click on the design_1_wrapper, then select the Set as Top menu.

- On the left side (the Flow Navigator), select the Generate Block Design menu. In the Synthesis Options section, select the Global option.
- Run synthesis and implementation, then generate the bitstream.
- Export the .tcl, .bit, and .hwh files to the FPGA board.
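A missing or misnamed file is a common cause of overlay loading problems, because PYNQ looks for the .hwh file next to the .bit file under the same base name. The helper below is a small sketch for checking a folder. The function name missing_overlay_files and the base name design_1 are illustrative assumptions, so use whatever name you gave the exported files.

```python
import tempfile
from pathlib import Path


def missing_overlay_files(directory, name, extensions=(".bit", ".hwh", ".tcl")):
    """Return the expected overlay file names that are not present in `directory`."""
    folder = Path(directory)
    return [name + ext for ext in extensions if not (folder / (name + ext)).is_file()]


# Dry run in a temporary folder that stands in for the notebook folder on the board.
with tempfile.TemporaryDirectory() as folder:
    Path(folder, "design_1.bit").touch()
    Path(folder, "design_1.hwh").touch()
    missing = missing_overlay_files(folder, "design_1")

print(missing)  # ['design_1.tcl']
```

On the board, the same function can be pointed at the folder that holds the notebook. An empty list means all three files are in place.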
2.2. Create Software Design
At this point, the required files to program the FPGA are already on the board. The next step is to create Jupyter Notebook files.
- Open a web browser and open Jupyter Notebook on the board. Create a new notebook from the New menu by selecting Python 3 (ipykernel).
- Write the following code to test the design: https://github.com/yohanes-erwin/pemrograman_zynq/blob/main/pynq_part_2/part_2.ipynb
- In this program, we initialize the DMA, then we allocate physically contiguous memory for the input and the output. The PE computation is done by calling dma_send.transfer() and dma_recv.transfer().
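For orientation, the flow described in the last step can be sketched as follows. This is not the notebook from the repository. It assumes the bitstream is named design_1.bit, the DMA instance is named axi_dma_0, and the stream carries 32-bit words. The input values are placeholders, since the actual packing of a_in, b, and y_in is defined by axis_pe.v. The names run_pe and main_on_board are made up for this example.

```python
def run_pe(dma_send, dma_recv, input_buffer, output_buffer):
    """Stream input_buffer through the PE and wait until output_buffer is filled."""
    dma_send.transfer(input_buffer)    # memory to stream (MM2S)
    dma_recv.transfer(output_buffer)   # stream to memory (S2MM)
    dma_send.wait()                    # block until both channels are done
    dma_recv.wait()
    return output_buffer


def main_on_board(bitfile="design_1.bit", n_words=4):
    """Board-only part: needs the pynq package and the exported overlay files."""
    import numpy as np
    from pynq import Overlay, allocate

    overlay = Overlay(bitfile)                     # programs the PL
    dma = overlay.axi_dma_0                        # instance name assumed from the block design
    input_buffer = allocate(shape=(n_words,), dtype=np.uint32)   # contiguous memory
    output_buffer = allocate(shape=(n_words,), dtype=np.uint32)
    input_buffer[:] = np.arange(n_words, dtype=np.uint32)        # placeholder data
    run_pe(dma.sendchannel, dma.recvchannel, input_buffer, output_buffer)
    print(output_buffer)
    input_buffer.freebuffer()                      # release the contiguous memory
    output_buffer.freebuffer()
```

Calling main_on_board() in a notebook cell on the board runs one transfer. Keeping run_pe separate from the setup code makes it easy to reuse the same buffers for several transfers.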