# FIR Workbook (lab_3)

Student ID: 20726557

This lab involves designing an FIR filter that communicates through the AXI-Lite and AXI-Stream protocols. AXI-Lite handles the configuration registers and the read/write of the tap coefficients, while AXI-Stream handles the dataflow of the FIR filter. The tap coefficients and the data for the FIR filter are stored in BRAM modules.

At the beginning of operation, ap_idle is set. After the data length, the number of taps, and the tap coefficients have been configured, the FIR filter receives ap_start through the AXI-Lite protocol and deasserts ap_idle. Once ap_start is asserted, the FIR filter retrieves the input data and outputs the computed FIR data through the AXI-Stream protocol. When the last data has been retrieved and output, ap_done and ap_idle are asserted and the filter waits for the next FIR operation.

## Block Diagram

![Untitled drawing](https://hackmd.io/_uploads/BkV8__x21l.png)

## Operation

### AXI Protocols

For this lab, the AXI-Lite and AXI-Stream protocols are used to read and write data to the configuration registers and the BRAM.

#### AXI-Lite Read

The AXI-Lite Read protocol consists of two phases: address and data.

![image](https://hackmd.io/_uploads/HJeMg_Fln1x.png)

To read data, the receiver asserts "arvalid" along with the address to be read on "araddr". The transmitter samples the address and asserts "arready", which tells the receiver that it may deassert "arvalid" and invalidate "araddr". After responding with "arready", the transmitter asserts "rvalid" along with the requested data on "rdata". When the receiver is ready to take the data, it asserts "rready". When both "rvalid" and "rready" are asserted, the receiver reads the data from the transmitter, and both signals deassert in the next cycle.

#### AXI-Lite Write

Similar to the AXI-Lite Read protocol, the AXI-Lite Write protocol consists of an address phase and a data phase.

![image](https://hackmd.io/_uploads/HJilXYl2ye.png)

To write data, the transmitter asserts "awvalid" along with the target address on "awaddr". When the receiver detects "awvalid", it responds with "awready" to notify the transmitter that the address has been sampled. Upon detecting "awready", the transmitter deasserts "awvalid" and invalidates "awaddr". When the data is ready to be sent, the transmitter asserts "wvalid" and drives the data on "wdata". The receiver samples "wdata" and asserts "wready", after which the transmitter deasserts "wvalid" and invalidates "wdata". Once the receiver has both the address and the data, the data is written to the given address.

There is no ordering dependency between "awvalid" and "wvalid". The transmitter can assert both at the same time, or either one before the other.

#### AXI-Stream

AXI-Stream is used to stream in and stream out a series of data.

![image](https://hackmd.io/_uploads/r1Fl0Kx2kx.png)

To transmit data, the transmitter asserts "tvalid" to indicate that the data on "tdata" is valid. When the receiver is ready to accept the data, it asserts "tready", which also tells the transmitter to present the next data in the stream. This cycle continues until the transmitter asserts "tlast" together with "tvalid" to mark the last data of the stream. After the receiver accepts this last data by asserting "tready", the transmitter halts the stream and sets "tvalid" to logic 0.

### Configuration Registers and Tap Coefficient

When the system first turns on, it is in the "reset" state because "axis_rst_n" is logic low (active-low). The system enters the "idle" state once "axis_rst_n" is set to logic high. In the AP Configuration Register, ap_idle is set to logic 1, while ap_start and ap_done are set to logic 0.

Before the operation starts, the data length, the number of taps, and the tap coefficients must be written to the system. The data length and the number of taps are saved in their respective configuration registers, while the tap coefficients are saved in a BRAM. The AXI-Lite protocol is used to write these configuration registers and the BRAM. The system contains an AXI-Lite Finite State Machine (FSM) that controls the reads and writes according to the AXI-Lite protocol described above.

The corresponding AXI-Lite addresses are:

* 0x00 AP Configuration Register
* 0x10-13 Data Length Configuration Register
* 0x14-17 Tap Number Configuration Register
* 0x40-7F Tap Coefficient BRAM

![Configuration Write](https://hackmd.io/_uploads/H1eXd0W3kx.png)

Each AXI-Lite operation writes to only one address. Therefore, multiple AXI-Lite operations are needed to fully program the system. Once the data length, the number of taps, and the tap coefficients have been written to their respective locations, the system can start operation.

### FIR Computation

To start the operation, ap_start is set by writing logic 1 through an AXI-Lite Write. If ap_idle is logic 0, the write is ignored by the configuration register because the system is already running. Writing ap_start as logic 1 also sets ap_idle to logic 0.

Setting ap_start allows the FIR computation to begin and enables the AXI-Stream FSM to stream in and stream out data through the AXI-Stream protocol. In this lab, the "Xn" input stream is named "ss", while the "Yn" output stream is named "sm". When the system detects that "ss-tready" is asserted, it starts streaming in data for the FIR computation. In addition, the AP Configuration Register clears ap_start to logic 0.
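
The register behavior described in this report can be summarized in a small behavioral model. This is a Python sketch rather than the lab's RTL. The class name `FirConfig`, its method names, the bit positions of ap_start/ap_done/ap_idle inside register 0x00, and the handling of unmapped addresses are assumptions made for illustration. The address map, the ignored ap_start write while running, the dropped tap writes and `32'hffffffff` reads while not idle, and the read-to-clear ap_done follow the text and waveform captions.

```python
# Behavioral sketch of the AXI-Lite register map; not the lab's RTL.
AP_START, AP_DONE, AP_IDLE = 0x1, 0x2, 0x4  # ASSUMED bit positions in register 0x00
INVALID = 0xFFFFFFFF                         # value read from the tap BRAM while running


class FirConfig:  # hypothetical name
    def __init__(self):
        self.ap = AP_IDLE            # after reset: ap_idle=1, ap_start=0, ap_done=0
        self.data_length = 0
        self.tap_number = 0
        self.taps = [0] * 16         # 0x40-0x7F holds sixteen 32-bit words

    def idle(self):
        return bool(self.ap & AP_IDLE)

    def write(self, addr, value):
        if addr == 0x00:
            # ap_start is only accepted while idle; accepting it clears ap_idle.
            if (value & AP_START) and self.idle():
                self.ap = (self.ap | AP_START) & ~AP_IDLE
        elif addr == 0x10:
            self.data_length = value     # ASSUMED: accepted at any time
        elif addr == 0x14:
            self.tap_number = value      # ASSUMED: accepted at any time
        elif 0x40 <= addr <= 0x7F and self.idle():
            self.taps[(addr - 0x40) // 4] = value   # dropped when not idle

    def read(self, addr):
        if addr == 0x00:
            value = self.ap
            self.ap &= ~AP_DONE      # ap_done is cleared once it has been read
            return value
        if addr == 0x10:
            return self.data_length
        if addr == 0x14:
            return self.tap_number
        if 0x40 <= addr <= 0x7F:
            return self.taps[(addr - 0x40) // 4] if self.idle() else INVALID
        return 0                     # ASSUMED: unmapped addresses read as zero

    def first_sample_accepted(self):
        self.ap &= ~AP_START         # ap_start is cleared when streaming begins

    def last_output_sent(self):
        self.ap |= AP_DONE | AP_IDLE # set by the AXI-Stream FSM after the last Yn
```

For example, after `cfg = FirConfig()` and `cfg.write(0x00, AP_START)`, a write to 0x44 is dropped and `cfg.read(0x44)` returns `INVALID` until `cfg.last_output_sent()` puts the model back into the idle state.
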

![ap_start](https://hackmd.io/_uploads/SJsbtRW31e.png)

When the FIR computation starts, the data BRAM is first initialized by writing 0 to all addresses so that no unknown "x" state appears during computation. The FIR computation follows this formula:

> y[t] = Σ (h[i] * x[t - i])

Here h[i] are the tap coefficients and x[t - i] are the current and previous input samples.

To perform the FIR computation, the "Xn" data is streamed in through the AXI-Stream "ss-tdata". "ss-tready" is asserted so that the stream-in side can prepare the next data. At the same time, tap coefficient 0 is read from address[0] (0x40-43) of the tap BRAM. The "Xn" stream-in data is multiplied by this tap coefficient, and the product is saved in a register.

After this multiplication, the data at address[0] is read from the data BRAM, and tap coefficient 1 is read from address[1] (0x44-47). In the next cycle, the stream-in data is written to data BRAM address[0], and the original data from data BRAM address[0] is multiplied by tap coefficient 1. This product is added to the previous result saved in the register.

In each subsequent cycle, a new data value and tap coefficient are read from the next address and the previously read data is written to that next address, causing a "shift" in the data. The data is multiplied by the tap coefficient and added to the running sum of products.

When the address reaches the end (which depends on the number of taps), the last tap coefficient from the tap BRAM and the last data from the data BRAM are read out. The computed FIR result is then streamed out as "Yn" through "sm-tdata" in the AXI-Stream protocol. The data that was last read out of the data BRAM is shifted out and removed from it. The system repeats the steps above for the next FIR computation by streaming in the next "Xn".

![Xn stream-in, Yn stream-out](https://hackmd.io/_uploads/HkoMF0-n1g.png)

When the system streams in the last data (determined by the data length), "ss-tlast" is asserted.
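
The read, multiply, and shift sequence of the FIR computation can be checked against the formula y[t] = Σ (h[i] * x[t - i]) with a short behavioral model. This is a Python sketch rather than the lab's RTL. The function name `fir_stream`, the list standing in for the data BRAM, and its depth of (number of taps - 1) are illustrative assumptions.

```python
def fir_stream(taps, samples):
    """Behavioral model of the shift-and-accumulate FIR loop (hypothetical helper)."""
    # Zero-initialised stand-in for the data BRAM, so no unknown values are read.
    # ASSUMED depth: one entry per tap except tap 0, which uses the stream-in sample.
    data_ram = [0] * (len(taps) - 1)
    outputs = []
    for x in samples:                     # one "Xn" streamed in per iteration
        acc = taps[0] * x                 # tap 0 times the stream-in sample
        carry = x                         # value to be written to the next data address
        for i in range(1, len(taps)):
            old = data_ram[i - 1]         # read the stored data before overwriting it
            acc += taps[i] * old          # accumulate tap i times the older sample
            data_ram[i - 1] = carry       # write the newer value: this is the "shift"
            carry = old                   # the value read out moves one address further
        outputs.append(acc)               # "Yn" streamed out; the last `carry` is dropped
    return outputs


if __name__ == "__main__":
    # An impulse input reproduces the tap coefficients, followed by zeros.
    print(fir_stream([1, 2, 3], [1, 0, 0, 0]))
```

Feeding an impulse is a quick sanity check: the output sequence equals the tap coefficients, which is what the formula predicts for x = [1, 0, 0, ...].
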

The final round of FIR computation is then performed. When the last FIR computation is completed, "Yn" is streamed out along with "sm-tlast" to indicate the last stream-out operation. The AXI-Stream FSM then sets "ap_idle" and "ap_done" to logic 1.

To check the status of the FIR engine, the system can read the AP Configuration Register. The system samples "ap_done" to check whether the FIR engine has completed. "ap_done" is cleared to logic 0 once it has been read. The FIR engine can then be reconfigured and run again by setting "ap_start" to logic 1.

![ap_done](https://hackmd.io/_uploads/H1JNFAW3Jx.png)

## Resource Usage

### FF, LUT

![image](https://hackmd.io/_uploads/H1b3YRWh1x.png)

![image](https://hackmd.io/_uploads/ry7x9Cb21x.png)

### BRAM

![image](https://hackmd.io/_uploads/rJ8G50-31e.png)

## Performance Report

### Latency

![ap_start, ap_done](https://hackmd.io/_uploads/rJLw2AbhJx.png)

![image](https://hackmd.io/_uploads/Hy3qhCZ21g.png)

### Throughput

![Throughput](https://hackmd.io/_uploads/Bkw7aCbnkg.png)

![image](https://hackmd.io/_uploads/Sy2LTR-nye.png)

## Timing Report

### Frequency

![image](https://hackmd.io/_uploads/r11eRRZn1l.png)

### Report Timing

![image](https://hackmd.io/_uploads/S19URR-hyx.png)

### Max Delay Path

![image](https://hackmd.io/_uploads/HkYRCRbhJx.png)

![image](https://hackmd.io/_uploads/SkbEkJf3yx.png)

![image](https://hackmd.io/_uploads/ryUw11Gnke.png)

## Simulation Waveform

### Coefficient Program and Read Back

Coefficient Program

![Configuration Write](https://hackmd.io/_uploads/BJxpVyz31x.png)

Read Back

![image](https://hackmd.io/_uploads/r1NYs1z2kg.png)

### Data Stream

Data-in Stream-in

![image](https://hackmd.io/_uploads/HJ5Un1z2yg.png)

Data-out Stream-out

![image](https://hackmd.io/_uploads/Hk53hJGh1l.png)

### RAM Access Control

Tap BRAM (AXI-Lite Write during ap_idle)

![image](https://hackmd.io/_uploads/S1H4TJM3Je.png)

Tap BRAM (AXI-Lite Read during ap_idle)

![image](https://hackmd.io/_uploads/rJ0gJefhkx.png)

Tap BRAM (not ap_idle: write transaction dropped, read returns invalid value 32'hffffffff)

![image](https://hackmd.io/_uploads/HJOLkxznyg.png)

Data BRAM (AXI-Stream in, Shift Data)

![image](https://hackmd.io/_uploads/B1idxeM3Jg.png)

## Github

https://github.com/AnthonyGithub/EESM6000C-Lab-3