# JPEG Encoder Design

## 2D DCT-II

### 1D DCT-II

TODO: This needs to be updated so that we can reuse it for both the first and second-stage 1D DCT. (It will probably have to be parameterized over the size of the input.)

```verilog
module dct_1d
  (input logic clk, rst, ena_in,
   input logic [7:0] a_in,
   output logic [11:0] S_out);
```

The input is in two's complement Q8.0 form (that is, a regular signed integer from -128 to 127). The output is Q12.0.

The `dct_1d` module accepts rows (or columns) of 8 pixels and computes an approximation of their DCT-II using the algorithm described in Agostini et al. 2001. Clock each row in from left to right, one pixel per cycle; the DCT-II coefficients emerge with 48 cycles of latency.

When `ena_in` is low, no new pixels are accepted and no new coefficients are produced. You must provide entire rows, that is, 8 inputs on cycles where `ena_in` is high.

The module is pipelined: you can provide another row immediately after the first, and the computed rows of coefficients are emitted in the same order, each 48 cycles after its input.

![](https://i.imgur.com/TVxb0fB.png)

### Transpose buffer

The transpose buffer accepts 12-bit coefficients one at a time in row-major order. 64 cycles later (counting only cycles where `ena_in` is high), they emerge in column-major order.

The buffer is pipelined: after the first 64 coefficients are shifted in, you can continue to shift in coefficients for the next block.

```verilog
module transpose_buffer
  (input logic clk, rst, ena_in,
   input logic [11:0] in,
   output logic [11:0] out);
```
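
As an illustration of how the transpose behaviour described above could be obtained, here is a minimal sketch. It is not the project's actual implementation: the module name `transpose_buffer_sketch`, the internal signals (`mem`, `count`, `swap`, `addr`), the synchronous reset, and the single-memory approach are all assumptions made for this example. The idea is to read the old coefficient out of a slot and write the new one into the same slot on every enabled cycle, alternating the address order between blocks so that one 64-entry memory suffices.

```verilog
// Hypothetical sketch of a transpose buffer; names and reset style are
// assumptions, only the port list follows the declaration in this document.
module transpose_buffer_sketch
  (input  logic        clk, rst, ena_in,
   input  logic [11:0] in,
   output logic [11:0] out);

  logic [11:0] mem [0:63];  // storage for one 8x8 block of coefficients
  logic [5:0]  count;       // position within the current block, 0..63
  logic        swap;        // 0: address = count, 1: row/column fields swapped
  logic [5:0]  addr;

  // Swapping the 3-bit row and column fields of a 6-bit index converts a
  // row-major position into the matching column-major position.
  assign addr = swap ? {count[2:0], count[5:3]} : count;

  always_ff @(posedge clk) begin
    if (rst) begin
      count <= '0;
      swap  <= 1'b0;
      out   <= '0;
    end else if (ena_in) begin
      out       <= mem[addr];       // emit the previous block's coefficient
      mem[addr] <= in;              // reuse the freed slot for the new one
      count     <= count + 6'd1;    // wraps from 63 back to 0
      if (count == 6'd63)
        swap <= ~swap;              // flip the address order for the next block
    end
  end
endmodule
```

In this sketch the memory is not cleared on reset, so `out` is undefined during the first 64 enabled cycles. A double-buffered (ping-pong) memory would be an equally valid design; the single-memory variant is shown only because it is compact.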