---
# System prepended metadata

title: Efficient MDS matrices

---

# Circulant and other efficiently evaluable MDS matrices from Reed-Solomon codes
with Markus Schofnegger.

:::success
In this short note, the content of which we had circulated in the community for quite some time, we recap a canonical construction of efficiently evaluable MDS matrices from Reed-Solomon (RS) codes. 
We stress the fact that the main idea is not a novelty, and considered folklore in the field of coding theory.
:::


Let $\mathbb F_q$ be any finite field (here $q$ is a power of a prime). A square matrix $A\in \mathbb F_q^{k\times k}$ is *maximum distance separable (MDS)* if the determinant of any square submatrix (i.e., any minor) is non-zero.   
In terms of linear codes, the matrix $A$ is MDS iff it the systematic encoder $$ E:\mathbb F_q^{k} \times \mathbb F_q^k\longrightarrow \mathbb F_q^{k + k}, \quad \vec x \mapsto (\vec x, A\vec x), $$ gives a *maxium distance separating code*, meaning that it has best possible Hamming distance $d = n - k + 1$ : The values of a word at any $k$ positions, where $k$ is the dimension of the code, uniquely determines the values of the word at all other positions.

In symmetric cryptography (block ciphers, hashes), square MDS matrices are often used as diffusion layers in order to maximize the branch number or, in other words, the number of active elements over two consecutive full rounds. One aspires these matrices to be efficiently evaluable, meaning that a vector-matrix product can be computed with signifcantly fewer than $k^2$ field operations.  

:::info
There are several approaches to tackle the issue (e.g. via sparsity and/or small entries) but in this note we follow the track of symmetry:
*Reed-Solomon codes are MDS, and if their evaluation domain has a smooth group structure then one can use FFT methods for systematic encoding.*
:::
We also remark that, although we restrict to the case of two-adic domain sizes, the general case arbitrarily smooth numbers is not much harder to tackle.

## Over FFT friendly fields


If the field $\mathbb F_q$ has a multiplicative subgroup of smooth order $N = 2^n$ then we can simply look at the Reed-Solomon code of rate $k = |H|$ with FFT-friendly evaluation domain
$$ 
D= H \cup \tau H,
$$ where $\tau H$ is any coset of $H$, with $\tau H \neq H$. The first half $H$ corresponds to the systematic positions, and any message $m\in \mathbb F_q^H$ is extended to a Reed-Solomon codeword on $D$ by interpolation and subsequent coset evaluation, which can be done efficiently by the multiplicative FFT. 

Let us write down the square matrix $A\in\mathbb F_q^{N\times N}$ which describes the systematic encoding step:  Given $m\in \mathbb F_q^H$ the value of the interpolant at an arbitrary point $z\in \mathbb F_q$ is the inner product


$$ m(z) = \langle m(\:.\:), L_H(\:.\:, z)\rangle_H =  \sum_{x\in H} m(x) \cdot L_H(x, z),
$$ with the *Lagrange kernel* $$L_H(X,Y) = \frac{1}{|H|} \cdot \frac{ Y\cdot v_H(X) - X \cdot v_H(Y)}{X-Y},$$ where $v_H(X) = X^{|H|}-1$ is the vanishing polynomial of the domain $H$. 

:::spoiler
The Lagrange kernel is the unique bivariate polynomial of individual degrees at most $|H|-1$, such that for $x,y$ in $H$ we have 
$$
L_H(x,y) =
\begin{cases} 1 & \text{if } x=y,
\\
0 & \text{otherwise.}
\end{cases}$$
:::

For $x, x' \in H$, we thus get
$$
L_H(x, \tau\cdot x') 
= -\frac{v_H(\tau)}{|H|} \cdot \frac{1}{1 - \tau \cdot x' \cdot x^{-1}}.
$$ We summarize the construction in the following claim.

:::info
**Canonical construction**. Assume that the finite field $\mathbb F_q$ has a multiplicative subgroup $H$ of order $N$, and let $g$ be its generator. For any non-zero element $\tau \notin H$ the $(N\times N)$--matrix with entries 
$$
a_{i,j} = \frac{1 - \tau^N}{N} \cdot \frac{1}{1 - \tau \cdot g^{j-i}},
$$ where $i,j = 1, \ldots, N$, is circulant and maximum distance separable.
If $N=2^n$ then vector-matrix products $x \cdot A$ can be computed within at most $$N\cdot \left( n\cdot \mathsf M +  2\cdot n \cdot \mathsf A\right),
$$ where $\mathsf M$ and $\mathsf A$ denote multiplications and additions over $F$.
:::

:::spoiler Proof
The only thing left to show is the cost for the systematic encoding. Having the coset shift $\tau$ integrated into the twiddle of the FFT network, each, interpolation over $H$ and evaluation over $\tau H$ can be performed in (at most) $n \cdot N/2$ multiplications (with the precomputed twiddles) and $n\cdot N$ additions.
:::


## Over non-FFT-friendly fields

The above canonical construction can be generalized to arbitrary finite fields (and with the same computational costs) by means of the *Elliptic Curve FFT (ECFFT)* from Ben-Sasson, Carmon, Kopparty and Levit, as described in [ECFFT Part 1](https://arxiv.org/abs/2107.08473) and [ECFFT Part 2](https://eprint.iacr.org/2022/1542). Without going into detail: The ECFFT escapes the non-FFT-friendliness of $\mathbb F_q$ (thought of it as line) and looks instead at a suitable elliptic curve 
$$
E: y^2 = a x^3 + b x + c
$$ in $\mathbb F_q$, with a cyclic subgroup $H$ of desired order $N=2^n$ (such curves can be efficiently sampled ). Building on the cyclic group structure (and using the symmetry with respect to the involution $J:(x,y)\mapsto (x,-y)$) Ben-Sasson, et al., construct an FFT-like algorithm, the *ECFFT*, for interpolating a given function by certain rational functions on $E$, which generate a (generalized) Reed-Solomon code which is again maximum distance separable. 

The concrete efficiency of this algorithm is en-par with the regular Cooley-Tukey multiplicative FFT, yet the FFT network goes along a chain of suitable isogenies 
$$
E= E_0 \stackrel{\pi_1}{\longrightarrow} E_1 \stackrel{\pi_2}{\longrightarrow} \ldots \stackrel{\pi_1}{\longrightarrow} E_{n-1}
$$ between smaller curves $E_1,\ldots, E_{n-1}$, which are of quadratic degree and halve the subgroup $H$ in a regular $2$-to-$1$ manner. 

:::danger
Since the ECFFT requires quite some background in algebraic geometry, we skip the explicit description of the systematic ECFFT encoding matrix, and instead sketch two special cases in which one can work with a simpler, alternative FFT.
:::

:::spoiler
We might add explicit description in a future upgrade of the note.
:::


### Circle-FFT-friendly case

For fields $\mathbb F_q$ with smooth $q + 1$, one can use the circle FFT from [HLP24](https://eprint.iacr.org/2024/278), see also the [S-two whitepaper](https://eprint.iacr.org/2026/532). The circle FFT is a simplified variant of the ECFFT, which works over the circle curve 
$$
C: x^2 + y^2 = 1
$$ in $\mathbb F_q$, with the group structure inherited from the rotation group $SO(2,\mathbb F_q)$. (As in [HLP24](https://eprint.iacr.org/2024/278), we write the group operation multiplicatively.) This a cyclic group of order $q + 1$, and the FFT domains are subgroups of $H$ which are invariant under the involution $J:(x,y)\mapsto (x,-y)$, or more generally, of the form 
$$
H = Q \cdot H' \cup Q^{-1}\cdot  H',
$$ where $H'$ is a subgroup of half the size, and $Q \in C\setminus H'$. (These generalized domains are named *twin-cosets* in [HLP24](https://eprint.iacr.org/2024/278).)

Carrying over the canonical construction to the circle curve is not difficult, but requires to tackle some subtleties which are due to the genus of the curve (i.e. genus $0$): For given domain size $N$, the space of functions of the circle FFT are not the complete space of "degree-$N/2$ circle polynomials" 
$$
\mathcal L_{N/2}(\mathbb F_q) = \{p(x,y)\in \mathbb F_q[x,y]/(x^2 + y^2 - 1) \::\: \deg p(x,y) \leq N/2 \},
$$ which has $\dim \mathcal L_{N/2}(\mathbb F_q) = N+1$, but a subspace $\mathcal L'_{N/2}(\mathbb F_q)$ of codimension one, 
$$
\mathcal L_{N/2}(\mathbb F_q) = \mathcal L_{N/2}'(\mathbb F_q) + \langle v_n(x,y)\rangle,
$$ where $v_n(x,y)$ is the vanishing polynomial of the $J$-invariant coset domain $H$. 
:::spoiler
By degree, we mean the total degree of the canonical representant of $p(x,y)$, which is (at most) linear in $y$. 
:::
This one-dimenional gap causes the following anomaly: The code obtained by circle-FFT encoding is only *almost MDS*, since a circle polynomial from $\mathcal L_{N/2}(\mathbb F_q)$ can have $N$ zeros (but not more, unless it is trivial). See Section 4 in [HLP24](https://eprint.iacr.org/2024/278).
To obtain full MDS we 'puncture' the code, by an additional correction step which enforces a zero at a fixed out-of-domain point $Q\notin D$.

The concrete construction is as follows:
The  evaluation domain is the disjoint union of a standard position coset $D_1 = H$ and another twin-coset $D_2$ of the same size $N$, 
$$
D = D_1 \cup D_2.
$$ Given a message $f\in \mathbb F_q^{D_1}$:
- Compute the coefficients of the interpolant $\hat f(x,y)\in \mathcal L_{N/2}'(\mathbb F_q)$ using the circle FFT.
- compute $v_Q =\hat f(Q)$ and evaluate $$ p(x,y) = \hat f(x,y) - \frac{v_Q}{v_n(Q)}\cdot v_n(x,y)$$ over $D_2$ by means of the circle FFT. (First evaluate $\hat f(x,y)$ using the circle FFT, and then post-correct by the values of the vanishing polynomial.)
Note that this does not change the values over $D_1$, hence we still systematically extend $f$.

The obtained code is linear, of dimension $N$, but now any involved polynomial can have at most $N-1$ zeroes, unless it is trivial. In other words, the square matrix $A\in \mathbb F_q^{N\times N}$ from thi systematic encoding procedure is MDS. Taking $Q = (1,0)$, which reduces the evaluation cost at $Q$ to at most $N$ additions, the overall encoding cost is
$$
N \cdot (n\cdot \mathsf M + (2 n + 2) \mathsf A),
$$ using the fact that $v_n(x,y)$ alternates between two values on $D_2$.(We neglect the few operations to determine them.)

:::danger
The explict form of $A$ can be again derived from the Lagrangian (see Remark 16 in [HLP24](https://eprint.iacr.org/2024/278)). We leave the details to the reader.
:::

We point out that the circle code approach causes $2 N$ additions more than the ECFFT.


### Binary fields


For binary fields $\mathbb F_q$, $q = 2^m$, one may take the additive FFT from Lin, Chung and Han [LCH14](https://arxiv.org/abs/1404.3458) instead of the ECFFT. Any (systematic) Reed-Solomon code with a linear subspace of $H$ as message domain, $|H| < 2^n$, which extrapolates to a non-trivial coset $\tau + H$ , leads to a quadratic MDS matrix which is efficiently evaluable by the additive FFT.

We recall the following well-known fact about subspace vanishing polynomials.
:::info 
**Subspace polynomials.** The polynomial $v_H(X) = \prod_{h\in H} (X-h)$ of any $n$-dimensional additive subspace  $H = \langle \beta_1, \ldots, \beta_n\rangle$ of $\mathbb F_q$ is the additive (i.e. $\mathbb F_2$-linear) polynomial
$$
v_H(X) = X^{2^k} + \sum_{i=0}^{k-1} a_{i}\cdot X^{2^i},
$$ which can be obtained recursively from $v_0 (X) := X$ via 
$$
v_{i}(X) =  v_{i-1}(X)^2 + v_{i-1}(\beta_i)\cdot v_{i-1}(X), 
$$ for $i= 1,\ldots, n$, and taking $v_H(X) =v_k(X)$.
:::

:::spoiler Proof.
The vanishing polynomial $v_H(X) = \prod_{h \in H} (X - h)$ for a $n$-dimensional linear subspace $H$ is obtained recursively from an increasing chain of subspaces $H_i = \langle \beta_1, \ldots, \beta_i\rangle$, $i =1, \ldots, n$, using any basis  $\beta_1, \ldots, \beta_k$ of $H$.
One starts with the trivial subspace $H_0 := \{0\}$, which has the vanishing polynomial 
$$
v_0(X) = X.
$$ Using the binary decomposition 
$$
H_{i} = H_{i-1} \cup \left(\beta_j + H_{i-1}\right),
$$ together with the assumption that $v_{i-1}(X)$ is additive, i.e. $v_{i-1}(X + Y) = v_{i-1}(X) + v_{i-1}(Y)$, one obtains that
\begin{align*}
v_{i}(X) &= v_{i-1}(X) \cdot v_{i-1}(X - \beta_k) =   v_{i-1}(X) \cdot (v_{i-1}(X) - v_{i-1}(\beta_k))
\\
&= v_{i-1}(X)^2 + v_{i-1}(\beta_i)\cdot v_{i-1}(X), 
\end{align*} which is again additive. 
:::

The concrete entries of the matrix can be derived from the Lagrange kernel $L_H(X,Y)$ for the linear subspace $H$, which is the (scaled) generalized derivative 
$$
L_H(X, Y) = \frac{1}{a_0} \cdot \frac{v_H(X) - v_H(Y)}{X-Y},
$$ where $a_0$ is the lowest-order coefficient of the vanishing polynomial $v_H(X)$.

:::spoiler
Indeed, from the above Lemma we conclude that $v_H(X) - v_H(Y)$ is divisible by $X-Y$, and the individual degrees are at most $2^k - 1$. Second, by the $\mathbb F_2$-linearity of the vanishing function we obtain that for  $x, y\in H$, we have
$$L_H(x, y) = \frac{1}{a_0}\cdot \frac{v_H(x) - v_H(y)}{x - y} = \frac{1}{a_0}\cdot \frac{v_H(x - y)}{x - y} = 
\begin{cases}
1 & \text{if } x = y,
\\
0 & \text{otherwise}.
\end{cases}
$$ 
:::

For any $x, x'\in H$, we have
$$
L_H(x, \tau + x')
= \frac{1}{a_0} \cdot \frac{-(v_H(\tau) + v_H(x'))}{x - (\tau + x')} 
=  \frac{v_H(\tau)}{a_0} \cdot \frac{1}{\tau + x' - x}.
$$   For the matrix representation we use the index notation $[\vec v]:= v_1 + v_2\cdot 2 + \ldots + v_{k}\cdot 2^{k-1}$ for every $\vec v= v_1 \cdot \beta_1 + \ldots + v_{k}\cdot\beta_k$ from $H$. 

:::info 
**Construction for characteristic $2$.** Let $H$ be a $n$-dimensional $F_2$-linear subspace of $\mathbb F_q$, and $a_0$ be the lowest-order coefficient of the subspace vanishing polynomial $v_H(X)$. Then for every element $\tau \notin H$ the $(2^n\times 2^n)$--matrix with entries
$$
a_{[x],[y]} =\frac{v_H(\tau)}{a_0} \cdot 
\frac{1}{\tau + (y - x)},
$$ where $[\:.\:]$ is the above defined index for elements of $H$, is maximum distance separable.
Moreover, vector-matrix products $x \cdot A$ can be computed within at most
$$
N\cdot \left( n\cdot \mathsf M +  2\cdot n \cdot \mathsf A\right),
$$ where $\mathsf M$ and $\mathsf A$ are for field multiplications and additions in $\mathbb F_q$.
:::

:::spoiler Proof.
The only thing open are the computational cost. 
But these are evident from the operation counts of the additive FFT.
:::