Fast Fourier Transforms (draft).

# Fast Fourier Transforms (draft). A polynomial of degree $k$ can be described in two different ways: 1. the sequence of its coefficients $f = (a_0, a_1\dots , a_{k})$ 2. a list of $k+1$ evaluations $f = (f(x_0), \dots, f(x_k))$ for a fixed set $(x_0, x_1,\cdots, x_k)$. Fourier transforms are ways to switch back and forth between such representations. #### Why would you care? 1. Easy polynomial multiplication. For example if $\deg(f) =m$ and $deg(g)= n$ then computing $f\cdot g$ will cost $O(mn)$ operations. If we do this via FFT, you can do it in $O((m+n)\log (m+n))$. 2. Erasure codes. Recall that if you encode your data as a polynomial of degree $2^{k-1} < n < 2^k$ then you can produce an erasure code $$ C= \{ f(\omega^i) | i = 1 \dots 2^{k+1}-1 \}.$$ Knowing any $2^k$ of these $2^{k+1}$ elements will allow you to reconstruct the polynomial. FFT will allow this in $O(n\log n)$. This is very useful for sharding. ## Mutiplicative FFT ### Theoretical point: Suppose that $p$ is a prime so that $2^k | p-1$ and $F$ is the field of "reminders modulo $p$". In this case $F$ admits $2^k$ roots of 1, that is there exist elements $a \in F$ with $a^{2^k} = 1 \pmod p$. Let $\omega$ be a `primitive` $2^k$ root of one, that is $\omega^{2^k} =1 \pmod p$ in $F$ and $\omega^{2^{k-1}} \ne 1 \pmod p$. (such an element exists). $f$ is a polynomial of degree n and $2^{k-1} < n < 2^k$. Consider the correspondence $$ f \rightarrow S =\{ f(\omega^i) | i = 1 \dots 2^k-1 \}.$$ Clearly knowing $f$ determines the set $S$ and, by Lagrange interpolation, the set $S$ uniquely determines $f$. #### Problems you want to solve: 1. **Evaluation**: Given $f$, compute $S$. Naively this will cost $O(n^2)$ which is unacceptable. FFT will allow for $O(n\log n)$ 2. **Interpolation**: Given the set $S$ of element of $f$ determine $f$ as above. Again Lagrange interpolation costs $O(n^2)$. Again FFT will allow for $O(n\log n)$ ### Algorithms #### Evaluation: The main point here is that, if $\omega$ is a (primitive) $n$ root of 1 (n even) then $\omega^{\frac n 2}= -1$ so and $x= \omega^i$, $y = \omega ^{i+\frac n 2}$ then $x = -y$ and $x^2=y^2$ is a $\frac n 2$ root of 1. Moreover $\omega^2$ will be a primitive $\frac n 2$ root of 1. Consider the decomposition of the polynomial: $$f (x) = a_0 + a_1 x \cdots a_{n}x^{n} = (a_0 + a_2 x^2 + \cdots) + x (a_1 + a_3x^2 \cdots) = f_0(x^2) + x f_1(x^2)$$ If follows that $f(\omega^i) = f_0(\omega^{2i}) + \omega^if_1(\omega^{2i})$. In particular, the evaluation problem for $f$ of degree $n$ reduces to evaluating the two polynomials $f_0$ and $f_1$ which have degree (at most) $\frac n 2$ at all $\frac n 2$ roots of 1. Inductively one can show that you will need $O(n\log n)$ operations. The algorithm then is as follows: `input:` $f = (a_0, ..., a_n)$ `output` $(z_0, ..., z_{2^k-1}) = (f(\omega^0), f(\omega^1), \cdots , f(\omega^{2^k-1}))$ 1. Let $(b_0, \cdots , b_{\frac n 2}) = (a_0, a_2, \cdots, a_n)$ and $(c_0, \cdots , b_{\frac n 2 -1}) = (a_1, a_3, \cdots, a_{n-1})$. 2. Recursively solve the problem for $(b_0, \cdots , b_{\frac n 2})$ and $(c_0, \cdots , b_{\frac n 2 -1})$ replacing $\omega$ by $\omega^2$ to obtain $(x_0, ..., x_{2^{k-1}-1})$ respectively $(y_0, ..., y_{2^{k-1}-1})$. 3. Reasemble the output as $z_i = x_i + \omega^iy_i$ if $i\le 2^{k-1}-1$ and $z_i = x_{i-2^{k-1}} + \omega^iy_{i-2^{k-1}}$ otherwise. #### Interpolation. Suppose that you know the set $S = \{z_i | i = 1 \cdots n-1\}$ and you want to compute apolynomial of degree $< n$ so that $f(\omega^i) = z_i$. You first note that the respective polynomial will have the decomposition $f_0(x^2) + x f_1(x^2)$ as above and that $$f(\omega^i)= f_0(\omega^{2i}) + \omega^i f_1(\omega^{2i})$$ $$f(\omega^{i+ \frac n 2})/2 = y_i + y_{i+\frac n 2})/2= f_0(\omega^{2i}) - \omega^i f_1(\omega^{2i})$$ We can compute $$f_0(\omega^{2i})= \frac{f(\omega^i)+f(\omega^{i+ \frac n 2})} 2 = \frac {z_i + z_{i+\frac n 2}} 2$$ $$f_1(\omega^{2i})= \frac{f(\omega^i)-f(\omega^{i+ \frac n 2})} {2\omega^i} = \frac {z_i - z_{i+\frac n 2}} {2 \omega^i}$$ Therefore the interpolation problem for the set $S=\{y_i | i = 1 \cdots n-1\}$ is equivalent to two interpolation problems $S_0 =\{ \frac {y_i + y_{i+\frac n 2}} 2 | i =0 \cdots \frac n 2 - 1\}$ respectively $S_1 = \{\frac {y_i - y_{i+\frac n 2}}{2 \omega^i} i =0 \cdots \frac n 2 - 1\}$. Inductively this shows that the interpolation will be computed in $O(n\log n)$. The algorithm then is as follows: `input:` $(z_0, ..., z_{2^k-1}) = (f(\omega^0), f(\omega^1), \cdots , f(\omega^{2^k-1}))$ `output` $f = (a_0, ..., a_n)$ 1. Let $x_i = \frac {y_i + y_{i+\frac n 2}} 2$ and $y_i = \frac {z_i - z_{i+\frac n 2}} {2 \omega^i}$ for all $i = 0, \cdots 2^{k-1}-1$. 2. Recursively solve the problem for $(x_0, \cdots , x_{2^{k-1}-1})$ and $(y_0, \cdots , y_{2^{k-1}-1})$ replacing $\omega$ by $\omega^2$ to obtain $(b_0, ..., b_{\frac n 2})$ respectively $(c_0, ..., c_{\frac n 2 -1})$. 3. Reasemble the output as $(a_0, ..., a_n) = (b_0, b_1, a_1, b_2, \cdots , a_{\frac n 2})$. ## Applications: 1. polynomial multiplication: Suppose $f,g$ are two polinomials of degrees $\le n$. It means thet $f\cdot g$ is a poly of degree $\le 2n$. To compute that, compute $f(\omega^i)\cdot g(\omega^i)$ for $0\le i \le 2n$ and $omega$ a $2n$ root of 1 (recall $n$ is a power of two) this takes $O(n\log n)$. Now interpolate the polinomial $g$ of degree less than $2n$ so that $g(\omega^i)=f(\omega^i)\cdot g(\omega^i)$. This again takes $O(n\log n)$. Standard poly multiplication is $n^2$. - Given $a,b$ two vectors with nonegative integer coefficients write an algorithm that finds all possible sums $(a[i]+ b[j])$ and the number of times each appears. 2. General interpolation: Suppose you want to find a polynomial $P$ of degree $\le n$ so that $P(x_i) = y_i$ for $i = 1,\cdots n$. and so that $x_i$ are in the set $\{\omega^k | k = 0,\cdots 2n\}$This is useful for example in erasure codes. You want to use lagrange interpolation, that is to compute $$P(x)= \sum_{i=1}^n y_i\prod_{j\ne i}\frac{x-x_j}{x_i-x_j} =\sum_{i=1}^n y_i\frac{Q_i(x)}{Q_i(x_i)} $$ where $Q_i=\prod_{j\ne i}(x-x_j)$. Note that $Q_i(x_j)=0$ if $i\ne j$. Note that doing this is unreasonably expensive. To do that you compute $Z=(x-x_1)\cdots (x-x_n)$. We then have $$P(x)= \sum_{i=1}^n \frac{Q_i(x)}{Q_i(x_i)} = Z(x)\sum_{i=1}^n \frac{y_i}{Q_i(x_i)(x-x_i)}$$ Note that $Z'(x) = \sum_i Q_i(x)$ and so $Q_i(x_i)=Z'(x_i)$. We can therfore evaluate the $n_i= \frac{y_i}{Q_i(x_i)}$ in $O(n\log n)$. Next we need to compute $Z(x)\sum_{i=1}^n \frac{n_i}{(x-x_i)}$. We use Taylor series to expand $\sum_{i=1}^n \frac{n_i}{(x-x_i)}$ and truncate at $x^n$. The useing polynomial multiplication we can obtain $P$ in $O(n\log n)$. ## Problem: Very often a field $F$ does not contain enough roots of 1. * if $F$ is a field of characteristic 2 then there are no $2^k$ roots of one aside from 1 itself. * in the BLS12-381 library, the field size is `52435875175126190479447740508185965837690552500527637822603658699938581184513` while the largest $2^k$ root of one is of order $2^32=4294967296$. In the literarure we have found two approaches for this: 1. Elliptic curves FFT (see a brief description in [here](https://hackmd.io/@corneliuhoffman/ECFFT)) 2. Additive FFT. There are a number of algorithms for this, see [Todd Mateer's thesis](https://tigerprints.clemson.edu/cgi/viewcontent.cgi?article=1231&context=all_dissertations) for an overview of the state of the art in the late 2000's. To my knowledge the 2014 paper described below is the best existing algorithm ## Additive FFT (based on [Lin & all.](https://www.semanticscholar.org/paper/Novel-Polynomial-Basis-and-Its-Application-to-Codes-Lin-Chung/46ab62acbba389bbaa08d5350dea04ad477ed75d)) ### Finite Fields of characteristic 2: A finite field of characteristic 2 (or binary finite field) is a generalization of mod two arithmetic. It is a finite set $F$ together with operations $+, \cdot$ and elements $0_F, 1_F \in F$so that: * $\forall a, b,c \in F,$ $$a +b = b+a,$$ $$a\cdot b = b\cdot a ,$$ $$a\cdot (b+c) = a\cdot b + a\cdot c $$ $$a+0=a$$ $$a\cdot 1 = a$$ * $\forall a \in F, a+a =0$ * $\forall a \in F\setminus\{0_F\}, \exists b \in F, a\cdot b =1$. We denote that $b$ by $a^{-1}$. #### Properties: 1. The size of a finite field of characteristic 2 is $2^k$ for some $k$ an, for each $k$ there is a "unique" field of that size. It is usually denoted by $\mathbb{F}_{2^k}$ or $GF(2^k)$. We will use the latter notation. In particular the field $GF(2)$ is just the mod 2 arithmetic 2.The field $F =GF(2^k)$ a "copy" inside a field $G =GF(2^l)$ if and only if $k | l$. In that case you can view $G$ as a $F$ vector space and all methods of linear algebra work. 3. $\forall a, b \in F,$ if $a \cdot b = 0$ then $a = 0$ or $b =0$. 4. $\forall a, b \in F, (a +b)^2 = a^2+b^2$. 5. If $F =GF(2^k)$ then $\forall a \in F, a^{2^k} =a$. 6. If $F =GF(2^k)$ then there exists $\omega \in F$ so that $F = \{0, \omega, \omega^2 \cdots \omega^{2^k-1}\}$. 7. If $f(x) = a_0 + a_1x + a_{k-1}x^{k-1} + x^k$ is a polynomial of degree $k$ with coefficients in $F$ then there are at most $k$ elements of $a \in F$ with $f(a)$. Moreover synthetic division of polynomials works as usual. 7. If $F =GF(2^k)$ and $Z_2$ is the field of order 2 then there exists a polynomial $f = a_0 + a_1x + a_{k-1}x^{k-1} + x^k$ of degree $k$ with coefficients in $Z_2$ so that $K =GF(2)[X]/f. That means that $F = \{\sum_{i=0}^{k-1} b_i x^i | b_i \in Z_2\}$ where addition is usual (coefficient XOR) and mutiplication is done modulo the identity $x^k = a_0 + a_1x + a_{k-1}x^{k-1} + x^k$. For example the field with 4 elements can be constructed using the polynomial $x^2+x+1$ so $F = \{0,1,x,x+1\}$ and teh tables for addition and multiplication are: | + | 0 | 1 | x | x+1 | | - | - | - | - | -| | 0 | 0 | 1 | x | x+1| | 1 | 1 | 0 | x+1 | x| | x | x | x+1 | 0 | 1| | x+1 | x+1 | x | 1 | 0| | $\cdot$ | 0 | 1 | x | x+1 | | - | - | - | - | -| | 0 | 0 | 0 | 0 | 0| | 1 | 0 | 1 | x | x+1| | x | 0 | x | x+1 | 1| | x+1 | 0 | x+1 | 1 | x| ### Subspace polynomials: Recall that if $F =GF(2^k)$ $F$ is a $GF(2)$ vector space. Suppose that $V$ is a subspace of $F$. We can define the polynomial $$W(x) = \prod_{a \in V} (x+a) $$ This polynomial has the following properties: 1. $\forall w \in V, W(w) = 0$ 2. $\forall a, b \in F, W(a+b) = W(a) + W(b)$. 3. $W(x) = a_1x + a_2 x^2 + a_4 x^4 + \cdots + x^{2^k}$ where $V$ is a vector space of dimension $k$. 4. $\forall x \in F, w \in V, W(a+w) = W(a)$. ### Lin and all. construction. Let $F = GF(2^k$) and fix $\{v_0, v_1, \cdots v_{r-1}\}$ a bsis of F$ as a vector space. That means that every element of $F$ can be written uniquely asa sumof some of the $v_i$. Fix an order on $F$ as follows: If $$i = i_0 + i_1\cdot 2 + i_2 \cdot 2^2 +\dots + i_{k-1}2^{k-1}$$ is the binary representation of $i$ then we will denote $$ w_i =i_0 + i_1v_1+ i_2 \cdot v_2 +\dots + i_{k-1}v_{k-1}.$$ We also define $W_i$ to be the subspace polynomial for the space $V_i = \{w_j| j= 0\dots 2^i-1\}$ that is $$W_i =\prod_{i=0}^{2^i-1} (x+w_i).$$ **Notes:** - the degree of $W_i$ is $2^i$. - One can recursivelly compmpute $W_i$ using: $$W_{i+1}(x) = W_i^2(x) + W_i(w_{2^i})W_i(x)$$ - if $i\le j$ then $W_i | W_j$. Finally define (with the above binary representation of $i$) $$X_i(x) =\prod_{i=0}^{k-1} \left(\frac{W_j(x)}{W_j(w_{2^j})}\right)^{i_j}$$ It is not to hard to convince yourself that the degree of $X_i$ is i. **Notes:** - the powers $i_j$ are just 0 or 1. In particular $X_0=1$ - if $i_j = 0$ then $\left(\frac{W_j(x)}{W_j(w_{2^j})}\right)^{i_j}=1$ - If $2^{j}\le i < 2^{j+1}$ (that is $j = log_2(i)$) then $\forall w \in V_j, X_i(w) = 0$. - if $i < 2^j$ then $X_i | X_{2^j-1}$. Remarcably $X_i$ forms a basis for the vector space of polynomials with coefficients in $F$ and so they will "play the role" of $x^i$ in the standard FFT. In particular a polynomial of degree $k-1$ can be written as $D_h = (h_0, \cdots, h_{k-1})$ where the polynomial is (notation as in Op.cit.) $$[D_k](x) = \sum_{i=1}^{k-1} d_iX_i(x).$$ Conversely we can have an "evaluation" transformation $$\Psi_k^l([D_k]) = ([D_k](w_0+w_l), [D_k](w_1+w_l), \cdots [D_k](w_{k-1}+w_l))$$ Which admits an "interpolation" inverse that will be described below. For simplicity we will denote $$\bar{W}_j(x)=\frac{W_j(x)}{W_j(w_{2^j})}$$