CKKS stuff - HackMD

# CKKS SIMD encoding & decoding We need a way to map a vector in $C^{N/2}$ to polynomial in $R/X^N+1$. Define the $M^{th}$ cyclotomic polynomial $$X^N+1 = \prod_{\omega_j \in P_M} X - \omega_j$$ where $P_M = {\zeta^i | i \in Z^*_M}$ (i.e. the set of all $M^{th}$ primitive root of unity) Since $X-\omega_j$ are co-primes, due to CRT, there exist following isomorphism: $$X^N+1 \cong \prod_{\omega_j \in P_M} X - \omega_j$$ The map that reduces polynomial in $R/X^N+1$ to its representation in $X-\omega_j$ is a simple evaluation of polynomial over $\omega_j$. Reducing the polynomial for each $\omega_j$ materialises into a metric vector product. More formally,  Define $M^{th} = 2N^{th}$ root of unity $\zeta = e^{\pi i / N}$ and the set: $$\{\zeta_j, \bar{\zeta_j}\ 0 \le j \lt N/2 \}$$ where $\zeta_j = \zeta^{5^{j}}$. The set consists of all $M^{th}$ primitive root of unity and equals the other set $\{\zeta^i | i \in Z^*_M\} = P_M$. This is because integer 5 has order M/4 and spans across $Z^*_M$ with -1 (Take this as granted) Define map $\sigma(m)$ that sends $m \in \mathbf{R}/X^N+1$ to $z \in C^N/2$ (i.e. $\prod_{\omega_j \in P_M} X - \omega_j$) as $$\begin{bmatrix} z \\ \bar{z} \end{bmatrix} = \sigma(m) = \begin{bmatrix} U \\ \bar{U} \end{bmatrix} \cdot m$$ where $$U = \begin{matrix} 1 & \zeta_0 & \zeta_0^2 & \cdots & \zeta_0^{N-1}\\ 1 & \zeta_1 & \zeta_1^2 & \cdots & \zeta_1^{N-1}\\ \vdots \\ 1 & \zeta_{N/2-1} & \zeta_{N/2-1}^2 & \cdots & \zeta_{N/2-1}^{N-1}\\ \end{matrix}$$ and $\bar{U}$ is conjugate matrix of $U$. Observe that: 1. The map evaluates $m$ at values of $\zeta_j$ and $\bar{\zeta_j}$. Matrix $U$ contains powers of $\zeta_j$ and matrix $\bar{U}$ contains powers of $\bar{\zeta_j}$. 2. First half of resulting vector of complex values is complex conjugate of second half. This is because $\zeta_j$ is complex conjugate of $\bar{\zeta_j}$ 3. Due to 2, if we were to map a vector of $C^N$ to $\mathbf{R}/X^N+1$, we can only utilise half of the vector space (i.e. can only encode N/2 complex elements). Inverse of $\sigma$ which sends $z \in C^{N/2}$ to $m \in \mathbf{R}/X^N+1$ is: $$ \begin{bmatrix} U \\ \bar{U} \end{bmatrix}^{-1} = \frac{1}{N} \begin{bmatrix} \bar{U}^T & U^T\\ \end{bmatrix} $$ $$ \sigma^{-1}(z): \frac{1}{N} \begin{bmatrix} \bar{U}^T & U^T\\ \end{bmatrix} \cdot \begin{bmatrix} z \\ \bar{z} \end{bmatrix} = \frac{1}{N} \begin{bmatrix} \bar{U}^Tz + U^T \bar{z} \\ \end{bmatrix} $$ **Encoding** Given a complex vector $z \in C^{N/2}$ we find $m \in \mathbf{R}/X^N+1$ as: $$ m = \sigma^{-1}(z) = \frac{1}{N} \begin{bmatrix} \bar{U}^Tz + U^T \bar{z} \\ \end{bmatrix} $$ However, in practice we want to encode $z$ in $Z_Q/X^N+1$ which requires a few additional steps: 1. After encoding $z$ into $m \in R/X^N+1$, we scale $m$'s coefficients by $\Delta$ and then randomly round resulting coefficients to find scaled $m \in Z/X^N+1$. 2. Coefficients of $m$ are $\in Z$. We map coefficients from signed interval to unsigned interval $[0, Q)$ and output $m \in Z_Q/X^N+1$. **Decoding** To map $m \in Z_Q/X^N+1$ to $z \in C^{N/2}$, 1. map coefficients of $m$ from unsigned interval in $Z_Q$ to signed interval $\in Z$ to find $m \in Z/X^N+1$. 2. Scale down the coefficients of $m$ by $\Delta$ to find $m \in \mathbf{R}/X^N+1$. 3. Apply $\sigma(m)$ to find $z \in C/X^N+1$ *Note: $\Delta$ may vary across the circuit and \Delta used for decoding may not equal \Delta used for encoding.* **Encoding and decoding using Special(I)FFT** Observe that $X^N+1$ splits into $X^{N/2}-i$ and $X^{N/2}+i$, which implies each polynomial in ring $R/X^N+1$ has unique representation in ring $R/X^{N/2}-i$. We can reduce representation of polynomial in ring $R/X^N+1$ to ring $R/X^{N/2}-i$ by simply setting term $X^{N/2} = i$. And the inverse is the opposite: set the real values of polynomial in $R/X^{N/2}-i$ as first half coefficients and set imaginary values of polynomial in $R/X^{N/2}-i$ as second half coefficients of polynomial in ring $R/X^N+1$ We cannot map a vector $z \in C^{N/2}$ to $X^{N/2} - i$ directly efficiently. Instead observe that following change of base for $F(X) \in R/X^N-i$ from $X \to \omega X$ where $\omega$ is $2N^{th}$ root of unity results in: $$F(X) = H(X) + G(X) (X^N-i)$$ $$F(\omega X) = H(\omega X) + G(\omega X) ((\omega X)^{N/2}-i)$$ Since $\omega^{N/2} = i$, $$F(\omega X) = H(\omega X) + iG(\omega X) (X^{N/2}-1)$$ $$F(\omega X) = H(\omega X)\mod{X^{N/2}-1}$$ Therefore, using Inverse FFT (Inverse cyclic convolution), we map $z \in C^{N/2}$ to $R/X^N-1$ and then multiply coefficients by powers of $\omega^{-1}$ to find the corresponding representation in ring $R/X^{N/2}-i$. Hence, we can encode $z \in C^{N/2}$ to ring $R/X^N+1$ by (1) using FFT of size N/2 to find its representation in $R/X^{N/2}-1$, followed by (2) complex multiplication by powers of $\omega^{i}$ to find representation in $R/X^{N/2}-i$, and then (3) setting real values as first half coefficients and imaginary values as second half coefficients to finally find polynomial in ring $R/X^N+1$ To decode $m \in R/X^N+1$, extract first half coeffcients and set them as reals of polynomial in $R/X^{N/2}-i$ and second half coefficients as the imaginary values. Then multiply the polynomial by powers of $\omega$ to find corresponding representation in $R/X^N-1$. Followed by FFT transform to find $z \in C^{N/2}$. Algorithm 1 of [paper](https://eprint.iacr.org/2018/1043.pdf) shows how to pack all these steps into single FFT like operation. We refer to them as SpecialFFT (decoding) / SpecialIFFT (encoding). ## Rotations Let $\rho_k$ be the following map: $$\rho_k: m(X) \rightarrow m(X^{k})$$ To send slot element at slot $i$ to slot $j$ define, $$k = 5^{i-j} \mod{M}$$ Given $m(X)$, which encrypts slots $(z_i)_{0 \le i \lt N/2} \in C^{N/2}$, map $\rho_k$ outputs $$\tilde{m}(X) = m(X^k) = \rho_k(m(X))$$ where $\tilde{m}(X)$ encrypts slots $(\tilde{z}_j)_{0 \le j \lt N/2}$ and $\tilde{z}_j = z_i$ To understand why notice: $$\tilde{z}_j = \tilde{m}(\zeta_j) = m(\zeta_j^{5^{i-j}}) = m(\zeta^{5^j(5^{i-j})}) = m(\zeta^{5^i}) = m(\zeta_i) = z_i$$ ## Conjugation We know that $$\zeta^{-1} = \bar{{\zeta}}$$ Conjugation of $m(X)$ that encrypts slots $(z_i)_{0 \le i \lt N/2} \in C^{N/2}$ is defined as map: $$\tilde{m}(X) = m(X^{-1}) = \rho_{-1}(m(X))$$ where $\tilde{m}(X)$ encrypts slots $(\tilde{z}_i)_{0 \le i \lt N/2}$. This follows from: $$\tilde{z}_i = \tilde{m}(\zeta_i) = m(\zeta_i^{-1}) = m(\bar{\zeta_i}) = \bar{z}_i$$ ## References 1. https://eprint.iacr.org/2016/421.pdf # Homomorphically switching from Coefficient to SIMD encoding Recall that a vector of complex values $z \in C^{N/2}$ is encoded as plaintext $m \in R/X^N+1$ as: $$ m = \frac{1}{N} \begin{bmatrix} \bar{U}^Tz + U^T \bar{z} \\ \end{bmatrix} $$ Observe that we can extract upper half coefficients of $m$ and lower half coefficients of $m$ by splitting $U$ into left and right halves. Define $$U_0 = \begin{matrix} 1 & \zeta_0 & \zeta_0^2 & \cdots & \zeta_0^{\frac{N}{2}-1}\\ 1 & \zeta_1 & \zeta_1^2 & \cdots & \zeta_1^{\frac{N}{2}-1}\\ \vdots \\ 1 & \zeta_{N/2-1} & \zeta_{N/2-1}^2 & \cdots & \zeta_{N/2-1}^{\frac{N}{2}-1}\\ \end{matrix}$$ AND $$U_1 = \begin{matrix} \zeta_0^{\frac{N}{2}} & \zeta_0^{\frac{N}{2}+1} & \zeta_0^{\frac{N}{2}+2} & \cdots & \zeta_0^{N-1}\\ \zeta_1^{\frac{N}{2}} & \zeta_1^{\frac{N}{2}+1} & \zeta_1^{\frac{N}{2}+2} & \cdots & \zeta_1^{N-1}\\ \vdots \\ \zeta_{\frac{N}{2}-1}^{\frac{N}{2}} & \zeta_{\frac{N}{2}-1}^{\frac{N}{2}+1} & \zeta_{\frac{N}{2}-1}^{\frac{N}{2}+2} & \cdots & \zeta_{\frac{N}{2}-1}^{N-1}\\ \end{matrix}$$ Where $$U = U_0 | U_1$$ Then we can extract first half, $z'_0$, and second half, $z'_1$, coefficients of message $m$ via linear transforms on the plaintext $z$ using the identity: $$ z'_k = \frac{1}{N} \begin{bmatrix} \bar{U_k}^Tz + U_k^T \bar{z} \\ \end{bmatrix} $$ **How we do calculate $\bar{U_k}^Tz$ (or $U_k^T \bar{z}$) homomorphically?** It is a simple matrix vector product and can be evaluated via rotations, multiplications, and additions while consuming just one multiplicative depth (because there are no consecutive multiplications on ciphertexts). For techniques on efficient evalution of matrix vector product homomorhpically I will refer you to section 4.3 of https://eprint.iacr.org/2014/106.pdf However, doing this via naive matrix vector multiplication will be too expensive in terms of runtime. This is because the dimension of matrix $U$ is really large (for ex. $2^{15}$). Hence, we need to figure out a faster way to evaluate the same matrix. If you are familiar with FFTs/NTTs, then you will notice that encoding equation for $z$ can be evaluated using FFTs and, indeed, in practice to encode vector $z$ as $m$ we use a FFT like algorithm. Can we leverage FFT view of the algorithm do speed up evaluating $\bar{U_k}^Tz$? Well, yes. Following is matrix formulation of FFTs: ![image](https://hackmd.io/_uploads/r1-ymh1l0.png) *credit: https://www2.math.uconn.edu/~olshevsky/classes/2018_Spring/math3511/FFT.pdf* $F_N$ can recursively be written as product of $\log{N}$ matrices. The interesting thing about the matrices on the right is all of them are sparse. In fact, each has atmost 3 non-zero diagonals. This is good news for us because we can reduce rotations and scalar multiplication to at-most $3*\log{N}$. However, as you will notice, this increases the no. of consecutive multiplications on input vector to $\log{N}$, hence multiplicatve depth increases to $\log{N}$. We can reduce the depth consumed by composing consecutive matrices into groups of $r$, thus reducing the depth to $\log{N}/r$, at the cost of runtime (since combining cosecutive matrices causes no. of non-zero diagonals in each matrix to increase). ## References 1. https://eprint.iacr.org/2016/421.pdf 2. https://eprint.iacr.org/2018/1073.pdf