# An Overview about FIPS 203: Module-Lattice-based Key-Encapsulation-Mechanism ## Introduction In August 13th, 2024, NIST announced three post-quantum cryptography standards, where two of them are from lattice-based cryptography and the other one is from hash-based cryptography. In this post, we will have an overview about FIPS 203, which specify algorithms derived from CRYSTALS-Kyber, is a key encapsulation mechanism based on module lattice. ## Notation ## Pre-requisited ### The LWE Problem #### Definition The Learning with Errors Problem (LWE) has an important role in many cryptographic schemes that are related to lattice cryptography, so let's review this problem again. __Definition 1.__ For positive integer $m, n, q$ and $\beta < q$, the $LWE_{n, m, q, \beta}$ problem asks to distinguish between the following two distributions: - $(A, As + e)$ where $A \gets \mathbb{Z}_q^{n \times m}$, $s \gets [\beta]^m$, $e \gets [\beta]^n$. - $(A, u)$ where $A \gets \mathbb{Z}_q^{n \times m}$, $u \gets \mathbb{Z}_q^n$. The crucial part that makes this problem hard is the exist of error vector $e$, which makes Gaussian elimination can't be applied here. The hardness of LWE relys on the parameters $n, m, q, \beta$, and it becomes harder when $m$ and $\frac{\beta}{q}$ grow. The parameter $n$ is not known to have large impact on the hardness of the problem. We also notice that there is nothing too special about using the uniform distribution for the secret and error terms, it just makes the presentation simpler. When rounded Gaussian distribution is chosen in the original definition of LWE, some practical implementations like Kyber uses the binomial distribution to generate the error because of it speed. To account for different distributions that one could use, we can define the LWE problem with the distribution like this: __Definition 2.__ For positive integer $m, n, q$ and a distribution $\psi$, the $LWE_{n, m, q, \beta}$ problem asks to distinguish between the following two distributions: - $(A, As + e)$ where $A \gets \mathbb{Z}_q^{n \times m}$, $s \gets \psi^m$, $e \gets \psi^n$. - $(A, u)$ where $A \gets \mathbb{Z}_q^{n \times m}$, $u \gets \mathbb{Z}_q^n$. #### An LWE-based Encryption Scheme Let's talk about an encryption scheme based on LWE, which relys on the hardness of $LWE_{m, q, \beta}$. This scheme is adapted from the original and have been improved by many cryptographer. - Key generation: $$\begin{aligned} sk &\colon \bf{s} \gets [\beta]^m \\ pk &\colon (A \in \mathbb{Z}_q^{m \times m}, t = As + e_1) \end{aligned}$$ where $e_1 \gets [\beta]^m$ - Encryption: To encrypt a message $m \in \{0, 1\}$, the encryptor chooses $r, \bf{e_2} \gets [\beta]^m$ and $e_3 \gets [\beta]$ and output: $$\big ( u^T = r^TA + e_2^T, v = r^Tt + e_3 + [\frac{q}{2}]m \big )$$ - Decryption: To decrypt, one computes $v - \bf{u^Ts}$. Instead of receiving message $m$ directly, we will get: $$\begin{aligned} v - \bf{u^Ts} &= r^T(As + e_1) + e_3 + \frac{q}{2}m - (r^TA + e_2^T)s \\ &= r^Te_1 + e_3 + \frac{q}{2}m - e_2^Ts \end{aligned}$$ We can rewrite the final equation as $e + \frac{q}{2}m$, and if the parameters are set such that $e < \frac{q}{4}$, the decryptor can determine $m$ by checking whether the value $v - \bf{u^Ts}$ is closer to $0$ or to $\frac{q}{2}$. ### Lattices The core of lattice-based cryptography, include LWE, are objects known as lattices. An $m$-dimensional integer lattice $\Lambda$ is simply a subgroup of the group $(\mathbb{Z}^m, +)$. Such a group can be described via a generating set called a basis. In particular, a lattice $\Lambda$ defined by a (full-rank) basis $\mathcal{B} \in \mathbb{Z}^{m \times m}$ is $$\Lambda = \mathcal{L}(\mathcal{B}) = \{v \in \mathbb{Z}^m : \exists Av = 0 \mod q\}$$ In this post, we will just learn about $q$-ary integer lattices, as there are the ones that are used in cryptographic constructions. They also have the nice theoretical property that solving some problem over random instances of these lattices is as hard as solving some problem for any lattice. For a matrix $A \in \mathbb{Z}_q^{n \times m}$, the q-ary lattice $\Lambda$ defined by $\mathbb{A}$ is $$\Lambda = \mathcal{L}_q^{\bot}(\mathbb{A}) = \{v \in \mathbb{Z}^m : Av\}$$ The most well known computational problems on lattices are the following: - __Shortest Vector Problem (SVP)__: Given a lattice basis $\bf{B}$, find the shortest nonzero vector in $\mathcal{L}(\bf{B})$ - __Closest Vector Problem (CVP)__: Given a lattice basis $\bf{B}$ and a target vector $\bf{t}$ (not necessarily in the lattice), find the lattice point $\bf{v} \in \mathcal{L}(\bf{B})$ closest to $\bf{t}$ - __Shortest Independent Vectors Problem (SIVP)__: Given a lattice basis $\bf{B} \in \mathbb{R}^{n \times n}$, find $n$ linearity independent lattice vectors $\bf{S} = [s_1, s_2,...,s_n]$ (where $s_i \in \mathcal{L}(\bf{B})$ for all $i$) so that $max||v_i|| \le max||b_i||$, where $||x|| = \sqrt{x_1^2 + x_2^2 +...+ x_n^2}$ We can represent the LWE problem in the language of lattices. Notice that these problems have the relationship: if someone find a way to solve one of these problems efficiently, then he/she can also solve remain problems. ### Encryption over Polynomial Rings The main inefficiency with the LWE-based encryption scheme above was that it required a large ciphertext for encrypting one bit. To deal with this, we can consider the LWE problem over high-degree polynomial rings, rather than just over $\mathbb{Z}_q$. #### Polynomial Rings The polynomial ring $(\mathbb{Z}[X], +, \times)$, with the indeterminate $X$, consists of elements of the form $a(X) = \sum_{i = 0}^\infty a_iX^i$ for $a_i \in \mathbb{Z}$, with the usual polynomial addition and multiplication operations. For convenience, we will often omit the indeterminate $X$ and simply write $a$ instead of $a(X)$. The degree of $a$, denoted $deg(a)$ is the largest $i$ for which $a_i \ne 0$. A polynomial is called monic if $a_{deg(a)} = 1$ and is irreducible if it can be represented as product of two polynomials in the same ring. We will work with the ring $(\mathcal{R}_{q, f}, +, \times)$, where $f \in \mathbb{Z}_q[X]$ is a monic polynomial of degree $d$. The elements of $\mathcal{R}_{q, f}$ are the polynomials $a = \sum_{i=0}^{d - 1}a_iX^i$, where $a_i \in \mathbb{Z}_q$. The sum of two elements in $\mathcal{R}_{q, f}$ simply involves summing the corresponding coefficients in $\mathbb{Z}_q$, that is: $$ a + b = \sum_{i=0}^{d - 1}(a_i + b_i)X^i$$ So the addition of polynomial in $\mathcal{R}_{q, f}$ can be seen as addition of vectors over $\mathbb{Z}_q^d$. Multiplication of a polynomial by an element in $\mathbb{Z}_q$ therefore also has the same interpretation as multiplying a vector by constant. Multiplication of two polynomials in $\mathcal{R}_{q, f}$ involves performing a normal polynomial multiplication followed by a reduction modulo $f$, which means that the remainder after a division by $f$ is performed. #### Generalized-LWE Problems and Encryption With the polynomial ring $\mathcal{R}_{q, f}$, we can define the new version of $LWE_{n, m, \beta}$. __Definition 3:__ For positive integer $m, n, q, \beta < q$ and ring $\mathcal{R}_{q, f}$, the $\mathcal{R}_{q, f}-LWE_{n, m, \beta}$ problem asks to distinguish between the following two distributions: - $(A, As + e)$ where $A \gets \mathcal{R}_{q, f}^{n \times m}$, $s \gets [\beta]^m$, $e \gets [\beta]^n$. - $(A, u)$ where $A \gets \mathcal{R}_{q, f}^{n \times m}$, $u \gets \mathcal{R}_{q, f}^n$. We can also improve the LWE-based encryption scheme which are described in previous section: - Key generation: $$\begin{aligned} sk &\colon \bf{s} \gets [\beta]^m \\ pk &\colon (A \in \mathcal{R}_{q, f}^{m \times m}, t = As + e_1) \end{aligned}$$ where $e_1 \gets [\beta]^m$ - Encryption: To encrypt a message $m \in \mathcal{R}_f$, where the coefficients are in $\{0, 1\}$, the encryptor chooses $r, \bf{e_2} \gets [\beta]^m$ and $e_3 \gets [\beta]$ and output: $$\big ( u^T = r^TA + e_2^T, v = r^Tt + e_3 + [\frac{q}{2}]m \big )$$ - Decryption: To decrypt, one computes: $$\begin{aligned} v - \bf{u^Ts} &= r^T(As + e_1) + e_3 + \frac{q}{2}m - (r^TA + e_2^T)s \\ &= r^Te_1 + e_3 + \frac{q}{2}m - e_2^Ts \end{aligned}$$ And we can extract the message $m$ by checking each coefficient is closer to 0 or to $q$. ### Optimizations #### Number Theoretic Transform The main problem of cryptographic scheme using polynomial is the complexity of multiplying two polynomials of degree $d$, which is $\mathcal{O}(d^2)$. When applying a cryptographic scheme in real world application, it shouldn't be too slow as it will effect all system. Therefore, to perform polynomial multiplication in $\mathcal{R}_{q, f}$ efficiently, Number Theoretic Transfrom (NTT) is the best method, which has complexity $\mathcal{O}(d \log d)$. NTT is a special case of the FFT over the finite field $GF(q)$ rather than over the complex numbers. #### Compression/Decompression function A compression function is an operation that takes an element from one set into smaller target set. When the target set is bigger than start set, it will be called a decompress function. __Definition 4:__ For an element $x \in \mathbb{Z}_q$ and some positive integer $p$, we define a mapping from $\mathbb{Z}_q$ to $\mathbb{Z}_p$ as: $$[x]_{q \to p} = [\frac{xp}{q}] \in \mathbb{Z}_p$$ We can use the function above to compress/decompress data: when we compress an element in $\mathbb{Z}_q$ to one in $\mathbb{Z}_p$ (p < q) and decompress back to $\mathbb{Z}_q$, the result will not be too far away from the original element. __Lemma 1.__ For integers $p < q$ and $x \in \mathbb{Z}_q$, it holds that $$[[x]_{q \to p}]_{p \to q} = x + \eta \in \mathbb{Z}_q$$ for some $\eta \in \mathbb{Z}$ satisfying $|\eta| \le \frac{q}{2p} + \frac{1}{2}$. There are two reasons for using these functions: - Recovering the message $m$ from the noisy decryption output can be done using the compression function. In particular, for an element $x \in \mathbb{Z}_q, [x]_{q \to 2}$ will map to $0$ if $x$ is closer to $0$ than to $\frac{1}{2}$, and to $1$ otherwise. - Using these function will bring bandwidth efficiency while maintaining security properties. ### Key Encapsulation Mechanism In cryptography, a key encapsulation mechanism, or KEM, is a public-key cryptosystem that allows a sender to generate a short secret key and transmit it to a receiver securely, in spite of eavesdropping and intercepting adversaries. There are three algorithms in KEM - `KEM-KeyGen`, `KEM-Encaps` and `KEM-Decaps`. - The key generation algorithms outputs a secret key and a public key. - The encapsulation algorithm takes the public key as input and outputs a share key and a ciphertext. - The decapsulation algorithm takes the ciphertext and the secret key as input and produces the same shared key as output A CPA-secure KEM is one in which an adversary cannot distinguish the shared key from uniform when given the public key and ciphertext. Such a KEM can be constructed from any CPA-secure public key encryption scheme by simply encrypting a random message and setting it as the shared key. ## Detail ### Algorithms #### CRYSTALS-Kyber CPA-secure Encryption Scheme Let's talk about CRYSTALS-Kyber CPA-secure Encryption Scheme, which is based on the hardness of the generalized LWE problem. The scheme works over the ring $\mathcal{R}_{3329, X^{256} + 1}$, and the distribution of the secrets, denoted as $\psi_{\eta}$ for positive integer $\eta$, is drawn from the binomial distribution because it's easier to sample. - Public parameters: $k, \eta_1, \eta_2, d_u, d_v \in \mathbb{Z}^+$ - `CPA-KeyGen`: $$\begin{aligned} A &\gets \mathcal{R}_{3329, X^{256} + 1}^{k \times k} \\ (s, e) &\gets \psi_{\eta_1}^k \times \psi_{\eta_1}^k \\ t &= As + e \\ pk &= (A, t), sk = s \end{aligned}$$ - `CPA-Encrypt(pk, m)`: $$\begin{aligned}(r, e_1, e_2) &\gets \psi_{\eta_1}^k \times \psi_{\eta_2}^k \times \psi_{\eta_2} \\ u^T &= [r^TA + e_1^T]_{q \to 2^{d_u}} \\ v &= [r^Tt + e_2 + \frac{q - 1}{2}m]_{q \to 2^{d_v}} \\ c &= (u, v)\end{aligned}$$ - `CPA-Decrypt(sk, c)`: $$\begin{aligned} u' &= [u]_{2^{d_u} \to q} \\ v' &= [v]_{2^{d_v} \to q} \\ m' &= [v - u^Ts]_{q \to 2}\end{aligned}$$ #### ML-KEM Wrapped everything above, we can talk about ML-KEM right now! - Public parameters: Same as CPA-Encryption scheme. - `KEM-KeyGen`: $$\begin{aligned} (pk, sk) &\gets \text{CPA-KeyGen} \\ pk &:= (A, t), sk := s \end{aligned}$$ - `KEM-Encaps(pk)`: $$ \begin{aligned} m &\gets \{0, 1\}^{256} \in \mathcal{R}_{X^{256} + 1} \\ (K, \rho) &:= \mathcal{H}(m, pk) \in \{0, 1\}^{512} \\ c &:= \text{CPA-Encrypt}(pk, m, \rho) \\ sk &:= K, ctxt := c \end{aligned}$$ - `KEM-Decaps(sk, c, h, z)`: $$\begin{aligned} m' &:= \text{CPA-Decrypt}(sk, c) \\ (K', \rho') &:= \mathcal{H}(m', pk) \\ c' &:= \text{CPA-Encrypt}(pk, m', \rho') \\ c \ne c' &\implies K' := \perp \\ \text{Shared Key} &:= K' \end{aligned} $$ There are some notices about ML-KEM: - The polynomials comprising the matrix $A$ are sampled at random. And in order to efficiently do the multiplication $As$, we need to convert all polynomials in $A$ to their NTT representation. The best strategy here is sample $A$ randomly in its NTT representation. Furthermore, the public key $t$ should be stored in its NTT representation for many benefits it brings. - We can't sample $s, e, r, e_1, e_2$ directly in their NTT representation because their distribution is not uniformly random. - We cannot perform the compression operations $[.]_{q \to p}$ when the element is in its NTT representation. The table below is parameters for the three instantiations of Kyber. The security of the three schemes are approximately equivalent to that of AES-128, AES-192, and AES-256, respectively. | | k | $\eta_1$ | $\eta_2$ | $d_u$ | $d_v$ | decapsulation key size | encapsulation key size | ciphertext size | | --- | --- | -------- | -------- | ----- | ----- | ---------------- | ------- | --------------- | | Kyber-512 | 2 | 3 | 2 | 10 | 4 | 1632 B | 800 B | 768 B | | Kyber-768 | 3 | 2 | 2 | 10 | 4 | 2400 B | 1184 B | 1088 B | | Kyber-1024 | 4 | 2 | 2 | 11 | 5 | 3168 B | 1568 B | 1568 B | ### Implementation The implementation is described clearly at [here](https://csrc.nist.gov/pubs/fips/203/final). You can find example implementation of ML-KEM at https://github.com/Giapppp/ml-kem. ### Benchmark This part uses the implementation above to run three instantiations of Kyber. The environment uses here is MacOS Solama, 1,4 GHz Quad-Core Intel Core i5, 16 GB 2133 MHz LPDDR3. | | Times | | --------- | ----- | | Kyber-512 | 0.26s | | Kyber-768 | 0.40s | | Kyber-1024 | 0.81s | ## Conclusion FIPS 203 and the ML-KEM standard represent significant advancements in cryptographic technology, particularly in preparing for potential future threats posed by quantum computing. By understanding the parameter sets, differences from previous schemes, and practical considerations, organizations can effectively implement ML-KEM to enhance their data protection strategies. ## Resources [FIPS 203: Module-Lattice-based Key-Encapsulation-Mechanism](https://csrc.nist.gov/pubs/fips/203/final) [Vadim Lyubashevsky, Basic Lattice Cryptography: The concepts behind Kyber (ML-KEM) and Dilithium (ML-DSA)](https://eprint.iacr.org/2024/1287.pdf)