In August 13th, 2024, NIST announced three post-quantum cryptography standards, where two of them are from lattice-based cryptography and the other one is from hash-based cryptography. In this post, we will have an overview about FIPS 203, which specify algorithms derived from CRYSTALS-Kyber, is a key encapsulation mechanism based on module lattice.
The Learning with Errors Problem (LWE) has an important role in many cryptographic schemes that are related to lattice cryptography, so let's review this problem again.
Definition 1. For positive integer and , the problem asks to distinguish between the following two distributions:
The crucial part that makes this problem hard is the exist of error vector , which makes Gaussian elimination can't be applied here. The hardness of LWE relys on the parameters , and it becomes harder when and grow. The parameter is not known to have large impact on the hardness of the problem.
We also notice that there is nothing too special about using the uniform distribution for the secret and error terms, it just makes the presentation simpler. When rounded Gaussian distribution is chosen in the original definition of LWE, some practical implementations like Kyber uses the binomial distribution to generate the error because of it speed. To account for different distributions that one could use, we can define the LWE problem with the distribution like this:
Definition 2. For positive integer and a distribution , the problem asks to distinguish between the following two distributions:
Let's talk about an encryption scheme based on LWE, which relys on the hardness of . This scheme is adapted from the original and have been improved by many cryptographer.
Key generation:
where
Encryption: To encrypt a message , the encryptor chooses and and output:
Decryption: To decrypt, one computes . Instead of receiving message directly, we will get:
We can rewrite the final equation as , and if the parameters are set such that , the decryptor can determine by checking whether the value is closer to or to .
The core of lattice-based cryptography, include LWE, are objects known as lattices. An -dimensional integer lattice is simply a subgroup of the group . Such a group can be described via a generating set called a basis. In particular, a lattice defined by a (full-rank) basis is
In this post, we will just learn about -ary integer lattices, as there are the ones that are used in cryptographic constructions. They also have the nice theoretical property that solving some problem over random instances of these lattices is as hard as solving some problem for any lattice.
For a matrix , the q-ary lattice defined by is
The most well known computational problems on lattices are the following:
We can represent the LWE problem in the language of lattices. Notice that these problems have the relationship: if someone find a way to solve one of these problems efficiently, then he/she can also solve remain problems.
The main inefficiency with the LWE-based encryption scheme above was that it required a large ciphertext for encrypting one bit. To deal with this, we can consider the LWE problem over high-degree polynomial rings, rather than just over .
The polynomial ring , with the indeterminate , consists of elements of the form for , with the usual polynomial addition and multiplication operations. For convenience, we will often omit the indeterminate and simply write instead of . The degree of , denoted is the largest for which . A polynomial is called monic if and is irreducible if it can be represented as product of two polynomials in the same ring.
We will work with the ring , where is a monic polynomial of degree . The elements of are the polynomials , where . The sum of two elements in simply involves summing the corresponding coefficients in , that is:
So the addition of polynomial in can be seen as addition of vectors over . Multiplication of a polynomial by an element in therefore also has the same interpretation as multiplying a vector by constant.
Multiplication of two polynomials in involves performing a normal polynomial multiplication followed by a reduction modulo , which means that the remainder after a division by is performed.
With the polynomial ring , we can define the new version of .
Definition 3: For positive integer and ring , the problem asks to distinguish between the following two distributions:
We can also improve the LWE-based encryption scheme which are described in previous section:
Key generation:
where
Encryption: To encrypt a message , where the coefficients are in , the encryptor chooses and and output:
Decryption: To decrypt, one computes:
And we can extract the message by checking each coefficient is closer to 0 or to .
The main problem of cryptographic scheme using polynomial is the complexity of multiplying two polynomials of degree , which is . When applying a cryptographic scheme in real world application, it shouldn't be too slow as it will effect all system. Therefore, to perform polynomial multiplication in efficiently, Number Theoretic Transfrom (NTT) is the best method, which has complexity . NTT is a special case of the FFT over the finite field rather than over the complex numbers.
A compression function is an operation that takes an element from one set into smaller target set. When the target set is bigger than start set, it will be called a decompress function.
Definition 4: For an element and some positive integer , we define a mapping from to as:
We can use the function above to compress/decompress data: when we compress an element in to one in (p < q) and decompress back to , the result will not be too far away from the original element.
Lemma 1. For integers and , it holds that
for some satisfying .
There are two reasons for using these functions:
In cryptography, a key encapsulation mechanism, or KEM, is a public-key cryptosystem that allows a sender to generate a short secret key and transmit it to a receiver securely, in spite of eavesdropping and intercepting adversaries.
There are three algorithms in KEM - KEM-KeyGen
, KEM-Encaps
and KEM-Decaps
.
A CPA-secure KEM is one in which an adversary cannot distinguish the shared key from uniform when given the public key and ciphertext. Such a KEM can be constructed from any CPA-secure public key encryption scheme by simply encrypting a random message and setting it as the shared key.
Let's talk about CRYSTALS-Kyber CPA-secure Encryption Scheme, which is based on the hardness of the generalized LWE problem. The scheme works over the ring , and the distribution of the secrets, denoted as for positive integer , is drawn from the binomial distribution because it's easier to sample.
CPA-KeyGen
: CPA-Encrypt(pk, m)
: CPA-Decrypt(sk, c)
: Wrapped everything above, we can talk about ML-KEM right now!
KEM-KeyGen
: KEM-Encaps(pk)
: KEM-Decaps(sk, c, h, z)
: There are some notices about ML-KEM:
The table below is parameters for the three instantiations of Kyber. The security of the three schemes are approximately equivalent to that of AES-128, AES-192, and AES-256, respectively.
k | decapsulation key size | encapsulation key size | ciphertext size | |||||
---|---|---|---|---|---|---|---|---|
Kyber-512 | 2 | 3 | 2 | 10 | 4 | 1632 B | 800 B | 768 B |
Kyber-768 | 3 | 2 | 2 | 10 | 4 | 2400 B | 1184 B | 1088 B |
Kyber-1024 | 4 | 2 | 2 | 11 | 5 | 3168 B | 1568 B | 1568 B |
The implementation is described clearly at here. You can find example implementation of ML-KEM at https://github.com/Giapppp/ml-kem.
This part uses the implementation above to run three instantiations of Kyber. The environment uses here is MacOS Solama, 1,4 GHz Quad-Core Intel Core i5, 16 GB 2133 MHz LPDDR3.
Times | |
---|---|
Kyber-512 | 0.26s |
Kyber-768 | 0.40s |
Kyber-1024 | 0.81s |
FIPS 203 and the ML-KEM standard represent significant advancements in cryptographic technology, particularly in preparing for potential future threats posed by quantum computing. By understanding the parameter sets, differences from previous schemes, and practical considerations, organizations can effectively implement ML-KEM to enhance their data protection strategies.