Try   HackMD

An Overview about FIPS 203: Module-Lattice-based Key-Encapsulation-Mechanism

Introduction

In August 13th, 2024, NIST announced three post-quantum cryptography standards, where two of them are from lattice-based cryptography and the other one is from hash-based cryptography. In this post, we will have an overview about FIPS 203, which specify algorithms derived from CRYSTALS-Kyber, is a key encapsulation mechanism based on module lattice.

Notation

Pre-requisited

The LWE Problem

Definition

The Learning with Errors Problem (LWE) has an important role in many cryptographic schemes that are related to lattice cryptography, so let's review this problem again.

Definition 1. For positive integer

m,n,q and
β<q
, the
LWEn,m,q,β
problem asks to distinguish between the following two distributions:

  • (A,As+e)
    where
    AZqn×m
    ,
    s[β]m
    ,
    e[β]n
    .
  • (A,u)
    where
    AZqn×m
    ,
    uZqn
    .

The crucial part that makes this problem hard is the exist of error vector

e, which makes Gaussian elimination can't be applied here. The hardness of LWE relys on the parameters
n,m,q,β
, and it becomes harder when
m
and
βq
grow. The parameter
n
is not known to have large impact on the hardness of the problem.

We also notice that there is nothing too special about using the uniform distribution for the secret and error terms, it just makes the presentation simpler. When rounded Gaussian distribution is chosen in the original definition of LWE, some practical implementations like Kyber uses the binomial distribution to generate the error because of it speed. To account for different distributions that one could use, we can define the LWE problem with the distribution like this:

Definition 2. For positive integer

m,n,q and a distribution
ψ
, the
LWEn,m,q,β
problem asks to distinguish between the following two distributions:

  • (A,As+e)
    where
    AZqn×m
    ,
    sψm
    ,
    eψn
    .
  • (A,u)
    where
    AZqn×m
    ,
    uZqn
    .

An LWE-based Encryption Scheme

Let's talk about an encryption scheme based on LWE, which relys on the hardness of

LWEm,q,β. This scheme is adapted from the original and have been improved by many cryptographer.

  • Key generation:

    sk:s[β]mpk:(AZqm×m,t=As+e1)
    where
    e1[β]m

  • Encryption: To encrypt a message

    m{0,1}, the encryptor chooses
    r,e2[β]m
    and
    e3[β]
    and output:
    (uT=rTA+e2T,v=rTt+e3+[q2]m)

  • Decryption: To decrypt, one computes

    vuTs. Instead of receiving message
    m
    directly, we will get:
    vuTs=rT(As+e1)+e3+q2m(rTA+e2T)s=rTe1+e3+q2me2Ts

We can rewrite the final equation as

e+q2m, and if the parameters are set such that
e<q4
, the decryptor can determine
m
by checking whether the value
vuTs
is closer to
0
or to
q2
.

Lattices

The core of lattice-based cryptography, include LWE, are objects known as lattices. An

m-dimensional integer lattice
Λ
is simply a subgroup of the group
(Zm,+)
. Such a group can be described via a generating set called a basis. In particular, a lattice
Λ
defined by a (full-rank) basis
BZm×m
is
Λ=L(B)={vZm:Av=0modq}

In this post, we will just learn about

q-ary integer lattices, as there are the ones that are used in cryptographic constructions. They also have the nice theoretical property that solving some problem over random instances of these lattices is as hard as solving some problem for any lattice.

For a matrix

AZqn×m, the q-ary lattice
Λ
defined by
A
is
Λ=Lq(A)={vZm:Av=0modq}

The most well known computational problems on lattices are the following:

  • Shortest Vector Problem (SVP): Given a lattice basis
    B
    , find the shortest nonzero vector in
    L(B)
  • Closest Vector Problem (CVP): Given a lattice basis
    B
    and a target vector
    t
    (not necessarily in the lattice), find the lattice point
    vL(B)
    closest to
    t
  • Shortest Independent Vectors Problem (SIVP): Given a lattice basis
    BRn×n
    , find
    n
    linearity independent lattice vectors
    S=[s1,s2,...,sn]
    (where
    siL(B)
    for all
    i
    ) so that
    max||vi||max||bi||
    , where
    ||x||=x12+x22+...+xn2

We can represent the LWE problem in the language of lattices. Notice that these problems have the relationship: if someone find a way to solve one of these problems efficiently, then he/she can also solve remain problems.

Encryption over Polynomial Rings

The main inefficiency with the LWE-based encryption scheme above was that it required a large ciphertext for encrypting one bit. To deal with this, we can consider the LWE problem over high-degree polynomial rings, rather than just over

Zq.

Polynomial Rings

The polynomial ring

(Z[X],+,×), with the indeterminate
X
, consists of elements of the form
a(X)=i=0aiXi
for
aiZ
, with the usual polynomial addition and multiplication operations. For convenience, we will often omit the indeterminate
X
and simply write
a
instead of
a(X)
. The degree of
a
, denoted
deg(a)
is the largest
i
for which
ai0
. A polynomial is called monic if
adeg(a)=1
and is irreducible if it can be represented as product of two polynomials in the same ring.

We will work with the ring

(Rq,f,+,×), where
fZq[X]
is a monic polynomial of degree
d
. The elements of
Rq,f
are the polynomials
a=i=0d1aiXi
, where
aiZq
. The sum of two elements in
Rq,f
simply involves summing the corresponding coefficients in
Zq
, that is:
a+b=i=0d1(ai+bi)Xi

So the addition of polynomial in

Rq,f can be seen as addition of vectors over
Zqd
. Multiplication of a polynomial by an element in
Zq
therefore also has the same interpretation as multiplying a vector by constant.

Multiplication of two polynomials in

Rq,f involves performing a normal polynomial multiplication followed by a reduction modulo
f
, which means that the remainder after a division by
f
is performed.

Generalized-LWE Problems and Encryption

With the polynomial ring

Rq,f, we can define the new version of
LWEn,m,β
.

Definition 3: For positive integer

m,n,q,β<q and ring
Rq,f
, the
Rq,fLWEn,m,β
problem asks to distinguish between the following two distributions:

  • (A,As+e)
    where
    ARq,fn×m
    ,
    s[β]m
    ,
    e[β]n
    .
  • (A,u)
    where
    ARq,fn×m
    ,
    uRq,fn
    .

We can also improve the LWE-based encryption scheme which are described in previous section:

  • Key generation:

    sk:s[β]mpk:(ARq,fm×m,t=As+e1)
    where
    e1[β]m

  • Encryption: To encrypt a message

    mRf, where the coefficients are in
    {0,1}
    , the encryptor chooses
    r,e2[β]m
    and
    e3[β]
    and output:
    (uT=rTA+e2T,v=rTt+e3+[q2]m)

  • Decryption: To decrypt, one computes:

    vuTs=rT(As+e1)+e3+q2m(rTA+e2T)s=rTe1+e3+q2me2Ts

And we can extract the message

m by checking each coefficient is closer to 0 or to
q
.

Optimizations

Number Theoretic Transform

The main problem of cryptographic scheme using polynomial is the complexity of multiplying two polynomials of degree

d, which is
O(d2)
. When applying a cryptographic scheme in real world application, it shouldn't be too slow as it will effect all system. Therefore, to perform polynomial multiplication in
Rq,f
efficiently, Number Theoretic Transfrom (NTT) is the best method, which has complexity
O(dlogd)
. NTT is a special case of the FFT over the finite field
GF(q)
rather than over the complex numbers.

Compression/Decompression function

A compression function is an operation that takes an element from one set into smaller target set. When the target set is bigger than start set, it will be called a decompress function.

Definition 4: For an element

xZq and some positive integer
p
, we define a mapping from
Zq
to
Zp
as:
[x]qp=[xpq]Zp

We can use the function above to compress/decompress data: when we compress an element in

Zq to one in
Zp
(p < q) and decompress back to
Zq
, the result will not be too far away from the original element.

Lemma 1. For integers

p<q and
xZq
, it holds that
[[x]qp]pq=x+ηZq

for some
ηZ
satisfying
|η|q2p+12
.

There are two reasons for using these functions:

  • Recovering the message
    m
    from the noisy decryption output can be done using the compression function. In particular, for an element
    xZq,[x]q2
    will map to
    0
    if
    x
    is closer to
    0
    than to
    12
    , and to
    1
    otherwise.
  • Using these function will bring bandwidth efficiency while maintaining security properties.

Key Encapsulation Mechanism

In cryptography, a key encapsulation mechanism, or KEM, is a public-key cryptosystem that allows a sender to generate a short secret key and transmit it to a receiver securely, in spite of eavesdropping and intercepting adversaries.

There are three algorithms in KEM - KEM-KeyGen, KEM-Encaps and KEM-Decaps.

  • The key generation algorithms outputs a secret key and a public key.
  • The encapsulation algorithm takes the public key as input and outputs a share key and a ciphertext.
  • The decapsulation algorithm takes the ciphertext and the secret key as input and produces the same shared key as output

A CPA-secure KEM is one in which an adversary cannot distinguish the shared key from uniform when given the public key and ciphertext. Such a KEM can be constructed from any CPA-secure public key encryption scheme by simply encrypting a random message and setting it as the shared key.

Detail

Algorithms

CRYSTALS-Kyber CPA-secure Encryption Scheme

Let's talk about CRYSTALS-Kyber CPA-secure Encryption Scheme, which is based on the hardness of the generalized LWE problem. The scheme works over the ring

R3329,X256+1, and the distribution of the secrets, denoted as
ψη
for positive integer
η
, is drawn from the binomial distribution because it's easier to sample.

  • Public parameters:
    k,η1,η2,du,dvZ+
  • CPA-KeyGen:
    AR3329,X256+1k×k(s,e)ψη1k×ψη1kt=As+epk=(A,t),sk=s
  • CPA-Encrypt(pk, m):
    (r,e1,e2)ψη1k×ψη2k×ψη2uT=[rTA+e1T]q2duv=[rTt+e2+q12m]q2dvc=(u,v)
  • CPA-Decrypt(sk, c):
    u=[u]2duqv=[v]2dvqm=[vuTs]q2

ML-KEM

Wrapped everything above, we can talk about ML-KEM right now!

  • Public parameters: Same as CPA-Encryption scheme.
  • KEM-KeyGen:
    (pk,sk)CPA-KeyGenpk:=(A,t),sk:=s
  • KEM-Encaps(pk):
    m{0,1}256RX256+1(K,ρ):=H(m,pk){0,1}512c:=CPA-Encrypt(pk,m,ρ)sk:=K,ctxt:=c
  • KEM-Decaps(sk, c, h, z):
    m:=CPA-Decrypt(sk,c)(K,ρ):=H(m,pk)c:=CPA-Encrypt(pk,m,ρ)ccK:=⊥Shared Key:=K

There are some notices about ML-KEM:

  • The polynomials comprising the matrix
    A
    are sampled at random. And in order to efficiently do the multiplication
    As
    , we need to convert all polynomials in
    A
    to their NTT representation. The best strategy here is sample
    A
    randomly in its NTT representation. Furthermore, the public key
    t
    should be stored in its NTT representation for many benefits it brings.
  • We can't sample
    s,e,r,e1,e2
    directly in their NTT representation because their distribution is not uniformly random.
  • We cannot perform the compression operations
    [.]qp
    when the element is in its NTT representation.

The table below is parameters for the three instantiations of Kyber. The security of the three schemes are approximately equivalent to that of AES-128, AES-192, and AES-256, respectively.

k
η1
η2
du
dv
decapsulation key size encapsulation key size ciphertext size
Kyber-512 2 3 2 10 4 1632 B 800 B 768 B
Kyber-768 3 2 2 10 4 2400 B 1184 B 1088 B
Kyber-1024 4 2 2 11 5 3168 B 1568 B 1568 B

Implementation

The implementation is described clearly at here. You can find example implementation of ML-KEM at https://github.com/Giapppp/ml-kem.

Benchmark

This part uses the implementation above to run three instantiations of Kyber. The environment uses here is MacOS Solama, 1,4 GHz Quad-Core Intel Core i5, 16 GB 2133 MHz LPDDR3.

Times
Kyber-512 0.26s
Kyber-768 0.40s
Kyber-1024 0.81s

Conclusion

FIPS 203 and the ML-KEM standard represent significant advancements in cryptographic technology, particularly in preparing for potential future threats posed by quantum computing. By understanding the parameter sets, differences from previous schemes, and practical considerations, organizations can effectively implement ML-KEM to enhance their data protection strategies.

Resources

FIPS 203: Module-Lattice-based Key-Encapsulation-Mechanism

Vadim Lyubashevsky, Basic Lattice Cryptography: The concepts behind Kyber (ML-KEM) and Dilithium (ML-DSA)