# Post-Quantum Cryptography Notes ![](https://hackmd.io/_uploads/H1x-jG-Za.png) ![](https://hackmd.io/_uploads/HkT7ofWWa.png) ## Designing cryptosystems that can run on today's classical computers and are secure against quantum attacks. ### What's the rush in developing post-quantum cryptosystems? - Big QCs probably won't exist at a commercial level for several years, however : - Harvesting Attacks: SNDL (Store Now Decrypt Later), Attackers all over the world are currently following this attacks strategy where they are storing today's cipher texts and public keys and are in hopes of performing a brute force attack, once QCs are a common thing, and are commercially available. - Rewriting Past Timestamps (fork history and rewrite transactions that happened in the past thereby, making blockchains *mutable*) - Deploying new cryptography at scale takess about 10 years. ## Lattices ![](https://hackmd.io/_uploads/BJZXRXWb6.png) ### Why Lattices? - Linear and highly parallelizable by the computer. - Resists quantum attacks (till now) - Security from mild **worst-case hardness assumptions** (this means that these cryptosystems exhibit random-instances of hardness, based on the fact that there exists hard instances). - Solutions to some amazing cryptosystems till date like *Fully Homomorphic Encryption*. ### What is a Lattice? - A periodic `grid` in $\mathbb{Z_m}$. (Formally: a full-rank additive group.) ![](https://hackmd.io/_uploads/r1AaGkXba.png) This diagram represents a 2-dimenstional lattice for the viewer's simplicity. It consists of a basis, a *basis* is a set of $m$ linearly independent vectors, which generates a lattice via all integer-based linear combinations. - Basis $B = \{{b_1,b_2,...,b_m}\}$: such that $$\mathcal{L} = \sum_{i=1}^m(\mathbb{Z}.b_i)$$ So before we start doing linear combinations, we can picturise the vectors so be arranged like this: ![](https://hackmd.io/_uploads/r1JCEyXbp.png) - Basis are not unique, there are different vectors forming a differemt basis, thereby constructing the same lattice, for example, differing from the previous set of vectors, if you look at this, it surprisingly builds the same lattice! ![](https://hackmd.io/_uploads/rk3DrkXba.png) ### Some Hard Lattice Problems 1. Find/detect `short` nonzero lattice vectors: (Gap)SVP$\gamma$ (Shortest Vector Problem), SIVP$\gamma$ (Shortest Independent Vector Problem). 2. For $\gamma = poly(m)$, appears to require $2^{Ω(m)}$ time and space, even by the QC. Note that $\gamma$ here is the approximation factor, so in real life, probably one doesn't need to *exactly* find the shortest vector, and can simply compute a vector in the factors of $\gamma$. The approximation factor can also be dependent on the dimension we are dealing with, hence, $\gamma = poly(m)$ means where $\gamma$ is a polynomial in the dimension (degree for newbies) $m$. Lastly, as the approximation factor $\gamma$ grows, the problem becomes easier. As the nature of what an expected solution is grows largely. ### Why Shor's algorithm doesn't work here? Shor's algorithm specializes in finding group structures explicitly, which means, suppose a problem talks about a mathematically defined lattice, using Shor one can construct that lattice explicity, but not perform geometric computations within that lattice, especially in $poly$ time. Here, the SVP$\gamma$ is a geometric problem within that lattice, hence, it is extremely hard to find out the specific geometric pattern in terms of the SVP$\gamma$. Hence, we can say that lattice problems typically fall into the complexity class, where everything depends on the degree of $\gamma$. Usually, if the $\gamma$ is a constant, it usually falls into NP-Complete, however for a $\gamma$ bigger than the square root of the dimension, the complexity class is usually somewhere in $NP \cap Co-NP$. ### Some Harder Problems : Foundations & Digital Signatures #### A Hard Problem: Short Integer Solution `[Ajtai`96]` - $\mathbb{Z_q^n}$ = $n$ dimensional integer vectors modulo $q$. - GOAL: find a non-trivial $z_1,z_2,...,z_m$ $\in$ $\{0, \pm1\}$ such that: ![](https://hackmd.io/_uploads/HJu1leVZ6.png) - In simpler terms, goal is to find a non-trivial "short" $\mathbb{z} \in \mathbb{Z^m}, ||\mathbb{z}|| \leq \beta \ll q$ such that: ![](https://hackmd.io/_uploads/H1QQbx4-T.png) Here, $||\mathbb{z}||$ denotes the Euclidean Norm of $\mathbb{z}$. #### Converting this into a Collision-Resistant Hash Function - Set $m > n log_2 q$. Define "compressing" $f_A: \{0,1\}^m \rightarrow \mathbb{Z_q^n}$. Hence the function is simply defined as: $$f_A(x) = A.x$$ - Collision: let $x, x' \in \{0,1\}^m$ where $Ax = Ax'$, then $A(x-x')=0$, this yields us a **short solution** $z = x-x' \in \{0,1\}^m$. Hence the Euclidean norm of this $||\mathbb{z}||$ is atmost $\sqrt{m}$. ### Correlation of the Short Integer Solution with Lattices - $A \in \mathbb{Z_q^{n \times m}}$ defines a '$q-ary$' lattice: $$\mathcal{L}(A) := \{z\in\mathbb{Z^m} : Az = 0\} \supseteq q\mathbb{Z^m}$$ Here '$q-ary$' means $q \times any \ integer$ will be inside the lattice. And hence, the `short` solutions $z$ lie in the enclosed circle inside the lattice. Refer to the diagram below, this enclosed circle indeed behaves as the Shortest Vector. ![](https://hackmd.io/_uploads/S1PNf5UWT.png) Hence, the short solutions lie in this enclosed circle. #### Worst to Average Case Reduction [Ajtai'96...] Finding "short" ($||\mathbb{z}|| \leq \beta \ll q$) nonzero $\mathbb{z} \in \mathcal{L}(A)$ (for uniformly random $A \in \mathbb{Z_q^{n\times m}}$), then it implies that GapSVP$\beta\sqrt n$ is also solvable on *any* n-d lattice. ---(1) This statement in Lattice Cryptography is known as a `randomized cousin reduction problem`. Here $\beta$ again acts as an approximation factor for the shortest vector, which means that the more we relax $\beta$ we will be facing worse-approximation problems in the worst-case to average-case scenario. Hence, for an instance here if $\beta$ is a polynomial, then $\beta \sqrt n$ is also a polynomial, and then the problem would indeed be considered as a hard problem. ### How is the $\beta$ value even deduced? Honestly, it's reverse-engineered, hit experimenting, it's regarded as the smallest value that we can get away with. If the domain for 'short' $\mathbb{z} \in \mathbb{Z^m}$, then the value of $\beta$ at the worst-case is $\sqrt m$. Now we can simply plug in the value of $\beta$ into eq ---(1) and deduce the worst-case time complexity. #### How is the self-reduction problem stated here different from the self-reduction problem stated in Discrete Log? In discrete log, self-reduction can be defined in the following way. >If a problem's discrete log can be computed in $poly \ time ()$ under the group $g$. Then the discrete log of a similar problem can also be solvable in $poly \ time ()$ under the same group $g$. In lattice problems, however, these things work differently, first of all because in the cousin-reduction statement, the 2 problems may or may not be in the same group $g$. If the groups were definitely same like the ones' in discrete log, then the QC would likely have an easy time, figuring out all the factors of a certain number in that group, in about $poly \ time()$. ## Applications of this Lattice Theory in Digital Signatures. ### Application: [GentryPeikertVaikuntanathan'08] - Generate uniform $vk = A$ such that with a `secret trapdoor` $sk = T$. As we discussed before, in a digital signature, we just need to generate a `public key` and a `private key`, here, the following are a basically a uniformly random matrix $A$ and $T$ respectively. Hence, just to reiterate, $T$ will be my secret signing key and $A$ will be viewing key that I give out to the public. - Sign($T,\mu$) and use $T$ to sample a short $\mathbb{z} \in \mathbb{Z^m}$ s.t. $A\mathbb{z} = H(\mu) \in \mathbb{Z_q^n}$. So to boil it down we first use as a hash function H to map the hidden message $\mu$ to a vector in $\mathbb{Z_q^n}$. Then we use the _secret key_ to sample a short solution $\mathbb{z}$ In simpler context, we draw $\mathbb{z}$ such that it reveals nothing about the secret key or basically the _trapdoor_, here that is $T$. For visualising the distribution, and the algorithm is supposed to sample random $\mathbb{z}$ ![](https://hackmd.io/_uploads/HyWNMgpZT.png) - Verify ($A,\mu,\mathbb{z}$), checking that $Az = H(\mu)$ and $\mathbb{z}$ is sufficiently short. The verifier redoes the computation of $H(\mu)$ and checks it's equality with $Az$. In real life we deduce the value of $H(\mu)$ using a random oracle, ideally, it's a encrypted by a collision-resistant hash function, usually sha-2 or sha-3. - Security Breach: forging a signature for a new message $\mu*$ requires finding _short_ $\mathbb{z}$ st $Az* = H(\mu*)$. This is indeed computationally hard to find. There is still indeed a chance of a security vulnerability, that is, when Alice signs the message multiple times. In that case, for the same $H(\mu)$, multiple $mathbb{z}$ values are getting leaked. Hence, for every sign try, there's a new $\mathbb{z}$, that gets generated, so we can write down the equations like: $$A\mathbb{z} = H(\mu) \ -(1)$$ and $$A\mathbb{z'} = H(\mu) \ -(2)$$ therefore, rearranging we get: $$A(\mathbb{z'} - \mathbb{z})= 0-(3)$$ This is a really harmful step here, as this signifies _giving away short vectors in the A lattice_. One of the ways to mitigate this problem is to add a salt to the hash-function, so that every time we sign we are essentially hashing a different number. ### Today's lattice-based digital signatures: Falcon [FHK + '17], Dilithium [DKL + '17] Refinements for the components of the framework: 1. Generating a "hard" lattice/trapdoor pair: The key question is that is it hard to decode within the threshold distance on lattices produced. 2. Gaussian lattice sampling Falcon uses these to get the _smallest_ vk+sig size, also with very fast verification. Dilithium uses a technique called _"Fiat Shamir with Aborts"_, a very simple signing algorithm. However, Dilithium assumes that SIS is _quantumly_ hard for solution norm $\lesssim q$ in $l_\infty$ norm, apart from just the eucludian norm. The other security concern is that Dilithium does have a security proof where the Hash function is verified that it is truly random (Random Oracle). ## Public Key-Encryption using Lattices ### Learning With Errors [Regev'05] - `Parameters`: dimension $n$, modulus $q$, **error distribution** - `Goal`: find the secret vector $s$ which is chosen at random such that $s \in \mathbb{Z_q^n}$, given many noisy inner products. $$a_1 \leftarrow \mathbb{Z_q^n}, \ b_1 \approx \ <s,a_1> mod \ q$$ $$a_2 \leftarrow \mathbb{Z_q^n}, \ b_2 \approx \ <s,a_2> mod \ q$$ <center>....</center> We can see that the secret $s$ remains the same in the inner products, but we use the $\approx$ symbol to indicate the _noisy-ness_. Therefore, if we want to further refine the equation, then a good representation shall look like: $$a_1 \leftarrow \mathbb{Z_q^n}, \ b_1 = \ <s,a_1> + e_1 \in \mathbb{Z_q}$$ $$a_2 \leftarrow \mathbb{Z_q^n}, \ b_2 = \ <s,a_2>+ e_2 \in \mathbb{Z_q}$$ <center>....</center> Where, $e_1$ and $e_2$ are picked up from a Gaussian distribution of errors. <center><img src="https://hackmd.io/_uploads/SJQifuezp.png"></img></center> here, $\sqrt n$ is the standard deviation of the error, and $q$ as we know is the prime modulus. also, the _rate_, also called the _error rate_ is denoted as: $$rate (\alpha) = \frac{\sqrt n}{q}$$ ![](https://hackmd.io/_uploads/HkGXNOeMa.png) The $A$ matrix states uniformly random $a_i$ vectors that are horizontally stacked sideways to each other. Hence, $s^tA$ is the linear combination of the $A$ with $b^t$ provided there's an extent of error $e_i$ among the inner products of these matrices. - `Decision`: to distinguish between (A, b) from uniformly independent (A, _b_) with no errors in their inner products. ### Hardness of LWE ($\frac{n}{\alpha}$)-approx _worst case_ lattice problems $\leq$ search-LWE $\leq$ decision-LWE $\leq$ crypto If we have an algorithm that can solve a search-LWE problem, then one can convert this algorithm into an algorithm that solves worst case lattice problems that has ($\frac{n}{\alpha}$) approximation factors. $\alpha$ again is the `error rate` here, which is the fraction of the cycle covered by the error here. However, ($\frac{n}{\alpha}$) states that the approximation factor that we have here kind of grows _inversely_ with $\alpha$. Hence, when the error rate narrows, the approximation factor worsens, thereby the problem becomes more relaxed. Smaller `error rate` gives us a `weaker` guarantee for the worst case lattice problem. Moreover, being unable to solve Decision-LWE with quantum also implies being unable to solve Search-LWE with quantum. Thus, coming to the last piece of the puzzle, that is most Post-Quantum Cryptosystems are built based on the fact that decision-LWEs are hard to solve by a quantum computer. A Learning with errors problem can be called a _dual_ to a Short Integer Solutions problem. We use the word *dual* because they share the same strong mathematical foundation. Let $$\mathcal{L}(A) = \{ z^t \equiv s^tA \ mod \ q\}$$ Given $A$ and $b \approx sA$, find $s$. Here is a diagrammatic representation <center><img src="https://hackmd.io/_uploads/Syyr2Fqza.png"></img></center> Here we can see that $b$ is not exactly where we want it to be, but it's close enough in the neighbourhood, within the approximation factor $\gamma$, in terms of SIS. However, in terms of LWE, we call this a **Bounded-Distance Decoding** **(BDD$_\alpha$)**, where the given target is '$\alpha$-far' from a lattice point. Our task is to find that point. The difference between LWE and BDD is that in LWE the inputs are generated from a Gaussian Distribution of points. Whereas in BDD, the lattice, the inputs as well as the targets could be anything and anywhere. Hence BDD is more like a _worst-case scenario_ for LWE. ### Regev's Theorem `Statement`: If we have an algorithm that solves BDD$_\alpha$ on an arbitrary lattice $\mathcal{L}$, then have a _quantum procedure_ then there exists an algorithm that samples '$(1/\alpha)$-short' points from a dual lattice $\mathcal{L^*}$. The length of that Gaussian is inversely related to the relative distance of $\alpha$. ## Further into Dilithium ### Some interesting features: - Recommended Paramters, Public key size 1.5 KB, signature size 2.7 KB (recommended parameters). - Design of Dilithium is based on `Fiat Shamir with Aborts` technique. - Uses singature compression algorithms, to send signatures that are 50% smaller. - Latest specs of Dilithium consists of compression of the public key, thereby making it 60% smaller. - Hardness of the underlying problem is neither based on Learning with Errors nor Short Integer Solutions problem, it's based on something that keeps interpolating between the 2, hence it's called _Module LWE_. ### Principal design patterns of TrueID-Q, based on Dilithium - Easy to implement securely as it does not involve any Gaussian sampling. - Total size of public key + signature is relatively small. Although among the other NIST PQC algorithms, Falcon is smaller. - Parameters on Dilithium are chosen very conservatively, so that future developments can take place easily. - Using _Module-LWE/SIS_ allows us to work over the same small ring for all security levels, arithmetics need to be optimized only once and for all. ### Choosing such a Ring `Strategy`: Choosing the smallest ring dimension _n_ such that it gives the main advantages of Ring LWE. `Dimensions`: 256 is sufficiently large enough to get a large set of small norm challenges. Fully splitting prime _q_ allows for NTT-based mutliplications. $$R = \mathbb{Z}_{2^{23}+2^{13}+1}[X]/(X^256 + 1)$$ ### Main Algorithm #### Key Generation $$ A \leftarrow R^{5\times4} \\ S_1 \leftarrow S_5^4, S_2 \leftarrow S_5^5 \\ t = As_1+s_2 \\ pk (A,t), sk = (A,t,s_1,s_2) $$ #### Verification $$ c' = H(High(Az - ct), M) \\ If \ ||z||_\infty>\gamma - \beta \ and \ c'=c, accept $$ #### Signing $$ y \leftarrow S_\gamma^4 \\ w = Ay \\ c = H(High(w), M) \in B_{60} \\ z = y + cs_1 \\ If \ ||z||_\infty>\gamma - \beta \ or \ ||Low(w-cs_2||_\infty > \gamma - \beta, restart \\ sig = (z,c) $$