Einführung in die Kryptographie

--- title: Einführung in die Kryptographie --- [Real Link](https://hackmd.io/BD_jVPtIQ1-w0JuWzTQAAQ) [toc] # Course Plan * Classical vs Modern Cryptography * Perfectly secret encryption * Private Key Encryption * Message Authentication Codes * Hash Functions and Applications * Practical Applications of Symmetric Crypto * Basics number theory and cryptographic assumptions * Public key management and public key cryptography * Public key encryption * Digital signatures * Outlook # Lecture 1 ## Definitions * Plaintext/Message: $m$, Plaintext Space: $M$ * Secret key: $k$, Key Space: $K$ * Ciphertext: $c$, Ciphertext Space $C$ * A **encryption scheme** is a tuple $(Gen, Enc, Dec)$, where * $Gen$: probabilistic key generation algorithm on input $1^n$ that outputs a key $k \in K$ * $Enc$: encryption algorithm which on input $k$ and message $m \in M$ outputs $c \in C$. We write $c \leftarrow Enc_k(m)$ * $Dec$: decryption algorithm returns on input key $k$ and ciphertext $c \in C$ a message $m \in M$. We write $m = Dec_k(c)$. * **Correctness**: * for every $k$ and $m$ we should have * $Dec_k(Enc_k(m))=m$ ### Wahrscheinlichkeitsformeln (Satz von Bayes) * Für zwei Ereignisse $A, B$ gilt für die bedingte Wahrscheinlichkeit $P(A|B) = \frac{P(B|A) * P(A)}{P(B)}$ * $P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{\frac{P(A \cap B)}{P(A)} * P(A)}{P(B)} = \frac{P(B|A) * P(A)}{P(B)}$ * Wenn $A_{i},\; i = 1, \dotsc, N$ eine Zerlegung der Ergebnismenge in disjunkte Ereignisse ist, gilt: $P(B)\; = \; \sum_{j=1}^N P\left(A_j\cap B\right) \; = \; {\sum_{j=1}^{N} P\left(B\mid A_j\right)* P\left(A_j\right)}$ ## Kerckhoff's principle >The cipher should remain secure even if the adversary knows the specification of the cipher, i.e., the adversary knows $(Gen, Enc, Dec)$. **Motivation:** 1. In commercial products it is unrealistic to assume that the design details remain secret (reverse-engineering!) 2. Short keys are easier to protect, generate and exchange 3. the design details can be discussed and analyzed in public ## Shift cipher * $M = C =$ words over alphabet $\{A,...,Z\} \sim \{0,...,25\}$ * Basic idea: cyclic shift of every letter of the message by $k$ positions in alphabet * $Gen \rightarrow k$ * $Enc_k(m_0,...,m_n)=(m_0 + k \bmod 26,..., m_n + k \bmod 26)$ * $Dec_k(c_0,...,c_n)=(c_0 - k \bmod 26, ..., c_n - k\bmod 26)$ * easily bruteforceable -> **key space needs to be larger** ## Substitution cipher ![](https://i.imgur.com/gRR5xHa.png) * $M = C =$ words over alphabet $\{A,...,Z\} \sim \{0,...,25\}$ * $K =$ a set of permutations over $\{0,...,25\}$ * $Enc_p(m_0,...,m_n) = (p(m_0),...,p(m_n))$ * $Dec_p(c_0,...,c_n) = (p^{-1}(c_0),...,p^{-1}(c_m))$ * Keyspace: $|K|=26! \sim 2^{88}$ (**large**) * Vulnerable to statistical patterns of occurances of certain letters in certain languages ![](https://i.imgur.com/zPi3HQ1.png) ## Historic vs modern cryptography * In the past: * lack of precise definitions * ad-hoc designs * usually insecure * nowadays: * formal definitions * systematic design * very secure * constructions with security proofs ## Security ### Principle I: Formal Definitions Two components of security definition: 1. Aversary goal: what constitutes a successful attack * Example: encryption 1. Attempt: impossible to recover secret key 2. Attempt: impossible to recover entire plaintext from ciphertext 3. Attempt: impossible to recover any character from ciphertext * Right Answer: ciphertext does not leak any new information about plaintext * Moral: Developing formal definitions can be hard 1. Threat Model: what can the adversary do and see (but no restriction on his strategy) * Example Threat Models: * Ciphertext-only Attack (COA): Adversary only observes ciphertexts (also called eavesdropping adversary) * Known-plaintext attack (KPA): Adversary learns plaintext/ciphertext pairs * Chosen-plaintext attack (CPA): Adversary learns plaintext/ciphertext pairs of its *own* choice * Chosen-ciphertext attack (CCA): As before, but adversary also can learn plaintexts for ciphertexts of its choice ### Principle II: Precise assumptions Why are precise assumptions needed? * Unconditional security mostly impossible * Validations of assumptions: if assumption is not precisely stated it cannot be studied and violated * Comparison of schemes: allows to examine if vertain assumptions are weaker or stronger than others * Required for security proof: without mathematical formalization proofs are impossible ### Principle III: Proofs of security * First attempt: Enumerate over all possible adversaries * Not possible: there are too many! * Second attempt: Base security on assumption * Assumptions hold for certain class of adversaries * scheme is secure for certain class of adversaries Provable security is about relations between **assumptions** and **security of cryptoschemes**. In many areas of computer science proofs are not essential. * i.e. instead of proving that algorithm is efficient just simualte its behaviour on "typical" inputs In cryptography this is not true * Notion of "typical adversary" makes little sense *Security proofs are only as good as the assumption and the model!* * Assumptions may turn out to be false * Security model does not match reality: Examples include side-channel attacks, CCA required but only CPA is used # Lecture 2 ## Definitions * Probability distributions over sets $M, K, C$ 1. $K$ is a random variable denoting value of key output by $Gen()$ * $\forall k \in K: Pr[K=k]$ denotes probability that Gen outputs k 1. $M$ is random variable denoting message being encrypted * $\forall m \in M: Pr[M=m]$ denotes probability that m is encrypted message 1. $C$ is random variable denoting the resulting ciphertext * $\forall c \in C: Pr[C=c]$ (probability over choice of key and choice of message) * **Perfect Security (Secrecy)** (Defintion 2.1) * An encryption scheme $P = (Gen, Enc, Dec)$ is perfectly secure(secret), if for all (a-priori knowledge) probability distributions over $M, \ \forall m \in M, c \in C$ with $Pr[C]>0$: * $Pr[M = m\ |C = c] = Pr[M = m]$ * Interpretation: c does not reveal information about m. * Alternative: Indistinguishability of ciphertexts * Informal: Probability of ciphertexts independent of message * Lemma 2.2 * An encryption scheme P is perfectly secure if and only if $\forall m,m' \in M, c \in C:$ * $Pr[Enc_k(m)=c] \ = \ Pr[Enc_k(m')=c]$ * Where probability is taken over choice of k and randomness of $Enc$. ## Experiment / Game based definition Informal: Adverssary has no advantage over guessing * With adversary $\mathcal{A}$ and and encryption sceme $\Pi$ , the Game $PrivK^{eav}_{\mathcal{A},\Pi}$: 1. $(m_0, m_1) \leftarrow \mathcal{A}$ with $m_0, m_1 \in M$; 1. $k \leftarrow Gen; \ b \leftarrow \{0,1\}$; 2. $b' \leftarrow A(Enc_k(m_b))$; 3. if $b = b'$output 1; otherwise output 0 * Remarks * $\mathcal{A}$ chooses messages $m_0, m_1$ himself * If $b=b'$ we say that $\mathcal{A}$ wins game * $\mathcal{A}$ has "trivial" success probability of $\frac{1}{2}$ by guessing * we call $Pr[PrivK^{eav}_{\mathcal{A},\Pi} = 1] - \frac{1}{2}$ advantage of $\mathcal{A}$ **Definition 2.3 (Perfectly indistinguishable):** $\Pi$ is perfectly indistinguishable if $\forall PPT\ A:Pr[PrivK_{\Pi, A}^{eav}=1]=\frac{1}{2}$ **Lemma 2.4:** $\Pi$ is perfectly secure $\Leftrightarrow$ $\Pi$ is perfectly indistinguishable. ![](https://i.imgur.com/TN5PFyQ.png) ## One-time pad and its security Construction: * Let $M = C = K = \{0,1\}^l$ * $Gen$: output $k \leftarrow \{0,1\}^l$ * $Enc$: For $m \in M$ output $Enc_k(m) = k \oplus m$ * $Dec$: for $c \in C$ output $m := Dec_k(c) = k \oplus c$ ### Theorem 2.5 (Security of one-time pad): One-time pad encryption is perfectly secure against COA attacks. (There is a proof in the notes) ### Drawbacks: * Key is as long as message * Key can only be used once: c xor c' = (m xor k) xor (m' xor k) = m xor m' ### Limitations of perfect security Key space must be as long as message space -> OTP is optimal ### Theorem 2.6 (key space) If $\Pi$ is perfectly secure encryption scheme with message space $M$ and key space $K$, then $|K| \ge |M|$. # Lecture 3 ## Defining computational secure private key encryption ### Perfect vs. computational security * Perfect security requires $|K| \ge |M|$ * Computational security: ~128 bis key length and many messages can be encrypted | Perfect security | Computational security | | ---------------------------- | -------------------------- | | Adversary computationally unbounded | Adversary computationally bounded | | No information revealed | small information revealed | **Example** Eavesdropping security game [$Priv^{eav}_{A,\Pi}$](#Experiment--Game-based-definition) We say $\Pi$ is EAV-secure, if for all "efficient" adversaries, we have: $Pr[Priv^{eav}_{A,\Pi}(n)=1]\ =\ Pr[b=b'] \le 0.5\ +$ *"very small"*. * What is "efficient"? * What is "very small"? ### The concrete approach Measure concrete running time & conrete error probability. We say $\Pi$ is $(t, \in)$-secure, if for all adversaries running in time t, we have: $Pr[b=b'] \le 0.5\ + \in$ Ideas for measuring running time: 1. *"Attacker can use at most 1000 intel i7 Processors for at most 100 years"* 2. *"Attacker can buy equipment worth 1 million euro and use it for 30 years"* $\Rightarrow$ It is hard to reason formally about such running times. #### Complexity theory *"Efficient computation"* = Polynomial-time computable by probabilistic algorithm 1. Polynomial-time computable: * Length of $x: n = |x|$ * x \[Input\] $\rightarrow$ *Algorithm* $\rightarrow$ y \[Output\] * Computes the output in $T(n) = O(n^c)$ steps (for a constant $c$) 2. Probabilistic algorithm * $(x, r) \rightarrow$ *Algorithm* $\rightarrow y$ * Access to random coins in each step * Or: Additional randomness as input * Gives the adversary more power 3. A step: for example a move on a poly-time Turing machine * A poly-time Turing-machine has multiple heads/tapes * A probabilistic Turing-machine has an additional tape with random bits 4. Is this the right approach? * Advatages: * Many models of computation (TM, RAMs, circuits,...) are "equivalent" up to a "polynomial reduction" * The formulas for running time get much simpler (we use asymptotics) * Disadvantages: * Asymptotic results don't tell us anything about security of the **conrete systems** * E.g.: Scheme may be concretely secure only for very large keys * However * Usually one can prove formally an asymptotic result and then argue informally that "the constants are reasonable" and can be calculated if one really wants. ### Negligible (Vernachlässigbar) **Definition 3.1**: A function $f:N \rightarrow R_{\geq0}$ is negligible in $n$ if for every positive polynomial $p$ there exists an integer $T$ such that for all $n > T:$ $f(n) < \frac{1}{p(n)}$. We call such a function negligible in $n:negl(n)$ #### Shorthand A function $f:N \rightarrow R_{\geq0}$ is negligible in $n$ if for every positive polynomial $p$ and all *sufficiently large* $n: f(n) < \frac{1}{p(n)}$ #### Examples | Function | Negligible? | | -------- | -------- | | $f(n) = n^{-2}$ | No, inverse poly. n^-3^ is always smaller | | $f(n) = 2^{-n}$ | Yes, for sufficient large n | | $f(n) = 2^{-\frac{n}{2}}$ | Yes, for sufficient large n | | $f(n) = n^{-1000}$ | No, n^-1001^ is always smaller| #### negl() is closed under composition Let $negl_1$ and $negl_2$ be negligible functions. Then: 1. The function defined by $negl_3(n) = negl_1(n) + negl_2(n)$ is negligible 2. For any positive polynomial $p$, the function $negl_4()$ defined by $negl_4(n)= p(n) * negl_1(n)$ is negligible If $f(n), g(n)$ are negligible functions, $q(n)$ is a positive polynom, then: * $h(n) = f(n) \cdot g(n)$ is a negligible function * $h(n) = f(n) + g(n)$ is a negligible function * $h(n) = |f(n) - g(n)|$ is a negligible function * $h(n) = q(n) \cdot g(n)$ is a negligible function ### Definition 3.3: Computational security (final) We adapt the previous attempt at defining EAV-security. We say $\Pi$ is **EAV-secure**, if for all **PPT** adversaries, there exists a negligible function **negl** such that for all $n$ we have: $Pr[Priv^{eav}_{A,\Pi}(n)=1]\ =\ Pr[b=b'] \le 0.5 + negl(n)$. ## Pseudorandom generators (PRGs) Idea: map short random seed $s$ to long string $G(s)$ that *looks random*. ![](https://i.imgur.com/SbSBxAp.png) Motivation: Output of $G$ can be used whenever uniform randomness is needed but only short random string $s$ is available. ### What does *"look random"* mean? 1. Non-crypto application: should pass some statistical tests, e.g. *rand()* function in Windows 2. Crypto application: should pass **all** polynomial-time tests Informal definition of *"look random"*: $\forall$ PPT(probabilistic polynomial time) distinguishers $D$: case 0: r $\in \{0,1\}^{l(n)}$ case 1: $G(s)$ ![](https://i.imgur.com/eAewJyw.png) Distinguisher $D$ cannot distinguish between the two cases ### Definition 3.4 (pseudorandom generator) Let $l$ be polynomial and $G: \{0,1\}^{n} \rightarrow \{0,1\}^{l(n)}$ be a deterministic algorithm. $G$ is a pseudorandom generator if the following holds: 1. *Expansion*: $\forall n: l(n) > n$ 2. *Pseudorandomness*: $\forall$ PPT distinguishers $D$, $\exists$ *negl* such that $|Pr[D(G(s))=1] - Pr[D(r)=1]| \leq negl(n)$, where $s\leftarrow \{0,1\}^{n}, r \leftarrow \{0,1\}^{l(n)}$ are chosen uniformly at random. #### Remark Output distribution of $G$ is far from uniform. Assuming $l(n)=2n$: ![](https://i.imgur.com/6rgs9dt.png) Only a fraction of $2^{-n}(=\frac{2^n}{2^{2n}})$ values are reached (only $2^n$ different outputs of $G$ exist). $\rightarrow$ $G(s)$ is not uniform over $\{0,1\}^{2n}$ $\rightarrow$ easy to distinguish for unbounded distinguisher ## Building secure encryption from PRGs Let $G$ be a PRG with expansion factor $l$. Define encryption scheme $\Pi$ for messages of length $l$ as follows: - $Gen:$ on input $1^n$, output uniform key $s \leftarrow \{0,1\}^{n}$ - $Enc:$ on input key $s \in \{0,1\}^{n}$ and message $m \in \{0,1\}^{l(n)}$ output ciphertext ```c := G(s) xor m``` - $Dec:$ on input key $s \in \{0,1\}^{n}$ and ciphertext $c\in \{0,1\}^{l(n)}$, output message ```m:=G(s) xor c``` ### Comparision to one-time pad - $\Pi$ uses $G(s)$ instead of $r \leftarrow \{0,1\}^{l(n)}$ as one-time pad - Advantage: Instead of $l(n)$ random bits we only need $n$ ### Theorem 3.5 (PRGs are EAV-secure) If $G$ is a PRG, then the above construction is **EAV-secure**. *Proof in the lecture-notes* #### What has been shown? Any adversary $A$ against $\Pi$ can be used to build a distinguisher $D$ against $G$: - Advantage: $A$ and $D$ are directly related (here: identical) - Running times of $A$ and $D$ are directly related (here: almost identical) ### Theoretical constructions of PRGs One of the most important results in private key encryption: * One way functions $\rightarrow$ PRGs ($\rightarrow$ secure encryption) ### What is a one way function (OWF)? ![](https://i.imgur.com/9prSOYm.png) More formally, for function $f:\{0,1\}^{*}\rightarrow\{0,1\}^{*}$ we define the inversion game: #### Experiment $Invert_{A,f}(n)$ 1. $x \leftarrow \{0,1\}^{n}\ ;\ \ y = f(x)$ 2. $x' \leftarrow A(1^{n}, y)$ 3. If f(x')=y output 1; otherwise output 0 #### Defintion 3.6 (one way function) A function $f:\{0,1\}^{*}\rightarrow\{0,1\}^{*}$ is one-way if the following holds: 1. *Easy to compute*: $\exists PPT$ algorithm $\mu_{f},$ such that $\forall x\in\{0,1\}^{*}: \mu_{f}(x)=f(x)$ 3. *Hard to invert*: $\forall PPT$ adversaries $A, \exists \space negl$ such that: $Pr[Invert_{A,f}(n) = 1] \le negl(n)$ ##### Example Conjectured (vermutete) one-way functions: Integer factorization of composite numbers. Let $p,q$ be primes with $|p|=|q|$, then $f(p,q)=p*q$ is believed to be one-way. #### How to encrypt with OWF? Naive Approach: $\forall$ messages $m$, encrypt $m$ as $f(m)$. Problems: * No key $\rightarrow$ Not possible to decrypt * $f(m)$ may reveal something about $m$ #### Theorem 3.7 If a OWF exists, then there exist corresponding pseudorandom generators. ## Practical PRGs = Stream ciphers The pseudorandom generators used in practice are called stream ciphers. They are called like this because their ouput is an "infinite" stream of bits. ### Formal definition A stream cipher is a pair of deterministic algorithms $(Init, GetBits)$ where: * $Init$ takes as input a seed $s$ and an optional initialization vector $IV$, and outputs an initial state $st_0$ * $GetBits$ takes as input state information $st_i$ and outputs $(y, st_{i+1})$, where $y$ can be a bit or a "block of bits" ### How to construct a stream cipher #### Old fashioned approach 1. Take a standard PRG $G$ and a hash function $H$ (will be defined later) 2. Let $st_0 = Init(IV,s):=H(IV,s)$ 3. Set $st_{i+1} = GetBits(st_i)$ #### Modern approach Design such a $G$ from scratch. ### Historical stream ciphers Based on linear feedback shift registers: | Stream cipher | Stats| | -------- | -------- | | A5/1 & A5/2 | completely broken | | Content Scramble System (CSS) | completely broken | | RC4 | has serious security weaknesses | ### RC4 * used in WEP, WPA and TLS * very efficient and simple * **has security flaws** ![](https://i.imgur.com/SCHB0pF.png) #### Problems with RC4 1. Doesn't have seperate IV 2. some bytes of the output are biased 3. First few bytes of output sometimes leak some information about the key * Recommenation: discard the first 768-3072 bytes 4. There are more weaknesses #### Usage of RC4 in WEP To get the seed, a key k is concatenated with the IV * old versions: $|k|$ = 40 bits, $|IV|$ = 24 bits (US export restrictions) * new versions: $|k|$ = 104 bits, $|IV|$ = 24 bits ### Salsa20 One of the winners of the eStream competition. Very efficient both in hardware and in software. $Key \space k, |k| = 256\ bits \\ Nonce \space r, |r| = 64\ bits \\ Salsa20(k,r) := H(k,r,0)||H(k,r,1)||...$ *All the "magic" happens in the function H* # Lecture 4 ## Stronger security notions ### Chosen plaintext attacks (CPA) Setting: Two parties Alice and Bob, adversary $A$ ![](https://i.imgur.com/KpE10Qj.png) Adversary $A$ has control over what messages are encrypted. Formally modelled: $A$ has oracle access to $Enc_{k}(\cdot)$. We write $A^{Enc_{k}(\cdot)}$. For any encryption scheme $\Pi=(Gen, Enc, Dec)$, any PPT adversary $A$, and any security parameter $n$ define the following experiment: $PrivK^{CPA}_{A,\Pi}(n)$ 1. $k \leftarrow Gen(1^n)$ 2. $(m_0, m_1) \leftarrow A^{Enc_{k}(\cdot)}(1^{n})$ with $|m_0| = |m_1|$ 3. $b \leftarrow \{0,1\}$ 4. $b' \leftarrow A^{Enc_{k}(\cdot)}(Enc_k(m_b))$ 5. If $b' = b$ output 1; else output 0 ![](https://i.imgur.com/3W8HdNo.png) ### Definition 3.8 (CPA-secure) A private key encryption scheme $\Pi=(Gen, Enc, Dec)$ is CPA-secure if $\forall \space PPT$ adversaries $A, \exists \space negl$ such that: $Pr[PrivK^{CPA}_{A,\Pi}(1^n) = 1]\leq \frac{1}{2}+negl(n)$ Where probability is taken over randomness of $A$ and of experiment. # Lecture 5 ## Definitions ### Pseudorandom functions (PRFs) Informally: "random looking" function ~ function indistinguishable from random function. Formalized definition: * Let $Func_n = \{f: \{0,1\}^n \rightarrow \{0,1\}^n\}$ be set of all n-bit to n-bit functions * Observation: $|Func_n| = 2^{n^{2^n}}$ * Why? $\rightarrow$ Each function $f \in Func_n$ can be represented by large lookup table ($2^n$ entries $=$ every combination of n bits, every function value has n bits). So each $f$ can be represented by $2^n*n$ bits. In the set we have n different functions. So $|Func_n| = 2^{n^{2^n}}$. * Following we consider functions $F:\{0,1\}^* \times \{0,1\}^* \rightarrow \{0,1\}^*$ * efficient: $\forall$ inputs $k, x \ \exists \mathcal{PPT}$ algorithm computing $F(k, x)$ * keyed: first input is a fixed key $k$ meaning: $F(k, x) = F(x)$ * length preserving: for some n we have: $F:\{0,1\}^n \times \{0,1\}^n \rightarrow \{0,1\}^n$ * Definition 3.10 (PRF): * Let $F:\{0,1\}^n \times \{0,1\}^n \rightarrow \{0,1\}^n$ be an efficient, keyed, length preserving function. $F$ is a PRF if it holds that $\forall \ \mathcal{PPT}$ distinguishers $D$, $\exists negl$ such that: * $|Pr[D^{F_k(\cdot)}(1^n) = 1] - Pr[D^{f(\cdot)}(1^n) = 1]| \le negl(n)$ * where $k \leftarrow \{0,1\}^n,\ \ f \leftarrow Func_n$ **Scenario 0: real world (D, F_k)** ```sequence D->F_k: x_i Note right of F_k: F_k is a PRF F_k->D: F_k(x_i) Note over D: D outputs 1/0 ``` --- **Scenario 1: ideal world (D, f)** ```sequence D->f: x_i Note right of f: f is a uniformly random function f->D: f(x_i) Note over D: D outputs 1/0 ``` **Example 3.11** $F_k(x) = k \oplus x$ is not a PRF since $F_k(x_0) \oplus F_k(x_1) = x_0 \oplus x_1$ ## From PRFs to CPA-secure encryption Attempt 1: $Enc_k(m) = F_k(m)$ 1. not clear how to encrypt 2. deterministic $\rightarrow$ cannot be CPA-secure ### Construction Let $F$ be a PRF. Define private key encryption scheme for messages of length $n$: - $Gen$: On input $1^n$, choose $k \leftarrow \{0,1\}^n$ and output $k$ - $Enc$: On input $k, m \in \{0,1\}^n$, sample $r \leftarrow \{0,1\}^n$ and output $(r, F_k(r) \oplus m)$ - $Dec$: On input $k \in \{0,1\}^n$ and ciphertext $c = (r, s)$ output plaintext: $m = s \oplus F_k(r)$ **Theorem 3.12** If $F$ is a PRF, then above scheme is CPA-secure. *Proof in lecture notes.* # Lecture 6 ## Definitions ### Pseudorandom permutations (PRP) Informal: "Random looking" permutation. Same as PRF but $F_k(\cdot)$ and $f(\cdot)$ have to be permutations. Formally: * $Perm_n = \{f:\{0,1\}^n \rightarrow \{0,1\}^n\ :\ {f\ is\ permutation}\}$ * $|Perm_n| = (2^n)!\ \:$(Viewing each $f \in Perm_n$ as look-up table again) * Let $F_k^{-1}(\cdot)$ be the inversion of the permutation. If $F$ is efficient then a PPT algorithm exists that computes $F_k^{-1}(y)\ \forall k,y$ * **PRP Definition**: Analogous to PRF definition, but $F_k$ respectively changed to condition that f is permutaiton **Strong PRP**: If the distinguisher additionally has access to the inversion. If $F$ is a strong PRP, then also $F^{-1}$ is a strong PRP. ![](https://i.imgur.com/7XrTCHj.png) **In Practice**: Block ciphers $\approx$ pseudorandom permutations with some fixed key length (eg. AES, DES, IDEA, ...) ## Block cipher modes We need modes of operation because without them: + messages have to be very short + the cipher would be deterministic and have no state i.e. cannot be CPA-secure ### Electronic Code Book Mode (ECB) ![](https://i.imgur.com/DS2lKTV.png) Each block gets encrypted seperately. **Not secure** since artifacts remain (see famous linux penguin ECB image). ### Cipher Block Chaining Mode (CBC) ![](https://i.imgur.com/wOjeI5c.png) ![](https://i.imgur.com/d8Gjqp0.png) Theorem: ~ If $F$ is a PRP then $F$-CBC is secure^[M. Bellare, A. Desai, E. Jokipii and P. Rogaway 1997]. ### Counter mode (CTR) We can convert a pseudorandom permutation into a pseudorandom generator. ![](https://i.imgur.com/UJcUOsN.png) This is also called CTR mode. We can use it for encryption if we xor the plaintext afterwards. ![](https://i.imgur.com/ujPrlZD.png) ### Comparison of CBC and CTR | Mode of operation | Error propagation | Encryption parallelizable? | Decryption parallelizable? | What if one bit of plaintext changes? | | -------- | -------- | -------- | -------- | -------- | | CBC | Error in block $c_i$ affects $c_i$ and $c_{i+1}$| No | Yes | Need to recompute everything* | | CRT | Error in block $c_i$ affects only $c_i$ | Yes | Yes | Need to recompute one block only | \* Not suited for e.g. disk encryption. ## Construction of block ciphers Typical requirements: * *Security:* Best attack should be brute force key search. * *Efficiency* on different platforms (microcontrollers, phones, PCs, dedicated hardware) * *Key agility:* Efficient key change possible * *Advanced security requirements:* Resistance against side-channel attacks, related key attacks, ... ### Iterated cipher design ![](https://i.imgur.com/XHCT5Eg.png) Popular because it is: * *simple*: Easy to implement and easier to check for backdoors. * *symmetric (repeating patterns)*: Smaller circuits in hardware. * Examples: * Feistel ciphers * Most popular: DES (Lucifer) * Substitution permutation networks * Most popular: AES (Rijndael) ### Substitution permutation networks Use iteration cipher design paradigm and are typically easily invertable. ![](https://i.imgur.com/LODWjkK.png) ### The Round function of AES ![](https://i.imgur.com/OcN6sce.png) --- **SubBytes** ![](https://i.imgur.com/I7ZXLHk.png) --- **ShiftRows** ![](https://i.imgur.com/aEYfNlV.png) --- **MixColumns** ![](https://i.imgur.com/SvrhmcZ.png) ## Feistel-Networks ![](https://i.imgur.com/0ahXcJ8.png) ![](https://i.imgur.com/c2GCC5K.png) # Lecture 7 ## Definitions ### Message Authentication Codes (MACs) *Definition 4.1:* A MAC consists of three PPT algorithms $(Gen, Mac, Vrfy)$: * $Gen$: On input of the security parameter $1^n$, output key $k$ with $|k| \ge n$ * $Mac$: The tag algorithm $Mac$ takes the input key $k$ and message $m \in \{0,1\}^*$ and outputs a tag $t \leftarrow Mac_k(m)$ * $V\!r\!fy$: On input key $k$, message $m$ and tag $t$ output a bit $b$, where $b=1$ means valid and $b=0$ invalid. *Canonical verification*: If $Mac$ is deterministic, the verification works as follows: * Compute $\tilde{t} = Mac_k(m)$ * Output $1$ iff $\tilde{t} = t$, otherwise $0$ #### Security definition Idea: Impossible to generate valid tag for new message. Two elements to define: *Threat model* + *successfull break*. *Threat model:* $A$ sees/controls valid pairs $(m,t)$ *Successful break:* $A$ generates _new_ valid pair $(m^*, t^*)$ Experiment $Mac\!-\!\!forge_{A,\Pi}(n)$: 1. $k \leftarrow Gen(1^n)$ 2. $(m^*, t^*) \leftarrow A^{Mac_k(\cdot)}(1^n)$ , where $Q$ denotes set of all queries asked by $A$ 3. Output 1 if (1) and (2) apply, 0 otherwise: 1. $V\!r\!fy\,_k(m^*, t^*) = 1$ 2. $m^* \notin Q$ ![](https://i.imgur.com/32e7DmZ.png) #### Another security definition from the exercise A message authentication code $\Pi = (Gen, Mac, Vrfy)$ is strongly secure or a strong MAC, if for all probabilistic polynomial-time adversaries $A$, there is a negligible function $negl$ such that: $$ Pr[Mac\!-\!s\!forge_{A, \Pi}(n) = 1] \le negl(n) $$ The experiment $Mac-sforge_{A, \Pi}$ is defined by modifying the above experiment such that adversary wins also if he outputs a new valid tag for a message queried earlier. This means the last part of the experiment is adapted such that instead of $m^* \notin Q$ we write: $(m^*,t^*) \notin Q$. ### MAC-Security (Definition 4.2) A message authentication code $\Pi = (Gen, Mac, Vrfy)$ is existentially unforgeable under adaptive chosen message attacks (or just secure) if $\forall$ PPT attackers $A, \exists negl$ such that: $Pr[Mac\!-\!\!forge_{A,\Pi}(n) = 1] \le negl(n)$ #### Is our definition too strong? *Successful break:* Any previously unauthenticated message counts. *Real words:* Only *"meaningful"* messages. Problem: *"meaningful"* is highly application dependent. #### Replay attacks Replay attacks possible even with MAC. E.g.: 1. Alice $\rightarrow$ Bank: ("10.000$ to Bob", tag) 2. Bob listened to Alices message. Sends the same message to the bank again. Fix this with __*randomness*__ or __*sequence numbers*__! ## Construction of MACs 1. information secure (perfectly secure): possible but inefficient 2. fixed length 3. arbitrary length: theoretical 4. CBC-MAC ### Fixed length MACs Idea: PRFs are fixed length MACs. Let $F$ be a PRF. Define $MAC$ as follows: * $Gen(1^n)$: output random key $k \rightarrow \{0,1\}^n$ * $Mac_k(n)$: output tag $t = F_k(m)$ * $V\!r\!fy_k(m,t)$: output 1 if $t = F_k(m)$; otherwise 0. *Idea for security:* $F$ is a random function. Guessing the correct output has probability $2^{-n}$. *Theorem 4.3:* If $F$ is a PRF, then above construction is a secure fixed-length message. [For proof see slides] ### Domain extension Problem of previous construction: It authenticates only short, fixed-length messages. To solve this we create a scheme $\Pi^*$ from $\Pi$ that works on any message length. #### How NOT to do it: * Approach 1: $Mac_k(m) = (Mac_k'(m_1), \dots, Mac_k'(m_d))$ *Not secure! Block reordering attack possible.* * Approach 2: $Mac_k(m) = (Mac_k'(1||m_1), Mac_k'(2||m_2), \dots)$ *Not secure! Truncation attack (adversary drops blocks from the end).* * Approach 3: $Mac_k(m) = (Mac_k'(l||1||m_1), Mac_k'(l||2||m_2), \dots)$ *Not secure! Mix and match attack: Having two authenticated messages of the same length we can mix.* #### How we do it finally We use a random identifier. $Mac_k(m) = (Mac_k'(r||l||1||m_1), \dots, Mac_k'(r||l||d||m_d), r)$ $V\!r\!fy_k(t,m)$: Check if all MACs are valid and use same random identifier $r$. *Theorem 4.4:* If $\Pi'$ is secure MAC for message length n, then above construction is secure MAC for arbitrary length messages. ### CBC-MAC Disadvantage of previous construction: Inefficient. Block cipher is evaluated $O(d)$ times. Tag length is $O(dn)$. Alternative construction: CBC-MAC. ![](https://i.imgur.com/GVT0uGg.png) * $Mac_k(m_1, \dots, m_d)$ is constructed according to above scheme. $F_k$ outputs a block of length $n$. * $V\!r\!fy_k(m,t)$: Recompute $\tilde{t}$ and output 1 if $t=\tilde{t}$, otherwise output 0. *Theorem 4.5 (CBC-MAC security):* Let $l$ be polynomial. If $F$ is PRF, then CBC-MAC is a secure MAC for messages of length $l(n)\cdot n$. #### CBC-MAC vs CBC mode of encryption * CBC encryption uses random IV (crucial for security). CBC-MAC uses *no IV*. * CBC encryption outputs all intermediate values (otherwise no decryption is possible). CBC-MAC outputs only the final block # Lecture 8 ## Authenticated encryption Ideally secure encryption: *secrecy + integrity*. *Secrecy:* Strong secrecy $\rightarrow$ CCA-security *Integrity:* Unforgeable encryptions ### CCA-Security CPA-security + access to a dencryption oracle CCA-indistinguishability game $PrivK^{CCA}_{A,\Pi}$: 1. $k \leftarrow Gen(1^n)$ 2. $(m_0,m_1) \leftarrow A^{Enc_k(\cdot), Dec_k(\cdot)}(1^n)$ 3. $b \leftarrow \{0,1\}$ 4. $b' \leftarrow A^{Enc_k(\cdot), Dec_k(\cdot)}(Enc_k(m_b))$ 5. If $b = b'$ output 1; otherwise output 0. After receiving the challenge ciphertext after step 3, $A$ can still ask queries to the oracle. However $A$ cannot query the oracle on the challenge ciphertext. *Definition 4.6 (CCA-security):* A private key encryption scheme $\Pi = (Gen, Enc, Dec)$ is CCA-secure, if $\forall$ PPT adversaries $A\ \exists negl$ such that: $Pr[PrivK^{CCA}_{A,\Pi}(n)=1] \le \frac{1}{2} + negl(n)$ ### Unforgeable encryptions = impossible to generate new valid ciphertexts. Experiment $Enc\!-\!\!Forge_{A,\Pi}(n)$: 1. $k \leftarrow Gen(1^n)$ 2. $c \leftarrow A^{Enc_k(\cdot)}(1^n),$ where $Q$ is the set of all messages asked to $Enc_k(\cdot)$ 3. Let $m = Dec_k(c)$. Output 1 iff * $m \neq \bot$ (m deciphered correctly) and * $m \notin Q$ *Definition 4.7 (unforgeable encryption)* $\Pi$ is unforgeable if $\forall$ PPT adversaries $A\ \exists negl$ such that: $Pr[Enc\!-\!\!Forge_{A,\Pi}(n)=1] \le negl(n)$ *Definition 4.8 (authenticated encryption)* $\Pi$ is an authenticated encryption scheme if $\Pi$ is CCA-secure and unforgeable. ### Generic constructions * CPA-secure encryption $\Pi_E=(Gen, Enc, Dec)$ (with random n-bit string $k_E$ as key) and * MAC $\Pi_M=(Gen, Enc, Vrfy)$ (with random n-bit string $k_M$ as key) #### Encrypt-and-MAC Encryption of m: $(c,t) = (Enc_{k_E}(m), Mac_{k_M}(m))$ Decryption of $(c, t)$: $m = Dec_{k_E}(c)$. Output $m$ iff $Vrfy_{k_M}(m,t) = 1$. Otherwise output $\bot$. Security: No! Tag may reveal information about message $m$ (e.g. CBC-MAC). #### MAC-then-encrypt Encryption of m: $t \leftarrow Mac_{k_M}(m)$ Output $c \leftarrow Enc_{k_E}(m||t)$ Decryption of $c$: $m||t = Dec_{k_E}(c)$. Output $m$ iff $Vrfy_{k_M}(m,t) = 1$. Otherwise output $\bot$. Security: Not generically! #### Encrypt-then-MAC Encryption of m: $c \leftarrow Enc_{k_E}(m)$ $t \leftarrow Mac_{k_M}(c)$ Output $(c,t)$. Decryption of $(c,t)$: Iff $Vrfy_{k_M}(c,t) = 1$, output $Dec_{k_E}(c)$. Otherwise output 0. Security: See Theorem 4.9. *Theorem 4.9* If $\Pi_E$ is CPA-scure and $\Pi_M$ is a strongly secure MAC, then Encrypt-then-MAC is an authenticted encryption scheme. Approch: Show encryption unforgeability + CCA-securtity. #### The need for two independent keys Some schemes can be broken when the same key is used. Consider an Encrypt-then-MAC scheme with $k=k_E=k_M$. Let $F$ be a strong PRP (which implies $F^{-1}$ is also a strong PRP). * $Enc_k(m) = F_k(m||r)$ * $Mac_k(c) = F_k^{-1}(c)$ * Encrypt-then-MAC: $Mac_k(Enc_k(m)) \\ = F_k^{-1}(F_k(m||r)) \\ = m||r$ $\Rightarrow$ Message revealed in clear! # Lecture 9 ## Hash functions . ![](https://i.imgur.com/YQjboVa.png) *Informal security property:* The digest uniquely identifies $x$ (= collision resistance. I.e. it's hard to find $(x,x')$ such that $H(x)=H(x')$). *Observation:* There always exist collisions since the domain (Definitionsbereich) is larger than range (Wertebereich): $\{0,1\}^{l(n)} > \{0,1\}^n$. ### Collision resistance First attempt at defining collision resistance. $H$ is collision resistant iff: * $H(x)$ can be computed in polyomial time * $\forall$ PPT adversaries $A$, the probablity that $A$ outputs $x$ with $x\neq x' \wedge H(x)=H(x')$ is neglectable Technichal problem: There always exists an adversary $A$ that has a collision hardcoded into its description. To Solve this we use a public seed $s$. *Definition 5.1* A hash function with output length $n$ is a pair of PPT algorithms $(Gen, H)$ defined as follows. * $Gen:$ On input $1^n$ the probabilistic algorithm outpts public seed $s$ * $H:$ On input $s$ and string $x \in \{0,1\}^{l(n)}$ it outputs $H^s(x)$ If the domain of $H$ is $\{0,1\}^*$ instead of $\{0,1\}^{l(n)}$, then $H$ is a hash function for arbitrary message length. #### Experiment $Hash\!-\!coll_{A,\Pi}(n)$ 1. $s \leftarrow Gen(1^n)$ 2. $(x, x') \leftarrow A(1^n, s)$ with $x, x' \in \{0,1\}^{l(n)}$ 3. Output 1 iff both following conditions apply, 0 otherwise * $x \neq x'$ and * $H(x) = H(x')$ #### Definition 5.2 (collision resistance CRHF) A hash function $\Pi = (Gen, H)$ is collision resistant if $\forall$ PPT adversaries $A\ \exists negl$ such that: $Pr[Hash\!-\!coll_{A,\Pi}(n)=1] \le negl(n)$ In practice: Hash functions are without seed. In that case collision resistance means $\nexists$ trivial adversary. This case is hard to define formally. #### Definition: Second Preimage Secure (weaker than collision resistance) Informally, a hash function is second preimage resistant if **given $s$ and a uniform $x$** it is infeasible for a PPT adversary to find $x′\neq x$ such that $H_s(x′) =H_s(x)$. The difference to collision resistance is that $A$ has to find an $x'$ for a given $x$, while in the collision-resistance game, he can choose both arbitrarily. # Lecture 10 ## Hash functions ### Domain extension Goal: Use a fixed length CRHF to create an arbitrary length CRHF. **Merkle-Damgard transformation:** Let $(Gen, h)$ be a fixed length hash functin for inputs of length $l(n)=n+1$ and output length $n$. We construct a hash function with length $l'(n)=poly(n)$. * $Gen:$ Stays unchanged * $H:$ On input key $s$ and string $X=(x_1, ..., x_{l'(n)})$ with $x_i \in \{0, 1\}$ ![](https://i.imgur.com/VvHUSIj.png) **Theorem 5.3:** If $(Gen, h)$ is collision resistant, then $(Gen, H)$ is collision resistant. *[Proof in the slides]* **What about variable length input?** Consider a CRHF $(Gen, h)$ with $h^s(0^{n+1})=0^n$. Then: $H^s(0)=H(0^{n+1}) \Rightarrow$ collision. Howto fix it: Append length of x to the input. ### MACs from hash functions Goal: Domain extension for MACs **Hash-and-MAC** ![](https://i.imgur.com/nMqjfWE.png) Construction of $\Pi'=(Gen', Mac', Vrfy')$ from MAC $\Pi=(Gen, Mac, Vrfy)$ for messages of length $l(n)$ and hash function $\Pi_H=(Ge_H, H)$ with output length $l(n)$. * $Gen':$ Output $k \leftarrow Gen'(1^n)$ and $s \leftarrow Gen_H(1^n)$. * $Mac':$ On input $(k, s)$ and message $m \in \{0, 1\}$ output $t \leftarrow Mac_k(H^s(m))$. * $Vrfy':$ On input $(k, s), m$ and $t$, output 1 iff $Vrfy_k(H^s(m),t)=1$. **Theorem 5.4:** If $\Pi$ is a secure MAC for messages of length $l$ and $\Pi_H$ is collision resistant, then $\Pi'$ is a secure MAC for arbitrary-length messages. *[Proof see slides]* ### Some real life hash functions * MD5, invented in 1991, collision in compression function ($h$ in Merkle-Damgard-transformation) found in 96 $\Rightarrow$ broken. * SHA-1, freestart collision in 2015, full colision in 2017 $\Rightarrow$ broken. * SHA-3, Keccak won competition in 2012. Considered secure. *Terminology*: Freestart collision: Collision where chaining value (in merkle-damgard-transform) is chosen arbitrarily ## Random oracle model Idea: model the hash function as a random oracle. $H: \{0,1\}^* \rightarrow \{0,1\}^L$ (random function) Difference to PRF: Adversary can query the oracle. As long as the adversary never queried the oracle on X the value H(X) "looks completely random to him" ### Criticism of the Random Oracle Model There exists a signature scheme that is **secure** in ROM but is **not secure** if the random oracle is replaced with any real hash function. This example is very artifical. No realisitic example of this type is known. ### Terminology Model without the random oracles: * "plain model" * "cryptographic model" Random Oracle Model is also called: the "Random Oracle Heuristic" ### One way functions in the RO $$ f^{RO(.)}(X)=RO(X) $$ One way as long as adversary has not queried RO(.) on input X so that RO(X)=Y. For T queries success probability is $\frac{T}{2^n}$. ### PRGs in the RO $$ G^{RO(.)}(X)=RO(X||1),RO(X||2) $$ where each output of RO(.) yields n bits. Only way to distinguish $G^{RO(.)}(X)$ from random is by querying RO(.) on input X. For T queries success probability is $\frac{T}{2^n}$. ### PRFs in the RO $$ F^{RO(.)}_k(X)=RO(k||X) $$ To show: $$ |Pr[A^{RO(.),F^{RO(.)}_k}(1^n) = 1] - Pr[A^{RO(.),R(.)}(1^n)=1]|\le negl(n) $$ where R(.) is random function independent of RO(.). ### Merkle tree ```graphviz digraph merkeltree { nodesep=1.0 // increases the separation between nodes node [color=Black,fontname=Courier,shape=box] //All nodes will this shape and colour edge [color=Black, style=solid] //All the lines look like this "Top Hash\nhash(Hash0+ Hash1)" [color=Red] "Hash0\nhash(Hash0-0+Hash0-1)" [color=Orange] "Hash1\nhash(Hash1-0+Hash1-1)" [color=Red] "Hash1-0\nhash(L2)" [color=Red] "Hash1-1\nhash(L3)" [color=Orange] "Top Hash\nhash(Hash0+ Hash1)"->{"Hash0\nhash(Hash0-0+Hash0-1)" "Hash1\nhash(Hash1-0+Hash1-1)"} "Hash0\nhash(Hash0-0+Hash0-1)" -> {"Hash0-0\nhash(L0)" "Hash0-1\nhash(L1)"} "Hash1\nhash(Hash1-0+Hash1-1)" -> {"Hash1-0\nhash(L2)" "Hash1-1\nhash(L3)"} "Hash0-0\nhash(L0)" -> L0 "Hash0-1\nhash(L1)" -> L1 "Hash1-0\nhash(L2)" -> L2 "Hash1-1\nhash(L3)" -> L3 } ``` #### Example To verify file L2 you need the hashes Hash0 and Hash1-1 and need to compute the hashes Hash1-0, Hash1, TopHash. ### Birthday attack There exists a generic attack on any hash function $$ H:\{0,1\}^*\rightarrow \{0,1\}^n $$ that finds a collision with probability $\frac{1}{2}$ and works in time and space $O(2^\frac{n}{2})$. Consequence: to achieve "m bits of security" one needs to set $n=2\cdot m$ For $q \le \sqrt{2N}$, $q$ denoting the "amount of people in the room to have the same birthday"/"amount of values to have the same hash value", $N$ denoting the total number of possible birthdays or hash values, the probability of a collision can be approximated by $$ Pr[coll(q, N)] > 1 − e^{−q(q−1)/2N} $$ # Lecture 11 ## Motivation **Cryptography is based on assumptions** ```graphviz digraph merkeltree { nodesep=1.0 // increases the separation between nodes node [color=Black,fontname=Courier,shape=box] //All nodes will this shape and colour edge [color=Black, style=solid] //All the lines look like this "Pseudorandom Permutations"->{"Private-Key-Encryption","Message Authentication Codes"} "Block Ciphers" ->"Pseudorandom Permutations" "Pseudorandom Generator" -> "Pseudorandom Permutations" "One Way Functions" -> "Pseudorandom Generator" "Number theoretic assumptions: RSA, Factoring, Discrete Log" -> "One Way Functions" } ``` **Number theory in cryptography - advantages:** 1. Security can be based on famous mathematical conjectures 2. More advanced constructions can be created because of "mathematical" structure 3. Constructions have a natural security parameter (-> they can be scaled) **Number theory in cryptography - disadvantages:** 1. Much less efficient (than e.g. PKE) 2. the number-theoretic "structure" may hely cryptoanalysts Number theory is a source of hard problems. Identifying them can be good for cryptography. ## Recap number theory ### General notation * for $a,b \in \mathbb{Z}: a|b$ means a divides b * greatest common divisor $gcd(a,b)$ largest integer $c$ such that $c|a$ and $c|b$. * $p$ prime if ${1,p}|p$ ### Modular arithmetic * $a,b,N \in \mathbb{Z}$ with $N>1:$ [a mod N]remainder of $a$ upon division by N, i.e. $r:=[a \space mod\space N]$ with $a:=q*N+r$ where $0 \le r\le N$ * $a=b\ mod\ N: [a\ mod\ N] = [b\ mod\ N]$ * $b, N \in \mathbb{Z}$ with $b\ge1$, $N\ge1$ then b has multiplicative inverse iff $gcd(b,N)=1$ ### Groups A group is a set of G along with a binary operation $\circ$ for which following holds: * Closure: $\forall g,h \in G, g \circ h \in G$ * Existence of identity: $\exists\ identity\ e \in G\ s.t. \forall\ g \in G: e \circ g = g = g \circ e$ * Existence of inverse: $\forall \ g \in \ G, \ \exists \ h\ \in \ G\ s.t\ g \circ h=e=h \circ g$ * Associativity: $\forall g_1,g_2,g_3 \in G: (g_1 \circ g_2) \circ g_3 = g_1 \circ (g_2 \circ g_3)$ $G$ is called abelian if further: * Commutativity: $\forall g,h \in G: g \circ h = h \circ g$ If $G$ has a finite number of elements, we say $G$ is finite and let $|G|$ denote the order of $G$ (=#elements in $G$). Notation: $\circ$ typically stands for $"+"$ or $"*"$. ### Subgroups A group $H$ is a subgroup of $G$ if: * $H \subseteq G$ * group operation of H is same as in G **Examples:** a) $\mathbb{Z}$ is abelian group under addition, but not under multiplication. (e.g. 2 has no inverse) b) $G=\{0,...,N-1\}$ with addition $\mod N$ is abelian group of order $N$. ### Group exponentiation * Idea: group operation is applied m times, where m is integer * additive notation: $m \cdot g = \underbrace{g\ +\ ...\ +\ g}_{m\ times}$ * multiplicative notation: $g^m=\underbrace{g\ \circ\ ...\ \circ\ g}_{m\ times}$ **Theorem 6.1** Let $G$ be a finite group with $m=|G|$. Then $$ \forall g \in G:g^m=1 $$ **Fact 6.2** Let $G$ be a finite group with $m=|G|, m > 1$. Then $\forall g \in G$ and any integer $x$ we have: $$ g^x=g^{[x \mod N]} $$ ### The group $\mathbb{Z}_n^*$ $\mathbb{Z}_N^* = \{b \in \{1,...,N-1\}\ |\ gcd(b,N)=1\}$ with group operation "multiplication modulo N" is a group. Restriction: $gcd(b,N)=1$ needed s.t. $b$ has multiplicative inverse modulo N. **Fact 6.3** Let $N > 1$ be an integer. Then $\mathbb{Z}_N^*$ is abelian multiplicative group. The order of $\mathbb{Z}_N^*$ is $\Phi(N) = |\mathbb{Z}_N^*|$. **Examples:** * $N=p$ prime: $\forall b \in \{1,...,p-1\} gcd(b,N)=1 \Rightarrow \Phi(N)=p-1$ * $N=p \cdot q$ for $p,q$ distinct primes: $\Phi(N)=(p-1) \cdot (q-1)$ ## Factoring and RSA assumption Idea: Given $N=p\cdot q$ where $p,q$ are large distinct primes it is hard to factorze $N$ i.e. to find $p,q$. ### Experiment: $Factor_A(n)$: * Sample random n-bit primes $p,q$ and set $N =p \cdot q$ * Adversary $A$ is given $N$ and outputs $p', q' >1$ * Output 1 if $p' \cdot q' = N$; otherwise output $0$. **Remark:** Generating random primes (howto) * generate random $n$-bit number $x$ * Run Miller-Rabin test to check if $x$ is prime **Assumption 6.4** The factoring assumption says that $\forall$ PPT adversaries $A,\ \exists negl. s.t.:$ $$ Pr[Factor_A(n)=1] \le negl(n) $$ In this case we say that factoring is hard. *Factoring gives directly OWF but not practical PKC* ### RSA assumption Define $f_e(x)=x^e \mod N$. If $gcd(e,\Phi(N))=1,$ then $f_e(\cdot)$ is a permutation and $f_d(\cdot)$ with $d=e^{-1} \mod \Phi(N)$ is its inverse, then the *RSA assumption* holds. The assumption states: Computing the inverse of $x^e$ is hard. **How $(N, e,d)\rightarrow GenRSA(1^n)$ works** * Sample random n-bit primes $p,q$ and set $N = p \cdot q$ * Choose $e > 1$ s.t.: $\ gcd(e,\Phi(N)) = 1$ * Compute $d = e^{-1} \mod \Phi(N)$ * Output $(N,e,d)$ ### Experiment $RSA\!-\!inv_A(n)$ 1. $(N, e, d) \leftarrow GenRSA(1^n)$ 2. Choose uniform $y \leftarrow \mathbb{Z}_N^*$ 3. $x \leftarrow A(N, e, y)$ 4. Output $1$ if $x^e = y \mod N$; otherwise $0$ **Assumption 6.5** The RSA-Assumption says that $\forall$ PPT adversaries $A\ \exists negl$ s.t.: $$Pr[RSA\!-\!inv_A(n)=1] \le negl(n) $$ **Lemma 6.6** If factoring is easy, then RSA-assumption is easy. *[Proof see exercise]* ## Crypto assumptions in cyclic groups ### Cyclic groups **Definition 6.7** Let G be a finite group and $g \in G$. The order of $g$ is the smallest integer $i \ge 1$ s.t. $g^i = 1$. **Remarks** * Let $<\!g\!>=\{g^0, \dots, g^{i-1}\}$ be a subgroup of $G$, generated by g. * If $\exists g \in G$ s.t. order of $g$ is $m=|G|$, then $<\!g\!> = G$. In this case $G$ is a cyclic group and $g$ is its generator. * *Example:* Additive group $\mathbb{Z}_N$ for $N>1$ is cyclic with generator 1. **Theorem 6.8** If the order of group $G$ is prime, then $G$ is cyclic and all elements except 1 are generators. ### Discrete logarithm / Diffie-Hellman assumptions Group generation algorithm $Gen(1^n)$ ouputs: * cyclic group G (description of group specifies how elements are represented as bit strings) * $q = |G|$ with bit length n * generator $g \in G$ **Discrete logarithm** Let $g, h \in G$ with $<\!g\!>\, =G$. The discrete log of $h$ wrt. $g$ is the unique element $x \in \mathbb{Z}_q$ s.t. $g^x=h$. We write $x=log_g(h)$. **Experiment $DLog_{A, Gen}(n)$** 1. $(G, q, g) \leftarrow Gen(n)$ 2. Choose $h \leftarrow G$ uniform 3. $x \leftarrow A(G, q, g, h)$ 4. Output 1 if $g^x=h$; otherwise output 0 # Lecture 12 ### **Assumption 6.9 (DLog)** The Dlog assumption says that $\forall$ PPT adversaries $A$ $\exists negl$ s.t. $$ Pr[DLog_{A, Gen}(n)=1] \le negl(n) $$ ## Diffie-Hellman problems Two variants exist: 1. Computational Diffie-Hellman assumption (**CDH**) 2. Decisional Diffie-Hellamn assumption (**DDH**) Let $(G,q,g)\leftarrow Gen(1^n)$. ### CDH Choose $x,y \leftarrow \mathbb{Z}_q$ uniformly and let $r = g^x, s = g^y$. CDH is hard if, given $((G,q,g), r, s)$, it is hard to compute $g^{xy}$. **Remark:** DLog is easy $\Rightarrow$ CDH is easy. ### DDH DDH assumes it is hard to distinguish: * $((G,q,g),g^x,g^y,g^{xy})$ and * $((G, q, g), g^x, g^y, z)$ where $z$ is uniform in G. Formally: **Assumption 6.10 (DDH)** The DDH assumption states that $\forall$ PPT adversaries $A, \exists negl$ s.t.: $$ |Pr[A(G,q,g,g^x, g^y, g^{x\cdot y})=1] - Pr[A(G,q,g,g^x, g^y, z)=1]| \le negl(n) $$ where $(G, q, g) \leftarrow Gen(1^n)$ and $x, y, z \leftarrow \mathbb{Z}_q$ are uniformly chosen. **Remark:** CDH is easy $\Rightarrow$ DDH is easy ### Examples of groups where DLog and CDH/DDH are hard In general: Prime order groups (because $DLog/DH$ assumption is hardest in those groups). #### Example 1 Prime order subgroup $G$ of $Z^*_p$ where $p = r \cdot q +1$ and $|G| = q$. Let: - $l$ be the bit length of $p$ - $n$ be the bit length of $q$ **NIST recommendation (2016)** 1. Minimum strength: 128bits (until 2030 and beyond): $n=256, l=3072$ 3. 256 bits security (until 2030 and beyond): $n=512, l = 15360$ *256 bits of security = breaking a hash function with output length 256 bit* ***but:*** large $l$ means less efficient operations #### Exampe 2: Elliptic curves Groups consist of points over elliptic curve **NIST recommendation (2016)** 1. 128bit security: 256bit size group 2. 256bit security: 512bit size group ## Key Agreement and Intro to Public Key Crypto #### Motivation We use a $(pk, sk)$ pair to reduce the amount of keys that need to be stored. ($n$ key-pairs instead of $n(n-1)$ keys) ### DH key exchange ![](https://i.imgur.com/fR5BYwf.png) *public, authenticated*: Only passive attackers. MITM is excluded. **Correctness:** $k = k_A = k_B$ **Security:** "k is computationally indistinguishable from a random key given transcript trans of $\Pi$" Experiment $KE^{eav}_{A,\Pi}(n):$ 1. Parties run protocol $\Pi$ resulting into transcript $trans$ and key $k$ 2. $b \leftarrow \{0,1\}$ If $b = 0$, set $\hat{k} = k$, otherwise $\hat{k} \leftarrow \{0,1\}^n$ 3. $b' \leftarrow A(trans, \hat{k})$ 4. If $b = b'$ output 1; otherwise output 0 **Definition 7.1** A key agreeement protocol $\Pi$ is secure if $\forall$ PPT adversaries $A, \exists negl$ s.t.: $$Pr[KE^{eav}_{A,\Pi}(n)=1] \le \frac{1}{2} + negl(n). $$ # Lecture 13: Key agreement ## DH - key agreement protocol Group generation: $(G, q, g) \leftarrow Gen(1^n)$ | Alice $(G, q, g)$ | | Bob $(G, q, g)$ | | ----------------------------------------- | ------------------- | ------------------------------------------- | | $1)$ $x \leftarrow \mathbb{Z}_q, h_A=g^x$ | $\xrightarrow{h_A}$ | | | | | $2)$ $y \leftarrow \mathbb{Z}_q, h_B = g^y$ | | | $\xleftarrow{h_B}$ | | | $3)$ Output $k_A=(h_B)^x$ | | $3)$ Output $k_B= (h_A)^y$ | **Correctness:** $k_A=(h_B)^x = g^{x\cdot y} = (h_A)^y = k_B$ **Which assumption is needed for security?** 1. DLog: yes, otherwise it is possible to compute the key 2. CDH: yes, otherwise easy to compute key 3. DDH: yes, sufficient for security **Theorem 7.2** If DDH holds, then the DH-key agreement protocol is secure. > :page_facing_up: for proof see notes *Remark:* DH-Key Agreement is not secure against MITM attacks. ## Public Key Encryption ### Setting and Motivation $m \rightarrow \overset{\overset{PK}{\downarrow}}{Enc()} \rightarrow c \rightarrow \overset{\overset{SK}{\downarrow}}{Dec()} \rightarrow m$ **Advantages of public key encryption:** * key for encryption can be public * one SK needed to securely communicate with many parties. **Disadvantage:** Much slower. ### Definitions **Definition 8.1:** A public key encryption scheme (PKE) is a triple of PPT algorithms $(Gen, Enc, Dec)$ s.t.: 1. $Gen(1^n)$: probabilistic algorithm outputting $(pk, sk)$ 2. $Enc(pk, m)$: probabilisitic algorithm outputting $c \leftarrow Enc(pk,m)$ 3. $Dec(sk, c)$: deterministic algorithm outputting $m = Dec(sk, c)$ **Correctness:** Except with negl. probability over choice of randomness for key generation and encryption, we have $\forall m \in M: Dec(sk, Enc(pk,m)) = m$ with $(pk, sk) \leftarrow Gen(1^n)$. **Security:** Even given the public key $pk$, the adversary cannot distinguish encryption of $m_0$ from encryption of $m_1$. Given PKE scheme $\Pi$ and adversary $A$ we define: **Experiment $PubK_{A,\Pi}^{CPA}(n)$:** 1. $(pk, sk) \leftarrow Gen(1^n)$ 2. $(m_0, m_1) \leftarrow A(pk)$ with $|m_0| = |m_1|$ 3. $b \leftarrow \{0,1\}, c \leftarrow Enc(pk, m_b)$ 4. $b' \leftarrow A(c)$ 5. If $b=b'$ output 1; otherwise output $0$. The public key is given to $A$ in the experiment, he chooses $m_0, m_1$ depending on them. Also $A$ gets the encryption oracle "for free". **Definition 8.2:** A PKE scheme $\Pi$ is CPA-secure if $\forall$ PPT adversaries $A\ \exists negl$ s.t.: $$Pr[PubK_{A,\Pi}^{CPA}(n)=1] \le \frac{1}{2} + negl(n) $$ **Theorem 8.3** No deterministic PKE is CPA secure. > Why? Encrypt $m_0, m_1$ and compare which one equals $c$. ### El Gamal encryption scheme based on DDH El Gamal encryption is defined as follows: 1. $Gen(1^n)$ * $(G, q, g) \leftarrow GroupGen(1^n)$ * Choose uniform $x \leftarrow \mathbb{Z}_q$ and set $h = g^x$ * Output $(pk, sk)$ where $sk = (G, q, g, x),\ pk = (G, q, g, h)$ *(x is the secret)* 2. $Enc(pk, m)$ where $m \in G$ * Choose $y \leftarrow \mathbb{Z}_q$ * Output ciphertext $c = (c_1, c_2) = (g^y, h^y \cdot m)$ 5. $Dec(sk, c)$ * Output $\tilde{m} = \frac{c_2}{(c_1)^x}$ **Correctness:** Let $(c_1, c_2) = (g^y, h^y \cdot m)$ with $h=g^x$. Then: $\frac{c_2}{c_1^x}= \frac{(g^x)^y \cdot m}{(g^y)^x} = m$. **Security** Basic idea: 1. $(g^x, g^y, h^y \cdot m)$ computationally indistinguishable from $(g^x, g^y, g^z \cdot m)$ for $z \leftarrow \mathbb{Z}_q$ (DDH) 2. $g^z$ is a "one-time pad" for $m$ (since $G$ is a group) # Lecture 14 **Theorem 8.4** If DDH assumption holds, then ElGamal is CPA-secure. _Proof idea [Full proof see slides]:_ Define encryption scheme $\widetilde{\Pi}$, where $Enc(pk, m)$ works as: - choose uniformly $y,z \leftarrow \mathbb{Z}q$ - output $(c_1, c_2) = (g^y, g^{z}\cdot m)$ Construct a distinguisher that cannot distinguish between playing $PubK_{A,\widetilde{\Pi}}^{CPA}$ and $PubK_{A, \Pi}^{CPA}$. Since DDH assumption holds, the difference beteen the two cases is negligible: $$ negl(n)\ge |Pr[D(G,q,g,g^x,g^y,g^z)=1] - Pr[D(G,q,g,g^x,g^y,g^{xy})=1]| \\ = |\frac{1}{2} - Pr[PubK^{CPA}_{A,\Pi}(n)=1]| \\ \Rightarrow Pr[PubK^{CPA}_{A,\Pi}(n)=1] \le \frac{1}{2} + negl(n) $$ ## RSA encryption Recall: $GenRS\!A(1^n) \rightarrow (N,e,d)$ s.t.: - $N = p \cdot q$ with $p,q$ prime - $e > 1$ s.t. $\gcd(e,\Phi(N))=1$ - $d=e^{-1} \mod \Phi(N)$ Basic idea of _"Plain RSA"_: - If you know $d$, you can recover $m$ from $c=m^e \mod N$ - without knowledge of $d$ RSA assumption implies computing $m$ from $c$ is hard. ### Plain RSA construction - $Gen(1^n): (N,e,d) \leftarrow GenRS\!A(1^n)$. Output $pk=(N,e),\ sk=(N,d)$ - $Enc(pk, m):$ Output $c=m^e \mod N$ - $Dec(sk, c):$ Output $m=c^d \mod N$ **Is Plain RSA CPA-secure?** $\Rightarrow$ *No.* 1. RSA-assumption only holds for "random m", but: In CPA game adversary chooses messages. 2. Many further attacks known against Plain RSA. _Solution:_ Use Padding scheme. **RSA PKCS \#1 v1.5** Let $pk=(N,e)$ and $k$ be byte length of $N$ Encryption of $D$-byte long message $m$ as $$[(\texttt{0x00}||\texttt{0x02}||\underbrace{r}_{random}||\texttt{0x00}||m)^l \mod N], $$ where $r$ randomly generated $(k-D-3)$-byte string with none of its bytes equal to $\texttt{0x00}$. For too short $r$: Bruteforce attack! For $r$ approx. half of length of $N$: Conjectured to be secure. *Alternative used in practice: RSA-OAEP.* ## Hybrid encryption ### Disadvatages of PKE: * slower due to PK operations * larger ciphertexts (due to expansion) To mitigate disadvantages, use Hybrid encryption: Combine PKE with SKE. ![](https://i.imgur.com/WML4PBL.png) Hybrid encrytion $\Pi=(Gen, Enc, Dec)$: * $Gen(1^n)$: Output $(pK, sK)\leftarrow P\!K\!E.Gen(1^n)$ * $Enc(pK, m)$: 1. $k \leftarrow S\!K\!E.Gen(1^n)$ 2. Output $(c, c')=(P\!K\!E.Enc(pK,k), S\!K\!E.Enc(k, m))$ * $Dec(sK, c)$: 1. Compute $k \leftarrow P\!K\!E.Dec(sk,c)$ 2. Output $m = S\!K\!E.Dec(k, c')$ ### Theorem 8.5 If PKE and SKE are CPA-secure, then \Pi is CPA-secure. # Recap **Not relevant:** Erweiterter euklid ## Introduction - [x] Classical ciphers - [x] Principles of modern crypto ## Perfect security - [x] Definition of perfect security (and some variants) - [x] one-time pad (and security proof) - [x] theorem about keyspace ## Computationally secure encryption - [x] negl and PPT adversary - [ ] security definitions (EAV, CPA, CCA) - [x] PRG - [x] PRG $\Rightarrow$ EAV secure encryption - [x] OWF Definition and cominations of OWFs - [ ] PRF/PRP and differences - [ ] PRF $\Rightarrow$ CPA-secure encryption - [ ] Feistel network - [ ] AES and other practial construction (dont need to know how AES works, only that is extists and what it is) - [ ] modes of operation: CBC, CTR, ECB | padding ## Message authentication (MAC) - [ ] Defintion (strong) - [ ] Fixed length MACs - [ ] MAC domain extension - [ ] authenticated encryption ## Hash functions - [ ] Definitions - [ ] Domain extension (Merkle-Damgard-Construction) - [ ] CRHF $\Rightarrow$ MAC - [ ] Random Oracle model (know that it exists, no proofs) - [ ] Merkle tree construction (not the proof, only how it works is important) - [ ] Birthday paradox ## Number theory and assumptions - [ ] know basic facts - [ ] foctoring + RSA (relation) - [ ] discrete log + DH assumption (relation) ## Key agreement - [ ] Definition + setting - [ ] DH key exchange protocol - [ ] Man-in-the-middle attacks ## Public Key Encryption - [ ] SKE vs PKE - [ ] CPA-security (no difference to eavesdropping security for PKE) - [ ] El Gamal - [ ] RSA - [ ] Hybrid encryption ## Nice graphic that explains connection of different schemes ![](https://i.imgur.com/ufkcPED.png) ###### tags: `2018` `Kryptographie` `Zusammenfassung` `TU Darmstadt` `2019` `recap` _Proud authors of this recap: B.M. and N.S._