---
title: Einführung in die Kryptographie
---
[Real Link](https://hackmd.io/BD_jVPtIQ1-w0JuWzTQAAQ)
[toc]
# Course Plan
* Classical vs Modern Cryptography
* Perfectly secret encryption
* Private Key Encryption
* Message Authentication Codes
* Hash Functions and Applications
* Practical Applications of Symmetric Crypto
* Basics of number theory and cryptographic assumptions
* Public key management and public key cryptography
* Public key encryption
* Digital signatures
* Outlook
# Lecture 1
## Definitions
* Plaintext/Message: $m$, Plaintext Space: $M$
* Secret key: $k$, Key Space: $K$
* Ciphertext: $c$, Ciphertext Space $C$
* An **encryption scheme** is a tuple $(Gen, Enc, Dec)$, where
* $Gen$: probabilistic key generation algorithm on input $1^n$ that outputs a key $k \in K$
* $Enc$: encryption algorithm which on input $k$ and message $m \in M$ outputs $c \in C$. We write $c \leftarrow Enc_k(m)$
* $Dec$: decryption algorithm returns on input key $k$ and ciphertext $c \in C$ a message $m \in M$. We write $m = Dec_k(c)$.
* **Correctness**:
* for every $k$ and $m$ we should have
* $Dec_k(Enc_k(m))=m$
### Probability formulas (Bayes' theorem)
* For two events $A, B$ the conditional probability satisfies $P(A|B) = \frac{P(B|A) * P(A)}{P(B)}$
* $P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{\frac{P(A \cap B)}{P(A)} * P(A)}{P(B)} = \frac{P(B|A) * P(A)}{P(B)}$
* If $A_{i},\; i = 1, \dotsc, N$ is a partition of the sample space into disjoint events, then: $P(B)\; = \; \sum_{j=1}^N P\left(A_j\cap B\right) \; = \; \sum_{j=1}^{N} P\left(B\mid A_j\right) * P\left(A_j\right)$
## Kerckhoffs' principle
>The cipher should remain secure even if the adversary knows the specification of the cipher, i.e., the adversary knows $(Gen, Enc, Dec)$.
**Motivation:**
1. In commercial products it is unrealistic to assume that the design details remain secret (reverse-engineering!)
2. Short keys are easier to protect, generate and exchange
3. the design details can be discussed and analyzed in public
## Shift cipher
* $M = C =$ words over alphabet $\{A,...,Z\} \sim \{0,...,25\}$
* Basic idea: cyclic shift of every letter of the message by $k$ positions in alphabet
* $Gen \rightarrow k$
* $Enc_k(m_0,...,m_n)=(m_0 + k \bmod 26,..., m_n + k \bmod 26)$
* $Dec_k(c_0,...,c_n)=(c_0 - k \bmod 26, ..., c_n - k\bmod 26)$
* easily brute-forceable (only 26 keys to try, see the sketch below) $\rightarrow$ **key space needs to be larger**
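A minimal Python sketch of the shift cipher and the brute-force attack over all 26 keys (function and variable names are illustrative, not from the lecture):
```python
import random

ALPHABET_SIZE = 26  # {A,...,Z} ~ {0,...,25}

def gen():
    # Gen: output a uniformly random key k in {0,...,25}
    return random.randrange(ALPHABET_SIZE)

def enc(k, m):
    # Enc_k(m_0,...,m_n) = (m_0 + k mod 26, ..., m_n + k mod 26)
    return [(mi + k) % ALPHABET_SIZE for mi in m]

def dec(k, c):
    # Dec_k(c_0,...,c_n) = (c_0 - k mod 26, ..., c_n - k mod 26)
    return [(ci - k) % ALPHABET_SIZE for ci in c]

def to_numbers(text):
    return [ord(ch) - ord('A') for ch in text]

def to_text(nums):
    return ''.join(chr(x + ord('A')) for x in nums)

if __name__ == "__main__":
    k = gen()
    m = to_numbers("ATTACKATDAWN")
    c = enc(k, m)
    assert dec(k, c) == m  # correctness: Dec_k(Enc_k(m)) = m
    # Brute force: only 26 candidate keys, so an eavesdropper simply tries them all.
    candidates = [to_text(dec(guess, c)) for guess in range(ALPHABET_SIZE)]
    assert "ATTACKATDAWN" in candidates
```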
## Substitution cipher

* $M = C =$ words over alphabet $\{A,...,Z\} \sim \{0,...,25\}$
* $K =$ a set of permutations over $\{0,...,25\}$
* $Enc_p(m_0,...,m_n) = (p(m_0),...,p(m_n))$
* $Dec_p(c_0,...,c_n) = (p^{-1}(c_0),...,p^{-1}(c_n))$
* Keyspace: $|K|=26! \sim 2^{88}$ (**large**)
* Vulnerable to statistical patterns of occurrences of certain letters in certain languages

## Historic vs modern cryptography
* In the past:
* lack of precise definitions
* ad-hoc designs
* usually insecure
* nowadays:
* formal definitions
* systematic design
* very secure
* constructions with security proofs
## Security
### Principle I: Formal Definitions
Two components of security definition:
1. Adversary goal: what constitutes a successful attack
* Example: encryption
1. Attempt: impossible to recover secret key
2. Attempt: impossible to recover entire plaintext from ciphertext
3. Attempt: impossible to recover any character from ciphertext
* Right Answer: ciphertext does not leak any new information about plaintext
* Moral: Developing formal definitions can be hard
1. Threat Model: what can the adversary do and see (but no restriction on his strategy)
* Example Threat Models:
* Ciphertext-only Attack (COA): Adversary only observes ciphertexts (also called eavesdropping adversary)
* Known-plaintext attack (KPA): Adversary learns plaintext/ciphertext pairs
* Chosen-plaintext attack (CPA): Adversary learns plaintext/ciphertext pairs of its *own* choice
* Chosen-ciphertext attack (CCA): As before, but adversary also can learn plaintexts for ciphertexts of its choice
### Principle II: Precise assumptions
Why are precise assumptions needed?
* Unconditional security mostly impossible
* Validation of assumptions: if an assumption is not precisely stated, it cannot be studied and potentially refuted
* Comparison of schemes: allows one to examine whether certain assumptions are weaker or stronger than others
* Required for security proof: without mathematical formalization proofs are impossible
### Principle III: Proofs of security
* First attempt: Enumerate over all possible adversaries
* Not possible: there are too many!
* Second attempt: Base security on assumption
* Assumptions hold for certain class of adversaries
* scheme is secure for certain class of adversaries
Provable security is about relations between **assumptions** and **security of cryptoschemes**.
In many areas of computer science proofs are not essential.
* e.g. instead of proving that an algorithm is efficient, just simulate its behaviour on "typical" inputs
In cryptography this is not true
* Notion of "typical adversary" makes little sense
*Security proofs are only as good as the assumption and the model!*
* Assumptions may turn out to be false
* Security model does not match reality: Examples include side-channel attacks, CCA required but only CPA is used
# Lecture 2
## Definitions
* Probability distributions over sets $M, K, C$
1. $K$ is a random variable denoting value of key output by $Gen()$
* $\forall k \in K: Pr[K=k]$ denotes probability that Gen outputs k
1. $M$ is random variable denoting message being encrypted
* $\forall m \in M: Pr[M=m]$ denotes probability that m is encrypted message
1. $C$ is random variable denoting the resulting ciphertext
* $\forall c \in C: Pr[C=c]$ (probability over choice of key and choice of message)
* **Perfect Security (Secrecy)** (Defintion 2.1)
* An encryption scheme $P = (Gen, Enc, Dec)$ is perfectly secure (secret) if for all (a-priori knowledge) probability distributions over $M$, $\forall m \in M, c \in C$ with $Pr[C=c]>0$:
* $Pr[M = m\ |C = c] = Pr[M = m]$
* Interpretation: c does not reveal information about m.
* Alternative: Indistinguishability of ciphertexts
* Informal: Probability of ciphertexts independent of message
* Lemma 2.2
* An encryption scheme P is perfectly secure if and only if $\forall m,m' \in M, c \in C:$
* $Pr[Enc_k(m)=c] \ = \ Pr[Enc_k(m')=c]$
* Where probability is taken over choice of k and randomness of $Enc$.
## Experiment / Game based definition
Informal: Adversary has no advantage over guessing
* With adversary $\mathcal{A}$ and an encryption scheme $\Pi$, the game $PrivK^{eav}_{\mathcal{A},\Pi}$:
1. $(m_0, m_1) \leftarrow \mathcal{A}$ with $m_0, m_1 \in M$;
1. $k \leftarrow Gen; \ b \leftarrow \{0,1\}$;
2. $b' \leftarrow A(Enc_k(m_b))$;
3. if $b = b'$ output 1; otherwise output 0
* Remarks
* $\mathcal{A}$ chooses messages $m_0, m_1$ himself
* If $b=b'$ we say that $\mathcal{A}$ wins game
* $\mathcal{A}$ has "trivial" success probability of $\frac{1}{2}$ by guessing
* we call $Pr[PrivK^{eav}_{\mathcal{A},\Pi} = 1] - \frac{1}{2}$ advantage of $\mathcal{A}$
**Definition 2.3 (Perfectly indistinguishable):**
$\Pi$ is perfectly indistinguishable if $\forall$ (unbounded) adversaries $A:Pr[PrivK_{\Pi, A}^{eav}=1]=\frac{1}{2}$
**Lemma 2.4:**
$\Pi$ is perfectly secure $\Leftrightarrow$ $\Pi$ is perfectly indistinguishable.

## One-time pad and its security
Construction:
* Let $M = C = K = \{0,1\}^l$
* $Gen$: output $k \leftarrow \{0,1\}^l$
* $Enc$: For $m \in M$ output $Enc_k(m) = k \oplus m$
* $Dec$: for $c \in C$ output $m := Dec_k(c) = k \oplus c$
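A short Python sketch of the one-time pad (over bytes instead of single bits, purely for convenience; names are illustrative). The last line already hints at the key-reuse problem discussed below:
```python
import os

def gen(l):
    # Gen: output k <- {0,1}^l (l given in bytes here)
    return os.urandom(l)

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def enc(k, m):
    # Enc_k(m) = k XOR m
    return xor(k, m)

def dec(k, c):
    # Dec_k(c) = k XOR c
    return xor(k, c)

if __name__ == "__main__":
    m, m2 = b"ATTACK AT DAWN!!", b"RETREAT AT NOON!"
    k = gen(len(m))
    c, c2 = enc(k, m), enc(k, m2)      # (insecure) key reuse: "two-time pad"
    assert dec(k, c) == m              # correctness
    assert xor(c, c2) == xor(m, m2)    # c XOR c' = m XOR m' -> leaks information
```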
### Theorem 2.5 (Security of one-time pad):
One-time pad encryption is perfectly secure against COA attacks.
(There is a proof in the notes)
### Drawbacks:
* Key is as long as message
* Key can only be used once: c xor c' = (m xor k) xor (m' xor k) = m xor m'
### Limitations of perfect security
Key space must be as long as message space
-> OTP is optimal
### Theorem 2.6 (key space)
If $\Pi$ is perfectly secure encryption scheme with message space $M$ and key space $K$, then $|K| \ge |M|$.
# Lecture 3
## Defining computational secure private key encryption
### Perfect vs. computational security
* Perfect security requires $|K| \ge |M|$
* Computational security: ~128-bit key length and many messages can be encrypted
| Perfect security | Computational security |
| ---------------------------- | -------------------------- |
| Adversary computationally unbounded | Adversary computationally bounded |
| No information revealed | small information revealed |
**Example**
Eavesdropping security game [$Priv^{eav}_{A,\Pi}$](#Experiment--Game-based-definition)
We say $\Pi$ is EAV-secure, if for all "efficient" adversaries, we have: $Pr[Priv^{eav}_{A,\Pi}(n)=1]\ =\ Pr[b=b'] \le 0.5\ +$ *"very small"*.
* What is "efficient"?
* What is "very small"?
### The concrete approach
Measure concrete running time & concrete error probability.
We say $\Pi$ is $(t, \epsilon)$-secure, if for all adversaries running in time $t$, we have: $Pr[b=b'] \le 0.5 + \epsilon$
Ideas for measuring running time:
1. *"Attacker can use at most 1000 intel i7 Processors for at most 100 years"*
2. *"Attacker can buy equipment worth 1 million euro and use it for 30 years"*
$\Rightarrow$ It is hard to reason formally about such running times.
#### Complexity theory
*"Efficient computation"* = Polynomial-time computable by probabilistic algorithm
1. Polynomial-time computable:
* Length of $x: n = |x|$
* x \[Input\] $\rightarrow$ *Algorithm* $\rightarrow$ y \[Output\]
* Computes the output in $T(n) = O(n^c)$ steps (for a constant $c$)
2. Probabilistic algorithm
* $(x, r) \rightarrow$ *Algorithm* $\rightarrow y$
* Access to random coins in each step
* Or: Additional randomness as input
* Gives the adversary more power
3. A step: for example a move on a poly-time Turing machine
* A poly-time Turing-machine has multiple heads/tapes
* A probabilistic Turing-machine has an additional tape with random bits
4. Is this the right approach?
* Advantages:
* Many models of computation (TM, RAMs, circuits,...) are "equivalent" up to a "polynomial reduction"
* The formulas for running time get much simpler (we use asymptotics)
* Disadvantages:
* Asymptotic results don't tell us anything about security of the **concrete systems**
* E.g.: Scheme may be concretely secure only for very large keys
* However
* Usually one can prove formally an asymptotic result and then argue informally that "the constants are reasonable" and can be calculated if one really wants.
### Negligible (Vernachlässigbar)
**Definition 3.1**: A function $f:N \rightarrow R_{\geq0}$ is negligible in $n$ if for every positive polynomial $p$ there exists an integer $T$ such that for all $n > T:$
$f(n) < \frac{1}{p(n)}$. We call such a function negligible in $n$ and write $negl(n)$.
#### Shorthand
A function $f:N \rightarrow R_{\geq0}$ is negligible in $n$ if for every positive polynomial $p$ and all *sufficiently large* $n: f(n) < \frac{1}{p(n)}$
#### Examples
| Function | Negligible? |
| -------- | -------- |
| $f(n) = n^{-2}$ | No, inverse poly. n^-3^ is always smaller |
| $f(n) = 2^{-n}$ | Yes, for sufficient large n |
| $f(n) = 2^{-\frac{n}{2}}$ | Yes, for sufficient large n |
| $f(n) = n^{-1000}$ | No, n^-1001^ is always smaller|
#### negl() is closed under composition
Let $negl_1$ and $negl_2$ be negligible functions. Then:
1. The function defined by $negl_3(n) = negl_1(n) + negl_2(n)$ is negligible
2. For any positive polynomial $p$, the function $negl_4()$ defined by $negl_4(n)= p(n) * negl_1(n)$ is negligible
If $f(n), g(n)$ are negligible functions and $q(n)$ is a positive polynomial, then:
* $h(n) = f(n) \cdot g(n)$ is a negligible function
* $h(n) = f(n) + g(n)$ is a negligible function
* $h(n) = |f(n) - g(n)|$ is a negligible function
* $h(n) = q(n) \cdot g(n)$ is a negligible function
### Definition 3.3: Computational security (final)
We adapt the previous attempt at defining EAV-security.
We say $\Pi$ is **EAV-secure**, if for all **PPT** adversaries, there exists a negligible function **negl** such that for all $n$ we have:
$Pr[Priv^{eav}_{A,\Pi}(n)=1]\ =\ Pr[b=b'] \le 0.5 + negl(n)$.
## Pseudorandom generators (PRGs)
Idea: map short random seed $s$ to long string $G(s)$ that *looks random*.

Motivation: Output of $G$ can be used whenever uniform randomness is needed but only short random string $s$ is available.
### What does *"look random"* mean?
1. Non-crypto application: should pass some statistical tests, e.g. *rand()* function in Windows
2. Crypto application: should pass **all** polynomial-time tests
Informal definition of *"look random"*: $\forall$ PPT(probabilistic polynomial time) distinguishers $D$:
case 0: r $\in \{0,1\}^{l(n)}$
case 1: $G(s)$

Distinguisher $D$ cannot distinguish between the two cases
### Definition 3.4 (pseudorandom generator)
Let $l$ be polynomial and $G: \{0,1\}^{n} \rightarrow \{0,1\}^{l(n)}$ be a deterministic algorithm. $G$ is a pseudorandom generator if the following holds:
1. *Expansion*: $\forall n: l(n) > n$
2. *Pseudorandomness*: $\forall$ PPT distinguishers $D$, $\exists$ *negl* such that
$|Pr[D(G(s))=1] - Pr[D(r)=1]| \leq negl(n)$,
where $s\leftarrow \{0,1\}^{n}, r \leftarrow \{0,1\}^{l(n)}$ are chosen uniformly at random.
#### Remark
Output distribution of $G$ is far from uniform. Assuming $l(n)=2n$:

Only a fraction of $2^{-n}(=\frac{2^n}{2^{2n}})$ values are reached (only $2^n$ different outputs of $G$ exist).
$\rightarrow$ $G(s)$ is not uniform over $\{0,1\}^{2n}$
$\rightarrow$ easy to distinguish for unbounded distinguisher
## Building secure encryption from PRGs
Let $G$ be a PRG with expansion factor $l$. Define encryption scheme $\Pi$ for messages of length $l$ as follows:
- $Gen:$ on input $1^n$, output uniform key $s \leftarrow \{0,1\}^{n}$
- $Enc:$ on input key $s \in \{0,1\}^{n}$ and message $m \in \{0,1\}^{l(n)}$ output ciphertext
```c := G(s) xor m```
- $Dec:$ on input key $s \in \{0,1\}^{n}$ and ciphertext $c\in \{0,1\}^{l(n)}$,
output message ```m:=G(s) xor c```
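A Python sketch of this construction, using SHAKE-256 as a *stand-in* for the PRG $G$ (the lecture does not fix a concrete PRG; treating SHAKE-256 output as pseudorandom is an assumption made only for illustration):
```python
import hashlib
import os

def G(s: bytes, out_len: int) -> bytes:
    # Stand-in PRG: deterministically expand the short seed s to out_len bytes.
    return hashlib.shake_256(s).digest(out_len)

def gen(n: int) -> bytes:
    # Gen(1^n): uniform seed s (n given in bytes here)
    return os.urandom(n)

def enc(s: bytes, m: bytes) -> bytes:
    # Enc_s(m) = G(s) XOR m
    pad = G(s, len(m))
    return bytes(p ^ b for p, b in zip(pad, m))

def dec(s: bytes, c: bytes) -> bytes:
    # Dec_s(c) = G(s) XOR c (same operation as Enc)
    return enc(s, c)

if __name__ == "__main__":
    s = gen(16)  # 16 random bytes instead of len(m) random bytes
    m = b"a message that is much longer than the seed"
    assert dec(s, enc(s, m)) == m
```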
### Comparison to one-time pad
- $\Pi$ uses $G(s)$ instead of $r \leftarrow \{0,1\}^{l(n)}$ as one-time pad
- Advantage: Instead of $l(n)$ random bits we only need $n$
### Theorem 3.5 (PRGs are EAV-secure)
If $G$ is a PRG, then the above construction is **EAV-secure**.
*Proof in the lecture-notes*
#### What has been shown?
Any adversary $A$ against $\Pi$ can be used to build a distinguisher $D$ against $G$:
- Advantages of $A$ and $D$ are directly related (here: identical)
- Running times of $A$ and $D$ are directly related (here: almost identical)
### Theoretical constructions of PRGs
One of the most important results in private key encryption:
* One way functions $\rightarrow$ PRGs ($\rightarrow$ secure encryption)
### What is a one way function (OWF)?

More formally, for function $f:\{0,1\}^{*}\rightarrow\{0,1\}^{*}$ we define the inversion game:
#### Experiment $Invert_{A,f}(n)$
1. $x \leftarrow \{0,1\}^{n}\ ;\ \ y = f(x)$
2. $x' \leftarrow A(1^{n}, y)$
3. If $f(x')=y$ output 1; otherwise output 0
#### Definition 3.6 (one way function)
A function $f:\{0,1\}^{*}\rightarrow\{0,1\}^{*}$ is one-way if the following holds:
1. *Easy to compute*:
$\exists PPT$ algorithm $\mu_{f},$ such that $\forall x\in\{0,1\}^{*}: \mu_{f}(x)=f(x)$
2. *Hard to invert*:
$\forall PPT$ adversaries $A, \exists \space negl$ such that: $Pr[Invert_{A,f}(n) = 1] \le negl(n)$
##### Example
Conjectured one-way functions:
Multiplication of large primes: Let $p,q$ be primes with $|p|=|q|$; then $f(p,q)=p*q$ is believed to be one-way (inverting it means factoring the composite $N = p \cdot q$).
#### How to encrypt with OWF?
Naive Approach: $\forall$ messages $m$, encrypt $m$ as $f(m)$. Problems:
* No key $\rightarrow$ Not possible to decrypt
* $f(m)$ may reveal something about $m$
#### Theorem 3.7
If a OWF exists, then there exist corresponding pseudorandom generators.
## Practical PRGs = Stream ciphers
The pseudorandom generators used in practice are called stream ciphers. They are so called because their output is an "infinite" stream of bits.
### Formal definition
A stream cipher is a pair of deterministic algorithms $(Init, GetBits)$ where:
* $Init$ takes as input a seed $s$ and an optional initialization vector $IV$, and outputs an initial state $st_0$
* $GetBits$ takes as input state information $st_i$ and outputs $(y, st_{i+1})$, where $y$ can be a bit or a "block of bits"
### How to construct a stream cipher
#### Old fashioned approach
1. Take a standard PRG $G$ and a hash function $H$ (will be defined later)
2. Let $st_0 = Init(IV,s):=H(IV,s)$
3. Set $st_{i+1} = GetBits(st_i)$
#### Modern approach
Design such a $G$ from scratch.
### Historical stream ciphers
Based on linear feedback shift registers:
| Stream cipher | Stats|
| -------- | -------- |
| A5/1 & A5/2 | completely broken |
| Content Scramble System (CSS) | completely broken |
| RC4 | has serious security weaknesses |
### RC4
* used in WEP, WPA and TLS
* very efficient and simple
* **has security flaws**

#### Problems with RC4
1. Doesn't have a separate IV
2. some bytes of the output are biased
3. First few bytes of output sometimes leak some information about the key
* Recommendation: discard the first 768-3072 bytes
4. There are more weaknesses
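For reference, a compact Python sketch of RC4's key scheduling (KSA) and keystream generation (PRGA); it is included only to show how simple the cipher is, not as something to deploy, given the weaknesses listed above:
```python
def rc4_keystream(key: bytes):
    # KSA: key scheduling algorithm
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    # PRGA: pseudo-random generation algorithm, yields one keystream byte per step
    i = j = 0
    while True:
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        yield S[(S[i] + S[j]) % 256]

def rc4_xor(key: bytes, data: bytes) -> bytes:
    ks = rc4_keystream(key)
    return bytes(b ^ next(ks) for b in data)

if __name__ == "__main__":
    key, msg = b"Key", b"Plaintext"
    ct = rc4_xor(key, msg)
    assert rc4_xor(key, ct) == msg  # XOR stream cipher: decryption = encryption
```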
#### Usage of RC4 in WEP
To get the seed, a key k is concatenated with the IV
* old versions: $|k|$ = 40 bits, $|IV|$ = 24 bits (US export restrictions)
* new versions: $|k|$ = 104 bits, $|IV|$ = 24 bits
### Salsa20
One of the winners of the eStream competition.
Very efficient both in hardware and in software.
$Key \space k, |k| = 256\ bits \\
Nonce \space r, |r| = 64\ bits \\
Salsa20(k,r) := H(k,r,0)||H(k,r,1)||...$
*All the "magic" happens in the function H*
# Lecture 4
## Stronger security notions
### Chosen plaintext attacks (CPA)
Setting: Two parties Alice and Bob, adversary $A$

Adversary $A$ has control over what messages are encrypted.
Formally modelled: $A$ has oracle access to $Enc_{k}(\cdot)$. We write $A^{Enc_{k}(\cdot)}$.
For any encryption scheme $\Pi=(Gen, Enc, Dec)$, any PPT adversary $A$, and any security parameter $n$ define the following experiment:
$PrivK^{CPA}_{A,\Pi}(n)$
1. $k \leftarrow Gen(1^n)$
2. $(m_0, m_1) \leftarrow A^{Enc_{k}(\cdot)}(1^{n})$ with $|m_0| = |m_1|$
3. $b \leftarrow \{0,1\}$
4. $b' \leftarrow A^{Enc_{k}(\cdot)}(Enc_k(m_b))$
5. If $b' = b$ output 1; else output 0

### Definition 3.8 (CPA-secure)
A private key encryption scheme $\Pi=(Gen, Enc, Dec)$ is CPA-secure if $\forall \space PPT$ adversaries $A, \exists \space negl$ such that:
$Pr[PrivK^{CPA}_{A,\Pi}(1^n) = 1]\leq \frac{1}{2}+negl(n)$
Where probability is taken over randomness of $A$ and of experiment.
# Lecture 5
## Definitions
### Pseudorandom functions (PRFs)
Informally: "random looking" function ~ function indistinguishable from random function. Formalized definition:
* Let $Func_n = \{f: \{0,1\}^n \rightarrow \{0,1\}^n\}$ be set of all n-bit to n-bit functions
* Observation: $|Func_n| = 2^{n \cdot 2^n}$
* Why?
$\rightarrow$ Each function $f \in Func_n$ can be represented by a large lookup table with $2^n$ entries (one for every combination of $n$ input bits), where every function value has $n$ bits. So each $f$ can be represented by $2^n \cdot n$ bits, and hence $|Func_n| = 2^{n \cdot 2^n}$.
* In the following we consider functions $F:\{0,1\}^* \times \{0,1\}^* \rightarrow \{0,1\}^*$
* efficient: $\forall$ inputs $k, x \ \exists \mathcal{PPT}$ algorithm computing $F(k, x)$
* keyed: the first input is a key $k$ that is kept fixed; we write $F_k(x) := F(k, x)$
* length preserving: for some n we have: $F:\{0,1\}^n \times \{0,1\}^n \rightarrow \{0,1\}^n$
* Definition 3.10 (PRF):
* Let $F:\{0,1\}^n \times \{0,1\}^n \rightarrow \{0,1\}^n$ be an efficient, keyed, length preserving function. $F$ is a PRF if it holds that $\forall \ \mathcal{PPT}$ distinguishers $D$, $\exists negl$ such that:
* $|Pr[D^{F_k(\cdot)}(1^n) = 1] - Pr[D^{f(\cdot)}(1^n) = 1]| \le negl(n)$
* where $k \leftarrow \{0,1\}^n,\ \ f \leftarrow Func_n$
**Scenario 0: real world (D, F_k)**
```sequence
D->F_k: x_i
Note right of F_k: F_k is a PRF
F_k->D: F_k(x_i)
Note over D: D outputs 1/0
```
---
**Scenario 1: ideal world (D, f)**
```sequence
D->f: x_i
Note right of f: f is a uniformly random function
f->D: f(x_i)
Note over D: D outputs 1/0
```
**Example 3.11**
$F_k(x) = k \oplus x$ is not a PRF since $F_k(x_0) \oplus F_k(x_1) = x_0 \oplus x_1$
## From PRFs to CPA-secure encryption
Attempt 1: $Enc_k(m) = F_k(m)$
1. not clear how to decrypt (a PRF need not be invertible)
2. deterministic $\rightarrow$ cannot be CPA-secure
### Construction
Let $F$ be a PRF. Define private key encryption scheme for messages of length $n$:
- $Gen$: On input $1^n$, choose $k \leftarrow \{0,1\}^n$ and output $k$
- $Enc$: On input $k, m \in \{0,1\}^n$, sample $r \leftarrow \{0,1\}^n$ and output $(r, F_k(r) \oplus m)$
- $Dec$: On input $k \in \{0,1\}^n$ and ciphertext $c = (r, s)$ output plaintext: $m = s \oplus F_k(r)$
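A Python sketch of this construction. HMAC-SHA256 truncated to $n$ bytes is used as a *stand-in* for the PRF $F$ (an assumption for illustration; the lecture only requires some PRF):
```python
import hashlib
import hmac
import os

N = 16  # length n in bytes

def F(k: bytes, x: bytes) -> bytes:
    # Stand-in PRF F_k(x): HMAC-SHA256 truncated to N bytes (assumption for illustration)
    return hmac.new(k, x, hashlib.sha256).digest()[:N]

def gen() -> bytes:
    return os.urandom(N)  # k <- {0,1}^n

def enc(k: bytes, m: bytes):
    assert len(m) == N
    r = os.urandom(N)  # fresh randomness for every encryption
    pad = F(k, r)
    return r, bytes(p ^ b for p, b in zip(pad, m))  # (r, F_k(r) XOR m)

def dec(k: bytes, c):
    r, s = c
    pad = F(k, r)
    return bytes(p ^ b for p, b in zip(pad, s))  # m = s XOR F_k(r)

if __name__ == "__main__":
    k = gen()
    m = b"sixteen byte msg"
    c1, c2 = enc(k, m), enc(k, m)
    assert dec(k, c1) == m and dec(k, c2) == m
    assert c1 != c2  # randomized: the same message encrypts differently (needed for CPA)
```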
**Theorem 3.12**
If $F$ is a PRF, then above scheme is CPA-secure.
*Proof in lecture notes.*
# Lecture 6
## Definitions
### Pseudorandom permutations (PRP)
Informal: "Random looking" permutation. Same as PRF but $F_k(\cdot)$ and $f(\cdot)$ have to be permutations. Formally:
* $Perm_n = \{f:\{0,1\}^n \rightarrow \{0,1\}^n\ :\ {f\ is\ permutation}\}$
* $|Perm_n| = (2^n)!\ \:$(Viewing each $f \in Perm_n$ as look-up table again)
* Let $F_k^{-1}(\cdot)$ be the inverse of the permutation. If $F$ is efficient then a PPT algorithm exists that computes $F_k^{-1}(y)\ \forall k,y$
* **PRP Definition**: Analogous to the PRF definition, but with $Perm_n$ in place of $Func_n$, i.e. the random $f$ is required to be a permutation
**Strong PRP**:
The distinguisher additionally has oracle access to the inverse $F_k^{-1}(\cdot)$ (resp. $f^{-1}(\cdot)$). If $F$ is a strong PRP, then $F^{-1}$ is also a strong PRP.

**In Practice**:
Block ciphers $\approx$ pseudorandom permutations with some fixed key length (e.g. AES, DES, IDEA, ...)
## Block cipher modes
We need modes of operation because without them:
+ messages have to be very short
+ the cipher would be deterministic and stateless, i.e. it cannot be CPA-secure
### Electronic Code Book Mode (ECB)

Each block gets encrypted separately. **Not secure** since patterns of the plaintext remain visible as artifacts (see the famous Linux penguin ECB image).
### Cipher Block Chaining Mode (CBC)


Theorem:
~ If $F$ is a PRP then $F$-CBC is secure^[M. Bellare, A. Desai, E. Jokipii and P. Rogaway 1997].
### Counter mode (CTR)
We can convert a pseudorandom permutation into a pseudorandom generator.

This is also called CTR mode. We can use it for encryption if we xor the plaintext afterwards.

### Comparison of CBC and CTR
| Mode of operation | Error propagation | Encryption parallelizable? | Decryption parallelizable? | What if one bit of plaintext changes? |
| -------- | -------- | -------- | -------- | -------- |
| CBC | Error in block $c_i$ affects $c_i$ and $c_{i+1}$| No | Yes | Need to recompute everything* |
| CTR | Error in block $c_i$ affects only $c_i$ | Yes | Yes | Need to recompute one block only |
\* Not suited for e.g. disk encryption.
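A Python sketch of counter-mode encryption. A keyed, truncated SHA-256 stands in for the block cipher $F_k$ (an illustrative assumption; in practice one would use AES). CTR mode only evaluates $F_k$ in the forward direction, which is why decryption parallelizes just like encryption:
```python
import hashlib
import os

BLOCK = 16  # block length in bytes

def F(k: bytes, x: bytes) -> bytes:
    # Stand-in for the block cipher F_k (only the forward direction is ever needed)
    return hashlib.sha256(k + x).digest()[:BLOCK]

def keystream(k: bytes, iv: bytes, nbytes: int) -> bytes:
    # Pseudorandom stream: F_k(ctr+1) || F_k(ctr+2) || ...
    ctr = int.from_bytes(iv, "big")
    out = bytearray()
    i = 1
    while len(out) < nbytes:
        out += F(k, ((ctr + i) % 2**128).to_bytes(BLOCK, "big"))
        i += 1
    return bytes(out[:nbytes])

def ctr_encrypt(k: bytes, m: bytes) -> bytes:
    iv = os.urandom(BLOCK)  # fresh random initial counter value
    return iv + bytes(p ^ b for p, b in zip(keystream(k, iv, len(m)), m))

def ctr_decrypt(k: bytes, c: bytes) -> bytes:
    iv, body = c[:BLOCK], c[BLOCK:]
    return bytes(p ^ b for p, b in zip(keystream(k, iv, len(body)), body))

if __name__ == "__main__":
    k = os.urandom(BLOCK)
    m = b"CTR turns a block cipher into a stream cipher"
    assert ctr_decrypt(k, ctr_encrypt(k, m)) == m
```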
## Construction of block ciphers
Typical requirements:
* *Security:* Best attack should be brute force key search.
* *Efficiency* on different platforms (microcontrollers, phones, PCs, dedicated hardware)
* *Key agility:* Efficient key change possible
* *Advanced security requirements:* Resistance against side-channel attacks, related key attacks, ...
### Iterated cipher design

Popular because it is:
* *simple*: Easy to implement and easier to check for backdoors.
* *symmetric (repeating patterns)*: Smaller circuits in hardware.
* Examples:
* Feistel ciphers
* Most popular: DES (Lucifer)
* Substitution permutation networks
* Most popular: AES (Rijndael)
### Substitution permutation networks
They use the iterated cipher design paradigm and are typically easy to invert.

### The Round function of AES

---
**SubBytes**

---
**ShiftRows**

---
**MixColumns**

## Feistel-Networks


# Lecture 7
## Definitions
### Message Authentication Codes (MACs)
*Definition 4.1:* A MAC consists of three PPT algorithms $(Gen, Mac, Vrfy)$:
* $Gen$: On input of the security parameter $1^n$, output key $k$ with $|k| \ge n$
* $Mac$: The tag algorithm $Mac$ takes the input key $k$ and message $m \in \{0,1\}^*$ and outputs a tag $t \leftarrow Mac_k(m)$
* $V\!r\!fy$: On input key $k$, message $m$ and tag $t$ output a bit $b$, where $b=1$ means valid and $b=0$ invalid.
*Canonical verification*: If $Mac$ is deterministic, the verification works as follows:
* Compute $\tilde{t} = Mac_k(m)$
* Output $1$ iff $\tilde{t} = t$, otherwise $0$
#### Security definition
Idea: Impossible to generate a valid tag for a new message. Two elements to define: *Threat model* + *successful break*.
*Threat model:* $A$ sees/controls valid pairs $(m,t)$
*Successful break:* $A$ generates _new_ valid pair $(m^*, t^*)$
Experiment $Mac\!-\!\!forge_{A,\Pi}(n)$:
1. $k \leftarrow Gen(1^n)$
2. $(m^*, t^*) \leftarrow A^{Mac_k(\cdot)}(1^n)$ , where $Q$ denotes set of all queries asked by $A$
3. Output 1 if (1) and (2) apply, 0 otherwise:
1. $V\!r\!fy\,_k(m^*, t^*) = 1$
2. $m^* \notin Q$

#### Another security definition from the exercise
A message authentication code $\Pi = (Gen, Mac, Vrfy)$ is strongly secure or a strong MAC, if for all probabilistic polynomial-time adversaries $A$, there is a negligible function $negl$ such that:
$$ Pr[Mac\!-\!s\!forge_{A, \Pi}(n) = 1] \le negl(n)
$$
The experiment $Mac-sforge_{A, \Pi}$ is defined by modifying the above experiment such
that adversary wins also if he outputs a new valid tag for a message queried earlier. This means the last part of the experiment is adapted such that instead of $m^* \notin Q$ we write: $(m^*,t^*) \notin Q$.
### MAC-Security (Definition 4.2)
A message authentication code $\Pi = (Gen, Mac, Vrfy)$ is existentially unforgeable under adaptive chosen message attacks (or just secure) if $\forall$ PPT attackers $A, \exists negl$ such that:
$Pr[Mac\!-\!\!forge_{A,\Pi}(n) = 1] \le negl(n)$
#### Is our definition too strong?
*Successful break:* Any previously unauthenticated message counts.
*Real world:* Only *"meaningful"* messages.
Problem: *"meaningful"* is highly application dependent.
#### Replay attacks
Replay attacks possible even with MAC. E.g.:
1. Alice $\rightarrow$ Bank: ("10.000$ to Bob", tag)
2. Bob listened to Alice's message and sends the same message to the bank again.
Fix this with __*randomness*__ or __*sequence numbers*__!
## Construction of MACs
1. information-theoretically secure (perfectly secure): possible but inefficient
2. fixed length
3. arbitrary length: theoretical
4. CBC-MAC
### Fixed length MACs
Idea: PRFs are fixed length MACs. Let $F$ be a PRF. Define $MAC$ as follows:
* $Gen(1^n)$: output random key $k \leftarrow \{0,1\}^n$
* $Mac_k(m)$: output tag $t = F_k(m)$
* $V\!r\!fy_k(m,t)$: output 1 if $t = F_k(m)$; otherwise 0.
*Idea for security:* $F$ is a random function. Guessing the correct output has probability $2^{-n}$.
*Theorem 4.3:* If $F$ is a PRF, then the above construction is a secure fixed-length MAC. [For proof see slides]
### Domain extension
Problem of previous construction: It authenticates only short, fixed-length messages. To solve this we create a scheme $\Pi^*$ from $\Pi$ that works on any message length.
#### How NOT to do it:
* Approach 1: $Mac_k(m) = (Mac_k'(m_1), \dots, Mac_k'(m_d))$
*Not secure! Block reordering attack possible.*
* Approach 2: $Mac_k(m) = (Mac_k'(1||m_1), Mac_k'(2||m_2), \dots)$
*Not secure! Truncation attack (adversary drops blocks from the end).*
* Approach 3: $Mac_k(m) = (Mac_k'(l||1||m_1), Mac_k'(l||2||m_2), \dots)$
*Not secure! Mix and match attack: Having two authenticated messages of the same length we can mix.*
#### How we do it finally
We use a random identifier.
$Mac_k(m) = (Mac_k'(r||l||1||m_1), \dots, Mac_k'(r||l||d||m_d), r)$
$V\!r\!fy_k(t,m)$: Check if all MACs are valid and use same random identifier $r$.
*Theorem 4.4:* If $\Pi'$ is secure MAC for message length n, then above construction is secure MAC for arbitrary length messages.
### CBC-MAC
Disadvantage of previous construction: Inefficient. Block cipher is evaluated $O(d)$ times. Tag length is $O(dn)$. Alternative construction: CBC-MAC.

* $Mac_k(m_1, \dots, m_d)$ is constructed according to above scheme. $F_k$ outputs a block of length $n$.
* $V\!r\!fy_k(m,t)$: Recompute $\tilde{t}$ and output 1 if $t=\tilde{t}$, otherwise output 0.
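A Python sketch of basic CBC-MAC for messages whose length is a multiple of the block length; a keyed, truncated SHA-256 stands in for $F_k$ (an illustrative assumption):
```python
import hashlib
import hmac
import os

BLOCK = 16

def F(k: bytes, x: bytes) -> bytes:
    # Stand-in for the PRF/block cipher F_k with n-byte output
    return hashlib.sha256(k + x).digest()[:BLOCK]

def cbc_mac(k: bytes, m: bytes) -> bytes:
    assert len(m) > 0 and len(m) % BLOCK == 0  # fixed-length messages only
    t = bytes(BLOCK)  # "IV" is fixed to 0^n -- no random IV, unlike CBC encryption
    for i in range(0, len(m), BLOCK):
        block = m[i:i + BLOCK]
        t = F(k, bytes(a ^ b for a, b in zip(t, block)))
    return t  # only the final block is output

def vrfy(k: bytes, m: bytes, t: bytes) -> bool:
    return hmac.compare_digest(cbc_mac(k, m), t)

if __name__ == "__main__":
    k = os.urandom(BLOCK)
    m = b"0123456789abcdef" * 3
    t = cbc_mac(k, m)
    assert vrfy(k, m, t)
    assert not vrfy(k, b"0123456789abcdeX" + m[BLOCK:], t)  # modified message is rejected
```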
*Theorem 4.5 (CBC-MAC security):*
Let $l$ be polynomial. If $F$ is PRF, then CBC-MAC is a secure MAC for messages of length $l(n)\cdot n$.
#### CBC-MAC vs CBC mode of encryption
* CBC encryption uses random IV (crucial for security). CBC-MAC uses *no IV*.
* CBC encryption outputs all intermediate values (otherwise no decryption is possible). CBC-MAC outputs only the final block
# Lecture 8
## Authenticated encryption
Ideally secure encryption: *secrecy + integrity*.
*Secrecy:* Strong secrecy $\rightarrow$ CCA-security
*Integrity:* Unforgeable encryptions
### CCA-Security
CPA-security + access to a decryption oracle
CCA-indistinguishability game $PrivK^{CCA}_{A,\Pi}$:
1. $k \leftarrow Gen(1^n)$
2. $(m_0,m_1) \leftarrow A^{Enc_k(\cdot), Dec_k(\cdot)}(1^n)$
3. $b \leftarrow \{0,1\}$
4. $b' \leftarrow A^{Enc_k(\cdot), Dec_k(\cdot)}(Enc_k(m_b))$
5. If $b = b'$ output 1; otherwise output 0.
After receiving the challenge ciphertext after step 3, $A$ can still ask queries to the oracle. However $A$ cannot query the oracle on the challenge ciphertext.
*Definition 4.6 (CCA-security):*
A private key encryption scheme $\Pi = (Gen, Enc, Dec)$ is CCA-secure, if $\forall$ PPT adversaries $A\ \exists negl$ such that:
$Pr[PrivK^{CCA}_{A,\Pi}(n)=1] \le \frac{1}{2} + negl(n)$
### Unforgeable encryptions
= impossible to generate new valid ciphertexts.
Experiment $Enc\!-\!\!Forge_{A,\Pi}(n)$:
1. $k \leftarrow Gen(1^n)$
2. $c \leftarrow A^{Enc_k(\cdot)}(1^n),$ where $Q$ is the set of all messages asked to $Enc_k(\cdot)$
3. Let $m = Dec_k(c)$. Output 1 iff
* $m \neq \bot$ (m deciphered correctly) and
* $m \notin Q$
*Definition 4.7 (unforgeable encryption)*
$\Pi$ is unforgeable if $\forall$ PPT adversaries $A\ \exists negl$ such that:
$Pr[Enc\!-\!\!Forge_{A,\Pi}(n)=1] \le negl(n)$
*Definition 4.8 (authenticated encryption)*
$\Pi$ is an authenticated encryption scheme if $\Pi$ is CCA-secure and unforgeable.
### Generic constructions
* CPA-secure encryption $\Pi_E=(Gen, Enc, Dec)$ (with random n-bit string $k_E$ as key)
and
* MAC $\Pi_M=(Gen, Mac, Vrfy)$ (with random n-bit string $k_M$ as key)
#### Encrypt-and-MAC
Encryption of m:
$(c,t) = (Enc_{k_E}(m), Mac_{k_M}(m))$
Decryption of $(c, t)$:
$m = Dec_{k_E}(c)$. Output $m$ iff $Vrfy_{k_M}(m,t) = 1$. Otherwise output $\bot$.
Security:
No! Tag may reveal information about message $m$ (e.g. CBC-MAC).
#### MAC-then-encrypt
Encryption of m:
$t \leftarrow Mac_{k_M}(m)$
Output $c \leftarrow Enc_{k_E}(m||t)$
Decryption of $c$:
$m||t = Dec_{k_E}(c)$. Output $m$ iff $Vrfy_{k_M}(m,t) = 1$. Otherwise output $\bot$.
Security:
Not generically!
#### Encrypt-then-MAC
Encryption of m:
$c \leftarrow Enc_{k_E}(m)$
$t \leftarrow Mac_{k_M}(c)$
Output $(c,t)$.
Decryption of $(c,t)$:
If $Vrfy_{k_M}(c,t) = 1$, output $Dec_{k_E}(c)$. Otherwise output $\bot$.
Security:
See Theorem 4.9.
*Theorem 4.9*
If $\Pi_E$ is CPA-secure and $\Pi_M$ is a strongly secure MAC, then Encrypt-then-MAC is an authenticated encryption scheme. Approach: Show unforgeability of encryptions + CCA-security.
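A Python sketch of Encrypt-then-MAC with two independent keys. A nonce-based XOR stream stands in for the CPA-secure $\Pi_E$ and HMAC-SHA256 for the strongly secure MAC $\Pi_M$; both stand-ins are assumptions made only for illustration:
```python
import hashlib
import hmac
import os

def ske_enc(kE: bytes, m: bytes) -> bytes:
    # CPA-style encryption stand-in: random nonce + XOR with a SHAKE-256 keystream
    r = os.urandom(16)
    pad = hashlib.shake_256(kE + r).digest(len(m))
    return r + bytes(p ^ b for p, b in zip(pad, m))

def ske_dec(kE: bytes, c: bytes) -> bytes:
    r, body = c[:16], c[16:]
    pad = hashlib.shake_256(kE + r).digest(len(body))
    return bytes(p ^ b for p, b in zip(pad, body))

def etm_enc(kE: bytes, kM: bytes, m: bytes):
    c = ske_enc(kE, m)                            # 1) encrypt the message
    t = hmac.new(kM, c, hashlib.sha256).digest()  # 2) MAC the *ciphertext*
    return c, t

def etm_dec(kE: bytes, kM: bytes, c: bytes, t: bytes):
    if not hmac.compare_digest(hmac.new(kM, c, hashlib.sha256).digest(), t):
        return None  # reject: output "bottom"
    return ske_dec(kE, c)

if __name__ == "__main__":
    kE, kM = os.urandom(16), os.urandom(16)  # two independent keys!
    m = b"authenticated encryption, generically"
    c, t = etm_enc(kE, kM, m)
    assert etm_dec(kE, kM, c, t) == m
    tampered = c[:-1] + bytes([c[-1] ^ 1])
    assert etm_dec(kE, kM, tampered, t) is None  # tampering is detected before decryption
```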
#### The need for two independent keys
Some schemes can be broken when the same key is used. Consider an Encrypt-then-MAC scheme with $k=k_E=k_M$. Let $F$ be a strong PRP (which implies $F^{-1}$ is also a strong PRP).
* $Enc_k(m) = F_k(m||r)$
* $Mac_k(c) = F_k^{-1}(c)$
* Encrypt-then-MAC:
$Mac_k(Enc_k(m)) \\
= F_k^{-1}(F_k(m||r)) \\
= m||r$
$\Rightarrow$ Message revealed in clear!
# Lecture 9
## Hash functions
A hash function $H$ compresses a long input $x \in \{0,1\}^{l(n)}$ into a short digest $H(x) \in \{0,1\}^n$.
*Informal security property:*
The digest uniquely identifies $x$ (= collision resistance. I.e. it's hard to find $(x,x')$ such that $H(x)=H(x')$).
*Observation:* There always exist collisions since the domain is larger than the range: $|\{0,1\}^{l(n)}| > |\{0,1\}^n|$ because $l(n) > n$.
### Collision resistance
First attempt at defining collision resistance. $H$ is collision resistant iff:
* $H(x)$ can be computed in polynomial time
* $\forall$ PPT adversaries $A$, the probability that $A$ outputs $(x, x')$ with $x\neq x' \wedge H(x)=H(x')$ is negligible
Technical problem: There always exists an adversary $A$ that has a collision hardcoded into its description. To solve this we use a public seed $s$.
*Definition 5.1*
A hash function with output length $n$ is a pair of PPT algorithms $(Gen, H)$ defined as follows.
* $Gen:$ On input $1^n$ the probabilistic algorithm outputs a public seed $s$
* $H:$ On input $s$ and string $x \in \{0,1\}^{l(n)}$ it outputs $H^s(x)$
If the domain of $H$ is $\{0,1\}^*$ instead of $\{0,1\}^{l(n)}$, then $H$ is a hash function for arbitrary message length.
#### Experiment $Hash\!-\!coll_{A,\Pi}(n)$
1. $s \leftarrow Gen(1^n)$
2. $(x, x') \leftarrow A(1^n, s)$ with $x, x' \in \{0,1\}^{l(n)}$
3. Output 1 iff both following conditions apply, 0 otherwise
* $x \neq x'$ and
* $H^s(x) = H^s(x')$
#### Definition 5.2 (collision resistance CRHF)
A hash function $\Pi = (Gen, H)$ is collision resistant if $\forall$ PPT adversaries $A\ \exists negl$ such that:
$Pr[Hash\!-\!coll_{A,\Pi}(n)=1] \le negl(n)$
In practice: hash functions come without a seed. In that case collision resistance means that no trivial adversary (with a hardcoded collision) is known; this case is hard to define formally.
#### Definition: Second Preimage Secure
(weaker than collision resistance)
Informally, a hash function is second preimage resistant if **given $s$ and a uniform $x$** it is infeasible for a PPT adversary to find $x′\neq x$ such that $H_s(x′) =H_s(x)$. The difference to collision resistance is that $A$ has to find an $x'$ for a given $x$, while in the collision-resistance game, he can choose both arbitrarily.
# Lecture 10
## Hash functions
### Domain extension
Goal: Use a fixed length CRHF to create an arbitrary length CRHF.
**Merkle-Damgard transformation:** Let $(Gen, h)$ be a fixed-length hash function for inputs of length $l(n)=n+1$ and output length $n$. We construct a hash function with input length $l'(n)=poly(n)$.
* $Gen:$ Stays unchanged
* $H:$ On input key $s$ and string $X=(x_1, ..., x_{l'(n)})$ with $x_i \in \{0, 1\}$

**Theorem 5.3:** If $(Gen, h)$ is collision resistant, then $(Gen, H)$ is collision resistant. *[Proof in the slides]*
**What about variable length input?**
Consider a CRHF $(Gen, h)$ with $h^s(0^{n+1})=0^n$. Then: $H^s(0)=H^s(0^{n+1})=0^n \Rightarrow$ collision.
How to fix it: Append the length of $x$ to the input.
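A Python sketch of the Merkle-Damgard transformation including the length block (working on byte blocks instead of single bits, and with a truncated SHA-256 standing in for the fixed-length compression function $h^s$; both simplifications are assumptions for illustration):
```python
import hashlib

N = 16      # output length of the compression function, in bytes
BLOCK = 16  # message-block length processed per compression call

def h(s: bytes, z: bytes, x: bytes) -> bytes:
    # Stand-in fixed-length compression function h^s(z || x) (assumption for illustration)
    return hashlib.sha256(s + z + x).digest()[:N]

def H(s: bytes, msg: bytes) -> bytes:
    # Merkle-Damgard: split into blocks, chain through h, append the length as a final block
    z = bytes(N)  # initial chaining value 0^n
    padded = msg + bytes(-len(msg) % BLOCK)  # zero-pad to a multiple of BLOCK
    for i in range(0, len(padded), BLOCK):
        z = h(s, z, padded[i:i + BLOCK])
    length_block = len(msg).to_bytes(BLOCK, "big")  # encode |msg| to rule out padding collisions
    return h(s, z, length_block)

if __name__ == "__main__":
    s = b"public seed"
    # Without the length block, "short" and "short\x00" would be padded to identical blocks.
    assert H(s, b"short") != H(s, b"short\x00")
    assert H(s, b"same input") == H(s, b"same input")
```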
### MACs from hash functions
Goal: Domain extension for MACs
**Hash-and-MAC**

Construction of $\Pi'=(Gen', Mac', Vrfy')$ from a MAC $\Pi=(Gen, Mac, Vrfy)$ for messages of length $l(n)$ and a hash function $\Pi_H=(Gen_H, H)$ with output length $l(n)$.
* $Gen':$ Output $k \leftarrow Gen(1^n)$ and $s \leftarrow Gen_H(1^n)$.
* $Mac':$ On input $(k, s)$ and message $m \in \{0, 1\}^*$ output $t \leftarrow Mac_k(H^s(m))$.
* $Vrfy':$ On input $(k, s), m$ and $t$, output 1 iff $Vrfy_k(H^s(m),t)=1$.
**Theorem 5.4:** If $\Pi$ is a secure MAC for messages of length $l$ and $\Pi_H$ is collision resistant, then $\Pi'$ is a secure MAC for arbitrary-length messages. *[Proof see slides]*
### Some real life hash functions
* MD5, invented in 1991, collision in compression function ($h$ in Merkle-Damgard transformation) found in 1996 $\Rightarrow$ broken.
* SHA-1, freestart collision in 2015, full collision in 2017 $\Rightarrow$ broken.
* SHA-3, Keccak won competition in 2012. Considered secure.
*Terminology*:
Freestart collision: Collision where the chaining value (in the Merkle-Damgard transform) is chosen arbitrarily
## Random oracle model
Idea: model the hash function as a random oracle.
$H: \{0,1\}^* \rightarrow \{0,1\}^L$ (random function)
Difference to PRF: Adversary can query the oracle.
As long as the adversary never queried the oracle on X the value H(X) "looks completely random to him"
### Criticism of the Random Oracle Model
There exists a signature scheme that is **secure** in ROM but is **not secure** if the random oracle is replaced with any real hash function.
This example is very artificial. No realistic example of this type is known.
### Terminology
Model without the random oracles:
* "plain model"
* "cryptographic model"
Random Oracle Model is also called: the "Random Oracle Heuristic"
### One way functions in the RO
$$
f^{RO(.)}(X)=RO(X)
$$
One way as long as adversary has not queried RO(.) on input X so that RO(X)=Y.
For T queries success probability is $\frac{T}{2^n}$.
### PRGs in the RO
$$
G^{RO(.)}(X)=RO(X||1),RO(X||2)
$$
where each output of RO(.) yields n bits.
Only way to distinguish $G^{RO(.)}(X)$ from random is by querying RO(.) on input X.
For T queries success probability is $\frac{T}{2^n}$.
### PRFs in the RO
$$
F^{RO(.)}_k(X)=RO(k||X)
$$
To show:
$$
|Pr[A^{RO(.),F^{RO(.)}_k}(1^n) = 1] - Pr[A^{RO(.),R(.)}(1^n)=1]|\le negl(n)
$$
where R(.) is random function independent of RO(.).
### Merkle tree
```graphviz
digraph merkeltree {
nodesep=1.0 // increases the separation between nodes
node [color=Black,fontname=Courier,shape=box] //All nodes will this shape and colour
edge [color=Black, style=solid] //All the lines look like this
"Top Hash\nhash(Hash0+ Hash1)" [color=Red]
"Hash0\nhash(Hash0-0+Hash0-1)" [color=Orange]
"Hash1\nhash(Hash1-0+Hash1-1)" [color=Red]
"Hash1-0\nhash(L2)" [color=Red]
"Hash1-1\nhash(L3)" [color=Orange]
"Top Hash\nhash(Hash0+ Hash1)"->{"Hash0\nhash(Hash0-0+Hash0-1)" "Hash1\nhash(Hash1-0+Hash1-1)"}
"Hash0\nhash(Hash0-0+Hash0-1)" -> {"Hash0-0\nhash(L0)" "Hash0-1\nhash(L1)"}
"Hash1\nhash(Hash1-0+Hash1-1)" -> {"Hash1-0\nhash(L2)" "Hash1-1\nhash(L3)"}
"Hash0-0\nhash(L0)" -> L0
"Hash0-1\nhash(L1)" -> L1
"Hash1-0\nhash(L2)" -> L2
"Hash1-1\nhash(L3)" -> L3
}
```
#### Example
To verify file L2 you need the hashes Hash0 and Hash1-1 and need to compute the hashes Hash1-0, Hash1, TopHash.
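A Python sketch of a Merkle tree over four leaves and of the verification of L2 from its authentication path, matching the example above (SHA-256 as the hash; names are illustrative):
```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    # Build the tree bottom-up; every inner node hashes the concatenation of its two children.
    level = [h(x) for x in leaves]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def verify_leaf(leaf, index, path, root):
    # path = sibling hashes from the leaf level up to (but excluding) the root
    node = h(leaf)
    for sibling in path:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root

if __name__ == "__main__":
    L0, L1, L2, L3 = b"L0", b"L1", b"L2", b"L3"
    root = merkle_root([L0, L1, L2, L3])
    # To verify L2 we need Hash1-1 (= h(L3)) and Hash0 (= h(h(L0)+h(L1))), as described above.
    path_for_L2 = [h(L3), h(h(L0) + h(L1))]
    assert verify_leaf(L2, 2, path_for_L2, root)
    assert not verify_leaf(b"forged file", 2, path_for_L2, root)
```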
### Birthday attack
There exists a generic attack on any hash function
$$
H:\{0,1\}^*\rightarrow \{0,1\}^n
$$
that finds a collision with probability $\frac{1}{2}$ and works in time and space $O(2^\frac{n}{2})$.
Consequence: to achieve "m bits of security" one needs to set $n=2\cdot m$
For $q \le \sqrt{2N}$, where $q$ denotes the number of people in the room (resp. the number of hashed values) and $N$ the total number of possible birthdays (resp. hash values), the probability of a collision can be approximated by
$$ Pr[coll(q, N)] > 1 - e^{-q(q-1)/(2N)}
$$
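A Python sketch of the birthday attack against a deliberately truncated hash with $n = 32$ output bits, so that the $O(2^{n/2})$ behaviour is visible within seconds (the truncation exists only to make the demo fast):
```python
import hashlib
import os

def H(x: bytes) -> bytes:
    # Toy hash with n = 32 output bits (SHA-256 truncated; only so the demo finishes quickly)
    return hashlib.sha256(x).digest()[:4]

def birthday_attack():
    seen = {}  # digest -> preimage; O(2^(n/2)) time *and* space
    trials = 0
    while True:
        trials += 1
        x = os.urandom(8)
        d = H(x)
        if d in seen and seen[d] != x:
            return seen[d], x, trials  # collision: x != x' with H(x) = H(x')
        seen[d] = x

if __name__ == "__main__":
    x1, x2, trials = birthday_attack()
    assert x1 != x2 and H(x1) == H(x2)
    print(f"collision after {trials} trials (expected order of magnitude: 2^16)")
```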
# Lecture 11
## Motivation
**Cryptography is based on assumptions**
```graphviz
digraph merkeltree {
nodesep=1.0 // increases the separation between nodes
node [color=Black,fontname=Courier,shape=box] //All nodes will this shape and colour
edge [color=Black, style=solid] //All the lines look like this
"Pseudorandom Permutations"->{"Private-Key-Encryption","Message Authentication Codes"}
"Block Ciphers" ->"Pseudorandom Permutations"
"Pseudorandom Generator" -> "Pseudorandom Permutations"
"One Way Functions" -> "Pseudorandom Generator"
"Number theoretic assumptions: RSA, Factoring, Discrete Log" -> "One Way Functions"
}
```
**Number theory in cryptography - advantages:**
1. Security can be based on famous mathematical conjectures
2. More advanced constructions can be created because of "mathematical" structure
3. Constructions have a natural security parameter (-> they can be scaled)
**Number theory in cryptography - disadvantages:**
1. Much less efficient (than e.g. symmetric-key constructions)
2. the number-theoretic "structure" may help cryptanalysts
Number theory is a source of hard problems. Identifying them can be good for cryptography.
## Recap number theory
### General notation
* for $a,b \in \mathbb{Z}: a|b$ means a divides b
* greatest common divisor $gcd(a,b)$ largest integer $c$ such that $c|a$ and $c|b$.
* $p > 1$ is prime if its only positive divisors are $1$ and $p$
### Modular arithmetic
* $a,b,N \in \mathbb{Z}$ with $N>1$: $[a \bmod N]$ denotes the remainder of $a$ upon division by $N$, i.e. $r:=[a \bmod N]$ with $a = q \cdot N + r$ where $0 \le r < N$
* $a=b\ mod\ N$ means $[a\ mod\ N] = [b\ mod\ N]$
* For $b, N \in \mathbb{Z}$ with $b\ge1$, $N>1$: $b$ has a multiplicative inverse modulo $N$ iff $gcd(b,N)=1$
### Groups
A group is a set $G$ along with a binary operation $\circ$ for which the following holds:
* Closure: $\forall g,h \in G, g \circ h \in G$
* Existence of identity: $\exists\ identity\ e \in G\ s.t. \forall\ g \in G: e \circ g = g = g \circ e$
* Existence of inverse: $\forall \ g \in \ G, \ \exists \ h\ \in \ G\ s.t\ g \circ h=e=h \circ g$
* Associativity: $\forall g_1,g_2,g_3 \in G: (g_1 \circ g_2) \circ g_3 = g_1 \circ (g_2 \circ g_3)$
$G$ is called abelian if further:
* Commutativity: $\forall g,h \in G: g \circ h = h \circ g$
If $G$ has a finite number of elements, we say $G$ is finite and let $|G|$ denote the order of $G$ (=#elements in $G$).
Notation: $\circ$ typically stands for $"+"$ or $"*"$.
### Subgroups
A group $H$ is a subgroup of $G$ if:
* $H \subseteq G$
* group operation of H is same as in G
**Examples:**
a) $\mathbb{Z}$ is abelian group under addition, but not under multiplication. (e.g. 2 has no inverse)
b) $G=\{0,...,N-1\}$ with addition $\mod N$ is abelian group of order $N$.
### Group exponentiation
* Idea: group operation is applied m times, where m is integer
* additive notation: $m \cdot g = \underbrace{g\ +\ ...\ +\ g}_{m\ times}$
* multiplicative notation: $g^m=\underbrace{g\ \circ\ ...\ \circ\ g}_{m\ times}$
**Theorem 6.1**
Let $G$ be a finite group with $m=|G|$. Then
$$
\forall g \in G:g^m=1
$$
**Fact 6.2**
Let $G$ be a finite group with $m=|G|, m > 1$. Then $\forall g \in G$ and any integer $x$ we have:
$$
g^x=g^{[x \bmod m]}
$$
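Group exponentiation $g^x$ can be computed with only $O(\log x)$ group operations via square-and-multiply; a Python sketch for multiplicative groups modulo $N$ (Python's built-in `pow(g, x, N)` does the same):
```python
def group_exp(g: int, x: int, N: int) -> int:
    """Compute g^x mod N with square-and-multiply: O(log x) multiplications."""
    result = 1
    base = g % N
    while x > 0:
        if x & 1:  # current exponent bit is 1
            result = (result * base) % N
        base = (base * base) % N  # square
        x >>= 1
    return result

if __name__ == "__main__":
    assert group_exp(7, 560, 561) == pow(7, 560, 561)
    # Theorem 6.1 in Z_11^* (a group of order m = 10): g^10 = 1 for every group element g
    assert all(group_exp(g, 10, 11) == 1 for g in range(1, 11))
```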
### The group $\mathbb{Z}_n^*$
$\mathbb{Z}_N^* = \{b \in \{1,...,N-1\}\ |\ gcd(b,N)=1\}$ with group operation "multiplication modulo N" is a group.
Restriction: $gcd(b,N)=1$ needed s.t. $b$ has multiplicative inverse modulo N.
**Fact 6.3**
Let $N > 1$ be an integer. Then $\mathbb{Z}_N^*$ is abelian multiplicative group. The order of $\mathbb{Z}_N^*$ is $\Phi(N) = |\mathbb{Z}_N^*|$.
**Examples:**
* $N=p$ prime: $\forall b \in \{1,...,p-1\}: gcd(b,N)=1 \Rightarrow \Phi(N)=p-1$
* $N=p \cdot q$ for $p,q$ distinct primes: $\Phi(N)=(p-1) \cdot (q-1)$
## Factoring and RSA assumption
Idea: Given $N=p\cdot q$ where $p,q$ are large distinct primes, it is hard to factorize $N$, i.e. to find $p,q$.
### Experiment: $Factor_A(n)$:
* Sample random n-bit primes $p,q$ and set $N =p \cdot q$
* Adversary $A$ is given $N$ and outputs $p', q' >1$
* Output 1 if $p' \cdot q' = N$; otherwise output $0$.
**Remark:** Generating random primes (howto)
* generate random $n$-bit number $x$
* Run Miller-Rabin test to check if $x$ is prime
**Assumption 6.4**
The factoring assumption says that $\forall$ PPT adversaries $A,\ \exists negl. s.t.:$
$$
Pr[Factor_A(n)=1] \le negl(n)
$$
In this case we say that factoring is hard.
*Factoring gives directly OWF but not practical PKC*
### RSA assumption
Define $f_e(x)=x^e \mod N$.
If $gcd(e,\Phi(N))=1$, then $f_e(\cdot)$ is a permutation over $\mathbb{Z}_N^*$, and $f_d(\cdot)$ with $d=e^{-1} \mod \Phi(N)$ is its inverse.
The *RSA assumption* states: without knowledge of $d$, computing the inverse of $x \mapsto x^e$ (i.e. recovering $x$ from $x^e \mod N$ for random $x$) is hard.
**How $(N, e, d) \leftarrow GenRSA(1^n)$ works**
* Sample random n-bit primes $p,q$ and set $N = p \cdot q$
* Choose $e > 1$ s.t.: $\ gcd(e,\Phi(N)) = 1$
* Compute $d = e^{-1} \mod \Phi(N)$
* Output $(N,e,d)$
### Experiment $RSA\!-\!inv_A(n)$
1. $(N, e, d) \leftarrow GenRSA(1^n)$
2. Choose uniform $y \leftarrow \mathbb{Z}_N^*$
3. $x \leftarrow A(N, e, y)$
4. Output $1$ if $x^e = y \mod N$; otherwise $0$
**Assumption 6.5**
The RSA-Assumption says that $\forall$ PPT adversaries $A\ \exists negl$ s.t.:
$$Pr[RSA\!-\!inv_A(n)=1] \le negl(n)
$$
**Lemma 6.6**
If factoring is easy, then the RSA problem is easy (i.e. the RSA assumption is broken). *[Proof see exercise]*
## Crypto assumptions in cyclic groups
### Cyclic groups
**Definition 6.7**
Let G be a finite group and $g \in G$. The order of $g$ is the smallest integer $i \ge 1$ s.t. $g^i = 1$.
**Remarks**
* Let $<\!g\!>=\{g^0, \dots, g^{i-1}\}$ be a subgroup of $G$, generated by g.
* If $\exists g \in G$ s.t. order of $g$ is $m=|G|$, then $<\!g\!> = G$. In this case $G$ is a cyclic group and $g$ is its generator.
* *Example:* Additive group $\mathbb{Z}_N$ for $N>1$ is cyclic with generator 1.
**Theorem 6.8**
If the order of group $G$ is prime, then $G$ is cyclic and all elements except 1 are generators.
### Discrete logarithm / Diffie-Hellman assumptions
Group generation algorithm $Gen(1^n)$ outputs:
* cyclic group G (description of group specifies how elements are represented as bit strings)
* $q = |G|$ with bit length n
* generator $g \in G$
**Discrete logarithm**
Let $g, h \in G$ with $<\!g\!>\, =G$.
The discrete log of $h$ wrt. $g$ is the unique element $x \in \mathbb{Z}_q$ s.t. $g^x=h$. We write $x=log_g(h)$.
**Experiment $DLog_{A, Gen}(n)$**
1. $(G, q, g) \leftarrow Gen(n)$
2. Choose $h \leftarrow G$ uniform
3. $x \leftarrow A(G, q, g, h)$
4. Output 1 if $g^x=h$; otherwise output 0
# Lecture 12
### **Assumption 6.9 (DLog)**
The Dlog assumption says that $\forall$ PPT adversaries $A$ $\exists negl$ s.t.
$$ Pr[DLog_{A, Gen}(n)=1] \le negl(n)
$$
## Diffie-Hellman problems
Two variants exist:
1. Computational Diffie-Hellman assumption (**CDH**)
2. Decisional Diffie-Hellman assumption (**DDH**)
Let $(G,q,g)\leftarrow Gen(1^n)$.
### CDH
Choose $x,y \leftarrow \mathbb{Z}_q$ uniformly and let $r = g^x, s = g^y$.
CDH is hard if, given $((G,q,g), r, s)$, it is hard to compute $g^{xy}$.
**Remark:** DLog is easy $\Rightarrow$ CDH is easy.
### DDH
DDH assumes it is hard to distinguish:
* $((G,q,g),g^x,g^y,g^{xy})$ and
* $((G, q, g), g^x, g^y, z)$
where $z$ is uniform in G. Formally:
**Assumption 6.10 (DDH)**
The DDH assumption states that $\forall$ PPT adversaries $A, \exists negl$ s.t.:
$$ |Pr[A(G,q,g,g^x, g^y, g^{x\cdot y})=1] - Pr[A(G,q,g,g^x, g^y, z)=1]| \le negl(n)
$$
where $(G, q, g) \leftarrow Gen(1^n)$ and $x, y, z \leftarrow \mathbb{Z}_q$ are uniformly chosen.
**Remark:** CDH is easy $\Rightarrow$ DDH is easy
### Examples of groups where DLog and CDH/DDH are hard
In general: Prime order groups (because $DLog/DH$ assumption is hardest in those groups).
#### Example 1
Prime order subgroup $G$ of $Z^*_p$ where $p = r \cdot q +1$ and $|G| = q$.
Let:
- $l$ be the bit length of $p$
- $n$ be the bit length of $q$
**NIST recommendation (2016)**
1. Minimum strength: 128bits (until 2030 and beyond):
$n=256, l=3072$
2. 256 bits security (until 2030 and beyond):
$n=512, l = 15360$
*256 bits of security = breaking a hash function with output length 256 bit*
***but:*** large $l$ means less efficient operations
#### Example 2: Elliptic curves
Groups consist of points over elliptic curve
**NIST recommendation (2016)**
1. 128bit security: 256bit size group
2. 256bit security: 512bit size group
## Key Agreement and Intro to Public Key Crypto
#### Motivation
We use a $(pk, sk)$ pair to reduce the amount of keys that need to be stored.
($n$ key-pairs instead of $n(n-1)$ keys)
### DH key exchange

*public, authenticated*: Only passive attackers. MITM is excluded.
**Correctness:** $k = k_A = k_B$
**Security:** "k is computationally indistinguishable from a random key given transcript trans of $\Pi$"
Experiment $KE^{eav}_{A,\Pi}(n):$
1. Parties run protocol $\Pi$ resulting into transcript $trans$ and key $k$
2. $b \leftarrow \{0,1\}$ If $b = 0$, set $\hat{k} = k$, otherwise $\hat{k} \leftarrow \{0,1\}^n$
3. $b' \leftarrow A(trans, \hat{k})$
4. If $b = b'$ output 1; otherwise output 0
**Definition 7.1**
A key agreement protocol $\Pi$ is secure if $\forall$ PPT adversaries $A, \exists negl$ s.t.:
$$Pr[KE^{eav}_{A,\Pi}(n)=1] \le \frac{1}{2} + negl(n).
$$
# Lecture 13: Key agreement
## DH - key agreement protocol
Group generation: $(G, q, g) \leftarrow Gen(1^n)$
| Alice $(G, q, g)$ | | Bob $(G, q, g)$ |
| ----------------------------------------- | ------------------- | ------------------------------------------- |
| $1)$ $x \leftarrow \mathbb{Z}_q, h_A=g^x$ | $\xrightarrow{h_A}$ | |
| | | $2)$ $y \leftarrow \mathbb{Z}_q, h_B = g^y$ |
| | $\xleftarrow{h_B}$ | |
| $3)$ Output $k_A=(h_B)^x$ | | $3)$ Output $k_B= (h_A)^y$ |
**Correctness:**
$k_A=(h_B)^x = g^{x\cdot y} = (h_A)^y = k_B$
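A Python sketch of the protocol with toy parameters: the order-11 subgroup of $\mathbb{Z}_{23}^*$ stands in for the group output by $Gen(1^n)$ (real deployments use groups with moduli of 2048 bits or more, or elliptic curves):
```python
import random

# Toy parameters (illustration only): G = subgroup of Z_23^* of prime order q = 11, generator g = 4
p, q, g = 23, 11, 4

def keygen():
    x = random.randrange(1, q)  # secret exponent
    return x, pow(g, x, p)      # (secret x, public share h = g^x)

if __name__ == "__main__":
    x, h_A = keygen()     # Alice sends h_A
    y, h_B = keygen()     # Bob sends h_B
    k_A = pow(h_B, x, p)  # Alice computes (g^y)^x
    k_B = pow(h_A, y, p)  # Bob computes   (g^x)^y
    assert k_A == k_B     # correctness: both derive g^(x*y)
```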
**Which assumption is needed for security?**
1. DLog hardness: necessary, otherwise it is possible to compute the key directly from the transcript
2. CDH hardness: necessary, otherwise it is easy to compute the key
3. DDH hardness: sufficient for security (Theorem 7.2)
**Theorem 7.2**
If DDH holds, then the DH-key agreement protocol is secure.
> :page_facing_up: for proof see notes
*Remark:* DH-Key Agreement is not secure against MITM attacks.
## Public Key Encryption
### Setting and Motivation
$m \rightarrow \overset{\overset{PK}{\downarrow}}{Enc()} \rightarrow c \rightarrow \overset{\overset{SK}{\downarrow}}{Dec()} \rightarrow m$
**Advantages of public key encryption:**
* key for encryption can be public
* one SK needed to securely communicate with many parties.
**Disadvantage:** Much slower.
### Definitions
**Definition 8.1:**
A public key encryption scheme (PKE) is a triple of PPT algorithms $(Gen, Enc, Dec)$ s.t.:
1. $Gen(1^n)$: probabilistic algorithm outputting $(pk, sk)$
2. $Enc(pk, m)$: probabilistic algorithm outputting $c \leftarrow Enc(pk,m)$
3. $Dec(sk, c)$: deterministic algorithm outputting $m = Dec(sk, c)$
**Correctness:**
Except with negl. probability over choice of randomness for key generation and encryption, we have $\forall m \in M: Dec(sk, Enc(pk,m)) = m$ with $(pk, sk) \leftarrow Gen(1^n)$.
**Security:**
Even given the public key $pk$, the adversary cannot distinguish encryption of $m_0$ from encryption of $m_1$.
Given PKE scheme $\Pi$ and adversary $A$ we define:
**Experiment $PubK_{A,\Pi}^{CPA}(n)$:**
1. $(pk, sk) \leftarrow Gen(1^n)$
2. $(m_0, m_1) \leftarrow A(pk)$ with $|m_0| = |m_1|$
3. $b \leftarrow \{0,1\}, c \leftarrow Enc(pk, m_b)$
4. $b' \leftarrow A(c)$
5. If $b=b'$ output 1; otherwise output $0$.
The public key is given to $A$ in the experiment, so he can choose $m_0, m_1$ depending on it. Since $A$ can encrypt arbitrary messages himself using $pk$, he gets an encryption oracle "for free" (CPA-security coincides with eavesdropping security for PKE).
**Definition 8.2:**
A PKE scheme $\Pi$ is CPA-secure if $\forall$ PPT adversaries $A\ \exists negl$ s.t.:
$$Pr[PubK_{A,\Pi}^{CPA}(n)=1] \le \frac{1}{2} + negl(n)
$$
**Theorem 8.3**
No deterministic PKE is CPA secure.
> Why? Encrypt $m_0, m_1$ and compare which one equals $c$.
### El Gamal encryption scheme based on DDH
El Gamal encryption is defined as follows:
1. $Gen(1^n)$
* $(G, q, g) \leftarrow GroupGen(1^n)$
* Choose uniform $x \leftarrow \mathbb{Z}_q$ and set $h = g^x$
* Output $(pk, sk)$ where $sk = (G, q, g, x),\ pk = (G, q, g, h)$
*(x is the secret)*
2. $Enc(pk, m)$ where $m \in G$
* Choose $y \leftarrow \mathbb{Z}_q$
* Output ciphertext $c = (c_1, c_2) = (g^y, h^y \cdot m)$
3. $Dec(sk, c)$
* Output $\tilde{m} = \frac{c_2}{(c_1)^x}$
**Correctness:**
Let $(c_1, c_2) = (g^y, h^y \cdot m)$ with $h=g^x$. Then: $\frac{c_2}{c_1^x}= \frac{(g^x)^y \cdot m}{(g^y)^x} = m$.
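A Python sketch of ElGamal over the same kind of toy prime-order subgroup as in the DH sketch above (the parameters are illustrative and far too small to be secure):
```python
import random

p, q, g = 23, 11, 4  # toy group: order-11 subgroup of Z_23^* (illustration only)

def gen():
    x = random.randrange(1, q)
    h = pow(g, x, p)
    return (p, q, g, h), (p, q, g, x)  # pk = (G, q, g, h), sk = (G, q, g, x)

def enc(pk, m):
    P, Q, G, h = pk
    assert pow(m, Q, P) == 1  # the message must be a group element
    y = random.randrange(1, Q)
    return pow(G, y, P), (pow(h, y, P) * m) % P  # (c1, c2) = (g^y, h^y * m)

def dec(sk, c):
    P, Q, G, x = sk
    c1, c2 = c
    return (c2 * pow(c1, -x, P)) % P  # m = c2 / c1^x (modular inverse needs Python >= 3.8)

if __name__ == "__main__":
    pk, sk = gen()
    m = pow(g, 7, p)  # encode the message as a group element
    c = enc(pk, m)
    assert dec(sk, c) == m  # correctness
    # Encryption is probabilistic: re-encrypting m usually yields a different ciphertext.
```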
**Security**
Basic idea:
1. $(g^x, g^y, h^y \cdot m)$ computationally indistinguishable from $(g^x, g^y, g^z \cdot m)$ for $z \leftarrow \mathbb{Z}_q$ (DDH)
2. $g^z$ is a "one-time pad" for $m$ (since $G$ is a group)
# Lecture 14
**Theorem 8.4**
If DDH assumption holds, then ElGamal is CPA-secure.
_Proof idea [Full proof see slides]:_
Define encryption scheme $\widetilde{\Pi}$, where $Enc(pk, m)$ works as:
- choose uniformly $y,z \leftarrow \mathbb{Z}_q$
- output $(c_1, c_2) = (g^y, g^{z}\cdot m)$
Construct a DDH distinguisher $D$ that, depending on whether its input is $(g^x, g^y, g^{xy})$ or $(g^x, g^y, g^z)$, simulates either $PubK_{A,\Pi}^{CPA}$ or $PubK_{A,\widetilde{\Pi}}^{CPA}$ for $A$. In $\widetilde{\Pi}$ the adversary succeeds with probability exactly $\frac{1}{2}$, and since the DDH assumption holds, the difference between the two cases is negligible:
$$ negl(n)\ge |Pr[D(G,q,g,g^x,g^y,g^z)=1] - Pr[D(G,q,g,g^x,g^y,g^{xy})=1]| \\ = |\frac{1}{2} - Pr[PubK^{CPA}_{A,\Pi}(n)=1]| \\
\Rightarrow Pr[PubK^{CPA}_{A,\Pi}(n)=1] \le \frac{1}{2} + negl(n)
$$
## RSA encryption
Recall: $GenRS\!A(1^n) \rightarrow (N,e,d)$ s.t.:
- $N = p \cdot q$ with $p,q$ prime
- $e > 1$ s.t. $\gcd(e,\Phi(N))=1$
- $d=e^{-1} \mod \Phi(N)$
Basic idea of _"Plain RSA"_:
- If you know $d$, you can recover $m$ from $c=m^e \mod N$
- without knowledge of $d$ RSA assumption implies computing $m$ from $c$ is hard.
### Plain RSA construction
- $Gen(1^n): (N,e,d) \leftarrow GenRS\!A(1^n)$. Output $pk=(N,e),\ sk=(N,d)$
- $Enc(pk, m):$ Output $c=m^e \mod N$
- $Dec(sk, c):$ Output $m=c^d \mod N$
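A Python sketch of plain ("textbook") RSA with tiny hard-coded primes; it demonstrates correctness of the construction, but as discussed next the scheme is not CPA-secure and the parameters are far too small:
```python
from math import gcd

def gen_rsa():
    # Toy GenRSA with fixed small primes (illustration only; real keys use ~2048-bit moduli)
    p, q = 1009, 1013
    N, phi = p * q, (p - 1) * (q - 1)
    e = 17
    assert gcd(e, phi) == 1
    d = pow(e, -1, phi)    # d = e^{-1} mod Phi(N) (needs Python >= 3.8)
    return (N, e), (N, d)  # pk = (N, e), sk = (N, d)

def enc(pk, m):
    N, e = pk
    return pow(m, e, N)  # c = m^e mod N

def dec(sk, c):
    N, d = sk
    return pow(c, d, N)  # m = c^d mod N

if __name__ == "__main__":
    pk, sk = gen_rsa()
    m = 42
    c = enc(pk, m)
    assert dec(sk, c) == m  # correctness
    assert enc(pk, m) == c  # deterministic => cannot be CPA-secure (Theorem 8.3)
```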
**Is Plain RSA CPA-secure?** $\Rightarrow$ *No.*
1. RSA-assumption only holds for "random m", but: In CPA game adversary chooses messages.
2. Many further attacks known against Plain RSA.
_Solution:_ Use Padding scheme.
**RSA PKCS \#1 v1.5**
Let $pk=(N,e)$ and $k$ be byte length of $N$
Encryption of $D$-byte long message $m$ as
$$[(\texttt{0x00}||\texttt{0x02}||\underbrace{r}_{random}||\texttt{0x00}||m)^e \mod N],
$$
where $r$ is a randomly generated $(k-D-3)$-byte string with none of its bytes equal to $\texttt{0x00}$.
For too short $r$: Bruteforce attack!
For $r$ approx. half of length of $N$: Conjectured to be secure.
*Alternative used in practice: RSA-OAEP.*
## Hybrid encryption
### Disadvantages of PKE:
* slower due to PK operations
* larger ciphertexts (due to expansion)
To mitigate disadvantages, use Hybrid encryption: Combine PKE with SKE.

Hybrid encryption $\Pi=(Gen, Enc, Dec)$:
* $Gen(1^n)$: Output $(pK, sK)\leftarrow P\!K\!E.Gen(1^n)$
* $Enc(pK, m)$:
1. $k \leftarrow S\!K\!E.Gen(1^n)$
2. Output $(c, c')=(P\!K\!E.Enc(pK,k), S\!K\!E.Enc(k, m))$
* $Dec(sK, (c, c'))$:
1. Compute $k = P\!K\!E.Dec(sK,c)$
2. Output $m = S\!K\!E.Dec(k, c')$
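A Python sketch of hybrid encryption: the toy textbook RSA from above stands in for the PKE (used only to encrypt a short symmetric key $k$), and XOR with a SHAKE-256 keystream stands in for the SKE; both stand-ins are assumptions made only for illustration:
```python
import hashlib
import random

# --- toy PKE: textbook RSA with tiny fixed primes (illustration only) ---
def pke_gen():
    p, q = 1009, 1013
    N, phi = p * q, (p - 1) * (q - 1)
    e = 17
    d = pow(e, -1, phi)
    return (N, e), (N, d)

def pke_enc(pk, m):
    N, e = pk
    return pow(m, e, N)

def pke_dec(sk, c):
    N, d = sk
    return pow(c, d, N)

# --- toy SKE: XOR with a SHAKE-256 keystream derived from the integer key (illustration only) ---
def ske_enc(k: int, m: bytes) -> bytes:
    pad = hashlib.shake_256(k.to_bytes(4, "big")).digest(len(m))
    return bytes(p ^ b for p, b in zip(pad, m))

ske_dec = ske_enc  # XOR stream: decryption = encryption

# --- hybrid scheme ---
def hyb_enc(pK, m: bytes):
    k = random.randrange(2, 100000)       # fresh symmetric key, smaller than N
    return pke_enc(pK, k), ske_enc(k, m)  # (c, c') = (PKE.Enc(pK, k), SKE.Enc(k, m))

def hyb_dec(sK, c, c_prime):
    k = pke_dec(sK, c)
    return ske_dec(k, c_prime)

if __name__ == "__main__":
    pK, sK = pke_gen()
    m = b"a long message, encrypted mostly with the cheap symmetric operations"
    c, c_prime = hyb_enc(pK, m)
    assert hyb_dec(sK, c, c_prime) == m
```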
### Theorem 8.5
If PKE and SKE are CPA-secure, then $\Pi$ is CPA-secure.
# Recap
**Not relevant:**
Extended Euclidean algorithm
## Introduction
- [x] Classical ciphers
- [x] Principles of modern crypto
## Perfect security
- [x] Definition of perfect security (and some variants)
- [x] one-time pad (and security proof)
- [x] theorem about keyspace
## Computationally secure encryption
- [x] negl and PPT adversary
- [ ] security definitions (EAV, CPA, CCA)
- [x] PRG
- [x] PRG $\Rightarrow$ EAV secure encryption
- [x] OWF Definition and combinations of OWFs
- [ ] PRF/PRP and differences
- [ ] PRF $\Rightarrow$ CPA-secure encryption
- [ ] Feistel network
- [ ] AES and other practical constructions (don't need to know how AES works, only that it exists and what it is)
- [ ] modes of operation: CBC, CTR, ECB | padding
## Message authentication (MAC)
- [ ] Definition (strong)
- [ ] Fixed length MACs
- [ ] MAC domain extension
- [ ] authenticated encryption
## Hash functions
- [ ] Definitions
- [ ] Domain extension (Merkle-Damgard-Construction)
- [ ] CRHF $\Rightarrow$ MAC
- [ ] Random Oracle model (know that it exists, no proofs)
- [ ] Merkle tree construction (not the proof, only how it works is important)
- [ ] Birthday paradox
## Number theory and assumptions
- [ ] know basic facts
- [ ] factoring + RSA (relation)
- [ ] discrete log + DH assumption (relation)
## Key agreement
- [ ] Definition + setting
- [ ] DH key exchange protocol
- [ ] Man-in-the-middle attacks
## Public Key Encryption
- [ ] SKE vs PKE
- [ ] CPA-security (no difference to eavesdropping security for PKE)
- [ ] El Gamal
- [ ] RSA
- [ ] Hybrid encryption
## Nice graphic that explains connection of different schemes

###### tags: `2018` `Kryptographie` `Zusammenfassung` `TU Darmstadt` `2019` `recap`
_Proud authors of this recap: B.M. and N.S._