---
tags: Master
---

# Applied Cryptography

[TOC]

## Introduction to cryptography

$(\mathbb{F}_2)^n$ represents the set of all possible $n$-length binary strings.

- 👁‍🗨 Reti Avanzate summary
- 👁‍🗨 Introduction to Network Security summary

**Cryptosystem:** is a 5-tuple $(E, D, M, K, C)$
- $E$: `encryption` $M \times K \rightarrow C$
- $D$: `decryption` $C \times K \rightarrow M$
- $M$: set of `plaintexts` (also $P$)
- $K$: set of `keys`
- $C$: set of `ciphertexts`

**Kerckhoffs's principle:** the algorithm should NOT be SECRET, **KEYS ARE THE ONLY SECRET!**

**Channel:** any physical or logical `medium of communication` from one user to another.

...very basic cryptography notions...

Move the problem from secrecy of cryptographic functions (`ciphers`) to **secrecy of keys**:
- ✔ easier to keep a key secret
- ✔ easier to change a key
- ❌ the attacker just has to find the key

👁‍🗨 **Cryptography** is rarely a solution for a security problem, it **is a translation mechanism:**
- `communication security` issue $\rightarrow$ `key management` issue $\rightarrow$ `computer security` issue
- **moral of the story:** keys are vulnerable data, `access control mechanisms` in the computer system `have to protect these keys`

**Attack types:**
- `CIPHERTEXT-ONLY`: if data is encrypted with the *same cipher and key*, given the **probability distribution** of the plain-texts, an attacker can get a lot of info just by observing some cipher-texts
- `KNOWN-PLAINTEXT`: the attacker has access to some plain-texts $m$ and their corresponding cipher-texts $c$; it is generally assumed that the attacker fully knows the cipher
    - can do **Key-Recovery attack:** the information collected and/or the time available could be insufficient for key recovery
    - can do **Global Deduction (**or Global Reconstruction**) attack:** good alternative when key recovery is infeasible, just find an algorithm equivalent to $\text{cipher}_{key}(m)$
- `CHOSEN-PLAINTEXT`: the attacker is able to get the ciphertext for any plaintext he
wants
    - same goals as for Known-Plaintext attacks

👁‍🗨 **Effort of the attack:** Ciphertext-only $>$ Known-plaintext $>$ Chosen-plaintext

### Perfect ciphers

#### According to Shannon

![](https://i.imgur.com/LZInVyk.png)

💥 An attacker cannot get any new information on the sent plain-text by looking at just one cipher-text.
- $\implies$ a `perfect cipher` is **unbreakable** if the attacker can use *only one* cipher-text

**Shannon's Theorem (**perfect cipher**):** given
- independent plain-texts and keys (`clever boy assumption`)
- any $n$-bit string may happen as a key or as a plain-text

$$
\text{then} \ \phi \ \text{is a perfect cipher} \iff
$$

1. the keys are perfectly random
2. for any plain-text $m$ and any cipher-text $c$, $\exists!$ key $k$ such that $\phi_k(m)=c$

👁‍🗨 If an attacker intercepts $>1$ cipher-texts (encrypted with the `same key`) $\implies$ Shannon's theorem no longer guarantees perfect secrecy.
- a **perfect cipher** is `not unbreakable in a practical sense` $\implies$ is not necessarily *ideal*

#### Vernam cipher (One-Time Pad)

XOR of the binary strings $m$ and $k$.

If the keys are perfectly random $\implies$ the Vernam cipher is perfect.

Perfect but not ideal:
- 👎 `randomness`: the keys must be chosen perfectly at random
- 👎 `key length`: must be the same as the message, plus **key exchange problem** (if the key is one-time and as long as the message, one may as well exchange the message directly)
- 👎 `one-time`: each key can be used only once, but it **flattens** the statistical distribution of the letters

👁‍🗨 Symmetric key algorithms are *approximations* of the Vernam cipher.
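A minimal Python sketch of the Vernam cipher described above. The helper name `vernam` and the sample messages are made up for illustration, and `secrets.token_bytes` is only an operating-system CSPRNG standing in for the "perfectly random" keys that Shannon's theorem assumes. The last lines show why the key must be one-time: XORing two cipher-texts produced with the same key cancels the key.

```python
import secrets


def vernam(data: bytes, key: bytes) -> bytes:
    """XOR data with a key of the same length (same operation for encrypt and decrypt)."""
    if len(key) != len(data):
        raise ValueError("the key must be exactly as long as the message")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"ATTACK AT DAWN"
key = secrets.token_bytes(len(message))  # fresh key, meant to be used once

ciphertext = vernam(message, key)
recovered = vernam(ciphertext, key)      # XOR with the same key undoes the encryption

# Key reuse: c1 XOR c2 == m1 XOR m2, so the key disappears from the result.
other = b"RETREAT AT TEN"
leak = vernam(ciphertext, vernam(other, key))
plain_xor = bytes(a ^ b for a, b in zip(message, other))
```

With a single cipher-text the attacker learns nothing, but `leak == plain_xor` holds for any key, which is exactly the situation where Shannon's guarantee no longer applies.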
## Stream ciphers

### Keystream Generation Basics

XOR the plain-text with the `keystream` generated from a **seed** (`secret` pseudo-random bit generator)
- key-stream should not be predictable
- seed should not be discoverable
- ✔ lightweight algorithms, bigger throughput
- Hardware oriented

![](https://i.imgur.com/aAH4aou.png)

- a state $S_n$ has its bits organized in **registers** (colored blocks)
- $(\mathbb{F}_2)^n$ is the set of all possible states of a binary vector of length $n$.
- **Warm-up phase:** apply the update function many times before outputting any bits of keystream (because the first bits would strongly depend on the state $S_0$)
    - allows to avoid a lot of practical attacks
- 👁‍🗨 Note: states are periodic since they are finite, thus the **update function** must be well designed in order to have large periods and avoid getting stuck in the same state
    - the **starting state** impacts the periodicity too

### Linear Feedback Shift Register (LFSR)

👁‍🗨 Used to **generate keystreams**.

At every clock, execute on the $n$-bit long register:
1. remove the rightmost bit and append it to the *keystream*
2. shift the remaining bits by $1$ to the right
3. insert in the first position of the register the XOR between a selection of the remaining bits

![](https://i.imgur.com/SHLtwdw.png)

**Fact:** given a maximum-length LFSR of $n$ bits and reading $2^n - 1$ consecutive bits of the $m$-sequence that it produces, there are $2^{n-1}$ runs (sequences of consecutive $0$s or $1$s) where:
- $1/2$ of the runs have length $1$
- $1/4$ of the runs have length $2$
- . . .
- $1/2^i$ of the runs have length $i$ (for $2 \le i \le n - 2$)
- there is only one run of $n - 1$ zeros and none of the runs has $n - 1$ ones
- there is only one run of $n$ ones and none of the runs has $n$ zeros.
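The clock steps above can be sketched in a few lines of Python. This is an illustrative toy, not a cipher: the function name `lfsr`, the 4-bit register, the starting state and the tap positions are all assumptions chosen for the example. The feedback is computed on the state before the shift and includes the bit being output, as in the usual Fibonacci LFSR; taps `(2, 3)` correspond to the primitive polynomial $x^4 + x + 1$, so the register is maximum-length.

```python
from itertools import islice


def lfsr(state, taps):
    """Yield keystream bits from a Fibonacci LFSR.

    state: list of bits, state[-1] is the rightmost (output) position.
    taps:  indices of the bits XORed together to form the feedback bit.
    """
    state = list(state)
    while True:
        out = state[-1]                  # 1. rightmost bit goes to the keystream
        feedback = 0
        for t in taps:                   # XOR of the selected bits (pre-shift state)
            feedback ^= state[t]
        state = [feedback] + state[:-1]  # 2.-3. shift right, insert feedback on the left
        yield out


# Hypothetical 4-bit register, non-zero starting state.
bits = list(islice(lfsr([1, 0, 0, 0], taps=(2, 3)), 30))
one_period = "".join(map(str, bits[:15]))
```

For $n = 4$ the period is $2^4 - 1 = 15$, and one period contains $2^{n-1} = 8$ runs, including a single run of $n - 1 = 3$ zeros and a single run of $n = 4$ ones, matching the fact stated above.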
⚠ LFSRs are `vulnerable to known-plaintext attacks` and they **must not** be used as key-stream generators (although they can be used as components).

### A5 Family (GSM)

**A5/1:** affected by a number of serious weaknesses, and its use is strongly discouraged, since there are practical attacks that can break the cipher.
- `228b` long **keystream**
    - after 228 bits of output, `A5/1` performs the key-loading again
- `64b` state: split in $3$ **registers** (19+22+23)
    - initialized to $0$
    - `tap bits` in blue
    - `clock bits` in orange
- **Initialization:**
    - `64b` key (**seed**): use the whole key to initialize each register
    - `22b` **IV** (Initialization Vector): loaded the same way as the seed
- **Update:** for each register, update if its **clock bit** agrees with the `majority`
    - $100$ clocks warm-up
- **Output:** XOR of the output of the three registers (the big one in the image)

![](https://i.imgur.com/XG0lRS1.png)

**A5/2:** extremely weak and it can be broken in real time with inexpensive equipment; it is therefore no longer supported by new mobile phones.
- similar to `A5/1`, but with $4$ **registers** and different rules for update and output
- `81b` state
- **Initialization:** by summing to the `most significant bit` in each register
- **Update:** $R_1, R_2, R_3$ are updated if their associated clock bit in $R_4$ follows the `majority`
    - $R_4$ is always updated
    - $99$ clocks warm-up
- **Output:** see image

![](https://i.imgur.com/fFypI36.png)

**A5/3:** the common standard for the new generation of mobile phones; it is considered secure and is based on the `block cipher` KASUMI
- **Initialization:**
    - `64b` key (**seed**) transformed into a `128b` **master key** for KASUMI ($K$ vector)
    - `22b` **IV** transformed into a `64b` **plain-text** for KASUMI ($A$ vector)
    - obtain ciphertext $c_0$
- **Update:**
    - `first` update ($c_0 \rightarrow c_1$): counter $n=1$, simply use $c_0$ as plain-text to obtain $c_1$
    - `other` updates: $c_n=\text{KASUMI}_K(c_0 \oplus c_{n-1} \oplus n)$
- **Output:** select from the cipher-texts as many bits as needed for generating a keystream

![](https://i.imgur.com/gG3cFh5.png)

### E0 (Bluetooth)

- `128b` for $4$ LFSR registers + `4b` for $1$ non-linear register
- **Update:**
    - is `regular` (no control bits) for the LFSR ones
    - for the non-linear register, given $z = \Big\lfloor \dfrac{s_1 + s_2 + s_3 + s_4 + 2c_3 + c_2 }{2} \Big\rfloor$ and $(z_1,z_0)$ its `binary expansion` (needs only two bits), see the right image
- **Output:** XOR between the output of each register

![](https://i.imgur.com/ke3bPYi.png)

### RC4 (Wireless)

RC4 is a **byte-oriented** stream cipher, i.e. the output $z$ is a byte.
- byte-oriented → `software implementations`
- bit-oriented → hardware implementations

$S$ is a register that contains `256B`, where each byte is seen as a number between 0 and 255 (space $(\mathbb{F}_{256})^{256}$)

- **Update:** details I don't care about...
- **Output:** the byte of $S$ which is in the position $S[i] ⊞ S[j]$
    - $x ⊞ y = (x + y) \ \text{mod}\ 256$

![](https://i.imgur.com/TE8SyOC.png)

👍 No practical attacks are known if `RC4` is used with a careful choice of the **key** and a suitable **warm-up** phase.
- be careful when constructing the `session key` from the seed and the initialization vector
- the `warm-up` phase is important here too in order to avoid certain attacks

## Block ciphers

The plain-text is split into multiple blocks; complex algorithms are used in order to avoid using the same key over multiple blocks.
- 👁‍🗨 (if using the same key, the order of the blocks can be switched by an attacker)
- ❌ complex ciphers, slower
- ✔ considered more secure
- Software oriented

💥 When we design a block cipher, we must choose components that do not commute.
- for example both AES and SERPENT would be trivially breakable if S-box, MixingLayer and AddRoundKey commuted

💥 If the cipher lacks non-linear components, it is easy to reconstruct even without the key.
- the **S-box is the only non-linear part** and its choice is crucial: using a weak S-box leads to a great number of attacks

💥 The **MixingLayer** is the only part that provides `diffusion`.

![](https://i.imgur.com/YzgqKbB.png)

**Ideal block cipher:** a block cipher with $k$-bit keys is called ideal if there is no attack on it that has a `computational cost` smaller than $2^k$ encryptions.
- no cipher is known to be ideal, but some existing ciphers might be
- i.e. the `brute force attack` is the only option

💥 **Rijndael S-box:** substitution box, it is a non-linear mapping of an `n`-bit input to an `n`-bit output.
- first `4b` of the input $\rightarrow$ **row** lookup
- last `4b` of the input $\rightarrow$ **column** lookup
- is usually fixed and carefully chosen to guarantee protection from linear and differential cryptanalysis

![](https://i.imgur.com/Q92LfTD.png)

### AES

**AES (**`Advanced Encryption Standard`**):** does 10, 12, or 14 rounds of transformations.
- vulnerable to the **biclique attack** (not ideal)

![](https://i.imgur.com/DZjoBsB.png)

**Algorithm:**
1. `KeyExpansion:` **round keys** are derived from the cipher key using the [AES key schedule](https://en.wikipedia.org/wiki/AES_key_schedule). AES requires a separate round key block for each round plus one more.
2. **Initial round key** addition:
    1. `AddRoundKey:` each byte of the state is combined with a byte of the round key using bitwise XOR.
3. 9, 11 or 13 **rounds**:
    1. `SubBytes:` a non-linear substitution step where each byte is replaced with another according to a lookup table
    2. `ShiftRows:` bytes of the last $3$ rows are cyclically shifted to the left, with increased shift for each row

       ![](https://i.imgur.com/xKGxFXB.png)
    3. `MixColumns:` a linear mixing operation which operates on the columns of the state, combining the four bytes in each column with an invertible linear transformation (a matrix)
    4. `AddRoundKey:` combine each sub-key with the corresponding state to obtain a new state
4. **Final round** (making 10, 12 or 14 rounds in total):
    1. `SubBytes`
    2. `ShiftRows`
    3. `AddRoundKey`

### DES

Block cipher based on `Feistel` encipherment.

**Feistel cipher:** key $K$ divided in multiple sub-keys, and plain text divided in two sub-blocks $L_0,R_0$, $n$ rounds

![](https://i.imgur.com/hHTakk3.png)

**DES:** Feistel with $16$ rounds

**3DES:** three times DES

### SERPENT

Maximizes parallelization for each round, encryption in `32 rounds`

![](https://i.imgur.com/OiwnCAU.png)

### PRESENT

A lightweight iterated block cipher, for small devices.
(blocks are small, so the cipher can be implemented in hardware on a budget)
- `31 rounds`
- also focuses on parallelization
- `sBoxLayer:` $\gamma_i : (\mathbb{F}_2)^{4} \to (\mathbb{F}_2)^{4}$ non-linear transformation, one 4x4-bit S-box repeated in parallel 16 times to cover the whole 64b text space
- `pLayer:` $\lambda: (\mathbb{F}_2)^{64} \to (\mathbb{F}_2)^{64}$ linear mapping defined by a table
    - must maximize the diffusion and be cheap to implement, PRESENT's one uses `binary permutation matrices`

![](https://i.imgur.com/9l0FTmw.png)

## HASH

Is a **1-way function**: easy to compute, hard to reverse.
- 👁‍🗨 mainly used for **data integrity**

**Can be used to protect the real password:**
- Instead of the password $x$, the value $f(x)$ is `stored in the password file`
- when a user logs in entering a password $x’$, the system applies the one-way function $f$ and compares $f(x’)$ with the expected value $f(x)$.

**Secure Hash function requirements:**
1. `Ease of computation:` $x \rightarrow f(x)$
2. `Compression:` from arbitrary input to fixed length output
3. `One-way:` the input $x$ is computationally infeasible to find given the output $f(x)$
4. `Weak collision resistance:` given $x$, it is computationally infeasible to find another input $x’ \ne x$ with the same hash output
    - (editor's note) it must be computationally infeasible to obtain the same digest by modifying the data
5. `Strong collision resistance:` it is computationally infeasible to find any two inputs that return the same hash output
    - many collisions usually exist, but **no one should be able to find one** given a good enough implementation

👁‍🗨 **Collision:** situation in which two inputs $x$ and $x’$ map to the same hash, $h(x) = h(x’)$.

**Ideal HASH function**: if it behaves like a random oracle (it is impossible to predict the output without using the function).

In the iterated hash functions we have essentially **three types of compression functions**:
- based on `block ciphers` (DES, 3DES, AES, . . .)
- based on particular `permutations` (SPONGE, . . .)
- based on `arithmetic primitives` (modular sums, etc.)

ITERATED EXAMPLE (Merkle-Damgård):

![](https://i.imgur.com/CK53xyk.png)

Examples of hash functions are:
- `SNEFRU`: collision attacks
- `MD2`: collision and preimage attacks
- `MD4`: collision, preimage and 2nd preimage attacks
- `MD5`: collision and preimage attacks
- `SHA1`: collision attacks
- `SHA2`: collision attacks on reduced versions
- `SHA3` (KECCAK): new standard

**MAC (**`Message Authentication Code`**)**
- is a keyed HASH, has 2 inputs (message and key)
- ✔ data integrity
- ✔ data origin authentication

...

## Public Key Cryptography

### Diffie-Hellman

**DLOG problem:** let $(G, \cdot)$ be a multiplicative group and $g \in G$; given $h = g^a$ for some (unknown) $a \in \mathbb{N}$, can we find $a$?
- for some groups DLOG is infeasible to solve
- example: given the multiplicative group $(\mathbb{Z}_{7} \backslash \{0\}, \cdot)$ and $g=2$, the powers of $g$ in that group are $\langle 2 \rangle = \{1, 2, 4, 8, 16, \dots
\} = \{1, 2, 4\}$, given $h = g^a = 16 \equiv 2 \pmod 7$, $\implies$ that $a$ is determined only modulo the order of $g$: $a \equiv 1 \pmod 3$ (e.g. $a = 1$ or $a = 4$)

**Diffie-Hellman:**
- based on the DLOG problem, used **for secret key exchange only**
- :x: does not provide authentication $\Longrightarrow$ `vulnerable to MITM attacks`
- 💭 `Reasoning:` exponents are easy, **discrete logarithms are hard** to compute (`DLP`)
    - $x \to g^x \ (\text{mod } p)$ is the `one-way function`

![](https://i.imgur.com/sY9E9uG.png)

- a middleman needs $a$ or $b$ to compute $K$, but they are never exchanged, and recovering them should mean solving an infeasible DLOG
- 💥 $g^a$ and $g^b$ should be bigger than $p$, so that the modular reduction actually takes place
- $p$ should be very large to mitigate brute-force attacks
- $p$ `prime` should make DLOG infeasible to solve
- `elliptic curves` are more used nowadays

### RSA

- :heavy_check_mark::heavy_check_mark: can be used for anything: `key exchange`, `digital signatures`, or `encryption` of small blocks of data
- 💭 `Reasoning:` products are easy, **factorizations are hard** to compute

**General scheme:**
1. use the public key of $A$ to encrypt a message for $A$, who is the only one who has the private key to decrypt it
2. use one's own private key to decrypt incoming encrypted messages

**Keys generation:**

![](https://i.imgur.com/hfyQ9pW.png)

- $p,q$ are private
- $k$ is the public exponent (also called $e$)

**Encryption:** given the public key $(e,n)$, the encrypted message is $c=m^e \ \text{mod} \ n$

**Decryption:** given the private key $(d,n)$, the decrypted message is $m=c^d \ \text{mod} \ n$

**Attacks:** given a public key $(e,n)$, most attacks try to factorize $n$ to find $p,q$, but the other methods are equivalent

#### Weak Keys

Integers that can be `factorized quickly` by specific algorithms are weak keys for RSA.
- 👁‍🗨 many RSA implementations do not check for weak keys

**Weak keys examples:**
- $m$ and $e$ so small that $m^e < n$
    - in this case it is easy to compute $\sqrt[e]{c}$, recovering $m$
- $|p - q| < n^{\frac{1}{4}}$
    - in this case there are fast algorithms able to find $p$ and $q$, for example `Fermat’s algorithm`
    - **Fermat** factorization: can quickly factorize an integer $n$ if it is the product of two integers that are close enough
- $d$ too small, $d < \frac{1}{3}n^{\frac{1}{4}}$
    - **Wiener** attack: allows $d$ to be recovered rapidly in this case (some minutes on a laptop)
    - works when a small value is chosen for the private exponent $d$
    - let $b = \log_2 n$. The **cost** of the attack is $b^4 + 6b^3 + 5b^2$
- $p - 1$ or $q - 1$ has only small factors
    - in this case $n$ can be quickly factorized by the `Pollard` $p - 1$ algorithm

Comparable strengths between symmetric encryption and DH-ECDH-RSA: (he says to memorize it)

![](https://i.imgur.com/yohXFbS.png)

![](https://i.imgur.com/suvcFyr.png)

## Cryptographic primitives

**Cryptographic primitive:** a set of low-level algorithms that are the building blocks used to construct cryptographic protocols
- `private key cryptography:` symmetric key, shared secret
- `public key cryptography:` used for symmetric key exchange because it is slower and needs bigger keys
- `hash functions:` integrity and more

**Derived primitives:**
- `MAC` (Message Authentication Code)
- `Digital signature`
- `PKI` (Public Key Infrastructure)
- `Secret sharing`

## Digital Signature

**Digital signature:** string associated to a message, relies on HASH and public key algorithms, assures:
- `authentication:` the signer identity can be verified
- `non-repudiation:` a signed message cannot be denied
- `integrity:` a message modification would invalidate the signature

Components:
- `Key generation` algorithm
- `Signing` algorithm
- `Verification` algorithm

### Diffie-Hellman Signatures

**DSA:** main
algorithm for digital signatures.
- it is based on the Diffie-Hellman algorithm and hence on the DLOG problem.
- in the standard version the algorithm requires a secure `hash function`

**ECDSA:** similar to DSA, but with elliptic curves

### RSA Signature

Only the procedure is covered, it is the usual signature stuff.
- less and less used, the ECDSA signature is much more popular

## Secret sharing

**Idea:** each one in a group has a part of the secret, the complete secret can be reconstructed only when a sufficient number of shares are combined.
- reduces the need to create backup copies of essential information
- `threshold access:` allows access to the secret only once a fixed number of players share their secrets

Components:
- `dealer:` has the secret, distributes the shares
    - a share can be extracted from the secret as a contiguous sequence of random length
- `players:` are given a different share each
- `share:` part of the secret

👁‍🗨 Can be used for **electronic voting**.

👁‍🗨 Can be used in **access control** systems.

### Threshold Schemes

$(t,w)$ **threshold scheme:** a method to share a secret $S$ among $w$ participants in such a way that:
- `threshold condition` ($t$): the secret can be recovered by a sub-group only if it has $\ge t$ players

**Perfect threshold scheme:** if no information on the secret can be obtained when the threshold is not satisfied.

👁‍🗨 `Basic scheme` is a $(t,t)$-threshold scheme.

#### Shamir's Threshold Scheme

Also called **Lagrange interpolation scheme:** $D$ distributes a secret $S \in \mathbb{Z}_p$, with $p$ `prime`, among $w < p$ players.
- 👍 `perfect:` fewer than $t$ shares give no information on $S$
- 👍 `ideal:` the size of one share is the size of the secret $S$
- 👍 `extensible to new players:` computing new shares does not affect existing ones
- 👍 `unconditionally secure:` does not rely on the hardness of other problems
- 👍 `allows varying levels of control:` a player can have multiple shares

**Initialization:** ($D$ does everything)
1. set a constant $a_0=S$
2. select $t-1$ random values $a_1,a_2,...,a_{t-1}$ with $0 \le a_i \le (p - 1)$
3. define the polynomial $f(x)=\sum\limits_{i=0}^{t-1}a_ix^i$

**Distribution:** ($D$ does everything)
1. select $w$ distinct random values $1 \le x_j \le (p - 1)$ with $1 \le j \le w$
2. $\forall (1 \le j \le w)$ compute $y_j=f(x_j)$
3. `give share:` $\forall (1 \le j \le w)$ give the pair $(x_j , y_j)$ to player $j$

👁‍🗨 $t$ participants can recover $f(x)$, and hence $S = f(0)$, using `Lagrange interpolation`.

#### Blakley’s Threshold Scheme

Just know that it exists
- 👎 `not ideal:` the size of the shares is $t$ times greater than the size of the secret

#### Mignotte's scheme

- based on elementary number theory
- 👎 `not perfect:` any player $P$ has some information on the secret, while an attacker without any share cannot recover this information

**Asmuth-Bloom scheme:** proposed improvement to Mignotte's scheme

## Randomness

Random values are needed for:
- **symmetric cryptography:**
    - `Shannon’s perfect cipher:` keys
    - `block cipher:` keys
    - `stream cipher:` seeds
- **public-key cryptography:**
    - `DH:` secret exponents $a,b$
    - `RSA:` prime factors $p,q$
        - actually **random pseudo-primes**: primality needs to be tested
- **digital signature** (DSA)

### Pseudorandom generators

`Pseudo-random generators` are more practical to use, and more popular.
**Pseudorandom bit generators (**`PRBG`**):** a `deterministic algorithm` that outputs a sequence of length $\ell$ from a truly random input of length $k \ll \ell$
- _input:_ the seed
- _output:_ pseudorandom bit sequence

👁‍🗨 `Deterministic` because same seed $\implies$ same output.

**Statistical tests to check randomness:** can only prove `non-randomness`, no finite number of tests can prove randomness.

**Random properties** to test:
- `uniformity:` $0$s and $1$s must occur with the same frequency
- `scalability:` any sub-sequence must have the same properties
- `consistency:` among different seeds

**Linear complexity:** $\ell(\text{sequence})$ is the length of the **smallest LFSR** producing the sequence.
- for $M$ maximal sequence generated by an `LFSR`: $\ell(M) \approx \log_2 |M|$
- for $M$ random sequence: $\ell(M) \approx |M|/2$

👁‍🗨 **Cryptographically secure PRBG** $=$ hard to predict $+$ statistically similar to random sequences.

## E-payment systems

**E-payment systems (**`EPSs`**):** processes and technologies enabling people to 🤑 `transfer money` 🤑, by means of integrated hardware and software systems.

### EFT/POS

**EFT/POS (**`Electronic Funds Transfer at Point Of Sale`**):**
- based on **payment cards** 💳, which are inserted in the POS
    - `PAN` (Primary Account Number): the card number
    - `CVV` (Card Verification Value): the short code on the back, you dummy
    - `EMV chip`: carries out cryptographic operations
    - `Magnetic-stripe`: stores card data

...holy moly how much obviousness...

**EMV (**`Europay, MasterCard, and Visa`**):** standard used for payment cards and terminals.
- card authentication is completely offline

![](https://i.imgur.com/quOwpm8.png)

**EMV transaction procedure:**
1. `Application selection:` the card tells the POS its supported circuits
    - each circuit corresponds to an _application_ loaded in the chip
2. `Initiate application:` the card tells the POS its supported _functions_ and the _data_ necessary for the transaction
3.
   `Offline Data Authentication:`
    - `SDA` **(**Static Data**):** *tampering* protection
    - `DDA` **(**Dynamic Data**):** *cloning* protection
    - `CDA` **(**Combined DDA/generate Application cryptogram**):**
4. `Processing Restriction:` mainly a compatibility check
    - at this point the POS is sure the card is valid
5. `Cardholder Verification:` signature or a **PIN** (online or offline)
    - at this point the cardholder has been verified as legitimate
    - `Offline Verification:`
        1. the POS sends the card a random message encrypted together with the PIN
        2. the card decrypts it with its private key and verifies the message
    - `Online Verification:` PIN verified by the _issuer_ (bank of the card)
        - ❗ PIN must be encrypted ❗ (only `AES` or `TDEA`)
        1. encrypted PIN sent to the TMS (terminal)
            - (symmetric key between POS and TMS)
        2. TMS decrypts and re-encrypts the PIN, then sends it to the Issuer
            - (symmetric key between TMS and Issuer)
6. `Terminal & Card Action Analysis:` POS and Card do some **risk analysis** before the final decision
    - the POS tells the Card one of the following: decline, offline approve, online approve
    - the Card can do what it wants (it can go online anyway), but if the transaction is declined, bye bye
7. `Completion:`
8. `Script Processing:`

**DDA Authn Procedure:**

![](https://i.imgur.com/7G5Z1z9.png)

- `SDA` (additional step): the card sends to the POS a random message **signed by its issuer's private key**

### Traditional Internet Payment

**CNP (**`Card Not Present`**):** the traditional Internet payment system.
- pay with the card without physically using the card, the customer needs to give:
    - his identity (account?)
    - PAN
    - CVV
    - expiry date
- 👎 the card-holder provides card information indirectly $\implies$ legitimacy is difficult to establish
- ❌ source of **fraud** and **chargebacks**
- 🙈 Amazon uses this system
- uses `TLS`, but 2 big problems:

    1.
       👎 **customer authentication:** cannot be verified, some mitigations are
        - 👎 `3D Secure:` consumers authenticate themselves directly with their card issuer and without the card
        - ✔ `Strong customer authentication:` two-factor authentication
    2. 👎 **card data stored remotely:** different security standard requirements are used
        - ✔ `PCI DSS:` the most complete standard

### Online Banking

Access to the online bank account through a `TLS/HTTPS` tunnel.

**Two-factor Authn:** improves security, online banking systems often use it
- **OTP (**`One-Time Password`**):** given via SMS, App, or OTP device

### Mobile Banking

**TLS problem:** a malware can add certificate exceptions
- **Certificate pinning:** the solution, `hard-code trusted servers` in the application code

**PayPal:** an online payments system
- `Payments API:` accept online and mobile payments
- `Payout APIs:` manage payments to multiple PayPal accounts
- `Vault API:` secure way to store credit cards
- `Identity API:` lets your customers sign in to your web site using their PayPal credentials
- `Invoicing API:` enables the e-commerce site to create, send, and manage invoices

### Mobile Payment

- **Card emulation:** via special POSes (NFC)
    - same EMV functionalities
- **Digital wallet:** e-commerce transactions (Google Wallet, PayPal,...)

NFC payments problems:
- securing card `data`:
    - **Secure Element (**SE**):** tamper resistant chip for secure storage, in `SIM`, `SD`, or `phone`

      ![](https://i.imgur.com/0Ioh1LC.png)
        - NFC can retrieve data directly from the SE (so no application can interfere)
    - **Host-based card emulation:** host OS and apps are involved in the transaction

      ![](https://i.imgur.com/7EbBaaL.png)
        - not so secure storage, processing managed by an app (e.g.
Google Wallet) and `cardholder data stored in cloud`
- protect NFC `communication`:
    - `SSL` tunnel for confidentiality
    - `OAuth 2.0` as authorization layer

### PCI Security Standards

`PCI Security Standards Council:` formed by the major payment card brands (VISA, MasterCard,...) **to improve payment account security**.

![](https://i.imgur.com/EWsYpp1.png)

Main standards:
- **PCI DSS (**`PCI Data Security Standard`**):** created to increase controls around cardholder data **to reduce credit card fraud**
    - applies to all entities involved in **payment card processing** and in the cardholder data storage
    - $12$ minimum requirements:
        1. (network) set up a **firewall**
        2. (network) no default vendor-supplied passwords for systems
        3. (cardholder data) **encrypt** stored cardholder data
        4. (cardholder data) encrypt cardholder data transmissions (**TLS**)
        5. (manage vulnerabilities) set up anti-malware, **anti-virus**
        6. (manage vulnerabilities) develop and maintain secure systems and applications
        7. (access control) set up **access control over** stored cardholder data
        8. set up an **authentication** layer for system components
        9. (access control) restrict physical access to cardholder data
        10. **monitor network** traffic
        11. test networks
        12. maintain an Information **Security Policy**
    - `Network segmentation:` not required but strongly recommended
- **PA DSS (**`Payment Application Data Security Standard`**):** to provide the definitive data standard _for software vendors_ that develop payment applications

#### Merchant Responsibilities

Merchants who sell their goods can:
- develop their **own e-commerce** payment software (PCI DSS requirements apply)
- use a **third-party** developed solution (the merchant is still responsible for PCI DSS compliance)
    - E-commerce Payment Gateway (forwarding transaction processing)
    - Web-hosting Provider (website outsourcing)

Web application security: presentation layer, processing layer, data-storage layer.

...
## Zero-Knowledge Proofs

(from video)

**Quadratic residue:** given $n=p_1p_2$ product of two primes, $q$ is called a quadratic residue modulo $n$ if $\exists x$ integer such that:

$$
x^2\equiv q \ (\text{mod} \ n)
$$

**Quadratic residue problem:** given $n$, product of two different unknown primes $p_1,p_2$, it is hard to know if $q$ is a quadratic residue.
- NP?

**Zero-Knowledge proof:** I want to prove that a number I know is a quadratic residue mod $n$ without giving away the data needed to verify it ($p_1,p_2$)
- **Moral:** it is trivial to prove that one possesses knowledge of certain information by simply revealing it; the challenge is to prove such possession `without revealing the information itself` or any additional information.

**Fiat-Shamir protocol** for proving quadratic residues:

![](https://i.imgur.com/p2HIWIu.png)

- 👁‍🗨 this works because the product of quadratic residues is still a quadratic residue.

**Interactive Proof System:** ...

## Quantum Cryptography

Exploits properties of quantum mechanics, like:
- `quantum superposition`
- `quantum annealing` for energy minimization problems

![](https://i.imgur.com/4Wp5Fvt.png)

**Shor's algorithm (**1994**):** integer factorization in time polynomial in $\log n$ (the best known classical algorithms are super-polynomial)
- requires a universal quantum computer
- :warning: **Can break most PK cryptosystems:** `RSA`, `Diffie-Hellman`, ECC, DSA, ECDSA,…

**Grover's algorithm (**1996**):** search in $O(\sqrt n)$ (it is linear on classical computers)

**New PK Cryptosystems based on:**
- lattices
- error-correcting codes
- multivariate polynomials
- hash functions
- elliptic curve isogenies
- ...

NIST did two rounds for selecting the best algorithms.

## Authentication I

**Authentication:** process of verifying a `user’s identity`.
- user identity is a parameter in `access control decisions`
- user identity is recorded when `logging` security relevant events in an **audit trail**

👁‍🗨 It is not always necessary or desirable to base access control on user identities.

### User Authn

**User authentication:** the process of verifying a `claimed user identity`.
- entering `username`: you **announce** who you are
- entering `password`: you **prove** that you are who you claim to be

### Passwords overview

#### Bootstrapping

How do you bootstrap a system so that the password ends up in the right places, but nowhere else?
- password delivery might be **intercepted**
- password may be delivered to an **impersonator**

**In person** withdrawal should be secure enough.

**Remote delivery:**
1. Do not give the password to the caller! **Call back an authorized phone number.**
2. Send only passwords valid for a `single log-in request`. **User has to change it to a password not known to the sender.**
3. Send **mail by courier** with personal delivery.
4. Request **confirmation on a different channel** to activate the user account. (`SMS` in authN has been recently deprecated by [NIST](https://www.nist.gov/))

👁‍🗨 Procedures for **resetting passwords** are the same as listed previously, but the reaction **should be instant**.

#### Attacks Overview

- **Exhaustive search (**`Brute force`**):** try all possible combinations of valid symbols up to a certain length
- **Restricted name space (**`popular passwords`, `dictionary attack`, `exploit info about victim`, …**)**

#### Mitigation

**Some random tips:**
- set a password: well, duh.
- change the default password: well, duh.
- avoid guessable passwords: well, duh.

**New `NIST` password limitations:**

- forbid common ones
- no password hints, or knowledge-based authn
- limit password attempts

### Phishing vs Spoofing

In **phishing** and **spoofing** attacks, a party voluntarily sends the password over a channel, but is misled about the end point of the channel.

- **authn is unilateral:** no guarantees about the destination identity

| SPOOFING :eyes: | PHISHING :fishing_pole_and_fish: |
| --- | --- |
| Hacker tries to steal an identity to act as another individual. | Hacker tries to steal the sensitive information of the user. |
| It doesn’t require fraud. | It is carried out by means of **fraud**. |
| **Information is not stolen.** | Information is stolen. |
| Spoofing can be part of phishing. | Phishing cannot be part of spoofing. |
| Requires malicious software to be downloaded on the victim's computer. | No such malicious software is needed. |
| Spoofing is basically done to **get a new identity**. | Phishing is done to **get secret information**. |
| **Types:** `IP Spoofing`, `Email Spoofing`, `URL Spoofing` etc. | **Types:** `Phone Phishing`, `Clone Phishing` etc. |

👁‍🗨 **Login spoofing:** particular attack on OS login screen.

### Digital Certificates

:warning: `Digital signature problem:` **why trust who the signer claims to be?**

**Digital Certificate:** certifies `binding between pub-key and entity` (person, hardware, etc).

- signed by **TTP (**`Trusted Third Party`**):** should be trusted

**X.509 Certificate structure:**

![](https://i.imgur.com/pkRWurV.png)

**Main elements:**

1. Issuer (CA) - can be trusted?
2. Subject (User)
3. Subject Public Key (`bound to subject`)
4. CA Digital Signature (`validates the certificate`, until expiration)

## PKI (Public Key Infrastructure)

System used by `TTP` for validating and distributing certificates.
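
What "validating a certificate" involves can be sketched with a toy CA: it signs the binding between a subject and a public key, and a relying party checks the expiration date and the CA signature. This is only an illustration under loud assumptions: the field layout, the names (`Certificate`, `issue`, `is_valid`, "Toy CA") and the unpadded textbook RSA over two Mersenne primes are all hypothetical, and none of it is real X.509 or a secure signature scheme.

```python
import hashlib
from dataclasses import dataclass

# Toy RSA key for the CA, built from two Mersenne primes. Illustration only:
# real CAs use much larger keys and padded signature schemes.
P, Q = 2**61 - 1, 2**89 - 1
CA_N = P * Q
CA_E = 65537
CA_D = pow(CA_E, -1, (P - 1) * (Q - 1))  # private exponent, known only to the CA


@dataclass(frozen=True)
class Certificate:
    issuer: str
    subject: str
    subject_public_key: str
    not_after: int       # expiration as a Unix timestamp
    signature: int = 0   # CA signature over the fields above


def digest(cert: Certificate) -> int:
    """Hash the to-be-signed fields and reduce the result modulo the CA modulus."""
    tbs = f"{cert.issuer}|{cert.subject}|{cert.subject_public_key}|{cert.not_after}"
    return int.from_bytes(hashlib.sha256(tbs.encode()).digest(), "big") % CA_N


def issue(subject: str, public_key: str, not_after: int) -> Certificate:
    """CA side: sign the binding between subject and public key."""
    unsigned = Certificate("Toy CA", subject, public_key, not_after)
    return Certificate(unsigned.issuer, subject, public_key, not_after,
                       signature=pow(digest(unsigned), CA_D, CA_N))


def is_valid(cert: Certificate, now: int) -> bool:
    """Relying-party side: reject expired certificates, then check the CA signature."""
    if now > cert.not_after:
        return False
    return pow(cert.signature, CA_E, CA_N) == digest(cert)


cert = issue("alice", "alice-public-key", not_after=2_000_000_000)
print(is_valid(cert, now=1_700_000_000))
```

Changing any signed field (for example the subject) changes the digest, so the signature no longer verifies; the relying party only needs the CA public key (`CA_N`, `CA_E`) to check this.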
👁‍🗨 **Binds public keys** with respective identities of entities.

![](https://i.imgur.com/qhzmcFL.png)

**Components:**

- **CA (**`Certificate Authority`**):** authorizes sub-CAs
    - `Subordinate CA:` signs and hands out certificates
- **RA (**`Registration Authority`**):** verifies certificate requests
- **VA (**`Validation Authority`**):** verifies certificate validity
    - optional (`CA can do this job`)
    - uses `CRLs` updated from CAs
- **CPS (**`Certification Practice Statement`**):** how good security is
- **CRL (**`Certificate Revocation List`**):** revoked certificates

## SSL/TLS

- **SSL (**`Secure Sockets Layer`**):** way to secure communications between the client and server on the web, **deprecated**
- **TLS (**`Transport Layer Security`**):** evolution of SSL

**SSL/TLS** provides:

- `identification` (Digital Certificate)
- `authentication` (Handshake encrypted with each other’s public key)
- `confidentiality` (shared symmetric key)
- `integrity` (HMAC on every message, hash check at the end of the handshake)

**Components:**

- `Handshake protocol:` see [**`Reti Avanzate’s PDF`**]()
    1. agree on algorithms and protocol version
    2. authenticate both parties (optional)
    3. use PKC to establish symmetric key
- `Change Cipher Spec Protocol`
- `Alert Protocol`
- `Record Protocol:`

    ![](https://i.imgur.com/y7C9uHn.png)

### TLS (<1.3?) Vulnerabilities

- **SSLStripping:** (`MITM`) read the traffic between two entities communicating via SSL/TLS
    - 💭 Idea: `downgrade connection` so that encryption is no longer used!
      (strip the “s” from all the intercepted HTTPS traffic)
    - 💡 Solution, **HSTS (**`HTTP Strict Transport Security`**):** the server tells clients to use HTTPS only, but the first client connection may still be HTTP because the client does not know the policy in advance
        - **HSTS Preload Lists:** solve this first-connection problem, they list the websites that want HTTPS only

    ![](https://i.imgur.com/q9DwAZC.png)
- **Sweet32:** birthday attack on 64-bit block ciphers such as `3DES`: in some circumstances small portions of plaintext can be recovered, because block collisions are expected after about $2^{32}$ blocks
    - 1-2 days to recover a session cookie
    - 💡 Solution: disable 3DES (removed in v1.3)
- **3-shake:**

    ![](https://i.imgur.com/sVATVSI.png)
- **CRIME:** when using DEFLATE compression. Deflate removed in v1.3

## SSO (Authn)

Allows users to access multiple apps through a **single authentication act**.

- **only one password needed:**
    - :heavy_plus_sign: easier for users
    - :heavy_minus_sign: weak point (game over if it gets hacked)

**Roles:**

1. **User:** requests service from SP
2. **Service provider (**`SP`**):** requests user identity from trusted IdP
    - `access control` on own services/resources
    - can `trust many IdPs`
3. **Identity provider (**`IdP`**):** asserts user identity to SP

👁‍🗨 Most SSO systems rely on `HTTP` and `cookies`.

Need balance between **usability** and **security**:

![](https://i.imgur.com/DnYE5LK.png)

**Solution:** `SSO + Multi-Factor Auth`, combine for maximum security

- something you know (`password`)
- something you own (`card`)
- who you are (`biometrics`)

**Contextual authentication:**

- `consider context` when user is logging in
- login from an **unusual** place, **unusual** hour and **unusual** activity $\Longrightarrow$ **high risk score** (`warn user`)

SSO is a property of `access control` of multiple related, yet independent, software systems.
- **Multiple systems:** `multiple sign-on`
    - :x: burden for administrators and users
    - :x:/:heavy_check_mark: more security domains $=$ more sign-ons
- **Single Sign-on**
    - `less time` spent re-entering passwords
    - lower IT costs (fewer help-desk calls for forgotten passwords)
    - :heavy_check_mark: more complex passwords
    - :heavy_check_mark: shared sessions
    - :heavy_check_mark: only one password to remember
    - :x: less security, one password for everything

**SSO Types:**

- **Pseudo-SSO**
    - separate authn to each service
    - `client software manages credentials`
    - login hidden from user
- **Proxy-based SSO**
    - **pseudo-SSO** but implemented in a proxy
    - `proxy manages user credentials`
    - login hidden from client
- **True SSO**
    - user authenticates to **one authn service**
    - this authn service `asserts user identity to other services`
- **Federated SSO**
    - single authn between multiple separated administrative domains
    - **Federation:** is a group of resources
        - sharing a `common policy`
        - managed as a `single entity`

## SAML (authn+authz exchanging standard)

**SAML (**`Security Assertion Markup Language`**):** standard for exchanging `authentication` and `authorization` data between parties.

- set of **XML-based protocol messages** for:
    - `user authentication`
    - `attribute information`
- defines **3 roles:**
    1. **User:** requests service from SP
    2. **Service provider (**`SP`**):** requests user identity from trusted IdP
        - `access control` on own services/resources
        - can `trust many IdPs`
    3. **Identity provider (**`IdP`**):** asserts user identity to SP
        - does user `authn`
            - authentication method not specified by SAML
        - provides `authz` info
        - could provide `SAML assertions to many SPs`

🎯 **SAML goal:** promote `web-browser SSO` interoperability across multiple security domains.
(simplify federated authentication and authorization)

- SSO is easy within one security domain (`cookies`, ….), but using assertions, **SAML offers SSO across domains** (cookies cannot do that)
- SAML **does not** perform authn
- SAML **does not** grant access to resource X

**Advantages**

- **Platform Neutrality:** security layer is more `independent` (external framework)
- **Loose coupling of directories:** no need to keep and sync user info through directories
- **Improved online experience for end users:** SAML enables `Single Sign-On` (auth to `IdP`)
- **Reduce administrative costs and risks for service providers:**
    - reduce the cost of maintaining account information
    - `tracking of users done by IdP`

### SAML Components

There are $6$ components:

![](https://i.imgur.com/hVQAv0r.png)

**1. Assertions**

Package of information supplying one or more statements made by a SAML authority.

- 3 kinds of `assertion statements`:
    - **Authentication**
        - `tells how` the subject was authenticated *(*💭*SAML DOES NOT AUTHENTICATE)*
        - typically generated by the `IdP`
    - **Attribute**
        - subject is `associated` with the given attributes
    - **Authorization Decision**
        - a request to allow a subject `to access a resource` has been granted or denied

`Authentication assertion example:`

![](https://i.imgur.com/OlZLjGn.png)

**2. Protocols**

A number of **`request/response protocols`** that allow `SPs` to:

- request one or more `assertions` from a SAML authority
- ask an `IdP to authenticate` a principal and return the corresponding assertion
- request the `registration of a name identifier`
- request to `end the use of an identifier`
- request the near-simultaneous logout of a collection of related sessions (`single logout`)

**3. Bindings**

`Mapping of SAML protocols` onto standard messaging and communication protocols (transport mechanisms).

- `HTTP` redirection binding
- `SOAP` messages binding

**4. Profiles**

Combination of SAML `assertions`, `protocols`, and `bindings` to support a defined use case (*application*).
- **goal:** `++interoperability` with less flexibility
- **Web Browser SSO Profile:** most important use case

    ![](https://i.imgur.com/O9RaokP.png)
    - SP-Initiated SSO is more common
    - 👁‍🗨 **authentication is always done by IdP**, SP simply redirects
    - 👁‍🗨 from the examples, the `authn statement` (SAML assertion) <u>CAN</u> pass signed through the User

**5. Metadata**

Defines how to express `configuration` and `trust-related data`.

- identifies actors for profiles. (`SSO IdP`, `SSO SP`, `SSO Requester`, etc…)
- data that must be agreed on between system entities:
    - supported roles
    - identifiers
    - URLs
    - certificates and keys
    - supported profiles

**6. Authentication Context**

Permits the augmentation of `assertions with additional information` pertaining to the authentication of the Principal at the IdP.

- 🧐 IdP may need extra info for doing an assertion with *“confidence”*
- details of multi-factor authentication can be included
- concept of **Levels of Assurance (**`LOA`**):** subdivide authn methods in levels **based on strength**
    - 3 **Identity** Assurance Levels (`IAL`): identity proofing robustness
    - 3 **Authentication** Assurance Levels (`AAL`): strictness of authentication
    - 3 **Federation** Assurance Levels (`FAL`)

### SAML Security

How does the relying party `trust received assertions?`

- `MITM` could exploit assertions

**Security mechanisms:** described for each SAML binding in the standard (`non mandatory❔`)

- relying party and asserting party **trust with PKI** (`Public Key Infrastructure`)
    - for message `integrity` & `confidentiality` $\longrightarrow$ use **SSL/TLS**
    - for `assertion messages` between relying and asserting parties $\longrightarrow$ SSL/TLS + **mutual authentication**
    - for assertion message delivered through User (`integrity`) $\longrightarrow$ use **XML Signature** (HTTP POST binding, …)
        - (generally from IdP to SP)
- **XML canonicalization:** needed to obtain the same signature from different but logically equivalent XML
    - attribute order should have no impact on the signature, and neither should comments...

### SAML Privacy

**Problems:**

- `user’s identity` data management
- `inhibit inappropriate correlations` of user actions

**Privacy mechanisms**

- **pseudonyms:** between IdP and SP
    - set a `different user identifier for each SP` $\longrightarrow$ protection from inappropriate correlations
- **one-time/transient identifiers:** SPs won’t be able to recognize multiple accesses from the same user
- **Authentication Context:** user authenticated at a sufficient assurance level (not more than necessary)
- ❔ SAML allows the fact of a **user consenting to certain operations** to be expressed between providers

### Conclusions

- with SAML, **SPs don’t have to care about authentication**
    - :heavy_minus_sign: administration burden
    - :heavy_plus_sign: `interoperability`, `usability`, `security` and `privacy`
- **SAML profiles are useful** use case scenarios
    - **Web SSO** most adopted
- SSO has increasing importance and is gaining wider and wider adoption
- SAML is an ideal **starting point to build infrastructures** for digital Identity Management (authentication)
    - key enablers in an increasingly digital world (`SPID`, `eIDAS`)
    - first line of defense against attackers

## OAUTH 2.0 (Authz delegation)

Is a **delegation protocol** that lets users allow applications/servers to access resources on their behalf (usually temporary access).

:warning: **Problem:** services may need user authz for other services, but sharing credentials is risky.

:heavy_check_mark: **Solution (**OAUTH**):** specific access rights can be delegated through **tokens**.
- token is bound to **scopes**: strings that define `authorization details`
- can `expire` or be `revoked`
- presented for every request
- `opaque to client` (transparent to authorization server and resource server, by lookup in AS database), no token format
- requested scopes must be accepted by the user
- authz delegation **only for software**

![](https://i.imgur.com/FSsAFFw.png)

- defined only for `HTTP`
- relies on `TLS` for securing messages
- not a single protocol, `several flows`

### OAUTH Flow Overview

1. **Authentication:** ⚠`pre-requisite` for OAUTH, user logs into a site
    - 👁‍🗨 `OAUTH does authz`, NOT authn
2. **User Consent:** user decides what to share with third parties
    - authz server will create a `token` based on the user decisions
3. **Get OAUTH Token:** token gets delivered to the destination app
    - defines access rights on user’s resources
    - is **opaque** to the client (`not intended to be read`)
    - no strict format
4. **Access Resources:** the app can use the received token on the resource server to access restricted data

### OAUTH Entities

- **Resource Owner**
    - usually a person
    - can access resources and can grant access to his resources
- **Protected Resource**
    - `SP` (Service Provider) protecting resources
    - shares resources on owner request
- **Client (**APP**)**
    - wishes to access owner resources
    - `needs permission`
- **Authorization Server**
    - `generates tokens`
    - authentication point
    - authorization manager

### OAUTH Channels

- **Front channel:** uses `HTTP redirects through the web browser`, no direct connections
- **Back channel:** uses direct HTTP connections between components, the `browser is not involved`

### OAUTH Flows

- **Authorization code flow** (`web app`)
    - `front channel + back channel`
    - 👁‍🗨 **Why an authorization code** rather than a token directly? What travels through the front channel cannot be trusted, so the client trades authz code + client info for the token on the back channel.
    - :shield: **For security:** use short-lived access tokens $+$ long-lived refresh tokens.
    ![](https://i.imgur.com/wzRZykQ.png)

    ![](https://i.imgur.com/vrifYVT.png)

    ![](https://i.imgur.com/h8OOnfS.png)
- **Implicit flow** (`in-browser app`)
    - only `front channel` (messages have to pass through the browser)
    - :shield: **For security:** don’t use refresh tokens, the client can always repeat the authorization process as long as the browser session is alive.

    ![](https://i.imgur.com/VTqyxYO.png)
- **Resource Owner Password** (`trusted legacy client`)
    - user gives credentials to client $\longrightarrow$ client acts as resource owner
    - only `back-channel` (no browser intermediate)
    - :warning: high **risk of phishing!**
    - :shield: **For security:** use this flow only if client and authorization server are controlled by the same entity.

    ![](https://i.imgur.com/LQqexz6.png)
- **Client credentials** (`analogous to API keys`)
    - no explicit resource owner
    - only `back-channel` (no browser intermediate)
    - :shield: **For security:** don’t use refresh tokens, they are useless here: the client can always ask for another token using its own credentials.

    ![](https://i.imgur.com/CumNUq2.png)

### OAUTH Attacks

- **Authorization code intercept:** with a malicious app and the OAuth client credentials, an attacker can successfully ask for a token.
    1. Attacker manages to register a **malicious application** on the `client device` and registers a custom `URI` scheme that is also used by another application
    2. The attacker has access to the OAuth 2.0 **client_id** and **client_secret**
        - If client is `mobile app`, then nothing can be assumed to remain *“secret”*

    ![](https://i.imgur.com/ffnatB2.png)
    - **Solution:** one-time `PKCE` (Proof Key for Code Exchange) that a client uses **to prove that it initiated the flow** instead of simple bearer tokens.
    - `JWT` used as OAuth 2.0 tokens with **POP (**`Proof of Possession`**)** mechanisms
        - **the client proves possession** of the private key belonging to a public key by sending a JWT signed with its private key, along with the corresponding public key and the authorization code, in order to get the token
- **Attacks on JWT (**`JSON Web Token`**):** provides **integrity + non-repudiation** (the latter only with asymmetric signatures)
    - `JWT is abstract`, JWS implements **Signature**, JWE implements **Encryption**

    ![](https://i.imgur.com/i3tJYo1.png)
    - JWT components (or JWS?):
        - Header: identifies algorithm used to sign (e.g. "HS256" means HMAC with SHA-256)
        - Payload: information actually used for access control
        - Signature: e.g. `= HMAC-SHA256(base64urlEncode(header) + '.' + base64urlEncode(payload), secret_key)`
    - Incorrect implementations create **vulnerabilities**:
        - `Change the algorithm type`:
            1. if the algorithm can be set to **none**, attackers can leave the signature empty $\implies$ **easy forging**
            2. **switching** between RSA and HMAC: the attacker changes the algorithm to HMAC and signs the token using the server's public key as the HMAC secret; a naive verifier then checks the token with that same key
        - `Provide a non-valid signature`: if the signature is not verified, just put something
        - `Brute-force the secret key`: if the key is not too complex and some info is available
        - `Leak the secret key`: exploit vulnerabilities to get the key directly from storage
        - `Key ID manipulation`: **optional header field**, allows specifying a file to be used as key
            - `Directory traversal`: can retrieve a sensitive key file from the **file system**, or can use a **public file** as key
            - `SQL injection`: can also be used to get a key from a DB **if Key ID parameter is not sanitized**
            - `Command injection`: similar to SQL injection, but with system calls

**OAUTH Authentication?** Authentication is all about the user and his presence with the application; with `refresh tokens` an app can obtain valid tokens without the user being authenticated.
- out of standard scope

⚠OAuth defines **no specific token format**, defines **no common set of scopes for the access token**, and **does not address how a protected resource validates an access token**.

## OpenID Connect (OIDC)

**Open standard** that defines an inter-operable way to **use OAuth 2.0 to perform user authentication**.

- is an `identity layer` over OAuth

![](https://i.imgur.com/XlZ7GcY.png)

🎯**Goal:** `remote` authentication (with `SSO` experience), the client/app/server can verify the identity of the user based on the authentication performed by an authorization server.

- in technical terms, OpenID Connect specifies a **RESTful HTTP API**, using JSON as a data format

**OAuth vs OpenID Connect flow:**

- client gets **OAuth** token for `resource access`
- client gets **ID token** (+ OAuth token?) for user `authn` (in similar flow to OAuth)
- in both cases the client wants an authorization code from a given OAuth Server (AS)
    - **OAuth:** user authenticates to AS only for asking an access token for a client app/server
        - user is not authenticated to the client app/server (`pseudo-authn`)
        - ❌`cannot tell` when, where and how an authentication occurred
    - **OpenID Connect:** user authenticates to AS
        - ✔client apps can get identity info
        - ✔client apps can get authentication details
        - ✔allows federated SSO

**Features**

- built on top of `OAuth 2.0`
- **fixes** many of the OAuth pitfalls for user authentication
- can smoothly coexist with an OAuth authorization system

OpenID Connect `can work at internet scale`, where no parties have to know about each other ahead of time, by using:

- **Common identity API:** set of endpoints?
- **Dynamic Server Discovery Protocol:** allows `clients` to easily fetch information on how to interact with a specific IdP
- **Dynamic Client Registration Protocol:** allows `clients` to be introduced to a new IdP

### Basic Concepts

- **Participants:**
    - **end-user:** the subject an app wants identity info about
        - `resource owner` for OAuth: also owns its identity
    - **relying party:** `client` that relies on an IdP to authenticate the user and to request claims about the user
    - **IdP:** OAuth authorization server that offers authn service, ensures user authn, and provides user and authn claims to client apps
- **ID token:** `signed token` given to the `client app`, alongside the regular OAuth access token
    - **contains claims** (authn info)
    - also encoded with **JOSE (**`JSON Object Signing And Encryption`**)**, like for OAuth access tokens
        - claims are in the payload
    - **signed by IdP** despite using TLS, prevents further `impersonation attacks`
    - ID token format is known to the client $\implies$ it can directly parse it
- **Claims:** $\sim20$ standard claims, individual pieces of authn info
    - `sub`: identifier for the **user**
    - `iss`: identifier for the **IdP** who issued the token
    - `aud`: identifier of the **client** for which this token was created
    - issue date, expiration date, authn date, nonce (to avoid `replay attacks`), subject name (optional), subject email (optional), ...
- **Scopes:** `openid` plus $4$ standard scopes, `set of claims`, **permissions needed by the client app**
    - openid scope (mandatory if using OpenID Connect, specifies the minimum required)
    - profile scope
    - email scope
    - address scope
    - phone scope
- **Endpoints:** IdP offers some REST endpoints for user and client apps (`common identity API`?)
    - **Authorization** endpoint: `for end-user` to authenticate and grant consent to the client app
        - $\text{user log-in}\longrightarrow \text{authz code}$ (to be sent to the client app)
    - **Token** endpoint: `for client app`
        - $\text{authz code} + \text{info}\longrightarrow \text{ID token} + \text{OAuth access token} + \text{refresh token}$
    - **UserInfo** endpoint: `for client app`, returns consented user info (`claims`) given a valid access token
        - $\text{OAuth access token} \longrightarrow \text{claims}$

👁‍🗨**Claims & scopes** can be requested in the `ID token`, or from the `UserInfo endpoint` using a valid access token.

:warning: A client must be registered with a `client ID` at every IdP it wants to use.

- \+ a `client secret` to prove its identity (known by the client and by the IdP that created and issued it)

### Authn Flows

Some info: [link](https://medium.com/@darutk/diagrams-of-all-the-openid-connect-flows-6968e3990660)

- **Authorization code flow (**`external app`**):** based on OAuth one, but `ID token` is also used
    - authorization code flow $\rightarrow$ **web app** w/ server back-end
    - authorization code flow + PKCE $\rightarrow$ native **mobile app**
- **Implicit flow (**`in-browser app`**):** based on OAuth one, but `ID token` is also used
    - implicit flow $\rightarrow$ **JavaScript app** (SPA) w/ API back-end
- **Hybrid flow**
    - mix of the two above

### Use cases

- **Authentication** $\rightarrow$ `OpenID Connect`
    - Simple login
    - Mobile app login
    - SSO across sites (federated)
- **Authorization** $\rightarrow$ `OAuth 2.0`
    - Delegated authorization
        - `granting access` to your API
        - `getting access` to user data in other systems

SAML vs OpenID Connect? Both are standards for `federated SSO`.

![](https://i.imgur.com/P9oaKvg.png)

![](https://i.imgur.com/ZacSvKv.png)

For more details on OpenID Connect, see my essay.

:sunny: END :sunny: