# Cryptographic Computation and Blockchain
## Adversarial Models
### Definitions
An honest party executes all protocol steps exactly **as specified in the protocol** description.
Semi-honest (or passive, or honest-but-curious) adversaries always follow all protocol steps exactly as an honest party would. The only power given to semi-honest adversaries is to try to **learn private information** from the **messages they receive** together with their **internal randomness/state**.
An active (or malicious) adversary can **deviate from the protocol in arbitrary ways** in order to attack it. However, when corrupting parties, it must **respect further restrictions**, such as the number of parties it is allowed to corrupt and whether it is allowed to perform adaptive corruptions or not.
A rational adversary only performs attacks when it stands to **gain something** (e.g. financial gains) if the attack succeeds.
### Theoretical results
In the same scenario, active security $\Rightarrow$ passive security.
Static active security $\nRightarrow$ adaptive passive security.
Active security with honest majority $\nRightarrow$ Passive security with dishonest majority.
## Introduction to MPC
### Definitions
In an MPC protocol the adversary learns only the information about honest parties' inputs that can be obtained from the output of the function being computed.
Protocols with information theoretical security **do not rely on any computational hardness assumptions** (e.g. the assumption that computing discrete logarithms in polynomial time is hard) and **remain secure even if the adversary has unlimited computational power** (i.e. enough power to quickly solve such computational problems that are believed to be hard).
Since protocols with information theoretical security **do not rely on any computational hardness assumptions**, they are also called **unconditionally secure**.
The **opposite is not necessarily true**. Indeed, **if a computational hardness assumption were proved to be true**, it would be theoretically **possible to have a protocol that does not rely on any unproven computational hardness assumption (unconditional security) but that does not remain secure if the adversary has unlimited computational power (information theoretical security)**.
Protocols with computational security are secure against a **probabilistic polynomial time adversary**. They are based on **computational hardness assumptions** (i.e., certain computational problems cannot be solved in probabilistic polynomial time).
Perfect security and statistical security are **particular cases of information theoretical security**.
Perfect security implies that the probability of breaking the protocol is exactly 0.
Statistical security implies that the probability of breaking the protocol is **negligible in the security parameter k, even for an adversary with unlimited computational power**.
Note that, in the case of computational security, having unlimited computational power allows the adversary to break the protocol with probability as close to $1$ as the adversary wants.
Note also that we prove statistical security with respect to **one execution of the protocol**. If the adversary can run an unbounded number of executions, it will eventually succeed in breaking the protocol, since in each execution there is still a non-zero probability of breaking it. However, assuming that a protocol with statistical security is only executed a finite number of times, we can always set the security parameter so that the probability of an adversary breaking the protocol after this finite number of executions is as close to 0 as we want.
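For concreteness, assuming that a single execution can be broken with probability at most $2^{-k}$, a union bound over $N$ executions gives
$\Pr[\text{the adversary breaks at least one of } N \text{ executions}] \leq N \cdot 2^{-k}$
so, e.g., for $N = 2^{30}$ executions and $k = 80$ the overall probability is at most $2^{-50}$, which is still negligible.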
In a secret sharing scheme the adversary **learns no information at all** from a set of shares that is not sufficient to reconstruct the secret.
A $t$-out-of-$n$ threshold secret sharing scheme tolerates up to $t-1$ corrupted parties.
### Theoretical results
(Im)possibility results regarding security in different scenarios:
Number of corrupted parties | (Information theoretical) perfect security | (Information theoretical) statistical security | Computational security
--- | --- | --- | ---
$< 1/3$ | Possible with private channels | Possible with private channels | Possible without private channels
$< 1/2$ | **Impossible** | Possible with broadcast and private channels | Possible with broadcast or trusted PKI
$<$ All parties | **Impossible** | **Impossible** | Possible with broadcast or trusted PKI
### References
[Systematizing Secure Computation for Research and Decision Support](http://www.cs.yale.edu/homes/jf/PGFW-Systematizing-SCN2014.pdf)
## Introduction to Consensus
### Definitions
Classical agreement problems:
- Broadcast agreement: one process (the leader) has a value. Everybody must agree on that value.
- Interactive consistency: all processes have a value. Everybody must agree on all values.
- Consensus: all processes have a value. Everybody must agree on only one value.
Being able to solve the broadcast agreement $\Rightarrow$ Being able to solve interactive consistency $\Rightarrow$ Being able to solve consensus $\Rightarrow$ Being able to solve the broadcast agreement $\Rightarrow$ ...
If faulty processes can lie, then it is necessary to assume an honest majority in order to solve consensus and, in that case, the problems are equally hard.
Process failures:
- Crash
- Omission
- Arbitrary or byzantine
Broadcast requirements:
Agreement: every non-faulty process outputs the same value.
Validity: if the leader is non-faulty everyone outputs its value.
Termination: every non-faulty process eventually outputs a value.
Consensus requirements:
Agreement: every non-faulty process outputs the same value.
Validity: if all non-faulty processes suggest the same value, that value must be the output value.
Termination: every non-faulty process eventually outputs a value.
Systems can be:
- Synchronous: we know how fast each process is, and how long a message takes to arrive.
- Asynchronous: we know neither of these.
FLP-impossibility: no deterministic algorithm can reach **agreement** in an **asynchronous system** if even a single process may crash.
Note that the other fault models subsume crash behaviour, so the impossibility extends to them.
Blockchain protocols, which assume an asynchronous system, guarantee **eventual consensus** only, i.e., **no guaranteed termination**, **eventual agreement** and **eventual validity**. It means that at some point all nodes agree on the same old state and that, if all non-faulty nodes suggest the same state, then they will eventually agree on that state.
Designing an MPC protocol must be at least as hard as reaching consensus.
Question: what if just one of the parties expects the output?
### Theoretical results
Constraints on the number of corrupted parties in different scenarios (in a **synchronous system**):
Fault model | Broadcast agreement | Interactive consistency | Consensus
--- | --- | --- | ---
Byzantine | $< 1/3$ | $< 1/3$ | $< 1/3$ |
Authenticated byzantine | $<$ All parties | $<$ All parties | $< 1/2$
Omission | $<$ All parties | $<$ All parties | $<$ All parties
Crash | $<$ All parties | $<$ All parties | $<$ All parties
## Protocol Security Models
### References
- Sections 2.2 and 2.3 of Chapter 2 of Efficient Secure Two-Party Protocols. Carmit Hazay, Yehuda Lindell.
- Sections 1, 2, 4, 5 (ZKP), 6 of [How To Simulate It – A Tutorial on the Simulation Proof TechniqueFile](https://eprint.iacr.org/2016/046.pdf). Yehuda Lindell.
## Secret Sharing
### Definitions
A qualified set is a subset of parties who are allowed to reconstruct the secret.
Every superset of a qualified set is a qualified set.
An unqualified set is a subset of parties who are not allowed to reconstruct a secret.
Every subset of an unqualified set is an unqualified set.
An access structure $\Gamma$ contains all the qualified sets over a set of parties $P$, with the property that $\forall S \in \Gamma, S \subset S' \subseteq P \Rightarrow S' \in \Gamma$.
A secrecy structure $\Pi$ contains all unqualified sets over a set of parties $P$, with the property that $\forall U \in \Pi, U' \subset U \subseteq P \Rightarrow U' \in \Pi$.
$\Gamma = \{ S \subseteq P : |S| \geq t \}, t \leq |P|$ is called a threshold access structure.
Note that it makes sense to describe **access structures** in terms of their **minimal sets**, since their monotonicity implies that sets containing these minimal sets will be in the same structure. Conversely, **secrecy structures** are usually described by their **maximal sets** since it is implied that subsets of these maximal sets are also in the same structure.
An $n$-out-of-$n$ scheme can be obtained on a field $\mathbb{F}_q$ by requiring the dealer to:
- Sample $n-1$ random shares $s_1,\dots, s_{n-1}$ from $\mathbb{F}_q$.
- Compute $s_n = s-(s_1 +\dots+ s_{n-1})$ where $s$ is the secret.
- Send share $s_i$ to $P_i \in P$ for $i = 1,\dots,n$.
The secret $s$ can only be reconstructed if all shares $(s_1,\dots, s_n)$ are known.
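A minimal Python sketch of this additive scheme (taking $\mathbb{F}_q$ as the integers modulo a prime $q$; function names are illustrative):
```python
import secrets

q = 2**61 - 1  # a prime; stands in for the field F_q

def share_additive(s, n):
    """n-out-of-n additive sharing: sample n-1 random shares, the last one is fixed by the secret."""
    shares = [secrets.randbelow(q) for _ in range(n - 1)]
    shares.append((s - sum(shares)) % q)  # s_n = s - (s_1 + ... + s_{n-1})
    return shares

def reconstruct_additive(shares):
    """Reconstruction needs every share: the sum of all shares equals the secret."""
    return sum(shares) % q

shares = share_additive(42, 5)
assert reconstruct_additive(shares) == 42
# Any strict subset of the shares is uniformly distributed and reveals nothing about the secret.
```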
Replicated secret sharing scheme can realize **general access structures** (and, among them, threshold access structures).
Let $|P| = n$, on input $s \in \mathbb{F}_q$, the dealer proceeds as follows:
- Let $T_1, \dots, T_k$ be the unqualified sets in the secrecy structure $\Pi = 2^P \setminus \Gamma$. For every unqualified set $T_i \in \Pi$, let $\hat{T}_i = P \setminus T_i$.
- Split $s$ into $s_1,\dots,s_k$ ($k$ different shares) as seen before and, for $i = 1, \dots , k$, send $s_i$ to all parties $P_j \in \hat{T}_i$.
In order to reconstruct the secret, the parties $P_j$ in a qualified set $Q \in \Gamma$ broadcast their shares. Since $Q$ is qualified, it is guaranteed that together they hold all the shares $s_1,\dots, s_k$, which are necessary to reconstruct the secret.
---
Example replicated $3$-out-of-$4$ secret sharing:
At least $3$ parties out of $4$ are required to reconstruct the secret.
Let $P = \{P_1, P_2, P_3, P_4\}$ and let the secret be $s = 3 \in \mathbb{Z}_5$.
Let $\Pi = \{\{P_1,P_2\}, \{P_1,P_3\}, \{P_1,P_4\}, \{P_2,P_3\}, \{P_2,P_4\}, \{P_3,P_4\}\}$ be the secrecy structure, where $T_1 = \{P_1,P_2\}$ and so on.
Then,
$\hat{T}_1 = P \setminus T_1 = \{P_3, P_4\}$
$\hat{T}_2 = P \setminus T_2 = \{P_2, P_4\}$
$\hat{T}_3 = P \setminus T_3 = \{P_2, P_3\}$
$\hat{T}_4 = P \setminus T_4 = \{P_1, P_4\}$
$\hat{T}_5 = P \setminus T_5 = \{P_1, P_3\}$
$\hat{T}_6 = P \setminus T_6 = \{P_1, P_2\}$
The dealer splits $s = 3$ in $s_1,\dots,s_6$, i.e., randomly samples $s_1,\dots, s_{5}$ from $\mathbb{Z}_5$, and sets $s_6 = s - (s_1 + \dots + s_5)$ and, for $i = 1, \dots , 6$, sends $s_i$ to all parties $P_j \in \hat{T}_i$.
In that way, at least $3$ parties out of $4$ are required to reconstruct the secret, in particular shares are distributed as follows:
| | $s_1$ | $s_2$ | $s_3$ | $s_4$ | $s_5$ | $s_6$ |
|-------|-------|-------|-------|-------|-------|-------|
| $P_1$ | | | | Yes | Yes | Yes |
| $P_2$ | | Yes | Yes | | | Yes |
| $P_3$ | Yes | | Yes | | Yes | |
| $P_4$ | Yes | Yes | | Yes | | |
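A minimal Python sketch of this replicated $3$-out-of-$4$ sharing (the maximal unqualified sets are enumerated as all pairs of parties; names and structure are illustrative):
```python
import secrets
from itertools import combinations

q = 5                                                     # working over Z_5 as in the example
parties = [1, 2, 3, 4]

def share_replicated(s, t=3):
    """Replicated t-out-of-n sharing: one additive piece per maximal unqualified set."""
    unqualified = list(combinations(parties, t - 1))      # T_1,...,T_k: all sets of size t-1
    k = len(unqualified)
    pieces = [secrets.randbelow(q) for _ in range(k - 1)]
    pieces.append((s - sum(pieces)) % q)                   # additive k-out-of-k split of s
    view = {p: {} for p in parties}
    for i, T in enumerate(unqualified):
        for p in parties:
            if p not in T:                                 # send s_i to every party in P \ T_i
                view[p][i] = pieces[i]
    return view

def reconstruct_replicated(view, subset):
    """A qualified subset pools its shares; together it holds every piece s_1,...,s_k."""
    pooled = {}
    for p in subset:
        pooled.update(view[p])
    return sum(pooled.values()) % q

view = share_replicated(3)
assert reconstruct_replicated(view, [1, 2, 3]) == 3        # any 3 parties can reconstruct
assert len(view[1].keys() | view[2].keys()) < 6            # any 2 parties miss at least one piece
```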
---
The Shamir secret sharing scheme **specifically realizes threshold access structures**.
The Shamir secret sharing scheme for a $t$-out-of-$n$ threshold access structure over a set of parties $P$, $|P| = n$, with secret $s \in \mathbb{F}_q$ consists in computing shares as evaluations of the polynomial $p(x) = s + c_1 x + \cdots + c_{t-1} x^{t-1}$, where the coefficients $c_1,\dots,c_{t-1}$ are sampled uniformly at random from $\mathbb{F}_q$; each share $s_i$ is computed as $s_i = p(i)$.
Each single share has the same size as the secret, since they all belong to $\mathbb{F}_q$.
Intuitively, a polynomial of degree $t-1$ is completely defined by providing $t$ distinct points, i.e., $t$ shares are required to reconstruct the secret $s$.
Shamir secret sharing achieves perfect security, i.e., fewer than $t$ shares reveal no information at all about $s$.
---
Example $3$-out-of-$4$ Shamir secret sharing:
At least $3$ parties out of $4$ are required to reconstruct the secret.
Let $P = \{P_1, P_2, P_3, P_4\}$ and let the secret be $s = 3 \in \mathbb{Z}_5$.
Let $p(x) = s + c_1 x + c_2 x^2$ where $c_1$ and $c_2$ are randomly sampled from $\mathbb{Z}_5$, for example:
$p(x) = 3 + x + 2x^2$
Compute share $s_i=p(x_i)$ and send it to party $P_i$ for $i=1,\dots,n$ (for example $x_i = i$).
We obtain $s_1 = p(1) = 1$, $s_2 = p(2) = 3$, $s_3 = p(3) = 4$ (all computations are mod $5$).
To reconstruct from these three shares, for $i = 1, \dots , 3$ compute the Lagrange basis polynomials
$\mathcal{L}_{i}(x) = \prod \limits_{j = 1, j \ne i}^{3} \frac{x-x_j}{x_i - x_j}$
so that
$p(x) = \sum \limits_{i=1}^{3} p(x_i)\mathcal{L}_{i}(x)$
Since only $p(0) = s$ is needed, it suffices to evaluate the basis polynomials at $x = 0$:
$\mathcal{L}_{1}(0) = \frac{(0-2)(0-3)}{(1-2)(1-3)} = 3$
$\mathcal{L}_{2}(0) = \frac{(0-1)(0-3)}{(2-1)(2-3)} = -3 \equiv 2 \pmod 5$
$\mathcal{L}_{3}(0) = \frac{(0-1)(0-2)}{(3-1)(3-2)} = 1$
Then,
$p(0) = s_1 \mathcal{L}_{1}(0) + s_2 \mathcal{L}_{2}(0) + s_3 \mathcal{L}_{3}(0) = 1 \cdot 3 + 3 \cdot (-3) + 4 \cdot 1 = -2 \equiv 3 \pmod 5$
---
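A minimal Python sketch of Shamir sharing and Lagrange reconstruction at $x=0$, reproducing the $3$-out-of-$4$ example over $\mathbb{Z}_5$ (helper names are illustrative; requires Python 3.8+ for modular inverses via `pow`):
```python
import secrets

def share_shamir(s, t, n, q):
    """Shares are evaluations p(1),...,p(n) of a random degree-(t-1) polynomial with p(0)=s."""
    coeffs = [s] + [secrets.randbelow(q) for _ in range(t - 1)]
    def p(x):
        return sum(c * pow(x, i, q) for i, c in enumerate(coeffs)) % q
    return [(i, p(i)) for i in range(1, n + 1)]

def reconstruct_shamir(points, q):
    """Lagrange interpolation at x=0 from any t points (x_i, s_i)."""
    secret = 0
    for i, (xi, si) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if j != i:
                num = (num * (-xj)) % q            # factor (0 - x_j)
                den = (den * (xi - xj)) % q        # factor (x_i - x_j)
        secret = (secret + si * num * pow(den, -1, q)) % q
    return secret

# The worked example: p(x) = 3 + x + 2x^2 over Z_5 gives shares 1, 3, 4 at x = 1, 2, 3.
assert reconstruct_shamir([(1, 1), (2, 3), (3, 4)], 5) == 3
shares = share_shamir(3, t=3, n=4, q=5)
assert reconstruct_shamir(shares[:3], 5) == 3
```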
In a verifiable secret sharing scheme all honest parties can check that all shares are valid.
## Information Theoretical MPC
### Definitions and theoretical results
An information theoretically secure MPC protocol based on simple (i.e., non-verifiable) secret sharing can achieve **perfect security against passive adversaries corrupting less than 1/3 of the parties**.
An information theoretically secure MPC protocol based on **verifiable secret sharing** can achieve security against **active adversaries**.
Indeed, with a verifiable secret sharing scheme honest parties can detect corrupted shares sent by an adversary and stop execution before their private inputs might be revealed.
Question: does this hold as long as not all the parties holding a replica of a given share are corrupted? Is there a general rule?
Using Beaver triples it is possible to construct a **multiplication protocol** secure against **passive adversaries** with **dishonest majorities**. Moreover, it is possible to achieve **better communication complexity** than with protocols based on replicated secret sharing.
Beaver triples must be generated by a **trusted third party** who **does not learn the shares of the inputs to be multiplied**, and **each triple can only be used once**.
Suppose two parties want to multiply their inputs $x = x_1 + x_2$ and $y = y_1 + y_2$.
First, a trusted dealer computes $a, b, c$ such that $a$ and $b$ are randomly sampled from $\mathbb{F}_q$ and $c = ab$.
Then the trusted dealer distributes the shares of $a = a_1 + a_2, b = b_1 + b_2, c = c_1 + c_2$ between the parties.
``` sequence
Participant P1 as P1
Participant P2 as P2
Note left of P1: d1 = x1-a1\ne1 = y1-b1
Note right of P2: d2 = x2-a2\ne2 = y2-b2
P1->P2: d1,e1
P2->P1: d2,e2
Note left of P1: s1 = c1+e*a1+d*b1+e*d
Note right of P2: s2 = c2+e*a2+d*b2
Note right of P1: s1+s2 = xy
```
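In the diagram, both parties learn the opened values $d = d_1 + d_2 = x - a$ and $e = e_1 + e_2 = y - b$, which reveal nothing about $x$ and $y$ since $a$ and $b$ are uniformly random. A minimal Python sketch checking the algebra locally (all values in $\mathbb{Z}_q$; the dealer and both parties are simulated in one process):
```python
import secrets

q = 2**61 - 1

def rand():
    return secrets.randbelow(q)

def additive_share(v):
    v1 = rand()
    return v1, (v - v1) % q

# Trusted dealer: a, b random, c = a*b, all additively shared between P1 and P2.
a, b = rand(), rand()
c = (a * b) % q
a1, a2 = additive_share(a)
b1, b2 = additive_share(b)
c1, c2 = additive_share(c)

# Parties' private inputs x and y, also additively shared.
x, y = 7, 11
x1, x2 = additive_share(x)
y1, y2 = additive_share(y)

# Each party opens d_i = x_i - a_i and e_i = y_i - b_i; d = x - a and e = y - b become public.
d = (x1 - a1 + x2 - a2) % q
e = (y1 - b1 + y2 - b2) % q

# Local computation of the output shares (only P1 adds the public term e*d).
s1 = (c1 + e * a1 + d * b1 + e * d) % q
s2 = (c2 + e * a2 + d * b2) % q
assert (s1 + s2) % q == (x * y) % q   # the shares reconstruct to x*y
```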
## Commitments and Oblivious Transfer
### Definitions
In a commitment scheme the hiding property guarantees that **the receiver cannot learn anything about the message inside the commitment before the opening phase**.
In a commitment scheme the binding property guarantees that **the sender cannot open the commitment to a message different from the one used in the commitment phase** (the receiver's check in the opening phase would fail).
A coin tossing protocol for two parties based on a commitment scheme can be constructed in the following way:
- Alice sends Bob a commitment to a random value $r_A$.
- Bob sends Alice a random value $r_B$.
- Alice opens the commitment and both parties compute the final random output $r=r_A + r_B$.
The Pedersen commitment scheme consists in the following steps assuming both parties have parameters $g,h \in \mathbb{G}_p$ where $g,h$ are generators of a group $\mathbb{G}_p$ of order $p$:
- Commitment phase: the sender with a message $m \in \mathbb{Z}_p$ samples $r \in \mathbb{Z}_p$ uniformly at random, computes $c=g^r h^m$ and sends $c$ to the receiver.
- Opening phase: the sender sends $(m,r)$ to the receiver. Upon receiving $(m,r)$ from the sender, the receiver checks that $g^r h^m$ is equal to $c$. If this check passes, the receiver accepts the opening to $m$, otherwise it aborts.
In this type of commitment the **hiding property is unconditional** (i.e. it holds without any computational assumption). The **binding property holds under the assumption that computing discrete logarithms is hard**.
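A minimal Python sketch with toy-sized parameters (for illustration only: in practice the group must be large and $h$ must be generated so that nobody knows $\log_g h$, e.g. by hashing into the group; here it is simply fixed):
```python
import secrets

# Toy parameters: G_p is the order-q subgroup of Z_P^*, with P = 2q + 1 a safe prime.
q = 1019                      # prime order of the subgroup (toy size)
P = 2 * q + 1                 # 2039, also prime
g = pow(2, 2, P)              # squaring maps into the order-q subgroup
h = pow(3, 2, P)              # second generator; its discrete log w.r.t. g is assumed unknown

def commit(m):
    r = secrets.randbelow(q)
    return pow(g, r, P) * pow(h, m, P) % P, r     # c = g^r * h^m

def open_check(c, m, r):
    return c == pow(g, r, P) * pow(h, m, P) % P

c, r = commit(5)
assert open_check(c, 5, r)       # honest opening accepted
assert not open_check(c, 6, r)   # opening to a different message rejected
```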
A commitment scheme in the random oracle model can be constructed in the following way:
- Commitment phase: the sender with a message $m \in \{0,1\}^k$ samples a random value $r$, queries the random oracle on input $m|r$ ($m$ concatenated with $r$) obtaining $c$ as response and sends $c$ to the receiver.
- Opening phase: the sender sends $(m,r)$ to the receiver. Upon receiving $(m,r)$ from the sender, the receiver queries the random oracle on input $m|r$ ($m$ concatenated with $r$) obtaining $c'$ as response and checks that $c'=c$. If this check passes, the receiver accepts the opening to $m$, otherwise it aborts.
Both **hiding and binding properties hold against probabilistic polynomial time adversaries**.
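A minimal Python sketch of this commitment, using SHA-256 as a stand-in for the random oracle, together with the coin-tossing protocol described above (the two random strings are combined by XOR here):
```python
import secrets, hashlib

def ro_commit(m: bytes):
    r = secrets.token_bytes(32)                          # random opening value
    return hashlib.sha256(m + r).digest(), r             # c = H(m | r)

def ro_open_check(c, m: bytes, r: bytes):
    return c == hashlib.sha256(m + r).digest()

# Coin tossing: Alice commits to r_A, Bob replies with r_B, Alice opens, output r_A XOR r_B.
r_A = secrets.token_bytes(16)
c, opening = ro_commit(r_A)                              # Alice -> Bob: c
r_B = secrets.token_bytes(16)                            # Bob -> Alice: r_B
assert ro_open_check(c, r_A, opening)                    # Alice -> Bob: (r_A, opening)
result = bytes(a ^ b for a, b in zip(r_A, r_B))          # shared random output
```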
In a $1$-out-of-$2$ oblivious transfer **the receiver learns only one of the messages** $m_1$ or $m_2$ of the sender (selected by its choice bit input $c$) and **the sender learns nothing**.
It is possible to generalize oblivious transfer to the $k$-out-of-$n$ case.
Random oblivious transfer (ROT), instead, sends a pair of random messages $(m_1,m_2)$ to one party and the pair $(c, m_c)$, where $c$ is a random choice bit, to the other party.
Starting from random oblivious transfer it is possible to obtain oblivious transfer.
Let $(m_0',m_1')$ be Alice's actual messages and let $b$ be Bob's actual choice bit.
``` sequence
Participant Alice as A
Participant Bob as B
B->A: d = b+c
A->B: x0 = m0' + m_{0+d}, x1 = m1' + m_{1+d}
Note right of B: m_b' = x_b-m_c
```
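A minimal Python sketch of this transformation with XOR as the one-time-pad operation (the ROT outputs are generated locally for illustration):
```python
import secrets

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# Random OT output, produced here by a trusted setup for illustration:
m = [secrets.token_bytes(16), secrets.token_bytes(16)]     # Alice: random (m_0, m_1)
c = secrets.randbelow(2)                                    # Bob: random choice bit c and m_c
m_c = m[c]

# Actual inputs.
m_prime = [b"message-zero----", b"message-one-----"]        # Alice's real messages (16 bytes each)
b = 1                                                       # Bob's real choice bit

d = b ^ c                                                   # Bob -> Alice
x = [xor(m_prime[0], m[0 ^ d]), xor(m_prime[1], m[1 ^ d])]  # Alice -> Bob: one-time-pad encryptions
recovered = xor(x[b], m_c)                                  # Bob can decrypt only x_b
assert recovered == m_prime[b]
```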
Intuitively, Alice uses messages from ROT to one time pad encrypt her actual inputs.
Then Bob will be able to decrypt just the one corresponding to his choice bit $b$.
Oblivious transfer is symmetric, so it is possible to use an instance where Alice acts as sender and Bob acts as receiver to obtain an instance where the roles of Alice and Bob are reversed. The transformation is based on ROT.
## Zero Knowledge
### Definitions
An interactive proof system is a **protocol executed between a prover and a verifier** in such a way that **the verifier is convinced of a statement by the prover**. Formally, the prover convinces the verifier that a given string $s$ is in a certain language $L$ if and only if $s \in L$.
In an interactive proof system where the prover must convince the verifier that a string $s \in L$, the following properties hold:
- Soundness: if $s \notin L$, the prover can still execute the proof system, but the verifier must not accept the proof (except with negligible probability).
- Completeness: if $s \in L$, the verifier must always accept the proof as valid.
Properties of a zero knowledge proof system for a relation $R$:
Note that a relation $R$ is a subset of $\{0,1\}^* \times \{0,1\}^*$ such that if $(x,w) \in R$, then the length of $w$ is $|w|=\mathrm{poly}(|x|)$ for a polynomial $\mathrm{poly}()$.
Intuitively, for $(x,w) \in R$ we can think of $x$ as an instance of a computation and of $w$ as a solution. For example, for a group $\mathbb{G}$ we can define the discrete log relation $DL=\{(x,w) \mid x=(g,h) \text{ s.t. } g \in \mathbb{G}, h \in \mathbb{G}, h=g^{w}\}$.
In the case of zero knowledge proof (ZKP), where the prover wants to prove that $(x,w) \in R$ for some $w$, **there exists a probabilistic polynomial time simulator that takes as input $x$ and generates an accepting transcript** (i.e., messages between prover and verifier such that the verifier accepts the proof) **given only $x$ but not $w$**.
Intuitively, the accepting transcript generated by the simulator, **using its extra power with respect to the prover**, cannot leak any information about $w$ by definition since the simulator does not know $w$. Then, the transcript generated by the prover does not leak any information about $w$ either.
In the case of proof of knowledge (PoK), where the prover wants to prove it knows $w$ such that $(x,w) \in R$, **there exists a probabilistic polynomial time simulator (often called an extractor) that interacts with a black-box copy of the prover and extracts the witness $w$ by potentially rewinding the prover**.
Intuitively, since the simulator can extract $w$, **using its extra power with respect to the prover**, the prover must know $w$.
Note that it is possible to have ZKP, PoK and ZKPoK (zero knowledge proof of knowledge).
In a Sigma Protocol the prover wants to prove it has $(x, w) \in R$ to the verifier.
It has **3 rounds**: the prover sends a message $a$ to the verifier, the verifier replies with a challenge $e$ and the prover replies with a message $z$. Then, the verifier checks if $(a,e,z)$ is accepting.
It achieves the **special soundness property** (given two accepting transcripts $(a,e,z)$ and $(a,e',z')$ with the same first message and $e \neq e'$, it is possible to compute $w$ such that $(x,w) \in R$) and the **special honest verifier zero knowledge property** (zero knowledge is guaranteed only when the verifier chooses the challenge honestly, so it is not guaranteed against a malicious verifier who tries to extract $w$). These properties are weaker than soundness and zero knowledge respectively.
---
Example of Sigma protocol for $DL=\{(x,w) \mid x=(g,h) \text{ s.t. } g \in \mathbb{G}, h \in \mathbb{G}, h=g^{w}\}$.
$x = (g, h)$ is public.
``` sequence
Participant P as P
Participant V as V
Note left of P: Randomly sample r in Z_p
P->V: a=g^r
Note right of V: Randomly sample e in Z_p
V->P: e
P->V: z = r+e*w
Note right of V: Check g^z = a*h^e\n It passes if h=g^w
```
---
The Fiat-Shamir heuristic relies on a **random oracle to obtain a non-interactive version of an interactive proof system (for example, a Sigma Protocol)**. In order to do that, the prover (who holds $(x,w) \in R$ for a relation $R$) computes the challenges (that should be computed by the verifier) by querying the random oracle on $x$ and its own previous messages, **allowing the prover to compute a transcript of the proof system locally and send this transcript to the verifier as a proof**.
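A minimal Python sketch of the Sigma protocol for discrete log shown above, together with its Fiat-Shamir version (SHA-256 stands in for the random oracle; the toy group parameters are for illustration only):
```python
import secrets, hashlib

q, P = 1019, 2039                 # toy subgroup order and modulus (illustration only)
g = 4                             # generator of the order-q subgroup of Z_P^*
w = secrets.randbelow(q)          # prover's witness
h = pow(g, w, P)                  # public statement x = (g, h), with h = g^w

# Interactive Sigma protocol.
r = secrets.randbelow(q)
a = pow(g, r, P)                                  # P -> V: a = g^r
e = secrets.randbelow(q)                          # V -> P: random challenge e
z = (r + e * w) % q                               # P -> V: z = r + e*w
assert pow(g, z, P) == a * pow(h, e, P) % P       # V checks g^z = a * h^e

# Fiat-Shamir: the prover derives the challenge from the transcript so far.
def fs_challenge(g, h, a):
    data = f"{g},{h},{a}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

r = secrets.randbelow(q)
a = pow(g, r, P)
e = fs_challenge(g, h, a)                         # no verifier interaction needed
z = (r + e * w) % q
proof = (a, z)                                    # sent to the verifier as the whole proof
assert pow(g, z, P) == a * pow(h, fs_challenge(g, h, a), P) % P
```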
## MPC with OT and Zero Knowledge
### Definitions and theoretical results
OT can be used for obtaining **multiplication of private inputs** efficiently with **passive security**.
Suppose the two parties want to compute $x \cdot y$. The receiver has input $x$ and the sender has input $y$.
The receiver inputs $x$ as the **choice bit** and the sender inputs a **random bit** $c$ as its first message and $y + c$ as the second message. At the end of the OT execution, the receiver obtains $s_1 = x \cdot y + c$ and the sender knows $s_2 = c$ (that is, both parties have shares of $x \cdot y$).
```graphviz
digraph dfd2{
rankdir=LR
node[shape=record];
Sender -> OT [label="(c, y+c)"]
Receiver -> OT [label="x"]
OT -> Receiver [label="xy+c"]
}
```
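A minimal Python sketch of this multiplication for single bits, with the $1$-out-of-$2$ OT modelled as an ideal lookup (shares are over $\mathbb{F}_2$, so reconstruction is XOR):
```python
import secrets

def ot(messages, choice_bit):
    """Ideal 1-out-of-2 OT: the receiver learns only messages[choice_bit]."""
    return messages[choice_bit]

x = 1                      # receiver's input bit
y = 1                      # sender's input bit
c = secrets.randbelow(2)   # sender's random mask bit

s1 = ot((c, y ^ c), x)     # receiver obtains x*y XOR c
s2 = c                     # sender keeps its mask as its share
assert s1 ^ s2 == x * y    # the XOR of the shares is the product
```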
However, this protocol is only passively secure and, when many multiplications are required in a circuit, it is not possible to ensure that a malicious party inputs the result of a previous multiplication into the next multiplication gate.
The GMW compiler **transforms a passively secure protocol into an actively secure protocol**.
In particular, it **forces all parties to execute a protocol correctly** by having all parties **prove in zero knowledge that each of their messages have been correctly computed** according to protocol instructions given the previous messages and parties' inputs/randomness.
The GMW compiler achieves security against a **dishonest majority** because even if all parties but one are corrupted, the one honest party will be able to detect that any of the other parties are misbehaving and abort the execution.
Notice that, no matter how many parties are corrupted, **the adversary still needs to generate a valid zero knowledge proof** showing that its messages are computed correctly according to the protocol instructions, the previous messages and the parties' inputs/randomness, which is not possible (except with negligible probability) if the adversary does not follow the protocol.
In the SCRAPE randomness beacon, non-interactive zero knowledge proofs (obtained with the Fiat-Shamir heuristic) are used to prove that a party has behaved correctly and **public verifiability** is achieved, i.e., any third party who sees the protocol messages can verify that they have been correctly generated.
Moreover, given an honest majority, **the output is always obtained even if the adversary is active** (**guaranteed output delivery** or G.O.D.).
## Byzantine Agreement vs. Blockchain Consensus
### Definitions
Properties of byzantine agreement:
- Agreement: all honest parties output the same value.
- Validity: if all honest parties have the same input value, they must all output this value.
- Termination: all honest parties must eventually output a value.
The FLP impossibility result shows that it is **impossible** for a **deterministic protocol** to achieve byzantine agreement over **asynchronous networks** with **agreement**, **validity** and **guaranteed termination**, even if a single party may crash.
Blockchain protocols do not guarantee that consensus has been achieved after a given number of rounds, while byzantine agreement has this guarantee (via the agreement and termination properties). This property is called **eventual consensus**.
Blockchain protocols do not require the honest parties to know each other's public keys or to be connected among each other by point-to-point channels, whereas byzantine agreement assumes that these requirements are fulfilled. This property is called **permissionless**.
Blockchain protocols are **more scalable**, supporting a larger number of parties than byzantine agreement protocols while requiring less communication.
### Theoretical results
Property | Constraint on the number of corrupted parties
--- | ---
Byzantine agreement over synchronous networks | $< 1/3$
Byzantine agreement over synchronous networks with PKI | $< 1/2$
## Proof-of-Work Blockchains
A blockchain can be defined as a sequence of blocks $BC=(B_0,B_1,B_2,\ldots,B_n)$ where each block $B_i \in BC$ is composed of certain metadata and data, with the exception of the genesis block $B_0$.
The main metadata and data in each block of a valid Proof-of-Work (PoW) based blockchain structure are a header containing the **hash of the previous block**, **a valid solution to the PoW puzzle** and the **hash of the block's data** as well as the **block's data itself**, which may be any binary data.
Let $BC=(B_0,B_1,B_2,\ldots,B_{f-1},B_f,\ldots,B_n)$ and $BC'=(B_0,B_1,B_2,\ldots,B_{f-1},B'_f,\ldots,B'_n)$ be two valid blockchains where blocks $B_0,B_1,B_2,\ldots,B_{f-1}$ are equal in both $BC$ and $BC'$ but blocks $B'_f,\ldots,B'_n$ and $B_f,\ldots,B_n$ are different.
In an execution of a blockchain consensus protocol, the situation where one honest party obtains $BC$ and another honest party obtains $BC'$ is named **fork**.
The GKL model is a model to describe blockchain protocols:
- Corruption: $n$ parties, $t$ are corrupted.
- Synchronous networks: the protocol proceeds in synchronous rounds where all messages sent are delivered.
- $q$-Bounded Random Oracle: hash functions are modelled as random oracles and parties can make $q$ queries to the random oracle at every round.
- Adversarial model: the adversary can spoof messages, generate arbitrary messages and make $t \cdot q$ queries to the random oracle.
- The Environment gives inputs to honest parties and receives their outputs.
- The adversary can see all messages sent before they are delivered and have its messages delivered first.
- Security properties must hold **for any environment** (modeling all possible honest parties behaviors/inputs) and **for any adversary** (modeling all possible adversarial strategies).
- **Different from the real/ideal world paradigm** of simulation based proofs: **only the real world is used**.
Blockchain is maintained and extended by parties using the following functions:
- Validation function $V()$: used to determine what blocks are valid (with respect to PoW puzzle and current local view of the party).
- Read function $R()$: used to determine the local view of the chain by a party (with respect to new blocks received/created by the party).
- Input function $I()$: used to try and create a new block with information received from the environment (by solving the PoW puzzle).
Blockchain protocol properties in the GKL model:
- Chain Growth: the chain will grow with at least constant speed (a constant number of new blocks is added to the chain after a constant number of rounds).
- Chain Quality: given $k$ blocks in a blockchain, at least $c \cdot k$ blocks were generated by honest parties for a constant $c$ with overwhelming probability.
- Common Prefix: the probability that two honest parties see different chains after removing the $k$ last blocks from a chain is negligible in $k$.
The PoW puzzle must be **hard to solve on average** but **easy to verify** with verification time independent of the solution time.
In particular, given a hash function $H:\{0,1\}^{\star} \rightarrow \{0,1\}^{n}$ and a difficulty level $d$ the Bitcoin proof-of-work puzzle is defined for a block $B$ whose header contains a hash $h_d$ of the data and a hash $st$ of the previous block as finding a value $Nonce \in \{0,1\}^{n}$ such that $H(h_d|st|Nonce) < 2^{n-d}$.
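A minimal Python sketch of this puzzle with SHA-256 and a small difficulty (encodings and field layout are illustrative):
```python
import hashlib

def pow_solve(h_d: bytes, st: bytes, d: int, n: int = 256):
    """Find Nonce such that H(h_d | st | Nonce) < 2^(n - d) by brute force."""
    target = 2 ** (n - d)
    nonce = 0
    while True:
        digest = hashlib.sha256(h_d + st + nonce.to_bytes(32, "big")).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce
        nonce += 1

def pow_verify(h_d: bytes, st: bytes, nonce: int, d: int, n: int = 256):
    """Verification is a single hash evaluation, independent of how long solving took."""
    digest = hashlib.sha256(h_d + st + nonce.to_bytes(32, "big")).digest()
    return int.from_bytes(digest, "big") < 2 ** (n - d)

h_d = hashlib.sha256(b"block data").digest()
st = hashlib.sha256(b"previous block").digest()
nonce = pow_solve(h_d, st, d=16)        # about 2^16 hash attempts on average
assert pow_verify(h_d, st, nonce, d=16)
```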
The difficulty of solving a PoW puzzle is periodically adjusted in order to achieve a constant number of blocks generated per time period (in the case of Bitcoin, 1 block every 10 minutes).
The main assumption for proving the security of a Proof-of-Work based blockchain protocol is that the adversary cannot control more than half of all of the computation power invested in solving the PoW puzzle.
A miner who solves a PoW puzzle and successfully generates a block in the Bitcoin blockchain protocol receives the **block reward**. In particular, the block reward consists in a certain number of coins that is halved every time 210,000 blocks are mined.
## Cryptocurrencies
In a Bitcoin-style cryptocurrency a user's coins are connected to an address controlled by the user.
Addresses are represented by **signature scheme public keys** or **hashes of these public keys**.
New coins are **generated as block rewards** given to each of the miners who successfully solve a PoW puzzle and generate a new block that gets included in the blockchain.
In order to execute a transaction, it is necessary to have it **signed** under the secret key of the sender.
If the owner of address $A$ containing $X$ coins needs to transfer $Y$ coins to address $B$ but $X > Y$, the remaining $X - Y$ coins are sent to a second address $C$ controlled by the owner of address $A$, which is usually called a change address.
The issue of double spending in blockchain based cryptocurrencies is addressed by having **the users wait for a sufficiently large number of blocks** to be added to the blockchain after the block in which the transaction was included, so that the common prefix guarantee of the underlying blockchain protocol ensures that this block is not part of a fork except with negligible probability.
Miners receive **arbitrary transaction fees** offered by the user who generates the transaction, for each transaction they add to a block.
Apart from allowing for transferring coins from address $A$ to address $B$, Bitcoin script can be used to specify **transactions where the recipient is not allowed to spend the coins it receives before a certain amount of time has elapsed**.
Smart Contracts make it possible to describe **arbitrarily complex conditions** under which transactions might take place among multiple users given a set of inputs, and they are automatically executed by the parties running a decentralized and permissionless cryptocurrency and smart contract system.
In the case of Ethereum, **miners execute as many instructions of the smart contract as covered by the "gas" fees they receive** from parties who wish to execute the contract (who send transactions to be evaluated as input by the contract).
## PoS
Solving PoW puzzles wastes vast amounts of computational power (and electric energy). Moreover, with the periodic halving of block rewards it is unclear if miners will have sufficient incentives to keep executing PoW based blockchain protocols in the future.
A stakeholder's (i.e. a user's) stake is the **amount of coins it currently owns in the cryptocurrency system**.
The main idea behind Proof-of-Stake (PoS) based blockchain protocols is that it randomly **selects a user to generate a block with probability proportional to the user's stake.**
PoS based blockchain protocols are proven secure under the assumption that **more than half of the stake is controlled by honest parties**.
PoS based blockchain protocols achieve the **chain growth, common prefix and chain quality properties of the GKL framework**.
The Ouroboros PoS blockchain protocol selects the user who generates the next block using the **follow-the-satoshi** procedure, where a random seed is used to select a coin at random, consequently selecting the coin's owner to generate the next block.
The random seed is generated by using a **randomness beacon with guaranteed output delivery** such as SCRAPE, which is executed by the parties on top of the blockchain itself.
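A minimal Python sketch of a follow-the-satoshi style selection (the seed would come from the randomness beacon; the stake distribution and names are purely illustrative):
```python
import hashlib

stake = {"Alice": 40, "Bob": 35, "Carol": 25}        # coins owned by each stakeholder

def follow_the_satoshi(seed: bytes, stake: dict):
    """Pick one coin uniformly at random from the seed; its owner generates the next block."""
    total = sum(stake.values())
    coin = int.from_bytes(hashlib.sha256(seed).digest(), "big") % total
    for owner, amount in stake.items():
        if coin < amount:
            return owner                              # selected with probability amount / total
        coin -= amount

leader = follow_the_satoshi(b"beacon output for this slot", stake)
```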
Assuming that more than half of the stake is controlled by honest parties, Ouroboros Proof-of-Stake (PoS) based blockchain protocol is secure under the conditions of **static adversaries and synchronous communication**.
In the Ouroboros Praos Proof-of-Stake (PoS) based blockchain protocol, blocks are signed by their creators with a forward secure signature scheme. Moreover, **verifiable random functions** are used both for **selecting the users who generate the next block** and for **generating (biased) random nonces** used in this selection process.
Assuming that more than half of the stake is controlled by honest parties, Ouroboros Praos Proof-of-Stake (PoS) based blockchain protocol is secure under the conditions of **adaptive adversaries and semi-synchronous communication.**