# How does Drand work?
These are some short notes summarising the technical aspects of Drand. They are a synthesis of several sources (including my [own (admittedly limited) knowledge about random-number generation](https://moodlearchive.epfl.ch/2020-2021/enrol/index.php?id=15466)), so there may be small inconsistencies and errors here and there, but I write them down in any case in the hope that they are useful for understanding the project.
## Introduction
Drand is a *distributed randomness beacon for distributed systems.* There are a few key components of that sentence that are worth exploring in more depth.
### Why do we need randomness?
Randomness is inherent to several fields of science and engineering. A need for randomness shows up:
- In real life, for lotteries, random jury selection, etc.
- In cryptographic protocols, e.g., for leader election
- In statistics and scientific modeling
As well as many other places. There are two main ways of generating randomness (or random numbers). The first is via physical devices that exploit a physical process known to contain inherent randomness (e.g., radioactive decay), which, to my understanding, are largely impractical (or at least expensive). The second is via [*pseudo-random numbers*](https://en.wikipedia.org/wiki/Pseudorandom_number_generator), which are generated by a computer using some algorithm. Since these numbers are the product of a deterministic function, they are not strictly speaking "random" (for a given seed you can recreate the whole sequence); however, pseudo-random number generator (PRNG) algorithms typically produce streams of numbers that pass the statistical tests (see e.g., [NIST SP 800-22](https://csrc.nist.gov/pubs/sp/800/22/r1/upd1/final)) that bona-fide random numbers would satisfy (statistical independence, uniformity, etc.). This is what happens under the hood when you call `numpy.random.random()` in Python, for example (via the [Mersenne Twister](https://en.wikipedia.org/wiki/Mersenne_Twister) algorithm).
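As a quick illustration of this determinism, here is a minimal sketch using NumPy's legacy global generator (which is indeed a Mersenne Twister): re-seeding with the same value reproduces exactly the same "random" stream.

```python
import numpy as np

# NumPy's legacy global RNG is a Mersenne Twister; fixing the seed
# makes the "random" output fully reproducible.
np.random.seed(42)
first = np.random.random(3)

np.random.seed(42)          # re-seed with the same value...
second = np.random.random(3)

# ...and we get exactly the same sequence: pseudo-random, not truly random.
assert np.array_equal(first, second)
print(first)
```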
Furthermore, we need **good randomness** (unbiased, unpredictable) to avoid bias and manipulation in, e.g., rigged lotteries, leader-election attacks, etc.
### Ok, but why can't we *just* use a PRN generator on-chain?
There are a few reasons why setting a seed as a parameter of a distributed system just doesn't work.
**Centralized vs. Decentralized Randomness:** In a centralized system, if there's a single entity generating randomness, that entity could potentially manipulate the randomness to its advantage, or it could become a single point of failure.
In contrast, distributed systems aim to avoid central points of failure or control. With distributed randomness, the idea is to ensure that no single participant can unduly influence the outcome.
**Unpredictability:** If multiple participants in a network share the same seed for randomness, anyone who knows this seed can predict the outcome. In some cases, like consensus algorithms, leader elections, or lotteries, predictability could be exploited by adversaries to their advantage.
An unpredictable source of randomness makes it hard for any participant to anticipate or game the system.
**Bias-resistance:** If randomness is generated centrally or using shared seeds, it might be possible for an entity to introduce subtle biases, which could lead to advantages in contexts like gambling or consensus mechanisms.
Distributed randomness protocols like drand are designed to produce unbiased outputs even if a portion of the participants are malicious.
**Security Concerns:** Sharing the same seed across the network can expose the system to various risks. If the seed is compromised, the entire randomness generation becomes insecure.
Distributed randomness, when implemented correctly, doesn't rely on keeping seeds or other parameters secret. As long as a threshold of participants are honest, the protocol remains secure.
**Trust and Transparency:** Public randomness beacons, especially when generated by a diverse set of well-known entities, can offer a level of trust and transparency that's hard to achieve with centralized or proprietary sources of randomness.
## Some concepts of cryptography for Drand
Drand relies on a technique called BLS (Boneh-Lynn-Shacham) threshold cryptography to decentralize its operations. We now explain what this means in more detail.
### Threshold cryptography
Simply put, it's like sharing a secret among a group, where only a certain number of members (but not necessarily all of them) are needed to reveal the secret. In the context of Drand, this sharing is crucial for generating randomness across multiple participants. In particular, Drand uses $(t,n)$-threshold cryptography, described next.
**$(t,n)$-threshold cryptography.** In this setting, any $t$ out of the $n$ participants ($n\geq t$) suffice to create a signature, while fewer than $t$ cannot. This is key to generating randomness in a decentralized way. One common way of achieving this is via the *Shamir Secret Sharing* protocol.
**Shamir Secret Sharing Protocol (SSSP).** In this setting, you split a secret $s$ into $n$ *shares* in such a way that at least $t$ shares are needed to reconstruct the original value. This is achieved by observing that **any** polynomial of degree $t-1$ (denoted by $f_{t-1}$) can be reconstructed via Lagrange interpolation from $t$ points $\{(x_i,f_{t-1}(x_i))\}_{i=1}^{t}$. This works as follows:
- The dealer creates a polynomial $f_{t-1}(x)$ of degree $t-1$ whose constant term is the secret, i.e., $$f_{t-1}(x):=s+a_1x+a_2x^2+\dots+a_{t-1}x^{t-1}, \quad s,a_i\in\mathbb{Z}_p \ \text{(a finite field)}, \ \forall i=1,\dots,t-1.$$
- The dealer sends each of the $n$ shareholders their share $s_i=f_{t-1}(x_i)$ (together with the evaluation point $x_i$). Any $t$ distinct shareholders can then reconstruct $f_{t-1}(x)$, and hence the secret $s=f_{t-1}(0)$, using Lagrange interpolation (see the sketch below).
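To make the two steps above concrete, here is a minimal sketch of SSSP over a prime field. The prime, the evaluation points $x_i = 1,\dots,n$, and the helper names (`make_shares`, `reconstruct`) are illustrative choices for this note, not drand's actual implementation.

```python
# A minimal sketch of Shamir secret sharing over the prime field Z_P.
import random

P = 2**127 - 1  # a Mersenne prime, large enough for a toy secret

def make_shares(secret, t, n):
    """Split `secret` into n shares; any t of them reconstruct it."""
    # f(x) = secret + a_1 x + ... + a_{t-1} x^{t-1}  (mod P)
    coeffs = [secret] + [random.randrange(P) for _ in range(t - 1)]
    def f(x):
        return sum(c * pow(x, k, P) for k, c in enumerate(coeffs)) % P
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers f(0) = secret."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        lam = 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                lam = lam * xj * pow(xj - xi, -1, P) % P
        secret = (secret + yi * lam) % P
    return secret

shares = make_shares(secret=123456789, t=3, n=5)
assert reconstruct(shares[:3]) == 123456789      # any 3 shares suffice
assert reconstruct(shares[1:4]) == 123456789
```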
Running SSSP $n$ times in parallel then yields a *distributed key generation* (DKG) protocol, in which each participant $i$ acts as a dealer for its own secret $s_i$. Indeed, setting
$$
\begin{aligned}
f_{1,t-1}(x)&:=s_1+a_{1,1}x+a_{1,2}x^2+\dots+a_{1,t-1}x^{t-1}\\
f_{2,t-1}(x)&:=s_2+a_{2,1}x+a_{2,2}x^2+\dots+a_{2,t-1}x^{t-1}\\
\vdots&\\
f_{n,t-1}(x)&:=s_n+a_{n,1}x+a_{n,2}x^2+\dots+a_{n,t-1}x^{t-1},
\end{aligned}
$$
the joint secret $s$ and the share $\sigma_j$ held by participant $j$ are $$ s=\sum_{i=1}^{n} s_i, \qquad \sigma_j=\sum_{i=1}^n f_{i,t-1}(x_j).$$ Since $\sigma_j$ is an evaluation of the degree-$(t-1)$ polynomial $\sum_{i} f_{i,t-1}$, whose constant term is $s$, any $t$ of the shares $\sigma_j$ reconstruct $s$, even though no single party ever computes it.
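Continuing the sketch above, a toy version of this distributed key generation could look as follows; again, the field, the evaluation points $x_j=j$, and the names are illustrative only, not drand's implementation.

```python
# A minimal sketch of the DKG idea above: each of the n participants deals a
# Shamir sharing of its own local secret s_i, and participant j's share of
# the joint secret is the sum of everything it received.
import random

P = 2**127 - 1
t, n = 3, 5

def deal(secret):
    """Return the dealer's polynomial f(x) = secret + a_1 x + ... (mod P)."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t - 1)]
    return lambda x: sum(c * pow(x, k, P) for k, c in enumerate(coeffs)) % P

local_secrets = [random.randrange(P) for _ in range(n)]     # s_1, ..., s_n
polys = [deal(s) for s in local_secrets]                    # f_{i,t-1}

# Share of participant j (evaluation point x_j = j): sum over all dealers.
joint_shares = [(j, sum(f(j) for f in polys) % P) for j in range(1, n + 1)]
joint_secret = sum(local_secrets) % P                       # s = sum_i s_i

def reconstruct(shares):
    """Lagrange interpolation at x = 0."""
    acc = 0
    for xi, yi in shares:
        lam = 1
        for xj, _ in shares:
            if xj != xi:
                lam = lam * xj * pow(xj - xi, -1, P) % P
        acc = (acc + yi * lam) % P
    return acc

# Any t of the summed shares reconstruct the joint secret s,
# which no single participant ever held in full.
assert reconstruct(joint_shares[:t]) == joint_secret
```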
### BLS Signature
At its core, BLS offers a unique and compact signature scheme that can verify signatures very efficiently. One key aspect of BLS signatures is the ability to aggregate multiple signatures into a single one that can represent multiple signers.
**Threshold BLS Signatures:** When BLS is applied in a threshold setting, it allows a subset of a group to collaboratively produce a signature. In other words, if you have a group of $n$ participants, any $t$ of them can come together to produce a valid signature, without revealing their individual secret keys.
**How it works:**
1. **Key Generation:** Just like in the Shamir Secret Sharing Protocol, a secret key is split into $n$ shares. Each participant in the group receives a unique share.
2. **Signature Creation:** When a message needs to be signed:
    - Each of the $t$ participants uses their secret key share to produce a partial signature on the message.
    - These partial signatures are then aggregated (combined) into a single, valid BLS signature for the message. The beauty is that even though only $t$ out of the $n$ participants provided partial signatures, the final aggregated signature looks like a signature from a single signer.
3. **Signature Verification:** Anyone can verify the aggregated signature using a public key that corresponds to the combined secret keys of the participants. The verification process does not require knowledge of which specific $t$ out of the $n$ participants signed, making it anonymous.
More mathematically, BLS signatures typically live in a pairing-friendly elliptic-curve setting. Let $G_1, G_2, G_T$ be three cyclic groups of prime order $p$, with generators $g_1\in G_1$ and $g_2\in G_2$. We use a bilinear pairing $e$, i.e., a map that satisfies:
$$
\begin{aligned}
e: G_1 \times G_2 &\rightarrow G_T\\
e(g_1^{a}, g_2^{b}) &= e(g_1, g_2)^{ab} \quad (\textsf{Bilinearity})
\end{aligned}
$$
The components of the BLS protocol are then given as follows:
1. **Key Generation**:
    - Pick a private key $x \in \mathbb{Z}_p$.
    - Compute the public key $y = g_2^{x} \in G_2$, where $g_2$ is the generator of $G_2$ fixed above.
2. **Signature Generation**:
    - To sign a message $m$, compute its hash $H(m) \in G_1$, where $H$ is a *hash function* mapping messages into $G_1$.
    - The signature is then $s = H(m)^{x} \in G_1$.
3. **Signature Verification**:
    - Verify the signature $s$ using the public key $y$ and the pairing: the signature is valid if $e(s, g_2) = e(H(m), y)$.
Distributing the private key $x$ using an SSSP gives rise to what is known as a threshold BLS signature.
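To sketch how this works with the notation above (assuming participant $i$ holds the key share $x_i = f_{t-1}(i)$, with $x = f_{t-1}(0)$ the group secret key), each participant produces a partial signature, and any set $T$ of $t$ of them can be combined using the Lagrange coefficients evaluated at zero:
$$
\begin{aligned}
\sigma_i &:= H(m)^{x_i}, \qquad
\lambda_i := \prod_{j \in T,\, j \neq i} \frac{j}{j-i} \ (\bmod\ p),\\
\sigma &:= \prod_{i \in T} \sigma_i^{\lambda_i}
        = H(m)^{\sum_{i \in T} \lambda_i x_i}
        = H(m)^{x},
\end{aligned}
$$
so the aggregated $\sigma$ verifies against the group public key $y = g_2^{x}$ exactly like an ordinary BLS signature, as in the example below.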
#### Example:
Let's simplify things and pretend that all three groups are the multiplicative group modulo a small prime and that we can evaluate the pairing directly from the exponents, i.e., $e(g^{a}, g^{b}) := g^{ab}$ (this is **NOT** how real BLS signatures work; a verifier never knows the exponents, but it keeps the arithmetic easy to follow).
##### Setup:
- Prime $p = 101$.
- $G_1 = G_2 = G_T = \mathbb{Z}_{101}^*$, a group of order $100$.
- Choose $g = 3$ as a generator (in practice this would be a point on an elliptic curve).
##### Key Generation:
- Private key $x = 45$.
- Public key $y = g^x = 3^{45} \bmod 101 = 32$ (notice that we reduce modulo $101$ so that we *land* back in the group).
##### Signature:
- Let's sign the message $m =$ `hello, CEL!`. For simplicity, assume that our hash function maps this message to the exponent $5$, so $H(m) = g^5 = 3^5 \bmod 101 = 41$.
- $s = H(m)^x = 41^{45} \bmod 101 = g^{5\cdot 45} = g^{25} = 10$ (exponents can be reduced modulo the group order $100$). This is our signature.
##### Verification:
- Compute the hash: $H(m) = g^5 = 41$.
- Check the pairing equation (evaluated from the exponents in our simplified example):
    - $e(s, g) = e(g^{25}, g^{1}) = g^{25} = 10$.
    - $e(H(m), y) = e(g^{5}, g^{45}) = g^{225} = g^{25} = 10$.
- Since both values are equal, the signature is valid.
> **Remember**, this is a highly simplified and fictional example to convey the concept. Real BLS signatures work over elliptic curves with genuine bilinear pairings, which can be evaluated *without* knowing the exponents, unlike the shortcut used here.
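For the curious, here is a minimal sketch that re-runs the toy example above in Python; the "pairing" is evaluated from the exponents, which is only possible because this is a toy.

```python
# A minimal sketch re-running the toy BLS example above. The "pairing" is
# computed directly from the (known) exponents, e(g^a, g^b) = g^(a*b); this
# shortcut is not available with real pairings.
p, order = 101, 100          # Z_101^* has multiplicative order 100
g = 3                        # generator

def toy_pairing(a_exp, b_exp):
    """e(g^a, g^b) = g^(a*b), evaluated from the exponents (toy only)."""
    return pow(g, (a_exp * b_exp) % order, p)

# Key generation
x = 45                       # private key
y = pow(g, x, p)             # public key: 3^45 mod 101 = 32

# Toy hash: map the message to the exponent 5, so H(m) = g^5 = 41
h_exp = 5
H_m = pow(g, h_exp, p)

# Signature: s = H(m)^x = g^(5*45) = g^25 = 10
s = pow(H_m, x, p)

# Verification: e(s, g) == e(H(m), y), i.e. g^(25*1) == g^(5*45)
assert toy_pairing((h_exp * x) % order, 1) == toy_pairing(h_exp, x)
print(y, s)                  # 32 10
```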
### Verifiable delay functions
A Verifiable Delay Function (VDF) is a cryptographic primitive that requires a prescribed amount of *sequential* computation to evaluate and produces a unique output. This means that, even on parallel computational architectures, the evaluation cannot be significantly sped up; however, once a solution is found, it can be verified as correct very quickly. Classical constructions are based on repeated squaring in groups of unknown order (such as RSA groups).
### Time-lock encryption
Time-Lock Encryption is a mechanism where data is encrypted in such a way that it can only be decrypted after a certain amount of sequential computation time has elapsed, no matter how much parallel computational power is thrown at it. One of the classical constructions for time-lock puzzles uses the repeated-squaring method in RSA groups mentioned in the context of VDFs; however, while this forces a minimum number of sequential operations, the resulting wall-clock delay is still sensitive to improvements in hardware and algorithms.
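To illustrate the repeated-squaring idea, here is a minimal sketch of the classic RSA time-lock puzzle; the tiny primes and parameters below are purely illustrative.

```python
# A minimal sketch of the RSA time-lock puzzle based on repeated squaring.
# The tiny primes below are for illustration only; a real puzzle uses a large
# RSA modulus whose factorisation is discarded after setup.
p, q = 10007, 10009              # toy primes (never this small in practice)
N = p * q                        # public RSA modulus
phi = (p - 1) * (q - 1)          # secret: only the puzzle creator knows this

T = 100_000                      # number of sequential squarings = the "delay"
x = 42                           # the value to be locked

# Creator's shortcut: reduce the exponent 2^T modulo phi(N) first.
solution = pow(x, pow(2, T, phi), N)

# Solver (without phi(N)): must perform T squarings one after another;
# there is no known way to parallelise or shortcut this computation.
y = x % N
for _ in range(T):
    y = y * y % N

assert y == solution
```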