# Data Availability Sampling Phase 1 Proposal (WIP)

The purpose of this document is to describe in more detail a proposal for how phase 1 can be structured based on a "data-availability-focused" approach.

The main addition to the beacon chain will be a `Vector` of `ShardDataHeader` objects, one for each shard. A `ShardDataHeader` is a small object which represents a large amount of underlying data (roughly 0-512 kB in size). A block is only valid if the underlying data that the `ShardDataHeader` points to is _available_ - that is, it has been published to the network and anyone can download it. However, to preserve scalability, clients will not try to download the full underlying data of every `ShardDataHeader` to verify the block. Instead, they will verify that the data is available using an indirect technique called _data availability sampling_.

### Parameters

| Parameter | Value | Description |
| - | - | - |
| `SHARD_COUNT` | 64 | Number of `ShardDataHeader` objects per block |
| `POINTS_PER_SAMPLE` | 8 | Number of evaluations of the polynomial in each chunk that gets sampled (we group a few evaluations together for efficiency reasons) |
| `MODULUS` | [$\approx 5.2 * 10^{76}$](https://github.com/ethereum/py_ecc/blob/master/py_ecc/bls12_381/bls12_381_curve.py#L21) | The modulus that is used for arithmetic operations (the polynomial construction and evaluation are all done in [modular arithmetic](https://en.wikipedia.org/wiki/Modular_arithmetic)). With this modulus we have ~31.7 bytes per evaluation |
| `SAMPLES_PER_BLOCK` | 4096 | Number of samples in a block when that block is extended (so the amount of actual data is at most half this) |
| `FAST_SHUFFLING_SAMPLES` | 12 | Number of sample indices that adjust quickly |
| `SLOW_SHUFFLING_SAMPLES` | 4 | Number of sample indices that adjust slowly |
| `SLOW_SHUFFLING_PERIOD` | 4096 | Slots between reshuffles of the slow sample indices |

## Background: Data Availability Sampling

There exists an efficient algorithm to encode N chunks of data $D[0] ... D[N-1]$ (the chunks can have any size) into 2N chunks, such that _any_ N of those chunks suffice to reconstruct the full data. The algorithm is as follows. Compute the polynomial $P$ where $P(0) = D[0]$, $P(1) = D[1]$ ... $P(N-1) = D[N-1]$. This polynomial is guaranteed to have degree < N. Then, evaluate this polynomial at $2N$ positions: the original $0 ... N-1$, but also $N ... 2N-1$, to get the $2N$ evaluations $P(0) ... P(2N-1)$.

We now use an important property of polynomials: given a polynomial of degree < N, _any_ N evaluations of the polynomial at N different points can be used to reconstruct the polynomial and hence the original data. Hence, if we consider the collection of 2N evaluations of this polynomial at the values $0 ... 2N-1$, we can see that any N of them can be used to reconstruct the polynomial, and from there reconstruct the original data - exactly our goal.

(Note that the above description is simplified for exposition; to make it possible to use [fast Fourier transforms](https://vitalik.ca/general/2019/05/12/fft.html) to do the above operations in $N*log(N)$ time, powers of some root of unity $\omega$ are used as evaluation points instead of the integers $0...2N-1$.)

What does this give us? It gives us a way to turn _50% availability_ into _100% availability_. This matters because checking for 50% availability is vastly easier than checking for 100% availability. Checking for 100% availability can only be done by downloading all the data; even if you download an entire 90% of the data, there's a 10% chance that a single missing chunk is hiding somewhere in the remaining 10%. **But checking for 50% availability can be done probabilistically, with only a few random sample checks.** This is the key trick that allows us to do scalable data availability validation.
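To make the encode-and-reconstruct idea concrete, here is a minimal self-contained sketch. It uses a toy modulus and plain Lagrange interpolation over the integer evaluation points $0 ... 2N-1$; the real scheme uses the BLS12-381 modulus, powers of a root of unity, and FFTs. The function names are illustrative only.

```python
# Toy illustration of the 2x erasure-coding extension (not the production scheme).
MODULUS = 13  # toy prime; the real scheme uses a ~2**255 prime

def lagrange_eval(points, x):
    # Evaluate at x the unique polynomial of degree < len(points) passing
    # through the given (xi, yi) pairs, with all arithmetic mod MODULUS.
    total = 0
    for xi, yi in points:
        num, den = 1, 1
        for xj, _ in points:
            if xj != xi:
                num = num * (x - xj) % MODULUS
                den = den * (xi - xj) % MODULUS
        total += yi * num * pow(den, -1, MODULUS)
    return total % MODULUS

def extend(data):
    # N chunks -> 2N evaluations of the degree < N polynomial P with P(i) = data[i]
    n = len(data)
    original = list(enumerate(data))
    return [lagrange_eval(original, x) for x in range(2 * n)]

def reconstruct(samples, n):
    # Any n of the 2n evaluations suffice to recover the original n chunks
    assert len(samples) >= n
    return [lagrange_eval(samples[:n], x) for x in range(n)]

data = [5, 10, 2, 7]                                   # N = 4 chunks, each < MODULUS
extended = extend(data)                                # 2N = 8 evaluations
any_half = [(x, extended[x]) for x in (1, 3, 4, 6)]    # keep any 4 of the 8
assert reconstruct(any_half, len(data)) == data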
## Kate commitments

See https://dankradfeist.de/ethereum/2020/06/16/kate-polynomial-commitments.html for an introduction to Kate commitments.

A Kate commitment is a fixed-size elliptic-curve-pairing-based commitment $com(P)$ to a polynomial $P$, which has the property that for any evaluation point $z$ where $P(z) = a$ for some $a$, you can provide an _opening proof_, another elliptic curve point $Q$, where a verifier given $com(P)$ and $Q$ can be convinced that $P(z)$ actually does equal $a$.

This is very useful technology for us, because it means that if we generate the commitment for the polynomial and use that to represent the data, and we generate openings for each of the 2N outputs, then the openings become "self-verifying". **As long as the header is well-formatted, there is no way for a shard data block to be invalid. It can only be unavailable.** This key property allows us to completely avoid the need for any fraud proofs, eliminating a large class of complexity and attacks.

## Subnets

Unlike in eth1, we can no longer simply publish all shard data to the same shared p2p network. This is because a node that is on that network would be forced to download all of that data, removing the benefits from sharding! With the above parameters, the amount of data broadcasted would be 256 kB per block (on average, assuming an EIP-1559-like mechanism) * 64 shards / 12 seconds per slot = 1.33 MB/sec (~11 Mbps) (plus P2P overhead), and it is expected to grow over time with more shards and a larger block size.

Instead, we use a **subnet** architecture, where there are multiple p2p networks and each node is usually only on a few of them at a time. We have three kinds of subnets:

* Shard block subnets (or **horizontal subnets**)
* Sample index subnets (or **vertical subnets**)
* The `ShardDataHeader` subnet (or the **global subnet**)

When a shard data block is published, the header is published on the global subnet, and the block is published on the horizontal subnet corresponding to the shard ID. For each $i$ in `[0 ... SAMPLES_PER_BLOCK - 1]`, the i'th set of evaluations (`[POINTS_PER_SAMPLE * i ... POINTS_PER_SAMPLE * (i+1) - 1]`) is published on vertical subnet $i$.

![](https://i.imgur.com/UJrhdSL.png)

Figure 1.

_(Note: a possible extension is to shuffle which vertical subnet corresponds to which sample index in each slot, so as to balance the load of the different shards in the case of blocks of different sizes)_
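As a concrete illustration of the mapping from evaluations to subnets described above, here is a minimal sketch. `publish` and `publish_block` are hypothetical names (not part of any spec), and the sketch ignores the two-step committee broadcast described under "Publishing" below; it only shows which payload goes to which subnet.

```python
POINTS_PER_SAMPLE = 8
SAMPLES_PER_BLOCK = 4096

def publish(subnet_kind: str, subnet_id: int, payload) -> None:
    # Hypothetical stand-in for the actual p2p broadcast call.
    pass

def publish_block(shard: int, header, evaluations) -> None:
    # The header goes to the global subnet, the full block goes to the shard's
    # horizontal subnet, and each group of POINTS_PER_SAMPLE evaluations goes
    # to its own vertical subnet.
    assert len(evaluations) == POINTS_PER_SAMPLE * SAMPLES_PER_BLOCK
    publish("global", 0, header)
    publish("horizontal", shard, (header, evaluations))
    for i in range(SAMPLES_PER_BLOCK):
        sample = evaluations[POINTS_PER_SAMPLE * i : POINTS_PER_SAMPLE * (i + 1)]
        publish("vertical", i, (header, i, sample))
```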
In practice, there would be 2048 horizontal subnets instead of 64, to allow one horizontal subnet per (shard, slot) combo in each epoch. This is done to ensure that each validator can join a single subnet where they will only receive the block corresponding to the committee that they were assigned to.

Each validator must join the following subnets:

* The global subnet
* The horizontal subnet corresponding to the (shard, slot) combo (ie. the committee) that they were assigned to
* The vertical subnets corresponding to the indices that they are assigned to (each validator computes this for themselves using a private seed)

### Index calculation algorithm

`sample_indices` returns the vertical subnets that a validator should make sure to be in during a given slot, based on their private seed. _The code below can be run self-contained._

```python
from hashlib import sha256
from typing import List, Tuple

def hash(x: bytes) -> bytes:
    return sha256(x).digest()

SAMPLES_PER_BLOCK = 4096
FAST_SHUFFLING_SAMPLES = 12
SLOW_SHUFFLING_SAMPLES = 4
SLOW_SHUFFLING_PERIOD = 4096
OFFSET = 2**24

bytes32 = bytes

def get_delta(seed: bytes32, adjusted_slot: int, positions: int) -> Tuple[int, int]:
    # TODO: consider replacing with an MPC-friendly function like Legendre
    sub_seed = hash(seed + adjusted_slot.to_bytes(32, 'little'))
    # Which position to replace, and the new sample index to put there
    position = int.from_bytes(sub_seed[:8], 'little') % positions
    new_index = int.from_bytes(sub_seed[8:16], 'little') % SAMPLES_PER_BLOCK
    return (position, new_index)

def sample_helper(seed: bytes32, slot: int, positions: int) -> List[int]:
    # Walk backwards from the current slot, applying at most one replacement
    # per slot, until every position has been filled
    output = [None] * positions
    adjusted_slot = slot + OFFSET
    while None in output:
        delta_position, delta_new_index = get_delta(seed, adjusted_slot, positions)
        if output[delta_position] is None:
            output[delta_position] = delta_new_index
        adjusted_slot -= 1
    return output

def sample_indices(secret_seed: bytes32, public_seed: bytes32, slot: int) -> List[int]:
    fast_indices = sample_helper(secret_seed, slot, FAST_SHUFFLING_SAMPLES)
    offset = int.from_bytes(public_seed[:4], 'little')
    period = (slot + offset) // SLOW_SHUFFLING_PERIOD
    slow_indices = sample_helper(public_seed, period, SLOW_SHUFFLING_SAMPLES)
    return fast_indices + slow_indices
```

The intended behavior is that there are two types of indices: fast-reshuffling indices and slow-reshuffling indices. One fast-reshuffling index changes every slot; one slow-reshuffling index changes every `SLOW_SHUFFLING_PERIOD` slots. The slow-reshuffling indices are chosen from a public seed; this allows easy discoverability of nodes that are guaranteed to be on particular subnets.

### ShardDataCommitment structure

```python
class ShardDataCommitment(Container):
    commitment: BLSPubkey
    length: uint64
    length_proof: BLSPubkey
    # Signatures and signer-related data TBD
```

### Publishing

Note that while we can reasonably expect the publisher of a shard data block to be on the horizontal subnet of the desired shard, they are not going to be on _all_ the vertical subnets. To solve this, we use a two-step broadcasting mechanism:

* Every block has an associated committee whose members are all on that horizontal subnet. The committee members (~128) receive the block.
* Each of the committee members (128) asks each of their peers (conservatively, 10 each, so 128 * 10 peers) what vertical subnets they are in (128 * 10 * 16 (peer, subnet) pairs), and broadcasts the appropriate samples to each peer.
* Each peer that receives a valid sample broadcasts it to their subnet.

This broadcasts an expected 20480 samples, enough to provide, in expectation, ~14 times the number of samples needed to reconstruct the block (including the inefficiency from double-covering an index, you need ~69.3% of a block, or ~1419 samples). This provides a large safety margin and ensures that the block actually does get broadcasted.
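Here is a rough sketch of the fan-out arithmetic behind those numbers. It assumes, as a simplification, that every peer's vertical subnet memberships are uniform and independent (in reality they come from `sample_indices`), and the peer counts are the illustrative ones used above; it is not part of the protocol, just a sanity check of the coverage claim.

```python
import random

SAMPLES_PER_BLOCK = 4096
COMMITTEE_SIZE = 128
PEERS_PER_MEMBER = 10   # conservative peer count assumed above
SUBNETS_PER_PEER = 16   # FAST_SHUFFLING_SAMPLES + SLOW_SHUFFLING_SAMPLES

broadcasts = COMMITTEE_SIZE * PEERS_PER_MEMBER * SUBNETS_PER_PEER
print(broadcasts)  # 20480 sample broadcasts in total

# Monte-Carlo estimate of how many *distinct* vertical subnets those broadcasts
# reach under the uniform-and-independent simplification
covered = set()
for _ in range(broadcasts):
    covered.add(random.randrange(SAMPLES_PER_BLOCK))
# Typically around 4068 of the 4096 subnets are reached, far more than the
# SAMPLES_PER_BLOCK // 2 = 2048 distinct samples needed to reconstruct
print(len(covered))
```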
### Technical details: Fast Fourier Transform

To make the erasure coding efficient, we find a _root of unity_ of order $2N$ = `POINTS_PER_SAMPLE * SAMPLES_PER_BLOCK`; that is, we find an $\omega$ such that $\omega^{2N} = 1$ (in modular arithmetic modulo the `MODULUS` in the config above) but $\omega^k \ne 1$ for any $1 \le k < 2N$. We use $\omega = 5^{\frac{MODULUS - 1}{32768}}$:

`w = 30195699792882346185164345110260439085017223719129789169349923251189180189908`

Instead of using $[0 ... 2N-1]$ as our evaluation points, we use powers of $\omega$ _in order of reverse bit representation_:

```python
def reverse_bit_order(bits):
    if bits == 0:
        return [0]
    return (
        [x*2 for x in reverse_bit_order(bits-1)] +
        [x*2+1 for x in reverse_bit_order(bits-1)]
    )
```

Here are example outputs for small inputs (the input that would actually be used to generate the exponents is `log2(32768) = 15`):

```python
>>> reverse_bit_order(1)
[0, 1]
>>> reverse_bit_order(2)
[0, 2, 1, 3]
>>> reverse_bit_order(3)
[0, 4, 2, 6, 1, 5, 3, 7]
>>> reverse_bit_order(4)
[0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15]
```

Reverse bit order has a neat property that any power-of-two-aligned subset (another way to think of this: any subset that would form a clean subtree if the list were put into a Merkle tree) is a subgroup (or a coset of a subgroup) of the multiplicative group of powers of $\omega$. This makes it very mathematically easy and clean to:

* Use a [fast Fourier transform](https://vitalik.ca/general/2019/05/12/fft.html) to extend the original data ($N$ points) into the extended data ($2N$ points). In particular, note that the original data, which goes in the first half of the evaluation points, neatly falls into even powers of $\omega$ (ie. powers of $\omega^2$), which is a subgroup, and so an FFT can be applied directly.
* Prove that any particular contiguous sub-range $[i ... j]$ of evaluation points is full of zeroes. For this, we compute a polynomial $Z(x) = (x - \omega^{rev(i)}) * ... * (x - \omega^{rev(j)})$, and provide a commitment to $Q(x) = \frac{P(x)}{Z(x)}$. The verifier can generate the G2-representation of $Z$ and do a pairing check to verify $e(com(Q), com(Z)) = e(com(P), com(1))$. Note that the powers of $\omega$ can be viewed as at most $2 * log_2(j-i)$ subtrees ($log_2(j-i)$ if $j = \frac{N}{2}$), making it easy to compute $Q(x)$. $Z(x)$ too will be a product of a few $x^{2^k} - \omega^{2^k * r}$ terms, making it and the commitment to it easy to calculate.

The latter property is needed because for EIP-1559 purposes, we need to enforce the size of a shard block in such a way that it can be verified from the shard header. For this, we use the application-specific SNARK summarized above to prove that the data outside the range is zeroes, so no one needs to try to propagate or download those coordinates.
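To sanity-check the constants in this section, here is a short self-contained snippet that recomputes $\omega$ from its definition, checks that it has order exactly $2N = 32768$, and generates the reverse-bit-order evaluation points. It follows the definitions above directly; the decimal `MODULUS` value is the BLS12-381 curve order that the parameter table links to.

```python
# Recompute omega from its definition and check its order.
MODULUS = 52435875175126190479447740508185965837690552500527637822603658699938581184513
ORDER = 8 * 4096  # POINTS_PER_SAMPLE * SAMPLES_PER_BLOCK = 2N = 32768

w = pow(5, (MODULUS - 1) // ORDER, MODULUS)
print(w)  # compare with the `w` constant quoted above

# The order of w divides 2N (first check) and does not divide N (second check);
# since 2N is a power of two, the order is therefore exactly 2N.
assert pow(w, ORDER, MODULUS) == 1
assert pow(w, ORDER // 2, MODULUS) != 1

# Evaluation points are powers of w taken in reverse bit order
def reverse_bit_order(bits):
    if bits == 0:
        return [0]
    return (
        [x*2 for x in reverse_bit_order(bits-1)] +
        [x*2+1 for x in reverse_bit_order(bits-1)]
    )

exponents = reverse_bit_order(15)              # log2(32768) = 15
eval_points = [pow(w, e, MODULUS) for e in exponents]
assert len(set(eval_points)) == ORDER          # all 32768 evaluation points are distinct
```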
### Serialization

One important point that must be stressed in this data availability scheme is that **the shard block data should be viewed as natively _being_ a list of integers in the range `[0...MODULUS-1]`**. This is a departure from eth1 and indeed almost all other blockchains (except perhaps IOTA, which uses ternary?), which tend to view data as natively being bytes.

That is, if for example `MODULUS = 13`, the list `[5, 10, 2, 7, 12, 4, 9, 1]` would be admissible data, but `[1, 8, 15, 6, 13, 4, 11, 2]` would not: 13 and 15 are invalid elements. This should be viewed philosophically similarly to how eg. `0x2babd4ba18af` is valid hexadecimal, but `0x7d394a1bdghi` is not. The data availability check that we use is natively built around modular arithmetic, and so the data whose availability it verifies is in the format of a list of integers in the range `[0...MODULUS-1]`.

The main challenge here is that serialization and networking use bytes, and so there is a disconnect of formats between serialization/networking and the data availability scheme (which uses points): the 32-byte blob `0xffffffff...ffffffff` is valid _bytes_, but it's not a valid _point_, because points must be lower than the modulus (which is a prime number somewhere between $2^{254}$ and $2^{255}$). To deal with this disconnect, we encode every point as 32 bytes (in little endian), but we add to every object that contains a point a validity condition that requires the point to be less than the modulus. Objects that contain points that are out of range are effectively ignored and treated as nonexistent, as though they had never been received by the client.

Note that _at the application layer_, applications will usually prefer to use bytes. How to convert a list of points into bytes is an application-layer choice; a natural default is to just take $x \mod 256^{31}$, looking only at the bottom 31 bytes of every point.
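Here is a minimal sketch of the encoding rules just described: 32-byte little-endian wire encoding with the point-below-modulus validity condition, plus the suggested bottom-31-bytes application-layer default. The function names are illustrative, not part of any spec.

```python
MODULUS = 52435875175126190479447740508185965837690552500527637822603658699938581184513

def point_to_bytes(point: int) -> bytes:
    # Wire encoding: every point is serialized as 32 little-endian bytes
    assert 0 <= point < MODULUS
    return point.to_bytes(32, 'little')

def bytes_to_point(data: bytes) -> int:
    # Validity condition: a 32-byte blob whose integer value is >= MODULUS is not
    # a valid point; an object containing it is ignored / treated as nonexistent
    assert len(data) == 32
    point = int.from_bytes(data, 'little')
    if point >= MODULUS:
        raise ValueError("not a valid point; containing object should be ignored")
    return point

def point_to_application_bytes(point: int) -> bytes:
    # A natural application-layer default: keep only the bottom 31 bytes,
    # i.e. take the point mod 256**31
    return (point % 256**31).to_bytes(31, 'little')

assert bytes_to_point(point_to_bytes(12345)) == 12345
assert len(point_to_application_bytes(MODULUS - 1)) == 31
```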