# Data Availability Sampling Phase 1 Proposal
# WIP
The purpose of this document is to describe in more detail a proposal for how phase 1 can be structured based on a "data-availability-focused" approach.
The main addition to the beacon chain will be a `Vector` of `ShardDataHeader` objects, one for each shard. A `ShardDataHeader` is a small object which represents a large amount of underlying data (roughly 0-512 kB in size). A block is only valid if the underlying data that the `ShardDataHeader` points to is _available_ - that is, it has been published to the network and anyone can download it. However, to preserve scalability, clients will not try to download the full underlying data of every `ShardDataHeader` to verify the block. Instead, they will verify that the data is available using an indirect technique called _data availability sampling_.
### Parameters
| Parameter | Value | Description |
| - | - | - |
| `SHARD_COUNT` | 64 | Number of `ShardDataHeader` objects per block |
| `POINTS_PER_SAMPLE` | 8 | Number of evaluations of the polynomial in each chunk that gets sampled (we group a few evaluations together for efficiency reasons) |
| `MODULUS` | [$\approx 5.2 * 10^{77}$](https://github.com/ethereum/py_ecc/blob/master/py_ecc/bls12_381/bls12_381_curve.py#L21) | The modulus that is used for arithmetic operations (the polynomial construction and evaluation are all done in [modular arithmetic](https://en.wikipedia.org/wiki/Modular_arithmetic)). With this modulus we have ~31.7 bytes per evaluation |
| `SAMPLES_PER_BLOCK` | 4096 | Number of samples in a block when that block is extended (so the amount of actual data is at most half this) |
| `FAST_SHUFFLING_SAMPLES` | 12 | Number of sample indices that adjust quickly |
| `SLOW_SHUFFLING_INDICES` | 4 | Number of sample indices that adjust slowly |
| `SLOW_SHUFFLING_PERIOD` | 4096 | Slots between slot reshuffle |
## Background: Data Availability Sampling
There exists an efficient algorithm to encode N chunks of data $D[0] ... D[n-1]$ (the chunks can have any size) into 2N chunks, such that _any_ N of those chunks suffice to reconstruct the full data.
The algorithm is as follows. Compute the polynomial $P$ where $P(0) = D[0]$, $P(1) = D[1]$ ... $P(N-1) = D[N-1]$. This polynomial is guaranteed to have degree < N. Then, evaluate this polynomial at $2n$ positions, the original N but also $P(N) ... P(2N-1)$, to get 2N evaluations.
We now use an important property of polynomials: given a polynomial of degree < N, _any_ N evaluations of the polynomial at N different points can be used to reconstruct the polynomial and hence the original data.
Hence, if we consider the collection of 2N evaluations of this polynomial at the values $0 ... 2N-1$, we can see that any N of them can be used to reconstruct the polynomial, and from there reconstruct the original data - exactly our goal.
(note that the above description is simplified for exposition; to make it possible to use [fast Fourier transforms](https://vitalik.ca/general/2019/05/12/fft.html) to do the above operations in $N*log(N)$ time, powers of some root of unity $\omega$ are used as evaluation points instead of the integers $0...2N-1$)
What does this give us? It gives us a way to turn _50% availability_ into _100% availability_. This matters because checking for 50% availability is vastly easier than checking for 100% availability. Checking for 100% availability can only be done by downloading all the data; even if you download an entire 90% of the data, there's a 10% chance that some single missing chunk is somewhere in the remaining 10%. **But checking for 50% availability can be done probabilistically, with only a few random sample checks.** This is the key trick that allows us to do scalable data availability validation.
## Kate commitments
See https://dankradfeist.de/ethereum/2020/06/16/kate-polynomial-commitments.html for an introduction to Kate commitments
A Kate commitment is a fixed size elliptic-curve-pairing-based commitment $com(P)$ to a polynomial $P$, which has the property that for any evaluation point $z$ where $P(z) = a$ for some $a$, you can provide an _opening proof_, another elliptic curve point $Q$, where a verifier given $com(P)$ and $Q$ can be convinced that $P(z)$ actually does equal $a$.
This is very useful technology for us, because it means that if we generate the commitment for the polynomial and use that to represent the data, and we generate openings for each of the 2N outputs, then the openings become "self-verifying". **As long as the header is well-formatted, there is no way for a shard data block to be invalid. It can only be unavailable**. This key property allows us to completely avoid the need for any fraud proofs, eliminating a large class of complexity and attacks.
## Subnets
Unlike in eth1, we can no longer simply publish all shard data to the same shared p2p network. This is because a node that is on that network would be forced to download all of that data, removing the benefits from sharding! With the above parameters, the amount of data broadcasted would be 256 kB per block (on average, assuming an EIP-1559-like mechanism) * 64 shards / 12 seconds per slot = 1.33 MB/sec (~11 Mbps) (plus P2P overhead), but it is expected to expand over time with more shards and a larger block size.
Instead, we use a **subnet** architecture, where there are multiple p2p networks and each node is only usually on a few of them at a time. We have three kinds of subnets:
* Shard block subnets (or **horizontal subnets**)
* Sample index subnets (or **vertical subnets**)
* The `ShardDataHeader` subnet (or the **global subnet**)
When a shard data block is published, the header is published on the global subnet, and the block is published on the horizontal subnet corrsponding to the shard ID. For each $i$ in `[0 ... SAMPLES_PER_BLOCK - 1]`, the i'th set of evaluations (`[POINTS_PER_SAMPLE * i ... POINTS_PER_SAMPLE * (i+1) - 1]`) is published on vertical subnet $i$.
![](https://i.imgur.com/UJrhdSL.png)
_(Note: a possible extension is to shuffle which vertical subnet corresponds to which sample index in each slot, so as to balance the load of the different shards in the case of blocks of different sizes)_
In practice, there would be 2048 horizontal subnets instead of 64, to allow one horizontal subnet per (shard, slot) combo in each epoch. This is done to ensure that each validator can join a single subnet where they will only receive the block corresponding to the committee that they were assigned to.
Each validator must join the following subnets:
* The global subnet
* The horizontal subnet corresponding to the (shard, slot) combo (ie. the committee) that they were assigned to
* The vertical subnets corresponding to indices that they are assigned to (each validator computes this for themselves using a private seed)
### Index calculation algorithm
`sample_indices` returns the vertical shards that a validator should make sure to be in during a given slot, based on their private seed. _The code below can be run self-contained._
```python
from hashlib import sha256
def hash(x):
return sha256(x).digest()
SAMPLES_PER_BLOCK = 4096
FAST_SHUFFLING_SAMPLES = 12
SLOW_SHUFFLING_SAMPLES = 4
SLOW_SHUFFLING_PERIOD = 4096
OFFSET = 2**24
bytes32 = None; List = {int: None}
def get_delta(seed: bytes32, adjusted_slot: int, positions: int) -> (int, int):
# TODO: consider replacing with an MPC-friendly function like Legendre
sub_seed = hash(
seed +
adjusted_slot.to_bytes(32, 'little')
)
position = int.from_bytes(sub_seed[:8], 'little') % positions
new_index = int.from_bytes(sub_seed[8:16], 'little') % SAMPLES_PER_BLOCK
return (position, new_index)
def sample_helper(seed: bytes32, slot: int, positions: int) -> List[int]:
output = [None] * positions
adjusted_slot = slot + OFFSET
while None in output:
delta_position, delta_new_index = get_delta(seed, adjusted_slot, positions)
if output[delta_position] is None:
output[delta_position] = delta_new_index
adjusted_slot -= 1
return output
def sample_indices(secret_seed: bytes32,
public_seed: bytes32,
slot: int) -> List[int]:
fast_indices = sample_helper(secret_seed, slot, FAST_SHUFFLING_SAMPLES)
offset = int.from_bytes(public_seed[:4], 'little')
period = (slot + offset) // SLOW_SHUFFLING_PERIOD
slow_indices = sample_helper(public_seed, period, SLOW_SHUFFLING_SAMPLES)
return fast_indices + slow_indices
```
The intended behavior is that there are two types of indices: fast-reshuffling indices and slow-reshuffling indices. One fast-reshuffling index changes every slot; one slow-reshuffling index changes every `SLOW_SHUFFLING_PERIOD` slots. The slow-reshuffling indices are chosen from a public seed; this allows easy discoverability of nodes that are guaranteed to be on particular subnets.
### ShardDataCommitment structure
```python
class ShardDataCommitment(Container):
commitment: BLSPubkey
length: uint64
length_proof: BLSPubkey
# Signatures and signer-related data TBD
```
### Publishing
Note that while we can reasonably expect the publisher of a shard data block to be on the horizontal subnet of the desired shard, they are not going to be on _all_ the vertical shards. To solve this, we use a two-step broadcasting mechanism:
* Every block has an associated committee that are all on that horizontal shard. The committee members (~128) receive the block.
* Each of the committee members (128) asks each of their peers (conservatively, 128 * 10) what vertical subnets they are in (128 * 10 * 16). They broadcast the appropriate samples to each peer.
* Each peer that receives a valid sample broadcasts it to their subnet.
This broadcasts an expected 20480 samples, enough to in expectation provide ~14 times the number of samples needed to reconstruct the block (including inefficiency from double-covering an index, you need ~69.3% of a block, or ~1419 samples). This provides a large safety margin and ensures that the block actually does get broadcasted.
### Technical details: Fast Fourier Transform
To make the erasure coding efficient, we find a _root of unity_ with order $2N$ = `POINTS_PER_SAMPLE * SAMPLES_PER_BLOCK`; that is, we find a $\omega$ such that $\omega^N = 1$ (in modular arithmetic modulo the `MODULUS` in the config above) but $\omega^k \ne 1$ for any $1 \le k < N$.
We use $\omega = 5^{\frac{MODULUS - 1}{32768}}$:
`w = 30195699792882346185164345110260439085017223719129789169349923251189180189908`
Instead of using $[1 ... 2N]$ as our evaluation points, we use powers of $\omega$ _in order of reverse bit representation_:
```python
def reverse_bit_order(bits):
if bits == 0:
return [0]
return (
[x*2 for x in reverse_bit_order(bits-1)] +
[x*2+1 for x in reverse_bit_order(bits-1)]
)
```
Here are the examples of outputs with low values (the value that would be used as an input to generate the exponents that will actually get used is `log2(32768) = 15`):
```python
>>> reverse_bit_order(1)
[0, 1]
>>> reverse_bit_order(2)
[0, 2, 1, 3]
>>> reverse_bit_order(3)
[0, 4, 2, 6, 1, 5, 3, 7]
>>> reverse_bit_order(4)
[0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15]
```
Reverse bit order has a neat property that any power-of-two-aligned subset (another way to think of this: any subset that would form a clean subtree if the list were put into a Merkle tree) is a subgroup (or a coset of a subgroup) of the multiplicative group of powers of $\omega$.
This makes it very mathematically easy and clean to:
* Use a [fast Fourier transform](https://vitalik.ca/general/2019/05/12/fft.html) to extend the original data ($N$ points) into the extended data ($2N$ points). Particularly, note that the original data, which goes in the first half of the evaluation points, neatly falls into even powers of $\omega$ (ie. powers of $\omega^2$), which is a subgroup, and so an FFT can be applied directly.
* Prove that any particular contiguous sub-range is full of zeroes. For this, we compute a polynomial $Z(x) = (x - \omega^{rev(i)}) * ... * (x - \omega^{rev(j)})$, and provide a commitment to $Q(x) = \frac{P(x)}{Z(x)}$. The verifier can generate the G2-representation of $Z$ and do a pairing check to verify $e(com(Q), com(Z)) = e(com(P), com(1))$. Note that the powers of $\omega$ can be viewed as at most $2 * log_2(j-i)$ subtrees ($log_2(j-i)$ if $j = \frac{N}{2}$), making it easy to compute $Q(x)$. $Z(x)$ too will be a product of a few $x^{2^k} - \omega^{2^k * r}$ terms, making it and the commitment to it easy to calculate.
The latter property is needed because for EIP 1559 purposes, we need to enforce the size of a shard block in such a way that it can be verified from the shard header. For this, we use the application-specific SNARK summarized above to prove that the data outside the range is zeroes, so no one needs to try to propagate or download those coordinates.
### Serialization
One important point that must be stressed in this data availability scheme is that **the shard block data should be viewed as natively _being_ a list of integers in the range `[0...MODULUS-1]`**. This is a departure from eth1 and indeed almost all other blockchains (except perhaps iota, which uses ternary?), which tend to view data as natively being bytes.
That is, if for example `MODULUS = 13`, the set `[5, 10, 2, 7, 12, 4, 9, 1]` would be admissible data, but `[1, 8, 15, 6, 13, 4, 11, 2]` is not: 13 and 15 are invalid elements. This should be viewed philosophically similarly to how eg. `0x2babd4ba18af` is valid hexadecimal, but `0x7d394a1bdghi` is not. The data availability check that we use is natively built around modular arithmetic, and so the data that it verifies availability of is in the format of a list of integers in the range `[0...MODULUS-1]`.
The main challenge here is that serialization and networking use bytes, and so there is a disconnect of formats between serialization/networking and the data availability scheme (which uses points): the 32-byte blob `0xffffffff...ffffffff` is valid _bytes_, but it's not a valid _point_, because points must be lower than the modulus (which is a prime number somewhere between $2^{254}$ and $2^{255}$).
To deal with this disconnect, we encode every point as 32 bytes (in little endian), but we add to every object that contains a point a validity condition that requires the point to be less than the modulus. Objects that contain points that are out of range are effectively ignored and treated as nonexistent, as though they had never been received by the client.
Note that _at the application layer_, applications usually prefer to use bytes. How to convert a list of points into bytes is an application-layer choice; a natural default is to just take $x\mod 256^{31}$, looking only at the bottom 31 bytes of every point.

Suppose you have a polynomial $P$, and the sample proofs $Q_i = \lfloor \frac{P}{X^{16} - \omega^{i * 16}} \rfloor$. Goal: verify all proofs. Note that for all $i$, $Q_i * (X^{16} - \omega^{i * 16}) = P - I_i$, where $I_i$ is the $deg < 16$ interpolant of the i'th subgroup. We can combine all of these equations with a random linear combination: $\sum Q_i * (X^{16} - \omega^{i * 16}) * r_i = \sum (P - I_i) * r_i$

11/1/2022Along with proof of stake, the other central feature in the eth2 design is sharding. This proposal introduces a limited form of sharding, called "data sharding", as per the rollup-centric roadmap: the shards would store data, and attest to the availability of ~250 kB sized blobs of data. This availability verification provides a secure and high-throughput data layer for layer-2 protocols such as rollups. To verify the availability of high volumes of data without requiring any single node to personally download all of the data, two techniques are stacked on top of each other: (i) attestation by randomly sampled committees, and (ii) data availability sampling (DAS). ELI5: randomly sampled committees Suppose you have a big amount of data (think: 16 MB, the average amount that the eth2 chain will actually process per slot, at least initially). You represent this data as 64 "blobs" of 256 kB each. You have a proof of stake system, with ~6400 validators. How do you check all of the data without (i) requiring anyone to download the whole thing, or (ii) opening the door for an attacker who controls only a few validators to sneak an invalid block through? We can solve the first problem by splitting up the work: validators 1...100 have to download and check the first blob, validators 101...200 have to download and check the second blob, and so on. The validators in each of these subgroups (or "committees") simply make a signature attesting that they have verified the blob, and the network as a whole only accepts the blob if they have seen signatures from the majority of the corresponding committee. But this leads to a problem: what if the attacker controls some contiguous subset of validators (eg. 1971....2070)? If this were the case, then even though the attacker controls only ~1.5% of the whole validator set, they would dominate a single committee (in this case, they would have ~70% of committee 20, containing validators 2001...2100), and so they would be able to control the committee and push even invalid/unavailable blobs into the chain. Random sampling solves this by using a random shuffling algorithm to select the committees. We use some hash as the seed of a random number generator, which we then use to randomly shuffle the list [1..6400]. The first 100 values in the shuffled list are the first committee, the next 100 are the second committee, etc.

10/26/2022One powerful technique when working with polynomials is taking a set of evaluations of that polynomial and using that directly to compute an evaluation at a different point. For example, if $P(x)$ is a degree-100 polynomial, you can use the evaluations $P(0), P(1) ... P(100)$ to directly compute $P(101)$, or $P(1247130)$, in $O(N)$ time, without ever reconstructing the polynomial. This post describes how this is done. See also, this earlier and more detailed paper by Oxford researchers on the topic: https://people.maths.ox.ac.uk/trefethen/barycentric.pdf General technique Let $P(x)$ be the polynomial, and $x_1 ... x_N$ as the set of points for which we have evaluations of $P(x)$. Call these evaluations $y_1 ... y_N$. Think of $P(x)$ as a linear combination $\sum_i y_i L_i(x)$, where $L_i(x)$ is the polynomial that equals 1 at $x_i$ and 0 at all the other x-coordinates in the set. Now, let us explore how these $L_i$ can be computed. Each $L_i(x)$ can be expressed as:

3/8/2022Two paths to a solution exist, and have existed for a long time: weak statelessness and state expiry: State expiry: remove state that has not been recently accessed from the state (think: accessed in the last year), and require witnesses to revive expired state. This would reduce the state that everyone needs to store to a flat ~20-50 GB. Weak statelessness: only require block proposers to store state, and allow all other nodes to verify blocks statelessly. The good news is that recently, there have been major improvements on both of these paths, that greatly reduce the downsides to both: Some techniques for how a ReGenesis-like epoch-based expiry scheme can be adapted to minimize resurrection conflicts Piper Merriam's work on transaction gossip networks that add witnesses to be stateless-client-friendly, and his work on distributed state storage and on-demand availability Verkle trees, which can reduce worst-case witness sizes from ~4 MB to ~800 kB (this is definitely small enough, because existing worst-case blocks that are full of calldata are already 12.5M / 16 ~= 780 kB and we have to handle those anyway). See slides, doc, code.

1/4/2022
Published on ** HackMD**