
Data Availability Sampling Phase 1 Proposal

WIP

The purpose of this document is to describe in more detail a proposal for how phase 1 can be structured based on a "data-availability-focused" approach.

The main addition to the beacon chain will be a Vector of ShardDataHeader objects, one for each shard. A ShardDataHeader is a small object which represents a large amount of underlying data (roughly 0-512 kB in size). A block is only valid if the underlying data that the ShardDataHeader points to is available - that is, it has been published to the network and anyone can download it. However, to preserve scalability, clients will not try to download the full underlying data of every ShardDataHeader to verify the block. Instead, they will verify that the data is available using an indirect technique called data availability sampling.

Parameters

| Parameter | Value | Description |
| --- | --- | --- |
| SHARD_COUNT | 64 | Number of ShardDataHeader objects per block |
| POINTS_PER_SAMPLE | 8 | Number of evaluations of the polynomial in each chunk that gets sampled (we group a few evaluations together for efficiency reasons) |
| MODULUS | ≈5.2 * 10^76 | The modulus used for arithmetic operations (the polynomial construction and evaluation are all done in modular arithmetic); a prime between 2^254 and 2^255, giving ~31.7 bytes per evaluation |
| SAMPLES_PER_BLOCK | 4096 | Number of samples in a block once that block is extended (so the amount of actual data is at most half this) |
| FAST_SHUFFLING_SAMPLES | 12 | Number of sample indices that adjust quickly |
| SLOW_SHUFFLING_SAMPLES | 4 | Number of sample indices that adjust slowly |
| SLOW_SHUFFLING_PERIOD | 4096 | Slots between reshuffles of the slow indices |

Background: Data Availability Sampling

There exists an efficient algorithm to encode $N$ chunks of data $D[0] \ldots D[N-1]$ (the chunks can have any size) into $2N$ chunks, such that any $N$ of those chunks suffice to reconstruct the full data.

The algorithm is as follows. Compute the polynomial $P$ where $P(0) = D[0]$, $P(1) = D[1]$, ..., $P(N-1) = D[N-1]$. This polynomial is guaranteed to have degree $< N$. Then, evaluate this polynomial at $2N$ positions: the original $N$, but also $P(N) \ldots P(2N-1)$, to get $2N$ evaluations.

We now use an important property of polynomials: given a polynomial of degree < N, any N evaluations of the polynomial at N different points can be used to reconstruct the polynomial and hence the original data.

Hence, if we consider the collection of 2N evaluations of this polynomial at the values

0...2N1, we can see that any N of them can be used to reconstruct the polynomial, and from there reconstruct the original data - exactly our goal.

(Note that the above description is simplified for exposition; to make it possible to use fast Fourier transforms to do the above operations in $N \log(N)$ time, powers of some root of unity $\omega$ are used as evaluation points instead of the integers $0 \ldots 2N-1$.)
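To make the encode-and-recover property concrete, here is a toy sketch in Python. It uses the simplified integer evaluation points and Lagrange interpolation over a small prime modulus chosen purely for illustration; the real scheme uses the large MODULUS above and FFTs over roots of unity.

MOD = 65537  # toy prime modulus, for illustration only

def lagrange_eval(xs, ys, x):
    # Evaluate, at x, the unique polynomial of degree < len(xs) passing
    # through the points (xs[i], ys[i]), working mod MOD.
    total = 0
    for i in range(len(xs)):
        num, den = 1, 1
        for j in range(len(xs)):
            if j != i:
                num = num * (x - xs[j]) % MOD
                den = den * (xs[i] - xs[j]) % MOD
        total += ys[i] * num * pow(den, MOD - 2, MOD)
    return total % MOD

data = [5, 30, 9, 18]                                  # N = 4 original chunks
N = len(data)
# Extend: evaluate the degree < N polynomial with P(i) = data[i] at 0 ... 2N-1
extended = [lagrange_eval(list(range(N)), data, x) for x in range(2 * N)]
assert extended[:N] == data

# Any N of the 2N evaluations suffice to recover the original data
some_xs = [1, 3, 4, 6]
some_ys = [extended[x] for x in some_xs]
recovered = [lagrange_eval(some_xs, some_ys, x) for x in range(N)]
assert recovered == data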

What does this give us? It gives us a way to turn 50% availability into 100% availability. This matters because checking for 50% availability is vastly easier than checking for 100% availability. Checking for 100% availability can only be done by downloading all the data; even if you download 90% of the data, a single missing chunk could still be hiding somewhere in the remaining 10%. But checking for 50% availability can be done probabilistically, with only a few random sample checks. This is the key trick that allows us to do scalable data availability validation.
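As a rough quantitative sketch of why a few samples suffice: if less than half of the extended data is available (so the block cannot be reconstructed), each uniformly random sample check passes with probability below 1/2, so the chance that all $k$ independent checks pass falls off as $2^{-k}$.

# If under 50% of the extended data is available, each random sample check
# passes with probability < 1/2, so k checks all passing has probability < 2**-k.
# k = 16 corresponds to the 12 fast + 4 slow indices each validator samples.
for k in (8, 16, 30):
    print(f"{k} sample checks -> false acceptance probability < {2.0 ** -k:.2e}")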

Kate commitments

See https://dankradfeist.de/ethereum/2020/06/16/kate-polynomial-commitments.html for an introduction to Kate commitments.

A Kate commitment is a fixed-size elliptic-curve-pairing-based commitment $com(P)$ to a polynomial $P$, which has the property that for any evaluation point $z$ where $P(z) = a$ for some $a$, you can provide an opening proof, another elliptic curve point $Q$, such that a verifier given $com(P)$ and $Q$ can be convinced that $P(z)$ actually does equal $a$.

This is very useful technology for us, because it means that if we generate the commitment for the polynomial and use that to represent the data, and we generate openings for each of the 2N outputs, then the openings become "self-verifying". As long as the header is well-formatted, there is no way for a shard data block to be invalid. It can only be unavailable. This key property allows us to completely avoid the need for any fraud proofs, eliminating a large class of complexity and attacks.
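To make the commit / open / verify cycle concrete, here is a minimal, illustration-only sketch over BLS12-381 using the pure-Python py_ecc library (its bls12_381 module; slow, but enough to show the mechanics). The trusted-setup secret s is generated locally purely for the demo (in the real protocol the powers of s come from a trusted setup ceremony and s itself is known to nobody), and all function names here are illustrative rather than spec names.

import random
from py_ecc.bls12_381 import G1, G2, add, multiply, neg, pairing, curve_order

def mock_setup(max_degree):
    # Illustration only: s is "toxic waste" that a real ceremony destroys.
    s = random.randrange(2, curve_order)
    setup_g1 = [multiply(G1, pow(s, i, curve_order)) for i in range(max_degree + 1)]
    setup_g2 = [G2, multiply(G2, s)]            # [1]_2 and [s]_2
    return setup_g1, setup_g2

def poly_eval(coeffs, x):
    return sum(c * pow(x, i, curve_order) for i, c in enumerate(coeffs)) % curve_order

def commit(coeffs, setup_g1):
    # com(P) = sum(coeffs[i] * [s^i]_1)
    out = None                                  # point at infinity
    for c, pt in zip(coeffs, setup_g1):
        out = add(out, multiply(pt, c))
    return out

def open_proof(coeffs, z, setup_g1):
    # Q(x) = (P(x) - P(z)) / (x - z), by synthetic division; the proof is com(Q)
    q, carry = [0] * (len(coeffs) - 1), 0
    for i in reversed(range(1, len(coeffs))):
        carry = (coeffs[i] + z * carry) % curve_order
        q[i - 1] = carry
    return commit(q, setup_g1)

def verify(commitment, z, a, proof, setup_g2):
    # Check e(com(Q), [s - z]_2) == e(com(P) - [a]_1, [1]_2)
    s_minus_z = add(setup_g2[1], neg(multiply(G2, z)))
    c_minus_a = add(commitment, neg(multiply(G1, a)))
    return pairing(s_minus_z, proof) == pairing(setup_g2[0], c_minus_a)

coeffs = [3, 1, 4, 1, 5]                        # P(x) = 3 + x + 4x^2 + x^3 + 5x^4
setup_g1, setup_g2 = mock_setup(len(coeffs) - 1)
C = commit(coeffs, setup_g1)
z = 7
a = poly_eval(coeffs, z)
W = open_proof(coeffs, z, setup_g1)
assert verify(C, z, a, W, setup_g2)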

Subnets

Unlike in eth1, we can no longer simply publish all shard data to the same shared p2p network, because a node on that network would be forced to download all of that data, negating the benefits of sharding! With the above parameters, the amount of data broadcast would be 256 kB per block (on average, assuming an EIP-1559-like mechanism) * 64 shards / 12 seconds per slot = 1.33 MB/sec (~11 Mbps), plus P2P overhead, and it is expected to grow over time with more shards and a larger block size.
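Spelled out, the arithmetic behind that estimate (values from the parameters table; 256 kB is the assumed average block size under an EIP-1559-style mechanism):

AVERAGE_BLOCK_BYTES = 256 * 1024
SHARD_COUNT = 64
SECONDS_PER_SLOT = 12

bytes_per_second = AVERAGE_BLOCK_BYTES * SHARD_COUNT / SECONDS_PER_SLOT
print(bytes_per_second / 2**20, "MB/sec")     # ~1.33 MB/sec
print(bytes_per_second * 8 / 10**6, "Mbps")   # ~11 Mbps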

Instead, we use a subnet architecture, where there are multiple p2p networks and each node is only usually on a few of them at a time. We have three kinds of subnets:

  • Shard block subnets (or horizontal subnets)
  • Sample index subnets (or vertical subnets)
  • The ShardDataHeader subnet (or the global subnet)

When a shard data block is published, the header is published on the global subnet, and the block is published on the horizontal subnet corresponding to the shard ID. For each $i$ in [0 ... SAMPLES_PER_BLOCK - 1], the $i$'th set of evaluations (evaluations [POINTS_PER_SAMPLE * i ... POINTS_PER_SAMPLE * (i+1) - 1]) is published on vertical subnet $i$.
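In code, the split of an extended block's evaluations into samples (and their vertical subnet assignment) looks like this sketch:

POINTS_PER_SAMPLE = 8
SAMPLES_PER_BLOCK = 4096

def sample_for_vertical_subnet(extended_evaluations, i):
    # The i'th sample, published on vertical subnet i, is the i'th group of
    # POINTS_PER_SAMPLE consecutive evaluations of the extended block.
    assert 0 <= i < SAMPLES_PER_BLOCK
    return extended_evaluations[POINTS_PER_SAMPLE * i : POINTS_PER_SAMPLE * (i + 1)]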


Figure 1. (Note: a possible extension is to shuffle which vertical subnet corresponds to which sample index in each slot, so as to balance the load of the different shards in the case of blocks of different sizes)


In practice, there would be 2048 horizontal subnets instead of 64, to allow one horizontal subnet per (shard, slot) combo in each epoch. This is done to ensure that each validator can join a single subnet where they will only receive the block corresponding to the committee that they were assigned to.

Each validator must join the following subnets (a code sketch of this membership rule follows the list):

  • The global subnet
  • The horizontal subnet corresponding to the (shard, slot) combo (ie. the committee) that they were assigned to
  • The vertical subnets corresponding to indices that they are assigned to (each validator computes this for themselves using a private seed)
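To make the membership rule concrete, here is a minimal sketch. The horizontal subnet numbering (shard * 32 + slot % 32, giving the 2048 subnets mentioned above) is a hypothetical illustration, and the vertical indices are the output of sample_indices from the next section.

SLOTS_PER_EPOCH = 32
SHARD_COUNT = 64          # 64 shards * 32 slots per epoch = 2048 horizontal subnets

def subnets_for_validator(assigned_shard: int, slot: int, vertical_indices: list) -> dict:
    # vertical_indices = sample_indices(secret_seed, public_seed, slot)
    assert 0 <= assigned_shard < SHARD_COUNT
    return {
        'global': True,   # every validator is on the ShardDataHeader subnet
        'horizontal_subnet': assigned_shard * SLOTS_PER_EPOCH + slot % SLOTS_PER_EPOCH,
        'vertical_subnets': vertical_indices,
    }

print(subnets_for_validator(5, 123, [17, 3000, 42, 999]))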

Index calculation algorithm

sample_indices returns the vertical subnets that a validator should make sure to be in during a given slot, based on their private seed (plus a public seed for the slow indices). The code below can be run self-contained.

from hashlib import sha256
from typing import List, Tuple

def hash(x: bytes) -> bytes:
    return sha256(x).digest()

SAMPLES_PER_BLOCK = 4096
FAST_SHUFFLING_SAMPLES = 12
SLOW_SHUFFLING_SAMPLES = 4
SLOW_SHUFFLING_PERIOD = 4096
OFFSET = 2**24
bytes32 = bytes

def get_delta(seed: bytes32, adjusted_slot: int, positions: int) -> Tuple[int, int]:
    # TODO: consider replacing with an MPC-friendly function like Legendre
    # Derive one (position, new_index) update from the seed and the slot.
    sub_seed = hash(
        seed +
        adjusted_slot.to_bytes(32, 'little')
    )
    position = int.from_bytes(sub_seed[:8], 'little') % positions
    new_index = int.from_bytes(sub_seed[8:16], 'little') % SAMPLES_PER_BLOCK
    return (position, new_index)

def sample_helper(seed: bytes32, slot: int, positions: int) -> List[int]:
    # Scan from the current slot backwards, applying at most one update per
    # slot, until every position is filled. The first update seen for each
    # position (ie. the most recent one) sticks, so exactly one position gets
    # a fresh value each slot.
    output = [None] * positions
    adjusted_slot = slot + OFFSET
    while None in output:
        delta_position, delta_new_index = get_delta(seed, adjusted_slot, positions)
        if output[delta_position] is None:
            output[delta_position] = delta_new_index
        adjusted_slot -= 1
    return output

def sample_indices(secret_seed: bytes32,
                   public_seed: bytes32,
                   slot: int) -> List[int]:
    # Fast indices reshuffle once per slot (based on the validator's private
    # seed); slow indices reshuffle once per SLOW_SHUFFLING_PERIOD slots
    # (based on a public seed; `offset` staggers when the period rolls over).
    fast_indices = sample_helper(secret_seed, slot, FAST_SHUFFLING_SAMPLES)
    offset = int.from_bytes(public_seed[:4], 'little')
    period = (slot + offset) // SLOW_SHUFFLING_PERIOD
    slow_indices = sample_helper(public_seed, period, SLOW_SHUFFLING_SAMPLES)
    return fast_indices + slow_indices

The intended behavior is that there are two types of indices: fast-reshuffling indices and slow-reshuffling indices. One fast-reshuffling index changes every slot; one slow-reshuffling index changes every SLOW_SHUFFLING_PERIOD slots. The slow-reshuffling indices are chosen from a public seed; this allows easy discoverability of nodes that are guaranteed to be on particular subnets.
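As a quick check of this behavior (assuming the definitions above have been run), comparing consecutive slots shows that typically exactly one of the 16 returned indices changes per slot, and it is one of the 12 fast ones except when a slow period boundary is crossed:

import os

secret_seed, public_seed = os.urandom(32), os.urandom(32)
prev = sample_indices(secret_seed, public_seed, 1000)
for slot in range(1001, 1006):
    cur = sample_indices(secret_seed, public_seed, slot)
    changed = [i for i in range(len(cur)) if cur[i] != prev[i]]
    print(slot, "changed positions:", changed)   # typically one position, among the first 12
    prev = cur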

ShardDataCommitment structure

class ShardDataCommitment(Container):
    commitment: BLSPubkey
    length: uint64
    length_proof: BLSPubkey
    # Signatures and signer-related data TBD

Publishing

Note that while we can reasonably expect the publisher of a shard data block to be on the horizontal subnet of the desired shard, they are not going to be on all the vertical subnets. To solve this, we use a two-step broadcasting mechanism:

  • Every block has an associated committee whose members are all on that horizontal subnet. The committee members (~128) receive the block.
  • Each of the ~128 committee members asks each of their peers (conservatively ~10 each, so ~128 * 10 peers in total) which vertical subnets they are in (16 indices per peer, so ~128 * 10 * 16 samples in total), and broadcasts the appropriate samples to each peer.
  • Each peer that receives a valid sample broadcasts it to their subnet.

This broadcasts an expected 20480 samples. Accounting for the inefficiency of double-covering an index, you need on average ~69.3% of the extended block's sample count (about 2839 random samples) to cover the 2048 distinct samples required for reconstruction, so this is roughly a 7x margin in expectation. This provides a large safety margin and ensures that the block actually does get broadcasted.
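A quick sanity check of that coverage arithmetic (expected number of uniformly random samples needed to hit 2048 distinct indices out of 4096, by the standard coupon-collector estimate):

import math

SAMPLES_PER_BLOCK = 4096                      # samples in an extended block
NEEDED_DISTINCT = SAMPLES_PER_BLOCK // 2      # any N of 2N samples reconstruct
BROADCAST = 128 * 10 * 16                     # committee * peers * indices = 20480

# Expected random draws (with replacement) to cover half of the 4096 indices:
expected_draws = SAMPLES_PER_BLOCK * math.log(SAMPLES_PER_BLOCK / NEEDED_DISTINCT)
print(round(expected_draws))                  # ~2839, ie. ~69.3% of 4096
print(round(BROADCAST / expected_draws, 1))   # safety margin, ~7x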

Technical details: Fast Fourier Transform

To make the erasure coding efficient, we find a root of unity with order $2N$ = POINTS_PER_SAMPLE * SAMPLES_PER_BLOCK; that is, we find an $\omega$ such that $\omega^{2N} = 1$ (in modular arithmetic modulo the MODULUS in the config above) but $\omega^k \ne 1$ for any $1 \le k < 2N$.

We use $\omega = 5^{(MODULUS - 1)/32768}$:

w = 30195699792882346185164345110260439085017223719129789169349923251189180189908
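A quick way to check this construction (assuming MODULUS is the BLS12-381 scalar field order, available as curve_order in py_ecc):

from py_ecc.bls12_381 import curve_order as MODULUS

ORDER = 8 * 4096                      # 2N = POINTS_PER_SAMPLE * SAMPLES_PER_BLOCK = 32768
assert (MODULUS - 1) % ORDER == 0     # the field has a multiplicative subgroup of this size
omega = pow(5, (MODULUS - 1) // ORDER, MODULUS)
assert pow(omega, ORDER, MODULUS) == 1          # omega^(2N) = 1 by Fermat's little theorem
# If 5 generates the full 2-adic part of the multiplicative group, omega has
# order exactly 2N and should reproduce the value of w quoted above.
print(omega)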

Instead of using the integers $0 \ldots 2N-1$ as our evaluation points, we use powers of $\omega$ in order of reverse bit representation:

def reverse_bit_order(bits):
    if bits == 0:
        return [0]
    return (
        [x*2 for x in reverse_bit_order(bits-1)] +
        [x*2+1 for x in reverse_bit_order(bits-1)] 
    )

Here are example outputs for small input values (the value actually used as input to generate the exponents is log2(32768) = 15):

>>> reverse_bit_order(1)
[0, 1]
>>> reverse_bit_order(2)
[0, 2, 1, 3]
>>> reverse_bit_order(3)
[0, 4, 2, 6, 1, 5, 3, 7]
>>> reverse_bit_order(4)
[0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15]

Reverse bit order has a neat property: any power-of-two-aligned subset (another way to think of this: any subset that would form a clean subtree if the list were put into a Merkle tree) is a subgroup (or a coset of a subgroup) of the multiplicative group of powers of $\omega$.
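A small numerical check of this property, using a toy field (modulus 17, where 3 has multiplicative order 16) in place of the real MODULUS and $\omega$:

MOD, OMEGA = 17, 3                    # toy example: 3 has order 16 mod 17

def reverse_bit_order(bits):          # same function as above
    if bits == 0:
        return [0]
    return (
        [x * 2 for x in reverse_bit_order(bits - 1)] +
        [x * 2 + 1 for x in reverse_bit_order(bits - 1)]
    )

points = [pow(OMEGA, e, MOD) for e in reverse_bit_order(4)]

# Take a power-of-two-aligned chunk, e.g. the 4 points at positions [4:8] ...
chunk = points[4:8]
# ... and compare it to a coset of the order-4 subgroup generated by OMEGA**4.
subgroup = [pow(OMEGA, 4 * i, MOD) for i in range(4)]
assert sorted(chunk) == sorted(chunk[0] * h % MOD for h in subgroup)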

This makes it very mathematically easy and clean to:

  • Use a fast Fourier transform to extend the original data ($N$ points) into the extended data ($2N$ points). In particular, note that the original data, which goes in the first half of the evaluation points, neatly falls into even powers of $\omega$ (ie. powers of $\omega^2$), which form a subgroup, and so an FFT can be applied directly.
  • Prove that any particular contiguous sub-range is full of zeroes. For this, we compute a polynomial $Z(x) = (x - \omega^{rev(i)}) \cdots (x - \omega^{rev(j)})$, and provide a commitment to $Q(x) = P(x) / Z(x)$. The verifier can generate the G2-representation of $Z$ and do a pairing check to verify $e(com(Q), com(Z)) = e(com(P), com(1))$. Note that the powers of $\omega$ in the range can be viewed as at most $2\log_2(j-i)$ subtrees (only $\log_2(j-i)$ if the range ends at a power-of-two boundary, eg. $j = N$), making it easy to compute $Q(x)$. $Z(x)$ too will be a product of a few $(x^{2^k} - \omega^{2^k r})$ terms, making it and the commitment to it easy to calculate.

The latter property is needed because for EIP 1559 purposes, we need to enforce the size of a shard block in such a way that it can be verified from the shard header. For this, we use the application-specific SNARK summarized above to prove that the data outside the range is zeroes, so no one needs to try to propagate or download those coordinates.

Serialization

One important point that must be stressed in this data availability scheme is that the shard block data should be viewed as natively being a list of integers in the range [0...MODULUS-1]. This is a departure from eth1 and indeed almost all other blockchains (except perhaps iota, which uses ternary?), which tend to view data as natively being bytes.

That is, if for example MODULUS = 13, the set [5, 10, 2, 7, 12, 4, 9, 1] would be admissible data, but [1, 8, 15, 6, 13, 4, 11, 2] is not: 13 and 15 are invalid elements. This should be viewed philosophically similarly to how eg. 0x2babd4ba18af is valid hexadecimal, but 0x7d394a1bdghi is not. The data availability check that we use is natively built around modular arithmetic, and so the data that it verifies availability of is in the format of a list of integers in the range [0...MODULUS-1].

The main challenge here is that serialization and networking use bytes, and so there is a disconnect of formats between serialization/networking and the data availability scheme (which uses points): the 32-byte blob 0xffffffff...ffffffff is valid bytes, but it's not a valid point, because points must be lower than the modulus (which is a prime number somewhere between $2^{254}$ and $2^{255}$).

To deal with this disconnect, we encode every point as 32 bytes (in little endian), but we add to every object that contains a point a validity condition that requires the point to be less than the modulus. Objects that contain points that are out of range are effectively ignored and treated as nonexistent, as though they had never been received by the client.

Note that at the application layer, applications usually prefer to use bytes. How to convert a list of points into bytes is an application-layer choice; a natural default is to just take $x \bmod 256^{31}$, looking only at the bottom 31 bytes of every point.
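As a sketch of the resulting encode / validate / convert rules (again assuming MODULUS is the BLS12-381 scalar field order, as in the earlier sections):

from py_ecc.bls12_381 import curve_order as MODULUS

def point_to_bytes32(x: int) -> bytes:
    # Points are serialized as 32 bytes, little endian.
    assert 0 <= x < MODULUS
    return x.to_bytes(32, 'little')

def bytes32_to_point(b: bytes) -> int:
    # Validity condition: the encoded value must be less than the modulus;
    # objects containing out-of-range points are treated as nonexistent.
    x = int.from_bytes(b, 'little')
    if x >= MODULUS:
        raise ValueError("not a valid point: value >= MODULUS")
    return x

def point_to_app_bytes(x: int) -> bytes:
    # A natural application-layer default: keep only the bottom 31 bytes,
    # ie. x mod 256**31.
    return (x % 256**31).to_bytes(31, 'little')

print(bytes32_to_point(point_to_bytes32(12345)))      # round-trips fine
try:
    bytes32_to_point(b'\xff' * 32)                    # valid bytes, not a valid point
except ValueError as err:
    print(err)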