This is a draft version of the refactored Sharding DAS specification:
https://github.com/ethereum/eth2.0-specs/blob/dev/specs/das/das-core.md
The goals of this refactoring
Disclaimer: please consider the Python code below as pseudo-code: it is not 100% correct and needs to be revised later.
An implementation of this spec: https://github.com/Nashatyrev/artemis/tree/8a5cfe60978c82edbe237ffeca5edeaf51f9dfb2/ethereum/spec/src/main/java/tech/pegasys/teku/spec/logic/common/das/baseline
All operations related to erasure coding and Kate commitments are based on polynomials over the field $\mathbb{F}$ (the finite field of integers modulo `MODULUS`):

$$\mathbb{F} = \mathbb{Z}/q\mathbb{Z}, \qquad q = \texttt{MODULUS}$$

All polynomial elements $x_i$, $y_i$ and $c_i$ are from $\mathbb{F}$. In the spec the `Point` class represents an $\mathbb{F}$ element.
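For illustration, a minimal sketch of the `Point` arithmetic, assuming `MODULUS` is the BLS12-381 scalar field order (the helper names below are not from the spec):

```python
# Assumption: MODULUS is the BLS12-381 scalar field order used by the spec.
MODULUS = 52435875175126190479447740508185965837690552500527637822603658699938581184513

# A Point is just a field element: an integer taken modulo MODULUS.
Point = int

def field_add(a: Point, b: Point) -> Point:
    return (a + b) % MODULUS

def field_mul(a: Point, b: Point) -> Point:
    return (a * b) % MODULUS

def field_inv(a: Point) -> Point:
    # Modular inverse via Fermat's little theorem (MODULUS is prime)
    return pow(a, MODULUS - 2, MODULUS)
```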
Let's choose and fix distinct arbitrary values $x_0, x_1, \dots$. For now let's assume they are arbitrary; a smarter set will be chosen later. This set of values is fixed for any datapoints ever to be encoded, thus it should be large enough: it must contain at least `MAX_EXT_POINTS` elements (the maximum number of extended data points).
Given datapoints $d_0, d_1, \dots, d_{n-1}$, let's find the coefficients $c_0, c_1, \dots, c_{n-1}$ such that the polynomial evaluations are equal to our datapoints at the points $x_0, \dots, x_{n-1}$. I.e. $c_0 + c_1 x_i + c_2 x_i^2 + \dots + c_{n-1} x_i^{n-1} = d_i$, or shortly $P(x_i) = d_i$. See the `poly_from_evals()` function.

Thus we get the specific polynomial $P(X)$ corresponding to our original dataset.
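As a toy example (not from the spec): with $n = 2$, points $x_0 = 1, x_1 = 2$ and datapoints $d_0 = 3, d_1 = 5$, solving $c_0 + c_1 x_i = d_i$ gives $c_0 = 1, c_1 = 2$, so

$$P(X) = 1 + 2X, \qquad P(1) = 3, \quad P(2) = 5.$$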
The `Poly` class represents a polynomial, which is essentially just a vector of its coefficients plus a set of standard operations.
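A minimal sketch of such a class, assuming plain `int` coefficients modulo `MODULUS` (the exact method set is an assumption):

```python
class Poly:
    def __init__(self, coeffs):
        # coeffs[i] is the coefficient of X^i
        self.coeffs = [c % MODULUS for c in coeffs]

    def degree(self) -> int:
        return len(self.coeffs) - 1

    def eval(self, x: Point) -> Point:
        # Horner's rule evaluation
        result = 0
        for c in reversed(self.coeffs):
            result = (result * x + c) % MODULUS
        return result

    def add(self, other: 'Poly') -> 'Poly':
        n = max(len(self.coeffs), len(other.coeffs))
        a = self.coeffs + [0] * (n - len(self.coeffs))
        b = other.coeffs + [0] * (n - len(other.coeffs))
        return Poly([(u + v) % MODULUS for u, v in zip(a, b)])

    def mul(self, other: 'Poly') -> 'Poly':
        # Naive O(n^2) polynomial multiplication
        res = [0] * (len(self.coeffs) + len(other.coeffs) - 1)
        for i, u in enumerate(self.coeffs):
            for j, v in enumerate(other.coeffs):
                res[i + j] = (res[i + j] + u * v) % MODULUS
        return Poly(res)
```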
`eval_in_G1()` calculates the evaluation of a polynomial (the `poly` parameter) at the secret point $s$ mapped to $\mathbb{G}_1$: $[P(s)]_1$. Since $s$ is not known to anyone, we can't evaluate the polynomial directly. But we have `G1_SETUP` with the powers of $s$ mapped to $\mathbb{G}_1$: $[s^i]_1$. Thus we may evaluate the polynomial directly in $\mathbb{G}_1$:

$$[P(s)]_1 = \sum_{i} c_i \cdot [s^i]_1$$
The same applies to `eval_in_G2()` for the $\mathbb{G}_2$ group and the predefined setup points `G2_SETUP`.
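A sketch of `eval_in_G1()` on top of the `py_ecc` library (an assumption: the actual implementation may use different curve primitives):

```python
from py_ecc.bls12_381 import add, multiply, Z1  # Z1 is the G1 identity

def eval_in_G1(poly: Poly, g1_setup):
    # [P(s)]_1 = sum_i c_i * [s^i]_1: a linear combination of the setup
    # points, so the secret s itself is never needed.
    result = Z1
    for c, setup_point in zip(poly.coeffs, g1_setup):
        result = add(result, multiply(setup_point, c))
    return result
```

`eval_in_G2()` would be identical, with the `G2_SETUP` points and $\mathbb{G}_2$ operations.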
The `poly_from_evals()` function derives a polynomial from its evaluations `ys` at points `xs`. The baseline method for this calculation is Lagrange interpolation, which is implemented below. The complexity of the reference implementation is $O(n^2)$, which is not perfect. Luckily, when the points selection is made smartly (like below), a faster inverse FFT (Fast Fourier Transform) algorithm can be used.
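A sketch of the Lagrange approach, built on the `Poly` class above:

```python
def poly_from_evals(xs, ys) -> Poly:
    # Interpolate: result = sum_i y_i * L_i(X), where L_i is the Lagrange
    # basis polynomial with L_i(x_j) == 1 if i == j else 0.
    n = len(xs)
    result = Poly([0])
    for i in range(n):
        basis = Poly([1])
        denom = 1
        for j in range(n):
            if j == i:
                continue
            basis = basis.mul(Poly([-xs[j] % MODULUS, 1]))  # times (X - x_j)
            denom = denom * (xs[i] - xs[j]) % MODULUS
        scale = ys[i] * field_inv(denom) % MODULUS
        result = result.add(Poly([c * scale % MODULUS for c in basis.coeffs]))
    return result
```

E.g. `poly_from_evals([1, 2], [3, 5]).coeffs == [1, 2]`, matching the toy example above.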
A polynomial commitment could be thought of as a compact polynomial fingerprint, or hash.
A Kate polynomial commitment is an element of the group $\mathbb{G}_1$, and is essentially the evaluation of the polynomial at the secret point $s$ mapped to $\mathbb{G}_1$: $[P(s)]_1$. Since $s$ is not known to anyone, we can't evaluate the polynomial directly, but with the powers of $s$ mapped to $\mathbb{G}_1$ from `G1_SETUP` ($[s^i]_1$) we may evaluate it directly in $\mathbb{G}_1$:

$$C = [P(s)]_1 = \sum_{i} c_i \cdot [s^i]_1$$
A multi-proof is an element of the group $\mathbb{G}_1$ which proves (to a verifier who doesn't know the polynomial but has the polynomial commitment constructed above) that the polynomial evaluates to the values $y_k, \dots, y_{k+m-1}$ at the points $x_k, \dots, x_{k+m-1}$. I.e. $P(x_i) = y_i$ for $i \in [k, k+m)$.
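For reference, the standard Kate multi-point construction (the general scheme, not necessarily verbatim how this spec implements it): let $I(X)$ be the polynomial interpolating the $m$ points and $Z(X) = \prod_{i=k}^{k+m-1} (X - x_i)$ the zero polynomial over them. The proof is the commitment to the quotient

$$\pi = [q(s)]_1, \qquad q(X) = \frac{P(X) - I(X)}{Z(X)},$$

and the verifier checks the pairing equation $e(\pi, [Z(s)]_2) = e(C - [I(s)]_1, H)$, where $C$ is the commitment and $H$ is the $\mathbb{G}_2$ generator.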
`commit_to_poly()` calculates a commitment for a polynomial represented by its evaluation points `xs` and `ys`. Read here for more details.
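A minimal sketch of how `commit_to_poly()` could compose the helpers above (an assumption about its actual structure):

```python
def commit_to_poly(xs, ys):
    # Interpolate the polynomial through (xs, ys), then evaluate it at the
    # secret point inside G1: the result is the Kate commitment [P(s)]_1.
    poly = poly_from_evals(xs, ys)
    return eval_in_G1(poly, G1_SETUP)
```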
Please refer to this chapter for more details on Kate multiple-point proofs.
`commit_to_max_degree()` calculates the proof that the polynomial `poly` has degree not above `poly.degree()`.

The degree proof works as follows. For a polynomial $B(X)$ with degree $\le N$, the length proof is the commitment to the polynomial $B(X) \cdot X^{\texttt{MAX\_DEGREE} - N}$, where `MAX_DEGREE` is the maximum power of $s$ available in the setup. The goal is to ensure that a proof can only be constructed if $\deg(B) \le N$ (there are no hidden higher-order terms in the polynomial, which would thwart reconstruction).
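A sketch matching the description above (`MAX_DEGREE` and the coefficient-shift trick are assumptions about the concrete implementation):

```python
def commit_to_max_degree(poly: Poly, n: int):
    # Prepending zero coefficients multiplies the polynomial by X^k,
    # so this commits to B(X) * X^(MAX_DEGREE - n).
    shifted = Poly([0] * (MAX_DEGREE - n) + poly.coeffs)
    # G1_SETUP only contains powers up to s^MAX_DEGREE, hence this
    # commitment is computable only when deg(B) <= n.
    return eval_in_G1(shifted, G1_SETUP)
```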
Since we are free to choose any $x_i$, let's do it smartly to help optimized algorithms (Fast Fourier Transform, FK20) do their job. Thus let's choose and fix the following values:

$$x_i = \omega^{\mathrm{rev}(i)}$$

where `rev(i) = reverse_bit_order(i, MAX_EXT_POINTS)`, and $\omega$ is the root of unity (of order `MAX_EXT_POINTS`) defined by the constant `ROOT_OF_UNITY`.
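A sketch of this selection, assuming `MAX_EXT_POINTS` is a power of two and `ROOT_OF_UNITY` has exactly that order in the field:

```python
def reverse_bit_order(i: int, size: int) -> int:
    # Reverse the bits of i across log2(size) bit positions
    bits = size.bit_length() - 1
    return int('{:0{w}b}'.format(i, w=bits)[::-1], 2)

# x_i = w^rev(i) for every extended point index
XS = [pow(ROOT_OF_UNITY, reverse_bit_order(i, MAX_EXT_POINTS), MODULUS)
      for i in range(MAX_EXT_POINTS)]
```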
The extension of the original data is simply more evaluations of the polynomial $P$ at the points $x_n, \dots, x_{2n-1}$, i.e. $d_i = P(x_i)$ for $i \in [n, 2n)$.

Thus the full extended data is 2x bigger than the original and consists of the values $d_0, \dots, d_{2n-1}$. The polynomial can be reconstructed from any $n$ values of this set, and thus the whole extended data set can be recovered from just any $n$ values.
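A sketch of the extension step, using the helpers and the `XS` points above:

```python
def extend_data(data):
    # Interpolate P through the original n datapoints at x_0..x_{n-1},
    # then evaluate it at x_n..x_{2n-1} to obtain the extension.
    n = len(data)
    poly = poly_from_evals(XS[:n], data)
    return data + [poly.eval(XS[i]) for i in range(n, 2 * n)]
```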
`eval_multi()` evaluates the polynomial `poly` at a number of consecutive data points. This reference implementation is suboptimal (it has quadratic complexity); it's worth considering a faster FFT (Fast Fourier Transform) algorithm. Other methods could also be implemented more efficiently.
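A sketch of the quadratic baseline (the signature is an assumption):

```python
def eval_multi(poly: Poly, offset: int, count: int):
    # Evaluate the polynomial independently at `count` consecutive points;
    # an FFT over the roots of unity would do this in O(n log n) instead.
    return [poly.eval(XS[offset + i]) for i in range(count)]
```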
`construct_proofs()` creates proofs for all samples (data chunks of length `POINTS_PER_SAMPLE`) in a batch way. This reference implementation has suboptimal complexity $O(n^2)$. Assuming the specific points selection above, the better-optimized FK20 (TODO link) algorithm can be used here.
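A naive sketch of the batch structure, assuming a hypothetical per-sample helper `construct_proof()`:

```python
def construct_proofs(poly: Poly, extended_data):
    proofs = []
    for k in range(0, len(extended_data), POINTS_PER_SAMPLE):
        xs = XS[k:k + POINTS_PER_SAMPLE]
        ys = extended_data[k:k + POINTS_PER_SAMPLE]
        proofs.append(construct_proof(poly, xs, ys))  # hypothetical helper
    return proofs
```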