See https://github.com/crate-crypto/go-kzg-4844 and https://github.com/ethereum/c-kzg-4844 for KZG libraries.
The notes below are very outdated.
EIP-4844 introduces "blob transactions", requiring changes in both execution and consensus layers.
See meta-spec for all resources.
This doc is meant as an implementation guide, clarifying "implementation details" of the different specs.
The initial write-up, by @protolambda, focuses on KZG performance.
Learn what KZG is here: "KZG polynomial commitments", by Dankrad Feist.
In EIP-4844 we use this to commit to the "Blob of data". This is slower to compute than just hashing the data, but good for other things:
We hide the complexity from the transaction itself by hashing the KZG commitment: this way we can version and abstract it away,
and change the hash pre-image to a different type of commitment (or with different parameters) later on.
versioned_hash = prefix_byte ++ hash(compress_serialize_g1(commit_to_poly(blob)))[1:]
The costly part is commit_to_poly, which is the evaluation of the polynomial at a secret point s. We compute this with a linear combination of KZG_SETUP_LAGRANGE and the blob.
Q: Do I need a whole library to support blobs?
A: No! KZG is super simple. Just implement:
The bare-bones version: geth datablobs prototype crypto/kzg/kzg.go.
One feature specific to EIP-4844 that the go-kzg libraries do not have (yet): batch-verification. More about that in the Optimizations section.
Q: Are there libraries for the lazy buidler? Maybe I build fancy stuff on top of KZG, prototype sharding?
A: Yes! Commitments, proofs, optimized multi-proof generation and data-extension and data-recovery for sampling are all there already.
- go-kzg by Proto, based on python research code from Dankrad.
- kilic/bls12-381 for BLS. Go-ethereum adopted an older copy of that for previous BLS precompile work. kilic/bls12-381 is nice because it uses Go assembler for its super simple toolchain and portability (beware though, kilic BLS is not written for 32 bit).
- herumi/bls-eth-go-binary: a BLS library Prysm used early on. It's relatively slow, but there for comparison.
- c-kzg by Ben Edgington, based on go-kzg. It uses BLST, a highly optimized BLS library used by all Eth2 clients today.

The most essential benchmark is "time to go from blob to commitment", a.k.a. a G1 linear combination, with maybe some decoding/encoding work.
In EIP-4844 we use 4096 field elements per blob, so that's what we benchmark. log2(4096) = 12; the benchmarks sometimes run at different "scales", and scale 12 is the one that matters here.
Using Go 1.17. Collected outputs:
Notes:
40-60 ms for computing a KZG blob commitment is not great: a transaction can carry multiple blobs (MAX_BLOBS_PER_TX) and a block up to 16 blobs (MAX_BLOBS_PER_BLOCK). Even if all of it is valid, that is a lot of work.
We can optimize by doing two things:
Description by George:
Batch verification works with multi-scalar-multiplication (MSM):
The MSM would look like this (for three blobs with two field elements each):

r0*(b0_0*L0 + b0_1*L1) + r1*(b1_0*L0 + b1_1*L1) + r2*(b2_0*L0 + b2_1*L1)

which we would need to check against the linear combination of commitments:

r0*C0 + r1*C1 + r2*C2
In the above:
- r are the random scalars of the linear combination
- b0 is the first blob, b0_0 is the first element of the first blob
- L are the elements of the KZG_SETUP_LAGRANGE
- C are the commitments

By regrouping the above equation around the L points we can reduce the length of the MSM further (down to just n scalar multiplications) by making it look like this:

L0*(r0*b0_0 + r1*b1_0 + r2*b2_0) + L1*(r0*b0_1 + r1*b1_1 + r2*b2_1)
Essentially we multiply each blob with a secret random scalar, as well as the commitment, so we can sum them up safely before verifying.
And so it reduces to two bls.LinCombG1 calls: one combining the KZG_SETUP_LAGRANGE points with the aggregated blob scalars, and one combining the commitments with the random scalars.
Go 1.17, kilic BLS, kzg.VerifyBlobs in the geth prototype:
I.e. from 57 to 71 milliseconds, but now verifying 16x as many blobs! And from 57 to 114 for 128x as many. Batching is important.
To aggregate with constant memory, the blob elements and commitments can be aggregated on the fly,
i.e. we add the scaled blobs and commitments to running aggregates instead of aggregating at the very end.
This is a trade-off between memory and being able to optimize the linear combinations.
Ethereum fetches txs with the devp2p protocol, many at a time.
However, go-ethereum does not score down peers that return bad transactions
(as far as I understood from quick reverse engineering), and it verifies the transactions single-threaded, one at a time.
Assuming the verification of regular txs is cheap, that's not really a problem.
For blob transactions we have to make some changes though:
To aggregate many transactions without increasing memory by too much, aggregating on the fly should help
(by parsing all blobs and commitments into deserialized []Fr and G1Point instances we double memory).
Staged validation, and the blobs verification stage, have been implemented in go-ethereum here.
Peer-scoring was not changed; I need feedback on what is already there to integrate with.
Similar to the transaction-pool in the execution-layer, the consensus-layer also has to verify bundles of commitments and blob-data, known as the BlobsSidecar.
During gossip of sidecars however, only signed sidecars are distributed:
we only have 1 beacon block per 12 seconds, and only 1 sidecar of blobs.
So even if costly, the proposer signature check avoids DoS nicely.
However, because blobs are available for longer times, but the sidecar signatures are not
(and will definitely not be with sharding, when blobs are recovered from samples),
there are still other places to DoS: the request-response (a.k.a. sync protocol).
In that case we want to batch-verify all blobs in a single sidecar.
And when requesting multiple sidecars from the same peer, using e.g. blobs_sidecars_by_range v1,
we can aggregate all (or e.g. 20 at a time) received sidecars and batch-verify them all at once!
Again, when batch-verifying a lot of data (100 sidecars = 200 MiB worst-case) it makes sense to
aggregate on the fly to keep constant memory, and write the blobs to some temporary disk space until fully verified.