This blog is aimed to give the reader a developer's view to KZG polynomial commitment schemes as it's a more optimal approach than normal Merkle Proofs.
But before diving right in, we need to understand what is a Blob on Ethereum. Well, Blobs can be defined as large amounts of data, that are about 4096 x 32 bytes (around 128 KB)
.
I hope everyone gets a better understanding after reading this :)
Our job is to : 1)
commit to the data succinctly, basically validating the existence and immutability of the data, and 2)
prove values in the data set and a given position, let's say transaction number 27 on blob number 10. Something like that.
Now, we can definitely do this using a Merkle Tree, and we know that the root of the Merkle Tree is 32 bytes, that's what we need to perform a succinct commitment.
In addition to that, we can also construct a Merkle Proof, saying that the particular leaf of the Merkle Tree is so and so.
But after constructing the proof, the proof turns out to be quite large, around 32 log (width of the Merkle Tree, which is 4096 bytes)
, at least 384 bytes.
This size issue is properly dealt with using the KZG (Kate, Zaverucha and Goldberg) commitment scheme. Using which we can perform both commitments and proof constructions. It takes 48 bytes for a commitment
, which in KZG, is an elliptic curve point, which is further hashed down to 32 bytes within the protocol
. That's because the Ethereum Virtual Machine is inherently much performing with 32 byte data.
What's even more noteworthy is the proof construction (which was otherwise 384 bytes using Merkle Proofs), as it is the same 48 bytes
using the KZG Polynomial Commitment Scheme! We can prove any element within the data set (blob) with just 48 bytes
, and we can again use 48 bytes to prove for any number of elements in the data set. Hence, we can sort of link up and aggregate these proofs and create a multiproof.
So let's just summarize our agenda here…
Coming to the point…
"Think of it as a special purpose hash function" - Vitalik
KZG does 2 important things here:
S
, that nobody knows. This point S
is generated by Multi-Party Compuation (Trusted Setup Ceremony).Probability(anyone stumbling across the correct secret point S, and being able to evaluate the polynomial properly)
-> infinitesimal, hence, very very secure.This diagram effectively captures what the blob_to_kzg()
function does, aka, a commitment. Few points to note:
S
.SHA2(G1)
, in order to bring it down to 32 bytes, and who else is 32 bytes? Storage slots in EVM, hence that greatly enhances EVM compatibility.Once, we're done with the proving part now we come the verification part. In zk crypto, we often call this an opening, that's because now we open the polynomial and various locations as per the key is concerned and lookup the values.
Here, X
refers to the location in the polynomial where we want to prove the value at, and Y
is the value we want it to be equal to. So effectively, we test P(X) = Y
at a given point in the elliptic curve, which is, in this case BLS12-381.
blob_to_kzg()
functionP(X) = Y
at a single pointverify_kzg_proof()
functionSHA2(G1)
point, which was earlier reduced from 48 bytes to 32 bytes.blob_to_kzg()
function, inorder to map the KZG commitments to their 32 bytes EVM compatible hashed values.BLS12-381
library for bilinear pairings, if not present.blob_to_kzg()
kzg_to_versioned_hash()
(Gossip is the process of brodcasting a newly mined block to the individual peers participating in the Ethereum network)
:
blob_to_kzg()
verify_blob_sidecar()
function.In the next blog we'll cover another important stepping stone to EIP-4844, and that is Danksharding, now that we have an overview what goes in KZG commitments.