This blog aims to give readers a developer's view of the KZG polynomial commitment scheme, which is a more efficient approach than ordinary Merkle proofs. But before diving right in, we need to understand what a Blob on Ethereum is. Blobs are large chunks of data, each about `4096 x 32 bytes (around 128 KB)`. I hope everyone gets a better understanding after reading this :)

## Optimisation

Our job is to:

`1)` commit to the data succinctly, basically validating the existence and immutability of the data, and

`2)` prove the value in the data set at a given position, let's say transaction number 27 in blob number 10. Something like that.

## Previously, before EIP-4844

Now, we can definitely do this using a Merkle Tree. The root of a Merkle Tree is 32 bytes, which is exactly what we need for a succinct commitment. In addition to that, we can also construct a Merkle Proof, showing that a particular leaf of the Merkle Tree has a particular value. But the resulting proof turns out to be quite large: `32 x log2(4096)` bytes, where 4096 is the number of leaves in the tree, which comes to 384 bytes.

## Now...

This size issue is dealt with by the KZG (Kate, Zaverucha and Goldberg) commitment scheme, which we can use for both commitments and proof construction. It takes `48 bytes for a commitment`, which in KZG is an elliptic curve point, and which is `further hashed down to 32 bytes within the protocol`. That's because the Ethereum Virtual Machine natively works with 32-byte words.

What's even more noteworthy is the proof (which was 384 bytes using Merkle Proofs), as it is the `same 48 bytes` using the KZG Polynomial Commitment Scheme! We can `prove any element within the data set (blob) with just 48 bytes`, and a single 48-byte proof can also cover any number of elements in the data set. In other words, we can aggregate these proofs into a multiproof.
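To make the size comparison above concrete, here is a small Python sketch of the arithmetic. The constant and function names are made up for illustration. It assumes a binary Merkle tree with 32-byte hashes and one sibling hash per level.

```python
# Illustrative names only; assumes a binary Merkle tree with 32-byte hashes.
NUM_LEAVES = 4096      # 32-byte chunks in one blob (from the text above)
HASH_SIZE = 32         # bytes per Merkle node
KZG_PROOF_SIZE = 48    # bytes per KZG proof (one elliptic curve point)


def merkle_proof_size(num_leaves: int, hash_size: int = HASH_SIZE) -> int:
    """Bytes in one Merkle branch: one sibling hash per tree level."""
    depth = (num_leaves - 1).bit_length()  # ceil(log2(num_leaves)) without floats
    return depth * hash_size


merkle_bytes = merkle_proof_size(NUM_LEAVES)
print(f"Merkle proof: {merkle_bytes} bytes")
print(f"KZG proof:    {KZG_PROOF_SIZE} bytes")
print(f"KZG is {merkle_bytes // KZG_PROOF_SIZE}x smaller")
```

The Merkle branch grows with the logarithm of the number of leaves, while the KZG proof stays at 48 bytes no matter how large the data set is.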
So let's just summarize our agenda here...

![](https://hackmd.io/_uploads/SyVVPhzFh.png)

Coming to the point...

## What is the KZG Commitment?

**"Think of it as a special purpose hash function" - Vitalik**

KZG does 2 important things here:

- It treats our data as evaluations of a polynomial at specific points (roots of unity), and the blob is what defines this polynomial. Users can go between these evaluations and the polynomial's coefficients via fast Fourier transforms, which is mainly dealt with in Layer 2.
    - (treat it like a black-box :D )
- KZG succinctly commits to a specific polynomial, which basically means that the commitment is just an evaluation of the polynomial at a secret point `S` that nobody knows. This point `S` is generated by Multi-Party Computation (the Trusted Setup Ceremony).
    - `Probability(anyone stumbling across the correct secret point S, and being able to evaluate the polynomial properly)` -> negligible, hence very, very secure.

![](https://hackmd.io/_uploads/SyuvP3MYh.png)

This diagram effectively captures what the `blob_to_kzg()` function does, aka, a commitment. A few points to note:

- In the first phase, KZG commits to the polynomial defined by the Blob by evaluating it at the unknown point `S`.
- The elliptic curve we're selecting here is the BLS12-381 curve, and with Ethereum's built-in support for this curve, implementing everything up to the G1 point becomes a matter of 3 lines of code.
- The final point G1 then undergoes a `SHA2(G1)` to bring it down to 32 bytes. And what else is 32 bytes? Storage slots in the EVM, so this greatly improves EVM compatibility.

Once we're done with the commitment part, we come to the verification part. In zk crypto, we often call this an **opening**, because we now open the polynomial at various locations and look up the values there.
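The algebra behind an opening can be sketched in a few lines of Python. This is a toy under loud assumptions: it uses plain modular arithmetic with a small prime instead of BLS12-381 points and pairings, it handles the secret `S` directly (which nobody can do in real KZG), and every name in it is hypothetical. It only shows the identity that verification relies on: if `P(z) = y`, then `P(X) - y` is divisible by `(X - z)`, so `P(S) - y = Q(S) * (S - z)` for the quotient polynomial `Q`.

```python
# Toy illustration only: no elliptic curves, no pairings, not secure.
P_MOD = 2**31 - 1  # small prime standing in for the BLS12-381 scalar field


def evaluate(coeffs, x):
    """Evaluate a polynomial (lowest-degree coefficient first) at x, mod P_MOD."""
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % P_MOD
    return result


def quotient(coeffs, z):
    """Return Q(X) = (P(X) - P(z)) / (X - z) via synthetic division."""
    out, carry = [], 0
    for c in reversed(coeffs):
        carry = (carry * z + c) % P_MOD
        out.append(carry)
    out.pop()  # the final value is the remainder P(z), not part of Q
    return list(reversed(out))


def verify_opening(commitment, z, y, proof, s):
    """Check P(s) - y == Q(s) * (s - z); real KZG checks this with a pairing."""
    return (commitment - y) % P_MOD == (proof * (s - z)) % P_MOD


poly = [5, 0, 3, 7]       # P(X) = 5 + 3X^2 + 7X^3
secret_s = 123456789      # in real KZG this comes from the trusted setup
z = 4                     # location we open at
y = evaluate(poly, z)     # claimed value at that location

commitment = evaluate(poly, secret_s)          # stands in for the G1 commitment
proof = evaluate(quotient(poly, z), secret_s)  # stands in for the 48-byte proof

print(verify_opening(commitment, z, y, proof, secret_s))      # honest claim
print(verify_opening(commitment, z, y + 1, proof, secret_s))  # wrong value
```

In the real scheme the prover and verifier only ever see elliptic curve points that hide `S`, and the multiplication on the right-hand side is checked with a bilinear pairing instead of being computed in the clear.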
![](https://hackmd.io/_uploads/HkuOw2zF2.png)

Here, `X` refers to the location in the polynomial where we want to prove the value, and `Y` is the value we claim it has there. So effectively, we test `P(X) = Y`, with the check carried out on the elliptic curve, which in this case is BLS12-381.

## Changes to be made in Ethereum as a developer

### Changes in Execution Clients

1. Blob verification:
    - Verifying that the given blob of data matches its equivalent KZG Commitment
    - Implementing the `blob_to_kzg()` function
    - Typically used more for Optimistic Rollups
2. Point Evaluation:
    - Proving that `P(X) = Y` at a single point
    - Implementing the `verify_kzg_proof()` function
    - Typically used more for zkRollups
3. Some EVM-level opcodes to access and manipulate the `SHA2(G1)` value, which was earlier reduced from 48 bytes to 32 bytes.
4. Validating blob transactions using the `blob_to_kzg()` function, in order to map the KZG commitments to their 32-byte, EVM-compatible hashed values.

### Changes to the Consensus Clients

1. Having the `BLS12-381` library for bilinear pairings, if not already present.
2. Adding the following functions:
    - `blob_to_kzg()`
    - `kzg_to_versioned_hash()`
3. Verification of proofs is not required.
4. Block Gossip validation `(Gossip is the process of broadcasting a newly produced block to the individual peers participating in the Ethereum network)`:
    - This includes checking that all KZG commitments are valid G1 points (the same check used for public keys)
    - Verifying that all blobs in the block match the set of KZG commitments, using the `blob_to_kzg()` function
    - One possible optimisation is validating the commitments in a batch.
5. Data Availability validation using the `verify_blob_sidecar()` function.

### Summarizing how KZG is implemented in the entire Ecosystem

![](https://hackmd.io/_uploads/SytYv3GYn.png)

### Tech we have so far...
![](https://hackmd.io/_uploads/HJS9PhGYn.png)

### What more is needed for developers

- More abstraction for KZG libraries in terms of execution and consensus clients
- Faster trusted setup ceremony for computing points over the elliptic curve
- More and more documentation!

In the next blog we'll cover another important piece of the EIP-4844 story, and that is Danksharding, now that we have an overview of what goes into KZG commitments.