# Blobs subprotocol
:::info
Initial subprotocol design & this doc by Mike.
Code implementation by Miranda.
Alvaro came up with the nice SpongeBlob idea.
:::
Describing the blobs subprotocol to auditors.
:::danger
Please pay attention to these red text boxes. They highlight questions that need a proper cryptographer's assessment.
:::
We'll start by describing an illustrative basic blob protocol.
Then, we'll build on that to describe Aztec's much more complex blob protocol.
---
## Tl;dr
After writing all of this, I was able to draw a pretty impressive diagram:
> Hackmd says it's too big. RIP.
This is a screenshot from my macbook screen. Not as good as the resolution of a screenshot from my proper screen.

---
## Notation
$\mathbb{F}_{r, BN}$ - the scalar field of the BN254 curve.
$|\mathbb{F}_{r, BN}|= 21888242871839275222246405745257275088548364400416034343698204186575808495617$ - a 254-bit number.
$\mathbb{F}_{r, BLS}$ - the scalar field of the BLS12-381 curve.
$|\mathbb{F}_{r, BLS}| = 0x73eda753299d7d483339d80809a1d80553bda402fffe5bfeffffffff00000001$
$= 52435875175126190479447740508185965837690552500527637822603658699938581184513$ - a 255-bit number.
$\mathbb{F}_{q, BLS}$ - the base field of the BLS12-381 curve.
$|\mathbb{F}_{q, BLS}| = 0x1a0111ea397fe69a4b1ba7b6434bacd764774b84f38512bf6730d2a0f6b0f6241eabfffeb153ffffb9feffffffffaaab$
$= 4002409555221667393417789825735904156556882819939007885332058136124031650490837864442687629129015664037894272559787$ - a 381-bit number.
$\mathbb{G}_{BLS}$ - the BLS12-381 elliptic curve group, whose coordinates are 381-bit elements of $\mathbb{F}_{q, BLS}$.
Let's define some conversion functions:
$F_{\mathbb{F}_{r, BN} \rightarrow \mathbb{F}_{r, BLS}}: \mathbb{F}_{r, BN} \rightarrow \mathbb{F}_{r, BLS}$ - a trivial conversion function, since $|\mathbb{F}_{r, BN}| < |\mathbb{F}_{r, BLS}|$.
> I might sometimes write $\mathbb{F}_{r, BLS}(\cdot)$ to mean the same; lazily treating it as a conversion function from whatever the argument is.
$F_{\mathbb{F}_{r, BLS} \rightarrow \mathbb{F}_{r, BN}^3}: \mathbb{F}_{r, BLS} \rightarrow \mathbb{F}_{r, BN}^3$ - a conversion function from the BLS scalar field to an array of BN fields, according to the implementation in Zac's BigNum library. (I don't explain the details here).
$\text{compress_to_bytes48}_{BLS}: \mathbb{G}_{BLS} \rightarrow \text{bytes48}$ (where the implementation details are glossed over, for now).
$\text{compress_to_fields}_{BLS}: \mathbb{G}_{BLS} \rightarrow \mathbb{F}_{r, BN}^2$ (where the implementation details are glossed over, for now).
---
## About Blobs
Each $blob$ is an array of 4096 BLS12-381 fields:
$blob \in \mathbb{F}_{r, BLS}^{4096}$
$d := 4096$ is the blob array length.
Every Ethereum block accepts up to 15 new blobs of data (at the time of writing).
A single Ethereum tx can be accompanied by multiple blobs of data (up to 15).
When an Ethereum tx is accompanied by a blob, a KZG commitment of the blob is calculated. The commitment is calculated over the BLS12-381 scalar field $\mathbb{F}_{r, BLS}$.
The Ethereum KZG trusted setup ceremony generated Lagrange basis points $\{G_0, G_1, ..., G_{d-1}\} \in \mathbb{G}_{BLS}^d$ (i.e. $G_i = [l_i(s)]_1$), with domain the $d$-th roots of unity $\omega^0, \omega^1, ..., \omega^{d-1}$ and "toxic waste" $s$.
The KZG commitment to a $blob$ is then:
$C = \sum_{i=0}^{d-1} blob[i] \cdot G_i \in \mathbb{G}_{BLS}$
> The $[i]$ syntax is lazy syntax for "get the $i$-th field of the blob". There will already be too many subscripts as we progress through this doc.
>
You might also see later in this doc:
$C = [p(s)]_1$ - the kzg commitment to the polynomial $p(X)$ that interpolates the blob's 4096 values.
Blob commitments are often represented as 48-bytes in the codebase.
(Recall: the x-coordinate of a BLS12-381 point is 381-bits, which is just shy of 48 bytes. So a BLS point can have a compressed 48-byte encoding of the x-coordinate, plus some bits to convey: [is_compressed, is_infinity, sign]).
When a $blob$ is submitted as part of a tx, the commitment $C$ is calculated by Ethereum, and then a "KZG versioned hash" is computed as (in pseudocode):
`blobhash = 0x01 || (sha256(C.to_bits()).to_bytes())[1: 32]`
I.e. it sha256-hashes $C$, then removes the first byte, and replaces that byte with a version number for the current design of Ethereum's blobs -- currently version `0x01`.
Every time a $blob$ is submitted, the only thing that gets stored on Ethereum is its `blobhash`. That is to say, the _only_ access that a Solidity smart contract has to a blob is the `blobhash` of the blob. If we ever want to demonstrate something about a blob within a Solidity function, the function needs to read the `blobhash` and demonstrate that the data lives within some commitment $C$ that hashes to the `blobhash`.
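To make that derivation concrete, here's a minimal Python sketch, assuming the commitment is supplied in its 48-byte compressed form (the helper name `kzg_to_versioned_hash` is borrowed from the consensus specs linked below; the exact byte-level conventions live there):
```python
import hashlib

VERSIONED_HASH_VERSION_KZG = b"\x01"

def kzg_to_versioned_hash(compressed_commitment: bytes) -> bytes:
    # blobhash = 0x01 || sha256(C)[1:32]
    assert len(compressed_commitment) == 48
    return VERSIONED_HASH_VERSION_KZG + hashlib.sha256(compressed_commitment).digest()[1:]

# Toy example with a dummy 48-byte "commitment":
print(kzg_to_versioned_hash(b"\x11" * 48).hex())
```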
:::info
This is also a useful reference: https://github.com/ethereum/consensus-specs/blob/master/specs/deneb/polynomial-commitments.md#deneb----polynomial-commitments
:::
---
### Ethereum Precompiles
#### `blobhash(i)`
Solidity has a function `blobhash(i)` which takes as input a blob index (between `0` and `14` at the time of writing) and returns the aforementioned KZG-versioned-hash of the blob. The blob index relates to one of up to 15 blobs that can be submitted as part of a transaction.

_The blobhash of a blob can only be captured as part of the tx that accompanies the blob._ That is to say, if you submit a blob-carrying tx, then the only opportunity you have to "inform" a Solidity smart contract about the existence of the blob's `blobhash` is if that same tx contains a function which calls `blobhash(i)` and stores the returned `blobhash`. It's a fleeting moment.
So the flow is:
- Submit a blob-carrying tx to Ethereum.
- The tx can carry 1 or multiple blobs.
- (Note: there is a special tx type ("type 3") that makes it easy to submit blobs alongside a tx).
- Within that tx, make sure there's a function which:
- calls `blobhash(i)` for each of the blobs.
- stores the resulting `blobhash` of each blob in Solidity storage, for future reference (because there won't be another opportunity to grab that blobhash!!!).
:::info
See the `propose` function in `ProposeLib.sol` in the Aztec codebase, which (eventually) calls `getBlobHash` within `BlobLib.sol`.
:::
#### "Point Evaluation Precompile"
This is basically an Ethereum precompile that enables a Solidity contract to verify a KZG opening proof.
Here are its inputs (where the offsets are bytes):
* `input[:32]` - versioned_hash
* `input[32:64]` - $z$ - a random x-coordinate at which the blob's interpolated polynomial $p(X)$ has been opened.
* `input[64:96]` - $y$ - the claimed evaluation of the blob's interpolated polynomial, i.e. $p(z) = y$.
* `input[96:144]` - the KZG commitment $C$ to the blob, compressed to 48 bytes.
* `input[144:192]` - $Q$ - the KZG commitment to the quotient polynomial $q(X)$, compressed to 48 bytes.
The precompile call simply succeeds or fails, depending on whether the claimed KZG opening proof is valid or not.
Here's a call to the point evaluation precompile in the wild:

:::info
See also: https://eips.ethereum.org/EIPS/eip-4844#point-evaluation-precompile
:::
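To make the byte offsets concrete, here's a hedged Python sketch that packs those five inputs into the precompile's 192-byte input, assuming $z$ and $y$ are encoded as 32-byte big-endian field elements (as in EIP-4844); the function name is illustrative only:
```python
def pack_point_evaluation_input(
    versioned_hash: bytes,   # 32 bytes
    z: int,                  # evaluation point, reduced mod the BLS12-381 scalar field
    y: int,                  # claimed evaluation p(z)
    commitment: bytes,       # 48-byte compressed commitment C
    quotient: bytes,         # 48-byte compressed commitment Q
) -> bytes:
    assert len(versioned_hash) == 32
    assert len(commitment) == 48 and len(quotient) == 48
    data = (
        versioned_hash
        + z.to_bytes(32, "big")
        + y.to_bytes(32, "big")
        + commitment
        + quotient
    )
    assert len(data) == 192
    return data
```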
---
### KZG Maths
Here's a reminder of the maths:
Offchain, the prover needs to compute a random challenge $z$ (we'll explain how it's calculated in our use case later), the claimed evaluation $y = p(z)$, and the quotient commitment $Q$, where:
$q(X) = \frac{p(X) - p(z)}{X - z}$ is the quotient polynomial for this opening proof.
$Q := [q(s)]_1$ - the KZG commitment to $q(X)$.
:::info
In lots of EIP-4844 literature, and in the `c-kzg` library that we use, $Q$ is termed the "`proof`", which is confusing when we also have zero-knowledge proofs in our codebase. In this doc I'll call it $Q$.
:::
The Point Evaluation Precompile then evaluates:
$e(C - [y]_1, [1]_2) == e(Q, [s]_2 - [z]_2)$
Which is effectively:
$e([p(s)]_1 - [y]_1, [1]_2) == e(\left[\frac{p(s) - p(z)}{s - z}\right]_1, [s - z]_2)$
(where, recall, $s$ is toxic waste)
$e(G, H)^{p(s) - y} == e(G, H)^{\left(\frac{p(s) - p(z)}{s - z} \right) \cdot (s - z)}$
Focussing on the exponent, cancelling the $(s - z)$ terms on the rhs:
$p(s) - y == p(s) - y$
Success!
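Here's a small, pairing-free Python sketch of the algebraic fact the precompile relies on: $(X - z)$ divides $p(X) - p(z)$ exactly, so the quotient $q(X)$ is a genuine polynomial. The toy polynomial and points below are arbitrary; only the field modulus is the real BLS12-381 scalar modulus.
```python
# A pairing-free sketch: (X - z) divides p(X) - p(z) exactly over the BLS scalar field.
R_BLS = 0x73EDA753299D7D483339D80809A1D80553BDA402FFFE5BFEFFFFFFFF00000001

def poly_eval(coeffs, x):
    # Horner evaluation; coeffs[i] is the coefficient of X^i.
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % R_BLS
    return acc

def divide_by_linear(coeffs, z):
    # Synthetic division of the polynomial by (X - z); returns (quotient, remainder).
    n = len(coeffs) - 1
    q = [0] * n
    q[n - 1] = coeffs[n]
    for i in range(n - 1, 0, -1):
        q[i - 1] = (coeffs[i] + z * q[i]) % R_BLS
    remainder = (coeffs[0] + z * q[0]) % R_BLS
    return q, remainder

p = [5, 7, 11, 13]                 # a toy p(X) = 5 + 7X + 11X^2 + 13X^3
z = 123456789
y = poly_eval(p, z)

q, remainder = divide_by_linear([(p[0] - y) % R_BLS] + p[1:], z)
assert remainder == 0              # (X - z) divides p(X) - y with no remainder

x = 987654321                      # sanity check: q(x)*(x - z) + y == p(x) at another point
assert (poly_eval(q, x) * (x - z) + y) % R_BLS == poly_eval(p, x)
```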
---
## General zk Rollups and blobs
How does a zk-rollup use blobs?
Lots of users submit L2 txs.
Each of those txs emits data.
A zk-rollup recursively _proves_ the correctness of lots of txs, and ultimately submits a single proof (representing the correct execution of all of those txs) to L1.
Alongside that zk-rollup proof, all of the txs' data needs to be broadcast to L1 for the world to see.
(In reality, not _all_ of the data associated with a tx will be submitted to L1; it depends on the design of the zk-rollup. But there's always _some_ data that needs to be broadcast to L1).
But...
**How does the world know that the data that has been broadcast via blobs actually _matches_ the data that was used within the zk-rollup's circuits?**
### Bad option 1
A naive approach: a zk-rollup could broadcast this tx data as `calldata` alongside the submission of the `proof`. The tx data could be designed to be the `public_inputs` of the `proof`, so that we _know_ this data matches the data that was used within the circuits (otherwise the proof would not verify).
```solidity
// In pseudocode:
function submit_proof(bytes calldata proof, bytes32[] calldata public_inputs) external {
    bool result = verify_proof(proof, public_inputs);
    assert(result);
}
```
But calldata is too expensive. A zk-rollup broadcasts _a lot_ of data, so we want a cheaper approach than using `calldata` to submit all of this data.
BLOBS!
Blobs are much cheaper than calldata (when comparing the $ cost of a byte of a blob vs a byte of calldata).
Ok, so maybe we can: submit all the data via a blob, and hash all of the public inputs into a single public input, to reduce the amount of `calldata`:
```solidity
// In pseudocode:
function submit_proof(bytes calldata proof, bytes32 a_single_public_input) external {
    bool result = verify_proof(proof, a_single_public_input);
    assert(result);
    bytes32 blob_hash = blobhash(0);
    // Hmmm... how do we link this blob_hash with the public input?
}
```
There's a problem.
The zk-rollup circuits made use of some tx data. Let's call that "version 1" of the tx data.
The zk-rollup broadcast some tx data in a blob. Let's call that "version 2" of the tx data.
How do we ensure that "version 1" equals "version 2"? We must somehow assert that they're equal, otherwise the zk-rollup can lie and broadcast malicious or incorrect data.

### Bad option 2
Prove the computation of the KZG Commitment $C$ within the rollup circuit:

> This is a bad option. We don't do this.
This is bad because the number of constraints to compute $C = \sum_{i=0}^{d-1} blob[i] \cdot G_i$ would be _ridiculously huge_. Hundreds of millions of constraints, easily.
This is because the native field within the circuit is $\mathbb{F}_{r, BN}$, but computation of $C$ is all non-native BLS12-381 operations: 4095 EC ADDs and 4096 EC MULs.
I think with our current BigCurve library, a single $A + b \cdot B$ MSM for BLS12-381 costs ~140k gates.
### Good option
The $blob$'s 4096 values can be interpolated by a polynomial $p(X)$. We can prove correct evaluation of $p(X)$ at a random value $z$ within the circuit, and we can also prove correct evaluation of the broadcast blob commitment $C$ on L1. If the evaluations agree, we know that the circuit used the same data as was published as a blob.
That gets explained better soon.
We have some questions for how we compute things within the circuit:
- How should we compute the random $z$?
- How do we derive $p(X)$ from the blob data?
- How do we compute $y = p(z)$?
**How should we compute the random $z$?**
We basically compute a fiat-shamir challenge:
$z = \text{poseidon2}(C, blob)$
We'll be more precise later, when we get to Aztec's protocol.
**How do we derive $p(X)$ from the blob data?**
A naive approach would be to evaluate the lagrange form of the polynomial:
$p(z) = \sum_{i=0}^{d-1} blob[i] \cdot l_i(z)$
where $l_i(X)$ is the lagrange polynomial $l_i(X) = \prod\limits_{j\neq i}\frac{X-x_j}{x_i-x_j}$.
But this would be too expensive within a circuit. Those products within every summand really blow up the computation. It's roughly: 4095 ADDs + 4096 MULs + 4096 * (4094 MULs + 4095 MULs + 4095 ADDs). That's 16.7m ADDs and 33.5m MULs. Have I calculated that right? And recall: all of these operations are non-native BLS12-381 scalar field operations!
That's billions of constraints.
So Dankrad or Vitalik or someone had an idea:
We can instead derive $p(z)$ using Barycentric evaluation, which requires far fewer constraints. I'll let Dankrad explain it:
:::info
Barycentric evaluation:
https://dankradfeist.de/ethereum/2021/06/18/pcs-multiproofs.html
:::
Tl;dr:
$y = p(z) = \frac{z^d - 1}{d} \cdot \sum_{i=0}^{d-1}{blob[i] \cdot \frac{\omega^i}{z - \omega^i}}$
We can precompute the terms: $\frac{1}{d}$, $\omega^i$, $-\omega^i$.
So then the only operations are: $log2(d=4096)$ MULs + 1 ADD + 4095 ADD + 4096*(MUL + ADD) + one big batch inversion.
That's much more manageable!
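To see the barycentric formula in action, here's a hedged, toy-sized Python sketch ($p = 97$, $d = 8$) showing it agrees with direct Lagrange evaluation; the real protocol uses the BLS12-381 scalar field and $d = 4096$, but the algebra is identical.
```python
# Toy comparison of direct Lagrange evaluation vs the barycentric formula.
P = 97
D = 8
OMEGA = pow(5, (P - 1) // D, P)            # 5 is a primitive root mod 97
assert pow(OMEGA, D, P) == 1 and pow(OMEGA, D // 2, P) != 1

domain = [pow(OMEGA, i, P) for i in range(D)]
blob = [3, 1, 4, 1, 5, 9, 2, 6]            # a toy "blob" of D field elements
z = 3
assert z not in domain                     # z must lie outside the evaluation domain

inv = lambda a: pow(a % P, -1, P)          # modular inverse

# Direct Lagrange evaluation: p(z) = sum_i blob[i] * prod_{j != i} (z - w^j) / (w^i - w^j)
direct = 0
for i in range(D):
    term = blob[i]
    for j in range(D):
        if j != i:
            term = term * (z - domain[j]) * inv(domain[i] - domain[j]) % P
    direct = (direct + term) % P

# Barycentric evaluation: p(z) = (z^d - 1)/d * sum_i blob[i] * w^i / (z - w^i)
acc = 0
for i in range(D):
    acc = (acc + blob[i] * domain[i] * inv(z - domain[i])) % P
bary = (pow(z, D, P) - 1) * inv(D) % P * acc % P

assert direct == bary
print("p(z) =", bary)
```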

So, in summary:
The circuit has seen some data (data_1), and we want to show that it is the same data as the published blob.
Circuit proves:
- data_1 represents a polynomial $p_1(X)$, where $p_1(z) = y$
L1 proves:
- $C$ represents a polynomial $p_2(X)$, where $p_2(z) = y$
- $C$ represents the published $blob$
The only way the blob and data_1 could evaluate at a random $z$ to the same $y$ is if they are the same data (with high probability)
Therefore the circuit used the same data as the published blob.
---
## Aztec
Ok, the above was intentionally general.
What does Aztec do with blobs?
First we should explain Aztec's rollup architecture.
---
### Aztec Block Building
The Aztec chain is divided into various units of time. Here's a diagram:

- Txs are submitted by users.
- Txs are bundled into Blocks by Block Proposers.
- A Block Proposer can assemble a collection of Blocks into a "Checkpoint".
- A "Checkpoint" is a synchronisation point with L1: an opportunity to broadcast any data emitted by the txs of the checkpoint, and an opportunity to send messages between L1 and L2.
- To amortise L1 snark verification costs, we generate a single proof for multiple checkpoints.
- An Epoch is that collection of Checkpoints which are bundled into a single proof.
- At the time of writing:
- a Checkpoint is every 72 seconds.
- there are 48 Checkpoints in an Epoch.
- an Epoch is therefore ~1 hour.
#### Block Building and Blobs

> This diagram ignores the non-blob stuff that happens at the time of proposal and at the time of epoch proof submission: Important things like committee attestations, snark verification, L1<>L2 message passing, committee selection, block rewards, updating state roots, etc etc.
### What data goes into the <= 6 blobs of a Checkpoint?
Here's a beautiful diagram:

Things to realise:
- The number of txs in a block can vary.
- The number of blocks in a checkpoint can vary.
- The number of side-effects (notes, nullifiers, logs, etc) in a tx can vary.
Therefore:
- The number of blobs submitted as part of a checkpoint proposal can vary.
- The _boundaries_ of these data structures (TxEffects, Block End Data, Checkpoint End Data) don't align with blob boundaries. That is to say: sometimes these data structures might span multiple blobs:
- A TxEffect -- which (according to today's code) can be as large as 8,603 fields -- can span up to 4 blobs. (E.g. 10 fields + 4096 fields + 4096 fields + 401 fields).
- The data for a block (multiple txs plus the block end data) might span multiple blobs, and might share some blob space with other blocks.
- The data for a checkpoint (multiple blocks) can clearly span up to 6 blobs.
#### Field confusion: $\mathbb{F}_{r, BN}$ vs $\mathbb{F}_{r, BLS}$
All of the KZG maths is being done over the BLS12-381 scalar field $\mathbb{F}_{r, BLS}$, which is 255-bits.
All of the fields of data being created by Aztec transactions are BN-254 scalar fields $\mathbb{F}_{r, BN}$, which is 254-bits. The data in the above diagram (and in the tables in the next section) are all $\mathbb{F}_{r, BN}$ fields.
We've been a bit lazy: we simply map groups of 4096 $\mathbb{F}_{r, BN}$ fields trivially to the 4096 larger $\mathbb{F}_{r, BLS}$ fields of a blob.
> Technically, we are wasting some blob space.
>
> A blob is 4096 x $\mathbb{F}_{r, BLS}$ fields, which is `4096 * 255 / 8 = 130,560 bytes`.
>
> 4096 x $\mathbb{F}_{r, BN}$ fields is merely `4096 * 254 / 8 = 130,048 bytes`.
>
> We're leaving `130,560 - 130,048 = 512 bytes` on the table every blob!
>
> However, engineering time is a scarce resource, and this blob sub-protocol was complicated enough without tightly-packing 16 extra $\mathbb{F}_{r, BN}$ fields into the upper bits of the blob's $\mathbb{F}_{r, BLS}$ fields.
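As a hedged illustration of the conversion functions from the Notation section (and of why the trivial $\mathbb{F}_{r, BN} \rightarrow \mathbb{F}_{r, BLS}$ embedding loses nothing), here's a small Python sketch; the 3-limb split uses an illustrative 120-bit limb width, which may differ from the BigNum library's actual representation:
```python
# A sketch of the conversion functions, assuming an illustrative 120-bit limb split.
R_BN  = 21888242871839275222246405745257275088548364400416034343698204186575808495617
R_BLS = 52435875175126190479447740508185965837690552500527637822603658699938581184513

def bn_to_bls(x):
    # Trivial embedding: every BN254 scalar is already a valid BLS12-381 scalar.
    assert 0 <= x < R_BN < R_BLS
    return x

def bls_to_bn_limbs(x, limb_bits=120):
    # Split a BLS12-381 scalar into three little-endian limbs, each of which fits in a BN254 field.
    assert 0 <= x < R_BLS
    mask = (1 << limb_bits) - 1
    limbs = [(x >> (limb_bits * i)) & mask for i in range(3)]
    assert all(l < R_BN for l in limbs)
    return limbs

x = R_BLS - 1
assert sum(l << (120 * i) for i, l in enumerate(bls_to_bn_limbs(x))) == x
```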
#### More details
More details (but basically a repetition of that diagram):
##### Data per Tx
The data from a single tx looks like this:
```
pub struct TxEffect {
pub tx_hash: Field,
pub revert_code: u8,
pub transaction_fee: Field,
pub note_hashes: [Field; 64],
pub nullifiers: [Field; 64],
pub l2_to_l1_msgs: [Field; 8],
pub public_data_writes: [
PublicDataWrite {
pub leaf_slot: Field,
pub value: Field,
},
64
],
pub private_logs: [
PrivateLog {
pub fields: [Field; 8],
pub length: u32,
},
64
],
pub public_logs: PublicLogs {
pub length: u32,
pub payload: [Field; 4096],
},
pub contract_class_logs: [
ContractClassLog {
pub log: Log {
pub fields: [Field; 3023],
pub length: u32,
},
pub contract_address: AztecAddress,
},
1
],
}
```
Some quick addition suggests the max size of this flattened data is: 7,965 fields.
Lots of those huge arrays will contain lots of `0` fields of padding on their right-hand side. We don't want to waste blob space with unnecessary `0`s, so we encode the actual (variable) lengths of the arrays (see directly below), so that we can trim all those `0`s and tightly-pack the data into blobs:
```rust
// TX DATA:
// Notice this can vary in length, depending on the nonzero data in the arrays.
// Big endian bit packing.
// Note: some of these bit-length constants might change before launch.
// Note: lots of these bit-lengths are over-allocations. A future release could reduce these.
|----------|--------------------------------|----------------------------|
| Field 0 | TX_START_PREFIX | 64 bits |
| | num_note_hashes | 16 bits |
| | num_nullifiers | 16 bits |
| | num_l2_to_l1_msgs | 16 bits |
| | num_public_data_writes | 16 bits |
| | num_private_logs | 16 bits |
| | private_logs_length | 16 bits |
| | public_logs_length | 32 bits |
| | contract_class_log_length | 16 bits |
| | revert_code | 8 bits |
| | num_blob_fields | 32 bits |
|----------|--------------------------------|----------------------------|
| Field 1 | tx_effect.tx_hash | A field |
|----------|--------------------------------|----------------------------|
| Field 2 | tx_effect.transaction_fee | A field |
|----------|--------------------------------|----------------------------|
| | tx_effect.note_hashes | num_note_hashes |
|----------|--------------------------------|----------------------------|
| | tx_effect.nullifiers | num_nullifiers |
|----------|--------------------------------|----------------------------|
| | tx_effect.l2_to_l1_msgs | num_l2_to_l1_msgs |
|----------|--------------------------------|----------------------------|
| | tx_effect.public_data_writes | num_public_data_writes * 2 |
|----------|--------------------------------|----------------------------|
| | (tx_effect.private_logs): | (private_logs_length) |
| | Where, for each log: | |
| | log.length | A field |
| | log.fields | log.length |
|----------|--------------------------------|----------------------------|
| | tx_effect.public_logs.payload | public_logs.length |
|----------|--------------------------------|----------------------------|
| | tx_effect.contract_class_logs | (contract_class_log_length)|
| | Where, for each log: | |
| | log.contract_address | A field |
| | log.fields | log.length |
|----------|--------------------------------|----------------------------|
```
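Here's a hedged Python sketch of the big-endian bit packing of "Field 0" from the table above; the widths are copied from the table, but the example values (including the `TX_START_PREFIX` placeholder) are illustrative only:
```python
# The bit widths below are copied from the table; TX_START_PREFIX's value here is a
# placeholder, not the real protocol constant.
TX_FIELD0_LAYOUT = [
    ("TX_START_PREFIX",           64),
    ("num_note_hashes",           16),
    ("num_nullifiers",            16),
    ("num_l2_to_l1_msgs",         16),
    ("num_public_data_writes",    16),
    ("num_private_logs",          16),
    ("private_logs_length",       16),
    ("public_logs_length",        32),
    ("contract_class_log_length", 16),
    ("revert_code",                8),
    ("num_blob_fields",           32),
]

def pack_field0(values: dict) -> int:
    # Pack the named values into one integer, first entry in the most significant bits.
    acc = 0
    for name, bits in TX_FIELD0_LAYOUT:
        v = values[name]
        assert 0 <= v < (1 << bits), f"{name} overflows {bits} bits"
        acc = (acc << bits) | v
    return acc

example = {name: 1 for name, _ in TX_FIELD0_LAYOUT}
packed = pack_field0(example)
assert packed.bit_length() <= 254            # the packed prefix fits in one BN254 field element
print(hex(packed))
```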
##### Block end data
```rust
// BLOCK END DATA:
// Big endian bit packing.
// Note: some of these bit-length constants might change before launch.
|----------|---------------------------------------------|---------|
| Field 0 | BLOCK_END_PREFIX | 72 bits |
| | timestamp | 64 bits |
| | block_number | 32 bits |
| | num_txs | 16 bits |
|----------|---------------------------------------------|---------|
| Field 1 | l1_to_l2_message_next_available_leaf_index | 36 bits |
| | note_hash_next_available_leaf_index | 42 bits |
| | nullifier_next_available_leaf_index | 42 bits |
| | public_data_next_available_leaf_index | 40 bits |
| | total_mana_used | 48 bits |
|----------|---------------------------------------------|---------|
| Field 2 | last_archive.root | A field |
|----------|---------------------------------------------|---------|
| Field 3 | note_hash_tree.root | A field |
|----------|---------------------------------------------|---------|
| Field 4 | nullifier_tree.root | A field |
|----------|---------------------------------------------|---------|
| Field 5 | public_data_tree.root | A field |
|----------|---------------------------------------------|---------|
| Field 6 | l1_to_l2_message_tree.root | A field |
|----------|---------------------------------------------|---------|
```
##### Checkpoint end data
```rust
// CHECKPOINT END DATA:
// Big endian bit packing.
// Note: some of these bit-length constants might change before launch.
|----------|---------------------------------------------|---------|
| Field 0 | CHECKPOINT_END_PREFIX | 112 bits|
| | num_blob_fields | 32 bits |
|----------|---------------------------------------------|---------|
```
### SpongeBlob: Ferrying data from hundreds of tx circuits into a blob
The earlier Tx-Base and Block-Root circuits see only a small portion of a checkpoint's blobs' data (the portion that they each contribute). The Checkpoint-Root circuit is fed a version of all 6 blobs' data (so that it may do Barycentric evaluation -- see later in this doc).
How do we make sure those two sets of data are the same?
In the Tx-Base and Block-Root circuits, we incrementally absorb data into a poseidon2 sponge (delightfully named SpongeBlob by Miranda).
Then in the Checkpoint-Root circuit, we absorb up to 6 * 4096 fields of data into a rival poseidon2 sponge.
Then we finally squeeze both sponges, and if they match, we are convinced that the same data was used in the earlier circuits as the Checkpoint Root circuit.
We avoid squeezing this large sponge in the earlier circuits, because: a) squeezing costs constraints, and b) we wouldn't know _when_ exactly to squeeze because the amount of data each Tx can contribute towards a blob is _variable_. Conditional squeezing would be too many constraints in the earlier circuits and in the Checkpoint Root circuit. What's more, because there can be a variable number of txs in each block and blocks in each rollup, the _number of squeezes_ would be variable. The Checkpoint Root circuit can't contain a variable number of squeezes, and we don't want to have lots of constraints for some "max number of squeezes".
One big sponge for the whole checkpoint solves this nicely. Good idea, Alvaro!

> ^^^ Here's a diagram of the data we need to reconcile.

> Follow the red path to see how data gets added to the spongeblob
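As a hedged toy model of the SpongeBlob idea (sha256-based, emphatically _not_ poseidon2), the sketch below shows the property we rely on: absorbing the checkpoint's fields chunk-by-chunk across many circuits, then squeezing once at the end, gives the same result as absorbing the whole stream in one go.
```python
# A sha256-based toy sponge; it only illustrates the absorb/squeeze property.
import hashlib

class ToySponge:
    def __init__(self):
        self.state = b"\x00" * 32

    def absorb(self, fields):
        for f in fields:
            self.state = hashlib.sha256(self.state + f.to_bytes(32, "big")).digest()
        return self

    def squeeze(self):
        return hashlib.sha256(self.state + b"squeeze").digest()

tx_effect_chunks = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]   # fields contributed by three txs

incremental = ToySponge()
for chunk in tx_effect_chunks:                         # one absorb per Tx Base Rollup circuit
    incremental.absorb(chunk)

all_at_once = ToySponge().absorb([f for chunk in tx_effect_chunks for f in chunk])

# Squeezing only at the end (as the Checkpoint Root circuit does) gives the same result.
assert incremental.squeeze() == all_at_once.squeeze()
```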
---
### Blob notation
> When proving the epoch, we only want to call the Point Evaluation Precompile once, so we batch the openings. We use simple Plonk-style batching, which means we need $z$ and $\gamma$ before we can commence proving.
With the protocol's current constants (up to 6 blobs per checkpoint, up to 48 checkpoints per epoch), an epoch can have this many blobs:
$$
\left\{
\begin{array}{l}
(blob_{(1, 1)}, blob_{(1, 2)}, blob_{(1, 3)}, blob_{(1, 4)}, blob_{(1, 5)}, blob_{(1, 6)}), \\
(blob_{(2, 1)}, blob_{(2, 2)}, blob_{(2, 3)}, blob_{(2, 4)}, blob_{(2, 5)}, blob_{(2, 6)}), \\
...\\
(blob_{(48, 1)}, blob_{(48, 2)}, blob_{(48, 3)}, blob_{(48, 4)}, blob_{(48, 5)}, blob_{(48, 6)}),
\end{array}
\right\}
$$
where the subscript $(i, j)$ means "blob $j$ of checkpoint $i$".
That's up to 288 blobs per epoch.
> Note: if the protocol is successful, it might need to increase the number of blobs per checkpoint. We'll see.
Each Solidity "Point Evaluation Precompile" call costs 50,000 gas.
So there's a potential cost per epoch of 14.4m gas, just to call this precompile!
That's too much gas. So we go to lots of effort to do batch KZG opening instead, resulting in a single call to the Point Evaluation Precompile for the entire epoch. Pretty cool, eh?
It's worth noting that the number of blobs per checkpoint, and the number of checkpoints per epoch, are flexible.
E.g. we could have an epoch with a varying number of blobs per checkpoint, and fewer than 48 checkpoints in the epoch:
$$
\left\{
\begin{array}{l}
(blob_{(1, 1)}, blob_{(1, 2)}), \\
(blob_{(2, 1)}, blob_{(2, 2)}, blob_{(2, 3)}, blob_{(2, 4)}, blob_{(2, 5)}, blob_{(2, 6)}), \\
(blob_{(3, 1)}, blob_{(3, 2)}, blob_{(3, 3)}, blob_{(3, 4)}, blob_{(3, 5)}), \\
(blob_{(4, 1)}), \\
(blob_{(5, 1)}), \\
(blob_{(6, 1)}, blob_{(6, 2)}, blob_{(6, 3)}, blob_{(6, 4)}, blob_{(6, 5)}, blob_{(6, 6)}), \\
...\\
(blob_{(13, 1)}, blob_{(13, 2)}, blob_{(13, 3)}),
\end{array}
\right\}
$$
> ^^^ That's an example.
> You get the idea.
How many blobs per checkpoint depends on demand for txs in that checkpoint, and the amount of data being emitted by each tx.
How many checkpoints there are depends on the performance of Block Proposers: a bad checkpoint proposal might not be provable, and hence will be excluded from the epoch.
Sometimes, we'll use the $blob_{(i, j)}$ notation (two subscripts), and sometimes we'll flatten all of the blobs of an epoch into $blob_k$ notation (one subscript). The mapping from $(i, j)$ to $k$ will vary depending on network activity, so is fiddly to write down, notation-wise. Hopefully my laziness won't impact your understanding - it should be clear from context.
Hmmm... ok I'll need some notation.
Let $N_i$ be the number of blobs in checkpoint $i$ of the epoch.
Let $N_{epoch}$ be the number of blobs in the epoch.
Let $M$ be the number of checkpoints in the epoch.
The blob protocol doesn't span epochs, so it suffices to talk about "the epoch" for our purposes, thankfully.
So, we can rewrite the blobs in an epoch as:
$$
\left\{
\begin{array}{l}
(blob_{(1, j)})_{j = 1}^{N_1} \in \{\mathbb{F}_{r, BLS}^{4096}\}^{N_1}, \\
(blob_{(2, j)})_{j = 1}^{N_2} \in \{\mathbb{F}_{r, BLS}^{4096}\}^{N_2}, \\
... \\
(blob_{(M, j)})_{j = 1}^{N_M} \in \{\mathbb{F}_{r, BLS}^{4096}\}^{N_M}, \\
\end{array}
\right\}
$$
or as:
$\left( blob_{1}, blob_{2}, ..., blob_{N_{epoch}} \right) \in \{\mathbb{F}_{r, BLS}^{4096}\}^{N_{epoch}}$
We'll use both kinds of notation (two subscripts and one subscript).
---
### Batched KZG Opening
We chose the simplest approach (engineering-wise) to performing a batch KZG opening. We adopt the _first_ approach from Section 3.1 of the [Plonk Paper](https://eprint.iacr.org/2019/953.pdf).
As a reminder, and using the notation that this document will use, this batch protocol allows us to prove the opening of multiple polynomial commitments $\{C_i\}_{i = 1}^{N_{epoch}}$, all evaluated at the same value $z$, to yield evaluations $\{y_i\}_{i = 1}^{N_{epoch}}$. (We'll go slower in a second).
Notice that _all_ openings _for all commitments in the entire epoch_ are evaluated at _the same $z$_. It's what makes the batch-opening protocol simple vs alternatives. The downside is that when we come to prove these polynomial commitment evaluations within Aztec circuits, we cannot commence proof generation until we know $z$, and we cannot know $z$ until all blobs in the epoch are known, because $z$ is derived as a fiat-shamir challenge dependent on all those blobs. I.e. **we cannot commence proving until all checkpoints of an epoch have been proposed**.
> That's a bit annoying. We did spend a lot of time contemplating more complex batch opening schemes (which didn't need the same $z$ for all blob commitments in the epoch), but we opted for engineering simplicity.
>
> Note: there are some circuits in an epoch which _can_ commence proving much sooner than the end of the epoch. It's only the circuits which need to know $z$ which must be delayed.
---
### Checkpoint Proposal
Ok, let's explain what happens in our protocol before proving can begin (and hence, before we need to think about what happens inside circuits).
$M$ checkpoints are proposed for an epoch -- one after the other, 72 seconds apart.
By the end, these blobs will have been submitted to L1:
$$
\left\{
\begin{array}{l}
(blob_{(1, j)})_{j = 1}^{N_1} \in \{\mathbb{F}_{r, BLS}^{4096}\}^{N_1}, \\
(blob_{(2, j)})_{j = 1}^{N_2} \in \{\mathbb{F}_{r, BLS}^{4096}\}^{N_2}, \\
... \\
(blob_{(M, j)})_{j = 1}^{N_M} \in \{\mathbb{F}_{r, BLS}^{4096}\}^{N_M}, \\
\end{array}
\right\}
$$
Each blob has a corresponding commitment:
$$
\left\{
\begin{array}{l}
(C_{(1, j)})_{j = 1}^{N_1} \in \{\mathbb{G}_{BLS}\}^{N_1}, \\
(C_{(2, j)})_{j = 1}^{N_2} \in \{\mathbb{G}_{BLS}\}^{N_2}, \\
... \\
(C_{(M, j)})_{j = 1}^{N_M} \in \{\mathbb{G}_{BLS}\}^{N_M}, \\
\end{array}
\right\}
$$
Each commitment has a corresponding "versioned blob hash" (aka `blob_hash`):
$$
\left\{
\begin{array}{l}
(\texttt{blob_hash}_{(1, j)})_{j = 1}^{N_1} \in \{\texttt{bytes32}\}^{N_1}, \\
(\texttt{blob_hash}_{(2, j)})_{j = 1}^{N_2} \in \{\texttt{bytes32}\}^{N_2}, \\
... \\
(\texttt{blob_hash}_{(M, j)})_{j = 1}^{N_M} \in \{\texttt{bytes32}\}^{N_M}, \\
\end{array}
\right\}
$$
> Recall, a blob_hash is derived as:
>
> `blobhash = 0x01 || (sha256(C.to_bits()).to_bytes())[1: 32]`
#### Storing a representation of the blobs onchain, for later retrieval
At the time of each checkpoint proposal to L1, we need to _store_ a representation of that checkpoint's blob commitments, because we'll need to load them in a much later L1 tx (at the end of the epoch) when we come to perform the kzg batch opening.
Instead of storing each commitment $C$ individually, or storing each `blobhash` individually, we hash-together all commitments of the checkpoint $(C_{(i, j)})_{j = 1}^{N_i}$ to yield a single `blobCommitmentsHash` for each checkpoint. We store just this `blobCommitmentsHash` for each checkpoint to save gas.
$$
\left\{
\begin{array}{l}
\begin{aligned}
&\texttt{blobCommitmentsHash}_1 = \\
&\quad \texttt{sha256_to_field}(... \texttt{sha256_to_field}(\texttt{sha256_to_field}(\bar{C}_{(1,1)}), \bar{C}_{(1,2)}), ... \bar{C}_{(1,N_1)})
\\ &\quad \in \texttt{bytes32}\space \text{(technically, it fits inside a } \texttt{bytes31} \text{)}, \\
\end{aligned} \\
\\
\begin{aligned}
&\texttt{blobCommitmentsHash}_2 = \\
&\quad \texttt{sha256_to_field}(... \texttt{sha256_to_field}(\texttt{sha256_to_field}(\texttt{blobCommitmentsHash}_1, \bar{C}_{(2,1)}), \bar{C}_{(2,2)}), ... \bar{C}_{(2,N_2)})
\\ &\quad \in \texttt{bytes32}\space \text{(technically, it fits inside a } \texttt{bytes31} \text{)}, \\
\end{aligned} \\
\\
...
\\
\begin{aligned}
&\texttt{blobCommitmentsHash}_M = \\
&\quad \texttt{sha256_to_field}(... \texttt{sha256_to_field}(\texttt{sha256_to_field}(\texttt{blobCommitmentsHash}_{M-1}, \bar{C}_{(M,1)}), \bar{C}_{(M,2)}), ... \bar{C}_{(M,N_M)})
\\ &\quad \in \texttt{bytes32}\space \text{(technically, it fits inside a } \texttt{bytes31} \text{)}, \\
\end{aligned} \\
\\
\end{array}
\right\}
$$
> where $\bar{C}$ is the 48-byte compressed form of the commitment $C$:
> I.e. $\bar{C} := \texttt{compress_to_bytes48}(C)$
>
> where $\texttt{sha256_to_field}(\cdot):= \texttt{bytes32}(\texttt{0x00} || \texttt{bytes31}(\texttt{sha256}(\cdot)))$ - i.e. the sha256 digest truncated to 31 bytes and prefixed with a zero byte.
:::danger
Notice the truncation of the output of sha256 to 31 bytes so that it fits within a field. We do this so that circuits can just pass-along a single field element for the `blobCommitmentsHash`.
**Please confirm whether this truncation is an acceptable loss of security.**
:::
So at the time of proposing checkpoint 1, we store $\texttt{blobCommitmentsHash}_1$.
At the time of proposing checkpoint 2, we store $\texttt{blobCommitmentsHash}_2$.
Etc.
By the end of the epoch, the only persisted record of the blobs that were previously submitted to the chain is through the $\texttt{blobCommitmentsHash}$ of each checkpoint:
Notice: each checkpoint's `blobCommitmentsHash` builds off the last. You might wonder: "Why can't we throw away $\texttt{blobCommitmentsHash}_1$ once $\texttt{blobCommitmentsHash}_2$ has been stored, e.g. by overwriting the same storage slot with every new checkpoint proposal in the epoch?" It's because the proposal for checkpoint 2 might be an invalid proposal, in which case the epoch proof will only prove up to checkpoint 1. In that case, the smart contract needs to be able to load a snapshot of the `blobCommitmentsHash` as at the end of checkpoint 1.
So by the end of an epoch, the L1 smart contract has the following useful data in storage:
$(\texttt{blobCommitmentsHash}_1, ..., \texttt{blobCommitmentsHash}_M)$
Nice.
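Here's a hedged Python sketch of that accumulation, assuming $\texttt{sha256_to_field}(a, b)$ means "sha256 over the concatenation, truncated to 31 bytes"; the exact byte conventions in the Solidity/Noir code may differ:
```python
# Assumes sha256_to_field(x) = 0x00 || sha256(x)[1:32] (truncate to the low 31 bytes).
import hashlib

def sha256_to_field(preimage: bytes) -> bytes:
    return b"\x00" + hashlib.sha256(preimage).digest()[1:]

def accumulate_blob_commitments_hash(acc, compressed_commitments):
    # acc is None at the start of the epoch, otherwise the 32-byte running hash.
    for c_bar in compressed_commitments:            # each c_bar is a 48-byte compressed commitment
        preimage = c_bar if acc is None else acc + c_bar
        acc = sha256_to_field(preimage)
    return acc

# Checkpoint 1 carries two blobs, checkpoint 2 carries one; each proposal stores its snapshot:
h1 = accumulate_blob_commitments_hash(None, [b"\x01" * 48, b"\x02" * 48])
h2 = accumulate_blob_commitments_hash(h1, [b"\x03" * 48])
print(h1.hex(), h2.hex())
```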
---
**TODO: we also store a different kind of hash of the blobhashes within the block header, and I don't know why we do that, instead of using the blobCommitmentsHash**. Oh, maybe it's because the blobCommitmentsHash is cumulative? Hmmmm...
---
#### Checkpoint Proposal - Summarised
:::success
See the `propose` function of `ProposeLib.sol`
:::
An "Aztec Checkpoint Proposal" tx (submitted by an Aztec block proposer to Ethereum) is accompanied by up to 6 blobs of data, depending on how much data is being emitted by all of the Aztec txs within the Aztec blocks being proposed. (The choice of 6 is arbitrary and could change in future).
> We covered a lot of this in the previous section too.
When Checkpoint $k$ is proposed to the `propose` function of `ProposeLib.sol`, the L1 contract stores:
$$
\begin{aligned}
&\texttt{blobCommitmentsHash}_k = \\
&\quad \texttt{sha256_to_field}(... \texttt{sha256_to_field}(\texttt{sha256_to_field}(\texttt{blobCommitmentsHash}_{k-1}, \bar{C}_{(k,1)}), \bar{C}_{(k,2)}), ... \bar{C}_{(k,N_k)})
\\ &\quad \in \texttt{bytes32}\space \text{(technically, it fits inside a } \texttt{bytes31} \text{)}
\end{aligned}
$$
> There's more info in an earlier section.
The `propose` function also calls the `blobhash` precompile for each of the submitted blobs $(blob_{(k,1)}, ..., blob_{(k,N_k)})$ of this checkpoint, to retrieve the versioned blob hashes of each of the blobs. It compares those correct versioned blob hashes against a "rival" set of versioned blob hashes that have been derived from the function args $(C_{(k, 1)}, ..., C_{(k, N_k)})$. This ensures the stored $\texttt{blobCommitmentsHash}_k$ actually represents the blobs.
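Here's a hedged Python sketch of that comparison (in Python rather than Solidity, and with the `blobhash` values supplied as plain arguments rather than read from the EVM):
```python
import hashlib

def versioned_hash(compressed_commitment: bytes) -> bytes:
    return b"\x01" + hashlib.sha256(compressed_commitment).digest()[1:]

def check_proposed_commitments(arg_commitments, evm_blobhashes):
    # arg_commitments: the 48-byte compressed commitments passed as calldata.
    # evm_blobhashes: the versioned hashes Ethereum reports for the tx's blobs.
    assert len(arg_commitments) == len(evm_blobhashes)
    for c_bar, bh in zip(arg_commitments, evm_blobhashes):
        assert versioned_hash(c_bar) == bh, "calldata commitment does not match a submitted blob"

# Only once this check passes is it safe to fold these commitments into blobCommitmentsHash_k.
```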
---
### Preparing the witness for the circuits
The Prover performs the following computations, offchain, outside of a circuit:
For each $blob_i$ in the epoch ($i \in \{1, ..., N_{epoch}\}$):
1. $p_i(X) \in \mathbb{F}_{r, BLS}[X]$
- Lagrange interpolation of $blob_i \in \mathbb{F}_{r, BLS}^{4096}$.
1. $C_i := [p_i(s)]_1 \in \mathbb{G}_{BLS}$ .
- kzg commitments of the $p_i(X)$'s, using the lagrange basis from the Ethereum KZG Trusted Setup Ceremony.
5. We hash all of a checkpoint's data using a snark-friendly $\texttt{poseidon2}$ hash, where:
- $\texttt{poseidon2}: \{\mathbb{F}_{r, BN}\}^t \to \mathbb{F}_{r, BN}$ for a variable number of input fields $t$.
- Notice the different field!
- See the above SpongeBlob section for details.
- For Checkpoint $k$:
- $$
\begin{aligned}
\texttt{sponge_blob_hash}_k &:= \texttt{blob_fields_hash}_k \\
&:= \texttt{squeeze(absorb(}\\
& \quad\quad\quad ...\widetilde{blob}_{(k,1)}, ...\widetilde{blob}_{(k,2)}, ..., ...\widetilde{blob}_{(k,N_k)},\\
& \quad \texttt{))}
\end{aligned}
$$
- That $\widetilde{blob}$ syntax lazily means "use the $\mathbb{F}_{r, BN}$ version of each item in the blob, rather than the $\mathbb{F}_{r, BLS}$ version". (Tilde, because it's "almost a $blob$"). In practice, in our circuits, the blobs are built from $\mathbb{F}_{r, BN}$ elements and then converted into the larger $\mathbb{F}_{r, BLS}$ elements, so there's no chance of accidentally losing information in these conversions.
1. Derive the value $z$ at which all the blobs' polynomials will be evaluated, using the fiat-shamir heuristic of hashing all transcript data up until that point (the end of the epoch).
- Data up until this point includes:
- The claimed data of all blobs in the epoch: $\{blob_i\}_{i = 1}^{N_{epoch}} \in \{\mathbb{F}_{r, BLS}^{4096}\}^{N_{epoch}}$.
- This is the data that will be fed into the circuits.
- The claimed commitments to those blobs: $\{C_i\}_{i = 1}^{N_{epoch}}$
- A representation of this data is stored onchain through each checkpoint's `blobCommitmentsHash`.
- Recall: $z$ can only be derived once all checkpoint proposals are known, at the end of an epoch.
1. For $blob_i$, belonging to checkpoint $k$, we derive:
- $z_i := h(\texttt{sponge_blob_hash}_k, ...\mathbb{F}^2_{r, BN}(\bar{C_i})) \in \mathbb{F}_{r, BN}$,
- where "$...\mathbb{F}^2_{r, BN}(\bar{C_i}))$" means convert the 48-byte compressed form of $C_i$ into an array of two BN fields, and spread it into the preimage of the poseidon2 hash.
- These $z_i$'s are random fiat shamir challenges, which will be used to compute the barycentric challenge $z$.
1. $z_{BN} := h(...h(h(z_1, z_2), z_3), ..., z_{N_{epoch}}) \in \mathbb{F}_{r, BN}$ - the barycentric challenge.
- The nested hashing here and elsewhere is because this computation will be incremental, across many circuits (see shortly).
- :::danger
Notice that this Barycentric challenge $z_{BN}$ is an element of $\mathbb{F}_{r, BN}$, rather than $\mathbb{F}_{r, BLS}$, because we favour usage of the snark-friendly poseidon2 hash whenever we can. This means the challenge is technically 254-bits instead of 255-bits.
**Please validate whether a 254-bit challenge is safe to use as a random challenge when evaluating polynomials that are defined over the 255-bit scalar field: $\mathbb{F}_{r, BLS}[X]$.**
:::
1. Let $z := \mathbb{F}_{r,BLS}(z_{BN})$
1. Evaluate each blob's polynomial at the same random challenge $z$.
1. $y_i := p_i(z) \in \mathbb{F}_{r, BLS}$ for each $i$.
- Outside of a circuit, this can either be computed using the barycentric formula using $blob_i$, or by directly evaluating the polynomial $p_i$.
1. Derive a "random" (fiat-shamir) challenge for KZG batching:
1. $\tilde{y_i} := h(...F_{\mathbb{F}_{r, BLS} \rightarrow \mathbb{F}_{r, BN}^3}(y_i)) \in \mathbb{F}_{r, BN}$.
- Where $...F_{\mathbb{F}_{r, BLS} \rightarrow \mathbb{F}_{r, BN}^3}(\cdot)$ spreads the limbs of the BLS field (BigNum library representation) into the preimage of poseidon2.
- This step simply compresses $y_i$ into a BN field, to make derivation of gamma easier to follow.
1. $\gamma_{BN} := h(h(...h(h(\tilde{y}_1, \tilde{y}_2), \tilde{y}_3), ..., \tilde{y}_{N_{epoch}}), z_{BN}) \in \mathbb{F}_{r, BN}$.
- Note that this random challenge does indeed incorporate the commitments $\{C_i\}$, because $z_{BN}$ is derived from them.
- :::danger
Notice that this $\gamma$ is in $\mathbb{F}_{r, BN}$, rather than $\mathbb{F}_{r, BLS}$, because it is the output of a poseidon2 hash. **We should validate that this is safe.** There's a worry that it could skew the below random linear combinations in an unsafe way.
:::
- :::danger
I included $z_{BN}$ in this derivation of $\gamma_{BN}$ for good measure. You could perhaps argue that $z_{BN}$ isn't needed, because the evaluations (the $y_i$'s) are technically derived from $z_{BN}$. I was worried about possible attacks where a prover could lie about a claimed evaluation $y_i$, such that the $y_i$ is no longer derived from $z_{BN}$. I couldn't properly formulate a valid attack, but I was too scared to remove $z_{BN}$.
:::
1. Let $\gamma := \mathbb{F}_{r, BLS}(\gamma_{BN})$
1. Derive the batched polynomial, and its commitment, according to the _first_ approach in section 3.1 of the Plonk Paper:
$$
\begin{align}
C &:= \sum_{i = 1}^{N_{epoch}} \gamma^{i-1} \cdot C_i \in \mathbb{G}_{BLS} \\
p(X) &:= \sum_{i = 1}^{N_{epoch}} \gamma^{i-1} \cdot p_i(X) \in \mathbb{F}_{r, BLS}[X] \\
\end{align}
$$
- Here's a pretty picture of what we just derived, including the dependency graph of the fiat-shamir challenges.

- :::info
Ideally, this approach of using $\gamma^i$ is secure, so that we don't have to make any changes to our code. See below a question about its security and whether we should make a change.
:::
- :::danger
As mentioned just above, $\gamma$ is only a 254-bit value, instead of the full 255-bits of the BLS scalar field. Does this materially impact the security of the batch kzg opening protocol? Does the repeated exponentiation of this not-large-enough number cause unacceptable skew in the calculation of $C$ and $p(X)$?
:::
- :::danger
Addendum: it's possibly more efficient to _hash_ $\gamma$ to produce challenges than to perform this expensive non-native scalar multiplication by $\gamma$ repeatedly.
I.e. Let $\gamma_i := h(\gamma, ...h(\gamma, \gamma))$, where there are $i$ gammas within that nested hash.
**Would this be a safe alternative? Would we be able to replace $\gamma^i$ with this $\gamma_i$?** The rollup's current performance isn't bottlenecked by this, so it's not essential from a performance perspective.
:::
- :::info
Following on from those two red boxes, there are concerns that repeated exponentiation of a 254-bit $\gamma$ could indeed be unsafe.
There are two proposals for alternatives to $\gamma^i$:
1. Replace $\gamma^i$ with a new definition $\gamma_i := h(\gamma, i)$.
- ideally the hash function would be the native poseidon2 function which outputs a 254-bit element in $\mathbb{F}_{(r, BN)}$.
- BUT... our circuits currently don't have access to index $i$: we compute these large sums iteratively (see later in this doc) and each iteration does not know its index $i$ in the loop. It would be annoying engineering-wise to adopt this approach.
2. Replace $\gamma^i$ with a new definition $\gamma_i := h(h(...h(\gamma, \gamma)), \gamma)$ -- a nested hashing of ~$i$ lots of gamma. I.e. $\gamma_i := h(\gamma_{i-1}, \gamma)$ and $\gamma_0 := 1$, $\gamma_1 := \gamma$ (or similar). The nested hashing makes it easy for each iteration of the circuit to hash-in one extra gamma.
- ideally the hash function would be the native poseidon2 function which outputs a 254-bit element in $\mathbb{F}_{(r, BN)}$.
- This is currently the favoured alternative, if it's safe.
:::
1. Derive the KZG opening proof for our newly derived, batched polynomial commitment:
$$
\begin{align}
y &:= p(z) = \sum_{i = 1}^{N_{epoch}} \gamma^{i-1} \cdot y_i \in \mathbb{F}_{r, BLS}\\
q(X) &:= \sum_{i = 1}^{N_{epoch}} \gamma^{i-1} \cdot \frac{p_i(X) - p_i(z)}{X - z} \in \mathbb{F}_{r, BLS}[X]\\
Q &:= [q(s)]_1 \in \mathbb{G}_{BLS}
\end{align}
$$
:::danger
As mentioned just above, $\gamma$ is only a 254-bit value, instead of the full 255-bits of the BLS scalar field. Does this materially impact the security of the batch kzg opening protocol? Does the repeated exponentiation of this not-large-enough number cause unacceptable skew in the calculation of $y$ and $q(X)$ and $Q$?
:::
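Here's a hedged, pairing-free Python sketch of the batching algebra, over a toy prime field: the batched evaluation $y = \sum_i \gamma^{i-1} y_i$ really is the evaluation of the batched polynomial $p(X) = \sum_i \gamma^{i-1} p_i(X)$ at $z$ (and hence $(X - z)$ divides $p(X) - y$, so a single opening proof suffices). The toy polynomials and challenges are arbitrary.
```python
# Toy prime field; the real protocol works over the BLS12-381 scalar field.
P = 2**61 - 1

def poly_eval(coeffs, x):
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc

def poly_add_scaled(a, b, s):
    # Returns a(X) + s * b(X) in coefficient form.
    out = list(a) + [0] * max(0, len(b) - len(a))
    for i, c in enumerate(b):
        out[i] = (out[i] + s * c) % P
    return out

polys = [[1, 2, 3], [4, 0, 5, 6], [7, 8]]   # toy p_1(X), p_2(X), p_3(X)
z, gamma = 11111, 22222

ys = [poly_eval(p_i, z) for p_i in polys]   # per-blob evaluations y_i = p_i(z)

batched_p, batched_y, gamma_pow = [], 0, 1
for p_i, y_i in zip(polys, ys):
    batched_p = poly_add_scaled(batched_p, p_i, gamma_pow)
    batched_y = (batched_y + gamma_pow * y_i) % P
    gamma_pow = gamma_pow * gamma % P

# The batched evaluation equals the batched polynomial evaluated at z.
assert poly_eval(batched_p, z) == batched_y
```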
---
### Tx Base rollup circuit
**Absorbs the Tx Effects of one tx into the SpongeBlob.**
Here's a reminder of that data:

For every tx, a "Tx Base Rollup" circuit is executed.
A checkpoint starts with an empty SpongeBlob (a poseidon2 sponge).
Even if a checkpoint needs to broadcast multiple blobs of data, a single SpongeBlob absorbs all the data of that checkpoint's blobs.
The tx effects from all txs of all blocks in the checkpoint get absorbed into the same SpongeBlob, but each tx's absorption happens in a distinct "Tx Base Rollup" circuit for each tx.
I.e. The Tx Effects of the first tx are absorbed by the first instance of the Tx Base Rollup circuit. And so on...
---
### Block Root rollup circuit
There's a "_First_ Block Root Rollup" circuit and a "Block Root Rollup" circuit, for the first and subsequent blocks in a checkpoint. For the blobs subprotocol, there's no difference.
This circuit does these blob-related things:
- **Absorbs Block End Data of one block into the SpongeBlob**;
- Here's a reminder of that data:

- **Squeezes a copy of this SpongeBlob, and inserts that squeezed sponge into the block's BlockHeader**, as a record of the emitted data of that block.
- We wanted some kind of commitment to the data emitted in a block. We contemplated simply putting the blobs' kzg commitments in the block header, but since each blob commitment is a (non-native) BLS point, proving existence of emitted data in the block would have required BLS pairings, which would have been way too many constraints. This poseidon2 sponge is a huge improvement on that.
- Notice: the block header of block $i$ in the checkpoint will contain a squeezed sponge that contains the data from all blocks $(1, ..., i)$ of that checkpoint. This does make it slightly inefficient to prove existence of data that was emitted in the last tx of a checkpoint, because the amount of poseidon absorption could be as high as 6 * 4096 fields. We didn't want to spend extra gates in the rollup circuits to individually hash each block's data, or each tx's data, as this is a niche use case.
- **Propagates the unsqueezed SpongeBlob** (to the Block Merge Rollup circuit), so that the next block can continue to absorb data into it.
---
### Checkpoint Root rollup circuit
- **Ensures the "Start SpongeBlob" of the checkpoint is empty.**
- **Absorbs the Checkpoint End Data into the SpongeBlob**.
- Here's a reminder of that data:

- **Ensures the number of absorbed fields is `<=` the max amount of blob data for a checkpoint (6 * 4096 fields)**.
- **Squeezes the sponge**, finally.
It also does this more-complex stuff, which we'll cover in more detail:
- **Feeds-in the entire stream of 6 * 4096 fields of the checkpoint**
- **It poseidon2-hashes these fields** to make sure the result matches the sponge we just squeezed. I.e. it makes sure this stream of `blobs_fields` matches the data that the earlier circuits "saw".
- This is quite a lot of constraints, but it's acceptable.
- For each of those 6 $blob_i$'s of 4096 fields:
- **Evaluates the interpolated polynomial $p_i(X)$ at $z$ to yield the evaluation $y_i$**.
- **Adds the next summand in the incremental computations of the batched kzg commitment $C$, and the batched evaluation $y$**.
- **Adds the next contribution to the incremental computations of the batch challenges $z$ and $\gamma$**.
- **Adds the next contribution to the incremental computation of the $\texttt{blobCommitmentsHash}_M$ that was stored on L1 when checkpoint $M$ was proposed.**
- We do this to prove that the $C_i$'s "seen" by the circuits match the $C_i$'s that were submitted to L1 as part of checkpoint proposals.
**Inputs** for checkpoint $k$ of the epoch:
- $(\widetilde{blob}_{(k, 1)}, \widetilde{blob}_{(k, 2)}, ..., \widetilde{blob}_{(k, 6)})$
- That $\widetilde{blob}$ syntax lazily means "the $\mathbb{F}_{r, BN}$ version of each item in the blob, rather than the $\mathbb{F}_{r, BLS}$ version". (Tilde, because it's "almost a $blob$").
- Note: even if the checkpoint were to emit fewer than 6 blobs of data, this circuit still has to cope with up to 6 blobs. There's logic in the circuit to conditionally ignore unused blobs.
- $(C_{(k, 1)}, C_{(k, 2)}, ..., C_{(k, 6)})$
- $z_{BN} \in \mathbb{F}_{r, BN}$
- (an optimistic claim of $z$'s value. It will also be incrementally computed across all of this epoch's Checkpoint Root Rollup circuits: by the time we reach the Root Rollup circuit, we will be able to check whether this claimed $z$ is correct).
- $\gamma \in \mathbb{F}_{r,BLS}$
- (optimistically)
- Accumulators from the previous Checkpoint Root Rollup circuit:
- $\texttt{blobCommitmentsHash}_{acc}$
- $z_{acc}$
- $y_{acc}$
- $C_{acc}$
- $\gamma_{acc}$
- $\gamma_{pow\_acc}$
**Logic**
For checkpoint $k$ of the epoch:
1. $$
\begin{aligned}
\texttt{sponge_blob_hash}_k &:= \texttt{blob_fields_hash}_k \\
&:= \texttt{squeeze(absorb(}\\
& \quad\quad\quad ...\widetilde{blob}_{(k,1)}, ...\widetilde{blob}_{(k,2)}, ..., ...\widetilde{blob}_{(k,N_k)},\\
& \quad \texttt{))}
\end{aligned}
$$
For $blob_{(k,i)}$, belonging to checkpoint $k$, we derive:
2. $z_{(k, i)} := h(\texttt{sponge_blob_hash}_k, ...\mathbb{F}^2_{r, BN}(\bar{C}_{(k,i)})) \in \mathbb{F}_{r, BN}$,
- where "$...\mathbb{F}^2_{r, BN}(\bar{C}_{(k,i)}))$" means convert the 48-byte compressed form of $C_i$ into an array of two BN fields, and spread it into the preimage of the poseidon2 hash.
- These $z_i$'s are random fiat shamir challenges, which will be used to compute the barycentric challenge $z$.
3. $$
\begin{align}
z_{acc} \gets
\begin{cases}
z_{(1, 1)}, & \text{if}\ k=1 \& i=1 \\
h(z_{acc}, z_{(k, i)}), & \text{otherwise}
\end{cases}
\end{align}
$$
- Recall, we're aiming to eventually set in the Root Rollup circuit $z_{BN} \gets z_{acc}$, so that $z_{BN} := h(...h(h(z_1, z_2), z_3), ... z_{N_{epoch}})$ - the barycentric challenge.
3. Let $z \gets \mathbb{F}_{r,BLS}(z_{BN})$
4. $y_{(k, i)} = \frac{z^d - 1}{d} \cdot \sum_{j=0}^{d-1}{blob_{(k, i)}[j] \cdot \frac{\omega^j}{z - \omega^j}} \in \mathbb{F}_{r, BLS}$
- the barycentric evaluation of $blob_{(k, i)}$'s interpolated polynomial $p_{(k, i)}(X)$ at $z$.
- That is, $y_{(k, i)} = p_{(k, i)}(z)$.
- Recall: $d = 4096$
- Recall: $z$ is optimistically fed-into this circuit, and is propagated through all rollup circuits and checked to be correct in the Root Rollup circuit. The computation of $z$ is incremental, across all the circuits.
1. $\tilde{y}_{(k, i)} := h(...F_{\mathbb{F}_{r, BLS} \rightarrow \mathbb{F}_{r, BN}^3}(y_{(k, i)})) \in \mathbb{F}_{r,BN}$
- (a.k.a. `hashed_y_i` in the circuit)
5. $$
\begin{align}
\gamma_{acc} \gets
\begin{cases}
\tilde{y}_{(1, 1)}, & \text{if}\ k=1 \& i=1 \\
h(\gamma_{acc}, \tilde{y}_{(k, i)}), & \text{otherwise}
\end{cases}
\in \mathbb{F}_{r,BN}
\end{align}
$$
- Recall, we're aiming to eventually set in the Root Rollup circuit $\gamma_{BN} \gets h(\gamma_{acc}, z_{acc})$, so that $\gamma_{BN} := h(h(...h(h(\tilde{y}_1, \tilde{y}_2), \tilde{y}_3), ..., \tilde{y}_{N_{epoch}}), z_{BN}) \in \mathbb{F}_{r, BN}$ - the KZG batching challenge.
3. $$
\begin{align}
y_{acc} \gets
\begin{cases}
y_{(1, 1)}, & \text{if}\ k=1 \& i=1 \\
y_{acc} + \gamma_{pow\_acc} \cdot y_{(k, i)} \equiv \sum_{j=1}^{\sum_{l=1}^k N_l} \gamma^{j-1} \cdot y_j, & \text{otherwise}
\end{cases}
\in \mathbb{F}_{r, BLS}
\end{align}
$$
- Note: Not all 6*k blobs (up to and including this $k$-th checkpoint) will necessarily have been utilised. I.e. some blobs might have been empty. We skip empty blob indices. That's what this ugly $\sum_{j=1}^{\sum_{l=1}^k N_l}$ notation conveys: recall $N_l$ is the number of utilised blobs in checkpoint $l$.
- Note: requires expensive $\mathbb{F}_{r, BLS}$ operations.
- Recall, we're aiming to eventually set in the final circuit $y \gets y_{acc}$, so that $y = \sum_{i=1}^{N_{epoch}} \gamma^{i-1} \cdot y_i$.
3. $$
\begin{align}
C_{acc} \gets
\begin{cases}
C_{(1,1)}, & \text{if}\ k=1 \& i=1 \\
C_{acc} + \gamma_{pow\_acc} \cdot C_{(k, i)} \equiv \sum_{j=1}^{\sum_{l=1}^k N_l} \gamma^{j-1} \cdot C_j, & \text{otherwise}
\end{cases}
\in \mathbb{G}_{BLS}
\end{align}
$$
- Note: Not all 6*k blobs (up to and including this $k$-th checkpoint) will necessarily have been utilised. I.e. some blobs might have been empty. We skip empty blob indices. That's what this ugly $\sum_{j=1}^{\sum_{l=1}^k N_l}$ notation conveys: recall $N_l$ is the number of utilised blobs in checkpoint $l$.
- Note: requires expensive $\mathbb{G}_{BLS}$ operations.
- Recall, we're aiming to eventually set in the final circuit $C \gets C_{acc}$, so that $C = \sum_{i=1}^{N_{epoch}} \gamma^{i-1} \cdot C_i$.
1. $$
\begin{align}
& \texttt{blobCommitmentsHash}_{acc} \gets
\begin{cases}
\texttt{sha256_to_field}(\bar{C}_{(1,1)}), & \text{if}\ k=1 \& i=1 \\
\texttt{sha256_to_field}(\texttt{blobCommitmentsHash}_{acc}, \bar{C}_{(k,i)}), & \text{otherwise}
\end{cases} \\
& \quad \in \texttt{bytes32} \space \text{(technically, it fits inside a } \texttt{bytes31} \text{)}
\end{align}
$$
- > where $\bar{C}$ is the 48-byte compressed form of the commitment $C$:
> I.e. $\bar{C} := \texttt{compress_to_bytes48}(C)$
>
> where $\texttt{sha256_to_field}(\cdot):= \texttt{bytes32}(\texttt{0x00} || \texttt{bytes31}(\texttt{sha256}(\cdot)))$ - i.e. the sha256 digest truncated to 31 bytes and prefixed with a zero byte
- Recall, we're ultimately seeking to compute the same $\texttt{blobCommitmentsHash}_M$ that was already calculated and stored on L1 as part of the final checkpoint proposal of this epoch. We do it incrementally across many Checkpoint Root circuits to spread the constraint cost of all these sha256 hashes.
- Recall, ultimately:$$
\begin{aligned}
&\texttt{blobCommitmentsHash}_M = \\
&\quad \texttt{sha256_to_field}(... \texttt{sha256_to_field}(\texttt{sha256_to_field}(\bar{C}_1), \bar{C}_2), ... \bar{C}_{N_{epoch}})
\\ &\quad \in \texttt{bytes32}\space \text{(technically, it fits inside a } \texttt{bytes31} \text{)}, \\
\end{aligned}
$$
3. Set $\gamma_{pow\_acc}$ for the next iteration:
- $$
\begin{align}
\gamma_{pow\_acc} \gets
\begin{cases}
\gamma, & \text{if}\ k=1 \& i=1 \\
\gamma_{pow\_acc} \cdot \gamma, & \text{otherwise}
\end{cases}
\in \mathbb{F}_{r, BLS}
\end{align}
$$
- Notice: $\gamma_{pow\_acc} = \gamma^{i}$.
- At the start of the next iteration (when $i \gets i + 1$), we'll be starting the iteration with $\gamma_{pow\_acc} = \gamma^{i - 1}$, which is exactly the power we need when computing the next summands.
- Note: requires expensive $\mathbb{F}_{r, BLS}$ operations.
**Outputs** for checkpoint root rollup circuit $k$ of the epoch:
- $z_{BN} \in \mathbb{F}_{r, BN}$
- (an optimistic claim of $z$'s value).
- $\gamma \in \mathbb{F}_{r,BLS}$
- (optimistically)
- Accumulators at the end of this $k$-th Checkpoint Root Rollup circuit:
- $\texttt{blobCommitmentsHash}_{acc} \equiv \texttt{sha256_to_field}(... \texttt{sha256_to_field}(... \texttt{sha256_to_field}(\texttt{sha256_to_field}(\bar{C}_{(1,1)}), \bar{C}_{(1,2)}), ... , \bar{C}_{(k,1)}),... \bar{C}_{(k,N_k)})$
- $z_{acc} \equiv h(...h(...h(h(z_{(1,1)}, z_{(1,2)}), z_{(1,3)}), ... , z_{(k,1)}),... z_{(k,N_k)})$
- $y_{acc} \equiv \sum_{i=1}^{\sum_{l=1}^k N_l} \gamma^{i-1} \cdot y_i$
- $C_{acc} \equiv \sum_{i=1}^{\sum_{l=1}^k N_l} \gamma^{i-1} \cdot C_i$
- $\gamma_{acc} \equiv h(...h(h(\tilde{y}_1, \tilde{y}_2), \tilde{y}_3), ..., \tilde{y}_{\sum_{l=1}^k N_l})$
- $\gamma_{pow\_acc} \equiv \gamma^{(\sum_{l=1}^k N_l) - 1}$
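Here's a hedged Python sketch of the accumulator pattern above, over a toy modulus: updating $y_{acc}$ and $\gamma_{pow\_acc}$ one blob at a time reproduces the closed-form sum $\sum_i \gamma^{i-1} \cdot y_i$ (the real circuits do the analogous thing for $C_{acc}$ with BLS group operations).
```python
# Toy modulus standing in for the BLS12-381 scalar field.
P = 2**61 - 1
gamma = 314159
ys = [10, 20, 30, 40, 50]          # per-blob evaluations y_i across the whole epoch

y_acc, gamma_pow_acc = None, None
for y_i in ys:
    if y_acc is None:                                  # the "k = 1 & i = 1" case
        y_acc, gamma_pow_acc = y_i % P, gamma % P
    else:
        y_acc = (y_acc + gamma_pow_acc * y_i) % P      # add the next summand gamma^{i-1} * y_i
        gamma_pow_acc = gamma_pow_acc * gamma % P      # gamma_pow_acc = gamma^i after blob i

closed_form = sum(pow(gamma, i, P) * y_i for i, y_i in enumerate(ys)) % P
assert y_acc == closed_form        # y_acc == sum_i gamma^{i-1} * y_i
```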
---
### Root Rollup circuit
- **Ensures the "Start Blob Accumulator" of the epoch is empty.**
- **Check that the incrementally-calculated challenges $\gamma$ and $z$ actually match the optimistically-claimed versions that were used in all of the preceding Checkpoint Root circuits**.
- Taking the values outputted by the latest checkpoint of the epoch:
- assert($z_{BN} == z_{acc}$)
- assert($\gamma == \mathbb{F}_{r,BLS}(h(\gamma_{acc}, z_{acc}))$)
- Notice: we hash $z$ into the derivation of $\gamma$ for good measure.
- **Emit final public inputs to L1**, of which the ones relevant to this blob subprotocol are:
- $\texttt{blobCommitmentsHash}_{M} \gets \texttt{blobCommitmentsHash}_{acc} \equiv \texttt{sha256_to_field}(... \texttt{sha256_to_field}(\texttt{sha256_to_field}(\bar{C}_{(1,1)}), \bar{C}_{(1,2)}), ... , \bar{C}_{(M,1)}), ..., \bar{C}_{(M,N_M)})$
- $z_{BN} \gets z_{acc} \equiv h(...h(...h(h(z_{(1,1)}, z_{(1,2)}), z_{(1,3)}), ... , z_{(M,1)}), ..., z_{(M,N_M)})$
- $y \gets y_{acc} \equiv \sum_{i=1}^{N_{epoch}} \gamma^{i-1} \cdot y_i$
- $\bar{C} \gets \texttt{compress_to_bytes48}(C_{acc}) \equiv \sum_{i=1}^{N_{epoch}} \gamma^{i-1} \cdot C_i$
---
### Solidity: Epoch proof verification
Once all the circuits of the epoch have been proven, the Prover can submit their final epoch proof (a proof of the Root Rollup circuit) to L1.
:::success
The prover calls the `submitEpochRootProof` function of `EpochProofLib.sol`.
:::
The function inputs we care about for this Blobs subprotocol are:
$(\pi, \texttt{blobCommitmentsHash}_{M}, z_{BN}, y, \bar{C}, Q)$
where $\pi$ is the root rollup proof, $Q$ is the KZG opening proof's quotient polynomial commitment (computed offchain much earlier in this document), and the other items are as just described at the end of the previous subsection.
The L1 `submitEpochRootProof` can verify the blob-related data as follows:
1. Verify the proof $\pi$ and its public inputs (which include $\texttt{blobCommitmentsHash}_{M}, z_{BN}, y, \bar{C}$).
- This gives the verifier certainty that the inputs $(\texttt{blobCommitmentsHash}_{M}, z_{BN}, y, \bar{C})$ match those used inside the circuits.
- The logic of the circuits convinces the verifier that $y$ is indeed the evaluation of _some_ batched polynomial, evaluated at $z_{BN}$.
- The batched polynomial was derived from some _purported_ blob data that was fed into the circuits. But we haven't yet proven that the purported blob data (or equiv. the purported batched polynomial) relates to the batched polynomial commitment $\bar{C}$.
- I.e. The verifier is not yet convinced that $y$ is an opening of $\bar{C}$ at $z$.
3. Read $\texttt{blobCommitmentsHash}_{M}$ for the latest checkpoint $M$ from storage, and compare it against the input.
- This convinces us that all of the commitments $C_i$ that were used within the circuits are _the same commitments_ as those of the blobs that were submitted to L1 at the much earlier checkpoint proposal steps.
- This in turn convinces us that $\bar{C}$ is indeed a random linear combination of the _actual blobs'_ commitments, and that it's therefore valid for usage in the Plonk Paper's KZG batching protocol.
- The verifier is still not yet convinced that $y$ is an opening of $\bar{C}$ at $z$.
10. Call the point evaluation precompile with: $z, y, \bar{C}, Q$, which hopefully succeeds.
- Recall, the precompile checks: $e(C - [y]_1, [1]_2) == e(Q, [s]_2 - [z]_2)$.
- This is effectively a check of:
- $e(\sum_{i=1}^{N_{epoch}} \gamma^{i-1} \cdot [p_i(s)]_1 - \left[\sum_{i=1}^{N_{epoch}} \gamma^{i-1} \cdot y_i \right]_1, [1]_2) == e(\left[\sum_{i = 1}^{N_{epoch}} \gamma^{i-1} \cdot \frac{p_i(s) - p_i(z)}{s - z}\right]_1, [s - z]_2)$
- $e(G, H)^{\sum_{i=1}^{N_{epoch}} \gamma^{i-1} \cdot p_i(s) - \sum_{i=1}^{N_{epoch}} \gamma^{i-1} \cdot y_i} == e(G, H)^{\left( \sum_{i = 1}^{N_{epoch}} \gamma^{i-1} \cdot \frac{p_i(s) - p_i(z)}{s - z} \right) \cdot (s - z)}$
- Focussing on the exponent, cancelling the $(s - z)$ terms on the rhs, and combining the lhs sums:
- $\sum_{i=1}^{N_{epoch}} \gamma^{i-1} \cdot (p_i(s) - y_i) == \sum_{i=1}^{N_{epoch}} \gamma^{i-1} \cdot (p_i(s) - y_i)$
- Success!
- This convinces the verifier that $y$ is an opening of $\bar{C}$ at $z$.
---
## Summary
- The circuits used $(\texttt{blobCommitmentsHash}_{M}, z_{BN}, y, \bar{C})$.
- The circuits used some purported blob data, which, when interpolated into polynomials, batched, and evaluated at $z_{BN}$, yields $y$.
- $\bar{C}$ is a valid linear combination of the _actual blobs'_ commitments.
- $y$ is an opening of $\bar{C}$ at $z$
- So the data used by the circuits "opens to the same $y$" as the blobs which were submitted to L1.
- This implies the data used by the circuits _is the same data_ as was submitted as blobs to L1.