# EIP-4762 execution witness worst-case This document surfaces a consequence of [EIP-4762](https://eips.ethereum.org/EIPS/eip-4762) that is worth analyzing more carefully since it might mean changing the proposed constants. ## Context Let’s start with a minimal refresher so we can understand the point of this document correctly. ### Data grouping The potential proposed Verkle ([EIP-6800](https://eips.ethereum.org/EIPS/eip-6800)) and Binary ([EIP-7864](https://eips.ethereum.org/EIPS/eip-7864)) trees come with many fundamental changes apart from changing the tree structure. For example, they define a way to do code-chunkification, which solves the current worst-case scenario for block proving, and they also propose grouping data at the leaf level so the number of branch openings can be reduced, which helps for Verkle/Binary/SNARK proving (i.e. fewer branches = less hashing) This proposed grouping in the new trees implies a gas cost remodeling to define how this should be charged. For example, in today’s reality, accessing two storage slots always implies different branches. In the new reality, if the two storage slots live in the same branch, it is required to explain how this should be charged. This is where [EIP-4762](https://eips.ethereum.org/EIPS/eip-4762) comes into play, and part of it explains how gas costs for state access should be charged. The part of the EIP that describes how to do the gas cost charging is [here](https://eips.ethereum.org/EIPS/eip-4762#witness-gas-costs), but here is a TL;DR: - A branch (named *stem*) in the new tree contains 256 values (i.e. storage slots, code chunks, or account data) - Whenever a new stem is accessed `WITNESS_BRANCH_COST` is charged. - Accessing any value in this (now warm) stem is `WITNESS_CHUNK_COST`. The EIP also defines similar rules for tree writes. The whole idea of the grouping is to try to improve the efficiency of state access since, for example, if storage slot X is accessed, there’s a high chance that near storage slots could be accessed, which means having to pay less for them since this doesn’t mean having to open new branches for proving. ### Block execution witness Let’s look at the implications of this gas cost remodeling from the angle of potential block size increases. Recall that the idea of Verkle/Binary kind of statelessness is to include in the block an *execution witness* that contains information and cryptographic proof that the provided pre-state data for the block execution is correct. The execution witness is defined as: ```python class ExecutionWitness(container): state_diff: StateDiff verkle_proof: VerkleProof ``` where: - `state_diff` is a list describing the accessed/changed in the block execution. This is similar to what a block-access list would provide but in a different format grouped by stem. - `verkle_proof` information to prove that the provided `state_diff` corresponds to the parent state root. This allows the stateless client to verify that the `state_diff` information required for block re-execution is correct. This is enough information to understand the next section of this document. If you want to understand more about these internal structures, you can see their definitions in [this document](https://notes.ethereum.org/Si5tEWlMTYerkhG3fOIAMQ#Verkle-serialization-format-in-SSZ). Also, to understand more about the execution witness's gory details, refer to this [article](https://ihagopian.com/posts/anatomy-of-a-verkle-proof) I wrote a while ago. ## Execution witness size worst-case scenario The proposed gas remodeling, which reflects this grouping done to improve efficiency, implies a new potential worst-case for block size, which might not be evident since it behaves differently from worst-case sizes in the current MPT. I’ll describe the rationale in steps: - Every state access implies adding a value in `state_diff`. For example, if we access a storage slot, it must be included there. - For this state access, there’re two scenarios: - The storage slot is the first access in the stem, and it should pay `WITNESS_BRANCH_COST+WITNESS_CHUNK_COST`. - The storage slot isn’t the first access in the stem, in which case it should pay `WITNESS_CHUNK_COST` since another access already paid `WITNESS_BRANCH_COST`. - This already should signal that there’s a way to include at least 32 bytes in the `state_diff` only paying `WITNESS_CHUNK_COST`, which is `200`. To exploit this to generate the biggest execution witness size, we can try to access all values for each stem, which lowers the average access cost per byte included in `state_diff`. The worst case isn’t doing random access but actually the opposite. Let’s do the math with the proposed constants `WITNESS_BRANCH_COST=1900` and `WITNESS_CHUNK_COST=200`: - Accessing a complete group costs `1900+256*200=53_100` gas. - Accessing this entire group has the following impact on `state_diff`: - Add the `stem` value of 31 bytes (i.e., the key prefix for all these 256 values). - Add the actual pre-state value of the storage slot for all 256 entries, which is `256*32` bytes. - This means a total of 8223 bytes. - In a 36M gas limit block, we can do `36M/53_100~=678` of these accesses. - `678*8223~=5.3MiB` of `state_diff` in the *execution witness*. This is the gross number of the worst-case. The *execution witness* has other nits included, but the number won’t change much. For example, the `verkle_proof` part of the *execution witness* is less relevant since this kind of access minimizes the number of branches opened by definition. As in, the expected depth of the Verkle Tree is 4; thus opening 53_100 stems implies less than ~ `53_100*4*32=7KiB` of internal node commitments. The cryptographic proof (IPA+Multiproof) has a constant size in the number of openings, so it is unaffected by any state access. So the actual blow-up isn’t the `verkle_proof` per se, but the `state_diff`. This is rooted in the intention of the tree design to allow more efficient state access since you can have a lot of storage slot access for `200` gas costs, which is trying to improve UX. The point of this document is also to clarify this new worst-case scenario and account for it in the EIP design. ## Should we do something about this? The reality is that 5.3MiB might be too big, which increases linearly with the gas limit. It also depends on the networking block distribution protocol improvements and bandwidth requirements recommended in nodes. If this number is too big, a reconsideration of `WITNESS_CHUNK_COST` and `WITNESS_BRANCH_COST` should be done. Note that the current values tried to simulate today's reality where `WITNESS_BRANCH_COST+WITNESS_CHUNK_COST` is `2100` so is very similar to current cold-SLOAD costs. A potential option is to increase `WITNESS_CHUNK_COST` to regulate the impact of the described “attack” impact on `state_diff`, while also potentially lowering `WITNESS_BRANCH_COST` only if we want to keep this 2100 cost for single-value access. In any case, this requires a more thoughtful discussion.