# Ethereum mainnet - Analysis of MPT depth and VKT locality optimization in block executions This document provides summarized information about mainnet transaction execution state trie access for the Merkle Patricia Storage Trees, and on-the-fly analysis of future Verkle Tree branch locality for the same access patterns. # Context and reason for this work Previously, we did a [static analysis of the Merkle Patricia Tree depth of the Ethereum mainnet network](https://hackmd.io/@jsign/geth-mpt-analysis). This was useful to understand the depth of the State Trie and have a distribution of depths for the Storage Trees. What’s important to note about static analysis is that it isn’t factor in the frequency at which each contract is accessed in the actual usage of the network. For example, if 99.99% of the contract Storage Trees have a tree depth of 1, and 0.01% have a depth of 10, we can’t conclude that most Ethereum clients will have to deal with trees of depth 1. This can happen because the most accessed contract is probably the one with more storage; this isn’t a rule, but it makes sense. In our example, if 99% of the transactions target 0.01% of the contracts, the real access depth is closer to ~10 and not ~1. This work aims to collect Storage Tree average depth access on real transactions while they’re being executed in a real client (i.e: in our case Geth). Additionally, we’ll simulate how a Verkle Trie design decision to optimize for access locality would work in *the same* access patterns. While we analyze the actual transaction storage slot accesses, we’d map how these behave on Verkle Trie branch accesses. We’ll explain more about this later. # Methodology To achieve this goal, I built a [Go native Geth tracer](https://geth.ethereum.org/docs/developers/evm-tracing/custom-tracer), which allows hooks into different parts of the execution pipeline. In particular, we can re-execute previous transactions and capture a particular opcode (i.e: `SSTORE` and `SLOAD`) that is being executed. Armed with this possibility, we can load the storage trie and calculate the depth of the accessed branch. Moreover, since the storage slot is present in the EVM stack at the point `SSTORE` and `SLOAD` are executed, we can read this value. This means that we have the preimage of the Storage Trie key, which allows us to calculate [the *stem*](https://blog.ethereum.org/2021/12/02/verkle-tree-structure) corresponding to the storage slot in the future Verkle Trie migration. With this, we can simulate the branch accesses in the Verkle Trie and analyze how many accesses for a branch are: - *First-time access*: A branch is accessed in the VKT for the transaction. - *Hot access*: a second or further access to a previously read branch in the VKT for the transaction. - *Free access*: this is a special case of *hot access* where the storage slot is less than 64, since these are optimized in a special *stem* that is always loaded for the transaction, so it doesn’t incur in *First-time access.* If the above sounds a bit cryptic, I’d highly recommend reading [EIP-4762: Statelessness gas cost changes](https://eips.ethereum.org/EIPS/eip-4762) and the *[Layout of the verkle tree* section from the Verkle Tree blog post](https://blog.ethereum.org/2021/12/02/verkle-tree-structure). The analysis stack is formed by: - [The custom Geth native tracer](https://github.com/jsign/go-ethereum/blob/jsign/liveanalysis/eth/tracers/native/treeaccesslogger.go), which enables calling `debug_traceTransaction(..., "treeAccessLogger")` for a custom tracer to get raw data about storage slot access for MPT depth and VKT branch access. - [A custom tool `trace-depth`](https://github.com/jsign/trace-depth) that continuously calls this new tracer from the Geth node and saves the data in SQLite while the Geth node continues to execute blocks. The same tool provides flag `-results` to create the summary of the collected data. With a Geth node synced to mainnet, we can run the enhanced version from the first bullet, with the `trace-depth` as a sidecar process. Note that this work is trying to do the VKT branch access analysis without a Geth node with access to any existing Keccak preimages. This is why we “simulate” the VKT branch access on-the-fly since we can pull the “preimage” from the EVM stack while the transaction is executed. There’s no migrated VKT tree in the Geth node since we don’t have the information to do that! # Results Here we present the results generated by running the analysis stack and getting the automatic result summary from `trace-depth`. Later, I’ll explain in more detail what each field of the summary means. ```markdown -- Summary -- Block range: [16348389, 16356771] (8383 total blocks) Total txns: 1183641 Total read/write accesses: 25865749 MPT average depth is: 5.04 Top 20 contracts with the most accesses: Contract 0xc02aaa39b223fe8d0a0e5c4f27ead9083c756cc2 Read/Write slot accesses: 1633727 (6.32%) MPT Avg. read/write access depth: 6.92 VKT Reads branch accesses: 0.04% free, 66.24% hot, 33.72% cold VKT Writes branch accesses: 0.00% free, 33.89% hot, 66.11% cold Contract 0x06450dee7fd2fb8e39061434babcfc05599a6fb8 Read/Write slot accesses: 1501938 (5.81%) MPT Avg. read/write access depth: 7.62 VKT Reads branch accesses: 54.28% free, 31.23% hot, 14.49% cold VKT Writes branch accesses: 25.59% free, 58.75% hot, 15.66% cold Contract 0xdac17f958d2ee523a2206206994597c13d831ec7 Read/Write slot accesses: 1398251 (5.41%) MPT Avg. read/write access depth: 6.93 VKT Reads branch accesses: 54.61% free, 3.88% hot, 41.51% cold VKT Writes branch accesses: 0.00% free, 4.39% hot, 95.61% cold Contract 0xa0b86991c6218b36c1d19d4a2e9eb0ce3606eb48 Read/Write slot accesses: 1275505 (4.93%) MPT Avg. read/write access depth: 6.82 VKT Reads branch accesses: 10.10% free, 39.67% hot, 50.23% cold VKT Writes branch accesses: 0.09% free, 9.98% hot, 89.93% cold Contract 0x00000000006c3852cbef3e08e8df289169ede581 Read/Write slot accesses: 709661 (2.74%) MPT Avg. read/write access depth: 7.21 VKT Reads branch accesses: 15.52% free, 43.33% hot, 41.15% cold VKT Writes branch accesses: 37.26% free, 31.66% hot, 31.08% cold Contract 0x000000000000ad05ccc4f10045630fb830b95127 Read/Write slot accesses: 513161 (1.98%) MPT Avg. read/write access depth: 6.35 VKT Reads branch accesses: 24.09% free, 56.13% hot, 19.79% cold VKT Writes branch accesses: 29.26% free, 31.71% hot, 39.03% cold Contract 0x42dc0cecefbaf8e81d631a75fa212510c347f66b Read/Write slot accesses: 314540 (1.22%) MPT Avg. read/write access depth: 4.89 VKT Reads branch accesses: 35.73% free, 12.57% hot, 51.70% cold VKT Writes branch accesses: 10.43% free, 7.48% hot, 82.10% cold Contract 0x000000000000aaeb6d7670e522a718067333cd4e Read/Write slot accesses: 270848 (1.05%) MPT Avg. read/write access depth: 4.56 VKT Reads branch accesses: 0.00% free, 42.69% hot, 57.31% cold VKT Writes branch accesses: 0.00% free, 9.45% hot, 90.55% cold Contract 0x2b591e99afe9f32eaa6214f7b7629768c40eeb39 Read/Write slot accesses: 222207 (0.86%) MPT Avg. read/write access depth: 6.67 VKT Reads branch accesses: 3.52% free, 5.54% hot, 90.95% cold VKT Writes branch accesses: 21.38% free, 17.21% hot, 61.40% cold Contract 0x44e94034afce2dd3cd5eb62528f239686fc8f162 Read/Write slot accesses: 143861 (0.56%) MPT Avg. read/write access depth: 6.04 VKT Reads branch accesses: 6.06% free, 79.79% hot, 14.15% cold VKT Writes branch accesses: 47.21% free, 8.59% hot, 44.19% cold Contract 0xcb323294d1cfb4547126b0e299af2e1fab7f1943 Read/Write slot accesses: 139639 (0.54%) MPT Avg. read/write access depth: 4.88 VKT Reads branch accesses: 30.80% free, 16.70% hot, 52.50% cold VKT Writes branch accesses: 4.21% free, 23.73% hot, 72.06% cold Contract 0x5a98fcbea516cf06857215779fd812ca3bef1b32 Read/Write slot accesses: 133654 (0.52%) MPT Avg. read/write access depth: 6.55 VKT Reads branch accesses: 8.47% free, 81.94% hot, 9.60% cold VKT Writes branch accesses: 0.00% free, 6.76% hot, 93.24% cold Contract 0x5f8c3af28b7af2fd628f9ccd14d12570da0715ae Read/Write slot accesses: 128564 (0.50%) MPT Avg. read/write access depth: 4.32 VKT Reads branch accesses: 24.00% free, 50.59% hot, 25.41% cold VKT Writes branch accesses: 11.83% free, 40.16% hot, 48.01% cold Contract 0x7db5af2b9624e1b3b4bb69d6debd9ad1016a58ac Read/Write slot accesses: 120187 (0.46%) MPT Avg. read/write access depth: 5.41 VKT Reads branch accesses: 42.20% free, 52.41% hot, 5.39% cold VKT Writes branch accesses: 53.68% free, 11.48% hot, 34.84% cold Contract 0xcfc19e20d0a7c57b7bfa18c74179de576cd29ecb Read/Write slot accesses: 117295 (0.45%) MPT Avg. read/write access depth: 3.86 VKT Reads branch accesses: 45.97% free, 42.18% hot, 11.85% cold VKT Writes branch accesses: 27.86% free, 9.96% hot, 62.18% cold Contract 0x88e6a0c2ddd26feeb64f039a2c41296fcb3f5640 Read/Write slot accesses: 107145 (0.41%) MPT Avg. read/write access depth: 5.85 VKT Reads branch accesses: 68.41% free, 17.62% hot, 13.97% cold VKT Writes branch accesses: 83.88% free, 9.09% hot, 7.03% cold Contract 0x769272677fab02575e84945f03eca517acc544cc Read/Write slot accesses: 99990 (0.39%) MPT Avg. read/write access depth: 4.54 VKT Reads branch accesses: 0.03% free, 46.05% hot, 53.92% cold VKT Writes branch accesses: 0.00% free, 20.04% hot, 79.96% cold Contract 0xc9aa0d98a7d7899f7ef6daa0e7482e3c4e4f140b Read/Write slot accesses: 96017 (0.37%) MPT Avg. read/write access depth: 4.24 VKT Reads branch accesses: 56.18% free, 24.01% hot, 19.81% cold VKT Writes branch accesses: 11.50% free, 6.15% hot, 82.35% cold Contract 0x74312363e45dcaba76c59ec49a7aa8a65a67eed3 Read/Write slot accesses: 94583 (0.37%) MPT Avg. read/write access depth: 6.42 VKT Reads branch accesses: 12.14% free, 33.29% hot, 54.57% cold VKT Writes branch accesses: 70.11% free, 0.00% hot, 29.89% cold Contract 0x5954ab967bc958940b7eb73ee84797dc8a2afbb9 Read/Write slot accesses: 93651 (0.36%) MPT Avg. read/write access depth: 5.38 VKT Reads branch accesses: 35.68% free, 42.58% hot, 21.73% cold VKT Writes branch accesses: 20.17% free, 21.95% hot, 57.89% cold ``` # Summary output description Let’s explain what the summary is showing. ## Header ```markdown -- Summary -- Block range: [<start>, <end>] (<numBlocks> total blocks) Total txns: <totalTxns> Total read/write accesses: <totalReadWriteAccesses> MPT average depth is: <avgMPTDepth> ``` - `<start>` and `<end>` describe which Ethereum mainnet blocks were analyzed which generated these results. `<numBlocks> = <end>-<start>+1` - `<totalTxns>` are the total number of transactions executed in the analyzed block range. - `<totalReadWriteAccess>` is the total `SSTORE` and `SLOAD` opcodes analyzed in the block range (i.e: every storage slot read/write access). - `<avgMPTDepth>` is the average depth of a Storage Trie branch access. Every branch access depth was logged and averaged with the total number of branch accesses. ## Contract summary ```markdown Contract <contractAddr> Read/Write slot accesses: <readWriteSlotAccesses> (<percOfTotal>) MPT Avg. read/write access depth: <MPTAvgDepth> VKT Reads branch accesses: <readFree>% free, <readHot>% hot, <readCold>% cold VKT Writes branch accesses: <writeFree>% free, <writeHot>% hot, <writeCold>% cold ``` - `<contractAddr>` is the address of the contract. - `<readWriteSlotAccesses>` is the total number of read/write storage slot accesses for this contract in the analyzed block range. - `<percOfTotal>` is the % of read/write storage slots compared to the total analyzed in the block range. This factor sorts results. So, the top in the list is the contract that did more read/write storage slot accesses. - `<MPTAvgDepth>` has the same semantics described in the *Header* section but is scoped to this contract. - `<(read|write)Free>` indicates the % of read/write branch access for a storage slot <64. - `<(read|write)Cold>` indicates the % of read/write branch accesses for a storage slot ≥64 that have accessed a VKT *stem* for the first time in the transaction execution. - `<(read|write>Hot>` indicates the % of read/write branch access for a storage slot ≥64 for a *stem* that has been previously accessed (i.e., previous *cold access* for that *stem* has happened). Note that `<(read|write)Cold>` accesses can provide insight into `WITNESS_BRANCH_COST` and `SUBTREE_EDIT_COST`. # Conclusion If we compare the static results from this live mainnet analysis, we can see that the true MPT depth access is way higher than expected. This makes intuitive sense since the most popular contracts probably have a reasonable amount of storage data. This work also provides interesting results about how the Verkle Triee locality optimization made in the design plays out on real contract executions (no testnets, nor fake contracts). This could provide insight data to refine [EIP-4762](https://eips.ethereum.org/EIPS/eip-4762) and to the owners of contracts on what they might expect regarding future gas cost changes. To potentially improve the usefulness in this regard, we can think of further iterations of this work, for example: - Consider counting count how many “first time” VKT leaf access happen to have some intuition around `CHUNK_EDIT_COST`. - Instead (or apart from) showing the percentages of VKT branch accesses, we could show the average first-time accesses per contract (e.g: contract X has Y branch cold accesses per transaction). This can help estimate future gas costs since using % in summary only describes “access patterns”. - Consider if we could also count `CHUNK_FILL_COST` situations in the access patterns.