Stateless Ethereum With Merkle Witnesses

This post offers a path forward for implementing statelessness in Ethereum protocol without changing the trie layout. It suggests a tradeoff for eliminating the worst case size of a Merkle witness by having stateless nodes store the contract bytecodes.

Huge thanks to @notnotstorm for gathering the data.

Previous Works

Original suggestions for stateless Ethereum implementation with existing state trie layout have been turned down due to the worst case size of the witness. The worst case witness size is primarily caused by the need to include entire contract bytecode and is over 153 MiB (150MiB of contract bytecodes and 3MiB of proofs).

Worst case napkin math
​​​​Let's take an Ethereum block with current gas limit which is filled 
​​​​with one transaction. The transaction repeatedly accesses unique 
​​​​contract bytecodes via `EXTCODESIZE`.

​​​​The transaction loads contract addresses from storage and calls
​​​​`EXTCODESIZE` on the target. For simplicity, the calculation 
​​​​excludes the cost of lookup related and stack operations.
​​​​The approximate maximum number of contract codes the transaction
​​​​can load is
​​​​(max gas used - base cost) / (SLOAD cost + EXTCODESIZE cost)
​​​​(30 000 000 - 21 000) / (2100 + 2600) =~ 6 378

​​​​Assuming each bytecode is unique and has the maximum size of
​​​​24 KiB (as defined by [EIP-170](https://eips.ethereum.org/EIPS/eip-170)),
​​​​that brings the size of the contracts included in the witness to
​​​​6 378 * (32 + 24 576) =~ 150MiB
​​​​
​​​​Currently, there are around 265 mil accounts in the state trie.
​​​​The average depth of the trie is log16(265mil) =~ 7.
​​​​We can estimate the number of unique nodes in state witness using
​​​​formula `N * (C * (D - 1) + 1)`, where D is the average depth of
​​​​the trie, C is the overlap factor and N is the number of accessed
​​​​accounts (leaves). For uniformly distributed state keys we can assume
​​​​the overlap factor of 0.5 thus giving us the number of unique nodes
​​​​in the worst case state witness
​​​​6 378 * (0.5 * (6 - 1) + 1) = 25 512
​​​​
​​​​Provided we store state witness as `keccak(rlp(node)): rlp(node)`
​​​​and the average size of a node is 100 bytes, the size of the state
​​​​witness in the worst case is
​​​​(100 + 32) * 25 512 =~ 3.2Mib
​​​​
​​​​That brings the rough total size of worst case witness to ~153MiB.

Migration to Verkle trie eliminates the worst case by chunking the contract bytecode across multiple leaves and charging the cost of access per leave. Additionally, it reduces the proof size due to a more flat trie layout with less intermediate nodes. However, this requires all Ethereum nodes to perform live database migration.

Trading Off Worst Case Size For Storage

The contract bytecodes today are stored outside of the state trie. The account leaves only contain the commitment (code hash) to the contract code which is then used by the client implementation to retrieve the actual contract code.

The worst case size of the Merkle witness can be significantly brought down by excluding contract bytecodes from the witness and making stateless nodes responsible for retrieving and storing contract bytecodes.

As of Nov 24, 2024, the total uncompressed size of unique Ethereum contracts is less than 14 GiB.

Ethereum Total Unique Bytecode Bytes

The size growth of unique contracts has been ranging between 100 and 330 MiB over the past 5 years.

image_720-1

Select a repo