A few paths to statelessness and state expiry

Two paths to a solution exist, and have existed for a long time: weak statelessness and state expiry.

State expiry: remove from the state any objects that have not been accessed recently (think: within the last year), and require witnesses to revive expired state. This would reduce the state that everyone needs to store to a flat ~20-50 GB.
Weak statelessness: only require block proposers to store state, and allow all other nodes to verify blocks statelessly.

The good news is that there have recently been major improvements on both of these paths that greatly reduce the downsides of each:

Some techniques for how a ReGenesis-like epoch-based expiry scheme can be adapted to minimize resurrection conflicts
Piper Merriam's work on transaction gossip networks that add witnesses to be stateless-client-friendly, and his work on distributed state storage and on-demand availability
Verkle trees, which can reduce worst-case witness sizes from ~4 MB to ~800 kB (this is definitely small enough, because existing worst-case blocks that are full of calldata are already 12.5M / 16 ~= 780 kB and we have to handle those anyway). See slides, doc, code.
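The calldata figure quoted in the last item can be checked with a few lines of arithmetic. This is only a rough sketch: the constant names are illustrative, it assumes the 12.5M block gas limit used in this post and 16 gas per non-zero calldata byte, and it ignores per-transaction overhead.

```python
# Back-of-the-envelope check of the worst-case calldata block size.
# Assumptions: 12.5M block gas limit (the figure used in this post)
# and 16 gas per non-zero calldata byte; per-transaction overhead is ignored.
BLOCK_GAS_LIMIT = 12_500_000
GAS_PER_CALLDATA_BYTE = 16

max_calldata_bytes = BLOCK_GAS_LIMIT // GAS_PER_CALLDATA_BYTE
print(f"worst-case calldata per block: {max_calldata_bytes / 1000:.0f} kB")
```

The result is within a few percent of the ~800 kB Verkle witness bound, which is the sense in which clients already have to handle data of that size.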

The purpose of this post is to go beyond theory and broad concepts, and present some concrete potential roadmaps for how either weak statelessness or state expiry can be introduced.

Option 1: In-place swap to a Verkle tree

We can use EIP 2584, which was initially intended to switch state storage from a hexary tree to a binary tree, to instead switch state storage directly to a Verkle tree. Switching to a Verkle tree makes it possible to create very compact witnesses for any block, enabling stateless verification.

Here is how the EIP 2584 procedure works in diagram form:

[Diagram of the EIP 2584 transition procedure; image not available]

Once such a transition is complete, the Ethereum state is from that point on authenticated by a Verkle tree, making it theoretically possible to generate compact witnesses that allow stateless verification. The functionality to actually do the stateless verification and witness broadcasting does not have to be added immediately; "Verkle tree first, stateless infrastructure second" is a perfectly reasonable route, though realistically at least some basic stateless infrastructure (eg. generating a witness for a block and publishing that witness in a dedicated subnet) could be built and tested in parallel with the transition process.

Additionally, Piper's work on stateless transaction gossip can also be worked on and implemented at the same time, though it is not strictly a failure if being part of the transaction-sending network continues to require nodes to have full state for some time.

This approach gets us to stateless verification, but it does not implement any kind of state expiry.

Option 2: per-epoch state expiry

We can implement the state expiry mechanism proposed here. The core idea is that there would be a state tree per epoch (think: 1 epoch ~= 8 months), and when a new epoch begins, an empty state tree is initialized for that epoch and any state updates go into that tree. Full nodes in the network would only be required to store the most recent two trees, so on average they would only be storing state that was read or written in the last ~1.5 epochs ~= 1 year.

This puts a permanent cap (proportional to the gas limit) on how much state clients would need to store (quick estimate: 2.5m blocks * 12.5m gas per block / 20000 gas cost of filling a new storage slot = 1.56b objects ~= 75 GB maximum, though under normal circumstances it would be much smaller).
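The quick estimate above can be reproduced as follows. The block count, gas limit and per-slot gas cost are the post's own figures; the bytes-per-object value is an assumption, back-derived here so that 1.56b objects come out to roughly 75 GB.

```python
# Upper bound on retained state under per-epoch expiry, using the post's figures.
BLOCKS_RETAINED = 2_500_000    # blocks whose state writes full nodes must keep
BLOCK_GAS_LIMIT = 12_500_000   # gas per block
GAS_PER_NEW_SLOT = 20_000      # gas cost of filling a new storage slot

# Assumption (not stated in the post): about 48 bytes stored per object.
BYTES_PER_OBJECT = 48

max_objects = BLOCKS_RETAINED * BLOCK_GAS_LIMIT // GAS_PER_NEW_SLOT
max_state_gb = max_objects * BYTES_PER_OBJECT / 1e9

print(f"{max_objects / 1e9:.2f}b objects, about {max_state_gb:.0f} GB")
```

Because the bound is a product of the gas limit and the retention window, raising either one raises the cap proportionally.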


There would also be a separate address space per epoch; all existing accounts and contracts would be put into address space 0. A new CREATE opcode would be added that takes an address space as an argument and creates an account in that address space. Different address spaces can coexist in the same trie: we mix the address space index into the trie key to avoid collisions.
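One way to picture "mixing the address space index into the trie key" is to hash the index together with the address. The sketch below is purely illustrative: the function name `trie_key`, the 8-byte index encoding and the use of SHA-256 are all assumptions, since the post does not specify the key derivation.

```python
import hashlib


def trie_key(address_space: int, address: bytes) -> bytes:
    """Derive a trie key that is distinct per (address space, address) pair.

    Hypothetical scheme: the post only says the index is mixed into the key.
    """
    # A fixed-width prefix keeps the concatenation unambiguous.
    prefix = address_space.to_bytes(8, "big")
    return hashlib.sha256(prefix + address).digest()


addr = bytes.fromhex("11" * 20)
# The same 20-byte address lands at different keys in different address spaces.
assert trie_key(0, addr) != trie_key(1, addr)
```

With a derivation along these lines, objects from every address space can share one trie without colliding.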

Rules for reading and writing

There are two key principles:

Only the most recent tree (ie. the tree corresponding to the current epoch) can be modified. All older trees are no longer modifiable; objects in older trees can only be modified by creating copies of them in newer trees, and these copies supersede the older copies.
Full nodes (including block proposers) are expected to only hold the most recent two trees, so only objects in the most recent two trees can be read without a witness. Reading older objects requires providing witnesses.

Creation of new objects (accounts or storage slots) is governed by the address space mechanism: address space $n$ can only be modified in epochs $\geq n$. A new object in address space $n$ can be created in epochs $n$ or $n + 1$ without providing a witness, but creating an object in address space $n$ in some epoch $e > n + 1$ requires witnesses to prove that an object in that same position was not already created.
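The creation rule can be written as a pair of small predicates. This is a sketch with hypothetical function names, not part of any specification.

```python
def can_modify(address_space: int, epoch: int) -> bool:
    """Address space n can only be modified in epochs >= n."""
    return epoch >= address_space


def creation_needs_witness(address_space: int, epoch: int) -> bool:
    """True if creating a new object in `address_space` during `epoch`
    requires a witness proving the position was not already used."""
    if not can_modify(address_space, epoch):
        raise ValueError("address space cannot be modified before its own epoch")
    # Epochs n and n + 1 are covered by the two trees every full node holds.
    return epoch > address_space + 1


for epoch in (2, 3, 4):
    print(epoch, creation_needs_witness(2, epoch))
```

For address space 2, creation is witness-free in epochs 2 and 3 and needs a witness from epoch 4 onward.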

Expressed in precise terms, the rules are as follows. When we refer to "a state object $(e, s)$", $e$ is the epoch number of the object, and $s$ is the address itself (including the storage key if we are talking about a storage slot). The trees are referred to as $S_{0}, S_{1}, S_{2}, \ldots$

If an object $(e, s)$ is modified or created during epoch $e$, this can be done directly with a modification to tree $S_{e}$.
If an object $(e, s)$ is modified or created during epoch $e + 1$, this can be done directly with a modification to tree $S_{e + 1}$. The object is put into tree $S_{e + 1}$; tree $S_{e}$ is not modified.
If an object $(e, s)$ is modified during epoch $f > e + 1$, and this object is already part of tree $S_{f}$ or $S_{f - 1}$, then this can be done directly with a modification to tree $S_{f}$.
If an object $(e, s)$ is first created during epoch $f > e + 1$ and was never before touched, then the sender of the transaction creating this object must provide a witness showing the object's absence in all trees $S_{e}, S_{e + 1}, \ldots, S_{f - 2}$.
If an object $(e, s)$ is modified during epoch $f > e + 1$, and this object is not yet part of tree $S_{f}$, and the object was most recently part of tree $S_{e'}$ with $e \leq e' < f - 1$, then the sender of the transaction modifying this object must provide a witness showing the state of the object in tree $S_{e'}$ and its absence in all trees $S_{e' + 1}, S_{e' + 2}, \ldots, S_{f - 2}$.
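The five rules above can be condensed into one function that reports which old trees a transaction must supply proofs for. This is a minimal sketch: the name `witness_plan`, its arguments and its return shape are hypothetical, and real witnesses would be Merkle or Verkle proofs rather than lists of epoch numbers.

```python
from typing import Optional


def witness_plan(e: int, f: int, latest_tree: Optional[int]) -> dict:
    """Which old trees need proofs when object (e, s) is touched in epoch f?

    e            epoch number (address space) of the object
    f            current epoch; every write goes into tree S_f
    latest_tree  index of the newest tree holding the object, or None if the
                 object was never created
    """
    if f < e:
        raise ValueError("an object in address space e cannot be touched before epoch e")
    if latest_tree is not None and not (e <= latest_tree <= f):
        raise ValueError("latest_tree must lie between e and f")

    # Rules 1-3: epoch e or e + 1, or the object already sits in S_f or S_{f-1}.
    # Full nodes hold the relevant trees, so no witness is needed.
    if f <= e + 1 or (latest_tree is not None and latest_tree >= f - 1):
        return {"presence": None, "absence": []}

    # Rule 4: first creation in epoch f > e + 1; prove absence in S_e .. S_{f-2}.
    if latest_tree is None:
        return {"presence": None, "absence": list(range(e, f - 1))}

    # Rule 5: prove the value in S_{e'} and absence in S_{e'+1} .. S_{f-2}.
    return {"presence": latest_tree, "absence": list(range(latest_tree + 1, f - 1))}


# Object last touched in epoch 0, accessed in epoch 3:
print(witness_plan(e=0, f=3, latest_tree=0))
```

In that example the plan asks for a presence proof in tree 0 and an absence proof in tree 1; tree 2 is still held by full nodes, so it needs no witness.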

Note one additional implementation detail: when an object from an old epoch is read or written to, and its post-transaction value is zero, the object still needs to be saved in the new state tree. "Zero" and "absent" are no longer synonyms, as "zero" means there's nothing there and "absent" means "the latest version of this object might be in older trees, check those first".
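The distinction between "zero" and "absent" shows up as soon as a lookup falls through the recent trees. The sketch below uses plain dictionaries as stand-ins for state trees; `read_recent` and `ABSENT` are hypothetical names.

```python
from typing import Dict, List, Optional

ABSENT = None  # "not in the recent trees": older trees must be checked via a witness


def read_recent(trees: List[Dict[bytes, int]], key: bytes) -> Optional[int]:
    """Look up `key` in the two most recent trees, newest first.

    Returns the stored value (which may be 0) or ABSENT if neither tree has it.
    """
    for tree in reversed(trees[-2:]):
        if key in tree:
            return tree[key]
    return ABSENT


# The slot held 5 in epoch 0 and was cleared in epoch 3. Storing the explicit
# zero in the epoch 3 tree is what hides the stale 5.
with_zero = [{b"slot": 5}, {}, {}, {b"slot": 0}]
without_zero = [{b"slot": 5}, {}, {}, {}]

assert read_recent(with_zero, b"slot") == 0
assert read_recent(without_zero, b"slot") is ABSENT
```

If the zero were dropped, as in `without_zero`, the lookup would report the slot as absent and a witness against the epoch 0 tree could revive the outdated value.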

[Diagram of the per-epoch state trees for the example below; image not available]

Suppose the dark-blue object was last modified in epoch 0, and you want to read/write it in a transaction in epoch 3. To prove that epoch 0 really was the last time the object was touched, we need to prove the dark-blue values in epochs 0, 1 and 2. Full nodes still have the full epoch 2 state, so no witness is required. For epochs 0 and 1, we do need witnesses: the light blue nodes, plus the purple nodes that can be regenerated during witness verification. After this operation, a copy of the object is saved in the epoch 3 state.

Implementation

Implementing this at the consensus layer is fairly straightforward; the main work to be done is:

Changing the state tree structure to support an array of trees
Adding the new CREATE opcode and something similar for EOAs
Adding the functionality to verify witnesses (including a new transaction type)
Figuring out the exact gas mechanics.

However, there is also work at the ecosystem level that needs to be done: particularly, giving transaction senders the ability to generate witnesses for old state. This is the same work that is needed to support transaction senders generating stateless witnesses, except it only applies to old state, and the witnesses are against static trees so the implementation is much simpler. At the beginning, it may be acceptable to just have block explorers provide witness fillers for transactions, and then decentralize that functionality over time.

Option 3: per-epoch state expiry + Verkle trees in one step

Implement option 2 verbatim, except that all trees except the epoch 0 tree are Verkle trees. This allows us to have efficient stateless verification once we get into epoch 2, as at that point state objects will either be refreshed and in the new trees, or they will be in the old tree but accessing them for the first time will require 4 kB witnesses and will be more expensive.

The upside is that this gets us both weak statelessness and state expiry in one step, without the complexity of a full state re-hashing procedure. If we want to remove the complexity of needing to process both hexary Patricia branches from epoch 0 and Verkle proofs from later epochs, we could simply make a hard fork during epoch 2 to replace the hexary Patricia root with a Verkle root to equivalent data, so that we could prove all old states with Verkle proofs only from then on.


Micah

2021/03/04 16:08:17

We can use [EIP 2584](https://eips.ethereum.org/EIPS/eip-2584),

I recommend creating a new EIP for this, rather than trying to reform that one. We can just mark that as withdrawn if no one wants it anymore.

Paul D.

2021/03/04 22:25:05

75 GB

A worse case is to CREATE contracts totaling 62.5 kB per block (at 200 gas per byte of code). Over 2.5m blocks, this would be ~156 GB.

Alex Beregszaszi

2021/03/05 00:15:57

There would also be a separate _address space_ per epoch; all existing accounts and contracts would be put into address space 0.

On Discord it was discussed that addresses would become larger than 160 bits. This means compilers like Solidity and existing contracts need to be investigated for potential issues. Solidity itself assumes addresses to be 160 bits and removes masking in certain cases during optimisation. This may or may not cause issues with storage or other places. Talking to new addresses from old contracts may have problems.

vbuterin

2021/03/17 21:10:45

We could always double the contract creation gas cost to 400 gas per byte!

2021/03/17 21:11:10

Another alternative is the idea of a separate EIP-1559-like adjustable gasprice for storage filling.

2021/03/17 21:11:56

Suppose the dark-blue object was last modified in epoch 0

The dark blue in epoch 1 and 2 is just representing that same location in the epoch 1 and 2 state trees.

Andrei Maiboroda

2021/05/04 09:52:09

Creation of _new_ objects (accounts or storage slots) is governed by the address space mechanism: address space $n$ can only be modified in epochs $\ge n$. A new object in address space $n$ is created in epochs $n$ or $n+1$ without providing a witness, but creating an object in address space $n$ in some epoch $e > n+1$ requires witnesses to prove that an object in that same position was not already created.

Why would one want to create an object in some old address space?

Albus Dompeldorius

2021/06/04 11:51:47

Someone could receive a token at an address and not move it until a later epoch. The address would be created when it is first moved.

