Try   HackMD

Post-Verkle State Expiry Scheme Design Specs

Summary

This post-verkle state expiry scheme suggests adding additional state epoch metadata for each value in the leaf node and including it in the calculation of the commitment. The metadata is used to determine if a value is expired or not. Expired values could not be accessed, and users have to first submit a transaction to revive the values by providing the necessary witnesses.

State Epoch Metadata

State Epoch Definition

The definition of a state epoch refers to a time period measured in blocks. This parameter could be adjusted through a governance proposal and pushed to the network through hard forks. Ideally, an epoch should be around 6 months to 1 year so that values do get expired without compromising the user experience significantly.

On the client side, the epoch parameter is represented with 16-bit values (i.e., represents 0 to 65535). This should be sufficient with the assumption that a state epoch should span at least several months

type StateEpoch uint16

Expiry Rules

A value is deemed expired if and only if the following equation holds true:

chainEpoch - valueEpoch >= DEFAULT_EPOCH_LIVE_NUM

To simplify, a value is expired if the difference between the current chain’s epoch and the value’s epoch is greater than or equal to DEFAULT_EPOCH_LIVE_NUM. DEFAULT_EPOCH_LIVE_NUM is another parameter used to determine how many epochs a value can stay in the network without getting expired and becoming inaccessible. The recommended default value is 2.

To summarize this section, assuming we’re going with a state epoch period of 6 months and DEFAULT_EPOCH_LIVE_NUM of 2, this means that any value can only exist for a maximum of 1 year before getting expired.

Verkle Tree (VKT)

Extending extension Node

Image Not Showing Possible Reasons
  • The image was uploaded to a note which you don't have access to
  • The note which the image was originally uploaded to has been deleted
Learn More →

The original commitment of an extension node only consists of 1, Stem, C1, and C2. In this proposal, we’re going to modify the extension node such that a third commitment called C3 is included. C3 represents the commitment of 256 state epoch values where individual state epochs correspond to their values in C1 or C2. If a value of a suffix doesn’t exist, then its state epoch defaults to 0.

Therefore, the commitment of the new extension node consists of 1, Stem, C1, C2, and C3.

Expiry & Resurrection

Image Not Showing Possible Reasons
  • The image was uploaded to a note which you don't have access to
  • The note which the image was originally uploaded to has been deleted
Learn More →

When a group of values (i.e., C1 or C2) are expired, we would retain its cryptographic commitment and prune the underlying values from the database. If all 256 values in the extension node are expired, then the extension node is transformed into a “hashed” node where only its commitment stays, and underlying values are pruned.

Performing any CRUD action on a particular value will update its associated state epoch accordingly. For example:

Given key 0xAA......AA, the last accessed epoch is 5. The current state epoch is 6. Upon accessing the value, the epoch for this key is updated to 6.

However, if the CRUD task accesses an expired value, then it’s a rule to first revive the value before proceeding. For example:

Given key 0xAA......AA, the last accessed epoch is 5. The current state epoch is 10. Upon accessing the value, an error is returned, and the transaction is reverted because, as per the expiry rule, the value is deemed expired. Upon resurrecting this value by providing the necessary witness, the value’s state epoch is updated to the current state epoch.

Hard Fork

Image Not Showing Possible Reasons
  • The image was uploaded to a note which you don't have access to
  • The note which the image was originally uploaded to has been deleted
Learn More →

To enable the state expiry scheme fully, we would require 2 hard forks. The first hard fork is used to enable the creation of a new extension node. The second hard fork enables the expiry feature. Here’s how the flow would look like:

Epoch 0: Everything is the same as before

—Enable Hardfork 1—

Epoch 1: Performing CRUD on values would create the new extension node with state epoch metadata. Values that are not accessed in this epoch would not be affected.

—Enable Hardfork 2—

Epoch 2: All legacy extension nodes would be deemed expired and pruned. It’s safe to do so because legacy extension nodes are guaranteed to have their last accessed epoch at epoch 0.

State Storage

Snapshot

Image Not Showing Possible Reasons
  • The image was uploaded to a note which you don't have access to
  • The note which the image was originally uploaded to has been deleted
Learn More →

A new value format is introduced to the snapshot layer. That is, each storage or account key may or may not correspond to a value where the epoch information is included. The value for the snapshot KV pairs is RLP-encoded for the new format or it’s the raw value bytes. Here’s the pseudocode for the structure changes:

const (
	RawValueType       = iota // length <= 32 bytes	
	ValueWithEpochType        
)

type SnapshotValue interface {
	GetType() byte
	GetEpoch() uint16
	GetVal() []byte
	EncodeToRLPBytes(buf *rlp.EncoderBuffer)
}

Verkle Tree DB

Based on the codebase, the state scheme used is path-based only. The current method of storing Verkle nodes is to first serialize them and store the raw bytes in the database. In particular, the serialization of an extension node is as follows:

Image Not Showing Possible Reasons
  • The image was uploaded to a note which you don't have access to
  • The note which the image was originally uploaded to has been deleted
Learn More →

The new format of serializing an extension node would include the commitment C3 and the state epochs. It would be something like:

Image Not Showing Possible Reasons
  • The image was uploaded to a note which you don't have access to
  • The note which the image was originally uploaded to has been deleted
Learn More →

We should also differentiate between the legacy extension node and the new extension node, as both of them may exist at the same time. Here’s an example of the pseudocode:

const (
	internalRLPType         iota
	legacyLeafRLPType     
	expiryLeafRLPType
)

ExpiryLeafNode struct {
		stem   []byte
		values [][]byte

		commitment *Point
		c1, c2, c3     *Point

		depth byte
}

EVM

New Transaction Type

Since accessing an expired value reverts the transaction, we need a new transaction type to allow users to send the appropriate witness to revive the values that they care about. The following is the pseudocode for the new transaction type:

type ReviveStateTx struct {
	ChainID    *big.Int
	Nonce      uint64
	GasTipCap  *big.Int // a.k.a. maxPriorityFeePerGas
	GasFeeCap  *big.Int // a.k.a. maxFeePerGas
	Gas        uint64
	To         *common.Address `rlp:"nil"` // nil means contract creation
	Value      *big.Int
	Data       []byte
	AccessList AccessList
	V *big.Int 
	R *big.Int 
	S *big.Int 

	WitnessList []ReviveWitness // New field for state revive
}

Witness Format

Witnesses should allow multiple values to be revived through a single transaction. One particular note is that since all information (i.e. account balance, nonce, storage slot) are embedded in a single Verkle Tree, proofs need to specify the data type that it corresponds to. This is a tricky part because contract codes are chunkified and inserted into the tree. The architecture may need encoding and decoding target keys separately to find the exact value to be revived. An example of the witness format looks like this:

type ReviveWitness struct {
	witnessType byte // may support other witness types in the future 
	proofList []LeafValueProof    	
}

type LeafValueProof struct {
	dataType uint256 // account balance, nonce, storage slot, etc
	key []byte
	proof []byte // can be C1, C2 or commitment of extension node
	values []byte // C1/C2 values (128 * 32 bytes) or all values (256 * 32 bytes)
}

SLOAD & SSTORE

The current SLOAD and SSTORE opcodes do not return an error if a state does not exist. In this state expiry scheme, accessing expired values should return error so that the transaction execution halts and reverts.

Gas

Each witness in a revive transaction should also be charged with a gas price. For now, it’s not this project’s best interest to figure out the best number of per-byte gas charged for a witness. Therefore, we’ll just go with some “reasonable” number.

Prune

Offline Prune

For hash-based state scheme, offline prune utilizes bloom filter to identify which trie nodes are staled and can be deleted from the disk. For path-based state scheme, the idea of offline prune seems redundant (p.s. I could be wrong, TBD in the future).

In any case, we would still require a method where client can do an offline prune to delete the expired values from the tree. This may include rebuilding the entire state tree, traverse through each leaf node and delete them 1 by 1.

Inline Prune

TBD