This post-verkle state expiry scheme suggests adding additional state epoch metadata for each value in the leaf node and including it in the calculation of the commitment. The metadata is used to determine if a value is expired or not. Expired values could not be accessed, and users have to first submit a transaction to revive the values by providing the necessary witnesses.
The definition of a state epoch refers to a time period measured in blocks. This parameter could be adjusted through a governance proposal and pushed to the network through hard forks. Ideally, an epoch should be around 6 months to 1 year so that values do get expired without compromising the user experience significantly.
On the client side, the epoch parameter is represented with 16-bit values (i.e., represents 0 to 65535). This should be sufficient with the assumption that a state epoch should span at least several months
A value is deemed expired if and only if the following equation holds true:
To simplify, a value is expired if the difference between the current chain’s epoch and the value’s epoch is greater than or equal to DEFAULT_EPOCH_LIVE_NUM
. DEFAULT_EPOCH_LIVE_NUM
is another parameter used to determine how many epochs a value can stay in the network without getting expired and becoming inaccessible. The recommended default value is 2.
To summarize this section, assuming we’re going with a state epoch period of 6 months and DEFAULT_EPOCH_LIVE_NUM
of 2, this means that any value can only exist for a maximum of 1 year before getting expired.
The original commitment of an extension node only consists of 1, Stem, C1, and C2. In this proposal, we’re going to modify the extension node such that a third commitment called C3 is included. C3 represents the commitment of 256 state epoch values where individual state epochs correspond to their values in C1 or C2. If a value of a suffix doesn’t exist, then its state epoch defaults to 0.
Therefore, the commitment of the new extension node consists of 1, Stem, C1, C2, and C3.
When a group of values (i.e., C1 or C2) are expired, we would retain its cryptographic commitment and prune the underlying values from the database. If all 256 values in the extension node are expired, then the extension node is transformed into a “hashed” node where only its commitment stays, and underlying values are pruned.
Performing any CRUD action on a particular value will update its associated state epoch accordingly. For example:
Given key 0xAA......AA
, the last accessed epoch is 5. The current state epoch is 6. Upon accessing the value, the epoch for this key is updated to 6.
However, if the CRUD task accesses an expired value, then it’s a rule to first revive the value before proceeding. For example:
Given key 0xAA......AA
, the last accessed epoch is 5. The current state epoch is 10. Upon accessing the value, an error is returned, and the transaction is reverted because, as per the expiry rule, the value is deemed expired. Upon resurrecting this value by providing the necessary witness, the value’s state epoch is updated to the current state epoch.
To enable the state expiry scheme fully, we would require 2 hard forks. The first hard fork is used to enable the creation of a new extension node. The second hard fork enables the expiry feature. Here’s how the flow would look like:
Epoch 0: Everything is the same as before
—Enable Hardfork 1—
Epoch 1: Performing CRUD on values would create the new extension node with state epoch metadata. Values that are not accessed in this epoch would not be affected.
—Enable Hardfork 2—
Epoch 2: All legacy extension nodes would be deemed expired and pruned. It’s safe to do so because legacy extension nodes are guaranteed to have their last accessed epoch at epoch 0.
A new value format is introduced to the snapshot layer. That is, each storage or account key may or may not correspond to a value where the epoch information is included. The value for the snapshot KV pairs is RLP-encoded for the new format or it’s the raw value bytes. Here’s the pseudocode for the structure changes:
Based on the codebase, the state scheme used is path-based only. The current method of storing Verkle nodes is to first serialize them and store the raw bytes in the database. In particular, the serialization of an extension node is as follows:
The new format of serializing an extension node would include the commitment C3 and the state epochs. It would be something like:
We should also differentiate between the legacy extension node and the new extension node, as both of them may exist at the same time. Here’s an example of the pseudocode:
Since accessing an expired value reverts the transaction, we need a new transaction type to allow users to send the appropriate witness to revive the values that they care about. The following is the pseudocode for the new transaction type:
Witnesses should allow multiple values to be revived through a single transaction. One particular note is that since all information (i.e. account balance, nonce, storage slot) are embedded in a single Verkle Tree, proofs need to specify the data type that it corresponds to. This is a tricky part because contract codes are chunkified and inserted into the tree. The architecture may need encoding and decoding target keys separately to find the exact value to be revived. An example of the witness format looks like this:
The current SLOAD and SSTORE opcodes do not return an error if a state does not exist. In this state expiry scheme, accessing expired values should return error so that the transaction execution halts and reverts.
Each witness in a revive transaction should also be charged with a gas price. For now, it’s not this project’s best interest to figure out the best number of per-byte gas charged for a witness. Therefore, we’ll just go with some “reasonable” number.
For hash-based state scheme, offline prune utilizes bloom filter to identify which trie nodes are staled and can be deleted from the disk. For path-based state scheme, the idea of offline prune seems redundant (p.s. I could be wrong, TBD in the future).
In any case, we would still require a method where client can do an offline prune to delete the expired values from the tree. This may include rebuilding the entire state tree, traverse through each leaf node and delete them 1 by 1.
TBD