reth-verkle poc

Motivation

Stateless clients are necessary for the decentralisation of Ethereum, for the following reasons:

Ethereum's current state is too large for many nodes to keep in working memory. It requires expensive SSDs for storage and slows down block validation and chain syncing. Stateless clients will allow validators to validate blocks without maintaining the full state, significantly reducing their resource requirements and sync times.
Using verkle tries for statelessness makes the client architecture more compatible with a zkEVM future, with added benefits that have already been explored, such as allowing large increases of the gas limit.

The main motivation for creating a reth-verkle PoC is to develop another working implementation of verkle integration in EL clients. This will help with running interop with other clients, support further research, and allow reth to prepare for the Verge. To learn more about why statelessness is important, see: Why it's so important to go stateless

Abstract

This project aims to integrate the rust-verkle cryptographic primitives into reth and enable it to act as a stateless client.
A basic TL;DR:

allow construction of the witness (state_diffs and verkle_proof) during block execution, followed by serialization of this witness.
propagation of this witness along with the block, for stateless validation by other clients.
obtaining the serialized witness from other clients, then properly deserializing its contents.
using this deserialized data to compute a partial view of the pre-state trie.
this pre-state trie will be used to prove, against pre_state_root, the correctness of the pre-state key-value pairs (state_diffs) obtained from the witness, which are used in executing the next block.
if the above verification succeeds (that is, the provided state_diffs are indeed part of the trie whose root is our trusted pre_state_root), proceed with stateless execution:
- create a local copy of state_db from the witness's data.
- then use this state_db for block execution rather than the local chain state.

Goal

The end goal of this project will be achieved if reth is able to join the Kaustinen devnet and pass all verkle-execution-spec tests.

Specification

This section provides the required information about the structure of the various rust-verkle components that will subsequently be used in reth.
These technical specifications follow the defined specs and the Verkle serialization format in SSZ for making changes in reth:

A block/execution witness struct (i.e. the verkle proof required to execute a block statelessly) will be created. It is the SSZ-encoded serialization of the following ExecutionWitness structure:
```
class ExecutionWitness(Container):
    state_diff: StateDiff
    verkle_proof: VerkleProof
```

state_diff will contain all the pre-state data required to execute the given block statelessly on other clients (basically the key-value pairs of the verkle trie's leaf nodes). StateDiff definition:

MAX_STEMS = 2**16
VERKLE_WIDTH = 256

class SuffixStateDiff(Container):
    suffix: Byte

    # Null means not currently present
    current_value: Union[Null, Bytes32]

    # Null means value not updated
    new_value: Union[Null, Bytes32]

class StemStateDiff(Container):
    stem: Stem
    # Valid only if list is sorted by suffixes
    suffix_diffs: List[SuffixStateDiff, VERKLE_WIDTH]

# Valid only if list is sorted by stems
StateDiff = List[StemStateDiff, MAX_STEMS]
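
As a rough sketch of how these containers could look on the Rust side, the types below mirror the SSZ definitions and add a check for the two "valid only if sorted" rules. The `Stem` alias (31 bytes), the use of `Option` for the `Union[Null, Bytes32]` fields, and the helper name `is_well_formed` are assumptions made for illustration, not existing reth or rust-verkle API. Strictly increasing order is assumed, which also rejects duplicate stems and suffixes.

```rust
/// Assumed: a stem is the first 31 bytes of a 32-byte tree key.
pub type Stem = [u8; 31];

pub const MAX_STEMS: usize = 1 << 16;
pub const VERKLE_WIDTH: usize = 256;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SuffixStateDiff {
    pub suffix: u8,
    /// `None` means the leaf is not currently present.
    pub current_value: Option<[u8; 32]>,
    /// `None` means the value was not updated.
    pub new_value: Option<[u8; 32]>,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StemStateDiff {
    pub stem: Stem,
    pub suffix_diffs: Vec<SuffixStateDiff>,
}

pub type StateDiff = Vec<StemStateDiff>;

/// Hypothetical helper: checks the list bounds and that stems and
/// suffixes are in strictly increasing order.
pub fn is_well_formed(diff: &[StemStateDiff]) -> bool {
    if diff.len() > MAX_STEMS {
        return false;
    }
    let stems_sorted = diff.windows(2).all(|w| w[0].stem < w[1].stem);
    let suffixes_ok = diff.iter().all(|s| {
        s.suffix_diffs.len() <= VERKLE_WIDTH
            && s.suffix_diffs.windows(2).all(|w| w[0].suffix < w[1].suffix)
    });
    stems_sorted && suffixes_ok
}
```

A deserializer could call a check like this before any cryptographic verification, so that malformed witnesses are rejected cheaply.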

verkle_proof will contain all the data the verifier needs to reconstruct a partial view of the pre-state trie (using the commitments, the root node and the given block values) for the data present in state_diff. This partial view is used to prove that the provided pre-state data is indeed part of the trie whose root is the trusted state root already held by the client. VerkleProof definition:

BandersnatchGroupElement = Bytes32
BandersnatchFieldElement = Bytes32
MAX_COMMITMENTS_PER_STEM = 33 # = 31 for inner nodes + 2 (C1/C2)
IPA_PROOF_DEPTH = 8 # = log2(VERKLE_WIDTH)

class IpaProof(Container):
    C_L: Vector[BandersnatchGroupElement, IPA_PROOF_DEPTH]
    C_R: Vector[BandersnatchGroupElement, IPA_PROOF_DEPTH]
    final_evaluation: BandersnatchFieldElement

class VerkleProof(Container):
    # [Group A]
    other_stems: List[Bytes32, MAX_STEMS]
    depth_extension_present: List[uint8, MAX_STEMS]
    commitments_by_path: List[BandersnatchGroupElement, MAX_STEMS * MAX_COMMITMENTS_PER_STEM]
    # [Group B]
    D: BandersnatchGroupElement
    ipa_proof: IpaProof

Here, other_stems, depth_extension_present and commitments_by_path are the data used to construct the partial view of the verkle trie, and ipa_proof is the verkle proof used to open the commitments on the path from the provided leaf nodes to the trie root, which proves that the provided data is indeed correct.
For more details on the changes and terms mentioned above, refer to this great article by Ignacio: Anatomy of a verkle proof
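
A possible Rust mirror of these proof containers is sketched below. The field names follow the SSZ definition, but the snake_case renames (`c_l`, `c_r`, `d`) and the `within_bounds` helper are assumptions for illustration, not taken from rust-verkle. The compile-time assertion only restates the comment that `IPA_PROOF_DEPTH` equals log2 of `VERKLE_WIDTH`.

```rust
pub const MAX_STEMS: usize = 1 << 16;
pub const VERKLE_WIDTH: usize = 256;
pub const MAX_COMMITMENTS_PER_STEM: usize = 33; // 31 inner nodes + C1/C2
pub const IPA_PROOF_DEPTH: usize = 8;

// Compile-time check: IPA_PROOF_DEPTH == log2(VERKLE_WIDTH).
const _: () = assert!(1usize << IPA_PROOF_DEPTH == VERKLE_WIDTH);

pub type BandersnatchGroupElement = [u8; 32];
pub type BandersnatchFieldElement = [u8; 32];

pub struct IpaProof {
    pub c_l: [BandersnatchGroupElement; IPA_PROOF_DEPTH],
    pub c_r: [BandersnatchGroupElement; IPA_PROOF_DEPTH],
    pub final_evaluation: BandersnatchFieldElement,
}

pub struct VerkleProof {
    // Group A: data used to rebuild the partial trie view.
    pub other_stems: Vec<[u8; 32]>,
    pub depth_extension_present: Vec<u8>,
    pub commitments_by_path: Vec<BandersnatchGroupElement>,
    // Group B: the multiproof itself.
    pub d: BandersnatchGroupElement,
    pub ipa_proof: IpaProof,
}

impl VerkleProof {
    /// Hypothetical helper: enforces the SSZ list limits from the spec.
    pub fn within_bounds(&self) -> bool {
        self.other_stems.len() <= MAX_STEMS
            && self.depth_extension_present.len() <= MAX_STEMS
            && self.commitments_by_path.len() <= MAX_STEMS * MAX_COMMITMENTS_PER_STEM
    }
}
```

Using fixed-size arrays for `c_l` and `c_r` lets the type system enforce the `Vector` length, while the `List` limits still need a runtime check.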

Implementation plan

1. Data types/helpers

This section gives pseudo-code for the data types and utility/helper functions that would be needed for the verkle migration:

trie/verkle.rs:

struct VerkleTrie {
    root: Box<dyn VerkleNode>,
    db: Box<Database>,
    ended: bool,
}

struct ChunkedCode(Vec<u8>);

impl VerkleTrie {
    fn to_dot(&self) -> String;
    
    fn new(root: Box<dyn VerkleNode>, db: Box<Database>, ended: bool) -> Self;
    
    fn flatdb_node_resolver(&self, path: &[u8]) -> Result<Vec<u8>, Error>;
    
    fn insert_migrated_leaves(&mut self, leaves: Vec<LeafNode>) -> Result<(), Error>;
    
    fn get_key(&self, key: &[u8]) -> Vec<u8>;
    
    fn get_storage(&self, addr: Address, key: &[u8]) -> Result<Vec<u8>, Error>;
    
    fn get_with_hashed_key(&self, key: &[u8]) -> Result<Vec<u8>, Error>;
    
    fn get_account(&self, addr: Address) -> Result<Option<StateAccount>, Error>;
    
    fn update_account(&mut self, addr: Address, acc: &StateAccount) -> Result<(), Error>;
    
    fn update_stem(&mut self, key: &[u8], values: Vec<Vec<u8>>) -> Result<(), Error>;
    
    fn update_storage(&mut self, address: Address, key: &[u8], value: &[u8]) -> Result<(), Error>;
    
    fn delete_account(&mut self, addr: Address) -> Result<(), Error>;
    
    fn delete_storage(&mut self, addr: Address, key: &[u8]) -> Result<(), Error>;
    
    fn hash(&self) -> Hash;
    
    fn commit(&mut self, _: bool) -> Result<(Hash, Option<NodeSet>), Error>;
    
    fn node_iterator(&self, start_key: &[u8]) -> Result<Box<dyn NodeIterator>, Error>;
    
    fn prove(&self, key: &[u8], proof_db: &mut dyn KeyValueWriter) -> Result<(), Error>;
    
    fn copy(&self) -> Self;
    
    fn is_verkle(&self) -> bool;
    
    fn set_storage_root_conversion(&mut self, addr: Address, root: Hash);
    
    fn clear_storage_root_conversion(&mut self, addr: Address);
    
    fn update_contract_code(&mut self, addr: Address, code_hash: Hash, code: &[u8]) -> Result<(), Error>;
}

fn prove_and_serialize(
    pretrie: &VerkleTrie,
    posttrie: Option<&VerkleTrie>,
    keys: Vec<Vec<u8>>,
    resolver: impl Fn(&[u8]) -> Result<Vec<u8>, Error>,
) -> Result<(VerkleProof, StateDiff), Error>;

fn deserialize_and_verify_verkle_proof(
    vp: &VerkleProof,
    pre_state_root: &[u8],
    post_state_root: &[u8],
    statediff: StateDiff,
) -> Result<(), Error>;

fn chunkify_code(code: &[u8]) -> ChunkedCode;

Open question: which parts should live in the alloy crates and be referenced from there, and which parts should live in the reth crates.
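
To make the `chunkify_code` helper listed above more concrete, here is a minimal sketch that follows the code chunking scheme described in EIP-6800: the code is split into 31-byte chunks, and each chunk is prefixed with one byte giving the number of leading bytes that are push-data (capped at 31). Returning `Vec<[u8; 32]>` instead of the `ChunkedCode` wrapper is a simplification made here, and the constant names are chosen for illustration.

```rust
const PUSH_OFFSET: u8 = 0x5f; // PUSHn opcode = PUSH_OFFSET + n
const PUSH1: u8 = 0x60;
const PUSH32: u8 = 0x7f;
const CHUNK_PAYLOAD: usize = 31;

/// Splits EVM bytecode into 32-byte chunks: 1 metadata byte + 31 code bytes.
pub fn chunkify_code(code: &[u8]) -> Vec<[u8; 32]> {
    // Zero-pad the code to a multiple of 31 bytes.
    let padded_len = (code.len() + CHUNK_PAYLOAD - 1) / CHUNK_PAYLOAD * CHUNK_PAYLOAD;
    let mut padded = code.to_vec();
    padded.resize(padded_len, 0);

    // pushdata_left[i] = push-data bytes remaining at position i
    // (0 when position i holds an opcode).
    let mut pushdata_left = vec![0u8; padded_len + 32];
    let mut pos = 0;
    while pos < padded_len {
        let op = padded[pos];
        let n = if (PUSH1..=PUSH32).contains(&op) {
            (op - PUSH_OFFSET) as usize
        } else {
            0
        };
        pos += 1;
        for x in 0..n {
            pushdata_left[pos + x] = (n - x) as u8;
        }
        pos += n;
    }

    padded
        .chunks(CHUNK_PAYLOAD)
        .enumerate()
        .map(|(i, body)| {
            let mut chunk = [0u8; 32];
            chunk[0] = pushdata_left[i * CHUNK_PAYLOAD].min(31);
            chunk[1..].copy_from_slice(body);
            chunk
        })
        .collect()
}
```

The metadata byte lets a stateless verifier decide whether a jump destination inside a chunk is real code or push-data without needing the preceding chunks.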

2. Integration of rust-verkle cryptographic primitives in reth.

The following functions will be added:

function deserialize_and_verify_verkle_proof: responsible for constructing a partial view of the pre-state trie for the given state_diffs using verkle_proof, and for validating the correctness of the pre-state data against the pre_state_root.
1. takes verkle_proof, pre_state_root, post_state_root and state_diffs as inputs.
2. provides proper mechanisms for deserialisation.
3. constructs the pre-state trie and checks that the obtained pre-state key-value pairs are legitimate.
4. constructs the post-state trie and verifies its root against the given post_state_root.
5. internally calls verify_execution_witness (see here) from rust-verkle.
note:
- geth has recently moved its implementation of this function, DeserializeAndVerifyVerkleProof, to go-verkle in this pr. After proper testing and implementation I will do the same, and then use this function as an endpoint from rust-verkle.
- some functionality may need to be added to rust-verkle for the above checks, as currently everything, including pre-state trie construction, is combined in the single function verify_execution_witness.
function prove_and_serialize: responsible for creating the serialized verkle_proof and state_diffs objects.
1. receives pre_trie, post_trie and keys.
2. calls the multiproof aggregation function from the ipa_multipoint crate of rust-verkle to get the proof.
3. calls the serialization utilities from rust-verkle to get the serialized proof and state differences.
4. will be called by the execute_state_transitions function.
note:
- almost all of the utility functions required from rust-verkle are already available, but they may need to be aggregated or separated according to purpose. I will take go-verkle as the reference for doing this in rust-verkle.
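
The state-diff half of prove_and_serialize can be illustrated without any cryptography. The sketch below groups accessed keys by stem and emits the entries in sorted order, which a `BTreeMap` gives for free. The function name `build_state_diff`, its two map parameters, and the assumption that every written key also appears in the accessed set are hypothetical choices for this example, not the rust-verkle API.

```rust
use std::collections::BTreeMap;

type Key = [u8; 32];
type Value = [u8; 32];

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SuffixStateDiff {
    pub suffix: u8,
    pub current_value: Option<Value>,
    pub new_value: Option<Value>,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StemStateDiff {
    pub stem: [u8; 31],
    pub suffix_diffs: Vec<SuffixStateDiff>,
}

/// Hypothetical helper. `accessed` maps every key touched by the block to its
/// pre-state value (`None` if absent); `written` holds only updated values.
pub fn build_state_diff(
    accessed: &BTreeMap<Key, Option<Value>>,
    written: &BTreeMap<Key, Value>,
) -> Vec<StemStateDiff> {
    let mut out: Vec<StemStateDiff> = Vec::new();
    // BTreeMap iterates keys in lexicographic order, so stems come out
    // sorted and suffixes are sorted within each stem.
    for (key, current) in accessed {
        let mut stem = [0u8; 31];
        stem.copy_from_slice(&key[..31]);
        let entry = SuffixStateDiff {
            suffix: key[31],
            current_value: *current,
            new_value: written.get(key).copied(),
        };
        match out.last_mut() {
            Some(last) if last.stem == stem => last.suffix_diffs.push(entry),
            _ => out.push(StemStateDiff { stem, suffix_diffs: vec![entry] }),
        }
    }
    out
}
```

The proof half would then be produced over the same sorted key set, so the verifier sees the keys in the same order as the prover.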

3. Block execution with witness as StateDB

TODO
TL;DR: after receiving the next block and its witness data, execution should switch to the stateless path and use a copy of the witness data as the stateDB instead of a copy of the client's local chain state.
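
One way to picture "witness as StateDB" is a small in-memory store built from the verified pre-state values. The sketch below is only illustrative: `WitnessStateDb`, `WitnessError` and their methods are hypothetical names, and a real integration would sit behind reth's own state provider abstractions. The key point is that a read of a key missing from the witness must fail rather than silently return an empty value.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq, Eq)]
pub enum WitnessError {
    /// The block touched a key that the witness does not cover.
    MissingKey([u8; 32]),
}

/// Hypothetical state backend backed only by witness data.
pub struct WitnessStateDb {
    values: HashMap<[u8; 32], Option<[u8; 32]>>,
}

impl WitnessStateDb {
    /// Builds the store from (stem, suffix, current_value) triples taken
    /// from an already verified state diff.
    pub fn from_pre_state(
        entries: impl IntoIterator<Item = ([u8; 31], u8, Option<[u8; 32]>)>,
    ) -> Self {
        let values = entries
            .into_iter()
            .map(|(stem, suffix, value)| {
                let mut key = [0u8; 32];
                key[..31].copy_from_slice(&stem);
                key[31] = suffix;
                (key, value)
            })
            .collect();
        Self { values }
    }

    /// `Ok(None)` means proven absent; `Err` means not covered by the witness.
    pub fn get(&self, key: &[u8; 32]) -> Result<Option<[u8; 32]>, WitnessError> {
        self.values
            .get(key)
            .copied()
            .ok_or(WitnessError::MissingKey(*key))
    }

    /// Records a write made during block execution.
    pub fn set(&mut self, key: [u8; 32], value: [u8; 32]) {
        self.values.insert(key, Some(value));
    }
}
```

Distinguishing "proven absent" from "not in the witness" matters because the second case means the witness is incomplete and the block cannot be validated statelessly.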

client references:

ethereumJS has implemented a very basic architecture for this. For an overview, see the utility functions in statelessVerkleStateManager.ts and their use in the runBlock function.
geth has not implemented this part yet; it lacks the wiring needed to run a block statelessly.
For nethermind, I have yet to study the state-sync architecture and verkle implementation. My impression is that nethermind is leading in this part; I will update this section once I have gone through it.

4. 5. 6…

TODO
Later changes, related to gas-cost modifications and more, will be added subsequently.

Testing Plan

TODO
see geth reference gballet-go-etherum/core/state_processor_test.go

Milestones

The first part of the implementation will focus on implementing the plan above and passing the very basic test_verkle_from_mpt_conversion execution-spec test.
The second part will focus on passing the next iteration of verkle-execution-spec tests, which are going to be released within a week (for reference see this pr). These tests cover all the edge cases of EIPs 6800, 4762 and 7709.
The third part will focus on joining the latest devnet at that time (maybe devnet-7, though this may change). Important changes introduced in devnet-7:
- the major change is implementing the gas changes discussed at the Kenya interop, i.e. packing many account fields into a "BASIC_DATA" leaf. See this pr: Update EIP-4762: reworked gas schedule from interop.
- addition of the parent state root to the witness. See this pr: eip6800: add parent's root to the witness.
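
As an illustration of the BASIC_DATA packing mentioned above, the sketch below packs the account fields into a single 32-byte leaf. The byte layout used here (version at offset 0, code size in bytes 5..8, nonce in bytes 8..16, balance in bytes 16..32, all big-endian, with bytes 1..5 reserved) is an assumption based on the EIP-6800 text and should be checked against the spec version targeted by the devnet. The function name `pack_basic_data` is hypothetical.

```rust
/// Assumed limit: code size must fit in 3 bytes.
const MAX_CODE_SIZE: u32 = 1 << 24;

/// Hypothetical helper: packs account fields into one 32-byte leaf value.
/// Returns `None` if `code_size` does not fit in 3 bytes.
pub fn pack_basic_data(version: u8, code_size: u32, nonce: u64, balance: u128) -> Option<[u8; 32]> {
    if code_size >= MAX_CODE_SIZE {
        return None;
    }
    let mut leaf = [0u8; 32];
    leaf[0] = version;
    // Bytes 1..5 are left as zero (reserved in the assumed layout).
    leaf[5..8].copy_from_slice(&code_size.to_be_bytes()[1..]);
    leaf[8..16].copy_from_slice(&nonce.to_be_bytes());
    leaf[16..32].copy_from_slice(&balance.to_be_bytes());
    Some(leaf)
}
```

Packing these fields into one leaf means a single witness entry covers the most frequently read account data, which is what the reworked gas schedule relies on.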

References

overview of verkle-tries

get an overview of verkle-trie structure

After getting the gist of the trie structure, please read this article by Ignacio: Anatomy of a Verkle proof. It gives all the understanding needed for the verkle migration in EL clients.

in-depth blogs:

overview of cryptography

A basic understanding of abstract algebra, elliptic-curve cryptography and number theory would certainly help.
Articles related to cryptographic optimisations:
1. Dividing In Lagrange basis when one of the points is zero - Generalised
2. Verkle Trees - Proof creation/verification notes

EIPs

verkle EIPs:
1. EIP-6800: Ethereum state using a unified verkle tree: This introduces a new Verkle state tree alongside the existing MPT.
2. EIP-4762: Statelessness gas cost changes: Changes the gas schedule to reflect the costs of creating a witness by requiring clients update their database layout to match.
3. EIP-7709: Read BLOCKHASH from storage and update cost: Read the BLOCKHASH (0x40) opcode from the EIP-2935 system contract storage and adjust its gas cost to reflect storage access.
EIPs related to transition of stateDB during fork:
1. EIP-7612: Verkle state transition via an overlay tree: Describes the use of an overlay tree to use the verkle tree structure, while leaving the historical state untouched.
2. EIP-7748: State conversion to Verkle Tree: Describes a state conversion procedure to migrate key-values from the Merkle Patricia Tree to the Verkle Tree.
old EIPs:
1. EIP-7545: Verkle proof verification precompile: Add a precompile to help dapps verify verkle proofs
2. EIP-2935: Serve historical block hashes from state: Store and serve last 8192 block hashes as storage slots of a system contract to allow for stateless execution