
reth-verkle poc

Motivation

Stateless clients are necessary for the decentralisation of Ethereum, for the following reasons:

  1. Ethereum's current state is too large for many nodes to keep in working memory. It requires expensive SSDs for storage and slows down block validation and chain syncing. Stateless clients will allow validators to validate blocks without maintaining the full state, significantly reducing their resource requirements and sync times.
  2. Using verkle tries for statelessness makes client architecture more compatible with a zkEVM future, with additional benefits that have been explored, such as large increases to the gas limit.

The main motivation for creating a reth-verkle PoC is to add one more working implementation of verkle integration in EL clients. This will help with running interop with other clients, support further research, and allow reth to prepare for the Verge. To learn more about why statelessness is important, see: Why it's so important to go stateless

Abstract

This project aims to integrate the rust-verkle cryptographic primitives into reth and enable it to act as a stateless client.
In short:

  • Construct the witness (state_diffs and verkle_proof) during block execution, then serialize it.
  • Propagate this witness along with the block, for stateless validation by other clients.
  • Obtain the serialized witness from other clients and deserialize its contents.
  • Use the deserialized data to compute a partial view of the pre-state trie.
  • Use this pre-state trie to prove, against pre_state_root, the correctness of the pre-state key-value pairs (state_diffs) from the witness that are needed to execute the next block.
  • If this verification succeeds (that is, the provided state_diffs are indeed part of the trie whose root is our trusted pre_state_root), proceed with stateless execution:
    • Create a local copy of state_db from the witness data.
    • Use this state_db for block execution instead of the local chain state.

Goal

The end goal of this project is achieved if reth is able to join the Kaustinen devnet and pass all verkle-execution-spec tests.

Specification

This section describes the structure of the rust-verkle components that will subsequently be used in reth.
The changes to reth will follow the defined specs and the Verkle serialization format in SSZ:

  1. A block/execution witness struct (i.e. the verkle proof required to execute a block statelessly) will be created. It is the SSZ-encoded serialization of the following ExecutionWitness structure:

    class ExecutionWitness(Container):
        state_diff: StateDiff
        verkle_proof: VerkleProof
    
  2. state_diff contains all the pre-state data required for other clients to execute the given block statelessly (essentially the key-value pairs of the verkle trie's leaf nodes). StateDiff definition:

    MAX_STEMS = 2**16
    VERKLE_WIDTH = 256
    
    class SuffixStateDiff(Container):
        suffix: Byte
    
        # Null means not currently present
        current_value: Union[Null, Bytes32]
    
        # Null means value not updated
        new_value: Union[Null, Bytes32]
    
    class StemStateDiff(Container):
        stem: Stem
        # Valid only if list is sorted by suffixes
        suffix_diffs: List[SuffixStateDiff, VERKLE_WIDTH]
    
    # Valid only if list is sorted by stems
    StateDiff = List[StemStateDiff, MAX_STEMS]
    
  3. verkle_proof contains all the data the verifier needs to reconstruct a partial view of the pre-state trie (using commitments, the root node, and the given block values) for the data present in state_diff. This partial view is used to prove that the provided pre-state data is indeed part of the trie whose root is the trusted state root already held by the client. VerkleProof definition:

    BandersnatchGroupElement = Bytes32
    BandersnatchFieldElement = Bytes32
    MAX_COMMITMENTS_PER_STEM = 33 # = 31 for inner nodes + 2 (C1/C2)
    IPA_PROOF_DEPTH = 8 # = log2(VERKLE_WIDTH)
    
    class IpaProof(Container):
        C_L: Vector[BandersnatchGroupElement, IPA_PROOF_DEPTH]
        C_R: Vector[BandersnatchGroupElement, IPA_PROOF_DEPTH]
        final_evaluation: BandersnatchFieldElement
    
    class VerkleProof(Container):
        # [Group A]
        other_stems: List[Bytes32, MAX_STEMS]
        depth_extension_present: List[uint8, MAX_STEMS]
        commitments_by_path: List[BandersnatchGroupElement, MAX_STEMS * MAX_COMMITMENTS_PER_STEM]
        # [Group B]
        D: BandersnatchGroupElement
        ipa_proof: IpaProof
    

    Here, other_stems, depth_extension_present and commitments_by_path are the data used to construct the partial view of the verkle trie, and ipa_proof is the proof used to open the commitments on the path from the provided leaf nodes to the trie root, which proves that the provided data is correct.
    For more details on the changes and terms above, refer to this article by Ignacio: Anatomy of a verkle proof
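
The two "valid only if sorted" rules on StateDiff above can be checked mechanically before any proof verification is attempted. The following is a minimal sketch, assuming plain Rust structs that mirror the SSZ containers, a 31-byte stem, and strictly increasing order (duplicates rejected). The helper name is_canonical is hypothetical and is not part of rust-verkle.

```rust
/// Assumption: a stem is the first 31 bytes of a 32-byte tree key.
pub type Stem = [u8; 31];

pub const MAX_STEMS: usize = 1 << 16;
pub const VERKLE_WIDTH: usize = 256;

#[derive(Debug, Clone, PartialEq)]
pub struct SuffixStateDiff {
    pub suffix: u8,
    /// `None` means the key is not currently present.
    pub current_value: Option<[u8; 32]>,
    /// `None` means the value was not updated.
    pub new_value: Option<[u8; 32]>,
}

#[derive(Debug, Clone, PartialEq)]
pub struct StemStateDiff {
    pub stem: Stem,
    pub suffix_diffs: Vec<SuffixStateDiff>,
}

/// Hypothetical helper: returns true when the diff respects the list limits
/// and both ordering rules (stems ascending, suffixes ascending per stem).
/// Strict ordering is an assumption; it also rejects duplicate entries.
pub fn is_canonical(diff: &[StemStateDiff]) -> bool {
    if diff.len() > MAX_STEMS {
        return false;
    }
    // Arrays compare lexicographically, which matches byte-wise stem order.
    let stems_sorted = diff.windows(2).all(|w| w[0].stem < w[1].stem);
    let suffixes_sorted = diff.iter().all(|s| {
        s.suffix_diffs.len() <= VERKLE_WIDTH
            && s.suffix_diffs.windows(2).all(|w| w[0].suffix < w[1].suffix)
    });
    stems_sorted && suffixes_sorted
}
```

A witness that fails this check could be rejected early, without touching any cryptographic code.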

Implementation plan

1. Data types/helpers

This section gives pseudo-code for the data types and utility/helper functions needed for the verkle migration:

  • trie/verkle.rs:
struct VerkleTrie {
    root: Box<dyn VerkleNode>,
    db: Box<Database>,
    ended: bool,
}

struct ChunkedCode(Vec<u8>);

impl VerkleTrie {
    fn to_dot(&self) -> String;
    
    fn new(root: Box<dyn VerkleNode>, db: Box<Database>, ended: bool) -> Self;
    
    fn flatdb_node_resolver(&self, path: &[u8]) -> Result<Vec<u8>, Error>;
    
    fn insert_migrated_leaves(&mut self, leaves: Vec<LeafNode>) -> Result<(), Error>;
    
    fn get_key(&self, key: &[u8]) -> Vec<u8>;
    
    fn get_storage(&self, addr: Address, key: &[u8]) -> Result<Vec<u8>, Error>;
    
    fn get_with_hashed_key(&self, key: &[u8]) -> Result<Vec<u8>, Error>;
    
    fn get_account(&self, addr: Address) -> Result<Option<StateAccount>, Error>;
    
    fn update_account(&mut self, addr: Address, acc: &StateAccount) -> Result<(), Error>;
    
    fn update_stem(&mut self, key: &[u8], values: Vec<Vec<u8>>) -> Result<(), Error>;
    
    fn update_storage(&mut self, address: Address, key: &[u8], value: &[u8]) -> Result<(), Error>;
    
    fn delete_account(&mut self, addr: Address) -> Result<(), Error>;
    
    fn delete_storage(&mut self, addr: Address, key: &[u8]) -> Result<(), Error>;
    
    fn hash(&self) -> Hash;
    
    fn commit(&mut self, _: bool) -> Result<(Hash, Option<NodeSet>), Error>;
    
    fn node_iterator(&self, start_key: &[u8]) -> Result<Box<dyn NodeIterator>, Error>;
    
    fn prove(&self, key: &[u8], proof_db: &mut dyn KeyValueWriter) -> Result<(), Error>;
    
    fn copy(&self) -> Self;
    
    fn is_verkle(&self) -> bool;
    
    fn set_storage_root_conversion(&mut self, addr: Address, root: Hash);
    
    fn clear_storage_root_conversion(&mut self, addr: Address);
    
    fn update_contract_code(&mut self, addr: Address, code_hash: Hash, code: &[u8]) -> Result<(), Error>;
}

fn prove_and_serialize(
    pretrie: &VerkleTrie,
    posttrie: Option<&VerkleTrie>,
    keys: Vec<Vec<u8>>,
    resolver: impl Fn(&[u8]) -> Result<Vec<u8>, Error>,
) -> Result<(VerkleProof, StateDiff), Error>;

fn deserialize_and_verify_verkle_proof(
    vp: &VerkleProof,
    pre_state_root: &[u8],
    post_state_root: &[u8],
    statediff: StateDiff,
) -> Result<(), Error>;

fn chunkify_code(code: &[u8]) -> ChunkedCode;
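
chunkify_code is only declared above. As a rough illustration, here is a minimal sketch of the code chunking scheme described in EIP-6800 (31 code bytes per chunk, prefixed by one byte that counts how many leading bytes of the chunk are push data), assuming that is the scheme the PoC will follow. The flat Vec<u8> layout of ChunkedCode and the constant names are assumptions.

```rust
const PUSH_OFFSET: u8 = 95;
const PUSH1: u8 = PUSH_OFFSET + 1; // 0x60
const PUSH32: u8 = PUSH_OFFSET + 32; // 0x7f
const CHUNK_BODY: usize = 31;

/// Assumed layout: concatenated 32-byte chunks (1 prefix byte + 31 code bytes).
pub struct ChunkedCode(pub Vec<u8>);

pub fn chunkify_code(code: &[u8]) -> ChunkedCode {
    // Pad the code with zeros to a multiple of 31 bytes.
    let mut padded = code.to_vec();
    let rem = padded.len() % CHUNK_BODY;
    if rem != 0 {
        padded.resize(padded.len() + CHUNK_BODY - rem, 0);
    }

    // pushdata_left[i] = number of push-data bytes remaining at position i
    // (0 when byte i is an opcode). The extra 32 slots absorb a PUSH32 whose
    // data would run past the end of the code.
    let mut pushdata_left = vec![0u8; padded.len() + 32];
    let mut pos = 0;
    while pos < padded.len() {
        let op = padded[pos];
        let n = if (PUSH1..=PUSH32).contains(&op) {
            (op - PUSH_OFFSET) as usize
        } else {
            0
        };
        pos += 1;
        for x in 0..n {
            pushdata_left[pos + x] = (n - x) as u8;
        }
        pos += n; // skip the push data, it is never interpreted as opcodes
    }

    // Emit one 32-byte chunk per 31 bytes of code.
    let mut out = Vec::with_capacity(padded.len() / CHUNK_BODY * 32);
    for (i, body) in padded.chunks(CHUNK_BODY).enumerate() {
        out.push(pushdata_left[i * CHUNK_BODY].min(31));
        out.extend_from_slice(body);
    }
    ChunkedCode(out)
}
```

The prefix byte lets a verifier that only sees one chunk know where the first real opcode starts, so it cannot mistake push data for a JUMPDEST.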

Open question: which parts should live in the alloy crates and be referenced from there, and which should live in the reth crates.

2. Integration of rust-verkle cryptographic primitives in reth.

The following functions will be added:

  • deserialize_and_verify_verkle_proof: constructs a partial view of the pre-state trie for the given state_diffs using verkle_proof, and validates the pre-state data against pre_state_root.

    1. Takes verkle_proof, pre_state_root, post_state_root and state_diffs as inputs.
    2. Deserializes the proof and the state diffs.
    3. Constructs the pre-state trie and checks that the obtained pre-state key-value pairs are valid.
    4. Constructs the post-state trie and verifies its root against the given post_state_root.
    5. Internally calls verify_execution_witness (see here) from rust-verkle.

    note:

    • geth recently moved its implementation of this function, DeserializeAndVerifyVerkleProof, to go-verkle in this pr. After proper testing and implementation I will do the same, and then use the function as an entry point exposed by rust-verkle.
    • Some functionality may need to be added to rust-verkle for the above checks, since currently everything, including pre-state trie construction, is combined in the single function verify_execution_witness.
  • prove_and_serialize: creates the serialized verkle_proof and state_diffs objects.

    1. Receives pre_trie, post_trie and keys.
    2. Calls the multiproof aggregation function from the ipa_multipoint crate of rust-verkle to get the proof.
    3. Calls the serialization utilities from rust-verkle to get the serialized proof and state diffs.
    4. Is called by the execute_state_transitions function.

    note:

    • Almost all of the utility functions required from rust-verkle already exist, but they may need to be combined or split depending on the purpose. I will use go-verkle as a reference for doing this in rust-verkle.
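
To make the state-diff half of prove_and_serialize more concrete, here is a minimal sketch of how touched keys could be grouped by stem and suffix in sorted order. It assumes 32-byte tree keys split into a 31-byte stem and a 1-byte suffix, and stands in for the tries with plain maps. The function name group_state_diff is hypothetical, and treating a key that is missing from the post-state map as "not updated" is an assumption, not rust-verkle behaviour.

```rust
use std::collections::{BTreeMap, HashMap};

type Key = [u8; 32];
type Value = [u8; 32];
/// (current_value, new_value) for one suffix; `None` has the same meaning
/// as `Null` in the SSZ definition.
type SuffixEntry = (Option<Value>, Option<Value>);

/// Hypothetical helper: groups the touched keys by stem, then by suffix.
/// BTreeMap iteration is ordered, so the result already satisfies the
/// "sorted by stems" and "sorted by suffixes" rules.
pub fn group_state_diff(
    keys: &[Key],
    pre: &HashMap<Key, Value>,
    post: &HashMap<Key, Value>,
) -> BTreeMap<[u8; 31], BTreeMap<u8, SuffixEntry>> {
    let mut out: BTreeMap<[u8; 31], BTreeMap<u8, SuffixEntry>> = BTreeMap::new();
    for key in keys {
        let mut stem = [0u8; 31];
        stem.copy_from_slice(&key[..31]);
        let suffix = key[31];

        let current = pre.get(key).copied();
        // Only record a new value when it differs from the current one.
        let new = post.get(key).copied().filter(|v| Some(*v) != current);

        out.entry(stem).or_default().insert(suffix, (current, new));
    }
    out
}
```

Keys that were read but are absent from the pre-state still get an entry with both values set to None, which is what lets a stateless verifier check absence.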

3. Block execution with witness as StateDB

TODO
TL;DR: after receiving the next block and its witness data, execution should switch to the stateless path and use a copy of the witness data as the StateDB instead of a copy of the local chain state.
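
Since this part is still a TODO, the following is only a rough sketch of what a witness-backed state could look like, assuming the pre-state values have already been verified against the trusted pre_state_root. The names WitnessStateDb and WitnessError are hypothetical and do not correspond to any reth or rust-verkle type.

```rust
use std::collections::HashMap;

type Key = [u8; 32];
type Value = [u8; 32];

#[derive(Debug, PartialEq)]
pub enum WitnessError {
    /// The block touched a key that the witness does not cover.
    MissingFromWitness(Key),
}

/// Hypothetical read-only state built from verified witness entries.
pub struct WitnessStateDb {
    // `Some(v)`: key present in the pre-state with value v.
    // `None`: key proven absent from the pre-state.
    pre_state: HashMap<Key, Option<Value>>,
}

impl WitnessStateDb {
    /// Builds the state from (key, current_value) pairs taken from state_diff.
    pub fn from_verified<I>(entries: I) -> Self
    where
        I: IntoIterator<Item = (Key, Option<Value>)>,
    {
        WitnessStateDb { pre_state: entries.into_iter().collect() }
    }

    /// Looks up a pre-state value. A key outside the witness is an error,
    /// not an empty value, because the witness says nothing about it.
    pub fn get(&self, key: &Key) -> Result<Option<Value>, WitnessError> {
        match self.pre_state.get(key) {
            Some(value) => Ok(*value),
            None => Err(WitnessError::MissingFromWitness(*key)),
        }
    }
}
```

The design choice worth noting is the distinction between "proven absent" (Ok(None)) and "not covered by the witness" (an error): a stateless executor should fail the block in the second case rather than silently read a zero value.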

client references:

  1. ethereumJS has implemented a very basic architecture for this. For an overview, see the utility functions in statelessVerkleStateManager.ts and their use in the runBlock function.
  2. geth has not implemented this part yet; it lacks the wiring needed to run a block statelessly.
  3. For nethermind, I have yet to study the state-sync architecture and the verkle implementation. I believe nethermind is leading in this area, and I will update this section once I have gone through it.

4. 5. 6…

TODO
Later changes, such as gas-cost modifications, will be added subsequently.

Testing Plan

TODO
see geth reference gballet-go-etherum/core/state_processor_test.go

Milestones

  1. The first part of the implementation will focus on implementing this part and passing the very basic test_verkle_from_mpt_conversion execution-spec test.
  2. The second part will focus on passing the next iteration of verkle execution-spec tests, which are going to be released within a week (for reference, see this pr). These tests cover the edge cases of EIPs 6800, 4762 and 7709.
  3. The third part will focus on joining the latest devnet at that time (possibly devnet-7, though this may change). Important changes introduced in devnet-7:

References

overview of verkle-tries

After getting the gist of the trie structure, please read this article by Ignacio: Anatomy of a Verkle proof. It gives the understanding needed for the verkle migration in EL clients.

overview of cryptography

EIPs

Implementations

  1. gballet/go-ethereum
  2. ethereumJS
  3. nethermind
  4. rust-verkle