[This document is now deprecated, please continue the discussion in the draft EIP PR]
This document contains a draft version of a renewed binary tree proposal for the Ethereum state. Compared to EIP-3102, it proposes a single balanced tree and pulls in other ideas from EIP-6800 that are agnostic to Verkle (e.g., packing of account data, storage slots, and code chunks).
Hash-based state trees have renewed interest due to 1) potential post-quantum (PQ) concerns, and 2) proving systems advancing at a fast pace, getting closer to making proving the state via SNARKs/STARKs viable.
There are two main open questions about a Binary Tree design:

1. The tree structure.
2. The hash function.

Regarding 1., the ideal design for a spec would be a Sparse Merkle Tree due to its simplicity; there's a doc summarizing previous research on how to deal with its drawbacks. Regarding 2., the choice depends on evolving prover performance and on potential security concerns of the considered options.
This proposed draft decided:

- For 1., not to use a plain SMT, but the stem-based structure described below, which keeps paths short.
- For 2., to use BLAKE3 for now; if Poseidon2 is eventually preferred, the main change would be defining a 32-byte→[Field] encoding without much impact in the rest of the design.

The above aren't final decisions, but they are a good-enough EIP instance to do PoCs without much friction. There's a Python implementation for this EIP here.
This doc is an early draft so that EL clients, L2s, and other interested parties can provide early feedback before creating the EIP and inviting the broader community.
Your EIP feedback is appreciated!
Extra notes:
eip: TODO
title: Ethereum state using a unified binary tree
author: TODO
discussions-to: TODO
status: Draft
type: Standards Track
category: Core
created: TODO
requires: TODO
Introduce a new binary state tree, intended to replace the hexary Patricia trees. The account and storage tries are merged into a single tree with 32-byte keys, which also contains contract code. Account data is broken into independent leaves which are grouped in sets of 256 to provide some locality.
(Draft note: the hash function used in the current draft is not final, other potential candidates are Keccak or Poseidon2. For the latter, an extra section on how to encode 32-bytes->[FiniteField] will be created. Do not assume Blake3 is a final decision.)
Ethereum's long-term goal is to allow blocks to be proved with a validity proof so that chain verification is as simple and fast as possible. One of the most challenging parts of achieving that goal is proving the state of the tree, which is required for EVM execution.
The current MPT design isn't friendly to validity proofs for many reasons, such as using RLP for node encoding, Keccak as the hashing function, being a “tree of trees”, and not including account code as part of the state.
Apart from requiring a state tree design that is friendly for block validity proofs, it’s also important to have fast and small regular Merkle proofs when the amount of state to prove is small. This is useful not only for Ethereum users but also for supporting future protocol needs in a stateless world.
Regarding these regular Merkle proofs in an MPT: since it's a hexary tree, their size is quite big both on average and in worst-case scenarios. Given a tree with 2^32 entries, the expected size for proving a single branch is 15 * 32 * log_16(2^32) = 3840 bytes. From a worst-case block perspective, if 30M gas is used to access a single byte from distinct account codes, then, since code isn't chunkified, we'd need 30M/2400 * (5*480 + 24k) = 330MB.
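As a quick sanity check, the two figures quoted above can be reproduced with a few lines of Python (this is just the arithmetic from the text, not part of the spec):

```python
# Average single-branch proof size in a 2**32-entry hexary MPT:
# 15 siblings per level * 32 bytes per hash * log_16(2**32) levels.
avg_branch_bytes = 15 * 32 * 8
assert avg_branch_bytes == 3840

# Worst-case block: 30M gas spent touching one byte of distinct, un-chunkified
# contract codes; each access costs ~2400 gas and needs ~5 * 480 bytes of
# branch data plus ~24kB of code.
worst_case_bytes = (30_000_000 // 2400) * (5 * 480 + 24_000)
assert worst_case_bytes == 330_000_000  # ~330MB
```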
A binary tree has a good balance between out-of-circuit and in-circuit proving. Since the tree arity is two, regular Merkle proofs shrink (i.e., # siblings * log_arity(N) is smaller for arity 2 than for arity 16). Moreover, we propose switching from Keccak to BLAKE3, which has proven security and is more amenable to modern proving systems.
The keywords “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “NOT RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119 and RFC 8174.
The proposed Binary Tree stores key-value entries where both the key and the value are 32 bytes. The first 31 bytes of the key define the entry's stem, and the last byte is the subindex within that stem. If two keys have the same stem, they live in the same “big branch”, which co-locates groups of 256 keys (i.e., keys sharing the same first 31 bytes).
(More details about the purple/orange part in the "Tree embedding" section)
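To make the stem/subindex split concrete, here is a minimal sketch (not part of the spec) of how a 32-byte key decomposes:

```python
key = bytes.fromhex("ab" * 31 + "07")  # an arbitrary 32-byte key
stem, subindex = key[:31], key[31]
assert len(stem) == 31 and subindex == 0x07
# All 256 keys sharing this 31-byte stem are co-located in the same group.
```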
We can distinguish four node types:

- `InternalNode` has a `left_hash` and a `right_hash` field (black).
- `StemNode` has fields `stem`, `left_hash` and `right_hash` (blue).
- `LeafNode` has a field `value`, which is a 32-byte blob or `empty` (orange).
- `EmptyNode` represents an empty node/sub-tree.

The path to a `StemNode` is defined by the key's first 248 bits (31 bytes), from MSB to LSB. From this node, a subtree of 256 values exists, indexed by the key's last byte (8 bits). A newly created `StemNode` subtree (i.e., 256 values) has all leaves with value `empty`.

The path length isn't 248 bits, but contains the minimal number of `InternalNode`s required given other stems with the same prefix. That is, when walking the 248-bit path to insert a new `StemNode`, you can end on an `EmptyNode` or a `StemNode`. In the former case, you replace it with the new `StemNode`. In the latter, you create as many `InternalNode`s as there are common prefix bits between both stems. Another way to look at this is that there are no extension nodes in paths, but we use the minimal number of `InternalNode`s to place `StemNode`s with common prefixes.
Below is an implementation that describes the tree structure:
from typing import Optional


class StemNode:
    def __init__(self, stem: bytes):
        assert len(stem) == 31, "stem must be 31 bytes"
        self.stem = stem
        self.values: list[Optional[bytes]] = [None] * 256

    def set_value(self, index: int, value: bytes):
        self.values[index] = value


class InternalNode:
    def __init__(self):
        self.left = None
        self.right = None


class BinaryTree:
    def __init__(self):
        self.root = None

    def insert(self, key: bytes, value: bytes):
        assert len(key) == 32, "key must be 32 bytes"
        assert len(value) == 32, "value must be 32 bytes"
        stem = key[:31]
        subindex = key[31]
        if self.root is None:
            self.root = StemNode(stem)
            self.root.set_value(subindex, value)
            return
        self.root = self._insert(self.root, stem, subindex, value, 0)

    def _insert(self, node, stem, subindex, value, depth):
        assert depth < 248, "depth must be less than 248"
        if node is None:
            node = StemNode(stem)
            node.set_value(subindex, value)
            return node
        stem_bits = self._bytes_to_bits(stem)
        if isinstance(node, StemNode):
            # If the stem already exists, update the value.
            if node.stem == stem:
                node.set_value(subindex, value)
                return node
            existing_stem_bits = self._bytes_to_bits(node.stem)
            return self._split_leaf(
                node, stem_bits, existing_stem_bits, subindex, value, depth
            )
        # We're in an internal node, go left or right based on the bit.
        bit = stem_bits[depth]
        if bit == 0:
            node.left = self._insert(node.left, stem, subindex, value, depth + 1)
        else:
            node.right = self._insert(node.right, stem, subindex, value, depth + 1)
        return node

    def _split_leaf(self, leaf, stem_bits, existing_stem_bits, subindex, value, depth):
        # If the stem bits are the same up to this depth, we need to create another
        # internal node and recurse.
        if stem_bits[depth] == existing_stem_bits[depth]:
            new_internal = InternalNode()
            bit = stem_bits[depth]
            if bit == 0:
                new_internal.left = self._split_leaf(
                    leaf, stem_bits, existing_stem_bits, subindex, value, depth + 1
                )
            else:
                new_internal.right = self._split_leaf(
                    leaf, stem_bits, existing_stem_bits, subindex, value, depth + 1
                )
            return new_internal
        else:
            new_internal = InternalNode()
            bit = stem_bits[depth]
            stem = self._bits_to_bytes(stem_bits)
            if bit == 0:
                new_internal.left = StemNode(stem)
                new_internal.left.set_value(subindex, value)
                new_internal.right = leaf
            else:
                new_internal.right = StemNode(stem)
                new_internal.right.set_value(subindex, value)
                new_internal.left = leaf
            return new_internal
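The implementation above relies on `_bytes_to_bits` and `_bits_to_bytes` helpers that aren't shown. A minimal sketch of these `BinaryTree` methods, assuming MSB-first bit ordering (consistent with walking the stem from MSB to LSB), could be:

```python
    def _bytes_to_bits(self, data: bytes) -> list[int]:
        # MSB-first: the first bit returned is the most significant bit of data[0].
        return [(byte >> (7 - i)) & 1 for byte in data for i in range(8)]

    def _bits_to_bytes(self, bits: list[int]) -> bytes:
        # Inverse of _bytes_to_bits; len(bits) must be a multiple of 8.
        out = bytearray()
        for i in range(0, len(bits), 8):
            byte = 0
            for j in range(8):
                byte = (byte << 1) | bits[i + j]
            out.append(byte)
        return bytes(out)
```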
Define `hash(value)` as:

- `hash([0x00] * 64) = [0x00] * 32`
- `hash(value) = blake3(value)`, otherwise.

The `value` parameter length can only be 32 or 64 bytes.
Merkelize each node type as follows:

- `internal_node_hash = hash(left_hash || right_hash)`
- `stem_node_hash = hash(stem || 0x00 || hash(left_hash || right_hash))`
- `leaf_node_hash = hash(value)`
- `empty_node_hash = [0x00] * 32`
Below is an implementation of these rules:
def _hash(self, data):
    if data in (None, b"\x00" * 64):
        return b"\x00" * 32
    assert len(data) == 64 or len(data) == 32, "data must be 32 or 64 bytes"
    return blake3(data).digest()

def merkelize(self):
    def _merkelize(node):
        if node is None:
            return b"\x00" * 32
        if isinstance(node, InternalNode):
            left_hash = _merkelize(node.left)
            right_hash = _merkelize(node.right)
            return self._hash(left_hash + right_hash)
        level = [self._hash(x) for x in node.values]
        while len(level) > 1:
            new_level = []
            for i in range(0, len(level), 2):
                new_level.append(self._hash(level[i] + level[i + 1]))
            level = new_level
        return self._hash(node.stem + b"\0" + level[0])

    return _merkelize(self.root)
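As a usage sketch (not part of the spec text), assuming `_hash` and `merkelize` are attached to `BinaryTree` as methods and that the `blake3` package provides the hash function, computing a root could look like:

```python
from blake3 import blake3  # assumed dependency providing blake3(data).digest()

tree = BinaryTree()
stem = b"\xab" * 31
tree.insert(stem + bytes([0]), (1).to_bytes(32, "little"))          # subindex 0
tree.insert(stem + bytes([1]), (2).to_bytes(32, "little"))          # same stem, subindex 1
tree.insert(b"\xcd" * 31 + bytes([0]), (3).to_bytes(32, "little"))  # different stem
root = tree.merkelize()
assert len(root) == 32
```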
Instead of a two-layer structure like the MPT, we embed all information into a single `key: value` space in the proposed single tree. This section specifies which tree keys store the state's information (account header data, code, storage). The data is co-located in such a way that data that is usually accessed together lives in the same `StemNode`, which reduces branch openings.
The following is a big picture of what we'll continue to describe in the rest of this section:
The account stem (green `account` node) contains the account's basic data, the first 64 storage slots, and the first 128 code chunks. This co-location allows data that is usually accessed together to live in a single stem branch. The rest of the storage slots and code chunks are spread into groups of 256 values across the rest of the tree; this is also convenient since slots/code chunks that are close together share the same stem branch.
| Parameter | Value |
| --- | --- |
| `BASIC_DATA_LEAF_KEY` | 0 |
| `CODE_HASH_LEAF_KEY` | 1 |
| `HEADER_STORAGE_OFFSET` | 64 |
| `CODE_OFFSET` | 128 |
| `STEM_SUBTREE_WIDTH` | 256 |
| `MAIN_STORAGE_OFFSET` | 256**31 |
It’s a required invariant that `STEM_SUBTREE_WIDTH > CODE_OFFSET > HEADER_STORAGE_OFFSET`, and that `HEADER_STORAGE_OFFSET` is greater than the leaf keys. Additionally, `MAIN_STORAGE_OFFSET` must be a power of `STEM_SUBTREE_WIDTH`.
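For reference, the invariants above can be written down as a small check (the constant definitions mirror the table; the check itself is not part of the spec):

```python
BASIC_DATA_LEAF_KEY = 0
CODE_HASH_LEAF_KEY = 1
HEADER_STORAGE_OFFSET = 64
CODE_OFFSET = 128
STEM_SUBTREE_WIDTH = 256
MAIN_STORAGE_OFFSET = 256**31

assert STEM_SUBTREE_WIDTH > CODE_OFFSET > HEADER_STORAGE_OFFSET
assert HEADER_STORAGE_OFFSET > max(BASIC_DATA_LEAF_KEY, CODE_HASH_LEAF_KEY)
assert MAIN_STORAGE_OFFSET == STEM_SUBTREE_WIDTH**31  # a power of STEM_SUBTREE_WIDTH
```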
Note that addresses are always passed around as an `Address32`. To convert existing addresses to `Address32`, prepend them with 12 zero bytes:
def old_style_address_to_address32(address: Address) -> Address32:
    return b'\x00' * 12 + address
These are the positions in the tree at which the header fields of an account are stored.
def tree_hash(inp: bytes) -> bytes32:
    return bytes32(blake3.blake3(inp).digest())

def get_tree_key(address: Address32, tree_index: int, sub_index: int):
    # Assumes STEM_SUBTREE_WIDTH = 256
    return tree_hash(address + tree_index.to_bytes(32, "little"))[:31] + bytes(
        [sub_index]
    )

def get_tree_key_for_basic_data(address: Address32):
    return get_tree_key(address, 0, BASIC_DATA_LEAF_KEY)

def get_tree_key_for_code_hash(address: Address32):
    return get_tree_key(address, 0, CODE_HASH_LEAF_KEY)
An account's `version`, `balance`, `nonce`, and `code_size` fields are packed with big-endian encoding in the value found at `BASIC_DATA_LEAF_KEY`:
| Name | Offset | Size |
| --- | --- | --- |
| `version` | 0 | 1 |
| `code_size` | 5 | 3 |
| `nonce` | 8 | 8 |
| `balance` | 16 | 16 |

Bytes `1..4` are reserved for future use.
The current layout and encoding allow for an extension of `code_size` to 4 bytes without changing the account version. The goal of packing these fields together in a basic-data leaf is to reduce gas costs, since we only need one branch opening compared with the usual 3 or 4 if the fields aren't packed. This also simplifies witness generation.
When any account header field is set, the `version` field is also set to zero. The `codehash` and `code_size` fields are set upon contract or EoA creation.
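To illustrate the layout in the table above, here is a hypothetical pack/unpack helper pair (the function names are not part of the spec):

```python
def pack_basic_data(version: int, code_size: int, nonce: int, balance: int) -> bytes:
    # Byte 0: version, bytes 1..4: reserved (zero), bytes 5..7: code_size,
    # bytes 8..15: nonce, bytes 16..31: balance -- all big-endian.
    return (
        version.to_bytes(1, "big")
        + b"\x00" * 4
        + code_size.to_bytes(3, "big")
        + nonce.to_bytes(8, "big")
        + balance.to_bytes(16, "big")
    )

def unpack_basic_data(value: bytes) -> tuple[int, int, int, int]:
    assert len(value) == 32
    return (
        value[0],
        int.from_bytes(value[5:8], "big"),
        int.from_bytes(value[8:16], "big"),
        int.from_bytes(value[16:32], "big"),
    )
```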
def get_tree_key_for_code_chunk(address: Address32, chunk_id: int):
    return get_tree_key(
        address,
        (CODE_OFFSET + chunk_id) // STEM_SUBTREE_WIDTH,
        (CODE_OFFSET + chunk_id) % STEM_SUBTREE_WIDTH
    )
Chunk `i` stores a 32-byte value, where bytes 1…31 are bytes `i*31 ... (i+1)*31 - 1` of the code (i.e. the i’th 31-byte slice of it), and byte 0 is the number of leading bytes that are part of PUSHDATA (e.g. if part of the code is `... PUSH4 99 98 | 97 96 PUSH1 128 MSTORE ...`, where `|` is the position where a new chunk begins, then the encoding of the latter chunk would begin `2 97 96 PUSH1 128 MSTORE` to reflect that the first 2 bytes are PUSHDATA).
For precision, here is an implementation of code chunkification:
PUSH_OFFSET = 95
PUSH1 = PUSH_OFFSET + 1
PUSH32 = PUSH_OFFSET + 32

def chunkify_code(code: bytes) -> Sequence[bytes32]:
    # Pad to multiple of 31 bytes
    if len(code) % 31 != 0:
        code += b'\x00' * (31 - (len(code) % 31))
    # Figure out how much pushdata there is after+including each byte
    bytes_to_exec_data = [0] * (len(code) + 32)
    pos = 0
    while pos < len(code):
        if PUSH1 <= code[pos] <= PUSH32:
            pushdata_bytes = code[pos] - PUSH_OFFSET
        else:
            pushdata_bytes = 0
        pos += 1
        for x in range(pushdata_bytes):
            bytes_to_exec_data[pos + x] = pushdata_bytes - x
        pos += pushdata_bytes
    # Output chunks
    return [
        bytes([min(bytes_to_exec_data[pos], 31)]) + code[pos: pos+31]
        for pos in range(0, len(code), 31)
    ]
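As a usage sketch reproducing the `PUSH4` example above (assuming the definitions above are in scope), 28 filler `STOP` opcodes place the `PUSH4` data across the first chunk boundary:

```python
code = bytes(28) + bytes([0x63, 99, 98, 97, 96, 0x60, 128, 0x52])  # ...PUSH4 99 98 97 96 PUSH1 128 MSTORE
chunks = chunkify_code(code)
# The second chunk starts with 0x02: its first two bytes (97, 96) are PUSHDATA.
assert chunks[1][0] == 2
assert chunks[1][1:3] == bytes([97, 96])
```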
def get_tree_key_for_storage_slot(address: Address32, storage_key: int):
    if storage_key < (CODE_OFFSET - HEADER_STORAGE_OFFSET):
        pos = HEADER_STORAGE_OFFSET + storage_key
    else:
        pos = MAIN_STORAGE_OFFSET + storage_key
    return get_tree_key(
        address,
        pos // STEM_SUBTREE_WIDTH,
        pos % STEM_SUBTREE_WIDTH
    )
Note that storage slots in the same size-`STEM_SUBTREE_WIDTH` range (i.e. a range of the form `x*STEM_SUBTREE_WIDTH ... (x+1)*STEM_SUBTREE_WIDTH - 1`) are all, except for the `HEADER_STORAGE_OFFSET` special case, part of a single stem.
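For illustration (not part of the spec), assuming the constants, `tree_hash`, and the key-derivation functions above are in scope, a low slot lands in the account's header stem while higher slots land in the main storage area:

```python
address = b"\x00" * 12 + b"\xaa" * 20                     # an Address32
account_stem = get_tree_key_for_basic_data(address)[:31]

header_key = get_tree_key_for_storage_slot(address, 3)     # pos = 64 + 3 = 67
main_key = get_tree_key_for_storage_slot(address, 1000)    # pos = MAIN_STORAGE_OFFSET + 1000

assert header_key[:31] == account_stem   # co-located with the account's basic data
assert main_key[:31] != account_stem     # different stem (with overwhelming probability)
```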
Described in EIP-7612.
Described in EIP-4762.
This EIP defines a new Binary Tree that starts empty. Only new state changes are stored in the tree. The MPT continues to exist but is frozen. This sets the stage for a future hard fork that migrates the MPT data to this Binary Tree (EIP-7748).
The proposal uses a single-layer tree structure with 32-byte keys and values for several reasons:
The proposed design should consider the usual complexity, performance, and efficiency trade-offs both for out-of-circuit implementations (i.e. EL clients) and for in-circuit ones used to generate SNARK proofs.
The tree structure tries to be simple by not having complex branching rules. For example, in contrast with the MPT, we avoid extension nodes injected in the middle of branches. RLP encoding was also removed since it adds unnecessary complexity. Although complexity can be managed more easily in out-of-circuit implementations, it's valuable to keep circuit implementations as simple as possible to avoid proving overhead.
The most important factor in the design is the cryptographic hash function used for merkelization. This hash function should have efficient implementations both out of and in circuits. The proposed BLAKE3 hash function was chosen since it provides a good balance between:

1. out-of-circuit hashing performance,
2. in-circuit proving efficiency, and
3. well-established security.

Poseidon2 is considered to perform well on the first two points, but further research is still needed to gain confidence in its security.
Due to recent progress in the quantum computing field, expert estimates suggest that cryptographically relevant quantum computers could become real in the 2030s. Other alternatives, such as Verkle Trees, introduce a new cryptography stack to the protocol, relying on elliptic curves that aren't post-quantum secure. This makes the current tree proposal attractive, since it only depends on hash functions, which remain safe in this new paradigm.
Moreover, there has been impressive progress in proving systems, which indicates that we could be close to achieving the required performance for creating pre-state & post-state proofs fast enough. This last point is crucial since the main advantage of Verkle Trees was having proofs that are small and fast to generate.
Finally, the current state tree proposal will probably be the final state tree used in the protocol, whereas Verkle Trees would eventually have to be replaced again with a post-quantum secure version.
Binary tries have been chosen primarily because they reduce the witness size. In general, in an N
-element tree with each element having k
children, the average size of a branch is roughly 32 * (k-1) * log(N) / log(k)
plus a few percent for overhead. 32 is the length of a hash; the k-1
refers to the fact that a Merkle proof needs to provide all k-1
sister nodes at each level, and log(N) / log(k)
is an approximation of the number of levels in the tree (e.g. a tree where each node has 5 children, with 625 nodes total, would have depth 4, as 625 = 5**4
and so log(625) / log(5) = 4
).
For any N
, the expression is minimized at k = 2
. Here’s a table of branch lengths for different k
values assuming N = 2**24
:
| `k` (children per node) | Branch length (chunks) | Branch length (bytes) |
| --- | --- | --- |
| 2 | 1 * 24 = 24 | 768 |
| 4 | 3 * 12 = 36 | 1152 |
| 8 | 7 * 8 = 56 | 1792 |
| 16 | 15 * 6 = 90 | 2880 |
Actual branch lengths might be slightly larger than this due to uneven distribution, but the pattern of k=2
remains by far the best.
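The table can be reproduced from the formula above; a minimal sketch:

```python
import math

N = 2**24
for k in (2, 4, 8, 16):
    chunks = (k - 1) * math.log(N) / math.log(k)   # ~number of sibling hashes in a branch
    print(f"k={k:2d}  chunks={chunks:4.0f}  bytes={32 * chunks:5.0f}")
```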
The tree design attempts to be as simple as possible, considering both out-of-circuit and circuit implementations, while satisfying our hashing-throughput constraints for proving.
The proposed design avoids the full 248-bit depth that a plain SMT would have. Since we don't use arithmetic hashes (e.g., Poseidon2), we are required to be aggressive in avoiding hashing load in proving systems, whose throughput is still quite tight on commodity hardware. There are tricks that could mitigate this, but they also add extra complexity to the spec.
Moreover, we could push this further by introducing extension nodes in the middle of paths, as done in the MPT, but this also adds complexity that might not be worth it considering the tree should be quite balanced.
State-expiry strategies such as EIP-7736 could still be applied, requiring a change in the design. One potential solution is adding an `epoch` field to the `StemNode`, as described in the mentioned EIP. Another alternative is to use 247 bits for the stem and have two subtrees: `StemValuesNode`, which would correspond to the current 256 values, and `StemMetaNode`, which is also a 256-value subtree that can be used to store arbitrary stem metadata.
The main backward-compatibility-breaking changes are: (1) gas costs for code-chunk access make some applications less economically viable, and (2) the tree structure change breaks in-EVM proofs of historical state.
(1) can be mitigated by increasing the gas limit while implementing this EIP, reducing the risk that applications will no longer work due to transaction gas usage rising above the block gas limit.
TODO
Needs discussion.
Copyright and related rights waived via CC0.
TODO