code: https://github.com/scroll-tech/mpt-circuit v0.4
branch.
Merkle Patricia Tree is one of the key data structures used in Ethereum's storage layer. In our zkevm-circuits, we modify the original MPT into zkTrie, which is essentially a sparse binary Merkle Patricia Trie. A detailed description of zkTrie can be found in this zkTrie spec. In the below, to align with the naming in the codebase, we will use the terms "MPT" or "trie" to refer to the zkTrie (instead of the originally defined MPT as in Ethereum).
In zkevm-circuits, we use MPT table to track step by step the state transitions from MPT operations. The MPT table has the following table layout
q_enable | address | storage_key | proof_type | new_root | old_root | new_value | old_value |
---|---|---|---|---|---|---|---|
actually the RLC of storage_key in little-endian bytes |
MPTProofType |
In the above, each row of the MPT table reflects an update to the MPT (MPTUpdate). The columns address
and storage_key
indicate the location where changes in account or storage happen. The change in value for this update are recorded in old_value
and new_value
, and the corresponding change of the trie root are recorded in old_root
and new_root
.
For each row's corresponding MPT update, we will prove its correctness in the MPT circuit that we describe below. To facilitate this, we record the MPTProofType in the above table, which corresponds to the following proof types:
MPTProofType
NonceMod
: Nonce updatedBalanceMod
: Balance updatedKeccakCodeHashExists
: Keccak Code hash existsPoseidonCodeHashExists
: Poseidon Code hash existsCodeSizeExists
: Code size existsNonExistingAccountProof
: Account does not existStorageMod
: Storage updatedNonExistingStorageProof
: Storage does not existSMTTrace
MPT circuit aims at proving the correctedness of the MPT table described above. This means its constraint system enforces a unique change of MPT for each of the MPT's updates as recorded in the MPT table. In particular, when the MPT is changed due to account or storage updates, MPT circuit must prove this update leads to the correct root change.
SMTTrace
(SMT = Sparse Merkle Trie) is the witness data structure constructed inside the zkTrie of the zkevm-circuits to encode one update step of the MPT. SMTTrace
carries the old and new paths on the trie before and after the change to either account or storage, with both paths running from root to leaf. Each path is encoded as an SMTPath
, which consists of the root hash, the leafnode and all intermediate nodes along that path from root to leaf. Each node's value hash and its sibling's value hash are recorded. SMTTrace
also records the address and storage key at which location the change is made. Furthermore, it carries information about the account or storage updates, such as change of nonce, balance, code hash, storage value hash, etc. .
During the witness assignment to the MPT Circuit, a sequence of mpt updates is provided where each update is in the form of an SMTTrace
. We construct an intermediate layer that turns each MPT update as SMTTrace
into a Proof
that will be filled into the circuit as witnesses. At a high level, the flow chart for producing witnesses that are filled into MPT Circuit looks as follows:
In more detail, each Proof
consists of the following components
Proof
claim
: A Claim
carries basic information related to this MPT update such as root and addresses as well as proof type.
Claim
old_root
new_root
address
kind
: ClaimKind
, corresponding to MPTProofType
address_hash_traces
: Vector of traces of account hash with path going from leaf to root. Each vector item is a list with components, that stands for a node in the path at the depth going from high (leaf end) to low (=0, root end). Components are
direction
: bool, this is corresponding to account_key
, left or right that this node is with respect to its fatheropen_value
: Fr
field element, old hash at this node, i.e. for old pathclose_value
: Fr
field element, new hash at this node, i.e. for new pathsibling
: Fr
field element, sibling node's hashis_open_padding
: bool, true if this node is empty for old pathis_close_padding
: bool, true if this node is empty for new pathleafs
: [LeafNode; 2]
, for old and new path
LeafNode
key
value_hash
old_account_hash_traces
: Vector with item in [[Fr; 3]; 6]
. For non-empty account
[codehash_hi, codehash_lo, h1=hash(codehash_hi, codehash_lo)]
[h1, storage_root, h2=hash(h1, storage_root)]
[nonce, balance, h3=hash(nonce, balance)]
[h3, h2, h4=hash(h3, h2)]
[1, account_key, h5=hash(1, account_key)]
[h4, h5, h6=hash(h4, h5)]
[0,0,0]
[0,0,0]
[0,0,0]
[0,0,0]
[1, leaf.key, h5=hash(1, leaf.key)]
[h5, leaf.value_hash, h6=hash(h5, leaf.value_hash)]
new_account_hash_traces
:same as above, but for the new account.storage
: a StorageProof
StorageProof
Root
: when not proving a storage update, so we only need the storage root.Update
: when proving a storage update
key
trie_rows
: vector of TrieRow
TrieRow
old_leaf
: old StorageLeafnew_leaf
: new StorageLeafold
, new
: two Path
corresponding to the old and new account path of SMTTrace
. A Path
data has the following structure:
Path
key
:The account_key(address) if the account exists; else: path.leaf.sibling if it's a type 1 non-existence proof; otherwise account_key(address) if it's a type 2 non-existence proofkey_hash
, which is hash(1, key)
(type 0 and type 1), or 0
(type 2)leaf_data_hash
, which is leaf data hash (type 0 and type 1) or None
(type 2)old_account
, new_account
: Option<EthAccount>
. Each EthAccount
is structured as
EthAccount
nonce
codesize
poisedon_codehash
balance
keccak_codehash
The MPTProofType
inside MPT Circuit is in 1-1 correspondence with that in zkevm-circuits, though with minor changes of names. They are given by
MPTProofType
NonceChanged
= 1, meaning that nonce is always changingBalanceChanged
: account balance updatedCodeHashExists
: keccak codehash updatedPoseidonCodeHashExists
: poseidon codehash updatedCodeSizeExists
: code size updatedAccountDoesNotExist
: non existence proof for accountStorageChanged
: storage updatedStorageDoesNotExist
: non existence proof for storageold_hash | new_hash | old_value | new_value | proof_type | address | storage_key_rlc | segment_type | path_type | depth | key | other_key | other_key_hash | other_leaf_data_hash | direction | sibling | upper_128_bits |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
node's open hash | node's close hash | node's open value | node's close value | MPTProofType | account address | rlc of storage key (u256 to big endian) | Start, AccountTrie, AccountLeaf0, AccountLeaf1, AccountLeaf2, AccountLeaf3, AccountLeaf4, StorageTrie, StorageLeaf0, StorageLeaf1 | Common, ExtensionOld, ExtensionNew | depth increase from root to leaf, from current row to next row | provided key | match provided key with existing mpt path key, until reaching leafnode, obtain resulting key. For type 1 non-existence proof, this key is different from provided key, for type 2 non-existence proof, this key is same as provided key | hash(1, other_key) if other_key matches provided key, else is 0 | hash of the leaf/empty node pointed by other key | direction must match key | sibling node's hash | most significant 128 bits of address or storage key |
Each circuit row represents the mpt update (from old to new) of a path from root to leaf, at a certain depth (depth of root is 0). So each row represents only one node (virtual node in case it does not exist). The row offset order previous --> current --> next
indicates an increase of depth towards leafnode.
The configuration changes to the MPT is characterized by the following two columns:
(1) PathType
: This characterizes topological structural change. It includes PathType::Start
, PathType::Common
, PathType::ExtensionOld
and PathType::ExtensionNew
;
(2) SegmentType
: This characterizes data field (which form the trie's leafnode hash) change. It includes SegmentType::Start
, SegmentType::AccountTrie
, SegmentType::AccountLeaf0
-SegmentType::AccountLeaf4
, SegmentType::StorageTrie
, SegmentType::StorageLeaf0
-SegmentType::StorageLeaf1
.
In both of the above two column witnesses, Start
is used as boundary marker between updates. This means each circuit row with a Start
indicates a new MPT update.
Operations to the trie can be classified as
(1) modify: an account/storage slot is being changed. The account/storage under this operation is named type 0, i.e., type 0 account/storage are in the MPT. This means that they're non-empty, i.e. at least one of their fields is non-zero. Figure below:
(2) insert to/delete from append: an account/storage slot is inserted to the MPT as append, i.e. it forms a new leafnode that previously does not exist even as empty nodes (insert to append), or it is deleted from an appended leaf, i.e., after deletion there will be no node (including empty node) left (delete from append). The account/storage under this operation is named type 1, i.e., type 1 account/storage are empty. Where they could be in the MPT, there is instead a leaf node that maps to another non-empty account/storage. Figure below (for insert case, and delete case just swap old and new):
Notice that the zkTrie adds only one bit of common prefix at each level of its depth. It also applies the optimization that replaces subtrees consisting of exactly one leaf with a single leaf node to reduce the tree height. This means that when we insert to append (or delete from append), it may happen that an extension happens with some intermediate nodes that provide key prefix bits, either for the old path or for the new path. It is constrained that at all the new siblings added are empty nodes.
(3) insert to/delete from fill: insert an account/storage slot to fill in an empty node (insert to fill) as well as delete an account/storage slot from a filled leafnode and leave with an empty node (delete from fill). The account/storage under this operation is named type 2, i.e., type 2 account/storage are also empty. Where they could be in the MPT, there is instead an empty node. Figure below (for insert case, and delete case just swap old and new):
PathType::Common
refers to the sitation that the old and new path share the same topological configuration.
This can correspond to the topological configuration change on the whole path in the modify operation, or the common path (not extended one) of the insert to/delete from append, or the insert to/delete from fill operations.
PathType::ExtensionNew
refers to the sitation that the new path extends the old path in its toplogical configuration.
This can correspond to the extended part of the path in insert to append, insert to fill operations.
PathType::ExtensionOld
refers to the sitation that the old path extends the new path in its toplogical configuration.
This can correspond to the extended part of the path in delete from append, delete from fill operations.
Each circuit row will be assigned a PathType at witness generation. Along a path (old/new) provided by the Proof
, each node will correspond to a row in the circuit. So PathType
can change from Common
to ExtensionOld(New)
when hitting the extended part of the trie. This also includes the case when a type 2 account/storage is inserted/deleted, where the empty node pointed to by the key will extend to a leafnode.
These cases of non-existence proofs are related to non-existence before writing into or after deletion from existing mpt. They can overlap with MPTProofType::AccountDoesNotExist
or MPTProofType::StorageDoesNotExist
, but they also serve for other MPTProofType
during which we insert or delete leafnodes.
There are 2 cases to constrain based on the path directed by the provided non-existing key (coming from hash of account address):
Type 1 non-existence proof (insert to append/delete from append): the path ended at a leaf node. Illustration figure shown below:
In this case, due to our construction of the old and new paths of SMTTrace
, the old path (when inserting)/new path (when deleting) must be directed to this leaf node. The prefix key provided by the old/new path must end at a bit position before the last bit of the leaf key that is to be proved non-exist. So we constrain that the non-existing account/storage must have its key that is not equal to the key at this leaf node. Circuit columns other_key
, other_key_hash
, other_leafnode_hash
and an IsEqualGadget key_equals_other_key
are used to provide witness to these constraints and to constrain.
Type 2 non-existence proof (insert to fill/delete from fill): the path ended at an empty node. Illustration figure shown below:
In this case, due to our construction of the old and new paths of SMTTrace
, the old path (when inserting)/new path (when deleting) must be directed to this empty node. So we constrain the emptiness of these nodes. Circuit provides two IsZeroGadgets old_hash_is_zero
andnew_hash_is_zero
to constrain this case.
According to the zkTrie spec, the leafnode hash is calculated by the formula
leafNodeHash = h(h(1, nodeKey), valueHash)
For account, the formula for valueHash
is
So we introduce account SegmentType
as shown in the following figure
For storage, the formula for valueHash
is
valueHash = h(storageValue[0:16], storageValue[16:32])
, so we have the following figure
The above expansions are beyond what the original trie defined in zkTrie spec can have, where the expansion of the trie will only be up to leafnodes that are classified into AccountTrie
or StorageTrie
, i.e. trie-type segments. The leaf-type segments (non-trie segments) are witnesses generated inside the circuit (parsed from SMTTrace
), and the trie-like structure shown above are just imaginary/virtual which only lives in the circuit. So such common constraints as key is 0 and depth is 0 for non-trie segments must be enforced in the circuit constraint system.
The depth of such expansion at witness generation depends on MPTProofType, which points to a specific data field at which level the expansion needs done.
The PathType
used inside the circuit is also corresponding to this expanded trie including leaf-type segments (non-trie segments). For example, in the case of insert to fill operation, PathType
changes from Common
to ExtensionNew
at the empty node that is filled by a new leaf, and the path continues to the specific account/storage field. So constraints for PathType::ExtensionNew(Old)
must discuss separately for trie and non-trie segment cases.
for every row:
SegmentType==AccountTrie
or StorageTrie
(is_trie
is true):
key
at depth
matches direction
depth_curr==depth_prev+1
PathType::Common
, other_key
at depth
matches direction
SegmentType
is any other (non-trie) type (is_trie
is false):
key==0
depth==0
SegmentType::AccountLeaf0
:
key==poisedon(address_high, address_low)
(see here)sibling = poisedon(1, key)
on rows with PathType::Common
:
old_hash_previous==poisedon_hash(old_hash_left_child, old_hash_right_child)
new_hash_previous==poisedon_hash(new_hash_left_child, new_hash_right_child)
PathType::ExtensionNew
and SegmentType::AccountLeaf0
, then current row has old_hash==0
PathType::ExtensionOld
and SegmentType::AccountLeaf0
, then new_hash==0
PathType
is Start
and current SegmentType
is AccountLeaf0
, then this is related to type 1 non-existence proof for empty account/storage, so we impose constraints according to that case. Notice that the poisedon hash constraints listed at the beginning of this section cannot count for this case. Note: Seems if done in this way then it overlaps with constraints for those non-existence MPTProofType?PathType
is Start
and current SegmentType
is AccountTrie
, then this is related to type 2 non-existence proof for empty account/storage, so we impose constraints according to that case.on rows with PathType::ExtensionNew
:
old_value==0
old_hash
keep unchanged on the extended pathnew_hash=poisedon_hash(leftchild, rightchild)
is_trie==true
)
SegmentType
is still trie type (is_final_trie_segment==false
)
SegmentType
is non-trie (leaf) type (is_final_trie_segment==true
):
SegmentType
is AccountLeaf0
or StorageLeaf0
other_key_hash=poisedon(1, other_key)
key!=other_key
, ExtensionNew corresponds to type 1 non-existence for insertionold_hash==0
, ExtensionNew corresponds to type 2 non-existence for insertionkey!=other_key
, constrain that old_hash_prev==poisedon(other_key_hash, other_leaf_data_hash)
The constraints for this case is just symmetric with respect to PathType::ExtensionNew
by swapping old and new.
on rows with MPTProofType::NonceChanged
:
SegmentType
transitions:
Start -> [Start, AccountTrie, AccountLeaf0]
AccountTrie -> [AccountTrie, AccountLeaf0, Start]
AccountLeaf0 -> [Start, AccountLeaf1]
AccountLeaf1 -> AccountLeaf2
AccountLeaf2 -> AccountLeaf3
AccountLeaf3 -> Start
SegumentType
for the particular row
AccountLeaf0
direction==1
AccountLeaf1
direction==0
AccountLeaf2
direction==0
AccountLeaf3
direction==0
PathType::Common
:
value
(=nonce
) is 8 bytes, old and newcode_size
(=(hash
-value
)hash
is now nonce||codesize||0
from LSB to MSB and nonce
is 8 bytes, codesize
is 8 bytes and 0
is 16 bytes) is 8 bytes, old and newcode_size
does not change between old and newPathType::ExtensionNew
:
value
(=nonce
) is 8 bytes for newcode_size
is 0 for newPathType::ExtensionOld
:
value
(=nonce
) is 8 bytes for oldcode_size
is 0 for oldAccountLeaf4, StorageTrie, StorageLeaf0, StorageLeaf1
on rows with MPTProofType::CodesizeExists
:
MPTProofType::NonceChanged
case. The only difference is that instead of computing codesize
from nonce
and hash
, compute nonce
from codesize
and hash
and then constrain.on rows with MPTProofType::BalanceChanged
:
MPTProofType::NonceChanged
case. The only difference is that
AccountLeaf0-4
becomes [1, 0, 0, 1]
AccountLeaf3
, constrain that old_hash==old_balance (old_value)
for PathType::{Common, ExtensionOld}
and new_hash==new_balance (new_value)
for PathType::{Common, ExtensionNew}
on rows with MPTProofType::PoisedonCodehashExists
:
MPTProofType::NonceChanged
case. The only difference is that
AccountLeaf0-4
becomes [1,1]
AccountLeaf1
, constrain that old_hash==poisedon_code_hash (old_value)
for PathType::{Common, ExtensionOld}
and new_hash==poisedon_code_hash (new_value)
for PathType::{Common, ExtensionNew}
on rows with MPTProofType::CodehashExists
:
Start
, AccountTrie
AccountLeaf0
direction==1
AccountLeaf1
direction==0
AccountLeaf2
direction==1
AccountLeaf3
direction==1
old_hash==poisedon(old_keccak_hi, old_keccak_lo)
new_hash==poisedon(new_keccak_hi, new_keccak_lo)
AccountLeaf4
, StorageTrie
, StorageLeaf0
, StorageLeaf1
In this case, the non-existence account to be proved shall have 0 value both before and after the MPT update. But they still correspond to type 1 and type 2 non-existence based on the account key. On rows with MPTProofType::AccountDoesNotExist
:
old_value==0
, new_value==0
direction==0
SegmentType==AccountLeaf0
, current PathType==Common
, next PathType==Start
key != other_key
other_key_hash = poisedon(1, other_key)
SegmentType==AccountTrie
, current PathType==Common
, next PathType==Start
old_hash
and new_hash
are zeroon rows with MPTProofType::StorageChanged
:
Start
, AccountTrie
,
AccountLeaf0
direction==1
AccountLeaf1
direction==0
AccountLeaf2
direction==1
AccountLeaf3
direction==0
StorageTrie
StorageLeaf0
direction==1
sibling==poisedon(1, key)
key==poisedon(storage_key_hi, storage_key_lo)
(see here)storage_key_hi (lo)
and storage_key
old_hash==poisedon(old_value_hi, old_value_lo)
new_hash==poisedon(new_value_hi, new_value_lo)
old(new)_value_hi(lo)
and old(new)_value
AccountLeaf4
, StorageLeaf1
In this case, the rationale is similar to MPTProofType::AccountDoesNotExist
, although there are small differences such as storage read must be on an existing account.
On rows with MPTProofType::StorageDoesNotExist
:
old_value==new_value==0
PathType
is Start
, which means this is the last segment for the current mpt update proof
old_hash
key!=other_key
hash==0
other_key_hash=poisedon(1, other_key)
old_hash == new_hash == poisedon(other_key_hash, other_leaf_data_hash)
AccountLeaf0
direction==1
AccountLeaf1
direction==0
AccountLeaf2
direction==1
AccountLeaf3
direction==0
PathType==Common
, because storage read must be on an existing accountkey_hi
(16 bytes), key_lo
(16 bytes) and key==poisedon(key_hi, key_lo)
storage_key_rlc
as RLC of RLC(key_hi)
and RLC(key_lo)