code: https://github.com/scroll-tech/mpt-circuit feat/refactor
branch and feat/refactor_nonce
branch.
This draft may serve as a preliminary version of a part of future documentation/spec.
Notice that the document is not final and may change due to potential errors in my understanding of the codebase or due to frequent changes of the codebase itself.
Corrections, suggestions and comments are very welcome at any time.
Thanks to @z2trillion for collaborating on drafting this documentation.
For the first reading, you may skip the italics part annotated as Notes or Discussions, since these are just additional memo. If you are already familiar with the MPT table and the SMTTrace
stuff in zktrie, you can also start reading from the section "MPT proofs as witnesses for the feat/refactor branch"
MPT-table in zkevm-circuits has the following table layout
address | storage_key | proof_type | new_root | old_root | new_value | old_value |
---|---|---|---|---|---|---|
MPTProofType: NonceMod, BalanceMod, KeccakCodeHashExists, PoseidonCodeHashExists, CodeSizeExists, NonExistingAccountProof, StorageMod, NonExistingStorageProof |
The MPT table is constructed following this zkTrie spec.
Each row of the MPT table reflects an
MPTUpdate
key
: AccountStorage
or Account
, each one has its own data fieldsold_value
and new_value
, change in valueold_root
and new_root
, change in trie rootColumns address
and storage_key
indicate the location where changes in account or storage happen.
MPT circuit aims at proving the correctedness of the MPT table described above. This means its constraint system enforces a unique structure of the MPT (zkTrie) as described.
When the MPT is changed due to account or storage updates, MPT circuit must prove this update leads to the correct root change. We encode one step of MPTUpdate
into an SMTTrace
that carries [old_path, new_path] on the trie for account/storage. Here SMT stands for Sparse Merkle Trie (this confirmed by repo author @noel2004).
Both paths are from root to leaf, encoding the change of path in that step. This construction is done inside the zkTrie of the zkevm-circuits.
Note: Refactor logic (for account as example) is handle_new_state (update the trie according to MPTProofType
category) -> trace_account_update (provide old and new paths in the trie based on the update) -> decode_proof_for_mpt_path (determine the path, which must start from root and end at either a leafnode or an empty node in the old/new trie based on the given key).
Note: In zkTrie the update is constructed according to MPTProofType
, and for MPTProofType::AccountDoesNotExist
and MPTProofType::StorageDoesNotExist
the convention here is that the provided account/storage is empty both before and after the update. The proof of non-existence of account before writing into or after deletion from the existing mpt, will be included as separate constraints that correspond to different types of non-existence proof (we call them type 1 and type 2 non-existence, see below).
SMTTrace
consists of the following hierarchy of notions:
SMTTrace
address
: 20-byte hex, the address of the account being updated.account_key
: 32-byte hex, the hash of address.account_path
: SMTPath
, old and new
SMTPath
: It is understood that this path has an order that goes from root node to leaf node.
root
: hashleaf
: SMTNode
path
: Vec<SMTNode>
path_part
: partial keySMTNode
value
: hashsibling
: hashaccount_update
: [Option<AccountData>; 2]
, old and new
AccountData
nonce
balance
code_hash
: keccak code hashpoisedon_code_hash
code_size
None
, if the SMTTrace
is for an update to an account that previously doesn't exist, or the update for MPTProofType::AccountDoesNotExist
.state_path
: Option<[Option<SMTPath>; 2]>
. SMTPath's for storage trie. There are 5 possibilities for None
/Some(...)
here:
None
: the update doesn't touch the account's storage. I.e. the proof type matches MPTProofType::{NonceChanged, BalanceChanged, CodeHashExists, PoseidonCodeHashExists, CodeSizeExists, AccountDoesNotExist}
[None, None]
the storage slot for the update starts and ends with value 0. Either the proof type is MPTProofType::StorageDoesNotExist
or it is MPTProofType::StorageChanged
and the new and old values are both 0.[Some(...), Some(...)]
: the proof type is MPTProofType::StorageChanged
and the new and old values are non-zero.[None, Some(...)]
: the proof type is MPTProofType::StorageChanged
and the old value is 0 and the new value is non-zero.[Some(...), None]
: the proof type is MPTProofType::StorageChanged
and the old value is non-zero and the new value is non-zero.common_state_root
: 32-byte hex, hash. This is optional and is present iff state_path
and state_update
are both None
. (It is not possible for only one of state_path
or state_update
to be None
.)state_key
: optional hash of storage key. None
if the update doesn't touch the account's storage.state_update
: StateData
, old and new, reflecting update on storage
StateData
key
value
hierarchy looks like:
feat/refactor
branchDuring the witness assignment in mpt_update circuit, a sequence of mpt updates with each update
being of type SMTTrace
is provided. Each update
is then turned into a Proof
and then filled into the circuit as witnesses. Each Proof
consists of the following components
Proof
claim
: A Claim
Claim
old_root
new_root
address
kind
: ClaimKind
: corresponding to MPTProofType
address_hash_traces
: vector of traces of hash obtained from AccountPath
with path going from leaf to root. Each vector item is a list with components, that stands for a node in the path at the depth going from high (leaf end) to low (=0, root end). Components are
direction
: bool, this is corresponding to account_key
, left or right that this node is with respect to its father (not sure? but @z2trillion has the same opinion)open_value
: Fr
field element, old hash at this node, i.e. for old pathclose_value
: Fr
field element, new hash at this node, i.e. for new pathsibling
: Fr
field element, sibling node's hashis_open_padding
: bool, true if this node is empty for old pathis_close_padding
: bool, true if this node is empty for new pathleafs
: [LeafNode; 2]
, for old and new path
LeafNode
key
value_hash
old_account_hash_traces
: Vector with item in [[Fr; 3]; 6]
. For non-empty account
[codehash_hi, codehash_lo, h1=hash(codehash_hi, codehash_lo)]
[h1, storage_root, h2=hash(h1, storage_root)]
[nonce, balance, h3=hash(nonce, balance)]
[h3, h2, h4=hash(h3, h2)]
[1, account_key, h5=hash(1, account_key)]
[h4, h5, h6=hash(h4, h5)]
[0,0,0]
[0,0,0]
[0,0,0]
[0,0,0]
[1, leaf.key, h5=hash(1, leaf.key)]
[h5, leaf.value_hash, h6=hash(h5, leaf.value_hash)]
new_account_hash_traces
:same as above, but for the new account.storage_hash_traces
: same as address_hash_traces, but for storagestorage_key_value_hash_traces
: Option<[[[Fr; 3]; 3]; 2]>
, old and new of the following list
[key_high, key_low, h0=hash(key_high, key_low)]
[value_high, value_low, h1=hash(value_high, value_low)]
[h0, h1, hash(h0, h1)]
old
, new
: two Path
corresponding to the old and new account path of SMTTrace
. A Path
data has the following structure (TODO: this part looks wierd, follow-up with it.)
Path
key
:The account_key(address) if the account exists; else: path.leaf.sibling if it's a type 1 non-existence proof; otherwise account_key(address) if it's a type 2 non-existence proofkey_hash
, which is hash(1, key)
(type 0 and type 1), or 0
(type 2)leaf_data_hash
, which is leaf data hash (type 0 and type 1) or None
(type 2)old_account
, new_account
: Option<EthAccount>
. Each EthAccount
is structured as
EthAccount
nonce
codesize
poisedon_codehash
balance
keccak_codehash
Circuit configures constraints for each MPTProofType
MPTProofType
NonceChanged
= 1BalanceChanged
CodeHashExists
: keccak codehash updatedPoseidonCodeHashExists
: poseidon codehash updatedCodeSizeExists
: code size updatedAccountDoesNotExist
: non exist proof for accountAccountDestructed
: TODO: This does not match what is in zkevm-circuits?StorageChanged
StorageDoesNotExist
TODO: The circuit layout columns may have not been finalized and will be constantly updated based on recent pushes to the git repo branch.
old_hash | new_hash | old_value | new_value | proof_type | address | storage_key_rlc | segment_type | path_type | depth | key | other_key | other_key_hash | other_leaf_data_hash | direction | sibling | upper_128_bits |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
node's open hash | node's close hash | node's open value | node's close value | MPTProofType | account address | rlc of storage key (u256 to big endian) | Start, AccountTrie, AccountLeaf0, AccountLeaf1, AccountLeaf2, AccountLeaf3, AccountLeaf4, StorageTrie, StorageLeaf0, StorageLeaf1 | Common, ExtensionOld, ExtensionNew | depth increase from root to leaf, from current row to next row | provided key | match provided key with existing mpt path key, until reaching leafnode, obtain resulting key. For type 1 non-existence proof, this key is different from provided key, for type 2 non-existence proof, this key is same as provided key | hash(1, other_key) if other_key matches provided key, else is 0 | hash of the leaf/empty node pointed by other key | direction must match key | sibling node's hash | most significant 128 bits of address or storage key |
Each circuit row represents the mpt update (from old to new) of a path from root to leaf, at a certain depth (depth of root is 0). So each row represents only one node (virtual node in case it does not exist). The row offset order previous --> current --> next
indicates an increase of depth towards leafnode.
The configuration changes to the MPT is characterized by the following two columns:
(1) PathType
: This characterizes topological structural change. It includes PathType::Start
, PathType::Common
, PathType::ExtensionOld
and PathType::ExtensionNew
;
(2) SegmentType
: This characterizes data field (which form the trie's leafnode hash) change. It includes SegmentType::Start
, SegmentType::AccountTrie
, SegmentType::AccountLeaf0
-SegmentType::AccountLeaf4
, SegmentType::StorageTrie
, SegmentType::StorageLeaf0
-SegmentType::StorageLeaf1
.
In both of the above two column witnesses, Start
is used as boundary marker between updates. This means each circuit row with a Start
indicates a new MPT update.
Operations to the trie can be classified as
(1) modify: an account/storage slot is being changed. The account/storage under this operation is named type 0, i.e., type 0 account/storage are in the MPT. This means that they're non-emtpy, i.e. at least one of their fields is non-zero. Figure below:
(2) insert to/delete from append: an account/storage slot is inserted to the MPT as append, i.e. it forms a new leafnode that previously does not exist even as empty nodes (insert to append), or it is deleted from an appended leaf, i.e., after deletion there will be no node (including empty node) left (delete from append). The account/storage under this operation is named type 1, i.e., type 1 account/storage are empty. Where they could be in the MPT, there is instead a leaf node that maps to another non-empty account/storage. Figure below (for insert case, and delete case just swap old and new):
Notice that the zkTrie adds only one bit of common prefix at each level of its depth. It also applies the optimization that replaces subtrees consisting of exactly one leaf with a single leaf node to reduce the tree height. This means that when inserting/deleting new account as append, it may happen that an extension happens with some intermediate nodes that provide key prefix bits, either for the old path or for the new path. It is constrained that at all the new siblings added are empty nodes.
(3) insert to/delete from fill: insert an account/storage slot to fill in an empty node (insert to fill) as well as delete an account/storage slot from a filled leafnode and leaves with an empty node (delete from fill). The account/storage under this operation is named type 2, i.e., type 2 account/storage are also empty. Where they could be in the MPT, there is instead an empty node. Figure below (for insert case, and delete case just swap old and new):
PathType::Common
refers to the sitation that the old and new path share the same toplogical configuration.
This can correspond to the toloplgical configuration change on the whole path in the modify operation, or the common path (not extended one) of the insert to/delete from append, or the insert to/delete from fill operations.
PathType::ExtensionNew
refers to the sitation that the new path extends the old path in its toplogical configuration.
This can correspond to the extended part of the path in insert to append, insert to fill operations.
PathType::ExtensionOld
refers to the sitation that the old path extends the new path in its toplogical configuration.
This can correspond to the extended part of the path in delete from append, delete from fill operations.
Each circuit row will be assigned a PathType at witness generation. Along a path (old/new) provided by the Proof
, each node will correspond to a row in the circuit. So PathType
can change from Common
to ExtensionOld(New)
when hitting the extended part of the trie. This also includes the case when a type 2 account/storage is inserted/deleted, where the empty node pointed by the key will be extended to a leafnode.
In fact, if we do not compress subtrees with empty nodes as in the zkTrie spec, then we can combine the two subcases in account addition case (or account deletion case). Here the previous figure for PathType::ExtensionNew
in account addition case becomes the following
However, using the empty-node compression in the current zkTrie spec will result in extension and we have to work with various PathTypes. This is a trade-off, though from the above agument based on whether old path hits a leaf or an empty node, we know that we exhausted all path types.
These cases of non-existence proofs are related to non-existence before writing into or after deletion from existing mpt. They can overlap with MPTProofType::AccountDoesNotExist
or MPTProofType::StorageDoesNotExist
, but they also serve for other MPTProofType
during which we insert or delete leafnodes.
There are 2 cases to constrain based on the path directed by the provided non-existing key (coming from hash of account address):
Type 1 non-existence proof (insert to append/delete from append): the path ended at a leaf node. Illustration figure shown below:
In this case, due to our construction of the old and new paths of SMTTrace
, the old path (when inserting)/new path (when deleting) must be directed to this leaf node. The prefix key provided by the old/new path must end at a bit position before the last bit of the leaf key that is to be proved non-exist. So we constrain that the non-existing account/storage must have its key that is not equal to the key at this leaf node. Circuit columns other_key
, other_key_hash
, other_leafnode_hash
and an IsEqualGadget key_equals_other_key
are used to provide witness to these constraints and to constrain.
Type 2 non-existence proof (insert to fill/delete from fill): the path ended at an empty node. Illustration figure shown below:
In this case, due to our construction of the old and new paths of SMTTrace
, the old path (when inserting)/new path (when deleting) must be directed to this empty node. So we constrain the emptiness of these nodes. Circuit provides two IsZeroGadgets old_hash_is_zero
andnew_hash_is_zero
to constrain this case.
According to the zkTrie spec, the leafnode hash is calculated by the formula
leafNodeHash = h(h(1, nodeKey), valueHash)
For account, the formula for valueHash
is
So we introduce account SegmentType
as shown in the following figure
For storage, the formula for valueHash
is
valueHash = h(storageValue[0:16], storageValue[16:32])
, so we have the following figure
The above expansions are beyond what the original trie defined in zkTrie spec can have, where the expansion of the trie will only be up to leafnodes that are classified into AccountTrie
or StorageTrie
, i.e. trie-type segments. The leaf-type segments (non-trie segments) are witnesses generated inside the circuit (parsed from SMTTrace
), and the trie-like structure shown above are just imaginary/virtual which only lives in the circuit. So such common constraints as key is 0 and depth is 0 for non-trie segments must be enforced in the circuit constraint system.
The depth of such expansion at witness generation depends on MPTProofType, which points to a specific data field at which level the expansion needs done.
The PathType
used inside the circuit is also corresponding to this expanded trie including leaf-type segments (non-trie segments). For example, in the case of insert to fill operation, PathType
changes from Common
to ExtensionNew
at the empty node that is filled by a new leaf, and the path continues to the specific account/storage field. So constraints for PathType::ExtensionNew(Old)
must discuss separately for trie and non-trie segment cases.
for every row:
SegmentType==AccountTrie
or StorageTrie
(is_trie
is true):
key
at depth
matches direction
depth_curr==depth_prev+1
PathType::Common
, other_key
at depth
matches direction
SegmentType
is any other (non-trie) type (is_trie
is false):
key==0
depth==0
SegmentType::AccountLeaf0
:
key==poisedon(address_high, address_low)
(see here)sibling = poisedon(1, key)
on rows with PathType::Common
:
old_hash_previous==poisedon_hash(old_hash_left_child, old_hash_right_child)
new_hash_previous==poisedon_hash(new_hash_left_child, new_hash_right_child)
PathType::ExtensionNew
and SegmentType::AccountLeaf0
, then current row has old_hash==0
PathType::ExtensionOld
and SegmentType::AccountLeaf0
, then new_hash==0
PathType
is Start
and current SegmentType
is AccountLeaf0
, then this is related to type 1 non-existence proof for empty account/storage, so we impose constraints according to that case. Notice that the poisedon hash constraints listed at the beginning of this section cannot count for this case. Note: Seems if done in this way then it overlaps with constraints for those non-existence MPTProofType?PathType
is Start
and current SegmentType
is AccountTrie
, then this is related to type 2 non-existence proof for empty account/storage, so we impose constraints according to that case.on rows with PathType::ExtensionNew
:
old_value==0
old_hash
keep unchanged on the extended pathnew_hash=poisedon_hash(leftchild, rightchild)
is_trie==true
)
SegmentType
is still trie type (is_final_trie_segment==false
)
SegmentType
is non-trie (leaf) type (is_final_trie_segment==true
):
SegmentType
is AccountLeaf0
or StorageLeaf0
other_key_hash=poisedon(1, other_key)
key!=other_key
, ExtensionNew corresponds to type 1 non-existence for insertionold_hash==0
, ExtensionNew corresponds to type 2 non-existence for insertionkey!=other_key
, constrain that old_hash_prev==poisedon(other_key_hash, other_leaf_data_hash)
The constraints for this case is just symmetric with respect to PathType::ExtensionNew
by swapping old and new.
on rows with MPTProofType::NonceChanged
:
SegmentType
transitions:
Start -> [Start, AccountTrie, AccountLeaf0]
AccountTrie -> [AccountTrie, AccountLeaf0, Start]
AccountLeaf0 -> [Start, AccountLeaf1]
AccountLeaf1 -> AccountLeaf2
AccountLeaf2 -> AccountLeaf3
AccountLeaf3 -> Start
SegumentType
for the particular row
AccountLeaf0
direction==1
AccountLeaf1
direction==0
AccountLeaf2
direction==0
AccountLeaf3
direction==0
PathType::Common
:
value
(=nonce
) is 8 bytes, old and newcode_size
(=(hash
-value
)hash
is now nonce||codesize||0
from LSB to MSB and nonce
is 8 bytes, codesize
is 8 bytes and 0
is 16 bytes) is 8 bytes, old and newcode_size
does not change between old and newPathType::ExtensionNew
:
value
(=nonce
) is 8 bytes for newcode_size
is 0 for newPathType::ExtensionOld
:
value
(=nonce
) is 8 bytes for oldcode_size
is 0 for oldAccountLeaf4, StorageTrie, StorageLeaf0, StorageLeaf1
on rows with MPTProofType::CodesizeExists
:
MPTProofType::NonceChanged
case. The only difference is that instead of computing codesize
from nonce
and hash
, compute nonce
from codesize
and hash
and then constrain.on rows with MPTProofType::BalanceChanged
:
MPTProofType::NonceChanged
case. The only difference is that
AccountLeaf0-4
becomes [1, 0, 0, 1]
AccountLeaf3
, constrain that old_hash==old_balance (old_value)
for PathType::{Common, ExtensionOld}
and new_hash==new_balance (new_value)
for PathType::{Common, ExtensionNew}
on rows with MPTProofType::PoisedonCodehashExists
:
MPTProofType::NonceChanged
case. The only difference is that
AccountLeaf0-4
becomes [1,1]
AccountLeaf1
, constrain that old_hash==poisedon_code_hash (old_value)
for PathType::{Common, ExtensionOld}
and new_hash==poisedon_code_hash (new_value)
for PathType::{Common, ExtensionNew}
on rows with MPTProofType::CodehashExists
:
Start
, AccountTrie
AccountLeaf0
direction==1
AccountLeaf1
direction==0
AccountLeaf2
direction==1
AccountLeaf3
direction==1
old_hash==poisedon(old_keccak_hi, old_keccak_lo)
new_hash==poisedon(new_keccak_hi, new_keccak_lo)
AccountLeaf4
, StorageTrie
, StorageLeaf0
, StorageLeaf1
In this case, the non-existence account to be proved shall have 0 value both before and after the MPT update. But they still correspond to type 1 and type 2 non-existence based on the account key. On rows with MPTProofType::AccountDoesNotExist
:
old_value==0
, new_value==0
direction==0
SegmentType==AccountLeaf0
, current PathType==Common
, next PathType==Start
key != other_key
other_key_hash = poisedon(1, other_key)
SegmentType==AccountTrie
, current PathType==Common
, next PathType==Start
old_hash
and new_hash
are zeroon rows with MPTProofType::StorageChanged
:
Start
, AccountTrie
,
AccountLeaf0
direction==1
AccountLeaf1
direction==0
AccountLeaf2
direction==1
AccountLeaf3
direction==0
StorageTrie
StorageLeaf0
direction==1
sibling==poisedon(1, key)
key==poisedon(storage_key_hi, storage_key_lo)
(see here)storage_key_hi (lo)
and storage_key
old_hash==poisedon(old_value_hi, old_value_lo)
new_hash==poisedon(new_value_hi, new_value_lo)
old(new)_value_hi(lo)
and old(new)_value
AccountLeaf4
, StorageLeaf1
In this case, the rationale is similar to MPTProofType::AccountDoesNotExist
, although there are small differences such as storage read must be on an existing account.
On rows with MPTProofType::StorageDoesNotExist
:
old_value==new_value==0
PathType
is Start
, which means this is the last segment for the current mpt update proof
old_hash
key!=other_key
hash==0
other_key_hash=poisedon(1, other_key)
old_hash == new_hash == poisedon(other_key_hash, other_leaf_data_hash)
AccountLeaf0
direction==1
AccountLeaf1
direction==0
AccountLeaf2
direction==1
AccountLeaf3
direction==0
PathType==Common
, because storage read must be on an existing accountkey_hi
(16 bytes), key_lo
(16 bytes) and key==poisedon(key_hi, key_lo)
storage_key_rlc
as RLC of RLC(key_hi)
and RLC(key_lo)