MPT circuit

Let's say we have a trie which has one branch under which there are two branches. Each of these two branches has 16 leaves.

The first branch has: leaf1a, , leaf16a.
The second branch has: leaf1b, , leaf16b.

We make two changes:

  • leaf2a -> leaf2aa
  • leaf1b -> leaf1bb

The root hash at the beginning is H1, after first change it's H2, after second change it's H3.

The public inputs are H1 and H3. We need to prove we know the witnesses for the transition H1 -> H3.

The layout is as follows:

  • a block has 17 rows - 16 for leaves and 1 for the position in the parent branch (this needs generalization for extension nodes)
  • the first column says what kind of block it is: for every second block (odd blocks, except for the first one) we need to prove that the root hash is the same as in the previous block, and also for every second block (even blocks) we need to prove the path is the same
  • the second column contains leaves, the 17-th row in the second column is the index of the leaf that is being changed (if even block, the other leaves need to be the same as in the previous block)
  • the third column contains hashes of leaves (constraint: hash of the second column = third column)
  • the third column also has an information (17th-row of block) where in the parent branch is hash of these columns (hash[h1a, , h16a])
  • the fourth column contains hashes of the parent branch ([_, _, f1a, f1b, _]), the constraint here is that hash([h1a,h2a,,h16a]) = cols[2] (which is f1a), the position 2 is given in the 17-th row (here it might be a degree problem as we have 16 possible values for position)
  • the fifth column contains root hash H1; constraint: hash([_, _, f1a, f1b, _]) = H1.
  • if we are deeper down in the trie, we have more columns
  • the problem might be that 16 values are checked when going from one column to the next one, that would mean constraint like: (pos - 1)(pos - 2) * * (pos - 15) * (hash = first row) + pos * (pos - 2) * * (pos - 15) * (hash = second row) + pos * (pos - 1) * (pos - 3) * * (pos - 15) * (hash = third row) +

The first change (left2a -> left2aa) implies:

  • h2a -> h2aa
  • f1a -> f1aa
  • H1 -> H2

We fill the block as we did for the first one, but we have some new values: leaf2aa, h2aa, f1aa, H2. We check the propagation of hashes (we might have a chip for this). Additionally, we check whether the value in the 17-th row is the same in both blocks. 17-th row constraints will make sure we are climbing up to the root by the same path as in the previous block.

The second change (left1b -> left1bb) implies:

  • h1b -> h1bb
  • f1b -> f1bb (picture wrongly says f2bb)
  • H2 -> H3

Now we need to prove we have another witness for H2. We fill the block with leaf1b, leaf2b, , leaf16b.

We check the propagation of the hashes. And finally we check the root hash is the same as in the previous block (H2).

Now we need to prove we have a witness for H3 which uses the same path as the previous block.

We fill in leaf1bb, , leaf16b. We check the propagation of hashes. We check the 17-th row is the same as in the previous block.

Finally, we compare H3 to the public input.

Generalization

I think the approach above works for updating the value, inserting a new leaf to the existing branch, and deleting a leaf from the branch.

To support converting a leaf into a branch, we can add another two columns (which would be the second and third column now) which would be left empty unless a leaf is turned into branch and a leaf is added to this branch. Then we would have a leaf in the first (of these two) column and its hash in the second column.

To support extension nodes we add another row (18-th) to the block which would tell the parent type. The constraints would be a bit different and some cells would be left empty for extension nodes. One thing that would cause problems is if the trie layout changes in the middle of the path (branch <-> extension node), need to check this.

Question & Thought

  1. We somehow need to know what kind of node it is to perform correct encoding.
  2. In such layout, we need a wide enough circuit to handle the deepest case, which could be 64 layer. Then in normal cases, we need to pad left (or right) to fill all cells, it seems half of cells will be wasted in average.
    Is it possible to only verify single layer per region (per block in your word)?
  3. If state circuit has already sorted to prevent update same key twice, then in MPT circuit we don't need to do that again. In MPT circuit we only need to require all update key value pair to exist in state circuit.
  4. If we are handling all MPT in this circuit, we need some kind of tag to identify different address (should be easy to add).
Select a repo