---
tags: zkevm-docs
---
# Breaking down and understanding the ZkEVM
I've been working on the ZkEVM project for almost a year now. During this time, I've worked on many parts of the system design and implementation together with my colleagues.
I've always thought it would be nice to write some posts/writeups (which are not just code docs from our repos) so that others can understand how the ZkEVM works, why it has been designed this way and, hopefully, to provide some basis for understanding future work and development done within this new area.
My idea is to break down all of the components that make up the ZkEVM, as well as try to explain how they interact with each other, and to cover some of the discussions that led to the current design.
Anyway, I'm not planning to go into the deepest details. For the interesting ones, I'll maybe write some extra articles explaining them.
# 1. General architecture of the ZkEVM
## Understanding the EVM.
I'm sure that most of us are already familiar with the EVM (Ethereum Virtual Machine) architecture. But for those who are not, here's a small introduction.
The EVM is a Turing-complete virtual machine responsible for executing the state changes in the Ethereum blockchain between blocks.
To do so, the EVM has the following key components/concepts attached to it.

With the aforementioned parts, this is the execution model that the EVM follows.

Each piece of EVM code (a contract written in Solidity, Huff, Vyper...) is, at its core, compiled into opcodes/bytecode.
Each opcode gets assigned a Program Counter (PC) (not exactly, but for ease of explanation we will leave it like this).
Each time you process the changes that an opcode triggers, you update the Program Counter.
Before processing each opcode, the EVM first checks that there's enough gas to execute it (all opcodes have a fixed or dynamic gas price assigned to them).
Opcodes are just ways to modify the Stack, Memory and Storage.
That means the EVM is responsible for processing all of the Stack, Memory and Storage reads/writes, which lead to state updates in the chain that translate into, for example, balance changes between accounts.
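To make that execution model concrete, here's a heavily simplified interpreter sketch in Rust. Only `PUSH1` and `ADD` are modelled, and words are `u64` instead of 256-bit; the real EVM's opcodes, gas rules and memory/storage handling are far more involved. It's just meant to show the PC/gas/stack loop described above:

```rust
// Simplified sketch of the EVM execution loop: check gas, process the
// opcode's effect on the stack, and advance the program counter.
const PUSH1: u8 = 0x60;
const ADD: u8 = 0x01;

fn execute(bytecode: &[u8], mut gas: u64) -> Result<Vec<u64>, &'static str> {
    let mut stack: Vec<u64> = Vec::new();
    let mut pc = 0usize;
    while pc < bytecode.len() {
        // Every opcode has a gas cost, checked *before* it is processed.
        let cost = match bytecode[pc] {
            PUSH1 | ADD => 3,
            _ => return Err("unknown opcode"),
        };
        if gas < cost {
            return Err("out of gas");
        }
        gas -= cost;
        match bytecode[pc] {
            PUSH1 => {
                stack.push(bytecode[pc + 1] as u64);
                pc += 2; // opcode + 1 byte of immediate data
            }
            ADD => {
                let a = stack.pop().ok_or("stack underflow")?;
                let b = stack.pop().ok_or("stack underflow")?;
                stack.push(a.wrapping_add(b));
                pc += 1;
            }
            _ => unreachable!(),
        }
    }
    Ok(stack)
}

fn main() {
    // PUSH1 2, PUSH1 3, ADD -> stack ends as [5].
    assert_eq!(execute(&[PUSH1, 2, PUSH1, 3, ADD], 100).unwrap(), vec![5]);
    // The same program with too little gas fails mid-execution.
    assert_eq!(execute(&[PUSH1, 2, PUSH1, 3, ADD], 5), Err("out of gas"));
}
```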
For a slightly deeper view of the EVM, I wrote some docs about it, opcodes, etc. quite a few years ago. You can check them [here](https://github.com/CPerezz/EthDocs/blob/master/evm_%26_assembly.md). They might be outdated with regard to gas costs and some missing opcodes, but the general idea is still the same.
## What the hell is a ZkEVM then?
"ZkEVM" is the term used to define a solution to a problem. The problem of **How do we prove that the execution of an entire block inside of the EVM was correctly performed with a Zero Knowledge Proof?**
When we say "ZkEVM" we're referring to the overall architecture of circuits which enables us to make proofs of "Correct Eth Block Processing".
Funnily enough, the zero-knowledge property of the SNARKs is actually not used here. But we'll talk about this in future articles.
But what does this mean exactly? Let's dig a bit deeper.
Well, at the very end, what we are saying is that we will be proving that for each Ethereum block, and for each transaction in it:
- Stack, Memory and Storage changes caused by each opcode are correct.
- Merkle-Patricia Trie (MPT) paths lead to the correct root for each storage slot updated.
- Bytes are copied correctly from/to Memory.
- Each hash performed by the EVM is indeed correct and leads to the expected output.
- Every result of every Arithmetic or Binary operation performed by an Opcode is correct and in the correct Stack position.
- Any Reversion is also proven to be correct (Proof of wrong execution).
- Every transaction signature, hash, etc. is correct.
We could continue. But I guess you get the idea already that there's A BUNCH of stuff to prove :sweat_smile:.
## How do we architect the Circuit that has to prove all this execution??
Well, there are several approaches to this. But I'll stick with the one we've gone for (which, as far as I saw at the ZKEVM meeting at DevConnect AMS, is the one almost everyone is following, with small tweaks).
### Some background from ZKPs needed to understand the architecture.
- With ZKProofs, you ideally want to compile your circuits once and re-use them, only changing the witness. That means that you cannot change your circuits between different blocks.
But wait... Obviously, each block will not always have the same txs, nor in the same order, nor following the same execution paths.
:::info
So that would mean that we cannot have a unique circuit. Because in one block the 1000th opcode to process might be `PUSH_N`, but in the next one it could be `JUMP`. And these opcodes have completely different constraints associated with them.
:::
*We will see how we avoid this problem once we review the EVM Circuit in depth.*
- To reuse circuits, if there's branching within the circuit logic that depends on a witness value, the circuit branches must be padded to the largest of them, so that we don't get variadic-size circuits.
And as you can imagine, the EVM has quite a few branching possibilities.
:::warning
Basically, to give a very easy example: since every single opcode can run out of gas, each opcode already represents a branch between reverting the tx or not. :astonished:
:::
- ZK circuits are padded (in our case using PLONK) to the next power of two of the number of constraints of the circuit. That means that we always need to use the largest possible circuit, even if the actual block only needs half of the constraints that the largest one occupies.
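A tiny sketch of this point (and of the two-adicity cap discussed next): padding to the next power of two, with an upper bound of 2^28 rows from the curve. The constant and helper names are illustrative, not from the real codebase:

```rust
// Sketch: circuit rows are padded to the next power of two, and the curve's
// two-adicity (2^28 for BN256) caps how large that power can be.
const TWO_ADICITY: u32 = 28;

fn padded_rows(constraints: usize) -> Option<usize> {
    let padded = constraints.next_power_of_two();
    if padded <= 1usize << TWO_ADICITY {
        Some(padded)
    } else {
        None // cannot be represented over this curve
    }
}

fn main() {
    assert_eq!(padded_rows(1_000), Some(1_024));
    // A block needing just over half the padded size still pays for all of it.
    assert_eq!(padded_rows((1 << 20) + 1), Some(1 << 21));
    // Beyond the two-adicity cap, the circuit simply cannot be represented.
    assert_eq!(padded_rows((1 << 28) + 1), None);
}
```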
- In the PSE solution, we use a tuned version of the [KZG polynomial commitment scheme](https://www.iacr.org/archive/asiacrypt2010/6477178/6477178.pdf) using BN256 as our base curve. BN256 has a two-adicity of 2^28, which translates to how many roots of unity this curve offers us. We need these to encode our polynomials in order to commit to them. Even if it might not be trivial to see, the amount of roots of unity directly sets a cap on the number of constraints we can have in our circuit.
:::info
What that means is that if there's a block that somehow ends up taking `>= 2^27` constraints to be proven to be executed correctly, we simply can't represent it, and we need to apply extra tricks or simply change our curve.
:::
### Components of the architecture

Well, I hope my hand-drawn diagram doesn't make you laugh for too long...
There are a lot of important things which we will cover in this article. But none of them will involve going in-depth into any of these components.
The next articles (which I hope I'll still have the motivation to write) will do so.
#### Bus-Mapping
Bus-Mapping is a piece designed to parse EVM execution traces and manipulate all of the data they provide in order to obtain structured witness inputs for the EVM, State, MPT and Keccak circuits.
:::info
It works by parsing all of the data contained in the [StructLogRes](https://github.com/ethereum/go-ethereum/blob/9244d5cd61f3ea5a7645fdf2a1a96d53421e412f/eth/tracers/logger/logger.go#L413-L426) and then generating treated, ordered and formatted witness data ready to be used as circuit input.
:::
Among other things it produces four major outputs:
- Circuit Input Builder: Contains all of the data related to the Opcode executions (Arith op results and similar things) ready to be witnessed.
- Operation Container: Contains an ordered list of all of the operations related to Stack, Memory and/or Storage.
- Keccak Container: Contains a list of all of the hashes required to process the entire ExecutionTrace.
- StateDB: KV Database which represents the Ethereum State Trie. Contains all the required info related to all the storage info.
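To fix ideas, here's a rough Rust sketch of what those four outputs could look like. These shapes are hypothetical simplifications for illustration only; the real types in the `bus-mapping` crate are much richer:

```rust
use std::collections::HashMap;

// Hypothetical, heavily simplified shapes for the four Bus-Mapping outputs.
#[derive(Default)]
struct CircuitInputBuilder {
    exec_steps: Vec<String>, // witnessed data for each opcode execution
}
#[derive(Default)]
struct OperationContainer {
    ops: Vec<(u64, String)>, // (global_counter, Stack/Memory/Storage op)
}
#[derive(Default)]
struct KeccakContainer {
    inputs: Vec<Vec<u8>>, // every preimage hashed while processing the trace
}
#[derive(Default)]
struct StateDb {
    kv: HashMap<String, String>, // KV view of the Ethereum state trie
}

fn main() {
    // Sketch of what parsing a couple of trace steps could produce.
    let mut ops = OperationContainer::default();
    ops.ops.push((4, "Memory Read  key=0x40 val=0x80".to_string()));
    ops.ops.push((10, "Memory Write key=0x80 val=0xdeadbeef".to_string()));
    // Operations stay ordered by global counter.
    assert!(ops.ops.windows(2).all(|w| w[0].0 <= w[1].0));
}
```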
Given that, we can easily see through an example a representation of what the Bus-Mapping does for us.
Suppose we process all of the following opcodes (let's assume they come from an ExecutionTrace):
```
pc op stack (top -> down) memory
-- -------------- ---------------------------------- ---------------------------------------
...
53 JUMPDEST [ , , , ] {40: 80, 80: , a0: }
54 PUSH1 40 [ , , , 40] {40: 80, 80: , a0: }
56 MLOAD [ , , , 80] {40: 80, 80: , a0: }
57 PUSH4 deadbeef [ , , deadbeef, 80] {40: 80, 80: , a0: }
62 DUP2 [ , 80, deadbeef, 80] {40: 80, 80: , a0: }
63 MSTORE [ , , , 80] {40: 80, 80: deadbeef, a0: }
64 PUSH4 faceb00c [ , , faceb00c, 80] {40: 80, 80: deadbeef, a0: }
69 DUP2 [ , 80, faceb00c, 80] {40: 80, 80: deadbeef, a0: }
70 MLOAD [ , deadbeef, faceb00c, 80] {40: 80, 80: deadbeef, a0: }
71 ADD [ , , 1d97c6efb, 80] {40: 80, 80: deadbeef, a0: }
72 DUP2 [ , 80, 1d97c6efb, 80] {40: 80, 80: deadbeef, a0: }
73 MSTORE [ , , , 80] {40: 80, 80: 1d97c6efb, a0: }
74 PUSH4 cafeb0ba [ , , cafeb0ba, 80] {40: 80, 80: 1d97c6efb, a0: }
79 PUSH1 20 [ , 20, cafeb0ba, 80] {40: 80, 80: 1d97c6efb, a0: }
81 DUP3 [ 80, 20, cafeb0ba, 80] {40: 80, 80: 1d97c6efb, a0: }
82 ADD [ , a0, cafeb0ba, 80] {40: 80, 80: 1d97c6efb, a0: }
83 MSTORE [ , , , 80] {40: 80, 80: 1d97c6efb, a0: cafeb0ba}
84 POP [ , , , ] {40: 80, 80: 1d97c6efb, a0: cafeb0ba}
...
```
Once you have the trace built (following the code found above), you can basically get an iterator/vector over the Stack, Memory or Storage operations, ordered in the way the State Circuit needs.
That way, we would get something like this for the Memory ops:
| `key` | `val` | `rw` | `gc` | Note |
|:------:| ------------- | ------- | ---- | ---------------------------------------- |
| `0x40` | `0` | `Write` | | Init |
| `0x40` | `0x80` | `Write` | 0 | Assume written at the beginning of `code` |
| `0x40` | `0x80` | `Read` | 4 | `56 MLOAD` |
| - | | | | |
| `0x80` | `0` | `Write` | | Init |
| `0x80` | `0xdeadbeef` | `Write` | 10 | `63 MSTORE` |
| `0x80` | `0xdeadbeef` | `Read` | 16 | `70 MLOAD` |
| `0x80` | `0x1d97c6efb` | `Write` | 24 | `73 MSTORE` |
| - | | | | |
| `0xa0` | `0` | `Write` | | Init |
| `0xa0` | `0xcafeb0ba` | `Write` | 34 | `83 MSTORE` |
As you can see, we group by `key` (the memory address) and then order by `global_counter`.
The values will also be formatted as field elements (the entire solution is abstracted over them), and will contain any auxiliary data required to process them.
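That re-ordering step can be sketched like this, with simplified tuples standing in for the real operation types:

```rust
// Sketch: re-order chronological memory operations into the
// (key, global_counter) order the State Circuit consumes.
#[derive(Debug, Clone, PartialEq)]
enum Rw {
    Read,
    Write,
}

// Each op is (key, global_counter, read/write, value).
fn sort_for_state_circuit(
    mut ops: Vec<(u64, u64, Rw, u64)>,
) -> Vec<(u64, u64, Rw, u64)> {
    // Group by memory address (key) first, then by global counter.
    ops.sort_by_key(|&(key, gc, _, _)| (key, gc));
    ops
}

fn main() {
    // Ops as they appear chronologically in the trace.
    let chronological = vec![
        (0x40, 4, Rw::Read, 0x80),
        (0x80, 10, Rw::Write, 0xdeadbeef),
        (0x40, 0, Rw::Write, 0x80),
        (0x80, 16, Rw::Read, 0xdeadbeef),
    ];
    let sorted = sort_for_state_circuit(chronological);
    assert_eq!(sorted[0], (0x40, 0, Rw::Write, 0x80));
    assert_eq!(sorted[1], (0x40, 4, Rw::Read, 0x80));
    assert_eq!(sorted[2], (0x80, 10, Rw::Write, 0xdeadbeef));
}
```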
*Thanks to Han (@han0110) for the table examples which I copied :smirk:*
#### State Circuit
Now that the Bus-Mapping has been completely explained, it should be easier to understand the State Circuit.
The purpose of the State Circuit is basically to verify the updates on any part of the EVM (that doesn't only mean the Storage Tries, but also, for example, the Memory or the Stack).
The only way to verify the Stack and Memory changes is if they're processed in order, so that we know the initial and final values of each updated Mem/Stack/Storage region.
Stack and Memory are the easy part, since in case of reversion they're pretty simple to handle.
In the EVM, there are multiple kinds of `StateDB`s that require updating but at the same time, they could be reverted when any internal call fails.
- `tx_access_list_account` - `(tx_id, address) -> accessed`
- `tx_access_list_storage_slot` - `(tx_id, address, storage_slot) -> accessed`
- `account_nonce` - `address -> nonce`
- `account_balance` - `address -> balance`
- `account_code_hash` - `address -> code_hash`
- `account_storage` - `(address, storage_slot) -> storage`
The complete list can be found [here](https://github.com/ethereum/go-ethereum/blob/master/core/state/journal.go#L87-L141). For some of them, like `tx_refund`, `tx_log` or `account_destructed`, we don't need write-and-revert because they don't affect future execution; we only write them when we know the call depth we are at is persistent.
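A minimal sketch of the journal/revert mechanism those keys rely on (mirroring the idea behind go-ethereum's state journal), assuming a toy `StateDb` that only tracks balances:

```rust
use std::collections::HashMap;

// Toy journaled state: every write records the previous value, so a failed
// internal call can be reverted back to a snapshot.
#[derive(Default)]
struct StateDb {
    balances: HashMap<&'static str, u64>,
    journal: Vec<(&'static str, Option<u64>)>, // (address, previous balance)
}

impl StateDb {
    fn snapshot(&self) -> usize {
        self.journal.len()
    }
    fn set_balance(&mut self, addr: &'static str, value: u64) {
        self.journal.push((addr, self.balances.get(addr).copied()));
        self.balances.insert(addr, value);
    }
    fn revert_to(&mut self, snap: usize) {
        while self.journal.len() > snap {
            let (addr, prev) = self.journal.pop().unwrap();
            match prev {
                Some(v) => {
                    self.balances.insert(addr, v);
                }
                None => {
                    self.balances.remove(addr);
                }
            }
        }
    }
}

fn main() {
    let mut db = StateDb::default();
    db.set_balance("alice", 100);
    let snap = db.snapshot();
    db.set_balance("alice", 0); // an inner call spends everything...
    db.revert_to(snap); // ...but the call fails, so its writes are undone
    assert_eq!(db.balances["alice"], 100);
}
```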
The State Circuit works closely with the EVM Circuit. It handles all of the values written to and read from all the EVM components, so that we're always sure that no value was introduced incorrectly.
It's an important part of the design, and when we review it in depth, I'm sure that understanding this component will help you a lot to better understand the entire solution.
#### EVM Circuit
The EVM Circuit also has a clear responsibility. But it gets more and more complex the deeper you get into it.
To start, the EVM circuit is mainly responsible for constraining the correct execution of each opcode that is processed by the EVM within each block.
It's the circuit with the most circuit dependencies. It's linked to almost all of the other circuits, as you've seen in the previously shown diagram.
Basically, the idea is simple. You take the opcodes executed by the EVM one by one, and you check the previous Stack, Memory and Storage states that will be involved in each EVM step, as well as the resulting ones.
Finally, it applies the constraints required for the opcode, constraining the inputs and the outputs to be the expected ones, and the out-states of each area (`Memory`, `Stack` and `Storage`) to be the expected ones too.
It also calls other circuits and delegates work to them.
You can imagine, for example, the `SHA3` opcode check being delegated. Or opcodes like `EXTCODESIZE` delegating the copy to memory to the Bytecode Circuit.
:::warning
But just as we want to go deeper, we see that we can't assign the constraints that correspond to a specific opcode directly. As each block has different traces, we would end up with different circuits and, therefore, would need to compile them each time, without being able to re-use Prover/Verifier keys.
:::
:::info
We won't get deeper into this here. But we don't simply apply the constraints required for one opcode. Instead, we apply all of the constraints of all of the opcodes, with flags enabling or disabling them.
That allows for circuit re-use. But it requires APIs built on top of Halo2 to be able to manage all these cells precisely, so that all of them are re-used and we don't waste any resources. :warning:
Just as a sneak peek, you can check the [`ConstraintBuilder` here](https://github.com/privacy-scaling-explorations/zkevm-circuits/blob/ec8ef16081038cbf2698292a72a3c402f6fdf166/zkevm-circuits/src/evm_circuit/util/constraint_builder.rs#L242) if you're curious and want to dive into an adventure!! :juggling:
:::
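As a toy model of that flag idea (plain integers instead of field elements, and the flags left unconstrained, unlike in the real circuit where they are constrained booleans):

```rust
// Toy model of "all constraints for all opcodes, gated by flags": every
// opcode's constraint is evaluated on every step, but multiplied by a 0/1
// flag, so only the active opcode's constraint has to vanish.
fn step_constraint(is_add: i64, is_mul: i64, a: i64, b: i64, out: i64) -> i64 {
    is_add * (a + b - out) + is_mul * (a * b - out)
}

fn main() {
    // An ADD step: the flag selects the addition constraint, and 3 + 4 = 7.
    assert_eq!(step_constraint(1, 0, 3, 4, 7), 0);
    // A MUL step: 3 * 4 = 12 holds, even though the ADD constraint wouldn't.
    assert_eq!(step_constraint(0, 1, 3, 4, 12), 0);
    // A bad witness makes the gated constraint non-zero, so proving fails.
    assert_ne!(step_constraint(1, 0, 3, 4, 8), 0);
}
```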
There are A LOT of things to explain about this circuit, so it will probably require two articles to go through it entirely, explaining all of the details. I hope at least this introduction got your attention :smiling_face_with_smiling_eyes_and_hand_covering_mouth:
#### Keccak Circuit
As its name says, the Keccak Circuit is responsible for constraining the correctness of __ALL__ the keccaks executed within a block. It also provides a lookup table to the other circuits of the architecture, so that they can easily derive the correctness check of any keccak that is involved in the statements they prove.
:::info
For example, the EVM circuit can derive the correctness of the SHA3 opcode execution.
Or also, the MPT circuit can check the correctness of all the leaves/level hashes till the root.
As you know, there's a bunch of parts in the entire processing of an Ethereum block, where a keccak operation is performed. You can see the usages we have considered in this [hackmd document](https://hackmd.io/TlttyQKIQ4KejKO9AEkbsQ?view#Bus-mapping-Keccak-hashing-accountant).
:::
*Here's a small diagram of the logic of the Keccak Circuit and its architecture.
:warning:Be aware that this is just a simplified diagram of the real solution used.*

The general idea here is to digest slices of bytes of arbitrary data without ending up with variadic-size circuits (so that we can basically re-use the same Prover & Verifier keys).
Also, note that both the padding and the RLC operations are performed inside the circuit. That means extra work to make sure the circuit is re-used across permutations independently of the witness size, so that the outcome is always correctly constrained, no matter the length of the input.
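The RLC (random linear combination) itself is just a fold, `acc = acc * r + byte`. Here's an illustrative sketch over plain integers; the real circuit performs this over a prime field with a verifier-derived randomness `r`:

```rust
// Sketch of the RLC used to compress a variable-length byte string into a
// single value. Plain u128 wrapping arithmetic stands in for field math.
fn rlc(bytes: &[u8], r: u128) -> u128 {
    bytes
        .iter()
        .fold(0u128, |acc, &b| acc.wrapping_mul(r).wrapping_add(b as u128))
}

fn main() {
    // With r = 256, the RLC is just the big-endian integer value of the bytes.
    assert_eq!(rlc(&[0x01, 0x02, 0x03], 256), 0x010203);
    // With a "random" r, different inputs give different digests.
    assert_ne!(rlc(&[1, 2, 3], 1_000_003), rlc(&[3, 2, 1], 1_000_003));
}
```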
:::warning
The problem with keccak is also that we could end up with different circuits, as its inputs are unbounded. Also, the padding has several conditions, and we try to avoid having multiple gates enabled via `Selector`s. **So when the padding of unbounded inputs mixes with conditions, and also with multiple hashes with different inputs and different lengths, we have work to do** :sweat_smile:
This is solved via state-machine modelling + lookup table exposure from its API. We'll see how it works exactly in future articles.
:::
#### MPT Circuit
Hmmm, well... I think we can summarize the complexity of this component with [the following GH conversation](https://twitter.com/CPerezz19/status/1534505452947611649) :rolling_on_the_floor_laughing: :rolling_on_the_floor_laughing:
The MPT Circuit is responsible for constraining the following things:
- Storage Root check for each account updated in the block is correct.
- Global storage root correctness check after all the accounts are updated.
:::danger
TBH with you, this is a circuit I don't have enough knowledge about in order to explain it now. So when I do an article for it, I'll force myself to get into all the constraints so that you don't have to. :sweat:
:::
For now, as a short intro to the circuit, I'll post here the words of Miha, who has been working insanely hard on the development of this circuit!
> The MPT circuit contains S and C columns (other columns are mostly selectors).
> With the S columns the prover proves the knowledge of key1/val1 that is in the
> trie with rootS.
> With the C columns the prover proves the knowledge of key1/val2 that is in the
> trie with rootC. Note that the key is the same for both S and C, whereas the value
> is different. The prover thus proves the knowledge of how to change the value at key
> key1 from val1 to val2, which results in the root being changed from rootS to rootC.
> The branch contains 16 nodes which are stored in 16 rows.
> A row looks like:
> `[0, 160, 123, ..., 148, 0, 160, 232, ..., 92 ]
> [rlp1 (S), rlp2 (S), b0 (S), ..., b31 (S), rlp1 (C), rlp2 (C), b0 (C), ..., b31 (C)]`
> Values bi (S) and bi (C) represent the hash of a node. Thus, the first half of a row
> is an S node:
> `[rlp1, rlp2, b0, ..., b31]`
> The second half of the row is a C node (same structure):
> `[rlp1, rlp2, b0, ..., b31]`
> We start with the top-level branch and then we follow branches (which could also be
> extension nodes) down to the leaf.
Currently, I know there's an effort from Adria and Miha to properly document and explain the circuit and its constraints, and to make it match its spec as much as possible.
Once this work is finished, it will become way easier to understand all the tricks used in the MPT Circuit design.
#### Bytecode Circuit
The Bytecode Circuit is "pretty simple", at least when it comes to the task it has to perform.
Each time any bytecode needs to be copied or hashed, the Bytecode Circuit is responsible for doing so.
This is useful for opcodes like [EXTCODESIZE](https://www.evm.codes/#3b) or [EXTCODEHASH](https://www.evm.codes/#3f), which **force the zkevm to make sure that the bytecode read from the specified contract indeed matches the one copied to memory.**
:::success
And the way to do so is by making the Bytecode Circuit compute the hash of the contract's bytecode and constrain it.
It also checks the bytecode correctness and, finally, provides a table to the other circuits so that they can basically "externalize" these kinds of checks.
:::
Also, the Bytecode Circuit takes responsibility for the correctness of the copied bytecode. By this I mean that sequences like the following one, where a `PUSH1` or a `PUSH7` opcode is not followed by enough data bytes, should be invalid. And therefore, the Bytecode Circuit is able to catch these things while constraining the copy of the bytecode, also ensuring its correctness.
```rust
let bytecode = vec![
    OpcodeId::ADD.as_u8(),
    OpcodeId::PUSH1.as_u8(),
    OpcodeId::PUSH1.as_u8(),
    OpcodeId::SUB.as_u8(),
    OpcodeId::PUSH7.as_u8(),
    OpcodeId::ADD.as_u8(),
    OpcodeId::PUSH6.as_u8(),
];
```
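A possible sketch of that check (hypothetical helper code, not the actual circuit constraints): walk the bytecode and verify that the immediate data of every `PUSH1`..`PUSH32` (opcodes `0x60..=0x7f`) actually fits inside the bytecode:

```rust
// Sketch: detect bytecode whose trailing PUSH is missing its immediate data.
fn push_data_is_complete(bytecode: &[u8]) -> bool {
    let mut pc = 0usize;
    while pc < bytecode.len() {
        let op = bytecode[pc];
        let data_len = if (0x60..=0x7f).contains(&op) {
            (op - 0x5f) as usize // PUSH1 = 0x60 carries 1 byte, PUSH32 carries 32
        } else {
            0
        };
        pc += 1 + data_len; // skip the opcode plus its immediate data
    }
    // A trailing PUSH with missing data makes pc overshoot the end.
    pc == bytecode.len()
}

fn main() {
    // PUSH1 0x01, ADD: the push data is present, so this is fine.
    assert!(push_data_is_complete(&[0x60, 0x01, 0x01]));
    // ADD, PUSH1 <missing byte>: invalid, like the sequence above.
    assert!(!push_data_is_complete(&[0x01, 0x60]));
}
```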
:::info
Checking the bytecode correctness indeed makes sense. Basically, if the contract was able to be deployed, there's no way it can have invalid bytecode (otherwise the contract-creation transaction would have reverted).
So if the bytecode copied (and later hashed) contains inconsistencies, we know that something is going on.
That's why checking the bytecode correctness inside the circuit is useful.
:::
#### Aggregation Circuit
The aggregation circuit is the key to the entire design of this solution.
In order to build the ZKEVM, one year ago we decided to go for [Halo2](https://github.com/zcash/halo2) and fork it, removing the `pasta` curves and adding BN256.
This has allowed us to contribute to the improvement of this fantastic library designed by the Zcash team and also provide feedback based on our experience.
Aside from that, we decided to split the solution into different circuits, with lookups that link them to each other. That way we can modularize the solution and make its implementation easier by splitting the complexity into smaller modules.
The purpose of the Aggregation Circuit, as its name says, is to aggregate the proofs generated by each component until we have a single proof that verifies all the others.
:::warning
Do not confuse this with recursion. We're not using recursion in any way for now; we're just aggregating the proofs together in a final circuit.
Recursion would probably bring a final amortization of the cost, but it increases the complexity of the solution and brings more challenges, such as copying all of the intermediate states within a tx/block processing into another proof.
I'll expand on some ideas we've had about the aggregation topic in later posts.
:::
For now, all you need to know is that the aggregation strategy works in a "tree-fashion" style, where the proofs from the circuits which have more dependencies are processed first, and where we can parallelize the process by having a bigger arity in the tree.
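As a toy model of that tree-style aggregation (strings standing in for real proofs, just to show the tree shape and round count under an assumed arity):

```rust
// Toy model: leaf "proofs" are combined in groups of `arity` per round
// until a single root proof remains.
fn aggregate(mut proofs: Vec<String>, arity: usize) -> (String, usize) {
    let mut rounds = 0;
    while proofs.len() > 1 {
        proofs = proofs
            .chunks(arity)
            .map(|group| format!("agg({})", group.join(",")))
            .collect();
        rounds += 1;
    }
    (proofs.pop().unwrap(), rounds)
}

fn main() {
    let leaves: Vec<String> = ["evm", "state", "mpt", "keccak"]
        .iter()
        .map(|s| s.to_string())
        .collect();
    // With arity 2, four circuit proofs are aggregated in two rounds;
    // a bigger arity would flatten the tree and allow more parallelism.
    let (_root, rounds) = aggregate(leaves, 2);
    assert_eq!(rounds, 2);
}
```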
_See this diagram from @han0110_

The aggregation circuit will probably be the culmination of this work.
#### L1 verifier