We thank Vitalik Buterin, Barry Whitehat, Chih-Cheng Liang, Kobi Gurkan and Georgios Konstantopoulos for their reviews and insightful comments.
We believe zk-Rollup to be the holy grail — a best-in-class Layer 2 scaling solution that is very cheap and secure. However, existing zk-Rollups are application-specific, which makes it hard to build general composable DApps inside a zk-Rollup and migrate existing applications. We introduce zkEVM, which can generate zk proofs for general EVM verification. This allows us to build a fully EVM-compatible zk-Rollup, which any existing Ethereum application can easily migrate to.
In this article, we identify the design challenges of zkEVM and explain why it is possible now. We also give a more specific intuition and describe the high-level ideas of how to build it from scratch.
zk-Rollup is recognized as the best scaling solution for Ethereum. It is as secure as Ethereum Layer 1 and has the shortest finalization time compared to all other Layer 2 solutions (detailed comparisons here).
In the medium to long term, ZK rollups will win out in all use cases as ZK-SNARK technology improves. - Vitalik Buterin
The basic idea of zk-Rollup is to aggregate a huge number of transactions into one Rollup block and generate a succinct proof for the block off-chain. Then the smart contract on Layer 1 only needs to verify the proof and apply the updated state directly, without re-executing those transactions. This can reduce gas fees by an order of magnitude, since proof verification is much cheaper than re-executing the computation. Another saving comes from data compression (i.e., keeping only the minimum on-chain data needed for verification).
Although zk-Rollup is secure and efficient, its applications are still limited to payments and swaps. It's hard to build general-purpose DApps due to the following two reasons.
In a nutshell, zk-Rollup is developer-unfriendly and has limited functionality for now.
That's the biggest problem we want to tackle. We want to provide the best developer experience and support composability within Layer 2 by supporting native EVM verification directly, so that existing Ethereum applications can simply migrate over onto the zk-Rollup as is.
There are two ways to build general DApps in a zk-Rollup.
"Circuit" refers to the program representation used in zero-knowledge proofs. For example, if you want to prove hash(x) = y, you need to rewrite the hash function in circuit form. The circuit form only supports very limited expressions (e.g., R1CS only supports addition and multiplication). So, it's very hard to write a program in the circuit language: you have to build all your program logic (including if else, loops and so on) using add and mul.
The first approach requires developers to design specialized "ASIC" circuits for different DApps. It is the most traditional way to use zero-knowledge proofs. Each DApp will have a smaller overhead through customized circuit design. However, it brings a composability problem, since the circuit is "static", and a terrible developer experience, since it requires strong expertise in circuit design[2].
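To make the circuit form concrete, here is a minimal sketch of how an if/else can be expressed using only addition and multiplication over a prime field. The field modulus `P` and the helper names `add`, `mul` and `select` are assumptions chosen for illustration; they are not taken from any particular proof system or library.

```python
# Toy prime modulus (assumption for illustration; real systems use much larger primes).
P = 2**61 - 1

def add(a, b):
    """Field addition."""
    return (a + b) % P

def mul(a, b):
    """Field multiplication."""
    return (a * b) % P

def select(cond, a, b):
    """Circuit-style `a if cond else b`, built from add and mul only."""
    one_minus_cond = add(1, -cond)  # subtraction is addition of the negation
    # Constraint: cond must be a bit, i.e. cond * (1 - cond) == 0.
    if mul(cond, one_minus_cond) != 0:
        raise ValueError("cond is not boolean")
    # out = cond * a + (1 - cond) * b
    return add(mul(cond, a), mul(one_minus_cond, b))
```

Both branches are always "computed" here; the selector bit only decides which one contributes to the output. This is one way to see why dynamic logic is awkward in a static circuit.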
The second approach doesn't require any special design or expertise from the developer. The high-level idea of such a machine-based proof is that any program will eventually run on a CPU, so we only need to build a universal CPU circuit to verify each low-level CPU step. Then we can use this CPU circuit to verify any program execution. In our scenario, the program is the smart contract and the CPU is the EVM. However, this approach has not been commonly adopted in past years due to its large overhead. For example, even if you only want to prove that the result of add is correct in one step, you still need to afford the overhead of an entire EVM circuit. If you have thousands of steps in your execution trace, it will be 1000x the EVM circuit overhead on the prover side.[3]
Recently, there has been a lot of research on optimizing zk proofs following those two approaches, including (i) proposing new zk-friendly primitives (e.g., the Poseidon hash can be 100x more efficient than SHA256 in circuit), (ii) ongoing work on improving the efficiency of general-purpose verifiable VMs, as in TinyRAM, and (iii) a growing number of general-purpose optimization tricks like Plookup, and, more generally, faster cryptographic libraries.
In our previous article, we proposed designing an "ASIC" circuit for each DApp and letting them communicate through cryptographic commitments. However, based on feedback from the community, we changed our priority and will first focus on building a general EVM circuit (the so-called "zkEVM") following the second approach. zkEVM will allow exactly the same developer experience as developing on Layer 1. Instead of leaving the design complexity to the developer, we will take it over and solve the efficiency problem through customized EVM circuit design.
zkEVM is hard to build. Even though the intuition has been clear for years, no one has successfully built a native EVM circuit. Compared with TinyRAM, zkEVM is even more challenging to design and implement, for the following reasons:
CALL and it also has error types related to the execution context and gas. This will bring new challenges to circuit design.
add might result in the overhead of the entire EVM circuit.
Thanks to the great progress made by researchers in this area, more and more efficiency problems have been solved in the last two years, and the proving cost for zkEVM is finally feasible! The biggest technology improvements come from the following aspects:
Besides the strong intuition and technology improvements, we still need a clearer idea of what we need to prove and a more specific architecture. We will introduce more technical details and comparisons in follow-up articles. Here, we describe the overall workflow and some key ideas.
Developers can implement smart contracts using any EVM-compatible language and deploy the compiled bytecode on Scroll. Then, users can send transactions to interact with those deployed smart contracts. The experience for both users and developers will be exactly the same as on Layer 1. However, the gas fee is significantly reduced and transactions are pre-confirmed instantly on Scroll (withdrawals only take a few minutes to finalize).
Even though the outside workflow remains the same, the underlying processing procedures for Layer 1 and Layer 2 are entirely different:
Let's give a more detailed explanation of how things work differently for transactions on Layer 1 and Layer 2.
In Layer 1, the bytecode of each deployed smart contract is stored in the Ethereum storage. Transactions are broadcast in a P2P network. For each transaction, each full node needs to load the corresponding bytecode and execute it on the EVM to reach the same state (the transaction is used as input data).
In Layer 2, the bytecode is also stored in the storage and users behave in the same way. Transactions are sent off-chain to a centralized zkEVM node. Then, instead of just executing the bytecode, zkEVM generates a succinct proof that the states are updated correctly after applying the transactions. Finally, the Layer 1 contract verifies the proof and updates the states without re-executing the transactions.
Let's take a deeper look at the execution process and see what zkEVM needs to prove at the end of the day. In native execution, the EVM loads the bytecode and executes its opcodes one by one from the beginning. Each opcode can be thought of as doing the following three sub-steps: (i) read elements from the stack, memory or storage, (ii) perform some computation on those elements, and (iii) write the results back to the stack, memory or storage.[5] For example, the add opcode needs to read two elements from the stack, add them up and write the result back to the stack.
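As a rough illustration of these three sub-steps, the sketch below models a single add step on a list-based stack. The function name `step_add` and the list representation are assumptions made for this example; it ignores gas, the program counter and the other state mentioned in footnote [5].

```python
WORD = 2**256  # EVM words are 256 bits wide

def step_add(stack):
    """One execution step of the add opcode on a list-based stack (top = last item)."""
    # (i) Read: pop two elements from the stack.
    a = stack.pop()
    b = stack.pop()
    # (ii) Compute: addition wraps around modulo 2**256.
    result = (a + b) % WORD
    # (iii) Write: push the result back onto the stack.
    stack.append(result)
    return stack
```

A zkEVM has to prove that each of these sub-steps was carried out correctly, rather than simply performing them.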
So, it's clear that the proof of zkEVM needs to contain the following aspects corresponding to the execution process
When designing the architecture for zkEVM, we need to address the aforementioned three aspects one by one.
We need to design a circuit for some cryptographic accumulator.
This part acts like a "verifiable storage"; we need some technique to prove that we are reading correctly. A cryptographic accumulator can be used to achieve this efficiently.[6]
Let's take a Merkle Tree as an example. The deployed bytecode is stored as a leaf in the Merkle Tree. Then, the verifier can check that the bytecode is loaded correctly from a given address using a succinct proof (i.e., by verifying the Merkle path in circuit). For Ethereum storage, we need the circuit to be compatible with the Merkle Patricia Trie and the Keccak hash function.
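The Merkle path check can be sketched outside a circuit as follows. This is a simplified illustration under stated assumptions: it uses a plain binary Merkle tree over a power-of-two number of leaves and SHA-256 from Python's standard library as a stand-in, whereas Ethereum itself uses a Merkle Patricia Trie with Keccak. The helper names are hypothetical.

```python
import hashlib

def h(data: bytes) -> bytes:
    """Stand-in hash function (SHA-256, not Keccak)."""
    return hashlib.sha256(data).digest()

def next_level(level):
    """Hash adjacent pairs to build the parent level."""
    return [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]

def merkle_root(leaves):
    """Root of a binary Merkle tree; len(leaves) must be a power of two."""
    level = [h(x) for x in leaves]
    while len(level) > 1:
        level = next_level(level)
    return level[0]

def merkle_path(leaves, index):
    """Sibling hashes from the leaf at `index` up to the root."""
    level = [h(x) for x in leaves]
    path = []
    while len(level) > 1:
        path.append(level[index ^ 1])  # sibling sits next to us in the pair
        level = next_level(level)
        index //= 2
    return path

def verify_path(root, leaf, index, path):
    """Recompute the root from one leaf and its sibling path."""
    node = h(leaf)
    for sibling in path:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root
```

The verifier only needs the root, the claimed leaf and a path whose length is logarithmic in the number of leaves, which is what makes the proof succinct.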
We need to design a circuit to link the bytecode with the real execution trace.
One problem with moving bytecode into a static circuit is conditional opcodes like jump (corresponding to loops and if else statements in smart contracts). A jump can go anywhere. The destination is not determined until one has run the bytecode with a specific input. That's the reason why we need to verify the real execution trace. The execution trace can be thought of as "unrolled bytecode": it includes the sequence of opcodes in the real execution order (i.e., if you jump to another position, the trace will contain the destination opcode and position).
The prover will provide the execution trace directly as a witness to the circuit. We need to prove that the provided execution trace is indeed the one "unrolled" from the bytecode with the specific input. The idea is to force the value of the program counter to be consistent. To deal with the undetermined destination, the idea is to let the prover provide everything. Then you can check the consistency efficiently using a lookup argument (i.e., prove that the opcodes with the proper global counter are included in the "bus").
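The inclusion check behind this idea can be sketched in plain Python. This is a toy model under stated assumptions: the bytecode is a list of opcode names with one slot per opcode (no immediates), the trace is a list of (program counter, opcode) rows supplied by the prover, and a set membership test stands in for the lookup argument. It checks only that every trace row appears in the bytecode table; whether each next program counter is the right one is left to the per-opcode constraints. The function names are hypothetical.

```python
def bytecode_table(bytecode):
    """Fixed table of (program counter, opcode) pairs derived from the bytecode."""
    return {(pc, op) for pc, op in enumerate(bytecode)}

def trace_is_consistent(bytecode, trace):
    """Check that every (pc, opcode) row of the trace is in the bytecode table."""
    table = bytecode_table(bytecode)
    return all(row in table for row in trace)

# A loop body that is executed twice before falling through to STOP.
bytecode = ["JUMPDEST", "ADD", "JUMPI", "STOP"]
honest_trace = [
    (0, "JUMPDEST"), (1, "ADD"), (2, "JUMPI"),  # first pass, jump taken
    (0, "JUMPDEST"), (1, "ADD"), (2, "JUMPI"),  # second pass, falls through
    (3, "STOP"),
]
forged_trace = [(0, "JUMPDEST"), (1, "MUL")]    # opcode not present at pc = 1
```

The trace can be longer than the bytecode because of the loop, yet each of its rows still has to match an entry of the fixed table.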
We need to design circuits for each opcode (to prove that the reads, writes and computations in each opcode are correct).
This is the most important part: proving that each opcode in the execution trace is correct and consistent. It will bring a huge overhead if you put all the things together directly. The important optimization idea here is that
This architecture was first specified by the Ethereum Foundation. It's still at an early stage and under active development. We are collaborating closely with them to find the best way to implement the EVM circuit. So far, the most important traits are defined and some opcodes have already been implemented here (using UltraPlonk syntax in the Halo2 repo). More details will be introduced in follow-up articles. We refer interested readers to this document. The development process will be transparent. This will be a community effort with a fully open-source design. We hope more people will join and contribute.
zkEVM is much more than just Layer 2 scaling. It can be thought of as a direct way to scale Ethereum Layer 1 via Layer-1 validity proofs. That means you can scale the existing Layer 1 without any special Layer 2.
For example, you can use zkEVM as a full node. The proof can be used to prove transitions between existing states directly. There is no need to port anything to Layer 2; you can prove all Layer 1 transactions directly! More broadly, you can use zkEVM to generate a succinct proof for the whole of Ethereum, like Mina. The only thing you need to add is proof recursion (i.e., embedding the verification circuit of a block into the zkEVM)[7].
zkEVM can provide the same experience for developers and users. It's orders of magnitude cheaper without sacrificing security. An architecture has been proposed to build it in a modular way. It leverages recent breakthroughs in zero-knowledge proofs to reduce the overhead (including customized constraints, lookup arguments, proof recursion and hardware acceleration). We look forward to seeing more people joining the zkEVM community effort and brainstorming with us!
Scroll Tech is a newly founded tech-driven company. We aim to build an EVM-compatible zk-Rollup with a strong proving network (see an overview here). The whole team is now focused on development. We are actively hiring more passionate developers; reach out to us at hire@scroll.tech. If you have any questions about the technical content, reach out to me at yezhang@scroll.tech. DMs are also open.
[1]: Starkware claimed to have achieved composability a few days ago (reference here).
[2]: A circuit is fixed and static. For example, you can't use a loop with a variable upper bound when implementing a program as a circuit. The upper bound has to be fixed to its maximum value. A circuit can't deal with dynamic logic.
[3]: To make this clearer, we elaborate on the cost of the EVM circuit here. As we described earlier, a circuit is fixed and static. So, the EVM circuit needs to contain all possible logic (10000x larger than pure add). That means even if you only want to prove add, you still need to afford the overhead of all possible logic in the EVM circuit. This amplifies the cost by 10000x. In the execution trace, you have a sequence of opcodes to prove, and each opcode will have such a large overhead.
[4]: The EVM itself is not tightly bound to the Merkle Patricia tree. MPT is just how Ethereum states are stored for now. A different structure can easily be plugged in (e.g., the current proposal to replace MPT with Verkle trees).
[5]: This is a highly simplified abstraction. Technically, the list of "EVM state" is longer including PC, gas remaining, call stack (all of the above plus address and staticness per call in the stack), a set of logs, and transaction-scoped variables (warm storage slots, refunds, self-destructs). Composability can be supported directly with additional identifier for different call context.
[6]: We use an accumulator for storage since the storage is huge. For memory and stack, one can use editable Plookup ("RAM" can be implemented efficiently in this way).
[7]: It's non-trivial to add a complete recursive proof to the zkEVM circuit. The best way to do recursion is still to use cyclic elliptic curves (e.g., the Pasta curves). Some "wrapping" process is needed to make the proof verifiable on Ethereum Layer 1.