Ethereum wants to have its cake and eat it too: achieve scalability while maintaining or even increasing decentralization.
By reducing the tension between the two we can have more of both simultaneously, which in turn leads to more overall security in various ways: more usage and usecases, network value, robustness against attack, and democratization of access and validation.
A lot has been achieved since 2020: uniquely distributed liveness-favoring consensus complemented with a robust slashing and recovery mechanism, enshrined ETH with baseline staking utility, trust-minimized light validation, and scalable and proliferating layer-2s.
Work continues to further harden security, neutrality, and chain economics both in-protocol (censorship resistance, protocol simplification, decentralization) and extra-protocol.
There are still essential things we need and want, and better things we would like:
SNARKs are the key to:
- Bringing (2)-(5)
- Unlocking a simpler path towards (1).
There are many existing zkVM projects in the industry; why can't we just build our L1 SNARKs of interest on one of them?
Via Ming:
SNARK R&D has matured over recent years and can now plausibly meet the L1 core requirement of keeping hardware requirements low enough for consumer-grade hardware:
- GKR-Sumcheck zkVM: 2^20 riscv `add` opcodes in 0.65s => 1.613MHz on a single node; 7x & 17x faster than SP1 & Jolt, respectively
- Plonky3: 2M/s Poseidon hashes
- Risc0: Ethereum block proven in 90s; sha256 of 1MB per second
- SP1: 15 minutes and 38GB RAM for 1 BLS12 pairing (arkworks) on an M3 with 12 cores
- Benchmarks by vac.dev
There are two questions to consider:
Circuits are created for every aspect of the protocol: EVM opcodes, storage reads/writes, signature verification, hashing, etc. All of these are consolidated into meta-circuits to prove entire blocks, and finally the state-transition function after that (eg checking the proven block's header to constrain the previous block's root hash).
The advantage of this is that every circuit can be hand-optimized for maximum efficiency (=least number of constraints), but there are serious disadvantages:
There are still circuits in this route, but instead of constraining the always-mutating protocol components, we constrain a stable and simple ISA such as RISCV (or MIPS, WASM, ..). Then, Rust- or Go-based Ethereum nodes can be compiled to RISCV, and the proving goes from there. The circuits in this case have no direct exposure to the Ethereum protocol, a separation which is clearly advantageous for lower complexity and better auditability.
If Ethereum nodes are upgraded, we simply re-compile the nodes again to RISCV and the proving proceeds exactly as before.
Auditability is also easier: RISCV is simply primitive opcodes plus register and memory access.
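To make the route concrete, here is a minimal sketch (not any particular zkVM's SDK) of pure-Rust logic cross-compiled to a bare-metal RISC-V target; the circuits only ever constrain the primitive opcodes this compiles to, never Ethereum semantics:

```rust
// Illustrative guest crate; built with something like:
//   cargo build --release --target riscv32im-unknown-none-elf
#![no_std]
#![no_main]

use core::panic::PanicInfo;

// A toy "state transition": a real node's STF, written in ordinary Rust,
// would be compiled to RISC-V the same way and its execution trace proven.
#[no_mangle]
pub extern "C" fn apply_delta(balance: u64, delta: u64) -> u64 {
    balance.wrapping_add(delta)
}

#[panic_handler]
fn panic(_: &PanicInfo) -> ! {
    loop {}
}
```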
This abstraction sandwich comes with an overhead, but that can be minimized with optimizations: carve out highly-repeated operations and hand-constrain them with 'precompiles'.
Hashing is an example: instead of implementing keccak in Rust, compiling that to riscv, and proving whatever binary the riscv compiler spits out for us, we instead hand-constrain keccak in a dedicated circuit and expose it to the guest program as a 'precompile'.
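As a hedged sketch of what that looks like from the guest's side (the syscall ID and register ABI below are illustrative, not any specific zkVM's interface):

```rust
// Rather than proving thousands of traced RISC-V ops for a pure-Rust keccak,
// the guest issues an ecall; the zkVM intercepts it, hashes natively, and
// constrains the result with the hand-written keccak circuit.
#[cfg(target_arch = "riscv32")]
fn keccak256(input: &[u8]) -> [u8; 32] {
    const KECCAK_SYSCALL_ID: u32 = 0x01; // hypothetical ID
    let mut out = [0u8; 32];
    unsafe {
        core::arch::asm!(
            "ecall",
            in("t0") KECCAK_SYSCALL_ID,
            in("a0") input.as_ptr(),
            in("a1") input.len(),
            in("a2") out.as_mut_ptr(),
        );
    }
    out
}
```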
Deploying a zkVM to wrap the EL, for example, goes roughly as follows:
A generic zkVM is created and the entire Ethereum node is compiled to it. For the EL, a block producer or its proxy (builder) would create a block within said zkVM by compiling the production node to the zkVM bytecode. The trace of the compiled binary is what gets proven. The new block and the proof are propagated to the network, and nodes simply update the state to reflect changes from this new block, but without re-executing txs or re-hashing tries: they simply verify the proof and apply the changes. In a post-Endgame world, "hot" state data itself could be PeerDASed, giving nodes, light or otherwise, the choice not to store everything. They simply verify and follow a chain of blocks (no state attached to them), or even just the chain of headers plus the state of relevance to them, which they can validate against the root of the tip of the chain: "there does exist a state trie whose root hash is `x`, and I have a merkle proof against that same `x` proving that my ETH balance, the leaf, is part of that distributed state".
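A minimal sketch of the verification shape behind that quote, assuming a binary Merkle layout and SHA-256 as a stand-in hash (Ethereum's actual trie is a keccak-based hexary MPT; this only illustrates the check):

```rust
use sha2::{Digest, Sha256};

fn hash_pair(left: &[u8; 32], right: &[u8; 32]) -> [u8; 32] {
    let mut h = Sha256::new();
    h.update(left);
    h.update(right);
    h.finalize().into()
}

/// Check that `leaf` (eg the hash of my account's balance) is part of a
/// state trie whose root hash is `root` (the `x` at the tip of the chain).
fn verify_merkle_proof(leaf: [u8; 32], path: &[([u8; 32], bool)], root: [u8; 32]) -> bool {
    let mut acc = leaf;
    for (sibling, sibling_is_right) in path {
        acc = if *sibling_is_right {
            hash_pair(&acc, sibling)
        } else {
            hash_pair(sibling, &acc)
        };
    }
    acc == root
}
```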
Circuits are built, but for narrow-scope usecases, at some lower level such as the Halo2 API or PIL. Depending on the SNARK being built, development time can vary. It can be an optimization candy store for system developers, because they are building so close to the metal. The scope must however remain small before issues of complexity and unauditability start appearing.
> TODO: find benchmarks of keccak in Halo2 vs in zkVMs or compare/contrasts impl in both
The zkVM is built in exactly the same way as described above in the "Top-Down VM" route, but it is not deployed just yet. Instead, it is used as a substrate to build and deploy narrow-scope SNARKs targeting specific self-contained components of the protocol. These SNARKs are developed mostly in vanilla Rust, with precompiles for high-frequency and/or proving-expensive components imported as libs.
Narrow-scope SNARKs, such as hash-based-SNARK signature aggregation to replace BLS, can help achieve some urgent roadmap goals faster: SSF, quantum resistance, and fully validating light clients. Meanwhile, the core zkVM design and implementation, including proof system backend, can continue to be iterated upon independently and without disruption.
Because of the decoupling between the proof system and the programming environment (recall: the riscv binary separates the two nicely), development on the core zkVM can proceed in parallel with the development of the narrow-scope SNARKs of interest, which are themselves parallelizable.
There exists a decoupling between all layers, so changes in one do not affect the others (unless some protocol change happens to touch the exact part of the protocol the SNARK is targeting):
Any self-contained component of the Ethereum protocol could in theory be wrapped in a SNARK. A component is an ideal target for SNARKification if it maximally meets the following criteria:
Today we could have users attach a SNARK to their transaction proving its validity (well-formedness, signature, balance, nonce, ..) wrt the latest finalized block, which can be verified before adding it to the mempool or an inclusion list, for example. However, expending the effort to take this to production is somewhat unjustified: (1) it doesn't solve an urgent need; (2) it doesn't unlock optionality or simplification in other parts of the protocol.
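For illustration only, the public inputs such a transaction-validity proof might commit to could look roughly like this (all names are hypothetical, not an existing spec):

```rust
/// Sketch of public inputs for a tx-validity SNARK checked against the
/// latest finalized block before mempool/inclusion-list admission.
struct TxValidityPublicInputs {
    finalized_state_root: [u8; 32], // state the claims are evaluated against
    tx_hash: [u8; 32],              // binds the proof to one transaction
    sender: [u8; 20],               // recovered from the ECDSA signature
    nonce: u64,                     // must equal the account nonce in state
    max_cost: u128,                 // value + gas_limit * max_fee <= balance
}
```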
|  | Urgency | Unlocking potential | Risk |
|---|---|---|---|
| 1. SNARKified ECDSA Sigs | low | low | low |
| 2. hash-based-SNARKified Sig Aggr. | high | high | high |
| 3. Consensus-Committee-free light validation | low | medium | low |
Possible strategy: start with #3 because it's less risky but sufficiently useful, then #2 because of its massive unlocking potential towards simple SSF, which in turn frees the work to focus on other things. The work on #1 and #2 can happen in parallel, but deployment of #2 should take longer given the higher risk (dominated by audit time and the EIP process).
Right now, a daily committee of 512 validators is selected to inform light clients about the tip of the chain. Eliminating the need for this committee is sufficiently urgent to warrant the effort, can unlock optionality at the app and wallet level in the short term (removed or minimized trust vis-a-vis centralized providers of node RPCs), and can be a kernel for SNARKifying all of the EL. It is also low-risk enough to serve as a trial run of the SNARKification stack.
What to prove: the `leaf` has this `root` (proof 1), and the `root` has `x` validator weight behind it (proof 2, whose public input = block headers since genesis).
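The shape of the two composed claims might look like this (a hypothetical sketch; field names are illustrative):

```rust
/// Proof 1: the leaf is part of a state trie with this root.
struct StateMembershipClaim {
    leaf: [u8; 32],
    root: [u8; 32],
}

/// Proof 2: that root has `attesting_weight` validator weight behind it;
/// the public input is the chain of block headers since genesis.
struct WeightClaim {
    root: [u8; 32],
    attesting_weight: u128,
    genesis_header_hash: [u8; 32],
}
```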
Replacing BLS aggregation with a hash-based SNARK, where aggregation is accomplished through recursive proving (a toy sketch follows below), can achieve many benefits and unlock optionality wrt SSF and state-structure refactoring:
Because hashing and recursion are core to this SNARK, there is indirect unlocking potential at the EL level, provided the zkVM substrate exhibits blazing-fast hashing and whole-block proving (i.e. the success of this SNARK as a trial may justify expanding and parallelizing the team to move faster on partial and full EL SNARKification):
- Skip quantum-unsafe Verkle trees (keep the existing hextree OR upgrade to binary trees).
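A toy sketch of aggregation-by-recursion, with stand-ins for the proof system (nothing here invokes a real prover; it only shows the recursive shape):

```rust
/// Stand-in for recursive proof data; `covered` counts aggregated signatures.
struct Proof {
    covered: usize,
}

fn verify_sig(_pk: &[u8; 32], _msg: &[u8], _sig: &[u8; 64]) -> bool {
    true // placeholder: a hash-based signature check would go here
}

/// Each step attests: "signature i is valid AND the proof covering
/// signatures 0..i is valid", yielding one proof for all signatures.
fn aggregate_step(
    prev: Option<Proof>,
    pk: &[u8; 32],
    msg: &[u8],
    sig: &[u8; 64],
) -> Option<Proof> {
    if !verify_sig(pk, msg, sig) {
        return None;
    }
    let covered = prev.map_or(0, |p| p.covered); // recursively verified in-circuit
    Some(Proof { covered: covered + 1 })
}
```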
There are two fundamental aspects to seriously consider in building SNARKs for Ethereum:
[TODO: acceptable soundness security levels]
Credit to Ceno contributors for their design on which this wishlist is based
We can list some of the design choices that are both desirable given the state of the Ethereum roadmap, and feasible given the state of ZK R&D:
- TODO: concisely describe issues in folding
TODO: Basefold backend + recent developments: STIR, WHIR, ARC
Formal verification is undecidable by Rice's theorem, so we can't build general-purpose formal verifiers and just pass systems through them. This is fundamentally why formal verification is tedious and hard, and why its feasibility and usefulness are inversely proportional to how large and complex the codebase is. FV shines when the codebase is small and all execution paths can be exhaustively verified.
There are two general approaches:
For large systems, formal verification is very complex, and the spec itself could be buggy, leading to under-, mis-, or non-reporting of bugs.
Unique challenges to a RISCV-based zkVM:
Adaptations:
- Verify each component in isolation (eg that the `zkvm_add` opcode is exactly `riscv_add`), then (a) fuzz inter-components, and (b) formally verify inter-components taking each as a valid blackbox (eg treat a formally-verified `zkvm_add` just as a normal add). A fuzzing sketch follows below.
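A small sketch of the fuzzing half of that plan, differential-testing a (hypothetical) `zkvm_add` hook against native RV32I semantics (wrapping 32-bit addition):

```rust
/// RV32I `add` ignores overflow: plain two's-complement wrapping addition.
fn riscv_add(a: u32, b: u32) -> u32 {
    a.wrapping_add(b)
}

/// Differential fuzz: feed pseudo-random pairs to the zkVM's opcode
/// implementation and assert bit-exact agreement with the ISA semantics.
fn fuzz_add_equivalence(zkvm_add: impl Fn(u32, u32) -> u32, iters: u32) {
    let mut x: u32 = 0x9E37_79B9; // cheap xorshift PRNG state
    for _ in 0..iters {
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
        let (a, b) = (x, x.rotate_left(16));
        assert_eq!(zkvm_add(a, b), riscv_add(a, b));
    }
}
```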
There are many considerations to weigh when discussing Ethereum upgrades in terms of assigning priorities and making design and security assumptions. Some aspects can be quite subjective, such as: what is the current risk of imbalance between L1 and L2 in share of execution (low/med/high/critical)? Another less subjective, but still not fully objective, question: when do we consider Poseidon to have passed battle-testing? And so on.
One way to organize the discussion is to put single-slot finality (SSF) at the center and weigh every consideration against it. SSF affects and is affected by many other components of the roadmap, so it serves as a good anchor for assigning weight to different considerations.
TODO: elaborate on snarkifying riscv vs a custom-built 'zk-friendly ISA'
The seeds of general-compute zero-knowledge were planted over the 1990s and 2000s under the academic umbrella of interactive-proof research generally, and PCPs specifically. But these noble seeds lacked sufficient conditions to grow: prohibitive proof sizes and prover/verifier times.
This began to change in the early 2010s with innovations in cryptographic primitives and the arithmetization of sufficiently generic computations. More breakthroughs followed and shoots began sprouting, leading to the first non-trivial implementation of a zkSNARK proof system in 2016: Groth16.
More and more shoots sprouted in the years that followed, but not without hitting fundamental obstacles: (a) the tooling was still too close to the metal for the average developer to build zkDapps, or for the average systems developer to build large systems with baked-in ZK, and (b) the inefficiency and rigidity of R1CS arithmetization imposed a ceiling on what could practically be built in terms of size and performance.
More innovation followed. PLONK (2019) brought flexibility and a universal updatable trusted setup to zkSNARKs. Meanwhile, older ideas began to be realized in live hash-based SNARK production systems. The years 2021-22 witnessed an increase in large zkS{N,T}ARK systems being developed, chief among which are zkRollups tethered to Ethereum. There have also been many advances in primitives such as lookups, commitment schemes, and incrementally verifiable computation (IVC).
A look back at arithmetizations, from an implementer's perspective
R1CS arithmetization simulates arbitrary NP deciders using two simple primitives: arithmetic addition and multiplication. In practice this leads to a lot of overhead, as routine primitives such as logical operators have to be simulated with many additions and multiplications.
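For instance (a textbook construction, not specific to any codebase): a single boolean AND costs three constraints, booleanity of each input plus one product, and XOR needs a further linear-plus-product identity:

$$b_1\cdot(1-b_1)=0,\qquad b_2\cdot(1-b_2)=0,\qquad b_1\cdot b_2=c_{\mathrm{and}},\qquad b_1+b_2-2\,b_1\,b_2=c_{\mathrm{xor}}$$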
Plonkish arithmetization is a generalization of R1CS that provides the flexibility to construct more custom gates, reducing the overall number of constraints. This is done using selector columns (and therefore selector polynomials). With great flexibility, however, comes great complexity when large circuits are integrated to simulate an overall VM intent. The layout of the overall supercircuit integrating many other circuits becomes too inefficient (lots of unused area in the final table) and too complex to audit.
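A standard (illustrative) example of such a gate: one row equation with two selector polynomials,

$$q_{\mathrm{add}}\cdot(a+b-c)\;+\;q_{\mathrm{mul}}\cdot(a\cdot b-c)\;=\;0,$$

where setting $q_{\mathrm{add}}=1,\,q_{\mathrm{mul}}=0$ on a row makes it an addition gate and vice versa; rows where both selectors are zero are the unused, padded area referred to above.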
[_TODO: AIR arith_]
[TODO: more on padding in Plonkish see eg "Mapping Rows to the Evaluation Domain" in this]
[TODO: performance, complexity, and auditability. TLDR: circuits for small purpose-specific things, VM for larger things.]