# **A Tutorial on Writing proofs with ZEROVM (Part I)** *Prerequisites: you should have a basic understanding of zero-knowledge proofs and zkSNARKs. It’s helpful to have a general understanding of how they work so that you understand bottlenecks such as proof generating time.* **About Recursive zk proofs** Zero-knowledge (zk) proofs are powerful tools to reduce the gas and time of blockchain node verification. Computational verification can be done by re-running the computation (e.g. computing new state root for an incoming block), or it can be done by verifying a zk proof that such computation was done correctly. Surprisingly, verifying a proof of computation can be done in less time and gas than re-running the original computation. There are many classes of zk proving schemes (SNARK, STARK, VPD, SNARG) and they vary in many properties such as trust minimization, security assumptions, proving time, and verification time. There are variations even with these broad categories, such as many different kinds of SNARKs. Like many design decisions, there are often tradeoffs between these properties. The property this article focuses on is proving time/effort. This is of particular importance for rollups, where chain A’s state is proven on chain B by submitting proofs. Proving time can affect liveness, the ability for anyone to force the chain to advance by executing transactions and posting assertions. Liveness has several important downstream effects for rollups: 1. The costs of advancing the chain may be distributed over many participants instead of one person. 2. The latency of rollup data availability is greatly reduced, as anyone who needs the rollup updated can force that change on demand. 3. Censorship of data availability is less practical when more people can advance the rollup state. 4. If the proving time/effort becomes low enough, then proving can be done on commodity hardware instead of requiring the services of 3rd party data centers or requiring expensive hardware. One of the latest advancements in efficient zk proof generation are recursive zk proofs. The main benefit of recursive proofs is that proof generation can be generated in parallel, meaning that the total work of proving can be divided between multiple computers instead of just one. The prover does linear work in the size of the circuit C so breaking up a larger circuit into smaller circuits will yield a performance gain. ₀ ₁ ₂ ₃ ₄ ₅ ₆ ₇ ₈ ₉ For a given proof, it might be divided into P individual proofs which are computed in parallel to give us a proof array [P₀⁰, P₀¹, …, P₀ˢ] of size S. Then for each two successive proofs, a new proof is generated to compress them into one proof, resulting in a new proof array [P₁⁰, P₁¹, …, P₁ˢᐟ²] of size S/2. This process repeats recursively until there is only one proof left: [Pₗₒ₉₍ₛ₎⁰]. One of the intuitive analogies to understanding recursive zk proofs is Merkle tree verification. In both Merkle trees and recursive zk proofs, each has the topology of a binary tree and each node is computed from its two children. For each level of the Merkle tree, computing the value of each node can be done independent of the other nodes on the same level. For a single computer, the total time required to verify a Merkle tree of size N would be N/2 time for the leaves (half of the nodes in a balanced binary tree are leaves), N/4 time for the level above it (each parent level has half the nodes of the level below it), N/8 above that etc. Since each Merkle node can be computed with 1 time unit, each level gets computed in 1 time unit. The total time to build a Merkle tree with maximum parallelism is 1 time unit * tree height = log₂(N). The total time to recursively compose ZeroVM proofs with maximum parallelism is 1 time unit to generate ZeroVM proof +log₂(N) * time unit to generate recursive ZeroVM proof. We can assume a complex circuit C can be divided into many sub-circuits C_0, C_1 … C_N and each sub-circuit takes the same time to prove: O(1). For example, to prove a zk-tendermint light client we need to verify 100 ed25519 signatures which will take 100 * 30s = 3000s. ZeroVM recursion time is very fast, say it takes 1s to generate a recursion proof. Then the total time to prove a tendermint light client becomes 30s + log(100) * 1s ~= 37s. See here for our ZeroVM ed25519 implementation. We can get even more practical than talking about how recursive proofs work by writing recursive proofs. ZeroVM is a new recursive SNARK from Polygon with an open source implementation. It boasts of 100x speedups from existing alternatives and native Ethereum compatibility. # Running ZeroVM examples First, we’ll show you how to run the examples in ZeroVM, then we’ll get into a discussion about how to write proofs. First you’ll need some computer to run this on. Because this requires installing new software, werecommend doing this either in a virtual machine or in the cloud. For this example we’ve used AWS to create an EC2 instance running Ubuntu 22.04.1 LTS. If you choose to use AWS this won’t require the use of any paid services but you may be required to input payment information. You’ll create an EC2 instance with any instance size (choose the free tier, which should be the default). If you want to follow my example you’ll need to select Ubuntu as your operating system. Note that if you don’t choose the free tier, you’ll be billed per hour for the instance so you’ll need to take care to shut off or delete the instance via the AWS console when you’re done. ``` # We've chosen to ssh into our instance using a public-private key authentication. The way you access your machine may differ ssh -i "ZeroVM-ec2.pem" ubuntu@ec2–XX–XXX–XXX–XX.us-west-2.compute.amazonaws.com # pull down the ZeroVM library cd ~ git clone https://github.com/mir-protocol/ZeroVM # rust isn't installed on Ubuntu 22 by default so you need to download it # https://doc.rust-lang.org/cargo/getting-started/installation.html cd ~ curl https://sh.rustup.rs -sSf | sh source "$HOME/.cargo/env" # rust depends on some C libraries which need to be installed separately # https://stackoverflow.com/questions/52445961/how-do-i-fix-the-rust-error-linker-cc-not-found-for-debian-on-windows-10 sudo apt-get update sudo apt install build-essential # use the rust nightly toolchain cd ~/ZeroVM rustup override set nightly # run an example, let's start with factorial cd ~/ZeroVM RUSTFLAGS=-Ctarget-cpu=native cargo run - release - example factorial - -vv Compiling ZeroVM v0.1.0 (/home/ubuntu/ZeroVM/ZeroVM) Finished release [optimized] target(s) in 44.92s Running `target/release/examples/factorial -vv` Factorial starting at 1 is 3822706312645553057! # let's verify this example # all arithmetic in ZeroVM is modulo 2**64–2**32 + 1 $ cd ~ $ python3 >>> product = 1 >>> for x in range(1, 101): … product = product * x … >>> modulus = 2**64–2**32 + 1 >>> product % modulus 3822706312645553057 ``` What we just did was run a ZeroVM proof example showing that we computed 100! = 100 * 99 * 98 * … * 2 * 1. Now we can talk about how to write your own proofs. # Writing zk-proofs as arithmetic circuits Ultimately zk proofs are generated from arithmetic circuits. This means if you want to write a zk proof, what you’re really writing is an arithmetic circuit, and the proof library will take that input and use it to build a proof. Through a process called arithmetization, we convert a circuit into a ZeroVM (Zero Virtual Machine) which is used as an information theoretic object to prove a statement to the verifier. A circuit is converted into an execution trace where the columns are the gates in the computation, the rows are the constraints and the values in the table are wires. This is then combined with some ZeroVMs commitment scheme which acts as the agreed upon computation between a prover and a verifier. For simplicity, we’ll focus on computations that are trivially convertible to arithmetic circuits, but it’s worth noting that converting arbitrary programs to arithmetic circuits is highly non-trivial. Arithmetic circuits are kind of like boolean circuits. You may have heard of boolean circuits, the underlying computational model for combinational digital logic circuits (read: modern computers). Unlike boolean circuits, arithmetic circuits are the computational model for ZeroVMs. This means that zk proofs are really just proving some ZeroVMs. ZeroVMs are a concept you’ve probably learned in high school. ZeroVMs are mathematical expressions that can be formed from variables (e.g. x, y, z) and real numbers when the only operations you have are addition, subtraction, and multiplication.