# Nova zkVM
This is a note describing some ideas for building a VM using Nova/SuperNova (in future even with lookups/custom gates). Many of these ideas come from my prior experiments with [Nova-zkMIPS and thinking about efficient zkVMs](https://hackmd.io/@nibnalin/snark-intensity) (thanks also to Adhyyan, afkbyte, Sampriti, Ye, YT and others for discussion and brainstorms on it), and were then made concrete in collaboration with the PSE zkWASM team (Carlos, Chiro, Oskarth, Violet, others?) at the Vietnam Residency.
## High Level Innovations
1. Use SuperNova with massive proving parallelisation
- Because SuperNova is FFT-less, parallelisation potentially gives us much more mileage than it does for other approaches
2. Go straight from IR to IVC steps (using compiler hot-path optimisations)
- SuperNova is particularly good at making small blocks of repetitive compute cheap to prove, so the problem of ZKing an execution trace of a program can be made much more efficient by offloading these responsibilities to compiler pipelines. This is like making application/code-specific VMs on the fly (or, more accurately, making code-specific NIVC).
3. Use efficient vector commitment based state design
- We propose a new (to our knowledge) method to store and update VM state between execution steps -- while the general idea is quite classical (using vector commitments), we describe how to use KZG commitments for these in the context of Nova very efficiently.
The latter two are optional and may fail for various reasons, whereas the first is an idea sketched (if vaguely) in the original papers, so it is unlikely to have issues -- and it is also necessary for a Nova-based zkVM to compete with Halo2 or other proof systems.
## Using SuperNova with massive parallelisation
Here are some explanations for how to parallelise proving with Nova.
For background, we first share a view of typical sequential IVC:
![](https://i.imgur.com/gkkcsxR.png)
*What IVC typically looks like, and how typical SNARK recursion based proving for IVC works*
IVC is the repeated application of some step function (F) on some public state ($z_i$) with some extra/unknown witness input ($W_i$) at each step. In particular, IVC defines proofs at each step ($\pi_i$) that prove the execution of the first $i$ steps by themselves (in other words, $\pi_i$ proves that $F^{i}(z_0) = z_i$).
In the usual instantiation of IVC, we use SNARKs to construct these $\pi_i$ incrementally: at each step $\pi_i$ shows that 1) $F(z_{i-1}) = z_i$ and 2) $\pi_{i-1}$ verifies correctly. This strategy is described in more detail in other [talks](https://youtu.be/JgJPKwOEWSU) and [blogs](https://0xparc.org/blog/groth16-recursion).
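As a mental model, the chain above can be sketched in code (all names are hypothetical; `assert` checks stand in for the SNARK statements a real scheme would prove succinctly):

```python
def F(z):
    """Toy step function: in a real zkVM this would be one VM step."""
    return z + 1

def ivc_prove(z0, n):
    """Run n steps, producing z_n and a 'proof' per step.
    Here pi_i is just the transcript (z_{i-1}, z_i); a real scheme would
    emit a succinct proof that F(z_{i-1}) = z_i AND that pi_{i-1} verified."""
    z, proofs = z0, []
    for _ in range(n):
        z_next = F(z)
        proofs.append((z, z_next))  # stand-in for pi_i
        z = z_next
    return z, proofs

def ivc_verify(z0, zn, proofs):
    """Check the chain: each step applies F, and the endpoints match."""
    z = z0
    for (z_prev, z_next) in proofs:
        if z_prev != z or F(z_prev) != z_next:
            return False
        z = z_next
    return z == zn
```

The point of IVC is that the verifier never replays this loop: $\pi_i$ alone attests to the first $i$ steps.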
![](https://i.imgur.com/gvabeC9.png)
*How Nova replaces the typical SNARK verifier for a Folding Scheme, and the function the folding scheme serves.*
![](https://i.imgur.com/sm2Iklw.png)
*Opening the folding scheme black-box a bit more*
Notice that in this view, the Nova machinery provides a black box with the ability to prove IVC sequentially. Specifically, it assumes proofs are made in sequence, with a running instance $U_i$ being merged with the current step's instance $u_i$ at each step $i$.
### Parallelising Nova
![](https://i.imgur.com/n8ByXCg.png)
The rough idea for parallelising Nova is to use a binary tree structure for the computation instead. We compute incremental proofs of single steps, then repeatedly merge adjacent proofs, doubling the span each covers, until we've exhausted the entire trace.
More concretely, we use our folding scheme in the internal nodes to fold together proofs of half the computation trace, and additionally maintain some book-keeping and some checks of consistency for the $z$ values flowing through the computation.
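A minimal sketch of the tree shape, where each sub-proof is abstracted down to just the span of $z$-values it covers (real internal nodes fold Nova instances; the seam check below models the $z$-consistency book-keeping):

```python
def fold(left, right):
    """Merge two adjacent sub-proofs. Each sub-proof is modelled as a
    tuple (z_in, z_out) covering a contiguous span of the trace."""
    (a_in, a_out), (b_in, b_out) = left, right
    assert a_out == b_in, "z-values must agree at the seam"
    return (a_in, b_out)

def tree_fold(leaves):
    """Fold a power-of-two list of single-step proofs bottom-up.
    Every level is embarrassingly parallel: each pair can be folded
    by a different worker."""
    level = leaves
    while len(level) > 1:
        level = [fold(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]
```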
Here's a closer look at the modified function $F'$ in this case:
![](https://i.imgur.com/vMlc6Jk.png)
*Read this diagram left side first, then right. The top part is about how the function F' proceeds and verifies. The bottom W (witness) notes how the prover computes the necessary values for the folding step at the level above the current one.*
### NIVC: Non-uniform IVC
To get to SuperNova, we'll first define a VM-like notion of IVC: non-uniform IVC (NIVC). The only difference is that NIVC switches the function being run at each step (from some small set of hardcoded instructions).
![](https://i.imgur.com/kJKeNPF.png)
*$OP(z_i)$ selects which opcode to run at each step*
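A toy picture of a single NIVC step, with a made-up two-instruction set (both the opcode table and the selector rule are illustrative, not from the paper):

```python
# Hardcoded instruction set: opcode 0 increments, opcode 1 doubles.
OPS = [lambda z: z + 1, lambda z: z * 2]

def OP(z):
    """Opcode selector, computed from the current state:
    double when z is even, increment when z is odd."""
    return 1 if z % 2 == 0 else 0

def nivc_step(z):
    """One NIVC step: run whichever instruction OP(z) selects."""
    return OPS[OP(z)](z)
```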
### SuperNova
Nova's folding scheme only works when merging two instances of the same circuit. SuperNova's key innovation is a way to fold many circuits in _parallel_ for IVC, effectively allowing per-opcode IVC, which the authors define as non-uniform IVC.
Out of the box, the simplest way to run NIVC with Nova is to just treat it as regular IVC, but with switch/case conditions: at each step, choose which opcode to run inside the R1CS constraints. This yields an $O(C \cdot L)$-size circuit for each step (where $C$ = number of constraints for one opcode, $L$ = number of distinct opcodes).
SuperNova optimises by removing this switch/case conditional: instead, we maintain $L$ running instances _in parallel_, one for each opcode, and fold the current step's proof into the relevant opcode's running instance (with an added check that $F'$ picked this opcode correctly). Here's a diagrammatic explanation:
![](https://i.imgur.com/2NcDS72.png)
Notice that we've reduced the prover work from $O(N \cdot C \cdot L)$ to $O(N \cdot (C + L))$, a pretty significant asymptotic improvement.
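To make the asymptotics concrete, here's back-of-the-envelope arithmetic with assumed (not measured) parameters:

```python
# Illustrative numbers only: C constraints per opcode, L opcodes, N steps.
C, L, N = 10_000, 32, 1_000_000

naive_nivc = N * C * L    # switch/case circuit: pay for every opcode at every step
supernova  = N * (C + L)  # pay for one opcode's constraints plus opcode selection

print(naive_nivc // supernova)  # roughly L-fold savings
```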
### Parallelising SuperNova
Much like before, we'll make a binary tree structure, but now with many folding boxes (to contain the instances for each opcode).
![](https://i.imgur.com/WbnV1kC.png)
Much like before, this binary tree structure takes $O(N \cdot (C + L))$ total work to compute, with the folds at each level computable in parallel, bottom-up.
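Structurally, each tree node now carries one accumulator per opcode, and merging two nodes merges the accumulators slot-by-slot; a sketch with hypothetical data structures (lists of step IDs stand in for folded instances):

```python
L_OPS = 4  # number of distinct opcodes (illustrative)

def leaf(op, step_id):
    """A single-step node: one instance, placed in its opcode's slot."""
    accs = [[] for _ in range(L_OPS)]
    accs[op].append(step_id)
    return accs

def merge(left, right):
    """Merge two nodes' per-opcode accumulators. In real SuperNova this
    is one fold per opcode slot -- the slots are independent, so they
    can also be folded in parallel."""
    return [l + r for l, r in zip(left, right)]
```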
### Go straight from IR to IVC steps
The key observation here is that SuperNova works with arbitrary blocks of constraints, so the "opcodes" in SuperNova need not have anything to do with the real opcodes of the VM itself. While shifting around and adding opcodes does not change the total number of constraints we have to prove for a computation trace, it can reduce the cost of folding at each step.
To verify a folding, Nova adds some number of constraints (for hashing, checking the commitment openings, etc.). While we generally ignore these overhead constraints in asymptotics, let's account for them for practicality: call this number the recursion overhead/threshold $R$ (in practice, roughly 10,000 constraints). The cost of running $N$ steps of SuperNova is then $O(N \cdot (C + L + R))$, where we assume $C \gg L$. Notice that SuperNova lets us shift these recursion overhead constraints around by choosing opcode boundaries freely:
- This is much like writing ZK-specific compiler opcodes
- You get all the benefits of the hot-path optimisations compilers usually make
- It turns the problem of efficient VM design into a compiler pipeline problem (arguably a better long-term approach than TinyRAM-style designs)
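To see why shifting opcode boundaries helps, consider batching $k$ IR instructions into one SuperNova "opcode": the per-fold overhead $R$ is then amortised over $k$ instructions. Illustrative arithmetic with assumed numbers:

```python
# Assumed parameters (not measurements): C constraints per IR instruction,
# L opcodes, R recursion overhead per fold, N instructions in the trace.
C, L, R, N = 1_000, 32, 10_000, 1_000_000

def cost(k):
    """Total constraints proven when each fold covers k instructions:
    N/k folds, each paying k*C for the batched block plus L + R overhead."""
    return (N // k) * (k * C + L + R)

print(cost(1))   # one instruction per fold: overhead R paid N times
print(cost(16))  # sixteen per fold: overhead R paid N/16 times
```

Larger blocks trade per-fold overhead for bigger step circuits, so a compiler can pick $k$ (and block contents) per hot path.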
### Use a KZG-commitment machine state representation
- Memory is abstracted as "state" -> transform -> "state"; how should we use different vector commitments to represent this state?
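As a stub for the interface, here's a toy hash-based vector commitment (illustrative only; the proposal is to use KZG, which additionally gives constant-size openings and lets the commitment be updated in $O(1)$ group operations on a single-slot write, rather than recommitting from scratch as below):

```python
import hashlib

def commit(state):
    """Commit to the full state vector (toy: hash of all slots)."""
    h = hashlib.sha256()
    for x in state:
        h.update(int(x).to_bytes(8, "little"))
    return h.hexdigest()

def step(state, addr, value):
    """One "state" -> transform -> "state" transition: write `value`
    at `addr` and recommit. A KZG commitment would instead be updated
    incrementally from the old commitment and the (addr, delta) pair."""
    new_state = list(state)
    new_state[addr] = value
    return new_state, commit(new_state)
```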