Nova zkVM

This is a note describing some ideas for building a VM using Nova/SuperNova (in future even with lookups/custom gates). Many of these ideas come from my prior experiments with Nova-zkMIPS and thinking about efficient zkVMs (thanks also to Adhyyan, afkbyte, Sampriti, Ye, YT and others for discussion and brainstorms on it), and were then made concrete in collaboration with the PSE zkWASM team (Carlos, Chiro, Oskarth, Violet, others?) at the Vietnam Residency.

High Level Innovations

Use SuperNova with massive proving parallelisation
- SuperNova being FFT-less potentially enables us to get a lot more mileage out of parallelisation over other approaches
Go straight from IR to IVC steps (using compiler hot-path optimisations)
- SuperNova particularly optimises at making small blocks of repetitive compute more efficient to prove, so the problem of ZKing an execution trace of a program can be made much more efficient by offloading these responsibilities to compiler pipelines. This is like making application/code-specific VMs on the fly (or more accurately, making code-specific NIVC).
Use efficient vector commitment based state design
- We propose a new (to our knowledge) method to store and update VM state between execution steps – while the general idea is quite classical (using vector commitments), we describe how to use KZG commitments for these in the context of Nova very efficiently.

The latter two are optional and subject to failure for various reasons, whereas the first one is an idea described vaguely in the original papers (so unlikely to have issues), and is also necessary to the success of a Nova-based zkVM vs. Halo2 or other proof systems.

Using SuperNova with massive parallelisation

Here are some explanations for how to parallelise proving with Nova.

For background, we first share a view of typical sequential IVC:

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

What IVC typically looks like, and how typical SNARK recursion based proving for IVC works

IVC is the repeated application of some step function (F) on some public state (

z_{i}

) with some extra/unknown witness input (

W_{i}

) at each step. In particular, IVC defines proofs at each step (

π_{i}

) that prove the execution of the first

i

steps by themselves (in other words,

π_{i}

proves that

F^{i} (z_{0}) = z_{i}

In the usual instantiation of IVC, we use SNARKs to construct these

π_{i}

incrementally: at each step

π_{i}

shows that 1)

F (z_{i - 1}) = z_{i}

and 2)

π_{i - 1}

verifies correctly. This strategy is described in more detail in other talks and blogs.

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

How Nova replaces the typical SNARK verifier for a Folding Scheme, and the function the folding scheme serves.

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

Opening the folding scheme black-box a bit more

Notice that in this view, the Nova machinery provides a black box with the ability to prove IVC sequentially. In specific, it assumes proofs are made in sequence, with a running instance

U_{i}

being merged with

u_{i}

at each step

i

Parallelising Nova

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

The rough idea for parallelizing Nova is by using a binary tree structure for computation instead. We will compute incremental proofs of single steps, double them in size, and continue until we've exhausted the entire trace.

More concretely, we use our folding scheme in the internal nodes to fold together proofs of half the computation trace, and additionally maintain some book-keeping and some checks of consistency for the

z

values flowing through the computation.

Here's a closer look at the modded function F' in this case:

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

Read this diagram left side first, then right. The top part is about how the function F' proceeds and verifies. The bottom W (witness) notes how the prover computes the necessary values for the folding step at the level above the current one.

NIVC: Non-uniform IVC

To get to SuperNova, we'll first define a VM-like notion of IVC: Non-uniform IVC. The only difference is that NIVC switches between the function being ran at each step (from some small set of hardcoded instructions).

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

$O P (z_{i})$ selects which opcode to run at each step

SuperNova

Nova's folding scheme only works when merging two instances of the same circuit. SuperNova's key innovation is to provide an idea for folding many circuits in parallel for IVC, virtually allowing for many opcode-IVC, defined by the authors as Non-uniform IVC.

Out of the box, the simplest way to run NIVC with Nova is to just treat it as regular IVC, but with switch/case conditions: At each step, choose which opcode to run in the R1CS constraints. This yields a

O (C \cdot L)

size circuit for each step (where

C

= number of constraints for one opcode,

L

= number of distinct opcodes).

SuperNova optimises by removing this switch/case conditional: instead, we maintain

L

running instances in parallel, one for each opcode, and fold the current step's proof to the relevant opcode (with an added check that

F^{'}

picked this opcode correctly). Here's a diagramatic explanation:

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

Notice that we've reduced the prover work done from

O (N \cdot C \cdot L)

O (N \cdot (C + L))

, a pretty significant asymptote.

Parallelising SuperNova

Much like before, we'll make a binary tree structure, but now with many folding boxes (to contain the instances for each opcode).

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

Much like before, this binary tree structure takes

O (N \cdot (C + L))

time to compute, but with parallelisation of folding supported in a bottom-up order.

Go straight from IR to IVC steps

The key observation here is that Supernova works with any blocks of constraints, so the "opcodes" in SuperNova need not have anything to do with the real opcodes of the VM itself. While shifting around and adding opcodes does not impact the total number of constraints we have to prove for a computation trace, it can help us reduce the cost of folding at each step.

To verify a folding, Nova adds some number of constraints (for hashing, checking the commitment openings etc.). While we generally ignore these overhead constraints in asymptotics, let's focus on them for practicality: let's call the number of these constraints the recursion overhead/threshold

R

(in practice, this number is 10000). The cost of running

N

steps of SuperNova is

O (N \cdot (C + L + R))

. Notice that SuperNova allows us to shift around these recursion overhead constraints:

Note also that we assume above that

O (C) >> O (L)

Much like writing ZK-specific compiler opcodes
You get all the benefits of hot path optimisations compilers usually make
Turns a problem of efficient VM design to a compiler pipeline problem (arguably a better way to TinyRAM approach long term)

Use KZG-commitment machine state repr

memory is abstracted as "state" -> transform -> "state", how to use different vector commitments to represent state?