In this multi-part series, we aim to make folding schemes less mysterious.
In the first part, we focus on answering a simple question: what is the fundamental advantage of folding schemes over alternative proving schemes? We also examine why some projects experience poor performance despite using folding. TLDR: it is because of sub-optimal choices in the components built around folding schemes (e.g., the circuits fed to them) rather than the folding schemes themselves.
In future parts, we will provide a gentle introduction to NeutronNova's and Neo's folding schemes.
Now, let's begin. We start with our central question.
By traditional monolithic proof systems, we mean proof systems such as Spartan (which powers Jolt, the state-of-the-art zkVM) and STARKs (which underlie other zkVMs).
First, monolithic proving systems follow a relatively elaborate pipeline:
Note that if we have a circuit with a few million constraints/rows in AIR, R1CS, or Plonkish (or their generalization CCS), the best proving system turns out to be Spartan. See this article for a deeper dive on Spartan. In the current article, we are concerned with very large computations (e.g., those that require many millions or billions of CPU cycles to run).
Folding-based systems, especially modern ones like NeutronNova, transform this paradigm by:
Second, monolithic proof systems cannot scale beyond a certain point, as they run out of memory. To address this, they use recursion: produce proofs of smaller statements and recursively compose those proofs, which incurs additional prover work. Folding schemes are designed to make recursion efficient, even without producing a proof in the first place (i.e., they simply commit to a witness and run a prefix of the PIOP).
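To make this concrete, here is a toy sketch of Nova-style folding for relaxed R1CS: two satisfying instances are combined into one via a random linear combination plus a cross term. Everything here is an illustrative assumption (the tiny field F_101, the single constraint x·x = y, the fixed challenge r = 7); a real system commits to the witness and error vectors and derives the challenge via Fiat-Shamir.

```rust
// Toy Nova-style folding of relaxed R1CS over F_101 (illustrative prime).
// Relaxed R1CS: (A z) ∘ (B z) = u * (C z) + E, componentwise.
const P: u64 = 101;

fn add(a: u64, b: u64) -> u64 { (a + b) % P }
fn sub(a: u64, b: u64) -> u64 { (a + P - b) % P }
fn mul(a: u64, b: u64) -> u64 { (a * b) % P }

// Matrix-vector product over F_P.
fn matvec(m: &[Vec<u64>], z: &[u64]) -> Vec<u64> {
    m.iter()
        .map(|row| row.iter().zip(z).fold(0u64, |acc, (&mij, &zj)| add(acc, mul(mij, zj))))
        .collect()
}

// Check the relaxed R1CS relation for one instance (z, u, E).
fn is_satisfied(a: &[Vec<u64>], b: &[Vec<u64>], c: &[Vec<u64>], z: &[u64], u: u64, e: &[u64]) -> bool {
    let (az, bz, cz) = (matvec(a, z), matvec(b, z), matvec(c, z));
    (0..az.len()).all(|i| mul(az[i], bz[i]) == add(mul(u, cz[i]), e[i]))
}

// Fold two instances with challenge r: z = z1 + r*z2, u = u1 + r*u2,
// E = E1 + r*T + r^2*E2, where T is the cross term between the instances.
fn fold(
    a: &[Vec<u64>], b: &[Vec<u64>], c: &[Vec<u64>],
    (z1, u1, e1): (&[u64], u64, &[u64]),
    (z2, u2, e2): (&[u64], u64, &[u64]),
    r: u64,
) -> (Vec<u64>, u64, Vec<u64>) {
    let (az1, bz1, cz1) = (matvec(a, z1), matvec(b, z1), matvec(c, z1));
    let (az2, bz2, cz2) = (matvec(a, z2), matvec(b, z2), matvec(c, z2));
    // T_i = Az1∘Bz2 + Az2∘Bz1 - u1*Cz2 - u2*Cz1, componentwise.
    let t: Vec<u64> = (0..az1.len())
        .map(|i| sub(
            add(mul(az1[i], bz2[i]), mul(az2[i], bz1[i])),
            add(mul(u1, cz2[i]), mul(u2, cz1[i])),
        ))
        .collect();
    let z: Vec<u64> = z1.iter().zip(z2).map(|(&x, &y)| add(x, mul(r, y))).collect();
    let u = add(u1, mul(r, u2));
    let e: Vec<u64> = (0..t.len())
        .map(|i| add(e1[i], add(mul(r, t[i]), mul(mul(r, r), e2[i]))))
        .collect();
    (z, u, e)
}

fn main() {
    // One toy constraint x * x = y, encoded as R1CS matrices over z = (x, y).
    let a = vec![vec![1, 0]];
    let b = vec![vec![1, 0]];
    let c = vec![vec![0, 1]];
    let (z1, z2) = (vec![3, 9], vec![4, 16]);
    assert!(is_satisfied(&a, &b, &c, &z1, 1, &[0]));
    assert!(is_satisfied(&a, &b, &c, &z2, 1, &[0]));
    // Fold with a fixed toy challenge r = 7; real systems use Fiat-Shamir.
    let (z, u, e) = fold(&a, &b, &c, (&z1[..], 1, &[0][..]), (&z2[..], 1, &[0][..]), 7);
    assert!(is_satisfied(&a, &b, &c, &z, u, &e));
    println!("folded instance satisfies relaxed R1CS");
}
```

Note that the prover never produces a full proof for either instance: the only new work per fold is computing (and, in a real system, committing to) the cross term.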
Let's unpack what that means in practice.
In many traditional proof systems, opening polynomial evaluations (step 3) is surprisingly expensive, often more expensive than committing to the original witness.
Folding eliminates the need for repeated polynomial evaluations. For example, NeutronNova:
This means fewer rounds, fewer commitments, faster prover times, and lower recursion overheads.
Note: Hypothetically, assuming that the prover has unbounded memory, NeutronNova's total prover work is similar to the monolithic prover time in sum-check-based proof systems (e.g., Uniform Spartan). However, unbounded memory is not realistic (e.g., monolithic proof systems that do not use recursion run out of memory when proving a single Ethereum block) and NeutronNova is able to achieve the same prover time efficiency (as the hypothetical solution) while providing significant space efficiency benefits.
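Since the systems discussed here (Uniform Spartan, NeutronNova) are sum-check-based, a minimal sketch of the sum-check protocol may help fix intuition about rounds and the final evaluation check. The code below runs prover and verifier together over a toy field F_97, with fixed challenges standing in for the verifier's random ones; all constants are illustrative assumptions.

```rust
// Minimal sum-check over a multilinear polynomial given by its evaluation
// table on {0,1}^n, over the toy field F_97. One round per variable; each
// round the prover sends a degree-1 univariate (as its values at 0 and 1).
const P: u64 = 97;

fn add(a: u64, b: u64) -> u64 { (a + b) % P }
fn sub(a: u64, b: u64) -> u64 { (a + P - b) % P }
fn mul(a: u64, b: u64) -> u64 { (a % P) * (b % P) % P }

// Returns true iff the claimed sum passes all round checks and the final
// evaluation check. `challenges` stands in for the verifier's randomness.
fn sum_check(mut evals: Vec<u64>, claimed_sum: u64, challenges: &[u64]) -> bool {
    let mut claim = claimed_sum;
    for &r in challenges {
        let half = evals.len() / 2;
        // Round polynomial g_i(X) is linear: send g_i(0) and g_i(1).
        let s0 = evals[..half].iter().fold(0u64, |acc, &v| add(acc, v));
        let s1 = evals[half..].iter().fold(0u64, |acc, &v| add(acc, v));
        if add(s0, s1) != claim {
            return false; // verifier's per-round consistency check
        }
        // New claim: g_i(r) by linear interpolation between g_i(0), g_i(1).
        claim = add(s0, mul(r, sub(s1, s0)));
        // Fold the table: fix the current variable to r.
        let next: Vec<u64> = (0..half)
            .map(|j| add(evals[j], mul(r, sub(evals[half + j], evals[j]))))
            .collect();
        evals = next;
    }
    // Final check: a single evaluation of the multilinear extension remains.
    evals[0] == claim
}

fn main() {
    // g over {0,1}^2 with evaluations [1, 2, 3, 4]; the true sum is 10.
    assert!(sum_check(vec![1, 2, 3, 4], 10, &[3, 5]));
    assert!(!sum_check(vec![1, 2, 3, 4], 11, &[3, 5]));
    println!("sum-check accepted the true claim and rejected the false one");
}
```

The point relevant to this article: the prover's work per round is a linear pass over the (halving) table, and at the end only one evaluation claim remains to be checked, typically against a polynomial commitment.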
Additional Note: Besides the folding scheme, another major contribution of NeutronNova is to reduce more complex statements (e.g., those that arise in zkVMs) to a form where folding can be applied. Concretely, NeutronNova describes reductions by adapting techniques from existing state-of-the-art monolithic zkVMs like Jolt. So, a well-engineered NeutronNova will only be faster than Jolt, while providing significant space efficiency. The recipe in NeutronNova can be used to reduce other techniques in existing monolithic settings to a form where folding can be applied.
Another massive benefit: folding enables efficient recursion.
Most proof systems can't scale to large computations in a single shot. They rely on recursion, but naively combining proofs incurs massive overhead, especially when the base prover is expensive. Moreover, some proof systems like STARKs use the so-called "brute-force" recursion, which is significantly more expensive. An alternative approach is the very recent streaming sum-check. However, leveraging this requires adopting a host of other techniques and is concretely more expensive than a non-streaming linear-time sum-check or a folding-based solution such as NeutronNova.
Folding-based systems like Nova shine here. Recursion is built in, cheap, and incremental. Proof composition is natural, and there's no need to re-prove everything from scratch.
Despite this promise, many projects (e.g., Nexus's zkVM) exhibit poor performance compared to solutions built on monolithic proving systems. These folding-based systems miss out on performance because of how they use folding, rather than folding itself.
Here are a few critical mistakes that we have seen in prior attempts to use folding schemes. Some projects fix some of these mistakes (e.g., switching from Merkle trees to offline memory checking) while simultaneously switching to a different proof system, and then attribute the improvement to the new proof system. This creates confusion.
Folding isn't just a theoretical win: it shines when backed by practical optimizations. Take multi-scalar multiplication (MSM), a core building block in ZK systems. Many libraries like halo2curves and arkworks don't optimize for small values, but Nova's commitment routines do.

Let's compare mean times (in milliseconds) for committing 2^20 scalars across different bit-widths. The benchmark code is available here.
| Input Bit-width | halo2curves Commit (ms) | Nova Specialized Commit (ms) |
|---|---|---|
| 1 | 226.72 | 8.66 |
| 10 | 249.75 | 15.73 |
| 16 | 370.24 | 34.76 |
| 32 | 417.82 | 52.37 |
| 64 | 282.60 | 98.53 |
This comparison makes two things clear:

1. Nova's specialized commitment routines are substantially faster at every bit-width, with the largest gains for the smallest values.
2. Smaller values can take more time to commit in halo2curves than bigger values.

NOTE: halo2curves takes more time to commit to 16-bit and 32-bit values than 64-bit values. We suspect this is due to work imbalance among rayon threads.
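For intuition on why small scalars commit faster, here is a toy bucket-style (Pippenger) MSM over the additive group Z_q as a stand-in for an elliptic-curve group; the modulus, window width, and inputs are all illustrative assumptions (a real implementation would replace `group_add` with curve point addition). The work scales with the number of c-bit windows, ceil(bits / c), which is exactly where small bit-widths win.

```rust
// Toy bucket MSM over Z_Q (stand-in for an EC group; Q is an arbitrary
// toy modulus, not a real curve order).
const Q: u64 = 1_000_000_007;

fn group_add(a: u64, b: u64) -> u64 { (a + b) % Q }

// Compute sum_i scalars[i] * bases[i], where scalars fit in `bits` bits,
// using windows of width `c` bits. Fewer bits => fewer windows => less work.
fn msm(bases: &[u64], scalars: &[u64], bits: u32, c: u32) -> u64 {
    let windows = (bits + c - 1) / c;
    let mask = (1u64 << c) - 1;
    let mut acc = 0u64;
    for w in (0..windows).rev() {
        // Shift the accumulator up by one window ("double" c times).
        for _ in 0..c {
            acc = group_add(acc, acc);
        }
        // Sort bases into buckets by the current c-bit digit of their scalar.
        let mut buckets = vec![0u64; 1usize << c];
        for (&g, &s) in bases.iter().zip(scalars) {
            let digit = ((s >> (w * c)) & mask) as usize;
            if digit != 0 {
                buckets[digit] = group_add(buckets[digit], g);
            }
        }
        // Running-sum trick: computes sum_j j * buckets[j] in ~2 * 2^c adds.
        let (mut running, mut sum) = (0u64, 0u64);
        for &b in buckets[1..].iter().rev() {
            running = group_add(running, b);
            sum = group_add(sum, running);
        }
        acc = group_add(acc, sum);
    }
    acc
}

fn main() {
    let (bases, scalars) = (vec![2, 3, 10], vec![5, 7, 1]);
    // 2*5 + 3*7 + 10*1 = 41, whatever bit-width we assume for the scalars.
    assert_eq!(msm(&bases, &scalars, 64, 4), 41);
    // But with a 3-bit bound, only one window is processed instead of sixteen.
    assert_eq!(msm(&bases, &scalars, 3, 4), 41);
    println!("msm = {}", msm(&bases, &scalars, 3, 4));
}
```

A specialized commit routine along these lines can also pick the window width `c` as a function of the scalar bit-width, which is one plausible reason Nova's small-value commits in the table above are so much cheaper than the general-purpose path.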
In Part 2, we'll explore the NeutronNova folding scheme and how to handle non-uniformity.