In this multi-part series, we aim to make folding schemes less mysterious. In this first part, we focus on answering a simple question. We also examine why some projects exhibit poor performance despite using folding. TL;DR: the culprit is sub-optimal choices in the components built around folding schemes (e.g., the circuits fed to them), rather than the folding schemes themselves. In future parts, we will provide a gentle introduction to NeutronNova's and Neo's folding schemes.
Now, let's begin. We start with the most important question.
What is the fundamental advantage of folding schemes over traditional monolithic proving systems?
By traditional monolithic proof systems, we mean proof systems such as Spartan (which powers Jolt, the state-of-the-art zkVM) and STARKs (which underlie other zkVMs).
First, monolithic proving systems follow a relatively elaborate pipeline:

1. Commit to the full witness.
2. Run an interactive proof (a PIOP) over the committed polynomials.
3. Open the committed polynomials at the evaluation points requested by the verifier.
Note that if we have a circuit with a few million constraints/rows in AIR, R1CS, or Plonkish (or their generalization CCS), the best proving system turns out to be Spartan. See this article for a deeper dive on Spartan. In the current article, we are concerned with very large computations (e.g., those that require many millions of CPU cycles to run).
Folding-based systems, especially modern ones like NeutronNova, transform this paradigm: instead of running the full pipeline for every statement, the prover commits to each witness and runs only a prefix of the PIOP, folding the resulting instances together and deferring the expensive final steps to a single proof at the very end.
Second, monolithic proof systems cannot scale beyond a certain point, as they run out of memory. To address this, they use recursion: produce proofs of smaller statements and recursively compose those proofs. This incurs additional prover work. Folding schemes are designed to make recursion efficient, even without producing a proof in the first place (i.e., the prover simply commits to a witness and runs a prefix of the PIOP).
Let’s unpack what that means in practice.
In many traditional proof systems, opening polynomial evaluations (step 3) is surprisingly expensive—often more expensive than committing the original witness.
Folding avoids repeated polynomial evaluation openings. For example, NeutronNova folds many instances into one, so the expensive opening step runs only once, on the final folded instance.
This means fewer rounds, fewer commitments, faster prover times, and lower recursion overheads.
Another massive benefit: folding enables efficient recursion.
Most proof systems can’t scale to large computations in a single shot. They rely on recursion, but naively combining proofs incurs massive overhead, especially when the base prover is expensive. Moreover, some proof systems, such as STARKs, use so-called "brute-force" recursion, which is significantly more expensive. An alternative is the very recent streaming sum-check, but leveraging it requires adopting a host of other techniques and is concretely more expensive than a folding-based solution.
Folding-based systems like Nova shine here. Recursion is built in, cheap, and incremental. Proof composition is natural, and there’s no need to re-prove everything from scratch. That’s a game-changer for use cases like zkVMs.
Despite this promise, many projects exhibit poor performance with folding compared to solutions built on monolithic proving systems. They usually miss out on performance because of how they use folding, rather than folding itself. A few critical mistakes include feeding unnecessarily large or poorly structured circuits to the folding scheme and relying on unoptimized commitment primitives. In fact, MSM performance can vary drastically based on the implementation.
Folding isn't just a theoretical win: it shines when backed by practical optimizations. Take multi-scalar multiplication (MSM), a core building block in ZK systems. Many libraries, such as halo2curves and arkworks, don't optimize for small scalar values, but Nova's commitment routines do.
Let’s compare mean times (in milliseconds) for committing 2^20 scalars across different bit-widths. The benchmark code is available here.
| Input Bit-width | halo2curves Commit (ms) | Nova Specialized Commit (ms) |
|---|---|---|
| 1 | 226.72 | 8.66 |
| 10 | 249.75 | 15.73 |
| 16 | 370.24 | 34.76 |
| 32 | 417.82 | 52.37 |
| 64 | 282.60 | 98.53 |
This comparison makes two things clear:

- Nova's specialized commit exploits small scalar values: the smaller the bit-width, the faster the commitment.
- halo2curves gains little from small values: 1-bit scalars are barely cheaper to commit than bigger values.

👉 In Part 2, we’ll explore the NeutronNova folding scheme and how to handle non-uniformity.