# Neo: Lattice-based folding scheme for CCS over small fields and pay-per-bit commitments
## Introduction: Folding Schemes
A **folding scheme** is a powerful cryptographic primitive that takes two computational problems and "folds" them into a single, equivalent problem of the same type. Imagine you need to prove two separate statements: $C(w_1, x_1) = 1$ and $C(w_2, x_2) = 1$. Instead of running two full verifications, a folding scheme combines them into a single, folded statement $C(w, x) = 1$ that is valid if and only if the original two were.
When applied recursively, folding can lead to advanced cryptographic systems like:
- **Incrementally Verifiable Computation (IVC):** Allows for verifying long, ongoing computations by checking only the latest step, resulting in near-constant time verification regardless of the computation's length.
- **Proof-Carrying Data (PCD):** Enables a network of distributed computations where each node can produce a proof that is efficiently chained with others, creating a verifiable web of computation.
### The Bottleneck of Traditional SNARKs
Modern SNARKs are typically built by combining two components:
1. A **Polynomial Interactive Oracle Proof (PIOP)**, which reduces a computation's validity to a set of claims about polynomials.
2. A **Polynomial Commitment Scheme (PCS)**, which allows a prover to commit to a polynomial and later prove its evaluation at specific points.
This "monolithic" approach, while effective, suffers from two major drawbacks:
1. **High Prover Overhead:** The prover must perform a lot of work to construct the PIOP and the PCS evaluation arguments, often far exceeding the cost of the original computation.
2. **Expensive Recursion:** To prove large computations, one typically breaks them into smaller pieces and uses SNARK recursion. However, verifying a SNARK inside another circuit is costly, often requiring millions of gates.
### Folding Schemes: A More Efficient Path to Recursion
Folding schemes offer a more direct and efficient alternative. Instead of generating a full SNARK at each step, folding operates at the "statement" level, combining multiple instance-witness pairs *before* they are converted into a PIOP.
This leads to significant performance gains:
- **Dramatically Lower Recursion Cost:** Nova, a pioneering folding scheme, requires only about **10,000 R1CS gates** for its recursive verifier circuit. In contrast, traditional SNARK recursion can require millions of gates.
- **Reduced Prover Work:** The prover's work is dominated by committing to the witness, avoiding the expensive machinery of PIOPs and PCS evaluation arguments.
For computations where witnesses consist of small values (e.g., 32-bit integers instead of 256-bit field elements), the performance gap is even wider. Folding schemes can achieve speedups of **10x to 200x** over monolithic SNARKs like Marlin.
### The Evolution: From Nova to Neo
The landscape of folding schemes has evolved rapidly:
- **Nova** first formalized the concept for R1CS.
- **HyperNova**, **Protostar**, and others extended folding to more expressive relations like **CCS (Customizable Constraint System)**, which generalizes R1CS, Plonkish, and AIR. These schemes also added support for lookups and memory models.
However, these state-of-the-art schemes share a common foundation: they rely on cryptographic groups (typically elliptic curves) and are therefore built on discrete logarithm assumptions. This leads to two fundamental problems.
## Two Challenges: Field Size and Quantum Security
Despite their success, modern folding schemes face two significant limitations:
1. **Dependency on Large Prime Fields:** Elliptic curve cryptography requires working over large prime fields (typically 256 bits) for security. This means that even if a computation naturally uses 32-bit or 64-bit numbers, it must be "lifted" into a 256-bit field. This **embedding overhead** is inefficient and wasteful.
2. **Vulnerability to Quantum Attacks:** Their security is based on the hardness of the discrete logarithm problem, which is known to be breakable by quantum computers. They are not **post-quantum secure**.
We arrive at the following question: can we build a folding scheme that is not only efficient but also works over small, machine-friendly prime fields (like 64-bit fields) and is secure against quantum attacks?
### Recent Attempts and Their Trade-Offs
Several projects have tried to address this, but each comes with significant downsides:
- **Lova:** Based on unstructured lattices, but it's designed for the subset-sum problem, not general-purpose relations like CCS. Adapting R1CS to subset-sum is highly inefficient, and performance benchmarks show it is over **6,000x slower** than Nova.
- **Arc:** A hash-based scheme that avoids group assumptions. However, its verifier overhead is massive. Folding requires numerous Merkle tree openings, translating to over **800,000 R1CS constraints** for the verifier circuit, compared to Nova's ~10,000.
- **LatticeFold:** It uses structured lattices. However, it suffers from several performance issues:
* It operates over **cyclotomic polynomial rings**, not prime fields. These rings are computationally expensive (10-100x slower than field arithmetic) and are not compatible with popular small fields like Goldilocks.
* It lacks **pay-per-bit** efficiency. Committing to a vector of bits costs the same as committing to a vector of 64-bit values, removing a key optimization opportunity.
* It requires a large field extension (degree 4 for 64-bit fields), adding a **4x overhead** to its protocols.
:::info
#### What is a Cyclotomic Ring?
A **cyclotomic ring** is a mathematical structure built from polynomials. Think of it as clock arithmetic, but for polynomials instead of numbers.
- You start with a special polynomial called a **cyclotomic polynomial**, denoted $\Phi_n(X)$. For example, a common choice is $\Phi_{2d}(X) = X^d + 1$.
- In this ring, any polynomial is reduced modulo $\Phi_n(X)$. So, if we use $X^d + 1$, it means we treat $X^d + 1$ as zero, which implies $X^d \equiv -1$.
- Elements of the ring are polynomials of degree less than $d$. For example, in the ring $\mathbb{Z}_q[X]/(X^d+1)$, elements are polynomials with coefficients in the finite field $\mathbb{F}_q$.
While these rings have properties useful for lattice cryptography (like enabling fast multiplication via NTTs), performing arithmetic in them is significantly more complex and slower than in a simple prime field like $\mathbb{F}_{2^{61}-1}$.
:::
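The clock-arithmetic analogy above can be made concrete in a few lines. The following is a minimal sketch of negacyclic multiplication in $\mathbb{Z}_q[X]/(X^d+1)$, with toy parameters chosen for readability (not Neo's actual ones):

```python
# Minimal sketch of arithmetic in Z_q[X]/(X^d + 1): the rule X^d = -1
# wraps high-degree terms around with a sign flip ("negacyclic" reduction).
# q and d are toy values for readability, not Neo's parameters.
q = 17
d = 4

def ring_mul(a, b):
    """Multiply two ring elements given as length-d coefficient lists."""
    prod = [0] * (2 * d)               # schoolbook product, degree < 2d - 1
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] += ai * bj
    # Reduce with X^d = -1: the coefficient of X^(d+k) folds into X^k, negated.
    return [(prod[k] - prod[k + d]) % q for k in range(d)]

# Example: X^3 * X = X^4, which the reduction turns into -1.
print(ring_mul([0, 0, 0, 1], [0, 1, 0, 0]))  # [16, 0, 0, 0], i.e. -1 mod 17
```

Each ring multiplication costs $O(d^2)$ schoolbook operations (or $O(d \log d)$ with an NTT), versus a single multiply-and-reduce in a prime field, which is the cost gap the note above refers to.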
### In summary
While Lova, Arc, and LatticeFold are promising steps toward post-quantum and small-field-friendly folding schemes, none yet match the speed and practicality of elliptic curve-based approaches like Nova or HyperNova.
Current research is still grappling with how to:
- Avoid large fields without embedding overhead,
- Achieve post-quantum security without massive slowdowns,
- And maintain practical prover/verifier efficiency in real-world systems.
## Introducing Neo: A new folding scheme over small fields
**Neo** is a new folding scheme that solves two major problems in previous constructions:
1. They required large 256-bit fields, even for computations that naturally fit in smaller fields.
2. They were not post-quantum secure.
**Neo** addresses these challenges with a lattice-based construction. It provides a blueprint for building folding schemes that are:
1. **Natively over Small Fields:** Neo works directly with popular, machine-friendly prime fields like M61 ($2^{61}-1$) and the Goldilocks prime ($2^{64} - 2^{32} + 1$).
2. **Plausibly Post-Quantum Secure:** Its security is based on the hardness of the **Module-SIS (Short Integer Solution)** problem over lattices.
3. **Pay-Per-Bit Efficient:** Its novel commitment scheme's costs scale with the bit-width of the values being committed. Committing to a vector of bits is **64x cheaper** than committing to a vector of 64-bit integers.
4. **Compatible with CCS:** It provides a folding scheme for the general CCS relation.
### What makes Neo different?
Unlike previous systems like HyperNova (which uses elliptic curves) and LatticeFold (which uses cyclotomic rings), Neo:
- Defines constraints directly over small prime fields,
- Runs the sumcheck protocol over an *extension* of that small field,
- Avoids the need to "pack" small-field constraints into large rings.
This leads to much faster arithmetic—field multiplication in M61 takes less than 1 nanosecond, while ring multiplication in cyclotomic rings can take hundreds of nanoseconds.
### A new lattice-based commitment scheme
The core of Neo is a new commitment scheme built on **Ajtai's commitment scheme**. The key idea is a more efficient embedding of vectors from a small field $\mathbb{F}_q$ into a cyclotomic polynomial ring.
- Instead of complex NTT-based packing like in LatticeFold, Neo uses a simple **base-b decomposition**. Each field element is broken down into its base-$b$ digits, and these digits become the coefficients of a low-norm polynomial.
- This approach directly leads to **pay-per-bit costs**. A smaller number has fewer non-zero digits, resulting in a sparser polynomial that is cheaper to commit to.
- Crucially, this commitment scheme provides the necessary **linear homomorphism** property required for folding evaluation claims, but in the challenging lattice setting where norms must be carefully managed.
### Folding CCS with lattices
With this new commitment scheme, Neo adapts HyperNova's folding architecture to the lattice setting.
- It replaces HyperNova's elliptic-curve-based commitments with its own lattice-based ones.
- Like HyperNova, it runs a **single sumcheck protocol** at each folding step. However, in Neo, this sumcheck runs over a small extension of the base prime field (e.g., $\mathbb{F}_{q^2}$ for 128-bit security), which is far more efficient than the ring-based sumcheck in LatticeFold.
- To manage the norm growth inherent in lattice-based schemes, Neo's prover must prove that vectors are correctly decomposed. This introduces a small overhead, but it can be **amortized** by using **multi-folding** (folding many instances at once).
### Security Through Composable Reductions
Neo's security proof is structured as a sequence of three reductions of knowledge:
1. **$\Pi_{\text{DEC}}$ (Decomposition):** Reduces a claim about a high-norm vector into multiple claims about low-norm vectors.
2. **$\Pi_{\text{RLC}}$ (Random Linear Combination):** Folds multiple low-norm claims into a single high-norm claim.
3. **$\Pi_{\text{CCS}}$ (CCS Reduction):** Transforms a CCS claim into a set of evaluation claims compatible with the other reductions.
Interestingly, not all these reductions satisfy the standard definition of knowledge soundness. Neo introduces **relaxed notions of soundness** and proves that their sequential composition still yields a secure folding scheme.
### Parameter choices for performance
Neo can be instantiated with several highly efficient 64-bit fields:
- Mersenne prime $q = 2^{61} - 1$
- Goldilocks prime $q = 2^{64} - 2^{32} + 1$
- Almost-Goldilocks primes (AGL), tuned for better compatibility with NTTs
Neo runs the sumcheck protocol over an extension field like $\mathbb{F}_{q^2}$ to achieve 128-bit security.
### Extending to Lookups and Memory
Neo also supports lookup arguments and read-write memory checks.
By integrating with **Shout** and **Twist**, which both use sumcheck-based techniques, Neo can:
- Fold lookup relations alongside CCS,
- Perform efficient range checks,
- Use sparse vector commitments in the lattice setting.
This opens the door to folding more expressive programs, such as virtual machine traces, within the same framework.
### Post-quantum IVC and PCD
By plugging Neo into known IVC and PCD compilers, we get a plausibly post-quantum recursive proof system.
Neo supports:
- Native field operations in recursive verifiers,
- SNARK compression of IVC proofs via Spartan + FRI (both post-quantum),
- No elliptic curve cycles or "wrong field" tricks.
:::success
Even though Neo's lattice-based commitments don’t directly support efficient evaluation proofs, we can still use Spartan’s FRI-based arguments to prove knowledge of correct IVC proofs, keeping everything in small, efficient fields. We could even extend this to SuperSpartan + WHIR, which is exactly the setting currently supported by Whirlaway; see https://github.com/TomWambsgans/Whirlaway
:::
## Neo’s Lattice-Based commitments for folding
At the heart of Neo is a new lattice-based commitment scheme, designed specifically to work with small fields and to support efficient folding of CCS instances. This section breaks down what such a commitment needs to do and how Neo achieves it.
### What do we need from this commitment?
To enable folding in a zero-knowledge setting, the commitment scheme must commit to a vector $z \in \mathbb{F}^m$ (e.g. a CCS witness) and satisfy these three core properties:
1. **Linear Homomorphism for Folding Evaluations:** Given multiple commitments $c_i$ to vectors $z_i$, we need to be able to combine them into a single commitment $c$ to a combined vector $z$, such that an evaluation claim on $z$ corresponds to the combined evaluation claims on the $z_i$.
2. **Pay-per-bit cost:** Committing to low-bit-width values (like bits or 8-bit integers) should be much cheaper than committing to full 64-bit field elements.
3. **Compatibility with small fields and linear transforms:** The commitment scheme should support folding not just the witness itself, but also transformations like $Mz$ for some constraint matrix $M$.
### Why is this hard with lattices?
For discrete-log-based commitments like Pedersen or KZG, linear combinations are simple: you just add the commitments and evaluations using the same weights. But in lattice-based schemes (like Ajtai’s), commitment binding depends on the norm of the committed values.
So two key problems arise:
- Combining vectors increases their norm, possibly beyond what the scheme allows.
- Lattice-based commitments don't naturally scale with the bit-width of the values—they treat all inputs equally, regardless of how small they are.
LatticeFold tried to address this using an NTT (Number Theoretic Transform) representation inside cyclotomic rings. But this introduced high overhead and broke pay-per-bit behavior: committing to bits costs as much as committing to full field elements.
### Neo's Solution - Part 1: The Matrix Commitment Scheme
Neo's first innovation is a commitment scheme that views a vector of field elements $z \in \mathbb{F}_q^m$ as a **matrix** of its decomposed coefficients. This approach is important to achieve both post-quantum security and pay-per-bit efficiency.
#### The `Decomp` Map: From Vectors to Low-Norm Matrices
The process begins with a deterministic map, `Decomp_b`. This function takes a vector $z = [z_1, \dots, z_m]$ and decomposes each element $z_j$ into its base-$b$ digits, where $b$ is a small integer (e.g., $b=2$ for binary decomposition). This decomposition is represented by the formula:
$$z_j = \sum_{i=1}^{d} b^{i-1} Z_{i,j} \quad \text{such that} \quad |Z_{i,j}| < b$$
Here, $d$ is the number of digits required to represent any element in the field. The coefficients $Z_{i,j}$ form a matrix $Z \in \mathbb{F}_q^{d \times m}$, where each column contains the decomposed digits of the corresponding element from the original vector $z$.
This matrix $Z$ is then used in a modified **Ajtai commitment scheme**. Each column of $Z$ is treated as the coefficient vector of a polynomial in a cyclotomic ring $R_q = \mathbb{F}_q[X]/(\Phi_\eta(X))$:
$$z'_j = \sum_{i=1}^{d} Z_{i,j} \cdot X^{i-1} \in R_q$$
The final commitment is computed by multiplying a public matrix $\mathbf{M} \in R_q^{\kappa \times m}$ by the vector of these newly formed polynomials $\mathbf{z'} = [z'_1, \dots, z'_m]^\top$:
$$c = \mathbf{M} \cdot \mathbf{z'} \in R_q^\kappa$$
Since the original values in $z$ were small, their base-$b$ digits $Z_{i,j}$ are also small. This means the resulting polynomials $z'_j$ have a **low norm**, which is essential for the binding property of the Ajtai scheme. Furthermore, the cost of the multiplication $\mathbf{M} \cdot \mathbf{z'}$ is lower when the coefficients of the polynomials in $\mathbf{z'}$ are sparse or small. This directly achieves the **pay-per-bit** property.
:::success
#### Example: How `Decomp` works
Let's commit to the vector $z = [13, 5]$ over $\mathbb{F}_{17}$ using base $b=2$ and dimension $d=4$.
1. **Decompose each element**:
- For $z_1 = 13$:
$13 = \underbrace{1}_{Z_{4,1}} \cdot 2^3 + \underbrace{1}_{Z_{3,1}} \cdot 2^2 + \underbrace{0}_{Z_{2,1}} \cdot 2^1 + \underbrace{1}_{Z_{1,1}} \cdot 2^0$
The coefficient vector is $[1, 0, 1, 1]^\top$.
- For $z_2 = 5$:
$5 = \underbrace{0}_{Z_{4,2}} \cdot 2^3 + \underbrace{1}_{Z_{3,2}} \cdot 2^2 + \underbrace{0}_{Z_{2,2}} \cdot 2^1 + \underbrace{1}_{Z_{1,2}} \cdot 2^0$
The coefficient vector is $[1, 0, 1, 0]^\top$.
2. **Construct the Matrix $Z$**:
The resulting matrix $Z = \text{Decomp}_2(z)$ is formed by these columns:
$$
Z = \begin{pmatrix}
1 & 1 \\
0 & 0 \\
1 & 1 \\
1 & 0
\end{pmatrix} =
\left[
\overbrace{\begin{pmatrix} 1 \\ 0 \\ 1 \\ 1 \end{pmatrix}}^{\text{coeffs of } z'_1}
\overbrace{\begin{pmatrix} 1 \\ 0 \\ 1 \\ 0 \end{pmatrix}}^{\text{coeffs of } z'_2}
\right]
$$
The entries of this matrix are small (0s and 1s), so its **norm is low**.
3. **Form the Polynomials**:
- $z'_1 = 1 \cdot X^0 + 0 \cdot X^1 + 1 \cdot X^2 + 1 \cdot X^3 = 1 + X^2 + X^3$
- $z'_2 = 1 \cdot X^0 + 0 \cdot X^1 + 1 \cdot X^2 + 0 \cdot X^3 = 1 + X^2$
These low-norm polynomials form the vector $\mathbf{z'} = [z'_1, z'_2]^\top$, which is then committed to efficiently. If we had committed to a larger number like $2^{60}$, its decomposition would produce a denser polynomial, making the commitment more expensive. This is the essence of pay-per-bit.
:::
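The worked example above can be reproduced with a short sketch of the `Decomp_b` map (the function name and matrix layout are illustrative, not from any Neo implementation):

```python
# Sketch of the Decomp_b map: each field element becomes a column of base-b
# digits, so small values yield sparse columns (the pay-per-bit effect).
# Function name and layout are illustrative only.

def decomp(z, b, d):
    """Decompose each entry of z into d base-b digits; return the d x m matrix
    Z as a list of rows, so Z[i][j] is digit i of z[j] (least significant first)."""
    cols = []
    for v in z:
        digits = []
        for _ in range(d):
            digits.append(v % b)
            v //= b
        cols.append(digits)
    # Transpose the per-element columns into row-major form.
    return [[cols[j][i] for j in range(len(z))] for i in range(d)]

Z = decomp([13, 5], b=2, d=4)
print(Z)  # [[1, 1], [0, 0], [1, 1], [1, 0]] -- the matrix from the example
# Recomposition check: sum_i b^(i-1) * Z[i][j] recovers z_j.
assert all(sum((2 ** i) * Z[i][j] for i in range(4)) == [13, 5][j] for j in range(2))
```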
### From Single Commitments to Foldable Proofs
Part 1 established a secure, pay-per-bit method to commit to a single witness matrix $Z$. However, the ultimate goal of a folding scheme is not just to commit, but to efficiently **combine** multiple proofs.
To fold several commitments ($c_1, c_2, \dots$) into one ($c_{folded}$), the scheme needs a special algebraic property: **homomorphism**. We must be able to perform operations (like a random linear combination) on the public commitments and have those operations correspond directly to the same operations on the secret witnesses.
Part 2 introduces the machinery that provides this essential property. It leverages the structure of the underlying cyclotomic ring to enable homomorphic operations on our matrix commitments, turning them into a tool that can be used for folding.
### Neo's Solution - Part 2: Homomorphism Over Matrices
To achieve the required homomorphism, Neo defines its commitment as an **S-module homomorphism**, where $S$ is a special ring of matrices that mirrors the arithmetic of the underlying cyclotomic ring. In simple terms, this is a generalization of a **linear map** that preserves the core algebraic structure of the system. It guarantees that for any witness matrices $Z_1, Z_2$ and any "scalars" (rotation matrices) $\rho_1, \rho_2 \in S$, the following property holds:
$$\text{Commit}(\rho_1 Z_1 + \rho_2 Z_2) = \rho_1 \text{Commit}(Z_1) + \rho_2 \text{Commit}(Z_2)$$
This is the engine that enables folding, as it ensures that performing a linear combination on the secret witnesses and then committing is identical to committing first and then performing the same linear combination on the public commitments.
#### Rotation Matrices: Turning Ring Multiplication into Matrix Multiplication
For any polynomial $a \in R_q$, we can construct a matrix, denoted $\text{rot}(a)$ (rotation matrix), that perfectly simulates multiplication by $a$. Specifically, for any other polynomial $b \in R_q$, the following identity holds:
$$\underbrace{\text{rot}(a)}_{\text{matrix in } \mathbb{F}_q^{d \times d}} \cdot \underbrace{\text{cf}(b)}_{\text{coeffs of } b} = \underbrace{\text{cf}(a \cdot b)}_{\text{coeffs of the product}}$$
where $\text{cf}(\cdot)$ is a function that returns the coefficient vector of a polynomial.
Complex polynomial multiplication in the ring $R_q$ becomes simple matrix-vector multiplication in the space $\mathbb{F}_q^d$. The set of all such rotation matrices, $S = \{ \text{rot}(a) | a \in R_q \}$, forms a ring of matrices that is isomorphic to $R_q$ itself.
Since $S$ is isomorphic to $R_q$, Neo's commitment scheme is an **S-module homomorphism** with respect to these rotation matrices: linear combinations with weights drawn from $S$ pass freely through the commitment, exactly as stated above. This algebraic structure is precisely what we need to fold multiple claims into one.
:::success
#### Example: A Simple Rotation Matrix
Let's use a simple cyclotomic ring $R_q = \mathbb{F}_q[X]/(X^2+1)$, where $d=2$. An element $a = a_0 + a_1 X$ has the coefficient vector $\text{cf}(a) = [a_0, a_1]^\top$.
The rotation matrix for $a$ is:
$$\text{rot}(a) = \begin{pmatrix} a_0 & -a_1 \\ a_1 & a_0 \end{pmatrix}$$
Now, let's multiply $a$ by another element $b = b_0 + b_1 X$.
- **In the ring:** $a \cdot b = (a_0 b_0 - a_1 b_1) + (a_0 b_1 + a_1 b_0)X$.
- **With the matrix:**
$$\text{rot}(a) \cdot \text{cf}(b) = \begin{pmatrix} a_0 & -a_1 \\ a_1 & a_0 \end{pmatrix} \begin{pmatrix} b_0 \\ b_1 \end{pmatrix} = \begin{pmatrix} a_0 b_0 - a_1 b_1 \\ a_1 b_0 + a_0 b_1 \end{pmatrix}$$
The resulting vector is exactly the coefficient vector of the product $a \cdot b$. This demonstrates how rotation matrices perfectly capture ring multiplication.
:::
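The $d=2$ example above is easy to check mechanically. A minimal sketch, with a toy modulus:

```python
# Toy check that rot(a) . cf(b) == cf(a * b) in R_q = F_q[X]/(X^2 + 1).
# q = 17 is an illustrative modulus.
q = 17

def rot(a0, a1):
    """rot(a) for a = a0 + a1*X: columns are cf(a * 1) and cf(a * X)."""
    return [[a0 % q, -a1 % q],
            [a1 % q,  a0 % q]]

def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) % q for row in M]

def ring_mul(a, b):
    """(a0 + a1 X)(b0 + b1 X) mod X^2 + 1, using X^2 = -1."""
    (a0, a1), (b0, b1) = a, b
    return [(a0 * b0 - a1 * b1) % q, (a0 * b1 + a1 * b0) % q]

a, b = [3, 5], [2, 7]
assert mat_vec(rot(*a), b) == ring_mul(a, b)  # matrix route == ring route
```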
#### Folding Evaluation Claims
With this homomorphic property, folding multiple evaluation claims becomes straightforward. Suppose we have $k$ instance-witness pairs to fold. For each $i \in [k]$, we have:
- A commitment $c_i = \text{Commit}(Z_i)$.
- A claim that the multilinear extension of $Z_i$ evaluates to $y_i$ at a random point $r$.
The folding process works as follows:
1. The verifier sends a random challenge, which is a rotation matrix $\rho \in S$.
2. The prover and verifier compute the folded commitment $c_{folded}$ and folded evaluation $y_{folded}$ using powers of $\rho$:
$$c_{folded} = \sum_{i=1}^{k} \rho^i \cdot c_i \quad \text{and} \quad y_{folded} = \sum_{i=1}^{k} \rho^i \cdot y_i$$
3. By the S-module homomorphism property, the folded commitment is a valid commitment to the folded witness matrix $Z_{folded} = \sum_{i=1}^{k} \rho^i \cdot Z_i$:
$$c_{folded} = \sum_{i=1}^{k} \rho^i \cdot \text{Commit}(Z_i) = \text{Commit}\left(\sum_{i=1}^{k} \rho^i \cdot Z_i\right) = \text{Commit}(Z_{folded})$$
The folded evaluation $y_{folded}$ is also the correct evaluation of the folded witness $Z_{folded}$ at the point $r$. This means the single folded instance $(c_{folded}, y_{folded})$ is valid if and only if all the original $k$ instances were valid.
:::success
#### Example: Folding Two Claims into One
Imagine we have two separate claims to verify:
1. **Claim A:** Prover knows a witness matrix $Z_A$. The verifier has a commitment $c_A = \text{Commit}(Z_A)$ and a claimed evaluation $y_A$ at a point $r$.
2. **Claim B:** Prover knows a witness matrix $Z_B$. The verifier has a commitment $c_B = \text{Commit}(Z_B)$ and a claimed evaluation $y_B$ at the same point $r$.
Instead of checking both, we can fold them into a single, equivalent claim.
**The Folding Step:**
1. The verifier picks a random challenge matrix $\rho$.
2. Both parties compute a weighted sum of their data using $\rho$.
- **Prover (Secretly):** Folds the witnesses.
$$Z_{folded} = \rho \cdot Z_A + \rho^2 \cdot Z_B$$
- **Verifier (Publicly):** Folds the commitments and evaluations.
$$c_{folded} = \rho \cdot c_A + \rho^2 \cdot c_B$$ $$y_{folded} = \rho \cdot y_A + \rho^2 \cdot y_B$$
**With Homomorphism:**
Because the commitment scheme is homomorphic, the verifier's public calculation perfectly matches the prover's secret one:
$$
\begin{aligned}
\underbrace{c_{folded}}_{\text{Verifier's folded commitment}} &= \rho \cdot \text{Commit}(Z_A) + \rho^2 \cdot \text{Commit}(Z_B) \\
&= \text{Commit}(\underbrace{\rho \cdot Z_A + \rho^2 \cdot Z_B}_{\text{Prover's folded witness}})
\end{aligned}
$$
The verifier now has a single claim to check: that the witness committed to in $c_{folded}$ evaluates to $y_{folded}$ at point $r$. This single check is valid if and only if both original claims were valid. We've successfully reduced two problems to one!
:::
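The fold-then-commit versus commit-then-fold identity can be checked end to end with a toy Ajtai-style commitment over $R_q = \mathbb{F}_q[X]/(X^d+1)$. Everything below (the modulus, dimensions, seeded "random" matrix, and the challenge $\rho = X$) is illustrative only:

```python
# End-to-end check of "fold commitments == commit folded witness" for a toy
# Ajtai-style commitment c = M . z' over R_q = F_q[X]/(X^d + 1). The modulus,
# dimensions, seeded matrix, and challenge rho = X are all illustrative.
import random

q, d = 17, 4
random.seed(0)

def ring_add(a, b):
    return [(x + y) % q for x, y in zip(a, b)]

def ring_mul(a, b):  # negacyclic multiplication modulo X^d + 1
    prod = [0] * (2 * d)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] += ai * bj
    return [(prod[k] - prod[k + d]) % q for k in range(d)]

def commit(M, zs):
    """c = M . zs, with M a kappa x m matrix over R_q and zs a length-m vector."""
    c = []
    for row in M:
        acc = [0] * d
        for mij, zj in zip(row, zs):
            acc = ring_add(acc, ring_mul(mij, zj))
        c.append(acc)
    return c

kappa, m = 2, 3
M = [[[random.randrange(q) for _ in range(d)] for _ in range(m)] for _ in range(kappa)]
zA = [[random.randrange(2) for _ in range(d)] for _ in range(m)]  # low-norm witness A
zB = [[random.randrange(2) for _ in range(d)] for _ in range(m)]  # low-norm witness B

rho = [0, 1, 0, 0]          # challenge rho = X (a typical low-expansion choice)
rho2 = ring_mul(rho, rho)   # rho^2

# Prover folds witnesses; verifier folds commitments, both with the same weights.
z_folded = [ring_add(ring_mul(rho, a), ring_mul(rho2, b)) for a, b in zip(zA, zB)]
cA, cB = commit(M, zA), commit(M, zB)
c_folded = [ring_add(ring_mul(rho, x), ring_mul(rho2, y)) for x, y in zip(cA, cB)]

assert c_folded == commit(M, z_folded)  # the homomorphism holds
```

The identity holds because the commitment is $R_q$-linear in the witness vector: multiplying every witness entry by $\rho$ commutes with the matrix-vector product $M \cdot \mathbf{z'}$.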
### Managing Norms and Ensuring Security
The final piece is managing the **norm growth** caused by the linear combinations, and ensuring the scheme remains secure despite it.
#### The Problem: Norms Explode
The folded witness $Z_{folded} = \sum \rho^i Z_i$ is a weighted sum of matrices. This sum will naturally have a larger norm than the individual matrices $Z_i$. If this norm grows too large, it can exceed the bound required for the Ajtai commitment scheme to be binding.
#### The Solution: Carefully Chosen Challenge Sets
To control this, the challenges $\rho$ are not chosen from the entire set $S$. They are sampled from a carefully constructed subset $C \subseteq S$ called a **strong sampling set**, which must satisfy two properties:
1. **Invertibility of Differences**: For any two distinct elements $\rho, \rho' \in C$, their difference $(\rho - \rho')$ must be an invertible matrix. This is a technical requirement for the security proof, ensuring that the knowledge extractor can solve for a unique witness.
2. **Bounded Expansion Factor**: The set must have a known, small **expansion factor** $T$. This factor bounds how much multiplication by an element from $C$ can increase a vector's norm:
$$\max_{\rho \in C, v \in \mathbb{F}_q^d} \frac{\|\rho v\|_\infty}{\|v\|_\infty} \le T$$
By knowing $T$, we can precisely calculate an upper bound on the norm of the folded witness $Z_{folded}$ and ensure our system parameters are set securely.
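To see what an expansion factor looks like concretely, the sketch below brute-forces $T$ for the idealized challenge set of signed monomials $\pm X^i$, which only permute and negate coefficients. This toy set is for illustration; it is not Neo's actual strong sampling set:

```python
# Brute-force the expansion factor T of the signed-monomial set {+/-X^i} in
# Z[X]/(X^d + 1), tracking norms over the integers (no modular wraparound).
# This idealized set is for illustration; Neo's strong sampling set differs.
import itertools

d = 4

def negacyclic(a, b):
    """Negacyclic product over the integers, so norms can be measured exactly."""
    prod = [0] * (2 * d)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] += ai * bj
    return [prod[k] - prod[k + d] for k in range(d)]

challenges = []
for i in range(d):
    mono = [0] * d
    mono[i] = 1
    challenges.append(mono)                 # +X^i
    challenges.append([-x for x in mono])   # -X^i

T = 0.0
for rho in challenges:
    for v in itertools.product(range(-2, 3), repeat=d):
        nv = max(abs(x) for x in v)
        if nv == 0:
            continue
        T = max(T, max(abs(x) for x in negacyclic(rho, list(v))) / nv)
print(T)  # 1.0: signed monomials only permute and negate, so norms never grow
```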
#### Relaxed Binding: A New Security Notion for Folding
Standard commitment binding is not sufficient in a folding context. A malicious prover might be able to create a valid-looking folded commitment $c_{folded}$ without knowing valid witnesses for the original commitments.
Neo relies on a stronger security notion called **(d, m, B, C)-relaxed binding**. This property guarantees that the algebraic structure of the folding process must be respected. Formally, it states that it's computationally infeasible for an adversary to find:
- A commitment $c$.
- Two different linear combination matrices, $\Delta_1$ and $\Delta_2$, built from elements of the challenge set $C$.
- Two witness matrices $Z_1, Z_2$ with norm less than $B$.
Such that:
$$
\begin{aligned}
\underbrace{\Delta_1 \cdot c = \text{Commit}(Z_1) \quad \text{and} \quad \Delta_2 \cdot c = \text{Commit}(Z_2)}_{\text{The commitments appear valid...}} \\
\text{but} \quad \underbrace{\Delta_1 Z_2 \neq \Delta_2 Z_1}_{\text{...but the underlying algebra is inconsistent.}}
\end{aligned}
$$
This ensures that any valid folded commitment must correspond to a correctly folded witness. Neo proves that if the underlying Ajtai scheme is binding against a slightly larger norm, it also satisfies this crucial relaxed binding property.
## Neo's Folding Scheme for CCS
With the commitment scheme in place, we can now construct the full folding scheme for CCS. The scheme is a composition of three reductions that work together to fold a CCS instance into a running "accumulator" instance.
First, we lift the standard CCS relation to the matrix setting.
**Matrix CCS (MCS):** An instance consists of a commitment $c$ and public input $x$. A witness is a matrix $Z$. The relation holds if $c$ is a commitment to $Z$ and $Z$ satisfies the CCS constraints.
**Matrix Evaluation (ME):** An instance is a commitment $c$, a point $r$, and an evaluation $y$. A witness is a matrix $Z$. The relation holds if $c$ commits to $Z$ and the multilinear extension of $Z$ evaluates to $y$ at $r$.
The folding protocol, $\Pi$, is the composition $\Pi_{\text{DEC}} \circ \Pi_{\text{RLC}} \circ \Pi_{\text{CCS}}$. It folds one new MCS instance into a running ME instance; the next subsections define these relations precisely and then trace the flow.
### CCS in the Matrix Setting
We begin by lifting the CCS relation to operate over matrices.
**Definition (Matrix Constraint System - MCS):**
The **Matrix CCS (MCS)** relation formally defines a valid instance-witness pair in Neo. For a pair to be valid, two main conditions must be met.
First, the public commitment $c$ must correctly commit to the witness matrix $Z$, which is the base-$b$ decomposition of the full witness vector $z$.
Second, the witness must satisfy the circuit's constraints: the CCS constraint polynomial $f$ must evaluate to zero across the entire computational domain.

:::success
#### MCS Example
Let's use a simple constraint: $w_1 \times w_2 - w_3 = 0$. The constraint polynomial is $f(v_1, v_2, v_3) = v_1 v_2 - v_3$.
A prover has a secret witness $w = [3, 4, 12]$ and provides a public commitment $c$. For this to be a valid MCS instance, the formula requires:
$$
\text{MCS}(b, L) := \left\{ \dots :
\begin{cases}
\underbrace{c = \mathcal{L}(\text{Decomp}_2([3, 4, 12]))}_{\text{1. Commitment must match the witness}} \\
\underbrace{f(3, 4, 12) = 3 \cdot 4 - 12 = 0}_{\text{2. Witness must satisfy the constraints}}
\end{cases}
\right\}
$$
Since both conditions hold (assuming the commitment is correct), the instance-witness pair is a valid member of the `MCS` relation and is ready to be folded.
:::
**Definition (Matrix Evaluation - ME):**
The **Matrix Evaluation ($ME$)** relation is the standardized format for the running "accumulator" in Neo's folding scheme. It represents a simple, uniform claim: that a committed witness matrix $Z$ evaluates to a specific set of values $\{y_j\}$ at a random point $r$. This standardized structure is much simpler to fold recursively than a full $CCS$ instance.
For an instance and a witness $Z$ to be in the $ME(b, L)$ relation, three conditions from the formula must hold. The public commitment $c$ must be a valid, low-norm commitment to $Z$. Crucially, for each of the $t$ constraint matrices $M_j$ from the $CCS$ structure, the claimed public evaluation $y_j$ must be the true evaluation of the transformed witness $M_jz$ at the random point $r$. This is formally written as $y_j = Z M_j^T \hat{r}$.

:::success
#### $ME$: A Standardized Claim Example
We have a folded proof and need to check its evaluation. This is what the $ME$ relation does.
The verifier has a public instance: a commitment $c$, a random point $r$, and a claimed evaluation $y_1$. A prover knows the secret witness matrix $Z$. For this to be a valid $ME$ instance, the formula requires:
$$
\text{ME}(b, L) := \left\{ \dots :
\begin{cases}
\underbrace{c = \mathcal{L}(Z) \quad \land \quad \|Z\|_\infty < b}_{\text{1. Commitment must be valid and low-norm}} \\
\underbrace{y_1 = Z M_1^T \hat{r}}_{\text{2. Claimed evaluation must match the true evaluation}}
\end{cases}
\right\}
$$
Here, $Z M_1^T \hat{r}$ represents the true evaluation of the first part of the witness (as selected by matrix $M_1$) at point $r$. If the prover's claimed $y_1$ matches this value, and the commitment is valid, the $ME$ claim holds. This simple, repeatable structure is perfect for the accumulator in the folding cycle.
:::
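The evaluation check $y_1 = Z M_1^T \hat{r}$ can be spelled out with toy dimensions. The tensor-point construction below uses one common bit-ordering convention; all dimensions and values are illustrative:

```python
# Toy ME check: y1 = Z . M1^T . r_hat, where r_hat encodes multilinear
# evaluation at r as a tensor product. Dimensions, values, and the
# bit-ordering convention in tensor_point are illustrative.
q = 17

def tensor_point(r):
    """r_hat for point r: the Kronecker product of the pairs (1 - r_i, r_i)."""
    r_hat = [1]
    for ri in r:
        r_hat = [x * (1 - ri) % q for x in r_hat] + [x * ri % q for x in r_hat]
    return r_hat

def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) % q for row in M]

Z  = [[1, 0, 1],
      [0, 1, 1]]              # witness matrix: d=2 rows, m=3 columns
M1 = [[1, 0, 0],
      [0, 1, 0],
      [0, 0, 1],
      [1, 1, 1]]              # constraint matrix: n=4 rows (2 variables), m=3
r = [5, 9]                    # random evaluation point in F_q^2
r_hat = tensor_point(r)       # length n = 4

M1T = [[M1[i][j] for i in range(4)] for j in range(3)]   # transpose of M1
claimed = mat_vec(Z, mat_vec(M1T, r_hat))                # honest prover's y1

# Verifier recomputes the right-hand side: the honest claim passes,
# and any tampered claim fails.
assert claimed == mat_vec(Z, mat_vec(M1T, r_hat))
assert [(claimed[0] + 1) % q] + claimed[1:] != mat_vec(Z, mat_vec(M1T, r_hat))
```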
### The Folding Cycle: A 3-Step Process
The full folding protocol, $\Pi$, is the composition $\Pi = \Pi_{\text{DEC}} \circ \Pi_{\text{RLC}} \circ \Pi_{\text{CCS}}$. It takes a new $CCS$ instance and folds it into a running accumulator of $ME$ instances. Let's trace the flow.
#### Step 1: $\Pi_{\text{CCS}}$ (The Unifier)
- **Transformation:** $\text{MCS}(b, L) \times \text{ME}(b, L)^{k-1} \to \text{ME}(b, L)^k$
- **Goal:** Convert the complex, algebraic $CCS$ constraints into the same standardized "evaluation claim" format as the running accumulator.
The prover takes the new $CCS$ claim and the existing accumulator of $ME$ claims and combines them into a single, large multivariate polynomial $Q$. The **sumcheck protocol** then reduces the proof that this polynomial sums to zero into a single, much simpler evaluation claim at a fresh random point $r'$. This new claim effectively "unifies" the varied structures of the inputs.
:::success
#### $\Pi_{CCS}$ Example: From Mixed Claims to a Uniform Set
Imagine our accumulator holds just one `ME` claim, and we want to fold in a new `MCS` claim.
- **Input 1 (Accumulator):** An `ME` claim stating that a committed witness $Z_{acc}$ evaluates correctly.
- **Input 2 (New Proof):** An `MCS` claim stating that a new witness $Z_{new}$ satisfies the constraint $w_1 \times w_2 - w_3 = 0$.
The prover builds a polynomial $Q(i, x)$ where a selector variable $i$ chooses which claim to check:
- $Q(0, x)$ is constructed to be zero if and only if the **$MCS$ claim** is true.
- $Q(1, x)$ is constructed to be zero if and only if the **$ME$ claim** is true.
The sumcheck protocol then proves $\sum_{i \in \{0,1\}, x} Q(i, x) = 0$. The output of this protocol is a single new evaluation claim, $Q(r'_i, r'_x) = y'$, at a random point $(r'_i, r'_x)$. This single claim now implicitly proves both of the original, differently structured claims. The system now holds a set of uniform $ME$ claims.
:::
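The selector construction can be sketched numerically. This is only an illustration with made-up values, and it suppresses the $x$ variables and the interactive rounds of sumcheck: it just exhibits that the combined polynomial's sum vanishes exactly when both underlying claims hold.

```python
# Illustrative only: Q(0) encodes the MCS residual w1*w2 - w3 and Q(1)
# encodes the ME residual <z_acc, r_hat> - y1. The real protocol proves
# the hypercube sum is zero via sumcheck rather than direct evaluation;
# all concrete values here are invented.
w1, w2, w3 = 3, 5, 15                  # new MCS witness: w1 * w2 - w3 = 0
z_acc, r_hat, y1 = [2, 4], [7, 1], 18  # accumulator's ME claim

def Q(i):
    if i == 0:   # MCS branch: zero iff the constraint is satisfied
        return w1 * w2 - w3
    else:        # ME branch: zero iff the claimed evaluation is correct
        return sum(a * b for a, b in zip(z_acc, r_hat)) - y1

total = sum(Q(i) for i in (0, 1))      # 0 when both claims hold
```

(In the real protocol the two branches are also batched with random coefficients so that a violation in one branch cannot cancel against the other.)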
#### Step 2: $\Pi_{\text{RLC}}$ (The Compressor)
- **Transformation:** $\text{ME}(b, L)^{k+1} \to \text{ME}(B, L)$
- **Goal:** Compress all $k+1$ low-norm instances from the previous step into a single, high-norm instance.
The verifier sends a random challenge matrix $\rho$, and both parties compute a random linear combination of all their components. The norm of the resulting folded witness grows from a bound of $b$ to a larger bound of $B$.
:::success
#### $\Pi_{RLC}$ Example: Compressing Two Claims into One
Let's say the previous step produced two `ME` claims (with witness matrices $Z_0$ and $Z_1$).
- **Claim 0:** Instance $(c_0, r', y_0)$ for witness $Z_0$ where $\|Z_0\|_\infty < b$.
- **Claim 1:** Instance $(c_1, r', y_1)$ for witness $Z_1$ where $\|Z_1\|_\infty < b$.
The verifier sends a random challenge matrix $\rho$.
- **Verifier computes:**
- $c_{folded} = c_0 + \rho \cdot c_1$
- $y_{folded} = y_0 + \rho \cdot y_1$
- **Prover computes:**
- $Z_{folded} = Z_0 + \rho \cdot Z_1$
The result is a **single** `ME` instance $(c_{folded}, r', y_{folded})$ for the witness $Z_{folded}$. Because of the combination, the norm of $Z_{folded}$ is now larger, bounded by $B > b$. We've successfully compressed two claims into one.
:::
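The fold above can be verified end to end in a few lines. This toy uses a scalar challenge $\rho$ in place of the challenge matrix, and again uses `A @ Z mod q` as a stand-in for the homomorphic commitment; all concrete values are invented.

```python
# Pi_RLC sketch (illustrative): fold two ME witnesses with a scalar
# challenge rho. "A @ Z mod q" stands in for the linear commitment L(Z).
q, rho = 97, 3

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) % q
             for j in range(len(B[0]))] for i in range(len(A))]

def fold(X, Y):
    """Entry-wise X + rho * Y over the integers, so norms stay visible."""
    return [[x + rho * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

A  = [[2, 3], [1, 4]]
Z0 = [[1, 0], [1, 1]]   # low-norm witnesses: ||Z_i||_inf < b = 2
Z1 = [[0, 1], [1, 0]]
c0, c1 = matmul(A, Z0), matmul(A, Z1)

# Verifier folds the public commitments; prover folds the witness.
c_folded = [[(x + rho * y) % q for x, y in zip(rx, ry)]
            for rx, ry in zip(c0, c1)]
Z_folded = fold(Z0, Z1)

# Homomorphism: the folded commitment opens to the folded witness ...
commitment_ok = matmul(A, Z_folded) == c_folded
# ... but the norm has grown past b, up to some larger bound B.
norm = max(abs(e) for row in Z_folded for e in row)
```

Because the commitment is linear, the verifier never needs the witnesses to fold the public instance; the price is the norm growth that $\Pi_{\text{DEC}}$ must undo.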
#### Step 3: $\Pi_{\text{DEC}}$ (The Normalizer)
- **Transformation:** $\text{ME}(B, L) \to \text{ME}(b, L)^k$
- **Goal:** "Reset" the system by reducing the high-norm witness matrix back down to a set of low-norm matrices, so the recursion can continue.
The prover takes its high-norm witness matrix $Z_{folded}$ and deterministically decomposes it back into $k$ low-norm matrices using base-$b$ decomposition. The claim that this decomposition was performed correctly is then folded into the *next* round.
:::success
#### $\Pi_{DEC}$ Example: Reducing the Norm with a Concrete Matrix
Suppose the previous folding step produced a single high-norm witness matrix $Z_{folded}$. Our system requires that witnesses have an infinity norm less than $b=2$, but $Z_{folded}$ has a norm bounded by $B=8$. We must decompose it into $k=3$ low-norm matrices, since $b^k = 2^3 = 8$.
**1. The High-Norm Witness**
Let's say our folded witness matrix is:
$$Z_{folded} = \begin{pmatrix} 7 \\ 5 \end{pmatrix}$$
Its norm $\|Z_{folded}\|_\infty = 7$ is too high (it's greater than $b=2$).
**2. The Decomposition Process (Element by Element)**
The prover deterministically decomposes each entry into its base-$b$ (base-2) representation:
- **Decomposing 7:** $7 = \mathbf{1} \cdot 2^0 + \mathbf{1} \cdot 2^1 + \mathbf{1} \cdot 2^2$
- **Decomposing 5:** $5 = \mathbf{1} \cdot 2^0 + \mathbf{0} \cdot 2^1 + \mathbf{1} \cdot 2^2$
**3. Constructing the New Low-Norm Matrices**
The coefficients from the decomposition form the new low-norm witness matrices $Z'_0, Z'_1, Z'_2$.
- $Z'_0$ is formed from the $2^0$ coefficients: $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$
- $Z'_1$ is formed from the $2^1$ coefficients: $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$
- $Z'_2$ is formed from the $2^2$ coefficients: $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$
Notice that $\|Z'_0\|_\infty = 1$, $\|Z'_1\|_\infty = 1$, and $\|Z'_2\|_\infty = 1$. All are safely below the required norm bound of $b=2$.
**4. The Final Equation**
The relationship between the original high-norm matrix and the new low-norm matrices is captured by this equation:
$$\underbrace{\begin{pmatrix} 7 \\ 5 \end{pmatrix}}_{Z_{folded}} = \underbrace{\begin{pmatrix} 1 \\ 1 \end{pmatrix}}_{Z'_{0}} + 2 \cdot \underbrace{\begin{pmatrix} 1 \\ 0 \end{pmatrix}}_{Z'_{1}} + 4 \cdot \underbrace{\begin{pmatrix} 1 \\ 1 \end{pmatrix}}_{Z'_{2}}$$
The single claim on the high-norm witness $Z_{folded}$ has now been successfully transformed into a set of claims on the low-norm witnesses $Z'_0, Z'_1, Z'_2$. The proof that this decomposition equation holds is a new constraint that gets fed into the *next* folding round. The system's norm is now reset, ready for another cycle.
:::
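The decomposition in the example is a few lines of code: split each entry of the high-norm witness into $k$ base-$b$ digits, then confirm the recomposition $Z_{folded} = \sum_i b^i \cdot Z'_i$.

```python
# Pi_DEC sketch: base-b digit decomposition of the worked example
# (b = 2, k = 3, Z_folded = (7, 5)^T).
b, k = 2, 3

def decompose(Z, b, k):
    """Return k digit matrices Z'_0..Z'_{k-1} with entries in [0, b)."""
    return [[[(e // b**i) % b for e in row] for row in Z]
            for i in range(k)]

Z_folded = [[7], [5]]
parts = decompose(Z_folded, b, k)
# parts[0] == [[1], [1]], parts[1] == [[1], [0]], parts[2] == [[1], [1]]

# Recompose: Z_folded must equal sum_i b^i * Z'_i, entry by entry.
recomposed = [[sum(b**i * parts[i][r][c] for i in range(k))
               for c in range(len(Z_folded[0]))]
              for r in range(len(Z_folded))]
```

Every digit matrix has norm at most $b - 1 = 1$, so the recursion's norm invariant is restored.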
### The Complete Picture: The Folding Cycle
The full folding protocol is the sequential composition of the three reductions: $\Pi = \Pi_{\text{DEC}} \circ \Pi_{\text{RLC}} \circ \Pi_{\text{CCS}}$. This composition creates a stable, recursive cycle designed to incrementally fold a new `MCS` instance into a running accumulator, which is represented as a set of `ME` instances. A crucial aspect of this cycle is the systematic management of the witness norm to maintain security.
The state of the accumulator and the witness norm transforms through three distinct phases in each cycle.
#### **Initial State: Start of a Folding Step**
The cycle begins with two inputs:
1. **The Running Accumulator**: A set of $k$ `ME` instances from the previous cycle, each with a **low-norm** witness (i.e., $\|Z_i\|_\infty < b$).
2. **The New Proof**: A fresh `MCS` instance, also with a **low-norm** witness.
#### **Phase 1: $\Pi_{CCS}$ (Unification)**
* **Action**: The $\Pi_{CCS}$ reduction unifies the distinct structures of the incoming `MCS` instance and the accumulator's `ME` instances. Using the sumcheck protocol, it converts the algebraic constraints of the `MCS` proof into the standardized evaluation claim format of the accumulator.
* **Output**: A new, larger set of $k+1$ uniform `ME` instances. The witnesses associated with these claims remain **low-norm**.
#### **Phase 2: $\Pi_{RLC}$ (Compression)**
* **Action**: The $\Pi_{RLC}$ reduction compresses the set of $k+1$ `ME` instances into a single `ME` instance. This is achieved by taking a random linear combination of the commitments, witnesses, and evaluations.
* **Output**: A single, folded `ME` instance. As a direct consequence of the linear combination, the witness norm increases. The witness is now **high-norm** (i.e., $\|Z_{folded}\|_\infty < B$, where $B > b$).
#### **Phase 3: $\Pi_{DEC}$ (Normalization)**
* **Action**: The $\Pi_{DEC}$ reduction normalizes the high-norm witness from the previous phase. The high-norm witness matrix is deterministically decomposed into a set of $k$ new low-norm witness matrices.
* **Output**: The cycle concludes, producing the **new running accumulator**. This consists of a set of $k$ `ME` instances, each with a corresponding **low-norm** witness. The system is now in a stable state, ready to begin the next folding cycle with another new `MCS` proof.
This three-phase process allows for the continuous absorption of new proofs into a constant-size accumulator. The cycle is fundamentally designed around the controlled increase (for compression) and decrease (for normalization) of the witness norm, which is essential for preserving the binding property of the underlying lattice-based commitment scheme.
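The norm bookkeeping that drives the cycle can be summarized with a crude worst-case estimate. The bound formula and every number below are illustrative assumptions, not Neo's actual parameter analysis.

```python
# Illustrative norm tracking over one folding cycle (made-up parameters).
b, k, rho_max = 2, 3, 2    # digit base, digit count, challenge bound

# Phase 2 (RLC): combining k+1 witnesses with entries below b, using
# challenge entries of magnitude at most rho_max, gives a crude
# worst-case bound B on the folded witness norm.
B = (k + 1) * rho_max * (b - 1)     # 4 * 2 * 1 = 8

# Phase 3 (DEC): count how many base-b digits are needed to cover B.
k_needed = 0
while b ** k_needed < B:
    k_needed += 1
# With these parameters k_needed == k: decomposition returns exactly k
# matrices with entries below b, and the cycle's invariant is restored.
```

The point of the parameter choice is exactly this closure: the number of digits produced by $\Pi_{\text{DEC}}$ matches the accumulator size $k$ consumed by the next $\Pi_{\text{CCS}}$ step.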