Yan X Zhang and Aard Vark
(with thanks to Yi Sun, Jonathan Wang, Lev Soukhanov, Nicolas Mohnblatt)
This hasn't been checked. Use at your own risk.
Nova introduced a folding scheme for R1CS circuits. A folding scheme is a way to aggregate multiple satisfying witnesses to a constraint system (in Nova's case, an R1CS circuit), so that multiple verifications can be combined into one. Under the hood, folding works by taking a random linear combination of the witnesses; the R1CS equations need to be generalized to "relaxed R1CS" constraints, which are stable under linear combination.
Our goal here is to show how Nova-style folding can be generalized to any polynomial "custom gate". The idea is sketched in Sangria in Section 3.3. In addition, we provide a useful special case that has not yet been done: folding lookups. (We also have a separate writeup of folding lookups at Origami.)
We'll start with a general algebraic identity.
Let \(p(\mathbf{x})\) be a homogenous polynomial of degree \(d\) in \(n\) variables (where \(\mathbf{x} = (x_1, x_2 \ldots, x_n)\)).
Then there are \(d-1\) polynomials \(\Delta^1 p, \Delta^2 p, \ldots, \Delta^{n-1} p\) of degree at most \(d\) in \(2n\) variables such that
\[ p(\mathbf{x} + r \mathbf{y}) = p(\mathbf{x}) + r^d p(\mathbf{y}) + \sum_{k=1}^{d-1} r^k \Delta^k p (\mathbf{x}, \mathbf{y}). \]
The construction is simple. Write \(p(\mathbf{x})\) as a linear combination of monomials \(x_{i_1} x_{i_2} \cdots x_{i_d}\):
\[p(\mathbf{x}) = \sum a_{i_1, i_2, \ldots, i_d} x_{i_1} x_{i_2} \cdots x_{i_d}.\]
(It's OK if some of the indices are equal to each other. For example, the monomial \(x_1^2\) has \(i_1 = i_2 = 1\). Also, this expression is not unique, and that's OK too. For example, \(x_1 x_2\) could be rewritten as \(x_2 x_1\) or even \(2x_1 x_2 - x_2 x_1\).)
The polynomial \(\Delta^k p(\mathbf{x}, \mathbf{y})\) can be computed as follows: for any monomial, \(\Delta^k(x_{i_1} x_{i_2} \cdots x_{i_d})\) is the sum of the \(\binom{d}{k}\) terms obtained by replacing \(k\) of the \(d\) variables \(x_{i_j}\) with \(y_{i_j}\). For example:
\begin{align}
\Delta^1 (x_1 x_2) & = x_1 y_2 + y_1 x_2 \\
\Delta^2 (x_1 x_2 x_3) & = x_1 y_2 y_3 + y_1 x_2 y_3 + y_1 y_2 x_3 \\
\Delta^1 (x_1^3) & = \Delta^1 (x_1 x_1 x_1) = x_1 x_1 y_1 + x_1 y_1 x_1 + y_1 x_1 x_1 = 3 x_1^2 y_1. \\
\end{align}
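This construction is easy to check mechanically. The following sketch (Python with sympy, which is just a choice of tool for illustration, not anything the protocol depends on) computes the \(\Delta^k p\) by expanding \(p(\mathbf{x} + r\mathbf{y})\) and collecting powers of \(r\), then verifies the identity above on a small example.

```python
# Sketch (Python + sympy): compute the cross terms Delta^k p of a homogeneous
# polynomial p by expanding p(x + r*y) and collecting the coefficient of r^k.
import sympy as sp

def cross_terms(p, xs, ys, d):
    """Return {k: Delta^k p(x, y)} for k = 1, ..., d-1."""
    r = sp.symbols('r')
    shifted = sp.expand(p.subs({x: x + r * y for x, y in zip(xs, ys)},
                               simultaneous=True))
    return {k: shifted.coeff(r, k) for k in range(1, d)}

# Example: p = x1*x2*x3 + x1**3 is homogeneous of degree d = 3.
x1, x2, x3, y1, y2, y3, r = sp.symbols('x1 x2 x3 y1 y2 y3 r')
xs, ys, d = [x1, x2, x3], [y1, y2, y3], 3
p = x1 * x2 * x3 + x1**3

deltas = cross_terms(p, xs, ys, d)
# Check the identity p(x + r*y) = p(x) + r^d p(y) + sum_k r^k Delta^k p(x, y).
lhs = sp.expand(p.subs({x: x + r * y for x, y in zip(xs, ys)}, simultaneous=True))
rhs = sp.expand(p + r**d * p.subs(dict(zip(xs, ys)), simultaneous=True)
                + sum(r**k * deltas[k] for k in range(1, d)))
assert sp.simplify(lhs - rhs) == 0
print(deltas[1])  # 3*x1**2*y1 + x1*x2*y3 + x1*x3*y2 + x2*x3*y1 (up to term order)
```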
What if \(p\) is not homogeneous? To match the notation in Nova, we'll homogenize by introducing an extra variable.
For any polynomial \(p(\mathbf{x})\) of degree at most \(d\) in \(n\) variables, we'll define a homogeneous polynomial \(p^{homog}(\mathbf{x}, u)\) of degree \(d\) in \(n+1\) variables, as follows.
Write \[p(\mathbf{x}) = \sum_{d' = 0}^{d} p_{d'}(\mathbf{x}), \] where \(p_{d'}(\mathbf{x})\) is homogeneous of degree \(d'\). In other words, \(p_{d'}(\mathbf{x})\) is just the sum of all the degree-\(d'\) terms in \(p(\mathbf{x})\). Then define \[p^{homog}(\mathbf{x}, u) = \sum_{d' = 0}^{d} u^{d - d'} p_{d'}(\mathbf{x}).\]
In other words, to make \(p^{homog}(\mathbf{x}, u)\) from \(p(\mathbf{x})\), just multiply each term of \(p(\mathbf{x})\) by whatever power of \(u\) is needed to make the degree come out to \(d\).
For future reference, it's worth noting that
\[p(\mathbf{x}) = p^{homog}(\mathbf{x}, 1).\]
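Continuing with the same sympy-based sketch (again, purely illustrative), homogenization amounts to padding each term with the power of \(u\) that brings its degree up to \(d\):

```python
# Sketch (Python + sympy): homogenize p(x) to p^homog(x, u) of degree d by
# multiplying every term by the power of u that brings it up to degree d.
import sympy as sp

def homogenize(p, xs, u, d):
    """Return p^homog(x, u) for a polynomial p of degree at most d in xs."""
    poly = sp.Poly(sp.expand(p), *xs)
    terms = [coeff * sp.Mul(*(x**e for x, e in zip(xs, monom))) * u**(d - sum(monom))
             for monom, coeff in poly.terms()]
    return sp.expand(sum(terms))

x1, x2, u = sp.symbols('x1 x2 u')
p = x1 * x2 + x1 + 7                      # degree d = 2, not homogeneous
ph = homogenize(p, [x1, x2], u, 2)
print(ph)                                 # x1*x2 + u*x1 + 7*u**2
assert sp.expand(ph.subs(u, 1) - p) == 0  # p(x) = p^homog(x, 1)
```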
TODO: an example that reformulates Sangria in this notation.
We'll assume a hiding, binding, additively homomorphic commitment scheme \(\mathrm{Com}(pp, x, \rho)\). We'll denote a commitment to a vector \(V\) by \(\overline{V} = \mathrm{Com}(pp, V, \rho)\). For clarity of notation, sometimes we will write \(\mathrm{Com}(V) := \mathrm{Com}(pp, V, \rho)\) when the arguments are obvious from context.
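As one concrete (and deliberately toy) possibility, here is a Pedersen-style vector commitment written multiplicatively, so that "adding" two commitments means multiplying group elements and scalar multiplication means exponentiation; the parameters below are tiny and insecure, and this is not a scheme the rest of the writeup assumes.

```python
# Toy Pedersen-style vector commitment (tiny, insecure parameters; illustration only).
# The group is written multiplicatively, so the additive homomorphism
# Com(V1, r1) + Com(V2, r2) = Com(V1 + V2, r1 + r2) appears as a product below.
import random

q = 1019               # prime order of the subgroup
p = 2 * q + 1          # 2039, also prime; squares mod p form the order-q subgroup

def setup(n, seed=0):
    rng = random.Random(seed)
    g = pow(rng.randrange(2, p - 1), 2, p)                  # subgroup generator
    hs = [pow(rng.randrange(2, p - 1), 2, p) for _ in range(n)]
    return (g, hs)

def commit(pp, V, rho):
    g, hs = pp
    c = pow(g, rho % q, p)                                   # blinding term g^rho
    for h, v in zip(hs, V):
        c = c * pow(h, v % q, p) % p
    return c

pp = setup(3)
V1, V2, r1, r2 = [1, 2, 3], [4, 5, 6], 11, 22
lhs = commit(pp, V1, r1) * commit(pp, V2, r2) % p
rhs = commit(pp, [a + b for a, b in zip(V1, V2)], r1 + r2)
assert lhs == rhs    # homomorphism: product of commitments commits to the sum
```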
We'll describe a folding scheme for AIRs, a form of polynomial constraint system. An AIR \(P\) is a collection of polynomials \(f_1, \ldots, f_\ell\) in \(2w\) variables over a finite field \(\mathbb{F}\). An execution trace \(T\) for \(P\) is an array of elements \(T_{r, c}\) of \(\mathbb{F}\), with \(n\) rows, each of width \(w\). In order for the trace \(T\) to be valid, it is required that, when we substitute into any \(f_i\) the \(2w\) elements of any two consecutive rows of \(T\), the result equals zero:
\[ f_i(T_{j, 1}, T_{j, 2}, \ldots, T_{j, w}, T_{j+1, 1}, T_{j+1, 2}, \ldots, T_{j+1, w}) = 0. \]
The folding scheme we're about to describe works for more general polynomial constraint systems as well. All that matters is that a valid trace be defined by some fixed collection of polynomial equations in the entries. It is not necessary that each polynomial only involve entries from two consecutive rows. Nonetheless, we'll continue working within the AIR framework.
To start, we introduce the main object, a relaxed AIR, in the style of Sangria. The main idea of a relaxed AIR, like the relaxed R1CS analogue from Nova, is a polynomial constraint that involves a "slack term."
We'll write \(f_i(T)\) for the length-\(n\) vector whose \(j\)-th entry is \(f_i\) applied to rows \(j\) and \(j+1\) of \(T\):
\[ f_i(T_{j, 1}, T_{j, 2}, \ldots, T_{j, w}, T_{j+1, 1}, T_{j+1, 2}, \ldots, T_{j+1, w}). \]
(Rows wrap around, so row \(n\) is the same as row \(0\).)
A relaxed AIR instance is satisfied by a trace \(T\) (i.e. an \(n\)-by-\(w\) array of elements of \(\mathbb{F}\)), vectors \(E_1, E_2, \ldots, E_{\ell} \in \mathbb{F}^n\), and a scalar \(u \in \mathbb{F}\), if
\[ f_i^{homog}(T_{j, 1}, T_{j, 2}, \ldots, T_{j, w}, T_{j+1, 1}, T_{j+1, 2}, \ldots, T_{j+1, w}, u) = (E_i)_j, \] for each \(i\) and \(j\).
We'll write \(f_i^{homog}(T, u)\) for the vector whose \(j\)-th entry is
\[ f_i^{homog}(T_{j, 1}, T_{j, 2}, \ldots, T_{j, w}, T_{j+1, 1}, T_{j+1, 2}, \ldots, T_{j+1, w}, u). \]
Since \(f_i^{homog}\) is a polynomial in \(2w+1\) variables, the construction at the top of the page defines polynomials \(\Delta^k f_i^{homog}\) in \(2(2w+1)\) variables. We'll write \(\Delta^k f_i^{homog} (T^1, u^1; T^2, u^2)\) for the vector whose \(j\)-th entry is
\[ \Delta^k f_i^{homog} (T^1_{j, 1}, \ldots, T^1_{j+1, w}, u^1, T^2_{j, 1}, \ldots, T^2_{j+1, w}, u^2). \]
A relaxed AIR instance \((E_i, u)\) consists of slack vectors \(E_1, \ldots, E_{\ell} \in \mathbb{F}^n\) and a scalar \(u \in \mathbb{F}\). A witness for such an instance is an \(n\)-by-\(w\) matrix \(T\) of field elements such that \[f_i^{homog}(T, u) = E_i,\] for all \(i\).
Clearly, any AIR instance can be promoted to a relaxed AIR instance by setting \(u = 1\) and \(E_i = 0\) for all \(i\). The main idea in folding AIR instances is that by converting them to relaxed AIR instances, the computations can be more easily combined.
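To make the definitions concrete, here is a small sketch (plain Python over a toy prime field; the names check_relaxed_air, f1, and the field modulus are ours, not from any library) that checks whether a triple \((T, E_i, u)\) satisfies a relaxed AIR.

```python
# Sketch: checking a relaxed AIR witness over a toy prime field F_q.
# Each constraint is a Python function f(cur_row, next_row, u) evaluating the
# homogenized polynomial f_i^homog on two consecutive rows.
q = 101  # toy field modulus

def check_relaxed_air(f_homogs, T, Es, u):
    """T: n rows of w field elements; Es[i]: slack vector for constraint f_homogs[i]."""
    n = len(T)
    return all(f(T[j], T[(j + 1) % n], u) % q == Es[i][j] % q   # rows wrap around
               for i, f in enumerate(f_homogs) for j in range(n))

# Toy AIR with one width-1 constraint "next = cur^2" (degree 2), homogenized as
# f^homog(cur, nxt, u) = cur[0]^2 - u*nxt[0].
f1 = lambda cur, nxt, u: (cur[0] * cur[0] - u * nxt[0]) % q

# A strict witness is the special case u = 1, E = 0; any trace at all becomes a
# relaxed witness by letting E absorb whatever f^homog evaluates to.
T = [[3], [9], [81], [2]]
u = 1
E1 = [f1(T[j], T[(j + 1) % len(T)], u) for j in range(len(T))]
assert check_relaxed_air([f1], T, [E1], u)
```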
A committed relaxed AIR instance \((f_i, \overline{E_i}, u, \overline{T})\) consists of polynomials \(f_1, \ldots, f_{\ell}\), commitments \(\overline{E_i}\) to the slack vectors \(E_1, \ldots, E_{\ell}\), a scalar \(u\), and a commitment \(\overline{T}\) to an \(n\)-by-\(w\) matrix of scalars. A witness to the committed relaxed AIR instance is a tuple \((E, T, \rho)\), where \(\rho\) is the randomness used by the commitment scheme, such that \(\overline{T} = \operatorname{Com}(pp, T, \rho)\), \(\overline{E_i} = \operatorname{Com}(pp, E_i, \rho)\) for each \(i\), and \(f_i^{homog}(T, u) = E_i\).
We'll describe a folding scheme for relaxed AIR instances. Suppose a prover and a verifier have an AIR \(P\), i.e. they both know the collection of polynomials \(f_i\). The prover is given two relaxed AIR instances together with their witnesses, \((T^1, E^1, u^1)\) and \((T^2, E^2, u^2)\). The verifier knows only the scalars \(u^1\) and \(u^2\).
Let \(d_i\) be the degree of the polynomial \(f_i\), and let \(\Delta^k f_i^{homog}\) be as defined above.
The prover \(\mathcal{P}\) and the verifier \(\mathcal{V}\) carry out the following protocol.
Protocol 1
1. \(\mathcal{P}\) computes commitments \(\overline{X}\) for \(X \in \{T^1, T^2, E_i^1, E_i^2\}\) and their openings \(\rho_X\) (meaning that for all \(X\), \(\overline{X} = \operatorname{Com}(pp, X, \rho_X)\) holds) and sends the commitments to \(\mathcal{V}\).
2. For each constraint polynomial \(f_i\), and each degree \(1 \leq k \leq d_i - 1\), \(\mathcal{P}\) also computes the cross term \[ B_{i, k} = \Delta^k f_i^{homog} (T^1, u^1; T^2, u^2) \] (a vector of length \(n\)). \(\mathcal{P}\) computes the commitments \(\overline{B_{i, k}}\) and their openings and sends the commitments to \(\mathcal{V}\).
3. \(\mathcal{V}\) samples a random folding challenge \(r \in \mathbb{F}\), and sends \(r\) to \(\mathcal{P}\).
4. \(\mathcal{P}\) computes \[T = T^1 + r T^2\] and, for each \(i\), \[E_i = E^1_i + r^{d_i} E^2_i + \sum_{k=1}^{d_i-1} r^k B_{i, k}.\] These values \(T\) and \(E_i\) give a folded relaxed AIR witness.
5. Both \(\mathcal{P}\) and \(\mathcal{V}\) compute \[u = u^1 + r u^2,\] \[\overline{T} = \overline{T^1} + r \overline{T^2},\] and, for each \(i\), \[\overline{E_i} = \overline{E_i^1} + r^{d_i} \overline{E_i^2} + \sum_{k=1}^{d_i-1} r^k \overline{B_{i, k}}. \]
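The following sketch (continuing the toy Python example above, with the same caveats) spells out the prover-side arithmetic of steps 2 and 4 for the single degree-2 constraint f1, and checks that the folded triple is again a relaxed witness. The commitment bookkeeping of step 5 is the same linear combination applied to \(\overline{T}\) and \(\overline{E_i}\), which works because the commitment scheme is additively homomorphic.

```python
# Sketch of one fold for the toy constraint f1 above (q, f1, check_relaxed_air
# are reused from the previous sketch).  For f^homog = cur^2 - u*nxt, the single
# cross term is Delta^1 f^homog(x, u1; y, u2) = 2*x_cur*y_cur - u1*y_nxt - u2*x_nxt.
def cross1(xc, xn, u1, yc, yn, u2):
    return (2 * xc[0] * yc[0] - u1 * yn[0] - u2 * xn[0]) % q

def slack(T, u):
    n = len(T)
    return [f1(T[j], T[(j + 1) % n], u) for j in range(n)]

def fold(T1, E1, u1, T2, E2, u2, r):
    n = len(T1)
    # step 2: cross-term vector B (one vector, since d = 2)
    B = [cross1(T1[j], T1[(j + 1) % n], u1, T2[j], T2[(j + 1) % n], u2)
         for j in range(n)]
    # step 4: fold the witness; step 5 applies the same combination to commitments
    T = [[(a + r * b) % q for a, b in zip(x, y)] for x, y in zip(T1, T2)]
    E = [(E1[j] + pow(r, 2, q) * E2[j] + r * B[j]) % q for j in range(n)]
    u = (u1 + r * u2) % q
    return T, E, u

T1, u1 = [[3], [9], [81], [2]], 1
T2, u2 = [[5], [7], [11], [13]], 1
T, E, u = fold(T1, slack(T1, u1), u1, T2, slack(T2, u2), u2, r=17)
assert check_relaxed_air([f1], T, [E], u)   # the folded triple is again a witness
```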
As in Nova, this folding process can be iterated. Steps 2-5 allow two committed relaxed AIR instances to be folded into one, at the cost of sending just \(4 + \sum_{i=1}^{\ell} (d_i-1)\) commitments to the verifier. One can iterate the procedure to fold an arbitrary number of committed relaxed AIR instances together, one by one. At the end, the prover only has to convince the verifier that the final \((E, T)\) is a witness to the folded instance.
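Continuing the same toy sketch, iterating is just repeated accumulation; the challenges below are fixed constants purely for illustration, whereas in the protocol each \(r\) is a fresh verifier challenge.

```python
# Iterating the toy fold above: accumulate several relaxed witnesses one by one
# (reuses q, f1, check_relaxed_air, slack, and fold from the sketches above).
instances = [([[3], [9]], 1), ([[5], [7]], 1), ([[2], [6]], 1)]   # (trace, u) pairs
acc_T, acc_u = instances[0]
acc_E = slack(acc_T, acc_u)
for (T2, u2), r in zip(instances[1:], [19, 23]):
    acc_T, acc_E, acc_u = fold(acc_T, acc_E, acc_u, T2, slack(T2, u2), u2, r)
assert check_relaxed_air([f1], acc_T, [acc_E], acc_u)
```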
In Origami, we apply our procedure to Halo2 lookup checks.
Technically speaking, this is not an exact special case of the protocol since \(\beta\) and \(\gamma\) play the role of verifier-supplied randomness that does not fit in the framework. Instead, lookups are a special case of a slight generalization of Protocol 1 that includes verifier randomness.
Here we'll argue that the folding procedure satisfies knowledge soundness. In other words, a prover can't convince a verifier to accept a proof (with more than negligible probability) unless the prover itself knows witnesses \((E^1, T^1)\) and \((E^2, T^2)\).
We'll prove this by a "rewinding" argument. Imagine an "extractor" that is allowed to interact with the prover by rewinding it to a previous state. In practice, the extractor will feed the prover different random challenges \(r\) and see what response it gives to each.
Let \(d\) be the largest of the \(d_i\)'s, so that all the constraint polynomials \(f_i\) have degree at most \(d\). Suppose we are given two committed relaxed AIR instances \((f_i, \overline{E^1_i}, u^1, \overline{T^1})\) and \((f_i, \overline{E^2_i}, u^2, \overline{T^2})\) (for the same AIR \(P = (f_1, \ldots, f_{\ell})\)). We are also given \(d+1\) transcripts of the prover-verifier interaction, with \(d+1\) different values of \(r\), say \(r_{(1)}, \ldots, r_{(d+1)}\).
Each transcript contains the same commitments \(\overline{B_{i, k}}\). Furthermore, each transcript gives witnesses \(E(r_{(j)})\), \(T(r_{(j)})\), and \[u(r_{(j)}) = u^1 + r_{(j)} u^2,\] that satisfy the relaxed AIR condition:
\[ f_i^{homog}(T(r_{(j)}), u(r_{(j)})) = E_i(r_{(j)}). \]
Furthermore, these witnesses are consistent with the commitments \(\overline{E^1_i}, \overline{T^1}, \overline{E^2_i}, \overline{T^2}\), in that
\[\overline{T(r_{(j)})} = \overline{T^1} + r_{(j)} \overline{T^2}\]
and
\[\overline{E_i(r_{(j)})} = \overline{E_i^1} + r_{(j)}^{d_i} \overline{E_i^2} + \sum_{k=1}^{d_i-1} r_{(j)}^k \overline{B_{i, k}}. \]
From this, the extractor needs to extract satisfying witnesses \(E^1_i, T^1, E^2_i, T^2\) to the original relaxed AIR. As in Nova, the extractor works by interpolation.
By linear interpolation, the extractor can find \(T^1\) and \(T^2\) such that
\[T(r_{(j)}) = T^1 + r_{(j)} T^2\]
for \(j = 1, 2\). (The extractor recovers the corresponding randomness \(\rho_{T^1}\) and \(\rho_{T^2}\) by interpolating the openings from the same two transcripts.) Since the commitment scheme is linear, the \(T^1\) the extractor finds will satisfy \(\operatorname{Com}(pp, T^1, \rho_{T^1}) = \overline{T^1}\), and similarly for \(T^2\). Therefore, for every \(j\) we have
\[\overline{T(r_{(j)})} = \overline{T^1} + r_{(j)} \overline{T^2} = \operatorname{Com}(T(r_{(j)})),\]
so by the binding property of commitment, we have (with high probability) that
\[T(r_{(j)}) = T^1 + r_{(j)} T^2\] for all \(j\).
Similarly, we'll recover \(E^1_i, E^2_i\) and \(B_{i, k}\) by Lagrange interpolation. Recall that the coefficients of any degree-\(d\) polynomial in one variable can be recovered from the value of the polynomial at \(d+1\) distinct points. (This is purely a question of linear algebra, so the extractor can do it efficiently.) We are looking for \(E^1_i, E^2_i\) and \(B_{i, k}\) such that
\[E_i(r_{(j)}) = E^1_i + r_{(j)}^{d_i} E^2_i + \sum_{k=1}^{d_i-1} r_{(j)}^k B_{i, k}.\]
Well, this is simply an interpolation problem for a polynomial of degree \(d_i \leq d\) in the variable \(r\); the unknowns \(E^1_i, E^2_i\) and \(B_{i, k}\) play the role of coefficients. Thus, for each \(i\), we can find \(E^1_i, E^2_i\) and \(B_{i, k}\) by Lagrange interpolation on the first \(d_i+1\) values \(r_{(j)}\).
We know that \[\overline{E_i(r_{(j)})} = \overline{E_i^1} + r_{(j)}^{d_i} \overline{E_i^2} + \sum_{k=1}^{d_i-1} r_{(j)}^k \overline{B_{i, k}}, \] so by linearity of commitment and linearity of Lagrange interpolation, \(\operatorname{Com}(pp, E^1_i, \rho_{E^1_i}) = \overline{E^1_i}\), \(\operatorname{Com}(pp, E^2_i, \rho_{E^2_i}) = \overline{E^2_i}\), and \(\operatorname{Com}(pp, B_{i, k}, \rho_{B_{i,k}}) = \overline{B_{i, k}}\) – in other words, the values \(E^1_i, E^2_i, B_{i, k}\) that the extractor finds are consistent with the given commitments.
Just like with \(T(r_{(j)})\) above, the binding property of the commitment scheme then guarantees that (with all but negligible probability) we have \[E_i(r_{(j)}) = E^1_i + r_{(j)}^{d_i} E^2_i + \sum_{k=1}^{d_i-1} r_{(j)}^k B_{i, k}\] for all \(j\).
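The interpolation the extractor performs is ordinary Lagrange interpolation, applied coordinate-wise to vectors. Here is a hedged sketch over a toy prime field (all names are ours): given evaluations at \(d_i+1\) distinct challenges, it recovers the "coefficient vectors" \(E^1_i, B_{i,1}, \ldots, B_{i,d_i-1}, E^2_i\); the two-point linear case for \(T^1, T^2\) is the same computation with \(d = 1\).

```python
# Sketch: the extractor's interpolation over a toy prime field F_q.
# Given vals[j] = c_0 + c_1*rs[j] + ... + c_d*rs[j]^d at d+1 distinct challenges
# (each c_k a vector), recover the coefficient vectors c_k by Lagrange interpolation.
q = 101  # toy field modulus

def interpolate_coeffs(rs, vals):
    """rs: d+1 distinct field elements; vals: d+1 vectors. Returns [c_0, ..., c_d]."""
    d, m = len(rs) - 1, len(vals[0])
    coeffs = [[0] * m for _ in range(d + 1)]
    for j, rj in enumerate(rs):
        basis, denom = [1], 1          # L_j(r) as coefficients, lowest degree first
        for t, rt in enumerate(rs):
            if t == j:
                continue
            new = [0] * (len(basis) + 1)
            for k, b in enumerate(basis):          # multiply basis by (r - r_t)
                new[k] = (new[k] - rt * b) % q
                new[k + 1] = (new[k + 1] + b) % q
            basis = new
            denom = denom * (rj - rt) % q
        scale = pow(denom, q - 2, q)               # 1/denom in F_q
        for k in range(d + 1):
            w = basis[k] * scale % q
            for col in range(m):
                coeffs[k][col] = (coeffs[k][col] + w * vals[j][col]) % q
    return coeffs

# Toy check: recover E1 (constant term), B (r-term), E2 (r^2-term) from 3 points.
E1, B, E2 = [1, 2], [3, 4], [5, 6]
rs = [7, 8, 9]
vals = [[(E1[i] + r * B[i] + r * r * E2[i]) % q for i in range(2)] for r in rs]
assert interpolate_coeffs(rs, vals) == [E1, B, E2]
```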
Finally, we need to show that \((E^1_i, u^1, T^1)\) and \((E^2_i, u^2, T^2)\) are satisfying witnesses for the original AIR. To this end, consider the equality \[ f_i^{homog}(T^1 + r T^2, u^1 + r u^2) = E^1_i + r^{d_i} E^2_i + \sum_{k=1}^{d_i-1} r^k B_{i, k}. \]
This is an equality of polynomials of degree \(d_i\) in \(r\) that is known to hold for \(d_i+1\) values of \(r\). Therefore, the equality holds for all \(r\). In particular, it holds for \(r = 0\), so \[ f_i^{homog}(T^1, u^1) = E^1_i. \]
On the other hand, since \(f_i^{homog}\) is homogeneous of degree \(d_i\), we see that
\[ f_i^{homog}(s T^1 + T^2, s u^1 + u^2) = s^{d_i} E^1_i + E^2_i + \sum_{k=1}^{d_i-1} s^{d_i-k} B_{i, k} \] holds for \(d_i+1\) values of \(s\), namely \(s = r_{(j)}^{-1}\) (the challenges \(r_{(j)}\) are distinct and, with overwhelming probability, nonzero). Therefore, this equality holds for all \(s\), in particular for \(s=0\), so \[f_i^{homog}(T^2, u^2) = E^2_i. \]
This shows that the extractor can recover the two satisfying witnesses.
It's worth explaining why folded lookup arguments satisfy knowledge soundness as well. Recall the overall shape of the lookup argument:
Then the folding happens. To fold the instance into \(I_{cml}\):
In the proof of knowledge soundness above, the extractor rewinds to just before \(\mathcal{V}\) sends the challenge \(r\). The earlier challenges \(\beta\) and \(\gamma\) remain untouched. The conclusion is that the prover knows a witness \((u^i, \beta^i, \gamma^i, A^i, S^i, A_p^i, S_p^i, Z^i, W^i)\) that satisfies the seven polynomial relations \(f_1, \ldots, f_7\).
A separate argument (knowledge soundness for Halo2 lookups) shows that in fact the prover knows \(A^i\) and \(S^i\) satisfying the lookup condition.
Nova is a folding scheme for R1CS circuits. The R1CS structure is a special case of an AIR, where there is a single constraint polynomial \(f_1\), of degree 2, having a specific form. Because the polynomial is of degree \(d=2\), there is only a single cross-term \(B = B_{1, 1}\). The folding scheme described here generalizes Nova: if you apply this scheme to R1CS systems, you recover the original Nova folding scheme.
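As a sanity check on this claim, the cross_terms sketch from the top of the page (with toy coefficients for a single, hypothetical R1CS row) produces exactly one cross term for the homogenized degree-2 constraint:

```python
# Reusing cross_terms from the first sketch: the homogenized R1CS row constraint
# (a.z)*(b.z) - u*(c.z) has degree 2, so there is a single cross term Delta^1,
# mirroring Nova's single cross-term commitment.  Coefficients here are toy values.
import sympy as sp

z0, z1, u = sp.symbols('z0 z1 u')
zz0, zz1, uu = sp.symbols('zz0 zz1 uu')
a, b, c = [1, 2], [3, 4], [5, 6]                      # one toy R1CS row
dot = lambda v, w: sum(x * y for x, y in zip(v, w))
f_homog = dot(a, [z0, z1]) * dot(b, [z0, z1]) - u * dot(c, [z0, z1])

deltas = cross_terms(f_homog, [z0, z1, u], [zz0, zz1, uu], 2)
print(deltas)   # a dict with the single key 1: the cross term for this row
```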
The idea that Nova-style folding can be generalized to arbitrary custom gates was introduced in Section 3.3 of Sangria.