# Compressed SNARK description ## High-level ### Traits We add a `BatchedRelaxedR1CSSNARKTrait` which generalizes the existing `RelaxedR1CSSNARKTrait` in `src/traits/snark.rs`. The main difference is that the `setup`, `prove` and `verify` methods take as inputs _slices_ of - R1CS shapes `S` - Relaxed R1CS instances `U` - Relaxed R1CS witnesses `W` We also implement a `CompressedSNARK` proof for the SuperNova `RecursiveSNARK`. The main differences with the original are - We initializes the instance/witness pairs for non-folded primary circuits to their default values - The primary circuits are proved using an implementation of `BatchedRelaxedR1CSSNARK`. In `src/spartan`, we implement batched versions of `snark.rs` and `pp_snark.rs` in `batched.rs` and `batched_ppsnark.rs`. The `BatchedRelaxedR1CSSNARK` structs in both cases resemble their non-batched counter-parts, except that fields containing commitments or evaluations are replaced with vectors of the same elements. The Sumcheck and PCS proofs are batched so their count remains constant with regards to the number of instances. The proving/verification keys are also mainly the same, apart from having to store parameters for each individual circuit being proved. ### Batching Strategy The implementation attempts to mirror the existing SNARK and pp-SNARK as much as possible. In the "IOP" portion of the protocol, where the prover sends commitments to polynomials or evaluations thereof, and the verifier responds with challenges, we simply repeat this step for each instance. The prover messages are added to the transcript in batches, and stored in the proof as vectors. The Sumcheck claims are set up slightly differently to account for circuits of different sizes. In particular, when a claim of size $m$ is batched in a Sumcheck over $n \geq m$ rounds, we create separate `EqPolynomials` for each instance. Given a single random challenge $r$, we use the first $m$ components of $\vec{r} = (1, r^2, r^4,\ldots, r^{2^{n-1}})$ to define the `EqPolynomial` for a claim over $m$ variables. This ensures all polynomials within a claim have the same size. We explain in the next section how we modified Sumcheck to handle multiple claims of possibly different sizes. Since the original SNARKs already implement batching of MLE evaluations, we are able to reuse the code to batch all polynomials across the several instances. We explain the subtleties of the batching argument for the ppSNARK in a later section. ## Batched Sumcheck Multiple Sumcheck claims can be batched together by running the protocol over a random-linear combination of the input claims. The existing implementation of ppSNARK already includes this functionality, though we augmented it to handle instances defined over fewer variables. A *Sumcheck instance* implements the `SumcheckEngine` trait. It may be composed of multiple independent *claims* of the type $$ \sigma_j = \sum_{\vec{x} \in \{0,1\}^n} F_j\big(P_0(\vec{x}), P_1(\vec{x}), \ldots\big), $$ where $F_j$ is a multivariate polynomial of degree $\leq d$, and $P_0,P_1,\ldots$ are multi-linear polynomials in $n$ variables. In rounds $i = 0,1,\ldots,n-1$ after receiving challenges $\vec{r} = r_0,\ldots,r_{i-1}$, the prover computes for each claim $j$ the evaluations at $X = 0, 2, \ldots, d-1$ of the univariate polynomial $$ S_j(X_i) = \sum_{\vec{x} \in \{0,1\}^{n-i-1}} F_j\big( P_0(\vec{r}, X_i, \vec{x}) , P_1(\vec{r}, X_i, \vec{x}) , \ldots, \big) $$ The prover sends the random linear combination of all univariate polynomials $S_j(X)$ and obtains the Fiat-Shamir challenge $r_i$, which is used to bound the variable $X_i$ of the multilinear polynomials $P_0, P_1,\ldots$. After the $n$-th round, the prover sends $v_0,v_1,\ldots$ equal to the multi-linear evaluations of $P_0, P_1,\ldots$ at $\vec{r} = (r_0,\ldots,r_{n-1})$. ### Non-uniform instances We can use the same Sumcheck instantiation to batch instances of different sizes. Let $m < n$ be the number of variables for the instance $$ \sigma = \sum_{\vec{x} \in \{0,1\}^m} F\big(P_0(\vec{x}), P_1(\vec{x}), \ldots\big), $$ where $P_0, P_1,\ldots$ are multi-linear polynomials over $m$ variables. We can prove this claim with $n$-round Sumcheck by considering the equivalent scaled instance $$ \underbrace{2^{n-m}\cdot \sigma}_{\sigma'} = \sum_{\vec{x}' \in \{0,1\}^{n-m}} \sum_{\vec{x} \in \{0,1\}^{m}} F\big(P_0(\vec{x}), P_1(\vec{x}), \ldots\big). $$ Here, we "lift" the $m$-variate polynomials $P_0,P_1,\ldots$ to $n$ variables $(X_0,\ldots,X_{n-1})$ by evaluating them over $(X_{n-m},\ldots,X_{n-1})$. In rounds $0\leq i < n-m$, the univariate polynomial $S(X_i)$ is constant and given by $$ S(X_i) = \sum_{\vec{x}' \in \{0,1\}^{n-m-i-1}} \sum_{\vec{x} \in \{0,1\}^{m}} F\big( P_0( \vec{x}) , P_1( \vec{x}) , \ldots, \big) \equiv 2^{n-m-i-1}\cdot \sigma $$ Given the initial claim $\sigma$, the prover produces the evaluations of $S(X_i)$ without having to perform any work. Moreover, binding the polynomials $P_0,P_1$ to $r_i$ has no effect, so we do not need to update the list of evaluations. Once the prover reaches rounds $n-m \leq i < n$, the protocol proceeds as usual. We note though that in the final phase, the prover sends the evaluations $v_0, v_1,\ldots$ of the polynomials $P_0, P_1,\ldots$ at $\vec{r} = (r_{n-m},\ldots, r_{n-1})$. ### Implementation In order to minimize the amount of changes to the existing `ppsnark.rs` code, we implemented the above technique by recreating a `prove_helper` method which handles batching of multiple instances of different sizes. Our changes are compatible with the existing Instances implementing the `SumcheckEngine` trait. The `prove_helper` function takes as input multiple `SumcheckEngine` implementations, each of which contains one or more independent claims. These claims hold over multilinear polynomials of the same size contained in the instance struct. It is assumed that all instances have degree 3. We let $n$ be the total number of Sumcheck rounds. Let $I = \{I_j\}_{j}$ be a set of instances, where $I_j$ is defined by - Number of variables $m_j$ - Multi-linear polynomials $\{P_{j,\ell}\}_{\ell}$ over variables $(X_{n-m_j}, \ldots, X_{n-1})$ - Initial claims $\vec{\sigma}_j = \{\sigma_{j,k}\}_{k}$. - Combination functions $\{F_{j,k}\}_{k}$ For each instance at index $j$ and each sub-claim at index $k$ it holds that $$ \sigma_{j,k} = \sum_{\vec{x} \in \{0,1\}^{m_j}} F_{j,k}\big( P_{j,0}(\vec{x}), P_{j,1}(\vec{x}), \ldots \big) $$ At the start of the protocol, the verifier sends a challenge $s$ used for the random linear combination of claims. For notational simplicity, we define $s_{j,k}$ as the power of $s$ corresponding to the claim $\sigma_{j,k}$. The full batched claim is given by $$ \sum_{j,k} s_{j,k} \cdot 2^{n-m_j} \cdot \sigma_{j,k} = \sum_{j,k} s_{j,k} \left(\sum_{\vec{x}' \in \{0,1\}^{n-m_j}} \sum_{\vec{x} \in \{0,1\}^{m_j}} F_{j,k}\big( P_{j,0}(\vec{x}), P_{j,1}(\vec{x}), \ldots \big)\right) $$ 1. Get $s \in \mathbb{F}$ from verifier 2. Set initial batched scaled claim $e_0 = \sum_{j,k} s_{j,k} \cdot 2^{n-m_j} \cdot \sigma_{j,k}$ 3. For rounds $0\leq i < n$ 1. Let $\vec{r} = (r_0,\ldots,r_{i-1})$ be the verifier challenges from the previous rounds 2. For each claim $k$ in instance $j$, compute the univariate polynomial $S^{(i)}_{j,k}(X_i)$ - If $m_j < n-i$, set $S^{(i)}_{j,k}(X_i) \equiv 2^{n-i-m_j-1}\cdot \sigma_{j,k}$ - Otherwise, compute evaluate it the usual way by evaluating the sum $\sum_{\vec{x} \in \{0,1\}^{n-i-1}} F_{j,k}\big(P_{j,0}(\vec{r}, X_i, \vec{x}), P_{j,1}(\vec{r}, X_i, \vec{x}), \ldots\big)$ at $X_i = 0, 2, 3$ 3. Compute the batched univariate polynomial $S^{(i)}(X_i) = \sum_{j,k} s_{j,k}\cdot S^{(i)}_{j,k}(X_i)$, noting that $S^{(i)}(1) = e_{i} - S^{(i)}(0)$ 4. Send $S^{(i)}(X_i)$ to the verifier and get challenge $r_i$ (the linear coefficient is removed to save on hashing costs) 5. Compute next batched claimed sum $e_{i+1} = S^{(i)}(r_i)$ 6. For each instance $j$ bind all polynomials $\{P_{j,\ell}\}_{\ell}$ to $X_i = r_i$ - If $m_j < n-i$, then $X_i$ is not a variable of the polynomial $P_{j,\ell}$ so do nothing - Else compute the $2^{n-i-1}$ evaluation of the multilinear polynomial $P_{j,\ell}(r_0,\ldots,r_i,X_{i+1}, \ldots, X_{n-1})$ in $n-i-1$ variables 4. At the end of the protocol, send to the verifier the evaluations $\{v_{j,\ell}\}_{j,\ell}$ for each polynomial at index $\ell$ of the instance $j$, where $v_{j,\ell} = P_{j,\ell}(r_{n-m_j},\ldots,r_{n-1})$ The `SumcheckProof::verify_batch` algorithm only considers the flattened list of claims 1. Sample batching challenge $s$ 2. Compute batched scaled claim $e_0 = \sum_{j,k} s_{j,k} \cdot 2^{n-m_j} \cdot \sigma_{j,k}$ 3. For rounds $0 \leq i < n$ 1. Get $S^{(i)}(X_i)$ from the prover by recomputing the linear term using the relation $S^{(i)}(0) + S^{(i)}(1) = e_{i}$ 2. Sample round challenge $r_i$ 3. Evaluate $e_{i+1} = S^{(i)}(r_i)$ 4. Receive evaluations $\{v_{j,\ell}\}$ from the prover - Note that for polynomials like $\mathsf{eq}$ or $\mathsf{pow}$ appearing in one of the claims at index $j$, the verifier computes them manually over $(r_{n-m_j},\ldots,r_{n-1})$ 5. Check that $e_n = \sum_{j,k} s_{j,k}\cdot F_{j,k}\big(v_{j,0},v_{j,1},\ldots \big)$ If the verifier accepts the final check, it must then check that all polynomial evaluations are correct. Note that this batching technique implemented in the ppSNARK's `prove_helper` method was also ported to the `prove_quad_batch` and `prove_cubic_with_additive_term_batch` methods of `spartan::sumcheck::SumcheckProof`. ## Batched PCS At the start of the protocol, the prover will have compute commitments to polynomials $\{P_{j, \ell}\}_{j,\ell}$ where $P_{j,\ell}$ is defined over $m_j$ variables. Again we let $n = \max_j\{m_j\}$ be the maximum number of variables. We assume the PCS supports opening multilinear polynomials over $n$ variables. It is represented in memory by its list of evaluations $[P_i]_{i=0}^{2^n-1}$ which yields the expression $$ P(X_0,\ldots,X_{n-1}) = \sum_{i = 0}^{2^n-1} P_i \cdot L_i(X_0,\ldots,X_n). $$ We commit to this polynomial by computing the MSM with the commitment key base points $[G_i]_{i=0}^{2^n-1}$. That is $$ C = \sum_{i=0}^{2^n-1} P_i \cdot G_i. $$ If a polynomial is defined over $m<n$ variables, it has $2^{m}$ evaluations $[P_i]_{i=0}^{2^m-1}$. When committing to it, we interpret the evaluations at indices $2^m \leq i < 2^n$ as zeros so that $$ C = \sum_{i=0}^{2^n-1} P_i\cdot G_i = \sum_{i=0}^{2^m-1} P_i\cdot G_i. $$ This interpretation allows us to compute the MSM using only $O(2^m)$ group operations. Effectively, we have committed to the polynomial $$ P'(X_0,\ldots,X_{n-1}) = L_0(X_0,\ldots,X_{n-m-1})\cdot \sum_{i=0}^{2^m-1} P_i \cdot L_i(X_{n-m}, \ldots,X_{n-1}). $$ During Sumcheck though, for polynomial defined over $m<n$ variables, we receive evaluations $$ v = P(r_{n-m},\ldots,r_{n-1}) = \sum_{i=0}^{2^m-1} P_i\cdot L_i(r_{n-m},\ldots,r_{n-1}). $$ In order to open $v$ using the commitment $C$, we need to transform $v$ into an evaluation $v'$ for the polynomial $P'$ $$ v' = L_0(r_0,\ldots,r_{n-m-1})\cdot v_j = P'(r_0,\ldots,r_{n-1}). $$ ### Implementation For each evaluation $v_{j,\ell}$ provided by the Sumcheck prover, both parties compute a batched PCS evaluation instance $u = (C, (r_0,\ldots,r_{n-1}), v)$ and the prover computes the witness polynomial $w = P(X_0,\ldots,X_{n-1})$, such that - $P(r_0,\ldots, r_{n-1}) = v$ - $\mathsf{Commit}(P) = C$ 1. Verifier samples batching challenge $c \in \mathbb{F}$, and both compute powers $\{s_{j,\ell}\}$ 2. For each instance $j$ - $v_{j,\ell}' = L_0(r_0,\ldots,r_{n-m_j-1})\cdot v_{j,\ell}$ 3. Compute batched instance values - $C = \sum_{j,\ell} s_{j,\ell} \cdot C_{j,\ell}$ - $v = \sum_{j,\ell} s_{j,\ell} \cdot v_{j,\ell}'$ 4. The prover computes $P(X_0,\ldots,X_{n-1}) = \sum_{j,\ell} s_{j,\ell} \cdot P'_{j,\ell}(X_0,\ldots,X_{n-1})$, where the individual polynomials $P'_{j,\ell}$ are interpreted as a list of the first $2^{m_j}$ evaluations over the $n$-dimensional hypercube. ## Batched Sumcheck PCS Reduction In the non-pre-processing SNARK, the verifier needs to verify the evaluations of multiple multi-linear polynomials at various points defined by the inner and outer Sumcheck instances. Unfortunately, the polynomial commitment schemes only support batching of evaluations for polynomials opened at the _same_ point. Since the creation of the opening argument is expensive, the original implementation uses a specialized Sumcheck argument to reduce each evaluation to an evaluation at a common point. All of these can be batched via random linear combination so that only a single PCS opening argument needs to be created. Let $n$ be the maximum number of variables over all polynomials being opened. Let $[((C_i, e_i, \vec{x_i}, n_i), P_i)]_i$ be a list of polynomial evaluation instances, where: - $P_i$ represents the MLE polynomial, interpolating values $[P_{i,j}]_{j=0}^{2^{n_i}-1}$ over $\{0,1\}^{m_i}$. - $C_i$ is a commitment to the polynomial $P$, such that $C_i = \sum_{j=0}^{2^{n_i}-1 } P_{i,j} \cdot G_i$ - $e_i$ is the evaluation of $P_i$ queried at $\vec{x_i} \in \mathbb{F}^{n_i}$, where $P_i$ is defined as $P_i(X_0,\ldots,X_{n_i-1}) = \sum_{i=0}^{2^{n_i}-1} P_{i,j} \cdot L_i(X_0,\ldots,X_{n_i-1})$ For each evaluation instance $i$, the prover and verifier run the batched Sumcheck to prove the claim $$ e_i = \sum_{\vec{x} \in \{0,1\}^{m_i}} \mathsf{eq}(\vec{x}_i, \vec{x}) \cdot P_i(\vec{x}). $$ The protocol proceeds as described in [[Compressed SNARK#Batched Sumcheck]], where all claims are appropriately scaled to $n$ variables. The output is a query point $\vec{r} \in \mathbb{F}^n$ and evaluations $[v_i = P_i(r_{n- m_i }, \ldots, r_{n-1})]_i$, to which we can apply the same technique as the [[Compressed SNARK#Batched PCS]]. ### Implementation The methods [`batch_eval_prove`](https://github.com/lurk-lab/arecibo/blob/5b7397eb7585278f8c9ed1a7dd6c71e0cdf1b4ea/src/spartan/snark.rs#L446)and [`batch_eval_verify`](https://github.com/lurk-lab/arecibo/blob/5b7397eb7585278f8c9ed1a7dd6c71e0cdf1b4ea/src/spartan/snark.rs#L520) represent sub-protocols that reduce the task of checking the validity $[(u_i, w_i)]_i = [((C_i, e_i, \vec{r_i}, n_i), P_i)]_i$, to only checking the opening of a single evaluation instance $(u,w) = ((C, v, \vec{r}, n); P')$. 1. The verifier samples challenge $\rho$ for the random linear combination of all Sumcheck claims defined above. 2. Both parties run the batched Sumcheck protocol resulting in $\vec{r} \in \mathbb{F}^n$ and evaluations $[v_i]_i$. 3. The verifier checks that $\sum_i \rho^i \cdot \mathsf{eq}(\vec{x}_i, \vec{r}) \cdot v_i$ evaluated to the final Sumcheck claim. 4. The verifier samples $\gamma$ and computes the new batched polynomial evaluation instance $(C, v, \vec{r}, n)$ - $C = \sum_i \gamma^i \cdot C_i$ - $v = \sum_i \gamma^i \cdot L_0(r_0,\ldots,r_{n-n_i-1}) \cdot v_i$ 5. The prover compute the corresponding polynomial $$ P(X_0,\ldots, X_{n-1}) = \sum_i \gamma^i \cdot L_0(X_0,\ldots,X_{n-n_i-1})\cdot P_i(X_{n_i-n},\ldots,X_{n-1}) $$ 6. Both parties engage in the PCS opening argument for the resulting instance. ## Bounded Witness Check In the pre-processing SNARK, the Sumcheck claims of each RelaxedR1CS instance are padded to $2^n = \max\{\#\text{constraints}, 2\cdot\#\text{variables}, |A|+|B|+|C|\}$. If we let $2^m = \#\text{variables}$, then the witness vector $W$ is of length $2^m$, though we pad it to $2^n$ by adding zeros. $$ W(X_0,\ldots,X_{n-1}) = \sum_{i=0}^{2^{m}-1} W_i \cdot L_i(X_0,\ldots,X_{n-1}) = L_0(X_0,\ldots,X_{n-m-1})\cdot \sum_{i=0}^{2^m-1} W_i \cdot L_i(X_{n-m},\ldots,X_{n-1}). $$ The public input $X$ is a list of at most $2^n$ elements, which we also consider as a MLE. These are appended to $W$ yielding the vector $Z = W || X \in 2^{m+1}$. Its MLE is given by: $$ Z(X_0,\ldots,X_{n-1}) = W(X_0,\ldots,X_{n-1}) + \sum_{i=0}^{2^m-1} X_i \cdot L_{2^{m}+i}(X_0,\ldots,X_{n-1}) $$ The prover provides a commitment to $W$ by the verifier uses the values of $X$ and computations of the Lagrange polynomials to derive an evaluation of $Z$. The evaluation of $W$ is performed over $n$ variables rather than the $m$ variables it is actually defined over. We need to ensure that $W$ does not "overflow" into the evaluation slots for $X$, which we do using a new Sumcheck instance using a special `MaskedEqPolynomial`. $$ \mathsf{eq}_{\leq m}(\vec{r}, i_0,\ldots,i_{n-1}) = \begin{cases} 0, & \text{if } i_0 = \cdots = i_{n-m-1} = 0 \\ \mathsf{eq}(\vec{r}, i_0,\ldots,i_{n-1}), &\text{otherwise} \end{cases} $$ This polynomial can be succinctly evaluated by noting that it is equal to $$ \mathsf{eq}(\vec{r}, \vec{X}) - \left(\prod_{i=0}^{n-m-1} (1-r_i)(1-X_i) \right)\cdot \left(\prod_{i=n-m}^{n-1} \mathsf{eq}(r_i,X_{i}) \right). $$ The Sumcheck instance we end up proving is $$ 0 = \sum_{\vec{x} \in \{0,1\}^n} \mathsf{eq}_{\leq m}(\vec{r}, \vec{x})\cdot W(\vec{x}). $$ By definition of $\mathsf{eq}_{\leq m}$, the sum equals 0 (with overwhelming probability) since in the first $2^m$ entries, $\mathsf{eq}_{\leq m}$ evaluates to 0 cancelling out the values of $W$, while it equals $\mathsf{eq}(\vec{r},\vec{x})$ in the remaining entries, forcing $W$ to be be 0. ### Implementation We include this check as a separate `SumcheckEngine` implementation, defined in `spartan::batched_ppsnark::WitnessBoundSumcheck`. The constructor takes as input a random element $r$, the maximum number of variables $m$, and the padded witness $W \in \mathbb{F}^{2^n}$. It computes the evaluations of $\mathsf{eq}_{\leq m}(\vec{r}, \vec{X})$, where $\vec{r} = (1, r^2, r^4,\ldots, r^{2^{n-1}})$. One of the benefits of including this new instance is that the Sumcheck prover will have computed the multi-linear evaluation of $W$, bypassing the need to compute it manually.