# Issue with public inputs
Let's assume our R1CS shape has $2^n$ variables, $\ell$ public inputs (including the initial 1, or $u$), and $2^m$ constraints, where $\ell < 2^n$. Take $N = \max\{2^{n+1}, 2^m\}$ as the size of the public parameters.
Our witness is the vector $W \in \mathbb{F}^{2^{n}}$, viewed as a multilinear polynomial $W(X_0,\ldots,X_{n-1})$ in $n$ variables. We define $P(X_0,\ldots,X_{n-1})$ as the MLE of the $\ell$ public inputs. From these, we construct the vector $Z \in \mathbb{F}^{2^{n+1}}$, whose MLE has $n+1$ variables:
$$
Z(X_0,X_1,\ldots,X_{n}) = (1-X_0)\cdot W(X_1,\ldots,X_{n}) + X_0\cdot P(X_1,\ldots,X_{n})
$$
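As a sanity check, this split can be verified numerically. The following Python sketch (over an arbitrary prime field, with illustrative names; this is not Nova's API) evaluates MLEs by successively folding variables, treating $X_0$ as the most significant bit:

```python
import random

P_MOD = 2**61 - 1  # any large prime works for this sketch

def mle_eval(evals, point, p=P_MOD):
    """Evaluate the MLE of `evals` (length 2^k) at `point` (length k).
    Variables are folded MSB-first: the first coordinate selects a half."""
    evals = list(evals)
    for r in point:
        half = len(evals) // 2
        evals = [((1 - r) * evals[i] + r * evals[half + i]) % p
                 for i in range(half)]
    return evals[0]

random.seed(0)
n, ell = 3, 4                      # 2^n witness entries, ell public inputs
W = [random.randrange(P_MOD) for _ in range(2**n)]
P_in = [random.randrange(P_MOD) for _ in range(ell)] + [0] * (2**n - ell)

Z = W + P_in                       # evaluation list of Z, length 2^(n+1)
r = [random.randrange(P_MOD) for _ in range(n + 1)]

# Z(r) = (1 - r_0) * W(r_1..r_n) + r_0 * P(r_1..r_n)
assert mle_eval(Z, r) == ((1 - r[0]) * mle_eval(W, r[1:])
                          + r[0] * mle_eval(P_in, r[1:])) % P_MOD
```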
Suppose $W,P$ form a valid solution, but the prover wants to use a malicious public input vector $\tilde{P} \in \mathbb{F}^\ell$.
We can craft a malicious $\tilde{W}$ as
$$
\tilde{W}(X_0,\ldots, X_{n}) = (1-X_0)\cdot W(X_1,\ldots,X_{n}) + X_0\cdot P(X_1,\ldots,X_{n}) - X_0 \cdot \tilde{P}(X_1,\ldots,X_{n})
$$
The evaluation list of $\tilde{W}$ is $W$ concatenated with $P - \tilde{P}$, for a total length of $2^{n}+\ell$. It is easy to commit to this polynomial, since our commitment key supports larger vectors.
Notice that the corresponding $\tilde{Z}$ is equal to $Z$, since the $\tilde{P}$ component cancels out. The prover then returns the malicious evaluation claim
$$
e_{\tilde{W}} = \frac{(1-r_0)\cdot e_W + r_0\cdot (e_P - e_{\tilde{P}})}{1-r_0}
$$
The verifier computes the evaluation of $Z$ to verify the Sumcheck
$$
e_Z = (1-r_0)\cdot e_{\tilde{W}} + r_0 \cdot e_{\tilde{P}} = (1-r_0)\cdot e_{W} + r_0 \cdot e_{P}
$$
such that the evaluation matches.
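The whole attack can be replayed concretely. The sketch below (illustrative Python over a toy prime field, not the actual implementation) builds the malicious $\tilde{W}$, derives $e_{\tilde{W}}$ from the formula above, and checks that both the verifier's consistency equation and the shifted opening of the committed $\tilde{W}$ accept:

```python
import random

P_MOD = 2**61 - 1  # illustrative prime field, not Nova's actual field

def mle_eval(evals, point, p=P_MOD):
    """Evaluate the MLE of `evals` (length 2^k) at `point` (length k),
    folding the first variable as the most significant bit."""
    evals = list(evals)
    for r in point:
        half = len(evals) // 2
        evals = [((1 - r) * evals[i] + r * evals[half + i]) % p
                 for i in range(half)]
    return evals[0]

random.seed(0)
n, ell = 3, 4
W     = [random.randrange(P_MOD) for _ in range(2**n)]
P_in  = [random.randrange(P_MOD) for _ in range(ell)] + [0] * (2**n - ell)
P_bad = [random.randrange(P_MOD) for _ in range(ell)] + [0] * (2**n - ell)

Z = W + P_in                                       # honest Z, length 2^(n+1)
# malicious witness: W concatenated with P - P_bad, zero-padded
W_tilde = W + [(a - b) % P_MOD for a, b in zip(P_in, P_bad)]

r = [random.randrange(P_MOD) for _ in range(n + 1)]
e_W  = mle_eval(W, r[1:])
e_P  = mle_eval(P_in, r[1:])
e_Pb = mle_eval(P_bad, r[1:])

inv = pow((1 - r[0]) % P_MOD, P_MOD - 2, P_MOD)    # (1 - r_0)^{-1}
e_Wt = ((1 - r[0]) * e_W + r[0] * (e_P - e_Pb)) % P_MOD * inv % P_MOD

# the verifier's Sumcheck consistency check passes with the bad inputs
assert ((1 - r[0]) * e_Wt + r[0] * e_Pb) % P_MOD == mle_eval(Z, r)
# and the shifted opening (1 - r_0) * e_Wt really is the MLE of the
# committed, oversized W_tilde at r, so a length-unaware PCS accepts it
assert mle_eval(W_tilde, r) == (1 - r[0]) * e_Wt % P_MOD
```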
If the PCS used to verify the evaluations from the Sumcheck does not enforce that $W$ has at most $2^{n}$ entries, and instead batches it with other evaluations by considering $W'(X_0,\ldots,X_n) = (1-X_0)\cdot W(X_1,\ldots,X_n)$ and shifting the evaluation by the first Lagrange polynomial, $e_W' = (1-r_0)\cdot e_W$, then the verifier will validate the opening of the malicious $e_{\tilde{W}}$.
## Why this attack works
The problem stems from the fact that the evaluation $e_W$ is sent to the verifier only after the verifier sends the Sumcheck challenge $r_0$. This allows an attacker to compute a malicious evaluation which passes the Sumcheck, and which is valid as an evaluation of a polynomial in $n+1$ variables.
In this particular case, there is no check ensuring that $|W| < 2^n$, which could be enforced either through the PCS, or via another Sumcheck invocation.
## Impact
In the SNARK without pre-processing, all evaluations from the outer and inner instances are batched by invoking another instance of the Sumcheck protocol. This latter instance applies padding to the $W$ and $E$ vectors to ensure the evaluations $e_W$ and $e_E$ are for the points $r_y[1..]$ and $r_x$ respectively. This acts as a degree check, as long as the PCS ensures that both vectors have length at most $\max\{2^{n}, 2^m\}$. Therefore, it is not affected by this attack.
For the pre-processing SNARK however, all Sumcheck instances are padded to $N = \max\{2^{m}, 2\cdot 2^{n}, |A| + |B| + |C|\}$, and batched together. Since there is no "evaluation-batching" Sumcheck instance at the end, there is no guarantee that $W$ is not defined to cancel out the public input, allowing an attacker to create a fraudulent proof for any public inputs of their choice.
## Fix
We describe three different ways of fixing this issue, each with a different cost.
### Opening $W$ at the correct point
The simplest approach is another invocation of the PCS opening protocol, in which $W \in \mathbb{F}^{2^{n}}$ is opened at the last $n$ challenges of $\vec{r}$ from the Sumcheck.
This is the solution implemented in the Nova repo in [PR 274](https://github.com/microsoft/Nova/pull/274) after we reported the issue. Unfortunately, it has a significant impact on prover/verifier time and proof size, since the PCS opening argument makes up a large portion of the proof's complexity.
### Batched PCS Reduction with Sumcheck
We can solve the issue in the same way as in the non-pre-processing SNARK: use another Sumcheck invocation to batch all PCS opening queries at different points, reducing them to queries at a single common point defined by the Sumcheck randomness.
Computationally this is cheaper for the prover since there are no additional commitments involved. It does however increase the proof size by $O(\log{N})$ field elements, and the verifier will need to compute as many additional hashes for Fiat-Shamir.
### Bounded Witness Sumcheck
The witness vector $W$ is used as part of the `MemorySumcheck`, which is defined over vectors of size $N$. Therefore, we need to ensure that $W$ is zero over the range $[2^n, \ldots, N-1]$, so that it does not overlap with the public inputs $X$ it is concatenated with.
This approach adds another claim to the batched Sumcheck instance, checking $W$ directly rather than checking the evaluation of $W$ produced by the Sumcheck.
The main idea is to prove that a random linear combination of the elements of $W$ in the range $[2^n,\ldots,N-1]$ equals 0. To do so, we construct a special $\mathsf{eq}$ polynomial over $\log N$ variables, which equals 0 in the first $2^n$ evaluations, but corresponds to the usual definition on the other points.
Its exact definition is given by the multilinear polynomial $\mathsf{eq}_{\leq \log N}(\vec{\tau}, \vec{X})$ over $\log N$ variables, using a random point $\vec{\tau} \in \mathbb{F}^{\log N}$:
$$
\mathsf{eq}_{\leq \log N}(\vec{\tau}, \vec{X}) = \mathsf{eq}(\vec{\tau}, \vec{X}) - \left(\prod_{i=0}^{\log N - n -1} (1-\tau_i)(1-X_i)\right)\left(\prod_{i=\log N - n}^{\log N -1} \big((1-\tau_i)(1-X_i) + \tau_i\cdot X_i\big)\right)
$$
It is easy to check that this polynomial vanishes when $\vec{X} = (0,\ldots,0,X_{\log{N}-n},\ldots,X_{\log{N}-1})$, corresponding to the first $2^n$ points of the boolean hypercube. When any of the first $\log N - n$ entries is 1, the subtracted term vanishes and the polynomial equals $\mathsf{eq}(\vec{\tau}, \vec{X})$.
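Both properties can be confirmed with a small numeric check. The Python sketch below (toy parameters $\log N = 4$, $n = 2$; all names are illustrative) evaluates the polynomial on every point of the hypercube:

```python
import random

p = 2**61 - 1                          # illustrative prime field
logN, n = 4, 2                         # toy sizes: N = 16, 2^n = 4
random.seed(1)
tau = [random.randrange(p) for _ in range(logN)]

def bits(j, k):
    # MSB-first bit decomposition, matching i = sum_k i_k * 2^(n-k-1)
    return [(j >> (k - 1 - i)) & 1 for i in range(k)]

def eq_eval(tau, x):
    out = 1
    for t, xi in zip(tau, x):
        out = out * ((1 - t) * (1 - xi) + t * xi) % p
    return out

def eq_bounded(tau, x):
    # eq_{<= log N}: eq minus the term covering the first 2^n points
    first = 1
    for i in range(logN - n):
        first = first * ((1 - tau[i]) * (1 - x[i])) % p
    second = 1
    for i in range(logN - n, logN):
        second = second * ((1 - tau[i]) * (1 - x[i]) + tau[i] * x[i]) % p
    return (eq_eval(tau, x) - first * second) % p

for j in range(2**logN):
    x = bits(j, logN)
    if j < 2**n:                       # first 2^n points: it vanishes
        assert eq_bounded(tau, x) == 0
    else:                              # elsewhere it agrees with eq
        assert eq_bounded(tau, x) == eq_eval(tau, x)
```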
The memory Sumcheck instance is required to consider its input polynomials over $\log(N)$ variables. Currently it uses the vector $Z$ directly, and we run into the issue described above.
We would like to ensure that $|W| < 2^n$. To do so, we need to add the following claim to the instance:
$$
0 = \sum_{\vec{x} \in \{ 0,1 \}^{\log N}} \mathsf{eq}_{\leq \log N}(\vec{\tau}, \vec{x}) \cdot W(\vec{x}) = \sum_{\substack{\vec{x} \in \{ 0,1 \}^{\log N} \\ (x_0,\ldots,x_{\log N - n -1}) \neq \vec{0}}} \mathsf{eq}(\vec{\tau}, \vec{x}) \cdot W(\vec{x}).
$$
The right-hand side shows that we are indeed proving that a random linear combination of $W[2^n,\ldots, N-1]$ is 0.
This approach does not increase the proof size, and only marginally increases the prover/verifier costs, since the claim is proved within the existing batched Sumcheck instance.
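The identity above can also be checked numerically. In the sketch below (illustrative Python with toy parameters, not the actual implementation), the sum over the full hypercube matches the sum over the tail, and vanishes for an honest witness that is zero beyond index $2^n$:

```python
import random

p = 2**61 - 1                          # illustrative prime field
logN, n = 4, 2                         # toy sizes: N = 16, 2^n = 4
random.seed(1)
tau = [random.randrange(p) for _ in range(logN)]
W = [random.randrange(p) for _ in range(2**logN)]  # an over-long "witness"

def bits(j, k):
    return [(j >> (k - 1 - i)) & 1 for i in range(k)]

def eq_eval(tau, x):
    out = 1
    for t, xi in zip(tau, x):
        out = out * ((1 - t) * (1 - xi) + t * xi) % p
    return out

def eq_bounded(tau, x):
    # eq_{<= log N}: eq minus the term covering the first 2^n points
    first = 1
    for i in range(logN - n):
        first = first * ((1 - tau[i]) * (1 - x[i])) % p
    second = 1
    for i in range(logN - n, logN):
        second = second * ((1 - tau[i]) * (1 - x[i]) + tau[i] * x[i]) % p
    return (eq_eval(tau, x) - first * second) % p

# left-hand side: full hypercube; right-hand side: tail indices only
lhs = sum(eq_bounded(tau, bits(j, logN)) * W[j] for j in range(2**logN)) % p
rhs = sum(eq_eval(tau, bits(j, logN)) * W[j]
          for j in range(2**n, 2**logN)) % p
assert lhs == rhs

# an honest witness, zero beyond index 2^n, makes the claim vanish
W_ok = W[:2**n] + [0] * (2**logN - 2**n)
assert sum(eq_bounded(tau, bits(j, logN)) * W_ok[j]
           for j in range(2**logN)) % p == 0
```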

Multiple Sumcheck claims can be batched together by running the protocol over a random-linear combination of the input claims. The existing implementation of ppSNARK already includes this functionality, though we augmented it to handle instances defined over fewer variables.
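The batching step can be illustrated as follows. In this sketch, a claim over fewer variables is lifted to the larger hypercube by zero-padding, which preserves its sum; this is only one possible way to handle smaller instances, and may not match the implementation's exact approach:

```python
import random

p = 2**61 - 1                          # illustrative prime field
random.seed(2)

# two Sumcheck claims over hypercubes of different sizes (3 and 2 variables)
f = [random.randrange(p) for _ in range(8)]
g = [random.randrange(p) for _ in range(4)]
T_f, T_g = sum(f) % p, sum(g) % p

rho = random.randrange(p)              # verifier's batching challenge
g_lift = g + [0] * (len(f) - len(g))   # zero-pad g to the larger hypercube
batched = [(f[i] + rho * g_lift[i]) % p for i in range(len(f))]

# the batched sum equals the random linear combination of the claims
assert sum(batched) % p == (T_f + rho * T_g) % p
```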
