There is no commentSelect some text and then click Comment, or simply add a comment to this page from below to start a discussion.
Issue with public inputs
Let's assume our R1CS shape has variables, public inputs (including the initial 1, or ), and constraints, where . Take as the size of the public parameters.
Our witness is the vector considered as a multilinear polynomial in variables . We define as the MLE of the public inputs. From this, we construct the vector where its MLE has variables
Consider that are valid solutions, but the prover wants to use a malicious output . We can craft a malicious as The evaluation list of is represented as concatenated with , for a length of . It's easy to commit to this polynomial since our commitment key supports larger vectors.
Notice that the corresponding is equal to since the component cancels out. The prover returns the malicious evaluation of The verifier computes the evaluation of to verify the Sumcheck such that the evaluation matches.
If the PCS verifying the evaluations from Sumcheck does not enforce that has at most values, and instead batches it with other evaluations by considering and shifting the evaluation by the first Lagrange polynomial , then the verifier will validate the opening of the malicious .
Why this attack works
The problem comes from the fact that the evaluation is sent to the verifier after the verifier sends the Sumcheck challenge . This allows an attacker to compute the malicious evaluation which will pass the Sumcheck, and will be valid as an evaluation of a polynomials with variables.
In this particular case, there is no check ensuring that , which could be enforced either through the PCS, or via another Sumcheck invocation.
Impact
In the SNARK without pre-processing, all evaluations from the outer and inner instances are batched by invoking another instance of the Sumcheck protocol. This latter instance applies padding to the and vectors to ensure the evaluations and are for the points and respectively. This acts as a degree check, as long as the PCS ensures that both vectors have length at most . Therefore, it is not affected by this attack.
For the pre-processing SNARK however, all Sumcheck instances are padded to , and batched together. Since there is no "evaluation-batching" Sumcheck instance at the end, there is no guarantee that is not defined to cancel out the public input, allowing an attacker to create a fraudulent proof for any public inputs of their choice.
Fix
We describe 3 different way of fixing this issue, each with a different cost.
Opening at the correct point
The simplest approach to solving the problem is to use another invocation of the PCS opening protocol, where is opened at the last challenges of from the Sumcheck.
This is the solution implemented in the Nova repo in PR 274after we report the issue. Unfortunately, it has a big impact on performance in terms of prover/verifier time and proof size, since the PCS opening arguments makes up a large portion of the proof complexity.
Batched PCS Reduction with Sumcheck
We can solve the issue in the same way as is done with the non pre-processing SNARK, where we use another Sumcheck invocation to batch all PCS opening queries at different points, and reduce them to queries at the same common point defined by the Sumcheck randomness.
Computationally this is cheaper for the prover since there are no additional commitments involved. It does however increase the proof size by field elements, and the verifier will need to compute as many additional hashes for Fiat-Shamir.
Bounded Witness Sumcheck
The witness vector is used as part of the MemorySumcheck which is defined over vectors of size . It is defined as Therefore, we need to ensure that is zero in the range , so that the concatenation with public inputs does not overlap.
This approach involves adding another claim to the batched Sumcheck instead and checking directly, rather than checking its evaluation produced by Sumcheck.
The main idea is to prove that a random linear combination of the elements of in the range equals 0. To do so, we construct a special polynomial over variables, which equals 0 in the first evaluations, but corresponds to the usual definition on the other points.
It's exact definition is given by the multi-linear polynomial over variables, using a random point :
It is easy to check that this polynomial vanishes when , corresponding the first points of the boolean hypercube. When any of the first entries are 1, then the second component vanishes and the polynomial is equal to .
The memory Sumcheck instance is required to consider its input polynomials over variables. Currently it uses the vector directly, and we run into the issue described above. We would like to ensure that . To do so, we need to add the following claim to the instance:
The right-hand side shows that we are indeed proving that a random linear combination of is 0.
This approach does not increase the proofs size, and marginally increases the prover/verifier costs since the claim is proved using the existing batched Sumcheck instance.