Try   HackMD

Issue with public inputs

Let's assume our R1CS shape has

2n variables,
public inputs (including the initial 1, or
u
), and
2m
constraints, where
<2n
. Take
N=max{2n+1,2m}
as the size of the public parameters.

Our witness is the vector

WF2n considered as a multilinear polynomial in
n
variables
W(X0,,Xn1)
. We define
P(X0,,Xn1)
as the MLE of the
public inputs. From this, we construct the
ZF2n+1
vector where its MLE has
n+1
variables
Z(X0,X1,,Xn)=(1X0)W(X1,,Xn)+X0P(X1,,Xn)

Consider that

W,P are valid solutions, but the prover wants to use a malicious output
P~F
.
We can craft a malicious
W~
as
W~(X0,,Xn)=(1X0)W(X1,,Xn)+X0P(X1,,Xn)X0P~(X1,,Xn)

The evaluation list of
W~
is represented as
W
concatenated with
PP~
, for a length of
2n+
. It's easy to commit to this polynomial since our commitment key supports larger vectors.

Notice that the corresponding

Z~ is equal to
Z
since the
P~
component cancels out. The prover returns the malicious evaluation of
W

eW~=(1r0)eW+r0(ePeP~)1r0

The verifier computes the evaluation of
Z
to verify the Sumcheck
eZ=(1r0)eW~+r0eP~=(1r0)eW+r0eP

such that the evaluation matches.

If the PCS verifying the evaluations from Sumcheck does not enforce that

W has at most
2n
values, and instead batches it with other evaluations by considering
W(X0,,Xn)=(1X0)W(X1,,Xn)
and shifting the evaluation by the first Lagrange polynomial
eW=(1r0)eW
, then the verifier will validate the opening of the malicious
eW~
.

Why this attack works

The problem comes from the fact that the evaluation

eW is sent to the verifier after the verifier sends the Sumcheck challenge
r0
. This allows an attacker to compute the malicious evaluation which will pass the Sumcheck, and will be valid as an evaluation of a polynomials with
2n+1
variables.

In this particular case, there is no check ensuring that

|W|<2n, which could be enforced either through the PCS, or via another Sumcheck invocation.

Impact

In the SNARK without pre-processing, all evaluations from the outer and inner instances are batched by invoking another instance of the Sumcheck protocol. This latter instance applies padding to the

W and
E
vectors to ensure the evaluations
eW
and
eE
are for the points
ry[1..]
and
rx
respectively. This acts as a degree check, as long as the PCS ensures that both vectors have length at most
max{2n,2m}
. Therefore, it is not affected by this attack.

For the pre-processing SNARK however, all Sumcheck instances are padded to

N=max{2m,22n,|A|+|B|+|C|}, and batched together. Since there is no "evaluation-batching" Sumcheck instance at the end, there is no guarantee that
W
is not defined to cancel out the public input, allowing an attacker to create a fraudulent proof for any public inputs of their choice.

Fix

We describe 3 different way of fixing this issue, each with a different cost.

Opening
W
at the correct point

The simplest approach to solving the problem is to use another invocation of the PCS opening protocol, where

W2n is opened at the last
n
challenges of
r
from the Sumcheck.

This is the solution implemented in the Nova repo in PR 274after we report the issue. Unfortunately, it has a big impact on performance in terms of prover/verifier time and proof size, since the PCS opening arguments makes up a large portion of the proof complexity.

Batched PCS Reduction with Sumcheck

We can solve the issue in the same way as is done with the non pre-processing SNARK, where we use another Sumcheck invocation to batch all PCS opening queries at different points, and reduce them to queries at the same common point defined by the Sumcheck randomness.

Computationally this is cheaper for the prover since there are no additional commitments involved. It does however increase the proof size by

O(logN) field elements, and the verifier will need to compute as many additional hashes for Fiat-Shamir.

Bounded Witness Sumcheck

The witness vector

W is used as part of the MemorySumcheck which is defined over vectors of size
N
. It is defined as Therefore, we need to ensure that
W
is zero in the range
[2n,,N1]
, so that the concatenation with public inputs
X
does not overlap.

This approach involves adding another claim to the batched Sumcheck instead and checking

W directly, rather than checking its evaluation produced by Sumcheck.

The main idea is to prove that a random linear combination of the elements of

W in the range
[2n,,N1]
equals 0. To do so, we construct a special
eq
polynomial over
logN
variables, which equals 0 in the first
2n
evaluations, but corresponds to the usual definition on the other points.

It's exact definition is given by the multi-linear polynomial

eqlogN(τ,X) over
logN
variables, using a random point
τFlogN
:

eq(τ,X)(i=0logNn1(1τi)(1Xi))(i=logNnlogN1(1τi)(1Xi)+τiXi)
It is easy to check that this polynomial vanishes when
X=(0,,0,XlogNn,,XlogN1)
, corresponding the first
2n
points of the boolean hypercube. When any of the first
logNn
entries are 1, then the second component vanishes and the polynomial is equal to
eq(τ,X)
.

The memory Sumcheck instance is required to consider its input polynomials over

log(N) variables. Currently it uses the vector
Z
directly, and we run into the issue described above.
We would like to ensure that
|W|<2n
. To do so, we need to add the following claim to the instance:
0=x{0,1}log(N)eqlogN(τ,x)W(x)=(x0,,xlogNn1)x{0,1}log(N)0eq(τ,x)W(x).

The right-hand side shows that we are indeed proving that a random linear combination of

W[2n,,N] is 0.

This approach does not increase the proofs size, and marginally increases the prover/verifier costs since the claim is proved using the existing batched Sumcheck instance.