UltraGroth

Challenge rounds in Groth16-like system

This is a continuation and clarification of my previous post which was a very rough description of a possible scheme of adding lookups (and any other challenge arguments) to Groth16.

It generated some traction and comments. Mainly I would like to thank Weikeng Chen who gave me some references on similar approaches (based on LegoSNARK), and confirmed that it should be possible to get rid of all interfacing between different argument systems (which loosely works similar to Pinoccio), instead using a singular Groth16-styled equation.

It seems that it is, indeed, possible. I describe the protocol here, and provide a sketch of proof.

Let's recall normal Groth16 protocol, first

This turned out to be quite long, tbh. Can be skipped, but it makes sense to look on zero-knowledgness / soundness proofs here to make sense of my arguments for the UltraGroth.

R1CS In normal Groth16, we start from a rank 1 constraint system, R1CS:

LwRw=Ow

Here,

L,R,O are matrices of size
m×n
, where
m
is a size of a witness vector
wi
, and
n
is the amount of constraints, and
is a Hadamard product.

Individual constraints, thus, correspond to the rows of the matrix:

LiwRiw=Oiw
(jLjiwj)(jRjiwj)=jOjiwj

Some subset of indices of the witness, is also called "public inputs", and will be exposed by the proof, and all others are private.

Now, we map the set of constraints to some set

S in the scalar field of a pairing-friendly curve - I will denote
xi
the point corresponding to the constraint
i
.
Z(x)=i(xxi)
denotes the vanishing polynomial of this set, and we also define
Lj(x),Rj(x),Oj(x)
by their Lagrange interpolation of the values over the set:
Lj(xi)=Lji
(and the same for
R,O
).

Then, the R1CS problem can be reformulated as finding such witness vector

wj that
(jwjLj(x))(jRj(x))(jOj(x))
vanishes on
S
, <=> divisble by
Z(x)
, <=> there exists such
H(x)
(of degree at most
n
) that:
L(x)R(x)O(x)=Z(x)H(x)

where
L=jwjLj(x)

R=jwjRj(x)

O=jwjOj(x)

Groth16 argument

Now, lets recall how this problem is turned into a zk-proof.

We have the following toxic waste elements:

(α,β,γ,δ,τ), and the reference string is computed as follows:

[α]1,[β]1,[β]2,[γ]2,[δ]1,[δ]2,
for
0kn:
Zk=[τkZ(τ)δ]1
< these will allow us to compute
[H(τ)Z(τ)δ]1

Aj=[Lj(τ)]1,Bj=[Rj(τ)]2
,
Bj=[Rj(τ)]1

for
jpriv:
Cj=[Oj(τ)+βLj(τ)+αRj(τ)δ]1

for
jpub:
Cj=[Oj(τ)+βLj(τ)+αRj(τ)γ]1

[]

The Groth16 prover (given the solution to R1CS in the polynomial form described above) creates two random blinding scalars

r,s and computes the following:

A=[α]1+jwjAj+r[δ]1
B=[β]2+jwjBj+s[δ]2
,
B=[β]1+jwjBj+s[δ]1

C=jprivwjCj+hkZk+rB+sArs[δ]1

and

IC is computed from public inputs by both sides:

IC=jpubwjCj

And the verifier does the following check:

A,B=[α]1,[β]2+C,[δ]2+IC,[γ]2

Why this holds for a valid R1CS instance

Let's see why this works:

First, unwind this check to see that it passes. I will calculate LHS and RHS (and abuse notation a bit to calculate by omitting casting it to multiplication group

[]m everywhere). I will also use "A, B, C" as shorthands of their dlogs as scalars.

LHS:

AB=(α+jwjLj(τ)+rδ)(β+jwjRj(τ)+sδ)

RHS:

αβ+Cδ+ICγ=αβ+jwj(Oj(τ)+βLj(τ)+αRj(τ))+H(τ)Z(τ)+rδB+sδArsδ2

Now, small modification of LHS makes clear that all terms with

δ will cancel out:

LHS:

AB=sAδ+rBδrsδ2+(α+jwjLj(τ))(β+jwjRj(τ)).

The rest is direct inspection.

Zero-knowledge

Now, let's recall why this is perfectly zero-knowledge, and why this is computationally sound in algebraic group model.

Zero-knowledge: it is clear that

A,B are uniformly distributed (because prover adds up random uniformly distributed elements
r[δ]1,s[δ]2
).
C
is uniquely defined if other variables are fixed (because it is a linear equation), and, therefore, the probability distribution of proofs on a space of solutions of this equation is uniformly random.

Note: even a stronger fascinating property, re-randomizability, holds. It means that having a correct proof triple (A, B, C) one can construct a new proof (A', B', C') uniformly randomly distributed in the space of all proofs, without knowing the original witness. This property will, sadly, be broken in my extended UltraGroth protocol.

Soundness in Algebraic Group Model

Soundness in AGM: I won't go into details of AGM, but this has the following intuition: imagine that

α,β,γ,δ,τ are variables (i.e. we now calculate in a rational function ring depending on this variables). The adversary then tries to construct a proof, with

A and
C
being linear combination of expressions
α,β,δ,τkZ(τ)δ,Lj(τ),Rj(τ)
, for
jpriv:
Oj(τ)+βLj(τ)+αRj(τ)δ
, for
jpub:
Oj(τ)+αRj(τ)+βLj(τ)γ

and

B being linear combination of
β,δ,γ,Rj(τ)

And let's assume they succeed, i.e. the expressions satisfy

AB=αβ+Cδ+ICγ

First, let's notice that

A must contain
α
and
B
must contain
β
. Moreover, let's wlog rescale them in such a way that both of these are with coefficient
1
.

Ocurrence of

β in
A
is then prohibited due to the fact that
β2
can not occur on RHS.

Now, let's see that in

A there should be no terms with
δ
or
γ
in denominator. Indeed, if there were, their products by
β
would not be cancelled by anything in LHS, and neither in RHS (because RHS doesn't have a pole at
δ=0
).

Let us also notice that occurence of

γ in
B
is unwelcome because
γα
can not occure in RHS.

Therefore, we are in a situation where

A=α+ajLj(τ)+qjRj(τ)+rδ and
B=β+bjRj(τ)+sδ
. We have neither shown that
ajbj
yet, nor
qj=0
, nor stated anything about
C
.

WLOG, lets modify

A and
B
by removing
rδ
and
sδ
, and
C
by subtracting
sA
,
rB
and adding
rsδ
. We have successfully exiled
δ
from LHS, now it has the form:

A=α+ajLj(τ)
B=β+bjRj(τ)+qjLj(τ)

In this form, appeareance of any nonzero

Lj(τ),
Rj(τ)
or
α
or
β
in
C
is strictly prohibited - it will lead to occurence of the uncancelled term divisible by
δ
in RHS.

The terms with

γ in numerator are also banned in
C
for trivial reasons.

Hence,

C has the following form:

C=jprivcjOj(τ)+βLj(τ)+αRj(τ)δ+hkZ(τ)τkδ.

Wlog, assume that there is no linear dependencies between nonzero

Lj(τ)-s and
Rj(τ)
-s (separately). If there were, we could use them to make some of the coefficients vanish, and proceed in such assumption.

Assume also wlog that in such form no linear combination of

Lj(τ)-s occuring in
B
can be expressed as a sum of
Rj
-s - if it was, make such substitution in
B
to increase the amount of vanishing coefficients, and proceed.

Now, we want to show that after such transformations

qj=0, and
aj=bj=cj
. Indeed, grabbing all terms with
β
, we see that RHS contains
cjβLj(τ)
and LHS contains
ajLj(τ)
, which implies
aj=cj
using the fact that we don't have non-trivial linear combinations.

Similarly, all terms with

α yield in RHS contains
cjαRj(τ)
, and LHS contains
bjαRj(τ)+qjαLj(τ)
. Due to our assumptions on absence of linear combinations, all
qj
are forced to vanish,
bj=cj
.

UltraGroth argument

Assume, that our R1CS circuit is endowed with the additional structure. I.e., private section of the set of indices

1..n is separated into
d+1
parts,
round0
, ,
roundd
. Additionally, some public inputs are called "challenges", and the set of challenges is separated into
d
parts. The vector space
Ck
will denote the space of challenges in round
k
.

Let's define the space

Vk as space of vectors with indices of nontrivial coeffs inside of
roundk
. The "old" witness then lives in
kVkpublic inputs
.

We define

k+1-st round strategy as a function
Fk:Ck×ikVkVk+1

Strategy of

0-th round is just a vector in
V0
.

We will call a full witness the set of such strategies, for each round, which, while being used successively, succeed with overwhelming probability for random challenges.

Definition: Algebraic witness is a solution of R1CS over the field of rational functions on the space of challenges

Fq(C), such that round k+1 witness depends only on challenges from rounds up to k.

Algebraic witness trivially implies the aforementioned strategy.

It is useful to think about algebraic witnessess, but likely not so useful to work with them in practice - computations in the field of rational functions are not that friendly. I believe the mentioned above general definition in terms of strategies is a more adequate abstractions; i.e. in practice the witness to the circuit will be a bunch of functions executing some computations on the already computed part of the witness vector.


Now, we will emulate the challenge responses of the verifier, using Fiat-Shamir heuristic. Let's construct the following toxic waste:

α,β,γ,δ0,...,δd,τ

and obtain the following reference string:
all as before, with

δ's known in both G1 and G2, and
Cj
for private indices being computed as for
jroundk:
Cj=Oj(τ)+βLj(τ)+αRj(τ)δk
.

Now, for each round but the last, lets assume that the prover also exposes some point

C(k) to the verifier, and honest prover puts
C(k)=jroundkwjCj+rk[δd]1
, with
rk
being the blinding factor.

Verifier, then, sends the set of challenge inputs

wj|jchallk+1 by setting
acck+1=Hash(acck,C(k))
, and
wj=Hash(j,acck+1)
.

!! Note: it is important that we can not just put challenge to be the hash of j and C^(k), without running the risk of prover retroactively rewinding previous rounds without changing C^(k), which might be possible in some cases.

After we have run such process non-interactively, we have obtained the full witness vector

w. We then do the same computation as in normal R1CS to obtain
H(x)
.

Honest prover, then, picks two more random values

r,s, and sets the following:

A=[α]1+jwjAj+rδd
B=[β]2+jwjBj+sδd

IC=jpubwjCj
- here as everywhere pub inputs include challenges
for k < d as before
C(k)=jroundkwjCj+rk[δd]1

C(d)=jrounddwjCj+snhsZs+sA+rBk <drk[δk]1rs[δd]1

Then, the following equation (*) holds:

A,B=α,β+IC,[γ]2+kdC(k),[δk]2

Full verification algorithm checks this equation, correctness of public input and correctness of challenges.

Why this equation holds

It is, in fact, a direct inspection. Only thing that sufficiently differs from normal Groth16 is the blinding factors. Let's see how it works: the factor

rkδd from
C(k)
is multiplied by
δk
, which is exactly cancelled by the term
rkδkδd
coming from
C(d)
. The remaining blinding factors look exactly as in Groth16.

Zero knowledge

I claim that the honest verifier produces an uniformly random proof. Reason for this is the fact that

A,B and
C(k)
for k<d are all equidistributed, and
C(d)
is then unique.

Soundness?

I feel a bit shaky here. Anyways, let's proceed and work in AGM + ROM for the hash. Then, we first prove the following lemma:

Lemma: Any solution to equation (*) is a valid solution to R1CS, and

Ck's are binding commitments to the round witnesses.

The fact that solution to this equation in AGM leads to R1CS can be done with the following trick. Solution in AGM is solution in terms of some polynomials (i.e. treating toxic waste as variables). Let's substitute in this solution

δk=δ,k.
It is also easy to see that
C(k)
are not allowed to contain any
Cj
's from different rounds as this leads to the uncancellable
δkδs
factor (it can not come from RHS because
A
does no admit terms with nontrivial denominator, similar to the proof for normal Groth16). They might also contain some other terms, but because we don't know any linear dependencies between them, it is still a binding commitment.
.

Therefore, interactive protocol of communication with verifier (which we emulate by Fiat-Shamir) can be represented as follows: prover is providing some sequence of commitments to parts of the witness, and then generates a valid zk-proof with witness it had committed to.

I am not sure if I messed up here somewhere, requires proofread / review.