
Current system.

  1. We can do a sumcheck of the form

     $(P \wedge Q)(x) \, eq(x; q)$.

  2. Formally, $(P \wedge Q)(x)$ is defined as $\sum_i b_i P_i(x) Q_i(x)$, where $P_i, Q_i$ are the coordinate polynomials of $P, Q$. $P_i$ and $Q_i$ can be expressed through Frobenius twists of $P$ and $Q$, but this is only needed by the verifier at the last stage, so the prover never keeps the twists of $P$ and $Q$.

    2.1. The prover does it in 2 phases. The 1st phase, taking $c$ rounds, works roughly normally, but operates over the quadratic polynomial $P \wedge Q$, computed on the subset $\{0, 1, \infty\}^{c+1} \times \{0, 1\}^{n-c-1}$.

    On the $i$-th round, $i \leq c$, it keeps the restriction $(P \wedge Q)(r_0, .., r_{i-1}, x_i, ..., x_{n-1})$. When it needs to answer the query - i.e. send the univariate polynomial

    $U(t) = \sum_{x_{>i}} (P \wedge Q)(r_{<i}, t, x_{>i}) \, eq(r_{<i}, t, x_{>i}; q)$

    it exploits the fact that $eq(r_{<i}, t, x_{>i}; q)$ splits as

    $eq(r_{<i}, t, x_{>i}; q) = eq(r_{<i}; q_{<i}) \, eq(t; q_i) \, eq(x_{>i}; q_{>i})$

    This allows us to avoid manipulating polynomials of degree 3 directly until the last stage - the $eq(r_{<i}; q_{<i}) \, eq(t; q_i)$ terms do not depend on $x_{>i}$, and so can be evicted from the sum. I.e. we compute the sum

    $W(t) = \sum_{x_{>i}} (P \wedge Q)(r_{<i}, t, x_{>i}) \, eq(x_{>i}; q_{>i})$

    and then adjust it as

    $U(t) = eq(r_{<i}; q_{<i}) \, eq(t; q_i) \, W(t)$. This last stage is done in coefficient form, and the summation itself is done in evaluation form - i.e. it is just 3 sums

    $W(t) = \sum_{x_{>i}} (P \wedge Q)(r_{<i}, t, x_{>i}) \, eq(x_{>i}; q_{>i})$

    for $t = 0, 1, \infty$ - which we can compute because we have the evaluations of $(P \wedge Q)(r_0, .., r_{i-1}, t, x_{i+1}, ..., x_{n-1})$ for $t = 0, 1, \infty$.

    2.2. After $c$ rounds, the prover no longer has the required data (as we have computed the extension only on the $3^{c+1} \times 2^{n-c-1}$ subset), and can not proceed further. This is when we change strategy, and compute $P_i(r_0, ..., r_c, x_{>c})$ and $Q_i(r_0, ..., r_c, x_{>c})$ (using the RESTRICT functionality).

    2.3. Optimality considerations. The first phase requires work roughly proportional to $(3/2)^{c+1} N$. The second phase requires work roughly proportional to $128 \cdot 2^{-c} N$. By trial and error, I have found that the optimal inflection point is $c = 5$; however, $c = 4$ and $c = 6$ are somewhat close.

  3. A normal sumcheck would end with the prover's claims about the values $P_i(r), Q_i(r)$. We employ an additional argument to convince the verifier of these values. Namely, for any number $a$, its $i$-th coordinate $a_i$ satisfies $a_i = \pi(\gamma_i a)$, where $\pi(x) = \sum_{0 \leq j < 128} Fr^j(x)$, and $\gamma_i$ is the dual basis to our standard basis $b_i$ w.r.t. the bilinear form $\langle x, y \rangle = \pi(xy) = (xy)_0$.

Therefore, our prover actually communicates the values $(Fr^j P)(r)$, and the verifier locally recovers the intended $P_i(r)$-s.

Moreover, as we assume that the verifier actually only has access to $P$, we recall that $(Fr^j P)(r) = Fr^j(P(Fr^{-j}(r)))$, and thus this is equivalent to querying the polynomial $P$ in an inverse Frobenius orbit of $r$.
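The eq-factoring trick from 2.1 can be sanity-checked on a toy instance. Below is a minimal Python sketch, under simplifying assumptions of ours: we work over $\mathbb{F}_{2^8}$ (AES reduction polynomial) instead of $\mathbb{F}_{2^{128}}$, and an arbitrary product $A \cdot B$ of two multilinear tables stands in for $P \wedge Q$; all names are illustrative, this is not the real prover.

```python
import random

IRR = 0x11B  # x^8 + x^4 + x^3 + x + 1: F_{2^8} stands in for F_{2^128}

def mul(a, b):  # carry-less multiplication modulo IRR
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= IRR
        b >>= 1
    return r

def mle_eval(table, pt):  # evaluate a multilinear table, folding one variable at a time
    for ri in pt:
        h = len(table) // 2
        table = [table[j] ^ mul(ri, table[j] ^ table[h + j]) for j in range(h)]
    return table[0]

def eq_pt(a, b):  # eq(a; b) = prod_i (a_i b_i + (1 + a_i)(1 + b_i))
    r = 1
    for ai, bi in zip(a, b):
        r = mul(r, mul(ai, bi) ^ mul(ai ^ 1, bi ^ 1))
    return r

random.seed(1)
n = 4  # toy instance: round i = 1, one past challenge r0
A = [random.randrange(256) for _ in range(2 ** n)]
B = [random.randrange(256) for _ in range(2 ** n)]
q = [random.randrange(256) for _ in range(n)]
r0 = random.randrange(256)

def U(t):  # direct degree-3 summand: (A*B)(r0, t, x) eq(r0, t, x; q)
    s = 0
    for x2 in (0, 1):
        for x3 in (0, 1):
            pt = [r0, t, x2, x3]
            s ^= mul(mul(mle_eval(A, pt), mle_eval(B, pt)), eq_pt(pt, q))
    return s

def W(t):  # only the eq(x_{>1}; q_{>1}) factor stays under the sum
    s = 0
    for x2 in (0, 1):
        for x3 in (0, 1):
            pt = [r0, t, x2, x3]
            s ^= mul(mul(mle_eval(A, pt), mle_eval(B, pt)), eq_pt([x2, x3], q[2:]))
    return s

# the evicted factor eq(r_{<1}; q_{<1}) eq(t; q_1) is reattached afterwards
for t in (0, 1, random.randrange(256)):
    assert U(t) == mul(mul(eq_pt([r0], q[:1]), eq_pt([t], q[1:2])), W(t))
```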

Multiopening reduction

Naive approach

In order for our system to be useful, we need to provide an additional argument which will, for a polynomial $P$, reduce the statements about $P(Fr^i(r))$ to a single opening at some different point. There is a very classic, standard multiopening argument, working as follows:

  1. For some points $q_0, ..., q_{k-1}$ (not necessarily a Frobenius orbit), the prover claims: $P(q_i) = s_i$
  2. Verifier sends a challenge $u$.
  3. Now, the prover, trivially, has the following folded claim: $\sum_i u^i (P(q_i) - s_i) = 0$
  4. Replace with sumchecks and rearrange:
     $\sum_i u^i ((\sum_x P(x) \, eq(x; q_i)) - s_i) = 0$
     $\sum_x P(x) (\sum_i u^i eq(x; q_i)) = \sum_i u^i s_i$
  5. Run the sumcheck.
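The folded identity in step 4 can be verified numerically. A hedged Python sketch over $\mathbb{F}_{2^8}$ (a stand-in of ours for $\mathbb{F}_{2^{128}}$; all parameters are illustrative):

```python
import random

IRR = 0x11B  # x^8 + x^4 + x^3 + x + 1: F_{2^8} stands in for F_{2^128}

def mul(a, b):  # carry-less multiplication modulo IRR
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= IRR
        b >>= 1
    return r

def mle_eval(table, pt):  # evaluate a multilinear table, folding one variable at a time
    for ri in pt:
        h = len(table) // 2
        table = [table[j] ^ mul(ri, table[j] ^ table[h + j]) for j in range(h)]
    return table[0]

def eq_pt(a, b):  # eq(a; b) = prod_i (a_i b_i + (1 + a_i)(1 + b_i))
    r = 1
    for ai, bi in zip(a, b):
        r = mul(r, mul(ai, bi) ^ mul(ai ^ 1, bi ^ 1))
    return r

random.seed(2)
n, k = 3, 4
P = [random.randrange(256) for _ in range(2 ** n)]
qs = [[random.randrange(256) for _ in range(n)] for _ in range(k)]
s = [mle_eval(P, qi) for qi in qs]  # honest claims s_i = P(q_i)
u = random.randrange(2, 256)
upow = [1]
for _ in range(k - 1):
    upow.append(mul(upow[-1], u))

# LHS: sum_x P(x) (sum_i u^i eq(x; q_i)) over the boolean hypercube
lhs = 0
for x in range(2 ** n):
    xb = [x >> (n - 1 - j) & 1 for j in range(n)]
    kernel = 0
    for i in range(k):
        kernel ^= mul(upow[i], eq_pt(xb, qs[i]))
    lhs ^= mul(P[x], kernel)

# RHS: sum_i u^i s_i
rhs = 0
for i in range(k):
    rhs ^= mul(upow[i], s[i])

assert lhs == rhs  # the folded sumcheck claim holds for honest openings
```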

Improvement 1

Crucially, for an arbitrary set of opening points $q_i$, this argument requires $\sim kN$ multiplications from the prover in order to compute $\sum_i u^i eq(x; q_i)$. In our case, however, this is unacceptable, as we have 128 opening points. But they are not random, and on the boolean hypercube $eq(Fr^i r; x)$ is actually $= Fr^i(eq(r; x))$ (Frobenius is a homomorphism, and the coefficients of eq are just some algebraic expressions of $r$). Therefore, we can write

$\sum_i u^i eq(x; Fr^i r) = \sum_i u^i Fr^i(eq(x; r))$

This means that we can just compute $eq(x; r)$, and then apply to its values a single $128 \times 128$ boolean matrix $\sum_i u^i Fr^i$. This is definitely cheaper than computing the $eq(x; Fr^i r)$ separately! However, matrix application is still a heavy operation (much heavier than a multiplication).
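The commutation of eq with Frobenius on boolean points can be checked mechanically. A sketch over $\mathbb{F}_{2^8}$ (our simplification: the orbit has length 8 instead of 128):

```python
import random

IRR = 0x11B  # F_{2^8}: Frobenius orbit of length 8 instead of 128

def mul(a, b):  # carry-less multiplication modulo IRR
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= IRR
        b >>= 1
    return r

def fr(x, k=1):  # Frobenius power x -> x^(2^k)
    for _ in range(k % 8):
        x = mul(x, x)
    return x

def eq_pt(a, b):  # eq(a; b) = prod_i (a_i b_i + (1 + a_i)(1 + b_i))
    r = 1
    for ai, bi in zip(a, b):
        r = mul(r, mul(ai, bi) ^ mul(ai ^ 1, bi ^ 1))
    return r

random.seed(3)
n = 3
r = [random.randrange(256) for _ in range(n)]
u = random.randrange(2, 256)

for x in range(2 ** n):
    xb = [x >> (n - 1 - j) & 1 for j in range(n)]
    base = eq_pt(xb, r)  # computed once; its Frobenius powers are cheap
    lhs = rhs = 0
    up = 1
    for i in range(8):
        lhs ^= mul(up, eq_pt(xb, [fr(rj, i) for rj in r]))  # naive: one eq per point
        rhs ^= mul(up, fr(base, i))                         # improved: twist the value
        up = mul(up, u)
    assert lhs == rhs
```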

Improvement 2 (unimpl, likely unnecessary)

There is an alternative path, which suggests instead operating with the polynomials

$S_i(x) = eq(x; r) P_i(x)$

The main feature here is that $eq(x; Fr^j r) P_i(x) = Fr^j(S_i(x))$ on the hypercube, because $P_i$ has values in $\mathbb{F}_2$, and hence is invariant under the Frobenius mapping.

Now, for the reason which will be clear very soon, we will run the sumcheck in a reverse order of variables.

Claim. On the $j$-th round of the sumcheck, with the previous challenges denoted $r_{n-1}, r_{n-2}, ..., r_{n-j}$, it is enough to compute the sums

$\sum_{x_{<n-j-1}} S_i(x_{<n-j-1}, t, r_{>n-j-1}) = \sum_{x_{<n-j-1}} eq(x_{<n-j-1}, t, r_{>n-j-1}; r) \, P_i(x_{<n-j-1}, t, r_{>n-j-1})$

The sum over the $x$-s can then be compressed into

$eq(t; r_{n-j-1}) \, P_i(r_{<n-j-1}, t, r_{>n-j-1})$

This is very efficient, because we already know the polynomials $P_i(r_{<n-j-1}, x_{\geq n-j-1})$ from the second phase of our previous algorithm (we currently do not save them, but we can). This allows us to skip the first few rounds almost completely - the remaining sum is proportional in size to $2^j$. Which is exactly the reason why we want to run this sumcheck in reverse order of variables!

After skipping the first few rounds (and not later than when $n-j-1$ reaches $c$) we can just compute $\sum_i u^i eq(x_{<n-j-1}, r_{\geq n-j-1}; Fr^i r)$ by splitting

$eq(x_{<n-j-1}, r_{\geq n-j-1}; Fr^i r) = eq(x_{<n-j-1}; Fr^i r_{<n-j-1}) \, eq(r_{\geq n-j-1}; Fr^i r_{\geq n-j-1})$

and computing

$\sum_i (u^i \, eq(r_{\geq n-j-1}; Fr^i r_{\geq n-j-1})) \, eq(x_{<n-j-1}; Fr^i r_{<n-j-1})$

Optimization from Improvement 1 can be used here, though it is not strictly necessary, as this is already very efficient.
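The invariance fact this improvement rests on is easy to test: for an $\mathbb{F}_2$-valued table $p$ (playing the role of a coordinate polynomial $P_i$), twisting $S_i = eq(\cdot; r) P_i$ value-wise by Frobenius is the same as moving $r$ along its orbit. A sketch in our $\mathbb{F}_{2^8}$ toy model:

```python
import random

IRR = 0x11B  # F_{2^8} stands in for F_{2^128}

def mul(a, b):  # carry-less multiplication modulo IRR
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= IRR
        b >>= 1
    return r

def fr(x, k=1):  # Frobenius power x -> x^(2^k)
    for _ in range(k % 8):
        x = mul(x, x)
    return x

def eq_pt(a, b):  # eq(a; b) = prod_i (a_i b_i + (1 + a_i)(1 + b_i))
    r = 1
    for ai, bi in zip(a, b):
        r = mul(r, mul(ai, bi) ^ mul(ai ^ 1, bi ^ 1))
    return r

random.seed(4)
n = 3
r = [random.randrange(256) for _ in range(n)]
p = [random.randrange(2) for _ in range(2 ** n)]  # F_2-valued, like P_i

for x in range(2 ** n):
    xb = [x >> (n - 1 - j) & 1 for j in range(n)]
    S = mul(eq_pt(xb, r), p[x])  # S_i(x) = eq(x; r) P_i(x)
    for j in range(8):
        # eq(x; Fr^j r) P_i(x) == Fr^j(S_i(x)), since p[x] is 0 or 1
        assert mul(eq_pt(xb, [fr(rj, j) for rj in r]), p[x]) == fr(S, j)
```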

How to do XORs.

The system described above is enough to build ANDplonk - a version of PlonK, which supports, together with normal additions and multiplications, a bitwise AND operation. This, however, is not efficient for hashes, the main reason being that additions (XOR operations) will not be free in such arithmetization.

Sumchecks, however, do allow us to do very efficient linear operations.

Assume we are given some linear matrix $M(y; x)$ (with some efficiency assumptions, stated later). Then, the application of this matrix to the vector of values is, by definition, the following sum:

$(MP)(y) = \sum_x M(y; x) P(x)$

This allows us to reduce claims about $MP$ to claims about $P$, using the sumcheck:

$(MP)(r) = \sum_x M(r; x) P(x)$

Efficiency requirements:

  1. The prover must be able to apply the matrix: i.e. compute $MP$ from $P$ efficiently.
  2. The prover must be able to apply the transposed matrix: i.e. compute $M(r; x) = (M^t \, eq(r; \cdot))(x)$ efficiently.

In particular, let us assume that our linear transforms are slightly vectorized, i.e. the matrix actually works on chunks of size $2^k$ and not the whole $2^n$. Then, we can even compute the following reduced sumcheck:

$\sum_{x_{<k}} M(r_{<k}; x_{<k}) P(x_{<k}, r_{\geq k})$

which reduces the claim about the evaluation of $MP(r)$ to the evaluation of $P(r'_{<k}, r_{\geq k})$ (where $r'_{<k}$ are the sumcheck challenges).

This is somewhat better for the verifier (fewer rounds), and much better for the prover if it knows $P(x_{<k}, r_{\geq k})$ in advance - which, funnily enough, is exactly what happens in our scenario, due to:

  1. $M$ representing the linear part of the round of the Keccak hash - and hence being invertible.
  2. Us knowing $MP(x_{<k}, r_{\geq k})$ from the previous sumcheck, and hence being able to compute $P(x_{<k}, r_{\geq k})$ by applying $M^{-1}$.
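The reduction identity behind this section can be checked on a toy instance: a random boolean matrix $M$ (an XOR circuit) acting on a table $P$, with the multilinear extension of $MP$ opened at $r$. A hedged Python sketch in our $\mathbb{F}_{2^8}$ model:

```python
import random

IRR = 0x11B  # F_{2^8} stands in for F_{2^128}

def mul(a, b):  # carry-less multiplication modulo IRR
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= IRR
        b >>= 1
    return r

def mle_eval(table, pt):  # evaluate a multilinear table, folding one variable at a time
    for ri in pt:
        h = len(table) // 2
        table = [table[j] ^ mul(ri, table[j] ^ table[h + j]) for j in range(h)]
    return table[0]

def eq_pt(a, b):  # eq(a; b) = prod_i (a_i b_i + (1 + a_i)(1 + b_i))
    r = 1
    for ai, bi in zip(a, b):
        r = mul(r, mul(ai, bi) ^ mul(ai ^ 1, bi ^ 1))
    return r

random.seed(5)
n = 3
N = 2 ** n
M = [[random.randrange(2) for _ in range(N)] for _ in range(N)]  # boolean matrix
P = [random.randrange(256) for _ in range(N)]
MP = [0] * N
for y in range(N):  # applying M is pure XOR work
    for x in range(N):
        if M[y][x]:
            MP[y] ^= P[x]

r = [random.randrange(256) for _ in range(n)]
lhs = mle_eval(MP, r)  # the claim (MP)(r)

# prover side: M(r; x) is the transposed matrix applied to the eq(r, .) table
E = [eq_pt([y >> (n - 1 - j) & 1 for j in range(n)], r) for y in range(N)]
MtE = [0] * N
for x in range(N):
    for y in range(N):
        if M[y][x]:
            MtE[x] ^= E[y]

rhs = 0
for x in range(N):
    rhs ^= mul(MtE[x], P[x])  # sum_x M(r; x) P(x)

assert lhs == rhs
```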

Hashcaster

We construct a succinct proof system with efficient vectorized boolean operations.

While it can, potentially, be used for a lot of different applications, the most obvious one is efficient non-arithmetic hashes. Considering that the resulting proofs are still quite large, we believe that in many cases it will be embedded into some sort of external SNARK.

This motivates our name choice - we call our system Hashcaster, for its ability to turn non-arithmetic hashes into arithmetic ones (required for its verification in the embedding SNARK).

Idea

Recently, starting from Binius, there has been a lot of attention on extension fields of $\mathbb{F}_2$. Creating proof systems over such fields is challenging; however, they do, by default, provide us at least with a very efficient vectorized XOR operator (linear addition over $\mathbb{F}_2$).

Our main insight is the idea that we can also express AND operator (coordinate-wise multiplication) in appropriate basis through algebraic operations: additions, multiplications and, crucially, Frobenius morphism.

We adapt the sumcheck protocol for Frobenius twists of polynomials.

Preliminaries

Representing bitwise AND algebraically

In what follows, we will mainly consider the field $K = \mathbb{F}_{2^{128}}$. This is done mainly for simplicity, and we believe that our result extends to other extensions. This can be done if the need arises.

Let's consider a quadratic extension $\mathbb{F}_{2^{2^{k+1}}} / \mathbb{F}_{2^{2^k}}$ in the tower $\mathbb{F}_2 \subset \mathbb{F}_4 \subset ... \subset \mathbb{F}_{2^{128}}$.

Definition. The Artin-Schreier mapping $a_k$ is defined by the formula $a_k(x) = x^{2^{2^k}} - x$.

Lemma.

$Ker(a_k) = \mathbb{F}_{2^{2^k}}$
$Im(a_k) = \mathbb{F}_{2^{2^k}}$

Proof. Clearly, for every $x \in \mathbb{F}_{2^{2^k}}$ it holds that $x^{2^{2^k}} - x = 0$ (by the generalized Fermat little theorem). Conversely, no other element satisfies this property, because the equation $x^{2^{2^k}} - x = 0$ can not have more than $2^{2^k}$ roots. This proves that the kernel of $a_k$ is indeed $\mathbb{F}_{2^{2^k}}$.

Now, the image of the mapping has $\mathbb{F}_2$-dimension $2^{k+1} - 2^k = 2^k$, so to prove that it coincides with $\mathbb{F}_{2^{2^k}}$ it is enough to see that every element $s \in \mathbb{F}_{2^{2^k}}$ has a preimage: $x^{2^{2^k}} - x = s$. But this equation always has a root in the algebraic closure, and applying $x \mapsto x^{2^{2^k}}$ to both sides shows that any root satisfies $x^{2^{2^{k+1}}} = x$, i.e. lies in a quadratic extension of $\mathbb{F}_{2^{2^k}}$ - and there is a single quadratic extension for each finite field, in this case $\mathbb{F}_{2^{2^{k+1}}}$.
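A concrete instance of the lemma, one level of a tower: take $\mathbb{F}_{2^8}$ over $\mathbb{F}_{2^4}$, so the map is $a(x) = x^{2^4} - x$. An exhaustive check in Python (the AES representation of $\mathbb{F}_{2^8}$ is our choice; the lemma does not depend on it):

```python
IRR = 0x11B  # x^8 + x^4 + x^3 + x + 1: a model of F_{2^8}

def mul(a, b):  # carry-less multiplication modulo IRR
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= IRR
        b >>= 1
    return r

def pow16(x):  # x -> x^(2^4), the 4-th Frobenius power
    for _ in range(4):
        x = mul(x, x)
    return x

def a(x):  # Artin-Schreier map for the quadratic extension F_{2^8} / F_{2^4}
    return pow16(x) ^ x

F16 = {x for x in range(256) if pow16(x) == x}  # the subfield F_{2^4} inside F_{2^8}
assert len(F16) == 16
assert {x for x in range(256) if a(x) == 0} == F16  # Ker(a) = F_{2^4}
assert {a(x) for x in range(256)} == F16            # Im(a) = F_{2^4}
```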

Definition. Now, we compose the $a_k$-s through the tower of extensions:

$\pi(x) = a_0 \circ a_1 \circ ... \circ a_6 (x)$

Lemma.

$Im(\pi) = \mathbb{F}_2$

The proof is obvious (applying the facts that we already know about the $a_k$-s).

We will frequently abuse the notation and treat $\pi$ as an $\mathbb{F}_2$-valued linear function on the space $\mathbb{F}_{2^{128}}$.

Now, pick any $\mathbb{F}_2$-basis $r_0, ..., r_{127}$ of $\mathbb{F}_{2^{128}}$. We define the linear functions $\pi_i(x) = \pi(r_i x)$.

Note that these are also linearly independent.

Definition. The chosen basis $b_0, ..., b_{127}$ is the dual basis to the $\pi_i$-s, i.e. satisfying $\pi_i(b_j) = \delta_{ij}$.

Lemma. Actually, any basis can be chosen, for appropriate choices of the $r$-s, because the mapping $r \mapsto \pi(rx)$ is an isomorphism with the dual space.

Proof. Assume the converse. Then, there is $r \neq 0$ such that the linear function $\pi(rx)$ is $0$. However, we know that $\pi(x)$ is nonzero, and substituting $x = y/r$ we recover the original $\pi$ - a contradiction. Therefore, it is an isomorphism, and so for any basis $b_i$ the corresponding collection $r_i$ can be found.

We will normally choose the basis $b$ to correspond with our main computation basis, and find the $r_i$-s using this procedure.

This allows us to express bitwise AND (in the basis $b_i$) through an algebraic formula:

$x \wedge y = \sum_i b_i \pi_i(x) \pi_i(y)$

Moreover, this formula is actually quadratic in $x, y$ and their Frobenius twists.

Linearized formula for $\pi$

We observe that $\pi(x)$ actually admits a rather simple expression:

$\pi(x) = \sum_{i < 128} Fr^i(x)$

i.e. it is a sum over the whole Frobenius orbit. One can show that the composition of these mappings for a sequence of extensions is once again a mapping of this form.
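The whole construction above can be verified mechanically in a small model. The sketch below works over $\mathbb{F}_{2^8}$ instead of $\mathbb{F}_{2^{128}}$ (so $\pi$ sums an orbit of length 8), takes the plain bit basis as the computation basis, and finds the $r_i$-s by inverting the trace Gram matrix over $\mathbb{F}_2$ - one possible realization of the dual-basis procedure, not the project's actual basis choice:

```python
import random

IRR = 0x11B  # x^8 + x^4 + x^3 + x + 1: F_{2^8} stands in for F_{2^128}

def mul(a, b):  # carry-less multiplication modulo IRR
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= IRR
        b >>= 1
    return r

def tr(x):  # pi(x) = sum over the whole Frobenius orbit of x
    acc = 0
    for _ in range(8):
        acc ^= x
        x = mul(x, x)
    return acc

assert all(tr(x) in (0, 1) for x in range(256))  # Im(pi) = F_2

# Gram matrix T[m][j] = pi(e_m e_j) for the bit basis e_m = x^m, rows as bitmasks
T = []
for m in range(8):
    row = 0
    for j in range(8):
        row |= tr(mul(1 << m, 1 << j)) << j
    T.append(row)

def inv_gf2(rows, n=8):  # matrix inversion over F_2 by Gaussian elimination
    aug = [rows[i] | 1 << (n + i) for i in range(n)]
    for col in range(n):
        piv = next(i for i in range(col, n) if aug[i] >> col & 1)
        aug[col], aug[piv] = aug[piv], aug[col]
        for i in range(n):
            if i != col and aug[i] >> col & 1:
                aug[i] ^= aug[col]
    return [aug[i] >> n for i in range(n)]

# rows of T^{-1}, read back as field elements, are exactly the r_i
r = inv_gf2(T)
assert all(tr(mul(r[i], 1 << j)) == (i == j) for i in range(8) for j in range(8))

# bitwise AND through the algebraic formula x /\ y = sum_i b_i pi_i(x) pi_i(y)
random.seed(0)
for _ in range(100):
    x, y = random.randrange(256), random.randrange(256)
    acc = 0
    for i in range(8):
        acc ^= mul(1 << i, mul(tr(mul(r[i], x)), tr(mul(r[i], y))))
    assert acc == x & y
```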

Frobenius twists of polynomials

Consider a polynomial $P$ (potentially, multivariate), given in coefficient form. Let us denote by $P^{\gg k}$ the application of the $k$-th power of the Frobenius mapping to it. Then, it is well known that $P^{\gg k}(x) = Fr^k(P(Fr^{-k}(x)))$ (or, explicitly, $(P(x^{1/p^k}))^{p^k}$).
).

Note, that this operation also applies Frobenius mapping to the values of a polynomial on points defined over

F2. This is extremely important, as sumcheck will operate on a boolean hypercube
Bn={0,1}n
.
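The twist identity and the remark about boolean points can both be checked in a small model ($\mathbb{F}_{2^8}$, where $Fr^{-k} = Fr^{8-k}$; names are ours). $P^{\gg k}$ is obtained by applying $Fr^k$ to the coefficients:

```python
import random

IRR = 0x11B  # F_{2^8}: Frobenius has order 8, so Fr^{-k} = Fr^{8-k}

def mul(a, b):  # carry-less multiplication modulo IRR
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= IRR
        b >>= 1
    return r

def fr(x, k=1):  # Frobenius power x -> x^(2^k)
    for _ in range(k % 8):
        x = mul(x, x)
    return x

random.seed(6)
n = 3
coeffs = [random.randrange(256) for _ in range(2 ** n)]  # c_S, indexed by subset bitmask

def peval(cs, pt):  # multilinear polynomial in coefficient form
    s = 0
    for S in range(2 ** n):
        m = cs[S]
        for j in range(n):
            if S >> j & 1:
                m = mul(m, pt[j])
        s ^= m
    return s

for k in range(8):
    twisted = [fr(c, k) for c in coeffs]  # P>>k: Fr^k applied to the coefficients
    pt = [random.randrange(256) for _ in range(n)]
    # P>>k(x) = Fr^k(P(Fr^{-k}(x)))
    assert peval(twisted, pt) == fr(peval(coeffs, [fr(x, 8 - k) for x in pt]), k)
    # on boolean points, the twist simply twists the value
    xb = [random.randrange(2) for _ in range(n)]
    assert peval(twisted, xb) == fr(peval(coeffs, xb), k)
```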

In what follows, we will frequently need to compute values of a polynomial in Frobenius orbit. We give an efficient prover for this subroutine:

Efficiently evaluating a multilinear polynomial in a Frobenius orbit

Suppose we are given a multilinear polynomial $P(x)$ in evaluation form (we will denote the corresponding table of values $P[i]$). We want to compute it in the points $r, r^2, r^{2^2}, ..., Fr^i(r), ...$

Using the formulas from the previous section, we know that $P(Fr^i(r)) = Fr^i((P^{\gg -i})(r))$.

Naively, this would require us to compute all Frobenius twists of $P$, and evaluate them.

Naive cost (counting only asymptotic terms):

  1. $N$ multiplications to compute the Lagrange kernel $eq(r, x)$.
  2. $128N$ Frobenius applications to compute the twists of $P$.
  3. $128N$ multiplications and $128N$ additions to compute the inner products of the twists with the Lagrange kernel.

This cost is very significant, and would render the protocol completely impractical. We suggest the following improved method:

  1. We split $P$ in the standard basis, obtaining 128 tables with values in $\mathbb{F}_2$: $P(x) = \sum_{i < 128} b_i P_i(x)$
  2. We evaluate each $P_i(r)$, and compute their Frobenius orbits as $P_i(Fr^j(r)) = Fr^j(P_i(r))$, due to their values living in $\mathbb{F}_2$.
  3. We linearly combine the results back: $P(Fr^j(r)) = \sum_{i < 128} b_i P_i(Fr^j(r))$.

Then, we can estimate the improved cost:

  1. $N$ multiplications to obtain $eq(r, x)$.
  2. $128N$ additions and $128N$ conditional choices (multiplication by a bit) to compute all the $P_i(r)$.
  3. Constant work to combine the results back.

This makes such evaluation comparable in cost with a single evaluation.
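A sketch of the improved method, checked against direct evaluation, in our $\mathbb{F}_{2^8}$ toy model (8 coordinate tables instead of 128; the plain bit basis plays the role of the standard basis - any $\mathbb{F}_2$-basis works for this step):

```python
import random

IRR = 0x11B  # F_{2^8} stands in for F_{2^128}

def mul(a, b):  # carry-less multiplication modulo IRR
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= IRR
        b >>= 1
    return r

def fr(x, k=1):  # Frobenius power x -> x^(2^k)
    for _ in range(k % 8):
        x = mul(x, x)
    return x

def eq_pt(a, b):  # eq(a; b) = prod_i (a_i b_i + (1 + a_i)(1 + b_i))
    r = 1
    for ai, bi in zip(a, b):
        r = mul(r, mul(ai, bi) ^ mul(ai ^ 1, bi ^ 1))
    return r

def mle_eval(table, pt):  # evaluate a multilinear table, folding one variable at a time
    for ri in pt:
        h = len(table) // 2
        table = [table[j] ^ mul(ri, table[j] ^ table[h + j]) for j in range(h)]
    return table[0]

random.seed(7)
n = 3
N = 2 ** n
P = [random.randrange(256) for _ in range(N)]
r = [random.randrange(256) for _ in range(n)]

# step 0: the Lagrange kernel eq(r, x) - the only N multiplications
E = [eq_pt([x >> (n - 1 - j) & 1 for j in range(n)], r) for x in range(N)]

# steps 1-2: coordinate evaluations P_i(r) via additions + conditional choices only
Pi = [0] * 8
for x in range(N):
    for i in range(8):
        if P[x] >> i & 1:  # P_i(x) is the i-th coordinate (here: bit) of P(x)
            Pi[i] ^= E[x]

# step 3: recombine; P(Fr^j(r)) = sum_i b_i Fr^j(P_i(r)), with b_i the basis elements
for j in range(8):
    got = 0
    for i in range(8):
        got ^= mul(1 << i, fr(Pi[i], j))
    want = mle_eval(P, [fr(rj, j) for rj in r])  # direct evaluation at Fr^j(r)
    assert got == want
```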

Efficiently convincing the verifier about these evaluations (DEPRECATED, CHECK REAL METHOD ABOVE)

Assume that we are given a polynomial $P$, and are trying to convince the verifier that $P(Fr^i(r)) = c_i$. The polynomial is committed, so we can open it in any single point. Then, there is a standard sumcheck argument, reducing multiple openings (in the points $r_i = Fr^i(r)$) to a single one:

The verifier sends a challenge value $\gamma$, and we perform the sumcheck for the following equation:

$\sum_x P(x) (\sum_i \gamma^i eq(x; r_i)) = \sum_i \gamma^i c_i$

In the end of the sumcheck, $x$ will be specialized to some single random value $r'$, and the verifier will only need to check the single opening $P(r')$, and compute some constant amount of values $eq(r'; r_i)$.

So, this sumcheck is very efficient for the verifier; the question is how to make it efficient for the prover. Naively evaluating the $eq(x; r_i)$ is out of the question.

We suggest the following improved method.

  1. Split $r$ into $(r_- | r_+)$ of half size.
  2. Perform the sumcheck
     $\sum_i \gamma^i (P(x | Fr^i(r_+)) \, eq(x; Fr^i(r_-)))$

     Note that we can evaluate the tables $P(x | Fr^i(r_+))$ efficiently, using the argument from the previous section (for each $x$).
     The tables $eq(x; Fr^i(r_-))$ will be evaluated naively, which requires $128 \sqrt{N}$ multiplications.
  3. This sumcheck spits out a specialization $x \to r'$, and now we are required to compute the evaluations $P(r', Fr^i(r_+))$.
  4. Which is the normal naive protocol from before.

This method is efficient provided that $N \gg 128$, which happens in all realistic scenarios.

Evaluating $P \wedge Q$ in a point

An important prover subroutine is evaluating the quadratic polynomial $P \wedge Q = \sum_i b_i \pi_i(P) \pi_i(Q)$ in a point. The simplest approach is just computing its coefficients (there are $3^n = N^{\log_2(3)}$ terms), and evaluating it directly. There are somewhat better approaches if $N$ is large enough: one can compute $P_i = \pi_i(P)$, $Q_i = \pi_i(Q)$, evaluate them in $r$ (which requires $256N$ additions, because these polynomials are over $\mathbb{F}_2$), and compute $\sum_i b_i P_i(r) Q_i(r)$.

For very small $N$-s this method is worse, because it hides the constant 256 multiplications needed to combine the evaluations $P_i(r)$ and $Q_i(r)$ into a single answer.
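A hedged sketch of this second approach in our $\mathbb{F}_{2^8}$ toy model (bit coordinates in place of the $\pi_i$-s, 8 instead of 128 coordinate tables). As a sanity check, on the hypercube the formula must reproduce the bitwise AND of the two tables:

```python
import random

IRR = 0x11B  # F_{2^8} stands in for F_{2^128}

def mul(a, b):  # carry-less multiplication modulo IRR
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= IRR
        b >>= 1
    return r

def mle_eval(table, pt):  # evaluate a multilinear table, folding one variable at a time
    for ri in pt:
        h = len(table) // 2
        table = [table[j] ^ mul(ri, table[j] ^ table[h + j]) for j in range(h)]
    return table[0]

random.seed(8)
n = 3
N = 2 ** n
P = [random.randrange(256) for _ in range(N)]
Q = [random.randrange(256) for _ in range(N)]

def bit_table(T, i):  # coordinate polynomial P_i: the i-th bit of every value
    return [v >> i & 1 for v in T]

def and_eval(pt):  # (P /\ Q)(pt) = sum_i b_i P_i(pt) Q_i(pt)
    s = 0
    for i in range(8):
        s ^= mul(1 << i, mul(mle_eval(bit_table(P, i), pt),
                             mle_eval(bit_table(Q, i), pt)))
    return s

# on the boolean hypercube this is exactly the bitwise AND of the tables
for x in range(N):
    xb = [x >> (n - 1 - j) & 1 for j in range(n)]
    assert and_eval(xb) == P[x] & Q[x]

val = and_eval([random.randrange(256) for _ in range(n)])  # and it works off the cube
```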

Combining it together: ANDCHECK

We would like to compute the following sumcheck on the boolean hypercube:

$\sum_{x \in B_n} (P(x) \wedge Q(x)) \times eq(x; q)$

which is treated as a quadratic formula depending on the Frobenius twists of $P$ and $Q$.

Let's recall that the prover's response is a univariate polynomial of degree 3 defined as:

$U_i(t) = \sum_{x_{>i} \in B_{n-i-1}} (P \wedge Q)(r_{<i}, t, x_{>i}) \times eq(r_{<i}, t, x_{>i}; q)$

which can be rearranged as

$U_i(t) = eq(r_{<i}; q_{<i}) \, eq(t; q_i) \sum_{x_{>i} \in B_{n-i-1}} (P \wedge Q)(r_{<i}, t, x_{>i}) \times eq(x_{>i}; q_{>i})$

which drops the degree in $t$ under the summation to 2.

Our main problem now is computing the restriction $(P \wedge Q)(r_{<i}, \cdot)$, for which we have discussed some approaches in the previous section.

We split our protocol in 2 phases:

Phase 1:

This phase consists of $c$ rounds, and we start by computing the table of evaluations of $P$, $Q$ in the extended set of size $3^{c+1} 2^{n-c-1}$, namely $\{0, 1, \infty\}^{c+1} \times \{0, 1\}^{n-c-1}$.

This requires us to compute $3^{c+1} 2^{n-c-1} - 2^n$ additions (XORs).
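The "third point" here is the point at infinity: for a polynomial of degree 1 in one variable, the value at $\infty$ is the leading coefficient $P(0, x) \oplus P(1, x)$ - one XOR per new point. A small Python check (our toy model in $\mathbb{F}_{2^8}$) that the values of a degree-2 product on $\{0, 1, \infty\}$ determine it everywhere:

```python
import random

IRR = 0x11B  # F_{2^8} stands in for F_{2^128}

def mul(a, b):  # carry-less multiplication modulo IRR
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= IRR
        b >>= 1
    return r

random.seed(9)
a0, a1 = random.randrange(256), random.randrange(256)
b0, b1 = random.randrange(256), random.randrange(256)
A = lambda t: a0 ^ mul(t, a0 ^ a1)  # degree 1 in t; "A at infinity" := a0 ^ a1
B = lambda t: b0 ^ mul(t, b0 ^ b1)

w0, w1 = mul(a0, b0), mul(A(1), B(1))
w_inf = mul(a0 ^ a1, b0 ^ b1)  # value of A*B at infinity = product of the XORs

# char-2 interpolation of the quadratic W(t) = c0 + c1 t + c2 t^2 from {0, 1, inf}
c0, c2 = w0, w_inf
c1 = w0 ^ w1 ^ w_inf

t = random.randrange(256)
assert mul(A(t), B(t)) == c0 ^ mul(c1, t) ^ mul(c2, mul(t, t))
```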

Then, we also evaluate $P \wedge Q$ on this whole subset, which requires $3^{c+1} 2^{n-c-1}$ evaluations of $\wedge$ (and $2^n$ of them, presumably, were already done).

Note: this work will be insignificant compared to what is done next; however, this tradeoff might change if instead of $\wedge$ we are using a more involved arithmetic circuit with multiple inputs and multiple arithmetic operations.

Next, we are ready to respond to the verifier's challenges. The prover enters the first round and does the following:

  1. Compute the evaluations of $(P \wedge Q)(t, x_{>0})$, for $t \in \{0, 1, \infty\}$, and add them up multiplied by $eq(x_{>0}; q_{>0})$. This requires $3 \cdot 2^{n-1}$ multiplications.
  2. Obtain the challenge $r_0$ from the verifier.
  3. Restrict our table on $r_0$. This represents the bulk of the work, taking $3^{c+1} 2^{n-c-1}$ multiplications.

Phase 2

In Phase 2, we explicitly maintain the polynomials $P_i = \pi_i(P)$, $Q_i = \pi_i(Q)$. This slows us down a lot, so it makes sense to search for some balance. We believe $c = 5$ should provide a reasonable tradeoff.