
RFC: Plookup in kimchi

In 2020, plookup showed how to create lookup proofs: proofs that some witness values are part of a lookup table. Two years later, an independent team published plonkup, showing how to integrate plookup into plonk.

This document specifies how we integrate plookup in kimchi. It assumes that the reader understands the basics behind plookup.

Overview

We integrate plookup in kimchi with the following differences:

  • we snake-ify the sorted table instead of wrapping it around (see later)
  • we allow fixed-ahead-of-time linear combinations of columns of the queries we make
  • we only use a single table (XOR) at the moment of this writing
  • we allow several lookups (or queries) to be performed within the same row
  • zero-knowledgeness is added in a specific way (see later)

The rest of this document explains the protocol in more detail.

Recap on the grand product argument of plookup

As per the Plookup paper, the prover will have to compute three vectors:

  • $f$, the (secret) query vector, containing the witness values that the prover wants to prove are part of the lookup table.
  • $t$, the (public) lookup table.
  • $s$, the (secret) concatenation of $f$ and $t$, sorted by $t$ (where elements are listed in the order they are listed in $t$).

Essentially, plookup proves that all the elements in $f$ are indeed in the lookup table $t$ if and only if the following multisets are equal:

  • $\{(1+\beta)f, \text{diff}(t)\}$
  • $\text{diff}(\text{sorted}(f, t))$

where $\text{diff}$ is a new multiset derived by applying a "randomized difference" to every pair of successive elements of a vector. For example:

  • $f = \{5, 4, 1, 5\}$
  • $t = \{1, 4, 5\}$
  • $\{{\color{blue}(1+\beta)f}, {\color{green}\text{diff}(t)}\} = \{{\color{blue}(1+\beta)5, (1+\beta)4, (1+\beta)1, (1+\beta)5}, {\color{green}1+\beta \cdot 4, 4+\beta \cdot 5}\}$
  • $\text{diff}(\text{sorted}(f, t)) = \{1+\beta \cdot 1, 1+\beta \cdot 4, 4+\beta \cdot 4, 4+\beta \cdot 5, 5+\beta \cdot 5, 5+\beta \cdot 5\}$

Note: This assumes that the lookup table is a single column. You will see in the next section how to address lookup tables with more than one column.
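To make the diff notation concrete, here is a minimal Rust sketch (not kimchi code) that recomputes the example above over a toy prime field, with a fixed stand-in value for the challenge $\beta$, and checks that the two multisets coincide:

```rust
// Toy check of the plookup multiset identity on the example above.
const P: u128 = 2305843009213693951; // 2^61 - 1, a convenient prime

fn add(a: u128, b: u128) -> u128 { (a + b) % P }
fn mul(a: u128, b: u128) -> u128 { (a * b) % P }

/// diff(v): the "randomized difference" of successive pairs, v[i] + beta * v[i+1].
fn diff(v: &[u128], beta: u128) -> Vec<u128> {
    v.windows(2).map(|w| add(w[0], mul(beta, w[1]))).collect()
}

fn main() {
    let beta: u128 = 42; // stand-in for a random verifier challenge
    let f = vec![5u128, 4, 1, 5];
    let t = vec![1u128, 4, 5];
    let sorted = vec![1u128, 1, 4, 4, 5, 5, 5]; // f and t merged, ordered following t

    // left multiset: {(1 + beta) * f_i} ∪ diff(t)
    let mut left: Vec<u128> = f.iter().map(|&x| mul(add(1, beta), x)).collect();
    left.extend(diff(&t, beta));

    // right multiset: diff(sorted(f, t))
    let mut right = diff(&sorted, beta);

    // compare as multisets
    left.sort();
    right.sort();
    assert_eq!(left, right);
    println!("multisets match");
}
```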

The equality between the multisets can be proved with the permutation argument of plonk, which would look like enforcing constraints on the following accumulator:

  • init: $\text{acc}_0 = 1$
  • final: $\text{acc}_n = 1$
  • for every $0 < i \leq n$:
    $$\text{acc}_i = \text{acc}_{i-1} \cdot \frac{(\gamma + (1+\beta) f_{i-1}) \cdot (\gamma + t_{i-1} + \beta t_i)}{(\gamma + s_{i-1} + \beta s_i)}$$

Note that the plookup paper uses a slightly different equation to make the proof work. I believe the proof would work with the above equation, but for simplicity let's just use the equation published in plookup:

$$\text{acc}_i = \text{acc}_{i-1} \cdot \frac{(1+\beta) \cdot (\gamma + f_{i-1}) \cdot (\gamma(1+\beta) + t_{i-1} + \beta t_i)}{(\gamma(1+\beta) + s_{i-1} + \beta s_i)}$$

Note: in plookup, $s$ is too large, and so needs to be split into multiple vectors to enforce the constraint at every $i \in [\![0;n]\!]$. We ignore this for now.
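As a sanity check of the telescoping product, the following Rust sketch (again a toy field with fixed stand-ins for $\beta$ and $\gamma$, not kimchi code) multiplies all numerator factors and divides out all denominator factors for the earlier example, and asserts that the accumulator ends at $1$:

```rust
// Toy check that the plookup grand product telescopes to 1 on the earlier
// example (f = {5,4,1,5}, t = {1,4,5}, s = sorted(f,t)).
const P: u128 = 2305843009213693951; // 2^61 - 1

fn add(a: u128, b: u128) -> u128 { (a + b) % P }
fn mul(a: u128, b: u128) -> u128 { (a * b) % P }
fn pow(mut b: u128, mut e: u128) -> u128 {
    let mut r = 1;
    while e > 0 {
        if e & 1 == 1 { r = mul(r, b); }
        b = mul(b, b);
        e >>= 1;
    }
    r
}
fn inv(a: u128) -> u128 { pow(a, P - 2) } // Fermat's little theorem

fn main() {
    let (beta, gamma): (u128, u128) = (42, 101); // stand-ins for challenges
    let f = [5u128, 4, 1, 5];
    let t = [1u128, 4, 5];
    let s = [1u128, 1, 4, 4, 5, 5, 5]; // f and t merged, sorted by t

    let mut acc = 1u128;
    // numerator: (1+beta)(gamma + f_j) for each query,
    //            and gamma(1+beta) + t_j + beta*t_{j+1} for each table pair
    for &x in &f {
        acc = mul(acc, mul(add(1, beta), add(gamma, x)));
    }
    for w in t.windows(2) {
        acc = mul(acc, add(mul(gamma, add(1, beta)), add(w[0], mul(beta, w[1]))));
    }
    // denominator: gamma(1+beta) + s_j + beta*s_{j+1} for each sorted pair
    for w in s.windows(2) {
        acc = mul(acc, inv(add(mul(gamma, add(1, beta)), add(w[0], mul(beta, w[1])))));
    }
    assert_eq!(acc, 1);
    println!("grand product telescopes to 1");
}
```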

Lookup tables

Kimchi uses a single lookup table at the moment of this writing: the XOR table. The XOR table for values of 1 bit is the following:

| l | r | o |
| :-: | :-: | :-: |
| 1 | 0 | 1 |
| 0 | 1 | 1 |
| 1 | 1 | 0 |
| 0 | 0 | 0 |

Whereas kimchi uses the XOR table for values of 4 bits, which has $2^8$ entries (one row per pair of 4-bit inputs).

Note: the (0, 0, 0) entry is at the very end on purpose (as it will be used as the dummy entry for rows of the witness that don't care about lookups).
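For concreteness, here is a hypothetical Rust sketch of how such a table could be laid out, with one row per pair of 4-bit inputs and the $(0, 0, 0)$ row kept last; it mirrors the description above, not kimchi's actual table-building code:

```rust
// Sketch of a 4-bit XOR lookup table with the (0, 0, 0) row placed last
// so it can double as the dummy entry.
fn xor_table_4bit() -> Vec<(u64, u64, u64)> {
    let mut rows = Vec::with_capacity(256);
    for l in 0u64..16 {
        for r in 0u64..16 {
            if (l, r) == (0, 0) {
                continue; // skipped here, appended at the very end
            }
            rows.push((l, r, l ^ r));
        }
    }
    rows.push((0, 0, 0)); // dummy entry used by rows that make no lookup
    rows
}

fn main() {
    let table = xor_table_4bit();
    assert_eq!(table.len(), 256);
    assert_eq!(*table.last().unwrap(), (0, 0, 0));
    assert!(table.contains(&(0b1010, 0b0110, 0b1100)));
    println!("4-bit XOR table with {} rows", table.len());
}
```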

Querying the table

The plookup paper handles a vector of lookups $f$, which we do not have. So the first step is to create such a vector from the witness columns (or registers). To do this, we define the following objects:

  • a query tells us what registers, in what order, and scaled by how much, are part of a query
  • a query selector tells us which rows are using the query. It is pretty much the same as a gate selector.

Let's go over the first item in this section.

For example, the following query tells us that we want to check if $r_0 \oplus r_2 = 2 \cdot r_1$:

| l | r | o |
| :-: | :-: | :-: |
| 1, $r_0$ | 1, $r_2$ | 2, $r_1$ |

The grand product argument for the lookup constraint will look like this at this point:

$$\text{acc}_i = \text{acc}_{i-1} \cdot \frac{{\color{green}(1+\beta) \cdot (\gamma + w_0(g^i) + j \cdot w_2(g^i) + j^2 \cdot 2 \cdot w_1(g^i))} \cdot (\gamma(1+\beta) + t_{i-1} + \beta t_i)}{(\gamma(1+\beta) + s_{i-1} + \beta s_i)}$$
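To illustrate how a query's columns collapse into a single field element, here is a toy Rust sketch (fixed stand-in for the joint combiner $j$, made-up register values, not kimchi code) showing that the combined query equals the combined XOR-table row when the lookup holds:

```rust
// Combining a query's (l, r, o) columns with the joint combiner j,
// matching the (1, r0 | 1, r2 | 2, r1) query above. Toy field only.
const P: u128 = 2305843009213693951; // 2^61 - 1

fn add(a: u128, b: u128) -> u128 { (a + b) % P }
fn mul(a: u128, b: u128) -> u128 { (a * b) % P }

/// Combine (l, r, o) as l + j*r + j^2*o.
fn combine(l: u128, r: u128, o: u128, j: u128) -> u128 {
    add(l, add(mul(j, r), mul(mul(j, j), o)))
}

fn main() {
    let j: u128 = 7; // stand-in for the joint combiner challenge

    // witness registers for some row: r0 ^ r2 = 6, so 2 * r1 must equal 6
    let (r0, r1, r2) = (0b0101u128, 3, 0b0011);

    // the query scales the registers: (1*r0, 1*r2, 2*r1)
    let query = combine(1 * r0, 1 * r2, 2 * r1, j);

    // the XOR table row it should hit: (5, 3, 5 ^ 3) = (5, 3, 6)
    let table_row = combine(5, 3, 6, j);

    // when the lookup holds, the combined values are equal field elements
    assert_eq!(query, table_row);
    println!("combined query matches combined table row");
}
```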

Not all rows need to perform queries into a lookup table. We will use a query selector in the next section to make the constraints work with this in mind.

Query selector

The associated query selector tells us on which rows the query into the XOR lookup table occurs.

| row | query selector |
| :-: | :-: |
| 0 | 1 |
| 1 | 0 |

Both the (XOR) lookup table and the query are built-ins in kimchi. The query selector is derived from the circuit at setup time. Currently only the ChaCha gates make use of the lookups.

The grand product argument for the lookup constraint looks like this now:

$$\text{acc}_i = \text{acc}_{i-1} \cdot \frac{{\color{green}(1+\beta) \cdot \text{query}} \cdot (\gamma(1+\beta) + t_{i-1} + \beta t_i)}{(\gamma(1+\beta) + s_{i-1} + \beta s_i)}$$

where ${\color{green}\text{query}}$ is constructed so that a dummy query ($0 \oplus 0 = 0$) is used on rows that don't have a query:

$$\begin{aligned}
\text{query} = &\ \text{selector} \cdot (\gamma + w_0(g^i) + j \cdot w_2(g^i) + j^2 \cdot 2 \cdot w_1(g^i)) \\
 + &\ (1 - \text{selector}) \cdot (\gamma + 0 + j \cdot 0 + j^2 \cdot 0)
\end{aligned}$$
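The selector gating can be illustrated with the following toy Rust sketch (fixed stand-ins for $j$ and $\gamma$, hypothetical helper names, not kimchi code):

```rust
// Selector-gated query term: on rows where the selector is 0 the term
// falls back to the dummy (0, 0, 0) query, i.e. just gamma.
const P: u128 = 2305843009213693951; // 2^61 - 1

fn add(a: u128, b: u128) -> u128 { (a + b) % P }
fn mul(a: u128, b: u128) -> u128 { (a * b) % P }

/// selector * (gamma + l + j*r + j^2*o) + (1 - selector) * (gamma + 0 + j*0 + j^2*0)
fn query_term(selector: u128, l: u128, r: u128, o: u128, j: u128, gamma: u128) -> u128 {
    let real = add(gamma, add(l, add(mul(j, r), mul(mul(j, j), o))));
    let dummy = gamma; // gamma + 0 + j*0 + j^2*0
    let one_minus_sel = add(1, P - selector);
    add(mul(selector, real), mul(one_minus_sel, dummy))
}

fn main() {
    let (j, gamma): (u128, u128) = (7, 101);
    // a row that performs the query (selector = 1) ...
    let active = query_term(1, 5, 3, 6, j, gamma);
    assert_eq!(active, gamma + 5 + 7 * 3 + 49 * 6);
    // ... and a row that doesn't (selector = 0) uses the dummy query
    let inactive = query_term(0, 5, 3, 6, j, gamma);
    assert_eq!(inactive, gamma);
    println!("selector gates between real and dummy query");
}
```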

Queries, not query

Since we allow multiple queries per row, we define multiple queries, where each query is associated with a lookup selector.

At the moment of this writing, the ChaCha gates all perform $4$ queries in a row. Thus, $4$ is trivially the largest number of queries that happen in a row.

Important: to make constraints work, this means that each row must make 4 queries. (Potentially some or all of them are dummy queries.)

For example, the ChaCha0, ChaCha1, and ChaCha2 gates will apply the following 4 XOR queries on the current and following rows:

| l | r | o | - | l | r | o | - | l | r | o | - | l | r | o |
| :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: |
| 1, $r_3$ | 1, $r_7$ | 1, $r_{11}$ | - | 1, $r_4$ | 1, $r_8$ | 1, $r_{12}$ | - | 1, $r_5$ | 1, $r_9$ | 1, $r_{13}$ | - | 1, $r_6$ | 1, $r_{10}$ | 1, $r_{14}$ |

which you can understand as checking for the current and following row that

  • $r_3 \oplus r_7 = r_{11}$
  • $r_4 \oplus r_8 = r_{12}$
  • $r_5 \oplus r_9 = r_{13}$
  • $r_6 \oplus r_{10} = r_{14}$

The ChaChaFinal also performs $4$ (somewhat similar) queries in the XOR lookup table. In total this is $8$ different queries that could be associated to $8$ selector polynomials.

Grouping queries by query pattern

Associating each query with a selector polynomial is not necessarily efficient. To summarize:

  • the ChaCha0, ChaCha1, and ChaCha2 gates make $4$ queries into the XOR table
  • the ChaChaFinal gate makes $4$ different queries into the XOR table

Using the previous section's method, we'd have to use $8$ different lookup selector polynomials, one for each of the $8$ different queries. Since there are only $2$ use-cases, we can simply group them by query pattern to reduce the number of lookup selector polynomials to $2$.

The grand product argument for the lookup constraint looks like this now:

$$\text{acc}_i = \text{acc}_{i-1} \cdot \frac{{\color{green}(1+\beta)^4 \cdot \text{query}} \cdot (\gamma(1+\beta) + t_{i-1} + \beta t_i)}{(\gamma(1+\beta) + s_{i-1} + \beta s_i)}$$

where ${\color{green}\text{query}}$ is constructed as:

$$\begin{aligned}
\text{query} = &\ \text{selector}_1 \cdot \text{pattern}_1 \\
 + &\ \text{selector}_2 \cdot \text{pattern}_2 \\
 + &\ (1 - \text{selector}_1 - \text{selector}_2) \cdot (\gamma + 0 + j \cdot 0 + j^2 \cdot 0)^4
\end{aligned}$$

where, for example, the first pattern for the ChaCha0, ChaCha1, and ChaCha2 gates looks like this:

$$\begin{aligned}
\text{pattern}_1 = &\ (\gamma + w_3(g^i) + j \cdot w_7(g^i) + j^2 \cdot w_{11}(g^i)) \cdot \\
&\ (\gamma + w_4(g^i) + j \cdot w_8(g^i) + j^2 \cdot w_{12}(g^i)) \cdot \\
&\ (\gamma + w_5(g^i) + j \cdot w_9(g^i) + j^2 \cdot w_{13}(g^i)) \cdot \\
&\ (\gamma + w_6(g^i) + j \cdot w_{10}(g^i) + j^2 \cdot w_{14}(g^i))
\end{aligned}$$

Note:

  • there are now $4$ dummy queries, and they only appear when none of the lookup selectors are active
  • if a pattern uses fewer than $4$ queries, it has to be padded with dummy queries as well
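The following toy Rust sketch (made-up witness values and challenges, hypothetical helper names, not kimchi code) shows the same construction: each pattern is a product of $4$ combined queries, the selectors pick one of them, and the fallback is the dummy query raised to the power $4$:

```rust
// Pattern-grouped query term: two lookup selectors, each gating a product
// of 4 combined queries, with a 4-fold dummy fallback. Toy field only.
const P: u128 = 2305843009213693951; // 2^61 - 1

fn add(a: u128, b: u128) -> u128 { (a + b) % P }
fn sub(a: u128, b: u128) -> u128 { (a + P - b) % P }
fn mul(a: u128, b: u128) -> u128 { (a * b) % P }

/// gamma + l + j*r + j^2*o
fn factor(l: u128, r: u128, o: u128, j: u128, gamma: u128) -> u128 {
    add(gamma, add(l, add(mul(j, r), mul(mul(j, j), o))))
}

/// A pattern is the product of its 4 query factors.
fn pattern(queries: &[(u128, u128, u128); 4], j: u128, gamma: u128) -> u128 {
    queries.iter().fold(1, |acc, &(l, r, o)| mul(acc, factor(l, r, o, j, gamma)))
}

fn main() {
    let (j, gamma): (u128, u128) = (7, 101);
    // 4 XOR queries for a hypothetical ChaCha-style row, and 4 for another pattern
    let queries_1 = [(5, 3, 6), (1, 2, 3), (4, 4, 0), (9, 1, 8)];
    let queries_2 = [(2, 2, 0), (7, 8, 15), (3, 5, 6), (1, 1, 0)];

    let (sel_1, sel_2): (u128, u128) = (1, 0); // this row uses the first pattern
    let dummy = pattern(&[(0, 0, 0); 4], j, gamma); // (gamma + 0 + j*0 + j^2*0)^4

    let query = add(
        add(mul(sel_1, pattern(&queries_1, j, gamma)),
            mul(sel_2, pattern(&queries_2, j, gamma))),
        mul(sub(sub(1, sel_1), sel_2), dummy),
    );
    assert_eq!(query, pattern(&queries_1, j, gamma));
    println!("row uses pattern 1's product of 4 combined queries");
}
```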

Back to the grand product argument

There are two things that we haven't touched on:

  • The vector $t$ representing the combined lookup table (after its columns have been combined with a joint combiner $j$). The non-combined lookup table is fixed at setup time and derived based on the lookup tables used in the circuit (for now only one, the XOR lookup table, can be used in the circuit).
  • The vector $s$ representing the sorted multiset of both the queries and the lookup table. This is created by the prover and sent as a commitment to the verifier.

The first vector $t$ is quite straightforward to think about:

  • if it is smaller than the domain (of size $n$), then we can repeat the last entry enough times to make the table of size $n$ (see the sketch after this list).
  • if it is larger than the domain, then we can either increase the domain or split the vector in two (or more) vectors. This is most likely what we will have to do to support multiple lookup tables later.
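Here is a hypothetical Rust sketch of the padding case (the function name and toy sizes are made up; kimchi's actual code may differ):

```rust
// Extending a lookup table to the domain size n by repeating its last entry.
fn pad_table(mut table: Vec<(u64, u64, u64)>, n: usize) -> Vec<(u64, u64, u64)> {
    assert!(table.len() <= n, "table larger than domain: split it or grow the domain");
    let last = *table.last().expect("table must be non-empty");
    table.resize(n, last);
    table
}

fn main() {
    // tiny 1-bit XOR table with the (0, 0, 0) entry last, padded to a domain of 8
    let table = vec![(1, 0, 1), (0, 1, 1), (1, 1, 0), (0, 0, 0)];
    let padded = pad_table(table, 8);
    assert_eq!(padded.len(), 8);
    assert!(padded[4..].iter().all(|&row| row == (0, 0, 0)));
    println!("{:?}", padded);
}
```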

What about the second vector?

The sorted vector $s$

The second vector $s$ is of size

$$n \cdot |\text{queries}| + |\text{lookup\_table}|$$

That is, it contains the $n$ elements of each query vector (the actual values being looked up, after being combined with the joint combinator; that's $4$ per row), as well as the elements of our lookup table (after being combined as well).

Because the vector $s$ is larger than the domain size $n$, it is split into several vectors of size $n$. Specifically, in the plonkup paper, $s$ is split into two halves, which are then interpolated as $h_1$ and $h_2$:

$$\text{acc}_i = \text{acc}_{i-1} \cdot \frac{{\color{green}(1+\beta)^4 \cdot \text{query}} \cdot (\gamma(1+\beta) + t_{i-1} + \beta t_i)}{(\gamma(1+\beta) + s_{i-1} + \beta s_i)(\gamma(1+\beta) + s_{n+i-1} + \beta s_{n+i})}$$

Since you must compute the difference of every contiguous pair, the last element of the first half is replicated as the first element of the second half ($s_{n-1} = s_n$), and a separate constraint enforces that continuity on the interpolated polynomials $h_1$ and $h_2$:

$$L_{n-1}(x) \cdot (h_1(x) - h_2(g \cdot x)) = 0$$

which is equivalent to checking that $h_1(g^{n-1}) = h_2(1)$.
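Here is a toy Rust sketch of that split (made-up sizes, not kimchi code): the junction value appears at the end of $h_1$ and again at the start of $h_2$, which is exactly what the continuity constraint checks:

```rust
// plookup-style split of s into two halves h1 and h2 of size n, with the
// junction element replicated so that h1[n-1] == h2[0].
fn split_with_overlap(s: &[u64], n: usize) -> (Vec<u64>, Vec<u64>) {
    assert_eq!(s.len(), 2 * n - 1, "this sketch expects 2n - 1 underlying values");
    let h1 = s[..n].to_vec();
    let h2 = s[n - 1..].to_vec(); // starts with a copy of h1's last element
    (h1, h2)
}

fn main() {
    let s = [1u64, 1, 4, 4, 5, 5, 5]; // sorted(f, t) from the earlier example
    let n = 4;
    let (h1, h2) = split_with_overlap(&s, n);
    assert_eq!(h1, vec![1, 1, 4, 4]);
    assert_eq!(h2, vec![4, 5, 5, 5]);
    // the continuity constraint boils down to this equality
    assert_eq!(h1[n - 1], h2[0]);
    println!("h1 = {:?}, h2 = {:?}", h1, h2);
}
```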

The sorted vector $s$ in kimchi

Since this vector is known only by the prover, and is evaluated as part of the protocol, zero-knowledge must be added to the polynomial. To do this in kimchi, we use the same technique as with the other prover polynomials: we randomize the last evaluations (or rows, on the domain) of the polynomial.

This means two things for the lookup grand product argument:

  1. we cannot use the wrap-around trick to make sure that the list is split in two correctly (enforced by $L_{n-1}(x) \cdot (h_1(x) - h_2(g \cdot x)) = 0$, which is equivalent to $h_1(g^{n-1}) = h_2(1)$, in the plookup paper)
  2. we have even less space to store an entire query vector, which is actually fine, as the witness also has some zero-knowledge rows at the end that should not be part of the queries anyway.

The first problem can be solved in two ways:

  • Zig-zag technique. By reorganizing $s$ to alternate its values between the columns. For example, $h_1 = (s_0, s_2, s_4, \ldots)$ and $h_2 = (s_1, s_3, s_5, \ldots)$, so that you can simply write the denominator of the grand product argument as
    $$(\gamma(1+\beta) + h_1(x) + \beta h_2(x))(\gamma(1+\beta) + h_2(x) + \beta h_1(x \cdot g))$$
    This is what the plonkup paper does.
  • Snake technique. By reorganizing $s$ as a snake. This is what is done in kimchi currently.

The snake technique rearranges $s$ into the following shape:

    _   _
 | | | | |
 | | | | |
 |_| |_| |

so that the denominator becomes the following equation:

$$(\gamma(1+\beta) + h_1(x) + \beta h_1(x \cdot g))(\gamma(1+\beta) + h_2(x \cdot g) + \beta h_2(x))$$

and the snake doing a U-turn is constrained via something like

$$L_{n-1}(x) \cdot (h_1(x) - h_2(x)) = 0$$

If there's an $h_3$ (because the table is very large, for example), then you'd have something like:

$$(\gamma(1+\beta) + h_1(x) + \beta h_1(x \cdot g))(\gamma(1+\beta) + h_2(x \cdot g) + \beta h_2(x)){\color{green}(\gamma(1+\beta) + h_3(x) + \beta h_3(x \cdot g))}$$

with the added U-turn constraint:

$$L_0(x) \cdot (h_2(x) - h_3(x)) = 0$$
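To make the snake layout concrete, here is a toy Rust sketch (made-up values and a hypothetical `snake` helper, not kimchi code) that lays a vector out in columns of size $n$, reverses every other column, and shares the turning value between adjacent columns so each U-turn reduces to the equalities above:

```rust
// Snake layout: s is written top-to-bottom into h1, bottom-to-top into h2,
// top-to-bottom into h3, and so on; each U-turn becomes a single equality.
fn snake(s: &[u64], n: usize) -> Vec<Vec<u64>> {
    let mut columns = Vec::new();
    let mut start = 0;
    loop {
        let end = (start + n).min(s.len());
        let mut col = s[start..end].to_vec();
        col.resize(n, *col.last().unwrap()); // pad a short final column
        if columns.len() % 2 == 1 {
            col.reverse(); // odd columns are read bottom-to-top
        }
        columns.push(col);
        if end == s.len() {
            return columns;
        }
        start = end - 1; // the turning value is shared with the next column
    }
}

fn main() {
    let s: Vec<u64> = (1..=10).collect();
    let n = 4;
    let cols = snake(&s, n);
    assert_eq!(cols[0], vec![1, 2, 3, 4]);
    assert_eq!(cols[1], vec![7, 6, 5, 4]); // read bottom-to-top
    assert_eq!(cols[2], vec![7, 8, 9, 10]);
    // U-turn constraints: L_{n-1} * (h1 - h2) = 0 and L_0 * (h2 - h3) = 0
    assert_eq!(cols[0][n - 1], cols[1][n - 1]);
    assert_eq!(cols[1][0], cols[2][0]);
    println!("snake columns: {:?}", cols);
}
```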

Unsorted $t$ in $s$

Note that at setup time, $t$ cannot be sorted as it is not combined yet. Since $s$ needs to be sorted by $t$ (in other words, not sorted, but sorted following the elements of $t$), there are two solutions:

  1. both the prover and the verifier can sort the combined $t$, so that $s$ can be sorted via the typical sorting algorithms
  2. the prover can sort $s$ by $t$, so that the verifier doesn't have to do any sorting and can just rely on the commitment of the columns of $t$ (which the prover can evaluate in the protocol).

We do the second one, but there is an edge-case: the combined $t$ entries can repeat. For some $i, l$ such that $i \neq l$, we might have

$$t_0[i] + j \cdot t_1[i] + j^2 \cdot t_2[i] = t_0[l] + j \cdot t_1[l] + j^2 \cdot t_2[l]$$

For example, if $f = \{1, 2, 2, 3\}$ and $t = \{2, 1, 2, 3\}$, then $\text{sorted}(f, t) = \{2, 2, 2, 1, 1, 2, 3, 3\}$ would be one way of sorting things out. But $\text{sorted}(f, t) = \{2, 2, 2, 2, 1, 1, 3, 3\}$ would be incorrect, as it breaks the order of $t$ (the second $2$ of $t$ must still appear after the $1$).
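One valid strategy for the prover-side sorting is sketched below in Rust (the values stand for already-combined entries, the helper name is made up, and this is not kimchi's implementation): walk $t$ in order and attach each query value to the first matching table entry.

```rust
use std::collections::HashMap;

// s is f ++ t arranged to follow the order of t, attaching each query value
// to the first matching t entry.
fn sort_by_table(f: &[u64], t: &[u64]) -> Vec<u64> {
    // count how many times each value is queried
    let mut counts: HashMap<u64, usize> = HashMap::new();
    for &x in f {
        *counts.entry(x).or_insert(0) += 1;
    }
    let mut s = Vec::with_capacity(f.len() + t.len());
    for &x in t {
        s.push(x); // the table entry itself
        // plus every not-yet-placed query of that value (first occurrence wins)
        for _ in 0..counts.remove(&x).unwrap_or(0) {
            s.push(x);
        }
    }
    assert!(counts.is_empty(), "some query value is not in the table");
    s
}

fn main() {
    let f = [1u64, 2, 2, 3];
    let t = [2u64, 1, 2, 3];
    assert_eq!(sort_by_table(&f, &t), vec![2, 2, 2, 1, 1, 2, 3, 3]);
    println!("s = {:?}", sort_by_table(&f, &t));
}
```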

Recap

So to recap, to create the sorted polynomials $h_i$, the prover:

  1. creates a large query vector which contains the concatenation of the $4$ per-row (combined with the joint combinator) queries (which might contain dummy queries) for all rows
  2. creates the (combined with the joint combinator) table vector
  3. sorts all of that into a big vector $s$
  4. divides that vector $s$ into as many $h_i$ vectors as necessary, following the snake method
  5. interpolates these $h_i$ vectors into $h_i$ polynomials
  6. commits to them, and evaluates them as part of the protocol.