Multiset checks in PLONK and Plookup

Grand Products

PLONK at its technical heart is based on the following primitive - the grand product check: given commitments to polynomials $f,g$ over a finite field $\mathbb{F}$, and a subset $H=\{x_1,\ldots,x_n\}\subset \mathbb{F}$, check that the products of the values of $f$ and $g$ over $H$ agree. That is, whether

$$\prod_{i\in[n]} a_i \stackrel{?}{=} \prod_{i\in[n]} b_i,$$

where $a_i=f(x_i)$ and $b_i=g(x_i)$. The PLONK paper shows this check can be performed with great efficiency when $H$ is a multiplicative subgroup.

Polynomials and Vectors

Throughout this post, whenever we discuss a vector $a$ of length $n$, we assume that in the actual protocol the prover has sent a commitment to a polynomial $f$ with $f(x_i)=a_i$ as above. Thus, we can allow operations like adding vectors coordinate-wise, which will be emulated in the real protocol by applying the same operation on the polynomial commitments.
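For concreteness, here is a minimal sketch of what this convention means: a vector is identified with the unique low-degree polynomial taking the value $a_i$ at $x_i$. The prime `p`, the set `xs`, and the vector `a` below are illustrative, and Lagrange interpolation stands in for however the prover actually builds $f$.

```python
# Encode a vector as the evaluations of a low-degree polynomial over F_p.
p = 97                      # illustrative prime
xs = [1, 2, 3, 4]           # the subset H = {x_1, ..., x_n}
a = [10, 20, 30, 40]        # the vector to encode

def lagrange_eval(xs, ys, x, p):
    """Evaluate the degree < n interpolant of the points (xs[i], ys[i]) at x."""
    total = 0
    for i in range(len(xs)):
        num, den = 1, 1
        for j in range(len(xs)):
            if j != i:
                num = (num * (x - xs[j])) % p
                den = (den * (xs[i] - xs[j])) % p
        total = (total + ys[i] * num * pow(den, -1, p)) % p
    return total

# The polynomial agrees with the vector on H.
assert all(lagrange_eval(xs, a, xs[i], p) == a[i] % p for i in range(len(xs)))
```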

Grand products to multiset checks

One reason a grand product check is useful is that with a little randomness it translates to a more powerful primitive - the multiset equality check:
Given two vectors $a=(a_1,\ldots,a_n)$, $b=(b_1,\ldots,b_n)$, check that they contain the same elements, counting repetitions, possibly in different order.

For example, $(1,1,2,3)$ is multiset-equal to $(2,1,1,3)$, but not multiset-equal to $(1,2,3,3)$ or $(1,1,2,4)$.

Let's see the reduction from multiset equality to grand product. The verifier simply chooses a random $\gamma\in\mathbb{F}$, and runs the grand product check on the randomly shifted vectors $a\to a+\gamma$, $b\to b+\gamma$ ($\gamma$ is added to all coordinates).
This grand product check corresponds to

$$\prod_{i\in[n]}(a_i+\gamma) \stackrel{?}{=} \prod_{i\in[n]}(b_i+\gamma).$$

Thinking of both sides as polynomials in $\gamma$, the Schwartz-Zippel Lemma implies that the grand product check fails with high probability unless $a,b$ are multiset-equal.
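A minimal sketch of this reduction, with exact arithmetic in a toy prime field standing in for the committed-polynomial version; `p` and the test vectors are illustrative. Note that the last assertion only holds with high probability over the choice of $\gamma$.

```python
import random

p = 2**31 - 1  # illustrative prime; a real protocol uses a much larger field

def grand_product(values, p):
    result = 1
    for v in values:
        result = (result * v) % p
    return result

def multiset_check(a, b, p):
    """Reduce multiset equality to a grand product check: shift both vectors
    by a random gamma and compare the products of (a_i + gamma), (b_i + gamma)."""
    gamma = random.randrange(p)
    return grand_product([(x + gamma) % p for x in a], p) == \
           grand_product([(x + gamma) % p for x in b], p)

assert multiset_check([1, 1, 2, 3], [2, 1, 1, 3], p)        # multiset-equal
assert not multiset_check([1, 1, 2, 3], [1, 2, 3, 3], p)    # unequal: fails w.h.p.
```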

Permutations via Multiset checks

Given a permutation

σ:[n][n] suppose we want to check that
b=σ(a)
, in the sense that for each
i
,
bi=aσ(i)
. See the PLONK paper (and other SNARK papers) as to why permutation checks are central to SNARKs. To reduce this to a multiset check, look first at the vectors of pairs
((ai,i))i[n],((bi,σ(i))i[n]

Thinking about it for a minute, you can see that they are multiset-equal if and only if

b=σ(a). But we wish to reduce to a mutliset check on vectors of elements, rather than vectors of pairs.
For this we choose random
β
, and define vectors
a,b
by
aiai+βi,bibi+βσ(i)

If the above vectors of pairs are not multiset equal, then with high probability

a and
b
aren't either. Thus, it suffices to do the multiset check between
a
and
b
.
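Here is a sketch of this reduction in the same toy setting; for brevity the multiset check itself is done by sorting rather than by the grand product argument above, and indices are 0-based. All names are illustrative.

```python
import random

p = 2**31 - 1

def multiset_equal(a, b):
    """Exact multiset equality, standing in for the probabilistic check above."""
    return sorted(a) == sorted(b)

def permutation_check(a, b, sigma, p):
    """Check b_i = a_{sigma(i)} via a multiset check on
    a'_i = a_i + beta*i and b'_i = b_i + beta*sigma(i), for random beta.
    sigma is given as a list of 0-based indices."""
    beta = random.randrange(p)
    n = len(a)
    a_shifted = [(a[i] + beta * i) % p for i in range(n)]
    b_shifted = [(b[i] + beta * sigma[i]) % p for i in range(n)]
    return multiset_equal(a_shifted, b_shifted)

a = [5, 8, 13, 21]
sigma = [2, 0, 3, 1]                       # a permutation of {0,1,2,3}
b = [a[sigma[i]] for i in range(len(a))]   # b_i = a_{sigma(i)}
assert permutation_check(a, b, sigma, p)
```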

Table lookups via Multiset checks

The XOR example

Another very useful operation for zk-SNARKs is the table lookup operation. For example, suppose we had three vectors of field elements $a=(a_1,\ldots,a_n)$, $b=(b_1,\ldots,b_n)$, $c=(c_1,\ldots,c_n)$; and we want to check that for each $i\in[n]$, $a_i,b_i,c_i$ correspond to $8$-bit strings, and $c_i=a_i\oplus b_i$, where $\oplus$ is a bitwise XOR. A traditional SNARK approach would require many arithmetic constraints for each tuple $(a_i,b_i,c_i)$ to decompose the inputs into bits and perform the bitwise XORs.

Lookup tables are an alternative approach to operations requiring many constraints. We start by precomputing the truth table $T$ corresponding to the XOR operation. So $T$ consists of three vectors/columns $T_1,T_2,T_3$ of length $2^{16}$, such that the rows of $T$, $(T_{1,i},T_{2,i},T_{3,i})$, go over all legal input/output combinations for 8-bit XOR.
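As a sketch, the table $T$ can be precomputed directly; the column names below are just illustrative Python lists.

```python
# Precompute the XOR truth table T for 8-bit inputs: three columns of length
# 2**16, one row (x, y, x ^ y) per input pair.
T1, T2, T3 = [], [], []
for x in range(256):
    for y in range(256):
        T1.append(x)
        T2.append(y)
        T3.append(x ^ y)

assert len(T1) == len(T2) == len(T3) == 2**16
```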

Remark: In the SNARK context, $a,b,c$ typically correspond to witness polynomials for which we wish to check a XOR relation only at some $i$'s; combining different tables and gates can be achieved with "selectors", see e.g. Section 4.1 of the plookup paper.

Reducing tuples to single elements

We thus wish to show that each tuple $(a_i,b_i,c_i)$ is equal to some row of $T$. The first thing we do is use randomness to reduce checking tuples to checking single elements: we choose a random $\alpha$ and look at the vector $f$ with $f_i=a_i+\alpha\cdot b_i+\alpha^2\cdot c_i$ for each $i\in[n]$. We similarly create a randomly compressed version $t$ of the table $T$ with $t_i=T_{1,i}+\alpha\cdot T_{2,i}+\alpha^2\cdot T_{3,i}$. The Schwartz-Zippel Lemma can show that if some tuple $(a_i,b_i,c_i)$ was not in $T$, then with very high probability $f_i$ is not an element of $t$.
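A sketch of this compression step, using a 4-bit XOR table instead of the full 8-bit one to keep it small; all names and the prime are illustrative.

```python
import random

p = 2**61 - 1  # illustrative prime
alpha = random.randrange(p)

def compress(col1, col2, col3, alpha, p):
    """Row i of three columns becomes col1[i] + alpha*col2[i] + alpha^2*col3[i]."""
    return [(x + alpha * y + alpha * alpha * z) % p
            for x, y, z in zip(col1, col2, col3)]

# Toy witness columns satisfying c_i = a_i XOR b_i, with 4-bit values.
a, b = [3, 5, 12], [10, 6, 12]
c = [x ^ y for x, y in zip(a, b)]

# Toy 4-bit XOR table.
rows = [(x, y, x ^ y) for x in range(16) for y in range(16)]
T1 = [r[0] for r in rows]
T2 = [r[1] for r in rows]
T3 = [r[2] for r in rows]

f = compress(a, b, c, alpha, p)
t = compress(T1, T2, T3, alpha, p)
assert all(fi in set(t) for fi in f)   # every compressed witness row appears in t
```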

The plookup protocol - sort and compare differences

So it now suffices to check that every element of $f$ is an element of $t$. Let's denote this by $f\subset t$. The plookup protocol reduces this to one multiset equality check. The protocol is based on the following observation: suppose we have sorted versions of two vectors, and further know they start at the same element. Then, they contain the same distinct elements if and only if they contain the same sequence of non-zero differences between adjacent elements.

Let's describe this on a concrete example: $f=(2,2,1,1,5)$, $t=(1,2,5)$. The prover will create an additional vector $s$ - which is a "sorted by $t$" version of the concatenation $(f,t)$ of $f$ and $t$. Sorted by $t$ means elements appear in $s$ in the same order they appear in $t$.
In our example, we have $s=(1,1,1,2,2,2,5,5)$.

Now the point is to look at the difference vectors $\Delta s,\Delta t$ of $s$ and $t$; i.e. the vectors consisting of the differences of adjacent elements. We have $\Delta s=(0,0,1,0,0,3,0)$, $\Delta t=(1,3)$.
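In code, the construction of $s$ and the difference vectors for this example looks as follows (a sketch; "sorted by $t$" is implemented by ordering values according to their position in $t$):

```python
f = [2, 2, 1, 1, 5]
t = [1, 2, 5]

# s is the concatenation (f, t) "sorted by t": values ordered by their
# position in t.
position = {v: i for i, v in enumerate(t)}
s = sorted(f + t, key=lambda v: position[v])
assert s == [1, 1, 1, 2, 2, 2, 5, 5]

def diffs(v):
    """Vector of differences of adjacent elements."""
    return [v[i + 1] - v[i] for i in range(len(v) - 1)]

assert diffs(s) == [0, 0, 1, 0, 0, 3, 0]
assert diffs(t) == [1, 3]
```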

Note that when $f\subset t$, $\Delta s$ contains exactly the same non-zero elements as $\Delta t$. Let us denote by $\Delta t'$ the vector $\Delta t$ concatenated with $|f|$ zeroes. At this point, let's first describe a simpler reduction using two multiset checks:

1. Between $s$ and $(f,t)$.
2. Between $\Delta s$ and $\Delta t'$.

We claim these two checks suffice:
The first check implies in particular that $f\subset s$. Thus, after the first check it suffices to verify that $s\subset t$.

The first check also implies that $t\subset s$. Thus, if $s$ contained a value outside of $t$, it would have to contain more than $|t|$ distinct values. But the second check implies $\Delta s$ has at most $|t|-1$ non-zeroes, and so $s$ has at most $|t|$ distinct values.
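A sketch of this simpler reduction as a direct check, on the example from above; exact multiset comparisons stand in for the protocol's randomized ones.

```python
def multiset_equal(a, b):
    return sorted(a) == sorted(b)

def diffs(v):
    return [v[i + 1] - v[i] for i in range(len(v) - 1)]

def lookup_check_two_multisets(f, t, s):
    """The two-multiset-check reduction:
    1. s is multiset-equal to the concatenation (f, t);
    2. diffs(s) is multiset-equal to diffs(t) padded with |f| zeroes."""
    padded = diffs(t) + [0] * len(f)
    return multiset_equal(s, f + t) and multiset_equal(diffs(s), padded)

f, t = [2, 2, 1, 1, 5], [1, 2, 5]
assert lookup_check_two_multisets(f, t, [1, 1, 1, 2, 2, 2, 5, 5])

# A bad witness: 4 is not in t, so this candidate s passes the first check
# but fails the second.
assert not lookup_check_two_multisets([2, 2, 4, 1, 5], t,
                                      [1, 1, 2, 2, 2, 4, 5, 5])
```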

A more efficient version using randomness

Now, let's see the plookup reduction that uses only one multiset check. The verifier chooses a random $\beta$, and we define $s',t'$ to be the "randomized difference vectors" of $s$ and $t$. That is, $s'_i = s_i+\beta\cdot s_{i+1}$, and $t'$ is defined analogously.
We now claim that it suffices to do a single multiset check between $s'$ and $((1+\beta)f, t')$. To see this most easily, it will be convenient to think of our elements as formal polynomials in $\beta$, rather than field elements.
Using this viewpoint, the elements of $s'$ are the degree one polynomials $s_i+s_{i+1}\cdot\beta$.
When $s_i\neq s_{i+1}$, this cannot match with an element of $(1+\beta)f$, which will have the same coefficient for $1$ and $\beta$, and so this element must match with one from $t'$ of the form $t_j+t_{j+1}\cdot\beta$. This means that whenever $s$ "changes", the new value $s_{i+1}$ is contained in $t$, and so $s\subset t$.
From the other side, each polynomial $f_i+f_i\cdot\beta$ must match with an element of $s'$. It can only match with an element $s_j+s_{j+1}\cdot\beta$ with $s_j=s_{j+1}$. So, for some $j$, $s_j=s_{j+1}=f_i$; hence $f\subset s$, and so $f\subset t$.

Finally, the Schwartz-Zippel Lemma tells us that these comparisons between formal polynomials in $\beta$ will, with high probability, not differ from the comparisons of the actual field elements for the random choice of $\beta$.
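Putting the pieces together, here is a sketch of the single-multiset-check version on the running example; the names and the prime are illustrative, and an exact multiset comparison stands in for the grand-product-based one from the beginning of the post.

```python
import random

p = 2**61 - 1  # illustrative prime

def multiset_equal(a, b):
    return sorted(a) == sorted(b)

def randomized_diffs(v, beta, p):
    """The "randomized difference vector": entry i is v_i + beta * v_{i+1}."""
    return [(v[i] + beta * v[i + 1]) % p for i in range(len(v) - 1)]

def plookup_check(f, t, s, p):
    """One multiset check between s' and ((1+beta)*f, t'), for random beta."""
    beta = random.randrange(p)
    s_prime = randomized_diffs(s, beta, p)
    t_prime = randomized_diffs(t, beta, p)
    shifted_f = [((1 + beta) * x) % p for x in f]
    return multiset_equal(s_prime, shifted_f + t_prime)

f, t = [2, 2, 1, 1, 5], [1, 2, 5]
s = [1, 1, 1, 2, 2, 2, 5, 5]   # (f, t) sorted by t
assert plookup_check(f, t, s, p)
```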

Summary:

Describing protocols via primitives like multiset checks abstracts away many details and lets us more easily see what's going on at a high level.

References/Acknowledgements: The reduction from permutations to grand products first appeared in a paper by Bayer and Groth. We thank Tom Walton-Pocock for a review and suggestions.