# Multiset checks in PLONK and Plookup
## Grand Products
At its technical heart, [PLONK](https://eprint.iacr.org/2019/953.pdf) is based on the following primitive.
The *grand product check*: given commitments to polynomials $f,g$ over a finite field $\mathbb{F}$, and a subset $H=\{x_1,\ldots,x_n\}\subset \mathbb{F}$, check that the products of the values of $f$ and $g$ over $H$ agree.
That is, whether
$$\prod_{i\in [n]} a_i \stackrel{?}{=} \prod_{i\in [n]} b_i,$$
where $a_i = f(x_i)$ and $b_i = g(x_i)$. The PLONK paper shows this check can be performed with great efficiency when $H$ is a multiplicative subgroup.
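
To make the statement concrete, here is a minimal sketch of the check on plain vectors of values. It ignores commitments entirely and only illustrates what is being compared; the modulus `P` and the function names are assumptions made for this example, not part of PLONK.

```python
# Toy field modulus: the Mersenne prime 2^61 - 1 (an assumption for illustration;
# a real system would use the scalar field of its curve).
P = 2**61 - 1

def grand_product(values, p=P):
    """Multiply all entries of `values` modulo p."""
    acc = 1
    for v in values:
        acc = acc * v % p
    return acc

def grand_product_check(a, b, p=P):
    """Return True when the products of the entries of a and b agree mod p."""
    return grand_product(a, p) == grand_product(b, p)

a = [2, 3, 4]   # a_i = f(x_i)
b = [4, 6, 1]   # b_i = g(x_i)
print(grand_product_check(a, b))          # both products are 24
print(grand_product_check(a, [1, 2, 3]))  # 24 versus 6
```

Note that $a$ and $b$ above pass the check without containing the same elements, which is why randomness is needed in what follows.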
### Polynomials and Vectors
Throughout this post, whenever we discuss a vector $a$ of length $n$, we assume that in the actual protocol the prover has sent a commitment to a polynomial $f$ with $f(x_i)=a_i$ as above. Thus, we can allow operations like adding vectors coordinate-wise, which will be emulated in the real protocol by applying the same operation on the polynomial commitments.
## Grand products to multiset checks
One reason a grand product check is useful is that, with a little randomness, it translates to a more powerful primitive, the *multiset equality check*:
Given two vectors $a=(a_1,\ldots,a_n),b=(b_1,\ldots,b_n)$ check they contain the same elements, counting repetitions, possibly in different order.
For example, $(1,1,2,3)$ is multiset-equal to $(2,1,1,3)$ but not multiset-equal to $(1,2,3,3)$ or $(1,1,2,4)$.
Let's see the reduction from multiset equality to grand product. The verifier simply chooses a random $\gamma \in \mathbb{F}$, and runs the grand product check on the randomly shifted vectors $a'\triangleq a+\gamma, b'\triangleq b+\gamma$ ($\gamma$ is added to all coordinates).
This grand product check corresponds to
$$\prod_{i\in [n]} (a_i+\gamma) \stackrel{?}{=} \prod_{i\in [n]} (b_i+\gamma)$$
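Thinking of both sides as polynomials in $\gamma$, they are identical exactly when $a,b$ are multiset-equal, since their roots are the multisets $\{-a_i\}$ and $\{-b_i\}$. The Schwartz-Zippel Lemma then implies that, unless $a,b$ are multiset-equal, the grand product check fails with high probability: two distinct polynomials of degree $n$ agree on at most $n$ values of $\gamma$, so a false acceptance has probability at most $n/|\mathbb{F}|$.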
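
The reduction can be sketched in a few lines of Python. This is only an illustration on raw values: the toy modulus `P`, the function names, and drawing $\gamma$ with Python's `random` module are all assumptions of the sketch, whereas in the protocol $\gamma$ is the verifier's challenge.

```python
import random

P = 2**61 - 1  # toy prime field modulus (an assumption for illustration)

def grand_product(values, p=P):
    """Multiply all entries of `values` modulo p."""
    acc = 1
    for v in values:
        acc = acc * v % p
    return acc

def multiset_equal(a, b, gamma=None, p=P):
    """Randomized multiset equality test via one grand product check.

    `gamma` stands in for the verifier's challenge; it is drawn at random
    when not supplied.
    """
    if len(a) != len(b):
        return False
    if gamma is None:
        gamma = random.randrange(p)
    lhs = grand_product([(x + gamma) % p for x in a], p)
    rhs = grand_product([(x + gamma) % p for x in b], p)
    return lhs == rhs

print(multiset_equal([1, 1, 2, 3], [2, 1, 1, 3]))  # same multiset
print(multiset_equal([1, 1, 2, 3], [1, 2, 3, 3]))  # rejected with high probability

# Without the shift, (1, 6) and (2, 3) have the same product 6 ...
print(grand_product([1, 6]) == grand_product([2, 3]))
# ... but the shifted products differ: 6*11 = 66 versus 7*8 = 56.
print(multiset_equal([1, 6], [2, 3], gamma=5))
```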
## Permutations via Multiset checks
Given a permutation $\sigma:[n]\to [n]$, suppose we want to check that $b=\sigma(a)$, in the sense that for each $i$, $b_i=a_{\sigma(i)}$. See the PLONK paper (and other SNARK papers) for why permutation checks are central to SNARKs. To reduce this to a multiset check, look first at the vectors of *pairs*
$$((a_i,i))_{i\in [n]}, ((b_i,\sigma(i)))_{i\in [n]}$$
Thinking about it for a minute, you can see that they are multiset-equal if and only if $b=\sigma(a)$. But we wish to reduce to a multiset check on vectors of elements, rather than vectors of pairs.
For this we choose a random $\beta$, and define vectors $a',b'$ by
$$a'_i\triangleq a_i+\beta\cdot i, \quad b'_i\triangleq b_i + \beta\cdot \sigma(i)$$
If the above vectors of pairs are not multiset equal, then with high probability $a'$ and $b'$ aren't either. Thus, it suffices to do the multiset check between $a'$ and $b'$.
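
Here is a hedged sketch of the whole chain, permutation check to multiset check to grand product, on raw values. Indices are 0-based in the code (the text uses $[n]$), and the modulus `P`, the function names, and the use of Python's `random` module for $\beta,\gamma$ are assumptions of the example.

```python
import random

P = 2**61 - 1  # toy prime field modulus (an assumption for illustration)

def grand_product(values, p=P):
    acc = 1
    for v in values:
        acc = acc * v % p
    return acc

def multiset_equal(a, b, gamma, p=P):
    """Grand product check on the vectors shifted by gamma."""
    lhs = grand_product([(x + gamma) % p for x in a], p)
    rhs = grand_product([(x + gamma) % p for x in b], p)
    return len(a) == len(b) and lhs == rhs

def permutation_check(a, b, sigma, p=P):
    """Randomized check that b[i] == a[sigma[i]] for every i (0-based)."""
    n = len(a)
    beta = random.randrange(p)   # folds each pair (value, index) into one element
    gamma = random.randrange(p)  # shift used by the multiset check
    a_prime = [(a[i] + beta * i) % p for i in range(n)]
    b_prime = [(b[i] + beta * sigma[i]) % p for i in range(n)]
    return multiset_equal(a_prime, b_prime, gamma, p)

a = [10, 20, 30, 40]
sigma = [2, 0, 3, 1]
b = [a[sigma[i]] for i in range(len(a))]  # [30, 10, 40, 20]
print(permutation_check(a, b, sigma))
# Same multiset of values, but not arranged according to sigma:
print(permutation_check(a, [10, 30, 40, 20], sigma))
```

The second call shows why the index term matters: a plain multiset check on $a$ and $b$ would accept it.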
## Table lookups via Multiset checks
### The XOR example
Another very useful operation for zk-SNARKs is the table lookup operation. For example, suppose we had three vectors of field elements $a=(a_1,\ldots,a_n),b=(b_1,\ldots,b_n),c=(c_1,\ldots,c_n)$,
and we want to check that for each $i\in [n]$, $a_i,b_i,c_i$ correspond to $8$-bit strings, and $c_i=a_i\oplus b_i$ where $\oplus$ is a bitwise XOR. A traditional SNARK approach would require many arithmetic constraints for each tuple $(a_i,b_i,c_i)$ to decompose the inputs into bits and perform the bitwise XORs.
Lookup tables are an alternative approach to operations requiring many constraints. We start by precomputing the truth table $T$ corresponding to the XOR operation. So $T$ consists of three vectors/columns $T_1,T_2,T_3$ of length $2^{16}$ such that the rows of $T$, $(T_{1,i},T_{2,i},T_{3,i})$, go over all legal input/output combinations for 8-bit XOR.
***Remark:*** *In the SNARK context, $a,b,c$ typically correspond to witness polynomials for which we wish to check an XOR relation only for some $i$'s; combining different tables and gates can be achieved with "selectors", see e.g. Section 4.1 of the plookup paper.*
### Reducing tuples to single elements
We thus wish to show that each tuple $(a_i,b_i,c_i)$ is equal to some row of $T$. The first thing we do is use randomness to reduce checking tuples to checking single elements: We choose a random $\alpha$ and look at the vector $f$ with $f_i = a_i + \alpha\cdot b_i + \alpha^2 c_i$ for each $i\in [n]$. We similarly create a randomly compressed version $t$ of the table $T$ with $t_i = T_{1,i} + \alpha T_{2,i} + \alpha^2 T_{3,i}$. The Schwartz-Zippel Lemma shows that if some tuple $(a_i,b_i,c_i)$ was *not* in $T$, then with very high probability $f_i$ is not an element of $t$.
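
The compression step can be tried out directly. The following sketch builds the 8-bit XOR table, compresses it with a random $\alpha$, and tests membership of a few compressed witness tuples; the modulus `P`, the helper names and the sample witness are assumptions made for the example.

```python
import random

P = 2**61 - 1  # toy prime field modulus (an assumption for illustration)

def xor_table(bits=8):
    """All rows (x, y, x XOR y) over `bits`-bit inputs: 2^(2*bits) rows."""
    size = 1 << bits
    return [(x, y, x ^ y) for x in range(size) for y in range(size)]

def compress(row, alpha, p=P):
    """Map a triple (u, v, w) to u + alpha*v + alpha^2*w mod p."""
    u, v, w = row
    return (u + alpha * v + alpha * alpha * w) % p

alpha = random.randrange(P)  # stands in for the verifier's challenge
T = xor_table()
# Compressed table; stored as a set only to make the membership test fast.
t = {compress(row, alpha) for row in T}

# Hypothetical witness: the last triple is not a valid XOR (3 ^ 5 is 6, not 7).
witness = [(3, 5, 6), (255, 1, 254), (3, 5, 7)]
f = [compress(row, alpha) for row in witness]
# The two valid triples land in t; the invalid one misses it with
# overwhelming probability over the choice of alpha.
print([f_i in t for f_i in f])
```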
### The plookup protocol - sort and compare differences
So it now suffices to check that every element of $f$ is an element of $t$. Let's denote this by $f\subset t$. The [plookup protocol](https://eprint.iacr.org/2020/315) reduces this to one multiset equality check. The protocol is based on the following observation: Suppose we have sorted versions of two vectors, and further know they start at the same element. Then, they contain the same distinct elements if and only if they contain the same sequence of *non-zero differences* between adjacent elements.
Let's describe this on a concrete example: $f=(2,2,1,1,5), t=(1,2,5)$. The prover will create an additional vector $s$ - which is a "sorted by $t$" version of the concatenation $(f,t)$ of $f$ and $t$. Sorted by $t$ means elements appear in $s$ in the same order they appear in $t$.
In our example, we have $s=(1,1,1,2,2,2,5,5)$.
Now the point is to look at the *difference vectors* $s',t'$ of $s$ and $t$; i.e. the vectors consisting of the differences of adjacent elements. We have $s'=(0,0,1,0,0,3,0), t'=(1,3)$.
Note that when $f\subset t$, $s'$ contains exactly the same non-zero elements as $t'$. Let us denote by $t''$ the vector $t'$ concatenated with $|f|$ zeroes. At this point, let's first describe a simpler reduction using two multiset checks:
1. Between $s$ and $(f,t)$.
2. Between $s'$ and $t''$.
We claim these two checks suffice:
The first check implies in particular that $f\subset s$. Thus, after the first check it suffices to verify that $s\subset t$.
The first check also implies that $t\subset s$. Thus, assuming the entries of $t$ are distinct, if $s$ contained a value outside of $t$, it would have to contain more than $|t|$ distinct values. But the second check implies $s'$ has at most $|t|-1$ non-zeroes, and so $s$ has at most $|t|$ distinct values.
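
The two-check reduction can be replayed on the running example. The sketch below works over the integers and compares multisets directly with `collections.Counter`, rather than through grand products; the helper names are assumptions, and `sort_by_t` assumes that $t$ has distinct entries and that $f\subset t$.

```python
from collections import Counter

def sort_by_t(f, t):
    """Concatenate f and t, ordered by where each value appears in t.

    Assumes t has distinct entries and every element of f occurs in t.
    """
    position = {v: i for i, v in enumerate(t)}
    return sorted(list(f) + list(t), key=lambda v: position[v])

def differences(v):
    """Differences of adjacent elements: v[i+1] - v[i]."""
    return [v[i + 1] - v[i] for i in range(len(v) - 1)]

def two_check_lookup(f, t, s):
    """The two multiset checks, given a claimed vector s from the prover."""
    t_pp = differences(t) + [0] * len(f)                  # t'' in the text
    check1 = Counter(s) == Counter(list(f) + list(t))     # s versus (f, t)
    check2 = Counter(differences(s)) == Counter(t_pp)     # s' versus t''
    return check1 and check2

f, t = [2, 2, 1, 1, 5], [1, 2, 5]
s = sort_by_t(f, t)
print(s)                          # [1, 1, 1, 2, 2, 2, 5, 5]
print(differences(s))             # [0, 0, 1, 0, 0, 3, 0]
print(two_check_lookup(f, t, s))  # True

# A lookup of 4, which is not in t, fails the second check for this sorted s:
print(two_check_lookup([2, 4], t, sorted([2, 4] + t)))  # False
```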
### A more efficient version using randomness
Now, let's see the plookup reduction that uses only one multiset check. The verifier chooses a random $\beta$, and we redefine $s', t'$ to be the "*randomized* difference vectors" of $s$ and $t$. That is, $s'_i = s_i + \beta s_{i+1}$, and $t'$ is defined analogously.
We now claim that it suffices to do a single multiset check between $s'$ and $((1+\beta)f, t')$. To see this most easily, it will be convenient to think of our elements as *formal polynomials in $\beta$*, rather than field elements.
Using this viewpoint, the elements of $s'$ are the degree one polynomials $s_i + s_{i+1}\cdot \beta$.
When $s_i\neq s_{i+1}$ this cannot match with an element of $(1+\beta)\cdot f$ which will have the same coefficient for $1$ and $\beta$, and so this element must match with one from $t'$ of the form $t_j + t_{j+1}\cdot \beta$. This means that whenever $s$ "changes" the new value $s_{i+1}$ is contained in $t$, and so $s\subset t$.
From the other side, each polynomial $f_i + f_i\cdot \beta$ must match with an element of $s'$. It can only match with an element $s_j + s_{j+1}\cdot \beta$ with $s_j=s_{j+1}$. So, for some $j$, $s_j=s_{j+1}=f_i$; hence $f\subset s$, and so $f\subset t$.
Finally, the Schwartz-Zippel Lemma tells us that, with high probability over the random choice of $\beta$, these comparisons between formal polynomials in $\beta$ give the same result as the comparisons between the actual field elements.
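
Putting the pieces together, here is a minimal sketch of the single-check reduction on raw values, with the multiset check itself done via a shifted grand product. The modulus `P`, the function names and the sampling of $\beta,\gamma$ with Python's `random` module are assumptions of the example; the vector $s$ is supplied by the prover.

```python
import random

P = 2**61 - 1  # toy prime field modulus (an assumption for illustration)

def grand_product(values, p=P):
    acc = 1
    for v in values:
        acc = acc * v % p
    return acc

def multiset_equal(a, b, gamma, p=P):
    """Grand product check on the vectors shifted by gamma."""
    lhs = grand_product([(x + gamma) % p for x in a], p)
    rhs = grand_product([(x + gamma) % p for x in b], p)
    return len(a) == len(b) and lhs == rhs

def randomized_differences(v, beta, p=P):
    """Entries v[i] + beta * v[i+1] for adjacent pairs of v."""
    return [(v[i] + beta * v[i + 1]) % p for i in range(len(v) - 1)]

def lookup_check(f, t, s, p=P):
    """One multiset check between s' and ((1 + beta) f, t')."""
    beta = random.randrange(p)
    gamma = random.randrange(p)
    s_prime = randomized_differences(s, beta, p)
    t_prime = randomized_differences(t, beta, p)
    rhs = [(1 + beta) * x % p for x in f] + t_prime
    return multiset_equal(s_prime, rhs, gamma, p)

f, t = [2, 2, 1, 1, 5], [1, 2, 5]
s = [1, 1, 1, 2, 2, 2, 5, 5]       # (f, t) sorted by t
print(lookup_check(f, t, s))

# Looking up 4, which is not in t: rejected with high probability.
print(lookup_check([2, 4], t, [1, 2, 2, 4, 5]))
```

In the failing case, $s'$ contains $2+4\beta$ and $4+5\beta$, which match neither an element of $(1+\beta)f$ nor an element of $t'$.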
## Summary:
Describing protocols via primitives like multiset checks abstracts away many details and lets us more easily see what's going on at a high level.
***References/Acknowledgements:** The reduction from permutations to grand products appeared first in a [paper by Bayer and Groth](http://www0.cs.ucl.ac.uk/staff/J.Groth/MinimalShuffle.pdf). We thank Tom Walton-Pocock for a review and suggestions.*

Published on **HackMD**