multilinear: a multilinear polynomial has degree at most 1 in each variable.
Example:
\(1 + x + xy - xz + 2yz - 10xyz\)
Notes:
\(\begin{align*} \hspace{40pt} &(0,0,0)\leftrightarrow 1\\ &(0,0,1) \leftrightarrow v_1\\ &(0,1,0) \leftrightarrow v_2\\ &(0,1,1) \leftrightarrow v_1v_2\\ &(1,0,0) \leftrightarrow v_3\\ &(1,0,1) \leftrightarrow v_1v_3\\ &(1,1,0) \leftrightarrow v_2v_3\\ &(1,1,1) \leftrightarrow v_1v_2v_3 \end{align*}\)
Let \(f_l: \{0,1\}^l \to B_l\) be the map from bit strings to basis monomials illustrated above. Then for
\[ x \in \{0,1\}^l \]
\[ y \in \{0,1\}^m \]
we have
\[ f_{l+m}(x \otimes y) = f_l(x)f_m(y) \]
where \(\otimes\) is the tensor product (here, concatenation of bit strings).
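One concrete way to model this homomorphism (the encoding below is my own, not from the paper) is to represent each basis monomial as the set of variable indices it contains, with \(v_1\) corresponding to the last bit as in the table above, and to treat the two factors' variable sets as disjoint when multiplying:

```python
# A monomial is modeled as the frozenset of variable indices it contains,
# with v_1 corresponding to the last bit (as in the table above).
def monomial(bits):
    l = len(bits)
    return frozenset(l - i for i, b in enumerate(bits) if b == 1)

# Multiplying f_l(x) by f_m(y) treats the variable sets as disjoint:
# shift the variables of f_l(x) past the m variables of f_m(y).
def shifted(mono, m):
    return frozenset(i + m for i in mono)

x, y = (1, 0), (0, 1, 1)            # arbitrary example bit strings
lhs = monomial(x + y)               # f_{l+m}(x concat y)
rhs = shifted(monomial(x), len(y)) | monomial(y)
assert lhs == rhs == frozenset({1, 2, 5})   # the monomial v_1 v_2 v_5
```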
multilinear extension: for any \(f: \{0,1\}^l\to \mathbb{F}\) there is a unique multilinear polynomial \(\widetilde{f}: \mathbb{F}^l \to \mathbb{F}\) that agrees with \(f\) on \(\{0,1\}^l\). For example, take \(f: \{0,1\}^3 \to \mathbb{F}\) given by:
\(x\) | \(f(x)\) |
---|---|
\((0,0,0)\) | \(2\) |
\((0,0,1)\) | \(3\) |
\((0,1,0)\) | \(5\) |
\((0,1,1)\) | \(7\) |
\((1,0,0)\) | \(11\) |
\((1,0,1)\) | \(13\) |
\((1,1,0)\) | \(17\) |
\((1,1,1)\) | \(19\) |
Then the unique multilinear extension \(\widetilde{f}\) is given by
\[
\widetilde{f} =
2 + x_1 + 3x_2 + x_1x_2 + 9x_3 + x_1x_3 + 3x_2x_3 - x_1x_2x_3
\]
or in Lagrange basis:
\[
\begin{align}
\widetilde{f} &= 2\cdot(1-x_3)(1-x_2)(1-x_1)\\
&+ 3\cdot(1-x_3)(1-x_2)x_1\\
&+ 5\cdot(1-x_3)x_2(1-x_1)\\
&+ 7\cdot(1-x_3)x_2x_1\\
&+ 11\cdot x_3(1-x_2)(1-x_1)\\
&+ 13\cdot x_3(1-x_2)x_1\\
&+ 17\cdot x_3x_2(1-x_1)\\
&+ 19\cdot x_3x_2x_1\\
\end{align}
\]
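As a sanity check, both forms can be evaluated against the table; a small sketch (the function names are my own):

```python
# Evaluate f~ from the expanded coefficients and from the Lagrange
# basis, and compare both with the table values on {0,1}^3.
from itertools import product

f = {(0,0,0): 2, (0,0,1): 3, (0,1,0): 5, (0,1,1): 7,
     (1,0,0): 11, (1,0,1): 13, (1,1,0): 17, (1,1,1): 19}

def mle_expanded(x3, x2, x1):
    return 2 + x1 + 3*x2 + x1*x2 + 9*x3 + x1*x3 + 3*x2*x3 - x1*x2*x3

def mle_lagrange(x3, x2, x1):
    # sum over b in {0,1}^3 of f(b) * prod_i (x_i if b_i else 1 - x_i)
    total = 0
    for b in product((0, 1), repeat=3):
        chi = 1
        for b_i, x_i in zip(b, (x3, x2, x1)):
            chi *= x_i if b_i else 1 - x_i
        total += f[b] * chi
    return total

# The two forms agree with f on the hypercube...
for b in product((0, 1), repeat=3):
    assert mle_expanded(*b) == mle_lagrange(*b) == f[b]
# ...and (by uniqueness of the MLE) everywhere else, e.g.:
assert mle_expanded(2, 3, 5) == mle_lagrange(2, 3, 5)
```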
\[\sum_{y\in\{0,1\}^{\log n}} \widetilde{M}(r,y)\cdot\widetilde{t}(y) = \widetilde{a}(r)\]
(Note the correction from \(N\) to \(n\))
Example of equation 5:
Say we have a lookup table for 1-bit XOR:
l | r | o |
---|---|---|
0 | 0 | 0 |
0 | 1 | 1 |
1 | 0 | 1 |
1 | 1 | 0 |
We can turn the table into a vector by reading each row as the binary representation of a number.
\[t = (0,3,5,6)\]
The multilinear extension of \(t\) can be computed using the Lagrange basis for multilinear polynomials when \(l=2\) (see Lasso p. 19). Here \(x_1\) is the low bit of the index and \(x_2\) is the high bit.
\[\begin{align} \widetilde{t} &= 0\cdot(1-x_2)(1-x_1)\\ &+3\cdot (1-x_2)x_1\\ &+5\cdot x_2(1-x_1)\\ &+6\cdot x_2x_1\\ \end{align}\]
Simplified:
\[\widetilde{t} = 3x_1+5x_2-2x_1x_2\]
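The row-to-vector step and the simplified MLE can both be checked with a few lines (a sketch; `t_mle` is my own name):

```python
# Rebuild t from the truth table rows (l, r, o) read as binary numbers,
# then check the simplified MLE against every table entry.
rows = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
t = [int("".join(map(str, row)), 2) for row in rows]
assert t == [0, 3, 5, 6]

def t_mle(x2, x1):          # x2 = high bit of the index, x1 = low bit
    return 3*x1 + 5*x2 - 2*x1*x2

for idx, v in enumerate(t):
    assert t_mle(idx >> 1, idx & 1) == v
```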
Now say we have some lookups into table \(t\) stored in vector \(a\):
\[a = (5, 0)\]
The multilinear extension of \(a\) is:
\[\widetilde{a} = 5-5x_1\]
The prover shows that every element of \(a\) is in \(t\) by showing that it knows a sparse matrix \(M\in\mathbb{F}^{m\times n}\) such that each row of \(M\) has exactly one cell equal to 1 and the rest zero, and that \(M\cdot t=a\) (Lasso p. 12).
The matrix is: \[M = \left[\begin{array}{cccc} 0 &0 &1 &0 \\ 1 &0 &0 &0 \end{array}\right]\]
To get the multilinear extension of \(M\) we treat it as a vector of its rows and apply the principle for vectors from Lasso (p. 9) twice: once to get the multilinear extension of each row, and again for the resulting column vector of polynomials.
\[M = \left[\begin{array}{c} (0, 0, 1, 0) \\ (1, 0, 0, 0)\end{array}\right]\]
\[\widetilde{M} = \left[\begin{array}{c} x_2(1-x_1)\\(1-x_2)(1-x_1)\end{array}\right]\]
\[\begin{align} \widetilde{M} &= (1-x_3)x_2(1-x_1)\\ &+x_3(1-x_2)(1-x_1)\end{align}\]
NOTE: because of the homomorphism mentioned earlier, the MLE for \(M\) is the same as the MLE for \(M' = (0,0,1,0,1,0,0,0)\), i.e. the concatenation of the rows of \(M\). In this example the variable \(x_3\) selects the row while the other variables select the columns.
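This note can be verified directly; the sketch below compares the two-step MLE of \(M\) with a generic vector MLE applied to the flattened vector (function names are my own):

```python
# The two-step MLE of M matches the MLE of the flattened vector
# M' = (0,0,1,0,1,0,0,0), with x3 selecting the row and (x2, x1)
# selecting the column.
from itertools import product

M_flat = (0, 0, 1, 0, 1, 0, 0, 0)

def M_mle(x3, x2, x1):
    return (1 - x3)*x2*(1 - x1) + x3*(1 - x2)*(1 - x1)

def vector_mle(vals, xs):
    # generic MLE of a length-2^l vector, index bits ordered high to low
    l = len(xs)
    total = 0
    for idx, v in enumerate(vals):
        chi = 1
        for j, x in enumerate(xs):
            bit = (idx >> (l - 1 - j)) & 1
            chi *= x if bit else 1 - x
        total += v * chi
    return total

# agreement on the hypercube, and at a non-boolean point
for b in product((0, 1), repeat=3):
    assert M_mle(*b) == vector_mle(M_flat, b)
assert M_mle(2, 3, 4) == vector_mle(M_flat, (2, 3, 4))
```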
\[\sum_{y\in\{0,1\}^{\log n}} \widetilde{M}(r,y)\cdot\widetilde{t}(y) = \widetilde{a}(r)\]
We will show that Equation 5 holds for our examples \(\widetilde{M}, \widetilde{t},\) and \(\widetilde{a}\) from earlier.
The variable \(r\) is a random challenge chosen by the verifier. The sum is indexed by \(y\in\{0,1\}^{\log n}\), and \(n\) is the number of table entries as well as the number of columns of \(M\) (\(n=4\) in our case). This means \(y\) is formed from the bits \(x_1\) and \(x_2\) in our equations for \(\widetilde{M}\) and \(\widetilde{t}\). This leaves \(r = x_3\) in \(\widetilde{M}\) and \(r = x_1\) in \(\widetilde{a}\).
\(y = (x_2, x_1)\) | \(\widetilde{M}(r, y)\) | \(\widetilde{t}(y)\) |
---|---|---|
\((0,0)\) | \(r\) | 0 |
\((0,1)\) | \(0\) | 3 |
\((1,0)\) | \(1-r\) | 5 |
\((1,1)\) | \(0\) | 6 |
Summing the products of the last two columns, the left-hand side of Equation 5 is: \[5 - 5r\]
\[\widetilde{a}(r) = 5-5r\]
Equation 5 holds!
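The whole check can be run end to end; a sketch (the `*_mle` names are my own):

```python
# Check Equation 5 for the example: the sum over y = (x2, x1) in {0,1}^2
# of M~(r, y) * t~(y) equals a~(r) = 5 - 5r for every r.
def M_mle(r, x2, x1):
    return (1 - r)*x2*(1 - x1) + r*(1 - x2)*(1 - x1)

def t_mle(x2, x1):
    return 3*x1 + 5*x2 - 2*x1*x2

def a_mle(r):
    return 5 - 5*r

# both sides are degree 1 in r, so agreement at two points would already
# prove the polynomial identity; checking several anyway
for r in range(5):
    lhs = sum(M_mle(r, x2, x1) * t_mle(x2, x1)
              for x2 in (0, 1) for x1 in (0, 1))
    assert lhs == a_mle(r)
```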
From reading the paper and listening to Justin Thaler explain Lasso in the ZK Study Club talk, my understanding is that Lasso can be viewed as a component of a snark that provides efficient lookups.
A little history and comparison of snark lookups as I understand them (corrections or clarifications gladly accepted):
Plookup is a way of adding a custom gate that provides lookup functionality to a Plonkish system. The whole lookup table is embedded into the circuit, making the circuit at least as large as the lookup table, so only small lookup tables can be used efficiently. A common choice is a table of length 2^16 representing a binary relation on two 8-bit inputs. To simulate lookups on 256-bit inputs, each input would be unpacked into 32 8-bit limbs, 32 lookups would be performed, and the results would be packed back together into a 256-bit output using arithmetic gates. For a 256-bit XOR table (which has no carry bits to worry about), packing the results could cost up to 31 arithmetic gates in width-3 Plonk; fewer in systems with wider gates. Importantly, these gates are (usually?) multilinear.
Caulk, cq, etc. separate the lookup table from the circuit itself, adding additional phases to proving and verifying. You could view a snark using one of these systems as two composed snarks: one verifies the arithmetic performed on private witnesses and lookup queries, and the other verifies that the lookup queries themselves were contained in the table. Homomorphic commitments to the table allow for some compression of the table, so that larger tables can be used. Caulk and cq use an additional pairing check for verifying lookup queries.
Lasso also separates the lookup table from the circuit itself and adds phases to the proving and verifying algorithms like Caulk and cq, but rather than using an additional pairing check, it uses a sum-check protocol to verify lookup queries. By exploiting certain homomorphic properties of multilinear extensions that play well with the sum-check protocol, and by requiring that the lookup table is structured nicely, the lookup queries as well as the linear arithmetic needed to unpack/repack can be compressed together. This compression allows a single "lookup gate" to be represented by only \(3\cdot c\) field elements. The structured lookup tables can be decomposed into \(c\) tables of size \(N^{1/c}\), so the prover commits additionally to \(c\cdot N^{1/c}\) field elements, for a total of \(3\cdot c \cdot m + c\cdot N^{1/c}\) for \(m\) lookups into a structured table of size \(N\).
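To get a feel for the cost formula, here is a hypothetical back-of-the-envelope (the parameter values are my own examples, not from the paper):

```python
# Committed field elements per the formula above: 3*c*m + c*N^(1/c),
# for m lookups into a structured table of size N = 2^k decomposed
# into c subtables (this sketch assumes c divides k).
def committed_elements(m, k, c):
    assert k % c == 0
    return 3 * c * m + c * 2 ** (k // c)

# e.g. m = 2^20 lookups into a table of size N = 2^128
for c in (2, 4, 8):
    print(f"c={c}: {committed_elements(2**20, 128, c):,} field elements")
```

Increasing \(c\) shrinks the subtable term \(c\cdot N^{1/c}\) dramatically while the per-lookup term grows only linearly in \(c\).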
In Plonk, I could create a width-\(c+1\) circuit with two custom gates: one that packs/unpacks \(c\) "limbs" together and one that performs \(c\) lookups into \(c\) tables of size \(N^{1/c}\). Furthermore, I think I could design the "packing" custom gate to accept selectors that turn it into an addition gate or multiplication gate mod N = 2^n fairly easily. In this version of Plonk, addition mod N would cost one constraint, as would XOR on n bits, multiplication mod N, AND on n bits, and even bit rotation. This is how I think about Lasso: lookups, packing, and simple limb-wise arithmetic are available in each constraint, so that in 3 rows you can do, say, 256-bit XOR or addition mod 2^256, etc.
In Plonk, doing this would be expensive: the tables would need to be represented in-circuit, the prover would need to commit to vectors as long as the table, and the lookup argument adds significant overhead on top of the regular arithmetic gates. Lasso's lookup argument requires much less overhead, which makes this approach attractive.