Bullet points:
1. Halo2 uses the lookup argument technique, allowing to run lookup on any random sets;
2. It is of more universality than the Plookup technique because the latter requires the same types of elements while the sizes can be different;
3. Compared with the permutation argument of Plonk, it is not required to specify the index for the permutation here;
4. For the main branch of corresponding Halo2 code, the commit is: a7cd600eb60b1528159b92af5e426adcc615de1a
Without zero-knowledge
Note: To make it easy to understand, let’s ignore the features of zero-knowledge for a while.
We will express "lookup" by discussing the "subset argument" between two sets of data on one table (2k rows, index starting from 0). As shown in the following figure, there are two sets of data, A and S, which are in two columns of the table, respectively.
The "subset argument" aims to prove that "each element in A appears in S, that is, A is a subset of S", which means that "all the elements in A must be contained in S, while all the elements in S may not be contained in A".
Take li be Lagrange baisi polynomial, that is, the value at row i is 1 while is 0 at the others.
Protocol:
Step1: Prover supply permutation columns of A and S
The permutation process is shown in the following figure (code: halo2\src\plonk\lookup\prover.rs: permute_expression_pair):
Note:
Step2: Constraints between \(A{'}\) and \(S{'}\)
Based on the above permutation results, it is required to make constraints to \(A{'}\) and \(S{'}\) (the black line in the figure) \(A{'}\)i = \(S{'}\)i or (the red line in the figure) \(A{'}\)i = \(A{'}\)i-1, that is:
(\(A{'}\)(X)- \(S{'}\)(X)) ∙(\(A{'}\)(X) - \(A{'}\)(w(-1)X)) = 0
In addition, it is required to make a constraint $A{'}$0 = $S{'}$0, that is
(\(A{'}\)(X)- \(S{'}\)(X)) ∙l0(X) = 0
To constrain that the permutations of A and S are \(A{'}\) and \(S{'}\) respectively, it is needed to use following rules as the constraints:
Z(wX)∙(\(A{'}\)(X) + β)∙(\(S{'}\)(X) + γ)-Z(X)∙(A(X) + β)∙(S(X) + γ)=0
(1-Z(X))∙l0(X) = 0
As is known, with d+1 points, a polynomial, of which the oder is less than d+1, can be exclusively confirmed. Therefore, if the order of the polynomial is not enough, just with more queries than orders of the polynomial on different points, the verifier can get all data of the polynomial, losing the features of zero knowledge. As a result, we need to scale the polynomial (Stark’s core concept. The principle is shown as follows (the polynomial value on y axis indicates secret):
Therefore, to obtain the features of zero knowledge, it requires to scale the original polynomial until its order is greater than the query times, so as to make sure that no information of the secret will be revealed.
In detail, for the table with a size of 2k, if the size of A and S is less than 2k, some random values need to be filled in it. But it must be noted that, the filled random values, which don’t meet requirements of the Lookup argument theory, need to be selected using the selector.
If the total number of rows is 2k, in which the valid rows number is u while the invalid is t, it needs to meet u + t +1 = 2k. The specific form is shown as follows:
There are two newly-added selectors:
Then the above constraints need to be changed as follows:
(1 - (qlast(X) + qblind(X)))∙(Z(wX)∙(\(A{'}\)(X) + β)∙(\(S{'}\)(X) + γ)-Z(X)∙(A(X) + β)∙(S(X) + γ))=0
(1 - (qlast(X) +qblind(X) ))∙(\(A{'}\)(X)- \(S{'}\)(X)) ∙(\(A{'}\)(X) - \(A{'}\)(w(-1)X)) = 0
The constraint to line 0 remains unchanged:
(\(A{'}\)(X)- \(S{'}\)(X)) ∙l0(X) = 0
(1-Z(X))∙l0(X) = 0
The constraint to line u:
(Z(X)2-Z(X))∙qlast(X) = 0
According to the formula of Z(X) in polynomial, Z(u) = 1 (the value set of numerator and denominator is the same). It is worth noting that this constraint, which constrains Z(u) within the range of {0,1}, is or quite small probability to be 0 (i.e., Ai + β = 0 or Si + γ = 0)
The above Lookup argument can be used for generalized conditions:
(1 - (qlast(X) +qblind(X) ))
∙(Z(wX)∙(\(A{'}\)(X) + β)∙(\(S{'}\)(X) + γ)-Z(X)
∙(\(∑_{i}\)σ(m-i) Ai(X) + β)∙(\(∑_{i}\)σ(m-i) Si(X) + γ))=0
According to 1, for any number of lookup argument, it can be realized with subset argument:
a. For example, constrain R (x,y, …) on each row, which means that the R can be regarded as some sets of S, and then verify (x,y, …) ϵ R.
b. In particular, if R is a function, it is required to run a validity check to the function input, and introducing a range check may be needed.
As well, it can support multiple tables by combining them into a large one and then corresponding to a tag selector, of which the method is similar to the technique of Chapter 4 and Chapter 5 in Plookup.