# Lookup Argument in Halo2 - Sin7Y Tech Review (14) **Bullet points:** 1. Halo2 uses the lookup argument technique, allowing to run lookup on any random sets; 2. It is of more universality than the Plookup technique because the latter requires the same types of elements while the sizes can be different; 3. Compared with the permutation argument of Plonk, it is not required to specify the index for the permutation here; 4. For the main branch of corresponding Halo2 code, the commit is: a7cd600eb60b1528159b92af5e426adcc615de1a ## Technique Description **Without zero-knowledge** *Note: To make it easy to understand, let’s ignore the features of zero-knowledge for a while.* We will express "lookup" by discussing the "subset argument" between two sets of data on one table (2^k^ rows, index starting from 0). As shown in the following figure, there are two sets of data, A and S, which are in two columns of the table, respectively. ![](https://i.imgur.com/YgxfHO2.jpg) The "subset argument" aims to prove that "each element in A appears in S, that is, A is a subset of S", which means that "all the elements in A must be contained in S, while all the elements in S may not be contained in A". * S may be fixed, or variable; for example, when to execute a range proof, S is fixed, e.g. S: = {0,...2^k^-1}, and when to prove that the input of one circuit belongs to the output of another circuit, S is variable. S may contain multiple columns, some of which are fixed while some are variable (advice columns). * There can be repeated data in A and S. If the data size of A or S is less than 2^k^, repeated data or false data can be filled into it until 2^k^. Take l~i~ be Lagrange baisi polynomial, that is, the value at row i is 1 while is 0 at the others. **Protocol:** Step1: Prover supply **permutation** columns of A and S The permutation process is shown in the following figure (**code**: halo2\src\plonk\lookup\prover.rs: permute_expression_pair): ![](https://i.imgur.com/7IJAMfz.jpg) 1. $A{'}$ is the simply sorted A; 2. In $A{'}$, if the consecutive values are different, we have $S{'}$ equal to this value. Meanwhile, confirm whether there exists a same value in the original $S{'}$, and if not, it comes to panic (the **error** part in the figure); if the consecutive values are same, there is no value equaled to $S{'}$, and save the index of the same value (repeated_table_map in the figure). **When finished such loop, it can make sure that all the values in A do be contained in S**; 3. Finally, fill leftover values not queried in the original S into S’. **These values may be contained in A, or may not** (because values in S can also be repeated, e.g. changing 1 into 2), and finally $S{'}$, the permutation of S, can be obtained. **Note:** 1. The permutation proof here **is different from that of Plonk**, in which the permutation proof indicates specific index (not required here in this paper), that is, a~i~ = b~j~, and therefore, the specific permutation index must be added into the permutation formula; 2. As well, the Lookup argument **here is different from that of Plookup**, which requires that the element values in A and S must be exact same, just with different size. Step2: Constraints between $A{'}$ and $S{'}$ Based on the above permutation results, it is required to make constraints to $A{'}$ and $S{'}$ (the black line in the figure) $A{'}$~i~ = $S{'}$~i~ or (the red line in the figure) $A{'}$~i~ = $A{'}$~i-1~, that is: ($A{'}$(X)- $S{'}$(X)) ∙($A{'}$(X) - $A{'}$(w^(-1)^X)) = 0 In addition, it is required to make a constraint $A{'}$0 = $S{'}$0, that is ($A{'}$(X)- $S{'}$(X)) ∙l~0~(X) = 0 To constrain that the permutations of A and S are $A{'}$ and $S{'}$ respectively, it is needed to use following rules as the constraints: Z(wX)∙($A{'}$(X) + β)∙($S{'}$(X) + γ)-Z(X)∙(A(X) + β)∙(S(X) + γ)=0 (1-Z(X))∙l~0~(X) = 0 ## Introduce zero knowledge As is known, with d+1 points, a polynomial, of which the oder is less than d+1, can be exclusively confirmed. Therefore, if the order of the polynomial is not enough, just with more queries than orders of the polynomial on different points, the verifier can get all data of the polynomial, losing the features of zero knowledge. As a result, we need to scale the polynomial ([Stark’s core concept](https://www.youtube.com/watch?v=_p7Zhrfsct8). The principle is shown as follows (**the polynomial value on y axis indicates secret**): ![](https://i.imgur.com/VNdc9yo.jpg) 1. If the order of original polynomial is 1, then verifier knows values of the two points and the secret can be calculated; 2. If the order is greater than 1, then verifier knows values of the two points and any information about the secret cannot be revealed; Therefore, to obtain the features of zero knowledge, it requires to scale the original polynomial until its order is greater than the query times, so as to make sure that no information of the secret will be revealed. In detail, for the table with a size of 2^k^, if the size of A and S is less than 2^k^, some random values need to be filled in it. But it must be noted that, the filled random values, which don’t meet requirements of the Lookup argument theory, need to be selected using the selector. If the total number of rows is 2^k^, in which the valid rows number is u while the invalid is t, it needs to meet u + t +1 = 2^k^. The specific form is shown as follows: ![](https://i.imgur.com/9R7RYOZ.jpg) There are two newly-added selectors: 1. q~blind~, the last t lines, in which the value is 1 and others are 0 2. q~last~, the value on row u is 1, and others are 0 (between usable rows and blind rows) Then the above constraints need to be changed as follows: (1 - (q~last~(X) + q~blind~(X)))∙(Z(wX)∙($A{'}$(X) + β)∙($S{'}$(X) + γ)-Z(X)∙(A(X) + β)∙(S(X) + γ))=0 (1 - (q~last~(X) +q~blind~(X) ))∙($A{'}$(X)- $S{'}$(X)) ∙($A{'}$(X) - $A{'}$(w^(-1)^X)) = 0 The constraint to line 0 remains unchanged: ($A{'}$(X)- $S{'}$(X)) ∙l~0~(X) = 0 (1-Z(X))∙l~0~(X) = 0 The constraint to line u: (Z(X)^2^-Z(X))∙q~last~(X) = 0 According to the formula of Z(X) in polynomial, Z(u) = 1 (the value set of numerator and denominator is the same). **It is worth noting that** this constraint, which constrains Z(u) within the range of {0,1}, is or quite small probability to be 0 (i.e., Ai + β = 0 or Si + γ = 0) ## Generalizations The above Lookup argument can be used for generalized conditions: 1. Multiple columns can be added to A and S. Compress with a random challenge, and the $A{'}$ and $S{'}$ can maintain One column (**code:** halo2\src\plonk\lookup\prover.rs:commit_permuted): a. The commit of each column in S can be calculated in advance, and according to the commit’s homomorphic features, the commit after compression can be easily calculated; b. Assuming that there is a M column, then compress_A = σm-1A0 + σm-2A1 +…+ A0, and $A{'}$ = permuted (compress_A), and then the corresponding constraint becomes (code: halo2\src\plonk\lookup\prover .rs: commit_product): (1 - (q~last~(X) +q~blind~(X) )) ∙(Z(wX)∙($A{'}$(X) + β)∙($S{'}$(X) + γ)-Z(X) ∙($∑_{i}$σ^(m-i)^ A~i~(X) + β)∙($∑_{i}$σ^(m-i)^ S~i~(X) + γ))=0 According to 1, for any number of lookup argument, it can be realized with subset argument: a. For example, constrain R (x,y, ...) on each row, which means that the R can be regarded as some sets of S, and then verify (x,y, ...) ϵ R. b. In particular, if R is a function, it is required to run a validity check to the function input, and introducing a range check may be needed. As well, it can support multiple tables by combining them into a large one and then corresponding to a tag selector, of which the method is similar to the technique of Chapter 4 and Chapter 5 in [Plookup](https://eprint.iacr.org/2020/315.pdf).