As mentioned in the previous article Hello, OlaVM!, OlaVM’s vision is to build a high-performance ZKVM, and this article will focus on one of the tools that make OlaVM high-performance, namely, Lookup Arguments. Lookup Arguments play an important role in reducing the size of the circuit, thereby improving Zero Knowledge efficiency, and it's widely used in the circuit design of ZKVMs. Throughout this article you'll learn more about the following:
The connection between the two Lookup Argument algorithms
The roles of a ZKVM
The ZKVM utilizes Zero Knowledge to constrain all the execution processes of the VM, and the execution process of the VM can generally be divided into: instruction execution, memory access and built-in function execution. It is somewhat impractical to execute constraints on these operations in one trace. First of all, in a trace, a row represents an operation type, and one operation type corresponds to multiple constraints, and different constraints correspond to different numbers of columns, resulting in different widths. If one of the rows is too wide due to one constraint corresponding to too many columns, then the width of the entire trace is affected, becoming too large. Resource usage of this design is wasteful when the rows corresponding to the remaining constraints do not require so many columns. Then, if there are too many different operation types in a single trace, more selectors will be introduced, increasing not only the number of polynomials, but also the order of the constraint. Finally, due to the order limitation of the group, the number of rows of the trace itself cannot exceed the order of the group, so the number of trace rows occupied by a certain type of operation should be minimized. So, for simplicity, we need to:
Split different operation types into multiple sub-traces and prove them separately, with data consistency between the main trace and the sub-traces to be ensured by Lookup argument.
For some ZK-unfriendly computations, we can reduce the size of the trace by Lookup Argument techniques, such as bitwise operations. Of course, there are other technical means to reduce the size of the trace, and that will be explained further down in this article.
Lookup between trace tables
All the execution processes of the VM will form a complete trace, called "the main trace". "Complete" refers to that it contains all the states of the VM execution, but not auxiliary states, such as some extended information that is convenient for ZK verification. As mentioned earlier, including this auxiliary information in the main trace will make the main trace complex and difficult to constrain. Therefore, for the convenience of constraints, some sub-traces are usually established, and then constrained for these sub-traces respectively, and the main traces are mainly used to execute the correct program constraints and Context constraints.
By creating different sub-traces, we divide the different operations performed by the VM and ensure that the data in the sub-trace is derived from the main trace by using Lookup Argument techniques. For the data validity proof in the sub-trace, you need to generate different traces according to a specific operation type, and then use the corresponding constraints to prove the validity of these traces. In particular, for zk-unfriendly operations such as Bitwise and rangecheck.
Lookup for ZK-unfriendly operations
As mentioned earlier, the proofs of each sub-trace are independent, so getting the smallest possible trace will improve the efficiency of prover. Taking Bitwise as an example, Bitwise operations include AND, XOR, and NOT. If you simply want to implement the constraints of Bitwise operations through circuits, you may need to split each OP into multiple binary limbs, and if these OPs are 32 bits wide, they will be split into 32 limbs. Then, you need to constrain that:
Sumcheck: All bits combined together equals the original OP,
Bitwise check per bit:
A total of 2+323=99 trace cells, and the number of constraints is 3 * sumcheck +33 bitwise=35. If there are some truth tables at this time, for AND, XOR, and NOT calculations, you can define three tables, which contain data for bitwise calculations that refer to the op with a specified bit width, such as 8-bit. For 32-bit ops, you only need to split them into four 8-bit limbs, and then the bitwise relationship between these OP limbs does not need to be implemented with corresponding constraints, only needs to execute Lookup on the fixed table. At this time, a total of 3+4*3=15 trace cells are taken up with 3 constraints sumcheck and one Lookup argument (Batch Lookup is supported).
Utilizing Lookup Arguments is a huge boost not only for proofs of bitwise operations, but also for rangecheck operations. For a 32-bit OP, you only need to split it into two 16-bit limbs. There are two good designs here. One is that it allows rangecheck to take up fewer trace cells, and the other is that our customized ADD-MUL constraint can be reused for rangecheck’s sum constraint. For different calculation types, being able to reuse the same constraint is of great help to the overall efficiency improvement. As shown in the figure above, the customized ASS-MUL gate can support the constraint reuse of five calculation types: ADD, MUL, ADD-MUL, EQ, and RANGECHECK.
Plookup Protocol
Introduction
Plookup is a protocol used to check that a polynomial of the order less than n, the value on the multiplicative group of n order is contained in a d-dimensional table . A typical scenario is to do a rangecheck in a zk-snark, verifying that the values of all polynomials over are on [0, m].
Symbol Description
is a multiplicative group of order n+1 on , and for , if , then can be expressed as , where represents
The vector can also be expressed by the polynomial , where .
Given two vectors , , represents .
Pretreatment
Suppose
If , repeat the last element times.
is at most , because the group has only order.
The polynomial represents the value of the lookup
is sets of finite field elements, .
The input polynomial is represented by .
Protocol Process
$s \in F^{2n+1} represents the combined vector of , and then two polynomial 、 on to represent :
There are elements in . represents , represents which means:
calculates 、 and sends it to a trusted third party
randomly selects random numbers 、 in two finite fields , and sends them to
uses to calculate a polynomial , defined as follows:
For :
sends to
checks the correctness of , for each :
outputs acc if all the equations in step 6 are satisfied.
Protocol Understanding
Define two polynomials and : where ,,if and only if:
is the ordering of with respect to
This transfers the proof to the proof , which is to prove that is a permutation of.
Prove that 、 is the prerequisite for to be true: For any,there is that makes , For (n*i) subscripts that are not listed above, there are always and 2 sets of are equal,so 。 the prerequisite is true. Please refer to the paper for proof of sufficient conditions.
What is of in the calculation
Defined when :
When , , and when , So the relations are satisfied for . When , So the relations between are satisfied for .
What is verifying:
checks a and d to verify that and
checks c to verify that .
checks b to verify that is satisfied for .
Further clarification of , and : Prove: ,let ,that is to prove let , , , …
And there is ,so
Halo2 Lookup Protocol
Introduction
The lookup protocol function of Halo2 is to, given two columns of data and with a length of , proving that every cell in is in , and some cells in can not be in , and
can be fixed or variable.
The case where is variable is the inclusion of columns in trace, and the values are not fixed.
and can contain duplicate data, and if the size of and does not reach , they need to be completed to .
Fill with some data from .
Complete with the last data.
Protocol Process
Let
calculates the permutations and of and
is an ordering of that places the same values on adjacent cells and randomly ranks between different values
is an ordering of , where the first cell of the different values in has to be the same as the value in the corresponding row in , and the other rows are random
and S'$ satisfy or
calculates the polynomial :
,
verifies and , and for each on , it satisfies:
verifies , and for each on , it satisfies :
outputs acc if both the equations in 4 and 5 are satisfied.
The protocol only checks against the expanded relationship between and . How is it guaranteed that the expansion is valid for ? Suppose A = {1,2} and S = {3,4}, both of which do not satisfy the subset argument, are expanded to become A = {1,2,3,4} and S = {1,2,3,4}, which can be justified by the subset argument. It is unreasonable, but how can it be proved to pass?
Support ZK
Finally, add rows of random numbers, and only use the first ~ rows to store the data
Put in row
Polynomials set the last row to 1 and the others to 0
Polynomials set the row to 1 and everything else to 0, which is the last row of data cell
Change the validation to
Why add here? The row u stores and it does not need to participate in validation
can be 1or 0, as there may be or
Extend - 1 : Vector Lookup
Suppose P has polynomials and a data table of rows and columns, and to prove that , you need to first convert and to the previous column of data.
Calculate polynomials for the column of data in , and the column of polynomials ,
selects a random number to send to
recalculates the polynomial and data that require lookup
and use and and for the previous lookup process
Extend - 2 : Multi-tables
Further, suppose has multiple data tables , where for each we need to prove that . For example, prove that every row of data in the witness is in some data table. Like a vector lookup, you want to transfer this into a case with only one polynomial and one column of data.
Form tables into , and add a column to each subtable
For each , , there is a row of elements in
Calculate the polynomial ,
Use Vector lookup to prove that ,
Comparison between Plookup and Lookup
Both the Plookup protocol and Halo2’s Lookup protocol can prove that , but the ideas behind the two protocols are different and their differences are as follows:
Plookup needs and to build a new sequence , where both the elements in and appear at least once in . It then proves by comparing the non-zero distance sets of elements in are equal, and finally .
Halo2’s lookup proves directly, without the need to construct a new sequence, and is more concise than Plookup.
Both Plookup and Halo2 lookup require sorting and completing the set. Once completed, Plookup is , and Halo2 Lookup is .