Lookup Arguments for Mad Scientists and Busy People

The setting for lookups

When proving a relation (think in a SNARK), standard arithmetic gates can be an awkward way to express what you want to prove. For example, byte-to-byte XOR (ubiquitous, e.g., in hashing) requires several arithmetic gates to be expressed.
A more compact way to express this and other constraints is a lookup table:
- precompute all possible values of the table and store them in a commitment
- at proving time, just prove you know a row in that table. This is tantamount to of showing that you know, say, A,B,C such that A XOR B = C.
A building block for this setting can be composed with other SNARKs and get a significantly more efficient prover.

Useful, but how do we build this?

If you ignore that a table is actually composed of "multiple columns"^[1], then this sounds like a vector commitment problem: just lookup, say, XOR outputs, by proving some values are inside a precommitted vector. This is what a vector commitment does! ^[2]
Except, that we actually have a few more problems to solve (explained below) This is why vector commitments are inadequate for the job. We need a better primitive!
A lookup argument is a building block that that we can use for the setting above. It can be thought as a vector commitment on steroids^[3].

The problems with vector commitments as lookup arguments

The "rows" you are looking up should be compressed, not sent. Vector commitments, in general, require the verifier to read the vector values being opened.
Prover should not run in the size of the whole table. Even if we solve the compressing problem, we want it to be compatible with a prover that does not need to work linearly in the whole table (even with preprocessing!) This is not necessarily the case with vector commitments.
It should support "multisets". For technical reasons, we may need to open the same value multiple times. A vector commitment instead generally does not guarantee this.
Zero-knowledge. (usually optional)

A final nuance

Several of the efficient constructions for lookups out there are motivated by consistency with a specific type of vector commitment we use: Pedersen on the Lagrangian basis. This is because this is very common in several SNARK constructions as a building block (usually implicitly). Other approaches are possible, e.g. from RSA. These other approaches achieve different tradeoffs, e.g. in size of parameters and running times. Nonetheless, for common practical cases, non-RSA approaches clearly have the most attractive efficienct balance.

It turns out that the many-column case reduces to the single-column one. Note, though, that this usually requires additional properties, e.g. homomorphic commitments. ↩︎
One should consider this as a point for clarity only. There are other constructions for lookups not based on vector commitments. ↩︎
Not a perfect analogy, but true for many constructions. See footnote above and this paragraph. ↩︎

Lookup Arguments for Mad Scientists and Busy People

The setting for lookups

Useful, but how do we build this?

The problems with vector commitments as lookup arguments

A final nuance

Read more

SnarkPack Audit---Matteo Campanelli

Sealstack modeling

zkTree Matrix representations

Untitled