# Halo2 notes
<!--
## Disambiguating types
### Traits
```rust
trait Chip {
type Config;
type Loaded;
fn config(&self) -> &Self::Config;
fn loaded(&self) -> &Self::Loaded;
}
```
A `Chip` exposes its fixed configuration and an optional state. Usually, the `Chip` stores its `Config` as a struct member, so that `chip.config()` returns `self.config`.
A `Circuit` also holds a `Config`, though as an associated type rather than a struct field.
The actual `Config` is created by a `configure` function in `Chip`. The caller (usually a `Circuit`) provides a mutable `ConstraintSystem` from which the chip allocates its columns:
```rust
fn configure(
meta: &mut ConstraintSystem,
advice: [Column<Advice>],
fixed: [Column<Fixed>],
lookup: (TableColumn, ..., TableColumn),
)
```
```rust
pub trait Circuit {
type Config;
type FloorPlanner: FloorPlanner;
fn without_witnesses(&self) -> Self;
fn configure(meta: &mut ConstraintSystem) -> Self::Config;
fn synthesize(&self, config: Self::Config, layouter: impl Layouter) -> Result<(), Error>;
}
```
```rust
trait FloorPlanner {
fn synthesize<CS: Assignment, C: Circuit>(
cs: &mut CS,
circuit: &C,
config: C::Config,
constants: Vec<Column<Fixed>>,
) -> Result<(), Error>;
}
```
```rust
pub trait Assignment<F: Field> {
fn enter_region(&mut self, name_fn: String);
fn annotate_column(&mut self, annotation: String, column: Column<Any>);
fn exit_region(&mut self);
fn enable_selector(
&mut self,
annotation: String,
selector: &Selector,
row: usize,
) -> Result<(), Error>;
fn query_instance(&self, column: Column<Instance>, row: usize) -> Result<Value<F>, Error>;
fn assign_advice(
&mut self,
annotation: String,
column: Column<Advice>,
row: usize,
to: Assigned<F>,
) -> Result<(), Error>;
fn assign_fixed(
&mut self,
annotation: String,
column: Column<Fixed>,
row: usize,
to: Assigned<F>,
) -> Result<(), Error>;
fn copy(
&mut self,
left_column: Column<Any>,
left_row: usize,
right_column: Column<Any>,
right_row: usize,
) -> Result<(), Error>;
fn fill_from_row(
&mut self,
column: Column<Fixed>,
row: usize,
to: Value<Assigned<F>>,
) -> Result<(), Error>;
fn get_challenge(&self, challenge: Challenge) -> Value<F>;
}
```
```rust
trait Layouter {
type Root: Layouter;
fn assign_region(&mut self, name: String, assignment: A) -> Result<AR, Error>
where
A: FnMut(Region<'_, F>) -> Result<AR, Error>;
fn assign_table(&mut self, name: String, assignment: A) -> Result<(), Error>
where
A: FnMut(Table<'_, F>) -> Result<(), Error>;
fn constrain_instance(
&mut self,
cell: Cell,
column: Column<Instance>,
row: usize,
) -> Result<(), Error>;
fn get_challenge(&self, challenge: Challenge) -> Value<F>;
fn get_root(&mut self) -> &mut Self::Root;
fn push_namespace(&mut self, name_fn: String);
fn pop_namespace(&mut self, gadget_name: Option<String>);
fn namespace<NR, N>(&mut self, name_fn: N) -> NamespacedLayouter<'_, F, Self::Root>
where
NR: Into<String>,
N: FnOnce() -> NR,
{
self.get_root().push_namespace(name_fn);
NamespacedLayouter(self.get_root(), PhantomData)
}
}
```
```rust
pub trait RegionLayouter {
fn enable_selector(
&mut self,
annotation: String,
selector: &Selector,
offset: usize,
) -> Result<(), Error>;
fn name_column(
&mut self,
annotation: String,
column: Column<Any>,
);
/// Assign an advice column value (witness)
fn assign_advice(
&mut self,
annotation: String,
column: Column<Advice>,
offset: usize,
to: Value<Assigned<F>>,
) -> Result<Cell, Error>;
fn assign_advice_from_constant(
&mut self,
annotation: String,
column: Column<Advice>,
offset: usize,
constant: Assigned<F>,
) -> Result<Cell, Error>;
fn assign_advice_from_instance(
&mut self,
annotation: String,
instance: Column<Instance>,
row: usize,
advice: Column<Advice>,
offset: usize,
) -> Result<(Cell, Value<F>), Error>;
fn assign_fixed(
&mut self,
annotation: String,
column: Column<Fixed>,
offset: usize,
to: Value<Assigned<F>>,
) -> Result<Cell, Error>;
fn constrain_constant(&mut self, cell: Cell, constant: Assigned<F>) -> Result<(), Error>;
fn constrain_equal(&mut self, left: Cell, right: Cell) -> Result<(), Error>;
}
```
```rust
pub trait TableLayouter {
fn assign_cell<'v>(
&'v mut self,
annotation: String,
column: TableColumn,
offset: usize,
to: Value<Assigned<F>>,
) -> Result<(), Error>;
}
``` -->
$$
\newcommand{\NP}{\mathsf{NP}}
\newcommand{\Rel}{\mathcal{R}}
\newcommand{\npx}{\mathbf{x}}
\newcommand{\npw}{\mathbf{w}}
\newcommand{\npX}{\mathbf{X}}
\newcommand{\npW}{\mathbf{W}}
\newcommand{\acc}{\mathsf{acc}}
\newcommand{\bin}{\{ 0,1 \}}
\newcommand{\l}{\ell}
\newcommand{\abs}[1]{\left| {#1} \right|}
\newcommand{\set}[1]{\left\{ #1 \right\}}
\newcommand{\Angle}[1]{\left \langle {#1} \right \rangle}
\newcommand{\concat}{\|}
\newcommand{\O}{\mathcal{O}}
\newcommand{\zk}[1]{\, {\color{red}#1}}
\newcommand{\Adversary}[1]{\mathcal{#1}}
\newcommand{\P}{\Adversary{P}}
\newcommand{\V}{\Adversary{V}}
\newcommand{\Adv}{\Adversary{A}}
\newcommand{\Ext}{\Adversary{E}}
\newcommand{\Sim}{\Adversary{S}}
\newcommand{\Algo}[1]{\mathsf{#1}}
\newcommand{\D}{\Algo{D}}
\newcommand{\C}{\mathcal{C}}
\newcommand{\sample}{\stackrel{\$}{\gets}}
\newcommand{\Commit}{\Algo{Commit}}
\newcommand{\Open}{\Algo{Open}}
\newcommand{\Poly}{\Algo{Poly}}
\newcommand{\Pairing}{\Algo{Pairing}}
\newcommand{\state}{\Algo{st}}
% Keys
\newcommand{\Key}[1]{\mathsf{#1}}
\newcommand{\ck}{\Key{ck}}
\newcommand{\pp}{\Key{pp}}
\newcommand{\vp}{\Key{vp}}
\newcommand{\dk}{\Key{dk}}
\newcommand{\vk}{\Key{vk}}
\newcommand{\pk}{\Key{pk}}
% ###
% # Sets with blackboard style
% ###
\newcommand{\FF}{\mathbb{F}}
\newcommand{\GG}{\mathbb{G}}
\newcommand{\NN}{\mathbb{N}}
\newcommand{\BB}{\mathbb{B}}
% 'Left' Group
\newcommand{\GL}{\GG_{1}}
% 'Right' Group
\newcommand{\GR}{\GG_{2}}
% 'Target Group
\newcommand{\GT}{\GG_{T}}
% Probabilities
\newcommand{\Pr}[1]{\mathbb{P}\left[ {#1} \right]}
\newcommand{\PrCond}[2]{\mathbb{P}\left[ {#1} \ \middle|\ {#2} \right]}
% Pairing Group
\newcommand{\GroupElementNormal}[2]{[#1]_#2}
\newcommand{\GroupElementBig}[2]{\big[{#1}\big]_#2}
\newcommand{\GroupElementVariable}[2]{\left[{#1}\right]_#2}
\newcommand{\gl}[1]{\GroupElementNormal{#1}{1}}
\newcommand{\glb}[1]{\GroupElementBig{#1}{1}}
\newcommand{\glv}[1]{\GroupElementVariable{#1}{1}}
\newcommand{\gr}[1]{\GroupElementNormal{#1}{2}}
\newcommand{\grb}[1]{\GroupElementBig{#1}{2}}
\newcommand{\grv}[1]{\GroupElementVariable{#1}{2}}
\newcommand{\gt}[1]{\GroupElementNormal{#1}{T}}
\newcommand{\gtb}[1]{\GroupElementBig{#1}{T}}
\newcommand{\gtv}[1]{\GroupElementVariable{#1}{T}}
% Multi-Variate polynomials
\newcommand{\pow}{\mathsf{pow}}
\newcommand{\shift}[1]{{#1}^{\circlearrowleft}}
\newcommand{\mv}[1]{\underline{#1}}
\newcommand{\u}{\mv{u}}
\newcommand{\i}{\mv{i}}
\newcommand{\X}{\mv{X}}
\newcommand{\even}{\mathsf{even}}
\newcommand{\odd}{\mathsf{odd}}
\newcommand{\col}[1]{\mathbf{#1}}
$$
## Protostar
### Frontend adaptor
With as few modifications as possible, it should be possible to accumulate any existing halo2 circuits using Protostar. Given a circuit and its inputs, it should run the IOP in non-interactive mode, sending commitments to columns, getting challenges, and returning the entire interaction transcript.
## IOP
The goal of the IOP phase is to fill the circuit *trace* with all the data required to check the validity of all constraints.
A section of a *trace* with $n$ rows and $k$ columns is denoted $\vec{\col{f}} \in {\left(\FF^{k}\right)}^n$. This emphasizes that the matrix is in *row-major* form.
We can write it as a list of columns $\vec{\col{f}} = [\col{f}_j]_{j \in [k]} = [\col{f}_1,\ldots,\col{f}_k] \in {\left(\FF^{n}\right)}^k$, which will be useful when describing the variables participating in a constraint.
The $i$-th *row* of $\vec{\col{f}}$ is a tuple $\vec{\col{f}}_i = [f_{1,i},\ldots,f_{k,i}] \in \FF^k$.
We group different sections of the trace by function, assigning each a unique letter:
| Type | $\TeX$ | Description |
| ----------- |:-------------------------------------:| ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `Fixed` | $\vec{\col{f}}$ | Circuit-specific data, invariant w.r.t. public parameters. |
| `Constants` | $\vec{\col{q}} \subset \vec{\col{f}}$ | Constants used to parametrize gate expressions for each constraint. |
| `Permutations` | $\vec{\col{\sigma}} \subset \vec{\col{f}}$ | Indices of cells which must be equal to each other. Obtained by transforming the copy constraint data into a permutation. |
| `Table` | $\vec{\col{t}} \subset \vec{\col{f}}$ | Values which can be efficiently looked up into sections of `Advice`. Potentially much longer than the other entries, in which case the rows can be derived on-the-fly, or retrieved from storage. |
| `Selector` | $\vec{\col{s}} \subset \vec{\col{f}}$ | Contains only binary entries and usually sparse. Indicates which constraint to activate. |
| `Witness` | $\vec{\col{w}}$ | Contains all columns sent by the prover during the IOP. |
| `Public` | $\vec{\col{p}} \subset \vec{\col{w}}$ | (Possibly) short columns containing all inputs/output values supplied by the verifier. |
| `Advice` | $\vec{\col{a}} \subset \vec{\col{w}}$ | Contains variables constrained by the gate expressions. Usually dense, but entries can be very structured. |
| `Lookup` | $\vec{\col{m}} \subset \vec{\col{w}}$ | Sparse and integer-valued. Represents the number of copies from one set of `Table` rows to `Advice`. |
| `Aux` | $\vec{\col{z}} \subset \vec{\col{w}}$ | Additional columns sent after `Advice`, which help prove arguments (for example $Z(X)$ for permutations or $\col{m}$ used for lookups ) |
| `Challenge` | $\vec{c}$ | List of random challenges sent by the verifier. Can be re-derived by replaying the transcript using commitments to the above columns. |
### Preprocessing
With preprocessing, we want to ensure all fixed columns are available to the verifier before the interaction starts. All the $\vec{\col{f}}$ columns are independent of the instance data $\vec{\col{w}}$, so we can compute their commitments ahead of time. Note that in order to prove different sections of the trace more efficiently, different arguments may commit to their fixed data in different ways (e.g. cq).
### Public data/Instance
The witness columns $\vec{\col{p}}$ contain the circuit variables corresponding to the public inputs and outputs of the circuit. We refer to the set of public inputs and outputs as the `Instance` values.
The goal is to copy these values into the relevant sections of the `Advice` columns $\vec{\col{a}}$, so that they can be reused when verifying constraints.
> Explain
> - what does "copying into advice" mean?
> - Difference between instance representations
#### Simple
When the number of instance values is small, we will add all the elements of $\vec{\col{p}}$ into the transcript directly. If the transcript uses a circuit-unfriendly hash for Fiat-Shamir, then this may be costly as the number of instances grows, since each element of $\vec{\col{p}}$ needs to be hashed.
The circuit defines a constraint of the type $G(\col{a}_i, \col{p}_i) = \col{a}_i - \col{p}_i$ to ensure that the $i$-th instance value is equal to the $i$-th advice value.
Since the size of $\vec{\col{p}}$ is small, the verifier is able to evaluate its polynomial extension efficiently. Instance columns are "virtual" in the sense that they do not need to participate in the polynomial commitment opening protocol.
#### Hashed
A larger number of instance values can be *compressed* by computing a vector commitment $h \gets \mathsf{hash}(\vec{\col{p}}) \in \FF$ as a single field element, and providing the pre-image $\vec{\col{p}}$ as part of the `Advice`.
In this situation, we only need to apply a single equality constraint to ensure $h$ is copied into an `Advice` cell. The elements of $\vec{\col{p}}$ are made available by hashing the pre-image using specific gates in the circuit.
#### Committed
In some cases, we want to provide a large amount of public data to the circuit. We commit to the $\vec{\col{p}}$ columns in the same way as we would the $\vec{\col{a}}$ columns. The commitment itself does not need to be computed by the verifier, but the verifier must still ensure the data is correct.
We can imagine that these rows were part of another proof's lookup argument. The table would contain only the results required by the other circuit, and we are now proving that all the entries in this table are correct. For example, each row could correspond to the computation of a hash, and we can interpret these columns as the public inputs of a circuit which only checks that each hash is correct.
If the data is obtained from a "trusted third party" such as a smart contract, then the verifier would be able to get the commitment while trusting that it was computed correctly somewhere else.
In general, this technique requires someone to compute the commitment, so it is not always efficient for the verifier to be in this situation.
### Gate Constraints
We consider general *gate expressions* $$
G^{(j)}\left(\vec{\col{q}}_i, \left[\vec{\col{a}}_{r}\right]_{r \in [n]}, \vec{c} \right) \in \FF\left[\vec{Q}, \vec{A}_0, \ldots, \vec{A}_n, \vec{C}\right]
$$which is a multivariate polynomial representing an algebraic constraint applied at a given row $i\in [n]$. A `Chip` lays these polynomials down during its configuration.
The `Constants` values $\vec{\col{q}}_i$ are fixed per circuit. They allow a generic constraint polynomial to be customized on a per-row basis. For example, the constants $[q_L, q_R, q_M, q_O]$ can give rise to both addition and multiplication gates using the same polynomial $q_L\cdot w_L + q_R \cdot w_R + q_M \cdot w_L \cdot w_R + q_O \cdot w_O$.
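As a sanity check, this add/mul specialization can be exercised with plain integers standing in for field elements; the `gate` helper below is illustrative, not halo2 API, and it assumes the variant of the gate where $q_O$ multiplies the output wire $w_O$.

```rust
// The vanilla PlonK-style gate with per-row constants [q_L, q_R, q_M, q_O];
// plain i64 values stand in for field elements.
fn gate(q: [i64; 4], w_l: i64, w_r: i64, w_o: i64) -> i64 {
    let [q_l, q_r, q_m, q_o] = q;
    q_l * w_l + q_r * w_r + q_m * w_l * w_r + q_o * w_o
}

fn main() {
    // Addition gate: constants [1, 1, 0, -1] enforce w_l + w_r - w_o = 0.
    assert_eq!(gate([1, 1, 0, -1], 3, 4, 7), 0);
    // Multiplication gate: constants [0, 0, 1, -1] enforce w_l * w_r - w_o = 0.
    assert_eq!(gate([0, 0, 1, -1], 3, 4, 12), 0);
}
```

The same polynomial, parametrized by a different `Constants` row, thus enforces a different relation on each row of the trace.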
The `Advice` columns $\vec{\col{a}}$ are obtained through several IOP rounds. After committing to the `Instance`, the prover sends a first set of advice columns. For some circuits, the prover requires randomness from the verifier to define more complex gate expressions. The verifier then responds with a certain number of challenges, which are added to the challenge container $\vec{c}$. The next advice columns sent by the prover will be parametrized with respect to these challenges. This process repeats until all advice columns and challenges have been generated, and the gate expressions $G^{(j)}$ can be evaluated.
In order to handle *rotations* in `halo2`, we allow the gate equations $G^{(j)}$ to depend on all the `Advice` values across all rows, even though most values are not actually part of the polynomial expression. When we restrict rotations to `Cur` and `Next`, we can write $G^{(j)}\left(\vec{\col{q}}_i, \vec{\col{a}}_{i}, \vec{\col{a}}_{i+1}, \vec{c} \right) \in \FF\left[\vec{Q}, \vec{A}, \vec{A'}, \vec{C}\right]$. This is general enough to capture TurboPlonK gates, for example, and can be extended to handle all rotations.
Each gate expression $G^{(j)}$ has an associated *selector column* $\col{s}_j \in \bin^n$ which indicates whether the expression is active at row $i$.
The equation $s_{j,i}\cdot G^{(j)}\left(\vec{\col{q}}_i, \vec{\col{a}}_{i}, \vec{\col{a}}_{i+1}, \vec{c} \right) = 0$ should hold at every row, since either $s_{j,i} = 0$ or $G^{(j)}\left(\vec{\col{q}}_i, \vec{\col{a}}_{i}, \vec{\col{a}}_{i+1}, \vec{c} \right) = 0$ must be true.
When a `Chip` outputs multiple gate expressions, it is useful to assign them the same selector, as they are activated together in a region. While simpler, this means we cannot assume that the selector columns are linearly independent, i.e. that only a single selector is active in each row.
To guarantee independence of the equations, we require an additional challenge $y\in\vec{c}$ and instead ask that the following polynomial vanishes for all $i \in [n]$:
$$
G(\vec{\col{s}}_i,\vec{\col{q}}_i, \vec{\col{a}}_{i}, \vec{\col{a}}_{i+1}, \vec{c}) = \sum_{j} s_{j,i} \cdot G^{(j)}\left(\vec{\col{q}}_i, \vec{\col{a}}_{i}, \vec{\col{a}}_{i+1}, \vec{c} \right) \cdot y^j .
$$
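The batching identity above can be sketched numerically; the modulus `P` and the `combine` helper below are illustrative stand-ins for the field and the expression evaluator.

```rust
const P: i64 = 1_000_000_007; // toy prime modulus standing in for the field

// G_i = sum_j s_{j,i} * G^{(j)}_i * y^j : a random-linear combination of the
// selector-gated gate values at one row.
fn combine(selectors: &[i64], gate_values: &[i64], y: i64) -> i64 {
    let mut acc = 0i64;
    let mut y_pow = 1i64;
    for (s, g) in selectors.iter().zip(gate_values) {
        acc = (acc + s * g % P * y_pow) % P;
        y_pow = y_pow * y % P;
    }
    acc
}

fn main() {
    // Gate 0 active and satisfied, gate 1 switched off (its value is ignored):
    assert_eq!(combine(&[1, 0], &[0, 5], 12345), 0);
    // An active but unsatisfied gate survives the combination w.h.p. over y:
    assert_ne!(combine(&[1, 0], &[7, 5], 12345), 0);
}
```

Since the verifier samples $y$ after the gate values are fixed, a nonzero gate value can only cancel out of the sum with negligible probability.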
In some places, it will be easier to consider the polynomial resulting from evaluating the `Fixed` variables only:
$$
G_i(\vec{\col{a}}_{i}, \vec{\col{a}}_{i+1}, \vec{c}) = G(\vec{\col{s}}_i,\vec{\col{q}}_i, \vec{\col{a}}_{i}, \vec{\col{a}}_{i+1}, \vec{c})
$$
Since the selector values $s_{j,i}$ are now evaluated, we can ignore all gate expressions whose selector is disabled, making the evaluation faster.
#### Challenge flattening
In Protostar, the prover time is sensitive to the degree of $G$, though not in the same ways as in PlonK. Here, we count both the `Advice` and `Challenges` columns as variables, while all `Fixed` columns (`Selectors` and `Constants`) are treated as constants.
For example, consider the polynomial $G(\vec{a}, c) = \sum_{j=0}^{d-1} a_j\cdot c^j$ where $c$ is a challenge and $\vec{a}$ is a row of $d$ advice values. The degree of the equation is $d$ due to the term $a_{d-1}\cdot c^{d-1}$. We introduce variables $c_1,\ldots,c_{d-1}$ such that $c_j$ represents $c^j$. We can then write $G'(\vec{a}, c_1,\ldots,c_{d-1}) = a_0 + \sum_{j=1}^{d-1} a_j\cdot c_j$ and $G(\vec{a},c) = G'(\vec{a}, c, c^2,\ldots, c^{d-1})$. After this transformation, $G'$ has degree $2$.
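A quick numeric check of this flattening over a toy prime field; the names `g`, `g_flat`, and `powers` are illustrative.

```rust
const P: i64 = 1_000_000_007; // toy prime modulus standing in for the field

// Original expression: G(a, c) = sum_j a_j * c^j, degree d in (a, c).
fn g(a: &[i64], c: i64) -> i64 {
    let mut acc = 0i64;
    let mut c_pow = 1i64;
    for &aj in a {
        acc = (acc + aj * c_pow) % P;
        c_pow = c_pow * c % P;
    }
    acc
}

// Flattened expression: G'(a, [c_1..c_{d-1}]) = a_0 + sum_j a_j * c_j, degree 2.
fn g_flat(a: &[i64], c_pows: &[i64]) -> i64 {
    let mut acc = a[0];
    for (aj, cj) in a[1..].iter().zip(c_pows) {
        acc = (acc + aj * cj) % P;
    }
    acc
}

// Powers c^1..c^k of a challenge, computed once per transcript round.
fn powers(c: i64, k: usize) -> Vec<i64> {
    let mut out = Vec::with_capacity(k);
    let mut p = 1i64;
    for _ in 0..k {
        p = p * c % P;
        out.push(p);
    }
    out
}

fn main() {
    let a = [3, 1, 4, 1, 5];
    let c = 271_828;
    let c_pows = powers(c, a.len() - 1);
    assert_eq!(g(&a, c), g_flat(&a, &c_pows));
}
```

The powers are derived once from the transcript challenge, so the prover pays a few extra multiplications in exchange for a degree-$2$ expression.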
In order to prevent the degree from blowing up, we can rewrite our expression for $G$ by symbolically increasing the number of `Challenges` variables $\vec{c}$. We replace it with challenge columns $\vec{\col{c}}$, where the $i$-th row contains the elements of $\vec{c}$ raised to the $i$-th power. That is, we have $\vec{\col{c}}_0 = [1,\ldots,1]$, $\vec{\col{c}}_1 = \vec{c}$ and more generally $\vec{\col{c}}_{i,j} = c_j^i$. In our expression, we replace occurrences of $c_j^i$ of degree $i$ with the variable $\vec{\col{c}}_{i,j}$, which has degree $1$ only.
While we considered the table $\vec{\col{c}}$ to be of size $n$, in reality we would only compute the powers required by the expression. We can however store the $\beta$ challenges of size $\sqrt{n}$ in this way. _note: maybe not the best since it is committed_
#### Homogenization
Protostar requires that the expressions applied to the rows are *homogeneous* over the `Witness` and `Challenges` (after transformation) variables. It is rarely the case that an expression is homogeneous, so we need to transform it.
We introduce a *slack* variable $\mu$ which we can think of as representing a column of $1$s. Multiplying a term inside the expression by $\mu$ does not affect the result, since it evaluates to $1$, but it increases the degree of the term. By traversing the `Expression` tree, we can multiply branches by this new variable to make the expression homogeneous.
If we assume $H^{(j)}$ is the homogenization of $G^{(j)}$ through $\mu$, then the entire constraint identity becomes
$$
H(\mu, \vec{\col{s}}_i,\vec{\col{q}}_i, \vec{\col{a}}_{i}, \vec{\col{a}}_{i+1}, \vec{\col{c}}) = \sum_{j} s_{j,i} \cdot H^{(j)}\left(\mu, \vec{\col{q}}_i, \vec{\col{a}}_{i}, \vec{\col{a}}_{i+1}, \vec{\col{c}} \right) \cdot y_j \cdot \mu^{d-d_j-1}
$$
Each term is multiplied by $\mu^{d-d_j-1}$ to ensure it has degree $d$ (since $y_j$ has degree $1$).
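A minimal integer sketch of homogenization, assuming a toy gate $G(a,b) = a\cdot b + a + 1$ of degree $2$; the names `g` and `h` are illustrative.

```rust
// A non-homogeneous gate G(a, b) = a*b + a + 1, with terms of degree 2, 1, 0.
fn g(a: i64, b: i64) -> i64 {
    a * b + a + 1
}

// Homogenization: pad each term with the slack variable mu up to degree 2.
fn h(mu: i64, a: i64, b: i64) -> i64 {
    a * b + a * mu + mu * mu
}

fn main() {
    // Setting mu = 1 recovers the original gate...
    assert_eq!(h(1, 5, 7), g(5, 7));
    // ...and H is homogeneous of degree 2: H(l*mu, l*a, l*b) = l^2 * H(mu, a, b).
    let (mu, a, b, lam) = (3, 5, 7, 11);
    assert_eq!(h(lam * mu, lam * a, lam * b), lam * lam * h(mu, a, b));
}
```

Homogeneity is exactly what lets the folded expression split into per-degree error terms during accumulation.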
### Lookups
A lookup constraint ensures that a certain set of rows $\vec{\col{l}} \in \FF^{n_c\times n_\ell}$ are all contained inside a table $\vec{\col{t}} \in \FF^{n_c \times n_t}$, where $n_\ell$ and $n_t$ are the number of lookups and table entries respectively, and $n_c$ is the number of columns involved. We start by exploring the case where $\vec{\col{t}}$ are `Fixed` columns committed to in the same way as other $\vec{\col{f}}$ columns would be.
In the simple case, we can consider $\vec{\col{l}} \in \FF^{n_c\times n}$ to be equal to a subset of $n_c$ advice columns $\vec{\col{l}} = \vec{\col{a}}_{\ell} \subset \vec{\col{a}}$, and we check that for each row $\vec{\col{l}}_i$, $i\in [n]$, there exists some index $j \in [n_t]$ such that $\vec{\col{t}}_j = \vec{\col{l}}_i$.
More generally though, we may want to query the table with more flexibility. We do this by allowing the lookup rows $\vec{\col{l}}_i$ to be obtained by transforming advice rows.
$$
\begin{aligned}
\vec{\col{l}}_i
& = \left[ \col{l}_{i,1}, \ldots, \col{l}_{i,n_{c}}\right] \\
& = \vec{G}^{(\ell)}\left( \vec{\col{q}}_i, \vec{\col{a}}_{i}, \vec{\col{a}}_{i+1}, \vec{c}\right) \\
& = \left[ G^{(\ell)}_1\left( \vec{\col{q}}_i, \vec{\col{a}}_{i}, \vec{\col{a}}_{i+1}, \vec{c} \right) , \ldots, G^{(\ell)}_{n_c}\left( \vec{\col{q}}_i, \vec{\col{a}}_{i}, \vec{\col{a}}_{i+1}, \vec{c} \right) \right]
\end{aligned}
$$
We emphasize that each expression $G^{(\ell)}_j$ does not need to depend on all its inputs. In many cases, we can let $G^{(\ell)}_j(\vec{\col{a}}_i) = a_{i,j}$ so that it simply returns the $j$-th element in the $i$-th row. We could also use constants in $\vec{\col{q}}$ to create more elaborate entries such as $a_{i,j} + 2^b\cdot a_{i+1,j}$ (useful if both advice values are range constrained to $b$ bits).
We use a selector $\col{s}^{(\ell)}$ to allow only certain `Advice` rows to participate in the lookup argument, so that gate constraints and lookup queries can happen within the same `Advice` trace.
$$\set{\vec{\col{l}}_i}_{\substack{i \in [n]\\\col{s}^{(\ell)}_i=1}} \subseteq \set{\vec{\col{t}}_i}_{i\in [n_t]}.$$
Note that it is possible to also obtain the rows $\vec{\col{t}}$ via a polynomial transformation $\vec{G}^{(t)}$ in the same way that the lookups $\vec{\col{l}}$ are computed. This is not usually useful when the tables are `Fixed` (in this case we set the transformation to be the identity), since any transformation can be applied directly as part of the preprocessing.
We prove the above using the logup technique.
After exchanging the `Instance` via the transcript, the prover sends a vector $\col{m} \in \NN^{n_t}$ such that $m_j$ is equal to the number of times $\vec{\col{t}}_j$ appears in $\vec{\col{l}}$. This is sent at the same time as the first set of advice columns.
We then compress both the lookup and table columns into a single one using a random challenge $\theta$.
That is, we consider the "virtual" columns $\col{l'} = \sum_{j \in [n_c]} \theta^j \cdot \col{l}_j$ and $\col{t'} = \sum_{j \in [n_c]} \theta^j \cdot \col{t}_j$. With overwhelming probability, we can reduce our argument to checking that the values in the two columns are the same:
$$\set{\col{l'}_i}_{\substack{i \in [n]\\\col{s}^{(\ell)}_i=1}} \subseteq \set{\col{t'}_i}_{i\in [n_t]}.$$
The above statement is equivalent to saying that the following polynomial identity holds:
$$
\sum_{i \in [n]} \frac{\col{s}^{(\ell)}_i}{X + \col{l'}_i} = \sum_{i \in [n_t]} \frac{\col{m}_i}{X + \col{t'}_i}.
$$
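The rational identity can be sanity-checked numerically; a sketch over a toy prime field, assuming all lookup selectors equal $1$ (the helpers `pow_mod`, `inv`, `multiplicities`, and `rational_sum` are illustrative).

```rust
const P: u64 = 1_000_000_007; // toy prime modulus standing in for the field

// Modular exponentiation, used for Fermat inverses: x^-1 = x^(p-2) mod p.
fn pow_mod(mut b: u64, mut e: u64) -> u64 {
    let mut acc = 1u64;
    b %= P;
    while e > 0 {
        if e & 1 == 1 {
            acc = acc * b % P;
        }
        b = b * b % P;
        e >>= 1;
    }
    acc
}

fn inv(x: u64) -> u64 {
    pow_mod(x, P - 2)
}

// m_j = number of times table entry t_j appears among the lookups.
fn multiplicities(table: &[u64], lookups: &[u64]) -> Vec<u64> {
    table
        .iter()
        .map(|t| lookups.iter().filter(|l| *l == t).count() as u64)
        .collect()
}

// sum_i m_i / (beta + v_i) over the toy field.
fn rational_sum(vals: &[u64], mult: &[u64], beta: u64) -> u64 {
    vals.iter()
        .zip(mult)
        .map(|(v, m)| m * inv((beta + v) % P) % P)
        .fold(0, |a, x| (a + x) % P)
}

fn main() {
    let table = [1u64, 2, 3, 4];
    let lookups = [2u64, 2, 4, 1, 2]; // every entry occurs in the table
    let m = multiplicities(&table, &lookups);
    let ones = vec![1u64; lookups.len()]; // all lookup selectors enabled
    let beta = 987_654_321;
    assert_eq!(rational_sum(&lookups, &ones, beta), rational_sum(&table, &m, beta));
}
```

Each repeated lookup of the same entry contributes one unit to that entry's multiplicity, so both sides collect the same rational terms.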
We test this equality at a random challenge $\beta$ sent by the verifier at the same time as $\theta$. The prover computes two `Aux` vectors $\col{z}^{(\ell)} \in \FF^n$ and $\col{z}^{(t)} \in \FF^{n_t}$, corresponding to the summands on each side of the equation.
We define them as
- $\col{z}^{(\ell)}_i = \frac{\col{s}^{(\ell)}_i}{\beta +\col{l'}_i}$ for all $i\in[n]$
- $\col{z}^{(t)}_i = \frac{\col{m}_i}{\beta + \col{t'}_i}$ for all $i\in[n_t]$
- $\sigma = \sum_{i \in [n]} \col{z}^{(\ell)}_i$
Note that since both $\col{s}^{(\ell)}$ and $\col{m}$ are sparse, we only need to evaluate and commit to entries for which the "multiplicities" columns are non-zero.
Given this information, we need only to verify the following constraints:
$$
\begin{aligned}
\col{s}^{(\ell)}_i - \col{z}^{(\ell)}_i\cdot \left(\beta + \sum_{j \in [n_c]} \theta^j \cdot G^{(\ell)}_j\left( \vec{\col{q}}_i, \vec{\col{a}}_{i}, \vec{\col{a}}_{i+1}, \vec{c} \right) \right) &= 0, &\forall \ i \in [n]\\
\col{m}_i - \col{z}^{(t)}_i\cdot \left(\beta + \sum_{j \in [n_c]} \theta^j \cdot \col{t}_{j,i} \right) &= 0, &\forall \ i \in [n_t]\\
\sigma - \sum_{i \in [n]} \col{z}^{(\ell)}_i &= 0,\\
\sigma - \sum_{i \in [n_t]} \col{z}^{(t)}_i &= 0.
\end{aligned}
$$
Our formulation includes the claimed sum $\sigma$ of both auxiliary columns, which differs from the Protostar paper, where the two sums are compared directly.
This additional field element gives us the flexibility to handle the constraints over different domains. That is, we perform a zero-check and a sum-check over $[n]$ for the first and third constraints, while the second and fourth constraints are over a domain of size $n_t$.
#### Batching
We want to consider the case where multiple tables can be queried, with a lookup row specifying which table should be looked up.
The first approach is to treat different tables as different sets of columns.
If we have multiple tables $\vec{\col{t}^{(j)}}$, we create a separate multiplicities vector $\col{m}^{(j)}$ for each table, as well as a separate auxiliary column $\col{z}^{(t_j)}$. For each table $j$, we send the claimed sum $\sigma^{(j)} = \sum_{i \in [n_t]} \col{z}^{(t_j)}_i$, which is checked via a constraint.
We do not need to send a claimed sum for the lookup entries, since we only need to constrain $\sum_j \sigma^{(j)} - \sum_{i \in [n]} \col{z}^{(\ell)}_i = 0$.
Suppose we have $k$ tables each with $n_c$ columns of length $n_t$. We can define a table with $n_c+1$ columns of length $k\cdot n_t$, by concatenating the columns one after another. The last column is used for domain separation, and only includes the index $j\in[k]$ of the table each row was originally part of.
On the advice side, we use one additional fixed column $\col{q}^{(\ell)}$ which is concatenated to each lookup row as $\vec{\col{l}}_i = \left[ \col{l}_{i,1}, \ldots, \col{l}_{i,n_{c}}, \col{q}^{(\ell)}_i\right]$.
This then reduces to the single-table case.
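The concatenation step can be sketched as follows, with two toy single-column tables; `merge_tables` is a hypothetical helper.

```rust
// Concatenate k single-column tables into one table with an extra
// domain-separation column holding each row's original table index.
fn merge_tables(tables: &[&[u64]]) -> Vec<(u64, u64)> {
    tables
        .iter()
        .enumerate()
        .flat_map(|(j, t)| t.iter().map(move |&v| (v, j as u64)))
        .collect()
}

fn main() {
    let merged = merge_tables(&[&[10, 11], &[20, 21, 22]]);
    // A lookup row carries the fixed column q^(l) selecting its table...
    assert!(merged.contains(&(22, 1)));
    // ...and the same value against the wrong table index is rejected.
    assert!(!merged.contains(&(11, 1)));
}
```

The index column makes rows from different tables distinct even when their values collide, which is exactly the domain separation the reduction needs.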
#### Online
The assumption that $\vec{\col{t}}$ are `Fixed` more closely resembles real-world scenarios, where the table contains all evaluations of a function with total arity $n_c$. This table is easy to precompute during preprocessing, but it would be painful to have to commit to it for every circuit execution.
It would be cheaper to provide a table containing **exactly** the entries queried. In this situation, the table would be different at every circuit execution, but its size would be equal to the number of lookups. Unfortunately, we are only able to prove that the table and advice columns contain the same rows, but not the correctness of the table itself.
There are two situations where this can be helpful. First, the table may be part of the committed instance $\vec{\col{p}}$, coming from a trusted source. In this way we can "import" outside computation directly into the circuit.
For example, a smart contract could receive all the values to be hashed in the circuit, compute their hashes, and commit to these evaluations as a set of instance columns.
Another useful application of online lookup tables is for proving that two sets of rows are equal. This enables circuits to model random-access-memory, where each lookup corresponds to one memory access. These entries are stored inside another section of the advice columns, but in a specific sorted order. The constraints are applied to the sorted section, and ensure that some relation holds between two consecutive accesses.
For proving that two sections of $\vec{\col{a}}$ are equal, we consider here the simple case of comparing the rows of two columns $\col{a}_1, \col{a}_2 \in \FF^n$. We need one "multiplicities" vector per column participating in the lookup, $\col{m}_1,\col{m}_2 \in \FF^n$. These columns can be either fixed or witness. Usually, $\col{m}$ is a witness when we are comparing columns as multi-sets, while it is fixed for a row-permutation check.
In the fixed case, they behave in the same way as selectors, since they only indicate whether a row participates in the permutation. The constraints can then be written over the same domain:
$$
\begin{aligned}
\col{m}_{1,i} - \col{z}_{1,i}\cdot \left(\beta + \col{a}_{1,i} \right) &= 0, &\forall \ i \in [n]\\
\col{m}_{2,i} - \col{z}_{2,i}\cdot \left(\beta + \col{a}_{2,i} \right) &= 0, &\forall \ i \in [n]\\
\sum_{i \in [n]} (\col{z}_{1,i} - \col{z}_{2,i} )&= 0.
\end{aligned}
$$
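A numeric sketch of the permutation case (all $\col{m}_i = 1$), where both sides of the check reduce to sums of inverses over a toy prime field; `pow_mod` and `side_sum` are illustrative helpers.

```rust
const P: u64 = 1_000_000_007; // toy prime modulus standing in for the field

// Modular exponentiation, used for Fermat inverses: x^-1 = x^(p-2) mod p.
fn pow_mod(mut b: u64, mut e: u64) -> u64 {
    let mut acc = 1u64;
    b %= P;
    while e > 0 {
        if e & 1 == 1 {
            acc = acc * b % P;
        }
        b = b * b % P;
        e >>= 1;
    }
    acc
}

// sum_i 1/(beta + col_i): with every multiplicity m_i = 1, the permutation
// constraints above force this sum to agree on both columns.
fn side_sum(col: &[u64], beta: u64) -> u64 {
    col.iter()
        .map(|v| pow_mod((beta + v) % P, P - 2)) // Fermat inverse
        .fold(0, |a, x| (a + x) % P)
}

fn main() {
    let accesses = [5u64, 9, 2, 7]; // memory accesses, in program order
    let sorted = [2u64, 5, 7, 9];   // same rows, sorted for constraint checking
    let beta = 123_456_789;
    assert_eq!(side_sum(&accesses, beta), side_sum(&sorted, beta));
    // A differing multiset is caught (w.h.p. over beta):
    assert_ne!(side_sum(&accesses, beta), side_sum(&[2, 5, 7, 8], beta));
}
```

This is the multiset-equality flavor of the argument that underlies the RAM-style sorted-access technique described above.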
## Error check
For each row $i$, we define the error polynomial $e_i(X) \in \FF^{\leq d}[X]$:
$$
\begin{align}
e_i(X) &= H(\acc.\mu+X\cdot 1, \vec{\col{s}}_i,\vec{\col{q}}_i, \acc.\vec{\col{a}}_{i} + X\cdot \pi.\vec{\col{a}}_{i}, \acc.\vec{\col{c}} + X \cdot \pi.\vec{\col{c}}) \\
&= H(\acc.\mu, \vec{\col{s}}_i,\vec{\col{q}}_i, \acc.\vec{\col{a}}_{i}, \acc.\vec{\col{c}}) + \sum_{j=1}^{d-1} e_{i,j}X^j + X^d H(1, \vec{\col{s}}_i,\vec{\col{q}}_i, \pi.\vec{\col{a}}_{i}, \pi.\vec{\col{c}})
\end{align}
$$
$e_i(X) = \mathsf{interpolate}(e_i(t),\ t=0,1,\ldots,d)$
$\col{e}(X) = [e_1(X), \ldots, e_n(X)] \in (\FF^{\leq d}[X])^n$
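A worked integer example of this decomposition, assuming a toy degree-$2$ homogeneous constraint $H(\mu, a) = \mu\cdot a + a^2$ (the names `h` and `error_coeffs` are illustrative): the constant coefficient of $e_i(X)$ is the accumulator's own error, and the $X^d$ coefficient is the fresh instance's error.

```rust
// Toy degree-2 homogeneous constraint H(mu, a) = mu*a + a^2, plain integers.
fn h(mu: i64, a: i64) -> i64 {
    mu * a + a * a
}

// Coefficients of e(X) = H(acc.mu + X*1, acc.a + X*pi.a), recovered from the
// evaluations e(0), e(1), e(2) by finite differences (d = 2, so 3 points).
fn error_coeffs(acc_mu: i64, acc_a: i64, pi_a: i64) -> (i64, i64, i64) {
    let e = |x: i64| h(acc_mu + x, acc_a + x * pi_a);
    let (e0, e1, e2) = (e(0), e(1), e(2));
    let c2 = (e2 - 2 * e1 + e0) / 2; // second finite difference
    let c1 = e1 - e0 - c2;
    (e0, c1, c2)
}

fn main() {
    let (acc_mu, acc_a, pi_a) = (3, 5, 7); // fresh instance has slack pi.mu = 1
    let (e0, c1, c2) = error_coeffs(acc_mu, acc_a, pi_a);
    // Constant term: the accumulator's own error H(acc.mu, acc.a).
    assert_eq!(e0, h(acc_mu, acc_a));
    // X^2 coefficient: the fresh instance's error H(1, pi.a).
    assert_eq!(c2, h(1, pi_a));
    // The interpolated quadratic agrees with a direct evaluation at X = 5.
    assert_eq!(e0 + c1 * 5 + c2 * 25, h(acc_mu + 5, acc_a + 5 * pi_a));
}
```

For the general degree-$d$ case, $d+1$ evaluations $e_i(0),\ldots,e_i(d)$ determine the polynomial, matching the $\mathsf{interpolate}$ step above.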
This is what we want to prove for each row $i$:
$$
H(\mu=1, \vec{\col{s}}_i,\vec{\col{q}}_i, \pi.\vec{\col{a}}_{i}, \pi.\vec{\col{c}}) = 0
$$