# Logup\*: Faster, Cheaper Indexed Lookups for Small Tables
In this technical note, we unpack the recent note *Logup\*: faster, cheaper logup argument for small-table indexed lookups* by Lev Soukhanov (May 2025). The paper proposes a new technique for performing *indexed lookup arguments* in zero-knowledge proof systems more efficiently when the lookup tables are small.
We will build up the background, provide intuition, and carefully walk through the mathematics involved.
## Introduction: The Lookup Argument
In many zero-knowledge proof systems, we often need to check that a list $X$ of witness elements is contained within some *table* $T$. This is called a **lookup argument** and arises, for example, when constraining values to be in a valid range or when using precomputed tables.
For arrays $X, T$ over a large field $F$, we want to prove
$$
X \subseteq T
$$
:::info
### Example
Let the table $T$ contain allowed values:
$$
T = \{3, 7, 20, 45\}
$$
Suppose we have a witness array
$$
X = \{7, 20, 3, 3\}
$$
In this case, we want to prove
$$
X \subseteq T
$$
which means every element in $X$ must be present in $T$.
:::
The *logup* argument, first proposed in prior work, provides a compact way to do this. Instead of sorting concatenated arrays as in plookup-style arguments, it uses a logarithmic-derivative trick to check a rational-function identity at a random point.
## The Classic Logup Check
Let $|X| = n$, $|T| = m$. For example, if $X = \{5, 2\}$ and $T = \{2, 5, 9\}$, then, $|X| = n = 2$ and $|T| = m = 3$.
We also define an array $acc$ of size $m$, where $acc[j]$ counts how many times $T[j]$ appears in $X$. We assume that $T$ has no repeated entries and that $n < p$, where $p$ is the field characteristic, so the counts cannot wrap around.
:::info
### Example for $acc$ array
Let
$$
T = \{2, 5, 9\}
$$
and
$$
X = \{5, 2, 2\}
$$
We have
$$
|T| = m = 3, \quad |X| = n = 3
$$
We define an array $acc$ of size $m$ (same size as $T$). Each entry $acc[j]$ counts how many times $T[j]$ appears in $X$.
Let’s compute $acc$ entry by entry:
* $T[0] = 2$. It appears twice in $X$, so $acc[0] = 2$.
* $T[1] = 5$. It appears once in $X$, so $acc[1] = 1$.
* $T[2] = 9$. It does not appear in $X$, so $acc[2] = 0$.
Thus, we have
$$
acc = [2, 1, 0]
$$
We also assume that $T$ has no repeats (which it does not here) and that $n < p$, where $p$ is the characteristic of the finite field $\mathbb{F}$. For example, if $p = 101$, then $3 < 101$, which ensures there is no overflow in these counts.
#### What would an overflow look like?
Suppose we are working in a field with a very small characteristic, say $p = 3$. Now imagine
$$
X = \{\underbrace{2, 2, 2}_{3 \text{ times}}, \underbrace{5, 5}_{2 \text{ times}}, 9\}
$$
Then
$$
acc = [3, 2, 1] \rightarrow [0, 2, 1] \mod 3
$$
But if $p = 3$, then $acc[0] = 3$ becomes zero when reduced modulo $p$. This "wrap-around" causes an **overflow**, meaning we can no longer correctly represent the actual count.
To avoid this, we always require $n < p$, so all counts $acc[j]$ stay below $p$ and no reduction occurs.
:::
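The multiplicity array is straightforward to compute. Below is a minimal Python sketch (the function name `multiplicities` is ours, not from the paper):

```python
# Sketch: computing the multiplicity array `acc` for a logup argument.
# Assumes T has no repeated entries and every element of X occurs in T.
def multiplicities(X, T):
    acc = [0] * len(T)
    index_of = {t: j for j, t in enumerate(T)}  # position of each table entry
    for x in X:
        acc[index_of[x]] += 1  # raises KeyError if x is not in T
    return acc

print(multiplicities([5, 2, 2], [2, 5, 9]))  # → [2, 1, 0]
```

In a real protocol these counts live in $\mathbb{F}$, which is why the $n < p$ condition matters: the integers computed here must survive reduction modulo $p$.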
The prover commits to $X$, $T$, and $acc$; the verifier then checks, for a random challenge $c \in F$, that
\begin{equation}
\sum_{i=0}^{n-1} \frac{1}{c - X[i]} = \sum_{j=0}^{m-1} \frac{acc[j]}{c - T[j]}
\end{equation}
The soundness error is at most $\frac{n + m}{|F|}$.
:::info
### Example illustrating the rational function check
Let
$$
T = \{2, 5, 9\}
$$
and
$$
X = \{5, 2, 2\}
$$
We have
$$
acc = [2, 1, 0]
$$
Suppose we choose a random challenge $c = 10$. For readability, we carry out the arithmetic below with ordinary fractions; in $\mathbb{F}$, each division is a field inversion.
#### Left-hand side (LHS)
\begin{equation}
\begin{aligned}
\sum_{i=0}^{2} \frac{1}{c - X[i]} &= \frac{1}{10 - 5} + \frac{1}{10 - 2} + \frac{1}{10 - 2} \\
&= \frac{1}{5} + \frac{1}{8} + \frac{1}{8} \\
&= \frac{1}{5} + \frac{2}{8} \\
&= \frac{1}{5} + \frac{1}{4} \\
&= \frac{4}{20} + \frac{5}{20} \\
&= \frac{9}{20}
\end{aligned}
\end{equation}
#### Right-hand side (RHS)
\begin{equation}
\begin{aligned}
\sum_{j=0}^{2} \frac{acc[j]}{c - T[j]} &= \frac{2}{10 - 2} + \frac{1}{10 - 5} + \frac{0}{10 - 9} \\
&= \frac{2}{8} + \frac{1}{5} + 0 \\
&= \frac{1}{4} + \frac{1}{5} \\
&= \frac{5}{20} + \frac{4}{20} \\
&= \frac{9}{20}
\end{aligned}
\end{equation}
#### Conclusion
Both sides evaluate to $\frac{9}{20}$, confirming that
$$
\sum_{i=0}^{n-1} \frac{1}{c - X[i]} = \sum_{j=0}^{m-1} \frac{acc[j]}{c - T[j]}
$$
This equality holds for every $c \in F$ outside the poles (i.e., $c \notin X \cup T$) if $X \subseteq T$ and $acc$ is correct. If not, the equality will fail for a random $c$ with high probability.
:::
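The rational check is easy to reproduce over an actual prime field. The following sketch uses the illustrative prime $p = 101$ and Fermat inversion; all names are ours:

```python
# Minimal sketch of the logup identity check over F_p (illustrative p = 101).
p = 101

def inv(a):
    # field inverse via Fermat's little theorem: a^(p-2) mod p
    return pow(a % p, p - 2, p)

def logup_check(X, T, acc, c):
    # sum_i 1/(c - X[i])  vs  sum_j acc[j]/(c - T[j]), both in F_p
    lhs = sum(inv(c - x) for x in X) % p
    rhs = sum(a * inv(c - t) for a, t in zip(acc, T)) % p
    return lhs == rhs

X, T, acc = [5, 2, 2], [2, 5, 9], [2, 1, 0]
print(logup_check(X, T, acc, c=10))        # → True
print(logup_check(X, T, [1, 1, 1], c=10))  # wrong multiplicities → False
```

The challenge $c$ must avoid the poles (elements of $X \cup T$); a real verifier samples $c$ from a field large enough that this happens with overwhelming probability.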
## Indexed Lookup
Often we want an **indexed lookup**: for an index array $I$, we want to look up $T$ at positions $I[i]$. Define
$$
(I^* T)[i] = T[I[i]]
$$
We want to prove to the verifier that some witness array $X$ exactly matches the result of these indexed lookups, that is,
$$
X = I^* T
$$
The end goal is to ensure that the prover did not cheat by claiming arbitrary values in $X$ — instead, each $X[i]$ must truly be the entry at index $I[i]$ in $T$.
This is useful, for example, when $T$ is a precomputed table of constants (like roots of unity, valid opcodes, or lookup tables for optimized arithmetic), and $I$ specifies which entries the computation uses.
The classic approach reduces this to an unindexed lookup using random folding. We pick a random $\gamma$ and check
\begin{equation}
\sum_{i=0}^{n-1} \frac{1}{c - (I^* T[i] + \gamma I[i])} = \sum_{j=0}^{m-1} \frac{acc[j]}{c - (T[j] + \gamma j)}
\end{equation}
This requires committing to $I$, $I^* T$, and $acc$.
:::info
### Example for indexed lookup and random folding
Let
$$
T = \{20, 30, 40\}
$$
and index array
$$
I = \{2, 0, 1\}
$$
Then,
\begin{equation}
\begin{aligned}
I^* T[0] &= T[2] = 40 \\
I^* T[1] &= T[0] = 20 \\
I^* T[2] &= T[1] = 30
\end{aligned}
\end{equation}
Thus,
$$
I^* T = \{40, 20, 30\}
$$
Suppose we have a witness array
$$
X = \{40, 20, 30\}
$$
We want to prove that $X = I^* T$.
#### Classic approach: using random folding
We choose a random challenge $\gamma$ (say $\gamma = 7$) and a random $c$ (say $c = 100$).
For each $i$, we form
$$
I^* T[i] + \gamma I[i]
$$
Numerically:
- $I^* T[0] + 7 \cdot I[0] = 40 + 7 \cdot 2 = 40 + 14 = 54$
- $I^* T[1] + 7 \cdot I[1] = 20 + 7 \cdot 0 = 20$
- $I^* T[2] + 7 \cdot I[2] = 30 + 7 \cdot 1 = 30 + 7 = 37$
So we obtain the folded values:
$$
\{54, 20, 37\}
$$
#### Left-hand side (LHS)
\begin{equation}
\begin{aligned}
\sum_{i=0}^{2} \frac{1}{c - (I^* T[i] + \gamma I[i])} &= \frac{1}{100 - 54} + \frac{1}{100 - 20} + \frac{1}{100 - 37} \\
&= \frac{1}{46} + \frac{1}{80} + \frac{1}{63}
\end{aligned}
\end{equation}
#### Right-hand side (RHS)
We also define $acc$ by counting how many times each $T[j]$ appears in $I^* T$. In this example, each $T[j]$ appears once, so
$$
acc = [1, 1, 1]
$$
We know that:
- $T[0] + 7 \cdot 0 = 20 + 0 = 20$
- $T[1] + 7 \cdot 1 = 30 + 7 = 37$
- $T[2] + 7 \cdot 2 = 40 + 14 = 54$
Then
\begin{equation}
\begin{aligned}
\sum_{j=0}^{2} \frac{acc[j]}{c - (T[j] + \gamma j)} &= \frac{1}{100 - 20} + \frac{1}{100 - 37} + \frac{1}{100 - 54} \\
&= \frac{1}{80} + \frac{1}{63} + \frac{1}{46}
\end{aligned}
\end{equation}
Notice that the folded terms in RHS match the folded terms in LHS exactly, just in a different order. So both sides evaluate to the same sum.
#### Why do we do this?
The folding with $\gamma$ combines $I^* T[i]$ and $I[i]$ into a single value, allowing us to reduce an **indexed lookup** problem to a standard unindexed logup-style rational function check.
This requires committing to:
- $I$ (the index array)
- $I^* T$ (the looked-up values)
- $acc$ (access counts)
and then performing the rational function check at random $c$.
:::
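The folded check from the example can be sketched in a few lines over a prime field. Here the prover supplies the claimed looked-up values and multiplicities, and the check catches wrong claims with high probability over random $\gamma$ and $c$ (the prime and function names are ours):

```python
# Sketch: indexed lookup reduced to an unindexed logup check by folding
# value and index with a random gamma, over the illustrative field F_1009.
p = 1009

def inv(a):
    return pow(a % p, p - 2, p)

def indexed_logup_check(I, X, T, acc, gamma, c):
    # LHS folds the claimed values X with their indices I
    lhs = sum(inv(c - (x + gamma * i)) for x, i in zip(X, I)) % p
    # RHS folds each table entry with its own position j
    rhs = sum(a * inv(c - (T[j] + gamma * j)) for j, a in enumerate(acc)) % p
    return lhs == rhs

T, I = [20, 30, 40], [2, 0, 1]
print(indexed_logup_check(I, [40, 20, 30], T, [1, 1, 1], gamma=7, c=100))  # → True
print(indexed_logup_check(I, [40, 99, 30], T, [1, 1, 1], gamma=7, c=100)) # → False
```

The honest run reproduces the folded multisets $\{54, 20, 37\}$ on both sides; the dishonest run replaces $20$ with $99$ and the sums no longer match.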
## Challenges with Small Tables
In many practical protocols, we often have:
1. $m \ll n$: the table $T$ is small, but the witness array $X$ (and index array $I$) is large.
2. Indices $I[i]$ are small integers that can be committed to efficiently.
3. The table $T$ contains large, essentially random elements of $\mathbb{F}$ (for example, evaluations of a polynomial in SPARK-like protocols).
### Why does this cause problems?
While $I$ is cheap to commit to (since its elements are small), the looked-up values $I^* T$ are large field elements, and there are $n$ of them.
In hash-based commitment schemes, committing to each large element of $I^* T$ is expensive because:
* The number of elements is large ($n$).
* Each element is big (full field size).
This significantly increases prover time, proof size, and verifier cost.
:::info
### Example illustrating challenges with small tables
Suppose
$$
T = \{50, 80\}
$$
so $m = 2$, a very small table.
Now let the index array
$$
I = \{0, 1, 0, 0, 1, 0, 1, 0, 1, 0\}
$$
so $n = 10$, much larger than $m$.
The looked-up values are
$$
I^* T = \{50, 80, 50, 50, 80, 50, 80, 50, 80, 50\}
$$
Even though $I$ contains small indices (just 0 or 1), each value in $I^* T$ is a large field element (say, 256-bit values if $\mathbb{F}$ is large).
#### Why is this expensive?
- We must commit to $I^* T$, which has 10 large elements.
- In a hash-based scheme, each of these big elements must be individually hashed and included in the Merkle tree.
- The total cost scales with $n$ and with the size of each element.
#### Key point
Despite the table $T$ being tiny and $I$ being cheap, the commitment to $I^* T$ dominates the cost.
Logup* avoids this by removing the need to commit to $I^* T$ at all.
:::
## New Idea: Trading Pullback for Pushforward
The key insight in Logup\* is to **avoid committing to $I^* T$ altogether**. This is achieved by using a duality between *pullbacks* and *pushforwards*.
### Definitions
* **Pullback**: Given $I : \{0, \dots, n-1\} \rightarrow \{0, \dots, m-1\}$ and $T$, define
$$
I^* T[i] = T[I[i]]
$$
* **Pushforward**: Given a function $A$ on $\{0, \dots, n-1\}$, define
$$
I_* A[j] = \sum_{i : I[i] = j} A[i]
$$
:::info
### Example illustrating pullback and pushforward
Let
$$
T = \{10, 20, 30\}
$$
and
$$
I = \{2, 0, 1, 2\}
$$
Then $n = 4$, $m = 3$.
### Pullback
We define
$$
I^* T[i] = T[I[i]]
$$
Compute for each $i$:
- $I^* T[0] = T[2] = 30$
- $I^* T[1] = T[0] = 10$
- $I^* T[2] = T[1] = 20$
- $I^* T[3] = T[2] = 30$
Thus,
$$
I^* T = \{30, 10, 20, 30\}
$$
### Pushforward
Let
$$
A = \{5, 6, 7, 8\}
$$
We define
$$
I_* A[j] = \sum_{i | I[i] = j} A[i]
$$
Compute for each $j$:
- $I_* A[0]$: indices where $I[i] = 0$ is $i = 1$, so $I_* A[0] = A[1] = 6$.
- $I_* A[1]$: indices where $I[i] = 1$ is $i = 2$, so $I_* A[1] = A[2] = 7$.
- $I_* A[2]$: indices where $I[i] = 2$ are $i = 0$ and $i = 3$, so $I_* A[2] = A[0] + A[3] = 5 + 8 = 13$.
Thus,
$$
I_* A = \{6, 7, 13\}
$$
#### Intuition
- **Pullback** copies table values to positions specified by $I$.
- **Pushforward** aggregates input values $A$ according to which index they map to.
:::
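Both operations are one-liners in code. A minimal sketch (function names are ours):

```python
# Sketch: pullback and pushforward along an index map I.
def pullback(I, T):
    # (I^* T)[i] = T[I[i]] — copy table values to the positions named by I
    return [T[i] for i in I]

def pushforward(I, A, m):
    # (I_* A)[j] = sum of A[i] over all i with I[i] = j
    out = [0] * m
    for i, j in enumerate(I):
        out[j] += A[i]
    return out

T, I = [10, 20, 30], [2, 0, 1, 2]
print(pullback(I, T))                      # → [30, 10, 20, 30]
print(pushforward(I, [5, 6, 7, 8], m=3))   # → [6, 7, 13]
```

Note the asymmetry: the pullback has length $n$ (one entry per index), while the pushforward has length $m$ (one entry per table slot).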
### Duality Lemma
\begin{equation}
\langle I_* A, B \rangle = \langle A, I^* B \rangle
\end{equation}
where $\langle u, v \rangle = \sum_i u[i] \, v[i]$ denotes the inner product.
:::info
### Example illustrating the duality lemma
Let
$$
A = \{2, 3, 4\}
$$
and
$$
B = \{10, 20\}
$$
Define
$$
I = \{1, 0, 1\}
$$
So $n = 3$, $m = 2$.
### Compute $I^* B$
$$
I^* B[i] = B[I[i]]
$$
- $I^* B[0] = B[1] = 20$
- $I^* B[1] = B[0] = 10$
- $I^* B[2] = B[1] = 20$
Then,
$$
I^* B = \{20, 10, 20\}
$$
### Compute $\langle A, I^* B \rangle$
$$
\begin{aligned}
\langle A, I^* B \rangle &= 2 \cdot 20 + 3 \cdot 10 + 4 \cdot 20 \\
&= 40 + 30 + 80 \\
&= 150
\end{aligned}
$$
### Compute $I_* A$
$$
I_* A[j] = \sum_{i : I[i] = j} A[i]
$$
- $I_* A[0]$: indices where $I[i] = 0$ is $i = 1$, so $I_* A[0] = A[1] = 3$.
- $I_* A[1]$: indices where $I[i] = 1$ are $i = 0$ and $i = 2$, so $I_* A[1] = A[0] + A[2] = 2 + 4 = 6$.
Then,
$$
I_* A = \{3, 6\}
$$
### Compute $\langle I_* A, B \rangle$
$$
\begin{aligned}
\langle I_* A, B \rangle &= 3 \cdot 10 + 6 \cdot 20 \\
&= 30 + 120 \\
&= 150
\end{aligned}
$$
### Conclusion
$$
\langle I_* A, B \rangle = \langle A, I^* B \rangle = 150
$$
This confirms the duality lemma.
:::
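The lemma is purely combinatorial (each product $A[i] \cdot B[I[i]]$ appears exactly once on both sides), so it is easy to confirm numerically. A short sketch with names of our choosing:

```python
# Sketch: numerically checking the duality <I_* A, B> = <A, I^* B>.
def pullback(I, B):
    return [B[i] for i in I]

def pushforward(I, A, m):
    out = [0] * m
    for i, j in enumerate(I):
        out[j] += A[i]
    return out

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

A, B, I = [2, 3, 4], [10, 20], [1, 0, 1]
lhs = dot(pushforward(I, A, m=2), B)  # <I_* A, B>
rhs = dot(A, pullback(I, B))          # <A, I^* B>
print(lhs, rhs)  # → 150 150
```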
## The New Protocol
As mentioned previously, in classical approaches the prover must commit to the full array $I^* T$, which is large. The key innovation here is to **avoid ever committing to $I^* T$**, while still convincing the verifier of the correctness of each lookup.
### What do we want?
We want to claim that for a random point $r$ (usually a challenge chosen by the verifier),
$$
I^* T(r) = e
$$
Here, $I^* T$ is viewed as the multilinear extension of the array $I^* T$ over the Boolean hypercube (so $n = 2^k$ for $k$ variables), evaluated at the point $r$.
### Strategy breakdown
1. **Commit to $I$ and $T$ only**
We avoid committing to $I^* T$, which would have $n$ large elements. Instead, we just commit to:
- $I$: the array of indices (small elements).
- $T$: the table (small size $m$, but possibly large elements).
2. **Claim the value $e$**
Prover declares a value $e$ which should be $I^* T(r)$.
3. **Define $eq_r$**
We define a special polynomial called the multilinear Lagrange kernel:
\begin{equation}
eq_r(x) = \prod_{i=0}^{k-1} (r_i x_i + (1 - r_i)(1 - x_i))
\end{equation}
4. **Construct and commit to $I_* eq_r$**
We pushforward $eq_r$ through $I$:
$$
I_* eq_r[j] = \sum_{i: I[i] = j} eq_r[i]
$$
5. **Key observation: switch inner product**
We have:
\begin{equation}
I^* T(r) = \langle I^* T, eq_r \rangle = \langle T, I_* eq_r \rangle
\end{equation}
6. **Use sumcheck to verify**
We use the sumcheck protocol to prove
\begin{equation}
\langle T, I_* eq_r \rangle = e
\end{equation}
After sumcheck, this reduces to a small number of evaluation claims on $T$ and $I_* eq_r$ at random points.
7. **Open commitments**
Finally, we open the necessary evaluations from $T$ and $I_* eq_r$, and check that they match the expected relations.
### Why does this help?
- We never commit to $I^* T$.
- We only commit to $I_* eq_r$, which is small (size $m$).
- The expensive large field element vector commitment is completely avoided.
:::info
### Simple numeric example illustrating the new strategy
Let
$$
T = \{100, 200\}
$$
and
$$
I = \{0, 1, 0\}
$$
So
$$
I^* T = \{100, 200, 100\}
$$
### Suppose we want to evaluate at $r = (0, 1, 0)$
On the Boolean hypercube $\{0,1\}^3$, $r$ corresponds to selecting index 2 (binary index 010).
### Step 1: Commit to $I$ and $T$ only
We do **not** commit to $I^* T = \{100, 200, 100\}$.
### Step 2: Claim value $e$
The prover claims
$$
e = I^* T(r) = I^* T[2] = 100
$$
### Step 3: Define $eq_r$
The kernel $eq_r$ evaluates to 1 at the point $r = (0,1,0)$ (index 2) and to 0 at every other hypercube point; restricted to our three positions, $eq_r = \{0, 0, 1\}$.
For simplicity in this example, think of $eq_r$ as an indicator that selects index 2.
### Step 4: Compute $I_* eq_r$
We compute:
$$
I_* eq_r[j] = \sum_{i | I[i] = j} eq_r[i]
$$
- For $j = 0$: the indices with $I[i] = 0$ are $i = 0$ and $i = 2$.
  - $eq_r[0] = 0$ and $eq_r[2] = 1$, so $I_* eq_r[0] = 0 + 1 = 1$.
- For $j = 1$: the only index with $I[i] = 1$ is $i = 1$, and $eq_r[1] = 0$, so $I_* eq_r[1] = 0$.
Thus,
$$
I_* eq_r = \{1, 0\}
$$
### Step 5: Switch inner product
We have
$$
I^* T(r) = \langle I^* T, eq_r \rangle = \langle T, I_* eq_r \rangle
$$
Numerically,
$$
\langle T, I_* eq_r \rangle = 100 \times 1 + 200 \times 0 = 100
$$
Matches the claimed value $e$.
### Step 6: Use sumcheck
The prover proves via sumcheck that:
$$
\langle T, I_* eq_r \rangle = e
$$
Which here is simply $100 = 100$.
### Step 7: Open evaluations
Finally, open the commitments to $T$ and $I_* eq_r$ at points chosen during the sumcheck.
### Key point
We never committed to
$$
I^* T = \{100, 200, 100\}
$$
Instead, we only committed to
$$
I_* eq_r = \{1, 0\}
$$
which is short (length $m = 2$), and we reused the commitment to $T$.
:::
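Putting the pieces together, the evaluation trick can be sketched end to end (commitments and the sumcheck itself omitted). Since the multilinear view requires $n = 2^k$, this sketch uses $n = 8$ positions rather than the example's length-3 array; all names are ours, and the arithmetic runs over the integers for readability (a real protocol works in $\mathbb{F}$):

```python
# Sketch of the Logup* evaluation trick: compute (I^* T)(r) via <T, I_* eq_r>
# instead of materializing (and committing to) the length-n array I^* T.
from itertools import product

def eq_r(r, x):
    # multilinear Lagrange kernel: on the hypercube, 1 iff x == r
    out = 1
    for ri, xi in zip(r, x):
        out *= ri * xi + (1 - ri) * (1 - xi)
    return out

def eval_via_pushforward(I, T, r):
    k = len(r)
    # eq_r on all 2^k hypercube points, in binary index order
    eq = [eq_r(r, x) for x in product([0, 1], repeat=k)]
    # pushforward: aggregate the eq_r weights onto the m table slots
    push = [0] * len(T)
    for i, j in enumerate(I):
        push[j] += eq[i]
    # duality: <T, I_* eq_r> = <I^* T, eq_r> = (I^* T)(r)
    return sum(t * w for t, w in zip(T, push))

T = [100, 200]
I = [0, 1, 0, 1, 0, 1, 0, 1]  # n = 8 = 2^3 positions
r = (0, 1, 0)                  # hypercube point selecting index 2 (binary 010)
print(eval_via_pushforward(I, T, r))  # → 100
```

The array `push` (length $m = 2$) is the only new object the prover would commit to; everything of length $n$ stays uncommitted.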
## Pushforward Check via Logup
The core check for $I_* A$ can also be done using a logup-style argument.
Given $X$, $Y$ with $Y = I_* X$, for random $c$ we check
\begin{equation}
\sum_{i=0}^{n-1} \frac{X[i]}{c - I[i]} = \sum_{j=0}^{m-1} \frac{Y[j]}{c - j}
\end{equation}
In our context, we set $X = eq_r$ and $Y = I_* eq_r$. Note that $eq_r$ is publicly computable from $r$, so the prover, having committed to $I$ and $I_* eq_r$, validates
\begin{equation}
\sum_{i=0}^{n-1} \frac{eq_r[i]}{c - I[i]} = \sum_{j=0}^{m-1} \frac{I_* eq_r[j]}{c - j}
\end{equation}
This step is needed to ensure that the committed $I_* eq_r$ values are indeed the correct pushforward sums derived from $eq_r$ and $I$. Without this check, a dishonest prover could cheat by committing to a fake $I_* eq_r$ and then falsely convince the verifier that $\langle T, I_* eq_r \rangle = e$.
This guarantees that $I_* eq_r$ is computed correctly without having to commit to $I^* T$.
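This consistency check is again a short computation. A sketch over the illustrative prime field $\mathbb{F}_{101}$ (names are ours):

```python
# Sketch: the pushforward consistency check Y = I_* X as a logup-style
# rational identity over F_p (illustrative p = 101).
p = 101

def inv(a):
    return pow(a % p, p - 2, p)

def pushforward_check(I, X, Y, c):
    # sum_i X[i]/(c - I[i])  vs  sum_j Y[j]/(c - j)
    lhs = sum(x * inv(c - i) for x, i in zip(X, I)) % p
    rhs = sum(y * inv(c - j) for j, y in enumerate(Y)) % p
    return lhs == rhs

I = [2, 0, 1, 2]
X = [5, 6, 7, 8]
Y = [6, 7, 13]  # = I_* X
print(pushforward_check(I, X, Y, c=50))           # → True
print(pushforward_check(I, X, [6, 7, 12], c=50))  # tampered Y → False
```

Grouping the left-hand sum by the value of $I[i]$ shows why the identity holds: each pole $c - j$ collects exactly the terms $X[i]$ with $I[i] = j$, whose sum is $Y[j]$.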
## Conclusion
The Logup\* protocol trades a *pullback* claim for a *pushforward* claim, allowing small-table indexed lookups without large additional commitments. This technique simplifies proof construction, reduces prover costs, and unlocks better performance for hash-based proofs.
When $m \ll n$ and $F$ is large, this new approach has major advantages:
* No need to commit to $I^* T$, saving $n$ large-field commitments.
* Only requires small-size commitments (size $m$) and sumchecks involving $m$ terms.
* Avoids numerator overflow issues present in classical logup accumulators.