LogUp-GKR

This note covers LogUp-GKR and is the culmination of the notes that built up to it. See the book for links to all other articles; in particular, the previous article took a deep dive into the original GKR protocol. Also make sure to familiarize yourself with our math conventions.

Recall the LogUp equation from the LogUp note:

\sum_{i = 0}^{n - 2} \sum_{j = 0}^{k - 1} \frac{m_{a_{i j}}}{(α - a_{i j})} = \sum_{i = 0}^{n - 2} \sum_{j = 0}^{k - 1} \frac{m_{b_{i j}}}{(α - b_{i j})}

LogUp-GKR is a technique for a prover to prove the correct evaluation of this equation to a verifier using a modified version of GKR. We're going to be specifically interested in the case where the $m_{a_{i j}}$, $m_{b_{i j}}$, $a_{i j}$ and $b_{i j}$ are values derived from a STARK trace. As discussed in the LogUp note, we know how to prove this relationship using AIR constraints directly. This note is interested in how to offload this task to GKR instead.

Conventions

Recall from the note on the multilinear extension that $\tilde{h}$ stands for the MLE of some $h : \{0, 1\}^{k} \to F$. In the case of LogUp-GKR, we view the trace columns as functions $f_{i} : \{0, 1\}^{\log n} \to F$, where $n$ is the length of the trace. We use the bit decomposition of the row index as input to the function. For example, row index $6$ would be accessed as $110$. Further, say that we number these bits as $b_{2} b_{1} b_{0}$, such that $b_{2} = 1$, $b_{1} = 1$, and $b_{0} = 0$. Then we would write $f_{i} (b_{0}, b_{1}, b_{2}) = f_{i} (0, 1, 1)$, such that the least significant bit comes first, and the most significant bit comes last.
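As a small illustration of this indexing convention, here is a sketch in Python. The helper names `bits_lsb_first` and `index_from_bits` are made up for this example and are not part of any library.

```python
def bits_lsb_first(index, num_bits):
    """Bit decomposition of `index`, least significant bit first."""
    return tuple((index >> pos) & 1 for pos in range(num_bits))


def index_from_bits(bits):
    """Inverse of bits_lsb_first: rebuild the integer index."""
    return sum(bit << pos for pos, bit in enumerate(bits))


# Row 6 is 110 in binary, so b_0 = 0, b_1 = 1, b_2 = 1.
row_bits = bits_lsb_first(6, 3)
print(row_bits)  # (0, 1, 1)
```

With this convention, prepending a bit to an index tuple changes the least significant bit of the corresponding integer.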

Additionally, throughout the note, $f_{i} : \{0, 1\}^{\log n} \to F$ will refer to the $i$th trace column, and we will say that the trace has $m$ columns. As usual, ${\tilde{f}}_{i}$ will refer to the multilinear extension of $f_{i}$. It will also be convenient to define $\tilde{ω} (x) = ({\tilde{f}}_{0} (x), \dots, {\tilde{f}}_{m - 1} (x))$.

The circuit

Next, we will describe the circuit that LogUp-GKR operates on. Before we discuss the circuit, let's first rearrange the LogUp equation by bringing all the terms to the left-hand side:

\sum_{i = 0}^{n - 2} \sum_{j = 0}^{k - 1} \left( \frac{m_{a_{i j}}}{(α - a_{i j})} - \frac{m_{b_{i j}}}{(α - b_{i j})} \right) = 0

Notice that the above is a sum of fractions. To simplify the presentation of the circuit, let's assume that we only have 4 fractions:

\frac{a}{b} + \frac{c}{d} + \frac{e}{f} + \frac{g}{h} = 0

The first layer of the circuit will represent these 4 fractions. Then, the second layer will halve the number of fractions by merging pairs of fractions as follows:

\frac{a \cdot d + c \cdot b}{b \cdot d} + \frac{e \cdot h + g \cdot f}{f \cdot h} = 0

And finally, the final layer in this example would halve the number of fractions down to a single one in a similar manner:

\frac{f \cdot h (a \cdot d + c \cdot b) + b \cdot d (e \cdot h + g \cdot f)}{b \cdot d \cdot f \cdot h}

Note that the circuit never performs any actual division; all it does is merge fractions together over and over with each new layer, down until there is only a single one remaining.

Hence, a fraction will be represented as a pair of field elements $(a, b)$. Additionally, we will define an addition operation that mimics how fractions are added:

(a, b) + (c, d) = (a \cdot d + c \cdot b, b \cdot d)

We will follow the paper's terminology and call this $(F^{2}, +)$ the projective representation. Nodes in the circuit are going to be in projective representation. That is, every node will represent 2 field elements, and nodes are summed as described above.
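To make the projective addition concrete, here is a minimal Python sketch. The prime modulus `P` and the helper names `proj_add` and `to_field` are assumptions chosen for illustration; they are not taken from the paper or from any implementation.

```python
P = 2**31 - 1  # illustrative prime modulus (an assumption)


def proj_add(x, y):
    """(a, b) + (c, d) = (a*d + c*b, b*d), with no division involved."""
    (a, b), (c, d) = x, y
    return ((a * d + c * b) % P, (b * d) % P)


def to_field(frac):
    """Collapse (numerator, denominator) into one field element."""
    num, den = frac
    return num * pow(den, -1, P) % P


# Four example fractions a/b, c/d, e/f, g/h, merged pairwise and then once more.
fractions = [(3, 5), (7, 11), (2, 13), (9, 4)]
merged = [proj_add(fractions[0], fractions[1]), proj_add(fractions[2], fractions[3])]
root = proj_add(merged[0], merged[1])

# A single inversion at the very end recovers the sum of the four fractions.
direct_sum = sum(to_field(f) for f in fractions) % P
assert to_field(root) == direct_sum
```

The only inversion happens in `to_field`, outside of the merging steps, which mirrors the fact that the circuit itself never divides.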

Below is an example circuit, where the initial fractions (encoded in projective representation) lie in the input layer:

[Figure: example circuit, with the initial fractions in the input layer. The image is unavailable.]

In drawing this circuit, we made the assumption that $k = 1$ and $n = 4$. We omitted the $j = 0$ index in the circuit, such that, for example, $m_{a_{i}}$ stands for $m_{a_{i 0}}$. Additionally, note that the circuit ends with 2 fractions in the final layer instead of 1; we will explain why when we get to GKR.

The $(p_{i}, q_{i})$ pairs on the left-hand side of the circuit represent the 2 polynomials that will be interpolated per layer; we will discuss these in depth in the next sections.

Recall from the multiset check note that the terms $m_{a_{i j}}$, $m_{b_{i j}}$, $a_{i j}$ and $b_{i j}$ are assumed to be polynomials over the trace rows $i$ and $i + 1$. However, in this circuit and in the rest of this note, we make the simplifying assumption that the terms are polynomials solely over the trace row $i$.

GKR

We will now discuss the modifications to the GKR protocol that are made to prove the evaluation of this circuit. The point to address is how we handle the fact that we have 2 field elements per node compared to a single one in the original GKR. We will also define the layer polynomial differently (exploiting the additional structure in the circuit) to make the layer sum-check more efficient.

We will first discuss GKR over all layers other than the input layer, and end with the treatment of the input layer.

All inner layers (excluding the input layer)

We begin with all layers except the input layer. At layer $i$, let $p_{i} : \{0, 1\}^{k_{i}} \to F$ and $q_{i} : \{0, 1\}^{k_{i}} \to F$ be functions that encode the left element and the right element of each node, respectively. For example, referring to our example circuit at layer 2,

\begin{aligned} p_{2} (0, 0, 0) & = m_{a_{0}} \\ q_{2} (0, 0, 0) & = α - a_{0} \\ p_{2} (1, 0, 0) & = m_{b_{0}} \\ q_{2} (1, 0, 0) & = α - b_{0} \\ \dots \end{aligned}

Additionally, we can describe the circuit structure as

\begin{aligned} p_{i} (x) & = p_{i + 1} (0, x) q_{i + 1} (1, x) + p_{i + 1} (1, x) q_{i + 1} (0, x) \\ q_{i} (x) & = q_{i + 1} (0, x) q_{i + 1} (1, x) \end{aligned}

This follows from how the nodes are laid out, the fact that we index nodes in bit representation from least to most significant bit, and from the definition of addition for the projective representation.
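The layer recurrence can be sketched in Python as follows. Tables are plain lists indexed by the integer whose bits, least significant first, form the node index, so the children $(0, x)$ and $(1, x)$ sit at positions $2x$ and $2x + 1$. The modulus and the function names are assumptions made for this example, and the input values are arbitrary.

```python
P = 2**31 - 1  # illustrative prime modulus (an assumption)


def next_layer(p_child, q_child):
    """Compute layer i from layer i + 1 using the projective addition."""
    p, q = [], []
    for x in range(len(p_child) // 2):
        p0, p1 = p_child[2 * x], p_child[2 * x + 1]   # p_{i+1}(0, x), p_{i+1}(1, x)
        q0, q1 = q_child[2 * x], q_child[2 * x + 1]   # q_{i+1}(0, x), q_{i+1}(1, x)
        p.append((p0 * q1 + p1 * q0) % P)
        q.append((q0 * q1) % P)
    return p, q


def build_layers(p_input, q_input, output_size=2):
    """Return [layer 0, ..., input layer], stopping at `output_size` nodes."""
    layers = [(p_input, q_input)]
    while len(layers[0][0]) > output_size:
        layers.insert(0, next_layer(*layers[0]))
    return layers


def as_field_element(num, den):
    return num * pow(den, -1, P) % P


# Arbitrary input layer with 8 fractions (numerators and denominators).
p_in = [3, 7, 2, 9, 1, 4, 6, 8]
q_in = [5, 11, 13, 4, 17, 19, 23, 29]
layers = build_layers(p_in, q_in)

# The 2 output fractions still add up to the sum of all input fractions.
input_sum = sum(as_field_element(n, d) for n, d in zip(p_in, q_in)) % P
output_sum = sum(as_field_element(n, d) for n, d in zip(*layers[0])) % P
assert input_sum == output_sum
```

Stopping at 2 output nodes matches the example circuit, which ends with 2 fractions in its final layer.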

Next, we need to give a definition of ${\tilde{p}}_{i}$ and ${\tilde{q}}_{i}$ which can be proved using sum-check. We will use the following definitions:

\begin{aligned} {\tilde{p}}_{i} (\hat{x}) & = \sum_{x \in \{0, 1\}^{k_{i}}} eq (x, \hat{x}) ({\tilde{p}}_{i + 1} (0, x) {\tilde{q}}_{i + 1} (1, x) + {\tilde{p}}_{i + 1} (1, x) {\tilde{q}}_{i + 1} (0, x)) \\ {\tilde{q}}_{i} (\hat{x}) & = \sum_{x \in \{0, 1\}^{k_{i}}} eq (x, \hat{x}) {\tilde{q}}_{i + 1} (0, x) {\tilde{q}}_{i + 1} (1, x) \end{aligned}

Remember that the MLE of a function $f$ is unique, and hence all expressions which are multilinear and interpolate $f$ are equivalent definitions of the same polynomial $\tilde{f}$. You can verify that the 2 properties hold for the above definitions. This concept is further discussed in this note from Justin Thaler.
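The claim that the sum-based definition is just another way of writing the MLE can be checked numerically. The sketch below uses random tables over an illustrative prime field; the modulus, the helper names and the choice $k_i = 2$ are all assumptions made for the example.

```python
import random
from itertools import product

P = 2**31 - 1  # illustrative prime modulus (an assumption)


def eq(x, y):
    """eq(x, y) = prod_j (x_j*y_j + (1 - x_j)*(1 - y_j))."""
    out = 1
    for a, b in zip(x, y):
        out = out * (a * b + (1 - a) * (1 - b)) % P
    return out


def mle(table, point):
    """Evaluate the MLE of `table` (bit-tuple -> value) at `point`."""
    return sum(eq(bits, point) * value for bits, value in table.items()) % P


rng = random.Random(0)
k = 2
cube = list(product((0, 1), repeat=k))
p_next = {b: rng.randrange(P) for b in product((0, 1), repeat=k + 1)}
q_next = {b: rng.randrange(1, P) for b in product((0, 1), repeat=k + 1)}

# Layer i computed directly from the circuit structure.
p_cur = {x: (p_next[(0,) + x] * q_next[(1,) + x]
             + p_next[(1,) + x] * q_next[(0,) + x]) % P for x in cube}


def p_tilde_via_sum(x_hat):
    """The sum-check-friendly definition of the MLE of p_i."""
    total = 0
    for x in cube:
        term = (mle(p_next, (0,) + x) * mle(q_next, (1,) + x)
                + mle(p_next, (1,) + x) * mle(q_next, (0,) + x))
        total += eq(x, x_hat) * term
    return total % P


rho = tuple(rng.randrange(P) for _ in range(k))
assert all(p_tilde_via_sum(x) == p_cur[x] for x in cube)  # interpolates p_i
assert p_tilde_via_sum(rho) == mle(p_cur, rho)            # same polynomial off the cube
```

The expression is multilinear in $\hat{x}$ because $\hat{x}$ only appears inside $eq$, which is why agreeing with $p_{i}$ on the Boolean cube is enough.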

Notice that unlike the original GKR protocol, both sum-checks are defined over $x$ of the current layer $i$ as opposed to over some $y, z \in \{0, 1\}^{k_{i + 1}}$ of the next layer $i + 1$. Concretely, we are summing over $k_{i}$ variables instead of $2 \cdot k_{i + 1}$, so the hypercube we sum over shrinks from $2^{2 k_{i + 1}}$ points to $2^{k_{i}}$ points. We are able to do this by exploiting the additional structure in the circuit, where a node with index $(x_{0}, \dots, x_{k_{i} - 1})$ in layer $i$ will be the result of adding the nodes in layer $i + 1$ with index $(0, x_{0}, \dots, x_{k_{i} - 1})$ and $(1, x_{0}, \dots, x_{k_{i} - 1})$. Additionally, since the circuit has no multiplication gates, we omit the corresponding terms completely.

Recall that the core idea of the GKR protocol is to reduce a claim about the layer polynomial(s) at layer $i$ to a claim about the layer polynomial(s) at layer $i + 1$. Assume that the claims at layer $i$ are ${\tilde{p}}_{i} (ρ_{i})$ and ${\tilde{q}}_{i} (ρ_{i})$, for some random $ρ_{i} \in_{R} F^{k_{i}}$. Here, we could execute 2 sum-check protocols to reduce the 2 claims ${\tilde{p}}_{i} (ρ_{i})$ and ${\tilde{q}}_{i} (ρ_{i})$ to 2 claims ${\tilde{p}}_{i + 1} (ρ_{i + 1})$ and ${\tilde{q}}_{i + 1} (ρ_{i + 1})$, for some random $ρ_{i + 1} \in_{R} F^{k_{i + 1}}$. However, we will instead reduce the 2 sum-check problems to a single one by taking a random linear combination of both claims. Specifically, given a randomly sampled $λ_{i} \in_{R} F$,

\begin{aligned} {\tilde{p}}_{i} (\hat{x}) + λ_{i} {\tilde{q}}_{i} (\hat{x}) = \\ \sum_{x \in \{0, 1\}^{k_{i}}} eq (x, \hat{x}) [{\tilde{p}}_{i + 1} (0, x) {\tilde{q}}_{i + 1} (1, x) + {\tilde{p}}_{i + 1} (1, x) {\tilde{q}}_{i + 1} (0, x) + λ_{i} {\tilde{q}}_{i + 1} (0, x) {\tilde{q}}_{i + 1} (1, x)] \end{aligned}

The core property of a random linear combination of $n$ claims is that it reduces the $n$ claims to a single claim. In other words, if the reduced claim is proven to be true, then with high probability, the $n$ original claims are proven true as well.

Hence, with this trick, we are back to having a single sum-check protocol per layer.
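To see why a random linear combination is sound for 2 claims, note that a wrong pair of claims can only match the honest combination for a single value of $λ$. The following Python sketch illustrates this with made-up claim values over an illustrative prime field; none of the numbers or names come from the protocol itself.

```python
P = 2**31 - 1  # illustrative prime modulus (an assumption)


def combine(p_claim, q_claim, lam):
    """Random linear combination p + lam * q of two claims."""
    return (p_claim + lam * q_claim) % P


honest = (123, 456)
forged = (124, 999)  # a hypothetical wrong pair of claims

# Solving p + lam*q = p' + lam*q' gives the single lambda that hides the forgery.
bad_lam = (honest[0] - forged[0]) * pow(forged[1] - honest[1], -1, P) % P

assert combine(*honest, bad_lam) == combine(*forged, bad_lam)
assert all(combine(*honest, lam) != combine(*forged, lam)
           for lam in range(10_000) if lam != bad_lam)
```

Since the verifier samples $λ$ only after the claims are fixed, a forgery of this kind goes undetected with probability at most $1 / |F|$.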

Finally, we are left with showing how to reduce a claim ${\tilde{p}}_{i} (ρ_{i}) + λ_{i} {\tilde{q}}_{i} (ρ_{i})$, for some random $ρ_{i} \in_{R} F^{k_{i}}$, to a claim ${\tilde{p}}_{i + 1} (ρ_{i + 1}) + λ_{i + 1} {\tilde{q}}_{i + 1} (ρ_{i + 1})$, for some random $ρ_{i + 1} \in_{R} F^{k_{i + 1}}$ and $λ_{i + 1} \in_{R} F$.

Starting from the claim ${\tilde{p}}_{i} (ρ_{i}) + λ_{i} {\tilde{q}}_{i} (ρ_{i})$, we run the $k_{i}$ rounds of sum-check until the last evaluation check:

\begin{aligned} {\tilde{p}}_{i} (ρ_{i}) + λ_{i} {\tilde{q}}_{i} (ρ_{i}) = \\ \sum_{x \in \{0, 1\}^{k_{i}}} eq (x, ρ_{i}) [{\tilde{p}}_{i + 1} (0, x) {\tilde{q}}_{i + 1} (1, x) + {\tilde{p}}_{i + 1} (1, x) {\tilde{q}}_{i + 1} (0, x) + λ_{i} {\tilde{q}}_{i + 1} (0, x) {\tilde{q}}_{i + 1} (1, x)] \end{aligned}

In the last step of the sum-check, let $γ \in_{R} F^{k_{i}}$ be the randomness sampled during the $k_{i}$ rounds. Hence, the verifier is left with evaluating ${\tilde{p}}_{i + 1} (0, γ)$, ${\tilde{p}}_{i + 1} (1, γ)$, ${\tilde{q}}_{i + 1} (0, γ)$, and ${\tilde{q}}_{i + 1} (1, γ)$. Just as in the original GKR, the prover will send these evaluations as claims. Also, just as in GKR, we need to reduce the 2 claims about ${\tilde{p}}_{i + 1}$ to a single claim, and the 2 claims about ${\tilde{q}}_{i + 1}$ to a single claim.

Recall from GKR that to reduce 2 claims about ${\tilde{W}}_{i + 1}$ to a single claim about ${\tilde{W}}_{i + 1}$, the prover had to send a univariate polynomial $q_{i} = {\tilde{W}}_{i + 1} \circ l_{i + 1}$ of degree $k_{i + 1}$ (not to be confused with the denominator function $q_{i}$ of this note). In LogUp-GKR, we can do better by exploiting the additional structure of the circuit.

In LogUp-GKR, to reduce the 2 claims about ${\tilde{p}}_{i + 1}$, the prover only needs to send the 2 claims ${\tilde{p}}_{i + 1} (0, γ)$ and ${\tilde{p}}_{i + 1} (1, γ)$. The verifier samples $γ_{0} \in_{R} F$, and computes the new claim ${\tilde{p}}_{i + 1} (γ_{0}, γ) = (1 - γ_{0}) \cdot {\tilde{p}}_{i + 1} (0, γ) + γ_{0} \cdot {\tilde{p}}_{i + 1} (1, γ)$. Similarly for ${\tilde{q}}_{i + 1}$, the prover also sends the claims ${\tilde{q}}_{i + 1} (0, γ)$ and ${\tilde{q}}_{i + 1} (1, γ)$, which are reduced to ${\tilde{q}}_{i + 1} (γ_{0}, γ) = (1 - γ_{0}) \cdot {\tilde{q}}_{i + 1} (0, γ) + γ_{0} \cdot {\tilde{q}}_{i + 1} (1, γ)$. Finally, the verifier samples $λ_{i + 1} \in_{R} F$, yielding the next layer claim ${\tilde{p}}_{i + 1} (γ_{0}, γ) + λ_{i + 1} {\tilde{q}}_{i + 1} (γ_{0}, γ)$.

The above makes use of a property of multilinear polynomials, which was explored in this note.
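The folding identity can be checked numerically with a few lines of Python. In this sketch, tables are lists indexed with the least significant bit first, and the modulus, the table size and the names are assumptions chosen for illustration.

```python
import random

P = 2**31 - 1  # illustrative prime modulus (an assumption)


def mle(table, point):
    """Evaluate the MLE of `table` at `point`.

    `table` is indexed by the integer whose LSB-first bits are the node
    index, so the first coordinate of `point` pairs entries 2*t and 2*t + 1.
    """
    for r in point:
        table = [(table[2 * t] + r * (table[2 * t + 1] - table[2 * t])) % P
                 for t in range(len(table) // 2)]
    return table[0]


rng = random.Random(0)
p_next = [rng.randrange(P) for _ in range(8)]   # a layer with 3 variables
gamma = [rng.randrange(P) for _ in range(2)]    # sum-check randomness
gamma_0 = rng.randrange(P)                      # extra verifier challenge

claim_0 = mle(p_next, [0] + gamma)   # evaluation at (0, gamma), sent by the prover
claim_1 = mle(p_next, [1] + gamma)   # evaluation at (1, gamma), sent by the prover
folded = ((1 - gamma_0) * claim_0 + gamma_0 * claim_1) % P

# Multilinearity in the first variable makes the folded value a true evaluation.
assert folded == mle(p_next, [gamma_0] + gamma)
```

The verifier never evaluates the polynomial here; it only combines the 2 values sent by the prover.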

Recall that our goal was to reduce the initial claim ${\tilde{p}}_{i} (ρ_{i}) + λ_{i} {\tilde{q}}_{i} (ρ_{i})$ to a single claim ${\tilde{p}}_{i + 1} (ρ_{i + 1}) + λ_{i + 1} {\tilde{q}}_{i + 1} (ρ_{i + 1})$ for some $ρ_{i + 1} \in_{R} F^{k_{i + 1}}$. We just achieved this with $ρ_{i + 1} = (γ_{0}, γ)$.

Hence, by exploiting the graph structure, in LogUp-GKR we are able to have the prover send 2 field elements per polynomial (${\tilde{p}}_{i + 1} (0, γ)$, ${\tilde{p}}_{i + 1} (1, γ)$, ${\tilde{q}}_{i + 1} (0, γ)$ and ${\tilde{q}}_{i + 1} (1, γ)$) instead of the $k_{i + 1} + 1$ field elements that represent the degree-$k_{i + 1}$ univariate polynomial $q$ in the original GKR.
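Putting the pieces of this section together, a single layer reduction could be sketched as below. This is a toy, non-optimized illustration under several assumptions: a small prime field, list tables indexed least significant bit first, prover and verifier folded into one function, and round polynomials sent as 4 evaluations. All names are made up for the example.

```python
import random

P = 2**31 - 1  # illustrative prime modulus (an assumption)
rng = random.Random(1)


def fold(table, r):
    """Bind the first (least significant) variable of a multilinear table to r."""
    return [(table[2 * t] + r * (table[2 * t + 1] - table[2 * t])) % P
            for t in range(len(table) // 2)]


def mle(table, point):
    for r in point:
        table = fold(table, r)
    return table[0]


def eq_table(rho):
    """Values eq(x, rho) for all Boolean x, LSB-first indexing."""
    table = [1]
    for r in reversed(rho):
        table = [v for t in table for v in (t * (1 - r) % P, t * r % P)]
    return table


def summand(e, p0, p1, q0, q1, lam):
    return e * (p0 * q1 + p1 * q0 + lam * q0 * q1) % P


def interpolate(ys, r):
    """Evaluate at r the polynomial through (0, ys[0]), ..., (d, ys[d])."""
    total = 0
    for j, y in enumerate(ys):
        num = den = 1
        for m in range(len(ys)):
            if m != j:
                num, den = num * (r - m) % P, den * (j - m) % P
        total = (total + y * num * pow(den, -1, P)) % P
    return total


def reduce_layer(p_next, q_next, rho, lam, claim):
    """Reduce `claim` on layer i (at rho, combined with lam) to layer i + 1."""
    tables = [eq_table(rho), p_next[0::2], p_next[1::2], q_next[0::2], q_next[1::2]]
    gamma = []
    for _ in rho:
        # Prover: the round polynomial has degree 3, so 4 evaluations fix it.
        s = [sum(summand(*row, lam) for row in zip(*(fold(t, X) for t in tables))) % P
             for X in range(4)]
        assert (s[0] + s[1]) % P == claim   # verifier's round check
        r = rng.randrange(P)                # verifier's challenge
        claim = interpolate(s, r)
        tables = [fold(t, r) for t in tables]
        gamma.append(r)
    # Prover sends the four evaluations; the verifier computes eq itself.
    p0, p1, q0, q1 = (t[0] for t in tables[1:])
    assert claim == summand(mle(eq_table(rho), gamma), p0, p1, q0, q1, lam)
    gamma_0 = rng.randrange(P)
    p_claim = ((1 - gamma_0) * p0 + gamma_0 * p1) % P
    q_claim = ((1 - gamma_0) * q0 + gamma_0 * q1) % P
    return [gamma_0] + gamma, p_claim, q_claim


k = 2
p_next = [rng.randrange(P) for _ in range(2 ** (k + 1))]
q_next = [rng.randrange(1, P) for _ in range(2 ** (k + 1))]
p_cur = [(p_next[2 * x] * q_next[2 * x + 1] + p_next[2 * x + 1] * q_next[2 * x]) % P
         for x in range(2 ** k)]
q_cur = [q_next[2 * x] * q_next[2 * x + 1] % P for x in range(2 ** k)]

rho = [rng.randrange(P) for _ in range(k)]
lam = rng.randrange(P)
claim = (mle(p_cur, rho) + lam * mle(q_cur, rho)) % P

rho_next, p_claim, q_claim = reduce_layer(p_next, q_next, rho, lam, claim)
assert p_claim == mle(p_next, rho_next)
assert q_claim == mle(q_next, rho_next)
```

The returned pair of claims would then be combined with a fresh $λ_{i + 1}$ to start the reduction of the next layer.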

The input layer

Recall that we are interested in the use of LogUp-GKR where the values $m_{a_{i j}}$, $m_{b_{i j}}$, $a_{i j}$ and $b_{i j}$ that sit in the input layer are derived from a STARK trace. In order to prove that this is indeed the case, we design the circuit to reduce the evaluation of the circuit to claims about the main trace columns ${\tilde{f}}_{0} (ρ), \dots, {\tilde{f}}_{m - 1} (ρ)$, given a randomly sampled $ρ \in_{R} F^{\log n}$. Then, in the next article, we will show how to prove these claims with AIR constraints.

This additional requirement means that we need to reduce the claim about the 2nd layer (${\tilde{p}}_{1} (ρ_{1}) + λ_{1} {\tilde{q}}_{1} (ρ_{1})$ in our example circuit) differently from all other layers. Concretely, again referring to our example circuit, applying the same algorithm as for all other circuit layers would reduce the 2nd layer claim to a claim ${\tilde{p}}_{2} (ρ_{2}) + λ_{2} {\tilde{q}}_{2} (ρ_{2})$, for some random $ρ_{2} \in_{R} F^{k_{2}}$, which is a claim about the input layer as a whole rather than about the individual trace columns. This is not quite what we need.

Recall that the input layer holds the values $m_{a_{i j}}$, $m_{b_{i j}}$, $a_{i j}$ and $b_{i j}$, for $0 \leq i < n$ and $0 \leq j < k$. That is, $i$ selects the trace row, and $j$ selects the fraction generated at row $i$.

Let $z \in \{0, 1\}^{\log k}$ be the bit decomposition of $j$. For example, if $j = 4$, then $z = (0, 0, 1)$, following our previously discussed convention that the least significant bit comes first.

Next, we will introduce $g_{[z]} (\tilde{ω} (x))$, which captures all numerators of a given row, and $h_{[z]} (\tilde{ω} (x))$ to capture the denominators. In our example circuit, using a slight abuse of notation, we would have

\begin{aligned} g_{0} (\tilde{ω} (x)) & = m_{a_{x 0}} \\ g_{1} (\tilde{ω} (x)) & = m_{b_{x 0}} \\ h_{0} (\tilde{ω} (x)) & = α - a_{x 0} \\ h_{1} (\tilde{ω} (x)) & = α - b_{x 0} \end{aligned}

We abuse notation since $x$ is the bit decomposition of the index $i$, which we use directly as the row index in, for example, $m_{a_{x 0}}$.

Then, let ${\tilde{p}}_{d - 2} (ρ_{d - 2}) + λ_{d - 2} {\tilde{q}}_{d - 2} (ρ_{d - 2})$ be the claim from the 2nd layer, where $d$ denotes the number of layers (so that $d - 2 = 1$ in our example circuit). We will rewrite the claim as ${\tilde{p}}_{d - 2} (ρ_{z'}, ρ_{x}) + λ_{d - 2} {\tilde{q}}_{d - 2} (ρ_{z'}, ρ_{x})$, where $ρ_{z'} \in_{R} F^{\log (k) - 1}$ and $ρ_{x} \in_{R} F^{\log n}$.

Then, the reduction to the input layer can be written as

\begin{aligned} {\tilde{p}}_{d - 2} (ρ_{z'}, ρ_{x}) + λ_{d - 2} {\tilde{q}}_{d - 2} (ρ_{z'}, ρ_{x}) = \\ \sum_{x \in \{0, 1\}^{\log n}} \sum_{z' \in \{0, 1\}^{\log (k) - 1}} eq (z', ρ_{z'}) \cdot eq (x, ρ_{x}) \\ [g_{[(0, z')]} (\tilde{ω} (x)) \cdot h_{[(1, z')]} (\tilde{ω} (x)) + g_{[(1, z')]} (\tilde{ω} (x)) \cdot h_{[(0, z')]} (\tilde{ω} (x)) + λ_{d - 2} \cdot h_{[(0, z')]} (\tilde{ω} (x)) \cdot h_{[(1, z')]} (\tilde{ω} (x))] \end{aligned}

We only run the sum-check protocol over the $x$ variable. Then, after the last round of sum-check, let $\hat{x} \in_{R} F^{\log n}$ be the randomness generated during sum-check. The prover sends the claims $\tilde{ω} (\hat{x}) = ({\tilde{f}}_{0} (\hat{x}), \dots, {\tilde{f}}_{m - 1} (\hat{x}))$, which the verifier uses to complete the final evaluation of sum-check.
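As a rough illustration of this last step, the sketch below evaluates the input-layer summand from claimed column evaluations. Everything concrete in it is hypothetical: the 4-column trace layout (`a`, `b`, `m_a`, `m_b`), the value of `ALPHA`, the modulus, and the helper names. It assumes 2 fractions per row, so there are no $z'$ variables, and it skips the sum-check rounds themselves.

```python
import random

P = 2**31 - 1   # illustrative prime modulus (an assumption)
ALPHA = 7       # stand-in for the challenge alpha (an assumption)
rng = random.Random(2)


def fold(table, r):
    """Bind the first (least significant) variable of a multilinear table to r."""
    return [(table[2 * t] + r * (table[2 * t + 1] - table[2 * t])) % P
            for t in range(len(table) // 2)]


def mle(table, point):
    for r in point:
        table = fold(table, r)
    return table[0]


def eq_table(rho):
    """Values eq(x, rho) for all Boolean x, LSB-first indexing."""
    table = [1]
    for r in reversed(rho):
        table = [v for t in table for v in (t * (1 - r) % P, t * r % P)]
    return table


# Hypothetical trace with n = 4 rows and m = 4 columns: a, b, m_a, m_b.
columns = [[1, 2, 3, 4], [4, 3, 2, 1], [1, 1, 1, 1], [1, 1, 1, 1]]


def g(z, w):
    """Numerator of fraction z, given the column values w of one row."""
    return w[2] if z == 0 else w[3]


def h(z, w):
    """Denominator of fraction z, given the column values w of one row."""
    return (ALPHA - w[0]) % P if z == 0 else (ALPHA - w[1]) % P


def bracket(w, lam):
    """g_0*h_1 + g_1*h_0 + lam*h_0*h_1, the term multiplied by eq."""
    return (g(0, w) * h(1, w) + g(1, w) * h(0, w) + lam * h(0, w) * h(1, w)) % P


rho_x = [rng.randrange(P) for _ in range(2)]
lam = rng.randrange(P)
rows = list(zip(*columns))
eq_vals = eq_table(rho_x)

# The 2nd layer claim, written as a sum over the trace rows.
claim = sum(e * bracket(w, lam) for e, w in zip(eq_vals, rows)) % P

# Once sum-check over x ends at x_hat, the prover sends the column evaluations,
# and the verifier finishes with a single evaluation of the summand.
x_hat = [rng.randrange(P) for _ in range(2)]
omega_claims = [mle(col, x_hat) for col in columns]
final_eval = mle(eq_vals, x_hat) * bracket(omega_claims, lam) % P

# On a Boolean point, the column MLEs simply return that row of the trace.
assert [mle(col, [0, 1]) for col in columns] == list(rows[2])
```

The verifier only needs `omega_claims` to compute `final_eval`; whether those claims are correct is exactly what remains to be proved.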

And… We're done! We are left with the claims $({\tilde{f}}_{0} (\hat{x}), \dots, {\tilde{f}}_{m - 1} (\hat{x}))$ to prove, which will be proved in AIR. We will discuss this in the next article.
