# LogUp-GKR
This note is about LogUp-GKR, and is the culmination of many notes that built up to it. Make sure to take a look at the book for links to all the other articles. In particular, the previous article took a deep dive into the original GKR protocol. Also make sure to familiarize yourself with our math conventions.
Recall the LogUp equation from the LogUp note:
\[ \sum_{i=0}^{n-2} \sum_{j=0}^{k-1} \frac{m_{a_{ij}}}{(\alpha - a_{ij})} = \sum_{i=0}^{n-2} \sum_{j=0}^{k-1} \frac{m_{b_{ij}}}{(\alpha - b_{ij})} \]
LogUp-GKR is a technique for a prover to prove the correct evaluation of this equation to a verifier using a modified version of GKR. We're going to be specifically interested in the case where the \(m_{a_{ij}}\), \(m_{b_{ij}}\), \(a_{ij}\) and \(b_{ij}\) are values derived from a STARK trace. As discussed in the LogUp note, we know how to prove this relationship using AIR constraints directly. This note is interested in how to offload this task to GKR instead.
## Conventions
Recall from the note on the multilinear extension that \(\tilde{h}\) stands for the MLE of some \(h: \{0,1\}^k \rightarrow \mathbb{F}\). In the case of LogUp-GKR, we view the trace columns as functions \(f_i: \{0,1\}^{\log n} \rightarrow \mathbb{F}\), where \(n\) is the length of the trace. We use the bit decomposition of the row index as the input to the function. For example, row index \(6\) would be accessed as \(110\). Further, say that we number these bits as \(b_2b_1b_0\), such that \(b_2=1\), \(b_1=1\), and \(b_0=0\). Then we would write \(f_i(b_0,b_1,b_2)=f_i(0,1,1)\), such that the least significant bit comes first, and the most significant bit comes last.
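To make this convention concrete, here is a minimal Rust sketch (the helper name `index_to_bits` is ours, purely for illustration) that decomposes a row index into its bits, least significant first:

```rust
/// Decompose `index` into `num_bits` bits, least significant bit first,
/// matching the convention f_i(b_0, b_1, ..., b_{k-1}) used throughout this note.
fn index_to_bits(index: usize, num_bits: usize) -> Vec<u8> {
    (0..num_bits).map(|k| ((index >> k) & 1) as u8).collect()
}

fn main() {
    // Row index 6 = 0b110 is accessed as (b_0, b_1, b_2) = (0, 1, 1).
    assert_eq!(index_to_bits(6, 3), vec![0, 1, 1]);
}
```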
Additionally, throughout the note, \(f_i: \{0,1\}^{\log n} \rightarrow \mathbb{F}\) will refer to the \(i\)th trace column, and we will say that the trace has \(m\) columns. As usual, \(\tilde{f}_i\) will refer to the multilinear extension of \(f_i\). It will also be convenient to define \(\tilde{\omega}(x) = (\tilde{f}_0(x), \dots, \tilde{f}_{m-1}(x))\).
## The circuit
Next, we will describe the circuit that LogUp-GKR operates on. However, before we discuss the circuit, let's first rearrange the above equation by bringing all the terms to the left-hand side:
\[ \sum_{i=0}^{n-2} \sum_{j=0}^{k-1} \frac{m_{a_{ij}}}{(\alpha - a_{ij})} - \frac{m_{b_{ij}}}{(\alpha - b_{ij})} = 0 \]
Notice that the above is a sum of fractions. To simplify the presentation of the circuit, let's assume that we only have 4 fractions:
\[ \frac{a}{b} + \frac{c}{d} + \frac{e}{f} + \frac{g}{h} = 0 \]
The first layer of the circuit will represent these 4 fractions. Then, the second layer will halve the number of fractions by merging pairs of fractions as follows:
\[ \frac{a \cdot d + c \cdot b}{b \cdot d} + \frac{e \cdot h + g \cdot f}{f \cdot h} = 0 \]
Finally, the last layer in this example halves the number of fractions down to a single one in a similar manner:
\[ \frac{f \cdot h (a \cdot d + c \cdot b) + b \cdot d(e \cdot h + g \cdot f)}{b \cdot d \cdot f \cdot h} = 0 \]
Note that the circuit never performs any actual division; all it does is merge fractions together over and over with each new layer, down until there is only a single one remaining.
Hence, a fraction will be represented as a pair of field elements \((a,b)\). Additionally, we will define an addition operation that mimics how fractions are added:
\[ (a,b) + (c, d) = (a \cdot d + c \cdot b, b \cdot d) \]
We will follow the paper's terminology and call this \((\mathbb{F}^2, +)\) the projective representation. Nodes in the circuit are going to be in projective representation. That is, every node will represent 2 field elements, and nodes are summed as described above.
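As a rough illustration of the projective representation and of how the circuit merges fractions layer by layer, here is a toy Rust sketch; it uses plain `u64` integers in place of field elements, so it only models the shape of the computation, not the real field arithmetic:

```rust
/// A fraction in projective representation: a pair (p, q) standing for p/q.
/// Real implementations use field elements; plain u64 is used here only for illustration.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Projective {
    p: u64,
    q: u64,
}

impl Projective {
    /// (a, b) + (c, d) = (a*d + c*b, b*d)
    fn add(self, other: Projective) -> Projective {
        Projective {
            p: self.p * other.q + other.p * self.q,
            q: self.q * other.q,
        }
    }
}

/// Compute layer i from layer i+1: node x of layer i is the projective sum of
/// nodes (0, x) and (1, x) of layer i+1, which (LSB-first) sit at positions 2x and 2x+1.
fn next_layer(layer: &[Projective]) -> Vec<Projective> {
    layer.chunks(2).map(|pair| pair[0].add(pair[1])).collect()
}

fn main() {
    // The 4-fraction example a/b + c/d + e/f + g/h, here 1/2 + 1/3 + 1/4 + 1/5.
    let mut layer = vec![
        Projective { p: 1, q: 2 },
        Projective { p: 1, q: 3 },
        Projective { p: 1, q: 4 },
        Projective { p: 1, q: 5 },
    ];
    // Fold down to a single fraction, as the circuit does layer by layer.
    while layer.len() > 1 {
        layer = next_layer(&layer);
    }
    // 1/2 + 1/3 + 1/4 + 1/5 = 77/60; the circuit produces an unreduced representative.
    assert_eq!(layer[0], Projective { p: 154, q: 120 });
}
```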
Below is an example circuit, where the initial fractions (encoded in projective representation) lie in the input layer:

The \((p_i, q_i)\) pairs on the left-hand side of the circuit represent the 2 polynomials that will be interpolated per layer; we will discuss these in depth in the next sections.
Recall from the multiset check note that the terms \(m_{a_{ij}}\), \(m_{b_{ij}}\), \(a_{ij}\) and \(b_{ij}\) are assumed to be polynomials over trace rows \(i\) and \(i+1\). However, in this circuit and in the rest of this note, we make the simplifying assumption that the terms are polynomials solely over trace row \(i\).
## GKR
We will now discuss the modifications to the GKR protocol that are made to prove the evaluation of this circuit. The point to address is how we handle the fact that we have 2 field elements per node compared to a single one in the original GKR. We will also define the layer polynomial differently (exploiting the additional structure in the circuit) to make the layer sum-check more efficient.
We will first discuss GKR over all layers other than the input layer, and end with the treatment of the input layer.
### All inner layers (excluding the input layer)
We begin the discussion with all layers except the input layer. At layer \(i\), let \(p_i: \{0, 1\}^{k_i} \rightarrow \mathbb{F}\) and \(q_i: \{0, 1\}^{k_i} \rightarrow \mathbb{F}\) be functions that encode the left element of each node and the right element of each node, respectively. For example, referring to our example circuit at layer 2,
\[ \begin{align} p_2(0,0,0) &= m_{a_0}\\ q_2(0,0,0) &= \alpha - a_0\\ p_2(1,0,0) &= m_{b_0}\\ q_2(1,0,0) &= \alpha - b_0\\ \dots \end{align} \]
Additionally, we can describe the circuit structure as
\[ \begin{align} p_i(x) &= p_{i+1}(0, x)q_{i+1}(1, x) + p_{i+1}(1, x)q_{i+1}(0, x) \\ q_i(x) &= q_{i+1}(0, x) q_{i+1}(1, x) \end{align} \]
This follows from how the nodes are laid out, the fact that we index nodes in bit representation from least to most significant bit, and from the definition of addition for the projective representation.
Next, we need to give definitions of \(\tilde{p}_i\) and \(\tilde{q}_i\) that can be proved using sum-check. We will use the following definitions:
\[ \begin{align} \tilde{p}_i(\hat{x}) &= \sum_{x \in \{0,1\}^{k_i}} \mathrm{eq}(x, \hat{x}) (\tilde{p}_{i+1}(0, x) \tilde{q}_{i+1}(1, x) + \tilde{p}_{i+1}(1, x) \tilde{q}_{i+1}(0, x)) \\ \tilde{q}_i(\hat{x}) &= \sum_{x \in \{0,1\}^{k_i}} \mathrm{eq}(x, \hat{x}) \tilde{q}_{i+1}(0, x)\tilde{q}_{i+1}(1, x) \end{align} \]
Notice that unlike the original GKR protocol, both sum-checks are defined over \(x\) of the current layer \(i\) as opposed to over some \(y,z \in \{0,1\}^{k_{i+1}}\) of the next layer \(i+1\). Concretely, we are summing over \(k_i\) variables instead of \(2 \cdot k_{i+1}\), which is radically more efficient. We are able to do this by exploiting the additional structure in the circuit, where a node with index \((x_0, \dots, x_{k_i - 1})\) in layer \(i\) is the result of adding the nodes in layer \(i+1\) with indices \((0, x_0, \dots, x_{k_i - 1})\) and \((1, x_0, \dots, x_{k_i - 1})\). Additionally, since there are no multiplication gates in the circuit, the corresponding term from the original GKR layer polynomial is omitted completely.
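The sketch below illustrates these definitions by evaluating \(\tilde{p}_i\) and \(\tilde{q}_i\) at a point directly from the layer-\(i+1\) tables, summing over the Boolean hypercube. The toy prime \(p = 2^{61} - 1\) and all helper names are our own assumptions for illustration; a real prover would use the STARK's field and much more efficient algorithms:

```rust
// Toy prime field (assumption: p = 2^61 - 1, chosen only so that products fit in u128).
const P: u128 = (1 << 61) - 1;
fn fadd(a: u128, b: u128) -> u128 { (a + b) % P }
fn fsub(a: u128, b: u128) -> u128 { (a + P - b % P) % P }
fn fmul(a: u128, b: u128) -> u128 { (a * b) % P }

/// eq(x, x̂) = Π_j (x_j·x̂_j + (1 - x_j)(1 - x̂_j)).
fn eq(x: &[u128], x_hat: &[u128]) -> u128 {
    x.iter().zip(x_hat).fold(1, |acc, (&a, &b)| {
        fmul(acc, fadd(fmul(a, b), fmul(fsub(1, a), fsub(1, b))))
    })
}

/// Evaluate (p̃_i(x̂), q̃_i(x̂)) from the layer-(i+1) tables `p_next` and `q_next`,
/// following the sum-over-hypercube definitions above. With the LSB-first convention,
/// nodes (0, x) and (1, x) of layer i+1 sit at positions 2x and 2x + 1.
fn layer_evals(p_next: &[u128], q_next: &[u128], x_hat: &[u128]) -> (u128, u128) {
    let k = x_hat.len(); // k_i: number of variables at layer i
    let (mut p_val, mut q_val) = (0, 0);
    for idx in 0..(1usize << k) {
        let x: Vec<u128> = (0..k).map(|j| ((idx >> j) & 1) as u128).collect();
        let w = eq(&x, x_hat);
        let (p0, p1) = (p_next[2 * idx], p_next[2 * idx + 1]);
        let (q0, q1) = (q_next[2 * idx], q_next[2 * idx + 1]);
        p_val = fadd(p_val, fmul(w, fadd(fmul(p0, q1), fmul(p1, q0))));
        q_val = fadd(q_val, fmul(w, fmul(q0, q1)));
    }
    (p_val, q_val)
}

fn main() {
    // Layer i+1 holds the 4 fractions 1/2, 1/3, 1/4, 1/5 (so layer i has 2 nodes, k_i = 1).
    let p_next: [u128; 4] = [1, 1, 1, 1];
    let q_next: [u128; 4] = [2, 3, 4, 5];
    // At the Boolean point x̂ = (0,) we recover node 0 of layer i: 1/2 + 1/3 = 5/6.
    assert_eq!(layer_evals(&p_next, &q_next, &[0]), (5, 6));
}
```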
Recall that the core idea of the GKR protocol is to reduce a claim about the layer polynomial(s) at layer \(i\) to a claim about the layer polynomial(s) at layer \(i+1\). Assume that the claims at layer \(i\) are \(\tilde{p}_i(\rho_i)\) and \(\tilde{q}_i(\rho_i)\), for some random \(\rho_i \in_R \mathbb{F}^{k_i}\). Here, we could execute 2 sum-check protocols to reduce the 2 claims \(\tilde{p}_i(\rho_i)\) and \(\tilde{q}_i(\rho_i)\) to 2 claims \(\tilde{p}_{i+1}(\rho_{i+1})\) and \(\tilde{q}_{i+1}(\rho_{i+1})\), for some random \(\rho_{i+1} \in_R \mathbb{F}^{k_{i+1}}\). However, we will instead reduce the 2 sum-check problems to a single one by taking a random linear combination of both claims. Specifically, given a randomly sampled \(\lambda_i \in \mathbb{F}\),
\[ \begin{align} &\tilde{p}_i(\hat{x}) + \lambda_i \tilde{q}_i(\hat{x}) = \\ &\sum_{x \in \{0,1\}^{k_i}} \mathrm{eq}(x, \hat{x}) [\tilde{p}_{i+1}(0, x) \tilde{q}_{i+1}(1, x) + \tilde{p}_{i+1}(1, x) \tilde{q}_{i+1}(0, x) + \lambda_i \tilde{q}_{i+1}(0, x)\tilde{q}_{i+1}(1, x)] \end{align} \]
Hence, with this trick, we are back to having a single sum-check protocol per layer.
Finally, we are left with showing how to reduce a claim \(\tilde{p}_i(\rho_i) + \lambda_i \tilde{q}_i(\rho_i)\), for some random \(\rho_i \in_R \mathbb{F}^{k_i}\), to a claim \(\tilde{p}_{i+1}(\rho_{i+1}) + \lambda_{i+1} \tilde{q}_{i+1}(\rho_{i+1})\), for some random \(\rho_{i+1} \in_R \mathbb{F}^{k_{i+1}}\) and \(\lambda_{i+1} \in_R \mathbb{F}\).
Hence, starting from the claim \(\tilde{p}_i(\rho_i) + \lambda_i \tilde{q}_i(\rho_i)\), we run the \(k_i\) rounds of sum-check until the last evaluation check:
\[ \begin{align} &\tilde{p}_i(\rho_i) + \lambda_i \tilde{q}_i(\rho_i) = \\ &\sum_{x \in \{0,1\}^{k_i}} \mathrm{eq}(x, \rho_i) [\tilde{p}_{i+1}(0, x) \tilde{q}_{i+1}(1, x) + \tilde{p}_{i+1}(1, x) \tilde{q}_{i+1}(0, x) + \lambda_i \tilde{q}_{i+1}(0, x)\tilde{q}_{i+1}(1, x)] \end{align} \]
In the last step of the sum-check, let \(\gamma \in_R \mathbb{F}^{k_i}\) be the randomness sampled during the \(k_i\) rounds. Hence, the verifier is left with evaluating \(\tilde{p}_{i+1}(0, \gamma)\), \(\tilde{p}_{i+1}(1, \gamma)\), \(\tilde{q}_{i+1}(0, \gamma)\), and \(\tilde{q}_{i+1}(1, \gamma)\). Just as in the original GKR, the prover will send these evaluations as claims. Also, just as in GKR, we need to reduce the 2 claims about \(\tilde{p}_{i+1}\) to a single claim, and the 2 claims about \(\tilde{q}_{i+1}\) to a single claim.
In LogUp-GKR, to reduce the 2 claims about \(\tilde{p}_{i+1}\), the prover only needs to send the 2 claims \(\tilde{p}_{i+1}(0, \gamma)\) and \(\tilde{p}_{i+1}(1, \gamma)\). The verifier samples \(\gamma_0 \in_R \mathbb{F}\), and computes the new claim \(\tilde{p}_{i+1}(\gamma_0, \gamma) = (1-\gamma_0) \cdot \tilde{p}_{i+1}(0, \gamma) + \gamma_0 \cdot \tilde{p}_{i+1}(1, \gamma)\). Similarly for \(\tilde{q}_{i+1}\), the prover also sends the claims \(\tilde{q}_{i+1}(0, \gamma)\) and \(\tilde{q}_{i+1}(1, \gamma)\), which are reduced to \(\tilde{q}_{i+1}(\gamma_0, \gamma) = (1-\gamma_0) \cdot \tilde{q}_{i+1}(0, \gamma) + \gamma_0 \cdot \tilde{q}_{i+1}(1, \gamma)\). Finally, the verifier samples \(\lambda_{i+1} \in_R \mathbb{F}\), yielding the next layer claim \(\tilde{p}_{i+1}(\gamma_0, \gamma) + \lambda_{i+1} \tilde{q}_{i+1}(\gamma_0, \gamma)\).
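A small sketch of the verifier's bookkeeping for this step is shown below; as before, the toy prime and the helper names are illustrative assumptions rather than any particular implementation's API:

```rust
// Toy prime field helpers (same illustrative assumption as before: p = 2^61 - 1).
const P: u128 = (1 << 61) - 1;
fn fadd(a: u128, b: u128) -> u128 { (a + b) % P }
fn fsub(a: u128, b: u128) -> u128 { (a + P - b % P) % P }
fn fmul(a: u128, b: u128) -> u128 { (a * b) % P }

/// Fold the prover's two claims about p̃_{i+1} (and likewise q̃_{i+1}) into one point
/// via multilinearity in the first variable, then batch them with λ_{i+1}:
///   p̃_{i+1}(γ_0, γ) + λ_{i+1} · q̃_{i+1}(γ_0, γ).
fn next_layer_claim(
    p_at_0: u128, p_at_1: u128,       // claimed p̃_{i+1}(0, γ) and p̃_{i+1}(1, γ)
    q_at_0: u128, q_at_1: u128,       // claimed q̃_{i+1}(0, γ) and q̃_{i+1}(1, γ)
    gamma_0: u128, lambda_next: u128, // verifier randomness γ_0 and λ_{i+1}
) -> u128 {
    let p_claim = fadd(fmul(fsub(1, gamma_0), p_at_0), fmul(gamma_0, p_at_1));
    let q_claim = fadd(fmul(fsub(1, gamma_0), q_at_0), fmul(gamma_0, q_at_1));
    fadd(p_claim, fmul(lambda_next, q_claim))
}

fn main() {
    // With γ_0 = 0 the combined claim collapses to p̃_{i+1}(0, γ) + λ_{i+1}·q̃_{i+1}(0, γ).
    assert_eq!(next_layer_claim(7, 9, 3, 5, 0, 2), 7 + 2 * 3);
}
```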
Recall that our goal was to reduce the initial claim \(\tilde{p}_i(\rho_i) + \lambda_i \tilde{q}_i(\rho_i)\) to a single claim \(\tilde{p}_{i+1}(\rho_{i+1}) + \lambda_{i+1} \tilde{q}_{i+1}(\rho_{i+1})\) for some \(\rho_{i+1} \in_R \mathbb{F}^{k_{i+1}}\). We just achieved this with \(\rho_{i+1} = (\gamma_0, \gamma)\).
Hence, by exploiting the graph structure, in LogUp-GKR we are able to have the prover send 2 field elements per polynomial (\(\tilde{p}_{i+1}(0, \gamma)\), \(\tilde{p}_{i+1}(1, \gamma)\), \(\tilde{q}_{i+1}(0, \gamma)\) and \(\tilde{q}_{i+1}(1, \gamma)\)) instead of the \(k_{i+1}\) field elements that represent the \(q\) polynomial in the original GKR.
### The input layer
Recall that we are interested in the use of LogUp-GKR where the values \(m_{a_{ij}}\), \(m_{b_{ij}}\), \(a_{ij}\) and \(b_{ij}\) that sit in the input layer are derived from a STARK trace. In order to prove that this is indeed the case, we design the circuit to reduce the evaluation of the circuit to claims about the main trace columns \(\tilde{f}_0(\rho)\), \(\dots\), \(\tilde{f}_{m-1}(\rho)\), given a randomly sampled \(\rho \in_R \mathbb{F}^{\log n}\). Then, in the next article, we will show how to prove these claims with AIR constraints.
Therefore, this additional requirement means that we need to reduce the claim about the 2nd layer (\(\tilde{p}_1(\rho_1) + \lambda_1 \tilde{q}_1(\rho_1)\) in our example circuit) differently from all other layers. Concretely, again referring to our example circuit, applying the same algorithm as for all other circuit layers would reduce the 2nd layer claim to a claim \(\tilde{p}_2(\rho_2) + \lambda_2 \tilde{q}_2(\rho_2)\), for some random \(\rho_2 \in_R \mathbb{F}^{k_2}\). This is not quite what we need.
Recall that the input layer holds the values \(m_{a_{ij}}\), \(m_{b_{ij}}\), \(a_{ij}\) and \(b_{ij}\), for \(0 \leq i < n\) and \(0 \leq j < k\). That is, \(i\) selects the trace row, and \(j\) selects the fraction generated at row \(i\).
Let \(z \in \{0,1\}^{\log k}\) be the bit decomposition of \(j\). For example, if \(j = 4\), then \(z = (0,0,1)\), following our previously discussed convention that the least significant bit comes first.
Next, we will introduce \(g_{[z]}(\tilde{\omega}(x))\), which captures the numerators of a given row, and \(h_{[z]}(\tilde{\omega}(x))\), which captures the denominators. In our example circuit, using a slight abuse of notation, we would have
\[ \begin{align} g_0(\tilde{\omega}(x)) &= m_{a_{x0}} \\ g_1(\tilde{\omega}(x)) &= m_{b_{x0}} \\ h_0(\tilde{\omega}(x)) &= \alpha - a_{x0} \\ h_1(\tilde{\omega}(x)) &= \alpha - b_{x0}\\ \end{align} \]
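For concreteness, here is one hypothetical way \(g_{[z]}\) and \(h_{[z]}\) could look in code for the example circuit, assuming a trace whose columns happen to be laid out as \(f_0 = m_a\), \(f_1 = a\), \(f_2 = m_b\), \(f_3 = b\) (this layout, the toy prime, and the names below are our own assumptions):

```rust
// Toy prime field (illustrative assumption: p = 2^61 - 1).
const P: u128 = (1 << 61) - 1;
fn fsub(a: u128, b: u128) -> u128 { (a + P - b % P) % P }

/// Numerators of the two fractions generated at a row: g_0 = m_{a_x}, g_1 = m_{b_x}.
/// `omega_x` holds the tuple ω̃(x) = (f̃_0(x), ..., f̃_{m-1}(x)); the column layout is assumed.
fn g(z: usize, omega_x: &[u128]) -> u128 {
    match z {
        0 => omega_x[0], // m_{a_x}: multiplicity column for the a-side
        1 => omega_x[2], // m_{b_x}: multiplicity column for the b-side
        _ => unreachable!("the example circuit has only two fractions per row"),
    }
}

/// Denominators of the two fractions: h_0 = α - a_x, h_1 = α - b_x.
fn h(z: usize, omega_x: &[u128], alpha: u128) -> u128 {
    match z {
        0 => fsub(alpha, omega_x[1]), // α - a_x
        1 => fsub(alpha, omega_x[3]), // α - b_x
        _ => unreachable!("the example circuit has only two fractions per row"),
    }
}

fn main() {
    // One row of a hypothetical trace: m_a = 2, a = 10, m_b = 1, b = 20, with α = 100.
    let omega_x = [2u128, 10, 1, 20];
    assert_eq!(g(0, &omega_x), 2);
    assert_eq!(h(1, &omega_x, 100), 80);
}
```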
Then, let \(\tilde{p}_{d-2}(\rho_{d-2}) + \lambda_{d-2} \tilde{q}_{d-2}(\rho_{d-2})\) be the claim from the layer just above the input layer (the 2nd layer in our example circuit), where \(d\) is the number of circuit layers, numbered \(0\) to \(d-1\) with layer \(d-1\) being the input layer. We will rewrite the claim as
\[ \tilde{p}_{d-2}(\rho_{z'}, \rho_x) + \lambda_{d-2} \tilde{q}_{d-2}(\rho_{z'}, \rho_x) \]
where \(\rho_{z'} \in_R \mathbb{F}^{\log (k)-1}\) and \(\rho_x \in_R \mathbb{F}^{\log n}\).
Then, the reduction to the input layer can be written as
\[ \begin{align} &\tilde{p}_{d-2}(\rho_{z'}, \rho_x) + \lambda_{d-2} \tilde{q}_{d-2}(\rho_{z'}, \rho_x) = \\ &\sum_{x \in \{0,1\}^{\log n}} \sum_{z' \in \{0,1\}^{\log(k) - 1}} \mathrm{eq}(z', \rho_{z'}) \cdot \mathrm{eq}(x, \rho_x) \\ &\quad \quad [g_{[(0, z')]}(\tilde{\omega}(x)) \cdot h_{[(1, z')]}(\tilde{\omega}(x)) + g_{[(1, z')]}(\tilde{\omega}(x)) \cdot h_{[(0, z')]}(\tilde{\omega}(x)) + \lambda_{d-2} \, h_{[(0, z')]}(\tilde{\omega}(x)) \cdot h_{[(1, z')]}(\tilde{\omega}(x))] \end{align} \]
We only run the sum-check protocol over the \(x\) variables. Then, after the last round of sum-check, let \(\hat{x} \in_R \mathbb{F}^{\log n}\) be the randomness generated during sum-check. The prover sends the claims \(\tilde{\omega}(\hat{x}) = (\tilde{f}_0(\hat{x}), \dots, \tilde{f}_{m-1}(\hat{x}))\), which the verifier uses to complete the final evaluation of sum-check (the remaining sum over \(z'\) is small enough for the verifier to evaluate directly).
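To tie the pieces together, here is a hedged sketch of the verifier's final evaluation for the input-layer sum-check in the example circuit, where \(z'\) is empty (two fractions per row); it reuses the hypothetical column layout and toy field from the earlier sketches, and none of the names below come from an actual implementation:

```rust
// Toy prime field (illustrative assumption: p = 2^61 - 1).
const P: u128 = (1 << 61) - 1;
fn fadd(a: u128, b: u128) -> u128 { (a + b) % P }
fn fsub(a: u128, b: u128) -> u128 { (a + P - b % P) % P }
fn fmul(a: u128, b: u128) -> u128 { (a * b) % P }

/// eq(x, x̂) = Π_j (x_j·x̂_j + (1 - x_j)(1 - x̂_j)).
fn eq(x: &[u128], x_hat: &[u128]) -> u128 {
    x.iter().zip(x_hat).fold(1, |acc, (&a, &b)| {
        fmul(acc, fadd(fmul(a, b), fmul(fsub(1, a), fsub(1, b))))
    })
}

/// Final evaluation for the input-layer sum-check in the example circuit (z' empty),
/// assuming the hypothetical trace layout f_0 = m_a, f_1 = a, f_2 = m_b, f_3 = b.
/// `omega_hat` holds the prover's claims ω̃(x̂) = (f̃_0(x̂), ..., f̃_3(x̂)).
fn final_input_layer_check(
    omega_hat: &[u128], // prover's claimed column evaluations at x̂
    x_hat: &[u128],     // sum-check randomness x̂
    rho_x: &[u128],     // the point ρ_x from the claim being reduced
    alpha: u128,        // LogUp challenge α
    lambda: u128,       // batching challenge λ_{d-2}
) -> u128 {
    let (g0, g1) = (omega_hat[0], omega_hat[2]); // m_{a_x̂}, m_{b_x̂}
    let (h0, h1) = (fsub(alpha, omega_hat[1]), fsub(alpha, omega_hat[3])); // α - a_x̂, α - b_x̂
    let inner = fadd(fadd(fmul(g0, h1), fmul(g1, h0)), fmul(lambda, fmul(h0, h1)));
    fmul(eq(x_hat, rho_x), inner)
}

fn main() {
    // Smoke test with arbitrary values; the verifier would compare this against the
    // value output by the last round of the sum-check protocol.
    let v = final_input_layer_check(&[2, 10, 1, 20], &[3, 7], &[3, 7], 100, 5);
    println!("{}", v);
}
```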
And… we're done! We are left with the claims \((\tilde{f}_0(\hat{x}), \dots, \tilde{f}_{m-1}(\hat{x}))\) to prove, which will be proved using AIR constraints. We will discuss this in the next article.