This note is intended to be a condensed description of the GKR protocol. For a more gentle introduction to the subject, it may be useful to first read my GKR introduction note. I will also assume familiarity with the sum-check protocol, as well as the multilinear extension.
GKR is a recursively defined protocol where a prover convinces a verifier of a correct evaluation of a predetermined layered circuit.
Below is an example of a layered circuit:
In general, GKR supports layered circuits of any number \(d\) of layers, where each node in layer \(i\) is connected to 2 nodes in layer \(i+1\), and where those 2 nodes are either added or multiplied together.
Let \(S_i\) be the number of nodes at layer \(i\). We will assume that \(S_i\) is a power of 2, and hence define \(k_i = \log_2(S_i)\) to be the logarithm of \(S_i\).
We will model each layer as a function \(W_i: \{0,1\}^{k_i} \rightarrow \mathbb{F}\), where \(\mathbb{F}\) is a finite field. The domain is the boolean hypercube, and can be thought of as referring to the bit decomposition of node indices. For example, referring to our example circuit above, \(W_2(0,1,1) = y_3\).
Lastly, the formal definition of a layer will be:
\[ W_i(x) = \sum_{b, c \in \{0, 1\}^{k_{i+1}}} add_{i+1}(x, b, c) (W_{i+1}(b) + W_{i+1}(c)) + mul_{i+1}(x, b, c) (W_{i+1}(b) \cdot W_{i+1}(c)), \]
where \(add_{i+1}(x, b, c) = 1\) when \(W_i(x) = W_{i+1}(b) + W_{i+1}(c)\), and \(add_{i+1}(x, b, c) = 0\) otherwise. Similarly, \(mul_{i+1}(x, b, c) = 1\) when \(W_i(x) = W_{i+1}(b) \cdot W_{i+1}(c)\), and \(mul_{i+1}(x, b, c) = 0\) otherwise. You should convince yourself that this layer definition is consistent with our description of the circuit format.
The protocol will work over the multilinear extensions (MLE) of layers, defined as:
\[ \begin{align} \tilde{W}_i(\hat{x}) &= \sum_{x \in \{0, 1\}^{k_i}} eq(\hat{x}, x) W_i \\ &= \sum_{b, c \in \{0, 1\}^{k_{i+1}}} \tilde{add}_{i+1}(\hat{x}, b, c) (\tilde{W}_{i+1}(b) + \tilde{W}_{i+1}(c)) + \tilde{mul}_{i+1}(\hat{x}, b, c) (\tilde{W}_{i+1}(b) \cdot \tilde{W}_{i+1}(c)). \end{align} \]
The proof of why the second equality holds is covered in Thaler's note. This is the definition of \(\tilde{W}_i\) that we will use.
GKR is a recursively-defined protocol, where we show how to reduce a claim about \(\tilde{W}_i\) to a claim about \(\tilde{W}_{i+1}\). The base case occurs with the input layer, where the verifier is assumed to be able to evaluate the layer on its own and verify the final claim.
At the beginning of the protocol, the prover sends \(W_0\) to the verifier, which corresponds to the final evaluation of the circuit. The verifier samples \(\hat{x}_0 \in \mathbb{F}^{k_0}\), and the claim to prove becomes \(\tilde{W}_0(\hat{x}_0)\). That is, by the Schwartz-Zippel lemma, if \(\tilde{W}_0(\hat{x}_0)\) is proven to be correct, then we can treat the entire polynomial \(\tilde{W}_0\) to be correct.
Next, we describe the recursive case of the protocol, where the claim \(\tilde{W}_i(\hat{x}_i)\) is reduced to a claim \(\tilde{W}_{i+1}(\hat{x}_{i+1})\) for a randomly sampled \(\hat{x}_{i+1}\), for \(0 \leq i < d\).
In the recursive case, at layer \(i\), we need to reduce the claim \(\tilde{W}_i(\hat{x}_i)\) for a randomly sampled \(\hat{x}_i \in \mathbb{F}^{k_i}\) to \(\tilde{W}_{i+1}(\hat{x}_{i+1})\) for a randomly sampled \(\hat{x}_{i+1} \in \mathbb{F}^{k_{i+1}}\). Recall the definition of \(\tilde{W}_i(\hat{x}_i)\):
\[ \tilde{W}_i(\hat{x}_i) = \sum_{b, c \in \{0, 1\}^{k_{i+1}}} \tilde{add}_{i+1}(\hat{x}_i, b, c) (\tilde{W}_{i+1}(b) + \tilde{W}_{i+1}(c)) + \tilde{mul}_{i+1}\hat{x}_i, b, c) (\tilde{W}_{i+1}(b) \cdot \tilde{W}_{i+1}(c)). \]
Notice that the right-hand side is a sum-check problem over \(b,c\). Or, rewritten slightly,
\[ C_i = \sum_{b, c \in \{0, 1\}^{k_{i+1}}} g(b, c), \]
where \(C_i = \tilde{W}_i(\hat{x}_i)\), and \[g(b, c) = \tilde{add}_{i+1}(\hat{x}_i, b, c) (\tilde{W}_{i+1}(b) + \tilde{W}_{i+1}(c)) + \tilde{mul}_{i+1}(\hat{x}_i, b, c) (\tilde{W}_{i+1}(b) \cdot \tilde{W}_{i+1}(c)).\]
We run the sum-check protocol until the final evaluation of \(g\). We will call the random values sampled during sum-check as \(\rho_b \in_R \mathbb{F}^{k_{i+1}}\) and \(\rho_c \in_R \mathbb{F}^{k_{i+1}}\) corresponding to \(b\) and \(c\), respectively.
At the end of the sum-check protocol, the verifier needs to evaluate \(g(\rho_b, \rho_c)\). However, to compute this value, the verifier needs to know \(\tilde{W}_{i+1}(\rho_b)\) and \(\tilde{W}_{i+1}(\rho_c)\), which it does not have access to. Recall that our goal is to reduce \(\tilde{W}_i(\hat{x}_i)\) to a single \(\tilde{W}_{i+1}(\hat{x}_{i+1})\) claim, for some \(\hat{x}_{i+1}\); but we currently have 2 such claims. Therefore, we next describe how to reduce the 2 claims \(\tilde{W}_{i+1}(\rho_b)\) and \(\tilde{W}_{i+1}(\rho_c)\) to a single claim.
We define the line \(l_{i+1}: \mathbb{F} \rightarrow \mathbb{F}^{k_{i+1}}\) as \(l_{i+1}(x) = (1-x) \cdot \rho_b + x \cdot \rho_c\). Notice that \(l_{i+1}(0) = \rho_b\) and \(l_{i+1}(1) = \rho_c\).
Then, the prover sends the univariate polynomial \(q(x): \mathbb{F} \rightarrow \mathbb{F}\) which is claimed to be equal to \(q(x) = \tilde{W}_{i+1} \circ l_{i+1}(x)\). The verifier uses the values \(q(0)\) for \(\tilde{W}_{i+1}(\rho_b)\) and \(q(1)\) for \(\tilde{W}_{i+1}(\rho_c)\). We are then left with proving that \(q(x)\) is as claimed, which we do using the Schwartz-Zippel lemma once again. The verifier samples \(r_{i+1} \in \mathbb{F}\), and asks the prover to prove \(q(r_{i+1})\).
However, recall that \(q(x) = \tilde{W}_{i+1} \circ l_{i+1}(x)\). Hence,
\[ \begin{align} q(r_{i+1}) &= \tilde{W}_{i+1} \circ l_{i+1}(r_{i+1}) \\ &= \tilde{W}_{i+1}(l_{i+1}(r_{i+1})) \\ &= \tilde{W}_{i+1}(\hat{x}_{i+1}), \end{align} \]
where \(\hat{x}_{i+1} = l_{i+1}(r_{i+1})\). Therefore, we can use the GKR protocol once again to prove the claim \(\tilde{W}_{i+1}(\hat{x}_{i+1})\). Or, in other words, we have successfully reduced the claim \(\tilde{W}_i(\hat{x}_i)\) to a new claim \(\tilde{W}_{i+1}(\hat{x}_{i+1})\).
Note: the maximum degree of \(q(x)\) is \(k_{i+1}\), and hence "sending \(q(x)\)" means sending \(k_{i+1}\) field elements. As we will see in the next note about LogUp-GKR, if we assume more structure in our circuit, we are able to modify the GKR protocol in order to only send 2 field elements - namely only the 2 evaluation claims of \(\tilde{W}_{i+1}\).
The base case occurs at the input layer \(W_{d-1}\), where we have to verify the claim \(\hat{W}_{d-1}(\hat{x}_{d-1})\). The verifier can simply evaluate this directly, since we assume that the verifier has access to the input layer.
We end with a brief note about terminology. What we presented here was the original GKR as described in the 2015 paper (modulo the abovementioned modification by Thaler). Since then however, many different extensions and modifications of the protocol were discovered; we typically still refer to those as "GKR".
For example, as we will see in LogUp-GKR, we assume more structure about our circuit, which allows us to define the layer polynomial such that the sum-check is more efficient. We also interpolate 2 polynomials per layer instead of 1. Lastly, the verifier doesn't typically have access to the input layer, which we then need to make further reductions. This last point can arguably be considered outside of GKR, but we include here to give an intuition of how the protocol is used in practice.