# GKR reading note
[toc]
## 1. Intuition
### 1.1 zero check / testing
- [x] how to encode $N$ constraints in uni-variate zkp (pairing-based / hash-based)?
- [x] a uni-variate polynomial $f(X) = \sum_{i=0}^{N-1} a_i * X^i$ of degree less than $N$ can be uniquely represented by its evaluations over the set $T = \{w^i: 0 \leq i < N\}$, where $w$ is a primitive $N$-th root of unity
- $\{ f(w^i): 0 \le i < N\}$
- We can use Lagrange interpolation to convert between evaluation form and coefficient form.
- for each $w^i$, we have $f(w^i) * g(w^i) = h(w^i)$;
- this is equivalent to $f(X)*g(X)-h(X) = 0$ for $X \in T$
- this is equivalent to $(X - w^i)$ divides $f(X)*g(X) - h(X)$ for each i;
- this is equivalent to $$Z_T(X) = \prod_{i=0}^{N-1} (X - w^i) = (X^N - 1)$$ divides $f(X)*g(X) - h(X)$;
- this is equivalent to
> there exists a polynomial $q(X)$ s.t. $q(X)*Z_T(X) = f(X)*g(X) - h(X)$.
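The divisibility argument can be sanity-checked numerically. Below is a toy sketch over $F_{17}$ with $N = 4$ and $w = 4$ (which has multiplicative order 4 mod 17); all helper functions are hand-rolled for illustration, not from any library.

```python
# Toy check of: Z_T(X) = X^N - 1 divides f(X)*g(X) - h(X) iff the N
# constraints f(w^i)*g(w^i) = h(w^i) hold. Polynomials are coefficient
# lists, lowest degree first.
P = 17
N = 4
W = 4  # 4 has multiplicative order 4 mod 17, so T = {1, 4, 16, 13}

def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % P
    return out

def poly_sub(a, b):
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a)); b = b + [0] * (n - len(b))
    return [(x - y) % P for x, y in zip(a, b)]

def poly_divmod(num, den):
    """Long division in F_p[X]; returns (quotient, remainder)."""
    num = num[:]
    inv_lead = pow(den[-1], P - 2, P)  # modular inverse via Fermat
    q = [0] * max(1, len(num) - len(den) + 1)
    for i in range(len(num) - len(den), -1, -1):
        c = (num[i + len(den) - 1] * inv_lead) % P
        q[i] = c
        for j, dj in enumerate(den):
            num[i + j] = (num[i + j] - c * dj) % P
    return q, num[:len(den) - 1]

def evaluate(poly, x):
    acc = 0
    for c in reversed(poly):
        acc = (acc * x + c) % P
    return acc

f = [3, 1, 4, 1]          # arbitrary degree-<N polynomials
g = [5, 9, 2, 6]
fg = poly_mul(f, g)
# choose h to agree with f*g on T: reduce f*g mod (X^N - 1), using X^N = 1 on T
h = fg[:N]
for i in range(N, len(fg)):
    h[i - N] = (h[i - N] + fg[i]) % P

Z_T = [P - 1] + [0] * (N - 1) + [1]   # X^N - 1
q, r = poly_divmod(poly_sub(fg, h), Z_T)
assert all(c == 0 for c in r)          # Z_T divides f*g - h
for i in range(N):                     # and the N constraints indeed hold on T
    wi = pow(W, i, P)
    assert evaluate(f, wi) * evaluate(g, wi) % P == evaluate(h, wi)
```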
- [x] how to encode $N$ constraints in GKR?
- we can encode the $3N$ values ($N = 2^n$) into 3 multilinear polynomials $f(\vec{X}), g(\vec{X}), h(\vec{X})$ via interpolation over the hypercube $B_n = \{0, 1\}^n$;
- for each element $\vec{b} = (b_1, \ldots, b_n)$ in the boolean hypercube, we have $$f(\vec{b}) * g(\vec{b}) - h(\vec{b}) = 0.$$
- this is equivalent to: $c(\vec{X}) = f(\vec{X})*g(\vec{X}) - h(\vec{X})$ evaluates to 0 over the boolean hypercube. Note that $c(\vec{X})$ is **not** a multilinear polynomial.
- But MLE $\tilde{c}(\vec{X})$ of $c(\vec{X})$ **is** a multilinear polynomial.$$\tilde{c}(\vec{X}) = \sum_{b \in B^n} \textrm{eq}(\vec{\textbf{b}}, \vec{X})*[f(\vec{b})*g(\vec{b}) - h(\vec{b})].$$
- the above $N$ constraints imply that $\tilde{c}(\vec{X})$ is zero polynomial:
- [x] By [Schwartz-Zippel lemma](https://en.wikipedia.org/wiki/Schwartz%E2%80%93Zippel_lemma), we know that the probability that a non-zero multilinear polynomial $p(\vec{X})$ evaluates to 0 at a random point $\vec{\beta} \in F^n$ is no larger than $\frac{n}{|F|}$ (an $n$-variate multilinear polynomial has total degree at most $n$). This means that if a multilinear polynomial $p(\vec{X})$ evaluates to 0 at a random point, then we have $p(\vec{X}) = 0$ with high probability (at least $1-\frac{n}{|F|}$).
Therefore we can use this zero testing technique:
> sample a random challenge $\vec{\beta} \in F^n$ and check if $\tilde{c}(\vec{\beta}) = 0$.
- Therefore the zero check of $\tilde{c}(\vec{X})$ is done via
$$\textrm{prover.sumcheck}(\textrm{poly} = eq(\vec{x}, \vec{\beta})*[f(\vec{x})*g(\vec{x}) - h(\vec{x})], \textrm{sum} = 0)$$
### 1.2 MLE
MLE of a function $g: \{0, 1\}^n \rightarrow F$ defined on the boolean hypercube.
- Background
$$\textrm{eq}(x_1, \ldots, x_n, y_1, \ldots, y_n) = \prod_{i=1}^n [(x_i * y_i) + (1-x_i)*(1-y_i)]$$
It's easy to see that for any two elements $(\vec{X}, \vec{Y})$ from the boolean hypercube $B^n$: $\textrm{eq}(\vec{X}, \vec{Y}) = 1$ iff $\vec{X} = \vec{Y}$, and $0$ otherwise.
- Lagrange basis for $n$-variate multilinear polynomials
- [x] how to uniquely recover a $n$-variate multilinear polynomial?
> Each $n$-variate multilinear polynomial is uniquely determined by its evaluations ($2^n$ elements) over boolean hypercube!
$f(X_1, X_2, \ldots, X_{n-1}, X_n) = f_0(X_1, \ldots, X_{n-1}) + f_1(X_1, \ldots, X_{n-1}) * X_n$
1. $f_0(x_1, \ldots, x_{n-1}) = f(x_1, \ldots, x_{n-1}, 0)$
2. $f_1(x_1, \ldots, x_{n-1}) = f(x_1, \ldots, x_{n-1}, 1) - f(x_1, \ldots, x_{n-1}, 0)$.
We can prove this using induction starting with the simple fact
> a linear uni-variate polynomial can be uniquely recovered by **two** evaluations.
- the Lagrange basis polynomial is $L_{\vec{\textbf{b}}}(\vec{X}) = \textrm{eq}(\vec{\textbf{b}}, \vec{X})$: on the hypercube it equals 1 if $\vec{X} = \vec{\textbf{b}}$ and 0 otherwise.
- MLE of $g$ (denoted by $\tilde{g}$) is a $n$-variate multilinear polynomial: $$\tilde{g}(\vec{X}) = \sum_{\vec{\textbf{b}} \in B^n} \textrm{eq}(\vec{\textbf{b}}, \vec{X})*g(\vec{\textbf{b}}).$$
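A minimal numeric sketch of $\textrm{eq}$ and MLE evaluation over a small prime field (the field size, function names, and example values are all made up for illustration):

```python
from itertools import product

P = 97  # a small prime field, chosen arbitrarily for the example

def eq(xs, ys):
    """eq(x, y) = prod_i [x_i*y_i + (1-x_i)*(1-y_i)]; equals 1 iff x == y
    when both points lie on the boolean hypercube."""
    acc = 1
    for x, y in zip(xs, ys):
        acc = acc * ((x * y + (1 - x) * (1 - y)) % P) % P
    return acc

def mle_eval(evals, point):
    """Evaluate the MLE of g at `point`, given g's 2^n values over {0,1}^n.
    evals is listed in lexicographic order of (b_1, ..., b_n)."""
    n = len(point)
    total = 0
    for idx, b in enumerate(product([0, 1], repeat=n)):
        total = (total + eq(b, point) * evals[idx]) % P
    return total

g_evals = [3, 1, 4, 1]  # g over {0,1}^2 in order (0,0),(0,1),(1,0),(1,1)
# on the hypercube, the MLE agrees with g itself
assert mle_eval(g_evals, (0, 0)) == 3
assert mle_eval(g_evals, (1, 0)) == 4
```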
### 1.3 Sumcheck
Given an $n$-variate polynomial $f(\vec{X}) \in F[\vec{X}]$, viewed as a map $F^n \rightarrow F$, the transcript generated by the sumcheck protocol is:
- for each round i from 1 to n,
- P sends V: $\big[f^i(0), f^i(1), f^i(2), \ldots, f^i(d) \big]$, where $d$ is the degree of $f$ in a single variable and $$f^i(X) = \sum_{\vec{b} \in B_{n-i}} f(r_1, \ldots, r_{i-1}, X, \vec{b}).$$
- In each round, the sumcheck prover needs to do summation over $2^{n-i}$ elements from hypercube $B_{n-i}$.
- V sends P: challenge $r_i$;
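The round structure above can be exercised end to end on a toy multilinear $f$; the sketch below fixes the verifier challenges up front instead of sampling them (no Fiat–Shamir), and every name is illustrative:

```python
from itertools import product

P = 97

def f(point):  # example 3-variate multilinear polynomial
    x1, x2, x3 = point
    return (x1 * x2 + 2 * x3 + 5) % P

n = 3
claimed_sum = sum(f(b) for b in product([0, 1], repeat=n)) % P

rs = []                       # verifier challenges r_1, ..., r_n
challenges = [7, 11, 13]      # fixed here instead of sampled
current = claimed_sum
for i in range(n):
    # prover: univariate f^i(X) = sum over the remaining boolean suffix;
    # f is multilinear, so f^i has degree <= 1 and two evaluations suffice
    def f_i(x):
        tail = product([0, 1], repeat=n - i - 1)
        return sum(f(tuple(rs) + (x,) + b) for b in tail) % P
    e0, e1 = f_i(0), f_i(1)
    assert (e0 + e1) % P == current      # verifier's round check
    r = challenges[i]
    current = (e0 + r * (e1 - e0)) % P   # evaluate the degree-1 poly at r
    rs.append(r)

# final check: one evaluation of f at the random point
assert f(tuple(rs)) % P == current
```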
## 2. textbook GKR for layered circuit
- textbook GKR: gkr08 paper, cmt12 paper
- the relation between values at layer $i$ and layer $i+1$ is that for each gate $g \in B_{s}$,
$$
\begin{eqnarray}
\tilde{v}_i(g) &=& \sum_{b, c \in \{0, 1\}^s} \tilde{\textrm{add}}(g, b, c) * \big(\tilde{v}_{i+1}(b) + \tilde{v}_{i+1}(c)\big) + \\ & &\tilde{\textrm{mul}}(g, b, c) * \big(\tilde{v}_{i+1}(b) * \tilde{v}_{i+1}(c)\big).
\end{eqnarray}
$$
Since both sides of the above equation are multilinear polynomials in $g$ that agree on the whole hypercube, we know that **this equation holds for any $z \in F^s$** (two multilinear polynomials agreeing on $B_s$ are identical).
$$
\begin{eqnarray}
\tilde{v}_i(z) &=& \sum_{b, c \in \{0, 1\}^s} \tilde{\textrm{add}}(z, b, c) * \big(\tilde{v}_{i+1}(b) + \tilde{v}_{i+1}(c)\big) + \\ & &\tilde{\textrm{mul}}(z, b, c) * \big(\tilde{v}_{i+1}(b) * \tilde{v}_{i+1}(c)\big).
\end{eqnarray}
$$
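A tiny numeric check of the layer relation (the add gate sums its inputs, the mul gate takes their product), on a 2-gate layer with $s = 1$; the dict-based wiring tables are purely illustrative:

```python
# Layer i has 2 gates: gate 0 = v1(0) + v1(1) (add), gate 1 = v1(0) * v1(1)
# (mul). The wiring predicates add/mul are sparse 0/1 tables.
P = 97
v1 = [3, 5]              # layer i+1 values
add = {(0, 0, 1): 1}     # add(g, b, c) = 1 iff gate g adds v1[b] and v1[c]
mul = {(1, 0, 1): 1}     # mul(g, b, c) = 1 iff gate g multiplies v1[b], v1[c]

def v0(g):               # right-hand side of the layer relation at gate g
    total = 0
    for b in range(2):
        for c in range(2):
            total += add.get((g, b, c), 0) * (v1[b] + v1[c])
            total += mul.get((g, b, c), 0) * (v1[b] * v1[c])
    return total % P

assert v0(0) == 8        # 3 + 5
assert v0(1) == 15       # 3 * 5
```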
- MLE math:
1. $\tilde{\textrm{add}}(z, b, c) = \sum_g \textrm{eq}(g, z) \cdot \textrm{add}(g, b, c)$.
2. closed form for $\tilde{\textrm{add}}(z, x, y)$:
- For any $p \in \{0,1\}^{3s}$, we define $p_0 = p[0:s], p_1 = p[s:2s], p_2 = p[2s:3s]$;
- For any $z, x, y \in F^s$, we have $$\tilde{\textrm{add}}(\vec{z}, \vec{x}, \vec{y}) = \sum_{p \in \{0, 1\}^{3s}: \textrm{add}(p) =1} \textrm{eq}(p_0, \vec{z}) \cdot \textrm{eq}(p_1, \vec{x}) \cdot \textrm{eq}(p_2, \vec{y}).$$
- [x] problem: the selector $\tilde{\textrm{add}}(\vec{z}, \vec{b}, \vec{c})$ has $2s$ variables. In round $j$ of the sumcheck prover, a naive implementation does a summation over $2^{2s-j}$ terms. how to reduce the time from $O(S^3)$ to $O(S \log S)$?
- Check **appendix A** of [CMT12 paper](https://arxiv.org/abs/1105.2003).
- Let's define $\vec{p}_j = (p_{s+j+1}, \ldots, p_{3s})$, the suffix of $p$ matching the not-yet-bound sumcheck variables;
- Let $\vec{r} = (r_1, \ldots, r_{j-1}) \in F^{j-1}$ and $\vec{b} = (b_{j+1}, \ldots, b_{2s}) \in \{0,1\}^{2s-j}$, then we have
$$
\begin{eqnarray}
\sum_{\vec{b} \in \{0,1\}^{2s-j}} \tilde{\textrm{add}}(\vec{z}, \vec{r}, X_j, \vec{b}) & = & \sum_{p \in \{0, 1\}^{3s}: \textrm{add}(p) =1} \sum_{\vec{b}} \chi_p(\vec{z}, \vec{r}, X_j, \vec{b}) \\
& = & \sum_{p \in \{0, 1\}^{3s}: \textrm{add}(p) =1} \chi_p(\vec{z}, \vec{r}, X_j, \vec{p}_j)
\end{eqnarray}
$$
- $\chi_p$ has the nice closed form (the suffix factors $\textrm{eq}(p_{s+i}, p_{s+i})$ for $i > j$ all equal 1 and are dropped)
$$
\chi_p(\vec{z}, \vec{r}, X_j, \vec{p}_j)= \textrm{eq}(X_{j}, p_{s+j}) \prod_{i=1}^s \textrm{eq}(p_i, z_i) \prod_{i=1}^{j-1} \textrm{eq}(p_{s+i}, r_i)
$$
Therefore we can store `eq_rand[p]` for each $p$ and update it incrementally: `eq_rand[p] *= eq(p[s+j], r_j)`.
- Therefore, the GKR prover runs in $O(2^s)$ time for each round.
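The incremental bookkeeping can be sketched as follows: one running product per wiring tuple $p$ with $\textrm{add}(p) = 1$, initialized with the $z$-factors and multiplied by one $\textrm{eq}$ factor per round (the field, gate list, and function names are all made up; indices are 0-based here):

```python
P = 97

def eq1(a, b):
    """Single-coordinate eq: a*b + (1-a)*(1-b) mod P."""
    return (a * b + (1 - a) * (1 - b)) % P

s = 1
# toy sparse wiring: the tuples p in {0,1}^{3s} with add(p) = 1
gates = [(0, 1, 1), (1, 0, 0)]

# initialize with the z-part: prod_{i=1..s} eq(p_i, z_i)
z = [5]
eq_rand = {p: 1 for p in gates}
for p in gates:
    for i in range(s):
        eq_rand[p] = eq_rand[p] * eq1(p[i], z[i]) % P

def absorb_round(j, r):
    """After the verifier's round-(j+1) challenge r (j is 0-based),
    fold the factor eq(p_{s+j+1}, r) into every running product."""
    for p in gates:
        eq_rand[p] = eq_rand[p] * eq1(p[s + j], r) % P

absorb_round(0, 3)   # fold in the first challenge r_1 = 3
```

Each round then touches only the gates with $\textrm{add}(p) = 1$, which is the source of the claimed per-round speedup.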
- [ ] The correctness of $\tilde{v}_i(z)$ is reduced to the correctness of evaluations of $\tilde{v}_{i+1}(X)$ at points $\{ \omega_1, \omega_2 \}$. how to merge these two into one?
- [x] linear time GKR for layered circuit: [Libra19 paper](https://eprint.iacr.org/2019/317)
- we can write the gkr layer reduction formula in another form by defining one helper multilinear polynomial $g_z(X)$:
- $g_z(X) = \sum_{c \in \{0,1\}^s} \tilde{\textrm{mul}}(z, X, c) * \tilde{v}_{i+1}(c)$;
- the textbook GKR relation (keeping only the mul term, for simplicity) can then be rewritten as
$$
\begin{eqnarray}
\tilde{v}_i(z) & = & \sum_{b, c} \tilde{\textrm{mul}}(z, b, c)*\tilde{v}_{i+1}(b) * \tilde{v}_{i+1}(c) \\
& = & \sum_b \tilde{v}_{i+1}(b) * \sum_c \tilde{\textrm{mul}}(z, b, c) * \tilde{v}_{i+1}(c) \\
& = & \sum_b \tilde{v}_{i+1}(b) * g_z(b)
\end{eqnarray}
$$
- **Therefore we can replace the old $2s$-variate sumcheck with two new $s$-variate sumchecks**.
- phase/step 1 sumcheck:
$$\tilde{v}_i(z) = \sum_b \tilde{v}_{i+1}(b) * g_z(b)$$ reduces the claim to evaluations of $\tilde{v}_{i+1}(x)$ and $g_z(x)$ at a random $x \in F^s$;
- phase/step 2 sumcheck:
$$g_z(x) = \sum_{c \in \{0,1\}^s} \tilde{\textrm{mul}}(z, x, c) * \tilde{v}_{i+1}(c)$$ reduces the claim to evaluations of $\tilde{\textrm{mul}}(z, x, y)$ and $\tilde{v}_{i+1}(y)$ at a random $y \in F^s$.
- [x] how to get $g_z(X)$'s MLE repr?
- Since $\tilde{\textrm{mul}}(z, b, c) = \sum_{a \in B_s} \textrm{eq}(z, a) \textrm{mul}(a, b, c)$, we have for each $b \in B_s$
$$
\begin{eqnarray}
g_z(b) & = & \sum_c \tilde{\textrm{mul}}(z, b, c) * \tilde{v}_{i+1}(c) \\
& = & \sum_{a, c \in B_s} \textrm{eq}(z, a) * \textrm{mul}(a, b, c) * \tilde{v}_{i+1}(c);
\end{eqnarray}
$$
- Let $P_b = \{(a, c): \textrm{mul}(a, b, c) = 1\}$ be the set of pairs such that gate $a$ at layer $i$ has gate $b$ at layer $i+1$ as left input (with right input $c$).
- We can then sum only $|P_b|$ terms:
$$
\begin{eqnarray}
g_z(b) & = & \sum_{(a, c) \in P_b} \textrm{eq}(z, a) * \tilde{v}_{i+1}(c);
\end{eqnarray}
$$
- For a random vector $z \in F^s$, we can compute $\textrm{eq}(z, a)$ for all $a \in B_s$ in $O(2^s)$ time.
- since $\sum_b |P_b| \le 2^s$, we have the desired property that getting $g_z(b)$ for all $b$ runs in $O(2^s)$ time.
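Under these definitions, filling the whole table of $g_z(b)$ values is one pass over the sparse gate list; a sketch with made-up stand-in tables for $\textrm{eq}(z, \cdot)$ and $\tilde{v}_{i+1}$:

```python
P = 97
s = 2

# toy sparse wiring: mul(a, b, c) = 1 exactly for these index triples
mul_gates = [(0, 1, 2), (3, 2, 1)]

G_z = [10, 20, 30, 40]   # stand-in for the precomputed table eq(z, a)
v_next = [1, 2, 3, 4]    # stand-in for v_{i+1}(c) over the hypercube

# one pass over the gate list: total work is |gates| <= 2^s terms
g_z = [0] * (1 << s)
for a, b, c in mul_gates:
    g_z[b] = (g_z[b] + G_z[a] * v_next[c]) % P
```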
- [x] for each $c \in B_s$, how to compute $\tilde{\textrm{mul}}(z, x, c)$ efficiently?
- $\tilde{\textrm{mul}}(z, x, c) = \sum_{a, b} \textrm{eq}(z, a) * \textrm{eq}(x, b) * \textrm{mul}(a, b, c)$.
- Let $Q_c = \{(a, b): \textrm{mul}(a, b, c) = 1\}$ be the set of pairs such that gate $a$ at layer $i$ has gate $c$ at layer $i+1$ as right input (with left input $b$).
- `G_z[a] = eq(z, a)` is stored in $O(2^s)$ space and computed in $O(2^s)$ time.
- Similarly, `G_x[a] = eq(x, a)` can also be stored in $O(2^s)$ space.
- Therefore $\tilde{\textrm{mul}}(z, x, c) = \sum_{(a, b) \in Q_c} \textrm{eq}(z, a) * \textrm{eq}(x, b)$.
- [ ] how does libra work with data parallel circuit?
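The $O(2^s)$ precomputation of the tables `G_z[a] = eq(z, a)` used in both phases can be done with the standard prefix-doubling trick; a minimal sketch (field size and names are illustrative):

```python
P = 97

def build_eq_table(z):
    """Return the table [eq(z, a) for a in {0,1}^s], with a_1 as the most
    significant bit, built in O(2^s) field operations."""
    table = [1]
    for zi in z:
        nxt = []
        for v in table:
            nxt.append(v * (1 - zi) % P)  # extend the point with a_i = 0
            nxt.append(v * zi % P)        # extend the point with a_i = 1
        table = nxt
    return table

T = build_eq_table([7, 11])
# sanity: sum_a eq(z, a) = 1 for any z
assert sum(T) % P == 1
```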
## 3. GKR for layered data parallel circuit
- $2^b$ is the number of instances; $s$ is the number of variables per layer in the inner circuit.
- The relation between layer $i$ and layer $i+1$ is
$$
\begin{eqnarray}
\tilde{v}_i(g_1, g_2) & = & \sum_{b_1, c_1 \in B_s} \tilde{\textrm{mul}}(g_1, b_1, c_1) * \tilde{v}_{i+1}(b_1, g_2) * \tilde{v}_{i+1}(c_1, g_2)
\end{eqnarray}
$$
With MLE, we have
$$
\begin{eqnarray}
\tilde{v}_i(z_1, z_2) & = & \sum_{g_1 \in B_s, g_2 \in B_b} \textrm{eq}(z_1, g_1) * \textrm{eq}(z_2, g_2) \sum_{b_1, c_1} \tilde{\textrm{mul}}(g_1, b_1, c_1) * \tilde{v}_{i+1}(b_1, g_2) * \tilde{v}_{i+1}(c_1, g_2)
\end{eqnarray}
$$
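A small numeric check of this data-parallel relation with $s = 1$, $b = 1$: two instances share one mul-gate wiring, and the relation is evaluated per instance (all tables are made-up toy values):

```python
P = 97
v1 = {(0, 0): 3, (1, 0): 5,     # v_{i+1}(b1, g2): gate b1 of instance g2
      (0, 1): 7, (1, 1): 11}
mul = {(0, 0, 1): 1}            # mul(g1, b1, c1): gate 0 multiplies gates 0, 1

def v0(g1, g2):                 # right-hand side of the relation
    total = 0
    for b1 in range(2):
        for c1 in range(2):
            total += mul.get((g1, b1, c1), 0) * v1[(b1, g2)] * v1[(c1, g2)]
    return total % P

assert v0(0, 0) == 15           # instance 0: 3 * 5
assert v0(0, 1) == 77           # instance 1: 7 * 11
```

Note the wiring predicate $\tilde{\textrm{mul}}$ depends only on the inner-circuit index $g_1$, which is what makes the data-parallel structure exploitable.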
## 4. GKR for unlayered data parallel circuit (ceno GKR)
## QA
## Appendix A. useful docs (survey / paper / video)
- talk by Ariel Gabizon (★★★★★) {%youtube tSpvk3EGRSM %}
- a note on GKR by j.thaler (★★★★★) https://people.cs.georgetown.edu/jthaler/GKRNote.pdf