# GKR reading note

[toc]

## 1. Intuition

### 1.1 zero check / testing

- [x] how to encode $N$ constraints in a uni-variate zkp (pairing-based / hash-based)?
    - [x] a uni-variate polynomial $f(X) = \sum_{i=0}^{N-1} a_i * X^i$ of degree less than $N$ can be uniquely represented by its evaluations over the set $T = \{w^i: 0 \leq i < N\}$, i.e. by $\{ f(w^i): 0 \le i < N\}$.
        - We can use Lagrange interpolation to convert between evaluation form and coefficient form.
    - for each $w^i$, we have the constraint $f(w^i) * g(w^i) = h(w^i)$;
        - this is equivalent to: $f(X)*g(X)-h(X) = 0$ for all $X \in T$;
        - this is equivalent to: $(X - w^i)$ divides $f(X)*g(X) - h(X)$ for each $i$;
        - this is equivalent to: $$Z_T(X) = \prod_{i=0}^{N-1} (X - w^i) = X^N - 1$$ divides $f(X)*g(X) - h(X)$;
        - this is equivalent to:
          > there exists a polynomial $q(X)$ s.t. $q(X)*Z_T(X) = f(X)*g(X) - h(X)$.
- [x] how to encode $N$ constraints in GKR?
    - we can encode the $3N$ values into 3 multilinear polynomials $f(\vec{X}), g(\vec{X}), h(\vec{X})$ via interpolation over the hypercube $B_n = \{0, 1\}^n$ (with $N = 2^n$);
    - for each element $\vec{b} = (b_1, \ldots, b_n)$ of the boolean hypercube, we have $$f(\vec{b}) * g(\vec{b}) - h(\vec{b}) = 0.$$
    - this is equivalent to: $c(\vec{X}) = f(\vec{X})*g(\vec{X}) - h(\vec{X})$ evaluates to 0 over the boolean hypercube. Note that $c(\vec{X})$ is **not** a multilinear polynomial.
    - But the MLE $\tilde{c}(\vec{X})$ of $c(\vec{X})$ **is** a multilinear polynomial: $$\tilde{c}(\vec{X}) = \sum_{\vec{\textbf{b}} \in B_n} \textrm{eq}(\vec{\textbf{b}}, \vec{X})*[f(\vec{b})*g(\vec{b}) - h(\vec{b})].$$
    - the above $N$ constraints imply that $\tilde{c}(\vec{X})$ is the zero polynomial:
        - [x] By the [Schwartz-Zippel lemma](https://en.wikipedia.org/wiki/Schwartz%E2%80%93Zippel_lemma), the probability that a non-zero $n$-variate multilinear polynomial $p(\vec{X})$ evaluates to 0 at a random point $\vec{\beta} \in F^n$ is at most $\frac{n}{|F|}$. This means that if a multilinear polynomial $p(\vec{X})$ evaluates to 0 at a random point, then $p(\vec{X}) = 0$ with high probability ($1-\frac{n}{|F|}$). Therefore we can use this zero-testing technique:
          > sample a random challenge $\vec{\beta} \in F^n$ and check whether $\tilde{c}(\vec{\beta}) = 0$.
    - Therefore the zero check of $\tilde{c}(\vec{X})$ is done via $$\textrm{prover.sumcheck}(\textrm{poly} = \textrm{eq}(\vec{x}, \vec{\beta})*[f(\vec{x})*g(\vec{x}) - h(\vec{x})], \textrm{sum} = 0).$$

### 1.2 MLE

MLE (multilinear extension) of a function $g: \{0,1\}^n \rightarrow F$.

- Background: $$\textrm{eq}(x_1, \ldots, x_n, y_1, \ldots, y_n) = \prod_{i=1}^n [(x_i * y_i) + (1-x_i)*(1-y_i)].$$ It's easy to see that for any two elements $(\vec{X}, \vec{Y})$ of the boolean hypercube $B_n$: $\textrm{eq}(\vec{X}, \vec{Y}) = 1$ iff $\vec{X} = \vec{Y}$.
- Lagrange basis for $n$-variate multilinear polynomials
    - [x] how to uniquely recover an $n$-variate multilinear polynomial?
      > Each $n$-variate multilinear polynomial is uniquely determined by its evaluations ($2^n$ elements) over the boolean hypercube!

      $f(X_1, X_2, \ldots, X_{n-1}, X_n) = f_0(X_1, \ldots, X_{n-1}) + f_1(X_1, \ldots, X_{n-1}) * X_n$ where
      1. $f_0(x_1, \ldots, x_{n-1}) = f(x_1, \ldots, x_{n-1}, 0)$
      2. $f_1(x_1, \ldots, x_{n-1}) = f(x_1, \ldots, x_{n-1}, 1) - f(x_1, \ldots, x_{n-1}, 0)$.

      We can prove this by induction starting from the simple fact
      > a linear uni-variate polynomial is uniquely recovered from **two** evaluations.
    - $L_{\vec{\textbf{b}}}(\vec{X}) = \textrm{eq}(\vec{\textbf{b}}, \vec{X})$ equals 1 if $\vec{X} = \vec{\textbf{b}}$ and 0 otherwise, for $\vec{X}$ ranging over the hypercube.
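The eq polynomial and the interpolation claim above are easy to check concretely. Below is a minimal Python sketch (assumptions: a toy prime field $P = 97$; the names `eq`, `hypercube`, and `mle_eval` are hypothetical, not from any paper):

```python
P = 97  # toy prime modulus; real systems use a cryptographically large field

def eq(xs, ys):
    """eq(x, y) = prod_i [x_i*y_i + (1-x_i)*(1-y_i)] mod P."""
    acc = 1
    for x, y in zip(xs, ys):
        acc = acc * ((x * y + (1 - x) * (1 - y)) % P) % P
    return acc

def hypercube(n):
    """All 2^n points of the boolean hypercube B_n, as tuples."""
    return [tuple((i >> k) & 1 for k in range(n)) for i in range(2 ** n)]

def mle_eval(evals, point):
    """Evaluate the MLE: g~(point) = sum_b eq(b, point) * g(b)."""
    n = len(point)
    return sum(eq(b, point) * g for b, g in zip(hypercube(n), evals)) % P

# On hypercube points the MLE reproduces the value table exactly,
# which is the uniqueness claim above.
evals = [3, 1, 4, 1]  # hypothetical g over B_2
assert [mle_eval(evals, b) for b in hypercube(2)] == evals
```

At points outside the cube, `mle_eval` gives the unique multilinear extension; e.g. with the table above, fixing $X_2 = 0$ gives the line $3 - 2X_1$, so `mle_eval(evals, (2, 0))` is $-1 \equiv 96 \pmod{97}$.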
- MLE of $g$ (denoted by $\tilde{g}$) is an $n$-variate multilinear polynomial: $$\tilde{g}(\vec{X}) = \sum_{\vec{\textbf{b}} \in B_n} \textrm{eq}(\vec{\textbf{b}}, \vec{X})*g(\vec{\textbf{b}}).$$

### 1.3 Sumcheck

For an $n$-variate polynomial $f(\vec{X}) \in F[\vec{X}]$ of individual degree at most $d$, the transcript generated in the sumcheck protocol is:

- for each round $i$ from 1 to $n$:
    - P sends V: $\big[f^i(0), f^i(1), f^i(2), \ldots, f^i(d) \big]$ where $$f^i(X) = \sum_{\vec{b} \in B_{n-i}} f(r_1, \ldots, r_{i-1}, X, \vec{b}).$$
        - In each round, the sumcheck prover needs to do a summation over the $2^{n-i}$ elements of the hypercube $B_{n-i}$.
    - V sends P: challenge $r_i$;

## 2. textbook GKR for layered circuit

- textbook GKR: GKR08 paper, CMT12 paper
- the relation between the values at layer $i$ and layer $i+1$ is that for each gate $g \in B_{s}$, $$ \begin{eqnarray} \tilde{v}_i(g) &=& \sum_{b, c \in \{0, 1\}^s} \tilde{\textrm{add}}(g, b, c) * \big(\tilde{v}_{i+1}(b) + \tilde{v}_{i+1}(c)\big) + \\ & &\tilde{\textrm{mul}}(g, b, c) * \big(\tilde{v}_{i+1}(b) * \tilde{v}_{i+1}(c)\big). \end{eqnarray} $$ Since both sides of the above equation are multilinear polynomials in $g$, we know that **this equation holds for any $z \in F^s$**: $$ \begin{eqnarray} \tilde{v}_i(z) &=& \sum_{b, c \in \{0, 1\}^s} \tilde{\textrm{add}}(z, b, c) * \big(\tilde{v}_{i+1}(b) + \tilde{v}_{i+1}(c)\big) + \\ & &\tilde{\textrm{mul}}(z, b, c) * \big(\tilde{v}_{i+1}(b) * \tilde{v}_{i+1}(c)\big). \end{eqnarray} $$
- MLE math:
    1. $\tilde{\textrm{add}}(z, b, c) = \sum_g \textrm{eq}(g, z) \cdot \textrm{add}(g, b, c)$.
    2. closed form for $\tilde{\textrm{add}}(\vec{z}, \vec{x}, \vec{y})$:
        - For any $p \in \{0,1\}^{3s}$, we define $p_0 = p[0:s], p_1 = p[s:2s], p_2 = p[2s:3s]$;
        - For any $\vec{z}, \vec{x}, \vec{y} \in F^s$, we have $$\tilde{\textrm{add}}(\vec{z}, \vec{x}, \vec{y}) = \sum_{p \in \{0, 1\}^{3s}: \textrm{add}(p) =1} \textrm{eq}(p_0, \vec{z}) \cdot \textrm{eq}(p_1, \vec{x}) \cdot \textrm{eq}(p_2, \vec{y}).$$
- [x] problem: the selector $\tilde{\textrm{add}}(\vec{z}, \vec{b}, \vec{c})$ has $2s$ free variables, so in round $j$ of the sumcheck a naive prover does a summation over $2^{2s-j}$ terms. How to reduce the prover time from $O(S^3)$ to $O(S \log S)$?
    - Check **Appendix A** of the [CMT12 paper](https://arxiv.org/abs/1105.2003).
    - Let's define $\vec{p}_j = (p_{s+j+1}, \ldots, p_{3s})$, the tail of $p$ after the $z$-part and the first $j$ sumcheck variables;
    - Let $\vec{r}_j = (r_1, \ldots, r_{j-1}) \in F^{j-1}$ and $\vec{b} \in \{0,1\}^{2s-j}$, then we have $$ \begin{eqnarray} \sum_{\vec{b} \in \{0,1\}^{2s-j}} \tilde{\textrm{add}}(\vec{z}, \vec{r}_j, X_j, \vec{b}) & = & \sum_{p \in \{0, 1\}^{3s}: \textrm{add}(p) =1} \sum_{\vec{b}} \chi_p(\vec{z}, \vec{r}_j, X_j, \vec{b}) \\ & = & \sum_{p \in \{0, 1\}^{3s}: \textrm{add}(p) =1} \chi_p(\vec{z}, \vec{r}_j, X_j, \vec{p}_j) \end{eqnarray} $$
    - the $\chi_p$ function has the nice form $$ \chi_p(\vec{z}, \vec{r}_j, X_j, \vec{p}_j)= \textrm{eq}(X_{j}, p_{s+j}) \prod_{i=1}^s \textrm{eq}(p_i, z_i) \prod_{i=1}^{j-1} \textrm{eq}(p_{i+s}, r_i) $$ (the factors contributed by $\vec{p}_j$ are all $\textrm{eq}(p, p) = 1$ and are omitted). Therefore we can store `eq_rand[p]` for each $p$ and update it incrementally: `eq_rand[p] *= eq(p[s+j], r_j)`.
    - Therefore, the GKR prover runs in $O(2^s)$ time for each round.
- [ ] The correctness of $\tilde{v}_i(z)$ is reduced to the correctness of the evaluations of $\tilde{v}_{i+1}(X)$ at two points $\{ \omega_1, \omega_2 \}$. How to merge these two claims into one?
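The round structure of §1.3, including the fold-and-halve prover work in each round, can be simulated in a few lines. This is an honest-prover sketch over a toy field, not a real interactive protocol; the modulus `P`, the bit ordering of the table, and the combined prover/verifier loop are illustrative assumptions:

```python
import random

P = 97  # toy prime modulus (assumption; real protocols use a large field)

def sumcheck(evals, claim=None, rng=random.Random(0)):
    """Honest-prover simulation of sumcheck for  sum_{b in B_n} f(b) = claim,
    where the multilinear f is given by its evaluation table `evals`
    (index bit 0 holds variable X_1).  Because f is multilinear, the round-i
    message is just the pair (f^i(0), f^i(1)).  Returns True iff the
    transcript verifies."""
    if claim is None:
        claim = sum(evals) % P
    cur, expect = [v % P for v in evals], claim % P
    while len(cur) > 1:
        # prover: f^i(0) and f^i(1), each a sum over 2^{n-i} hypercube points
        s0, s1 = sum(cur[0::2]) % P, sum(cur[1::2]) % P
        if (s0 + s1) % P != expect:           # verifier's round check
            return False
        r = rng.randrange(P)                  # verifier's challenge r_i
        expect = ((1 - r) * s0 + r * s1) % P  # new claimed value f^i(r_i)
        # prover folds X_i := r_i, halving the table
        cur = [((1 - r) * cur[2 * j] + r * cur[2 * j + 1]) % P
               for j in range(len(cur) // 2)]
    return expect == cur[0]                   # final oracle query f(r_1..r_n)

assert sumcheck([3, 1, 4, 1, 5, 9, 2, 6])    # honest claim verifies
assert not sumcheck([1, 2, 3, 4], claim=11)  # wrong claim is rejected
```

The folding step is exactly the uniqueness decomposition of §1.2 ($f = f_0 + f_1 \cdot X$) applied one variable at a time, which is why the prover's total work is $2^n + 2^{n-1} + \cdots = O(2^n)$.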
- [x] linear-time GKR for layered circuits: [Libra19 paper](https://eprint.iacr.org/2019/317)
    - we can write the GKR layer-reduction formula in another form by defining one helper multilinear polynomial $g_z(X)$ (keeping only the mul term for brevity):
        - $g_z(X) = \sum_{c \in \{0,1\}^s} \tilde{\textrm{mul}}(z, X, c) * \tilde{v}_{i+1}(c)$;
    - the textbook GKR relation can then be rewritten as $$ \begin{eqnarray} \tilde{v}_i(z) & = & \sum_{b, c} \tilde{\textrm{mul}}(z, b, c)*\tilde{v}_{i+1}(b) * \tilde{v}_{i+1}(c) \\ & = & \sum_b \tilde{v}_{i+1}(b) * \sum_c \tilde{\textrm{mul}}(z, b, c) * \tilde{v}_{i+1}(c) \\ & = & \sum_b \tilde{v}_{i+1}(b) * g_z(b) \end{eqnarray} $$
    - **Therefore we can replace the old $2s$-variate sumcheck with two new $s$-variate sumchecks.**
        - phase/step 1 sumcheck: $$\tilde{v}_i(z) = \sum_b \tilde{v}_{i+1}(b) * g_z(b)$$ over $\tilde{v}_{i+1}(x)$ and $g_z(x)$, where $x \in F^s$;
        - phase/step 2 sumcheck: $$g_z(x) = \sum_{c \in \{0,1\}^s} \tilde{\textrm{mul}}(z, x, c) * \tilde{v}_{i+1}(c)$$ over $\tilde{\textrm{mul}}(z, x, y)$ and $\tilde{v}_{i+1}(y)$, where $y \in F^s$.
    - [x] how to get $g_z(X)$'s MLE representation (its evaluations over $B_s$)?
        - Since $\tilde{\textrm{mul}}(z, b, c) = \sum_{a \in B_s} \textrm{eq}(z, a) \, \textrm{mul}(a, b, c)$, we have for each $b \in B_s$ $$ \begin{eqnarray} g_z(b) & = & \sum_c \tilde{\textrm{mul}}(z, b, c) * \tilde{v}_{i+1}(c) \\ & = & \sum_{a, c \in B_s} \textrm{eq}(z, a) * \textrm{mul}(a, b, c) * \tilde{v}_{i+1}(c); \end{eqnarray} $$
        - Let $P_b = \{(a, c): \textrm{mul}(a, b, c) = 1\}$ be the set of pairs (output gate $a$ at layer $i$, right input $c$ at layer $i+1$) whose mul gate has gate $b$ at layer $i+1$ as left input.
        - We can then do only $|P_b|$ summations: $$ \begin{eqnarray} g_z(b) & = & \sum_{(a, c) \in P_b} \textrm{eq}(z, a) * \tilde{v}_{i+1}(c). \end{eqnarray} $$
        - For a random vector $z \in F^s$, we can compute $\textrm{eq}(z, a)$ for all $a \in B_s$ in $O(2^s)$ time.
        - since $\sum_b |P_b| \le 2^s$ (each mul gate appears in exactly one $P_b$), computing $g_z(b)$ for all $b$ runs in $O(2^s)$ time.
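The $O(2^s)$ bookkeeping for phase 1 can be sketched as follows (assumptions: a toy prime field $P = 97$, gates given as an explicit list of `(a, b, c)` index triples, first variable in the highest bit; `eq_table` and `g_z_table` are hypothetical names, not the Libra paper's API):

```python
P = 97  # toy prime modulus (assumption)

def eq_table(z):
    """G_z[a] = eq(z, a) for every a in B_s, built in O(2^s) time by
    extending the table one variable at a time (earlier variables of z
    occupy higher bits of the index a)."""
    table = [1]
    for zi in z:
        table = [t * f % P for t in table for f in ((1 - zi) % P, zi % P)]
    return table

def g_z_table(z, mul_gates, v_next):
    """Evaluations of g_z over B_s:  g_z(b) = sum over gate triples with
    mul(a, b, c) = 1 of eq(z, a) * v~_{i+1}(c).  Runs in O(2^s + #gates)
    time because each gate triple is touched exactly once."""
    Gz = eq_table(z)
    g = [0] * len(v_next)
    for a, b, c in mul_gates:  # sparse list of (output, left, right) indices
        g[b] = (g[b] + Gz[a] * v_next[c]) % P
    return g

# hypothetical tiny layer: s = 2, two mul gates
gates = [(0, 1, 2), (3, 0, 1)]
v_next = [2, 3, 5, 7]
assert g_z_table((0, 0), gates, v_next) == [0, 5, 0, 0]
```

With a boolean $z$, `eq_table(z)` is just the indicator of $z$, so the assertion picks out the single gate with output 0; with a random field-element $z$ the same loop computes the weighted sums the phase-1 sumcheck needs.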
- [x] for each $c \in B_s$, how to compute $\tilde{\textrm{mul}}(z, x, c)$ efficiently (for phase 2)?
    - $\tilde{\textrm{mul}}(z, x, c) = \sum_{a, b} \textrm{eq}(z, a) * \textrm{eq}(x, b) * \textrm{mul}(a, b, c)$.
    - Let $Q_c = \{(a, b): \textrm{mul}(a, b, c) = 1\}$ be the set of pairs (output gate $a$ at layer $i$, left input $b$ at layer $i+1$) whose mul gate has gate $c$ at layer $i+1$ as right input.
    - `G_z[a] = eq(z, a)` is stored in $O(2^s)$ space and computed in $O(2^s)$ time.
    - Similarly, `G_x[b] = eq(x, b)` can also be stored in $O(2^s)$ space.
    - Therefore $\tilde{\textrm{mul}}(z, x, c) = \sum_{(a, b) \in Q_c} \textrm{eq}(z, a) * \textrm{eq}(x, b)$.
- [ ] how does Libra work with data-parallel circuits?

## 3. GKR for layered data parallel circuit

- $2^b$ is the number of instances; $s$ is the number of variables per layer in the inner circuit.
- The relation between layer $i$ and layer $i+1$ (keeping only mul gates) is, for each $g_1 \in B_s$ and $g_2 \in B_b$, $$ \begin{eqnarray} \tilde{v}_i(g_1, g_2) & = & \sum_{b_1, c_1 \in B_s} \tilde{\textrm{mul}}(g_1, b_1, c_1) * \tilde{v}_{i+1}(b_1, g_2) * \tilde{v}_{i+1}(c_1, g_2). \end{eqnarray} $$ With MLE, we have $$ \begin{eqnarray} \tilde{v}_i(z_1, z_2) & = & \sum_{g_1 \in B_s, g_2 \in B_b} \textrm{eq}(z_1, g_1) * \textrm{eq}(z_2, g_2) \sum_{b_1, c_1} \tilde{\textrm{mul}}(g_1, b_1, c_1) * \tilde{v}_{i+1}(b_1, g_2) * \tilde{v}_{i+1}(c_1, g_2). \end{eqnarray} $$

## 4. GKR for unlayered data parallel circuit (ceno GKR)

## QA

## Appendix A. useful docs (survey / paper / video)

- talk by Ariel Gabizon (★★★★★) {%youtube tSpvk3EGRSM %}
- a note on GKR by Justin Thaler (★★★★★): https://people.cs.georgetown.edu/jthaler/GKRNote.pdf