GKRFold: SumFold-based GKR Proof Compression

# GKRFold: SumFold-based GKR Proof Compression Masato Tsutsumi (Waseda University) **Disclaimer**: This post is in an early idea stage, and the logic and mathematical formulations may not be fully rigorous. We welcome any comments or feedback on the idea itself, as well as suggestions regarding the logical structure. ## Abstract In this paper, we introduce GKRFold, a protocol for compressing the multiple GKR-based SNARK proofs into a single proof. GKRFold incorporates SumFold—a component originally proposed in NeutronNova[KS24]—to achieve this compression. Ultimately, the compression process yields a single GKR or Sumcheck instance along with one commitment. As a result, the verifier is required to perform only one Sumcheck and two random polynomial evaluations. Moreover, the protocol is applicable to both uniform and non-uniform instances of the GKR protocol. ## 1. Introduction The explanation of MLE, Sumcheck is omitted here. Those who wish to study the details should refer to the following documents. - https://eprint.iacr.org/2013/351.pdf - https://jolt.a16zcrypto.com/background/sumcheck.html - https://www.youtube.com/watch?v=lMo-MmJ7e_E - https://eprint.iacr.org/2019/317.pdf - https://www.youtube.com/watch?v=lMo-MmJ7e_E ### 1-2. GKR Protocol and its challenges. The GKR proof system[GKR15,XZZ19], a prominent SNARK (Succinct Non-interactive ARgument of Knowledge) due to its efficient prover time, has been employed in systems such as [Expander](https://github.com/PolyhedraZK/Expander) and [Ceno](https://github.com/scroll-tech/ceno). Although the prover can generate proofs in $O(N)$ time without committing to intermediate results or relying on FFTs, the verification time and proof size exhibit a complexity of $O(D\log N)$. Consequently, achieving efficient proof compression remains an important challenge. ### 1-2. Contributions In this work, we propose GKRFold, a novel method that applies SumFold to efficiently compress $n$ instances of GKR proofs. The primary contributions of this study are as follows: 1. We demonstrate that SumFold can be effectively utilized to compress GKR instances, leading to substantial improvements in proof size and verification efficiency. 2. We show that GKRFold can compress $n$ GKR proofs, each comprising $d$ layers (amounting to a total of $6 \times n \times d$ Sumcheck instances), into a single Sumcheck instance. This result ... (TODO)... ## 2. Preliminaries ### 2-1. GKR Protocol Its design is based on representing each circuit layer as a corresponding layer polynomial. Starting from the output, the protocol sequentially relays a claim about one layer to a claim about the subsequent layer by executing a Sumcheck protocol on the evaluation of the layer polynomial at a randomly chosen point. ![image](https://hackmd.io/_uploads/rJDhMAMc1l.png) More precisely, the layer polynomial $V_i$ is defined as $$ V_i(z) = \sum_{x,y \in \{0,1\}^{S_i}} \Bigl\{ add_{i+1}(z,x,y)\bigl(V_{i+1}(x)+V_{i+1}(y)\bigr) + mult_{i+1}(z,x,y)\bigl(V_{i+1}(x) \cdot V_{i+1}(y)\bigr) \Bigr\}, $$ where the functions $add_i(z,x,y)$ and $mult_i(z,x,y)$ return 1 if the gate $(z,x,y)$ in layer $i$ is an addition or multiplication gate, respectively, and 0 otherwise. An overview of the GKR protocol is as follows: 1. The prover sends the circuit evaluation $C(x,w)=y$ to the verifier. 2. For every layer $i = 1,\dots,d-1$ (excluding the input layer), the verifier issues a challenge by selecting a random point $r_i$ and then initiates the Sumcheck protocol on $V_i(r_i)$. 3. The verifier directly evaluates the input layer $V_d(r_d)$. ### 2-2. Linear GKR Protocol (Libra[XZZ19]) Libra[XZZ19] achieves the linear-time prover algorithm for the GKR protocol by applying 2-phases Sumcheck. Let's assume that the following layer polynomial is given: $$ V_i(z) = \sum_{x,y \in \{0,1\}^{S_i}} \Bigl\{ add_{i+1}(z,x,y)\bigl(V_{i+1}(x)+V_{i+1}(y)\bigr) + mult_{i+1}(z,x,y)\bigl(V_{i+1}(x) \cdot V_{i+1}(y)\bigr) \Bigr\}. $$ We partition this expression into three terms: $$ V_i(z) = \sum_{x,y \in \{0,1\}^{S_i}} add_{i+1}(z,x,y)V_{i+1}(x) \;+\; $$ $$ \sum_{x,y \in \{0,1\}^{S_i}} add_{i+1}(z,x,y)V_{i+1}(y) \;+\; $$ $$ \sum_{x,y \in \{0,1\}^{S_i}} mult_{i+1}(z,x,y)\bigl(V_{i+1}(x) \cdot V_{i+1}(y)\bigr). $$ For each term, a 2-phase Sumcheck protocol is executed. For instance, consider the third term: $$ \sum_{x,y \in \{0,1\}^{S_i}} mult_{i+1}(z,x,y)\bigl(V_{i+1}(x) \cdot V_{i+1}(y)\bigr) =\sum_{x \in \{0,1\}^{S_i}} h_{i+1}(x)V_{i+1}(x), $$ where $$ h_{i+1}(x) = \sum_{y \in \{0,1\}^{S_i}} mult_{i+1}(z,x,y)V_{i+1}(y). $$ The 2-phase Sumcheck protocol is then carried out as follows: 1. Receive a random value $r_i$ from V 2. Prove the correctness of $h_{i+1}(r_i)$ via Sumcheck. 3. Receive a random value $u_i$ from V 4. Prove the correctness of the term $\sum_{x \in \{0,1\}^{S_i}} h_{i+1}(x)V_{i+1}(u_i,x)$ via Sumcheck. Thus, for each layer, a total of $2 \times 3 = 6$ Sumcheck instances are executed in parallel. ## 3. SumFold SumFold, as introduced in NeutronNova[KS24], is a technique that folds an arbitrary number of Sumcheck instances into a single instance. ### 3-1. Protocol Overview Assume that we want to prove, via Sumcheck, a function defined by a composition $T = F(\vec{g}(x))$. The goal of SumFold is to reduce the $n$ Sumcheck instances to a single instance. ![image](https://hackmd.io/_uploads/HyGj7mWc1g.png) #### Step 1: Commitment Construction by the Prover The prover constructs the polynomials $$ f_1(b,x),\ f_2(b,x),\ \dots,\ f_t(b,x) $$ and sends a commitment to these functions. Each polynomial is defined such that its output corresponds to $g_{bj}(x)$ for the $b$-th instance. ![Figure: Commitment Diagram](https://hackmd.io/_uploads/BkyoQGGcJe.png) #### Step 2: Random Challenge by the Verifier The verifier selects a random field element $\rho$ and transmits it to the prover. #### Step 3: Construction of $Q(b)$ by the Prover The prover constructs the polynomial $Q(b)$ such that: $$ Q(\rho) = T_{\rho}, \quad \text{and} \quad Q(b) = 0 \text{ for } b \neq \rho. $$ #### Step 4: Execution of the Sumcheck Protocol Both the prover and the verifier engage in the Sumcheck protocol. Initially, the verifier checks that $$ T' = Q(\rho). $$ Subsequently, the protocol proves the following equality: $$ T' = \sum_{x} F\bigl(f_1(\rho,x),\, f_2(\rho,x),\, \dots,\, f_t(\rho,x)\bigr) $$ using the Sumcheck protocol. Notably, Steps 2 and 3 are iterated for at most $\log(n)$ rounds. #### Step 5: Output Generation Upon the completion of the above steps, the protocol outputs the folded instance. --- ### 3-2. Proof of Theorem (Informal) Completeness: The folded instance produced by SumFold is correctly contained within the Sumcheck instance. Under the assumption that each individual Sumcheck instance is valid, the instance $T'$ obtained through the folding procedure by the prover and verifier corresponds to a randomly chosen instance among $n$ instances. Specifically, if the prover correctly prepares the value $ T_i$, then for any selected instance $i$ it holds that $$ Q(i) = T_i. $$ Furthermore, the sum of the polynomial $F\bigl(\vec{g}_i(x)\bigr)$ verified by the Sumcheck protocol is exactly equal to $T_i$. Knowledge Soundness: Due to the commitment scheme, the prover is unable to conveniently modify the instance after learning the random challenge. Moreover, if a dishonest prover were to embed an invalid instance among the $n$ Sumcheck instances in advance, and the verifier subsequently selects that instance, the soundness of the Sumcheck protocol ensures that it will be rejected with high probability. In the event that the verifier does not select the invalid instance, the instance might pass verification; however, by iterating the challenge process multiple times, the verifier can ultimately achieve a very high probability of rejecting such a dishonest attempt. ### 3-3. Costs Let - $t$ denotes the element size of $\vec{g}$. - $D$ denotes the degree of $F$, - $n$ denotes the number of instances - $l$ denotes the number of variables in $g$. The complexities are summarized in the following table: | **Round** | **Communication** | **Prover Time** | **Verifier Time** | |-------------------|----------------------------------------|----------------------------------------------------|-------------------------------------| | $1+\log(n)$ | $O(D \log(n))$ field elements | $O(n \cdot t \cdot D \cdot 2^l)$ field elements | $O(D \log(n))$ field elements | By following Theorem 2 in the NeutronNova paper[KS24]. ## 4. GKRFold GKRFold is a technique that leverages SumFold to compress $n$ GKR instances into a single instance. In what follows, we describe two variants: the normal GKRFold (GKR-to-GKR) and the ultra GKRFold (GKR-to-Sumcheck). --- ### 4-1. Normal GKRFold: GKR-to-GKR In the normal GKRFold, the idea is to directly apply the SumFold method to GKR instances. The method proceeds as follows: 1. **Function Construction and Commitment:** The prover constructs functions $f_{j}(b,x)$ and commits to them with the verifier. Here, $f_{j}(i, x)$ is defined to return the layer polynomial corresponding to layer $j$ of the $i$-th GKR instance. In effect, the polynomial $f_{j}(b,x)$ is designed so that when $b = i$ its output equals the polynomial of the $i$-th instance. 2. **Random Challenge and Instance Selection:** In response to the verifier’s random challenge $\rho$, the GKR protocol is executed on the $\rho$-th GKR instance. The randomness ensures that the prover cannot selectively cheat on a subset of instances. 3. **Output:** The $\rho$-th GKR instance is then output as the folded instance. The correctness of the folding is ensured by the underlying SumFold protocol. --- ### 4-2. Ultra GKRFold: GKR-to-Sumcheck Next, we describe a method to fold $n$ GKR instances into a single Sumcheck instance. In this variant, SumFold is applied to the layer polynomials of the GKR instances, effectively reducing the Sumcheck instances corresponding to a given layer. Recall that, as described in Section 2-2, the GKR layer Sumcheck splits into a 3-term, 2-phase structure. With suitable mappings, the following table summarizes the correspondence: | | **Sumcheck 1** $(h_{1,i+1}(x))$ | **Sumcheck 2** $(h_{2,i+1}(y))$ | **Sumcheck 3** $(h_{3,i+1}(x))$ | **Sumcheck 4** (1st term) | **Sumcheck 5** (2nd term) | **Sumcheck 6** (3rd term) | |------------------------|---------------------------------------------|---------------------------------------------|---------------------------------------------|------------------------------------------------|------------------------------------------------|------------------------------------------------| | **Expression** | $\displaystyle \sum_{y \in \{0,1\}^{S_i}} add_{i+1}(z,x,y)$ | $\displaystyle \sum_{x \in \{0,1\}^{S_i}} add_{i+1}(z,x,y)$ | $\displaystyle \sum_{y \in \{0,1\}^{S_i}} mult_{i+1}(z,x,y)V_{i+1}(y)$ | $\displaystyle \sum_{x \in \{0,1\}^{S_i}} h_{1,i+1}(x)V_{i+1}(x)$ | $\displaystyle \sum_{y \in \{0,1\}^{S_i}} h_{2,i+1}(y)V_{i+1}(y)$ | $\displaystyle \sum_{x \in \{0,1\}^{S_i}} h_{3,i+1}(x)V_{i+1}(x)$ | | **Form of** $F$ | $F(g_1(x),g_2(x)) = g_1(x) \cdot g_2(x)$ | same | same | same | same | same | | **Vector** $\vec{g}$ | $\{ add_{i+1}(z,x,y),\, g_e(x,y,z) \}$ | $\{ add_{i+1}(z,x,y),\, g_e(x,y,z) \}$ | $\{ mult_{i+1}(z,x,y),\, V_{i+1}(y) \}$ | $\{ h_{1,i+1}(x),\, V_{i+1}(x) \}$ | $\{ h_{2,i+1}(y),\, V_{i+1}(y) \}$ | $\{ h_{3,i+1}(x),\, V_{i+1}(x) \}$ | Here, $g_e(x) = 1$ is the identity function that always returns 1. Note that Sumcheck 1 and 4, Sumcheck 2 and 5, Sumcheck 3 and 6 correspond respectively to phase 1 and phase 2 of the GKR protocol. Using this mapping, the $6 \times n \times d$ Sumcheck instances can be arranged as indicated in the figure below. <img src="https://hackmd.io/_uploads/HJo8TwWqkg.png" width="400" /> Based on the above mapping, GKRFold functions as follows. --- #### Step 1. Sumcheck Initialization (for all layers $ 1,\dots,d $) The verifier sends random challenges $u_1, u_2, \dots, u_{3n}$ to the prover. Then, based on these random values, the prover constructs $3*n$ Sumcheck instances corresponding to the entries in columns 1, 2, and 3 of the table above and sends them to the verifier. Next, without running the Sumcheck immediately, the verifier sends random challenges $r_1, r_2, \dots, r_{3n}$ to the prover. Subsequently, the prover constructs another set of $3*n$ Sumcheck instances corresponding to the entries in columns 4, 5, and 6 (again, based on the random challenges) and sends them to the verifier. Repeating this process for layers 1 through $d$ initializes a total of $6 \times n \times d$ Sumcheck instances. --- #### Step 2. Commitment Construction by the Prover The prover constructs the polynomials $$ f_1(b,x),\quad f_2(b,x) $$ and sends a commitment to these functions. Each polynomial is defined such that its output corresponds to $g_{bj}(x)$ for the $b$-th instance, ensuring that when $b$ is fixed, the correct layer polynomial is recovered. <img src="https://hackmd.io/_uploads/HJo8TwWqkg.png" width="400" /> --- #### Step 3. Running SumFold Finally, the SumFold protocol is executed in a manner analogous to the standard SumFold. Multiple rounds of challenges are issued. Depending on the outcome of these challenges, indices corresponding to the phases (phase 1 and phase 2) are selected, and the GKR Round Sumcheck is executed accordingly. ### 4-3. Cost Analysis TODO... ### 4-4. Proof of Theorem (Informal) TODO... ## Reference 1. [KS24](https://eprint.iacr.org/2024/1606) 2. [GKR15](https://dl.acm.org/doi/10.1145/2699436) 3. [XZZ19](https://eprint.iacr.org/2019/317.pdf)