# Strategy for Sumcheck SVO Multi-Point Evaluation in WHIR
- Paper: https://eprint.iacr.org/2025/1117
## Introduction
For now, the Small-Value Optimization (SVO) for sumcheck proving assumes a single equality constraint of the form $g(X) = \text{eq}(w, X) \cdot p(X)$. However, the WHIR protocol often requires checking proximity or evaluation at multiple points $z_1, \dots, z_k$.
This note outlines two strategies for handling multiple constraints, analyzing their performance trade-offs to determine the optimal approach for implementation.
## Option 1: Prefix-Suffix Inner Product Protocol (Low $k$)
This approach treats the sumcheck for $\sum_{x} \text{eq}(r, x) \cdot p(x)$ as an inner product between an arbitrary vector $u$ (representing $p$) and a structured vector $a$ (representing $\text{eq}(r, \cdot)$). Because $\text{eq}$ has a specific tensor structure, we can use a specialized "Prefix-Suffix" protocol that avoids materializing the full vector $a$.
### Mechanism
The equality polynomial $\text{eq}(r, x)$ is prefix-suffix decomposable. If we split the variables $x$ into a prefix $x_{pre}$ (first $m$ variables) and a suffix $x_{suf}$ (remaining $\ell - m$ variables), we can write:
\begin{equation}
\text{eq}(r, x) = \text{eq}(r_{pre}, x_{pre}) \cdot \text{eq}(r_{suf}, x_{suf})
\end{equation}
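This factorization follows directly from the product form of the equality polynomial, splitting the product at position $m$:
\begin{equation}
\text{eq}(r, x) = \prod_{i=1}^{\ell} \big( r_i x_i + (1 - r_i)(1 - x_i) \big) = \underbrace{\prod_{i=1}^{m} \big( r_i x_i + (1 - r_i)(1 - x_i) \big)}_{\text{eq}(r_{pre},\, x_{pre})} \cdot \underbrace{\prod_{i=m+1}^{\ell} \big( r_i x_i + (1 - r_i)(1 - x_i) \big)}_{\text{eq}(r_{suf},\, x_{suf})}
\end{equation}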
The prover protocol runs in stages:
1. **Array Construction:** The prover makes a pass over $p$ to compute a table $Q$ of size $2^m$ (where $m \approx \ell/2$).
\begin{equation}
Q[y] = \sum_{x_{suf}} p(y, x_{suf}) \cdot \text{eq}(r_{suf}, x_{suf})
\end{equation}
Simultaneously, it builds a table $P$ for the prefixes: $P[y] = \text{eq}(r_{pre}, y)$.
2. **Reduced Sumcheck:** We can now rewrite the original sumcheck claim as the inner product of these two tables.
\begin{equation}
\begin{aligned}
\sum_{x} \text{eq}(r, x) \cdot p(x) &= \sum_{y} \text{eq}(r_{pre}, y) \cdot \left( \sum_{x_{suf}} p(y, x_{suf}) \cdot \text{eq}(r_{suf}, x_{suf}) \right) \\
&= \sum_{y} P[y] \cdot Q[y]
\end{aligned}
\end{equation}
The problem reduces to checking the inner product of $P$ and $Q$, which are vectors of size $\sqrt{N}$ (where $N=2^\ell$). The prover runs a standard sumcheck on this smaller instance.
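To make the construction concrete, here is a minimal Python sketch (a toy prime modulus and illustrative helper names, not tied to any particular codebase): it builds $Q$ in one streaming pass over $p$, builds $P$ directly from $r_{pre}$, and checks that the reduced inner product matches the original sum.

```python
import random

MOD = 2**31 - 1  # toy prime modulus, for illustration only

def eq(r, x):
    """Multilinear equality polynomial: prod_i (r_i*x_i + (1-r_i)*(1-x_i)) mod MOD."""
    acc = 1
    for ri, xi in zip(r, x):
        acc = acc * ((ri * xi + (1 - ri) * (1 - xi)) % MOD) % MOD
    return acc

def bits(v, n):
    """n-bit decomposition of v, most significant bit first."""
    return [(v >> (n - 1 - i)) & 1 for i in range(n)]

ell, m = 8, 4
N = 2**ell
p = [random.randrange(MOD) for _ in range(N)]    # stand-in for the base-field polynomial p
r = [random.randrange(MOD) for _ in range(ell)]  # evaluation point
r_pre, r_suf = r[:m], r[m:]

# Table Q: one streaming pass over all N evaluations of p (the N-sized work).
Q = [0] * (2**m)
for idx, val in enumerate(p):
    y = idx >> (ell - m)  # prefix bits of the index, as an integer
    Q[y] = (Q[y] + val * eq(r_suf, bits(idx, ell)[m:])) % MOD

# Table P: built directly from r_pre, only 2^m = sqrt(N) entries.
P = [eq(r_pre, bits(y, m)) for y in range(2**m)]

# The reduced claim sum_y P[y]*Q[y] equals the original sum over the full cube.
original = sum(eq(r, bits(x, ell)) * p[x] for x in range(N)) % MOD
reduced = sum(Py * Qy for Py, Qy in zip(P, Q)) % MOD
assert original == reduced
```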
### Cost Derivation
The efficiency of this protocol relies on decomposing the equality polynomial $\text{eq}(r, x)$ into a prefix part (size $\sqrt{N}$) and a suffix part (size $\sqrt{N}$). By substituting this decomposition into the sumcheck equation, we can isolate the costs associated with the Base Field (small values) and the Extension Field (large values).
\begin{equation}
\sum_{x \in \{0,1\}^\ell} \text{eq}(r, x) \cdot p(x) = \sum_{y \in \{0,1\}^m} \underbrace{\text{eq}(r_{pre}, y)}_{\substack{\text{Table } P[y] \\ \text{Size: } 2^{\ell/2} = \sqrt{N} \\ \text{(Extension Field)}}} \cdot \underbrace{\left( \sum_{x_{suf} \in \{0,1\}^{\ell-m}} \underbrace{p(y, x_{suf})}_{\text{Base}} \cdot \underbrace{\text{eq}(r_{suf}, x_{suf})}_{\text{Extension}} \right)}_{\substack{\text{Table } Q[y] \text{ Construction} \\ \text{Stream } p(x) \text{ over all } 2^\ell \text{ points} \\ \text{Cost: } \mathbf{2 \cdot N \cdot \mathcal{M}_{BE}}}}
\end{equation}
This transforms the original claim into a much smaller inner product check:
\begin{equation}
\text{Reduced Claim} = \sum_{y \in \{0,1\}^m} \underbrace{P[y] \cdot Q[y]}_{\substack{\text{Reduced Sumcheck} \\ \text{Size: } 2^{\ell/2} = \sqrt{N} \text{ iterations} \\ \text{Cost: } \mathbf{\sqrt{N} \cdot \mathcal{M}_{EE}}}}
\end{equation}
1. **The Inner Sum ($Q[y]$):** This is where the heavy lifting happens. We effectively iterate over every coefficient of the polynomial $p(x)$ ($N$ terms). However, because $p(x)$ is in the Base Field and we multiply it by the suffix $\text{eq}$ (Extension Field), we only pay the cheaper $\mathcal{M}_{BE}$ cost. The factor of $2$ accounts for the two streaming passes required by the two-stage ($C=2$) configuration of the protocol, explained in the note below.
:::info
### Why the factor of $\mathbf{2}$ in $2 \cdot N \cdot \mathcal{M}_{BE}$
This factor comes from configuring the Prefix-Suffix protocol for square-root space ($O(\sqrt{N})$).
In https://eprint.iacr.org/2025/611.pdf (Appendix A), the protocol is parameterized by $C$, the number of chunks the variables are split into. To achieve $\sqrt{N}$ space, we set $C=2$. The paper states:
> *"The protocol proceeds in $C$ stages... In each stage, the prover first makes a single pass over $u$ and $a$."*
Therefore, with $C=2$, the prover iterates over the data size ($N$) exactly twice:
* **Pass 1 (Prefix Stage):** The prover streams over all $N$ items to fold the Prefix variables (the first $\ell/2$ variables). This constructs the intermediate table $Q$.
* *Cost:* $1 \cdot N$ operations.
* **Pass 2 (Suffix Stage):** The prover streams over the data again to handle the Suffix variables (the remaining $\ell/2$ variables), utilizing the challenges generated after Pass 1.
* *Cost:* $1 \cdot N$ operations.
$$\text{Total: } 1N + 1N = \mathbf{2N}$$
:::
2. **The Outer Sum ($P \cdot Q$):** Once $Q$ is built, the problem shrinks to size $\sqrt{N}$. The operations here are expensive Extension $\times$ Extension multiplications ($\mathcal{M}_{EE}$), but because $\sqrt{N} \ll N$, this cost is negligible.
To verify $k$ distinct points, we must run the prefix-suffix decomposition independently for each point.
\begin{equation}
\text{Total Cost} \approx \underbrace{k}_{\text{repeat for } k \text{ points}} \cdot \Bigg( \underbrace{2 \cdot N \cdot \mathcal{M}_{BE}}_{\substack{\text{Array Construction} \\ \text{Streaming } p(x) \text{ (size } N) \\ \text{vs Eq suffixes}}} + \underbrace{O(\sqrt{N}) \cdot \mathcal{M}_{EE}}_{\substack{\text{Reduced Sumcheck} \\ \text{Proving inner product} \\ \text{of tables } P, Q \text{ (size } \sqrt{N})}} \Bigg)
\end{equation}
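As a quick sanity check of this formula, the cost model can be written down in a few lines (illustrative multiplication counting only, not a benchmark; `option1_cost` is a hypothetical helper name):

```python
def option1_cost(ell: int, k: int) -> tuple[int, int]:
    """Multiplication counts for the Prefix-Suffix approach under the model above.

    Returns (base_x_ext_muls, ext_x_ext_muls).
    """
    N = 2**ell
    base_x_ext = k * 2 * N          # two streaming passes over p per evaluation point
    ext_x_ext = k * 2**(ell // 2)   # reduced sumcheck over tables of size sqrt(N)
    return base_x_ext, ext_x_ext

# Example: ell = 20 (N about 1M) and k = 3 evaluation points.
print(option1_cost(20, 3))  # (6291456, 3072)
```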
- **Pros:** Extremely memory efficient ($O(\sqrt{N})$ space); very fast for small $k$.
- **Cons:** Cost scales linearly with $k$.
## Option 2: Batched Degree-2 Sumcheck (High $k$)
This strategy abandons the prefix-suffix structure in favor of a brute-force batching approach that leverages the Small-Value Optimization (SVO) to keep costs low. It constructs a single combined polynomial $W(X)$ representing *all* constraints and runs a standard sumcheck on the product $W(X) \cdot p(X)$.
\begin{equation}
W(X) = \sum_{i=0}^{k-1} \gamma^i \cdot \text{eq}(z_i, X)
\end{equation}
The prover then proves the following sum (as in our classic sumcheck):
\begin{equation}
\sum_{x \in \{0,1\}^\ell} \underbrace{W(x)}_{\substack{\text{Combined Weight} \\ \text{Extension Field} \\ \text{(Large)}}} \cdot \underbrace{p(x)}_{\substack{\text{Polynomial} \\ \text{Base Field} \\ \text{(Small)}}} = S
\end{equation}
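The batching step itself can be illustrated with a short sketch (same toy-field conventions as the Option 1 sketch; all names are illustrative): it materializes the dense table $W$ and checks that the single batched claim equals the $\gamma$-combination of the $k$ individual claims.

```python
import random

MOD = 2**31 - 1  # toy prime modulus, for illustration only

def eq(r, x):
    acc = 1
    for ri, xi in zip(r, x):
        acc = acc * ((ri * xi + (1 - ri) * (1 - xi)) % MOD) % MOD
    return acc

def bits(v, n):
    return [(v >> (n - 1 - i)) & 1 for i in range(n)]

ell, k = 6, 3
N = 2**ell
p = [random.randrange(MOD) for _ in range(N)]
zs = [[random.randrange(MOD) for _ in range(ell)] for _ in range(k)]
gamma = random.randrange(MOD)

# Dense table W[x] = sum_i gamma^i * eq(z_i, x) over the Boolean cube
# (a real prover would expand each eq(z_i, .) with the usual O(N) tensor trick).
W = [0] * N
for i, z in enumerate(zs):
    g = pow(gamma, i, MOD)
    for x in range(N):
        W[x] = (W[x] + g * eq(z, bits(x, ell))) % MOD

# The single batched claim equals the gamma-combination of the k individual claims.
batched = sum(W[x] * p[x] for x in range(N)) % MOD
individual = sum(pow(gamma, i, MOD) * sum(eq(z, bits(x, ell)) * p[x] for x in range(N))
                 for i, z in enumerate(zs)) % MOD
assert batched == individual
```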
### Mechanism
Unlike Option 1, the combined polynomial $W(X)$ has no clean prefix-suffix structure; it is essentially a dense vector of large random values. However, $p(X)$ remains "small" (Base Field). We can exploit this asymmetry using the Small-Value Optimization (SVO).
The protocol proceeds in two phases:
1. **SVO Phase (First $\ell_0$ rounds):** We compute round messages using accumulators. Because we are multiplying a large $W(x)$ by a small $p(x)$, every operation is a cheap $\mathcal{M}_{BE}$ (Base $\times$ Extension) multiplication.
2. **Standard Phase (Remaining rounds):** After folding $\ell_0$ times, the polynomial $p(x)$ becomes "large" (Extension Field). We switch to the standard linear-time prover, paying the expensive $\mathcal{M}_{EE}$ cost, but on a much smaller domain size ($N / 2^{\ell_0}$).
### Explicit Cost Derivation
The total cost is the sum of three distinct computational phases:
1. **Accumulation ($\mathcal{M}_{BE}$):** Computing the SVO accumulators (requires streaming $p(x)$). For $\ell_0$ rounds, the cost is $(3^{\ell_0} - 1) \cdot \frac{N}{2^{\ell_0}} \cdot \mathcal{M}_{BE}$.
2. **Binding ($\mathcal{M}_{BE}$):** Compressing the polynomial $p(x)$ once the SVO phase ends (a single pass over $p$; combining the $2^{\ell_0}$ base-field values in each block with the extension-field challenge weights costs $N \cdot \mathcal{M}_{BE}$ in total).
3. **Standard Prover ($\mathcal{M}_{EE}$):** Running the linear-time protocol on the remaining variables.
Recall the two cost units used throughout:
* $\mathcal{M}_{BE}$: Base Field $\times$ Extension Field multiplication (Cheap).
* $\mathcal{M}_{EE}$: Extension Field $\times$ Extension Field multiplication (Expensive).
The total cost is the sum of the **SVO Phase** (Accumulation + Binding) and the **Standard Phase** (Geometric Series).
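Putting the pieces together, the model gives the following total for a general $\ell_0$ (using a binding cost of $N \cdot \mathcal{M}_{BE}$ and approximating the standard-phase geometric series, which starts at $N/2^{\ell_0}$, by $4N/2^{\ell_0}$); the three cases below instantiate it:
\begin{equation}
\text{Total}(\ell_0) \approx \left( \frac{3^{\ell_0} - 1}{2^{\ell_0}} + 1 \right) \cdot N \cdot \mathcal{M}_{BE} + \frac{4N}{2^{\ell_0}} \cdot \mathcal{M}_{EE}
\end{equation}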
### Case 1: One Round of SVO ($\ell_0 = 1$)
In this baseline scenario, we perform only the first round using accumulators.
1. **SVO Phase (Round 1)**
We stream the polynomial (size $N$) to compute accumulators and then bind the polynomial.
* Accumulation: $(3^1 - 1) = 2$ accs $\times (N/2 \text{ items}) = 1N$.
* Binding: $1 \text{ fold} \times N = 1N$.
* *Total:* $2N \cdot \mathcal{M}_{BE}$.
2. **Standard Phase (Rounds $2 \dots \ell$)**
We run the standard linear-time prover on the remaining domain (starting size $N/2$). The cost is the sum of a geometric series.
* $\text{Cost} = 2 \times (N/2 + N/4 + \dots) = 2N$.
* *Total:* $2N \cdot \mathcal{M}_{EE}$.
\begin{equation}
\text{Total Cost} \approx 2N \cdot \mathcal{M}_{BE} + 2N \cdot \mathcal{M}_{EE}
\end{equation}
### Case 2: Two Rounds of SVO ($\ell_0 = 2$)
We extend the SVO phase to the second round, trading cheap operations to halve the expensive ones.
1. **SVO Phase (Rounds 1 & 2)**
* Accumulation: $(3^2 - 1) = 8 \text{ accs} \times (N/4) = 2N$.
* Binding: $1 \text{ fold} \times N = 1N$.
* *Total:* $3N \cdot \mathcal{M}_{BE}$.
2. **Standard Phase (Rounds $3 \dots \ell$)**
The standard prover now starts on a domain of size $N/4$.
* $\text{Cost} = 2 \times (N/4 + N/8 + \dots) = 1N$.
* *Total:* $1N \cdot \mathcal{M}_{EE}$.
\begin{equation}
\text{Total Cost} \approx 3N \cdot \mathcal{M}_{BE} + 1N \cdot \mathcal{M}_{EE}
\end{equation}
### Case 3: Three Rounds of SVO ($\ell_0 = 3$)
We extend SVO to the third round. This is the configuration used in the comparison below: going further trades increasingly more cheap $\mathcal{M}_{BE}$ work for ever smaller $\mathcal{M}_{EE}$ savings (e.g. $\ell_0 = 4$ adds $1.75N$ cheap multiplications to save only $0.25N$ expensive ones).
1. **SVO Phase (Rounds 1, 2 & 3)**
* Accumulation: $(3^3 - 1) = 26 \text{ accs} \times (N/8) = 3.25N$.
* Binding: $1 \text{ fold} \times N = 1N$.
* *Total:* $4.25N \cdot \mathcal{M}_{BE}$.
2. **Standard Phase (Rounds $4 \dots \ell$)**
The standard prover starts on a domain of size $N/8$.
* $\text{Cost} = 2 \times (N/8 + N/16 + \dots) = 0.5N$.
* *Total:* $0.5N \cdot \mathcal{M}_{EE}$.
\begin{equation}
\text{Total Cost} \approx 4.25N \cdot \mathcal{M}_{BE} + 0.5N \cdot \mathcal{M}_{EE}
\end{equation}
*Summary:* By moving from Case 2 to Case 3, we increase the cheap Base Field operations by $1.25N$, but we reduce the expensive Extension Field operations by $0.5N$.
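These coefficients can be reproduced from the general formula with a few lines of Python (illustrative multiplication counting only; `option2_cost` is a hypothetical helper name):

```python
def option2_cost(ell: int, ell0: int) -> tuple[float, float]:
    """Coefficients of N for the batched SVO approach: (M_BE work, M_EE work)."""
    N = 2**ell
    accumulate = (3**ell0 - 1) * (N // 2**ell0)              # SVO accumulators
    bind = N                                                 # one binding pass over p
    standard = 2 * sum(N // 2**i for i in range(ell0, ell))  # domains N/2^ell0, ..., 2
    return (accumulate + bind) / N, standard / N

for ell0 in (1, 2, 3):
    print(ell0, option2_cost(20, ell0))  # approx (2.0, 2.0), (3.0, 1.0), (4.25, 0.5)
```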
## Comparison
The optimal strategy depends on the number of evaluation points $k$.
| Feature | Option 1: Prefix-Suffix | Option 2: Batched SVO |
| :--- | :--- | :--- |
| **Primary Cost** | $2k \cdot N \cdot \mathcal{M}_{BE}$ | $\approx 4.25N \cdot \mathcal{M}_{BE}$ |
| **Space Complexity** | $O(\sqrt{N})$ | $O(N)$ (or stream processing) |
| **Scaling** | Linear with $k$ | Constant with $k$ |
**Conclusion** (comparing the dominant $\mathcal{M}_{BE}$ work):
* **If $k = 1$:** Option 1 is faster ($2N$ vs $4.25N$).
* **If $k = 2$:** Option 1 ($4N$) is slightly faster than Option 2 ($4.25N$).
* **If $k \ge 3$:** Option 2 ($4.25N$) is faster than Option 1 ($6N$ or more).
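A final sanity check of the crossover, counting only the dominant $\mathcal{M}_{BE}$ work with the per-$N$ coefficients derived above:

```python
for k in range(1, 6):
    opt1, opt2 = 2 * k, 4.25  # coefficients of N for the dominant M_BE work
    print(f"k={k}: Option 1 = {opt1}N, Option 2 = {opt2}N -> "
          f"{'Option 1' if opt1 < opt2 else 'Option 2'}")
```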