# Technical Note: Verifier Constraint Checks in WHIR

This document provides a detailed explanation of the constraint checking mechanism within WHIR. We will focus on the verifier's role in a classical, round-by-round folding scenario, with and without the univariate skip optimization.

## Introduction: The Verifier's Role in Checking the Constraints

In any interactive proof system, a **prover** aims to convince a **verifier** that they possess a valid *witness* for a given public *statement*. In WHIR, this statement is expressed as a system of polynomial constraints. The verifier's primary responsibility is to check these constraints efficiently without needing access to the full, often massive, witness.

The core idea is to transform a large number of complex constraints into a single, final check of a simple identity. This is achieved through a multi-round protocol where constraints are recursively "folded" into one another using randomness supplied by the verifier.

Let's use a concrete example of a multilinear polynomial over three variables, $X_0, X_1, X_2$:

\begin{aligned}
f(X_0, X_1, X_2) &= c_0 + c_1X_0 + c_2X_1 + c_3X_2 + c_4X_0X_1 + c_5X_0X_2 \\
&+ c_6X_1X_2 + c_7X_0X_1X_2
\end{aligned}

A typical statement asserts that this polynomial satisfies certain properties, such as evaluating to a specific value $v$ at a point $z$. This forms a constraint, written as $f(z) = v$. The verifier must check whether the prover's claimed polynomial satisfies all such constraints simultaneously.

## Our Goal for this Note: The Final Verification Equation

Before diving into the process, it's helpful to understand the final check the verifier performs. After all rounds are complete, the verifier will have accumulated:

* A final claimed sum, denoted as $S_{final}$. This is the result of folding all constraint claims together across all rounds.
* A final polynomial, $p_{final}(Y)$, a very small polynomial which is the result of recursively folding the initial witness polynomial.
* A complete set of folding challenges, which form a single multilinear point $r = (r_0, r_1, \dots, r_{n-1})$.
* The final sumcheck challenges, denoted $r_{final}$.

The verifier's ultimate goal is to check a single, elegant equation:

$$S_{final} = W(r) \cdot p_{final}(r_{final})$$

Here, $W(r)$ represents the evaluation of a "grand composite constraint polynomial" at the collected folding point $r$. This single equation's validity implies the correctness of the entire proof. The protocol is designed to ensure that if the prover started with a valid witness, this equation will hold. If the prover cheated, it will fail with high probability.

## The Verification Process: A Round-by-Round Breakdown

The verifier's work unfolds over several rounds, mirroring the prover's folding process.

### Initial Setup (Round 0): Aggregating the Initial Claims

The protocol kicks off with the verifier's first major task: to consolidate all initial polynomial constraints into a single, efficiently verifiable claim. These initial constraints are the assertions about the prover's witness polynomial and originate from two distinct sources:

1. **The Public Statement**: These are the explicit constraints that define the computational problem.
2. **Out-of-Domain (OOD) Checks**: To ensure the prover has committed to a genuine low-degree polynomial, the verifier challenges the prover with random points outside the main evaluation domain. Each OOD check $p(z_i) = v_i$ forms an individual constraint.

To merge these $k$ checks into one, the verifier performs a random linear combination. As implemented in the `combine_constraints` function, it draws a single random scalar $\alpha_0$ from the Fiat-Shamir transcript and uses its powers to compute an initial claimed sum, $S_0$:

$$S_0 = \sum_{i=1}^{k} \alpha_0^i \cdot s_i$$

where $s_i$ is the expected sum for the $i$-th constraint.
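The power-of-$\alpha_0$ combination above fits in a few lines. The following is a minimal Python sketch over a toy prime field, not WHIR's actual `combine_constraints` implementation; the modulus `P` and the name `combine_claims` are assumptions for illustration.

```python
# Illustrative sketch: combine k constraint sums with powers of alpha.
# `P` is a toy prime modulus and `combine_claims` a hypothetical name.
P = 2**31 - 1

def combine_claims(alpha, sums):
    """Return S0 = sum_{i=1..k} alpha^i * s_i (mod P); powers start at alpha^1."""
    s0, power = 0, 1
    for s in sums:
        power = (power * alpha) % P  # alpha^i for the i-th constraint
        s0 = (s0 + power * s) % P
    return s0
```

Note that the powers start at $\alpha_0^1$, not $\alpha_0^0$, matching the $i = 1$ lower bound in the sum above.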
This $S_0$ becomes the target value for the first sumcheck, which must now verify the combined identity:

$$\sum_{x \in \{0,1\}^n} p(x) \cdot \left(\sum_{i=1}^{k} \alpha_0^i \cdot w_i(x)\right) = S_0$$

where $w_i(X)$ is the weight polynomial for the $i$-th constraint. The prover then executes the sumcheck protocol. The verifier validates this by:

1. Receiving the prover's sumcheck polynomial, $h_0(Y)$.
2. Checking that $h_0(0) + h_0(1) = S_0$.
3. If the check passes, sampling the first set of folding challenges $r^{(0)}$.

Finally, the verifier updates its claimed sum by evaluating the sumcheck polynomial at the new challenge: $S_1 = h_0(r^{(0)})$. This new value, $S_1$, becomes the folded result that is carried into the next round.

### Intermediate Rounds (Round $j > 0$): Recursive Folding and Checking

For each subsequent round, the process is similar but adds a crucial cross-round consistency check.

1. **Receive New Commitment**: The verifier receives the Merkle root of the prover's new, folded polynomial, $p_j$.
2. **STIR Queries**: The verifier must now check that $p_j$ is a correct folding of the previous polynomial, $p_{j-1}$. It generates random STIR challenges to query $p_{j-1}$ and verifies the prover's supplied Merkle proofs against the *previous* round's commitment. This is handled by the `verify_stir_challenges` function.
3. **Formulate & Combine New Constraints**: The verified openings from the STIR queries, along with any new OOD checks on $p_j$, become the constraints for this round. The verifier again performs a random linear combination with a new challenge $\alpha_j$ to get a combined sum for *this round's* new constraints.
4. **Update Claimed Sum**: This new combined sum is added to the folded sum carried over from the previous round ($S_j$). The target for the current round's sumcheck becomes:
   $$ S_{j, \text{target}} = S_j + \sum_{i=1}^{k'} \alpha_j^i \cdot s'_i $$
   where $s'_i$ are the sums from the new constraints of round $j$.
5. **Sumcheck Verification**: The verifier validates the sumcheck for round $j$, which uses $S_{j, \text{target}}$ as its goal. Upon success, it samples the next set of folding challenges $r^{(j)}$ and calculates the next folded sum $S_{j+1}$ for the subsequent round.

This iterative process continues, with each round verifying the link to the previous one and folding the accumulated claim further.

### The Final Round: Unmasking the Polynomial

In the last round, the process culminates in the final verification.

1. **Receive Final Polynomial**: The folded polynomial, $p_{final}$, is now small enough for the prover to send all its coefficients directly to the verifier.
2. **Final STIR Check**: The verifier performs one last STIR check against the second-to-last commitment. However, instead of creating new constraints, it directly evaluates the now-explicit $p_{final}$ at the challenged points to ensure it's a valid folding of the previous polynomial.
3. **Final Sumcheck**: If required, a final sumcheck is run on $p_{final}$. This updates the claimed sum one last time to yield $S_{final}$ and produces the final challenges, $r_{final}$.
4. **The Constraint Check**: Now, the verifier has all the pieces for the final equation. It evaluates all the constraint polynomials from all rounds at the final evaluation point $r$ (which is the concatenation of all folding challenges $r^{(0)}, r^{(1)}, \dots$). This step, performed by `eval_constraints_poly`, yields the value $W(r)$. The verifier then performs the ultimate check:
   $$ S_{final} \stackrel{?}{=} W(r) \cdot p_{final}(r_{final}) $$

This final check ties together all constraints from every round. If it holds, the verifier is convinced that the prover has been consistent throughout the entire protocol and accepts the proof.
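The per-round sumcheck check described above (check the Boolean sum against the target, then fold at the challenge) can be sketched as follows. This is a Python sketch with the round polynomial given as a coefficient list; the names `eval_poly` and `verify_sumcheck_round` are illustrative, not the WHIR codebase's API, and plain integers stand in for field elements.

```python
def eval_poly(coeffs, y):
    """Evaluate a univariate polynomial (coefficients low to high) via Horner."""
    acc = 0
    for c in reversed(coeffs):
        acc = acc * y + c
    return acc

def verify_sumcheck_round(h_coeffs, target, challenge):
    """Check h(0) + h(1) == target; on success return the folded sum h(challenge)."""
    if eval_poly(h_coeffs, 0) + eval_poly(h_coeffs, 1) != target:
        raise ValueError("sumcheck round rejected")
    return eval_poly(h_coeffs, challenge)
```

For example, with $h_0(Y) = 1 + 2Y + 3Y^2$ we have $h_0(0) + h_0(1) = 1 + 6 = 7$, so a target of $7$ passes, and the returned value $h_0(r^{(0)})$ becomes the next round's carried sum.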
### Details about the Constraint Check

The final step of the verification process is a single equation that synthesizes all the claims made by the prover across every round of the protocol into one definitive check. After the last sumcheck is complete, the verifier performs this grand check. The equation is:

$$S_{final} \stackrel{?}{=} W(r) \cdot p_{final}(r_{final})$$

Let's break down each component to understand its meaning and origin.

* $S_{final}$: This is the final claimed sum. It's the scalar value held by the verifier after the very last round of sumcheck folding. It represents the accumulated, folded value of every constraint claim made throughout the entire proof.
* $p_{final}(r_{final})$: This is the evaluation of the final polynomial. The prover sends the small, fully folded polynomial $p_{final}$ in the clear. The verifier evaluates it at $r_{final}$, the random challenge from the last sumcheck round.
* $W(r)$: This is the evaluation of the grand composite weight polynomial. It is the most intricate part. It is not a single polynomial but the sum of the evaluations of all weight polynomials from all rounds. The evaluation point $r = (r^{(0)}, r^{(1)}, \dots)$ is the concatenation of all folding challenges sampled by the verifier during the entire process.

The logic, implemented in the `eval_constraints_poly` function, is to iterate through the constraints collected in each round `j` and evaluate their respective weight polynomials at the relevant part of the final point `r`. Mathematically, if round $j$ had constraints with weight polynomials $\{w_{j,1}, w_{j,2}, \dots\}$ and a combination challenge $\alpha_j$, its contribution to $W(r)$ is $\sum_i (\alpha_j)^i \cdot w_{j,i}(r_j)$. Here, $r_j$ is the suffix of the full point $r$ that corresponds to the number of variables present in round $j$. The total value $W(r)$ is the sum of these contributions from all rounds.
$$W(r) = \sum_{j=0}^{\text{num\_rounds}-1} \left( \sum_{i=1}^{\text{num\_constraints}_j} (\alpha_j)^i \cdot w_{j,i}(r_j) \right)$$

Here, $r_j$ is the suffix of the full point $r$ that corresponds to the number of variables present in round $j$.

#### An Illustrative Example

Let's use the 3-variable polynomial from the test case, but simplify to two folding rounds.

* **Polynomial**: $p(X_0, X_1, X_2)$
* **Round 0**: Folds 1 variable ($X_0$).
* **Round 1**: Folds 1 variable ($X_1$), leaving a univariate polynomial in $X_2$.

Here's what the verifier collects:

1. **From Round 0**:
   * Initial constraints: $p(z_1) = v_1$ and $p(z_2) = v_2$. Their weight polynomials are:
     - $w_{0,1}(X_0,X_1,X_2) = \text{eq}_{z_1}(X)$
     - $w_{0,2}(X_0,X_1,X_2) = \text{eq}_{z_2}(X)$
   * A combination challenge $\alpha_0$.
   * A folding challenge $r_0$ (for variable $X_0$).
2. **From Round 1**:
   * A new STIR constraint is generated from checking the previous round, giving a weight $w_{1,1}(X_1, X_2)$.
   * A combination challenge $\alpha_1$.
   * A folding challenge $r_1$ (for variable $X_1$).
3. **From the Final Sumcheck (on the univariate polynomial in $X_2$)**:
   * The final folded sum, $S_{final}$.
   * The final polynomial, $p_{final}(X_2)$.
   * The final challenge, $r_{final} = r_2$.

The verifier's full evaluation point is $r = (r_0, r_1, r_2)$. To perform the grand check, the verifier calculates $W(r)$ as follows:

\begin{aligned}
W(r) &= \underbrace{ \left( \alpha_0^1 \cdot w_{0,1}(r_0, r_1, r_2) + \alpha_0^2 \cdot w_{0,2}(r_0, r_1, r_2) \right) }_{\text{Contribution from Round 0 constraints (3 variables)}} \\
&+ \underbrace{ \left( \alpha_1^1 \cdot w_{1,1}(r_1, r_2) \right) }_{\text{Contribution from Round 1 constraints (2 variables)}}
\end{aligned}

Notice how the weight polynomials from round 0 are evaluated at the full point $(r_0, r_1, r_2)$, while the weight from round 1 is evaluated at $(r_1, r_2)$, as variable $X_0$ has been folded away.
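The suffix-evaluation logic above can be sketched in Python. This is an illustrative sketch, not `eval_constraints_poly` itself: plain integers stand in for field elements, the concrete points $z$, challenges $\alpha_j$, and point $r$ are made-up numbers, and the round-1 STIR weight is modeled as an eq polynomial purely for concreteness.

```python
def eq(z, x):
    """Multilinear eq_z(x): equals 1 when x == z on the Boolean hypercube."""
    out = 1
    for zi, xi in zip(z, x):
        out *= zi * xi + (1 - zi) * (1 - xi)
    return out

def eval_W(rounds, r):
    """Sum each round's alpha^i-weighted constraint evaluations, taking the
    suffix of r whose length matches that round's number of variables."""
    total = 0
    for alpha, weights, num_vars in rounds:
        suffix = r[len(r) - num_vars:]
        power = 1
        for w in weights:
            power *= alpha          # alpha^i, starting at alpha^1
            total += power * w(suffix)
    return total

# Two rounds mirroring the example: round 0 has two eq constraints over all
# 3 variables; round 1 has one weight over the last 2 variables.
rounds = [
    (2, [lambda p: eq((0, 1, 0), p), lambda p: eq((1, 1, 1), p)], 3),
    (3, [lambda p: eq((1, 0), p)], 2),
]
W_r = eval_W(rounds, (2, 3, 4))
```

In a real verifier this value would then be multiplied by $p_{final}(r_{final})$ and compared against $S_{final}$.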
The verifier then checks if its internally computed $S_{final}$ matches the product of the two values it can now compute: $W(r)$ and $p_{final}(r_2)$.

This final check works because the sumcheck protocol guarantees an equivalence between operating on the sums (the left side of the equation, folded into $S_{final}$) and operating on the polynomials (the right side, folded into $p_{final}$ and evaluated). This check confirms that this equivalence has held true across every step of the proof.

### Details about the Constraint Check with Univariate Skip

The **univariate skip** is an optimization that makes the protocol more efficient. Instead of folding variables one-by-one, it folds a block of $k$ variables in a single, more powerful sumcheck round. This changes how the verifier works, especially during the final constraint check.

#### How the Protocol Changes

* **A Different First Round**: The prover sends a single, high-degree univariate polynomial, $h_{skip}(Y)$, instead of $k$ separate quadratic ones. This one polynomial collapses the first $k$ variables all at once.
* **Modified Sumcheck Verification**: The verifier's check for the first round is different. Instead of a simple $h(0) + h(1) = S_0$, it must verify the prover's claim over a larger univariate domain (like a multiplicative subgroup).
* **A Single Challenge**: The verifier samples only a **single** folding challenge, $r_{skip}$, for the first $k$ variables.

#### The Impact on the Final Constraint Check

The main challenge with the univariate skip is a "dimension mismatch." The verifier gets one challenge, $r_{skip}$, but the initial weight polynomials, like $w_{0,i}(X_0, \dots, X_{n-1})$, are multilinear and need $k$ separate inputs for the first few variables. How can it evaluate a function that expects multiple coordinates from a single value?

The solution is a **deterministic and publicly known polynomial mapping** that bridges the univariate and multilinear domains.
The univariate skip works by replacing a sum over the $k$-dimensional **Boolean hypercube** $\{0,1\}^k$ with an equivalent sum over a 1D **multiplicative subgroup** $H$. This is possible thanks to a mapping $M: H \to \{0,1\}^k$ that creates a one-to-one correspondence between points in the subgroup and points in the hypercube. This map is then extended into a vector of $k$ low-degree univariate polynomials, $M(Y) = (M_0(Y), \dots, M_{k-1}(Y))$, which both the prover and verifier know.

When the verifier gets the single challenge $r_{skip}$, it uses this mapping to generate the multilinear coordinates it needs:

$$b_{skip} = M(r_{skip}) = (M_0(r_{skip}), M_1(r_{skip}), \dots, M_{k-1}(r_{skip}))$$

It then constructs the full evaluation point $r$ by combining this result with challenges from later rounds. If the subsequent challenges are $(r_k, \dots, r_{n-1})$, the full point becomes:

$$r = (M_0(r_{skip}), \dots, M_{k-1}(r_{skip}), r_k, \dots, r_{n-1})$$

The verifier uses this correctly constructed point $r$ to evaluate all the initial weight polynomials.

#### An Illustrative Example with Univariate Skip

Let's adapt our 3-variable example to use this mapping-based univariate skip.

* **Polynomial**: $p(X_0, X_1, X_2)$
* **Round 0 (Univariate Skip)**: Folds $k=2$ variables ($X_0, X_1$) at once. The protocol uses a public mapping polynomial $M(Y) = (M_0(Y), M_1(Y))$ that relates a subgroup of size 4 to the hypercube $\{0,1\}^2$.
* **Round 1**: Folds the last variable ($X_2$).

Here's what the verifier collects:

1. **From Round 0**:
   * Initial constraints for $p(z_1) = v_1$ and $p(z_2) = v_2$, with weight polynomials $w_{0,1}(X_0, X_1, X_2)$ and $w_{0,2}(X_0, X_1, X_2)$.
   * A combination challenge $\alpha_0$.
   * A single folding challenge, $r_{skip}$.
2. **From Round 1**:
   * A new STIR constraint with weight $w_{1,1}(X_2)$.
   * A combination challenge $\alpha_1$.
   * A folding challenge $r_2$.
3. **From the Final Check**:
   * The final folded sum, $S_{final}$.
   * The final (now constant) polynomial, $p_{final}$.

To perform the grand check, the verifier first constructs the full 3-variable evaluation point $r$ by applying the mapping $M$ to $r_{skip}$:

$$r = (\underbrace{M_0(r_{skip}), M_1(r_{skip})}_{\text{Mapped from } X_0, X_1}, \underbrace{r_2}_{\text{from } X_2})$$

Now it can calculate $W(r)$ by plugging the coordinates of this point into the weight polynomials from each round:

\begin{aligned}
W(r) &= \underbrace{ \left( \alpha_0^1 \cdot w_{0,1}(M_0(r_{skip}), M_1(r_{skip}), r_2) + \alpha_0^2 \cdot w_{0,2}(M_0(r_{skip}), M_1(r_{skip}), r_2) \right) }_{\text{Contribution from Round 0 constraints (3 variables)}} \\
&+ \underbrace{ \left( \alpha_1^1 \cdot w_{1,1}(r_2) \right) }_{\text{Contribution from Round 1 constraints (1 variable)}}
\end{aligned}

This mapping elegantly resolves the dimension mismatch, ensuring the evaluation of the initial constraints is perfectly consistent with the algebra of the univariate skip. This allows the final verification equation to hold true.

### How the Mapping Works: An Example with $k=2$

The best way to understand the mapping is to build it from scratch. Let's say the univariate skip is folding $k=2$ variables, $X_0$ and $X_1$. The goal is to create a bridge between two different domains:

1. **The Multilinear Domain**: This is the Boolean hypercube $\{0,1\}^2$. It has $2^2 = 4$ points: $(0,0)$, $(0,1)$, $(1,0)$, and $(1,1)$.
2. **The Univariate Domain**: This is a multiplicative subgroup $H$ of size 4 in our field. Let's say $w$ is a 4th root of unity. The points are $H = \{w^0, w^1, w^2, w^3\}$, where $w^0 = 1$.

The mapping polynomial, $M(Y) = (M_0(Y), M_1(Y))$, must translate any point from the univariate domain into the corresponding point in the multilinear domain.

#### Step 1: Define the Correspondence

First, we establish a clear, one-to-one correspondence between the points in the subgroup $H$ and the hypercube $\{0,1\}^2$.
The standard way is to use the binary representation of the exponent of $w$:

| Univariate Point (from $H$) | Exponent (binary) | Mapped Multilinear Point (in $\{0,1\}^2$) |
| :-------------------------- | :---------------- | :---------------------------------------- |
| $w^0 = 1$                   | `0 = 00`          | `(0,0)`                                   |
| $w^1$                       | `1 = 01`          | `(0,1)`                                   |
| $w^2$                       | `2 = 10`          | `(1,0)`                                   |
| $w^3$                       | `3 = 11`          | `(1,1)`                                   |

Now, we need to find two polynomials, $M_0(Y)$ and $M_1(Y)$, that satisfy this table. For instance, when we plug in $Y = w^2$, we must get $M_0(w^2) = 1$ and $M_1(w^2) = 0$.

#### Step 2: Use Lagrange Basis Polynomials as "Switches"

An important ingredient is a set of Lagrange basis polynomials. Think of these as "on/off" switches for each point in our subgroup $H$. For our subgroup $H = \{w^0, w^1, w^2, w^3\}$, we have four Lagrange polynomials: $L_0(Y), L_1(Y), L_2(Y), L_3(Y)$. Each one has a special property:

* $L_i(Y)$ evaluates to $1$ if $Y = w^i$.
* $L_i(Y)$ evaluates to $0$ if $Y$ is any other point in the subgroup (e.g., $L_0(w^1) = 0$, $L_0(w^2) = 0$, etc.).

#### Step 3: Construct the Mapping Polynomials

We build $M_0(Y)$ and $M_1(Y)$ by combining these Lagrange "switches."

**Constructing $M_0(Y)$ (for the first coordinate):** We look at the first coordinate of the mapped points in our table. We want a polynomial that outputs `1` only when the input is $w^2$ or $w^3$. So, we "turn on" the switches for $L_2(Y)$ and $L_3(Y)$:

$$M_0(Y) = 0 \cdot L_0(Y) + 0 \cdot L_1(Y) + 1 \cdot L_2(Y) + 1 \cdot L_3(Y) = L_2(Y) + L_3(Y)$$

**Constructing $M_1(Y)$ (for the second coordinate):** We look at the second coordinate. We want a polynomial that outputs `1` only for inputs $w^1$ and $w^3$. So, we turn on the switches for $L_1(Y)$ and $L_3(Y)$:

$$M_1(Y) = 0 \cdot L_0(Y) + 1 \cdot L_1(Y) + 0 \cdot L_2(Y) + 1 \cdot L_3(Y) = L_1(Y) + L_3(Y)$$

These two polynomials, $M_0(Y)$ and $M_1(Y)$, are public and known to everyone.
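The construction in Steps 2 and 3 can be checked concretely. The following Python sketch uses the toy field $\mathbb{F}_5$, where $w = 2$ has order 4; the tiny modulus is purely an assumption for illustration (a real deployment works over a large cryptographic field).

```python
P = 5                                     # toy field F_5 (illustrative only)
W_GEN = 2                                 # 2^4 = 16 = 1 (mod 5), so order 4
H = [pow(W_GEN, i, P) for i in range(4)]  # the subgroup {w^0, w^1, w^2, w^3}

def lagrange(i, y):
    """L_i(y): the 'switch' that is 1 at H[i] and 0 at every other point of H."""
    num, den = 1, 1
    for j, hj in enumerate(H):
        if j != i:
            num = num * (y - hj) % P
            den = den * (H[i] - hj) % P
    return num * pow(den, -1, P) % P      # pow(den, -1, P): modular inverse

def M(y):
    """The mapping: M0 = L2 + L3 (first coordinate), M1 = L1 + L3 (second)."""
    return ((lagrange(2, y) + lagrange(3, y)) % P,
            (lagrange(1, y) + lagrange(3, y)) % P)
```

Evaluating `M` on the four subgroup points reproduces the table: $M(w^0) = (0,0)$, $M(w^1) = (0,1)$, $M(w^2) = (1,0)$, $M(w^3) = (1,1)$.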
#### Step 4: How the Verifier Uses the Map

Now, we can see how this solves the verifier's problem.

1. The verifier runs the univariate sumcheck and gets the single challenge $r_{skip}$. This challenge is just a random element from the field.
2. It needs the corresponding two coordinates $(r_0, r_1)$ to check the initial constraints.
3. It simply evaluates the public mapping polynomials at that point:
   * $r_0 = M_0(r_{skip})$
   * $r_1 = M_1(r_{skip})$

This gives the verifier the exact multilinear point it needs. Because the mapping is defined by polynomials, it works perfectly even for a random challenge $r_{skip}$ that isn't in the original subgroup $H$. This ensures the algebraic relationship between the two domains remains sound throughout the protocol.
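To see the out-of-subgroup behavior concretely, here is a self-contained Python sketch over the toy field $\mathbb{F}_{17}$, where $w = 4$ generates a subgroup of order 4. (In $\mathbb{F}_5$ every nonzero element already lies in $H$, so a slightly larger toy field is needed to pick a challenge outside the subgroup.) The modulus, generator, and sample challenge are all assumptions for illustration.

```python
P = 17                                    # toy field F_17 (illustrative only)
W_GEN = 4                                 # 4^2 = 16, 4^4 = 1 (mod 17): order 4
H = [pow(W_GEN, i, P) for i in range(4)]  # [1, 4, 16, 13]

def lagrange(i, y):
    """L_i(y): 1 at H[i], 0 at the other subgroup points."""
    num, den = 1, 1
    for j, hj in enumerate(H):
        if j != i:
            num = num * (y - hj) % P
            den = den * (H[i] - hj) % P
    return num * pow(den, -1, P) % P

def M(y):
    """M0 = L2 + L3, M1 = L1 + L3, exactly as constructed in Step 3."""
    return ((lagrange(2, y) + lagrange(3, y)) % P,
            (lagrange(1, y) + lagrange(3, y)) % P)

# On subgroup points the map lands exactly on the hypercube...
table = [M(h) for h in H]

# ...while a challenge outside H (say r_skip = 7) still yields a well-defined
# pair of field elements, which are generally no longer bits:
r0, r1 = M(7)
```

The pair `(r0, r1)` is then used as the first two coordinates of the full evaluation point $r$, exactly as in the final constraint check above.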