Notes on SamaritanPCS

# Notes on SamaritanPCS :::success [Kurt Pan](https://kurtpan.xyz) @ [ZKPunk](https://zkpunk.pro) ::: ## Overview [SamaritanPCS](https://eprint.iacr.org/2025/419.pdf) transforms a **homomorphic univariate PCS** into a **multilinear PCS** with linear-time proving. The key innovation is the **divide-and-conquer evaluation proof** that achieves $O(n)$ prover complexity instead of $O(n \log n)$. --- ## The Decomposition Idea The core insight is to split a large multilinear polynomial evaluation into evaluations of smaller polynomials. ### Setup Given: - A multilinear polynomial $\widetilde{f}$ over $\mu$ variables where $n = 2^\mu$ - We split $\mu = \nu + \kappa$ where $n = \ell \cdot m$ with $\ell = 2^\nu$ and $m = 2^\kappa$ - Variables partitioned as $\mathbf{z} = (\mathbf{z}_y, \mathbf{z}_x)$ where $\mathbf{z}_y \in \mathbb{F}^\nu$ and $\mathbf{z}_x \in \mathbb{F}^\kappa$ ### Key Identity The multilinear polynomial can be decomposed as: $$ \widetilde{f}(\mathbf{x}, \mathbf{y}) = \sum_{i=1}^{\ell} \widetilde{eq}(\langle i \rangle, \mathbf{y}) \cdot \widetilde{f}(\mathbf{x}, \langle i \rangle) $$ This means we can express the evaluation in terms of $\ell$ smaller multilinear polynomials $\widetilde{g}_i(\mathbf{x}) = \widetilde{f}(\mathbf{x}, \langle i \rangle)$, each over only $\kappa$ variables instead of $\mu$. :::info **Intuition**: Instead of proving one big evaluation directly, we break it into $\ell$ smaller evaluations and then aggregate them using the $\widetilde{eq}$ function as weights. ::: --- ## Concrete Example: $\mu = 2$, split as $\nu = 1, \kappa = 1$ Let's walk through a minimal example with $n = 4$ total variables. ### Step 1: Define the multilinear polynomial Suppose we have $\widetilde{f}(X_1, X_2)$ with coefficient vector $\mathbf{f} = (f_0, f_1, f_2, f_3)$ where: - $f_0 = 5, f_1 = 3, f_2 = 7, f_3 = 2$ The multilinear polynomial is: $$ \widetilde{f}(X_1, X_2) = 5 \cdot (1-X_1)(1-X_2) + 3 \cdot X_1(1-X_2) + 7 \cdot (1-X_1)X_2 + 2 \cdot X_1 X_2 $$ Via the isomorphism $\varphi$, the univariate version is: $$ \hat{f}(X) = 5 + 3X + 7X^2 + 2X^3 $$ ```python # SageMath code to define the polynomials F = GF(101) # Working over finite field F_101 R.<X1, X2> = PolynomialRing(F) # Define multilinear polynomial from coefficient vector coeffs = [5, 3, 7, 2] f_tilde = (coeffs[0] * (1-X1)*(1-X2) + coeffs[1] * X1*(1-X2) + coeffs[2] * (1-X1)*X2 + coeffs[3] * X1*X2) print("Multilinear polynomial f̃(X₁, X₂):") print(f_tilde) # Define the univariate version via isomorphism φ R_uni.<X> = PolynomialRing(F) f_hat = sum(coeffs[i] * X^i for i in range(4)) print("\nUnivariate polynomial f̂(X):") print(f_hat) ``` ### Step 2: Decompose using base-$m$ representation With $m = 2$ and $\ell = 2$, we write $\hat{f}(X)$ in base $X^2$: $$ \hat{f}(X) = \underbrace{(5 + 3X)}_{\hat{g}_1(X)} + X^2 \cdot \underbrace{(7 + 2X)}_{\hat{g}_2(X)} $$ These correspond to multilinear polynomials: - $\widetilde{g}_1(X_1) = \widetilde{f}(X_1, 0) = 5(1-X_1) + 3X_1$ - $\widetilde{g}_2(X_1) = \widetilde{f}(X_1, 1) = 7(1-X_1) + 2X_1$ With coefficient vectors: - $\mathbf{g}_1 = (5, 3)$ - $\mathbf{g}_2 = (7, 2)$ ```python # Decompose f̂(X) in base X^m where m = 2 m = 2 ell = 2 # Extract coefficients for g_1 and g_2 g1_coeffs = [coeffs[0], coeffs[1]] # Coefficients of X^0, X^1 g2_coeffs = [coeffs[2], coeffs[3]] # Coefficients of X^2, X^3 (shifted down) # Define univariate g_i polynomials g1_hat = g1_coeffs[0] + g1_coeffs[1] * X g2_hat = g2_coeffs[0] + g2_coeffs[1] * X print("Decomposition in base X^m:") print(f"ĝ₁(X) = {g1_hat}") print(f"ĝ₂(X) = {g2_hat}") # Verify the decomposition f_reconstructed = g1_hat + X^m * g2_hat print(f"\nVerification: ĝ₁(X) + X² · ĝ₂(X) = {f_reconstructed}") print(f"Original f̂(X) = {f_hat}") print(f"Match: {f_reconstructed == f_hat}") # Define multilinear versions R_single.<X1> = PolynomialRing(F) g1_tilde = g1_coeffs[0] * (1-X1) + g1_coeffs[1] * X1 g2_tilde = g2_coeffs[0] * (1-X1) + g2_coeffs[1] * X1 print(f"\nMultilinear versions:") print(f"g̃₁(X₁) = {g1_tilde}") print(f"g̃₂(X₁) = {g2_tilde}") ``` :::warning **Key Observation**: The decomposition $\hat{f}(X) = \hat{g}_1(X) + X^m \hat{g}_2(X) + \cdots$ is **unique** given the degree constraints on each $\hat{g}_i(X)$. ::: ### Step 3: Evaluation claim Suppose we want to prove $\widetilde{f}(z_2, z_1) = v$ for some evaluation point $\mathbf{z} = (z_2, z_1)$. Let's use $z_2 = 1$ and $z_1 = 0$ (typical Boolean hypercube points). First, let's compute the actual value: $$ \widetilde{f}(0, 1) = 5 \cdot (1-0)(1-1) + 3 \cdot 0 \cdot (1-1) + 7 \cdot (1-0) \cdot 1 + 2 \cdot 0 \cdot 1 = 7 $$ The claim decomposes as: $$ \widetilde{f}(0, 1) = \widetilde{eq}(\langle 1 \rangle, 1) \cdot \widetilde{g}_1(0) + \widetilde{eq}(\langle 2 \rangle, 1) \cdot \widetilde{g}_2(0) $$ Where: - $\widetilde{eq}(\langle 1 \rangle, 1) = \widetilde{eq}(0, 1) = (1-0)(1-1) + 0 \cdot 1 = 0$ - $\widetilde{eq}(\langle 2 \rangle, 1) = \widetilde{eq}(1, 1) = (1-1)(1-1) + 1 \cdot 1 = 1$ This means only $\widetilde{g}_2$ contributes to the final value (since its weight is 1). ```python # Define the eq̃ function for single variable def eq_tilde(a, b): """Multilinear extension of equality: eq̃(a,b) = (1-a)(1-b) + ab""" return (1-a)*(1-b) + a*b # Evaluation point z1 = F(0) # X₁ coordinate z2 = F(1) # X₂ coordinate # Compute actual evaluation v_actual = f_tilde(z1, z2) print(f"Actual evaluation: f̃({z1}, {z2}) = {v_actual}") # Compute evaluations of g_i at z₁ v1 = g1_tilde(z1) v2 = g2_tilde(z1) print(f"\nSub-evaluations:") print(f"v₁ = g̃₁({z1}) = {v1}") print(f"v₂ = g̃₂({z1}) = {v2}") # Compute eq̃ weights for aggregation # <1> represents bit string 0, <2> represents bit string 1 weight1 = eq_tilde(F(0), z2) # eq̃(<1>, z₂) = eq̃(0, 1) weight2 = eq_tilde(F(1), z2) # eq̃(<2>, z₂) = eq̃(1, 1) print(f"\neq̃ weights:") print(f"eq̃(⟨1⟩, {z2}) = eq̃(0, {z2}) = {weight1}") print(f"eq̃(⟨2⟩, {z2}) = eq̃(1, {z2}) = {weight2}") # Verify decomposition v_reconstructed = weight1 * v1 + weight2 * v2 print(f"\nAggregated value: {weight1} · {v1} + {weight2} · {v2} = {v_reconstructed}") print(f"Matches actual: {v_reconstructed == v_actual}") ``` --- ## The Evaluation Proof Protocol Now we describe exactly how the prover convinces the verifier of the evaluation claim. ### Prover's Actions **Step 1**: Compute and send commitments to the smaller polynomials - Compute $\mathrm{cm}_1 \leftarrow \mathrm{uPC.Com}(\hat{g}_1(X))$ - Compute $\mathrm{cm}_2 \leftarrow \mathrm{uPC.Com}(\hat{g}_2(X))$ - Send $(\mathrm{cm}_1, \mathrm{cm}_2)$ to the verifier **Step 2**: Compute evaluations at $\mathbf{z}_x$ - Compute $v_1 = \widetilde{g}_1(z_1) = \widetilde{g}_1(0) = 5(1-0) + 3 \cdot 0 = 5$ - Compute $v_2 = \widetilde{g}_2(z_1) = \widetilde{g}_2(0) = 7(1-0) + 2 \cdot 0 = 7$ - Send $(v_1, v_2) = (5, 7)$ to the verifier ```python # Prover computes the evaluations print("Prover's computations:") print(f"Step 1: Commit to ĝ₁(X) and ĝ₂(X)") print(f" (In practice, use KZG: cm₁ = [ĝ₁(τ)]₁, cm₂ = [ĝ₂(τ)]₁)") print(f"\nStep 2: Evaluate at z_x = {z1}") print(f" v₁ = g̃₁({z1}) = {v1}") print(f" v₂ = g̃₂({z1}) = {v2}") print(f" Send (v₁, v₂) = ({v1}, {v2}) to verifier") ``` ### Verifier's Checks The verifier must verify three things: #### Check 1: Correct Decomposition **Claim**: The polynomials $\hat{g}_1(X), \hat{g}_2(X)$ correctly decompose $\hat{f}(X)$. **How**: The verifier checks that: $$ \hat{f}(X) = \hat{g}_1(X) + X^m \hat{g}_2(X) $$ This is done using the homomorphic property of the commitment scheme. The verifier can check: $$ \mathrm{cm}_f \stackrel{?}{=} \mathrm{cm}_1 \cdot \mathrm{cm}_2^{X^m} $$ In KZG, if $\mathrm{cm}_f = [\hat{f}(\tau)]_1$ and $\mathrm{cm}_i = [\hat{g}_i(\tau)]_1$, then this check becomes: $$ [\hat{f}(\tau)]_1 \stackrel{?}{=} [\hat{g}_1(\tau)]_1 + \tau^m \cdot [\hat{g}_2(\tau)]_1 $$ ```python # Simulate the homomorphic check at a random point # In KZG, this would be at the trusted setup point τ tau = F.random_element() # Simulating the secret τ print(f"\nCheck 1: Correct Decomposition") print(f"Verify: f̂(τ) = ĝ₁(τ) + τ^m · ĝ₂(τ)") # Evaluate polynomials at τ f_at_tau = f_hat(tau) g1_at_tau = g1_hat(tau) g2_at_tau = g2_hat(tau) # Check the homomorphic relation lhs = f_at_tau rhs = g1_at_tau + tau^m * g2_at_tau print(f" LHS: f̂(τ) = {lhs}") print(f" RHS: ĝ₁(τ) + τ² · ĝ₂(τ) = {rhs}") print(f" Check passes: {lhs == rhs}") ``` :::info **Why this works**: The commitment scheme is homomorphic, so we can perform polynomial operations directly on commitments. The uniqueness of the decomposition ensures no prover can cheat here. ::: #### Check 2: Correct Evaluations **Claim**: Each $v_i = \widetilde{g}_i(\mathbf{z}_x)$ is correct. **How**: For each $i \in [\ell]$, the prover provides an evaluation proof for the claim $\hat{g}_i(z_1) = v_i$ using the underlying univariate PCS. In our example: - Prove $\hat{g}_1(0) = 5$ with opening proof for $\mathrm{cm}_1$ - Prove $\hat{g}_2(0) = 7$ with opening proof for $\mathrm{cm}_2$ Since we're evaluating at $X = 0$, these proofs are particularly simple (just verify the constant term). **Optimization**: These $\ell$ proofs can be batched into a single proof using random linear combination. ```python print(f"\nCheck 2: Correct Evaluations") print(f"Verify: ĝ₁({z1}) = {v1} and ĝ₂({z1}) = {v2}") # Direct evaluation check g1_eval = g1_hat(z1) g2_eval = g2_hat(z1) print(f" ĝ₁({z1}) = {g1_eval}, claimed v₁ = {v1}, match: {g1_eval == v1}") print(f" ĝ₂({z1}) = {g2_eval}, claimed v₂ = {v2}, match: {g2_eval == v2}") # Batching with random linear combination gamma = F.random_element() print(f"\nOptimization: Batch with random γ = {gamma}") # Batched polynomial P(X) = γ^0 · g₁(X) + γ^1 · g₂(X) P_batched = g1_hat + gamma * g2_hat v_batched = v1 + gamma * v2 print(f" P(X) = ĝ₁(X) + γ · ĝ₂(X)") print(f" P({z1}) = {P_batched(z1)}") print(f" v₁ + γ · v₂ = {v_batched}") print(f" Batched check passes: {P_batched(z1) == v_batched}") ``` #### Check 3: Correct Aggregation **Claim**: The final evaluation $v$ correctly aggregates the sub-evaluations. **How**: The verifier computes: $$ v^* = \sum_{i=1}^{\ell} \widetilde{eq}(\langle i \rangle, \mathbf{z}_y) \cdot v_i $$ and checks that $v^* = v$. In our example: $$ v^* = 0 \cdot 5 + 1 \cdot 7 = 7 $$ The verifier checks that the claimed value $v = 7$. ```python print(f"\nCheck 3: Correct Aggregation") print(f"Verify: v = Σᵢ eq̃(⟨i⟩, z_y) · vᵢ") # Verifier computes the aggregation v_star = weight1 * v1 + weight2 * v2 print(f" v* = {weight1} · {v1} + {weight2} · {v2}") print(f" v* = {v_star}") print(f" Claimed v = {v_actual}") print(f" Final check passes: {v_star == v_actual}") ``` :::success **Verification complete**: If all three checks pass, the verifier is convinced that $\widetilde{f}(\mathbf{z}) = 7$. ::: --- ## Why This Achieves $O(n)$ Complexity ### The Recursive Structure The key to linear complexity is that we can **recursively apply** this decomposition. When proving $\widetilde{g}_i(\mathbf{z}_x)$ for each of the $\ell$ smaller polynomials: - Each $\widetilde{g}_i$ is a $\kappa$-variate polynomial (instead of $\mu$-variate) - We can recursively decompose each again if $\kappa$ is still large - Continue until we reach base case (small enough to prove directly) ### Complexity Analysis Let $T(n)$ be the time to prove an evaluation for an $n$-dimensional hypercube. **Recurrence**: $$ T(n) = \ell \cdot T(m) + O(n) $$ Where: - We make $\ell$ recursive calls on problems of size $m$ (where $n = \ell \cdot m$) - The $O(n)$ term is for computing the decomposition and aggregation **Solution**: When $\ell = m = \sqrt{n}$, we get: $$ T(n) = \sqrt{n} \cdot T(\sqrt{n}) + O(n) $$ By the Master theorem, this solves to $T(n) = O(n)$. ```python # Demonstrate the recursive structure for larger example def compute_complexity(n, ell=None, m=None): """Compute the number of operations for recursive decomposition""" if n <= 2: # Base case return 1 if ell is None: ell = int(n**0.5) # √n split m = n // ell # T(n) = ℓ · T(m) + O(n) recursive_cost = ell * compute_complexity(m, ell, m) decomposition_cost = n return recursive_cost + decomposition_cost print("\nComplexity for different sizes:") for size in [4, 16, 64, 256]: ops = compute_complexity(size) print(f" n = {size:3d}: T(n) = {ops:5d} operations (vs n·log n = {size * int(size).bit_length():5d})") ``` :::info **Comparison with naive approach**: - Basic protocol: $O(n \log n)$ from computing $\hat{f}(X) \cdot \hat{\Psi}(X; \mathbf{z})$ via FFT-style multiplication - Divide-and-conquer: $O(n)$ by breaking into smaller independent subproblems ::: --- ## Summary Table | Component | Purpose | Complexity | |-----------|---------|------------| | Decompose $\hat{f}$ into $\hat{g}_i$ | Split problem into $\ell$ subproblems | $O(n)$ | | Commit to each $\hat{g}_i$ | Create verifiable commitments | $O(\ell \cdot m) = O(n)$ | | Prove each $\widetilde{g}_i(\mathbf{z}_x) = v_i$ | Recursive evaluation proofs | $\ell \cdot T(m)$ | | Aggregate via $\widetilde{eq}$ weights | Combine subresults | $O(\ell) = O(\sqrt{n})$ | | **Total** | | **$O(n)$** | --- ## Additional Optimizations [The paper](https://eprint.iacr.org/2025/419.pdf) further optimizes the constant factors by: 1. **Batching the $\ell$ evaluation proofs** using random linear combinations (reduces from $\ell$ proofs to 1 proof) 2. **Using bivariate KZG** to relate $\hat{f}$ and the batched $\hat{P}(X) = \sum_{i=1}^{\ell} \gamma^{i-1} \hat{g}_i(X)$ via a unique bivariate polynomial $Q$ 3. **Exploiting KZG-specific properties** to minimize concrete proof size These optimizations maintain the asymptotic $O(n)$ prover time while achieving $O(\log n)$ communication (proof size + commitments).