# Batched Sumcheck explanation ## Lagrange Polynomial basics For $0 \leq i < 2^n$, define it's bit representation as $(i_0,\ldots, i_{n-1})$ such that $$i = \sum_{k=0}^{n-1} i_k \cdot 2^{n-k-1}$$ Note that for $n' < n$, if $i <2^{n'}$ then its bit representation will look like $(0,\ldots,0,i_{n'-n},\ldots, i_{n-1})$, with $n-n'$ zeros in front. The Lagrange polynomials over $\{0,1\}^n$ are indexed by $0\leq i<2^n$, where \begin{aligned} L_i(X_0,\ldots,X_{n-1}) &= \mathsf{eq}(i_0,\ldots,i_{n-1},X_0,\ldots,X_{n-1}) \\ &= \mathsf{eq}(i_0,\ldots,i_{n-n'-1},X_0,\ldots,X_{n-n'-1}) \cdot \mathsf{eq}(i_{n-n'},\ldots,i_{n-1},X_{n-n'},\ldots,X_{n-1}) \end{aligned} The Lagrange polynomials where $0 \leq i < 2^{n'}$ can be factored as \begin{aligned} L_i(X_0,\ldots,X_{n-1}) &= \mathsf{eq}(i_0,\ldots,i_{n-n'-1},X_0,\ldots,X_{n-n'-1}) \cdot \mathsf{eq}(i_{n-n'},\ldots,i_{n-1},X_{n-n'},\ldots,X_{n-1}) \\ &= \mathsf{eq}(0,\ldots,0,X_0,\ldots,X_{n-n'-1}) \cdot \mathsf{eq}(i_{n-n'},\ldots,i_{n-1},X_{n-n'},\ldots,X_{n-1}) \\ &= \Bigg(\prod_{k=0}^{n-n'-1} (1-X_k) \Bigg) \cdot \Bigg(\prod_{k=n-n'}^{n-1} \big((1-i_k)(1-X_k) + i_k\cdot X_k\big) \Bigg) \\ &= L_0(X_0,\ldots,X_{n-n'-1})\cdot L_i(X_{n-n'}, \ldots, X_{n-1}) \end{aligned} Here, we use the convention that a Lagrange polynomial in $m$ variables is indexed over all $0\leq i < 2^m$. The multi-linear extension of a polynomial over the $n$-th boolean hypercube is $$P(X_0,\ldots,X_{n-1}) = \sum_{i=0}^{2^n-1} P_i \cdot L_i(X_0,\ldots,X_{n-1})$$ If the polynomial only only has non-zero values at the first $2^{n'}$ evaluation points, then the factorization of the Lagrange polynomials still applies. \begin{aligned} P(X_0,\ldots,X_{n-1}) &= \sum_{i=0}^{2^n-1} P_i \cdot L_i(X_0,\ldots,X_{n-1}) \\ &= \sum_{i=0}^{2^{n'}-1} P_i \cdot L_i(X_0,\ldots,X_{n-1}) \\ &= \sum_{i=0}^{2^{n'}-1} P_i \cdot L_0(X_0,\ldots,X_{n-n'-1})\cdot L_i(X_{n-n'}, \ldots, X_{n-1}) \\ &= L_0(X_0,\ldots,X_{n-n'-1})\cdot \sum_{i=0}^{2^{n'}-1} P_i \cdot L_i(X_{n-n'}, \ldots, X_{n-1}) \\ \end{aligned} An important fact about Lagrange polynomials, is that they must sum to 1: $$1 = \sum_{i=0}^{2^m-1} L_i(X_0,\ldots,X_{m-1})$$ If we have a polynomial over $n'<n$ variables $$P'(X_0,\ldots, X_{n'-1}) = \sum_{i=0}^{2^{n'}-1} P'_i \cdot L_i(X_0,\ldots, X_{n'-1})$$ The canonical way to "lift" this polynomial to $n$ variables, is to consider \begin{aligned} P(X_0,\ldots,X_{n-1}) &= P'(X_{n-n'-1},\ldots, X_{n-1}) \\ &= \sum_{i=0}^{2^{n'}-1} P'_i \cdot L_i(X_{n-n'-1},\ldots, X_{n-1}) \\ &= \sum_{i=0}^{2^{n}-1} P_i \cdot L_i(X_{0},\ldots, X_{n-1}) \end{aligned} Let's see how the list of evaluations $\{P_i\}_{i=0}^{2^n-1}$ relates to the initial ones $\{P'_i\}_{i=0}^{2^{n'}-1}$. For any $0\leq i' < 2^{n'}$ and $0\leq j < 2^{n-n'}$, we can write the index $0\leq i < 2^{n}$ as $i = i' + j\cdot 2^{n'}$, and it's bit decomposition is $(j_0,\ldots,j_{n-n'-1},i'_0,\ldots,i'_{n'-1})$. The $i$-th evaluation of $P$ is equal to the $i'$-th evaluation of $P'$: \begin{aligned} P_i &= P_{i' + j\cdot 2^{n'}}\\ &= P(j_0,\ldots,j_{n-n'-1},i'_0,\ldots,i'_{n'-1}) \\ &= P'(i'_0,\ldots,i'_{n'-1})\\ &= P'_{i'} \end{aligned} In other words, the list of evaluations of $P$ is equal to $2^{n-n'}$ repetitions of the list of evaluations of $P'$: $$P_i = P'_{i \bmod 2^{n'}}$$ ## Commitment We have two a combination function $F: \mathbb{F} \times \mathbb{F} \rightarrow \mathbb{F}$ and two sets of two MLE polynomials $\big(P_1, Q_1\big)$ $\big(P_2, Q_2\big)$ with $2^{n_1}$ and $2^{n_2}$ evaluations respectively. Let $n=\max\{n_1,n_2\}$. Given $P_b$ (respectively $Q_b$) with $2^{n_b}$ evaluations ($b=1,2$), we interpret it as a polynomial with $n$ variables, such that when we list out its evaluations in the "canonical" order, the $2^{n_b}$ non-zero evaluations are first. We can define it as: $$P'_b(X_0,\ldots,X_{n-1}) = L_0(X_0,\ldots, X_{n-n_b-1})P_b(X_{n-n_b},\ldots,X_{n-1})$$ In order to open both pairs of polynomials using the same PCS opening argument, we need to commit to $P'_b$, obtained by computing the MSM with the first $2^{n_b}$ group elements in the commitment key. The interpretation of $P'_b$ allows us to avoid allocating zeros at the end of the list of evaluations of $P_b$. ## Sumcheck In the context of Sumcheck, we use a different interpretation of $P_b$ to avoid padding. We want to batch the claims $$\sigma_b = \sum_{x \in \{0,1\}^{n_b}} F\big(P_b(x),Q_b(x)\big)$$ using a random linear combination challenge $\gamma$. In this context we run it on the polynomials $$P''_b(X_0,\ldots,X_{n-1}) = P_b(X_{n-n_b},\ldots,X_{n-1})$$ This representation is equivalent to taking the list of evaluations of $P_b$, and repeating them $2^{n-n_b}$ times such that $P''_b$ is of length $2^{n}$. Again, we want to avoid having to allocate that much space, especially for repeated values, so we need to "imagine" these repetitions instead. Sumcheck works partially evaluating the claims in the challenge variables $r = (r_0,\ldots,r_{n-1})$. If we instead apply the same Sumcheck combination on the repeated polynomials, we notice that the sum we get is the original one, scaled by $2^{n-n_b}$. \begin{aligned} \sum_{x\in\{0,1\}^n} F\big( P''_b(x) , Q''_b(x)\big) &= \sum_{y\in\{0,1\}^{n-n_b}} \sum_{x\in\{0,1\}^{n_b}} F\big( P_b(x) , Q_b(x) \big) \\ &= \sum_{y\in\{0,1\}^{n-n_b}} \sigma_b \\ &= 2^{n-n_b}\cdot \sigma_b \end{aligned} In rounds $0\leq i < n-n_b$, the prover computes the univariate polynomial for this claim, given by: \begin{aligned} S^{(i)}_b(X_i) &= \sum_{x\in\{0,1\}^{n-i-1}} F\big( P''_b(r_0,\ldots, r_{i-1},X_i, x) , Q''_b(r_0,\ldots, r_{i-1},X_i, x)\big) \\ &= \sum_{y\in\{0,1\}^{n-i-1-n_b}} \sum_{x\in\{0,1\}^{n_b}} F\big( P_b(x) , Q_b(x) \big) \\ &= 2^{n-i-1-n_b} \cdot \sigma_b \end{aligned} This polynomial is constant, and is equal to the initial claim scaled by $2^{n-i-1-n_b}$. When we bind the polynomial to $r_i$, it does not affect our list of evaluations, since $$P''_b(r_0,\ldots, r_{i-1},r_i, X_{i+1},\ldots, X_{n-1}) = P_b(X_{n-n_b},\ldots, X_{n-1})$$ For rounds $n-n_b \leq i < n$, the protocol will start evaluating the "active" variables of $P_b$, and the Sumcheck prover simply continues as previously, without having to do any scaling. During the protocol, the prover will have sent the univariate polynomials $S^{(i)}(X_i) = S^{(i)}_1(X_i) + \gamma \cdot S^{(i)}_2(X_i)$ and at the end, sends the evaluations $$p''_b = P''_b(r_0,\ldots,r_{n-1})=P_b(r_{n-n_b},\ldots,r_{n-1}), \quad q''_b = Q''_b(r_0,\ldots,r_{n-1})=Q_b(r_{n-n_b},\ldots,r_{n-1}).$$ The verifier having processed this data through the transcript, does the following: - Set $e^{(0)} = 2^{n-n_1}\cdot \sigma_1 + \gamma \cdot 2^{n-n_2}\cdot \sigma_2$ - For each $i = 0,1,\ldots,n-1$, - Check $e^{(i)} = S^{(i)}(0)+S^{(i)}(1)$ - Set $e^{(i+1)} = S^{(i)}(r_i)$ - Check $e^{(n)} = F(p''_1, q''_1) + \gamma \cdot F(p''_2, q''_2)$ The parties now run a PCS argument to check the evaluations $p''_b, q''_b$ ## PCS The evaluations returned by the prover are for the polynomials $P''_b,Q''_b$. Note however that we have committed to the polynomials $P',Q'$ as described in the first section. We can transform the evaluation $u''_b$ for $P''$ at $(r_0,\ldots,r_{n-1})$, into an evaluation $u'_b$ for $P'$ at the same point, by setting \begin{aligned} u'_b &= P'(r_0,\ldots,r_{n-1}) = L_0(r_0,\ldots,r_{n-n_b-1})\cdot P(r_{n-n_b},\ldots, r_{n-1}) \\ &= P'(r_0,\ldots,r_{n-1}) = L_0(r_0,\ldots,r_{n-n_b-1})\cdot u''_b \\ &= u''_b \cdot \prod_{i=0}^{n-n_b-1}\big(1-r_i\big). \end{aligned} This simply involves rescaling the original Sumcheck evaluation by the first Lagrange polynomial, evaluated in the "missing" first variables of $P''$. We can then batch all polynomial evaluation instances into a single instance of size $n$.