# Perspectives on Sumcheck batching
We assume the reader is familiar with the multivariate sumcheck protocol and the linear-time prover algorithms for multilinear composite polynomials.
The sumcheck protocol takes a $\mu$-variate polynomial $f(X_0, ..., X_{\mu-1})$ and a claimed hypercube sum $s$ as the statement. Writing $B_\mu = \{0, 1\}^\mu$ for the boolean hypercube, the reduction concludes that, with high probability, $s = \sum_{v \in B_\mu} f(\{v\})$ if $f(r_0, ..., r_{\mu-1}) = z$ for values $r_0, ..., r_{\mu-1}, z$ output by the verification protocol.
## Terminology
- A _composition_ polynomial is a multivariate polynomial, assumed to be of low degree. For example, $\mathcal{C}(A, B, C) = ABC$.
- A _multilinear composite_ polynomial is a composition composed with multilinear polynomials in the same number of variables substituted for each indeterminate in the composition. For example, given three $\mu$-variate multilinears $p_0(X_0, ..., X_{\mu - 1}), p_1(...), p_2(...)$, we could define the composite $f(X_0, ..., X_{\mu-1}) = \mathcal{C}(p_0(X_0, ..., X_{\mu-1}), p_1(X_0, ..., X_{\mu-1}), p_2(X_0, ..., X_{\mu-1}))$.
- **whp**: with high probability.
## Sumcheck "endianness"
The sumcheck protocol has a round-by-round structure. We can choose to specialize variables in either low-to-high or high-to-low index order. Mathematically there is no difference.
In implementation, we assume that our multilinears are represented in memory by their hypercube evaluations (i.e., multilinear Lagrange basis coefficients) in "little-endian" order. For example, for a 2-variate multilinear, the coefficients are stored in a vector $p(0, 0), p(1, 0), p(0, 1), p(1, 1)$.
We choose to run the sumcheck protocol specializing variables low-to-high, as that corresponds to a memory access pattern that processes coefficients in streaming fashion. (As a slight aside, specializing high-to-low has some advantages for SIMD parallelism, but the disadvantage in terms of spatial locality outweighs them.)
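To make the access pattern concrete, here is a minimal sketch of specializing the low variable over a toy prime field; the modulus `MOD` and the helper name `fold_low_variable` are illustrative, not taken from any particular implementation.

```python
# Minimal sketch: specialize X_0 := r for a multilinear stored in
# little-endian hypercube-evaluation order. MOD is a toy prime modulus
# chosen for illustration only.
MOD = 2**31 - 1

def fold_low_variable(evals, r):
    """In little-endian layout the two evaluations differing only in X_0 are
    adjacent, so specializing X_0 is a single streaming pass over memory."""
    out = []
    for k in range(0, len(evals), 2):
        lo, hi = evals[k], evals[k + 1]          # p(0, v) and p(1, v)
        out.append((lo + r * (hi - lo)) % MOD)   # (1 - r) * lo + r * hi
    return out

# 2-variate example stored as [p(0,0), p(1,0), p(0,1), p(1,1)]
p = [3, 5, 7, 11]
print(fold_low_variable(p, r=10))  # evaluations [p(10, 0), p(10, 1)]
```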
## Batching with the same number of variables
Let's say we have $n$ $\mu$-variate polynomials $f_0, ..., f_{n-1}$ and their claimed hypercube sums $s_0, ..., s_{n-1}$.
Protocol: the verifier observes the statement and then samples $n$ random challenges $c_0, ..., c_{n-1}$. The prover and verifier engage in a sumcheck for the polynomial $\sum_{i=0}^{n-1} c_i f_i$ and claimed sum $\sum_{i=0}^{n-1} c_i s_i$.
Soundness: apply the Schwartz-Zippel lemma to the degree-1 polynomial in the challenges $c_0, ..., c_{n-1}$ whose coefficients are $\sum_{v \in B_\mu} f_i(\{v\}) - s_i$; if any individual claim is false, the batched claim is false whp.
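As a sanity check, here is a small sketch of the mixing step, assuming each $f_i$ is given by its hypercube evaluations and the claimed sums are honest; the modulus and function name are illustrative.

```python
# Sketch of batching n same-size sum claims with random mixing coefficients.
# Assumes each f_i is represented by its hypercube evaluations; MOD and the
# function name are illustrative.
import random

MOD = 2**31 - 1

def batch_claims(evals_list, sums, challenges):
    """Return the hypercube evaluations and claimed sum of sum_i c_i * f_i."""
    n_points = len(evals_list[0])
    mixed_evals = [
        sum(c * evs[j] for c, evs in zip(challenges, evals_list)) % MOD
        for j in range(n_points)
    ]
    mixed_sum = sum(c * s for c, s in zip(challenges, sums)) % MOD
    return mixed_evals, mixed_sum

f0, f1 = [1, 2, 3, 4], [5, 6, 7, 8]            # hypercube evaluations
s0, s1 = sum(f0) % MOD, sum(f1) % MOD          # honest claimed sums
cs = [random.randrange(MOD) for _ in range(2)] # verifier challenges
mixed, claim = batch_claims([f0, f1], [s0, s1], cs)
assert sum(mixed) % MOD == claim               # a single sumcheck now suffices
```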
### Structured coefficients
Instead of fully random mixing coefficients, we can sample a single challenge $c$ and mix with the powers $c^0, ..., c^{n-1}$. Soundness follows from the Schwartz-Zippel lemma for univariate polynomials.
Similarly, when $n$ is a power of two, we can sample $c_0, ..., c_{\log n - 1}$ and mix with their tensor product expansion. Soundness follows from the Schwartz-Zippel lemma for multilinear polynomials.
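For reference, a small sketch of both structured choices, assuming $n$ is a power of two for the tensor variant; the helper names and modulus are illustrative.

```python
# Sketch of the two structured mixing-coefficient choices. MOD and the helper
# names are illustrative; the tensor variant assumes n is a power of two.
MOD = 2**31 - 1

def power_coeffs(c, n):
    """Mixing coefficients c^0, c^1, ..., c^(n-1)."""
    out, acc = [], 1
    for _ in range(n):
        out.append(acc)
        acc = acc * c % MOD
    return out

def tensor_coeffs(challenges):
    """Tensor product expansion of (1 - c_j, c_j) over all challenges, giving
    2^len(challenges) coefficients in little-endian order."""
    out = [1]
    for c in challenges:
        out = [x * (1 - c) % MOD for x in out] + [x * c % MOD for x in out]
    return out

print(power_coeffs(3, 4))     # [1, 3, 9, 27]
print(tensor_coeffs([2, 5]))  # 4 coefficients, enough to batch n = 4 claims
```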
## Goals of batching
1. Reduce the amount of data that the prover sends in each round and the amount of verifier randomness. This reduces Fiat-Shamir costs in the non-interactive setting.
2. Deduplicate the reduced claims by reducing to multilinear evaluation claims at the same points. For example, consider when $f_0$ and $f_1$ are composites that share an underlying multilinear polynomial $p$. If $f_0$ and $f_1$ are handled by different sumchecks, we would need to prove the reduced claims on $p(r_0, ..., r_{\mu-1})$ and $p(r'_0, ..., r'_{\mu-1})$ for separate evaluation points $r$ and $r'$. The batched protocol outputs a single multilinear evaluation claim for $p$.
## Batching with different numbers of variables
Now we want to handle the case of batching when not all hypercube domains are the same size.
Consider the case when we have $f_0(X_0, ..., X_{\mu + \eta - 1})$ and $f_1(X_0, ..., X_{\mu - 1})$ for some integer $\eta > 0$. Run separately, the sumcheck for $f_0$ would be a $\mu + \eta$-round protocol and the sumcheck for $f_1$ would be a $\mu$-round protocol. How can we batch these?
### Padding variables
Let's define a $(\mu + \eta)$-variate polynomial $f'$ that takes the values of $f$ on exactly one $\mu$-dimensional subcube and the value 0 at every other hypercube vertex. Specifically,
$$
f'(X_0, ..., X_{\mu + \eta - 1}) = f(X_0, ..., X_{\mu - 1}) \prod_{i=\mu}^{\mu + \eta - 1} X_i
$$
Observe that $f'(\{v\}, \{1\}^\eta) = f(\{v\})$ and $f'(\{v\}, \{u\}) = 0$ for all $v \in B_\mu$ and $u \in B_\eta \setminus \{1\}^\eta$. Clearly $\sum_{v \in B_\mu} f(\{v\}) = \sum_{v \in B_{\mu + \eta}} f'(\{v\})$.
This shows that, given a sumcheck claim over the $\mu$-variate hypercube, we can pad it with arbitrarily many extra variables to obtain an equivalent claim for a sumcheck with more rounds.
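In the little-endian layout from earlier, padding is just placing the original evaluations in the final block of the longer vector. A minimal sketch, with the helper name `pad_high` chosen here for illustration:

```python
# Sketch of padding a mu-variate claim up to (mu + eta) variables.
# Assumes the little-endian evaluation layout described earlier.
def pad_high(evals, eta):
    """Evaluations of f'(X) = f(X_0, ..., X_{mu-1}) * prod_{i >= mu} X_i.
    The original values land in the final block, where all eta padding
    coordinates equal 1; every other entry is 0."""
    return [0] * (len(evals) * (2**eta - 1)) + list(evals)

f = [1, 2, 3, 4]          # a 2-variate multilinear on the hypercube
fp = pad_high(f, eta=1)   # 3-variate: [0, 0, 0, 0, 1, 2, 3, 4]
assert sum(fp) == sum(f)  # the padded claim has the same hypercube sum
```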
Now let's look at how the proving and verification algorithms for this padded polynomial look. Let $\mathcal{P}, \mathcal{V}$ be the prover and verifier algorithms for the $\mu$-round sumcheck on $f$. We want to construct algorithms $\mathcal{P}', \mathcal{V}'$ for the $(\mu + \eta)$-round sumcheck on $f'$, invoking $\mathcal{P}, \mathcal{V}$ as inner routines round-by-round.
Write $\text{deg}_{X_i}(h)$ for the individual degree of a function $h$ in the variable $X_i$. We have $\text{deg}_{X_i}(f') = \text{deg}_{X_i}(f)$ for all $i \in \{0, ..., \mu - 1\}$ and $\text{deg}_{X_i}(f') = 1$ for all $i \in \{\mu, ..., \mu + \eta - 1\}$.
We can see that $\mathcal{P}'$'s message in the first $\mu$ rounds is identical to $\mathcal{P}$'s message for those rounds. Consider that in each round $i$ from $0$ to $\mu - 1$, $\mathcal{P}'$ needs to send
\begin{align*}
p'(Y) &= \sum_{v \in B_{\mu + \eta - i - 1}} f'(r_0, ..., r_{i-1}, Y, \{v\}) \\
&= \sum_{v \in B_{\mu - i - 1}} f(r_0, ..., r_{i-1}, Y, \{v\})
\end{align*}
where the second equality holds because $f'$ vanishes whenever any of the last $\eta$ coordinates is 0, so only the terms with those coordinates all equal to 1 survive.
In each round $i$ from $\mu$ to $\mu + \eta - 1$, $\mathcal{P}'$ must send the line $p'(Y) = z Y \prod_{j=\mu}^{i-1} r_j$, where $z = f(r_0, ..., r_{\mu - 1})$ is one of the final outputs of $\mathcal{P}$.
The verifier logic is equally simple. $\mathcal{V}'$ proceeds like a regular sumcheck verifier, and at the end its check of the final claim $z' = f'(r_0, ..., r_{\mu + \eta - 1})$ becomes the check $z' = f(r_0, ..., r_{\mu - 1}) \prod_{i=\mu}^{\mu + \eta - 1} r_i$, i.e. a scaled evaluation claim on $f$.
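Concretely, the extra $\eta$ rounds cost the prover almost nothing: each message is a line through the origin determined by $z$ and the padding challenges seen so far. A minimal sketch under the same toy-field assumptions, with illustrative names:

```python
# Sketch of the extra eta rounds of P', assuming the inner prover P has
# already output z = f(r_0, ..., r_{mu-1}). Names and MOD are illustrative.
MOD = 2**31 - 1

def padded_round_coeff(z, padding_challenges):
    """Round polynomial p'(Y) = (z * prod of prior padding challenges) * Y,
    returned as its single Y-coefficient."""
    coeff = z
    for r in padding_challenges:
        coeff = coeff * r % MOD
    return coeff

# Example with eta = 2:
z, r_mu = 42, 7
print(padded_round_coeff(z, []))      # round mu:     p'(Y) = z * Y
print(padded_round_coeff(z, [r_mu]))  # round mu + 1: p'(Y) = z * r_mu * Y
# The final reduced claim is z' = f(r_0, ..., r_{mu-1}) * r_mu * r_{mu+1}.
```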
## Front-loaded batch sumcheck
Now let's return to the question of batching sumchecks over different numbers of variables. Crucially, now that we have the ability to pad $f$ with more variables, we can do regular sumcheck batching on $f_0$ and $f_1'$, after padding $f_1$. This will reduce our multiple sumcheck claims to multilinear evaluation claims about $f_0(r_0, ..., r_{\mu + \eta - 1})$ and $f_1(r_0, ..., r_{\mu - 1})$.
We call this a _front-loaded_ batch sumcheck because the evaluation points of the differently-sized multilinears share the same low-index variables. That is, all of the evaluation points are prefixes of one another. This property turns out to be very important for the FRI-Binius/Basefold PCS, but that's a story for another note.
## Back-loaded batch sumcheck
In other situations it is desirable to do a _back-loaded_ batch sumcheck, where the evaluation points share the same high-index variables and are thus all suffixes of one another.
Mathematically, this is easy: we just rearrange the indices of the padding variables. Instead of padding $f$ to $f'$ as we did above, define $f''(X_0, ..., X_{\mu + \eta - 1}) = \left( \prod_{i=0}^{\eta-1} X_i \right) f(X_\eta, ..., X_{\mu + \eta - 1})$.
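In the little-endian layout, the back-loaded padding interleaves zeros instead of prepending them: each original value now sits at the top of a small block. A minimal sketch, with the helper name `pad_low` chosen for illustration:

```python
# Sketch of the back-loaded padding f'' in the little-endian layout, where the
# eta padding variables occupy the *low* indices. Names are illustrative.
def pad_low(evals, eta):
    """Evaluations of f''(X) = (prod_{i < eta} X_i) * f(X_eta, ..., X_{mu+eta-1}).
    Each original value lands at the last position of its block of 2^eta
    entries, i.e. where all low (padding) coordinates equal 1."""
    block = 2**eta
    out = [0] * (len(evals) * block)
    for j, val in enumerate(evals):
        out[j * block + block - 1] = val
    return out

f = [1, 2, 3, 4]
print(pad_low(f, eta=1))                 # [0, 1, 0, 2, 0, 3, 0, 4]
assert sum(pad_low(f, eta=1)) == sum(f)  # same hypercube sum as f
```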
Consider the prover algorithm $\mathcal{P}''$. In each round $i$ from $0$ to $\eta - 1$, the prover must send the line
\begin{align*}
p''(Y) &= (r_0 \cdot ... \cdot r_{i-1}) Y \sum_{v \in B_\mu} f(\{v\}) \\
&= (r_0 \cdot ... \cdot r_{i-1}) s Y
\end{align*}
In each round $i$ from $\eta$ to $\eta + \mu - 1$, $\mathcal{P}''$ sends $\left( \prod_{j=0}^{\eta-1} r_j \right) p_{i-\eta}(Y)$, where $p_{i-\eta}(Y)$ is the round message that $\mathcal{P}$ would have sent in its round $i - \eta$, run with the challenges $r_\eta, ..., r_{i-1}$.
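To close the loop, here is a minimal sketch of $\mathcal{P}''$'s two phases under the same toy-field assumptions; the helper names are chosen for illustration, and the inner prover's round messages are passed in as plain coefficient lists.

```python
# Sketch of P'''s two phases. MOD and the helper names are illustrative; the
# inner prover P's round messages are represented as coefficient lists.
MOD = 2**31 - 1

def early_round_coeff(s, padding_challenges):
    """Rounds 0..eta-1: p''(Y) = (r_0 * ... * r_{i-1}) * s * Y; returns the
    Y-coefficient given the padding challenges received so far."""
    coeff = s
    for r in padding_challenges:
        coeff = coeff * r % MOD
    return coeff

def late_round_message(inner_coeffs, padding_challenges):
    """Rounds eta..eta+mu-1: scale P's round message by prod_{j < eta} r_j."""
    scale = 1
    for r in padding_challenges:
        scale = scale * r % MOD
    return [c * scale % MOD for c in inner_coeffs]

# Example with eta = 1: round 0 sends s * Y, later rounds scale P's messages.
s, r0 = 10, 3
print(early_round_coeff(s, []))             # 10, i.e. p''(Y) = 10 * Y
print(late_round_message([4, 5, 6], [r0]))  # [12, 15, 18]
```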