# Batch Opening in WHIR
Let us consider $\mathsf{c_a}, \mathsf{c_b}$, the commitments to polynomials $f_a, f_b$. The commitments happen in sequence, so we assume that at commitment time we do not know whether the degree $d_b$ of $f_b$ will be higher than the degree $d_a$ of $f_a$. We want to design an opening protocol that takes the commitments and two opening points $\mathbf{z_a}, \mathbf{z_b}$ chosen by the verifier, and proves in a *single proof* that
$$
\begin{align*}
f_a(\mathbf{z_a})=v_a \\
f_b(\mathbf{z_b})=v_b
\end{align*}
$$
with $\mathbf{z_i}$ being a vector whose size equals the number of variables of $f_i$, and $v_i \in \mathbb{F}$ the claimed evaluation. Throughout, $f_a$ is a multilinear polynomial in $m_a$ variables and $f_b$ a multilinear polynomial in $m_b$ variables.
## Single Polynomial Commit
Let us first recap the commitment and opening phase for a single multilinear polynomial $f$ of $m$ variables (no batching). In the commitment phase, we use a Merkle-based scheme.
1. We evaluate $f$ on the boolean hypercube of size $2^{m}$ and store the result as an `EvaluationsList<F>`. `EvaluationsList` has, by construction, a length that is a power of 2, and the evaluations are ordered _lexicographically_ with respect to the points of the hypercube: index `i` corresponds to the hypercube point whose binary representation is `i`.
Lexicographic order makes folding operations inside the sumcheck protocol highly efficient: when compressing the **first** variable $X$ of a polynomial with a challenge, all the evaluations of the polynomial at hypercube points with $X=0$ sit in the left half of the `EvaluationsList`, while the right half holds all the evaluations with $X=1$.
```
For example: A multilinear polynomial f(x₀, x₁, x₂, x₃, x₄) stored as 32 evaluations (2⁵) over the Boolean hypercube {0,1}⁵:
[f₀, f₁, f₂, f₃, ..., f₃₁]
lexicographical order: index 5 = binary 00101 = point (0, 0, 1, 0, 1).
Therefore f₅ = f(0,0,1,0,1).
[ f(0, 0..0), f(0, 0..1), ..., f(0, 1..1) | f(1, 0..0), f(1, 0..1), ..., f(1, 1..1) ]
└────────── Left Half (f(0, x')) ─────────┘ └────────── Right Half (f(1, x')) ─────┘
```
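To see the efficiency gain concretely, here is a minimal sketch of folding the first variable (our own illustrative code over a toy 31-bit prime field, not the whir-p3 API): since $f(r, \mathbf{x}') = (1-r)\cdot f(0,\mathbf{x}') + r\cdot f(1,\mathbf{x}')$, the fold is a single pass over the two contiguous halves.
```
// Toy Mersenne prime field, for illustration only.
const P: u64 = (1 << 31) - 1;

/// Fold the first variable of a lexicographically ordered evaluation list:
/// the left half holds f(0, x'), the right half holds f(1, x').
fn fold_first_variable(evals: &[u64], r: u64) -> Vec<u64> {
    let (f0, f1) = evals.split_at(evals.len() / 2);
    f0.iter()
        .zip(f1)
        // f(r, x') = (1 - r) * f(0, x') + r * f(1, x')  (mod P)
        .map(|(&a, &b)| ((1 + P - r) % P * a % P + r * b % P) % P)
        .collect()
}
```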
2. We re-shape these evaluations into a matrix. The width of the matrix is $2^{m-k_0}$, where $k_0$ is the _folding factor_ at round 0: the number of rounds of the initial sumcheck. The rows of this matrix are the ones that get combined together during the initial sumcheck rounds.
```
With m=5 and k₀=2 (width=2³=8, height=4):
col 0 col 1 col 2 col 3 col 4 col 5 col 6 col 7
row 0 f₀ f₁ f₂ f₃ f₄ f₅ f₆ f₇
row 1 f₈ f₉ f₁₀ f₁₁ f₁₂ f₁₃ f₁₄ f₁₅
row 2 f₁₆ f₁₇ f₁₈ f₁₉ f₂₀ f₂₁ f₂₂ f₂₃
row 3 f₂₄ f₂₅ f₂₆ f₂₇ f₂₈ f₂₉ f₃₀ f₃₁
Each row can be viewed as all the evaluations in the hypercube of a multilinear
polynomial where the first k₀ variables are fixed.
row 0 : f(0,0,*)
row 1 : f(0,1,*)
row 2 : f(1,0,*)
row 3 : f(1,1,*)
```
3. We transpose the matrix and pad it with zeros to the *target height* to add redundancy; this is the Reed-Solomon encoding step: by padding with zeros (and subsequently applying the DFT) we reach the desired codeword length. The target height is $2^{m + \log_2(1/\rho) - k_0}$, where $\rho$ is the rate of the Reed-Solomon code. Transposing is necessary because `dft_batch` computes the DFT of each column of an input matrix, and since we picked lexicographic order for `EvaluationsList`, the sub-polynomials we want to encode currently sit in rows rather than columns.
```
With rate ρ = 1/2, target height = 2^(5+1-2) = 16:
col 0 col 1 col 2 col 3
row 0 f₀ f₈ f₁₆ f₂₄
row 1 f₁ f₉ f₁₇ f₂₅
row 2 f₂ f₁₀ f₁₈ f₂₆
row 3 f₃ f₁₁ f₁₉ f₂₇
row 4 f₄ f₁₂ f₂₀ f₂₈
row 5 f₅ f₁₃ f₂₁ f₂₉
row 6 f₆ f₁₄ f₂₂ f₃₀
row 7 f₇ f₁₅ f₂₃ f₃₁
row 8 0 0 0 0
row 9 0 0 0 0
⋮ ⋮ ⋮ ⋮ ⋮
row 15 0 0 0 0
```
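As a reference, a minimal sketch of this step (our own illustrative code, not the whir-p3 API): since the columns of the transposed matrix are exactly the rows of the reshaped matrix, we can produce the padded columns directly, ready for the column-wise DFT.
```
/// Each row of the reshaped matrix becomes a column, zero-padded to the
/// target height 2^(m + log2(1/rho) - k0) (illustrative names, toy u64 values).
fn transposed_padded_columns(evals: &[u64], width: usize, target_height: usize) -> Vec<Vec<u64>> {
    evals
        .chunks(width) // row j of the reshaped matrix = column j after transposing
        .map(|row| {
            let mut col = row.to_vec();
            col.resize(target_height, 0); // Reed-Solomon zero-padding before the DFT
            col
        })
        .collect()
}
```
With $m=5$, $k_0=2$, $\rho=1/2$ this yields $2^{k_0}=4$ columns of length 16, matching the matrix above.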
4. Perform DFT on the padded evaluation matrix.
```
Before DFT:                                After DFT:
col j: [f₈ⱼ, f₈ⱼ₊₁, ..., f₈ⱼ₊₇,            col j: [DFT₀, DFT₁, ..., DFT₁₅]
        0, 0, ..., 0]
```
5. Finally, we commit to the matrix by building a Merkle tree from the transformed matrix rows: `MerkleTreeMmcs.commit` hashes each row first and then constructs a Merkle tree from those hashes. `MerkleTreeMmcs.commit` is a Merkle-tree-based commitment scheme for multiple matrices of potentially differing heights. In this case, we pass the rows of the matrix after the DFT as leaves (1 leaf = 1 row), so the number of leaves is the height of the DFT matrix. During verification, WHIR queries reveal entire rows, and this row-based structure allows efficient authentication-path generation for the query pattern.
6. After committing, we perform Out-of-Domain (OOD) sampling: the verifier samples a point outside the boolean hypercube and the prover answers with the claimed evaluation, which (with high probability) binds the commitment to a unique low-degree polynomial.
## Single Polynomial Opening
For WHIR the general claim is of the form:
$$
\sum_{\mathbf{b}\in\{0,1\}^m}\hat{w}(\hat{f}(\mathbf{b}), \mathbf{b})= \sigma
$$
The evaluation of a multilinear polynomial $\hat{f}$ at $\mathbf{z} \in \mathbb{F}^m$ corresponds to the sumcheck query with weight polynomial
$$
\hat{w}(Z,\mathbf{X})=Z\cdot\mathsf{eq}(\mathbf{X},\mathbf{z})
$$
And $\hat{f}(\mathbf{z})=\sigma$.
with:
$$
\text{eq}(\mathbf{x}, \mathbf{z}) = \prod_{i=1}^{m} \left( x_i z_i + (1 - x_i)(1 - z_i) \right)
$$
acting as the Lagrange basis over the boolean hypercube. After commitment, the prover must convince the verifier that the committed polynomial satisfies certain constraints. The protocol proceeds in rounds.
Equivalently, the evaluation of a univariate polynomial can be converted into the evaluation of a multilinear one: for $z\in \mathbb{F}$, use the evaluation point $\mathbf{z}=(z^{2^{0}}, ..., z^{2^{m-1}})$, so that the multilinear evaluation at $\mathbf{z}$ equals the univariate evaluation at $z$.
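As a concrete reference, $\mathsf{eq}$ can be evaluated at a pair of points in $O(m)$ field operations; a minimal sketch over a toy 31-bit prime field (illustrative only, not the whir-p3 API):
```
const P: u64 = (1 << 31) - 1; // toy prime field, for illustration only

/// eq(x, z) = prod_i ( x_i * z_i + (1 - x_i) * (1 - z_i) )  (mod P)
fn eq_poly(x: &[u64], z: &[u64]) -> u64 {
    assert_eq!(x.len(), z.len());
    x.iter().zip(z).fold(1, |acc, (&xi, &zi)| {
        let term = (xi * zi % P + (1 + P - xi) % P * ((1 + P - zi) % P) % P) % P;
        acc * term % P
    })
}
```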
### Initial Sumcheck Phase
Given the claim:
$$
\sum_{\mathbf{b}\in\{0,1\}^m}\hat{w}(\hat{f}(\mathbf{b}), \mathbf{b})= \sigma
$$
The prover runs $k_0$ sumcheck rounds, reducing this claim to one that depends on $\hat{f}$ partially evaluated at the folding randomness $\mathbf{r} = (r_0, r_1, \ldots, r_{k_0-1}) \in \mathbb{F}^{k_0}$.
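Concretely, after the $k_0$ rounds the outstanding claim has the form
$$
\sum_{\mathbf{b}\in\{0,1\}^{m-k_0}}\hat{w}(\hat{f}(\mathbf{r},\mathbf{b}), (\mathbf{r},\mathbf{b}))= \sigma'
$$
where $\sigma'=h_{k_0-1}(r_{k_0-1})$ is the last sumcheck polynomial evaluated at the last challenge.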
### WHIR Rounds (Folding)
Each round:
1. Commit to folded polynomial: The prover commits to the folded polynomial using the same Merkle-based scheme (transpose → pad → DFT → Merkle tree).
2. Out-of-Domain (OOD) sampling: Sample random points outside the hypercube and provide evaluations. This guarantees the committed polynomial is unique.
3. Proof-of-Work (grinding)
4. STIR Queries: The verifier samples random indices:
`let stir_challenges_indexes = get_challenge_stir_queries(...);`
5. Merkle Opening: For each query index, the prover opens the corresponding row of the committed matrix with a Merkle proof:
`let commitment = mmcs.open_batch(*challenge, &prover_data);`
6. Each opened row contains $2^{k_{\text{next}}}$ field elements (the "folded" evaluations).
7. Fold the opened values: The verifier (and prover) evaluates the opened polynomial at the folding randomness:
```
let eval = evals.evaluate_hypercube(&round_state.folding_randomness);
stir_statement.add_constraint(var, eval);
```
8. Combine constraints: OOD evaluations and STIR query results become new constraints for the next sumcheck:
```
let constraint = Constraint::new(
challenger.sample_algebra_element(), // combination randomness
ood_statement,
stir_statement,
);
```
9. Run sumcheck: Reduce the combined constraint to a new folding randomness point.
10. Update state: Domain size shrinks, new Merkle tree becomes "previous" for next round.
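To make step 7 concrete: evaluating an opened row (the $2^{k}$ evaluations of a $k$-variable multilinear polynomial) at the folding randomness amounts to folding one variable at a time. A hedged sketch of that computation (toy prime field, illustrative code; the actual `evaluate_hypercube` differs in types and generality):
```
const P: u64 = (1 << 31) - 1; // toy prime field, for illustration only

/// Evaluate a multilinear polynomial, given lexicographically ordered
/// evaluations over {0,1}^k, at an arbitrary point in F^k.
fn evaluate_hypercube(mut evals: Vec<u64>, point: &[u64]) -> u64 {
    assert_eq!(evals.len(), 1 << point.len());
    for &r in point {
        let half = evals.len() / 2;
        evals = (0..half)
            // fold the current first variable: (1 - r) * f(0, ·) + r * f(1, ·)
            .map(|i| ((1 + P - r) % P * evals[i] % P + r * evals[half + i] % P) % P)
            .collect();
    }
    evals[0]
}
```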
### Final Round
When the polynomial is small enough:
1. Send polynomial directly: Instead of committing, send all evaluations:
```
proof.final_poly = Some(round_state.sumcheck_prover.evals.clone());
```
2. Final queries: Verify consistency with the previous commitment.
3. Final sumcheck: Run remaining sumcheck rounds on the small polynomial.
## Two Polynomials Commit
As mentioned in the intro, we are going to consider the case in which a prover commits to two polynomials at different times.
1. Commit to $f_a \rightarrow \mathsf{cm}_a$ (_as described above_).
2. Commit to $f_b \rightarrow \mathsf{cm}_b$.
3. Aggregation: The "Batch Commitment" is simply the list of roots $[\mathsf{cm}_a, \mathsf{cm}_b]$. Alternatively, you can hash these roots together into a `GlobalRoot`.
## Two Polynomials Opening
We want to prove $f_a(\mathbf{z_a})=v_a$ and $f_b(\mathbf{z_b})=v_b$ simultaneously. Let us define a virtual polynomial $f_c$ which combines the two. To make the math work cleanly for batching, we treat $f_a$ and $f_b$ as sharing the same variable domain (conceptually padding the smaller polynomial with dummy variables if needed, though we handle the size difference explicitly later).
### Selector variable approach
Let $X$ be a new selector variable. We define:
$$
f_c(X, x_1, ..., x_m)= Xf_a(x_1, ..., x_m)+(1-X)\tilde{f_b}(x_1, ..., x_{m})
$$
Where $\tilde{f_b}$ extends $f_b$ by zero-padding to match the number of variables of $f_a$. The combining randomness $\alpha$, sampled by the verifier, enters through the weight in the sum below rather than through $f_c$ itself (otherwise $\alpha$ would be counted twice).
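A hedged sketch of how the prover can lay out $f_c$'s evaluations (illustrative, toy u64 values, not the whir-p3 API): with the lexicographic layout, the selector $X$ is the first variable, so $f_c$'s table is just the zero-padded $f_b$ table followed by the $f_a$ table.
```
/// Build the evaluation list of f_c(X, x) = X·f_a(x) + (1 - X)·f̃_b(x)
/// over {0,1}^(m+1): the X = 0 half is f̃_b, the X = 1 half is f_a.
fn combine_with_selector(fa: &[u64], fb: &[u64]) -> Vec<u64> {
    assert!(fa.len().is_power_of_two() && fb.len() <= fa.len());
    let mut fc = Vec::with_capacity(2 * fa.len());
    fc.extend_from_slice(fb); // X = 0: f_b's evaluations ...
    fc.resize(fa.len(), 0);   // ... zero-padded to 2^m entries (this is f̃_b)
    fc.extend_from_slice(fa); // X = 1: f_a's evaluations
    fc
}
```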
Now let us consider the following sum:
$$
\sum_{x\in\{0,1\}}\sum_{\mathbf{b}\in\{0,1\}^{m}}{f_c}(x,\mathbf{b})\cdot[x\mathsf{eq}(\mathbf{b},\mathbf{z_a})+\alpha(1-x)\mathsf{eq}(\mathbf{b},\mathbf{z_b})]=\sigma
$$
We have:
$$
\begin{align*}
f_a(\mathbf{z_a})=v_a \\
f_b(\mathbf{z_b})=v_b
\end{align*}
$$
And
$$
\sum_{\mathbf{b}\in\{0,1\}^m}{f_i}(\mathbf{b})\mathsf{eq}(\mathbf{b},\mathbf{z}_i)=\sigma_i=v_i
$$
Therefore $\sigma=\sigma_a+\alpha \cdot \sigma_b = v_a+\alpha v_b$.
Let us walk through the first rounds of the sumcheck protocol:
1. At the start of the protocol, the prover sends $\sigma$ claimed to be equal to the sum above.
2. In the first round it sends the univariate polynomial $h(X)$, claimed to equal
$$
\sum_{\mathbf{b}\in\{0,1\}^{m}}f_c(X,\mathbf{b})\cdot[X\mathsf{eq}(\mathbf{b},\mathbf{z_a})+\alpha(1-X)\mathsf{eq}(\mathbf{b},\mathbf{z_b})]
$$
3. The verifier checks that $\sigma = h(0)+h(1)$ and that the degree of $h(X)$ is at most $2$.
4. The verifier samples randomness $r_0$ and sends it to the prover.
5. The prover sends the univariate polynomial:
$$
\sum_{\mathbf{b}\in\{0,1\}^{m-1}}f_c(r_0,X,\mathbf{b})\cdot[r_0\mathsf{eq}((X,\mathbf{b}),\mathbf{z_a})+\alpha(1-r_0)\mathsf{eq}((X,\mathbf{b}),\mathbf{z_b})]
$$
At this point the protocol continues as in the single polynomial opening case.
### Direct linear combination approach
We define:
$$
f_a(\mathbf{z_a}) + \alpha \tilde{f_b}(\mathbf{z_b})=\sum_{\mathbf{b}\in\{0,1\}^m}[f_a(\mathbf{b})\cdot\mathsf{eq}(\mathbf{b},\mathbf{z_a})+\alpha \cdot f_b(\mathbf{b})\cdot\mathsf{eq}(\mathbf{b},\mathbf{z_b})]=\sigma_a+\alpha \sigma_b = \sigma
$$
Here $\alpha$ is chosen at random by the verifier. We run the WHIR protocol on this sum.
1. **Initial sumcheck**: Proceeds as in the standard single-polynomial case, reducing $k_0$ variables.
2. **Main loop**: After each sumcheck phase, the prover must commit to the folded polynomials. Crucially, unlike the selector variable approach, folding does not merge $f_a$ and $f_b$ into a single polynomial—the weight polynomials $\mathsf{eq}(\cdot, \mathbf{z_a})$ and $\mathsf{eq}(\cdot, \mathbf{z_b})$ remain distinct. Therefore, the prover sends **two** folded functions $f_{a,i}$ and $f_{b,i}$ per round, represented as Merkle tree commitments. These can be batched into a single Merkle root using MMCS (Mixed Matrix Commitment Scheme), but at the cost of doubling the leaf size—each leaf contains evaluations from both polynomials.
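A minimal sketch of the batched-leaf idea (illustrative only; in the codebase the MMCS handles this): at each query index, one leaf carries the opened row of both matrices, so a single authentication path serves both polynomials at the cost of a wider leaf.
```
/// One leaf per query index, containing the opened rows of both folded
/// matrices: the leaf is twice as wide, but one Merkle path covers both.
fn batched_leaf(row_a: &[u64], row_b: &[u64]) -> Vec<u64> {
    row_a.iter().chain(row_b).copied().collect()
}
```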
Alternatively, one could run the WHIR protocol on the virtual polynomial defined as $f_c = f_a + \alpha \tilde{f_b}$ and try to find a single weight $w_c$ such that $\sum f_c \cdot w_c$ gives us what we want. A natural attempt is:
$$
w_c = \mathsf{eq}(\cdot, \mathbf{z_a}) + \alpha \cdot \mathsf{eq}(\cdot, \mathbf{z_b})
$$
But then expanding $f_c \cdot w_c$:
$$
(f_a + \alpha f_b)\cdot(\mathsf{eq}(\cdot, \mathbf{z_a}) + \alpha \cdot \mathsf{eq}(\cdot, \mathbf{z_b}))
$$
$$
= f_a \cdot \mathsf{eq}(\cdot, \mathbf{z_a}) + \alpha^2 f_b \cdot \mathsf{eq}(\cdot, \mathbf{z_b}) + \underbrace{\alpha f_a \cdot \mathsf{eq}(\cdot, \mathbf{z_b}) + \alpha f_b \cdot \mathsf{eq}(\cdot, \mathbf{z_a})}_{\text{cross-terms}}
$$
The prover would need to compute the cross terms above and send them to the verifier.
### Comparison
**Proof size**: The selector variable approach adds one extra sumcheck round, contributing only 3 field elements (a degree-2 univariate polynomial). In contrast, the direct approach doubles the leaf size in every STIR opening throughout all WHIR rounds.
**Prover time**: The selector variable approach pays a one-time $O(n)$ cost to compute the linear combination $g = r_X \cdot f_a + (1 - r_X) \cdot \tilde{f_b}$ once the selector variable is bound to $r_X$, after which it proceeds as single-polynomial WHIR. The direct approach avoids this upfront cost but performs roughly 2× the FFT and folding work per round, since both polynomials must be processed separately throughout the protocol.
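A minimal sketch of that one-time fold (illustrative, toy 31-bit prime field; not the whir-p3 API). It is exactly the first-variable fold from the commitment section, applied to the concatenated table $[\tilde{f_b} \mid f_a]$:
```
const P: u64 = (1 << 31) - 1; // toy prime field, for illustration only

/// g = r_x·f_a + (1 - r_x)·f̃_b, computed in a single O(n) pass.
fn fold_selector(fa: &[u64], fb_padded: &[u64], r_x: u64) -> Vec<u64> {
    fa.iter()
        .zip(fb_padded)
        // g[i] = r_x * f_a[i] + (1 - r_x) * f̃_b[i]  (mod P)
        .map(|(&a, &b)| (r_x * a % P + (1 + P - r_x) % P * b % P) % P)
        .collect()
}
```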
### Generalization to Multiple Openings
The random linear combination naturally generalizes to opening multiple polynomials at multiple points. Suppose we want to prove:
1. $f_a$ at points $\mathbf{z_a^{(1)}}, \mathbf{z_a^{(2)}}, ..., \mathbf{z_a^{(n_a)}}$
2. $f_b$ at points $\mathbf{z_b^{(1)}}, \mathbf{z_b^{(2)}}, ..., \mathbf{z_b^{(n_b)}}$
We can use powers of $\alpha$ to combine all these claims:
$$
\sigma = \sum_{i=1}^{n_a} \alpha^{i-1} \cdot f_a(\mathbf{z_a^{(i)}}) + \sum_{j=1}^{n_b} \alpha^{n_a + j - 1} \cdot f_b(\mathbf{z_b^{(j)}})
$$
The corresponding sumcheck claim becomes:
$$
\sum_{x\in\{0,1\}}\sum_{\mathbf{b}\in\{0,1\}^{m}}f_c(x,\mathbf{b})\cdot\left[x\sum_{i=1}^{n_a}\alpha^{i-1}\mathsf{eq}(\mathbf{b},\mathbf{z_a^{(i)}})+(1-x)\sum_{j=1}^{n_b}\alpha^{n_a+j-1}\mathsf{eq}(\mathbf{b},\mathbf{z_b^{(j)}})\right]=\sigma
$$
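A hedged sketch of how the combined claim $\sigma$ is assembled (toy prime field, illustrative code): the claimed evaluations are weighted by successive powers of $\alpha$, with $f_b$'s claims picking up where $f_a$'s leave off.
```
const P: u64 = (1 << 31) - 1; // toy prime field, for illustration only

/// sigma = sum_i alpha^(i-1) * v_a[i] + sum_j alpha^(n_a + j - 1) * v_b[j]
fn combine_claims(va: &[u64], vb: &[u64], alpha: u64) -> u64 {
    let mut sigma = 0u64;
    let mut pow = 1u64; // alpha^0
    for &v in va.iter().chain(vb) {
        sigma = (sigma + pow * v % P) % P;
        pow = pow * alpha % P;
    }
    sigma
}
```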
### Alternative to padding
Instead of padding $f_b$ to match the size of $f_a$, we can start by applying $m_a-m_b$ rounds of the sumcheck protocol to the sum:
$$
\sum_{\mathbf{b}\in\{0,1\}^{m_a}}{f_a}(\mathbf{b})\mathsf{eq}(\mathbf{b},\mathbf{z}_a)=\sigma_a
$$
until we reduce the claim above to:
$$
\sum_{\mathbf{b}\in\{0,1\}^{m_b}}{f_a}(r_0,\ldots,r_{m_a-m_b-1},\mathbf{b})\mathsf{eq}((r_0,\ldots,r_{m_a-m_b-1},\mathbf{b}),\mathbf{z}_a)=\sigma_a'
$$
and then consider the function:
$$
f_c(X, x_1, \ldots, x_{m_b}) = X f_a(r_0,\ldots,r_{m_a-m_b-1}, x_1, \ldots, x_{m_b}) + (1-X) f_b(x_1, \ldots, x_{m_b})
$$
with the combination randomness $\alpha$ again entering through the weight, as in the selector variable approach. The combined claim becomes $\sigma = \sigma_a' + \alpha \cdot \sigma_b$, and we carry on as in the case above.
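A hedged sketch of the pre-folding step (toy prime field, illustrative code): it is the same one-variable fold used throughout, applied $m_a - m_b$ times to $f_a$'s evaluation list so that the result matches $f_b$'s size.
```
const P: u64 = (1 << 31) - 1; // toy prime field, for illustration only

/// Bind the first rs.len() variables of f_a to the sumcheck randomness,
/// shrinking 2^(m_a) evaluations down to 2^(m_b).
fn prefold(mut fa: Vec<u64>, rs: &[u64]) -> Vec<u64> {
    for &r in rs {
        let half = fa.len() / 2;
        fa = (0..half)
            .map(|i| ((1 + P - r) % P * fa[i] % P + r * fa[half + i] % P) % P)
            .collect();
    }
    fa // evaluations of f_a(r_0, ..., r_{m_a-m_b-1}, ·)
}
```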