# Why Ring-Switching works
These are personal notes to understand how ring-switching works. I am going to use the notation in Section 4 of the new [FRI-Binius paper](https://eprint.iacr.org/2024/504.pdf). Thanks to Jim Posen and especially Ben Diamond for having provided detailed explanations of this material.
## Setup
### Notation
Let $K$ be a finite field, let $L/K$ be a degree $2^{\kappa}$ extension, with a fixed vector-space basis $\{\beta_{\vec{v}}\}_{\vec{v}\in \mathcal B_{\kappa}}$, where $\mathcal B_{\kappa}:=\{0,1\}^{\kappa}$ denotes the boolean hypercube of dimension $\kappa$. We assume that $\beta_{\vec{0}}=1$ for simplicity.
Fix an integer $\ell\geq \kappa$ and let $\ell' = \ell-\kappa$. We set $A:=L\otimes_K L$. (This object will be discussed at length further below.)
We identify $\{0,1\}^{\kappa}$ with $\{0,\ldots, 2^{\kappa}-1\}$ *without explicit reference* via (a consistent choice of) binary expansion.
### Multilinear polynomials
Let $R$ be a commutative ring (with unit) and let $f\in R[X_0,\ldots, X_{N-1}]^{\preceq 1}$ be a multilinear polynomial. Recall that this means that for each $i\in \{0,\ldots, {N-1}\}$, the degree of $X_i$ in $f$ is at most 1.
We call the following *the fundamental theorem of multilinear polynomials*: $f$ is uniquely determined by its values at $\mathcal B_{N}:=\{0,1\}^N\subset R^N$. This can be made constructive (and, with a bit of thought, efficiently computable) as follows: $$f(X_0,\ldots, X_{N-1})=\sum_{\vec{v}\in \mathcal B_{N}} \widetilde{eq}(\vec{v},\vec{X})f(\vec{v}),$$
where $\widetilde{eq}(\vec{Y},\vec{X}):=\prod_i \big(X_iY_i + (1-X_i)(1-Y_i)\big)$.
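To make this concrete, here is a minimal Python sketch of $\widetilde{eq}$ and of evaluation from the values on the cube; the modulus $p=101$ and the sample polynomial are arbitrary toy choices of mine, not from the paper.
```python
# Toy sketch: evaluate a multilinear polynomial from its values on the
# boolean cube, over F_p with p = 101 (an arbitrary choice).
p = 101

def eq(y, x):
    # eq~(y, x) = prod_i (x_i*y_i + (1 - x_i)*(1 - y_i)) mod p
    acc = 1
    for yi, xi in zip(y, x):
        acc = acc * (xi * yi + (1 - xi) * (1 - yi)) % p
    return acc

def ml_eval(values, x):
    # values: dict mapping boolean tuples v in B_N to f(v); returns f(x).
    return sum(eq(v, x) * fv for v, fv in values.items()) % p

# f(X0, X1) = 3 + 5*X0 + 7*X0*X1 is multilinear, so its extension from
# the cube agrees with it at non-boolean points too.
f = lambda x0, x1: (3 + 5 * x0 + 7 * x0 * x1) % p
table = {(a, b): f(a, b) for a in (0, 1) for b in (0, 1)}
assert ml_eval(table, (9, 4)) == f(9, 4)
```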
We will mostly be concerned with multilinear polynomials over fields, but this greater generality will be useful in the construction of ring-switching.
## Where are we going?
### Polynomial commitment schemes for multilinear polynomials
Let $M$ be a finite field and let $s\in M[X_0,\ldots, X_{N-1}]^{\preceq 1}$. A polynomial commitment scheme (PCS) is a protocol that allows someone (the *prover*) to (publicly) commit to $s$, say $\text{Com}(s)=[s]$, in such a way that if someone (the *verifier*) later wants the value of $s$ at some $(x_0,\ldots, x_{N-1})\in M^N$, the prover can send the claimed value and then have a short conversation with the verifier, after which the verifier is sure with overwhelming probability that the claimed value is "correct" (meaning, compatible with the commitment $[s]$ of the polynomial $s$ that the prover "had in mind" at the beginning of the protocol). Despite the use of air quotes, these notions can be made precise using simulators.
```sequence
Note left of Prover: s is multilinear polynomial
Prover->Verifier: Claimed commitment [s]
Note right of Verifier: Desired evaluation point
Note right of Verifier: (x_0,\ldots,x_{N-1})\in M^N
Verifier->Prover:(x_0,\ldots,x_{N-1})
Prover->Verifier: claimed value of s(x_0,\ldots, x_{N-1})
Prover->Verifier: ...
Note right of Verifier: Flip coins
Verifier->Prover: ...
Note right of Verifier: Flip coins
Prover->Verifier: ...
Prover->Verifier: Final stage of "proof"
Note right of Verifier: Check the proof!
```
The main desiderata of a PCS are *succinctness* and *efficiency*: the commitment itself should be short, and the verifier complexity should be significantly less than if she tried to evaluate $s(x_0,\ldots, x_{N-1})$ herself. (If she had $s$ in the clear, this would take $\Theta(2^{N})$ time.) Another useful property is *batching*: if the verifier wants evaluations at a bunch of points, the verifier complexity is ideally sublinear in the number of points.
### Assumption
Suppose we have a black-box polynomial commitment scheme (PCS) for multilinear polynomials with coefficients in $L$.
### Goal
We wish to derive a PCS for multilinear polynomials with coefficients in $K$ with the following two desired properties:
* we are allowed to query at points in $L$; and
* where we don't incur extra "embedding costs."
The latter means the following. Given a multilinear polynomial $t\in K[X_0,\ldots, X_{\ell-1}]^{\preceq 1}$, I can of course consider it as a polynomial $t\in L[X_0,\ldots, X_{\ell-1}]^{\preceq 1}$; if I have black-box access to a PCS for multilinear polynomials over $L$, then it seems that I have everything I want.
However, representing a multilinear polynomial with coefficients in $L$ takes more bits than one with coefficients in $K$; this is what we mean by "embedding costs."
### Packing
Let $t\in K[X_0,\ldots, X_{\ell-1}]^{\preceq 1}$ be a multilinear function, and let $t'\in L[X_0,\ldots, X_{\ell'-1}]^{\preceq 1}$ denote the packed multilinear function. (Note that packing is a lossless procedure.) To be extremely explicit, for $\vec{w}\in \mathcal B_{\ell'}$:
$$t'(\vec{w}):=\sum_{\vec{v}\in \mathcal B_{\kappa}}t(\vec{v}||\vec{w})\cdot \beta_{\vec{v}}\in L$$
Note that the assignment $K[X_0,\ldots, X_{\ell-1}]^{\preceq 1}\rightarrow L[X_0,\ldots, X_{\ell'-1}]^{\preceq 1}$ is a bijection. It is in this sense that we don't incur any extra embedding cost: the amount of information (in bits, say) is precisely the same in $t$ and $t'$.
As an aside: to define the packed polynomial, we certainly need a basis of $L/K$.
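As a toy illustration, here is a Python sketch of packing with $K=\mathbb{F}_2$, $L=\mathbb{F}_4=K[g]/(g^2+g+1)$, $\kappa=1$, $\ell=2$ (these parameters are my choices, for illustration only). An element $a+bg$ of $L$ is encoded as the integer $a+2b$, so bit $v$ holds the $\beta_{\vec{v}}$-coordinate.
```python
# Toy packing sketch: K = GF(2), L = GF(4) with basis {beta_0, beta_1} = {1, g}.

def cube(n):
    """All boolean n-tuples, ordered by binary expansion (low bit first)."""
    return [tuple((i >> j) & 1 for j in range(n)) for i in range(2 ** n)]

def pack(t_vals, kappa, ell):
    """t_vals: dict on B_ell with values in {0,1}; returns t' on B_{ell-kappa},
    with t'(w) = sum_v t(v || w) * beta_v, encoded as an integer."""
    tp = {}
    for w in cube(ell - kappa):
        tp[w] = sum(t_vals[v + w] << i for i, v in enumerate(cube(kappa)))
    return tp

# Usage with ell = 2: four K-bits become two L-elements, losslessly.
t = {(0, 0): 1, (1, 0): 0, (0, 1): 1, (1, 1): 1}
print(pack(t, kappa=1, ell=2))  # {(0,): 1, (1,): 3}, i.e. t' = (1, 1 + g)
```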
## What is the tensor product?
The tensor product $L\otimes_K L$ is a $K$-vector space of dimension $2^{2\kappa}=(2^{\kappa})^2$. It is in fact a $K$-algebra. We explain several ways to think about this.
### Take 1: generators and relations
The object $L\otimes_K L$ can be defined by generators and relations as follows. We consider the $K$-vector space spanned by symbols $x\otimes y$, for $x,y\in L$, and then we mod out by the following relations (for all $x, x', y, y'\in L$):
* $(\lambda x)\otimes y = x\otimes \lambda y = \lambda(x\otimes y)$, for $\lambda\in K$.
* $(x'+x)\otimes y = x'\otimes y + x\otimes y$ and $x\otimes (y+y')= x\otimes y + x\otimes y'$.
There is a natural commutative ring structure, defined on simple tensors as $(x\otimes y)\cdot (x'\otimes y'):=xx'\otimes yy'$ and extended distributively.
So, every element $a\in L\otimes_K L$ can be written as a sum of *simple tensors* $x_i\otimes y_i$, though this "decomposition" into simple tensors is by no means canonical!
The *rank* of an element $a\in L\otimes_K L$ is the minimal length of a list of simple tensors that sum to $a$.
### Take 2: Matrices
There is another way to understand the tensor product, which is much more concrete. Having fixed our basis $\{\beta_{\vec{v}}\}$, we obtain a $K$-basis of the vector space $L\otimes_K L$, namely: $\{\beta_{\vec{u}}\otimes \beta_{\vec{v}}\}$. This basis therefore furnishes a $K$-linear isomorphism $M_{2^{\kappa}\times 2^{\kappa}}(K)\rightarrow A$. We use the usual matrix index conventions. (**NOTE:** this is not a ring isomorphism!)
Given elements $x,y\in L$, we can understand $x\otimes y$ as the rank 1 matrix given by the *outer product* of the corresponding coordinate vectors. (This means: write $x$ and $y$ in terms of the basis, then multiply the column vector of $x$ by the row vector of $y$ to get a $2^{\kappa}$-square matrix.)
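In the same toy setting as before ($K=\mathbb{F}_2$, $L=\mathbb{F}_4$ with basis $\{1,g\}$, elements of $L$ encoded as integers $a+2b$), a sketch of this outer-product picture:
```python
# A simple tensor x (x) y as the outer product of coordinate vectors.

def outer(x, y):
    """The 2x2 bit matrix of x (x) y: entry (u, v) is x_u * y_v."""
    return [[((x >> u) & 1) & ((y >> v) & 1) for v in range(2)]
            for u in range(2)]

print(outer(3, 2))  # (1+g) (x) g = [[0, 1], [0, 1]]: visibly rank 1
```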
### Take 3: Abstract nonsense
The following are "abstract" ways to understand the tensor product. They characterize the tensor product via "universal properties." For more information on this, see [wikipedia](https://en.wikipedia.org/wiki/Universal_property) and [Tag 001P](https://stacks.math.columbia.edu/tag/001P).
* as tensor product of $K$-vector spaces. For every $K$-vector space $M$, the set of $K$-bilinear maps $L\times L\rightarrow M$ is in natural bijection with $\text{Hom}_K(L\otimes_K L, M)$.
* as tensor product of $K$-algebras. For every $K$-algebra $B$, there exists a natural bijection
$$\text{Hom}_{K-alg}(L\otimes_K L, B)= \text{Hom}_{K-alg}(L, B)\times \text{Hom}_{K-alg}(L, B).$$
(Note: in the second formulation, it is clear that $L\otimes_K L$ has a natural $K$-algebra structure.)
### The canonical maps $\varphi_0$ and $\varphi_1$
We are equipped with maps $\varphi_0\colon L\rightarrow L\otimes_K L$, $x\mapsto x\otimes 1$, and $\varphi_1\colon L\rightarrow L\otimes_K L$, $x\mapsto 1\otimes x$.
In particular, given the usual matrix conventions and our choice $\beta_{\vec{0}}=1$, each $\varphi_0(x)$ embeds as the matrix whose first column is the coordinate vector of $x$ in the basis $\{\beta_{\vec{u}}\}$ and which is 0 everywhere else. The image $\varphi_1(L)$ admits a similar description with rows in place of columns.
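Continuing the toy matrix picture (reusing `outer` from the previous sketch):
```python
# Since beta_0 = 1, phi_0(x) = x (x) 1 puts the coordinates of x down the
# first column, and phi_1(x) = 1 (x) x puts them along the first row.
phi0 = lambda x: outer(x, 1)
phi1 = lambda x: outer(1, x)
print(phi0(3))  # [[1, 0], [1, 0]]
print(phi1(3))  # [[1, 1], [0, 0]]
```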
### Ranks
The rank of a matrix is the *same* as the rank of the associated tensor. This gives a slick proof of the statement that "row rank = column rank": the row rank is the minimal number of matrices of row rank 1 needed to sum to a given matrix, the column rank is the analogous thing for columns, and for rank 1 matrices, row and column rank are *visibly* the same. (Boris Alexeev explained this to me in 2009.)
### Column and Row Representations
Given an $a\in A$, we may either express it as $\sum_{\vec{v}\in \mathcal B_{\kappa}} a_{\vec{v}}\otimes \beta_{\vec{v}}$, with $a_{\vec{v}}\in L$, or as $\sum_{\vec{u}\in \mathcal B_{\kappa}} \beta_{\vec{u}}\otimes a_{\vec{u}}$, with $a_{\vec{u}}\in L$. The former is referred to as the *column representation*, the latter as the *row representation*.
## Idea
We obtain the commitment $[t']$ of the packed polynomial, pick some elements $r_0,\ldots, r_{\ell-1}\in L$, and wish to evaluate $t(r_0,\ldots, r_{\ell-1})$. What information do we need to (efficiently) compute this?
**Claims:**
1. First of all, $t'(r_{\kappa},\ldots, r_{\ell-1})$, together with $(r_0,\ldots, r_{\ell-1})$, *does not* uniquely specify $t(r_0,\ldots, r_{\ell-1})$.
2. However, $s_0:=\varphi_1(t')(\varphi_0(r_{\kappa}),\ldots, \varphi_0(r_{\ell-1}))\in A=L\otimes_K L$, together with $(r_0,\ldots, r_{\ell-1})$, *does* determine $t(r_0,\ldots, r_{\ell-1})$.
One focus of this note is to explain this apparent asymmetry.
### An interlude: what does the expression $\varphi_1(t')(\varphi_0(r_{\kappa}),\ldots, \varphi_0(r_{\ell-1}))$ mean?
This expression can be understood in two ways, but first of all, note that we are "evaluating a polynomial of the form $1\otimes f$ on elements of the form $x\otimes 1$." This might seem mildly wacky at first, so we briefly discuss what it means.
#### Monomial Basis
Given any commutative ring (with unit) $R$, any polynomial $f\in R[X_0,\ldots, X_{N-1}]$, and any elements $x_0,\ldots, x_{N-1}\in R$, we can evaluate $f(x_0,\ldots, x_{N-1})$ by "plugging in".
Now, given any $R$-algebra $\varphi\colon R\rightarrow S$ and any polynomial $f$ as above, we can consider $\varphi(f)$, obtained by applying $\varphi$ to each of the coefficients. Then we have the following equality of elements of $S$:
$$\varphi(f)(\varphi(x_0),\ldots, \varphi(x_{N-1}))=\varphi(f(x_0,\ldots, x_{N-1}))\in S$$
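A tiny sanity check of this identity, with $R=\mathbb{Z}$, $S=\mathbb{Z}/7$, and $\varphi$ the reduction map (an arbitrary toy choice of mine, unrelated to the paper):
```python
# phi(f) evaluated at phi-images equals phi of the evaluation over Z.
phi = lambda n: n % 7

def f(x, y, red=lambda n: n):
    # f = 3 + 5*X*Y + X; `red` is applied to the coefficients.
    return red(3) + red(5) * x * y + red(1) * x

x0, x1 = 10, 23
# Reducing mod 7 at the end computes the left-hand side in S = Z/7.
assert f(phi(x0), phi(x1), red=phi) % 7 == phi(f(x0, x1))
```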
Specializing to our situation: $t'\in L[X_0,\ldots, X_{\ell'-1}]^{\preceq 1}$ is a polynomial, so $\varphi_1(t')\in A[X_0,\ldots, X_{\ell'-1}]^{\preceq 1}$ is simply "the same polynomial, whose coefficients we view as elements of $A$."
#### Lagrange Basis
On the other hand, we know that multilinear $\ell'$-variate polynomials are in particular specified by their values on $\mathcal B_{\ell'}$. What are the values of $\varphi_1(t')$ on $\mathcal B_{\ell'}\subset A^{\ell'}$? Note that $\mathcal B_{\ell'}\subset A^{\ell'}$ is itself the image, under $\varphi_1$ applied coordinatewise, of $\mathcal B_{\ell'}\subset L^{\ell'}$. (Both 0 and 1 are part of the data of a commutative ring.)
Chasing through the definitions, we see that $\varphi_1(t')(\vec{v})=\varphi_1(t'(\vec{v})),$ for $\vec{v}\in \mathcal B_{\ell'}$. In other words, the Lagrange coefficients of $\varphi_1(t')$ are simply $\varphi_1$ applied to the Lagrange coefficients of $t'$.
Therefore, to determine the value of $\varphi_1(t')$, we may simply use the fundamental theorem on multilinear polynomials together with the values on $\mathcal B_{\ell'}$.
### Key equality and the column representation
The fundamental equality we use to prove that $\varphi_1(t')(\varphi_0(r_{\kappa}),\ldots, \varphi_0(r_{\ell-1}))$ contains the desired information is:
$$t(\vec{r})=\displaystyle \sum_{\vec{v}\in \mathcal B_{\kappa}}\sum_{\vec{w}\in \mathcal B_{\ell'}}\widetilde{eq}(v_0,\ldots, v_{\kappa-1},r_0,\ldots, r_{\kappa-1})\,t(\vec{v}||\vec{w})\,\widetilde{eq}(r_{\kappa},\ldots, r_{\ell-1},w_0,\ldots, w_{\ell'-1})$$
(This follows from the fundamental theorem of multilinear polynomials together with the factorization $\widetilde{eq}(\vec{v}||\vec{w},\vec{r})=\widetilde{eq}(\vec{v},(r_0,\ldots,r_{\kappa-1}))\cdot \widetilde{eq}(\vec{w},(r_{\kappa},\ldots,r_{\ell-1}))$.)
Let's set $\displaystyle s_0:= \varphi_1(t')(\varphi_0(r_{\kappa}),\ldots, \varphi_0(r_{\ell-1})).$ We first claim that the column representation of $s_0$ is:
$$s_{0,\vec{v}}=t(v_0,\ldots, v_{\kappa-1},r_{\kappa},\ldots, r_{\ell-1}),$$
for $\vec{v}=(v_0,\ldots, v_{\kappa-1})\in \mathcal B_{\kappa}.$ We will prove this below, but let us pause for a second.
### Key Difference
This is the **key difference** between sending $t'(r_{\kappa},\ldots,r_{\ell-1})$ and $s_0:=\varphi_1(t')(\varphi_0(r_{\kappa}),\ldots, \varphi_0(r_{\ell-1}))$: the former is a bare element of $L$; the latter, as an element of $A$, contains the information of all $t(\vec{v}||r_{\kappa},\ldots, r_{\ell-1})$ for $\vec{v}\in \mathcal B_{\kappa}$. In other words, $\varphi_1(t')(\varphi_0(r_{\kappa}),\ldots, \varphi_0(r_{\ell-1}))$ is (generically) a rank $2^\kappa$ element of $A=L\otimes_K L$. In fact, under the canonical "multiplication" map $m\colon L\otimes_K L\rightarrow L$ of $K$-algebras, we have:
$$m(\varphi_1(t')(\varphi_0(r_{\kappa}),\ldots, \varphi_0(r_{\ell-1})))=t'(r_{\kappa},\ldots,r_{\ell-1})\in L$$
As the map $m\colon L\otimes_K L\rightarrow L$ is surjective (with kernel the ideal generated by $x\otimes 1 - 1\otimes x$ for $x\in L$), we see that $s_0$ has *strictly more information* than $t'(r_{\kappa},\ldots, r_{\ell-1}).$
To sum up, once the Verifier is given $s_0$, we can assume that she *precisely* knows $(t(\vec{v}||r_{\kappa},\ldots, r_{\ell-1}))_{v\in \mathcal B_{\kappa}}$.
### Juggling to obtain the key expression
Note that $\displaystyle s_0 = \sum_{\vec{w}\in \mathcal B_{\ell'}}\varphi_1(t')(\vec{w})\,\widetilde{eq}(\varphi_0(r_{\kappa}),\ldots, \varphi_0(r_{\ell-1}), w_0,\ldots, w_{\ell'-1})$ (which follows from the multilinearity of $\varphi_1(t')$ and the fundamental theorem of multilinear polynomials). Now we use the definition of $\varphi_1$ and of packing: note that the *column representation* of $\varphi_1(t')(\vec{w})$ is:
$$(t(\vec{v}||\vec{w}))_{\vec{v}\in \mathcal B_{\kappa}},$$
i.e., for our (fixed, God-given) basis $\beta_{\vec{v}}$, we have:
$$\displaystyle \varphi_1(t')(\vec{w}) = \sum_{\vec{v}\in \mathcal B_{\kappa}}t(\vec{v}||\vec{w})\otimes \beta_{\vec{v}} = 1\otimes \big(\sum_{\vec{v}\in \mathcal B_{\kappa}}t(\vec{v}||\vec{w}) \beta_{\vec{v}}\big) = \varphi_1(t'(\vec{w})).$$
(Note that the values of $t$ on $\mathcal B_{\ell}$ are all in $K$.)
Therefore, we have:
$$\displaystyle s_0 = \sum_{\vec{w}\in \mathcal B_{\ell'}}\sum_{\vec{v}\in \mathcal B_{\kappa}}\big(\widetilde{eq}(r_\kappa, \ldots, r_{\ell-1},w_0,\ldots, w_{\ell'-1})t(\vec{v}||\vec{w})\otimes \beta_{\vec{v}}\big).$$
Switching the order of summation and "contracting," using again the fundamental theorem of multilinear polynomials, we obtain that:
$$\displaystyle s_0=\sum_{\vec{v}\in \mathcal B_{\kappa}}t(\vec{v}||r_{\kappa},\ldots, r_{\ell-1})\otimes \beta_{\vec{v}}\in A,$$
as desired.
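Here is a toy numerical check of this computation, with $K=\mathbb{F}_2$, $L=\mathbb{F}_4$, $\kappa=1$, $\ell=2$ as in the earlier sketches (again, all parameters are my choices for illustration). Elements of $A$ are stored as $2\times 2$ bit matrices $c[u][v]$ in the basis $\{\beta_u\otimes\beta_v\}$, and elements of $L$ as integers $a+2b$.
```python
def lmul(x, y):  # multiplication in GF(4) = GF(2)[g]/(g^2 + g + 1)
    a, b, c, d = x & 1, x >> 1, y & 1, y >> 1
    return ((a & c) ^ (b & d)) | ((((a & d) ^ (b & c) ^ (b & d))) << 1)

def amul(m, n):  # multiplication in A, via (beta_u (x) beta_v) products
    out = [[0, 0], [0, 0]]
    for u in range(2):
        for v in range(2):
            if not m[u][v]:
                continue
            for s in range(2):
                for t_ in range(2):
                    if not n[s][t_]:
                        continue
                    # (beta_u (x) beta_v)(beta_s (x) beta_t) = x (x) y:
                    x, y = lmul(1 << u, 1 << s), lmul(1 << v, 1 << t_)
                    for i in range(2):
                        for j in range(2):
                            out[i][j] ^= ((x >> i) & 1) & ((y >> j) & 1)
    return out

def aadd(m, n):
    return [[m[i][j] ^ n[i][j] for j in range(2)] for i in range(2)]

def phi0(x):  # x (x) 1: coordinates of x down the first column
    return [[x & 1, 0], [(x >> 1) & 1, 0]]

def phi1(x):  # 1 (x) x: coordinates of x along the first row
    return [[x & 1, (x >> 1) & 1], [0, 0]]

ONE = phi0(1)  # the identity 1 (x) 1 of A

# t on B_2 (values in K = {0,1}) and its packing t'(w) = t(0,w) + t(1,w)*g.
t = {(0, 0): 1, (1, 0): 0, (0, 1): 1, (1, 1): 1}
tp = {w: t[(0, w)] | (t[(1, w)] << 1) for w in (0, 1)}

r1 = 3  # an arbitrary challenge r_1 = 1 + g in L

# s0 = phi1(t')(phi0(r1)) = sum_w phi1(t'(w)) * eq(phi0(r1), w); for
# boolean w the eq factor is R when w = 1 and 1 + R when w = 0 (char 2).
R = phi0(r1)
s0 = [[0, 0], [0, 0]]
for w in (0, 1):
    s0 = aadd(s0, amul(phi1(tp[w]), R if w else aadd(ONE, R)))

# Column v of s0 should be the coordinates of t(v, r1), where
# t(v, r1) = t(v,0)*(1 + r1) + t(v,1)*r1 is the multilinear extension.
for v in (0, 1):
    col_v = s0[0][v] | (s0[1][v] << 1)
    assert col_v == lmul(t[(v, 0)], r1 ^ 1) ^ lmul(t[(v, 1)], r1)
print("s0 columns match t(v || r1), as claimed")
```
The final assertion is exactly the claimed column representation $s_0=\sum_{\vec{v}\in \mathcal B_{\kappa}}t(\vec{v}||r_{\kappa},\ldots, r_{\ell-1})\otimes \beta_{\vec{v}}$.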
### Final evaluation (for the verifier)
How do we obtain our desired evaluation from $s_0$? Intuitively: since we have the information of all of the $t(\vec{v}||r_{\kappa},\ldots, r_{\ell-1})$, for $\vec{v}\in \mathcal B_{\kappa}$, we can of course use the fundamental theorem once again to deduce $t(r_0,\ldots, r_{\ell-1})$.
Informally, we just "take the dot product" with $\widetilde{eq}(v_0,\ldots, v_{\kappa-1},r_0,\ldots, r_{\kappa-1})$.
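Continuing the toy sketch above (reusing `lmul`, `t`, `r1`, and `s0`), the verifier's final step is this eq-weighted sum, cross-checked against a direct evaluation of $t$'s multilinear extension:
```python
# The verifier's Theta(2^kappa) finish: eq-weight the columns of s0 by the
# low challenge r0, then cross-check against a direct evaluation of t.
r0 = 2  # an arbitrary low challenge r_0 = g in L
cols = {v: s0[0][v] | (s0[1][v] << 1) for v in (0, 1)}  # t(v || r1)
t_r = lmul(cols[0], r0 ^ 1) ^ lmul(cols[1], r0)  # eq(0,r0)=1+r0, eq(1,r0)=r0

direct = 0
for (v, w), tvw in t.items():
    weight = lmul(r0 if v else r0 ^ 1, r1 if w else r1 ^ 1)
    direct ^= lmul(tvw, weight)
assert t_r == direct  # both equal t(r0, r1), which is g in this run
```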
## Sumcheck
Recall what we wanted to do. We were given a multilinear polynomial $t\in K[X_0,\ldots, X_{\ell-1}]^{\preceq 1}$, some elements $r_i\in L$, and wanted to prove that $t(\vec{r})$ is some given value in $L$.
We have reduced proving this claim to proving that $\varphi_1(t')(\varphi_0(r_{\kappa}),\ldots, \varphi_0(r_{\ell-1}))$ is some given value $s_0\in A$. Now, the claim is that we can run a sumcheck on the following *multiquadratic* polynomial:
$$h(X_0,\ldots, X_{\ell'-1}):=\varphi_1(t')(X_0,\ldots, X_{\ell'-1})\cdot \widetilde{eq}(\varphi_0(r_{\kappa}), \ldots, \varphi_0(r_{\ell-1}), X_0,\ldots, X_{\ell'-1}).$$
(Here, we follow the cryptography conventions and say that a multivariate polynomial is *multiquadratic* if it is at most quadratic in each fixed variable. One could also call this *slice-quadratic*.)
Now, the key point with running this sumcheck is that we may draw our challenges from $\varphi_1(L)$, so that our final evaluation will simply be an evaluation of $t'$ on $L$. In particular, this reduces us to a single black-box query of $t'$.
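Continuing the toy sketch (with `lmul`, `amul`, `aadd`, `phi1`, `ONE`, `R`, `tp`, `s0` as before), here is the one-round sumcheck in full. Note that in characteristic 2, $\widetilde{eq}(R,X)=1+R+X$, so $h$ is a product of two linear polynomials over $A$:
```python
# One-round sumcheck for h(X) = phi1(t')(X) * eq(phi0(r1), X) over A.
# Write phi1(t')(X) = a0 + a1*X and eq(R, X) = (1 + R) + X, so that
# h(X) = c0 + c1*X + c2*X^2 with the coefficients below.
a0 = phi1(tp[0])
a1 = aadd(a0, phi1(tp[1]))       # slope: phi1(t'(1)) - phi1(t'(0))
b0, b1 = aadd(ONE, R), ONE
c0 = amul(a0, b0)
c1 = aadd(amul(a0, b1), amul(a1, b0))
c2 = amul(a1, b1)

# Verifier's round check: h(0) + h(1) = s0.
h1 = aadd(aadd(c0, c1), c2)      # h(1) = c0 + c1 + c2
assert aadd(c0, h1) == s0

# The challenge is drawn from phi1(L): with c in L, the final check needs
# only the single black-box opening t'(c) over L.
c = 2                            # an arbitrary challenge c = g
C = phi1(c)
h_at_C = aadd(aadd(c0, amul(c1, C)), amul(c2, amul(C, C)))
tp_c = lmul(tp[0], c ^ 1) ^ lmul(tp[1], c)       # the opening t'(c)
assert h_at_C == amul(phi1(tp_c), aadd(b0, C))   # h(phi1(c)) as expected
print("sumcheck round and final black-box check both pass")
```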
### Wait, what about soundness?
At this point, careful readers will be angry. The classical sumcheck protocol over a field $k$ relies on the following algebraic lemma: if $f\in k[X]$ is a degree $d$ polynomial, then $f$ has at most $d$ roots in $k$.
The analog of this statement is certainly *false* for $A=L\otimes_K L$ (which is a product of copies of $L$, as we indicate in "Some Algebra" below). Here is a perhaps instructive example: for any field $k$ of characteristic $\neq 2$, there are exactly four square-roots of $1:=(1,1)\in k\times k$.
However, Diamond-Posen show the following simple algebraic lemma. Let $A/k$ be a $k$-algebra and let $f\in A[X]$ be a nonzero polynomial of degree $d$. Then $f$ has at most $d$ roots *in $k$*. (Sketch: if $f$ vanished at $d+1$ distinct points of $k$, the corresponding Vandermonde matrix, which is invertible already over $k$, would force all coefficients of $f$ to vanish.)
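A toy illustration of the lemma (my own example): over $A=\mathbb{F}_2\times\mathbb{F}_2$, the degree-2 polynomial $X^2+X$ has all four elements of $A$ as roots, but only two of them lie in the diagonally embedded $k=\mathbb{F}_2$:
```python
# X^2 + X over A = GF(2) x GF(2): componentwise arithmetic mod 2.
roots = [(a, b) for a in (0, 1) for b in (0, 1)
         if ((a * a + a) % 2, (b * b + b) % 2) == (0, 0)]
print(roots)                                # all four elements are roots
print([r for r in roots if r[0] == r[1]])   # but only 2 = deg f lie in k
```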
In particular, the verifier choosing random challenges in $\varphi_1(L)$ was not only convenient to reduce us to our black-box PCS over $L$; it also allowed us to recover soundness.
## Summary
The above shows how to construct a small-field multilinear PCS from a large-field multilinear PCS.
* Given $t$, first construct the packed polynomial $t'$, a multilinear polynomial with coefficients in $L$. Commit to it (or even commit to $\varphi_1(t')$).
* To open $t$ at $(r_0,\ldots, r_{\ell-1})$, the prover first computes and reveals the *value* of $\varphi_1(t')(\varphi_0(r_{\kappa}),\ldots, \varphi_0(r_{\ell-1}))$. This claimed value can be proven via a sumcheck, drawing challenges from $\varphi_1(L)$, reducing ourselves to our black-box PCS for multilinears with coefficients in $L$.
* The verifier, doing $\Theta(2^{\kappa})$ work, can then figure out what $t(r_0,\ldots, r_{\ell-1})$ is.
To emphasize: $\varphi_1(t')(\varphi_0(r_{\kappa}),\ldots, \varphi_0(r_{\ell-1}))$ is far from a simple tensor, and its column representation contains the information of the evaluations of $t$ on $(\vec{v}||r_{\kappa},\ldots, r_{\ell-1})$, for $\vec{v}\in \mathcal{B}_{\kappa}$.
# Some algebra
Here are some fun facts for those who like algebra and geometry.
1. A finite extension $L/K$ of degree $d$ is Galois iff there is an $L$-algebra isomorphism:
$$L\otimes_K L\cong \prod L,$$
where we understand the LHS to be an $L$-algebra via $\varphi_0$ and the RHS to be an $L$-algebra via the diagonal action.
In fact, an explicit isomorphism can be given as follows: index the RHS copies of $L$ by elements $\sigma\in G:=\text{Gal}(L/K)$ (so, we think of the RHS as $\prod_{\sigma\in G}L_{\sigma}$). Then:
$$x\otimes y\mapsto (x\sigma(y))_{\sigma}\in \prod_{\sigma\in G}L_{\sigma}.$$
An alternative perspective on this isomorphism: let $\alpha_0\in L$ be a primitive element (i.e., such that $L=K(\alpha_0)$) and let $f\in K[X]$ be its minimal polynomial, which is necessarily irreducible of degree $d$. Then, as $L/K$ is Galois, $f$ in fact splits completely in $L$, i.e., $f(X) = \prod_{i=0}^{d-1}(X-\alpha_i)$ for some $\alpha_i\in L$.
It follows that $$L\otimes_K L\cong L\otimes_K K[X]/f(X)\cong L[X]/f(X)\cong L[X]/\prod (X-\alpha_i)\cong \prod_{i=0}^{d-1} L,$$
where the last isomorphism is given by the Chinese remainder theorem, i.e., induced from the map $L[X]\rightarrow \prod_i L$ sending:
$$L[X]\ni p(X)\mapsto (p(\alpha_i))_i\in \prod_{i=0}^{d-1} L.$$
(A toy numerical check of this splitting, for $L=\mathbb{F}_4$ over $K=\mathbb{F}_2$, appears after this list.)
2. The complex:
$$0\rightarrow K\rightarrow L\xrightarrow{\varphi_0-\varphi_1} L\otimes_K L\rightarrow L\otimes_K L\otimes_K L\rightarrow\cdots,$$
where the further arrows are the alternating sum of the embeddings, is an exact sequence of $K$-vector spaces.
This complex is called the *Amitsur complex*; it is exact in much greater generality, and it is a rather miraculous algebraic construction which can be used to extract geometric information.
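Returning to item 1, here is a toy check of the splitting for $L=\mathbb{F}_4$ over $K=\mathbb{F}_2$ (my own sketch, with the same integer encoding of $\mathbb{F}_4$ as in the earlier sketches): $\text{Gal}(L/K)=\{\mathrm{id},\, x\mapsto x^2\}$, and the isomorphism sends $x\otimes y\mapsto (xy,\, xy^2)$.
```python
# Toy check of L (x)_K L ~= L x L for L = GF(4), K = GF(2).
# GF(4) elements are encoded as integers a + 2b, with g^2 = g + 1.

def lmul(x, y):  # multiplication in GF(4)
    a, b, c, d = x & 1, x >> 1, y & 1, y >> 1
    return ((a & c) ^ (b & d)) | (((a & d) ^ (b & c) ^ (b & d)) << 1)

frob = lambda x: lmul(x, x)                          # Frobenius x -> x^2
split = lambda x, y: (lmul(x, y), lmul(x, frob(y)))  # image of x (x) y

# Multiplicativity on simple tensors: (x (x) y)(x' (x) y') = xx' (x) yy'.
for x in range(4):
    for y in range(4):
        for xp in range(4):
            for yp in range(4):
                a, b = split(x, y), split(xp, yp)
                assert split(lmul(x, xp), lmul(y, yp)) == \
                    (lmul(a[0], b[0]), lmul(a[1], b[1]))

# Bijectivity: the 16 K-linear combinations of the images of the four
# basis tensors beta_u (x) beta_v are pairwise distinct.
imgs = [split(1 << u, 1 << v) for u in range(2) for v in range(2)]
combos = set()
for mask in range(16):
    acc = (0, 0)
    for i in range(4):
        if (mask >> i) & 1:
            acc = (acc[0] ^ imgs[i][0], acc[1] ^ imgs[i][1])
    combos.add(acc)
assert len(combos) == 16
```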