
1/2: Succinct Proofs and Linear Algebra

preparation for studying Binius and, more fundamentally, code-based PCS constructions

note: this write-up is really for my own understanding

  • so something that is well-known to readers might not be to me
  • and something that is well-known to me might not be to readers

paper link: https://eprint.iacr.org/2023/1478.pdf


Overview

Basically, the goal of this paper is to reduce most security proofs to the most fundamental facts commonly used in linear algebra (with some additional facts on polynomials, I guess).

There is a proof about FRI security in the latter parts of the paper, but I will skip it here.

  • Regarding this, I have a question: it seems that the proof in Section 4 does \((n, q) \rightarrow (n/2, q/3)\), so the distance decays exponentially there. I thought that this was precisely what the FRI proof tries to avoid (it has to keep the relative distance fixed across rounds)
  • Do I have the idea wrong, or is this FRI proof just for pure demonstration purposes?
  • Also, does anyone know the proof references for Tom Gur's Lecture 5 notes?
  • need to re-read FRI soundness proof assuming we proved the hardest lemmas

Everything is a Linear Code

A rough sketch of how SNARKs (PLONK, for example) work is something like

  • write literally everything as a polynomial (via interpolations)
  • use polynomial commitments
  • test at a random point to check if polynomial identity holds

Of course, the final part here is thanks to the Schwartz-Zippel lemma, which, for univariate polynomials, states that a degree \(d\) polynomial has at most \(d\) roots, so a random point is a root with probability at most \(d / \lvert \mathbb{F} \rvert\). This can be expressed as follows:

Consider a matrix \(G\) over \(\mathbb{F}\) with size \(\lvert \mathbb{F} \rvert \times n\), with

\[g_{i, j} = \text{index_to_field}(i)^j\]

If you consider \(Gx\) for some vector \(x \in \mathbb{F}^n\), the result is the vector of evaluations, at every field element, of the polynomial whose coefficients are precisely \(x\). Via Schwartz-Zippel, we see that \(G\) generates a linear code with distance \(\lvert \mathbb{F} \rvert - n + 1\).
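
As a quick sanity check of this view, here is a minimal sketch (the field size, \(n\), and \(\text{index_to_field}(i) = i\) are arbitrary choices for illustration):

```python
# Minimal sketch: Vandermonde G over F_11 with index_to_field(i) = i,
# so (Gx)_i is the evaluation at i of the polynomial with coefficients x.
import itertools

p, n = 11, 3                      # |F| = 11, polynomials of degree < 3
G = [[pow(i, j, p) for j in range(n)] for i in range(p)]

def encode(x):                    # Gx mod p: all |F| evaluations
    return [sum(g * c for g, c in zip(row, x)) % p for row in G]

# Schwartz-Zippel: a nonzero codeword has at most n - 1 zeros, so the
# minimum Hamming weight (= code distance) should be |F| - n + 1 = 9.
dist = min(sum(v != 0 for v in encode(x))
           for x in itertools.product(range(p), repeat=n) if any(x))
assert dist == p - n + 1
print("minimum distance:", dist)  # 9
```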

So going back to the whole random check thing, what we are doing is

  • we want to test \(x = 0\), but direct testing is hard
  • re-define the test to \(Gx = 0\)
  • re-define the test to a random entry of \(Gx\), say \((Gx)_r\) being zero
    • unfortunately this must come with some soundness error
  • evaluate a random entry of \(Gx\) (i.e. polynomial commitment evaluation)

This introduces us to two models of performing checks on a vector \(x\)

  • directly accessing entries of \(x\), a "direct access model"
  • accessing an entry of \(Gx\), a "coding model"

so one could imagine using a mix of these two to create some nice arguments.
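
To make the coding-model check concrete, here is a small empirical sketch (the test vector and the parameters are arbitrary; it just estimates how often a single random entry of \(Gx\) misses a nonzero \(x\), reusing the Vandermonde \(G\) from above):

```python
# Sketch of the "coding model" zero test: instead of reading all of x,
# read one random entry of Gx. The soundness error is at most 1 - d/m,
# which for the Vandermonde G (d = |F| - n + 1, m = |F|) is (n-1)/|F|.
import random

p, n = 11, 3
x = [0, 5, 0]                                   # a nonzero vector to catch

def test_once(x):
    r = random.randrange(p)                     # random row of G
    return sum(pow(r, j, p) * x[j] for j in range(n)) % p == 0

trials = 100_000
false_accepts = sum(test_once(x) for _ in range(trials)) / trials
print(false_accepts, "vs bound", (n - 1) / p)   # empirical rate <= (n-1)/|F|
```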

Zero Checks

Regarding zero checks and linear combinations, things are quite classic already.

For example, it's a classic technique to compress checking

\[x_1 = x_2 = \cdots = x_n = 0\]

into taking random \(r_1, r_2, \cdots , r_n\) and checking

\[\sum_{i=1}^n r_i x_i = 0\]

and further, simply taking random \(\alpha\) (to reduce used randomness) and checking

\[\sum_{i=1}^n \alpha^i x_i = 0\]

This can be represented with linear codes as well. Via the coding model,

\[(Gx)_r = 0 \implies x = 0\]

with failure probability at most \(1 - d/m\), where \(G \in \mathbb{F}^{m \times n}\) generates a code with distance \(d\) (a nonzero \(x\) yields at least \(d\) nonzero entries in \(Gx\)).
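
Here is a small sketch of the two compression variants from above (the parameters are illustrative; both variants reject a nonzero \(x\) except with probability at most about \(n / \lvert \mathbb{F} \rvert\)):

```python
# Sketch: compress n zero-checks into one. Both variants accept a nonzero
# x only when the randomness hits a root of a nonzero linear form /
# degree-n polynomial, so the error is at most about n/|F|.
import random

p, n = 101, 4
x = [0, 0, 7, 0]                               # nonzero: should be rejected

r = [random.randrange(p) for _ in range(n)]    # fresh randomness per entry
check_random = sum(ri * xi for ri, xi in zip(r, x)) % p == 0

a = random.randrange(p)                        # a single field element
check_powers = sum(pow(a, i, p) * xi
                   for i, xi in enumerate(x, start=1)) % p == 0

print(check_random, check_powers)              # almost always (False, False)
```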

One can compress vectors with codes as well - if we have \(X \in \mathbb{F}^{k \times n}\) with columns \(x_1, \cdots, x_n\),

\[G_{r1}x_1 + \cdots + G_{rn} x_n = 0 \implies X = 0\]

with a small failure probability, as the left-hand side being zero means

\[(G\tilde{x}_1)_r = \cdots = (G\tilde{x}_k)_r = 0\]

where \(\tilde{x}_i\) are the rows of \(X\).
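
A small numpy sketch of the identity underlying this compression (all sizes are illustrative; \(G\) is again a Vandermonde matrix):

```python
# Check: the i-th entry of sum_j G_{rj} x_j (x_j = columns of X) equals
# (G xtilde_i)_r, where xtilde_i is the i-th row of X.
import numpy as np

p = 11
rng = np.random.default_rng(0)
k, n, m = 4, 3, p
G = np.array([[pow(i, j, p) for j in range(n)] for i in range(m)])
X = rng.integers(0, p, size=(k, n))

r = int(rng.integers(0, m))
lhs = (X @ G[r]) % p                   # sum_j G_{rj} x_j, one entry per row
rhs = np.array([(G @ X[i]) % p for i in range(k)])[:, r]  # (G xtilde_i)_r
assert (lhs == rhs).all()
print("identity holds")
```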

This can extend to tensor codes - as the distance of \(G \otimes G'\) is at least the product of the distances of \(G\) and \(G'\), one can consider the tensor code-based zero check

\[((G \otimes G')x)_r = 0 \implies x = 0\]

where the failure probability is something like \(1 - d_1 d_2 / (m_1 m_2)\).
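
A brute-force check of the tensor distance bound on a tiny example (the \(\mathbb{F}_3\) Vandermonde generator is an illustrative choice):

```python
# Sketch: distance of a tensor code. G is a tiny Vandermonde generator
# over F_3 with distance 2, so G (x) G should give distance 2 * 2 = 4.
import itertools
import numpy as np

p = 3
G = np.array([[pow(i, j, p) for j in range(2)] for i in range(3)])  # d = 2
T = np.kron(G, G) % p                         # 9x4 generator of the tensor code

dist = min(int(np.count_nonzero((T @ np.array(x)) % p))
           for x in itertools.product(range(p), repeat=4) if any(x))
assert dist == 2 * 2
print("tensor distance:", dist)               # 4
```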

The check also extends to checking for subspace inclusion. This is natural, as every subspace \(V\) has a corresponding "check matrix" \(C\) such that \(x \in V\) if and only if \(Cx = 0\).
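
For instance, here is a brute-force sketch of such a check matrix over a small field (the subspace and dimensions are arbitrary; \(C\) is simply the full annihilator of \(V\), redundant rows and all):

```python
# Sketch: a check matrix C for V, with x in V iff Cx = 0. C is found by
# enumerating the dual space of V by brute force (fine at this size).
import itertools
import numpy as np

p = 5
basis = np.array([[1, 2, 0, 3],
                  [0, 1, 4, 1]])                       # V = row span, dim 2
dual = [c for c in itertools.product(range(p), repeat=4)
        if all(np.dot(b, c) % p == 0 for b in basis)]
C = np.array(dual[1:])                                 # drop the zero vector

in_V = (3 * basis[0] + 2 * basis[1]) % p               # a member of V
not_in_V = (in_V + np.array([1, 0, 0, 0])) % p         # not in V (for this basis)
assert ((C @ in_V) % p == 0).all()                     # member passes Cx = 0
assert ((C @ not_in_V) % p != 0).any()                 # non-member is caught
print("checks pass;", len(C), "dual vectors")
```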

Sparsity Checks

The more interesting part usually comes with sparsity checks. For example -

Example: FRI

The hardest part of the basic soundness proof, as well as of its improvements, comes from the so-called "correlated agreement theorem" (see 2022/1216 for example).

The theorem states that if \(f_0 + \lambda f_1\) is \(\theta\)-close to the Reed-Solomon code for a sufficiently large fraction of \(\lambda \in \mathbb{F}\), then \(f_0, f_1\) must each be \(\theta\)-close to the Reed-Solomon code as well; moreover, they agree with their nearby codewords on a common set of coordinates (hence "correlated").

In other words, it "controls the distance over linear combinations".

Example: Brakedown

In the soundness proof for the testing phase, the key lemma is the following.

In a code with relative distance \(\gamma\), suppose we have vectors \(\hat{u}_1, \cdots , \hat{u}_m \in \mathbb{F}^N\) with closest codewords \(c_1, \cdots, c_m\), and a set \(E\) of coordinate indices such that

  • \(\lvert E \rvert \le (\gamma / 3) N\)
  • for each \(i \in E\), there is a \(k \in [1, m]\) such that \(\hat{u}_{k, i} \neq c_{k, i}\)

Then, except with failure probability \(\le N / \lvert \mathbb{F} \rvert\) over random \(r \in \mathbb{F}^m\), the combination \(\sum_{k=1}^m r_k \hat{u}_k\) is \(\lvert E \rvert\)-far from all codewords.

In other words, it "controls the distance over linear combinations".

Note that this lemma is also from Ligero, so it's practically common across code-based PCS constructions.
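
A small empirical illustration of the lemma (a sketch only: the Reed-Solomon code, the corruption pattern, and reading "\(\lvert E \rvert\)-far" as distance \(\ge \lvert E \rvert\) are all choices made for this demo):

```python
# Empirical sketch for a small Reed-Solomon code: F_101, N = 12 evaluation
# points, degree < 2, so d = 11 and (gamma/3)N ~ 3.67. u1 is a codeword
# corrupted exactly on E, u2 is clean; we check that a random combination
# stays >= |E| away from every codeword, except for a small fraction of r.
import itertools
import random
import numpy as np

p, N, n = 101, 12, 2
pts = np.arange(N)

def enc(msg):                            # RS encoding: evaluate at 0..N-1
    return (msg[0] + msg[1] * pts) % p

CW = np.array([enc(m) for m in itertools.product(range(p), repeat=n)])

def dist_to_code(v):                     # brute-force nearest-codeword distance
    return int((CW != v).sum(axis=1).min())

E = [0, 1, 2]                            # |E| = 3 <= (gamma/3) N
u1, u2 = enc((3, 7)), enc((5, 2))
u1[E] = (u1[E] + 1) % p                  # u1 now disagrees with c1 exactly on E

bad = sum(dist_to_code((random.randrange(p) * u1
                        + random.randrange(p) * u2) % p) < len(E)
          for _ in range(300))
print("fraction of bad r:", bad / 300, "| lemma bound:", N / p)
```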

Generalizing with Codes

The distance to a codeword can be thought of as a distance to a subspace - so now we can actually tie these two statements up into a more general statement as follows.

Let the distance of a vector \(x\) to a subspace \(V\) be the minimum \(\lVert x - y \rVert_0\) for some \(y \in V\).

Let the distance of a matrix \(X\) to a subspace \(V\) be the minimum number of nonzero rows of \(X - Y\), where each column of \(Y\) is in \(V\). We can now define \(q\)-close/far based on this.
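
To make these definitions concrete, here is a brute-force sketch on a tiny \(\mathbb{F}_3\) instance (all dimensions illustrative):

```python
# Brute-force versions of the two distance definitions over F_3:
# dist_vec(x) = min ||x - y||_0 over y in V, and dist_mat(X) = min number
# of nonzero rows of X - Y over matrices Y whose columns all lie in V.
import itertools
import numpy as np

p = 3
V = [(c * np.array([1, 1, 1])) % p for c in range(p)]   # V = span{(1,1,1)}

def dist_vec(x):
    return min(int(np.count_nonzero((x - y) % p)) for y in V)

def dist_mat(X):
    return min(int((((X - np.column_stack(ys)) % p) != 0).any(axis=1).sum())
               for ys in itertools.product(V, repeat=X.shape[1]))

X = np.array([[1, 2],
              [1, 0],
              [1, 2]])                    # columns x1 = (1,1,1), x2 = (2,0,2)
print(dist_vec(X[:, 1]), dist_mat(X))     # x2 is 1 away; X is 1-close to V
```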

The main statement is that, with high probability over the random row index \(r\),

\[\left\lVert \sum_{i=1}^n G_{ri} x_i - V \right\rVert_0 \le q \implies X \text{ is } q\text{-close to } V\]

so a "linear combination being close" leads to "each vector being close with correlated agreements". The two examples differ in selections of \(G\) and \(V\) - in detail,

  • FRI: \(q < d'/2\) works with different failure probability, \(G, V\) are from Reed-Solomon
    • This is the "unique-decoding radius" part of [BCI+23]
  • Brakedown: \(q < d'/3\) works with \(G\) being the Hadamard code and \(V\) arbitrary.

Here, \(d'\) is the minimum value of \(\lVert x \rVert_0\) for \(x \in V\).

The paper proves the case \(n = 2\) and \(q < d'/4\), which is enough for basic FRI.

Proof for \(n = 2\) and \(q < d' / 4\)

Denote

\[R = \{r \in \{1, \cdots , m\} : \lVert Xg_r - V \rVert_0 \leq q \}\]

where \(g_r\) is the \(r\)-th row of \(G\), so that \(Xg_r = G_{r1}x_1 + G_{r2}x_2\).

The proof is in two parts -

Part 1: if \(\lvert R \rvert > m - d\), then for all \(z \in \mathbb{F}^2\) one has

\[\lVert Xz - V \rVert_0 \le 2q < d'/2\]

Part 2: In the case where Part 1's conclusion is true, the check works.

Part 1

We first claim that there must be two linearly independent rows among those indexed by \(R\) - this is actually quite easy to see, since otherwise there exists a nonzero \(x \in \mathbb{F}^2\) that is perpendicular to all rows indexed by \(R\). In that case, \(Gx\) has at least \(\lvert R \rvert\) zeroes, so \(\lVert Gx \rVert_0 \le m - \lvert R \rvert < d\), contradicting that the code generated by \(G\) has distance \(d\).

Now take two vectors \(u, v\) that satisfy

  • \(u, v\) are linearly independent
  • \(\lVert Xu - V \rVert_0 \le q\), \(\lVert Xv - V \rVert_0 \le q\)

For all \(z \in \mathbb{F}^2\), we can now find \(\alpha, \beta \in \mathbb{F}\) with \(z = \alpha u + \beta v\), so that

\[\lVert Xz - V\rVert_0 = \lVert X(\alpha u + \beta v) - V \rVert_0 \le \lVert Xu - V \rVert_0 + \lVert Xv - V \rVert_0 \le 2q \]

Part 2

Taking \(z\) as the unit vectors, we see that \(x_1, x_2\) can be written as

\[x_1 = y_1 + \delta_1, \quad x_2 = y_2 + \delta_2\]

uniquely, where \(y_1, y_2 \in V\) and \(\lVert \delta_1 \rVert_0, \lVert \delta_2 \rVert_0 \le 2q\).

This is thanks to \(4q < d'\), so we are in unique-decoding radius.

We now see that

\[\lVert Xz - V\rVert_0 = \lVert z_1 \delta_1 + z_2 \delta_2 - V\rVert_0\]

(as \(Xz = (z_1 y_1 + z_2 y_2) + (z_1 \delta_1 + z_2 \delta_2)\) and \(z_1 y_1 + z_2 y_2 \in V\))

so \(R\) can be redefined as

\[R = \{r \in \{1, \cdots , m\} : \lVert G_{r1} \delta_1 + G_{r2} \delta_2 - V \rVert_0 \leq q \}\]

so for each \(r \in R\), we can define a unique \(y_r \in V\) (unique because \(2q < d'\)) such that

\[\lVert G_{r1} \delta_1 + G_{r2} \delta_2 - y_r \rVert_0 \le q\]

Take two distinct \(r, r' \in R\). Note that

\begin{align} G_{r'1} y_r - G_{r1} y_{r'} &= -G_{r'1}(G_{r1}\delta_1 + G_{r2} \delta_2 - y_r) \\ & + G_{r1}(G_{r'1} \delta_1 + G_{r'2} \delta_2 - y_{r'}) \\ & + (G_{r'1}G_{r2} - G_{r1}G_{r'2}) \delta_2 \end{align}

so via the triangle inequality (the three terms have weight at most \(q\), \(q\), and \(2q\), respectively), one has

\[\lVert G_{r'1}y_r - G_{r1} y_{r'} \rVert_0 \le 4q < d'\]

Since the left-hand side lies in \(V\) and \(d'\) is the minimum weight of a nonzero vector in \(V\), this forces \(G_{r'1}y_r = G_{r1} y_{r'}\), and similarly \(G_{r'2}y_r = G_{r2} y_{r'}\).
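
The three-term identity above can be checked symbolically (a quick sympy sanity check; generic scalars suffice since everything is coordinatewise):

```python
# Symbolic check of the identity: a_i = G_{r,i}, b_i = G_{r',i},
# yr = y_r, yp = y_{r'}, d_i = delta_i (all treated as scalars).
from sympy import symbols, expand

a1, a2, b1, b2, d1, d2, yr, yp = symbols("a1 a2 b1 b2 d1 d2 yr yp")
lhs = b1 * yr - a1 * yp
rhs = (-b1 * (a1 * d1 + a2 * d2 - yr)
       + a1 * (b1 * d1 + b2 * d2 - yp)
       + (b1 * a2 - a1 * b2) * d2)
assert expand(lhs - rhs) == 0
print("identity checks out")
```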

If there is \(r'\) such that \(y_{r'} \neq 0\), then we can take \(c\) such that \(G_{r'1} + c G_{r'2} = 0\). Now

\[(G_{r1} + cG_{r2}) y_{r'} = (G_{r'1} + cG_{r'2}) y_r = 0\]

so \(G_{r1} + cG_{r2} = 0\) for all \(r \in R\), i.e., all rows indexed by \(R\) are perpendicular to \((1, c)\), leading to the same contradiction as in Part 1. This shows that all \(y_r = 0\). Now our check is just

\[R = \{r \in \{1, \cdots , m\} : \lVert G_{r1} \delta_1 + G_{r2} \delta_2 \rVert_0 \leq q \}\]

now there is no subspace to deal with! Note that taking \(Y = [y_1, y_2]\) gives \(X - Y = [\delta_1, \delta_2]\).

The remaining proof is a relatively simple union bound combined with zero checks.
