Summary on the BaseFold Security Proofs

BaseFold [ZCF23] is a hash-based multilinear polynomial commitment scheme (Multilinear PCS, MPCS) that closely resembles FRI. See [this blog](https://hackmd.io/@al5VH2hqS4ia1WS7YKXuAA/B1H_6Iqma) for a high-level introduction to the protocol of BaseFold. This blog summarizes the progress in the security analysis of BaseFold. This blog will not dive into the details, and only aims to provide an intuitive understanding about the state of the art. Since BaseFold is closely related to FRI, its security proof shares many concepts with that of FRI. See [this blog](https://hackmd.io/@yczhang/rywxV7yWyg) for an introduction to these concepts. Here I'll provide a brief summary for these concepts. ## Summary of FRI Security Analysis The core of FRI security proof is a theorem call proximity gap theorem. There are two functions $\delta(\rho)$ and $\epsilon(\rho)$, such that for any $\delta<\delta(\rho)$ and $\epsilon>\epsilon(\rho)$, a Reed-Solomon code $V$ with code rate $\rho$ has the so called **$(\delta,\epsilon)$-proximity gap**. When translated to the soundness error of FRI, the soundness error is bounded by $\epsilon_{COMMIT}+\epsilon_{QUERY}$, where $\epsilon_{COMMIT}$ is positively related to $\epsilon(\rho)$, and $\epsilon_{QUERY}=(1-\delta(\rho))^{\ell}$, where $\ell$ is the number of queries in FRI. Consequently, we generally want $\epsilon(\rho)$ to be smaller than a certain threshold, and $\delta(\rho)$ to be as large as possible. Here is a summary of the progress in improving $\delta(\rho)$. The corresponding queries are estimated for security bits $\lambda=100$ and $\rho=1/2$. | Paper | $\delta(\rho)$ | Name | Queries | | ------------------------------------------------------------ | ----------------------- | -------------------------- | ------- | | Original FRI Paper [BBHR18] | $(1-3\rho)/4-\eta$ | | - | | Worst-case Paper [BKS18] | $1-\sqrt[4]{\rho}-\eta$ | Double Johnson bound | 400 | | DEEP-FRI Paper [BGKS19] | $1-\sqrt[3]{\rho}-\eta$ | One-and-half Johnson bound | 300 | | Proximity Gaps Paper [BCIKS20] | $1-\sqrt{\rho}-\eta$ | Johnson bound | 200 | | Conjecture of [BCIKS20] | $1-\rho-\eta$ | Conjectured bound | 100 | | The parameter $\delta(\rho)$ steadily improves throughout these works, growing from the very small $(1-3\rho)/4$ to the state of the art $1-\sqrt{\rho}$. It is widely believed that the proximity gap theorem holds for $\delta(\rho)=1-\rho$, but this is not proved yet. Despite that, due to the small number of queries implied by this conjectured bound, many real-world projects based on FRI choose to trust this conjectured security. | | | | ## BaseFold For BaseFold, because the protocol is different from FRI, its security cannot be reduced to that of FRI in a black-box way. Instead, the authors proved the security of BaseFold from one of the above proximity gap theorems. Unfortunately, this proof cannot directly use the most up-to-date version of proximity gap in [BCIKS20]. Instead, the soundness error for BaseFold reverts back to the level of double Johnson bound. This is probably because the original proof technique for FRI cannot work for BaseFold when the oracle vector is beyond the unique decoding radius. To explain in more detail, in each round of BaseFold, in addition to checking that the given oracle is a low degree polynomial, the verifier additionally needs to check that this polynomial is consistent with the sum-check message that is executed in parallel with FRI. This additional requirement defeats the original proof technique that works for FRI. There are two recent works that try to overcome this problem and bring the security level of BaseFold back to the same of FRI. ## DeepFold The idea of DeepFold [GLHQTZ24] is to apply the DEEP technique, proposed in the DEEP-FRI [BGKS19] paper, to BaseFold. Be aware that after applying the DEEP technique, the protocol is a different one from the original, and the security no longer relies on the original proximity gap theorems. In fact, the DEEP technique switches to an entirely different story, from the story of proximity gap to the story **list decodability**. The two stories are easy to be confused, as they have similar state of the art. - Proximity gap is known to hold for Reed-Solomon code $V$ with code rate $\rho$ with $\delta(\rho)=1-\sqrt{\rho}$, which is the best provable result, and is conjectured to hold for $\delta(\rho)=1-\rho$. - The maximal provable list decoding radius is also $\delta(\rho)=1-\sqrt{\rho}$, and it is conjectured that the list decoding radius for Reed-Solomon code is as large as $1-\rho$. By applying the DEEP technique, BaseFold is also changed to a different protocol, which is called DeepFold, whose security relies on the list decodability rather than proximity gap. However, there is still some difficulty in directly applying DEEP technique to BaseFold. In detail, in DEEP-FRI, before sending the folding challenge in each round, the verifier first sends an **out-of-domain** point $z$ and asks the prover to evaluate the current polynomial at $z$. Suppose the answer from the prover is $y$. To validate $y$, the parties update the current polynomial by quotienting, i.e., replace $f(X)$ with $(f(X)-y)/(X-z)$, and continue the protocol with the quotient polynomial. However, quotienting cannot by applied to BaseFold, as this will break the consistence between the oracle in the current round and the sum-check constraint required by BaseFold. To solve this problem, the authors of DeepFold come up with a different method for validating $y$ that avoids the quotienting. I will not dive in the detail of DeepFold, for two reasons: - In terms of the goal for overcoming the obstacle in applying the state-of-the-art proximity gap theorem to BaseFold, its modification to BaseFold is rendered unnecessary by the next paper [Hab24] that I'll talk about, which achieves the same result in the original story of proximity gap, without switching to the story list decodability, and without changing BaseFold. - In addition, DeepFold is outperformed by the new protocol called WHIR [ACFY24]. So I prefer to skip DeepFold and directly investigate WHIR. I'll leave WHIR to another blog. For the remaining of this blog, I will introduce the paper [Hab24] that brings BaseFold back to the same level of FRI. ## BaseFold in List Decoding Regime This paper introduces a variant of the state-of-the-art proximity gap theorem that can be applied to BaseFold, maintaining its original structure and improving its number of queries for achieving the same level of security. To explain the insights of this paper, let me bring in some additional details of FRI, BaseFold and the proximity gap theorem. In FRI, in each round of interaction, the prover sends a new oracle that is claimed to be a polynomial $f$, and the verifier replies with a folding challenge $\alpha$, then the prover folds $f$ by $\alpha$ into a smaller polynomial, and sends the oracle of the smaller polynomial in the next round. This is repeated until the polynomial becomes small enough. In BaseFold, the above procedure is executed together with an instance of the sum-check protocol. In detail, for each oracle sent to the verifier that is claimed to be polynomial $f$, the prover additionally sends a quadratic polynomial $h(X)$ that satisfies $h(0)+h(1)=s$, where $s$ is the claimed sum of the sum-check protocol. The job of the verifier is to, in addition to check that the oracle is indeed a polynomial $f$ as in FRI, also check that $\sum f_i\cdot \otimes(\vec{r})$ is equal to the claimed $s$. Here $f_i$ is the $i$-th coefficient of $f$, and $\otimes(\vec{r})=(1,r_0)\otimes\cdots\otimes(1,r_{\log n-1})$ can be viewed as a publicly known structured vector that has the same number of entries as the polynomial degree, and the sum can be viewed as the inner product as the coefficients of $f$ and this public vector. This inner product is equivalent to evaluating $f$ at $\vec{r}$ as if its coefficients are coefficients of a multilinear polynomial. After receiving the challenge $\alpha$ from the verifier, the oracle and $f$ are folded as in FRI, and $s$ is updated to $h(\alpha)/eq(\alpha,r_0)$, where the division by $eq(\alpha,r_0)$ is to compensate the factor caused by substituting $\alpha$ into the $eq$ polynomial. In practice, this factor is not divided from the target sum, but instead implicitly multiplied to the book-keeping table of the $eq$. For FRI, the security proof is outlined as follows: - The current oracle is close to some polynomial $f$ if only if it can be decomposed into two smaller vectors (of half the size) that are respectively close to some smaller polynomials $f_0$ and $f_1$ that, in the honest case, are computed by decomposing $f$. - The two smaller vectors are both close to some polynomials if and only if their random linear combination is close to some polynomial. The second statement above is essentially guaranteed by the proximity gap theorem. For BaseFold, the situation is more complex because of the additional requirements related to sum-check. To handle the additional requirements, this paper introduces a variant of the proximity gap theorem. The original theorem guarantees the Reed-Solomon code $V$ has the proximity gap property. The variant theorem introduced in this paper switches the target to a subcode $V_{w}$ of $V$, which is the linear subspace of $V$ that additionally requires the inner product between the encoded message and the parameter $w$ is zero. Formally, $V_w=\{v\in V|\langle decode(v),w\rangle=0\}$. This paper proves the same result about $V_w$ as the [BCIKS20] paper about $V$. Now, we can outline the security proof of BaseFold by establishing the equivalence between the following sequence of statements. I remark that this logic is slightly different from the paper, because the paper changes the BaseFold implementation, but I stick to the original version of BaseFold. - The current oracle is close to some polynomial $f$, and the inner product between coefficients of $f$ and $\otimes(\vec{r})$ is equal to $h(0)+h(1)$. - The oracle can be decomposed to two smaller oracles that are respectively close to $f_0$ and $f_1$, such that the inner product between $f_0$ and $\otimes(\vec{r}')$ is $h(0)/(1-r_0)$, and the inner product between $(f_0+f_1)$ and $\otimes(\vec{r}')$ is $h(1)/r_0$. Here $\vec{r}'$ is $\vec{r}$ removing the first entry. This step is pure linear algebra, and the decomposition is the same as FRI. I won't explain the details. - The oracle can be decomposed to two smaller oracles that are respectively close to $f_0$ and $f_1$, such that both $f_0\cdot (1-r_0)-h(0)$ and $(f_0+f_1)\cdot r_0-h(1)$ are in the subcode $V_{\otimes(\vec{r}')}$. - The oracle can be decomposed to two smaller oracles, whose random linear combination with linear coefficients $1$ and $\alpha$, is close to a polynomial $f'$, such that $f'-h(\alpha)/eq(\alpha,r_0)$ is in the subcode $V_{\otimes(\vec{r}')}$. This is the result of applying the above variant of the theorem. Note that $f_0+\alpha f_1$ can be viewed as a linear combination of $f_0\cdot(1-r_0)$ and $(f_0+f_1)\cdot r_0$, with linear coefficients $(\frac{1-\alpha}{1-r_0},\frac{\alpha}{r_0})$. When we apply this linear combination to $h(0)$ and $h(1)$, we obtain exactly $h(\alpha)/eq(\alpha,r_0)$. This conclusion follows from some manipulations of the $eq$ polynomial and the definition of $h(X)$, which I will not dive into. - The oracle can be decomposed to two smaller oracles, whose random linear combination with linear coefficients $1$ and $\alpha$, is close to a polynomial $f'$, such that the inner product of $f'$ and $\otimes(\vec{r}')$ is the new sum $h(\alpha)/eq(\alpha,r_0)$. This is a trivial restatement of the previous one. - After sending the above linearly combined oracle, we arrive exactly at the state of the next round. As a result, we get the same security guarantee for BaseFold as the state of the art of that for FRI, without modifying the protocol. ## References - [ZCF23] BaseFold: Efficient Field-Agnostic Polynomial Commitment Schemes from Foldable Codes, https://eprint.iacr.org/2023/1705 - [BBHR18] Fast Reed-Solomon Interactive Oracle Proofs of Proximity, http://drops.dagstuhl.de/opus/volltexte/2018/9018 - [BKS18] Worst-Case to Average Case Reductions for the Distance to a Code, CCS 2018, https://doi.org/10.4230/LIPIcs.CCC.2018.24 - [BGKS19] DEEP-FRI: Sampling Outside the Box Improves Soundness, https://eprint.iacr.org/2019/336 - [BCIKS20] Proximity Gaps for Reed-Solomon Codes, https://eprint.iacr.org/2020/654 - [GLHQTZ24] DeepFold: Efficient Multilinear Polynomial Commitment from Reed-Solomon Code and Its Application to Zero-knowledge Proofs, https://eprint.iacr.org/2024/1595 - [ACFY24] WHIR: Reed–Solomon Proximity Testing with Super-Fast Verification, https://eprint.iacr.org/2024/1586 - [Hab24] Basefold in the List Decoding Regime, https://eprint.iacr.org/2024/1571