# 1/6: [[DP23]](https://eprint.iacr.org/2023/630.pdf) Proof of Extractability + Efficiency
Continued from https://hackmd.io/keVLzFSrQdmCcmUVh2l_BQ
NOTE 1: There's an issue with a proof (and the proof in the paper as well) - to be fixed later.
NOTE 2: https://hackmd.io/qb-NrfZ7SgWMvPGNF4xPxw
NOTE 3: The proof is now fixed on eprint.
## Part 1: Extractor Definition
The claim is that if $\mathcal{Q}$ is admissible, then our construction has extractability.
The extractor itself resembles the one from Brakedown, but with important differences.
Step 1: Run $\mathcal{A}$ all the way and check if the verifier passes - if not, terminate.
Step 2: By observing $\mathcal{A}$'s random oracle queries while it computes $c$, extract $u = (u_i)_{i=0}^{m-1}$.
Step 3: Denote $(r_{0, 0}, \cdots , r_{0, 2l-1}), t_0'$ as the values sent by $\mathcal{A}$ during Steps 1-2. The extractor rewinds $\mathcal{A}$ to the point immediately after it outputs $c$, samples fresh $(r_{i, 0}, \cdots, r_{i, 2l-1})$, and runs the verifier again - if the verifier accepts, the extractor registers the result as $(r_{i, 0}, \cdots, r_{i, 2l-1}), t_i'$. This continues until the extractor has $m$ datasets. Then,
$$\left( [ \otimes_{j=l}^{2l-1} (1 - r_{i, j}, r_{i, j}) ] \right)_{i=0}^{m-1}$$
is checked to be invertible - if it is not, terminate.
Step 4: One inverts the matrix described above and multiplies it by $[t'_0, \cdots, t'_{m-1}]$ to get $t$.
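To make the flow concrete, here is a minimal Python sketch of the rewinding loop. All interfaces (`run_to_commitment`, `run_from_commitment`, `rebuild_u_from_queries`, `invert_mod_q`, `matmul_mod_q`) are hypothetical stand-ins I made up for illustration, not anything from the paper:

```python
def extract(A, m, l, q):
    # Steps 1-2: run A once; the oracle log lets us rebuild u from A's queries.
    state, c, oracle_log = run_to_commitment(A)   # A outputs the commitment c
    first = run_from_commitment(A, state)         # finish the first run
    if not first.accepted:
        return None                               # Step 1: terminate on failure
    u = rebuild_u_from_queries(oracle_log)        # Step 2: extract u = (u_i), used in the analysis
    rows, ts = [first.r[l:2*l]], [first.t_prime]
    # Step 3: rewind to just after c, resample the verifier randomness, and
    # keep only accepting runs until we have m datasets in total.
    while len(rows) < m:
        run = run_from_commitment(A, state)
        if run.accepted:
            rows.append(run.r[l:2*l])
            ts.append(run.t_prime)
    M = [tensor_row(r, q) for r in rows]          # rows [⊗_j (1 - r_{i,j}, r_{i,j})]
    M_inv = invert_mod_q(M, q)
    if M_inv is None:
        return None                               # matrix not invertible: terminate
    return matmul_mod_q(M_inv, ts, q)             # Step 4: recover t

def tensor_row(rs, q):
    # Kronecker product of the 2-vectors (1 - r, r), reduced mod q.
    row = [1]
    for r in rs:
        row = [x * y % q for x in row for y in ((1 - r) % q, r)]
    return row
```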
## Part 2: Time Complexity Analysis & Proof Outline
Let $\epsilon$ be the probability that $\mathcal{A}$ passes the verifier check - we do not know $\epsilon$.
Since the process terminates immediately when $\mathcal{A}$ fails on the first run, and each of the remaining $m-1$ accepting datasets takes $1/\epsilon$ runs of $\mathcal{A}$ in expectation, the expected number of $\mathcal{A}$ runs for the extractor algorithm $\mathcal{E}$ can be simply computed as
$$1 + \epsilon \cdot \frac{m-1}{\epsilon} = m$$
and as each run of $\mathcal{A}$ is poly-time, $\mathcal{E}$ is shown to be PPT.
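A quick Monte Carlo check of this expectation, with toy numbers (purely illustrative):

```python
import random

def runs_until_done(eps, m):
    runs = 1
    if random.random() > eps:   # first run fails: terminate immediately
        return runs
    got = 1                     # the first accepting dataset
    while got < m:              # rewind until m accepting datasets in total
        runs += 1
        if random.random() <= eps:
            got += 1
    return runs

eps, m, trials = 0.05, 8, 100_000
avg = sum(runs_until_done(eps, m) for _ in range(trials)) / trials
print(avg)  # ≈ m = 8, independent of eps
```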
We show the following, in order:
- First, we show that $u = (u_i)_{i=0}^{m-1}$ can be extracted - this is a fun algorithm that rebuilds the entire Merkle tree just by observing the points where the random oracle was queried. To be exact, we show that this is possible with negligible failure probability.
- We show that $d((u_i)_{i=0}^{m-1}, V) \ge d/3$ forces $\mathcal{A}$ to pass with negligible probability.
- This shows that the $u_i$'s are within the unique decoding radius. Write $t_i$ for the closest codewords.
- $t_0' \neq [\otimes_{i=l}^{2l-1} (1 - r_{0, i}, r_{0, i})] \cdot [t_0, \cdots, t_{m-1}]^T$ forces $\mathcal{A}$ to pass with negligible probability.
- Disregarding all failure probabilities mentioned above, we show that
- $t_i' = [\otimes_{j=l}^{2l-1} (1 - r_{i, j}, r_{i, j})] \cdot [t_0, \cdots, t_{m-1}]^T$ with overwhelming probability.
- $\left( [ \otimes_{j=l}^{2l-1} (1 - r_{i, j}, r_{i, j}) ] \right)_{i=0}^{m-1}$ is invertible with overwhelming probability.
It is easy to see that these claims are sufficient for extractability.
## Step 1: Extract the Merkle Tree: [[BSCS16 Appendix]](https://www.iacr.org/archive/tcc2016b/99850156/99850156.pdf)
A more technical outline can be found in the paper linked above, but intuitively this is not hard. Each time the verifier accepts a proof, it accepts Merkle proofs for the $\kappa$ columns of $(u_i)_{i=0}^{m-1}$ that are queried. Due to the classical properties of a Merkle tree, the query answers will be consistent (otherwise there would be a collision on the random oracle side), and sufficiently many runs are enough to collect all $n$ columns. The failure probabilities come from cases like a random oracle value collision, or the random oracle never being queried for some points on the authentication path, but if the random oracle's output range is a sufficiently large set these probabilities are clearly negligible. This line of thinking does require rewinding.
I'm not sure if literal "straight-line" extraction (with no rewinding) is actually possible. The concrete proof idea *with rewinding* should be similar to the extractor in BSCS16, though.
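As a toy illustration of the tree-rebuilding idea (a simplification, not the BSCS16 extractor itself): record every random-oracle query $(\text{left}, \text{right}) \mapsto \text{digest}$ that $\mathcal{A}$ makes, invert that map (well-defined unless a collision occurred), and walk down from the root.

```python
def extract_leaves(root, oracle_log, depth):
    """oracle_log: {digest: (left, right)}, built by inverting A's observed
    random-oracle queries; well-defined assuming no collision occurred."""
    layer = [root]
    for _ in range(depth):
        nxt = []
        for node in layer:
            if node not in oracle_log:
                # A never queried this internal node; for an accepting A,
                # this happens only with negligible probability.
                return None
            nxt.extend(oracle_log[node])
        layer = nxt
    return layer  # the leaves, i.e. (hashes of) the columns of (u_i)
```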
## Step 2: $d((u_i)_{i=0}^{m-1}, V) < d/3$
Denote $e = \lfloor (d - 1) / 3 \rfloor$, and write $u' = [\otimes_{i=l}^{2l-1} (1 - r_i, r_i)] \cdot [u_0, \cdots, u_{m-1}]^T$.
We assume that $d((u_i)_{i=0}^{m-1}, V) \ge d/3$ - by our testing result, we know that $d(u', V) > e$ except with failure probability at most $2el/q$. In that case, $d(u', \text{Enc}(t')) > e$, so the success probability is at most (see Theorem 4.4 from [Ligero](https://eprint.iacr.org/2022/1608.pdf))
$$ \frac{2el}{q} + \left( 1 - \frac{e}{n} \right)^\kappa < \frac{2dl}{3q} + \left( 1 - \frac{d-3}{3n} \right)^\kappa$$
which is negligible, as we desired. Therefore, we may assume $d((u_i)_{i=0}^{m-1}, V) < d/3$.
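For a feel of the numbers, here is the bound evaluated at some made-up parameters (these are not the paper's):

```python
q = 2**128                 # field size
n, d = 4096, 2048          # code length and minimum distance (illustrative)
l, kappa = 12, 200         # tensor length and number of queried columns

bound = 2*d*l / (3*q) + (1 - (d - 3) / (3*n)) ** kappa
print(bound)               # ≈ 1.5e-16: the column-check term dominates
```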
## Step 3: $t_0' = [\otimes_{i=l}^{2l-1} (1 - r_{0, i}, r_{0, i})] \cdot [t_0, \cdots, t_{m-1}]^T$
As above, write $e = \lfloor (d - 1) / 3 \rfloor$, and write $u' = [\otimes_{i=l}^{2l-1} (1 - r_i, r_i)] \cdot [u_0, \cdots, u_{m-1}]^T$.
Also, we write $v' = [\otimes_{i=l}^{2l-1} (1 - r_i, r_i)] \cdot [\text{Enc}(t_0), \cdots, \text{Enc}(t_{m-1})]^T$. We know $d(u', v') \le e$.
If we have $\text{Enc}(t') \neq v'$, then we have, by triangle inequality,
$$d(u', \text{Enc}(t')) \ge d(v', \text{Enc}(t')) - d(u', v') \ge d - e > 2d/3$$
so the testing passes with probability at most
$$\left( 1 - \frac{2d}{3n} \right)^\kappa$$
which is once again negligible.
## Step 4: $t_i' = [\otimes_{j=l}^{2l-1} (1 - r_{i, j}, r_{i, j})] \cdot [t_0, \cdots, t_{m-1}]^T$ for each $i$
This is fundamentally different from Step 3 - there, the first execution of $\mathcal{A}$ is literally the "first" execution (if it fails, it's game over, so we can safely disregard events that force the success probability of $\mathcal{A}$ to be negligible), but now we run $\mathcal{A}$ until it passes.
However, intuition via Bayes' theorem suggests that the argument of Step 3 should still work - especially when the success probability of $\mathcal{A}$ is sufficiently large. If $\epsilon$ is negligible, we are done, and if $\epsilon$ is sufficiently large, we are done too. So we set up a "boundary" for $\epsilon$ and use two different proofs depending on whether $\epsilon$ is above or below that boundary.
So we write
- $V$: the event where $\mathcal{A}$ passes
- $E$: the event in which $\mathcal{A}$ outputs $t' = [\otimes_{i=l}^{2l-1} (1 - r_i, r_i)] \cdot [t_0, \cdots, t_{m-1}]^T$
Basically, what we have already proved so far is that
$$\max(l/q, \text{Pr}[V | \neg E]) = \delta$$
is a negligible value. If $\epsilon \le \sqrt{\delta}$, then the success probability of a single run is already negligible, so we are done - failing to extract anything is fine here.
We note that $\delta$ can be explicitly written with a formula via concrete analysis of Step 1-3.
Now we assume $\epsilon > \sqrt{\delta}$. A nice handling of conditional probabilities leads to
$$\sqrt{\delta} < \epsilon = \text{Pr}[V] = \text{Pr}[V \cap E] + \text{Pr}[V | \neg E] \cdot \text{Pr}[\neg E] \le \text{Pr}[V \cap E] + \delta$$
so $\text{Pr}[V \cap E] > \sqrt{\delta} - \delta$, which via Bayes theorem gives us
$$\text{Pr}[E | V] = \frac{\text{Pr}[V \cap E]}{\text{Pr}[V \cap E] + \text{Pr}[V | \neg E] \cdot \text{Pr}[\neg E]} > \frac{ \sqrt{\delta} - \delta }{\sqrt{\delta} - \delta + \delta } = 1 - \sqrt{\delta}$$
which is overwhelmingly high. By a union bound, this works across all $m-1$ rewound iterations as well.
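Numerically, the bound behaves as follows (arbitrary $\delta$ values, just to see the shape):

```python
import math

def posterior_lower_bound(delta):
    # Pr[E | V] >= x / (x + delta) with x = sqrt(delta) - delta,
    # which simplifies to exactly 1 - sqrt(delta).
    x = math.sqrt(delta) - delta
    return x / (x + delta)

for delta in (1e-4, 1e-8, 1e-12):
    print(delta, posterior_lower_bound(delta))  # 0.99, 0.9999, 0.999999
```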
## Step 5: $\left( [ \otimes_{j=l}^{2l-1} (1 - r_{i, j}, r_{i, j}) ] \right)_{i=0}^{m-1}$ is invertible
This is similar to Step 4 - we are proving that $\epsilon$ is so much larger than the probability of a linear dependence occurring that Bayes' theorem is enough to finish off. The probability of linear dependence is, of course, bounded by none other than the Schwartz-Zippel lemma.
We claim that for each proper linear subspace $A$, the probability of $[\otimes_{i=l}^{2l-1} (1 - r_{i}, r_{i})] \in A$ is negligible. By repeatedly iterating this for the subspaces created by the first $i$ tensor product vectors output, we will obtain the result that the final matrix is invertible with overwhelming probability. To simplify, we can just prove this for $A$ a hyperplane - write $A = \{u : \langle u, a \rangle = 0 \}$.
One notices that the set $S$ of $(r_l, \cdots ,r_{2l-1})$ such that $\langle [\otimes_{i=l}^{2l-1} (1-r_i, r_i)], a \rangle = 0$ has density at most $l/q \le \delta$ - this is simply the Schwartz-Zippel lemma. Abusing notation, we also write $S$ for the event that $(r_l, \cdots, r_{2l-1}) \in S$. Now it's a similar Bayes theorem technique again -
$$\sqrt{\delta} < \epsilon = \text{Pr}[V] \le \text{Pr}[S] + \text{Pr}[V \cap \neg S] \le \delta + \text{Pr}[V \cap \neg S]$$
so $\text{Pr}[V \cap \neg S] > \sqrt{\delta} - \delta$. Once again, by Bayes theorem, we have
$$\text{Pr}[\neg S | V] = \frac{\text{Pr}[V \cap \neg S]}{\text{Pr}[V \cap \neg S] + \text{Pr}[V \cap S]} > \frac{\sqrt{\delta} - \delta}{\sqrt{\delta} - \delta + \delta} = 1 - \sqrt{\delta}$$
which is once again overwhelmingly high. This works for all $m-1$ iterations as well.
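A quick empirical sanity check of the Schwartz-Zippel density over a toy field (all sizes made up):

```python
import random

q, l = 10007, 4                      # toy prime field and tensor length

def tensor(rs):
    row = [1]
    for r in rs:
        row = [x * y % q for x in row for y in ((1 - r) % q, r)]
    return row

a = [random.randrange(q) for _ in range(2**l)]   # a fixed (nonzero w.h.p.) hyperplane
trials = 100_000
hits = sum(
    sum(x * y for x, y in zip(tensor([random.randrange(q) for _ in range(l)]), a)) % q == 0
    for _ in range(trials)
)
print(hits / trials, "<=", l / q)    # empirical density vs the l/q bound
```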
The technique of setting $\sqrt{\delta}$ as the "boundary" is very beautiful!!
## Final Notes on Efficiency
We note that the increased error probability for the proximity testing doesn't actually increase $\kappa$ at all - the distance is still guaranteed to be at least $d/3$ when the test fails, so it's all good.
In Brakedown / Ligero, the proof size is mentioned to be $2c + \kappa r$.
Here, however, it can be seen that the size is $c + \kappa r$, since testing and evaluation are done at once. Therefore, parameter selection (with the usual AM-GM stuff) can be done so that the proof size decreases by a factor of $\sqrt{2}$, which is a very nice result!
We note that the optimal $r, c$ are
- Brakedown / Ligero: $r = \sqrt{2/\kappa} \cdot m$, $c = \sqrt{\kappa/2} \cdot m$
- This work: $r = \sqrt{1 / \kappa} \cdot m$, $c = \sqrt{\kappa} \cdot m$
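Under the constraint $rc = m^2$, the AM-GM optimization plays out like this (illustrative $\kappa, m$):

```python
import math

kappa, m = 200, 1024

# Brakedown / Ligero: minimize 2c + kappa*r subject to rc = m^2.
r_lig, c_lig = math.sqrt(2 / kappa) * m, math.sqrt(kappa / 2) * m
# This work: minimize c + kappa*r subject to rc = m^2.
r_new, c_new = m / math.sqrt(kappa), math.sqrt(kappa) * m

size_lig = 2 * c_lig + kappa * r_lig    # = 2*sqrt(2*kappa)*m
size_new = c_new + kappa * r_new        # = 2*sqrt(kappa)*m
print(size_lig / size_new)              # = sqrt(2) ≈ 1.414
```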
Also, it's intuitive that the less-repetitive format of the testing/evaluation leads to optimizations of about a factor of $2$. Apparently this is supported by the benchmarks as well.
For a more detailed reason why we can expect a 2x speedup, see below.
The prover time is
- Commitment: $r \cdot \text{Enc}(c, \lambda)$ for both Ligero and this work
- Prove: $2m^2$ for Ligero and $m^2$ for this work
so on pure proving time we can expect a 2x speedup.
The verifier time is
- $2(\text{Enc}(c, \lambda) + \kappa \cdot r) + c$ for Ligero
- $\text{Enc}(c, \lambda) + \kappa \cdot r + c$ for this work,
and depending on $\text{Enc}$'s concrete constants (which I think are 10 to 20), this is a 2x speedup.
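To see where the 2x comes from, compare the two verifier costs at the same $(r, c)$ (purely to isolate the halving of testing/evaluation), with `C_enc` standing in for $\text{Enc}$'s per-element constant (my 10-20 guess from above):

```python
import math

kappa, m, C_enc = 200, 1024, 15
r, c = m / math.sqrt(kappa), math.sqrt(kappa) * m   # this work's parameters

ligero_v = 2 * (C_enc * c + kappa * r) + c
new_v = C_enc * c + kappa * r + c
print(ligero_v / new_v)   # ≈ 1.94; tends to 2 as C_enc grows
```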
![image](https://hackmd.io/_uploads/SJ_36xDOp.png)
This is from the Brakedown paper, Figure 2, for example.