1/6: [DP23] Proof of Extractability

# 1/6: [[DP23]](https://eprint.iacr.org/2023/630.pdf) Proof of Extractability + Efficiency Continued from https://hackmd.io/keVLzFSrQdmCcmUVh2l_BQ NOTE 1: There's an issue with a proof (and the proof in the paper as well) - to be fixed later. NOTE 2: https://hackmd.io/qb-NrfZ7SgWMvPGNF4xPxw NOTE 3: The proof is now fixed on eprint. ## Part 1: Extractor Definition The claim is that if $\mathcal{Q}$ is admissible, then our construction has extractability. The extractor itself resembles but has important differences with the one from Brakedown. Step 1: Run $\mathcal{A}$ all the way and check if the verifier passes - if not, terminate. Step 2: By observing $\mathcal{A}$'s random oracle queries while computing $c$, extract $u = (u_i)_{i=0}^{m-1}$ Step 3: Denote $(r_{0, 0}, \cdots , r_{0, 2l-1}), t_0'$ as the results sent by $\mathcal{A}$ for Step 1-2. The extractor rewinds $\mathcal{A}$ to the point immediately after outputting $c$ - then gets samples $(r_{i, 0}, \cdots, r_{i, 2l-1})$ again and runs the verifier again - and if the verifier works, the extractor registers it as $(r_{i, 0}, \cdots, r_{i, 2l-1}), t_i'$. This continues until the extractor gets $m$ datasets. Then, $$\left( [ \otimes_{j=l}^{2l-1} (1 - r_{i, j}, r_{i, j}) ] \right)_{i=0}^{m-1}$$ is checked to be invertible - if it is not, terminate. Step 4: One inverts the matrix describe above and multiplies it to $[t'_0, \cdots, t'_{m-1}]$ to get $t$. ## Part 2: Time Complexity Analysis & Proof Outline Let $\epsilon$ be the probability that $\mathcal{A}$ passes the verifier check - we do not know $\epsilon$. Since the process terminates immediately when $\mathcal{A}$ fails on the first run, the expected number of $\mathcal{A}$ runs for the extractor algorithm $\mathcal{E}$ can be simply computed as $$1 + \epsilon \cdot \frac{m-1}{\epsilon} = m$$ and as each run of $\mathcal{A}$ is poly-time, $\mathcal{E}$ is shown to be PPT. We show the following in order - First, we show that $u = (u_i)_{i=0}^{m-1}$ can be extracted - this is a fun algorithm that only observes the points where the random oracle was queried to re-build the entire merkle tree. To be exact, we show that this is possible with negligible failure probability. - We show that $d((u_i)_{i=0}^{m-1}, V) \ge d/3$ forces $\mathcal{A}$ to pass with negligible probability. - This shows that $u_i$'s are in unique decoding radius. Write $t_i$ to be the closest codewords. - $t_0' \neq [\otimes_{i=l}^{2l-1} (1 - r_{0, i}, r_{0, i})] \cdot [t_0, \cdots, t_{m-1}]^T$ forces $\mathcal{A}$ to pass with negligible probability. - Disregarding all failure probabilities mentioned above, we show that - $t_i' = [\otimes_{j=l}^{2l-1} (1 - r_{i, j}, r_{i, j})] \cdot [t_0, \cdots, t_{m-1}]^T$ with overwhelming probability. - $\left( [ \otimes_{j=l}^{2l-1} (1 - r_{i, j}, r_{i, j}) ] \right)_{i=0}^{m-1}$ is invertible with overwhelming probability. It can be easily shown that these claims are sufficient to show extractability. ## Step 1: Extract the Merkle Tree: [[BSCS16 Appendix]](https://www.iacr.org/archive/tcc2016b/99850156/99850156.pdf) A more technical outline can be found in the paper linked above, but intuitively this is not hard. Each time the verifier accepts a proof, it accepts merkle proofs for the $\kappa$ columns of $(u_i)_{i=0}^{m-1}$ that are queried. Due to the classical properties of a Merkle tree, the query answers will be consistent (otherwise there will be a collision in the random oracle side) and sufficient runs would be enough to get all $n$ columns. The failure probabilities come from cases like a random oracle value collision or random oracle not even being queried for some points on the authentication path, but if the random oracle's range of outputs is a sufficiently large set these probabilities are clearly negligible. This idea of thinking does require rewinding. I'm not sure if the literal "straight-line" extraction (with no rewinding) is possible actually. The concrete proof idea *with rewinding* should be similar to the extractor in BSCS16 though. ## Step 2: $d((u_i)_{i=0}^{m-1}, V) < d/3$ Denote $e = \lfloor (d - 1) / 3 \rfloor$, and write $u' = [\otimes_{i=l}^{2l-1} (1 - r_i, r_i)] \cdot [u_0, \cdots, u_{m-1}]^T$. We assume that $d((u_i)_{i=0}^{m-1}, V) \ge d/3$ - by our testing result, we know that $d(u', V) > e$ with failure probability at most $2el/q$. Therefore, $d(u', \text{Enc}(t')) > e$, giving us the success probability to be at most (see Theorem 4.4 from [Ligero](https://eprint.iacr.org/2022/1608.pdf)) $$ \frac{2el}{q} + \left( 1 - \frac{e}{n} \right)^\kappa < \frac{2dl}{3q} + \left( 1 - \frac{d-3}{3n} \right)^\kappa$$ which is negligible, as we desired. Therefore, we may assume $d((u_i)_{i=0}^{m-1}, V) < d/3$. ## Step 3: $t_0' = [\otimes_{i=l}^{2l-1} (1 - r_{0, i}, r_{0, i})] \cdot [t_0, \cdots, t_{m-1}]^T$ As above, write $e = \lfloor (d - 1) / 3 \rfloor$, and write $u' = [\otimes_{i=l}^{2l-1} (1 - r_i, r_i)] \cdot [u_0, \cdots, u_{m-1}]^T$. Also, we write $v' = [\otimes_{i=l}^{2l-1} (1 - r_i, r_i)] \cdot [\text{Enc}(t_0), \cdots, \text{Enc}(t_{m-1})]^T$. We know $d(u', v') \le e$. If we have $\text{Enc}(t') \neq v'$, then we have, by triangle inequality, $$d(u', \text{Enc}(t')) \ge d(v', \text{Enc}(t')) - d(u', v') \ge d - e > 2d/3$$ so the testing passes with probability at most $$\left( 1 - \frac{2d}{3n} \right)^\kappa$$ which is once again negligible. ## Step 4: $t_i' = [\otimes_{j=l}^{2l-1} (1 - r_{i, j}, r_{i, j})] \cdot [t_0, \cdots, t_{m-1}]^T$ for each $i$ This is fundamentally different from Step 4 - note that the first execution of $\mathcal{A}$ is literally the "first" execution (if it fails, then it's game over, so we can safely disregard the cases where the success probability of $\mathcal{A}$ is forced to be negligible) but now we run $\mathcal{A}$ until it works. However, the intuition with Bayes theorem leads us to thinking that Step 3 should work - especially in the case where success probability of $\mathcal{A}$ is sufficiently large enough. If $\epsilon$ is negligible, we are done, and if $\epsilon$ is sufficiently large, we are done too. So we set up a "boundary" for $\epsilon$ to use two proofs depending on whether $\epsilon$ is over/below that boundary. So we write - $V$: the event where $\mathcal{A}$ passes - $E$: the event in which $\mathcal{A}$ outputs $t' = [\otimes_{i=l}^{2l-1} (1 - r_i, r_i)] \cdot [t_0, \cdots, t_{m-1}]^T$ Basically, what we have already proved so far is that $$\max(l/q, \text{Pr}[V | \neg E]) = \delta$$ is a negligible value. If $\epsilon \le \sqrt{\delta}$, then the success probability of one run itself is already negligible, so we are done already - failing to extract anything is fine here. We note that $\delta$ can be explicitly written with a formula via concrete analysis of Step 1-3. Now we assume $\epsilon > \sqrt{\delta}$. A nice handling of conditional probabilities leads to $$\sqrt{\delta} < \epsilon = \text{Pr}[V] = \text{Pr}[V \cap E] + \text{Pr}[V | \neg E] \cdot \text{Pr}[\neg E] \le \text{Pr}[V \cap E] + \delta$$ so $\text{Pr}[V \cap E] > \sqrt{\delta} - \delta$, which via Bayes theorem gives us $$\text{Pr}[E | V] = \frac{\text{Pr}[V \cap E]}{\text{Pr}[V \cap E] + \text{Pr}[V | \neg E] \cdot \text{Pr}[\neg E]} > \frac{ \sqrt{\delta} - \delta }{\sqrt{\delta} - \delta + \delta } = 1 - \sqrt{\delta}$$ which is overwhelmingly high. So this works for all $m-1$ iterations as well. ## Step 5: $\left( [ \otimes_{j=l}^{2l-1} (1 - r_{i, j}, r_{i, j}) ] \right)_{i=0}^{m-1}$ is invertible This is similar to Step 4 - we are proving that $\epsilon$ is so much larger than the probability that a linear dependence would occur - that the Bayes theorem would be enough to finish off. The probability of linear dependence is, of course, non other than Schwartz-Zippel lemma. We claim that for each proper linear subspace $A$, the probability of $[\otimes_{i=l}^{2l-1} (1 - r_{i}, r_{i})] \in A$ is negligible. By repeatedly iterating this for the subspaces created by the first $i$ tensor product vectors output, we will obtain the result that the final matrix is invertible with overwhelming probability. To simplify, we can just prove this for $A$ a hyperplane - write $A = \{u : \langle u, a \rangle = 0 \}$. One notices that the set $S$ of $(r_l, \cdots ,r_{2l-1})$ such that $\langle [\oplus_{i=l}^{2l-1} (1-r_i, r_i)], a \rangle = 0$ has density at most $l/q \le \delta$ - this is simply due to Schwartz-Zippel lemma. We also define $S$ as the event where $(r_l, \cdots, r_{2l-1}) \in S$ as well. Now it's a similar Bayes theorem technique again - $$\sqrt{\delta} < \epsilon = \text{Pr}[V] \le \text{Pr}[S] + \text{Pr}[V \cap \neg S] \le \delta + \text{Pr}[V \cap \neg S]$$ so $\text{Pr}[V \cap \neg S] > \sqrt{\delta} - \delta$. Once again, by Bayes theorem, we have $$\text{Pr}[\neg S | V] = \frac{\text{Pr}[V \cap \neg S]}{\text{Pr}[V \cap \neg S] + \text{Pr}[V \cap S]} > \frac{\sqrt{\delta} - \delta}{\sqrt{\delta} - \delta + \delta} = 1 - \sqrt{\delta}$$ which is once again overwhelmingly high. This works for all $m-1$ iterations as well. The technique of setting $\sqrt{\delta}$ as the "boundary" is very beautiful!! ## Final Notes on Efficiency We note that the increased error probability for the proximity testing doesn't actually increase $\kappa$ at all - the distance is still guaranteed to be at least $d/3$ when the test fails, so it's all good. In Brakedown / Ligero, the proof size is mentioned to be $2c + \kappa r$. Here, however, it can be seen that the size is $c + \kappa r$. This is due to testing/evaluation being done at once. Therefore, by parameter selection (with the usual AM-GM stuff) can be done so that the proof size decreases by a factor of $\sqrt{2}$, which is a very nice result! We note that the optimal $r, c$ is - Brakedown / Ligero: $r = \sqrt{2/\kappa} \cdot m$, $c = \sqrt{\kappa/2} \cdot m$ - This work: $r = \sqrt{1 / \kappa} \cdot m$, $c = \sqrt{\kappa} \cdot m$ Also it's intuitive that the less-repetitive format of the testing/evaluation leads to optimizations, about a factor of $2$. Apparently this is supported by the benchmarks as well. For a more detailed reason why we can expect a 2x speedup, see below. The prover time is - Commitment: $r \cdot \text{Enc}(c, \lambda)$ for both Ligero and this work - Prove: $2m^2$ for Ligero and $m^2$ for this work so on pure proving time we can expect a 2x speedup. The verifier time is - $2(\text{Enc}(c, \lambda) + \kappa \cdot r) + c$ for Ligero - $\text{Enc}(c, \lambda) + \kappa \cdot r + c$ for this work. depending on $\text{Enc}$'s concrete constants, (which I think is 10 to 20) this is a 2x speedup. ![image](https://hackmd.io/_uploads/SJ_36xDOp.png) This is from the Brakedown paper, Figure 2, for example.