1/6: [DP23] Proof of Extractability + Efficiency

Continued from https://hackmd.io/keVLzFSrQdmCcmUVh2l_BQ

NOTE 1: There's an issue with a proof (and the proof in the paper as well) - to be fixed later.

NOTE 2: https://hackmd.io/qb-NrfZ7SgWMvPGNF4xPxw

NOTE 3: The proof is now fixed on eprint.

Part 1: Extractor Definition

The claim is that if \(\mathcal{Q}\) is admissible, then our construction has extractability.

The extractor itself resembles the one from Brakedown, but with some important differences.

Step 1: Run \(\mathcal{A}\) all the way and check if the verifier passes - if not, terminate.

Step 2: By observing \(\mathcal{A}\)'s random oracle queries while it computes \(c\), extract \(u = (u_i)_{i=0}^{m-1}\).

Step 3: Denote by \((r_{0, 0}, \cdots , r_{0, 2l-1}), t_0'\) the values sent by \(\mathcal{A}\) in Steps 1-2. The extractor rewinds \(\mathcal{A}\) to the point immediately after it outputs \(c\), samples fresh \((r_{i, 0}, \cdots, r_{i, 2l-1})\), and runs the verifier again; if the verifier accepts, the extractor records \((r_{i, 0}, \cdots, r_{i, 2l-1}), t_i'\). This continues until the extractor has collected \(m\) datasets. Then,

\[\left( [ \otimes_{j=l}^{2l-1} (1 - r_{i, j}, r_{i, j}) ] \right)_{i=0}^{m-1}\]

is checked to be invertible - if it is not, terminate.

Step 4: One inverts the matrix described above and multiplies it by \([t'_0, \cdots, t'_{m-1}]\) to get \(t\) - a concrete sketch of Steps 3-4 follows below.
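To make Steps 3-4 concrete, here is a minimal Python sketch of the extractor's linear-algebra core, under toy assumptions: an honest prover stands in for \(\mathcal{A}\) (so every rewound transcript accepts), and `tensor_row`, `solve_mod_q`, and all parameter values are my own illustrative choices rather than anything from the paper.

```python
import random

q = 2**31 - 1   # illustrative prime field, standing in for the scheme's field
l = 3           # half the number of variables, so the system is m x m
m = 2**l
w = 4           # width of each message row t_i (illustrative)

def tensor_row(r):
    """Row vector [tensor_j (1 - r_j, r_j)] of length 2**l over GF(q)."""
    row = [1]
    for rj in r:
        row = [x * y % q for x in row for y in (1 - rj, rj)]
    return row

def solve_mod_q(A, b):
    """Solve A x = b over GF(q) by Gauss-Jordan elimination; None if singular."""
    n = len(A)
    M = [Ar[:] + [br] for Ar, br in zip(A, b)]
    for col in range(n):
        piv = next((r for r in range(col, n) if M[r][col]), None)
        if piv is None:
            return None                    # tensor rows were linearly dependent
        M[col], M[piv] = M[piv], M[col]
        inv = pow(M[col][col], q - 2, q)   # Fermat inverse
        M[col] = [v * inv % q for v in M[col]]
        for r in range(n):
            if r != col and M[r][col]:
                f = M[r][col]
                M[r] = [(vr - f * vc) % q for vr, vc in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

random.seed(0)
T = [[random.randrange(q) for _ in range(w)] for _ in range(m)]  # secret t rows

# Step 3: collect m accepting transcripts (tensor row, t_i')
samples = []
for _ in range(m):
    r = [random.randrange(q) for _ in range(l)]
    row = tensor_row(r)
    t_prime = [sum(row[i] * T[i][k] for i in range(m)) % q for k in range(w)]
    samples.append((row, t_prime))

# Step 4: invert the stacked rows, column by column, to recover t
R = [row for row, _ in samples]
cols = [solve_mod_q(R, [tp[k] for _, tp in samples]) for k in range(w)]
assert all(c is not None for c in cols)    # invertible w.h.p. (Schwartz-Zippel)
assert all(cols[k][i] == T[i][k] for k in range(w) for i in range(m))
print("recovered all of t")
```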

Part 2: Time Complexity Analysis & Proof Outline

Let \(\epsilon\) be the probability that \(\mathcal{A}\) passes the verifier check - we do not know \(\epsilon\).

Since the process terminates immediately when \(\mathcal{A}\) fails on the first run, the expected number of \(\mathcal{A}\) runs for the extractor algorithm \(\mathcal{E}\) can be simply computed as

\[1 + \epsilon \cdot \frac{m-1}{\epsilon} = m\]

and as each run of \(\mathcal{A}\) is poly-time, \(\mathcal{E}\) runs in expected polynomial time.
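As a quick sanity check on this expected-run count, here is a small simulation (entirely a toy of my own, with a biased coin standing in for \(\mathcal{A}\) passing): the average number of runs stays near \(m\) regardless of \(\epsilon\).

```python
import random

def runs_of_A(eps, m):
    """Terminate after one run if the first run fails; otherwise rerun
    until m accepting transcripts have been collected."""
    runs, got = 1, 0
    if random.random() < eps:
        got = 1
        while got < m:
            runs += 1
            got += random.random() < eps
    return runs

random.seed(1)
trials = 200_000
for eps in (0.9, 0.5, 0.1):
    avg = sum(runs_of_A(eps, m=4) for _ in range(trials)) / trials
    print(f"eps={eps}: average runs = {avg:.2f}")   # all close to m = 4
```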

We show the following, in order:

  • First, we show that \(u = (u_i)_{i=0}^{m-1}\) can be extracted - this is a fun algorithm that re-builds the entire Merkle tree by observing only the points where the random oracle was queried. To be exact, we show that this is possible with negligible failure probability.
  • We show that if \(d((u_i)_{i=0}^{m-1}, V) \ge d/3\), then \(\mathcal{A}\) passes with negligible probability.
  • This shows that the \(u_i\)'s are within the unique decoding radius. Write \(t_i\) for the closest codewords.
  • We show that if \(t_0' \neq [\otimes_{i=l}^{2l-1} (1 - r_{0, i}, r_{0, i})] \cdot [t_0, \cdots, t_{m-1}]^T\), then \(\mathcal{A}\) passes with negligible probability.
  • Disregarding all failure probabilities mentioned above, we show that
    • \(t_i' = [\otimes_{j=l}^{2l-1} (1 - r_{i, j}, r_{i, j})] \cdot [t_0, \cdots, t_{m-1}]^T\) with overwhelming probability.
    • \(\left( [ \otimes_{j=l}^{2l-1} (1 - r_{i, j}, r_{i, j}) ] \right)_{i=0}^{m-1}\) is invertible with overwhelming probability.

It can be easily shown that these claims are sufficient to show extractability.

Step 1: Extract the Merkle Tree: [BSCS16 Appendix]

A more technical outline can be found in the paper linked above, but intuitively this is not hard. Each time the verifier accepts a proof, it accepts Merkle proofs for the \(\kappa\) queried columns of \((u_i)_{i=0}^{m-1}\). By the classical properties of a Merkle tree, the query answers will be consistent (otherwise there would be a collision on the random oracle side), and sufficiently many runs yield all \(n\) columns. The failure probabilities come from cases like a random oracle value collision, or the random oracle never even being queried on some point of the authentication path, but if the random oracle's range of outputs is a sufficiently large set, these probabilities are clearly negligible. This line of reasoning does require rewinding.

I'm not sure whether literal "straight-line" extraction (with no rewinding) is actually possible. The concrete proof idea with rewinding should be similar to the extractor in BSCS16, though.
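To illustrate the idea, here is a toy Python sketch (my own construction, only loosely modeled on the BSCS16 extractor): we log every query the prover makes to a hash standing in for the random oracle, then rebuild the whole tree from the root by inverting the log; extraction fails only if some inner node was never queried, matching the negligible failure cases above.

```python
import hashlib

query_log = {}   # digest -> (left, right): the observed oracle queries

def oracle(left: bytes, right: bytes) -> bytes:
    """SHA-256 standing in for the random oracle; every query is logged."""
    d = hashlib.sha256(left + right).digest()
    query_log[d] = (left, right)
    return d

def commit(leaves):
    """Honest prover: build a Merkle root over the leaves."""
    level = leaves
    while len(level) > 1:
        level = [oracle(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def extract(root, depth):
    """Extractor: walk down from the root by inverting logged queries.
    Returns None if some node was never queried (negligible for an
    accepting prover, as argued above)."""
    level = [root]
    for _ in range(depth):
        nxt = []
        for node in level:
            if node not in query_log:
                return None
            nxt.extend(query_log[node])
        level = nxt
    return level

leaves = [bytes([i]) * 32 for i in range(8)]
root = commit(leaves)
assert extract(root, depth=3) == leaves
print("all 8 leaves recovered from the query log")
```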

Step 2: \(d((u_i)_{i=0}^{m-1}, V) < d/3\)

Denote \(e = \lfloor (d - 1) / 3 \rfloor\), and write \(u' = [\otimes_{i=l}^{2l-1} (1 - r_i, r_i)] \cdot [u_0, \cdots, u_{m-1}]^T\).

We assume that \(d((u_i)_{i=0}^{m-1}, V) \ge d/3\) - by our testing result, we know that \(d(u', V) > e\) except with failure probability at most \(2el/q\). Therefore \(d(u', \text{Enc}(t')) > e\), which bounds the success probability by at most (see Theorem 4.4 from Ligero)

\[ \frac{2el}{q} + \left( 1 - \frac{e}{n} \right)^\kappa < \frac{2dl}{3q} + \left( 1 - \frac{d-3}{3n} \right)^\kappa\]

which is negligible, as we desired. Therefore, we may assume \(d((u_i)_{i=0}^{m-1}, V) < d/3\).
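For a feel of the numbers, here is the bound with some illustrative parameters plugged in (the parameter values are my own choices, not from the paper):

```python
# illustrative parameters: a length-n code with distance d over a ~128-bit field
q, n, d, l, kappa = 2**128, 2048, 1024, 10, 148
e = (d - 1) // 3
bound = 2 * e * l / q + (1 - e / n) ** kappa
print(f"e = {e}, success probability <= {bound:.2e}")   # about 2e-12 here
```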

Step 3: \(t_0' = [\otimes_{i=l}^{2l-1} (1 - r_{0, i}, r_{0, i})] \cdot [t_0, \cdots, t_{m-1}]^T\)

As above, write \(e = \lfloor (d - 1) / 3 \rfloor\), and write \(u' = [\otimes_{i=l}^{2l-1} (1 - r_i, r_i)] \cdot [u_0, \cdots, u_{m-1}]^T\).

Also, we write \(v' = [\otimes_{i=l}^{2l-1} (1 - r_i, r_i)] \cdot [\text{Enc}(t_0), \cdots, \text{Enc}(t_{m-1})]^T\). We know \(d(u', v') \le e\).

If we have \(\text{Enc}(t') \neq v'\), then since \(v'\) and \(\text{Enc}(t')\) are distinct codewords, \(d(v', \text{Enc}(t')) \ge d\), and the triangle inequality gives

\[d(u', \text{Enc}(t')) \ge d(v', \text{Enc}(t')) - d(u', v') \ge d - e > 2d/3\]

so the testing passes with probability at most

\[\left( 1 - \frac{2d}{3n} \right)^\kappa\]

which is once again negligible.

Step 4: \(t_i' = [\otimes_{j=l}^{2l-1} (1 - r_{i, j}, r_{i, j})] \cdot [t_0, \cdots, t_{m-1}]^T\) for each \(i\)

This is fundamentally different from Step 3 - note that there, the first execution of \(\mathcal{A}\) was literally the "first" execution (if it fails, the game is over, so we can safely disregard cases that force \(\mathcal{A}\)'s success probability to be negligible), but now we run \(\mathcal{A}\) until it succeeds - so we condition on acceptance.

However, intuition via Bayes' theorem leads us to think that the argument of Step 3 should still work, especially when the success probability of \(\mathcal{A}\) is sufficiently large. If \(\epsilon\) is negligible, we are done, and if \(\epsilon\) is sufficiently large, we are done too. So we set up a "boundary" for \(\epsilon\) and use two proofs depending on whether \(\epsilon\) is above or below that boundary.

So we write

  • \(V\): the event where \(\mathcal{A}\) passes
  • \(E\): the event in which \(\mathcal{A}\) outputs \(t' = [\otimes_{i=l}^{2l-1} (1 - r_i, r_i)] \cdot [t_0, \cdots, t_{m-1}]^T\)

Basically, what we have proved so far is that

\[\delta := \max(l/q, \text{Pr}[V \mid \neg E])\]

is a negligible value. If \(\epsilon \le \sqrt{\delta}\), then the success probability of a single run is already negligible, so we are done - failing to extract anything is fine here.

We note that \(\delta\) can be explicitly written with a formula via concrete analysis of Steps 1-3.

Now we assume \(\epsilon > \sqrt{\delta}\). A nice handling of conditional probabilities leads to

\[\sqrt{\delta} < \epsilon = \text{Pr}[V] = \text{Pr}[V \cap E] + \text{Pr}[V | \neg E] \cdot \text{Pr}[\neg E] \le \text{Pr}[V \cap E] + \delta\]

so \(\text{Pr}[V \cap E] > \sqrt{\delta} - \delta\), which via Bayes theorem gives us

\[\text{Pr}[E | V] = \frac{\text{Pr}[V \cap E]}{\text{Pr}[V \cap E] + \text{Pr}[V | \neg E] \cdot \text{Pr}[\neg E]} > \frac{ \sqrt{\delta} - \delta }{\sqrt{\delta} - \delta + \delta } = 1 - \sqrt{\delta}\]

which is overwhelmingly high. The same argument works for each of the \(m-1\) rewound iterations as well.
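The last equality is just \((\sqrt{\delta} - \delta)/\sqrt{\delta} = 1 - \sqrt{\delta}\); a quick numeric sanity check of the bound:

```python
for delta in (1e-6, 1e-12, 1e-20):
    s = delta ** 0.5
    bound = (s - delta) / ((s - delta) + delta)   # the Bayes lower bound above
    print(f"delta={delta:.0e}: bound={bound:.12f}, 1-sqrt(delta)={1 - s:.12f}")
```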

Step 5: \(\left( [ \otimes_{j=l}^{2l-1} (1 - r_{i, j}, r_{i, j}) ] \right)_{i=0}^{m-1}\) is invertible

This is similar to Step 4 - we prove that \(\epsilon\) is so much larger than the probability of a linear dependence occurring that Bayes' theorem is enough to finish. The probability of linear dependence is, of course, bounded by none other than the Schwartz-Zippel lemma.

We claim that for each proper linear subspace \(A\), the probability of \([\otimes_{i=l}^{2l-1} (1 - r_{i}, r_{i})] \in A\) is negligible. Iterating this over the subspaces spanned by the first \(i\) tensor product vectors output, we obtain that the final matrix is invertible with overwhelming probability. To simplify, it suffices to prove this for \(A\) a hyperplane - write \(A = \{u : \langle u, a \rangle = 0 \}\).

One notices that the set \(S\) of \((r_l, \cdots ,r_{2l-1})\) such that \(\langle [\otimes_{i=l}^{2l-1} (1-r_i, r_i)], a \rangle = 0\) has density at most \(l/q \le \delta\) - this is simply the Schwartz-Zippel lemma. Abusing notation, we also write \(S\) for the event that \((r_l, \cdots, r_{2l-1}) \in S\). Now a similar Bayes' theorem technique applies -

\[\sqrt{\delta} < \epsilon = \text{Pr}[V] \le \text{Pr}[S] + \text{Pr}[V \cap \neg S] \le \delta + \text{Pr}[V \cap \neg S]\]

so \(\text{Pr}[V \cap \neg S] > \sqrt{\delta} - \delta\). Once again, by Bayes theorem, we have

\[\text{Pr}[\neg S | V] = \frac{\text{Pr}[V \cap \neg S]}{\text{Pr}[V \cap \neg S] + \text{Pr}[V \cap S]} > \frac{\sqrt{\delta} - \delta}{\sqrt{\delta} - \delta + \delta} = 1 - \sqrt{\delta}\]

which is once again overwhelmingly high. This works for all \(m-1\) iterations as well.
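To see the Schwartz-Zippel density claim concretely, here is a small experiment (my own, over a deliberately tiny field so the density is visible): fix a nonzero \(a\) and measure how often a random tensor row lies on the hyperplane \(\langle u, a \rangle = 0\); the observed fraction stays below the \(l/q\) bound.

```python
import random

q, l = 101, 3    # tiny illustrative field; real schemes use a huge q
random.seed(2)

def tensor_row(r):
    """Row vector [tensor_j (1 - r_j, r_j)] of length 2**l over GF(q)."""
    row = [1]
    for rj in r:
        row = [x * y % q for x in row for y in (1 - rj, rj)]
    return row

a = [random.randrange(q) for _ in range(2**l)]   # the hyperplane's normal
assert any(a)

trials = 200_000
hits = 0
for _ in range(trials):
    row = tensor_row([random.randrange(q) for _ in range(l)])
    if sum(x * y for x, y in zip(row, a)) % q == 0:
        hits += 1
print(f"observed density {hits / trials:.4f} <= l/q = {l / q:.4f}")
```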

The technique of setting \(\sqrt{\delta}\) as the "boundary" is very beautiful!!

Final Notes on Efficiency

We note that the increased error probability for the proximity testing doesn't actually increase \(\kappa\) at all - the distance is still guaranteed to be at least \(d/3\) when the test fails, so it's all good.

In Brakedown / Ligero, the proof size is mentioned to be \(2c + \kappa r\).

Here, however, it can be seen that the size is \(c + \kappa r\), because the testing and evaluation are done at once. Therefore, parameter selection (with the usual AM-GM argument) can be arranged so that the proof size decreases by a factor of \(\sqrt{2}\), which is a very nice result!

We note that the optimal \(r, c\) are as follows (a small numeric check follows the list):

  • Brakedown / Ligero: \(r = \sqrt{2/\kappa} \cdot m\), \(c = \sqrt{\kappa/2} \cdot m\)
  • This work: \(r = \sqrt{1 / \kappa} \cdot m\), \(c = \sqrt{\kappa} \cdot m\)
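A quick numeric check of this AM-GM optimization (my own sketch; I fix \(rc = m^2\) total entries and an illustrative \(\kappa\), then compare the two objectives):

```python
from math import sqrt

m, kappa = 2**12, 148   # illustrative sizes

# Brakedown / Ligero: minimize 2c + kappa*r subject to r*c = m^2
r1, c1 = sqrt(2 / kappa) * m, sqrt(kappa / 2) * m
size1 = 2 * c1 + kappa * r1     # AM-GM optimum: 2*sqrt(2*kappa)*m

# This work: minimize c + kappa*r subject to r*c = m^2
r2, c2 = sqrt(1 / kappa) * m, sqrt(kappa) * m
size2 = c2 + kappa * r2         # AM-GM optimum: 2*sqrt(kappa)*m

print(f"Brakedown/Ligero: {size1:,.0f}, this work: {size2:,.0f}")
print(f"ratio = {size1 / size2:.4f} (sqrt(2) = {sqrt(2):.4f})")
```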

It is also intuitive that the less repetitive testing/evaluation format leads to optimizations of about a factor of \(2\). Apparently this is supported by the benchmarks as well.

For a more detailed reason why we can expect a 2x speedup, see below.

The prover time is

  • Commitment: \(r \cdot \text{Enc}(c, \lambda)\) for both Ligero and this work
  • Prove: \(2m^2\) for Ligero and \(m^2\) for this work

so on pure proving time we can expect a 2x speedup.

The verifier time is

  • \(2(\text{Enc}(c, \lambda) + \kappa \cdot r) + c\) for Ligero
  • \(\text{Enc}(c, \lambda) + \kappa \cdot r + c\) for this work.

Depending on \(\text{Enc}\)'s concrete constants (which I think are 10 to 20), this is roughly a 2x speedup.

[Image: Figure 2 from the Brakedown paper, for example.]
