
Recursion Protocol


Larger image at https://hackmd.io/_uploads/B1Z__PLxh.png

Recursive proof composition steps

The goal is to prove the correctness of the BN254 transcript commitments \([Y_1]_{IM}, [Y_2]_{IM}, [Y_3]_{IM}, [Y_4]_{IM}\). The transcript contains instructions to an "Elliptic Curve Virtual Machine" to perform the BN254 group operations required to verify intermediate proofs in the recursive proof stack.

  1. Recursive aggregator circuit looks up non-native group operations from an Instruction Machine transcript
  2. Recursive aggregator circuit efficiently aggregates Instruction Machine transcript commitments into transcript accumulators \([Y_1]_{IM}, [Y_2]_{IM}, [Y_3]_{IM}, [Y_4]_{IM}\)
  3. Produce transcript commitments for the Elliptic Curve VM (over the Grumpkin curve)
  4. Compute a proof for the Elliptic Curve VM (over the Grumpkin curve). This proof contains the evaluations of the Grumpkin transcript polynomials at a challenge value \(\zeta\)
  5. Use a curve transposition circuit to do the following:
    1. derive the Elliptic Curve VM transcript coefficients from the BN254 transcript
    2. convert the Elliptic Curve VM transcript coefficients into non-native field VM instructions that evaluate the Elliptic Curve VM transcript polynomials at \(\zeta\) (and assert that the results are equal to the Grumpkin evaluations)
  6. Compute a proof for the non-native field VM (over the BN254 curve)

Note:
Under the assumption that a mature Honk protocol uses a non-native field VM for other purposes (e.g. secp256k1 signatures, EIP-4844 blobs), it is Prover-efficient to also use the non-native field VM to evaluate the Elliptic Curve VM transcript polynomials, rather than doing the non-native field arithmetic in the curve transposition circuit.

Protocol

TODO: standard SNARK definitions (proof relation, definition of \(crs\), input strings, witness, group definitions, field definitions, etc.).

  • \(\mathbb{F}_{BN254}\) = prime field with characteristic equal to the BN254 group order
  • \(\mathbb{F}_{Grumpkin}\) = prime field with characteristic equal to the Grumpkin group order
  • \(\mathbb{G}_{BN254}\) = BN254 \(\mathbb{G}_1\) group
  • \(\mathbb{G}_{Grumpkin}\) = Grumpkin \(\mathbb{G}_1\) group
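
The two field definitions above can be made concrete with a short sketch. The moduli below are the publicly documented BN254 (alt_bn128) constants; it relies on the fact that the Grumpkin group order equals the BN254 base field modulus. The helper `fits_natively` is a hypothetical name introduced only for this illustration.

```python
# BN254 group order, i.e. the characteristic of F_BN254.
F_BN254_MODULUS = 21888242871839275222246405745257275088548364400416034343698204186575808495617

# Grumpkin group order, i.e. the characteristic of F_Grumpkin.
# This is the same integer as the BN254 base field modulus.
F_GRUMPKIN_MODULUS = 21888242871839275222246405745257275088696311157297823662689037894645226208583


def fits_natively(value: int, modulus: int) -> bool:
    """Return True if `value` is a canonical element of the field with this modulus."""
    return 0 <= value < modulus


# The largest F_Grumpkin element has no canonical representative in F_BN254,
# so a circuit over F_BN254 cannot hold every F_Grumpkin value in one native element.
largest_grumpkin_element = F_GRUMPKIN_MODULUS - 1
print(fits_natively(largest_grumpkin_element, F_BN254_MODULUS))
```

Because \(\mathbb{F}_{Grumpkin}\) is slightly larger than \(\mathbb{F}_{BN254}\), arithmetic on \(\mathbb{F}_{Grumpkin}\) values inside a BN254 circuit has to be emulated, which is the role of the non-native field VM.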

Elliptic Curve VM Circuit

The ECC VM circuit takes an input string \(u_{ECC}\) that contains \(\{ [\vec{Y}_{ecc}], c_{ecc} \}\), where \([\vec{Y}_{ecc}] \in \mathbb{G}_{grumpkin}^5, c_{ecc} \in \mathbb{F}_{grumpkin}\). (i.e. the circuit uses 5 columns to describe the VM instructions).

The ECC VM circuit takes a witness \(w_{ECC}\) that contains the set of vectors \(\{ \vec{y}_{ecc, 0}, \ldots, \vec{y}_{ecc, 4}\} \in \mathbb{F}_{grumpkin}^{5c_{ecc}}\)

The relation for the ECC VM circuit validates the following:

  1. \([\vec{Y}_{ecc}] = \text{ Commit}(\vec{y}_{ecc, 0}, \ldots, \vec{y}_{ecc, 4})\)
  2. \(\{ \vec{y}_{ecc, 0}, \ldots, \vec{y}_{ecc, 4} \}\) describes a set of satisfied elliptic curve operations over the BN254 curve

The available elliptic curve operations are defined by the instruction set of the ECC VM.

We require the proof of the ECC VM circuit, \(\pi_{ECC}\), to contain the challenge parameter \(\zeta \in \mathbb{F}_{grumpkin}\).

The input string also contains the commitment to the Instruction Machine transcript, \([\vec{Y}_{IM}]\), to ensure \(\zeta\) is generated after the Prover commits to \([\vec{Y}_{IM}]\).
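
The ordering requirement above (the challenge is only fixed once the commitments are) can be sketched as a Fiat-Shamir style derivation. This is a minimal illustration under stated assumptions: SHA-256, the length-prefixed serialisation, the toy modulus, and the function name `derive_challenge` are all placeholders chosen for this sketch, not the transcript hash or encoding the protocol actually uses. Commitments are modelled as opaque byte strings.

```python
import hashlib

# Toy stand-in for the F_grumpkin modulus (assumption, chosen for readability).
TOY_MODULUS = 2**61 - 1


def derive_challenge(commitments: list[bytes], modulus: int = TOY_MODULUS) -> int:
    """Hash every commitment into a single field element.

    Each commitment is length-prefixed so that distinct lists of byte strings
    cannot serialise to the same hash input.
    """
    hasher = hashlib.sha256()
    for commitment in commitments:
        hasher.update(len(commitment).to_bytes(4, "big"))
        hasher.update(commitment)
    return int.from_bytes(hasher.digest(), "big") % modulus


# Placeholder byte strings standing in for serialised group elements.
ecc_commitments = [b"Y_ecc_0", b"Y_ecc_1", b"Y_ecc_2", b"Y_ecc_3", b"Y_ecc_4"]
im_commitments = [b"Y_im_0", b"Y_im_1", b"Y_im_2", b"Y_im_3"]

# zeta depends on both commitment sets, so neither can be changed after zeta is known.
zeta = derive_challenge(ecc_commitments + im_commitments)
```

Changing any Instruction Machine commitment after the fact yields a different \(\zeta\), which is why including \([\vec{Y}_{IM}]\) in the input string binds the challenge to it.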

The vector \(\vec{Z}\) defines the set of powers of \(\zeta\): \(\{ 1, \zeta, \ldots, \zeta^{c_{ECC} - 1} \} \in \mathbb{F}_{grumpkin}^{c_{ECC}}\)

We require \(\pi_{ECC}\) to contain parameters \(\{ z_0, \ldots, z_4 \} \in \mathbb{F}_{grumpkin}^5\), which represent the inner products of the ECC transcript vectors and \(\vec{Z}\). i.e.

\[z_j = \vec{y}_{ECC, j} \circ \vec{Z} \text{ for } j \in [0, \ldots, 4]\]

(Assuming a univariate polynomial commitment scheme in the coefficient basis, we get these inner products as part of the main SNARK protocol).
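
To see why the inner products come for free in the coefficient basis, the sketch below checks that the inner product of a coefficient vector with the powers of \(\zeta\) equals the evaluation of the corresponding polynomial at \(\zeta\). The modulus is a toy stand-in for the \(\mathbb{F}_{grumpkin}\) modulus, and the column values and function names are made up for illustration.

```python
# Toy stand-in for the F_grumpkin modulus (assumption, chosen for readability).
TOY_MODULUS = 2**61 - 1


def powers_of(zeta: int, count: int, modulus: int) -> list[int]:
    """Return [1, zeta, zeta^2, ..., zeta^(count - 1)] reduced modulo `modulus`."""
    powers, current = [], 1
    for _ in range(count):
        powers.append(current)
        current = current * zeta % modulus
    return powers


def inner_product(left: list[int], right: list[int], modulus: int) -> int:
    """Inner product of two equal-length vectors over the prime field."""
    return sum(a * b for a, b in zip(left, right)) % modulus


def evaluate_horner(coefficients: list[int], point: int, modulus: int) -> int:
    """Evaluate the polynomial with these coefficients (lowest degree first) at `point`."""
    accumulator = 0
    for coefficient in reversed(coefficients):
        accumulator = (accumulator * point + coefficient) % modulus
    return accumulator


# One hypothetical transcript column y_{ECC, j}, read as polynomial coefficients.
column = [3, 1, 4, 1, 5]
zeta = 7

z_j = inner_product(column, powers_of(zeta, len(column), TOY_MODULUS), TOY_MODULUS)
print(z_j == evaluate_horner(column, zeta, TOY_MODULUS))  # prints True
```

With a univariate scheme in the coefficient basis, the opening of each committed column at \(\zeta\) is exactly this value, so no additional inner-product argument is needed.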

Non-Native Field VM Circuit

The NNF VM circuit takes an input string \(u_{NNF}\) that contains \([\vec{Y}_{NNF}], c_{NNF}\), where \([\vec{Y}_{NNF}] \in \mathbb{G}_{BN254}^9\). (i.e. the VM circuit uses 9 columns to describe the VM instructions)

The NNF VM circuit takes a witness \(w_{NNF}\) that contains the set of vectors \(\{ \vec{y}_{NNF, 0}, \ldots, \vec{y}_{NNF, 8} \} \in \mathbb{F}_{BN254}^{9c_{NNF}}\)

The relation for the NNF VM circuit validates the following:

  1. \([\vec{Y}_{NNF}] = \text{ Commit}(\vec{y}_{NNF})\)
  2. \(\vec{y}_{NNF}\) describes a set of satisfied non-native field operations

The available non-native field operations are defined by the instruction set of the non-native field VM.

Instruction Machine Circuit

The Instruction Machine takes an input string \(u_{IM}\) that contains the following:

  • \([\vec{Y}_{IM}] \in \mathbb{G}_{BN254}^4\)
  • \([\vec{Y}_{NNF}] \in \mathbb{G}_{BN254}^9\)
  • \(\zeta \in \mathbb{F}_{grumpkin}\)
  • \(\{ z_0, \ldots, z_4 \} \in \mathbb{F}_{grumpkin}^5\)

The witness string \(w_{IM}\) contains vectors that describe a set of input instructions:

  • \(\{ \vec{y}_{IM, 0}, \ldots, \vec{y}_{IM, 3} \} \in \mathbb{F}_{BN254}^{4c_{IM}}\)

Where \([\vec{Y}_{IM}] = \text{Commit}(\vec{y}_{IM, 0}, \ldots, \vec{y}_{IM, 3})\)

The Instruction Machine circuit defines two mappings: \(\sigma_{NNF}, \sigma_{ECC}\).

The \(\sigma_{ECC}\) mapping describes how to transform the elliptic curve instructions in the input transcript \(\{ \vec{y}_{IM,0}, \ldots, \vec{y}_{IM, 3} \} \in \mathbb{F}_{BN254}^{4c_{IM}}\) into a transcript for the ECC VM, \(\{ \vec{y}_{ECC, 0}, \ldots, \vec{y}_{ECC, 4} \} \in \mathbb{F}_{grumpkin}^{5c_{ECC}}\). i.e.

\[\sigma_{ECC}(\vec{y}_{IM, 0}, \ldots, \vec{y}_{IM, 3}) \rightarrow \{ \vec{y}_{ECC, 0}, \ldots, \vec{y}_{ECC, 4} \} \in \mathbb{F}_{grumpkin}^{5 c_{ecc}}\]

N.B. these transcripts are structured differently for efficiency purposes. The IM transcript efficiently represents instructions in a width-4 UltraPlonk-type arithmetisation. The ECC transcript represents instructions most optimally for the ECC VM. The information in the two transcripts is the same, but that information's representation across columns/rows is not.

The \(\sigma_{NNF}\) mapping converts the input transcript \(\vec{y}_{IM}\) into a transcript for the NNF VM, \(\vec{y}_{NNF}\):

\[\sigma_{NNF}(\vec{y}_{IM, 0}, \ldots, \vec{y}_{IM, 3}) \rightarrow \{ \vec{y}_{NNF, 0}, \ldots, \vec{y}_{NNF, 8} \} \in \mathbb{F}_{BN254}^{9 c_{NNF}}\]

The vector \(\vec{Z}\) defines the set of powers of \(\zeta\), \(\{ 1, \zeta, \ldots, \zeta^{c_{ECC} - 1} \}\)

The instructions encoded into the NNF VM transcript validate that the inner products of \(\vec{Z}\) with each vector produced by \(\sigma_{ECC}\) equal \(\{ z_0, \ldots, z_4 \}\):

\[z_j = \vec{y}_{ECC, j} \circ \vec{Z} \text{ for } j \in [0, \ldots, 4]\]

To put it another way, the IM circuit produces a Non-Native Field VM transcript that validates the correctness of the inner products:

\[z_j = \sigma_{ECC}(\vec{y}_{IM, 0}, \ldots, \vec{y}_{IM, 3})_j \circ \vec{Z} \text{ for } j \in [0, \ldots, 4]\]

Validating Non-native elliptic curve operations using cycle curves

Consider the case where we have 3 proofs \(\pi_{ECC}, \pi_{IM}, \pi_{NNF}\) and all of the following are true:

  1. \(\text{Verify}(crs, u_{ECC}, \pi_{ECC}) = 1\).
  2. \(\text{Verify}(crs, u_{IM}, \pi_{IM}) = 1\).
  3. \(\text{Verify}(crs, u_{NNF}, \pi_{NNF}) = 1\).
  4. \([\vec{Y}_{IM}, c_{IM}] \in u_{ECC}\)
  5. \([\vec{Y}_{NNF}, \vec{Y}_{IM}, c_{NNF}, c_{IM}] \in u_{IM}\)
  6. \([\vec{Y}_{NNF}, c_{NNF}] \in u_{NNF}\)
  7. \(\{ \zeta, z_0, \ldots, z_4 \} \in \pi_{ECC}\)
  8. \(\{ \zeta, z_0, \ldots, z_4 \} \in u_{IM}\)

If the above conditions are satisfied, we can infer that the following relationship holds:

\[ [\vec{Y}_{IM}] = \text{Commit}_{BN254}(\vec{y}_{IM, 0}, \ldots, \vec{y}_{IM, 3}) \\ [\vec{Y}_{ecc}] = \text{Commit}_{Grumpkin}(\sigma_{ECC}(\vec{y}_{IM, 0}, \ldots, \vec{y}_{IM, 3})) \]

From this we can infer that the BN254 elliptic curve instructions (and associated constraints and assertions) committed to in \([\vec{Y}_{IM}]\) are satisfied.

Goblin-aggregatable SNARKs

For a SNARK to be aggregatable using the goblin scheme, its statement \(u\) must contain \([\vec{Y}_{old}]\) and \([\vec{Y}_{new}]\): binding commitments to the Instruction Machine transcript at the previous recursion step and at the current recursion step respectively, i.e. to the sets of instructions that can be evaluated by the Instruction Machine protocol. These instructions are present in the witness as \(\vec{y}_{old}, \vec{y}_{new}\).

The commitment \([\vec{Y}_{old}]\) represents an existing set of instructions to be executed that is independent of the current SNARK's circuit description.

The commitment \([\vec{Y}_{new}]\) represents the union of the set of instructions in \([\vec{Y}_{old}]\) and the set of instructions added to the IM transcript by the current SNARK circuit.

The goblin scheme requires the proof relation for the SNARK to prove that \([\vec{Y}_{new}]\) has the correct structure, so that one can infer inductively that \([\vec{Y}_{old}]\) also has the correct structure.

The goblin scheme also requires that the proof relation for each SNARK contains a public aggregation scheme (e.g. Halo2), where non-native group operations are delegated to the IM. The nature of the aggregation scheme is not intrinsically linked to Goblin. In the case of the BN254 curve, the "aggregation scheme" directly verifies the proof (delegating \(\mathbb{G}_{BN254}\) operations to the IM), excluding the pairing. The pairing check is deferred using a folklore aggregation scheme.

i.e. consider two proofs \(\pi_1, \pi_2\). The recursive aggregation step verifies the proofs up to the point where the following pairings must be satisfied:

\[ e([A_1], [1]) \cdot e([B_1], [H]) = 1 \text{ for } \pi_1 \\ e([A_2], [1]) \cdot e([B_2], [H]) = 1 \text{ for } \pi_2 \]

(where \([H]\) is equal to \(x \cdot [1]\) for some unknown value \(x\) produced by a trusted setup).

The aggregation step consists of generating a random challenge \(k\) and returning the aggregated points \(\{ [A_1] + k \cdot [A_2], [B_1] + k \cdot [B_2] \}\).
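
The aggregation step can be illustrated with a deliberately insecure toy model. It assumes each pairing check requires the pairing product to equal the identity of the target group. Group elements are represented by their discrete logarithms modulo a toy prime, and the "pairing" is multiplication of those logarithms, written additively. All names and values are hypothetical; the sketch only demonstrates the bilinearity argument behind the random linear combination and is not an implementation of BN254 pairings.

```python
# Toy prime group order (assumption); real BN254 points are replaced by scalars.
TOY_ORDER = 2**61 - 1


def pairing(left: int, right: int) -> int:
    """Toy bilinear map: multiply discrete logs (target group written additively)."""
    return left * right % TOY_ORDER


def pairing_check(point_a: int, point_b: int, h: int) -> bool:
    """Toy version of e([A], [1]) * e([B], [H]) == 1, with [1] represented by 1."""
    return (pairing(point_a, 1) + pairing(point_b, h)) % TOY_ORDER == 0


def aggregate(pair_1: tuple[int, int], pair_2: tuple[int, int], k: int) -> tuple[int, int]:
    """Return ([A_1] + k * [A_2], [B_1] + k * [B_2]) in the toy representation."""
    (a_1, b_1), (a_2, b_2) = pair_1, pair_2
    return ((a_1 + k * a_2) % TOY_ORDER, (b_1 + k * b_2) % TOY_ORDER)


def satisfying_pair(b: int, x: int) -> tuple[int, int]:
    """Build (A, B) with A = -x * B, so the toy pairing check passes."""
    return ((-x * b) % TOY_ORDER, b)


x = 123456789   # stands in for the trusted-setup secret, so [H] = x * [1]
k = 987654321   # stands in for the random challenge

pair_1 = satisfying_pair(11, x)
pair_2 = satisfying_pair(22, x)
aggregated = aggregate(pair_1, pair_2, k)

# A pair that fails its own check also breaks the aggregated check for this k.
bad_pair_2 = ((pair_2[0] + 1) % TOY_ORDER, pair_2[1])
bad_aggregated = aggregate(pair_1, bad_pair_2, k)
```

If both individual checks hold, bilinearity makes the aggregated check hold for every \(k\). If one of them fails, the aggregated check only passes for a specific value of \(k\), which a randomly chosen challenge avoids with overwhelming probability.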
