---
title: Recursion Protocol
tags: recursion-book
description:
author: Zac
---
# Recursion Protocol

Larger image at https://hackmd.io/_uploads/B1Z__PLxh.png
Recursive proof composition steps
---
The goal is to prove the correctness of the BN254 transcript commitments $[Y_1]_{IM}, [Y_2]_{IM}, [Y_3]_{IM}, [Y_4]_{IM}$. The transcript contains instructions to an "Elliptic Curve Virtual Machine" to perform the BN254 group operations required to verify intermediate proofs in the recursive proof stack.
1. Recursive aggregator circuit looks up non-native group operations from an Instruction Machine transcript
1. Recursive aggregator circuit efficiently aggregates Instruction Machine transcript commitments into transcript accumulators $[Y_1]_{IM}, [Y_2]_{IM}, [Y_3]_{IM}, [Y_4]_{IM}$
1. Produce transcript commitments for the Elliptic Curve VM (over the Grumpkin curve)
1. Compute a proof for the Elliptic Curve VM (over the Grumpkin curve). This proof contains the evaluations of the Grumpkin transcript polynomials at a challenge value $\zeta$
1. Use a **curve transposition circuit** to do the following:
    1. derive the Elliptic Curve VM transcript coefficients from the BN254 transcript
    2. convert the Elliptic Curve VM transcript coefficients into *non-native field VM* instructions that evaluate the Elliptic Curve VM transcript polynomials at $\zeta$ (and assert that they are equal to the Grumpkin evaluations)
1. Compute a proof for the non-native field VM (over the BN254 curve)
Note:
Under the assumption that a mature Honk protocol uses a non-native field VM for other purposes (e.g. secp256k1 signatures, EIP-4844 blobs), it is Prover-efficient to also use the non-native field VM to evaluate the Elliptic Curve VM transcript polynomials, rather than doing the non-native field arithmetic in the curve transposition circuit.
## Protocol
TODO: standard SNARK definitions (proof relation, definition of $crs$, input strings, witness, group definitions, field definitions etc blah blah blah).
* $\mathbb{F}_{BN254}$ = prime field w. characteristic equal to BN254 group order
* $\mathbb{F}_{Grumpkin}$ = prime field w. characteristic equal to Grumpkin group order
* $\mathbb{G}_{BN254}$ = BN254 $\mathbb{G}_1$ group
* $\mathbb{G}_{Grumpkin}$ = Grumpkin $\mathbb{G}_1$ group
### Elliptic Curve VM Circuit
The ECC VM circuit takes an input string $u_{ECC}$ that contains $\{ [\vec{Y}_{ecc}], c_{ecc} \}$, where $[\vec{Y}_{ecc}] \in \mathbb{G}_{grumpkin}^5, c_{ecc} \in \mathbb{F}_{grumpkin}$. (i.e. the circuit uses 5 columns to describe the VM instructions).
The ECC VM circuit takes a witness $w_{ECC}$ that contains the set of vectors $\{ \vec{y}_{ecc, 0}, \ldots, \vec{y}_{ecc, 4}\} \in \mathbb{F}_{grumpkin}^{5c_{ecc}}$
The relation for the ECC VM circuit validates the following:
1. $[\vec{Y}_{ecc}] = \text{ Commit}(\vec{y}_{ecc, 0}, \ldots, \vec{y}_{ecc, 4})$
2. $\{ \vec{y}_{ecc, 0}, \ldots, \vec{y}_{ecc, 4} \}$ describes a set of satisfied elliptic curve operations over the BN254 curve
The available elliptic curve operations are defined by the instruction set of the [ECC VM](/BkGNaHUJn/%2FZs730vdURaOw0n4PsQCsNg).
We require the proof of the ECC VM circuit, $\pi_{ECC}$, to contain a challenge parameter $\zeta \in \mathbb{F}_{grumpkin}$.
The input string also contains the commitment to the Instruction Machine transcript, $[\vec{Y}_{IM}]$, to ensure $\zeta$ is generated *after* the Prover commits to $[\vec{Y}_{IM}]$.
The vector $\vec{Z}$ defines the set of powers of $\zeta$: $\{ 1, \zeta, \ldots, \zeta^{c_{ECC} - 1} \} \in \mathbb{F}_{grumpkin}^{c_{ECC}}$
We require $\pi_{ECC}$ to contain parameters $\{ z_0, \ldots, z_4 \} \in \mathbb{F}_{grumpkin}^5$, which represent the inner products of the ECC transcript vectors and $\vec{Z}$. i.e.
$$z_j = \vec{y}_{ECC, j} \circ \vec{Z} \text{ for } j \in [0, \ldots, 4]$$
(Assuming a univariate polynomial commitment scheme in the coefficient basis, we get these inner products as part of the main SNARK protocol).
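The remark above rests on the fact that the inner product of a coefficient vector with the powers of $\zeta$ is the evaluation of the corresponding polynomial at $\zeta$. A minimal Python sketch of that identity follows. The modulus `P` is a small toy prime standing in for the Grumpkin group order, and the column values and helper names are hypothetical.

```python
# Toy prime standing in for the Grumpkin group order (assumption: the real
# modulus is far larger, but the identity holds for any prime).
P = 2**61 - 1

def powers_of(zeta, count):
    """Return [1, zeta, ..., zeta^(count - 1)] mod P, i.e. the vector Z."""
    out, acc = [], 1
    for _ in range(count):
        out.append(acc)
        acc = acc * zeta % P
    return out

def inner_product(column, zs):
    """Inner product of one transcript column with Z, reduced mod P."""
    return sum(y * z for y, z in zip(column, zs)) % P

def evaluate(coeffs, zeta):
    """Horner evaluation of sum_i coeffs[i] * zeta^i mod P."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * zeta + c) % P
    return acc

# Five hypothetical transcript columns with c_ECC = 4 rows each.
columns = [[(7 * j + 3 * i + 1) % P for i in range(4)] for j in range(5)]
zeta = 123456789
Z = powers_of(zeta, 4)
z = [inner_product(col, Z) for col in columns]

# Each z_j equals the evaluation at zeta of the polynomial whose
# coefficients are the entries of column j.
assert z == [evaluate(col, zeta) for col in columns]
```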
### Non-Native Field VM Circuit
The NNF VM circuit takes an input string $u_{NNF}$ that contains $[\vec{Y}_{NNF}], c_{NNF}$, where $[\vec{Y}_{NNF}] \in \mathbb{G}_{BN254}^9$ (i.e. the VM circuit uses 9 columns to describe the VM instructions).
The NNF VM circuit takes a witness $w_{NNF}$ that contains the set of vectors $\{ \vec{y}_{NNF, 0}, \ldots, \vec{y}_{NNF, 8} \} \in \mathbb{F}_{BN254}^{9c_{NNF}}$
The relation for the NNF VM circuit validates the following:
1. $[\vec{Y}_{NNF}] = \text{ Commit}(\vec{y}_{NNF})$
2. $\vec{y}_{NNF}$ describes a set of satisfied non-native field operations
The available non-native field operations are defined by the instruction set of the [Non-native field VM](/%2FPJWE1lpSQqaWKQbYajwf1g).
### Instruction Machine Circuit
The Instruction Machine takes an input string $u_{IM}$ that contains the following:
* $[\vec{Y}_{IM}] \in \mathbb{G}_{BN254}^4$
* $[\vec{Y}_{NNF}] \in \mathbb{G}_{BN254}^9$
* $\zeta \in \mathbb{F}_{grumpkin}$
* $\{ z_0, \ldots, z_4 \} \in \mathbb{F}_{grumpkin}^5$
The witness string $w_{IM}$ contains vectors that describe a set of input instructions:
* $\{ \vec{y}_{IM, 0}, \ldots, \vec{y}_{IM, 3} \} \in \mathbb{F}_{BN254}^{4c_{IM}}$
Where $[\vec{Y}_{IM}] = \text{Commit}(\vec{y}_{IM, 0}, \ldots, \vec{y}_{IM, 3})$
The Instruction Machine circuit defines two mappings: $\sigma_{NNF}, \sigma_{ECC}$.
The $\sigma_{ECC}$ mapping describes how to transform the elliptic curve instructions in the input transcript $\{ \vec{y}_{IM,0}, \ldots, \vec{y}_{IM, 3} \} \in \mathbb{F}_{BN254}^{4c_{IM}}$ into a transcript for the ECC VM, $\{ \vec{y}_{ECC, 0}, \ldots, \vec{y}_{ECC, 4} \} \in \mathbb{F}_{grumpkin}^{5c_{ECC}}$. i.e.
$$\sigma_{ECC}(\vec{y}_{IM, 0}, \ldots, \vec{y}_{IM, 3}) \rightarrow \{ \vec{y}_{ECC, 0}, \ldots, \vec{y}_{ECC, 4} \} \in \mathbb{F}_{grumpkin}^{5 c_{ecc}}$$
N.B. these transcripts are structured differently for efficiency purposes. The IM transcript efficiently represents instructions in a width-4 UltraPlonk-type arithmetisation. The ECC transcript represents instructions most optimally for the ECC VM. The information in the two transcripts is the same, but that information's representation across columns/rows is not.
The $\sigma_{NNF}$ mapping converts the input transcript $\vec{y}_{IM}$ into a transcript for the NNF VM, $\vec{y}_{NNF}$:
$$\sigma_{NNF}(\vec{y}_{IM, 0}, \ldots, \vec{y}_{IM, 3}) \rightarrow \{ \vec{y}_{NNF, 0}, \ldots, \vec{y}_{NNF, 8} \} \in \mathbb{F}_{BN254}^{9 c_{NNF}}$$
The vector $\vec{Z}$ defines the set of powers of $\zeta$, $\{ 1, \zeta, \ldots, \zeta^{c_{ECC} - 1} \}$
The instructions encoded into the NNF VM transcript validate that the inner products of $\vec{Z}$ with each vector produced by $\sigma_{ECC}$ equal $\{ z_0, \ldots, z_4 \}$:
$$z_j = \vec{y}_{ECC, j} \circ \vec{Z} \text{ for } j \in [0, \ldots, 4]$$
To put it another way, the IM circuit produces a Non-Native Field VM transcript that validates the correctness of the inner products:
$$z_j = \sigma_{ECC}(\vec{y}_{IM, 0}, \ldots, \vec{y}_{IM, 3})_j \circ \vec{Z} \text{ for } j \in [0, \ldots, 4]$$
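The check above can be sketched in a toy setting. The real $\sigma_{ECC}$ is fixed by the ECC VM instruction set and is not specified in this section. The re-layout `sigma_ecc_toy` below is a made-up stand-in that only preserves the property stated earlier (same information, different column/row layout). The modulus `P` is a toy prime, and every name is hypothetical.

```python
P = 2**61 - 1  # toy prime, stand-in for the Grumpkin group order

def sigma_ecc_toy(im_columns, width=5):
    """Hypothetical re-layout, NOT the real sigma_ECC: read the IM columns
    row by row, then refill `width` columns row by row, zero-padding the
    final row if needed."""
    rows = zip(*im_columns)
    flat = [v % P for row in rows for v in row]
    flat += [0] * (-len(flat) % width)
    return [flat[j::width] for j in range(width)]

def evaluate(column, zeta):
    """Inner product of `column` with (1, zeta, zeta^2, ...) via Horner."""
    acc = 0
    for c in reversed(column):
        acc = (acc * zeta + c) % P
    return acc

def inner_products_hold(im_columns, zeta, claimed_z):
    """Check z_j == sigma(y_IM)_j . Z for every ECC column j."""
    ecc_columns = sigma_ecc_toy(im_columns)
    return [evaluate(col, zeta) for col in ecc_columns] == list(claimed_z)

# Width-4 IM transcript with c_IM = 5 rows (made-up values 1..20).
im = [[1, 5, 9, 13, 17], [2, 6, 10, 14, 18],
      [3, 7, 11, 15, 19], [4, 8, 12, 16, 20]]
zeta = 3
honest_z = [evaluate(col, zeta) for col in sigma_ecc_toy(im)]

assert inner_products_hold(im, zeta, honest_z)
# A single wrong claimed value is rejected.
assert not inner_products_hold(im, zeta, [honest_z[0] + 1] + honest_z[1:])
```

In the protocol this comparison is not run natively. It is encoded as non-native field instructions in the NNF VM transcript, because $\zeta$ and the $z_j$ live in $\mathbb{F}_{grumpkin}$ while the IM circuit is defined over $\mathbb{F}_{BN254}$.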
## Validating Non-native elliptic curve operations using cycle curves
Consider the case where we have 3 proofs $\pi_{ECC}, \pi_{IM}, \pi_{NNF}$ and the following is true:
1. $\text{Verify}(crs, u_{ECC}, \pi_{ECC}) = 1$.
1. $\text{Verify}(crs, u_{IM}, \pi_{IM}) = 1$.
1. $\text{Verify}(crs, u_{NNF}, \pi_{NNF}) = 1$.
1. $[\vec{Y}_{IM}, c_{IM}] \in u_{ECC}$
1. $[\vec{Y}_{NNF}, \vec{Y}_{IM}, c_{NNF}, c_{IM}] \in u_{IM}$
1. $[\vec{Y}_{NNF}, c_{NNF}] \in u_{NNF}$
1. $\{ \zeta, z_0, \ldots, z_4 \} \in \pi_{ECC}$
1. $\{ \zeta, z_0, \ldots, z_4 \} \in u_{IM}$
If the above conditions are satisfied, we can infer that the following relationship holds:
$$
\begin{aligned}
{[\vec{Y}_{IM}]} &= \text{Commit}_{BN254}(\vec{y}_{IM, 0}, \ldots, \vec{y}_{IM, 3}) \\
{[\vec{Y}_{ecc}]} &= \text{Commit}_{Grumpkin}(\sigma_{ECC}(\vec{y}_{IM, 0}, \ldots, \vec{y}_{IM, 3}))
\end{aligned}
$$
From this we can infer that the BN254 elliptic curve instructions (and associated constraints and assertions) present in $[\vec{Y}_{IM}]$ are satisfied.
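Conditions 4 to 8 in the list above all say that the same commitments, sizes and challenge values appear in more than one statement. A minimal Python sketch of that linking check follows. The class and field names are hypothetical, commitments are modelled as opaque byte strings, and the three `Verify` calls (conditions 1 to 3) are deliberately left out.

```python
from dataclasses import dataclass
from typing import Tuple

Commitment = bytes  # opaque stand-in for a group element

@dataclass(frozen=True)
class EccStatement:            # u_ECC
    Y_ecc: Tuple[Commitment, ...]
    c_ecc: int
    Y_im: Tuple[Commitment, ...]
    c_im: int

@dataclass(frozen=True)
class EccProofPublic:          # the part of pi_ECC used for linking
    zeta: int
    z: Tuple[int, ...]

@dataclass(frozen=True)
class ImStatement:             # u_IM
    Y_im: Tuple[Commitment, ...]
    c_im: int
    Y_nnf: Tuple[Commitment, ...]
    c_nnf: int
    zeta: int
    z: Tuple[int, ...]

@dataclass(frozen=True)
class NnfStatement:            # u_NNF
    Y_nnf: Tuple[Commitment, ...]
    c_nnf: int

def statements_are_linked(u_ecc, pi_ecc, u_im, u_nnf):
    """Check that shared values agree across statements (conditions 4 to 8).
    Proof verification (conditions 1 to 3) is out of scope for this sketch."""
    same_im = (u_ecc.Y_im, u_ecc.c_im) == (u_im.Y_im, u_im.c_im)
    same_nnf = (u_im.Y_nnf, u_im.c_nnf) == (u_nnf.Y_nnf, u_nnf.c_nnf)
    same_challenge = (pi_ecc.zeta, pi_ecc.z) == (u_im.zeta, u_im.z)
    return same_im and same_nnf and same_challenge

# Made-up placeholder values.
Y_im = (b"im0", b"im1", b"im2", b"im3")
Y_nnf = tuple(b"nnf%d" % i for i in range(9))
Y_ecc = tuple(b"ecc%d" % i for i in range(5))
z = (11, 22, 33, 44, 55)

u_ecc = EccStatement(Y_ecc, 8, Y_im, 10)
pi_ecc = EccProofPublic(zeta=7, z=z)
u_im = ImStatement(Y_im, 10, Y_nnf, 16, zeta=7, z=z)
u_nnf = NnfStatement(Y_nnf, 16)

assert statements_are_linked(u_ecc, pi_ecc, u_im, u_nnf)
```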
# Goblin-aggregatable SNARKs
For a SNARK to be aggregatable using the goblin scheme, its statement $u$ must contain $[\vec{Y}_{old}]$ and $[\vec{Y}_{new}]$: binding commitments to the Instruction Machine transcript at the previous recursion step and at the current recursion step, i.e. the set of instructions that can be evaluated by the Instruction Machine protocol. These instructions are present in the witness as $\vec{y}_{old}, \vec{y}_{new}$.
The commitment $[\vec{Y}_{old}]$ represents an existing set of instructions to be executed that is independent of the current SNARK's circuit description.
The commitment $[\vec{Y}_{new}]$ represents the union of the set of instructions in $[\vec{Y}_{old}]$ and the set of instructions added to the IM transcript by the current SNARK circuit.
The goblin scheme requires that the proof relation for the SNARK prove that $[\vec{Y}_{new}]$ has the correct structure and that one can infer inductively that $[\vec{Y}_{old}]$ also has the correct structure.
The goblin scheme also requires that the proof relation for each SNARK contain a *public aggregation scheme* (e.g. Halo2), where non-native group operations are delegated to the IM. The nature of the scheme is not intrinsically linked to Goblin. In the case of the BN254 curve, the "aggregation scheme" directly verifies the proof (delegating $\mathbb{G}_{BN254}$ operations to the IM), excluding the pairing. The pairing check is deferred using a folklore aggregation scheme.
For example, consider two proofs $\pi_1, \pi_2$. The recursive aggregation step verifies the proofs up to the point where the following pairing checks must be satisfied:
$$
\begin{aligned}
e([A_1], [1]) \cdot e([B_1], [H]) &= 1 \text{ for } \pi_1 \\
e([A_2], [1]) \cdot e([B_2], [H]) &= 1 \text{ for } \pi_2
\end{aligned}
$$
(where $[H]$ is equal to $x \cdot [1]$ for some unknown value $x$ produced by a trusted setup).
The aggregation step consists of generating a random challenge $k$ and returning the aggregated points $\{ [A_1] + k \cdot [A_2], [B_1] + k \cdot [B_2] \}$.
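Reading each pairing check as a pairing product that must equal the identity (an assumption about the intended form), bilinearity gives $e([A_1] + k[A_2], [1]) \cdot e([B_1] + k[B_2], [H]) = \big(e([A_1], [1]) \cdot e([B_1], [H])\big) \cdot \big(e([A_2], [1]) \cdot e([B_2], [H])\big)^k$, so a single deferred pairing check on the aggregated points covers both proofs. The Python sketch below models this "in the exponent": a point $[A] = a \cdot [1]$ is represented by its scalar $a$, and the pairing check becomes $a + x \cdot b = 0$. The modulus `R` is a toy prime, $x$ is known only because this is a toy, and all names are hypothetical.

```python
R = 2**61 - 1  # toy prime standing in for the BN254 group order

def pairing_check(a, b, x):
    """Model of e([A],[1]) * e([B],[H]) == 1 with A = a*[1], B = b*[1],
    H = x*[1]: in the exponent this reads a + x*b == 0 (mod R)."""
    return (a + x * b) % R == 0

def aggregate(proof_1, proof_2, k):
    """Return ([A_1] + k*[A_2], [B_1] + k*[B_2]) in the exponent model."""
    (a1, b1), (a2, b2) = proof_1, proof_2
    return ((a1 + k * a2) % R, (b1 + k * b2) % R)

def passing_pair(b, x):
    """Build a pair (a, b) that satisfies the modelled pairing check."""
    return ((-x * b) % R, b)

x = 123456789   # trusted-setup secret; visible here only because this is a toy
k = 987654321   # random aggregation challenge
pi_1 = passing_pair(7, x)
pi_2 = passing_pair(11, x)

# Two passing checks aggregate into one passing check.
assert pairing_check(*aggregate(pi_1, pi_2, k), x)

# If one input fails, the aggregate fails too: the error term is scaled by k,
# so it vanishes only if k happens to be 0 mod R.
bad_pi_2 = ((pi_2[0] + 1) % R, pi_2[1])
assert not pairing_check(*aggregate(pi_1, bad_pi_2, k), x)
```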