# Architectural Questions for Post-Quantum Signature Aggregation: A Comparative Analysis

## Introduction: The Post-Quantum Aggregation Imperative

The advent of quantum computing necessitates a fundamental shift in blockchain cryptography. For systems like Ethereum, the planned transition from the elegant and efficient BLS signature scheme to a post-quantum alternative, such as the hash-based signature scheme **XMSS (eXtended Merkle Signature Scheme)**, introduces a critical scalability challenge. While quantum-resistant, XMSS signatures are significantly larger and orders of magnitude more computationally expensive to verify than their pre-quantum counterparts. Individually verifying hundreds of thousands of such signatures on-chain for each block is computationally and economically infeasible.

The solution is **signature aggregation**: a cryptographic process that compresses a massive set of signatures into a single, constant-size proof that can be verified on-chain with minimal gas cost. This proof must provide an irrefutable guarantee that every signature in the original set was valid. Aggregation at this scale is inherently a recursive cryptographic process, and the architecture of the aggregator is a pivotal design decision for Ethereum's future.

This article provides a detailed analysis of the two primary architectural paradigms for implementing this recursion:

1. **Path A: Brute-Force SNARK Recursion.** This approach involves recursively verifying a full SNARK proof inside another SNARK circuit.
2. **Path B: Specialized Recursive Primitives.** This approach uses newer **folding** or **accumulation schemes**, which provide a more direct and highly efficient cryptographic shortcut for recursion by combining the underlying mathematical statements of proofs without performing a full verification.

A zkVM can serve as an abstraction layer on top of either path, but as we will see, its necessity and role differ dramatically between the two. We will analyze the technical mechanisms, performance profiles, and crucial engineering trade-offs of these distinct philosophies.

## The Aggregation Framework

The aggregator's design must handle a potentially huge set of validator participants. This is achieved through a multi-layered, parallelized architecture.

### High-Level Design: `Aggregate` and `Merge`

The system relies on two core cryptographic operations, which occur within a flexible P2P topology of aggregator nodes:

* **`Aggregate`**: This is the initial, non-recursive step where a node takes a batch of raw XMSS signatures, verifies them, and produces an initial proof attesting to their validity.
* **`Merge`**: This is the recursive step. A node takes two existing proofs and combines them into a single, new proof that attests to the validity of both inputs.

This process continues, within partitioned **subnets** of validators and then globally between them, until a single, final proof representing the entire participating validator set is produced. This final proof is the only object submitted on-chain.

The chosen cryptographic paradigm must efficiently handle both the `Aggregate` and `Merge` operations; the fundamental architectural divergence lies in *how* these operations are implemented. A sketch of the resulting recursion tree follows.
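To make the shape of this recursion concrete, here is a minimal Rust sketch of the aggregation tree. All names (`Signature`, `Proof`, `aggregate`, `merge`, `reduce`) are hypothetical stand-ins, and the real system distributes this recursion across subnets of a P2P network rather than running it in a single process.

```rust
// Hypothetical sketch of the Aggregate/Merge recursion tree.
// `Signature` and `Proof` stand in for real XMSS signatures and
// aggregation proofs; `aggregate` and `merge` are the two core ops.

#[derive(Clone)]
struct Signature; // placeholder for a raw XMSS signature

#[derive(Clone)]
struct Proof {
    covered: usize, // how many signatures this proof attests to
}

/// `Aggregate`: verify a batch of raw signatures, emit an initial proof.
fn aggregate(batch: &[Signature]) -> Proof {
    // Real implementation: verify each XMSS signature, prove validity.
    Proof { covered: batch.len() }
}

/// `Merge`: combine two proofs into one attesting to both inputs.
fn merge(a: Proof, b: Proof) -> Proof {
    // Real implementation: recursive SNARK verification or folding.
    Proof { covered: a.covered + b.covered }
}

/// Pairwise-merge a layer of proofs until a single root proof remains,
/// mirroring the tree built across subnets in the P2P network.
fn reduce(mut layer: Vec<Proof>) -> Proof {
    while layer.len() > 1 {
        layer = layer
            .chunks(2)
            .map(|pair| match pair {
                [a, b] => merge(a.clone(), b.clone()),
                [a] => a.clone(),
                _ => unreachable!(),
            })
            .collect();
    }
    layer.pop().expect("at least one proof")
}

fn main() {
    // 8 subnets, each aggregating a local batch of 1000 signatures.
    let leaves: Vec<Proof> = (0..8)
        .map(|_| aggregate(&vec![Signature; 1000]))
        .collect();
    let root = reduce(leaves);
    println!("final proof covers {} signatures", root.covered);
}
```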
## Path A: Brute-Force SNARK Recursion

This paradigm implements the `Merge` operation in the most direct way imaginable: the circuit for the `Merge` proof contains a full SNARK verifier.

### Core Mechanism

The relation being proven by the `Merge` SNARK is formally:

$$
R_{\text{Merge}} = \{\, proof_{out} \;:\; \exists\, proof_A, proof_B \ \text{s.t. } V(pk, proof_A) = \text{accept} \,\land\, V(pk, proof_B) = \text{accept} \,\}
$$

Here, $V$ is the SNARK's verification algorithm and $pk$ its public verification key. The prover for the `Merge` step must convince a verifier that it knows two valid proofs, $proof_A$ and $proof_B$. This is achieved by executing the entire logic of the verification algorithm $V$ inside the `Merge` circuit. A modern, hash-based proof system like **SuperSpartan+WHIR** exemplifies this approach.
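To see where the cost comes from, consider the following illustrative Rust sketch; every type and function in it is a hypothetical placeholder. The point is structural: the body of the `Merge` circuit contains the complete verification algorithm $V$, so every hash call, Merkle-path opening, and sum-check round of $V$ must be unrolled into constraints.

```rust
// Illustrative sketch only (all types and names are hypothetical):
// in Path A, the statement proven by `Merge` embeds the full SNARK
// verification algorithm V as in-circuit logic.

struct Proof;        // stand-in for a SuperSpartan+WHIR-style proof
struct VerifyingKey; // stand-in for the verifier's public key material

/// The constraint system being built for the `Merge` proof.
struct ConstraintSystem {
    verifier_invocations: usize,
}

impl ConstraintSystem {
    /// Every step of V (hash calls, Merkle-path openings, sum-check
    /// rounds) must be expressed as constraints. Here we only count
    /// invocations; a real circuit expands each into a huge subcircuit.
    fn enforce_full_verifier(&mut self, _vk: &VerifyingKey, _proof: &Proof) {
        self.verifier_invocations += 1;
    }
}

/// Build the circuit for R_Merge: both input proofs must verify,
/// i.e. V(pk, proof_A) = accept AND V(pk, proof_B) = accept.
fn merge_circuit(vk: &VerifyingKey, proof_a: &Proof, proof_b: &Proof) -> ConstraintSystem {
    let mut cs = ConstraintSystem { verifier_invocations: 0 };
    cs.enforce_full_verifier(vk, proof_a);
    cs.enforce_full_verifier(vk, proof_b);
    cs
}

fn main() {
    let cs = merge_circuit(&VerifyingKey, &Proof, &Proof);
    println!("Merge circuit embeds {} full verifier runs", cs.verifier_invocations);
}
```

Writing `enforce_full_verifier` by hand, as real constraints, is the infeasible part; this is exactly the gap the zkVM abstraction described next is meant to fill.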
### The Role of the zkVM Abstraction

The primary challenge of this path is the complexity of the SNARK verifier's circuit. A verifier for a system like WHIR involves complex cryptographic operations that translate into a massive and highly intricate constraint system. Building, auditing, and maintaining such a circuit by hand is practically infeasible.

This is where a **zkVM** becomes a practical necessity. It is an abstraction layer that manages this complexity:

1. **High-Level Logic**: We can write a simple program for the `Merge` logic (via the `AggregateMerge` algorithm) using a minimal, high-level Instruction Set Architecture (ISA).
2. **Cryptographic Precompiles**: The VM is equipped with precompiled circuits for expensive cryptographic primitives like `POSEIDON` or `DOT_PRODUCT`, which can be called as single instructions.
3. **Compilation**: A compiler translates this high-level program into the VM's bytecode. The zkVM's prover then automatically generates the final, complex constraint system for the entire execution trace of that program.

### Performance Profile and Bottlenecks

* **Prover Cost**: The primary drawback is the significant computational overhead. The prover must generate a proof for the entire VM execution trace, which includes not only the core verification logic but also every step of the VM's operation: instruction fetching and decoding, register updates, memory accesses, etc. This overhead makes the `Merge` step computationally intensive.
* **Latency**: The high prover cost directly translates to higher latency for each `Merge` operation, which can be a bottleneck in a time-sensitive P2P aggregation network.
* **Proof Size**: Modern hash-based SNARKs produce very compact proofs, so meeting size targets like **~128 KiB** is generally achievable.

## Path B: Specialized Recursive Primitives

This paradigm leverages cryptographic primitives designed specifically to achieve recursion more efficiently. Instead of verifying a full proof, **folding** and **accumulation schemes** work by combining the underlying mathematical structures of proofs—their **instance-witness pairs**.

### A Unified Framework for `Aggregate` and `Merge`

In this model, every proof is an **instance-witness pair** $(x, w)$ for a specific relation $R$. This uniform structure is used for both the initial `Aggregate` step and the recursive `Merge` steps.

* **The `Aggregate` Step**: This operation creates the first instance-witness pair. The **relation** is the XMSS signature verification algorithm. The **instance** consists of public data (public keys, message, bitfield), and the **witness** consists of the secret data (the raw signatures). The output is an initial proof (e.g., an `MCS` instance in Neo) that is ready to be folded.
* **The `Merge` Step**: A folding scheme provides a lightweight protocol to take two such instance-witness pairs, $(x_1, w_1)$ and $(x_2, w_2)$, and compute a single folded pair $(x_{folded}, w_{folded})$ for the same relation. This process is significantly cheaper than running a full SNARK verifier.

### Mechanism 1: Lattice-Based Folding (e.g., Neo)

**Neo** is a folding scheme based on plausibly post-quantum lattice assumptions that provides a unified framework for both the `Aggregate` and `Merge` operations.

* **`Aggregate` and `Merge` as a Folding Cycle**: The process begins with the `Aggregate` step, which creates an initial **`MCS` (Matrix CCS)** instance-witness pair. The **instance** contains the public inputs (public keys, message, bitfield), while the **witness** contains the secret inputs (the raw XMSS signatures). The underlying **relation** is a circuit that performs the XMSS signature verification for each signature indicated by the bitfield. Subsequently, the `Merge` operation takes two existing proofs (e.g., a running accumulator `ME` instance and a new `MCS` instance) and executes Neo's three-phase folding cycle of unification ($\Pi_{CCS}$), compression ($\Pi_{RLC}$), and normalization ($\Pi_{DEC}$) to combine them into a single, updated accumulator.
* **Specialized Advantage: Pay-per-Bit Commitments**. Neo's lattice-based commitments have a unique property: the cost for the prover to commit to a witness vector scales with the **bit-width** of its elements. This is particularly well suited to the `AggregateMerge` circuit, which is dominated by two types of operations: **(1)** XMSS signature verification, which consists of thousands of hash calls, and **(2)** bitfield manipulation. A generic zkVM prover must represent every bit and every small intermediate value in a hash as a full, large field element. With Neo, the prover's cost is drastically reduced because committing to the vast number of single-bit values in the witness is significantly cheaper. This makes both the initial `Aggregate` step (which is heavy on signature verification) and the `Merge` step (which involves bitfield logic) highly efficient. A toy illustration of the folding step itself follows.
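As a toy illustration of why folding is cheap, the following Rust sketch folds two instance-witness pairs of a simple linear relation $A \cdot w = x$ over a small prime field using a random linear combination. This is a deliberate simplification: real schemes such as Neo fold nonlinear (CCS-style) constraints and add commitments, cross terms, and norm decomposition, all omitted here. What the sketch does show is that the folding "verifier" performs only a handful of field operations, in contrast to the full in-circuit verifier of Path A.

```rust
// Toy folding sketch over the prime field F_p, p = 2^31 - 1.
// Relation: A * w = x (matrix-vector product). For a linear relation,
// a random linear combination of two satisfying pairs satisfies the
// same relation; real schemes (e.g., Neo) extend this to nonlinear
// constraints with cross terms and decomposition, omitted here.

const P: u64 = (1 << 31) - 1;

fn add(a: u64, b: u64) -> u64 { (a + b) % P }
fn mul(a: u64, b: u64) -> u64 { (a * b) % P }

/// Compute A * w over F_p.
fn apply(a: &[Vec<u64>], w: &[u64]) -> Vec<u64> {
    a.iter()
        .map(|row| row.iter().zip(w).fold(0, |acc, (&aij, &wj)| add(acc, mul(aij, wj))))
        .collect()
}

/// Fold two instance-witness pairs with verifier challenge r:
/// x_f = x1 + r*x2, w_f = w1 + r*w2. This is the whole "verifier".
fn fold(x1: &[u64], w1: &[u64], x2: &[u64], w2: &[u64], r: u64) -> (Vec<u64>, Vec<u64>) {
    let xf = x1.iter().zip(x2).map(|(&a, &b)| add(a, mul(r, b))).collect();
    let wf = w1.iter().zip(w2).map(|(&a, &b)| add(a, mul(r, b))).collect();
    (xf, wf)
}

fn main() {
    let a = vec![vec![1, 2, 3], vec![4, 5, 6]]; // public relation matrix
    let (w1, w2) = (vec![7, 8, 9], vec![1, 1, 1]);
    let (x1, x2) = (apply(&a, &w1), apply(&a, &w2));
    let r = 12345; // in practice a Fiat-Shamir challenge
    let (xf, wf) = fold(&x1, &w1, &x2, &w2, r);
    // The folded pair still satisfies the relation: A * w_f = x_f.
    assert_eq!(apply(&a, &wf), xf);
    println!("folded instance remains satisfying: {:?}", xf);
}
```

Note that soundness hinges on $r$ being an unpredictable verifier challenge: with a fixed, known $r$, a malicious prover could craft two unsatisfying pairs whose errors cancel at that point.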
### Mechanism 2: Hash-Based Accumulation (e.g., WARP)

**WARP** is an accumulation scheme based on plausibly post-quantum hash functions.

* **`Merge` as Accumulation**: The `Merge` operation corresponds to adding a new claim to a running accumulator. In WARP, each claim is a **PESAT (Polynomial Equation Satisfiability)** instance. The accumulation prover, $P_{ACC}$, updates the accumulator to include the new claim and generates a proof of this transition.
* **Specialized Advantage: Linear-Time Prover**. WARP is designed for maximal prover performance. Its runtime scales linearly with the circuit size, avoiding both the super-linear costs of group-based cryptography and the large constant overheads of a zkVM. For a massive-scale task like signature aggregation, a linear-time prover offers a profound performance improvement.

### The Engineering Challenge: Witness Management

The most significant engineering challenge for this path is managing witness data in a P2P network. The prover for a `Merge` step needs access to the witnesses of the two proofs being combined. A naive implementation would require transmitting large amounts of data. However, this is a solvable protocol-level problem. The solution is **witness chunking** (a toy sketch follows below):

1. **Commitment to Chunks**: A large witness (e.g., for thousands of signatures) is not treated as a monolithic object. Instead, it is partitioned into vectors of some fixed size $k$, and the prover commits to each of these $k$-sized chunks individually.
2. **Folding of Chunks**: When two such proofs are folded, the resulting **folded witness** is itself only a single vector of size $k$. The state that must be passed between recursive steps remains small and constant.
3. **Efficient Verification**: This approach increases the number of commitments the verifier must handle. However, schemes like Neo are designed for this. Using techniques like "double commitments", the commitments to the witness chunks can be folded efficiently without adding significant complexity or cost to the recursive verifier circuit.

The challenge then becomes a **P2P data availability problem**: ensuring aggregator nodes can access the specific witness chunks they need to perform a `Merge`.
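Below is a minimal sketch of the chunking idea, using the same toy field as above and a placeholder hash in place of a real (e.g., lattice-based) commitment scheme. The witness is split into fixed size-$k$ chunks (`K` in the code), each committed individually, and folding acts element-wise on chunks, so the state a node must store and forward stays size $k$ regardless of recursion depth.

```rust
// Toy sketch of witness chunking over F_p (p = 2^31 - 1): a large
// witness is split into size-K chunks, each committed individually,
// and folding acts element-wise, so the recursive state stays size K.
// The "commitment" here is a stand-in hash, not a real binding scheme.

use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

const P: u64 = (1 << 31) - 1;
const K: usize = 4; // chunk size; constant across all recursion depths

fn commit(chunk: &[u64; K]) -> u64 {
    // Placeholder commitment: a real system uses a lattice-based (Neo)
    // or hash-based scheme with foldable ("double commitment") structure.
    let mut h = DefaultHasher::new();
    chunk.hash(&mut h);
    h.finish()
}

/// Split a large witness into size-K chunks (padding the tail with 0).
fn chunk_witness(w: &[u64]) -> Vec<[u64; K]> {
    w.chunks(K)
        .map(|c| {
            let mut out = [0u64; K];
            out[..c.len()].copy_from_slice(c);
            out
        })
        .collect()
}

/// Fold two size-K chunks with challenge r: the result is again size K,
/// so the state passed between recursive `Merge` steps never grows.
fn fold_chunks(a: &[u64; K], b: &[u64; K], r: u64) -> [u64; K] {
    let mut out = [0u64; K];
    for i in 0..K {
        out[i] = (a[i] + (r % P) * (b[i] % P) % P) % P;
    }
    out
}

fn main() {
    let witness: Vec<u64> = (1..=10).collect(); // stand-in for signature data
    let chunks = chunk_witness(&witness);
    let commitments: Vec<u64> = chunks.iter().map(commit).collect();
    println!("{} chunks, {} commitments", chunks.len(), commitments.len());

    // Merging two proofs folds their chunks pairwise; state stays size K.
    let folded = fold_chunks(&chunks[0], &chunks[1], 12345);
    println!("folded chunk (still size {}): {:?}", K, folded);
}
```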
### Finalization: Creating the On-Chain SNARK

The recursive `Merge` process results in a single, final instance-witness pair. This pair is computationally valid but not "succinct" in the traditional sense, as the witness component is still present. It cannot be posted on-chain directly. Therefore, the final aggregator node must perform one last step: generate a standard, non-recursive SNARK for this final folded pair. The pipeline is as follows:

1. The final folded instance-witness pair is treated as the statement to be proven.
2. A **Polynomial Interactive Oracle Proof (PIOP)** like **(Super)Spartan** is applied. It reduces the task of proving knowledge of the folded witness to a set of multilinear polynomial evaluation claims.
3. A **Polynomial Commitment Scheme (PCS)** is then used to commit to these polynomials and prove the evaluations. A hash-based PCS like **FRI**, or a more modern and efficient alternative like **WHIR**, is employed for this.

This hybrid approach has a key implication: the overall system relies on **two distinct cryptographic engines**, the folding/accumulation scheme (e.g., Neo or WARP) for highly efficient recursion and a separate hash-based SNARK (e.g., Spartan+WHIR) for final succinctness. This adds complexity to the technology stack but combines the strengths of both paradigms: fast recursion and a tiny final proof.

## A Comparative Analysis of Architectural Trade-Offs

The choice between these two paths is a decision about where to place complexity and how to prioritize performance.

| Feature | Path A: Brute-Force SNARK Recursion | Path B: Specialized Primitives (Folding/Accumulation) |
| --- | --- | --- |
| **Recursive Engine** | Full SNARK verifier (computationally heavy) | Lightweight folding/accumulation algorithm (computationally cheap) |
| **Prover Performance** | Good, but with significant overhead from proving the VM's execution. | To be determined by concrete implementations. |
| **Circuit Complexity** | **Extremely high:** requires a zkVM abstraction to be manageable. | **Low:** the circuit for a folding verifier is simple enough to be built directly and even formally verified. |
| **Primary Locus of Complexity** | **Infrastructure:** the zkVM compiler, prover, and toolchain. | **Protocol:** the P2P witness availability and management layer. |
| **Final Proof Generation** | Integrated: the output of `Merge` is already a SNARK. | **Hybrid:** requires a final SNARK (e.g., Spartan+WHIR) to compress the folded proof. |
| **Need for a zkVM** | **High:** a practical necessity to manage the complexity of the SNARK verifier circuit. | **Low:** a zkVM is an optional tool for developer experience, not a requirement for managing complexity. |

## Conclusion: Choosing the Right Path for Ethereum

The debate over the optimal architecture for post-quantum signature aggregation is more nuanced than a simple choice between a zkVM and a folding scheme. The real decision lies between two fundamentally different approaches to recursion, each with a distinct profile of strengths and engineering challenges.

* **Path A (Brute-Force SNARK Recursion)** manages complexity by abstracting it away behind a **zkVM**. This simplifies application logic at the cost of significant, unavoidable computational overhead in the recursive step.
* **Path B (Specialized Primitives)** achieves superior performance by using a more efficient cryptographic engine for recursion and a different SNARK for finalization. This pushes complexity from the cryptographic core to the **P2P protocol layer**, requiring sophisticated witness management solutions but offering a much faster, more scalable, and highly optimized result.