Polygon zkEVM - HackMD

# Polygon zkEVM ## Composition, Recursion and Aggregation ### Proof Composition in STARKs In a standard STARK system, the verifier checks the validity of a proof $\pi_{STARK}$ using public parameters and verification criteria. Proof composition refers to the idea of utilizing different proof systems to construct a more efficient proof verification process. Specifically, composition approach delegates the verification of a STARK proof $\pi_{STARK}$ to a circuit verifier $C$. The prover can then provide a proof $\pi_{CIRCUIT}$ that verifies the correct execution of the circuit verifier $C$. As shown in the figure below, the verification process is reduced to verifying $\pi_{CIRCUIT}$, which is smaller and quicker to verify than the original $\pi_{STARK}$. ![image](https://hackmd.io/_uploads/H1d_KHR1yg.png) The primary advantage of this composition is that the proof $\pi_{CIRCUIT}$ is more compact, thus reducing the overall verification complexity. Let the original STARK proof be denoted as $\pi_{STARK}$. The verifier for this proof can be represented as a circuit $C$, whose execution can be verified by a second proof $\pi_{CIRCUIT}$. Therefore, the verification of $\pi_{STARK}$ is reduced to the verification of the circuit proof $\pi_{CIRCUIT}$. In this case, if the prover provides a proof for the correct execution of the verification circuit $\pi_{CIRCUIT}$, then this is enough to verify the original STARK. By constructing $C$ as a compact circuit, the size of $\pi_{CIRCUIT}$ can be minimized, making it more efficient for the verifier. ### Recursion in STARKs #### Recursive Verification Setup The verifier in a STARK system is generally much more efficient than the prover. This disparity allows us to implement a recursive verification scheme, where the output of each verification step becomes the input for the next. In this recursive architecture, multiple STARK verifiers are arranged in a chain, each verifying the proof produced by the previous verifier. Figure below demonstrates this setup, where we define intermediate circuits to represent each verification step. ![image](https://hackmd.io/_uploads/ByGK2H0kke.png) We define the first STARK system as $\text{STARK}_\text{A}$ with parameters `pil`, `constants`, and `starkinfo`. The verifier for $\text{STARK}_\text{A}$ is converted into a circuit $\text{CIRCUIT}_{\text{A.verifier}}$, which is then expressed in terms of Rank-1 Constraint Systems (R1CS). This process is called STARK-to-CIRCUIT (S2C). In other words, the R1CS description of the STARK verifier circuit can be pre-processed before the computation of the proof. The circuit definition (R1CS) then becomes the input for a new STARK proof system, $\text{STARK}_\text{B}$, which we derive via the CIRCUIT-to-STARK (C2S) transformation. This transformation allows us to generate a new set of STARK parameters for $\text{STARK}_\text{B}$ and it is essentially a $\mathcal{PlonK}\text{ish}$ arithmetization with some custom gates of the verification circuit of $\text{STARK}_\text{A}$. This recursive process can be repeated any number of times, compressing the proof at each step while increasing the complexity for the prover. #### Recursive Proof Generation Once the recursive setup is complete, the proof generation proceeds by feeding the output of one prover into the next prover in the sequence. The first prover generates a proof $\pi_{\text{A}}$, which is then passed to the next prover along with additional public inputs, generating the next proof $\pi_{\text{B}}$, and so on. Figure below shows the structure of this recursive proof generation process. ![image](https://hackmd.io/_uploads/HkaFhrCkyl.png) The final proof is a circuit-based proof (such as $\texttt{FFLONK}$) that is much smaller and more efficient to verify than the original STARK proof. ### Aggregation in STARKs Aggregation in proof systems allows multiple valid proofs to be combined into a single proof that can be verified efficiently. This method is especially useful in scenarios where many individual proofs need to be verified, such as in batch processing or decentralized applications. By aggregating proofs, we reduce the verification burden by consolidating them into one verifiable entity. Aggregation architecture supports proof aggregation through the use of intermediate circuits, which act as proof aggregators. As depicted in figure below, binary aggregators combine two individual proofs into a single proof. The final aggregated proof can then be verified in constant time, regardless of the number of individual proofs that were combined. ![image](https://hackmd.io/_uploads/ByZ8D80yyg.png) Let $\pi_1, \pi_2, \dots, \pi_n$ represent a set of valid proofs. The aggregation function combines these proofs into a single aggregated proof $\pi_{n + 1}$: $$ \pi_{n + 1} = \text{Aggregate}(\pi_1, \pi_2, \dots, \pi_n) $$ The verifier only needs to verify $\pi_{n + 1}$, ensuring that all individual proofs were valid. ## Architecture This section presents a detailed examination of the architecture of Polygon zkEVM, focusing on the process of recursive proof generation, aggregation, and composition. This describes the step-by-step methodology to prove the correct execution of batches of transactions using zkEVM and recursive SNARKs. The two-phased process of proof generation is outlined, with an emphasis on the recursive proving phase. This provides a comprehensive breakdown of the five intermediate stages involved in transitioning from individual batch proofs to a final aggregated validity proof. These stages include Compression, Normalization, Aggregation, Finalization, and the generation of the ultimate SNARK proof. ### Two-Phase Proof Generation The process of generating zero-knowledge proofs for zkEVM follows two distinct phases: 1. **Setup Phase**: This phase is performed once for each STARK definition. In the context of zkEVM, the STARK definition corresponds to the batch processing of Ethereum-compatible transactions. During the setup phase, all necessary artifacts (e.g., R1CS constraints, witness generation keys, proving keys, verification keys) are generated for the specific STARK circuit. This preprocessing step enables efficient proof generation in the subsequent proving phase. 2. **Proving Phase**: This phase is performed for each batch of transactions. For each set of inputs (i.e., a batch of transactions), a proof is generated that certifies the correct execution of the zkEVM's batch processor. The proving phase involves multiple intermediate steps that eventually lead to the creation of a final validity proof. This final proof, a SNARK proof, is compact and can be efficiently verified by anyone. ### Recursive Proving Structure Focusing on the recursive proving process, the goal is to generate an efficiently verifiable proof of correctness for multiple batches of transactions. The recursion mechanism in zkEVM starts with individual batch proofs and progressively compresses and aggregates them through a series of intermediate stages. This culminates in a single proof that attests to the correctness of all batches processed. An overview of this process is shown in figure below. ![image](https://hackmd.io/_uploads/ByqFR8RJkl.png) ### Building the zkEVM STARK #### Constructing the zkEVM ROM The first step in building the zkEVM STARK involves constructing the zkEVM ROM, which defines the EVM program to be executed. The zkEVM ROM contains a sequence of instructions that the zkEVM state machine must execute, resulting in an execution trace. This trace records all state transitions during the execution of the program and serves as the basis for generating a STARK proof that certifies the correct execution of the zkEVM. Let the execution trace for the zkEVM program be represented as $T = \{t_1, t_2, \dots, t_n\}$, where each $t_i$ corresponds to the state of the zkEVM at step $i$. The zkEVM ROM generates all the constant values required to compute this trace. Notably, the polynomials committed during the STARK proving process are not needed at this stage. As a result, the zkEVM executor does not need to run during the setup phase. #### Polynomial Identity Language (PIL) for Trace Validation To verify the execution trace generated from the zkEVM ROM, we use Polynomial Identity Language (PIL) constraints. PIL ensures that the zkEVM execution trace follows the correct rules for executing EVM instructions, enforcing polynomial relationships between consecutive state transitions. #### Initial STARK Proof Generation The initial zkEVM proof is generated from the execution trace, resulting in a large proof $\pi_{\text{batch}}$ for each batch of transactions. Each $\pi_{\text{batch}}$ proof verifies the correctness of a single batch of transactions processed by zkEVM. This initial proof includes multiple polynomials representing different aspects of the state transitions. In this case, we use a blowup factor of $2$ and $128$ queries to generate the proof. However, these initial proofs are large because the execution trace is represented by a significant number of polynomials. To reduce the proof size, a compression stage is introduced, which compresses $\pi_{\text{batch}}$ into a more compact proof $\pi_{\text{c12a}}$. ### Compression Stage and Proof Optimization #### From $\pi_{\text{batch}}$ to $\pi_{\text{c12a}}$ The initial proof for a batch of transactions, $\pi_{\text{batch}}$, is large due to the number of polynomials used in the execution trace. To reduce the size of these proofs, zkEVM employs a **Compression Stage**. During this stage, the initial proof $\pi_{\text{batch}}$ is compressed into a more compact proof $\pi_{\text{c12a}}$, which employs only 12 polynomials. These 12 polynomials are carefully selected to capture the essential components of the zkEVM execution trace. By reducing the number of polynomials from the initial proof, zkEVM is able to significantly reduce the overall size of the proof without sacrificing the security or integrity of the computation. #### The Compressed Proof $\pi_{\text{c12a}}$ The final compressed proof, $\pi_{\text{c12a}}$, is significantly smaller than the initial proof $\pi_{\text{batch}}$. It uses only $12$ polynomials and a higher blowup factor of $\beta = 4$, along with fewer FRI queries ($64$ queries instead of $128$). This optimization reduces the overall size of the proof while ensuring that the correctness of the zkEVM execution trace is preserved. This compressed proof is used as the input for subsequent recursive proving steps in zkEVM, allowing for efficient verification of large batches of transactions across multiple layers of recursion. ### Normalization Stage The **Normalization Stage** introduces this step by transforming the verification circuit so that it accepts the root constants as public inputs. This ensures that the root constants, which are crucial for the aggregation of proofs, can be uniformly handled across all recursion steps. To understand the need for normalization, consider the transition from one recursive proof $\pi_{\text{c12a}}$ to the next recursive proof $\pi_{\text{recursive1}}$. The proof $\pi_{\text{c12a}}$ verifies a large STARK proof $\pi_\text{batch}$ for a batch of transactions, and $\pi_{\text{recursive1}}$ is responsible for verifying $\pi_{\text{c12a}}$. The recursive proof construction requires that each circuit verifier in the recursion chain is identical in structure to ensure that proofs can be aggregated consistently. This means that both the aggregator circuit and the normalization circuit must be identical, with the only difference being the inclusion of the root constant as a public input during normalization. #### Blowup Factor and FRI Queries For the recursive proof $\pi_{\text{recursive1}}$, we use a blowup factor of $\beta = 2^4 = 16$, meaning that the polynomials in the recursive proof are evaluated at 16 times the original number of points. This helps maintain the low-degree properties of the polynomials while minimizing the overall size of the proof. Additionally, we reduce the number of FRI queries to $32$, down from $64$ in the previous stages, which further reduces the complexity of the proof while maintaining security. ### Aggregation Stage In recursive zkEVM proofs, multiple proofs are generated at different stages of the recursive process. These proofs must eventually be combined into a single, verifiable proof to reduce the computational load on verifiers. The **Aggregation Stage** is responsible for combining these proofs, ensuring that they can be verified as a unified whole. At each level of recursion, the system produces either a normalized proof $\pi_{\text{re1}}$ or an aggregated proof $\pi_{\text{re2}}$. The role of the Aggregation Stage is to combine two such proofs—either two $\pi_{\text{re1}}$, two $\pi_{\text{re2}}$, or one of each—into a single aggregated proof. This process continues recursively until all proofs are aggregated into one final proof. #### Aggregation Circuit Design To perform aggregation, we design a **Circom aggregation circuit** capable of handling different types of proofs. This circuit takes as input either two $\pi_{\text{re1}}$ proofs, two $\pi_{\text{re2}}$ proofs, or one of each. The circuit determines the correct handling of these proofs based on the type of each proof. In particular, the circuit must be able to differentiate between cases where the root constant from a previous proof should be passed as a public input (as in the case of $\pi_{\text{re2}}$) or provided directly (as in the case of $\pi_{\text{re1}}$). The following diagram illustrates the structure of the aggregation circuit: ![image](https://hackmd.io/_uploads/BkHejD1gke.png) In the circuit, the **selector** determines whether the proof being aggregated is of type $\pi_{\text{re1}}$ or $\pi_{\text{re2}}$. For $\pi_{\text{re1}}$, the root constant is passed directly into the aggregation process. For $\pi_{\text{re2}}$, the root constant is provided as a public input from a prior circuit. #### Blowup Factor and FRI Queries As in previous stages, the blowup factor and the number of FRI queries are critical for maintaining the efficiency and security of the aggregation process. For the aggregation step, we maintain the blowup factor at $\beta = 2^4 = 16$, meaning that the polynomials are evaluated at 16 times the original number of points. This helps to ensure that the low-degree properties of the polynomials are preserved, while also minimizing the size of the proof. Additionally, we use $32$ FRI queries in this step, as in the previous stages. This ensures that the aggregated proof remains secure and efficient, without introducing unnecessary complexity. #### Final Aggregated Proof Generation The aggregated proof $\pi_{\text{rec2}}$ serves as the final proof for this step of recursion and can be used as input for further recursive steps or for final verification. ### Final Stage in Recursive zkEVM STARK Proofs The Final Stage in zkEVM STARK proofs serves two main purposes: 1. **Transition to a different finite field**: In this stage, the proof verification is transitioned from the finite field used in the earlier recursive stages to the finite field defined by the bn128 elliptic curve. This is essential because the next step in the zkEVM pipeline involves generating a SNARK proof using the $\texttt{FFLONK}$ protocol, which requires the use of bn128. 2. **Handling of multiple root constants**: In the Final Stage, we handle two root constants, each corresponding to one of the two proofs that were aggregated in the previous step. These root constants are necessary to verify the correctness of the aggregation and to maintain the integrity of the recursive structure. Let $\mathbb{F}_{\text{bn128}}$ denote the finite field associated with the bn128 curve. The key operation in this stage is transitioning all challenges and polynomials from the previous field $\mathbb{F}_{\text{STARK}}$ (the field used for STARK proof generation) to $\mathbb{F}_{\text{bn128}}$, ensuring that the proof is compatible with the elliptic curve cryptographic operations required by $\texttt{FFLONK}$. #### Blowup Factor and FRI Queries In the Final Stage, the blowup factor and the number of FRI queries are critical for maintaining the security and efficiency of the proof. For the final recursive proof $\pi_{\text{ref}}$, we maintain the blowup factor at $\beta = 2^4 = 16$, meaning that the polynomials are evaluated at 16 times the original number of points. This ensures that the low-degree properties of the polynomials are preserved, while also minimizing the overall size of the proof. Additionally, we use 32 FRI queries, as in previous stages. These queries ensure that the polynomials remain of low degree and that the proof remains secure and efficient. ### SNARK Stage in Recursive zkEVM: Generating the Final $\texttt{FFLONK}$ Proof The **SNARK Stage** in zkEVM has two key purposes: 1. **Reduce Proof Size**: STARKs, while scalable and transparent, typically produce relatively large proofs. In contrast, SNARKs such as $\texttt{FFLONK}$ offer **constant-size proofs**, which are independent of the complexity of the statement being proven. The final $\texttt{FFLONK}$ proof, $\pi_{\texttt{FFLONK}}$, is much smaller than the previous recursive STARK proof $\pi_{\text{ref}}$. 2. **Reduce Verification Complexity**: Verifying a STARK proof has **linear complexity** in the size of the execution trace. SNARKs like $\texttt{FFLONK}$ offer **constant-time verification**, making them highly suitable for use cases where fast verification is essential, such as in blockchain systems where on-chain verification costs are a concern. The goal of the SNARK Stage is to generate a small, verifiable proof that confirms the correctness of the recursive proof $\pi_{\text{ref}}$. The proof $\pi_{\texttt{FFLONK}}$ will be sent to the verifier, who can efficiently confirm the validity of the entire zkEVM computation. #### Transition from STARK to SNARK The recursive zkEVM proof system, up until the SNARK Stage, operates using STARKs. The recursive proofs generated earlier in the process, such as $\pi_{\text{re2}}$ and $\pi_{\text{ref}}$, are verified using STARK protocols that rely on techniques like Fast Reed-Solomon Interactive Oracle Proofs (FRI). However, the final proof is a **$\texttt{FFLONK}$ SNARK**, meaning that the proof verification logic must be adapted to the SNARK setting. This transition is made by: 1. **Converting the STARK Verification Circuit to a SNARK-compatible Circuit**: The recursive proof verification circuit (from the previous stage) is converted into a **Circom circuit** that can be compiled for $\texttt{FFLONK}$ SNARK verification. This involves taking the verification logic for $\pi_{\text{ref}}$ and encoding it in a format that is compatible with SNARKs. 2. **Generating the $\texttt{FFLONK}$ Proof**: Once the circuit is compiled, the $\texttt{FFLONK}$ protocol is used to generate the final proof. The resulting proof, $\pi_{\texttt{FFLONK}}$, is succinct and can be verified in constant time, regardless of the complexity of the underlying computation. ## References 1. https://github.com/0xPolygonHermez/zkevm-techdocs/blob/main/docs/proof-recursion.pdf