
Project description and final update

Research project on decentralized prover mechanisms of zk-rollups

Ethereum Protocol Fellowship (July - November 2023)

Project proposal

Our mentor: Barnabé Monnot, Research Scientist at the Robust Incentives Group of the Ethereum Foundation

Our team: Nilu, Rachit and myself.

Motivation

The main motivation for this project was to contribute to Ethereum’s rollup-centric scaling approach, with a focus on zk-rollups. A well-designed prover mechanism is crucial for the efficiency and reliability of zk-rollups.

The focus of the project was therefore a comprehensive review and analysis of the current state and development directions of prover models, identifying potential inefficiencies, vulnerabilities, and design weaknesses in prover selection and incentive mechanisms. Building on this, the project aimed to arrive at an optimal prover strategy.

Why decentralized proving matters

Resilience and Reliability

Decentralized prover models introduce redundancy, deploying multiple provers to generate and submit proofs. This enhances system resilience and prevents the operation of the entire rollup from being disrupted by the failure of one prover. This redundancy is imperative for maintaining continuous liveness and reliability.

Censorship Resistance

Centralized prover models pose a higher risk of censorship, as they allow authorities or misbehaving provers to influence transaction inclusion. Decentralized prover networks distribute proof generation across multiple entities, making it difficult for any single entity to exert undue control and ensuring resistance to censorship.

Scalability

With the increasing demand for fast and low-fee transactions, centralized provers may struggle to meet computational requirements. Decentralized prover networks distribute the workload efficiently, offering better scalability to accommodate increased transaction throughput. This scalability is crucial for adapting to a growing user base without compromising performance.

Security and Integrity

Decentralized prover networks enhance security by introducing redundancy. The failure of one prover does not compromise the overall security and integrity of the zk-rollup. This decentralized approach aligns with the principles of trustlessness, avoiding dependence on a single entity and promoting a more resilient system.

Fair Distribution of Rewards

More centralized prover models may lead to disproportionate profit distribution, creating imbalances in rollup economics. Decentralized prover networks address this issue by distributing rewards more equally and fairly among participants.

Design challenges

While decentralized prover networks offer several advantages, challenges persist. Managing a large number of participants and coordinating network interactions becomes a scalability challenge. Preventing Sybil attacks and collusion demands resilience against various attack vectors. Designing effective incentive mechanisms that discourage malicious behavior while fairly rewarding honest participation requires meticulous economic and game-theoretic considerations.

Review & analysis

We started off with a detailed analysis of existing prover selection methods and incentive mechanisms. There were two main topics we focused on: research related to prover mechanisms, and crypto/rollup economics. I highlight some important pieces below, along with more extensive lists of resources:

Exploring prover research directions:

Full list of resources: awesome-prover-mechanisms repo

Exploring incentive mechanism and rollup economics

Full list of resources: awesome-rollup-economics repo

Detailed analysis of L2 architectures

In addition, an in-depth review of the architecture of several zk-L2s was conducted through their docs and GitHub repos. This included:

  • Starknet, Aztec, Scroll, Taiko, Linea, Polygon and zkSync.

We distributed the work within our team; I focused on Scroll, Taiko and Linea, and created extensive research notes on their architectures:

I have also created a high-level analysis of different prover selection methods.

Design directions

Our analysis showed that the driving principles of designing prover models can vary significantly, as there is no one-design-to-rule-them-all.

To enshrine or not to enshrine?

An important question is whether enshrined prover networks or out-of-protocol proof markets are better suited to guarantee decentralized, secure and robust proof generation.

In-protocol prover networks ensure that the rollup has no external dependency. In this case the redundancy is greater, but the compute power of participants is not utilized in the most efficient way, and sustaining the network is more costly. Out-of-protocol proof markets, on the other hand, ensure efficient use of compute resources, as provers can work on tasks coming from different sources. Still, I believe that a sufficiently decentralized proof market can ensure redundancy through the high number of participants in the prover pool. Both directions have their advantages and trade-offs, and I believe that hybrid solutions, like the one implemented in Taiko’s Alpha-5 testnet, may be the way forward.

Separate or merge?

Certain concepts aim to separate the role of the prover from that of the sequencer, whose main task is to bundle transactions into rollup blocks. This approach is similar to proposer-builder separation (PBS) on Ethereum, and is called Prover-Sequencer Separation (PSS) on L2s. Others are exploring models in which sequencers themselves could also act as provers, meaning that one network of participants could perform both roles.

Do we need sequencers at all?

If most sequencers are likely to outsource block building to specialized builders anyway, similarly to proposers on Ethereum, then block builders could even propose blocks directly to provers, without the need for a separate sequencer role. This simplifies the protocol design to a single enshrined role; however, the overhead for provers to reach consensus may be large in this case. For more details on the above, please see the talk by Toghrul from the Scroll team.

All these are valid and important questions that will shape the design of future mechanisms.

Design criteria

Once we had completed the review of existing rollup architectures and research directions, the next step was to define a set of design criteria that are important to optimize for. These are the following:

  • Decentralization
  • Liveness and scalability
  • Censorship resistance
  • Competition and efficiency
  • Permissionless entry
  • Transparency and fairness
  • Honest behaviour
  • Incentives
  • Security
  • Sybil attack resistance
  • Cost

Metrics & Optimization problems

We defined metrics for each of the above criteria: in order to quantify them, we developed mathematical formulas and turned them into optimization problems. In an optimization problem we look for the largest or smallest value a function can take, so we aimed to maximize or minimize the values of our metrics, and also identified constraints.

Here are the criteria and related metrics I was working on:

Censorship Resistance

The mechanism should encourage more provers to participate, which should increase censorship resistance, since a small set of provers may refuse to prove certain types of transactions. A code sketch of the metrics below follows the list.

  1. Non-censorship Index (NCI): this metric aims to measure the degree of censorship resistance in a prover network. It can be defined as the ratio of successfully proved transactions to the total number of transactions.

    • NCI = (Σ X) / T
      • ‘Σ X’ represents the sum of binary variables ‘X’.
      • ‘X’ is a binary decision variable (1 if not censored, 0 if censored) for each transaction.
      • ‘T’ represents the total number of transactions.
    • Objective function: Maximize NCI
    • Participation constraint: with this we aim to ensure that all provers prove a minimum proportion of all transactions, and to discourage refusal
      • A certain participation proportion to be defined for all transaction types (e.g. privacy transactions, or low-fee & high-compute transactions)
    • Prover Non-censorship Index (PNI): the above Non-censorship Index (NCI) could also be tailored to measure the censorship behavior of individual provers; for this we include the below penalty in the formula:
      • Prover Censorship Penalty (PCP): could be included in the objective function to penalize censoring provers. This should be subtracted from the NCI when a prover censors a transaction.
      • PNI = (Σ X) / T - Σ C * (1 - X)
        • ‘C’ is the censorship penalty for the transactions.
      • This index could be used in two ways:
        • included in the calculation formula of the provers’ reputation
        • Could be used to dynamically adjust the rewards for provers; i.e. non-censoring provers get full rewards, while censoring provers get rewarded according to their score (1 = 100% rewards; 0.9 => 90% of rewards)
  2. Censorship Likelihood (CL): this metric quantifies the likelihood of provers refusing to prove certain types of transactions. Our objective is to minimize the likelihood of censorship by a subset of provers for any transaction type.

    • Minimize CL = ∑(i=1 to N) Ci * Xi
      • ‘N’ is the number of different types of transactions (e.g. privacy transactions, low-fee & high-compute transactions, etc.)
      • ‘Ci’ is the likelihood coefficient for type ‘i’
      • ‘Xi’ is the decision variable (0 or 1) indicating whether the transaction is censored or not
      • Different environments carry different censorship likelihoods for certain types of transactions (e.g. low-fee & high-compute transactions in a profit-maximizing environment)
    • Participation constraint: as above, this aims to ensure that all provers prove a minimum proportion of all transaction types, and to discourage refusal
      • A certain participation proportion to be defined for all transaction types
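
To make these metrics concrete, below is a minimal Python sketch of NCI, PNI and CL. The function names and example data are hypothetical illustrations; in a real setting the censorship flags and likelihood coefficients would be derived from observed prover behavior.

```python
# Hypothetical sketch of the censorship metrics defined above.

def nci(flags):
    """Non-censorship Index: NCI = (sum of X) / T.
    flags: binary X values per transaction (1 = not censored, 0 = censored)."""
    return sum(flags) / len(flags)

def pni(flags, penalty):
    """Prover Non-censorship Index: NCI minus a Prover Censorship Penalty
    (C) for every transaction the prover censored."""
    return nci(flags) - sum(penalty * (1 - x) for x in flags)

def censorship_likelihood(coeffs, decisions):
    """CL = sum over the N transaction types of Ci * Xi."""
    return sum(c * x for c, x in zip(coeffs, decisions))

# Example: 10 transactions, one censored, penalty C = 0.05.
flags = [1] * 9 + [0]
print(nci(flags))                # 0.9
print(pni(flags, penalty=0.05))  # 0.85

# Two transaction types: privacy (C = 0.7) and low-fee & high-compute
# (C = 0.4), with only the second type censored.
print(censorship_likelihood([0.7, 0.4], [0, 1]))  # 0.4
```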

Permissionless entry

The mechanism should allow any eligible participant to join without requiring special permissions or approvals, ensuring open access and inclusivity.

  1. Permissionlessness Coefficient (PC): this metric aims to ensure that anyone can join the prover network without special permissions or approvals and to maximize the openness and inclusivity of the prover mechanism, which can be represented by the number of eligible participants (who meet the accessibility criteria under the Accessibility Coefficient (AC) further below).
    • PC = Maximize Number of Participants
    • Constraints:
      • Censorship constraint: to ensure that no eligible participant is denied access to the prover network due to censorship or discrimination.
        • No_Censorship = 1
      • Access Permission constraint: to ensure that there are no special permissions or approvals required by any central parties for an eligible prover to join
        • Special_Permissions = 0
  2. Accessibility Coefficient (AC): it aims to minimize the entry barrier for participation in the prover network. This barrier can be financial as well as technical. The lower the AC, the lower the entry barrier for prover participation (a code sketch of both metrics follows this list).
    • Objective function: Minimize AC = Σ(Bi(Stake, TechnicalRequirements))
      • ‘Bi’ is the entry barrier for each participant ‘i’
      • ‘Stake’ represents the amount of stake needed to join the prover network.
      • ‘TechnicalRequirements’ represents the technical prerequisites that participants need to meet.
    • Financial constraint: a certain threshold (maximum acceptable amount) to be defined to ensure the financial prerequisite never exceeds it
      • Constraint: Stake ≤ StakeThreshold
    • Technical constraint: to ensure that the technical requirements to generate zk-proofs do not exceedingly limit participation
      • Constraint: TechnicalRequirements ≤ MaxTechnicalRequirements
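
Below is a hedged Python sketch of the two metrics above. The entry-barrier function Bi and both thresholds are assumptions made for illustration; their exact form is left open in the metric definitions.

```python
# Illustrative sketch of the Permissionlessness Coefficient (PC) and the
# Accessibility Coefficient (AC). Thresholds and the barrier function
# are hypothetical.
from dataclasses import dataclass

@dataclass
class Candidate:
    stake: float             # stake the candidate would post
    technical_score: float   # proxy for the technical requirements met

STAKE_THRESHOLD = 32.0       # hypothetical maximum acceptable stake
MAX_TECHNICAL = 0.8          # hypothetical cap on technical requirements

def entry_barrier(c: Candidate) -> float:
    # A simple additive form of Bi(Stake, TechnicalRequirements).
    return c.stake / STAKE_THRESHOLD + c.technical_score / MAX_TECHNICAL

def satisfies_constraints(c: Candidate) -> bool:
    # Financial and technical constraints from the list above.
    return c.stake <= STAKE_THRESHOLD and c.technical_score <= MAX_TECHNICAL

def accessibility_coefficient(candidates) -> float:
    # AC = sum of Bi; the lower the sum, the lower the entry barrier.
    return sum(entry_barrier(c) for c in candidates)

candidates = [Candidate(16.0, 0.4), Candidate(32.0, 0.8), Candidate(40.0, 0.5)]
eligible = [c for c in candidates if satisfies_constraints(c)]
print(len(eligible))                        # PC: maximize this count
print(accessibility_coefficient(eligible))  # AC: minimize this sum
```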

Transparency and Fairness

The mechanism should maintain transparency in its operation and ensure that all participants are treated fairly, without any discriminatory practices or bias in participant selection and reward distribution.

  1. Transparency Index (TI):
    This metric quantifies the level of transparency in the operation of the prover mechanism, and aims to include the availability of information about the mechanism’s rules, operation, and state, as well as the transparency of decision-making processes.

    Objective function:

    • Maximize TI = ∑ Wi * Si (where i = 1, 2, …, n)
      • ‘Wi’ is the weight of the i-th sub-objective, reflecting its relative importance
      • ‘Si’ is the score for the i-th sub-objective
      • ‘n’ is the number of sub-objectives

    Information Availability Sub-objective = (Available information) / (Total necessary information ensuring transparency)

    • It measures the availability of information about the zk-rollup mechanism's rules, operation, and state.
    • Constraint: the score for the information availability sub-objective to be higher than the predefined acceptable threshold

    Openness in Decision-making Sub-objective = (Openly made decisions) / (Total number of decisions)

    • This assesses the openness of decision-making processes within the zk-rollup mechanism
    • Constraint: the score for the Decision-making sub-objective to be higher than the predefined acceptable threshold
  2. Selection Probability Rate (SPR):
    Considering a staking-based approach with a predetermined max_Stake_Amount and random selection: this metric aims to measure the probability of a prover being selected, which can be determined by the ratio of the prover’s stake to the total amount staked in the prover network.

    • Pi = min(Stake of Prover i, max_Stake_Amount) / total_Staked_Amount
      • Pi is the probability of prover i being selected.
      • Proportional selection ensures that a prover with a higher stake is selected (and thus rewarded) proportionally more often, while max_Stake_Amount prevents any single prover from becoming dominant in the network
    • After incorporating randomness (‘Ri’), the final SPR of Prover i = Pi * Ri (see the sketch after this list)
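
The sketch below illustrates TI and SPR under the definitions above. The weights, scores and the uniform randomness source are illustrative choices, not prescribed by the metrics.

```python
# Illustrative sketch of the Transparency Index (TI) and the
# Selection Probability Rate (SPR).
import random

def transparency_index(weights, scores):
    """TI = sum of Wi * Si over the n sub-objectives."""
    return sum(w * s for w, s in zip(weights, scores))

def selection_probabilities(stakes, max_stake):
    """Pi: each prover's stake, capped at max_Stake_Amount,
    divided by the total staked amount."""
    total = sum(stakes)
    return [min(s, max_stake) / total for s in stakes]

def spr(stakes, max_stake, rng=random):
    """Final SPR of prover i = Pi * Ri, with Ri drawn uniformly."""
    return [p * rng.random() for p in selection_probabilities(stakes, max_stake)]

# Two sub-objectives: information availability (weight 0.6, score 0.9)
# and openness in decision-making (weight 0.4, score 0.75).
print(transparency_index([0.6, 0.4], [0.9, 0.75]))  # 0.84

stakes = [100.0, 50.0, 400.0]
print(selection_probabilities(stakes, max_stake=200.0))
print(spr(stakes, max_stake=200.0))
```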

Security

The mechanism should prioritize security measures to protect against potential attacks, vulnerabilities, malicious behavior, or errors, possibly using staking and slashing as mechanisms to deter and penalize wrongful actions, ensuring the safety of the network.

  1. Rate of Decentralization (RD):
    Decentralization is key to ensuring the security of a prover network and its resistance to attacks. See the metrics further below under the “Decentralization” criterion.

  2. Proof Validity Ratio (PVR):
    This metric aims to quantify the ratio of valid zero-knowledge proofs (VZKP) to the total zero-knowledge proofs (TZKP) generated by provers. A valid zero-knowledge proof is one that is verified successfully by the zk-rollup network’s L1 verifier contract. The objective is to encourage the generation of valid zero-knowledge proofs while maintaining the security and integrity of the zk-rollup network. A code sketch of the security objectives follows this list.

    Objective function:

    • Maximize PVR = Valid zero-knowledge proofs (VZKP) / Total zero-knowledge proofs (TZKP)

    Constraints:

    • Proof Generation Constraint: TZKP > 0
    • Validity Constraint: VZKP <= TZKP
    • Prover Identity Constraints: ensure that each prover has a unique and verifiable identity or reputation within the network to prevent Sybil attacks. See further details below under the “Sybil Attack Resistance” criterion
    • Consensus Constraints: the zk-rollup’s consensus mechanism to ensure that only valid transactions are included in a block. Consensus Validity (CV) = 1 if all transactions are correct, and CV = 0 if not
    • Resource Constraints: Consider resource limitations such as computational power which may affect the generation of proofs. Computational Resource Capacity (CPC) = 1 if resources are adequate to generate the proof successfully, CPC = 0 if not
  3. Impact (Cost) of No Proof (INP):
    The cost of not generating a valid zk-proof for a block can impact the overall security and efficiency of the network, and can be measured as a combination of various factors, such as security cost, data integrity loss, increased computational cost, loss of trust or reputation cost.

    Objective function:

    • Minimize Impact of No Proof = W1 * Security Risk + W2 * Data Integrity Loss + W3 * Computational Costs + W4 * Loss of Trust
      Where:
      • Wi​ are weight coefficients that determine the importance of each cost component. These weights can be assigned based on the system's priorities.
      • Security Risk quantifies the risk of security breaches due to invalid proofs.
      • Data Integrity Loss measures the impact on data integrity caused by missing or invalid proofs.
      • Computational Costs represent the increased costs of computation in case of an invalid proof.
      • Loss of Trust captures the loss of trust from users and validators when valid proofs are not consistently generated.

    Constraints:

    • Transaction Processing Constraint: Ensure that all transactions within a block are correctly processed and included in the proof generation process.
    • Security Constraint: The security constraint may involve checks to ensure the absence of fraudulent or malicious transactions.
    • Computational Constraint: Maximize the efficiency of having a valid proof generated by another prover and minimize waste in the additional computation
    • Trust and Reputation Constraint: Maintain trust in the system and prevent reputation damage. This constraint may involve setting thresholds for trust levels and reputation metrics and ensuring that they are not breached.
  4. Risk of Network Failure (RNF):
    The objective is to minimize the risk of network failure due to dishonest provers. We can define a risk metric, R, that captures the impact of dishonest provers on network availability or security.

    Objective function:

    • Minimize R = W1 * CapacityCompromised + W2 * RecoveryCost + W3 * TrustLoss
      • Where Wi is the weight of each parameter

    Constraints:

    • Distributed Validation: Increase the number of provers in the network to reduce the impact of dishonest provers and distribute the risk across multiple entities.
    • Redundancy: Implement redundancy mechanisms to ensure that no single prover or validator has critical control over the network.
    • Incentive Mechanisms: Design incentive/collateral mechanisms that penalize dishonest behavior, increase the cost of attack, and reward honest provers.
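
As referenced above, here is a minimal Python sketch of the security objectives. The constraint checks mirror the listed constraints, while the weights and component values are hypothetical placeholders.

```python
# Illustrative sketch of PVR, INP and RNF from the list above.

def proof_validity_ratio(vzkp: int, tzkp: int) -> float:
    """PVR = VZKP / TZKP, enforcing the listed constraints."""
    assert tzkp > 0, "Proof Generation Constraint: TZKP > 0"
    assert vzkp <= tzkp, "Validity Constraint: VZKP <= TZKP"
    return vzkp / tzkp

def weighted_cost(weights, components):
    """Shared form of INP and RNF: a weighted sum of cost components."""
    return sum(w * c for w, c in zip(weights, components))

print(proof_validity_ratio(vzkp=98, tzkp=100))  # 0.98

# INP = W1*SecurityRisk + W2*DataIntegrityLoss
#     + W3*ComputationalCosts + W4*LossOfTrust
inp = weighted_cost([0.4, 0.3, 0.2, 0.1], [0.2, 0.1, 0.5, 0.3])

# RNF: R = W1*CapacityCompromised + W2*RecoveryCost + W3*TrustLoss
rnf = weighted_cost([0.5, 0.3, 0.2], [0.1, 0.4, 0.2])
print(inp, rnf)  # 0.24 0.21
```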

Sybil Attack Resistance

The mechanism should be resistant to Sybil attacks, where an attacker creates many pseudonymous identities to increase their chances of being selected.

  1. Risk of Sybil Attacks (RSA):
    This metric measures the risk of a Sybil attack, with an emphasis on prover identity. The goal is to minimize the Risk of Sybil Attacks (RSA); a code sketch follows the list below.

    Objective Function:

    • Minimize RSA = W1 * IV + W2 * RS + W3 * SC + W4 * PBA
      • Where Wi is the weight of each of the below constraints.

    Constraints:

    • Identity Verification (IV): A proof of unique identity (e.g.: an identity token) and identity verification should be implemented for the participating provers.
      • IV = 1 if identity is verified, IV = 0 if not
    • Reputation Score (RS): A reputation score to be assigned to each prover (0 ≤ RS ≤ 1) based on their behavior and previous reliability and performance. The higher the reputation, the lower the risk of Sybil attacks:
      • RS = f(prover_behavior)
    • Stake or Collateral (SC): Provers must stake or deposit collateral, which can be a fixed amount or a proportion of their assets. This provides an economic protection against Sybil attacks:
      • SC = 1 if collateral is staked, SC = 0 if not
    • Prover Behavior Auditing (PBA): Periodically audit prover identities and behaviors:
      • PBA = 1 if auditing is performed, PBA = 0 otherwise
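
The sketch below illustrates one possible encoding of the RSA objective. Note that higher IV/RS/SC/PBA values indicate stronger protections, so this version scores the absence of each protection as risk; the exact encoding is an assumption on our part.

```python
# Hypothetical encoding of the Risk of Sybil Attacks (RSA) objective.

def risk_of_sybil_attacks(weights, iv, rs, sc, pba):
    """RSA as a weighted sum of protection gaps: each term contributes
    risk when the corresponding protection is missing or weak."""
    w1, w2, w3, w4 = weights
    return (w1 * (1 - iv)       # identity not verified
            + w2 * (1 - rs)     # low reputation score
            + w3 * (1 - sc)     # no stake/collateral posted
            + w4 * (1 - pba))   # no behavior auditing

# A prover with verified identity, reputation 0.8, staked collateral,
# and periodic auditing in place.
print(risk_of_sybil_attacks([0.3, 0.3, 0.2, 0.2], iv=1, rs=0.8, sc=1, pba=1))
# -> 0.06: only the reputation gap contributes risk
```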

My teammates, Nilu and Rachit, worked on quantifying the rest of the criteria mentioned further above in the "Design criteria" section including decentralization, liveness, competition, incentives and cost: the metrics can be found here.

Outlining a possible prover model

At this point of our project, with the aim of drafting an initial version of a possible prover mechanism, all three of us created our own prover designs that could be considered for integration into our simulation. My initial design with criteria and assumptions is published here.

Towards our optimal prover design

After thorough discussions with Barnabé as well as my team members, we agreed to prioritize the following criteria for our prover strategy:

  • Decentralization
  • Cost
  • Liveness
  • Incentives/Honest behaviour

At this point we started to work in parallel. As they are proficient in Python and Jupyter, my teammates Nilu and Rachit started to prepare the code for the simulation.

In the meantime we discovered that the Aztec team published a Request for Proposal for decentralized prover coordination. We decided to submit our prover model as a proposal.

Based on all the research our team had done over the past months, I prepared a draft proposal:

We extensively reviewed Fernet, the sequencer model the Aztec team selected, as well as the other proposals for prover coordination that were already submitted to the RFP.

Our final prover mechanism

We continued to work on the initial proposal, removed or simplified certain elements, and also introduced new features. Please see details of our final prover mechanism below.

In summary, our decentralized prover strategy is based on:

  • Staking (& slashing) to create financial interest and promote honest behavior,
  • Prover reputation scores to promote reliability and soft competition, and maximize liveness,
  • Random prover selection from provers with top 25% reputation.

An emergency mechanism has been added to handle prover failure and minimize reorgs (a sketch of the selection and race logic follows below):

  • N provers are selected to generate a proof in a proof race,
    • The compute waste from the proof race can be considered network redundancy that maximizes liveness and minimizes the time/cost lost due to prover failure.
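
As referenced above, here is a minimal sketch of the selection logic: random selection from the top 25% of provers by reputation, with an N-prover proof race as the emergency path. The data structures and the reputation scale are assumptions for illustration.

```python
# Hypothetical sketch of reputation-based prover selection with an
# emergency proof race, per the mechanism summary above.
import random

def top_quartile(provers):
    """Provers whose reputation places them in the top 25% of the pool."""
    ranked = sorted(provers, key=lambda p: p["reputation"], reverse=True)
    cutoff = max(1, len(ranked) // 4)
    return ranked[:cutoff]

def select_prover(provers, rng=random):
    """Normal mode: uniform random pick from the top-reputation quartile."""
    return rng.choice(top_quartile(provers))

def emergency_provers(provers, n, rng=random):
    """Emergency mode: N provers race to generate the proof."""
    return rng.sample(provers, min(n, len(provers)))

pool = [{"id": i, "reputation": rep}
        for i, rep in enumerate([0.95, 0.9, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2])]
print(select_prover(pool))           # one of the two top-quartile provers
print(emergency_provers(pool, n=3))  # three racing provers
```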

Other added features:

  • Proof batching,
  • Distributed proving,
  • Integrating liquid proving pools

Our final proposal submitted to the Aztec team can be found here:

[Proposal] Decentralized Prover Network (Staking, Reputations and Proof Races)

It feels really amazing that the timing of our research on decentralized prover models matched the Aztec RFP, giving us the chance to submit our own proposal, which may influence the final prover mechanism the Aztec team is designing.

Outstanding items and future work

In the last weeks of the cohort our focus shifted to the Aztec proposal; the simulation is therefore still in its early phase and remains to be completed.

Through the simulation we aim to analyze how our mechanism behaves under different conditions: transaction load, normal versus emergency mode, different numbers of active provers in the network, etc.

Conclusions and self-evaluation

I am very grateful for the opportunity to do this research with the Ethereum Protocol Fellowship. It is an amazing opportunity for everyone to enhance their knowledge and contribute to Ethereum and its ecosystem.

With this project I have built extensive knowledge of the prover designs in the existing zk-rollup landscape, as well as of the current research directions.

The analysis showed that the vast majority of zk-rollups use centralized provers; Taiko turned out to be the only decentralized network among the L2s researched. I have been in touch with the core teams of different rollups, and it is clear that all of them are still evaluating possible approaches and considering the mechanism best suited to their protocol. There is clearly no solution-to-rule-them-all; it is all about finding the right balance of advantages and trade-offs based on each protocol’s priorities.

What is really exciting for me is that research on decentralized sequencers and provers is a fresh, vibrant and constantly evolving space where new ideas are born every week (if not every day). It is also amazing to see protocols working and building hand in hand with the community, engaging in discussions, issuing requests for proposals, and open-sourcing existing solutions.

I will definitely keep watching this dynamic landscape to see how prover mechanisms evolve, and which solutions become mainstream.