---
title: "The Hitchhiker's Guide to P2P Overlays in Ethereum Consensus"
disqus: hackmd
---

**The Hitchhiker's Guide to P2P Overlays in Ethereum**

Co-written by Louis Thibault and Dan Marzec

## Abstract

>The p2p topology of the beacon network is uncommon knowledge. In this post we dissect its underlying components and review some of the subtleties and optimizations at play. Using this knowledge, we then walk through the specifics of the beacon chain's instantiation. If anything, consider this a poor man's GossipSub + consensus spec explainer.

<center>
<img src="https://hackmd.io/_uploads/rJOUvPBr3.png" width="450" height="450">
</center>

## Table of Contents
[TOC]

## Motivation

Broadly speaking, there are two fields of study involved in the Ethereum consensus layer (CL): crypto-economics and distributed systems. Within the latter, the study of peer-to-peer (P2P) overlays plays a crucial role in the pre-consensus phases of Ethereum's CL protocol, Gasper Proof-of-Stake. It is our impression, however, that this fascinating topic is under-appreciated in most resources. Recently, an [equivocation attack](https://hackmd.io/@dmarz/total-eclipse-of-the-relay) on [mev-boost](https://github.com/flashbots/mev-boost) relays showed that the community was partially blind to a class of attacks waged through the p2p layer. It is our objective to offer a partial remedy to this blind-spot by means of the present article: an introduction to the P2P overlay networks that underpin Ethereum's CL.

Let us begin by situating P2P overlays within the Ethereum network. The Ethereum network is in reality two distinct networks, each with its own networking stack: the execution layer (EL) and the consensus layer (CL). The [devp2p](https://geth.ethereum.org/docs/tools/devp2p) project provides the EL's networking stack, which is responsible for propagating the latest execution states, themselves derived from transactions confirmed in prior blocks. In contrast, the CL employs a more advanced P2P networking library called [libp2p](https://libp2p.io), whose hardened, performant network primitives are more suitable for the delicate and high-stakes work of coordinating the validators that secure the network.

As its name implies, the consensus layer (CL) is responsible for maintaining some form of distributed consensus. Specifically, it maintains Byzantine fault-tolerant consensus over batches of computation in the execution layer. At the risk of trivializing the Gasper Proof-of-Stake protocol, all distributed consensus eventually boils down to an exercise in message-passing. CL nodes must exchange messages and apply some kind of logic over these data such that all participants eventually converge on a single, totally-ordered, linked-list of transactions: a *blockchain*. Understanding this in detail is a two-part inquiry into

1. how these messages are efficiently and reliably shared between nodes; and,
2. how this total ordering is achieved in the presence of Byzantine faults.

To this end, we begin with a basic review of the structure and function of P2P overlays, progressively introducing the main layers of optimization at play in Ethereum's CL. From there, we conduct a brief survey of the various overlay networks in Ethereum's Consensus Layer (CL), emphasizing their function, limitations, and open research questions.
## The Structure and Function of Peer-to-Peer Overlays

On the Ethereum beacon network, peers need to broadcast several data types to each other: blocks, attestations, *etc*. How is this achieved? And in particular, how is this achieved in a fully peer-to-peer manner?

Libp2p provides a general-purpose broadcast primitive called "pubsub". The name is derived from the fact that this broadcast primitive is defined via its API, which in turn defines a "publish" and a "subscribe" method; nodes can "publish" a message to a namespace (called a "topic"), as well as "subscribe" to topics, which allows them to receive messages published by others. Crucially, this API is implementation-agnostic, and there exist many strategies for routing the messages behind the scenes.

### Floodsub and Randomsub

The simplest strategy is called "flooding" (or "floodsub" in libp2p parlance). [Flooding](https://en.wikipedia.org/wiki/Flooding_(computer_networking)) was a popular strategy in the file-sharing networks of the early 2000s, mostly due to its extreme simplicity and resilience. In a flooding network, routing is achieved by having each node forward each incoming message to the remainder of its peers. Locally, each node maintains a set of previously-seen messages and refrains from forwarding duplicate messages.

One immediate problem with this construction: flooding involves significant duplication of messages. How can we do better?

The first and most obvious optimization is colloquially known as ["randomsub"](https://dl.acm.org/doi/abs/10.1145/945506.945507). With this method, nodes route a message by selecting `f` of the peers they are connected to, at random, and forwarding the message only to those `f` peers. Larger values of `f` (called the "fanout factor") result in faster message-propagation (up to flooding). A second factor, called `v` (or "view size"), determines the maximum number of peer-connections (called "neighbors") to maintain. Interestingly, `v` exerts no influence on the speed at which messages propagate. Instead, higher values of `v` increase the resiliency of the network, regardless of `f` (provided `f` is at least the natural log of the overall network size). This allows networks to trade a small amount of latency for a significant decrease in bandwidth consumption, without threatening the reliability of the overlay network as a whole.
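To make the forwarding rule concrete, here is a minimal, illustrative sketch. The `RandomsubNode` class and the `send` callback are our own stand-ins; a real implementation would also handle serialization, peer churn, and view maintenance.

```python
import random

class RandomsubNode:
    """Toy randomsub router: forward each new message to f random neighbors."""

    def __init__(self, neighbors, fanout):
        self.neighbors = list(neighbors)  # active connections, bounded by the view size v
        self.fanout = fanout              # f: number of peers to forward each message to
        self.seen = set()                 # IDs of previously-seen messages

    def on_message(self, msg_id, payload, send):
        if msg_id in self.seen:           # same duplicate-suppression as flooding
            return
        self.seen.add(msg_id)
        # Unlike flooding, forward to only f randomly-chosen neighbors.
        sample = random.sample(self.neighbors, min(self.fanout, len(self.neighbors)))
        for peer in sample:
            send(peer, msg_id, payload)
```

Note how `v` (the size of `neighbors`) never appears in the forwarding path: it only determines how many candidates `random.sample` can draw from, which is exactly why it affects resiliency but not propagation speed.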
Randomsub is used in Ethereum's execution layer (EL), but not its CL. Why? For one, randomsub still faces the same problem as flooding; the same redundancy that gives rise to excellent resiliency properties also consumes a significant chunk of bandwidth. While the severity of this issue is much reduced, its presence is still felt at the practical level. But equally important is the fact that randomsub picks peers *at random*. That is, it does not pick the "best" peers, which would typically be a mix of those with low latency and those that improve connectivity to other peers. Ideally, in fact, each node would pick only a handful of peers in a way that allows the network to self-organize into a [minimum-spanning tree](https://en.wikipedia.org/wiki/Minimum_spanning_tree). This is precisely what the CL achieves via libp2p's GossipSub protocol.

### GossipSub overlays

Going from an understanding of randomsub to understanding GossipSub is a three-step process. We must understand

1. hybrid overlays;
2. how these can be made to self-optimize their peerings; and,
3. how the GossipSub protocol uses these optimized overlays to present a p2p publish/subscribe API.

Let us examine each step in turn.

#### Hybrid Overlays

The term "hybrid overlay" refers to systems that layer two separate overlay networks, called the "passive" and "active" overlays. Hybrid overlays first made an appearance in an optimization over randomsub called HyParView (a portmanteau of *hybrid partial view*). In effect, HyParView takes advantage of the orthogonality of `f` and `v` to produce networks with resiliency equivalent to flooding. The strategy is to select `f` peers, connect to them, and use them to receive/forward all messages. This is the "active" overlay. Simultaneously, the "passive" overlay is constructed by maintaining a bounded view (of size `v`) of peer addresses. It is important to emphasize that the passive overlay does not maintain any long-lived connections to its peers, and instead takes the form of a bounded pool of addresses that can be used to replace broken connections in the active overlay.

The construction and maintenance of these overlays is beyond the scope of this paper, but is straightforward and well-documented by the [HyParView paper](https://ieeexplore.ieee.org/abstract/document/4272993). In brief, the two overlays emerge as a consequence of each peer maintaining two distinct "views" of the network. The passive overlay is constructed through repeated sampling of the network via random walks of a predetermined length, the endpoint of which is added to the "passive view". Once admitted, passive-view peers are periodically polled for liveness and evicted if unresponsive. The active view is simultaneously constructed by repeatedly sampling from the passive view, until size `f` is achieved. The net outcome of HyParView's architecture is a network that is capable of repairing itself to the same degree as a flooding network, while utilizing fewer resources (memory, network I/O, etc.). For our purposes, however, the important feature is the distinct behaviors of the active and passive overlays, which we summarize in the table below.

|                       | Active                | Passive                           |
|-----------------------|----------------------:|----------------------------------:|
| **Size**              | Small                 | Large                             |
| **Contents**          | Peer Connections      | Peer Addresses                    |
| **Use**               | Transporting messages | Replacing broken connections      |
| **Messaging Pattern** | Push                  | Request/Response (liveness check) |

Note that HyParView does not improve the message-propagation properties of the network. It is only concerned with decoupling `f` and `v` to recuperate some of the robustness that is lost in the transition from flooding to randomsub. In the next subsection, we will show how the hybrid-overlay architecture can be extended to improve the actual broadcast properties of the network.
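The division of labor between the two views can also be summarized in a short sketch. Here `ping` and `connect` are hypothetical network primitives, and the real protocol additionally handles joins, shuffles, and priority rules that we elide.

```python
import random

class HyParViewNode:
    """Toy model of HyParView's active/passive views."""

    def __init__(self, f, v):
        self.f = f            # target size of the active view (live connections)
        self.v = v            # bound on the passive view (addresses only)
        self.active = set()   # connections used to push/receive all messages
        self.passive = set()  # backup addresses; no open connections

    def on_random_walk_end(self, addr):
        # Random walks through the network terminate here; remember the endpoint.
        if addr not in self.active and len(self.passive) < self.v:
            self.passive.add(addr)

    def check_liveness(self, ping):
        # Periodically poll passive peers and evict unresponsive ones.
        self.passive = {addr for addr in self.passive if ping(addr)}

    def repair_active(self, connect):
        # Promote passive peers to replace broken active connections.
        while len(self.active) < self.f and self.passive:
            addr = random.choice(tuple(self.passive))
            self.passive.discard(addr)
            if connect(addr):
                self.active.add(addr)
```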
#### Self-Optimizing Hybrid Overlays

Random overlays derive their robustness from the fact that their network topology is both sparse (each node is connected to a relatively small set of peers) and well-connected, *i.e.* a large number of connections must fail in order for the network to become partitioned. This well-connectedness in turn stems from a fundamental property of uniformly random graphs: uniformly random peerings. Unfortunately, this uniformity is at odds with efficiency. A node located in New York is equally likely to connect to a faraway peer in Tokyo as a nearby one in Boston.

Ideally, we would like to preserve the uniform random distribution of the passive overlay (since it is responsible for ensuring robust connectivity) while minimizing the latency between peers in the active overlay. The [Plumtree algorithm](https://link.springer.com/chapter/10.1007/978-0-387-09751-0_29) (a portmanteau of *Push-Lazy-pUsh Multicast Tree*) achieves this through coordination of the active and passive views. At its heart, Plumtree is a quest to turn an initially random graph--the active overlay--into a [minimum spanning tree](https://en.wikipedia.org/wiki/Minimum_spanning_tree). Doing so is a matter of detecting and pruning redundant connections between peers, while ensuring the graph remains strongly connected. The intuition behind Plumtree is that duplicate messages are indicative of redundant paths, while missed messages are indicative of insufficient connectivity.

From there, Plumtree is simple enough as to appear almost trivial in hindsight. When a node receives a duplicate message, it terminates the connection to the peer from which it was received, and adds that peer's address to the passive view. Meanwhile, extra metadata is piggybacked onto the periodic liveness checks that are performed in the passive view. This metadata consists of the identifiers of recently-seen messages (often a monotonically increasing sequence number, or a message hash), and serves a dual purpose. Firstly, it allows the receiving peer to request the full payload of any message it may have missed. Secondly, it indicates that a faster path exists for a subset of messages. The response to a missed-message event is to establish a connection with the corresponding peer, and promote it to the active view.

Surprisingly, perhaps, this is all that is needed for the network to converge on a minimum spanning tree. In practice, the only additional requirement is to prevent connection churn resulting from spurious duplicates. Several strategies exist, the most popular of which is to set a threshold for duplicate/missed-message events within some unit of time (often several seconds), and to trigger a response only when this threshold is exceeded. Other strategies exist, but all are essentially concerned with smoothing out connection upgrades and downgrades (also called "grafts" and "prunes").
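The two reactions can be expressed in a handful of lines. The sketch below is a simplification under stated assumptions: `demote`, `promote`, and `request` are hypothetical hooks into the HyParView views, and the thresholding that smooths out spurious grafts/prunes is omitted.

```python
class PlumtreeNode:
    """Toy Plumtree reactions to duplicate and missed messages."""

    def __init__(self):
        self.seen = set()  # IDs of messages received in full ("eager" pushes)

    def on_eager_push(self, msg_id, sender, demote):
        if msg_id in self.seen:
            # Duplicate => redundant path: prune this edge from the active
            # view (drop the connection, keep the address passively).
            demote(sender)
        else:
            self.seen.add(msg_id)  # first delivery: keep this edge eager

    def on_lazy_push(self, msg_id, sender, promote, request):
        # Liveness checks piggyback recently-seen message IDs ("lazy" pushes).
        if msg_id not in self.seen:
            # Missed message => a faster path exists through `sender`:
            request(sender, msg_id)  # fetch the full payload we missed
            promote(sender)          # graft the peer into the active view
```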
#### P2P PubSub with GossipSub

Returning to libp2p's PubSub API, we might ask how Plumtree can be extended to provide topic-oriented broadcast semantics. This is the role of GossipSub, a Plumtree-based protocol. In essence, GossipSub maintains a separate Plumtree overlay for each topic. This helps the topologies stay small, adding to their performance. It is also worth noting that because peers can dynamically join and leave topics, they can also generate a dynamic stream of topic names, and periodically migrate to the next topic in the stream. A simple mechanism for generating such a stream is for the topic name `t1` to be defined as `hash(t0)`, for example. The actual joining and leaving of topics is performed by sending [`subscribe` and `unsubscribe`](https://docs.libp2p.io/concepts/pubsub/overview/#subscribing-and-unsubscribing) messages to peers, and is detailed in the libp2p documentation designated by the prior hyperlink.

In addition to providing a topic-oriented API, GossipSub adds a handful of sub-protocols that improve its suitability for use in large public networks. In particular:

1. GossipSub v1.0 adds [control-message piggybacking](https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.0.md#control-message-piggybacking).
2. GossipSub v1.1 adds various robustness improvements, most notably:
    - [Peer Scoring](https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.1.md#peer-scoring)
    - [Peer Exchange](https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.1.md#prune-backoff-and-peer-exchange)

We leave the study of these subprotocols as an exercise for the reader. In the next section, we will instead enumerate some of the GossipSub topics at play in the Ethereum consensus layer.

## The Structure and Function of Ethereum Consensus

:::info
This portion rests largely on the shoulders of giants: the [🔗consensus spec](https://github.com/ethereum/consensus-specs/), [🔗annotated spec](https://github.com/ethereum/annotated-spec), [🔗phase 0 for humans](https://notes.ethereum.org/jDcuUp3-T8CeFTv0YpAsHw?view), and [🔗eth2book](https://eth2book.info/).
:::

While the finer details of the Ethereum CL fall outside the scope of this document, a quick explanation is nonetheless necessary. Unlike the execution layer, the rules of Ethereum's consensus protocol have a direct impact on its topology.

Consensus is performed by a set of $N$ validators $V = \{V_1, \dots, V_N\}$, with $N \leq$ `MAX_VALIDATOR_SET_SIZE`, each of which has previously staked a quantity of ETH $w(V_i)$ as collateral to incentivize "[honest behavior](https://github.com/ethereum/consensus-specs/blob/dev/specs/phase0/validator.md#phase-0----honest-validator)". Validators work together to come to consensus on what is conceptually two different ledgers: one dynamically available ledger, and one finalized ledger that is a prefix of the dynamically available ledger. Another way to say this is that Gasper exhibits the ["ebb-and-flow" property](https://arxiv.org/pdf/2009.04987.pdf), meaning it is:

> a flexible protocol that supports two types of clients: conservative clients who favor finality and want to be safe under network partition, and more aggressive clients who favor availability and want to be live under dynamic availability. A conservative client will only trust a finalized ledger, which is a prefix of a longer ledger believed by a more aggressive client. The finalized ledger falls behind the available ledger when the network partitions, but catches up when the network heals. This ebb-and-flow property avoids a system-wide determination of availability versus finality and instead leaves this decision to the clients.

<img src="https://hackmd.io/_uploads/SkXmAOTHh.png" width="550" height="250">
<img src="https://hackmd.io/_uploads/SJXmCO6Sh.png" width="550" height="250">

Both diagrams from [here](https://www.youtube.com/watch?v=2nFMfN8aaIA).

### Validator Duties

The ultimate goal of running validator software is to make money, or operate a public good, or some mix of the two. In order to maximize profit, a validator must behave honestly, which means performing specific duties. These duties are assigned randomly using controlled deterministic-shuffling:

>The beacon chain shufflings are designed to provide a minimum of 1 epoch lookahead on the validator's upcoming committee assignments for attesting dictated by the shuffling and slot. Note that this lookahead does not apply to proposing, which must be checked during the epoch in question.
- [consensus spec](https://github.com/ethereum/consensus-specs/blob/dev/specs/phase0/validator.md#lookahead)
More on RANDAO and the use and construction of randomness on the beacon chain can be found [here](https://eth2book.info/capella/part2/building_blocks/randomness/).

When reading through the [consensus spec](https://github.com/ethereum/consensus-specs/blob/dev/specs/phase0/validator.md) and the [original paper](https://arxiv.org/pdf/2003.03052.pdf), there are different references to the number of core validator duties. Here we break them down into two categories, **explicit** and **implicit**.

#### Explicit

From the [phase 0 consensus spec](https://github.com/ethereum/consensus-specs/blob/dev/specs/phase0/validator.md):

> A validator has two primary responsibilities to the beacon chain: proposing blocks and creating attestations. Proposals happen infrequently, whereas attestations should be created once per epoch.

So we have two primary validator duties:

- **proposing blocks**
- **creating attestations**

*Proposing blocks* is needed to keep extending the chain and including new transactions. *Attestations* are used to come to agreement on what the "finalized" set of blocks is. You can think of attestations as votes for the finalized prefix chain mentioned above.

![](https://hackmd.io/_uploads/SkQSOkjS2.png)

#### Implicit

- **aggregating attestations**

[BLS12-381 signatures](https://hackmd.io/@benjaminion/bls12-381#fn1) allow us to combine multiple attestations into one compact attestation of attestations. Aggregating reduces verification time on large numbers of attestations. Further on, we will explain how the protocol randomly assigns aggregation duties.

In the [altair consensus spec](https://github.com/ethereum/consensus-specs/blob/dev/specs/altair/validator.md#sync-committee) a new duty is introduced:

- **sync light clients**

> The beacon chain is designed to be light client friendly for constrained environments to access Ethereum with reasonable safety and liveness. Such environments include resource-constrained devices (e.g. phones for trust-minimized wallets) and metered VMs (e.g. blockchain VMs for cross-chain bridges).

After the altair hardfork it became part of a validator's role to help ensure light clients can verify recent blocks to sync with. [Here is an example rust light client using sync committees](https://github.com/a16z/helios).

### Joining, Exiting, and Honesty

To enter the set, a validator deposits 32 ETH and is responsible for keeping this balance above 16 ETH. At present a balance can only be attributed to a single pubkey, but [there are proposals to remove this cap](https://notes.ethereum.org/@mikeneuder/increase-maxeb). Exiting is a similar process to entering, and both require a message to be sent over the beacon network. An overview of [this process](https://eth2book.info/capella/part2/deposits-withdrawals/) can be seen below:

![](https://hackmd.io/_uploads/SJpIpRiH2.png)

Since validators accept and count messages based on the current known validator set, the ability to exit creates a class of attacks known as "[Long Range Attacks](https://blog.ethereum.org/2014/05/15/long-range-attacks-the-serious-problem-with-adaptive-proof-of-work)". The solution to these attacks is weak subjectivity checkpoints, which you can [read more about here](https://ethereum.org/en/developers/docs/consensus-mechanisms/pos/weak-subjectivity/).

Validators fulfilling their assigned responsibilities and coordinating with their peers are deemed "honest".
However, the term "honest" might be misleading in security and performance analysis, given that even these validators may experience occasional downtime. A key advantage of the protocol is its dynamic availability, which ensures ongoing operation despite network partition. Alongside this, the protocol is also designed to handle instances of validators acting maliciously, such as proposing two blocks, or issuing two attestations for a particular chain view. To enforce appropriate or "honest" behavior, the protocol incorporates "slashing conditions": rules that an honest validator would never violate, and whose violation can be provably detected. If these conditions are witnessed, Proposer and Attester Slashing events are triggered. Such events are initiated by an honest validator who submits the conflicting signatures to the network.

## Ethereum Overlays

As you may have guessed, all coordination and communication relating to validator duties is done through a collection of GossipSub topics (read: a collection of overlays) called the "beacon network". We detail the structure and function of these topics below.

![](https://hackmd.io/_uploads/rJthP2TH2.png)

From this artist's rendering of the beacon network, we can see that each node maintains an aggregate set of peers across all protocols (including GossipSub) in which it participates. GossipSub performs unicast message exchange with peers in this "peer store" in order to perform high-level topic joining/leaving operations. In the layers above the peer store, we see how peers are grouped into various topics (the colored nodes). Each topic carries messages relating to a specific duty.

### GossipSub Topics

#### Global topics

There are six topics on the beacon network which all validators ***must*** be subscribed to in order to perform their duties:

###### `beacon_block`

This is the core data type that our validator and node set are coming to consensus upon.

```python
class BeaconBlock(Container):
    slot: Slot
    proposer_index: ValidatorIndex
    parent_root: Root
    state_root: Root
    body: BeaconBlockBody
```

```python
class BeaconBlockBody(Container):
    randao_reveal: BLSSignature
    eth1_data: Eth1Data  # Eth1 data vote
    graffiti: Bytes32  # Arbitrary data
    # Operations
    proposer_slashings: List[ProposerSlashing, MAX_PROPOSER_SLASHINGS]
    attester_slashings: List[AttesterSlashing, MAX_ATTESTER_SLASHINGS]
    attestations: List[Attestation, MAX_ATTESTATIONS]
    deposits: List[Deposit, MAX_DEPOSITS]
    voluntary_exits: List[SignedVoluntaryExit, MAX_VOLUNTARY_EXITS]
    sync_aggregate: SyncAggregate  # [New in Altair]
```

Two closely related types are the `BeaconBlockHeader` (the block with its `body` replaced by a `body_root`) and the `BeaconBlockBody`, both of which are a crucial part of the blinded beacon block construction on Ethereum (which enables PBS). A signed `BeaconBlockHeader`, originating at the proposing validator's node, is passed around and used as a commitment to proposing the beacon block. Committing to two states of the world (*i.e.* two `BeaconBlock`s for a single slot) is a slashable offense.

###### `beacon_aggregate_and_proof`

Aggregate proofs are generated to reduce verification time and overlay traffic; the need for this functionality is expanded on further in this document.

```python
class AggregateAndProof(Container):
    aggregator_index: ValidatorIndex
    aggregate: Attestation
    selection_proof: BLSSignature
```
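The `selection_proof` field is what makes the aggregator role verifiably random: it is the validator's BLS signature over the slot, and the validator is an aggregator only if the hash of that signature clears a modulus derived from the committee size. The check, as defined in the [phase 0 validator spec](https://github.com/ethereum/consensus-specs/blob/dev/specs/phase0/validator.md):

```python
def is_aggregator(state: BeaconState, slot: Slot, index: CommitteeIndex, slot_signature: BLSSignature) -> bool:
    committee = get_beacon_committee(state, slot, index)
    modulo = max(1, len(committee) // TARGET_AGGREGATORS_PER_COMMITTEE)
    return bytes_to_uint64(hash(slot_signature)[0:8]) % modulo == 0
```

More on the aggregator role in the attestation aggregation section below.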
###### `voluntary_exit`

Both joining and exiting the network are tracked through the deposit and withdrawal contracts on Ethereum. Messages from validators stating their intent to exit the network are propagated over this topic.

```python
class VoluntaryExit(Container):
    epoch: Epoch  # Earliest epoch when voluntary exit can be processed
    validator_index: ValidatorIndex
```

###### `proposer_slashing` and `attester_slashing`

When a slashable offense or "equivocation" has been detected, anyone who has witnessed a commitment to two different world states, via signed header or attestation, combines the two commitments into one object and propagates it over these two topics to slash the "misbehaving" validator. The last user-submitted slashing can be seen [here](https://twitter.com/mikeneuder/status/1642899865288994816?s=20).

```python
class ProposerSlashing(Container):
    signed_header_1: SignedBeaconBlockHeader
    signed_header_2: SignedBeaconBlockHeader
```

```python
class AttesterSlashing(Container):
    attestation_1: IndexedAttestation
    attestation_2: IndexedAttestation
```

###### Attestation Subnets: `beacon_attestation_{subnet_id}`

From the [spec](https://github.com/ethereum/consensus-specs/blob/dev/specs/phase0/p2p-interface.md#attestations-and-aggregation):

>The correct subnet for an attestation can be calculated with compute_subnet_for_attestation. beacon_attestation_{subnet_id} topics, are rotated through throughout the epoch in a similar fashion to rotating through shards in committees (future beacon chain upgrade). The subnets are rotated through with committees_per_slot = get_committee_count_per_slot(state, attestation.data.target.epoch) subnets per slot.

There are currently `ATTESTATION_SUBNET_COUNT` = 64 attestation subnets.

```python
class Attestation(Container):
    aggregation_bits: Bitlist[MAX_VALIDATORS_PER_COMMITTEE]
    data: AttestationData
    signature: BLSSignature

class AttestationData(Container):
    slot: Slot
    index: CommitteeIndex
    # LMD GHOST vote
    beacon_block_root: Root
    # FFG vote
    source: Checkpoint
    target: Checkpoint
```
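The subnet rotation described above boils down to a small function, lightly adapted here from the consensus spec's `compute_subnet_for_attestation`:

```python
def compute_subnet_for_attestation(committees_per_slot: uint64,
                                   slot: Slot,
                                   committee_index: CommitteeIndex) -> uint64:
    """
    Compute the correct subnet for an attestation: committees are mapped
    onto the 64 subnets round-robin as slots advance through the epoch.
    """
    slots_since_epoch_start = uint64(slot % SLOTS_PER_EPOCH)
    committees_since_epoch_start = committees_per_slot * slots_since_epoch_start
    return uint64((committees_since_epoch_start + committee_index) % ATTESTATION_SUBNET_COUNT)
```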
###### Sync Subnets: `sync_committee_{subnet_id}`

`SYNC_COMMITTEE_SIZE` = 512 and `EPOCHS_PER_SYNC_COMMITTEE_PERIOD` = 256

```python
class SyncAggregate(Container):
    sync_committee_bits: Bitvector[SYNC_COMMITTEE_SIZE]
    sync_committee_signature: BLSSignature
```

### Committees

If the whole validator set were to attest simultaneously, the number of messages on the network would be unmanageable, and the extra resiliency it bought would be unnecessary. With a small trade-off in message availability we can lower the requirements of our nodes; thus the concept of committees is introduced. A committee is an array of `ValidatorIndex`es that is assigned to a specific task every certain number of slots. The point of committees is to split up the responsibilities among the validators. [Currently](https://github.com/ethereum/consensus-specs/blob/dev/specs/phase0/beacon-chain.md), `MAX_COMMITTEES_PER_SLOT` = `64`.

#### Beacon

The beacon committee is responsible for attesting to beacon blocks. The number of committees is defined via the constant `MAX_COMMITTEES_PER_SLOT`, which is currently set to `64`. Why are there 64? As Ben so astutely puts it [here](https://eth2book.info/altair/part2/building_blocks/committees/):

>The current beacon committee structure was strongly influenced by a previous roadmap that included in-protocol data sharding. That design is now deprecated, yet a remnant of it remains in our 64 beacon committees per slot. These were originally intended to map directly to 64 shards as "crosslink committees" but no longer have that function. Nonetheless, beacon committees still serve a useful purpose in parallelising the aggregation of attestations. Whether 64 remains the right number of committees per slot has not been analysed to my knowledge. The trade-off is that fewer beacon committees would reduce the amount of block space needed for aggregate attestations, but would increase the time needed for aggregators to do their work.

#### Sync

The sync committee is responsible for attesting to the validity and availability of blocks to reduce the burden on light clients. `SYNC_COMMITTEE_SUBNET_COUNT` = 4 and `TARGET_AGGREGATORS_PER_SYNC_SUBCOMMITTEE` = 16.

From the [altair annotated spec](https://github.com/ethereum/annotated-spec/blob/master/altair/sync-protocol.md#introduction):

>The sync committee is the "flagship feature" of the Altair hard fork. This is a committee of 512 validators that is randomly selected every sync committee period (~1 day), and while a validator is part of the currently active sync committee they are expected to continually sign the block header that is the new head of the chain at each slot. The purpose of the sync committee is to allow Light Clients to keep track of the chain of beacon block headers. The other two duties that involve signing block headers, block proposal and block attestation, do not work for this function because computing the proposer or attesters at a given slot requires a calculation on the entire active validator set, which Light Clients do not have access to (if they did, they would not be light!). Sync committees, on the other hand, are (i) updated infrequently, and (ii) saved directly in the beacon state, allowing Light Clients to verify the sync committee with a Merkle branch from a block header that they already know about, and use the public keys in the sync committee to directly authenticate signatures of more recent blocks.

### Attestation Aggregation

Attestations represent the bulk of traffic going through our overlays and are required to finalize the chain. A succinct explanation of the aggregator role can be found [here](https://ethresear.ch/t/view-merge-as-a-replacement-for-proposer-boost/13739):

> **Aggregators**: Within each subnet, validators broadcast their individual attestations. Since we don’t want all attestations to have to be gossipped to the whole network and processed by everyone, we aggregate them within a subnet, and only globally gossip the aggregates. Each subnet has some validators with a special role, called aggregators, which are randomly (self-)elected in a verifiable way. The expected number of aggregators per subnet is TARGET_AGGREGATORS_PER_COMMITTEE = 16, sufficient to ensure a high probability of there being at least one honest aggregator in every subnet. Aggregators are tasked with collecting individual attestations from the subnet and creating an aggregate attestation, i.e. an attestation whose data is the same as all of the individual attestations and whose signature is the BLS aggregate signature of all of the individual signatures, i.e. their sum, verifiable with the aggregate public key, the sum of all public keys. To be precise, they only aggregate attestations which agree with their own, and this is the only incentive which they currently have to perform their aggregation duty, since it’s not rewarded in the Beacon Chain.

Additionally, block proposers are rewarded in proportion to the attestations they include in their block, and aggregators are rewarded similarly.
A visual explainer with 1 million validators, from [here](https://ethresear.ch/t/horn-collecting-signatures-for-faster-finality/14219), can be seen below.

![](https://hackmd.io/_uploads/B1WgIYTrn.png)

As shown above, attestations can be included in the block immediately following the slot being attested to. In an ideal world, all attestations would be aggregated and included in the next slot, but the reality is different. Below we can see a distribution of slot inclusion delays for attestations (note the `y-axis` is log scaled); while the bulk are included 1 slot away, many are not. The tighter we can get these ranges of votes, the more interesting things we can do with the protocol.

![](https://hackmd.io/_uploads/BJUNZ4AS3.png)

In this light, aggregation plays 4 important roles for our network:

- reduce network load on the `beacon_attestation_{subnet_id}` subnet topics
- reduce network load on the global `beacon_aggregate_and_proof` topic
- reduce network load by using less blockspace for signatures
- reduce signature verification load on the next block proposer

One question you may be asking, seeing these benefits, is: "if aggregators are perfectly efficient at packing a committee's attestations into one signature, then why would the aggregate topic receive more traffic?". This, and the reality of production systems, highlight the underlying complexity in attestation aggregation and will help shed some light on its role in p2p traffic.

#### Attestation Aggregation Packing Problem

A very intentional decision made in the consensus layer is the use of the elliptic curve [BLS12-381](https://electriccoin.co/blog/new-snark-curve/), as it allows for signature aggregation. But there exist constraints that our choice of signature and validator-tracking schemes impose upon us. Let's walk through an example:

Given 4 validators $V_1, V_2, V_3, V_4$, which each sign $m_1, m_2, m_3, m_4$ to produce $Sig_1, Sig_2, Sig_3, Sig_4$ (each 96 bytes, for reference), we can aggregate the signature pairs $(Sig_1, Sig_2)$ and $(Sig_3, Sig_4)$:

$Sig_{a1} = Aggregate([Sig_1, Sig_2])$, and $Sig_{a2} = Aggregate([Sig_3, Sig_4])$.

Further, we can aggregate $Sig_{a1}, Sig_{a2}$ into:

$Sig_{a12} = Aggregate([Sig_{a1}, Sig_{a2}])$, creating an aggregate of aggregates.

This property of [pairing-friendly](https://electriccoin.co/blog/pairing-cryptography-in-rust/) curves allows us to scale verification of large amounts of signatures. But one nuance stops us from true theoretical efficiency: we cannot aggregate sets of signatures with overlaps, meaning

$Sig_{a} = Aggregate([Aggregate([Sig_1, Sig_2]), Aggregate([Sig_2, Sig_3])])$

will not produce valid results because ["the bitfield we use to track participation cannot represent the overlapping participants"](https://notes.ethereum.org/@hww/aggregation). This creates a nasty problem where validators must now iterate over all possible combinations of aggregates in order to produce the most profitable one.
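Both halves of this example can be reproduced with [py_ecc](https://github.com/ethereum/py_ecc), the reference BLS library used by the consensus specs. This is a toy sketch; the small integer secret keys are purely illustrative, and we assume all four validators sign the same attestation data.

```python
from py_ecc.bls import G2ProofOfPossession as bls

msg = b"attestation data root"            # all four validators attest to the same data
sks = [1, 2, 3, 4]                        # toy secret keys -- never do this in production
pks = [bls.SkToPk(sk) for sk in sks]
sigs = [bls.Sign(sk, msg) for sk in sks]  # 96 bytes each

sig_a1 = bls.Aggregate([sigs[0], sigs[1]])   # Sig_a1 = Aggregate([Sig_1, Sig_2])
sig_a2 = bls.Aggregate([sigs[2], sigs[3]])   # Sig_a2 = Aggregate([Sig_3, Sig_4])
sig_a12 = bls.Aggregate([sig_a1, sig_a2])    # an aggregate of aggregates

# A single check verifies all four signers at once.
assert bls.FastAggregateVerify(pks, msg, sig_a12)

# The curve happily sums overlapping sets too (Sig_2 is simply counted twice),
# but the participation bitfield cannot represent the double-counting, so the
# aggregate no longer verifies against the de-duplicated set of public keys.
overlap = bls.Aggregate([bls.Aggregate([sigs[0], sigs[1]]),
                         bls.Aggregate([sigs[1], sigs[2]])])
assert not bls.FastAggregateVerify(pks[:3], msg, overlap)
```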
#### Problem Reduction

Attestation Aggregation Packing (AAP), it turns out, is an NP-hard optimisation problem, mapping to the likes of [Maximum Clique Enumeration](https://www.sciencedirect.com/science/article/abs/pii/S0743731509000082), [Maximum Coverage](https://en.wikipedia.org/wiki/Maximum_coverage_problem), and [Set Cover](https://en.wikipedia.org/wiki/Set_cover_problem). Approximation algorithms are our only practical choice in dealing with this class of problems (a toy example of the greedy approach follows below). Unfortunately, the shortcomings of any implementation extend past resource use: a validator's attestation rewards increase proportionally with the number of attestations they include. Given the gravity of the scenario, Lighthouse put out a [formal analysis of the AAP problem](https://lighthouse-blog.sigmaprime.io/optimising-attestation-packing.html), concluding:

>"Given the near-optimal performance of the greedy algorithm in the majority of cases, we do not believe that attestation packing represents a significant centralisation risk for Ethereum. A block proposer in possession of a fast exact solver stands to produce blocks that are only marginally better than those produced by the greedy algorithm, and will only see benefits greater than 5% in a minute number of cases (0.03%)."
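For intuition, the greedy approach analysed above can be modelled in a few lines: treat each aggregate as a set of validator indices and repeatedly take whichever aggregate covers the most not-yet-included validators. This toy sketch ignores real-world details such as differing `AttestationData` and reward weighting.

```python
def greedy_pack(aggregates, max_slots):
    """Pick up to `max_slots` aggregates, maximizing covered validators."""
    chosen, covered = [], set()
    candidates = list(aggregates)
    for _ in range(max_slots):
        best = max(candidates, key=lambda agg: len(agg - covered), default=None)
        if best is None or not (best - covered):
            break                 # no remaining aggregate adds new validators
        chosen.append(best)
        covered |= best
        candidates.remove(best)
    return chosen, covered

# Example: three overlapping aggregates, room for only two in the block.
aggs = [frozenset({1, 2, 3}), frozenset({3, 4}), frozenset({4, 5, 6})]
chosen, covered = greedy_pack(aggs, max_slots=2)
assert covered == {1, 2, 3, 4, 5, 6}  # greedy happens to be optimal here
```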
#### Different Implementations

The post [attestation aggregation strategies](https://notes.ethereum.org/@hww/aggregation) does a great job explaining the approaches that were around during phase 0. Some more links on implementations:

- [Nimbus implementation twitter thread](https://twitter.com/jcksie/status/1384154639596101632)
- [Prysm eth research write up](https://ethresear.ch/t/attestation-aggregation-heuristics/7265)
- [More by Lighthouse](https://lighthouse-blog.sigmaprime.io/attestation-packing.html)

But as mentioned above, this view of attestations is limited by the current implementation/technology. Can we do better by upgrading the protocol?

#### All Roads Lead to SSF + ZK

As with all things (state growth, censorship, etc.), Single Slot Finality (SSF) and Zero-Knowledge (ZK) proofs play an important role in the future of attestation packing.

##### SSF

Right now only a subset of the validator set votes on each slot, and finalization takes multiple epochs; but if the entire validator set were to vote, we could reach finalization within two slots. Going further, with two rounds of voting in a single slot, we could finalize in one slot.

> this has multiple important consequences:
> **Better confirmation UX:** users (including exchanges, bridges) can quickly rely on economic finality.
> **Reorg prevention:** mitigating the possible destabilizing effects of MEV
> **Provable liveness:** it is much easier to guarantee that the consensus protocol will successfully finalize something if everyone is able to vote at once, because we only need to ensure short-term agreement on what needs to be finalized. For example, proposers can be used as a coordination point.
- [source](https://ethresear.ch/t/horn-collecting-signatures-for-faster-finality/14219)

Depicted in the [diagram](https://hackmd.io/@nibnalin/bls-multiplicity) below, doing a second round of attestations is easier than the first, as we can assume we have a disjoint set.

![](https://hackmd.io/_uploads/SJ2DbK6Sh.png)

The multi-round aggregation proposal with the most in-depth analysis can be found [here](https://ethresear.ch/t/horn-collecting-signatures-for-faster-finality/14219). And lastly, in "[Path towards single slot finality](https://notes.ethereum.org/@vbuterin/single_slot_finality#Good-news-extremely-large-validator-counts-are-more-possible-with-BLS-aggregation-than-we-thought)", Vitalik lists as one of the three main problems:

>**Determining the optimal aggregation strategy.** For as high an $N$ as possible, we want to aggregate and include signatures from $N$ validators into a block, with a level of node overhead that we are willing to accept.

##### ZK

All of the "succinctness" properties of zero-knowledge proofs can also be used for signature aggregation. From the [Horn](https://ethresear.ch/t/horn-collecting-signatures-for-faster-finality/14219) proposal mentioned earlier:

>It’s well known that you can design SNARKs that prove that legitimate signatures were aggregated correctly. However, such proving systems are too slow (on the prover side) for our use case. One way to work around this is to prove a simpler statement: That a provided public key is the result of aggregating certain public keys together with certain multiplicities. It’s possible to design such a custom proof system using Bulletproofs-like IPAs.

This proposal requires an additional `10s`, so it is unlikely to be incorporated soon in its current form, but it is interesting nonetheless in our journey of p2p optimization. Which is now over.