# BEAMSIM report
## Abstract
This report evaluates the performance of post-quantum signature aggregation for the Lean consensus mechanism using the BEAMSIM discrete event simulator. We model and compare two primary network topologies, Gossipsub and a structured Grid layout, to determine their efficiency in aggregating signatures from 8,192 validators within a target slot time of 4 seconds. Our simulations analyze the impact of key variables, including validator bandwidth, signature size, and aggregation rates.
The results demonstrate that the Grid topology consistently outperforms Gossipsub, reducing aggregation latency by 30-40% due to shorter message paths and fewer duplicates. With a 100Mbps bandwidth limit, the Grid topology achieves final SNARK creation in 3.2 seconds, compared to 3.9 seconds for Gossipsub.
However, both approaches fail to meet the ideal sub-2-second target required for robust block production. We identify validator bandwidth (specifically the 50Mbps proposed in EIP-7870) and the computational speed of SNARK creation as the primary bottlenecks.
The best results were achieved when using a grid topology exclusively among the aggregators, who receive the signatures of the other subnet validators directly. With this configuration and a 200 Mbps bandwidth limit, global aggregation was achieved in 2.3 seconds.
We conclude that significant optimizations and a potential increase in hardware requirements are necessary to make post-quantum signature aggregation viable for fast consensus.
## 1. Introduction
The transition to post-quantum (PQ) cryptography is essential for the long-term security of blockchain networks like Ethereum. However, this shift presents a significant performance challenge: PQ signatures are substantially larger than current elliptic-curve signatures, complicating the rapid collection and aggregation of thousands of attestations.
This report investigates the feasibility of large-scale PQ signature aggregation under realistic bandwidth constraints. Using the BEAMSIM simulator, we model a network of 8,192 validators to evaluate the performance of two distinct network strategies: the widely used Gossipsub protocol and a more deterministic Grid topology. The primary objective is to identify an optimal approach and quantify the key performance bottlenecks, such as validator bandwidth and the computational speed of SNARK generation.
This study first outlines the simulation methodology and network models. It then presents the empirical results for both local and global aggregation, followed by a discussion of their implications for the future of lean consensus.
## 2. Methodology
### 2.1 BEAMSIM simulator
BEAMSIM is an open-source simulator, available at https://github.com/qdrvm/beamsim/, developed to simulate post-quantum signature aggregation.
It is a Discrete Event Simulator (DES) implemented in C++ and built upon the [NS-3 library](https://www.nsnam.org). Unlike frameworks such as Shadow, NS-3 does not execute existing client binaries by intercepting system calls. Instead, with NS-3 all simulation logic must be implemented from scratch, resulting in a single executable binary. This approach was considered appropriate for the simulation because:
1. BEAM client implementations do not yet exist.
2. Our simulation focuses solely on the signature aggregation algorithm, a specific component of the future client.
BEAMSIM offers a trade-off between model accuracy and execution speed through its selection of backends:
* `ns3-direct`
* The default backend; it simulates real TCP connections, providing the most realistic results. However, it can be slow for large network simulations and may consume significant memory (~15GB RAM per 1024 nodes). It optionally supports parallel execution via MPI, running each subnet in a separate process.
* `queue`
* An alternative, lighter backend that sacrifices some accuracy by not simulating real TCP connections, prioritizing faster execution.
> In the results presented below the `ns3-direct` backend was used, while the `queue` backend was mostly used to quickly test different assumptions during development.
During simulation, no real signature generation or aggregation takes place. Instead, these events are modeled by introducing artificial delays into the program.
A simulation begins with validators gossiping their signatures into their subnets and ends when the first global aggregator produces a SNARK proving a supermajority of all validators' signatures.
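As an illustration of this approach, the sketch below (our own Python illustration, not BEAMSIM's C++ code) shows how a discrete event simulator models a compute-heavy step as a pure delay: with the default rate of 1000 signatures/second, aggregating 922 signatures simply schedules a completion event 0.922 simulated seconds later.
```
import heapq

AGGREGATION_RATE_PER_SEC = 1000  # default aggregation rate (see Section 2.4)

class Sim:
    """Minimal discrete event loop: events are (time, seq, callback) tuples."""
    def __init__(self):
        self.now, self.seq, self.events = 0.0, 0, []

    def schedule(self, delay, callback):
        heapq.heappush(self.events, (self.now + delay, self.seq, callback))
        self.seq += 1

    def run(self):
        while self.events:
            self.now, _, callback = heapq.heappop(self.events)
            callback()

sim = Sim()

def start_snark1(num_signatures):
    # No real proving happens: the cost is modeled as an artificial delay.
    delay = num_signatures / AGGREGATION_RATE_PER_SEC
    sim.schedule(delay, lambda: print(f"SNARK1 ready at t={sim.now:.3f}s"))

start_snark1(922)
sim.run()  # prints: SNARK1 ready at t=0.922s
```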
### 2.2 Network & Aggregation Model

Inspired by existing BLS aggregation, our network is designed with a hierarchical structure of aggregating subnets.
Signatures are produced by attesting validators that are grouped into multiple aggregating subnets.
Among these validators, a subset act as **local (subnet) aggregators**, collecting signatures and aggregating them into a SNARK. Additionally, there are **global aggregators**, capable of aggregating subnet SNARKs into a global SNARK.
Within each subnet, a configurable number of local aggregators are responsible for collecting and aggregating individual post-quantum signatures. Once any local aggregator accumulates a sufficient number of signatures (defined by the `snark1_threshold` parameter in our simulator), it aggregates them into a **local SNARK (SNARK1)**. This SNARK1 is then propagated to the set of **global aggregators**.
Global aggregators, in turn, collect SNARK1s from various subnets. Upon gathering enough SNARK1s to collectively prove at least $2/3 + 1$ of all signatures across the network, they aggregate these local SNARKs into a **global SNARK (SNARK2)**, representing the final aggregated proof.
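To make the two thresholds concrete, the short sketch below (helper names are ours; parameter values are the defaults from Section 2.4) computes the SNARK1 trigger point, the SNARK2 target, and the minimum number of subnets a global aggregator needs to hear from:
```
import math

GROUP_COUNT = 8                # default network: 8 subnets
GROUP_VALIDATOR_COUNT = 1024   # of 1024 validators each
SNARK1_THRESHOLD = 0.9         # fraction of subnet signatures per SNARK1
SNARK2_THRESHOLD = 0.66        # fraction of all signatures for SNARK2

# Signatures a local aggregator must collect before starting SNARK1.
sigs_per_snark1 = math.ceil(GROUP_VALIDATOR_COUNT * SNARK1_THRESHOLD)  # 922

# Signatures the global SNARK2 must prove (the 2/3 + 1 supermajority).
total_validators = GROUP_COUNT * GROUP_VALIDATOR_COUNT                 # 8192
sigs_for_snark2 = math.ceil(total_validators * SNARK2_THRESHOLD)       # 5407

# Minimum number of subnet SNARK1s a global aggregator has to gather.
subnets_needed = math.ceil(sigs_for_snark2 / sigs_per_snark1)          # 6
```
With the defaults, 6 of the 8 subnet SNARK1s suffice, which is the figure used in the theoretical minimum calculation of Section 3.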
This high-level topology defines a design space for simulation, allowing us to explore various configurations:
* **Subnet Topologies (Local & Global):** We can test different network structures within and between subnets, such as Gossipsub or Grid topologies.
* **Local Aggregation Strategy:** Our current simulator focuses on a strategy where SNARK1s are produced only when the `snark1_threshold` is met. Future work could explore more dynamic strategies, such as actively merging and updating intermediate SNARKs as signatures are collected (e.g., propagating a partial SNARK after 30% of subnet signatures are gathered, allowing other aggregators to merge and update it).
* **Local-to-Global aggregators Communication:** We can investigate different communication paradigms for SNARK1 propagation, including "Push" (local aggregators send SNARK1s to global aggregators) and "Pull" (global aggregators request SNARK1s from local aggregators).
* **Global-to-Global aggregators communication:** Global aggregators may naively propagate any subnet SNARK they observe to their gossipsub (mesh) or grid neighbors, or they can propagate only one SNARK per subnet, thereby reducing redundant traffic.
### 2.3 Investigated Topologies
This section details the subnet topologies and the communication between local and global aggregators that were investigated in the experiments.
#### Gossipsub

Gossipsub is an efficient message propagation protocol that is widely used in Ethereum (e.g. BLS aggregation, Blobs propagation). It creates a "mesh" of established, long-lived peer connections (`mesh_n`) and a "gossip" mechanism that periodically sends messages to a few randomly selected "non-mesh" peers (`non_mesh_n`).
**How it works in BEAMSIM for signature aggregation:**
1. **Initial signature propagation:** when the simulation starts, each validator produces a signature and publishes it to the Gossipsub topic associated with its subnet. Its mesh peers immediately receive and re-propagate the signature to their own mesh peers, and so on.
2. **Local aggregator role:** local aggregators are a subset of validators and also publish their signatures to the signature topic. In addition, they collect observed signatures until the `snark1_threshold` limit is reached and then generate a SNARK1.
3. **SNARK1 propagation**: once a local aggregator generates a SNARK1, it is announced to the global aggregators via an `IHAVESNARK1` message containing a bitfield of the signatures proven by the SNARK1. The `IHAVESNARK1` announcement and the pulling of the SNARK1 happen in a separate global-aggregation gossipsub topic.
4. **Duplicates handling**: BEAMSIM implements Gossipsub 1.1, which is used by most CL clients for signature aggregation today ([link](https://github.com/ethereum/consensus-specs/blob/dev/specs/phase0/p2p-interface.md#the-gossip-domain-gossipsub)), and Gossipsub 1.2, which uses `idontwant` messages for duplicate handling.
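Putting steps 2 and 3 together, a local aggregator's behavior can be sketched as follows (our illustration; the message and parameter names follow the report, the control flow and helper names are assumed):
```
import math

class LocalAggregator:
    """Sketch of steps 2-3: collect signatures from the subnet topic,
    then produce and announce a SNARK1 once the threshold is met."""

    def __init__(self, subnet_size, snark1_threshold):
        self.seen = [False] * subnet_size
        self.needed = math.ceil(subnet_size * snark1_threshold)  # e.g. 922
        self.started = False

    def on_signature(self, validator_index):
        # Called for every (deduplicated) signature observed on the topic.
        self.seen[validator_index] = True
        if not self.started and sum(self.seen) >= self.needed:
            self.started = True
            self.generate_snark1(self.seen)        # modeled as a delay in BEAMSIM
            self.announce_ihave_snark1(self.seen)  # IHAVESNARK1 + bitfield, sent
                                                   # on the global aggregation topic

    def generate_snark1(self, bitfield): ...
    def announce_ihave_snark1(self, bitfield): ...
```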
#### Grid

The Grid is an alternative gossip strategy that utilizes a more structured and deterministic approach to network communication within subnets compared to Gossipsub. Nodes are arranged in a 2-dimensional grid, and messages are routed to all column and row neighbors. This design aims to minimize the distance between any pair of peers to two hops, as well as minimize message duplicates. Grid topology was discussed during libp2p day in Cannes ([link](https://www.youtube.com/watch?v=5nTUUL6Uuvs&list=PLX9e-uG608s-7_y-DAUfsWVqY1yEk6NqY&index=6&t=200s&pp=iAQB)), and during [Beam Call #3](https://youtu.be/dJkuwuh2Nrs?si=2p2kqEqrRU1-rdg7&t=3707).
**How it works in BEAMSIM for signature aggregation:**
0. **Grid formation:** the process begins with all validators in a given subnet using a common source of randomness to generate a shared, consistent 2D grid view that maps the location of each validator. Every validator then establishes connections with all other validators in its row and column.
1. **Signature propagation**: when a validator generates a signature, it transmits it to its row and column neighbors (Step 1 on the diagram). These neighbors, upon receiving the signature, forward it to their neighbors in the orthogonal direction, i.e. if a message is received from a row neighbor, it is propagated to column neighbors (Step 2 on the diagram).
2. **Local aggregator role:** local aggregators, like all other validators, are positioned in the grid structure and collect signatures until the `snark1_threshold` is reached before producing a SNARK1.
3. **SNARK1 propagation:** once a local aggregator generates a SNARK1, it is announced to the global aggregators via an `IHAVESNARK1` message. Global aggregators then fetch the SNARK1 if they have not yet observed a SNARK1 for the given subnet. Each global aggregator, connected to other global aggregators via the grid topology, then announces the SNARK1s it holds. This process continues until one of the global aggregators collects enough SNARK1s to prove 2/3 + 1 of the total signatures in the network.
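The neighbor computation and the orthogonal forwarding rule (steps 0-1) can be sketched as follows, assuming for simplicity a grid with `side` columns (our illustration, not BEAMSIM's code):
```
def grid_neighbors(index, n, side):
    """Same-row and same-column neighbors of validator `index` in a grid
    with `side` columns holding `n` validators (the last row may be partial)."""
    row, col = divmod(index, side)
    row_nb = [row * side + c for c in range(side)
              if c != col and row * side + c < n]
    col_nb = [r * side + col for r in range((n + side - 1) // side)
              if r != row and r * side + col < n]
    return row_nb, col_nb

def forward_targets(index, n, side, sender=None):
    """Step 1: the originator (sender=None) sends to all row and column
    neighbors. Step 2: a receiver forwards only orthogonally, so every
    message reaches the whole grid in at most two hops.
    Assumes `sender`, when given, is one of our grid neighbors."""
    row_nb, col_nb = grid_neighbors(index, n, side)
    if sender is None:
        return row_nb + col_nb
    return col_nb if sender in row_nb else row_nb
```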
Forming the Grid topology, however, requires all validator addresses to be known and mutually reachable, which could present an operational hurdle (e.g. for home operators behind NAT) and means giving up some node privacy. However, data from ProbeLab reports demonstrates that roughly 90% of Ethereum peers discovered via `discv5` are already reachable today ([link](https://probelab.io/ethereum/discv5/2025-28/#stale_records_7d_plot-plot)).
##### Aggregators-only Grid

We can limit Grid topology usage to only aggregating validators, while non-aggregating validators are sending their signatures directly to aggregators. This approach would prevent non-aggregating validators from needing to expose their peer-to-peer addresses.
###### Privacy concerns
Note that with this approach, aggregators could easily learn the P2P addresses of the attesters and match them with their signing keys. This would compromise the privacy of the validators, which we want to avoid. While not the focus of the current work, several approaches could mitigate this:
1. Maintaining a gossipsub mesh between non-aggregating nodes in addition to their connections to aggregators, so that they propagate not only their own signatures but also those of other signers
2. Similar to 1., but instead of propagating all signatures instantly, attesters would collect some number $k$ of signatures, then create a batch of $n < k$ signatures to be sent to aggregators
3. Apply an approach similar to that used in [Dandelion](https://arxiv.org/pdf/1805.11060.pdf), where the attesters propagate their signature messages to a random attester. That attester would then, with a predefined probability, propagate the message to the next attester or the aggregator.
4. A signing validator can randomly select one of the aggregators' public keys to encrypt its message before sending it to another aggregator, which then broadcasts the encrypted message among the aggregators. Once the selected aggregator receives the message, it decrypts it and broadcasts the decrypted message to the other aggregators.
5. Implement ring signatures, so that it is impossible to learn which validator signed a message.
None of these approaches is perfect, and all require further investigation. Approaches 1–4 introduce additional latency. Approach 5 sacrifices validator accountability, because it becomes impossible to determine who is responsible if invalid data is signed.
Another consideration is that, in order to form a grid, aggregators must expose their public addresses, making them susceptible to DDoS attacks that could stall the aggregation process.
To mitigate this, a concept of *public* and *private* aggregators could be introduced:
* **Public aggregators:** advertise their role via an ENR record, expose their public addresses, and form a Grid for fast signature dissemination
* **Private aggregators:** do not expose their role, do not participate in the Grid topology, and obtain other validators' signatures via gossipsub mesh connections with other attesting validators that do not participate in the Grid.
That way aggregation subnets would effectively have two paths for signature aggregation:
* **Fast path:** aggregation is done by public aggregators, who receive signatures directly from attesting validators and are subject to potential DDoS attacks
* **Slow path:** aggregation is done by private aggregators, whose p2p addresses are unknown and who therefore cannot be targeted by denial-of-service attacks.
We anticipate that in 99.9% of cases aggregation will be performed using the fast path; when public aggregators are attacked, the slow path acts as a backup so that aggregation is not stalled completely.
The results presented in the following sections and referenced as "Aggregators-only Grid/Gossipsub" are based on a topology where non-aggregating validators propagate their signatures directly to aggregators, thereby excluding additional gossipsub connections between non-aggregators. This experimental setup was chosen to isolate and evaluate the potential performance gains of the "fast path" architecture. The "slow path" was intentionally omitted from the simulation as its performance, by definition, would not be sufficient to surpass the fast path.
To configure BEAMSIM nodes to operate in a mode where non-aggregating validators send signatures directly to aggregators, the `signature_half_direct` parameter was introduced. By default it is set to `0`, which disables direct signature propagation from attesters to aggregators (i.e. the classical Grid described in the previous section). When set to a number greater than `0`, the parameter specifies the number of aggregators to which a non-aggregating node sends its signature upon generation.
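For reference, the aggregators-only runs reported later as "Grid (signature_half_direct=4)" differ from the default configuration of Section 4 only in this parameter:
```
'signature_half_direct': 4,   # each non-aggregating validator sends its signature directly to 4 aggregators
```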
#### Global aggregation
Global aggregators pull subnet aggregations from the aggregators that announced they might have a SNARK. After a local SNARK is downloaded, a global aggregator announces it to its neighbors, using either gossipsub or grid, before propagating aggregations further.
A few optimizations were implemented in BEAMSIM to improve the pull strategy:
**SNARK1 early announcement**
Under this strategy, subnet aggregators announce *the beginning of their aggregation process* instead of waiting for its completion. The announcement message contains a bitfield corresponding to the signatures being proven. Upon receiving this, a global aggregator can signal its intent to download the in-progress SNARK1 by responding with an `IWANT` message. This preemptive communication allows the local aggregator to know its interested peers the moment the SNARK1 is ready.
**One SNARK1 per group request**
Global aggregators can choose among various strategies for downloading SNARK1s. While each subnet produces multiple SNARK1s, each proving a fraction of the local signatures, global aggregators in this simulation requested only one SNARK1 per subnet.
For resilience in a real-world scenario, requesting multiple SNARK1s per subnet would be better practice, protecting aggregators from malicious or slow nodes; analyzing such nodes, however, was outside the scope of this work.
This behaviour can be disabled by setting the `snark1_group_once` flag to `false`.
**One SNARK1 propagate per group per aggregator**
Global aggregators can propagate only one SNARK1 per group. Assuming that all SNARK1 aggregations prove a similar number of signatures, there is little benefit in propagating more than one SNARK1 per group.
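Together, these pull optimizations amount to simple per-subnet bookkeeping on the global aggregator side. A minimal sketch (our illustration of the behavior behind `snark1_group_once` and the one-propagate rule, not BEAMSIM's code; `send_iwant`/`send_snark1` are hypothetical helpers):
```
class GlobalAggregator:
    def __init__(self, snark1_group_once=True):
        self.snark1_group_once = snark1_group_once
        self.requested = set()   # subnets we already sent an IWANT for
        self.propagated = set()  # subnets whose SNARK1 we already forwarded
        self.collected = {}      # subnet id -> SNARK1

    def on_ihave_snark1(self, subnet, bitfield, peer):
        # Early announcement: the SNARK1 may still be in progress; the IWANT
        # lets the local aggregator push it the moment it is ready.
        if self.snark1_group_once and subnet in self.requested:
            return               # request only one SNARK1 per subnet
        self.requested.add(subnet)
        peer.send_iwant(subnet)

    def on_snark1(self, subnet, snark1, neighbors):
        self.collected[subnet] = snark1
        if subnet in self.propagated:
            return               # propagate only one SNARK1 per subnet
        self.propagated.add(subnet)
        for peer in neighbors:   # gossipsub mesh or grid neighbors
            peer.send_snark1(subnet, snark1)
```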
### 2.4 Simulation parameters
**Network size params**
* `group_count`: number of subnets performing local aggregation (validator groups)
* default: `8`
* `group_validator_count`: number of validators per subnet
* default: `1024`
* `group_local_aggregator_count`: number of local aggregators per subnet (subset of subnet validators)
* default: `102` (10% of local validators)
* `global_aggregator_count`: total number of global aggregators (disjoint from the local subnet validators)
* default: `100`
**Bitrate & Latency params**
* `gml`: path to a processed ripe-atlas database containing real-world latencies between peers (~50ms on average)
* default: `shadow-atlas.bin`, located in the root of the project directory
* `max_bitrate`: maximum bitrate for each individual peer
* default: `50Mbps` (download speed for attesters from [EIP-7870](https://eips.ethereum.org/EIPS/eip-7870))
**PQ-signature aggregation params**
* `signature_size`: size of a single pq-signature
* default: `3072 bytes`
* `snark_size`: size of SNARK proofs
* default: `131072 bytes` (128KB)
* `snark1_threshold`: threshold fraction of signatures needed for SNARK1 generation
* default: `0.9` (e.g. for a 1024-validator subnet, $1024 \times 0.9 = 922$ signatures have to be collected by an aggregator before SNARK1 generation starts)
* `snark2_threshold`: threshold fraction of signatures needed to be proven for the final SNARK2
* default: `0.66` (e.g. for 8192 total validators, the global SNARK must prove the correctness of $8192 \times 0.66 = 5407$ signatures)
* `aggregation_rate_per_sec`: rate of signature aggregation into a SNARK
* default: `1000 sigs/second`
* `snark_recursion_aggregation_rate_per_sec`: rate of recursive SNARK aggregation
* default: `10 snarks/second`
* `pq_signature_verification_time`: time to verify a single PQ-signature
* default: `30 microseconds`
* `snark_proof_verification_time`: time to verify a single SNARK-proof
* default: `2ms`
> Default values were obtained after consultations with researchers working on XMSS signatures and signature aggregation schemes.
## 3. Theoretical minimum aggregation time
By simplifying some assumptions (e.g. ignoring latencies between peers and traffic consumed by message duplicates), we can roughly calculate the theoretical minimum time for the generation of a single global SNARK. We can compare this theoretical minimum aggregation time with our results to understand how much further it is possible to optimize aggregation.
**Local aggregation**
* \# of signatures required for subnet aggregation: $1024 \times 0.9 = 922$
* data required for local aggregation: $922 \times 3072 = 2832384\ \text{bytes} \approx 2.7\ \text{MB}$
* time required to download enough signatures for aggregation (assuming EIP-7870's `50Mbps` = 6.25 MB/second bandwidth):
* $2.7 / 6.25 = 0.43$ seconds
* time to aggregate:
* $922\ \text{signatures} / 1000\ \text{sigs/second} = 0.922$ seconds
* **Total**: $0.43 + 0.922 \approx 1.35$ seconds required at the very minimum for local aggregation
**Global aggregation**
* At a minimum, 6 out of 8 subnet aggregations must be downloaded to prove $\geq 2/3 + 1$ of all signatures
* Time to download $6$ subnet aggregations:
* $6 \times 131072\ \text{bytes} / 6.25\ \text{MB/second} = 0.12$ seconds
* Time to recursively aggregate $6$ subnet aggregations (assuming a 10 snarks/second rate):
* $6 / 10 = 0.6$ seconds
* **Total:** $0.12 + 0.6 = 0.72$ seconds required at the very minimum for global aggregation
Therefore, with our default parameters we estimate that **$1.35 + 0.72 = 2.07$ seconds** is the absolute minimum time required for the entire aggregation pipeline.
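The same arithmetic in executable form (our back-of-envelope script; it uses binary megabytes, matching the $2832384\ \text{bytes} \approx 2.7\ \text{MB}$ conversion above, and reproduces the totals):
```
import math

SIG_SIZE = 3072           # bytes per PQ signature
SNARK_SIZE = 131072       # bytes per SNARK1
BANDWIDTH = 6.25 * 2**20  # 50 Mbps taken as 6.25 MB/s (binary MB, as above)
AGG_RATE = 1000           # signatures aggregated per second
RECURSION_RATE = 10       # SNARK1s recursively aggregated per second

# Local aggregation: download 922 signatures, then aggregate them.
sigs = math.ceil(1024 * 0.9)                           # 922
local = sigs * SIG_SIZE / BANDWIDTH + sigs / AGG_RATE  # 0.43 + 0.92

# Global aggregation: download 6 SNARK1s, then recursively aggregate them.
subnets = 6                                            # enough for 2/3 + 1
global_ = subnets * SNARK_SIZE / BANDWIDTH + subnets / RECURSION_RATE

print(f"local ≈ {local:.2f}s, global ≈ {global_:.2f}s, "
      f"total ≈ {local + global_:.2f}s")
# local ≈ 1.35s, global ≈ 0.72s, total ≈ 2.07s
```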
In reality, additional time is spent due to redundant duplicate messages, multi-hop communication, and the time to announce and pull subnet SNARKs.
We also need to account for the time from the beginning of the slot to propagate a block (~1 second), plus the time required to propagate the global SNARK for inclusion into a block, plus additional traffic required for other communications (e.g. DAS, transaction propagation).
## 4. Results
### Local subnet aggregation
BEAMSIM can be launched in a local-aggregation-only mode. In this case only a single subnet is simulated, allowing fine-tuning of parameters for single-subnet SNARK aggregation. The simulation stops when **any** subnet aggregator produces a SNARK proving a predefined fraction of subnet signatures (90% by default). Each scenario was executed five times with different random seeds. The average result for each configuration is plotted.
This section describes the results of simulating local aggregation with the gossipsub and grid topologies, taking the default parameters described above and checking how aggregation speed is affected when one of the parameters is set to custom values. The following parameters were used by default:
```
'backend': 'ns3-direct',
'random_seed': 42,
'group_count': 8,
'group_validator_count': 1024,
'group_local_aggregator_count': "10%",
'mesh_n': 8,
'non_mesh_n': 4,
'signature_time': '20ms',
'signature_size': 3072,
'snark_size': 131072,
'snark1_threshold': 0.9,
'aggregation_rate_per_sec': 1000,
'pq_signature_verification_time': '30us',
'gml': 'shadow-atlas.bin', # ripe atlas db's latencies
'max_bitrate': '50Mbps' # Default EIP-7870 bandwidth limit
```
The following parameters were tested on how they affect aggregation speed:
* Signature sizes
* Bandwidth limits
* Gossip mesh analysis
* Local aggregators count
* Subnet sizes
We ran our simulations using gossipsub 1.1 (`Gossipsub (idontwant=False)` on the plots), gossipsub 1.2 (`Gossipsub (idontwant=True)`), and grid topologies. We also plot the subnet's theoretical minimum aggregation time for reference.
Additionally, we plot the average number of duplicate messages received by the aggregator that was first to produce the subnet aggregation.
#### Signature sizes

| Signature Size | gossipsub (idontwant=False) Time (ms) | gossipsub (idontwant=True) Time (ms) | Grid (signature_half_direct=0) Time (ms) | Grid (signature_half_direct=4) Time (ms) | Theoretical Minimum Time (ms) |
| -------- | -------- | -------- | -------- | -------- | -------- |
| 96 bytes | 1305.8 | 1403.6 | 1300.0 | 1283.0 | 935.1 |
| 1024 bytes | 2008.0 | 2031.6 | 1429.0 | 1593.0 | 1065.6 |
| 2048 bytes | 2415.8 | 2483.6 | 2042.0 | 2351.0 | 1209.6 |
| 3072 bytes | 3126.8 | 3156.4 | 2318.0 | 2683.0 | 1353.6 |
| 4096 bytes | 3565.0 | 3891.4 | 2525.0 | 3019.0 | 1497.6 |
| Signature Size | gossipsub (idontwant=False) Avg Duplicates | gossipsub (idontwant=True) Avg Duplicates | Grid (signature_half_direct=0) Avg Duplicates | Grid (signature_half_direct=4) Avg Duplicates |
| -------- | -------- | -------- | -------- | -------- |
| 96 bytes | 2.612 | 2.377 | 0.776 | 1.392 |
| 1024 bytes | 2.066 | 1.649 | 0.498 | 0.980 |
| 2048 bytes | 1.370 | 1.450 | 0.457 | 0.934 |
| 3072 bytes | 1.510 | 1.712 | 0.467 | 1.225 |
| 4096 bytes | 1.501 | 1.653 | 0.492 | 1.074 |
We observe that small signatures (96 bytes) result in very fast aggregation, within 1.2-1.3 seconds.
With the default signature size of 3072 bytes, gossipsub takes ~3.1 seconds to aggregate. The classic Grid topology performs local aggregation in ~2.3 seconds, while the aggregators-only Grid completes in ~2.7 seconds.
It is observed that gossipsub 1.2's `idontwant` messages do not play a significant role, as the number of duplicate messages remains similar to gossipsub 1.1 without `idontwant`. This behavior remains consistent throughout the other tests as well.
A possible reason is that duplicates may already be on the wire by the time `idontwant` is sent. A similar conclusion was reached in Probelab's research post ([link](https://ethresear.ch/t/impact-of-idontwant-in-the-number-of-duplicates/22652)).
Due to the Grid's structure, each peer can receive at most one duplicate per message. In our test the average number of duplicates per message is 0.47 for 3KB signatures, which suggests that roughly half of the messages incur one duplicate.
The number of duplicates, however, increases for the aggregators-only Grid topology. This can be explained by multiple Grid aggregators receiving the same signature directly and then propagating it among the other aggregators.
#### Bandwidth limits

| Bandwidth | gossipsub (idontwant=False) Time (ms) | gossipsub (idontwant=True) Time (ms) | Grid (signature_half_direct=0) Time (ms) | Grid (signature_half_direct=4) Time (ms) | Theoretical Minimum Time (ms) |
| -------- | -------- | -------- | -------- | -------- | -------- |
| 5Mbps | 11930.6 | 12455.6 | 8998.0 | 10087.0 | 5241.6 |
| 25Mbps | 4485.2 | 4449.6 | 2865.0 | 3620.0 | 1785.6 |
| 50Mbps | 3126.8 | 3156.4 | 2318.0 | 2683.0 | 1353.6 |
| 100Mbps | 2296.8 | 2355.0 | 1872.0 | 2041.0 | 1137.6 |
| 200Mbps | 2228.8 | 2254.6 | 1403.0 | 1526.0 | 1029.6 |
| Bandwidth | gossipsub (idontwant=False) Avg Duplicates | gossipsub (idontwant=True) Avg Duplicates | Grid (signature_half_direct=0) Avg Duplicates | Grid (signature_half_direct=4) Avg Duplicates |
| -------- | -------- | -------- | -------- | -------- |
| 5Mbps | 0.783 | 0.869 | 0.443 | 0.643 |
| 25Mbps | 1.351 | 1.182 | 0.515 | 0.936 |
| 50Mbps | 1.510 | 1.712 | 0.467 | 1.225 |
| 100Mbps | 1.413 | 1.636 | 0.518 | 1.099 |
| 200Mbps | 1.350 | 1.023 | 0.554 | 0.907 |
We observe that increased bandwidth plays a significant role in the speed of signature aggregation, and the current download speed requirement defined by EIP-7870 is not sufficient for fast signature aggregation. Going from 50Mbps to 100Mbps reduces subnet aggregation time by ~0.5 seconds for both gossipsub and grid topologies.
It is worth noting that for gossipsub topologies no significant improvement was observed when the bitrate was increased from 100Mbps to 200Mbps (~0.1 seconds). In contrast, for grid topologies aggregation time dropped more significantly for the same bitrate increase (0.5-0.6 seconds), demonstrating Grid's superior bandwidth utilization.
#### Gossipsub mesh analysis

| Peers | gossipsub (idontwant=False) Time (ms) | gossipsub (idontwant=True) Time (ms) | Theoretical Minimum Time (ms) |
| -------- | -------- | -------- | -------- |
| 4 peers | 2936.6 | 3018.0 | 1353.6 |
| 6 peers | 3032.4 | 3049.4 | 1353.6 |
| 8 peers | 3126.8 | 3156.4 | 1353.6 |
| 10 peers | 3188.2 | 3148.6 | 1353.6 |
| 12 peers | 3237.2 | 3191.2 | 1353.6 |
| Peers | gossipsub (idontwant=False) Avg Duplicates | gossipsub (idontwant=True) Avg Duplicates |
| -------- | -------- | -------- |
| 4 peers | 0.632 | 0.619 |
| 6 peers | 1.296 | 1.315 |
| 8 peers | 1.510 | 1.712 |
| 10 peers | 1.826 | 1.267 |
| 12 peers | 1.597 | 1.490 |
Increasing the mesh size (gossipsub's fan-out factor) can reduce the number of hops required for a message to reach an arbitrary node. However, blindly increasing the mesh size also increases the number of duplicates received per node.
We see that aggregation times tend to increase with larger mesh sizes, and the duplicates analysis shows that larger meshes slightly increase the number of duplicate messages.
#### Subnet size

| Validator Count | gossipsub (idontwant=False) Time (ms) | gossipsub (idontwant=True) Time (ms) | Grid (signature_half_direct=0) Time (ms) | Grid (signature_half_direct=4) Time (ms) | Theoretical Minimum Time (ms) |
| -------- | -------- | -------- | -------- | -------- | -------- |
| 128 validators | 636.4 | 650.0 | 548.0 | 532.0 | 169.2 |
| 256 validators | 1035.4 | 1110.2 | 677.0 | 708.0 | 338.4 |
| 512 validators | 1692.4 | 1691.0 | 1229.0 | 1401.0 | 676.8 |
| 768 validators | 2296.2 | 2235.8 | 1977.0 | 2194.0 | 1015.2 |
| 1024 validators | 3126.8 | 3156.4 | 2318.0 | 2683.0 | 1353.6 |
| Validator Count | gossipsub (idontwant=False) Avg Duplicates | gossipsub (idontwant=True) Avg Duplicates | Grid (signature_half_direct=0) Avg Duplicates | Grid (signature_half_direct=4) Avg Duplicates |
| -------- | -------- | -------- | -------- | -------- |
| 128 validators | 1.682 | 1.504 | 0.579 | 1.211 |
| 256 validators | 1.534 | 1.289 | 0.559 | 1.039 |
| 512 validators | 1.139 | 1.158 | 0.458 | 0.926 |
| 768 validators | 1.310 | 1.289 | 0.432 | 1.189 |
| 1024 validators | 1.510 | 1.712 | 0.467 | 1.225 |
Aggregation time grows linearly with the number of validators in the subnet.
To keep aggregation time around 2 seconds or less with 50Mbps bandwidth, the number of validators per subnet cannot exceed 768 with the grid topology or 512 with gossipsub.
#### Local aggregator count

| Aggregators | gossipsub (idontwant=False) Time (ms) | gossipsub (idontwant=True) Time (ms) | Grid (signature_half_direct=0) Time (ms) | Grid (signature_half_direct=4) Time (ms) | Theoretical Minimum Time (ms) |
| -------- | -------- | -------- | -------- | -------- | -------- |
| 10 aggregators | 3521.2 | 3615.4 | 2451.0 | 3900.0 | 1353.6 |
| 102 aggregators | 3126.8 | 3156.4 | 2318.0 | 2683.0 | 1353.6 |
| 256 aggregators | 3053.8 | 3151.0 | 2316.0 | 2507.0 | 1353.6 |
| 512 aggregators | 3064.4 | 3075.6 | 2346.0 | 2425.0 | 1353.6 |
| Aggregators | gossipsub (idontwant=False) Avg Duplicates | gossipsub (idontwant=True) Avg Duplicates | Grid (signature_half_direct=0) Avg Duplicates | Grid (signature_half_direct=4) Avg Duplicates |
| -------- | -------- | -------- | -------- | -------- |
| 10 aggregators | 1.414 | 1.784 | 0.485 | 1.087 |
| 102 aggregators | 1.510 | 1.712 | 0.467 | 1.225 |
| 256 aggregators | 1.544 | 1.645 | 0.458 | 1.061 |
| 512 aggregators | 1.529 | 1.219 | 0.505 | 0.873 |
It is observed that having 10% of validators (102) also act as aggregators appears to be the optimal value for gossipsub topologies, as no significant improvement was observed with a higher number of aggregators.
Interestingly, aggregation times for the grid topology were largely unaffected by the fraction of aggregators in the subnet and remained consistent at around 2.3-2.4 seconds.
### Global aggregation
For global aggregation the following parameters were used:
```
'backend': 'ns3-direct',
'snark1_pull': true,          # SNARK1 is first announced via a bitfield before it is downloaded
'snark1_group_once': true,    # each global aggregator requests only one SNARK1 per subnet
'snark1_half_direct': true,   # local aggregators propagate their SNARK1s directly to global aggregators
'snark1_global_push': true,   # a global aggregator propagates 1 SNARK1 per subnet
'random_seed': 42,
'group_count': 8,
'group_validator_count': 1024,
'group_local_aggregator_count': "10%",
'mesh_n': 8,
'non_mesh_n': 4,
'idontwant': false,           # set to false as no improvements were observed during local aggregation simulations
'signature_time': '20ms',
'signature_size': 3072,
'snark_size': 131072,
'snark1_threshold': 0.9,
'aggregation_rate_per_sec': 1000,
'snark_recursion_aggregation_rate_per_sec': 10,
'pq_signature_verification_time': '30us',
'gml': 'shadow-atlas.bin',    # ripe atlas db's latencies
'max_bitrate': '100Mbps'      # intentionally set higher than EIP-7870
```

* The X axis is the elapsed time from the beginning of the slot
* The Y axis is the number of signatures collected or aggregated
* The blue dotted line shows the moment when the first global SNARK2 was generated
* The red vertical line shows how many signatures were aggregated into subnet SNARK1s
* The green line shows the number of signatures that can be proven by the SNARK1s collected by a global aggregator
With Gossipsub 1.1 it took ~2.4 seconds for 6 subnets to generate their local aggregations (enough to prove a supermajority of signatures). The first global SNARK was produced roughly 1.6 seconds later, at about 4 seconds into the slot, which is not acceptable, as the slot time itself is 4 seconds.
With Grid topology it took ~1.9 seconds for 6 subnets to generate their local aggregations. The first global SNARK was produced ~1.3 seconds later, at about 3.2 seconds into the slot.
**Aggregators-only gossip/grid topology**

When we set non-aggregating validators to publish-only mode, sending their signatures directly to aggregators with a fan-out factor (the `signature_half_direct` parameter) of 4, we see better results for the gossipsub topology (3.69 seconds vs. 4 seconds in the previous simulation).
For the grid topology the result remained approximately the same, with the global SNARK created 3.44 seconds into the slot.
### What it Takes to Significantly Speed Up Aggregation
To significantly accelerate the aggregation process, our simulations explore two optimistic but potentially achievable assumptions: higher network **bandwidth** and a faster **SNARK aggregation rate**.
#### Assumption 1: Increased Bandwidth
Based on Nielsen's Law, which suggests that high-end user connection speeds grow by 50% per year, we can optimistically assume a bandwidth limit of 200 Mbps for validators by 2029, a projected timeframe for the lean consensus upgrade. Leveraging the most efficient configuration (publish-only non-aggregating validators), the simulator demonstrated significant gains.
With this setup, the global aggregation time using the Aggregators-only Grid topology drops to less than 3 seconds:

#### Assumption 2: Improved SNARKs Aggregation
Another optimistic assumption is that SNARK aggregation could be improved to reach a rate of 20 SNARKs per second. This would effectively double the network's aggregation capacity within the same timeframe. Consequently, we could reduce the size of each validator subnet from 1024 to 512 while doubling the number of subnets, which would allow local aggregation to begin sooner and further speed up the overall process.
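As a rough sanity check, applying the Section 3 formulas under both assumptions (16 subnets of 512 validators, a 20 SNARKs/second recursion rate, and 200 Mbps) suggests a new theoretical floor well below the simulated results (our back-of-envelope, with the same simplifications as Section 3):
```
import math

BANDWIDTH = 25 * 2**20        # 200 Mbps taken as 25 MB/s, as in Section 3
AGG_RATE = 1000               # signatures/second (unchanged)
RECURSION_RATE = 20           # assumption 2: 20 SNARKs/second
GROUPS, GROUP_SIZE = 16, 512  # assumption 2: twice the subnets, half the size

sigs = math.ceil(GROUP_SIZE * 0.9)                 # 461 signatures per SNARK1
local = sigs * 3072 / BANDWIDTH + sigs / AGG_RATE  # ≈ 0.05 + 0.46 ≈ 0.52s

need = math.ceil(GROUPS * GROUP_SIZE * 0.66 / sigs)          # 12 subnets for 2/3 + 1
global_ = need * 131072 / BANDWIDTH + need / RECURSION_RATE  # ≈ 0.06 + 0.6 ≈ 0.66s

print(f"theoretical minimum ≈ {local + global_:.2f}s")       # ≈ 1.18s
```
Note that under these assumptions the recursion step still costs 0.6 seconds: doubling the recursion rate is offset by doubling the number of SNARK1s that must be merged.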
The results from the simulator with both assumptions (20 SNARKs/second aggregation rate and a 200 Mbps bandwidth limit) on Aggregators-only Grid and Gossipsub are as follows:

We can see that with Aggregators-only Grid topology global aggregation was achieved in 2.3 seconds, which is close to our target of 2 seconds.
## 5. Discussion & Conclusions
### Why Grid Outperforms Gossipsub
The simulation results consistently demonstrate that the Grid topology achieves 30-40% lower aggregation latency than Gossipsub, with final SNARK creation completing in 3.2 seconds versus 4 seconds under 100Mbps bandwidth conditions. This significant performance advantage can be directly attributed to the Grid's structural efficiency in message propagation, which, however, comes at the cost of weakening some nodes' privacy.
Unlike Gossipsub's probabilistic mesh approach, the Grid topology deliberately minimizes the maximum distance between any two peers to just two hops through its deterministic 2D arrangement. This feature eliminates the variable path lengths of Gossipsub, where messages may traverse multiple hops before reaching their destination. The data confirms this advantage: while Gossipsub has an average of 1.51 duplicate messages per signature (for 3KB signatures at 50Mbps), the Grid maintains a lower average of only 0.47 duplicates.
These duplicates not only consume additional bandwidth but also create contention that delays aggregation. Our results consistently show that as signature size increases (from 96 bytes to 4096 bytes), the performance gap between Grid and Gossipsub widens, confirming that the Grid's superior bandwidth utilization becomes increasingly valuable with larger post-quantum signatures.
### Implications of the Aggregator-only Grid Optimization
The "Aggregator-only Grid" approach, where only aggregating validators participate in the Grid topology while non-aggregators send signatures directly to aggregators, presents some promising advantages and notable challenges.
From a performance perspective, this configuration (represented as "Grid (signature_half_direct=4)" in our results) demonstrates competitive subnet aggregation times, completing in 2.68 seconds at 50Mbps compared to 2.32 seconds for the full Grid implementation. While slightly slower than the full Grid, it remains significantly faster than Gossipsub (3.13 seconds).
However, this approach introduces new considerations:
- **Centralization risk**: Aggregators become critical network points with public endpoints that could be targeted by DDoS attacks, potentially stalling the entire aggregation process. As noted earlier in this report, this vulnerability could be mitigated through a dual-path approach with public (Grid-connected) and private (Gossipsub-connected) aggregators: most of the time the public aggregators do the work, providing the fast path for aggregation, while the private ones operate as a backup, providing the redundancy required in case of attacks.
- **Privacy leakage**: Aggregators can potentially correlate signing keys with network addresses, creating privacy concerns. As discussed previously, this could be mitigated through gossipsub among non-aggregators, or other approaches involving public key encryption of messages or additional hops for privacy.
### Next steps
Moving forward, three critical improvements appear essential for viable post-quantum consensus:
1. Increasing validator inbound bandwidth requirements beyond EIP-7870's 50Mbps recommendation
2. Accelerating both signature aggregation (beyond 1,000 signatures/second) and SNARK recursion rates (beyond 10 SNARKs/second). As demonstrated in our theoretical calculations, these parameters alone account for 0.922 seconds of local aggregation time and 0.6 seconds of global aggregation time. If these rates were doubled through hardware acceleration or algorithmic improvements (to 2,000 signatures/second and 20 SNARKs/second), we would immediately save approximately 0.76 seconds of processing time (0.46 seconds from local aggregation and 0.3 seconds from global aggregation)
3. Implementing the dual-path aggregation strategy to balance performance with resilience
Without these enhancements, the current trajectory suggests that post-quantum signature aggregation may struggle to meet the timing requirements of 4-second slots. However, with targeted optimizations in computational performance and bandwidth utilization, the sub-2-second target remains achievable, enabling a secure transition to quantum-resistant consensus.
## Appendix: how to reproduce simulations
Results were produced using BEAMSIM report-1 release: https://github.com/qdrvm/beamsim/releases/tag/report-1
To reproduce results:
1. Clone beamsim repository: https://github.com/qdrvm/beamsim
2. Checkout the `report-1` version
3. Build beamsim: https://github.com/qdrvm/beamsim/tree/report-1?tab=readme-ov-file#local-build-with-ns-3-support
4. Start jupyter-notebook: https://github.com/qdrvm/beamsim/tree/report-1?tab=readme-ov-file#starting-jupyter-notebook
To run local-aggregation simulations, run the cells in the `beamsim.ipynb` notebook.
To run global-aggregation simulations, run the cells in the `beamsim-global-aggregation-analysis.ipynb` notebook.