# Scalability limits of Beacon Chain
slide: https://hackmd.io/@ericsson49/attestations
---
# Goal
- Initially researched the Attestation Aggregation and Dissemination problem
- But want to present the most important and recent takeaways from the research
- Other issues are described in my write-ups
- More research to be done
- More write-ups to be written
---
# Plan
- Problem description
- Validators and Nodes
- Scalability vs BFT tradeoff
- Attestations (if time left)
- Block size
- Attestation aggregation
- Slashing detection
---
# Main ideas
- Lots of validators is an obstacle to scalability:
- many active nodes
- lots of messages each epoch
- bloated block size
- tricky attestation aggregation
- storage for slashing detection
- BFT vs scalability
- network layer attacks
---
# Problem description
---
## Preliminary notes
Time:
- Slot, 12 seconds each
- slot_number = (time - start_time) // 12
- Epoch, 32 consecutive slots
- epoch_number = slot_number // 32
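The time arithmetic above, as a minimal sketch (constants taken from the current spec):

```python
SECONDS_PER_SLOT = 12
SLOTS_PER_EPOCH = 32

def slot_number(time: int, start_time: int) -> int:
    # Slots are counted from genesis, 12 seconds each.
    return (time - start_time) // SECONDS_PER_SLOT

def epoch_number(slot: int) -> int:
    # An epoch is 32 consecutive slots.
    return slot // SLOTS_PER_EPOCH

# 500 seconds after genesis: slot 41, epoch 1.
```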
---
## Preliminary notes
Validators:
- stakeholders
- assigned Proposer and/or Attester responsibilities
Nodes:
- actual actors from network point of view
---
## Proposer and Attesters
#### Proposer:
- proposes changes to the beacon chain
- chosen (pseudo)randomly one per slot
#### Attesters:
- attest (vote for) latest block (in their view)
- each Epoch all validators shuffled and split into 32 groups (per slot)
- each validator votes once each Epoch
- votes are issued in the validator's assigned slot
---
## Logistics
- At the beginning of each slot Proposer issues a block
- Has 6 seconds (half a slot duration) to propagate it
- to attesters, to everyone in general
- At the middle of each slot Attesters attest head block (from their respective point of view).
- Have 6 seconds to propagate attestations
- to next proposer, to everyone in general
---
# Validators and Nodes
---
## Validators vs nodes
#### Validators:
- min mainnet scenario: |V| = 300K, 10M ETH
- max possible: |V| = 4M, 128M ETH
#### Nodes:
- |N| = 3K-10K-30K
---
## Validators, slots, epochs
Validators in V are assigned to slots during an epoch:
- 32 slots in epoch currently
- around 10K validators per slot, if |V| = 300K
- 100K+, if |V| = 4M
---
## Active nodes
Nodes are actors from network point of view.
Only 1/32 of validators are active during a slot.
- so only a fraction of nodes is active each slot
What is the number of active nodes?
---
## Uniform distribution
With a more or less uniform distribution of validators per node, the number of active nodes is around min(|V|/32, |N|).
- |N| > |V|/32:
- 10K active nodes each slot, if |V| = 300K
- 100K active nodes each slot, if |V| = 4M
- |N| <= |V|/32 - looks like more realistic scenario
- most nodes are active!
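A rough probabilistic refinement of the min(|V|/32, |N|) estimate; this sketch assumes each validator is assigned to a random slot independently, which is a simplification of the spec's shuffling:

```python
def expected_active_nodes(num_validators: int, num_nodes: int,
                          slots_per_epoch: int = 32) -> float:
    # Uniform split: each node holds |V|/|N| validators on average.
    validators_per_node = num_validators / num_nodes
    # A node is active in a slot if at least one of its validators
    # is assigned to that slot (probability 1/32 per validator).
    p_inactive = (1 - 1 / slots_per_epoch) ** validators_per_node
    return num_nodes * (1 - p_inactive)

# |V| = 300K, |N| = 10K: ~6.1K of 10K nodes active every slot.
# |V| = 4M,   |N| = 10K: virtually all 10K nodes active every slot.
```

Overlap between committees lowers the count somewhat below the min(|V|/32, |N|) bound, but the conclusion stands: a large share of nodes is active in every slot.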
---
## Non-uniform distribution
Assume lots of nodes with few validators each, e.g. single-validator nodes.
Then there will be either:
- lots of nodes (comparable to |V|)
- unlikely, huge problems on network layer
- power concentration (few nodes possessing the majority of votes)
- collusion is easier
---
## Non-uniform distribution
#### Example
- 10K nodes, 300K validators scenario:
- 9900 nodes with 10 validators each
- 100 nodes with 2000 validators each
- about 1/3 nodes active each slot
- but 2/3 of stake possessed by 100 nodes
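The example can be checked with a few lines (independent random slot assignment is assumed, as a simplification):

```python
# 9,900 small nodes with 10 validators, 100 large nodes with 2,000 each.
small_nodes, small_vals = 9_900, 10
large_nodes, large_vals = 100, 2_000

total_validators = small_nodes * small_vals + large_nodes * large_vals
# -> 299,000, matching the ~300K scenario.

# Stake share held by the 100 large nodes: ~2/3.
large_stake_share = large_nodes * large_vals / total_validators

# A small node is active in a slot if any of its 10 validators is:
p_small_active = 1 - (31 / 32) ** small_vals   # ~0.27
# Large nodes are active essentially every slot.
expected_active = large_nodes + small_nodes * p_small_active
# -> ~2,800 of 10,000 nodes, i.e. about 1/3 active per slot.
```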
---
## Non-uniform distribution is dangerous
Fewer active nodes, but:
- power concentration => collusion is easier
- lots of nodes with few validators => network layer attacks (flood/block messages) are easier (cheaper) for adversary
---
## Why split validators at all?
My hypothesis - faster changes:
- Changes are possible each slot
- Full voting round - each epoch (more time to gather votes from lots of validators)
Given lots of validators, it seems like much less work each slot
- But only when |V|/32 << |N| !!!
---
## Is it reasonable to split?
- Non-uniformity is dangerous
- lots of nodes - performance problems
- more or less uniform distribution + moderate amount of nodes => most nodes are active!!
But if most nodes are active:
- All validators can attest each slot with similar communication efforts
- since can aggregate votes on each node
---
# Scalability vs BFT tradeoff
---
## Bandwidth for block propagation
With 300K validators, around 10K receivers (assuming 10K+ nodes and uniformity).
6 seconds to propagate (half slot duration).
Direct communication:
- Bandwidth: `block size` * |V| / (`epoch duration` / 2)
- 32K * 300K / ((12 * 32) / 2) ~ 50Mbytes/sec!!!
Have to use indirect nodes (to amplify bandwidth). But this reduces Byzantine fault tolerance.
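The estimate above as a sketch; the 32 KB block size and 300K validators are the slide's assumptions:

```python
BLOCK_SIZE = 32 * 1024        # bytes, assumed block size
NUM_VALIDATORS = 300_000
EPOCH_DURATION = 12 * 32      # seconds (32 slots of 12 seconds)

# Sending the block directly to every validator within half an epoch:
bandwidth = BLOCK_SIZE * NUM_VALIDATORS / (EPOCH_DURATION / 2)
# ~50 MB/s -- far beyond what a single node can sustain.
```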
---
## p2p graph
Limit outbound traffic by sending to part of nodes (e.g. peers). Peers resending to their peers.
Assume 10 peers:
- proposer sends to 10 peers
- 10 peers send to their 100 peers
- 100 -> 1000
- 1000 -> 10000
- Ignoring overlap
---
## p2p graph
- Have to verify messages before re-sending.
- 4 hops, thus should send to 10 peers during 1.5 sec (6 sec / 4).
- Network delay about 200-400ms
- 200-400ms less time to transmit
- 32K * 10 / 1.1 sec - approx 290 Kbytes/sec.
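The per-hop budget, as a sketch (400 ms is taken as the upper network-delay estimate from the slide):

```python
BLOCK_SIZE = 32 * 1024    # bytes
PEERS = 10                # node degree in the p2p graph
HOPS = 4                  # 10 -> 100 -> 1K -> 10K receivers
SLOT_BUDGET = 6.0         # seconds to reach everyone (half a slot)
NETWORK_DELAY = 0.4       # seconds lost per hop (upper estimate)

per_hop = SLOT_BUDGET / HOPS - NETWORK_DELAY   # ~1.1 s to serve 10 peers
bandwidth = BLOCK_SIZE * PEERS / per_hop       # ~290 KB/s of upload per node
```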
---
## BFT vs Scalability
Have to reduce node degree in p2p graph because of bandwidth limits, but:
- few peers => an adversary can organize eclipse attacks with lower costs.
- transient nodes => adversary can block messages
---
## BFT vs Scalability
How does this affect BFT properties?
- No clear answer (beyond my initial research scope)
- Obviously 10 peers is way too low compared to 10K
- Sybil attack example https://github.com/ethresearch/p2p/issues/6
- suggests 50 peers for the setup
---
## Nodes vs Validators
Validators are actors from the spec point of view.
But nodes are actors from network point of view.
There is a restriction to become a validator - deposits.
A node is expected to contain several validators (tens or even hundreds, on average).
But node with zero validators is much less restricted!
---
## Attacks on network layer
- An adversary can create lots of nodes with 0/1 validator behind each.
- A node with one or zero validator is cheaper to set up.
- Lots of such nodes means an adversary can either flood or block messages
- a big problem with transient nodes, e.g. p2p
- Sybil attacks
- low node degree => eclipse attacks easier
---
## Proof-of-X means participant constraint
BFT guarantees are expressed in terms of max amount of faulty processes, e.g. |F| < 1/3 |N|.
Beacon Chain BFT guarantees are of the same form, but in terms of validators.
To become validator, one should make a deposit. So, there is an economic restriction on |V|.
But node set is not restricted to the same degree in Beacon Chain!
---
## Proof-of-X means participant constraint
As mentioned before, highly non-uniform distribution poses a threat: lots of 0/1 validator nodes.
Probably, cannot break correctness of Casper FFG (still need > 1/3 of voting power).
But unrestricted amount of nodes can affect liveness: deter/prevent justification/finalization.
---
# Attestations
---
## Attestations
- |V| is expected to be around 300K initially, and up to 4M.
- 300K attestations each epoch, ~300 bytes each
- Have to compress:
- to include in a block
- to disseminate through the network
- Block size affected by amount of validators (and bandwidth requirements)
- Have to store attestations for many epochs, to detect surround votes
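The raw volume implied by these numbers, as a quick sketch (300 bytes per uncompressed attestation, per the estimate above):

```python
ATTESTATION_SIZE = 300  # bytes per uncompressed attestation (estimate)

# Every validator attests once per epoch, so the raw per-epoch volume is:
raw_300k = 300_000 * ATTESTATION_SIZE    # 90 MB per epoch
raw_4m = 4_000_000 * ATTESTATION_SIZE    # 1.2 GB per epoch
```

Hence the need to aggregate and compress attestations before including them in blocks or disseminating them.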
---
## Block size
A significant portion of a block is occupied by (aggregated) Attestations - up to 128 per block currently.
We estimated that a block can occupy up to around 130K bytes (without compression), when all 128 attestation slots are filled.
Since there are now more committees per slot (64 vs 16), the 128 limit should be increased to accommodate recent spec changes.
But smart compression should help.
---
## Block compression
Opportunities:
- AttestationData contains three votes, which can share many common (sub)fields
- each Attestation has aggregation and custody bitsets, but they may be only partially utilized (different views per committee)
- more than one signature could be allowed in signature aggregates
- signatures of different votes can be aggregated too
---
## Slashing detection
Currently it is assumed that, for slashing detection purposes, attestations for the Weak Subjectivity period should be stored - about 54K epochs.
Storing such an amount of attestations and indices is a problem: 300K * 54K ≈ 16 billion entries, and over 200 billion for 4M validators.
This is a problem especially for small actors.
It is not clear how to compress this for now, except restricting the look-back period (e.g. keep attestations only since the last finalization).
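A back-of-the-envelope sketch of the storage estimate, together with a simplified surround-vote predicate (epoch numbers only; the actual spec check in `is_slashable_attestation_data` also covers double votes):

```python
WEAK_SUBJECTIVITY_EPOCHS = 54_000  # look-back period from the slide

def stored_entries(num_validators: int) -> int:
    # One attestation record per validator per epoch, kept for the
    # whole weak-subjectivity period.
    return num_validators * WEAK_SUBJECTIVITY_EPOCHS

# ~16.2 billion entries for 300K validators, ~216 billion for 4M.
entries_300k = stored_entries(300_000)
entries_4m = stored_entries(4_000_000)

# Simplified surround-vote check (epoch numbers only): vote 1
# surrounds vote 2 if its source is earlier and its target is later.
def is_surround_vote(source_1: int, target_1: int,
                     source_2: int, target_2: int) -> bool:
    return source_1 < source_2 and target_2 < target_1
```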
---
## Conclusion
- The Beacon Chain spec doesn't restrict nodes with economic constraints like deposits (i.e. no Proof-of-X constraint on nodes).
- Network-wise actors are nodes.
- Lots of Validators is a problem for Scalability
- many active nodes
- attestations take lots of space/bandwidth
- Do we really need lots of Validators?
My conclusion: beacon chain design should be re-considered