
Vena: Optimistically Responsive Consensus Over Large Validator Sets

Vena is an optimistically responsive consensus protocol in the partially synchronous network model, tailored for deployment on public blockchains with large validator sets. Vena employs a form of "optimistic" aggregation committee broadcast that, when successful, results in all participants, including the leader, sending less than n messages in a given round. In the uncommon case that this broadcast fails (robustness is parameterizable), Vena falls back to a single, all-to-all broadcast that guarantees eventual progress in the presence of f < n/3 static byzantine faults. It is left to future work to construct a formal proof of correctness for Vena.

Overview

Simplex Consensus, as proposed by Benjamin Y. Chan and Rafael Pass, is an optimistically responsive consensus protocol that tolerates f < n/3 static byzantine faults in the partially synchronous network model. While Simplex Consensus "matches the best optimistic liveness bounds, while at the same time matching and even improving the best known pessimistic liveness bounds (both in expectation and in the worst-case)," its application to public blockchains with large validator sets has been limited by its usage of multiple all-to-all broadcasts per round. Chan and Pass suggest that deployment of Simplex Consensus in such environments may be achieved by "subsampling the committee of voters" to reduce the number of messages sent and received by any participant, akin to Algorand.

To bound the probability of failure on any composed committee (f ≥ n_c/3, where n_c is the size of the committee) to a reasonable level, however, Algorand parameterizes each committee to contain thousands of participants drawn from all "online tokens". For some critical operations, the committees are parameterized to include over 5000 participants (which bounds the risk of failure to < 2^-40 if f < 0.2873n, where n is the total population of "online tokens"). This parameterization is made practically efficient by allowing operators that control multiple selected tokens to aggregate their participation using a technique referred to as "party aggregation". In a scenario where a large population of operators each own the same number of tokens (where each committee participant in expectation has ~1 selected token), the number of messages sent and received by each participant in each round will still be significant.

Vena is an extension of Simplex Consensus tailored for deployment on public blockchains with large validator sets that retains both optimistic responsiveness and the ability to confirm blocks with a single, honest leader. Vena takes a different approach to scaling participation than that suggested by Chan and Pass. Rather than subsampling the committee of voters, Vena employs a form of "optimistic" aggregation committee broadcast that, when successful, results in all participants, including the leader, sending less than n messages in each round. We refer to Vena's aggregation committee broadcast as "optimistic" because validators always attempt to use it at least once for broadcast per round, even in some cases where they end up voting for the "dummy" block (the fallback vote after a timeout). Vena remains correct in the presence of f < n/3 static byzantine faults, unlike alternative protocols that operate over a subsample of the total population of participants (review the performance of any configuration when f_p = 1/3).

Aggregation committees, popularized by the ETH2 Altair Fork, are some randomly assigned subset of validators that work together to aggregate their communications with a broader network (often by sending messages to other aggregation committees rather than all other participants) to minimize the communication complexity of the broader population of which they are part. Each aggregation committee typically consists of some set of "aggregators" to which all committee members send their messages. Likewise, each aggregator also pools messages sent by other committees (specifically the aggregators in those committees) and forwards them to all committee members once they have a sufficient stake weight. As long as there exists one honest aggregator in an aggregation committee, messages generated by honest participants in the committee will be forwarded to other aggregators and messages sent to a committee will be forwarded to its participants.

Aggregation committees in Ethereum, to accommodate a now superseded data sharding model and to protect RANDAO, are tuned to never incur a supermajority byzantine takeover. Vitalik Buterin concluded that a minimum committee size of 111 was required to maintain a 2^-40 probability of a supermajority byzantine takeover when drawing from an equal-weight population that is 1/3 dishonest. Ethereum opted to use 128 members because it is the next higher power of 2. Likewise, the target number of aggregators in each committee is set to 16 to avoid the scenario where all aggregators in a committee are dishonest. Vena has no such requirements around aggregation committee construction and can recover from any assignment of participants and/or aggregators with a single, all-to-all broadcast.

Vena's capacity to recover from "uncooperative" aggregation committee assignments enables aggressive tuning of the broadcast parameterization to reduce message complexity when the validator set is predominantly honest and online. If things suddenly deteriorate, the all-to-all broadcast fallback ensures safety and liveness will be maintained (as long as there are f < n/3 byzantine faults, where n is the number of total participants). Optimizing consensus for the "happy path" and falling back to a more expensive but rarely invoked mechanism to recover was inspired by Frosty; however, unlike Frosty, Vena does not require voting to enter an explicit "recovery epoch", nor does it require running a separate consensus protocol in said "recovery epoch" to progress to the next round.

This post concludes with a section that outlines future optimizations for Vena. First, it offers an overview of a streamlined variant of Vena (similar to the streamlined variant proposed in Simplex Consensus) where a vote for h+1 is sent in the same message as a finalize for h, reducing the number of messages sent in each successful round by 50%. This proposed variant, like Vena, does not require consecutive honest leaders to finalize h but does increase confirmation latency. Second, it outlines a mechanism for validators to individually adjust their aggregation committee broadcast strategy based on their observation of network conditions (unseen notarizations included in newly constructed blocks and the dummy block rate). This subjective, out-of-consensus tuning would allow validators to increase aggregation committee robustness even when a standard block cannot be finalized (where a synchronized parameter change or validator eviction could be done). Lastly, it outlines a technique to further reduce message complexity (for very large validator sets) using a multi-layered aggregation committee broadcast where committees can be composed of aggregators who mediate the broadcast to other committees. Vena assumes a participant depth of 1; however, this isn't required and could be increased to reduce message complexity at the cost of additional latency.

The deployment of consensus protocols to public blockchains in byzantine environments is rife with complex tradeoffs between communication complexity, confirmation latency, and fault tolerance. Vena explores a favorable balance between these tradeoffs for blockchains with large validator sets. It is left to future work to construct a formal proof of correctness for Vena.

Protocol Description

Vena borrows terminology and techniques from Simplex Consensus. It is strongly recommended that you familiarize yourself with the linked work before continuing.

Assumptions

  • Vena is evaluated under the partial synchrony model. There exists some upper bound on the message delivery time (Δ) that is known to all validators; however, this bound only holds after some unknowable but finite "Global Stabilization Time" (GST). The true message delivery time (δ) is less than or equal to Δ after GST and varies by participant.
  • All validators (n) participating in consensus for a given height h are static and known by all other validators. The set of validators participating in consensus may change between heights but it does not have to.
  • At a given height h, there are f < n/3 static byzantine faults. Byzantine actors at a given height h can act arbitrarily (i.e. they can deviate from the protocol in any way).
  • All validators have equal weight at a given h (assumed to simplify protocol description and analysis but not required for implementation).
  • There exists some mechanism that can give a random and unbiased ordering of validators at each height h. This mechanism can be run locally and deterministically by any validator.
  • Each validator has a known identity. Validators can sign arbitrary messages as this identity and other validators can verify such messages uniquely originated from said validator. In other words, there exists a basic PKI.
  • There exists some cryptographic scheme for aggregating signed messages of any subset of validators. This aggregated signature, along with a list of the identities that participated, can be verified by any validator. It is assumed that this signature and accompanying list of identities is smaller than the concatenation of all aggregated signatures (a minimal interface sketch follows this list).
  • There exists a cryptographic hash function that can be used to hash arbitrary payloads. There is no known attack on this hash function that produces two different payloads with the same hash.
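The signature aggregation assumption can be made concrete with a small interface. The following Go sketch is illustrative only (type and method names are not part of Vena's specification); any scheme with these properties, such as a BLS multi-signature, suffices:

```go
package vena

// PublicKey identifies a validator; Signature is an individual or aggregated
// signature over a message. Both are opaque, scheme-specific byte strings.
type (
	PublicKey []byte
	Signature []byte
)

// AggregationScheme captures the properties Vena assumes of the signature
// scheme: individual signatures can be verified, aggregated, and the
// aggregate verified against the list of identities that contributed.
type AggregationScheme interface {
	Verify(signer PublicKey, msg []byte, sig Signature) bool
	Aggregate(sigs []Signature) (Signature, error)
	// VerifyAggregate checks that every listed signer signed msg. The
	// aggregate plus the signer list is assumed to be smaller than the
	// concatenation of the individual signatures.
	VerifyAggregate(signers []PublicKey, msg []byte, aggregate Signature) bool
}
```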

Sketch

Vena operates over a chain of blocks where each block is a tuple of <h, []notarization, parent, payload>. At a given height h, there exists either a "standard" block or a "dummy" block (of form <h, ⊥, ⊥, ⊥>) that does not extend the blockchain. []notarization is an array of notarizations (i.e. a BLS Multi-Signature), each over the hash of some block (either standard or dummy) from height(parent) to h−1, inclusive. The dummy block does not contain a notarization for h−1 (if it did, it may not be possible to achieve a quorum with a dummy block vote). parent is the hash of some standard block that the block at h extends. payload is a set of arbitrary bytes contained in the block at h (i.e. transactions).
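As an illustration, the block tuple could be represented as follows in Go (field names and types are assumptions of this sketch, not a wire format):

```go
package vena

// Digest is the output of the assumed collision-resistant hash function.
type Digest [32]byte

// Notarization is an aggregate of 2f+1 vote signatures over the hash of a
// block (standard or dummy) at a single height.
type Notarization struct {
	Height             uint64
	BlockHash          Digest
	AggregateSignature []byte
	Signers            []uint32 // indices into the validator set at Height
}

// Block is the tuple <h, []notarization, parent, payload>. A dummy block at
// height h carries no notarizations, parent, or payload and never extends
// the blockchain.
type Block struct {
	Height        uint64
	Notarizations []Notarization // one per height from height(parent) to h-1
	Parent        Digest         // hash of the standard block this block extends
	Payload       []byte         // arbitrary bytes (i.e. transactions)
}
```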

Vena employs a form of "optimistic" aggregation committee broadcast that, when successful, results in all participants, including the leader, sending less than n messages in a given round. Each aggregation committee is composed of some subset of validators, of which some smaller subset are given the unique role of "aggregator". Aggregators collect signed messages from members of a committee, aggregate their signatures, and forward aggregated messages to the aggregators of other committees. When receiving messages from other committees, aggregators buffer signatures until there exist 2f+1 signatures over a message, at which point the message and aggregate signature are forwarded to all members of the committee to which the aggregator is assigned.

Validators only belong to one aggregation committee for a given height h and rotate which committee they belong to at each height (this can be configured to be stable for longer periods of time). This assignment of validators to aggregation committees is performed using a deterministic and bias-resistant random oracle. One leader is randomly selected per height h to propose a block. The leader is responsible for proposing a block and broadcasting it to the aggregators of all aggregation committees. After broadcast, the leader (who is a member of some aggregation committee but is not an aggregator) behaves as a normal validator.

Happy Path

Vena is optimistically responsive (i.e. blocks are finalized at network speed) when there is an honest leader and a "cooperative" aggregation committee assignment. A cooperative assignment exists when a leader's standard block b at h can be (if there exists an honest leader) broadcast to 2f+1 honest participants before 3Δ, 2f+1 vote messages for b from honest participants reach 2f+1 honest participants before 7Δ, and 2f+1 finalize messages for b from honest participants eventually reach 2f+1 honest participants (timeouts explained in detail later). In other words, a cooperative assignment is simply a round where the constructed aggregation committees provide a functional approximation of multicast.

To see how Vena works in practice, let's first walk through this "happy path". To begin, the honest leader first sends a standard block b to the aggregators of all aggregation committees. The aggregators then forward the block to the participants of their committee:

[Figure: Block Broadcast — the leader sends {propose, h, b_h}_l to the aggregator(s) of each aggregation committee, and each aggregator forwards the block to the participants of its committee.]

After a block is distributed to all aggregation committees, participants in each committee vote on the block. Participants broadcast their vote messages to the aggregators of their committee. Aggregators aggregate these vote messages and forward the aggregated committee vote to the aggregators of other committees. To retain optimistic responsiveness, aggregators use message weight rather than timeouts to determine when to forward messages to other aggregation committees. Aggregators send an initial aggregate message as soon as they collect an initial message weight, the "Initial Broadcast Weight" (ib), and then send some number of additional aggregate messages for every "Delta Broadcast Weight" (db) they collect thereafter. ib and db can be dynamically adjusted based on network conditions and desired tolerance to byzantine faults. When an aggregator has buffered 2f+1 vote messages, it forwards the aggregated notarization to all participants in its committee and to the leader of h+1 (to ensure they enter the next round within δ of the first honest aggregator):

[Figure: Vote Broadcast — participants send {vote, h, b_h}_p to the aggregators of their committee; aggregators exchange aggregated {vote, h, b_h}_a messages with the aggregators of other committees and forward the resulting notarization to their committee participants.]
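The weight-triggered forwarding described above can be sketched as follows. The thresholds ib and db are treated as fractions of total committee weight, the signature aggregation itself is elided, and all names are illustrative:

```go
package vena

// committeeAggregator buffers vote (or finalize) signatures from its committee
// and forwards a cumulative aggregate once the "Initial Broadcast Weight" (ib)
// is reached, then again after every additional "Delta Broadcast Weight" (db).
type committeeAggregator struct {
	committeeWeight float64 // total weight of the committee
	ib, db          float64 // thresholds, as fractions of committeeWeight
	collected       float64 // weight aggregated so far for this message
	lastBroadcast   float64 // weight at the most recent broadcast (0 if none yet)

	// broadcast sends the cumulative aggregate (not just the newly added
	// signatures) to the aggregators of all other committees.
	broadcast func(cumulativeWeight float64)
}

// onSignature is invoked once per unique committee member signature over the
// message being aggregated.
func (a *committeeAggregator) onSignature(weight float64) {
	a.collected += weight
	switch {
	case a.lastBroadcast == 0 && a.collected >= a.ib*a.committeeWeight:
		// Initial Broadcast: the ib threshold was crossed for the first time.
		a.lastBroadcast = a.collected
		a.broadcast(a.collected)
	case a.lastBroadcast > 0 && a.db > 0 &&
		a.collected-a.lastBroadcast >= a.db*a.committeeWeight:
		// Delta Broadcast: still sends the cumulative aggregate.
		a.lastBroadcast = a.collected
		a.broadcast(a.collected)
	}
}
```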

Upon receiving 2f+1 vote messages (referred to as a "notarization" for h), the participants in each committee begin voting on h+1 (referred to as "entering the next round") and attempt to finalize the block at h. Like before, participants broadcast their finalize messages to the aggregators of their committee. Aggregators aggregate these finalize messages and forward the aggregated finalization to the aggregators of other committees (again, once they collect signatures from ib of the stake weight of the committee and then subsequently every db). When an aggregator has buffered 2f+1 finalize messages, it forwards the aggregated finalization to all participants in its committee:

[Figure: Finalize Broadcast — participants send {finalize, h}_p to the aggregators of their committee; aggregators exchange aggregated {finalize, h}_a messages with the aggregators of other committees and forward the resulting finalization to their committee participants.]

Hooray! The block b is confirmed!

Optimistic Dummy Block Vote

In the case that a validator has not seen a block at height h by 3Δ, it will optimistically send a vote for the dummy block via its assigned aggregation committee. Like Simplex Consensus, any honest participant in a given round h will send either a dummy block vote message or a standard block finalize message. As soon as a participant begins the dummy block voting procedure, it will never contribute to finalizing a standard block at h. Vena's aggregation committees are considered "optimistic" because validators attempt to use said aggregation committees even though there is no proof that they are functional (unbeknownst to the validator, the reason it may be voting for the dummy block is that all aggregators are byzantine and did not forward the leader's standard block). A useful byproduct of optimistic aggregation committee broadcast is that dishonest leaders cannot force validators to fall back to all-to-all communication by not broadcasting a block in a timely manner.

[Figure: Vote (Dummy) Broadcast (Optimistic) — participants send {vote, h, ⊥_h}_p to the aggregators of their committee, which aggregate and exchange {vote, h, ⊥_h}_a messages just like a standard vote broadcast.]

Pessimistic Dummy Block Vote

When all the aggregators for a given committee are dishonest/offline, committee participants may not learn about new blocks, messages sent by committee participants may not reach other aggregation committees, aggregate messages from other aggregation committees may not be received, and/or notarizations may not be sent to the leader at h+1. More than one aggregator can be assigned to each committee to increase the likelihood that there is at least one honest aggregator. The robustness and communication complexity of different aggregation committee configurations (the two are directly correlated) are explored in detail in later sections.

[Figure: Vote Broadcast (Bad Aggregator) — a committee whose only aggregator is faulty neither forwards its participants' aggregated {vote, h, b_h} messages to other committees nor delivers aggregated messages from other committees to its participants.]

If a large number of aggregators are offline or byzantine, the aggregation committee assignment in a round may be "uncooperative" and it may not be possible to produce a notarization for any block at h using aggregation committee broadcast. If a validator has not seen a notarization for some block (standard or dummy) at h by 7Δ, it enters an all-to-all broadcast phase where it sends a vote for the dummy block to all other validators. This broadcast fallback does not require any aggregation committee reconfiguration, opting instead for all validators to "bail out" of a given configuration and try again at h+1.

[Figure: Vote (Dummy) Broadcast (Fallback) — after the 7Δ timeout, every participant broadcasts its dummy block vote directly to all other validators, bypassing the aggregation committees.]
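Putting the two timeouts together, a participant's per-round timer handling might look like the following minimal sketch (T_s = 3Δ, T_n = 7Δ; message construction and transport are stubbed out, and names are illustrative):

```go
package vena

import "time"

// roundTimers holds the two dummy block timeouts a participant arms upon
// entering round h: T_s (standard block timeout, 3Δ) triggers an optimistic
// dummy block vote through the participant's aggregation committee, and
// T_n (notarization timeout, 7Δ) triggers the all-to-all fallback vote.
type roundTimers struct {
	ts, tn *time.Timer
}

// startRound arms both timers. maxDelay is the known message delay bound Δ
// (not the true delay δ); the two callbacks stand in for the committee and
// all-to-all dummy block votes.
func startRound(maxDelay time.Duration, dummyVoteToCommittee, dummyVoteToAll func()) *roundTimers {
	return &roundTimers{
		ts: time.AfterFunc(3*maxDelay, dummyVoteToCommittee), // optimistic dummy block vote
		tn: time.AfterFunc(7*maxDelay, dummyVoteToAll),       // pessimistic dummy block vote
	}
}

// onStandardBlock cancels T_s once the leader's block arrives via an aggregator.
func (r *roundTimers) onStandardBlock() { r.ts.Stop() }

// onNotarization cancels both timers once 2f+1 votes for some block are seen.
func (r *roundTimers) onNotarization() { r.ts.Stop(); r.tn.Stop() }
```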

It is possible that a participant may, by chance, be included in a "run" of faulty aggregation committees and not see a notarization or finalization for some time (potentially much later than other participants, especially if almost all other aggregation committees are stable). If this occurs, the participant will stall until it eventually receives a notarization for h+k−1 when receiving a block broadcast at h+k (where k is the number of heights the participant has been stalled). At this time, said participant will backfill the standard block ancestry of h+k and enter h+k (as it will have seen a notarization for h through h+k−1). If there is an honest leader and a cooperative aggregation committee assignment, this permits 2f+1 honest participants to then send vote messages for (and eventually notarize) the standard block at h+k (even though they potentially entered h+k long after the leader and/or other participants).

Undermining the Next Leader

The leader in round h has 3Δ (evaluated against the local timer set by the first participant entering round h) to distribute a standard block b to 2f+1 honest participants for said block to have any chance of being notarized (and eventually included in the canonical blockchain). It takes up to Δ for a standard block b to be sent from an honest leader to all aggregators and up to Δ for honest aggregators to forward the block to all participants in their committee. Thus, an honest leader must enter h within Δ of the first honest participant for the standard block they produce to be notarized.

Said differently, if a byzantine adversary of f can force any honest participant to enter round h more than Δ before the leader (e.g. they control all aggregators for the committee containing the leader), they can prevent the notarization of a standard block at h produced by an honest leader even if the aggregation committee assignment is cooperative. If repeated, this could allow an adversary to significantly reduce the number of standard blocks included in the canonical blockchain.

The dummy block vote timeout semantics are slightly different in Simplex Consensus (timeout after not seeing a notarization rather than after not seeing a block); however, the same potential attack applies. Simplex defends against this by requiring all honest participants that see a notarization for h−1 to "multicast [their] view of the notarized blockchain to everyone else." This ensures all honest participants enter h within δ of the first honest participant (providing enough time to construct a notarization before timeout). Vena could introduce the same mechanism and require all honest participants to send a broadcast to all other participants when they see a notarization for h−1; however, this would arguably defeat the purpose of introducing aggregation committees in the first place (as the "happy path" would no longer operate with sublinear message complexity).

Vena, instead, requires all honest aggregators to forward any notarization they see for h−1 to the leader at h when sending said notarization to all participants in their committee. This ensures that if a notarization is constructed through aggregation committee broadcast, the leader at h will see a notarization for h−1 within δ of the first honest aggregator. Aggregators (assuming a cooperative assignment) are the first honest participants to enter h.

If no honest aggregators see a notarization for h−1, it is still possible that the leader at h will not enter h within Δ of the first honest participant, and a standard block notarization at h will be impossible to construct (even if there is an honest leader and a cooperative assignment at h). Consider some aggregation committee assignment where all aggregators across all committees are dishonest but not crashed. They follow the protocol as described previously (forwarding a standard block from an honest leader, aggregating vote messages from their committee, forwarding aggregate vote messages to other committees) but deviate in 2 specific flows: once each aggregator collects 2f+1 vote messages for b, it forwards the notarization to every committee participant except the leader of the next round h+1, and it does not forward said notarization to the leader at h+1 if the leader is not a committee participant. We later explore how the dummy block rate (especially when a standard block is seen after the timeout) can be used to locally adjust aggregation committee assignment robustness to counteract sudden changes in byzantine participation without a consensus decision (adding more aggregators makes it less likely that no honest aggregator will see a notarization if any honest committee participant does).

For completeness, if the aggregation committee assignment is not cooperative, the all-to-all dummy block vote that occurs at 7Δ also results in an honest leader entering h within δ of the first honest participant. There is a thoughtful analysis of this case in the original Simplex Consensus paper.

Handling Conflicting Notarizations

Participants enter round h+1 as soon as they see a notarization for a standard block or a dummy block at h; however, they will only finalize h if they see 2f+1 finalize messages for a standard block at h. Recall, the dummy block never extends the blockchain (and is thus never finalized). A participant may see up to 2 notarizations per height h (one for the standard block and one for the dummy block). If this occurs, the leader of h+1 can choose to include either notarization (it may not have seen the other at the time of block broadcast) and build off of the standard block at h or some earlier standard block (that does not conflict with the finalized tip), as long as it can provide all dummy block notarizations between height(parent) and h−1, inclusive.
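This notarization-coverage rule can be checked mechanically. A hedged sketch (helper names are illustrative, and verification of the aggregate signatures themselves is omitted):

```go
package vena

import "fmt"

// checkNotarizationCoverage verifies that a proposed block at height h,
// extending a parent at parentHeight, carries one notarization per height in
// [parentHeight, h-1], matching the rule described above.
func checkNotarizationCoverage(h, parentHeight uint64, notarizedHeights []uint64) error {
	seen := make(map[uint64]bool, len(notarizedHeights))
	for _, nh := range notarizedHeights {
		seen[nh] = true
	}
	for height := parentHeight; height < h; height++ {
		if !seen[height] {
			return fmt.Errorf("missing notarization for height %d", height)
		}
	}
	return nil
}
```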

Algorithm

When Validator p enters round h:

  1. Start timers: T_s = 3Δ and T_n = 7Δ
  2. Compute a random shuffle over all validators at height h: s_h = Shuffle(h, v_h)
  3. Determine the leader for h: l_h = s_h[0]
  4. s_{h+1} = Shuffle(h+1, v_{h+1})
  5. l_{h+1} = s_{h+1}[0]
  6. For each committee i (n_c committees of size n_cp):
    1. Select n_ca aggregators: c_i^a = v[i·n_cp : i·n_cp + n_ca]
    2. Select n_cp committee participants: c_i^p = v[i·n_cp : i·n_cp + n_cp]
  7. c^a = ⋃_{i=0}^{n_c−1} c_i^a (all aggregators; see the sketch after this list)
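Steps 2–7 can be sketched in Go as follows (the Shuffle oracle is assumed, and the rule that the leader is never an aggregator, noted later, is elided here):

```go
package vena

// roundAssignment is the per-height role assignment derived from the shuffle.
type roundAssignment struct {
	leader         int     // s_h[0]
	committees     [][]int // c_i^p: n_c committees of n_cp participants each
	aggregators    [][]int // c_i^a: the first n_ca members of each committee
	allAggregators []int   // c^a: the union of c_i^a over all committees
}

// assignRoles mirrors steps 2-7 above. shuffled is the output of
// Shuffle(h, v_h), nc is the number of committees, and nca the number of
// aggregators per committee; len(shuffled) must be divisible by nc.
func assignRoles(shuffled []int, nc, nca int) roundAssignment {
	ncp := len(shuffled) / nc
	a := roundAssignment{leader: shuffled[0]}
	for i := 0; i < nc; i++ {
		committee := shuffled[i*ncp : i*ncp+ncp]
		a.committees = append(a.committees, committee)
		a.aggregators = append(a.aggregators, committee[:nca])
		a.allAggregators = append(a.allAggregators, committee[:nca]...)
	}
	return a
}
```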

If p = l_h:

  1. Build standard block b = <h, []notarization, parent, payload>
  2. Send block b to c^a

If p ∈ c_i^a:

  1. Upon receiving the first standard block b from l_h, send the block to c_i^p
  2. Upon receiving ib votes from c_i^p for some block q:
    1. Initial Vote Broadcast: Aggregate the ib votes for q and send to c^a
    2. Delta Vote Broadcast: For every db vote messages received after the initial broadcast, send the updated aggregation to c^a
  3. Upon receiving 2f+1 votes for some block q:
    1. Construct a notarization over q and send to l_{h+1} and c_i^p
  4. Upon receiving ib finalize messages from c_i^p for some block q:
    1. Initial Finalize Broadcast: Aggregate the finalize messages for q and send to c^a
    2. Delta Finalize Broadcast: For every db finalize messages received after the initial broadcast, send the updated aggregation to c^a
  5. Upon receiving 2f+1 finalize messages for some block q:
    1. Construct a finalization over q and send to c_i^p

If p ∈ c_i^p:

  1. Upon receiving the first standard block b from some Aggregator a ∈ c_i^a, cancel T_s (if not fired):
    1. Standard Block Vote: If p has not yet voted for a block in round h, send a vote for b to c_i^a
  2. Upon receiving 2f+1 votes for some block q, cancel T_s and T_n (if not fired):
    1. Standard Block Notarization: If T_s and T_n have not fired, send a finalize message to c_i^a and enter round h+1
    2. Dummy Block Notarization: Enter round h+1
  3. Upon receiving 2f+1 finalize messages for standard block b, finalize block b (transitively finalizing its ancestry)
  4. Upon T_s firing:
    1. Optimistic Dummy Block Vote: Broadcast a vote for the dummy block to c_i^a
  5. Upon T_n firing:
    1. Pessimistic Dummy Block Vote: Broadcast a vote for the dummy block to v_h

Vena is a Permissioned Consensus protocol and assumes all validators are known and fixed prior to the start of each round (validators can still change between rounds). An Aggregator a in a Committee c serves as both an Aggregator and a Committee Participant (it also sends its own vote and finalize messages). The Leader l is never an Aggregator.

Participants only broadcast a vote for the first non-dummy block they receive at a given height h that does not conflict with their finalized tip. Observing a block broadcast that conflicts with the finalized tip (under some conditions), observing multiple block broadcasts at a given height h, or observing a participant sending a vote/finalize message for conflicting standard blocks at h could be used to apply a penalty (i.e. slashing).

Vena participants begin voting for the dummy block as soon as T_s (the "standard block timeout") fires. This differs from Simplex Consensus, where participants only begin voting after their notarization timeout fires. Simplex Consensus could adopt this technique to reduce the expected round latency for dishonest leaders.

Initial Broadcast Weight (ib) and Delta Broadcast Weight (db) are configurable and can be dynamically adjusted based on network stability and desired tolerance to byzantine faults (where less required tolerance can lead to more "dummy blocks" but can reduce message complexity without reducing safety tolerance). When sending a delta weight broadcast, aggregators still send the cumulative aggregated message to other aggregators (not just an aggregation of the signatures received since the last broadcast). This doesn't require extra bandwidth or computation by the broadcasting aggregator and makes tracking aggregated messages easier for the receiving aggregator.

A cooperative assignment exists when aggregation committees are constructed such that the standard block b can be (if there exists an honest leader) broadcast to 2f+1 honest participants before 3Δ, 2f+1 vote messages for b from honest participants reach 2f+1 honest participants before 7Δ, and 2f+1 finalize messages for b from honest participants reach 2f+1 honest participants.

To finalize a standard block b at h (and transitively its ancestry), there must exist some honest leader that enters h within Δ of the first honest participant, said leader must produce and broadcast a valid standard block b at h, and there must be a cooperative aggregation committee assignment at h. While byzantine faults are static, the behavior of any byzantine actor is not. It is possible for some actor to vote honestly for h and then refuse to produce a finalize message for h or to aggregate committee finalize messages for h (if an aggregator). If this occurs, a standard block may be notarized at h but never finalized at h.

The timer to broadcast an optimistic dummy block vote is set to 3Δ because the latest possible time an honest leader will enter h (if there is at least one honest aggregator that sees a notarization in h−1) is δ (evaluated against the local timer set by the first participant entering round h). It will then take at most 2Δ + δ (against that same timer) for their standard block to reach the first honest participant entering round h.

The timer to broadcast a pessimistic dummy block vote is set to 7Δ because all honest participants in a cooperative assignment in h−1 should enter h within δ of the first honest aggregator in their assigned committee (which should itself be within δ of other aggregators). If there is a cooperative assignment in h, it takes 3Δ (after the 3Δ timeout) for an optimistic dummy block vote to be communicated over the aggregation committees. This means that the latest a participant should see said notarization is 6Δ + δ (evaluated against the local timer set by the first participant entering round h). Note, Vena is not optimized for the case where there is a significant number of uncooperative assignments (in this case, aggregator robustness should be increased).
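The two timer settings can be summarized as the following bounds, measured against the local timer of the first participant to enter round h (with δ ≤ Δ after GST):

$$
\underbrace{\delta}_{\text{leader entry}} + \underbrace{\Delta}_{\text{leader}\to\text{aggregators}} + \underbrace{\Delta}_{\text{aggregators}\to\text{participants}} \;=\; 2\Delta + \delta \;\le\; 3\Delta = T_s
$$

$$
\underbrace{\delta}_{\text{entry skew}} + \underbrace{3\Delta}_{T_s\text{ timeout}} + \underbrace{3\Delta}_{\text{committee broadcast of the dummy vote}} \;=\; 6\Delta + \delta \;\le\; 7\Delta = T_n
$$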

It is important that honest participants don't accept standard block votes from other participants that aren't aggregators. Otherwise it would be easier for a byzantine adversary to cause an honest node to enter h+1 more than Δ before the leader for h+1.

Other consensus protocols have reduced latency and/or message complexity in the "happy path" by only sending certain messages to the leader in round h or h+1 (at the cost of worse performance in the "worst case"). Similar changes could be made to Vena by modifying aggregators to send votes to the leader at h+1 and relying on the leader to broadcast the notarization for h in h+1. There are other complexities of pursuing this approach that are left as future work to explore.

Optimistic Confirmation Latency

To make the comparisons in this section easier, we assume all participants enter a given round of consensus at the same time.

With the introduction of aggregation committees, Vena is not (nor does it attempt to be) as fast as Simplex Consensus. However, it is still optimistically responsive when there is an honest leader and a cooperative aggregation committee assignment. On the optimistically responsive path, Vena confirms blocks 2.6x slower than Simplex Consensus (8δ rather than 3δ). The block time in Vena, on this path, is 1.6x slower than Simplex Consensus (5δ rather than 3δ).

Block Broadcast:
* [leader] δ broadcast block to committee aggregators
* [aggregator] δ broadcast block to committee participants

Block Vote:
* [participant] δ broadcast vote message to committee aggregators
* [aggregator] δ broadcast aggregate vote message to other committee aggregators
* [aggregator] δ broadcast aggregate vote message to participants

Block Finalize:
* [participant] δ broadcast finalize message to aggregators
* [aggregator] δ broadcast aggregate finalize message to other committee aggregators
* [aggregator] δ broadcast aggregate finalize message to participants

Block Notarized: 5δ
Block Finalized: 8δ

If confirmation latency is more important than message complexity when there is a dishonest leader or aggregation committees are uncooperative, Vena can be configured to avoid optimistic dummy block voting and immediately transition to all-to-all broadcast. This would result in a notarization latency of 4Δ in the "unhappy path" instead of 8Δ (the same as Simplex Consensus).

Simplex Consensus, for comparison, has the following confirmation latency profile:

Block Broadcast:
* [leader] δ broadcast block to participants

Block Vote:
* [participant] δ broadcast vote message to all other participants

Block Finalize:
* [participant] δ broadcast finalize message to all other participants

Block Notarized: 2δ
Block Finalized: 3δ

To better reason about 3δ and 8δ, let us also translate Snowman Consensus to this model. Snowman Consensus, like Simplex and Vena, is optimistically responsive and routinely finalizes blocks in ~1.5s over 1800 validators. Confirmation occurs in time 12δ assuming that all polls are successful:

k = 20
beta = 20
concurrent polls = 4

* [participant] 2δ poll for preference (block discovery)
* [participant] 2δ poll for preference
* [participant] 2δ poll for preference
* [participant] 2δ poll for preference
* [participant] 2δ poll for preference
* [participant] 2δ poll for preference

Block Finalized: 12δ

In Snowman Consensus, there is no dedicated block broadcast step. Instead, participants poll each other for new blocks and upon learning of one begin to vote on it (after fetching it from the peer that first informed it of its existence). Snowman Consensus collects no voting artifacts from participants and requires an out-of-protocol mechanism to generate certificates of acceptance (Vena provides this out-of-the-box).

Communication Profile

The communication complexity of Vena (prior to all-to-all broadcast) is primarily dependent on:

  • n_c: the number of aggregation committees
  • n_ca: the number of aggregators per committee
  • n_cp: the number of participants per committee
  • ib: the initial broadcast weight
  • db: the delta broadcast weight

As discussed previously, Vena does not require committees to be supermajority honest for correctness, so committees don't need to contain a significant number of validators. Vena, however, is more sensitive to the number of honest aggregators in each committee, as one honest aggregator is required to forward messages between honest participants in different committees and honest aggregators help honest leaders enter the next round in a timely manner. If there are no honest aggregators in a committee (even if all non-aggregators are honest), their messages (outside of a fallback all-to-all dummy block vote) will never reach other validators. Likewise, setting the value of ib too high means that even when there are honest aggregators, aggregation committee signatures may never be forwarded to other committees. Setting the value of db too low means that the number of messages sent by aggregators (not non-aggregators) could far exceed the messages they would send in Simplex Consensus.

We can model the communication complexity of Vena as follows:

c = number of aggregation committees
a = number of aggregators per committee
p = number of participants per committee
b = signature broadcasts (sum of initial and delta broadcasts)

Block Broadcast:
* leader: ca (incoming) + ca (outgoing)
* aggregator: 1 (incoming) + p (outgoing)
* participant: a (incoming)

Block Vote (leaving committee):
* participant: a (outgoing)
* aggregator: p (incoming) + cab (outgoing)

Block Vote (entering committee):
* aggregator: cab (incoming) + p + 1 (outgoing)
* participant: a (incoming)

Block Finalize (leaving committee):
* participant: a (outgoing)
* aggregator: p (incoming) + cab (outgoing)

Block Finalize (entering committee):
* aggregator: cab (incoming) + p (outgoing)
* participant: a (incoming)

Summary (incoming):
* leader: ca
* aggregator: 1 + 2(p + cab)
* participant: a + 2a

Summary (outgoing):
* leader: ca
* aggregator: p + 2(cab + p) + 1
* participant: 2a

Example without Dummy Block (c=32, a=1, p=64, b=1):
* leader (also participant): 32 (incoming) + 32 (outgoing)
* aggregator: 193 (incoming) + 225 (outgoing)
* participant: 3 (incoming) + 2 (outgoing)

Example without Dummy Block (c=32, a=4, p=64, b=1):
* leader (also participant): 128 (incoming) + 128 (outgoing)
* aggregator: 385 (incoming) + 449 (outgoing)
* participant: 12 (incoming) + 8 (outgoing)

It is left to future work to better optimize the broadcast of the block within an Aggregation Committee. As currently specified, each Aggregator will forward the entire block to each Committee Participant.

Simplex Consensus has the following communication profile:

n = number of validators

Block Broadcast:
* leader: n (outgoing)
* participant: 1 (incoming)

Block Vote:
* participant: n (outgoing) + n (incoming)

Notarized View:
* participant: n (outgoing) + n (incoming)

Block Finalize:
* participant: n (outgoing) + n (incoming)

Summary (incoming):
* leader: 3n
* participant: 1 + 3n

Summary (outgoing):
* leader: 4n
* participant: 3n

Example (n=2048):
* leader: 6144 (incoming) + 8192 (outgoing)
* participant: 6145 (incoming) + 6144 (outgoing)

To reduce the message complexity of Simplex, one could batch the notarized view broadcast with the block finalize broadcast on the "happy path".

Again, to put things in context, let us also represent Snowman Consensus in this model. Snowman Consensus, unlike many other consensus mechanisms, only requires an expected constant communication overhead per processing in the "happy path" (when the protocol is not under substantial byzantine attack):

k = number of participants per poll
beta = 20

Block Discovery:
* participant: 1 (outgoing) + 1 (incoming)

Poll for Preference:
* participant: k (outgoing) + k (incoming)

Summary (incoming): 1 + 20k
Summary (outgoing): 1 + 20k

Example (n=2048, k=20):
* participant: ~401 (incoming) + ~401 (outgoing)

We assume all Snowman Consensus polls are successful and symmetric to simplify the analysis.

Exploring the Tradeoff Between Committee Robustness and Message Complexity

Depending on the desired tolerance to byzantine faults when confirming standard blocks, there are many ways to reasonably compose Vena's aggregation committees (such that the communication complexity is expected to be less than Simplex Consensus). To help explore the tradeoff between robustness and message complexity, we simulate the performance of Vena over a range of possible values of a (the number of aggregators per committee), ib (the initial broadcast weight), and db (the delta broadcast weight) for a given c (the number of aggregation committees) and n (the number of validators). The code for these simulations is included at the end of this write-up and can be modified to explore different settings of n_c and n (which default to 32 and 2048, respectively). The simulation is run over 10,000 samples for each configuration and the percentage of cooperative samples (where a standard block is confirmed) is recorded (we always assume the leader is honest). Recall, a failure of a given sample just means that the standard block was not confirmed; it does not mean that the protocol stalled (as participants always fall back to all-to-all broadcast).

Robustness

Aggregators = 1

[Figure: simulated standard block confirmation rate vs. byzantine rate with 1 aggregator per committee]

Aggregators = 2

[Figure: simulated standard block confirmation rate vs. byzantine rate with 2 aggregators per committee]

Aggregators = 4

[Figure: simulated standard block confirmation rate vs. byzantine rate with 4 aggregators per committee]

Aggregators = 8

[Figure: simulated standard block confirmation rate vs. byzantine rate with 8 aggregators per committee]

Aggregators = 12

[Figure: simulated standard block confirmation rate vs. byzantine rate with 12 aggregators per committee]

Messages

Leader

[Figure: messages sent/received by the leader]

Aggregator

[Figure: messages sent/received by an aggregator]

Committee Participant

[Figure: messages sent/received by a committee participant]

Further Optimization Directions

Streamlining the Protocol

As noted in the original Simplex Consensus paper:

To recover a streamlined protocol, it is possible to “piggyback” the finalize message onto the first vote message of the next block; this would only make liveness a little slower, and consistency would still hold.

A similar optimization could be applied to Vena to reduce the number of messages attributable to each round h. Unlike many other "streamlined" protocols, this optimization could be introduced without requiring consecutive honest leaders/healthy aggregation committees to finalize a standard block. To see why this is true, let us consider a scenario where some honest leader broadcasts a block at h, 2f+1 participants vote for h, and the notarization of the standard block at h is delivered before 7Δ (such that 2f+1 participants enter h+1 and will never vote for the dummy block at h). Now, let us say that the leader elected at h+1 does not broadcast a block (or there is an uncooperative aggregation committee assignment). After 3Δ, 2f+1 participants will broadcast a vote for the dummy block at h+1 and will include their finalize for h in the same message. If the dummy block for h+1 is not notarized by 7Δ, these participants will then all-to-all broadcast a vote for the dummy block at h+1 including the same finalize for h. The "trick" here is that some block at h+1 (standard or dummy) is always voted upon, whether or not there is an honest leader (allowing h, if its notarization is delivered before 7Δ to 2f+1 honest participants, to always be confirmed).

The communication profile of such an approach would look something like:

c = number of aggregation committees
a = number of aggregators per committee
p = number of participants per committee
b = signature broadcasts (sum of initial and delta broadcasts)

Summary (incoming):
* leader: ca
* aggregator: 1 + p + cab
* participant: a + a

Summary (outgoing):
* leader: ca
* aggregator: p + cab + p + 1
* participant: a

Example without Dummy Block (c=32, a=1, p=64, b=1):
* leader (also participant): 32 (incoming) + 32 (outgoing)
* aggregator: 97 (incoming) + 129 (outgoing)
* participant: 2 (incoming) + 1 (outgoing)

Example without Dummy Block (c=32, a=4, p=64, b=1):
* leader (also participant): 128 (incoming) + 128 (outgoing)
* aggregator: 193 (incoming) + 257 (outgoing)
* participant: 8 (incoming) + 4 (outgoing)

The described approach to streamlining is not without tradeoffs. Streamlining Vena would increase the confirmation latency for h (as participants would wait to broadcast their finalization for h until they voted for a block at h+1, instead of as soon as they saw a notarization for h). There is likely, however, some set of potential deployers of Vena that would trade a further elevated confirmation latency for a meaningful reduction in message complexity.

Dynamically Adjusting Aggregation Committee Parameterization

As demonstrated previously, there are a number of parameters (all else equal) that have a significant impact on the robustness and communication profile of Vena. It is worthwhile to consider adjusting these parameters based on observed network conditions (e.g. unseen notarizations included in newly constructed blocks, the dummy block rate, the all-to-all fallback rate) to maintain an optimal balance between robustness and message complexity. In essence, why incur additional network overhead to defend against a 30% byzantine adversary if 99.9999% of the time no such adversary exists (recall, safety will always be maintained regardless of aggregation committee robustness)?

A network with less than 10% byzantine participation can almost always confirm blocks with only 1 aggregator per committee (2048 validators, 32 committees, 64 participants per committee):

[Figure: simulated standard block confirmation rate vs. byzantine rate with 1 aggregator per committee]

Likewise, the same can be considered for ib (the initial broadcast weight) and db (the delta broadcast weight) to dynamically adjust the message complexity of Vena based on previously observed committee participation. This should allow the number of broadcasts per aggregator (per vote/finalize) to approach ~1.

[Figure: messages sent/received by an aggregator]

When implementing such a feature, it would be wise to avoid relying on standard block finalization to perform a synchronized adjustment to aggregation committee parameterization. Standard block finalization requires a successful aggregation committee shuffle, and a large, sudden drop in participation (i.e. an AWS outage affecting 15-20% of the validator set) could prevent the finalization of a standard block for hours (even when 2f+1 participants are online). Let us sketch an algorithm that each validator can follow to adjust its broadcast robustness independently (agreement on aggregation committee roles is never synchronized):

  1. Each validator can track network conditions (unseen notarizations included in newly constructed blocks and the dummy block rate) and begin sending messages to more aggregators as network conditions degrade (where each committee member, up to some predefined max cardinality, has a "rank" for becoming an aggregator).
  2. If any eligible aggregator (i.e. one that has a rank) receives ib weight for some message (and subsequent db weight), it will pass the aggregated message along to the other aggregators up to its rank in each committee (rank 1 = aggregator); see the code sketch after this list.
  3. When network conditions improve (or malicious validators are evicted), validators can begin sending messages to fewer aggregators (other honest validators will eventually follow suit).
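A sketch of the rank-based fan-out above, as a purely local policy (the thresholds and names are placeholders, not protocol constants):

```go
package vena

// fanOutTargets returns the committee members (by rank) that a participant
// sends its vote/finalize messages to, given its local estimate of network
// health. rankedMembers is the committee ordered by aggregator rank
// (rank 1 first); maxRank caps how many aggregators a committee may have.
func fanOutTargets(rankedMembers []int, dummyBlockRate float64, maxRank int) []int {
	// Widen the fan-out as the observed dummy block rate grows. These
	// thresholds are illustrative and chosen independently by each validator.
	targets := 1
	switch {
	case dummyBlockRate > 0.20:
		targets = maxRank
	case dummyBlockRate > 0.05:
		targets = maxRank / 2
	}
	if targets < 1 {
		targets = 1
	}
	if targets > len(rankedMembers) {
		targets = len(rankedMembers)
	}
	return rankedMembers[:targets]
}
```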

Multi-Layer Aggregation Committees

To further reduce message complexity (which may be required to support very, very large validator sets), Vena could introduce a multi-layer aggregation committee topology. Instead of populating each committee with aggregators and participants, k layers of committees could be populated with "super-aggregators" and "sub-aggregators", where each super-aggregator is responsible for serving the role of an aggregator in a Vena committee and each sub-aggregator is a participant in a given layer and a super-aggregator in a committee one layer deeper. At the lowest layer, committees would contain sub-aggregators as aggregators and some set of participants which serve as neither super-aggregators nor sub-aggregators elsewhere.

Each layer would incorporate committee-based aggregation, but instead of forwarding aggregated messages to all other "super-aggregators" at a layer (across committees, like in Vena), messages would just be passed to the "super-aggregators" in the parent layer (the committee in layer l in which the super-aggregator of layer l+1 serves as a sub-aggregator). At the first layer, messages would be passed to all other aggregators (across committees). This extra relaying per message would increase latency for each additional layer introduced.

If you are interested to learn more about this approach, Vitalik Buterin explored the idea of supporting multiple layers of aggregation a few years ago in "Paths toward single-slot finality". His high-level construction contemplates how to best integrate layers of aggregation within Ethereum's existing consensus (or a close derivative of it) but generally speaking seeks the same outcome: scaling participation in a single round of consensus to massive numbers of participants.

Acknowledgements

Thanks to Benjamin Y. Chan (byc@cs.cornell.edu) and Rafael Pass (rafaelp@tau.ac.il) for their novel work on Simplex Consensus. Thanks to Benjamin Y. Chan, in particular, for answering detailed questions about Simplex Consensus and for reviewing numerous pre-release drafts of Vena.

Thanks to Stephen Buttolph for contributing "Delta Signatures" (to maintain optimistic responsiveness), suggesting to explore the use of multi-layer aggregation committees, for collaborating on "Dynamically Adjusting Aggregation Committee Parameterization", and for catching numerous errata in earlier drafts. He is as much an author of Vena as I am.

Thanks to Michael Kaplan for suggesting that "Delta Signatures" include the aggregation of all messages seen up to ib + k·db weight rather than just db weight.

Thanks to Aaron Buchwald, Andrew Lewis-Pye, and Kevin Sekniqi for their feedback during initial Vena research.

Thanks to Ben Edgington for publishing Upgrading Ethereum, which clearly details how Aggregation Committees work in ETH2.

Appendix

Constructing Honest Committees

This simulation code computes the probability that a subsampled committee of size c_n has at least c_n·f_c faults given an attacker controls proportion f_p of the validator set. Below, you'll find some example configurations and the code used to reproduce the results. You can read more about this calculation here.
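The quantity reported below is the binomial tail probability that at least c_n·f_c of the c_n sampled validators are faulty when each draw is faulty with probability f_p (as implemented, the threshold is taken as ⌊c_n·f_c⌋):

$$
\Pr[X \ge k] \;=\; \sum_{i=k}^{c_n} \binom{c_n}{i}\, f_p^{\,i}\,(1-f_p)^{\,c_n-i}, \qquad X \sim \mathrm{Binomial}(c_n, f_p),\quad k = \lfloor c_n f_c \rfloor
$$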

threshold (2^-40)= 9.095e-13

c_n= 111  f_c= 0.6667 f_p= 0.3333 p ≈ 7.706e-13

c_n= 128  f_c= 0.3333 f_p= 0.25   p ≈ 2.876e-2
c_n= 128  f_c= 0.3333 f_p= 0.2873 p ≈ 1.774e-1
c_n= 128  f_c= 0.3333 f_p= 0.3333 p ≈ 5.826e-1
c_n= 128  f_c= 0.6667 f_p= 0.3333 p ≈ 2.246e-14

c_n= 512  f_c= 0.3333 f_p= 0.3333 p ≈ 5.415e-1
c_n= 512  f_c= 0.6667 f_p= 0.3333 p ≈ 3.324e-53

c_n= 5000 f_c= 0.3333 f_p= 0.25   p ≈ 8.874e-40
c_n= 5000 f_c= 0.3333 f_p= 0.2873 p ≈ 8.606e-13
c_n= 5000 f_c= 0.3333 f_p= 0.3333 p ≈ 5.133e-1
c_n= 5000 f_c= 0.6667 f_p= 0.3333 p ≈ 4.863e-504

(output truncated to four significant digits)
package main

import (
	"fmt"
	"math"
	"math/big"
)

func choose(n, k int64) *big.Int {
	if k > n/2 {
		k = n - k
	}
	b := big.NewInt(1)
	for i := int64(1); i <= k; i++ {
		num := new(big.Int).Mul(big.NewInt(n-k+i), b)
		b = new(big.Int).Div(num, big.NewInt(i))
	}
	return b
}

func pow(v float64, n int64) *big.Float {
	r := new(big.Float).SetFloat64(1)
	for i := int64(0); i < n; i++ {
		r = r.Mul(r, new(big.Float).SetFloat64(v))
	}
	return r
}

func prob(n, k int64, p float64) *big.Float {
	// Compute the binomial term components
	a := pow(p, k)
	b := pow(1-p, n-k)
	c := new(big.Float).SetInt(choose(n, k))

	// Compute value
	r := new(big.Float).Mul(a, c)
	r = r.Mul(r, b)
	return r
}

func probge(n, k int64, p float64) *big.Float {
	sum := new(big.Float)
	for i := k; i <= n; i++ {
		sum = sum.Add(sum, prob(n, i, p))
	}
	return sum
}

func main() {
	fmt.Println("threshold (2^-40)=", math.Pow(2, -40))
	for _, sim := range []struct {
		n  int64
		fC float64
		fP float64
	}{
		{111, 2 / 3.0, 1 / 3.0}, // p ~= 2^-40
		{128, 1 / 3.0, 1 / 4.0},
		{128, 1 / 3.0, 0.2873},
		{128, 1 / 3.0, 1 / 3.0},
		{128, 2 / 3.0, 1 / 3.0},
		{512, 1 / 3.0, 1 / 3.0},
		{512, 2 / 3.0, 1 / 3.0},
		{5000, 1 / 3.0, 1 / 4.0},
		{5000, 1 / 3.0, 0.2873}, // p ~= 2^-40
		{5000, 1 / 3.0, 1 / 3.0},
		{5000, 2 / 3.0, 1 / 3.0},
	} {
		fmt.Println("c_n=", sim.n, "f_c=", sim.fC, "f_p=", sim.fP, "p=", probge(sim.n, int64(float64(sim.n)*sim.fC), sim.fP))
	}
}
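For reference, the quantity printed as p is the upper tail of a binomial distribution, which is exactly what probge sums (X and k below are notation introduced here rather than identifiers from the program):

$$
p = \Pr[X \ge k] = \sum_{i=k}^{c_n} \binom{c_n}{i} f_p^{\,i} (1 - f_p)^{c_n - i},
\qquad X \sim \mathrm{Binomial}(c_n, f_p),\quad k = \lfloor f_c \cdot c_n \rfloor
$$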

Evaluating Committee Robustness and Message Complexity
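The simulation below is a rough sketch of the aggregation committee broadcast under random fault assignment: 2,048 validators are partitioned into 32 committees, byzantine participants are assigned uniformly at random at rates from 0% to 35%, and a committee only contributes weight if at least one of its aggregators is honest and at least an [initial broadcast] fraction of its members are honest, with additional weight accumulated in [delta broadcast] increments. A sample (10,000 per configuration) succeeds when the total aggregated honest weight reaches a quorum of at least two thirds of all validators. The drawMessages pass separately counts the messages sent per leader, per aggregator, and per participant in the case where all participants are honest.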

package main

import (
	"fmt"
	"math/rand"
	"slices"
	"strings"

	"golang.org/x/exp/maps"
	"gonum.org/v1/plot"
	"gonum.org/v1/plot/plotter"
	"gonum.org/v1/plot/plotutil"
	"gonum.org/v1/plot/vg"
)

const (
	samples = 10_000

	validators = 2_048
	committees = 32
)

var (
	aggregators      = []int{1, 2, 4, 8, 12}
	initialBroadcast = []float64{0.25, 0.35, 0.5, 0.65, 0.75}
	deltaBroadcast   = []float64{0 /* no delta broadcast */, 0.05, 0.1, 0.15, 0.2, 0.25}
)

func main() {
	// Determine the minimum honest weight required to reach consensus: a quorum
	// of at least 2n/3 (rounded up), which tolerates f < n/3 byzantine faults
	minHonest := validators * 2 / 3
	if validators*2%3 != 0 {
		// Round up if the integer division truncated
		minHonest++
	}

	// Determine committee members
	committeeMembers := validators / committees
	fmt.Printf("Validators: %d Committee Members: %d\n", validators, committeeMembers)
	if validators%committees != 0 {
		panic("validators must be divisible by committees")
	}
	fmt.Printf("Minimum Honest Validators to Finalize: %d/%d (%.2f%%)\n", minHonest, validators, float64(minHonest)/float64(validators)*100)
	fmt.Printf("\n")

	// Graph Messages
	fmt.Printf("Computing Messages Sent\n")
	drawMessages(committeeMembers)
	fmt.Printf("\n")

	// Graph Robustness
	fmt.Printf("Computing Robustness\n")
	results := map[int]map[float64]map[float64]map[float64]float64{} // aggregators -> initial broadcast -> delta broadcast -> byzantine rate -> success
	for _, a := range aggregators {
		results[a] = map[float64]map[float64]map[float64]float64{}
		for _, ib := range initialBroadcast {
			results[a][ib] = map[float64]map[float64]float64{}
			for _, db := range deltaBroadcast {
				results[a][ib][db] = map[float64]float64{}

				// Skip cases where success is impossible
				if skip(ib, db) {
					continue
				}

				// Iterate over byzantine rates
				for byz := 0; byz <= 35; byz++ {
					// Construct participants array
					byzantine := make([]bool, validators)
					count := validators * byz / 100
					for i := 0; i < count; i++ {
						byzantine[i] = true
					}
					byzantineRate := float64(count) / float64(validators) * 100

					// Run samples
					success := 0
					for i := 0; i < samples; i++ {
						// Clone original array to keep each sample independent
						if runRobustness(i, a, ib, db, committeeMembers, slices.Clone(byzantine)) >= minHonest {
							success++
						}
					}
					successRate := float64(success) / float64(samples) * 100
					fmt.Printf(
						"[Aggregators: %d Initial Broadcast: %.2f Delta Broadcast: %.2f] Byzantine: %d/%d (%f%%) Success: %d/%d (%f%%)\n",
						a, ib, db,
						count, validators, byzantineRate,
						success, samples, successRate,
					)
					results[a][ib][db][byzantineRate] = successRate
				}
			}
		}
	}
	drawRobustness(results)
}

func skip(ib, db float64) bool {
	if ib+db >= 1.0 {
		// The delta broadcast threshold can never be reached (we also skip the
		// ib+db == 1 case because it is nearly identical to having no delta
		// broadcast when 0% of participants are byzantine)
		return true
	}
	if ib <= 2.0/3.0 && db == 0 {
		// Without any delta broadcast, an initial broadcast threshold at or
		// below 2/3 can never accumulate enough weight to reach quorum
		return true
	}
	return false
}

// The largest number of messages is sent when all participants are honest, so
// we only model that case.
func drawMessages(committeeSize int) {
	// 0 = leader, 1 = aggregator, 2 = participant
	args := [3][]any{}
	for _, aggregators := range aggregators {
		for _, initialBroadcast := range initialBroadcast {
			pts := [3]plotter.XYs{}
			for _, deltaBroadcast := range deltaBroadcast {
				if skip(initialBroadcast, deltaBroadcast) {
					continue
				}

				// Configure number of delta broadcasts
				deltaBroadcasts := 0
				if deltaBroadcast > 0 {
					deltaBroadcasts = int((1 - initialBroadcast) / deltaBroadcast)
				}

				// Compute all
				perLeader := committees * aggregators                                   // block broadcast to aggregators
				perAggregator := committeeSize - aggregators                            // block broadcast to committee
				perParticipant := aggregators                                           // vote broadcast to committee aggregators
				perAggregator += (committees - 1) * aggregators * (1 + deltaBroadcasts) // vote broadcast to other aggregators
				perAggregator += committeeSize + 1                                      // aggregated vote broadcast to committee and next leader
				perParticipant += aggregators                                           // finalize broadcast to committee aggregators
				perAggregator += (committees - 1) * aggregators * (1 + deltaBroadcasts) // finalize broadcast to other aggregators
				perAggregator += committeeSize                                          // aggregated finalize broadcast to committee

				// Skip configurations where an aggregator sends more messages than an
				// all-to-all broadcast would require
				if perAggregator > validators {
					continue
				}

				// Compute roles
				fmt.Printf(
					"[Aggregators: %d Initial Broadcast: %.2f Delta Broadcast: %.2f (%d)] SM/Leader: %d SM/Aggregator: %d SM/Participant: %d\n",
					aggregators, initialBroadcast, deltaBroadcast, deltaBroadcasts,
					perLeader, perAggregator, perParticipant,
				)
				pts[0] = append(pts[0], plotter.XY{X: deltaBroadcast, Y: float64(perLeader)})
				pts[1] = append(pts[1], plotter.XY{X: deltaBroadcast, Y: float64(perAggregator)})
				pts[2] = append(pts[2], plotter.XY{X: deltaBroadcast, Y: float64(perParticipant)})
			}
			for i := 0; i < 3; i++ {
				args[i] = append(args[i], fmt.Sprintf("AG=%d IB=%.2f", aggregators, initialBroadcast), pts[i])
			}
		}
	}

	// Plot Delta Overhead
	for i, prefix := range []string{"Per Leader", "Per Aggregator", "Per Participant"} {
		p := plot.New()
		p.Legend.Left = false
		p.Legend.Top = true
		p.Title.Text = fmt.Sprintf("[%s] Messages Sent (Validators=%d, Committees=%d)", prefix, validators, committees)
		p.X.Label.Text = "Delta Broadcast"
		p.Y.Label.Text = "Messages"
		p.Y.Min = 0
		if err := plotutil.AddLinePoints(p, args[i]...); err != nil {
			panic(err)
		}
		if err := p.Save(16*vg.Inch, 8*vg.Inch, fmt.Sprintf("messages_%s.png", strings.ReplaceAll(strings.ToLower(prefix), " ", "_"))); err != nil {
			panic(err)
		}
	}
}

func runRobustness(simulation int, aggregators int, initialBroadcast float64, deltaBroadcast float64, committeeMembers int, byzantine []bool) int {
	// Seed with [simulation] to keep results deterministic
	rand := rand.New(rand.NewSource(int64(simulation)))
	rand.Shuffle(len(byzantine), func(i, j int) { byzantine[i], byzantine[j] = byzantine[j], byzantine[i] })

	// Compose committees from participants (assume leader is honest)
	var (
		prev           = 0
		nonAggregators = committeeMembers - aggregators
		weight         = 0
	)
	for i := 0; i < committees; i++ {
		var (
			honestAggregators = 0
			committeeWeight   = 0
		)
		for _, byzantine := range byzantine[prev : prev+aggregators] {
			if !byzantine {
				honestAggregators++
				committeeWeight++
			}
		}
		prev += aggregators
		if honestAggregators == 0 {
			// If there are no honest aggregators, there is nothing left to do or count
			prev += nonAggregators
			continue
		}
		for _, byzantine := range byzantine[prev : prev+nonAggregators] {
			if !byzantine {
				committeeWeight++
			}
		}
		prev += nonAggregators

		// Only count the committee's weight if at least [initialBroadcast] of its
		// members are honest, then accumulate additional weight in intervals of
		// [deltaBroadcast] to simulate delta signatures
		initialWeight := int(float64(committeeMembers) * initialBroadcast)
		if committeeWeight < initialWeight {
			continue
		}
		weight += initialWeight
		if deltaBroadcast == 0 {
			continue
		}
		deltaSize := int(float64(committeeMembers) * deltaBroadcast)
		if deltaSize == 0 {
			panic("delta broadcast too small")
		}
		for counter := initialWeight + deltaSize; counter <= committeeWeight; counter += deltaSize {
			weight += deltaSize
		}
	}
	return weight
}

func drawRobustness(results map[int]map[float64]map[float64]map[float64]float64) {
	for _, aggregators := range aggregators {
		args := []any{}
		for _, initialBroadcast := range initialBroadcast {
			for _, deltaBroadcast := range deltaBroadcast {
				// Skip useless cases
				if skip(initialBroadcast, deltaBroadcast) {
					continue
				}

				// Ensure results are in sorted order
				byzantineResults := results[aggregators][initialBroadcast][deltaBroadcast]
				sortedByzantine := maps.Keys(byzantineResults)
				slices.Sort(sortedByzantine)

				// Compute Successes
				pts := plotter.XYs{}
				for _, byzantineRate := range sortedByzantine {
					pts = append(pts, plotter.XY{X: byzantineRate, Y: byzantineResults[byzantineRate]})
				}
				args = append(args, fmt.Sprintf("IB=%.2f DB=%.2f", initialBroadcast, deltaBroadcast), pts)
			}
		}

		// Plot Broadcast Params over Different Messaging Approaches
		p := plot.New()
		p.Legend.Left = true
		p.Legend.Top = false
		p.Title.Text = fmt.Sprintf("Optimistic Aggregation Robustness (Validators=%d, Committees=%d, Aggregators=%d)", validators, committees, aggregators)
		p.X.Label.Text = "Byzantine Participants (%)"
		p.Y.Label.Text = "Success Rate [10k Samples] (%)"
		if err := plotutil.AddLinePoints(p, args...); err != nil {
			panic(err)
		}
		if err := p.Save(16*vg.Inch, 8*vg.Inch, fmt.Sprintf("robustness_v%d_c%d_a%d.png", validators, committees, aggregators)); err != nil {
			panic(err)
		}
	}
}
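When run, the simulation prints the per-role message counts and per-configuration success rates to stdout and saves the plots as messages_per_leader.png, messages_per_aggregator.png, messages_per_participant.png, and one robustness_v2048_c32_a<aggregators>.png per aggregator count. Because it depends on gonum.org/v1/plot and golang.org/x/exp, it should be run from a Go module that declares those dependencies (for example, after running go mod tidy).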