# Subverting the total eclipse (of the heart)

*cowritten by Dan Marzec and Louis Thibault*

:::warning
:warning: **TL;DR:** malicious actors can rely on eclipse attacks to win the "attestation race". We present an analysis of the exploit and summarize mitigation strategies.
:::

<center>
<img src="https://hackmd.io/_uploads/H14BnQ0Zn.png" width="600" height="300">
</center>

## Anatomy of an Unbundling Attack

On April 3rd, 2023, an attacker used an exploit present in major mev-boost relays ([mev-boost-relay](https://github.com/flashbots/mev-boost-relay), [dreamboat](https://github.com/blocknative/dreamboat)) to extract $30M from front-running bots. The attack is outlined in the diagram below by [Mike Neuder](https://twitter.com/mikeneuder):

<img src="https://hackmd.io/_uploads/H1Jb636Z2.png" width="3500" height="450">

A central concept underlying Proposer-Builder Separation (PBS) is that proposers must not be allowed to access the contents of a block before they "commit" to it (_i.e._ by signing the corresponding header). After a proposer has signed a block, they are susceptible to a slashing penalty if they sign a second one.

In this case, the attacker baited front-running bots by emitting high-slippage swaps into the public mempool. It's important to note that front-running is a non-atomic form of arbitrage, meaning that the transactions involved can be unbundled. This creates an incentive much larger than the slashing penalty of 1 ETH for an attacker able to discover the block, rearrange its transactions so as to front-run the front-runners, and propagate the forged block to the attestation committee ahead of the original.

To pull this off, the attacker utilized a vulnerability in MEV relays wherein signed headers were not properly verified. This allowed a malicious proposer to commit to an invalid block, whereupon the relay would erroneously reveal the full block.
From there, it unbundled transactions, constructed a valid block extracting $30M in front-running MEV, and propagated it across the network. What allowed this to work is that the relay did not find a block corresponding to the invalid header, and therefore sent nothing to the beacon node. **In effect, the invalid header was a mechanism used to prevent the propagation of the legitimate block**, while still fooling the relay into revealing it.

The header verification bug has since been patched in [mev-boost-relay](https://github.com/flashbots/mev-boost-relay/commit/3025635e3ae2837c66466cf089b7eb4e2d7da6ed) and [dreamboat](https://github.com/blocknative/dreamboat/pull/124).

### Further Reading

- [Punk3155's initial find tweet](https://twitter.com/punk3155/status/1642771856758546434?s=20)
- [0xSt1ng3R's operational analysis of the adversary](https://twitter.com/0xSt1ng3R/status/1642940461420625949?s=20)
- [samczsun's Twitter thread](https://twitter.com/samczsun/status/1642848556590723075?s=20)
- [bertcmiller's post-mortem thread](https://twitter.com/bertcmiller/status/1643401549635366913?s=20)
- [Flashbots' post-mortem post](https://collective.flashbots.net/t/post-mortem-april-3rd-2023-mev-boost-relay-incident-and-related-timing-issue/1540)

## Future Attacks: Analysis and Mitigations

While patching the relays, two additional vulnerabilities were brought to the surface. They have since been the subject of worried discussion because they are not mutually exclusive and have no obvious solution. These are:

1. **Multi-block griefing attack.** The adversary forgoes his current slot proposal (by submitting too late) in order to access the block and reveal the private txns. Since the attacker cannot actually profit, it is considered "griefing". If the attacker were to control the next slot, however, he could potentially profit from this information, provided searcher bundles do not constrain the block number.
While this attack is important to analyze and mitigate for future In-Protocol-PBS (IP-PBS), it is a more sophisticated vector and will not be covered in this document.
2. **Eclipse/Saturation attack.** This is an alternate approach to the block equivocation attack observed in the wild. It replaces the relay's validation bug with an eclipse attack that isolates the relay's beacon node from the rest of the network. This has the effect of preventing the legitimate block from reaching the attestation committee, allowing the unbundling attack to take place.

The present document focuses on attack vector #2. In most discussions this attack has been referred to as an "eclipse attack", but this appellation risks conflating the solution space with well-studied eclipse attack mitigation in p2p networks. There are at least two important things going on:

1. an eclipse attack that slows down message propagation; and,
2. an attestation race.

Viewed in this light, the primary attack vector is the attestation race, not the eclipse. The latter is effectively a substitute for the invalid header from the original exploit: a means of preventing the valid block from reaching attestors, so that the malicious block reaches attestors first.

This application of the eclipse attack is incidental enough, and specific enough to Ethereum PBS, that it warrants its own name. We refer to it as a "Bonnie attack", after Bonnie Tyler's 1983 hit single [*Total Eclipse of the Heart*](https://youtu.be/lcOxhH8N3Bo).

<center>
<img src="https://hackmd.io/_uploads/Hy5lpQAb3.jpg" width="400" height="300">
</center>
<br>

This framing is essential to understanding how the attack can be mitigated, so it bears repeating: the eclipse attack is incidental. The important part is the attestation race.
### Analysis of the Bonnie Attack

As we have seen, Bonnie attacks are part of a larger class of equivocation-based attacks in which an attacker must discover the block data, construct a malicious block, and propagate it to a quorum of attestors before they can observe the "legitimate" block. Thus, interdicting this class of attacks is a deceptively simple matter of ensuring a block is witnessed by attestors before it is revealed to the (possibly malicious) proposer.

The specificity of the Bonnie attack is its use of an eclipse attack to isolate a given relay (the victim) from the broader network, and slow the propagation of its messages. This specificity introduces the following complication: the peers eclipsing the relay beacon can communicate the block contents back to the malicious proposer, allowing it to construct an alternate block. In order to be effective, a mitigation strategy must withhold the contents of a block from the proposer until it has been witnessed by the attestor quorum. This introduces a necessary prerequisite: preventing a total eclipse of the relay's beacon.

![](https://hackmd.io/_uploads/BJdtYzCZn.png)

A Bonnie attack does not require a total eclipse of the relay's beacon; it is sufficient to delay the propagation of its messages to the point that the majority of the current attestation committee receives the malicious block first ([*It should be noted that proposer-boost creates an incentive to propose a block earlier in the slot*](https://github.com/ethereum/consensus-specs/pull/2730)). In a partial-eclipse scenario, where a supermajority of the relay beacon's peers are controlled by an attacker, this is achieved through the combined effect of (a) a reduction in the victim's effective fanout-factor and (b) the attacker's prerogative to fan out messages to an extremely large number of peers.
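To make the fanout argument concrete, here is a toy model of the race. It is our own simplification and not a simulation of Gossipsub: it assumes the publisher reaches `first_hop` honest nodes directly, that every newly reached node forwards to `fanout` previously unreached nodes per hop, and that all hops take equal time. The committee size, mesh degree, and attacker peer count are hypothetical numbers chosen for illustration.

```python
def hops_to_majority(first_hop: int, fanout: int, committee: int) -> int:
    """Count gossip hops until more than half of `committee` nodes hold the block.

    Toy model: the publisher reaches `first_hop` nodes directly, and every
    newly reached node forwards to `fanout` fresh nodes on each later hop.
    """
    if first_hop <= 0:
        raise ValueError("publisher must reach at least one honest peer")
    target = committee // 2 + 1
    reached, frontier, hops = first_hop, first_hop, 1
    while reached < target:
        frontier *= fanout   # each node reached last hop forwards to `fanout` new nodes
        reached += frontier
        hops += 1
    return hops


# Hypothetical numbers, for illustration only.
COMMITTEE = 10_000

# Victim relay: 8 mesh peers, 7 of which are attacker-controlled and drop the block,
# so only one honest peer receives it on the first hop.
victim = hops_to_majority(first_hop=1, fanout=8, committee=COMMITTEE)

# Attacker: pushes the forged block directly to 500 peers on the first hop.
attacker = hops_to_majority(first_hop=500, fanout=8, committee=COMMITTEE)

print(f"victim needs {victim} hops, attacker needs {attacker} hops")
```

Under these assumptions the attacker wins the race without ever fully eclipsing the victim: shrinking the victim's first hop and inflating the attacker's own is enough.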
With this in mind, it is clear that preventing a total eclipse of the relay beacon is a *necessary* part of defeating the Bonnie attack, but is by no means *sufficient*.

How can this be done? We begin by reviewing proposed solutions that we believe fail to adequately resolve the issue. We then propose a social scheme to prevent total eclipses before describing a consensus layer extension that neutralizes Bonnie attacks.

### Non-Solutions

#### 1. Custom Peer Overlay

- bloXroute BDN comes to mind, but as mentioned by others, the attacker can just use BDN.
- In fact this might mean that the attacker may be using it and therefore the relay community should also be using it? Mutually Assured Peering?

#### 2. Refresh Relay's Beacon Node's Peer IDs and Peer set

- Idea here is that an adversary can monitor the beacon block gossip topic and use relay data APIs to correlate peer IDs to a relay. To make this harder, randomly rotating peers and peer IDs can make it seem as if a relay's blocks are being published from random points in the network.
- How do boot nodes work in CL clients?
- If an attacker knows our beacon node's IP, or rough IP if in cloud, can it brute force peering?
- Total eclipse not necessary, anyway.
- Weakens most existing forms of p2p security, as these rely on having stable identities onto which reputation can be assigned (e.g. peer scoring in Gossipsub)
- Can we block IP ranges associated with public cloud providers, as that would be the easiest way to spin up many beacon nodes?
- Anecdotally Avalanche has done something similar

## A solution to Bonnie Attacks (?)

Mitigating Bonnie attacks requires us to (1) avoid total eclipses of relay beacons, and (2) signal the identity of the new block without revealing block-contents to attackers.

### Avoiding Total Eclipses

Total eclipses are mitigated by establishing a network of direct peering relations between the beacon nodes of trusted relays. This is a two-part process:

1. Public relay operators publish the public keys and peering information of their beacon nodes.
2. Using the libp2p pubsub [`WithDirectPeers`](https://pkg.go.dev/github.com/libp2p/go-libp2p-pubsub#WithDirectPeers) option, relay operators configure direct peering relations with beacon nodes from other relay operators.

Note that relay operators MUST continue to accept dynamic, anonymous peers in addition to these trusted peers, to avoid forming a private network.

Messages are unconditionally forwarded to direct peers, ensuring that the node cannot be fully eclipsed, unless all of its direct peers are also fully eclipsed.

This can also be achieved by having relays directly publish their blocks to other relays. This solution is also not very interesting in the medium to long term as it creates centralization vectors on the relay set.

### Frontrunning Attestation Races

Having reduced the risk of total eclipses to a small probability, we now return to the question of witnessing block data without revealing it to attackers. The solution is to _Disembargo After New Block ANnouncement Delay_ (DANBAND).

>**Disclaimer:** the name DANBAND was suggested by Louis Thibault. Dan Marzec is not _that_ self-involved.

<iframe width="560" height="315" src="https://www.youtube.com/embed/cIRiZsDObrU" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>

In case it wasn't immediately obvious, DANBAND is named after this (in)famous subversion of Bonnie Tyler's *Total Eclipse of the Heart* by *The Dan Band*. Give it a listen (mildly NSFW).

DANBAND works analogously to the "commit-reveal" pattern found in the standard PBS protocol. In standard PBS, the proposer "commits" to a given block by signing a block header provided by the relay.
Crucially, this block header includes a hash of the proposed block, allowing it to be proven *post hoc* that some given block data corresponds to the header that was signed.

We can extend this idea to the propagation of new blocks through the beacon network. Beacons subscribe to a new Gossipsub topic, which we call `witness`, and monitor it continuously for messages. Upon receiving a signed header from the proposer, a relay performs the following actions:

1. it publishes this signed header on the `witness` topic; and,
2. it waits *s* seconds before publishing the block _data_ to the standard topic.

Upon receiving a header from the `witness` topic, the beacon validates the proposer's signature, after which it considers that the proposer has committed to the corresponding block. It subsequently rejects any other headers, as well as any blocks that do not match the hash in the selected header. It should be noted that headers are signed by the proposer, so conflicting headers can only be produced if the proposer has signed two different blocks, in which case these can be submitted as a slashing proof.

<center>
<img src="https://hackmd.io/_uploads/HJ7WrQAW3.png" width="400" height="300">
</center>

The net effect of this behavior is that legitimate headers are given a significant "head start" in the race to propagate to the attestation committee.

## Discussion

Of course, this solution comes with its costs. Allocating time in the slot for header propagation delays the propagation of beacon blocks. In fact, the more time we devote to header propagation, the higher the chance of orphaned blocks. As witnessed when block propagation was delayed in the initial mitigations of the April 3rd attack, [orphaned blocks increased](https://collective.flashbots.net/t/note-on-increased-number-of-forked-blocks-missed-slots/1559).
Thus we also introduce a tradeoff in the amount of time builders have to build the most valuable block they can by incorporating all information revealed by searchers and tx originators over the time window. This design may in fact be better suited for a two-slot PBS design.

There are two beneficial properties of DANBAND and the private peering arrangements on which it relies: backwards compatibility and collateral safety.

First, both provide full backwards-compatibility. This is trivially obvious in the case of private peering. Where DANBAND is concerned, beacon implementations that adopt DANBAND are not penalized for dropping messages under equivocation conditions, nor are they penalized for withholding block data for a brief period. However, the possibility of a reorg exists in networks where a majority of nodes do not support the DANBAND protocol. In such cases, a Bonnie attack may result in the network accepting a block that individual DANBAND nodes have previously rejected. Because the witness-based validation does not apply to reorgs, DANBAND nodes should recover gracefully and protocol faults are not expected. In effect, DANBAND exhibits an adoption threshold beyond which its security properties can be enjoyed, but sub-threshold usage does not introduce faults or regressions.

Second, the DANBAND protocol introduces no edge cases in which non-malicious actors are harmed, be it economically or in terms of fair network participation. It neither slashes nor penalizes nodes directly, nor does it classify existing legitimate behaviors as malicious. As in the current state of affairs, attesters are never penalized for attesting to the "wrong thing". Instead, it provides a pathway for detecting equivocation before blocks are settled. Future work should investigate the feasibility and desirability of automatically slashing validators based on DANBAND-level violations.
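As a closing illustration, the beacon-side `witness` rule described under "Frontrunning Attestation Races" can be sketched as follows. This is a minimal sketch under stated assumptions, not a client implementation: `SignedHeader`, `WitnessState`, and `block_hash` are hypothetical names, SHA-256 stands in for the real block root, and signature verification is injected as a callable rather than implemented. The sketch also assumes that blocks for slots with no witnessed header fall through to normal validation, which the text above does not specify.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class SignedHeader:
    slot: int
    block_hash: bytes  # hash of the block data the proposer committed to
    signature: bytes   # proposer signature, opaque in this sketch


def block_hash(data: bytes) -> bytes:
    # Stand-in for the real block root; SHA-256 is used for illustration only.
    return hashlib.sha256(data).digest()


@dataclass
class WitnessState:
    verify: Callable[[SignedHeader], bool]  # assumed signature check
    committed: Dict[int, SignedHeader] = field(default_factory=dict)

    def on_witness(
        self, header: SignedHeader
    ) -> Optional[Tuple[SignedHeader, SignedHeader]]:
        """Handle a header from the `witness` topic.

        Returns a pair of conflicting signed headers (equivocation evidence)
        if the proposer committed to two different blocks, otherwise None.
        """
        if not self.verify(header):
            return None  # invalid signature: ignore the message
        first = self.committed.setdefault(header.slot, header)
        if first.block_hash != header.block_hash:
            return (first, header)  # later header is rejected, pair is evidence
        return None

    def accept_block(self, slot: int, data: bytes) -> bool:
        """Reject block data that does not match the committed header."""
        header = self.committed.get(slot)
        if header is None:
            return True  # assumption: no commitment seen, defer to normal validation
        return header.block_hash == block_hash(data)


# Usage with a dummy verifier that accepts any header signed b"ok".
state = WitnessState(verify=lambda h: h.signature == b"ok")
legit = SignedHeader(1, block_hash(b"legit block"), b"ok")
forged = SignedHeader(1, block_hash(b"forged block"), b"ok")

first_result = state.on_witness(legit)    # first header for the slot is selected
evidence = state.on_witness(forged)       # conflicting header yields evidence
```

The first header seen for a slot wins, which is what gives the legitimate header its head start: as long as it reaches a beacon before the forged block does, the forged block no longer matches the commitment and is dropped.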