relay monitor design doc

# relay monitor design doc

thanks to @metachris, @bertcmiller and @thegostep for helpful discussions on this design

# intro

this document assumes knowledge of the general `mev-boost` architecture.

you will want to be familiar with the `builder-specs`: https://github.com/ethereum/builder-specs

and here are some diagrams that are relevant, both from the `flashbots/mev-boost` repo on github:

## high-level architecture

![](https://i.imgur.com/sf6gsKJ.png)

## builder API flow

![](https://i.imgur.com/TFkDvSR.png)

# what

in the initial version of `mev-boost` (https://boost.flashbots.net) there is a trusted actor called the **relay** that connects proposers to builders. the relay has a series of duties including verifying that builders supply good blocks that respect the proposers' preferences (e.g. gas limit of the block) and facilitating an auction so that proposers can efficiently allocate blockspace.

while this role is trusted, it is permissionless so that anyone is able to run a relay and it is on proposers to configure "relay mux" software (e.g. https://github.com/flashbots/mev-boost) with relays of their choosing to connect to the builders they are interested in working with during each auction.

to mitigate potential abuses of this role by byzantine relays, flashbots has suggested a "relay monitor": https://github.com/flashbots/mev-boost/issues/142

this document refines the ideas in that issue towards a concrete specification.

# design

the relay monitor (RM) uses publicly available data to form a view on the **behavior** and **performance** of the set of relays it is monitoring. this includes an "opt-in" setting of the relay mux that forwards each proposer's view of the network to the RM. doing so greatly improves the relay monitor's ability to understand what the relay (and also proposer) did for each proposal.

by **behavior**, we refer to the set of responsibilities relays have to ensure safety and liveness of the ethereum protocol.
users of the mev-boost ecosystem take on the risk that relays and/or builders fail to fulfill these responsibilities, and an important role of the RM is to ensure relays carry out their portion of the requirements.

by **performance**, we refer to metrics like latency and throughput relating to how well the relay works in a system context.

## types of faults

### behavior faults

first, we describe the types of faults the RM watches for and the corresponding data the RM consumes for each relay and receives from each proposer.

the high-level structure of the `mev-boost` flow is a commit-reveal scheme where first a set of bids are provided that commit to a given payload (`getHeader` in the API sequence diagram) and then a proposer selects one bid in a binding way by signing over a block containing the data committed to in the bid. the relay should then release the corresponding payload (`getPayload` in the sequence diagram).

note that there is nothing stopping a relay from offering a bid during one call of the API and a different (likely more valuable) bid on a subsequent call. this complicates the monitoring process as the RM will not know what view the proposer had unless the proposer shares it. this fault -- "auction integrity" -- is listed for completeness but should be considered an "advanced" class of faults to detect, and if there is doubt, fault should only be assigned after assuming the best intentions of the relay.

we can now look at the possible faults for each phase of the commit-reveal scheme:

#### bid faults

* malformed bid
    * relay produces some bid that is syntactically invalid (e.g. incomplete execution payload header) or has an invalid signature
    * **requires**: bid from relay
* consensus-invalid bid
    * relay produces a bid that is invalid with respect to the consensus logic, e.g.
the block `number` in the execution payload is wrong (note: RM should be tolerant to bids on non-canonical forks)
    * **requires**: bid from relay
* payment-invalid bid
    * the payment claimed in the bid does not match the ultimate value delivered to the proposer if the payload for this bid was included on-chain
    * **requires**: bid from relay *AND* full payload in next step
* valid bid, but nonconforming
    * the bid is valid wrt the prior conditions but does not reflect the latest validator preferences for the given proposer
    * **requires**: bid from relay, latest preferences from proposer
* auction integrity: non-discriminatory
    * relay does not discriminate between bidders
    * **requires**: bid(s) from relay, bid(s) seen by proposer
* auction integrity: market competition
    * this one came from @thegostep
    * relay does not deviate from its peers over a given period of time
    * **requires**: bid(s) from relays across a period of time

#### payload faults

* malformed payload
    * relay returns an execution payload that is syntactically invalid (e.g.
missing block number) or does not match the execution payload header from the accepted bid
    * **requires**: payload from relay, signed blinded beacon block from proposer
* consensus-invalid payload
    * assuming the prior conditions hold, this failure occurs if the execution payload contains invalid transactions wrt the consensus logic
    * **requires**: payload from relay, signed blinded beacon block from proposer
* unavailable payload
    * relay committed to making available some payload via producing the bid in the prior step and could not make this payload available
    * note: if the corresponding block *eventually* ends up in the canonical chain (or even orphaned) it should not count against the relay, or perhaps should count as a distinct sub-fault, "published but failed to provide payload"
    * **requires**: payload from relay, signed blinded beacon block from proposer

### performance faults / metrics

Given its central role in network security, I recommend that behavior monitoring is implemented ahead of any performance monitoring.

**TODO**: flesh these out more

some metrics to watch:

- relay latency
- relay error rate (including uptime, HTTP error rate)
- not as interested in getting raw requests/sec but open to hearing otherwise

## relay monitor feature set

### goal: an indexed collection of relay faults

we can start simply by tabulating the number of each of the above faults observed for each relay. a downstream consumer can use the fault data to make their own assessment of relay risk, or form a "reputation" score, etc.

this means a relay monitor keeps a mapping of fault type ordered by "chain coordinate" and indexed by relay (pubkey).
for example:

```json
{
    "0xdeadbeefcafe": { // relay pubkey
        "malformed-bid": [ (slot, parent-hash, proposer-pubkey) , (slot1, parent-hash1, proposer-pubkey1) ], // list of faults and 'when' they occurred, organized by fault type
        ..., // other faults here
        "unavailable-payload": [], // no availability faults observed for this relay
    }
}
```

note: this schema is just a strawman to illustrate and is very likely to change

#### aside on auditability

a relay monitor will also want to persist all events ingested so that any given assessment of relay performance can be re-derived as desired. note that the input data is authenticated with the author's signature (e.g. `signature` on `BuilderBid`) and this, along with the canonical chain, would allow anyone to verify the authenticity of any RM output. if some service existed to capture all block data (even off the canonical chain), you could independently verify any claims made by any relay monitor.

### how to process faults

the requirements for each type of fault induce a dependency graph which suggests the following high-level functionality for the RM organized as a series of concurrent, long-running "tasks":

0. proposer preference aggregation

any relay mux software should send a copy of the `SignedValidatorRegistration` to any connected relay monitors when it performs the routine calls to any connected relays. the RM stores these preferences by pubkey (and will want a validator index mapping).

:::success
**requires**: change `mev-boost` software to upstream validator preferences to configured relay monitors, along with relays

and see note below about this being user-configurable
:::

1. bid inspection

for each slot of the consensus chain and each head of the execution chain, the relay monitor fetches the offered bid from each relay for the scheduled proposer on that chain.
the RM can verify the bid does not have the following faults:

- malformed bid
- consensus-invalid bid
- valid bid, but nonconforming

bids should be stored, indexed by relay pubkey and execution block hash for later analysis

2. payload inspection

there is an option here depending on how much data the proposer wants to share with the relay monitor: they can send the `SignedBlindedBeaconBlock` and the `SignedBuilderBid` (let's call this the "bid acceptance data") in a call to the relay monitor alongside the `getPayload` call to the relay(s).

note proposers may wish to skip this step for either privacy reasons or to avoid the resource overhead of sending the data to an additional endpoint.

in the event the RM never receives the bid acceptance data from the proposer, it can still watch the canonical chain for whatever block arrives and see if it has *some* matching bid (using the execution block hash in the block on-chain), using the `value` to break ties if required. this gives less accuracy to the RM overall but supports smaller stakers (compared to say well-resourced pools) so I'd argue it is a use case worth supporting.

if the RM does receive the bid acceptance data, it can finish the remaining validations after calling the appropriate relay(s) to get the corresponding execution payload:

- payment-invalid bid
- malformed payload
- consensus-invalid payload
- unavailable payload

bonus: the RM can gossip the complete `SignedBeaconBlock` if it is well-peered with the rest of the Ethereum network.

:::success
**requires**: allow relay monitor data collection to be configurable (downstream convo on whether it is on by default and to whom the data is sent)

**requires**: `mev-boost` software sends bid and signed block to RM along with the call to get the execution payload from the relay(s).
:::

3. auction inspection

this section will be left opaque for now and detection of this type of fault will be considered part of the scope of some future work.
it would boil down to making sure that for the set of observed data there were no inconsistencies in what proposers see and what the relays provide.

## rating API

while running the data collection and fault processing described above, the RM continually computes a "scorecard" per relay like the example given above. the "rating" API provides the current snapshot of this "scorecard" whenever a caller requests it.

the primary consumer of this data is the relay mux software that can use it to make policy decisions about which relays to use.

:::success
**requires**: `mev-boost` software should be able to call the "rating" API for a set of relay monitors and process the resulting data

Q: how to handle multiple relay monitors -- just merge them all together (?)

Q: this raises the question of how we want to handle "relay monitor" fraud -- to start, let's just assume RMs are trusted, even more so than relays
:::

i'd suggest there is a "default profile" baked into any relay mux software that encodes a given policy that protects against the most harmful faults. but this could become a configurable part of the `mev-boost` software.

### default profile

the default fault detection algorithm is:

:::info
if there are `M` faults of any type by a relay across the last `N` slots, then the relay mux should stop using that relay for the next `2**N` slots
:::

note: the exact profile is up for debate so if you see something you like better, please let me know!

:::success
**requires**: `mev-boost` software should call the rating API every epoch and process the fault data into a policy decision for any attached relays

**requires**: `mev-boost` software needs to know when epochs are, or at least how long an epoch is (and can just query at some point in an epoch if not at the start)

Q: we can refine to call ahead of any proposal, but then `mev-boost` needs a lot more context about the chain than it currently has -- what do we think about this?
:::

### aside on soundness

there are fallback pathways in client software such that we can safely scope these policy decisions to the relay mux software and the upstream client software will work in terms of network safety and liveness. for example, the relay mux software could temporarily identify no relays for a given proposal based on recent faults and in that case the proposer software would use a local pathway to construct a block in lieu of the remote pathway enabled by the mev-boost ecosystem.

## additional notes / open questions

- how to ensure everyone has the same configuration for relays?
    - e.g. if i have same `pubkey` but different relay endpoints and they serve different data?

## rationale / discussion

### can proposers abuse the monitoring system?

- proposer can make an invalid beacon block for an honest bid
    - this can likely be detected and published as well

also want proposer's view, and can make reputation for bad proposers, if we wanted, but this is definitely outside the scope for now

### why a "simple scorecard" to communicate reputation?

easier to start with something simple, and push some work to the consumer.

exposing aggregate metrics is more neutral than aggregation *and then* taking a view on relay reputation

### privacy concerns

this design of the relay monitor, esp the parts where `mev-boost` software upstreams information about validators to the RM, does not respect validator privacy very well. i think this trade-off is mitigated by the opt-in nature of sending data to the RM and is appropriate given the severity of the risks it addresses. future iterations of this functionality can start incorporating constructions that obscure information flow to support better privacy.
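
to make the scorecard and the default profile described earlier a bit more concrete, here is a minimal python sketch. it is only an illustration under stated assumptions, not a proposed implementation: the names `ChainCoordinate`, `Scorecard`, `count_recent` and `banned_until` are hypothetical, "`M` faults" is read as "at least `M`", and "the last `N` slots" is read as the window `(current_slot - N, current_slot]`.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True, order=True)
class ChainCoordinate:
    """'when' a fault occurred (hypothetical field names, mirroring the strawman schema)."""
    slot: int
    parent_hash: str
    proposer_pubkey: str


class Scorecard:
    """faults indexed by relay pubkey, then fault type, each list ordered by coordinate."""

    def __init__(self) -> None:
        self._faults = defaultdict(lambda: defaultdict(list))

    def record(self, relay: str, fault_type: str, coord: ChainCoordinate) -> None:
        entries = self._faults[relay][fault_type]
        entries.append(coord)
        entries.sort()  # keep each fault list ordered by chain coordinate

    def snapshot(self) -> dict:
        """plain-dict copy of the current state (assumed shape of a rating response)."""
        return {
            relay: {kind: list(coords) for kind, coords in by_type.items()}
            for relay, by_type in self._faults.items()
        }

    def count_recent(self, relay: str, current_slot: int, n: int) -> int:
        """count faults of any type with slot in (current_slot - n, current_slot]."""
        by_type = self._faults.get(relay, {})
        return sum(
            1
            for coords in by_type.values()
            for c in coords
            if current_slot - n < c.slot <= current_slot
        )


def banned_until(
    scorecard: Scorecard, relay: str, current_slot: int, m: int, n: int
) -> Optional[int]:
    """default profile: at least m faults in the last n slots means skip the relay
    for the next 2**n slots. returns the first slot at which the relay may be used
    again, or None if the relay is not banned."""
    if scorecard.count_recent(relay, current_slot, n) >= m:
        return current_slot + 2 ** n
    return None
```

a relay mux could call something like `banned_until` once per epoch against the data returned by the rating API and remember the result locally; how the ban state is persisted, and how data from multiple relay monitors is merged, is left open here just as it is in the text above.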