Expanding the relay monitor

The following is a mini-report of my EPF project work on expanding the feature set of the relay monitor. It can also be used as a companion read to this PR, which implements the actual changes to the functionality. This report assumes some familiarity with the MEV supply chain, PBS, and mev-boost.

In terms of background reading on MEV, I made a page with all my notes, roughly organized / tagged as part of my reading-up during the first few weeks of EPF. It can be found here:

https://hackmd.io/@andriidski/Hkl6Jh_nBn

There are also slides from my short presentation / demo to the EPF fellows, which can be found here:

https://docs.google.com/presentation/d/1OjwVr5jGSGLKNj3u3h7ijw8hU49DX16b4SoI2Y5DEck/edit?usp=share_link

And recording of the presentation / demo:

https://youtu.be/oF_BRlXMVNY?t=6471

TLDR

  • Over a chunk of the EPF program I expanded the feature set of the relay monitor to include:

    • Bid collection + persistence
    • Bid analysis + persistence
    • Queryable data based on time (slots)
      • Bid statistics — aggregate faults per category
      • Bid records — lists of bid fault records per category
    • Relay scoring based on the persisted bid collection + analysis (POC)
      • Reputation scoring
      • Bid delivery rate scoring
  • API spec for what the monitor should do and which endpoints it exposes.

  • Deployed monitor running on Sepolia network

Introduction

On today’s Ethereum network (as of early March 2023) about 90% of blocks are routed through parties called relays and originate from specialized parties referred to as the “builders”. This reality represents the current state of out-of-protocol proposer-builder separation (PBS) and is largely possible due to the R&D work of the Flashbots collective in delivering both the original architecture for interfacing with post-merge PoS Ethereum and the actual software pieces (binaries) that are run in production today.

In the current PoS Ethereum ↔ MEV world the relays exist to broker the communication between builders and validators. Validators can opt in to outsource block construction by running a sidecar piece of software that does the “requesting” of the blocks. Assuming the two roles are logically separated, one of the fundamental problems is that the two parties do not trust each other. If an outsourced block were sent directly to the validator along with a bid, a validator willing to run extra software could take it for themselves by modifying the block and claiming they built it, with no way for the builder to prove malfeasance. At the same time, a builder tasked with providing blocks for proposers that are “dumb”, and blindly propose whatever blocks they are handed, may have incentives to grief all or specific proposers by building invalid blocks that claim to pay out a large sum to the proposer in return for inclusion but in reality contain transactions/block data that are invalid with respect to consensus rules. We also have no guarantees about builder/proposer relationships: at any given time both the builder and the validator may be under the control of the same entity. Separating the two roles is mainly about sweeping many of these problems onto a single side (the one the validator is not on).

With respect to the block delivery pipeline, the relay has a set of tasks to perform to satisfy a set of conditions. On the block building side, the relays receive submissions from multiple block builders, validate those submissions, and store them in the database to await delivery when requested. On the validator side, the relays process registrations from validators, which allow validators to express block preferences, and await requests to deliver a block when it is a given validator’s turn to propose a block to the network. Here is a diagram of the specific multi-step process of how a block gets delivered when construction is outsourced to the builder network:

[Diagram: the multi-step block delivery flow between builders, the relay, and the proposer]

(from https://hackmd.io/A2uex3QFSfiaJJ9BKxw-XA)

A way to think about the relay is as a trusted block escrow and block deliverer. Absent naive “pick from the mempool and build” or first-come-first-served block construction, the relay is the place blocks have to go through if they are to land onchain. Builders and proposers both “sign up” for their jobs with a set of “expectations”, i.e. trust assumptions:

  • Builders sign up
    • To deliver bids + payloads that are valid, though the relay does have a way to ban/rank builders in case they misbehave
    • Expecting their blocks to be validated by the relay but then escrowed and not stolen, e.g. a relay that also controls the validator for the current slot could take the submitted block contents and propose them itself, keeping all the MEV
  • Proposers sign up expecting
    • To receive a header of a valid block when it’s their turn to propose a block, with that header carrying a bid value that is maximal
    • To receive the full contents of the block after they sign over the block header (which ties them to that block for the slot) so that the block can be proposed. Since the proposers interact with the relay in a two-step process where the first step is a commitment to the contents and the second is the reveal, they need the full transaction contents released to them by the relay (referred to as the payload) for the block to land onchain.

The relay itself signs up to do the full job that can enable all of the above. All in all, correct relay operation is instrumental for smooth interactions between builders and proposers and for blocks to land onchain when validators are connected to the MEV-boost block building network.
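
To make the two-step commit/reveal concrete, here is a minimal sketch of the proposer-side calls against the standard builder API (this is not mev-boost code; the endpoint paths follow the builder specs, while types and error handling are stripped down):

```go
package builder

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

// Step 1 (commit): fetch the best header (bid) for the upcoming slot.
// The response is a SignedBuilderBid: the execution payload header and
// the bid value, but not the transactions themselves.
func getHeader(relayURL string, slot uint64, parentHash, proposerPubkey string) ([]byte, error) {
	url := fmt.Sprintf("%s/eth/v1/builder/header/%d/%s/%s",
		relayURL, slot, parentHash, proposerPubkey)
	resp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	return io.ReadAll(resp.Body)
}

// Step 2 (reveal): after signing the blinded block (which commits the
// proposer to this header for the slot), submit it back to the relay to
// receive the full execution payload that can be broadcast to the network.
func getPayload(relayURL string, signedBlindedBlock []byte) ([]byte, error) {
	resp, err := http.Post(relayURL+"/eth/v1/builder/blinded_blocks",
		"application/json", bytes.NewReader(signedBlindedBlock))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	return io.ReadAll(resp.Body)
}
```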

Relay Monitor

Given the importance of reliable relay operation, ideally we’d have as robust a system as possible to ensure safe operation. So, in addition to any metrics/monitoring/performance tooling that relays may operate, a Relay Monitor (RM) is a separate software component that connects to a set of relays and collects data on their operation, with an opt-in component for validators should they choose to share certain data with the RM from their “point of view”. At a high level an RM:

  • Connects to relays that are to be “monitored”
    • Checks their headers (bids)
    • Checks their payloads
    • Computes scores for each relay based on some criteria
  • Exposes an API for above records / stats
  • Exposes an API for validators to contribute to the “monitoring”
    • Validators can share their “view”, for example, showing that they received a bid from the relay + signed over that bid (a sketch of such a shared record follows this list)
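
As a sketch of what that opt-in contribution could look like (the field names and handler here are illustrative, not the monitor’s exact API):

```go
package monitor

import (
	"encoding/json"
	"net/http"
)

// ProposerView is a hypothetical shape for what a proposer shares with the
// monitor: the bid it received from a relay plus its signature over the
// corresponding blinded block, proving it accepted that exact bid.
type ProposerView struct {
	Slot               uint64          `json:"slot"`
	Relay              string          `json:"relay"`
	SignedBuilderBid   json.RawMessage `json:"signed_builder_bid"`
	SignedBlindedBlock json.RawMessage `json:"signed_blinded_beacon_block"`
}

// handleProposerView sketches the opt-in endpoint where validators
// contribute their view of an auction to the monitor.
func handleProposerView(w http.ResponseWriter, r *http.Request) {
	var view ProposerView
	if err := json.NewDecoder(r.Body).Decode(&view); err != nil {
		http.Error(w, "malformed payload", http.StatusBadRequest)
		return
	}
	// Persist the view and cross-check it against the bids the monitor
	// itself collected for the same slot/relay (omitted here).
	w.WriteHeader(http.StatusOK)
}
```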

Header validation is possible in large part because the endpoint for header delivery (getHeader()) can be called by anyone for a given slot and proposer. In other words, the call that the proposer will make can also be made by the relay monitor, in parallel / “on their behalf”, allowing the monitor to

  1. Perform the validations according to its own rules (the proposer / their mev-boost sidecar should still validate on its own, and does). This creates opportunities for extra validation in case something is missed.

  2. Store the bid / header to have a record of a relay’s potentially faulty behavior. In the case where the relay, for whatever reason, responds with different bids to the proposer and to the relay monitor, and the proposer chooses to share with the monitor a record of their view (i.e. “I received bid X from relay Y and here is a signature proving I accepted this bid”), the RM can record this as a fault on the relay.

Operation of the relay monitor is mostly a loop of requesting bids from the relays being monitored and processing any data sent over from proposers in the case of some faulty behavior during an auction. The monitor saves all the data to a DB and can then expose scores for each relay based on any detected faulty behavior.
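
A minimal sketch of one iteration of that loop, assuming the expected parent hash and proposer for the slot are already known from the consensus client side (names and structure are illustrative, not the actual implementation):

```go
package monitor

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"time"
)

// pollRelays is one iteration of the collection loop: for the current
// slot, call getHeader() on every monitored relay and hand any returned
// bid off for analysis and storage.
func pollRelays(relays []string, slot uint64, parentHash, proposerPubkey string,
	store func(relay string, slot uint64, bid []byte)) {
	client := &http.Client{Timeout: 2 * time.Second}
	for _, relay := range relays {
		url := fmt.Sprintf("%s/eth/v1/builder/header/%d/%s/%s",
			relay, slot, parentHash, proposerPubkey)
		resp, err := client.Get(url)
		if err != nil {
			log.Printf("relay %s: getHeader failed: %v", relay, err)
			continue
		}
		bid, _ := io.ReadAll(resp.Body)
		resp.Body.Close()
		if resp.StatusCode == http.StatusNoContent {
			// no bid returned for this slot; relevant for the bid
			// delivery rate score discussed later
			continue
		}
		store(relay, slot, bid)
	}
}
```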

The actual faults that the relay monitor looks out for vary, but ideally the monitor covers everything from malformed signatures on delivered bids to request latencies. For the purposes of the project work I focused mostly on header verification and simulating invalid bids, e.g. an old timestamp or an invalid parent hash.
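
As an illustration, the checks for those two example faults boil down to comparing the bid against what the monitor expects for the slot (the fault names and struct here are illustrative; the real analyzer performs more checks, e.g. on the bid signature):

```go
package monitor

// FaultType categorizes the outcome of analyzing a single bid.
type FaultType string

const (
	NoFault           FaultType = "ok"
	InvalidParentHash FaultType = "invalid_parent_hash"
	InvalidTimestamp  FaultType = "invalid_timestamp"
)

// Bid is a pared-down view of just the fields these checks need; a real
// bid also carries the builder signature, value, gas fields, and so on.
type Bid struct {
	ParentHash string
	Timestamp  uint64
}

// analyzeBid compares a collected bid against what the monitor expects
// for the slot.
func analyzeBid(bid Bid, expectedParentHash string, expectedTimestamp uint64) FaultType {
	if bid.ParentHash != expectedParentHash {
		return InvalidParentHash // bid does not build on the expected head
	}
	if bid.Timestamp != expectedTimestamp {
		return InvalidTimestamp // e.g. a stale timestamp from a past slot
	}
	return NoFault
}
```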

Contributions

Persistence

These changes add a Postgres backend to the relay monitor and slightly refactor how the data is stored. In addition to generating a report of “stats” we also support generating reports of records, i.e. lists of fault records that provide more information about any faults than simple aggregates. We store all the bids that the monitor can collect from the monitored relays as well as the result of every bid analysis. Every bid is tied to a slot, and so is every analysis result, along with the parent hash and proposer public key, so an analysis can be linked back to its bid. The analysis categories store information about the result of the relay monitor’s processing of the bid, e.g. a bid is valid (no fault) or a bid is not valid (category/type of the fault). Storing everything in a DB instead of an in-memory map lets us easily execute queries like “give me the count of bid faults of any kind by relay X within the last N slots” or “give me a list of bid fault records by relay X within the last N slots where the fault was a malformed timestamp”, etc.
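
For example, the first of those queries could look roughly like this against the Postgres backend (table and column names here are illustrative, not the exact schema):

```go
package store

import (
	"context"
	"database/sql"
)

// countFaultsSince shows the kind of aggregate the Postgres backend makes
// cheap: bid faults of any kind by one relay within the last N slots.
func countFaultsSince(ctx context.Context, db *sql.DB, relayPubkey string, fromSlot uint64) (int, error) {
	const q = `
		SELECT COUNT(*)
		FROM bid_analysis
		WHERE relay_pubkey = $1
		  AND slot >= $2
		  AND fault_type <> 'ok'`
	var count int
	err := db.QueryRowContext(ctx, q, relayPubkey, fromSlot).Scan(&count)
	return count, err
}
```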

Scoring

These changes are mostly a POC of what a relay monitor can implement using a persistent store of bids + their analyses. Since all the data about relay bid delivery is stored, we can compute scores for each relay based on any faults that the monitor detects and attributes. The spec contains endpoints for both a report (scores for all relays) and scores per-relay. The two basic scores implemented so far are

  1. Reputation score — how trustworthy a relay is, based on the recency of a relay fault (if any). If a relay is detected to have a fault, the score plummets to 0, after which it progressively increases as time passes (new slots) unless another fault is detected, and so on. Right now the score only takes recency into account and levels off to a perfect score after roughly a day with no faults (~7200 slots), but one can build all kinds of scores where fault types and fault timing are weighted to compute more complex reputation measures. This score is somewhat of a “safety” rating that users running mev-boost can use to make decisions around connecting/disconnecting from relays.
  2. Bid delivery score — how good a relay is at delivering bids, based on the ratio of getHeader() calls resulting in a forwarded bid over the last N slots. The API endpoints expose a parameter to specify a “lookback”, allowing users running mev-boost to get a sense of bid delivery activity for a given relay over various periods of time. This score is somewhat of a “probe” / effectiveness rating since it doesn’t provide information on the safety of a relay but is more of a metric for how active a relay is. Perhaps users could use this as a “liveness” probe for relays, i.e. even if the relay is responding with a 200 OK on the status endpoint, if the bid delivery score drops by X% then disconnect for M slots, etc. A rough sketch of both scores follows below.
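
A rough sketch of both scores, assuming a simple linear recovery curve for the reputation score (the exact curve and weighting in the implementation may differ):

```go
package scoring

// slotsPerDay is roughly one day of slots at 12 seconds per slot; the
// reputation score recovers fully over this window if no new fault appears.
const slotsPerDay = 7200

// ReputationScore: 1.0 for a relay with no recorded faults, 0 right after a
// fault, then a (here: linear) recovery back to 1.0 over roughly a day.
func ReputationScore(currentSlot, lastFaultSlot uint64, hasFault bool) float64 {
	if !hasFault {
		return 1.0
	}
	elapsed := float64(currentSlot - lastFaultSlot)
	if elapsed >= slotsPerDay {
		return 1.0
	}
	return elapsed / slotsPerDay
}

// DeliveryScore is the ratio of getHeader() calls over the lookback window
// that actually came back with a bid.
func DeliveryScore(bidsDelivered, headerRequests int) float64 {
	if headerRequests == 0 {
		return 0
	}
	return float64(bidsDelivered) / float64(headerRequests)
}
```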

Operational Metrics

These changes add general metric endpoints for anyone interested in the operation of the RM. Exposed here are things like “how many bids processed in last N slots”, “how many of those were faults”, “how many registrations did the RM process”, etc.
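
A sketch of what such an endpoint could return, with the lookback supplied as a query parameter (field names, the interface, and the parameter are illustrative):

```go
package monitor

import (
	"encoding/json"
	"net/http"
	"strconv"
)

// OperationalStats is the kind of summary the metric endpoints return.
type OperationalStats struct {
	BidsProcessed     int `json:"bids_processed"`
	BidFaults         int `json:"bid_faults"`
	RegistrationsSeen int `json:"validator_registrations"`
}

// StatsStore is whatever backend can answer "what happened over the last
// N slots"; in practice this is backed by the Postgres store.
type StatsStore interface {
	StatsSince(lookbackSlots uint64) OperationalStats
}

// handleStats answers questions like "how many bids were processed in the
// last N slots", with N supplied as a query parameter.
func handleStats(store StatsStore) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		lookback, err := strconv.ParseUint(r.URL.Query().Get("slots"), 10, 64)
		if err != nil {
			lookback = 7200 // default to roughly one day of slots
		}
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(store.StatsSince(lookback))
	}
}
```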

Miscellaneous

The patch additionally adds

  1. Re-tries to the bid fetching logic. There is now a way to have the relay monitor re-try fetching a bid up to n times at interval t (both configurable); see the sketch below.
  2. Support for Capella blocks. The changes here add types for Capella blocks and backwards compatibility with Bellatrix so that they can be parsed correctly by the consensus client part of the relay monitor.
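
A minimal sketch of the re-try behavior described in item 1 (function and parameter names are illustrative):

```go
package monitor

import (
	"fmt"
	"time"
)

// fetchWithRetries wraps a bid fetch with configurable re-try behavior:
// attempt up to `attempts` times, waiting `interval` between tries.
func fetchWithRetries(fetch func() ([]byte, error), attempts int, interval time.Duration) ([]byte, error) {
	var lastErr error
	for i := 0; i < attempts; i++ {
		bid, err := fetch()
		if err == nil {
			return bid, nil
		}
		lastErr = err
		time.Sleep(interval)
	}
	return nil, fmt.Errorf("bid fetch failed after %d attempts: %w", attempts, lastErr)
}
```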