The following is a mini-report of my EPF project work on expanding the feature set of the relay monitor. It can also serve as a companion read to the PR that implements the actual changes to the functionality. This report assumes some familiarity with the MEV supply chain, PBS, and mev-boost.
In terms of background reading on MEV, I made a page with all my notes, roughly organized / tagged, as part of my reading during the first few weeks of EPF. It can be found here:
https://hackmd.io/@andriidski/Hkl6Jh_nBn
There are also slides from my short presentation / demo to the EPF fellows, which can be found here:
And a recording of the presentation / demo:
https://youtu.be/oF_BRlXMVNY?t=6471
Over a chunk of the EPF program I expanded the feature set of the relay monitor to include:

- An API spec for what the monitor should do and which endpoints it exposes.
- A deployed monitor running on the Sepolia network.
On today’s Ethereum network (as of early March 2023), about 90% of blocks are routed through parties called relays and originate from specialized parties referred to as “builders”. This is the current state of out-of-protocol proposer-builder separation (PBS), and it is largely possible due to the Flashbots collective’s R&D in delivering both the original architecture for interfacing with post-merge PoS Ethereum and the actual software pieces (binaries) that are run in production today.
In the current PoS Ethereum ↔ MEV world, relays exist to broker communication between builders and validators. Validators can opt in to outsourcing block construction by running a sidecar piece of software that does the “requesting” of blocks.

Assuming the two roles are logically separated, one of the fundamental problems is that the two parties do not trust each other. If an outsourced block were sent with its bid directly to the validator, a validator willing to run extra software could take it for themselves by modifying the block and claiming to have built it, with no way for the builder to prove malfeasance. At the same time, a builder tasked with providing blocks for “dumb” proposers, who await blocks that they will blindly propose, may have incentives to grief all or specific proposers by building invalid blocks that claim to pay out a large sum to the proposer in return for inclusion but in reality contain transactions/block data that are invalid w.r.t. consensus. We also have no guarantees about builder / proposer relationships: at any given time both the builder and the validator may be under the control of the same entity. Separating the two roles is mainly about sweeping many problems to a single side (the side the validator is not on).
With respect to the block delivery pipeline, the relay has a set of tasks to perform and conditions to satisfy. On the block building side, relays receive submissions from multiple block builders, validate that those submissions are valid, and store them in a database to await delivery when requested. On the validator side, relays process registrations from validators, which let validators express block preferences, and await requests to deliver a block when a validator’s turn comes up to propose a block to the network. Here is a diagram of the specific multi-step process of how a block gets delivered when construction is outsourced to the builder network:
(from https://hackmd.io/A2uex3QFSfiaJJ9BKxw-XA)
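To make the proposer-side step of this flow concrete, here is a minimal, hypothetical sketch (in Python, with illustrative names that are not from the actual mev-boost codebase) of what happens once bids come back from multiple relays: invalid bids are dropped and the highest-paying one wins the auction.

```python
# Hypothetical sketch of proposer-side bid selection: collect bids from
# several relays, drop ones that don't build on the expected parent, and
# pick the highest-paying remainder. Names are illustrative only.
from dataclasses import dataclass


@dataclass
class Bid:
    relay: str        # relay that forwarded the bid
    value_wei: int    # amount the builder promises to pay the proposer
    parent_hash: str  # parent block the bid claims to build on


def pick_winning_bid(bids, expected_parent_hash):
    """Discard bids that don't extend the expected parent, then take the
    highest value; return None if no valid bid arrived."""
    valid = [b for b in bids if b.parent_hash == expected_parent_hash]
    return max(valid, key=lambda b: b.value_wei, default=None)


bids = [
    Bid("relay-a", 1 * 10**17, "0xabc"),
    Bid("relay-b", 2 * 10**17, "0xabc"),
    Bid("relay-c", 9 * 10**17, "0xdef"),  # wrong parent, must be dropped
]
winner = pick_winning_bid(bids, "0xabc")  # relay-b wins despite relay-c's bigger bid
```

The real selection in mev-boost additionally verifies relay signatures and enforces a minimum bid, but the shape of the decision is the same.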
One way to think about the relay is as a trusted block escrow and block deliverer. Absent naive “pick from the mempool and build” or “fcfs” local construction, the relay is the place blocks have to go through if they are to land onchain. Builders and proposers both “sign up” for their jobs with a set of “expectations”, i.e. trust assumptions:
The relay itself signs up to do the full job that can enable all of the above. All in all, correct relay operation is instrumental for smooth interactions between builders and proposers and for blocks to land onchain when validators are connected to the MEV-boost block building network.
Given the importance of reliable relay operation, ideally we’d have as robust a system for ensuring safe operation as possible. So, in addition to any metrics/monitoring/performance tooling that relays themselves may operate, a Relay Monitor (RM) is a separate software component that connects to a set of relays and collects data on their operation, with an opt-in component for validators should they choose to share certain data with the RM from their “point of view”. At a high level an RM:
Header validation is possible largely because the endpoint for header delivery (`getHeader()`) can be called by anyone for a given slot and proposer. In other words, the call that the proposer will make can also be made by the relay monitor, roughly in parallel / “on their behalf”, allowing the monitor to:

- Perform validations according to its own rules (proposers / their `mev-boost` sidecar still should, and do, validate on their own). This creates opportunities for more validation in case something is missed.
- Store the bid / header to keep a record of a relay’s potentially faulty behavior. If the relay, for whatever reason, responds with different bids to the proposer and to the relay monitor, and the proposer chooses to share with the monitor a record of their view (i.e. “I received bid X from relay Y and here is a signature proving I accepted this bid”), then the RM can record this as a fault on the relay.
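As an illustration of the per-bid analysis, a sketch of the kinds of checks the monitor might run on a fetched bid could look like the following. The field names and fault categories here are assumptions for the example, not the monitor’s actual schema.

```python
# Illustrative bid analysis: compare a bid fetched via getHeader() against
# what the monitor expects for the slot. Field names and fault category
# strings are assumptions, not the real monitor's schema.
def analyze_bid(bid, expected_parent_hash, expected_timestamp):
    """Return a fault category string, or None if the bid looks valid."""
    if bid["parent_hash"] != expected_parent_hash:
        return "invalid_parent_hash"
    if bid["timestamp"] != expected_timestamp:
        return "invalid_timestamp"
    if int(bid["value"]) <= 0:
        return "empty_payment"
    return None


fault = analyze_bid(
    {"parent_hash": "0xaa", "timestamp": 1678000000, "value": "1000"},
    expected_parent_hash="0xaa",
    expected_timestamp=1677999988,
)  # -> "invalid_timestamp"
```

In the real monitor, checks like signature verification against the relay’s public key would sit alongside these structural ones.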
Operation of the relay monitor is mostly a loop of requesting bids from the monitored relays and processing any data sent over from proposers in case of faulty behavior during an auction. The monitor saves all the data to a DB and can then expose scores for each relay based on any detected faulty behavior.
The actual faults that the relay monitor looks out for vary, but the idea is to eventually monitor everything from malformed signatures on delivered bids to request latencies. For the purposes of the project work I focused mostly on header verification and simulating invalid bids, e.g. an old timestamp or an invalid parent hash.
These changes add a Postgres backend to the relay monitor and slightly refactor how the data is stored. In addition to generating a report of “stats”, we also support generating reports of records, i.e. lists of fault records that provide more information about faults than simple aggregates. We store all the bids the monitor can collect from the monitored relays, as well as the result of every bid analysis. Every bid and every analysis result is tied to a slot, along with the parent hash and proposer public key, so an analysis can be linked back to its bid. The analysis categories record the outcome of the relay monitor’s processing of a bid, e.g. the bid is valid (no fault) or the bid is not valid (category/type of the fault). Storing everything in a DB instead of an in-memory map lets us easily execute queries like “give me the count of bid faults of any kind by relay X within the last N slots” or “give me a list of bid fault records by relay X within the last N slots where the fault was a malformed timestamp”, etc.
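A minimal sketch of this persistent-store idea, using Python’s `sqlite3` in place of Postgres so it is self-contained. The table and column names are illustrative, not the PR’s actual schema.

```python
# Sketch of the bid/analysis store; sqlite3 stands in for Postgres and
# the schema is illustrative, not the relay monitor's actual one.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE bids (
    slot INTEGER, relay TEXT, parent_hash TEXT, proposer_pubkey TEXT, value TEXT
);
CREATE TABLE analyses (
    slot INTEGER, relay TEXT, parent_hash TEXT, proposer_pubkey TEXT,
    fault TEXT  -- NULL means the bid was analyzed as valid
);
""")

rows = [
    (100, "relay-x", "0xaa", "0xkey1", None),
    (101, "relay-x", "0xbb", "0xkey2", "invalid_timestamp"),
    (102, "relay-x", "0xcc", "0xkey3", "invalid_parent_hash"),
    (102, "relay-y", "0xcc", "0xkey3", None),
]
db.executemany("INSERT INTO analyses VALUES (?, ?, ?, ?, ?)", rows)

# "count of bid faults of any kind by relay X within the last N slots"
head_slot, n = 102, 5
(fault_count,) = db.execute(
    "SELECT COUNT(*) FROM analyses "
    "WHERE relay = ? AND fault IS NOT NULL AND slot > ?",
    ("relay-x", head_slot - n),
).fetchone()  # -> 2
```

The second example query from the text is the same shape with an extra `AND fault = 'invalid_timestamp'` predicate, which is exactly the kind of filtering an in-memory map makes awkward.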
These changes are mostly a POC of what a relay monitor can implement using a persistent store of bids + their analyses. Since all the data about relay bid delivery is stored, we can compute scores for each relay based on any faults that the monitor detects and attributes. The spec contains endpoints for both a report (scores for all relays) and per-relay scores. The two basic scores implemented so far are:

- A score based on detected faults, which `mev-boost` can use to make decisions around connecting/disconnecting from relays.
- The share of `getHeader()` calls resulting in a forwarded bid over the last N slots. The API endpoints expose a parameter to specify a “lookback”, allowing users running `mev-boost` to get a sense of bid delivery activity for a given relay over various periods of time. This score is somewhat of a “probe” / effectiveness rating, since it doesn’t provide information on the safety of a relay but is more of a metric for how active a relay is. Perhaps users could use this as a “liveness” probe for relays, i.e. even if the relay is responding with a 200 OK on its status endpoint, if the bid delivery score drops by X%, then disconnect for M slots, etc.

These changes also add general metric endpoints for anyone interested in the operation of the RM. Exposed here are things like “how many bids were processed in the last N slots”, “how many of those were faults”, “how many registrations did the RM process”, etc.
The patch additionally adds