For the last few weeks I have been working on designing a detailed data model for storing arbitrary amounts of data collected by a relay monitor. An interesting, re-usable feature of a relay monitor is being able to independently collect as much data as possible, at the smallest granularity possible, with "analysis" processes then independently / in parallel reducing that data into stats. The key is that the analyzers can differ: for instance, I think it's interesting to be able to build independent analyzers for scoring correctness, censorship, or latency. While the "scores" measure different facets of reputation, the relay monitor itself can be used as a minimally stateful / un-opinionated data collector.
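To make that split concrete, here is a minimal sketch of what an analyzer could look like on top of the collector; the `Analyzer` interface, the `Event` type, and the method set are all hypothetical, for illustration only, not actual relay monitor code:

```go
package main

import "context"

// Event is a raw record persisted by the relay monitor at the smallest
// granularity we can collect, e.g. a single validator registration or a
// single bid submission. Hypothetical type, for illustration only.
type Event struct {
	Slot    uint64 // slots double as the time axis
	Kind    string // "registration", "bid", ...
	Payload []byte // the raw object, as received
}

// Analyzer independently reduces the event stream into a score for one
// facet of relay reputation (correctness, censorship, latency, ...).
// Several analyzers can run in parallel over the same collected data.
type Analyzer interface {
	Name() string
	Process(ctx context.Context, events <-chan Event) (score float64, err error)
}
```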
Last week I hacked together two proofs-of-concept with Postgres and MongoDB (since both have pretty solid Go libraries) to store validator registrations and bid submissions.
For posterity, since earlier updates alluded to the initial work I was doing on the relay monitor but weren't very in-depth:
Exposing relay domain: the intro ticket implemented a way to include metadata along with a transcript. This is now used to include the domain for the relay, which could be useful for verification or in case it updates. There is room to add other metadata under the meta field in the response.
Exposing validator registrations: the follow-up work focused on implementing a basic version of "stats" in the relay monitor, built on basic validator registration collection. This work gave me more ideas that I've been working on since, to make data collection work well. Also, with my new work on time-based queries, queries can now be more specific, e.g. "a stats counter for validator registrations that the relay monitor has processed in the last N slots". Here the work was to add a few endpoints:
/monitor/v1/validators
A stats counter for unique validators that the relay monitor has processed registrations for. A unique registration is identified and stored by the relay monitor via the provided validator "pubkey". Registrations are submitted via the exposed registerValidator endpoint. Example response (a hypothetical shape, just a simple counter):
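```json
{
  "count": 1523
}
```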
/monitor/v1/validators/registrations
Exposes a stats counter for the total number of validator registrations that the relay monitor has processed. A validator can, and is expected to, update its block building preferences via the exposed registerValidator endpoint. Example response (again, a hypothetical shape):
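```json
{
  "count": 4210
}
```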
I'm more familiar with Mongo, but with Postgres I was able to re-use a few designs from the mev-boost repo by Flashbots, since conveniently a relay monitor collects the same data as the relay (hence the "monitor"). In a way, one could think of this as an extra redundancy guarantee. In any case, both work relatively simply with the data that we have to collect so far.
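To give a sense of the kind of Postgres layout this ends up as, here is a rough sketch of the two tables; the table and column names are illustrative, not the actual mev-boost schema:

```go
// Illustrative Postgres schema for the two collections; not the actual
// mev-boost schema, just the shape of the data we need to keep.
const schema = `
CREATE TABLE IF NOT EXISTS validator_registrations (
    pubkey        TEXT   NOT NULL,
    fee_recipient TEXT   NOT NULL,
    gas_limit     BIGINT NOT NULL,
    timestamp     BIGINT NOT NULL, -- integer field, cheap time-range queries
    signature     TEXT   NOT NULL,
    PRIMARY KEY (pubkey, timestamp)
);

CREATE TABLE IF NOT EXISTS bids (
    slot            BIGINT NOT NULL,
    parent_hash     TEXT   NOT NULL,
    proposer_pubkey TEXT   NOT NULL,
    bid             JSONB  NOT NULL, -- the SignedBuilderBid as received
    PRIMARY KEY (slot, parent_hash, proposer_pubkey) -- the bid "context"
);`
```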
For the relay monitor we generally need to keep track of and collect validator registrations and the bids submitted by builders. This is because we then use the collected data to see what actually landed on-chain and perform "audits" via an analyzer process.
I've spent the most time on bids; specifically, I've been working on fast ways to allow bids to be added, updated, and queried by time.
For bids there is generally a "context" that we handle, with the "slot" being a "time" identifier (a sketch follows below). In order to keep track of bids we need to take this context into account. The simplest way to keep track of bids is actually to use the context as the "primary key" (this is what our relay monitor does in-memory), though there are downsides. The reason one might want to do this is that the context sort of "namespaces" bids, so we can use it to differentiate them, which is very trivial.
Another obvious reason why we want to use the BidContext as some sort of index is that the relay allows bids to be re-submitted by the builder over a time interval with updated payloads. This gives the builder more time to gather txs and any MEV, potentially resulting in a higher-value block for the proposer (since we do assume more time = more txs = more opportunity for extraction). Therefore, as bids come in, a relay such as the one run by Flashbots is collecting and updating bids in its DB, and we need to do the same.
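In Postgres, this update-in-place maps naturally onto an upsert keyed on the context. A minimal sketch, assuming the illustrative bids table and BidContext type from above:

```go
import (
	"context"
	"database/sql"
	"encoding/hex"
)

// A bid re-submitted for the same context replaces the stored one,
// mirroring what the relay does in its own DB.
const upsertBid = `
INSERT INTO bids (slot, parent_hash, proposer_pubkey, bid)
VALUES ($1, $2, $3, $4)
ON CONFLICT (slot, parent_hash, proposer_pubkey)
DO UPDATE SET bid = EXCLUDED.bid;`

func storeBid(ctx context.Context, db *sql.DB, bc BidContext, bidJSON []byte) error {
	_, err := db.ExecContext(ctx, upsertBid,
		int64(bc.Slot),
		hex.EncodeToString(bc.ParentHash[:]),
		hex.EncodeToString(bc.ProposerPubkey[:]),
		bidJSON)
	return err
}
```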
The actual bid itself follows a structure like this:
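(Sketched here with simplified types; this mirrors the SignedBuilderBid shape from the builder specs rather than the exact Go definitions.)

```go
// SignedBuilderBid, roughly per the builder specs; simplified types.
type SignedBuilderBid struct {
	Message   BuilderBid
	Signature [96]byte // BLS signature over Message; ID's the builder
}

type BuilderBid struct {
	Header ExecutionPayloadHeader // header of the block being offered
	Value  [32]byte               // uint256 bid value in wei
	Pubkey [48]byte               // the builder's BLS public key
}

// ExecutionPayloadHeader fields elided for brevity.
type ExecutionPayloadHeader struct{}
```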
so we have most of the data that we care about, such as the message itself and the signature for ID'ing the builder
(again, credit to Flashbots; we can reuse plenty of the Go types here).
We can then store all the bids in a single table, with each row linked to a context. Time is "provided" by slots, so we can now expose stats on bids over time, e.g. how many bids were seen in the last N slots.
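A minimal sketch of such a time-windowed stat, again against the illustrative bids table (and assuming the same imports as the upsert sketch above):

```go
// Count bids seen in the last n slots; slots are the time axis.
const countRecentBids = `SELECT COUNT(*) FROM bids WHERE slot > $1;`

func bidsInLastNSlots(ctx context.Context, db *sql.DB, head, n uint64) (uint64, error) {
	var count uint64 // caller ensures n <= head
	err := db.QueryRowContext(ctx, countRecentBids, int64(head-n)).Scan(&count)
	return count, err
}
```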
For validator registrations, metric collection is straightforward. We just collect everything into a single table / collection and store SignedValidatorRegistration objects, which give us a way to authenticate / verify the validator, and a RegisterValidatorRequestMessage, which gives the actual data. Time is "provided" by a Timestamp on this object, so making queries on this integer field is pretty straightforward.
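For reference, the shape of these objects, roughly following the Go types in go-boost-utils (simplified here):

```go
// Simplified sketch of the registration types; see go-boost-utils for the
// real definitions.
type SignedValidatorRegistration struct {
	Message   RegisterValidatorRequestMessage
	Signature [96]byte // authenticates / verifies the validator
}

type RegisterValidatorRequestMessage struct {
	FeeRecipient [20]byte
	GasLimit     uint64
	Timestamp    uint64 // unix time; the integer "time" field for queries
	Pubkey       [48]byte
}
```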
Note that since we also store a Pubkey, it's easy to expose a stat endpoint for how the validator registrations have unfolded over time per Pubkey.
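For instance, a sketch of that per-Pubkey breakdown over the illustrative validator_registrations table:

```go
// Registrations per validator pubkey within a time window.
const registrationsPerPubkey = `
SELECT pubkey, COUNT(*) AS registrations, MAX(timestamp) AS latest
FROM validator_registrations
WHERE timestamp >= $1
GROUP BY pubkey
ORDER BY registrations DESC;`
```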
This week I will be working on