Order Flow Auctions are on the rise, with 12 in production, 2 more in the pipeline, and good reason to expect tens (hundreds?) of others in the near future. A recent analysis breaks the OFA design space into order type, permissioning, information shared, bid selection, and settlement rights. The information-shared portion of the design space is the subject of this post, which questions an assumption implicit in both partial and full info sharing: namely, that there is a "fair" way to do so.
Whether an OFA chooses to reveal full or partial information, the goal of most OFAs is nonetheless to eventually expose that information to searchers in a permissionless manner. Because the information being shared has economic value, we can expect to see dynamics similar to the latency games that have already developed in TradFi and in mempools. David's recent post does an awesome job of illustrating the cost/benefit structure of latency arms races. The main difference in what is explored here is that it is not FCFS for transaction ordering but FCFS for information reveal (which does not include settlement).

And we don't need to look far to see this info-reveal problem in the wild: see the Arbitrum sequencer fiasco from a few weeks ago, which is architecturally fairly equivalent to information reveal in OFAs.
Some hypothetical info-sharing protocol has two roles, `revealers` and `learners`. The `revealers` receive some piece of information `msg`, likely a user order, and run it through some `reveal` function that makes it publicly available. In `mev-share` the matchmaker is the `revealer` and, alongside `reveal`, has some `privacy` function that turns `msg` into some subset or variation of the original information before passing it to `reveal`.

`learners` then have some `read` function that is used on the information exposed via `reveal`. For example, `read` can be called on information sent to a `learner` over websocket or SSE, or it can be called on data at a specific location in the network or internet.
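To make the roles concrete, here is a minimal sketch; `Msg`, `privacy`, `Revealer`, and `Learner` are my own placeholders, not mev-share's actual API:

```python
# Minimal sketch of the two roles; names and shapes are placeholders,
# not mev-share's actual API.
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Msg:
    raw_order: bytes                 # the user's original order

def privacy(msg: Msg) -> bytes:
    # Turn msg into some subset/variation of the original information
    # (placeholder logic: keep only a prefix of the order).
    return msg.raw_order[:32]

@dataclass
class Revealer:
    subscribers: List[Callable[[bytes], None]] = field(default_factory=list)

    def reveal(self, info: bytes) -> None:
        # Make the (possibly redacted) info publicly available to every learner,
        # e.g. by pushing it over their websocket/SSE connections.
        for push in self.subscribers:
            push(info)

class Learner:
    def read(self, info: bytes) -> None:
        # Whatever the searcher does with the hint: simulate, price, backrun, etc.
        print(f"learned {len(info)} bytes")

# Wiring it up: the matchmaker is the revealer, searchers are learners.
learner = Learner()
revealer = Revealer(subscribers=[learner.read])
revealer.reveal(privacy(Msg(raw_order=b"\x00" * 100)))
```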
We consider `reveal` to be fair if the time between `reveal` and `read` is independent of a learner's resources.

"Resources" is vague here, but we can think anecdotally of 1) searchers on Arbitrum who had the resources to spin up 10-100k websocket connections and 2) geolocation and private backbone deals.
I call this path zeroth because it's a non-starter for a robust decentralized protocol. But it's useful to think through how, from a short-term and well-funded perspective, this approach probably suffices in an OFA's infancy.

The matchmaker can autoscale `revealer` instances and then call `reveal` at some synchronized time (a rough sketch follows below). Granted there are ~300 active searchers, it may not be an awful approach for keeping resources available in the short term.

From our initial definition of "fair" this approach does not reduce geographic concerns, but it seeks to make the information so widely available that no amount of money thrown at the problem could hog all of it. This is an obvious cost vector for matchmakers: searchers can now force larger costs by triggering the autoscaling of revealers, as opposed to the old scenario where exhausting resources only crowds out other searchers and imposes a fixed cost on matchmakers (assuming a monopoly on info doesn't create user UX issues, which would then carry a cost to the OFA at large).
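Roughly what that could look like, assuming the matchmaker controls every instance and they share a clock; the callback-style connections and the 250ms offset are placeholders:

```python
# Path-zero sketch, assuming the matchmaker controls every revealer instance and
# they share a clock: each instance buffers the hint and fires at the same
# wall-clock time, so no single overloaded instance lags behind the others.
import threading
import time

class RevealerInstance:
    def __init__(self, subscribers):
        self.subscribers = subscribers            # this instance's slice of learner connections

    def schedule_reveal(self, hint: bytes, reveal_at: float) -> None:
        def fire():
            time.sleep(max(0.0, reveal_at - time.time()))
            for push in self.subscribers:         # push callbacks stand in for ws/SSE sends
                push(hint)
        threading.Thread(target=fire).start()

# More searcher connections -> autoscale more instances, each holding fewer
# connections, all firing at the same reveal_at.
reveal_at = time.time() + 0.250
for inst in [RevealerInstance([print]) for _ in range(3)]:
    inst.schedule_reveal(b"redacted user order", reveal_at)
```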
Instead of calling `reveal(privacy(msg))`, `revealers` can call `reveal(IPFS_HASH(encrypt(privacy(msg), sk)) + sk)` after they have pinned `encrypt(privacy(msg), sk)` to a random (geographically dispersed) IPFS node in the network. This approach doesn't have to use IPFS, but it's the easiest to think through with a consensus-free data network. It breaks the `learner`'s call to `read` into two steps: 1) get the IPFS hash and 2) retrieve the user-order ciphertext from IPFS. By doing this we're throwing the problem down the road; access to `IPFS_HASH(encrypt(privacy(msg), sk)) + sk` is still "unfair". But it perhaps moves the needle slightly in that it lowers the amount of information revealed (~104b vs ~3kb), allowing a `revealer` to scale connections more easily, and it introduces a random geographic element. And of course^2, `learners` could still optimize their peering on IPFS to get the encrypted order faster, which perhaps highlights the point that even geographically random events in p2p networks can be gamed, a lesson the Ethereum mempool has shown us well. It's also interesting to consider that this approach potentially forces searchers to do "positive work" on the IPFS network by helping the availability of any item less than or equal to the max size of an encrypted order (since the obvious thing to do would be to not seed any data above that size).
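A toy version of the two-step reveal, with an in-memory dict standing in for the IPFS network; the `pin_to_ipfs`/`fetch_from_ipfs` helpers and Fernet encryption are illustrative stand-ins, not a real IPFS client API:

```python
# Two-step reveal with an in-memory dict standing in for the IPFS network.
# pin_to_ipfs/fetch_from_ipfs are made-up stand-ins, not a real IPFS client API.
import hashlib
from cryptography.fernet import Fernet

_store = {}                                  # "the IPFS network"

def pin_to_ipfs(data: bytes) -> str:
    cid = hashlib.sha256(data).hexdigest()   # content address, like a CID
    _store[cid] = data
    return cid

def fetch_from_ipfs(cid: str) -> bytes:
    return _store[cid]

# --- revealer side ---
order_hint = b"redacted user order"                      # output of privacy(msg)
sk = Fernet.generate_key()
ciphertext = Fernet(sk).encrypt(order_hint)
cid = pin_to_ipfs(ciphertext)                            # pinned to a random, dispersed node
announcement = cid.encode() + b"|" + sk                  # small announcement vs. a full order

# --- learner side: read() is now two steps ---
got_cid, got_sk = announcement.split(b"|", 1)            # 1) get the hash (+ key)
recovered = Fernet(got_sk).decrypt(fetch_from_ipfs(got_cid.decode()))   # 2) fetch + decrypt
assert recovered == order_hint
```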
So shoving things in a random mailbox around the world doesn't work, but what if we propagate orders encrypted to a random `revealer`, so that only one can successfully call `reveal`?
(Very related to the approach below: https://eprint.iacr.org/2022/898.pdf)

(Are there any timing results in the literature around DKG mempools?)
The chosen `revealer` instances would then make the unsigned tx information publicly available. At first glance this approach appears to get us closer to "fair" in that we hypothetically remove the latency-sensitive step, but upon further inspection it's not clear that's true, and on top of that we introduce a new system constraint, a permissioned `revealer` set, which in a byzantine environment === the big C (consensus). On the first point: although there is no single host or set of hosts that searchers can optimize connections to in order to learn who has the info they want to call `read` on, this protocol would necessarily have to announce the next set of chosen `revealers`, which creates a lot of complexity. If the protocol uses something like prevRandao, you could deterministically calculate the next `revealers` within some bound (sketched below), but then we're back at the beginning, as `learners` could race to optimize their connections to those specific peers each epoch. On top of this we have an impossibility: the approach benefits from there being so many `revealers` that one `learner` could not reasonably connect to all of them, while at the same time encrypted orders need to be propagated to them fast enough that `reveal` can still be called while the order is available to be acted on.
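For the prevRandao point, the deterministic selection could look roughly like this; the revealer-set size, committee size, and hashing scheme are all illustrative assumptions:

```python
# Deterministic committee selection from a beacon value, assuming a known
# (staked) revealer set. Set size, committee size, and hashing are illustrative.
import hashlib

REVEALER_SET = [f"revealer_{i}" for i in range(1000)]   # the "known validator/revealer set"
COMMITTEE_SIZE = 16

def choose_revealers(prev_randao: bytes, epoch: int) -> list:
    """Pick this epoch's revealers from the beacon value; anyone can recompute this."""
    chosen = []
    for i in range(COMMITTEE_SIZE):
        seed = hashlib.sha256(
            prev_randao + epoch.to_bytes(8, "big") + i.to_bytes(2, "big")
        ).digest()
        chosen.append(REVEALER_SET[int.from_bytes(seed, "big") % len(REVEALER_SET)])
    return chosen

# The catch raised above: learners can run this exact same function ahead of time
# and race to optimize their connections to these specific peers each epoch.
print(choose_revealers(b"\x01" * 32, epoch=42))
```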
All in all, our hypothetical protocol needs some notion of a "known validator/`revealer` set", which will require us to introduce consensus and staking. I doubt this approach is completely infeasible, but consensus on top of consensus certainly introduces a lot of complexity that may hinder adoption in the short term.
Consensus is heavy and adds multiple new dimensions of complexity, which may be okay on sufficient timelines but poses a risk in the short term. In the absence of any way to create a permissionless yet known set of revealers or searchers that is also sybil resistant, we can use a consensus-free delay mechanism such as PoW.

`Revealers` send around user bids encrypted to a secret key that can be brute forced within a minimal but non-trivial amount of time (see the sketch below). At this point it becomes unsurprising that Arbitrum chose to introduce a similar concept into their sequencer. The key difference is that in the Arbitrum proposal searchers are required to find the magical nonce that gives them `random_economic_opportunity`, whereas with this approach searchers can access the `random_economic_opportunity` without communicating back to the sequencer. In terms of "fairness", this approach departs from that ideal: more compute directly lets you brute force the key faster, and on top of that you can still play the typical resource-hogging games with the `revealer`.
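A toy version of the brute-forceable key idea; the key-derivation scheme and the 16 bits of entropy are illustrative assumptions, and a real design would tune the work factor:

```python
# Weak-key delay sketch: the hint is encrypted under a key with only a few bits
# of entropy, so any learner can recover it after a bounded amount of brute force.
import base64
import hashlib
import os
from cryptography.fernet import Fernet, InvalidToken

BITS = 16                                   # ~65k candidates: small but non-trivial delay

def derive_key(nonce: int) -> bytes:
    # Map a small nonce to a well-formed Fernet key (32 bytes, urlsafe base64).
    return base64.urlsafe_b64encode(hashlib.sha256(nonce.to_bytes(4, "big")).digest())

# Revealer side: pick a random low-entropy nonce and broadcast only the ciphertext.
secret_nonce = int.from_bytes(os.urandom(4), "big") % (1 << BITS)
ciphertext = Fernet(derive_key(secret_nonce)).encrypt(b"redacted user order")

# Learner side: brute force the nonce. More compute finds it sooner, which is
# exactly the fairness problem noted above.
for guess in range(1 << BITS):
    try:
        hint = Fernet(derive_key(guess)).decrypt(ciphertext)
        break
    except InvalidToken:
        continue
print(hint)
```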
While these approaches are likely nowhere near exhaustive, they highlight the tricky tradeoffs between privacy, order liveness, and "fairness" that appear everywhere in MEV research. We seem to be at a fork in the road, deciding between resource waste and complexity.

Reputation seems to be the only promising approach left. It works well for searcher <> builder communication in the wild, but there are some nasty adverse-selection effects that can occur if it is implemented poorly. And decentralized reputation will require coming to consensus on the data used to calculate reputations. It might also encourage cartelization of searchers, i.e. "let's set up a subscription to revealers and forward to our inside circle of searchers only", accumulating reputation under one identity. This of course has its own trust issues inside the newly formed searcher cartel.
(Side note: I wonder if there is some axis along which making info private is efficient. Obfuscating orders can make extraction less efficient, and a 100 wei arb opportunity most likely doesn't need full privacy. But at what swap size is privacy warranted? How about time sensitivity? And can these variables be used as inputs to the `share` function?)
Assuming a bounded number of connections, can we ensure a maximum number of simultaneous submissions?

Is the mempool more fair than the current approach? How would we make such an assessment?