Order Flow Auctions are on the rise, with 12 in production, 2 more in the pipeline, and good reason to expect tens (hundreds?) of others in the near future. A recent analysis breaks the OFA design space into order type, permissioning, information shared, bid selection, and settlement rights. The information-shared portion of the design space is the subject of this post, which questions an assumption implicit in both partial and full info sharing: namely, that there is a "fair" way to do so.
Whether an OFA chooses to reveal full or partial information, the goal of most OFAs is nonetheless to eventually expose that information to searchers in a permissionless manner. Because the information being shared has economic value, we can expect to see dynamics similar to the latency games that have already developed in TradFi and in mempools. David's recent post does an awesome job of illustrating the cost/benefit structure of latency arms races. The main difference in what is explored here is that it is not FCFS for transaction ordering but FCFS for information reveal (which does not include settlement).

And we don't need to look far to see this info-reveal problem in the wild: see the Arbitrum sequencer fiasco from a few weeks ago, which is architecturally fairly equivalent to information reveal in OFAs.
Some hypothetical info-sharing protocol has two roles, `revealers` and `learners`. The `revealers` receive some piece of information `msg`, likely a user order, and run it through some `reveal` function that makes it publicly available. In `mev-share` the matchmaker is the `revealer` and, alongside `reveal`, has some `privacy` function that turns `msg` into some subset or variation of the original information before passing it to `reveal`.

`learners` then have some `read` function that is used on the information exposed via `reveal`. For example, `read` can be called on information sent to a `learner` over websocket or SSE, or it can be called on data at a specific location in the network or internet.
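To make the roles concrete, here is a minimal sketch; `Msg`, `privacy`, `Revealer`, and `Learner` are my own placeholders, not mev-share's actual API:

```python
# Minimal sketch of the two roles; names and shapes are placeholders,
# not mev-share's actual API.
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Msg:
    raw_order: bytes                 # the user's original order

def privacy(msg: Msg) -> bytes:
    # Turn msg into some subset/variation of the original information
    # (placeholder logic: keep only a prefix of the order).
    return msg.raw_order[:32]

@dataclass
class Revealer:
    subscribers: List[Callable[[bytes], None]] = field(default_factory=list)

    def reveal(self, info: bytes) -> None:
        # Make the (possibly redacted) info publicly available to every learner,
        # e.g. by pushing it over their websocket/SSE connections.
        for push in self.subscribers:
            push(info)

class Learner:
    def read(self, info: bytes) -> None:
        # Whatever the searcher does with the hint: simulate, price, backrun, etc.
        print(f"learned {len(info)} bytes")

# Wiring it up: the matchmaker is the revealer, searchers are learners.
learner = Learner()
revealer = Revealer(subscribers=[learner.read])
revealer.reveal(privacy(Msg(raw_order=b"\x00" * 100)))
```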
We consider `reveal` to be fair if the time between `reveal` and `read` is independent of a learner's resources.

"Resources" is vague here, but we can think anecdotally of 1) searchers on Arbitrum who had the resources to spin up 10-100k websocket connections and 2) geolocation and private backbone deals.
I call this path zeroth because it's a non-starter for a robust decentralized protocol. But it's useful to think through how, from a short-term and well-funded perspective, this approach probably suffices in an OFA's infancy.

The matchmaker can autoscale `revealer` instances and then call `reveal` at some synchronized time (a rough sketch follows below). Granted there are ~300 active searchers, it may not be an awful approach for keeping resources available in the short term.

From our initial definition of "fair" this approach does not reduce geographic concerns, but it seeks to make the information so widely available that no amount of money thrown at the problem could hog all of it. This is an obvious cost vector for matchmakers: searchers can now force larger costs by triggering the autoscaling of revealers, as opposed to the old scenario where exhausting resources only crowds out other searchers and imposes a fixed cost on matchmakers (assuming a monopoly on info doesn't create user UX issues, which would then carry a cost to the OFA at large).
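Roughly what that could look like, assuming the matchmaker controls every instance and they share a clock; the callback-style connections and the 250ms offset are placeholders:

```python
# Path-zero sketch, assuming the matchmaker controls every revealer instance and
# they share a clock: each instance buffers the hint and fires at the same
# wall-clock time, so no single overloaded instance lags behind the others.
import threading
import time

class RevealerInstance:
    def __init__(self, subscribers):
        self.subscribers = subscribers            # this instance's slice of learner connections

    def schedule_reveal(self, hint: bytes, reveal_at: float) -> None:
        def fire():
            time.sleep(max(0.0, reveal_at - time.time()))
            for push in self.subscribers:         # push callbacks stand in for ws/SSE sends
                push(hint)
        threading.Thread(target=fire).start()

# More searcher connections -> autoscale more instances, each holding fewer
# connections, all firing at the same reveal_at.
reveal_at = time.time() + 0.250
for inst in [RevealerInstance([print]) for _ in range(3)]:
    inst.schedule_reveal(b"redacted user order", reveal_at)
```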
Instead of calling `reveal(privacy(msg))`, `revealers` can call `reveal(IPFS_HASH(encrypt(privacy(msg), sk)) + sk)` after they have pinned `encrypt(privacy(msg), sk)` to a random (geographically dispersed) IPFS node in the network. This approach doesn't have to use IPFS, but it's the easiest to think through with a consensus-free data network. It breaks the `learner`'s call to `read` into two steps: 1) get the IPFS hash and 2) retrieve the user-order ciphertext from IPFS. By doing this we're throwing the problem down the road; access to `IPFS_HASH(encrypt(privacy(msg), sk)) + sk` is still "unfair". But it perhaps moves the needle slightly in that it lowers the amount of information revealed (~104b vs ~3kb), allowing a `revealer` to scale connections more easily, and it introduces a random geographic element. And of course^2, `learners` could still optimize their peering on IPFS to get the encrypted order faster, which perhaps highlights the point that even geographically random events in p2p networks can be gamed, a lesson the Ethereum mempool has shown us well. It's also interesting to consider that this approach potentially forces searchers to do "positive work" on the IPFS network by helping the availability of any item less than or equal to the max size of an encrypted order (since the obvious thing to do would be to not seed any data above that size).
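A toy version of the two-step reveal, with an in-memory dict standing in for the IPFS network; the `pin_to_ipfs`/`fetch_from_ipfs` helpers and Fernet encryption are illustrative stand-ins, not a real IPFS client API:

```python
# Two-step reveal with an in-memory dict standing in for the IPFS network.
# pin_to_ipfs/fetch_from_ipfs are made-up stand-ins, not a real IPFS client API.
import hashlib
from cryptography.fernet import Fernet

_store = {}                                  # "the IPFS network"

def pin_to_ipfs(data: bytes) -> str:
    cid = hashlib.sha256(data).hexdigest()   # content address, like a CID
    _store[cid] = data
    return cid

def fetch_from_ipfs(cid: str) -> bytes:
    return _store[cid]

# --- revealer side ---
order_hint = b"redacted user order"                      # output of privacy(msg)
sk = Fernet.generate_key()
ciphertext = Fernet(sk).encrypt(order_hint)
cid = pin_to_ipfs(ciphertext)                            # pinned to a random, dispersed node
announcement = cid.encode() + b"|" + sk                  # small announcement vs. a full order

# --- learner side: read() is now two steps ---
got_cid, got_sk = announcement.split(b"|", 1)            # 1) get the hash (+ key)
recovered = Fernet(got_sk).decrypt(fetch_from_ipfs(got_cid.decode()))   # 2) fetch + decrypt
assert recovered == order_hint
```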
So shoving things in a random mailbox around the world doesn't work, but what if we propagate orders encrypted to a random `revealer`, so that only one can successfully call `reveal`?
(Very related to the approach below: https://eprint.iacr.org/2022/898.pdf)

(Are there any timing results in the literature around DKG mempools?)
The chosen `revealer` instances would then make the unsigned tx information publicly available. At first glance this approach appears to get us closer to "fair" in that we hypothetically remove the latency-sensitive step, but upon further inspection it's not clear that's true, and on top of that we introduce a new system constraint, a permissioned `revealer` set, which in a byzantine environment === the big C (consensus). On the first point: although there is no single host or set of hosts that searchers can optimize connections to in order to learn who has the info they want to call `read` on, this protocol would necessarily have to announce the next set of chosen `revealers`, which creates a lot of complexity. If the protocol uses something like prevRandao, you could deterministically calculate the next `revealers` within some bound (sketched below), but then we're back at the beginning, as `learners` could race to optimize their connections to those specific peers each epoch. On top of this we have an impossibility: the approach benefits from there being so many `revealers` that one `learner` could not reasonably connect to all of them, while at the same time encrypted orders need to be propagated to them fast enough that `reveal` can still be called while the order is available to be acted on.
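For the prevRandao point, the deterministic selection could look roughly like this; the revealer-set size, committee size, and hashing scheme are all illustrative assumptions:

```python
# Deterministic committee selection from a beacon value, assuming a known
# (staked) revealer set. Set size, committee size, and hashing are illustrative.
import hashlib

REVEALER_SET = [f"revealer_{i}" for i in range(1000)]   # the "known validator/revealer set"
COMMITTEE_SIZE = 16

def choose_revealers(prev_randao: bytes, epoch: int) -> list:
    """Pick this epoch's revealers from the beacon value; anyone can recompute this."""
    chosen = []
    for i in range(COMMITTEE_SIZE):
        seed = hashlib.sha256(
            prev_randao + epoch.to_bytes(8, "big") + i.to_bytes(2, "big")
        ).digest()
        chosen.append(REVEALER_SET[int.from_bytes(seed, "big") % len(REVEALER_SET)])
    return chosen

# The catch raised above: learners can run this exact same function ahead of time
# and race to optimize their connections to these specific peers each epoch.
print(choose_revealers(b"\x01" * 32, epoch=42))
```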
All in all, our hypothetical protocol needs some notion of a "known validator/`revealer` set", which will require us to introduce consensus and staking. I doubt this approach is completely infeasible, but consensus on top of consensus certainly introduces a lot of complexity that may hinder adoption in the short term.
Consensus is heavy and adds multiple new dimensions of complexity, which may be okay on sufficient timelines but poses a risk in the short term. In the absence of any way to create a permissionless yet known set of revealers or searchers that is also sybil resistant, we can use a consensus-free delay mechanism such as PoW.

`Revealers` send around user bids encrypted to a secret key that can be brute forced within a minimal but non-trivial amount of time (see the sketch below). At this point it becomes unsurprising that Arbitrum chose to introduce a similar concept into their sequencer. The key difference is that in the Arbitrum proposal searchers are required to find the magical nonce that gives them `random_economic_opportunity`, whereas with this approach searchers can access the `random_economic_opportunity` without communicating back to the sequencer. In terms of "fairness", this approach departs from that ideal: more compute directly lets you brute force the key faster, and on top of that you can still play the typical resource-hogging games with the `revealer`.
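A toy version of the brute-forceable key idea; the key-derivation scheme and the 16 bits of entropy are illustrative assumptions, and a real design would tune the work factor:

```python
# Weak-key delay sketch: the hint is encrypted under a key with only a few bits
# of entropy, so any learner can recover it after a bounded amount of brute force.
import base64
import hashlib
import os
from cryptography.fernet import Fernet, InvalidToken

BITS = 16                                   # ~65k candidates: small but non-trivial delay

def derive_key(nonce: int) -> bytes:
    # Map a small nonce to a well-formed Fernet key (32 bytes, urlsafe base64).
    return base64.urlsafe_b64encode(hashlib.sha256(nonce.to_bytes(4, "big")).digest())

# Revealer side: pick a random low-entropy nonce and broadcast only the ciphertext.
secret_nonce = int.from_bytes(os.urandom(4), "big") % (1 << BITS)
ciphertext = Fernet(derive_key(secret_nonce)).encrypt(b"redacted user order")

# Learner side: brute force the nonce. More compute finds it sooner, which is
# exactly the fairness problem noted above.
for guess in range(1 << BITS):
    try:
        hint = Fernet(derive_key(guess)).decrypt(ciphertext)
        break
    except InvalidToken:
        continue
print(hint)
```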
While these approaches are likely nowhere near exhaustive, they highlight the tricky tradeoffs between privacy, order liveness, and "fairness" that appear everywhere in MEV research. We seem to be at a fork in the road, deciding between resource waste and complexity.

Reputation seems to be the only promising approach left. It works well for searcher <> builder communication in the wild, but there are some nasty adverse-selection effects that can occur if it is implemented poorly. And decentralized reputation will require coming to consensus on the data used to calculate reputations. It might also encourage cartelization of searchers, i.e. "let's set up a subscription to revealers and forward to our inside circle of searchers only", accumulating reputation under one identity. This of course has its own trust issues inside the newly formed searcher cartel.
(Side note: I wonder if there is some axis along which making info private is efficient. Obfuscating orders can make extraction less efficient, and a 100 wei arb opportunity most likely doesn't need full privacy. But at what swap size is privacy warranted? How about time sensitivity? And can these variables be used as inputs to the `share` function?)
Assuming a bounded number of connections, can we ensure a maximum number of simultaneous submissions?

Is the mempool more fair than the current approach? How would we make such an assessment?