
This is MEV

MEV has largely been a field where engineering drives science, it's now time for science to drive engineering: we present a formalization of MEV and the theory/taxonomy based on it.

Setting the scene

Objects

In our universe of objects, there exist:

  • a virtual machine (domain) \(\Xi\) that executes state transitions
  • a mechanism \(M\) that lives on domain \(\Xi\)
  • multiple agents, \(p_1, p_2, ... p_n \in P\)
  • an agent, the coordinator, \(c\). \(c\) is the one who executes mechanism \(M\).

Agents

The agents are subject to several constraints:

  • agent \(p_i\) operates under \(\Gamma_{p_i}\)
  • coordinator \(c\) operates under \(\Gamma_c\)

We use \(\Gamma_P\) to represent the union of the constraints for the agents in \(P\) and \(\Gamma\) to represent the union of the constraints for \(P\) and \(c\). We deliberately under-specify \(\Gamma\) and expand later when we provide an axiomatic semantics for it.

Interaction

We use the following notations to describe how agents interact with \(M\):

  • \(T_{p_i}\) is the transaction that agent \(p_i\) sends to \(M\). We write \(T_P\) to represent \(\bigcup_{p_i \in P} T_{p_i}\) and similarly \(T_c\) the set of transactions submitted by the coordinator \(c\)
  • \(\Xi\) takes as input the previous state \(s_{i-1} \in S\) and the set of transactions \(T=T_P \cup T_c\), and outputs a state \(s_i\). Note that \(s_i\) has to be reachable from \(s_{i-1}\) using only transitions in the powerset of \(T\), \(\mathcal{P}(T)\). We say that a transition is admissible if its target is within the set of states reachable using the powerset of the transactions (a minimal sketch follows this list).
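
As a minimal sketch of the admissibility condition (our illustration, not part of the formalism): `xi_apply` is a hypothetical single-transaction transition, and the reachable set is computed by brute force over the powerset of \(T\).

```python
from itertools import chain, combinations, permutations

def powerset(txs):
    """All subsets of the transaction set T."""
    return chain.from_iterable(combinations(txs, r) for r in range(len(txs) + 1))

def reachable_states(xi_apply, s_prev, txs):
    """States reachable from s_prev using any subset of T, in any execution order.

    xi_apply(state, tx) -> state is a hypothetical single-transaction
    transition; the domain Xi is its repeated application.
    """
    states = set()
    for subset in powerset(txs):
        for order in permutations(subset):
            s = s_prev
            for tx in order:
                s = xi_apply(s, tx)
            states.add(s)
    return states

def is_admissible(s_next, xi_apply, s_prev, txs):
    """A transition is admissible iff its target lies in the reachable set."""
    return s_next in reachable_states(xi_apply, s_prev, txs)

# Toy usage: states are ints, transactions add to the state.
add = lambda s, tx: s + tx
print(sorted(reachable_states(add, 0, [1, 2])))   # [0, 1, 2, 3]
print(is_admissible(3, add, 0, [1, 2]))           # True
```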

Furthermore, the agents have the following properties:

  • \(\iota_i \in I_i\) is an agent \(p_i\)'s true type, i.e., its private information/signal that is not known to any other agent in \(P\) or to \(c\). For example, the pricing function of an agent. We denote the space of all possible types of an agent \(p_i\) as \(I_i\) and the type space of all agents as \(I^n\).
  • \(\theta_i \in \Theta_i\) is the message that agent \(p_i \in P\) sends to \(M\). We denote \(\Theta_i\) as the message space of agent \(p_i\) and \(\Theta^n\) as the message space of all agents. In practice, a message may be the preference of an agent, e.g., its preference curve on a fee escalator.
  • \(v_i: S, I^n \rightarrow \mathbb{R}^+\) is the valuation function of an agent \(p_i\) that depends on the output state by \(\Xi\) and all agents' private information/types.

Here, we define "ingredients" as program statements that can cause state transitions in \(\Xi\), i.e., the transactions \(T\) (as opposed to messages in \(\Theta^n\) which are not valid inputs into \(\Xi\)).

Of course, in reality, the messages in \(\Theta^n\) might be encoded as a smart contract (as in the case of coinbase.transfer) and sent together with the ingredients in \(T_{p_i}\), which in turn can cause state transitions. For simplicity, we pretend that messages are separate from transactions.

Definition

Given those notions, we can define:

  • \(f: \Theta^n, T \rightarrow S\) is the outcome function of \(M\); it takes as input the transactions and messages sent by all agents and outputs a state
  • \(\pi: \Theta^n, T \rightarrow \mathbb{R}^{n}\) is the payment function of \(M\), which outputs a payment vector \(\bar{a}\): it takes the transactions and messages of all agents and outputs \(\bar{a}\) recording the payment to be made by each agent. We write \(\bar{a}(p_i)\) to denote the payment made by agent \(p_i\), and use the shorthand \(\bar{a}(c)\) to denote the payment of the coordinator \(c\), which equals \(\sum_{p_i \in P} \bar{a}(p_i)\).
  • \(u_i: S, I^n \rightarrow \mathbb{R}\) is the utility of agent \(p_i\) given an outcome state and a vector of types, which equals its value plus its payment: \(u_i = v_i + \bar{a}(p_i)\).
  • \(W: S, I^n \rightarrow \mathbb{R}^+\) is the social welfare function, which records the total social welfare of a state \(s_i \in S\) when the agents have private types (signals) \(\iota \in I^n\). For example, a utilitarian welfare function \(W = \sum_{p_i\in P} u_i\) is the summation of the utilities (a minimal sketch of how these pieces compose follows this list).
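
A minimal sketch of how these pieces compose, with placeholder valuations and payments (the helper names `utilities`, `utilitarian_welfare`, and `coordinator_payment` are ours, not part of the definition):

```python
def utilities(s, types, valuations, payments):
    """u_i = v_i(s, types) + payment made by p_i (as in the definition above)."""
    return {p: valuations[p](s, types) + payments[p] for p in valuations}

def utilitarian_welfare(us):
    """One possible welfare function: W = sum of agent utilities."""
    return sum(us.values())

def coordinator_payment(payments):
    """bar a(c) = sum over agents of bar a(p_i)."""
    return sum(payments.values())

# Toy usage with made-up valuations and payments for agents A and B.
vals = {"A": lambda s, t: 3.0, "B": lambda s, t: 1.0}
pays = {"A": 0.5, "B": 0.2}
us = utilities("s_i", {"A": None, "B": None}, vals, pays)
print(us, utilitarian_welfare(us), coordinator_payment(pays))
```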

Similar to the conceptual separation of ingredients and messages, we conceptually separate payments and state transitions even though in practice they are mixed.

The mechanism \(M\)'s inputs are: the messages sent by agents in \(P\), \(\Theta^n\); the transactions sent by agents in \(P\), \(T_P\); and the previous state \(s_{i-1}\). After receiving those inputs, within \(M\), the coordinator \(c\) creates the ingredients \(T_c\) so that \(\Xi(s_{i-1}, T = T_P \cup T_c) = s_{i}\) is in the set of all reachable states. Then, \(c\) computes the outcome and the payment vector via \(f\) and \(\pi\), requiring that the outcome state output by \(f\) is within the set of all reachable states.

Thus, we have \(M (\Theta^n, T_P, s_{i-1}) = (f(\Theta^n, T), \pi(\Theta^n, T)) = (s_{i}, \bar{a})\) under the constraints of:

  • \(\Gamma\)
  • \(s_{i}\) is admissible given \(\Xi(s_{i-1}, T)\)

When the optimization yields ties (or Condorcet paradoxes, as we are ultimately aggregating preferences), we introduce another mechanism \(M_t'\) to break the ties. Of course, we understand that there still exists a coordinator \(c_t'\) for the resolution mechanism \(M_t'\). And since the utility of the agents \(u_i\) constitutes a weak ordering, \(M_t'\) itself might have ties and thus needs to invoke further mechanisms. For simplicity, we assume that in this case \(c_t'\) has dictatorial power.

Mental models

We present some examples which hopefully serve as intuitive mental models for understanding our definition of \(M\).

High-level example: heatmaps

Pretend that there exists a projection from the space of all reachable states (i.e., states for which \(\Xi(s_{i-1}, T)\) is admissible) to a 2D plane. Then, for each agent, we can visualize its utility function \(u_p\) as a heatmap where the temperature of each point equals the utility that agent \(p\) has for that specific state.

So now imagine the social choice function is a utilitarian one where the coordinator \(c\) maximizes the utilitarian social welfare \(W\) for all agents in \(P\) perfectly, with efficiency \(\mathcal{E} = 1\). Then, \(c\)'s job is to overlay every agent's heatmap on top of each other, with the temperature of the overlaid map equaling the sum of every sub-map, choose any of the hottest points, and output the state transition in \(M\) (assuming a randomness tiebreaker \(M_t'\)).

Or, if the coordinator is more advanced in that it considers payoffs in the future, then each point's temperature needs to be summed with future expected temperatures (some aggregated probability of the future occurring multiplied by that future's heatmap) before choosing the hottest point.
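
A small numeric sketch of the utilitarian overlay using NumPy, assuming the reachable states have already been projected onto a 2D grid and each agent's heatmap is given (all numbers are made up):

```python
import numpy as np

# Hypothetical 2D-projected utility heatmaps, one per agent in P.
heatmap_a = np.array([[1.0, 3.0], [0.0, 2.0]])
heatmap_b = np.array([[2.0, 0.0], [4.0, 1.0]])

# Overlay: the coordinator sums the per-agent maps ...
overlay = heatmap_a + heatmap_b

# ... optionally adds a discounted expected future map ...
future = np.array([[0.5, 0.5], [0.0, 1.0]])
prob_future = 0.3
overlay_with_future = overlay + prob_future * future

# ... and picks (any of) the hottest points as the target state spec.
hottest = np.unravel_index(np.argmax(overlay_with_future), overlay.shape)
print(hottest)  # index of the chosen projected state
```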

Specification on state as resources

Traditionally, resource allocation on blockchains has been done using fee markets, because the resources are commoditized (i.e., replaceable, fungible, or at least compatible across time). For the allocation of non-commoditized resources, you can either expand each different kind of resource into another dimension of the fee market (essentially taking the n dimensions to infinitely many dimensions) and then slap a 1559 on it, or use a single dimension in the fee market that treats every kind of resource as one kind (which allocates inefficiently). Neither solution works for the allocation of the resources present in MEV.

In our formalization of \(M\), the resource we are allocating is what specification the state \(s_i\) should adhere to, or, what property the next state in \(\Xi\) should satisfy. \(u_i\) basically describes the utility of an agent \(p_i\) with respect to different specifications[1] on the state. And clearly, "specifications on state" (spec-on-state) is not a commodity. Of course, one can also think of the resource as individual states, but due to the granularity of each agent's utility and how in reality those utilities are communicated (by coinbase.transfer), a more useful mental model is to think of the resource as the specification that the state should satisfy. Thus, MEV is the allocation of the spec-on-state resource, which here we abstract into the mechanism \(M\).

Blockchain example: Transaction Spectrum

We present a "transaction spectrum" that hopefully captures some of the relation between \(T\) and \(u_i\) and therefore sheds light on how we could model \(\Gamma\) and design \(M\).

Given two transactions \(t_1\) and \(t_2\) in \(T\), we allocate the "spec-on-state" resource to them. There are five basic cases of \(T\):

  • conflicting, if including one transaction causes the other to revert. This means that there does not exist a resource (specification on \(s_i\)) that can be allocated to satisfy both and thus we need to optimize over the allocation.
  • partial-parallelizable, if the set of shared data that both \(t_1\) and \(t_2\) touch is non-empty. This means that interchanging the order of execution for \(t_1\) and \(t_2\) will result in states \(s_1, s_2\) where \(s_1 \neq s_2\).
  • commutative, if interchanging the execution order of \(t_1\) and \(t_2\) does not change the end state. This means that if we condition the utilities of players \(U_p\) only on the ending VM execution state output by \(\Xi\), then we have a trivial solution for allocation at this moment in time on this single domain!
  • communicative, assuming our agents are strategic and can observe information other than purely the result state output by \(\Xi\), it is possible that some information is communicated by interchanging the order of \(t_1\) and \(t_2\) even though they do not change the ending VM state, and by this extra communication the agents' behavior (aside from utility) could change.
  • strongly commutative, if interchanging \(t_1\) and \(t_2\) does not change the ending state across domain and across time.

From left to right in the spectrum, we witness a decrease in the overlapping of preferences over different state specifications, i.e., qualitatively (quantitatively things might be very different), the resource allocation problem for \(c\) in \(M\) becomes easier.
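
As a toy illustration of the conflicting / partial-parallelizable / commutative cases above, one could classify a transaction pair by simulating both execution orders; `apply` is a hypothetical transition that raises when a transaction reverts, and the communicative and strongly commutative cases (which depend on information outside the end state, or on other domains and times) are deliberately not captured:

```python
class Reverted(Exception):
    """Raised by the hypothetical apply() when a transaction cannot execute."""

def run(apply, s0, order):
    s = s0
    for tx in order:
        s = apply(s, tx)   # may raise Reverted
    return s

def classify(apply, s0, t1, t2):
    """Classify a transaction pair by simulating both execution orders.

    If either simulated order reverts, we call the pair conflicting.
    """
    try:
        s12 = run(apply, s0, (t1, t2))
        s21 = run(apply, s0, (t2, t1))
    except Reverted:
        return "conflicting"           # including one makes the other revert
    if s12 == s21:
        return "commutative"           # same end state either way
    return "partial-parallelizable"    # order changes the end state

# Toy usage: state is an int balance; tx = (delta, min_required_balance).
def apply(state, tx):
    delta, min_bal = tx
    if state < min_bal:
        raise Reverted()
    return state + delta

print(classify(apply, 10, (-10, 10), (-10, 10)))  # conflicting
print(classify(apply, 10, (+1, 0), (+2, 0)))      # commutative
```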

Tradeoffs

We discuss the limitations we might encounter in our optimization of \(M\).

Properties

Since \(M\) is usually implemented on a blockchain carrying financial value, we first list some unique properties[2] that \(M\) exhibits, which help to further constrain its design space:

  1. enforceable actions and a high-degree of commitment credibility (through smart contracts)
  2. easily transferrable utility so cooperative games can be played (e.g., conditional payments)
  3. the coordinator/mechanism cannot lose money from \(\bar{a}\)
  4. permissionless (e.g., agents can be indistinguishable in a repeated game setting)
  5. easy to use cryptographic primitives for both privacy and validation (of an agent's knowledge and actions), which helps incentive compatibility
  6. agents start with ex-interim values (only knowing their own types/private information, and not knowing the other agents') and can sometimes gain knowledge of others' types through peeking public mempool or available private orderflow
  7. agents have interdependent values, meaning the value function of an agent \(v_i\) depends not only on its own type \(\iota_i\) but also on the types of other agents, \(I^n\). This is intuitive in crypto, as one agent might value a token lower if it learns from another agent that the token's founder is simultaneously running ten other protocols.
  8. the agents can communicate with each other with some noise subject to the cost of the collusion mechanism \(M'\).

Mechanism Design

Obviously, our formalization of \(M\) is very similar to traditional formalizations in mechanism design[3]. But a few important differences set our formulation apart from traditional mechanism design:

  • the outcome of \(M\), unlike in traditional mechanism design, is not common knowledge, but a function of agents' submitted transactions. This is important as it means the design of \(M\) is more akin to the design of a search problem over multiverse timelines, with branch and bound guided by agents' messages \(\Theta^n\) and transactions \(T\) (a rough sketch follows this list).
  • the distribution from which the types of the agents are drawn isn't necessarily common knowledge, and the distributions might not be the same. Of course, it is well known that when the types of agents are not identically and independently distributed, reasoning about the game is intractable, as agents will counter-speculate on each other's action space/rationality, etc., not to mention that the knowledge structure is more complex. We shall demonstrate how we approximate the reasoning using \(\Gamma\) and a weakening of properties later.
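
A rough sketch of that search framing, assuming the caller supplies a welfare score for a (partial) ordering and an optimistic upper bound used for pruning; this only illustrates branch and bound over orderings, not how any production coordinator works:

```python
def branch_and_bound(txs, welfare_of, upper_bound):
    """Search over orderings of txs for the one maximizing welfare_of.

    welfare_of(order) scores a (partial or complete) ordering;
    upper_bound(prefix, remaining) is an optimistic estimate used to prune.
    Both are caller-supplied assumptions, not part of the formalism.
    """
    best_order, best_w = (), float("-inf")

    def recurse(prefix, remaining):
        nonlocal best_order, best_w
        if not remaining:
            w = welfare_of(prefix)
            if w > best_w:
                best_order, best_w = prefix, w
            return
        if upper_bound(prefix, remaining) <= best_w:
            return  # prune: this branch cannot beat the incumbent
        for tx in remaining:
            recurse(prefix + (tx,), remaining - {tx})

    recurse((), frozenset(txs))
    return best_order, best_w

# Toy usage: welfare of an ordering = position-weighted sum (made up),
# with a loose optimistic bound (remaining values added unweighted).
values = {"t1": 3.0, "t2": 5.0, "t3": 1.0}
welfare = lambda order: sum(values[t] / (i + 1) for i, t in enumerate(order))
bound = lambda prefix, rest: welfare(prefix) + sum(values[t] for t in rest)
print(branch_and_bound(["t1", "t2", "t3"], welfare, bound))
```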

\(M\) exists in a 10-dimensional manifold

Based on the unique properties, many tradeoffs emerge.

  • communication: \(P\) and \(c\) might have communication constraints such that they can only communicate part of their utility function, or the full utility but not in time.[4] Ideally, a mechanism should have low communication complexity (e.g., in the number of bits transmitted in the message space \(\Theta^n\), or in the number of queries asked). In our setting, since the network latency of the communication channel is significant and potentially far outweighs the computation time of adding more bits to a message, we should focus less on optimizing query complexity.
  • privacy: if we place opaque privacy (e.g., threshold encryption or commit-reveal schemes) on \(T_P\) and assume no out-of-band communication happens, then the coordinator \(c\) cannot effectively coordinate because no information is communicated. If we use privacy by adding noise, then there is a tradeoff between how much coordination can be done and how much privacy is preserved. Finally, if we have computable privacy (e.g., solutions based on SGX, MPC, or FHE) or privacy based on trust (e.g., a trusted private mempool, or a public mempool that requires the coordinator to output a SNARK proving that the allocations \((s_{i}, \bar{a})\) are indeed done through a pre-defined mechanism), then we are able to both do the coordination and preserve some kind of privacy.

  • computation: \(M\) is a constraint optimization problem searching for an answer in an almost infinite space (the space of all reachable states), so executing it is extremely hard. Thus, \(c\), under computation constraints (e.g., decentralization, throughput, latency), is subject to tradeoffs. Interestingly, the search for \(M\) itself poses a meta-problem on computation: we know automated mechanism design is in general NP-complete, so in using machines to help us optimize the design of \(M\) we still run into constraints.

  • incentive compatibility & individual rationality: many inefficiencies in coordination stem from agents playing mind-games. Ideally, the mechanism \(M\) should satisfy individual rationality and incentive compatibility (\(P\) are incentivized to use \(M\), and \(P\) have an incentive to tell the truth). However, there are many negative or impossibility results on the implementability, budget balancedness, and collusion resistance of incentive-compatible mechanisms, especially for computationally hard problems. Moreover, the typical assumption of direct revelation mechanisms (agents reporting their preferences directly to \(M\)) falls apart in complex environments like a blockchain. As a result, we will have to navigate our way through the incentive compatibility spectrum to find a sweet spot.

  • fairness: even though fairness is an ill-defined term, in a world where users (who create on-chain economic activity quantified by the value of their orderflow) behave like sovereign individuals[5], the "fairness" of a mechanism still acts as an important indicator for the question of "what effect does the spec-on-state resource allocation policy adopted by a domain have on the domain's & the policy's popularity." So if we believe that users, who create orderflow and thus in turn enlarge the reachable set of outcomes and increase the long-term prosperity of the domain/mechanism, are one of the most convex (returns-to-scale) advantages to \(M\) and the domain \(\Xi\) it lives on, then we must value fairness.

  • welfare: given the above tradeoffs, we know that the maximum welfare \(c\) can create in \(M\), and its aggregated expected efficiency in optimizing the welfare function \(W\), are bounded by communication/privacy/computation/incentive compatibility/fairness. Note that welfare doesn't necessarily have a monotonically increasing relation with those tradeoffs; therefore we sometimes need to rely on meta-mechanisms like social norms (as a means to shift the focal point by communicating commitments ahead of the game).

  • revenue: much like how fairness, from a user/orderflow point of view, affects the popularity of the domain \(\Xi\) and the spec-on-state allocation mechanism \(M\), the revenue of the coordinator \(c\) also affects the long-term stability and prosperity of the domain. Specifically, if the mechanism is set up in a way such that the coordinator \(c\) could earn more revenue by deviating from expected behavior (pricing in the risk), then the properties of \(M\) are not robust. In practice, coordinators are typically validators, and by not maximizing their revenue (up to what they would have got if they were actively manipulating the game), we are encouraging dishonest behaviors, which creates centralization in the set of sequencers/proposers/validators, which in turn harms geographical decentralization, network security, censorship resistance, and liveness. Not setting up the revenue direction in \(M\) correctly can impose a great risk to the long-term stability of the protocol, because players cannot be trusted to be honest in an environment where they are encouraged to break the honesty assumption with minimal risk. Overall, the honesty/security/decentralization of the network has to go hand in hand with correct revenue properties of the spec-on-state allocation mechanism \(M\).

  • expressiveness: the expressivity of a mechanism is important as it directly relates to the efficiency of the mechanism in implementing \(W\). Measures of expressivity include the granularity of the outcome space (how many valid outcomes there are), the dimensionality of the message space, or how much an agent can impact the outcome. Essentially, expressiveness comes from an agent's ability to control the pathway of its information (\(T\) and \(\Theta_i\)) and how its information is used along the pathway to impact the allocation.

We've identified roughly 9 dimensions in the tradeoff space of \(M\). Next, we focus on three of them that are most explored in existing designs.

Tetralemma of \(M\)

Fixing communication, incentive compatibility, fairness, revenue, and expressiveness, we observe that there is a tetralemma around the design of \(M\) (you can only choose one face of the tetrahedron). Our job as MEV mechanism designers is to maximize the volume that the tetrahedron of \(M\) occupies.

The tetralemma is just a small example of how we can explore the design space[6]. As the title of this section suggests, the tradeoff space is huge and is a multi-dimensional manifold. So in the end we are all just searching for the perfect "dodecahedron"/"deodetralemma" for \(M\).

For a few more examples on potential tradeoffs:

  • typically, incentive compatibility is an overrated property. It has been shown that by trading off incentive compatibility one can get a big boost in computation and communication guarantees. And naturally, communication touches on privacy: the more information (\(T\) and \(\Theta_i\)) an agent reveals to others, the less "alpha" it's expected to have.
  • the inefficiency in a mechanism's allocation generally comes from agents' private information; for example, if agents know others' types, then only logarithmic communication is needed to achieve efficiency (welfare). Generally, the more (global) information an agent has, the less expressive \(M\) needs to be to achieve efficiency. In some cases, it is more desirable to have a less expressive \(M\) to get higher revenue. This means the tradeoff between expressivity, communication, revenue, and welfare is not monotonic.

Problems

We highlight three problems induced by those tradeoffs which motivate our formalization.[2:1]

  1. coordinator privilege: since \(c\) is a rational agent, it is able to perform actions that change the equilibrium in its favor by leveraging the advantageous position it has under \(\Gamma_c\) (e.g., it has more information than individual agents) or by breaking \(\Gamma_c\) without repercussions (e.g., \(c\) can vertically integrate and break the trust assumptions by "stealing bundles").

  2. coordination cost: there exist many inefficiencies in using \(M\) in the case of an ill-designed \(M\), or in the absence of \(c\) & \(M\).

    For example, in the case of auctions by other means, like receive-order-fairness protocols or randomness-order protocols, agents are approximating the spec-on-state resource allocation mechanism \(M\) using a mechanism \(B\) that is meant for the allocation of blockspace, so the allocation can only be done poorly. Plus, absent the planner \(c\), the coordination by agents in \(P\) to approximate the allocation itself has huge inefficiencies as well, so it's coordination cost squared.

    This is also the case with per-account-based fee markets and priority-gas markets where the mechanism \(B'\) is used to allocate distinct-state-access[7] resources or position-in-blockspace resources, so fundamentally it is still an auction by other means and agents still end up suffering from the same inefficiencies.

  3. incompleteness: in the presence of coordination costs and potential incentive compatibility issues, agents need to coordinate to use the coordination mechanism of \(M\).

    Concretely, suppose that there exists some undesirable property \(Pr\) of \(M\) that arises from the inefficiency of \(M\). One may eliminate the inefficiency by devising a mechanism \(M'\) for agents to coordinate prior to the coordination on \(M\). But since \(M'\) is itself a coordination mechanism (and therefore also a state transition mechanism), it inherits \(Pr\) as well. And since \(M'\) cannot self-coordinate to make the problem go away, one might devise a new mechanism \(M''\), and so on[8].

    You can solve \(Pr\) in the original game, but \(Pr\) comes back in the wrapped game.

    You can slay the Moloch, but the Moloch's Curse lives.

    In fact, we can (informally) state this problem as something akin to Tarski's undefinability theorem or Gödel's second incompleteness theorem: any state transition mechanism that is strong enough to eliminate property \(Pr\) of another mechanism will also have \(Pr\), and it cannot resolve its own \(Pr\)[9]. Of course, here we imply that \(Pr\) is a property that all complex-enough mechanisms exhibit and that any mechanism able to solve \(Pr\) is complex enough.

Mental model for \(\Gamma\): the Sophistication Spectrum

One can identify that all three problems above arise from the action space of agents being too large. Thus, based on this identification and on practical observations from mev-explore (that many users are unsophisticated when using blockchain protocols, or rather, they have to be, given the unsound decentralized finance application-layer design), we give a teaser example of how we could model \(\Gamma\) to approximate agents' behavior.

From left to right, the agent becomes more sophisticated in using the mechanism \(M\). For simplicity, we give definitions:

  • unsophisticated, i.e., agents who can communicate their own utility (with a cost of communication \(\Delta_{u}\)) but are not strategic (do not play with other agents in mind).
  • sophisticated, meaning the agent can effectively collude with other agents (with a cost of collusion of \(\Delta_{c}\)).

We argue that these two definitions capture most user behaviors in practice, and thus it is sufficient for us to give a relatively exhaustive definition of MEV by restraining \(\Gamma\) to consist of those two cases (plus any combinatorial cases).

We expand our reasoning by giving a formal semantics of sophistication in the next section.

A Formalization of MEV

Sophistication semantics

On a high level, informally:

  • \(\Gamma_c\) says \(c\) is always a sophisticated agent.
  • \(\Gamma_P\) says agents in \(P\) may be either sophisticated or unsophisticated.

Concretely, a semantics for sophistication is modeled using three relations: \(\psi\) (monadic) for strategic behavior, and \(\xrightarrow[]{c}\) and \(\xrightarrow[]{k}\) (binary) for collusion and knowledge of utility.

All three relations are defined over elements in the set \(C \cup P\) where \(C = \{ c \}\):

  • \(p_i \xrightarrow[]{c} p_j\) means \(p_i\) can initiate collusion contracts with \(p_j\). Note that this relation is not symmetric[10], i.e., \(p_j\) might not be able to initiate a collusion contract with \(p_i\).
  • \(p_i \xrightarrow[]{k} p_j\) means \(p_i\) has knowledge of the utility function of \(p_j\).
  • \(\psi p_i\) means \(p_i\) has a theory of mind (i.e., is strategic) when playing the game; specifically, agent \(p_i\) will use the knowledge of other agents' utility (by \(\xrightarrow[]{k}\)) and the collusion channels (by \(\xrightarrow[]{c}\)) it has maximally to its advantage.

Notice that when an agent initiates a collusion with another agent, they need to use some mechanism \(M'\). The cost of collusion \(\Delta_{c}\) and the cost of communication \(\Delta_{u}\) defined with the relations \(\xrightarrow[]{k}\) and \(\xrightarrow[]{c}\) are associated per \(M'\)[11].

Formally, \(\Delta_{c}\) is a function parametrized over the tuple of a collusion mechanism \(M'\) and two agents \(p_i\) and \(p_j\) where \(\xrightarrow[]{c}\) is defined over \(p_i p_j\). Similarly for \(\Delta_{u}\).

A sophistication semantics that gives meaning to \(\Gamma_c\) and \(\Gamma_P\) is therefore defined as a sophistication model \(\Gamma= \langle \psi, \xrightarrow[]{k}, \xrightarrow[]{c}, \Delta_c, \Delta_u \rangle\) that satisfies the following axioms (for simplicity of presentation, assume all agents are in \(C \cup P\)):

  1. opaque coordinator: for simplicity of reasoning (and because the coordinator is in a privileged position by executing \(M\)), no agent knows the coordinator's utility, including \(c\) itself.

    \[\forall p, \neg (p \xrightarrow[]{k} c)\]

  2. colluding monotonicity: a non-strategic agent cannot collude.

    \[\forall p_i p_j, (p_i \xrightarrow[]{c} p_j) \rightarrow \psi p_i \wedge \psi p_j\]

  3. self-awareness: every agent in \(P\) (i.e., everyone except the coordinator) knows its own utility.

    \[\forall p_i \in P, (p_i \xrightarrow[]{k} p_i)\]

  4. strategic coordinator: the coordinator is always strategic.

    \[\psi c\]

Common Knowledge: the sophistication model \(\Gamma\) (represented by a graph) is common knowledge to all strategic users, and a non-strategic user has no knowledge of the graph.

Theorem: we can also immediately conclude the following theorem from colluding monotonicity and strategic coordinator (a small sketch checking the axioms and this theorem follows):

  • (in)ability of collusion: every agent colluding with the coordinator is a strategic agent.

    \[\forall p_i, (p_i \xrightarrow[]{c} c) \rightarrow \psi p_i\]
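
To make the semantics concrete, here is a minimal sketch of a sophistication model as a small graph data structure, with direct checks of the four axioms and the derived theorem (the representation and names are ours; the checks follow the formulas above):

```python
from dataclasses import dataclass, field
from typing import Set, Tuple

@dataclass
class SophisticationModel:
    coordinator: str
    agents: Set[str]                                   # the set P
    strategic: Set[str] = field(default_factory=set)   # psi
    knows: Set[Tuple[str, str]] = field(default_factory=set)     # k-edges
    colludes: Set[Tuple[str, str]] = field(default_factory=set)  # c-edges

    def satisfies_axioms(self) -> bool:
        everyone = self.agents | {self.coordinator}
        opaque_coordinator = all((p, self.coordinator) not in self.knows
                                 for p in everyone)
        colluding_monotonicity = all(i in self.strategic and j in self.strategic
                                     for (i, j) in self.colludes)
        self_awareness = all((p, p) in self.knows for p in self.agents)
        strategic_coordinator = self.coordinator in self.strategic
        return (opaque_coordinator and colluding_monotonicity
                and self_awareness and strategic_coordinator)

    def theorem_inability_of_collusion(self) -> bool:
        """Anyone colluding with the coordinator must be strategic."""
        return all(i in self.strategic
                   for (i, j) in self.colludes if j == self.coordinator)

# Example: coordinator c knows everyone; strategic A learns naive B's utility.
m = SophisticationModel(
    coordinator="c", agents={"A", "B"},
    strategic={"c", "A"},
    knows={("c", "A"), ("c", "B"), ("A", "A"), ("B", "B"), ("A", "B")},
)
assert m.satisfies_axioms() and m.theorem_inability_of_collusion()
```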

Justification: we use this specific model of unsophistication based on observations of searcher and user behaviors. Next, we give some examples of why we chose this modeling of \(\Gamma\).

  • frontrunning: a sophisticated user (searcher) is able to learn an unsophisticated user's ingredients and preferences by simulating its transaction prior to the order being finalized. On the other hand, a sophisticated user will try to hide its utility from other unsophisticated users by obfuscating transaction contents or deployed bytecode.
  • public mempool: in the present public mempool, unsophisticated users interact with \(M\) by broadcasting transactions without any protective wrappers (e.g., setting low slippage, including a block number to counter time-bandit attacks, etc.), which essentially makes their utility transparent to sophisticated agents (searchers, blockbuilders, etc.).

Remark on knowledge and collusion: with ex-post information about an unsophisticated agent's utility (and ingredients), a sophisticated agent might update its own utility and ingredients (since they have interdependent values), which may create conflicting preferences on spec-on-state and thus lead to the necessity of collusion. And, due to the (in)ability of collusion (or its contrapositive), sophisticated agents are able to realize an advantage from this ex-post knowledge gained about unsophisticated agents, while unsophisticated agents have neither the knowledge nor the power to react against potential changes in their preference expression.

In practice, this asymmetry of outcomes is reflected by activities including sandwiching, market making, predicting multi-block MEV based on mempool transactions (DEX price discovery), and generalized frontrunning.

Visualization: we can represent a sophistication model using graphs, for example:

(figure unavailable)

We represent the relation \(\xrightarrow[]{k}\) using black arrows, e.g., coordinator \(c\) having a black arrow pointing to agent \(A\) means the coordinator \(c\) has knowledge of agent \(A\)'s utility function. Similarly, we use green arrows for the collusion-initiation relation \(\xrightarrow[]{c}\). We represent strategic agents using white circles and non-strategic agents using yellow circles.

Therefore, the above figure depicts a model (satisfying the axioms above) in which there are three agents, the coordinator knows everybody's utility, and the strategic agent \(A\) learns the utility of the naive agent \(B\).

MEV in the sophisticated case

We give a definition of MEV where the sophistication semantics has five additional axioms:

  1. collusion symmetry: collusion is always symmetric between two strategic agents.

    \[\forall p_i p_j, \psi p_i \wedge \psi p_j \rightarrow (p_i \xrightarrow[]{c} p_j) \rightarrow (p_j \xrightarrow[]{c} p_i)\]

  2. strategic universality: all agents are strategic.

    \[\forall p, \psi p\]

  3. collusion universality: all strategic agents can initiate collusion (with other strategic agents).

    \[\forall p_i p_j, \psi p_i \wedge \psi p_j \rightarrow (p_i \xrightarrow[]{c} p_j)\]

  4. low-cost collusion: cost of collusion approaches zero for all agents uniformly.

    \[\forall p_i p_j, (p_i \xrightarrow[]{c} p_j) \rightarrow \Delta_c(M', p_i, p_j) = \epsilon\]

  5. low-cost knowledge: cost of learning others' utility approaches zero for all agents uniformly.

    \[\forall p_i p_j, (p_i \xrightarrow[]{k} p_j) \rightarrow \Delta_u(M', p_i, p_j) = \epsilon\]

Definition: the MEV of a mechanism \(M\) is the coordinator tax \(\bar{a}(c)\) under \(\langle \Xi, M, c, P, W, \Gamma \rangle\).

Next, we do a case-by-case analysis on possible sophistication models under the axioms our semantics impose.

Case 1: one agent, one coordinator, i.e., \(P = \{ A \}\).

  • Case 1.1. \(\Gamma\) as described by the following figure
(figure unavailable)

Assuming that the collusion mechanism that \(c\) and \(A\) use is \(M'\), the coordinator \(c\) is able to either directly capture the payoff itself or extort \(A\).

In the extortion case, a coalition \(\alpha\) colluding with the coordinator \(c\) can maximally extract:

\[\kappa(\alpha, \beta) = \underset{s_i \in S_{\alpha \cup \beta}}{\operatorname{max}} W_{\alpha \cup \beta}(s_i) - \underset{s_i \in S_{\alpha \cup \mu}}{\operatorname{max}} W_{\alpha}(s_i) - \underset{s_j \in S_{\alpha \cup \mu}}{\operatorname{min}} W_{\beta}(s_j)\]

from another coalition \(\beta\). We define \(S_\alpha\) as the set of states reachable purely using transactions (ingredients) sent by the coalition \(\alpha\), and \(\mu\) as the set of initial ingredient transactions that are available for \(\alpha\) to use. In other words, \(\mu\) is sent by the set of agents whose utility is known by some agent in \(\alpha\), i.e., via a \(\xrightarrow[]{k}\) relation. Now, assume \(c\)'s vertical integration coalition is \(\alpha\); then \(\bar{a}(c) = \bar{a}(\alpha)\).

In the no-collusion case (only relying on \(\xrightarrow[]{k}\)), we write the value a coalition \(\alpha\) (including the coordinator) is able to extract with only the initial set of transactions sent by coalition \(\beta\) as:

\[\zeta(\alpha) = \underset{s_i \in S_{\alpha \cup \mu}}{\operatorname{max}} W_{\alpha}(s_i)\]

Therefore, MEV equals:

\[\max(\kappa(c, A), \zeta(c))\]

which is (with \(I_A\) denoting the set of initial transactions sent by \(A\) to the mechanism)

\[\max(\underset{s_i \in S_{C \cup \{A\}}}{\operatorname{max}} W_{C \cup \{A\}}(s_i) - \underset{s_i \in S_{C \cup I_A}}{\operatorname{max}} U_{c}(s_i) - \underset{s_j \in S_{C \cup I_A}}{\operatorname{min}} U_{A}(s_j),\underset{s_i \in S_{C \cup I_A}}{\operatorname{max}} U_{c}(s_i))\]

In practice, this corresponds to the value a validator can extract through censorship, generalized frontrunning of long-tail strategies, or "stealing bundles."
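
A brute-force numeric sketch of \(\kappa\) and \(\zeta\) for this case; the states, welfare values, and reachable sets below are made-up placeholders, and the point is only the shape of the computation:

```python
def kappa(W_union, W_alpha, W_beta, S_union, S_alpha_mu):
    """kappa(alpha, beta): best joint welfare, minus what alpha can already
    reach on its own, minus the worst beta can be pushed to (over S_{alpha u mu})."""
    return (max(W_union(s) for s in S_union)
            - max(W_alpha(s) for s in S_alpha_mu)
            - min(W_beta(s) for s in S_alpha_mu))

def zeta(W_alpha, S_alpha_mu):
    """zeta(alpha): best welfare alpha reaches using its own txs plus mu."""
    return max(W_alpha(s) for s in S_alpha_mu)

# Toy Case 1.1: coordinator c and a single agent A.
S_union = {"s1", "s2", "s3"}       # reachable by c and A jointly
S_c_mu  = {"s1", "s2"}             # reachable by c plus A's observed txs
U_c = {"s1": 2.0, "s2": 5.0, "s3": 1.0}
U_A = {"s1": 4.0, "s2": 0.0, "s3": 9.0}

k = kappa(lambda s: U_c[s] + U_A[s], U_c.get, U_A.get, S_union, S_c_mu)
z = zeta(U_c.get, S_c_mu)
print(k, z, max(k, z))   # MEV = max(kappa(c, A), zeta(c))
```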

Paradox of low-cost collusion: the axiom of low-cost collusion is paradoxical. To prove this, we see that since the collusion mechanism \(M'\) is ultimately another state transition mechanism, \(\Delta_c \geq \bar{a}'\). But given the axiom we have \(\Delta_c = \epsilon\) so \(\bar{a}' = \epsilon\). Now suppose \(M'\) is a state transition mechanism that is complex enough to encode \(M\) on it, then we have no need to use \(M\) at all as we can just use the variant of \(M\) that is encoded in \(M'\). Therefore, MEV is trivially zero by definition, which means our definition is flawed! To resolve this paradox in our definition, we set \(M = M'\) and assume there's no other cost aside from MEV in using collusion mechanisms, effectively moving \(\bar{a}'\) from \(\Delta_c\) to \(\bar{a}\) such that \(\Delta_c = 0\) without violating the axiom. Finally, note that this paradox highly resembles the problem of incompleteness we discussed earlier.

Remark on collusion mechanism choice: we give a concrete example of why our resolution makes sense in reality. The content of the extortion contract could be highly noticeable to the coordinator \(c'\) who executes the collusion mechanism \(M'\), so \(c\), \(A\), and \(c'\) will end up in the same sophistication model as Case 2.5, where \(c'\) can extract \(\bar{a}'(c') = \max(\kappa(c, A), \zeta(c))\), making \(\bar{a}(c) = \max(\kappa(c, A), \zeta(c)) - \bar{a}'(c') - \bar{a}''(c'') - \bar{a}'''(c''') - \dots\). So \(c\) has no incentive to use an external mechanism \(M'\) for collusion and pay an extra \(\bar{a}'\); it will choose \(M' = M\) and internalize the \(\bar{a}'\) extraction profits. Therefore, in the following case analysis we assume \(M = M'\) and skip the recursive reasoning on the cost of collusion mechanisms.

  • Case 1.2. \(\Gamma\) as described by the following figure
(figure unavailable)

Since coordinator \(c\) has no knowledge of agent \(A\)'s utility, it is not able to collude/extort effectively. Of course, \(c\) can publish a contract based on an ex-interim estimate of \(A\)'s utility function and thus have a positive expected value for MEV extraction. But in practice, ex-interim expectations usually imply repeated games, in which case \(c\) has some knowledge of \(A\)'s utility, violating the assumptions in \(\Gamma\)[12]. Thus, in this case \(\bar{a}(c) = 0\).

In practice, this corresponds to long-tail strategy scenarios (e.g., a whitehat sending a transaction through a private mempool), where there is no other entity aside from \(A\) (as we didn't model \(B\) here) that knows of the strategy (thus their spec-on-state can be merged), and the coordinator \(c\) is not able to decipher the strategy \(A\) has using things like generalized-frontrunning bots.

Case 2.1: \(P = \{ A, B\}\), symmetric knowledge between strategic agents.

  • Case 2.1.1 \(\Gamma\) as described by the following figure
(figure unavailable)

Suppose that, absent \(c\), \(A\) and \(B\) will end up in a set of equilibrium states \(Eq\). We define the price of anarchy[13] as

\[\text{PoA} = \underset{s \in S_{C \cup P}}{\operatorname{max}} W(s) - \underset{s \in Eq}{\operatorname{min}} W(s)\]

Note that the definition here resembles that of MEV in the collusion case:

\[\kappa(\alpha, \beta) = \underset{s_i \in S_{\alpha \cup \beta}}{\operatorname{max}} W_{\alpha \cup \beta}(s_i) - \underset{s_i \in S_{\alpha \cup \mu}}{\operatorname{max}} W_{\alpha}(s_i) - \underset{s_j \in S_{\alpha \cup \mu}}{\operatorname{min}} W_{\beta}(s_j)\]

where coalition \(\alpha\) is \(C\) and coalition \(\beta\) is \(P\). In fact, \(\text{PoA} = \kappa(C, P)\) in this case.
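
In the same toy style, PoA can be computed directly from the two quantities in its definition; the equilibrium set \(Eq\) is an assumed input here, not something this sketch solves for:

```python
def price_of_anarchy(W, S_all, Eq):
    """PoA = best achievable welfare minus the worst equilibrium welfare."""
    return max(W(s) for s in S_all) - min(W(s) for s in Eq)

# Toy numbers: welfare over all reachable states vs. over equilibria.
W = {"s1": 18.0, "s2": 10.0, "s3": 12.0}.get
print(price_of_anarchy(W, {"s1", "s2", "s3"}, Eq={"s2", "s3"}))  # 18 - 10 = 8
```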

We further assume (the validity of this assumption is unclear) that:

\[\kappa(C \cup P_n, P \setminus P_n) \leq \kappa(C, P)\]

where \(P_n \subseteq P\), i.e., the extortion that \(c\) can do by itself to \(P\) is no less than the extortion a coalition formed by \(c\) and any subset of \(P\) can do.

If PoA is zero in this model, i.e., the specifications that \(A\) wants and \(B\) wants can be perfectly merged, then the reasoning reduces to Case 1.1 and \(\bar{a}(c) = \bar{a}_A(c) + \bar{a}_B(c)\), with \(\bar{a}_p(c)\) denoting the coordinator tax in a model identical to Case 1.1 where \(p\) is the only agent in \(P\).

If PoA is not zero, then \(c\) will be able to generate an excess welfare equal to the PoA by publishing a contract asking the two agents for the \(T_P\) that allows the highest-welfare state to be reached. The coordinator \(c\) is then able to transfer utility maximally equal to the PoA to itself (because transferring more would make \(A\) and \(B\)'s utility not enforceable and result in them not committing to the collusion contract, i.e., violate individual rationality). So in this case \(\bar{a}(c) = \text{PoA}\). If \(c\) chooses to extract the value itself, we have \(\bar{a}(c) = \max(\text{PoA}, \zeta(c))\).

Alternatively, if \(c\) does not publish this collusion contract to remove the PoA, the two agents can do so themselves, since they have the required knowledge of each other's utility and a collusion mechanism \(M'\) is available, as depicted in the graph below.

Here, the agent who publishes the collusion contract first essentially acts as the coordinator \(c\) and so can take the excess welfare for itself. However, since we have symmetric knowledge in \(\Gamma\) and the axiom of collusion symmetry, the two agents will enter a race to publish the contract first on \(M'\), where the coordinator of \(M'\), \(c'\), will be able to extract \(\bar{a}'(c')=\max(\text{PoA}, \zeta(c))\) (recall that this is the Moloch's Curse). And since we previously defined \(M=M'\), we have \(\bar{a}(c) = \max(\text{PoA}, \zeta(c))\).

In practice, this corresponds to short-tail strategies such as (statistical) arbitrage or liquidation, where there are multiple agents, each of whom has information on the others' behavior.

  • Case 2.1.2. \(\Gamma\) as described by the following figure
(figure unavailable)

In the case where the price of anarchy is zero, or where it is not zero but the coordinator chooses to publish a collusion contract, we fall back to the reasoning in Case 2.1.1 where \(\bar{a}(c) = \max(\text{PoA}, \zeta(c))\).

In the case where \(\text{PoA} \neq 0\) and the two agents try to coordinate themselves using \(M'\), they bear an extra cost of coordination aside from \(\bar{a}'(c')\). This is because peer-to-peer coordination can have much higher communication complexity than leader-based coordination: agents gain ex-post knowledge of other agents' utility in the process of collusion, and thus their own utility and behavior change with this ex-post knowledge, which in turn updates other agents' knowledge and behavior, and so on recursively until equilibrium. This is the situation where each agent only has knowledge of its own private values (types) and some noisy observation of others' signals. There have been many past studies on how, in an n-dimensional knapsack asset scenario, interdependent private values lead to a high price of anarchy (and sometimes impossibility results).

(figure unavailable)

Essentially, each agent is a part of a Gestalt coordinator (illustrated above) where, absent a global view, they update their local view using only a few "data points" on \(U_p\) (and they communicate those "data points" with linear complexity). Thus, we likely have \(\Delta_c > \bar{a}'(c')\), which means \(\bar{a}(c) > \text{PoA}\) as \(\Delta_c\) is internalized to \(c = c'\). Therefore, \(A\) and \(B\), as rational agents, will not choose to use \(M'\) but rather just let \(c\) publish the collusion contract, so \(\bar{a}(c) = \max(\text{PoA}, \zeta(c))\).

In practice, this corresponds to situations where a powerful "central planner" solves coordination failures almost perfectly, the ultimate dream of the USSR where the optimization problem doesn't solve you. Of course, the implications of this Brave New World scenario are a profound philosophical question to which we do not know the answer.

To concretize these ideas in blockchain terms, imagine agents \(A\) and \(B\) are trading on multiple constant function market makers (CFMMs), where markets can be modeled as graphs. It is well known that we can calculate the PoA of selfish routing in graphs, and past works provide evidence that it is not small in existing DeFi protocols. Therefore, the coordinator \(c\) can reduce the PoA in CFMM trading by calculating an optimal social-welfare routing.

  • Case 2.1.3. \(\Gamma\) as described by the following figure
(figure unavailable)

If the price of anarchy is zero, agents have no incentive to collude with the coordinator, and we fall back to Case 1.2 where \(\bar{a}(c) = 0\).

If the PoA is not zero and agents choose to collude, then we either fall back to Case 2.1.1 where \(\bar{a}(c) = \text{PoA}\) (note that here there is no \(\kappa(c, P)\) term, as \(c\) has no ex-post knowledge of \(A\) and \(B\)'s values) or run into the Gestalt-coordinator situation where \(\bar{a}(c) > \text{PoA}\). So agents would just choose to collude, and they are indifferent between communicating their utility to \(c\) or introducing \(c'\) and coordinating themselves, since in both cases MEV equals the price of anarchy.

In practice, this corresponds to Hayek's advocacy of a decentralized economy: knowledge is distributed in social groups, and each individual agent has tacit knowledge that would be hard to communicate to a central coordinator.

Concretely in blockchain terms, \(\Gamma\) here corresponds to searcher-miner backroom deals where most searchers compete on short-tail strategies: the coordinator (e.g., a BSC validator) has no knowledge of any agent's utility, while the searchers know each other relatively well and their competition drives validator revenues up automatically.

  • Case 2.1.4. \(\Gamma\) as described by the following figure
(figure unavailable)

Similar to previous cases, here \(A\) and \(B\) either:

  • coordinate with \(c\) and pay \(\text{MEV = PoA}\)
  • coordinate with \(c'\) and pay \(\text{MEV > PoA}\)
  • not coordinate at all and lose \(\text{PoA}\) to Moloch.

Case 2.2: \(P = \{ A, B\}\), asymmetric knowledge between strategic agents.

  • Case 2.2.1 \(\Gamma\) as described by the following figure
(figure unavailable)

Intuitively, here \(c\) is able to extract less than or equal to what it could in Case 2.1.2, because \(A\) has more knowledge than it had there and could therefore profit from it by paying less coordinator tax.

On the other hand, \(\zeta(C \cup A, B)\) might be larger than \(\zeta(C, P)\) because of the disadvantageous position \(B\) is in. We introduce:

\[\zeta'(\alpha, \beta) = \underset{\beta' \subset \beta}{\operatorname{max}} \zeta(\alpha \cup \beta')\]

In words, \(\zeta'\) is the maximum of \(\zeta\) over every sub-coalition formed by \(\alpha\) together with a subset of agents in \(\beta\). Using \(\zeta'\), we weaken the MEV in this case to a variant of Case 2.1.2 by saying \(\bar{a}(c) \leq \max(\text{PoA}, \zeta'(c, P))\).
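
A direct brute-force transcription of \(\zeta'\), assuming a `zeta(coalition)` helper that returns the value a coalition can extract (a different signature from the earlier toy):

```python
from itertools import chain, combinations

def zeta_prime(zeta, alpha, beta):
    """zeta'(alpha, beta) = max over subsets beta' of beta of zeta(alpha | beta')."""
    subsets = chain.from_iterable(combinations(list(beta), r)
                                  for r in range(len(beta) + 1))
    return max(zeta(frozenset(alpha) | frozenset(b)) for b in subsets)

# Toy usage: value of a coalition = its size (a stand-in for zeta).
print(zeta_prime(len, {"c"}, {"A", "B"}))  # 3: best is to include both A and B
```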

  • Case 2.2.2 \(\Gamma\) as described by the following figure
(figure unavailable)

Similar to the reasoning in Case 2.2.1, we weaken this case to a variant of Case 2.2.1 as the coordinator has less knowledge, \(\bar{a}(c) \leq \max(\text{PoA}, \zeta'(c, P))\).

  • Case 2.2.3 \(\Gamma\) as described by the following figure
(figure unavailable)

We weaken this case to Case 2.1.4 as agent \(A\) has more knowledge, \(\bar{a}(c) \leq \text{PoA}\).

  • Case 2.2.4 \(\Gamma\) as described by the following figure
(figure unavailable)

We weaken this case to a variant of Case 2.1.2 as the coordinator has less knowledge, \(\bar{a}(c) \leq \max(\text{PoA}, \zeta(c))\).

Case 3: \(n\) agents and one coordinator. This reduces to subcases of Case 2 and Case 1.

From our analysis, we can conclude the following:

Equality: MEV = Moloch Extractable Value: ignoring the individual opportunities captured by a positive \(\zeta\), agents collectively pay an amount equal to the price of anarchy to either \(c\), \(c'\), Moloch, or a mix of them.

MEV in the unsophisticated case

We give a definition of MEV where the sophistication semantics satisfies all the axioms in the sophisticated case (except for strategic universality; here we assume its negation, i.e., there exists at least one non-strategic agent) and one additional axiom:

  1. asymmetric sophistication: all strategic agents have knowledge of all non-strategic agents' utility, and no non-strategic agent has knowledge of any strategic agent's utility.

    \[\forall p_i p_j, \psi p_i \rightarrow \neg \psi p_j \rightarrow (p_i \xrightarrow[]{k} p_j) \wedge \neg (p_j \xrightarrow[]{k} p_i)\]

Definition: the MEV of a mechanism \(M\) is the unsophistication tax imposed by all strategic users on unsophisticated users under \(\langle \Xi, M, c, P, W, \Gamma \rangle\), i.e., \(\sum_{\psi p}\bar{a}(p)\).

Next, we do a case-by-case analysis on possible sophistication models under the axioms our semantics impose.

Case 1: one agent, one coordinator, i.e., \(P = \{ A \}\).

  • Case 1.1. \(\Gamma\) as described by the following figure
(figure unavailable)

This reduces to Case 1.1 in the sophisticated scenario, with \(\bar{a}(c) = \zeta(c)\): the coordinator is not able to extort (thus no \(\kappa\) term) but can only increase its own utility by realizing an advantage from its ex-post knowledge of \(A\).

Case 2: two agents, one coordinator, \(P = \{ A, B\}\).

  • Case 2.1. \(\Gamma\) as described by the following figure
(figure unavailable)

The coordinator can still extort \(A\), so we have \(\kappa(c, A)\). Furthermore, the coordinator can form a coalition with \(A\) to extract value from \(B\), or can extract from \(B\) itself, giving us \(\zeta'(c, A)\). Notice that in this case the price of anarchy between the coalition \(C \cup A\) and \(B\) is paid to the Moloch, since \(B\) is non-colluding. Therefore, we write:

\[\bar{a}(c) = \max(\kappa(c, A), \zeta'(c, A)) = \max(\text{PoA}_{C \cup A}, \zeta'(c, A))\]

Note that \(\bar{a}(A) = 0\) since it will lose the profits to \(c\).

  • Case 2.2. \(\Gamma\) as described by the following figure
(figure unavailable)

Here, since \(A\) and \(B\) are not colluding, the coordinator would not be able to eliminate the price of anarchy fully. Thus, \(\bar{a}(c) = \zeta(c)\). However, note that some PoA can still be eliminated via reordering of the initial transactions sent by \(A\) and \(B\). For example, the two agents \(A\) and \(B\) send \(t_1\) and \(t_2\) and \(c\) receives them in that order. Now suppose \(t_1\) first checks whether a storage variable \(x\) is even; if it is, \(U_A = 100\), and if not, \(U_A = 1\). \(t_2\) adds one to the counter \(x\) and then sets \(U_B = 1\). In this scenario, if \(x = 1\) in the previous state \(s_{i-1}\), the coordinator will be able to generate a surplus welfare of 99 by optimizing the ordering, thus eliminating a huge amount of the price of anarchy compared with a strict receive-order-fairness coordinator (worked out in the sketch below).
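
The ordering example above, written out against a toy state; it only confirms the surplus-welfare arithmetic:

```python
def t1(state):
    """A's tx: utility 100 if x is even, else 1."""
    state["U_A"] = 100 if state["x"] % 2 == 0 else 1
    return state

def t2(state):
    """B's tx: increment x, utility 1."""
    state["x"] += 1
    state["U_B"] = 1
    return state

def welfare(order, x0=1):
    s = {"x": x0, "U_A": 0, "U_B": 0}
    for t in order:
        s = t(s)
    return s["U_A"] + s["U_B"]

print(welfare([t1, t2]))  # receive order: 1 + 1 = 2
print(welfare([t2, t1]))  # optimized:     100 + 1 = 101, a surplus of 99
```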

  • Case 3. The \(n\) agents scenario is similar to our reasoning above.

Removing assumptions: we noticed in Case 2.1 that, due to the axioms of asymmetric sophistication and strategic coordinator, any strategic party aside from the coordinator wouldn't be able to extract value from a non-sophisticated party, because it will always lose to \(c\) in the competition. This limits us from exploring more interesting cases! Next, we remove the axiom of asymmetric sophistication and analyze three sub-cases of \(\Gamma\).

  • Case 1. \(\Gamma\) as described by the following figure
(figure unavailable)

This reduces to Case 2.1 except we can omit the case where \(c\) forms a coalition with \(A\) to extract value from \(B\).

\[\text{MEV} = \bar{a}(c) = \max(\kappa(c, A), \zeta(c)) = \max(\text{PoA}_{C \cup A}, \zeta(c))\]

  • Case 2. \(\Gamma\) as described by the following figure
(figure unavailable)

This reduces to Case 2.1 except we can omit the case where \(c\) extracts the value itself, i.e., omitting the \(\zeta\) cases. Here, only \(A\) has ex-post knowledge of \(B\)'s value and is able to realize it by colluding with \(c\). Therefore we have \(\bar{a}(c) = \kappa(c, A)\) and \(\bar{a}(A) = \zeta(A)\), while \(\text{MEV} = \bar{a}(c) + \bar{a}(A)\), which equals the value extractable by all sophisticated entities.

In practice, this corresponds to a searcher extracting value by sandwiching a user and then hiding her extraction transaction by sending it to a relay (mega-bundle builder) with a private mempool.

  • Case 3. \(\Gamma\) as described by the following figure
(figure unavailable)

The coordinator is still able to extort (using a very limited set of actions, as \(c\) has no ex-post knowledge of \(A\)), so \(\bar{a}(c) = \kappa(c, A)\). Note that since only \(A\) is colluding with the coordinator and the coordinator has no information advantage over \(A\), the spec-on-state will always end up in \(A\)'s favor.

In practice, this corresponds to NFT minting scenarios using simple RPC services like Flashbots Protect, where agent \(A\) expresses its preference to the coordinator by setting its gas price high, while the naive agent \(B\) just clicks buy in its MetaMask in a browser and hopes for the best.

Equality: MEV = Partial PoA + Unsophistication Tax: from our analysis, we can conclude that MEV in the unsophisticated case equals the sum of (i) the price of anarchy eliminated within the coalition formed by strategic agents, and (ii) the unsophistication tax, which affects all agents who have a preference-expression or information disadvantage. Note that since \(B\) cannot collude with the coordinator, its preference doesn't get communicated, and that part of the welfare is surrendered to the Moloch, as well as any welfare that could have been generated from coordinating \(B\) with other agents.

Alternative sophistication semantics

Other proposals for formalizing MEV challenge the assumption of collusion universality. Here we formalize such proposals in our sophistication semantics and analyze two sub-cases.

  • Case 1: \(\Gamma\) as described by the following figure
(figure unavailable)

Since \(A\) and \(B\) cannot collude while \(c\) can peek into both of their transactions, we have \(\text{MEV} = \text{PoA}\). An example: there are two states, \(s_1\) (the highest-welfare state) and \(s_2\) (the original equilibrium state), with utilities \((10,8)\) and \((1,9)\) respectively for agents \(A\) and \(B\). In this case the maximum coordinator tax can be realized by the coordinator publishing a contract colluding with the agents (or the agents just communicating in their transactions, conditioning behaviors on the other agent's behavior) to reach \(s_1\), which has a valuation of 18, and then transferring 8 (which is the PoA) out of the 18 to itself, giving the other two agents their original payoffs.
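
The arithmetic of this example, spelled out with the utilities given above (the transfer rule is the one described in the text, not a general claim):

```python
u = {"s1": {"A": 10, "B": 8},   # highest-welfare state
     "s2": {"A": 1,  "B": 9}}   # original equilibrium state

W = {s: sum(v.values()) for s, v in u.items()}   # welfare: s1 -> 18, s2 -> 10
poa = W["s1"] - W["s2"]                          # price of anarchy = 8
coordinator_tax = poa                            # c keeps exactly the surplus
agents_keep = W["s1"] - coordinator_tax          # 10, the original total payoff
print(W, poa, coordinator_tax, agents_keep)
```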

  • Case 2: \(\Gamma\) as described by the following figure
(figure unavailable)

Here we also have \(\text{MEV = PoA}\), as \(A\) and \(B\) cannot collude while \(c\) can help coordinate. The difference from the previous case is that agents will not have ex-post knowledge of each other, and thus the exact amount of \(\text{PoA}\) may differ (as agents will have different utilities).

Definition of MEV: this kind of alternative semantics is similar to the sophisticated MEV case, except here we do not run into Moloch's Curse scenarios as \(A\) and \(B\) cannot collude using \(M'\).

On "bad MEV"

We conclude that MEV is "bad" in the following cases:

  1. unsophistication tax caused by failure of communication (in absence of \(\xrightarrow[]{c} c\))

    This is clear from our analysis of MEV in the unsophisticated case. Because non-strategic agents fail to express their preferences on spec-on-state, they bear the cost of this failed communication: a part of the welfare is permanently lost to the Moloch.

    Assuming this problem is not completely solvable by making preference-expression UX friendlier and by educating users about MEV, we can challenge the axiom of colluding monotonicity, thereby breaking the theorem of (in)ability of collusion. For example, we can devise a mechanism \(M_d\) where sophisticated users compete to discover (re-construct) and communicate the preferences of unsophisticated users to coordinator \(c\). In this way, unsophisticated users can remain naive and still get the majority of the value from their orderflow, because sophisticated users will erode each other's profits by competing in \(M_d\); basically, the unsophisticated users are vertically integrated with the coordinator \(c_d\) of \(M_d\).

  2. positive \(\zeta\) from ex-post knowledge (in presence of exclusive \(\xrightarrow[]{k}\))

Again this is clear from our analysis: agents with knowledge of other agents (whether strategic or not) might be able to profit from this additional information by colluding with the coordinator to express incompatible preferences, which shows up as a large positive \(\zeta\). If this knowledge is exclusive, then the profit from \(\zeta\) is not redistributive and is lost forever. If it is non-exclusive, then \(\zeta\) reduces to coordinator tax.

We can address this by challenging the axiom of asymmetric sophistication and the axiom of low-cost knowledge, adding privacy to \(M\) such that no agent can gain an information advantage (thus eliminating the \(\zeta\) terms from our MEV equation completely). Ideally this should be computable privacy, so that we can still do block building on private data, eliminating a part of the PoA rather than permanently surrendering it to Moloch.

  3. centralized non-distributive coordinator tax

Since the coordinator tax equals the eliminated price of anarchy, there is little point for the agents in \(P\) to keep using mechanism \(M\) and colluding with \(c\): they can obtain the same payoffs by not having a coordinator at all and paying everything to Moloch. Thus, the presence of MEV in \(M'\) erodes individual rationality and can severely decrease the popularity of any domain adopting \(M'\). We need to remove the \(\text{PoA}\) terms from our MEV equation.

The solution is to decentralize extraction of the coordinator tax and redistribute it to mechanism participants. In this way, we preserve individual rationality and keep \(M\) popular. One method to decentralize and distribute is to split based on the surplus welfare brought by each coalition. However, this assumes the coordinator has a fixed re-executable program, which means we need the participants on this domain to commit to a pre-defined \(M\).
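Referring back to case 1, here is a toy sketch of the \(M_d\) idea: sophisticated searchers compete for a naive user's orderflow by bidding rebates back to the user. The constants `ORDERFLOW_VALUE`, `DISCOVERY_COST`, and `TICK` are hypothetical, and the undercutting loop is only one crude way to model competition; the point is just that competition erodes the searchers' margin so most value flows back to the unsophisticated user.

```python
# Toy model of M_d: searchers compete for a naive user's orderflow by bidding
# rebates back to the user. All numbers below are hypothetical.

ORDERFLOW_VALUE = 100.0   # value extractable from the user's orderflow (assumed)
DISCOVERY_COST = 2.0      # cost of reconstructing the user's preference (assumed)
TICK = 0.5                # minimum outbid increment (assumed)

def winning_rebate(num_searchers, rounds=1000):
    """Deterministic undercutting: rivals keep outbidding each other while profitable."""
    best = 0.0
    for _ in range(rounds):
        for _ in range(num_searchers - 1):
            # a rival outbids only if it still keeps a non-negative margin afterwards
            if ORDERFLOW_VALUE - DISCOVERY_COST - (best + TICK) >= 0:
                best += TICK
    return best

print(winning_rebate(1))   # 0.0  -> a monopolist searcher rebates nothing
print(winning_rebate(3))   # 98.0 -> competition returns almost all value to the user
```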

In a post-MEV world where these three cases of "bad MEV" are solved, unsophisticated users receive compensation equal to the difference between what they get now and what they would have gotten as sophisticated users; the coordinator performs maximal extraction by fully eliminating the PoA and redistributes the profits according to the value each user brings to the coalition; and all users are free from being preyed upon via asymmetric information.

Zero MEV protocols

Is it possible to design a mechanism that is completely free from MEV?

We analyze two protocols that have zero MEV using sophistication semantics.

Case 1. \(\Gamma\) as described by the following figure


Here, \(\Gamma\) satisfies all axioms in the sophisticated case (except for self-awareness and collusion universality) and the following axiom:

  1. universal ignorance: no agent knows any agent's utility, including its own.

    \[\forall p_i p_j, \neg (p_i \xrightarrow[]{k} p_j)\]

In this case MEV is trivially zero because every agent has the same utility for every state, zero. Therefore the price of anarchy is trivially zero and all preferences on spec-on-state are 100% compatible. Essentially, this proposal aims at making \(\Delta_u\) prohibitively large. In practice, this could be realized through opaque privacy.
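Spelled out in one line (writing \(s_{eq}\) for the equilibrium state and \(v_i(s)\) as shorthand for agent \(p_i\)'s valuation of state \(s\), notation introduced only for this derivation):

\[
\forall p_i, s: \; v_i(s) = 0 \;\Rightarrow\; \text{PoA} = \max_{s}\sum_i v_i(s) - \sum_i v_i(s_{eq}) = 0 - 0 = 0 \;\Rightarrow\; \text{MEV} = 0.
\]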

Case 2. Previously we saw that Case 1.2 (and all the sub-cases that reduce to it) has zero MEV. This phenomenon corresponds to domains that only have game computation or transfers, like the Bitcoin network: every user's preference is compatible.

A Treasure Map

At the beginning of this manifesto, I talked about how MEV has been a field where engineering drives science: we first discovered the similarities of on-chain financial activity with traditional HFT, then we studied how trading firms compete with each other in the presence of a primitive coordinator, and now we are here today, pushing changes across the infrastructure and application layers with products that are MEV-aware; even the sophistication semantics \(\Gamma\) itself was defined based on the behaviors of the game being played today.

But we need more. Much of science was invented based on a posteriori knowledge of the physical world, and it then accelerated engineering into a paradigm shift in the way we realize products, by allowing endless new constructs to be built on top of simple-to-reason-about abstract models. Hopefully, here at the end of this manifesto, I've achieved the goal of paving the road for science to drive engineering.

Just to strengthen my bull case for MEV research, I present a treasure map indicating promising directions that lead to better products and the ultimate perfect \(M\):

Taxonomy: we give a taxonomy of MEV based on our formalization.

x-domain

  • cross-domain MEV is just adding multiple \(\Xi\) and \(M\); cross-domain stability => formalize cross-domain stability as a variant of Gibbard's theorem
    • competing mechanisms is x-domain stability => youtube ads, price match policy, sellers collude using cookie sharing
    • Poisson distribution of CEX so we must have discrete time even just for x-domain stability and not for network-delay fairness (price discovery)

fairness

  • a systematic view of coordination cost -> expressivity lattice of mechanisms (for proving an incompleteness theorem, what it means for a mechanism to be Turing complete, etc.) -> fundamental tradeoff between fairness and efficiency, etc.

limitation

  • we didn't model interdependent values in the sophisticated case, might need to
  • benefit of unsophistication
    • no end of history vitalik post
    • welfare isn't correlated with coordination

communication

  • builder time as a commoditized resource

centralization

  • but redistribution makes the validator not happy, plus it can hide MEV profits => this is basically MEV oracles
  • better mechanism than Monarchy (redistributive MEV)
  • assuming coordinator power away
    • OSS, etc.
  • Can reconstruct other people's utility based on collusion contract data points! -> relates to bad MEV case 1 and 2
  • orderflow aggregated by their mevability

inefficiency

  • OFA and the blockspace auction have complementary items: \(u(a,b) > u(a) + u(b)\), i.e., \(u(a,b) = u(a) + u(b) + \text{Moloch EV}\) (see the sketch after this list)
  • currently we run a parallel/sequential auction; we need them to be combinatorial
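To illustrate the complementarity point: with hypothetical bidders `B1`, `B2`, `B3` and assumed valuations (none of these come from the text), selling the two items in separate auctions leaves the complementarity surplus (the "Moloch EV" term) on the table, while a combinatorial auction can allocate the bundle to the bidder that values it.

```python
# Illustrative only: hypothetical bidders and valuations showing why complementary
# items (u(a, b) > u(a) + u(b)) call for a combinatorial rather than a
# parallel/sequential auction.

B1 = {frozenset({"a", "b"}): 10, frozenset({"a"}): 3, frozenset({"b"}): 3}  # bundle bidder
B2 = {frozenset({"a"}): 4}                                                  # item-a bidder
B3 = {frozenset({"b"}): 4}                                                  # item-b bidder

# Complementarity surplus for the bundle bidder, per u(a,b) = u(a) + u(b) + "Moloch EV"
moloch_ev = B1[frozenset({"a", "b"})] - B1[frozenset({"a"})] - B1[frozenset({"b"})]

# Separate (parallel/sequential) auctions: each item goes to its best single-item bid,
# welfare = 4 + 4 = 8, and the complementarity surplus is left uncaptured.
separate_welfare = max(B1[frozenset({"a"})], B2[frozenset({"a"})]) + \
                   max(B1[frozenset({"b"})], B3[frozenset({"b"})])

# A combinatorial auction may instead allocate the whole bundle to B1: welfare 10 > 8.
combinatorial_welfare = max(separate_welfare, B1[frozenset({"a", "b"})])

print(moloch_ev, separate_welfare, combinatorial_welfare)   # 4 8 10
```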

computation

  • winner determination problem in combinatorial auctions = weighted independent set for bundle merging (with parallelization) ===> NP-hard even to approximate (a brute-force sketch follows this list)
  • bidding language = the entropy (digital information/analog information) of the valuation function
  • multi-stage mechanism => elicitation algorithm; the WDP has exponential communication complexity even with two bidders and an elicitation algorithm
  • we can restrict the valuation queries such that better elicitation complexity can be achieved
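A minimal brute-force sketch of that correspondence, using hypothetical bids over overlapping transaction bundles (exponential on purpose, since the problem is NP-hard): each bid is a node weighted by its value, edges connect bids whose bundles overlap, and the winner determination problem is a maximum-weight independent set.

```python
from itertools import combinations

# Hypothetical bids: (bundle of items, bid value). Numbers are illustrative only.
bids = [
    ({"tx1", "tx2"}, 5),
    ({"tx2", "tx3"}, 4),
    ({"tx3"}, 3),
    ({"tx4"}, 2),
]

def winner_determination(bids):
    """Brute-force maximum-weight independent set over the bid-conflict graph."""
    def conflict(i, j):
        return bool(bids[i][0] & bids[j][0])    # overlapping bundles -> an edge
    best_value, best_set = 0, ()
    for k in range(len(bids) + 1):
        for subset in combinations(range(len(bids)), k):
            if any(conflict(i, j) for i, j in combinations(subset, 2)):
                continue                        # not independent: two bundles overlap
            value = sum(bids[i][1] for i in subset)
            if value > best_value:
                best_value, best_set = value, subset
    return best_value, best_set

print(winner_determination(bids))   # (10, (0, 2, 3)): bids 0, 2 and 3 win together
```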

product

  • application layer offloading tasks
  • solve the red blocks

Conclusion

Sophistication semantics sits at the right level of abstraction for modeling MEV: it can be easily coupled with the notion of harsanyi dividends to study the equilibrium situation and give an upper bound on MEV. Moreover, this coupling is similar to the loosening of \(f_m\) between the semantics/ingredient/supply lattice and the logic/preference/demand lattice.

We gave a formal semantics for MEV and its theory. It's now our duty to join the pirate ship and follow the treasure map.

prove section of coordination cost, impactful > most important for ppl to remember > most strongly proven > cooler, takeaway (add ilya anecdote)


Supply and demand

This is an optional section highlighting an informal interpretation of the supply and demand of MEV to help build mental models of the definitions above. Note this interpretation uses many background ideas in the post on Semantics Lattice.

We observe several phenomena in the market for spec-on-state:

  1. the concrete semantics lattice \(C\) is also the supply/ingredient lattice, and the spec-on-state lattice \(S'\) is also the demand/preference lattice.

  2. demand for incompatible spec-on-state resources competes and drives the price up. We say two spec-on-state resources \(\vartheta_1\), \(\vartheta_2\) are incompatible if and only if, in the ex-post \(S'\) under \(\Gamma\), the ending specification \(\Theta_1\) with the highest welfare among all the ending specifications that \(\vartheta_1\) contributed to is different from \(\Theta_2\) (defined similarly). Similarly, a spec-on-state is compatible if, for every other spec-on-state demand in \(S'\), they share the same highest-welfare ending specification.

  3. through knowledge of utility (regardless of the existence of collusion \(\xrightarrow[]{c}\)), an agent \(A\) with \(A \xrightarrow[]{k} B\) might generate more demand (and thus drive the price up) for spec-on-state resources that are incompatible with the demand of \(B\), and \(A\) is able to create the supply itself. This is what is captured by \(\zeta(A)\), which corresponds to the notion of worth \(v\) in cooperative game theory.

  4. through collusion (which requires knowledge of utility \(\xrightarrow[]{k}\) by the agent who initiates the collusion), the coalition \(\alpha\) formed by the colluding entities will be able to create more supply of spec-on-state, i.e., states in \(C\) that are in the co-domain of \(f_m\). However, the coalition \(\alpha\) only has an incentive to create extra supply (enlarge \(C\)) if it knows those supplies match some demand, i.e., the added supplies in \(C\) make fewer elements of \(S'\) map (under \(f_m\)) to \(\bot\).

    In other words, if we have the five constraints on \(\Gamma\):

    • \(A \xrightarrow[]{k} B\)
    • \(A \xrightarrow[]{c} B\)
    • \(\exists e \in S', f_m(e) \subset f_m'(e)\) where \(f_m': S' \rightarrow C_{AB}\) satisfies the same constraints as \(f_m\) except that it maps onto \(C_{AB}\), the concrete semantics lattice in which we treat \(A\) and \(B\) as a coalition; i.e., by forming a coalition, \(A\) and \(B\) are able to match more demands
    • \(e\) is in the utility function \(U_D\) of some agent \(D\)
    • \(A \xrightarrow[]{k} D\), i.e., the agent initiating the collusion, \(A\), knows the matchable demand.

    then, we define \(d_\zeta(AB) = \zeta(AB) - \zeta(A) - \zeta(B)\) as the harsanyi dividend of the coalition \(AB\) when it acts on opportunities related to satisfying the demand of \(D\) (if \(D\) is the only agent that has the five-constraint relationship with \(A\)). Similarly, we say the shapley value of \(AB\), \(\phi_{AB}(\zeta) = \sum_{S \subseteq N: i\in S}d_\zeta(S)|S|^{-1}\), is the share (under MEV) of the harsanyi dividend of \(AB\) in every possible mega-coalition formable (under \(\Gamma\)).

    Thus, we have \(d_\zeta(AB) > 0\) when \(\Gamma\) satisfies the five constraints, and MEV equals \(d_\zeta(AB) + \zeta(A)\) if \(A\) has asymmetric knowledge and colluding power (so \(A\) can take a part of \(B\)'s share of the shapley value); a small computational sketch of these quantities follows these notes.

    ====> plus implies compatible specifications; we need something for incompatible specifications. Also, the shapley value is undefinable here, as the shapley value assumes an equal share of the harsanyi dividend, which ignores MEV, so the shapley value needs to be formed ignoring the recipient of the MEV.

Coalitions create more supply, and if those supplies are compatible with others' demand (which requires that some member of the coalition knows the demand), \(d_v(AB)\) value is created

adding more mappings to \(f_m\) equals creating more realizable supplies in \(C\), which equals the collusion harsanyi dividend

collusion is basically changing the seller and buyer of the spec-on-state, because spec-on-state depends on the reachable states (which depend on agents' knowledge and collusion ability, so the coordinator may not know much)
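The sketch below computes the harsanyi dividend and shapley value defined in point 4 above from an explicit worth function. The three players and the worth function `zeta` are hypothetical placeholders; the code only illustrates that \(d_\zeta(AB) = \zeta(AB) - \zeta(A) - \zeta(B)\) for a pair and that the shapley value is each player's equal share of the dividends of coalitions containing it.

```python
from itertools import combinations

players = ["A", "B", "C"]

# Hypothetical worth function zeta over coalitions; numbers are illustrative only.
zeta = {
    frozenset(): 0,
    frozenset("A"): 1, frozenset("B"): 0, frozenset("C"): 0,
    frozenset("AB"): 4, frozenset("AC"): 1, frozenset("BC"): 2,
    frozenset("ABC"): 7,
}

def subsets(s):
    s = list(s)
    for k in range(len(s) + 1):
        for c in combinations(s, k):
            yield frozenset(c)

# Harsanyi dividend via Moebius inversion: d(S) = sum_{T subset of S} (-1)^{|S|-|T|} zeta(T).
# For a pair this reduces to d({A,B}) = zeta(AB) - zeta(A) - zeta(B), as in the text.
def dividend(S):
    return sum((-1) ** (len(S) - len(T)) * zeta[T] for T in subsets(S))

# Shapley value as each player's equal share of the dividends of coalitions containing it,
# matching phi_i(zeta) = sum_{S : i in S} d_zeta(S) / |S|.
def shapley(i):
    return sum(dividend(S) / len(S) for S in subsets(players) if i in S)

print(dividend(frozenset("AB")))                     # 4 - 1 - 0 = 3
print({p: round(shapley(p), 3) for p in players})    # shares sum to zeta(ABC) = 7
```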

Assumptions

  • traditionally, if you control the demand for spec-on-state (preferences/logic), you are able to commoditize the supply of spec-on-state (ingredients/semantics)
  • if two agents can collude and have knowledge of each other's utility, then (i) the coalition they form controls a larger spec-on-state supply, and (ii) within the coalition, the surplus welfare gets eroded by Moloch EV even though each of them can create a larger spec-on-state supply by colluding
  • \(A \xrightarrow[]{c} B\) means \(A\) acts as a buyer of spec-on-state (the demand may or may not be organic, but it doesn't matter), and \(B\) as a seller.
    • If this collusion is asymmetric, \(A\) acts as a monopolist buyer and can commoditize the seller, basically taking all of the surplus welfare, i.e., the PoA it can get from controlling the ingredients of the coalition \(AB\), i.e., the harsanyi dividend \(d_v\) of \(AB\) (not well-defined).
    • If the collusion is symmetric, they pay the Moloch Extractable Value to \(c'\), i.e., \(\bar{a'}(c') = \text{MEV} = d_v(AB)\). Or, in the best case, they each get their shapley-value fraction (their share of the harsanyi dividend).
    • If we also have \(A \xrightarrow[]{k} B\), then this collusion/transaction is efficient; otherwise it is not, and we assume they do not initiate collusion at all.
  • \(A \xrightarrow[]{k} B\) means \(A\) gains knowledge of \(B\)'s utility, which (i) gives \(A\) an information advantage and thus changes \(A\)'s utility, and (ii) lets \(A\), as a seller, control more supply of spec-on-state because it knows how to reach more states (so \(A\) can self-buy, which creates surplus utility without needing \(c\)). This is essentially \(A\) gaining a part of \(B\)'s demand (for incompatible resources; the competition comes from the increased demand for incompatible resources) and then competing with \(B\) in the demand market, essentially driving prices up.
  • The process of ingredients merging (generation of more supply of spec-on-state), i.e., creation of \(d_v\), is what creates Moloch Extractable Value.
  • \(f_m\) between \(S'\) and \(C\) captures \(\xrightarrow[]{k}\). Absent \(\xrightarrow[]{k} c\), agents only submit the supply (ingredients) but not the demand (preferences) to \(M\).
  • in a multi-polar OFA scenario, everyone at each pole is the supply, and in equilibrium (assuming a well-defined \(M\)) each agent \(p\) gets \(d_v(p)\).
  • two resources here: builder-gas, and spec-on-state
    • \(p\) generates both demand and supply for spec-on-state, and only has demand for builder-gas. Sophisticated/privileged users \(p_s\) tend to generate more exclusive supply with \(d_v(p_s)\) or to commoditize the demand of \(p_u\). Multi-polar means the spec-on-state supply (by the value of \(d_v\)) is exclusive to some users.
    • \(c\) generates supply for spec-on-state (often have exclusive supply), and only has supply for builder-gas
    • builder-gas is used to generate an excess of \(d_v(c)\) and ensure IC and efficiency.
    • Ideally, we commoditize builder-gas, so the value \(d_v(c)\) maximally flows back to users. Between users, we rely on searcher competition (assuming sophisticated users go after \(\epsilon\)) to make value (\(d_v(p_s)\) and the exclusive demand of \(p_u\)) maximally flow to \(p_u\).
    • the fat dapp or fat protocol thesis is just multipolar OFA with supply and demand of spec-on-state

  1. i.e., which element in the semantics lattice. ↩︎

  2. For further discussion on the properties, tradeoffs, and the solutions, refer to note ↩︎ ↩︎

  3. for agents with interdependent value, independent types, and quasilinear utility ↩︎

  4. Currently, the best communication tool agents can use on a blockchain is coinbase.transfer in a smart contract. ↩︎

  5. see note for background. ↩︎

  6. For an expanded definition and a discussion of how we should choose the tradeoffs, see note ↩︎

  7. One could argue that distinct state accesses are a good approximation to specifications on states, as when an agent supplies its ingredients and expresses its preferences, it inevitably ends up accessing that state/account. This is a result of the limitations of the current communication channels (i.e., smart contracts and coinbases) that agents use for \(M\). More importantly, state accesses, just like the "Transaction Spectrum," only measure preferences qualitatively, not quantitatively; therefore, they are inexact. However, they are indeed a great preference elicitation mechanism, in the sense that they compute a clearing price for people to express their preferences, and thus reduce the computation/communication load of the coordinator. ↩︎

  8. Notice here \(M\) is different from the tiebreaker mechanism \(M_t\): the former is used for eliminating coordination inefficiencies in \(M\), while the latter is used to resolve Condorcet paradoxes in \(M\). ↩︎

  9. Here \(Pr\) is basically MEV, and I'd like to informally call this sentence "the incompleteness theorem of MEV," even though it's really a conjecture/shitpost. ↩︎

  10. It is also not reflexive and not transitive. ↩︎

  11. In \(\Gamma\), knowledge can be transitive on collusion. We leave this as an informal definition in this post, but in the future one could attempt to model this using modal logic by accepting, e.g., \(\forall p_i p_j p_k, (p_i \xrightarrow[]{k} p_j) \wedge (p_i \xrightarrow[]{c} p_k) \rightarrow (p_k \xrightarrow[]{k} p_j)\) ↩︎

  12. One could argue that, in theory, \(c\) can extort based purely on ex ante knowledge without learning anything. We conveniently ignore this situation for now as it is highly unlikely to happen in practice. ↩︎

  13. Yes, traditionally Price of Anarchy has been defined as the ratio, not difference, of the maximal welfare state and the equilibrium state. ↩︎
