
Semantics Lattice

TLDR

We define the notion of semantics lattices. Using this notion, we devise an incentive-compatible mechanism for efficient MEV preference aggregation. We prove this mechanism's correctness (i.e., it is private-value incentive compatible and welfare-maximizing) and give an algorithm for the coordinator to execute it (i.e., do blockbuilding). Since VCG mechanisms have collusion-resistance and computational issues, we discuss how additional fee markets on top of the mechanism can mitigate those problems and thereby aid efficient allocation of MEV.

Furthermore, with the notion of semantics lattices defined, we can easily devise a logic or language framework to describe MEV at a higher abstraction level than the sophisticated semantics we used to model MEV, and therefore flexibly borrow existing results from those areas to reason about MEV at the right level of abstraction.

Specification semantics lattice

Definition

A specification semantics lattice \(S\) is parametrized by a logic (a language with a semantics of truth) \(L\) (which in turn might be parametrized over some free variable \(x\)), where every element is the set of all semantically equivalent sentences in \(L\), e.g., \(\texttt{true}\), \(\texttt{true} \wedge \texttt{true}\), etc.

\(\top\) is defined to be the set of all propositions semantically equivalent to \(\texttt{true}\), and \(\bot\) is the set of all propositions semantically equivalent to \(\texttt{false}\).

The partial order is defined as: \(e_i \sqsubseteq e_j\) if and only if \(U, e_i \vDash e_j\), meaning that \(e_i\), over the universe of objects \(U\), semantically entails \(e_j\). From a derivation perspective, suppose \(L\) is complete; then \(e_i \sqsupseteq e_j \leftrightarrow\; \vdash e_j \rightarrow e_i\), i.e., an element is higher in the lattice if it is implied by (there exists a derivation from) the lower one; equivalently, using \(e_j\) as a Hilbert system, we can formally prove (\(\vdash\)) \(e_i\).

One can immediately notice that the join operator \(\sqcup\) corresponds to \(\vee\) and the meet operator \(\sqcap\) corresponds to \(\wedge\).
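
To make the definition concrete, here is a minimal sketch (in Python, with hypothetical names; propositional logic over two atoms) that represents each lattice element by its set of models, so that semantically equivalent sentences collapse into one element and the order, join, and meet fall out as set operations:

```python
from itertools import product

ATOMS = ["p", "q"]  # a fixed, finite set of propositional atoms

def models(formula):
    """Identify a proposition with the set of valuations satisfying it,
    collapsing all semantically equivalent sentences into one element."""
    return frozenset(v for v in product([False, True], repeat=len(ATOMS))
                     if formula(dict(zip(ATOMS, v))))

TOP = models(lambda v: True)   # the equivalence class of `true`
BOT = models(lambda v: False)  # the equivalence class of `false`

def leq(e1, e2):   # e1 ⊑ e2 iff e1 semantically entails e2
    return e1 <= e2

def join(e1, e2):  # ⊔ corresponds to ∨ (union of model sets)
    return e1 | e2

def meet(e1, e2):  # ⊓ corresponds to ∧ (intersection of model sets)
    return e1 & e2

p = models(lambda v: v["p"])
p_and_q = models(lambda v: v["p"] and v["q"])
assert leq(p_and_q, p)       # the stronger proposition sits lower
assert join(p, TOP) == TOP   # e ⊔ true = true: ⊤ really is the top
assert meet(p, BOT) == BOT   # e ⊓ false = false: ⊥ really is the bottom
```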

Interpretation

Some useful notions derived from the semantics lattice can help our interpretation of it:

  • implementation: given a set of observed theorems \(T\) and a Hilbert system (axioms plus inference rules written as axioms) \(H\), if \(H \vdash T\) (i.e., every observed theorem is derivable from \(H\)), then we say \(H\) is an implementation of \(T\) within the semantics lattice.
  • entropy: for the same set of theorems \(T\), if there are two Hilbert systems \(H_1, H_2\), then \(H_1\) has lower entropy than \(H_2\) if and only if \(H_1 \sqsubseteq H_2\) in the semantics lattice. This means the \(H\) with the highest entropy that implements \(T\) equals \(T\) itself, or \(T\) in conjunction with some trivial axiom (see the worked example after this list).
  • information: by observing more facts and admitting them as axioms, we gain information and thus eliminate entropy (moving down the semantics lattice); by deducing one set of truths from another, we lose information (moving up the semantics lattice).
  • prover complexity: a Hilbert system \(H\) has lower prover complexity than another Hilbert system \(H'\) if it has fewer admissible rules. This means that, within an element \(e\) of the lattice, the system with the least prover complexity that entails exactly the semantic implications of \(e\) is the one with no admissible rules and in which none of the rules can prove each other (notice that there exist rules that are admissible but not derivable). We can push this notion further by defining the actual prover complexity to be the Kolmogorov complexity of the Hilbert system with the least prover complexity.
  • generalization: if we try to derive a non-trivial Hilbert system \(H\) that implements a set of observed theorems, we gain information by adding generalizations, which might be justified by some axiom within or outside the language \(L\), e.g., intuition, an unknown unknown, or an axiom that isn't in the model but in the meta-model.
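
As a small worked example of the implementation/entropy notions: let the observed theorems be \(T = \{p \vee q\}\). Both \(H_1 = \{p\}\) and \(H_2 = \{p \vee q\}\) are implementations of \(T\) (each derives \(p \vee q\)), but \(H_1 \sqsubseteq H_2\) since \(p \vDash p \vee q\); so \(H_1\) has lower entropy, committing to strictly more information than \(T\) demands, while \(H_2 = T\) is exactly the highest-entropy implementation described above.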

Mental Model

Concrete semantics lattice \(C\)

A concrete semantics lattice \(C\) is parametrized by a semantics model \(m\), where we define the lattice operations \(\sqsubseteq\), \(\sqcup\), \(\sqcap\) as \(\subseteq\), \(\cup\), and \(\cap\), and every element is a set of specific instances of the model \(m\). Suppose we have a semantics model of natural numbers; then \(\bot = \emptyset\) and the least upper bounds of \(\bot\) are sets with only one natural number in them, e.g., \(\{ 1 \}\), \(\{ 2 \}\), etc. We write \(\text{Mod}(T)\) for the set of all possible models of a theory \(T\).
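
A minimal sketch of this powerset structure (Python; the model instances here are natural numbers, and all names are illustrative):

```python
# Elements of the concrete semantics lattice C are sets of model
# instances; the lattice operations are plain set operations.
bot = frozenset()                            # ⊥ = ∅
atoms = [frozenset({n}) for n in range(5)]   # least upper bounds of ⊥: {0}, {1}, ...

leq  = lambda a, b: a <= b   # ⊑ is ⊆
join = lambda a, b: a | b    # ⊔ is ∪
meet = lambda a, b: a & b    # ⊓ is ∩

assert all(leq(bot, a) for a in atoms)   # ⊥ sits below every singleton
assert meet(atoms[1], atoms[2]) == bot   # distinct singletons meet at ⊥
assert join(atoms[1], atoms[2]) == frozenset({1, 2})
```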

Expressivity lattice \(E\)

We can define the expressivity lattice \(E\) as a lattice where each element is a set of semantically equivalent languages, and one element is higher than another if its language is more expressive (i.e., able to describe more things).

Correspondences

We notice that there exists a correspondence between the three lattices, namely \(S\), \(C\), and \(E\).


This is true because (assuming the languages are complete) the set of specific instances \(t\) of a model in the concrete semantics lattice \(C\) is exactly the set of states that is semantically entailed by (can be proven from) a proposition \(p\) in \(S\) (and thus the information it carries). \(p\), in turn, corresponds to a language \(l\) in \(E\) that describes exactly the semantic consequences \(t\). This means there is a correspondence between semantics, logic, and language.

Complexity

Now one might have the urge to collapse all three lattices into one, but notice that the cardinality of \(E\) is at most the cardinality of \(S\), while the cardinality of \(S\) is at most that of \(C\) (which is the cardinality of the power set of \(m\)). Furthermore, one might notice that the lattices here resemble the lattice of trace semantics / the lattice of abstract semantics / the lattice of lattices of abstract semantics in abstract interpretation. In fact, they describe very similar ideas, and the reason we make the distinction here is that we can reduce the complexity of the lattices \(S\) and \(E\) whenever we want to (in order to ease reasoning/computation/complexity).

Realizations on computation

In reality, if the semantics model we are realizing is some form of computation \(\Xi\), then we can define elements in \(C\) as sets containing all reachable states in \(\Xi\); we denote those states as \(s_1, s_2\), etc. Elements in \(S\) are then all possible specifications one can write about the reachable states of \(\Xi\), i.e., properties that \(s_1, s_2, \dots\) exhibit. And elements in \(E\) are all possible languages that can describe, and can only describe, the specifications \(P_1, P_2, \dots\)
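
As a toy illustration (a hypothetical machine, not a real VM): let \(\Xi\) count up from 0 in steps of 2 until a bound; one element of \(C\) is its set of reachable states, and specifications in \(S\) are predicates that those states may or may not satisfy:

```python
# A toy computation Ξ: start at 0, step by +2, stop past a bound of 10.
def step(s):
    return [s + 2] if s + 2 <= 10 else []

def reachable(init=0):
    """All reachable states of Ξ — one element of the concrete lattice C."""
    seen, frontier = {init}, [init]
    while frontier:
        for s in step(frontier.pop()):
            if s not in seen:
                seen.add(s)
                frontier.append(s)
    return frozenset(seen)

states = reachable()                       # {0, 2, 4, 6, 8, 10}

# Specifications in S are properties of the reachable states:
P1 = lambda s: s % 2 == 0                  # "every state is even"
P2 = lambda s: s <= 4                      # "every state stays below 5"
assert all(P1(s) for s in states)          # Ξ satisfies P1 ...
assert not all(P2(s) for s in states)      # ... but not P2
```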

In reality, we can implement languages in \(E\) (and thus the language we use to describe specifications in \(L\)) using:

  • a generic compiler + some static analyzer compilation passes: this is the approach of Rust, which basically starts from a more universal/powerful language and then adds checks to the compiler so that if some static analyzer returns negative, the compilation fails. This approach is the least elegant but the most practical.
  • a generic compiler + some type system: this is the approach of most functional programming languages such as Haskell. Technically, the type inference rules of a type system are also a form of static analysis, but the difference is that type checks are usually more embedded in the compilation phase (i.e., the translation from syntax to semantics) than a static analyzer slapped on top of the compiler.
  • embedded domain-specific languages: this is the approach of Coq, which is to define the syntax and semantics of your eDSL inside a more powerful language. This way you combine the syntax with the semantics, so anything anyone can say is probably acceptable (true).
  • domain-specific languages: this is the approach of Z3, which is basically encoding your semantics into a specific syntax. Of all the approaches, this one creates the most new languages, but it is also the most impractical, as countless DSLs have died in the dust of some 50-year-old sysadmin's Windows Vista laptop.

If we imagine the syntax and semantics of a language as two circles, then from top to bottom we see the syntax circle shrinking to gradually match the size of the semantics circle. This means we have fewer and fewer compilation failures and post-hoc checks (which always feel hacky), and a higher probability of believing something once we see the language it is written in (without having to run a compiler in our head).

Specification on State

Recall from our formalization of MEV:

In our formalization of \(M\), the resource we are allocating is the specification that the state \(s_i\) should adhere to. \(U_P\) basically describes the utility of each agent with respect to different specifications on the state. And clearly, "specifications on state" is not a commodity.

Of course, one can also think of the resource as individual states, but given the granularity of each agent's utility and how those utilities are communicated in reality (by coinbase.transfer), a more useful mental model is to think of the resource as the specification that the state should satisfy.

This specification on state is perfectly captured by our notion of semantics lattices! Here we use the notions defined above to characterize a spec-on-state resource-allocation market.

Motivation

We use spec-on-state to model agents' utility instead of a map from states to utilities because:

  • communication constraints of agents in \(P\)
  • computation constraints of the coordinator \(c\)
  • it is a more realistic model of how agents think and how the coordinator "builds blocks"
  • refined granularity/distinct states don't capture the notion of compatibility very well (and thus one cannot easily define something like a "second price"). Of course, in reality, with a well-defined gradient to aid some learning-based blockbuilding algorithm, the total-heatmap modeling method works better than the spec-on-state lattice.

Formalization

Every agent \(p\) has a utility function \(U_p\) from elements of the spec-on-state lattice \(S\) to positive real numbers. We write \(U_p(e)\) for \(p\)'s utility when the output state \(s_i\) of \(M\) on domain \(\Xi\) satisfies the specification \(e \in S\).

An illustration of the lattice looks like:

[figure: the spec-on-state lattice with merged utilities]

where agent \(p_1\) has a utility of 2 for specification \(S_1\).

Suppose for simplicity that the social choice function \(W\) is utilitarian. Then the job of the coordinator \(c\) is to take the lattices \(S_1, S_2, \dots, S_n\) submitted by the agents and overlay them into a lattice \(S'\) with a welfare function defined over all the elements, i.e., \(\forall e \in S', W(e) = \sum_{p \in P} U_p(e)\). After overlaying, the coordinator iterates through all the elements in \(S'\) according to some arbitrary ordering[1] and applies the following element update rule until equilibrium (no updates happen):

\[W(e) = \sum_{\{\,e' \,\mid\, \exists e''.\ e = e' \sqcap e''\,\}} W(e')\]

i.e., the welfare of an element \(e\) is updated to be the sum of the welfare of all elements \(e'\) whose greatest lower bound with some element \(e''\) is \(e\).
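
A minimal sketch of the overlay and of one reading of the update rule (all names hypothetical; each specification is encoded as the set of outcomes it admits, so \(\sqsubseteq\) is set inclusion, and since \(e = e' \sqcap e''\) has a solution \(e''\) exactly when \(e \sqsubseteq e'\), the rule amounts to summing, for each element, the initial overlaid welfare of everything above it):

```python
def overlay(elements, agent_utils):
    """Overlay the agents' lattices: W0(e) = Σ_p U_p(e)."""
    return {e: sum(U.get(e, 0) for U in agent_utils) for e in elements}

def apply_update_rule(elements, leq, W0):
    """W(e) = Σ { W0(e') | ∃ e''. e = e' ⊓ e'' } = Σ { W0(e') | e ⊑ e' }."""
    return {e: sum(W0[e2] for e2 in elements if leq(e, e2)) for e in elements}

# Specs encoded as the sets of outcomes they admit; TOP admits everything.
S1, S2, TOP, BOT = frozenset({1}), frozenset({2}), frozenset({1, 2}), frozenset()
elements = [S1, S2, TOP, BOT]
agents = [{S1: 2}, {TOP: 3}, {S2: 1}]  # three agents' utility functions

W = apply_update_rule(elements, lambda a, b: a <= b, overlay(elements, agents))
assert W[S1] == 5  # S1 also collects TOP's welfare, since S1 ⊑ TOP
assert W[S2] == 4  # likewise for S2
```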

Suppose there exists a mapping \(f_m: S' \rightarrow C\), defined over all elements in \(S'\), mapping to a subset of the elements in \(C\), and satisfying two properties:

\[\forall e_1, e_2 \in S',\; e_1 \sqsubseteq e_2 \rightarrow f_m(e_1) \sqsubseteq f_m(e_2)\]

\[\forall e_1 \in S',\ \forall e_1' \in C,\quad e_1' \vDash e_1 \rightarrow e_1' \sqsubseteq f_m(e_1)\]

i.e., it preserves the order of \(S'\) and is not trivial (it maps each specification to the largest set of reachable states that satisfies the spec-on-state demand).
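
Under the same toy encoding, a canonical \(f_m\) satisfying both properties by construction maps each specification to the set of all reachable states that satisfy it; a sketch with hypothetical names:

```python
def make_f_m(reachable_states):
    """f_m(e) = the largest set of reachable states satisfying spec e.
    Order-preserving by construction: e1 ⊑ e2 implies f_m(e1) ⊑ f_m(e2)."""
    def f_m(spec):
        return frozenset(s for s in reachable_states if s in spec)
    return f_m

f_m = make_f_m(frozenset({1, 3}))           # suppose only states 1 and 3 are reachable
assert f_m(frozenset({1})) <= f_m(frozenset({1, 2}))  # property 1: monotone
assert f_m(frozenset({2})) == frozenset()   # an unrealizable demand maps to ⊥
```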

[figure: illustration of the assumption of realizable demand]

Then \(c\) iterates through the least upper bounds of \(\bot\) and picks any element \(e\) with the highest welfare such that \(f_m(e) \neq \bot\). We call the specifications that are least upper bounds of \(\bot\) ending specifications.

Using the above example, we can see that the coordinator will choose \(S_4\), the specification compatible with those of all agents who submitted \(S_0\), \(S_1\), and \(S_2\), since it has the highest welfare.

commit-reveal: since VCG mechanisms are prone to the auctioneer inserting bids and extracting more profit, we separate \(M\) into two phases. The first phase is a commit-reveal in which all agents send encrypted transactions \(T_P\) (their preferences \(U_p\) are also encrypted) to \(M\), and \(c\) commits to an (unordered) batch \(i\) of encrypted transactions. Only after \(c\)'s commitment does decryption happen and the transaction/preference contents are revealed; then \(c\) commits to the ordering of batch \(i\) (in batch \(i+1\)).

[figure: the commit-reveal scheme in the VCG MEV mechanism]

VCG Mechanism for MEV

We devise a VCG mechanism \(M_v\) for allocation of MEV (i.e., the spec-on-state resource).

Description: the coordinator \(c\) computes the concrete semantics lattice \(C\), consisting of all states reachable from \(T_c\) and \(T_P\), and the overlaid and updated spec-on-state lattice \(S'\).

Suppose the highest-welfare specification is \(S_1\) and the second-highest is \(S_2\) (under the constraint that \(f_m(S_1) \neq \bot \wedge f_m(S_2) \neq \bot\)). Then the coordinator chooses \(s_i \in f_m(S_1)\) and outputs the payment vector \(\bar{a}\) in which the agents with non-zero utility on any specification that is an upper bound of \(S_1\) collectively pay an amount equal to \(W(S_2)\) to \(c\). Note that, by construction, \(S_1 \sqcap S_2 = \bot\).

Algorithm: suppose the set \(I\) consists of all agents that have non-zero utility on any specification that is an upper bound of \(S_1\); then \(M\) proceeds as follows:

  • Every agent \(p\) submits transactions \(T_p\), which express a preference function \(U_p: S \rightarrow \mathbb{R^+}\), to the coordinator \(c\).
  • \(c\) computes \(S'\), applies the element update rule, computes \(S_1\), \(S_2\), \(f_m\), and chooses any \(s_i \in f_m(S_1)\).
  • \(c\) computes the payment vector \(\bar{a}\) where:
    • \(\forall p \notin I\), \(\bar{a}(p) = 0\)
    • \(\forall p \in I\), \(U_p(S_1) < W(S_1) - W(S_2) \rightarrow \bar{a}(p) = 0\)
    • \(\forall p \in I\), \(U_p(S_1) \geq W(S_1) - W(S_2) \rightarrow \bar{a}(p) = W(S_2) - W(S_1) + U_p(S_1)\)
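
A minimal sketch of the payment rule above (hypothetical names: `W1` and `W2` are the welfare of the best and second-best ending specifications, and `utils_S1` maps each agent in \(I\) to its utility on \(S_1\); agents outside \(I\) simply pay 0):

```python
def vcg_payments(W1, W2, utils_S1):
    """Payments for the winning coalition I per the rule above: agents whose
    utility is below the pivotal threshold W1 - W2 pay nothing; the rest
    pay W2 - W1 + U_p(S1)."""
    return {p: (0 if u < W1 - W2 else W2 - W1 + u)
            for p, u in utils_S1.items()}

# Example: W(S1) = 10, W(S2) = 7; p1 bids 5 (pivotal), p2 bids 2 (not).
assert vcg_payments(10, 7, {"p1": 5, "p2": 2}) == {"p1": 2, "p2": 0}
```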

Correctness: we prove that \(M\) satisfies two properties.

  1. ex-post Dominant Strategy Incentive Compatible (DSIC).

    Since \(\sum_{i\in I}\bar{a}(i) = W_{-I}(S_2) - W_{-I}(S_1)\), the winning coalition, whose members benefit from \(c\) choosing \(S_1\) (and thus \(s_i\)), pays its externality, which equals the welfare of the most efficient allocation the mechanism would have chosen if \(I\) weren't in the game (i.e., \(S_2\)). Since \(W_{-I}(S_1) = 0\) by the definition of \(I\), and members of \(I\) contribute no welfare to \(S_2\), we get \(\sum_{i\in I}\bar{a}(i) = W(S_2)\). This means that if we treat \(I\) as a whole, the mechanism is incentive compatible (plugging in the original VCG proof here). So agents \(p \notin I\) have incentive compatibility, but for agents \(p \in I\) we are not yet sure.

    Thus, we prove incentive compatibility for each agent \(p\in I\). We separate the proof into three subcases (a worked numeric example follows the proof):

    • \(p\) lowers its bid, but not enough to decrease the total welfare past the threshold, so still \(W(S_1) > W(S_2)\). But \(p\)'s payment \(\bar{a}(p) = W(S_2) - W_{-p}(S_1)\) (or \(0\) when this is negative) does not depend on \(p\)'s own report, so \(p\) has no incentive to lower its bid in this case.
    • \(p\) lowers its bid enough that the total welfare drops to a point where \(W(S_1) < W(S_2)\); then \(p\) would get zero utility. But in the truthful case we have \(\bar{a}(p) = W_{-p}(S_2) - W_{-p}(S_1) = W(S_2) - W(S_1) + U_p(S_1)\), which means the net utility is \(U_p(S_1) - \bar{a}(p) = W(S_1) - W(S_2) > 0\). Thus, \(p\) has no incentive to lower its bid even with ex-post knowledge.
    • \(p\) raises its bid: since \(\bar{a}(p) = W(S_2) - W_{-p}(S_1)\) does not depend on \(p\)'s own report, overbidding leaves both the outcome \(S_1\) and the payment unchanged, so \(p\) cannot gain this way either.
  2. When every agent reports truthfully, \(M\) outputs an allocation that maximizes \(W\).

    This follows from the element update rule and \(M\) choosing \(S_1\): after the updates, \(W(e)\) aggregates the utilities of every specification compatible with \(e\), and \(M\) picks the realizable ending specification that maximizes this welfare.
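
As a worked numeric example of the second subcase (numbers chosen purely for illustration): suppose \(W(S_1) = 10\), \(W(S_2) = 7\), and \(p \in I\) has \(U_p(S_1) = 5 \geq 10 - 7\). Reporting truthfully, \(p\) pays \(\bar{a}(p) = 7 - 10 + 5 = 2\) and nets \(5 - 2 = 3 = W(S_1) - W(S_2)\). If \(p\) shades its bid down to 1, the reported welfare of \(S_1\) drops to \(10 - 5 + 1 = 6 < 7\), so \(S_2\) wins and \(p\) nets 0 instead of 3: shading strictly hurts.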

Fee markets

We analyze several fee market designs for allocation of MEV (the spec-on-state resource).

  1. 1-dimensional

    One can treat all specifications the same and ignore how they might be compatible. Concretely, this is a 1-dimensional fee market where the highest bidder's specification gets allocated (i.e., with no blockbuilding process).

  2. n-dimensional

    We can devise a multi-dimensional fee market on the spec-on-state resource where the dimensions represent compatible specifications, i.e., their meet doesn't yield \(\bot\) in \(S'\). Using this mechanism, agents in \(I\) pay roughly the fee they would have been paying (like an ordinal version of the VCG mechanism above). Of course, this fee market wouldn't be very useful, as there is no a priori knowledge of \(S\).

  3. builder-gas

    Our mechanism so far does not account for the time it takes the coordinator to compute the lattice \(S'\) and apply the element update rule. This poses a risk to the feasibility of \(M\) and might endanger its ex-post DSIC property. Thus, we could model coordinator time as an additional resource; borrowing terminology from PBS, we call this new resource builder-gas.

  4. burst-control allocation

    A better allocation of MEV (an uncommoditized resource) that targets average cases can be reduced to the easier problem of allocating the builder-gas resource (a commoditized one).

    A straw-man proposal is to just slap a 1559-style controller on top of the fee markets in 1 & 2 (see the sketch after this list). However, this doesn't work well, as MEV isn't inter-block commoditized: most burst scenarios are caused by some specific permissionless MEV opportunity, so the demand for a spec-on-state fades quickly if it is not satisfied.

    For intra-block allocation of MEV, we could make the mechanism disincentivize expressing a specification that is incompatible with the current winning specification. In other words, one would have to pay at least more than the current highest-welfare incompatible spec. However, since welfare requires global information and agents only have partial local information, this doesn't work either.

  5. builder-gas market

    What we could do is devise something akin to per-account fee markets, where we compute a clearing price for expressing specifications in burst situations (as we have high confidence that the specifications are incompatible), thus eliminating potential computational tasks for the coordinator. Since this clearing price is not a take-it-or-leave-it offer to users, it doesn't give the exact same IC guarantees as 1559, but it solves the problem of inter-block commoditization.

    A more direct mechanism is to refine builder-gas into multiple resources, where each dimension represents some information that helps the blockbuilder spend less time aggregating local information. For example, a parallelization fee dimension that gauges how parallelizable one transaction is relative to others could be very relevant. Of course, the motivation for further refinement is unclear, as builders operate under different scalability parameters.

  6. externalities

    Note that on domains without a coordinator for MEV allocation, the agents enter a state of uncoordinated coordination where they each use some mechanism \(M'\) that isn't designed to allocate MEV to approximate a MEV-allocating mechanism \(M\). As suggested by past data, the inefficiency externalities of this uncoordinated coordination grow exponentially as the size of MEV increases. This means we can model the externality of auction-by-other-means as some superlinear function of builder-gas. We refer to this externality as the Price of MEV.
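
For reference, the straw-man dismissed in item 4, an EIP-1559-style controller applied to builder-gas, would look roughly like the following sketch (the update rule and the 1/8 denominator mirror Ethereum's EIP-1559; this illustrates the idea being rejected, not a concrete proposal):

```python
def next_base_fee(base_fee, gas_used, gas_target, max_change_denominator=8):
    """EIP-1559-style base-fee update applied to builder-gas: nudge the
    price up when usage exceeds the target and down when it falls below."""
    delta = base_fee * (gas_used - gas_target) // (gas_target * max_change_denominator)
    return max(0, base_fee + delta)

# A demand burst doubles usage: the price rises ~12.5% per block while it
# lasts (100 → 112 → 126 → 141) and only decays once usage falls back —
# too slowly for MEV demand that fades as soon as the opportunity is gone.
fee = 100
for _ in range(3):
    fee = next_base_fee(fee, gas_used=2_000_000, gas_target=1_000_000)
assert fee == 141
```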


  1. The ordering used to iterate through the lattice doesn't change the end state at equilibrium, but it does change the speed/complexity of the computation. There have been many past studies on efficient orderings for update rules in lattices. ↩︎
