# Limitations to Order Policy Enforcement: The Flipper Game ## Introduction Order manipulation attacks such as frontrunning and sandwiching have become an increasing concern in blockchain applications. Enforcing strict transaction ordering in a block is one of the many proposals for reducing MEV in blockchains. The paper *[Breaking the Chains of Rationality: Understanding the Limitations to and Obtaining Order Policy Enforcement](https://eprint.iacr.org/2023/868.pdf)* explores the impossibility conditions for Order Policy Enforcement under the assumption of rational actors. The paper also proposes AnimaguSwap, a DEX that uses rationally binding transactions to enforce the stakers and flippers to have conflicting interests by constructing a game with Pareto efficient outcomes in an inefficient Nash equilibrium. It further goes on to formally prove how this would not work with binding side contracts for stakers(exactly why governments enforce anti-collusion laws). This article dives deeper into the flipper game, how it is constructed, and why I believe this is not a strong enough rational binding for the flipper to not collude. This article uses a game theoretical approach and analyzes how a rational flipper would behave under different strategies employed by a set of colluding stakers. ## Game Mechanics The Flipper Game involves a set of $N$ stakers, with one staker elected as the flipper in each round. For simplicity, let us assume that this leader is elected in a round-robin fashion(can be stake-weighted random as well) and all the stakers have equal weights. Every epoch has $N$ rounds such that each staker has been elected as the flipper at least once. Let us understand the terminologies involved in the mechanism before we go ahead: 1. **Stakers**: $s_{i} \in \{s_{1}, s_{2}, .., s_{N}\}$. 2. **Transaction**: $tx$ 3. **Flipper**: $f \in \{s_{1}, s_{2}, .., s_{N}\}$ 4. **Flipped Transaction**: A flipped transaction is the opposite of the transaction. In the lens of a DEX if user transaction $tx$ sells a certain asset, then $\overline{tx}$ is the reverse order, i.e., buying the same asset. Let us discuss this game under the lens of an AMM-based DEX that leverages the game to prevent MEV extraction. We know sandwiching transactions is a common strategy to extract MEV and won't delve into it. The paper linked in the introduction explains what sandwiching is in detail and can be used for reference. The game proceeds as follows: 1. An external user selects a random bit $r \in \{0, 1\}$ and submits $\overline{tx}$ if $r = 1$ and $tx$ if $r = 0$ to the network. 2. The external user also submits $r$ to the flipper through a secure channel and receives a commitment $sig_{i}(r)$. While revealing the transaction the flipper needs to provide $r$ so the transaction can be executed. 3. While ordering the transaction, stakers can extract MEV by a sandwich attack but remain uncertain whether the transaction has been flipped. Sandwiching the transaction wrongly will lead to losses for stakers. 4. The flipper is incentivized to deviate from colluding because the flipper can make a profit from the losses made by the stakers(if they attack wrongly). This creates an interesting game. The utility for the flipper and the stakers individually is illustrated below: | ($f$, $s_{i}$) | Attack | Defer | | -------------- |:------- |:------ | | **Collude** | (k, k) | (0, 0) | | **Trick** | (z, -z) | (0, 0) | Because $k$ is a shared utility from MEV, $z > k$ is almost always holds good. This shows why the flipper has the incentive to switch from colluding to tricking the rest of the stakers. ## Modeling the game ### One game every epoch In the previous section, we modeled the game such that we consider the game to be played by a staker only when it is elected as a flipper. The reality is, MEV is extracted in every block and not just when $s_{i}$ is the flipper. Now let us try modeling the game with utilities associated with a time frame of one epoch: | ($f$, $s_{i}$) | Attack | Defer | | -------------- |:------------------ |:------ | | **Collude** | ($N.k$, $N.k$) | (0, 0) | | **Trick** | ($z$, $(N-1).k-z$) | (0, 0) | Every epoch has $N$ rounds and $f$ can choose to be part of the colluding network or not. It is fair to assume that if $f$ is not part of the colluding network the MEV proceeds from other rounds will not be shared with $f$. When we model the game for every epoch it enforces a stronger condition $z > N.k$ for $f$ to not collude with the rational stakers. ### Multi-epoch game It was quite straightforward to move from a single-block game to a single-epoch game. Things get trickier when we model the game to exist over multiple epochs. Let us analyze the multi-epoch game when the stakers $S$ employ different strategies. #### Grim Trigger The stakers following this strategy will permanently defect following any instance of the flipper tricking the stakers. In such a setting, the utility when $f$ decides to trick $S/f$ is as follows: 1. Recieves $z$ as reward for successfully tricking $S$. 2. $\forall s_{i} \in S$ decide to defect and not attack every time $f$ is the flipper in the future epochs. The utilities for the stakers and flipper over $e$ epochs is illustrated below: | ($f$, $s_{i}$) | Attack | Defer | | -------------- |:-------------------- |:------ | | **Collude** | ($e.N.k$, $e.N.k$) | (0, 0) | | **Trick** | ($z$, $e.(N-1).k-z$) | (0, 0) | For large enough $e$ we know $e.N.k > z$. When $S$ employs a Grim Trigger strategy it is evident that it is in the interest of $f$ to collude with $S$. #### Tit-for-Tat The stakers following this strategy will start with cooperating with $f$ and then mirroring the previous action of $f$. In such a setting, the utility when $f$ decides to trick $S/f$ is as follows: 1. Recieves $z$ in $e_{1}$ as reward for successfully tricking $S$. 2. Recieves 0 reward in $e_{2}$ as $S$ defers and doesn't attack. In the worst-case scenario of $f$ knowing the strategy employed by $S$, $f$ decides to Collude. 3. Recieves $z$ in $e_{3}$ as reward for successfully tricking $S$. Again we assume $f$ knows the strategy employed by $S$ and $f$ decides to Trick. $f$ receives a reward of $z$ in every alternate epoch. | ($f$, $s_{i}$) | Attack | Defer | | -------------- |:------------------------------------ |:------ | | **Collude** | ($e.N.k$, $e.N.k$) | (0, 0) | | **Trick** | ($(e/2).z$, $(e/2).(N-1).k-(e/2).z$) | (0, 0) | When the stakers employ a Tit-for-Tat strategy it enforces a stronger condition $z > 2.N.k$ for $f$ to not collude with the rational stakers. #### Other strategies $S$ can employ any other strategy like Reputation-based Strategies, Minimax Strategy, Maximin Strategy etc., and a multi-epoch game can be modeled to understand the utility functions for each actor in the system. Rational stakers $s_{i} \in S$ will also try maximising their gains and the Grim Trigger strategy ensures maximal gains for the colluding stakers $S$ which enforces a stable Nash equilibrium with $f$ colluding and $S/f$ attacking. ### Sybil attack by flipper We can now try to model a multi-epoch game such that $f$ engages in a Sybil attack, that is, every epoch $f$ removes stake and restakes as a different entity. This is very similar to the `One game every epoch` model with an added cost of using something like tornado cash to anonymously transfer staked assets to a different address. Assuming $f$ bears an additional cost $c$ for doing this, the game is illustrated below: | ($f$, $s_{i}$) | Attack | Defer | | -------------- |:-------------------- |:------ | | **Collude** | ($N.k$, $N.k$) | (0, 0) | | **Trick** | ($z-c$, $(N-1).k-z$) | (0, 0) | This enforces a stronger condition when compared to the `One game every epoch` model enforcing $z-c > N.k$ for $f$ to not collude with the rational stakers. ## Conclusion and Future Work The main takeaways are: 1. The game is not a straightforward single block game where it is obvious that $z > k$ 2. Not colluding is not the dominant strategy for $f$ under all strategies employed by $S/f$. Modeling games with $f$ gaining when $S/f$ incurs losses does not guarantee a stable Nash equilibrium of not colluding. A game where $f$ is directly incentivized by the user will enforce a stable Nash equilibrium for not colluding. However, this can only be achieved when the user is ready to pay the flipper the full value of the MEV. This article focuses on a single flipper $f$ playing against colluding actors $S/f$. In a subsquent article I intend formally prove how this can be extended to a multiple non-colluding flipper scenario(if this holds good :P).