# CAMCOS Fall 2022 Project - PBS / MEIP-1559

<!--
Notes:
- Scratchwork: https://hackmd.io/@9MPvD9czTQS18zEjAERVAg/HkjI6l9Qi/edit
- slides: https://docs.google.com/presentation/d/1cUnKknRuodn3qsOTCZREpvaU-zMa-eGcIL0NU0utP8g/edit?usp=sharing
- graphics: Lucidcharts
- LaTeX: https://latex-tutorial.com/tutorials/
-->

# 1. Multidimensional EIP-1559

For our first project, we create a basic functioning MEIP-1559 simulator with two resources that can later be generalized to higher dimensionality. We build off the work of previous CAMCOS projects by improving transaction modelling and multidimensional resource design. We obtain some qualitative insights using the simulator and some reasoning.

## 1.1 Preliminaries

In the Ethereum Virtual Machine (EVM), a unit of account called **gas** accounts for the resources used in a transaction. The most modern transaction fee mechanism is the August 2021 **Ethereum Improvement Proposal 1559 (EIP-1559)**. This system's main intention was to create a default fee for each block while allowing block sizes to fluctuate appropriately to deal with the network's congestion dynamically.[^16] The design is:

1. When users submit transactions, each transaction has a **gas limit** and the user supplies a **max fee**, which must be at least the **basefee** (see below).
2. The protocol computes a **basefee** that updates every block with the function:
$$b_c = b_p \cdot \left(1 + \frac{1}{8} \cdot \frac{g_p - g_t}{g_t}\right).$$
Here, $b_c$ is the current block's basefee (with units of fee per gas), $b_p$ is the parent block's basefee, $g_p$ is the parent block's amount of gas, and $g_t$ is the fixed gas target for each block.[^16] The basefee increases for the block if the amount of gas in the parent block is higher than the fixed gas target, and decreases if the parent block's gas is lower than the gas target.
3. When a miner picks up a transaction into a block, the user pays the entire amount of the max fee (per unit of gas used, which is upper bounded by the gas limit). Out of this, the basefee is **burned** (destroyed permanently), and the miner gets the tip on top of the basefee. As an example, if the user offers a max fee of 300 gwei/gas and the basefee is 200 gwei/gas, the miner pockets 100 gwei/gas and the user pays 300 gwei/gas when her transaction is accepted.

The target amount of gas per block ($g_t$) is 15 million gas, while the maximum block size is 30 million gas. Notice that the maximum block size is double the target block size; this means we can think of EIP-1559 as automatically adjusting based on the demand for the gas resource, with a "slack parameter" ($\frac{maximum}{target}$) of 2x.

### Multidimensional EIP-1559

Currently a transaction's price depends solely on gas. Gas is trying to capture many components of the EVM at once, including:

* EVM usage: the computational complexity of the operation.
* Calldata usage: the input string that gives instructions to the EVM. L2 rollups in some sense abuse this feature since it is not priced adaptively.
* Storage: the amount of disk space that will be occupied.
* Witness data: signatures that attest to the validity of the blockchain. [^31]

Using a unidimensional pricing model to represent several variable resources can have negative effects on the operation of a blockchain.
When resource costs are hardcoded, there is a risk of resource exhaustion attacks in which resource mispricing is exploited to overload the blockchain.[^32] Additionally, the system may not operate as efficiently as possible when multiple resources aren't priced separately.[^30][^31] For example, if high EVM execution levels lead to the block limit being reached, other resources will undesirably become more costly in the next block. Instead, a multidimensional approach would allow each resource's supply and demand to fluctuate separately for a more accurate pricing model.[^30]

There are two natural ways to allocate space for multiple resources inside a transaction:

1. A **split-pool** allocation would have each constituent resource priced individually with its own target limit and maximum limit. If a transaction is a suitcase, one can think of this design (for $k$ resources) as having $k$ suitcases, one for each resource.
2. A **shared-pool** allocation would still have one single space allocated for multiple resources, but the resources would all be converted into a single unit (say gas), like the status quo, unidimensional EIP-1559. The resources would still have their own basefees in this system. This is analogous to having a single suitcase to fit the $k$ resources.

For our work, we assume a split-pool allocation, as in Figure 1 below.

![](https://imgur.com/tQoaSWe.png)

**FIGURE 1**: Target and maximum limit for a block pre and post MEIP-1559 using a split-pool allocation. The target limit is represented by the dotted line while the maximum limit is represented by the red line.

## 1.2 Transaction Modeling

We model and analyze the transactions in a MEIP-1559 design. We make some simplifying assumptions. To start, we will use just 2 resources -- **EVM execution** and **calldata**. (Technically we are using gas and calldata, so we should subtract off the gas component allocated to calldata to make the resources as disjoint as possible. However, for this preliminary exploration we did not do so.) We obtained data from 14979 transactions which were written to blocks 15627832 to 15627931 by using the Python web3 API along with a public RPC from Ankr. [^28] Then we must generate new transactions with similar characteristics.

### Methodology Overview

In previous CAMCOS work, we have modelled the distribution of (one-dimensional) gas. We had the most success with **gamma** distributions[^44]
$$f(x; a, b)=\frac{b^{a}x^{a-1}e^{-b x}}{\Gamma(a)}, \quad \text{where } a>0,\ x\geq0,$$
though we also had success with other distributions (for example, CAMCOS Fall 2021 obtained a simple model via a Pareto distribution with $\alpha = 1.42150$ and $\beta = 21000$). The parameter $a$ is known as the **shape** and $\frac{1}{b}$ the **scale**. An additional location parameter can be used to shift the distribution along the x-axis, which is useful for shifting the minimum possible gas value to the real-world minimum gas value of 21000. Fitting the raw transactional gas data with a gamma distribution gives $X \sim \Gamma(0.742,\ 0.945 \times 10^{-6})$ with a lower limit of $21000$, and fitting the raw transactional calldata length values with a gamma distribution gives $Y \sim \Gamma(0.158,\ 0.000322)$.
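For concreteness, the following is a minimal sketch (not our exact notebook) of how such shifted-gamma fits can be reproduced with SciPy; the file and column names are placeholder assumptions, and SciPy reports a scale parameter whose reciprocal is the rate $b$ above.

```python
# A minimal sketch (not the exact notebook we used) of fitting shifted gamma
# distributions to the per-transaction gas and calldata-length data.
# "transactions.csv" and its column names are placeholders.
import pandas as pd
from scipy import stats

txs = pd.read_csv("transactions.csv")          # columns assumed: gas_used, calldata_length

# Gas: fix the location parameter at the 21000 minimum so the fit only
# searches over shape and scale.
a_gas, loc_gas, scale_gas = stats.gamma.fit(txs["gas_used"], floc=21000)

# Calldata length: no shift needed since lengths of zero are common.
a_cd, loc_cd, scale_cd = stats.gamma.fit(txs["calldata_length"], floc=0)

print(f"gas:      shape={a_gas:.3f}, rate={1/scale_gas:.3e}, loc={loc_gas}")
print(f"calldata: shape={a_cd:.3f},  rate={1/scale_cd:.3e},  loc={loc_cd}")

# Sampling new (independent) marginal values from the fitted distributions:
gas_samples = stats.gamma.rvs(a_gas, loc=loc_gas, scale=scale_gas, size=1000)
```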
In previous work, we had a couple of naive approaches:

1. the **sample-then-split** approach: we model transactions as following some "total gas" distribution first, then given a gas amount $Z$, split it randomly into $X+Y$, where $X$ corresponds to EVM and $Y$ corresponds to calldata.
2. the **sample-then-combine** approach: we model transactions as sampling $X$ and $Y$ independently as the EVM and calldata values, and then adding them together to create a transaction.

There are pros and cons to both approaches, and they are useful as toy models. However, both of them assume independence between the two resources, which is highly unrealistic.

We began with the most naive sample-then-combine transaction generation, which was to model each resource independently using known distributions (**Method 1**), but then ended up discarding the approach altogether for the sake of accuracy. We then attempted various approaches to modeling the resources with dependence using joint distributions:

1. Our second attempt was to use a clustering algorithm to identify clusters before modeling each cluster with multivariate distributions (**Method 2**).
2. We further improved our transaction generation with our own clustering algorithm that identifies important gas and calldata values that appear in a high proportion of the dataset (**Method 3**). We then modeled these special values using univariate distributions. The remainder of the dataset was modeled using a single multivariate distribution.

Of these, **Method 3** performed the best of the approaches we evaluated. We now outline the lessons from all of these approaches.

### Method 1

As stated, we modeled each resource independently using the resource's marginal distributions. We first split the dataset into a training and test set with an 80-20 split, respectively. As the population parameters for the various known distributions are unknown, we must estimate them from the dataset. This prevents us from using common methods of comparing a model to a known distribution, such as the Smirnov test or Anderson-Darling test, since the groups being compared aren't independent.[^34][^35] Therefore, we use the technique described in [^36] to first estimate bootstrapped model parameters on the training set. We created models using these estimated bootstrapped parameters for three distributions commonly used in transaction analysis: Poisson, normal, and gamma. [^29] We shifted the Poisson and gamma distributions using a location parameter of 21000 to avoid unrealistically low modeled gas values. This is of course not possible for the normal distribution as it is unbounded.

Empirical cumulative distribution functions (CDFs) of the transaction data and fitted models were compared using a two-sample bootstrapped Smirnov test.[^34] The charts below show the raw gas used distribution and fitted model distributions with Smirnov test statistic and p-value annotations. It is desirable to have a low D test statistic value and a high p-value. We plotted histograms of modeled data vs raw data in **Figure 2**. The Poisson distribution fits the transaction gas used data poorly with a high Smirnov test statistic, although it has a high p-value. The normal distribution fits the transaction gas used data moderately well with a low Smirnov test statistic; however, it also has a low p-value. Note that the normal distribution is not bounded below by zero as would be desired in an appropriate transaction gas used model. The gamma distribution fits the transaction gas used data well with a relatively low Smirnov test statistic and high p-value.

![](https://i.imgur.com/EIu6jaG.png)

**FIGURE 2**: Histograms of raw gas values (orange) vs modeled gas (green).
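The comparison procedure can be sketched as follows, under the same placeholder file assumption; `ks_2samp` plays the role of the two-sample Smirnov statistic, and the loop is a simplified stand-in for the bootstrapping procedure in [^36].

```python
# A simplified sketch of comparing a fitted model to held-out data with a
# two-sample (bootstrapped) Smirnov test. The split, parameters, and loop
# count are illustrative, not our exact procedure from [^36].
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(0)
gas = pd.read_csv("transactions.csv")["gas_used"].to_numpy()   # placeholder file/column
train, test = np.split(rng.permutation(gas), [int(0.8 * len(gas))])

d_stats = []
for _ in range(200):
    boot = rng.choice(train, size=len(train), replace=True)
    a, loc, scale = stats.gamma.fit(boot, floc=21000)   # re-fit on each bootstrap sample
    model_sample = stats.gamma.rvs(a, loc=loc, scale=scale,
                                   size=len(test), random_state=rng)
    d, _ = stats.ks_2samp(model_sample, test)            # two-sample Smirnov statistic
    d_stats.append(d)

print(f"mean D = {np.mean(d_stats):.4f} (+/- {np.std(d_stats):.4f})")
```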
In this dataset, 18.3% of transactions use 21000 gas. This is the minimum amount of gas that an Ethereum operation currently requires.[^33] This already suggests that the gas used distribution may be better described by a mixture model with a large point mass at 21000 gas and, separately, a continuous distribution for the remaining gas values. Our Method 3 is heavily motivated by this observation.

We then followed the same methodology for calldata. As calldata often has length zero, however, there is no need to shift the Poisson and gamma distributions with a location parameter as we did for the gas distributions. **Figure 3** shows the distributions of raw calldata lengths vs modeled calldata. Again, the gamma distribution performs the best of the three distributions tested, with the lowest test statistic value.

![](https://i.imgur.com/3KhYzmb.png)

**FIGURE 3**: Histograms of raw calldata (orange) vs modeled calldata (green).

If we generate the joint distribution from what we obtained in Method 1 and plot it against empirical data, we obtain the following fit:

![](https://i.imgur.com/7x95fVK.png)

**FIGURE 4**: Raw test data plotted against data generated using method 1.

Clearly, this is not the best fit, even without accounting for the boundary behaviors. This was the main motivation that inspired us to try other methods.

### Method 2

When we visualize the calldata length vs gas used on a log-log scale as in **Figure 5**, we see that there are several clusters of data. What we call Method 2 is naively using clustering on this data. We include the clustering in **Figure 5** as well.

| Data | Clustering |
| -------- | -------- |
| ![](https://i.imgur.com/KtCu9Zb.png) | ![](https://i.imgur.com/y6dEWzm.png) |

**FIGURE 5**: Data and corresponding clusters.

If we use the clusters as a basis to generate new data and again overlay with empirical data, we obtain:

![](https://i.imgur.com/40AmRZQ.png)

**FIGURE 6**: Raw test data plotted against data generated using method 2.

We do a better job here than Method 1, as the data are at least roughly in the same place. However, we see some problems with this method right away. The clusters are arbitrarily split into sub-clusters, and it isn't clear that any underlying logic dictates the different clusters. For example, note that the horizontal line on the bottom is split into two clusters, and this corresponds to a strange "line plus wave" data generation on the bottom of **Figure 6**. Several of the horizontal lines that seem logically distinct near the $y=4.2$ area in the right subplot of **Figure 5** are also put into one cluster because of geometric distance, but they seem to correspond to structurally different data. To insert some logic that takes care of these issues, we decided to go one step further in Method 3.

### Method 3

The outline of this method is as follows:

1. First, we specify a ``ratioLimit`` (which we arbitrarily set to 1.5%) that corresponds to a lower bound which forces us to treat a value as important (see the sketch after this outline). If more than ``ratioLimit`` of the dataset falls at a specific gas or calldata value, then we take all data with the corresponding value and treat it separately from the remainder of the dataset.
2. We fit special values by $1$-dimensional or $2$-dimensional distributions, depending on whether both coordinates were determined to be important (the $1$-dimensional distributions ended up being gammas as well). We fit the values that weren't previously modeled by $1$-dimensional distributions with a single separate $2$-dimensional distribution (we used a multivariate lognormal for this).
3. When we generate data, we generate the important values and the "non-important" values separately with the fitted distributions.
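A minimal sketch of Step 1 follows, assuming the data sits in a pandas DataFrame with placeholder column names; the 1.5% threshold matches the ``ratioLimit`` above, and the function and variable names are illustrative rather than our actual implementation.

```python
# A minimal sketch of Step 1: flag gas / calldata values that individually
# account for more than `ratio_limit` of the dataset. Column names are
# placeholders; this is an illustration, not our exact implementation.
import pandas as pd

def important_values(series: pd.Series, ratio_limit: float = 0.015) -> list:
    """Return the values that appear in more than `ratio_limit` of the rows."""
    freq = series.value_counts(normalize=True)
    return freq[freq > ratio_limit].index.tolist()

txs = pd.read_csv("transactions.csv")          # placeholder file
special_gas = important_values(txs["gas_used"])
special_cd  = important_values(txs["calldata_length"])

# Split the dataset: rows hitting a special value are modeled by 0- or
# 1-dimensional distributions; everything else goes to the multivariate fit.
is_special = txs["gas_used"].isin(special_gas) | txs["calldata_length"].isin(special_cd)
special, remaining = txs[is_special], txs[~is_special]
print(len(special), "special rows,", len(remaining), "remaining rows")
```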
We see Step 1 in action in **Figure 7** below. Notice that the remainder of the data looks roughly like a 2-dimensional lognormal distribution.

![](https://i.imgur.com/4PPEfVk.png)

**FIGURE 7**: Calldata length vs gas used for the transactionDataContinuous_Clean.csv file using 14979 transactions from blocks 15627832 to 15627931, with clusters identified at the vertical and horizontal lines. Here, the ``ratioLimit`` threshold for defining a cluster was $\geq$ 1.5% of the dataset at a given calldata or gas used value.

These identified clusters are displayed in the table below.

1. Rows with type "Point" are modeled in the transaction simulator as a "0-dimensional" joint distribution with the exact calldata and gas used values as displayed in the table, which simply account for a specific proportion of the dataset.
2. Rows with type "GasDistribution" are modeled in the transaction simulator as a 1-dimensional gamma distribution of calldata with a fixed gas value.
3. Similarly, rows with type "CDDistribution" are 1-dimensional gamma distributions on gas with a fixed calldata value.

**Table 1**: Performing method 3 resulted in the following clusters.

![](https://i.imgur.com/HZq3M85.png)

We are still left with one cluster to model under "Remaining", which fits well with a multivariate lognormal distribution.

![](https://i.imgur.com/9cPOgzZ.png)

**FIGURE 8**: After modeling the transaction values identified by the vertical and horizontal lines in the figure above, we are left with this final cluster of data.

<!-- ![](https://i.imgur.com/UKEg9rg.png) -->

![](https://i.imgur.com/Zw5iUiX.png)

**FIGURE 9**: Raw test data plotted against data generated using method 3.

We again plot simulated data against raw test data (**Figure 9**). See the table below for results for all three methods side-by-side (**Figure 10**). For the sake of comparison, we also included the outputs of the same procedure for the first two methods. Method 3 seems to perform the best of the procedures attempted.

| ![](https://i.imgur.com/oXmEeB1.png) | ![](https://i.imgur.com/OYl8U6w.png) | ![](https://i.imgur.com/Zw5iUiX.png) |
| -------- | -------- | -------- |

**FIGURE 10**: Raw test data plotted against data generated using all three methods.
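To illustrate Step 3 of Method 3, here is a compact sketch of generating transactions from such a mixture; the cluster weights and distribution parameters are made-up placeholders, not the fitted values from Table 1.

```python
# A compact sketch of Method 3's generation step (Step 3 above): sample from
# the point masses, the 1-dimensional gamma clusters, and the remaining
# multivariate lognormal, in proportion to their empirical weights.
# All names, weights, and parameters below are illustrative placeholders.
import numpy as np

rng = np.random.default_rng(0)

clusters = [
    # (weight, sampler returning one (gas, calldata) pair)
    (0.18, lambda: (21000, 0)),                                  # "Point" cluster
    (0.05, lambda: (50000, rng.gamma(shape=0.5, scale=200.0))),  # fixed gas, gamma calldata
    (0.77, lambda: tuple(np.exp(rng.multivariate_normal(         # remaining lognormal
        mean=[11.5, 5.0], cov=[[1.0, 0.6], [0.6, 1.5]])))),
]

def generate_transaction():
    weights = [w for w, _ in clusters]
    _, sampler = clusters[rng.choice(len(clusters), p=weights)]
    gas, calldata = sampler()
    return {"gas": float(gas), "calldata": float(calldata)}

sample = [generate_transaction() for _ in range(5)]
print(sample)
```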
## 1.3 Simulation Procedure

Given a model to generate transactions, we can run simulations of MEIP-1559. Our method [^41] starts with a mempool (populated via 2000 transactions generated from our transaction generator from 1.2) and runs over a sequence of many *turns*, each corresponding to one slot. Each turn, the following actions take place:

1. Some number (random with a mean of 400) of transactions are created and inserted into the mempool.
    a. The resource limits (gas and calldata) of these transactions are again based on our transaction generator.
    b. The tips are simulated based on the Geth recommendation (continuing methodology from previous CAMCOS work). This recommendation is generated using a gamma distribution of $Y \sim \Gamma(20.72054, 1/17.49951)$ [^42].
2. The total price of each transaction is calculated by summing ``GasUsed * GasPrice`` over all the resources.
3. The simulator fills the block. We use a greedy approach: transactions are first sorted in the mempool based on tip amount, and the transactions that generate the largest profit are prioritized and put into the block until the limit is reached.

While the greedy approach is a reasonable start (and this approach is the current implementation in *geth* [^43]), for future simulations (and possibly reality) it might make sense to consider more sophisticated solvers for MEIP-1559. One would want to make a good tradeoff between the additional complexity of knapsack solvers and the possible loss of MEV from a greedy algorithm, although it is very likely that this will be a moot point as other realizable sources of MEV will dominate.

Armed with this simulator, there are many simulations we can do. For the sake of focus, we look at a particular application, which is exploring parameter values for calldata, such as the learning rate.

## 1.4 Exploring Calldata Values

For our simulation, it was easy to pick default parameter values for gas from empirical data. However, we do not have a good precedent for calldata. Luckily, we can bootstrap our process by using our simulator to explore the value space. Using an initial basefee of 25 and a maximum limit that is double the target limit, we vary the **target limit** and **learning rate**, $d$. Recall that for one-dimensional EIP-1559 and gas, $d$ refers to the $1/8$ from the basefee update formula $b_c = b_p \cdot (1 + \frac{1}{8} \cdot \frac{g_p - g_t}{g_t})$. The results of our simulation follow (we pick this range of values to show because they are where the calldata basefee doesn't increase indefinitely or decrease to 0).

![](https://imgur.com/zrupQL2.png)

**FIGURE 11**: Grid plot of learning rate against target limit of calldata.

![](https://imgur.com/hI3zGZc.png)

**FIGURE 12**: Heatmap of variance of basefees of learning rate against target limit of calldata.

We observe that an aggressive learning rate such as 5/8 causes large fluctuations, and larger target limits such as 24000 cause basefees to tend to 0. Avoiding those regions, there might be many different metrics to pick a good learning rate. For a very preliminary metric, we can use the variance of basefees. This metric would point to calldata having a learning rate of 1/11 with a target limit of 23500. Do note that the basefee of gas, denoted by the blue line, looks constant because we are looking at a large scale of 0-100.

So far, our simulations used transactions generated with data scraped during normal times. We also want to observe what happens when we apply transactions scraped during peak times (another lesson we learned from earlier CAMCOS work). In **Figure 13**, we show the data from February 9, 2022, when the second most expensive NFT, "Clocks", was sold; it contains much higher overall gas and lower calldata compared to normal operations (left).

![](https://imgur.com/eTmsJws.png)

**FIGURE 13**: Transaction data during normal times and peak times.

We repeat our procedure on this data in Figures 14 and 15, showing respectively the results of simulations using different learning rate / target limit values and a heatmap of variances of basefees.

![](https://imgur.com/qLJFLS9.png)

**FIGURE 14**: Grid plot of learning rate d against target limit of calldata.

![](https://imgur.com/onMEAw0.png)

**FIGURE 15**: Heatmap of variance of basefees of learning rate against target limit of calldata.
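Putting Sections 1.3 and 1.4 together, the following is a minimal sketch of one possible implementation of the simulation loop: per-turn transaction arrival, greedy filling by tip, and an independent basefee update per resource. The constants echo the text, but the structure and helper names are illustrative, and the marginal sampling stands in for our Method 3 generator.

```python
# A minimal sketch of the simulation loop from 1.3-1.4: per-turn transaction
# arrival, greedy block filling by tip, and an independent (split-pool)
# basefee update per resource. Constants mirror the text; the structure and
# helper names are illustrative, not our exact simulator.
import numpy as np

rng = np.random.default_rng(0)

TARGETS = {"gas": 15_000_000, "calldata": 23_500}    # per-resource target limits
MAXIMA  = {r: 2 * t for r, t in TARGETS.items()}     # slack parameter of 2x
LEARNING_RATES = {"gas": 1 / 8, "calldata": 1 / 11}
basefees = {"gas": 25.0, "calldata": 25.0}

def new_transaction():
    # Placeholder generator; in our work this is the Method 3 generator above.
    return {"gas": float(rng.gamma(0.742, 1 / 0.945e-06) + 21000),
            "calldata": float(rng.gamma(0.158, 1 / 0.000322)),
            "tip": float(rng.gamma(20.72054, 1 / 17.49951))}

mempool, basefee_history = [], []
for turn in range(100):
    mempool += [new_transaction() for _ in range(rng.poisson(400))]
    # Greedy fill: highest tip first, stop adding a tx if any resource maximum
    # would be exceeded.
    mempool.sort(key=lambda tx: tx["tip"], reverse=True)
    used = {r: 0.0 for r in TARGETS}
    block, rest = [], []
    for tx in mempool:
        if all(used[r] + tx[r] <= MAXIMA[r] for r in TARGETS):
            block.append(tx)
            for r in TARGETS:
                used[r] += tx[r]
        else:
            rest.append(tx)
    mempool = rest
    # Split-pool MEIP-1559: each resource's basefee updates from its own usage.
    for r in TARGETS:
        basefees[r] *= 1 + LEARNING_RATES[r] * (used[r] - TARGETS[r]) / TARGETS[r]
    basefee_history.append(dict(basefees))

calldata_var = np.var([h["calldata"] for h in basefee_history])
print("variance of calldata basefee:", calldata_var)
```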
### Calldata String Analysis

As a preliminary investigation into calldata strings, we visualize the most common strings present in our dataset in **Table 2**. Here, any string that is present in more than 0.5% of transactions is displayed. Future work should investigate what types of transactions these are, and how they are related to other resources.

**Table 2**: Most commonly observed calldata strings in our dataset. Note that the calldata string '0x' corresponds to no calldata used.

![](https://i.imgur.com/tU3EG3H.png)

## 1.5 Summary

In conclusion,

1. Continuing previous CAMCOS work, we have improved upon transaction (resource limit) generation with a clustering-based method that seems to work better than methods involving independence assumptions or "out-of-the-box" clustering methods.
    - To our knowledge, there are no reliable multivariate goodness-of-fit tests that can be used to compare data that were generated using distribution parameters estimated from raw data to the raw data itself. Perhaps a modified Peacock test could be used as a next direction [^45].
2. We have applied this generator to make a functioning simulator for MEIP-1559-related simulations.
3. We have used the simulator to make some preliminary explorations of good parameter values for calldata. We decided that for normal operations, $d=1/11$ and 23500 look like good values for the learning rate and target limit, respectively.
    - However, this seems to be different for high-demand (such as NFT) situations.
    - We also used variance as a simple metric; however, it is very imperfect (for example, there is a clear bias for smaller learning rates to produce smaller variance).

# 2. Proposer-builder Separation

In this sub-project, we discuss different designs for proposer-builder separation (PBS), make some qualitative observations, and use game theory to study some toy problems to obtain some quantitative insights.

## 2.1 Preliminaries

The general idea behind **proposer-builder separation (PBS)** is to split the power of a miner/validator into two roles:

1. Builders gather transactions from users and optimize for MEV, whether their own or as a service for users. They then create candidate **blocks**.
2. Proposers choose one of the candidate blocks and propose it.

The market can be very simple - for example, the builders can all propose blocks with a bid for each block, and the proposer simply chooses the block with the highest bid [^2].

As a concrete example, **MEV-Boost** [^3] is an off-chain implementation of proposer-builder separation that works with the current Ethereum 2.0 blockchain. MEV-Boost is free and open-source software that enables a communication channel between validators (acting as proposers) and one or more block builders. It acts as a trusted intermediary between the parties, using trusted **relays** to store intermediate blocks that proposers agree on. The design also has an additional role: specialized MEV-maximizers called **searchers**, who find profitable transactions and send them as bundles to builders.

The ultimate goal for PBS is the popular concept of **decentralization**; in particular, PBS seeks to decentralize the power in the proposer role, which presently has both building and proposing powers. In a detailed design, we would need to balance some subgoals of decentralization, such as:

* **MEV-resistance**: how resistant the design is against proposers/builders hurting the user experience when incentivized to maximize MEV.
  * It is hard to give a precise definition of **maximal extractable value (MEV)**, but it is typically taken to mean profits that can be extracted by proposers (pre-PBS) from the mempool.
    It can be considered neutral (or even good) when it increases the efficiency of the market in the case of arbitrage and liquidation, but it is usually considered "bad" in the context of actions such as **front/back-running** (including transactions before/after other transactions to obtain value, usually at the cost of the authors of those transactions).[^3][^4]
  * One reason to fight MEV is that it creates "MEV-driven centralization": if MEV creates large amounts of revenue, it becomes a focus and a source of distraction, leading to collusive efforts to obtain the revenue and thus centralization.[^5]
* **Censorship resistance**: how resistant the design is against proposers / builders colluding to not include transactions that "should" be included in blocks (i.e. transactions with a high enough tip and basefee).
  * Much of censorship is simply due to MEV. For example, proposers can copy transactions that seem high-value and ignore the original transactions.
  * There are also non-MEV reasons to censor transactions, such as off-chain political pressure. For example, the crypto mixer Tornado Cash became a smart contract sanctioned by the United States government; as a result, complying validators are forced to exclude sanctioned transactions [^7].
  * Another relevant example comes from MEV-Boost: some relays in MEV-Boost ("OFAC-enforcing relays") choose not to include transactions sanctioned by OFAC, so they reject blocks containing these transactions. According to [^46], most of the validators use these OFAC-enforcing relays, so this type of censorship is a very relevant issue.

PBS could mitigate validator centralization, but it also introduces a new centralization: builder centralization. Builders might overbid for blocks and exclude (censor) certain transactions, which is considered a censorship threat in PBS. A few different solutions to mitigate this kind of censorship are[^26]:

* **Censorship-resistance list (crList)**: proposers obtain the right to add transactions to a list and can then ask/force builders to include transactions from the list. We will look at this design more in subsequent sections.
* **MEV-smoothing**: we can try to equalize MEV profits for all proposers. Proposers who do not propose the most profitable blocks will be disincentivized by attestors from the consensus layer [^6][^47].
* **Encrypted mempools**: the contents of transactions are encrypted before entering the mempool, and decrypted only once on-chain. This makes it hard for both parties to censor and extract MEV.
* **Altruistic self-building**: altruistic proposers build their own blocks rather than creating a transaction list for builders to choose from.

Of course, any design should also take into consideration practical concerns unrelated to decentralization, such as:

- **complexity**: how complicated a design is.
- **liveness / throughput**: a good design should make sure that the rate at which transactions get processed by the chain is not slowed down too much.
- **incentive compatibility**: we want to make sure the different players in the system want to participate in the system. We can probably rely on altruism for low-cost activities, but generally we want to consider the agents as rational.

## 2.2 Some Main Questions in PBS Design Space

There are implementation details we probably want in every PBS design (for a simple example, we probably want proposers to have digital signatures and slashing rules that a proposer should not make two proposals).
However, there are quite a few major variations / parameterizations one can tweak when implementing PBS:

### 2.2.1 What object is the builder market bidding on?

Two natural approaches are:

1. **Slot auction**: Builders simply bid for the right to make blocks. This is essentially bidding to be a full-powered proposer (assuming the builders act fast enough). Typically this means that during slot $n$, builders bid for the right to make blocks in slot $n$, though they can also theoretically bid for blocks in slot $(n+1)$ or later[^8].
2. **Block auction**: Builders need to have a block built already at the time of the bidding, and bid for the right to have the block selected. Typically this is done by submitting a hash/header during the bid, after which the builders need to reveal the block contents.

Some basic observations:

- In a slot auction, builders still have time to perform MEV extraction even after they win their bid. This correlates with less MEV resistance. However, in a block auction, the interaction with the consensus layer is more difficult - what happens, for example, if the proposer accepts a hash but the builder does not show up with a valid block? Thus, at a shallow level, we can consider this a tradeoff between MEV resistance (or more broadly, decentralization) and complexity.
- In a slot auction, we would need to consider the "time limit" on bidding and building. The longer the gap, the less MEV resistance we have (since the builder has more time to act). Shorter gaps might affect throughput and also make the system less incentive-compatible for builders.
- In a block auction, we must secure against the possibility that builders disappear without publishing the block body after an accepted header. There are potential incentive issues (for example, we want the proposers to be paid even if this happens) but also potential consensus consequences (we might need, e.g., a two-step process for every block that has an intermediate state of "header received but block body not yet received" -- see **two-slot PBS** later).

As a concrete example, MEV-Boost[^1][^4] uses the *block auction* design via a **commit-reveal** scheme:

1. After receiving transactions (and bundles from searchers) from which they can maximize their profits, builders build the most profitable block and send blocks along with bids to **relays**.
2. **Relays** are responsible for confirming the validity of built blocks as well as safekeeping the blocks. They broadcast the block headers while keeping the block contents private.
3. The proposer chooses the block header with the highest tip with the help of the relay, signs it, and returns it to the relay. Afterwards, the block is proposed and added to the chain, and the proposer receives the transaction fees.
4. The builders then expose the block data, and the relays can be trusted to secure the data if the builders disappear[^44].
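As a toy illustration of the commit-reveal flow (and not MEV-Boost's actual protocol or API), consider the following sketch, where a hash stands in for the signed header and a dictionary stands in for the relay.

```python
# A toy sketch of the commit-reveal block auction described above. This is a
# simplified illustration, not MEV-Boost's actual protocol or API: hashing
# stands in for headers/signatures, and the "relay" is just a dictionary.
import hashlib
import json

def header_of(block: dict) -> str:
    """Commitment to the block contents (stand-in for a real block header)."""
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

# 1. Builders submit (header, bid) to the relay, which safekeeps the bodies.
relay_store = {}
bids = []
for builder, block in [("builder_A", {"txs": ["tx1", "tx2"], "bid": 0.31}),
                       ("builder_B", {"txs": ["tx3"], "bid": 0.27})]:
    h = header_of(block)
    relay_store[h] = block            # relay keeps the body private for now
    bids.append({"builder": builder, "header": h, "bid": block["bid"]})

# 2. The proposer signs (here: just selects) the header with the highest bid.
winner = max(bids, key=lambda b: b["bid"])

# 3. The relay reveals the winning body; anyone can check it matches the header.
revealed = relay_store[winner["header"]]
assert header_of(revealed) == winner["header"], "builder equivocated"
print("proposed block:", revealed["txs"], "paying", winner["bid"])
```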
### 2.2.2 How does a slot/block auction design interact with the consensus layer?

A slot auction design basically adds a market to pick the proposer, so it is in some sense orthogonal to the consensus layer (at a level of abstraction that only knows that the blockchain gets a proposer every slot, without knowing how). The block auction design makes it a bit more complicated. Two natural approaches are:

1. **one-slot PBS**: there is a certain number of committees between builders and proposers (like the relayer role in MEV-Boost). The committees can validate the proof of the payload sent from builders and then send the header with the validated proof to proposers. The proposer accepts the highest-bid header and broadcasts the given block (made available by the committees). Then the attestors can vote[^9].
2. **two-slot PBS**: the process is divided into two slots: the first slot creates a block on the beacon chain with just the winning header. The second slot is the delivery stage, releasing the winning builder's block containing the payload. Attestors will vote twice, once for the headers and once for the complete blocks[^10], to make the block safe.

Between these designs, the second one is more modular and does not need a trusted layer. However, it makes more complex demands of the consensus layer and will hurt throughput, since the validators are doing twice the amount of attestation work.

The above is just a sample of the design space. There are many other possible designs -- for example, maybe there can be a market for many proposers participating as a single proposer unit, voting on different blocks to propose.

## 2.3 crList designs for hybrid PBS

Our "basic" PBS setup has a very simple role for proposers. Even though it introduces competition for builders and thus decentralizes the builders, it is still possible for builders to become centralized if, e.g., a certain builder pool becomes very good at optimizing MEV and always wins the bids as a **dominant builder**. Thus, we might want a design that gives some power back to the proposer, and splits power between the proposer and the builders.

One particular type of design, called **hybrid PBS** [^10], combines the status quo transaction fee market and PBS. The main idea is that the proposers "force include" some transactions, so that some transactions in the block will be provided by the proposers and some by the builders. The basic design is as follows:

1. When a proposer is selected, the proposer creates a **crList** (the "cr" stands for "censorship resistant" [^11]), which refers to a list of transactions with sufficient basefee and tips that the proposer considers to be worth adding [^12]. These transactions may include politically sanctioned transactions and previously censored transactions[^13].
2. The blocks created by the builders must include the transactions in the crList.
3. The proposer accepts the winning header.
4. The builder publishes the block body.

The idea is that altruist-minded proposers can use the crList to fight against collusive builder censorship. The spirit of these transactions would be including transactions "fairly" for the sake of anti-censorship. More broadly, one can think of our "basic PBS design" as a special case of **partial block auctions**[^21], where builders are auctioned the right to manipulate *some* aspect of the block with *some* proposer input, and the interactions between the two parties create the final block.

Of course, a "rational" proposer would be able to perform MEV on the transactions themselves, or not include transactions in such a list at all. However, as long as we have some altruistic proposers, we significantly decrease the probability of prolonged censorship. A heuristic argument shows that even 5% of altruistic proposers is enough to include a "victim transaction" of censorship every 20 slots[^17]. crLists are one of the main proposed solutions to censorship on MEV-Boost due to OFAC-enforcing relays ([^48]).

The crList itself has a few design parameters:

1. **crList content**: for the txs in the crList, we have two choices: all txs are required to be old txs (previously censored), or any txs that the proposer approves.
    The former choice places more restrictions on the proposer's selection of crList transactions but can somewhat prevent the proposer from being discriminatory. The latter is more likely to be problematic, allowing, e.g., proposer censorship and discrimination.
2. **crList space**: we can let the builders use whatever space the crList doesn't use (we call this **proposer-prioritized shared space**), require crList transactions to fill whatever space the builders don't use (**builder-prioritized shared space**), or just have a fixed limit for both the crList and non-crList portions of the block (**separated space**). A few observations:
    * It doesn't make sense to set a lower bound for the crList space, as a proposer can theoretically not have seen any transactions.
    * If we use proposer-prioritized shared space, we probably do not want the upper bound to be equal to or more than $\frac{1}{2} G_{max}$, because that leaves the possibility for proposers to take up the entire blockspace every block, or "proposer centralization" (although this is not necessarily a game breaker, it might structurally dissuade people from being builders).
    * Builder-prioritized shared space allows builders to completely ignore the crList if there are enough transactions in the mempool to fill the block instead. However, if there aren't enough transactions in the mempool, the builder must either include crList transactions or fill the rest of the block with their own transactions, paying the basefee for each one. This disincentive relies on the exponential increase of basefees from EIP-1559[^21][^27].
    * Separated space is very similar to proposer-prioritized shared space with an upper bound on crList size. Assuming the proposer fills up their allocated space, the two designs are equivalent.
3. **crList incentives**: the two obvious choices are letting proposers include whatever they want with no monetary incentive, with the spirit being anti-censorship rather than MEV (**altruistic crList**), and paying proposers for each transaction (**monotonic-value crList**). A few observations:
    * Positive-value incentives would lead proposers to want to fill the crList, so we would definitely not want the upper bound to be equal to or more than $\frac{1}{2} G_{max}$ in that case.
    * One alternative would be a reward for successfully creating a crList of a required length (e.g. length > 1) or of a certain percentage of block space (e.g. $\frac{1}{4} G_{max}$).
4. **crList integration**: basically, how the builders use the crList transactions in their block. Natural choices are putting the crList in the front (**frontloading**), putting the crList in the back (**backloading**), putting the crList anywhere (**free list inclusion**), or putting individual transactions anywhere (**free transaction inclusion**). With all of these options, one can also, e.g., consider letting builders change the order of crList transactions or permute them freely.
    * Frontloading induces backrunning and backloading induces frontrunning. Both types of activities have different drawbacks for the network [^18]. For example, one can argue that backrunning is more likely to be done via spamming, so it has greater effects on throughput [^19]. Thus, choosing between front/backloading could come down to choosing the lesser of the two drawbacks.
    * Note that free list inclusion and free transaction inclusion both give builders even more power to extract MEV than either the frontloading or backloading methods.
    * It's important to keep in mind that certain transactions are not very valuable for MEV but can still benefit from anti-censorship (such as politically censored transactions).
5. **builder incentives**: this is the "dual question" to the incentives for proposer transaction inclusion in the crList, namely whether there are punishments for builders for *not* including transactions from the crList. The natural choices are to slash builders for not including the entirety of the crList (**all or nothing**) or to have a punishment for each transaction not included (**partial punishments**).

## 2.4 Cost of Censorship

In [^12], Buterin performs some preliminary quantification of the concept of **cost of censorship** as a quantitative measure of censorship resistance. The idea is to consider a transaction $T$ that powerful economic actors are trying to censor and then compute the cost it would take to sustain the censorship of the transaction. We start by reframing the problems examined in that analysis.

### Cost of Censorship without crList

We use the following notation:

- $T$ is a particular transaction attackers want to censor.
- $X_C$ is the value (taking into account gas limit, basefee, and tip) of $T$.
- $k$ is the number of transactions that fit in a block. We assume each transaction costs the same amount of gas, so any transaction (including $T$) would be $1/k$ of a block.
- $M$ is the mempool.
- Every transaction in $M \backslash \{T\}$ has the same value $X_M$. Let $X_C - X_M = \delta$.
- We can assume $X_C \geq X_M$ (since otherwise we would get censorship fairly naturally from miners who would not want to include it). This means $\delta \geq 0$.

**Problem 1: no PBS**. As the most basic example, we consider status-quo no-PBS setups. Suppose economic actors are trying to censor $T$.

1. Unless the attacker can consistently control all the miners, the attacker pretty much has to "buy up the block" by creating transactions that are at least $X_C$ in value.
2. This means the attacker must pay $X_C \cdot \min(k, |M|)$ per block, or else other miners would include $T$.
3. The analysis here basically matches Buterin [^12], who obtains $(X_C / g) \cdot G$, where $g$ and $G$ are constants representing the gas limit of $T$ and the max gas limit of a block, respectively. Our model estimates $G/g = k$, so we get matching results in the case where the mempool is big enough to fill blocks.

**Problem 2: basic PBS**. In the basic PBS setup, the attacking builder censors by consistently outbidding the other builders (again, since $T$ is at least as valuable as the average transaction, we assume it would naturally be included in the blocks of other builders). This can be seen as an **auction theory** problem, where the cost to the builder is effectively the cost of continuously winning the auction. For now, we abstract it away by calling the amount of money it takes (per slot) to "guarantee" / "virtually guarantee" winning the auction the **auction-smashing premium (ASP)**.

* A reasonable estimate for the ASP is the average price of winning block bids over many slots, or the **revenue** of the auction. This is a direct object of study in auction theory via Nash equilibria; we explore this a bit more in the next section.
* In practice, the attacker would probably end up paying more than just the revenue of the auction as other actors adapt by, e.g., increasing their bids. It seems difficult to quantify this increase, which is one of the reasons we use a different word from "revenue."
* There is a slight caveat, which is that in addition to just winning the auction, the cost to the builder includes the opportunity cost of not including the transaction. With our notation, since $X_C$ wins the builder $\delta$ over the other transactions, the censoring builder is effectively paying $\delta$ extra per slot to keep the censorship going, not counting the ASP. This gives a simplified formula of
$$ \mathtt{ASP} + \delta.$$

The analysis here is similar to Buterin [^12], but the notation is significantly different. There is a lot more to say about the ASP - we will have a whole section dedicated to it later as we define it more formally. For now, we can black-box the concept and just use it as a mental shorthand for the "auction-related" part of the cost of censorship.

### Cost of Censorship with crList

We look at how the crList changes the basic analysis. Recall that there are 2 design choices regarding blockspace precedence. In builder-prioritized blockspace, the crList can be ignored by builders who can otherwise fill up the block. In proposer-prioritized blockspace, the builders must include all transactions in the crList. We start by looking at builder-prioritized blockspace, since it is most similar to what we already have.

**Problem 3: Builder-prioritized blockspace.** Consider the case where:

1. the crList contains $r$ transactions (so, as a fraction, $r/k$ of the blockspace); the crList is selected uniformly at random;
2. we focus on regimes where almost always $|M| < (k-r)$ (otherwise we effectively have no crList and the builder can just fill the block with whatever they want without breaking the rules);
3. $b$ is the EIP-1559 basefee;
4. $G$ is the max gas limit of a block.

In this setup, there are two cases. With probability $(1-r/k)$, the crList does not include $T$, and the censoring actor will just need to bid the ASP (whatever it is). With probability $r/k$, the crList will include $T$, and the censoring actor will need to fill the rest of the blockspace with transactions. This would mean creating $(k-|M|+1)$ transactions, each of which gives a pure loss equal to the basefee. Thus, on top of the ASP, there is an additional cost of $\frac{(k-|M|+1)bG}{k}$ in this case. Recalling that there is also an opportunity cost of $\delta$ for not including the transaction, and putting it together, the expected per-slot cost is
$$\mathtt{ASP} + \frac{(k-|M|+1)rbG}{k^2} + \delta.$$

**Problem 4: Builder-prioritized blockspace, multiple transactions.** As an example of a slightly more complex analysis, we can generalize our analysis in Problem 3 to the case of having to censor multiple transactions. We assume:

1. there are $c = |C|$ transactions in the set $C$ of transactions that the actor wants to censor;
2. there are $m = |M|$ overall transactions in the mempool $M$;
3. the constants $c$, $m$, $k$ are big enough that we do not have to worry about, e.g., sampling with replacement vs sampling without replacement issues;
4. the fee paid by each transaction in $C$ equals $X_C$, and the fee paid by each transaction in $M \backslash C$ equals $X_M$. We can assume $X_C \geq X_M$.

So now, for each nonnegative integer $s \in \{0, 1, \ldots, c\}$ (assuming $c < k$), the crList can include $s$ of the censored transactions. As before, $s=0$ with probability $(1-r/k)^c$ (using our 3rd assumption), in which case we simply pay the ASP.
However, for any $s \geq 1$, which happens with probability roughly $(1-r/k)^{c-s} (r/k)^s {c \choose s}$, we need to create $(k - |M| + s)$ transactions, each giving a pure loss equal to the basefee. For the sake of intuition we estimate this instead of giving a precise answer. Assuming $s$ is a lot smaller than $k$, we do not lose much by just estimating $(k - m + s) \approx (k - m)$. So again we effectively just have two cases: with probability $(1-r/k)^c$ we simply pay the ASP, and in the remainder we pay an additional $(k-m)bG/k$. Finally, since the expected number of censored transactions appearing in the crList is $cr/k$, we have to pay the $\delta$ opportunity cost for each. This allows us to approximate the total cost as
$$\mathtt{ASP} + \frac{(1-(1-r/k)^c)(k-m)bG}{k} + \delta\frac{cr}{k}.$$

**Problem 5: Proposer-prioritized blockspace.** In this situation, if the crList includes the censored transactions, then the cost of censoring (when the transactions appear in the crList) is equal to the cost of ignoring the crList.

1. By default, we can assume this is a large slashing penalty that the builder may not want to consider.
2. One way around this would be to continuously bribe the proposer to not include the transaction(s) in the crList. This depends on the altruism level of the proposer and the cost of bribery.

We see that proposer-prioritized blockspace offers a high degree (possibly the best we have) of censorship resistance if (a) the slashing conditions are strict, (b) the cost of bribery is high, and (c) a censored transaction has a high probability of being picked up by an altruistic proposer (for example, someone who regularly looks for transactions with a high probability of being censored). Further analysis of the game theory of this case may be interesting but is outside the scope of our work.

### Summary

1. PBS changes the cost of censorship from a "filling up transactions" cost to a "guarantee winning the auction" cost, though there are still elements of the former. It is useful to separate out this cost (which we call the ASP) as a separate variable and study it on its own (as we do in the next section).
2. The builder-prioritized blockspace approach has weaker censorship resistance as the expected number of transactions in the mempool goes up. In the case where the mempool is consistently full, the only cost of censorship is the opportunity cost for the specific transactions.
3. In general, the cost of censorship under builder-prioritized blockspace has 3 terms:
    a. the ASP,
    b. the cost of filling up the remainder of the block with new transactions, where the censorer is paying the basefee per transaction,
    c. the opportunity cost of ignoring the censored transactions (which goes to zero in the case where they have fees similar to other transactions, though we expect transactions under censorship to raise their fees on repeated attempts).
4. Under proposer-prioritized blockspace, we get very good (as good as we can hope for?) censorship resistance, though studying its cost would involve assumptions on bribery dynamics and/or slashing conditions. We ignore these aspects in our work to keep a tighter focus.
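Before moving on, the approximate per-slot formulas from Problems 3 and 4 can be packaged into a small calculator; the ASP is left as an input (it is the subject of the next section), and the example numbers are purely illustrative.

```python
# A small calculator for the (approximate) per-slot cost-of-censorship
# formulas from Problems 3 and 4. The ASP is left as an input, and the
# example numbers at the bottom are purely illustrative.

def cost_single(asp, k, m, r, b, G, delta):
    """Problem 3: censoring a single transaction under builder-prioritized space."""
    fill_cost = (k - m + 1) * b * G / k          # loss from filling the block yourself
    return asp + (r / k) * fill_cost + delta

def cost_multiple(asp, k, m, r, b, G, delta, c):
    """Problem 4: censoring c transactions (approximation from the text)."""
    p_any_included = 1 - (1 - r / k) ** c        # some censored tx lands in the crList
    fill_cost = (k - m) * b * G / k
    opportunity = delta * c * r / k              # expected censored txs in the crList
    return asp + p_any_included * fill_cost + opportunity

if __name__ == "__main__":
    params = dict(asp=1.0, k=1000, m=800, r=50, b=2e-8, G=30_000_000, delta=0.001)
    print("single tx :", cost_single(**params))
    print("5 txs     :", cost_multiple(**params, c=5))
```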
## 2.5 Auction Theory

In this section, we set up some toy games using auction theory to model the PBS market. Our goal is to contrast these games with the transaction-based market that also arises in blockchains and to obtain some intuition for how different implementations can affect our results. Related to the previous section, this theory would also allow us to get a sense of the ASP (though we won't be able to get neat formulas due to the inherent difficulty of these problems).

We start by fixing some terms and conventions. In the style of a classic auction theory framework,

* There are $n$ **builders** denoted by $B_1, B_2, \ldots , B_n$.
* $\mathcal{N}=\{1,2,\dots,n\}$.
* During a "round" of the auction,
    1. Each builder $B_i$ is given some **signal** $s_i$, which represents private information. We typically assume each builder knows the *distribution* of signals for the other builders but not their values.
    2. Each builder makes a bid, which is a deterministic function $b(s_i)$ of their signal.
    3. The winner of the bid (the provider) pays their bid, and receives some amount of **utility**. Everyone else receives a payoff of $0$.
* If there is a tie, there are multiple approaches. With **random tie-breaking**, one of the tied builders is chosen randomly with equal probability to be the winner. This approach is the most realistic, since we expect the proposer to have no preference between equal tips. With **null tie-breaking**, no winner is chosen.
    * For simplicity, we assume we will use **random tie-breaking**, which will be irrelevant when the probability of ties is zero (this usually happens when the signals are random).

This setup is typically called a **first-price sealed bid auction**. Now, we can make some additional assumptions given our specific context:

* $Txs$ is the full set of possible transactions in the universe.
    * We obviously do not need to consider all aspects of a transaction. It will probably suffice to use the "basic value" of a transaction as a proxy.
* Each auction, we sample $Mem$, a collection of possible transactions, from $Txs$ with some random process.
    * The simplest possible such process we can use would be "draw $m$ transactions independently," where $m$ is the empirical number of transactions per block.
    * A slightly more complicated such process would be a Poisson process that draws a random number of transactions.
    * For more sophisticated analysis (probably only tractable with simulations), we might create a mixture of different distributions. For example, maybe 99% of the time we draw with "non-NFT" parameters and 1% of the time we draw with "NFT" parameters, where "non-NFT" parameters have some default mean value and variance but "NFT" parameters tend to generate more transactions with higher value and variance (encoding MEV potential).
* The signal $s_i$ for each builder $B_i$ is some subset $Mem_i \subseteq Mem$. The builder makes her bid after receiving this signal.
    * Intuitively, this means the signal available to each builder during the round is what transactions they can see in the mempool.
    * Different builders may see different transactions due to latency and/or specialized knowledge.
* After the bidding, the winner $B_j$ is assigned some $Mem_j'$ where $Mem_j \subseteq Mem_j' \subseteq Mem$. The winner then obtains value equal to $v_j(Mem_j')$, where:
    * Each builder has some **block-valuation** function $v_j: \mathcal{P}(Mem)\to \mathbb{R}^+$. It corresponds to the amount of value (including MEV) the builder can extract from a given set of transactions.
    * We also assume that if $T_1\subseteq T_2$ then $v_i(T_1)\le v_i(T_2)$ for all builders $B_i$.

**Problem 1 (simple starting point)**:

- $Mem$ is sampled by independently drawing $m$ transactions from $Txs$. This will be the default assumption for subsequent problems.
- we do a block auction; i.e. $Mem_i = Mem_i'$.
- all builders see all transactions in the mempool; i.e. $Mem_i=Mem$ for all $i$.
- all builders have the same valuation - their block is always worth $v=v(Mem)$.

Any strategy profile in which two builders bid $v$ and everyone else bids at most $v$ is an equilibrium. There are no other pure-strategy equilibria: if the second highest bid is $v-\varepsilon$ for some $\varepsilon > 0$ and the highest bid is $v$, then a better response for the highest bidder is to bid $v-\varepsilon/2$. If instead the highest bid is $v-\varepsilon$, then a better response for any other bidder is to bid $v-\varepsilon/2$. Bidding more than $v$ is no better than bidding $0$. In this setup, we therefore expect rational builders to make a profit of $0$. This is a somewhat degenerate case with several equilibria, because the expected profit is $0$. It is instructive to see this case as a limit of the private-value auctions in Problem 2.

Using Problem 1 as a starting point, we build more examples that capture the nuances of PBS auctions. We show that our framework is still very flexible, and can be used to encode concepts such as builder **skill** (how good they are at extracting MEV from the same transactions), latency, and both block and slot auctions. As usual in game-theoretical analysis, we assume that everyone knows what algorithms everyone else is using. In real life, these players will be simulating everyone else with some proxy distributions anyway, and the analysis gets much more complicated if we do not make this assumption.

### Uncertainty and Independent Private Auctions

Much of the degeneracy of Problem 1 comes from the lack of uncertainty. We add this component in and then make a connection to the well-studied notion of independent private auctions (IPAs).

**Problem 2 (builders see different mempools)**

The main idea is the same as Problem 1, but the builders don't necessarily all see the same transactions -- $Mem_i$ can now be different from $Mem_j$ for different $i$ and $j$. For simplicity, we assume that (a) each transaction has a probability of being seen by each builder that is the same for all builders and known to all builders, and (b) the valuation just depends on the number of transactions the builder can see.

- we do a block auction.
- all builders see every transaction independently with probability $p_{obs}$.
- all builders have the same skill; i.e. $v_i(T)=v(T)$ for all $i$.
- the valuation function is a function of the number of transactions the builder observes: if $|T_1|=|T_2|$ then $v(T_1)=v(T_2)$.

Note that a model where we can roughly divide transactions into "valuable" or "non-valuable" transactions can also be modeled this way, by just ignoring the non-valuable transactions.
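To get a feel for Problem 2's signal structure, here is a quick Monte Carlo sketch under the simple parameterization where a builder's value is the number of transactions it observes; all constants are illustrative. The average second-highest value is a rough proxy for the revenue (and hence the ASP) one would expect in the continuous analogue.

```python
# A quick Monte Carlo look at Problem 2's signal structure, assuming the
# simple parameterization where a builder's value is the number of
# transactions it observes. All constants here are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n_builders, m, p_obs, rounds = 5, 200, 0.9, 10_000

# Each builder independently observes each of the m transactions w.p. p_obs,
# so its value is Binomial(m, p_obs); different builders see different sets.
values = rng.binomial(m, p_obs, size=(rounds, n_builders))

print("mean value per builder:", values.mean())                 # ~ m * p_obs
print("mean highest value   :", values.max(axis=1).mean())      # winner's value
print("mean 2nd-highest     :", np.sort(values, axis=1)[:, -2].mean())
```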
**Problem 3 (builders have some variance in their ability to extract MEV)**

The main idea is again the same as Problem 1, but there is some noise in a builder's ability to extract MEV. In particular, given the same set of transactions in the mempool, each builder has some (identical) distribution of how much value they can extract, and the realized value is *known* to them before bidding. One possible interpretation is that for a particular auction, a builder might happen to have information unavailable to other parties that lets them extract a lot more MEV (for example, knowing that a particular transaction is initiated by a particular real-life entity or corresponds to a real-life event).

- we do a block auction.
- all builders see every transaction; i.e. $Mem_i=Mem$ for all $i$.
- each builder is given some *private* information $s_i$ before the bid.
- as usual with distributions, we assume all builders know the distributions of the $s_i$.
- for each $B_i$, the valuation function is a function of the mempool and $s_i$.
- For example, one specific parameterization might be the following: we draw $s_i$ uniformly from $[0,1]$, and for all $i$, $v_i(Mem) = s_i \cdot V(Mem)$, where $V$ represents the "standard" MEV extractable from $Mem$.

While Problems 2 and 3 are fairly different at first glance, they are actually both special cases of **independent private auctions**, the class of auctions where each participant privately learns the value of the good they are bidding on *to them* and nobody else, though they know the distribution of the value of the good for everyone else, and this distribution is symmetric across all participants. This is a fairly standard subject in auction theory.[^14] In general for these types of auctions, in which $F$ is the cdf of each bidder's value, the unique symmetric equilibrium is for each $B_i$ to bid
\begin{align}
b_i &= \mathbb{E}[\, \max_{j \neq i} v_j \mid (\max_{j \neq i} v_j) < v_i \,] \\
&= \frac{1}{(F(v_i))^{n-1}} \int_0^{v_i} x(n-1)(F(x))^{n-2}f(x) \mathrm{d}x .
\end{align}

**Caveat:** The general equilibrium doesn't exactly apply to Problem 2. We require $F$ to be continuous, and this is not the case in Problem 2, which has a discrete space of possible block values. This leads to complications because ties become nonnegligible events. To illustrate, consider a simpler example where each bidder's value is uniformly drawn from the discrete set $\{0,1\}.$

1. Suppose for contradiction that there exists a pure-strategy equilibrium, and let $\beta_i(v_i)$ denote the equilibrium bid of $B_i$ given a value $v_i$. Clearly, $\beta_i(0)=0$. We will see that there is no best strategy for a value of $1$ by splitting into cases.
2. Case: $\beta_i(1)=1$. This strategy guarantees a payoff of $0$. Bidding $0$ is a better strategy, with an expected payoff of at least $(1/2)^{n-1}(1/n)$ (the probability of everyone else getting a value of $0$ multiplied by the probability of winning the random tie-break).
3. Case: $\beta_i(1)=0$. Bidding $\epsilon>0$ is a better strategy (for small enough $\epsilon$). Bidding $0$ only has a chance of winning when everyone else bids $0$. Bidding $\epsilon$ is the same, but with a guaranteed win in that case (instead of a random tie-break) and with an arbitrarily small decrease in payoff.
4. Case: $0<\beta_i(1)<1$ and $\beta_i(1) = \beta_j(1)$ for some $j \ne i$. A better strategy is to bid $\beta_j(1)+\epsilon$. Doing so allows for an arbitrarily small decrease in payoff in the case that $1 = v_i \ne v_j$, and a significant increase in payoff in the case that $1 = v_i = v_j$.
5. Case: $0<\beta_i(1)<1$ and $\beta_i(1) \ne \beta_j(1)$ for all $j \ne i$. A better strategy is to bid lower. Specifically, if $\beta_i(1)-\delta$ is the largest bid less than $\beta_i(1)$, then decreasing $B_i$'s bid by $\delta-\epsilon$ increases the payoff from winning by $\delta-\epsilon$ without affecting the probability of winning at all.
6. In summary, $\beta_i(1)$ can't be $0$, it can't be $1$, and it can't be strictly between $0$ and $1$. It also obviously can't be above $1$, as that yields a negative payoff. Therefore, this auction has no pure-strategy equilibrium.

The point is that ties make analysis complicated and sometimes don't allow for the existence of pure-strategy equilibria.
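Returning to the continuous setting where the formula does apply, here is a small numerical check of the equilibrium bid for values drawn uniformly from $[0,1]$, where the closed form is $b(v) = \frac{n-1}{n} v$; the quadrature-based helper is purely illustrative.

```python
# A small numerical check of the symmetric equilibrium bid formula above for
# the continuous case F = Uniform[0, 1], where the closed form is
# b(v) = (n - 1) / n * v. Purely illustrative.
import numpy as np
from scipy import integrate

def equilibrium_bid(v, n, F=lambda x: x, f=lambda x: 1.0):
    """b(v) = E[max_{j != i} v_j | max < v] for iid values with cdf F, pdf f."""
    if v == 0:
        return 0.0
    integrand = lambda x: x * (n - 1) * F(x) ** (n - 2) * f(x)
    integral, _ = integrate.quad(integrand, 0.0, v)
    return integral / F(v) ** (n - 1)

n = 4
for v in [0.25, 0.5, 1.0]:
    print(v, equilibrium_bid(v, n), "closed form:", (n - 1) / n * v)
```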
We also aren't that interested in the discreteness of Problem 2 when considering how its results apply to the real world, since there is always some amount of noise, and there are external factors, that blur the values of blocks. We mostly care about the "shape" of Problem 2. It is possible that discreteness leads to interesting and fundamentally different results. However, in the context of Ethereum, it makes most sense to study auctions with continuous values and discrete bids (since the smallest bid is 1 wei), rather than the discrete values and continuous bids of Problem 2.

### Slot Auctions and Affiliated Value Auctions

**Problem 4 (slot auctions)** We go back to the simplicity of Problem 1 but now make it a slot auction. Recall this means builders bid before they make their block. We need some model of the difference between the information sets at the time of bidding and at the time of block publication. For simplicity, we can just model this as a uniform distribution. Here's a natural setup:

- at the time of bidding, all builders see the same information $Mem$ (that is, $Mem_i = Mem$ for all $i$)
- If we want to model the situation in which builders bid for the right to make blocks in future slots, we can set $Mem_i=\varnothing$ for all $i$.
- we do a slot auction. This means some new transactions will be available to the winning builder when they must submit their block. We model this by having $v(Mem_i')$ be distributed uniformly on some interval $[a, b]$. One can think of $a$ as $v(Mem)$, a lower bound on the value of $v(Mem_i')$.

**Problem 5 (uncertain values)** We introduce uncertainty by no longer assuming that the value of blocks is deterministic given a set of observed transactions.

- at the time of bidding, all builders see the same information $Mem$
- we do a block auction.
- the valuation of a block is stochastic; that is, the payoff to the winner $B_i$ is some random variable $v(Mem)$.
- For example, we can assume the winner randomly draws some $s_i \in [0,1]$ and then is rewarded $s_i*V(Mem)$, where $V(Mem)$ is some deterministic value corresponding to the maximum MEV of $Mem$.

The big difference between this and Problem 3 is that in Problem 3, the builder knows her payoff before the bid, whereas in Problem 5 she does not.

The class of **affiliated value auctions / interdependent value auctions** consists of auctions where the value of the good to the bidder is not known until after the bid is complete. This is a much more general class of auctions, with the previously discussed private value auctions and the well-known **common value auctions** (bidders bidding for a common good whose value is the same for everyone but not known to the bidders during the bid) as special cases. The setup of Problem 4 is yet another special case where the good does not have a common value and is independent (in fact, taken at face value, we can call these "independent interdependent value auctions..."), but the math should not be too different from common value auctions.

Analytically, Problem 4 is essentially the same as Problem 1. Builders know the expected value of $v(Mem_i')$ instead of the exact value, but this makes little difference. The expected payoff of a winner who bids $b$ is $\mathbb{E}[v(Mem_i')-b] = \mathbb{E}[v(Mem_i')]-b$. Since $\mathbb{E}[v(Mem_i')]$ is a common constant for all builders, the expected payoff function matches the payoff function from Problem 1. Therefore, maximizing expected payoff in this problem is the same as maximizing payoff in Problem 1.
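As a quick sanity check of this reduction, here is a small Monte Carlo sketch (illustrative only; the interval $[1, 3]$, the number of trials, and the assumption that every builder bids the common expectation are ours, not parameters from our simulations). It simulates the slot auction of Problem 4 when all builders bid $\mathbb{E}[v(Mem_i')]$; the winner's realized profit averages out to roughly zero, matching the zero-profit conclusion of Problem 1.

```python
import numpy as np

rng = np.random.default_rng(0)

a, b = 1.0, 3.0               # assumed support of v(Mem_i') at block-publication time
expected_value = (a + b) / 2  # E[v(Mem_i')], the Problem-1-style common bid
trials = 100_000

# Every builder bids expected_value, so the winning bid equals expected_value.
# The winner's value is only realized after the bid (slot auction).
realized = rng.uniform(a, b, trials)
profit = realized - expected_value

print(f"average winner profit over {trials} auctions: {profit.mean():+.4f} (theory: 0)")
```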
A very similar argument can be made for Problem 5.

### Skill and Asymmetric Distributions

**Problem 6 (asymmetric skills)** We define builder **skill** as their ability to extract MEV from the same transactions.

- we do a block auction.
- all builders see every transaction; i.e. $Mem_i=Mem$ for all $i$.
- each builder $B_i$ is given a *public* skill $s_i$ before the bidding.
- we can, if we want, assume that the $s_i$'s themselves follow distributions. However, as the $s_i$ are public knowledge, the process that generates them is not too important for equilibrium analysis for a single round.
- for each $B_i$, the valuation function is a deterministic function of the input and $s_i$.
- for example, we can assume that $v_i() = s_i*v()$ for some common function $v()$.

Problem 6's analysis starts to feel quite different from that of the previous problems. Consider a simple case where there are 2 bidders, and it is common knowledge that $B_1$'s valuation is $a$ and $B_2$'s is $b$, where $a > b$ and $a, b \in \mathbb{R}$. (Using our setup above, one way to generate this would be to assume $v(Mem) = 1$, $s_1 = a$, $s_2 = b$.)

1. it is clear that $B_2$ would not want to bid more than $b$, as she would lose money.
2. knowing this, it is in $B_1$'s interest to bid slightly above $b$, say $b+\epsilon$ for some small $\epsilon > 0$.
3. $B_1$ will always win this bid, and her payoff will be $a - (b + \epsilon)$, which is essentially $a-b$ in the limit as $\epsilon \rightarrow 0$.
4. In particular, note that $B_1$'s bid does not depend *at all* on her valuation -- she can leverage her dominant valuation to the maximum and expose basically no information in her bid (besides the fact that she is in a stronger position).
5. Generalizing to $n$ players is not too difficult: the player with the highest valuation is incentivized to bid ($\epsilon$ above) the second-highest valuation, and should always win the auction.

Although this toy problem is easy to analyze, the lack of symmetry can make the analysis feel a bit ad-hoc. Indeed, the study of **asymmetric auctions** can get quite difficult, with even studying $n=2$ cases being fairly involved. We are forced to enter this territory if we make our problem even a tiny bit more complicated:

**Problem 7 (asymmetric skills, stochastic payout)** Same as Problem 6, except the valuations are no longer deterministic.

- we do a block auction.
- all builders see every transaction; i.e. $Mem_i = Mem$ for all $i$.
- each builder $B_i$ is given a public skill $s_i$ before the bidding.
- for each $B_i$, the valuation function is a random variable depending on the input and $s_i$.
- for example, we can take each skill $s_i$ to be an interval $[a,b] \subset \mathbb{R}^+$, and define $v_i(Mem)$ to be a random sample from that interval.

One of the simplest instances of this problem is if $B_1$ is assigned a value uniformly in $[a,b]$ and $B_2$ is assigned a value uniformly in $[c,d]$. Even this simple-looking setup requires some significant work[^38]. For an explicit equilibrium formula, see [^37]. The most important qualitative results for us are:

- the two highest bidders matter a lot. The bidder with the "stronger" distribution has a very big advantage.
- many standard mechanism design assumptions fail; for example, we will not be able to obtain a mechanism that is incentive-compatible and symmetric -- in other words, if we really wanted the builders to signal their true valuations, we would be forced to somehow incorporate the knowledge of the difference between the builders' distributions into the mechanism, which makes no sense in the decentralized and anonymous setting of blockchain (and would be considered strongly unfair even in settings that are not) -- cf. the Myerson-Satterthwaite theorem[^39], which is a formalization of the difficulties of asymmetry when there is private information.

Pragmatically, what this means is that if we think differing skill is a huge factor, we should not expect an easy analysis. The best we can do is, perhaps, something like what we are attempting here: extract some qualitative directions and principles, but settle for empirical results and simulations if we want higher resolution.

<!-- * We can encode the difference between block and slot auctions (and also differences in skill/bandwidth at seeing transactions) in this framework. * We can incorporate skill / "higher-bandwidth" builders in the uncommitted-bidding case by having some builders more likely to see more newer transactions. -->

### Auction-Smashing Premium

In the Cost of Censorship section, we noted that an attacker $B_i$ who wants to censor transactions in PBS must get their block selected by the proposer by winning an auction $A$ against other builders, and we related the cost of winning to the expected revenue (expected highest bid) of the auction. Bidding only the expected revenue might be unsatisfactory because that gives only roughly a $0.5$ chance of winning. Instead, we consider the revenue cdf $R_A$. If the attacker is satisfied with a probability $p$ of winning the auction, he should bid an amount $x$ such that $R_A(x) = p$; i.e. he should bid $R_A^{-1}(p)$. The **auction-smashing premium (ASP)** is the amount *extra* that he should bid over his equilibrium bid $b_A(s_i, i)$, so we formally define the ASP as
$$ \mathtt{ASP}_A(p,i) = R_A^{-1}(p) - b_A(s_i, i) . $$

#### Examples

We find the ASP for some of the auction problems described above.

* The auction in Problem 1 (and equivalently Problems 4 and 5) is the easiest example to start with. Suppose that everyone (besides the attacker) bids according to the Nash equilibrium of bidding the constant value $v$. Clearly, the attacker wins with probability $1$ if and only if he bids more than $v$, so the ASP can be thought of as some positive and very small $\epsilon$. Since the attacker can make $\epsilon$ arbitrarily close to $0$, we actually just consider the ASP to be $0$, which is consistent with the formal definition. The revenue is $v$ with probability $1$, so $R_\mathrm{P1}^{-1}(1) = v$. Putting it together, we get $\mathtt{ASP}_\mathrm{P1}(1,i) = R_\mathrm{P1}^{-1}(1) - b_\mathrm{P1} = v - v = 0$.
* We now focus on the symmetric block auction in Problem 3. Let $F$ be the cdf of block valuations. We established that the equilibrium was for each $B_i$ to bid \begin{align*} b(v_i) &= \mathbb{E}[\, \max_{j \neq i} v_j \mid (\max_{j \neq i} v_j) < v_i \,] \\ &= \frac{1}{(F(v_i))^{n-1}} \int_0^{v_i} x(n-1)(F(x))^{n-2}f(x) \mathrm{d}x . \end{align*} The revenue is then the bid of the builder with the highest valuation. Let $G$ be the cdf of the highest order statistic of valuations (excluding the attacker). Valuations are assumed independent, so the probability that the highest valuation is less than $x$ is $G(x) = (F(x))^{n-1}$.
$G$ outputs a probability given a valuation, but we are interested in a valuation given a probability, so we take the inverse. $G^{-1}(p) = F^{-1}(p^\frac{1}{n-1})$ represents a valuation that wins the auction with probability $p$. We can then input that into the equilibrium bid function to get the inverse of the revenue cdf: \begin{align*} R_\mathrm{P3}^{-1}(p) &= b(G^{-1}(p)) \\ &= \mathbb{E}[\, \max_{j \ne n} v_j \mid (\max_{j \ne n} v_j) < F^{-1}(p^\frac{1}{n-1}) \,] \\ &= \frac{1}{(F(F^{-1}(p^\frac{1}{n-1})))^{n-1}} \int_0^{F^{-1}(p^\frac{1}{n-1})} x(n-1)(F(x))^{n-2}f(x) \mathrm{d}x \\ &= \frac{1}{p} \int_0^{F^{-1}(p^\frac{1}{n-1})} x(n-1)(F(x))^{n-2}f(x) \mathrm{d}x . \end{align*} Putting it together, we get \begin{align*} \mathtt{ASP}_\mathrm{P3}(p,i) &= R_\mathrm{P3}^{-1}(p) - b(v_i) \\ &= \mathbb{E}[\, \max_{j \ne n} v_j \mid (\max_{j \ne n} v_j) < F^{-1}(p^\frac{1}{n-1}) \,] - \mathbb{E}[\, \max_{j \neq i} v_j \mid (\max_{j \neq i} v_j) < v_i \,] \\ &= \frac{1}{p} \int_0^{F^{-1}(p^\frac{1}{n-1})} x(n-1)(F(x))^{n-2}f(x) \mathrm{d}x \\ & \quad - \frac{1}{(F(v_i))^{n-1}} \int_0^{v_i} x(n-1)(F(x))^{n-2}f(x) \mathrm{d}x . \end{align*}
* Let's look at a specific instance of Problem 3. Suppose each builder's valuation is uniformly drawn from the interval $[0,1]$. We should expect the ASP to be somewhere around $1/2$ on average when the auction has a large number of builders and when the satisfactory probability $p$ of winning is large. This is because in a competitive auction we expect each builder's bid to be close to their valuation (which is $1/2$ on average), and we expect the revenue to be close to $1$. We now proceed with the computation. The valuation cdf is $F(x)=x$ and its inverse is $F^{-1}(p)=p$. The ASP is \begin{align*} \mathtt{ASP}_\mathrm{P3}(p,i) &= R_\mathrm{P3}^{-1}(p) - b(v_i) \\ &= \frac{1}{p} \int_0^{p^\frac{1}{n-1}} x(n-1)x^{n-2} \mathrm{d}x - \frac{1}{v_i^{n-1}} \int_0^{v_i} x(n-1)x^{n-2} \mathrm{d}x \\ &= \frac{n-1}{p} \int_0^{p^\frac{1}{n-1}} x^{n-1} \mathrm{d}x - \frac{n-1}{v_i^{n-1}} \int_0^{v_i} x^{n-1} \mathrm{d}x \\ &= \frac{n-1}{p} \cdot \frac{1}{n} \cdot p^\frac{n}{n-1} - \frac{n-1}{v_i^{n-1}} \cdot \frac{1}{n} \cdot v_i^n \\ &= \frac{n-1}{n} \cdot (p^\frac{1}{n-1} - v_i) . \end{align*} If we set $v_i=1/2$ (the average valuation) and choose large $n$ and $p$, we can see that the ASP is around $1/2$, as expected.

#### Simulations

For more complicated auctions, it can be hard to analytically compute the ASP, so we performed some preliminary simulations on the symmetric block auction from Problem 3. We choose agent counts in $\{5, 10, 15, 20\}$ and use the valuation distributions $\mathrm{Uniform}([0.5, 1.5])$ and $\mathrm{Gamma}(2, 0.5)$. Agents bid according to the equilibrium for Problem 3, computed numerically. We run 2000 simulations per choice of distribution and agent count and record the average revenue, the average bid, and the 95th percentile of the ASP.

| Distribution | Agents | Avg. Revenue | Avg. Bid | ASP (95th pct.) |
| ------------ | ------ | ------------ | -------- | --------------- |
| Uniform | 5 | 1.163 | 0.899 | 0.390 |
| Uniform | 10 | 1.320 | 0.949 | 0.445 |
| Uniform | 15 | 1.376 | 0.968 | 0.462 |
| Uniform | 20 | 1.403 | 0.973 | 0.474 |
| Gamma | 5 | 1.252 | 0.742 | 0.950 |
| Gamma | 10 | 1.693 | 0.839 | 1.318 |
| Gamma | 15 | 1.940 | 0.884 | 1.531 |
| Gamma | 20 | 2.126 | 0.908 | 1.686 |

There are a few things to note:

- the bids, revenue, and ASP all increase as the number of builders increases.
Intuitively, bids and revenue should increase as the auction gets more competitive, but it's less obvious that the ASP should increase. - for the uniform distribution, as the number of builders increases, the average revenue gets closer to the maximum possible valuation, and the average bid gets closer to the mean valuation. - the ASP increases much more dramatically as the number of builders increases with the gamma distribution than with the uniform distribution. This tells us that the skewness of the valuation distribution likely has a large effect on the ASP. ### Winner's Curse The **winner's curse** [^14] is a phenomenon in certain auctions where, vaguely speaking, winning an auction causes disappointment for the winner in some way. In our framework, we define $\mathrm{WC}$ to be the event, with respect to a bidder $B_i$ and signal $s_i$, that $$\mathbb{E}[v_i | s_i] > \mathbb{E}[v_i | s_i \cap (B_i \, \mathrm{wins})].$$ As a motivating example, consider a first-price common-value auction in which the value $v$ is the average of everyone's (private and i.i.d.) signals. In this case $\mathbb{E}[v]$ is the mean of the signal distribution. A bidder $B_i$ with a relatively low signal $s_i \ll \mathbb{E}[v]$ still expects the value to be somewhere around $\mathbb{E}[v]$ and bids accordingly. However, if it is revealed that $B_i$ wins the auction, then she learns that the value must be much less than previously expected, potentially meaning her payoff is negative. Specifically, the value is at most $s_i$ because anyone with a higher signal than $s_i$ would've perceived the value to be higher than $B_i$ did and thus would've bid higher than her, under the assumption that bids are symmetric and increase with value. Note the following: - Winner's curse is usually discussed in the context of common-value auctions, in which each $v_i$ is equal, but we do not make this assumption. - From now on, winner's curse is always w.r.t. a fixed bidder $B_i$. - We restrict our attention to first-price auctions. - We assume a bidder wins if and only if they have the highest signal, as is usually the case in equilibrium (ignoring ties for simplicity). - We assume the distributions of signals are public (since $B_i$ can't form expectations of other signals otherwise). Whether or not winner's curse occurs might depend on which $s_i$ is drawn. Simply plugging in the definition, $\Pr(\mathrm{WC}) = \Pr(\mathbb{E}[v_i | s_i] > \mathbb{E}[v_i | s_i \cap (B_i \, \mathrm{wins})])$. Note that $\Pr(\mathrm{WC})$ does not involve the probability of winning; it represents the probability of the curse occurring *if* $B_i$ wins. A necessary condition for winner's curse is that $B_i$'s expectations of at least one other bidder's signal drop upon receiving info that her signal is the highest. Call this **signal decline** (abbreviated $\mathrm{SD}$). The necessity of signal decline is straightforward to see: the only alternative to signal decline is that $B_i$'s expectations of other signals remain the same, since intuitively none of her signal expectations can increase after knowing that hers is the highest. If her expectations stay the same, then $\mathbb{E}[v_i | s_i] = \mathbb{E}[v_i | s_i \cap (B_i \, \mathrm{wins})]$, meaning no winner's curse. Given a signal $s_i$, signal decline occurs if and only if there is a nonzero chance that $s_j \ge s_i$ for some $j \ne i$. Like with winner's curse, we can describe the probability of signal decline. $\Pr(\mathrm{SD}) = \Pr(\exists j \ne i, \, \Pr(s_j \ge s_i) > 0)$. 
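To make the definition concrete, here is a rough Monte Carlo sketch (the parameters -- 5 bidders, i.i.d. $\mathrm{Uniform}([0,1])$ signals, a common value equal to the average of all signals, and "winning" identified with holding the strictly highest signal -- are illustrative assumptions, not part of our framework). It estimates $\mathbb{E}[v_i \mid s_i]$ and $\mathbb{E}[v_i \mid s_i \cap (B_i \, \mathrm{wins})]$ for a fixed below-average signal; the drop in the conditional expectation once we condition on winning is exactly the event $\mathrm{WC}$.

```python
import numpy as np

rng = np.random.default_rng(1)

n = 5            # number of bidders (assumption)
s_i = 0.4        # B_i's fixed signal, below the prior mean of 0.5 (assumption)
trials = 200_000

# Signals of the other n-1 bidders; the common value is the average of all n signals.
others = rng.uniform(0.0, 1.0, size=(trials, n - 1))
value = (s_i + others.sum(axis=1)) / n

# Model "B_i wins" as s_i being strictly the highest signal.
wins = others.max(axis=1) < s_i

print(f"E[v_i | s_i]           ~ {value.mean():.3f}")
print(f"E[v_i | s_i, B_i wins] ~ {value[wins].mean():.3f}")
print(f"P(B_i wins)            ~ {wins.mean():.3f}")
```

With these particular numbers the unconditional expectation is about $0.48$ while the conditional expectation drops to about $0.24$, which is why a bidder who naively bids near $\mathbb{E}[v_i \mid s_i]$ regrets winning.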
From the definition of winner's curse, we can see that winner's curse requires that $v_i$ depends on the signals of other bidders. There are many ways for $v_i$ to do so, but considering that winner's curse requires expectations of both the signals and $v_i$ to drop, we should expect a strong positive correlation between $v_i$ and the signals to correspond to a high probability of winner's curse. In particular, if $\mathbb{E}[v_i|s_1,\dots,s_k,\dots,s_n] > \mathbb{E}[v_i|s_1,\dots,s_k-\delta,\dots,s_n]$ for all $k \ne i$ and $\delta > 0$ (call this condition **strictly-increasing expected value**), then combined with signal decline we get winner's curse. With strictly-increasing expected value, winner's curse happens precisely when signal decline happens. Note that the example of winner's curse at the beginning of this section describes an auction that satisfies both signal decline (with some probability depending on the signal distribution) and strictly-increasing expected value. This is definitely a stronger condition than necessary for winner's curse, but it's useful for intuition.

None of the auction theory problems listed in the sections above exhibit winner's curse (i.e. $\Pr(\mathrm{WC})=0$) because, even though many of them have signal decline, $v_i$ has nothing to do with the signals of other bidders. The way we framed auctions so far may give the impression that actual PBS auctions should have independent valuations as well. Each builder has their own block, and it's hard to imagine how the value of one block can impact the value of another. However, we claim that it is possible to have dependent valuations in PBS auctions. A slight shift in perspective helps: instead of associating $v_i$ with the value of $B_i$'s block, we associate $v_i$ with the value of space on the blockchain to $B_i$. From this perspective, it's easier to imagine dependent valuations; e.g. if $B_i$ obtains value from generating NFTs, $v_i$ might depend on how others perceive the value of the NFTs, which affects how valuable they perceive blockchain space to be. It is of course still possible that values are independent; e.g. if $B_i$ obtains value through arbitrage, then she knows her exact profits during block creation.

In reality, the value of blockspace to builders is probably partly independent and partly dependent. A simplistic approach to modeling this is to assume the final valuation of $B_i$ is
$$ v_i = \alpha s_i + (1 - \alpha) \sum_j \frac{s_j}{n} , $$
where $s_j$ is the pre-bid valuation of $B_j$, and $\alpha$ is a parameter controlling how independent or dependent the final valuation is. For example, $v_i$ is totally independent if $\alpha = 1$ and totally dependent if $\alpha = 0$.

#### Simulation

Using the simple model described above, we do some very preliminary simulations demonstrating the effect of changing the $\alpha$ parameter on profits under naive bidding strategies. We choose agent counts in $\{5, 10, 15, 20\}$, choose $\alpha \in \{1.00, 0.66, 0.33, 0.00\}$, and use the valuation distributions $\mathrm{Uniform}([0.5, 1.5])$ and $\mathrm{Gamma}(2, 0.5)$. Agents bid according to the equilibrium described in Problem 3, but each agent (wrongly) assumes that valuations are always completely independent (i.e. that $\alpha = 1$). We run 1000 simulations per choice of parameters and record the average profits of the winners.

The following table uses $\mathrm{Uniform}([0.5, 1.5])$ as the valuation distribution.
| $\alpha$ | Agents: 5 | Agents: 10 | Agents: 15 | Agents: 20 |
| -------- | --------- | ---------- | ---------- | ---------- |
| $1.00$ | 0.166 | 0.090 | 0.062 | 0.047 |
| $0.66$ | 0.053 | -0.049 | -0.086 | -0.107 |
| $0.33$ | -0.055 | -0.180 | -0.230 | -0.252 |
| $0.00$ | -0.164 | -0.321 | -0.376 | -0.405 |

The following table uses $\mathrm{Gamma}(2, 0.5)$ as the valuation distribution.

| $\alpha$ | Agents: 5 | Agents: 10 | Agents: 15 | Agents: 20 |
| -------- | --------- | ---------- | ---------- | ---------- |
| $1.00$ | 0.637 | 0.617 | 0.600 | 0.579 |
| $0.66$ | 0.363 | 0.176 | 0.099 | 0.016 |
| $0.33$ | 0.058 | -0.261 | -0.433 | -0.562 |
| $0.00$ | -0.244 | -0.693 | -0.945 | -1.113 |

Clearly, under this naive strategy profile, profits decrease as valuations become more dependent (i.e. as $\alpha$ decreases).

### Contrasting User and Builder Auctions

There has been some research on transaction markets (modelling users trying to get their transactions included in a block as bidders in a market). These "user auctions" have a few differences from our "builder auctions."

1. User auctions are **combinatorial** -- they must allocate a finite space to many transactions at the same time, whereas builder auctions sell one good at a time. In this sense builder auctions are simpler.
2. Users must pay a basefee in addition to tips. In auction theory terms, user auctions have a **reserve price**.
3. We expect users to be more risk-averse than builders. This says more about the role of builders than users, as people are naturally risk-averse. Builders likely try to make profit in the long term, and they run algorithms (which are risk-neutral) optimized to do so.
4. In contrast to slot auctions, users typically know the subjective value of their transactions at the time of bidding. There is, however, still some room for uncertainty in user auctions from time-sensitive transactions and the possibility of being excluded from a block.

<!-- Since there are many more users than builders, there are subtle differences in dynamics of how we would model them. -->

### Summary

1. In equilibrium, a first-price auction can be profitable for some builder only if there are differences in how each builder perceives the value of their block. If we want builders to be able to profit, we need an auction that allows for differing signals. The most natural way is to have a source of randomness in block values: MEV, tips, transaction visibility, etc.
2. Builders are assumed to be risk-neutral, so for each builder, having an expected block value of $v$ is the same as having an actual block value of $v$. This, however, does not mean we can remove block value distributions from all our auction descriptions, since builders don't always have the same information and therefore don't always form the same expectations of block values. E.g. $B_i$'s expectation of $B_j$'s block value is likely different from $B_j$'s expectation of his own block value because $B_j$ has more information about it.
3. For first-price auctions, winner's curse occurs for a builder $B_i$ when (1) she knows the probability distributions of everyone's signals, (2) the probability that her signal is strictly the highest is less than $1$, and (3) her value $v_i$ is positively correlated with the signals of other builders.
4. Overall, this is a complex problem, and analytical approaches should be done after making simplifying assumptions about the context (e.g. whether we can assume that people would basically end up seeing the same mempools).

# Appendix

## A.1. PBS's approach to MEV and Comparison with Alex
It is useful to compare PBS with another design, the **Alex protocol**[^22]. The big-picture objectives are similar (adding decentralization) but there are some fundamental differences between Alex and PBS. In particular, Alex's goal is to hit *zero* MEV.

Similar to how PBS adds a new "block market" layer to the stack, Alex adds a new layer as well, the "content layer." In this setup,

- "pickers" act somewhat like builders by providing views of the mempool
- "printers" take the central role, most similar to proposers in PBS or to miners
- "shufflers" are there to provide randomness to remove MEV.

The main idea is that if there is shuffling of the transactions, then transaction reordering and similar MEV extraction become much harder (or, under a more pessimistic lens, might be moved to a secondary market). In particular, frontrunning, backrunning, sandwiching, etc. all become more difficult[^25].

Philosophically, there is a big divide between Alex and PBS, even though they are both roughly in the direction of decentralization. Alex aims to diminish MEV, while PBS embraces MEV (cf. the name of MEV-Boost). If MEV is considered a fundamental problem, then it might be useful to remove MEV. However, if MEV is considered a "necessary evil" or even a useful thing,[^23] then it might be more robust to embrace it as a feature (this issue is somewhat analogous to corruption that might have positive benefits on the economy [^24]). For example, we might want to use MEV as an incentive for miners/validators to provide participation or security. In this light, PBS can be seen as trying to "optimize" MEV for the combined perspective of builders and proposers. Finally, there are plenty of people who would take a "middle ground" stance that it would be hard / unreasonable to destroy MEV completely, but "MEV smoothing" (evening out the variance of the MEV distribution) would be good for overall power dynamics and user experience[^6].

While the difference is quite deep, we think Alex is worth mentioning since some of its techniques might be useful for future work in this area of mechanism design -- we may want blockchains with different tolerances towards MEV.

# References

[^1]: Alchemy. "What is MEV Boost?" August, 2022. Retrieved from: https://www.alchemy.com/overviews/mev-boost
[^2]: B. Vitalik. "Proposer/block builder separation-friendly fee market designs." June, 2021. Retrieved from: https://ethresear.ch/t/proposer-block-builder-separation-friendly-fee-market-designs/9725.
[^3]: 0xa9aE. "FAQ: MEV and mev-boost (for node runners)." July, 2022. Retrieved from: https://mirror.xyz/ladislaus.eth/XY00FiQBunZss_SddZ7rSfYocrONdEJkM0o2NZG9Tf4.
[^4]: P. Daian et al., "Flash Boys 2.0: Frontrunning in Decentralized Exchanges, Miner Extractable Value, and Consensus Instability," 2020 IEEE Symposium on Security and Privacy (SP), 2020, pp. 910-927, doi: 10.1109/SP40000.2020.00040. Retrieved from: https://ieeexplore.ieee.org/document/9152675.
[^5]: B. Simon. "MEV Driven Centralization in Ethereum: Part 1." July, 2022. Retrieved from: https://simbro.medium.com/mev-driven-centralization-in-ethereum-ec829a214f18.
[^6]: fradamt. "Committee-driven MEV smoothing." August, 2021. Retrieved from: https://ethresear.ch/t/committee-driven-mev-smoothing/10408.
[^7]: H. Luke. "Ethereum may now be more vulnerable to censorship — Blockchain analyst." September, 2022. Retrieved from: https://cointelegraph.com/news/ethereum-may-now-be-more-vulnerable-to-censorship-blockchain-analyst
[^8]: B. Monnot.
"Unbundling PBS: Towards protocol-enforced proposer commitments (PEPC)." October, 2022. Retrieved from: https://ethresear.ch/t/unbundling-pbs-towards-protocol-enforced-proposer-commitments-pepc/13879?u=barnabe [^9]: V. Buterin. "Single-slot PBS using attesters as distributed availability oracle." January, 2022. Retrieved from: https://ethresear.ch/t/single-slot-pbs-using-attesters-as-distributed-availability-oracle/11877 [^10]:V. Buterin. "Two-slot proposer/builder separation." October, 2021. Retrieved from: https://ethresear.ch/t/two-slot-proposer-builder-separation/10980 [^11]: Quintus. "Censorship Resistance: crlists in mev-boost." July, 2022. Retrieved from: https://github.com/flashbots/mev-boost/issues/215 [^12]:V. Buterin. "State of research: increasing censorship resistance of transactions under proposer/builder separation (PBS)." 2022. Retrieved from: https://notes.ethereum.org/@vbuterin/pbs_censorship_resistance [^13]:ox0083. "The litList (crList) Builder." October, 2022. Retrieved from: https://mirror.xyz/apriori.eth/Ow6EeeGXQ-6R1beflaO5ez6UOHx6KeJCZFVKTxiflMg [^14]:V. Krishna (2009). *Auction Theory*. Academic Press. [^15]:B. Monnot. "Updates on Proposer-Builder Separation." October, 2022. Retrieved from: https://www.youtube.com/watch?v=sQQ2UYB3qOI [^16]:T. Roughgarden. "Transaction Fee Mechanism Design for the Ethereum Blockchain: An Economic Analysis of EIP-1559." December, 2020. Retrieved from: https://timroughgarden.org/papers/eip1559.pdf [^17]:thegostep."MEV-Boost: Merge ready Flashbots Architecture." November, 2021. Retrieved from: https://ethresear.ch/t/mev-boost-merge-ready-flashbots-architecture/11177 [^18]:O. Alex. "Flashbots: Frontrunning the MEV Crisis." November, 2023.Retrieved from: https://medium.com/flashbots/frontrunning-the-mev-crisis-40629a613752 [^19]:Ether World. "MEV Part 1: Front Running, Back Running & Sandwich Trading." July, 2022. Retrieved from: https://www.youtube.com/watch?v=NSHRfpUSqCM [^20]:D. Bergemann, B. Brooks, and S. Morris, “Countering the winner’s curse: Optimal auction design in a common value model,” vol. 15, no. 4, pp. 1399–1434, November, 2020, Retrieved from:https://econtheory.org/ojs/index.php/te/article/viewFile/3797/29107/822. [^21]:V. Buterin. "How much can we constrain builders without bringing back heavy burdens to proposers?" October, 2022. Retrieved from: https://ethresear.ch/t/how-much-can-we-constrain-builders-without-bringing-back-heavy-burdens-to-proposers/13808 [^22]:pmcgoohan. "Targeting Zero MEV - A Content Layer Solution." April, 2021. Retrieved form: https://ethresear.ch/t/targeting-zero-mev-a-content-layer-solution/9224 [^23]:P. Daian. "MEV… wat do?" April, 2021. Retrieved from: https://pdaian.com/blog/mev-wat-do/ [^24]:J. C. Heckelman and B. Powell, “Corruption and the Institutional Environment for Growth,” vol. 52, no. 3, pp. 351–378, Sep. 2010, doi: 10.1057/ces.2010.14. Retrieved from: https://core.ac.uk/download/pdf/7359094.pdf [^25]:pmcgoohan. "Targeting Zero MEV - A Content Layer Solution." April, 2021. Retrieved from : https://github.com/pmcgoohan/targeting-zero-mev/blob/d550cbd9e7d5fd84ca719ae783add79f10906280/content-layer.md [^26]:C. Donovan . "A Beginner’s Guide to Ethereum Censorship." October, 2022. Retrieved from: https://newsletter.banklesshq.com/p/ethereum-censored-flashbots-centralization-lido [^27]:fradamt. "PBS censorship-resistance alternatives." October, 2022. Retrieved from: https://notes.ethereum.org/@fradamt/H1TsYRfJc [^28]: A. Aslam, Y. Teoh, Y. Zhang. SJSU CAMCOS Repository (2022). 
Github Repository. https://github.com/krzhang/camcos/blob/main/data/TxF22.csv
[^29]: P. Fader and B. Hardie. "The Gamma-Gamma Model of Monetary Value", 2013. Retrieved from: http://www.brucehardie.com/notes/025/gamma_gamma.pdf.
[^30]: T. Diamandis, A. Evans, T. Chitra, and G. Angeris. "Dynamic Pricing for Non-fungible Resources: Designing Multidimensional Blockchain Fee Markets", 2022. Retrieved from: https://arxiv.org/pdf/2208.07919.pdf.
[^31]: V. Buterin. "Multidimensional EIP 1559", 2022. Retrieved from: https://ethresear.ch/t/multidimensional-eip-1559/11651.
[^32]: D. Perez and B. Livshits. "Broken metre: Attacking resource metering in EVM". In: arXiv preprint arXiv:1909.07220 (2019).
[^33]: F. Leal, A. Chis, and H. González–Vélez. "Performance Evaluation of Private Ethereum Networks." SN COMPUT. SCI. vol. 1:285, 2020.
[^34]: V. Berger and Y. Zhou. "Kolmogorov-Smirnov test: Overview." Wiley StatsRef: Statistics Reference Online, 2014.
[^35]: A. Adhikari and J. Schaffer. "Modified Lilliefors Test." J Mod Appl Stat Methods, vol. 14:1, pp. 53-69, 2015.
[^36]: G. Babu and C. Rao. "Goodness-of-fit tests when parameters are estimated." Sankhya, vol. 66, pp. 63–74, 2004.
[^37]: T. Kaplan and S. Zamir. "Asymmetric first-price auctions with uniform distributions: analytic solutions to the general case." 2012.
[^38]: E. Maskin and J. Riley. "Asymmetric Auctions." 1998. Retrieved from: https://scholar.harvard.edu/maskin/files/asymmetric_auctions.pdf
[^39]: Wikipedia. "Myerson-Satterthwaite Theorem." *Wikipedia*. Retrieved from: https://en.wikipedia.org/wiki/Myerson%E2%80%93Satterthwaite_theorem.
[^40]: W. Güth et al. "Bidding behavior in asymmetric auctions: An experimental study." 2001. Retrieved from: https://www.econstor.eu/bitstream/10419/62694/1/72394704X.pdf
[^41]: A. Aslam, Y. Teoh, Y. Zhang. SJSU CAMCOS Repository (2022). Github Repository. https://github.com/krzhang/camcos
[^42]: CAMCOS Fall 2021 Report. Retrieved from: https://www.dropbox.com/sh/ag4e5qp4zv5whwv/AADG-NDRslYmJJzqscJxPw-ka?dl=0&preview=CAMCOS_F21.pdf
[^43]: A. Dietrichs. "EIP-4488 Mining Strategy & Stipend Analysis." 2021. Retrieved from: https://hackmd.io/@adietrichs/4488-mining-analysis
[^44]: P. Virtanen et al. scipy.stats.gamma Package. SciPy 1.9.3: Fundamental Algorithms for Scientific Computing in Python. Nature Methods, 17, pp. 261–272. 2022.
[^45]: J. Peacock. "Two-dimensional goodness-of-fit testing in astronomy." MNRAS, vol. 202, pp. 615–627. 1983.
[^46]: P. McCorry. "MEV-boost and layer-1's censorship resistance." November, 2022. Retrieved from: https://stonecoldpat.substack.com/p/mev-boost
[^47]: Flashbots. "Censorship Résistance & PBS." September, 2022. Retrieved from: https://www.youtube.com/watch?v=XZJcZ05d-Wo
[^48]: Protos. "Ethereum censorship: 24% of blocks reject transactions." September, 2022. Retrieved from: https://protos.com/ethereum-censorship-24-of-blocks-reject-transactions/