# ++Summary Note++: Empirical Analysis of EIP-1559

Article: [Empirical Analysis of EIP-1559: Transaction Fees, Waiting Times, and Consensus Security](https://arxiv.org/pdf/2201.05574.pdf)

This article was shared by many people when it first came out last year. I finally had a chance to read it thoroughly and understand the research problem in more detail. I highlight several key components of the article below.

### Main Research Problem

The main research problem is to study the effect of EIP-1559 on several important aspects of the Ethereum blockchain.

### Research Questions

The article poses three research questions:

- _What is the impact of EIP-1559 on transaction fee dynamics?_
- _What is the impact of EIP-1559 on user experience in terms of transaction waiting times?_
- _What is the impact of EIP-1559 on consensus security?_

For each dimension, the authors look at one or more specific metrics. For instance, for consensus security, they evaluate the impact on fork rates, network load, and MEV.

### Methodology

The main methodology is [**Regression Discontinuity Design (RDD)**](https://en.wikipedia.org/wiki/Regression_discontinuity_design). RDD is a quasi-experimental evaluation method used widely in different fields to evaluate the impact of an event. It estimates the effect by comparing observations just before and just after a cutoff, which here is the activation of EIP-1559. If you want a quick start on this approach, you can check out the following two papers:

- [The State of Applied Econometrics: Causality and Policy Evaluation](https://www.aeaweb.org/articles?id=10.1257/jep.31.2.3)
- [Regression Discontinuity Designs in Economics](https://www.aeaweb.org/articles?id=10.1257/jel.48.2.281)

### Data and Data Sources

They collected three sets of data, corresponding to the three research questions.

- **Transaction fee data**. They collected block-level and transaction-level data using [Google BigQuery](https://cloud.google.com/bigquery).
- **Mempool data** for the waiting time calculation. The authors ran four Ethereum full nodes located in North Carolina, Los Angeles, Montreal, and Helsinki. _My question :question: here_: How does geographically distributing the nodes around the globe help collect more accurate Ethereum mempool data?
- **ETH price data**. This data was queried from the Bloomberg Terminal at one-minute granularity.
- **Miner revenue**. The Flashbots API was used to collect MEV data from the blockchain.

To my surprise, Google BigQuery is the source for the block-level and transaction-level data. I was not aware of this source of blockchain and transaction data before.

### Findings

Their findings are as follows:

- They show that the intra-block variation in gas prices decreases significantly as EIP-1559 adoption increases, which results in easier fee estimation and a better user experience.
- The analysis shows that EIP-1559 significantly reduces users' waiting times and latency.
- They verify that a larger block size increases the presence of uncle blocks.

### My Own Reflection

I have several reflections after reading this article.

- The method used in this article, RDD, can be applied to examine the impact of the Ethereum Merge on a broad range of blockchain aspects, such as transaction fees, stakers' revenue, and user experience, to name a few.
- The main goal is to conduct a causal analysis. A different approach that is also designed to examine causal impacts is _graphical causal models_, which are based on [directed acyclic graphs (DAGs)](https://en.wikipedia.org/wiki/Directed_acyclic_graph). I am wondering whether DAGs can be applied to tackle similar types of problems.
- The authors' solution for measuring transaction waiting time is clever. They use the difference between the timestamp at which a transaction first appears in the mempool and the timestamp of the next block, rather than the timestamp of the current block (the one that includes the transaction). Compared with using the current block's timestamp, this dramatically reduces the proportion of transactions with negative waiting times.
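
To make the waiting-time idea concrete, here is a minimal sketch of the calculation as I understand it from the description above. It is not the authors' code: the `Tx` record, the field names, and the toy timestamps are all hypothetical, and I assume "next block" means the block immediately after the one that includes the transaction.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Tx:
    tx_hash: str        # hypothetical identifier
    first_seen: float   # Unix time the tx was first observed in the mempool
    block_number: int   # number of the block that included the tx


def waiting_time(tx: Tx, block_timestamps: Dict[int, float],
                 use_next_block: bool = True) -> Optional[float]:
    """Seconds between the first mempool sighting and the reference block timestamp."""
    ref_block = tx.block_number + 1 if use_next_block else tx.block_number
    ts = block_timestamps.get(ref_block)
    if ts is None:
        return None  # reference block not in our data set
    return ts - tx.first_seen


def negative_share(txs: List[Tx], block_timestamps: Dict[int, float],
                   use_next_block: bool) -> float:
    """Fraction of transactions whose computed waiting time is negative."""
    waits = [waiting_time(tx, block_timestamps, use_next_block) for tx in txs]
    waits = [w for w in waits if w is not None]
    if not waits:
        return 0.0
    return sum(w < 0 for w in waits) / len(waits)


# Toy data (made up): block number -> block timestamp
block_timestamps = {100: 1000.0, 101: 1013.0, 102: 1025.0}
txs = [
    Tx("0xaa", first_seen=995.0, block_number=100),
    Tx("0xbb", first_seen=1004.0, block_number=100),  # seen after block 100's timestamp
    Tx("0xcc", first_seen=1010.0, block_number=101),
]

share_current = negative_share(txs, block_timestamps, use_next_block=False)
share_next = negative_share(txs, block_timestamps, use_next_block=True)
print(f"negative share, current block: {share_current:.2f}")
print(f"negative share, next block:    {share_next:.2f}")
```

In this toy data, transaction `0xbb` is observed after the timestamp of the block that includes it, so the current-block definition gives it a negative waiting time, while the next-block definition does not.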
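
For readers new to RDD, the following is a bare-bones sketch of a sharp regression discontinuity estimate, written only to illustrate the idea from the Methodology section. It is not the specification used in the article (which would typically include controls and a careful bandwidth choice). The function names, the bandwidth, and the synthetic data are my own assumptions. The running variable is the block number centered at the cutoff, and the estimate is the gap between two separately fitted lines at the cutoff.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def rdd_jump(running: Sequence[float], outcome: Sequence[float],
             cutoff: float = 0.0, bandwidth: float = 50.0) -> float:
    """Sharp RDD estimate: difference of the two fitted lines at the cutoff."""
    left = [(r - cutoff, y) for r, y in zip(running, outcome)
            if cutoff - bandwidth <= r < cutoff]
    right = [(r - cutoff, y) for r, y in zip(running, outcome)
             if cutoff <= r <= cutoff + bandwidth]
    # After centering, each intercept is the fitted value at the cutoff.
    a_left, _ = fit_line([x for x, _ in left], [y for _, y in left])
    a_right, _ = fit_line([x for x, _ in right], [y for _, y in right])
    return a_right - a_left


# Synthetic, noise-free data: a linear trend plus a jump of 5.0 at the cutoff.
running = list(range(-100, 100))  # block number relative to the cutoff
outcome = [2.0 + 0.1 * r + (5.0 if r >= 0 else 0.0) for r in running]

jump = rdd_jump(running, outcome, cutoff=0.0, bandwidth=50.0)
print(f"estimated jump at the cutoff: {jump:.3f}")
```

Because the synthetic data has no noise, the estimate recovers the built-in jump. With real block-level data, the result would depend on the bandwidth and on what else changes around the cutoff.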