# EIP-4844: Insights By the Data

# TL;DR

* The chain appears stable after 4844, with blocks propagating well to different sides of the network
* Block propagation times peak at 5 blobs rather than 6; this is not what one would expect if all other things are held constant
* This is not the result of timing games by block builders, based on analysis of transaction fees, MEV, and the source of blocks (builders vs. solo stakers)
* For reasons not yet understood, blocks with 6 blobs appear to be less full than other blocks, which likely accounts for the difference in propagation time
* Layer 2s use a variety of strategies for posting blobs to Ethereum

## Introduction

On March 13, 2024, Ethereum's Dencun hard fork went live, marking a significant step forward in Ethereum's rollup-centric roadmap and providing massive scalability improvements for Ethereum Layer 2s. The most important change in Dencun was [EIP-4844](https://eips.ethereum.org/EIPS/eip-4844), which introduces a new kind of transaction that carries "blobs" of data. These blobs are persisted and served by Ethereum consensus nodes for a short period of time. This gives Layer 2s about 100x more data availability than calldata. As a result, there are now multiple Layer 2s with sub-cent transaction fees.

It has now been a little over three months since Dencun went live, and with a decent amount of data available there are several interesting insights to observe.

## Data Sources & Code

The majority of the data was taken from [ethpandaops/xatu](https://github.com/ethpandaops/xatu). Other sources included various [public MEV relay endpoints](https://www.relayscan.io) for MEV data, as well as a locally synced Execution Engine and Beacon Node, used to gather the full details of transactions and to link beacon block roots to execution payloads (Xatu was missing this for some reason).
All of the code for gathering this data and generating the plots and analysis is hosted in [my GitHub repository](https://github.com/yijing-eth/4844_insights).

### Blobs, Network Propagation, and Chain Stability

The simplest way to measure the impact of 4844 on chain stability was to investigate how the hard fork affected re-orgs:

![image](https://hackmd.io/_uploads/H1GcsU5BA.png)

Though there was an initial uptick in re-orgs, these were due to implementation bugs that were patched shortly after the fork. Since these patches were released, re-orgs appear to be on par with their pre-4844 levels.

Continuing with re-orgs, we wanted to see whether there was a measurable difference across various regions of the world. Internet latency and bandwidth limitations vary widely between geographical regions, and slower regions will be the first to experience instability if resource demands are increased. Xatu contains data for all of the following regions:

```
Finland(FL), Australia(AU), Netherlands(NL), United States(US)
```

Below is a plot of instances of re-orgs across these regions, separated by client:

![image](https://hackmd.io/_uploads/SyFUyIqHA.png)

There appear to be no significant differences in the number of re-orgs between regions, which is good. It suggests the current blob limits are within the capabilities of slower regions.

The next goal was to better understand how blobs propagate through the network. If core developers have better visibility into how the chain is performing under the current blob count limits, they will have a better understanding of how far they can increase these limits in future hard forks.

Before diving into these results, some background is necessary. Blocks and blobs are propagated independently on the network. Though a block and its corresponding blobs share a common origin, they may take different paths to arrive at various nodes in the network.
Ethereum consensus nodes do not consider a block valid until the node has received all components (block and blobs) of the block. Thus, in all subsequent analysis, the **propagation time** for a block is defined as **the time, measured from the start of the slot, at which a node receives the last component** of the block.

A natural question to investigate was how blob count influences block propagation time. Understanding this dependency is essential when considering whether blob limits can be raised. Below is a plot of the average propagation time as a function of the number of blobs in a block:

![image](https://hackmd.io/_uploads/H1toP4qBA.png)

It appears the propagation time levels off at around five blobs! As explained in the next section, this is quite an unexpected result!

### Modeling the Expected Propagation Time

Network latency is often modeled as a log-normal distribution due to the multiplicative effects of various independent factors such as distance, routing, and congestion. Similarly, processing delays on nodes can be modeled by normal or log-normal distributions. The sum of multiple log-normal variables (representing delays from each hop in the network) can in turn be approximated by another log-normal distribution.

But all of this only models one message. When blobs are involved, there are multiple messages traveling through the network and (as mentioned earlier) the final propagation time is recorded only when all messages for the same block have been received. Thus we are interested in the maximum of several log-normally distributed variables, which does not have a simple analytical form. Nevertheless, we can model it.
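
As a rough illustration of the model just described, the maximum of several log-normal arrival times can be estimated by simulation. This is only a minimal sketch, not the code from the repository: it assumes the block and each blob arrive independently with identical log-normal delays, and the `mu` and `sigma` values are hypothetical placeholders rather than parameters fitted to the data.

```python
import random
import statistics


def simulate_propagation(num_blobs, trials=20_000, mu=0.0, sigma=0.5, seed=0):
    """Sample the arrival time of the last component of a block.

    Assumption: each of the (1 + num_blobs) messages (the block plus its
    blobs) gets an independent log-normal arrival time. mu and sigma are
    illustrative values only.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(trials):
        arrivals = [rng.lognormvariate(mu, sigma) for _ in range(1 + num_blobs)]
        # The block only counts as received once its slowest message arrives.
        samples.append(max(arrivals))
    return samples


# Mean and median of the simulated propagation time for 0 to 6 blobs.
means = []
for blob_count in range(7):
    times = simulate_propagation(blob_count)
    means.append(statistics.mean(times))
    print(blob_count, round(means[-1], 3), round(statistics.median(times), 3))
```

Under these assumptions the mean of the maximum grows with every additional message, so the sketch gives no reason to expect a plateau at five blobs.
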
The plot below illustrates what the final propagation distribution might be expected to look like:

![image](https://hackmd.io/_uploads/HkqL845SR.png)

Looking more closely at the median and average:

![image](https://hackmd.io/_uploads/SyB284qS0.png)

It is easy to see that the average propagation time continues to increase and does **not** level off as the blob count increases, so something strange is going on here. This is explored much more deeply in the next section.

### Blobs, Block Building, & MEV

The last section showed that the average propagation time for a block seems to peak at five blobs rather than six. Our first guess was that proposers and block builders were playing timing games. It is now a [well known result](https://arxiv.org/abs/2305.09032) that the MEV a builder can extract from a slot increases linearly with the amount of time they have to build the block. Because of this, block builders and proposers sometimes intentionally wait longer into their slot to propose so they can gather more MEV. They cannot wait too long, however, or they risk missing the attestation deadline for a majority of attesters and being re-orged out of the chain. Sophisticated builders and proposers carefully tune the time at which they release a block so as to maximize their profit without losing the block.

Since blocks with 6 blobs would be expected to propagate more slowly than blocks with fewer blobs, it is possible that builders have already realized that they need to release 6-blob blocks earlier in order to make the attestation cutoff. If this were the case, one might expect Layer 2s to be forced to tip more in order to compensate these builders for the lost time. Below is what we found when plotting the priority fee (miner tip) against the blob count:

![image](https://hackmd.io/_uploads/Bkeu-r5rC.png)

The tips essentially hover around 1-2 Gwei, with no discernible dependence on blob count.
This suggests blob consumers are not being forced to compensate block builders.

So the question became: if blob consumers weren't increasing their tips, did blocks with more blobs contain less MEV? Using data from the public MEV relays, some limited data (~10k blocks) can be obtained about the MEV extracted from the canonical blocks which contained blobs:

![image](https://hackmd.io/_uploads/SJ5mbSqSC.png)

At least from the limited data available, the number of blobs in a block does not appear to affect the amount of MEV that is extracted.

To really put this question to bed, we attempted to separate blocks built by builders from those built by home stakers. This was done by inspecting the `extra_data` field in the execution payload. To obtain blocks produced by home stakers, we **exclude all values for extra_data that were seen in any block which we know was produced by a builder**. Below is a plot of propagation time for blocks produced by builders and non-builders:

![image](https://hackmd.io/_uploads/Bk9Srr5rR.png)

Very interesting! A couple of conclusions can be drawn from this graph. First, the peak of block propagation time is **still** at 5 blobs, even when the block is built by home stakers. Second, one can easily observe the timing games played by proposers/builders here; they are waiting about half a second longer on average to release their blocks.

These three results together **completely invalidate** our initial hypothesis that timing games from builders explain the leveling off of propagation time at five blobs.

So if builders aren't causing this, what is? It turns out there is an interesting result when you plot the gas used in a block against the number of blobs in the block:

![image](https://hackmd.io/_uploads/Hk1MdS5HC.png)

As EIP-1559 targets half the gas limit (currently 30 million), one would expect this to hover around 15 million gas, as shown here. But interestingly, the median amount of gas per block *dips* when blocks include 6 blobs!
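
The builder/home-staker split and the gas-per-blob-count comparison described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the analysis code from the repository: the record layout, the `via_relay` flag (standing in for "known to be produced by a builder"), and the sample values are all hypothetical.

```python
from collections import defaultdict
from statistics import median

# Hypothetical block records; field names and values are illustrative only.
blocks = [
    {"extra_data": "builder-a", "via_relay": True, "blob_count": 6, "gas_used": 12_000_000},
    {"extra_data": "builder-a", "via_relay": False, "blob_count": 5, "gas_used": 16_000_000},
    {"extra_data": "", "via_relay": False, "blob_count": 6, "gas_used": 9_000_000},
    {"extra_data": "geth", "via_relay": False, "blob_count": 5, "gas_used": 11_000_000},
    {"extra_data": "builder-b", "via_relay": True, "blob_count": 5, "gas_used": 14_000_000},
]


def split_by_source(blocks):
    """Split blocks into (builder, local) using the extra_data exclusion rule."""
    # Every extra_data value ever seen on a block known to come from a builder.
    known_builder_tags = {b["extra_data"] for b in blocks if b["via_relay"]}
    builder, local = [], []
    for b in blocks:
        # A block counts as local only if its tag was never used by a builder.
        (builder if b["extra_data"] in known_builder_tags else local).append(b)
    return builder, local


def median_gas_by_blob_count(blocks):
    """Return {blob_count: median gas used}, ordered by blob count."""
    groups = defaultdict(list)
    for b in blocks:
        groups[b["blob_count"]].append(b["gas_used"])
    return {count: median(gas) for count, gas in sorted(groups.items())}


builder_blocks, local_blocks = split_by_source(blocks)
print("builder:", median_gas_by_blob_count(builder_blocks))
print("local:  ", median_gas_by_blob_count(local_blocks))
```

Note that the exclusion rule is conservative: a block that reuses a builder's `extra_data` value is counted as a builder block even if it never went through a relay.
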
There is also a huge dip in gas used for blocks produced locally. This isn't too surprising; when there are not many transactions on the network, there is very little MEV and local blocks become competitive with builders, resulting in a disproportionate number of local blocks being produced in low-transaction conditions. It's also useful to note that there still appears to be a dip in gas used at 6 blobs in both cases:

![image](https://hackmd.io/_uploads/Sk4VYScrC.png)

Less gas used implies fewer transactions, which means the blocks are smaller and would propagate faster than larger blocks. This is confirmed when plotting the size of the block against the number of blobs:

![image](https://hackmd.io/_uploads/SkU-iB9BA.png)

This definitely helps explain the difference in propagation times. What is less clear at this point is *why* blocks are being packed with fewer transactions when they include 6 blobs. We recommend **core developers investigate this further**.

Before ending the MEV section, we also wanted to share *where* block builders typically include blob transactions in a block. The results were interesting:

![image](https://hackmd.io/_uploads/HynYTBcrA.png)

This placement doesn't appear to be randomly distributed. There is clearly a bias toward the very end of the block, but there wasn't enough time to dig into this result further.

### How are Layer 2s using Blobs?

Next up was to investigate how Layer 2s are actually using blobs.
The data over the last three months gives the following top consumers of Ethereum blob space:

| Tx Recipient | Layer 2 | Blob Count |
| --- | --- | --- |
| `0xff00000000000000000000000000000000008453` | Base | 110253 |
| `0x1c479675ad559dc151f6ec7ed3fbf8cee79582b6` | Arbitrum | 67494 |
| `0xff00000000000000000000000000000000000010` | Optimism | 55205 |
| `0xd19d4b5d358258f05d7b411e21a1460d11b0876f` | Linea | 21473 |
| `0xa8cb082a5a689e0d594d7da1e2d72a3d63adc1bd` | ZKsync | 19145 |
| `0xa13baf47339d63b743e7da8741db5456dac1e556` | Scroll | 19018 |
| `0xc662c410c0ecf747543f5ba90660f6abebd9c8c4` | Starknet | 16609 |
| `0x6f54ca6f6ede96662024ffd61bfd18f3f4e34dff` | Zora | 6767 |
| `0x24e59d9d3bd73ccc28dc54062af7ef7bff58bd67` | Mode | 5776 |
| `0xf338cad020d506e8e3d9b4854986e0ece6c23640` | Paradex | 5600 |
| `0xff00000000000000000000000000000000000255` | Kroma | 2146 |

But what strategies do these Layer 2s employ to use this blob space? There are a few different ways Layer 2s might choose to get their blobs on chain. They have freedom in both how many blobs they post per transaction and how often they post. First, let's take a look at frequency:

![image](https://hackmd.io/_uploads/SyN89B5r0.png)

The interval between blob postings is all over the map depending on the Layer 2. But many of these results can be explained by looking at how many blobs are included in each transaction:

![image](https://hackmd.io/_uploads/B1Lg2r9HA.png)

The most popular rollups (Base, Arbitrum, Optimism) simply wait until they have accumulated 6 blobs' worth of transactions and then post that to the chain. This results in roughly Gaussian distributions of posting intervals, with peaks that reflect the average time it takes these rollups to accumulate this number of transactions. Other rollups, like Scroll and Starknet, essentially post as soon as they have even a single blob, resulting in them posting blobs in nearly every block.
Still other rollups seem to employ a mixture of these strategies, perhaps attempting to optimize their posting costs against the gas market.
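
To make the two dimensions of a posting strategy concrete (blobs per transaction and time between posts), the sketch below computes both per rollup from a list of blob transactions. It is a minimal illustration only: the tuple layout, the rollup names `rollup_a` and `rollup_b`, and the sample values are hypothetical and are not taken from the dataset above.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical blob transactions as (block_number, rollup, blob_count).
blob_txs = [
    (100, "rollup_a", 6), (110, "rollup_a", 6), (121, "rollup_a", 6),
    (100, "rollup_b", 1), (101, "rollup_b", 1),
    (102, "rollup_b", 2), (103, "rollup_b", 1),
]


def posting_profile(txs):
    """Summarize blobs per transaction and the block gap between posts."""
    posts_by_rollup = defaultdict(list)
    for block, rollup, blobs in sorted(txs):  # sorted by block number first
        posts_by_rollup[rollup].append((block, blobs))

    profile = {}
    for rollup, posts in posts_by_rollup.items():
        block_numbers = [block for block, _ in posts]
        # Number of blocks between consecutive posts by the same rollup.
        gaps = [later - earlier
                for earlier, later in zip(block_numbers, block_numbers[1:])]
        profile[rollup] = {
            "mean_blobs_per_tx": mean(blobs for _, blobs in posts),
            "median_block_gap": median(gaps) if gaps else None,
        }
    return profile


profile = posting_profile(blob_txs)
for rollup, stats in profile.items():
    print(rollup, stats)
```

In this toy data, `rollup_a` resembles the batch-until-full strategy (large transactions, long gaps) and `rollup_b` resembles the post-immediately strategy (small transactions, a post in nearly every block).
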