Description | Link |
---|---|
4844 networking spec | ethereum/EIPs – EIP-4844: Networking |
EL mempool spec | ethereum/devp2p – NewPooledTransactionHashes |
Deneb p2p interface | ethereum/consensus-specs – Deneb: Blob Subnets |
Cancun engine API | ethereum/execution-apis – engine_getBlobsV1 |
Fulu p2p interface | ethereum/consensus-specs – Fulu: Data Column Subnets |
Description | Link |
---|---|
Davide's article on Taiko sequencing | Understanding Based Rollups & Total Anarchy – ethresear.ch |
Pop, Nishant, Chirag on improving CL gossip | Doubling the Blob Count with GossipSub v2.0 – ethresear.ch |
Francesco's blob mempool tickets | Blob Mempool Tickets – HackMD |
Dankrad's mempool sharding document | Mempool Sharding – Ethereum Notes |
DataAlways' orderflow dashboard | Private Order Flow – Dune |
(1). L2 transaction lifecycles
(1.1). Centralized sequencers ⇒ patient blobs
(1.2). Total anarchy ⇒ impatient blobs
(1.3). Aside: blobs in a "preconf" world
(2). Blob gossip and validation pre-PeerDAS
(2.1). Blob gossip and the mempool
(2.2). Block validation and blobs
(2.3). The full pre-PeerDAS picture
(3). Blob gossip and validation post-PeerDAS
(3.1). Block validation and blobs
(3.2). Blob gossip changes
(3.2.1). Horizontally shard the EL mempool
(3.2.2). Vertically shard the EL mempool
(4). Summary and conclusion
To understand the properties of blob transactions, we first need to understand the service that the blobs provide to L2 users. Blobs are the vehicle through which Ethereum rollups post L2 transaction data to the L1. In this way, an L2 user who sees their transaction sequenced in a blob and included on the L1 can use that as a "confirmation rule" on their transaction inclusion and ordering.
Definition (informal) – A confirmation rule is a signal indicating that a transaction has been included and ordered.
This definition is vague because confirmation rules can come in many flavors. We will discuss this ad nauseam below, but here are a few examples that should feel familiar.
Example confirmation rules:

- The green checkmark returned by a centralized sequencer after it accepts an L2 transaction.
- The inclusion of the transaction in an L1 block.
- The finality of the L1 block containing the transaction.
Returning to blobs, an L2 transaction being included in a blob that landed on the L1 is a confirmation rule, but the importance of this specific confirmation depends significantly on how the L2 sequences transactions. For a centrally sequenced L2 (e.g., Base, OP Mainnet, Arbitrum), the green checkmark you get from the sequencer is the only confirmation rule you care about, while the actual posting of data to the L1 is not that meaningful to the L2 users.[1] In contrast, for a based rollup using total anarchy sequencing (e.g., Taiko), the L2 transaction inclusion in a blob that lands in an L1 block is the first and most crucial confirmation you get. This distinction is vital because it determines the properties of blob transactions on the L1, which we should consider when designing the L1.
Sections (1.1) and (1.2) further describe the L2 transaction lifecycle for centralized and total anarchy sequencing, respectively. We examine these two modalities in detail because they are what exists today. Section (1.3) briefly considers the potential implications of a world with based & native rollups that give "preconfs."
Let's start with the most basic rollup construction: a centralized sequencer occasionally posting L2 transaction data as blobs to the L1. The figure below demonstrates this flow.
The user treats the sequencer's acknowledgment as the first confirmation (conf #1); the blob landing in an L1 block serves as the second confirmation (conf #2).[2]

Key point: almost all L2 transactions will rely on the centralized sequencer confirmation (conf #1) and won't demand timely blob inclusion on the L1 (conf #2). There are many proposed fallbacks to the centralized sequencer in the case of outages or censorship (e.g., Arbitrum's "Censorship Timeout" or Optimism's "OptimismPortal"). Still, the overwhelming majority of transactions will mainly rely on the sequencer confirmation. Critically, this implies that blobs posted by centralized sequencer rollups will not be latency-sensitive.[3] We categorize these blobs as "patient" (borrowing the definition from Noam's Serial Monopoly paper), because they are indifferent (over reasonable time horizons) about which exact L1 block the blob is included in.
Moving to a much different sequencer model, let's consider Taiko, which uses a "total anarchy" permissionless sequencer model (for now – they are planning to upgrade to an allow list for block builders partly because of the problems outlined below). The figure below demonstrates the L2 transaction lifecycle in this case.
Key point: all these L2 transactions will rely on timely blob inclusion on the L1 (conf #1). Until then, the transactions remain pending. Searchers will compete to submit L1 blobs with profitable L2 transaction sequencing. We call these blobs "impatient" because both (i) their timely inclusion and (ii) their order within the L1 block are critical to the L2's functioning. We already see this empirically; see Davide Rezzoli's recent article outlining how Taiko Labs faces adverse selection when posting blobs and is often outbid by more competitive searchers.
One subtlety alluded to in step 2 above: we expect the vast majority of these blobs to go directly to builders instead of going through the public mempool. We also see this empirically, as described by DataAlways in this tweet. When there is an open competition to sequence L2 transactions, blobs will carry MEV and thus must flow through private channels to avoid being front-run and/or unbundled. DataAlways summarizes this nicely in this tweet; see the surrounding thread for further context.
"Preconf" rollups aim to give L2 sequencing authority to L1 validators who opt-in to an out-of-protocol service. With this authority, the L1 proposer who is elected as the next leader to propose an L2 block can issue "preconfirmations" (promises of inclusion and/or ordering) to L2 transactions (a preconf is, itself, a confirmation rule). Thus, the L1 proposer who also builds the L2 block receives payments and MEV from building the L2 block.
We aren't going to spend too much time here because preconf rollups don't exist yet, but it is worth touching on. We believe blobs built by L1 proposers (or builders/relays) who are giving preconfs to L2 users may hit the public mempool. Consider a validator (the next enrolled preconfer) who is the L1 proposer eight slots into the future. Thus, they have sole sequencing rights over the L2 for 96 seconds. Each preconf they issue corresponds to an L2 transaction, which they must pack into a blob and post to the L1 (in a specific order). This validator can publish the blobs in order and doesn't necessarily need to wait for their slot to include the blob in their own block. Again, this is all a bit speculative and dependent on the L2 construction, but it seems possible that these blobs will need to be included over the next eight slots but won't be as latency-sensitive as those that use total anarchy to sequence (as discussed in the previous section); these blobs might be best modeled as "quasi-patient" transactions (e.g., see this paper).
Of course, once it is the validator's slot, they can simply include any remaining blobs with the preconfed L2 transactions directly. Existing designs have these preconfs enforced by slashing conditions, so the validator would be strongly incentivized to ensure the blobs make it on-chain in the order they promised. We close this topic here, but it will be important to revisit if we see increased usage of preconf rollups.
Let's take stock of where we are today based on Section (1). We partition blobs into two categories:

1. Patient blobs – posted by centralized-sequencer rollups, whose users rely on the sequencer confirmation; these blobs are not latency-sensitive and mostly flow through the public mempool.
2. Impatient blobs – posted under total-anarchy sequencing (e.g., Taiko today), where timely L1 inclusion and ordering are critical; these blobs carry MEV and mostly flow privately to builders.
From this DataAlways dashboard, we see that about 80% of blobs hit the public mempool, and only the Taiko sequencers (a permissionless set, as described above) are consistently sending private blobs. For now, this partition accurately characterizes the existing blob flow. We now return to the L1 and consider how blobs consume network bandwidth for validators participating in consensus. A validator has the following blob-related roles:

1. Blob gossip – participating in the EL mempool, where public blob transactions propagate before they are included in a block.
2. Block validation – checking that the blobs a new block commits to are available before attesting to it.
These roles have very different implications for each validator's network resource consumption based on when they happen in the slot.
Today, validators connect to different peers with their EL and CL clients. The "mempool" refers to the set of transactions the EL client hears about before being included in a block. As specified in EIP-4844, blob transactions are gossiped in a pull-based manner.
"Nodes MUST NOT automatically broadcast blob transactions to their peers. Instead, those transactions are only announced using NewPooledTransactionHashes messages and can then be manually requested via GetPooledTransactions."
– Networking, EIP-4844.
The NewPooledTransactionHashes
message serves as an announcement of a blob, and any peer who hasn't yet downloaded that blob responds directly with a GetPooledTransactions
request. In this manner, all blobs that hit the public mempool are propagated quickly to every node. The sequence diagram below demonstrates this process.
Step-by-step annotation:
1. A node announces a new blob transaction to its peers with a NewPooledTransactionHashes message, which contains the transaction type, size, and hash.
2. A peer that hasn't yet downloaded the blob requests it with GetPooledTransactions.
3. After downloading the blob transaction, that peer announces it to its own peers with a NewPooledTransactionHashes message.

Key point: Each node should only download each blob a single time, when they request it from the first peer, because they ask for the blob sequentially from one peer at a time. After that, they will ignore any NewPooledTransactionHashes messages that include blobs they have already downloaded.
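To make the pull-based flow concrete, here is a minimal, self-contained Python sketch of the download-once heuristic. This is not client code: the `fetch` callback and the tuple layout are illustrative stand-ins for the GetPooledTransactions round trip.

```python
# Self-contained sketch of pull-based blob gossip (EIP-4844 style). The peer
# transport is faked; only the announce -> request bookkeeping matters here.

downloaded = set()  # hashes of blob transactions this node already holds

def on_new_pooled_transaction_hashes(announcements, fetch):
    """Handle one NewPooledTransactionHashes message from a peer.

    `announcements` is a list of (tx_type, size, tx_hash) tuples, mirroring the
    devp2p message; `fetch` plays the role of a GetPooledTransactions request
    to the announcing peer and returns (tx_body, tx_hash) pairs.
    """
    wanted = [h for (_type, _size, h) in announcements if h not in downloaded]
    if not wanted:
        return []                            # nothing new -> ignore the announcement
    for _body, tx_hash in fetch(wanted):     # pull only the missing blobs, from this one peer
        downloaded.add(tx_hash)
    return wanted                            # hashes this node would re-announce onward

# Toy usage: two peers announce the same blob transaction, but it is only fetched once.
fake_fetch = lambda hashes: [(b"<blob tx bytes>", h) for h in hashes]
print(on_new_pooled_transaction_hashes([(3, 131_072, "0xaa")], fake_fetch))  # ['0xaa']
print(on_new_pooled_transaction_hashes([(3, 131_072, "0xaa")], fake_fetch))  # []
```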
With this heuristic, we calculate today's blob mempool bandwidth consumption should be around 32 kB/s = (128 kB/blob * 3 blob/slot) / 12 s/slot. The figure below shows the empirical data is close to this theoretical value.
Blob mempool ingress bandwidth consumption. 33.8 kB/s is only slightly higher than the expected 32 kB/s, resulting from 3 blobs per slot.
Key point: Public mempool blobs are spread out over the 12-second slot, distributing the network load over the interval. Additionally, each node expects to see every public mempool blob.
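The same arithmetic is easy to reproduce and to project forward to higher blob counts (the 48-blob figure reappears in Section (3.2)). A quick back-of-the-envelope calculation in Python, using the 128 kB blob size and 12-second slots from above:

```python
# Back-of-the-envelope mempool ingress bandwidth: every node downloads each
# public blob exactly once, spread over the slot.
BLOB_SIZE_KB = 128
SLOT_SECONDS = 12

def mempool_bandwidth_kbps(blobs_per_slot):
    return BLOB_SIZE_KB * blobs_per_slot / SLOT_SECONDS

print(mempool_bandwidth_kbps(3))   # 32.0 kB/s  -> today's 3-blob target
print(mempool_bandwidth_kbps(48))  # 512.0 kB/s -> a hypothetical 48-blob target
```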
The mempool accounts for blob transactions not yet included in a block. Separately, when a validator receives a new block, they must ensure that the blobs the block commits to are available to determine overall block validity. Today, the validators ascertain this blob availability by fully downloading the blobs. As mentioned above, the CL has an entirely different gossip network and peers than the EL. Validators use a combination of both to receive all the blob data needed to attest to a block. The first and primary source of blobs for the CL is the blob subnets (blob_sidecar_{subnet_id}
). With a maximum of six blobs, there are six subnets that every validator connects to. When gossiping a block, the corresponding blobs are gossiped over their respective subnet (e.g., the blob committed to in index two is gossiped over blob_sidecar_2
). If a validator doesn't receive a blob over their CL gossip,[4] they can check if their EL client has it in the mempool (received over EL gossip) using the engine_getBlobsV1 API. Lastly, the validator can directly ask their CL peers for a blob (instead of just waiting to hear it over gossip) with the blob_sidecars_by_range
API. Note, however, that the req/resp model is not usually used on the critical path and is unlikely to help retrieve missing blobs in the time between hearing about the beacon block and the attestation deadline. Still, we include it here because it is part of the spec and worth highlighting. The sequence diagram below shows this flow for three blobs, which Bob receives in three distinct ways.
Step-by-step annotation:
1. Bob receives the beacon_block over the pub-sub topic and needs to attest to its validity. The block contains three blobs (but the blobs are gossiped separately).
2. Bob receives blob_1 over CL gossip on the blob_sidecar_1 topic. He still doesn't have blob_2 or blob_3 from either subnet.
3. Bob calls engine_getBlobsV1 to see if the EL has heard about any blobs over mempool gossip. The engine call returns blob_2, but not blob_3.
4. Bob fetches blob_3 directly from a CL peer over the req/resp domain.

Key point: The blob subnets are a push model instead of a pull. When Bob receives a blob over the subnet, he forwards it to his CL peers, even though they haven't explicitly asked for it.
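The retrieval order in the diagram can be written down directly. Here is a hedged Python sketch (not client code; `subnet_cache`, `engine_get_blobs`, and `req_resp` are hypothetical stand-ins for the blob subnets, the engine_getBlobsV1 call, and a by-root request to a peer) of how an attester like Bob might gather the blobs a block commits to:

```python
# Sketch of pre-PeerDAS blob retrieval for block validation (illustrative only).
# Preference order mirrors the diagram: subnet gossip, then the EL mempool via
# engine_getBlobsV1, then a direct req/resp request to a CL peer.

def gather_blobs(versioned_hashes, subnet_cache, engine_get_blobs, req_resp):
    """Return {versioned_hash: blob} for every blob the block commits to."""
    blobs = {h: subnet_cache[h] for h in versioned_hashes if h in subnet_cache}

    missing = [h for h in versioned_hashes if h not in blobs]
    if missing:  # the EL only returns blobs it saw in the public mempool
        for h, blob in zip(missing, engine_get_blobs(missing)):
            if blob is not None:
                blobs[h] = blob

    for h in versioned_hashes:  # last resort, usually too slow for the deadline
        if h not in blobs:
            blobs[h] = req_resp(h)
    return blobs

# Toy run matching the diagram: blob_1 via subnet, blob_2 via the EL, blob_3 via req/resp.
cache = {"blob_1": b"b1"}
engine = lambda hashes: [b"b2" if h == "blob_2" else None for h in hashes]
peer = lambda h: b"b3"
print(sorted(gather_blobs(["blob_1", "blob_2", "blob_3"], cache, engine, peer)))
```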
The push model can lead to redundancy, where a node receives blobs multiple times, which we see empirically.
Ingress traffic over blob_sidecar topics. At an average of 153 kB/s (~14 blobs per slot), we see about 4x more blobs than are included in each block.
A pull-based model would consume bandwidth more efficiently, but at the cost of latency (given an extra round trip of control messages before the blob is transmitted). See GossipSub v2.0 from Pop, Nishant, and Chirag, which aims to reduce this amplification.
The figure below highlights the timeline of events within a slot for both blob gossip and block validation.
PeerDAS, widely held to be the main priority for the Fulu/Osaka hardfork, changes how the protocol interacts with blobs. For the purposes of this post, the only piece of PeerDAS we need to cover is the columnar approach used by the CL to verify data availability. The figure below demonstrates this distinction.
For simplicity, let's assume that the quantity of data needed to perform the two tasks is approximately the same. In other words, PeerDAS does increase the number of blobs per block, but it doesn't significantly increase the amount of blob data downloaded by the CL, because each validator only downloads a subset of each blob (e.g., with a 48-blob target, if each validator downloads 1/8 of every blob, then in aggregate they download six blobs' worth of data – the Prague/Electra target).
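For concreteness, the illustrative numbers above work out as follows (plain Python; the 48-blob target and 1/8 sampling fraction are the example values from the text, not spec constants):

```python
# Illustrative PeerDAS sampling arithmetic (example values, not spec constants).
BLOB_SIZE_KB = 128
blobs_per_block = 48        # hypothetical post-PeerDAS target from the example
sampling_fraction = 1 / 8   # each validator downloads 1/8 of every blob

downloaded_kb = blobs_per_block * BLOB_SIZE_KB * sampling_fraction
print(downloaded_kb)                  # 768.0 kB per block
print(downloaded_kb / BLOB_SIZE_KB)   # 6.0 -> six blobs' worth (the Prague/Electra target)
```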
With this setup, we can consider how the validator interactions with blobs change. We will reverse the order by examining the CL block validation rule before discussing the mempool and EL gossip.
On the CL side, the validators still have to determine block validity based on the availability of blobs. Instead of downloading the full set of blobs, they download a random subset of columns from each blob. The blob subnets described above are deprecated in favor of data column sidecar subnets (data_column_sidecar_{subnet_id}
), which are the topics where full columns of blob data are gossiped. Critically, there is no concept of partial columns; thus, each column depends on the complete set of blobs committed to by a block. To validate a block, the CL checks the result of the is_data_available
function, which ensures that the node has access to their assigned columns for each blob in the block. As before, let's consider the three ways to retrieve their columns of data:
1. Over the data_column_sidecar subnets,
2. via engine_getBlobsV1 (same as before, more on this below), or
3. with the data_column_sidecars_by_root req/resp API.

Key point: Step (2) above returns the blobs themselves instead of just the columns the validator needs to check. To construct the entire column, the EL will only be helpful if it returns every blob in the block (meaning the block cannot have any private blobs).
There are multiple problems with this. First (and most obviously), if the EL has the entire set of blobs for the block, then what was the point of the CL sampling only a subset of each blob? (More on this in the following section.) The CL could just confirm the data is available directly by fetching every blob from the EL. Secondly, if any of the blobs in the block is private, then the EL call won't aid in constructing the entire column (recall: no partial columns). So … this is awkward. We sharded the CL blob validation, but by doing so, we eliminated the value of the public EL mempool for block validation (it is still potentially useful for block building, especially considering the patient, public mempool blobs described in Section 1). Consider the figure below as an example.
As before, the green blobs arrive uniformly throughout the slot (there are now 48), while the purple blobs are not gossiped until the block has been published. Now, the honest attester faces the following situation:
They need these full columns, but because of some private (purple) blobs in the block, the public mempool is insufficient for the full-column construction. Thus, if they don't receive the full column over gossip via data_column_sidecar
, they cannot consider the block valid (as before, we assume that the request/response domain API isn't dependable for the critical path of block validation).
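A small sketch makes the "no partial columns" constraint explicit. Assuming a hypothetical engine_getBlobsV1-style call that returns None for blobs the EL never saw, and a hypothetical `cell` extractor, a full column can only be built if every blob in the block is public:

```python
# Sketch: building one full data column from EL-mempool blobs (illustrative).
# A column needs one cell from every blob in the block, so a single private
# blob (unknown to the EL) makes the column unconstructible from the mempool.

def cell(blob, column_index):
    # stand-in: in reality this is an erasure-coded cell plus its KZG proof
    return (blob, column_index)

def column_from_el(blob_hashes, engine_get_blobs, column_index):
    """Return the full column, or None if any blob is missing from the EL."""
    blobs = engine_get_blobs(blob_hashes)
    if any(b is None for b in blobs):
        return None                                  # no partial columns allowed
    return [cell(b, column_index) for b in blobs]

# Toy run: one private blob in the block breaks column construction entirely.
el = lambda hashes: [b"<blob>" if h != "private" else None for h in hashes]
print(column_from_el(["a", "b", "c"], el, 7) is not None)        # True
print(column_from_el(["a", "b", "private"], el, 7) is not None)  # False
```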
Note that the builder/proposer is highly incentivized to make sure the columns of data are distributed over the CL gossip because the validity of their block depends on it. It may be OK to simply leave it to them to decide how far to push latency when distributing columns. This comes with some risk, as builders may push latency to the limit and leave low-bandwidth attesters unable to download the columns fast enough to attest properly. Even today, builders make difficult decisions about which blobs to include, especially during periods of high volatility – see discussion here. Even if we are happy to leave most of the blob distribution to the builders (albeit a big assumption), we still have the issue of blob gossip under a significant increase in blob throughput.
The reality remains that the EL blob gossip still receives all public blobs, eliminating the benefit of sampling only a subset of each blob on the CL. We calculated an average of 32 kB/s with three blobs; under a 48-blob regime, this would be 512 kB/s = (128 kB/blob * 48 blob/slot) / 12 s/slot. This is a significant increase and potentially too much bandwidth for many solo/small-scale operators. Thus, it is worth considering changes to gossip to alleviate this. To conclude this article, we will consider horizontal and vertical sharding of the EL mempool, each of which could be implemented with various levels of complexity and effectiveness; this list is by no means exhaustive, and each proposal is probably worthy of an entire article analyzing the tradeoffs. Instead, we aim to give a feel for the design space and defer the complete analysis and recommendations to future work.
We use "horizontal sharding" to describe the process of the EL downloading only a subset of the total set of public blobs. It is horizontal because the messages are still complete blobs (instead of just some columns within the blob). There are several different approaches to achieve this; generally, these are easy to implement, but they don't resolve the issue of block validation requiring the full column (the same cell of every blob). Here are a few candidate ideas:
- Rate-limit blob downloads – e.g., each node only downloads the first few blobs it hears about in a slot.
- Shard by hash prefix – e.g., a node assigned the 0x01 prefix only downloads blob transactions whose hash falls under that prefix. This is another rate-limiting method that is not restricted to the first blobs the node hears about.

Pros
Cons
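As a concrete illustration of the horizontal-sharding idea, here is a minimal Python sketch of an announcement-time filter; the per-slot budget, the prefix value, and the function names are all illustrative, not a concrete proposal:

```python
# Sketch of horizontal mempool sharding: at announcement time, the EL decides
# whether to pull a blob transaction at all. The values below are made up.

MAX_BLOBS_PER_SLOT = 6      # illustrative per-node, per-slot download budget
MY_PREFIX = "0x01"          # illustrative hash prefix assigned to this node

def rate_limited(downloaded_this_slot):
    """Only download the first few blobs heard about in a slot."""
    return downloaded_this_slot < MAX_BLOBS_PER_SLOT

def in_my_shard(tx_hash):
    """Only download blob transactions whose hash falls under this node's prefix."""
    return tx_hash.startswith(MY_PREFIX)

# Either filter cuts bandwidth, but note that neither lets a node rebuild a
# full column on its own -- that still requires every blob in the block.
print(rate_limited(2), in_my_shard("0x01ab"))   # True True
print(rate_limited(9), in_my_shard("0x7fab"))   # False False
```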
The CL uses vertical sharding (it only downloads a subset of columns of each blob), so if the EL could also download only the columns the CL needs, we would completely solve the issue. However, implementing this is not simple because of the potential DoS risk: without the full blob transaction, a subset of blob columns would be gossiped without validators being able to confirm that it corresponds to a valid, fee-paying blob transaction. Thus, the naïve approach of simply gossiping columns/cells instead of complete blob transactions is untenable. Turning our attention to DoS prevention, there are a few promising threads.
Pros
Cons
We covered a lot. In summary:

- Blobs today come in two flavors: patient blobs from centralized-sequencer rollups and impatient, MEV-carrying blobs from total-anarchy sequencing (e.g., Taiko).
- Pre-PeerDAS, the EL mempool pulls every public blob once (roughly 32 kB/s today), while the CL pushes full blobs over the blob subnets to validate blocks.
- Post-PeerDAS, the CL samples columns instead of full blobs, but a single private blob makes the public mempool insufficient for reconstructing full columns, and an unsharded EL mempool (roughly 512 kB/s at 48 blobs) erases much of the bandwidth savings.
- Horizontal and vertical sharding of the EL mempool are candidate mitigations: horizontal sharding is simpler but doesn't help column construction, while vertical sharding matches the CL's needs but raises DoS questions.
▓▒░ made with ♥ and markdown. thanks for reading! –mike ░▒▓
Of course, the blob posting is essential for fraud proofs and forced exits, two critical features of L2s. We emphasize the confirmation rule aspect because the default path for L2 transactions will be to treat the sequencer confirmation as final. ↩︎
The user could further wait for the finality of the L1 block as a third confirmation. ↩︎
Terence mentioned some other reasons centralized sequencer L2s might be time-sensitive, but there is still a long time window during which the blob can be posted without affecting rollup operations meaningfully. ↩︎
Many technical details within the CL gossip can impact the probability of a blob coming over the subnet. Control messages like IHAVE, IWANT, IDONTWANT
signal to your peers what data you have, want, or don't want. For this document, we elide these details. ↩︎
This article was originally going to be about the market design of blob mempool tickets… but here we are, lol. ↩︎