
Blob gossip and validation before and after PeerDAS


https://whyy.org/episodes/why-we-gossip/


tl;dr: Blobs contribute to Ethereum scaling, in part, by providing a confirmation rule for L2 transactions. The value of this confirmation rule depends on the L2's sequencer model and can lead to public or private L1 blob transactions (80% of blobs are gossiped in the public mempool today). With this context, we examine how the protocol presently handles blobs by considering the execution layer (abbr. EL) mempool and the consensus layer (abbr. CL) blob validation. We then turn our attention to the changes introduced by PeerDAS (the next step along the path of Ethereum's data scaling roadmap) to demonstrate that, while the CL only samples a subset of the total blob data, the EL mempool will, by default, still receive all public blobs. This fact reduces the benefit of sharding the blobs in the first place, and we conclude by examining a few candidate mechanisms to shard the EL mempool horizontally or vertically and the tradeoffs they make.

by mike neuder – april 15, 2025 (happy tax day! 🪙📅)

Thanks to Julian Ma and Francesco D'Amato for extensive comments. Further thanks to Alex Stokes and lightclients for helpful discussions.

🔗 Specs and protocol docs

| Description | Link |
| --- | --- |
| 4844 networking spec | ethereum/EIPs – EIP-4844: Networking |
| EL mempool spec | ethereum/devp2p – NewPooledTransactionHashes |
| Deneb p2p interface | ethereum/consensus-specs – Deneb: Blob Subnets |
| Cancun engine API | ethereum/execution-apis – engine_getBlobsV1 |
| Fulu p2p interface | ethereum/consensus-specs – Fulu: Data Column Subnets |

| Description | Link |
| --- | --- |
| Davide's article on Taiko sequencing | Understanding Based Rollups & Total Anarchy – ethresear.ch |
| Pop, Nishant, Chirag on improving CL gossip | Doubling the Blob Count with GossipSub v2.0 – ethresear.ch |
| Francesco's blob mempool tickets | Blob Mempool Tickets – HackMD |
| Dankrad's mempool sharding document | Mempool Sharding – Ethereum Notes |
| DataAlways' orderflow dashboard | Private Order Flow – Dune |

Contents

(1). L2 transaction lifecycles
  (1.1). Centralized sequencers ⇒ patient blobs
  (1.2). Total anarchy ⇒ impatient blobs
  (1.3). Aside: blobs in a "preconf" world
(2). Blob gossip and validation pre-PeerDAS
  (2.1). Blob gossip and the mempool
  (2.2). Block validation and blobs
  (2.3). The full pre-PeerDAS picture
(3). Blob gossip and validation post-PeerDAS
  (3.1). Block validation and blobs
  (3.2). Blob gossip changes
    (3.2.1). Horizontally shard the EL mempool
    (3.2.2). Vertically shard the EL mempool
(4). Summary and conclusion


(1). L2 transaction lifecycles

To understand the properties of blob transactions, we first need to understand the service that the blobs provide to L2 users. Blobs are the vehicle through which Ethereum rollups post L2 transaction data to the L1. In this way, an L2 user who sees their transaction sequenced in a blob and included on the L1 can use that as a "confirmation rule" on their transaction inclusion and ordering.

Definition (informal): A confirmation rule is a signal indicating that a transaction has been included and ordered.

This definition is vague because confirmation rules can come in many flavors. We will discuss this ad nauseam below, but here are a few examples that should feel familiar.
Example confirmation rules:

  1. Six bitcoin blocks – The bitcoin core client marks any transaction with six or more blocks built on the including block as confirmed.
  2. Ethereum finality – Ethereum blocks are finalized in batches (called epochs). Once a transaction is finalized, it will only be reverted if 1/3 of the validator set is provably slashable. Finality is a robust guarantee, but it is a bit slow.

    As seen in the image above, Etherscan lets you know how strong this confirmation rule is.
  3. Ethereum block inclusion – Even before finality, Ethereum transactions are confirmed by being included in a block.

    As seen in the image above, Etherscan gives you a green checkmark, with some text advising that the block is not yet finalized. Still, for most transactions, this confirmation rule is sufficient.
  4. Base centralized sequencer green check – Since Coinbase is the only party that sequences Base transactions, you only need confirmation from their sequencer.

    The image above shows that Basescan gives you the green check because the sequencer confirmed the transaction.

Returning to blobs, an L2 transaction being included in a blob that landed on the L1 is a confirmation rule, but the importance of this specific confirmation depends significantly on how the L2 sequences transactions. For a centrally sequenced L2 (e.g., Base, OP Mainnet, Arbitrum), the green checkmark you get from the sequencer is the only confirmation rule you care about, while the actual posting of data to the L1 is not that meaningful to the L2 users.[1] In contrast, for a based rollup using total anarchy sequencing (e.g., Taiko), the L2 transaction inclusion in a blob that lands in an L1 block is the first and most crucial confirmation you get. This distinction is vital because it determines the properties of blob transactions on the L1, which we should consider when designing the L1.

Sections (1.1) and (1.2) further describe the L2 transaction lifecycle for centralized and total anarchy sequencing, respectively. We examine these two modalities in detail because they are what exists today. Section (1.3) briefly considers the potential implications of a world with based & native rollups that give "preconfs."

(1.1). Centralized sequencers ⇒ patient blobs

Let's start with the most basic rollup construction: a centralized sequencer occasionally posting L2 transaction data as blobs to the L1. The figure below demonstrates this flow.


Step-by-step annotation:

  1. The user submits their L2 transaction to the centralized sequencer.
  2. The sequencer immediately confirms the transaction for the user. We call this conf #1 as the first confirmation.
  3. The sequencer batches many L2 transactions into an L1 blob, which they submit to the public mempool.
  4. The Ethereum builder/proposer observes the mempool, picks up the blob to include in a block, and publishes the block to the consensus layer network.
  5. The user receives their second confirmation when the blob that includes their transaction is published to the L1.[2]

Key point: almost all L2 transactions will rely on the centralized sequencer confirmation (conf #1) and won't demand timely blob inclusion on the L1 (conf #2). There are many proposed fallbacks to the centralized sequencer in the case of outages or censorship (e.g., Arbitrum's "Censorship Timeout" or Optimism's "OptimismPortal"). Still, the overwhelming majority of transactions will mainly rely on the sequencer confirmation. Critically, this implies that blobs posted by centralized sequencer rollups will not be latency-sensitive.[3] We categorize these blobs as "patient" (borrowing the definition from Noam's Serial Monopoly paper), because they are indifferent (over reasonable time horizons) about which exact L1 block the blob is included in.
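To make the "patient" framing concrete, here is a minimal sketch (the helper is hypothetical; real batchers also compress payloads and pack them into field elements) of how a centralized sequencer sizes its blob postings. Nothing about this schedule depends on which exact L1 slot the blobs land in.

```python
import math

BLOB_DATA_KB = 128  # approximate usable data per blob (EIP-4844)

def blobs_needed(l2_payloads: list[bytes]) -> int:
    """How many blobs a sequencer batch needs. Illustrative only: a real batcher
    would compress the payloads and pack them into field elements first."""
    total_kb = sum(len(p) for p in l2_payloads) / 1024
    return math.ceil(total_kb / BLOB_DATA_KB)

# A sequencer posting every few minutes can wait until its batch fills whole blobs,
# then submit the blob transaction to the public mempool with a modest fee.
```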

(1.2). Total anarchy ⇒ impatient blobs

Moving to a much different sequencer model, let's consider Taiko, which uses a "total anarchy" permissionless sequencer model (for now – they are planning to upgrade to an allow list for block builders partly because of the problems outlined below). The figure below demonstrates the L2 transaction lifecycle in this case.


Step-by-step annotation:

  1. The user submits their L2 transaction to the Taiko mempool.
  2. Searchers listen to the mempool, construct L1 blobs containing the L2 transactions, and send the blobs to the L1 builders directly (more on this private connection below).
  3. The builder/proposer includes the blob in their published block.
  4. The user receives their first confirmation when they see an L1 block containing a blob containing their transaction.

Key point: all these L2 transactions will rely on timely blob inclusion on the L1 (conf #1). Until then, their transaction will remain pending. Searchers will compete to submit L1 blobs with profitable L2 transaction sequencing. We call these blobs "impatient" because both (i) their timely inclusion and (ii) their order within the L1 block are critical to the L2 functioning. We already see this empirically; see Davide Rezzoli's recent article outlining how Taiko labs face adverse selection when posting blobs and are often outbid by more competitive searchers.

One subtlety alluded to in step 2 above: we expect the vast majority of these blobs to go directly to builders instead of going through the public mempool, and we see this empirically, as described by DataAlways in this tweet. When there is an open competition to sequence L2 transactions, blobs will be carrying MEV and thus must flow through private channels to avoid being front-run and/or unbundled. DataAlways summarizes this nicely in this tweet; see the surrounding thread for further context.

(1.3). Aside: blobs in a "preconf" world

"Preconf" rollups aim to give L2 sequencing authority to L1 validators who opt-in to an out-of-protocol service. With this authority, the L1 proposer who is elected as the next leader to propose an L2 block can issue "preconfirmations" (promises of inclusion and/or ordering) to L2 transactions (a preconf is, itself, a confirmation rule). Thus, the L1 proposer who also builds the L2 block receives payments and MEV from building the L2 block.

We aren't going to spend too much time here because preconf rollups don't exist yet, but it is worth touching on. We believe blobs built by L1 proposers (or builders/relays) who are giving preconfs to L2 users may hit the public mempool. Consider a validator (the next enrolled preconfer) who is the L1 proposer eight slots into the future. Thus, they have sole sequencing rights over the L2 for 96 seconds. Each preconf they issue corresponds to an L2 transaction, which they must pack into a blob and post to the L1 (in a specific order). This validator can publish the blobs in order and doesn't necessarily need to wait for their slot to include the blob in their own block. Again, this is all a bit speculative and dependent on the L2 construction, but it seems possible that these blobs will need to be included over the next eight slots but won't be as latency-sensitive as those that use total anarchy to sequence (as discussed in the previous section); these blobs might be best modeled as "quasi-patient" transactions (e.g., see this paper).

Of course, once it is the validator's slot, they can simply include any remaining blobs with the preconfed L2 transactions directly. Existing designs have these preconfs enforced by slashing conditions, so the validator would be strongly incentivized to ensure the blobs make it on chain in the order they promised. We close this topic here, but it will be important to discuss if we see increased usage of preconf rollups.

(2). Blob gossip and validation pre-PeerDAS

Let's take stock of where we are today based on Section (1). We partition blobs into two categories:

  1. Patient, public mempool blobs.
  2. Impatient, private mempool blobs.

From this DataAlways dashboard, we see that about 80% of blobs hit the public mempool, and only the Taiko sequencers (a permissionless set, as described above) are consistently sending private blobs. For now, this partition accurately characterizes the existing blob flow. We return to the L1 and consider how blobs consume network bandwidth for validators participating in consensus. A validator has the following blob-related roles:

  1. gossiping blob transactions, and
  2. validating that blobs a block commits to are available before attesting.

These roles have very different implications for each validator's network resource consumption based on when they happen in the slot.

(2.1). Blob gossip and the mempool

Today, validators connect to different peers with their EL and CL clients. The "mempool" refers to the set of transactions that the EL client has heard about but that have not yet been included in a block. As specified in EIP-4844, blob transactions are gossiped in a pull-based manner.

"Nodes MUST NOT automatically broadcast blob transactions to their peers. Instead, those transactions are only announced using NewPooledTransactionHashes messages and can then be manually requested via GetPooledTransactions."
Networking, EIP-4844.

The NewPooledTransactionHashes message serves as an announcement of a blob, and any peer who hasn't yet downloaded that blob responds directly with a GetPooledTransactions request. In this manner, all blobs that hit the public mempool are propagated quickly to every node. The sequence diagram below demonstrates this process.


Step-by-step annotation:

  1. Alice notifies Bob of a new blob transaction with NewPooledTransactionHashes, which contains the transaction type, size, and hash.
  2. If Bob doesn't already have that blob, he requests it from Alice with GetPooledTransactions.
  3. Alice responds by sending the full blob to Bob.
  4. Bob notifies his peers with a NewPooledTransactionHashes message.

Key point: Each node should download each blob only once, requesting it from the first peer that announces it (blobs are requested sequentially, from one peer at a time). After that, they will ignore any NewPooledTransactionHashes announcements that include blobs they have already downloaded.
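A minimal sketch of this announce-then-pull bookkeeping (the peer object and its method names are stand-ins, not a real client API):

```python
class BlobMempool:
    """Illustrative pull-based handling of blob announcements."""

    def __init__(self):
        self.known: set[bytes] = set()      # blob tx hashes already downloaded
        self.requested: set[bytes] = set()  # requested but not yet received

    def on_new_pooled_transaction_hashes(self, peer, announced_hashes: list[bytes]):
        # Only pull hashes we have neither downloaded nor already requested,
        # so each blob is fetched from a single peer.
        missing = [h for h in announced_hashes
                   if h not in self.known and h not in self.requested]
        if missing:
            self.requested.update(missing)
            peer.send_get_pooled_transactions(missing)  # stand-in for GetPooledTransactions

    def on_pooled_transactions(self, blobs: dict[bytes, bytes]):
        # Record the downloads; re-announcing to our own peers is elided.
        for tx_hash in blobs:
            self.requested.discard(tx_hash)
            self.known.add(tx_hash)
```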

With this heuristic, we can estimate that today's blob mempool bandwidth consumption should be around 32 kB/s = (128 kB/blob * 3 blob/slot) / 12 s/slot. The figure below shows the empirical data is close to this theoretical value.
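Written out as a quick back-of-the-envelope calculation:

```python
BLOB_KB, SLOT_SECONDS = 128, 12

def mempool_ingress_kbps(blobs_per_slot: float) -> float:
    """Steady-state ingress if each node downloads every public blob exactly once."""
    return BLOB_KB * blobs_per_slot / SLOT_SECONDS

print(mempool_ingress_kbps(3))  # 32.0 kB/s, close to the ~33.8 kB/s measured below
```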


Blob mempool ingress bandwidth consumption. 33.8 kB/s is only slightly higher than the expected 32 kB/s, resulting from 3 blobs per slot.

Key point: Public mempool blobs are spread out over the 12-second slot, distributing the network load over the interval. Additionally, each node expects to see every public mempool blob.

(2.2). Block validation and blobs

The mempool accounts for blob transactions not yet included in a block. Separately, when a validator receives a new block, they must ensure that the blobs the block commits to are available to determine overall block validity. Today, validators ascertain this blob availability by fully downloading the blobs. As mentioned above, the CL has an entirely different gossip network and peers than the EL, and validators use a combination of both to receive all the blob data needed to attest to a block.

The first and primary source of blobs for the CL is the blob subnets (blob_sidecar_{subnet_id}). With a maximum of six blobs, there are six subnets that every validator connects to. When gossiping a block, the corresponding blobs are gossiped over their respective subnet (e.g., the blob committed to at index two is gossiped over blob_sidecar_2). If a validator doesn't receive a blob over their CL gossip,[4] they can check if their EL client has it in the mempool (received over EL gossip); they do this via the engine_getBlobsV1 API. Lastly, the validator can directly ask their CL peers for a blob (instead of just waiting to hear it over gossip) with the blob_sidecars_by_root API. Note, however, that the req/resp model is not usually used on the critical path and is unlikely to help retrieve missing blobs in the time between hearing about the beacon block and the attestation deadline. Still, we include it here because it is part of the spec and worth highlighting. The sequence diagram below shows this flow for three blobs, which Bob receives in three distinct ways.

Step-by-step annotation:

  1. Bob receives a beacon_block over the pub-sub topic and needs to attest to its validity. The block contains three blobs (but the blobs are gossiped separately).
  2. Bob hears about blob_1 over CL gossip on the blob_sidecar_1 subnet. He still doesn't have blob_2 or blob_3 from their respective subnets.
  3. Bob calls engine_getBlobsV1 to see if the EL has heard about any blobs over mempool gossip. The engine call returns blob_2, but not blob_3.
  4. Bob makes a direct request to his CL peers for blob_3.
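Putting the three sources together, a rough sketch of the retrieval logic looks like the following (the block, cache, and engine objects and their attributes are illustrative stand-ins; engine_getBlobsV1 is the actual Engine API method being wrapped):

```python
def fetch_blobs_for_block(block, subnet_cache, engine_api):
    """Illustrative: assemble all blobs a block commits to before the attestation deadline."""
    blobs = {}
    for idx, versioned_hash in enumerate(block.blob_versioned_hashes):
        # 1. Primary source: blobs heard over the blob_sidecar_{subnet_id} topics.
        blob = subnet_cache.get(versioned_hash)
        # 2. Fallback: check whether the EL mempool has it (engine_getBlobsV1).
        if blob is None:
            blob = engine_api.get_blobs_v1([versioned_hash])[0]
        if blob is not None:
            blobs[idx] = blob
    # 3. Anything still missing could be requested from peers via req/resp
    #    (blob_sidecars_by_root), but that round trip rarely fits before the deadline.
    if len(blobs) < len(block.blob_versioned_hashes):
        return None  # data not (yet) available; cannot attest to the block's validity
    return [blobs[i] for i in sorted(blobs)]
```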

Key point: The blob subnets are a push model instead of a pull. When Bob receives a blob over the subnet, he forwards it to his CL peers, even though they haven't explicitly asked for it.

The push model can lead to redundancy, where a node receives blobs multiple times, which we see empirically.

Ingress traffic over blob_sidecar topics. At an average of 153 kB/s (≈14 blobs per slot), we see about 4x more blobs than are included in each block.

A pull-based model would consume bandwidth more efficiently but at the cost of latency (given an extra round trip of control messages before the blob is transmitted). See Gossipsub v2.0 from Pop, Nishant, and Chirag, which aims to reduce this amplification.
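To see where the ~4x figure comes from, we can back out the blob-equivalents per slot from the measured ingress:

```python
BLOB_KB, SLOT_SECONDS = 128, 12

measured_kbps = 153  # average blob_sidecar ingress from the chart above
blob_equivalents = measured_kbps * SLOT_SECONDS / BLOB_KB
print(f"{blob_equivalents:.1f} blob-equivalents per slot")  # ~14.3
# With roughly 3-4 blobs actually included per block, that is ~4x gossip amplification.
```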

(2.3). The full pre-PeerDAS picture

The figure below highlights the timeline of events within a slot for both blob gossip and block validation.

Step-by-step annotation:

  1. The green (public) blobs arrive through the mempool (EL) at a uniform rate throughout the slot (they are patient and don't need to be strategic with timing).
  2. The block arrives after the slot begins but before the attestation deadline. It commits to some blobs, which may be private or public.
  3. The purple (private) blobs arrive through the consensus layer network along with the block (they are impatient and thus strategic with timing and propagation).

(3). Blob gossip and validation post-PeerDAS

PeerDAS, widely held to be the main priority for the Fulu/Osaka hardfork, changes how the protocol interacts with blobs. For the purposes of this post, the only piece of PeerDAS we need to cover is the columnar approach used by the CL to verify data availability. The figure below demonstrates this distinction.


For simplicity, let's assume that the quantity of data needed to perform the two tasks is approximately the same. In other words, PeerDAS does increase the number of blobs per block, but it doesn't significantly increase the amount of blob data downloaded by the CL because each validator only downloads a subset of each blob (e.g., with a 48-blob target, if each validator downloads 1/8 of every blob, then in aggregate they download six blobs – the Prague/Electra target – worth of data).
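The arithmetic behind that equivalence:

```python
BLOB_KB = 128

def sampled_download_kb(blobs_per_block: int, sample_fraction: float) -> float:
    """Per-validator download for availability checks when sampling a fraction of each blob."""
    return blobs_per_block * BLOB_KB * sample_fraction

# Sampling 1/8 of each of 48 blobs costs the same as downloading 6 full blobs
# (the Prague/Electra target): 768 kB per block either way.
assert sampled_download_kb(48, 1 / 8) == sampled_download_kb(6, 1.0) == 768
```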

With this setup, we can consider how the validator interactions with blobs change. We will reverse the order by examining the CL block validation rule before discussing the mempool and EL gossip.

(3.1). Block validation and blobs

On the CL side, the validators still have to determine block validity based on the availability of blobs. Instead of downloading the full set of blobs, they download a random subset of columns from each blob. The blob subnets described above are deprecated in favor of data column sidecar subnets (data_column_sidecar_{subnet_id}), the topics over which full columns are gossiped. Critically, there is no concept of partial columns; thus, each column depends on the complete set of blobs committed to by a block. To validate a block, the CL checks the result of the is_data_available function, which ensures that the node has access to its assigned columns for each blob in the block. As before, let's consider the three ways a node can retrieve its columns of data:

  1. over gossip on the data_column_sidecar subnets,
  2. from the blobs fetched from the EL with engine_getBlobsV1 (same as before, more on this below), or
  3. from the request/response domain with the new data_column_sidecars_by_root API.

Key point: Step (2) above returns the blobs themselves instead of just the columns the validator needs to check. To construct the entire column, the EL will only be helpful if it returns every blob in the block (meaning the block cannot have any private blobs).
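A sketch of why a single private blob spoils the whole column (compute_cell stands in for the erasure-coding step that extends each blob and slices out one cell; the real spec operates on extended blobs):

```python
def column_from_el_blobs(block_commitments, el_blobs, column_index):
    """Illustrative: build one data column from blobs returned via engine_getBlobsV1.
    A column holds one cell per blob in the block, so any blob the EL never saw
    (i.e., a private blob) makes the full column unconstructible from the mempool."""
    cells = []
    for commitment in block_commitments:
        blob = el_blobs.get(commitment)
        if blob is None:
            return None  # private blob: no partial columns, give up
        cells.append(compute_cell(blob, column_index))  # hypothetical helper
    return cells
```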

There are multiple problems with this. First (and most obviously), if the EL has the entire set of blobs for the block, then what was the point of the CL sampling only a subset of each blob? (More on this in the following section.) The CL could just confirm the data is available directly by fetching every blob from the EL. Secondly, if any of the blobs in the block is private, then the EL call won't aid in constructing the entire column (recall: no partial columns). So this is awkward. We sharded the CL blob validation, but by doing so, we eliminated the value of the public EL mempool for block validation (it is still potentially useful for block building, especially considering the patient, public mempool blobs described in Section 1). Consider the figure below as an example.


As before, the green blobs arrive uniformly throughout the slot (there are now 48), while the purple blobs are not gossiped until the block has been published. Now, the honest attester faces the following situation:


They need these full columns, but because of some private (purple) blobs in the block, the public mempool is insufficient for the full-column construction. Thus, if they don't receive the full column over gossip via data_column_sidecar, they cannot consider the block valid (as before, we assume that the request/response domain API isn't dependable for the critical path of block validation).

Note that the builder/proposer is highly incentivized to make sure the columns of data are distributed over the CL gossip because the validity of their block depends on it. It may be OK to just leave it to them to figure out how much to push the latency when distributing columns. This comes with some risk, as builders may push latency to the limit and leave low-bandwidth attesters unable to download the columns fast enough to attest properly. Even today, builders make difficult decisions about which blobs to include, especially during periods of high volatility – see discussion here. Even if we are happy to leave most of the blob distribution to the builders (albeit a big assumption), we still have the issue of blob gossip under a significant increase in blob throughput.

(3.2). Blob gossip changes

The reality remains that the EL blob gossip still receives all public blobs, eliminating the benefit of sampling only a subset of each blob on the CL. We calculated an average of 32 kB/s with three blobs; under a 48-blob regime, this would be 512 kB/s = (128 kB/blob * 48 blob/slot) / 12 s/slot. This is a significant increase and potentially too much bandwidth for many solo/small-scale operators. Thus, it is worth considering changes to gossip to alleviate this. To conclude this article, we will consider horizontal and vertical sharding of the EL mempool, each of which could be implemented with various levels of complexity and effectiveness; this list is by no means exhaustive, and each proposal is probably worthy of an entire article analyzing the tradeoffs. Instead, we aim to give a feel for the design space and defer the complete analysis and recommendations to future work.
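The same formula as before, with the larger blob count:

```python
BLOB_KB, SLOT_SECONDS = 128, 12

for blobs_per_slot in (3, 48):
    print(f"{blobs_per_slot:>2} blobs/slot -> {BLOB_KB * blobs_per_slot / SLOT_SECONDS:.0f} kB/s")
# 3 blobs/slot  ->  32 kB/s
# 48 blobs/slot -> 512 kB/s, a 16x jump in default mempool ingress
```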

(3.2.1). Horizontally shard the EL mempool

We use "horizontal sharding" to describe the process of the EL downloading only a subset of the total set of public blobs. It is horizontal because the messages are still complete blobs (instead of just some columns within the blob). There are several different approaches to achieve this; generally, these are easy to implement, but they don't resolve the issue of block validation requiring the full column (the same cell of every blob). Here are a few candidate ideas:

  1. Hardcode an ingress-byte limit. For example, the EL could limit the number of blobs they download to no more than 12 per slot. This rate limiting will mean that each node only receives a fraction of the total blobs (in particular, the first ones they heard announced).
  2. Only download a random subset of blobs. For example, based on some randomness, only download blobs sent from peers with a 0x01 prefix. This is another rate-limiting method that is not restricted to the first blobs the node hears about.
  3. Store only the highest-N blobs ordered by priority fee. For example, the EL could maintain a list of 12 blobs ordered by priority fee. When they hear a new blob announced (we would need to add a fee to the blob metadata), they decide to pull it if it has a higher fee than any transaction in their list (see the sketch after this list).
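As an illustration, here is a minimal sketch of option 3 (the class name and the 12-blob capacity are arbitrary, and the fee field on announcements would itself be a new protocol addition):

```python
import heapq

class TopNBlobPool:
    """Illustrative 'highest-N by priority fee' heuristic."""

    def __init__(self, capacity: int = 12):
        self.capacity = capacity
        self.heap: list[tuple[int, bytes]] = []  # min-heap of (priority_fee, tx_hash)

    def should_pull(self, announced_fee: int) -> bool:
        # Pull the blob if we have spare capacity or it beats our cheapest kept blob.
        return len(self.heap) < self.capacity or announced_fee > self.heap[0][0]

    def insert(self, priority_fee: int, tx_hash: bytes) -> None:
        if len(self.heap) < self.capacity:
            heapq.heappush(self.heap, (priority_fee, tx_hash))
        elif priority_fee > self.heap[0][0]:
            heapq.heapreplace(self.heap, (priority_fee, tx_hash))  # evict the cheapest
```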

Pros

  • Simple to implement. Most of these are local heuristics that can be implemented in the clients without a hard fork as they are out of protocol.
  • Effectively minimizes the bandwidth consumption. They do accomplish the stated goal of reducing the worst-case bandwidth consumption of the EL mempool.

Cons

  • Changes the model of the blob mempool today. As constructed today, nodes expect to (eventually) have very similar views of the mempool. This model breaks down if not every node expects to download every transaction. We don't know what the resulting fragmentation of mempool views would cause.
  • May cause/contribute to/hasten the death of the public mempool. The mempool guarantees worsen under a horizontal sharding mechanism because you no longer expect every node to hear about every blob transaction. Given we have a robust public mempool (recall that 80% of blobs are public, and we expect this to remain so long as the centralized sequencers continue to be the predominant blob consumers), making the mempool less effective for blobs certainly increases the odds that rollups go direct-to-builder to post blobs. (There are different views about the longevity of the public mempool as is, and we won't make a value judgment on that here.)

(3.2.2). Vertically shard the EL mempool

The CL uses vertical sharding (it only downloads a subset of columns of each blob), so if the EL could likewise download only the columns the CL needs, we would completely solve the issue. However, implementing this is not simple because of the potential DoS risk: without the full blob transaction, a node receiving a subset of columns cannot verify that they correspond to a valid, fee-paying blob transaction. Thus, the naïve approach of simply gossiping columns/cells instead of complete blob transactions is untenable. Turning our attention to DoS prevention, there are a few promising threads; a rough sketch of the idea they share follows the list below.

  1. Blob mempool tickets.[5] As proposed by Francesco in this article, blob mempool tickets create an explicit, in-protocol market for allocating write access to the blob mempool. As such, vertical sharding of the mempool no longer poses a DoS risk, because the (limited number of) tickets ensure that only authorized senders have access to the mempool. This gives strong guarantees on the upper bound of blobs flowing through the mempool at any time.
  2. Blob mempool reputation system. As proposed by Dankrad in this article, limiting write access to the mempool to nodes who have either successfully landed blobs or have burned a small amount of ETH can also mitigate this DoS risk. This doesn't give as strong of a bound on blobs in the mempool, but it may be simpler to implement.
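Both proposals share the same shape: only authorized senders may write to the (column-wise) mempool, so unauthenticated column spam cannot consume bandwidth. A purely illustrative sketch of that shared idea (field and function names are invented and do not correspond to either proposal's actual design):

```python
from dataclasses import dataclass

@dataclass
class CellAnnouncement:
    """Illustrative column-wise blob announcement; fields are invented for this sketch."""
    blob_commitment: bytes    # KZG commitment identifying the blob
    column_index: int         # which column's cells are on offer
    sender_credential: bytes  # ticket or reputation-backed proof of mempool write access

def should_pull_cells(ann: CellAnnouncement,
                      authorized: set[bytes],
                      my_columns: set[int]) -> bool:
    # Only pull columns we are assigned to custody, and only from authorized senders.
    return ann.sender_credential in authorized and ann.column_index in my_columns
```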

Pros

  • Minimizes bandwidth consumption and mirrors the CL. From first principles, it is clear that this is the "more correct" method as it mirrors the vertical sharding of the CL. Recall that the whole point of data sharding on the CL was eliminating the need for the nodes to download the full set of blobs. Sharding the EL horizontally and the CL vertically severely limits the utility of the EL mempool.
  • Preserves the public mempool. There is a clear path to preserving the public mempool as the default path for patient blobs to utilize by explicitly resolving the bandwidth concern.
  • Blob tickets may serve as a blob inclusion list mechanism. (more speculative) Because the protocol is explicitly aware of who has access to send blobs over the mempool, the attesters could also be employed to enforce the inclusion of timely blobs (similar to FOCIL).

Cons

  • Complex to implement. Clearly, either of these systems is much harder to implement than any horizontal sharding option listed above; the engineering overhead is significant (especially if the fork-choice rule is modified).
  • Complex to reason about economically. Beyond just the engineering challenge, a suite of economic questions accompany these proposals. How do we price the tickets? How will rollups strategize about purchasing mempool access versus going direct-to-builder? What heuristics would be necessary for a robust reputation system? The design space here is vast, and it may be the case that simple rules work, but it is not obvious.

(4). Summary and conclusion

We covered a lot. In summary:

  1. L2 transactions have a variety of confirmation rules, one of which is inclusion in an L1 blob. The L1 blob confirmation provides very different utility depending on the sequencing model of the L2. These confirmations also have implications for the properties of the L1 blob transactions.
    • 1.1. Blobs generated by L2s with centralized sequencers are neither MEV-carrying nor particularly latency-sensitive. This makes them likely candidates for the public mempool, and 80% of today's blobs fit this model.
    • 1.2. Blobs generated by L2s with permissionless sequencers (e.g., in Taiko's total anarchy) are MEV-carrying and will compete for L1 inclusion. This paradigm leads to private blob flow, and 20% of today's blobs follow this path (exclusively Taiko's blobs).
    • 1.3. In a based/native rollup where the L1 validators issue preconfirmations, the blobs containing the L2 transactions are likely to be latency-sensitive to some extent, depending on the construction of the L2. We don't spend much time here because these rollups don't yet exist.
  2. We examine how blobs are handled by the protocol today by considering the EL mempool and the CL blob validation.
    • 2.1. The EL downloads blobs into the mempool using a pull-based model. Each node is expected to eventually have the same view of the public set of blobs. This blob download is spread evenly over the slot.
    • 2.2. The CL must determine blob availability as part of the block validation logic. The main venue for hearing about blobs on the CL is from the CL gossip over the blob subnets, but they can also check their EL mempool for blobs that a block references. The CL only has four seconds to check that they have access to all blobs before attesting to the block.
  3. PeerDAS changes how the protocol interacts with blobs. Specifically, the CL now only downloads a subset of each blob instead of the entire thing.
    • 3.1. Data columns and the beacon block are gossiped on the CL (full blobs are no longer gossiped over the blob subnets). As a result, any blobs that are gossiped on the EL are only helpful in constructing the columns if there are no private blobs, which seems unlikely. By default, the EL mempool will try to download all public blobs. This eliminates the benefit of the CL only needing to observe some subset of the total blob data.
    • 3.2. We need to address this asymmetry by sharding the EL mempool somehow (or acknowledging that the public mempool is not viable in the long term).
      • 3.2.1. Horizontal (blob-wise) sharding of the mempool is easiest but has significant drawbacks that may limit the value of the public mempool.
      • 3.2.2. Vertical (column-wise) sharding of the mempool aligns with the vertical sharding of blobs done in the CL. However, it is more difficult to implement and requires serious anti-DoS mechanisms.

▓▒░ made with ♥ and markdown. thanks for reading! –mike ░▒▓


  1. Of course, the blob posting is essential for fraud proofs and forced exits, two critical features of L2s. We emphasize the confirmation rule aspect because the default path for L2 transactions will be to treat the sequencer confirmation as final. ↩︎

  2. The user could further wait for the finality of the L1 block as a third confirmation. ↩︎

  3. Terence mentioned some other reasons centralized sequencer L2s might be time-sensitive, but there is still a long time window during which the blob can be posted without affecting rollup operations meaningfully. ↩︎

  4. Many technical details within the CL gossip can impact the probability of a blob coming over the subnet. Control messages like IHAVE, IWANT, IDONTWANT signal to your peers what data you have and need. For this document, we elide these details. ↩︎

  5. This article was originally going to be about the market design of blob mempool tickets but here we are, lol. ↩︎