
Mempool DAS with blob tickets

Introduction

What

The core ideas are:

  • Mempool sampling: using a vertically sharded mempool, where nodes download all blob txs but only samples of the associated blobs, with indices chosen the same way as for column sampling. Column subnets would be deprecated and entirely replaced by mempool propagation.
  • Blob tickets: combining the allocation of blob inclusion rights with the allocation of mempool write access, which is anyway necessary for inclusion, since blobs can only be sampled through the mempool. The allocation is done in advance, by auctioning blob tickets for a specific slot.

Why

  • It's a relatively simple way to implement the necessary restrictions on a vertically sharded mempool, with the added benefit that the mempool's throughput is capped at exactly the desired blob throughput.
  • Column sampling happens already in the mempool, blob by blob, with much more relaxed timelines:
    • Peak bandwidth is not as much of a constraint
    • We can more safely afford to use pull-based gossip (like the blob mempool already does) due to the relaxed timeline. This reduces redundancy, because each blob is sent and received only once per node on average.
    • No duplication of bandwidth between CL and EL, only mempool propagation.
  • No need for blob ILs: each consumed blob ticket acts as a blob IL already. Since there's only a small number of these, there's no need for validators to package them into ILs, and we can instead rely on direct mempool observation (though we still need attesters that don't run a full mempool to have some kind of availability signal, like a committee's vote. See here).

Blob tickets

Usage

A blob ticket for slot N gives you two things:

  • Mempool right: you can send a blob in the blob mempool in slot N-1, including an associated blob transaction. We discuss this in detail later
  • Inclusion right: you can get a blob included onchain in slot N. Meaning, in order to get a blob tx with k blobs included in slot N, you must have k blob tickets for slot N. No blob basefee is paid however, because the fee paid in the auction substitutes it. The only cost paid at execution time is the regular gas cost.

Mempool

For now let's not worry about how blob tickets are allocated, and let's just focus on how they are used to build a blob mempool. The blob mempool lives entirely on the CL layer p2p network, and consists of a single blob_transaction_envelope global topic used for sharing blob transactions, and MAX_BLOBS_PER_BLOCK cell_envelope topics used for sharing the blob samples associated with them, together with proofs.

Sending messages in these topics at slot N requires holding a ticket for slot N+1. This is enforced by having messages include a ticket_index and be signed with blob_ticket.pubkey, where blob_ticket is the ticket with index ticket_index for slot N+1. As we see later when discussing the auction, the BeaconState keeps a record of the blob tickets that have been sold, in state.blob_tickets. In particular, state.blob_tickets[current_slot % SLOTS_PER_EPOCH] records the tickets for current_slot. Since the tickets used to access the mempool during slot N are those for slot N+1, we use the tickets recorded in state.blob_tickets[(current_slot + 1) % SLOTS_PER_EPOCH].
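As a minimal sketch of this indexing (the helper names are illustrative and not part of any spec; BlobTicket and BeaconState are the containers introduced in the auction section below):

def get_mempool_tickets(state: BeaconState, current_slot: Slot) -> Sequence[BlobTicket]:
    # Tickets gating mempool writes during current_slot are those sold for current_slot + 1
    return state.blob_tickets[(current_slot + 1) % SLOTS_PER_EPOCH]

def get_inclusion_tickets(state: BeaconState, slot: Slot) -> Sequence[BlobTicket]:
    # Tickets redeemable for blob inclusion in the block at slot
    return state.blob_tickets[slot % SLOTS_PER_EPOCH]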

Blob transaction topic

blob_transaction_envelope

class BlobTransactionEnvelope(Container):
    transaction: Transaction
    kzg_commitments: List[KZGCommitment, MAX_BLOBS_PER_TRANSACTION]
    slot: Slot
    ticket_indices: List[uint8, MAX_BLOBS_PER_TRANSACTION]  # one ticket index per blob
    
class SignedBlobTransactionEnvelope(Container):
    message: BlobTransactionEnvelope
    signature: BLSSignature

Note: we could remove kzg_commitments from BlobTransactionEnvelope if we had ssz-encoded transactions, because we could just access the versioned hashes and check them against the kzg commitments in the cells. Without that, we also have to give the EL the versioned hashes corresponding to kzg_commitments and have it check that they correspond to the ones in transaction.

Gossip checks

For each ticket_index in ticket_indices, let blob_ticket = state.blob_tickets[(slot + 1) % SLOTS_PER_EPOCH][ticket_index]. Before forwarding, we essentially just check that the message is correctly signed by the holder of unused tickets (a minimal sketch follows the list):

  • All tickets corresponding to ticket_indices for slot must be held by the same pubkey
  • signature is a valid signature of message by blob_ticket.pubkey
  • this is the first valid message received for each of the ticket_indices
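A minimal sketch of these checks (bls.Verify over hash_tree_root and the seen_ticket_indices cache are illustrative simplifications; a real spec would sign over a signing root with a domain and track seen messages per topic):

def validate_blob_transaction_envelope(state: BeaconState,
                                       signed: SignedBlobTransactionEnvelope,
                                       seen_ticket_indices: Set[uint8]) -> bool:
    envelope = signed.message
    tickets = state.blob_tickets[(envelope.slot + 1) % SLOTS_PER_EPOCH]
    referenced = [tickets[index] for index in envelope.ticket_indices]
    # All referenced tickets must be held by the same pubkey
    if any(ticket.pubkey != referenced[0].pubkey for ticket in referenced):
        return False
    # The message must be signed by the ticket holder
    if not bls.Verify(referenced[0].pubkey, hash_tree_root(envelope), signed.signature):
        return False
    # Only the first valid message per ticket index is forwarded
    if any(index in seen_ticket_indices for index in envelope.ticket_indices):
        return False
    return True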

Sample topics

cell_envelope_{subnet_id}

class CellEnvelope(Container):
    cell_index: CellIndex
    cell: Cell
    kzg_commitment: KZGCommitment
    kzg_proof: KZGProof
    slot: Slot
    ticket_index: uint8    
    
class SignedCellEnvelope(Container):
    message: CellEnvelope
    signature: BLSSignature

Note: with NUMBER_OF_COLUMNS = 256, a Cell is exactly 1 KB, while a SignedCellEnvelope is ~1.2 KB, i.e. ~20% overhead: the commitment (48 bytes), proof (48 bytes) and signature (96 bytes), plus indices and slot, add a fixed ~200 bytes per cell. Since this fixed cost is per cell while cells shrink as the column count grows, the relative overhead increases linearly with NUMBER_OF_COLUMNS.

Gossip checks

Let blob_ticket = state.blob_tickets[(slot + 1) % SLOTS_PER_EPOCH][ticket_index]. Again, before forwarding we check that the message is correctly signed by the holder of an unused ticket (a sketch follows the list):

  • compute_subnet_for_cell_index(cell_index) == subnet_id
  • signature is a valid signature of message by blob_ticket.pubkey
  • this is the first valid message received for ticket_index
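A corresponding sketch for the cell topics, with the same illustrative simplifications:

def validate_cell_envelope(state: BeaconState,
                           signed: SignedCellEnvelope,
                           subnet_id: uint64,
                           seen_ticket_indices: Set[uint8]) -> bool:
    envelope = signed.message
    # The cell must be published on the subnet its index maps to
    if compute_subnet_for_cell_index(envelope.cell_index) != subnet_id:
        return False
    blob_ticket = state.blob_tickets[(envelope.slot + 1) % SLOTS_PER_EPOCH][envelope.ticket_index]
    # The message must be signed by the ticket holder
    if not bls.Verify(blob_ticket.pubkey, hash_tree_root(envelope), signed.signature):
        return False
    # Only the first valid message for this ticket_index on this topic is forwarded
    if envelope.ticket_index in seen_ticket_indices:
        return False
    return True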
Validity

When gossiping, we do not require the following check, which is however required in order for a CellEnvelope to be valid (together with the gossip checks):

  • verify that cell is an opening of kzg_commitment at cell_index, through kzg_proof

The reason we do not do it is that verifying a single cell, though reasonably cheap (~3ms), is much less efficient than batched verification (~16ms for a whole blob for example). For the purpose of preventing DoS, checking the signature is enough. The full verification can be done later, once all cells for a certain blob have been retrieved (or anyway, a client is free to schedule this as it pleases). The cells are ultimately only considered valid once the proof is verified.
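For instance, once all sampled cells of a blob are in hand, the deferred check could look like the following sketch, assuming a batch verifier with the shape of verify_cell_kzg_proof_batch from the PeerDAS crypto APIs (the exact signature is an assumption here):

def verify_blob_cells(envelopes: Sequence[CellEnvelope]) -> bool:
    # Batch-verify all retrieved cells of one blob at once, which is much cheaper
    # than verifying each cell proof individually
    return verify_cell_kzg_proof_batch(
        [envelope.kzg_commitment for envelope in envelopes],  # commitments
        [envelope.cell_index for envelope in envelopes],      # cell indices
        [envelope.cell for envelope in envelopes],            # cells
        [envelope.kzg_proof for envelope in envelopes],       # proofs
    )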

Note: since verifying the signature is "only" ~2-3x faster than verifying a cell proof, an alternative design could be to not sign the CellEnvelope (which saves 96 bytes, ~10% of the envelope) and to instead verify the proof immediately, without waiting for batch verification. However, a node would then only verify and forward a CellEnvelope if it knows a corresponding BlobTransactionEnvelope, because that's the only place where signature verification (i.e. gating of the mempool by tickets) would happen in this design.

Engine API

class NewBlobTransactionRequest(object):
    transaction: Transaction
    versioned_hashes: Sequence[VersionedHash]

The CL sends a NewBlobTransactionRequest to the EL when:

  • BlobTransactionEnvelope is valid
  • Valid CellEnvelopes are available for all cell indices we are sampling

The request is:

NewBlobTransactionRequest(
    transaction=blob_transaction_envelope.transaction,
    versioned_hashes=[kzg_commitment_to_versioned_hash(kzg_commitment)
                      for kzg_commitment in blob_transaction_envelope.kzg_commitments]
)

The EL validates it by checking that transaction is valid and that versioned_hashes == transaction.blob_versioned_hashes.
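A sketch of this EL-side check, where decode_transaction and validate_transaction stand in for the EL's native transaction decoding and mempool validation (illustrative names, not an actual Engine API):

def handle_new_blob_transaction_request(request: NewBlobTransactionRequest) -> bool:
    tx = decode_transaction(request.transaction)
    # The transaction must pass the usual mempool validity rules (nonce, balance, fees, ...)
    if not validate_transaction(tx):
        return False
    # The versioned hashes derived from the CL's kzg_commitments must match the ones
    # committed to in the blob transaction itself
    return list(request.versioned_hashes) == list(tx.blob_versioned_hashes)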

Redeeming a ticket - blob inclusion

In order for the execution payload of slot to contain a blob tx with k KZG commitments, tx.sender must hold k blob tickets for slot, recorded in state.blob_tickets[slot % SLOTS_PER_EPOCH]. If the CL could decode transactions (for example if ssz was used on both layers), we could directly verify that this is the case for all blob txs by accessing tx.sender. Without that, we could add blob_sender_addresses: List[ExecutionAddress, MAX_BLOBS_PER_BLOCK] to the ExecutionPayload, listing a sender address for each kzg commitment in the BeaconBlock. Then we can just check that each address in blob_sender_addresses is matched by a winning ticket.

ticket_addresses = [ticket.address for ticket in state.blob_tickets[slot % SLOTS_PER_EPOCH]]
for address in set(execution_payload.blob_sender_addresses):
    # Each sender can include at most as many blobs as it holds tickets for this slot
    assert execution_payload.blob_sender_addresses.count(address) <= ticket_addresses.count(address)

Auction

The design space for how to allocate the tickets is very large. This is just one approach, as an example of what's possible. As another very different example, we could have 128 "permanent" blob tickets with an associated Harberger tax.

At a high level, the CL runs SLOTS_PER_EPOCH parallel auctions, for slots current_slot + SLOTS_PER_EPOCH + [0, SLOTS_PER_EPOCH-1]. One such auction is settled each slot, in particular the one for current_slot + SLOTS_PER_EPOCH - 1, which is the earliest open one, having been open for SLOTS_PER_EPOCH slots. Bids for these auctions are sent to an EL contract and communicated to the CL, and the MAX_BLOBS_PER_BLOCK best bids for a slot are awarded a blob ticket for it when its auction is settled.

The CL also keeps track of the outcome of the last SLOTS_PER_EPOCH settled auctions, for slots current_slot + [0, SLOTS_PER_EPOCH-1] (as mentioned above, the earliest still open auction is for slot current_slot + SLOTS_PER_EPOCH). There's no reason to keep track of blob tickets for slots before current_slot, because they cannot be used anymore, neither for the mempool nor for inclusion.
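Concretely, the two windows tracked at current_slot are as follows (helper name illustrative):

def auction_windows(current_slot: Slot) -> Tuple[Sequence[Slot], Sequence[Slot]]:
    # Auctions still open, one settled per slot, bids recorded in blob_ticket_auctions
    open_auction_slots = [current_slot + SLOTS_PER_EPOCH + i for i in range(SLOTS_PER_EPOCH)]
    # Auctions already settled, tickets still usable, recorded in blob_tickets
    settled_ticket_slots = [current_slot + i for i in range(SLOTS_PER_EPOCH)]
    return open_auction_slots, settled_ticket_slots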


Bidding for tickets (EL)

We create an EL contract that receives bids in the form of calls with input [bid, quantity, slot, pubkey, address], and passes them to the CL, in the form of a BlobTicketRequest.

class BlobTicketRequest(Container):
    bid: Gwei
    quantity: uint8            # number of tickets requested
    slot: Slot                 # slot the tickets are for
    pubkey: BLSPubkey          # key that will sign mempool messages
    address: ExecutionAddress  # address that can redeem the tickets for inclusion

The EL does not verify anything, and just passes the message to the CL, but having the bid process on the EL makes it by default spam-resistant and censorship-resistant.

Running the ticket auction (CL)

These are introduced in the Beacon Chain:

class BlobTicket(Container):
    pubkey: BLSPubkey          # key allowed to sign mempool messages with this ticket
    address: ExecutionAddress  # address allowed to redeem this ticket for blob inclusion

class BeaconState(Container):
    ...
    blob_ticket_auctions: Vector[List[BlobTicketRequest, MAX_BLOBS_PER_BLOCK], SLOTS_PER_EPOCH]  # bids in the open auctions
    blob_tickets: Vector[List[BlobTicket, MAX_BLOBS_PER_BLOCK], SLOTS_PER_EPOCH]  # winning tickets of settled auctions

Overall, blob_ticket_auctions holds all bids for the SLOTS_PER_EPOCH parallel auctions, each auction represented by one list. In particular, blob_ticket_auctions[i] holds the MAX_BLOBS_PER_BLOCK highest bids (if there are that many) for the unique slot such that slot % SLOTS_PER_EPOCH == i in the current auction period, current_slot + SLOTS_PER_EPOCH + [0, SLOTS_PER_EPOCH-1].

blob_tickets holds the blob tickets for auctions that have already been settled and whose slot is yet to come. In particular, blob_tickets[i] holds the winning blob tickets for the slot in current_slot + [0, SLOTS_PER_EPOCH-1] such that slot % SLOTS_PER_EPOCH == i. Therefore, blob_tickets[current_slot % SLOTS_PER_EPOCH] holds the winning tickets for current_slot, which we use to determine whether the blobs included in the current execution payload come with corresponding blob tickets owned by their senders. On the other hand, blob_tickets[(current_slot + 1) % SLOTS_PER_EPOCH] holds the winning tickets for current_slot + 1, which we use to determine who can send blob txs in the mempool during current_slot.

  • process_blob_ticket_request:
    • Verifies that current_slot + SLOTS_PER_EPOCH <= blob_ticket_request.slot < current_slot + 2*SLOTS_PER_EPOCH. In other words, you can only bid for a slot which is between 32 and 64 slots in the future, because the auctions for slots less than 32 slots in the future are already settled, and those for 64 or more slots in the future are not yet open.
    • Adds blob_ticket_request to blob_ticket_auctions[blob_ticket_request.slot % SLOTS_PER_EPOCH] as long as either there's still space or blob_ticket_request.bid is higher than the lowest currently stored bid, in which case it replaces it.
  • The slot transition (process_slot) to slot:
    • Concludes the auction for s = slot + SLOTS_PER_EPOCH - 1, by processing blob_ticket_auctions[s % SLOTS_PER_EPOCH], after which it resets it to []
    • When processing blob_ticket_auctions[s % SLOTS_PER_EPOCH], it assigns blob tickets to the MAX_BLOBS_PER_BLOCK highest bids (a bid with quantity > 1 counts as multiple identical bids), by putting them in blob_tickets[s % SLOTS_PER_EPOCH]. A sketch of both steps follows below.
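A minimal sketch of both steps, omitting payments, refunds and tie-breaking, which are not pinned down above:

def process_blob_ticket_request(state: BeaconState, current_slot: Slot,
                                request: BlobTicketRequest) -> None:
    # Only the auctions between SLOTS_PER_EPOCH and 2*SLOTS_PER_EPOCH slots ahead are open
    assert current_slot + SLOTS_PER_EPOCH <= request.slot < current_slot + 2 * SLOTS_PER_EPOCH
    auction = state.blob_ticket_auctions[request.slot % SLOTS_PER_EPOCH]
    if len(auction) < MAX_BLOBS_PER_BLOCK:
        auction.append(request)
    else:
        # Replace the lowest stored bid if the new bid is higher
        lowest = min(range(len(auction)), key=lambda i: auction[i].bid)
        if request.bid > auction[lowest].bid:
            auction[lowest] = request

def settle_blob_ticket_auction(state: BeaconState, slot: Slot) -> None:
    # Called during the slot transition to `slot`: settle the auction for slot s
    s = slot + SLOTS_PER_EPOCH - 1
    bids = state.blob_ticket_auctions[s % SLOTS_PER_EPOCH]
    # A bid with quantity > 1 counts as that many identical bids
    expanded = [bid for bid in bids for _ in range(bid.quantity)]
    winners = sorted(expanded, key=lambda bid: bid.bid, reverse=True)[:MAX_BLOBS_PER_BLOCK]
    state.blob_tickets[s % SLOTS_PER_EPOCH] = [
        BlobTicket(pubkey=bid.pubkey, address=bid.address) for bid in winners
    ]
    state.blob_ticket_auctions[s % SLOTS_PER_EPOCH] = []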