
Blob Sidecar handling in Prysm

Background

In EIP-4844, BlobsSidecar was introduced to accept "blobs" of data to be persisted in the beacon node for a period of time. These blobs bring rollup fees down by an order of magnitude and enable Ethereum to remain competitive without sacrificing decentralization. Blobs are pruned after ~1 month, which is long enough for all actors of an L2 to retrieve them. The blobs are persisted in beacon nodes, not in the execution engine. Alongside these blobs, BeaconBlockBody contains a new field, blob_kzg_commitments. It's important that these KZG commitments match the blob contents.

Note that on the CL, the blobs are only referenced in the BeaconBlockBody, not encoded in it. Instead of embedding the full contents, the contents are propagated separately in the BlobsSidecar described above. The CL must do these three things correctly:

  • Beacon chain: process updated beacon blocks and ensure blobs are available
  • P2P network: gossip and sync updated beacon block types and new blobs sidecars
  • Honest validator: produce beacon blocks with blobs, publish the blobs sidecars

In this doc, we explore the spec dependencies between the loosely coupled BlobsSidecar and BeaconBlockBody, and the implementation complexity that this loose coupling introduces.

Beacon chain processing

As part of block processing, a new function process_blob_kzg_commitments is added. It ensures that the KZG commitments in the BeaconBlockBody match the versioned hashes in the SignedBlobTransactions in the block's transaction list. Given raw transactions, a node should be able to peek inside for the blob versioned hashes using the field offset. BlobsSidecar is not required for this validation, so the beacon node can process the block "optimistically".
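The commitment-to-hash check can be sketched as follows. This is a minimal illustration, not Prysm's implementation: it assumes the versioned hashes have already been extracted from the raw transactions, and it assumes the EIP-4844 convention that a versioned hash is the SHA-256 of the commitment with its first byte replaced by the version byte 0x01. The function name commitments_match_tx_hashes is hypothetical.

```python
import hashlib
from typing import Sequence

# Version byte assumed from the EIP-4844 spec (BLOB_COMMITMENT_VERSION_KZG).
BLOB_COMMITMENT_VERSION_KZG = b"\x01"


def kzg_commitment_to_versioned_hash(commitment: bytes) -> bytes:
    """Replace the first byte of sha256(commitment) with the version byte."""
    return BLOB_COMMITMENT_VERSION_KZG + hashlib.sha256(commitment).digest()[1:]


def commitments_match_tx_hashes(
    commitments: Sequence[bytes], tx_versioned_hashes: Sequence[bytes]
) -> bool:
    """Hypothetical helper: True if the block body's commitments map, in order,
    onto the versioned hashes collected from the blob transactions."""
    expected = [kzg_commitment_to_versioned_hash(c) for c in commitments]
    return expected == list(tx_versioned_hashes)
```

Note that nothing here touches the blobs themselves, which is why the block can be processed before its sidecar arrives.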

P2P network processing

We'll look at beacon_block and blobs_sidecar gossip topics separately.

In blobs_sidecar, the node will reject the sidecar if the blobs are malformed (i.e. a BLSFieldElement is outside the valid range) or if the KZG proof is not a correctly encoded compressed BLS G1 point. It will also reject if the sidecar signature is invalid. Before accepting the blobs, the last step is to run validate_blobs_sidecar, which verifies the aggregated proof within the sidecar object.
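The field-element range check is the cheapest of these conditions and can be sketched as below. This is only an illustration under assumptions: blobs are represented as lists of already-decoded Python integers, and the helper names blob_is_well_formed and blobs_are_well_formed are hypothetical. The G1 encoding, signature, and aggregated proof checks need a BLS/KZG library and are not shown.

```python
from typing import Sequence

# Order of the BLS12-381 scalar field; a valid field element is strictly below it.
BLS_MODULUS = 0x73EDA753299D7D483339D80809A1D80553BDA402FFFE5BFEFFFFFFFF00000001


def blob_is_well_formed(blob: Sequence[int]) -> bool:
    """Hypothetical helper: every element of the blob is a canonical field element."""
    return all(0 <= element < BLS_MODULUS for element in blob)


def blobs_are_well_formed(blobs: Sequence[Sequence[int]]) -> bool:
    """Hypothetical helper: apply the range check to every blob in a sidecar."""
    return all(blob_is_well_formed(blob) for blob in blobs)
```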

In beacon_block, the node will look at blob_kzg_commitments in the BeaconBlockBody and reject the block if the KZG commitments are not correctly encoded compressed BLS G1 points, or if the commitments don't correspond to the versioned hashes in the transaction list.

Forkchoice processing

As we can see from the Beacon chain and P2P network processing, either object can be processed independently; there's no strict dependency between them. This changes in fork choice land. Today, client implementations use the forkchoice store to cache useful statuses such as VALID, OPTIMISTIC, INVALID, etc. With EIP-4844, it will also be useful for the forkchoice store to mark whether a block (i.e. a node, in protoarray terms) has its data available (i.e. has a valid sidecar).

is_data_available is an important piece of information and is meant to change with later sharding upgrades. It retrieves the BlobsSidecar matching a given blob_kzg_commitments and validates that the sidecar is sound. It's important to note that a block cannot be VALID until is_data_available returns true. Until then, the block should only be treated as OPTIMISTIC. Given these constraints, we analyze two scenarios:

  1. Block is received before sidecar

This is the happy case. The block (blue) is received before the sidecar (red). The block arrives in the p2p service, where the node validates that it passes the p2p conditions before gossiping it to its peers and sending it to the blockchain service. There, the node validates the block according to the consensus rules, then sends it to the forkchoice store and to the DB for persistent storage.
Once the block has been cached in the forkchoice store, the node can validate the sidecar and then send it to the forkchoice store to mark the block as having a valid sidecar and as VALID, before storing the sidecar in the DB for persistent storage. The total completion time depends on the time gap between the two objects, which should be minimized for performance.

  2. Sidecar is received before block

This is the tricky case. The sidecar (grey) is received before the block (blue). In this case, we store the sidecar in a pending queue until the corresponding block is received, processed, and stored in the forkchoice store. The total completion time depends on when the block arrives.

Keeping the sidecar and block loosely coupled increases the complexity of the scenario above.
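Both scenarios can be sketched with a toy forkchoice store. This is a simplified illustration, not Prysm's data structures: the Store and Node classes and their method names are hypothetical, sidecars are keyed by the beacon block root they reference, and every sidecar passed in is assumed to have already passed validate_blobs_sidecar. It also assumes the block's execution payload is valid, so data availability is the only thing holding the block at OPTIMISTIC.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Node:
    """A block as tracked by the (hypothetical) forkchoice store."""
    root: bytes
    status: str = "OPTIMISTIC"   # stays OPTIMISTIC until data is available
    data_available: bool = False


@dataclass
class Store:
    nodes: Dict[bytes, Node] = field(default_factory=dict)
    # Sidecars that arrived before their block, keyed by beacon block root.
    pending_sidecars: Dict[bytes, Any] = field(default_factory=dict)

    def on_block(self, root: bytes) -> None:
        """Insert a processed block; apply a sidecar that was waiting for it."""
        self.nodes[root] = Node(root)
        if self.pending_sidecars.pop(root, None) is not None:
            self._mark_available(root)   # scenario 2: sidecar came first

    def on_sidecar(self, block_root: bytes, sidecar: Any) -> None:
        """Apply a validated sidecar, or queue it if the block is unknown."""
        if block_root in self.nodes:
            self._mark_available(block_root)   # scenario 1: block came first
        else:
            self.pending_sidecars[block_root] = sidecar

    def _mark_available(self, root: bytes) -> None:
        node = self.nodes[root]
        node.data_available = True
        node.status = "VALID"
```

The pending queue is the extra piece of state that loose coupling requires. A real implementation would also need to bound and prune it, since a sidecar whose block never arrives would otherwise sit there indefinitely.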

Further considerations

  • With checkpoint sync, a node will likely have to request BeaconBlocksByRange for backtracking. In a tightly coupled world, we wouldn't need to call BlobsSidecarsByRange separately.
  • Anything else?