TL;DR
We need more blobs today, and even more tomorrow.
Can the current form of PeerDAS scale to the theoretical limit of 1D PeerDAS - 64 to 72 blobs max? The short answer: unlikely.
There are some known bottlenecks, and we're not even sure if scaling to 32 blobs is safe (more on this below). If we start with a conservative increase in Fusaka - say 16-18 blobs max - we'd have to wait for the G fork (2026-2027) to increase it again. That's likely too late and reduces the value of pushing PeerDAS now.
(unless we have BPO forks to allow us to change blob count between forks)
In this post, we'll explore a solution that enables us to scale the blob count further (possibly to the theoretical limit of 1D PeerDAS) and more safely, and present a case for shipping it in the Fusaka fork. This solution was discussed during Devcon, and Francesco advocated including it with PeerDAS.
Today on mainnet (Deneb), sending a blob transaction (EIP-4844) requires the transaction sender to include a computed KZG commitment and proof as part of the raw transaction. The commitment ensures fake blob data cannot be substituted, and the blob proofs are validated both in the execution layer (EL) when the transaction enters the mempool, and in the consensus layer (CL) when blobs are transmitted across the network.
From the PeerDAS upgrade, the network form of blob data changes in the consensus layer - instead of `Blobs`, blob data will be sent in the form of `DataColumns` that comprise `Cells`, with a proof accompanying each respective cell[1], and nodes would use these proofs (instead of blob KZG proofs from Deneb) to verify all transmitted cells against the KZG commitments.
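To make the new network form concrete, here is a minimal Python sketch of how the cells of several blobs line up into columns. It assumes the PeerDAS parameters of 128 cells per extended blob, 2048 bytes per cell and a 48-byte proof per cell; the `Column` class and `build_columns` function are illustrative names only and are not taken from the spec or any client.

```python
from dataclasses import dataclass
from typing import List

CELLS_PER_EXT_BLOB = 128   # assumed: cells in one extended blob (= number of columns)
BYTES_PER_CELL = 2048      # assumed: 64 field elements * 32 bytes
BYTES_PER_PROOF = 48       # a KZG proof is one compressed G1 point


@dataclass
class Column:
    """Illustrative stand-in for a data column: cell `index` of every blob."""
    index: int
    cells: List[bytes]
    proofs: List[bytes]


def build_columns(cells_per_blob: List[List[bytes]],
                  proofs_per_blob: List[List[bytes]]) -> List[Column]:
    """Transpose a blob-major matrix of cells (and proofs) into columns."""
    for cells, proofs in zip(cells_per_blob, proofs_per_blob):
        assert len(cells) == len(proofs) == CELLS_PER_EXT_BLOB
    return [
        Column(
            index=j,
            cells=[cells[j] for cells in cells_per_blob],
            proofs=[proofs[j] for proofs in proofs_per_blob],
        )
        for j in range(CELLS_PER_EXT_BLOB)
    ]


def _dummy(blob: int, cell: int, size: int) -> bytes:
    """Placeholder bytes tagged with (blob, cell) so the layout is visible."""
    return bytes([blob, cell]) + bytes(size - 2)


# Dummy data for 3 blobs; real cells and proofs would come from a KZG library.
cells = [[_dummy(b, c, BYTES_PER_CELL) for c in range(CELLS_PER_EXT_BLOB)] for b in range(3)]
proofs = [[_dummy(b, c, BYTES_PER_PROOF) for c in range(CELLS_PER_EXT_BLOB)] for b in range(3)]
columns = build_columns(cells, proofs)
```

Each column then carries one cell and one proof per blob, which is why a node can verify the cells it receives without holding the full blobs.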
In the current specification, these cell KZG proofs are computed by the proposer during block production. Computing these proofs is quite expensive: ~150ms for each blob on a single thread, although it can be parallelised. This has been a known bottleneck for a while, and there have been new optimisations to bring this number down, including optimising KZG library performance and distributed blob building.
These optimisations will allow us to safely scale to some limited extent without increasing full node hardware and bandwidth requirements. However, the proof computation time grows with the blob count, and regardless of who computes the proofs (the proposer or another more powerful node), someone has to compute them in the 4 second block proposal critical path. This limits how far we can scale with PeerDAS in its current form, and going for the theoretical limit of 64-72 blobs may be a bit risky.
The table below shows the estimated proof computation time, based on benchmarks, for various blob counts and CPU thread counts. Note that we use "available CPU threads" here because the node would also be spending cycles on other tasks within the CL, and potentially on an EL and other processes running on the same machine.
Blob Count \ Available CPU Threads | 8 | 16 | 32 |
---|---|---|---|
16 | 300ms | 150ms | 150ms |
32 | 600ms | 300ms | 150ms |
64 | 1200ms | 600ms | 300ms |
72 | 1350ms | 750ms | 450ms |
As you can see above, in the best case for 32 blobs, a powerful node (with 16 available threads) that contributes to proof building would still take 300ms to compute the proofs, and it also has to publish them to its mesh peers.
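The figures in the table are consistent with a simple model: each thread handles one blob at a time at ~150ms per blob, so blobs are processed in rounds of as many blobs as there are available threads. The sketch below reproduces the table under that assumption; it is one reading of the numbers, not necessarily how the benchmarks were derived, and the function name is made up for illustration.

```python
import math

PROOF_TIME_PER_BLOB_MS = 150  # single-thread estimate quoted in the post


def estimated_proof_time_ms(blob_count: int, threads: int) -> int:
    """Blobs are processed in rounds, one blob per available thread per round."""
    rounds = math.ceil(blob_count / threads)
    return rounds * PROOF_TIME_PER_BLOB_MS


for blobs in (16, 32, 64, 72):
    row = [estimated_proof_time_ms(blobs, t) for t in (8, 16, 32)]
    print(blobs, row)
```

The rounding up explains why 72 blobs on 16 threads costs 750ms rather than 675ms: the last 8 blobs still need a full round.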
There are a few possible solutions that would allow scaling blobs to a higher degree:
The main downside of Option 1 is complexity and the substantial changes required to the existing PeerDAS implementation.
Option 2 was discussed during the R&D workshop at Devcon 2024 and in multiple CL breakout calls afterward, with agreement from all participants. As a next step, we'd like more eyes on this proposal - especially from the EL teams, who would be involved in implementing some of the changes.
Instead of computing the cell KZG proofs in the CL during block production, we have the transaction sender compute the cell proofs and send them along with the transaction, similar to how it's done in EIP-4844.
Replacing the unused blob KZG proofs with cell KZG proofs across all layers is probably a worthy cleanup too, as they will no longer be used in the CL.
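As a rough back-of-the-envelope check on what this means for transaction size, the sketch below compares the per-blob payload when a transaction carries one blob KZG proof versus one proof per cell. It assumes 128 cells per extended blob and 48-byte commitments and proofs, and it ignores encoding overhead and the other transaction fields; the function name is hypothetical.

```python
BYTES_PER_BLOB = 4096 * 32        # 131072 bytes per blob (EIP-4844)
BYTES_PER_COMMITMENT = 48
BYTES_PER_PROOF = 48
CELLS_PER_EXT_BLOB = 128          # assumed PeerDAS parameter


def wrapper_bytes_per_blob(cell_proofs: bool) -> int:
    """Blob + commitment + either a single blob proof or one proof per cell."""
    proof_count = CELLS_PER_EXT_BLOB if cell_proofs else 1
    return BYTES_PER_BLOB + BYTES_PER_COMMITMENT + proof_count * BYTES_PER_PROOF


before = wrapper_bytes_per_blob(cell_proofs=False)
after = wrapper_bytes_per_blob(cell_proofs=True)
increase = (after - before) / before
print(f"per-blob payload: {before} -> {after} bytes (+{increase:.1%})")
```

Under these assumptions the sender-supplied cell proofs add 6096 bytes per blob, which is under 5% of the blob payload.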
Initially this was not widely accepted because it leaks DAS cryptography into the EL, and would result in a coupling that potentially makes future changes harder (e.g. cell size reduction, encoding changes etc). However, despite requiring changes in multiple layers, this is arguably a simpler and necessary change, as it doesn't require an extensive amount of optimisation. Longer term, I suspect the EL may end up having knowledge of DAS cryptography anyway if we end up implementing a vertically sharded blob mempool.
It's not 100% clear to me from the proposal, but if we shift the proof computation from the CL to the tx sender, is 72 blobs as the new max realistic, or is there another limitation that wouldn't allow us to reach that?
Based on what I know so far, I believe it's realistic - the main known bottlenecks are proof computation time and the bandwidth required to propagate blobs during block proposal. This new proposal eliminates the proof computation bottleneck during block production, and distributed blob building is quite effective at solving the latter, based on the testing so far.
We expect an increase in EL bandwidth as the blob count increases, but it is relatively small compared to the bandwidth usage on the CL (gossip is bandwidth intensive), and I believe the potential optimisations on CL gossip could offset the increase in EL bandwidth. Some numbers here: https://blog.sigmaprime.io/peerdas-distributed-blob-building.html#impact-on-node-operators
The 64-72 max blobs is still a theoretical limit until we have a version to test with; I've only tested up to 32 blobs. AFAIK there are no other known bottlenecks on the CL side, but possible limitations:
If there are no major issues with this approach, the goal is to finalise the spec across all layers ASAP so client teams can start working on an implementation. We'll also need teams to help drive spec changes listed below.
This could be the final spec change needed to ship the first iteration of PeerDAS - hopefully in 2025. 🚀
Thanks for reading!
Update EIP-7594: include cell proofs in network wrapper of blob txs #9378
`getPayloadV5`: changes `BlobsBundle` to include cell KZG proofs instead of blob KZG proofs.
`getBlobsV2`: changes to return list of blobs and cell KZG proofs (instead of blob KZG proofs).
`getBlobsBundle` and EL `getBlobs`
`DataColumnSidecar`:
`GetBlobSidecar` API
Changes to support the above operations:
computeProofsWithoutExtendingBlob
verifyBlobWithExtendedProof
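To illustrate the bundle change listed above (cell KZG proofs replacing blob KZG proofs), here is a hedged sketch of what a consumer of such a bundle might check. The class and method names are hypothetical, and the flat, blob-major ordering of proofs and the value of 128 cells per extended blob are assumptions made for this example rather than statements about the Engine API.

```python
from dataclasses import dataclass
from typing import List

CELLS_PER_EXT_BLOB = 128  # assumed PeerDAS parameter


@dataclass
class BundleWithCellProofs:
    """Hypothetical stand-in for a blobs bundle that carries cell proofs."""
    commitments: List[bytes]
    proofs: List[bytes]   # assumed flat and blob-major: blob 0's proofs, then blob 1's, ...
    blobs: List[bytes]

    def validate(self) -> None:
        """Check the list lengths are mutually consistent."""
        if len(self.commitments) != len(self.blobs):
            raise ValueError("expected one commitment per blob")
        if len(self.proofs) != len(self.blobs) * CELLS_PER_EXT_BLOB:
            raise ValueError("expected CELLS_PER_EXT_BLOB proofs per blob")

    def cell_proofs_for_blob(self, blob_index: int) -> List[bytes]:
        """Return the slice of proofs belonging to one blob."""
        start = blob_index * CELLS_PER_EXT_BLOB
        return self.proofs[start:start + CELLS_PER_EXT_BLOB]


# Dummy bundle with 2 blobs; each proof is tagged with (blob, cell).
bundle = BundleWithCellProofs(
    commitments=[b"c0", b"c1"],
    proofs=[bytes([i // CELLS_PER_EXT_BLOB, i % CELLS_PER_EXT_BLOB]) for i in range(256)],
    blobs=[b"b0", b"b1"],
)
bundle.validate()
```

With the proofs already present in the bundle, the CL only needs to slice and regroup them when building columns, rather than computing them during block production.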
Here's a diagram illustrating the computation and bandwidth bottleneck in PeerDAS:
Distributed blob building was proposed to alleviate these problems and distribute the computation and data propagation work across more powerful nodes in the network:
Note: numbers are estimates only, based on benchmarks and tests conducted earlier with 16 blobs. Assumes 50 Mbps upload bandwidth, 8 cores with 6 available cores for proof computation, and 8 mesh peers.
[1] Blob data network form changes from Blobs to Columns, which comprise cells.
[2] This approach changes data column gossip to allow transmitting a subset of cells and proofs. This will enable a more efficient and more effective form of distributed blob building, where nodes can compute and distribute cell proofs without having all the blobs. (Currently the network form `DataColumnSidecar` requires all blobs to compute.)