By: @terencechain. Please DM me with any questions or feedback. Always happy to chat more.
Container: BeaconBlockBody structure changes
Block processing: verify_kzg_commitments_against_transactions added to process_block
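To make the new check concrete, here is a hedged, simplified sketch of verify_kzg_commitments_against_transactions in spec-style Python. The versioned-hash construction (a version byte plus sha256 of the commitment) follows the 4844 draft, but the transaction parsing is stubbed out: the caller supplies the blob versioned hashes directly instead of peeking them out of each blob-type transaction.

```python
import hashlib

# Assumed constant from the 4844 draft; illustrative only.
VERSIONED_HASH_VERSION_KZG = 0x01

def kzg_commitment_to_versioned_hash(commitment: bytes) -> bytes:
    # Versioned hash = version byte + sha256(commitment)[1:]
    return bytes([VERSIONED_HASH_VERSION_KZG]) + hashlib.sha256(commitment).digest()[1:]

def verify_kzg_commitments_against_transactions(blob_versioned_hashes: list,
                                                kzg_commitments: list) -> bool:
    # Simplified: the real spec function first extracts the versioned hashes
    # from every blob-type transaction in the payload; here they are given.
    return blob_versioned_hashes == [kzg_commitment_to_versioned_hash(c)
                                     for c in kzg_commitments]
```

The point of the check is that the block body's commitments and the payload's blob transactions must describe the same blobs, in the same order.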
Container: Add BlobsSidecar and SignedBlobsSidecar
Gossip domain:
Modify beacon_block
Add blobs_sidecar
⚠️ Should validate_blobs_sidecar be part of validation before gossip? As currently stated, validate_blobs_sidecar only affects fork choice.
⚠️ Based on the current network spec, a sidecar can be verified and propagated without the block. Is that intended?
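For reference, the shape of validate_blobs_sidecar is roughly the following. This is a hedged sketch: the container is a minimal stand-in for the spec type, and the aggregate-proof verification is stubbed, since the real check calls into the KZG library.

```python
class BlobsSidecar:
    # Minimal stand-in for the spec container; fields follow the draft spec.
    def __init__(self, beacon_block_root, beacon_block_slot, blobs, kzg_aggregated_proof):
        self.beacon_block_root = beacon_block_root
        self.beacon_block_slot = beacon_block_slot
        self.blobs = blobs
        self.kzg_aggregated_proof = kzg_aggregated_proof

def verify_aggregate_kzg_proof(blobs, commitments, proof) -> bool:
    # Stub: the real function checks the aggregated polynomial commitment
    # against the aggregated proof via the KZG library.
    raise NotImplementedError

def validate_blobs_sidecar(slot, beacon_block_root, expected_kzg_commitments, sidecar) -> None:
    # The sidecar must belong to this exact (slot, block) pair...
    assert slot == sidecar.beacon_block_slot
    assert beacon_block_root == sidecar.beacon_block_root
    # ...carry one blob per commitment...
    assert len(expected_kzg_commitments) == len(sidecar.blobs)
    # ...and its blobs must match the block's commitments cryptographically.
    assert verify_aggregate_kzg_proof(sidecar.blobs, expected_kzg_commitments,
                                      sidecar.kzg_aggregated_proof)
```

Note that nothing here requires the block itself, only its root, slot, and commitments, which is why the fork-choice-only placement question above matters.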
Req/resp domain:
Modify /eth2/beacon_chain/req/beacon_blocks_by_range/2/
Modify /eth2/beacon_chain/req/beacon_blocks_by_root/2/
Add BlobsSidecarsByRange v1
Request: start_slot: Slot, count: uint64
Response: List[BlobsSidecar, MAX_REQUEST_BLOBS_SIDECARS]
⚠️ Notice that we don't have a BlobsSidecarsByRoot v1. My first impression is that this would be useful to have: if you never receive a sidecar from gossip, you could at least retrieve it quickly.
⚠️ MAX_REQUEST_BLOBS_SIDECARS is 128. The target size for a sidecar is ~1MB, and the max is ~2MB. How does this affect low-bandwidth devices?
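A back-of-the-envelope check on that concern, using the limit and the ~1MB/~2MB sidecar sizes quoted above:

```python
MAX_REQUEST_BLOBS_SIDECARS = 128
TARGET_SIDECAR_MB = 1  # ~1MB target from above
MAX_SIDECAR_MB = 2     # ~2MB max from above

# Size of a single maximal BlobsSidecarsByRange response.
worst_case_mb = MAX_REQUEST_BLOBS_SIDECARS * MAX_SIDECAR_MB
target_case_mb = MAX_REQUEST_BLOBS_SIDECARS * TARGET_SIDECAR_MB

print(worst_case_mb)   # 256 MB per maximal response
print(target_case_mb)  # 128 MB at the target size
```

So one full-range request can move a quarter gigabyte, which is the number a low-bandwidth node has to budget for.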
Some additional notes on coupling BeaconBlocks and Blobs on the networking layer during 4844.
Validator proposes signed_blobs_sidecar in addition to signed_beacon_block. The validator retrieves blobs and commitments from the EL client through get_blobs_and_kzg_commitments, where the input is the payload_id.
⚠️ get_blobs_and_kzg_commitments and get_payload should ideally be unified. Retrieving blobs and payload under one request would be nice.
The validator validates that the payload transactions, blobs, and KZG commitments are all aligned, then constructs the blobs_sidecar to sign. blobs_sidecar has a field kzg_aggregated_proof, which is computed by aggregating the polynomials (blobs) and commitments, hashing them to a BLS field element, and producing the KZG proof at that point.
⚠️ It's unclear to me how long compute_proof_from_blobs takes these days, but I feel this is a good optimization target.
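The proposal flow above, sketched end-to-end. Hedged: the EL-facing call and the proof computation are stubs, signing is omitted, and while the names follow the draft spec this is not a client implementation.

```python
from dataclasses import dataclass

@dataclass
class BlobsSidecar:
    beacon_block_root: bytes
    beacon_block_slot: int
    blobs: list
    kzg_aggregated_proof: bytes = b""

def get_blobs_and_kzg_commitments(payload_id):
    # Stub for the Engine API call; a real proposer asks the EL client here.
    return [b"blob0"], [b"commitment0"]

def compute_proof_from_blobs(blobs) -> bytes:
    # Stub: the real function aggregates the blob polynomials and commitments,
    # hashes them to a BLS field element, and opens the aggregate there.
    return b"aggregated-proof"

def build_blobs_sidecar(payload_id, block_root: bytes, slot: int) -> BlobsSidecar:
    blobs, commitments = get_blobs_and_kzg_commitments(payload_id)
    # A real proposer also checks that the commitments line up with the
    # payload's blob transactions before signing anything.
    assert len(blobs) == len(commitments)
    return BlobsSidecar(
        beacon_block_root=block_root,
        beacon_block_slot=slot,
        blobs=blobs,
        kzg_aggregated_proof=compute_proof_from_blobs(blobs),
    )
```

The sidecar and the block are then signed and published separately, which is exactly the coupling question this section is about.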
Similar to SignedBeaconBlock, the client must backfill blocks up to current_epoch - MIN_EPOCHS_FOR_BLOCK_REQUESTS.
A CL client starting from checkpoint sync MUST backfill BlobsSidecars up to current_epoch - MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS.
⚠️ What kind of validations should CL clients perform on a backfilled BlobsSidecar? They can't verify the signature because it's not a SignedBlobsSidecar. Can they at least perform validate_blobs_sidecar? Cross-check against the beacon blocks in the DB? (This area is undefined in the spec.)
A counter-argument to SignedBlobsSidecarsByRange is that a node can just download the blobs historically before it has the beacon blocks; it might not need to know the proposer for the historic blobs it is syncing until it gets the blocks anyway.
I think for historic sync, there are three things you can do…
I think 1 is a must. 2 and 3 only make sense if you couple blob and block under the same object.
There are two ways to sync: forward sync, and backward sync from a WSS state. For backward sync, a Blob doesn't have a parent_root field, so it's not a chain; you have to validate each one as an individual unit.
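One way to picture the backward-sync difference: blocks chain via parent_root, so a batch can be verified by hash-linking, while sidecars must be matched to their block individually by (slot, block_root) and then validated one by one. A hedged sketch, assuming the block batch is already verified; the pairing shape is illustrative, not a spec function:

```python
def match_sidecars_to_blocks(blocks, sidecars):
    """Pair each already-verified block with its sidecar by (slot, root).

    blocks:   list of (slot, block_root) tuples for a verified block batch
    sidecars: list of (beacon_block_slot, beacon_block_root, sidecar) tuples
    Returns the sidecars in block order; raises if any block lacks one.
    Each matched sidecar still needs validate_blobs_sidecar run on it
    individually, since there is no parent link between sidecars.
    """
    by_key = {(slot, root): sc for slot, root, sc in sidecars}
    matched = []
    for slot, root in blocks:
        if (slot, root) not in by_key:
            raise KeyError(f"missing sidecar for slot {slot}")
        matched.append(by_key[(slot, root)])
    return matched
```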
…
The beacon client will store SignedBlobsSidecar up to MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS. The client will likely implement some rolling-window pruning mechanism on a per-N-epoch basis.
As for the additional storage requirement, the worst case is MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS * SLOTS_PER_EPOCH * 2MB, which is ~525GB.
We should keep in mind that the economics of a 1MB/2MB (target/max) blob market will result in an average of ~1MB over a long period. Whether 1MB/2MB is realistic with current networking remains to be seen. The likely trade-off is increased network complexity or lowering those values. Lastly, the upper bound of 4 weeks (MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS) could be lowered to 2 weeks, which would make the ~525GB worst-case figure much lower.
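The worst-case figure checks out, assuming MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS is 8192 epochs (the ~4-week retention mentioned above) and 32 slots per epoch. Hedged: the constant was still in flux at the time of writing, so 8192 is an assumption that reproduces the ~525GB number.

```python
MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS = 8192  # assumed ~4-week value
SLOTS_PER_EPOCH = 32
MAX_SIDECAR_MB = 2

worst_case_mb = MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS * SLOTS_PER_EPOCH * MAX_SIDECAR_MB
print(worst_case_mb)       # 524288 MB, i.e. ~525 GB
print(worst_case_mb // 2)  # halving the retention window halves the worst case
```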
⚠️ I wonder if it would be helpful to add a CLI flag that enables a multiplier on MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS, so that altruistic nodes could store blobs for a longer duration. The downside is that such a design could be abused by someone who hardcodes MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS to a very low number.
It is currently uncertain what the safe size of Blobs (target and max) is in 4844 given gossip performance and node bandwidth considerations.
There will be additional bandwidth required for gossiping sidecars and for request/response of sidecars. A currently published recommendation is a minimum of 10 Mb/s symmetric and a "recommended" 25 Mb/s: https://ethereum.org/en/developers/docs/nodes-and-clients/. Assuming a network of nodes in this range, what is the safe size of Blobs that the network can handle given the current beacon chain slot structure (wide dissemination required within 4 seconds)?
Is there pen and paper and/or experimental analysis we need to do to tune this value?
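A first pen-and-paper pass, under stated assumptions: a 2MB max sidecar, dissemination within the 4-second deadline, and a gossipsub mesh degree of 8 as the per-hop amplification factor (all illustrative numbers, not measurements):

```python
MAX_SIDECAR_MB = 2
DEADLINE_S = 4
MESH_DEGREE = 8  # assumed gossipsub amplification per hop

# Upload rate needed to forward one max-size sidecar to the whole mesh in time:
# megabytes -> megabits (x8), times peers, divided by the deadline.
upload_mbps = MAX_SIDECAR_MB * 8 * MESH_DEGREE / DEADLINE_S
print(upload_mbps)  # 32.0 Mb/s, already above the 25 Mb/s recommendation
```

Even this crude estimate lands above the recommended bandwidth, which is why lowering the amplification factor (see the episub item below in spirit) or the blob sizes are the obvious levers.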
There are a few ways to address bandwidth concerns.
TODO:
Talk with Age (Sigp). Age wrote an extension of gossipsub that reduces the amplification factor. With episub, we may be able to dynamically choke/unchoke peers in the mesh to lower the local amplification factor while keeping reasonable delivery. It should be backwards compatible with gossipsub. The goal is to reduce the amplification factor to 2-8. It also has a simulation framework to run 10k nodes, and has crawled the DHT to get distribution data.
How can we use this data to test, tune, and optimize for 4844?
A block without a sidecar companion is not considered valid, but it is OK for it to be processed. See the is_data_available requirement. As stated in the spec, the block may be processed "optimistically".
⚠️ What parallel can we draw between blocks with a SYNCING/ACCEPTED payload and blocks without a sidecar? The current Prysm design doesn't consider blocks without a sidecar as the head. Within filter_block_tree:
if correct_justified and correct_finalized and is_data_available:
    blocks[block_root] = block
    return True
It seems sane to validate such blocks: if the EL reports INVALID, then just toss the block; if VALID, then find its respective blobs if the node couldn't immediately. Having a block without a sidecar makes little sense. The reason optimistic mode exists is so a node can keep sending the EL client heads to try to sync/repair. In this case, the EL client is not syncing, so being optimistic adds little value.
We need to decide on the best library to use across implementations. It would be nice to converge on a single library used by all CL teams.
I believe George talked with the Supranational team, and gave them a list of requested functionality. Based on Discord, I believe this is the list of KZG/polynomial functions that we would expect blst to expose so that we can fully replace go-kzg:
- compute_powers()
- matrix_lincomb()
- g1_lincomb()
- bytes_to_bls_field()
- evaluate_polynomial_in_evaluation_form()
- verify_kzg_proof()
- compute_kzg_proof()
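Several of these are small field-arithmetic helpers rather than pairing-heavy operations. compute_powers, for example, is just successive powers of a field element, used when combining polynomials with a random challenge. A sketch in spec-style Python, using the BLS12-381 scalar-field modulus:

```python
# BLS12-381 scalar field modulus used by the KZG scheme.
BLS_MODULUS = 52435875175126190479447740508185965837690552500527637822603658699938581184513

def compute_powers(x: int, n: int) -> list:
    """Return [1, x, x**2, ..., x**(n-1)], each reduced mod BLS_MODULUS."""
    powers = []
    current = 1
    for _ in range(n):
        powers.append(current)
        current = current * x % BLS_MODULUS
    return powers
```

Helpers in this class are the ones the note below argues clients can implement directly on top of blst's field-element operations, reserving the performance-sensitive work for the library.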
In general, we as the client teams should piece together whatever basic functionality blst offers, with exceptions made for performance-sensitive functions.
BLST already contains functions to operate on field elements, so we can implement a lot of the non-performance-critical functions in the spec using BLST right now.
There's also a c-kzg library being worked on by Dankrad and Ben that fills in the missing functions not supported by BLST. c-kzg could be useful in the short term if Supranational takes too long to add KZG functionality.
The current devnet and Prysm prototype use Go-KZG types that wrap Kilic-BLS, which Geth previously forked to support the BLS precompile. This is not used by any CL client.
Go-KZG has the option to use different BLS libraries. BLST was attempted, but the lack of memory management made it hard; it's hard to wrap CGO types with Go types.
Currently BLST exposes some of the operations 4844 needs, but not all of them.
Supranational also agreed that the current interface for pairings is very troublesome, and said they would improve the documentation.