# EIP-4844 CL client implementer's notes

By: [@terencechain](https://twitter.com/terencechain). Please DM me with any questions or leave any feedback. Always happy to chat more.

## Beacon chain consensus changes

**Container:** `BeaconBlockBody` structure changes

**Block processing:** `verify_kzg_commitments_against_transactions` added to `process_block`

## Networking changes

**Container:** Add `BlobsSidecar` and `SignedBlobsSidecar`

**Gossip domain:**

Modify `beacon_block`
- Validate that the body's KZG commitments are encoded compressed BLS G1 points (reject)
- Validate that the body's KZG commitments match the versioned hashes in the tx list (reject)

Add `blobs_sidecar`
- Validate that the sidecar slot is within bounds (ignore)
- Validate that the sidecar blob elements are within the valid range for BLS field elements (reject)
- Validate that the KZG proof is an encoded compressed BLS G1 point (reject)
- Validate the sidecar signature (reject)
- Validate that the sidecar proposer is correct for the slot (reject)
- Validate that this is the first sidecar for the given proposer index and slot combo. This serves as a light anti-DoS measure

⚠️ Should `validate_blobs_sidecar` be part of validation before gossip? As currently specified, `validate_blobs_sidecar` only affects fork choice.

⚠️ Based on the current network spec, a sidecar can be verified and propagated without its block. Is that intended?

**Req/resp domain:**

Modify `/eth2/beacon_chain/req/beacon_blocks_by_range/2/`

Modify `/eth2/beacon_chain/req/beacon_blocks_by_root/2/`

Add `BlobsSidecarsByRange v1`
- Request: `start_slot: Slot`, `count: uint64`
- Response: `List[BlobsSidecar, MAX_REQUEST_BLOBS_SIDECARS]`
- The response is determined by content leading up to the current head

⚠️ Notice that we don't have a `BlobsSidecarsByRoot v1`. My first impression is that this would be useful to have: if you never receive a sidecar from gossip, you could at least retrieve it quickly.

⚠️ `MAX_REQUEST_BLOBS_SIDECARS` is 128. The target size for a sidecar is ~1MB, and the max is ~2MB.
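To put those numbers together, a back-of-envelope sketch using the figures above (the helper name is mine, not the spec's):

```python
MAX_REQUEST_BLOBS_SIDECARS = 128  # from the spec
MAX_SIDECAR_MB = 2                # worst-case sidecar size noted above


def max_range_response_mb(count: int = MAX_REQUEST_BLOBS_SIDECARS) -> int:
    """Worst-case payload of a single BlobsSidecarsByRange response."""
    return count * MAX_SIDECAR_MB


# A full-count request can return up to 128 * 2MB = 256MB in one response.
```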
How does this affect low-bandwidth devices?

Some [additional notes](https://notes.ethereum.org/RLOGb1hYQ0aWt3hcVgzhgQ?view) on coupling `BeaconBlock`s and `Blob`s on the networking layer during 4844.

## Validator changes

The validator proposes a `signed_blobs_sidecar` in addition to a `signed_beacon_block`. The validator retrieves blobs and commitments from the EL client through `get_blobs_and_kzg_commitments`, where the input is the `payload_id`.

⚠️ `get_blobs_and_kzg_commitments` and `get_payload` should ideally be unified. Retrieving blobs and payload under one request would be nice.

The validator will validate that payload transactions, blobs, and KZG commitments are all aligned, then construct the `blobs_sidecar` to sign. `blobs_sidecar` has a field `kzg_aggregated_proof`, which is computed by aggregating the polynomials (blobs) and commitments and hashing them to a BLS field element for the KZG proof.

⚠️ It's unclear to me how long `compute_proof_from_blobs` takes these days, but I feel this is a good optimization target.

## Blob backfilling requirement

Similar to `SignedBeaconBlock`, the client must backfill blocks up to `current_epoch - MIN_EPOCHS_FOR_BLOCK_REQUESTS`. A CL client starting from checkpoint sync MUST backfill `BlobsSidecar`s up to `current_epoch - MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS`.

⚠️ What kind of validations should CL clients perform for the backfilled `BlobsSidecar`? They can't verify the signature because it's not a `SignedBlobsSidecar`. Can they at least perform `validate_blobs_sidecar`? Cross-check with beacon blocks in the DB? (This area is undefined in the spec.)

A counterargument to `SignedBlobsSidecarsByRange` is that a node can just download the blobs historically before it has the beacon blocks; it may not need to know the proposer for the historic blobs it is syncing until it gets the blocks anyway.

I think for historic sync, there are three things you can do:
1. Verify the KZG proof
2. Verify the proposer signature
3. Verify the blob is part of the canonical chain

I think 1 is a must.
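A minimal sketch of that historic-sync validation split (all names are hypothetical; the verifier callables stand in for real KZG, signature, and canonical-chain checks):

```python
def validate_historic_sidecar(sidecar,
                              verify_kzg_proof,
                              verify_signature=None,
                              verify_canonical=None) -> bool:
    """Validate a backfilled sidecar as an individual unit.

    Check 1 (KZG proof) is always required; checks 2 and 3 are
    optional because they need block/proposer data we may not have yet.
    """
    if not verify_kzg_proof(sidecar):
        return False  # check 1: mandatory
    if verify_signature is not None and not verify_signature(sidecar):
        return False  # check 2: only if proposer data is available
    if verify_canonical is not None and not verify_canonical(sidecar):
        return False  # check 3: only with coupled block data
    return True
```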
2 and 3 only make sense if you couple blob and block under the same object.

There are two ways to sync: forward sync, and backward sync w/ a weak subjectivity state. For backward sync, a blob doesn't have a `parent_root` field; it's not a chain, so you actually have to validate each one individually.

...

## Storage requirement

The beacon client will store `SignedBlobsSidecar`s up to `MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS`. The client will likely implement some rolling-window pruning mechanism on a per-`N`-epoch basis.

As for the additional storage requirement, the worst case is: `MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS` * `SLOTS_PER_EPOCH` * 2MB, which is ~525GB.

We should keep in mind that the economics of a 1MB/2MB (target/max) blob size will result in an average of ~1MB over a long period. Whether 1MB/2MB is realistic with current networking remains to be seen. The likely trade-off is increased network complexity or lowering those values.

Lastly, the upper bound of 4 weeks (`MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS`) could be lowered to 2 weeks, which would bring the ~525GB worst case down considerably.

⚠️ I wonder if it would be helpful to add a CLI flag enabling a multiplier for `MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS` so that altruistic nodes could store blobs for a longer duration. The downside is that such a design could be abused by someone who hardcodes `MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS` to a very low number.

## Networking requirement

It is currently uncertain what the safe size of blobs (target and max) is in 4844, given gossip performance and node bandwidth considerations. There will be additional bandwidth required for gossiping sidecars and for request/response of sidecars. A current published recommendation is a minimum of 10 Mb/s symmetric and a "recommended" 25 Mb/s: https://ethereum.org/en/developers/docs/nodes-and-clients/
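As a rough sanity check against those figures (my own arithmetic, not from any spec): a node forwarding a max-size sidecar to several mesh peers within the ~4-second dissemination window needs roughly:

```python
def gossip_bandwidth_mbps(payload_mb: float, amplification: int,
                          window_s: float = 4.0) -> float:
    """Approximate upload rate needed to forward `payload_mb` megabytes
    to `amplification` peers within `window_s` seconds (MB -> megabits: x8)."""
    return payload_mb * amplification * 8 / window_s


# For a 2MB max sidecar: amplification 8 needs 32 Mb/s upload, above the
# 25 Mb/s recommendation, while amplification 2 needs only 8 Mb/s.
```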
Assuming a network of nodes in this range, what is the safe size of blobs that the network can handle given the current beacon chain slot structure (wide dissemination required within 4 seconds)? Is there pen-and-paper and/or experimental analysis we need to do to tune this value?

There are a few ways to address bandwidth concerns:
1. Reduce the amplification factor
2. Reduce the max blob size
3. Push vs. pull of blobs
4. Make the blob sidecar an optional topic

Maybe we can take a look at the gossip amplification factor and adjust amplification based on a bandwidth metering system. Go libp2p supports a resource metering system on a per-peer basis for tracking. Rate limiting can happen at the application layer (req/resp limits) and in gossipsub.

TODO: Talk with Age (Sigp). Age wrote an extension of gossipsub that reduces the amplification factor. With episub, we may be able to dynamically choke/unchoke peers in the mesh to lower the local amplification factor while preserving reasonable delivery. It should be backwards compatible with gossipsub. The goal is to reduce the amplification factor to 2-8. It also has a simulation framework to run 10k nodes, and a crawled DHT to get distribution data. How can we use this data to test, tune, and optimize for 4844?

## Block w/o sidecar == optimistic block?

A block without a sidecar companion is not considered valid, but it's OK for it to be processed. See the `is_data_available` requirement. As stated in the spec, the block may be processed "optimistically".

⚠️ What parallel can we draw between blocks with a `SYNCING`/`ACCEPTED` payload and blocks without a sidecar?

The current Prysm design doesn't consider blocks without a sidecar as the head. Within `filter_block_tree`:

```python
if correct_justified and correct_finalized and is_data_available:
    blocks[block_root] = block
    return True
```

It seems sane to validate such blocks: if the EL reports `INVALID`, then just toss them. If `VALID`, then find the respective blob if the node couldn't immediately.
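One way to "find the respective blob later" is a simple pending buffer (a hypothetical sketch, not Prysm's actual code): hold blocks whose data isn't available yet, and promote them when the matching sidecar arrives.

```python
# block_root -> block, for blocks processed optimistically without their sidecar
pending_blocks = {}


def on_block_without_sidecar(block_root, block):
    """Buffer a block whose is_data_available check failed."""
    pending_blocks[block_root] = block


def on_sidecar(block_root):
    """Called when the matching sidecar arrives. Returns the block that can
    now be considered for head, or None if we were not waiting on it."""
    return pending_blocks.pop(block_root, None)
```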
Having a block without a sidecar makes little sense. The reason optimistic mode exists is so a node can keep sending the EL client heads to try to sync/repair. In this case, the EL client is not syncing, so being optimistic adds little value.

## The current state of KZG implementation

We need to decide on the best library to use across implementations. It would be nice to converge on a single library used by all CL teams.

I believe George talked with the Supranational team and gave them a list of requested functionality. Based on Discord, I believe this is the list of KZG/polynomial functions that we would expect blst to expose so that we can fully replace go-kzg:
- `compute_powers()`
- `matrix_lincomb()`
- `g1_lincomb()`
- `bytes_to_bls_field()`
- `evaluate_polynomial_in_evaluation_form()`
- `verify_kzg_proof()`
- `compute_kzg_proof()`

In general, we as the client teams should piece together whatever basic functionality blst offers, with exceptions made for performance-sensitive functions. blst already contains functions to operate on field elements, so we can implement a lot of the non-performance-critical functions in the spec using blst right now.

There's also a [c-kzg](https://github.com/dankrad/c-kzg/tree/lagrange_form) library being worked on by Dankrad and Ben that fills in the missing functions not supported by blst. c-kzg could be useful in the short term if Supranational takes too long to add KZG functionality.

The current devnet and Prysm prototype use go-kzg types that wrap Kilic's BLS library, which Geth previously forked to support the BLS precompile. This library is not used by any CL client. go-kzg has the option to use different BLS libs. blst was attempted, but the lack of memory management made it hard; it's hard to wrap CGO types with Go types. Currently, blst exposes some of the operations 4844 needs, but not all of them.

Supranational also agreed that the current interface for pairings is very troublesome, and they will improve the documentation.
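To illustrate why much of the list above is implementable today: `compute_powers` is plain scalar-field arithmetic and needs nothing from a pairing library. A sketch following the spec's definition (the constant is the BLS12-381 scalar field order):

```python
# BLS12-381 scalar field modulus (the "BLS field" referred to above)
BLS_MODULUS = 0x73eda753299d7d483339d80809a1d80553bda402fffe5bfeffffffff00000001


def compute_powers(x: int, n: int) -> list:
    """Return [x^0, x^1, ..., x^(n-1)] reduced mod BLS_MODULUS.

    Used e.g. to expand a Fiat-Shamir challenge into the coefficients
    for aggregating blob polynomials and commitments.
    """
    powers = []
    current = 1
    for _ in range(n):
        powers.append(current)
        current = current * x % BLS_MODULUS
    return powers
```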