Persisting Data Columns

Currently the das branch does not act on columns received over gossip. On top of that, it persists the columns inconsistently. The inconsistency arises because put_kzg_verified_blobs marks the RpcBlock as Available once all blobs have been received. Any data columns received over gossip after that point are ignored, and the peer that sent them is banned.

To fix this:

  1. Prevent put_kzg_verified_blobs from marking the block as available
  2. Act on and persist columns received over gossip

Persisting columns received over gossip

The data columns are persisted in the LRU cache, a key-value store where the key is the block root and the value is the PendingComponent.

PendingComponent

The PendingComponent has the following fields relevant to DAS

  • block_root
  • verified_data_columns

As new data columns come in over gossip, put_gossip_data_column is executed. This codepath first checks whether the LruCache already holds a PendingComponent for the given block_root. If one exists it is popped out of the cache; otherwise a new one is created. The incoming data columns are then merged with the columns already cached, and the merged PendingComponent is put back into the cache.
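
The pop, merge and re-insert flow described above can be sketched as follows. This is a minimal illustration, not the Lighthouse implementation: a plain HashMap stands in for the LRU cache (eviction is omitted), and DataColumn, ColumnCache, merge_data_columns and the function signature are hypothetical simplifications.

```rust
use std::collections::{BTreeMap, HashMap};

// Hypothetical stand-ins; the real Lighthouse types differ.
type BlockRoot = [u8; 32];
type ColumnIndex = u64;

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct DataColumn {
    index: ColumnIndex,
    data: Vec<u8>,
}

#[allow(dead_code)]
#[derive(Debug)]
struct PendingComponent {
    block_root: BlockRoot,
    // Keyed by column index so a duplicate column is not stored twice.
    verified_data_columns: BTreeMap<ColumnIndex, DataColumn>,
}

impl PendingComponent {
    fn new(block_root: BlockRoot) -> Self {
        Self { block_root, verified_data_columns: BTreeMap::new() }
    }

    // Merge incoming columns, keeping the copy we already hold on conflict.
    fn merge_data_columns(&mut self, columns: impl IntoIterator<Item = DataColumn>) {
        for column in columns {
            self.verified_data_columns.entry(column.index).or_insert(column);
        }
    }
}

#[derive(Default)]
struct ColumnCache {
    // Assumption: a HashMap replaces the LRU cache for illustration only.
    pending: HashMap<BlockRoot, PendingComponent>,
}

impl ColumnCache {
    // Returns the number of columns cached for `block_root` after the merge.
    fn put_gossip_data_column(&mut self, block_root: BlockRoot, columns: Vec<DataColumn>) -> usize {
        // Pop the existing component, or start a new one for this block root.
        let mut component = self
            .pending
            .remove(&block_root)
            .unwrap_or_else(|| PendingComponent::new(block_root));
        component.merge_data_columns(columns);
        let count = component.verified_data_columns.len();
        // Put the merged component back into the cache.
        self.pending.insert(block_root, component);
        count
    }
}
```

Keying the columns by index makes the merge idempotent, so receiving the same column twice over gossip does not grow the cached set.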

Taking action on persisted columns

Currently put_kzg_verified_data_columns does not act upon incoming data columns. Changes should be made to this function to:

  1. check if the required columns are available to begin sampling
  2. execute sampling
  3. on sample success set status to Available
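
The three steps could be wired together roughly as below. This is a sketch under stated assumptions: the Availability enum, has_required_columns and process_columns are hypothetical names, the MissingComponents variant is invented for illustration, and sampling is passed in as a closure because the note does not specify how it is performed.

```rust
use std::collections::BTreeSet;

type ColumnIndex = u64;

// Hypothetical status type; only `Available` is named in the note.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Availability {
    MissingComponents,
    Available,
}

// Step 1: every required column must already be cached.
fn has_required_columns(required: &BTreeSet<ColumnIndex>, cached: &BTreeSet<ColumnIndex>) -> bool {
    required.is_subset(cached)
}

// Steps 2 and 3: run sampling once the columns are present, and report
// `Available` only if sampling succeeds.
fn process_columns<F>(
    required: &BTreeSet<ColumnIndex>,
    cached: &BTreeSet<ColumnIndex>,
    run_sampling: F,
) -> Availability
where
    F: FnOnce(&BTreeSet<ColumnIndex>) -> bool,
{
    if !has_required_columns(required, cached) {
        // Not enough data yet; wait for more columns over gossip.
        return Availability::MissingComponents;
    }
    if run_sampling(required) {
        Availability::Available
    } else {
        Availability::MissingComponents
    }
}
```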

Note that we need access to the node's ENR for custody/sampling requirements.

Additional requirements

The beacon chain needs to know which columns to keep custody of. Column custody requirements are a function of the node's ENR. The beacon chain itself isn't aware of any networking-related info, including the ENR. If it were to hold a reference to NetworkGlobals, for example, this would introduce a circular dependency. Instead, a service should be created within the networking crate to calculate column requirements and then send that information to the beacon chain.
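
One way to picture such a service is a one-way channel from the networking side to the beacon chain, so the beacon chain never needs a reference to networking types. The sketch below uses std::sync::mpsc purely for illustration; NetworkCustodyService, BeaconChainCustody, CustodyRequirements and custody_channel are hypothetical names, and the ENR-based calculation itself is left out because the note does not define it.

```rust
use std::collections::BTreeSet;
use std::sync::mpsc::{channel, Receiver, Sender};

type ColumnIndex = u64;

// Hypothetical message type carrying the columns this node must custody.
#[derive(Debug, Clone, PartialEq, Eq)]
struct CustodyRequirements {
    columns: BTreeSet<ColumnIndex>,
}

// Lives in the networking crate, where the ENR is available.
struct NetworkCustodyService {
    to_beacon_chain: Sender<CustodyRequirements>,
}

impl NetworkCustodyService {
    // Sends freshly computed requirements; returns false if the receiver is gone.
    fn publish(&self, columns: BTreeSet<ColumnIndex>) -> bool {
        self.to_beacon_chain.send(CustodyRequirements { columns }).is_ok()
    }
}

// Lives on the beacon chain side and only sees plain column indices.
struct BeaconChainCustody {
    from_network: Receiver<CustodyRequirements>,
    current: Option<CustodyRequirements>,
}

impl BeaconChainCustody {
    // Drains pending updates and keeps the most recent one.
    fn refresh(&mut self) -> Option<&CustodyRequirements> {
        while let Ok(update) = self.from_network.try_recv() {
            self.current = Some(update);
        }
        self.current.as_ref()
    }
}

// Builds both halves; each side owns only its own end of the channel.
fn custody_channel() -> (NetworkCustodyService, BeaconChainCustody) {
    let (tx, rx) = channel();
    (
        NetworkCustodyService { to_beacon_chain: tx },
        BeaconChainCustody { from_network: rx, current: None },
    )
}
```

Because the dependency only points from the networking crate to a plain message type, the beacon chain stays free of networking imports and the circular dependency is avoided.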

Outdated notes

The DA checker must have access to the local ENR in order to calculate the node's custody requirements and perform sampling.

Since everything is Arc'd, mutating fields in the DA checker, where network_globals is available, is not straightforward.

I ended up wrapping the network_globals in a RwLock. This provides thread safety and interior mutability.

PR: https://github.com/sigp/lighthouse/pull/5438/files

Here are some notes on other options I tried

  • I tried getting an inner mutable ref of the Arc'd network globals. The issue is that because the ref count of the Arc is > 1, I am unable to get the inner mutable ref.
  • I tried wrapping the Arc<NetworkGlobals> field in the DA checker with a Cell. The issue is that Cell isn't thread safe, so that doesn't work:

Cell<Arc<NetworkGlobals<<T as BeaconChainTypes>::EthSpec>>> cannot be shared between threads safely
within DataAvailabilityChecker<T>, the trait Sync is not implemented for Cell<Arc<NetworkGlobals<<T as BeaconChainTypes>::EthSpec>>>
if you want to do aliasing and mutation between multiple threads, use std::sync::RwLock
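
The two observations above, and the RwLock approach, can be reproduced in a small standalone sketch. The types here are simplified stand-ins, not the code from the PR: NetworkGlobals has a single hypothetical local_enr field, and holding the globals as RwLock<Arc<NetworkGlobals>> is an assumption about how the wrapping was done.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

// Simplified stand-in; the real NetworkGlobals is generic and much larger.
#[derive(Debug, Default)]
struct NetworkGlobals {
    local_enr: String, // hypothetical field for illustration
}

// Arc::get_mut only hands out a mutable reference when the Arc is unique,
// so it returns None as soon as a second handle exists.
fn get_mut_fails_when_shared() -> bool {
    let mut globals = Arc::new(NetworkGlobals::default());
    let _other_handle = Arc::clone(&globals);
    Arc::get_mut(&mut globals).is_none()
}

struct DataAvailabilityChecker {
    // RwLock is Sync (unlike Cell), so the checker can be shared across threads.
    network_globals: RwLock<Arc<NetworkGlobals>>,
}

impl DataAvailabilityChecker {
    fn new(globals: Arc<NetworkGlobals>) -> Self {
        Self { network_globals: RwLock::new(globals) }
    }

    // Replaces the globals through a shared reference.
    fn set_network_globals(&self, globals: Arc<NetworkGlobals>) {
        *self.network_globals.write().unwrap() = globals;
    }

    // Returns a cheap clone of the current handle.
    fn network_globals(&self) -> Arc<NetworkGlobals> {
        Arc::clone(&self.network_globals.read().unwrap())
    }
}

// Updates the globals from another thread and returns the ENR seen afterwards.
fn update_from_another_thread(checker: Arc<DataAvailabilityChecker>, enr: &str) -> String {
    let writer = Arc::clone(&checker);
    let new_globals = Arc::new(NetworkGlobals { local_enr: enr.to_string() });
    thread::spawn(move || writer.set_network_globals(new_globals))
        .join()
        .unwrap();
    checker.network_globals().local_enr.clone()
}
```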