This design document describes how a beacon node can track more detailed information about the performance of a set of validators than what can currently be extracted from a BeaconState object.
One of the main tools for debugging p2p/forkchoice/performance problems is attestation/proposal timing information. With the Altair fork, we lost easy access to the attestation inclusion distance, as the PendingAttestation object is no longer needed. Stakers currently need to rely on a centralized entity (like an explorer) to obtain this information, or parse the beacon blocks themselves.
Currently, to obtain information about validator performance, the validator client needs to ask the beacon node to fetch it from the objects it already has, or to compute it if it can only be obtained by processing these objects. Since performance information may be required on a per-epoch basis and for multiple validators, this places an unnecessary burden on the beacon node's operation.
I believe that, to alleviate this problem, it is unavoidable to break (albeit minimally) the separation of duties between validators and beacon nodes. I propose two extra beacon node CLI flags, paralleling what is already supported by Lighthouse: --track-validator-auto and --track-validator-indices. The second flag would take a list of validator indices for which to track performance-related information. The first flag would track this information for every validating key connected via RPC. Through these flags, the beacon node learns which validators a validator client wants performance parameters for.
A naive solution to this problem, if it were only about obtaining the inclusion distance, would be to add a log entry when processing an attestation as part of beacon-block processing, whenever the attestation is from one of our tracked validators. At this stage we can log a message.
This approach has the advantage of simplicity, as it would merely require adding a CLI flag and a single check in attestation processing. However, there is a big problem with this approach, and it lies in the separation of duties between the beacon node and the validator clients. The requested information is of interest to the validator, not to the node, and the validator may be far away, with no access to the beacon node besides its RPC port.
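To make the naive approach concrete, here is a minimal sketch of the single check in attestation processing. The Attestation struct, the trackedInclusionLogs function and the log line format are all hypothetical stand-ins chosen for illustration; they are not the real consensus types or an existing log message. The inclusion distance is taken as the slot of the including block minus the attested slot.

```go
package main

import "fmt"

// Attestation is a simplified, hypothetical stand-in for an attestation
// found inside a beacon block. It is not the real consensus type.
type Attestation struct {
	Slot             uint64   // slot the attestation votes for
	AttestingIndices []uint64 // validators that signed it
}

// trackedInclusionLogs returns one log line per tracked validator that
// appears in the attestation. The line format is an assumption.
func trackedInclusionLogs(tracked map[uint64]bool, blockSlot uint64, att Attestation) []string {
	var lines []string
	// An attestation can only be included in a later slot; ignore anything else.
	if blockSlot <= att.Slot {
		return lines
	}
	distance := blockSlot - att.Slot
	for _, idx := range att.AttestingIndices {
		// The single extra check: is this one of our tracked validators?
		if !tracked[idx] {
			continue
		}
		lines = append(lines, fmt.Sprintf(
			"validator=%d attested_slot=%d included_slot=%d inclusion_distance=%d",
			idx, att.Slot, blockSlot, distance))
	}
	return lines
}

func main() {
	tracked := map[uint64]bool{7: true, 42: true}
	att := Attestation{Slot: 100, AttestingIndices: []uint64{3, 7, 42}}
	for _, line := range trackedInclusionLogs(tracked, 102, att) {
		fmt.Println(line)
	}
}
```

Note that these lines end up in the beacon node's own log, which is exactly the limitation discussed above: a remote validator client never sees them.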
In order to maintain the idea that the consumer will be the validator client while the provider will be the beacon-node, the design becomes a bit more involved
This approach has a few advantages: the cache is cheap to maintain, since it requires a single check per object processed and no block/state fetching. Consequently, when a validator requests an update of its performance metrics/logs, the beacon node does not need to perform any computation; it simply returns the last available information.
This does not render obsolete custom endpoints like validator/performance, since the new API endpoint would not take an epoch or any timing parameter; it would simply reply with each tracked validator's last participation in the cache.
Concretely I propose:
--disable-account-metrics and --disable-rewards-penalty-logging to toggle whether or not the validator client calls the above-mentioned API endpoint every epoch.
proto/eth/v2
blockchain.config
service ValidatorLastPerformanceRespose, which just consists of repeated ValidatorLastPerformance messages, one for each tracked index.
The beacon simply updates the cache, discarding any old information, whenever a new attestation for the validator is processed in the blockchain service. The RPC server just flushes this info on the API query.
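As a rough illustration of the response shape, the sketch below assembles a reply from the cache contents. Only the name ValidatorLastPerformance comes from the proposal; its fields, the LastPerformanceResponse wrapper and the buildResponse helper are hypothetical, and plain Go structs stand in for the protobuf messages. Sorting by index is an extra choice made here so the output is deterministic.

```go
package main

import (
	"fmt"
	"sort"
)

// ValidatorLastPerformance mirrors the proposed message name; the
// fields are assumptions, not part of the proposal.
type ValidatorLastPerformance struct {
	Index         uint64
	AttestedSlot  uint64
	InclusionSlot uint64
}

// LastPerformanceResponse is a hypothetical wrapper holding the
// repeated entries, one per tracked index present in the cache.
type LastPerformanceResponse struct {
	Performances []ValidatorLastPerformance
}

// buildResponse copies the cache into a response without consulting
// blocks or states. Entries are sorted by validator index.
func buildResponse(cache map[uint64]ValidatorLastPerformance) LastPerformanceResponse {
	resp := LastPerformanceResponse{
		Performances: make([]ValidatorLastPerformance, 0, len(cache)),
	}
	for _, p := range cache {
		resp.Performances = append(resp.Performances, p)
	}
	sort.Slice(resp.Performances, func(i, j int) bool {
		return resp.Performances[i].Index < resp.Performances[j].Index
	})
	return resp
}

func main() {
	cache := map[uint64]ValidatorLastPerformance{
		42: {Index: 42, AttestedSlot: 132, InclusionSlot: 133},
		7:  {Index: 7, AttestedSlot: 132, InclusionSlot: 134},
	}
	for _, p := range buildResponse(cache).Performances {
		fmt.Printf("index=%d attested=%d included=%d\n", p.Index, p.AttestedSlot, p.InclusionSlot)
	}
}
```

Because the endpoint takes no epoch or timing parameter, the handler has nothing to look up beyond the cache itself.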