# Validator Accounting Reform And Other Proposals

The purpose of this document is to crystallize and explain a series of proposals for incremental changes to the beacon chain specification that simplify the protocol, increase efficiency and improve the protocol's economic properties.

### Basic reward accounting reform

* **Key goals**: simplify attestation processing, reduce costs and shift costs from end-of-epoch to mid-epoch, set the stage for future reforms
* **Key ideas**: remove `PendingAttestations` and instead store flags per validator in shuffled order
* **Target date**: HF1 (aka the light client support fork)

Currently, when we receive an attestation, we defer almost all processing (except basic verification) to the end of an epoch. We store a `PendingAttestation` object that contains the attestation data (not including the signature), and at the end of an epoch we take all of the `PendingAttestation` objects and run the computations for who participated, which epochs get justified and/or finalized, and what each participant's reward or penalty should be.

The obvious question to ask is: why not do the accounting in real time? The answer has been that attestations contain a random sample of validators, and adjusting individual validator records one at a time when they are spread out across a large validator set is too inefficient. Adjusting one record requires `log(n)` hashes to edit the Merkle branch, but adjusting all `n` records at the same time requires as little as `n/4` hashes (as a validator balance is only 1/4 of a chunk).

But it turns out that we can get the best of both worlds. If we make a data structure containing the current-epoch participation records of each validator, we can _make the data structure itself be in shuffling order_. That is, item `k` in the data structure would not literally be the k'th validator; instead, it would be the k'th validator _in the shuffled list_.
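A minimal sketch of what shuffled-order storage buys us (all names here are hypothetical illustrations, not the actual spec): because the flags list is kept in shuffled order, an attesting committee occupies one contiguous slice of it, so processing an attestation touches only one small region of the Merkleized list.

```python
# Hypothetical sketch: participation flags stored in shuffled order, so a
# committee's updates land in a contiguous slice of the flags array.

def process_attestation_flags(flags: list, committee_start: int,
                              aggregation_bits: list, flag: int) -> None:
    """Set `flag` for each participating validator in the committee.

    The committee occupies the contiguous range
    [committee_start, committee_start + len(aggregation_bits)), so only one
    small region of the Merkleized list needs its hashes recomputed.
    """
    for i, bit in enumerate(aggregation_bits):
        if bit:
            flags[committee_start + i] |= flag

# Usage: a 4-validator committee starting at shuffled index 8, where
# members 0 and 2 of the committee participated.
TIMELY_TARGET_FLAG = 1  # hypothetical flag value
flags = [0] * 16
process_attestation_flags(flags, 8, [1, 0, 1, 0], TIMELY_TARGET_FLAG)
assert flags[8] == 1 and flags[9] == 0 and flags[10] == 1
```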
Because attestation committees are directly taken from contiguous segments of the shuffled list, this allows us to efficiently adjust the records associated with each validator directly in attestation processing.

This PR attempts to implement this (along with a few other minor reforms): https://github.com/ethereum/eth2.0-specs/pull/2140

### Duties accounting

* **Key goals**: simplify accounting by removing hysteresis, improve efficiency of end-of-epoch processing, improve inactivity leak fairness
* **Key ideas**: replace the balance/effective balance mechanism with a duties counter and global counters
* **Target date**: between HF1 and sharding?

We add a new data structure called the "duties counter"; this is a simple counter of the duties fulfilled by validators. Get the target correct? Add 1 point. Get the head correct and be included quickly? Add 1 point. Every 256 epochs (this can be staggered: we process 1/256 of validators during each epoch), we update the validator balance based on the duties that the validator fulfilled since the last balance update. For each validator, we know (i) how many duties they fulfilled, and (ii) the maximum number of duties they _could have fulfilled_ (eg. 3 per epoch * 256 epochs = 768, or less if the validator was only active for part of the period). We reward and penalize the validator based on these values, and to determine the coefficient for the collective reward multiplier, we keep a 256-epoch-period EMA of the average participation of all validators. After a balance update, we reset that validator's duties counter back to zero.

We also keep a 256-epoch rolling `recent_leaky_epochs_bitfield`, and we give each validator a "present during leaked epochs" (PDLE) counter. When a validator reward is processed, we take `recent_leaky_epochs_bitfield.count(1) - validator.PDLE` as the number of leaked epochs they were absent in. We penalize them for these epochs, and add it to their "total absent leaked epochs" counter.
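The leak accounting step can be sketched as follows (a toy illustration with hypothetical names; the real spec would define these structures in SSZ):

```python
# Hypothetical sketch of the per-validator leak accounting described above.

class Validator:
    def __init__(self):
        self.duties_counter = 0           # duties fulfilled since last update
        self.pdle = 0                     # "present during leaked epochs"
        self.total_absent_leaked_epochs = 0

def process_leak_accounting(validator: Validator,
                            recent_leaky_epochs_bitfield: list) -> int:
    """Return the number of leaked epochs this validator was absent in,
    and fold it into their running total."""
    leaked_epochs = sum(recent_leaky_epochs_bitfield)  # bitfield.count(1)
    absent = leaked_epochs - validator.pdle
    validator.total_absent_leaked_epochs += absent
    return absent

v = Validator()
v.pdle = 3                        # present for 3 of the leaked epochs
bitfield = [1] * 5 + [0] * 251    # 5 leaky epochs in the 256-epoch window
assert process_leak_accounting(v, bitfield) == 2
assert v.total_absent_leaked_epochs == 2
```

The running `total_absent_leaked_epochs` is what the quadratic penalty (described next) is computed from.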
We penalize the validator quadratically in _their own_ total absent leaked epochs (see [here](recent_leaky_epochs_bitfield) for why this is a good idea).

This restructuring has a few important benefits. First, by making the numbers involved in duties accounting smaller (1-8, instead of larger integers), it greatly increases the efficiency of using Kate commitments to update duties counters. Second, and even more importantly, it is structured in such a way as to ensure that empty epochs (epochs with no blocks) require no processing at all. This is important because it mitigates a large category of DoS vectors in the beacon chain (many blocks that each skip a large number of slots), which currently requires uglier workarounds to protect against.

### Kate commitments

* **Key goals**: further increase the efficiency of rewards accounting
* **Key ideas**: move duties counters to Kate commitments, reducing updating costs
* **Target date**: after sharding?

We can go further and get even more efficiency out of the balance accounting process if we use Kate commitments to store the duties accounting arrays and the intra-epoch performance bitfields (in fact, Kate commitments have no penalty for modifying values in arbitrary positions, so we could even just update duties accounting directly!). See https://dankradfeist.de/ethereum/2020/06/16/kate-polynomial-commitments.html for an explainer on what Kate commitments are.

If an attestation only needs to add a small number (eg. 1...8) to a duties counter, then it only takes ~3 elliptic curve additions (a very small economic expense, even less than a hash!) to process each attestation. We could maintain separate Kate commitments for "current epoch" and "previous epoch" (to prevent double-attesting within one epoch) and a "long-term counter"; merging one into the other is a simple addition operation.
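A toy sketch of why these updates are cheap. Real Kate commitments live in an elliptic-curve group and need a pairing library; here plain integers stand in for group elements and integer addition for point addition, purely to show the shape of the computation (the `lagrange_commitment` values are made-up stand-ins for precomputed per-position commitments):

```python
# Toy model: integers stand in for elliptic-curve group elements, and "+"
# for point addition. Updating position i of a Kate-committed vector by
# delta only requires adding delta * lagrange_commitment[i] to the
# commitment, and a small delta keeps the scalar multiplication tiny.

def small_scalar_mul(point: int, k: int) -> int:
    """k * point via double-and-add; for k in 1..8 this is only a handful
    of group additions, which is why per-attestation cost is so low."""
    result = 0
    while k:
        if k & 1:
            result = result + point   # group addition
        point = point + point         # doubling (also a group addition)
        k >>= 1
    return result

lagrange_commitment = [7, 11, 13]     # stand-ins for precomputed [L_i(s)]*G
commitment = 0
delta, i = 5, 1                       # add 5 duty points at position 1
commitment = commitment + small_scalar_mul(lagrange_commitment[i], delta)
assert commitment == 55

# Merging the "current epoch" commitment into the "long-term" commitment is
# likewise a single group addition:
long_term = 100                       # stand-in for the long-term commitment
long_term = long_term + commitment
assert long_term == 155
```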
We would also add a global counter to track total participation, making epoch transition costs nearly trivial in most cases, as there would be no per-validator data that even needs to be updated at the end of each epoch!

### Exit and re-entry

* **Key goals**: allow validators to exit and re-enter (though with some cost to prevent abuse)
* **Key ideas**: allow exited validators to become active again, comb over the spec to make sure this can be done safely
* **Target date**: between HF1 and sharding?

We can consider allowing exited validators to re-enter the validator set voluntarily. Note that this by itself is only useful before the merge, and only if transfers are not added before the merge, as otherwise validators could just withdraw, transfer to a new account and re-deposit. Here are some notes on considerations around doing this safely: https://notes.ethereum.org/elDvTNrbRqmgP6np_YWc2g

### Load smoothing

* **Key goals**: reduce the ratio between maximum possible load and actual expected load
* **Key ideas**: adjust how validator logic works in the case where the validator count exceeds some threshold, eg. 16M ETH (there are multiple proposals for how to do this)
* **Target date**: unknown, depends on proposal

Currently, beacon chain clients face the challenge that load is very unpredictable in the long term. Load is proportional to total deposit size; hence, the theoretical maximum load (if all ~113M ETH is validating) is many times larger than the likely expected load (if eg. 1-10M ETH is validating). Node operators thus need hardware that supports a much higher level of load than they actually end up processing in the average case, needlessly increasing node operating costs.

#### Idea 1: cap active validator set size, use random sortition to select validators if there are too many

We set a constant `MAX_ACTIVE_VALIDATORS` (eg. 524,288 validators or 16M ETH) denoting the maximum number of validators that can _actually_ be active at any one time.
If the size of the active validator set exceeds `MAX_ACTIVE_VALIDATORS`, then excess validators are randomly selected and put into a "dormant" state. Newly joining validators start dormant, and another random selection process moves dormant validators into the active state.

The goal is to achieve a form of hybrid security: from an economic security point of view, a 51% attack would still require ~8M ETH to misbehave and be leaked or slashed, but even an attacker that has that much ETH would face great difficulties, as they could not get their validators all into the active set at the same time. In fact, to be 1/3 of the active set, the attacker would need to have close to 1/3 of the entire active+dormant set.

This would reduce rewards for validators in the >16M ETH case, but the reduced reward would be made up for by a corresponding reduction in validators' inconvenience and costs:

1. Dormant validators do not need to attest or even stay online
2. Dormant validators can always exit after 1 day and can skip the queue

See https://github.com/ethereum/eth2.0-specs/issues/2137 for one plan of how to implement this.

#### Idea 2: adjust Casper FFG to have variable epoch lengths; make epochs longer if there are very many active validators

Self-explanatory; longer epochs mean a longer time to finality if the validator count is very large, but they keep per-slot load constant. See https://ethresear.ch/t/exponential-epoch-backoff/1152 for some existing thinking on how to implement variable-sized epochs in the context of Casper FFG.
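The sortition in Idea 1 can be sketched as follows (a hypothetical toy model, not the spec's mechanism; Python's seeded `random` stands in for the protocol's RANDAO-derived randomness, and the tiny `MAX_ACTIVE_VALIDATORS` is for illustration only):

```python
# Hypothetical sketch of Idea 1's random sortition between the "active" and
# "dormant" validator sets. A seeded PRNG stands in for RANDAO randomness.
import random

MAX_ACTIVE_VALIDATORS = 4   # toy value; the proposal suggests eg. 524,288

def rebalance(active: set, dormant: set, seed: int) -> None:
    """Demote random excess validators to dormant, or promote random
    dormant validators when active slots are free."""
    rng = random.Random(seed)
    if len(active) > MAX_ACTIVE_VALIDATORS:
        excess = rng.sample(sorted(active), len(active) - MAX_ACTIVE_VALIDATORS)
        active.difference_update(excess)
        dormant.update(excess)
    elif len(active) < MAX_ACTIVE_VALIDATORS and dormant:
        n = min(MAX_ACTIVE_VALIDATORS - len(active), len(dormant))
        promoted = rng.sample(sorted(dormant), n)
        dormant.difference_update(promoted)
        active.update(promoted)

# Usage: 7 validators want to be active, so 3 are randomly made dormant.
active, dormant = set(range(7)), set()
rebalance(active, dormant, seed=42)
assert len(active) == MAX_ACTIVE_VALIDATORS
assert len(dormant) == 3
```

Note how the attacker math falls out of this: which validators are active is outside any participant's control, so controlling 1/3 of the active set requires close to 1/3 of the whole active+dormant pool.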