This is a vision by some Lido contributors for CSM v2, the next major CSM upgrade. Community input and comments are welcome!
The first version of the Community Staking Module (CSM) has been live on mainnet since October 2024. Research and development for the second version of CSM should start now to ensure timely delivery of the updates required by the Pectra hardfork, as well as of the improvements that have already been identified.
Several important features and improvements are proposed to be implemented in the new version of CSM. A short summary of the features is shown below:
With the introduction of EIP-7251, Ethereum validators can be consolidated into large (2048 ETH) validators or created as large validators from the start. This allows for a significant decrease in the load on the P2P network due to a reduction in the number of peers. However, with no ability to specify a "custom ceiling" for large validators, stakers can only choose between small (32 ETH) and large (2048 ETH) validators. In terms of overall protocol capital efficiency, it only makes sense to consolidate small validators into large ones, or to create large validators, if the balance of the resulting large validators will be >= 1700 ETH (see research results by Lido contributors). The word "Community" in the name of CSM means that the target audience for the module is community stakers who run around 10-20 validators on average. This means that for most of them, consolidation would be capital-inefficient for the protocol, although further analysis remains to be done to understand the overall effect of possible consolidation on the landscape of CSM validators (e.g., the impact 0x02 validators would have on the operational gas costs associated with running X validators, performing partial withdrawals when needed, etc.). At the same time, community stakers will not benefit much from consolidation given the low per-validator bond requirements (1.3 ETH currently, with a theoretically possible reduction in CSM v2). With that said, and because CSM could not support large validators without corresponding changes in the underlying Lido on Ethereum protocol, it is proposed that support for large validators be considered at a later juncture (e.g., together with CSM v3).
However, support for 0x02-type validators enabled by EIP-7251 might be considered again in the future, once more real data about average per-operator validator counts is available and the Lido on Ethereum protocol introduces support for large validators.
Preconfirmations are a hot topic in the Ethereum space. Research is being conducted on implementing preconfirmations support at the Lido protocol level. Additional features should be considered for inclusion into CSM v2 depending on the research outcomes.
Before diving into the CSM v2 features, providing a short product rationale is crucial.
This concept makes CSM entry modular and extendable. Entry Gates allow for unlimited curated/automated/custom entry pathways for CSM. Extensions allow third parties to build their products on top of CSM with minimal security and trust assumptions since the core CSM code secures the core protocol.
This feature is crucial for increasing the participation of independent community stakers in the protocol. It would give identified community stakers a way to join the protocol in an expedited manner compared to permissionless (unknown) operators, which directly impacts the decentralization of the Lido on Ethereum protocol.
This is a simple enabler needed to support the different node operator types. It allows CSM to adapt to changing market conditions and provide separate treatment for each type.
Staking protocols should have some way of protecting themselves against malicious or ill-performing Node Operators. In a permissionless system, this can only be done automatically. The strikes system is an automated tool that prevents potential performance issues from substantially affecting the overall performance and health of the protocol.
It has become apparent that CSM participants can be divided into several distinct types:
Defining these types, together with mechanisms to identify which type a specific Node Operator belongs to, means that parameters like Node Operator reward share, performance threshold, deposit queue priority, bad performance ejection params, penalties not tied to the Ethereum network (keyRemovalCharge, ElRewardStealingAdditionalFine), etc., could be customized within a single module instead of requiring multiple different modules catering to different NO segments.
The main vibe of CSM v2 is to keep supporting community stakers and increase their number by offering individual conditions for different Node Operator types, specifically community stakers.
To enable this special treatment, several features listed below should be introduced.
CSModule.sol currently has several methods for creating Node Operators. During the Early Adoption (EA) period, these methods allow only members of the EA List to join. Once the EA period is over, these methods become permissionless. This approach is extremely limited in terms of customization. Hence, the concept of Entry Gates and Extensions is proposed.
To implement this concept, Node Operator creation methods in CSModule.sol should be permissioned.
Note: Permissioned methods for Node Operator creation at the CSModule.sol level don't mean that permissionless entry wouldn't be allowed, just that it would move one level deeper into the stack.
Only members of the corresponding role (CREATE_NODE_OPERATOR_ROLE) can call these methods. These role members are what we call Entry Gates and Extensions.
Entry Gates are smart contracts that allow users to join CSM. There are two types of entry gates:
An Entry Gate can assign a custom Node Operator type (defined by bondCurveId) to the operators created using it.
A permissioned entry gate can allow existing Node Operators to prove they are eligible for the corresponding Node Operator type and upgrade their existing CSM Node Operator to this type (the corresponding bondCurveId).
This is a stateless contract that proxies addNodeOperator* calls with no additional actions. It maintains the invariant of mandatory key upload.
An entry gate that allows only vetted addresses to join, verified using a Merkle tree.
Additionally, it sets a special Node Operator type (bondCurveId) for the Node Operator created.
Similar to entry gates, extensions are smart contracts that allow Node Operators to join CSM. Extensions also abstract Node Operator management in some form. An extension might assign itself as a CSM Node Operator manager and/or reward address to do that. This allows an extension to implement custom principles of Node Operator management, like taking a reward share from the Node Operator rewards or allowing only verified validator keys to be uploaded.
A good but not exhaustive example of the extension is a DVT-powered extension. Since DVT assumes the collective operation of a single validator by the cluster participants, a DVT-powered extension can manage individual cluster participants and share rewards. The other possible aspect of a DVT-powered extension can be on-chain key validation, ensuring that only keys created via DKG can be uploaded to CSM via this extension.
Node Operator creation will have an interface similar to the current one. However, the contract to be called will be different. By default, it is assumed that CSM will have two native entry gates, but the cool thing about entry gates is that they can be added or modified later without needing to change the core contracts. This would allow, for example, for multiple different mechanisms (working via dedicated entry gates) to identify independent community stakers or updating the "master data" that an entry gate checks eligibility against, etc.
Permissionless entry gate will have a simplified interface for Node Operator creation (without proof argument):
Independent community stakers entry gate will inherit the existing CSM interface for Node Operators creation:
In addition to the Node Operator creation, the Identified Community stakers entry gate will also provide a method for the existing CSM Node Operators to claim custom Node Operator type (beneficial bond curve) if they have created a CSM Node Operator before CSM v2 and were included into the updated list of the Identified community stakers upon CSM v2 launch or later:
All other interactions with CSM will remain intact except for minor changes in the interface of the addKeys methods (a from argument is added):
To support different Node Operator types in CSM v2, a special tech solution is required consisting of two parts:
The latter can be solved by moving existing parameters and adding new ones to a separate registry contract that will store individual values of each parameter for each Node Operator type. The first part is solved by the Entry Gates and Extensions described above.
Since there are many Node Operator types and not all require a complete set of custom parameters, mandatory default values for each parameter should be stored in the registry. Importantly, these defaults should be set using the most "conservative" approach, given a default NO type of "permissionless & unknown". The default value should be used if no custom value is set for the particular parameter and Node Operator type pair.
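The default-with-override lookup can be sketched as follows. This is a minimal Python model, assuming a hypothetical `ParametersRegistry` class keyed by parameter name and curve ID; the parameter value used in the usage note is made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ParametersRegistry:
    """Hypothetical model: one mandatory default per parameter, optional per-type values."""
    defaults: dict[str, Any]
    overrides: dict[tuple[str, int], Any] = field(default_factory=dict)

    def set_custom(self, name: str, curve_id: int, value: Any) -> None:
        # A custom value may only be set for a parameter that has a default.
        if name not in self.defaults:
            raise KeyError(f"unknown parameter: {name}")
        self.overrides[(name, curve_id)] = value

    def unset_custom(self, name: str, curve_id: int) -> None:
        # Removing the custom value makes lookups fall back to the default again.
        self.overrides.pop((name, curve_id), None)

    def get(self, name: str, curve_id: int) -> Any:
        # Custom value if one is set for this (parameter, type) pair, else the default.
        return self.overrides.get((name, curve_id), self.defaults[name])
```

For example, with `defaults={"keysLimit": 100}` and a custom value of 10 for curve 2, `get("keysLimit", 2)` returns 10 while every other curve still gets 100.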
It is proposed that this contract be called CSParametersRegistry. The general scheme for the contract looks as follows:
Currently, CSM has several parameters that can be migrated to the CSParametersRegistry:
keyRemovalCharge - an amount of stETH charged for the deletion of the validator key from the CSM deposit queue;
elRewardsStealingAdditionalFine - an amount of ETH added to the amount of ETH EL rewards stolen or misdirected by the CSM validators while proposing blocks;
performanceLeeway - a value or set of per-key-interval values (see "Parameters depending on the number of validator keys") in BP for the performance leeway used by CSM Performance Oracle to calculate Node Operator rewards distribution;
On top of these params, several new params can emerge within new features of CSM v2, like:
priorityQueueParams - priority queue ID and number of seats a Node Operator is eligible to occupy in the priority deposit queue;
rewardShare - a value or set of per-key-interval values (see "Parameters depending on the number of validator keys") in BP determining the share of the Node Operator rewards allocated to the module by the Staking Router that a Node Operator is eligible for. All undistributed rewards are returned to the Lido DAO treasury;
strikesParams - parameters for the new strikes system (lifetime, threshold);
badPerformancePenalty - a penalty that is confiscated if the validator is ejected from the protocol due to systematic bad performance;
performanceDutyCoefficients - a set of coefficients for the performance rating calculation by CSM Performance Oracle (see "Updated CSM Performance Oracle metric" section);
keysLimit - the number of active keys a Node Operator of a certain type is allowed to have. This also applies to key upload. It can be used for Pro-operator types to limit their possible impact on the protocol decentralization should they be given special conditions (joining via the Entry Gate with a specific Node Operator type).
The main parameter related to the Node Operator type is bondCurveId. This parameter defines a Node Operator type and will still be stored in CSAccounting.sol. The creation of a new Node Operator type will effectively mean the creation of a new bond curve.
Several parameters, namely performanceLeeway and rewardShare, should depend on the number of validator keys controlled by the Node Operator to allow for limited beneficial values. Both performanceLeeway and rewardShare are set as a function of the total validator keys controlled by the node operator.
Ex. rewardShare for the first 10 validator keys of the given node operator is 100% of the rewards allocated to the module by the staking router; for the next 10 keys, it is 90%, and for the other keys (>20), it is 80%.
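The per-key-interval example above can be worked through with a short Python sketch. The interval encoding (first key number, share in BP) and the helper names are assumptions made for illustration, as is the averaging helper; the share values are the ones from the example.

```python
from fractions import Fraction

# (first 1-based key number of the interval, share in basis points)
REWARD_SHARE_INTERVALS = [(1, 10_000), (11, 9_000), (21, 8_000)]


def share_for_key(key_number: int, intervals=REWARD_SHARE_INTERVALS) -> int:
    """Share (in BP) applied to the operator's key with the given 1-based number."""
    if key_number < 1:
        raise ValueError("key numbers start at 1")
    share = intervals[0][1]
    for start, bp in intervals:  # intervals are sorted by their first key number
        if key_number >= start:
            share = bp
    return share


def average_share_bp(total_keys: int, intervals=REWARD_SHARE_INTERVALS) -> Fraction:
    """Average share (in BP) across all keys of an operator with total_keys keys."""
    if total_keys < 1:
        raise ValueError("operator must have at least one key")
    total = sum(share_for_key(k, intervals) for k in range(1, total_keys + 1))
    return Fraction(total, total_keys)
```

Under these assumptions an operator with 20 keys averages 9500 BP (ten keys at 100%, ten at 90%), and one with 30 keys averages 9000 BP, so the beneficial value is limited to the first intervals.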
For CSM v2, finding a way for different node operator types to be deposited before the permissionless (unknown) operators is required.
CSModule keeps track of a predefined list of priority queues. Every queue is identified by its index, which also serves as its priority indicator. A queue with an index of 0 has the maximum priority. The module has a fixed number of queues, limited by the LOWEST_PRIORITY deploy-time constant. As a result, the module can work with any queue in the range [0; LOWEST_PRIORITY], where LOWEST_PRIORITY is reserved to be used as a default queue.
CSM v2 introduces a separate contract to the module composition called CSParametersRegistry, so the configuration for separate priority queues is stored in the registry. Queue configuration looks like this:
In this structure, priority indicates which queue to use, and maxDeposits shows the maximum number of validator keys a node operator can get deposited using the priority queue.
A node operator can be assigned to a group identified by its curveID, so every curveID has its priority queue configuration. The configuration is not limited to using specific queues except for reserved ones. We can imagine the following setup:
The mechanic of adding new keys by a node operator looks like this:
1. Look up the QueueConfig associated with the node operator's curveID.
2. Determine how many of the new keys fit within the QueueConfig.maxDeposits limit and can be put into the queue with priority QueueConfig.priority.
3. Put that number of keys into the queue with QueueConfig.priority and the rest into the default one, i.e., the queue with LOWEST_PRIORITY.
Note that a node operator can get up to maxDeposits keys deposited using its priority queue. In other words, the following scenario is possible:
[...] (maxDeposits) keys and puts them in the priority queue.
The mechanic is simple. CSModule goes over the queues in the order of their priority (0 -> LOWEST_PRIORITY) and processes batches from the queues sequentially: when one queue is exhausted, the next one is used to fetch batches.
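The key-adding steps and the priority-ordered retrieval can be sketched in Python as below. This is a simplified model under assumptions: `DepositQueues`, the `(operator_id, key_count)` batch format, the `already_counted` argument (keys of the operator that already consumed priority seats), and the value of `LOWEST_PRIORITY` are all hypothetical, and the legacy queue is not modelled.

```python
from collections import deque
from dataclasses import dataclass

LOWEST_PRIORITY = 5  # hypothetical value; a deploy-time constant in the proposal


@dataclass(frozen=True)
class QueueConfig:
    priority: int      # index of the queue to use
    max_deposits: int  # keys the operator may get deposited via that queue


class DepositQueues:
    def __init__(self) -> None:
        # One FIFO queue of (operator_id, key_count) batches per priority.
        self.queues = [deque() for _ in range(LOWEST_PRIORITY + 1)]

    def enqueue(self, operator_id: int, new_keys: int,
                already_counted: int, config: QueueConfig) -> None:
        """Split new keys between the operator's priority queue and the default one."""
        seats_left = max(config.max_deposits - already_counted, 0)
        to_priority = min(new_keys, seats_left)
        if to_priority:
            self.queues[config.priority].append((operator_id, to_priority))
        rest = new_keys - to_priority
        if rest:
            self.queues[LOWEST_PRIORITY].append((operator_id, rest))

    def next_batches(self, deposits: int) -> list[tuple[int, int]]:
        """Take up to `deposits` keys, draining queues from priority 0 upwards."""
        taken = []
        for queue in self.queues:
            while queue and deposits > 0:
                operator_id, count = queue[0]
                use = min(count, deposits)
                taken.append((operator_id, use))
                deposits -= use
                if use == count:
                    queue.popleft()
                else:
                    queue[0] = (operator_id, count - use)  # keep the remainder queued
        return taken
```

In this model an operator with 10 priority seats who uploads 12 keys gets 10 keys into its priority queue and 2 into the default queue, and the 10 priority keys are deposited before any batch waiting in the default queue.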
The cleanup mechanic works similarly to deposit data retrieval. Queues are ordered and processed by their priority, and the batches that cannot be deposited are removed from the queue where they were found.
To migrate the current queue to the new mechanic, a separate priority is dedicated to the queue and can't be used by any curveID. LEGACY_QUEUE_PRIORITY is set to LOWEST_PRIORITY - 1. As a result, we have the following range of free priorities: [0; LOWEST_PRIORITY-2]. Eventually, the legacy queue will be entirely deposited, and the reserved priority will be released for use in the next release of CSM (v3).
For the node operators eligible for priority queue seats, a special method will be provided to migrate an eligible number of keys from the legacy queue to the new priority queue.
If the Node Operator is eligible for N seats in the priority queue but already has N or more keys deposited, the migration method described above will have no effect. See the "Adding keys to the queue(s)" section for details.
One of the unresolved issues in the current version of CSM is the ejection of the bad-performing validators. Although these validators will not get the Node Operator's reward, the bond rebases will persist, and such validators will negatively impact the overall Lido on Ethereum protocol APR. Hence, the price of the theoretical attack on the Lido on Ethereum protocol is decreased compared to the other permissionless protocols.
As described in the document attached to the CSM Architecture, one optimal way to tackle the issue is to introduce a bad-performance strikes system into CSM.
It is proposed to have a single actor responsible for the performance strikes assignment - CSM Performance Oracle.
Once in a frame, CSM Performance Oracle delivers an additional tree root with information about "strikes" for the validators. A strike means that the validator performed below the threshold in this frame. When updating this tree, CSM Performance Oracle considers the previous values from the old tree. All strikes older than the strikesLifetime oracle frames (e.g., 6 frames) are dropped.
Strikes tree leaves have a form of {noID, validatorPubkey, [strikeTimestamps]}.
The main reason for assigning strikes to validators and not Node Operators is to maintain consistency in the performance measurements. Currently, CSM Performance Oracle considers validators' performance individually. Hence, strikes should also be a validator property to ensure precise ejections of the bad-performing validators.
It is crucial to note that strikes are not a penalty but an indicator of bad performance that should be considered by the Node Operators as a signal to improve their performance.
In the initial CSM proposal, the term "performance tax" was used. However, this value might be challenging to calculate accurately. It seems reasonable to rename the initial term to "bad performance penalty" and make it a fixed value that is confiscated from the Node Operator's bond should their validators be ejected due to a sufficient number of strikes.
It is proposed to have a fixed configurable value (optionally individual per Node Operator type) for the "bad performance penalty", with the Lido DAO being the actor to set/update the actual value should the network conditions change. This approach allows Lido protocol to keep "bad performance penalty" up to date.
Once the number of strikes reaches the strikesThreshold (e.g., 3 strikes in 6 months), the permissionless method can trigger exit for the validator and confiscate a "bad performance penalty" from the Node Operator's bond.
Ejection parameters are subject to Lido DAO decision.
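The strike bookkeeping described above can be sketched in Python. This is an illustration under assumptions: strikes are tracked as oracle frame indices rather than timestamps, a strike is treated as expired once it is `lifetime` or more frames old, and the names `StrikesParams`, `update_strikes`, and `is_ejectable` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StrikesParams:
    lifetime: int   # how many oracle frames a strike stays alive
    threshold: int  # number of live strikes that makes a validator ejectable


def update_strikes(previous: list[int], current_frame: int,
                   below_threshold: bool, params: StrikesParams) -> list[int]:
    """Return the frames of strikes still alive after processing current_frame."""
    # Drop strikes that fell out of the lifetime window.
    alive = [f for f in previous if current_frame - f < params.lifetime]
    # Add a new strike if the validator performed below the threshold in this frame.
    if below_threshold:
        alive.append(current_frame)
    return alive


def is_ejectable(strikes: list[int], params: StrikesParams) -> bool:
    """True once the number of live strikes reaches the threshold."""
    return len(strikes) >= params.threshold
```

With a lifetime of 6 frames and a threshold of 3, strikes in frames 0 and 2 followed by one in frame 6 do not make the validator ejectable, because the frame-0 strike has expired by then; another strike in frame 7 does.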
Since Node Operator key indices in the CSM keys storage might be changed in the optimistic vetting approach (deleted key is swapped with the last key in the keys storage), it is required to provide the current key index in the Node Operator's storage to the permissionless method and check that the key in the leaf and the storage are identical.
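A small Python sketch shows why the caller has to pass the current key index and why the key must be compared with the leaf. The list-based storage and the function names are assumptions for illustration only.

```python
def remove_key(keys: list[bytes], index: int) -> None:
    """Swap-and-pop removal: the last key takes the place of the deleted one."""
    keys[index] = keys[-1]
    keys.pop()


def key_matches_leaf(keys: list[bytes], key_index: int, leaf_pubkey: bytes) -> bool:
    """Check that the key stored at key_index is the one named in the strikes leaf."""
    return 0 <= key_index < len(keys) and keys[key_index] == leaf_pubkey
```

After removing index 0 from `[a, b, c]`, the storage becomes `[c, b]`, so key `c` must now be referenced with index 0; a stale index of 2 fails the check instead of ejecting the wrong validator.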
Since validator ejection with EIP-7002 comes at a price, this price should be confiscated from the Node Operator's bond and transferred to the method caller to cover the corresponding operational expenses.
Node Operators may decide to exit their validator before receiving the final strike that would reach the threshold, which allows them to avoid confiscation of the "bad performance penalty."
It is crucial to note that all direct losses will be confiscated, and no staking rewards will be distributed during frames with poor performance anyway.
Given the strike system's goal of "Protecting protocol from systematic bad performers while keeping performance leeway," it is reasonable to allow bad-performing validators to voluntarily leave the protocol, effectively reducing the number of bad-performing validators in the protocol.
Also, accounting for the already assigned strikes upon validator withdrawal would require a direct connection between the withdrawal reporting process and CSM Performance Oracle, effectively making exits permissioned and heavily dependent on the CSM Performance Oracle operation.
Currently, CSM Performance Oracle uses the included attestations (inclusion delay and correctness are not considered) rate as a performance proxy for the CSM validators:
This metric is an excellent proxy for the validator's liveness since each Ethereum validator should submit one attestation each epoch. However, two more validator duties are not accounted for in the current approach: block proposals and sync committee participation. These duties are much less frequent than attestations but are vital for the network. As CSM evolves and gains a stake share in the Lido on Ethereum protocol, a more robust and accurate performance proxy is required.
Performance rate representation in the form of the relation between fulfilled and assigned duties has proven to be robust and easy to understand. It is proposed that this approach be kept while enriching it with the block proposal and sync committee duties. The resulting formula will take the form:
where each of the three duties (attestations, block proposals, and sync committee participation, respectively) has its own weight coefficient and its own effectiveness rating.
A similar approach is used by beaconcha.in to calculate validator effectiveness.
The default weight coefficients are defined as follows (according to eth2book):
However, these coefficients assume validator rewards over a long period of time during which all 3 duties were assigned. Since CSM Performance Oracle has a relatively short frame, it is required to account for the scenarios when not all duties were assigned within the frame.
The weight coefficients can have custom values for the given Node Operator types. This allows for a special performance rating calculation. Ex. Identified Community Stakers might have lower coefficients for the block proposals or sync committee.
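One way to handle frames in which some duties were not assigned is to average only over the assigned duties and renormalize the weights. The Python sketch below illustrates that idea; it is an assumption about how the gap could be handled, not the oracle's confirmed algorithm. The weights 54, 8, and 2 are also an assumption, taken from the consensus-spec incentive weights out of 64 (attestations 14 + 26 + 14, proposer 8, sync committee 2).

```python
from fractions import Fraction
from typing import Optional

# Assumed default weights (consensus-spec incentive weights, out of 64).
DEFAULT_WEIGHTS = {"attestations": 54, "proposals": 8, "sync": 2}


def performance(ratings: dict[str, Optional[Fraction]],
                weights: dict[str, int] = DEFAULT_WEIGHTS) -> Optional[Fraction]:
    """Weighted average of effectiveness ratings over the duties assigned in a frame.

    A rating of None means the duty was not assigned in this frame, so its
    weight is excluded from both the numerator and the denominator.
    """
    assigned = {duty: r for duty, r in ratings.items() if r is not None}
    total_weight = sum(weights[duty] for duty in assigned)
    if total_weight == 0:
        return None  # nothing was assigned, so performance is undefined
    return sum(weights[duty] * r for duty, r in assigned.items()) / total_weight
```

With these weights, a validator that only had attestation duties in the frame is rated purely on attestations, while one that attested perfectly, missed its only proposal, and fulfilled its sync duties gets (54 + 0 + 2) / 64 = 7/8. Custom coefficients per Node Operator type would simply be a different `weights` mapping.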
The most general approach to attestation effectiveness calculation is:
Actual attestation reward is calculated using a sophisticated algorithm described here. One can see that the actual reward depends on multiple factors like vote correctness and inclusion delay. These factors are not fully under the Node Operators' control given that they validate from home and not from top-tier data centers. The choice of Ethereum clients might also affect attestation reward performance.
Since CSM is primarily focused on operators validating from home, it is worth using the attestation submission rate as attestation effectiveness.
This approach is vulnerable to the edge case of a significant percentage of incorrect attestation votes alongside a high submission rate. However, this case is only reachable with explicit malicious intent, since all of the current Ethereum clients are designed for maximal performance and would not demonstrate such behaviour without specific code or configuration modifications. Any malicious case can still be detected by tools like beaconcha.in or rated.network and counteracted with Node Operator penalisation and ejection from the validator set.
With block proposals the situation is simple and straightforward. Node Operators should maintain their setup in a way that allows them to submit a valid block in time even if they are operating from distant locations. This requirement comes from the fact that block production is a vital duty of any validator. Unlike attestations, where individual invalid or delayed votes cannot disturb the network, block proposal is a binary duty: the block is either proposed or not. Hence, the block proposal effectiveness can be calculated as follows:
Sync committee is a very rare duty of the validator. The effectiveness of this duty can be calculated as:
It is crucial to subtract missing blocks from the denominator since sync committee votes can only be included into the proposed blocks and block proposers are not correlated with the sync committee participants.
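The per-duty ratings discussed above can be sketched as simple ratios. The Python functions below are an assumption-based reading of the text (submission rate for attestations, proposed over assigned for blocks, and included sync votes over assigned slots minus slots with missed blocks); the function and argument names are hypothetical.

```python
from fractions import Fraction
from typing import Optional


def _ratio(fulfilled: int, possible: int) -> Optional[Fraction]:
    # None marks "duty not assigned" (or nothing could have been included).
    return Fraction(fulfilled, possible) if possible > 0 else None


def attestation_effectiveness(included: int, assigned: int) -> Optional[Fraction]:
    """Submission rate: included attestations over assigned ones."""
    return _ratio(included, assigned)


def proposal_effectiveness(proposed: int, assigned: int) -> Optional[Fraction]:
    """Binary duty per slot: proposed blocks over assigned proposals."""
    return _ratio(proposed, assigned)


def sync_effectiveness(included_votes: int, assigned_slots: int,
                       missed_block_slots: int) -> Optional[Fraction]:
    """Sync votes over the slots where a block existed to include them."""
    return _ratio(included_votes, assigned_slots - missed_block_slots)
```

For example, a sync committee member assigned 100 slots, 10 of which had no block, and with 90 votes included is rated 90 / (100 - 10) = 1, since the missing blocks were outside its control.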
Using the formula for individual validator performance, the average network performance can be calculated as the average of all individual validators' performances.
However, it might be resource-consuming to calculate individual performance for each validator. Also, we need to account for changes in the number of active validators. To simplify the calculations, an alternative approximate formula can be used:
where:
Given all of the calculations above, the final algorithm will look as follows:
CSM v2 will introduce custom performance thresholds for the Node Operator types. This feature will make it possible to make the performance rating calculation more precise (including vote accuracy and inclusion delay) while keeping the performance threshold low enough exclusively for community stakers. Without a separate threshold, an accurate performance metric with a low threshold would allow for significant downtime of validators operating from DCs, which is net negative for the protocol and the overall network. On the other hand, an accurate metric with a high performance threshold would not allow community stakers with the limitations described above to get rewards.
Although the update of the CSM Performance Oracle will be delivered within CSM v2, it is proposed to follow a "staged" approach and first implement simplified attestation effectiveness accounting. This will allow for a shorter development and release time. It will also help to gather more real-world data about CSM validators' performance and make an informed decision on the value of the lowered performance threshold for community stakers.
Execution Layer Triggerable Exits (EIP-7002) will allow staking protocols to request exits for the validators with the withdrawal credentials pointing to their contracts. However, this method of exit requesting comes with a fee. Hence, the existing method of requesting exit by signing a corresponding message with the validator private key is still preferable.
There are several cases when CSM might require usage of the EL triggerable exits:
The first case should be resolved within VEBO since CSM does not have information about the validator keys requested by VEBO in case of exit requests for withdrawal coverage or due to targetLimit. However, VEBO should inform CSM (using a "hook" method) about the exits being triggered for delayed validators so that CSM can penalize the Node Operator's bond. This penalty should not be burned but transferred to the Lido DAO treasury to cover the fee paid by VEBO to trigger exit requests.
CSM must directly initiate both the second and third cases since the exit should be triggered for a particular key. This means that the Lido on Ethereum implementation of EIP-7002 should be able to request exit triggering for a particular validator key directly.
In the case of ejection due to strikes, all corresponding penalties should be applied upon ejection. It is worth considering a tip for the method caller to compensate for the gas costs + premium. This tip should be confiscated from the Node Operator's bond.
When it comes to voluntary ejection, the module should not apply any additional penalties, assuming that Node Operators will pay the total price for the ejection themselves.
It is also proposed to allow DAO (via vote or EasyTrack) to explicitly request ejection of the CSM validators.
To attract more Identified Community Stakers (ICS) to CSM, it is proposed to introduce an optional referral program called "Bring your friend to CSM".
If one ICS (referrer) invites another ICS (referral), the referrer can receive some benefits from the invitation. A possible benefits structure might look like:
Invite means that upon NO creation, the referral specifies the referrer's address in the transaction. This information is recorded on-chain. Node Operators with recorded invites get access to the benefits described.
Both the referral and the referrer should pass the identification process and get an ICS pass.
A referrer cannot be specified for existing Node Operators, only for new ones who are eligible for the ICS Node Operator type at the moment of creation.
The referral program consists of seasons. At the start of each season, a Node Operator type that can be obtained as a reward and a referrals threshold are set. These parameters cannot be changed within a season. Points for inviting referrals are counted and valid only within a season. The beneficial Node Operator type can be claimed only while the season lasts. When a new season starts, all previously collected referral points are dropped.
The initial slashing penalty will be reduced with EIP-7251 from 1 ETH to 1/128 ETH for the 32 ETH validators. Given the associated gas costs, reporting the initial slashing will become useless after Pectra.
More details on the solution space here.
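The change in the initial slashing penalty can be checked with a few lines of Python. The two quotient constants are the ones from the consensus specs (Bellatrix and Electra); the helper function is a hypothetical illustration of the division the spec performs on the effective balance.

```python
from fractions import Fraction

# Consensus-spec constants for the initial slashing penalty.
MIN_SLASHING_PENALTY_QUOTIENT_BELLATRIX = 32   # before Pectra
MIN_SLASHING_PENALTY_QUOTIENT_ELECTRA = 4096   # with EIP-7251


def initial_slashing_penalty(effective_balance_eth: int, quotient: int) -> Fraction:
    """Initial penalty in ETH: effective balance divided by the quotient."""
    return Fraction(effective_balance_eth, quotient)
```

For a 32 ETH validator this gives 32 / 32 = 1 ETH before Pectra and 32 / 4096 = 1/128 ETH (about 0.0078 ETH) after it, matching the reduction described above.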