One performance threshold or more?

This document examines the performance thresholds in the CSM Performance Oracle: the motivation for introducing a threshold and the possible effects of using several thresholds.

Why performance threshold?

In the original document about CSM Performance Oracle, several options for the staking rewards distribution were considered:

Proportional to performance, with different approaches to measuring performance;
Socialized among well-performers, where well-performers are validators above the performance threshold and performance is measured as the ratio goodAttestationsCount / totalAttestationsScheduledForPeriod.

Several reviewers have indicated support for the second option. Hence, it was chosen as the preferred one.

Can the performance threshold approach be extended?

After the initial discussion, Izzy proposed considering options with several thresholds instead of a single one. This document investigates the possible variants and effects of the multi-threshold approach.

Single threshold approach

The simplest approach to the performance threshold concept is using a single threshold. In short, the approach might be described as follows:

The Lido DAO sets the performance threshold;
CSM Performance Oracle uses the threshold value to define well-performers in the current frame;
CSM Performance Oracle uses the goodAttestationsCount / totalAttestationsScheduledForPeriod ratio as the performance metric;
Once performance data is collected, CSM Performance Oracle calculates the number of shares that should be allocated to each CSM Node Operator as numberOfWellPerformingValidatorsForOperator / totalNumberOfWellPerformingValidatorsInCSM * totalRewardSharesInFrame;
To account for the activation and deactivation (exits) of validators within a frame, active_rate = slotsValidatorActive / totalSlotsFrame is factored into the validators' weights;
CSM Oracle reports the updated Merkle tree root on-chain.

In this design, only well-performing validators are eligible for the rewards allocation, while others are not. This makes the performance threshold an effective cliff.
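
The single-threshold steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Oracle's implementation: the `Validator` record and the function name are hypothetical, a validator exactly at the threshold is treated as a well-performer, and floating-point numbers stand in for whatever integer share arithmetic the real Oracle would use.

```python
from dataclasses import dataclass


@dataclass
class Validator:
    operator: str                # hypothetical Node Operator identifier
    good_attestations: int       # goodAttestationsCount for the frame
    scheduled_attestations: int  # attestations scheduled for the frame
    active_slots: int            # slots the validator was active in the frame


def single_threshold_shares(validators, threshold, total_slots, total_shares):
    """Split total_shares among operators in proportion to the
    active-rate-weighted number of their well-performing validators."""
    weights = {}
    for v in validators:
        if v.scheduled_attestations == 0:
            continue  # no duties in the frame, nothing to measure
        performance = v.good_attestations / v.scheduled_attestations
        if performance < threshold:
            continue  # below the cliff: no rewards allocated for this frame
        active_rate = v.active_slots / total_slots
        weights[v.operator] = weights.get(v.operator, 0.0) + active_rate
    total_weight = sum(weights.values())
    if total_weight == 0:
        return {}
    return {op: w / total_weight * total_shares for op, w in weights.items()}


# Illustrative numbers only: a 100-slot frame and 1000 reward shares.
validators = [
    Validator("A", 95, 100, 100),  # well-performer, active the whole frame
    Validator("A", 80, 100, 100),  # below the 90% threshold
    Validator("B", 99, 100, 100),  # well-performer, active the whole frame
    Validator("B", 92, 100, 50),   # well-performer, active half the frame
]
shares = single_threshold_shares(validators, 0.90, 100, 1000.0)
print(shares)
```

In this example operator A ends up with a weight of 1.0 and operator B with 1.5, so the shares split roughly 400 to 600. The validator at 80% contributes nothing, which is the cliff effect described above.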

Multi-threshold approach

The other option for the performance threshold concept is a multi-threshold approach. This approach can be generalized as follows:

The Lido DAO sets the performance thresholds and reduction coefficients for each threshold value;
CSM Performance Oracle uses the threshold values to split validators into performance buckets;
CSM Performance Oracle uses the goodAttestationsCount / totalAttestationsScheduledForPeriod ratio as the performance metric;
Once performance data is collected, CSM Performance Oracle calculates the number of shares that should be allocated to each CSM Node Operator as follows:
- Normalize validator counts depending on the batch as effectiveNumberOfValidatorsForNodeOperator = numberOfNOValidatorsBatch1 * 1 + numberOfNOValidatorsBatch2 * coeffBatch2 + numberOfNOValidatorsBatch3 * coeffBatch3 + ...
- Calculate Node Operator's share as sharesForNodeOperator = effectiveNumberOfValidatorsForNodeOperator / totalEffectiveNumberOfValidators * totalCSMRewardSharesInFrame
To account for the activation and deactivation (exits) of validators within a frame, active_rate = slotsValidatorActive / totalSlotsFrame is factored into the validators' weights;
CSM Oracle reports the updated Merkle tree root on-chain.

This design allows for smoother reward distribution compared to the single threshold approach. Note that the more thresholds are in place, the closer the distribution gets to the fully performance-proportional one.
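
The multi-threshold calculation can be sketched the same way. Again, this is a minimal illustration under assumptions rather than the actual design: the function names, the tuple layout, and the bucket bounds and coefficients (95% with coefficient 1.0, 90% with coefficient 0.5) are all hypothetical, and validators below every bound are assumed to get a coefficient of 0.

```python
def bucket_coefficient(performance, buckets):
    """Return the coefficient of the highest bucket whose lower bound the
    performance reaches, or 0.0 if it is below every bound.

    buckets is a list of (lower_bound, coefficient) pairs in any order.
    """
    for lower_bound, coefficient in sorted(buckets, reverse=True):
        if performance >= lower_bound:
            return coefficient
    return 0.0


def multi_threshold_shares(validators, buckets, total_slots, total_shares):
    """validators is a list of tuples:
    (operator, good_attestations, scheduled_attestations, active_slots)."""
    effective = {}
    for operator, good, scheduled, active_slots in validators:
        if scheduled == 0:
            continue  # no duties in the frame, nothing to measure
        coefficient = bucket_coefficient(good / scheduled, buckets)
        # Bucket coefficient scaled by the share of the frame spent active.
        weight = coefficient * active_slots / total_slots
        if weight > 0:
            effective[operator] = effective.get(operator, 0.0) + weight
    total = sum(effective.values())
    if total == 0:
        return {}
    return {op: w / total * total_shares for op, w in effective.items()}


# Hypothetical buckets: >= 95% counts fully, 90-95% counts at half weight.
buckets = [(0.95, 1.0), (0.90, 0.5)]
validators = [
    ("A", 97, 100, 100),  # top bucket
    ("A", 92, 100, 100),  # reduced bucket
    ("B", 99, 100, 100),  # top bucket
    ("B", 85, 100, 100),  # below every threshold
]
shares = multi_threshold_shares(validators, buckets, 100, 1000.0)
print(shares)
```

Here operator A has an effective validator count of 1.5 and operator B of 1.0, so the split is roughly 600 to 400. With the single 90% threshold from the previous section, the same data would give A a weight of 2.0 instead, which illustrates how the reduction coefficients soften the cliff.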

What approach to choose?

The best way to decide is to go back to the initial motivation for introducing the performance threshold:

I think it’s an interesting way to implicitly show that diversification of the operator set is more important than “maximizing performance” (diversified setups ⇒ perf hit, some clients are less mature, some people operating out of low bandwidth/bad connectivity areas, etc)

There are other cons to aspects in our approach, which we may be framing as pros but will also be construed as cons (e.g. lower gas fees because of single contract approach is good for gas, but bad from decentralization perspective). this shows that we’re taking that into account and encouraging decentralization in another way to make up for that (ways that we think are more important for example)

Due to being late/last mover, we need to offer differentiated features, and I think this is a really interesting one and unlikely anyone else will do this before we do. lido’s size can make this happen, if we’re saying that we want to rebuff the argument that size == bad, this is a great way to do it

With that being said, any approach between the single threshold and proportional to performance approaches does not entirely fulfill the first and third points in the initial motivation, since with multiple thresholds the motivation to keep performance above the first threshold is reduced. Also, the use of multiple thresholds raises a complicated question about the actual values for the thresholds and reduction coefficients.

My personal take

From my personal perspective (Dmitry G), the best way is to go with the single threshold approach. Set the actual value for the threshold low enough to allow for short-term performance issues (say, a validator offline for 1-2 days) and high enough to keep the overall protocol APR at an acceptable level. This will create a direct incentive for the operators to maintain their validators' performance at levels above the threshold, with no intermediate assumptions.

With the multi-threshold approach, we will either lose the idea of "small outages are acceptable" if we set the first threshold too high, or we will lower the bottom line of the target performance by still allocating some rewards to the not-so-good performers if we just add several thresholds below the existing one.

The final thing to keep in mind here is how Simple bonds are designed. Node Operators will still get bond rebase rewards even if they are below the threshold. This fact effectively adds a second virtual threshold: "Unless you are ejected, you will still get bond rebase rewards, which are approximately 50% of the total rewards for the CSM operators."

Final words

If we proceed with the multi-threshold approach, a dedicated analysis will be required to determine the number and actual values for the thresholds.

For a single threshold, the only thing to determine is the value. I would propose 90%.
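
To get a feel for what a 90% threshold tolerates, here is a rough back-of-the-envelope sketch. It assumes a validator that attests perfectly while online and misses every attestation while offline, so its performance ratio is simply the share of the frame spent online. The 28-day frame length and the function names are assumptions made for illustration only; the frame length is not specified in this document.

```python
def performance_after_outage(offline_days, frame_days):
    """Performance ratio of a validator that attests perfectly while online
    and misses every attestation while offline."""
    return (frame_days - offline_days) / frame_days


def max_tolerated_outage_days(threshold, frame_days):
    """Longest total outage that keeps performance at or above the threshold."""
    return frame_days * (1 - threshold)


FRAME_DAYS = 28   # hypothetical frame length, not taken from this document
THRESHOLD = 0.90  # the value proposed above

for offline_days in (1, 2, 3):
    perf = performance_after_outage(offline_days, FRAME_DAYS)
    print(f"{offline_days} day(s) offline: {perf:.1%}, "
          f"meets threshold: {perf >= THRESHOLD}")

print(f"max outage: {max_tolerated_outage_days(THRESHOLD, FRAME_DAYS):.1f} days")
```

Under these assumptions, one or two days offline keep a validator above 90%, while three days push it slightly below, which is in line with the "1-2 days" tolerance mentioned earlier. A different frame length would shift these numbers proportionally.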

isidorosp

2024/02/08 08:01:38

Since with multiple thresholds, the motivation to keep performance above the first threshold is reduced

I think it depends on where the thresholds are set? E.g. if you use multi-threshold (single threshold is A<>B, multi is X<>Y<>Z), you can set the threshold between X<>Y at a higher point than you would set it if there were only two thresholds A<>B, so it's possible that 0.5*rewards(Y)+1*rewards(Z) >= 1*rewards(B). And this is kind of the idea: you can still reward good performers and not completely punish mediocre performers.

dgusakov

2024/02/08 08:52:01

True. This is covered below in my personal take.

2024/02/08 08:03:33

With that being said, any approach between a single threshold and proportional to performance approaches does not entirely fulfill the first and third points in the initial motivation

don't agree

2024/02/08 08:55:25

Why? It depends on how you set thresholds. If you set all of them >= the single one, you dilute rewards a bit while still keeping the lower boundary. If you add new thresholds below the original one, you reduce the lower boundary.

2024/02/08 08:10:36

The final thing to keep in mind here is how Simple bonds are designed. Node Operators will still get bond rebase rewards even if they are below the threshold. This fact effectively adds the second virtual threshold - “Unless you are ejected, you will still get bond rebase rewards, which are approx 50% of the total rewards for the CSM operators.”

This is more of an argument to not do this at all vs in favor of a specific variant IMO. And it's problematic in and of itself.

2024/02/08 08:53:18

I would say this is a thing that we have and need to keep in mind while making the final decision.