# Nomination pool rewards
## Goal
When a payout happens, it should be divided among delegators so that each delegator gets
\begin{equation} \text{payout amount} \times \frac{\text{delegator's shares}}{\text{total shares}} \end{equation}
However, we cannot pay out all the delegators when a payout happens. Instead, we pay each delegator when they claim rewards. For each delegator $D$, we have a quantity called $\text{pending payout}(D)$. When they join, this is $0$. When a payout happens, it increases by the expression above. When they claim rewards, we pay them the pending payout and reset it to $0$.
However, we cannot update a storage item for every delegator when a payout happens, as we want all operations to take constant time. So $\text{pending payout}(D)$ will be a function that we can lazily evaluate. The goal of this document is to explain how to do that.
## How to do that
We have that
$$ \text{pending payout}(D)= \sum \text{payout amount} \times \frac{\text{$D$'s shares at time of payout}}{\text{total shares at time of payout}}$$
where the sum is taken over payouts since $D$ last claimed rewards. If we force $D$ to claim rewards every time before they change their shares, then $\text{$D$'s shares at time of payout}$ is constant over the sum, so we can take it out:
$$ \text{pending payout}(D)= \text{$D$'s shares} \times \sum \frac{\text{payout amount}}{\text{total shares at time of payout}}$$
The summand in this expression does not depend on $D$, so we can keep track of the running sum globally. We define
$$\text{reward counter}=\sum \frac{\text{payout amount}}{\text{total shares at time of payout}}$$
where the sum is taken over all payouts. Now we have
$$ \text{pending payout}(D)= \text{$D$'s shares} \times (\text{current reward counter} - \text{reward counter at $D$'s last reward claim})$$
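As a sanity check of the identity above, here is a minimal Python sketch that compares the naive per-payout sum with the reward-counter difference. The payout history and share numbers are made up for illustration, and exact rationals are used so that rounding plays no role.

```python
from fractions import Fraction

# Hypothetical history: (payout amount, total shares at the time of the payout).
payouts = [(100, 50), (30, 60), (45, 90)]
d_shares = 20  # D's shares, constant because D claims before changing them

# Naive evaluation: D's cut of each payout, added up one payout at a time.
naive = sum(Fraction(amount * d_shares, total) for amount, total in payouts)

# Lazy evaluation: a global running sum that never mentions D.
reward_counter = Fraction(0)
counter_at_last_claim = reward_counter  # D last claimed just before the first payout
for amount, total in payouts:
    reward_counter += Fraction(amount, total)

lazy = d_shares * (reward_counter - counter_at_last_claim)
```

Both `naive` and `lazy` come out to the same amount, but only the second needs no per-delegator work at payout time.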
Now how do we lazily compute the reward counter? Total shares is also constant unless an operation happens that changes it. So if total shares does not change between times $t_1$ and $t_2$, then
$$\text{reward counter at $t_2$}-\text{reward counter at $t_1$} = \frac{\text{total payouts at $t_2$}-\text{total payouts at $t_1$}}{\text{total shares}}$$
Now how do we lazily evaluate total payouts? Well the reward pool balance only changes because of payouts and reward claims (technically we define any balance change in the reward pool that is not a reward claim as a payout and then assume that all such payouts are positive). So we have:
$$\text{total payouts} = \text{reward pool balance} + \text{total reward claims}$$
Now we should be in a position to lazily evaluate all that we need.
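The two lazy steps can be traced with a small worked example in Python. The numbers are invented, and the variable names are only illustrative: total payouts are recovered from the balance and the claims, and the reward counter is then advanced from its value at $t_1$.

```python
from fractions import Fraction

def total_payouts(balance, total_reward_claims):
    # total payouts = reward pool balance + total reward claims
    return balance + total_reward_claims

# State recorded at time t1 (hypothetical numbers).
balance, total_reward_claims = 0, 0
total_shares = 50  # assumed not to change between t1 and t2
counter_at_t1 = Fraction(0)
payouts_at_t1 = total_payouts(balance, total_reward_claims)

# Between t1 and t2: a payout of 100, a reward claim of 40, a payout of 25.
balance += 100
balance -= 40
total_reward_claims += 40
balance += 25

# At t2 nothing was recorded per payout, yet everything can be recovered.
payouts_since_last_record = total_payouts(balance, total_reward_claims) - payouts_at_t1
counter_now = counter_at_t1 + Fraction(payouts_since_last_record, total_shares)
```

The claim of 40 moves value from the balance to the claims total, so it cancels out of `payouts_since_last_record`, which only sees the two payouts.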
## State items associated with nomination pool rewards
Globally we have
$\text{last_recorded_reward_counter}$ - the reward counter when we last recorded it
$\text{last_recorded_total_payouts}$ - the total payouts that we recorded at the same time as $\text{last_recorded_reward_counter}$.
$\text{total_rewards_claimed}$ - the total of all the rewards ever claimed. Unlike the previous two, this is always up to date.
Locally, with each delegator $D$ we have
$\text{reward_counter_at_last_payout}$ - this is the reward counter at $D$'s last payout. If they haven't had a payout, it is the reward counter when they joined.
## Pseudo-code for functions
The external functions we have are claim rewards and bond/unbond. They work like this:
```
claim_rewards(D)
    p = pending_rewards(D)
    if (p < 0)
        raise some error that makes this transaction invalid
    if (p > 0)
        transfer p from the reward pool to D
        pool.total_rewards_claimed += p
    D.reward_counter_at_last_payout = current_reward_counter()
```
```
join(D)
    update_recorded()
    Do_joining_stuff(D)
    D.reward_counter_at_last_payout = current_reward_counter()
```
```
bond_extra(D)
    update_recorded()
    claim_rewards(D)
    D.reward_counter_at_last_payout = current_reward_counter() // Is redundant unless p < 0 in claim_rewards
    // other stuff
```
```
unbond(D)
    update_recorded()
    claim_rewards(D)
    D.reward_counter_at_last_payout = current_reward_counter() // Is redundant unless p < 0 in claim_rewards
    // other stuff
```
These use the functions:
```
pending_rewards(D)
    return D.shares * (current_reward_counter() - D.reward_counter_at_last_payout)
```
```
current_reward_counter()
    payouts_since_last_record = pool.balance + pool.total_rewards_claimed - pool.last_recorded_total_payouts
    return pool.last_recorded_reward_counter + (payouts_since_last_record / pool.total_shares)
```
```
update_recorded()
    pool.last_recorded_reward_counter = current_reward_counter()
    pool.last_recorded_total_payouts = pool.balance + pool.total_rewards_claimed
```
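The pseudo-code above can be turned into a small runnable model. The following Python sketch is only an illustration under some assumptions that are not part of the design: it uses exact rationals instead of fixed point, it treats the reward counter as unchanged while there are no shares (the pseudo-code would divide by zero there), joining simply adds shares, and bond_extra/unbond are left out. The class name `Pool` and the delegator names are hypothetical.

```python
from fractions import Fraction

class Pool:
    def __init__(self):
        self.balance = Fraction(0)  # reward pool balance
        self.total_shares = 0
        self.total_rewards_claimed = Fraction(0)
        self.last_recorded_reward_counter = Fraction(0)
        self.last_recorded_total_payouts = Fraction(0)
        self.shares = {}  # D -> D.shares
        self.reward_counter_at_last_payout = {}  # D -> counter at last payout

    def current_reward_counter(self):
        if self.total_shares == 0:
            # Assumption: with no shares, the counter does not move.
            return self.last_recorded_reward_counter
        payouts_since_last_record = (self.balance + self.total_rewards_claimed
                                     - self.last_recorded_total_payouts)
        return (self.last_recorded_reward_counter
                + payouts_since_last_record / self.total_shares)

    def update_recorded(self):
        self.last_recorded_reward_counter = self.current_reward_counter()
        self.last_recorded_total_payouts = self.balance + self.total_rewards_claimed

    def pending_rewards(self, d):
        return self.shares[d] * (self.current_reward_counter()
                                 - self.reward_counter_at_last_payout[d])

    def claim_rewards(self, d):
        p = self.pending_rewards(d)
        if p < 0:
            raise ValueError("negative pending rewards")
        if p > 0:
            self.balance -= p  # transfer p from the reward pool to d
            self.total_rewards_claimed += p
        self.reward_counter_at_last_payout[d] = self.current_reward_counter()
        return p

    def join(self, d, shares):
        self.update_recorded()
        self.shares[d] = shares  # stand-in for Do_joining_stuff(D)
        self.total_shares += shares
        self.reward_counter_at_last_payout[d] = self.current_reward_counter()

pool = Pool()
pool.join("alice", 30)
pool.join("bob", 20)
pool.balance += 100  # a payout arrives in the reward pool
first_claim = pool.claim_rewards("alice")
pool.join("carol", 50)
pool.balance += 50  # a second payout
pending = {d: pool.pending_rewards(d) for d in ("alice", "bob", "carol")}
```

In this run Alice first claims 60 of the 100 payout (30 of 50 shares). After Carol joins and the second payout arrives, the pending rewards add up to the pool balance, and no per-delegator storage was touched at payout time.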
## Proofs
**Assumption 1**: Every change to the reward pool balance is positive except during reward claims.
**Theorem 1**
Under assumption 1,
1. During every reward claim, the reward pool has enough balance to pay out the pending reward.
2. The sum of all rewards claimed by a delegator D during any run, plus their pending reward at the end of it, is the sum of what they should get from each payout according to equation 1.
Part 2 means that if a delegator does a reward claim at the end of a run, it does not matter whether they did many reward claims during that run or just the one at the end: they get the same total.
This follows from the following proposition, that we will need to go through the various functions in detail to prove:
**Proposition 1**
1. Under Assumption 1, the current reward counter and total payouts ever never decrease with time. They stay constant during any function call.
2. When the reward pool balance does not change, neither do anyone's pending rewards. During a reward claim by delegator D, either they get paid out all their pending rewards, which are reset to 0, or nothing happens; in both cases no other delegator's pending rewards change. Outside of any reward claims, if the reward pool balance changes then pending rewards change according to equation 1.
3. The sum of the pending rewards is always equal to the reward pool balance.
**Proof of Proposition 1**. First, consider what happens during a reward claim. If $p < 0$, nothing happens and so all 3 properties hold. If $p = 0$ (for example when $D.shares = 0$), we update $D$'s reward_counter_at_last_payout, which keeps $D$'s pending rewards at 0. When $p > 0$, we pay out $D$'s pending reward $p$ to $D$ and set $D$'s reward_counter_at_last_payout to the current reward counter, which sets their pending rewards to 0. The reward pool balance also decreases by $p$. To get 1, 2 and 3, it remains to show that the current reward counter, the total payouts ever and other delegators' pending payouts do not change. Total payouts ever is reward pool balance + total_rewards_claimed; the former decreases by $p$ and the latter increases by it, leaving total payouts ever unchanged. In the calculation of the current reward counter, this means that payouts_since_last_record does not change. Since last_recorded_reward_counter and total_shares do not change, neither does the current reward counter. Since the current reward counter does not change, neither do any other delegator's pending payouts.
We next analyse update_recorded. It does not change anything used to calculate total payouts ever; it just records its value, so total payouts ever remains the same. It sets last_recorded_reward_counter to the current reward counter at the beginning of the call. After the function executes, since the total payouts are the same, payouts_since_last_record in the computation of current_reward_counter() will be 0, so the current reward counter equals the value just recorded. Since the current reward counter is unchanged, and no other variable in the pending rewards computation is changed, all pending rewards remain the same.
... more stuff ...
**Proof of Theorem 1**: By part 1 of Proposition 1, under Assumption 1, the reward counter is weakly monotonically increasing with time. Thus every delegator's pending rewards are non-negative. By part 3, the pending rewards sum to the reward pool balance. Thus the reward pool balance is at least any delegator's pending reward, and the pool has enough balance to pay any (or all) delegators that claim their reward.
With non-negativity of pending payouts, part 2 of the Theorem then follows from part 2 of the Proposition by induction.
## Precision issues
We should use fixed point arithmetic for the reward counter.
That is, we actually have an integer fixed_point_base and set current reward counter to be
$$\text{pool.last_recorded_reward_counter} +
\lfloor \frac{\text{payouts_since_last_record} \times \text{fixed_point_base}}{\text{pool.total_shares}} \rfloor$$
and then compute pending payouts as
$$\text{pending_rewards(D)}= \lfloor \frac{\text{D.shares} \times(\text{current_reward_counter} - \text{D.reward_counter_at_last_payout})}{\text{fixed_point_base}} \rfloor$$
We will show that, with this calculation, delegators are only ever underpaid, and the reward pool benefits from this rounding, so it never becomes insolvent.
Note that the error from the rounding in $\text{pending_rewards(D)}$ is always less than 1, so the amount by which it underpays delegators is less than the smallest unit of currency, 1 Planck in Polkadot's case. To make sure that delegators are underpaid by as little as possible, we need to make $\text{fixed_point_base}$ as large as possible.
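The two floor operations can be illustrated with a short Python sketch. The value of `FIXED_POINT_BASE` and the share numbers below are assumptions chosen for the example, not values fixed by this document, and the helper `counter_increment` is a hypothetical name for the rounded term added to the recorded counter.

```python
from fractions import Fraction

FIXED_POINT_BASE = 10**18  # assumed value, for illustration only

def counter_increment(payouts_since_last_record, total_shares):
    # floor(payouts_since_last_record * fixed_point_base / total_shares)
    return payouts_since_last_record * FIXED_POINT_BASE // total_shares

def pending_rewards(shares, current_counter, counter_at_last_payout):
    # floor(shares * (counter difference) / fixed_point_base)
    return shares * (current_counter - counter_at_last_payout) // FIXED_POINT_BASE

# One payout of 100 split between three delegators holding one share each.
payout, total_shares = 100, 3
delegator_shares = [1, 1, 1]

counter = counter_increment(payout, total_shares)
rounded = [pending_rewards(s, counter, 0) for s in delegator_shares]
exact = [Fraction(payout * s, total_shares) for s in delegator_shares]
```

Each delegator is owed 100/3 exactly but is credited 33, so the rounded claims total 99 and the remaining unit stays in the reward pool.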
We want to prove that the rounding leaves the pool solvent. To do that, we introduce two rational quantities:
$$\text{unrounded reward counter} = \text{pool.last_recorded_reward_counter} +
\frac{\text{payouts_since_last_record}}{\text{pool.total_shares}}$$
and
$$\text{unrounded pending rewards(D)}= \text{D.shares} \times(\text{unrounded reward counter} - \text{D.reward_counter_at_last_payout})$$
and we claim that
**Proposition 2**: We always have that
$$\sum_D \text{unrounded pending rewards(D)} \leq \text{pool.balance}$$ and under Assumption 1, $\text{unrounded pending rewards(D)} \geq 0$.
With this, since $\text{pending_rewards(D)} \leq \text{unrounded pending rewards(D)}$ as we round down, the pool is always solvent.
**Proof of Proposition 2**
We analyse how claim_rewards() and update_recorded() affect $\sum_D \text{unrounded pending rewards(D)}$ and $\text{pool.balance}$.
Firstly we look at claim_rewards(D). As before, total payouts ever, i.e. pool.balance + pool.total_rewards_claimed, does not change, and so $\text{unrounded reward counter}$ does not change. As a result, $\text{unrounded pending rewards}(D')$ does not change for $D' \neq D$. $\text{unrounded pending rewards(D)}$ does change, by
$\text{D.shares} \times(\text{D.reward_counter_at_last_payout before} - \text{D.reward_counter_at_last_payout after})$.
## What happens when reward pool balance unaccountably falls?