# Freeloading Games in Reward Symmetry
(This formulation is a next iteration following a [previous article](https://hackmd.io/@onur/risk_sharing_in_pos))
Let us assume that each validator has staked some number of tokens, which determines how much reward they will receive. Let $w_i$ be the normalized weight of validator $i$. Validators have to have nonzero weights. Each validator is supposed to send exactly one message, and the strategies are `SEND` and `DON'T SEND` respectively.
Let $x=(x_1,\dots,x_N)$ be the tuple of boolean values that hold whether a validator has sent a message. Then participating weight is calculated as
$$
W_p(x) = \sum_{i\in V}x_iw_i
$$
The total base reward is defined as R, and the actual amount to be distributed after correction is calculated as $W_p(x)R$. Reward symmetry dictates that each validator receives the same reward per weight, regardless of their sending a message or not. Thus, a validator $i$ receives $w_iW_p(x)R$ for a given participation profile $x$.
Moreover, we dictate that a minimum participating weight $W_\min$ needs to be observed in order for everyone to receive any reward. If participating weight is greater than or equal to this value, we call the outcome *successful*, and *unsuccessful* otherwise. In unsuccessful rounds, no validator receives any reward.
$$
s(x) = \begin{cases}
1 & \text{if } W_p(x)\geq W_\min \\
0 & \text{otherwise}
\end{cases}$$
Let us assume that the cost of sending a message for a given validator depends on weight, given by $c(w)$. **We also assume that $c(w)>0$ for $w>0$ and $c$ is monotonically increasing in $w$.** A validator $i$ incurs the cost only if they send a message, i.e. $x_i=1$. The main point of this game is that validators may "freeload" by not sending any message, because then they would not incur the cost, but still receive some rewards
Finally, payoff of a validator $i$ is defined as
$$
\pi_i(x) =
\begin{cases}
w_iW_p(x)R-c(w_i) & \text{if } s(x) = 1 \text{ and } x_i=1 \\
w_iW_p(x)R & \text{if } s(x) = 1 \text { and } x_i = 0 \\
-c(w_i) & \text{if } s(x) = 0 \text{ and } x_i=1 \\
0 & \text{if } s(x) = 0 \text{ and } x_i=0
\end{cases}
$$
This formulation is supposed to aid in choosing a base reward value such that there exists a Nash equilibrium $x$ where $s(x)=1$. It should be chosen so that being a validator is still profitable with a relatively small stake, e.g. $wR - c(w)> 0$ for $w=1/1000$. In fact, the maximum economically viable number of validators can be calculated as $N_\text{viable}=1/w$ where $w$ is obtained by solving $wR = c(w)$.
## Nash equilibria
Ideally, one Nash equilibrium should be $x_1=x_2=\dots=x_n=1$. However, if the base reward is too low or the costs are too high, that equilibrium could be located at a lower value of participating weight.
**Corollary:** If no validator $i$ exists such that $w_i\geq W_\min$, then $x_1=x_2=\dots=x_n=0$ will be a Nash equilibrium.
**Theorem:** Let $\bar{V}\subset V$ be a totally ordered subset of validators such that
- $w_i \leq w_j$ for all consecutive pairs $(i, j)$ from $\bar{V}$,
- AND $w_i\geq w_j$ for all $i\in\bar{V}$, for all $j\in V\setminus\bar{V}$,
- AND $\bar{W}_\text{total}-w_{\min(\bar{V})} \leq W_\min \leq \bar{W}_\text{total}$.
where $\bar{W}_\text{total} = \sum_{i\in\bar{V}}w_i$. Then let $\bar{x}$ be an action profile such that
$$
\bar{x}_i = \begin{cases}
1 & \text{if } i\in \bar{V} \\
0 & \text{otherwise}
\end{cases}
$$
If $\pi_i(\bar{x})\geq 0$ for all $i\in V$, then there exists a Nash equilibrium $x$ such that $s(x)=1$.
*Proof:* TBD $\square$
### Making sure `DON'T SEND` is not a dominant strategy
When $s(x)=0$, `DON'T SEND` yields a higher payoff for all $x$. So it is not possible to have a game where `SEND` is the dominant strategy for all cases. However, we can make sure `SEND` yields a higher payoff than `DON'T SEND` in optimal conditions so that `DON'T SEND` is not a dominant strategy.
For a validator to choose `SEND`, there must exist an action profile where it gives a higher payoff than `DON'T SEND`. We compare two limiting cases where a validator with weight $w$ evaluates his options:
#### Case A
Validator's action causes the round to switch from successful to unsuccesful. We compare the payoffs when $W_p(x)=W_\min$ (i.e. $s(x)=1$) and $W_p(x')=W_\min-w$ (i.e. $s(x')=0$).
Here, the only difference between $x$ and $x'$ is the given validator's action.
**Lemma:** For a given validator with weight $w$, if $wW_\min R-c(w)\geq 0$ is satisfied, then `SEND` is the rational choice (has a higher payoff) for all $x$ where $W_p(x)\geq W_\min$.
*Proof:* Using the definition of payoffs above and substituting values, we arrive at the inequality above.
#### Case B
The round is successful regardless of the validator's choice. We compare the payoffs when $W_p(x)=W_\min+w$ (i.e. $s(x)=1$) and $W_p(x')=W_\min$ (i.e. $s(x')=1$).
**Lemma:** For a given validator with weight $w$, if $w^2R-c(w)\geq 0$ is not satisfied, then the validator's dominant strategy is `DON'T SEND`.
*Proof:* Let $s(x)=1$ and $W_p(x)=\bar{W}\geq W_\min$. For the validator to choose `SEND`, corresponding payoff needs to be higher:
$$
w\bar{W}R-c(w) \geq w(\bar{W}-w)R
$$
Simplifying the expression, we arrive at the inequality above. If it is not satisfied, `DON'T SEND` yields a higher payoff in both cases and becomes the dominant strategy. $\square$
In the given setting, weight limit will always be smaller than $W_\min$, because $W_\min\geq 0.5$ and setting the weight limit above that would allow for only one validator in the game. For that reason, case B always yields the more critical condition, that is, if the inequality in case B is satisfied, then the inequality in case A is also always satisfied.
### Linear cost function
Let us assume that the cost function is linear: $c(w) = c_vw+c_f$, where $c_v$ is the variable cost and $c_f$ is the fixed cost of sending a message. Substituting it in the inequality from case B, we get
$$
Rw^2-c_vw - c_f \geq 0.
$$
We solve the inequality for $w$ to obtain
$$
w \geq \frac{\sqrt{c_v^2+4Rc_f}+c_v}{2R}
=: w_\text{honest}
$$
This defines the threshold at which validators switch from freeloading to being honest. The subset of validators with weight smaller than $w_\text{honest}$ are expected to freeload regardless of others' strategies, *in the given game context*.
If the maximum validator weight at which freeloading can be tolerated is $w_\text{honest}$, then the base reward should be set to a minimum of
$$
R_\min := \frac{c_vw_\text{honest}+c_f}{w_\text{honest}^2}.
$$
*Example:* The game can be generalized to multiple rounds that occur over the course of a year. Let us assume that the information of previous rounds is not retained---we can then treat the whole year as a single game. Let there be a only a fixed cost of participating, equal to 500 USD per month, amounting to 6000 USD per year. Let us also assume that we cannot tolerate freeloading of validators with weight greater than 1%. Then, the minimum base reward that should be emitted every year should be equal to $6000/0.01^2=60\text{m}$ USD.
Similarly, we can calculate the minimum weight that should be set in-protocol for a given base reward. For a yearly issuance rate of 2%, total token supply of 100m and token valuation at 0.1 USD.
### Setting a minimum weight to ensure honesty of all validators
In designing participation rules, it might be wise to set a minimum validator weight $w_\min = w_\text{honest}$, so that all validators are incentivized not to freeload. In that case, the maximum possible number of validators would be equal to $N_\max = 1/w_\min$. If costs were incurred in fiat and rewards were paid out in tokens, then $N_\max$ would depend on the token price $P$, namely $N_\max \propto P$. As the price increased, the network would be able to support more validators of smaller weight.
Now that we introduced the additional dimension of token price, we can move on to add more complexity into our game. A single round game is too simplistic to model participation in a Proof of Stake protocol, because the protocol governs a continuous process of proposing and finalizing new blocks, where each block corresponds to one game round. Thus, it makes more sense to model freeloading as a repeated game, where the outcome of previous rounds feed back into the token price. To explain the approach in one sentence: if no blocks get finalized, then the token price drops to zero, eliminating the prospect of *any* future reward. Therefore, freeloading might be even less rational after all.
## Repeated games
We assume that multiple rounds of the game are played consecutively, and that the aggregate success of rounds feed back into the token price. Let us define a shorthand $s_t:=s(x^t)$ where $x^t$ is the action profile of round $t$. Then let $S_t = (s_0,s_1,\dots,s_{t-1})$ be the tuple of results of rounds preceding round $t$. We assume that price at round $t$ is given by a function $P_t(S_t)$.
We assume that rewards are paid out in tokens and costs are incurred in fiat. Then the payoff of validator $i$ for a single round $t$ with an action profile $x^t$ is given by
$$
\pi_i(x^t) =
\begin{cases}
w_iW_p(x^t)P_t(S_t) R-c(w_i) & \text{if } s_t = 1 \text{ and } x^t_i=1 \\
w_iW_p(x^t)P_t(S_t) R & \text{if } s_t = 1 \text { and } x^t_i = 0 \\
-c(w_i) & \text{if } s_t = 0 \text{ and } x^t_i=1 \\
0 & \text{if } s_t = 0 \text{ and } x^t_i=0
\end{cases}
$$
We assume that the price function admits the following property: For all $S_t\in\{0,1\}^{t-1}$ and for all $t'\in[0,t-1]$,
$$P(S_t)\geq P(\tilde{S}_t)$$
where ($s_{t'}=1$) AND ($\tilde{s}_{t'}=0$) AND ($s_u=\tilde{s}_u$ for all $u\in[0,t-1]\setminus\{t'\}$). In other words, comparing the price yielded by two sequences of rounds where the only difference is the result of a round at a certain index, the sequence where that round is successful yields a greater or equal price than the one where it is not. Example: $P_5(1,0,\underline{1},1,0)\geq P_5(1,0,\underline{0},1,0)$.
**Corollary:** $P_t$ is maximized when $s_0=\dots=s_{t-1}=1$, namely all rounds up until that point were successful.
An example for such a price function could be
$$
P_t(s) = \frac{\sum_{u=0}^{t-1}s_u}{t}
$$
TBD