# Fault and termination fee structure

## Introduction

The current sector fault and termination fee structure is based on the selection of three parameters: the **fault fee rate**, the **termination fee** and the **maximum fault time** before a sector is terminated. It is not clear whether these parameters are chosen optimally, or what the effect of changing any one of them is.

This issue has become more relevant recently with FIP discussion #183 (https://github.com/filecoin-project/FIPs/issues/183), which followed cryptocurrency restrictions from the Chinese government. The discussion resulted in the maximum fault time being changed from 2 weeks to 6 weeks, without much analysis of how this would affect other related quantities. The change was motivated by the fact that a large subset of miners experienced a shock that would limit their ability to recover their faults rapidly. A secondary goal of this study is therefore to provide a mechanism that could automatically adjust parameters in reaction to large system shocks, removing the guesswork.

## Mathematical formulation of fault and termination fee structure

The current fault and termination fee structure can be formulated as follows. When a sector becomes faulty, it starts incurring a penalty at a constant rate, $-\mathcal{N}$ (we denote it as negative here, as in a negative amount of reward). If a sector is faulty for an amount of time $x$, the total reward will then be $Reward(x)=-\mathcal{N}x$. This reward structure continues up to a maximum fault time, $x_{\rm max}$, at which point the sector is automatically terminated and has to pay an additional termination fee, which we denote as $\mathcal{N}T$, such that $Reward(x)=-\mathcal{N}\left(x_{\rm max}+T\right)$ for $x\ge x_{\rm max}$.
These properties can be expressed as a reward rate function:

$$Rate(x)=\mathcal{N}\left[H(x-x_{\rm max})-1-T\delta(x-x_{\rm max})\right]$$

where $H(x)$ is a Heaviside step function (https://en.wikipedia.org/wiki/Heaviside_step_function), and $\delta(x)$ is a Dirac delta function (https://en.wikipedia.org/wiki/Dirac_delta_function).

We can integrate this to obtain the total Reward function,

$$Reward(x)=\int_0^x Rate(x^\prime)dx^\prime=\mathcal{N}\left[x\left(H(x-x_{\rm max})-1\right)-TH(x-x_{\rm max})\right].$$

Given this reward structure, we are interested in computing the expected reward/fee for a faulty sector. To compute this we need to make some assumptions about how the repair times for sectors are distributed.

## Modeling Miner repair time

We model the amount of time it takes a miner to repair their sector with an exponential distribution. This has the advantage of being a simple distribution for which we can find closed-form expressions, and we expect it to be a reasonably realistic assumption. The assumption is that the amount of time, $x$, it takes to repair a given faulty sector is a random variable drawn from an exponential distribution,

$$x\sim {\rm Exponential}[\lambda],$$

with a probability density function,

$$f_\lambda(x)=\lambda e^{-\lambda x}.$$

The parameter $\lambda$ could be estimated from the most recent available data on sector repair time, so it is a quantity that should be updated by the network. This would, for instance, capture the effect discussed in FIP 183, where a large subset of miners become slower in repairing their sectors, which would pull $\lambda$ to a lower value. At every time step one can compute the average of observed repair times $\bar{x}$.
The maximum likelihood estimator for $\lambda$ is then fixed by the observed data as

$$\lambda_{MLE}=\frac{1}{\bar{x}}.$$

## Computing expected reward (fee)

Once $\lambda$ is fitted from the data, we can estimate the expected reward as

$$C\equiv E[Reward]=\int_0^\infty Reward(x)f_\lambda(x)dx$$

$$=\mathcal{N}\left[-\frac{1}{\lambda}+\int_{x_{\rm max}}^\infty x\lambda e^{-\lambda x}dx-T\int_{x_{\rm max}}^\infty \lambda e^{-\lambda x}dx\right]$$

$$=\mathcal{N}\left\{\frac{1}{\lambda}\left[(\lambda x_{\rm max}+1)e^{-\lambda x_{\rm max}}-1\right]-Te^{-\lambda x_{\rm max}}\right\}.$$

This equation is what could be called a "Governance surface", and can be interpreted as the constraint on the variables $C,\mathcal{N},T, x_{\rm max}, \lambda$, that

$$G(C,\mathcal{N},T, x_{\rm max}, \lambda)=0,$$

where

$$G(C,\mathcal{N},T, x_{\rm max}, \lambda)\equiv C-\mathcal{N}\left\{\frac{1}{\lambda}\left[(\lambda x_{\rm max}+1)e^{-\lambda x_{\rm max}}-1\right]-Te^{-\lambda x_{\rm max}}\right\}.$$

The properties of this Governance surface have been further explored and documented by the BlockScience research group (https://hackmd.io/@bsci-filecoin/Bk3TI7x8t) (*beware that they use slightly different notation*). One particularly interesting insight found in the BlockScience study is the relation, obtained by differentiating the expression for $C$ above,

$$\frac{\partial C}{\partial x_{\rm max}}=-\mathcal{N}\lambda e^{-\lambda x_{\rm max}}\left(x_{\rm max}-T\right).\qquad(*)$$

This implies that for a given value of $T$, there is an optimal maximum fault time $x_{\rm max}=T$ that minimizes the expected fee incurred by a faulty sector.

## Example solutions

The main result of this research is an understanding of the constraint $G(C,\mathcal{N},T,x_{\rm max},\lambda)=0$. This is a tool that can be used to make further decisions, for instance to determine how the expected sector fee will change if the maximum fault time is changed from 2 to 6 weeks.
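
As a minimal sketch of how the constraint can answer that question, the Python snippet below evaluates the closed-form expression for $C$ at a maximum fault time of 2 and of 6 weeks, and scans a grid of $x_{\rm max}$ values to illustrate that $C$ is largest (the expected fee is smallest) at $x_{\rm max}=T$. The parameter values ($\mathcal{N}=1$, $T=28$ days, mean repair time of 7 days) are illustrative assumptions and not network parameters, and the function name `expected_reward` is hypothetical.

```python
import math


def expected_reward(fee_rate, term_factor, x_max, lam):
    """Closed-form expected reward C of a faulty sector (negative means a fee).

    fee_rate    -- N, the fault fee per unit time
    term_factor -- T, the termination fee in units of N (the fee is N * T)
    x_max       -- maximum fault time before automatic termination
    lam         -- rate of the exponential repair-time distribution
    """
    decay = math.exp(-lam * x_max)
    return fee_rate * (((lam * x_max + 1.0) * decay - 1.0) / lam
                       - term_factor * decay)


# Illustrative values only (assumptions), with time measured in days.
N = 1.0          # fault fee rate
T = 28.0         # termination fee in units of N
LAM = 1.0 / 7.0  # mean repair time of 7 days

c_two_weeks = expected_reward(N, T, 14.0, LAM)
c_six_weeks = expected_reward(N, T, 42.0, LAM)
print("C at x_max = 2 weeks:", c_two_weeks)
print("C at x_max = 6 weeks:", c_six_weeks)

# Scan integer x_max values (in days) and pick the one with the largest C,
# i.e. the smallest expected fee.
grid = [float(days) for days in range(1, 85)]
best_x_max = max(grid, key=lambda x: expected_reward(N, T, x, LAM))
print("x_max with the smallest expected fee on the grid:", best_x_max)
```

With these assumed values the expected fee is smaller at 6 weeks than at 2 weeks, and the grid scan selects $x_{\rm max}=T$. Other choices of $T$ and $\lambda$ can reverse the first comparison, which is why the constraint is useful for checking a proposed change before adopting it.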
Given this tool, we can also design several combinations of parameters that maximize as many desired properties as possible. It is then the job of the community to decide what should be maximized. Below we show two example solutions that may be implemented.

### Constant $C,\,x_{\rm max}, T$, self adjusting $\mathcal{N}$

In this approach we assume that $T$ and $x_{\rm max}$ are fixed (as they currently are). Here we can choose to keep $C$ at a constant value that does not depend on $\lambda$, and allow $\mathcal{N}$ to be a function of $\lambda$. This approach has the benefit of self-adjusting the reward rates based on the current status of the network.

For instance, take the situation that led to FIP 183. A large subset of miners become slower in repairing sectors due to new regulations, which significantly decreases the value of $\lambda$. The expected reward $C$ can be kept constant in this case by fixing

$$\mathcal{N}=\frac{C}{\frac{1}{\lambda}\left[(\lambda x_{\rm max}+1)e^{-\lambda x_{\rm max}}-1\right]-Te^{-\lambda x_{\rm max}}},$$

such that $\mathcal{N}$ has to decrease with decreasing $\lambda$ to keep the expected reward constant. This has two benefits:

1) If a large subset of miners become slower, there is a built-in forgiveness in that the rates become lower, so that they will not have to pay as much. The bigger the fraction of miners that experience the shock, the more they are "insured". This could have been enough to satisfy the needs of miners affected by the new Chinese regulations.

2) If a large subset of miners become slower, it rewards miners who were able to remain efficient and recover their faults rapidly, as their fee rate will be reduced. If they keep fixing their faults in the same amount of time, they will end up paying less, because their efficient repairs are in higher demand.
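
The self-adjusting rule of this first solution can be sketched in Python as follows: estimate $\lambda$ from recent repair times with $\lambda_{MLE}=1/\bar{x}$, then solve the constraint for $\mathcal{N}$ at a fixed target $C$. The repair-time samples, the values of $T$ and $x_{\rm max}$, and the helper names `mle_rate`, `surface_shape` and `fee_rate_for_target` are all hypothetical and chosen only for illustration.

```python
import math


def mle_rate(repair_times):
    """Maximum likelihood estimate of lambda: one over the mean repair time."""
    return len(repair_times) / sum(repair_times)


def surface_shape(term_factor, x_max, lam):
    """The factor multiplying N in the expected reward, i.e. C / N."""
    decay = math.exp(-lam * x_max)
    return ((lam * x_max + 1.0) * decay - 1.0) / lam - term_factor * decay


def fee_rate_for_target(target_c, term_factor, x_max, lam):
    """Fault fee rate N that keeps the expected reward equal to target_c."""
    return target_c / surface_shape(term_factor, x_max, lam)


# Illustrative values only (assumptions), with time measured in days.
T = 28.0      # termination fee in units of N
X_MAX = 42.0  # maximum fault time of 6 weeks

baseline_times = [3.0, 5.0, 7.0, 9.0, 11.0]   # hypothetical sample, mean 7 days
shocked_times = [6.0, 10.0, 14.0, 18.0, 22.0]  # hypothetical sample, mean 14 days

lam_base = mle_rate(baseline_times)
lam_shock = mle_rate(shocked_times)

# Assumption: the target C is the expected reward at the baseline with N = 1.
target_c = 1.0 * surface_shape(T, X_MAX, lam_base)

fee_base = fee_rate_for_target(target_c, T, X_MAX, lam_base)
fee_shock = fee_rate_for_target(target_c, T, X_MAX, lam_shock)
print("N before the shock:", fee_base)
print("N after the shock: ", fee_shock)
```

In this example the fee rate drops when repairs slow down, while the expected reward at the new $\lambda$ stays at the target. A miner who still repairs at the old pace is charged the lower rate for the same repair time.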
### Fixed $\mathcal{N}T$, optimal $x_{\rm max}$, constant $C$

We present a second possible solution to the constraint $G=0$, which accomplishes the following goals:

1) We use $\mathcal{N}T=\mathcal{T}$ as a fixed input, since the termination fee should be independent of the fault mechanism (miners should also have the choice to terminate without going into fault, so the fee should be determined from that). We also point out that this termination fee depends on how long the miner has been operating (with the maximum fee at 90 days or more).

2) The termination time $x_{\rm max}$ should be optimized such that it is the most reasonable time. From BlockScience's work (*) we see this is done by choosing

   $$x_{\rm max}=T=\mathcal{T}/\mathcal{N}$$

   The community currently favors $x_{\rm max}=6\,{\rm weeks}$. In this approach $x_{\rm max}$ will be variable, but we can aim for 6 weeks under "normal" conditions.

3) We would like the expected reward, $C$, to be independent of $\lambda$, and the reward rate to scale with $\lambda$ as $\mathcal{N}=a\lambda$, as an insurance mechanism resembling our first solution above.

4) It would be good if these fees $a$ and $C$ could be chosen in some optimal, fair way.

From applying 1) and 2), we now have the surface

$$C=\frac{\mathcal{N}}{\lambda}\left[\left(\lambda\frac{\mathcal{T}}{\mathcal{N}}+1\right)e^{-\frac{\lambda\mathcal{T}}{\mathcal{N}}}-1\right]-\mathcal{T}e^{-\frac{\lambda\mathcal{T}}{\mathcal{N}}}$$

We now see that condition 3) can be accomplished by choosing $\mathcal{N}=a\lambda$, with

$$C=a\left[\left(\frac{\mathcal{T}}{a}+1\right)e^{-\frac{\mathcal{T}}{a}}-1\right]-\mathcal{T}e^{-\frac{\mathcal{T}}{a}}$$

We can use 2) to fix $a$ by enforcing that, for some "normal" repair rate $\bar\lambda$ (it is to be decided what counts as normal repair times, perhaps the all-time average), and for a maximum termination fee $\mathcal{T}_{\rm max}$ (corresponding to miners that have operated more than 90 days), the optimal maximum fault time is 6 weeks:

$$6\,{\rm weeks}=\frac{\mathcal{T}_{\rm max}}{a\bar\lambda}\Rightarrow a=\frac{\mathcal{T}_{\rm max}}{\bar\lambda\cdot 6\,{\rm weeks}}$$

This fixes the optimal fault fee rate. Conversely, we could use this last equation to fix the termination fee. If we are happy with our current rate $\mathcal{N}_{\rm current}$, we can make that the standard for normal times,

$$a_{\rm current}=\frac{\mathcal{N}_{\rm current}}{\bar\lambda}$$

and then we could fix the termination fee by choosing

$$\mathcal{T}_{\rm max}=6\,{\rm weeks}\cdot\bar\lambda\, a_{\rm current}$$
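
The second solution can be sketched in Python under the same caveats. The snippet calibrates $a$ from an assumed "normal" repair rate $\bar\lambda$ and an assumed $\mathcal{T}_{\rm max}$, then evaluates $\mathcal{N}=a\lambda$, $x_{\rm max}=\mathcal{T}/\mathcal{N}$ and $C$ under normal conditions and after a shock that halves $\lambda$. All numerical values and the helper names `calibrate_a` and `second_solution` are hypothetical.

```python
import math

SIX_WEEKS = 42.0  # target maximum fault time under normal conditions, in days


def calibrate_a(term_fee_max, lam_normal, x_max_normal=SIX_WEEKS):
    """Proportionality constant a such that x_max equals x_max_normal at lam_normal."""
    return term_fee_max / (lam_normal * x_max_normal)


def second_solution(term_fee, lam, a):
    """Return (N, x_max, C) for a fixed termination fee and N = a * lambda."""
    fee_rate = a * lam           # fault fee rate N
    x_max = term_fee / fee_rate  # optimal maximum fault time, x_max = T
    ratio = term_fee / a         # equals lambda * x_max, independent of lambda
    decay = math.exp(-ratio)
    expected = a * ((ratio + 1.0) * decay - 1.0) - term_fee * decay
    return fee_rate, x_max, expected


# Illustrative values only (assumptions), with time measured in days.
TERM_FEE_MAX = 42.0     # maximum termination fee, in fee units
LAM_NORMAL = 1.0 / 7.0  # "normal" mean repair time of 7 days

a = calibrate_a(TERM_FEE_MAX, LAM_NORMAL)
fee_normal, x_max_normal, c_normal = second_solution(TERM_FEE_MAX, LAM_NORMAL, a)
fee_shock, x_max_shock, c_shock = second_solution(TERM_FEE_MAX, LAM_NORMAL / 2.0, a)

print("normal: N =", fee_normal, "x_max =", x_max_normal, "C =", c_normal)
print("shock:  N =", fee_shock, "x_max =", x_max_shock, "C =", c_shock)
```

In this sketch the shock halves the fault fee rate and doubles the maximum fault time, while the expected reward is unchanged, since $C$ depends only on $a$ and $\mathcal{T}$.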