# Exit queues
In the context of Swarm, an *exit queue* system is one in which stake can be withdrawn by the following procedure:
1. The node owner telegraphs their intention to exit the staking system by sending a *commit* transaction, placing them on the *exit queue*.
2. After predetermined conditions are met, the node owner's stake becomes unlocked and hence withdrawable. At the same time, the node becomes ineligible for participation in revenue sharing.
Exit queues allow the protocol designer to throttle the rate at which nodes exit the system, a means to underwrite system stability. On the other hand, a long or unpredictable exit queue increases the risk to the investor of a stake position by making it illiquid, which could have a negative impact on TVL — the total amount of value locked in stake positions — a potential key driver of BZZ market capitalisation. Managing this tradeoff is therefore the crux of designing an effective exit queue.
## System stability
In Swarm today, the decision problem of whether to turn off your node is a simple comparison of the ongoing operating cost against the income one could generate by continuing. If stake were withdrawable, an opportunity cost enters the equation: turning off your node *and withdrawing stake* allows you to deploy your BZZ in trading. This opportunity cost is a function of crypto market volatility; we generally expect this to change more rapidly than the storage market dynamics governing operating costs and protocol revenue. As a result, we expect that enabling stake withdrawals will engender more frequent node churn events.
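The decision problem described above can be sketched as a single comparison. This is only an illustrative model: the function name, the per-round units, and the idea of summarising the trading opportunity as a single expected return per round are assumptions made for this example, not part of the protocol.

```python
def should_exit(income_per_round: float,
                operating_cost_per_round: float,
                stake: float = 0.0,
                trading_return_per_round: float = 0.0,
                stake_withdrawable: bool = False) -> bool:
    """Return True if switching the node off looks better than running it.

    Hypothetical model: all quantities are in BZZ per round, and the trading
    opportunity is summarised as an expected return per unit of stake.
    """
    net_income = income_per_round - operating_cost_per_round
    # The opportunity cost only enters when stake can actually be redeployed.
    opportunity = stake * trading_return_per_round if stake_withdrawable else 0.0
    return net_income < opportunity
```

With locked stake, a profitable node never exits under this model; once stake is withdrawable, the same node exits whenever the trading return on its stake exceeds its net income, so the exit decision inherits the volatility of `trading_return_per_round`.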
More frequent node exits raise concerns about data durability: if all nodes responsible for a given address block (neighbourhood) exit before *repair* occurs (that is, before new nodes enter and sync the neighbourhood to restore the replication rate), unrecoverable data loss occurs. One goal of the system design of Swarm must be to make this event sufficiently unlikely.[^sufficiently]
[^sufficiently]: How unlikely is sufficiently unlikely? This depends on whether users interact with Swarm directly, or, as on traditional storage media such as spinning disks, via an additional redundancy layer employing [erasure coding](https://docs.ethswarm.org/docs/concepts/DISC/erasure-coding). In the latter case, a certain amount of data loss can be tolerated. In the former, the chance of chunk loss ought to be negligible, even over long time periods.
An exit queue mechanism allows the system designer to mitigate this issue in the following ways:
* *Reduce reaction time.* In some cases, the change in income distribution to a neighbourhood triggered by a node exit will naturally induce a new node to enter. For example, if a neighbourhood consists of four equally staked nodes, a prospective joiner (again equally staked) could expect one fifth of the neighbourhood's revenue. After one node exits, the same joiner could expect one quarter, an increase in potential revenue of 25%.
In order for this effect to take hold, the operator of the new node needs to learn about the change in competitive environment and react. During this *reaction time,* the neighbourhood could become under-replicated. An exit queue allows information about the exiting node to disseminate before this happens. The reaction time is effectively reduced by the queue wait time — ideally to zero.
To optimise this effect, the system designer needs to make some assumptions about reaction times among the operator population, which could range from node sync time at the minimum to weeks or even months for a manually operated node fleet.
* *Filter vol spikes.* In the event of very short-lived market conditions that could trigger a node exit event, an exit queue wait time that exceeds the lifetime of the trading opportunity effectively cuts stake capital out of that opportunity, removing the incentive to exit. The queue therefore acts as a kind of low-pass filter on market conditions.
* *Pre-emptive price adjustment.* In the case of node exits triggered by longer-lived market conditions, the primary mechanism we have to bring in more nodes is the price oracle marking up the quote. By immediately discounting nodes in the exit queue from the node count used for price oracle updates, the quote in such conditions can begin reacting to the exit before nodes actually stop functioning. Moreover, new nodes can enter the system *before* the exiting nodes go dark without triggering price decreases due to over-replication.
* *Hard limit churn rate.* If desired, the overall churn rate can be bounded by limiting the number of nodes that can exit via the queue per round. For example, the system designer can set a minimum time for the node count to halve and limit exits based on an estimate of current node count (using rolling averages of reported depth and replication rate).
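As a rough sketch of the hard churn limit in the last bullet: if at most a fraction $f$ of nodes may exit per round, the node count after $H$ rounds is at least $N(1-f)^H$, so requiring a minimum halving time of $H$ rounds gives $f \le 1 - 2^{-1/H}$. The function names below are hypothetical, and the node-count estimate assumes $2^{\text{depth}}$ neighbourhoods each holding the reported replication rate, which is a simplification made for this example.

```python
import math
from typing import Sequence

def estimate_node_count(depths: Sequence[int],
                        replication_rates: Sequence[float]) -> float:
    """Rolling-average estimate of network size from recent rounds.

    Assumption: each round's sample is 2**depth neighbourhoods times the
    reported replication rate.
    """
    samples = [(2 ** d) * rr for d, rr in zip(depths, replication_rates)]
    return sum(samples) / len(samples)

def max_exits_per_round(estimated_node_count: float,
                        min_halving_rounds: int) -> int:
    """Largest per-round exit quota that keeps the halving time >= H rounds."""
    if min_halving_rounds < 1:
        raise ValueError("min_halving_rounds must be at least 1")
    max_fraction = 1.0 - 2.0 ** (-1.0 / min_halving_rounds)
    # Rounding down keeps the bound conservative.
    return math.floor(estimated_node_count * max_fraction)
```

Because the quota is recomputed from the current estimate each round, the bound holds even if every round's quota is fully used.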
## Queue disciplines
The set of conditions that determine when a stake position on the exit queue becomes unlocked is called the *queue discipline.* Here are a few examples:
* *Clock-based.* Stake becomes unlocked after a fixed number $N_\mathrm{exit}$ of rounds.
* *Throughput-based.* At most $k$ nodes can exit per round. If more than $k$ nodes are waiting to exit in a given round, excess nodes are placed on a backlog and processed on a FIFO basis.
* *Anchor-based.* A node may exit after it has successfully participated in one round after joining the queue.
* *Instant.* Stake becomes unlocked immediately. (In this case, commit and withdraw can be bundled into the same transaction.)
* *One out one in.* In a system where there is also an onramp: stake becomes unlocked only when there is a node on the onramp ready to join the same neighbourhood.
* *Throughput-based with onramp netting.* As in throughput-based exits, except that there is also an onramp that can accelerate exits if occupied: allow $k+\ell$ exits per round, where $\ell$ is the number of nodes waiting on the onramp.
* *Indefinite.* Stake never becomes unlocked. This is incentive-equivalent to the currently deployed system.
Queue disciplines can be combined conjunctively or disjunctively, according to whether the system designer wishes to make exit conditions stricter or more relaxed, respectively.
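One way to picture these disciplines and their combination is as predicates over a queued position. The sketch below is illustrative only: `Position`, its fields, and all function names are hypothetical, and the throughput-based variants are modelled as a per-round release from a FIFO backlog rather than as predicates.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable

@dataclass
class Position:
    """A stake position waiting on the exit queue (hypothetical structure)."""
    commit_round: int                     # round in which the commit landed
    participations_since_commit: int = 0  # successful rounds since the commit

# A discipline answers: is this position unlocked in the given round?
Discipline = Callable[[Position, int], bool]

def clock_based(n_exit: int) -> Discipline:
    return lambda pos, rnd: rnd - pos.commit_round >= n_exit

def anchor_based(pos: Position, rnd: int) -> bool:
    return pos.participations_since_commit >= 1

def instant(pos: Position, rnd: int) -> bool:
    return True

def indefinite(pos: Position, rnd: int) -> bool:
    return False

def all_of(*disciplines: Discipline) -> Discipline:
    """Conjunctive combination: stricter exit conditions."""
    return lambda pos, rnd: all(d(pos, rnd) for d in disciplines)

def any_of(*disciplines: Discipline) -> Discipline:
    """Disjunctive combination: more relaxed exit conditions."""
    return lambda pos, rnd: any(d(pos, rnd) for d in disciplines)

def release(backlog: deque, k: int, onramp_waiting: int = 0) -> list:
    """Throughput-based exit: release up to k + onramp_waiting entries, FIFO.

    With onramp_waiting = 0 this is the plain throughput-based discipline.
    """
    quota = k + onramp_waiting
    return [backlog.popleft() for _ in range(min(quota, len(backlog)))]
```

For instance, `all_of(clock_based(5), anchor_based)` unlocks a position only once five rounds have passed and the node has participated in a round, whereas `any_of(...)` over the same pair unlocks as soon as either condition holds.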
If we start to take into account the allocation of nodes into neighbourhoods, the design space becomes rapidly more complex.
*Example.* The most obvious way to take into account the populations of individual neighbourhoods is to modify the queue discipline depending on the number of nodes sharing an area of responsibility (AoR) with the exiting party, written RR (replication rate) below. For example, a simple rule might be:
* If RR > 4, unlock immediately.
* Otherwise, wait for 2 weeks × (4 − RR).
* Recalculate the wait each time a new node joins.
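The example rule can be written as a small function. The names and default parameters are hypothetical, and the sketch leaves open whether RR counts the exiting node itself and whether a recalculated wait is measured from the original commit time.

```python
from datetime import timedelta

def exit_wait(replication_rate: int,
              target_rr: int = 4,
              step: timedelta = timedelta(weeks=2)) -> timedelta:
    """Wait time for an exit under the example rule above."""
    if replication_rate > target_rr:
        return timedelta(0)  # over-replicated: unlock immediately
    # At replication_rate == target_rr this also evaluates to zero.
    return step * (target_rr - replication_rate)
```

Recalculating when a new node joins amounts to calling `exit_wait` again with the incremented replication rate, which shortens the wait by one `step` per joiner.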
## TVL and liquidity
Making it harder to withdraw funds, whether because the queue wait is *longer* or *less certain*, increases the risk of depositing stake. A fund that is locked until a (possibly random) maturity date $T$ is valued with a discount factor of $\mathbb{E}[\beta^T]$, where $\beta^\bullet$ is a discount schedule. A risk-averse entity, such as a small node operator that may face sudden liquidity needs, uses a more aggressive discount schedule.[^lending]
[^lending]: If there is a money market for BZZ that accepts stake positions as collateral, discount schedules for all operators with access to this market should converge on the lending rate.
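To make the discount factor $\mathbb{E}[\beta^T]$ concrete, the sketch below evaluates it for a wait time given as a probability mass function over rounds. It assumes an exponential schedule with a constant per-round factor $\beta \in (0,1)$, which is one possible choice of schedule and not something the design prescribes. The closed form covers the special case where $T$ is geometric with per-round unlock probability $p$.

```python
from typing import Mapping

def expected_discount(beta: float, wait_pmf: Mapping[int, float]) -> float:
    """E[beta**T] for a wait time T (in rounds) with the given distribution."""
    if abs(sum(wait_pmf.values()) - 1.0) > 1e-9:
        raise ValueError("wait_pmf must sum to 1")
    return sum(prob * beta ** t for t, prob in wait_pmf.items())

def geometric_discount(beta: float, p: float) -> float:
    """Closed form of E[beta**T] when P(T = t) = p * (1 - p)**(t - 1), t >= 1."""
    return p * beta / (1.0 - (1.0 - p) * beta)
```

A deterministic clock-based queue corresponds to a single-point distribution, e.g. `expected_discount(0.99, {10: 1.0})`, while a more aggressive schedule is modelled by a smaller `beta`.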
A risky, illiquid deposit requires a higher yield to justify. Correspondingly, if we implement an exit queue with a longer perceived wait time, we can expect less total capital to be deposited in stake positions per unit revenue. Since stake TVL is likely to be a major driver of BZZ market cap, the Swarm Protocol designers must make some effort to limit the maturity risks faced by stakers.
The incentive effect of illiquidity is already visible in the state of the network today. Stake deposits are generally not withdrawable until a staking contract upgrade, making them highly illiquid. Accordingly, the network TVL is less than four months of protocol revenue, corresponding to a mean annual yield of over 230%.[^dash] By way of comparison, Filecoin liquid staking is available at a yield of just 12.75%, reflecting a much greater level of demand for a position that can be exited at will.[^depinscan] (Of course, this isn't the only factor limiting the demand for BZZ yield relative to FIL.)
[^dash]: https://beta.dash.swarm.shtuka.io
[^depinscan]: https://depinscan.io/projects/filecoin
### Risk scenarios
For the sake of assessing the possible impact of a particular queue design on the value of stake positions, the system designer should consider typical risk scenarios associated with the factors that influence the wait time $T$.
For example:
* *Anchor-based queue.* If stake unlock depends on a random draw from a known distribution, the quantiles of $T$ can be computed directly and used as tail scenarios.
* *One out one in.* The wait time depends on the economic outlook of the Swarm Network. A key risk scenario is that the operator wants to exit their stake position during a time of network stagnation or shrinkage. If the exit rate exceeds the onramp rate, $T$ may be infinite, making stake positions worthless.
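The two scenarios above can be sketched numerically. For the anchor-based queue, the example assumes the node successfully participates in each round independently with a fixed probability, so that $T$ is geometric. For one out one in, it assumes a FIFO backlog drained at a constant onramp rate and ignores neighbourhood matching. Both models and all names are simplifications introduced for illustration.

```python
import math

def anchor_wait_quantile(p_round: float, q: float) -> int:
    """Smallest t with P(T <= t) >= q, for geometric T with success prob p_round."""
    if not 0.0 < p_round <= 1.0:
        raise ValueError("p_round must be in (0, 1]")
    if not 0.0 <= q < 1.0:
        raise ValueError("q must be in [0, 1)")
    if p_round == 1.0:
        return 1
    # P(T <= t) = 1 - (1 - p)**t, solved for t and rounded up.
    return max(1, math.ceil(math.log1p(-q) / math.log1p(-p_round)))

def one_out_one_in_wait(queue_position: int, onramp_rate: float) -> float:
    """Rounds until the node at the given FIFO position (1-based) is matched.

    onramp_rate is the number of onramp arrivals per round; with an empty
    onramp the wait never ends.
    """
    if onramp_rate <= 0:
        return math.inf
    return math.ceil(queue_position / onramp_rate)
```

A tail scenario for the anchor-based queue is then a high quantile such as `anchor_wait_quantile(p_round, 0.99)`, while the one out one in stagnation scenario corresponds to `onramp_rate` falling to zero.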
## Extension: migration queues
As well as exits, i.e. complete stake withdrawals, we might wish to make some other stake position changes work on a queue basis:
* Partial stake drawdown
* Change of overlay address
* Reduction of AoR
* Increase of AoR, stake topup (less likely)
The motivation to introduce stake drawdown, change of address, or AoR reduction queues is the same as for an exit queue: each action causes a node to relinquish (or in the case of a drawdown, reduce) their responsibility to a certain address block, and for the sake of durability the system designer may wish to throttle them.
## Limitations of exit queues
Exit queues are effective only so long as node operators can be induced to use them, since operators can always just turn off nodes unannounced. The system designer should take care to make the latter choice as unattractive as possible, e.g. via slashing penalties for unscheduled exits. At the same time, the exit queue conditions cannot be so onerous that operators would rather simply stop the service and endure penalties.