# Lido Withdrawals Design & Bunker Mode
## ETH2 Withdrawals
### Validator lifecycle
![](https://i.imgur.com/GIK89EC.png)
#### UNKNOWN State
The validator client will report the state of a particular validator as UNKNOWN when it loads validator keys that have not yet submitted a valid deposit to the validator deposit contract on the Ethereum proof-of-work chain.
#### DEPOSITED State
Once a valid transaction has been submitted to the validator deposit contract, the beacon node will detect the presence of the transaction on the ETH1 chain, and the validator client will then report being in the DEPOSITED state.
#### PENDING State
The current specification for processing deposits requires an `ETH1_FOLLOW_DISTANCE` (minimum of 2048 ETH1 blocks, ~8 hours, to ensure stability of the chain) plus `EPOCHS_PER_ETH1_VOTING_PERIOD` (64 epochs, ~6.8 hours, to organize a sizable committee for voting) before deposits can be processed into the Ethereum beacon chain. The validator is then assigned an index number and placed into a queue for activation. The beacon chain can process the deposits of 4 ~ 16 new validators per finalized epoch; the exact number is determined by the total number of active validators on the chain. Once a validator has reached the front of the queue, it is assigned an activation epoch after an additional 4~5 epochs (~31 minutes).
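The 4 ~ 16 per-epoch range comes from the beacon chain churn limit, which grows with the size of the active validator set. A minimal sketch of the spec's churn-limit formula (phase0 constants):

```python
# Beacon chain churn limit (phase0 constants): the number of validators
# that can be activated or exited per epoch grows with the validator set.
MIN_PER_EPOCH_CHURN_LIMIT = 4
CHURN_LIMIT_QUOTIENT = 2**16  # 65536

def get_validator_churn_limit(active_validator_count: int) -> int:
    return max(MIN_PER_EPOCH_CHURN_LIMIT,
               active_validator_count // CHURN_LIMIT_QUOTIENT)
```

With ~500k active validators the limit is 7 per epoch; it reaches 16 once the active set passes ~1.05M validators.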
#### ACTIVE State
Once the activation epoch arrives, the validator is activated and assigned responsibilities including proposing or attesting to blocks on the beacon chain. Validators receive either rewards or penalties to the initial deposit based upon their overall performance. If a validator's balance drops below 16 ETH (typically due to inactivity), it will be ejected. Ejections are treated the same as voluntary exits.
#### Withdrawals
Validators that have been active and have a validator index (including validators that are slashed or exited) can initiate a `BLStoExecutionChange` request that changes their `withdrawal_credentials`, which begins the withdrawal process. Once the `withdrawal_credentials` are changed, withdrawals are automatically processed at a rate of 16 per block. Fully exited validators are also fully withdrawn once withdrawals are initiated. Learn more in our withdrawal guide.
#### EXITING State
An ACTIVE validator may request to exit by submitting a signed `VoluntaryExit` operation to the Ethereum network. Assuming the validator has been in the active state for the `SHARD_COMMITTEE_PERIOD` (256 epochs, ~27 hours) plus the 4~5-epoch look-ahead (~31 minutes), the validator will be assigned an `exit_epoch` that is determined by the length of the exit queue. The beacon chain can process the exits of 4 ~ 16 validators per finalized epoch; the exact number is determined by the total number of active validators on the chain.
#### SLASHING State
If a slashable event is included in a block while a validator is either ACTIVE, EXITING, or EXITED, it will briefly enter the SLASHING state where slashing penalties are applied, before being forcefully transitioned into the EXITED state. Slashed validators incur three distinct penalties:
##### Minimum Penalty
A penalty of (1/32 * Effective Balance), issued immediately
##### Missed Attestation Penalties
A penalty equivalent to that incurred by an inactive validator, issued every epoch until the validator leaves the exit queue
##### Attack Multiplier Penalty
A penalty proportional to three times the total stake slashed in the past 8192 epochs (~36 days), applied 4096 epochs (~18 days) after the slashing event was first included in a block. Under normal circumstances this penalty is quite small; however, in the event that a large number of slashings occur in a short time frame, this penalty can be as high as 32 ETH.
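The midterm penalty follows the shape of `process_slashings` in the consensus spec. A simplified sketch (Bellatrix proportional multiplier of 3, integer Gwei arithmetic as in the spec):

```python
# Simplified shape of `process_slashings` from the consensus spec
# (Bellatrix proportional multiplier of 3; all amounts in Gwei).
EFFECTIVE_BALANCE_INCREMENT = 10**9  # 1 ETH
PROPORTIONAL_SLASHING_MULTIPLIER = 3

def midterm_penalty_gwei(effective_balance: int,
                         total_slashed: int,
                         total_balance: int) -> int:
    # Total slashed stake in the trailing ~36-day window, scaled by the
    # multiplier and capped at the total active balance.
    adjusted = min(total_slashed * PROPORTIONAL_SLASHING_MULTIPLIER, total_balance)
    # Integer arithmetic mirrors the spec: the penalty is the validator's
    # share of the adjusted slashed amount, rounded down to whole increments.
    penalty_numerator = effective_balance // EFFECTIVE_BALANCE_INCREMENT * adjusted
    return penalty_numerator // total_balance * EFFECTIVE_BALANCE_INCREMENT
```

If at least a third of all stake was slashed in the window, the penalty reaches the full 32 ETH; an isolated slashing rounds down to zero.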
#### EXITED State
In the case that the validator has reached the exited state voluntarily, the funds will become withdrawable after 256 epochs (~27 hours). If the validator was slashed, this delay is extended to 8192 epochs (~36 days). If a slashable event is included in a block before funds have been withdrawn, the validator will move back to the SLASHING state, causing withdrawal delays to reset.
---
### Slashing
"Slashing" is the burning of some amount of validator funds and immediate ejection from the active validator set. In Phase 0, there are two ways in which funds can be slashed: proposer slashing and attester slashing. Although being slashed has serious repercussions, it is simple enough to avoid being slashed altogether by remaining consistent with respect to the messages a validator has previously signed.
#### Proposer slashing
To avoid "proposer slashings", a validator must not sign two conflicting `BeaconBlock` objects, where conflicting is defined as two distinct blocks for the same slot.
#### Attester slashing
To avoid "attester slashings", a validator must not sign two conflicting `AttestationData` objects, i.e. two attestations that satisfy `is_slashable_attestation_data`.
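The double-vote and surround-vote conditions are captured by the spec's `is_slashable_attestation_data`. A self-contained sketch with simplified `AttestationData` (only the `source` and `target` fields that the check actually uses):

```python
from dataclasses import dataclass

@dataclass
class Checkpoint:
    epoch: int
    root: bytes

@dataclass
class AttestationData:
    # The real spec object also carries slot, index and beacon_block_root;
    # only source and target matter for the slashing check.
    source: Checkpoint
    target: Checkpoint

def is_slashable_attestation_data(data_1: AttestationData,
                                  data_2: AttestationData) -> bool:
    return (
        # double vote: two different attestations for the same target epoch
        (data_1 != data_2 and data_1.target.epoch == data_2.target.epoch)
        # surround vote: data_1 surrounds data_2
        or (data_1.source.epoch < data_2.source.epoch
            and data_2.target.epoch < data_1.target.epoch)
    )
```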
#### Slashing Penalties
Slashing penalties consist of three types. The minimum penalty is issued once slashing is detected and has a fixed size. The midterm attack multiplier penalty is applied on the ~18th day and is proportional to the number of other slashings in the past ~36 days (counting backward from the midterm point). This penalty can be as high as 32 ETH. In addition, the validator is penalized for inactivity during all days before finally being ejected (~36 days at minimum, or more if too many exits are registered in the Ethereum exit queue).
```mermaid
gantt
title Ethereum slashing penalties timeline
axisFormat %d
section begin
Slashing started (epoch0) :done, m0, 2022-10-01, 6h
section slashing
Slashing duration :crit, a1, 2022-10-01, 36d
Minimum penalty (epoch0) :milestone, m1, 2022-10-01, 0d
Midterm Attack Multiplier Penalty (epoch0 + 2^12) :milestone, m2, 2022-10-19, 0d
Missed attestation penalties (epoch0, ..., epoch0 + 2^13) :a2, 2022-10-01, 36d
section end
Slashing completed (epoch0 + 2^13) :done, m3, 2022-11-06, 6h
```
**The timeline**:
1. An offense is made by a validator (t+0)
1. An offense is reported (t2=t+???)
1. A small fixed initial slashing happens (t2)
1. About 18 days (maybe more but unlikely) of inactivity for the validator, during which midpoint slashing size (see `process_slashings` in the spec) is determined
1. A midpoint slashing happens, which can range from 0 to 100% of the validator balance, depending on the share of validators slashed (at this point slashing effects can be considered known and resolved)
1. About 18 days after that (maybe more but unlikely), the remaining funds are withdrawable
#### Inactivity leak
If the chain fails to finalize for *tsF*>4 epochs (*tsF*= “time since finality”), then a penalty is added so that the maximum possible reward is zero (validators performing imperfectly get penalized), and a second penalty component is added, proportional to *tsF* (that is, the longer the chain has not finalized, the higher the per-epoch penalty for being offline). This ensures that if more than 1/3 of validators drop off, validators that are not online get penalized much more heavily, and the total penalty goes up quadratically over time.
This has three consequences:
* Penalizes being offline much more heavily in the case where your being offline is actually preventing blocks from being finalized
* Serves the goals of being an anti-correlation penalty (see section below)
* Ensures that if more than 1/3 do go offline, eventually the portion online goes back up to 2/3 because of the declining deposits of the offline validators
With the current parametrization, if blocks stop being finalized, validators lose 1% of their deposits after 2.6 days, 10% after 8.4 days, and 50% after 21 days. This means for example that if 50% of validators drop offline, blocks will start finalizing again after 21 days.
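These figures can be reproduced with a continuous approximation of the phase0 leak: the per-epoch penalty grows linearly with time since finality, so the cumulative loss is roughly quadratic. A sketch, assuming the phase0 `INACTIVITY_PENALTY_QUOTIENT` of 2^24 (later forks use different quotients):

```python
import math

# Continuous approximation of the phase0 inactivity leak. Assumption:
# per-epoch penalty ~ effective_balance * t / INACTIVITY_PENALTY_QUOTIENT,
# where t is epochs since finality, so the balance decays as exp(-t^2 / 2Q).
EPOCHS_PER_DAY = 225                  # 1 epoch ~ 6.4 minutes
INACTIVITY_PENALTY_QUOTIENT = 2**24   # phase0 value

def offline_loss_fraction(days_without_finality: float) -> float:
    t = days_without_finality * EPOCHS_PER_DAY
    return 1.0 - math.exp(-t * t / (2 * INACTIVITY_PENALTY_QUOTIENT))
```

This reproduces the numbers above: ~1% lost after 2.6 days, ~10% after 8.4 days, and ~50% after 21 days without finality.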
---
### GASPER
Gasper is a combination of Casper the Friendly Finality Gadget (Casper-FFG) and the LMD-GHOST fork choice algorithm. Together these components form the consensus mechanism securing proof-of-stake Ethereum. Casper is the mechanism that upgrades certain blocks to "finalized" so that new entrants into the network can be confident that they are syncing the canonical chain. The fork choice algorithm uses accumulated votes to ensure that nodes can easily select the correct chain when forks arise in the blockchain.
Gasper sits on top of a proof-of-stake blockchain where nodes provide ether as a security deposit that can be destroyed if they are lazy or dishonest in proposing or validating blocks. Gasper is the mechanism defining how validators get rewarded and punished, decide which blocks to accept and reject, and which fork of the blockchain to build on.
#### FINALITY
Finality is a property of certain blocks that means they cannot be reverted unless there has been a critical consensus failure and an attacker has destroyed at least 1/3 of the total staked ether. Finalized blocks can be thought of as information the blockchain is certain about. A block must pass through a two-step upgrade procedure to be finalized:
1. Two-thirds of the total staked ether must have voted in favor of that block's inclusion in the canonical chain. This condition upgrades the block to "justified". Justified blocks are unlikely to be reverted, but they can be under certain conditions.
1. When another block is justified on top of a justified block, it is upgraded to "finalized". Finalizing a block is a commitment to include the block in the canonical chain. It cannot be reverted unless an attacker destroys millions of ether (billions of $USD).
These block upgrades do not happen in every slot. Instead, only epoch-boundary blocks can be justified and finalized. These blocks are known as "checkpoints". Upgrading considers pairs of checkpoints. A "supermajority link" must exist between two successive checkpoints (i.e. two-thirds of the total staked ether voting that checkpoint B is the correct descendant of checkpoint A) to upgrade the less recent checkpoint to finalized and the more recent block to justified.
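The supermajority-link condition can be sketched as a one-line check (simplified to the direct source→target case; weights in Gwei):

```python
# Simplified supermajority-link check: a link (source=A, target=B) with
# at least 2/3 of total stake behind it justifies B and, for a direct
# A -> B link, finalizes A.
def is_supermajority_link(link_weight_gwei: int, total_stake_gwei: int) -> bool:
    # integer comparison avoids floating-point 2/3
    return 3 * link_weight_gwei >= 2 * total_stake_gwei
```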
Because finality requires a two-thirds agreement that a block is canonical, an attacker cannot possibly create an alternative finalized chain without:
1. Owning or manipulating two-thirds of the total staked ether.
1. Destroying at least one-third of the total staked ether.
The first condition arises because two-thirds of the staked ether is required to finalize a chain. The second condition arises because if two-thirds of the total stake has voted in favor of both forks, then one-third must have voted on both. Double-voting is a slashing condition that would be maximally punished, and one-third of the total stake would be destroyed.
#### Fork choice
The original definition of Casper-FFG included a fork choice algorithm that imposed the rule: follow the chain containing the justified checkpoint that has the greatest height where height is defined as the greatest distance from the genesis block. In Gasper, the original fork choice rule is deprecated in favor of a more sophisticated algorithm called LMD-GHOST. It is important to realize that under normal conditions, a fork choice rule is unnecessary - there is a single block proposer for every slot, and honest validators attest to it. It is only in cases of large network asynchronicity or when a dishonest block proposer has equivocated that a fork choice algorithm is required. However, when those cases do arise, the fork choice algorithm is a critical defense that secures the correct chain.
LMD-GHOST stands for "latest message-driven greedy heaviest observed sub-tree". This is a jargon-heavy way to define an algorithm that selects the fork with the greatest accumulated weight of attestations as the canonical one (greedy heaviest subtree) and that if multiple messages are received from a validator, only the latest one is considered (latest-message driven). Before adding the heaviest block to its canonical chain, every validator assesses each block using this rule.
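A toy illustration of the rule, assuming hypothetical `blocks` (child → parent map) and `latest_votes` (validator → latest attested block and weight) structures; the real spec operates on a `Store` with effective balances and breaks ties by block root:

```python
# Toy LMD-GHOST head selection over hypothetical inputs:
#   blocks:       dict mapping each block to its parent (genesis absent as key)
#   latest_votes: dict mapping validator -> (block, weight), already reduced
#                 to each validator's latest message
def lmd_ghost_head(blocks: dict, latest_votes: dict, genesis: str) -> str:
    def is_ancestor(ancestor, block):
        while block is not None:
            if block == ancestor:
                return True
            block = blocks.get(block)
        return False

    def subtree_weight(block):
        # accumulated weight of latest attestations in the subtree of `block`
        return sum(w for b, w in latest_votes.values() if is_ancestor(block, b))

    head = genesis
    while True:
        children = [b for b, parent in blocks.items() if parent == head]
        if not children:
            return head
        # greedily descend into the heaviest observed subtree
        # (ties broken lexicographically here for determinism)
        head = max(children, key=lambda c: (subtree_weight(c), c))
```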
#### Slashing as punishment for breaking consensus
The behaviours that lead to slashing are as follows.
1. Related to Casper FFG consensus,
* making two differing attestations for the same target checkpoint, or
* making an attestation whose source and target votes "surround" those in another attestation from the same validator.
2. Related to LMD GHOST consensus,
* proposing more than one distinct block at the same height, or
* attesting to different head blocks, with the same source and target checkpoints.
All of these slashable behaviours relate to "equivocation", which is when a validator contradicts something it previously advertised to the network.
The slashing conditions related to Casper FFG underpin Ethereum 2.0's economic finality guarantee. They effectively impose a well-determined price on reverting finality.
The slashing conditions related to LMD GHOST are less robustly supported by consensus theory, and are not directly related to economic finality. Nonetheless, they punish bad behaviour that could lead to serious issues such as the balancing attack. Since we already had the slashing mechanism available for use with Casper FFG, it was simple enough to extend it to LMD GHOST.
## LIDO withdrawals
### Withdrawals in liquid staking
Withdrawals in liquid staking are a fairly complex mechanism, because they need to take into account a number of simultaneous, rarely communicating processes in a way that results in a fair distribution of rewards and penalties for all users.
Here are the events and processes that need to be accounted for in Lido's withdrawal process.
1. Withdrawal request from a staker
1. Withdrawal signal from a validator
1. Oracle reports on network state
1. Ongoing rewards and penalties
1. Ongoing slashings
1. Unbonding period
1. New penalties and slashings during unbonding
1. Network turbulence (e.g. lack of finality)
Most of these points are fairly complicated by themselves in eth2. It would be fair to say that handling slashings during withdrawals is the main problem for Lido.
### Lido’s design choices
* slashings and rewards are socialized between all stakers
* node operators don’t have collateral
* slashing risks are covered
* stETH is minted immediately on deposit and starts to receive socialized rewards/is under slashing risks, even if the validators deposited with that eth are still in the entry queue
### Main flow
As withdrawals on Ethereum are processed asynchronously, Lido has to have a request-claim process for stETH holders. To ensure the requests are processed in the order they are received, a queue is introduced. Here is an overview of the withdrawals handling process that the development team is proposing:
1. **Request:** To withdraw stETH to Ether, the user sends the withdrawal request to the `WithdrawalQueue` contract, locking the stETH amount to be withdrawn.
1. **Fulfillment:** The protocol handles the requests one-by-one, in the order of creation. Once the protocol has enough information to calculate the stETH share redemption rate of the next request and obtains enough Ether to handle it, the request can be finalized: the required amount of Ether is reserved and the locked stETH is burned.
1. **Claim:** The user can then claim their Ether at any time in the future. The stETH share redemption rate for each request is determined at the time of its finalization and is the inverse of the Ether/stETH staking rate.
It’s important to note that the redemption rate at the finalization step may be lower than the rate at the time of the withdrawal request due to slashing or penalties that have been incurred by the protocol. This means that a user may receive less Ether for their stETH than they expected when they originally submitted the request.
A user can put any number of withdrawal requests in the queue. While there is an upper limit on the size of a particular request, there is no effective limit, as a user can submit multiple withdrawal requests. There is also a minimal request size of 100 wei, required due to rounding error issues.
A withdrawal request could be finalized only when the protocol has enough Ether to fulfill it completely. Partial fulfillments are not possible, however, a user can accomplish similar behavior by splitting a bigger request into a few smaller ones.
For UX reasons, the withdrawal request should be transferrable. We intend to keep the basic implementation as simple as possible. That would also allow a secondary market for “Lido withdrawal queue positions” to form.
It is important to note two additional restrictions related to withdrawal requests. Both restrictions serve to mitigate possible attack vectors allowing would-be attackers to effectively lower the protocol’s APR and carry fewer penalties/slashing risk than stETH holders staying in the protocol.
1. **Withdrawal requests cannot be canceled.** To fulfill a withdrawal request, the Lido protocol potentially has to eject validators. A malicious user could send a withdrawal request to the queue, wait until the protocol sends ejection requests to the corresponding Node Operators, and cancel the request after that. By repeating this process, the attacker could effectively lower the protocol APR by forcing Lido validators to spend time in the activation queue without accruing rewards. If the withdrawal request can’t be canceled, there’s no vulnerability. As noted above, making the position in the withdrawal queue transferrable would provide a “fast exit path” for regular stakers.
1. **The redemption rate at which a request is fulfilled cannot be better than the redemption rate on the request creation.** Otherwise, there’s an incentive to always keep the stETH in the queue, depositing Ether back once it’s redeemable, as this allows to carry lower staking risks without losing rewards. This would also allow a malicious actor to effectively lower the protocol APR. To avoid this, we propose that penalties leading to a negative rebase are accounted for and socialized evenly between stETH holders and withdrawers. Positive rebases could still affect requests in the queue, but only to the point where rebases compensate for previously accrued penalties and don’t push the redemption rate higher than it was at the moment of the withdrawal request’s creation.
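The second restriction can be sketched as a one-line rule on the stETH redemption rate (hypothetical helper; rates expressed as ether per stETH):

```python
# Hypothetical helper: the redemption rate actually applied at finalization.
# Negative rebases (penalties) pass through to the withdrawer; positive
# rebases can only restore the rate up to its value at request creation.
def fulfillment_rate(rate_at_request: float, rate_at_finalization: float) -> float:
    return min(rate_at_request, rate_at_finalization)
```

This removes the incentive to park stETH in the queue: waiting can never yield a better rate than the one observed when the request was made.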
### How slashings would affect withdrawal request fulfillment
In the event of mass slashing, it is necessary either to delay withdrawal requests until penalties can be socialized among holders or to fulfill the requests at a discount. Calculating the discount at the moment the slashing starts would arguably be the best way forward. Unfortunately, one can't predict either the exit time or the total penalties caused by slashing, so there's no way to calculate the discount properly before the slashing has ended.
The main idea here is to guarantee that the Lido protocol operates in a fair way with regard to unsophisticated stakers. For example, a sophisticated actor can, by observing the current state of the blockchain, predict that, due to a mass slashing today, the protocol will experience a negative rebase in 18 days. Then, in the absence of any delay or discount mechanism, they would be able to avoid losses by withdrawing their funds and re-staking them after the penalties are socialized between other protocol users. This would lead to arbitrage, with the burden of losses falling on those who didn't have the time or experience to withdraw first.
Two mechanisms could protect against this in the event of mass slashing: delaying the withdrawal requests, or fulfilling them without additional delay but at a discount.
There's no way to predict the overall penalty amount associated with a slashing. So, the only penalty estimate that won't give withdrawing stakers a better rate than regular ones is the full balance of the validator under slashing (one just can't lose more). In most cases the penalty amount would be significantly lower, so fulfilling requests at a discount is not optimal from a UX and fairness perspective. Moreover, any user can already sell their stETH (or even their withdrawal request position) on a secondary market at a discount. Therefore, the “delay” option should be preferred as the way to socialize losses from mass slashing events.
Single and multiple-validator non-mass slashing, on the other hand, should not greatly affect the protocol, as the daily protocol rewards compensate fully for these types of slashings. Therefore, delaying all withdrawal requests due to minor slashing instances, while protocol rewards are still higher than the losses, is impractical and unnecessary.
To address this, the proposed protocol design consists of two modes: “turbo mode” and “bunker mode”. In the "turbo mode", the protocol fulfills withdrawal requests once it has enough Ether to do so. In the "bunker mode", a user must wait until all losses that might've been predicted based on the state of the blockchain observable at the moment of withdrawal request creation are socialized between all users.
Oracles could trigger a "bunker mode" based on the information about the number of slashings that happened within the last 36 days for Lido and non-Lido validators, as well as extrapolated daily Lido on Ethereum protocol rewards.
In the "bunker mode", a withdrawal request's fulfillment is delayed until the Lido protocol gets back to the "turbo mode" or all slashings associated with the request have passed.
Associated slashings are all ongoing slashings that occurred before the withdrawal request or in the same oracle report period (or the next oracle report period in some cases).
The definition of the request’s associated slashings ensures that the consequences of any slashing that is observable at the moment of the withdrawal request creation are experienced in full. At the same time, users won’t be affected by slashings happening way after the withdrawal request is made.
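The "associated slashing" predicate can be sketched as follows, assuming a hypothetical numbering of oracle report periods, with slashings tagged by the period in which they were first observed:

```python
# Hypothetical sketch: a slashing is "associated" with a withdrawal request
# if it started before the request or within the same oracle report period
# (optionally the next one, per the edge cases mentioned above).
def is_associated(slashing_report_id: int, request_report_id: int,
                  include_next_period: bool = False) -> bool:
    limit = request_report_id + (1 if include_next_period else 0)
    return slashing_report_id <= limit
```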
Finally, if it’s observed that Ethereum has been in the leak mode after the previous oracle report, no withdrawal request should be finalized on the next oracle report to allow time for the blockchain to fully apply all the consequences of the leak mode and allow additional time to respond to the incident.
### Withdrawal request fulfillment mechanics
To fulfill a withdrawal request, the protocol should do several things:
1. Determine the next validator to exit based on a predetermined order and notify the respective Node Operator of this decision.
2. Decide on the time of finalization (including delays imposed by slashing conditions) and the stETH share redemption rate of each withdrawal request.
To make these decisions, the protocol needs information from the Consensus Layer, which is not directly accessible from the Execution Layer. The current version of the protocol relies on a committee of oracles to provide this missing information for protocol operation on the Execution Layer.
1. Assign oracle committee members the additional duty of signaling the next validators to eject. Once in `N` epochs, oracle members analyze newly appearing withdrawal requests in the `WithdrawalQueue` contract, calculate the list of the next validators to eject using the algorithm approved by the Lido DAO and implemented in the oracle code, and publish the list to oracle smart contracts. The algorithm must handle failures and delays in validators' ejection.
2. Oracles must keep track of the queue, and when more requests can be finalized based on the Consensus and Execution Layer states, the oracle must transmit the index of the last request to be finalized, the amount of Ether needed to satisfy the range of requests, and the redemption rate. Withdrawal requests that were sent less than `M` blocks (~1 hour, for example) before the reportable epoch cannot be finalized in that report. Such a restriction is necessary so that any slashing that occurs can be recorded in the Consensus Layer chain. In the case of bunker mode, a withdrawal request shall not be finalized until all associated slashings are completed.
### Using buffered Ether from deposits and rewards for withdrawal request fulfillment
The use of buffered Ether and accrued rewards for fulfilling withdrawal requests is grounded in the principle of APR maximization, which aims to ensure that Ether spends as little time as possible in the buffer or the activation queue.
To provide the best UX possible, withdrawal requests should be fulfilled as fast as possible by utilizing Ether from the deposit buffer and rewards vaults to fulfill withdrawal requests. This way the protocol can balance capital efficiency and UX considerations.
As outlined previously, withdrawal request fulfillment can only happen at the time of an oracle report due to the lack of up-to-date information on the state of the Consensus Layer at any other time. This holds true even when using solely the Ether held on the Execution Layer (i.e. deposit buffer and rewards) for request fulfillment in both turbo and bunker mode. Satisfying withdrawal requests immediately would give sophisticated actors a way to avoid losses from slashings.
The oracle report uses a combined Execution Layer buffer of deposited Ether, accrued rewards & the withdrawals vault to fulfill the withdrawal requests, restakes the amount that can be restaked & leaves the rest in the buffer.
![](https://hackmd.io/_uploads/Bk4z-EPoj.png)
### Handling overlapping slashing
In a scenario where one slashing happens before the previous one has fully finished, it is essential to consider how this can affect the finalization of withdrawal requests.
Consider the following situation:
- The protocol switches to bunker mode due to **mass slashing 1**
- A user submits a withdrawal request to the WithdrawalQueue
- **Slashing 2** occurs, not necessarily a big one
- All validators from mass **slashing 1** have exited
- *Option 1: Withdrawal request can be finalized here*
- After some time, all validators from **slashing 2** have exited
- *Option 2: Withdrawal request can be finalized here*
When should the withdrawal request be finalized?
The withdrawal delay is introduced to prevent sophisticated actors from avoiding penalties which can be predicted by observing Consensus Layer. The withdrawal request is delayed until all validators under associated slashings have exited, but it shouldn't be delayed for longer than necessary.
Delaying until Option 2 opens up a potential griefing vector. A malicious node operator could slash extra validators before all validators under slashing 1 have exited, blocking all withdrawals in the Lido protocol. Note that the cost of such an attack gets significantly lower under extreme network conditions.
Thus it's practical to choose Option 1.
Under Option 1 the withdrawal request can be finalized once either all validators under "associated slashings" have exited or the Lido protocol has returned to turbo mode, lifting slashing-induced delays altogether.
Note also that under Option 1, withdrawal requests made well before the Lido protocol entered bunker mode may be unaffected by bunker mode delays.
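Under these choices, the Option 1 finalization condition can be sketched as a small predicate (hypothetical helper):

```python
# Hypothetical sketch of the Option 1 finalization condition.
def can_finalize(bunker_mode: bool,
                 associated_slashings_exited: bool,
                 enough_ether: bool) -> bool:
    if not enough_ether:
        # partial fulfillment is not supported
        return False
    # turbo mode lifts slashing-induced delays altogether
    return (not bunker_mode) or associated_slashings_exited
```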
### Staker’s flow
Staker's flow describes all the steps that are required for a common user of the protocol to withdraw their ether.
```mermaid
%%{
init: {
"theme": "forest"
}
}%%
sequenceDiagram
title Staker's flow
participant Staker
participant WithdrawalQueue
participant Anyone
Note right of Staker: Step 1
Staker->>WithdrawalQueue: requestWithdrawal(amount of stETH to lock)
WithdrawalQueue-->>Staker: return (request id)
Note right of Anyone: Step 2
loop Until (request status) is final
Anyone->>WithdrawalQueue: withdrawalRequestStatus(request id)
WithdrawalQueue-->>Anyone: return (request status)
end
Note right of Anyone: Step 3
Anyone->>WithdrawalQueue: claimWithdrawal(request id)
WithdrawalQueue->>Staker: transfer ether (#8804; originally locked stETH amount)
```
##### Step 1. Initiate a request
Withdrawal requests are initiated by a stETH holder via the `WithdrawalQueue` contract using the entry-point `requestWithdrawal` method which:
- irreversibly locks the given `stETH` amount by transferring it from the caller (`msg.sender`), relying on the provided allowance (explicit approval)
- enqueues withdrawal request into the contract's requests queue
- associates an incremental unique id (i.e., position in the FIFO-serving queue) with the caller's address
The amount of ETH the staker will receive upon withdrawal is capped at this stage (the locked `stETH` stops accruing rewards on behalf of the caller from the moment the withdrawal request is successfully enqueued).
##### Step 2. Wait until the request finalization
There are two approaches to track requests finalization:
- On-chain: calling the permissionless `withdrawalRequestStatus` method of the `WithdrawalQueue` contract with the request id allows tracking the request's status.
- Off-chain: subscribing to the `WithdrawalsFinalized` events emitted by the contract, terminating once the last finalized unique id becomes equal to or greater than the needed one.
The withdrawal request recipient can receive in the end:
- the ether amount corresponding to the locked `stETH` amount at the time the `requestWithdrawal` method was called
- ether amount less than locked `stETH` amount in case of mass slashing events (unlikely, will be described further below)
This amount becomes fully determined (immutable) once the withdrawal request is marked as finalized and becomes claimable.
Withdrawal requests are expected to become finalized within a couple of days in most cases. In the case of mass slashing events and penalties, finalization may need to be postponed by up to ~18 days.
##### Step 3. Claim ether
Once the withdrawal request has been fulfilled completely and marked as finalized, anyone can call the `claimWithdrawal` method on the `WithdrawalQueue` contract, providing the original unique withdrawal request id. The method transfers ether to the original requestor (recipient) who locked their `stETH` in the first step.
### Oracle flow (protocol accounting)
The rough oracle flow is the following:
- **take a pre- snapshot of the `stETH` share price**
- update exited validators number for each Lido-participating node operator
- estimate transient validators balance
- finalize withdrawal requests (if possible) by burning `stETH` shares and locking ether at the same time (i.e., decreasing the total pooled ether amount)
- resubmit remaining funds up to the reported `WithdrawalQueue` balance at block `N` to the deposit buffer
- set the reserve mark value to leave the necessary amount of funds in deposit buffer till the next report to use for withdrawals
- **take a post- snapshot of the `stETH` share price for rewards**
- calculate both consensus and execution layer rewards
- mint&distribute protocol fee on top of rewards only if consensus layer rewards part is positive
- **take a post- snapshot of the `stETH` share price for APR calculation**
Oracle report should provide the following data:
- Consensus layer data:
* Lido validators balances sum at block `N`
* Active Lido validators number at block `N`
* Cumulative number of exited Lido validators at block `N`
* List of exited Lido validator counts per Node Operator at block `N` since the previously completed report
- Execution layer data:
* The balance of `WithdrawalQueue` available for distribution at block `N`
- Decision data:
* New withdrawals buffered ether reserve mark value
* List of triples: [last finalized request id, finalized stETH amount, finalized shares amount]
Requirements and constraints:
- Block `N` corresponds to the last block of the **finalized** expected epoch
- Off-chain Oracle daemon decides how many withdrawal requests have to be finalized on each report. It **must** suspend finalization if a slashing occurred, until the midterm penalty is applied
- List of triples `[last finalized request id, amount of finalized stETH, amount of finalized shares]` is used to deliver the values of the previously missed reportable epochs since the last successfully completed report
- Oracle is able to finalize only the requests that happened BEFORE the fixed offset from block `N` (10 epochs by default)
- A validator is considered exited once it has become [inactive](https://github.com/ethereum/consensus-specs/blob/11a037fd9227e29ee809c9397b09f8cc3383a8c0/specs/phase0/beacon-chain.md#is_active_validator) (reached its assigned `exit_epoch`)
- The Oracle daemon is unable to deliver a retrospective report if an [expected reportable epoch](https://docs.lido.fi/contracts/lido-oracle#getexpectedepochid) was missed. A new report should deliver the data for the missed period(s) that happened since the last successfully completed report.
#### Withdrawal request finalization
Pending withdrawal requests get finalized as a part of the Oracle report.
The Oracle report delivers the following values used to finalize the withdrawal requests:
* the balance of WC address at block N
* the amount of locked ether on WC address at block N
* the list of triples to describe each finalized batch of requests:
* id of the last withdrawal request id in the batch
* the total pooled ether amount per single stETH (i.e., share price) in the batch
This means that all requests up to the last withdrawal request id from the list's tail element become finalized (i.e., claimable). Noting the `id` of the last finalized withdrawal request at block `N` is required to avoid on-chain loops over the withdrawal requests during finalization. The total pooled ether amount per single stETH share for each block range is recorded into a table to facilitate calculating the ether amount claimable for each request (even for old ones).
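A rough sketch of how such a batch table lets a claim be priced with a single binary search instead of an on-chain loop. The checkpoint values and structures below are illustrative, not the actual `WithdrawalQueue` layout:

```python
from bisect import bisect_left

# Checkpoints appended by successive oracle reports, sorted by last_request_id.
# Each entry: (last request id in the batch, share price used for that batch)
checkpoints = [
    (100, 1.05),   # requests 1..100 finalized at share price 1.05
    (250, 1.07),   # requests 101..250 finalized at share price 1.07
]

def claimable_ether(request_id: int, steth_shares: float) -> float:
    """Find the batch covering request_id via binary search and price the claim."""
    last_ids = [last_id for last_id, _ in checkpoints]
    idx = bisect_left(last_ids, request_id)
    if idx == len(checkpoints):
        raise ValueError("request is not finalized yet")
    _, share_price = checkpoints[idx]
    return steth_shares * share_price

# request 150 falls into the second batch, so it is priced at 1.07 per share
amount = claimable_ether(150, 10.0)
```

The key property is that old requests remain claimable at their recorded batch price, regardless of later share-price changes.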
```mermaid
sequenceDiagram
title Oracle flow
participant OO as Off#hyphen;chain oracle
participant WQ as WithdrawalQueue
participant LO as LidoOracle
participant Lido
OO-->>OO: determine finalized block N
OO->>OO: read CL data at block N
OO->>OO: read WC balance at block N
OO->>WQ: readLastRequestId(block N)
WQ-->>OO: return (lastRequestId)
loop Collect unfinalized requests info (≤ lastRequestId)
OO->>WQ: getRequestInfo(requestId)
WQ-->>OO: return (request info)
end
OO-->>OO: determine last requestId for finalization
OO-->>LO: push CL, EL data + last requestId for finalization
LO-->>Lido: update accounting
Lido-->>WQ: finalize(lastRequestId, currentSharePrice)
```
### Exit signals flow
The exit daemon is used to signal that some Lido validators should be ejected. The purpose of the daemon is to secure and distribute this kind of report via smart-contract events.
Exit daemon runs its own lifecycle and doesn't need to be aligned with the general Lido oracle reports.
Exit daemon delivers the following data:
- an exact pubkey to exit
- NO id and stakingModuleId where the pubkey belongs
- validator index corresponding to the validator with the pubkey
- block/epoch number of decision (Q: not needed?)
- proof of validity (merkle root of all keys?)
#### Calculate/evaluate amount of validators for exit
A rough algorithm looks like this.
1. Take the not-yet-finalized withdrawal requests from `WithdrawalQueue`
1. Evaluate the number of requests which don't require exiting any additional validators:
    1. Number of requests for which there is enough ether in the buffers permitted for withdrawals
    2. Number of requests which are going to receive their ether from pending exit requests (requested, and for which the validators responded within `TIMEOUT_FOR_EXIT_REQUEST`)
    3. /optional/ Take into account the ether which is going to come from spontaneous validator exits
1. Calculate the number of packs of 32 ETH needed to satisfy the ether demand from the rest of the withdrawal requests
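The steps above reduce to a single calculation; function and parameter names below are illustrative:

```python
import math

ETHER_PER_VALIDATOR = 32  # a full withdrawal returns ~32 ETH per exited validator

def validators_to_exit(total_demand_eth: float,
                       buffered_eth: float,
                       pending_exits_eth: float) -> int:
    """Number of additional validator exits needed to cover the
    unfinalized withdrawal demand (rough sketch of the steps above)."""
    # Subtract ether already available or already on its way
    uncovered = total_demand_eth - buffered_eth - pending_exits_eth
    if uncovered <= 0:
        return 0
    # Round up to whole 32 ETH "packs"
    return math.ceil(uncovered / ETHER_PER_VALIDATOR)
```

For example, 1000 ETH of demand with 500 ETH buffered and 200 ETH pending from exits leaves 300 ETH uncovered, requiring 10 additional exits.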
#### Select specific validators for exit
Based on the [exit order discussion](https://research.lido.fi/t/withdrawals-on-validator-exiting-order/3048).
1. Take list of Lido validators ordered by validator activation time ascending
2. Identify the last validator requested for exit (by taking the last `ValidatorExitRequest` event)
3. Take the first `N` validators, skipping already exited / exiting validators
#### Parameters of protocol exit flow
Parameters of the protocol related to validator exit flow
- `TIMEOUT_FOR_EXIT_REQUEST` Time since emitting the `ValidatorExitRequest` event until the corresponding validator is considered non-responsive and skipped. Likely should be stored in the `ValidatorExitBus` contract
- `MAX_KEYS_TO_REPORT_FOR_EXIT_AT_ONCE` Constant to prevent the oracle from hitting the block gas limit while reporting to `ValidatorExitBus`. Determined based on the transaction gas cost
- `MAX_EXIT_REQUESTS_PER_DAY` Safety limit on the number of exit reports allowed per day, to protect against exiting insane amounts of validators due to malicious or incorrect oracle reports
```mermaid
%%{
init: {
"theme": "neutral"
}
}%%
sequenceDiagram
title Exit daemon flow
participant ED as Exit daemon
participant LO as Lido Oracle
participant NO as Node Operator
ED-->ED: determine amount of validators keys to exit
loop Push validators keys (≤ keys count)
ED-->LO: push next validator pubkey
LO-->LO: emit event with validator pubkey
end
loop Waiting for relevant keys
NO-->LO: read events for relevant pubkey
NO-->NO: sign exit message and send to CL if relevant
end
```
* The proposed tooling for signing exit messages is described [here](https://research.lido.fi/t/withdrawals-automating-lido-validator-exits/3272)
## Bunker mode
The proposed design for Lido on Ethereum withdrawals has two operating modes — “turbo” and “bunker”. Turbo mode processes stETH withdrawal requests as fast as possible and is expected to be working most of the time. The bunker mode activates under mass slashing conditions. It introduces extra delay for the slashing penalties to be realized and socialized in full before the withdrawal request can be fulfilled.
Under current conditions and the proposed design, switching to the bunker mode would require the simultaneous slashing of 600+ Lido validators. That’s 6 times more simultaneous slashings than Ethereum has ever experienced, and no Lido validator has ever been slashed.
The bunker mode is proposed as the Lido protocol emergency plan. The goal is to specify what is considered a protocol-wide emergency and specify the protocol behavior under extreme conditions ahead of time. The design is aimed to prevent sophisticated actors from making a profit at the cost of everyone else.
“Turbo/bunker mode” aims to create the situation when for every frame (where a frame is a period between two Oracle reports, see details in the Oracle spec):
* users who remain in the staking pool
* users who exit within the frame
* users who exit within the nearest frames
are in nearly the same conditions.
Thus, the “bunker mode” should appear when within the current frame there is a negative Consensus Layer (CL) rebase or it's expected to happen in the future as it would break the balance between users exiting now and those who remain in the staking pool or exit later.
> Negative CL rebase occurs when, within a period, aggregate Lido validator performance results in penalties exceeding rewards. Important note: Lido receives rewards on Execution Layer (priority fee and MEV for proposed blocks) and rewards/penalties on Consensus Layer (for performing the validator's duties). We consider only CL rewards/penalties as they are a better indicator for validators' network consensus participation and do not depend on market conditions like MEV.
Several scenarios should be taken into consideration as they individually or in their combination may cause a negative CL rebase:
* Mass slashing event (if slashing penalties are big enough to have the impact on Protocol's rewards in the current frame or in the future, esp. midterm penalties)
* Lido validators offline for several hours/days due to hardware damage, software malfunction, or connectivity issues (if attestation penalties are big enough to have the impact on Protocol's rewards in the current frame)
* Mass slashing event among non-Lido validators (if non-Lido slashing is big enough to increase significantly Lido midterm slashings penalties to have the impact on Protocol's rewards in the current frame or in the future)
* non-Lido validators offline (if the number of non-Lido validators offline is big enough to trigger an inactivity leak mode that reduces Lido rewards and increases Lido penalties and leads to a negative CL rebase in the current frame or in the future)
Given that, it's proposed to launch the “bunker mode” in the following situations:
1. New or ongoing mass slashing that can cause a negative CL rebase within any frame during the slashing resolution period
1. Negative CL rebase in the current frame
1. Lower than expected CL rebase in the current frame and a negative CL rebase in the end of the frame
### “Bunker mode” conditions
**Bunker is a Protocol mode that is activated when there is a negative CL rebase in the current frame or it is likely to happen in the future**
CL rebase is used for several reasons:
* CL rebase is a better indicator for validators’ performance
* MEV received during the frame can distort the picture and Oracle will not see really bad validators’ performance
* We consider a conservative scenario as we cannot rely on Execution Layer rewards
Based on the definition, there can be **three conditions that trigger the “bunker mode”**.
**Condition 1. New or ongoing mass slashing that can cause a negative CL rebase**
Motivation:
* Upcoming slashing penalties are unpredictable as they depend on the network's future condition
* A negative CL rebase could happen in the future due to midterm penalties
* Lido socializes upcoming slashing penalties among users
* Lido does not give advantages to those who exit in the current frame
Implementation:
* The “bunker mode” is set up when there are as many slashed validators as can cause a negative CL rebase during the current frame (due to initial slashing penalties) or any of the future frames during slashing resolution periods (due to midterm slashing penalties).
* CL rebase in the future is assumed to be exactly the same as in the current frame reduced by the amount of midterm penalties expected based on the number of ongoing slashings for Protocol and within the network
* Protocol switches back to the “turbo mode” once the current and future possible penalties from the Lido slashing cannot cause a negative CL rebase
* Oracle makes a decision on turning the “bunker mode” on and off based on the historical and current state of the network and Lido; it can be changed during slashing resolution period
**Condition 2. Negative CL rebase in the current frame**
Motivation:
* A negative CL rebase in the current frame is strong evidence of an incident in the network or Protocol. As it has to be a catastrophic scenario to cause a negative CL rebase (see Appendix B for details), we expect it to last for a while in the future, so there is a high probability that a negative CL rebase can happen in the future
* Lido socializes the upcoming negative CL rebase among users
* Lido does not give advantages to those who observe the network and can exit while the others still do not know about the problems
Implementation:
* The “bunker mode” is set up when Oracle sees a negative CL rebase in the current frame
* In case of no associated slashings, there is a limit on the maximum delay for withdrawal request finalization, set to 2 * gov_reaction_time (~6 days)
This means that by default Protocol will start to finalize withdrawal requests with no associated slashings automatically after ~6 days despite the “bunker mode”. The time window is needed to enable the governance to intervene and stop request finalizations if necessary
**Condition 3. Lower than expected CL rebase in the current frame and a negative CL rebase at the end of the frame**
Motivation:
* Lower than expected CL rebase in the current frame together with a negative CL rebase at the end of the frame is a signal of an incident that happened shortly before the Oracle report, when conditions (1) and (2) are OK
* Lido expects it to last for a while in the future, so there is a high probability that a negative CL rebase can happen in the future
* Lido socializes the upcoming negative CL rebase among users
* Lido does not give advantages to those who observe the network and can exit while the others still do not know about the problems
Implementation:
* The “bunker mode” is set up when Oracle sees
`[ (lower than expected CL rebase in the current frame)
and
(negative CL rebase in X) or (negative CL rebase in x) ]`
*Lower than expected CL rebase in the current frame* is defined based on the expected amount of maximum rewards calculated from the Ethereum spec, with a confidence_interval parameter representing normal variation due to the randomness of proposal and sync rewards, non-ideal validator performance, and margins in calculations. The suggested confidence_interval is 0.9
X is the minimum sufficient number of epochs leading to a lower than expected CL rebase in a frame (standard Oracle frame = 225 epochs). X is calculated by multiplying the complement of the confidence_interval parameter (`1 - confidence_interval`) by the number of epochs in a frame and rounding up to the nearest integer. In case of a non-standard frame length, X is not supposed to be changed as it depends only on the confidence_interval parameter.
The suggested X is 23 epochs:
`X = ceil (epochs_per_frame * (1 - confidence_interval)) = ceil (225 * 0.1) = 23 epochs`
x is a period for point-in-time performance valuation at the end of the frame. The suggested x is 1 epoch
* In case of no associated slashings, there is a limit on the maximum delay for withdrawal request finalization, set to `2 * gov_reaction_time + 1` (~7 days)
This means that by default Protocol will start to finalize withdrawal requests with no associated slashings automatically after ~7 days despite the “bunker mode”. A time window is needed to enable the governance to intervene and stop request finalizations if necessary. One additional day over condition (2) is needed because the problems triggering this condition can be detected only at the end of the current frame.
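Under the suggested parameters, the window sizes above reduce to a one-line calculation:

```python
import math

EPOCHS_PER_FRAME = 225       # standard Oracle frame length
CONFIDENCE_INTERVAL = 0.9    # fraction of the frame attributed to normal variation

# X: minimum number of epochs of degraded performance that can drag the
# whole frame below the expected CL rebase
X = math.ceil(EPOCHS_PER_FRAME * (1 - CONFIDENCE_INTERVAL))

# x: point-in-time valuation window at the end of the frame
x = 1

print(X)  # 23
```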
**Condition (1)** is operating only on available data on the network representing the impact of future penalties which is also available to potential sophisticated users.
**Condition (2)** is robust as it occurs in a highly unlikely scenario with a huge impact on Protocol.
**Condition (3)** is more prone to false-positive errors due to shorter observation intervals; it extrapolates to estimate the future rebase and requires softening constraint (5), operating not only on the current state but also on historical data.
**Protocol mode is defined with every Oracle report.** With that implementation Oracle makes a decision on a possible negative CL rebase based on the most recent observed frame. For example, the “bunker mode” can be turned on due to anticipating penalties for ongoing slashings and turned off before their realization if the anticipated CL rebase grows enough.
![](https://i.imgur.com/sVSAGQA.png)
There is a **period-lock of X epochs for withdrawal requests** that can prevent sophisticated users from unstaking if problems in Protocol are observed close to the Oracle report epoch.
Until the withdrawal requests are processed, **all rewards/penalties are socialized** for exiting and remaining users.
While in the “bunker mode”, **staking Ether from the buffer is stopped**:
* As “bunker mode” is a consequence of massive incidents within Protocol, new staking should be stopped until risks are eliminated, so they are not increasing with new staking.
* There can be a significant delay in withdrawal requests finalization due to the “bunker mode”, so Ether in the buffer should be preserved until the moment Protocol goes back to the “turbo mode” in order to facilitate withdrawal requests finalization.
### Withdrawal requests finalization
For any withdrawal request, a period between submitting a request and its actual finalization (when Ether would be available to withdraw) is determined by two factors:
* Time until transferring for finalization: a period during which a negative CL rebase can occur in order to socialize the impact of incidents that happened before a request was submitted
* Time for gathering liquidity: period for accumulating ETH for request fulfillment - depends on withdrawal request volume, amount of ETH in the buffer and time needed for validators to exit if needed (which is also depending on the size of the current exit queue)
This section is focused on the first factor: while in the “turbo mode” the time until transferring for finalization is as short as possible, in the “bunker mode” for some withdrawal requests it can be significantly longer.
As the Protocol mode is selected on each Oracle report, for each withdrawal request an associated frame can be defined based on the epoch it was submitted in.
By an **associated frame** we mean
* Current frame if the withdrawal request epoch is more than X epochs before the next Oracle report epoch (RE)
* Next frame if the withdrawal request epoch is equal or less than X epochs before the next Oracle report epoch
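The associated-frame rule can be sketched as follows (names are illustrative):

```python
X = 23  # period-lock in epochs

def associated_frame(request_epoch: int,
                     next_report_epoch: int,
                     current_frame: int) -> int:
    """Current frame if the request is more than X epochs before the next
    Oracle report epoch, otherwise the next frame."""
    if next_report_epoch - request_epoch > X:
        return current_frame
    return current_frame + 1
```

A request submitted 50 epochs before the report maps to the current frame; one submitted 10 epochs before it maps to the next frame.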
![](https://i.imgur.com/bWGwCEs.png)
For each withdrawal request the moment it would be transferred for finalization depends on associated frame (for socializing penalties due to incidents happened within this frame or before) and a Protocol mode.
If Protocol is in the “turbo mode”: **a withdrawal request would be transferred for finalization at the end of the associated frame**: closest Oracle report if it is more than X epochs to it, or the next one after that if it is less.
If Protocol is in the “bunker mode” there are more conditions (one general and one specific for the “bunker mode” due to a negative CL rebase).
A withdrawal request would be transferred for finalization at the Oracle report epoch if:
1. **There are no ongoing associated slashings relevant to the request**
An ongoing associated slashing is a slashing that started within the associated frame or earlier and has not ended before the current report epoch.
![](https://i.imgur.com/cJttfLQ.png)
2. **Penalties for incidents relevant to the request are socialized for the “bunker mode” due to a negative CL rebase**
Formalization for this condition:
The withdrawal request’s associated frame is before the first frame of the “bunker mode” due to a negative CL rebase
or the associated frame report epoch is equal to or more than MAX_NEGATIVE_REBASE_BORDER = 1536 epochs (~6.8 days) before the current report epoch.
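The formalization above as a boolean sketch (parameter names are illustrative):

```python
MAX_NEGATIVE_REBASE_BORDER = 1536  # epochs, ~6.8 days

def penalties_socialized(assoc_frame_report_epoch: int,
                         first_bunker_frame_report_epoch: int,
                         current_report_epoch: int) -> bool:
    """True if penalties relevant to the request are already socialized."""
    # Associated frame precedes the first "bunker mode" frame...
    before_bunker = assoc_frame_report_epoch < first_bunker_frame_report_epoch
    # ...or the associated frame is old enough that the negative rebase
    # has already been absorbed by the share price
    old_enough = (current_report_epoch - assoc_frame_report_epoch
                  >= MAX_NEGATIVE_REBASE_BORDER)
    return before_bunker or old_enough
```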
![](https://i.imgur.com/2xvAmP7.png)
### Condition 1: New or ongoing mass slashing that can cause a negative CL rebase
![](https://i.imgur.com/NO4fQBN.png)
### Condition 2: Negative CL rebase in the current frame
![](https://i.imgur.com/sz9WPJZ.png)
![](https://i.imgur.com/xFSPNvh.png)
### Condition 3: Lower than expected CL rebase in the current frame and a negative CL rebase at the end of the frame
In the case below, window x is triggered by a negative CL rebase. If x is 1 epoch, we can estimate that there are massive incidents within Protocol at the Oracle report epoch that can extend into the next frame, leading to a negative CL rebase.
A lower than normal CL rebase for the current Oracle frame indicates that signs of those massive incidents were present through the whole frame, therefore providing an opportunity for sophisticated users to exit based on the observable history.
![](https://i.imgur.com/xi7CoHB.png)
In the case below, the window X is triggered by a negative CL rebase. If X is 23 epochs, we can estimate that there are continuous incidents within Protocol before the Oracle report epoch that can extend into the next frame, leading to a negative CL rebase.
![](https://i.imgur.com/qlhNo96.png)
## Oracle flow
![](https://i.imgur.com/2L1GW6Y.png)
## Withdrawal flow
![](https://i.imgur.com/nioDWz7.png)
## Bunker flow
![](https://i.imgur.com/1w5iQv9.png)
## Reference
* https://docs.prylabs.network/docs/how-prysm-works/validator-lifecycle
* https://github.com/ethereum/consensus-specs/blob/dev/specs/phase0/validator.md
* https://ethresear.ch/t/handling-withdrawals-in-lidos-eth-liquid-staking-protocol/8873
* https://notes.ethereum.org/@vbuterin/serenity_design_rationale
* https://eth2book.info/bellatrix
* https://eth2book.info/bellatrix/part2/incentives/slashing/
* https://ethereum.org/en/developers/docs/consensus-mechanisms/pos/gasper/
* [Withdrawals spec](https://hackmd.io/PkQzohrLT9uvbp6HBlIvDg?view)
* [“Bunker mode”: what it is and how it works](https://docs.google.com/document/d/1NoJ3rbVZ1OJfByjibHPA91Ghqk487tT0djAf6PFu8s8/edit#)
* [Detecting incidents within Lido on Ethereum Protocol based on CL rebase](https://docs.google.com/document/d/1sz_I_mjqCBTjBWkG1x2I03XYwNmZqPyhwqfzscTdccA/edit#heading=h.2inkbpjkwzfi)
* [Accounting oracle](https://hackmd.io/gzhSS39YRWKK-Jh_maHGNA?view)
* [Lido Oracle: Associated slashings research](https://hackmd.io/@lido/r1Qkkiv3j)
* [On withdrawal request finalization rate](https://hackmd.io/-57vYMtvTO6J538yPi427Q)
* [Ejector oracle](https://hackmd.io/@lido/BJxY6m52j)
* [Lido on Ethereum. Withdrawals Landscape.](https://hackmd.io/lAkzu90YS7GJAVHtSzxt8g?view)
* [Withdrawals: Automating Lido Validator Exits](https://research.lido.fi/t/withdrawals-automating-lido-validator-exits/3272)
* [Withdrawals. On validator exiting order](https://research.lido.fi/t/withdrawals-on-validator-exiting-order/3048)