# Sui Validator Recovery Options and Notes

## Background

There's been discussion about the best way to recover a Sui validator in the event of a primary server failure. The options discussed involve running a secondary server to reduce downtime due to resyncing. It is expected that the time required to resync from scratch would cause intolerable downtime.

Discussion has focused on running a secondary server in parallel to the primary validator. An original assumption was that the database from the secondary server could be used to quickly resync the primary validator.

## Open Questions

However, a number of questions emerged about this assumption. Through the discussion, possible scenarios emerged and, with them, a number of open questions.

## Scenarios

### Scenario 1 - Run secondary server as an RPC fullnode

In scenario 1, the secondary server would run as an RPC fullnode in parallel with the primary validator. If the primary validator fails, the database from the RPC fullnode would be used to resync the primary validator, either by transferring the database to the primary server, or by transferring the validator keys to the RPC fullnode, which would then be restarted in validator mode using ``validator.toml``.

However, through analysis it became clear that the RPC fullnode database is different from the validator database. Most notably, the RPC fullnode database does not include the ``consensus_db`` and ``authorities_db`` databases that the validator requires.

As shown by [denysk | Stardust Staking](https://discord.com/channels/916379725201563759/1093849389778350110/1096090811793428500):

```
fullnode:
..........
352K    /opt/sui/db/suidb/live/store/epoch_611
314G    /opt/sui/db/suidb/live/store
29M     /opt/sui/db/suidb/live/epochs
15G     /opt/sui/db/suidb/live/checkpoints
435G    /opt/sui/db/suidb/live
435G    /opt/sui/db/suidb
435G    /opt/sui/db/

validator node:
54G     /opt/sui/db/consensus_db/756
74G     /opt/sui/db/consensus_db/755
128G    /opt/sui/db/consensus_db
12G     /opt/sui/db/authorities_db/live/store/epoch_755
8.9G    /opt/sui/db/authorities_db/live/store/epoch_756
11G     /opt/sui/db/authorities_db/live/store/epoch_754
321G    /opt/sui/db/authorities_db/live/store/perpetual
351G    /opt/sui/db/authorities_db/live/store
31M     /opt/sui/db/authorities_db/live/epochs
15G     /opt/sui/db/authorities_db/live/checkpoints
366G    /opt/sui/db/authorities_db/live
366G    /opt/sui/db/authorities_db
493G    /opt/sui/db/
```

### Scenario 2 - Run a second validator

In scenario 2, the secondary server would run as a second validator. The second validator, of course, would use different keys than the primary validator. If the primary validator fails, the keys from the primary validator would be transferred to the secondary validator.

However, a point was raised about this approach causing unnecessary and possibly disruptive p2p traffic. It's also unclear whether the validator's consensus database is unique to each validator. If it is, for example, the secondary validator's database couldn't be used by the primary validator.

## Appendix

It appears that ``sui-node`` has a [built-in snapshot capability](https://docs.sui.io/build/snapshot). However, it's unclear if the snapshot capability is useful for validators or only fullnodes:

"While validators can enable snapshots, they are typically most valuable for Full node operators. Snapshots of the Sui network enable Full node operators a way to bootstrap a Full node without having to execute all the transactions that occurred after genesis."

### Open Questions

- Which of these options is viable and preferred?
- If neither of these options is viable, e.g. ``consensus_db`` is unique to each validator, what is the best risk mitigation plan available for validators to minimize downtime?
- Is the built-in snapshot capability useful for validators or only fullnodes?

### Testing Results

Scenario A - Chainflow tested the following scenario on testnet, which was intended to simulate restoring a validator from a provided third-party snapshot. Note that provided third-party snapshots only include ``authorities_db`` and not ``consensus_db``; it is believed the latter is specific to a particular validator.

1 - Stop validator
2 - Wipe ``consensus_db`` and keep ``authorities_db``
3 - Restart validator

This resulted in about 4 hours of downtime, as it appears the ``consensus_db`` had to be recreated from the epoch boundary. So while the ``authorities_db`` snapshot synced that database rather quickly, recovery time was dependent on re-creation of ``consensus_db``. A command-level sketch of this test appears below.

![](https://hackmd.io/_uploads/BybucZgS2.png)

See additional context [here](https://discord.com/channels/916379725201563759/1003662993608945684/1105169432227094638).
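For reference, the Scenario A test maps to roughly the following commands. This is a minimal sketch, assuming a ``sui-node`` systemd unit and the ``/opt/sui/db`` paths used elsewhere in this document; adjust to your own layout. The metrics check is borrowed from the Controlled Failover steps below.

```
# Sketch of the Scenario A test: wipe consensus_db, keep authorities_db.
# Assumes the sui-node systemd unit and /opt/sui/db layout shown in this doc.
sudo systemctl stop sui-node

# Wipe consensus_db only; authorities_db is kept (or restored from a
# third-party snapshot, which includes authorities_db but not consensus_db).
sudo rm -rf /opt/sui/db/consensus_db/

# Restart and watch progress; expect consensus_db to be rebuilt from the
# epoch boundary, which took roughly 4 hours in this test.
sudo systemctl start sui-node
curl -s localhost:9184/metrics | grep -e last_committed_round -e ^current_round | grep -v "#"
```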
### Differences Between a Full Node DB and Validator DB

Validator ``authorities_db`` = full node ``full_node_db`` - ``fullnode_pending_transactions`` - ``indexes`` (i.e., the validator ``authorities_db`` is the same as the full node ``full_node_db`` without the ``fullnode_pending_transactions`` and ``indexes`` directories).

Source: [Luz in #mn-validators-discussion](https://discord.com/channels/916379725201563759/1093849389778350110/1110267175819808888).

### Suggestion on Configuring a Full Node Backup to Minimize the Above Difference

The easiest approach is to set ``db-path: /opt/sui/db/authorities_db`` in the ``fullnode.yaml`` config file, so that ``authorities_db`` sits at the same path on the full node and the validator.

```
@sui-mainnet-backup:/opt/sui/db$ tree -d
└── authorities_db
    └── live
        ├── checkpoints
        ├── epochs
        ├── fullnode_pending_transactions
        ├── indexes
        └── store
            ├── epoch_54
            ├── epoch_55
            ├── epoch_56
            └── perpetual
```

and main (validator):

```
sui-mainnet:/opt/sui/db$ tree -d
.
├── authorities_db
│   └── live
│       ├── checkpoints
│       ├── epochs
│       └── store
│           ├── epoch_54
│           ├── epoch_55
│           ├── epoch_56
│           └── perpetual
└── consensus_db
    ├── 55
    └── 56
```

This is easier: you don't need to create a symbolic link or move the folder at failover time.

Source: [here](https://discord.com/channels/916379725201563759/1003662993608945684/1116290396318859275).
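A minimal sketch of applying this suggestion, assuming ``fullnode.yaml`` lives at ``/opt/sui/fullnode.yaml`` (a hypothetical location) and that failover calls for the exact validator layout shown above:

```
# Verify the backup full node stores its DB at the validator-style path.
# The fullnode.yaml location is an assumption; adjust to your setup.
grep '^db-path' /opt/sui/fullnode.yaml
# expected: db-path: /opt/sui/db/authorities_db

# At failover, if an exact validator layout is wanted, drop the two
# fullnode-only directories identified above before starting in validator mode.
sudo systemctl stop sui-node
sudo rm -rf /opt/sui/db/authorities_db/live/fullnode_pending_transactions
sudo rm -rf /opt/sui/db/authorities_db/live/indexes
```

Whether the two extra directories must actually be removed, or are simply ignored when the node runs as a validator, has not been confirmed in the discussion above.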
### Controlled Failover

This is the rough sequence of steps used to fail over (this assumes the "new host" already has the validator keys, genesis.blob, and proper firewall settings):

1 - Install and configure the AWS CLI on the new host:
* sudo apt install awscli
* aws configure set default.s3.use_accelerate_endpoint true
* aws configure set default.s3.max_concurrent_requests 50
* aws s3 ls s3://mysten-mainnet-snapshots/

2 - Upgrade sui-node on the new host to the current release, then delete any existing DBs:
* wget https://sui-releases.s3-accelerate.amazonaws.com/388f54bff04e056a5785e914cdee176d6fae8fd0/sui-node
* sudo rm -rf /opt/sui/db/authorities_db/
* sudo rm -rf /opt/sui/db/consensus_db/

3 - Snapshot copy:
* aws s3 cp --recursive s3://mysten-mainnet-snapshots/epoch_50/ ./live
* sudo mv live/ /opt/sui/db/authorities_db/
* sudo chown -R sui:sui /opt/sui/db/

4 - Stop sui-node on the origin host:
* sudo systemctl stop sui-node

5 - Start sui-node on the new host:
* sudo systemctl start sui-node

6 - DNS update: mysten-1.mainnet.sui.io -> [new-host].mainnet.sui.io

7 - Monitor the new host:
* curl -s localhost:9184/metrics | grep -e last_committed_round -e ^current_round -e highest_synced_check -e uptime | grep -v "#"
* du -sh /opt/sui/db/

### Contributors

- Bethose | SDL
- chris / chainflow
- denysk | Stardust Staking
- Ivan Merín Rodríguez
- Antoine | Node Guardians