# Moving Lighthouse Validators [DRAFT]
This document describes a method of moving validators between two separate hosts (i.e., from one machine to another).
The method described here favours simplicity and robustness over validator uptime. When using this method you should expect validators to be down for at least 10-30 minutes. This is only a handful of missed attestations, and it's unlikely for a given validator to have a block proposal during this time. This downtime is insignificant compared to the risk of slashing; don't rush it and don't overcomplicate it. Think before you act. Downtime is OK, slashing is not.
This guide will copy all validators and the slashing protection database from an old host to a new host.
**This guide is still in the draft phase.** It should describe the steps necessary but may have some rough edges or mistakes.
## Prerequisites
This document assumes:
- The reader is using Linux, or is able to translate this guide from Linux to whatever OS they're using. We're just stopping/starting processes and moving files around, so this should be easy enough for experienced users.
- The reader has a decent understanding of server administration. E.g., they can SSH to hosts, manage services (`systemd`, etc) and move files between hosts without handholding.
- The user has two hosts running:
    - `old_host`: has a Lighthouse validator client (VC) running. This is the host which will have validators *removed* from it.
    - `new_host`: has a Lighthouse validator client configured but not running. Has access to a *fully synced* beacon node (BN).
## Process
### 1. Stop the `old_host` validator client
Firstly, and most importantly, we need to stop the "old" validator client. Stopping this service is the key to preventing slashing.
If you're using Linux this probably means using the `sudo systemctl stop <service-name>` command. If you used the [Somer Esat Guide](https://someresat.medium.com/guide-to-staking-on-ethereum-ubuntu-lighthouse-773f5d982e03) then you'll want to use `sudo systemctl stop lighthousevalidator`. If you used the [CoinCashew Guide](https://www.coincashew.com/coins/overview-eth/guide-or-how-to-setup-a-validator-on-eth2-mainnet/part-i-installation/configuring-consensus-client-beaconchain-and-validator) then you'll want to use `sudo systemctl stop validator`.
Once you think you've stopped the VC, it's time to *ensure* you have stopped it. Check the service status with `sudo systemctl status <service-name>`:
- Somer Esat Guide: `sudo systemctl status lighthousevalidator`
- CoinCashew Guide: `sudo systemctl status validator`
In the output of this command, you want to see the following line:
```
Active: inactive (dead)
```
You definitely *do not* want to see something like:
```
Active: active (running) since Sat 2022-11-19 23:21:33 UTC; 30s ago
```
If you see `active`, it means the VC is still running! This is bad!
If you see `Unit <service-name> could not be found`, then you've got the `<service-name>` wrong and probably haven't stopped the VC. You will need to figure out the name of the VC service yourself. A good way to start might be running `ls /etc/systemd/system` to see a list of some installed services.
In addition to checking the service, run `ps aux | grep lighthouse` and ensure that you can't see any Lighthouse VC processes running. The `grep` command itself may show up in the output; you can ignore that line.
To be *even more sure* wait 15 minutes and check a block explorer like beaconcha.in to ensure that your validators are missing attestations.
It's **critical** that we stop the old VC. You should do everything you can think of to ensure the validator has stopped. Waiting an hour and then checking on beaconcha.in to see that all attestations are being missed is a completely reasonable thing to do.
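The status check in this step can also be scripted so that it is harder to misread. The sketch below is one possible approach and not part of Lighthouse or `systemd`: `status_is_stopped` is a hypothetical helper that reads `systemctl status` output on standard input and succeeds only if it finds the `Active: inactive (dead)` line shown above. Anything else, including `active` or a unit that cannot be found, is treated as "not confirmed stopped".

```shell
# Hypothetical helper (not part of Lighthouse or systemd).
# Reads `systemctl status` output on stdin and succeeds only if the unit
# is reported as "inactive (dead)". Any other state counts as a failure.
status_is_stopped() {
    grep -Eq '^[[:space:]]*Active:[[:space:]]+inactive \(dead\)'
}

# Example usage (replace <service-name> with your VC service):
#   if sudo systemctl status <service-name> | status_is_stopped; then
#       echo "VC service is stopped"
#   else
#       echo "VC service is NOT confirmed stopped; investigate before continuing"
#   fi
```

A check like this supplements the process list and block explorer checks; it does not replace them.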
### 2. Move the `old_host` validator directory locally
To help prevent the old VC from accidentally starting again and signing slashable messages (zombie-ing), we're going to move the validator directory. We assume that your VC stores its files in `~/.lighthouse/mainnet/validators`. However, this directory can vary based on your particular setup:
- Somer Esat Guide: `/var/lib/lighthouse/validators`
- CoinCashew Guide: `/home/<USER>/.lighthouse/mainnet/validators` where `<USER>` is the user you chose when following the guide.
> Reminder: The reader will need to replace occurrences of `~/.lighthouse/mainnet/validators` with whichever directory applies to them.
Check that you have the right directory by running:
```
ls ~/.lighthouse/mainnet/validators
```
In this output you should see directories named after your validator public keys (starting with `0x`) and some or all of the following files/directories:
```
logs
slashing_protection.sqlite
slashing_protection.sqlite-journal
validator_definitions.yml
validator_key_cache.json
```
If you don't see both some of the above files *and* some directories starting with `0x`, then you're likely using the wrong path. You **must** find the right directory, or we cannot move the validators.
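If you would rather not rely on reading the `ls` output by eye, the same test can be sketched as a small shell function. `looks_like_validator_dir` is a hypothetical name used for illustration only; it checks for at least one `0x`-prefixed sub-directory plus at least one of the files or directories listed above.

```shell
# Hypothetical helper: succeeds if DIR looks like a Lighthouse validator
# directory, i.e. it has at least one 0x-prefixed sub-directory and at
# least one of the files/directories listed above.
looks_like_validator_dir() {
    dir="$1"
    has_key_dir=no
    for entry in "$dir"/0x*; do
        if [ -d "$entry" ]; then
            has_key_dir=yes
            break
        fi
    done
    [ "$has_key_dir" = yes ] || return 1

    for name in logs slashing_protection.sqlite \
        slashing_protection.sqlite-journal \
        validator_definitions.yml validator_key_cache.json; do
        if [ -e "$dir/$name" ]; then
            return 0
        fi
    done
    return 1
}

# Example usage:
#   looks_like_validator_dir ~/.lighthouse/mainnet/validators && echo "looks right"
```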
Assuming you have the right directory, we're going to move that directory to an arbitrary directory on our filesystem. The destination directory doesn't matter, as long as it's totally distinct from the original one (e.g., don't move it to a sub-directory of the original directory).
```
mv ~/.lighthouse/mainnet/validators ~/old_validators
```
Check the `~/old_validators` directory and ensure you can see the `0x` prefixed validator public key directories in there.
> Some users might also be using a "secrets directory". If so you'll need to also run `mv ~/.lighthouse/mainnet/secrets ~/old_secrets`. Remember to use the path that suits your installation. If you're using this directory it will exist and contain files with `0x`-prefixed filenames. If the directory doesn't exist or is empty, you do not need to do anything with it.
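The move in this step can be wrapped in a few guard checks. The function below is only a sketch under stated assumptions: `retire_validator_dir` is a hypothetical name, and the checks (source exists, destination does not exist yet, destination is not inside the source) reflect the advice above rather than anything Lighthouse requires.

```shell
# Hypothetical helper: moves SRC to DEST, refusing to do so when the
# destination already exists or lies inside the source directory.
retire_validator_dir() {
    src="${1%/}"   # drop a trailing slash so the prefix check below works
    dest="$2"
    if [ ! -d "$src" ]; then
        echo "source $src is not a directory" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        echo "destination $dest already exists" >&2
        return 1
    fi
    case "$dest" in
        "$src"/*)
            echo "destination must not be inside the source" >&2
            return 1
            ;;
    esac
    mv -- "$src" "$dest"
}

# Example usage:
#   retire_validator_dir ~/.lighthouse/mainnet/validators ~/old_validators
```

Refusing an existing destination avoids the case where `mv` would silently place the directory *inside* an existing `~/old_validators`.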
### 3. Double check that your validators are still stopped
Go to beaconcha.in and check that the validators are still missing attestations. If they're still getting rewards for producing attestations then something has gone *very wrong*. Do not proceed; seek assistance.
### 4. Move the validators to the `new_host`
You need to copy the `~/old_validators` directory from the `old_host` to the `new_host`. We will leave it up to the user to figure this out. Tools like `scp` or `rsync` are great for this.
The end result should be that *both* the `old_host` and `new_host` have *identical* directories at `~/old_validators`. We're leaving the files on the `old_host` as a backup, but they should be deleted when the process is complete.
> If you're using the secrets dir, do the same for `~/old_secrets`.
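One way to confirm that the two copies really are identical is to compute a fingerprint of the directory on each host and compare the results. The sketch below assumes GNU `find`, `sort`, `xargs` and `sha256sum` are installed (true on most Linux hosts); `dir_fingerprint` is a hypothetical helper name. It covers file paths and contents only, not permissions or empty directories.

```shell
# Hypothetical helper: prints a single SHA-256 fingerprint covering the
# relative path and contents of every regular file under DIR.
# Assumes GNU find, sort, xargs and sha256sum.
dir_fingerprint() {
    (
        cd "$1" || exit 1
        find . -type f -print0 \
            | LC_ALL=C sort -z \
            | xargs -0 -r sha256sum \
            | sha256sum \
            | cut -d ' ' -f 1
    )
}

# Example usage: run on each host and compare the printed values.
#   dir_fingerprint ~/old_validators
```

If the two values differ, copy the directory again before continuing.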
### 5. Turn off the `old_host`
You should power down the `old_host` now, just to be safe. This might mean pressing the power button on the physical machine or using the web interface for a cloud-hosted machine. I wouldn't recommend deleting/terminating the box yet, since having a backup of the files can be useful.
Everything we do from here on will be on the `new_host`.
### 6. Ensure the `new_host` VC service is stopped
On the `new_host` ensure the VC service is stopped. This will be just like what we did in step (1), but on the `new_host` rather than the `old_host`.
We're making sure the `new_host` VC is stopped since we're going to be moving around some of its configuration files. It's not a slashing risk for it to be running; however, it's certainly best to have it stopped.
### 7. On the `new_host`, copy the validator files to the appropriate directory
On the `new_host`, determine the directory that the VC is expecting to find the validator files. We'll assume `~/.lighthouse/mainnet/validators`, but your setup might be different:
- Somer Esat Guide: `/var/lib/lighthouse/validators`
- CoinCashew Guide: `/home/<USER>/.lighthouse/mainnet/validators` where `<USER>` is the user you chose when following the guide.
Run these commands:
```
rm -r ~/.lighthouse/mainnet/validators
cp -r ~/old_validators ~/.lighthouse/mainnet/validators
```
> If you're using the secrets dir, move `~/old_secrets` to `~/.lighthouse/mainnet/secrets` (or whichever destination directory is applicable).
### 8. Inspect the `validator_definitions.yml` file
On the `new_host` open the `~/.lighthouse/mainnet/validators/validator_definitions.yml` file in your favourite text editor (remembering that this file might be at a different path depending on which directory structure you use).
In this file you'll see that there are one or more [absolute paths](https://www.geeksforgeeks.org/absolute-relative-pathnames-unix/), like:
```
voting_keystore_path: /home/paul/.lighthouse/validators/0x87a580d31d7bc69069b55f5a01995a610dd391a26dc9e36e81057a17211983a79266800ab8531f21f1083d7d84085007/voting-keystore.json
```
If the directory structure has changed between the two hosts, then these paths will need to be updated. For example, let's assume that the user on the old machine was `paul` but the user on the new machine is `karl`. The above path (and all others) would need to be changed to:
```
voting_keystore_path: /home/karl/.lighthouse/validators/0x87a580d31d7bc69069b55f5a01995a610dd391a26dc9e36e81057a17211983a79266800ab8531f21f1083d7d84085007/voting-keystore.json
```
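With many validators, editing each path by hand is error-prone. The following is a rough sketch of how the prefix change could be scripted with `sed`, followed by a check that every keystore path now exists. Both function names are hypothetical, and the sketch assumes the paths are unquoted, contain no spaces or `|` characters, and that the old prefix contains nothing special to `sed` other than `.`. A backup of the file is kept with a `.bak` suffix.

```shell
# Hypothetical helper: replaces the path prefix OLD with NEW on every line
# of FILE that contains "_path:". A backup is kept at FILE.bak.
rewrite_path_prefix() {
    file="$1"
    old="$2"
    new="$3"
    sed -i.bak "/_path:/ s|$old|$new|g" "$file"
}

# Hypothetical helper: prints every voting_keystore_path value in FILE
# that does not exist as a file on this host. Prints nothing if all exist.
missing_keystores() {
    awk '{ for (i = 1; i < NF; i++)
               if ($i == "voting_keystore_path:") print $(i + 1) }' "$1" \
        | while read -r path; do
            [ -f "$path" ] || echo "$path"
        done
}

# Example usage (adjust the file path and prefixes to your setup):
#   defs=~/.lighthouse/mainnet/validators/validator_definitions.yml
#   rewrite_path_prefix "$defs" /home/paul/ /home/karl/
#   missing_keystores "$defs"
```

Open the file afterwards and read it through anyway; the script is a convenience, not a substitute for inspecting the result.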
### 9. Triple check that your validators are still stopped
Go to beaconcha.in and check that the validators are still missing attestations. If they're still getting rewards for producing attestations then something has gone *very wrong*. Do not proceed; seek assistance.
### 10. Start the VC service on the `new_host`
On the `new_host`, start the VC service/process. When using `systemd`, this will involve something like `sudo systemctl start <service-name>`.
- Somer Esat Guide: `sudo systemctl start lighthousevalidator`
- CoinCashew Guide: `sudo systemctl start validator`
Check that the service is running with `sudo systemctl status <service-name>`. You should see something like:
```
Active: active (running) since Sat 2022-11-19 23:21:33 UTC; 30s ago
```
### 11. Determine that the VC is running
Check the logs of the VC. When using `systemd`, this can be easily done with `sudo journalctl -u <service-name> -f`.
- Somer Esat Guide: `sudo journalctl -u lighthousevalidator -f`
- CoinCashew Guide: `sudo journalctl -u validator -f`
If you're seeing `ERRO` logs or `CRIT` logs then you will need to investigate and resolve these issues.
Additionally, check on beaconcha.in that your validator is receiving attestation rewards and no longer missing attestations or blocks.
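Scanning a live log stream for problems by eye is easy to get wrong, so a quick count can help. The sketch below assumes the log level appears as a standalone word (`ERRO` or `CRIT`) on each log line, as described above; `count_problem_logs` is a hypothetical helper, and the `--since` window in the usage comment is an arbitrary choice.

```shell
# Hypothetical helper: reads log lines on stdin and prints how many of
# them contain ERRO or CRIT as a whole word.
count_problem_logs() {
    # grep exits non-zero when the count is 0, so mask that status.
    grep -cwE 'ERRO|CRIT' || true
}

# Example usage (replace <service-name> with your VC service):
#   sudo journalctl -u <service-name> --since "10 minutes ago" --no-pager | count_problem_logs
```

A count of `0` is a good sign, but it does not prove the validators are attesting; the block explorer check remains the final word.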
### 12. Clean up
Once you're confident that things are working, you can:
- On the `new_host`, delete the `~/old_validators` directory.
- Terminate/delete the `old_host`.
> If using the secrets directory, also delete the `~/old_secrets` directory.
### 13. Done
The guide is complete and there should be nothing else to do. If you get stuck feel free to reach out on our [Discord](https://discord.gg/cyAszAh).