Validator Ejector Logic Spec

---
tags: withdrawals, tooling
authors: Nikolay Pryanishnikov
created: 12-12-2022
updated: 26-12-2022
---

![](https://hackmd.io/_uploads/r18VzMQYo.png)

# Validator Ejector Logic Spec

Validator ejector is a daemon which loads pre-signed validator exit messages and sends them out when necessary by listening to LidoOracle contract events.

[Repository on GitHub](https://github.com/lidofinance/validator-ejector)

## Main Flow

```mermaid
sequenceDiagram
    Ejector->>Ejector: Load and validate messages
    Ejector->>Execution Node: Fetch historical events
    Ejector->>Consensus Node: Submit exits if needed
    Ejector->>Execution Node: Fetch loop events
    Ejector->>Consensus Node: Submit exits if needed
    Ejector->>Ejector: Sleep and repeat
```

## Requirements

- Folder of pre-signed exit messages as individual json files in either spec format or ethdo output format
- Execution node
- Consensus node

## ENV Configuration

Required:

- EXECUTION_NODE=http://1.2.3.4:8545
- CONSENSUS_NODE=http://1.2.3.4:5051
- LOCATOR_ADDRESS=0x123 - Address of the Locator contract, can be found in the lido-dao repo
- STAKING_MODULE_ID=123 - Staking Module ID for which operator ID is set
- OPERATOR_ID=123 - Operator ID in the Node Operators registry, easiest to get from Operators UI
- MESSAGES_LOCATION=messages - Folder to load json exit message files from

Optional:

- MESSAGES_PASSWORD - Password to decrypt encrypted exit messages with. Needed only if you have encrypted files in messages directory.
- BLOCKS_PRELOAD=7200 - Amount of blocks to load events from on start. Increase if daemon was not running for some time. Defaults to a day of blocks
- BLOCKS_LOOP=32 - Amount of blocks to load events from on every poll. Defaults to 1 epoch
- JOB_INTERVAL=384000 - Time interval in milliseconds to run checks.
Defaults to time of 1 epoch
- HTTP_PORT=false - Port to serve metrics and health check on
- RUN_METRICS=false - Enable metrics endpoint
- RUN_HEALTH_CHECK=false - Enable health check endpoint
- LOGGER_LEVEL=info - Minimum severity level of messages to log, e.g. info will hide debug messages
- LOGGER_FORMAT=simple - Simple or JSON log output: simple/json
- LOGGER_SECRETS=[] - String array of exact secrets to sanitize in logs
- DRY_RUN=false - Run the service without actually sending out exit messages

## Message Loading and Validation

Ejector handles two message formats:

Original:

```json
{
  "message": { "epoch": "123", "validator_index": "123" },
  "signature": "0x123"
}
```

Ethdo output:

```json
{
  "exit": {
    "message": { "epoch": "123", "validator_index": "123" },
    "signature": "0x123"
  },
  "fork_version": "0x02001020"
}
```

They are supplied as separate .json files in a folder whose location is set using the `MESSAGES_LOCATION` environment variable.

For storage safety, messages can be encrypted, and Ejector will automatically decrypt them on startup using the password from the `MESSAGES_PASSWORD` environment variable. Encryption and decryption follow the [EIP-2335](https://github.com/ethereum/EIPs/blob/master/EIPS/eip-2335.md) spec.

- Invalid files in messages folder are noticed and logged
- JSON inside files is checked to be parseable
- JSON structure is checked
- JSON after decryption is checked to be parseable
- Exit signature is fully validated

The signature is checked against the current and previous fork settings, the same way a Consensus Node does it. The epoch is not checked separately, because an invalid epoch (e.g. too early) will fail the signature check.

At the start, all messages are loaded into memory. This allows us not to rely on specific filenames, e.g. `0x123.json`, and to find messages by the validatorIndex inside them instead, giving Node Operators the freedom to choose the names themselves.

If new messages are added while Ejector is running, a restart is needed.
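The loading step described above could look roughly like the following sketch, which accepts both supported formats and indexes the results by validator index. The helper names `normalizeMessage` and `indexMessages` and the `{ name, content }` file shape are hypothetical and not taken from the Ejector codebase; decryption and signature validation are left out.

```javascript
// Hypothetical helpers for illustration; not the Ejector's actual code.

// Accepts either the spec format or the ethdo output format and returns
// a spec-format object, or null if the structure is not recognised.
function normalizeMessage(json) {
  // ethdo wraps the signed exit in an "exit" field
  const candidate =
    json && typeof json === 'object' && 'exit' in json ? json.exit : json
  const valid =
    candidate &&
    typeof candidate === 'object' &&
    candidate.message &&
    typeof candidate.message.epoch === 'string' &&
    typeof candidate.message.validator_index === 'string' &&
    typeof candidate.signature === 'string'
  return valid
    ? { message: candidate.message, signature: candidate.signature }
    : null
}

// Builds an in-memory index keyed by validator index, so file names do not matter.
// files: array of { name, content } where content is the raw file text (assumed shape).
function indexMessages(files) {
  const byIndex = new Map()
  for (const { name, content } of files) {
    let parsed
    try {
      parsed = JSON.parse(content)
    } catch {
      console.error(`Skipping ${name}: JSON is not parseable`)
      continue
    }
    const msg = normalizeMessage(parsed)
    if (!msg) {
      console.error(`Skipping ${name}: unexpected JSON structure`)
      continue
    }
    byIndex.set(msg.message.validator_index, msg)
  }
  return byIndex
}
```

Keying the map by `validator_index` is what makes the later lookup independent of file names; invalid files are logged and skipped rather than aborting the whole load.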
A restart is safe, as nothing bad will happen if an exit message is submitted more than once.

## Events Loading

Events are loaded in two stages, and the block parameters for both are adjustable via environment variables.

Events will come in rounds, which are 4-6 hours long.

Current event definition:

```solidity
event ValidatorExitRequest(
    uint256 indexed stakingModuleId,
    uint256 indexed nodeOperatorId,
    bytes validatorPubkey
);
```

Events are requested from the Execution Node only for the exact Node Operator whose ID is configured in the `OPERATOR_ID` environment variable. This allows us to catch and log an error for situations when we should be sending an exit, but don't have a message loaded for that specific validator.

All node requests are made against finalized state (32 blocks behind safe state and 64 behind head, for extra caution). Default poll parameters are optimised to wake up on a new finalized epoch and read its events.

### Historical load

First, on start, events are loaded for a large number of blocks to compensate for Ejector being switched off. Node Operators can adjust the parameters if it was not running for a long time.

### Loop load

After the historical load is done, Ejector will sleep and wake up to load events for a smaller number of blocks.

## Exits

When the Ejector thinks an exit should be made, first, the validator status is checked, and if it's already exiting, nothing will be done.

```javascript
switch (status) {
  case 'active_exiting':
  case 'exited_unslashed':
  case 'exited_slashed':
  case 'withdrawal_possible': // already exited
  case 'withdrawal_done': // already exited
    isExiting = true
    break // without this, execution falls through to default and resets the flag
  default:
    isExiting = false
}
```

Then, the validator index is found by looking up validator data in the Consensus Node using the public key from the exit request. This is done so we can search exit messages in memory.

After that, the message is found in memory by looking at the validator index inside exit messages. If it's not found, an error is logged and a metric is updated.
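The steps so far can be condensed into one decision function, shown here as a minimal sketch. The names `EXITING_STATUSES` and `decideExit`, the `{ index, status }` validator shape and the returned `action` values are assumptions made for illustration, not the Ejector's actual interface.

```javascript
// Hypothetical sketch; names and shapes are illustrative only.

// Statuses in which a validator is already exiting or has exited.
const EXITING_STATUSES = new Set([
  'active_exiting',
  'exited_unslashed',
  'exited_slashed',
  'withdrawal_possible',
  'withdrawal_done',
])

// validator: { index, status } as looked up in the Consensus Node by public key
// messages: Map of validator index -> pre-signed exit message held in memory
function decideExit(validator, messages) {
  if (EXITING_STATUSES.has(validator.status)) {
    // Nothing to do: the validator is already on its way out
    return { action: 'skip', reason: 'already exiting' }
  }
  const message = messages.get(validator.index)
  if (!message) {
    // An exit was requested but no message is loaded for this validator
    console.error(`No exit message loaded for validator ${validator.index}`)
    return { action: 'error', reason: 'message not found' }
  }
  return { action: 'submit', message }
}
```

Returning a plain result object instead of submitting directly keeps the decision easy to test and leaves room for a dry run mode to log the outcome without sending anything.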
At this point, we don't validate the message and its signature a second time, because the node will check its validity and return an error if it's not correct. For example, the message may no longer be valid if two hard forks have happened by the time it needs to be sent out.

Finally, the message is sent to the Consensus Node and its response is handled.

## Additional Safety Features

- Encrypted messages allow for secure file storage
- Node requests are repeated both on generic errors and timeouts
- Amount of messages left to send out can be checked using metrics so Node Operators don't run out of them
- Dry run mode to test setup without actually sending out exit messages
- Health check endpoint to detect if the app has crashed
- Ability to disable the metrics and health check endpoints so as not to expose a port

## Logs

Logging is configurable via environment variables.

### Log level

Vital for debugging, e.g. setting it to "debug" to understand why something is not working properly. Especially useful since we might ask Node Operators to provide detailed logs in case of issues.

### Output type

Simple or JSON. JSON allows easy parsing to filter needed data.

### Secrets

Exact secrets to sanitize from logs. They are optional: some operators would want to sanitize RPC service urls with auth tokens, but for internal node addresses it might not be necessary.

## Metrics

- exit_messages: ['valid'] - Exit messages and their validity: JSON parseability, structure and signature.
Exit messages of validators that are already exiting or have exited are not counted
- exit_actions: ['result'] - Statuses of initiated validator exits
- polling_last_blocks_duration_seconds: ['eventsNumber'] - Duration of polling last blocks in microseconds
- execution_request_duration_seconds: ['result', 'status', 'domain'] - Execution node request duration in microseconds
- consensus_request_duration_seconds: ['result', 'status', 'domain'] - Consensus node request duration in microseconds
- job_duration_seconds: ['name', 'interval', 'result'] - Duration of cron jobs

Recommended Use Cases:

- Monitoring of invalid exit messages with alerts
- Monitoring of remaining messages which were not yet sent out - operators will generate and sign more messages when necessary
- Noticing failed requests to nodes