# Off-chain storage & management for Lido validators’ keys
The key actors in the Lido protocol are Node Operators: teams that run ETH2 validator nodes and earn ETH2 beacon chain rewards for the protocol. Each Node Operator controls many Validators, which are independent actors participating in the consensus of the Ethereum 2.0 protocol.
The Lido stake delegation process consists of several steps:
1. Node Operators locally generate a bunch of BLS key pairs and calculate a `deposit_data_root` for each, using the `withdrawal_credentials` provided by Lido.
2. Node Operators upload this DepositData to the smart contract and await approval.
3. The protocol governance checks that the public keys are unique and that every `deposit_data_root` matches the Lido `withdrawal_credentials`.
4. After key approval, anyone can start delegation of the next 32 Ether chunk to the next validator, because the contract has all the data prepared on-chain. If there are no unused and approved key+signature pairs, no ether gets delegated when the delegation method is called.
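The check in step 3 can be sketched off-chain as follows. This is a minimal sketch, assuming the `deposit_data_root` is computed the same way the Ethereum deposit contract computes it (SHA-256 over the padded public key, withdrawal credentials, amount in gwei, and signature). The helper name `matches_credentials` and all byte strings are hypothetical placeholders, not real keys or Lido credentials, and the sketch does not verify the BLS signature itself.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def deposit_data_root(pubkey: bytes, withdrawal_credentials: bytes,
                      amount_gwei: int, signature: bytes) -> bytes:
    """Hash tree root of DepositData, following the deposit contract's layout."""
    if (len(pubkey), len(withdrawal_credentials), len(signature)) != (48, 32, 96):
        raise ValueError("expected 48-byte pubkey, 32-byte credentials, 96-byte signature")
    pubkey_root = sha256(pubkey + bytes(16))  # pad 48 -> 64 bytes
    signature_root = sha256(
        sha256(signature[:64]) + sha256(signature[64:] + bytes(32))
    )
    amount = amount_gwei.to_bytes(8, "little")  # gwei as little-endian uint64
    return sha256(
        sha256(pubkey_root + withdrawal_credentials)
        + sha256(amount + bytes(24) + signature_root)
    )


def matches_credentials(pubkey: bytes, signature: bytes, claimed_root: bytes,
                        lido_credentials: bytes,
                        amount_gwei: int = 32 * 10**9) -> bool:
    """Hypothetical helper: recompute the root with Lido's credentials and compare."""
    expected = deposit_data_root(pubkey, lido_credentials, amount_gwei, signature)
    return expected == claimed_root


# Placeholder byte strings, NOT real BLS keys, signatures, or Lido credentials.
LIDO_WC = bytes.fromhex("01" + "00" * 11 + "ab" * 20)
OTHER_WC = bytes.fromhex("01" + "00" * 11 + "cd" * 20)
pubkey, signature = b"\x11" * 48, b"\x22" * 96

good_root = deposit_data_root(pubkey, LIDO_WC, 32 * 10**9, signature)
bad_root = deposit_data_root(pubkey, OTHER_WC, 32 * 10**9, signature)
print(matches_credentials(pubkey, signature, good_root, LIDO_WC))
print(matches_credentials(pubkey, signature, bad_root, LIDO_WC))
```

Because the withdrawal credentials are hashed into the root, a root produced with any other credentials fails the comparison.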
Currently, the protocol stores all the Node Operators’ keys in a separate smart contract, the [Node operators registry](https://github.com/lidofinance/lido-dao/blob/master/contracts/0.4.24/nos/NodeOperatorsRegistry.sol). This results in significant gas costs, as the need for blockchain storage grows linearly with the amount of ether deposited. The proposed change is to move operators’ keys off-chain.
The main challenge is to keep the `Lido.submit` method, which starts the stake delegation process, permission-less. The smart contract should be able to enforce key order and accept only approved keys without storing them on-chain.
To achieve this, we can take a hackathon winner's approach (add a link on PR and Tom’s full name) that uses Merkle proofs as a basis. The main idea behind that approach is that the smart contract calculates the Merkle root on key submission, and anyone can call the `Lido.submit` method with a Merkle proof passed as an argument.
We propose to stop passing keys as calldata on key submission. Instead, a Node Operator can upload them to IPFS and submit the Merkle root and `ipfs_cid` to the smart contract. To prevent duplicate use of the same key, we suggest storing each key together with its index in the Merkle tree.
## Happy Path
1. Node Operator generates a bunch of keys, each with an index to prevent duplicate use:
   `[1, pubkey1, deposit_data_root1]`,
   `[2, pubkey2, deposit_data_root2]`,
   ...,
   `[N, pubkeyN, deposit_data_rootN]`
2. Node Operator uploads this bunch of keys to IPFS and gets an `ipfs_cid` in return.
3. Node Operator calculates the Merkle root of the keys: `bunch_data_root = MerkleRoot(keys)`
4. Node Operator submits the *ValidatorKeyBunch* tuple `(bunch_data_root, bunch_size, ipfs_cid)` to the smart contract.
5. The protocol governance checks the submission and approves a bunch of keys.
6. Anyone can call `Lido.submit` and provide a Merkle proof. Since each Merkle leaf contains the key's index, the order of key submissions is enforced automatically.
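The happy path above can be sketched end to end in a few lines. This is only an illustration under stated assumptions: the leaf encoding (a `0x00` prefix, an 8-byte big-endian index, then `pubkey` and `deposit_data_root`), the use of SHA-256, the rule of duplicating the last node on odd levels, and the `BunchState` class standing in for contract storage are all hypothetical choices, not something this proposal specifies.

```python
import hashlib
from typing import List, Tuple

KeyData = Tuple[int, bytes, bytes]  # (key_index, pubkey, deposit_data_root)


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(key: KeyData) -> bytes:
    index, pubkey, data_root = key
    # Assumed encoding: the index is hashed into the leaf together with the key.
    return _h(b"\x00" + index.to_bytes(8, "big") + pubkey + data_root)


def node_hash(left: bytes, right: bytes) -> bytes:
    return _h(b"\x01" + left + right)


def tree_levels(keys: List[KeyData]) -> List[List[bytes]]:
    if not keys:
        raise ValueError("a bunch needs at least one key")
    level = [leaf_hash(k) for k in keys]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:  # odd level: pair the last node with itself
            level = level + [level[-1]]
        level = [node_hash(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_root(keys: List[KeyData]) -> bytes:
    return tree_levels(keys)[-1][0]


def merkle_proof(keys: List[KeyData], key_index: int) -> List[bytes]:
    pos, proof = key_index - 1, []  # indexes start at 1, positions at 0
    for level in tree_levels(keys)[:-1]:
        sibling = pos ^ 1
        proof.append(level[sibling] if sibling < len(level) else level[pos])
        pos //= 2
    return proof


def root_from_proof(key: KeyData, proof: List[bytes]) -> bytes:
    pos, node = key[0] - 1, leaf_hash(key)
    for sibling in proof:
        node = node_hash(node, sibling) if pos % 2 == 0 else node_hash(sibling, node)
        pos //= 2
    return node


class BunchState:
    """Toy stand-in for what the contract keeps per approved bunch."""

    def __init__(self, bunch_data_root: bytes, bunch_size: int):
        self.root, self.size, self.next_index = bunch_data_root, bunch_size, 1

    def submit(self, key: KeyData, proof: List[bytes]) -> bool:
        in_order = key[0] == self.next_index <= self.size
        if not (in_order and root_from_proof(key, proof) == self.root):
            return False
        self.next_index += 1
        return True


# Placeholder key material, not real BLS public keys.
keys = [(i, bytes([i]) * 48, bytes([i]) * 32) for i in range(1, 6)]
state = BunchState(merkle_root(keys), len(keys))
print(state.submit(keys[1], merkle_proof(keys, 2)))  # key 2 offered before key 1
print(state.submit(keys[0], merkle_proof(keys, 1)))
print(state.submit(keys[1], merkle_proof(keys, 2)))
```

In this sketch the contract side stores only the root, the bunch size, and a counter. The position used to walk the proof is derived from the index inside the leaf, so a key cannot be replayed or presented at another position.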
## Two instances of KeyBunches for each Node Operator
Since key generation and approval take several days or more, it is a good idea to have two instances of *KeyBunch* per Node Operator: one in current use and another awaiting approval. Once one *KeyBunch* runs out of keys, the Node Operator should generate a new one and submit it for approval. In the meantime, submitters use the other *KeyBunch*. Since each public key and withdrawal credentials pair has an index, the order of key submission is strictly guaranteed.
To choose which *KeyBunch* to use during a permission-less deposit, the smart contract should calculate the `bunch_data_root` from the submitted proof and use the *KeyBunch* whose root matches.
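The root-based selection can be sketched as a lookup keyed by the root recomputed from the proof. This is a hedged illustration, not contract code: the `OperatorBunches` class, its two-slot limit, the per-bunch `next_index` counter, the leaf encoding, and the placeholder CID strings are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Tuple

KeyData = Tuple[int, bytes, bytes]  # (key_index, pubkey, deposit_data_root)


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(key: KeyData) -> bytes:
    return _h(b"\x00" + key[0].to_bytes(8, "big") + key[1] + key[2])


def node_hash(left: bytes, right: bytes) -> bytes:
    return _h(b"\x01" + left + right)


def tree_levels(keys: List[KeyData]) -> List[List[bytes]]:
    level = [leaf_hash(k) for k in keys]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:  # odd level: pair the last node with itself
            level = level + [level[-1]]
        level = [node_hash(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_root(keys: List[KeyData]) -> bytes:
    return tree_levels(keys)[-1][0]


def merkle_proof(keys: List[KeyData], key_index: int) -> List[bytes]:
    pos, proof = key_index - 1, []
    for level in tree_levels(keys)[:-1]:
        sibling = pos ^ 1
        proof.append(level[sibling] if sibling < len(level) else level[pos])
        pos //= 2
    return proof


def root_from_proof(key: KeyData, proof: List[bytes]) -> bytes:
    pos, node = key[0] - 1, leaf_hash(key)
    for sibling in proof:
        node = node_hash(node, sibling) if pos % 2 == 0 else node_hash(sibling, node)
        pos //= 2
    return node


@dataclass
class KeyBunch:
    bunch_data_root: bytes
    bunch_size: int
    ipfs_cid: str
    next_index: int = 1  # assumed per-bunch counter of the next key to use

    def exhausted(self) -> bool:
        return self.next_index > self.bunch_size


class OperatorBunches:
    """Hypothetical per-operator storage holding at most two live bunches."""

    def __init__(self) -> None:
        self.bunches: Dict[bytes, KeyBunch] = {}

    def approve(self, bunch: KeyBunch) -> None:
        # Free the slots of bunches that have run out of keys.
        self.bunches = {r: b for r, b in self.bunches.items() if not b.exhausted()}
        if len(self.bunches) >= 2:
            raise ValueError("both KeyBunch slots still hold unused keys")
        self.bunches[bunch.bunch_data_root] = bunch

    def deposit(self, key: KeyData, proof: List[bytes]) -> str:
        bunch = self.bunches.get(root_from_proof(key, proof))
        if bunch is None:
            raise ValueError("proof does not match any approved KeyBunch")
        if bunch.exhausted() or key[0] != bunch.next_index:
            raise ValueError("key is out of order for its KeyBunch")
        bunch.next_index += 1
        return bunch.ipfs_cid


def make_keys(tag: int, n: int) -> List[KeyData]:
    # Placeholder key material, not real BLS public keys.
    return [(i, bytes([tag]) * 47 + bytes([i]), bytes([tag, i]) * 16) for i in range(1, n + 1)]


bunch_a, bunch_b = make_keys(0xA0, 3), make_keys(0xB0, 2)
registry = OperatorBunches()
registry.approve(KeyBunch(merkle_root(bunch_a), len(bunch_a), "cid-A-placeholder"))
registry.approve(KeyBunch(merkle_root(bunch_b), len(bunch_b), "cid-B-placeholder"))
print(registry.deposit(bunch_a[0], merkle_proof(bunch_a, 1)))
print(registry.deposit(bunch_b[0], merkle_proof(bunch_b, 1)))
```

The caller never names a bunch explicitly: the recomputed root either matches one of the stored roots or the deposit is rejected, and ordering is then enforced inside the matched bunch.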
## Off-chain infrastructure
In the current version of the Lido Ethereum liquid staking protocol, all keys are stored in the smart contract, which leads to huge gas costs but also makes operations simple:
1. Easy permission-less submit: anyone can call submit, which takes the next approved key and submits it with buffered ether to the deposit contract.
2. Easy to check that keys do not contain duplicates and that hashes are calculated correctly using the Lido withdrawal credentials.
3. Easy to fetch the full set of public keys to calculate the total balance of Lido validators on the Beacon chain.
All keys that are submitted to the *Deposit Contract* are stored in the Ethereum blockchain as events / call data and can easily be consumed through the Subgraph protocol. Keys that have been submitted but not yet used, approved or not, are not stored on-chain but can be fetched through IPFS. It’s very important to enforce the availability of that data, using an oracle network or financial incentives on top of the IPFS layer such as Filecoin.
So, all validator keys can be accessed off-chain through IPFS and Subgraph protocol or a combination of both.
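With keys off-chain, the duplicate and consistency checks move to whoever reviews a bunch. The sketch below shows one way such a reviewer might validate a bunch after fetching it from IPFS. It is an assumption-laden illustration: `validate_bunch` is a hypothetical helper, the key list is passed in already fetched (no IPFS client is involved), `known_pubkeys` stands for the set of keys already seen on-chain or in other bunches, and the leaf encoding is an assumed one.

```python
import hashlib
from typing import List, Set, Tuple

KeyData = Tuple[int, bytes, bytes]  # (key_index, pubkey, deposit_data_root)


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(key: KeyData) -> bytes:
    return _h(b"\x00" + key[0].to_bytes(8, "big") + key[1] + key[2])


def merkle_root(keys: List[KeyData]) -> bytes:
    level = [leaf_hash(k) for k in keys]
    while len(level) > 1:
        if len(level) % 2:  # odd level: pair the last node with itself
            level.append(level[-1])
        level = [_h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def validate_bunch(keys: List[KeyData], bunch_data_root: bytes, bunch_size: int,
                   known_pubkeys: Set[bytes]) -> List[str]:
    """Return a list of problems; an empty list means the bunch looks consistent."""
    problems = []
    if len(keys) != bunch_size:
        problems.append("bunch_size mismatch")
    if [k[0] for k in keys] != list(range(1, len(keys) + 1)):
        problems.append("indexes are not 1..N in order")
    pubkeys = [k[1] for k in keys]
    if len(set(pubkeys)) != len(pubkeys):
        problems.append("duplicate pubkey inside the bunch")
    if known_pubkeys & set(pubkeys):
        problems.append("pubkey already used elsewhere")
    if not keys or merkle_root(keys) != bunch_data_root:
        problems.append("Merkle root mismatch")
    return problems


# Placeholder key material, not real BLS public keys.
keys = [(i, bytes([i]) * 48, bytes([i]) * 32) for i in range(1, 5)]
root = merkle_root(keys)
print(validate_bunch(keys, root, 4, set()))
print(validate_bunch(keys, root, 4, {keys[2][1]}))
```

The same function can be run by anyone who can read the on-chain tuple and the IPFS content, which is why the availability of the IPFS data matters.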
* *Deposit Contract* is a smart contract on the Ethereum 1.0 side for deposits of ETH to the beacon chain. The deposit contract has a public `deposit` function to make deposits.
* *Node Operator* is an entity running ETH2 validator nodes and earning ETH2 beacon chain rewards for the protocol. Each Node Operator controls many Validators.
* *Validator* is an independent actor that participates in the consensus of the Ethereum 2.0 protocol and is controlled by a Node Operator.
* *ValidatorKeyData* is a tuple `(key_index, pubkey, deposit_data_root)`.
* *ValidatorKeyBunch* is a tuple `(bunch_data_root, bunch_size, ipfs_cid)`.