Problem
========
1. The mev-monitoring repo contains `RegistryService`, which collects operator data and keeps it up to date. It runs as a scheduled job and stores the data in a PostgreSQL database. The KAPI service provides the same functionality as `RegistryService`.
2. The number of operators is too large for the data to be held fully in RAM.
3. It would be useful to store each snapshot of the KAPI state so that metrics can be calculated per snapshot. This would allow us to show metrics as of a given slot number.
Solution
========
The overall solution is to store all necessary states of staking modules, node operators, and validator keys.
The DB structure is available via [this link](https://viewer.diagrams.net/?tags=%7B%7D&highlight=0000ff&edit=_blank&layers=1&nav=1&title=KAPI%20Versions%20storage.drawio#Uhttps%3A%2F%2Fdrive.google.com%2Fuc%3Fid%3D1jFBF5QK-V4ym_GuY8TqG0jTR20WKV5Hk%26export%3Ddownload). The entities are described in detail below.
## DB Entities
There are the following tables in use:
- `meta_version`
- `node_key`, `node_key_version`
- `node_operator`, `node_operator_version`
- `staking_module`, `staking_module_version`
Every entity except `meta_version` has its own version table, which contains the non-unique fields. These fields may change depending on the KAPI response. The primary entities (without the `_version` suffix) contain only the unique fields used for identification.
Each primary table contains an auto-incrementing primary key, and its remaining fields carry a unique constraint. These fields are therefore used to look up existing entities, to group and count metrics by slot, and so on.
Each version table references the corresponding primary table entity and a meta entity. The meta entity holds the common state of all entities, which is pulled from KAPI via the `/v1/status` endpoint.
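
The split between a primary table and its version table can be sketched in TypeScript as follows. This is only an in-memory illustration: the column names `moduleId`, `operatorIndex`, and `name`, as well as the `NodeOperatorTables` helper, are assumptions and do not come from the actual schema.

```typescript
// Hypothetical row shapes: only the table names come from the design above.
interface NodeOperatorRow {
  id: number;            // auto-incremented primary key
  moduleId: number;      // assumed unique identifying field
  operatorIndex: number; // assumed unique identifying field
}

interface NodeOperatorVersionRow {
  nodeOperatorId: number; // reference to node_operator
  metaVersionId: number;  // reference to meta_version
  name: string;           // assumed mutable field taken from the KAPI response
}

// In-memory stand-in for the node_operator / node_operator_version pair.
class NodeOperatorTables {
  private primary = new Map<string, NodeOperatorRow>();
  private versions: NodeOperatorVersionRow[] = [];
  private nextId = 1;

  // Reuses the existing primary row when the unique fields match.
  findOrCreate(moduleId: number, operatorIndex: number): NodeOperatorRow {
    const key = `${moduleId}:${operatorIndex}`;
    let row = this.primary.get(key);
    if (row === undefined) {
      row = { id: this.nextId++, moduleId, operatorIndex };
      this.primary.set(key, row);
    }
    return row;
  }

  // Stores the mutable state observed at one meta version.
  addVersion(operator: NodeOperatorRow, metaVersionId: number, name: string): void {
    this.versions.push({ nodeOperatorId: operator.id, metaVersionId, name });
  }

  // Returns every stored version of the given operator.
  versionsOf(operator: NodeOperatorRow): NodeOperatorVersionRow[] {
    return this.versions.filter((v) => v.nodeOperatorId === operator.id);
  }
}
```

Calling `findOrCreate` twice with the same unique fields returns the same primary row, while each `addVersion` call adds a separate row tied to one meta version.
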
### Synchronization mechanism for KAPI
The meta version entity contains a special field named `status`, which is either 0 or 1. When fetching starts, MEVM creates a new entry in the `meta_version` table with the fields `block_hash`, `block_number`, `timestamp`, and `status = 0`. This lets MEVM services recognize whether the data pull has finished. Once it has, MEVM updates the new entry's `status` to 1.
If the data was not completely pulled from KAPI, the corresponding entry is deleted based on its `status = 0`.
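
The status lifecycle described above could look roughly like this. The `MetaVersionStore` class and its method names are hypothetical and stand in for the real `meta_version` table; in a real database the version rows that reference a deleted entry would presumably have to be removed as well, which is also an assumption.

```typescript
type SyncStatus = 0 | 1; // 0 = pull in progress, 1 = pull complete

interface MetaVersionRow {
  id: number;
  blockHash: string;
  blockNumber: number;
  timestamp: number;
  status: SyncStatus;
}

// In-memory stand-in for the meta_version table (hypothetical helper).
class MetaVersionStore {
  private rows: MetaVersionRow[] = [];
  private nextId = 1;

  // Called when fetching starts: the entry is created with status = 0.
  begin(blockHash: string, blockNumber: number, timestamp: number): MetaVersionRow {
    const row: MetaVersionRow = { id: this.nextId++, blockHash, blockNumber, timestamp, status: 0 };
    this.rows.push(row);
    return row;
  }

  // Called once all data has been pulled: the entry is switched to status = 1.
  complete(id: number): void {
    const row = this.rows.find((r) => r.id === id);
    if (row === undefined) throw new Error(`meta_version ${id} not found`);
    row.status = 1;
  }

  // Removes entries left at status = 0 by an interrupted pull.
  // Returns the number of deleted entries.
  deleteIncomplete(): number {
    const before = this.rows.length;
    this.rows = this.rows.filter((r) => r.status === 1);
    return before - this.rows.length;
  }

  // Returns the completed entry with the highest block number, if any.
  latestComplete(): MetaVersionRow | undefined {
    return this.rows
      .filter((r) => r.status === 1)
      .sort((a, b) => b.blockNumber - a.blockNumber)[0];
  }
}
```

Readers of the data would only consult `latestComplete()`, so an entry that is still at `status = 0` is never used for metrics.
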
### Node keys storage optimization
To determine whether a Lido operator proposed a block, MEVM matches the proposer's public key against the known node operators' public keys.
The `node_key` table is used for that purpose. MEVM reads the validator public key from the blockchain and stores it in the `slot_storage` table together with the corresponding slot info. As a result, MEVM does not need to keep all keys for every version of the blockchain state, because it uses each version of the keys only once, for the current matching iteration.
Therefore, we can keep only the newest synchronized version of the keys in the DB. Older versions of the keys can be removed.
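
A minimal sketch of that pruning rule, assuming each key version row references a meta version by id. The type names and the `pruneKeyVersions` function are hypothetical. Rows that belong to a pull still in progress (`status = 0`) are deliberately left untouched here, which is a design assumption rather than something stated above.

```typescript
interface MetaRef {
  id: number;
  blockNumber: number;
  status: 0 | 1; // 1 = fully synchronized
}

interface NodeKeyVersionRef {
  nodeKeyId: number;
  metaVersionId: number;
}

// Drops key versions that belong to synchronized meta versions older than
// the newest synchronized one. Returns the rows that should be kept.
function pruneKeyVersions(metas: MetaRef[], versions: NodeKeyVersionRef[]): NodeKeyVersionRef[] {
  const synced = metas.filter((m) => m.status === 1);
  if (synced.length === 0) return versions; // nothing synchronized yet: keep everything

  const newest = synced.reduce((a, b) => (b.blockNumber > a.blockNumber ? b : a));
  const stale = new Set(synced.filter((m) => m.id !== newest.id).map((m) => m.id));

  return versions.filter((v) => !stale.has(v.metaVersionId));
}
```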
Implement new module KeysApiProvider
--------
Create a new module in the `lido-mev-monitoring/src/keys-api` directory. Implement the following methods:
- `getStatus` - calls `/v1/status` on the Keys API service. The call returns the current status of the KAPI instance.
- `fetchModulesOperatorsKeys` - calls `/v2/modules/operators/keys` on the KAPI service. This method returns a stream with modules, operators and keys.
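
The shape of such a module might be sketched like this. The `HttpTransport` interface is a hypothetical abstraction introduced only so that the example does not depend on a particular HTTP client, and the response types are left as `unknown` and raw string chunks because the schemas are not defined in this document.

```typescript
// Hypothetical HTTP abstraction; the real module may use any HTTP client.
interface HttpTransport {
  getJson(url: string): Promise<unknown>;
  getStream(url: string): AsyncIterable<string>;
}

class KeysApiProvider {
  private readonly baseUrl: string;
  private readonly http: HttpTransport;

  constructor(baseUrl: string, http: HttpTransport) {
    this.baseUrl = baseUrl;
    this.http = http;
  }

  // Returns the current status of the KAPI instance.
  getStatus(): Promise<unknown> {
    return this.http.getJson(`${this.baseUrl}/v1/status`);
  }

  // Returns a stream with modules, operators and keys.
  // Chunks are passed through unparsed in this sketch.
  fetchModulesOperatorsKeys(): AsyncIterable<string> {
    return this.http.getStream(`${this.baseUrl}/v2/modules/operators/keys`);
  }
}
```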
Implement new call `/v2/modules/operators/keys` in KAPI
----------
The endpoint should return a stream while remaining backward compatible with the REST API.
See schemas and details [here](https://linear.app/lidofi/issue/VAL-255/add-endpoint-for-streaming-for-kapi).
Add Keys API delaying alert
-----
Add a new alert to `alerts/mev-monitoring-api.rule.yml`:
```yaml!
- alert: MEVMonitoringKeysAPIErrors
  expr: sum by (instance, project) (increase(lido_mev_monitoring_keys_api_requests_duration_seconds_count{result="error"}[5m])) > 40
  labels:
    severity: warning
    app_team: tooling
  annotations:
    summary: Keys API errors
    description: '{{ $value }} errors in the last 5 minutes appeared in Keys API queries'
```