Problem
=======
The main goal of RM (relay monitoring) is to index all bids from every relay, slot by slot.
When one or more relays are unavailable, the RM job fails. The next job execution may fail again for the same reason.
As a result, mev-monitoring stops consuming data.

Solution
========
fetch-bids (every 5 sec):
---------
1. create a request record for each slot and relay with status = 0
2. processing: on success, set status = 1 and write the bids to the table with request_id; otherwise, set status = 2
3. select all relays with failed or unprocessed slots whose lag is at most 120 slots, and repeat processing (step 2) for them
```sql!
select request_id, pubkey, url, slot_number
from relay_request_status rs
where rs.status in (0, 2)
and slot_number - lastIndexedSlotInDB > 120
```
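
The three fetch-bids steps can be sketched in memory as follows. This is only an illustration of the status flow, not the actual implementation: the `Store` and `Request` classes, the `fetch` callback, and the choice to measure lag against a `current_slot` argument are all assumptions made for the example. A relay that raises an error leaves its request in status 2, so it is returned by `retry_candidates` on the next run instead of failing the whole job.

```python
from dataclasses import dataclass, field
from typing import Callable

PENDING, OK, FAILED = 0, 1, 2  # status values used in the steps above
MAX_LAG_SLOTS = 120            # retry window from step 3


@dataclass
class Request:  # hypothetical in-memory stand-in for a relay_request_status row
    request_id: int
    relay: str
    slot: int
    status: int = PENDING


@dataclass
class Store:  # hypothetical container, not part of the original design
    requests: list = field(default_factory=list)
    bids: list = field(default_factory=list)  # (request_id, value) pairs

    def create_requests(self, slot: int, relays: list) -> list:
        """Step 1: one PENDING record per (slot, relay)."""
        created = []
        for relay in relays:
            req = Request(len(self.requests) + 1, relay, slot)
            self.requests.append(req)
            created.append(req)
        return created

    def process(self, req: Request, fetch: Callable[[str, int], list]) -> None:
        """Step 2: store bids and mark OK, or mark FAILED if the relay errors."""
        try:
            values = fetch(req.relay, req.slot)
        except Exception:
            req.status = FAILED
            return
        self.bids.extend((req.request_id, v) for v in values)
        req.status = OK

    def retry_candidates(self, current_slot: int) -> list:
        """Step 3: PENDING or FAILED requests still inside the lag window.

        Assumption: lag is measured as current_slot - slot.
        """
        return [
            r for r in self.requests
            if r.status in (PENDING, FAILED)
            and current_slot - r.slot <= MAX_LAG_SLOTS
        ]
```
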
indexing-bids (every 12 sec):
----------
1. index all slots that are no older than 120 slots and whose relays are all healthy
```sql!
insert into bids_range_storage (slot, min_value, max_value, median_value, count)
select
rb.slot_number,
MIN(value) AS min_value,
MAX(value) AS max_value,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY value) AS median_value,
count(*) as bid_count
from relay_request_status rs
inner join slot_relay_bids rb
-- join on request_id so each bid is counted once, not once per relay request of the slot
on rb.request_id = rs.request_id
-- do not index slots with unhealthy relays
where rs.slot_number not in (
select rs.slot_number
from relay_request_status rs
where rs.status in (0, 2)
and rs.slot_number - lastIndexedSlotInDB > 120
) and rb.slot_number - lastIndexedSlotInDB > 120
group by rb.slot_number
```
2. get last indexed slot in db:
```sql!
select min(slot_number) as lastIndexedSlotInDB
from relay_request_status rs
where rs.slot_number not in (
select rs.slot_number
from relay_request_status rs
where rs.status not in (0, 2)
)
```
3. exclude requests to unhealthy relays from aggregation by marking them with status = 3:
```sql!
update relay_request_status
set status = 3
where slot_number - lastIndexedSlotInDB > 80
and status in (0, 2)
```
Example: with current slot cur = 1000 and last indexed slot last = 800, slots are processed starting from forprocess = 1000 - 120 = 880.

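
The aggregation rule of indexing-bids (a slot is indexed only when none of its requests is still unprocessed or failed) can be illustrated with a small in-memory sketch. The function name `aggregate_bids` and the tuple layouts are assumptions made for this example, and the 120-slot window filter is left out for brevity. Requests already marked with status 3 no longer block their slot. `statistics.median` averages the two middle values for an even count, which matches `PERCENTILE_CONT(0.5)`.

```python
from statistics import median

UNHEALTHY = {0, 2}  # unprocessed or failed; status 3 (skipped) does not block


def aggregate_bids(requests, bids):
    """Aggregate bid values per slot, skipping slots with unhealthy requests.

    requests: list of (request_id, slot, status) tuples (hypothetical layout)
    bids:     list of (request_id, value) tuples (hypothetical layout)
    """
    blocked = {slot for _, slot, status in requests if status in UNHEALTHY}
    slot_of = {rid: slot for rid, slot, _ in requests}

    by_slot = {}
    for rid, value in bids:
        slot = slot_of[rid]
        if slot not in blocked:
            by_slot.setdefault(slot, []).append(value)

    return {
        slot: {
            "min": min(values),
            "max": max(values),
            "median": median(values),
            "count": len(values),
        }
        for slot, values in by_slot.items()
    }
```
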
purge-bids (every 1 min):
----------
1. remove all failed or unprocessed requests whose bids were placed more than 200 slots ago (200 * 12 sec = 2400 sec = 40 min)
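
A minimal sketch of the purge step, using an in-memory SQLite database only so the example is self-contained. The function name `purge_stale_requests`, the `current_slot` parameter, and the reduced three-column table are assumptions for illustration. Only statuses 0 and 2 are purged, as described above.

```python
import sqlite3

SLOT_SECONDS = 12        # slot duration used in the text
PURGE_AFTER_SLOTS = 200  # 200 * 12 sec = 2400 sec = 40 min


def purge_stale_requests(conn: sqlite3.Connection, current_slot: int) -> int:
    """Delete failed or unprocessed requests older than the purge window.

    Returns the number of deleted rows.
    """
    cutoff = current_slot - PURGE_AFTER_SLOTS
    cur = conn.execute(
        "delete from relay_request_status "
        "where status in (0, 2) and slot_number < ?",
        (cutoff,),
    )
    conn.commit()
    return cur.rowcount


# Usage with a simplified table (assumed columns, for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table relay_request_status ("
    "request_id integer primary key, slot_number integer, status integer)"
)
conn.executemany(
    "insert into relay_request_status (slot_number, status) values (?, ?)",
    [(700, 2), (799, 0), (800, 2), (750, 1), (990, 0)],
)
removed = purge_stale_requests(conn, current_slot=1000)
```

With `current_slot=1000` the cutoff is slot 800, so only the rows for slots 700 and 799 are deleted; the successful request at slot 750 and the request exactly 200 slots old are kept.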