# Description
This document outlines a proposal to replace the current `EVMDownloader` and `reorgDetector` with a new component. The new implementation must meet the following requirements:
- Reduce the number of queries to the RPC endpoint.
- Speed up the synchronization process.
- Natively detect and handle chain reorganizations.
- Keep all syncers at approximately the same block height.
- Allow for the dynamic addition of new syncers. For example, if `L1InfoTreeSync` is fully synced, it should be possible to start a new syncer from scratch that takes advantage of the previously downloaded data.
# Architecture
## Current
- For each syncer, there are:
- One Database for the Processor
- One instance of a generic `Driver`
- One instance of a generic `Downloader`
- One instance of a syncer-specific `Processor`
- Syncers operating on the same network share an instance of `ReorgDetector`.
## Current: Component Diagram
```mermaid
classDiagram
L1InfoTreeSync --> Driver
L1InfoTreeSync --> Processor
Driver --> Processor
Driver --> Downloader
Driver --> ReorgDetector
Processor --> DB_L1InfoTree
ReorgDetector --> DB_reorg
Downloader --> EthClient
```
## Current: Sequence Diagram
This is a simplification of the flow of a syncer:
```mermaid
sequenceDiagram
participant Run.go
participant L1InfoTreeSync
participant Driver
participant Processor
participant Downloader
participant RPC as EthRPC
participant RD as ReorgDetector
Run.go ->> L1InfoTreeSync: Start(ctx)
L1InfoTreeSync ->> Driver: Sync(ctx)
Driver ->> Processor: GetLastProcessedBlock()
rect rgb(250,250,200)
loop goroutine
Driver ->> Downloader: Download(ctx, bn, ch)
Downloader ->> RPC: GetNewBlock()
Downloader ->> RPC: GetLastFinalizedBlock()
Downloader ->> RPC: FilterLogs(addr, from_bn, to_bn)
Downloader ->> Driver: ch(Blocks)
end
end
Downloader ->> Driver: ch(Blocks)
Driver ->> RD: AddBlockToTrack(bn, block_hash)
```
## Current: Sequence Diagram (reorgs)
**ReorgDetector** detects reorgs and sends a message to every subscribed syncer. It waits until the syncer has handled the reorg, then proceeds internally to delete the affected data from the **ReorgDetector Database**.
```mermaid
sequenceDiagram
participant Downloader
participant Processor
participant Driver
participant RD as ReorgDetector
Driver ->> RD: Subscribe(id)
Driver ->> Driver: Listen(chan reorgBlock)
RD ->> Driver: chan reorgBlock(blockNumber)
rect rgb(250,250,200)
Driver ->> Driver: handleReorg
Driver ->> Processor: Reorg(blockNumber)
Processor ->> Processor: delete affected blocks
Driver ->> RD: chan Processed(true)
end
```
# Proposal Architecture
The idea is to divide the implementation into 2 stages:
## Stage 1: MultiDownloader acts as eth-client and reorg_detector
- The MultiDownloader will have its own database with all events stored per synced contract.
- This will reduce the number of RPC calls.
### Implementation:
- The code for Downloader, Driver, etc., will remain as-is (minimal changes).
- The `FilterLogs` call will become a blocking call until the MultiDownloader reaches the requested block.
### Reason:
- It maintains high stability by reducing the number of changes.
- It reduces the implementation time.
### Pros:
- Increases syncing speed in general, especially for resyncing.
# New Component Diagram
- The instance of the MultiDownloader will be shared by all syncers on the same network (currently 2 instances: one for L1 and another for L2).
```mermaid
classDiagram
L1InfoTreeSync --> Driver
L1InfoTreeSync --> Processor
Driver --> Processor
Driver --> Downloader
Driver --> MultiDownloader : acting as ReorgDetector
Processor --> DB_L1InfoTree
Downloader --> MultiDownloader : acting as EthClient
MultiDownloader --> MultiDownloader_Database
```
**MultiDownloader**:
- It will sync all events of required contracts.
- There are going to be two MultiDownloader modes:
- **Safe** (safe zone): `block < Finalized`
- **Unsafe** (unsafe zone): `block >= Finalized`
#### MultiDownloaderSafe:
- It will merge all syncer configs (contractAddrs, startBlockNumber) to make one query to the RPC per block range.
- It will support adding a new address: it will sync missed data and then start syncing as usual.
- Sync in parallel.
- **Future**: Allow modification of the starting block.
- **Future**: Store the `debugCall` output (we think we are going to remove this way of syncing because the contract event will provide the full information).
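The config-merging step above can be sketched as follows. This is a minimal illustration, assuming hypothetical names (`SyncerConfig`, `MergedQuery`, `MergeConfigs`) that are not part of the current codebase: the MultiDownloaderSafe deduplicates contract addresses and takes the lowest start block, so a single `eth_getLogs` query per block range serves every registered syncer.

```go
package main

import "fmt"

// SyncerConfig is a hypothetical, simplified view of what each syncer
// passes to Register (contract addresses plus its start block).
type SyncerConfig struct {
	Contracts  []string
	StartBlock uint64
}

// MergedQuery is the single query the MultiDownloaderSafe would issue
// per block range, covering all registered syncers at once.
type MergedQuery struct {
	Addresses []string
	FromBlock uint64
}

// MergeConfigs deduplicates addresses and keeps the lowest start block.
func MergeConfigs(cfgs []SyncerConfig) MergedQuery {
	seen := map[string]bool{}
	q := MergedQuery{FromBlock: ^uint64(0)} // start at max uint64
	for _, c := range cfgs {
		if c.StartBlock < q.FromBlock {
			q.FromBlock = c.StartBlock
		}
		for _, addr := range c.Contracts {
			if !seen[addr] {
				seen[addr] = true
				q.Addresses = append(q.Addresses, addr)
			}
		}
	}
	return q
}

func main() {
	q := MergeConfigs([]SyncerConfig{
		{Contracts: []string{"c1", "c2"}, StartBlock: 100},
		{Contracts: []string{"c1", "c4"}, StartBlock: 250},
	})
	fmt.Println(q.FromBlock, q.Addresses) // 100 [c1 c2 c4]
}
```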
#### MultiDownloaderUnsafe:
- It will sync all blocks (one by one) between `finalized` <-> `latest`, storing:
- `Blockhash`
- `previousHash`
- `logBloom`
- It will query `eth_getLogs` for blocks that the `logBloom` indicates could have relevant events. The query will use the `blockHash` instead of the `blockNumber` to be sure that it is the right block ([eth_getLogs](https://www.quicknode.com/docs/ethereum/eth_getLogs)).
- If the RPC responds that the block is unknown, it means there is a reorg.

- When the finalized block advances, all these blocks/events are moved to the *MultiDownloader_Safe* store:
- It will sanity-check `Block-1` just to guarantee that the entire chain is valid.
- It will delete empty blocks (no events), because it doesn't make sense to keep them in the historical part.
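The parent-hash check in the unsafe zone can be sketched as below. This is an illustrative sketch, assuming hypothetical names (`UnsafeBlock`, `FindReorg`): walk the stored unsafe blocks oldest-to-newest and find the first block whose `previousHash` no longer matches the stored hash of `Block-1`, which marks the start of the range to re-sync.

```go
package main

import "fmt"

// UnsafeBlock holds the minimal per-block data the unsafe store keeps
// (hash and parent hash; logBloom is omitted in this sketch).
type UnsafeBlock struct {
	Number       uint64
	Hash         string
	PreviousHash string
}

// FindReorg returns the first block number whose parent link is broken,
// i.e. the start of the range that must be re-synced; ok=false means
// the stored chain is fully consistent.
func FindReorg(chain []UnsafeBlock) (uint64, bool) {
	for i := 1; i < len(chain); i++ {
		if chain[i].PreviousHash != chain[i-1].Hash {
			return chain[i].Number, true
		}
	}
	return 0, false
}

func main() {
	chain := []UnsafeBlock{
		{Number: 10, Hash: "0xaa"},
		{Number: 11, Hash: "0xbb", PreviousHash: "0xaa"},
		{Number: 12, Hash: "0xcc", PreviousHash: "0xZZ"}, // broken link
	}
	if n, ok := FindReorg(chain); ok {
		fmt.Println("reorg from block", n) // reorg from block 12
	}
}
```

The same walk doubles as the sanity check performed when blocks are promoted to the safe store.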
# Database
## Tables related to runtime
| Table | Description |
| ----|--|
| sync_status | Status of synchronization per contract: block range synced, expected, ... |
Example:
```
CREATE TABLE sync_status (
contract_address TEXT NOT NULL, -- Contract address
target_from_block BIGINT NOT NULL, -- Desired from block
target_to_block TEXT NOT NULL, -- Desired to block (a number or a tag such as Finalized/Latest)
synced_from_block BIGINT NOT NULL, -- Current synced from block
synced_to_block BIGINT NOT NULL, -- Current synced to block
syncers_id TEXT NOT NULL, -- Syncer identifier
PRIMARY KEY (contract_address)
);
```
## Tables related to data
| Table | Description |
| ----|--|
| blocks | All synced blocks. |
| block_unsafe | Data related to unsafe blocks (e.g., block_parent_hash). |
| logs | Events related to a block. |
## Tables related to reorgs
| Table | Description |
| ----|--|
| reorg_block_chain | An ID for this reorg and all reorged blocks. |
NOTE: Maybe it will require an extra table to keep track of reorgs (e.g., `reorg_timestamp`, etc.).
```mermaid
erDiagram
block ||--o| block_unsafe : unsafe
logs ||--o{ block: block
sync_status
reorg_block_chain
```
# Creational Flow
Each syncer must register with the MultiDownloader with its required contracts, topics, and block range.
```mermaid
sequenceDiagram
run.go ->> L1InfoTreeSync: New()
L1InfoTreeSync ->> MultiDownloader: Register([c1,c2], [t1,t2], 100-Finalized)
run.go ->> L1BridgeSyncer: New()
L1BridgeSyncer ->> MultiDownloader: Register([c1,c4], [t3,t4], 100-Latest)
run.go ->> MultiDownloader: Start()
```
This allows for dynamically changing the running components:
- The configuration of a syncer is persisted, so even if a previously run syncer is not launched in the current run, the MultiDownloader keeps storing its data.
### Flow
- When each syncer is built, it adds the contracts, topics, and `targetBlock` to the `MultiDownloader`: `multiDownloader.Register(name, contracts, topics, targetBlock)`
- It's started: `multiDownloader.Start()`
- The `MultiDownloader` starts syncing blocks.
- Each syncer's call to `FilterLogs(...)` will be blocked until this block range is synced.
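The blocking behaviour of `FilterLogs` can be sketched with a condition variable. This is a minimal sketch under assumed names (`MultiDownloader`, `Advance`, `WaitForBlock` are illustrative, not the real API): the download loop advances a watermark, and any syncer waiting on a block range past the watermark sleeps until it is reached.

```go
package main

import (
	"fmt"
	"sync"
)

// MultiDownloader here only models the sync watermark, not the storage.
type MultiDownloader struct {
	mu       sync.Mutex
	cond     *sync.Cond
	syncedTo uint64
}

func NewMultiDownloader() *MultiDownloader {
	md := &MultiDownloader{}
	md.cond = sync.NewCond(&md.mu)
	return md
}

// Advance is called by the download loop after persisting a block range.
func (md *MultiDownloader) Advance(to uint64) {
	md.mu.Lock()
	md.syncedTo = to
	md.mu.Unlock()
	md.cond.Broadcast()
}

// WaitForBlock blocks the caller (a syncer's FilterLogs) until the
// requested toBlock has been synced and stored locally.
func (md *MultiDownloader) WaitForBlock(toBlock uint64) {
	md.mu.Lock()
	defer md.mu.Unlock()
	for md.syncedTo < toBlock {
		md.cond.Wait()
	}
}

func main() {
	md := NewMultiDownloader()
	done := make(chan struct{})
	go func() {
		md.WaitForBlock(200) // blocks until the downloader reaches 200
		close(done)
	}()
	md.Advance(100)
	md.Advance(200)
	<-done
	fmt.Println("logs for range up to block 200 are now available")
}
```

A real implementation would return the stored logs once unblocked; the sketch only shows the waiting mechanism.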
### Thoughts
- Is the database performance sufficient?
- Filter by topic?
## How to propagate a reorg detection
When the **MultiDownloader** detects a reorg, it will execute the following actions:
- Move all affected blocks to the `reorged_blocks` table with `block_number`, `hash`, and `reorg_id`.
- Resync these blocks.
How the reorg is detected from the syncer's point of view:
### The syncer is running when the reorg is detected
- After getting some logs (and blocks), and before committing the DB tx, verify that the previous block is still valid:
```go
tx := db.NewTx()
lastBlockNumber, lastBlockHash := getLastBlock(tx)
logs := multiDownloader.GetLogs(tx, fromBlock, toBlock)
// Note: `range` is a reserved word in Go, so the returned range is
// named blockRange here.
reorged, blockRange := multiDownloader.CheckBlock(lastBlockNumber, lastBlockHash)
if reorged {
	tx.Rollback()
	deleteAllBlockRange(blockRange)
	resyncBlockRange(blockRange)
	rebuildAnyInternalCache()
	return
}
saveLogsToDB(tx, logs)
tx.Commit()
```
- The **multiDownloader** will return an empty log at the end of `getLastBlock` if we are in the unsafe zone, to keep track of the sequence.
### The syncer is not running
- On the next start, it will verify the last synced block:
- `multiDownloader.CheckBlock(blockNumber, blockHash)`
- If the block is affected, it will return the block range to delete, and the syncer will act as in the previous example.
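This startup check can be sketched as follows, with hypothetical types (`checker`, `BlockRange`) standing in for the real MultiDownloader state: the syncer hands over its last stored (number, hash) pair, and a mismatch yields the range it must delete and re-sync.

```go
package main

import "fmt"

// BlockRange is the range the syncer must delete and re-sync.
type BlockRange struct{ From, To uint64 }

// checker is a stand-in for the MultiDownloader's view of the chain.
type checker struct {
	// canonical maps block number -> hash as currently known.
	canonical map[uint64]string
	latest    uint64
}

// CheckBlock returns reorged=true plus the affected range when the
// stored (number, hash) pair no longer matches the canonical chain.
func (c *checker) CheckBlock(number uint64, hash string) (bool, BlockRange) {
	if c.canonical[number] == hash {
		return false, BlockRange{}
	}
	return true, BlockRange{From: number, To: c.latest}
}

func main() {
	c := &checker{canonical: map[uint64]string{42: "0xabc"}, latest: 50}
	reorged, rng := c.CheckBlock(42, "0xdef") // stored hash diverged
	fmt.Println(reorged, rng)                 // true {42 50}
}
```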
## Stage 2: Remove downloader and use MultiDownloader directly
In this stage, the idea is to remove the legacy `Downloader` and provide the tracking in a centralized way.
**TBD**
# Use Cases
## Use Case 1: Historical sync
### Use Case 1.1: Start all syncers.
### Use Case 1.2: Add SyncA, start MultiDownloader, wait a while, and add SyncB.
- This means that SyncB needs to catch up on data until it reaches SyncA.
### Use Case 1.3: Start SyncA, stop, add a new contract, and start again.
### Use Case 1.4: A syncer disappears. How to purge relevant data?
- The first case is that a `contract_addr` disappears.
- The second case is that the block range is reduced: just remove logs/blocks related to the range (in the example: [1234-1999]).
- Syncer to delete: addr1, from_block = 1234
- Another syncer: addr1, from_block = 2000
### Use Case 1.5: Start SyncA (addr1, from_block=2000), restart reducing from_block.
- Start a sync process (addr1, addr2, from_block=2000):
- insert `contracts_config` (addr1, from_block=2000)
- insert `contracts_config` (addr2, from_block=2000)
- sync [2000 - 2100]
- insert `sync_segments` (addr1, [2000 - 2100])
- insert `sync_segments` (addr2, [2000 - 2100])
- stop
- Start a sync process (addr1, from_block=1000):
- update `contracts_config` (addr1, from_block=1000)
- sync [1000 - 1999] for addr1
- update `sync_segments` (addr1, [1000 - 2100])
- sync [2100 - 2200] for addr1, addr2
- update `sync_segments` (addr1, [1000 - 2200])
- update `sync_segments` (addr2, [2000 - 2200])
NOTE: Must logs be related to a sync_id? I don't see any reason.
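The `sync_segments` updates in Use Case 1.5 amount to merging adjacent or overlapping block ranges. A minimal sketch, with hypothetical names (`Segment`, `Merge`) chosen for illustration:

```go
package main

import "fmt"

// Segment mirrors one row of the sync_segments table.
type Segment struct{ From, To uint64 }

// Merge returns the union of two segments when they touch or overlap
// (e.g. [1000-1999] + [2000-2100] -> [1000-2100]); ok=false means they
// are disjoint and must stay as separate rows.
func Merge(a, b Segment) (Segment, bool) {
	if b.From > a.To+1 || a.From > b.To+1 {
		return Segment{}, false // disjoint: keep two rows
	}
	s := a
	if b.From < s.From {
		s.From = b.From
	}
	if b.To > s.To {
		s.To = b.To
	}
	return s, true
}

func main() {
	// addr1 already has [2000-2100]; a resync adds [1000-1999].
	s, _ := Merge(Segment{2000, 2100}, Segment{1000, 1999})
	fmt.Println(s) // {1000 2100}
}
```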
# Use Case 2: Tip sync
### Use Case 2.1: Happy path, no reorg.
### Use Case 2.2: Detect parentHash mismatch.
### Use Case 2.3: Detect logFilter unknown blockHash.
# Implementation estimation
**TBD. The following is just an example:**
| Task | Duration | Description |
|------|----------|-------------|
| 1 | 3d | Implement Safe syncer
| 2 | 3d | Unsafe syncer
| 3 | 3d | Adapt syncer to new interface
| 4 | 3d | Implement **MultiDownloader** reorg
| 5 | 3d | Adapt syncer to new reorgs
| 6 | 1d | Adapt bridge service to new reorg
# References
- [Toni's Doc](https://hackmd.io/w-767L32S7-ivVktTYExwQ)
- [Arnau's Doc](https://hackmd.io/0VyDMY-HTyikm67jtrZoHA)