
Beacon Node DB

PR: https://github.com/sigp/lighthouse/pull/4718

During the EPF, I put together some code that abstracts away the LevelDB-specific database functionality in the beacon node backend. The existing KeyValueStore trait already provided some of this abstraction, but the code still assumed the KV store was always LevelDB.
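To give a rough idea of the shape of that abstraction, here is a simplified sketch (not the exact trait from the Lighthouse codebase; method names and the error type are trimmed down for illustration):

```rust
// Simplified sketch of a key-value store abstraction over (column, key) pairs.
// The real KeyValueStore/ItemStore traits in Lighthouse carry more methods
// (atomic batches, compaction hooks, etc.) and a richer error type.
pub type Error = String; // stand-in error type for the sketch

pub trait KeyValueStore {
    /// Fetch the raw bytes stored under `key` in `column`, if present.
    fn get_bytes(&self, column: &str, key: &[u8]) -> Result<Option<Vec<u8>>, Error>;

    /// Store `value` under `key` in `column`.
    fn put_bytes(&self, column: &str, key: &[u8], value: &[u8]) -> Result<(), Error>;

    /// Delete the entry under `key` in `column`, if present.
    fn key_delete(&self, column: &str, key: &[u8]) -> Result<(), Error>;
}
```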

TLDR: I copied the slasher DB architecture! I created a BeaconNodeBackend type that implements the KeyValueStore and ItemStore traits, then replaced all direct references to LevelDB with the new BeaconNodeBackend type. Within BeaconNodeBackend, a cfg! macro activates the LevelDB- or Redb-specific code paths based on config values.
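The dispatch can look something like the following (a hypothetical sketch reusing the Error alias from the snippet above; the actual PR's enum, feature names, and wrapper types differ):

```rust
// Hypothetical wrapper types standing in for the LevelDB- and Redb-specific
// store implementations; the real ones hold the database handles.
#[cfg(feature = "leveldb")]
pub struct LevelDbStore;
#[cfg(feature = "redb")]
pub struct RedbStore;

/// One backend type exposed to the rest of the beacon node; the concrete
/// database is selected by whichever backend feature/config is enabled.
pub enum BeaconNodeBackend {
    #[cfg(feature = "leveldb")]
    LevelDb(LevelDbStore),
    #[cfg(feature = "redb")]
    Redb(RedbStore),
}

impl BeaconNodeBackend {
    pub fn get_bytes(&self, _column: &str, _key: &[u8]) -> Result<Option<Vec<u8>>, Error> {
        match self {
            #[cfg(feature = "leveldb")]
            BeaconNodeBackend::LevelDb(_db) => {
                // Delegate to the LevelDB-specific read path here.
                Ok(None)
            }
            #[cfg(feature = "redb")]
            BeaconNodeBackend::Redb(_db) => {
                // Delegate to the Redb-specific read path here.
                Ok(None)
            }
        }
    }
}
```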

Recently I was able to add Redb as an alternative database implementation. Redb is an ACID, embedded key-value store implemented in pure Rust, with data stored in copy-on-write B-trees. See the design doc for more info.
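To give a feel for the API, here's a minimal, self-contained read/write example based on redb's documented transaction API (illustrative only, not code from the Lighthouse PR; the table name and keys are made up):

```rust
use redb::{Database, ReadableTable, TableDefinition};

// A table mapping byte-slice keys to byte-slice values, similar in spirit to
// a column in the beacon node store.
const BLOCKS: TableDefinition<&[u8], &[u8]> = TableDefinition::new("blocks");

fn main() -> Result<(), redb::Error> {
    // Opens (or creates) a single-file database.
    let db = Database::create("example.redb")?;

    // All writes happen inside a write transaction (ACID).
    let write_txn = db.begin_write()?;
    {
        let mut table = write_txn.open_table(BLOCKS)?;
        table.insert(b"block_root".as_slice(), b"ssz_bytes".as_slice())?;
    }
    write_txn.commit()?;

    // Reads use a separate read transaction with a consistent snapshot.
    let read_txn = db.begin_read()?;
    let table = read_txn.open_table(BLOCKS)?;
    let value = table.get(b"block_root".as_slice())?;
    assert_eq!(value.unwrap().value(), b"ssz_bytes".as_slice());

    Ok(())
}
```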

I've run both the LevelDB and Redb implementations on mainnet; the results are below.

Metrics Round 1

In this round of metrics gathering I have also included block processing metrics AND disk reads/writes from the process itself. I used macOS's built-in Activity Monitor to track disk reads/writes for the Lighthouse process.

Results

I made two separate runs for each DB implementation. The data for each run can be found below. To summarize:

  • Redb wrote far fewer bytes to disk than LevelDB: roughly 12 GB written per hour for Redb vs 32 GB per hour for LevelDB.
  • Redb's bytes read from disk were inconsistent between the two runs. I'll probably need a third run to confirm the values, but it seems Redb required far fewer disk reads than LevelDB.
  • Block processing DB writes for Redb fluctuated between 25-40 ms vs 20-25 ms for LevelDB.
  • Block processing DB reads seemed slightly faster for Redb on average, especially during read spikes.

Redb

Run #1

Note: my computer fell asleep during Run #1, so some of the data here may be skewed.

Block Processing Metrics: https://snapshots.raintank.io/dashboard/snapshot/pQ0M8E9EOV2w7V3mSsxPSH1auzGXonOL

Database Metrics: https://snapshots.raintank.io/dashboard/snapshot/t4xr2Xia6alpL46P9cW9w3UtzZ8qayx6

| Start Time | End Time | Bytes Written To Disk | Bytes Read From Disk |
| --- | --- | --- | --- |
| 2024-08-02 16:10 GMT+2 | 2024-08-02 23:39 GMT+2 | 90.55 GB | 4.90 GB |

Bytes written per hour: 11.82 GB/hr

Run #2

DB metrics:
https://lhm.sigp.io/dashboard/snapshot/iapHRvhuL5H9ba7FQNiZ4Sr92ZiJXad7

Block processing metrics:
https://snapshots.raintank.io/dashboard/snapshot/RKXtZTa8uKVrYxaE9QJ3GjllevRGdUkI

| Start Time | End Time | Bytes Written To Disk | Bytes Read From Disk |
| --- | --- | --- | --- |
| 2024-09-03 18:12 GMT+2 | 2024-09-03 23:30 GMT+2 | 64.37 GB | 491.0 MB |

Bytes written per hour: 12.12 GB/hr

LevelDB

Run #1

Block processing metrics:
https://snapshots.raintank.io/dashboard/snapshot/XN3zWddfAvoMn0R52WnYZ4u5ZxH9UPQi

Database Metrics: https://snapshots.raintank.io/dashboard/snapshot/6w5E73M0C47zCfeZsrYStYoBAYqFnTkE

| Start Time | End Time | Bytes Written To Disk | Bytes Read From Disk |
| --- | --- | --- | --- |
| 2024-09-03 00:00 GMT+2 | 2024-09-03 09:03 GMT+2 | 296.37 GB | 20.25 GB |

Bytes written per hour: 32.4 GB/hr

Run #2

DB Metrics: https://snapshots.raintank.io/dashboard/snapshot/4olVlEPmopbZ7vqQMAIg46m5Miqwirpf

Block processing metrics:
https://snapshots.raintank.io/dashboard/snapshot/TyIl4iLERJRKohI43xT8g3xPZzubvlxR

| Start Time | End Time | Bytes Written To Disk | Bytes Read From Disk |
| --- | --- | --- | --- |
| 2024-09-03 09:10 GMT+2 | 2024-09-03 18:01 GMT+2 | 284.79 GB | 27.56 GB |

Bytes written per hour: 31.7 GB/hr

Metrics Round 2

In this round I've disabled backfill sync.

Redb

DB Metrics: https://snapshots.raintank.io/dashboard/snapshot/hMwTYy5rQbQcoOVbNjh4cM5qpPNuh28w

Block processing metrics:
https://snapshots.raintank.io/dashboard/snapshot/pAL8KxKXv6Dv4D94CMXjE59Zl2GkGb4C

| Start Time | End Time | Bytes Written To Disk | Bytes Read From Disk |
| --- | --- | --- | --- |
| 2024-10-02 00:46 GMT+2 | 2024-10-02 09:22 GMT+2 | 105.33 GB | 899.2 MB |

Bytes written per hour: 12.24 GB/hr

Next steps

I think we should "stress" test the Redb implementation. The goal would be to understand in which scenarios the database could become corrupted. We'll also want to trigger a DB compaction.
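For the compaction piece, redb exposes a compaction method on the database handle, so something like the following could be used to trigger it manually (a sketch assuming redb's Database::compact API; how Lighthouse ends up wiring this in may differ):

```rust
use redb::Database;

fn compact_db(path: &str) -> Result<(), Box<dyn std::error::Error>> {
    // compact() needs exclusive (mutable) access to the database handle and
    // rewrites the file to reclaim free pages.
    let mut db = Database::create(path)?;
    let compacted = db.compact()?;
    println!("compaction ran, made progress: {compacted}");
    Ok(())
}
```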