# Detailed Note on Lighthouse Architecture

## Beacon Chain

## Beacon Processor

## Client

## HTTP API

## Lighthouse Network

## Network

## Operation Pool

## Tests

## Store

This document provides an overview of the `store` crate, its organization, core types, and key traits.

- benches/: Benchmarks for performance testing.
- src/: Primary implementation.
  * lib.rs: Root module. Re-exports and defines core abstractions.
  * database: Low-level database interface.
  * iter: Iterator helpers.
  * hdiff: Hot state diff logic.
  * blob\_sidecar\_list\_from\_root: Helpers for blob sidecar lists.
  * chunked\_iter & chunked\_vector: Chunked iteration utilities.
  * memory\_store: In-memory store implementation.
  * hot\_cold\_store: LevelDB-backed on-disk store (`HotColdDB`).
  * historic\_state\_cache & state\_cache: Caching layers for states.
  * metrics: Prometheus scraping functions.
  * partial\_beacon\_state & reconstruct: Partial-state persistence and reconstruction.
  * config.rs: `StoreConfig` for opening stores.
  * consensus\_context.rs: On-disk consensus context management.
  * errors.rs: `Error` enum for storage operations.
  * metadata.rs: Types like `BlobInfo` and `AnchorInfo`.
  * forwards\_iter.rs & impls.rs: Internal iterator implementations.

## 1. lib.rs

- Imports all the other modules and files.
- Type aliases

```rust
pub type ColumnIter<'a, K> = Box<dyn Iterator<Item = Result<(K, Vec<u8>), Error>> + 'a>;
pub type ColumnKeyIter<'a, K> = Box<dyn Iterator<Item = Result<K, Error>> + 'a>;
pub type RawEntryIter<'a> =
    Result<Box<dyn Iterator<Item = Result<(Vec<u8>, Vec<u8>), Error>> + 'a>, Error>;
```

  * **ColumnIter**: Key–value iterator for a DB column.
  * **ColumnKeyIter**: Key-only iterator.
  * **RawEntryIter**: Byte-level iterator over raw key–value pairs.

### Core Traits

- `KeyValueStore<E: EthSpec>`: Defines basic key/value operations:
  * `get_bytes`, `put_bytes`, `put_bytes_sync`
  * `sync`, `key_exists`, `key_delete`
  * `do_atomically`, `compact`, `compact_column`
  * Iterators: `iter_column`, `iter_column_from`, `iter_column_keys`, etc.
  * Batch deletes and conditional deletes.
- `Key`: Conversion from raw bytes to typed keys. Implemented for `Hash256` and `Vec<u8>`.
- `ItemStore<E: EthSpec>`: Extends `KeyValueStore` with typed item operations: `put`, `put_sync`, `get`, `exists`, `delete` for any `StoreItem`.
- `StoreItem`: Trait for types that can be persisted:
  * `db_column()` returns the storage column.
  * `as_store_bytes()` serializes via SSZ.
  * `from_store_bytes()` deserializes via SSZ.
- Supporting Types
  * `DBColumn` enum: Unique column identifiers (e.g., `BeaconBlock`, `BeaconStateHotDiff`, etc.), with helpers `as_str()`, `as_bytes()`, `key_size()`.
  * `KeyValueStoreOp` & `StoreOp` enums: Operations for atomic batches.
  * Utility functions:
    * `get_key_for_col`, `get_col_from_key`
    * `get_data_column_key`, `parse_data_column_key`

### Tests

Self-contained tests demonstrating:

* In-memory and on-disk store usage.
* `StoreItem` implementation and round-trip correctness.
* Key parsing utilities.

---

## 2. state_cache.rs

- Implements the state cache mechanism in Lighthouse.
- Imports the `LruCache` library along with other types and standard collections.
- Defines multiple structs. `FinalizedState` has two members: the `BeaconState` and `state_root`.
- Lighthouse uses a two-level mapping strategy: the `BlockMap` maps block roots (`Hash256`) to a `SlotMap`, and each `SlotMap` maps a `Slot` to a state root.
- The `StateCache` struct is the primary caching structure for the beacon state.
  - It has members such as `finalized_state`, which stores the most recently finalized state.
  - It uses an LRU cache to store a bounded number of recent states.
  - `block_map` tracks the block-to-slot and slot-to-state-root mappings.
  - It stores the `max_epoch`, the tip of the chain in `head_block_root`, and a `headroom`, whose purpose I am not yet sure of.
- The `HotHDiffBufferCache` struct caches the hierarchical diff buffers for the hot states prior to the finalized state. So for (N-1)
  - It has an `hdiff_buffers` field: an LRU cache keyed by state root that stores each state's slot together with its `HDiffBuffer`.
- The `PutStateOutcome` enum lists the possible outcomes of inserting a state into the cache. It is used in the `StateCache` methods.
- `StateCache` methods
  - `new`: Takes the state capacity, the hdiff capacity, and a headroom, and constructs a new `StateCache` with the provided `states` and `hdiffs`. (From a glance at the `cull` function, I think the headroom is a sort of buffer, which is used to let the ...)
  - `update_finalized_state`: Writes the new finalized state to the cache. It takes the state root, the block root, the `BeaconState`, and the pre-finalization slots to retain as arguments. It checks that the slot is aligned to an epoch boundary and, if so, validates that the slot makes forward progress. Once validated, the finalized state is mapped to the block that produced it, for lookup. The states below the finalized slot are then pruned, and their state roots are added to the `HDiffBuffer`. After that, the finalized state is updated.
  - Helper functions: `len`, `capacity`, `num_hdiff_buffers`, `hdiff_buffer_mem_usage`.
  - `update_head_block_root`: Updates the cache's head block root, so that the head is protected from pruning.
  - `put_state`: Inserts a new state into the cache, returning whether it was treated as finalized, a pre-finalized diff, a duplicate, or newly cached.
  - `cull`: Evicts approximately `count` states using a two-stage, epoch-aware LRU order, returning the list of removed roots (`state_roots_to_delete`).
- `BlockMap` methods:
  - `insert`, `prune`, `delete` and `delete_block_states`.
  - The important function to look at here is `prune`: it takes the block root hash, slot, and state root hash, checks whether `slot >= finalized_slot`, inserts the rest into the `pruned_states` set, and returns `pruned_states`.
- `HotHDiffBufferCache` methods:
  - `new`: Creates a new cache with the specified capacity.
  - `get`: Retrieves an `HDiffBuffer` by `state_root` and returns the buffer.
  - `put`: Insertion logic that preserves important entries:
    - If the cache is not full, inserts the `(state_root, (slot, buffer))` pair.
    - If the cache is full, special handling preserves the "snapshot":
      - Finds the minimum slot currently in the cache.
      - Allows insertion if either:
        1. Capacity > 1 (can keep both the snapshot and the new entry).
        2. Capacity = 1 and the new slot < `min_slot` (the new entry becomes the snapshot).
      - If insertion is allowed:
        - Removes the LRU entry.
        - Inserts the new entry.
      - Returns `false` if insertion is not allowed (protects the snapshot in a single-capacity cache).
  - `cap`, `len`, `mem_usage` functions.

---

## 3. hot_cold_store.rs

- Implements the `HotColdDB`, an on-disk database that maintains a "hot" database for the recently finalized blocks/states/diffs and a cold or "freezer" database that stores the archival states, for bulk storage.

### Key Types

- `HotColdDB`
  - Contains the `hot_db`, `cold_db`, `blobs_db`, `block_cache`, `state_cache`, etc., which together manage the database for storing finalized states.
- `BlockCache`
  - In-memory LRU cache that stores block artifacts: a block cache, a blob cache, and a data column cache.
  - `block_cache` stores recent `SignedBeaconBlock`s keyed by their block root.
  - `blob_cache` maps a block root to its cached `BlobSidecarList`.
  - `data_column_cache`: for blocks that have data columns, maps the block root to a `HashMap` from `ColumnIndex` to an `Arc` of `DataColumnSidecar`.
  - `data_column_custody_info_cache` is optional, and tracks the earliest slot for which data columns are available.
- `BlockCache` methods
  - Similar to the `StateCache` methods, with the addition of `put_blobs`, `put_data_column`, etc.
- `HotColdDBError` enum
- `impl HotColdDB, MemoryStore`: has a constructor `open_ephemeral` that creates a complete hot/cold store backed by `MemoryStore`.
- `impl HotColdDB, BeaconNodeBackend`: opens a new or existing database, given paths to the hot and cold DBs.
- todo

## `impl HotColdDB`: The main API for the `HotColdDB`

- `Storage-Strategy`:
  - The `fn cold_storage_strategy`, `fn hot_diff_start_slot`, and `hot_hdiff_start_slot` functions decide whether a slot is stored as a snapshot or as a diff, or whether a replay is required, for the freezer DB.
  - Returns an error if `hot_hdiff_start_slot` returns a high value, i.e., `u64::MAX`.
- `update_finalized_state`: Checks the anchor slot and the pre-finalization slots whose states should be retained, then updates the LRU cache.
- `put_block`: Stores a block and updates the LRU cache.
- `try_get_full_block`:
- `get_blinded_block`, `get_block_any_variant`, `get_full_block`, `get_block_with`:
- `make_full_block`:
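
The `put` rules noted for `HotHDiffBufferCache` in the state_cache.rs section are easier to follow in code. Below is a minimal, std-only sketch of those rules as written in these notes. It is not the Lighthouse implementation. `ToyHDiffBufferCache` and the `StateRoot`, `Slot`, and `Buffer` aliases are hypothetical stand-ins, a `VecDeque` replaces the real LRU cache, and the handling of a repeated state root is an assumption of mine.

```rust
use std::collections::VecDeque;

// Hypothetical stand-ins for Lighthouse's `Hash256`, `Slot`, and `HDiffBuffer`.
type StateRoot = [u8; 32];
type Slot = u64;
type Buffer = Vec<u8>;

/// Toy LRU keyed by state root. The front of the deque is the least recently used entry.
struct ToyHDiffBufferCache {
    capacity: usize,
    entries: VecDeque<(StateRoot, Slot, Buffer)>,
}

impl ToyHDiffBufferCache {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self { capacity, entries: VecDeque::with_capacity(capacity) }
    }

    /// Returns a copy of the buffer and marks the entry as most recently used.
    fn get(&mut self, root: &StateRoot) -> Option<Buffer> {
        let pos = self.entries.iter().position(|(r, _, _)| r == root)?;
        let entry = self.entries.remove(pos)?;
        let buffer = entry.2.clone();
        self.entries.push_back(entry);
        Some(buffer)
    }

    /// Returns `true` if the entry was stored, `false` if the insert was refused.
    fn put(&mut self, root: StateRoot, slot: Slot, buffer: Buffer) -> bool {
        // Assumption (not from the notes): a repeated root replaces its old entry.
        if let Some(pos) = self.entries.iter().position(|(r, _, _)| *r == root) {
            self.entries.remove(pos);
            self.entries.push_back((root, slot, buffer));
            return true;
        }
        // Cache not full: plain insert.
        if self.entries.len() < self.capacity {
            self.entries.push_back((root, slot, buffer));
            return true;
        }
        // Cache full: find the minimum slot currently held (the "snapshot").
        let min_slot = match self.entries.iter().map(|(_, s, _)| *s).min() {
            Some(s) => s,
            None => return false,
        };
        // Allowed if there is room for both snapshot and new entry,
        // or if the new entry is older and becomes the snapshot itself.
        let allowed = self.capacity > 1 || slot < min_slot;
        if !allowed {
            return false;
        }
        // Evict the least recently used entry, then insert the new one.
        self.entries.pop_front();
        self.entries.push_back((root, slot, buffer));
        true
    }

    fn len(&self) -> usize {
        self.entries.len()
    }
}
```

With a capacity of 1, a `put` at a higher slot than the held entry is refused, while a `put` at a lower slot replaces it. With a capacity of 2 or more, a `put` into a full cache always succeeds and evicts the least recently used entry.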