Lighthouse Threading Architecture

# Lighthouse Threading Architecture > A comprehensive guide to the execution model, work flows, and shared state in the Lighthouse Ethereum consensus client. > > **Target audience:** Developers onboarding to the Lighthouse codebase. > **Source:** `beacon_node/`, `common/task_executor/`, `beacon_node/beacon_processor/` --- ## Table of Contents 1. [Tier Overview](#1-tier-overview) 2. [Tier 1 — Tokio Async Executor](#2-tier-1--tokio-async-executor) 3. [Tier 2 — Tokio Blocking Thread Pool](#3-tier-2--tokio-blocking-thread-pool) 4. [Tier 3 — Rayon Thread Pools](#4-tier-3--rayon-thread-pools) 5. [Tier Trigger Relationships](#5-tier-trigger-relationships) 6. [Flow: Gossip Block Import](#6-flow-gossip-block-import) 7. [Flow: Gossip Attestation (BLS Batching)](#7-flow-gossip-attestation-bls-batching) 8. [Flow: P2P RPC Request Serving](#8-flow-p2p-rpc-request-serving) 9. [Flow: RPC Block Import (Sync)](#9-flow-rpc-block-import-sync) 10. [Shared Objects Across Tiers](#10-shared-objects-across-tiers) 11. [Controllers & Orchestrators](#11-controllers--orchestrators) 12. [BeaconProcessor Queue Priority](#12-beaconprocessor-queue-priority) --- ## 1. Tier Overview Lighthouse uses three distinct execution tiers. No tier has its own dedicated OS thread — all tiers share a common Rust async runtime (`tokio`) and purpose-built thread pools. ```mermaid graph TB subgraph T1["⚡ Tier 1 — Tokio Async Executor (N OS threads, N ≈ CPU count)"] direction LR NET["🌐 network task\nlibp2p Swarm"] RTR["🔀 router task"] MGR["📋 BeaconProcessor\nmanager task"] SYN["🔄 sync manager"] API["🌐 HTTP API\naxum server"] WRK["⚙️ async workers\n(GossipBlock, RpcBlock,\nChainSegment, ...)"] NET --> RTR --> MGR MGR --> WRK end subgraph T2["🔒 Tier 2 — Tokio Blocking Pool (max = CPU count)"] direction LR BLS["🔐 BLS verification\n(attestations, aggregates)"] SLH["⚠️ slashings / exits"] SRV["📤 blob/column serving"] BCK["📦 backfill\n(rate limit OFF)"] end subgraph T3["🧵 Tier 3 — Rayon Pools"] direction LR LP["🐢 LowPriority\n~25% CPUs\nBackfill sync"] HP["🚀 HighPriority\n~80% CPUs\nColumn reconstruction"] end MGR -->|spawn_blocking| T2 MGR -->|spawn_blocking_with_rayon\nLowPriority| LP WRK -->|spawn_blocking_with_rayon_async\nHighPriority .await| HP classDef t1 fill:#87CEEB,stroke:#1a6691,stroke-width:2px,color:darkblue classDef t2 fill:#90EE90,stroke:#2d6a2d,stroke-width:2px,color:darkgreen classDef t3 fill:#E6E6FA,stroke:#5555aa,stroke-width:2px,color:#333366 class NET,RTR,MGR,SYN,API,WRK t1 class BLS,SLH,SRV,BCK t2 class LP,HP t3 ``` --- ## 2. Tier 1 — Tokio Async Executor Tier 1 is the tokio multi-threaded runtime. It runs `Future`s cooperatively — tasks yield at `.await` points, freeing the OS thread for other tasks. **Nothing here may block.** ### Permanent Tasks (always alive) | Task name | Struct | Owns | |---|---|---| | `"network"` | `NetworkService<T>` | libp2p `Swarm` — the only owner of TCP/QUIC sockets | | `"router"` | `Router<T>` | Routes `RouterMessage` from network to beacon processor / sync | | `"beacon_processor_manager"` | `BeaconProcessor<E>` | All 30+ work queues, worker slot counter | | `"sync_manager"` | `SyncManager` | Range sync, backfill sync state machines | | HTTP API | axum server | Serves `/eth/...` endpoints | | `"reprocess_scheduler"` | internal | Holds delayed work items until their slot arrives | ### Short-lived Async Workers (spawned per work item) Spawned by the BeaconProcessor manager via `task_spawner.spawn_async()`. Each consumes one worker slot. They run on the tokio executor (not a blocking thread) because they need to `.await` — e.g. waiting for the Execution Layer over the Engine API. | Work type | Spawned for | |---|---| | `GossipBlock` | Gossip block verification + import | | `GossipBlobSidecar` | Gossip blob sidecar | | `GossipDataColumnSidecar` | PeerDAS data column via gossip | | `RpcBlock` | Block received from sync peer | | `RpcBlobs` / `RpcCustodyColumn` | Blob/column from sync peer | | `ChainSegment` | Batch of blocks from range sync | | `DelayedImportBlock` | Block re-queued from reprocess scheduler | | `BlocksByRangeRequest` / `BlocksByRootsRequest` | Serving a peer's block request | | `ColumnReconstruction` | PeerDAS reconstruction trigger | ### Key source files - [`common/task_executor/src/lib.rs`](../common/task_executor/src/lib.rs) — `TaskExecutor`: `spawn()`, `spawn_blocking()`, `spawn_blocking_with_rayon()` - [`beacon_node/network/src/service.rs`](../beacon_node/network/src/service.rs) — `NetworkService::spawn_service()` - [`beacon_node/network/src/router.rs`](../beacon_node/network/src/router.rs) — `Router::spawn()` - [`beacon_node/beacon_processor/src/lib.rs`](../beacon_node/beacon_processor/src/lib.rs) — `BeaconProcessor::spawn_manager()` --- ## 3. Tier 2 — Tokio Blocking Thread Pool Tokio's blocking pool (`spawn_blocking`) provides OS threads for CPU-bound synchronous work that must not block the async executor. Capped at CPU count. **Triggered exclusively by:** Tier 1 BeaconProcessor manager via `task_spawner.spawn_blocking()`. ### What runs in Tier 2 | Work type | Queue type | Max queue size | |---|---|---| | `GossipAttestation` / `GossipAttestationBatch` | LIFO | `active_validators / slots_per_epoch` | | `GossipAggregate` / `GossipAggregateBatch` | LIFO | 4096 | | `GossipVoluntaryExit` | FIFO | 4096 | | `GossipProposerSlashing` / `GossipAttesterSlashing` | FIFO | 4096 | | `GossipSyncSignature` / `GossipSyncContribution` | LIFO | 2048 / 1024 | | `BlobsByRangeRequest` / `BlobsByRootsRequest` | FIFO | 1024 | | `DataColumnsByRootsRequest` / `DataColumnsByRangeRequest` | FIFO | 1024 | | `LightClientBootstrapRequest` + LC updates | FIFO | 512–1024 | | `ApiRequestP0` / `ApiRequestP1` (blocking variant) | FIFO | 1024 | | `ChainSegmentBackfill` (rate limiting **OFF**) | FIFO | 64 | | `Status` | FIFO | 1024 | ### LIFO vs FIFO queue choice - **LIFO** (attestations, aggregates, sync messages): freshest items are most valuable. When overloaded, process newest and drop stale. - **FIFO** (slashings, exits, blocks): ordered processing prevents censoring and preserves safety. ### Worker lifecycle (RAII pattern) ```mermaid flowchart LR MGR["📋 Manager\n(Tier 1)"] -->|spawn_blocking closure| OS["🔒 OS Thread\n(Tier 2)"] OS -->|runs closure| WORK["⚙️ CPU work\n(BLS verify, DB read...)"] WORK -->|completes or panics| DROP["💧 SendOnDrop\n(Rust Drop trait)"] DROP -->|fires idle_tx| MGR classDef tier1 fill:#87CEEB,stroke:#1a6691,stroke-width:2px,color:darkblue classDef tier2 fill:#90EE90,stroke:#2d6a2d,stroke-width:2px,color:darkgreen classDef raii fill:#FFD700,stroke:#997700,stroke-width:2px,color:black class MGR tier1 class OS,WORK tier2 class DROP raii ``` `SendOnDrop` is a RAII guard: it holds the `idle_tx` channel sender and sends a `WorkType` message when dropped — even on panic. This guarantees worker slots are always freed. --- ## 4. Tier 3 — Rayon Thread Pools Rayon provides data-parallel thread pools. Lighthouse creates **two named pools**, never uses the global rayon pool (which would cause CPU oversubscription). ```mermaid graph TB subgraph Provider["RayonPoolProvider (inside TaskExecutor)"] LP["🐢 LowPriority Pool\n~25% of CPUs\nmin 1 thread"] HP["🚀 HighPriority Pool\n~80% of CPUs\nmin 1 thread"] end MGR["📋 BeaconProcessor manager\n(Tier 1)"] -->|"spawn_blocking_with_rayon\n(LowPriority)\nwrapped inside Tier 2 slot"| LP WRK["⚙️ Async worker\n(Tier 1)"] -->|"spawn_blocking_with_rayon_async\n(HighPriority) .await\nbypasses Tier 2"| HP LP -->|"process_chain_segment_backfill()\nhistorical block import"| BCK["📦 Backfill Sync\n(rate limiting ON)"] HP -->|"data_availability_checker\n.reconstruct_data_columns()"| REC["🔬 PeerDAS Column\nReconstruction"] classDef t1 fill:#87CEEB,stroke:#1a6691,stroke-width:2px,color:darkblue classDef t3lo fill:#E6E6FA,stroke:#5555aa,stroke-width:2px,color:#333366 classDef t3hi fill:#FFD700,stroke:#997700,stroke-width:2px,color:black classDef work fill:#f0f0f0,stroke:#666,stroke-width:1px,color:#333 class MGR,WRK t1 class LP,BCK t3lo class HP,REC t3hi ``` | Pool | Trigger path | Consumes Tier 2 slot? | Used for | |---|---|---|---| | LowPriority (~25% CPUs) | Manager → `spawn_blocking_with_rayon()` | Yes (installs rayon inside a blocking task) | `ChainSegmentBackfill` when rate limiting ON | | HighPriority (~80% CPUs) | Async worker → `spawn_blocking_with_rayon_async().await` | No (direct rayon spawn + oneshot channel) | `reconstruct_data_columns()` (PeerDAS) | --- ## 5. Tier Trigger Relationships ```mermaid flowchart TD OS_KERNEL["🐧 OS Kernel\nepoll / kqueue"] TOKIO["⚡ Tokio Reactor\nasync I/O readiness"] T1["⚡ TIER 1\nTokio Async Executor"] T2["🔒 TIER 2\nTokio Blocking Pool"] T3LP["🐢 TIER 3 LowPriority\nRayon Pool ~25% CPUs"] T3HP["🚀 TIER 3 HighPriority\nRayon Pool ~80% CPUs"] OS_KERNEL -->|"socket/timer events"| TOKIO TOKIO -->|"wakes futures"| T1 T1 -->|"spawn_blocking(closure)\nBeaconProcessor manager only"| T2 T2 -->|"SendOnDrop fires idle_tx"| T1 T1 -->|"spawn_blocking_with_rayon\n(LowPriority)\nwraps inside Tier 2 slot"| T3LP T3LP -->|"SendOnDrop fires idle_tx"| T1 T1 -->|"spawn_blocking_with_rayon_async\n(HighPriority) .await\nbypasses Tier 2"| T3HP T3HP -->|"oneshot channel result"| T1 classDef kernel fill:#FF6B6B,stroke:#cc0000,stroke-width:2px,color:white classDef t1 fill:#87CEEB,stroke:#1a6691,stroke-width:2px,color:darkblue classDef t2 fill:#90EE90,stroke:#2d6a2d,stroke-width:2px,color:darkgreen classDef t3lo fill:#E6E6FA,stroke:#5555aa,stroke-width:2px,color:#333366 classDef t3hi fill:#FFD700,stroke:#997700,stroke-width:2px,color:black class OS_KERNEL kernel class TOKIO,T1 t1 class T2 t2 class T3LP t3lo class T3HP t3hi ``` **Rule:** Tier 1 is the only tier that can trigger other tiers. Tier 2 and Tier 3 never spawn further work — they only signal back to Tier 1 (via `idle_tx` or `oneshot`) when done. --- ## 6. Flow: Gossip Block Import A gossip block travels through all three major components before reaching the chain. ```mermaid sequenceDiagram participant P2P as 🌐 libp2p Swarm (Tier 1: network task) participant NS as 📡 NetworkService (Tier 1: network task) participant RT as 🔀 Router (Tier 1: router task) participant NBP as 🔗 NetworkBeaconProcessor (Tier 1) participant CH as 📋 BP Channel mpsc cap=16384 participant MGR as 📋 BP Manager (Tier 1) participant WRK as ⚙️ Async Worker (Tier 1: short-lived) participant EL as ⛏️ Execution Layer (engine_newPayload) participant BC as 🔗 BeaconChain (shared Arc) participant T2 as 🔒 Blocking Worker (Tier 2) P2P->>NS: NetworkEvent::PubsubMessage (BeaconBlock) NS->>RT: RouterMessage::PubsubMessage (unbounded channel) RT->>NBP: handle_gossip() → send_gossip_beacon_block() NBP->>CH: Work::GossipBlock(AsyncFn) try_send WorkEvent alt Worker slot available (current_workers < max_workers) MGR->>WRK: task_spawner.spawn_async() else All slots busy MGR->>MGR: push to gossip_block_queue (FIFO, cap=1024) Note over MGR: Wait for idle_rx signal MGR->>WRK: spawn when slot freed end rect rgb(200, 230, 255) Note over WRK,BC: process_gossip_block() — fully on Tier 1 async task WRK->>BC: 1. GossipVerifiedBlock (seen cache, gossip rules — sync, inline) WRK->>BC: 2. SignatureVerifiedBlock (BLS sig check — sync, inline) Note over WRK: 3. into_execution_pending_block() — SYNC, Tier 1 per_slot_processing() + per_block_processing() run inline on the async worker (no spawn_blocking) WRK->>EL: 4. engine_newPayload() .await response — async, Tier 1 EL-->>WRK: VALID / SYNCING end rect rgb(200, 255, 200) Note over WRK,T2: import_available_block() — crosses to Tier 2 WRK->>T2: spawn_blocking_handle(|| chain.import_block(...)) (DB writes, fork choice update) T2-->>WRK: block_root (await result) WRK->>BC: recompute_head_at_current_slot().await (updates cached_head RwLock — Tier 1) end WRK->>MGR: SendOnDrop fires idle_tx (even on panic) MGR->>MGR: current_workers -= 1 check queues ``` ### State Transition: Which Tier Does What The gossip block pipeline has a deliberate split across tiers: | Step | Function | Tier | Why | |---|---|---|---| | Gossip rules | `GossipVerifiedBlock::new()` | **Tier 1** (sync, inline) | Fast checks (seen cache, topic filter, slot range) | | BLS signature | `SignatureVerifiedBlock::from_gossip_verified_block_check_slashable()` | **Tier 1** (sync, inline) | Needed before state transition; cheap single sig verify | | **State transition** | `per_slot_processing()` + `per_block_processing()` | **Tier 1** (sync, inline) | Must complete before EL notification; worker is dedicated so blocking it is fine | | EL notification | `engine_newPayload()` via `ExecutionPendingBlock::into_executed_block()` | **Tier 1** (async `.await`) | I/O over HTTP — never blocks a thread | | DB write + fork choice | `import_block()` inside `import_available_block()` | **Tier 2** (`spawn_blocking_handle`) | CPU+IO-heavy; must not block the Tier 1 event loop | | Head update | `recompute_head_at_current_slot()` | **Tier 1** (async `.await`) | Acquires `recompute_head_lock` Mutex then updates `cached_head` RwLock | **Key insight:** `per_slot_processing` and `per_block_processing` run **inline on the Tier 1 async worker** because: 1. `into_execution_pending_block_slashable()` is a plain synchronous function (not async), called directly from the async context. 2. The GossipBlock async worker is a **dedicated task** — blocking it synchronously does not starve other concurrent work. 3. State transition must complete before `engine_newPayload` can be sent (the EL needs the block body). 4. DB writes and fork choice update (`import_block`) are the truly expensive part and are explicitly offloaded to Tier 2. **Call chain (simplified):** ``` process_gossip_block() ← async, Tier 1 (gossip_methods.rs) └─ process_gossip_verified_block() ← async, Tier 1 └─ chain.process_block() ← async, Tier 1 (beacon_chain.rs) ├─ unverified_block.into_execution_pending_block() ← SYNC, Tier 1 │ └─ per_slot_processing() + per_block_processing() ← SYNC, Tier 1 ├─ chain.into_executed_block().await ← async Tier 1 (awaits EL oneshot) └─ self.import_available_block().await ← async, crosses to Tier 2 └─ spawn_blocking_handle(|| chain.import_block(...)) ← Tier 2 ``` **Why async (not blocking) for the outer task?** Block import must `.await` the Execution Layer's `engine_newPayload` response over HTTP. Blocking a Tier 2 OS thread on network I/O would waste the thread pool. The worker is therefore async (Tier 1), and only the CPU+DB-heavy `import_block` is explicitly moved to Tier 2. --- ## 7. Flow: Gossip Attestation (BLS Batching) Attestations are the highest-volume message type. The BeaconProcessor batches them to amortize BLS verification cost. ```mermaid sequenceDiagram participant P2P as 🌐 libp2p Swarm (Tier 1) participant RT as 🔀 Router (Tier 1) participant CH as 📋 BP Channel participant MGR as 📋 BP Manager (Tier 1) participant WRK as 🔒 Blocking Worker (Tier 2) participant BC as 🔗 BeaconChain (shared Arc) P2P->>RT: PubsubMessage::Attestation RT->>CH: Work::GossipAttestation try_send WorkEvent alt Worker slot free MGR->>MGR: spawn immediately else Slot busy MGR->>MGR: push to attestation_queue (LIFO — freshest first) end Note over MGR: When slot becomes free, check queue depth alt Only 1 attestation queued MGR->>WRK: spawn_blocking(process_individual) else ≥2 attestations queued (up to 64) MGR->>MGR: Collect batch of min(queue_len, 64) into Work::GossipAttestationBatch MGR->>WRK: spawn_blocking(process_batch) end rect rgb(200, 255, 200) Note over WRK,BC: process_gossip_attestation_batch() — Tier 2 blocking thread WRK->>BC: Phase 1: Index all attestations (subnet, slot, validator lookup) WRK->>BC: Phase 2: Read validator_pubkey_cache (RwLock) → build SignatureSets WRK->>WRK: Phase 3: verify_signature_sets() ONE batch blst C call (pure CPU) alt Batch passes WRK->>WRK: CheckAttestationSignature::No (skip per-sig verify) else Batch fails (one bad sig poisons all) WRK->>WRK: Fallback: verify each sig individually end WRK->>BC: Phase 4 (per attestation): apply_attestation_to_fork_choice() WRK->>BC: add_to_naive_aggregation_pool() WRK->>RT: propagate_attestation_if_timely() (network_tx → ValidationResult) end WRK->>MGR: SendOnDrop fires idle_tx ``` **Key insight:** Processing 64 attestations in one batch BLS call is significantly faster than 64 individual calls. The LIFO queue ensures the freshest attestations are processed first when overloaded — older ones are dropped from the back. --- ## 8. Flow: P2P RPC Request Serving When a peer requests blocks/blobs/columns via the Req/Resp protocol. ```mermaid sequenceDiagram participant PEER as 🤝 Remote Peer participant P2P as 🌐 libp2p Swarm (Tier 1) participant NS as 📡 NetworkService (Tier 1) participant RT as 🔀 Router (Tier 1) participant MGR as 📋 BP Manager (Tier 1) participant WRK as ⚙️ Worker (Tier 1 or 2) participant BC as 🔗 BeaconChain / DB (shared Arc) PEER->>P2P: RPC Request (BlocksByRange / BlobsByRoot / etc.) P2P->>NS: NetworkEvent::RequestReceived NS->>RT: RouterMessage::RPCRequestReceived (unbounded channel) RT->>MGR: send_blocks_by_range_request() → Work::BlocksByRangeRequest(AsyncFn) or Work::BlobsByRangeRequest(BlockingFn) alt Async request (BlocksByRange, BlocksByRoot) MGR->>WRK: task_spawner.spawn_async() runs on Tier 1 else Blocking request (BlobsByRange, BlobsByRoot, DataColumns) MGR->>WRK: task_spawner.spawn_blocking() runs on Tier 2 end WRK->>BC: Read blocks/blobs from HotColdDB BC-->>WRK: data loop Stream response chunks WRK->>NS: network_tx: NetworkMessage::SendResponse (unbounded channel) NS->>P2P: libp2p send RPC response chunk P2P->>PEER: response chunk end WRK->>MGR: SendOnDrop fires idle_tx ``` --- ## 9. Flow: RPC Block Import (Sync) When the Sync manager downloads blocks from peers and imports them into the chain. ```mermaid sequenceDiagram participant PEER as 🤝 Remote Peer participant P2P as 🌐 libp2p Swarm (Tier 1) participant SYN as 🔄 Sync Manager (Tier 1) participant NBP as 🔗 NetworkBeaconProcessor participant MGR as 📋 BP Manager (Tier 1) participant WRK as ⚙️ Async Worker (Tier 1) participant BC as 🔗 BeaconChain (shared Arc) PEER->>P2P: RPC Response (blocks) P2P->>SYN: RouterMessage::RPCResponseReceived → SyncMessage SYN->>SYN: Update sync state machine (range sync / backfill) SYN->>NBP: send_chain_segment() or send_rpc_beacon_block() NBP->>MGR: Work::ChainSegment(AsyncFn) or Work::RpcBlock(AsyncFn) [HIGH PRIORITY queue, FIFO] Note over MGR: Priority: chain_segment > rpc_block > gossip_block > attestations MGR->>WRK: task_spawner.spawn_async() rect rgb(200, 230, 255) Note over WRK,BC: Tier 1 async block import WRK->>BC: verify_block_for_import() WRK->>BC: import_block() → state_transition() BC->>BC: recompute_head() WRK->>SYN: sync_tx: SyncMessage::BlockProcessed (advance sync state machine) end WRK->>MGR: SendOnDrop fires idle_tx ``` --- ## 10. Shared Objects Across Tiers All shared objects are `Arc<T>` — reference-counted heap allocations. Each tier holds a clone of the `Arc` pointer, all pointing to the same allocation. Internal mutability is provided by `RwLock` or `Mutex`. ```mermaid graph TB subgraph SharedState["🔒 Shared State (Arc-wrapped)"] BC["🔗 Arc<BeaconChain<T>> ───────────────────────────── .canonical_head: CanonicalHead ├─ fork_choice: RwLock<BeaconForkChoice> ├─ cached_head: RwLock<CachedHead> └─ recompute_head_lock: Mutex<()> .store: Arc<HotColdDB> .op_pool: OperationPool .naive_aggregation_pool: RwLock .observed_attestations: RwLock .validator_pubkey_cache: RwLock .execution_layer: ExecutionLayer .task_executor: TaskExecutor"] NG["🌐 Arc<NetworkGlobals<E>> ───────────────────────────── .peers: RwLock<PeerDB> .sync_state: RwLock<SyncState> .backfill_state: RwLock .local_enr: RwLock .gossipsub_subscriptions: RwLock .config: Arc<NetworkConfig> .spec: Arc<ChainSpec>"] NBP["🔗 Arc<NetworkBeaconProcessor<T>> ───────────────────────────── .beacon_processor_send: mpsc::Sender .duplicate_cache: Arc<Mutex<HashSet>> .chain: Arc<BeaconChain> .network_tx: mpsc::UnboundedSender .sync_tx: mpsc::UnboundedSender .network_globals: Arc<NetworkGlobals> .executor: TaskExecutor"] DB["💾 Arc<HotColdDB> ───────────────────────────── Hot store (recent blocks/states) Cold store (finalized history)"] TE["⚙️ TaskExecutor (Clone — wraps Weak<Runtime>) ───────────────────────────── .spawn() → Tier 1 .spawn_blocking() → Tier 2 .spawn_blocking_with_rayon() → Tier 3"] end subgraph Tiers["Execution Tiers"] T1NET["🌐 network task"] T1RTR["🔀 router task"] T1MGR["📋 BP manager"] T1SYN["🔄 sync manager"] T1API["🌐 HTTP API"] T1WRK["⚙️ async workers"] T2WRK["🔒 blocking workers"] end T1NET --> BC T1NET --> NG T1RTR --> BC T1RTR --> NBP T1MGR --> NG T1SYN --> NBP T1API --> BC T1API --> NG T1WRK --> BC T2WRK --> BC BC --> DB classDef shared fill:#FFD700,stroke:#997700,stroke-width:2px,color:black classDef tier1 fill:#87CEEB,stroke:#1a6691,stroke-width:2px,color:darkblue classDef tier2 fill:#90EE90,stroke:#2d6a2d,stroke-width:2px,color:darkgreen class BC,NG,NBP,DB,TE shared class T1NET,T1RTR,T1MGR,T1SYN,T1API,T1WRK tier1 class T2WRK tier2 ``` ### Channels (tier boundaries) | Channel | Type | Cap | Sender(s) | Receiver | Purpose | |---|---|---|---|---|---| | `beacon_processor_send` | `mpsc` | 16 384 | Router, Sync | BP Manager | All work enters BeaconProcessor | | `idle_tx` / `idle_rx` | `mpsc` | 16 384 | Every Tier 2/3 worker (SendOnDrop) | BP Manager | Worker completion signal | | `network_tx` | `mpsc unbounded` | ∞ | Workers, Router | NetworkService | Send responses / validations to libp2p | | `router_send` | `mpsc unbounded` | ∞ | NetworkService | Router | Network events → Router | | `sync_tx` | `mpsc unbounded` | ∞ | Router, workers | Sync Manager | Sync state machine messages | | `ready_work_rx` | `mpsc` | 12 288 | Reprocess scheduler | BP Manager | Delayed blocks now ready | --- ## 11. Controllers & Orchestrators ```mermaid graph TB subgraph Controllers["🎛️ Controllers (who drives what)"] direction TB BP["📋 BeaconProcessor Manager ══════════════════════════════ THE central work dispatcher • Owns 30+ priority queues • Manages worker slot count (max=CPUs) • Polls: idle_rx → ready_work_rx → event_rx • Forms attestation batches (up to 64) • Drops work flagged drop_during_sync • Never blocks — pure async event loop"] TE["⚙️ TaskExecutor ══════════════════════════════ Controls HOW work is spawned • .spawn() → Tier 1 async • .spawn_blocking() → Tier 2 OS thread • .spawn_blocking_with_rayon() → Tier 3 LowPriority • .spawn_blocking_with_rayon_async() → Tier 3 HighPriority"] NS["🌐 NetworkService ══════════════════════════════ Controls all libp2p I/O • Owns the Swarm (sole owner) • tokio::select! event loop • Routes events to Router • Sends gossip validation results"] CH["🔗 CanonicalHead ══════════════════════════════ Controls chain state reads/writes • fork_choice RwLock (heavy writes) • cached_head RwLock (fast reads) • recompute_head Mutex (serializes updates) • Two locks by design: fast path for networking (cached_head) vs slow path for fork choice computation"] SYN["🔄 Sync Manager ══════════════════════════════ Controls what to download • Range sync (forward) • Backfill sync (historical) • Drives block/blob requests to peers • Advances state machines on import result"] BCI["🔗 BeaconChain ══════════════════════════════ THE shared state hub (not a controller) • Every meaningful operation routes through it • Owns: canonical_head, store, op_pool, execution_layer, task_executor • Accessed concurrently by all tiers via Arc"] end BP -->|uses| TE BP -->|reads sync_state from| NG["Arc<NetworkGlobals>"] NS -->|feeds events to| RT["Router"] RT -->|sends work via| BP SYN -->|sends work via| BP CH -->|field of| BCI classDef controller fill:#FF6B6B,stroke:#cc0000,stroke-width:2px,color:white classDef state fill:#FFD700,stroke:#997700,stroke-width:2px,color:black class BP,TE,NS,SYN,CH controller class BCI state ``` ### Decision: who controls what | Question | Answer | |---|---| | What work runs next? | **BeaconProcessor manager** — priority queue + worker slot count | | Which OS thread runs work? | **TaskExecutor** — `spawn` / `spawn_blocking` / `spawn_blocking_with_rayon` | | What is the canonical head? | **CanonicalHead** — `cached_head` RwLock (fast) or `fork_choice` RwLock (authoritative) | | What blocks to download from peers? | **Sync Manager** — range/backfill state machines | | What comes in from the network? | **NetworkService** — sole owner of libp2p Swarm | | Where does consensus state live? | **BeaconChain** — shared Arc, accessed from all tiers | --- ## 12. BeaconProcessor Queue Priority When a worker slot becomes free, the manager drains queues in this strict order: ```mermaid flowchart TD FREE["⚡ Worker slot free\n(idle_rx received)"] --> Q1 Q1{"chain_segment_queue\n(FIFO, cap=64)"} Q1 -->|item| SPAWN["⚙️ spawn_worker()"] Q1 -->|empty| Q2 Q2{"rpc_block / rpc_blob /\nrpc_custody_column\n(FIFO)"} Q2 -->|item| SPAWN Q2 -->|empty| Q3 Q3{"delayed_block_queue\n(FIFO, cap=1024)"} Q3 -->|item| SPAWN Q3 -->|empty| Q4 Q4{"gossip_block / execution_payload /\ngossip_blob / data_column\n(FIFO)"} Q4 -->|item| SPAWN Q4 -->|empty| Q5 Q5{"api_request_p0\n(FIFO, high-priority API)"} Q5 -->|item| SPAWN Q5 -->|empty| Q6 Q6{"aggregate_queue ≥ 2?\n(LIFO → batch up to 64)"} Q6 -->|yes| BATCH_AGG["Collect GossipAggregateBatch"] Q6 -->|no / 1| SPAWN BATCH_AGG --> SPAWN Q6 -->|empty| Q7 Q7{"attestation_queue ≥ 2?\n(LIFO → batch up to 64)"} Q7 -->|yes| BATCH_ATT["Collect GossipAttestationBatch"] Q7 -->|no / 1| SPAWN BATCH_ATT --> SPAWN Q7 -->|empty| Q8{"sync_contribution /\nsync_message\n(LIFO)"} Q8 -->|item| SPAWN Q8 -->|empty| Q9{"unknown_block_att /\nunknown_block_agg\n(LIFO)"} Q9 -->|item| SPAWN Q9 -->|empty| Q10{"status / blocks_by_range /\nblobs_by_range / ...\n(FIFO)"} Q10 -->|item| SPAWN Q10 -->|empty| Q11{"slashings / exits /\nbls_execution_change\n(FIFO)"} Q11 -->|item| SPAWN Q11 -->|empty| Q12{"api_request_p1\n(FIFO, low-priority API)"} Q12 -->|item| SPAWN Q12 -->|empty| Q13{"backfill_chain_segment\n(FIFO, lowest priority)"} Q13 -->|item| SPAWN Q13 -->|empty| Q14{"light_client_*\n(FIFO)"} Q14 -->|item| SPAWN Q14 -->|empty| NOTHING["NOTHING_TO_DO\n(no work available)"] classDef decision fill:#FFD700,stroke:#997700,stroke-width:2px,color:black classDef spawn fill:#90EE90,stroke:#2d6a2d,stroke-width:2px,color:darkgreen classDef batch fill:#87CEEB,stroke:#1a6691,stroke-width:2px,color:darkblue classDef nothing fill:#FFB6C1,stroke:#DC143C,stroke-width:2px,color:black class Q1,Q2,Q3,Q4,Q5,Q6,Q7,Q8,Q9,Q10,Q11,Q12,Q13,Q14 decision class SPAWN,FREE spawn class BATCH_AGG,BATCH_ATT batch class NOTHING nothing ``` ### Priority rationale | Priority group | Reason | |---|---| | Chain segments | Most efficient path to advancing the head — batch of blocks already downloaded | | RPC blocks/blobs | Already explicitly requested from peers, peer is waiting | | Delayed blocks | May be parent of pending gossip blocks | | Gossip blocks | Must be imported before attestations referencing them can be verified | | API P0 | High-priority client requests (e.g. validator block production) | | Aggregates → Attestations | Validator rewards; aggregates more efficient (more info, same verification time) | | Sync committee | Rewards but no fork choice influence | | Unknown block att/agg | Older items held for retry — less fresh | | Status / range requests | Serving peers (important but not chain-critical) | | Slashings / exits | Safety-important but low urgency | | API P1 | Low-priority background API requests | | Backfill | Historical data, zero urgency for current consensus | | Light client | Minimal consensus impact | --- ## Key Design Principles 1. **No blocking on Tier 1 — with one deliberate exception.** CPU work is generally offloaded to Tier 2 (`spawn_blocking`). However, state transition (`per_slot_processing` + `per_block_processing`) runs inline on the Tier 1 async worker for gossip blocks. This is safe because each GossipBlock worker is a dedicated task and the state transition must complete synchronously before the EL can be notified. The expensive part — `import_block()` (DB writes + fork choice) — is still explicitly moved to Tier 2. 2. **Single manager, many workers.** One BeaconProcessor manager event loop serializes queue decisions; N workers run in parallel. This avoids lock contention on queue state. 3. **RAII worker slots.** `SendOnDrop` guarantees the idle signal fires even on panic — worker slots are never leaked. 4. **Two `CanonicalHead` locks by design.** `cached_head` (fast RwLock) serves the networking stack without blocking. `fork_choice` (heavy RwLock) is only taken for actual fork choice computation. 5. **LIFO for attestations.** When overloaded, freshest attestations have the most value for block production. Stale attestations are silently dropped from the back of the LIFO queue. 6. **Scoped rayon pools, never global.** The global rayon pool interacts badly with tokio thread counts. Lighthouse creates its own `LowPriority` and `HighPriority` pools sized to leave headroom for the async executor. 7. **Channels are tier boundaries.** `beacon_processor_send`, `idle_tx`, `network_tx`, `sync_tx` are the only crossing points between components. Shared `Arc<T>` provides concurrent data access within a tier.