# Solana Light Client Executor Specification **Version**: 1.1 **Date**: 2025-12-02 **Status**: Draft for Review --- ## Table of Contents 1. [Executive Summary](#executive-summary) 2. [Architecture Overview](#architecture-overview) 3. [System Components](#system-components) 4. [Deterministic Job Claiming](#deterministic-job-claiming) 5. [P2P Communication Layer](#p2p-communication-layer) 6. [Relayer Registry](#relayer-registry) 7. [Signature & Message Structure](#signature--message-structure) 8. [Trust Model & Security Phases](#trust-model--security-phases) 9. [Implementation Roadmap](#implementation-roadmap) 10. [Security Considerations](#security-considerations) 11. [Open Questions](#open-questions) --- ## Executive Summary The **Solana Executor Pallet** is the destination-side component of the Highway bridge that handles incoming cross-chain transfers **from Mosaic to Solana**. It mirrors the design of the entry pallet but adds **aggregated threshold signatures** and **p2p coordination** for enhanced security. ### Key Features - **Deterministic Job Claiming**: Hash-based rotation with primary/secondary/tertiary failover ensures fair, predictable relayer selection - **Committee-Based Threshold Signature Verification**: 128-member committee with 87/128 (67%) threshold required before execution (Phase 1) - **Two-Phase Validation**: Source validation committee + Execution validation committee for complete security - **P2P Coordination**: Relayers exchange signatures, proofs, and heartbeats off-chain - **Dual Registry**: Registered relayers (admin-managed) + Active relayers (heartbeat-based) - **Light Client Integration**: Phase 2 will verify Merkle proofs against Mosaic light client state - **Canonical Message Format**: Uses `keccak256(job_identity_hash)` matching Highway protocol ### Design Philosophy The executor follows the **entry pallet pattern** from `hway-solana/programs/highway-bridge-entry`: - Same job identity hash computation ([complete_transfer.rs:191-235](file:///Users/bojan/Documents/highway/hway-solana/programs/highway-bridge-entry/src/instructions/complete_transfer.rs#L191-L235)) - Same PDA architecture (config, executed transfers, vault authority) - Same event emission pattern - **New**: 128-member committee signature verification instead of simple whitelist --- ## Architecture Overview ### High-Level System Architecture ``` ┌─────────────────────────────────────────────────────────────────────────┐ │ HIGHWAY BRIDGE SYSTEM │ └─────────────────────────────────────────────────────────────────────────┘ ┌──────────────────────┐ ┌──────────────────────┐ │ MOSAIC CHAIN │ │ SOLANA CHAIN │ │ │ │ │ │ ┌────────────────┐ │ │ ┌────────────────┐ │ │ │ Light Client │ │◄─────── Listens ─────────│ │ Bridge Entry │ │ │ │ (Solana State) │ │ to Solana │ │ Program │ │ │ └────────────────┘ │ Events │ └────────────────┘ │ │ ▲ │ │ │ │ │ │ │ │ │ │ │ Verifies Proofs │ │ Emits Events │ │ │ │ │ │ │ │ ┌────────────────┐ │ │ ▼ │ │ │ Executor │ │ │ ┌────────────────┐ │ │ │ Pallet │◄─┼───── Relayers Submit ────┼──│ Solana │ │ │ │ (Mosaic→Solana)│ │ Committee Sigs │ │ Listener │ │ │ └────────────────┘ │ (87/128 threshold) │ └────────────────┘ │ │ │ + Merkle Proofs │ │ └──────────────────────┘ └──────────────────────┘ ▲ │ │ │ │ P2P NETWORK │ │ ┌──────────────────┐ │ └──────────│ Relayer Pool │◄──────────────────┘ │ (128 members) │ │ │ │ • Signature │ │ Aggregation │ │ • Proof Sharing │ │ • Heartbeats │ │ • Reorg Alerts │ └──────────────────┘ ``` ### Complete System Architecture with Light Clients ``` 
┌───────────────────────────────────────────────────────────────────────────────────────────────────────┐ │ HIGHWAY BRIDGE SYSTEM │ │ Bidirectional Cross-Chain Bridge │ └───────────────────────────────────────────────────────────────────────────────────────────────────────┘ ┌─────────────────────────────────────────────┐ ┌─────────────────────────────────────────────┐ │ MOSAIC CHAIN │ │ SOLANA CHAIN │ │ (Substrate-based) │ │ (Solana Runtime) │ └─────────────────────────────────────────────┘ └─────────────────────────────────────────────┘ ┌─────────────────────────────────────────────┐ ┌─────────────────────────────────────────────┐ │ ON-CHAIN COMPONENTS (Mosaic) │ │ ON-CHAIN COMPONENTS (Solana) │ ├─────────────────────────────────────────────┤ ├─────────────────────────────────────────────┤ │ │ │ │ │ ┌───────────────────────────────────────┐ │ │ ┌───────────────────────────────────────┐ │ │ │ Solana Light Client Pallet │ │ │ │ Mosaic Light Client Program │ │ │ │ (Phase 2 - Verifies Solana State) │ │ │ │ (Phase 2 - Verifies Mosaic State) │ │ │ │ │ │ │ │ │ │ │ │ • Tracks Solana block headers │ │ │ │ • Tracks Mosaic block headers │ │ │ │ • Verifies finality proofs │ │ │ │ • Verifies finality proofs │ │ │ │ • Stores state roots │ │ │ │ • Stores state roots │ │ │ │ • Validates epoch changes │ │ │ │ • Validates validator sets │ │ │ └───────────────────────────────────────┘ │ │ └───────────────────────────────────────┘ │ │ ▲ │ │ ▲ │ │ │ Reads finalized │ │ │ Reads finalized │ │ │ Solana state │ │ │ Mosaic state │ │ │ │ │ │ │ │ ┌───────────┴───────────────────────────┐ │ │ ┌───────────┴───────────────────────────┐ │ │ │ Highway Bridge Executor Pallet │ │ │ │ Highway Bridge Entry Program │ │ │ │ (Mosaic → Solana Transfers) │ │ │ │ (Solana → Mosaic Transfers) │ │ │ │ │ │ │ │ │ │ │ │ Phase 1 (Current Spec): │ │ │ │ Current Implementation: │ │ │ │ • 128-member committee validation │ │ │ │ • initiate_transfer (lock tokens) │ │ │ │ • 87/128 threshold (67%) │ │ │ │ • complete_transfer (unlock tokens) │ │ │ │ • Aggregated sig verification │ │ │ │ • refund_transfer (timeout refunds) │ │ │ │ • Relayer registry │ │ │ │ • Token vault management │ │ │ │ │ │ │ │ • Fee collection │ │ │ │ Phase 2 (Future): │ │ │ │ • Event emission │ │ │ │ • Verifies Merkle proofs against ──────┼─┼───────┼─┼──► light client state roots │ │ │ │ Solana light client state │ │ │ │ │ │ │ │ • Trustless verification │ │ │ │ │ │ │ └────────────────────────────────────────┘ │ │ └───────────────────────────────────────┘ │ │ ▲ │ │ │ │ │ │ Executes transfers │ │ │ Emits events │ │ │ (unlock tokens) │ │ ▼ │ └──────────────┼──────────────────────────────┘ └──────────────┼──────────────────────────────┘ │ │ │ │ │ │ ┌──────────────┴─────────────────────────────────────────────────────┴──────────────────────────────┐ │ OFF-CHAIN RELAYER INFRASTRUCTURE │ ├────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌──────────────────────────────────────────────────────────────────────────────────────────┐ │ │ │ P2P NETWORK LAYER │ │ │ │ (libp2p-based) │ │ │ │ │ │ │ │ Protocol 1: Source Validation (Phase 1 of 2) │ │ │ │ ┌─────────────────────────────────────────────────────────────────────────────┐ │ │ │ │ │ SignatureRequest → Claiming relayer broadcasts proof to 128-member committee │ │ │ │ │ │ SignatureResponse → Committee members verify proof, sign, return signature │ │ │ │ │ │ Aggregation → Claiming relayer collects 87/128 signatures (6s timeout) │ │ │ │ │ 
└─────────────────────────────────────────────────────────────────────────────┘ │ │ │ │ │ │ │ │ Protocol 2: Execution Validation (Phase 2 of 2) │ │ │ │ ┌─────────────────────────────────────────────────────────────────────────────┐ │ │ │ │ │ After destination execution, relayer requests execution validation │ │ │ │ │ │ Committee verifies MessageExecuted event on destination chain │ │ │ │ │ │ 87/128 signatures required for reward claim │ │ │ │ │ └─────────────────────────────────────────────────────────────────────────────┘ │ │ │ │ │ │ │ │ Protocol 3: Heartbeat & Liveness │ │ │ │ ┌─────────────────────────────────────────────────────────────────────────────┐ │ │ │ │ │ Heartbeat messages every 60 minutes → Active relayer registry updates │ │ │ │ │ │ Offline detection → Remove from active set after 60 minutes silence │ │ │ │ │ └─────────────────────────────────────────────────────────────────────────────┘ │ │ │ │ │ │ │ │ Protocol 4: Proof Sharing & Reorg Alerts │ │ │ │ ┌─────────────────────────────────────────────────────────────────────────────┐ │ │ │ │ │ ProofShare → Share Merkle proofs to avoid redundant computation │ │ │ │ │ │ ReorgAlert → Broadcast chain reorganizations affecting pending jobs │ │ │ │ │ └─────────────────────────────────────────────────────────────────────────────┘ │ │ │ └──────────────────────────────────────────────────────────────────────────────────────────┘ │ │ ▲ │ ▲ │ │ │ │ │ │ │ ┌───────────────────────┐ ┌────────────┴─────┴─────┴────────────┐ ┌───────────────────────┐ │ │ │ Relayer A │ │ Relayer B (Claiming) │ │ Relayer C │ │ │ ├───────────────────────┤ ├──────────────────────────────────────┤ ├───────────────────────┤ │ │ │ │ │ │ │ │ │ │ │ Mosaic Listener ────┼──┼──► Detects TransferInitiated event ◄┼──┼──── Mosaic Listener │ │ │ │ (Monitors blocks) │ │ Creates Job (job_identity_hash) │ │ (Monitors blocks) │ │ │ │ │ │ Deterministic claiming algorithm │ │ │ │ │ │ Solana Listener ────┼──┼──► Detects TransferInitiated event ◄┼──┼──── Solana Listener │ │ │ │ (WebSocket) │ │ │ │ (WebSocket) │ │ │ │ │ │ │ │ │ │ │ │ Job Builder │ │ Job Claimer │ │ Job Builder │ │ │ │ Creates jobs from │ │ Primary/Secondary/Tertiary logic │ │ Creates jobs from │ │ │ │ detected events │ │ Time-windowed claiming │ │ detected events │ │ │ │ │ │ │ │ │ │ │ │ Committee Member │ │ Signature Aggregator │ │ Committee Member │ │ │ │ Verifies Merkle │ │ 1. Builds Merkle proof │ │ Verifies Merkle │ │ │ │ proofs │ │ 2. Broadcasts SignatureRequest │ │ proofs │ │ │ │ Signs if valid │ │ 3. Collects 87/128 responses │ │ Signs if valid │ │ │ │ │ │ 4. Aggregates signatures (BLS) │ │ │ │ │ │ Transaction Executor │ │ 5. Submits to destination chain │ │ Transaction Executor │ │ │ │ (Phase 2 addition) │ │ │ │ (Phase 2 addition) │ │ │ │ │ │ Transaction Executor │ │ │ │ │ │ Database │ │ Constructs execute_transfer() tx │ │ Database │ │ │ │ Jobs, proofs, │ │ Submits to Solana/Mosaic │ │ Jobs, proofs, │ │ │ │ relayer registry │ │ │ │ relayer registry │ │ │ └───────────────────────┘ └──────────────────────────────────────┘ └───────────────────────┘ │ │ │ │ ... (Relayer D, E, ... up to 128 relayers in committee) ... │ │ │ └────────────────────────────────────────────────────────────────────────────────────────────────────┘ ┌────────────────────────────────────────────────────────────────────────────────────────────────────┐ │ DATA FLOW: MOSAIC → SOLANA TRANSFER │ └────────────────────────────────────────────────────────────────────────────────────────────────────┘ User (Mosaic) │ │ 1. 
Initiates transfer: 100 USDC → Solana address 0xABC... ▼ Mosaic Bridge Entry Pallet │ │ 2. Locks 100 USDC in vault (LockRelease mode) │ OR Burns 100 USDC (BurnMint mode) │ 3. Emits TransferInitiated event ▼ Mosaic Block Finalized (Block N) │ ├─────────────────────────┬─────────────────────────┬─────────────────────────┐ ▼ ▼ ▼ ▼ Relayer A Relayer B Relayer C Relayer D detects event detects event detects event detects event │ │ │ │ │ 4. Create Job with job_identity_hash = keccak256(canonical_preimage) │ │ │ │ │ │ 5. Compute claiming relayer: hash % active_count │ │ Determine: Primary (0-6s), Secondary (7-12s), Tertiary (13s+) │ │ │ │ │ │ ▼ │ │ │ Relayer B │ │ │ (Primary - 0-6s window) │ │ │ │ │ │ │ 6. Build Merkle proof │ │ │ from Mosaic state │ │ │ │ │ │ │ 7. Broadcast SignatureRequest ─┼──────► P2P Network ─────┤ │◄─── P2P ────────────────┤ to 128-member │ (128 members) │ │ │ committee │ │ │ │ │ │ │ 8. Verify proof │ │ 8. Verify proof │ │ 9. Verify claimer is │ │ 9. Verify claimer is │ │ primary assigned │ │ primary assigned │ │10. Sign message │ │10. Sign message │ │11. Send SignatureResponse ─► P2P ───────────────►│◄─── SignatureResponse ──┤ │ │ │ │ │ ▼ │ │ │ Relayer B │ │ │ │ │ │ │ 12. Collect 87/128 signatures │ │ │ (within 6 second timeout) │ │ │ │ │ │ │ 13. Aggregate signatures (BLS) │ │ │ signer_bitmap = 128-bit │ │ │ │ │ │ │ ▼ │ │ │ Submit Transaction │ │ │ │ │ │ └─────────────────────────┼─────────────────────────┴─────────────────────────┘ │ ▼ Solana Executor Program │ 14. Verify 87/128 committee signatures 15. Verify ExecutedTransfer doesn't exist 16. Transfer 100 USDC from vault → 0xABC... OR Mint 100 USDC to 0xABC... (BurnMint mode) 17. Create ExecutedTransfer PDA 18. Emit TransferExecuted event │ ▼ User receives 100 USDC on Solana ┌────────────────────────────────────────────────────────────────────────────────────────────────────┐ │ PHASE 2: LIGHT CLIENT INTEGRATION │ └────────────────────────────────────────────────────────────────────────────────────────────────────┘ When Mosaic Solana Light Client Pallet is available: Mosaic Chain Solana Chain │ │ │ Relayers periodically submit │ │ Mosaic block headers + finality proofs │ │ │ │ ───────────────────────────────────────────────────────────► │ │ │ │ Mosaic Light Client Program │ │ │ │ Stores verified │ │ Mosaic state roots │ ▼ │ Solana Executor Program │ │ │ │ execute_transfer( │ │ merkle_proof, │ │ block_header, │ │ state_root │ │ ) │ │ │ │ Verifies: │ │ 1. Block header in light client ✓ │ │ 2. Merkle proof → state_root ✓ │ │ 3. Extract event data ✓ │ │ 4. Execute transfer ✓ │ │ │ │ NO SIGNATURES NEEDED │ │ (Trustless verification) │ ▼ Symmetrical setup on Mosaic side: │ │ Solana Light Client Pallet (on Mosaic) │ │ │ │ Stores Solana block headers + state roots │ ▼ │ Mosaic Executor Pallet (Solana → Mosaic) │ │ │ │ Verifies Merkle proofs against │ │ Solana light client state │ ▼ ``` ### Bidirectional Flow **Solana → Mosaic** (Entry Pallet): 1. User calls `initiate_transfer` on Solana entry program 2. Tokens locked/burned, event emitted 3. Relayers detect event → create job → build Merkle proof 4. Submit to Mosaic executor (existing implementation) **Mosaic → Solana** (Executor Pallet - THIS SPEC): 1. User initiates transfer on Mosaic 2. Event finalized on Mosaic chain 3. Relayers detect event → deterministic claiming (primary/secondary/tertiary) 4. **Primary relayer requests signatures via p2p (6s window)** 5. **128-member committee verifies proof + signs message** 6. 
**87/128 signatures aggregated and submitted to Solana executor program** 7. Executor verifies committee signatures + executes transfer --- ## System Components ### 1. Solana Executor Program (On-Chain) **Location**: New program in `hway-solana/programs/solana-executor` (similar to `highway-bridge-entry`) **Instructions**: ```rust pub mod solana_executor { /// Initialize executor configuration pub fn initialize( ctx: Context<Initialize>, committee_size: u8, // 128 members threshold: u8, // 87 (67% of 128) initial_relayers: Vec<RelayerInfo>, refund_timeout_seconds: i64, ) -> Result<()> /// Execute incoming transfer from Mosaic /// Verifies aggregated signatures from 87/128 committee members pub fn execute_transfer( ctx: Context<ExecuteTransfer>, transfer_id: [u8; 32], // job_identity_hash source_chain_id: String, destination_chain_id: String, source_block_number: u64, source_block_hash: [u8; 32], source_event_hash: [u8; 32], recipient: Pubkey, amount: u64, transfer_method: TransferMethod, // LockRelease, BurnMint, LockMint, BurnRelease aggregated_signature: Vec<u8>, // BLS aggregated signature signer_bitmap: u128, // Which of 128 committee members signed ) -> Result<()> /// Update relayer set (admin only) pub fn update_relayer_set( ctx: Context<UpdateConfig>, relayers_to_add: Vec<RelayerInfo>, relayers_to_remove: Vec<Pubkey>, ) -> Result<()> /// Update threshold (admin only) pub fn update_threshold( ctx: Context<UpdateConfig>, new_threshold: u8, ) -> Result<()> // ... (other instructions similar to entry pallet) } /// Token transfer method configuration #[derive(AnchorSerialize, AnchorDeserialize, Clone, Copy)] pub enum TransferMethod { /// Lock on source, release on destination (both use vaults) LockRelease, /// Burn on source, mint on destination (no vaults needed) BurnMint, /// Lock on source, mint on destination (source vault, destination mints) LockMint, /// Burn on source, release on destination (source burns, destination vault) BurnRelease, } ``` **State Accounts**: ```rust #[account] pub struct ExecutorConfig { pub authority: Pubkey, pub pending_authority: Option<Pubkey>, // Committee configuration for signature verification pub committee_size: u8, // 128 members pub threshold: u8, // 87 (67% threshold) pub relayers: Vec<RelayerInfo>, // Max 128 relayers for committee // Same fields as entry pallet pub fee_basis_points: u16, pub fee_recipient: Pubkey, pub min_fee_lamports: u64, pub paused: bool, pub supported_chains: Vec<u8>, pub min_transfer_amount: u64, pub max_transfer_amount: u64, pub refund_timeout_seconds: i64, pub bump: u8, } #[account] pub struct RelayerInfo { pub pubkey: Pubkey, // ed25519 public key for signature verification pub bls_pubkey: [u8; 48], // BLS public key for committee signatures pub weight: u8, // Vote weight (usually 1) pub added_at: i64, // Timestamp when added } #[account] pub struct TokenConfig { pub token_address: Pubkey, pub transfer_method: TransferMethod, pub vault_address: Option<Pubkey>, // For LockRelease, BurnRelease modes pub mint_authority: Option<Pubkey>, // For BurnMint, LockMint modes pub is_paused: bool, } #[account] pub struct ExecutedTransfer { pub transfer_id: [u8; 32], pub recipient: Pubkey, pub amount: u64, pub executed_at: i64, pub signer_bitmap: u128, // Which committee members signed (128-bit) pub bump: u8, } ``` ### 2. 
Off-Chain Relayer Components **Location**: `hway-relayer/src/chain_interfaces/solana/executor/` **New Modules**: ``` src/chain_interfaces/solana/executor/ ├── mod.rs # Public exports ├── job_claimer.rs # Deterministic claiming with primary/secondary/tertiary ├── signature_aggregator.rs # Collects 87/128 committee signatures ├── message_builder.rs # Creates canonical messages for signing ├── executor_client.rs # Solana executor program client └── executor_transaction.rs # Transaction construction for execute_transfer ``` **Integration Points**: - Uses existing `MerkleTree` from [verification/merkle.rs](hway-relayer/src/core/verification/merkle.rs) - Uses existing `Job::job_identity_hash()` from [types/job.rs:173](hway-relayer/src/core/types/job.rs#L173) - Integrates with new p2p module (see section 5) ### 3. P2P Communication Module **Location**: `hway-relayer/src/core/p2p/` (currently placeholder) **New Implementation**: ``` src/core/p2p/ ├── mod.rs # P2P trait definitions ├── libp2p_network.rs # libp2p implementation ├── messages.rs # P2P message types ├── source_validation.rs # Source chain validation protocol ├── execution_validation.rs # Destination execution validation protocol ├── heartbeat_protocol.rs # Heartbeat (every 60 minutes) └── proof_sharing.rs # Merkle proof exchange ``` ### 4. Light Client Integration Module (Phase 2) **Location**: `hway-relayer/src/chain_interfaces/mosaic/light_client_syncer.rs` **Purpose**: Bridge between Mosaic chain and Solana light client program for trustless verification. **Following Repository Patterns**: - ✅ Implements similar pattern to `MosaicClient` and `SolanaClient` - ✅ Uses `new()` + `init()` initialization pattern (lines 96-107 of `mosaic/client.rs`) - ✅ Has custom error type `LightClientError` (like `SolanaError`, `MosaicError`) - ✅ Uses `Arc<RwLock<>>` for shared state management - ✅ Follows existing file organization in `chain_interfaces/mosaic/` - ✅ Module declaration in `mosaic/mod.rs`: `pub mod light_client_syncer;` **Architecture**: ```rust use super::client::MosaicClient; use super::types::{MosaicError, Result as MosaicResult}; use crate::chain_interfaces::solana::client::SolanaClient; use solana_sdk::{pubkey::Pubkey, signature::Keypair, transaction::Transaction}; use std::collections::HashMap; use std::sync::Arc; use std::sync::atomic::{AtomicU64, Ordering}; use tokio::sync::RwLock; use tracing::{error, info, warn}; /// Error types for light client operations #[derive(Debug, thiserror::Error)] pub enum LightClientError { #[error("Mosaic client error: {0}")] MosaicClient(#[from] MosaicError), #[error("Solana RPC error: {0}")] SolanaRpc(String), #[error("State root not found for block {0}")] StateRootNotFound(u64), #[error("Header not finalized")] HeaderNotFinalized, #[error("Light client program error: {0}")] ProgramError(String), #[error("Serialization error: {0}")] Serialization(String), } pub type Result<T> = std::result::Result<T, LightClientError>; /// Configuration for light client syncer #[derive(Clone)] pub struct LightClientConfig { /// Solana light client program ID pub program_id: Pubkey, /// Sync interval in seconds (default: 60) pub sync_interval_secs: u64, /// Cache capacity (default: 1000 blocks) pub cache_capacity: usize, } /// Light client syncer for Mosaic → Solana state verification /// /// **Pattern**: Follows MosaicClient/SolanaClient architecture pub struct LightClientSyncer { /// Mosaic RPC client for fetching block headers mosaic_client: Arc<MosaicClient>, /// Solana RPC client for submitting headers to light 
client solana_client: Arc<SolanaClient>, /// Configuration config: LightClientConfig, /// Keypair for signing header submissions submitter_keypair: Keypair, /// Cache of verified state roots (block_number -> state_root) state_root_cache: Arc<RwLock<HashMap<u64, [u8; 32]>>>, /// Last synced block number last_synced_block: Arc<AtomicU64>, } impl LightClientSyncer { /// Create a new light client syncer /// /// **Pattern**: Matches `MosaicClient::new()` and `SolanaClient::new()` pub fn new( mosaic_client: Arc<MosaicClient>, solana_client: Arc<SolanaClient>, config: LightClientConfig, submitter_keypair: Keypair, ) -> Self { Self { mosaic_client, solana_client, config, submitter_keypair, state_root_cache: Arc::new(RwLock::new(HashMap::new())), last_synced_block: Arc::new(AtomicU64::new(0)), } } /// Initialize and test connection /// /// **Pattern**: Matches `MosaicClient::init()` and `SolanaClient::init()` pub async fn init( mosaic_client: Arc<MosaicClient>, solana_client: Arc<SolanaClient>, config: LightClientConfig, submitter_keypair: Keypair, ) -> Result<Self> { let syncer = Self::new(mosaic_client, solana_client, config, submitter_keypair); // Test connections syncer.health_check().await?; Ok(syncer) } /// Health check for both clients pub async fn health_check(&self) -> Result<()> { self.mosaic_client.health_check().await .map_err(|e| LightClientError::MosaicClient(e))?; self.solana_client.health_check().await .map_err(|e| LightClientError::SolanaRpc(e.to_string()))?; Ok(()) } } impl LightClientSyncer { /// Submit Mosaic block headers to Solana light client program /// /// This runs as a background task, periodically fetching finalized /// Mosaic block headers and submitting them to the Solana light client. /// /// Submission frequency: Every 10 blocks or every 60 seconds (whichever comes first) pub async fn sync_headers(&self, from_block: u64) -> Result<()> { loop { let current_block = self.mosaic_client.get_finalized_block_number().await?; if current_block > self.last_synced_block.load(Ordering::SeqCst) { // Fetch block header with finality proof let header = self.mosaic_client.get_block_header(current_block).await?; let finality_proof = self.mosaic_client.get_finality_proof(current_block).await?; // Submit to Solana light client self.submit_header(&header, &finality_proof).await?; // Update cache self.state_root_cache.write().await.insert( current_block, header.state_root ); self.last_synced_block.store(current_block, Ordering::SeqCst); } tokio::time::sleep(Duration::from_secs(60)).await; } } /// Query verified state root from Solana light client /// /// Returns the state root for a given Mosaic block number. /// The light client must have already verified this block. 
    pub async fn get_state_root(&self, block_number: u64) -> Result<[u8; 32]> {
        // Check cache first
        if let Some(&root) = self.state_root_cache.read().await.get(&block_number) {
            return Ok(root);
        }

        // Query from light client program
        let light_client_state = self.solana_client
            .get_account(&self.config.program_id)
            .await?;

        let state: LightClientState = deserialize(&light_client_state.data)?;
        state.get_state_root(block_number)
            .ok_or_else(|| LightClientError::StateRootNotFound(block_number))
    }

    /// Verify header has been finalized on Mosaic before submission
    pub async fn verify_finality(&self, header: &BlockHeader) -> Result<()> {
        let finalized_number = self.mosaic_client.get_finalized_block_number().await?;

        if header.number > finalized_number {
            return Err(LightClientError::HeaderNotFinalized);
        }

        Ok(())
    }

    /// Handle chain reorganizations
    ///
    /// If a reorg is detected on Mosaic, notify the light client to revert
    /// affected headers and resubmit the canonical chain.
    pub async fn handle_reorg(&self, old_block: u64, new_block: u64) -> Result<()> {
        // Revert light client state
        self.submit_reorg_proof(old_block, new_block).await?;

        // Clear affected state roots from cache
        self.state_root_cache.write().await.retain(|&num, _| num < old_block);

        // Resubmit canonical headers
        for block_num in old_block..=new_block {
            let header = self.mosaic_client.get_block_header(block_num).await?;
            let proof = self.mosaic_client.get_finality_proof(block_num).await?;
            self.submit_header(&header, &proof).await?;
        }

        Ok(())
    }

    /// Submit block header to Solana light client program
    async fn submit_header(
        &self,
        header: &BlockHeader,
        finality_proof: &FinalityProof,
    ) -> Result<()> {
        let ix = submit_header_instruction(
            &self.config.program_id,
            &self.submitter_keypair.pubkey(),
            header,
            finality_proof,
        );

        let tx = Transaction::new_signed_with_payer(
            &[ix],
            Some(&self.submitter_keypair.pubkey()),
            &[&self.submitter_keypair],
            self.solana_client.get_latest_blockhash().await?,
        );

        self.solana_client.send_and_confirm_transaction(&tx).await?;

        Ok(())
    }
}
```

**Integration with Executor**:

When Phase 2 is active, the executor uses the light client syncer to build trustless proofs:

```rust
// Phase 2: Build transaction with Merkle proof
pub async fn build_execute_transfer_v2(
    job: &Job,
    light_client_syncer: &LightClientSyncer,
) -> Result<Transaction> {
    // 1. Get verified state root from light client
    let state_root = light_client_syncer
        .get_state_root(job.source_block_number.unwrap())
        .await?;

    // 2. Build Merkle proof from job event
    let merkle_proof = build_merkle_proof_for_event(
        job.source_event.as_ref().unwrap(),
        job.source_block_number.unwrap(),
    ).await?;

    // 3.
Construct execute_transfer instruction (no signatures needed) let ix = execute_transfer_instruction( job.id, job.source_chain_id.clone(), job.destination_chain_id.clone(), job.source_block_number.unwrap(), job.source_block_hash.unwrap(), job.source_event_hash.unwrap(), job.recipient.unwrap(), job.amount.unwrap(), merkle_proof, // NEW: Merkle proof instead of signatures state_root, // NEW: State root from light client ); // Any relayer can submit (no claiming needed in Phase 2) build_transaction(ix) } ``` **Key Differences from Phase 1**: | Aspect | Phase 1 (Committee Signatures) | Phase 2 (Light Client) | |--------|------------------------------|------------------------| | Trust Model | 87/128 committee consensus | Cryptographic proofs | | Verification | BLS signature aggregation | Merkle proof + light client | | Claiming | Deterministic with time windows | Any relayer (first-come-first-served) | | P2P Required | Yes (signature aggregation) | No (optional for optimization) | | Transaction Size | ~1.5KB (aggregated BLS sig) | ~800 bytes (proof) | | Security | 67% honest committee assumption | Trustless (light client assumption) | --- ## Deterministic Job Claiming ### Objective Ensure **exactly one relayer** claims each job in a **fair, predictable, and verifiable** manner without central coordination, with automatic failover to secondary/tertiary relayers. ### Primary Algorithm: Hash-Based Rotation with Time Windows **Mechanism**: ```rust /// Determine which relayers are assigned to a job (primary, secondary, tertiary) pub fn compute_assigned_relayers( job_identity_hash: [u8; 32], active_relayers: &[RelayerId], // Sorted deterministically ) -> AssignedRelayers { // Convert job hash to u64 index let hash_value = u64::from_be_bytes(job_identity_hash[0..8].try_into().unwrap()); let count = active_relayers.len(); AssignedRelayers { primary: active_relayers[(hash_value as usize) % count].clone(), secondary: active_relayers[((hash_value as usize) + 1) % count].clone(), tertiary: active_relayers[((hash_value as usize) + 2) % count].clone(), } } /// Determine if this relayer should claim based on time elapsed pub fn should_claim( assigned: &AssignedRelayers, my_relayer_id: &RelayerId, seconds_since_detection: u64, ) -> bool { // Primary window: 0-6 seconds if *my_relayer_id == assigned.primary && seconds_since_detection <= 6 { return true; } // Secondary window: 7-12 seconds (only if primary hasn't claimed) if *my_relayer_id == assigned.secondary && seconds_since_detection > 6 && seconds_since_detection <= 12 { return true; } // Tertiary window: 13+ seconds (only if primary and secondary haven't claimed) if *my_relayer_id == assigned.tertiary && seconds_since_detection > 12 { return true; } false } #[derive(Clone)] pub struct AssignedRelayers { pub primary: RelayerId, pub secondary: RelayerId, pub tertiary: RelayerId, } ``` **Time Windows**: | Assignment | Time Window | Blocks (approx) | |------------|-------------|-----------------| | Primary | 0-6 seconds | 0-N blocks | | Secondary | 7-12 seconds | N+1 to 2N blocks | | Tertiary | 13+ seconds | 2N+1+ blocks | **Properties**: - ✅ **Deterministic**: Same job hash → same relayer assignment (all nodes agree) - ✅ **Fair**: Uniform distribution over job hashes - ✅ **Verifiable**: Any relayer can verify who should claim - ✅ **No coordination**: No need for leader election - ✅ **Automatic Failover**: Secondary/tertiary take over if primary fails - ✅ **Censorship Resistant**: Three levels of failover **Relayer List Sorting**: ```rust /// Canonical sort 
order for relayers (must match across all nodes) pub fn canonical_relayer_order(relayers: &mut Vec<RelayerEntry>) { relayers.sort_by(|a, b| { // Primary: registration timestamp (oldest first) // Secondary: pubkey bytes (tie-breaker) a.registered_at.cmp(&b.registered_at) .then_with(|| a.pubkey.cmp(&b.pubkey)) }); } ``` ### Race Condition Handling **Scenario**: Both primary and secondary relayers collect signatures and submit in the same block. **Resolution on Mosaic Verifier Pallet**: ```rust /// If multiple claims in same block, prioritize by assignment order pub fn resolve_claim_race( claims: Vec<MessageClaim>, assigned: &AssignedRelayers, ) -> MessageClaim { // Priority: primary > secondary > tertiary if let Some(claim) = claims.iter().find(|c| c.relayer_id == assigned.primary) { return claim.clone(); } if let Some(claim) = claims.iter().find(|c| c.relayer_id == assigned.secondary) { return claim.clone(); } if let Some(claim) = claims.iter().find(|c| c.relayer_id == assigned.tertiary) { return claim.clone(); } // If none of the assigned relayers, reject all panic!("No valid claimer found"); } ``` **Result for Losing Relayer**: - Receives `MessageClaimRejected` event with reason `"claimed_by_higher_priority_relayer"` - No penalty (expected behavior) - Can continue with next job --- ## P2P Communication Layer ### Architecture Diagram ``` ┌─────────────────────────────────────────────────────────────────────┐ │ P2P COMMITTEE SIGNATURE AGGREGATION FLOW │ └─────────────────────────────────────────────────────────────────────┘ Step 1: Job Detected ───────────────────── Mosaic Chain │ │ Event Finalized ▼ ┌──────────────┐ │ All Relayers│ (monitor Mosaic) │ Detect Job │ └──────────────┘ │ ▼ Compute assigned relayers deterministically from job_identity_hash │ ├─────► Relayer A (primary - 0-6s window) ├─────► Relayer B (secondary - 7-12s window) ├─────► Relayer C (tertiary - 13s+ window) └─────► Relayers D-N (committee members) Step 2: Source Validation Signature Request (P2P) ────────────────────────────────────────────────── Relayer A (primary, within 0-6s window): 1. Build Merkle proof 2. Create message: sign(job_identity_hash + relayer_id + timestamp) 3. Broadcast SignatureRequest to 128-member committee via P2P ┌──────────────────────────────────┐ │ SignatureRequest │ │ { │ │ job_identity_hash, │ │ merkle_proof, │ │ source_block_hash, │ │ recipient, │ │ amount, │ │ claimer: Relayer A, │ │ assignment_proof: primary │ │ } │ └──────────────────────────────────┘ │ ├───► P2P Network ────► Committee Member 1 ├───► P2P Network ────► Committee Member 2 ├───► P2P Network ────► ... └───► P2P Network ────► Committee Member 128 Step 3: Committee Source Validation Response (P2P) ─────────────────────────────────────────────────── 128 Committee Members: 1. Verify Merkle proof independently 2. Verify claimer is correctly assigned (primary within time window) 3. Validate message wasn't already executed on destination 4. If valid: Sign message with BLS key 5. Send SignatureResponse via P2P back to Relayer A Committee 1 ────► bls_signature_1 ────┐ Committee 2 ────► bls_signature_2 ────┤ Committee 3 ────► bls_signature_3 ────┤ ... ├───► Relayer A (aggregates) Committee 87 ────► bls_signature_87 ────┤ (need 87/128 = 67%) ... │ Committee 128────► bls_signature_128────┘ Timeout: 6 seconds to collect signatures Step 4: BLS Aggregation & Destination Submission ───────────────────────────────────────────────── Relayer A: 1. Collects 87/128 BLS signatures (within 6s timeout) 2. Aggregates into single BLS signature 3. 
Creates 128-bit signer_bitmap (which committee members signed) 4. Submits to Solana executor program: execute_transfer( transfer_id, source_chain_id, ..., aggregated_bls_signature, signer_bitmap: 0x...FFFFF (128 bits) ) ┌──────────────┐ │ Solana │ │ Executor │◄───── Transaction submitted │ Program │ with aggregated BLS signature └──────────────┘ │ ▼ Verifies 87/128 committee signatures Executes transfer (release or mint based on config) Emits TransferExecuted event Step 5: Execution Validation (Second Committee Round) ────────────────────────────────────────────────────── After destination execution confirmed: Relayer A: 1. Listens for MessageExecuted event on Solana 2. Wait for transaction finality (Solana: finalized slot) 3. Request execution validation from committee via P2P 128 Committee Members: 1. Verify MessageExecuted event exists on destination 2. Confirm transaction is finalized 3. Sign execution approval Relayer A: 1. Collect 87/128 execution signatures 2. Submit claim_reward() to Mosaic with execution signatures ``` ### P2P Message Types ```rust /// P2P messages exchanged between relayers #[derive(Serialize, Deserialize)] pub enum P2PMessage { /// Request source validation signatures for a job SourceValidationRequest { job_identity_hash: [u8; 32], source_chain_id: String, destination_chain_id: String, source_block_number: u64, source_block_hash: [u8; 32], source_event_hash: [u8; 32], source_tx_hash: [u8; 32], merkle_proof: Vec<[u8; 32]>, merkle_root: [u8; 32], recipient: Pubkey, amount: u64, transfer_method: TransferMethod, claimer: RelayerId, assignment_level: AssignmentLevel, // Primary, Secondary, or Tertiary timestamp: i64, }, /// Response with BLS signature for source validation SourceValidationResponse { job_identity_hash: [u8; 32], bls_signature: [u8; 96], // BLS signature (G2 point) signer: RelayerId, committee_index: u8, // Position in 128-member committee timestamp: i64, }, /// Request execution validation signatures for reward claim ExecutionValidationRequest { job_identity_hash: [u8; 32], destination_chain_id: String, destination_tx_hash: [u8; 32], destination_block_number: u64, claimer: RelayerId, timestamp: i64, }, /// Response with BLS signature for execution validation ExecutionValidationResponse { job_identity_hash: [u8; 32], bls_signature: [u8; 96], signer: RelayerId, committee_index: u8, timestamp: i64, }, /// Heartbeat to signal liveness (every 60 minutes) Heartbeat { relayer_id: RelayerId, timestamp: i64, supported_chains: Vec<u8>, current_block_numbers: HashMap<String, u64>, bls_pubkey: [u8; 48], }, /// Reorg alert ReorgAlert { chain_id: String, old_block_hash: [u8; 32], new_block_hash: [u8; 32], reorg_depth: u64, affected_jobs: Vec<[u8; 32]>, }, /// Proof sharing (optional optimization) ProofShare { job_identity_hash: [u8; 32], merkle_proof: Vec<[u8; 32]>, merkle_root: [u8; 32], sender: RelayerId, }, } #[derive(Serialize, Deserialize, Clone, Copy)] pub enum AssignmentLevel { Primary, // 0-6 seconds Secondary, // 7-12 seconds Tertiary, // 13+ seconds } ``` ### Committee Selection Algorithm ```rust /// Deterministically select 128-member committee for a job pub fn select_committee( job_identity_hash: [u8; 32], source_chain_id: &str, active_relayers: &[RelayerEntry], committee_size: usize, // 128 ) -> Vec<RelayerEntry> { // Create deterministic seed from job + chain let mut seed_input = job_identity_hash.to_vec(); seed_input.extend_from_slice(source_chain_id.as_bytes()); let seed = keccak256(&seed_input); // Use seed to shuffle relayer list deterministically 
let mut committee = active_relayers.to_vec(); fisher_yates_shuffle(&mut committee, &seed); // Take first 128 (or all if fewer relayers) committee.truncate(committee_size.min(active_relayers.len())); committee } /// Fisher-Yates shuffle with deterministic seed fn fisher_yates_shuffle(list: &mut [RelayerEntry], seed: &[u8; 32]) { let mut rng_state = seed.clone(); for i in (1..list.len()).rev() { // Generate next random index rng_state = keccak256(&rng_state); let j = u64::from_be_bytes(rng_state[0..8].try_into().unwrap()) as usize % (i + 1); list.swap(i, j); } } ``` ### P2P Protocol Flow **Technology**: Use **libp2p** for p2p networking - **Discovery**: mDNS (local) + Kademlia DHT (global) - **Transport**: TCP + QUIC - **Encryption**: Noise protocol (ed25519 keys) - **Pub/Sub**: GossipSub for broadcasts - **Request/Response**: Custom protocol for signature exchange **Configuration**: ```rust pub struct P2PConfig { pub listen_address: String, // e.g., "/ip4/0.0.0.0/tcp/9000" pub bootstrap_peers: Vec<String>, // Known relayer addresses pub heartbeat_interval_minutes: u64, // Default: 60 minutes pub source_validation_timeout_secs: u64, // Default: 6 seconds pub execution_validation_timeout_secs: u64, // Default: 6 seconds pub max_concurrent_requests: usize, // Default: 100 pub committee_size: usize, // Default: 128 pub signature_threshold: usize, // Default: 87 (67%) } ``` ### Heartbeat Mechanism **Heartbeat Server** (centralized for MVP, decentralizable later): ``` ┌──────────────────────────────────────┐ │ Heartbeat Coordination Service │ │ (Centralized, Low-Trust) │ │ │ │ Endpoints: │ │ POST /api/v1/heartbeat │ │ { relayer_id, timestamp, │ │ signature, bls_pubkey } │ │ │ │ GET /api/v1/active_relayers │ │ → Returns list of relayers with │ │ heartbeat in last 60 minutes │ └──────────────────────────────────────┘ ▲ │ │ │ │ POST /heartbeat │ GET /active_relayers │ every 60 minutes │ (used for committee selection) │ │ ┌────┴────────────────────▼─────┐ │ Relayer Pool │ │ ┌────┐ ┌────┐ ┌────┐ ┌────┐ │ │ │ A │ │ B │ │ C │ │ D │ │ │ └────┘ └────┘ └────┘ └────┘ │ │ ... up to 128+ relayers ... │ └────────────────────────────────┘ ``` **Heartbeat Frequency**: Every **60 minutes** - **Active window**: Relayer considered active if heartbeat within last **60 minutes** - **Offline detection**: Missing heartbeat → marked inactive - **Re-activation**: Sending heartbeat → immediately marked active **Registry Synchronization**: - Every **60 minutes**, heartbeat server submits `updateActiveRelayers()` to: - Mosaic registry pallet - Registry contract on Solana - Registry contract on all other supported chains - Same relayer list sent to all chains for consistency --- ## Relayer Registry ### Two-Table Architecture Both tables are stored in **off-chain database** (SQLite/PostgreSQL), **NOT on-chain**. #### Table 1: Registered Relayers (Admin-Managed) **Purpose**: Whitelist of relayers authorized to participate in the network. **Schema**: ```sql CREATE TABLE registered_relayers ( id INTEGER PRIMARY KEY, relayer_id TEXT NOT NULL UNIQUE, -- e.g., "0x1234...abcd" (pubkey) pubkey BLOB NOT NULL, -- ed25519 public key (32 bytes) bls_pubkey BLOB NOT NULL, -- BLS public key (48 bytes) registered_at TIMESTAMP NOT NULL, -- When added by admin registered_by TEXT NOT NULL, -- Admin who added (for audit) supported_chains TEXT NOT NULL, -- JSON array: ["Solana", "Mosaic"] stake_amount BIGINT, -- Optional: stake requirement metadata TEXT, -- JSON: contact info, etc. 
    is_enabled BOOLEAN DEFAULT TRUE,      -- Admin can disable

    -- For deterministic sorting
    UNIQUE(registered_at, relayer_id)
);

CREATE INDEX idx_registered_enabled ON registered_relayers(is_enabled);
CREATE INDEX idx_registered_sort ON registered_relayers(registered_at, relayer_id);
```

**Management**:
- **Add**: Admin runs `relayer-cli add-relayer --pubkey 0x... --bls-pubkey 0x... --chains Solana,Mosaic`
- **Remove**: Admin runs `relayer-cli disable-relayer --id 0x...`
- **Sync to On-Chain**: Heartbeat server periodically calls `update_relayer_set()` on all chain registry contracts

#### Table 2: Active Relayers (Heartbeat-Based)

**Purpose**: Track which registered relayers are currently online and responsive.

**Schema**:

```sql
CREATE TABLE active_relayers (
    relayer_id TEXT PRIMARY KEY,          -- References registered_relayers
    last_heartbeat TIMESTAMP NOT NULL,    -- Most recent heartbeat
    current_block_numbers TEXT,           -- JSON: {"Solana": 12345, "Mosaic": 67890}
    p2p_address TEXT,                     -- Multiaddr for P2P connection
    bls_pubkey BLOB NOT NULL,             -- BLS public key (48 bytes)
    version TEXT,                         -- Relayer software version

    -- Activity is evaluated at query time rather than stored:
    -- a generated column cannot call datetime('now') (non-deterministic),
    -- and a STORED value would go stale between heartbeats.
    -- Active check: last_heartbeat > datetime('now', '-60 minutes')

    FOREIGN KEY (relayer_id) REFERENCES registered_relayers(relayer_id)
);

CREATE INDEX idx_active_heartbeat ON active_relayers(last_heartbeat);
```

**Update Flow**:

```rust
/// Called every 60 minutes by relayer
pub async fn send_heartbeat(
    repos: &Repositories,
    relayer_id: &RelayerId,
    block_numbers: HashMap<String, u64>,
    bls_pubkey: [u8; 48],
) -> Result<()> {
    repos.active_relayers.upsert(ActiveRelayer {
        relayer_id: relayer_id.clone(),
        last_heartbeat: chrono::Utc::now(),
        current_block_numbers: serde_json::to_string(&block_numbers)?,
        p2p_address: get_p2p_multiaddr(),
        bls_pubkey,
        version: env!("CARGO_PKG_VERSION").to_string(),
    }).await
}
```

### Canonical Sorting (Both Tables)

**Critical**: All relayers must have **identical sorted order** for deterministic committee selection.

**Sort Order**:

```rust
pub async fn get_active_relayers_sorted(repos: &Repositories) -> Result<Vec<RelayerEntry>> {
    // 1. Get registered relayers
    let registered = repos.registered_relayers
        .find_all_enabled()
        .await?;

    // 2. Filter to only active (heartbeat within 60 minutes)
    let active = repos.active_relayers
        .find_active()
        .await?;
    let active_ids: HashSet<_> = active.iter().map(|a| &a.relayer_id).collect();

    // 3. Keep only registered + active
    let mut eligible: Vec<_> = registered.into_iter()
        .filter(|r| active_ids.contains(&r.relayer_id))
        .collect();

    // 4.
CANONICAL SORT (must match across all nodes) eligible.sort_by(|a, b| { // Primary: registered_at (oldest first) a.registered_at.cmp(&b.registered_at) // Secondary: relayer_id (lexicographic, tie-breaker) .then_with(|| a.relayer_id.cmp(&b.relayer_id)) }); Ok(eligible) } ``` **Why This Order**: - **Oldest first**: Rewards early relayers, stable over time - **Relayer ID tie-breaker**: Handles same-second registrations deterministically - **All nodes agree**: Same database state → same sorted list --- ## Signature & Message Structure ### Message Format (What Committee Members Sign) **Structure**: ```rust /// The canonical message that 87/128 committee members must sign pub struct ValidationMessage { /// Job identity hash (keccak256 of canonical preimage) pub job_identity_hash: [u8; 32], /// Relayer ID claiming this job pub claimer_relayer_id: String, /// Assignment level (Primary, Secondary, Tertiary) pub assignment_level: AssignmentLevel, /// Timestamp when validation was requested pub request_timestamp: i64, } impl ValidationMessage { /// Compute the message bytes to be signed with BLS pub fn to_signable_bytes(&self) -> Vec<u8> { let mut buf = Vec::with_capacity(32 + 4 + self.claimer_relayer_id.len() + 1 + 8); // 1. Job identity hash (32 bytes) buf.extend_from_slice(&self.job_identity_hash); // 2. Length-prefixed claimer ID put_u32_be(&mut buf, self.claimer_relayer_id.len() as u32); buf.extend_from_slice(self.claimer_relayer_id.as_bytes()); // 3. Assignment level (1 byte) buf.push(self.assignment_level as u8); // 4. Request timestamp (8 bytes big-endian) buf.extend_from_slice(&self.request_timestamp.to_be_bytes()); buf } /// Sign the message with relayer's BLS keypair pub fn sign_bls(&self, bls_keypair: &BlsKeypair) -> BlsSignature { let msg_bytes = self.to_signable_bytes(); bls_keypair.sign(&msg_bytes) } /// Verify a BLS signature against a public key pub fn verify_bls(&self, signature: &BlsSignature, pubkey: &BlsPublicKey) -> bool { let msg_bytes = self.to_signable_bytes(); signature.verify(&msg_bytes, pubkey) } } ``` ### Job Identity Hash (Uses Existing Relayer Format) **Source**: [hway-relayer/src/core/types/job.rs:173](hway-relayer/src/core/types/job.rs#L173) **Canonical Preimage** (from `Job::identity_preimage()`): ``` "HIGHWAY-JOB" (11 bytes) // Domain separator 0x01 (1 byte) // Domain version job.version (1 byte) // Currently 0x01 source_chain_id (length-prefixed UTF-8) destination_chain_id (length-prefixed UTF-8) block_number (8 bytes big-endian u64) block_hash (32 bytes) event_hash (32 bytes, keccak256 of source event) ``` **Then**: `job_identity_hash = keccak256(preimage)` **Match On-Chain**: The Solana executor program MUST use the **same computation** as [complete_transfer.rs:205-235](hway-solana/programs/highway-bridge-entry/src/instructions/complete_transfer.rs#L205-L235). 
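To make the byte layout above concrete, the following is a minimal standalone sketch of the hash computation. It assumes a 4-byte big-endian length prefix for the chain identifiers (mirroring the length-prefixed encoding used by `ValidationMessage` above) and uses the `tiny-keccak` crate purely for illustration; `Job::identity_preimage()` in `types/job.rs` and the on-chain computation in `complete_transfer.rs` remain authoritative.

```rust
use tiny_keccak::{Hasher, Keccak};

/// Illustrative sketch of the canonical preimage described above.
/// The u32 length-prefix width is an assumption; `Job::identity_preimage()` is authoritative.
pub fn job_identity_hash_sketch(
    job_version: u8,
    source_chain_id: &str,
    destination_chain_id: &str,
    block_number: u64,
    block_hash: &[u8; 32],
    event_hash: &[u8; 32],
) -> [u8; 32] {
    let mut preimage = Vec::new();
    preimage.extend_from_slice(b"HIGHWAY-JOB");               // domain separator (11 bytes)
    preimage.push(0x01);                                      // domain version
    preimage.push(job_version);                               // job version (currently 0x01)

    // Length-prefixed UTF-8 chain identifiers
    for chain_id in [source_chain_id, destination_chain_id] {
        preimage.extend_from_slice(&(chain_id.len() as u32).to_be_bytes());
        preimage.extend_from_slice(chain_id.as_bytes());
    }

    preimage.extend_from_slice(&block_number.to_be_bytes());  // 8 bytes big-endian
    preimage.extend_from_slice(block_hash);                   // 32 bytes
    preimage.extend_from_slice(event_hash);                   // 32 bytes (keccak256 of source event)

    // job_identity_hash = keccak256(preimage)
    let mut hasher = Keccak::v256();
    let mut out = [0u8; 32];
    hasher.update(&preimage);
    hasher.finalize(&mut out);
    out
}
```

Because every field is fixed-width or length-prefixed, the preimage has exactly one decoding, which is what lets the relayer, the committee members, and the Solana executor program all arrive at the same `job_identity_hash` independently.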
### BLS Signature Aggregation **Technology**: BLS12-381 signatures ```rust use bls_signatures::{PrivateKey, PublicKey, Signature, aggregate}; /// Aggregate 87+ BLS signatures into one pub fn aggregate_bls_signatures(signatures: &[Signature]) -> Signature { aggregate(signatures).expect("Failed to aggregate signatures") } /// Verify aggregated signature on-chain pub fn verify_bls_aggregate( aggregated_sig: &Signature, message: &[u8], pubkeys: &[PublicKey], ) -> bool { // Each pubkey must have signed the same message aggregated_sig.verify(message, pubkeys) } /// Create signer bitmap from committee member indices pub fn create_signer_bitmap_128(signer_indices: &[u8]) -> u128 { let mut bitmap = 0u128; for &index in signer_indices { assert!(index < 128, "Committee index must be < 128"); bitmap |= 1u128 << index; } bitmap } ``` **Why BLS**: - Single aggregated signature regardless of committee size (87 signatures → ~96 bytes) - Efficient on-chain verification - Standard in blockchain consensus systems ### Signer Bitmap (128-bit) **Purpose**: Compact representation of which 128 committee members signed. **Encoding**: ```rust /// Convert list of signer indices to 128-bit bitmap pub fn create_signer_bitmap(signer_indices: &[u8]) -> u128 { let mut bitmap = 0u128; for &index in signer_indices { bitmap |= 1u128 << index; } bitmap } /// Decode bitmap to signer indices pub fn decode_signer_bitmap(bitmap: u128) -> Vec<u8> { (0..128) .filter(|i| (bitmap & (1u128 << i)) != 0) .collect() } /// Count number of signers pub fn count_signers(bitmap: u128) -> u8 { bitmap.count_ones() as u8 } /// Example: Committee members at indices [0, 2, 4, 87, 100, 127] signed /// bitmap = 0x80000010000000000000000000000015 (128 bits) /// count_signers(bitmap) = 6 ``` **On-Chain Verification**: ```rust // In execute_transfer instruction let signer_indices = decode_signer_bitmap(signer_bitmap); let signer_count = signer_indices.len(); // Verify threshold met (87/128 = 67%) require!( signer_count >= config.threshold as usize, ErrorCode::InsufficientSignatures ); // Get committee for this job let committee = select_committee( transfer_id, &source_chain_id, &config.relayers, 128, ); // Collect pubkeys of signers let signer_pubkeys: Vec<BlsPublicKey> = signer_indices.iter() .map(|&i| committee[i as usize].bls_pubkey) .collect(); // Build message let message = ValidationMessage { job_identity_hash: transfer_id, claimer_relayer_id: claimer.to_string(), assignment_level, request_timestamp: claim_timestamp, }.to_signable_bytes(); // Verify aggregated BLS signature require!( verify_bls_aggregate(&aggregated_signature, &message, &signer_pubkeys), ErrorCode::InvalidSignature ); ``` --- ## Trust Model & Security Phases ### Phase 1: Committee-Based Threshold Signature Verification **Architecture**: ``` ┌────────────────────────────────────────────────────────────────┐ │ PHASE 1 TRUST MODEL │ │ 128-Member Committee, 87/128 (67%) Threshold │ └────────────────────────────────────────────────────────────────┘ Mosaic Chain Solana Chain │ │ │ Event: Transfer to Solana │ │ (100 USDC to 0xABC...) │ ▼ │ ┌─────────────┐ │ │ Relayer A │ ─┐ │ │ (Primary) │ │ │ └─────────────┘ │ Detect event │ │ Build proof │ ┌─────────────┐ │ │ │ Relayer B │ ─┤ ◄──── Primary has 0-6s to │ │ (Secondary)│ │ collect signatures │ └─────────────┘ │ │ │ 128-Member Committee │ ┌─────────────┐ │ │ │ Relayer C │ ─┤ Each verifies: │ │ (Committee) │ │ - Merkle proof valid │ └─────────────┘ │ - Claimer correctly assigned │ │ - Not already executed │ ... 
│ │ │ │ ┌─────────────┐ │ │ │ Relayer 128 │ ─┘ │ │ (Committee) │ │ └─────────────┘ │ │ │ │ Primary collects 87/128 BLS signatures │ │ Aggregates into single BLS signature │ │ │ └──────────► Submit Transaction ──────────►│ execute_transfer( │ transfer_id, │ aggregated_bls_signature, │ signer_bitmap: 128-bit │ ) │ ▼ ┌──────────────────┐ │ Executor Program │ │ │ │ 1. Verify 87/128 │ │ BLS sigs │ │ 2. Verify no │ │ double-spend │ │ 3. Transfer │ │ tokens │ └──────────────────┘ ``` **Security Properties**: - **128-Member Committee**: Large committee size for security - **67% Threshold (87/128)**: Standard BFT requirement (≥2/3) - **Byzantine Fault Tolerance**: System secure if ≤42 committee members malicious (1/3) - **Proof Verification**: Each committee member independently verifies Merkle proof before signing - **Double-Spend Prevention**: `ExecutedTransfer` PDA prevents re-execution **Attack Vectors & Mitigations**: | Attack | Mitigation | |--------|-----------| | **43+ committee members collude** | Requires 43/128 (34%) colluding - economically irrational with staking | | **Primary relayer never submits** | Secondary has 7-12s window, tertiary has 13s+ window | | **Invalid Merkle proof** | Each committee member verifies independently, won't sign if invalid | | **Replay attack** | `ExecutedTransfer` PDA enforces transfer_id uniqueness | | **Griefing (spam signatures)** | Rate limiting on P2P, reputation system | | **Committee prediction attack** | Committee selection uses job hash, unpredictable | **Assumptions**: - ❌ **NOT trustless**: Assumes <1/3 committee members are malicious - ✅ **Byzantine fault tolerant**: Up to 42/128 relayers can be malicious - ✅ **Liveness**: Requires 87 committee members online (high availability with 128) **Production Setup**: - **Committee Size**: 128 members - **Threshold**: 87/128 (67%) - **Relayer Entities**: Mix of exchanges, validators, and foundation - **Multi-Sig Authority**: Controls relayer registry - **Monitoring**: Real-time alerts for invalid signature attempts **Implementation Tasks**: 1. **On-Chain Program**: - [ ] Create `solana-executor` program structure - [ ] Implement `initialize` instruction with 128-member committee config - [ ] Implement `execute_transfer` with BLS aggregate signature verification - [ ] Implement `update_relayer_set` instruction - [ ] Add PDA accounts (ExecutorConfig, ExecutedTransfer, TokenConfig) - [ ] Add token transfer method support (LockRelease, BurnMint, LockMint, BurnRelease) - [ ] Write comprehensive tests - [ ] Security audit preparation 2. **P2P Communication**: - [ ] Implement libp2p network layer - [ ] Define P2P message types (SourceValidationRequest, SourceValidationResponse, ExecutionValidationRequest, ExecutionValidationResponse) - [ ] Implement source validation protocol (6s timeout) - [ ] Implement execution validation protocol (6s timeout) - [ ] Add heartbeat mechanism (60 minute interval) - [ ] Committee selection algorithm (128 members) - [ ] BLS signature aggregation - [ ] Integration tests for p2p communication 3. **Relayer Components**: - [ ] Implement `job_claimer` (primary/secondary/tertiary with time windows) - [ ] Implement `signature_aggregator` (collects 87/128 BLS signatures) - [ ] Implement `message_builder` (canonical ValidationMessage) - [ ] Implement `executor_client` (Solana program interaction) - [ ] Implement `executor_transaction` (transaction construction) - [ ] Integration with existing Merkle proof system 4. 
**Database & Registry**: - [ ] Create `registered_relayers` table schema (with bls_pubkey) - [ ] Create `active_relayers` table schema - [ ] Implement relayer repository methods - [ ] Add admin CLI commands (add/remove relayers) - [ ] Heartbeat coordination service (HTTP API, 60 minute interval) - [ ] Active relayer query logic (60 minute active window) 5. **Integration & Testing**: - [ ] End-to-end test: Mosaic → Solana transfer - [ ] Test 87/128 signature aggregation - [ ] Test claiming algorithm with primary/secondary/tertiary failover - [ ] Test committee selection determinism - [ ] Test all token transfer methods - [ ] Performance testing (throughput, latency) - [ ] Devnet deployment & testing --- ### Phase 2: Light Client Verification **Prerequisite**: **Mosaic Solana Light Client Pallet** must be available on Mosaic chain **Architecture**: ``` ┌────────────────────────────────────────────────────────────────┐ │ PHASE 2: LIGHT CLIENT VERIFICATION │ └────────────────────────────────────────────────────────────────┘ Mosaic Chain Solana Chain │ │ │ 1. Event: Transfer to Solana │ │ 2. Mosaic block finalized │ │ 3. Block header + state root │ ▼ │ ┌─────────────────────┐ │ │ Mosaic Light Client │ │ │ (Solana Program) │◄─────────────────────────┤ │ │ Relayers submit block │ │ Stores: │ headers periodically │ │ - Block headers │ │ │ - State roots │ │ │ - Finality proofs │ │ └─────────────────────┘ │ │ │ │ Block header verified │ ▼ │ ┌──────────────────┐ │ │ Executor Program │ │ │ │ │ │ execute_transfer(│ │ │ transfer_id, │ │ │ merkle_proof, │ ◄───────────────────────────┘ │ block_header, │ Any relayer submits │ state_root, │ proof + header │ ) │ │ │ │ Verification: │ │ 1. Verify block │ │ header in │ │ light client │ │ 2. Verify merkle │ │ proof against │ │ state_root │ │ 3. Extract event │ │ data │ │ 4. Execute │ │ transfer │ └──────────────────┘ ``` **Key Changes from Phase 1**: 1. **On-Chain Proof Verification**: ```rust pub fn execute_transfer( ctx: Context<ExecuteTransfer>, transfer_id: [u8; 32], // NEW: Light client proof parameters mosaic_block_header: BlockHeader, merkle_proof: Vec<[u8; 32]>, merkle_leaf_index: u64, // ... other params ) -> Result<()> { // 1. Verify block header against light client state let light_client = &ctx.accounts.mosaic_light_client; require!( light_client.verify_block_header(&mosaic_block_header), ErrorCode::InvalidBlockHeader ); // 2. Verify Merkle proof against block's state root let state_root = mosaic_block_header.state_root; require!( verify_merkle_proof( &merkle_proof, state_root, merkle_leaf_index, &transfer_data_hash ), ErrorCode::InvalidMerkleProof ); // 3. No longer need committee signatures (proof is trustless) // 4. Execute transfer based on token config // ... } ``` 2. **Light Client Integration**: - Relayers submit Mosaic block headers to Solana light client program - Light client tracks finality (epoch changes, validator sets) - Executor reads verified headers from light client state 3. **Committee Removal**: - No longer need 87/128 relayer signatures - Proof verification is cryptographic (trustless) - **Any relayer** can submit (first-come-first-served or highest fee) **Security Properties**: - ✅ **Trustless**: No reliance on honest relayers - ✅ **Cryptographic Verification**: Merkle proofs + light client consensus - ✅ **Censorship Resistance**: Any relayer can submit valid proofs - ❌ **Light Client Assumption**: Assumes Mosaic light client is correct **Implementation Tasks**: 1. 
**Light Client Integration**:
   - [ ] Integrate Mosaic light client program as dependency
   - [ ] Implement block header verification in executor
   - [ ] Add state root tracking
   - [ ] Create `LightClientSyncer` module

2. **Merkle Proof Verification**:
   - [ ] Implement on-chain Merkle verification
   - [ ] Optimize for compute units
   - [ ] Test against real Mosaic state roots

3. **Protocol Update**:
   - [ ] Remove committee signature verification logic
   - [ ] Update `execute_transfer` instruction signature
   - [ ] Add light client account to instruction context
   - [ ] Update relayer claiming logic (no aggregation needed)

4. **Testing & Deployment**:
   - [ ] End-to-end tests with real light client
   - [ ] Test block header submission
   - [ ] Test reorg handling
   - [ ] Testnet deployment

---

## Configuration & Parameters

### System-Wide Parameters

| Parameter | Value | Rationale |
|-----------|-------|-----------|
| Committee Size | 128 nodes | Balance security (larger = more secure) vs. efficiency |
| Signature Threshold | 87/128 (67%) | Standard BFT requirement (≥2/3) |
| Primary Window | 0-6 seconds | Primary relayer exclusive claiming window |
| Secondary Window | 7-12 seconds | Failover if primary doesn't claim |
| Tertiary Window | 13+ seconds | Final failover |
| Source Committee Timeout | 6 seconds | Fast iteration for signatures |
| Execution Committee Timeout | 6 seconds | Consistent with source |
| Mosaic Finality Wait | 2 blocks (~18s) | Optimized for speed |
| Heartbeat Interval | 60 minutes | Frequent enough to detect failures |
| Inactivity Threshold | 60 minutes | 1 missed heartbeat = inactive |
| Registry Update Frequency | 60 minutes | Balance freshness vs. gas costs |

### Per-Chain Finality Configuration

| Chain | Source Finality | Destination Finality | Rationale |
|-------|-----------------|----------------------|-----------|
| Ethereum | 0 blocks | 0 blocks | Low reorg risk after inclusion |
| Mosaic | 2 blocks | 2 blocks | Block "buried" deeply |
| Solana | Finalized | Finalized | Solana finality is fast |

---

## Security Considerations

### Threat Model

**Attacker Goals**:
1. **Steal funds** from Solana vault
2. **Double-spend** a cross-chain transfer
3. **Censor** legitimate transfers
4. **Grief** relayers (DoS, spam)

**Assumptions**:
- **Phase 1**: <1/3 of 128 committee members are malicious (42 or fewer)
- **Phase 2**: Mosaic light client is correct
- **Both**: Solana consensus is secure

### Attack Scenarios & Mitigations

#### 1. Malicious Committee Majority (Phase 1)

**Attack**: 87+ malicious committee members collude to sign an invalid transfer.

**Mitigation**:
- **High committee size**: 128 members makes collusion difficult
- **67% threshold**: Forging an execution requires 87 colluding signers; honest members verify the Merkle proof and will not sign an invalid transfer
- **Reputable relayers**: Exchanges, validators with reputation at stake
- **Staking (future)**: Require relayers to stake funds, slash on provable misbehavior
- **Monitoring**: Real-time alerts for unexpected transfers

**Residual Risk**: If 87 or more committee members are compromised, funds can be stolen (Phase 1 limitation). If 42 or more members are malicious or offline, the 87-signature threshold cannot be reached and transfers halt (liveness failure, no loss of funds).

#### 2. Claiming Relayer Censorship

**Attack**: The primary relayer refuses to submit a valid transfer.

**Mitigation**:
- **Time-windowed failover**: Secondary has 7-12s, tertiary has 13s+
- **Anyone can submit**: In Phase 2, any relayer can submit proofs (no claiming needed)

#### 3. Double-Spend

**Attack**: Submit the same transfer_id twice to drain the vault.
#### 4. Replay Attack

**Attack**: Reuse old signatures for a different transfer.

**Mitigation**:

- **Message includes transfer_id**: Each signature is tied to a specific job_identity_hash
- **Timestamp in message**: Prevents replay across time
- **Signer bitmap**: The executor verifies signatures match the current committee

#### 5. P2P Network Attacks

**Attack**: Sybil attack, eclipse attack, message spam.

**Mitigation**:

- **Whitelist bootstrap peers**: Only connect to known relayers
- **Signature on all messages**: P2P messages are signed with relayer keypairs
- **Rate limiting**: Throttle signature requests per relayer
- **Reputation system**: Track relayer behavior, ban misbehavers

#### 6. Merkle Proof Forgery (Phase 2)

**Attack**: Submit a fake Merkle proof to claim a non-existent transfer.

**Mitigation**:

- **Light client verification**: The proof must verify against the light client's finalized state roots
- **Cryptographic soundness**: Merkle tree collision resistance
- **Proof size limits**: Prevent DoS via huge proofs

### Audit Focus Areas

1. **Committee Selection Logic**: Ensure selection is deterministic yet unpredictable in advance
2. **BLS Signature Verification**: Correct aggregation and verification
3. **PDA Seeds**: Verify ExecutedTransfer seeds prevent collisions
4. **Integer Overflow**: Check amount calculations and fee math
5. **Access Control**: Only valid committee signatures are accepted
6. **Re-Entrancy**: Ensure no re-entrancy in CPI calls
7. **Token Transfer Methods**: All 4 methods (LockRelease, BurnMint, LockMint, BurnRelease) are correct

---

## Open Questions

### Technical Decisions

1. **BLS Library for Solana**:
   - Which BLS12-381 library should be used on Solana?
   - What is the compute unit cost of verifying an 87-signature aggregation?
   - **Proposed**: Research existing Solana BLS implementations

2. **Heartbeat: Central Service vs. P2P Gossip**:
   - A central service is simpler but introduces a single point of failure
   - P2P gossip is decentralized but more complex
   - **Proposed**: Start with a central service (as per HWAY scope), migrate to gossip later

3. **Claiming Fallback Timing**:
   - 6 seconds primary, 6 seconds secondary - is this optimal?
   - Should timing be block-based or time-based?
   - **Proposed**: Use time-based windows (6s/6s/∞) per HWAY scope

4. **Relayer Stake Requirement**:
   - Should relayers stake funds to participate?
   - If yes, how much?
   - **Proposed**: Not in Phase 1; add later with slashing

### Economic Questions

1. **Relayer Incentives**:
   - Who pays relayers for execution?
   - Fee model: percentage of the transfer, or a fixed fee?
   - **Proposed**: Per HWAY scope, rewards come from the Mosaic pool

2. **Gas Costs**:
   - Who pays for Solana transaction fees?
   - The claiming relayer (reimbursed via the bridge fee)?
   - **Proposed**: Yes, the claiming relayer pays and is reimbursed from the fee

3. **Committee Participation Rewards**:
   - Should committee members receive small rewards for signing?
   - **Proposed**: Per HWAY scope, small participation rewards

### Governance

1. **Relayer Onboarding**:
   - What are the criteria for adding a new relayer?
   - Is KYC required?
   - **Proposed**: Multi-sig approval, criteria TBD

2. **Relayer Removal**:
   - Automatic removal for downtime?
   - Or only manual removal by multi-sig?
   - **Proposed**: Per HWAY scope, a relayer is inactive after 60 minutes without a heartbeat (see the sketch after this list)

3. **Threshold Updates**:
   - Should the threshold adjust based on the active committee count?
   - **Proposed**: Fixed 87/128 threshold, manually updated by governance
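For reference, the 60-minute heartbeat and inactivity parameters above reduce to a simple timestamp comparison when building the active-relayer set. The sketch below is illustrative only; the constant, function, and key types are assumptions rather than part of the spec.

```rust
use std::collections::HashMap;

/// One missed 60-minute heartbeat marks a relayer inactive
/// (Inactivity Threshold in the parameters table).
const INACTIVITY_THRESHOLD_SECS: u64 = 60 * 60;

/// Given the last heartbeat (unix seconds) per registered relayer,
/// return the identities considered active at `now`.
fn active_relayers(last_heartbeats: &HashMap<String, u64>, now: u64) -> Vec<String> {
    last_heartbeats
        .iter()
        .filter(|&(_, &last)| now.saturating_sub(last) <= INACTIVITY_THRESHOLD_SECS)
        .map(|(id, _)| id.clone())
        .collect()
}
```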
---

## Appendix: Message Lifecycle States

A message progresses through the following states in the relayer's mempool:

| State | Description |
|-------|-------------|
| **Detected** | Event observed on source chain |
| **Picked Up** | Relayer determines message is assigned to them (primary/secondary/tertiary) |
| **Processing** | Collecting 87/128 committee signatures for source validation (6s timeout) |
| **Submitting to Destination** | Message verified, preparing destination delivery |
| **Message Delivered** | Message executed on destination chain (success) |
| **Message Failed** | Message execution failed on destination chain |
| **Completed** | Execution validated, reward claimed and distributed |

---
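A relayer implementation might track these lifecycle states with a simple enum whose variants mirror the table; the type below is an illustrative assumption (a real implementation would likely attach payloads such as collected signatures or the destination transaction hash).

```rust
/// Relayer-side lifecycle states for a cross-chain message,
/// mirroring the table above. Illustrative only.
#[derive(Debug, Clone, PartialEq, Eq)]
enum MessageState {
    Detected,
    PickedUp,
    Processing,
    SubmittingToDestination,
    MessageDelivered,
    MessageFailed,
    Completed,
}
```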