# `bdk_mempool` — Implementation Plan ## 1. Scope `bdk_mempool` is a Rust library that **loads a Bitcoin Core mempool snapshot and answers two questions**: 1. *Would this transaction (or package) be accepted by the network right now?* 2. *What would a miner do with this mempool? — block-template simulation, eviction order, and feerate-diagram RBF under cluster mempool rules*. - Tx-level standardness validation (post-construction, pre-broadcast) - Package validation (well-formedness, 1-parent-1-child topology) - Replace-by-fee (RBF) acceptance under current both legacy (BIP-125) and cluster (feerate-diagram) rules - TRUC (BIP-431, v3) validation - Ephemeral dust validation - Fee policy checks (min relay, dynamic mempool min, incremental relay) - Mempool topology limits (legacy ancestor/descendant; cluster when available) - Cluster mempool simulation: linearization, chunking, block-template construction, eviction ordering (per Core v31) **NOTE** - RPC client (delegated to `bdk-bitcoind-client`) - Snapshot-based, not live; no incremental relay, mempool eviction, or reorg handling ## 2. Architecture The entire user-facing surface for v0.1.0: ```rust use bdk_mempool::{Mempool, Policy, BlockLimit}; use bitcoin::{Transaction, FeeRate}; // Load let mp = Mempool::from_json(&getrawmempool)?; let mp = Mempool::from_bitcoind(&client)?; // Load with a custom linearizer let mp = Mempool::from_json_with_linearizer(&getrawmempool, AncestorSetGreedy)?; let mp = Mempool::from_bitcoind_with_linearizer(&client, AncestorSetGreedy)?; // Policy let policy = Policy::core_v31(); let policy = Policy::core_v30(); let policy = Policy::core_v29(); let policy = Policy::core_v28(); // Policy with overrides (builder pattern) let policy = Policy::core_v31() .permit_bare_multisig(true) .min_relay_feerate(FeeRate::from_sat_per_vb(2)) .dust_relay_feerate(FeeRate::from_sat_per_vb(3)) .mempool_fullrbf(true) .datacarriersize(1000); // OP_RETURN byte budget; default 83 (pre-v30) or 520 (v30+) // Chain context — required for locktime and BIP68 checks let ctx = ChainContext { tip_height: 850_000, mtp: 1_700_000_000, // median-time-past of tip, Unix seconds }; // All prevouts in input order let prevouts = Prevouts::All(&[txout1, txout2]); // Query let result = mp.check_tx(&tx, &policy) // structural only .with_chain(&ctx) // adds locktime/BIP68 .with_prevouts(&prevouts) // adds input-standardness .run(); let result = mp.check_package(&package, &policy, &ctx, &prevouts); // Miner's-eye queries (cluster mempool, core_v31 only) let template = mp.block_template(BlockLimit::default()); let next_to_evict = mp.eviction_order().next(); ``` **NOTE:** **Policy** will be a builder pattern. ## 3. Policy The current versions suported will be: - `Policy::core_v28()` - `Policy::core_v29()` - `Policy::core_v30()` - `Policy::core_v31()` (cluster mempool) Structural constants like `MAX_STANDARD_TX_WEIGHT`, TRUC limits, package limits, sigop budgets will not be configurable. Mempool Cluster limits exposed only on Policy::core_v31(): - `cluster_count_limit` — max transactions per cluster. Default is `64` - `cluster_vbyte_limit` — max total vbyte size per cluster. Default is `101 kvB` **NOTE:** There will be upstream update request of `core_v31()` to `bdk-bitcoind-client` ## 4. The Checks ### A. Tx-level standardness | # | Check | |---|---| | A.1 | Reject coinbase transactions | | A.2 | Tx version is in {1, 2, 3} (3 = TRUC) | | A.3 | `tx_weight ≤ MAX_STANDARD_TX_WEIGHT` (400,000) | | A.4 | `vin.len() > 0` and `vout.len() > 0` | | A.5 | Each `scriptSig` is push-only and length ≤ 1,650 bytes | | A.6 | Each output script is a recognized standard type (P2PK, P2PKH, P2SH, P2WPKH, P2WSH, P2TR, P2A, OP_RETURN, multisig) | | A.7 | Bare multisig rejected unless `permit_bare_multisig` is set | `IsStandard` | | A.8 | OP_RETURN aggregate byte budget across all OP_RETURN outputs (post-v30 model) | | A.9 | Per-output dust check using fee-rate-aware threshold (dust_relay_feerate) | | A.10 | If `version == 3`: `weight ≤ TRUC_MAX_WEIGHT` (40,000) | | A.11 | Each output value ≥ dust threshold, EXCEPT: tx pays zero fee AND every dust output is spent within the same package (ephemeral dust) | | A.12 | Non-witness serialized size ≥ MIN_STANDARD_TX_NONWITNESS_SIZE (65 bytes) | | A.13 | Total sigop cost ≤ MAX_STANDARD_TX_SIGOPS_COST (80,000) — covers legacy + segwit, separate from per-input P2SH cap | | A.14 | For each output, if scriptPubKey is a witness program, its version is 0 or 1; v2–v16 are non-standard for relay | ### B. Input standardness (needs prevouts) | # | Check | |---|---| | B.1 | P2SH redeemScript sigops ≤ `MAX_P2SH_SIGOPS` (15) | | B.2 | P2WSH: witness stack items ≤ 100, each item ≤ 80 bytes, witness script ≤ 3,600 bytes | | B.3 | Tapscript: stack item ≤ `MAX_STANDARD_TAPSCRIPT_STACK_ITEM_SIZE` | | B.4 | Reject Taproot annex (currently non-standard) | | B.5 | For each input, the prevout scriptPubKey is itself a standard type (same set as #6) | ### C. Package well-formed | # | Check | |---|---| | C.1 | parents appear before children | | C.2 | no duplicate txs, no two txs share a prevout | | C.3 | Package count ≤ 25 | | C.4 | Total package weight ≤ 404,000 | ### D. LockTime (needs chain context) | # | Check | |---|---| | D.1 | `nLockTime` finality vs tip height + MTP | | D.2 | BIP68 sequence locks via nSequence | ### E. Mempool-state checks | # | Check | |---|---| | E.1 | Reject if txid already in mempool | | E.2 | Reject if wtxid already in mempool | | E.3 | Detect missing inputs (prevout not in snapshot or provided prevouts) | | E.4 | Static min relay feerate: `tx_feerate ≥ min_relay_feerate` | | E.5 | mempool min fee: `tx_feerate ≥ mempoolinfo.mempoolminfee` | ### F. RBF (needs mempool lookup) **Legacy (core_v28..core_v30):** | # | Check | |---|---| | F.1.1 | Rule 1 (signaling): if policy disables full-RBF, at least one input of a conflict must signal BIP-125 (nSequence < 0xfffffffe). Skip when mempoolfullrbf is set. | | F.1.2 | Rule 3: replacement_fee ≥ sum(C.descendant_fees) over direct conflicts | | F.1.3 | Rule 4: `replacement_fee − sum(fees over displaced set) ≥ incremental_relay_feerate × replacement_vsize` | | F.1.4 | Rule 5: Collect direct conflicts per input; compute displaced set (conflict + descendants) via snapshot fields. Reject if total displaced count > 100 | **Cluster (core_v31), replacing F.1.1..F.1.4:** | # | Check | |---|---| | F.2.1 | Identify affected clusters (clusters containing any direct conflict or any input of replacement) using `OutPoint → ClusterId` index | | F.2.2 | Reconstruct "after" cluster state: remove conflicts and their descendants, insert replacement (see H.3 for cluster rebuild) | | F.2.3 | Re-linearize affected clusters in the "after" state using the configured `Linearizer` | | F.2.4 | Compute feerate diagrams for affected clusters before and after; pad the shorter diagram with a zero-fee tail; accept iff the "after" diagram is ≥ at every point and > at some point | | F.2.5 | Anti-free-relay floor: `replacement_fee ≥ sum(displaced_fees) + min_relay_feerate × replacement_vsize` | ### G. TRUC stateful | # | Check | |---|---| | G.1 | TRUC tx must have only TRUC unconfirmed ancestors (check direct parents' `version`) | | G.2 | Non-TRUC tx must have only non-TRUC unconfirmed ancestors | | G.3 | **Legacy:** TRUC ancestor count ≤ 2 (incl. self): unconfirmed parent must have no unconfirmed ancestors of its own — assert parent.ancestor_count == 1. **Cluster:** the cluster containing the TRUC tx (after addition) has count ≤ 2. | | G.4 | **Legacy**: TRUC descendant count ≤ 2 (incl. self): parent must have no existing unconfirmed children — assert parent.descendant_count == 1. **Cluster:** subsumed by the cluster-count constraint in #39. | | G.5 | TRUC child vsize ≤ 1,000 when ancestors exist | | G.6 | Sibling eviction: if TRUC parent already has one unconfirmed child, a new child triggers eviction of the existing sibling rather than rejection — return `SiblingEviction { evict: Txid }` caller decides whether to proceed | ### H. Cluster mempool It is computed once at snapshot load; powers F cluster-era, block-template queries, and eviction. Layered subsystem; the order below is the build order. #### H.1 Arithmetic — FeeFrac ```rs struct FeeFrac { fee: i64, size: u32 } ``` - Feerate comparison via `i128` cross-multiply (`fee_a × size_b` vs `fee_b × size_a`); never float - Add/sub for chunk merges. #### H.2 Ancestor Feerate Subroutine (CPFP core) Used by AncestorSetGreedy (H.4) to find the best candidate at each iteration. ```rust /// Compute the effective feerate of a tx together with all its unconfirmed ancestors. /// ancestor_set(tx) = { tx } ∪ { all unconfirmed ancestors reachable via depends[] } fn ancestor_feerate(tx: TxId, graph: &DepGraph) -> FeeFrac { let set = ancestor_set(tx, graph); // BitSet walk, O(n) per call FeeFrac { fee: set.iter().map(|t| graph[t].fee).sum(), size: set.iter().map(|t| graph[t].vsize).sum(), } } ``` This is the fee-bumping mechanism underlying CPFP: a child's fee raises the effective feerate of the entire ancestor set, causing the miner to include all ancestors together. #### H.3 BitSet ```rust= // N = cluster_count_limit / 64; default cluster_count_limit = 64, so N = 1 struct BitSet<const N: usize>([u64; N]); ``` - Stack-allocated, `Copy`, O(1) set ops (union, intersection, difference, contains) - Foundation for `DepGraph` and `Chunk` #### H.4 Cluster / DepGraph ```rs struct TxEntry { txid: Txid, fee: i64, // ≥ 0 vsize: u32, parents: BitSet<N>, // direct unconfirmed parents within cluster children: BitSet<N>, // direct unconfirmed children within cluster } struct Cluster { txs: Vec<TxEntry>, id: ClusterId, } ``` Membership indexes (built once at load, keyed for O(1) lookup): ```rs struct MempoolIndex { // Primary lookup by_txid: HashMap<Txid, ClusterId>, by_outpoint: HashMap<OutPoint, ClusterId>, // maps each output to its tx's cluster // For RBF fast-path: find affected clusters without full traversal clusters: Vec<Cluster>, } ``` **Cluster rebuild for RBF "after" state:** When evaluating a replacement, affected clusters must be reconstructed without the evicted transactions before re-linearization. The procedure is: 1. Identify all `displaced = direct_conflicts ∪ their descendants` via snapshot `descendant` fields. 2. For each affected cluster: filter out `displaced` txs to produce a reduced `DepGraph` 3. Insert the replacement tx into the reduced graph, connecting its edges 4. Run connected-component analysis on the reduced graph — evicting txs may split a cluster into two or more components; each component becomes a new `Cluster` 5. Re-linearize each resulting cluster with the configured `Linearizer` Step 4 (potential cluster split) is O(n) per affected cluster via BFS/DFS over the reduced adjacency list. This is the most expensive part of the RBF fast-path. Built once at load by union-find over the snapshot's `depends` field; snapshot does not expose cluster membership directly #### H.5 Linearizer trait ```rust pub trait Linearizer { fn linearize(&self, c: &Cluster, prior: Option<&Linearization>) -> Linearization; } pub struct Linearization(Vec<TxId>); ``` Implementation: 1. `AncestorSetGreedy` — repeatedly pick the tx whose ancestor set (see H.2) has the highest effective feerate; emit that ancestor set as a chunk; remove it; repeat. O(n²); deterministic. Satisfies the invariant: each emitted chunk's feerate ≥ the best remaining ancestor-set feerate. 2. `Limo<F: SubsetFinder>` — improves an input linearization using a subset finder; pre-SFL Core algorithm; O(n²). 3. `SpanningForest { rng_seed: u64 }` — current Core (v31) algorithm; very fast on real clusters (~20–50 µs for 70–100 tx) but no termination bound, requires randomization. #### H.6 Chunking Derived from a linearization in a single O(n) pass: append each tx as a singleton chunk; while last chunk's feerate is less than the new chunk's, merge them ```rs struct Chunk { txs: BitSet<N>, ff: FeeFrac, } struct Chunking { chunks: Vec<Chunk>, // feerates are non-increasing by construction } ``` #### H.7 Block-template simulation ```rs BinaryHeap<Reverse<(FeeFrac, ClusterId, ChunkIdx)>> // ordered by descending feerate ``` - Pop the highest-feerate chunk, advance that cluster's chunk cursor, push the next chunk from the same cluster - Stop when the next chunk would exceed `BlockLimit` (weight or sigops) - O(total_chunks · log K) where K is the number of clusters #### H.8 Eviction ordering Iterator over chunks across all clusters in ascending feerate order. Exact mirror of H.7 with a reversed heap. Identifies the "would be mined last" set for size-limit eviction. #### H.9 Feerate diagram comparison (implements F.36c and F.36d) ```rs /// Piecewise-linear (cumulative_vbytes, cumulative_fee) derived from a Chunking. struct Diagram(Vec<(u64, i64)>); enum DiagramCmp { Better, Worse, Equal, Incomparable } fn compare(old: &Diagram, new: &Diagram) -> DiagramCmp { // Pad the shorter diagram with a zero-fee tail to equal the longer's total size. // Walk both simultaneously; track whether new is ever < old or ever > old. // O(|old| + |new|) } ``` Accept the replacement iff `compare(before, after) == DiagramCmp::Better`. ## 5. NOTE - All cluster-mempool computation is batch, performed at `from_json` / `from_bitcoind` time. A large mempool (100k txs) with `AncestorSetGreedy` will take O(n²) across all clusters; document this in the crate README with a benchmark. - `cluster_count_limit` and `cluster_vbyte_limit` are part of `Policy`; pre-v31 policies leave the cluster-mempool layer present but unused. - Linearizer is pluggable via `from_json_with_linearizer` / `from_bitcoind_with_linearizer`; the default is `AncestorSetGreedy`. -The RBF fast-path in F.2 re-linearizes only affected clusters, not the whole mempool. Cluster membership lookup is O(1) via `MempoolIndex`. ## 6. Glossary - `AncestorSetGreedy` — the simplest correct algorithm. Repeatedly find the highest-feerate ancestor set (a tx plus all its still-unconfirmed ancestors), emit it, remove it, repeat. - `ancestor_feerate(tx)` — the effective feerate of a tx and all its unconfirmed ancestors; the core CPFP subroutine. See H.2. - `Chunk` — a contiguous group of transactions from a linearization that are mined together because merging them maximizes feerate. Each chunk has a non-increasing feerate relative to the previous. - `ChainContext` — tip height and MTP required for locktime and BIP68 checks. - `Diagram` — the piecewise-linear (cumulative_vbytes, cumulative_fee) curve derived from a Chunking; used for feerate-diagram RBF comparison. - `FeeFrac` — a (fee, size) pair supporting exact rational feerate comparison via i128 cross-multiplication. - `Prevouts` — map of `OutPoint → TxOut`, providing the data needed for input-standardness and missing-input checks. - `Linearization` — an ordered sequence of txids within a cluster such that no tx appears before its parents. - `Limo<F: SubsetFinder>` — "Linearization through Incremental Merging of Optimizations." Takes an existing linearization and improves it. - `SpanningForest { rng_seed: u64 }` — Models the cluster as a spanning forest and does merge/split steps until it converges on the optimal linearization. Very fast on real clusters. ## 7. REFERENCES FOR CLUSTER MEMPOOL - https://github.com/bitcoin/bitcoin/issues/25584 - https://delvingbitcoin.org/t/an-overview-of-the-cluster-mempool-proposal/393 - https://github.com/bitcoin/bitcoin/issues/27677 - https://github.com/bitcoin/bitcoin/pull/34616 - https://github.com/bitcoin/bitcoin/pull/33157 - https://github.com/bitcoin/bitcoin/issues/30289 - https://github.com/bitcoin/bitcoin/pull/26451 - https://github.com/bitcoin/bitcoin/issues/29319