# `bdk_mempool` — Implementation Plan
## 1. Scope
`bdk_mempool` is a Rust library that **loads a Bitcoin Core mempool snapshot and answers two questions**:
1. *Would this transaction (or package) be accepted by the network right now?*
2. *What would a miner do with this mempool? — block-template simulation, eviction order, and feerate-diagram RBF under cluster mempool rules*.
- Tx-level standardness validation (post-construction, pre-broadcast)
- Package validation (well-formedness, 1-parent-1-child topology)
- Replace-by-fee (RBF) acceptance under current both legacy (BIP-125) and cluster (feerate-diagram) rules
- TRUC (BIP-431, v3) validation
- Ephemeral dust validation
- Fee policy checks (min relay, dynamic mempool min, incremental relay)
- Mempool topology limits (legacy ancestor/descendant; cluster when available)
- Cluster mempool simulation: linearization, chunking, block-template construction, eviction ordering (per Core v31)
**NOTE**
- RPC client (delegated to `bdk-bitcoind-client`)
- Snapshot-based, not live; no incremental relay, mempool eviction, or reorg handling
## 2. Architecture
The entire user-facing surface for v0.1.0:
```rust
use bdk_mempool::{Mempool, Policy, BlockLimit};
use bitcoin::{Transaction, FeeRate};
// Load
let mp = Mempool::from_json(&getrawmempool)?;
let mp = Mempool::from_bitcoind(&client)?;
// Load with a custom linearizer
let mp = Mempool::from_json_with_linearizer(&getrawmempool, AncestorSetGreedy)?;
let mp = Mempool::from_bitcoind_with_linearizer(&client, AncestorSetGreedy)?;
// Policy
let policy = Policy::core_v31();
let policy = Policy::core_v30();
let policy = Policy::core_v29();
let policy = Policy::core_v28();
// Policy with overrides (builder pattern)
let policy = Policy::core_v31()
.permit_bare_multisig(true)
.min_relay_feerate(FeeRate::from_sat_per_vb(2))
.dust_relay_feerate(FeeRate::from_sat_per_vb(3))
.mempool_fullrbf(true)
.datacarriersize(1000); // OP_RETURN byte budget; default 83 (pre-v30) or 520 (v30+)
// Chain context — required for locktime and BIP68 checks
let ctx = ChainContext {
tip_height: 850_000,
mtp: 1_700_000_000, // median-time-past of tip, Unix seconds
};
// All prevouts in input order
let prevouts = Prevouts::All(&[txout1, txout2]);
// Query
let result = mp.check_tx(&tx, &policy) // structural only
.with_chain(&ctx) // adds locktime/BIP68
.with_prevouts(&prevouts) // adds input-standardness
.run();
let result = mp.check_package(&package, &policy, &ctx, &prevouts);
// Miner's-eye queries (cluster mempool, core_v31 only)
let template = mp.block_template(BlockLimit::default());
let next_to_evict = mp.eviction_order().next();
```
**NOTE:** **Policy** will be a builder pattern.
## 3. Policy
The current versions suported will be:
- `Policy::core_v28()`
- `Policy::core_v29()`
- `Policy::core_v30()`
- `Policy::core_v31()` (cluster mempool)
Structural constants like `MAX_STANDARD_TX_WEIGHT`, TRUC limits, package limits, sigop budgets will not be configurable.
Mempool Cluster limits exposed only on Policy::core_v31():
- `cluster_count_limit` — max transactions per cluster. Default is `64`
- `cluster_vbyte_limit` — max total vbyte size per cluster. Default is `101 kvB`
**NOTE:** There will be upstream update request of `core_v31()` to `bdk-bitcoind-client`
## 4. The Checks
### A. Tx-level standardness
| # | Check |
|---|---|
| A.1 | Reject coinbase transactions |
| A.2 | Tx version is in {1, 2, 3} (3 = TRUC) |
| A.3 | `tx_weight ≤ MAX_STANDARD_TX_WEIGHT` (400,000) |
| A.4 | `vin.len() > 0` and `vout.len() > 0` |
| A.5 | Each `scriptSig` is push-only and length ≤ 1,650 bytes |
| A.6 | Each output script is a recognized standard type (P2PK, P2PKH, P2SH, P2WPKH, P2WSH, P2TR, P2A, OP_RETURN, multisig) |
| A.7 | Bare multisig rejected unless `permit_bare_multisig` is set | `IsStandard` |
| A.8 | OP_RETURN aggregate byte budget across all OP_RETURN outputs (post-v30 model) |
| A.9 | Per-output dust check using fee-rate-aware threshold (dust_relay_feerate) |
| A.10 | If `version == 3`: `weight ≤ TRUC_MAX_WEIGHT` (40,000) |
| A.11 | Each output value ≥ dust threshold, EXCEPT: tx pays zero fee AND every dust output is spent within the same package (ephemeral dust) |
| A.12 | Non-witness serialized size ≥ MIN_STANDARD_TX_NONWITNESS_SIZE (65 bytes) |
| A.13 | Total sigop cost ≤ MAX_STANDARD_TX_SIGOPS_COST (80,000) — covers legacy + segwit, separate from per-input P2SH cap |
| A.14 | For each output, if scriptPubKey is a witness program, its version is 0 or 1; v2–v16 are non-standard for relay |
### B. Input standardness (needs prevouts)
| # | Check |
|---|---|
| B.1 | P2SH redeemScript sigops ≤ `MAX_P2SH_SIGOPS` (15) |
| B.2 | P2WSH: witness stack items ≤ 100, each item ≤ 80 bytes, witness script ≤ 3,600 bytes |
| B.3 | Tapscript: stack item ≤ `MAX_STANDARD_TAPSCRIPT_STACK_ITEM_SIZE` |
| B.4 | Reject Taproot annex (currently non-standard) |
| B.5 | For each input, the prevout scriptPubKey is itself a standard type (same set as #6) |
### C. Package well-formed
| # | Check |
|---|---|
| C.1 | parents appear before children |
| C.2 | no duplicate txs, no two txs share a prevout |
| C.3 | Package count ≤ 25 |
| C.4 | Total package weight ≤ 404,000 |
### D. LockTime (needs chain context)
| # | Check |
|---|---|
| D.1 | `nLockTime` finality vs tip height + MTP |
| D.2 | BIP68 sequence locks via nSequence |
### E. Mempool-state checks
| # | Check |
|---|---|
| E.1 | Reject if txid already in mempool |
| E.2 | Reject if wtxid already in mempool |
| E.3 | Detect missing inputs (prevout not in snapshot or provided prevouts) |
| E.4 | Static min relay feerate: `tx_feerate ≥ min_relay_feerate` |
| E.5 | mempool min fee: `tx_feerate ≥ mempoolinfo.mempoolminfee` |
### F. RBF (needs mempool lookup)
**Legacy (core_v28..core_v30):**
| # | Check |
|---|---|
| F.1.1 | Rule 1 (signaling): if policy disables full-RBF, at least one input of a conflict must signal BIP-125 (nSequence < 0xfffffffe). Skip when mempoolfullrbf is set. |
| F.1.2 | Rule 3: replacement_fee ≥ sum(C.descendant_fees) over direct conflicts |
| F.1.3 | Rule 4: `replacement_fee − sum(fees over displaced set) ≥ incremental_relay_feerate × replacement_vsize` |
| F.1.4 | Rule 5: Collect direct conflicts per input; compute displaced set (conflict + descendants) via snapshot fields. Reject if total displaced count > 100 |
**Cluster (core_v31), replacing F.1.1..F.1.4:**
| # | Check |
|---|---|
| F.2.1 | Identify affected clusters (clusters containing any direct conflict or any input of replacement) using `OutPoint → ClusterId` index |
| F.2.2 | Reconstruct "after" cluster state: remove conflicts and their descendants, insert replacement (see H.3 for cluster rebuild) |
| F.2.3 | Re-linearize affected clusters in the "after" state using the configured `Linearizer` |
| F.2.4 | Compute feerate diagrams for affected clusters before and after; pad the shorter diagram with a zero-fee tail; accept iff the "after" diagram is ≥ at every point and > at some point |
| F.2.5 | Anti-free-relay floor: `replacement_fee ≥ sum(displaced_fees) + min_relay_feerate × replacement_vsize` |
### G. TRUC stateful
| # | Check |
|---|---|
| G.1 | TRUC tx must have only TRUC unconfirmed ancestors (check direct parents' `version`) |
| G.2 | Non-TRUC tx must have only non-TRUC unconfirmed ancestors |
| G.3 | **Legacy:** TRUC ancestor count ≤ 2 (incl. self): unconfirmed parent must have no unconfirmed ancestors of its own — assert parent.ancestor_count == 1. **Cluster:** the cluster containing the TRUC tx (after addition) has count ≤ 2. |
| G.4 | **Legacy**: TRUC descendant count ≤ 2 (incl. self): parent must have no existing unconfirmed children — assert parent.descendant_count == 1. **Cluster:** subsumed by the cluster-count constraint in #39. |
| G.5 | TRUC child vsize ≤ 1,000 when ancestors exist |
| G.6 | Sibling eviction: if TRUC parent already has one unconfirmed child, a new child triggers eviction of the existing sibling rather than rejection — return `SiblingEviction { evict: Txid }` caller decides whether to proceed |
### H. Cluster mempool
It is computed once at snapshot load; powers F cluster-era, block-template queries, and eviction.
Layered subsystem; the order below is the build order.
#### H.1 Arithmetic — FeeFrac
```rs
struct FeeFrac {
fee: i64,
size: u32
}
```
- Feerate comparison via `i128` cross-multiply (`fee_a × size_b` vs `fee_b × size_a`); never float
- Add/sub for chunk merges.
#### H.2 Ancestor Feerate Subroutine (CPFP core)
Used by AncestorSetGreedy (H.4) to find the best candidate at each iteration.
```rust
/// Compute the effective feerate of a tx together with all its unconfirmed ancestors.
/// ancestor_set(tx) = { tx } ∪ { all unconfirmed ancestors reachable via depends[] }
fn ancestor_feerate(tx: TxId, graph: &DepGraph) -> FeeFrac {
let set = ancestor_set(tx, graph); // BitSet walk, O(n) per call
FeeFrac {
fee: set.iter().map(|t| graph[t].fee).sum(),
size: set.iter().map(|t| graph[t].vsize).sum(),
}
}
```
This is the fee-bumping mechanism underlying CPFP: a child's fee raises the effective feerate of the entire ancestor set, causing the miner to include all ancestors together.
#### H.3 BitSet
```rust=
// N = cluster_count_limit / 64; default cluster_count_limit = 64, so N = 1
struct BitSet<const N: usize>([u64; N]);
```
- Stack-allocated, `Copy`, O(1) set ops (union, intersection, difference, contains)
- Foundation for `DepGraph` and `Chunk`
#### H.4 Cluster / DepGraph
```rs
struct TxEntry {
txid: Txid,
fee: i64, // ≥ 0
vsize: u32,
parents: BitSet<N>, // direct unconfirmed parents within cluster
children: BitSet<N>, // direct unconfirmed children within cluster
}
struct Cluster {
txs: Vec<TxEntry>,
id: ClusterId,
}
```
Membership indexes (built once at load, keyed for O(1) lookup):
```rs
struct MempoolIndex {
// Primary lookup
by_txid: HashMap<Txid, ClusterId>,
by_outpoint: HashMap<OutPoint, ClusterId>, // maps each output to its tx's cluster
// For RBF fast-path: find affected clusters without full traversal
clusters: Vec<Cluster>,
}
```
**Cluster rebuild for RBF "after" state:**
When evaluating a replacement, affected clusters must be reconstructed without the evicted transactions before re-linearization. The procedure is:
1. Identify all `displaced = direct_conflicts ∪ their descendants` via snapshot `descendant` fields.
2. For each affected cluster: filter out `displaced` txs to produce a reduced `DepGraph`
3. Insert the replacement tx into the reduced graph, connecting its edges
4. Run connected-component analysis on the reduced graph — evicting txs may split a cluster into two or more components; each component becomes a new `Cluster`
5. Re-linearize each resulting cluster with the configured `Linearizer`
Step 4 (potential cluster split) is O(n) per affected cluster via BFS/DFS over the reduced adjacency list. This is the most expensive part of the RBF fast-path.
Built once at load by union-find over the snapshot's `depends` field; snapshot does not expose cluster membership directly
#### H.5 Linearizer trait
```rust
pub trait Linearizer {
fn linearize(&self, c: &Cluster, prior: Option<&Linearization>) -> Linearization;
}
pub struct Linearization(Vec<TxId>);
```
Implementation:
1. `AncestorSetGreedy` — repeatedly pick the tx whose ancestor set (see H.2) has the highest effective feerate; emit that ancestor set as a chunk; remove it; repeat. O(n²); deterministic. Satisfies the invariant: each emitted chunk's feerate ≥ the best remaining ancestor-set feerate.
2. `Limo<F: SubsetFinder>` — improves an input linearization using a subset finder; pre-SFL Core algorithm; O(n²).
3. `SpanningForest { rng_seed: u64 }` — current Core (v31) algorithm; very fast on real clusters (~20–50 µs for 70–100 tx) but no termination bound, requires randomization.
#### H.6 Chunking
Derived from a linearization in a single O(n) pass: append each tx as a singleton chunk; while last chunk's feerate is less than the new chunk's, merge them
```rs
struct Chunk {
txs: BitSet<N>,
ff: FeeFrac,
}
struct Chunking {
chunks: Vec<Chunk>, // feerates are non-increasing by construction
}
```
#### H.7 Block-template simulation
```rs
BinaryHeap<Reverse<(FeeFrac, ClusterId, ChunkIdx)>> // ordered by descending feerate
```
- Pop the highest-feerate chunk, advance that cluster's chunk cursor, push the next chunk from the same cluster
- Stop when the next chunk would exceed `BlockLimit` (weight or sigops)
- O(total_chunks · log K) where K is the number of clusters
#### H.8 Eviction ordering
Iterator over chunks across all clusters in ascending feerate order. Exact mirror of H.7 with a reversed heap. Identifies the "would be mined last" set for size-limit eviction.
#### H.9 Feerate diagram comparison (implements F.36c and F.36d)
```rs
/// Piecewise-linear (cumulative_vbytes, cumulative_fee) derived from a Chunking.
struct Diagram(Vec<(u64, i64)>);
enum DiagramCmp { Better, Worse, Equal, Incomparable }
fn compare(old: &Diagram, new: &Diagram) -> DiagramCmp {
// Pad the shorter diagram with a zero-fee tail to equal the longer's total size.
// Walk both simultaneously; track whether new is ever < old or ever > old.
// O(|old| + |new|)
}
```
Accept the replacement iff `compare(before, after) == DiagramCmp::Better`.
## 5. NOTE
- All cluster-mempool computation is batch, performed at `from_json` / `from_bitcoind` time. A large mempool (100k txs) with `AncestorSetGreedy` will take O(n²) across all clusters; document this in the crate README with a benchmark.
- `cluster_count_limit` and `cluster_vbyte_limit` are part of `Policy`; pre-v31 policies leave the cluster-mempool layer present but unused.
- Linearizer is pluggable via `from_json_with_linearizer` / `from_bitcoind_with_linearizer`; the default is `AncestorSetGreedy`.
-The RBF fast-path in F.2 re-linearizes only affected clusters, not the whole mempool. Cluster membership lookup is O(1) via `MempoolIndex`.
## 6. Glossary
- `AncestorSetGreedy` — the simplest correct algorithm. Repeatedly find the highest-feerate ancestor set (a tx plus all its still-unconfirmed ancestors), emit it, remove it, repeat.
- `ancestor_feerate(tx)` — the effective feerate of a tx and all its unconfirmed ancestors; the core CPFP subroutine. See H.2.
- `Chunk` — a contiguous group of transactions from a linearization that are mined together because merging them maximizes feerate. Each chunk has a non-increasing feerate relative to the previous.
- `ChainContext` — tip height and MTP required for locktime and BIP68 checks.
- `Diagram` — the piecewise-linear (cumulative_vbytes, cumulative_fee) curve derived from a Chunking; used for feerate-diagram RBF comparison.
- `FeeFrac` — a (fee, size) pair supporting exact rational feerate comparison via i128 cross-multiplication.
- `Prevouts` — map of `OutPoint → TxOut`, providing the data needed for input-standardness and missing-input checks.
- `Linearization` — an ordered sequence of txids within a cluster such that no tx appears before its parents.
- `Limo<F: SubsetFinder>` — "Linearization through Incremental Merging of Optimizations." Takes an existing linearization and improves it.
- `SpanningForest { rng_seed: u64 }` — Models the cluster as a spanning forest and does merge/split steps until it converges on the optimal linearization. Very fast on real clusters.
## 7. REFERENCES FOR CLUSTER MEMPOOL
- https://github.com/bitcoin/bitcoin/issues/25584
- https://delvingbitcoin.org/t/an-overview-of-the-cluster-mempool-proposal/393
- https://github.com/bitcoin/bitcoin/issues/27677
- https://github.com/bitcoin/bitcoin/pull/34616
- https://github.com/bitcoin/bitcoin/pull/33157
- https://github.com/bitcoin/bitcoin/issues/30289
- https://github.com/bitcoin/bitcoin/pull/26451
- https://github.com/bitcoin/bitcoin/issues/29319