# Bitcoin Light Clients Part II: Compact Block Filters
## Introduction
In a prior article, [Part I: Bloom Filters](https://enigbe.medium.com/part-i-bloom-filters-164d046f34b6), on simplified payment verification (SPV) for light clients, I explored how light clients use bloom filters for transaction verification, highlighting why these filters were introduced and some known flaws in their usage. The challenges mostly relate to privacy leaks, incentive misalignment for full nodes, and trust.
The key takeaway from Part I is this: BIP 37 bloom filters are unsuitable for use, and a new method of payment verification for light clients must address their inherent design flaws. The BIP 37 peer-to-peer protocol has also been [disabled](https://bitcoinops.org/en/newsletters/2019/11/27/#deprecated-or-removed-features) by [default](https://bitcoinops.org/en/newsletters/2019/07/31/#bloom-filter-discussion) in Bitcoin Core.
[BIP 157](https://github.com/bitcoin/bips/blob/master/bip-0157.mediawiki) proposes a new protocol that moves transaction filtering to the client side, so that light clients connected to at least one honest node can securely verify transactions included in a block without loss of privacy or reliance on trusted full nodes. It also eliminates the I/O asymmetry between light and full nodes, and minimizes the DoS vector that full nodes were vulnerable to. The proposal describes probabilistic compact block filters, whose concrete construction is specified in [BIP 158](https://github.com/bitcoin/bips/blob/master/bip-0158.mediawiki).
In this article, I will share what I have learned about the need for client-side transaction filtering, what compact block filters are, and how they are used in the network.
## Client-side Transaction Filtering
With bloom filters, the SPV client constructs a bloom filter and sends it to the full node. The full node then compares each transaction it receives against all of the bloom filters it has been sent, looking for a match. This places most of the computational overhead on the full node, which makes the scheme incentive-incompatible. With client-side transaction filtering, this asymmetry is flipped.
Full nodes calculate one compact block filter (CBF) per block, regardless of the number of light clients connected to them. With CBFs, light clients check whether any of their scriptpubkeys match the set encoded in a filter, and download a block only if there is a match or the possibility of a match.
Because light clients never send their wallet addresses to peers, and can download matching blocks from different peers so that no single peer learns which blocks are of interest, their privacy is preserved much better than with bloom filters.
## Compact Block Filters
A compact block filter is a condensed representation of the transactions in a block.
![Compact Block Filter Image](https://i.imgur.com/4F5HxAt.png)
In the code block below, I have attached a simple Rust program to get the CBF of a specified block.
```rust=src/main.rs
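use bitcoin::hashes::hex::ToHex;
use bitcoincore_rpc::{Auth, Client, RpcApi};

/// Hex-encoded view of the filter data returned by bitcoind.
#[derive(Debug)]
pub struct BlockFilter {
    pub header: String,
    pub filter: String,
}

fn main() {
    // RPC endpoint of a local bitcoind (38332 is the default signet RPC port).
    // The node must run with `-blockfilterindex=1` to serve `getblockfilter`.
    let core_rpc_url = "http://localhost:38332";
    let rpc = Client::new(
        core_rpc_url,
        Auth::UserPass(
            "<bitcoind_username>".to_string(),
            "<bitcoind_password>".to_string(),
        ),
    )
    .unwrap();

    // Look up the block hash at the chosen height, then fetch its filter.
    let block_height = 88000;
    let block_hash = rpc.get_block_hash(block_height).unwrap();
    let block_filter = rpc.get_block_filter(&block_hash).unwrap();

    // Print the filter header and the raw filter bytes as hex strings.
    println!(
        "{:?}",
        BlockFilter {
            header: block_filter.header.to_string(),
            filter: block_filter.filter.to_hex(),
        }
    );
}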
```
Shell output
```bash=
BlockFilter {
header: "8be72397110f26fd602ef0607698cc4d796ab0d05891132150c81049e8995151",
filter: "340cfdbecf2a0399fbbbd50aa1e39fa6ec83cc5fe30a6b377d4872ae9c932bdbdb78d4a5ad8ec0110ca55d105c089896a0fffe0a41e132b27d45e9166cdaade080a26299fd71238ebf9b87fefebff71b81b64b6e1cbe8b1dcab9868b9bcbd45928b33ca926ac82a853c0444777e680d83a0947332e04f20a74c8e16b3a1dc79f656403028d6af9d29600"
}
```
### Construction
Full nodes construct compact block filters for each block by:
1. Iterating over each transaction in the block, selecting and adding all scriptpubkeys, i.e. those spent by inputs and those created by outputs, to a list. OP_RETURN output scripts and coinbase inputs are ignored because the former are provably unspendable, and the latter reference no UTXOs.
2. Hashing the list's elements using a SipHash function, keyed with the first 16 bytes of the block hash, which produces a uniformly distributed 64-bit number per element, sorting the hashed list, and removing duplicated elements.
3. Mapping the resulting numbers onto the range $$ 0 \le number < F $$
where
$$
F = N * M
$$
and **N** is the number of elements in the list and **M** is the inverse of the target false positive rate, i.e. the false positive rate is 1/M.
4. Calculating the Golomb-Rice code of the difference between each mapped number and its predecessor in the sorted list.
5. Constructing a Golomb Coded Set (GCS), a lossless compressed encoding of the list, parametrized by the key used in the SipHash function, the parameter M, the bit parameter of Golomb-Rice, and the vector of N items.
The GCS is the block filter for the given block.
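Steps 3 to 5 can be sketched in a few lines of Rust. This is a minimal illustration rather than a reference implementation: it assumes the 64-bit SipHash outputs from step 2 are already available, it uses the BIP 158 basic filter parameters (P = 19, M = 784931), and `BitWriter`, `map_to_range`, and `build_gcs` are hypothetical helper names of my own. It also leaves out the element count that BIP 158 prepends to the serialized filter.

```rust
/// BIP 158 parameters for the basic filter type.
const P: u8 = 19; // Golomb-Rice bit parameter
const M: u64 = 784_931; // inverse of the target false positive rate

/// Map a uniformly distributed 64-bit hash into [0, f) without a modulo.
fn map_to_range(hash: u64, f: u64) -> u64 {
    ((hash as u128 * f as u128) >> 64) as u64
}

/// Minimal MSB-first bit writer (hypothetical helper for this sketch).
struct BitWriter {
    bytes: Vec<u8>,
    used: u8, // bits already used in the last byte (0..=7)
}

impl BitWriter {
    fn new() -> Self {
        BitWriter { bytes: Vec::new(), used: 0 }
    }

    fn push_bit(&mut self, bit: bool) {
        if self.used == 0 {
            self.bytes.push(0);
        }
        if bit {
            let last = self.bytes.len() - 1;
            self.bytes[last] |= 1 << (7 - self.used);
        }
        self.used = (self.used + 1) % 8;
    }
}

/// Golomb-Rice code: the quotient in unary (1-bits closed by a 0-bit),
/// then the remainder as `p` bits, most significant bit first.
fn golomb_rice_encode(w: &mut BitWriter, value: u64, p: u8) {
    for _ in 0..(value >> p) {
        w.push_bit(true);
    }
    w.push_bit(false);
    for i in (0..p).rev() {
        w.push_bit((value >> i) & 1 == 1);
    }
}

/// Map hashes onto [0, N * M), sort them, and encode successive differences.
fn build_gcs(hashes: &[u64]) -> Vec<u8> {
    let f = hashes.len() as u64 * M;
    let mut mapped: Vec<u64> = hashes.iter().map(|&h| map_to_range(h, f)).collect();
    mapped.sort_unstable();
    let mut w = BitWriter::new();
    let mut prev = 0;
    for v in mapped {
        golomb_rice_encode(&mut w, v - prev, P);
        prev = v;
    }
    w.bytes
}

fn main() {
    // Made-up 64-bit values standing in for SipHash outputs.
    let hashes = [0x9e37_79b9_7f4a_7c15_u64, 0x0123_4567_89ab_cdef, 0xfedc_ba98_7654_3210];
    let filter = build_gcs(&hashes);
    println!("{} hashes -> {} bytes", hashes.len(), filter.len());
}
```

Encoding differences instead of absolute values is what makes the set compact: after sorting, the gaps between neighbouring numbers are small on average, so most of them fit in a short unary prefix plus P remainder bits.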
### Usage
Light clients can, upon receiving the block filter, check for a scriptpubkey match by:
1. Reconstructing the sorted list of mapped numbers by reversing the compression done when the GCS was created.
2. Hashing the scriptpubkeys in their wallets with the same SipHash key, mapping them onto the same range, and sorting them.
3. Checking each wallet item against the decompressed set individually, or finding the intersection of the lists from steps 1 and 2.
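The matching steps above can be sketched as follows. Again this is only an illustration under stated assumptions: the wallet values are taken to be already hashed and mapped onto the filter's range, the bit order is most significant bit first, and `BitReader`, `decode_gcs`, and `matches_any` are hypothetical names. The example in `main` uses a toy Golomb-Rice parameter of 2 so the bits can be checked by hand; real basic filters use 19.

```rust
use std::cmp::Ordering;

/// Minimal MSB-first bit reader (hypothetical helper for this sketch).
struct BitReader<'a> {
    bytes: &'a [u8],
    pos: usize, // index of the next bit to read
}

impl<'a> BitReader<'a> {
    fn new(bytes: &'a [u8]) -> Self {
        BitReader { bytes, pos: 0 }
    }

    /// Returns `None` once the input is exhausted.
    fn read_bit(&mut self) -> Option<bool> {
        let byte = *self.bytes.get(self.pos / 8)?;
        let bit = (byte >> (7 - self.pos % 8)) & 1 == 1;
        self.pos += 1;
        Some(bit)
    }
}

/// Inverse of Golomb-Rice encoding: count 1-bits up to the closing 0-bit
/// (the quotient), then read `p` remainder bits.
fn golomb_rice_decode(r: &mut BitReader, p: u8) -> Option<u64> {
    let mut quotient = 0u64;
    while r.read_bit()? {
        quotient += 1;
    }
    let mut remainder = 0u64;
    for _ in 0..p {
        remainder = (remainder << 1) | r.read_bit()? as u64;
    }
    Some((quotient << p) | remainder)
}

/// Step 1: rebuild the sorted values by summing the decoded differences.
fn decode_gcs(filter: &[u8], n: usize, p: u8) -> Option<Vec<u64>> {
    let mut reader = BitReader::new(filter);
    let mut values = Vec::with_capacity(n);
    let mut last = 0u64;
    for _ in 0..n {
        last += golomb_rice_decode(&mut reader, p)?;
        values.push(last);
    }
    Some(values)
}

/// Step 3: walk both sorted lists in step and report any shared value.
fn matches_any(filter_values: &[u64], wallet_values: &[u64]) -> bool {
    let (mut i, mut j) = (0, 0);
    while i < filter_values.len() && j < wallet_values.len() {
        match filter_values[i].cmp(&wallet_values[j]) {
            Ordering::Equal => return true,
            Ordering::Less => i += 1,
            Ordering::Greater => j += 1,
        }
    }
    false
}

fn main() {
    // Two differences (9, then 3) encoded with p = 2: bits 11001 011.
    let filter = [0xCB_u8];
    let values = decode_gcs(&filter, 2, 2).expect("truncated filter");
    println!("decoded: {:?}", values);
    println!("match: {}", matches_any(&values, &[5, 12]));
}
```

Because both lists are sorted, the intersection check is a single linear pass, and a client can stop decoding as soon as it finds the first match.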
## Benefits
There are a handful of benefits of using block filters in comparison with bloom filters. Here I highlight the most obvious.
CBFs have `header`s, each of which commits to the hash of the current filter and to the previous filter's header. These headers make it possible to form a chain of block filters as shown below
$$
header = H(H(current_{filter}) \| prev_{header})
$$
where **H** is double-SHA256 and $$ \| $$ denotes concatenation.
![Block Filter Chain](https://i.imgur.com/Z1WFLiU.png)
This chain lets light clients compare filter headers from different full nodes, monitor for filter chain divergence, and tell if a node is sending false information. Unlike bloom filters, which leave no deterministic artifacts to compare, block filters and their headers are the same on every honest node. This reduces the trust light clients must place in full nodes, because a node can no longer omit transaction information with little risk of detection [[3]](https://github.com/bitcoin/bips/blob/master/bip-0157.mediawiki#Motivation)[[1]](https://bitcoin-dev.blog/blog/bip158-deep-dive/).
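A rough sketch of how a client might build and compare such chains is shown below. It follows my reading of BIP 157, where each header is the hash of the current filter's hash concatenated with the previous header, and the header before the first filter is 32 zero bytes. To keep the example free of dependencies the hash function is passed in as a parameter, and `toy_hash` is a non-cryptographic stand-in that I made up for the demo; a real client would use double-SHA256. `filter_headers` and `first_divergence` are likewise hypothetical names.

```rust
type Hash32 = [u8; 32];

/// Chain filter headers: header_n = H(H(filter_n) || header_{n-1}).
/// `hash` is a parameter so the sketch stays dependency-free.
fn filter_headers<H: Fn(&[u8]) -> Hash32>(filters: &[Vec<u8>], hash: H) -> Vec<Hash32> {
    let mut prev: Hash32 = [0u8; 32]; // header preceding the first filter
    let mut headers = Vec::with_capacity(filters.len());
    for filter in filters {
        let mut preimage = hash(filter.as_slice()).to_vec();
        preimage.extend_from_slice(&prev);
        prev = hash(preimage.as_slice());
        headers.push(prev);
    }
    headers
}

/// First position at which two peers' header chains disagree, if any.
fn first_divergence(a: &[Hash32], b: &[Hash32]) -> Option<usize> {
    a.iter().zip(b).position(|(x, y)| x != y)
}

/// Stand-in hash for the demo (FNV-1a stretched to 32 bytes).
/// NOT cryptographic: do not use it outside of this illustration.
fn toy_hash(data: &[u8]) -> Hash32 {
    let mut out = [0u8; 32];
    let mut state: u64 = 0xcbf2_9ce4_8422_2325;
    for chunk in out.chunks_mut(8) {
        for &b in data {
            state = (state ^ b as u64).wrapping_mul(0x100_0000_01b3);
        }
        chunk.copy_from_slice(&state.to_le_bytes());
    }
    out
}

fn main() {
    // Two peers serve the same filters except for the second one.
    let peer_a = filter_headers(&[vec![1], vec![2], vec![3]], toy_hash);
    let peer_b = filter_headers(&[vec![1], vec![9], vec![3]], toy_hash);
    match first_divergence(&peer_a, &peer_b) {
        Some(i) => println!("header chains diverge at index {}", i),
        None => println!("header chains agree"),
    }
}
```

Since every header commits to all filters before it, a single altered filter changes every later header, so comparing only the most recent header from several peers is enough to notice that something is wrong.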
Light clients also do not have to send any probabilistic filter of their addresses to full nodes, which protects their privacy. They download a block only after a filter indicates a match, and the probability that a filter indicates a match when there is none, i.e. a false positive, is 1 in 784931 per queried item [[1]](https://bitcoin-dev.blog/blog/bip158-deep-dive/)[[3]](https://github.com/bitcoin/bips/blob/master/bip-0158.mediawiki).
Full nodes, on the other hand, no longer have to continuously scan each incoming transaction against every bloom filter that light clients send to them. We established in [`Part I`](https://enigbe.medium.com/part-i-bloom-filters-164d046f34b6) that this approach neither rewards the CPU work done nor scales, and that it opens full nodes to a DoS attack vector. Because compact block filters are small, full nodes can calculate each block's filter once and save it to disk at a small storage cost.
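To get a feel for what that rate means in practice, here is a small back-of-the-envelope calculation. It assumes each watched script matches a filter falsely with probability of about 1/784931, independently of the others; the wallet size and block count in `main` are made-up inputs, and the function names are hypothetical.

```rust
/// Inverse false positive rate of a BIP 158 basic filter.
const M: f64 = 784_931.0;

/// Probability that at least one of `watched` scripts falsely matches a
/// single filter, assuming independent matches with probability 1/M each.
fn block_false_positive(watched: u32) -> f64 {
    1.0 - (1.0 - 1.0 / M).powi(watched as i32)
}

/// Expected number of blocks downloaded needlessly over `blocks` filters.
fn expected_false_downloads(watched: u32, blocks: u32) -> f64 {
    block_false_positive(watched) * blocks as f64
}

fn main() {
    let watched = 1_000; // hypothetical number of scripts in the wallet
    let blocks = 144 * 365; // roughly one year of blocks at ten-minute spacing
    println!("per block: {:.6}", block_false_positive(watched));
    println!("per year: {:.1}", expected_false_downloads(watched, blocks));
}
```

For small wallets the per-block probability is close to the number of watched scripts divided by M, so the number of needless block downloads grows roughly linearly with wallet size.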
## Disadvantages
Although suggestive transaction information could be obtained by checking the mempool with a bloom filter, CBFs are only calculated for confirmed blocks. This means that light clients have no way to learn from the filters about relevant but unconfirmed transactions [[6]](https://bitcoin.stackexchange.com/questions/101512/how-do-light-clients-using-compact-block-filters-get-relevant-unconfirmed-transa).
Bandwidth usage is expected to go up for both full nodes and light clients. This is because of the request-response cycles between full and light nodes to get filters, filter headers, and filter header checkpoints (filter headers at evenly spaced intervals). Light clients in particular have to sync the block headers before they can download any filters or filter headers.
## Conclusion
Compact block filters are another technique for light clients to use in transaction filtering and payment verification. They are constructed by hashing, sorting, and compressing the scriptpubkeys of all transactions in a block into one set per block. Light clients match their hashed scriptpubkeys against the decompressed set, only downloading blocks with a likely match.
These filters offer the benefits of better privacy and reduced trust, and they address the incentive misalignment of full nodes matching transactions against bloom filters. With compact block filters, the risk of Denial of Service (DoS) attacks on full nodes is also reduced.
On the other hand, there is an increase in bandwidth for light clients.
## References
1. Mouton, E. (2021): [*Compact Block Filters Deep Dive (BIP 158) - A technical explanation of the workings of compact block filters and Golomb-Rice Coding*](https://bitcoin-dev.blog/blog/bip158-deep-dive/). Last accessed 23 May 2022
2. Bitcoin Optech: [*Compact block filters*](https://bitcoinops.org/en/topics/compact-block-filters/). Last accessed 23 May 2022
3. Osuntokun, O., Akselrod, A. (2017): [*BIP 158 - Compact Block Filters for Light Clients*](https://github.com/bitcoin/bips/blob/master/bip-0158.mediawiki#Contents). Last accessed 23 May 2022
4. [*What's the distinction between BIP 157 and BIP 158? Are they supported by Bitcoin Core?*](https://bitcoin.stackexchange.com/questions/86231/whats-the-distinction-between-bip-157-and-bip-158-are-they-supported-by-bitcoi). Last accessed 23 May 2022
5. Rusnak, P. (2020): [*BIP158: Compact Block Filters*](https://blog.trezor.io/bip158-compact-block-filters-9b813b07a878). Last accessed 23 May 2022
6. [*How do light clients using compact block filters get relevant unconfirmed transactions?*](https://bitcoin.stackexchange.com/questions/101512/how-do-light-clients-using-compact-block-filters-get-relevant-unconfirmed-transa)