# Understanding Ethereum Bloom Filters
There are already many good introductory articles about [Bloom Filters](https://en.wikipedia.org/wiki/Bloom_filter) and their application in [Ethereum](https://ethereum.org/en/). This article aims to deepen that understanding by reading the [go-ethereum](https://github.com/ethereum/go-ethereum) source code and working through a practical example.
## Where are Bloom Filters
In Ethereum, Bloom Filters appear mainly in two places: the Block Header and the [Transaction Receipt](https://ethereum.stackexchange.com/questions/16525/what-are-ethereum-transaction-receipts-and-what-are-they-used-for). The first is the Block Header; see [the code](https://github.com/ethereum/go-ethereum/blob/v1.14.11/core/types/block.go#L74):
```go
// Header represents a block header in the Ethereum blockchain.
type Header struct {
	ParentHash  common.Hash    `json:"parentHash" gencodec:"required"`
	UncleHash   common.Hash    `json:"sha3Uncles" gencodec:"required"`
	Coinbase    common.Address `json:"miner"`
	Root        common.Hash    `json:"stateRoot" gencodec:"required"`
	TxHash      common.Hash    `json:"transactionsRoot" gencodec:"required"`
	ReceiptHash common.Hash    `json:"receiptsRoot" gencodec:"required"`
	Bloom       Bloom          `json:"logsBloom" gencodec:"required"`
	Difficulty  *big.Int       `json:"difficulty" gencodec:"required"`
	Number      *big.Int       `json:"number" gencodec:"required"`
	GasLimit    uint64         `json:"gasLimit" gencodec:"required"`
	GasUsed     uint64         `json:"gasUsed" gencodec:"required"`
	Time        uint64         `json:"timestamp" gencodec:"required"`
	Extra       []byte         `json:"extraData" gencodec:"required"`
	MixDigest   common.Hash    `json:"mixHash"`
	Nonce       BlockNonce     `json:"nonce"`
	... // other fields are omitted
}
```
The second is the Transaction Receipt; see [the code](https://github.com/ethereum/go-ethereum/blob/v1.14.11/core/types/receipt.go#L52):
```go
// Receipt represents the results of a transaction.
type Receipt struct {
	// Consensus fields: These fields are defined by the Yellow Paper
	Type              uint8  `json:"type,omitempty"`
	PostState         []byte `json:"root"`
	Status            uint64 `json:"status"`
	CumulativeGasUsed uint64 `json:"cumulativeGasUsed" gencodec:"required"`
	Bloom             Bloom  `json:"logsBloom" gencodec:"required"`
	Logs              []*Log `json:"logs" gencodec:"required"`

	// Implementation fields: These fields are added by geth when processing a transaction.
	TxHash            common.Hash    `json:"transactionHash" gencodec:"required"`
	ContractAddress   common.Address `json:"contractAddress"`
	GasUsed           uint64         `json:"gasUsed" gencodec:"required"`
	EffectiveGasPrice *big.Int       `json:"effectiveGasPrice"` // required, but tag omitted for backwards compatibility
	BlobGasUsed       uint64         `json:"blobGasUsed,omitempty"`
	BlobGasPrice      *big.Int       `json:"blobGasPrice,omitempty"`

	// Inclusion information: These fields provide information about the inclusion of the
	// transaction corresponding to this receipt.
	BlockHash        common.Hash `json:"blockHash,omitempty"`
	BlockNumber      *big.Int    `json:"blockNumber,omitempty"`
	TransactionIndex uint        `json:"transactionIndex"`
}
```
## What are Blooms
So what are Blooms? Let’s look at [the code](https://github.com/ethereum/go-ethereum/blob/v1.14.11/core/types/bloom9.go#L41) again:
```go
const (
	// BloomByteLength represents the number of bytes used in a header log bloom.
	BloomByteLength = 256

	// BloomBitLength represents the number of bits used in a header log bloom.
	BloomBitLength = 8 * BloomByteLength
)

// Bloom represents a 2048 bit bloom filter.
type Bloom [BloomByteLength]byte
```
Simply put, in Ethereum, a Bloom is a 256-byte array, which means there are 2048 bits in total.
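To make the layout concrete, here is a minimal TypeScript sketch that parses a `logsBloom` hex string into 256 bytes and reads those bytes as a single big-endian 2048-bit integer. The helper names `bloomFromHex` and `bloomToBigInt` are hypothetical (they are not part of go-ethereum or any library), and the sketch assumes the `0x`-prefixed, 512-hex-digit encoding that JSON-RPC returns.

```typescript
// Illustrative helpers; the names are hypothetical and not part of any library.
const BLOOM_BYTE_LENGTH = 256;
const BLOOM_BIT_LENGTH = 8 * BLOOM_BYTE_LENGTH; // 2048

// Parse a 0x-prefixed hex string into a 256-byte bloom.
function bloomFromHex(hex: string): Uint8Array {
  const digits = hex.startsWith('0x') ? hex.slice(2) : hex;
  if (!/^[0-9a-fA-F]*$/.test(digits) || digits.length !== 2 * BLOOM_BYTE_LENGTH) {
    throw new Error(`expected ${2 * BLOOM_BYTE_LENGTH} hex digits`);
  }
  const bloom = new Uint8Array(BLOOM_BYTE_LENGTH);
  for (let i = 0; i < BLOOM_BYTE_LENGTH; i++) {
    bloom[i] = parseInt(digits.slice(2 * i, 2 * i + 2), 16);
  }
  return bloom;
}

// Read the 256 bytes as one big-endian integer: byte 0 is the most significant.
function bloomToBigInt(bloom: Uint8Array): bigint {
  let value = 0n;
  for (let i = 0; i < bloom.length; i++) {
    value = (value << 8n) | BigInt(bloom[i]);
  }
  return value;
}

// Example: only the lowest bit of the last byte is set.
const example = bloomFromHex('0x' + '00'.repeat(255) + '01');
console.log(example.length, bloomToBigInt(example)); // 256 1n
```

Treating the array as one big integer is only a convenience for experiments; go-ethereum itself works on the byte array directly.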
## How to Calculate Blooms
So how are Blooms calculated? Again, the answer is in the code. The following [code snippet](https://github.com/ethereum/go-ethereum/blob/v1.14.11/core/types/block.go#L267) shows how the Bloom of a Block Header is calculated:
```go
b.header.Bloom = CreateBloom(receipts)
```
And the following [code snippet](https://github.com/ethereum/go-ethereum/blob/v1.14.11/core/types/receipt.go#L294) shows how to calculate the Bloom of a Transaction Receipt:
```go
r.Bloom = CreateBloom(Receipts{(*Receipt)(r)})
```
The process is the same in both cases: the logic lives in the `CreateBloom()` function. See [the code](https://github.com/ethereum/go-ethereum/blob/v1.14.11/core/types/bloom9.go#L103):
```go
// CreateBloom creates a bloom filter out of the give Receipts (+Logs)
func CreateBloom(receipts Receipts) Bloom {
	buf := make([]byte, 6)
	var bin Bloom
	for _, receipt := range receipts {
		for _, log := range receipt.Logs {
			bin.add(log.Address.Bytes(), buf)
			for _, b := range log.Topics {
				bin.add(b[:], buf)
			}
		}
	}
	return bin
}
```
Now it is clear that a value is computed for each [Log](https://docs.alchemy.com/docs/deep-dive-into-eth_getlogs#what-are-logs-or-events) address and topic, and that these values are 'added' together to produce the final result. Let’s look at Bloom’s [add()](https://github.com/ethereum/go-ethereum/blob/v1.14.11/core/types/bloom9.go#L66) method:
```go
// add is internal version of Add, which takes a scratch buffer for reuse (needs to be at least 6 bytes)
func (b *Bloom) add(d []byte, buf []byte) {
	i1, v1, i2, v2, i3, v3 := bloomValues(d, buf)
	b[i1] |= v1
	b[i2] |= v2
	b[i3] |= v3
}
```
So the 'add' operation mentioned above is actually a [bitwise OR](https://en.wikipedia.org/wiki/Bitwise_operation#OR). Our guess is that `bloomValues()` determines which three bits of the Bloom should be set to 1, based on the given address (20 bytes) or topic (32 bytes). Let’s confirm this in [the code](https://github.com/ethereum/go-ethereum/blob/v1.14.11/core/types/bloom9.go#L139):
```go
// bloomValues returns the bytes (index-value pairs) to set for the given data
func bloomValues(data []byte, hashbuf []byte) (uint, byte, uint, byte, uint, byte) {
	sha := hasherPool.Get().(crypto.KeccakState)
	sha.Reset()
	sha.Write(data)
	sha.Read(hashbuf)
	hasherPool.Put(sha)
	// The actual bits to flip
	v1 := byte(1 << (hashbuf[1] & 0x7))
	v2 := byte(1 << (hashbuf[3] & 0x7))
	v3 := byte(1 << (hashbuf[5] & 0x7))
	// The indices for the bytes to OR in
	i1 := BloomByteLength - uint((binary.BigEndian.Uint16(hashbuf)&0x7ff)>>3) - 1
	i2 := BloomByteLength - uint((binary.BigEndian.Uint16(hashbuf[2:])&0x7ff)>>3) - 1
	i3 := BloomByteLength - uint((binary.BigEndian.Uint16(hashbuf[4:])&0x7ff)>>3) - 1
	return i1, v1, i2, v2, i3, v3
}
```
This function is a little harder to follow than the previous code, but it confirms our guess. It computes the [keccak256](https://ethereum.stackexchange.com/questions/30369/difference-between-keccak256-and-sha3) hash of the given address or topic and uses the first six bytes of the hash to pick three bits to set. Each 2-byte pair, read as a big-endian integer and masked to its low 11 bits, gives a bit position from 0 to 2047. Position 0 is the least significant bit of the last byte, which is why the byte index is computed as `BloomByteLength - (position >> 3) - 1`.
To make the logic clearer, let's rewrite the `bloomValues()` function in TypeScript so that it directly returns a [bit mask](https://en.wikipedia.org/wiki/Mask_(computing)):
```typescript
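import { ethers } from 'ethers'; // assumes ethers v6

// Returns a 2048-bit mask with (up to) three bits set for the given
// 0x-prefixed address or topic.
function bloomValues(addrOrTopic: string): bigint {
  const hash = ethers.keccak256(addrOrTopic);
  let result = 0n;
  // Each pair of hash bytes, masked to 11 bits, is a bit position (0..2047).
  result |= 1n << (BigInt('0x' + hash.slice( 2,  6)) & 0x7ffn);
  result |= 1n << (BigInt('0x' + hash.slice( 6, 10)) & 0x7ffn);
  result |= 1n << (BigInt('0x' + hash.slice(10, 14)) & 0x7ffn);
  return result;
}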
```
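As a sanity check that the bit-mask form agrees with the Go version, the sketch below computes both from the same six bytes. It takes the hash prefix as input instead of hashing, so it runs without any library. The names `goStylePairs`, `bitMask` and `pairsToBigInt` are hypothetical, and the sample prefix is arbitrary rather than a real keccak256 output.

```typescript
// Hypothetical helpers for comparing the two representations.
const BYTE_LENGTH = 256;

// Mirrors the Go code: three (byte index, byte value) pairs.
function goStylePairs(prefix: Uint8Array): Array<[number, number]> {
  const pairs: Array<[number, number]> = [];
  for (let k = 0; k < 6; k += 2) {
    const word = ((prefix[k] << 8) | prefix[k + 1]) & 0x7ff; // low 11 bits
    const value = 1 << (prefix[k + 1] & 0x7); // bit within the byte
    const index = BYTE_LENGTH - (word >> 3) - 1; // byte within the array
    pairs.push([index, value]);
  }
  return pairs;
}

// Mirrors the TypeScript rewrite: one 2048-bit mask.
function bitMask(prefix: Uint8Array): bigint {
  let mask = 0n;
  for (let k = 0; k < 6; k += 2) {
    const word = ((prefix[k] << 8) | prefix[k + 1]) & 0x7ff;
    mask |= 1n << BigInt(word);
  }
  return mask;
}

// Convert the Go-style pairs to a big-endian integer for comparison.
function pairsToBigInt(pairs: Array<[number, number]>): bigint {
  let mask = 0n;
  for (const [index, value] of pairs) {
    mask |= BigInt(value) << BigInt(8 * (BYTE_LENGTH - 1 - index));
  }
  return mask;
}

const prefix = Uint8Array.from([0xab, 0xcd, 0x12, 0x34, 0xff, 0xff]);
console.log(goStylePairs(prefix)); // [ [ 134, 32 ], [ 185, 16 ], [ 0, 128 ] ]
console.log(pairsToBigInt(goStylePairs(prefix)) === bitMask(prefix)); // true
```

For the first pair, `0xabcd & 0x7ff` is 973, which is byte `255 - 121 = 134` and bit `973 & 7 = 5` within that byte, so both forms describe the same bit.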
## How Bloom Filters Work
After reading the above code, we already have a clear understanding of how Bloom Filters work. Each Transaction Receipt contains a list of Logs, and each Log has an address (indicating which contract emitted it) and up to 4 Topics. Please take a look at [the code](https://github.com/ethereum/go-ethereum/blob/v1.14.11/core/types/log.go#L29):
```go
// Log represents a contract log event. These events are generated by the LOG opcode and
// stored/indexed by the node.
type Log struct {
	// Consensus fields:
	// address of the contract that generated the event
	Address common.Address `json:"address" gencodec:"required"`
	// list of topics provided by the contract.
	Topics []common.Hash `json:"topics" gencodec:"required"`
	// supplied by the contract, usually ABI-encoded
	Data []byte `json:"data" gencodec:"required"`
	... // other fields are omitted
}
```
Ethereum allows us to [query logs](https://docs.alchemy.com/docs/deep-dive-into-eth_getlogs) within a block range using conditions on addresses and topics. Checking every log of every transaction receipt in every block of the range would be too slow, so the Bloom in each block header is used to quickly skip blocks that cannot contain a match. Please take a look at [the code](https://github.com/ethereum/go-ethereum/blob/v1.14.11/eth/filters/filter.go#L274):
```go
// blockLogs returns the logs matching the filter criteria within a single block.
func (f *Filter) blockLogs(ctx context.Context, header *types.Header) ([]*types.Log, error) {
	if bloomFilter(header.Bloom, f.addresses, f.topics) {
		return f.checkMatches(ctx, header)
	}
	return nil, nil
}
```
Note that a Bloom never rejects a block that actually contains matching logs (no false negatives), but it may let through some blocks that do not (false positives). So we have to double-check the Logs themselves, which is why we call the [checkMatches()](https://github.com/ethereum/go-ethereum/blob/v1.14.11/eth/filters/filter.go#L284) function. Here is the code of the [bloomFilter()](https://github.com/ethereum/go-ethereum/blob/v1.14.11/eth/filters/filter.go#L351) function:
```go
func bloomFilter(bloom types.Bloom, addresses []common.Address, topics [][]common.Hash) bool {
	if len(addresses) > 0 {
		var included bool
		for _, addr := range addresses {
			if types.BloomLookup(bloom, addr) {
				included = true
				break
			}
		}
		if !included {
			return false
		}
	}
	for _, sub := range topics {
		included := len(sub) == 0 // empty rule set == wildcard
		for _, topic := range sub {
			if types.BloomLookup(bloom, topic) {
				included = true
				break
			}
		}
		if !included {
			return false
		}
	}
	return true
}
```
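The matching rule can be mimicked on the bit-mask representation. The sketch below is not go-ethereum's implementation: it assumes each address or topic has already been reduced to its three-bit mask, and the names `bloomLookup` and `bloomMatches` are hypothetical. It also builds a false positive by hand from made-up masks, which shows why a Bloom match still has to be confirmed against the actual logs.

```typescript
// Hypothetical helpers operating on 2048-bit masks stored as bigint.

// An item may be present only if every bit of its mask is set in the bloom.
function bloomLookup(bloom: bigint, itemMask: bigint): boolean {
  return (bloom & itemMask) === itemMask;
}

// Same shape as the Go bloomFilter(): any listed address may match, and every
// topic position must match at least one alternative (empty means wildcard).
function bloomMatches(bloom: bigint, addresses: bigint[], topics: bigint[][]): boolean {
  if (addresses.length > 0 && !addresses.some((a) => bloomLookup(bloom, a))) {
    return false;
  }
  return topics.every((alts) => alts.length === 0 || alts.some((t) => bloomLookup(bloom, t)));
}

// Made-up masks with three bits each (not derived from real hashes).
const itemA = (1n << 3n) | (1n << 40n) | (1n << 900n);
const itemB = (1n << 7n) | (1n << 41n) | (1n << 1500n);
const neverAdded = (1n << 3n) | (1n << 41n) | (1n << 1500n);
const absent = (1n << 3n) | (1n << 41n) | (1n << 2000n);

const bloom = itemA | itemB;
console.log(bloomLookup(bloom, itemA)); // true
console.log(bloomLookup(bloom, neverAdded)); // true: a false positive
console.log(bloomLookup(bloom, absent)); // false: definitely not present
console.log(bloomMatches(bloom, [absent, itemA], [[], [itemB]])); // true
```

`neverAdded` borrows one bit from `itemA` and two from `itemB`, so the lookup succeeds even though it was never added; a `false` result, by contrast, is always reliable.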
## A Practical Example
Let's look at a practical example. At the time of writing, the Ethereum mainnet height is approximately [20953745](https://etherscan.io/block/20953745). We use this block because it contains relatively few transactions, which makes it easier to inspect.
However, neither Etherscan nor [EthersJS](https://github.com/ethers-io/ethers.js/issues/276) exposes the `logsBloom` field of the block header, so we need to query it through [JSON RPC](https://ethereum.org/en/developers/docs/apis/json-rpc/#eth_getblockbynumber):
```bash
curl -X POST -H 'content-type: application/json;' --data '{
"jsonrpc":"2.0",
"method":"eth_getBlockByNumber",
"params":["0x13fba91", false],
"id":1
}' https://eth.llamarpc.com | jq
```
This is the result:
```json
{
"jsonrpc": "2.0",
"id": 1,
"result": {
"baseFeePerGas": "0x1af09eaa5",
"blobGasUsed": "0x0",
"difficulty": "0x0",
"excessBlobGas": "0x0",
"extraData": "0xd883010e08846765746888676f312e32322e36856c696e7578",
"gasLimit": "0x1c9c380",
"gasUsed": "0x334f0e",
"hash": "0x85f1c2c07db71b1aabf639ae42cb88d5868278b115e1fa91e037d1afcc563c41",
"logsBloom": "0x102100004001110500000050868810641410500684001000020019601000004128100010f801200802110208800401100205c8588880200104991420002a018000012808080048091060410880000228001110900100084c01410020802000000005182002228122190c0001a4005801001100211200400000100c90018c020004000048800100480a03012100205000040000c308410028404c0140701004028a0100000400908a010045c018098404040321000224008040000980040000e00e002442081008004000800000058c830248001300800090000100012020301220b420110009220140450188000610004a001009020905c0004000000020240c",
"miner": "0xd4e96ef8eee8678dbff4d535e033ed1a4f7605b7",
"mixHash": "0x5197ad7f38b4eb00a337fc52a86147f07e2624e268046fe3b8ce1e83c0d79862",
"nonce": "0x0000000000000000",
"number": "0x13fba91",
"parentBeaconBlockRoot": "0x2097b67a5b24b4ec115a0baa6e145728facdc9b3ecc7c48964573255e8b55c98",
"parentHash": "0x3dcae4895539797180ebd29432f472692222bf6fc6c5436c4be01e1f1e266f10",
"receiptsRoot": "0x80fa0a5d28e9540b794f8d96aca3a1219f835c26b4afdefa7b68072cb21c306c",
"sha3Uncles": "0x1dcc4de8dec75d7aab85b567b6ccd41ad312451b948a7413f0a142fd40d49347",
"size": "0x55a6",
"stateRoot": "0x9c08afbf2c3caaf4089055e4d290f44aa926612a43a04231834ad55554b8b345",
"timestamp": "0x670b32c3",
"totalDifficulty": "0xc70d815d562d3cfa955",
"transactions": [...],
"transactionsRoot": "0x87c92d0b4ed69ed2144b49da715ffaa39d3892a69f5f582806e6245be001b114",
"uncles": [],
"withdrawals": [...],
"withdrawalsRoot": "0xbf1711c5a2cecf68ef17c26ecd637f63ecc5ef6a98a30993088aab364a2bbff4"
}
}
```
Let's write a TypeScript function to verify the logsBloom of the block header:
```typescript
async function checkBlockLogsBloom() {
const provider = new ethers.JsonRpcProvider('https://eth.llamarpc.com');
const block = await provider.getBlock(20953745);
let logsBloom = 0n;
for (const txHash of block!.transactions) {
const txr = await provider.getTransactionReceipt(txHash);
console.log('txHash:', txHash.slice(0, 8), ', logsBloom:', txr!.logsBloom);
logsBloom |= BigInt(txr!.logsBloom);
}
console.log('blockLogsBloom:', '0x102100004001110500000050868810641410500684001000020019601000004128100010f801200802110208800401100205c8588880200104991420002a018000012808080048091060410880000228001110900100084c01410020802000000005182002228122190c0001a4005801001100211200400000100c90018c020004000048800100480a03012100205000040000c308410028404c0140701004028a0100000400908a010045c018098404040321000224008040000980040000e00e002442081008004000800000058c830248001300800090000100012020301220b420110009220140450188000610004a001009020905c0004000000020240c');
console.log('addedLogsBloom:', '0x' + logsBloom.toString(16));
}
```
Running the above function confirms that combining the `logsBloom` of every transaction receipt with bitwise OR yields exactly the `logsBloom` of the block header:
```
txHash: 0x352d91, logsBloom: 0x0000000000000000000000000000000000000000...
txHash: 0x0c43ba, logsBloom: 0x1000000000000000000000000000000000000000...
txHash: 0x4a2817, logsBloom: 0x1000000000000000000000000000000000000000...
txHash: 0xadbb54, logsBloom: 0x0000000000000001000000000080000000000000...
txHash: 0x8df4d3, logsBloom: 0x0000000000000000000000000000000000000000...
txHash: 0x6a76d0, logsBloom: 0x0000000000000000000000000000000000000000...
txHash: 0x429f34, logsBloom: 0x0000000000000000000000000000000000000000...
txHash: 0x057c3c, logsBloom: 0x0000000000000000000000000000000000000000...
txHash: 0x9390ee, logsBloom: 0x0020000000001000000000008000000010000000...
txHash: 0xb3fc8f, logsBloom: 0x0000000000000000000000000000000000000000...
txHash: 0xa6cae0, logsBloom: 0x0000000000000000000000000000000000000000...
txHash: 0x65f4f6, logsBloom: 0x0001000000000000000000000000000000000000...
txHash: 0xf7c2a6, logsBloom: 0x0000000000000000000000000000000000000000...
txHash: 0x9b5221, logsBloom: 0x0000000000000000000000000000000000000000...
txHash: 0xbc52dd, logsBloom: 0x0000000040000000000000000000000000001000...
txHash: 0x844a9c, logsBloom: 0x0000000000000000000000000000000000000000...
txHash: 0xc6f729, logsBloom: 0x1000000000000000000000000000000000000000...
txHash: 0x3a601e, logsBloom: 0x0000000000000000000000000000000000000000...
txHash: 0xcd6728, logsBloom: 0x0000000000000000000000000000000000000000...
txHash: 0xed42df, logsBloom: 0x0000000000000000000000000000000000000000...
txHash: 0xa66375, logsBloom: 0x0000000000000000000000000000000000000000...
txHash: 0x73d872, logsBloom: 0x0000000000000000000000400000000000000000...
txHash: 0xf7f932, logsBloom: 0x0000000000000000000000100000000400001002...
txHash: 0x876525, logsBloom: 0x0000000000000000000000000400104000000004...
txHash: 0xce406d, logsBloom: 0x0000000000000000000000000000000000000000...
txHash: 0x5be108, logsBloom: 0x0000000000000000000000000000000000000000...
txHash: 0xca1ca9, logsBloom: 0x0000000000000000000000000000000000000000...
txHash: 0xd7c295, logsBloom: 0x0020000000000004000000008000000004000000...
txHash: 0xf7d9b1, logsBloom: 0x0000000000000000000000000000000000000000...
txHash: 0x85ceca, logsBloom: 0x0000000000000000000000000000000000000000...
txHash: 0x0919d3, logsBloom: 0x0000000000000000000000000000000000000000...
txHash: 0xae7761, logsBloom: 0x0000000000000000000000000000000000000000...
txHash: 0x52d1db, logsBloom: 0x0000000000000000000000000000000000000000...
txHash: 0xd0335a, logsBloom: 0x0000000000000000000000000000000000000000...
txHash: 0x0745c8, logsBloom: 0x0000000000000000000000000000000000000000...
txHash: 0x7f7857, logsBloom: 0x0000000000010000000000000000000000004000...
txHash: 0xac37c1, logsBloom: 0x0000000000000000000000000000000000000000...
txHash: 0x891ae3, logsBloom: 0x0000000000000000000000000000000000000000...
txHash: 0xb41ab2, logsBloom: 0x0000000000000000000000000000000000000000...
txHash: 0x41c7eb, logsBloom: 0x0000000000000000000000000008000000000000...
txHash: 0xac5396, logsBloom: 0x0000000000000000000000000000000000000000...
txHash: 0xc35c0c, logsBloom: 0x0000000000000000000000000000000000100000...
txHash: 0xecbd68, logsBloom: 0x0000000000000000000000000000000000000000...
txHash: 0x254a74, logsBloom: 0x0020000000000100000000008200002000000000...
txHash: 0xc98ab0, logsBloom: 0x0000000000000000000000000000000000000000...
blockLogsBloom: 0x1021000040011105000000508688106414105006...
addedLogsBloom: 0x1021000040011105000000508688106414105006...
```
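The same aggregation can be tried offline. The sketch below ORs a list of `logsBloom` hex strings and left-pads the result to 512 hex digits so that it can be compared with a header value as a string. `combineBlooms` is a hypothetical helper name, and the sample inputs are made-up blooms, not data from block 20953745.

```typescript
// Hypothetical helper: OR together 0x-prefixed bloom hex strings.
function combineBlooms(blooms: string[]): string {
  let combined = 0n;
  for (const hex of blooms) {
    combined |= BigInt(hex);
  }
  // Pad to 256 bytes so that leading zero digits are not dropped.
  return '0x' + combined.toString(16).padStart(512, '0');
}

// Made-up receipt blooms with one set bit each.
const receiptA = '0x' + '0'.repeat(511) + '1';
const receiptB = '0x' + '1' + '0'.repeat(511);
const header = combineBlooms([receiptA, receiptB]);
console.log(header === '0x' + '1' + '0'.repeat(510) + '1'); // true
console.log(combineBlooms([]).length); // 514
```

The padding matters because `toString(16)` on a `bigint` omits leading zeros; without it, a bloom whose first hex digit is `0` would print shorter than the header value even when the bits are identical.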
Let’s write another function to verify the `logsBloom` field of a transaction receipt:
```typescript
async function checkTxLogsBloom() {
  const provider = new ethers.JsonRpcProvider('https://eth.llamarpc.com');
  const txr2 = await provider.getTransactionReceipt('0x254a74342df678da6ae85cb9f178042a6524fdf6df91436a7e742a7dbe49f9f5');
  let logsBloom = 0n;
  for (const {address, topics} of txr2!.logs) {
    const addrBloom = bloomValues(address);
    logsBloom |= addrBloom;
    console.log('addr :', '0x..' + address.slice(42-8,), ', bloom:', '0x' + addrBloom.toString(16).padStart(512, '0'));
    for (const topic of topics) {
      const topicBloom = bloomValues(topic);
      logsBloom |= topicBloom;
      console.log('topic:', '0x..' + topic.slice(66-8,), ', bloom:', '0x' + topicBloom.toString(16).padStart(512, '0'));
    }
  }
  console.log('   txLogsBloom:', txr2!.logsBloom);
  console.log('addedLogsBloom:', '0x' + logsBloom.toString(16).padStart(512, '0'));
}
```
The `logsBloom` of the transaction receipt is indeed the result of adding (bitwise or) the bloom of each of its logs together:
```
addr : 0x..3C756Cc2, bloom: 0x0000000000000000000000000000000000000000...
topic: 0x..5cc9109c, bloom: 0x0000000000000000000000000000000000000000...
topic: 0x..1457bd09, bloom: 0x0000000000000100000000000000000000000000...
addr : 0x..3C756Cc2, bloom: 0x0000000000000000000000000000000000000000...
topic: 0x..f523b3ef, bloom: 0x0000000000000000000000000000000000000000...
topic: 0x..1457bd09, bloom: 0x0000000000000100000000000000000000000000...
topic: 0x..cd3e8692, bloom: 0x0000000000000000000000000000000000000000...
addr : 0x..049aF2ab, bloom: 0x0000000000000000000000000000000000000000...
topic: 0x..f523b3ef, bloom: 0x0000000000000000000000000000000000000000...
topic: 0x..cd3e8692, bloom: 0x0000000000000000000000000000000000000000...
topic: 0x..049af2ab, bloom: 0x0000000000000000000000000000000000000000...
addr : 0x..049aF2ab, bloom: 0x0000000000000000000000000000000000000000...
topic: 0x..f523b3ef, bloom: 0x0000000000000000000000000000000000000000...
topic: 0x..cd3e8692, bloom: 0x0000000000000000000000000000000000000000...
topic: 0x..3a960582, bloom: 0x0000000000000000000000000000000000000000...
addr : 0x..CD3E8692, bloom: 0x0000000000000000000000000200000000000000...
topic: 0x..fffbbad1, bloom: 0x0000000000000000000000008000000000000000...
addr : 0x..CD3E8692, bloom: 0x0000000000000000000000000200000000000000...
topic: 0x..0159d822, bloom: 0x0020000000000000000000000000000000000000...
topic: 0x..1457bd09, bloom: 0x0000000000000100000000000000000000000000...
topic: 0x..3a960582, bloom: 0x0000000000000000000000000000000000000000...
addr : 0x..049aF2ab, bloom: 0x0000000000000000000000000000000000000000...
topic: 0x..f523b3ef, bloom: 0x0000000000000000000000000000000000000000...
topic: 0x..3a960582, bloom: 0x0000000000000000000000000000000000000000...
topic: 0x..f9025b43, bloom: 0x0000000000000000000000000000002000000000...
txLogsBloom: 0x0020000000000100000000008200002000000000...
addedLogsBloom: 0x0020000000000100000000008200002000000000...
```
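The masks above also hint at how often a filter can match something that was never added. As a rough estimate, if a fraction f of the 2048 bits is set and hash bits are treated as uniformly random, an unrelated item passes the check with a probability of about f³, because all three of its bits must already be set. The sketch below counts the set bits in a bloom hex string and applies that estimate. `setBitCount` and `falsePositiveEstimate` are hypothetical names, and the estimate ignores the case where an item's three bit positions coincide.

```typescript
// Hypothetical helpers; the estimate assumes uniformly random, independent bits.
const TOTAL_BITS = 2048;

// Count the bits set in a 0x-prefixed bloom hex string.
function setBitCount(bloomHex: string): number {
  let remaining = BigInt(bloomHex);
  let count = 0;
  while (remaining > 0n) {
    if ((remaining & 1n) === 1n) count++;
    remaining >>= 1n;
  }
  return count;
}

// Approximate chance that an item that was never added still passes a lookup.
function falsePositiveEstimate(bloomHex: string): number {
  const fill = setBitCount(bloomHex) / TOTAL_BITS;
  return fill ** 3;
}

// Made-up bloom with every other bit set (half full).
const halfFull = '0x' + '5'.repeat(512);
console.log(setBitCount(halfFull)); // 1024
console.log(falsePositiveEstimate(halfFull)); // 0.125
```

A sparsely filled receipt bloom therefore gives very few false matches, while a block header bloom that aggregates many logs fills up and becomes less selective.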
## Summary
In this article, we learned about Bloom Filters and their application in the Ethereum blockchain by analyzing the go-ethereum source code. We also wrote some TypeScript code and verified it against a specific mainnet block to deepen our understanding. However, Bloom Filters have not proven as effective in practice as hoped, and there is a proposal to remove them from Ethereum. Please see [EIP-7668](https://eips.ethereum.org/EIPS/eip-7668) for more details.
Buy me a cup of coffee if this article has been helpful to you: `0x8f7BEE940b9F27E8d12F6a4046b9EC57c940c0FA`