# Light Ethereum Subprotocol (LES)
The Light Ethereum Subprotocol (LES) is the protocol used by "light" clients, which only
download block headers as they appear and fetch other parts of the blockchain on-demand.
They provide full functionality in terms of safely accessing the blockchain, but do not
mine and therefore do not take part in the consensus process. Full and archive nodes can
also support the 'les' protocol besides 'eth' in order to be able to serve light nodes.
The current protocol version is **les/4**. See end of document for a list of changes in
past protocol versions. Some of the les protocol messages are similar to of the [Ethereum
Wire Protocol](https://hackmd.io/@Nhlanhla/HkIHa9r7K), with the addition of a few new fields.
## Canonical Hash Trie
Canonical Hash Trie (CHT) structures are used by LES for quick initial syncing and secure
on-demand retrieval of canonical hash mappings, block headers and total difficulty (TD)
values.
A CHT is a Merkle trie (specifically '[Merkle Patricia Trie](https://hackmd.io/@Nhlanhla/SkGDJZB0d)' as used for Ethereum state)
that contains `blockNumber -> [blockHash, TD]` mappings where keys are binary big endian
encoded 64 bit integers and values are RLP-encoded `[hash, number]` pairs.
CHTs are generated by LES servers for every 32768 blocks, `CHT[i]` containing data for
blocks `0..(i+1) * 32768 - 1`. If a client knows the root hash of `CHT[i]` and wants to fetch
header number `N` (where `N < (i+1) * 32768`), it can obtain the header and the corresponding
Merkle proof of the CHT with a [GetHelperTrieProofs] request.
CHTs are only generated after 2048 confirmations, making it sure they will not be changed
by a chain reorg. In the current version of the light client there is a hard-coded
`[chtNumber, chtRoot]` pair associated with the genesis block hash of both the mainnet and
the testnet. A trustless validation algorithm is planned for later protocol versions.
## BloomBits
The BloomBits data structure optimizes log searching by doing a bitwise transformation
that makes it cheaper to retrieve bloom filter data relevant to a specific filter.
When searching in a long section of the block history, we are checking three specific bits
of each bloom filter per queried address/topic. In order to do that, LES must retrieve a
~550 byte block header per filtered block.
The BloomBits structure optimizes bloom filter lookups through a "bitwise 90 degree
rotation" of the bloom filters. Blocks are grouped into fixed length sections (section
size for the LES BloomBits Trie is 32768 blocks), `BloomBits[bitIdx][sectionIdx]` is a
32768 bit (4096 byte) long bit vector that contains a single bit of each bloom filter from
the block range `sectionIdx*SectionSize ... (sectionIdx+1)*SectionSize-1`. Since bloom
filters are usually sparse, a simple data compression makes this structure even more
efficient, especially for on-demand retrieval. By reading and binary AND-ing three
BloomBits sections, we can filter for an address/topic in 32768 blocks at once ("1" bits
in the binary AND result mean bloom matches).
### Compression Algorithm
BloomBits data is stored in compressed form. The compression algorithm is optimized for
sparse input data which contains a lot of zero bytes. Decompression requires knowledge of
the decompressed data length.
The algorithm can be described with this pseudo-code:
if data only contains zeroes,
CompressBytes(data) == nil
otherwise if len(data) <= 1,
CompressBytes(data) == data
otherwise:
CompressBytes(data) == append(CompressBytes(nonZeroBitset(data)), nonZeroBytes(data)...)
where
nonZeroBitset(data) is a bit vector with len(data) bits (MSB first):
nonZeroBitset(data)[i/8] && (1 << (7-i%8)) != 0 if data[i] != 0
len(nonZeroBitset(data)) == (len(data)+7)/8
nonZeroBytes(data) contains the non-zero bytes of data in the same order
### BloomBits Trie
In order to make this data structure retrievable on-demand for the light client, we put
the generated vectors in a trie. Parts of this trie can be retrieved with the
[GetHelperTrieProofs] message. Currently the trie root is part of the trusted syncing
checkpoint but trustless validation of the BloomBits trie is part of the development
plans. The trie consists of the compressed bit vectors as values stored at keys
constructed from the the bloom bit index encoded as a 2-byte big endian, followed by the
section index encoded as an 8-byte big endian. Since all-zero bit vectors have a zero
length when compressed, these vectors are not added to the trie at all.
BloomBits tries are generated for each new section of transformed bloom filter data by
adding the vectors belonging to the latest section index to the previous trie.
## Client Side Flow Control
Any node which takes on a server role in the the LES protocol needs to be able to somehow
limit the amount of work it does for each client peer during a given time period. They can
always just serve requests slowly if they are overloaded, but it is beneficial to give
some sort of flow control feedback to the clients. This way, clients could (and would have
incentive to) behave nicely and not send requests too quickly in the first place (and then
possibly timeout and resend while the server is still working on them). They could also
distribute requests better between multiple servers they are connected to. And if clients
can do this, servers can expect them to do this and throttle or drop them if they break
the flow control rules.
### The Model
Let us assume that serving each request has a cost (depending on type and parameters) for
the server. This cost is determined by the server, but it has an upper limit for any valid
request. The server assigns a "buffer" for each client from which the cost of each request
is deduced. The buffer has an upper limit (the "buffer limit") and a recharge rate (cost
per second). The server can decide to recharge it more quickly at any time if it has more
free resources, but there is a guaranteed minimum recharge rate. If a request is received
that would drain the client's buffer below zero, the client has broken the flow control
rules and is throttled or disconnected.
### The Protocol
The server announces three parameters in the [Status] message:
- `"flowControl/BL"`: Buffer Limit, an integer value
- `"flowControl/MRR"`: Minimum Rate of Recharge, an integer value
- `"flowControl/MRC"`: Maximum Request Cost table. The value of this parameter is a
table assigning cost values to every on-demand retrieval message in the LES protocol.
The table is encoded as a list of integer triples: `[[MsgCode, BaseCost, ReqCost], ...]`
On the server side:
When a client connects, the server sets the initial Buffer Value (`BV`) of the client to
`BL` and announces `BL` in [Status]. When a request is received from the client, it
calculates the cost according to its own estimates (but not higher than `MaxCost`, which
equals `BaseCost + ReqCost * N`, where `N` is the number of individual elements asked in
the request), then deducts it from `BV`. If `BV` goes negative, drops the peer, otherwise
starts serving the request. The reply message contains a `BV` value that is the previously
calculated `BV` plus the amount recharged during the time spent serving. Note that since
the server can always determine any cost up to `MaxCost` for a request (and a client
should not assume otherwise), it can reject a message without processing it if received
while `BV < MaxCost` because that's already a flow control breach.
On the client side:
The client always has a lowest estimate for its current `BV`, called `BLE`. It
- sets `BLE` to `BL` received in [Status]
- doesn't send any request to the server when `BLE < MaxCost`
- deduces `MaxCost` when sending a request
- recharges `BLE` at the rate of `MRR` when less than `BL`
When a reply message with a new `BV` value is received, it sets `BLE` to `BV -
Sum(MaxCost)`, summing the `MaxCost` values of requests sent after the one belonging to
this reply.
#### Buffer underrun
Before **les/3** buffer underruns always resulted in immediate disconnection. Now it is
possible and recommended to send a [StopMsg] instead and then a [ResumeMsg] when the
buffer has been at least partially recharged. This allows clients to treat the buffer
feedback as an optional performance optimization hint instead of a mandatory mechanism
and allows simple implementations that do not care about the buffer at all.
## Request ID
Every on-demand request message contains a `reqID` field, which is simply returned by the
server in the corresponding reply message. This helps matching replies for requests on the
client side so that each reply doesn't need to be matched against each pending request.
## Protocol Messages
### Status (0x00)
`[[key_0, value_0], [key_1, value_1], ...]`
Inform a peer of the sender's current LES state. This message should be sent just after
the connection is established and prior to any other LES messages. The following keys
are required (value types are noted after the key string):
- `"protocolVersion"` `P`: is 1 for protocol version one.
- `"networkId"` `P`: specifies the network ID of the chain, as in the [Ethereum Wire Protocol].
- `"headTd"` `P`: Total Difficulty of the best chain. Integer, as found in block header.
- `"headHash"` `B_32`: the hash of the best (i.e. highest TD) known block.
- `"headNum"` `P`: the number of the best (i.e. highest TD) known block.
- `"genesisHash"` `B_32`: the hash of the Genesis block.
- `"forkID"` `[crc32, nextFork: P]`: mandatory since **les/4**.
The value identifies the chain/fork the node is operating on.
- `"recentTxLookup"` `P`: announced by servers since **les/4**. Transaction status
is served for transactions included in the N-1 most recent blocks (N=1 means that
mined transactions are not served at all). N=0 means all transactions are available.
There are several optional key/value pairs which can be set:
- `"announceType"` `P`: set by clients, this field affects the [Announce] messages of the
server. Allowed integer values are:
- none (`0`): no [Announce] messages are sent, i.e. the client is not interested in announcements.
- simple (`1`): Default. [Announce] messages use the **les/1** format.
- signed (`2`): there is a `"sign"` key in the key/value list of [Announce] messages. The
associated value is a signature of an RLP encoded `[headHash: B_32, headNumber: P, headTd: P]`
structure by the server's node key.
- `"serveHeaders"` (empty value): present if the peer can serve header chain downloads.
- `"serveChainSince"` `P`: present if the peer can serve Body/Receipts ODR requests
starting from the given block number.
- `"serveRecentChain"` `P`: if present then the availability of chain data is only guaranteed
for the given number of recent blocks. If the node serves chain data then `"serveChainSince"`
should always be present while `"serveRecentChain"` is optional. Chain availability can
be assumed for blocks with `blockNumber >= MAX(serveChainSince, headNumber-serveRecentChain+1)`.
- `"serveStateSince"` `P`: present if the peer can serve Proof/Code ODR requests starting
from the given block number.
- `"serveRecentState"` `P`: if present then the availability of state data is only guaranteed
for the given number of recent blocks. If the node serves state data then `"serveStateSince"`
should always be present while `"serveRecentState"` is optional. State availability can
be assumed for blocks with `blockNumber >= MAX(serveStateSince, headNumber-serveRecentState+1)`.
- `"txRelay"` (no value): present if the peer can relay transactions to the ETH network.
- `"flowControl/BL"`, `"flowControl/MRC"`, `"flowControl/MRR"`: see [Client Side Flow Control]
Unknown keys should be ignored by both sides. This allows announcing additional
capabilities while staying compatible with past protocol versions.
### Announce (0x01)
`[headHash: B_32, headNumber: P, headTd: P, reorgDepth: P, [[key_0, value_0], [key_1, value_1], ...]]`
Announce a new chain head and optionally also a change to some of the values announced at
handshake. A restrictive change of server capabilities (for example, an increase of
`"serveStateSince"` due to state pruning) should be announced at least 10 seconds prior to
actually restricting those capabilities in order to avoid asynchronous problems. Changes
to unknown keys should be ignored. Changes to known keys that make no sense lead to
disconnection.
Announcing a head with a lower or equal TD than previously announced or a head that the
sending node later refuses to honor with a proceeding [GetBlockHeaders] message (with
number and TD also matching) is considered bad form, and may lead to disconnection or
reduce the reputation of the sending node.
The field `reorgDepth` contains the number of blocks to be rolled back from the last head
announced by the same node in order to find the last common ancestor of the last and
current heaviest chain. Adding this field helps the client to minimize the number of
requests and the amount of bandwidth required to fetch new headers.
### GetBlockHeaders (0x02)
`[reqID: P, [block: {P, B_32}, maxHeaders: P, skip: P, reverse: P in {0, 1}]]`
Require peer to return a [BlockHeaders] message. Reply must contain a number of block
headers, of rising number when `reverse` is `0`, falling when `1`, `skip` blocks apart,
beginning at block `block` (denoted by either number or hash) in the canonical chain, and
with at most `maxHeaders` items.
### BlockHeaders (0x03)
`[reqID: P, BV: P, [blockHeader_0, blockHeader_1, ...]]`
Reply to [GetBlockHeaders]. The items in the list (following the message ID) are block
headers in the format described in the main Ethereum specification, previously asked for
in a [GetBlockHeaders] message. The list may be empty if none of the requested block
headers were available on the server side.
### GetBlockBodies (0x04)
`[reqID: P, [hash_0: B_32, hash_1: B_32, ...]]`
Require peer to return a [BlockBodies] message. Specify the set of blocks that we're
interested in with the hashes.
### BlockBodies (0x05)
`[reqID: P, BV: P, [[transactions_0, uncles_0] , ...]]`
Reply to [GetBlockBodies]. The items in the list (following the message ID) are some of
the blocks, minus the header, in the format described in the main Ethereum specification,
previously asked for in a [GetBlockBodies] message.
### GetReceipts (0x06)
`[reqID: P, [hash_0: B_32, hash_1: B_32, ...]]`
Require peer to return a [Receipts] message.
### Receipts (0x07)
`[reqID: P, BV: P, [[receipt_0, receipt_1, ...], ...]]`
Provide a set of receipts which correspond to the block hashes previously asked for in
[GetReceipts].
### GetProofs (0x08)
`[reqID: P, [[blockhash: B_32, key: B_32, key2: B_32, fromLevel: P], ...]]`
Require peer to return a [Proofs] message, containing one or more Merkle proofs, each
proving the value of index `key` from the state trie of the given block (if `key2` is
empty), or the storage value of index `key2` from the storage trie referenced in the
account at `key`. If `fromLevel` is greater than zero, the given number of trie nodes
closest to the root can be omitted from the proof.
This message was deprecated in **les/2**, use [GetProofsV2] instead.
### Proofs (0x09)
`[reqID: P, BV: P, [[node_1, node_2, ...], ...]]`
Return a set of Merkle proofs, each consisting of a set of nodes that must be processed in
order to access the trie entry value (or prove the absence of an entry) requested in
[GetProofs].
### GetContractCodes (0x0a)
`[reqID: P, [[blockhash: B_32, key: B_32], ...]]`
Require peer to return a [ContractCodes] message.
### ContractCodes (0x0b)
`[reqID: P, BV: P, [value_0: B, value_1: B, ...]]`
Provide a set of contract codes which correspond to the block hashes and account keys
previously asked in [GetContractCodes].
### GetHeaderProofs (0x0d)
`[reqID: P, [[chtNumber: P, blockNumber: P, fromLevel: P], ...]]`
Require peer to return a [HeaderProofs] message, containing one or more canonical block
headers (of block number `blockNumber`) and corresponding Merkle proofs of the [CHT]
(Canonical Hash Trie) identified by `chtNumber`. If `fromLevel` is greater than zero, the
given number of trie nodes closest to the root can be omitted from the proof.
This message was deprecated in **les/2**, use [GetHelperTrieProofs] instead.
### HeaderProofs (0x0e)
`[reqID: P, BV: P, [[blockHeader, [node_1, node_2...]], ...]]`
Return a set of structures, each containing a block header and a Merkle proof proving the
header hash and belonging TD against a given CHT requested in [GetHeaderProofs].
### SendTx (0x0c)
`[txdata_1, txdata_2, ...]`
Require peer to add a set of transactions into its transaction pool and relay them to the
ETH network.
This message was deprecated in **les/2**, use [SendTxV2] instead.
### GetProofsV2 (0x0f)
`[reqID: P, [[blockhash: B_32, key: B_32, key2: B_32, fromLevel: P], ...]]`
Require peer to return a [ProofsV2] message, containing a single (and smallest possible)
set of trie nodes that proves for each request the value of index `key` from the state
trie of the given block (if `key2` is empty), or the storage value of index `key2` from
the storage trie referenced in the account at `key`. If `fromLevel` is greater than zero,
the given number of trie nodes closest to the root can be omitted from the proof.
### ProofsV2 (0x10)
`[reqID: P, BV: P, [node_1, node_2, ...]]`
Return the smallest set of trie nodes required to access the trie entry value (or prove
the absence of an entry) requested in [GetProofsV2]. This set will be called a *proof
set*. Compared to [Proofs], this message contains a single list of nodes satisfying all
requested proofs. The list shouldn't contain duplicate nodes.
### GetHelperTrieProofs (0x11)
`[reqID: P, [[subType: P, sectionIdx: P, key: B, fromLevel: P, auxReq: P], ...]]`
Require peer to return a [HelperTrieProofs] message, containing a *proof set* and optional
auxiliary data for each request.
Note: this request is a generalization of the **les/1** [GetHeaderProofs] message. It
retrieves Merkle proofs from different types of "helper tries" which are generated for
every fixed-length section of the canonical chain. `subType` identifies the helper trie
that is being requested for the section marked by `sectionIdx`. `key` and `fromLevel` are
interpreted like in case of proof requests.
If `auxReq` is greater than zero then auxiliary data is requested too. If `auxReq` is 1
then the root hash of the specified trie (according to the server) is returned and no trie
nodes are added to the proof set. This special request will be required for trustless
validation of helper tries. The interpretation of `auxReq` values greater than 1 is
subject to `subType`.
The following `subType` integer values are allowed in **les/2**:
- CHT (`0`): request a key from the [Canonical Hash Trie]. If `auxReq` is 2 then the
belonging header is returned as `auxData`. `key` is the block number encoded as an
8-byte big endian. Note that the section size for CHTs has been raised to 32k instead of
4k blocks so for example a `sectionIdx` of 100 equals a `chtNumber` of 807 in case of
the **les/1** [GetHeaderProofs] message.
- BloomBits (`1`): request a key from the [BloomBits Trie]. In this trie `key` is 10 bytes
long, it consists of the bloom bit index encoded as a 2-byte big endian, followed by the
section index encoded as an 8-byte big endian. The returned value is the corresponding
compressed bloom bit vector.
### HelperTrieProofs (0x12)
`[reqID: P, BV: P, [[node_1, node_2...], [auxData_0, auxData_1, ...]]]`
Return a proof set and a set of `auxData` requested in [GetHelperTrieProofs]. The length
of the `auxData` list equals the number of requests with a non-zero `auxReq`.
### SendTxV2 (0x13)
`[reqID: P, [txdata_1, txdata_2, ...]]`
Require peer to add a set of transactions into its transaction pool and relay them to the
ETH network, then return a [TxStatus] message containing the status of the sent
transactions.
### GetTxStatus (0x14)
`[reqID: P, [txHash_1, txHash_2, ...]]`
Require peer to return a [TxStatus] message containing the status of the referenced
transactions. This message is intended for inquiry about past transactions sent by the
client. Note that the server is not required to make every transaction available
indefinitely.
### TxStatus (0x15)
`[reqID: P, BV: P, [[status: P, data: B], ...]]`
Return the current status of the sent/queried transactions. Possible `status` values are:
- Unknown (`0`): transaction is unknown
- Queued (`1`): transaction is queued (not processable yet)
- Pending (`2`): transaction is pending (processable)
- Included (`3`): transaction is already included in the canonical chain. `data` contains
an RLP-encoded `[blockHash: B_32, blockNumber: P, txIndex: P]` structure.
- Error (`4`): transaction sending failed. `data` contains a text error message.
### StopMsg (0x16)
Instruct the client to temporarily stop sending requests and to not expect responses to those requests it did not already receive a reply for.
Implementer's note: this message can be used to handle transient server overloads or individual client flow control buffer underruns. The server should avoid sending [StopMsg] too often though if the client also avoids buffer underruns. It should try to regulate its own utilization (and thereby also the frequency of transient overload occurences) with the flow control feedback. Receiving [StopMsg] more than once every few minutes in long term average or not receiving [ResumeMsg] in a few seconds can be considered bad service quality by the clients.
### ResumeMsg (0x17)
`[BV: P]`
Update flow control buffer and allow sending requests again. Note that the requests not answered before [StopMsg] were permanently canceled and will not be answered after [ResumeMsg]. If a [ResumeMsg] is received without a preceding [StopMsg] then it should be treated as a simple flow control buffer update (assuming that the server has already deducted the cost of the previously answered messages).
## Change Log
### les/4 (March 2021)
- Keys `"forkID"` and `"recentTxLookup"` were added to the [Status] message.
### les/3 (May 2019)
- Keys `"serveRecentChain"` and `"serveRecentState"` were added to the [Status] message.
- Messages [StopMsg] and [ResumeMsg] were added to improve handling transient overloads
and flow control buffer underruns.
### les/2 (November 2017)
- The `"announceType"` key was added to the [Status] message.
- The BloomBits Trie and associated messages [GetHelperTrieProofs], [HelperTrieProofs]
were added to facilitate server-assisted log search. **les/1** clients would frequently
download large ranges of receipts to search for specific logs.
- Messages [GetProofsV2], [ProofsV2] were added to de-duplicate result nodes when
requesting multiple proofs at the same time.
- Messages [SendTxV2], [GetTxStatus] and [TxStatus] were added to allow querying for past
transactions and to enable user-lever error reporting for non-includable transactions at
the time of submission.
- The [GetHeaderProofs], [HeaderProofs], [GetProofs], [Proofs] and [SendTx] messages from
**les/1** are no longer supported in **les/2**.
[Client Side Flow Control]: #client-side-flow-control
[Canonical Hash Trie]: #canonical-hash-trie
[CHT]: #canonical-hash-trie
[BloomBits Trie]: #bloombits-trie
[Status]: #status-0x00
[Announce]: #announce-0x01
[GetBlockHeaders]: #getblockheaders-0x02
[BlockHeaders]: #blockheaders-0x03
[GetBlockBodies]: #getblockbodies-0x04
[BlockBodies]: #blockbodies-0x05
[GetReceipts]: #getreceipts-0x06
[Receipts]: #receipts-0x07
[GetProofs]: #getproofs-0x08
[Proofs]: #proofs-0x09
[GetContractCodes]: #getcontractcodes-0x0a
[ContractCodes]: #contractcodes-0x0b
[GetHeaderProofs]: #getheaderproofs-0x0d
[HeaderProofs]: #headerproofs-0x0e
[SendTx]: #sendtx-0x0c
[GetProofsV2]: #getproofsv2-0x0f
[ProofsV2]: #proofsv2-0x10
[GetHelperTrieProofs]: #gethelpertrieproofs-0x11
[HelperTrieProofs]: #helpertrieproofs-0x12
[SendTxV2]: #sendtxv2-0x13
[GetTxStatus]: #gettxstatus-0x14
[TxStatus]: #txstatus-0x15
[StopMsg]: #stopmsg-0x16
[ResumeMsg]: #resumemsg-0x17
[Ethereum Wire Protocol]: ./eth.md
[Merkle Patricia Trie]: https://github.com/ethereum/wiki/wiki/Patricia-Tree