Light clients are crucial elements in blockchain ecosystems. They help users access and interact with a blockchain in a secure, decentralized and trustless manner. https://www.parity.io/blog/what-is-a-light-client/
@Daan van der Plas
For the last 1.5 months I’ve gone through the Smoldot code, which was written by Pierre Krieger. Smoldot is an alternative Rust client for Substrate-based chains and contains two clients: the full client and the wasm light node. This knowledge base focuses solely on the wasm light node implementation. I will start with a short introduction to Smoldot and Substrate Connect. After that I will guide you through the code and give you an overview of how everything works and why it is important. Because Pierre has done an amazing job documenting everything, I often link to the source code.
Alright, let's go!
Before we begin: the code executes asynchronously and uses many asynchronous tasks. Tasks are mentioned throughout the code walkthrough and are marked with the symbol (T) in the diagrams provided for each service. The arrows between the service in question and other services represent mpsc channels (multi-producer, single-consumer), which are used to share information between these asynchronous tasks. Simply put, a channel has a transmitting end and a receiving end. Both ends share ownership of a queue that is allocated on the heap. When the sending end pushes a message onto the queue, the receiving end is notified and can pick it up.
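As a minimal sketch of the channel idea described above, the snippet below uses the standard library's `std::sync::mpsc` with an OS thread standing in for an asynchronous task. This is an assumption made for brevity: Smoldot itself uses asynchronous channels and tasks, and the `Notification` type and `run_pipeline` function are hypothetical names used only for illustration.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical message type sent from one task to another.
#[derive(Debug, PartialEq)]
enum Notification {
    NewBlock(u64),
    Finalized(u64),
}

/// Spawns a producer "task" (a thread here) and collects everything it sends.
fn run_pipeline() -> Vec<Notification> {
    // `tx` is the transmitting end, `rx` is the receiving end.
    let (tx, rx) = mpsc::channel();

    let producer = thread::spawn(move || {
        for height in 1..=3u64 {
            tx.send(Notification::NewBlock(height)).unwrap();
        }
        tx.send(Notification::Finalized(2)).unwrap();
        // `tx` is dropped when the closure ends, which closes the channel.
    });

    // Iterating over `rx` blocks until every sender has been dropped.
    let received: Vec<Notification> = rx.iter().collect();
    producer.join().unwrap();
    received
}
```

Because there is a single sender, the messages arrive in the order in which they were sent.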
Most blockchain user interfaces in the ecosystem work by connecting via a server (e.g. PolkadotJS) to a few trusted blockchain nodes, which represent a central point of failure. Generally, if one wants to interact with a blockchain securely and in a trustless manner, syncing a full node is necessary, which requires a lot of knowledge, effort, and resources. This is where light clients come into play. Put simply, a light client joins the peer-to-peer network of a chain and is able to interact with multiple full nodes. Unlike full nodes, light clients don’t need to run 24/7 or store a lot of data. Instead, light clients rely on full nodes to obtain the information they need, e.g. the balance of a specific user. However, and this is very important, after obtaining this information a light client verifies that it is correct. As a result, people who interact with blockchain user interfaces obtain information from the blockchain independently and trustlessly, thanks to a light client running on their device! In addition, Smoldot validates a transaction submitted by the user before sending it to full nodes to be added to the transaction pool. Trusted blockchain nodes are now only needed for light clients to join the peer-to-peer network.
Substrate Connect provides the infrastructure necessary to run light clients directly in the browser. In addition, the browser extension enables resource sharing across browser tabs. Without the extension, Substrate Connect runs in the browser with each browser tab running a light client instance. This route will no doubt negatively impact page loading speed, providing a suboptimal user experience, especially compared to Web2 alternatives. Furthermore, Substrate Connect doesn’t require a TLS certificate to connect to nodes, as the connection is initiated from within the browser extension, which has more access rights than a typical website. Substrate Connect works in all major browsers, and when using the extension, it acts as a bridge, where Smoldot will run in the extension, making it possible for every tab or website to sync with the chain.
Now let's unravel how Smoldot syncs with the chain!
First, this abstract covers the essentials of Smoldot so that you do not have to read the whole document. Everything is described again, in more detail, starting from the chainspec.
Whether Smoldot is used to interact with the relay chain or a parachain, it always needs to sync with the relay chain. As for a parachain, due to Polkadot's shared security, the parachain's finalized state is guaranteed on the relay chain. In other words, information regarding a parachain block, whether it is included, reverted or finalized, is acquired from the relay chain.
To verify whether a relay chain block is finalized, a light client needs to know the validators elected to take part in the GRANDPA protocol. By knowing the elected validators for a given era, Smoldot can verify the justification, which contains the proof of the finality of a block. The justification consists of a set of GRANDPA commits signed by the elected validators. If more than 2/3 of the elected validators voted for the block with their commit (that is, at least ⌊2n/3⌋ + 1 out of n validators) and the signatures are correct, Smoldot can conclude that the block is finalized.
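The supermajority check can be sketched as follows. This is a minimal illustration of the arithmetic only, assuming the signatures have already been verified; the function names are hypothetical and not taken from the Smoldot code.

```rust
/// Returns true if `valid_signatures` is strictly more than two thirds of
/// `authorities`. Integer arithmetic avoids rounding issues.
fn has_supermajority(valid_signatures: usize, authorities: usize) -> bool {
    authorities > 0 && 3 * valid_signatures > 2 * authorities
}

/// Smallest number of valid signatures that forms a supermajority:
/// floor(2n / 3) + 1.
fn min_signatures(authorities: usize) -> usize {
    authorities * 2 / 3 + 1
}
```

For example, with 100 validators at least 67 valid signatures are needed, and with 3 validators all 3 are needed.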
The elected validators form the authority set for a given GRANDPA era and are elected by the NPoS algorithm. A node that tries to sync with the relay chain must track changes to the authority set in order to verify justifications. Tracking these changes also offers an alternative and less resource-intensive way of getting to the head of the chain, called the warp sync protocol. Instead of requesting all the blocks to get to the current state of the chain, a light client only needs to request fragments. These fragments provide the necessary proofs of the changes that have been made to the authority set.
When it is up to date with the latest GRANDPA era, Smoldot starts syncing with the chain in a way similar to a full node. The difference is that, instead of holding the state and verifying the content of new (non-finalized) blocks, it only receives and verifies new (non-finalized) block headers. Smoldot verifies a (non-finalized) block header by checking the authenticity of the block, in other words, whether the author of the block was selected by the BABE protocol. In short, BABE breaks time into epochs, with each epoch being broken into slots. BABE selects an author (or several) to author a block in each slot.
On the whole, Smoldot verifies consensus and finality by keeping track of the authority set (the active validator set). The starting authority set is obtained through the runtime, and once Smoldot is up to date with the chain it acquires subsequent changes from the block headers.
In order to give Smoldot the necessary information to start syncing with a chain it requires the chainspec. A chain specification is the description of everything that is required for the client to successfully interact with a certain blockchain. From the chainspec the chain information is built. In the code, the chain information is all the crucial information Smoldot needs to know to verify the consensus and finality. It needs an initial state for the chain information and it is constructed from either:
:code, which contains the WebAssembly code of the runtime. Besides the WASM code, it contains the runtime version and the runtime APIs it provides. As for adding a relay chain, specific runtime calls of the GRANDPA pallet are required to build the starting chain information:
(⇒ All resulting in the ChainInformation)
As a result, the chain information provides the starting authority set for the warp sync process. However, to sync with a chain it needs information from other peers and therefore needs to join the peer-to-peer network.
In order to connect to other nodes, Smoldot needs the networking service. It is responsible for joining the peer-to-peer network; for Substrate-based chains, the libp2p protocol is used. The networking service starts three background tasks:
This task ensures that Smoldot is always connected to a certain number of peers. The number of in-slots and out-slots refers, respectively, to the maximum number of peers that can connect to Smoldot and the maximum number of peers Smoldot connects to. In addition, when a connection is established, it requests to open substreams with the given peer. Last but not least, it coordinates the requests and responses going from the connections to Smoldot and vice versa. In the code, the coordinator is responsible for this.
In detail:
out-slots
For a peer-to-peer network it is necessary that the information needed to connect to any given peer is available at any given time. Because the network has no central party, the storage of this information needs to be decentralized and is therefore divided over all the peers in the network through the Kademlia distributed hash table (DHT). In the DHT each node has so-called k-buckets, which form a partial view of the complete list of all the peers in the network. Peer discovery is done by sending a Kademlia "find node" request to a single peer. More specifically, based on the Kademlia request-response protocol, Smoldot builds a wire message that asks the target to return the nodes closest to the parameter. To decide which peer it sends this request to:
As a result, Smoldot receives a new list of peers with their multiaddresses. These peers are inserted into the k-buckets if there is space, and otherwise replace peers that Smoldot has not been connected to.
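To make "closest" and "k-bucket" more concrete, here is a minimal sketch of the Kademlia XOR metric. It assumes a toy 4-byte identifier (real libp2p Kademlia keys are 32-byte hashes of the peer ID), and all names are hypothetical rather than taken from Smoldot.

```rust
/// Hypothetical 4-byte node identifier, used only to keep the example small.
type NodeId = [u8; 4];

/// XOR distance between two identifiers, read as a big-endian integer.
fn xor_distance(a: &NodeId, b: &NodeId) -> u32 {
    let mut out = [0u8; 4];
    for i in 0..4 {
        out[i] = a[i] ^ b[i];
    }
    u32::from_be_bytes(out)
}

/// Bucket in which `local` would store `other`: the number of leading bits
/// both identifiers have in common. `None` if they are identical.
fn bucket_index(local: &NodeId, other: &NodeId) -> Option<u32> {
    let distance = xor_distance(local, other);
    if distance == 0 {
        None
    } else {
        Some(distance.leading_zeros())
    }
}

/// What a peer answers to a "find node" request: the `count` peers it knows
/// that are closest to `target`.
fn closest_peers(known: &[NodeId], target: &NodeId, count: usize) -> Vec<NodeId> {
    let mut sorted = known.to_vec();
    sorted.sort_by_key(|peer| xor_distance(peer, target));
    sorted.truncate(count);
    sorted
}
```

Peers that share a longer prefix with the local identifier land in a higher-numbered bucket in this sketch, so each bucket covers a progressively smaller part of the identifier space.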
After the initialization of the network, sync, runtime and transaction service, the bootnodes (obtained from the chainspec) are added to Smoldot’s k-buckets and Smoldot can start connecting to the peer-to-peer network.
Smoldot has two types of connections due to API-related purposes, more specifically two ways of calling the browser's APIs.
There are two types of substreams:
When a connection is made, a block announcement substream is requested. If it is accepted, a request for additional GRANDPA announcements follows. In addition, when an inbound substream is opened, Smoldot requests an outbound substream. This is used to update the peers with Smoldot’s view of the state of the chain.
Now that the networking service has been set up, Smoldot starts a new background task, the synchronization service. The role of the sync service is to do whatever is necessary to obtain and stay up to date with the best and finalized blocks of the chain. For the relay chain it starts with the warp sync protocol to get to the latest GRANDPA era as fast as possible, then it switches to the all-forks protocol. The parachain synchronization relies on the relay chain, more specifically on the runtime service of the relay chain.
In general, Smoldot will track a list of sources, which represent peers that Smoldot receives new block headers and GRANDPA commits from. From the information it receives from a given peer, it knows which (non-finalized) blocks this peer is aware of.
The flow of the relay chain synchronization service consists of:
Thanks to GRANDPA, Smoldot is able to get up to date with the latest GRANDPA era through the warp sync protocol. Instead of requesting all the blocks to get to the current state of the chain, a light client only needs to request fragments. These fragments provide the necessary proofs of the changes that have been made to the authority set, also known as the elected validators. The elected validators are elected by the NPoS algorithm to participate in the GRANDPA protocol for that given era. By knowing the elected validators, Smoldot can verify a justification. In other words, it can verify the finality of a block within a given era. Changes to the authority set are signed by the previous authority set and stated in the block header at the end of the era. When the respective block is finalized and the justification is verified by Smoldot, it knows the new authority set in a trustless manner!
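The hand-over from one authority set to the next can be sketched as a loop over fragments. This is a deliberately simplified model and not Smoldot's implementation: authorities are plain integers, a "signature" is just the identifier of the signer (real justifications carry cryptographic signatures that must be verified), and the `Fragment` type and `warp_sync` function are hypothetical.

```rust
use std::collections::HashSet;

/// Simplified warp sync fragment (hypothetical type).
struct Fragment {
    /// Authorities that signed the justification of this fragment's block.
    /// Stand-in for real signatures, which would need to be verified.
    signers: Vec<u32>,
    /// Authority set announced in the block header for the next era.
    next_authorities: Vec<u32>,
}

/// Walks the fragments in order. Returns the latest authority set, or the
/// index of the first fragment whose justification is not backed by more
/// than 2/3 of the authority set known at that point.
fn warp_sync(start: Vec<u32>, fragments: &[Fragment]) -> Result<Vec<u32>, usize> {
    let mut current = start;
    for (index, fragment) in fragments.iter().enumerate() {
        let known: HashSet<u32> = current.iter().copied().collect();
        // Only count signers from the current set, each at most once.
        let valid: HashSet<u32> = fragment
            .signers
            .iter()
            .copied()
            .filter(|signer| known.contains(signer))
            .collect();
        if 3 * valid.len() <= 2 * current.len() {
            return Err(index);
        }
        // The verified block announces the set that signs the next fragment.
        current = fragment.next_authorities.clone();
    }
    Ok(current)
}
```

The important property is that every new authority set is only accepted once the previous, already trusted set has finalized the block that announces it.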
The warp syncing process can be split into 4 phases:
When Smoldot is up to date with the latest GRANDPA authority set, i.e. latest era:
:code (encoded WASM blob).
As might be noticed, this time the dispatchables are BABE related instead of GRANDPA related. That is because the BABE protocol decides which elected validator is allowed to author a block and when, which is important for the all-forks syncing protocol.
The all-forks syncing strategy is very similar to how full nodes sync with the network: they hold the state and verify the content of new (non-finalized) blocks and GRANDPA commits. Smoldot, however, only receives and verifies new (non-finalized) block headers and GRANDPA commits. It verifies a (non-finalized) block header by checking its authenticity, in other words, whether the author of the block was selected by the BABE protocol.
BABE breaks time into epochs, with each epoch broken into slots. BABE will select an author (or several) to author a block in each slot. Each slot can have a primary and secondary author (or “slot leader”). Primary slot leaders are assigned randomly, using VRF. VRF takes an epoch random seed (agreed upon in advance by all nodes), a slot number and the author’s private key. Each author evaluates its VRF for each slot in an epoch. For each slot whose output is below some agreed-upon threshold, the validator has the right to author a block in that slot. Because the function is random, however, sometimes there are slots without a leader. In order to ensure a consistent block time, BABE uses a round-robin system to assign secondary slot leaders.
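The slot assignment described above can be sketched as follows. The VRF outputs are passed in as plain numbers, which is an assumption made to keep the example self-contained: a real implementation evaluates and verifies a VRF, and the secondary author is picked with the simple round-robin rule mentioned in the text. The `SlotClaim` type and `slot_leaders` function are hypothetical names.

```rust
/// Who may author a block in a given slot (hypothetical type).
#[derive(Debug, PartialEq)]
enum SlotClaim {
    /// Indices of the authorities whose VRF output fell below the threshold.
    Primary(Vec<usize>),
    /// Index of the fallback author when the slot has no primary leader.
    Secondary(usize),
}

/// `vrf_outputs[i]` stands in for the VRF output of authority `i` in `slot`.
/// Panics if `vrf_outputs` is empty.
fn slot_leaders(vrf_outputs: &[u64], threshold: u64, slot: u64) -> SlotClaim {
    assert!(!vrf_outputs.is_empty(), "at least one authority is required");
    let primaries: Vec<usize> = vrf_outputs
        .iter()
        .enumerate()
        .filter(|(_, output)| **output < threshold)
        .map(|(index, _)| index)
        .collect();
    if primaries.is_empty() {
        // Simplified round-robin: slot number modulo the number of authorities.
        SlotClaim::Secondary((slot % vrf_outputs.len() as u64) as usize)
    } else {
        SlotClaim::Primary(primaries)
    }
}
```

A slot can end up with several primary leaders, which is one of the reasons forks appear and why the syncing strategy has to track all of them.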
The header of the first block produced after a transition to a new epoch contains the public keys that are allowed to sign new blocks for the next epoch. When this block is finalized, Smoldot knows the block authors for the next epoch.
The all-forks syncing process:
Correspondingly, it determines if it is the best block, it checks for consensus updates in the header digest, and adds the block to the state machine:
It is important to mention that, when the warp sync has finished, there is a high probability of a misalignment between the received information and Smoldot’s latest finalized block. In order to correct this misalignment, Smoldot requests the necessary blocks and justifications.
After successfully verifying a new block or GRANDPA commit, Smoldot sends an update to other light clients it is connected to. In addition, it updates the services that are subscribed to the sync service:
Last, it informs all its present and future peers of the state of the local node regarding the best and finalized block.
It is recommended to first read the runtime service with the relay chain in mind.
The relay chain stores the head-data, also known as parahead, of every registered parachain.
The flow of the parachain and parathread syncing service consists of:
First, the service subscribes to the runtime service of the relay chain to obtain the current state of the relay chain and to get notified about new blocks. Once the runtime service has downloaded the finalized runtime, the parachain syncing service can fetch paraheads, i.e. best and finalized blocks, by calling the ParachainHost_persisted_validation_data runtime function. However, in order to execute this call, Smoldot needs the Merkle node values that the ParachainHost_persisted_validation_data runtime call requires during execution. ⇒ If successful, the parahead is obtained from the resulting PersistedValidationData.
The parahead, or head data, contains information about a parachain block. When the relay chain block from which the parahead is obtained is finalized, the parahead is finalized as well. When a parahead is finalized, Smoldot checks whether it is up to date with the block height of other parachain nodes.
New blocks and finality updates are notified to the subscribers:
The next asynchronous task is the runtime service. Essentially, this service wants to have the latest finalized downloaded runtime to provide to other services. Therefore it needs to stay up to date with the chain. In order to stay up to date with the chain it subscribes to the synchronization service. When a service subscribes to the sync service it receives the current state of the chain (the finalized block and the non-finalized descendants) and gets notified regarding new non-finalized (best) blocks and finality updates. As a result, the runtime service holds a data structure of a tree of non-finalized blocks that all descend from the finalized block.
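The tree of non-finalized blocks and its pruning on finality can be sketched like this. It is a minimal model under stated assumptions: blocks are identified by small integers instead of hashes, and the `BlockTree` type and its methods are hypothetical, not the data structure used in Smoldot.

```rust
use std::collections::HashMap;

/// Hypothetical tree of non-finalized blocks rooted at the finalized block.
struct BlockTree {
    finalized: u32,
    /// Maps every non-finalized block to its parent.
    parents: HashMap<u32, u32>,
}

impl BlockTree {
    fn new(finalized: u32) -> Self {
        BlockTree { finalized, parents: HashMap::new() }
    }

    /// Adds a block; returns false if its parent is unknown.
    fn insert(&mut self, block: u32, parent: u32) -> bool {
        if parent != self.finalized && !self.parents.contains_key(&parent) {
            return false;
        }
        self.parents.insert(block, parent);
        true
    }

    /// True if `block` is `ancestor` or one of its descendants.
    fn descends_from(&self, mut block: u32, ancestor: u32) -> bool {
        loop {
            if block == ancestor {
                return true;
            }
            match self.parents.get(&block) {
                Some(parent) => block = *parent,
                None => return false,
            }
        }
    }

    /// Finalizes `block` and prunes every block that does not descend from
    /// it. Returns false if `block` is not a known non-finalized block.
    fn finalize(&mut self, block: u32) -> bool {
        if !self.parents.contains_key(&block) {
            return false;
        }
        let keep: Vec<u32> = self
            .parents
            .keys()
            .copied()
            .filter(|candidate| *candidate != block && self.descends_from(*candidate, block))
            .collect();
        self.parents.retain(|id, _| keep.contains(id));
        self.finalized = block;
        true
    }
}
```

After a finality update only the descendants of the new finalized block remain, so competing forks disappear from the tree.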
These new block headers contain the header digest which states whether the runtime has been upgraded. If so, the runtime service has to download the new runtime:
:code (encoded WASM blob)
Through wasmi, the WASM blob is interpreted inside the native binary as a WASM virtual machine. The dispatchables from the runtime are exported and the host functions are imported. All of that together makes the WASM virtual machine specific to a Substrate/Polkadot runtime.
When executing a runtime call through a WASM virtual machine, the execution could be interrupted because it requests execution of a host function. The virtual machine is then interrupted, the host function is executed and the virtual machine will be resumed with the result of this host function.
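The interrupt-and-resume pattern can be sketched as a small state machine. Nothing here is WebAssembly: the "runtime call" is a pair of Rust closures and the only host function is a storage read, both assumptions made to keep the example short. The `Vm` type, `start_call` and `run` are hypothetical names.

```rust
use std::collections::HashMap;

/// State of a hypothetical virtual machine after it ran as far as it could.
enum Vm {
    /// Execution is paused: the runtime asks the host for a storage value.
    HostCall {
        key: &'static str,
        resume: Box<dyn FnOnce(u64) -> Vm>,
    },
    /// Execution completed with a return value.
    Finished(u64),
}

/// Toy runtime call: reads two storage values and returns their sum.
fn start_call() -> Vm {
    Vm::HostCall {
        key: "balance_alice",
        resume: Box::new(|alice: u64| Vm::HostCall {
            key: "balance_bob",
            resume: Box::new(move |bob: u64| Vm::Finished(alice + bob)),
        }),
    }
}

/// Host side: drives the VM, answers each host call, then resumes it.
/// Returns the result and the list of host calls that were made.
fn run(storage: &HashMap<&str, u64>) -> (u64, Vec<&'static str>) {
    let mut vm = start_call();
    let mut host_calls = Vec::new();
    loop {
        match vm {
            Vm::Finished(value) => return (value, host_calls),
            Vm::HostCall { key, resume } => {
                host_calls.push(key);
                let value = storage.get(key).copied().unwrap_or(0);
                vm = resume(value);
            }
        }
    }
}
```

For a light client this structure is convenient because the host can answer a storage read with a value it first has to fetch and verify, and only then resume the virtual machine.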
When the runtime service gets notified about the block which stated the runtime upgrade, it has the latest finalized runtime again. Accordingly, the data structure will be pruned and the following services can utilize the latest (upgraded) finalized runtime:
The transaction service handles everything related to transactions. It holds a data structure called the transaction pool, which holds a list of pending transactions, i.e. transactions that should later be included in blocks. It also holds a list of transactions that have been included in non-finalized blocks. For the light client, the transaction service and the transaction pool are idle most of the time, because they only operate when the user submits a transaction.
The transaction service utilizes the runtime service. Just like the parachain syncing service, it wants to use the latest finalized runtime. The transaction service needs the runtime to validate incoming transactions from the JSON-RPC service. It validates a submitted transaction against the latest (best) block. If the transaction is valid, it is gossiped to peers through the networking service so that it can be added to a block. Moreover, the service checks new (best) blocks to see whether the transaction is included, and tracks when the block that includes the transaction is finalized.
The validation of a transaction:
TaggedTransactionQueue_validate_transaction runtime call.
After validating the transaction and sending it to peers, the service downloads the block bodies of the latest new blocks:
It updates the submitter of the transaction about the status of the transaction.
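The status updates sent to the submitter can be sketched as a small state machine. This is an assumption-laden illustration: blocks are identified by a number instead of a hash, forks and retractions are ignored, and `TxStatus` and `TrackedTx` are hypothetical names.

```rust
/// Simplified status of a submitted transaction (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
enum TxStatus {
    /// Validated and gossiped, but not yet seen in a block.
    Pending,
    /// Found in the body of the given (non-finalized) block.
    InBlock(u64),
    /// The block that includes the transaction has been finalized.
    Finalized(u64),
}

/// A transaction tracked by the transaction service.
struct TrackedTx {
    bytes: Vec<u8>,
    status: TxStatus,
}

impl TrackedTx {
    fn new(bytes: Vec<u8>) -> Self {
        TrackedTx { bytes, status: TxStatus::Pending }
    }

    /// Called with the downloaded body of a new block.
    fn on_block_body(&mut self, block: u64, body: &[Vec<u8>]) {
        if self.status == TxStatus::Pending && body.iter().any(|tx| *tx == self.bytes) {
            self.status = TxStatus::InBlock(block);
        }
    }

    /// Called when `block` has been finalized.
    fn on_finalized(&mut self, block: u64) {
        if self.status == TxStatus::InBlock(block) {
            self.status = TxStatus::Finalized(block);
        }
    }
}
```

Each change of `status` corresponds to one notification to the user who submitted the transaction.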
The last asynchronous task that will be spawned handles the JSON-RPC service. The JSON-RPC service holds a state machine which consists of a list of clients (Smoldot API users), pending outgoing messages, pending request(s) and active subscription(s). It all starts with a submitted JSON-RPC request by the API user:
chain_get_header
chain_subscribe_new_heads
Because this service depends on many other services, it creates and manages multiple lightweight tasks. For each subscription it spawns a new lightweight task that waits for updates and notifies the API user (these updates need to be manually polled by the API user). In addition, the service starts another small task dedicated to filling the cache with new blocks from the runtime service. The API user is more likely to ask for information about recent blocks and to perform calls on them, hence a cache of recent blocks.
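A cache of recent blocks can be sketched as a bounded map that evicts its oldest entry. This is only an illustration of the idea: block numbers are used as keys, a `String` stands in for a header, eviction is by insertion order, and the `RecentBlocks` type is hypothetical rather than the cache used in Smoldot.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical bounded cache that keeps only the most recently added blocks.
struct RecentBlocks {
    capacity: usize,
    /// Block numbers in insertion order, oldest first.
    order: VecDeque<u64>,
    /// Block number to header (a `String` placeholder in this sketch).
    headers: HashMap<u64, String>,
}

impl RecentBlocks {
    fn new(capacity: usize) -> Self {
        RecentBlocks {
            capacity,
            order: VecDeque::new(),
            headers: HashMap::new(),
        }
    }

    /// Stores a header and evicts the oldest entries beyond `capacity`.
    fn insert(&mut self, number: u64, header: String) {
        if self.headers.insert(number, header).is_none() {
            self.order.push_back(number);
        }
        while self.order.len() > self.capacity {
            if let Some(oldest) = self.order.pop_front() {
                self.headers.remove(&oldest);
            }
        }
    }

    /// Returns the cached header, if the block is still in the cache.
    fn get(&self, number: u64) -> Option<&String> {
        self.headers.get(&number)
    }
}
```

A request about a block that is no longer cached would then have to be answered by asking the network instead.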