Block Exchange Protocol

\[ \def\textsc#1{\dosc#1\csod} \def\dosc#1#2\csod{\text{#1}\small\style{text-transform: uppercase}{\text{#2}}} \]

\[ \newcommand{\swarmsample}{\textsc{SwarmSample}} \newcommand{\announcepeer}{\textsc{AnnouncePeer}} \newcommand{\have}{\textsc{Have}} \]

The protocol at a given peer \(p\) is divided into two distinct components:

  1. neighborhood (connection) manager;
  2. block exchange manager manager.

The neighborhood manager is responsible for maintaining a minimum set of \(d\) peers within \(p\)'s network view. The block exchange manager, on the other hand, will manage block queries and uploads/downloads with surrounding peers, and may request additional or different neighbors to the neighborhood manager should that need arise.

Peers can be of any of three distinct types:

  1. download/cache peers (DCPs). Downloads blocks from and uploads blocks to other peers in the swarm. May or may not hold a viable copy[1] of the file;
  2. storage peers (SPs). Only upload blocks to other peers in the swarm. Typically do not hold a viable copy of the file, unless they are also DCPs;
  3. DHT tracker nodes. Keeps track of a (subset of) the peers that currently make up a download swarm.

Base Protocol

Neighborhood bootstrap. A peer \(p\) joining a swarm to download a file \(F\) starts by:

  1. sending an \(\announcepeer\) message, containing a peer descriptor with its network address and port, to the DHT tracker;
  2. receiving a \(\swarmsample\) message in response, which contains up to \(d\) randomly chosen peer descriptors (where \(d \in \mathbb{Z}^{+}\)), and which should be used to bootstrap \(p\)'s neighborhood manager.
type PeerDescriptor* = object of RootObj peerId: Hash ip: IpAddress port: Port AnnouncePeer* = object of Message descriptor: PeerDescriptor SwarmSample* = object of Message sample: seq[PeerDescriptor]

Figure 1. Peer descriptors, \(\announcepeer\), and \(\swarmsample\).

Block knowledge bootstrap. Once \(p\) has access to a neighborhood, it attempts to map the set of blocks held by each neighbor by sending \(\have\) messages to all of them, in sets of size \(d_q \leq d\) at a time. Each \(\have\) message contains a want list which encodes the full set of blocks that \(p\) does not have on \(F\). Note that the want list may contain more blocks than what is required for \(p\) to obtaion a viable copy of \(F\).

type Have* = object of RootObj wants: IntSet

Figure 2. A \(\have\) message.

A neighbor \(q\) receiving a \(\have\) message will reply with a \(\have\) message containing its own wantlist of the blocks it does not have. This exchange allows \(q\) to learn about the blocks \(p\) currently has (and wants), and \(p\) to learn about the blocks \(q\) currently has (and wants). \(p\) will keep track, for every neighbor \(q\), of the blocks that \(q\) wants and has. Once \(p\) has exchange \(\have\) messages with \(q\), its knowledge about \(q\)'s blocks is bootstrapped.

Block knowledge maintenance. Maintenance of block knowledge then happens as peers download blocks, verify their correctness (Merkle proof), and broadcast that knowledge to neighbors that want them.

Block download schedule. As peer \(p\) continuously updates its knowledge about which peers have which blocks, it contructs and maintains its download schedule. A download schedule is nothing but an ordered set of \((\text{peer}, \text{block})\) pairs where pairs are ordered by a scoring function \(f\).

There are several possible ways to build scoring functions. For instance, \(p\) could privilege rare blocks as Bittorrent does, and could construct weighted combinations of peer speed/latency and a score for blocks. At first, we will start simple and implement: i) random selection; ii) rarest first.

proc download(self: Peer, F: File): while not self.store.hasEnoughBlocks(F): let downloadSlot = await self.acquireDownloadSlot() let (neighbor, blockDescriptor) = nextFromSchedule( self.peers, self.store, F) blockDescriptor.set(BlockState.downloading) if await downloadSlot.download(blockDescriptor, self.store, neighbor): blockDescriptor.set(BlockState.downloaded) announceBlock(blockDescriptor, self.peers) else: blockDescriptor.set(BlockState.pending)

Figure 3. Pseudocode for a file download.

Neighborhood maintenance. In addition to maintaining block knowledge, \(p\) needs to maintain its neighborhood as well. A neighbor \(q\) is elligible for replacement if one of the following holds:

  1. \(q\) is declared dead by \(p\)'s heartbeat failure detector;
  2. \(q\) fails to supply blocks to \(p\) after \(m\) attempts;
  3. \(q\) is a storage node and \(p\) already has all the blocks \(q\) has to offer.

In any of those cases, \(p\) will trigger a resample, in which it will ask from the DHT tracker for a new random peer. \(p\) may optionally inform the tracker of which peers it already knows so as to avoid having those being returned. If no new peer can be found, \(p\) waits for some amount of time, and tries again.[2]


  1. a set of blocks that is as large as the minimum number required to reconstruct the file. ↩︎

  2. simple exponential backoff could work as a first strategy here. ↩︎

Select a repo