---
# System prepended metadata

title: Block Protocol Message Flow and Data

---

$$
\newcommand{\cid}[1]{\text{CID}(#1)}
$$

# Block Protocol Message Flow and Data

Peer $a$ wants to obtain a dataset $D$ with CID $\cid{D}$, which is a CIDv1 containing, in human-readable form:

```text
<0x01 (code for CIDv1)> <0xCD01 (code for logos storage manifest)> <sha2 256 0x...digest>
```
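To make the layout concrete, here is a minimal Python sketch of how such a CID could be assembled and parsed at the byte level. It assumes the standard multiformats encoding (unsigned varints for the version, codec, and multihash header, with `0x12` as the SHA2-256 multihash code) and takes the `0xCD01` codec value from the layout above. The function names are illustrative and not part of any Logos Storage API.

```python
import hashlib

CID_V1 = 0x01            # CID version code
MANIFEST_CODEC = 0xCD01  # manifest codec code quoted in the layout above
SHA2_256_CODE = 0x12     # multihash code for SHA2-256


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as an unsigned LEB128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data: bytes, offset: int = 0) -> tuple[int, int]:
    """Decode a varint starting at offset; return (value, next_offset)."""
    value = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, offset
        shift += 7


def make_manifest_cid(manifest_bytes: bytes) -> bytes:
    """Build <version><codec><multihash> for the given encoded manifest."""
    digest = hashlib.sha256(manifest_bytes).digest()
    multihash = encode_varint(SHA2_256_CODE) + encode_varint(len(digest)) + digest
    return encode_varint(CID_V1) + encode_varint(MANIFEST_CODEC) + multihash


def parse_cid(cid: bytes) -> tuple[int, int, int, bytes]:
    """Split a binary CID into (version, codec, hash_code, digest)."""
    version, pos = decode_varint(cid)
    codec, pos = decode_varint(cid, pos)
    hash_code, pos = decode_varint(cid, pos)
    length, pos = decode_varint(cid, pos)
    digest = cid[pos:pos + length]
    if len(digest) != length or pos + length != len(cid):
        raise ValueError("malformed CID")
    return version, codec, hash_code, digest
```

Under these assumptions the binary CID is 38 bytes long: 1 byte of version, 3 bytes of codec varint, 2 bytes of multihash header, and the 32-byte digest.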

The SHA2-256 digest is the hash of the manifest, which, according to the Logos Storage dataset spec, contains:

```cddl=
manifest = {
  version: uint,
  codec: bstr,
  block_size: uint,
  block_count: uint,
  merkle_root: bytes .size 32,
  ? content_type: tstr,
  ? file_name: tstr,
}
```

Peer $a$ issues a `FIND_NODE` message to locate the nodes responsible for $\cid{D}$. `FIND_NODE` accepts a libp2p `PeerId`, so converting $\cid{D}$ to a `PeerId` might involve rehashing if the two are not the same length. Assuming both are 256 bits, I suppose we can use $\cid{D}$ directly and issue `findNode(cast[PeerId](cid))` once we migrate to the libp2p DHT.
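The length question can be illustrated with a small Python sketch. It is not libp2p's API: `dht_key`, `xor_distance`, and `closest_peers` are hypothetical helpers, and the rule "use the identifier as-is when it is already 32 bytes, otherwise rehash with SHA2-256" is only one possible reading of the paragraph above. The XOR metric is the usual Kademlia notion of closeness.

```python
import hashlib

KEY_BYTES = 32  # 256-bit keyspace, as assumed in the text


def dht_key(identifier: bytes) -> bytes:
    """Map an identifier to a 256-bit lookup key, rehashing only if the length differs."""
    if len(identifier) == KEY_BYTES:
        return identifier
    return hashlib.sha256(identifier).digest()


def xor_distance(a: bytes, b: bytes) -> int:
    """Kademlia XOR distance between two equal-length keys."""
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def closest_peers(target: bytes, peer_keys: list[bytes], count: int) -> list[bytes]:
    """Return the `count` peer keys closest to `target` under the XOR metric."""
    return sorted(peer_keys, key=lambda key: xor_distance(target, key))[:count]
```

Note that a full binary CIDv1 carries a few prefix bytes in front of the digest, so in this sketch only the bare 32-byte digest would pass through without rehashing.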

Once $a$ has a list of closest peers, it issues `GET_PROVIDERS` to one of them, and obtains a list of providers. $a$ connects to $\gamma$ such providers, attempting to satisfy its lower connection threshold for the swarm.

**1 - Manifest.** Upon connecting to the first peer in the swarm, call it $b$, $a$ issues a `GET_MANIFEST` message[^1] to $b$. $a$ could also issue `GET_MANIFEST` to $k \leq \gamma$ other peers if desired. The whole protocol is blocked until at least one reply is received.

Once $a$ retrieves a manifest, it decodes it and checks its authenticity by comparing its hash with the one contained in $\cid{D}$. If they match, $a$ _stores the manifest_ and proceeds to step 2. Otherwise it drops $b$ and tries again with another peer. $a$ repeats this process until it either gets the manifest or the user aborts the download; it never gives up on its own.
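The retry loop in step 1 can be sketched as follows. This is a simplified, synchronous illustration: `get_manifest` and `drop_peer` are hypothetical callbacks standing in for the real network layer, and `peers` is any iterable of provider handles, possibly an endless generator that keeps yielding newly discovered providers.

```python
import hashlib
from typing import Callable, Iterable, Optional


def manifest_matches(manifest_bytes: bytes, expected_digest: bytes) -> bool:
    """Check the encoded manifest against the SHA2-256 digest carried in the CID."""
    return hashlib.sha256(manifest_bytes).digest() == expected_digest


def fetch_manifest(
    peers: Iterable[str],
    get_manifest: Callable[[str], bytes],   # hypothetical: sends GET_MANIFEST, returns the reply
    expected_digest: bytes,
    drop_peer: Callable[[str], None],       # hypothetical: disconnects a misbehaving peer
) -> Optional[bytes]:
    """Ask peers one by one until a manifest that matches the digest arrives."""
    for peer in peers:
        reply = get_manifest(peer)
        if manifest_matches(reply, expected_digest):
            return reply   # caller stores the manifest and moves on to step 2
        drop_peer(peer)    # hash mismatch: drop this peer and try the next one
    return None            # only reached if the peer source is exhausted
```

Because the text says $a$ never gives up on its own, a real implementation would presumably feed this loop from a provider source that only ends when the user aborts.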

**2 - Block downloads.** Once $a$ has a verified, valid manifest, it uses the information contained in it to begin requesting block presence information and, finally, blocks.

Each received block has its Merkle proof checked against the Merkle root contained in the manifest.
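As an illustration of that check, here is a generic binary Merkle proof verifier in Python. The tree rules used here (SHA2-256 for leaves and inner nodes, plain concatenation of children, duplicating the last node on odd-sized levels) are assumptions made for the sketch and are not necessarily the construction used by Logos Storage. The builder exists only so the verifier can be exercised.

```python
import hashlib


def node_hash(left: bytes, right: bytes) -> bytes:
    """Hash of an inner node from its two children (assumed: SHA2-256 of the concatenation)."""
    return hashlib.sha256(left + right).digest()


def merkle_root_and_proof(blocks: list[bytes], index: int) -> tuple[bytes, list[bytes]]:
    """Build a tree over the blocks; return (root, sibling path for blocks[index])."""
    level = [hashlib.sha256(block).digest() for block in blocks]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])   # assumed rule: duplicate the last node on odd levels
        proof.append(level[index ^ 1])  # sibling of the current node
        level = [node_hash(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify_block(block: bytes, index: int, proof: list[bytes], root: bytes) -> bool:
    """Recompute the path from the block to the root and compare with the manifest's root."""
    current = hashlib.sha256(block).digest()
    for sibling in proof:
        if index % 2:
            current = node_hash(sibling, current)  # current node is a right child
        else:
            current = node_hash(current, sibling)  # current node is a left child
        index //= 2
    return current == root
```

In the flow above, `root` would be the `merkle_root` field of the manifest, and `index` the position of the block within the dataset.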

The Merkle root does not need to be used anywhere else except for validating Merkle proofs. In old Codex, the Merkle root identified the peer group sharing the data pointed at by the manifest with CID $\cid{D}$. Now, this group is identified directly by $\cid{D}$. 

We do not require the Merkle root even for block addresses: want lists can contain `(CID(D), [index list])` and things should work as expected.
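A want list of this shape could be modelled as below. This is only a sketch of the data structure: `BlockAddress`, `WantList`, and `missing_blocks` are hypothetical names, and the wire encoding is left out entirely.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class BlockAddress:
    """A block is addressed by the dataset CID plus its index; no Merkle root involved."""
    dataset_cid: bytes
    index: int


@dataclass
class WantList:
    """The (CID(D), [index list]) pair from the text."""
    dataset_cid: bytes
    indices: list[int] = field(default_factory=list)

    def addresses(self) -> list[BlockAddress]:
        """Expand the compact form into one address per wanted block."""
        return [BlockAddress(self.dataset_cid, i) for i in self.indices]


def missing_blocks(dataset_cid: bytes, block_count: int, have: set[int]) -> WantList:
    """Want every block index in [0, block_count) that is not held locally."""
    return WantList(dataset_cid, [i for i in range(block_count) if i not in have])
```

Here `block_count` would come from the manifest, so the want list can be built as soon as step 1 completes.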