# Week9 Update

## TL;DR

1. Implemented the [gossipsub mcache](https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.0.md#message-cache) in zig-libp2p and opened a [PR](https://github.com/zen-eth/zig-libp2p/pull/57).
2. Integrated [multiaddress](https://docs.libp2p.io/concepts/fundamentals/peers/#peer-ids-in-multiaddrs) into zig-libp2p and opened a [PR](https://github.com/zen-eth/zig-libp2p/pull/61).
3. Migrated the [peer-id PR](https://github.com/blockblaz/peer-id/pull/3) to the blockblaz org for the zeam client.
4. Generated the gossipsub RPC protobuf codec; the [PR](https://github.com/zen-eth/zig-libp2p/pull/56) is merged.

## Gossip message cache

The message cache (or mcache) is a data structure that stores message IDs and their corresponding messages, segmented into "history windows." Each window corresponds to one heartbeat interval, and the windows are shifted during the heartbeat procedure, after gossip emission. The number of history windows to keep is determined by the `mcache_len` parameter, while the number of windows to examine when sending gossip is controlled by `mcache_gossip`.

### Architecture Overview:

```
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────────┐
│  Message Store  │    │ History Windows  │    │ Peer Transmissions  │
│                 │    │                  │    │                     │
│ msgID -> Message│    │ [W0][W1][W2][W3] │    │ msgID -> peerCounts │
│                 │    │                  │    │                     │
└─────────────────┘    └──────────────────┘    └─────────────────────┘
         │                       │                         │
         └───────────────────────┼─────────────────────────┘
                                 │
                          References only
```

The message store owns the messages. The history windows and the per-peer transmission counts hold only message IDs that refer back to it.

### Sliding Window Mechanism:

```
Time →
[Current]    [Recent]     [Older]      [Oldest]
Window 0     Window 1     Window 2     Window 3
    │            │            │            │
    │            │            │            └─ Messages aged out (deleted)
    │            │            └─ Old messages (not gossiped)
    │            └─ Recent messages (included in gossip)
    └─ New messages (included in gossip)

New message arrives:
  1. Add to msgs map
  2. Add reference to Window 0
  3. Window 0 is always current

shift() called (periodically):
  1. Create new empty Window 0
  2. Shift all windows right: W0→W1, W1→W2, W2→W3
  3. Delete oldest window (W3) and clean up its messages
```

### Message Lifecycle:

```
put(msg) → [W0] ──shift()──→ [W1] ──shift()──→ [W2] ──shift()──→ [deleted]
            │                 │                 │                    │
            │                 │                 │                    │
        fresh msg         recent msg         old msg            cleaned up
       (gossiped)        (gossiped)      (not gossiped)      (freed memory)
```

### Gossip Selection:

Only messages in the first `gossip` windows are included in gossip. For example, with gossip=2 and history_size=4:

```
[W0][W1] | [W2][W3]
 ↑────↑  |  ↑────↑
Gossiped | Not gossiped
```

## Interpreting multiaddrs

Multiaddrs are parsed from left to right, but they should be interpreted right to left. Each component of a multiaddr wraps all the components to its left in its own context. For example, the multiaddr `/dns4/example.com/tcp/1234/tls/ws/tls` (ignore the double encryption for now) is interpreted by taking the first `tls` component from the right and treating it as the libp2p security protocol to use for the connection, then passing the rest of the multiaddr to the websocket transport to create the websocket connection. The websocket transport sees `/dns4/example.com/tcp/1234/tls/ws/` and interprets the `tls` in this context to mean that this is going to be a secure websocket connection. The websocket transport also gets the host to dial, along with the TCP port, from the rest of the multiaddr.

Components to the right can also provide parameters to components to the left, since they are in charge of the rest of the multiaddr's interpretation. For example, in `/ip4/1.2.3.4/tcp/1234/tls/p2p/QmFoo` the `p2p` component carries the peer ID and passes it to the next component, in this case the `tls` security protocol, as the expected peer ID for this connection. Another example is `/ip4/.../p2p/QmR/p2p-circuit/p2p/QmA`: here `p2p/QmA` is passed to `p2p-circuit`, and the `p2p-circuit` component then knows it needs to use the rest of the multiaddr as the information to connect to the relay node.

This enables nesting and arbitrary parameters.
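
To make the right-to-left reading concrete, here is a minimal Python sketch. It is only an illustration, not the zig-libp2p or any other libp2p implementation: the `TAKES_VALUE` table is a tiny hand-picked subset of protocols, and the helper names `parse`, `peel`, and `to_string` are hypothetical. It also assumes values never contain `/`.

```python
from typing import List, Optional, Tuple

Component = Tuple[str, Optional[str]]

# Assumed toy protocol table: True means the protocol name is followed by a value.
# This is an illustrative subset, not the real multiaddr protocol registry.
TAKES_VALUE = {
    "ip4": True, "dns4": True, "tcp": True, "p2p": True,
    "tls": False, "ws": False, "p2p-circuit": False,
}


def parse(addr: str) -> List[Component]:
    """Parse a textual multiaddr left to right into (protocol, value) pairs."""
    parts = addr.strip("/").split("/")
    components: List[Component] = []
    i = 0
    while i < len(parts):
        name = parts[i]
        if name not in TAKES_VALUE:
            raise ValueError(f"unknown protocol: {name}")
        if TAKES_VALUE[name]:
            if i + 1 >= len(parts):
                raise ValueError(f"missing value for {name}")
            components.append((name, parts[i + 1]))
            i += 2
        else:
            components.append((name, None))
            i += 1
    return components


def peel(components: List[Component]) -> Tuple[Component, List[Component]]:
    """Split off the rightmost component, which is interpreted first,
    and return it together with the components it wraps."""
    return components[-1], components[:-1]


def to_string(components: List[Component]) -> str:
    """Render components back into the textual multiaddr form."""
    return "".join(f"/{n}" if v is None else f"/{n}/{v}" for n, v in components)
```

With this sketch, peeling `/dns4/example.com/tcp/1234/tls/ws/tls` yields the trailing `tls` component and leaves `/dns4/example.com/tcp/1234/tls/ws` for the websocket transport. Peeling `/ip4/1.2.3.4/tcp/1234/tls/p2p/QmFoo` yields `("p2p", "QmFoo")`, which the remaining `tls` component would receive as the expected peer ID.
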
A component can parse arbitrary data with some encoding and pass it as a parameter to the next component of the multiaddr. For example, we could reference a specific HTTP path by composing `path` and `percentencode` components along with an `http` component. This would look like `/dns4/example.com/http/GET/path/percentencode/somepath%2ftosomething`. The `percentencode` component parses the data and passes it as a parameter to `path`, which passes it as a named parameter (`path=somepath/tosomething`) to a `GET` request.

A user who does not like `percentencode` for their use case may prefer `lenprefixencode`, so that the multiaddr instead looks like `/dns4/example.com/http/GET/path/lenprefixencode/20_somepath/tosomething`. This would work the same way and require no changes to the `path` or `GET` components. It's important to note that the binary representation of the data in `percentencode` and `lenprefixencode` would be the same. The only difference is how it appears in the human-readable representation.

## Next

Start working on gossipsub stream management and peer management in zig-libp2p.
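
Looking back at the message cache section, the sliding-window behaviour can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the code in the zig-libp2p PR: `put` and `shift` follow the diagrams above, while the class name `MessageCache` and the `get` and `gossip_ids` helpers are hypothetical. The per-peer transmission counts are left out.

```python
from collections import deque


class MessageCache:
    """Toy mcache: messages live in one map, windows hold only message IDs."""

    def __init__(self, gossip: int, history_size: int):
        if not 0 < gossip <= history_size:
            raise ValueError("need 0 < gossip <= history_size")
        self.gossip = gossip
        self.msgs = {}  # msg_id -> (topic, payload)
        # history[0] is W0 (current); history[-1] is the oldest window.
        self.history = deque([] for _ in range(history_size))

    def put(self, msg_id: str, topic: str, payload: bytes) -> None:
        """Store a new message and reference it from the current window."""
        if msg_id in self.msgs:
            return  # assumption: duplicates are ignored
        self.msgs[msg_id] = (topic, payload)
        self.history[0].append(msg_id)

    def get(self, msg_id: str):
        """Return (topic, payload), or None once the message has aged out."""
        return self.msgs.get(msg_id)

    def gossip_ids(self, topic: str) -> list:
        """IDs for a topic found in the first `gossip` windows, newest window first."""
        recent = list(self.history)[: self.gossip]
        return [m for window in recent for m in window if self.msgs[m][0] == topic]

    def shift(self) -> None:
        """Drop the oldest window and its messages, then open a fresh W0."""
        for msg_id in self.history.pop():
            self.msgs.pop(msg_id, None)
        self.history.appendleft([])
```

With `gossip=2` and `history_size=4`, a message is returned by `gossip_ids` for the heartbeat in which it arrives and the one after, remains retrievable through `get` for two further shifts, and is deleted on the fourth `shift()`.
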