# XCMP Transport

Authors: Alistair, Jeff, Ximin

Sequel to https://hackmd.io/@rgbPIkIdTwSICPuAq67Jbw/HJBrUKhuI

We favor collators *fetching* each incoming message from the *backing and approval checkers for the parachain block that sent the message*. We'd ideally fetch by some *filtered* request mechanism, but maybe via *unfiltered* requests initially, and maybe always for small messages. We'll later explore push message options, but maybe only as an optimization, and maybe only for Aura/Sassafras parachains.

In this document, we briefly weigh alternatives to this approach, which hopefully explains our reasoning behind this balancing act.

We consider two XCMP message transport styles: directed, and reconstructed from erasure codes. We currently feel direct messaging provides sufficient security, but we could add reconstruction messaging later if the fancy strikes us.

## Directed

A directed transport for messages sounds simplest, provided it satisfies our security requirements. We have two big questions here: "from whom?" and "push vs pull?"

**I. From whom do collators obtain incoming messages?**

1. *Backing validators for the sending parachain block*

   Easy to implement since they are easy to identify, but we make zero security assumptions about their behavior.

   *Conclusion*: Acceptable MVP target, but poor availability assurances.

2. *Approval checkers for the sending parachain block*

   Requires the receiving collator to track the relay chain, which it does anyway, and to extract the sending block's approval checker identities, which requires fresh code. We already assume one honest approval checker during A&V, which helps, and approval checkers are more numerous regardless.

   *Conclusion*: Required for availability without message reconstruction, but omission from the MVP sounds okay.

3. *Any validators assigned to back the receiver (aka parachain validators)*

   Appears useful if we must reduce connections, but they are few in number, unlike approval checkers, and provide no availability assurances. We'd still need to transport messages to these parachain validators somehow, so the underlying problem remains, just with fewer connections.

   *Conclusion*: Wait and see; poor availability assurances; optimization only.

4. *Sending collators*

   We have even weaker assurances here than from any validator class, so these provide only a performance optimization at best.

   *Conclusion*: Wait and see; poorest availability assurances; optimization only.

**II. Do collators fetch messages from senders, or do senders push proactively?**

All push messaging appears incompatible with true parathreads. There are extremely important parathread cases that actually fit a multi-parachain model compatible with push, including grouped zk transactions or rollups and maybe system parathreads, but important reasons for doing true parathreads remain incompatible with push.

*Fetch*: We reduce bandwidth by requesting messages, but several options exist:

1. *Unfiltered fetch*

   Ignoring authentication works great for tiny messages, but an unfiltered aka unauthenticated fetch protocol invites spam.

   *Conclusion*: Implement first for the MVP.

2. *Filtered fetch*

   We could grant message queries only to collators with an upcoming block production slot, for which they provide a proof (see the sketch after the fetch options). We think filtered fetches resist spam because few nodes have upcoming block production slots, but this assumption requires trusting the parachain's authentication entrypoint. Yet actually verifying these proofs requires some secondary "BIOS" entrypoint, which ideally requires SPREE-like code verification.

   *Conclusion*: Implement when convenient; not required for the MVP.

3. *Paid fetch*

   Almost requires fast "layer two" payment designs, so too complex right now.

   *Conclusion*: Wait until appropriate payment channel code exists.

4. *PoW filtered fetch*

   An upcoming block producer on a PoW parachain discovers their slot with some pre-slot PoW, so they then request messages, and finally build the block. See also other PoW variants like https://forum.web3.foundation/t/caucus-subprotocol-of-fantomette/47

   *Conclusion*: Wait until pre-slot PoW is analyzed, so never unless someone asks Handan.
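As a rough illustration of II-2, the fetch request could carry an opaque slot-claim proof that the serving validator checks against the parachain's registered authentication entrypoint before answering. The sketch below is purely hypothetical: none of these types, nor the `verify_slot_claim` entrypoint, exist today, and the real verification would presumably run parachain-supplied, SPREE-verified code.

```rust
// Hypothetical sketch of a filtered-fetch request/response pair (II-2).
// All names and types are illustrative, not existing Polkadot APIs.

type ParaId = u32;
type RelayBlockNumber = u32;
type RelayHash = [u8; 32];
/// Opaque, encoded bundle of messages from one sending parachain block.
type MessageBundle = Vec<u8>;

/// Asks for everything addressed to `recipient` between its last processed
/// watermark and the stated relay parent, plus a parachain-specific proof
/// that the requester holds an upcoming block production slot.
struct FilteredFetchRequest {
    recipient: ParaId,
    watermark: RelayBlockNumber,
    relay_parent: RelayHash,
    /// Opaque slot claim, e.g. an Aura slot assignment or a Sassafras ticket.
    slot_claim: Vec<u8>,
}

enum FilteredFetchResponse {
    /// Message bundles for the requested span, one per sending parachain block.
    Messages(Vec<MessageBundle>),
    /// The slot claim failed verification, so the request is treated as spam.
    Unauthorized,
}

/// Stand-in for wherever a backing validator or approval checker keeps the
/// sending parachain's outgoing messages.
trait MessageStore {
    fn messages_for(
        &self,
        recipient: ParaId,
        since: RelayBlockNumber,
        until: RelayHash,
    ) -> Vec<MessageBundle>;
}

/// Stand-in for the secondary "BIOS" entrypoint discussed above; how it gets
/// registered, verified, and executed remains an open question.
fn verify_slot_claim(_recipient: ParaId, _claim: &[u8], _relay_parent: RelayHash) -> bool {
    unimplemented!("parachain-provided slot claim verification")
}

/// Serving side, run by a backing validator or approval checker.
fn serve_filtered_fetch<S: MessageStore>(
    req: FilteredFetchRequest,
    store: &S,
) -> FilteredFetchResponse {
    if !verify_slot_claim(req.recipient, &req.slot_claim, req.relay_parent) {
        return FilteredFetchResponse::Unauthorized;
    }
    FilteredFetchResponse::Messages(store.messages_for(
        req.recipient,
        req.watermark,
        req.relay_parent,
    ))
}
```

The only point of this shape is that the spam filter lives in parachain-supplied code, which is exactly why II-2 leans on SPREE-like verification; an unfiltered fetch (II-1) is the same protocol with the `slot_claim` check dropped.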
*Push*: Pushed messages avoid spam nicely, but they require senders to know the destinations, so push is perhaps only useful if the message gets sent by the sending collators (I-4) or the validators assigned to back the receiver (I-3).

5. *Gossip push*

   We can push into parachain gossip networks, if we know what this means. We're currently talking about reputation systems for this, but that sounds problematic. We should consider pushing to random parachain nodes instead.

   *Conclusion*: Wait at least until well defined; optimization only.

6. *Aura/Sassafras push*

   We can push semi-directly to upcoming Aura block producers, or encrypted to upcoming anonymous Sassafras block producers using Sassafras' ephemeral block-sealing keys. We keep pushing unread messages, however, to ensure parachain liveness and/or internal availability.

   *Conclusion*: Wait and see; optimization only.

**III. Should message metadata move differently?**

A collator requires metadata for almost all messages between its old watermark and its relay parent. We authenticate message metadata against the relay chain, but the metadata does not live there, so we must transport it too. We might include very small messages inside their metadata. We might keep unfiltered fetch for metadata permanently, but require another transport for large messages.

## Reconstruction

We could fetch messages by reconstructing them from erasure-coded data, but this requires connections to 1/3 of the validators, which maybe requires QUIC, or just falls over.

**IV. How do we reconstruct the message?**

1. *Reconstruct and rerun the sending parachain block*

   An absolute last resort, since extremely inefficient, although simplest. Avoid doing this unless we simply require some reconstruction-based fallback that we expect never to run.

2. *Partial erasure code reconstruction*

   Among reconstruction strategies this sounds least inefficient, but it costs us orthogonality between erasure coding and authentication, and it carries high code complexity. See https://github.com/w3f/research-internal/issues/533 and the sketch at the end of this note.

There are intermediate stages between IV-1 and IV-2, but we'd probably choose an extreme, depending upon whether we expect an extremely rare backup case or actual usage. Any MVP skips reconstruction messaging entirely.
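To make the connection-count concern in IV-2 concrete, here is a minimal, hypothetical sketch of a collator reconstructing a sending block's outgoing messages: it requests erasure chunks from validators until it holds roughly a third of them, then hands them to a decoder. All names, the `decode_messages` stub, and the exact recovery threshold are assumptions for illustration, not existing Polkadot interfaces.

```rust
// Hypothetical sketch of reconstruction-based message fetching (IV-2).
// Names, types, and the threshold formula are illustrative only.

type ValidatorIndex = u32;

/// One validator's erasure chunk of the encoded message bundle.
struct ErasureChunk {
    index: ValidatorIndex,
    data: Vec<u8>,
}

/// Stand-in for the networking layer that asks a validator for its chunk.
trait ChunkFetcher {
    fn fetch_chunk(&self, from: ValidatorIndex) -> Option<ErasureChunk>;
}

/// Smallest number of chunks assumed sufficient for decoding: roughly a
/// third of a non-empty validator set, in the usual n = 3f + 1 style.
fn recovery_threshold(n_validators: usize) -> usize {
    (n_validators - 1) / 3 + 1
}

/// Stand-in for the erasure decoder; the partial-reconstruction question
/// (decode only the messages, not the whole block) lives in here.
fn decode_messages(_chunks: &[ErasureChunk], _n_validators: usize) -> Result<Vec<u8>, ()> {
    unimplemented!("erasure decoding of the outgoing-message bundle")
}

/// Keep requesting chunks until decoding becomes possible, or we run out of
/// validators to ask. In practice requests would go out concurrently.
fn reconstruct_messages<F: ChunkFetcher>(
    fetcher: &F,
    n_validators: usize,
) -> Result<Vec<u8>, ()> {
    let mut chunks = Vec::new();
    for index in 0..n_validators as ValidatorIndex {
        if let Some(chunk) = fetcher.fetch_chunk(index) {
            chunks.push(chunk);
        }
        if chunks.len() >= recovery_threshold(n_validators) {
            return decode_messages(&chunks, n_validators);
        }
    }
    Err(()) // too few validators answered; fall back or give up
}
```

Whether `decode_messages` recovers only the message bundle (IV-2) or the whole parachain block for re-execution (IV-1) is exactly the trade-off described above; the fetching loop looks the same either way.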