# rust-libp2p-nym Transport Specification

## 1. Introduction and background

This document describes the Rust implementation of a Nym transport for libp2p. It allows libp2p nodes to configure themselves to use this transport and to send and receive messages through the mixnet.

The initial motivation for development was to provide network privacy for Ethereum consensus validators, which use libp2p as their p2p library. However, this transport can be used by any application that uses libp2p, bringing privacy to a wide variety of projects.

### 1.1 Nym

Nym is a mixnet (overlay network) that allows for anonymous communication. An individual using the mixnet has access to a Nym endpoint through which they send and receive messages. Communication through the mixnet takes the form of packets encrypted with Sphinx (although this detail is not necessary to know for this specification). Nym guarantees message delivery, although it does not guarantee ordering.

A libp2p node that wishes to communicate through the mixnet therefore also needs access to a Nym endpoint and methods for sending and receiving messages through it.

### 1.2 SURBs

SURBs (single-use reply blocks) are pre-addressed return packets implemented by Nym which allow an anonymous reply to a message. A sender can send a number of SURBs along with a message, which allow the recipient to send a message back to the sender without knowing the sender's Nym address.

A communication lifecycle built on top of Nym could look like:

- node A obtains the Nym address of node B and sends a message to node B along with X SURBs
- node B receives the message along with the SURBs. To respond, it uses 1 SURB to send a reply to node A. Node B remains unaware of node A's Nym address.
- this can continue until node B is nearly out of SURBs, at which point it needs to request more SURBs from node A. This is not implemented within Nym, and would need to be implemented at the application level.

## 2. Specification

To allow libp2p nodes to use Nym, a new transport type must be created that implements the libp2p `Transport` trait. The trait is as follows:

```rust
pub trait Transport {
    type Output;
    type Error: Error;
    type ListenerUpgrade: Future<Output = Result<Self::Output, Self::Error>>;
    type Dial: Future<Output = Result<Self::Output, Self::Error>>;

    fn listen_on(
        &mut self,
        addr: Multiaddr
    ) -> Result<ListenerId, TransportError<Self::Error>>;

    fn remove_listener(&mut self, id: ListenerId) -> bool;

    fn dial(
        &mut self,
        addr: Multiaddr
    ) -> Result<Self::Dial, TransportError<Self::Error>>;

    fn dial_as_listener(
        &mut self,
        addr: Multiaddr
    ) -> Result<Self::Dial, TransportError<Self::Error>>;

    fn poll(
        self: Pin<&mut Self>,
        cx: &mut Context<'_>
    ) -> Poll<TransportEvent<Self::ListenerUpgrade, Self::Error>>;

    fn address_translation(
        &self,
        listen: &Multiaddr,
        observed: &Multiaddr
    ) -> Option<Multiaddr>;
}
```

Since communication through the Nym mixnet is packet-based, like UDP/QUIC, and not stream-based like TCP, the implementation will more closely resemble an existing packet-based transport. The concept of a "connection" to a peer must be implemented inside the transport package, as there are no actual "connections" possible through the mixnet (the way there are with TCP, for instance).

### 2.1 Connections

Since there are no end-to-end "connections" in the mixnet, the module needs to implement an abstracted connection, which represents a way to send messages to or receive messages from a remote peer. There are both inbound and outbound connections; inbound connections are received via listening and outbound connections are created via dialing.

Each connection should be identified by a unique ID that is re-used for every message sent using that connection. This is required because of the anonymity of the mixnet; without an identifier, we would be unable to determine which connection a message belongs to.
The initiator of the connection ("dialer") picks a random 32-byte ID to send along with the initial message. All subsequent messages on this connection contain this ID.

The result of connection establishment is a channel over which messages can be sent to and received from the remote side.

> Note: Nym does not guarantee message ordering; for example, if I send message A, then message B to someone, they may receive message B first. This will need to be handled on the transport level. See [section 3](#3-Message-ordering) for discussion.

#### 2.1.1 Connection type

From [the libp2p docs](https://docs.rs/libp2p/latest/libp2p/trait.Transport.html#associatedtype.Output), the `Output` type is:

> The result of a connection setup process, including protocol upgrades.
>
> Typically the output contains at least a handle to a data stream (i.e. a connection or a substream multiplexer on top of a connection) that provides APIs for sending and receiving data through the connection.

In particular, the `Connection` type must implement [`StreamMuxer`](https://docs.rs/libp2p/0.34.0/libp2p/core/trait.StreamMuxer.html). This requirement is not stated explicitly in the `Transport` trait, but it is imposed higher up in the libp2p stack.

In libp2p, a muxer is a type that allows multiple, distinct data streams to exist simultaneously over the same connection. For example, we can have a stream with protocol `ping` and a stream with protocol `gossip` over the same connection. A stream is simply a readable and writable data stream.

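
To make the muxer idea concrete, the following is a minimal sketch of demultiplexing, assuming every payload arriving on the shared connection is tagged with a substream identifier. The names `Demux`, `route`, and `read_all`, and the use of a `u32` identifier, are hypothetical and chosen only for illustration; they are not part of libp2p or of this transport.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical substream identifier; a small integer keeps the sketch short.
type SubstreamId = u32;

/// Hypothetical demultiplexer: one inbound byte queue per substream,
/// all fed from the same underlying connection.
#[derive(Default)]
struct Demux {
    queues: HashMap<SubstreamId, VecDeque<u8>>,
}

impl Demux {
    /// Route a payload received on the shared connection to the queue
    /// of the substream it is tagged with, creating the queue if needed.
    fn route(&mut self, id: SubstreamId, payload: &[u8]) {
        self.queues
            .entry(id)
            .or_default()
            .extend(payload.iter().copied());
    }

    /// Drain everything currently buffered for one substream.
    /// Returns an empty vector for unknown or empty substreams.
    fn read_all(&mut self, id: SubstreamId) -> Vec<u8> {
        self.queues
            .get_mut(&id)
            .map(|queue| queue.drain(..).collect())
            .unwrap_or_default()
    }
}
```

In this sketch, interleaved payloads for substreams 1 (`ping`) and 2 (`gossip`) end up in separate queues, so reading one substream never returns bytes that belong to the other.
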

The `StreamMuxer` trait is as follows:

```rust
pub trait StreamMuxer {
    type Substream: AsyncRead + AsyncWrite;
    type Error: Error;

    fn poll_inbound(
        self: Pin<&mut Self>,
        cx: &mut Context<'_>
    ) -> Poll<Result<Self::Substream, Self::Error>>;

    fn poll_outbound(
        self: Pin<&mut Self>,
        cx: &mut Context<'_>
    ) -> Poll<Result<Self::Substream, Self::Error>>;

    fn poll_close(
        self: Pin<&mut Self>,
        cx: &mut Context<'_>
    ) -> Poll<Result<(), Self::Error>>;

    fn poll(
        self: Pin<&mut Self>,
        cx: &mut Context<'_>
    ) -> Poll<Result<StreamMuxerEvent, Self::Error>>;
}
```

This trait allows for opening new substreams over the connection and for closing it.

- `poll_inbound` returns any new inbound substreams
- `poll_outbound` creates and returns a new substream
- `poll_close` closes the muxer, and with it the connection and all of its substreams
- `poll` drives the underlying connection so that it can make progress

The `Connection`'s `Substream` type should implement [`futures::AsyncRead`](https://docs.rs/futures/latest/futures/io/trait.AsyncRead.html) and [`futures::AsyncWrite`](https://docs.rs/futures/latest/futures/io/trait.AsyncWrite.html). This allows the application to read from, write to, and close the substream.

`StreamMuxer` can be implemented by giving each substream a distinct ID and prefixing every substream message with this ID.

#### 2.1.2 Bi-directionality

Connections are bi-directional; thus, both the dialer and the dial recipient need a way to message the remote side. The dialer can do so because it knows the Nym address of the recipient; the recipient, however, does not know the Nym address of the dialer. There are two methods that can be used to overcome this:

- the dialer simply sends its Nym address along with its initial message establishing the connection;
- the dialer sends some number of reply SURBs along with its initial message, and the recipient is able to request more SURBs if it believes it requires more.

The first method is simpler to implement and arguably better traffic-wise: with the second, the dialer might end up sending more SURBs than necessary, or the recipient might need to request extra SURBs (which adds another round-trip).

### 2.2 Message types

Messages within the transport are one of three variants:

```rust
enum Message {
    ConnectionRequest(ConnectionMessage),
    ConnectionResponse(ConnectionMessage),
    TransportMessage(TransportMessage),
}
```

When a peer dials us, or we dial a peer, the first message sent to establish the connection is a `ConnectionRequest`. This message contains the connection ID, as well as the dialer's Nym address (if SURBs are not used).

```rust
struct ConnectionMessage {
    /// 32-byte randomly generated ID
    id: ConnectionId,
    /// nym address of the dialer
    sender: Option<nym_sphinx::addressing::clients::Recipient>,
    /// libp2p peer ID
    peer_id: PeerId,
}
```

To accept the `ConnectionRequest`, the remote side sends back a `ConnectionResponse`, which also contains the connection ID. Note that the same message type is used for both `ConnectionRequest` and `ConnectionResponse`; the only difference is that `sender` is set for the former and not for the latter.

After the connection is established, further messages are sent by both sides using the `TransportMessage` type. The ID must match the ID sent in the `ConnectionRequest`/`ConnectionResponse`.

```rust
pub struct TransportMessage {
    /// increments by 1 for every TransportMessage sent over a connection.
    /// required for ordering, since Nym does not guarantee ordering.
    /// ConnectionMessages do not need nonces, as we know that they will
    /// be the first messages sent over a connection.
    /// the first TransportMessage sent over a connection will have nonce 1.
    nonce: u64,
    id: ConnectionId,
    message: SubstreamMessage,
}
```

Every message sent over an established connection must either initiate a substream, close a substream, or send data over a substream.

`SubstreamMessageType` is then as follows:

```rust
enum SubstreamMessageType {
    OpenRequest,
    OpenResponse,
    Close,
    Data(Vec<u8>),
}
```

`SubstreamMessage`s contain the following:

```rust
pub struct SubstreamMessage {
    substream_id: SubstreamId,
    message_type: SubstreamMessageType,
}
```

### 2.3 Listening

```rust
fn listen_on(
    &mut self,
    addr: Multiaddr
) -> Result<ListenerId, TransportError<Self::Error>>;

fn remove_listener(&mut self, id: ListenerId) -> bool;
```

Upon transport start-up, the node should obtain its Nym address and listen on this address, emitting a `TransportEvent::NewAddress`. The transport will only have one listener for now, as it can only listen on one Nym endpoint.

Inbound connection requests and messages will be handled by polling the Nym endpoint.

#### 2.3.1 Multiaddress

Libp2p uses the multiaddress format for addressing nodes. For the Nym transport, multiaddresses will be of the form `/nym/<nym-address>`.

### 2.4 Dialing

```rust
fn dial(
    &mut self,
    addr: Multiaddr
) -> Result<Self::Dial, TransportError<Self::Error>>;
```

To dial a node, we first parse the `Multiaddr` for the node's Nym address. We then generate a new random 32-byte connection ID (see section 2.1) and send a `ConnectionRequest` message to the remote node through the mixnet.

The `Dial` type is a future that resolves to a `Connection`.

### 2.5 Polling

The `poll` function should return `TransportEvent`s when necessary. From the docs:

```rust
fn poll(
    self: Pin<&mut Self>,
    cx: &mut Context<'_>
) -> Poll<TransportEvent<Self::ListenerUpgrade, Self::Error>>
```

> Poll for TransportEvents.
>
> A TransportEvent::Incoming should be produced whenever a connection is received at the lowest level of the transport stack. The item must be a ListenerUpgrade future that resolves to an Output value once all protocol upgrades have been applied.
>
> Transports are expected to produce TransportEvent::Incoming events only for listen addresses which have previously been announced via a TransportEvent::NewAddress event and which have not been invalidated by an TransportEvent::AddressExpired event yet.

The various `TransportEvent`s are listed [here](https://docs.rs/libp2p/latest/libp2p/core/transport/enum.TransportEvent.html). The most relevant are `Incoming` and `NewAddress`. `NewAddress` will be emitted once on start-up, when we determine our Nym address (and thus our multiaddress). `Incoming` will be emitted for each new inbound connection request.

## 3. Message ordering

As noted above, the Nym mixnet does not guarantee message ordering. However, the transport needs to know the order in which messages were sent; otherwise, data read from a substream may not be in the order intended by the sender, and would then be corrupted (and invalid from the application's point of view).

This can be implemented by putting a nonce in each `TransportMessage`, which starts at 1 and increments by 1 each time a `TransportMessage` is sent. A recipient can then process the messages in order by nonce.

If the remote side does not follow this protocol, and multiple messages are sent with the same nonce or a nonce is skipped for a long period of time, the node should probably drop the connection.

## 4. Message encoding

The messages specified in section 2.2 must all be encoded and placed inside a Nym packet before being sent over the mixnet. Every message has at most one variable-length field, and all of its other fields are of fixed length, so the messages can be encoded by simply concatenating the fields, with the variable-length field last.

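
The concatenation scheme described above can be sketched as a round trip. This is a minimal illustration, not the transport's actual code: `Frame`, `HEADER_LEN`, and in particular the decoder `from_bytes` are hypothetical names introduced here, and the layout (an 8-byte big-endian nonce, a 32-byte ID, then a variable-length payload) is only assumed to mirror the fixed-then-variable shape of the messages in section 2.2.

```rust
/// Hypothetical stand-in for a message with two fixed-length fields
/// followed by a single variable-length field.
#[derive(Debug, PartialEq)]
struct Frame {
    nonce: u64,       // fixed length: 8 bytes, big-endian
    id: [u8; 32],     // fixed length: 32 bytes
    payload: Vec<u8>, // variable length: everything that remains
}

/// Total size of the fixed-length prefix.
const HEADER_LEN: usize = 8 + 32;

impl Frame {
    /// Encode by concatenating the fields, variable-length field last.
    fn to_bytes(&self) -> Vec<u8> {
        let mut bytes = Vec::with_capacity(HEADER_LEN + self.payload.len());
        bytes.extend_from_slice(&self.nonce.to_be_bytes());
        bytes.extend_from_slice(&self.id);
        bytes.extend_from_slice(&self.payload);
        bytes
    }

    /// Hypothetical decoder: split at the known offsets and treat the
    /// remainder as the payload. Returns `None` if the input is shorter
    /// than the fixed-length prefix.
    fn from_bytes(bytes: &[u8]) -> Option<Self> {
        if bytes.len() < HEADER_LEN {
            return None;
        }
        let mut nonce = [0u8; 8];
        nonce.copy_from_slice(&bytes[..8]);
        let mut id = [0u8; 32];
        id.copy_from_slice(&bytes[8..HEADER_LEN]);
        Some(Frame {
            nonce: u64::from_be_bytes(nonce),
            id,
            payload: bytes[HEADER_LEN..].to_vec(),
        })
    }
}
```

Because only the last field varies in length, the decoder needs no length prefixes or delimiters: the fixed offsets locate every other field, and the payload is whatever follows them.
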

### 4.1 `Message` type

```rust
enum Message {
    ConnectionRequest(ConnectionMessage),
    ConnectionResponse(ConnectionMessage),
    TransportMessage(TransportMessage),
}
```

Encoding:

```rust
impl Message {
    pub(crate) fn to_bytes(&self) -> Vec<u8> {
        // the variant index comes first, followed by the encoded inner message
        let (tag, mut body) = match self {
            Message::ConnectionRequest(msg) => (0u8, msg.to_bytes()),
            Message::ConnectionResponse(msg) => (1u8, msg.to_bytes()),
            Message::TransportMessage(msg) => (2u8, msg.to_bytes()),
        };
        let mut bytes = vec![tag];
        bytes.append(&mut body);
        bytes
    }
}
```

The first byte is the variant index of the `Message` (0, 1, or 2), and the rest of the encoding is the encoded inner message.

### 4.2 `ConnectionMessage` type

```rust
struct ConnectionMessage {
    /// 32-byte randomly generated ID
    id: ConnectionId,
    /// nym address of the dialer
    sender: Option<nym_sphinx::addressing::clients::Recipient>,
    /// libp2p peer ID
    peer_id: PeerId,
}
```

Encoding:

```rust
impl ConnectionMessage {
    fn to_bytes(&self) -> Vec<u8> {
        // fixed-length connection ID (32 bytes)
        let mut bytes = self.id.0.to_vec();
        // optional sender: 0 if absent, 1 followed by the 96-byte address if present
        match &self.sender {
            Some(sender) => {
                bytes.push(1u8);
                bytes.extend_from_slice(&sender.to_bytes());
            }
            None => bytes.push(0u8),
        }
        // variable-length peer ID goes last
        bytes.extend_from_slice(&self.peer_id.to_bytes());
        bytes
    }
}
```

The `id` field has a known length of 32. The `sender` field is encoded as `0` if `None`, and as `1 || sender.to_bytes()` otherwise; `Recipient` has a known length of 96. The `peer_id` field has a variable length and thus is appended last.

### 4.3 `TransportMessage` type

```rust
pub struct TransportMessage {
    /// increments by 1 for every TransportMessage sent over a connection.
    /// required for ordering, since Nym does not guarantee ordering.
    /// ConnectionMessages do not need nonces, as we know that they will
    /// be the first messages sent over a connection.
    /// the first TransportMessage sent over a connection will have nonce 1.
    nonce: u64,
    id: ConnectionId,
    message: SubstreamMessage,
}
```

Encoding:

```rust
impl TransportMessage {
    fn to_bytes(&self) -> Vec<u8> {
        // 8-byte big-endian nonce, then the 32-byte connection ID
        let mut bytes = self.nonce.to_be_bytes().to_vec();
        bytes.extend_from_slice(self.id.0.as_ref());
        // variable-length substream message goes last
        bytes.extend_from_slice(&self.message.to_bytes());
        bytes
    }
}
```

The `nonce` field has a known length of 8. The `id` field has a known length of 32. The `message` field has a variable length and thus is appended last.

### 4.4 `SubstreamMessage` type

```rust
pub struct SubstreamMessage {
    substream_id: SubstreamId,
    message_type: SubstreamMessageType,
}
```

Encoding:

```rust
impl SubstreamMessageType {
    /// enum variant index, used as the one-byte type tag
    fn to_u8(&self) -> u8 {
        match self {
            SubstreamMessageType::OpenRequest => 0,
            SubstreamMessageType::OpenResponse => 1,
            SubstreamMessageType::Close => 2,
            SubstreamMessageType::Data(_) => 3,
        }
    }
}

impl SubstreamMessage {
    pub(crate) fn to_bytes(&self) -> Vec<u8> {
        // fixed-length substream ID, then the one-byte type tag
        let mut bytes = self.substream_id.0.to_vec();
        bytes.push(self.message_type.to_u8());
        // only `Data` carries a payload, which goes last
        if let SubstreamMessageType::Data(message) = &self.message_type {
            bytes.extend_from_slice(message);
        }
        bytes
    }
}
```

The `substream_id` field has a known length of 32. The `message_type` field has a known length of 1, and corresponds to the enum variant index of `SubstreamMessageType`. If the variant is `SubstreamMessageType::Data`, then its `Vec<u8>` data is appended to the end.

## 5. References and resources

- libp2p Transport trait: https://docs.rs/libp2p-core/latest/libp2p_core/transport/trait.Transport.html
- Nym websockets endpoint documentation: https://nymtech.net/docs/clients/websocket-client.html
- Nym websockets endpoint example: https://github.com/nymtech/nym/blob/develop/clients/native/examples/websocket_binarysend.rs
- Nym SURB [message](https://github.com/nymtech/nym/blob/fd1fb7ca7bdc5b71cfdf3fbeae2368f5ec1a9e91/clients/native/websocket-requests/src/text.rs#L25) and [response](https://github.com/nymtech/nym/blob/fede9cc194058596a1fe91af546260b583a39ef7/common/nymsphinx/src/receiver.rs#L23) (`sender_tag` is used for SURBs)