
# IP Proxying over libp2p

## QUIC DATAGRAMs, CONNECT-UDP and CONNECT-IP

### QUIC Datagrams

RFC 9221 specifies a new QUIC frame type, the DATAGRAM frame. DATAGRAM frames are sent in QUIC packets, and are therefore end-to-end encrypted, authenticated and congestion controlled. When lost, a DATAGRAM frame is not retransmitted, which makes DATAGRAM frames suitable for transporting unreliable data.

### HTTP Datagrams

RFC 9297 defines how DATAGRAMs can be used in HTTP/3. HTTP DATAGRAMs are associated with a particular HTTP request. In a nutshell, HTTP DATAGRAMs are sent in QUIC DATAGRAM frames, prefixed with a single varint that encodes the QUIC stream ID of the HTTP request. This allows the creation of multiple datagram flows within the same QUIC connection.

### Proxying UDP in HTTP

RFC 9298 builds on HTTP DATAGRAMs to define how UDP packets can be proxied in HTTP: the client sends a so-called Extended CONNECT request to the proxy on a QUIC stream, asking the proxy to open a UDP socket and proxy a flow of UDP packets to a target server. Multiple UDP flows to different target servers can be proxied in the same QUIC connection.

### Proxying IP in HTTP

In a similar fashion, RFC 9484 defines how IP packets can be proxied in HTTP. In addition to the logic needed for proxying UDP packets, IP proxying also comes with options to advertise IP routes over HTTP.

## Nunet IP Proxying

The goal is to create a VPN that allows proxying of IP packets between different nodes in the network. These nodes could be located in home / corporate networks, and might therefore sit behind NATs / firewalls.

## Option 1: Add an unreliable datagram API to libp2p

libp2p provides a stream-multiplexing abstraction over different network transports (TCP, QUIC, WebSocket, WebTransport, WebRTC). It also provides hole punching capabilities (for TCP, QUIC and WebRTC), allowing the traversal of (well-behaved) NATs.

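For reference when weighing per-datagram demultiplexing overhead, the HTTP DATAGRAM prefix described above can be sketched in Go. This is a minimal illustration, not code from any existing stack: the function names `appendVarint` and `frameHTTPDatagram` are made up for this example. It assumes the variable-length integer encoding from RFC 9000 (Section 16) and the RFC 9297 detail that the prefix carries the request's stream ID divided by four (the "Quarter Stream ID"), since client-initiated bidirectional stream IDs are always multiples of four.

```go
package main

import (
	"errors"
	"fmt"
)

// maxVarint is the largest value a QUIC varint can carry (2^62 - 1).
const maxVarint = 1<<62 - 1

// appendVarint appends v to b using the QUIC variable-length integer
// encoding: the two most significant bits of the first byte select a
// total length of 1, 2, 4 or 8 bytes.
func appendVarint(b []byte, v uint64) ([]byte, error) {
	switch {
	case v < 1<<6:
		return append(b, byte(v)), nil
	case v < 1<<14:
		return append(b, byte(v>>8)|0x40, byte(v)), nil
	case v < 1<<30:
		return append(b, byte(v>>24)|0x80, byte(v>>16), byte(v>>8), byte(v)), nil
	case v <= maxVarint:
		return append(b, byte(v>>56)|0xc0, byte(v>>48), byte(v>>40), byte(v>>32),
			byte(v>>24), byte(v>>16), byte(v>>8), byte(v)), nil
	default:
		return nil, errors.New("value too large for a QUIC varint")
	}
}

// frameHTTPDatagram (hypothetical helper) builds the payload of a QUIC
// DATAGRAM frame carrying an HTTP DATAGRAM: a varint prefix identifying
// the request stream, followed by the application payload.
func frameHTTPDatagram(streamID uint64, payload []byte) ([]byte, error) {
	if streamID%4 != 0 {
		return nil, errors.New("not a client-initiated bidirectional stream ID")
	}
	b, err := appendVarint(make([]byte, 0, 8+len(payload)), streamID/4)
	if err != nil {
		return nil, err
	}
	return append(b, payload...), nil
}

func main() {
	payload := []byte("ping")
	for _, id := range []uint64{0, 4, 252, 256} {
		d, err := frameHTTPDatagram(id, payload)
		if err != nil {
			panic(err)
		}
		fmt.Printf("stream %d: %d prefix byte(s), datagram % x\n", id, len(d)-len(payload), d)
	}
}
```

The prefix stays at one byte for the first 64 request streams of a connection and grows to two bytes after that, which is why this kind of demultiplexing costs very little of the datagram MTU.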
libp2p also defines a way to run an HTTP server on top of the libp2p abstraction of a stream-multiplexed connection.

However, libp2p currently does not provide any unreliable / datagram abstraction. In the past, efforts to define such an abstraction have stalled for a number of reasons:

1. It was unclear how payloads exceeding the available MTU should be handled, e.g. should they be
   1. dropped / rejected,
   2. split up into multiple datagrams and reassembled at the receiver, or
   3. transparently sent over a reliable stream as a fallback?
2. A libp2p application would want to be able to multiplex datagram flows on the same connection, similar to what HTTP DATAGRAMs achieve. However, this further decreases the MTU, and much more severely than HTTP DATAGRAMs do (a QUIC varint typically consumes 1-2 bytes, whereas a libp2p multistream protocol ID can consume anywhere from 15 to 40 bytes).

After defining a libp2p datagram abstraction, we could rip RFC 9484 apart, and:

1. Run the Extended CONNECT request-response exchange on a libp2p stream (which negotiates libp2p+HTTP).
2. Send HTTP DATAGRAMs over libp2p datagrams.

### Pros

* Adds a general-purpose API to libp2p that could potentially be used by other protocols.
* Allows multiplexing of proxied IP flows with other libp2p data exchanged on the same QUIC connection.

### Cons

* Potentially lengthy libp2p specification process.
* We wouldn't run vanilla RFC 9484. Instead, we'd be running it on top of libp2p+HTTP streams and libp2p datagrams.
* This creates interoperability problems: it is no longer possible to use a vanilla RFC 9484 stack (in any programming language). Instead, such a stack needs to be ripped apart to feed in a libp2p connection.
* It reduces the MTU available in DATAGRAM frames twice: once for the libp2p demultiplexing layer, and again for the HTTP DATAGRAM demultiplexing.
* Lots of implementation work in libp2p land, with added complexity due to libp2p's connection abstraction (what do we do when running on top of TCP?).
* The TCP fallback of this solution looks completely different from the one in RFC 9484 (which uses DATAGRAM capsules on top of HTTP/2).

## Option 2: Running vanilla RFC 9484

Instead of ripping RFC 9484 apart and running it on top of the respective libp2p abstractions, we could instruct libp2p to obtain a vanilla RFC 9000 QUIC connection between two nodes, potentially making use of libp2p's hole punching capabilities.

To do so, we could extend libp2p's hole punching API to (optionally) return a plain TCP / QUIC connection to a remote multiaddress. The application is then free to use this QUIC connection in any way it likes, in our case by setting up an HTTP connection so that we can send HTTP DATAGRAMs. This allows arbitrary services to run behind a NAT.

For plain connections, apart from coordinating the hole punching, libp2p would get out of the way completely: the application gets to set the `tls.Config` and the `quic.Config`, and is responsible for managing the lifecycle of the connection.

### Pros

* Uses unmodified RFC 9484. It is reasonable to assume that any RFC 9484 implementation will have an API to consume QUIC connections.
* Limited changes to the libp2p specification needed: we only need to coordinate the TLS ALPN for the hole-punched connection.
* Limited code changes to the libp2p stack needed: we only need a way to request a plain TCP / QUIC connection to a peer.

### Cons

* Peers might end up with two connections between each other: one libp2p connection, and one QUIC connection for carrying proxied traffic.
* There are no circumstances where more than two connections would be necessary: multiple flows can be proxied over a single QUIC connection.
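
To make the division of responsibilities in Option 2 more concrete, here is a minimal Go sketch of what the application might hand to libp2p when asking for a plain connection. Everything libp2p-related in it is an assumption: `PlainConnRequest` and `newPlainConnRequest` are hypothetical names, no such API exists in libp2p today, and the multiaddress is a placeholder using a documentation IP address. The sketch uses only the standard library's `crypto/tls` and picks `h3` (the ALPN token of HTTP/3) as the ALPN to coordinate; which ALPN the specification would actually settle on is an open question.

```go
package main

import (
	"crypto/tls"
	"errors"
	"fmt"
)

// PlainConnRequest is a HYPOTHETICAL request type: it describes what an
// application could pass to an extended hole punching API in order to
// get back a plain (non-libp2p) connection to a peer.
type PlainConnRequest struct {
	RemoteAddr string      // multiaddress of the peer to hole punch to
	TLS        *tls.Config // owned by the application, not by libp2p
}

// newPlainConnRequest (hypothetical helper) builds a request whose TLS
// configuration advertises exactly one ALPN, the value both peers have
// to agree on for the hole-punched connection.
func newPlainConnRequest(remoteAddr, alpn string) (*PlainConnRequest, error) {
	if remoteAddr == "" || alpn == "" {
		return nil, errors.New("remote address and ALPN are required")
	}
	return &PlainConnRequest{
		RemoteAddr: remoteAddr,
		TLS: &tls.Config{
			MinVersion: tls.VersionTLS13, // QUIC requires TLS 1.3
			NextProtos: []string{alpn},
		},
	}, nil
}

func main() {
	// Placeholder multiaddress; a real one would identify the remote peer.
	req, err := newPlainConnRequest("/ip4/192.0.2.1/udp/4001/quic-v1", "h3")
	if err != nil {
		panic(err)
	}
	fmt.Printf("requesting plain connection to %s with ALPN %v\n", req.RemoteAddr, req.TLS.NextProtos)
}
```

Once libp2p has completed the hole punch, the resulting QUIC connection would be passed to an unmodified HTTP/3 and RFC 9484 stack, and the application would remain responsible for closing it.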