# Multipath TCP Adoption In-The-Wild
> This blog post is a summary of our research article ["From Single Lane to Highways: Analyzing the Adoption of Multipath TCP in the Internet"](http://dl.ifip.org/db/conf/networking/networking2021/1570699492.pdf) published in [IFIP Networking 2021](https://networking.ifip.org/2021/). In case you are interested in learning more about our work, we encourage you to read our full paper and download our ongoing scans at [mptcp.io](https://mptcp.io/).
Multipath TCP (MPTCP) has been in development since 2013 ([RFC 6824](https://datatracker.ietf.org/doc/html/rfc6824)) and has seen significant interest from researchers and industry alike. Several organizations have publicly announced that they use MPTCP in their products and services. For example, Apple uses MPTCP in its iOS devices (e.g. Siri, Music, Maps, Wi-Fi Assist) [^AppSuport] and also allows third-party developers to use the protocol in non-system applications. Korea Telecom uses MPTCP to provide Gigabit speeds over Wi-Fi and LTE to its customers [^KTSupport]. MPTCPv1 was standardized in early 2020 [[RFC 8684]](https://datatracker.ietf.org/doc/html/rfc8684.html) and is available in Linux kernels v5.6 and later [^KerMP].
However, despite significant interest, the current state of MPTCP deployment in the Internet remains [largely unknown](http://blog.multipath-tcp.org/blog/html/2018/12/19/which_servers_use_multipath_tcp.html). The last attempted study dates back to 2015 [^Meh2015] and was later found to include many false positives due to middleboxes that echoed MPTCP options in TCP header extensions. In this work, we extend the previous study's methodology and provide the most accurate picture of *true* MPTCP deployment to date. Specifically, our study scans for MPTCP support over the two most popular services in the Internet, HTTP and HTTPS, over both IPv4 and IPv6.
## Scanning for MPTCP Options
To understand our measurement methodology, it is first essential to understand how the MPTCP protocol operates. MPTCP extends regular TCP packets and uses optional TCP header fields for signaling (see [RFC 6824](https://datatracker.ietf.org/doc/html/rfc6824#section-3)). The most notable is the `MP_CAPABLE` option, whose `MP-CAPABLE` flag signals that the host supports MPTCP. The option also carries a random 64-bit sequence, the *sender's key*, which the hosts use to authenticate themselves.

*MPTCPv0 Key exchange at connection establishment*
The figure above shows the [MPTCPv0 connection establishment](https://datatracker.ietf.org/doc/html/rfc6824#section-2.1) procedure between two hosts, *Bob* and *Alice*. The procedure mimics TCP's [three-way handshake](https://datatracker.ietf.org/doc/html/rfc793#section-3.4). Bob initiates the connection by sending a `SYN` packet containing the `MP-CAPABLE` flag and his own sender's key. If Alice also supports MPTCP, she replies with a `SYN-ACK` that contains the `MP-CAPABLE` flag and her own sender's key. Bob finally establishes the connection by sending an `ACK` with both keys and the `MP_CAPABLE` option. If any of the MPTCP options are dropped during the exchange (e.g. Alice does not support MPTCP), the connection falls back to regular TCP.
MPTCP's handshake mechanism can be leveraged to determine MPTCP support in the Internet. That is, one can craft an MPTCP `SYN` whose `MP_CAPABLE` option carries a 64-bit sequence as the key and send it to any publicly accessible host. If the targeted host responds with a `SYN-ACK` that also includes the `MP_CAPABLE` option, it likely supports MPTCP. Note that this is the same methodology used by a previous MPTCP adoption study in 2015 [^Meh2015].
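
To make the probe concrete, the following is a minimal sketch of the `MP_CAPABLE` option bytes carried in an MPTCPv0 `SYN`, following the layout in RFC 6824 (option kind 30, length 12, subtype 0, a flags byte, and the 64-bit key). It is not the ZMap probe module used in the study. The function names and the value of `PROBE_KEY` are assumptions chosen for illustration.

```python
import struct

MPTCP_KIND = 30            # TCP option kind assigned to MPTCP (RFC 6824)
MP_CAPABLE_SUBTYPE = 0x0   # MPTCP subtype of the MP_CAPABLE option
FLAG_H_HMAC_SHA1 = 0x01    # "H" flag: HMAC-SHA1 is used for authentication

# Hypothetical static probe key; the key used in the actual scans is not given here.
PROBE_KEY = 0x1111111111111111


def build_mp_capable_syn(sender_key: int, version: int = 0,
                         flags: int = FLAG_H_HMAC_SHA1) -> bytes:
    """Encode MP_CAPABLE as sent in a v0 SYN: kind, length, subtype|version, flags, key."""
    subtype_version = (MP_CAPABLE_SUBTYPE << 4) | (version & 0x0F)
    return struct.pack("!BBBBQ", MPTCP_KIND, 12, subtype_version, flags, sender_key)


def parse_mp_capable(option: bytes):
    """Return the 64-bit sender's key if `option` is a 12-byte MP_CAPABLE, else None."""
    if len(option) != 12:
        return None
    kind, length, subtype_version, _flags, key = struct.unpack("!BBBBQ", option)
    if kind != MPTCP_KIND or length != 12:
        return None
    if (subtype_version >> 4) != MP_CAPABLE_SUBTYPE:
        return None
    return key


probe_option = build_mp_capable_syn(PROBE_KEY)
print(probe_option.hex())  # 1e0c00011111111111111111
```

A scanner would attach `probe_option` to the TCP options of its `SYN` and run `parse_mp_capable` over the options of each `SYN-ACK` it receives.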

*Scanning for MPTCP support using ZMap*
To identify MPTCP hosts using this method, we use [ZMap](https://zmap.io/) to probe the entire IPv4 address space over port 80 (≈74M unique responsive IPs) and port 443 (≈52M unique responsive IPs). For IPv6, the address space is too large to scan exhaustively, so we use the [IPv6 Hitlist Service](https://ipv6hitlist.github.io/) to probe port 80 (≈746k responsive IPs) and port 443 (≈544k responsive IPs). Our study was conducted between July 2020 and December 2020. However, our scans are still ongoing and our latest results are available at [mptcp.io](https://mptcp.io/).

*MPTCP Adoption in IPv4 for HTTP and HTTPS statistics*
The table shows the MPTCP support in IPv4 over port 80 and 443. From almost 60M responsive targets on port 80 and 50M on port 443, about 200k addresses responded with the `MP-CAPABLE` flag in their SYN-ACK (*potential MPTCP*). It can also be observed that a large percentage of potential MPTCP hosts are inconsistently active across our six-month scanning period -- hinting at the existence of transient hosts. In IPv6, we find a very small number of addresses responding with the `MP_CAPABLE` option: 43 on TCP/80 and 165 on TCP/443. Similar to IPv4, we see more `MP-CAPABLE` addresses on port 443 (0.03%) compared to port 80 (0.005%).
## Middleboxes vs MPTCP
At first glance, it might seem that MPTCP is extensively supported in the Internet, as we receive the `MP-CAPABLE` flag from many hosts (more on port 443 than on port 80). However, we also find that only a small fraction of the hosts (≈4%) send a sender's key in the `SYN-ACK` that differs from the static sender's key in our original `SYN`; the rest return our own key. This behavior is unexpected for MPTCP, since for each legitimate MPTCP-enabled host, the "key MUST be hard to guess, and it MUST be unique for the sending host at any one time." [[RFC6824]](https://datatracker.ietf.org/doc/html/rfc6824#section-3.1). To understand the root cause of this behavior, we plot the Hamming weight distribution of the sender's key from all hosts that responded with the `MP_CAPABLE` option in their `SYN-ACK`.

Since the key is a random 64-bit sequence, its Hamming weight (the number of 1-bits) is a sum of 64 independent random bits, and such a sum tends toward a normal distribution [[central limit theorem]](https://en.wikipedia.org/wiki/Central_limit_theorem). The plot above shows the Hamming weight of sender's keys received from *potential MPTCP* IPv6 hosts on port 443. Note the outlier at Hamming weight 16, which does not follow the distribution. We find this outlier in all our scans for both port 80 and 443 (more prevalent in IPv4 than IPv6). Interestingly, the outlier's Hamming weight *exactly* matches the weight of the static key we send in our `SYN` probe to the hosts!
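
The reasoning behind the plot can be sketched in a few lines. Assuming each key bit is an independent fair coin flip, the Hamming weight of a 64-bit key follows a Binomial(64, 1/2) distribution, so a weight of 16 should be rare among genuinely random keys. The key below is a hypothetical stand-in with weight 16, not the key used in the scans.

```python
from math import comb

KEY_BITS = 64

# Hypothetical static probe key with Hamming weight 16 (one set bit per nibble).
PROBE_KEY = 0x1111111111111111


def hamming_weight(key: int) -> int:
    """Count the 1-bits in the low 64 bits of `key`."""
    return bin(key & (2**KEY_BITS - 1)).count("1")


def weight_probability(weight: int) -> float:
    """Probability that a uniformly random 64-bit key has the given Hamming weight."""
    return comb(KEY_BITS, weight) / 2**KEY_BITS


print(hamming_weight(PROBE_KEY))   # 16
print(weight_probability(32))      # most likely weight, roughly 0.1
print(weight_probability(16))      # far out in the tail, below 1e-4
```

Under this model fewer than one random key in ten thousand has weight 16, so a visible spike at that weight points to keys that were copied rather than generated.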
The likely cause is the prevalence of middleboxes in the Internet that do not support the TCP header extensions and therefore handle such packets in unconventional ways [^Hes2013]. Many middleboxes drop packets with TCP header extensions altogether, while others remove the extensions but forward the packet to the destination. In the case shown above, the middleboxes *replay* the header extensions back to us and the `SYN` probe never reaches the intended target host. Such middleboxes are easy to filter out of our ZMap scans by checking whether the sender's key returned in the `SYN-ACK` differs from the one sent in the `SYN`. However, this check alone may still leave false positives caused by middleboxes that perform more complex operations on packets with extended TCP options, e.g., modifying parts of the sender's key.
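
The key-comparison filter described above can be sketched as follows. This is an illustrative classification, not the study's code: the function name, the labels, and the sample responses (using documentation-range addresses) are assumptions. A differing key is labeled only *potential* MPTCP, since a middlebox may have altered the key.

```python
from collections import Counter
from typing import Optional

# Hypothetical static key sent in every SYN probe.
PROBE_KEY = 0x1111111111111111


def classify_response(sent_key: int, synack_key: Optional[int]) -> str:
    """Label a SYN-ACK by the sender's key found in its MP_CAPABLE option (None if absent)."""
    if synack_key is None:
        return "no_mptcp"          # option missing: plain TCP or option stripped
    if synack_key == sent_key:
        return "replayed"          # our own key came back: mirroring middlebox
    return "potential_mptcp"       # a different key: needs further verification


# Made-up responses keyed by target address.
responses = {
    "192.0.2.10": None,
    "192.0.2.20": PROBE_KEY,
    "192.0.2.30": 0x9F3C5A7710E2B84D,
}
labels = {ip: classify_response(PROBE_KEY, key) for ip, key in responses.items()}
print(Counter(labels.values()))
```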
>**Note**: MPTCPv1 [[RFC 8684]](https://datatracker.ietf.org/doc/html/rfc8684.html) redesigns the initial handshake mechanism specifically to handle "replaying" middleboxes in the Internet. The MPTCP option in the `SYN` sent by the connection initiator (Bob in the figure) no longer includes the sender's key but only the `MP-CAPABLE` flag. Alice responds with her own sender's key in the `SYN-ACK`, and Bob sends his key in the final `ACK` only if the `SYN-ACK` includes a sender's key. In this case, any mirroring middlebox on the path replays only Bob's `MP-CAPABLE` flag in the `SYN-ACK`, without a sender's key, and the resulting connection falls back to TCP.
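
The difference between the two handshakes can be illustrated with a simplified model, in which an option is reduced to its length and an optional key. The option lengths follow RFC 6824 and RFC 8684 (12 bytes with a key, 4 bytes without). The function names and `BOB_KEY` are assumptions, and no real packets are involved.

```python
from typing import Optional, Tuple

# Simplified MP_CAPABLE option: (length in bytes, sender's key or None).
Option = Tuple[int, Optional[int]]

BOB_KEY = 0x1111111111111111  # hypothetical initiator key


def syn_option(version: int, initiator_key: int) -> Option:
    """MP_CAPABLE in the SYN: v0 carries the initiator's key, v1 carries none."""
    if version == 0:
        return (12, initiator_key)
    return (4, None)


def mirror(option: Option) -> Option:
    """A replaying middlebox returns the SYN's option unchanged in the SYN-ACK."""
    return option


def looks_like_mptcp(synack_option: Option) -> bool:
    """The initiator continues with MPTCP only if the SYN-ACK carries a sender's key."""
    _length, peer_key = synack_option
    return peer_key is not None


for version in (0, 1):
    replayed = mirror(syn_option(version, BOB_KEY))
    print(f"MPTCPv{version}: replayed SYN-ACK accepted as MPTCP = {looks_like_mptcp(replayed)}")
```

In this model the v0 replay carries a key and so passes for an MPTCP response, while the v1 replay carries none and triggers the fallback to TCP.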
## True MPTCP support in IPv4 and IPv6
We detect the presence of interfering middleboxes by running [Tracebox](http://www.tracebox.org/) towards all targets that sent an `MP_CAPABLE` option in our ZMap scans. Specifically, we issue Tracebox requests with the `MP_CAPABLE` option towards a target address. In the reply, we receive responses from intermediate routers on the path, including any modifications made. For detailed methodology, we encourage you to read our [paper](http://dl.ifip.org/db/conf/networking/networking2021/1570699492.pdf).
Firstly, we observe that a large percentage of targets did not respond to our Tracebox queries and were *unreachable*. In IPv4, almost 90% of hosts on port 443 and 48% of hosts on port 80 were unreachable. In IPv6, only port 443 had unreachable hosts (≈82%). Further analysis revealed the root causes to include target ISP blocking and transient hosts.

*Monthly distribution of true MPTCP support in IPv4*
After removing the unreachable hosts, we find the *true* support of MPTCP to be significantly lower than that reported by ZMap. In IPv4, only ≈7.5k and ≈6.9k hosts support MPTCP on port 80 and 443 respectively. In IPv6, the adoption is much lower with 31 and 27 MPTCP hosts on port 80 and 443 respectively. The figure above shows the monthly distribution of *true* MPTCP hosts and hosts affected by non-replaying *middleboxes* in IPv4. We find that the support for MPTCP on IPv4 has been steadily increasing for both HTTP and HTTPS -- almost doubling for HTTP over six months. Of the 402 and 1.27k *middlebox-affected* end-hosts on port 80 and 443, only 6 were found to truly support MPTCP.

*Top-10 ASes hosting IPv4 MPTCP hosts*

*Top-6 ASes hosting IPv6 MPTCP hosts*
We also find that almost half of all IPv4 MPTCP hosts (≈5300) are deployed in the US, thanks to [Apple](https://www.apple.com/), which has the largest share. The second-largest support for MPTCP over IPv4 comes from Australia (≈1400), mainly due to servers hosted by [Telstra](https://www.telstra.com.au/), a major telecommunications company in the region. Germany and Korea come third and fourth (< 1000 hosts) with support from [Plus Server](https://www.plusserver.com/en/) and [Korea Telecom](https://corp.kt.com/eng/) respectively. Compared to IPv4, MPTCP support in IPv6 is much more evenly distributed over ASes -- with most hosts located in hosting providers and ISPs. Overall, we find that the current MPTCP deployment spans **more than 80 countries** across the globe.
## Conclusion
We plan to continue our scans for MPTCP adoption for the foreseeable future; you can view and download the results at [mptcp.io](https://mptcp.io/). To learn more about the impact of middleboxes on MPTCP data transfers and the share of MPTCP traffic over IPv4 and IPv6 in the Internet, please read our [full paper](http://dl.ifip.org/db/conf/networking/networking2021/1570699492.pdf). We are also very interested in learning about applications that use MPTCP over the public Internet. If you are aware of such use cases or manage servers that are involved in such exchanges, please contact us at [info@mptcp.io](mailto:info@mptcp.io).

[^AppSuport]: https://support.apple.com/en-us/HT201373
[^KTSupport]: https://www.androidauthority.com/kt-launches-1gbps-giga-lte-617147/
[^KerMP]: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=f870fa0b5768842cb4690c1c11f19f28b731ae6d
[^Meh2015]: Mehani, Olivier, et al. "An early look at multipath TCP deployment in the wild." Proceedings of the 6th international workshop on hot topics in planet-scale measurement. 2015.
[^Hes2013]: Hesmans, Benjamin, et al. "Are TCP extensions middlebox-proof?." Proceedings of the 2013 workshop on Hot topics in middleboxes and network function virtualization. 2013.