---
title: Privacy on IPFS
tags: research&projects, ipfs, privacy, p2p
---

# Privacy on IPFS

**Scope:** This document outlines i) user privacy attack vectors on IPFS and DHT-based P2P networks and ii) research and development directions to improve the current state of user privacy on IPFS.

## IPFS privacy threat model

By actively and passively monitoring network requests on IPFS, an attacker can learn about users' interests and behavior over time. In [1], the authors describe three attacks on user privacy that can be performed through monitoring alone: i) identifying data *wanters* (IDW); ii) tracking node *wants* (TNW); and iii) testing for past interests.

The IDW and TNW attacks consist of monitoring CID requests on the network to link peer IDs, and their respective IPs, with content identifiers (CIDs). The attack can also be targeted: by injecting attacker-controlled peer IDs into the victim's peer table, the attacker obtains a vantage point from which to learn which CIDs the victim is requesting. These attacks allow anyone on the network to learn which content a user/IP requests over time, which can be highly sensitive depending on the content being fetched.

The testing for past interests attack consists of an attacker probing a victim for recently accessed content. Users often store and "seed" content they recently requested on the network. Serving content leaks interest (i.e. a user/IP serving a CID reveals interest in that particular content), so attackers can use it to learn which content a user is interested in.

In a nutshell, user privacy vulnerabilities on IPFS arise from the underlying fact that users are required to collaborate in order to discover and serve content on a P2P network. In addition, a simple and leaky DHT is optimal in terms of performance, which is especially important to ensure usability in large P2P networks.
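
To make the IDW and TNW attacks from the threat model concrete, the following is a minimal sketch of what a passive monitor could compute once it has collected request observations. The `WantObservation` record, the peer IDs, IPs, and CID strings are all hypothetical placeholders, and no real IPFS or Bitswap API is used; the point is only that simple aggregation over observed requests is enough to link peers to content.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class WantObservation:
    """One observed request (hypothetical record, not an IPFS type)."""
    peer_id: str
    ip: str
    cid: str
    timestamp: int


def identify_wanters(log):
    """IDW: map each CID to the set of (peer ID, IP) pairs seen requesting it."""
    wanters = defaultdict(set)
    for obs in log:
        wanters[obs.cid].add((obs.peer_id, obs.ip))
    return dict(wanters)


def track_wants(log, peer_id):
    """TNW: list, in time order, the CIDs a given peer was seen requesting."""
    events = sorted(
        (obs for obs in log if obs.peer_id == peer_id),
        key=lambda obs: obs.timestamp,
    )
    return [obs.cid for obs in events]


# Toy observation log; the addresses come from documentation-only IP ranges.
log = [
    WantObservation("peerA", "203.0.113.5", "cid-1", 10),
    WantObservation("peerB", "198.51.100.7", "cid-1", 12),
    WantObservation("peerA", "203.0.113.5", "cid-2", 20),
]

wanters_by_cid = identify_wanters(log)
peer_a_history = track_wants(log, "peerA")
```

The mitigations discussed in the rest of this document can be read as attempts to break one of the links this sketch relies on: the CID being visible, the peer ID or IP being visible, or the observed request being genuine.
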
## Privacy Preserving Browsers and IPFS

Brave Browser [2] implements a set of privacy features that protect its users against web fingerprinting, third-party cookies, and other web privacy attacks [3]. Brave Browser recently added native IPFS support [4]. Given that the privacy issues of IPFS are at odds with Brave's commitment to user privacy, what mechanisms could IPFS (or Brave Browser itself) implement to mitigate the user privacy attacks that are endemic to IPFS?

## Research and development directions

In this section we outline potential research and development directions to mitigate or attenuate the privacy vulnerabilities posed by the underlying IPFS protocol. We also focus on how to make effective privacy attacks harder and more expensive, which may deter attackers by reducing their incentive to perform network-wide and/or targeted privacy attacks.

**Content routing with Oblivious Locality Sensitive Hashing**:

**Content fetching through Private Information Retrieval -- Sinkhole protocol**: The Sinkhole protocol [6] aims at protecting against the IDW and TNW attacks by partially hiding the CID of the requested content. The protocol splits the traditional DHT lookup into two phases: first, the lookup initiator searches the DHT for Private Information Retrieval (PIR) providers for a given key; then, the lookup initiator completes the key lookup by querying the PIR provider for the full ID. Only the first phase discloses information about the lookup ID to DHT nodes, and even then it reveals only part of the key.

**PIR-based routing**: The PIR-based routing mechanism consists of users performing CID queries in a PIR fashion, i.e. instead of requesting the CID in the clear, the query is encrypted or otherwise constructed so that the responder cannot learn which CID is being requested, even though it is still able to serve the requester.
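
One simple way to picture a query that hides the full CID is a prefix-bucket lookup: the requester discloses only a short prefix of the key, the responder returns every record under that prefix, and the requester picks the wanted record locally. The sketch below is an illustration under stated assumptions, not the Sinkhole protocol or any existing IPFS API: `BucketResponder`, `private_lookup`, `key_for`, and `PREFIX_LEN` are hypothetical names, and hex SHA-256 digests stand in for real CIDs.

```python
import hashlib
from collections import defaultdict

PREFIX_LEN = 2  # hex characters disclosed to the responder (illustrative choice)


def key_for(name: bytes) -> str:
    """Stand-in for a CID: hex SHA-256 of the name (not a real CID encoding)."""
    return hashlib.sha256(name).hexdigest()


class BucketResponder:
    """Hypothetical responder that indexes its records by key prefix."""

    def __init__(self, records):
        self.buckets = defaultdict(dict)
        for key, value in records.items():
            self.buckets[key[:PREFIX_LEN]][key] = value
        self.seen_queries = []  # everything the responder gets to observe

    def query(self, prefix: str) -> dict:
        self.seen_queries.append(prefix)
        # Return the whole bucket, so the responder cannot tell which
        # entry inside it the requester actually wants.
        return dict(self.buckets.get(prefix, {}))


def private_lookup(responder, full_key: str):
    """Disclose only the key prefix, then select the wanted record locally."""
    bucket = responder.query(full_key[:PREFIX_LEN])
    return bucket.get(full_key)


records = {key_for(n): n.decode() + "-providers" for n in (b"a", b"b", b"c", b"d")}
responder = BucketResponder(records)

wanted = key_for(b"c")
result = private_lookup(responder, wanted)
```

The prefix length is the tuning knob in this sketch: a shorter prefix means larger buckets, so the wanted key hides among more candidates, at the cost of more bandwidth per lookup. Within a bucket the requester simply downloads everything, so the privacy guarantee is only as strong as the number of keys that share the prefix.
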
PIR-style queries can be built in multiple ways, for example with trivial PIR or homomorphic-encryption-based content fetching schemes. This mechanism aims to mitigate the IDW and TNW attacks.

**Plausible deniability mechanisms**: By adding noise to the network, it *may* be possible to add plausible deniability to both *want* and *serve* DHT requests. For example, by probabilistically replicating incoming serve and want requests, users automatically add noise to the network, effectively obscuring which requests are real [7]. This mechanism aims to mitigate the IDW and TNW attacks.

**k-anonymity rings for content routing**: The authors of [5] claim that, by forming anonymity groups and generating DHT responses inside those groups, *responder* privacy can be improved through k-anonymity against network-level attacks. This mechanism aims to mitigate the testing for past interests attack.

**IPFS over Mixnet/Onion routing networks**: Proxying DHT requests through a Mixnet/Tor network (and using pseudonymous peer IDs) may help unlink DHT requests from user IPs and real peer IDs. This mechanism aims to mitigate the IDW, TNW, and testing for past interests attacks.

**Onion routing-based content routing on IPFS**: Relying on the DHT overlay to build and use onion circuits for DHT requests may improve user privacy [8]. This mechanism aims to mitigate the IDW, TNW, and testing for past interests attacks.

## Literature & further reading

1. [Monitoring Data Requests in Decentralized Data Storage Systems: A Case Study of IPFS](https://arxiv.org/abs/2104.09202)
2. [Brave Browser](https://brave.com)
3. [Brave Browser - Privacy Features](https://brave.com/privacy-features/)
4. [IPFS support on Brave Browser](https://brave.com/ipfs-support/)
5. [k-anonymity Chord for Anonymous Query Response](http://perpetualinnovation.net/ojs/index.php/ijngc/article/view/275)
6. [Sinkhole protocol](https://github.com/gpestana/notes/issues/27)
7. [Coin flip request delegation](https://github.com/gpestana/notes/issues/23)
8. [p3lib - Toolbox for enhancing privacy in P2P networks](https://github.com/hashmatter/p3lib/)