# (wip draft) IPIP: Trustless Retrieval Client

> ### 👉️ THIS IS AN EARLY WORKING DRAFT of an Improvement Proposal (IPIP) to be eventually published for public peer review at https://github.com/ipfs/specs/

***

**Status:** ![](https://img.shields.io/badge/status-wip-orange.svg?style=flat-square)

**Author(s)**:
- Marcin Rataj ([@lidel](https://github.com/lidel))

* * *

**Abstract**

This specification defines a minimal set of protocols, conventions, and best practices client applications should follow to implement trustless retrieval of content-addressed data provided by public or private IPFS provider nodes.

* * *

# Table of Contents

- [Trustless Retrieval Client](#trustless-retrieval-client)
  - [Table of Contents](#Table-of-Contents)
  - [Objectives](#objectives)
  - [Out of scope](#out-of-scope)
  - [Detailed design](#detailed-design)
  - [Test fixtures](#test-fixtures)
  - [Security considerations](#security-considerations)
  - [Copyright](#copyright)

# Objectives

<!-- What does this specification aim to achieve? -->

Increase the utility of IPFS in areas where a p2p-first approach is not feasible:

- Mobile Web Browser opening ipfs:// and ipns://
- IoT device fetching firmware updates
- Package manager fetching binaries or sources

# Out of scope

<!-- What is not part of this spec, either by design, or is already defined elsewhere -->

Client implementations are free to cache and reprovide retrieved data, but this is not part of this specification.

IPNS record and DNSLink resolution.

TODO TBD: latest IPNS record (punt and ask clients to use reframe? custom gateway response type?)

TODO TBD: publishing data: mention things like running own provider node, writable gateways, and/or using pinning service API

# Detailed design

<!-- The resulting specification should be detailed enough to allow competing, interoperable implementations. -->

The underlying idea behind the retrieval client is to identify the minimum set of primitives that enable trustless content retrieval, and to list ways it can be augmented to improve utility and performance.

## TODO: flesh out "variants"

The "variants" below document different approaches to data transfer. Each comes with utility vs. complexity trade-offs.

## Variant A.1: raw blocks

The simplest variant is an HTTP client fetching an immutable raw block from an HTTP Gateway:

```console
$ curl -H "Accept: application/vnd.ipld.raw" -L "https://ipfs.io/ipfs/bafybeidd2gyhagleh47qeg77xqndy2qy3yzn4vkxmk775bg2t5lpuy7pcu" > block.bin
```

The main benefit of this approach is that it does not require the client to implement anything other than HTTP. If the data fits in a single raw block, it can be consumed without any additional deserialization, and no IPLD library needs to be included.

## Variant A.2: blocks in a CAR

Instead of requesting a bigger DAG block-by-block, a client can request it as a [CAR](https://ipld.io/specs/transport/car/): a serialized representation of any IPLD DAG as the concatenation of its blocks, plus a header that describes the graphs in the file (via root CIDs):

```console
$ curl -H "Accept: application/vnd.ipld.car" -L "https://ipfs.io/ipfs/bafybeidd2gyhagleh47qeg77xqndy2qy3yzn4vkxmk775bg2t5lpuy7pcu" > dag.car
```

## Variant A.2.1: blocks in a CAR + selector

## Variant B.1: direct libp2p connection and bitswap

thin libp2p client, supporting quic/websockets/webtransport

1. learning about provider multiaddr from DNSaddr or other means (TODO TBD: reframe?)
2. connecting to provider
3. fetching data over bitswap (or a better protocol, if both ends support it)

## Variant B.2

DHT, P2P equivalent of kubo (go-ipfs)

## TODO

- client maintains a pool of HTTP gateways
  - implementation should have at least one implicit gateway defined by default
  - users should be able to customize the list of gateways by providing their own URLs
- client can have libp2p client in addition to http, but it is optional(?), can be used for
  - LAN discovery and local p2p transfer
  - recovery fallback when all gateways are down or unable to fetch data within some time budget
- describe flows
  - opening `ipfs://cid`
  - opening http(s):// link to IPFS resource on an HTTP gateway
    - opportunistic protocol upgrade to content-addressed and verifiable transport
  - TODO TBD: mutable resources on `ipns://` and `http(s)://...ipns..`
    - DNSLink can be resolved by the client over DNS, DNSSEC can be validated if present
    - IPNS records are problematic: how can a client resolve them without requiring DHT?
      - extend gateway spec with a way for requesting IPNS record from gateways
        - `GET /ipns/{libp2p-key}` with `Accept: application/vnd.ipfs.ipns.record; version=2` ?
- things to include
  - "do the best with transports available"
  - "prioritize fetching from HTTP and HTTP CDN caches before resorting to p2p and bitswap"
  - "when peer provides data for specific content root, keep connection to it and mark it for future use"
  - prioritizing gateways and peers which already have the data
    - resolve DNSAddr if possible
      - HTTP gateway and DNSLink websites may have DNSAddr records
      - check if /dnsaddr resolves to any actionable multiaddrs
      - connect to multiaddrs if client supports resolved protocols
      - start bitswap sessions, ask if node on the other end has the root CID, use the information for sorting retrieval providers
    - HEAD with only-if-cached
      - order gateways to be used for data transfer with specific Origin by prioritizing ones that already have the data
- make clear point that use of IPLD and libp2p is encouraged, but not mandatory
  - client should be able to fetch content-addressed data without libp2p
  - client should be able to fetch opaque binary blocks without having to implement IPLD
  - developers should be able to use IPFS as "key-value store with soft value (raw block) limit of 2MiB"

## Test fixtures

<!-- Provide examples or list relevant CIDs. Describe how implementations can use them to determine specification compliance. -->

TODO: figure out test vectors which make sense. /ipfs/ is easy, /ipns/ is tricky

- immutable `/ipfs/`
  - opening IPFS address
    - `ipfs://{cid}/path/to/file.txt`
    - `https://example.com/ipfs/{cid}/path/to/file.txt`
    - `https://{cid}.ipfs.example.com`
- mutable `/ipns/`
  - open problem: tricky to have something deterministic, without infrastructure which keeps example alive forever
  - opening DNSLink address
    - `ipns://en.wikipedia-on-ipfs.org`
    - `https://en.wikipedia-on-ipfs.org`
    - `https://ipfs.io/ipns/en.wikipedia-on-ipfs.org`
    - `https://en-wikipedia--on--ipfs-org.ipns.dweb.link`
    - TODO TBD if we want `https://en.wikipedia-on-ipfs.org.ipns.localhost:8080`
      - probably not, just require inlining everywhere
  - opening IPNS record address
    - `ipns://{libp2p-key}`
    - `https://{libp2p-key}`
    - `https://ipfs.io/ipns/{libp2p-key}`
    - `https://{libp2p-key}.ipns.dweb.link`

## Security considerations

<!-- Explain the security implications and related risks that implementers should be aware of -->

- TODO: highlight sensitive areas where trust should not be delegated to gateways
  - DNSLink resolution
  - IPNS record resolution
  - data fetched as block or CAR (bag of blocks), verification MUST happen on the client

# Copyright

Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).
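
# Appendix: block verification sketch (non-normative)

The security considerations require that data fetched as a raw block or CAR is verified on the client. Below is a minimal sketch of that check for a single raw block, assuming a base32 CIDv1 whose multihash is sha2-256 (multihash code `0x12`). The helper names `read_varint`, `parse_cidv1`, and `verify_block` are hypothetical and not part of any IPFS library; a real client would likely use an existing CID/multihash implementation and support more hash functions and encodings.

```python
import base64
import hashlib


def read_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode one unsigned varint starting at pos; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return value, pos
        shift += 7


def parse_cidv1(cid: str) -> tuple[int, int, bytes]:
    """Return (codec, multihash_code, digest) for a base32 CIDv1 string."""
    if not cid.startswith("b"):
        raise ValueError("only base32 (multibase prefix 'b') is handled in this sketch")
    body = cid[1:].upper()
    body += "=" * (-len(body) % 8)  # base64.b32decode expects padded input
    raw = base64.b32decode(body)
    version, pos = read_varint(raw, 0)
    if version != 1:
        raise ValueError("only CIDv1 is handled in this sketch")
    codec, pos = read_varint(raw, pos)
    mh_code, pos = read_varint(raw, pos)
    mh_len, pos = read_varint(raw, pos)
    digest = raw[pos:pos + mh_len]
    if len(digest) != mh_len or pos + mh_len != len(raw):
        raise ValueError("malformed multihash")
    return codec, mh_code, digest


def verify_block(cid: str, block: bytes) -> bool:
    """Check that the bytes received from a gateway hash to the requested CID."""
    _codec, mh_code, digest = parse_cidv1(cid)
    if mh_code != 0x12:  # 0x12 is sha2-256; other hash functions are out of scope here
        raise ValueError("unsupported hash function in this sketch")
    return hashlib.sha256(block).digest() == digest
```

A client would call `verify_block(cid, body)` on the response to an `Accept: application/vnd.ipld.raw` request and discard the data if it returns `False`. The same per-block check applies to every block extracted from a CAR response.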