# (wip draft) IPIP: Trustless Retrieval Client
> ### 👉️ THIS IS AN EARLY WORKING DRAFT of an Improvement Proposal (IPIP), to be eventually published for public peer review at https://github.com/ipfs/specs/
***
**Status:** 
**Author(s)**:
- Marcin Rataj ([@lidel](https://github.com/lidel))
* * *
**Abstract**
This specification defines a minimal set of protocols, conventions, and best
practices client applications should follow to implement trustless retrieval of
content-addressed data provided by public or private IPFS provider nodes.
* * *
# Table of Contents
- [Trustless Retrieval Client](#trustless-retrieval-client)
- [Table of Contents](#table-of-contents)
- [Objectives](#objectives)
- [Out of scope](#out-of-scope)
- [Detailed design](#detailed-design)
- [Test fixtures](#test-fixtures)
- [Security considerations](#security-considerations)
- [Copyright](#copyright)
# Objectives
<!-- What does this specification aim to achieve? -->
Increase the utility of IPFS in areas where a p2p-first approach is not feasible:
- Mobile web browser opening `ipfs://` and `ipns://` addresses
- IoT device fetching firmware updates
- Package manager fetching binaries or sources
# Out of scope
<!-- What is not part of this spec, either by design, or is already defined
elsewhere -->
Client implementations are free to cache and reprovide retrieved data, but it
is not a part of this specification.
IPNS record and DNSLink resolution.
TODO TBD: latest IPNS record (punt and ask clients to use Reframe? custom gateway response type?)
TODO TBD: publishing data: mention things like running own provider node, writable gateways, and/or using pinning service API
# Detailed design
<!-- The resulting specification should be detailed enough to allow competing,
interoperable implementations. -->
The underlying idea behind the retrieval client is to identify the minimum set of primitives that enable trustless content retrieval, and to list ways this set can be augmented to improve utility and performance.
## TODO: flesh out "variants"
The "variants" below document different approaches to data transfer. Each comes with its own trade-off between utility and complexity.
## Variant A.1: raw blocks
The simplest variant is an HTTP client fetching an immutable raw block from an HTTP Gateway:
```console
$ curl -H "Accept: application/vnd.ipld.raw" -L "https://ipfs.io/ipfs/bafybeidd2gyhagleh47qeg77xqndy2qy3yzn4vkxmk775bg2t5lpuy7pcu" > block.bin
```
The main benefit of this approach is that it does not require the client to implement anything other than HTTP. If the data fits in a single raw block, it can be consumed without any additional deserialization, so no IPLD library needs to be included.
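Fetching a block is only trustless if the client checks the bytes against the CID it asked for. The following is a minimal sketch of that check using only the Python standard library. It assumes a base32 CIDv1 with a sha2-256 multihash (the form used in the example above); the helper names `read_varint` and `verify_raw_block` are illustrative and not part of any IPFS library.

```python
import base64
import hashlib

SHA2_256 = 0x12  # multihash code for sha2-256


def read_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode an unsigned LEB128 varint at buf[pos:]; return (value, next_pos)."""
    value = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def verify_raw_block(cid: str, block: bytes) -> bool:
    """Return True if sha2-256(block) equals the digest embedded in a base32 CIDv1."""
    if not cid.startswith("b"):
        raise ValueError("only base32 CIDs (multibase prefix 'b') are handled here")
    body = cid[1:].upper()
    raw = base64.b32decode(body + "=" * (-len(body) % 8))  # restore stripped padding
    version, pos = read_varint(raw, 0)
    _codec, pos = read_varint(raw, pos)  # e.g. 0x55 (raw) or 0x70 (dag-pb); unused here
    hash_code, pos = read_varint(raw, pos)
    digest_len, pos = read_varint(raw, pos)
    if version != 1 or hash_code != SHA2_256:
        raise ValueError("unsupported CID version or hash function")
    return hashlib.sha256(block).digest() == raw[pos:pos + digest_len]
```

With the `block.bin` file saved by the `curl` command, a client would call `verify_raw_block(cid, open("block.bin", "rb").read())` and discard the response if the result is `False`.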
## Variant A.2: blocks in a CAR
Instead of requesting a bigger DAG block by block, a client can request it as a [CAR](https://ipld.io/specs/transport/car/): a serialized representation of any IPLD DAG as the concatenation of its blocks, plus a header that describes the graphs in the file (via root CIDs):
```console
$ curl -H "Accept: application/vnd.ipld.car" -L "https://ipfs.io/ipfs/bafybeidd2gyhagleh47qeg77xqndy2qy3yzn4vkxmk775bg2t5lpuy7pcu" > dag.car
```
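A CAR response still has to be verified block by block on the client. The sketch below walks a CARv1 byte stream (a varint-prefixed header followed by varint-prefixed `CID | data` sections) and checks each block against its own CID. It is a simplified illustration under stated assumptions: only sha2-256 multihashes are accepted, the DAG-CBOR header is skipped rather than decoded, and the function name `iter_verified_blocks` is hypothetical.

```python
import hashlib
from typing import Iterator


def read_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode an unsigned LEB128 varint at buf[pos:]; return (value, next_pos)."""
    value = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def iter_verified_blocks(car: bytes) -> Iterator[tuple[bytes, bytes]]:
    """Yield (cid_bytes, block_bytes) per CARv1 section, verifying sha2-256 digests."""
    header_len, pos = read_varint(car, 0)
    pos += header_len  # skip the DAG-CBOR header; roots are not decoded in this sketch
    while pos < len(car):
        section_len, pos = read_varint(car, pos)
        end = pos + section_len
        if car[pos:pos + 2] == b"\x12\x20":
            digest_start = pos + 2  # CIDv0: a bare sha2-256 multihash
        else:
            version, p = read_varint(car, pos)  # CIDv1: version, codec, multihash
            _codec, p = read_varint(car, p)
            hash_code, p = read_varint(car, p)
            digest_len, p = read_varint(car, p)
            if version != 1 or hash_code != 0x12 or digest_len != 32:
                raise ValueError("unsupported CID in CAR section")
            digest_start = p
        data_start = digest_start + 32
        cid, block = car[pos:data_start], car[data_start:end]
        if hashlib.sha256(block).digest() != car[digest_start:data_start]:
            raise ValueError("block does not match its CID")
        yield cid, block
        pos = end
```

This only proves that each block matches the CID it is labeled with. A complete client would also need to confirm that the blocks belong to the DAG under the requested root, which requires decoding links and is left out here.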
## Variant A.2.1: blocks in a CAR + selector
## Variant B.1: direct libp2p connection and bitswap
A thin libp2p client supporting QUIC, WebSockets, or WebTransport retrieves data by:
1. learning the provider's multiaddr from DNSAddr or other means (TODO TBD: Reframe?)
2. connecting to the provider
3. fetching data over Bitswap (or a better protocol, if both ends support it)
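Step 1 can be illustrated with a small filter over DNSAddr results. The sketch assumes the TXT values published under `_dnsaddr.<domain>` have already been fetched by some DNS library and use the `dnsaddr=<multiaddr>` form. The function name, the set of supported transport names, and the naive string matching are all assumptions made for illustration; a real client would use a proper multiaddr parser.

```python
# Transport names this hypothetical thin client can dial.
SUPPORTED = frozenset({"quic", "quic-v1", "ws", "wss", "webtransport"})


def actionable_multiaddrs(txt_records: list[str],
                          supported: frozenset = SUPPORTED) -> list[str]:
    """Extract multiaddrs from dnsaddr TXT values, keeping those with a dialable transport."""
    result = []
    for record in txt_records:
        if not record.startswith("dnsaddr="):
            continue  # unrelated TXT record (SPF, site verification, ...)
        multiaddr = record[len("dnsaddr="):]
        # Naive check: treat every path component as a potential protocol name.
        if set(multiaddr.split("/")) & supported:
            result.append(multiaddr)
    return result
```

For example, given a TCP-only record and a `quic-v1` record for the same peer, only the `quic-v1` multiaddr would be returned and passed on to step 2.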
## Variant B.2: DHT, P2P
Equivalent of Kubo (go-ipfs).
## TODO
- client maintains a pool of HTTP gateways
  - implementation should have at least one implicit gateway defined by default
  - users should be able to customize the list of gateways by providing their own URLs
- client can have a libp2p client in addition to HTTP, but it is optional(?); it can be used for
  - LAN discovery and local p2p transfer
  - recovery fallback when all gateways are down or unable to fetch data within some time budget
- describe flows
  - opening `ipfs://cid`
  - opening an `http(s)://` link to an IPFS resource on an HTTP gateway
    - opportunistic protocol upgrade to a content-addressed and verifiable transport
  - TODO TBD: mutable resources on `ipns://` and `http(s)://...ipns..`
    - DNSLink can be resolved by the client over DNS; DNSSEC can be validated if present
    - IPNS records are problematic: how can a client resolve them without requiring the DHT?
      - extend the gateway spec with a way to request an IPNS record from gateways
        - `GET /ipns/{libp2p-key}` with `Accept: application/vnd.ipfs.ipns.record; version=2` ?
- things to include
  - "do the best with transports available"
  - "prioritize fetching from HTTP and HTTP CDN caches before resorting to p2p and bitswap"
  - "when peer provides data for specific content root, keep connection to it and mark it for future use"
  - prioritizing gateways and peers which already have the data
    - resolve DNSAddr if possible
      - HTTP gateway and DNSLink websites may have DNSAddr records
      - check if `/dnsaddr` resolves to any actionable multiaddrs
      - connect to multiaddrs if the client supports the resolved protocols
      - start Bitswap sessions, ask if the node on the other end has the root CID, and use this information for sorting retrieval providers
    - `HEAD` request with `Cache-Control: only-if-cached`
      - order the gateways to be used for data transfer with a specific Origin by prioritizing ones that already have the data
- make it clear that use of IPLD and libp2p is encouraged, but not mandatory
  - client should be able to fetch content-addressed data without libp2p
  - client should be able to fetch opaque binary blocks without having to implement IPLD
  - developers should be able to use IPFS as a "key-value store with soft value (raw block) limit of 2MiB"
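The gateway pool and the recovery fallback described in the notes above could be combined roughly as follows. This is a sketch under stated assumptions, not a proposed API: `fetch`, `verify`, and `fallback` are hypothetical callables injected by the caller (for instance an HTTP GET with `Accept: application/vnd.ipld.raw`, a CID check, and a libp2p-based fetch), and the time budget is a single deadline shared by all gateways.

```python
import time
from typing import Callable, Iterable, Optional


def fetch_with_fallback(
    cid: str,
    gateways: Iterable[str],
    fetch: Callable[[str], bytes],
    verify: Callable[[str, bytes], bool],
    budget_seconds: float,
    fallback: Optional[Callable[[str], bytes]] = None,
) -> bytes:
    """Try each gateway in order within a time budget, then an optional p2p fallback."""
    deadline = time.monotonic() + budget_seconds
    for gateway in gateways:
        if time.monotonic() >= deadline:
            break  # time budget exhausted: stop trying gateways
        url = f"{gateway.rstrip('/')}/ipfs/{cid}"
        try:
            block = fetch(url)
        except OSError:
            continue  # gateway down or unreachable: try the next one
        if verify(cid, block):
            return block
        # Response did not match the CID: never trust it, move on.
    if fallback is not None:
        block = fallback(cid)
        if verify(cid, block):
            return block
    raise LookupError(f"no provider returned verifiable data for {cid}")
```

Passing `fetch` and `verify` as parameters keeps the sketch independent of any particular HTTP or hashing library. Sorting `gateways` so that those already holding the data come first would happen before this call.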
# Test fixtures
<!-- Provide examples or list relevant CIDs. Describe how implementations can
use them to determine specification compliance. -->
TODO: figure out test vectors which make sense. /ipfs/ is easy, /ipns/ is tricky
- immutable `/ipfs/`
  - opening IPFS address
    - `ipfs://{cid}/path/to/file.txt`
    - `https://example.com/ipfs/{cid}/path/to/file.txt`
    - `https://{cid}.ipfs.example.com`
- mutable `/ipns/`
  - open problem: tricky to have something deterministic without infrastructure that keeps the example alive forever
  - opening DNSLink address
    - `ipns://en.wikipedia-on-ipfs.org`
    - `https://en.wikipedia-on-ipfs.org`
    - `https://ipfs.io/ipns/en.wikipedia-on-ipfs.org`
    - `https://en-wikipedia--on--ipfs-org.ipns.dweb.link`
    - TODO TBD if we want `https://en.wikipedia-on-ipfs.org.ipns.localhost:8080` - probably not, just require inlining everywhere
  - opening IPNS record address
    - `ipns://{libp2p-key}`
    - `https://{libp2p-key}`
    - `https://ipfs.io/ipns/{libp2p-key}`
    - `https://{libp2p-key}.ipns.dweb.link`
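One way a test suite could exercise the address forms above is to check that a client normalizes them to the same content path. The sketch below is an illustration only: the function name `to_content_path` is hypothetical, and it assumes DNSLink names in subdomain gateways are inlined by writing `.` as `-` and `-` as `--`, as in the `dweb.link` example. A plain `https://` hostname with no gateway marker is rejected, since recognizing it would require an actual DNSLink lookup.

```python
from urllib.parse import urlsplit


def to_content_path(address: str) -> str:
    """Map a native, path-gateway, or subdomain-gateway address to /ipfs/... or /ipns/..."""
    url = urlsplit(address)
    host, path = url.netloc, url.path  # netloc keeps the original case of the root
    if url.scheme in ("ipfs", "ipns"):
        return f"/{url.scheme}/{host}{path}"
    if url.scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme: {url.scheme!r}")
    for namespace in ("ipfs", "ipns"):
        root, marker, _gateway = host.partition(f".{namespace}.")
        if marker:  # subdomain gateway: {root}.{namespace}.{gateway host}
            if namespace == "ipns" and "." not in root:
                # Undo DNSLink inlining: "--" stands for "-", a single "-" for ".".
                root = root.replace("--", "\0").replace("-", ".").replace("\0", "-")
            return f"/{namespace}/{root}{path}"
    if path.startswith(("/ipfs/", "/ipns/")):
        return path  # path gateway
    raise ValueError("not a recognized gateway URL; a DNSLink lookup would be needed")
```

Under these assumptions, `ipns://en.wikipedia-on-ipfs.org`, `https://ipfs.io/ipns/en.wikipedia-on-ipfs.org`, and `https://en-wikipedia--on--ipfs-org.ipns.dweb.link` all map to `/ipns/en.wikipedia-on-ipfs.org`.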
# Security considerations
<!-- Explain the security implications and related risks that implementers
should be aware of -->
- TODO: highlight sensitive areas where trust should not be delegated to gateways
  - DNSLink resolution
  - IPNS record resolution
  - data fetched as a block or a CAR (bag of blocks): verification MUST happen on the client
# Copyright
Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).