📄 Proposal: Location-Hinted Links for Content-Addressable Systems

# 📄 Research: Available Location-Hinted Links for Content-Addressable Systems

## Abstract

This document explores how to extend content-addressable systems with location-hinted identifiers that help clients discover content providers without needing external discovery systems (like IPNI or DHTs). We explore three approaches to encode these identifiers:

- A naive query param encoding in IPFS paths
- Magnet URI format (widely used in BitTorrent)
- A more modern structured schema like RASL

## Problem

Smart clients (like [`helia-verified-fetch`](https://github.com/ipfs/helia-verified-fetch/tree/main/packages/verified-fetch) in the IPFS ecosystem) enable users to fetch and verify content from IPFS nodes or servers like Hashstream without trusting the server: only the content matters. However, they still need to know where to send the request.

Today, this requires hardcoded client configuration (e.g. gateway URLs) or additional discovery hops (IPNI, DHT lookups). These extra steps add latency, complexity and fragile dependencies. Moreover, content providers are required to perform additional orchestration to keep content discoverable by these networks.

We want a way to bundle location hints directly with the content identifier so clients can:

- Try those hints first,
- And only fall back to discovery if needed.

This improves performance, reduces load on discovery networks, and gives content publishers a simple, portable way to share location-hinted identifiers.

## Naive Approach: IPFS Path Encoding

A simple idea would be:

```sh
/ipfs/{cid}?providers=maddr1,maddr2
```

Resulting for example in:

```sh
/ipfs/bafy...abc?providers=https://my-hash-stream-server.com/ipfs/,/dns4/other.com/tcp/443/https
```

### Pros

- ✅ Familiar
- ✅ Easy to parse for HTTP clients

### Cons

- ❌ Tied tightly to IPFS-specific URL semantics
- ❌ Not a general-purpose or cross-protocol format
- ❌ No established ecosystem outside IPFS

This pushes us to consider a more interoperable, system-agnostic format.

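
To make the naive encoding concrete, here is a minimal sketch of how a client might split such a path into a CID and its provider hints. The function name `parse_hinted_path` is hypothetical, and the sketch assumes that hints never contain a literal comma (a hint URL with a comma would need an escaping rule that this proposal does not define).

```python
from __future__ import annotations

from urllib.parse import parse_qs, urlsplit


def parse_hinted_path(path: str) -> tuple[str, list[str]]:
    """Split '/ipfs/{cid}?providers=a,b' into (cid, [a, b]).

    Hypothetical helper: the 'providers' parameter name comes from the
    naive approach above; the comma-splitting rule is an assumption.
    """
    parts = urlsplit(path)
    segments = parts.path.strip("/").split("/")
    if len(segments) < 2 or segments[0] != "ipfs":
        raise ValueError(f"not an /ipfs/ path: {path!r}")
    cid = segments[1]

    # 'providers' may be absent: hints are optional.
    raw_values = parse_qs(parts.query).get("providers", [])
    providers = [hint for value in raw_values for hint in value.split(",") if hint]
    return cid, providers


cid, providers = parse_hinted_path(
    "/ipfs/bafy...abc?providers=https://my-hash-stream-server.com/ipfs/,/dns4/other.com/tcp/443/https"
)
```

Because the hints live in the query string, a client that does not recognize `providers` can still read the CID from the path and fall back to regular discovery.
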
## Option 2: Magnet URIs

Magnet URIs are a well-established, extensible format used in systems like BitTorrent, ed2k, and Gnutella.

**Example:**

```sh
magnet:?xt=urn:btih:<infohash>&dn=<display name>&tr=<tracker URL>&ws=<web seed>
```

**Where:**

- `xt` → Exact Topic (typically content hash)
- `dn` → Display Name
- `tr` → Tracker URL (for discovery)
- `ws` → Web seed (HTTP seed source)

Magnet URIs are:

- ✅ Protocol-agnostic
- ✅ Extensible
- ✅ Well-supported in many tools
- ✅ Designed for “content + discovery hints”

This makes them a strong candidate for encoding content identifiers + provider hints across ecosystems.

### Design Proposal

- `xt` → CID or multihash, using new or existing URN namespaces
  - Example: `urn:ipfs:<cid>` or `urn:multihash:<multihash>`
- `tr` → Provider hints (multiaddrs, URLs)

**Example:**

```sh
magnet:?xt=urn:ipfs:bafy...abc&tr=https://my-hashstream-server.com/ipfs/&tr=/dns4/other.com/tcp/443/https
```

### Encoding alternatives

| Approach | Example | Notes |
| --- | --- | --- |
| **Use IPFS namespace** | `magnet:?xt=urn:ipfs:<cid>&tr=<provider1>&tr=<provider2>` | Familiar to IPFS ecosystem, directly usable with existing CIDs. |
| **Use generic multihash namespace** | `magnet:?xt=urn:multihash:<multihash>&tr=<provider1>` | More general; works outside IPFS. Needs clear encoding (e.g., RAW CID or base32 multihash). |
| **Use `ws` (web seeds) for HTTP** | `magnet:?xt=urn:ipfs:<cid>&ws=https://myserver.com/ipfs/` | Fits into BitTorrent’s `ws` extension; good for HTTP/HTTPS providers but may confuse non-HTTP use cases. |
| **Use `tr` (tracker) with multiaddrs** | `magnet:?xt=urn:ipfs:<cid>&tr=/ip4/1.2.3.4/tcp/4001` | Directly embeds multiaddr as tracker; requires clients to understand multiaddr format. |
| **Use custom query param (least preferred)** | `/ipfs/<cid>?providers=maddr1,maddr2` | IPFS-specific; lacks general ecosystem support; not magnet-compatible. |

### Comparison

| Criteria | IPFS namespace | Multihash namespace | Web seeds (`ws`) | Trackers (`tr` + multiaddr) | Custom query param |
| --- | --- | --- | --- | --- | --- |
| Generality | IPFS-specific | Protocol-agnostic | Mostly HTTP/HTTPS | Flexible, fits p2p + infra | IPFS-specific |
| Ecosystem compatibility | Good (IPFS) | Good (multihash users) | Good (BitTorrent) | Fair (needs multiaddr-aware clients) | Poor |
| Ease of adoption | High | Medium (define multihash URN) | High for HTTP; medium for others | Medium (multiaddr parsing) | Low |
| Standards fit | Good | Needs work | Standard in BT | Fits existing magnet keys | Not standardized |

### Benefits

- ✅ Ecosystem-agnostic
- ✅ Flexible encoding → can support multihash or CID
- ✅ Extensible → can include multiple discovery hints like `tr`, `ws`
- ✅ Leverages existing tooling → parsers, UX patterns from BitTorrent

### Drawbacks

- ❌ **No origin information** → Magnet links lack cryptographic or verifiable origin metadata, making them hard to authenticate or authorize in modern trust-aware environments.
- ❌ **Limited interoperability** → Despite wide historical use, they are poorly integrated into modern tools and protocols outside the BitTorrent ecosystem.
- ❌ **Fragmented support** → Interpretation of parameters like `tr` and `ws` varies across clients. There’s no universal way to encode non-HTTP providers (e.g., multiaddrs).
- ❌ **Redundant with more expressive standards** → Magnet URIs do not provide unique expressive power beyond what can be done more cleanly with modern URL-based or schema-driven identifiers (e.g., RASL).

### Open Questions

- **Multiaddr encoding**: What string format to use? Base64? Canonical string?
- **Namespace registration**: Should `urn:ipfs` or `urn:multihash` be formalized?
- **Client adoption**: Can we build lightweight magnet resolvers for new clients?
- **Cross-protocol support**: Will this work for non-IPFS protocols like Bitcoin or Nostr?
- **Fallback policies**: What should a client do when all `tr`/`ws` hints fail?

## Option 3: RASL

### What is RASL?

RASL (Retrieval of Arbitrary Structures & Links) is a URL scheme designed to facilitate the retrieval of content-addressed resources by embedding both the content identifier and optional location hints directly within the URL. This approach streamlines discovery and retrieval in decentralized systems, leveraging the self-certifying nature of content identifiers.

A RASL URI looks like:

```sh
web+rasl://<subject>;<hint1,hint2,...>
```

- **subject** → the main identifier (e.g., an IPFS CID)
- **hintN** → authority hosts, where `.well-known/rasl` metadata can be fetched

**Example:**

```sh
web+rasl://bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4;berjon.com,bsky.app/
```

**Components:**

- **web+rasl:** Custom URL scheme, browser-compatible
- **CID:** The content identifier
- **Hints:** Hostnames (or subdomains) with known hosting
- **Path:** Optional path for structured data access

### Retrieval Process

1. Parse the URL and extract the CID and hints.
2. Construct retrieval URLs for each hint:

   ```sh
   https://<hint>/.well-known/rasl/<cid>
   ```

3. Try `HEAD` or `GET` requests to these URLs. The server MAY respond with a redirect to the location the content can be fetched from.
4. Use the first successful response.
5. Verify that the retrieved content matches the CID.

### Retrieval process for IPFS Gateway

For IPFS content retrieval, we can use the RASL convention to redirect a content ID (CID) to its corresponding IPFS path.
For example:

```sh
https://ipfs.io/.well-known/rasl/bafybeigdyrztxpxh2fq2ttrb7z7kj2vxnxar4rb72q56i3qj2xktx3zqka
```

This request could respond with:

```sh
HTTP/1.1 302 Found
Location: https://ipfs.io/ipfs/bafybeigdyrztxpxh2fq2ttrb7z7kj2vxnxar4rb72q56i3qj2xktx3zqka
```

This HTTP redirect allows the client to follow the `Location` header to fetch the actual IPFS content without needing to understand IPFS internals upfront.

### Benefits

- ✅ **Self-contained identifiers** with embedded retrieval hints.
- ✅ **Transport-agnostic**: retrieval over HTTPS, IPFS, or custom transports possible.
- ✅ **Web-compatible**: works well with browsers and standard HTTP infra.
- ✅ **Content integrity guaranteed**: only CID-verified content is accepted.

### Comparison to Magnet URIs

| Feature | Magnet Links | RASL |
| --- | --- | --- |
| Scheme | `magnet:` | `web+rasl:` |
| Identifier Format | infohash / URN | CID |
| Location Hints | `tr`, `ws` parameters | Authority portion of URL |
| Web Integration | Limited | Designed for web/browser integration |
| Content Verification | Hash match | CID match (self-certifying) |
| Standard Path | None | `/.well-known/rasl/<cid>` |
| Ecosystem Support | Legacy p2p | Emerging standard in decentralized web |

### Drawbacks

- ❌ New and less widely adopted
- ❌ Requires new client-side support (though simple to implement)
- ❌ Somewhat unconventional URL structure

## 🚀 Conclusion

Each approach offers a different tradeoff in expressiveness, compatibility, and modernity:

| Option | Portability | Ecosystem Fit | Standards-Based | Web Compatibility | Integrity | Expressiveness |
| --- | --- | --- | --- | --- | --- | --- |
| Naive Query | Low | IPFS-specific | No | Medium | Yes | Low |
| Magnet URIs | Medium | p2p (legacy) | Partially | Limited | Yes | Medium |
| RASL | High | Future-web | Yes | Excellent | Yes | High |

## 🚀 Summary

This proposal aims to make content-addressable systems like Hashstream even simpler and more robust by using the magnet URI format to deliver location-hinted identifiers. This reduces discovery complexity, speeds up client fetches, and leverages an interoperable, ecosystem-proven format.

## References

- [Magnet URI scheme](https://en.wikipedia.org/wiki/Magnet_URI_scheme)
- [Helia](https://helia.io/)
- [Multiaddr spec](https://github.com/multiformats/multiaddr)
- [RASL](https://dasl.ing/rasl.html)

---

## Call Follow up Notes

- What can we define for a URI format that enables other teams to easily adopt it
- Lowest amount of work to adopt
- Define something that can easily be translated with minimum effort
- Visual compatibility with Gateway URLs
- With a gateway link
- Base compatibility with `ipfs://CID/`
- Ergonomic in CLI
- No URL encoding needed except for quotes
- Useful for a full node like Kubo, not only HTTP
- Publisher should be able to encode as much as possible and potentially go away, no need for intermediate translation steps
- Allow link constructor to be as descriptive as they choose (advertise http, libp2p, second hop for IPNI...)
- Rely on query parameters + multiaddrs
- Easy to adopt
- Visual compatibility
- Query parameters may be skipped if a client does not know them but can act without them
- Hints are optional
- Client MAY do fallback
- Potentially support encoding content type
- Self-certified URL

Problematic with RASL

- Origin tied
- Useful for a full node like Kubo, not only HTTP
- Bound to origin and needs extra hop/redirect
- As minimally interactive as possible

Problems with Magnet

- No origin information
- Limited interoperability

## Key call ideas

1. Easy Adoption: The URI format needs to be easy for existing systems like IPFS, Kubo, and Helia to adopt with minimal changes. For instance, they should be able to resolve content from these URIs without significant modifications to their resolution process. This is critical for adoption.
2. Programmability & Compatibility:
   - The URI format should be easily translatable with minimal effort for integration.
   - It should visually align with Gateway URLs (e.g., `https://<gateway>/ipfs/<CID>`), and maintain compatibility with `ipfs://CID` URIs.
   - It should be ergonomic for CLI usage, meaning users should be able to easily share and edit links without requiring excessive URL encoding (except for standard URL encoding like quotes).
3. Protocol and Transport Flexibility:
   - The design should not be limited to HTTP or gateway systems but should extend to any transport protocol (including libp2p, TCP, etc.), and even support different content encodings.
   - A URI should allow for detailed location hints that give as much information as possible about how to fetch the content, but without forcing translation steps or dependencies on intermediaries.
4. Publisher Flexibility:
   - Publishers should have the freedom to encode as much information as they wish in the URI, whether it’s HTTP, libp2p, IPNI, or others.
   - This format should allow links to remain functional even if certain services or hints become unavailable.
5. Self-certification and Content Interpretation:
   - Optionally, support for encoding content type and other self-certification features is important. This means the client could interpret the content type directly after verifying the CID.
6. Fallback Mechanisms:
   - Clients must be able to discover content even when certain hints are unavailable or throttled.
   - The design should enable fallback strategies that allow clients to proceed with discovery and resolution using alternative methods (e.g., DHT, IPNI, etc.), without the need for full reliance on the original transport.
7. Future-Proofing:
   - The design should allow for the easy addition of new transport protocols or hint types. It should also allow clients to gracefully handle missing or deprecated hints.
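
The direction from the call (query parameters plus multiaddrs on top of `ipfs://CID`, optional hints, client-side fallback) could look roughly like the sketch below. Everything named here is an assumption for illustration only: the `provider` parameter name, the helper names `parse_hinted_uri` and `fetch_with_hints`, and the `try_hint` / `discover` callables that stand in for a real transport and a real discovery system (DHT, IPNI). A `try_hint` implementation is assumed to return the content only after verifying it against the CID, and `None` otherwise.

```python
from __future__ import annotations

from typing import Callable, Optional
from urllib.parse import parse_qs, urlsplit


def parse_hinted_uri(uri: str) -> tuple[str, list[str]]:
    """Split 'ipfs://<cid>?provider=<hint>&provider=<hint>' into (cid, hints).

    Unknown query parameters are ignored, so older clients and newer
    links stay compatible. 'provider' is a hypothetical parameter name.
    """
    parts = urlsplit(uri)
    if parts.scheme != "ipfs" or not parts.netloc:
        raise ValueError(f"not an ipfs:// URI: {uri!r}")
    # Use netloc rather than hostname: hostname lowercases its value,
    # which would corrupt case-sensitive CID encodings.
    hints = parse_qs(parts.query).get("provider", [])
    return parts.netloc, hints


def fetch_with_hints(
    uri: str,
    try_hint: Callable[[str, str], Optional[bytes]],
    discover: Callable[[str], bytes],
) -> bytes:
    """Try each hint in order; fall back to discovery if none succeeds."""
    cid, hints = parse_hinted_uri(uri)
    for hint in hints:
        content = try_hint(cid, hint)  # assumed to verify against the CID
        if content is not None:
            return content
    # Hints are optional and may be stale: the client MAY fall back.
    return discover(cid)
```

Keeping the transport and discovery steps as injected callables keeps the sketch independent of HTTP, so the same flow would apply to a full node like Kubo that dials multiaddrs directly.
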
