
Arweave Protocol URI Scheme

This document describes the ar:// URI scheme and how ar:// URIs are combined with a user-configurable gateway host and translated to http:// URIs. The browser then uses these to initiate an HTTP request to retrieve data from the Arweave network.

Content Requests

From inception, the impetus for creating a URI scheme for Arweave data has been increasing decentralization. Decoupling the content identifier, the transaction id (txid), from the location it is served from, the gateway host, allows users to configure their preferred gateway for retrieving content. This eliminates the dependency on any one gateway to bridge between the browser and the Arweave network.

To achieve this, the ar:// URI scheme takes an ar://{txid} URI and maps it to its equivalent location at a configurable Arweave gateway host, producing an HTTP URI of the form https://{gateway host}/{txid}.
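The mapping can be sketched in a few lines of Python. This is a minimal illustration and not the ArConnect implementation: the function name `ar_to_gateway_url` and the `gateway.example` host are assumptions made for the example, and the check relies on Arweave transaction ids being 43-character base64url strings.

```python
import re

# Arweave transaction ids are 32-byte values encoded as base64url,
# which yields 43 characters from the set A-Z a-z 0-9 - _.
TXID_PATTERN = re.compile(r"[A-Za-z0-9_-]{43}")


def ar_to_gateway_url(ar_uri: str, gateway_host: str = "arweave.net") -> str:
    """Map ar://{txid} to https://{gateway host}/{txid} (hypothetical helper)."""
    prefix = "ar://"
    if not ar_uri.startswith(prefix):
        raise ValueError(f"not an ar:// URI: {ar_uri!r}")
    txid = ar_uri[len(prefix):]
    if not TXID_PATTERN.fullmatch(txid):
        raise ValueError(f"not a valid transaction id: {txid!r}")
    return f"https://{gateway_host}/{txid}"


uri = "ar://o5PwDrHU4dVcNVm8OB2RLT8shCDhlyVrjWdGMSV01vo"
print(ar_to_gateway_url(uri))                     # default gateway
print(ar_to_gateway_url(uri, "gateway.example"))  # user-configured gateway
```

Only the gateway host changes between the two calls; the content identifier stays the same, which is the decoupling described above.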

Transaction Id Lookup

The most direct way to retrieve data stored on the Arweave network is to use the transaction id (txid) of the transaction that posted the data. The following is an example of how that flow works today (in Chrome via the ArConnect plugin).

User Input

First, the user enters the following into the browser's address bar.

ar://o5PwDrHU4dVcNVm8OB2RLT8shCDhlyVrjWdGMSV01vo

Browser Redirect

The ar:// formatted URI causes the browser to redirect to the equivalent location at the gateway (configurable in settings). https://arweave.net is the most commonly used gateway today.

Gateway Redirect

Upon receiving an HTTP request for data at a specific transaction id, the gateway responds with an HTTP redirect that serves the requested content under a deterministically generated subdomain. This may seem like an unnecessary step, but its justification will become clear in the next use case.

Permaweb Applications

A compelling use case for a network that stores data permanently is the possibility of deploying permanent, serverless, single page applications (SPAs).

These applications often rely on a more traditional (similar to http://) URI scheme.

The above is an example URI; it is not intended to resolve to any SPA on Arweave.

Permaweb Requests

By Transaction Id

The browser redirect and gateway redirect work just as they did in the previous example, except this time the data being served is a static website or SPA.

This is where the unique subdomain from the gateway redirect comes into play. If every SPA loaded through the gateway were served at arweave.net with no subdomain, they would all share a single security sandbox. When a user retrieved one SPA that stored personal data in local storage, another SPA retrieved from the same gateway would be able to read that same local storage data. This could lead to significant breaches in privacy and security. By creating a unique subdomain for each request, the gateway can ensure a unique security sandbox for each asset stored on Arweave.
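The sandbox argument comes down to how browsers define an origin: the scheme, host, and port of a URL, with the path playing no part. The sketch below compares origins for path-based and subdomain-based gateway URLs. The `origin` helper and the `sandbox-a` / `sandbox-b` subdomain labels are placeholders invented for illustration, not values a gateway actually generates.

```python
from urllib.parse import urlsplit


def origin(url: str) -> tuple:
    """Return the (scheme, host, port) triple browsers treat as the origin."""
    parts = urlsplit(url)
    default_port = {"http": 80, "https": 443}.get(parts.scheme)
    return (parts.scheme, parts.hostname, parts.port or default_port)


app_a = "oLagmHm1-aUcMwYEAdprszz4WY9OMiNhkyYeFS1cA00"
app_b = "iiO2AWChgG9yGWzw23IgxLetOOKr3eJiMfXPZrf6zzY"

# Served by path only: both apps share one origin, and with it local storage.
shared = origin(f"https://arweave.net/{app_a}") == origin(f"https://arweave.net/{app_b}")
print(shared)  # True

# Served under distinct subdomains (placeholder labels): separate origins.
isolated = origin(f"https://sandbox-a.arweave.net/{app_a}") != origin(
    f"https://sandbox-b.arweave.net/{app_b}"
)
print(isolated)  # True
```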

By Transaction Id + Path + Query

In this part of the example there are two ways a user might visit an article on the site. The first is by clicking on the article summary on the home page. The second is by following a direct link given as an ar:// formatted URI.

ar://oLagmHm1-aUcMwYEAdprszz4WY9OMiNhkyYeFS1cA00/post?txid=kk_T_k8VAk6b_mAN7sWq1lBLSNaYxcyL7phtt4l1Nyo

| URI Element | Element Value |
| --- | --- |
| Scheme name | ar:// |
| Transaction Id | oLagmHm1-aUcMwYEAdprszz4WY9OMiNhkyYeFS1cA00 |
| Path part | /post |
| Query part | ?txid=kk_T_k8VAk6b_mAN7sWq1lBLSNaYxcyL7phtt4l1Nyo |

Browser Redirect

The browser redirect that occurs when provisioning this URI is similar to the basic content requests case. What changes is that the Path part and Query part are preserved.

| URI Element | Element Value |
| --- | --- |
| Scheme name | https:// |
| Host name | arweave.net |
| Path part | /oLagmHm1-aUcMwYEAdprszz4WY9OMiNhkyYeFS1cA00/post |
| Query part | ?txid=kk_T_k8VAk6b_mAN7sWq1lBLSNaYxcyL7phtt4l1Nyo |
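To make the two element tables concrete, the following sketch splits the example ar:// URI into its elements and derives the elements of the redirected https:// URI. The names `ArUriElements`, `split_ar_uri`, and `redirect_elements` are hypothetical, and the sketch deliberately ignores fragments, which this example does not use.

```python
from typing import NamedTuple


class ArUriElements(NamedTuple):
    scheme: str  # always "ar://"
    txid: str    # transaction id
    path: str    # path part with its leading "/", or "" if absent
    query: str   # query part with its leading "?", or "" if absent


def split_ar_uri(ar_uri: str) -> ArUriElements:
    """Split an ar:// URI into scheme, transaction id, path and query."""
    scheme = "ar://"
    if not ar_uri.startswith(scheme):
        raise ValueError(f"not an ar:// URI: {ar_uri!r}")
    rest = ar_uri[len(scheme):]
    # Everything from the first "?" onward is the query part.
    before_query, question_mark, query = rest.partition("?")
    # Everything from the first "/" onward is the path part.
    txid, slash, path = before_query.partition("/")
    return ArUriElements(scheme, txid, slash + path, question_mark + query)


def redirect_elements(ar_uri: str, gateway_host: str = "arweave.net") -> dict:
    """Derive the elements of the https:// URI the browser redirects to."""
    parts = split_ar_uri(ar_uri)
    return {
        "Scheme name": "https://",
        "Host name": gateway_host,
        # The txid moves into the path; the original path is appended to it.
        "Path part": f"/{parts.txid}{parts.path}",
        "Query part": parts.query,
    }


example = (
    "ar://oLagmHm1-aUcMwYEAdprszz4WY9OMiNhkyYeFS1cA00/post"
    "?txid=kk_T_k8VAk6b_mAN7sWq1lBLSNaYxcyL7phtt4l1Nyo"
)
for element, value in redirect_elements(example).items():
    print(f"{element}: {value}")
```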

Emerging Standards

Arweave Name System

Arweave Name System (ArNS) is an emerging standard on Arweave that maps human readable names to transaction ids, in much the same way DNS maps human readable names to IP addresses on the Web.

ArNS names are globally unique and allow the gateway to serve transaction data without requiring the transaction id to be part of the URI. In order for ArNS names to interoperate with DNS names, ArNS names replace the deterministically generated subdomain with a well known human readable name. Because this ArNS name is linked to a specific transaction id (txid) in the ArNS SmartWeave contract state, it also eliminates the need to include the txid in the URI.

The Arweave GraphQL guide is stored at the following txid.

iiO2AWChgG9yGWzw23IgxLetOOKr3eJiMfXPZrf6zzY

To retrieve it via the arweave.net gateway, we would navigate to the following gateway location.

By registering an ArNS name with the ArNS SmartContract we can simplify the URL. In this case the ArNS name gql-guide was mapped to the transaction id iiO2AWChgG9yGWzw23IgxLetOOKr3eJiMfXPZrf6zzY, resulting in the following DNS name that points to the same transaction data.

Note that the arweave.dev gateway is used for ArNS name lookups rather than the arweave.net gateway. This is because ArNS names are in their pilot phase. More about the proposed Arweave Name System can be read at https://arns.arweave.dev/.

ArNS names improve the accessibility of content on Arweave, but they could improve it even further by being incorporated into the ar:// URI scheme, as follows.

ar://gql-guide

ArNS Name Requests

In these examples, when the browser detects the presence of an ArNS name instead of a txid in the ar:// scheme, it redirects to the arweave.dev gateway for processing.

| URI | Action | Resulting URI |
| --- | --- | --- |
| ar://gql-guide | redirects to | https://gql-guide.arweave.dev |

This is necessary to support integration with the existing DNS system. Because the gql-guide subdomain is guaranteed to be globally unique, it maintains a unique security sandbox at this location.

An important implementation detail here is that if the user provides a path part or fragment along with the ArNS name in the ar:// URI, it should be preserved when redirecting to the arweave.dev gateway.

| URI | Action | Resulting URI |
| --- | --- | --- |
| ar://gql-guide/#introduction | redirects to | https://gql-guide.arweave.dev/#introduction |

This enables deep linking support for SPAs retrieved by their ArNS names.
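The ArNS redirect can be sketched as follows. How the browser tells an ArNS name from a txid is not specified here, so the sketch assumes a simple heuristic: anything that is not a 43-character base64url string is treated as a name. The function names `looks_like_txid` and `arns_redirect` are likewise hypothetical.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Assumed heuristic: a txid is exactly 43 base64url characters.
TXID_PATTERN = re.compile(r"[A-Za-z0-9_-]{43}")


def looks_like_txid(label: str) -> bool:
    return TXID_PATTERN.fullmatch(label) is not None


def arns_redirect(ar_uri: str, arns_gateway: str = "arweave.dev") -> str:
    """Redirect ar://{name}... to https://{name}.{gateway}... (sketch)."""
    parts = urlsplit(ar_uri)
    if parts.scheme != "ar" or not parts.netloc:
        raise ValueError(f"not an ar:// URI: {ar_uri!r}")
    name = parts.netloc
    if looks_like_txid(name):
        raise ValueError("transaction ids use the txid flow, not ArNS")
    # The name becomes a subdomain; path, query and fragment are preserved.
    host = f"{name}.{arns_gateway}"
    return urlunsplit(("https", host, parts.path, parts.query, parts.fragment))


print(arns_redirect("ar://gql-guide"))
print(arns_redirect("ar://gql-guide/#introduction"))
```

A name that happens to be 43 characters long would be misclassified by this heuristic, so a real implementation would need a more reliable way to separate the two cases.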

Future Directions

An ideal implementation of the ar:// URI scheme would support content retrieval by transaction id or ArNS name without requiring a redirect to a gateway to integrate with DNS.

An example of a simple content request would resolve like the following.

An ArNS lookup would resolve seamlessly as well.

URI Scheme Layering

As it stands, browsers are designed to make HTTP requests which are explicitly described by the HTTP URI Scheme. The HTTP URI scheme includes hostnames which can be either an IP Address or a DNS name. Thus, both of these underlying protocols are implicitly part of the HTTP URI scheme and browser HTTP requests.

In order to support the ar:// URI scheme without rewriting the entire browser stack from TCP/IP up, ar:// URIs are provisioned by transforming them into https:// URIs, which can then be used to make an HTTP request.

In the above example the ar://<txid>/path/?query#fragment URI is mapped to an HTTP URI using a gateway hostname so the browser can make an HTTP request to retrieve the data. Which gateway hostname to use is exposed in the browser settings, giving the user a choice of which gateway to use.

Note that the path, query, and fragment components of the ar:// URI scheme are preserved and transferred to the https:// URI scheme.
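The layering can be illustrated with Python's standard URL functions, which treat the txid as occupying the position a hostname holds in an http:// URI. The `GatewaySettings` class is a hypothetical stand-in for the gateway choice stored in browser settings, and `ar_to_https` is an illustrative name, not an existing API.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit, urlunsplit


@dataclass
class GatewaySettings:
    """Hypothetical stand-in for the user's gateway choice in browser settings."""
    host: str = "arweave.net"


def ar_to_https(ar_uri: str, settings: GatewaySettings) -> str:
    """Transform ar://<txid>/path/?query#fragment into an https:// URI."""
    parts = urlsplit(ar_uri)
    if parts.scheme != "ar" or not parts.netloc:
        raise ValueError(f"not an ar:// URI: {ar_uri!r}")
    # The txid sits where a hostname would be; it moves to the front of the path.
    txid = parts.netloc
    return urlunsplit(
        ("https", settings.host, f"/{txid}{parts.path}", parts.query, parts.fragment)
    )


uri = "ar://iiO2AWChgG9yGWzw23IgxLetOOKr3eJiMfXPZrf6zzY/path/?query#fragment"
print(ar_to_https(uri, GatewaySettings()))
print(ar_to_https(uri, GatewaySettings(host="gateway.example")))
```

The fragment is never sent to the gateway in the HTTP request itself, but keeping it in the rewritten URI lets the loaded page act on it.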

ArNS Integration

Much like how DNS is supported as an alternative to IP Addresses in the http:// URI scheme, ArNS names should be supported as an alternative to transaction Ids in the ar:// URI scheme.

The difference here is that ArNS is still a pilot program and will evolve over the next year. As a result, the best path forward is to support ArNS name mapping by having a browser plugin read the state of the ArNS contract. It would then map the ArNS name to a txid, which could then be presented to the browser in the standard ar://<txid> scheme format (including path, query, and fragment parts). The browser would then do the usual ar:// -> https:// scheme mapping and make the HTTP request.
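A rough sketch of the plugin's name-to-txid step is shown below. The layout of the ArNS contract state is not described here, so the `ARNS_STATE` dictionary with a `records` mapping is purely an assumed shape, and `resolve_arns_name` is a hypothetical function; a real plugin would read the state from the contract rather than from a constant.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed, simplified snapshot of the ArNS contract state (hypothetical shape).
ARNS_STATE = {
    "records": {
        "gql-guide": "iiO2AWChgG9yGWzw23IgxLetOOKr3eJiMfXPZrf6zzY",
    }
}


def resolve_arns_name(ar_uri: str, contract_state: dict) -> str:
    """Rewrite ar://{name}... as ar://{txid}..., keeping the other parts."""
    parts = urlsplit(ar_uri)
    if parts.scheme != "ar" or not parts.netloc:
        raise ValueError(f"not an ar:// URI: {ar_uri!r}")
    txid = contract_state["records"].get(parts.netloc)
    if txid is None:
        raise LookupError(f"no ArNS record for {parts.netloc!r}")
    # Path, query and fragment pass through unchanged.
    return urlunsplit(("ar", txid, parts.path, parts.query, parts.fragment))


print(resolve_arns_name("ar://gql-guide/#introduction", ARNS_STATE))
```

The result is an ordinary ar://<txid> URI, so the browser's existing ar:// -> https:// mapping needs no knowledge of ArNS.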

The browser plugin could presumably use the same gateway host indicated in the browser settings to evaluate the state of the ArNS contract.

Ideal Integration

An ideal ar:// scheme integration would allow permaweb applications to encode ar:// URIs in their HTML as links to remote resources.

<a href="ar://iiO2AWChgG9yGWzw23IgxLetOOKr3eJiMfXPZrf6zzY">
  Arweave GraphQL Guide
</a>

The browser would combine these URIs with the user-configurable gateway information stored in the browser settings to make HTTP requests for the resources at the appropriate location.

As of this writing, developers are forced to hard code an https:// scheme gateway location for their resource paths.

<a href="https://arweave.net/iiO2AWChgG9yGWzw23IgxLetOOKr3eJiMfXPZrf6zzY">
  Arweave GraphQL Guide
</a>

This hard codes a centralized point of failure into the references which could be avoided with client configurable gateway settings and full support for ar:// within HTML documents.
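Until browsers support ar:// natively, the same effect can be approximated by rewriting links at load time. The sketch below does this with a regular expression over an HTML string; `rewrite_ar_links` is a hypothetical helper, it only covers quoted `href` and `src` attributes, and it assumes the ar:// URIs hold transaction ids rather than ArNS names.

```python
import re

# Matches href="ar://..." or src='ar://...' with either quote style.
AR_ATTRIBUTE = re.compile(
    r"""\b(?P<attr>href|src)=(?P<quote>["'])ar://(?P<rest>[^"']*)(?P=quote)"""
)


def rewrite_ar_links(html: str, gateway_host: str = "arweave.net") -> str:
    """Replace ar:// references with https:// URIs at the chosen gateway."""

    def replace(match: re.Match) -> str:
        attr, quote, rest = match["attr"], match["quote"], match["rest"]
        return f"{attr}={quote}https://{gateway_host}/{rest}{quote}"

    return AR_ATTRIBUTE.sub(replace, html)


page = '<a href="ar://iiO2AWChgG9yGWzw23IgxLetOOKr3eJiMfXPZrf6zzY">Arweave GraphQL Guide</a>'
print(rewrite_ar_links(page))
print(rewrite_ar_links(page, "gateway.example"))
```

Because the gateway host is a parameter and not part of the document, the stored HTML no longer carries a single gateway as a point of failure.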

Future Optimizations

One way to optimize the performance of page loading under the ar:// URI scheme would be to generate the deterministic subdomain on the client ahead of time, saving the gateway from having to redirect the client. As long as the gateway and client were using the same deterministic formula for generating the subdomain, the extra HTTP request for the subdomain redirect could be eliminated (improving performance of loading pages).
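As a sketch of client-side generation, the code below derives a subdomain label from a txid. The formula the gateway uses is not given here, so the one shown, lowercase unpadded base32 of the 32 bytes the txid decodes to, is an assumption chosen because it is deterministic and fits within a DNS label. The names `sandbox_label` and `sandboxed_url` are hypothetical.

```python
import base64


def sandbox_label(txid: str) -> str:
    """Derive a DNS-safe label from a txid (assumed formula, see lead-in)."""
    # base64url strings omit padding; restore it before decoding.
    padded = txid + "=" * (-len(txid) % 4)
    raw = base64.urlsafe_b64decode(padded)
    # Base32 uses only letters and digits, which are valid in a DNS label.
    return base64.b32encode(raw).decode("ascii").rstrip("=").lower()


def sandboxed_url(txid: str, gateway_host: str = "arweave.net") -> str:
    """Build the subdomain URL directly, skipping the gateway redirect."""
    return f"https://{sandbox_label(txid)}.{gateway_host}/{txid}"


txid = "o5PwDrHU4dVcNVm8OB2RLT8shCDhlyVrjWdGMSV01vo"
print(len(sandbox_label(txid)))  # 52, within the 63-character DNS label limit
print(sandboxed_url(txid))
```

This only saves the redirect if the client and gateway agree on the formula; a mismatch would land the client on a subdomain the gateway does not expect.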

Ultimately though, it would be better if the subdomain redirect was not necessary and there was some other way to ensure each asset served from the permaweb had its own security sandbox in the browser.