HTML+ and the Verified Internet

Early draft / Tom Walton-Pocock / Ben Levy 2023-09-12

Materials

HTML Standard Reference

Goal of this Document

We seek to advance a full integration of the verified internet ("Web3") into the standard internet protocols.

Over a decade into the story of the 'verified internet', there remains an unnatural division between so-called Web2 and Web3 applications. Most Web3 applications consist largely of Web2 software that interfaces with a narrow protocol via messaging. That protocol typically handles the logical transmission of value across the consensus network, via small programmes known as smart contracts.

In practice, this means that in-page data which purports to be verified is merely trusted: there is no mechanism by which the user can validate the correctness of the core data they are seeing. It also means that a Web2 engineer cannot introduce value transmission or verified data into their webpages without fiddling with SDKs and making a dizzying array of technical decisions that have real financial ramifications for users.

This brings us to the motivating feature of this document: to effect a deep integration of Web3 architectures into the fabric of Web2's HTML protocol, and to enable the browser to take on a bolder role with the user of this new internet: to read, write, and even (in the age of the LLM) interpret and curate what the user sees. Useful, durable protocols should act vaguely imperialistic, gradually subsuming richer functionality into their common interface.

This is a living document to commence the layering of these thin, verified, stateful protocols into the Web's communications protocols, bringing the ephemeral machine and a common memory of digital record into the core substrate of the web for the first time.

Concretely, we add to HTML a small cluster of elements which, in the tradition of webpages, should be invisible to classical browsers but visible to "Web3 ready" browsers.

These elements convey abstract directions, or intentions, that browsers take on the (often fiduciary) responsibility of executing faithfully and competently, freeing developers to focus on the problems that matter to them.

We call this protocol HTML+.

State of HTML

HTML will of course need to evolve to realize this vision, much as it has already evolved to support each consecutive evolution in Internet content.

HTML 3.2 and HTML 4 brought tables, styles and scripts as the Internet took a more aesthetic bent, and HTML5 introduced audio, video, and canvas as richer forms of content gained importance (enabled, importantly, by better infrastructure).

Payments were originally intended to be enshrined in HTTP itself (the 402 Payment Required status code remains reserved for future use), but at this point we believe that HTML is a better, more pragmatic option for integrating stateful features into the Internet.

Extension to Web3 Networks

We extend this to web3 functionality.

The overriding design goal here is to abstract over all the abstruse web3 infrastructure details: no worrying about which DEX to use, how much gas to pay, etc. The browser enables this by assuming a far weightier role, a shift we already anticipate as the rise of LLMs heralds the personalized Internet (elaborated on at the end of this document).

Another design goal is to retain a great degree of minimalism and abstraction within the protocol, allowing new technological developments to be adopted swiftly without protocol changes, thus avoiding the protocol failure mode identified by Moxie Marlinspike of Signal.

Principles

Partially inherited from: https://www.w3.org/TR/html-design-principles/

Compatibility: The semantics are a strict superset of HTML, and therefore will interrupt neither the rendering nor the behaviour of any webpage displayed in an incompatible browser.
Agnostic: It should be possible to fully specify any piece of state on any chain; however, the base protocol should remain consensus-agnostic and network-independent in its specification.
Utilitarian: Make minimal protocol upgrades that strictly advance utility and radically improve ease of access to consensus-based internet state.
Interpretative: The veracity of read and write access to consensus-based internet state will be assumed to be the responsibility of the browser. This means the protocol limits application liability by design, and pushes interpretation and intention onto the browser. The application, standing further away from the user, has only the responsibility to describe reads and "intentions" in sufficient detail for the browser to interpret.
Universal Access: Features should be designed for universal access: to work across as many platforms as possible, to support all world languages and scripts, and to be accessible to all users, including those with disabilities.

Syntax

HTML+ Blocks

An HTML+ block may contain transaction and/or state references, with the appropriate metadata to allow them to be verified either by the browser's embedded light client or via storage proofs delegated to a coprocessing service. Since users trust the browser to perform this verification for them, transport-layer security (TLS) is sufficient; signed HTML+ blocks, which present additional challenges, are not required.
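To make the idea of checking a state reference against a proof concrete, here is a minimal sketch of a Merkle inclusion proof. It is a deliberate simplification: it assumes a plain binary tree over SHA-256, whereas real chains use their own trie structures and hash functions, and all function names here are hypothetical rather than part of HTML+.

```python
import hashlib
from typing import List, Sequence, Tuple

def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def _next_level(level: List[bytes]) -> List[bytes]:
    # Hash adjacent pairs; the caller guarantees an even number of nodes.
    return [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]

def merkle_root(leaves: Sequence[bytes]) -> bytes:
    """Root of a binary Merkle tree over a non-empty list of leaves.
    An odd node at any level is paired with a copy of itself."""
    level = [_h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = _next_level(level)
    return level[0]

def merkle_proof(leaves: Sequence[bytes], index: int) -> List[Tuple[bytes, bool]]:
    """Sibling hashes from leaf to root, each tagged with
    whether the sibling sits to the left of the running node."""
    level = [_h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return proof

def verify(leaf: bytes, proof: Sequence[Tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the root from the leaf and its siblings, then compare
    it with the root the verifier already trusts."""
    node = _h(leaf)
    for sibling, sibling_is_left in proof:
        node = _h(sibling + node) if sibling_is_left else _h(node + sibling)
    return node == root
```

The verifier only needs the trusted root (for instance from a light client's view of a block header), the claimed value, and a logarithmic number of sibling hashes; it never needs the full state.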

All HTML+ elements (with the exception of <w></w>, introduced below) are void elements: we take the slightly ugly design decision of storing all content within attributes, in order to prevent browsers that don't support HTML+ from rendering the content as plaintext. This way, incompatible browsers simply ignore the HTML+ blocks.
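The effect of keeping content in attributes can be sketched with Python's standard html.parser, used here as a rough stand-in for a browser that has no knowledge of HTML+. The class and helper names are hypothetical, and a real browser's rendering pipeline is of course far more involved.

```python
from html.parser import HTMLParser

class VisibleText(HTMLParser):
    """Collects character data only, roughly the text a
    legacy browser would paint for unknown elements."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())

def visible_text(markup: str) -> str:
    parser = VisibleText()
    parser.feed(markup)
    parser.close()
    return " ".join(parser.chunks)

# Content held in attributes never reaches the page text...
attribute_style = '<p>Tip the author</p><w><SEND to="my_wallet.eth" text="Send tips here!"/></w>'
# ...whereas content held between tags would leak into it.
body_style = '<p>Tip the author</p><w><SEND to="my_wallet.eth">Send tips here!</SEND></w>'

print(visible_text(attribute_style))  # Tip the author
print(visible_text(body_style))       # Tip the author Send tips here!
```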

Here is a shortlist of initial HTML+ elements (with default attributes populated and required attributes capitalized):

<WRITE CALLDATA="" TO="" text=""/>
<READ ACCOUNT="" CALLDATA="" block_num="latest"/>
<SWAP from="" to=""/>
<SEND from="" to="" asset="USDC" amount=""/>
<STREAM from="" to="" duration="" asset="USDC" amount=""/>
<PRICE ASSET="" quote="USD" venue=""/>

Some notes:

SWAP supports both fiat and crypto pairs.
The VENUE attribute of PRICE should normally be omitted, allowing browsers to evaluate and adopt robust price oracles that combine on-chain and off-chain data; it exists for developers who want to specify a particular source of truth for pricing.
A browser may offer a premium subscription that provides a storage proof with each READ.
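As an illustration of the VENUE note above, one way a browser might combine several price sources is a median with outlier rejection. This is only a sketch under stated assumptions: the function name, the max_spread parameter, and the source names in the usage note are hypothetical, and the document does not prescribe any particular aggregation method.

```python
from statistics import median
from typing import Mapping, Optional

def robust_price(quotes: Mapping[str, float],
                 venue: Optional[str] = None,
                 max_spread: float = 0.05) -> float:
    """Return a single price from several sources.

    If the page pins a venue, that source is used as-is. Otherwise
    sources further than max_spread (as a fraction) from the median
    are discarded and the median of the remainder is returned.
    """
    if venue is not None:
        return quotes[venue]
    if not quotes:
        raise ValueError("no price sources available")
    mid = median(quotes.values())
    kept = [p for p in quotes.values() if abs(p - mid) <= max_spread * mid]
    if not kept:
        raise ValueError("price sources disagree too widely")
    return median(kept)
```

With quotes of 100.0, 101.0 and 250.0 from three hypothetical sources, the outlier is dropped and the function returns 100.5; pinning the venue to the third source returns 250.0 unchanged.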

These elements should be wrapped in <w></w> tags (though this is optional), forming a single HTML+ block.

<w chain_id="1" block="latest">
[body]
</w>

The <w> tag may include attributes like network/chain ID; however, we encourage developers to avoid that, leaving gnarly details like the network unspecified altogether, allowing browsers to abstract over such technical details.
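The relationship between a <w> wrapper and the elements inside it could be handled along these lines: the wrapper's attributes become context for its children, and an absent wrapper or attribute simply leaves the choice to the browser. This is a hedged sketch using Python's standard html.parser; the class name, helper name and output shape are hypothetical.

```python
from html.parser import HTMLParser

# Element names taken from the shortlist above; html.parser lowercases
# tag and attribute names, so matching is case-insensitive.
HTML_PLUS_TAGS = {"write", "read", "swap", "send", "stream", "price"}

class IntentExtractor(HTMLParser):
    """Collects HTML+ elements together with the attributes of the
    enclosing <w> block, if there is one."""

    def __init__(self):
        super().__init__()
        self.context = {}   # attributes of the current <w>, empty outside one
        self.intents = []

    def handle_starttag(self, tag, attrs):
        if tag == "w":
            self.context = dict(attrs)
        elif tag in HTML_PLUS_TAGS:
            self.intents.append({
                "element": tag,
                "context": dict(self.context),
                "attrs": dict(attrs),
            })

    def handle_endtag(self, tag):
        if tag == "w":
            self.context = {}

def extract_intents(markup: str) -> list:
    parser = IntentExtractor()
    parser.feed(markup)
    parser.close()
    return parser.intents
```

An element found outside any <w>, or inside one that names no network, ends up with an empty context, which is exactly the case where the browser is expected to pick the details itself.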

Here is a sample standard HTML document:

<HTML>

<HEAD>
    <meta http-equiv="content-type" content="text/html;charset=utf-8">
    <TITLE>Lord Byron</TITLE>
</HEAD>

<BODY>
    <H1>The Destruction of Sennacherib</H1>
    The Assyrian came down like the wolf on the fold,
    <A HREF="https://www.poetryfoundation.org/poems/43827/the-destruction-of-sennacherib"> read more</A>.
</BODY>

</HTML>

We might add an HTML+ element to this:

<HTML>

<HEAD>
    <meta http-equiv="content-type" content="text/html;charset=utf-8">
    <TITLE>Lord Byron</TITLE>
</HEAD>

<BODY>
    <H1>The Destruction of Sennacherib</H1>
    The Assyrian came down like the wolf on the fold,
    <A HREF="https://www.poetryfoundation.org/poems/43827/the-destruction-of-sennacherib"> read more</A>.
    <w>
        <SEND TO="my_wallet.eth" TEXT="Send tips here!"/>
    </w>
</BODY>

</HTML>
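The SEND element in this sample gives only a recipient and a label. A browser would presumably fill in the defaults listed in the shortlist (such as asset="USDC") and reject elements that omit a required attribute. The sketch below assumes that reading of the shortlist; the table layout, the function name, and the choice to treat empty attribute values as unset are all assumptions, not part of the protocol.

```python
# Required attributes (capitalized in the shortlist) and populated defaults.
REQUIRED = {
    "write": {"calldata", "to"},
    "read": {"account", "calldata"},
    "swap": set(),
    "send": set(),
    "stream": set(),
    "price": {"asset"},
}
DEFAULTS = {
    "read": {"block_num": "latest"},
    "send": {"asset": "USDC"},
    "stream": {"asset": "USDC"},
    "price": {"quote": "USD"},
}

def normalise(tag: str, attrs: dict) -> dict:
    """Validate one HTML+ element and fill in its defaults.

    Names are compared case-insensitively, as in HTML. Empty values
    are treated as unset (an assumption of this sketch).
    """
    tag = tag.lower()
    if tag not in REQUIRED:
        raise ValueError("unknown HTML+ element: " + tag)
    given = {name.lower(): value for name, value in attrs.items() if value}
    missing = REQUIRED[tag] - given.keys()
    if missing:
        raise ValueError(tag + " is missing " + ", ".join(sorted(missing)))
    # Explicit attributes override defaults.
    return {**DEFAULTS.get(tag, {}), **given}

# The SEND element from the sample page:
print(normalise("SEND", {"TO": "my_wallet.eth", "TEXT": "Send tips here!"}))
```

Everything the page leaves out (amount, network, and so on) remains absent from the result, so the browser and the user decide it at execution time.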

Scripting

We think that the majority of the value in HTML+ in the medium term will stem from very simple elements for actions like sending, swapping, and viewing prices. These are the core web3 primitives that likely solve the most real-world problems.

An open question that we'd love to see a conversation on is whether, in the longer term, a client-side verifiable scripting language would add useful functionality. The natural options for this include Solidity and WASM, neither of which we are quite yet able to performantly prove within a browser.

Product Limitations

Browsers will need to force users to keep their hot wallet balance fairly low
Allowing a plugin ecosystem will present severe security challenges

Schema

The days of internet protocols specifying rendering norms are behind us (the blue and purple links), but there may be value in browsers reserving regions of the palette for plaintext verified data / calls-to-action:

"for swaps, we recommend a green square box with x border radius"

Verified browsers, taking on a fiduciary role safeguarding user state and identity, will also have to run smart software to spot spoofing attempts designed to mislead the user and supply malicious transactions.

Future Advances

Global computation is embarking on a gradual transfer from classical to intuitive computation. Whilst the consequences of this will be felt over decades and not months, it is likely that LLMs will widen the spectrum of outputs that can be interpreted from an HTML page, and that browsers and other interface standards will increasingly use the HTML input for its data and otherwise treat it as 'guidance only'.

Seth Rosenberg described the natural progression from the AI-assisted TikTok feed to a feed populated by AI-generated videos as the latest stage in a well-established trend, wherein developers give increasingly abstract directions while intelligent agents handle the implementation details. Similarly, developers will likely describe their websites' functions more directly while the browser constructs personalized interfaces for users.

This has two ramifications:

The browser will take on an increasingly weighty role in making sense of the Internet for the user, and in HTML+ indeed even a fiduciary role.
HTML+ should attempt to allow developers to describe abstract functions, or intentions, and let the browsers sort out the details.