Trust in Apps, Docs, and Messages

As Mark Nottingham wrote in RFC 8890: The Internet is for End Users:

[O]ne of the most successful Internet applications is the Web, which uses the HTTP application protocol. One of HTTP's key implementation roles is that of the web browser – called the "user agent" in [RFC7230] and other specifications.

User agents act as intermediaries between a service and the end user; rather than downloading an executable program from a service that has arbitrary access into the users' system, the user agent only allows limited access to display content and run code in a sandboxed environment. End users are diverse and the ability of a few user agents to represent individual interests properly is imperfect, but this arrangement is an improvement over the alternative – the need to trust a website completely with all information on your system to browse it.

All The Fails

It is a beautiful vision, but unfortunately it is more aspirational than real. It fails in different ways:

For the longest time, privacy was not (properly) considered to be part of the threat model, and violating privacy was not seen as a violation of trust that a user agent should protect against. This is wrong: the risks from privacy violations are real, privacy is deeply connected to trust, and UAs do need to protect privacy.
In fact, until ~2010 a whole class of security attacks was not considered to be in scope: cookie hijacking attacks. Major websites that exposed potentially sensitive services (e.g. Amazon, Facebook) ran over unencrypted HTTP, such that on a shared network it was relatively trivial to intercept cookies and use those sites as someone else. This vulnerability was made famous by the Firesheep extension for Firefox, which eventually led to projects such as HTTPS Everywhere and Let's Encrypt that set out to fix the problem.
It is difficult to detect social attacks at the user agent level, but it is enough of a problem at scale that it needs to be addressed. Google's Safe Browsing system was designed with that in mind: it has a registry of dangerous sites that it can match against hashed URLs. As designed, it has completely private governance and centralised operations (though that is not required). Even if replicated with better properties, this system will never be perfect and requires acknowledging that trust cannot be perfect along some dimensions.
These issues are not limited to the web: they are architectural issues, and other systems have made different trade-offs. Email typically prevents JavaScript and a lot of the more powerful HTML capabilities, which severely limits use cases. Apps are increasingly controlled via app stores that have some form of internal review, which in turn comes with its own problems (notably, again, that the threat model is biased).
The trust requirement makes it hard to expose powerful APIs (that could cause harm) even though they could address a lot of valuable use cases. In some cases, heuristics (e.g. must use HTTPS) or more or less intentful UIs (e.g. file upload dialogs) can help, but the only known general approach is consent dialogs, which are known to have issues:
- For privacy, consent is known to be a poor way to support people's privacy (shameless plug: I gave a keynote on this topic called Consent of the Governed).
- In the context of granting powers to Web content, this has been attempted before and the browser UI has generally been thought to provide a poor foundation for this:
  - The December 2022 W3C Workshop on Permissions is meant to solve this.
  - The September 2018 W3C Workshop on Permissions and User Consent was meant to solve this.
  - The September 2014 W3C Next steps on trust and permissions for Web applications was meant to solve this.
  - In 2009, the Device APIs and Permissions WG was supposed to solve that issue.
  - The Widget Access Request Policy (WARP) was supposed to create policies for packaged web apps, as part of Packaged Web Apps (Widgets).
  - The OMTP BONDI project tried to produce the same (now so defunct it has nothing on the Web).
  - I could cite more; this is just off the top of my head and limited to things that I've been a part of. I'm sure that Fabrice could cite twice as many.
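
To make the "registry matched against hashed URLs" idea from the Safe Browsing point above more concrete, here is a minimal sketch of prefix-based hash matching. It is an illustration under stated assumptions, not Google's implementation: the function names, the 4-byte prefix length, the sample registry, and the naive lowercase normalisation are all hypothetical, and a real system canonicalises URLs much more carefully.

```python
import hashlib

PREFIX_LEN = 4  # bytes; assumed prefix length for this sketch


def url_hash(url: str) -> bytes:
    """SHA-256 of a naively normalised URL (an assumption, not real canonicalisation)."""
    return hashlib.sha256(url.strip().lower().encode("utf-8")).digest()


def build_prefix_set(dangerous_urls):
    """The compact list a user agent could keep locally: hash prefixes only."""
    return {url_hash(u)[:PREFIX_LEN] for u in dangerous_urls}


def is_flagged(url, local_prefixes, full_hash_lookup):
    """Check locally first; only ask the registry when a prefix matches."""
    h = url_hash(url)
    prefix = h[:PREFIX_LEN]
    if prefix not in local_prefixes:
        return False  # no local match: treated as not listed
    # Prefix hit: confirm against the full hashes sharing that prefix.
    return h in full_hash_lookup(prefix)


# Hypothetical registry, standing in for the centralised service.
REGISTRY = ["evil.example/phish", "malware.example/download"]
FULL_HASHES = {url_hash(u) for u in REGISTRY}
LOCAL_PREFIXES = build_prefix_set(REGISTRY)


def registry_lookup(prefix):
    """Return every full hash in the registry that starts with the prefix."""
    return {h for h in FULL_HASHES if h[:PREFIX_LEN] == prefix}
```

In this sketch the registry only learns a short hash prefix, and only when there is a local hit. That is one of the properties a replicated system with better governance would presumably want to keep.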

Usage Contexts

Part of the issue that browsers are facing is that they have a one-size-fits-all approach to the problem. It turns out that tabs are terrible containers for apps: they get lost among the document tabs, and they don't show up in app affordances such as the equivalents of the Mac's Cmd-Tab and Dock. There is a real sense in which The Web Rocks, Browsers Suck.

Without deciding in advance what good solutions would be, it is important to consider that distinct contexts could be treated differently.

Safe Documents. Safe documents are documents that you can Just Load™ and not worry about it (modulo potential issues with social engineering or malware vectors). Note that such documents typically cannot access the network. A good example of this class is PDF (it would also provide a great implementation requirement for ads). As currently implemented, the Web does not support this use case since it has no bundling and no network access prevention.
Messaging. Another context is that of messaging payloads, such as (be brave, don't cringe) HTML email. We could benefit from having a single messaging format for multiple messaging platforms, covering email but also social media posting (for instance). The payload and constraints would likely be very similar (possibly identical) to those used for safe docs, but it may be necessary to make specific allowances for <portal>-like use and for embeddability.
Browsing. This is the typical document Web. If implemented with privacy in mind, that means it should only be able to communicate back with its source origin plus privacy-preserving sources.
Apps. Installable (for some definition thereof) apps that have access to more powerful capabilities. This includes the ability to have apps in apps (with embedded apps having only a subset of the parent's permissions).
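
The "apps in apps" constraint above (an embedded app holds only a subset of its parent's permissions) can be sketched as a simple attenuation rule. This is a minimal illustration, not a proposed API: `AppContext`, `embed`, and the permission names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppContext:
    """Hypothetical app container holding a fixed set of granted permissions."""
    name: str
    permissions: frozenset

    def embed(self, name, requested):
        """Create an embedded app context.

        The child gets the intersection of what it requests and what the
        parent holds, so it can never exceed the parent's permissions.
        """
        granted = frozenset(requested) & self.permissions
        return AppContext(name, granted)

    def can(self, permission):
        return permission in self.permissions


host = AppContext("host", frozenset({"camera", "files", "geolocation"}))
# The widget asks for "network", which the host does not hold, so it is dropped.
widget = host.embed("widget", {"camera", "network"})
# Nesting attenuates further: the inner app cannot regain "files".
inner = widget.embed("inner", {"camera", "files"})
```

Because permissions can only shrink at each level of embedding, the host never has to reason about an embedded app holding a power that the host itself was not granted.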

Questions

It is clear from the privacy and Firesheep cases that threat models evolve over time. Who gets to decide what's in them?
Is delegated trust ever good, and if it is, should it be tied to app stores? If delegated trust is desirable, how can we make it not just functional but sustainable at scale?
How far would a format that only addressed safe docs and messaging take us? Would it provide a useful foundation for our goals and a good first step towards the more advanced use cases, notably apps?
What kind of powerful capability can we add if we have apps that cannot touch the network at all? Are there other trade-offs like this one that can be made, e.g. by deploying OCap or hardened JS?
Would a stake of some sort be useful here? You could use powerful APIs if you escrow enough value to match. If someone claims that an app has broken their security (willingly or negligently), a tribunal arbitrates the question and may use those funds to support the people who have been affected.
Is there potential value in using OCap systems for powerful capabilities? Could UCAN be used?

What Do We Want?

We want to develop more nuanced language to think about "trust", notably in the app space and more generally in the space of powerful capabilities.

What does a strategy for us look like here?
What use cases do we have and what usage contexts do we care about?
Are bundled apps on IPFS interesting?
Should we replace PDF, taking over e.g. arXiv?

