---
tags: thoughts
---
# Exploring Universal Documents
Some thoughts on what this approach could enable. The prerequisite is that documents can have a live or ephemeral version, which can be recorded; a static version, which can be played back; and a version history.
Any document may have one, several, or all of these available.
## A Meeting Space from Scratch
Start with a new document. This document has a live URL, which is ephemeral. It isn't stored anywhere yet, but it still has an address. You can send out that address to some other people. They can now watch the document live.
You can type in the document, and anything you type, they will see. The view follows the cursor by default. You can see which of the invited people are watching, and you can grant them write access as well.
Now you change the document by adding a sidebar. This is another document. Each document scrolls independently. In the sidebar you drop another document: a video document. It asks for a video source and you select your camera.
Now all people following your document can see and hear you live. People with write access can add their own camera feed as well.
Then you turn on a recording feature, so your changes in the document get saved. This feature detects all child documents and turns on their recording feature as well. There is still only one recording, yours, as all other viewers and editors are viewing and editing your document.
Then you add a document to the main view that you prepared earlier. It contains a number of slides. This is a static document, but you can edit it. Other editors cannot, though; access must be granted specifically.
As you want to have some idea of the time left, you add a new document containing a timer in the sidebar. You specify that it is a local-only enhancement, so it won't be shared. You set the timer for 30 minutes. Then you add a new document, also local-only, which is an application that can show content from other documents through a simple API. You connect it to the slides and select the 'next-slide' connector in the slides. Now you have a speaker view with a timer and the next slide.
When the presentation is finished, you save the recording to a static address and send it to each participant.
For those who haven't seen it yet, this is almost exactly what [Douglas Engelbart](https://en.wikipedia.org/wiki/Douglas_Engelbart) showed in [the Mother of All Demos](https://vimeo.com/69647667) back in 1968.
## How can we build this?
Let's see if we can plausibly build something like this now, using the browser as a Universal Application Environment.
We'll need a public storage space that we can share. Any URL will do, but I'd like to use IPFS, so that we get stable links by default. This means we can include any document by its IPFS address and make it trivial to ensure that the included document stays available as long as the parent document is. IPFS has a JavaScript implementation that runs in the browser. We can bootstrap from any normal URL and then switch to IPFS for the included contents.
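As a minimal sketch of that inclusion step, assuming the js-ipfs (`ipfs-core`) browser API, a parent document could store a child document and embed its content address like this; the helper name is my own:

```typescript
// A minimal sketch, assuming the js-ipfs (ipfs-core) browser API.
import { create } from 'ipfs-core'

// Hypothetical helper: store a child document on IPFS and return its CID,
// so the parent can embed a link that stays stable as long as it is pinned.
async function publishChildDocument(content: string): Promise<string> {
  const ipfs = await create()
  const { cid } = await ipfs.add(content)
  return cid.toString() // embed as e.g. ipfs://<cid> in the parent document
}
```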
We'll also need an event stream to share changes. IPFS has a pubsub feature, which is also available in JavaScript; see the sketch after the links below.
- [js-ipfs-http-client-lite](https://github.com/ipfs-shipyard/js-ipfs-http-client-lite)
- [js-ipfs](https://github.com/ipfs/js-ipfs)
- [Blogpost about IPFS PubSub](https://blog.ipfs.io/29-js-ipfs-pubsub/)
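Here is the sketch mentioned above: joining a live document's change stream over IPFS PubSub, again assuming the `ipfs-core` API. The topic name and the JSON message shape are assumptions of mine, not part of any spec:

```typescript
import { create } from 'ipfs-core'

const TOPIC = 'universal-doc-live-123' // hypothetical ephemeral channel name

// Hypothetical: merge an incoming change into the local view.
function applyChange(change: object) { /* render the patch */ }

async function joinLiveDocument() {
  const ipfs = await create()
  // Every change the host makes arrives as a small JSON message.
  await ipfs.pubsub.subscribe(TOPIC, (msg) => {
    applyChange(JSON.parse(new TextDecoder().decode(msg.data)))
  })
  // The host (or an editor with write access) publishes changes the same way.
  return (change: object) =>
    ipfs.pubsub.publish(TOPIC, new TextEncoder().encode(JSON.stringify(change)))
}
```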
So a live document needs a bootstrap to load at least IPFS PubSub. We can do this by creating a normal web URL that serves it. It doesn't need to be stored on IPFS, as it is only available live. It will be deleted when you quit the application. This means that if your browser crashes, the live URL will disappear. It's probably a good idea to have a grace period and to use local storage (filesystem or IndexedDB) to allow you to recover.
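A sketch of that recovery idea, with an arbitrarily chosen storage key and grace period:

```typescript
// Snapshot the live document to localStorage so a restarted browser can
// resume after a crash. Key name and grace period are assumptions.
const SNAPSHOT_KEY = 'live-doc-snapshot'
const GRACE_PERIOD_MS = 15 * 60 * 1000 // 15 minutes

function snapshot(state: object): void {
  localStorage.setItem(SNAPSHOT_KEY, JSON.stringify({ savedAt: Date.now(), state }))
}

function recover(): object | null {
  const raw = localStorage.getItem(SNAPSHOT_KEY)
  if (raw === null) return null
  const { savedAt, state } = JSON.parse(raw)
  // Only offer recovery while the grace period hasn't expired.
  return Date.now() - savedAt < GRACE_PERIOD_MS ? state : null
}
```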
The live document (with the normal web URL) will contain a PubSub channel address, perhaps using [IPFS PubSub Rooms](https://github.com/ipfs-shipyard/ipfs-pubsub-room). If someone tunes in later, he or she will need to get the current state of the live document. As it isn't stored anywhere other than on the host's computer, the new client must send a message over PubSub asking for a full state dump. The reply should be sent over a private channel to avoid overload.
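On the host's side, that late-join handshake could look roughly like this with ipfs-pubsub-room; the event and method names follow that library's README, but treat the message protocol itself as an assumption:

```typescript
import { create } from 'ipfs-core'
import Room from 'ipfs-pubsub-room'

async function serveFullState(getFullState: () => object) {
  const ipfs = await create()
  const room = new Room(ipfs, 'universal-doc-live-123') // hypothetical topic
  room.on('message', (message) => {
    if (new TextDecoder().decode(message.data) === 'state-request') {
      // Reply only to the requesting peer instead of broadcasting,
      // so a popular document doesn't flood the shared channel.
      room.sendTo(message.from, JSON.stringify(getFullState()))
    }
  })
}
```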
Collaborative editing itself isn't new technology; any number of standard implementations can be used, depending on need. But the key difference here is that the host is the server, so access requests for editing must be handled there. This can be done over a private channel, where the client requests a key, and the host sends one and stores the key (or some validation information about it) in local storage. You can create extra PubSub channels in IPFS, but PubSub isn't meant for that. So private channels should probably use a direct connection, perhaps using WebSockets.
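A sketch of that host-side bookkeeping; the token scheme and storage key are invented for illustration:

```typescript
// The host mints a random write token per editor and remembers which
// tokens it issued, so incoming edits can be validated locally.
function grantWriteAccess(peerId: string): string {
  const issued = JSON.parse(localStorage.getItem('write-tokens') ?? '{}')
  const token = crypto.randomUUID() // sent back over the private channel
  issued[peerId] = token
  localStorage.setItem('write-tokens', JSON.stringify(issued))
  return token
}

function isAuthorizedEdit(peerId: string, token: string): boolean {
  const issued = JSON.parse(localStorage.getItem('write-tokens') ?? '{}')
  return issued[peerId] === token
}
```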
You may not be able to connect directly to a viewer's computer, because of firewalls and NAT ([Network Address Translation](https://en.wikipedia.org/wiki/Network_address_translation)). But the live document must have a public URL, so it lives on a public web server. This server can be used to [punch a hole](http://www.mindcontrol.org/~hplus/nat-punch.html) in the firewall and set up a direct connection through it.
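With the browser's built-in WebRTC API, the public server only needs to relay the offer/answer handshake. A sketch, where the STUN server and the signaling message shapes are placeholders:

```typescript
// Host side: open a direct data channel to a viewer, using the public web
// server (here, a WebSocket to it) purely for signaling.
async function connectToViewer(signaling: WebSocket): Promise<RTCDataChannel> {
  const pc = new RTCPeerConnection({
    iceServers: [{ urls: 'stun:stun.example.org' }], // hypothetical STUN server
  })
  const channel = pc.createDataChannel('private') // host <-> viewer side channel

  pc.onicecandidate = (e) => {
    if (e.candidate) signaling.send(JSON.stringify({ candidate: e.candidate }))
  }
  signaling.onmessage = async (e) => {
    const msg = JSON.parse(e.data)
    if (msg.answer) await pc.setRemoteDescription(msg.answer)
    if (msg.candidate) await pc.addIceCandidate(msg.candidate)
  }

  await pc.setLocalDescription(await pc.createOffer())
  signaling.send(JSON.stringify({ offer: pc.localDescription }))
  return new Promise((resolve) => { channel.onopen = () => resolve(channel) })
}
```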
The camera feed from the host may be shared over IPFS PubSub; I don't know its performance, but this application doesn't need that much. Since the host has the document and is recording, any other camera feeds from viewers must be sent directly to the host's computer, then recorded and shared over IPFS PubSub from there. This may hit a performance wall now, but should be trivial in a few years. A temporary solution is to use a central server.
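Capturing a camera feed and attaching it to such a direct connection is standard browser API:

```typescript
// Capture the local camera and microphone and send both tracks to a peer
// over an existing RTCPeerConnection.
async function shareCamera(pc: RTCPeerConnection): Promise<MediaStream> {
  const stream = await navigator.mediaDevices.getUserMedia({ video: true, audio: true })
  for (const track of stream.getTracks()) pc.addTrack(track, stream)
  return stream // also useful to attach to a local <video> element as a self-view
}
```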
Displaying a video feed is simple enough, but it potentially requires supporting different codecs for different browsers. Or we could use [ffmpeg, compiled to WebAssembly](https://github.com/ffmpegwasm/ffmpeg.wasm). Currently Safari can't run this by default, since it disabled a necessary feature (SharedArrayBuffer) to avoid exploits. Hopefully they'll turn it on again somewhere this century. Firefox, Chrome, and Edge do support it. Mobile browsers in general don't, but they'll come around... I hope.
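So the decoder choice comes down to feature detection. The `SharedArrayBuffer` check and `canPlayType` are standard APIs; the decision logic is just one plausible policy:

```typescript
// ffmpeg.wasm needs threads, which in turn need SharedArrayBuffer
// (and a cross-origin-isolated page) to be available.
function canRunFfmpegWasm(): boolean {
  return typeof SharedArrayBuffer !== 'undefined'
}

function nativeSupports(mimeType: string): boolean {
  return document.createElement('video').canPlayType(mimeType) !== ''
}

// Prefer the native decoder; fall back to WebAssembly only where possible.
const useWasmDecoder =
  !nativeSupports('video/webm; codecs="vp9"') && canRunFfmpegWasm()
```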
For this to work seamlessly, every included document must be able to share its live state, even if the document itself is static. What I mean is that if you include a static document in the live meeting document, that document won't change, but the host may scroll or, in this case, change the visible slide. This change must be shared with the live document, so it can pass it on to all viewers. So any document must implement a 'live' API, even if only as a static participant.
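A sketch of what that 'live' API might look like; the interface and the slide-deck example are my own invention, not an existing spec:

```typescript
// Every embedded document exposes its view state, even if its content is static.
interface LiveDocument {
  // Notify the host of view-state changes (scroll position, current slide, ...).
  onLiveState(listener: (state: object) => void): void
  // Apply a state received from the host, so all viewers stay in sync.
  applyLiveState(state: object): void
}

// Example: a static slide deck that still participates live.
class SlideDeck implements LiveDocument {
  private listeners: ((state: object) => void)[] = []
  private current = 0

  onLiveState(listener: (state: object) => void) { this.listeners.push(listener) }
  applyLiveState(state: { slide?: number }) {
    if (state.slide !== undefined) this.current = state.slide
  }
  next() { // called by the host; broadcasts the new slide to all viewers
    this.current++
    for (const l of this.listeners) l({ slide: this.current })
  }
}
```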
Static documents may also be turned into live, editable documents on the fly. This live version only exists in the host's browser until it is saved, at which point it gets a new address. In the recording, this change must be recorded as well.
Documents can provide a list of documents they can create, to include in or offer to a host document. In this way the slides document could provide a speaker view of the next slide. Now the user can just drag this document anywhere and it will instantly be configured and connected to the slides.
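That could be as simple as one extra method on the document API; again, all names here are illustrative:

```typescript
// A document a child can offer to its host, pre-configured and already
// connected to its source (e.g. the slides' 'next-slide' speaker view).
interface ProvidedDocument {
  id: string            // e.g. 'next-slide'
  title: string         // shown to the user while dragging
  create(): HTMLElement // instantiate the ready-to-use companion document
}

interface DocumentProvider {
  providedDocuments(): ProvidedDocument[]
}
```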
You can drag a document into a new window, since communication uses the postMessage API anyway. Host documents can pass potential new documents from their children up the chain to their own host documents, and so on.
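A sketch of that detach-and-forward pattern; the URLs are placeholders, and in practice the origin checks should be strict:

```typescript
// Detach a child document into its own window; edits keep flowing over postMessage.
const childWindow = window.open('https://example.org/doc', '_blank') // hypothetical URL

// Forward a change from this document to the detached child.
function sendChange(change: object): void {
  childWindow?.postMessage({ type: 'doc-change', change }, 'https://example.org')
}

// Pass messages from children up the chain to our own host document.
window.addEventListener('message', (event) => {
  if (event.origin !== 'https://example.org') return // trust only known children
  window.parent.postMessage(event.data, '*') // tighten the target origin in practice
})
```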
Since postMessage is trivially shared over a network connection, a universal document may also run on a networked computer and attach to different devices. Now you can watch a movie on your TV and decide to drive somewhere, so you pause the movie, jump into a driverless car, and continue watching the movie on your telephone.
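Bridging postMessage over the network is then a matter of relaying it through a socket; the relay URL is a placeholder, and the `remote` tag is just one way to prevent echo loops:

```typescript
const relay = new WebSocket('wss://relay.example.org/session-123') // hypothetical relay

// Outgoing: serialize local postMessage traffic onto the socket.
window.addEventListener('message', (event) => {
  if ((event.data as { remote?: boolean })?.remote) return // don't echo remote messages
  relay.send(JSON.stringify(event.data))
})

// Incoming: re-dispatch remote messages as if they were local.
relay.onmessage = (event) => {
  window.postMessage({ remote: true, payload: JSON.parse(event.data) }, window.origin)
}
```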
And all this is using existing technology.
## Interesting Links
- [coda](https://coda.io)
- [microsoft fluid vision](https://www.youtube.com/watch?v=RMzXmkrlFNg) - [first look](https://www.youtube.com/watch?v=A4AUtjBCBVM)
- [notion.so](https://www.notion.so/)