From the official documentation of Emscripten as of v3.1.48:
WasmFS is a high-performance, fully-multithreaded, WebAssembly-based file system layer for Emscripten that will replace the existing JavaScript version.
The JavaScript-based file system was originally written before pthreads were supported and when it was more optimal to write code in JS. As a result it has overhead in pthreads builds because we must proxy to the main thread where all filesystem operations are done. WasmFS, instead, is compiled to Wasm and has full multithreading support. It also aims to be more modular and extensible.
Its public-facing API is at src/library_wasmfs.js. To have a project adopt WasmFS, link it with the flag -s WASMFS. It takes over the role of the traditional FS layer. Since all of its filesystem APIs are native in the first place, you probably want -s FORCE_FILESYSTEM as well to expose the JS APIs.
Unlike the traditional FS implementation, which wires almost all system calls through their counterparts in JS land, WasmFS is mostly written in C/C++ and compiled to Wasm alongside other libraries. This puts async operations directly at the consumer's disposal and provides thread safety naturally when interacting with file systems. In theory, it also helps reduce the runtime JS bundle size, though benchmarks are needed to confirm this.
One can still write JS backends, but writing one for WasmFS is inherently more complicated than before. Now that the data structures no longer live in JS objects, each backend requires a wrapper layer in C. There is a built-in backend primitive, JSImplBackend, for this. The corresponding concrete backend is JSFILEFS. They are not fully featured, just enough to prove the concept.
Here is the list of implemented backends in the official repository:
Note that IDBFS does not (yet?) support WasmFS. If your existing project is linked against it, it will not compile with WasmFS.
TODO: explain wasmFS.addBackend
Now let's trace what a WasmFS mount operation goes through with a custom backend:
1. Pass an object that provides createBackend as the type argument of FS.mount(type, opts, mountpoint). The job of createBackend is to initialize a backend in Wasm memory and return an opaque pointer to it. Built-in constructors are named wasmfs_create_*_backend. The pointer is passed to __wasmfs_mount.
2. The _wasmfs_* functions are at system/lib/wasmfs/js_api.cpp. The mount function calls a system call, wasmfs_create_directory, which points to doMkdir.
3. From there, the call reaches the backend's createDirectory method. In this file, we can see that when a syscall needs to operate on a file, directory, or symlink, it either queries its backend, or queries the file object and delegates to it.
Let's look at a particular backend implementation, JSImplBackend.
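Before digging in, the mount trace above can be condensed into a toy model. This is only a sketch of the shape of the flow (createBackend returns an opaque pointer, and mounting boils down to creating a directory owned by that backend). Every name in it (ToyFS, ToyBackendType, allocBackend, the error codes) is made up for illustration and is not Emscripten's real code.

```typescript
// Toy model of the mount path traced above. All names are hypothetical.

type ToyBackendPtr = number; // stands in for the opaque pointer into Wasm memory

interface ToyBackendType {
  // Plays the role of createBackend: set up a backend, hand back a pointer.
  createBackend(opts: unknown): ToyBackendPtr;
}

interface ToyDir {
  backend: ToyBackendPtr;
  children: Map<string, ToyDir>;
}

// Arbitrary toy status codes, not real errno values.
const TOY_OK = 0;
const TOY_EEXIST = -1;
const TOY_ENOENT = -2;

class ToyFS {
  private nextPtr: ToyBackendPtr = 1;
  private readonly root: ToyDir = { backend: 0, children: new Map() };

  // Plays the role of a native backend constructor.
  allocBackend(): ToyBackendPtr {
    return this.nextPtr++;
  }

  // Step 1: ask the type for a backend pointer, then pass it to the mount step.
  mount(type: ToyBackendType, opts: unknown, mountpoint: string): number {
    const ptr = type.createBackend(opts);
    return this.mkdirWithBackend(mountpoint, ptr);
  }

  // Steps 2 and 3: create a directory that is owned by the given backend.
  private mkdirWithBackend(path: string, backend: ToyBackendPtr): number {
    const parts = path.split("/").filter((p) => p.length > 0);
    const name = parts.pop();
    if (name === undefined) return TOY_EEXIST; // the root already exists
    let dir = this.root;
    for (const part of parts) {
      const next = dir.children.get(part);
      if (next === undefined) return TOY_ENOENT; // a parent is missing
      dir = next;
    }
    if (dir.children.has(name)) return TOY_EEXIST;
    dir.children.set(name, { backend, children: new Map() });
    return TOY_OK;
  }

  // Which backend owns the directory at `path`? undefined if it does not exist.
  backendAt(path: string): ToyBackendPtr | undefined {
    let dir = this.root;
    for (const part of path.split("/").filter((p) => p.length > 0)) {
      const next = dir.children.get(part);
      if (next === undefined) return undefined;
      dir = next;
    }
    return dir.backend;
  }
}
```

In this toy, mounting on a path that already exists fails, simply because the mount step is modeled as a plain directory creation.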
js_impl_backend.h contains brief instructions on how to make one, but I have never compiled Emscripten itself, so that part is left for the future. I hope they can publish more guides soon.
Currently, JSImplBackend has only one implementation, JSFILEFS. The async version also has one, FETCHFS. Let's dig into JSFILEFS.
It is available as a library (-ljsfile.js) and is implemented in library_wasmfs_js_file.js.
In JSImplBackend, symlinks and directories are still kept in memory. Only files are specialized, as JSImplFile. Files are handled through a minimal collection of JS functions, and file contents are exposed as Uint8Arrays. See system/lib/wasmfs/js_impl_backend.h#L115.
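To get a feel for what "a minimal collection of JS functions" over Uint8Array contents might look like, here is a small sketch. It is a stand-in, not the real JSFILEFS source: the method names and signatures are my own simplification (the real functions deal with pointers into Wasm memory), and files are identified by plain numbers.

```typescript
// Hypothetical, simplified set of per-file operations backed by Uint8Arrays.
interface ToyJSImplMethods {
  allocFile(file: number): void;
  freeFile(file: number): void;
  write(file: number, data: Uint8Array, offset: number): number;
  read(file: number, out: Uint8Array, offset: number): number;
  getSize(file: number): number;
}

function createToyJSFileMethods(): ToyJSImplMethods {
  // File id -> current contents. Each file starts out empty.
  const contents = new Map<number, Uint8Array>();

  const mustGet = (file: number): Uint8Array => {
    const buf = contents.get(file);
    if (buf === undefined) throw new Error(`unknown file ${file}`);
    return buf;
  };

  return {
    allocFile(file: number): void {
      contents.set(file, new Uint8Array(0));
    },
    freeFile(file: number): void {
      contents.delete(file);
    },
    // Copies `data` into the file at `offset`, growing the buffer if needed.
    // Returns the number of bytes written.
    write(file: number, data: Uint8Array, offset: number): number {
      let buf = mustGet(file);
      const end = offset + data.length;
      if (end > buf.length) {
        const grown = new Uint8Array(end); // any gap is zero-filled
        grown.set(buf);
        contents.set(file, grown);
        buf = grown;
      }
      buf.set(data, offset);
      return data.length;
    },
    // Fills `out` starting at `offset`. Returns the number of bytes read,
    // which is 0 when `offset` is at or past the end of the file.
    read(file: number, out: Uint8Array, offset: number): number {
      const buf = mustGet(file);
      if (offset >= buf.length) return 0;
      const n = Math.min(out.length, buf.length - offset);
      out.set(buf.subarray(offset, offset + n));
      return n;
    },
    getSize(file: number): number {
      return mustGet(file).length;
    },
  };
}
```

Growing by reallocating on every write past the end is the simplest thing that works. A real backend would likely want a smarter growth strategy.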
TODO: describe what happens on JS-side and C-side respectively.
In jsimpl, a backend exposes a createBackend method that stores its JS methods in $wasmFS$backends[id]. I wonder whether different JS mounts share the same backend definition.
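The id-keyed table idea can be sketched as follows. This is a toy under my own assumptions: the table, the id allocation, and the dispatch helpers are all hypothetical names. It does not settle the sharing question above; it only shows that in a design like this, each createBackend call can register its own methods object with its own state.

```typescript
// Hypothetical per-backend method table, keyed by backend id.
interface ToyFileMethods {
  allocFile(file: number): void;
  getSize(file: number): number;
}

const toyBackendTable = new Map<number, ToyFileMethods>();
let nextToyBackendId = 1;

// Plays the role of createBackend: register a fresh methods object under a
// new id and return that id to the caller.
function createToyBackend(makeMethods: () => ToyFileMethods): number {
  const id = nextToyBackendId++;
  toyBackendTable.set(id, makeMethods());
  return id;
}

function lookupToyBackend(backend: number): ToyFileMethods {
  const methods = toyBackendTable.get(backend);
  if (methods === undefined) throw new Error(`no backend registered for id ${backend}`);
  return methods;
}

// What the compiled side would call: it only knows ids, so every call is
// routed through the table.
function toyDispatchAllocFile(backend: number, file: number): void {
  lookupToyBackend(backend).allocFile(file);
}

function toyDispatchGetSize(backend: number, file: number): number {
  return lookupToyBackend(backend).getSize(file);
}

// Example factory: every file allocated by this backend reports a fixed size.
function makeFixedSizeMethods(size: number): () => ToyFileMethods {
  return () => {
    const sizes = new Map<number, number>();
    return {
      allocFile(file: number): void {
        sizes.set(file, size);
      },
      getSize(file: number): number {
        const s = sizes.get(file);
        if (s === undefined) throw new Error(`unknown file ${file}`);
        return s;
      },
    };
  };
}
```

Here two backends created from different factories keep separate state even when they use the same file id, because the state lives inside each registered methods object.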
I played around with WasmFS's JSFILEFS here: https://gist.github.com/andy0130tw/76352f747f7a1d9b3210e9535db7e4a1.