# Async WASM Design Proposal

## Background

This is roughly the structure of the Tangram syscall API exposed to a WASM guest.

```rust
// A struct that represents the closed-over state used by
// Tangram syscalls invoked from the WASM guest.
struct CallingContext<'caller> { /* ... */ }

impl<'caller> CallingContext<'caller> {
    async fn syscall(
        &mut self,
        opcode: i32,      // an opcode to determine the syscall
        arg: WasmPtr,     // *const u8 in the guest
        arglen: WasmSize, // usize in the guest
        ret: WasmPtr,     // *mut *mut u8 in the guest
        retlen: WasmPtr,  // *mut usize in the guest
    ) -> Result<()> {
        match opcode.try_into()? {
            OpCode::GetExpression => self.get_expression(arg, arglen, ret, retlen).await,
            /* ... */
        }
    }

    async fn get_expression(
        &mut self,
        arg: WasmPtr,
        arglen: WasmSize,
        ret: WasmPtr,
        retlen: WasmPtr,
    ) -> Result<()> {
        // Read the bytes of the hash string from the calling context's memory
        // and deserialize them as a concrete Hash value.
        let hash = Hash::from_str(std::str::from_utf8(self.read(arg, arglen)?)?)?;

        // Call the get_expression method on the builder.
        let expr = self.builder().lock().await?.get_expression(hash).await?;

        // Serialize to JSON.
        let json = serde_json::to_vec(&expr)?;

        // Return the serialized JSON by invoking the allocator in the calling
        // context and writing the resulting pointer and length values to the
        // ret and retlen addresses in memory.
        self.return_bytes(&json, ret, retlen)
    }
}
```

We can create an instance of `CallingContext` and bind a call to `syscall` as a host-provided function like so:

```rust
pub fn bind_syscall_to_host(state: State, linker: &mut wasmtime::Linker<Host>) {
    // Create a closure that returns a boxed future.
    let binding = move |caller: wasmtime::Caller<'_, Host>,
                        opcode: i32,
                        arg: i32,
                        arglen: i32,
                        ret: i32,
                        retlen: i32| {
        // Initialize the calling context.
        let mut context = CallingContext::new(caller, state.clone());

        // Define the body of the syscall operation.
        let future = async move {
            match context.syscall(opcode, arg, arglen, ret, retlen).await {
                Ok(_) => 0, // return 0 on success
                Err(e) => {
                    // Print the error and return -1 on failure.
                    eprintln!("{e}");
                    -1
                }
            }
        };

        // Return it as a boxed future.
        Box::new(future)
    };

    // Bind to the host. This creates a function named "tangram_syscall" with
    // the signature (i32, i32, i32, i32, i32) -> i32 in the "env" module.
    linker.func_wrap5_async("env", "tangram_syscall", binding)
}
```

## The problem

We've demonstrated how to create a simple, working wrapper around the async Tangram runtime and expose it to WASM guests invoked through the wasmtime host. But how is this interface represented to WASM?

```rust
#[link(wasm_import_module = "env")]
extern "system" {
    fn tangram_syscall(
        op: i32,
        arg: *const u8,
        arglen: usize,
        ret: *mut *mut u8,
        retlen: *mut usize,
    ) -> i32;
}
```

Notice that even though the *binding* is `async` in the host code, the exposed API is synchronous. How does that work? When run in an async configuration, `wasmtime` represents the calling context with a data structure called a fiber (also known as a green thread or stackful coroutine). This means the WASM guest code manages its own stack. When the guest invokes an `async` binding from the host side, `wasmtime` swaps the caller's stack for the callee's. When the callee's `Future::poll` function returns `Ready`, it swaps stacks back to the guest caller.

Consider some higher-level user code in a Tangram package:

```rust
let example_index: Result<File> = download("https://example.com/index.html");
let tangram_index: Result<File> = download("https://tangram.io/index.html");
```

Because of the synchronous API that is bound through fibers, it is not possible for these two expressions to be evaluated concurrently. The first must complete before the script can begin evaluating the second.
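For reference, here is a sketch of the safe synchronous wrapper this import implies on the guest side. The `tangram_syscall` below is a native mock that just echoes its argument back through the out-pointers so the wrapper's pointer plumbing can actually run; the ownership convention (the guest reclaims the returned buffer) is an assumption for illustration, not the real ABI.

```rust
// Mock with the same signature as the `tangram_syscall` import, standing in
// for the real host function so the wrapper below can run natively. It simply
// echoes the argument bytes back through the out-pointers.
unsafe fn tangram_syscall(
    _op: i32,
    arg: *const u8,
    arglen: usize,
    ret: *mut *mut u8,
    retlen: *mut usize,
) -> i32 {
    let bytes = std::slice::from_raw_parts(arg, arglen).to_vec();
    *retlen = bytes.len();
    *ret = Box::into_raw(bytes.into_boxed_slice()) as *mut u8;
    0 // status 0 = success
}

// A safe wrapper a guest might expose over the raw import: pass the argument
// buffer in, take ownership of the result buffer, surface the status code as
// an error.
fn syscall_sync(op: i32, arg: &[u8]) -> Result<Vec<u8>, i32> {
    let mut ret: *mut u8 = std::ptr::null_mut();
    let mut retlen: usize = 0;
    let status = unsafe { tangram_syscall(op, arg.as_ptr(), arg.len(), &mut ret, &mut retlen) };
    if status != 0 {
        return Err(status);
    }
    // Reclaim the buffer written by the "host" (in the real guest this memory
    // comes from the guest allocator the host calls into).
    let out = unsafe {
        let slice: *mut [u8] = std::slice::from_raw_parts_mut(ret, retlen);
        Vec::from(Box::from_raw(slice))
    };
    Ok(out)
}

fn main() {
    let out = syscall_sync(1, b"c0ffee").expect("syscall failed");
    assert_eq!(out, b"c0ffee".to_vec());
    println!("ok");
}
```

Every call through this wrapper blocks the guest until the host's future resolves, which is exactly the limitation described above.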
We want our users to be able to write normal async code, e.g.

```rust
let (example_index, tangram_index) = futures::join!(
    download("https://example.com/index.html"),
    download("https://tangram.io/index.html")
);
```

## What about Wasmer?

Wasmer does not support `async` host bindings yet. [GH Issue](https://github.com/wasmerio/wasmer/issues/1127).

## What about Lucet?

Lucet is EoL, and its maintainers recommend focusing on wasmtime. However, it had an API, not based on futures, for exposing [async host functions](https://docs.rs/lucet-runtime/latest/lucet_runtime/index.html#yielding-and-resuming).

## Prior Art

### wasm-bindgen

`wasm-bindgen` allows for asynchronous calls to host/imported functions by wrapping them in JS as `Promise`s. One option is to reuse the promisified API for JS/TS bindings in V8, convert them to wasm bindings, and then use `wasm-bindgen-sys` to convert the exposed promises into futures. The [docs](https://rustwasm.github.io/docs/wasm-bindgen/reference/js-promises-and-rust-futures.html) are a pretty good introduction.

### Lunatic

[Lunatic](https://github.com/lunatic-solutions/lunatic) is an Erlang-inspired WASM application runtime. It does not support future-based asynchronous APIs and relies on wasmtime's native support for async host calls under the hood.

### Fastly Compute@Edge

Fastly is using wasmtime to do much the same thing we want to do, but with more extensive bindings for a range of APIs. The relevant code is in the `fastly-sys` crate, in the `abi` bindings that are used by downstream Fastly SDK consumers. They expose a `select`/`poll`-like interface.
```rust
pub type AsyncItemHandle = u32;

#[derive(Clone, Copy, Eq, PartialEq)]
#[repr(transparent)]
pub struct FastlyStatus {
    pub code: i32,
}

pub mod fastly_async_io {
    use super::*;

    #[link(wasm_import_module = "fastly_async_io")]
    extern "C" {
        #[link_name = "select"]
        pub fn select(
            async_item_handles: *const AsyncItemHandle,
            async_item_handles_len: usize,
            timeout_ms: u32,
            done_index_out: *mut u32,
        ) -> FastlyStatus;

        #[link_name = "is_ready"]
        pub fn is_ready(async_item_handle: AsyncItemHandle, ready_out: *mut u32) -> FastlyStatus;
    }
}
```

`select` takes an array of handles to pending async requests and writes out the index of the handle that completed. However, Fastly does not wrap this API in `Future`s anywhere. It is up to the caller to explicitly call these APIs through their sync wrappers and handle selecting/polling the requests.

## Proposals

### 1. Do nothing and stay sync

This is the easiest path. The syscall implementations are still async and managed by the `tg` application runtime, but the guest is never aware of this. It has the limitation that syscalls cannot execute concurrently.

### 2. Do almost nothing and force scripts to use wasm-bindgen

This will generate a shim script in JS that can be loaded from V8 (in principle; the mechanics are unclear). This has a giant disadvantage in that support will be Rust-only, or only for languages that have something *like* wasm-bindgen. It would be preferable to have a more portable/reusable solution, or else we will need to write wasm-bindgen variants for every language we want to support.

### 3. Expose a Promise-like API that is compatible with wasm-bindgen-futures/wasm-bindgen-sys

This will take some research to understand how the code is generated and how to use wasm-bindgen's internals without the macro expansion and without the CLI to generate the shim. We can then reuse the wasm-bindgen implementation to do the actual wrapping.

### 4. Manually wrap polling of futures and drive via the guest
The most robust approach seems to be to manually wrap the `Future::poll` API of a syscall in bindings and allow the future in the host to be driven by an async executor in the guest. This is what Fastly is doing, although they don't go to the trouble of wrapping the result in a `Future` when exposing it to users.

```rust
// In Guest
mod abi {
    pub type OpCode = i32;
    pub type SyscallHandle = i32;
    pub type Status = i32;
    pub type PollStatus = i32;

    pub const STATUS_OK: Status = 0;
    pub const STATUS_FAIL: Status = -1;

    pub const POLL_PENDING: PollStatus = 0;
    pub const POLL_READY: PollStatus = 1;
    pub const POLL_FAIL: PollStatus = 2;

    #[link(wasm_import_module = "tangram_host")]
    extern "C" {
        // Spawn a new syscall. Returns immediately without driving the syscall.
        // If successful returns STATUS_OK, else STATUS_FAIL.
        pub fn spawn(
            opcode: OpCode,             // the syscall opcode to use
            arg: *const u8,             // the argument to the syscall, as a serialized chunk of bytes
            arglen: usize,              // length of the argument array
            handle: *mut SyscallHandle, // the handle pointing to this syscall
        ) -> Status;

        // Poll a previously spawned syscall.
        // If successful returns STATUS_OK, else STATUS_FAIL.
        pub fn poll(
            handle: SyscallHandle,        // the handle of the syscall we are polling
            poll_status: *mut PollStatus, // status of the poll
            out: *mut *mut u8,            // if poll_status == POLL_READY, the host writes the output to this pointer
            outlen: *mut usize,           // if poll_status == POLL_READY, the host writes the length of the output here
        ) -> Status;
    }
}
```

This low-level interface will never be seen by users. We will present a friendlier API that wraps it.

```rust
// In Guest
#[repr(i32)]
pub enum OpCode {
    AddExpression,
    GetExpression,
    /* ... etc ...
    */
}

use std::future::Future;
use std::task::{Context, Poll};

use anyhow::{anyhow, Result};

pub fn syscall(opcode: OpCode, arg: &[u8]) -> impl Future<Output = Result<Vec<u8>>> {
    // Spawn the syscall eagerly; a handle of -1 marks a failed spawn.
    let handle = unsafe {
        let mut handle: abi::SyscallHandle = -1;
        let err = abi::spawn(opcode as i32, arg.as_ptr(), arg.len(), &mut handle as *mut abi::SyscallHandle);
        if err != abi::STATUS_OK {
            handle = -1;
        }
        handle
    };

    futures::future::poll_fn(move |_cx: &mut Context<'_>| -> Poll<Result<Vec<u8>>> {
        // NOTE: the waker in `_cx` is ignored; the guest-side executor is
        // expected to re-poll pending syscalls itself.
        if handle == -1 {
            return Poll::Ready(Err(anyhow!("Could not spawn future.")));
        }

        let mut status: abi::PollStatus = abi::POLL_PENDING;
        let mut out: *mut u8 = std::ptr::null_mut();
        let mut outlen: usize = 0;
        let err = unsafe {
            abi::poll(handle, &mut status as *mut _, &mut out as *mut _, &mut outlen as *mut _)
        };
        if err != abi::STATUS_OK {
            return Poll::Ready(Err(anyhow!("Poll failed.")));
        }

        let ret = match status {
            abi::POLL_PENDING => return Poll::Pending,
            abi::POLL_READY => {
                if out.is_null() {
                    vec![]
                } else {
                    unsafe { std::slice::from_raw_parts(out, outlen).to_owned() }
                }
            }
            _ => panic!("Invalid poll status."),
        };
        Poll::Ready(Ok(ret))
    })
}
```
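To sanity-check the shape of this design, the sketch below mocks the host's spawn/poll table natively and drives two syscall futures with a minimal hand-rolled executor, the role the guest-side executor would play. Everything here is a stand-in: the `HOST` table, the "ready after N polls" behavior, and the `join2` combinator (a substitute for `futures::join!`) are illustrative assumptions, not the real host implementation.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Mutex;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// Mock host table of pending "syscalls": each entry becomes ready with its
// payload after a fixed number of polls. Stands in for the host's real table
// of spawned futures behind `abi::spawn`/`abi::poll`.
static HOST: Mutex<Vec<(u32, Vec<u8>)>> = Mutex::new(Vec::new());

// Mock of the `spawn` import: register a syscall and return its handle.
fn spawn(polls_until_ready: u32, result: Vec<u8>) -> i32 {
    let mut host = HOST.lock().unwrap();
    host.push((polls_until_ready, result));
    (host.len() - 1) as i32
}

// Mock of the `poll` import: Some(bytes) once ready, None while pending.
fn poll_handle(handle: i32) -> Option<Vec<u8>> {
    let mut host = HOST.lock().unwrap();
    let entry = &mut host[handle as usize];
    if entry.0 == 0 { Some(entry.1.clone()) } else { entry.0 -= 1; None }
}

// Guest-facing wrapper, same shape as the proposal: spawn eagerly, then
// expose polling as a Future.
fn syscall(polls_until_ready: u32, result: Vec<u8>) -> impl Future<Output = Vec<u8>> {
    let handle = spawn(polls_until_ready, result);
    std::future::poll_fn(move |_cx| match poll_handle(handle) {
        Some(bytes) => Poll::Ready(bytes),
        None => Poll::Pending,
    })
}

// Hand-rolled two-way join (standing in for `futures::join!`): every poll of
// the joined future polls both unfinished children.
struct Join<A: Future, B: Future> {
    a: Pin<Box<A>>,
    b: Pin<Box<B>>,
    ra: Option<A::Output>,
    rb: Option<B::Output>,
}

impl<A: Future, B: Future> Future for Join<A, B>
where
    A::Output: Unpin,
    B::Output: Unpin,
{
    type Output = (A::Output, B::Output);
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let this = self.get_mut();
        if this.ra.is_none() {
            if let Poll::Ready(v) = this.a.as_mut().poll(cx) { this.ra = Some(v); }
        }
        if this.rb.is_none() {
            if let Poll::Ready(v) = this.b.as_mut().poll(cx) { this.rb = Some(v); }
        }
        match (this.ra.is_some(), this.rb.is_some()) {
            (true, true) => Poll::Ready((this.ra.take().unwrap(), this.rb.take().unwrap())),
            _ => Poll::Pending,
        }
    }
}

fn join2<A: Future, B: Future>(a: A, b: B) -> Join<A, B> {
    Join { a: Box::pin(a), b: Box::pin(b), ra: None, rb: None }
}

// Minimal busy-polling executor standing in for the guest's async runtime.
// Returns the output plus the number of executor turns taken.
fn block_on<F: Future>(mut fut: F) -> (F::Output, u32) {
    fn raw() -> RawWaker {
        unsafe fn clone(_: *const ()) -> RawWaker { raw() }
        unsafe fn noop(_: *const ()) {}
        static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    let waker = unsafe { Waker::from_raw(raw()) };
    let mut cx = Context::from_waker(&waker);
    // Safety: `fut` lives on this stack frame and is never moved again.
    let mut pinned = unsafe { Pin::new_unchecked(&mut fut) };
    let mut turns = 0;
    loop {
        turns += 1;
        if let Poll::Ready(out) = pinned.as_mut().poll(&mut cx) {
            return (out, turns);
        }
    }
}

fn main() {
    // Two "syscalls": ready after 3 and 5 pending polls respectively.
    let ((a, b), turns) = block_on(join2(syscall(3, b"a".to_vec()), syscall(5, b"b".to_vec())));
    assert_eq!(a, b"a".to_vec());
    assert_eq!(b, b"b".to_vec());
    // Both made progress on every turn: max(3, 5) + 1 = 6 executor turns,
    // rather than the (3 + 1) + (5 + 1) = 10 a sequential ABI would require.
    assert_eq!(turns, 6);
}
```

The key property proposal 4 buys us falls out directly: because the host only advances syscalls when the guest polls them, a guest executor that polls several handles per turn overlaps their execution, which the fiber-based synchronous ABI cannot do.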