Author: Didrik Nordstrom (betamos @ github)
Status: Work in progress
Last updated: 2020-08-14
Publishing target: fuchsia.dev
Audience: Rust on Fuchsia developers
Goal: Improve developer confidence, productivity and understanding of asynchronous Rust.
Out of scope: Build your own executor and synchronization primitives.
Prerequisites: Basic knowledge of the Rust type system and ownership model, OS threads, and concurrency vs. parallelism.
Dual-format: Guide (linear reading) and reference (searchable).
Rust comes with excellent documentation. The Rust Book can be read cover to cover and covers almost all features of Rust. Asynchronous Rust was recently stabilized and dramatically impacts APIs, design patterns and data structures. Early research indicates that developers find async and its concepts harder to use and understand than Rust generally. The existing Rust Async Book covers many concepts of async Rust, but is incomplete and, in my opinion, does not clearly separate the knowledge needed to write asynchronous applications from advanced topics such as building an executor, implementing futures manually and understanding wakers. This risks overloading prospective developers with information that beginners do not need.
Although this draft is currently written for Fuchsia developers, most concepts are not Fuchsia-specific. This follows the model of other runtimes, such as tokio, which provide and maintain their own documentation. Per-runtime documentation is the better model given the meaningful differences across runtimes. However, the majority of features, concepts and mechanisms are runtime-independent. As such, this guide can be repurposed for async Rust developers in general, by substituting references to Fuchsia with more common environments.
Asynchronous (often abbreviated "async") Rust lets you run multiple tasks concurrently on one or more threads, while preserving zero-cost abstraction performance as well as most of the look and feel of regular, synchronous, Rust. Async Rust is used in operating systems, production-grade servers and other complex systems.
Async Rust is an emerging area. Only a small set of crucial features have so far made it into stable Rust, most notably the `Future` trait and the `async`/`await` syntax. Other features reside in library crates, such as async runtimes, combinators and synchronization primitives. To be fully productive with async Rust today, you need to rely on a mixture of core language features, the standard library and community crates. In the future, we expect more async features to be integrated into stable Rust.
Much like security and testability, async should be considered early in the design process. Retrofitting async onto an existing synchronous code base is tedious and costly. An async application or library typically uses different APIs, control flow, design patterns and data structures as compared to its synchronous counterpart.
This guide is written for Rust on Fuchsia developers. Familiarity with the Rust type and ownership system will make the technical details of this guide easier to understand.
Asynchronous programming is a paradigm that has gained traction over the years, as multitasking is playing an increasingly important role in our day-to-day applications. It is suitable for many types of applications, so learning to "think in async" is tremendously useful, both within and outside of Rust.
In some languages, such as modern JavaScript, asynchronous programming is baked into the language runtime. Since Rust is a low-level language with a minimal runtime, async support is not built in. Instead, it is your choice whether to use async or not.
Imperative programming languages let you write code that is executed in sequence. Python, JavaScript, C, C++, Go, Rust (and many more) are very different languages, yet they all adhere to this basic principle. The fundamental execution environment consists of a call stack, a heap and a program counter representing the current state of the program.
This model is sufficient for sequential workloads. However, real-world applications often require multitasking, for instance serving many network clients at once, or waiting on multiple I/O sources simultaneously. Imperative programming alone does not answer how to do this, which leads us to the following problem statement:
How do we model multitasking in an imperative programming environment?
Many different languages, frameworks, libraries and design patterns have been developed to answer this question. As of today, there is no widely acknowledged consensus on which approach is the best.
The most fundamental multitasking primitive is a thread, which is exposed by the operating system itself. Individual threads can run independently on separate CPU cores. The simplest approach is to create one thread for each task. However, threads come with a cost in terms of CPU time, memory consumption and synchronization.
In short, using one thread per task results in sub-optimal resource utilization, in particular for workloads with a large number of tasks. The problem statement can be phrased as:
How do we run M tasks concurrently on N worker threads?
Here, M represents the number of tasks in the program. It can be very large, and vary at runtime. For instance, a server that has one task per client could have a large number of tasks.
The number of worker threads, N, deserves further explanation. We just stated that threads are expensive, so why use threads at all in our search for a better way of multitasking? The reason is that threads can run in parallel on multiple logical CPU cores, which can improve the runtime of parallel workloads. In fact, threads are the only abstraction that unlocks parallelism, so we have no choice. However, more threads only help up to a point: once all cores are saturated (i.e. working at their maximum capacity), additional threads no longer improve throughput. As such, N should not exceed the number of cores. (Some runtimes also use a small, constant number of threads for auxiliary purposes.)
Asynchronous programming offers a performant and ergonomic solution to this problem. It lets you run multiple logical tasks on a single OS thread, by exposing multitasking constructs within the programming environment.
Many async programming environments provide a future or promise type. They represent an asynchronous operation, where creation and completion are logically separate. Technically, any operation can be asynchronous, even something simple like adding two numbers. In practice we only make operations asynchronous if they cannot complete quickly on their own, for instance if the operation involves I/O, IPC or synchronization. These operations can be suspended while they wait for something to happen elsewhere, such as the arrival of an IP packet or a mutex being locked. In Rust, an async operation is called a future.
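To make the separation between creation and completion concrete, here is a minimal sketch; `read_packet` and `socket` are hypothetical stand-ins for a runtime-provided I/O API:

```rust
// A sketch only: `read_packet` and `socket` are hypothetical names.
let fut = read_packet(&socket); // Creation: no I/O has happened yet.
// ... later, inside an async context ...
let packet = fut.await;         // Completion: suspends until a packet arrives.
```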
An asynchronous programming environment often provides either green threads, coroutines or tasks, which represent a series of operations that are largely logically independent from the rest of the program. For example, an HTTP server may create a separate task for each incoming HTTP request. In asynchronous Rust they are (informally) called tasks.
An async runtime is responsible for running async applications. If you are familiar with how OS threads work, this analogy may help: the runtime is to tasks what the OS scheduler is to threads.
Some programming environments provide an implicit runtime which you cannot directly interact with — you may not even be consciously aware of it! In Rust, async runtimes are provided by community maintained crates. You have to manually import and invoke the runtime in order to run your async application.
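As a minimal illustration, the `futures` crate ships a basic executor that you can invoke manually; a sketch (on Fuchsia you would use its own runtime instead):

```rust
use futures::executor::block_on;

async fn greet() -> String {
    String::from("hello")
}

fn main() {
    // `block_on` drives the future to completion on the current thread.
    let greeting = block_on(greet());
    println!("{}", greeting);
}
```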
async/await
The `async`/`await` syntax lets you write sequential but asynchronous code, similar to regular synchronous code, which is immensely useful in practice. However, `async`/`await` is neither necessary for writing asynchronous code, nor by itself sufficient to achieve in-thread concurrency, which is the goal of asynchronous programming. Understanding `async`/`await` is easier if you are already familiar with the asynchronous programming primitives of your programming environment. In Rust, it is easier to understand `async`/`await` once you know the fundamentals of futures.
A blocking operation can occupy the OS thread for a long time. It is usually the result of either a system call that waits for an external event, or a long-running computation.
In a traditional sequential program, blocking operations are completely normal, because there is nothing else to be done while waiting.
However, an asynchronous program needs to make progress on multiple tasks concurrently on each OS thread. When a blocking operation is running, it prevents other tasks on the same thread from making progress. This can lead to large latency spikes and even deadlocks. As such, blocking operations should be avoided in async applications.
There is no definite consensus on what constitutes "a long time", so blocking is not an objective term. By convention, a system call that waits for an external event to occur is considered blocking. For computational operations, the line between blocking and non-blocking is blurry. As a rule of thumb, an operation that takes more than a millisecond of CPU time on modern hardware can be considered blocking. Fortunately, in most circumstances, typical computations are significantly faster than that. If you have stricter or looser latency requirements, you may want to use a different reference number.
Note that async applications can invoke synchronous code freely, as long as they are non-blocking. Some programming environments support running blocking code in a way that doesn't interfere with the rest of the async application. In Rust, you can look up whether this is possible in the documentation provided by your runtime.
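For illustration, here is a sketch contrasting a blocking call with its asynchronous counterpart; the `delay` timer future is hypothetical and would be provided by your runtime:

```rust
async fn handle_request() {
    // BAD: blocks the whole OS thread, stalling every task scheduled on it.
    std::thread::sleep(std::time::Duration::from_secs(1));

    // GOOD: suspends only this task; other tasks keep making progress.
    // (`delay` is a hypothetical timer future provided by your runtime.)
    delay(std::time::Duration::from_secs(1)).await;
}
```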
The goal of asynchronous programming is multitasking: allowing multiple logical tasks to make progress concurrently on one or more OS threads. It is typically more performant and ergonomic than using threads directly. Asynchronous applications should avoid blocking operations, which can cause latency spikes and deadlocks. An async runtime schedules and executes an asynchronous application, switching between tasks as needed. The `async`/`await` syntax lets you write sequential but asynchronous code. The tables below compare common objectives and features in sync and async Rust:
| Objective | Sync Rust | Async Rust |
|---|---|---|
| Fast function | Regular function, e.g. `fn square(x: i64) -> i64` | (same) |
| Slow* function | Blocking function, e.g. `fn sleep(t: Duration)` | Async function, e.g. `async fn delay()`, `fn delay() -> impl Future<..>` or `fn delay() -> DelayFut` |
| Fast closure | Regular closure, e.g. `\|\| { square(x) }` or `move \|\| { square(x) }` | (same) |
| Slow* closure | Blocking closure, e.g. `\|\| { sleep() }` or `move \|\| { sleep() }` | Async closure, e.g. `\|\| async { delay().await }` or `move \|\| async { delay().await }` |
| Fast producer of multiple values | `Iterator<Item=T>` | `Iterator<Item=T>` |
| Slow* producer of multiple values | `Iterator<Item=T>` | `Stream<Item=T>` |
| Achieving concurrency only | N/A (except through a custom event loop) | Spawn local tasks**; future and stream combinators |
| Achieving parallelism | Spawn threads | Spawn tasks** |
| Thread/task-local storage | Provided by the `thread_local` crate | Task-local storage may be provided by your runtime** |
| Synchronization | Provided by `std::sync`; reference counting must use `std::sync::Arc` | Provided by `futures`; local tasks may use `std::rc::Rc` for reference counting |
| I/O | Provided by `std::fs` and `std::net` | Provided by your runtime** |
| Timers | Provided by `std::thread::sleep` | Provided by your runtime** |
| Feature | Sync Rust | Async Rust |
|---|---|---|
| Scheduler of threads/tasks | OS kernel | The executor of your runtime, which runs in user space |
| Scheduling strategy | Pre-emptive and cooperative. Pre-emptive means that a logical task can be suspended by the scheduler at any point. | Cooperative only, meaning that the scheduler cannot resume until the task voluntarily yields control back to it. |
| Threading model | 1:1 (each logical task has its own OS thread) | N:M ("multithreaded executor"): N logical tasks run on M OS threads, where M is typically the number of CPU cores. N:1 ("single-threaded executor"): N logical tasks run on a single OS thread. |
| Isolation against slow threads/tasks | Yes, through pre-emptive scheduling | No. You must manually avoid blocking code to avoid congestion and/or deadlocks. |
| Isolation against panicking threads/tasks | Yes, parent threads can detect and resume after a child panics. | See the documentation of your runtime.** |
| Cost of creating threads/tasks | Slow to create, due to syscalls and kernel provisioning. Can be mitigated by a thread pool. | Very cheap (allocate, then enqueue). |
| Memory cost of dormant threads/tasks | Significant, since each thread needs its own stack. Can be mitigated by a thread pool. | Minimal, since each task only provisions memory for its input future, which is known up front. |
| Cost of switching between threads/tasks | Context switches have considerable overhead. | Faster, since many context switches are replaced by user-space task switches.*** |
| Cost of synchronization | Considerable cost due to atomics, memory barriers and context switching. | N:M model: cheaper than sync Rust due to fewer context switches.*** N:1 model: significantly cheaper when using single-threaded synchronization primitives.*** |
| Cost of I/O | Each read or write incurs a mode switch. | Multiple read and write operations can be pooled into a single syscall, which can improve performance dramatically under load.*** |
| Binary size | (Baseline) | Larger, due to `Future` types. In particular, `async fn` auto-generates futures that can be large. Mitigations exist. |
* Here, slow means any operation that may not complete quickly. Blocking simply means slow and synchronous.
** Depends on your runtime.
*** Theoretical benefit. Needs to be verified through benchmarks.
A future is the async equivalent of a plain old function. Both perform a set of operations and eventually complete. A function executes from start to end without interruption, but a future executes in bursts. It makes as much progress as possible when it is polled, and then it yields control back to the runtime, allowing other work to be done in the meantime.
The `async` and `.await` keywords are counterparts: `async` is declaration, similar to `fn`, while `.await` is application, similar to calling a function with `()`.
A task is the async equivalent of a thread. Just like the OS owns and schedules threads, the async runtime owns and schedules tasks. To create a thread, you pass a function; to create a task, you pass a future.
The async runtime is the equivalent of the OS scheduler. The OS schedules threads in kernel-space, but the async runtime schedules tasks in user-space.
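To make the analogy concrete, here is a sketch using the `futures` crate's single-threaded `LocalPool` executor (your runtime's spawn API may differ):

```rust
use futures::executor::LocalPool;
use futures::task::LocalSpawnExt;

fn main() {
    let mut pool = LocalPool::new();
    let spawner = pool.spawner();

    // A thread takes a function (closure)...
    std::thread::spawn(|| println!("running on a thread"));

    // ...while a task takes a future.
    spawner
        .spawn_local(async { println!("running on a task") })
        .unwrap();

    // Run the executor until all spawned tasks complete.
    pool.run();
}
```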
Async combinators are similar to regular combinators (`map`, `and`, `or_else`, etc.), but deal specifically with futures. Some async combinators (like `join` and `select`) alter the concurrent control flow and as such have no synchronous equivalents.
This section covers the fundamental building blocks of async Rust, and how they fit together:
Think of async Rust as an extension of synchronous Rust, rather than a substitute. In async Rust, you often call synchronous functions in the same way as you always have. However, you cannot call async functions from a sync function:
```rust
fn foo() {}
async fn bar() {}

fn baz() {
    foo();       // OK: sync -> sync
    bar().await; // Error: sync -> async
}

async fn quux() {
    foo();       // OK: async -> sync
    bar().await; // OK: async -> async
}
```
In async Rust, operations that involve I/O, timeouts and synchronization are usually async. Since these operations can only be invoked by other async functions, async tends to be "contagious" and bubble up all the way to the main function. As a result, you will see `async fn` a lot in async Rust, even for functions that only use I/O indirectly through nested function calls.
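A sketch of this bubbling-up; `fetch_bytes`, `Config` and `parse_config` are hypothetical names:

```rust
// `fetch_bytes` is a hypothetical async I/O function from your runtime.
async fn load_config() -> Config {
    let bytes = fetch_bytes("config.json").await; // I/O happens here...
    parse_config(&bytes)
}

async fn start_server() {
    let config = load_config().await; // ...so every caller must be async too...
    // ...
}
// ...all the way up to where the runtime is invoked (e.g. in main).
```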
An asynchronous operation in Rust is represented as a future: a type which implements the `Future` trait. Its associated `Output` type denotes the type of the result, which is available once the future has completed.
Futures are inert just like any other value, but they know how to make progress. A future makes progress only when polled through its `poll` method, which returns `Poll::Pending` if more work is needed and finally `Poll::Ready(val)` when complete. Futures cannot be polled after completion.
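For reference, this is how the standard library defines the `Future` trait:

```rust
use std::pin::Pin;
use std::task::{Context, Poll};

pub trait Future {
    type Output;

    // Attempts to make progress. The context carries a "waker", which the
    // executor uses to know when to poll this future again.
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>;
}
```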
Most of the time, you won't need to interact with `poll` directly unless you implement `Future` manually [link] or build an executor. Instead, polling is handled automatically by the executor.
A future can represent any asynchronous operation. In practice, a future usually represents an operation involving I/O, IPC, timers or synchronization.
When browsing code and documentation, you'll need to differentiate between asynchronous and synchronous APIs. An asynchronous function returns a future. Typically, it comes in one of the following forms:
```rust
// The return value is a concrete, but unknown, type which implements Future
fn get_user(id: u64) -> impl Future<Output = User> { .. }

// A function with `async` in its signature implies that a future is returned
async fn get_user(id: u64) -> User { .. }

// Returns a custom type that implements Future
fn get_user(id: u64) -> UserFut { .. }

// Somewhere else
impl Future for UserFut {
    type Output = User;
    // ...
}
```
Sometimes a future is wrapped in a smart pointer, e.g. `Box<dyn Future<..>>`. For our purposes, we consider these futures as well [see boxing futures].
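For instance, the `futures` crate provides a `boxed` combinator and a `BoxFuture` type alias; a sketch reusing the `get_user` function from above (note that `boxed` requires the future to be `Send`):

```rust
use futures::future::{BoxFuture, FutureExt};

// Boxing erases the concrete future type behind a trait object.
fn get_user_boxed(id: u64) -> BoxFuture<'static, User> {
    get_user(id).boxed()
}
```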
The `async` keyword denotes an asynchronous expression (a block of code) that evaluates to an anonymous future, similar to how a closure evaluates to an anonymous function. The future's `Output` type is inferred by the compiler.
Within an async expression you can add `.await` to a future (similar to how `?` is added to a result). `.await` suspends execution of the async expression until the future is ready, without blocking the execution of other asynchronous operations. The term "await point" refers to a specific `.await` statement within an async block.
The `async` keyword can be used with standalone functions and methods, closures, as well as plain blocks of code.
An `async` block automatically captures variables from the outer scope by reference. The resulting future must not outlive the captured variables.
```rust
let id = 42;

// A regular async block. We suffix with `_fut` to denote a future.
let friend_count_fut = async {
    // `id` is captured by reference
    let user = get_user(id).await;
    let friends = user.get_friends().await;
    friends.len() // The future's Output type is inferred here
};
```
An `async move` block captures variables from the outer scope and takes ownership of them, similar to a move closure. This tends to be useful in practice, since you don't have to worry about the lifetime of captured references:
```rust
let mut count = 42;
let incr_fut = async move {
    count += 1; // `count` is captured as owned
};
```
Async closures are anonymous functions (closures) that return a future. They are written `async |x, y| { .. }` or `async move |x, y| { .. }`. However, this syntax is currently unstable.
As a workaround, you can wrap an async block inside a regular closure, like so:
```rust
let closure = |id| async move {
    get_user(id).await.get_friends().await.len()
};
let friend_count_fut = closure(42);
```
The `async` keyword in a function declaration denotes an asynchronous function. It returns a future whose `Output` type is equal to the return type of the signature. Async functions use move semantics, similar to `async move` blocks.
```rust
pub async fn get_user(id: u64) -> User {
    Database::fetch_one("user", id).await
}
```
Spelling out `async` in the signature helps users quickly identify async functions, and reduces verbose boilerplate. The asynchronous `get_user` function from above could also be written without `async` in the signature:
```rust
pub fn get_user(id: u64) -> impl Future<Output = User> {
    async move {
        Database::fetch_one("user", id).await
    }
}
```
The `impl Future<Output = User>` return type denotes a concrete type that implements `Future`, which is inferred at compile time. Recall that the concrete type of an async block is anonymous.
You can use the `async` keyword on methods and associated functions too:
```rust
impl User {
    // Async method
    pub async fn get_friends(&self) -> Vec<User> { .. }

    // Async associated function (doesn't use self)
    pub async fn get_common_friends(a: &Self, b: &Self) -> Vec<User> { .. }
}
```
Recall that the resulting future must not outlive its captured references. This also applies to `&self` and `&mut self`. [separate section on lifetimes?]
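A sketch of what this means in practice, reusing the `get_user` and `get_friends` examples from above; this version fails to compile:

```rust
async fn broken() -> usize {
    let friends_fut = {
        let user = get_user(42).await;
        user.get_friends() // The returned future borrows `user`...
    }; // ...but `user` is dropped here: error, `user` does not live long enough.
    friends_fut.await.len()
}
```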
Traits are Rust's primary abstraction mechanism. However, the `async` keyword is not supported in traits. Fortunately, there are a number of workarounds. See the chapter on [writing generic asynchronous code].
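One common workaround is the community `async_trait` crate, whose attribute macro rewrites `async fn` in traits into methods returning boxed futures; a sketch:

```rust
use async_trait::async_trait;

#[async_trait]
trait UserStore {
    // The macro rewrites this into a method returning a boxed future.
    async fn get_user(&self, id: u64) -> User;
}
```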
Combinators are types, methods and macros that enable non-sequential control flow and data transformations. The combinators in this chapter come from the `futures` crate, which provides many useful utilities for async code, including a large collection of combinators.
Two fundamental combinators are the `join!` and `select!` macros. They correspond to conjunction and disjunction, respectively. Both execute two futures concurrently. `join!` awaits both operations and returns their results. `select!` awaits until either of the futures completes and returns its result.
```rust
// Awaits both futures concurrently
let (user1, user2) = join!(get_user(42), get_user(43));

// Awaits either of the two futures, and returns the result of
// whichever completes first
let user = select! {
    user1 = get_user(42).fuse() => user1,
    user2 = get_user(43).fuse() => user2,
};
```
The `.fuse()` method transforms a future into a `FusedFuture`, which is required by `select!`. See the section on fusing [link] for an explanation of why this is necessary.
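Beyond macros, the `futures` crate also offers method-style combinators through the `FutureExt` trait; a sketch using `map` (the `name` field on `User` is assumed for illustration):

```rust
use futures::future::FutureExt;

// `map` transforms the future's output once it completes.
// (Assumes `User` has a `name` field; for illustration only.)
let name_fut = get_user(42).map(|user| user.name);
let name = name_fut.await;
```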
TODO:
TODO:
High level workings and usage. Should not extensively cover lower level mechanisms.
TODO:
TODO: