---
title: "Design meeting 2024-02-08: async Drop"
tags: ["WG-async", "design-meeting", "minutes"]
date: 2024-02-08
discussion: https://rust-lang.zulipchat.com/#narrow/stream/187312-wg-async/topic/Meeting.202024-02-08
url: https://hackmd.io/qsCxElt6SM-riz2pMdKBNA
---
# async drop I: constraints
_Note from Yosh: this is an unfinished first draft of a blog post. The plan was to publish a complete draft and read it - but I ran out of time and didn't want to reschedule the reading session a third time. At the bottom of this post there are a number of unfinished sections containing bullet points. These bullet points have not in fact been verified, so please do not consider them part of the discussion of this post yet. In the final version of this post I will have made sure to elaborate on them._
When we released the async Rust MVP in 2019, we stabilized two things: the
`Future` trait and `async` functions. Last year we stabilized a subset of async
functions in traits (AFITs), which in terms of core language features still
means we're missing: finishing up async traits (dyn traits, anyone?), async
closures, async iteration, and async drop. Of these I consider async Drop to be
the most important one because it currently cannot be polyfilled or worked
around - leading to [all sorts of fun
problems](https://blog.yoshuawuyts.com/tree-structured-concurrency/#what-s-the-worst-that-can-happen).
In this post I want to start by writing down the constraints we have for an
async drop system. I'll start with a brief introduction to how `async Drop` is
expected to work. Then follow up by making the case for why `async Drop` should
be represented in the type system. And finish by talking about the interactions
with other async language features.
## A brief introduction to async drop
In this post we'll start with the design for the async drop trait Sabrina Jewson
proposed in [her post](https://sabrinajewson.org/blog/async-drop). She made a
really good case for why this is the right design, and I recommend reading her
post in its entirety if you haven't already.
```rust
trait AsyncDrop {
    async fn drop(&mut self);
}
```
The idea is that this trait can then directly be implemented on types. Say we
have a type `Cat` which prints a message when dropped; we could write that as
follows:
```rust
struct Cat {}

impl AsyncDrop for Cat {
    async fn drop(&mut self) {
        println!("The cat has plopped").await;
    }
}
```
And using it would work something like this, where at the end of an async scope
destructors are run.
```rust
async fn nap_in_place() {
    let cat = Cat {}; // 1. Construct `cat`
    // 2. `cat` is dropped here, printing:
    //    "The cat has plopped"
    // 3. The function yields control back to the caller
}
```
There is a lot more to this system, and we'll get into the details of that
throughout the remainder of this post. But at its core, this is the feature
we're considering introducing to async Rust - and I wanted to make sure people
reading it had at least a sense of what we're trying to achieve before we dive
into the details.
## The constraints of linearity
Last year
[Tyler](https://tmandry.gitlab.io/blog/posts/2023-03-01-scoped-tasks/) and
[Niko](https://smallcultfollowing.com/babysteps/blog/2023/03/16/must-move-types/#uses-for-must-move)
showed why if we want the async equivalent of `thread::scope`, we need both
async destructors as well as futures which can't be forgotten:
> Second, parallel structured concurrency. As Tyler Mandry elegantly documented,
> if we want to mix parallel scopes and async, we need some way to have futures
> that cannot be forgotten. The way I think of it is like this: in sync code, when
> you create a local variable x on your stack, you have a guarantee from the
> language that its destructor will eventually run, unless you move it. In async
> code, you have no such guarantee, as your entire future could just be forgotten
> by a caller. "Must move" types (with some kind of callback for panic) give us a
> tool to solve this problem, by having the future type be
> ?Drop — this is effectively a principled way to integrate completion-style
> futures that must be fully polled.
I showed via both my [first](https://blog.yoshuawuyts.com/linearity-and-control/) and
[second](https://blog.yoshuawuyts.com/linear-types-one-pager/) post on
linearity that the property we want isn't actually types which can't be
_dropped_, but types which can't be _forgotten_. I later found out
[Sabrina](https://sabrinajewson.org/blog/async-drop#linear-types) had actually
had this exact insight about a year earlier. To my credit though, I feel like I
did meaningfully move the conversation forward in my second post by enumerating the
rules we should uphold to encode the "drop is guaranteed to run" interpretation
of linearity.
We can reason through this to arrive at a constraint:
1. In order to achieve "task scopes" we have to combine linearity + async destructors
2. The only system of linearity we know how to encode, and can see a path
   towards integrating into the language [^open-linearity-questions], is "drop is
   guaranteed to run"
3. This system works because we disallow all instances where drop could not be
   guaranteed to run - guaranteeing drop will run
4. When async drop is combined with linearity we have to guarantee destructors
   will always be run
5. **Async drop cannot introduce any new or unaccounted for scenarios where destructors are not
   guaranteed to run**
[^open-linearity-questions]: There are some questions about how to extend
linearity to all language items, but that all seems pretty solvable. That's more
of an integration question than a fundamental question about the core system.
We can invert this conclusion too: if we arrive at a design for async drop which
introduces new cases where destructors are not run, we cannot use it for scoped
tasks. That means that a design for async drop should prove that it doesn't
introduce any new or unaccounted for cases where destructors aren't run. And the
most practical way to ensure that is for the semantics of async destructors to
closely follow those of non-async destructors.
There are other, practical reasons for why we would want to ensure that async
destructors can't be prevented from running in ways unique to async Rust. But
I'm choosing the type-system interactions with linearity to enable task scopes
here, because they are both clear and strict. It means that if we introduce new
ways in which destructors aren't run, we'd be closing the door on a particularly
desired feature - and that's not something we want to do.
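To make this gap concrete, here is a minimal sketch of the hole linearity has to close in today's Rust: a caller can simply forget a future, and its destructor will never run (the function here is purely illustrative).

```rust
use std::mem;

async fn scoped_work() {
    // imagine this future borrows from a hypothetical task scope on the
    // caller's stack, the way `thread::scope` closures borrow locals
}

fn main() {
    let fut = scoped_work();
    // Nothing in today's Rust stops a caller from forgetting a future: its
    // destructor never runs. This is exactly the hole that the "drop is
    // guaranteed to run" reading of linearity has to close.
    mem::forget(fut);
}
```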
## Drop glue and forwarding of implementations
Rust's drop system provides two key properties:
1. Destructors are always run when a type becomes unavailable (modulo leaking)
2. Destructors are directly tied to the lifecycle of the object
This works particularly well with Rust's move semantics, which ensures that the
code to cleanup any object is always provided by that object. When a type is
nested in another type, the encapsulating type ensures that the inner type's
destructors are run. The code generated for this is what we canonically call
"drop glue", and it does not have a representation in the type system. Here's an
example of how this works:
```rust
// Our inner type implementing `Drop`.
struct Inner {}

impl Drop for Inner {
    fn drop(&mut self) {
        println!("inner type dropped");
    }
}

// Our outer type containing our inner type.
// The inner destructor is forwarded by the outer type
// via "drop glue".
struct Outer(Inner);

fn main() {
    // Construct the types and drop them.
    let outer = Outer(Inner {});
    // `outer` is dropped, prints "inner type dropped"
}
```
The type `Inner` implements `Drop` in this example, but the type `Outer` [does
not](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=1ac6a1b95c97984c79c07307a1948272).
That is the difference between "drop glue" and a `Drop` impl. Drop glue is
automatically inserted by the compiler: it calls the `Drop` impls of a type's
fields without the type itself implementing `Drop`. But in
the type system we cannot write a bound which says: "this type implements drop
glue". Instead Rust users are expected to assume drop glue may exist on any
type, and so there is no need to ever really check for its presence.
```rust
fn drop_in_place(t: impl Drop) {}

fn main() {
    drop_in_place(Inner {}); // ✅ `Inner` implements `Drop`
    drop_in_place(Outer(Inner {})); // ❌ `Outer` does not implement `Drop`
}
```
Now if we convert this to our `AsyncDrop` impl, we would likely want to write it
somewhat like this.
```rust
struct Inner {}

impl AsyncDrop for Inner { // using `AsyncDrop`
    async fn drop(&mut self) { // using `async fn drop`
        println!("inner type dropped").await; // using async stdio APIs
    }
}

struct Outer(Inner);

async fn main() { // 1. Note our hypothetical `async main`
    let outer = Outer(Inner {}); // 2. Construct an instance of `Outer`
    // 3. Drop the instance, printing a message
}
```
Now the question is: could we just write it like this and insert async drop
glue, or would we run into trouble if we attempted to do that? To learn more
about that, let's take a look at how we expect types with async destructors to
be called in function bodies, and when they don't work.
## Async destructors can only be executed from async contexts
Okay, let's do a little whirlwind tour of different scenarios where we might try
to drop a type implementing `AsyncDrop`, and discuss whether we can do so. In
order to do that let's take our earlier definition of `Cat` which asynchronously
prints out a message when dropped.
```rust
struct Cat {}

impl AsyncDrop for Cat {
    async fn drop(&mut self) {
        println!("The cat has plopped").await;
    }
}
```
Now the first example we used was an async function. At the end of the scope we
drop the type and run the destructors. This is an async destructor running at
the end of an async context, and is the base case we expect to work.
```rust
async fn nap_in_place() {
    let cat = Cat {}; // 1. Construct `cat`
    // 2. `cat` is dropped here, printing:
    //    "The cat has plopped"
    // 3. The function yields control back to the caller
}
```
If you're interested in how the state machines for something like this would
desugar, I recommend reading Eric Holk's posts describing the low-level state
machine for async drop ([part
1](https://theincredibleholk.org/blog/2023/11/08/cancellation-async-state-machines/),
[part
2](https://theincredibleholk.org/blog/2023/11/14/a-mechanism-for-async-cancellation/)).
We know that as long as we're in an async context, we can generate the right
state machine code for async destructors. But the converse also holds: if we're not in
an async context, we can't generate the state machine for async destructors.
Which means that **dropping types which implement async `Drop` is only allowed
in async contexts**.
```rust
/// A non-async function constructing an instance of `Cat` and dropping it straight away
fn nap_in_place() {
    let cat = Cat {}; // ❌ Compiler error: `Cat` can only be dropped in an async context
}
```
Theoretically it might be possible to upgrade async destructors to non-async
destructors at runtime if we detect they're held in non-async contexts. We could
do this by wrapping them in a `block_on` call, but that would risk causing
deadlocks. We can already write `block_on` in destructors today, but generally
choose not to because of this risk. As such **any useful formulation of async
`Drop` cannot rely on runtime mechanisms to cover up gaps in the static
guarantees** [^azure-no].
[^azure-no]: If async drop carried a risk of introducing deadlocks we're
unlikely to use it in Azure, and may potentially even choose to lint against it.
In order for us to use async `Drop`, it's not enough for it to just be available -
it must also be useful. If we can't guarantee that, then it might be
preferable not to have an async destructor feature at all.
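For reference, this is the kind of `block_on`-in-`Drop` workaround people can write today (a sketch using the `futures` crate's `block_on`; the `Connection` type is hypothetical):

```rust
use futures::executor::block_on;

// A hypothetical resource whose cleanup is naturally async.
struct Connection {}

impl Drop for Connection {
    fn drop(&mut self) {
        // Blocking inside a destructor "works", but if this drop runs on an
        // executor thread, blocking that thread can stop the executor from
        // polling the very futures the cleanup is waiting on - a deadlock.
        block_on(async {
            // hypothetical async cleanup, e.g. sending a goodbye message
        });
    }
}
```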
## Async destructors and control points
A common sentiment about async drop designs is that they should not introduce
any "hidden `.await` points". The concern is that this would run counter to
the design goals of having `.await` in the first place, and so we need to be
careful to ensure that we don't introduce any unexpected control flow. However
it's worth investigating what "control flow" exactly means and which `.await`
points are already introduced today, and using that to formulate the concrete
constraints this imposes on the design of async drop.
In order to define where async destructors may be executed, we first have to
identify and classify where values can be dropped today. For this I like to use
the language of [control
points](https://blog.yoshuawuyts.com/linearity-and-control/#control-points):
locations in source code where control may be yielded from the function to the
caller. Not all control points provide the same semantics, so we can further
classify control points into three distinct categories:
- **Returning operations**: hand control back to the function's caller, ending the
function. Examples: `return`, `?` in function scopes, `panic!`, and the last
expression in a function.
- **Breaking operations**: hand control back to an outer scope, but don't directly return
from the function. Examples: `break`, `continue`, `?` in `try {}`
scopes, and the last expression in a block scope.
- **Suspending operations**: hand control back to the caller, but the caller can
choose to hand control back to the callee again later. If the caller does not
hand control back, the function ends. Examples: `.await`, and `yield`.
To put this theory into practice, here is an example of a fairly typical async
function you might see in the wild. It takes a path, does some IO, parses it,
and then returns either a value or an error.
```rust
async fn read_and_parse(path: PathBuf) -> io::Result<Table> {
    // 1. futures start suspended, and may not resume again
    let file = fs::open(&path).await?; // 2. `.await` suspends, and may not resume again
    // 3. `?` propagates errors to the caller
    // 4. `fs::open` may panic and unwind
    let table = parse_table(file)?; // 5. `?` propagates errors to the caller
    // 6. `parse_table` may panic and unwind
    Ok(table) // 7. return a value to the caller
}
```
To people who are not used to thinking about control points, the number of
control points in this function is likely surprising: that's seven control
points in a three-line function. This ratio is also why I don't believe any
formulation of linearity or finalization which doesn't rely on destructors will
ever be a practical alternative. Rust's `Drop` system has so many contact
points in function bodies that manually describing them without violating
lifetime invariants would be highly unergonomic.
The most surprising control point in this function is likely the first one:
destructors may be executed before the function even has had an opportunity to
run. For all intents and purposes this can be considered a "hidden `.await`
point", but in practice it doesn't appear to be a problem. Why is that?
```rust
async fn read_and_parse(path: PathBuf) -> io::Result<Table> {
    // `path` may be dropped before the function has even begun executing
}
```
The reason is that we have an intuitive understanding that as soon as a function
exits, we run destructors. If a function never starts and immediately exits,
that's semantically equivalent to the function starting and immediately dropping
all values. Despite introducing an "implicit `.await`", this is not a problem
because we're immediately exiting the function, which is when we expect
destructors to execute. That's the same logic which allows returning operations
such as `?` and `return` to exit a function and trigger async destructors. If we
want to allow async destructors to run in the face of `panic` and `return`, then
the rule we must uphold is: **async destructors are run at the end of async
function scopes**.
So when people talk about worrying about hidden `.await` points, what *do* they
actually mean? For that we can adapt one of Sabrina's examples:
```rust
async fn hostile_nap_takeover() {
    let mut chashu = Cat {}; // 1. Construct the first instance of `Cat`
    let nori = Cat {}; // 2. Construct the second instance of `Cat`
    mem::replace(&mut chashu, nori); // 3. Assign the second instance to the location of the first instance
    // 4. That will trigger the first instance's destructor to run
    // 5. The second instance's destructor is run at the end of the function scope
}
```
The issue is that when we call `mem::replace`, async destructors will be run
without an associated `.await` point. This is not at the end of any function
scope (or block scope, we'll get to that in a sec) - but right in the middle of
a function - which will continue executing after the destructor has finished
running. This would not be an issue if `mem::replace` was an async function -
which would require an `.await` point to be called. But even that only has
limited applicability because Rust also has operators, meaning we could rewrite
the above like this instead:
```rust
async fn hostile_nap_takeover() {
    let mut chashu = Cat {}; // 1. Construct the first instance of `Cat`
    let nori = Cat {}; // 2. Construct the second instance of `Cat`
    chashu = nori; // 3. Assign the second instance to the location of the first instance
    // 4. That will trigger the first instance's destructor to run
    // 5. The second instance's destructor is run at the end of the function scope
}
```
This is an example we want to prevent, and so this allows us to formulate
another constraint: **Async destructors can only be executed in-line
at an `.await` point**. If that isn't a constraint, then the previous
examples would be allowed, which we know we don't want. We can take this rule
and apply it to in-line block scopes as well. Taking the `Cat` type again, we
can imagine a function where a cat goes out of scope at some block.
```rust
async fn nap_anywhere() {
    let mut chashu = Cat {}; // 1. Construct an instance of `Cat`
    {
        let chashu = chashu; // 2. Ensure the instance is moved into the block
        // 3. ❌ The instance would be dropped here
    }
}
```
This function violates the constraint we just described: the value `chashu` would be
dropped in the middle of a function, causing async destructors to be run without
any associated `.await` points. Luckily we should be able to abide by this rule
by converting the block scope to an async block scope, and `.await`ing that.
This would cause the destructor to run at an `.await` point, resolving the issue.
```rust
async fn nap_anywhere() {
    let mut chashu = Cat {}; // 1. Construct an instance of `Cat`
    async move {
        let chashu = chashu; // 2. Ensure the instance is captured by the block
        // 3. ✅ The instance would be dropped here
    }.await;
}
```
This works because it is semantically equivalent to defining an async function
and moving the value to that, which we've already established would be
permissible. That enables us to describe the following constraint: **It must be
possible for async destructors to be run at the end of async block scopes**.
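To illustrate that equivalence, here is a sketch of the same example with the block pulled out into its own async function, reusing the hypothetical `Cat` and its `AsyncDrop` impl from earlier:

```rust
// Moving the value into a separate async function is morally the same as the
// `async move { .. }.await` block above: the value is dropped at the end of
// an async function scope, behind an explicit `.await` point in the caller.
async fn nap_here(chashu: Cat) {
    // ✅ `chashu` is dropped at the end of this async function scope
}

async fn nap_anywhere() {
    let chashu = Cat {}; // 1. Construct an instance of `Cat`
    nap_here(chashu).await; // 2. Its destructor runs behind this `.await` point
}
```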
This enables us to synthesize the following rules for when async destructors can
be executed:
- When an async function returns
- Inside another async function which is `.await`ed
- At the end of an async block scope in a function
As mentioned at the start of this section, Sabrina Jewson has done an excellent
job covering the challenges of async drop. Where we've landed with these
constraints is most similar to what she described as ["abort now: always
await"](https://sabrinajewson.org/blog/async-drop#abort-now). This opens up
questions about design and ergonomics, which I believe are important, and would
like to engage with in a follow-up post.
## Liveness and ownership
Earlier in this post we established that types implementing async drop cannot be
dropped in non-async contexts. On its face we might be inclined to extrapolate
this rule and say that types implementing async drop cannot be held in non-async
contexts *at all*. That isn't quite true, because it's possible for values to
be live in a non-async context without ever being dropped in that same context.
One example is synchronous constructor functions.
```rust
struct Cat {}

impl AsyncDrop for Cat { .. }

impl Cat {
    fn new() -> Self {
        Self {} // ✅ Implements async Drop, is held live in a non-async function
    }
}
```
Here we construct a new instance of `Cat`, which implements `AsyncDrop`, inside
of a non-async context. And that is fine, because at no point is
there a risk of destructors being run. Intuitively we might be inclined to say
that as long as types implementing async drop aren't held live across control
points, we're fine. But that too would be too restrictive, as shown by the
following example.
```rust
/// A function which takes `Cat` by reference and then panics
fn screm(cat: &mut Cat) {
    panic!("I scream, you scream"); // `cat` is live when this panic happens
}

async fn screamies_time() {
    let mut cat = Cat::new(); // 1. Construct a new instance of `Cat`.
    screm(&mut cat); // 2. Pass a mutable reference to a function
    // 3. ✅ The function panicked, run the instance's destructor here.
}
```
Again, this is fine because the destructor for `cat` will never be run inside of
the function `screm` - and so the fact that even a mutable reference is held
live across a control point in a non-async function is okay. It's only when
owned values implementing async drop are held live across control points in
non-async contexts that we run into trouble.
```rust
/// A function which takes `Cat` by-value and then panics
fn screm(cat: Cat) {
    panic!("I scream, you scream"); // `cat` is live when this panic happens
    // ❌ Compiler error: `Cat` must be dropped in an async context
}

async fn screamies_time() {
    let cat = Cat::new(); // 1. Construct a new instance of `Cat`.
    screm(cat); // 2. Pass the instance by value to a function
}
```
```
That surfaces the following rules: It's always possible for values implementing
async Drop to be live in non-async contexts as long as they are never held
across control points. And it's also always possible for *references* to types
implementing async `Drop` to be held live in non-async contexts, even across
control points. However **it is never possible for owned values implementing
async `Drop` to be held live in a non-async context across control points.**
## Bounds for drop glue
In the Rust stdlib we have a type
[`ManuallyDrop`](https://doc.rust-lang.org/core/mem/struct.ManuallyDrop.html)
which can be used to more finely control when values are dropped. For example
`Arc` uses `ManuallyDrop` a number of times internally to manipulate the reference
counters without actually losing data. A simplified version of its signature
looks something like this:
```rust
pub struct ManuallyDrop<T: ?Sized> { .. }

impl<T: ?Sized> ManuallyDrop<T> {
    /// Create a new instance of `ManuallyDrop`
    pub fn new(value: T) -> ManuallyDrop<T> { .. }

    /// Drop the value contained in `ManuallyDrop`
    /// Safety: you may only call this function once.
    pub unsafe fn drop(&mut self) { .. }
}
```
The bounds for the type it operates on are `T: ?Sized`, not `T: Drop`. That is
because this type is happy to operate on any drop impl including drop glue,
and drop glue is not visible in the type system. So what happens if we want to
write an async version of this type? Presumably `fn drop` should become `async fn drop`.
But what should the bounds on `T` be?
```rust
/// A hypothetical async drop compatible version of `ManuallyDrop`.
pub struct AsyncManuallyDrop<T: /*bounds*/> { .. }

impl<T: /*bounds*/> AsyncManuallyDrop<T> {
    pub fn new(value: T) -> AsyncManuallyDrop<T> { .. }

    pub unsafe async fn drop(&mut self) { .. } // Note the `async fn` here
}
```
Presumably the bounds need to be able to express something like: `T: ?Sized +
AsyncDropGlue`. Not `T: ?Sized + Drop`, because that would refer to a
concrete impl. And it can't just be `T: ?Sized` either, since that only implies the
existing (non-async) drop glue. The question of how we can ergonomically surface
these bounds is a design question we won't go into right now. This example
exists to show that
**async drop glue needs to be able to be surfaced to the type system so it can
be used in bounds**.
## Mixing async and non-async drop glue
Once we start considering the presence of async drop glue, we might wonder
whether it needs to be mutually exclusive with non-async drop glue. I don't
believe the two can be mutually exclusive, because it would interact badly with
synchronization primitives such as `Arc`. Take for example the following type:
```rust
struct Inner {}

impl AsyncDrop for Inner {
    async fn drop(&mut self) {
        println!("inner type dropped").await;
    }
}

struct Outer(Inner, Arc<usize>);
```
The type `Outer` here carries both an `Arc` which implements `Drop`, and `Inner`
which implements `AsyncDrop`. For this to be valid it needs to be able to
implement both async and non-async drop glue. If we make both kinds of drop glue
mutually exclusive, then we would need to define a new version of `Arc` which
itself doesn't perform any async operations - but does implement `async Drop`
just to satisfy this requirement. This seems like a particularly harsh direction
with no clear upsides, which brings us to the following constraint: **it must be
possible for a type to implement both async and non-async drop glue**.
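For comparison, the fully synchronous version of this type already works today purely through drop glue - the constraint simply asks that async drop glue be allowed to compose with it in the same way. A runnable sketch:

```rust
use std::sync::Arc;

struct Inner {}

impl Drop for Inner {
    fn drop(&mut self) {
        println!("inner type dropped");
    }
}

// The sync analogue of the `Outer` above: the compiler-generated drop glue
// runs `Inner`'s destructor and `Arc`'s destructor (which decrements the
// refcount), without `Outer` implementing `Drop` itself.
struct Outer(Inner, Arc<usize>);

fn main() {
    let _outer = Outer(Inner {}, Arc::new(0));
    // dropping `_outer` prints "inner type dropped" and releases the `Arc`
}
```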
## Cancellation cancellation
In his [first
post](https://theincredibleholk.org/blog/2023/11/08/cancellation-async-state-machines/#cancellation-cancellation)
on async cancellation handlers, Eric asks: "what behaviors are possible if a cancelled
future is cancelled again?" In it he presents the following three options:
1. Attempting to cancel the execution of an async destructor is statically disallowed
2. Attempting to cancel the execution of an async destructor may succeed (recursive)
3. Attempting to cancel the execution of an async destructor results in a no-op (idempotent)
To make this a little more concrete, we can write a code example. In it we'll
author some type implementing `AsyncDrop` which takes a little bit of time to
complete (100 millis). We'll then drop that in an inner scope, which starts its
async destructor. Then in some outer scope higher up on the stack we trigger a
cancellation after a much shorter period. What we're asking is: what should
happen in this scenario?
```rust
struct Cat {}

impl AsyncDrop for Cat {
    async fn drop(&mut self) {
        sleep(Duration::from_millis(100)).await;
        println!("Napped for 100 millis").await;
    }
}

async fn main() {
    async {
        async {
            let cat = Cat {}; // 1. Construct an instance of `Cat`
            // 2. Drop the instance of `Cat`, running its destructor
        }.await; // 3. This future will now take 100 millis to complete
    }.timeout(Duration::from_millis(10)).await; // 4. But we're cancelling it after just 10 millis
}
```
It's key to remember that this is just an example, and necessarily simplified. A
cancellation may be triggered anywhere in the logical call stack, during the
execution of any async destructor. Meaning this is fundamentally a question
about composition, and we should treat the various components involved as black
boxes.
In Eric's post he rejected the option to statically disallow the cancellation of
executing async destructors, for similar reasons as we've just outlined:
callers higher up on the stack do not know about the internals lower down
the stack. Eric did not believe this was feasible, and I agree. It's unclear what
analysis we would need to allow this, and even if we could figure it out the
resulting system would likely still be limited to the point of impracticality.
The second option would be to allow async destructors to be cancelled. This
would violate the first constraint we declared: async destructors may not
introduce any new ways in which destructors can be prevented from running, as
that would close the door on scoped tasks. We can see this in the example:
if the `timeout` stops the execution of `async drop`, it will never reach the
`println!` statement.
The third option is the only feasible behavior we can adopt which doesn't
violate any of the constraints we've discovered so far. When the `timeout`
triggers, rather than cancelling the `async drop` impl it waits for it to finish.
That means that: **attempting to cancel the execution of an async destructor
should result in a no-op.** This behavior is what Eric called _idempotent_,
and it comes with some additional challenges:
> Admittedly, this might take additional rules, like we may want to declare it
> to be undefined behavior to not poll a cancelled future to completion. Scoped
> tasks would likely need this guarantee […]
TODO: rewrite this paragraph:
That is a great point: with regular `Drop` we cannot call the method directly -
instead we have to pass the value to something like the `drop` function, which
is basically just a no-op. The `ManuallyDrop` type does provide a `drop`
function, but that is marked `unsafe` and the user is on the hook for upholding
the invariant that `drop` is called at most once. For `AsyncDrop` the rules
should be similar: the only way to obtain an async destructor to manually poll
should be via a built-in (e.g. `AsyncManuallyDrop`). The additional safety
invariant for that should be that if it is used to obtain the async drop future
in a non-async context, the caller guarantees it will run that future to
completion. This is needed because at some point we do need to map async back
to sync, but as we stated we cannot introduce conditions where async destructors
would not be run. `unsafe` invariants are the only way we can do that.
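As a reminder of the sync precedent this builds on, here is a runnable sketch of both routes that exist today: the safe by-value `drop` function (which is essentially an empty function), and the `unsafe` `ManuallyDrop::drop` escape hatch:

```rust
use std::mem::ManuallyDrop;

// `std::mem::drop` is essentially this: an empty function that takes its
// argument by value, so the destructor runs when the argument goes out of
// scope at the end of the (empty) body.
fn drop_by_value<T>(_value: T) {}

fn main() {
    drop_by_value(String::from("dropped via the safe, by-value route"));

    let mut slot = ManuallyDrop::new(String::from("dropped via the unsafe route"));
    // SAFETY: we drop the contained value exactly once and never use it again.
    unsafe { ManuallyDrop::drop(&mut slot) };
}
```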
TODO: summarize the constraint this poses.
## Conclusion
In this post we've surfaced the following constraints with respect to any potential async drop design:
1. Async drop cannot introduce any new or unaccounted for scenarios where destructors are not guaranteed to run
2. Dropping types which implement async `Drop` is only allowed in async contexts
3. Any useful formulation of async Drop cannot rely on runtime mechanisms to cover up gaps in the static guarantees
4. It must be possible for async destructors to be run at the end of async function scopes
5. Async destructors can only be executed in-line at an `.await` point
6. It must be possible for async destructors to be run at the end of async block scopes
7. It is never possible for owned values implementing async Drop to be held live in a non-async context across control points
8. Async drop glue needs to be able to be surfaced to the type system so it can be used in bounds
9. It must be possible for a type to implement both async and non-async drop glue
10. Attempting to cancel the execution of an async destructor should result in a no-op
----
# unimplemented sections
⚠️ _These unimplemented sections are mostly about generics and which restrictions
emerge once we try and interact with the trait system. I understand this is
particularly relevant for the questions we have about async iterator and async
closures. But I'd like to punt discussion on that until these sections have been
spelled out and have examples to substantiate the points they make. That will
make for a better conversation, plus there is plenty in this post already to
discuss._ ⚠️
## TODO: concrete impls and generics
- We should not pass an `AsyncDrop` impl to a type like `Vec` as-is
- Assume data is stored in a `ManuallyDrop`
- It has a manual `Drop` impl, which is not guaranteed to run the `AsyncDrop` impl
- Rule: types implementing async drop can only be passed to bounds which expect it
- Rule: if a bound does not state it wants an `AsyncDropGlue` impl, a type implementing it cannot be passed to it
- reason: otherwise there is nothing preventing the manual drop impl from
only executing the sync drop glue, and yeeting the rest of it. Existing
code can do that today, and we cannot say it is now doing unsound things.
It needs new bounds for that reason.
- reason: `Vec<T: !Leak>` may have a valid implementation for any `T:
DropGlue`, but would not be valid for any `T: AsyncDropGlue`
- If we want to run async destructors, we should have a `Vec<T: AsyncDropGlue>`
## TODO: async drop glue can be a noop
- it's okay if we say `+ AsyncDropGlue` and then there isn't actually any
- we've established that async + non-async drop glue should be able to co-exist
- in non-async rust we cannot name this bound, so this doesn't come up
- this only comes up here because we can name the bound
## TODO: async drop bounds are going to be everywhere, and that's going to be noisy
- most async leaf types will want to be able to provide `AsyncDrop` - including all tasks
- every trait bound in async code will want to `+ AsyncDropGlue`
- that's going to mean a lotttt of bounds if that ends up going through
- if we want it to be practically usable, we're going to have to streamline it
- The best way to achieve that will be to somehow make `+ AsyncDropGlue` implied for async trait bounds
- e.g. `T: AsyncRead` should imply `+ AsyncDropGlue`
- We lose nothing on that because async drop glue can be a noop, just like regular drop glue can be a noop
- But we have everything to win by doing that so that the entire async ecosystem doesn't become a mess of annotations
- In a strict sense this is a restriction (covariant? it's some variance thing)
- if `T: AsyncRead` implies `+ AsyncDropGlue`, that's a tighter bound than if it doesn't imply that
- however, that may not be the end of the world if we indeed assert that async drop glue can be a noop
- "hey this bound now expects async drop glue" - will always be true.
Meaning that while theoretically it's a tighter bound, in practice the bound
will work for any type.
## TODO: async functions
- If an async function holds a type which implements async drop glue, it becomes `impl Future + AsyncDropGlue`
- We can combine that with the previous restriction: "types implementing async drop can only be passed to bounds which expect it"
- This comes at the conclusion: any type which takes a future that can work with async drop glue needs to state so in the bound
---
# Discussion
## Attendance
- People: TC, Vadim Petrochenkov, tmandry, Daria Sukhonina, Yosh, eholk, Vincenzo
## Meeting roles
- Minutes, driver: TC
## Passing mutable references to sync functions
eholk: The example copied below should probably be illegal.
```rust
async fn hostile_nap_takeover() {
    let mut chashu = Cat {}; // 1. Construct the first instance of `Cat`
    let nori = Cat {}; // 2. Construct the second instance of `Cat`
    mem::replace(&mut chashu, nori); // 3. Assign the second instance to the location of the first instance
    // 4. That will trigger the first instance's destructor to run
    // 5. The second instance's destructor is run at the end of the function scope
}
```
The reason is we can't run the `chashu` destructor before passing it to `mem::replace`, since if we did we'd effectively be passing in a pointer to uninitialized memory. We need `mem::replace` to make the decision about whether to destruct `chashu`, but `mem::replace` is synchronous so it cannot await `chashu`'s destructor. The right answer is probably to annotate `mem::replace` so that the first argument can't be something that has an `async Drop`.
On the other hand, the second version that uses assignment instead would be perfectly fine because we'd be running destructors in an async context.
Yosh: Clearly the code should be rejected.
eholk: The destructor will run in the scope of `mem::replace`.
TC: The `mem::replace` case seems a subset of a larger question. How would the design proposed here interact with manual poll implementations? I.e. how would one write the kind of low-level code that's necessary for writing combinators, executors, etc. if everything needs to be in async blocks?
Yosh: That is a critical point. I didn't quite get to this in the document. We would need an `async` version of `ManuallyDrop`. I need to spell this out carefully.
tmandry: What's the rule to prevent `mem::replace` from doing this? The rule in the document doesn't seem sufficient to prevent this case.
Yosh: Since you can't drop an `async` type in a sync context, that covers it.
Yosh: Stepping back, I'm trying to show in this post why things have to be a particular way and what happens if we do the other thing.
## Are linearity and drop tied together?
TC: In talking with CE, I've heard it proposed that linearity may likely be implemented in a way more like the borrow checker rather than by carefully trying to exclude all leaking operations. In that world, do we still need to preserve the invariant mentioned in the document about async drop needing to carefully not add the possibility of any breaks in linearity?
tmandry: This is maybe similar to what Yosh is proposing; there seem to be some control-flow things that would be implied or required by the rules Yosh is putting forward.
Yosh: Linearity adds restrictions on what is possible. I don't see how we could do linearity without threading it through the type system. I'm not sure what a control-flow based mechanism would look like.
eholk: It does seem like a type system thing.
CE: We have a trait in the solver called `Destruct`. It tracks whether a type has a `Drop` impl or contains something with a `Drop` impl. `const_precise_live_drops` is what allows the borrow checker to bound only the right things. This proposal would require something similar.
CE: Let's call it `Destruct` for consistency, not `DropGlue`.
## Comparison with `poll_drop_ready` and similar
TC: There is a proposal...
https://github.com/withoutboats/rfcs/blob/poll-drop-ready/text/0000-poll-drop-ready.md
...to extend the `Drop` trait as follows:
```rust
trait Drop {
    fn drop(&mut self);

    fn poll_drop_ready(&mut self, cx: &mut Context<'_>) -> Poll<()> {
        Poll::Ready(())
    }
}
```
(Variations on this are possible.)
The idea here is that in async contexts, `poll_drop_ready` is polled to completion before invoking `drop`.
Presumably this doesn't achieve the connection to linearity this document is going for, but maybe there are other paths there? It'd be good to compare these approaches against our goals and desired properties.
Yosh: I didn't really want to evaluate any concrete designs in this post; I mainly wanted to work through examples to determine constraints. But yeah, that direction of designs introduces new cases where destructors won't run - which Eric covered in his second post. So on the basis of that we'd probably need to reject it.
Daria: That would require storing the state needed for async drop inside the structure itself - for example `Vec` would contain an index to the element we're currently async dropping, unless we poll every element kind of like in `join!`. Wouldn't that have clear disadvantages?
eholk: Along with what Daria's saying, this is an example where the `async fn drop` formulation has a clear advantage over the `fn poll_drop_ready` formulation. The `async fn drop` function returns a future and that gives you a place to hold the state needed to, for example, keep track of where we are in dropping all the elements held in a `Vec`. With `poll_drop_ready`, you have to build the state needed to do async cleanup in the `Vec` to start with.
TC: It's interesting that in the case of `AsyncIterator`, we've found that having a second state machine causes problems (e.g. with cancellation). But in this case, the additional place to put state turns out to be useful.
eholk: +1.
## Liveness and ownership
tmandry: A couple of questions came up for me in this section:
* Does the doc assume that async destructors will happen when unwinding (panicking) through an async context?
* Vadim: this should be possible (suspending/resuming during unwinding), but we need to catch the exception first, then store it into the coroutine, then rethrow the stored exception after resuming. Unwinding is a "whole thread" process and has a `#[thread_local] static` component. Catching the exception "localizes" it and allows it to be captured by the coroutine.
* How will we deal with holding generic types in synchronous scopes, when those types may or may not implement async Drop?
tmandry: Summary of Yosh's response: You need `T: AsyncDropGlue` to pass it. Effectively `T: ?Drop`.
(The meeting ended here.)
## Main Goals -- what do we want to enable?
eholk: Async drop or cleanup can have a couple of use cases.
For example, this doc seems to have enabling scoped tasks as a key use case. A weaker goal would be enabling best effort async cleanup. The scenarios we want to support have a big impact on requirements, so we should decide what scenarios are important.
For example, I don't see a way to enable scoped tasks without either linear types or implicit await points. But if we're happy with best effort async cleanup, we have a lot more flexibility here.
## Ergonomics of only dropping at existing await points?
eholk: If we say for example `x = y` cannot run an async destructor on `x` because we need an await point to run destructors, does this mean we're going to give up a lot of the ergonomics of `async Drop` and end up with something with ergonomics more like linear types?
## Defer blocks?
Daria: Undroppable types are messy to work with. To convince the compiler that something never drops you have to use `--release`, as with the `#[no_panic]` attribute. Maybe defer blocks are ~~our salvation~~ a path forward to consider?
## How do runtimes manage tasks?
eholk: One of the things we'll have to do to keep destructors from leaking is disallow holding them in either something that gives shared ownership (e.g. `Arc`), or something that gives mutability (e.g. `RefCell` or `Mutex`). Is this something runtimes can do?
I'm assuming the answer is that runtimes will have to make liberal use of `ManualAsyncDrop`.
Daria: `ManualAsyncDrop` could be safe to use because it should be safe to leak a pinned box to some suspended 'static future.
## More Linearity and async Drop
eholk: If the language already had linearity, how would we design `async Drop`? It seems like maybe these are separable features, and tying them together could lead to weird incongruencies where we have linearity for async types but not in the whole language.
Could the rules proposed here instead be repurposed for general linear types? For example, an effect of only running `async Drop` at the end of `async` scopes is that you can't do `async_x = y`.
If we had `linear T`, it seems like we'd end up with the same rule.
## Why now?
eholk: We've had concerns that shipping other features, like async closures, might close doors for what we can do with `async Drop`.
Do we have concrete examples of where certain `async Drop` designs would not work with certain async closure designs?
CE: It's really about designing APIs that are flexible enough to take `AsyncDestruct` bounds, probably not async closures directly.