
async drop I: constraints

Note from Yosh: this is an unfinished first draft of a blog post. The plan was to publish a complete draft and read it - but I ran out of time and didn't want to reschedule the reading session a third time. At the bottom of this post there are a number of unfinished sections containing bullet points. These bullet points have not in fact been verified yet, so please do not consider them part of the discussion in this post. In the final version of this post I will have made sure to elaborate on them.

When we released the async Rust MVP in 2019, we stabilized two things: the Future trait and async functions. Last year we stabilized a subset of async functions in traits (AFITs), which in terms of core language features still means we're missing: finishing up async traits (dyn traits, anyone?), async closures, async iteration, and async drop. Of these I consider async Drop to be the most important one because it currently cannot be polyfilled or worked around - leading to all sorts of fun problems.

In this post I want to write down the constraints we have for an async drop system. I'll start with a brief introduction to how async Drop is expected to work, then make the case for why async Drop should be represented in the type system, and finish by talking about its interactions with other async language features.

A brief introduction to async drop

In this post we'll start from the design for the async drop trait that Sabrina Jewson proposed. She made a really good case for why this is the right design, and I recommend reading her post in its entirety if you haven't already.

trait AsyncDrop {
    async fn drop(&mut self);
}

The idea is that this trait can then directly be implemented on types. Say we have a type Cat, which when dropped prints some message, we could write that as follows:

struct Cat {}
impl AsyncDrop for Cat {
    async fn drop(&mut self) {
        println!("The cat has plopped").await;
    }
}

And using it would work something like this, where at the end of an async scope destructors are run.

async fn nap_in_place() {
    let cat = Cat {};         // 1. Construct `cat`
                              // 2. `cat` is dropped here, printing:
                              // "The cat has plopped"
                              // 3. The function yields control back to the caller
}

There is a lot more to this system, and we'll get into the details of that throughout the remainder of this post. But at its core, this is the feature we're considering introducing to async Rust - and I wanted to make sure people reading it had at least a sense of what we're trying to achieve before we dive into the details.

The constraints of linearity

Last year Tyler and Niko showed why, if we want the async equivalent of thread::scope, we need both async destructors and futures which can't be forgotten:

Second, parallel structured concurrency. As Tyler Mandry elegantly documented, if we want to mix parallel scopes and async, we need some way to have futures that cannot be forgotten. The way I think of it is like this: in sync code, when you create a local variable x on your stack, you have a guarantee from the language that its destructor will eventually run, unless you move it. In async code, you have no such guarantee, as your entire future could just be forgotten by a caller. “Must move” types (with some kind of callback for panic) give us a tool to solve this problem, by having the future type be ?Drop — this is effectively a principled way to integrate completion-style futures that must be fully polled.

I showed via both my first and second post on linearity that the property we want isn't actually types which can't be dropped, but types which can't be forgotten. I later found out Sabrina had actually had this exact insight about a year earlier. To my credit though, I feel like I did meaningfully move the conversation in my second post by enumerating the rules we should uphold to encode the "drop is guaranteed to run" interpretation of linearity.

We can reason through this to arrive at a constraint:

  1. In order to achieve "task scopes" we have to combine linearity + async destructors
  2. The only system of linearity we know how to encode and can see a path towards integrating into the language [1] is "drop is guaranteed to run"
  3. This system works because we disallow all instances where drop could not be guaranteed to run - guaranteeing drop will run
  4. When async drop is combined with linearity we have to guarantee destructors will always be run
  5. Async drop cannot introduce any new or unaccounted for scenarios where destructors are not guaranteed to run

We can invert this conclusion too: if we arrive at a design for async drop which introduces new cases where destructors are not run, we cannot use it for scoped tasks. That means that a design for async drop should prove that it doesn't introduce any new or unaccounted-for cases where destructors aren't run. And the most practical way to ensure that is for the semantics of async destructors to closely follow those of non-async destructors.

There are other, practical reasons why we would want to ensure that async destructors can't be prevented from running in ways unique to async Rust. But I'm choosing the type-system interactions with linearity to enable task scopes here, because they are both clear and strict. It means that if we introduce new ways in which destructors aren't run, we'd be closing the door on a particularly desired feature - and that's not something we want to do.

Drop glue and forwarding of implementations

Rust's drop system provides two key properties:

  1. Destructors are always run when a type becomes unavailable (modulo leaking)
  2. Destructors are directly tied to the lifecycle of the object

This works particularly well with Rust's move semantics, which ensure that the code to clean up any object is always provided by that object. When a type is nested in another type, the encapsulating type ensures that the inner type's destructors are run. The code generated for this is what we canonically call "drop glue", and it does not have a representation in the type system. Here's an example of how this works:

// Our inner type implementing `Drop`.
struct Inner {}
impl Drop for Inner {
    fn drop(&mut self) {
        println!("inner type dropped");
    }
}

// Our outer type containing our inner type.
// The inner destructor is forwarded by the outer type
// via "drop glue".
struct Outer(Inner);

fn main() {
    // Construct the types and drop them.
    let outer = Outer(Inner {});
    // `outer` is dropped, prints "inner type dropped"
}

The type Inner implements Drop in this example, but the type Outer does not. That is the difference between "drop glue" and a Drop impl: drop glue is automatically inserted by the compiler and calls the actual Drop impls of a type's fields. But in the type system we cannot write a bound which says: "this type has drop glue". Instead Rust users are expected to assume drop glue may exist on any type, and so there is no need to ever really check for its presence.

fn drop_in_place(t: impl Drop) {}

fn main() {
    drop_in_place(Inner {});         // ✅ `Inner` implements `Drop`
    drop_in_place(Outer(Inner {}));  // ❌ `Outer` does not implement `Drop`
}

Now if we convert this to our AsyncDrop impl, we would likely want to write it somewhat like this.

struct Inner {}
impl AsyncDrop for Inner {                     // using `AsyncDrop`
    async fn drop(&mut self) {                 // using `async fn drop`
        println!("inner type dropped").await;  // using async stdio APIs
    }
}

struct Outer(Inner);

async fn main() {                 // 1. Note our hypothetical `async main`
    let outer = Outer(Inner {});  // 2. Construct an instance of `Outer`
                                  // 3. Drop the instance, printing a message
}

Now the question is: could we just write it like this and insert async drop glue, or would we run into trouble if we attempted to do that? To learn more about that, let's take a look at how we expect the destructors of types implementing async drop to be run in function bodies, and when that doesn't work.

Async destructors can only be executed from async contexts

Okay, let's do a little whirlwind tour of different scenarios where we might try to drop a type implementing AsyncDrop, and discuss whether we can do so. In order to do that let's take our earlier definition of Cat which asynchronously prints out a message when dropped.

struct Cat {}
impl AsyncDrop for Cat {
    async fn drop(&mut self) {
        println!("The cat has plopped").await;
    }
}

Now the first example we used was an async function. At the end of the scope we drop the type and run the destructors. This is an async destructor running at the end of an async context, and is the base case we expect to work.

async fn nap_in_place() {
    let cat = Cat {};         // 1. Construct `cat`
                              // 2. `cat` is dropped here, printing:
                              // "The cat has plopped"
                              // 3. The function yields control back to the caller
}

If you're interested in how the state machines for something like this would desugar, I recommend reading Eric Holk's posts describing the low-level state machine for async drop (part 1, part 2). We know that as long as we're in an async context, we can generate the right state machine code for async destructors. But conversely also: if we're not in an async context, we can't generate the state machine for async destructors. Which means that dropping types which implement async Drop is only allowed in async contexts.
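
To make that concrete, here is a rough hand-written approximation of what "run the async destructor at the end of the scope" amounts to, assuming the AsyncDrop trait from above. This is not the actual compiler desugaring - see Eric's posts for that - but it illustrates the key point: running the async destructor requires an .await.

// A hand-written approximation, not the real compiler desugaring
// (in the real system the compiler would insert this itself and ensure
// the destructor runs exactly once).
async fn nap_in_place_approx() {
    let mut cat = Cat {};
    // ... rest of the function body ...
    AsyncDrop::drop(&mut cat).await;  // await the async destructor at scope end
}

In a non-async function there is nowhere to put that .await, which is exactly what the following example runs into.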

/// A non-async function constructing an instance of `Cat` and dropping it straight away
fn nap_in_place() {
    let cat = Cat {};  // ❌ Compiler error: `Cat` can only be dropped in an async context
}

Theoretically it might be possible to fall back to running async destructors synchronously at runtime if we detect they're held in non-async contexts. We could do this by wrapping them in a block_on call, but that would risk causing deadlocks. We can already write block_on in destructors today, but generally choose not to because of this risk. As such any useful formulation of async Drop cannot rely on runtime mechanisms to cover up gaps in the static guarantees [2].
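
To make the risk concrete, here is a minimal sketch of the workaround that exists today, using futures::executor::block_on inside a regular Drop impl; the Connection type and its cleanup are hypothetical. If this destructor happens to run on a thread that is also needed to drive the cleanup (for example an executor worker thread), the block_on call can deadlock.

use futures::executor::block_on;

struct Connection {}

impl Drop for Connection {
    fn drop(&mut self) {
        // Block the current thread until the async cleanup finishes.
        // This is the pattern that risks deadlocking if the cleanup
        // future needs this same thread to make progress.
        block_on(async {
            // hypothetical async cleanup, e.g. sending a goodbye message
        });
    }
}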

Async destructors and control points

A common sentiment about async drop designs is that they should not introduce any "hidden .await points". The concern is that this would run counter to the design goals of having .await in the first place, and so we need to be careful to ensure that we don't introduce any unexpected control flow. However it's worth investigating what "control flow" exactly means and which .await points are already introduced today, and using that to formulate the concrete constraints this imposes on the design of async drop.

In order to define where async destructors may be executed, we first have to identify and classify where values can be dropped today. For this I like to use the language of control points: locations in source code where control may be yielded from the function to the caller. Not all control points provide the same semantics, so we can further classify control points into three distinct categories:

  • Returning operations: hand control back to the function's caller, ending the function. Examples: return, ? in function scopes, panic!, and the last expression in a function.
  • Breaking operations: hand control back to an outer scope, but don't directly return from the function. Examples: break, continue, ? in try {} scopes, and the last expression in a block scope.
  • Suspending operations: hand control back to the caller, but the caller can choose to hand control back to the callee again later. If the caller does not hand control back, the function ends. Examples: .await, and yield.

To put this theory into practice, here is a fairly typical async function you might see in the wild. It takes a path, does some IO, parses the result, and then returns either a value or an error.

async fn read_and_parse(path: PathBuf) -> io::Result<Table> {
                                        // 1. futures start suspended, and may not resume again
    let file = fs::open(&path).await?;  // 2. `.await` suspends, and may not resume again
                                        // 3. `?` propagates errors to the caller
                                        // 4. `fs::open` may panic and unwind
    let table = parse_table(file)?;     // 5. `?` propagates errors to the caller
                                        // 6. `parse_table` may panic and unwind
    Ok(table)                           // 7. return a value to the caller
}

To people who are not used to thinking about control points, the number of control points in this function is likely surprising. That's seven control points in a three-line function. This ratio is also why I don't believe any formulation of linearity or finalization which doesn't rely on destructors will ever be a practical alternative. Rust's Drop system has so many contact points in function bodies that manually describing them without violating lifetime invariants would be highly unergonomic.

The most surprising control point in this function is likely the first one: destructors may be executed before the function even has had an opportunity to run. For all intents and purposes this can be considered a "hidden .await point", but in practice it doesn't appear to be a problem. Why is that?

async fn read_and_parse(path: PathBuf) -> io::Result<Table> {
    // `path` may be dropped before the function has even begun executing
}

The reason is that we have an intuitive understanding that as soon as a function exits, we run destructors. If a function never starts and immediately exits, that's semantically equivalent to the function starting and immediately dropping all values. Despite introducing an "implicit .await", this is not a problem because we're immediately exiting the function, which is when we expect destructors to execute. That's the same logic which allows returning operations such as ? and return to exit a function and trigger async destructors. If we want to allow async destructors to run in the face of panic and return, then the rule we must uphold is: async destructors are run at the end of async function scopes.
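
As a minimal sketch of that rule, here is the Cat type from earlier in a function with an early return. Both exits are ends of the async function scope, so both are places where we expect the async destructor to run.

async fn nap_or_bail(tired: bool) {
    let cat = Cat {};
    if !tired {
        return;  // `cat`'s async destructor runs here, at function exit...
    }
    // ... otherwise take a nap ...
}                // ...or here, at the end of the function scope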

So when people talk about worrying about hidden .await points, what do they actually mean? For that we can adapt one of Sabrina's examples:

async fn hostile_nap_takeover() {
    let mut chashu = Cat {};         // 1. Construct the first instance of `Cat`
    let nori = Cat {};               // 2. Construct the second instance of `Cat`
    mem::replace(&mut chashu, nori); // 3. Assign the second instance to the location of the first instance
                                     // 4. That will trigger the first instance's destructor to run
                                     // 5. The second instance's destructor is run at the end of the function scope
}

The issue is that when we call mem::replace, async destructors will be run without an associated .await point. This is not at the end of any function scope (or block scope, we'll get to that in a sec) - but right in the middle of a function - which will continue executing after the destructor has finished running. This would not be an issue if mem::replace was an async function - which would require an .await point to be called. But even that only has limited applicability because Rust also has operators, meaning we could rewrite the above like this instead:

async fn hostile_nap_takeover() {
    let mut chashu = Cat {};  // 1. Construct the first instance of `Cat`
    let nori = Cat {};        // 2. Construct the second instance of `Cat`
    chashu = nori;            // 3. Assign the second instance to the location of the first instance
                              // 4. That will trigger the first instance's destructor to run
                              // 5. The second instance's destructor is run at the end of the function scope
}

This is an example we want to prevent, and so this allows us to formulate another constraint: Async destructors can only be executed in-line at an .await point. Without that constraint, the previous examples would be allowed, which we know we don't want. We can take this rule and apply it to in-line block scopes as well. Taking the Cat type again, we can imagine a function where a cat goes out of scope at the end of some block.

async fn nap_anywhere() {
    let mut chashu = Cat {};  // 1. Construct an instance of `Cat`
    {
        let chashu = chashu;  // 2. Ensure the instance is captured by the block
                              // 3. ❌ The instance would be dropped here
    }
}

This function violates the constraint we just described: the value chashu would be dropped in the middle of a function, causing async destructors to be run without any associated .await points. Luckily we should be able to abide by this rule by converting the block scope to an async block scope, and .awaiting that. This would cause the destructor to run at an .await point, resolving the issue.

async fn nap_anywhere() {
    let mut chashu = Cat {};  // 1. Construct an instance of `Cat`
    async move {
        let chashu = chashu;  // 2. Ensure the instance is captured by the block
                              // 3. ✅ The instance would be dropped here
    }.await;
}

This works because it is semantically equivalent to defining an async function and moving the value to that, which we've already established would be permissible. That lets us describe the following constraint: It must be possible for async destructors to be run at the end of async block scopes. And it enables us to synthesize the following rules for when async destructors can be executed:

  • When an async function returns
  • Inside another async function which is .awaited
  • At the end of an async block scope in a function

As mentioned at the start of this section, Sabrina Jewson has done an excellent job covering the challenges of async drop. Where we've landed with these constraints is most similar to what she described as "abort now: always await". This opens up questions about design and ergonomics, which I believe are important, and would like to engage with in a follow-up post.

Liveness and ownership

Earlier in this post we established that types implementing async drop cannot be dropped in non-async contexts. On its face we might be inclined to extrapolate this rule and say that types implementing async drop cannot be held in non-async contexts at all. That isn't quite true, because it's possible for values to be live in a non-async context without ever being dropped in that same context. One example is synchronous constructor functions.

struct Cat {}
impl AsyncDrop for Cat { ..  }
impl Cat {
    fn new() -> Self {
        Self {}  // ✅ Implements async Drop, is held live in a non-async function
    }
}

Here we construct a new instance of Cat - a type which implements AsyncDrop - and hold it inside of a non-async context. And that is fine, because at no point is there a risk of destructors being run. Intuitively we might be inclined to say that as long as types implementing async drop aren't held live across control points, we're fine. But that too would be too restrictive, as shown by the following example.

/// A function which takes `Cat` by reference and then panics
fn screm(cat: &mut Cat) {
    panic!("I scream, you scream"); // `cat` is live when this panic happens
}

async fn screamies_time() {
    let mut cat = Cat::new();  // 1. Construct a new instance of `Cat`.
    screm(&mut cat);           // 2. Pass a mutable reference to a function
                               // 3. ✅ The function panicked; the instance's destructor runs here during unwinding.
}

Again, this is fine because the destructor for cat will never be run inside of the function screm - and so the fact that even a mutable reference is held live across a control point in a non-async function is okay. It's only when owned values implementing async drop are held live across control points in non-async contexts that we run into trouble.

/// A function which takes `Cat` by-value and then panics
fn screm(cat: Cat) {
    panic!("I scream, you scream"); // `cat` is live when this panic happens
                                    // ❌ Compiler error: `Cat` must be dropped in an async context
}

async fn screamies_time() {
    let cat = Cat::new();      // 1. Construct a new instance of `Cat`.
    screm(cat);                // 2. Pass the instance to the function by value
}

That surfaces the following rules: It's always possible for values of types implementing async Drop to be live in non-async contexts as long as they are never held across control points. It's also always possible for references to types implementing async Drop to be held live in non-async contexts, even across control points. However it is never possible for owned values of types implementing async Drop to be held live in a non-async context across control points.

Bounds for drop glue

In the Rust stdlib we have a type ManuallyDrop which can be used to more finely control when values are dropped. For example Arc uses ManuallyDrop a number of times internally to manipulate the reference counts without actually losing data. A simplified version of its signature looks something like this:

pub struct ManuallyDrop<T: ?Sized> { .. }
impl<T: ?Sized> ManuallyDrop<T> {
    /// Create a new instance of `ManuallyDrop`
    pub fn new(value: T) -> ManuallyDrop<T> { .. }
    /// Drop the value contained in `ManuallyDrop`
    /// Safety: you may only call this function once.
    pub unsafe fn drop(&mut self) { .. }
}

The bounds for the type it operates on are T: ?Sized, not T: Drop. That is because this type is happy to operate on any Drop impl as well as plain drop glue, and drop glue is not visible in the type system. So what happens if we want to write an async version of this type? Presumably fn drop should be async. But what should the bounds on `T` be?

/// A hypothetical async drop compatible version of `ManuallyDrop`.
pub struct AsyncManuallyDrop<T: /*bounds*/> { .. }
impl<T: /*bounds*/> AsyncManuallyDrop<T> {
    pub fn new(value: T) -> AsyncManuallyDrop<T> { .. }
    pub unsafe async fn drop(&mut self) { .. }  // Note the `async fn` here
}

Presumably the bounds need to be able to express something like: T: ?Sized + AsyncDropGlue. Not T: ?Sized + Drop, because that would refer to a concrete impl rather than to drop glue. And it can't just be T: ?Sized either, since that only implies the existing non-async drop glue. The question of how we can ergonomically surface these bounds is a design question we won't go into right now. This example exists to show that async drop glue needs to be able to be surfaced to the type system so it can be used in bounds.
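
As a sketch of what such a bound could look like in use, assume a hypothetical compiler-implemented AsyncDropGlue marker trait (the name and shape are illustrative, not a concrete proposal). Any generic async function which takes ownership of a value and lets it go out of scope needs some way to state that it can run the value's drop glue - async or not - at that point.

// Hypothetical: `AsyncDropGlue` is implemented by the compiler for every
// type whose drop glue is allowed to contain async work.
async fn consume<T: AsyncDropGlue>(value: T) {
    // ... use `value` ...
}   // `value`'s (possibly async) drop glue runs here, at the end of the
    // async function scope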

Mixing async and non-async drop glue

Once we start considering the presence of async drop glue, we might wonder whether it needs to be mutually exclusive with non-async drop glue. I don't believe the two can be mutually exclusive, because it would interact badly with synchronization primitives such as Arc. Take for example the following type:

struct Inner {}
impl AsyncDrop for Inner {
    async fn drop(&mut self) {
        println!("inner type dropped").await;
    }
}

struct Outer(Inner, Arc<usize>);

The type Outer here carries both an Arc which implements Drop, and Inner which implements AsyncDrop. For this to be valid it needs to be able to implement both async and non-async drop glue. If we make both kinds of drop glue mutually exclusive, then we would need to define a new version of Arc which itself doesn't perform any async operations - but does implement async Drop just to satisfy this requirement. This seems like a particularly harsh direction with no clear upsides, which brings us to the following constraint: it must be possible for a type to implement both async and non-async drop glue.
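
As a sketch of the behaviour this constraint asks for: dropping Outer in an async context should run both kinds of glue, awaiting Inner's async destructor and releasing the Arc<usize> synchronously, exactly as today.

async fn drop_both() {
    let outer = Outer(Inner {}, Arc::new(0));  // 1. Construct an instance of `Outer`
                                               // 2. `outer` is dropped here:
                                               //    `Inner`'s async destructor is awaited,
                                               //    then the `Arc` is released synchronously
}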

Cancellation cancellation

In his first post on async cancellation handlers, Eric asks: "what behaviors are possible if a cancelled future is cancelled again?" In it he presents the following three options:

  1. Attempting to cancel the execution of an async destructor is statically disallowed
  2. Attempting to cancel the execution of an async destructor may succeed (recursive)
  3. Attempting to cancel the execution of an async destructor results in a no-op (idempotent)

To make this a little more concrete, we can write a code example. In it we'll author some type implementing AsyncDrop whose destructor takes a little bit of time to complete (100 millis). We'll then drop that in a scope somewhere, starting its async destructor. Then in some outer scope higher up on the stack we trigger a cancellation after a much shorter period. What we're asking is: what should happen in this scenario?

struct Cat {}
impl AsyncDrop for Cat {
    async fn drop(&mut self) {
        sleep(Duration::from_millis(100)).await;
        println!("Napped for 100 millis").await;
    }
}

async fn main() {
    async {
        async {
            let cat = Cat {};                   // 1. Construct an instance of `Cat`
                                                // 2. Drop the instance of `Cat`, running its destructor
        }.await;                                // 3. This future will now take 100 millis to complete
    }.timeout(Duration::from_millis(10)).await; // 4. But we're cancelling it after just 10 millis
}

It's key to remember that this is just an example, and necessarily simplified. A cancellation may be triggered anywhere in the logical call stack, during the execution of any async destructor. Meaning this is fundamentally a question about composition, and we should treat the various components involved as black boxes.

In Eric's post he rejected the option to statically disallow the cancellation of async destructors for reasons similar to the ones we've just outlined: callers higher up on the stack do not know about the internals lower down in the stack. Eric did not believe this was feasible, and I agree. It's unclear what analysis we would need to allow this, and even if we could figure it out the resulting system would likely still be limited to the point of impracticality.

The second option would be to allow async destructors to be cancelled. This would violate the first constraint we declared: async drop may not introduce any new ways in which destructors can be prevented from running, as that would close the door on scoped tasks. In the example we can see this: if the timeout stops the execution of async drop, it will never reach the println! statement.

The third option is the only feasible behavior we can adopt which doesn't violate any of the constraints we've discovered so far. When the timeout fires, rather than cancelling the async drop future it waits for it to finish. That means that: attempting to cancel the execution of an async destructor should result in a no-op. This behavior is what Eric called idempotent, and it comes with some additional challenges:

Admittedly, this might take additional rules, like we may want to declare it to be undefined behavior to not poll a cancelled future to completion. Scoped tasks would likely need this guarantee […]

TODO: rewrite this paragraph:

That is a great point: with regular Drop we cannot call the method directly - instead we have to pass the value to something like the drop function, which is basically a no-op that takes ownership. The ManuallyDrop type does provide a drop function, but it is marked unsafe and the user is on the hook for upholding the invariant that it is called at most once. For AsyncDrop the rules should be similar: the only way to obtain an async destructor to manually poll should be via a built-in (e.g. AsyncManuallyDrop). The additional safety invariant should be that if it is used to obtain the async drop future in a non-async context, the caller guarantees they will run it to completion. This is needed because at some point we do need to map async back to sync, but as we stated we cannot introduce conditions where async destructors would not be run. Unsafe invariants are the only way we can square those two requirements.
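
As a sketch of how that invariant might look, reusing the hypothetical AsyncManuallyDrop from earlier in this post: the unsafe contract is what carries the guarantee that the destructor future, once obtained, is driven to completion.

async fn finish_cleanup(slot: &mut AsyncManuallyDrop<Cat>) {
    // Safety: `drop` is called exactly once, and the returned future is
    // awaited to completion - even if the surrounding task is cancelled,
    // because cancelling an async destructor is a no-op (see above).
    unsafe { slot.drop() }.await;
}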

TODO: summarize the constraint this poses.

Conclusion

In this post we've surfaced the following constraints with respect to any potential async drop design:

  1. Async drop cannot introduce any new or unaccounted for scenarios where destructors are not guaranteed to run
  2. Dropping types which implement async Drop is only allowed in async contexts
  3. Any useful formulation of async Drop cannot rely on runtime mechanisms to cover up gaps in the static guarantees
  4. It must be possible for async destructors to be run at the end of async function scopes
  5. Async destructors can only be executed in-line at an .await point
  6. It must be possible for async destructors to be run at the end of async block scopes
  7. It is never possible for owned values implementing async Drop to be held live in a non-async context across control points
  8. Async drop glue needs to be able to be surfaced to the type system so it can be used in bounds
  9. It must be possible for a type to implement both async and non-async drop glue
  10. Attempting to cancel the execution of an async destructor should result in a no-op

Unimplemented sections

⚠️ These unimplemented sections are mostly about generics and which restrictions emerge once we try to interact with the trait system. I understand this is particularly relevant for the questions we have about async iteration and async closures. But I'd like to punt discussion on that until these sections have been spelled out and have examples to substantiate the points they make. That will make for a better conversation, plus there is plenty in this post already to discuss. ⚠️

TODO: concrete impls and generics

  • We should not pass an AsyncDrop impl to a type like Vec as-is
    • Assume data is stored in a ManuallyDrop
    • It has a manual Drop impl, which is not guaranteed to run the AsyncDrop impl
    • Rule: types implementing async drop can only be passed to bounds which expect it
    • Rule: if a bound does not state it wants an AsyncDropGlue impl, a type implementing it cannot be passed to it
    • reason: otherwise there is nothing preventing the manual drop impl from only executing the sync drop glue, and yeeting the rest of it. Existing code can do that today, and we cannot say it is now doing unsound things. It needs new bounds for that reason.
    • reason: Vec<T: !Leak> may have a valid implementation for any T: DropGlue, but would not be valid for any T: AsyncDropGlue
  • If we want to run async destructors, we should have a Vec<T: AsyncDropGlue>

TODO: async drop glue can be a noop

  • it's okay if we say + AsyncDropGlue and then there isn't actually any
  • we've established that async + non-async drop glue should be able to co-exist
  • in non-async rust we cannot name this bound, so this doesn't come up
  • this only comes up here because we can name the bound

TODO: async drop bounds are going to be everywhere, and that's going to be noisy

  • most async leaf types will want to be able to provide AsyncDrop - including all tasks
  • every trait bound in async code will want to + AsyncDropGlue
  • that's going to mean a lotttt of bounds if that ends up going through
  • if we want it to be practically usable, we're going to have to streamline it
  • The best way to achieve that will be to somehow have + AsyncDropGlue be implied by async trait bounds
    • e.g. T: AsyncRead should imply + AsyncDropGlue
    • We lose nothing on that because async drop glue can be a noop, just like regular drop glue can be a noop
    • But we have everything to win by doing that so that the entire async ecosystem doesn't become a mess of annotations
  • In a strict sense this is a restriction (covariant? it's some variance thing)
    • if T: AsyncRead implies + AsyncDropGlue, that's a tighter bound than if it doesn't imply that
    • however, that may not be the end of the world if we indeed assert that async drop glue can be a noop
      • "hey this bound now expects async drop glue" - will always be true. Meaning that while theoretically it's a tighter bound, in practice the bound will work for any type.

TODO: async functions

  • If an async function holds a type which implements async drop glue, it becomes impl Future + AsyncDropGlue
  • We can combine that with the previous restriction: "types implementing async drop can only be passed to bounds which expect it"
  • This leads to the conclusion: any type which takes a future that can work with async drop glue needs to state so in its bounds

Discussion

Attendance

  • People: TC, Vadim Petrochenkov, tmandry, Daria Sukhonina, Yosh, eholk, Vincenzo

Meeting roles

  • Minutes, driver: TC

Passing mutable references to sync functions

eholk: The example copied below should probably be illegal.

async fn hostile_nap_takeover() {
    let mut chashu = Cat {};         // 1. Construct the first instance of `Cat`
    let nori = Cat {};               // 2. Construct the second instance of `Cat`
    mem::replace(&mut chashu, nori); // 3. Assign the second instance to the location of the first instance
                                     // 4. That will trigger the first instance's destructor to run
                                     // 5. The second instance's destructor is run at the end of the function scope
}

The reason is we can't run the chashu destructor before passing it to mem::replace, since if we did we'd effectively be passing in a pointer to uninitialized memory. We need mem::replace to make the decision about whether to destruct chashu, but mem::replace is synchronous so it cannot await chashu's destructor. The right answer is probably to annotate mem::replace so that the first argument can't be something that has an async Drop.

On the other hand, the second version that uses assignment instead would be perfectly fine because we'd be running destructors in an async context.

Yosh: Clearly the code should be rejected.

eholk: The destructor will run in the scope of mem::replace.

TC: The mem::replace case seems like a subset of a larger question. How would the design proposed here interact with manual poll implementations? I.e. how would one write the kind of low-level code that's necessary for writing combinators, executors, etc. if everything needs to be in async blocks?

Yosh: That is a critical point. I didn't quite get to this in the document. We would need an async version of ManuallyDrop. I need to spell this out carefully.

tmandry: What's the rule to prevent mem::replace from doing this? The rule in the document doesn't seem sufficient to prevent this case.

Yosh: Since you can't drop an async type in a sync context, that covers it.

Yosh: Stepping back, I'm trying to show in this post why things have to be a particular way and what happens if we do the other thing.

Are linearity and drop tied together?

TC: In talking with CE, I've heard it proposed that linearity may likely be implemented in a way more like the borrow checker rather than by carefully trying to exclude all leaking operations. In that world, do we still need to preserve the invariant mentioned in the document about async drop needing to carefully not add the possibility of any breaks in linearity?

tmandry: This is maybe similar to what Yosh is proposing; there seem to be some control-flow things that would be implied or required by the rules Yosh is putting forward.

Yosh: Linearity adds restrictions on what is possible. I don't see how we could do linearity without threading it through the type system. I'm not sure what a control-flow based mechanism would look like.

eholk: It does seem like a type system thing.

CE: We have a trait in the solver called Destruct. It tracks whether a type has a Drop impl or contains something with a Drop impl. const_precise_live_drops is what allows the borrow checker to bound only the right things. This proposal would require something similar.

CE: Let's call it Destruct for consistency, not DropGlue.

Comparison with poll_drop_ready and similar

TC: There is a proposal

https://github.com/withoutboats/rfcs/blob/poll-drop-ready/text/0000-poll-drop-ready.md

to extend the Drop trait as follows:

trait Drop {
    fn drop(&mut self);

    fn poll_drop_ready(&mut self, cx: &mut Context<'_>) -> Poll<()> {
        Poll::Ready(())
    }
}

(Variations on this are possible.)

The idea here is that in async contexts, poll_drop_ready is polled to completion before invoking drop.

Presumably this doesn't achieve the connection to linearity this document is going for, but maybe there are other paths there? It'd be good to compare these approaches against our goals and desired properties.

Yosh: I didn't really want to evaluate any concrete designs in this post; I mainly wanted to work through examples to determine constraints. But yeah, that direction of designs introduces new cases where destructors won't run - which Eric covered in his second post. So on the basis of that we'd probably need to reject it.

Daria: That would require storing the state needed for async drop inside the structure itself - for example Vec would contain an index to the element we are currently async-dropping - unless we poll every element kind of like in join!. Wouldn't that have clear disadvantages?

eholk: Along with what Daria's saying, this is an example where the async fn drop formulation has a clear advantage over the fn poll_drop_ready formulation. The async fn drop function returns a future and that gives you a place to hold the state needed to, for example, keep track of where we are in dropping all the elements held in a Vec. With poll_drop_ready, you have to build the state needed to do async cleanup in the Vec to start with.

TC: It's interesting that in the case of AsyncIterator, we've found that having a second state machine causes problems (e.g. with cancellation). But in this case, the additional place to put state turns out to be useful.

eholk: +1.

Liveness and ownership

tmandry: A couple of questions came up for me in this section:

  • Does the doc assume that async destructors will happen when unwinding (panicking) through an async context?
    • Vadim: this should be possible (suspending/resuming during unwinding), but we need to catch the exception first, then store it into the coroutine, then rethrow the stored exception after resuming. Unwinding is a "whole thread" process and has a #[thread_local] static component. Catching the exception "localizes" it and allows it to be captured by the coroutine.
  • How will we deal with holding generic types in synchronous scopes, when those types may or may not implement async Drop?

tmandry: Summary of Yosh's response: You need T: AsyncDropGlue to pass it. Effectively T: ?Drop.

(The meeting ended here.)

Main goals: what do we want to enable?

eholk: Async drop or cleanup can have a couple of use cases. For example, this doc seems to have enabling scoped tasks as a key use case. A weaker goal would be enabling best effort async cleanup. The scenarios we want to support have a big impact on requirements, so we should decide what scenarios are important.

For example, I don't see a way to enable scoped tasks without either linear types or implicit await points. But if we're happy with best effort async cleanup, we have a lot more flexibility here.

Ergonomics of only dropping at existing await points?

eholk: If we say for example x = y cannot run an async destructor on x because we need an await point to run destructors, does this mean we're going to give up a lot of the ergonomics of async Drop and end up with something with ergonomics more like linear types?

Defer blocks?

Daria: Undroppable types are messy to work with. To convince the compiler that something never drops you have to use --release, as with the #[no_panic] attribute. Maybe defer blocks are a path forward to consider?

How do runtimes manage tasks?

eholk: One of the things we'll have to do to keep destructors from leaking is disallow holding them in either something that gives shared ownership (e.g. Arc), or something that gives mutability (e.g. RefCell or Mutex). Is this something runtimes can do?

I'm assuming the answer is that runtimes will have to make liberal use of ManualAsyncDrop.

Daria: ManualAsyncDrop could be safe to use because it should be safe to leak a pinned box to some suspended 'static future.

More Linearity and async Drop

eholk: If the language already had linearity, how would we design async Drop? It seems like maybe these are separable features, and tying them together could lead to weird incongruencies where we have linearity for async types but not in the whole language.

Could the rules proposed here instead be repurposed for general linear types? For example, an effect of only running async Drop at the end of async scopes is that you can't do async_x = y. If we had linear T, it seems like we'd end up with the same rule.

Why now?

eholk: We've had concerns that shipping other features, like async closures, might close doors for what we can do with async Drop. Do we have concrete examples of where certain async Drop designs would not work with certain async closure designs?

CE: It's really about designing APIs that are flexible enough to take AsyncDestruct bounds, probably not async closures directly.


  1. There are some questions about how to extend linearity to all language items, but that all seems pretty solvable. That's more of an integration question than a fundamental question about the core system. ↩︎

  2. If async drop carried a risk of introducing deadlocks we're unlikely to use it in Azure, and may potentially even choose to lint against it. In order for us to use async Drop, it's not enough for it to just be available - it must also be useful. If we can't guarantee that, then it might be preferable not to have an async destructor feature at all. ↩︎
