
Sabrina Jewson's Async Drop article (Async deep dive 2022-06-23)

tags: deep-dive

https://sabrinajewson.org/blog/async-drop

Notes during reading

Niko's notes

it would now be basically impossible to carefully manage where cancellations can occur and most users would end up having to treat cancellation more as a pthread_kill than a helpful control flow construct.

it seems like cancellation already is quite a bit like pthread-kill, though it's a question of degree. That is, I feel like having potential cancellation at "every await point" often means you just have to be very thoughtful about things you expect to happen on cancellation regardless.

To me the question really comes down to controlling the scope of cancellation, i.e., how can we be sure that when a task is cancelled, all the tasks that are interacting with it, especially those sharing mutable state, are also cancelled.

I think if you're careful about that (and perhaps you have some specially designated cases of shared state that is expected to survive cancellation, for which you use destructors), it's not so hard to manage.

This is the only option of the three to definitively avoid the “implicit cancel” footgun, but it’s still not ideal as it ends up introducing new weird-looking syntax and makes writing async code pretty verbose.

it's not only .await; it's also all control-flow operations like return, break, and continue, and (most notably) ?.

fn foo() -> Result<(), Error> {
    let x = something();
    let y = something_else()?; // on `Err`, `?` returns early and drops `x` here
    // ...
    Ok(())
}

the primary problem in code like that is the confusing semantics of select! and not the cancellation behaviour of futures.

agreed, though there are subtle capabilities here that I think we should think about how to preserve :)

“Delayed abort” designs

Hmm, I might be missing something here; I'm not sure how this works. Aborting in Rust corresponds to dropping. If we ignore synchronous drops for the moment and just assume all drops are asynchronous (which this post seems to do, and which I think is fine), then I guess this corresponds to delaying when the async drop itself runs?

Or is the idea that the async dtor for a future will run the async…? I guess I can imagine injecting a failure or something but having the Drop continue running. Seems messy.

I don’t think is too surprising for users.

Hmm, I am not convinced of this at all. I guess I want to work through the full details of the scenario in question.

It still seems to me like the practical advice is really less about monitoring exactly when your await occurs and more about:

  • ensure that everything "interconnected" goes down together
  • ensure that you are careful when accessing the exceptions to the above rule to use "cancel-safe" code

this makes you robust against both await and panic, as a benefit.

Many many functions from the standard library become essentially off-limits, so not only do you not get their ergonomics in well-written code it would be very easy to create bug-ridden code too, simply by calling any function like Option::insert on a TLS stream.

Hmm, this is a pretty solid point. I hadn't considered this in full, I think. This kind of returns to the question of "can we make fn mean 'maybe async'"? (I still believe we could, up to dyn dispatch).
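
To make the Option::insert point concrete, here's a minimal sketch of the hazard (the TlsStream type and replace_stream function are hypothetical stand-ins for any value that would want an async destructor):

struct TlsStream; // stands in for a type that wants an async destructor,
                  // e.g. it should send close_notify before shutting down

fn replace_stream(slot: &mut Option<TlsStream>, new: TlsStream) {
    // `Option::insert` drops the old value synchronously if one is present --
    // exactly the spot where an async destructor would have nowhere to run.
    let _new_ref = slot.insert(new);
}

fn main() {
    let mut slot = Some(TlsStream);
    replace_stream(&mut slot, TlsStream); // old stream dropped synchronously here
}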

Except…it’s not so simple. Because at nearly every point in a program, it is possible for the thread to panic, and if that happens unwinding might start to occur and if that happens you need to drop all the local variables in scope but you can only do that if they have a synchronous destructor

This is an assumption, right? That is, I had assumed that panic! would do an "async unroll" if there are async dtors in scope (i.e., we would catch the unwind, do the async drop (possibly awaiting!) and then resume the unwind).

Ah, I see, I was missing the point. The point is that panics in sync code could occur.

This is regulated by two traits, UnwindSafe and RefUnwindSafe, which provide the necessary infrastructure to check all of this at compile time.

This isn't really a safety guarantee of Rust; regardless, it feels like a lot of concern for a pretty artificial example (one that invokes poll manually, for example).
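
For reference, a rough sketch of how those bounds surface at compile time (the closure and counter here are only illustrative):

use std::panic::{catch_unwind, AssertUnwindSafe};

fn main() {
    let mut count = 0;
    // A closure capturing `&mut count` is not `UnwindSafe`, so this would not compile:
    // let _ = catch_unwind(|| count += 1);
    // `AssertUnwindSafe` is the explicit opt-out:
    let _ = catch_unwind(AssertUnwindSafe(|| count += 1));
    assert_eq!(count, 1);
}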

But this design comes with one major drawback that I haven’t seen mentioned so far: …

pretty strong point

Tyler's notes

Not quite following the need for Drop bounds. Oh right, it's talking about relaxing the default to allow async drop, not requiring async drop.

Gus's notes

Would a correct Leak bound now be required when writing unsafe code?

nrc's notes

I wonder if async overloading (keyword generics) adds insights here? In particular, that requires choosing between sync/async and handling await somewhat generically. That might make some of the issues around sync/async interactions easier.

Perhaps the fact that await would be implicit means we simply can't do this in a world with explicit awaits?

Drop bounds suck. I think drop and async drop need to be more integrated and this probably requires some kind of magic.

Yosh's notes

"async closures"

There is another distinct benefit to having async closures: different bounds for the closure and the future! Consider the definition of std::thread::spawn:

pub fn spawn<F, T>(f: F) -> JoinHandle<T> 
where
    F: FnOnce() -> T,
    F: Send + 'static,
    T: Send + 'static
{...}

The closure is Send. The returned value T is Send. But the actual code executing inside the closure is ?Send. This doesn't surface with threads because in non-async Rust the internal structure of a function isn't returned back out in the form of a state machine. But in async Rust, to get the equivalent effect as std::thread::spawn, we could imagine the following:

pub fn async_spawn<F, Fut, T>(f: F) -> JoinHandle<T> 
where
    F: FnOnce() -> Fut,
    F: Send + 'static,
    Fut: Future<Output = T>,
    T: Send + 'static
{...}

There are some issues here because an "async closure" and a "closure which returns a future" are not equivalent. But this would enable us to move a closure to a new thread, which guarantees that its contents will not move for the remainder of the execution. Viewed through the lens of "a task is a parallel future", this is a parallelism primitive to operate on !Send types in parallel async Rust.
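
A small sketch of that distinction, where the closure is Send but the future it returns is not (the assert_send helper is just for illustration):

use std::rc::Rc;

fn assert_send<T: Send>(_: &T) {}

fn main() {
    // The closure captures nothing, so it is `Send`...
    let f = || async {
        // ...but the future it returns holds an `Rc` across an await point,
        // so the future itself is `!Send`.
        let data = Rc::new(1);
        async {}.await;
        *data
    };
    assert_send(&f); // OK: the closure is Send
    // assert_send(&f()); // would not compile: the returned future is !Send
}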

Niko: glommio's spawn

pub fn spawn<G, F, T>(self, fut_gen: G) -> Result<ExecutorJoinHandle<T>, ()> where
    G: FnOnce() -> F + Send + 'static,
    F: Future<Output = T> + 'static,
    T: Send + 'static, 

there's also spawn_local

pub fn spawn_local<T>(future: impl Future<Output = T> + 'static) -> Task<T> where
    T: 'static, 

use glommio::{LocalExecutor, Task};

let local_ex = LocalExecutor::default();

local_ex.run(async {
    let task = glommio::spawn_local(async { 1 + 2 });
    assert_eq!(task.await, 3);
});

Other people's notes

Add yourself a section!

Meeting discussion

Delayed abort

async fn foo() {
    *x = y; // runs an async dtor
    println!("foo");
    something.await;
}

async fn method() {
    let f = foo().race(bar()).await;
    // if we start polling `foo`,
    // it starts running async drop for `*x`, awaits,
    // but `bar` finishes first,
    // so we async drop the `foo()` future --
    // and instead of stopping immediately, it continues
    // to run until a user-defined await point
}
sequenceDiagram
    method->>foo: start foo
    foo->>dropx: I'm dropping `*x`
    dropx->>method: I'm suspended
    method->>bar: start bar
    bar->>method: I'm done!
    method->>foo: start dropping yourself

the question is: does foo just start "unwinding" and running any in-scope destructors? or does it keep running for a while longer?

Argument for why to wait until next Poll::Pending is at the end of https://sabrinajewson.org/blog/async-drop#delayed-abort

how realistic is this?

How realistic is it that *x = y will run some async destructor?

We could make you more explicit, but do we want to?

in a sync context?

there are a lot of sync methods that drop stuff (e.g., vec.clear) that want to be async

another argument in favor of linear types?

Discussion (Jun 30)

closure semantics

Connection between

  • async closures <=> IntoFuture
  • Futures <=> FnOnce

async drop

Really needs async overloading!?

linearity

  • linear types: exactly once semantics
  • affine types: at most once semantics

linearity traditionally means "consume exactly once"

what niko means by it: the value cannot be dropped; it must be destructured within its privacy boundary exactly once

let Foo { a, b, c } = self;

functions like vec::clear become clear_with, where they take a fn(T) to "consume" the result

add in async overloading et voilà, you have

vec_futures.clear_with(async_drop).await; // drop all the connections in here, running their destructors, etc
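
A rough, sync-only sketch of what such a clear_with could look like (the trait name and impl are purely illustrative; the async-overloaded version above is the end goal):

trait ClearWith<T> {
    fn clear_with<F: FnMut(T)>(&mut self, consume: F);
}

impl<T> ClearWith<T> for Vec<T> {
    fn clear_with<F: FnMut(T)>(&mut self, mut consume: F) {
        // hand each element to the caller instead of dropping it implicitly
        for item in self.drain(..) {
            consume(item);
        }
    }
}

fn main() {
    let mut v = vec![String::from("a"), String::from("b")];
    v.clear_with(|s| println!("consuming {s}"));
    assert!(v.is_empty());
}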

How to prevent synchronous drop of async Drop values in generic code?

nrc: How would we think about this if it were anything other than Drop? / Should we try to focus on drop in a special way instead of just another trait? This seems to be leading us down async versions of traits and so on and it feels like maybe it's a wrong path. Something like defer might be useful here.

tmandry: What about spawning async destructors as new tasks? Could we have a generic block_on in the environment that always worked?

nrc: We can catch "async drop in sync context" at runtime and panic/abort.

Seems like because of the nature of Drop being built-in, the "genericity over async" you need becomes infectious throughout all code.

fn bar() -> () {
    // is this sync or async context?
    f.await // compile error?
}

async fn foo() {
    let x = bar();
}

fn main() {
    bar();
}

Can we adapt regular sync functions to work in async? Not in general, because await points are cancellation points and we shouldn't introduce those silently.

yosh: But for async Drop specifically, can we make it uncancellable? We might be able to do this, and it could let us adapt sync functions only for the purpose of running async Drop.

nrc: If we go to that trouble we should introduce non-cancellable futures generally.

yosh: Use cases?

nrc: Interop with C++, structured concurrency.

tmandry: Heard sentiment expressed that "cancel on await" is confusing

yosh: I have too, but when I dug deeper I found it was something different. Need to see examples – usually it's select

nrc: Would be good to understand C# people's case for this
