
Async reading club notes: Panics vs cancellation, part 1

https://smallcultfollowing.com/babysteps//blog/2022/01/27/panics-vs-cancellation-part-1/

tags: reading-club

Why do panics not seem to cause so many problems but cancellation does?

  • ?: Explicit, can recover
  • Panic: Implicit, can recover
  • Cancel: Marked by await, cannot recover
    • await is a potential cancellation point, but we don't communicate it as such
    • You can "handle" the cancellation by running a drop guard; for async fns this can be done inline using e.g. scopeguard.
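A minimal sketch of the drop-guard idea, using only std so it stays dependency-free. `Guard`, `NoopWake`, `Never`, `work`, and `cancel_runs_guard` are hypothetical names made up for illustration; the scopeguard crate packages the same pattern but is not used here. The future is polled once by hand and then dropped, which is all that cancellation amounts to.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; enough for polling a future by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Future that never completes, standing in for a slow I/O operation.
struct Never;
impl Future for Never {
    type Output = ();
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        Poll::Pending
    }
}

/// Hand-rolled drop guard: runs the closure when the guard is dropped.
struct Guard<F: FnMut()>(F);
impl<F: FnMut()> Drop for Guard<F> {
    fn drop(&mut self) {
        (self.0)();
    }
}

async fn work(cleaned_up: Arc<AtomicBool>) {
    // Runs when `work` returns, unwinds, or is dropped while suspended.
    let _guard = Guard(move || cleaned_up.store(true, Ordering::SeqCst));
    Never.await; // potential cancellation point
}

/// Polls `work` once, cancels it by dropping it, and reports whether the guard ran.
fn cancel_runs_guard() -> bool {
    let cleaned_up = Arc::new(AtomicBool::new(false));
    let mut fut = Box::pin(work(cleaned_up.clone()));
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    assert!(fut.as_mut().poll(&mut cx).is_pending());
    assert!(!cleaned_up.load(Ordering::SeqCst)); // still suspended at the await
    drop(fut); // cancellation
    cleaned_up.load(Ordering::SeqCst)
}
```

Note that the guard can only clean up; nothing after the `.await` in `work` ever runs, which is the "cannot recover" part.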

eholk: Cancellation seems like a more normal thing in the ecosystem than panic. Panic seems to happen almost purely in the erroneous case while cancellation seems to happen regularly as part of normal program execution.


nrc: There are norms and best practices for panic, e.g. thread boundaries and lock poisoning. Panic feels well understood.

tmandry: Whole point of poisoning is to keep your program from being in an inconsistent state. We don't do that for cancellation.

nrc: But when you have to think about unwinding, that's a more uncomfortable area (unsafe etc.)


My take is that the concept behind lock poisoning still seems good to me, but the ergonomics of how we implemented it are bad, and make people not like it.

What did we get wrong with the ergonomics of lock poisoning, and how could we do better?

  • nrc: lock() returns a Result, which most people just unwrap.
    • eholk: panic-by-default semantics would basically give us linked task failure
  • https://docs.rs/antidote/latest/antidote/
  • Could we add a method today? It just has to be more ergonomic than .lock().unwrap().
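To make the ergonomics question concrete, here is a sketch of how poisoning behaves with std's `Mutex` today, plus one possible shape for a "panic by default" method. `LockExt`, `lock_or_panic`, `poisoned_mutex`, and `read_ignoring_poison` are hypothetical names for illustration only; none of them exist in std.

```rust
use std::sync::{Arc, Mutex, MutexGuard};
use std::thread;

/// Hypothetical extension trait: panic on poison instead of returning a Result.
trait LockExt<T> {
    fn lock_or_panic(&self) -> MutexGuard<'_, T>;
}

impl<T> LockExt<T> for Mutex<T> {
    fn lock_or_panic(&self) -> MutexGuard<'_, T> {
        self.lock().expect("mutex poisoned by a panic in another thread")
    }
}

/// Poisons a mutex by panicking in another thread while it holds the guard.
fn poisoned_mutex() -> Arc<Mutex<i32>> {
    let m = Arc::new(Mutex::new(0));
    let m2 = Arc::clone(&m);
    let result = thread::spawn(move || {
        let mut g = m2.lock().unwrap();
        *g = 1;
        panic!("fail while holding the lock");
    })
    .join();
    assert!(result.is_err()); // the worker thread panicked
    m
}

/// Opting out of poisoning: take the guard even if the mutex is poisoned.
fn read_ignoring_poison(m: &Mutex<i32>) -> i32 {
    *m.lock().unwrap_or_else(|e| e.into_inner())
}
```

With something like `lock_or_panic`, a panic in one lock holder propagates to every later locker, which is roughly the linked-failure behaviour mentioned above.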

This could be a crucial difference: I think, for example, it’s the reason that Java deprecated its Thread.stop method.

Did we fail to learn the lesson from this and pthread_cancel?

  • One difference: Can only cancel at await points

Although, per pthread_cancel(3):

A thread's cancellation type, determined by pthread_setcanceltype(3), may be either asynchronous or deferred (the default for new threads). Asynchronous cancelability means that the thread can be canceled at any time (usually immediately, but the system does not guarantee this). Deferred cancelability means that cancellation will be delayed until the thread next calls a function that is a cancellation point.

  • "Cancellation points" are listed in pthreads(7); they are a set of libc functions and syscalls
  • But "asynchronous cancelability" mode means it can be anywhere

Thread.stop:

This method is inherently unsafe. Stopping a thread with Thread.stop causes it to unlock all of the monitors that it has locked (as a natural consequence of the unchecked ThreadDeath exception propagating up the stack). If any of the objects previously protected by these monitors were in an inconsistent state, the damaged objects become visible to other threads, potentially resulting in arbitrary behavior. Many uses of stop should be replaced by code that simply modifies some variable to indicate that the target thread should stop running.

https://aturon.github.io/tech/2016/09/07/futures-design/

A fundamental difference between Rust's futures and those from other languages is that Rust's futures do not do anything unless polled. The whole system is built around this: for example, cancellation is dropping the future for precisely this reason.

source
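A small sketch of what "do not do anything unless polled" means in practice, using only std. The function name `body_ran_without_poll` is made up for this example. The async block is created and dropped without ever being polled, so its body never executes.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Creates a future and drops it without polling it;
/// returns whether the future's body ran.
fn body_ran_without_poll() -> bool {
    let ran = Arc::new(AtomicBool::new(false));
    let flag = Arc::clone(&ran);
    let fut = async move {
        flag.store(true, Ordering::SeqCst);
    };
    drop(fut); // never polled: nothing inside the block executes
    ran.load(Ordering::SeqCst)
}
```

Here `body_ran_without_poll()` returns `false`: dropping is the whole cancellation mechanism, and there is no separate signal the future could observe or refuse.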

Other languages have cancellation too; there is a difference between "forced cancellation" and "suggested cancellation".
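As a rough illustration of "suggested" cancellation, and of the Thread.stop advice to modify a variable that the target thread checks, here is a sketch with a plain thread and an `AtomicBool`. The name `run_until_asked_to_stop` and the pair invariant are assumptions made up for the example.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

/// Runs a worker until it is asked to stop and returns its final state.
/// Invariant (by convention): both halves of the pair are equal between steps.
fn run_until_asked_to_stop() -> (u64, u64) {
    let stop = Arc::new(AtomicBool::new(false));
    let worker_stop = Arc::clone(&stop);
    let worker = thread::spawn(move || {
        let mut pair = (0u64, 0u64);
        // The flag is only checked between steps, so the worker
        // never exits while the invariant is broken.
        while !worker_stop.load(Ordering::Acquire) {
            pair.0 += 1; // invariant briefly broken here
            pair.1 += 1; // and restored here
            thread::yield_now();
        }
        pair
    });
    stop.store(true, Ordering::Release); // a request, not a forced stop
    worker.join().expect("worker panicked")
}
```

The worker chooses where it can stop, so the returned pair always has equal halves. A forced stop in the middle of a step could not promise that.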

yosh: It is possible to block_on in a destructor with async_std, so you can ignore cancellation this way.

Why Java's Thread.stop is bad: https://docs.oracle.com/javase/8/docs/technotes/guides/concurrency/threadPrimitiveDeprecation.html. Doesn't the same reasoning apply to Rust?

If you drop a Future, the MutexGuard inside the future will just be dropped.

std's MutexGuard is not Send, so if your task needs to be Send you can't hold it across an await. But if your task doesn't need to be Send, you can.
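A sketch of that scenario, assuming a non-Send future that holds a `std::sync::MutexGuard` across an await. `NoopWake`, `Never`, `bump_both`, and `cancel_mid_update` are hypothetical names, and the "both halves equal" invariant is invented for the example. The future is polled once by hand, then dropped halfway through its update.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; enough for polling a future by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Future that never completes, standing in for a slow operation.
struct Never;
impl Future for Never {
    type Output = ();
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        Poll::Pending
    }
}

/// Invariant (by convention): both halves of the pair are equal.
async fn bump_both(pair: Arc<Mutex<(u32, u32)>>) {
    let mut guard = pair.lock().unwrap();
    guard.0 += 1;
    Never.await; // guard is held across the await, so this future is not Send
    guard.1 += 1; // never reached if the future is dropped at the await
}

/// Cancels `bump_both` mid-update; returns (is_poisoned, observed state).
fn cancel_mid_update() -> (bool, (u32, u32)) {
    let pair = Arc::new(Mutex::new((0, 0)));
    let mut fut = Box::pin(bump_both(Arc::clone(&pair)));
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    assert!(fut.as_mut().poll(&mut cx).is_pending());
    drop(fut); // drops the guard: the mutex is unlocked, not poisoned
    let state = *pair.lock().expect("dropping a future does not poison");
    (pair.is_poisoned(), state)
}
```

In this sketch the next locker sees `(1, 0)` with no poison flag set, which is the same "damaged objects become visible" situation the Thread.stop text describes.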


nrc: The Java argument is that cancellation safety is too difficult, so they made cancellation illegal.

"Everything is in a consistent state at every await point" – yosh: I call this halt safety

yosh: Cancellation safety to me is "No side effects occur before you cancel at any await point", e.g. reading from a socket
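A sketch of the socket example, with an in-memory queue standing in for the socket so it runs with only std. `FakeSocket`, `read_four`, `bytes_left_after_cancel`, `NoopWake`, and `Never` are hypothetical names for illustration. The read moves bytes into a local buffer before awaiting more data, so cancelling at that await loses them.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; enough for polling a future by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Future that never completes, standing in for "wait for more data".
struct Never;
impl Future for Never {
    type Output = ();
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        Poll::Pending
    }
}

/// Stand-in for a socket: bytes that have arrived but not yet been read.
type FakeSocket = Rc<RefCell<VecDeque<u8>>>;

/// Reads exactly four bytes, buffering partial input in a local variable.
async fn read_four(socket: FakeSocket) -> Vec<u8> {
    let mut buf = Vec::new();
    while buf.len() < 4 {
        let next = socket.borrow_mut().pop_front();
        match next {
            Some(byte) => buf.push(byte), // side effect: byte leaves the socket
            None => Never.await,          // potential cancellation point
        }
    }
    buf
}

/// Returns how many bytes remain in the socket after a cancelled read.
fn bytes_left_after_cancel() -> usize {
    let socket: FakeSocket = Rc::new(RefCell::new(VecDeque::from(vec![1, 2])));
    let mut fut = Box::pin(read_four(Rc::clone(&socket)));
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    assert!(fut.as_mut().poll(&mut cx).is_pending());
    drop(fut); // the two bytes already moved into `buf` are dropped with it
    let left = socket.borrow().len();
    left
}
```

Nothing is left in an inconsistent state here, so the read is arguably halt safe, but the two bytes are gone, so it is not cancellation safe in the sense above.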

Should we poison locks on cancellation? Probably, but there is currently no way to tell whether a drop is a normal drop or a cancellation.

Tokio definition of cancellation safety: https://docs.rs/tokio/latest/tokio/macro.select.html#cancellation-safety
