# ffi-unwind design meeting
The ffi-unwind project group is going to be holding an upcoming design meeting. We are exploring our first decision point.
## XXX some notes that I don't know where else to put
* common functions like `read` and `longjmp` may or may not unwind depending on details of the target
    * e.g., `read` can unwind for some pthread implementations thanks to async cancellation
    * similarly `pthread_exit` on some platforms triggers unwinding
    * `longjmp` unwinds on Windows, not elsewhere
    * so what ABI should they use?
* if we want Rust to be able to fully interoperate and replace C, we need to be able to reflect these patterns
    * `longjmp` at least is relatively common, not sure about `pthread_exit`
* no matter what, we need to define behavior of unwinding under `-Cpanic=abort` (either through "C" or "C unwind", depending)
* if we say that under `-Cpanic=abort`, it is UB to unwind over a frame with destructors:
    * we can abort in debug builds, optimize release builds fully
    * we support `longjmp` in the cases we want to, but also `pthread_exit` and other related things
* we can't really do aborting, specifically because we want to permit unwinding frames without destructors:
    * shims don't work because we don't know if a dtor is in scope at point of call
    * would have to make every dtor abort, which negates the binary size savings
* conceivably, `-Cpanic=abort` could forbid invoking "C unwind" functions altogether
    * but then what do you do with `longjmp` and `read`?
## The question: should the C ABI permit unwinding?
The core question that we would like to decide is whether the "C" ABI, as defined by Rust, should permit unwinding.
This is not a question we expected to be debating. We've long declared that unwinding through Rust's "C" ABI is undefined behavior. In part, this is because nobody had spent the time to figure out what the correct behavior would be, or how to implement it, although (as we'll see shortly) there are other good reasons for this choice.
In any case, in PR XXX, @Amanieu proposed that we could, in fact, simply define the behavior of unwinding across "C" boundaries. In discussing this, we came to realize that the question of whether the "C" ABI should permit unwinding was not open and shut.
At this point, we've been discussing the question for quite a while, and so the goal of this blog post is to lay out the arguments for either side. We'll discuss these in the lang team meeting and try to reach a final decision, which will be formalized with an RFC.
## Background: What is unwinding?
XX insert some material from other post
## Key considerations
Many of the arguments for either side hinge on the panic behavior of Rust, and in particular on `-Cpanic=abort`.
We would like to ensure the following:
* Unwinding between Rust functions (and in particular unwinding of Rust panics) may not necessarily use the system unwinding mechanism
    * In practice, we do use the system mechanism today, but we would like to reserve the freedom to change this
* If you enable `-Cpanic=abort`, we are able to optimize the size of binaries to remove most code related to unwinding.
    * Further, in extreme cases, it should be possible to remove **all** traces of unwinding, when you know that it will not occur.
* Changing the behavior from `-Cpanic=unwind` to `-Cpanic=abort` should not cause Undefined Behavior.
    * However, this may not be tenable, or at least not without making binaries much larger. See the discussion below for more details.
    * It may, of course, cause programs to abort that used to execute successfully. This could occur if a panic would've been caught and recovered.
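To make that last point concrete, here is a minimal sketch of code that recovers from a panic. The helper name `run_or_fallback` is hypothetical and exists only for illustration. It behaves as shown under `-Cpanic=unwind`; under `-Cpanic=abort` the panicking call would terminate the process instead of returning the fallback.

```rust
use std::panic::{self, UnwindSafe};

/// Hypothetical helper (not part of any proposal): runs `f` and returns
/// `fallback` if `f` panics. This only works when unwinding is enabled.
fn run_or_fallback<F>(f: F, fallback: i32) -> i32
where
    F: FnOnce() -> i32 + UnwindSafe,
{
    // With `-Cpanic=unwind`, a panic inside `f` is caught here.
    // With `-Cpanic=abort`, the process aborts before this call returns.
    panic::catch_unwind(f).unwrap_or(fallback)
}
```

Calling `run_or_fallback(|| panic!("boom"), -1)` returns `-1` when unwinding is enabled, which is exactly the kind of program that stops "executing successfully" once the switch is flipped.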
Some other important points:
* In practice, most "C" functions are never expected to unwind (because they are written in C, for example, and not in C++).
* However, because unwinding is now part of most system ABIs, even C functions can unwind -- most notably, `pthread_cancel` can cause all manner of C functions to unwind, including common functions like `read` and `write`.
* (For a complete list, search for "cancellation points" in the [pthreads man page](http://man7.org/linux/man-pages/man7/pthreads.7.html).)
## Background: foreign exceptions
## Alternatives
There are many different possible designs, but the most important question is to decide between two alternatives:
* Permit any "C" function to unwind
* Add a new ABI ("C unwind") that permits unwinding; the "C" ABI is specified as the system ABI, except that unwinding is UB
## Permitting foreign functions to unwind
In fact, these two alternatives have a lot in common. Let's start with that. In both cases there is **some** ABI that permits C functions to unwind. In one case, it is just "C", but in the other, it is "C unwind". So let's talk a bit about the implications of that. In this section, I will write "C/unwind" to refer to "the non-Rust ABI that permits unwinding, whichever one that is".
### Mapping to/from the system unwinding mechanism
Under both alternatives, we must define certain things about how Rust panics interact with the system unwinding mechanism. Currently, Rust functions use the system unwinding mechanism to propagate panics, but we want to ensure that we can change that to something else in the future.
We do this by defining "conversion" functions. A Rust panic that unwinds across a "C/unwind" boundary is conceptually "converted" into some kind of foreign exception. Similarly, when a "C/unwind" function unwinds, that is conceptually converted back into Rust's native unwinding mechanism.
We can choose how much of these details to specify, and we will likely proceed in stages. To start, we might simply specify that Rust panics can be converted into the system unwinding mechanism and then back into Rust faithfully, but nothing more. This means that we reserve the right to change how a Rust panic "presents itself" (e.g., whether or not a particular `C++` catch block would intercept it). We could also go further and, for some platforms at least, specify how Rust panics are represented in the system unwinding mechanism. Eventually, we will want to talk about foreign exceptions that did not originate in Rust panics, and how they behave (e.g., can they be intercepted by `catch_unwind`?).
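As a rough mental model of the "faithful round trip" guarantee, the conversion can be imitated by hand with `catch_unwind` and `resume_unwind`. This is only a sketch under stated assumptions: the names `ForeignException`, `to_foreign`, and `from_foreign` are hypothetical, the "foreign exception" is simply the boxed panic payload, and the real conversion would be performed by the compiler and runtime rather than by user code.

```rust
use std::any::Any;
use std::panic::{self, AssertUnwindSafe};

/// Stand-in for "some kind of foreign exception" (an assumption for this
/// sketch): here it is just the boxed Rust panic payload.
type ForeignException = Box<dyn Any + Send + 'static>;

/// Hypothetical "Rust panic -> foreign exception" conversion at a boundary.
fn to_foreign<R>(f: impl FnOnce() -> R) -> Result<R, ForeignException> {
    panic::catch_unwind(AssertUnwindSafe(f))
}

/// Hypothetical "foreign exception -> Rust panic" conversion on re-entry.
fn from_foreign<R>(outcome: Result<R, ForeignException>) -> R {
    match outcome {
        Ok(value) => value,
        // Resume unwinding with the original payload, unchanged.
        Err(payload) => panic::resume_unwind(payload),
    }
}
```

The property being reserved is that `from_foreign(to_foreign(f))` panics with the same payload that `f` panicked with; how the exception looks to non-Rust code in between is deliberately left open.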
The important point for this discussion, though, is that all of this is orthogonal to the question at hand. **Both proposals have to do the same specification work**. In one case, we do it for the "C" ABI, in the other, for the "C unwind" ABI.
### To avoid UB, `-Cpanic=abort` will require some shims if we permit foreign exceptions that did not originate from a Rust panic
Rust permits programs to be compiled with `-Cpanic=abort`. This switch has two effects:
* Any attempt to `panic!` does not initiate unwinding. Instead, it aborts the process.
* As a result, we are able to remove "landing pads" and other bits of code from the executable. This can result in significant savings. Fuchsia measured a win of ~10% overall.
However, all of this assumes that the only way unwinding can initiate is through a Rust panic. As noted above, this will not always be true. Consider the case of a call to `read` that unwinds (because the thread was canceled). In that case, unwinding begins in C code, so we cannot make it abort at the point where unwinding starts. We have to instead figure out what happens when the unwinding reaches Rust code.
The situation is further complicated by the fact that, on Windows, `longjmp` and even segfaults are handled using unwinding. Even with `-Cpanic=abort`, we would like to permit code to `longjmp` over Rust frames, so long as those frames do not contain any destructors (we'll call those "inert" frames). This is important for interoperating with various C libraries that make use of `longjmp` as their error recovery mechanism. (Note that, on unix-y systems, `longjmp` works via a mechanism distinct from unwinding, and so it works naturally -- but that mechanism also ignores all destructors.) A similar problem occurs with `pthread_exit` which, on some systems, triggers unwinding.
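To give a feel for which frames count as "inert", the following sketch uses `std::mem::needs_drop` to ask whether a local of a given type would need a destructor to run. The helper `is_inert_local` is hypothetical, and `needs_drop` is documented as a possibly conservative hint, so this is an approximation of the idea rather than a definition.

```rust
use std::mem::needs_drop;

/// Hypothetical helper: reports whether a local of type `T` would leave a
/// frame "inert" (no destructor to run if the frame is unwound or jumped over).
/// `needs_drop` may be conservative, so treat a `false` result here as
/// "might need cleanup", not as proof that it does.
fn is_inert_local<T>() -> bool {
    !needs_drop::<T>()
}
```

For instance, a frame that only holds an `i32` or a `[u8; 16]` buffer is inert in this sense, while a frame that holds a `String` or a `Vec<u8>` is not.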
So how should we handle the scenario where a foreign exception triggers unwinding but `-Cpanic=abort` is specified? There are two basic options:
* Declare it UB if a foreign exception unwinds a frame that contains destructors under `-Cpanic=abort`
* Abort if a foreign exception unwinds a frame that contains destructors under `-Cpanic=abort`
#### UB permits full optimization
It is certainly helpful to declare throwing a foreign exception to simply be UB if destructors are present. This means we can optimize all landing pads away. However, it does mean that the code is more brittle -- libraries that work with `-Cpanic=unwind` may start to fail in strange ways with `-Cpanic=abort`. Admittedly, the alternative of guaranteeing aborts just means that this same code would hard abort, but that is still a cleaner form of failure and easier to diagnose.
#### Guaranteeing an abort is hard to do locally
But how can we achieve the second strategy? Initially, we thought that we could simply wrap the callsite and abort the process if unwinding occurs. Effectively we'd convert a call to some foreign function `foo()` into pseudo-code like:
```
try {
    call foo()
} catch {
    abort()
}
```
However, this will trigger an abort even if there are no destructors to execute, which means that `longjmp`, `pthread_exit`, and other such functions will not work as expected (they will abort unilaterally on Windows).
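In Rust terms, the call-site shim might look roughly like the sketch below. The name `call_with_abort_shim` is hypothetical, and a Rust panic caught by `catch_unwind` stands in for the foreign exception; the real shim would be emitted by the compiler around the foreign call.

```rust
use std::panic::{self, AssertUnwindSafe};
use std::process;

/// Hypothetical call-site shim: abort if anything unwinds out of `callee`.
/// A Rust panic stands in for the foreign exception in this sketch.
fn call_with_abort_shim<R>(callee: impl FnOnce() -> R) -> R {
    match panic::catch_unwind(AssertUnwindSafe(callee)) {
        Ok(value) => value,
        // The shim cannot tell whether any destructor would actually have
        // been skipped, so it has to abort unconditionally.
        Err(_) => process::abort(),
    }
}
```

The unconditional `abort` in the error arm is the problem: the shim has no way to know that the frames being unwound are all inert.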
The key problem here is that the "no destructors would execute" property is not locally observable to the call site. So the best we can do is to make **all destructors** trigger an abort with `-Cpanic=abort` when executed during unwinding. But this works against the space savings of `-Cpanic=abort`, since we wanted to strip out all the unwinding code, and now we have to keep it.
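A hand-written analogue of "every destructor aborts during unwinding" is sketched below. The type `AbortIfUnwinding` and the function `with_guard` are hypothetical, and `std::thread::panicking()` only detects Rust panics, so it is an assumption-laden stand-in for "a foreign exception is in flight".

```rust
use std::process;
use std::thread;

/// Hypothetical type whose destructor refuses to run during unwinding.
struct AbortIfUnwinding;

impl Drop for AbortIfUnwinding {
    fn drop(&mut self) {
        // True only while this thread is unwinding from a Rust panic;
        // used here as a stand-in for any unwinding exception.
        if thread::panicking() {
            process::abort();
        }
    }
}

/// Runs `body` in a frame that is deliberately not inert.
fn with_guard<R>(body: impl FnOnce() -> R) -> R {
    let _guard = AbortIfUnwinding;
    body()
}
```

Every destructor would need an equivalent check and the landing pad that reaches it, which is the code `-Cpanic=abort` was supposed to remove.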
(Another possible option would be to distinguish *longjmp* or other specific cases. We might allow *those specific exceptions* to propagate, and hence cause UB if there are destructors in scope, but intercept *other* exceptions and abort. This way we can just insert shims at the call site and not elsewhere, and we match the behavior of longjmp on Linux. But it still doesn't permit the full optimization we might like, since we still require shims at the call sites.)
#### A hybrid: make it UB, but abort in debug builds
A hybrid option would be to declare it to be UB when a foreign exception unwinds a frame with destructors, but to try and help users to diagnose the problem by inserting abort shims in debug builds. This would be helpful in debug builds but would also allow full size optimization in release builds.
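A sketch of what the hybrid could look like at a call site follows. It assumes that `cfg!(debug_assertions)` is an acceptable proxy for "debug build" and again uses a Rust panic as a stand-in for the foreign exception; the name `call_with_debug_shim` is hypothetical.

```rust
use std::panic::{self, AssertUnwindSafe};
use std::process;

/// Hypothetical hybrid call site: checked in debug builds, bare in release.
fn call_with_debug_shim<R>(callee: impl FnOnce() -> R) -> R {
    if cfg!(debug_assertions) {
        // Debug build: turn the would-be UB into a clean, diagnosable abort.
        match panic::catch_unwind(AssertUnwindSafe(callee)) {
            Ok(value) => value,
            Err(_) => process::abort(),
        }
    } else {
        // Release build: no shim at all, so nothing related to unwinding
        // needs to be kept in the binary for this call.
        callee()
    }
}
```

Note that a debug-build shim of this shape would also abort a `longjmp` over inert frames on Windows, so it diagnoses more than the UB case alone.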
### Permit unwinding through "C" ABI
Under this alternative, unwinding is largely orthogonal to the ABI. Both "Rust" and "C" functions can unwind and propagate Rust panics.
In order to satisfy the constraint that
Here, we say that the "C" ABI may unwind. If unwinding occurs, the exception will be mapped to a Rust panic.
Effectively the model
Here, we say that the "C" ABI matches the default system ABI. This implies that it is possible
In order to support
### Add a second ABI, "C unwind", to permit unwinding
# OUTDATED TEXT
### Conversion of Rust panics to and from foreign exceptions
Remember that, today, Rust functions use the system unwinding mechanism to propagate panics, but that we wish to reserve the right to change this. In particular, we wish to be able to change this for functions that use the "Rust" ABI (the default). We reserve the right to do this by saying that, conceptually, there is a "conversion" that occurs when a Rust panic propagates through a C ABI. At the moment, since Rust functions use the system unwinding ABI, these conversions are no-ops: no action is required. But we reserve the right to insert these sorts of conversions in the future if we wanted to change the Rust ABI (or if the system ABI were to change, and we didn't wish to follow suit).
Consider this example. Start with a Rust function `z` declared with `C` ABI, which panics:
```rust
extern "C" fn z() {
    // The panic starts here and unwinds out through the "C" ABI boundary.
    panic!()
}
```
The idea here is that the Rust panic would conceptually be "converted" into some kind of foreign exception as we unwind past `z()`. The precise details of this conversion would have to be defined platform-to-platform (and we might in some cases leave certain details unspecified, so that they can change later).
On the other side, we sometimes have Rust functions which invoke functions that have the C ABI. These functions might be functions written in C, or they might be Rust functions with C ABI. Let's imagine a function `x` that invokes `z`:
```rust
fn x() {
    // `z` has the "C" ABI; its panic unwinds back into Rust at this call.
    z();
}
```
Conceptually, again, the Rust panic that started in `z` would now be converted back from a foreign exception into a native Rust panic.
### Foreign functions that throw their own exceptions
This same concept of conversion can apply to foreign functions that throw their own exceptions. For example, we may wish to support invoking a C++ function that throws an error. Or we may wish to define what happens when you invoke `read()` after `pthread_cancel` has been used (which also triggers unwinding).
As ever, we don't want to specify the precise unwinding mechanism that Rust functions use. But we might want to specify how foreign exceptions would interact with Rust code. There are basically two variables:
* Do destructors run? (The answer simply must be yes, or else unwinding would be UB).
* What happens when `catch_unwind` is invoked?
Conceptually, the idea here is similar. Upon entering "Rust", the foreign exception is intercepted and some (unspecified) Rust unwinding mechanism occurs. But there
Rust
One simple case we wish to support is that Rust panics can be faithfully propagated through the "C" ABI and back into Rust. This means that we would support a case like the following:
* Imagine a Rust function `fn x() { y() }`,
* that invokes a C function `void y() { z() }`,
* and `y()` in turn invokes another Rust function `extern "C" fn z() { panic!() }`,
* where `z()` panics.
Here, the Rust panic must be (conceptually) "converted" into a foreign exception
Here a Rust panic (from `z()`) will unwind through the "C" ABI into `y`, and then in turn transition back into Rust. Conceptually, the Rust panic is converted into a foreign exception as we unwind from `z` and then converted back to
Note that in all cases, support for unwinding would depend on the platform. In particular, some platforms may not support unwinding, and for others we may not have defined how Rust interacts with unwinding.