# ffi-unwind -- meaning of the "C" ABI
## Information needed before a decision can be reached
* Most urgent is [measurements of the code size impact](https://rust-lang.zulipchat.com/#narrow/stream/210922-wg-ffi-unwind/topic/measuring.20code.20size/near/180244849)
## Considerations
These are abstract considerations that come into play across the various proposals. The following section will try to outline different options and their tradeoffs in terms of these considerations.
In some cases, we give a short title like "Foo: blah blah blah". This "Foo" tag is referenced later.
### Optimizations: Unwinding has an impact on binary size and performance
Despite the claim to be "zero cost", making allowances for unwinding has an impact on binary size. It can also hinder optimizations.
tmandry has gathered numbers estimating that `-Cpanic=abort` reduces the overall size of a Fuchsia binary by 10-20% ([Zulip](https://rust-lang.zulipchat.com/#narrow/stream/210922-wg-ffi-unwind/topic/measuring.20code.20size)), but we don't yet know what the impact would be of having FFI calls immediately abort.
### In practice, most FFI calls do not unwind
We believe that the vast majority of FFI calls are to functions (typically implemented in C) that are never expected to unwind.
However:
* thanks to pthread cancelation (see below), pure C functions can in fact unwind;
* of course some *projects* may indeed invoke a lot of unwinding functions, if they are e.g. bridging to C++.
### Interaction with `-Cpanic=abort` and UB
Today, the `-Cpanic=abort` feature works by changing `panic!()` to cause an immediate abort at the site of panic. This in turn means we can remove all the "landing pads" and thus remove the impact of unwinding completely.
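As a small illustration of the two modes, a crate can observe which strategy it was compiled with. This is only a sketch, assuming a toolchain that supports the `panic` configuration predicate; the function name `panic_strategy` is made up for this note.

```rust
/// Reports the panic strategy this crate was compiled with.
/// `panic_strategy` is a hypothetical helper, not part of any proposal.
pub fn panic_strategy() -> &'static str {
    if cfg!(panic = "abort") {
        // Built with -Cpanic=abort: `panic!()` aborts at the panic site,
        // so no landing pads are needed for Rust panics.
        "abort"
    } else {
        // Built with -Cpanic=unwind (the usual default): panics unwind
        // and run destructors through landing pads.
        "unwind"
    }
}
```

The same source compiles under both modes, which is exactly why behavior that differs between them is a concern.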
However, **if the source of the unwinding is foreign code, that is not possible**. Instead, if we wish to avoid UB, we have to add landing pads to the call sites of any foreign function that may unwind, such that we can abort when the unwinding enters a Rust frame. Note that all designs include ways to indicate foreign functions that cannot unwind, though they vary in defaults and approach. Inserting these landing pads has a certain cost, though it is less than the full cost of `-Cpanic=unwind` (since in that case landing pads are required on **all** call sites, not just foreign call sites).
You might ask "why not just have it be UB to invoke a foreign function that unwinds with `-Cpanic=abort`?" The problem here is that then you can have libraries which compile under both modes but which produce UB when executed with `-Cpanic=abort`. If we install "abort guards", those same libraries would merely abort at runtime, which is much more noticeable.
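The "abort guard" idea can be modeled at the library level. The sketch below is only an approximation: the proposals concern landing pads inserted by the compiler at foreign call sites, whereas this uses an ordinary destructor, and the names `AbortOnUnwind` and `call_with_abort_guard` are hypothetical.

```rust
use std::mem;
use std::process;

/// Hypothetical guard: if it is ever dropped, the process aborts.
struct AbortOnUnwind;

impl Drop for AbortOnUnwind {
    fn drop(&mut self) {
        // Only reached when unwinding passes through the guarded frame,
        // because the normal-return path below forgets the guard.
        process::abort();
    }
}

/// Runs `f`; if `f` unwinds, the guard's destructor aborts the process
/// instead of letting the unwind continue into the caller.
pub fn call_with_abort_guard<R>(f: impl FnOnce() -> R) -> R {
    let guard = AbortOnUnwind;
    let result = f();
    // Normal return: disarm the guard without running its destructor.
    mem::forget(guard);
    result
}
```

A call that returns normally is unaffected, for example `call_with_abort_guard(|| 2 + 2)` evaluates to `4`; a call that unwinds terminates the process, which is far more noticeable than silent UB.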
If you come from C++, this might not seem like a big deal. After all, mixing `-fexceptions` code with `-fno-exceptions` code is going to mess everything up. But it's the kind of low-level footgun we'd prefer to avoid in Rust if we can.
### it can be really useful to rule out unwinding
Particularly in unsafe code, it can be tricky to make code "unwind safe". The more it is possible to rule out unwinding at a given call site, the easier it is to reason about things overall, since all control-flow paths are explicit.
Rust has generally set itself up to minimize the need to reason about unwinding, except across very specific boundaries (threads, `catch_unwind`).
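To make the "implicit control-flow path" concrete, here is a minimal sketch in safe Rust, assuming the default `-Cpanic=unwind`. The type `Pair` and its invariant are invented for illustration. A callback that unwinds in the middle of an update leaves the invariant broken, and `catch_unwind` lets the caller observe that state.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical type whose invariant is `left == right`.
pub struct Pair {
    pub left: u32,
    pub right: u32,
}

impl Pair {
    /// Updates both fields, calling `between` in the middle.
    /// If `between` unwinds, `right` is never updated.
    pub fn bump_both(&mut self, between: impl FnOnce()) {
        self.left += 1;
        between(); // an unwind here skips the line below
        self.right += 1;
    }

    pub fn is_consistent(&self) -> bool {
        self.left == self.right
    }
}

/// Returns true if the unwind was caught and the invariant was left broken.
pub fn observe_broken_invariant() -> bool {
    let mut pair = Pair { left: 0, right: 0 };
    // `AssertUnwindSafe` is needed because the closure captures `&mut pair`.
    let result = panic::catch_unwind(AssertUnwindSafe(|| {
        pair.bump_both(|| panic!("callback unwound"));
    }));
    result.is_err() && !pair.is_consistent()
}
```

If `between` were known never to unwind, `bump_both` would have a single exit path and the invariant would be trivially restored.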
### Orthogonal: most concerns about unwinding are orthogonal to the ABI itself
The previous point notes that it is useful to rule out unwinding -- this could argue for making "C" FFI calls not unwind, but it also could argue for finding a uniform mechanism to rule out unwinding that also applies to "Rust" FFI calls.
The ABIs of all Tier 1 platforms already support unwinding in any case.
The existence of the `no_unwinding` crate can be viewed as a feature request in this regard. It might be useful to study the crates that use it to see what their requirements are.
### Effects: if we are truly trying to rule out unwinding as a "language feature", we are touching on an effect system
In short, if we really want to support people ruling out unwinding in full, that opens up a big can of worms. It's relatively easy to declare that individual free functions do not unwind, but much harder to support all the kinds of cases that can arise:
* being able to rule out unwinding from trait methods, and possibly applying it as an effect to existing traits
* being able to rule out unwinding from function pointers
* being able to "be generic" over unwinding, e.g. to write functions that *may* unwind
In short, we need an effect system, and it seems clear that this is out of scope for this particular project group (and at this time). What is not clear, though, is how often the "full complexity" here is needed (and how much it hurts not to be able to express it). Nonetheless, it may be wise to think a bit ahead to what generalization might look like.
### longjmp can be considered unwinding on some platforms
MSVC in particular uses unwinding for longjmp. In C++, it is considered UB to longjmp over a frame with destructors, but on msvc targets it is defined behavior and runs those destructors.
(Is this actually relevant?)
### libc: pthread cancelation means unwinding can occur in a lot of places
Because of the potential for pthread cancelation, a large set of libc bindings would more accurately be "C unwind". The [pthreads man page] lists them out (search for "cancellation points"), and they include many common functions.
[pthreads man page]: http://man7.org/linux/man-pages/man7/pthreads.7.html
This implies a few things:
* to be correct, libc would have to either:
    * rule out cancelation (i.e., declare that it is UB to use cancelation), or
    * change the types, inducing a 2.0, which would cause major ecosystem disruption
        * but this is believed to be ["not as bad" as 1.0 was the first time](https://rust-lang.zulipchat.com/#narrow/stream/210922-wg-ffi-unwind/topic/.22C.20unwind.22.20vs.20.22C.20noexcept.22/near/179436396) because we are now using more type aliases for things like `libc::c_int`
### system: C is the system ABI, and system ABIs leverage unwinding
Building on the previous point, there is a point of view that the "C" ABI should correspond to the full system ABI, and not some subset -- in practice, systems support unwinding, and (as pthread, longjmp show) they sometimes opt to expose that in common C APIs. Many systems also support raising exceptions in C and other languages as well through various means, though that is not commonly done in practice.
### expectations
We've been saying that unwinding across a "C" boundary is UB and that we expect it to even become a hard abort. If we change those defaults, that will at minimum surprise some folks -- can it cause other sorts of problems? I'm not aware of any apart from code size; you could imagine unsafe code that invokes `extern "C" fn()` pointers and assumes they don't unwind, but if those callees are unknown, then that would already be an "unsafe" thing to do... there is a sort of subtle argument about who is allowed to assume that things which are presently UB never happen. It would be good to have real examples here.
### platform dependent
The details of unwinding are always target and platform dependent. We aim to support a consistent view across Tier 1 platforms, but even there, there can be variation (notably around longjmp on MSVC, which is only tangentially unwinding). We will always need to have a clear page indicating which details of unwinding are specified and on which platforms.
## Options
There are two major axes under consideration:
* How users control whether unwinding is permitted:
    * Use the ABI (e.g., using "C unwind" or "C nounwind")
    * Use an attribute like `#[unwind(...)]`
* How to set the "defaults" -- there are a number of varieties here:
    * Default all functions to permit unwinding
    * Default Rust functions to permit unwinding, foreign ABIs not
    * Default things written in Rust one way, default external functions another
### Introduce "C unwind", which permits unwinding, and make unwinding through "C" UB
In this version, unwinding is tied to the ABI completely:
* Unwinding through Rust ABI is permitted (when not using `-Cpanic=abort`)
* Unwinding through "C" ABI is never permitted
* Unwinding through "C unwind" ABI is permitted, but aborts when using `-Cpanic=abort`
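A minimal sketch of how the split might look in source, assuming a recent stable toolchain that accepts the spelling `extern "C-unwind"` for what this note calls "C unwind", and the default `-Cpanic=unwind`. The function names are hypothetical.

```rust
use std::panic;

/// Declared with the unwinding-capable ABI: a panic may propagate
/// out of this function into its caller.
pub extern "C-unwind" fn checked_double(x: i32) -> i32 {
    x.checked_mul(2).expect("overflow while doubling")
}

/// Declared with plain "C": under this option it must never unwind,
/// so it is written to be panic-free and saturates instead.
pub extern "C" fn saturating_double(x: i32) -> i32 {
    x.saturating_mul(2)
}

/// Returns true if the unwind out of `checked_double` was caught.
pub fn overflow_unwinds() -> bool {
    panic::catch_unwind(|| checked_double(i32::MAX)).is_err()
}
```

With `-Cpanic=abort`, the overflow case would abort rather than unwind, which matches the third rule above.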
Advantages of this approach:
* **Optimizations:** In practice, most FFI calls do not unwind, and we would be permitted to assume that.
    * Only those functions with "C unwind" ABI will require abort shims with `-Cpanic=abort`
Disadvantages of this approach:
* Not **orthogonal:** We don't have mechanisms for declaring Rust functions do not unwind.
* **System:** The "C" ABI exposes only a subset of the platform, and users must be aware of that when making declarations. This can be error prone, as evidenced by libc.
* **Libc:** As a consequence, libc's declarations are incorrect, and we have to either rule out pthread cancelation or declare a libc 2.0
### Permit unwinding in "C", introduce "C nounwind"
In this version, we still tie unwinding to ABI, but we alter the defaults.
* Unwinding through Rust ABI is permitted (when not using `-Cpanic=abort`)
* Unwinding through "C" ABI is permitted, and an abort shim is added when using `-Cpanic=abort`
* Unwinding through "C nounwind" ABI is UB.
Relative to the previous alternative, this gives up the **optimizations** advantage (most FFI calls would need abort shims with `-Cpanic=abort`) and aligns "C" with the **system** ABI, but it still does not have the benefit of orthogonality.
### Permit unwinding in "C", introduce attributes to disallow unwinding (default to permit)
In this version, unwinding is essentially "orthogonal" to the ABI. Thus, "C" and "Rust" functions both permit unwinding by default, at least on major Tier 1 platforms.
Functions can declare that they do not unwind using an `#[unwind]` attribute, which contains a number of variations (the details of these options themselves could be tweaked as well):
* `#[unwind(allowed)]` -- the default
* `#[unwind(no)]` -- declares unwinding does not occur
    * if this function is defined in the current crate, this will abort if unwinding actually occurs
    * if this is a foreign function, it will be UB for unwinding to occur
* ...?
For functions we generate, these declarations also translate to the appropriate `nounwind` attributes in LLVM.
Whenever we have a call site to a statically known function, we can control whether we generate landing pads by checking if that function has a `#[unwind]` declaration. Whenever we are invoking through function pointers, we cannot, although if LLVM can resolve the target of the function statically, and the function has a `#[unwind]` declaration, then it may be able to eliminate landing pads.
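The difference between the two kinds of call site can be sketched as follows. The `#[unwind]` attribute is only a proposal in this note, so it appears in comments rather than in code, and the function names are hypothetical.

```rust
/// A function with the "C" ABI. Under this option it would permit
/// unwinding by default; a hypothetical `#[unwind(no)]` placed here
/// would declare that it never unwinds. It is written to be panic-free.
pub extern "C" fn add_one(x: i32) -> i32 {
    x.wrapping_add(1)
}

/// Statically known callee: the compiler can look up the declaration of
/// `add_one`, so it could decide per call site whether a landing pad
/// is needed.
pub fn call_direct(x: i32) -> i32 {
    add_one(x)
}

/// Call through a function pointer: the pointer type carries only the
/// ABI and signature, so nothing about the callee's unwinding behavior
/// is visible here unless the optimizer resolves the target.
pub fn call_indirect(f: extern "C" fn(i32) -> i32, x: i32) -> i32 {
    f(x)
}
```

Both calls compute the same result, for example `call_direct(1)` and `call_indirect(add_one, 1)` both give `2`; the difference lies only in what the compiler can assume about unwinding at each site.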
Advantages of this approach:
* **Orthogonal:** Uniform treatment of unwinding for all functions. If you wish to declare (or guarantee) that a certain function does not unwind in your unsafe code, you can do so, regardless of what language it is in.
* **System:** The "C" ABI matches the full capabilities of the platform.
* **Libc:** As a consequence, the existing declarations accommodate pthread cancelation just fine.
Disadvantages of this approach:
* **Optimizations:** As most FFI calls do not unwind in practice, but the defaults assume they might, we will generate more landing pads than we otherwise would.
    * In particular, most FFI calls will require abort shims with `-Cpanic=abort`
* **Expectations:** This is a change from what we said we would do.
### Permit unwinding in "C", introduce attributes to disallow unwinding (default to deny)
This is similar to the above, except that we would require **opt-in** to enable unwinding, rather than opt-out. The main proposal was that "C" ABI functions default to "unwind(no)".
Advantages of this approach relative to the opposite default:
* **Optimizations:** In practice, most FFI calls do not unwind, and we would be permitted to assume that.
    * Only those functions with explicit attributes will require abort shims with `-Cpanic=abort`
Disadvantages of this approach:
* Inconsistent / complex to explain:
    * We are now saying that the "C" ABI can permit unwinding, but you must opt in.
    * If you do not opt in and unwinding occurs anyway, the outcome depends on where the callee is defined: it aborts if the callee is defined in Rust, but it is UB (arbitrary behavior) if the callee is not defined in Rust.
    * Net result:
        * `extern "C" fn foo() { panic!() }` -- aborts
        * `extern "C" { fn foo(); }` -- if invoked directly, UB if it unwinds; if invoked by pointer, it is ok for it to unwind (because the C ABI supports unwinding)
        * `fn foo() { panic!() }` -- works