# Why specialization has soundness issues
The core reason behind the soundness issue is the one-two punch of:
1. Specializing over lifetimes is unsound
2. It is nearly impossible to prevent specializing over lifetimes
It's best to look at these statements separately since trying to think of this end-to-end is hard.
## Why can't you prevent specializing over lifetimes?
Let's look at #2 first. It's easy to say that we can forbid the following:
```rust
trait Foo {}
default impl<'a> Foo for &'a str {}
impl Foo for &'static str {}
```
However, lifetime specialization can arise indirectly, for example through bounds:
```rust
// crate A
trait Bound {}
// crate B (doesn't know about C)
trait Foo {}
default impl<T> Foo for T {}
impl<T: Bound> Foo for T {}
// crate C (doesn't know about B)
pub struct Oops<'a>(&'a str);
impl Bound for Oops<'static> {}
```
Here `Oops<'static>` maps to a different impl than `Oops<'a>`.
It can get worse: you don't even need to mention a lifetime for this to happen.
```rust
trait Foo {}
default impl<T> Foo for T {}
impl<T> Foo for (T, T) {}
```
Here, `(&'a str, &'static str)` will not have the same impl as `(&'static str, &'static str)`.
You can also use tricks like the ones above to induce specialization on bounds like `'a: 'b`, etc.
As shown before, constraints like this can get hidden behind more layers of indirection, making it impossible to detect at definition time, and only possible at use-time.
Rust is extremely allergic to type system errors of that kind: if you have a generic, all types that satisfy its trait bound should Just Work. Panics in consteval are one exception to this, but it's one that's relatively easy to debug and usually fixable at use time; whereas these kinds of type system errors may require changing multiple crates.
Now, there are some designs that work around all of this, for example the ["always applicable" rule](https://smallcultfollowing.com/babysteps/blog/2018/02/09/maximally-minimal-specialization-always-applicable-impls/#when-is-an-impl-always-applicable). This would do two things:
- Traits used in specialization bounds MUST use `#[specialization_predicate]`, which greatly restricts what the trait can do. This will make most traits illegal in specialization bounds, unfortunately (including very useful ones like `Copy`)
- Specializations and specialization predicate traits must not involve lifetimes, and must use each type parameter exactly once (plus some more nitty gritty things that don't matter as much)
Work is stalled, though, for several reasons:
- it will require a lot of design work
- it will require a lot of implementation work blocked on chalk
- it will greatly restrict specialization as a feature and make it not useful for many common use cases
- it's unclear if this is the right path to take
## Why is specialization over lifetimes unsound?
There are two reasons: one is an implementation constraint, and one is a design constraint.
They both relate to how lifetimes are treated in Rust. Fundamentally, lifetimes are very squishy, and stretch and squeeze at will. This is distinct from most other coercions: e.g. `*mut T` will coerce to `*const T`, but `Option<*mut T>` [will not](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=bc07429d443d6d171138ca453d21ff92) coerce to `Option<*const T>`. For other coercions you usually can treat them as an implicit cast operation occurring at some point.
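The pointer case can be sketched on stable Rust as follows. This is only an illustration; the helper `to_const` is a hypothetical name introduced here, not something from the standard library.

```rust
/// Hypothetical helper: converts explicitly, because `Option<*mut T>`
/// does not coerce to `Option<*const T>` on its own.
fn to_const<T>(p: Option<*mut T>) -> Option<*const T> {
    p.map(|raw| raw as *const T)
}

fn main() {
    let mut value = 5u8;
    let raw_mut: *mut u8 = &mut value;

    // At the top level the coercion is implicit.
    let raw_const: *const u8 = raw_mut;

    let wrapped: Option<*mut u8> = Some(raw_mut);
    // Inside a generic wrapper it is not; this line is rejected if uncommented:
    // let bad: Option<*const u8> = wrapped;

    // An explicit conversion is needed instead.
    let converted = to_const(wrapped);
    assert_eq!(converted, Some(raw_const));
}
```

The coercion happens once, at a visible point in the program, and the resulting value keeps its new type from then on.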
However, `&'static str` will happily coerce to a shorter lifetime, no matter how deep in a stack of generics it is (provided that the generic position it occupies is covariant).
Furthermore, and this is the kicker: lifetimes can coerce _back_. Other coercions behave like "casts": once a value is coerced, it stays coerced. With lifetimes, by contrast, you can pass two different lifetimes into a function that takes `(&'a str, &'a str)`; the compiler will temporarily cast them for the purposes of that function, and the lifetimes will still be different after the call.
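A minimal sketch of this stretching on stable Rust, assuming two hypothetical helper functions `longer` and `total_len` made up for the illustration:

```rust
/// Both arguments must share a single lifetime `'a`.
fn longer<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() >= y.len() { x } else { y }
}

/// The references inside the vector and `extra` must share `'a` as well,
/// even though the lifetime sits one generic layer deeper.
fn total_len<'a>(words: &Vec<&'a str>, extra: &'a str) -> usize {
    words.iter().map(|w| w.len()).sum::<usize>() + extra.len()
}

fn main() {
    let forever: &'static str = "static text";
    let words: Vec<&'static str> = vec!["a", "bc"];

    {
        let local = String::from("short");
        // `forever` is treated as living only as long as `local` for this call.
        let picked = longer(forever, &local);
        assert_eq!(picked, "static text");
        // The same shrinking applies inside `Vec`, because `Vec<T>` is covariant in `T`.
        assert_eq!(total_len(&words, &local), 8);
    }

    // After the calls, both values are usable at `'static` again.
    let still_static: &'static str = forever;
    let still_static_words: Vec<&'static str> = words;
    assert_eq!(still_static.len(), 11);
    assert_eq!(still_static_words.len(), 2);
}
```

Nothing in the source marks where the lifetimes shrink or where they go back to `'static`; the compiler works it out silently.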
This means that the _same value_ may occupy different types over its lifetime, and it's all highly implicit. The obvious problem with this is that the same value may access different implementations of a method. That seems rather annoying but not a soundness issue. But it gets worse:
### The design issue
What do you do when this happens?
```rust
trait Foo {
type Assoc: Default;
}
default impl<'a> Foo for &'a str {
type Assoc = u8;
}
// remember, we have already shown that it's very hard to avoid this
// happening indirectly. writing a direct lifetime specialization
// for simplicity
impl Foo for &'static str {
type Assoc = String;
}
struct Thingy<T: Foo>(T, T::Assoc);
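// `Thingy` takes a type parameter, so the lifetime goes on the `&str`
fn use_thingy<'a>(x: &'a mut Thingy<&'a str>) {
x.1 = Default::default();
}
// `x` must be mutable to be borrowed with `&mut`
let mut x = Thingy("foo", String::new());
use_thingy(&mut x);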
```
The call to `use_thingy` will coerce the lifetime for the duration of that call (to something that is not `'static`), so inside the function `T::Assoc` resolves to `u8` and `x.1` is set to `0`. This is unsound, since `x.1` is supposed to be a `String`.
Note that without specialization, lifetime impls with associated types work fine because the non-`'static` type would just get rejected during typechecking for not implementing the trait.
This particular issue can be prevented by disallowing associated types from being specialized.
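The non-specialized case mentioned above can be checked on stable Rust. The sketch below reuses `Foo` and `Thingy` from the example, without the `default impl`; the functions `reset` and `reset_any` are hypothetical names added for illustration.

```rust
trait Foo {
    type Assoc: Default;
}

// Only the `'static` instantiation implements the trait; there is no
// `default impl` for shorter lifetimes to fall back on.
impl Foo for &'static str {
    type Assoc = String;
}

struct Thingy<T: Foo>(T, T::Assoc);

/// Fine: `&'static str` implements `Foo`, so `Assoc` is always `String` here.
fn reset(x: &mut Thingy<&'static str>) {
    x.1 = Default::default();
}

// Rejected by the compiler if uncommented, because `&'a str` only
// implements `Foo` when `'a` is `'static`:
// fn reset_any<'a>(x: &mut Thingy<&'a str>) {
//     x.1 = Default::default();
// }

fn main() {
    let mut x = Thingy("foo", String::from("bar"));
    reset(&mut x);
    // The field was reset to `String::default()`, i.e. the empty string.
    assert_eq!(x.1, "");
    assert_eq!(x.0, "foo");
}
```

Since no shorter-lived `Thingy` can exist, the associated type never has a chance to change underneath a value.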
### Implementation constraint
The implementation constraint is that Rust forgets about lifetimes before monomorphization. `fn foo<'a>(...)` will always compile to a single function; lifetimes do not participate in monomorphization at all. Changing that would require a major rearchitecture of the Rust compiler and would have knock-on effects on a lot of other areas.
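One way to observe this from stable Rust, as a rough sketch: a lifetime-generic function can be turned into a single function pointer that works at every lifetime, whereas a type-generic function needs a separate pointer per instantiation. The functions `first_word` and `identity` are hypothetical examples, not taken from the text above.

```rust
/// Generic over a lifetime only.
fn first_word<'a>(s: &'a str) -> &'a str {
    s.split_whitespace().next().unwrap_or("")
}

/// Generic over a type.
fn identity<T>(x: T) -> T {
    x
}

fn main() {
    // One pointer serves every lifetime `'a`.
    let f: for<'a> fn(&'a str) -> &'a str = first_word;

    let forever: &'static str = "hello world";
    assert_eq!(f(forever), "hello");
    {
        let local = String::from("short lived");
        assert_eq!(f(&local), "short");
    }

    // For a type parameter there is no such single pointer:
    // each choice of `T` has to be named separately.
    let id_u8: fn(u8) -> u8 = identity::<u8>;
    let id_string: fn(String) -> String = identity::<String>;
    assert_eq!(id_u8(7), 7);
    assert_eq!(id_string(String::from("x")), "x");
}
```

This does not prove anything about the compiler's internals, but it is consistent with there being one compiled body of `first_word` regardless of which lifetimes its callers use.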
Bear in mind that in the Rust language most lifetimes are unspecific: they crop up in generic parameters, and we talk about their *relationship* to one another, as opposed to concrete lifetimes (of which there are millions in a program). Monomorphizing for each concrete lifetime would basically involve delaying the monomorphization of any function that has references in its signature, _transitively_, until it's actually used in a way such that all concrete lifetimes are known (and then deduplicating). Lifetime stretchiness further complicates this. "Concrete lifetimes" aren't even really a thing in Rust's compiler model; it only thinks about potential relationships.
A more conservative approach could be taken: once type-monomorphized, if a lifetime parameter of a function is used in such a way that transitively depends on a specialization, that function and lifetime are specially marked, and `'static` and `'a` versions are created and compared. Functions that depend on it must do something similar.
Except lifetime specialization doesn't just happen for `'static`, it happens for relationships between lifetimes as well. So any function with multiple lifetimes (and it doesn't matter if the lifetimes are elided, so this covers a rather common type of function) must now consider all potential outlives-relationships between its lifetimes and try monomorphizing those.
This can explode pretty quickly; the crux of it is that it's not really possible to know beforehand if a function depends on specialization in a way that involves its lifetimes without actually just checking it, so this becomes a cost everyone pays.