# Non-local errors in Rust
Feb 22, 2023
Design meeting issue: [lang-team#195]
[lang-team#195]: https://github.com/rust-lang/lang-team/issues/195
## Summary
Post-monomorphization errors are part of a larger family of compile-time errors, referred to in this doc as "non-local errors". These errors are occasionally very useful but violate some nice properties of the language. The language team needs to understand the tradeoffs of these errors, then decide its stance on them and, in particular, on how they relate to the decision to [stabilize inline_const][inline_const].
[inline_const]: https://github.com/rust-lang/rust/pull/104087
## Framework and definitions
While this doc is not meant to capture every aspect or decision point regarding non-local errors, we will first attempt to establish a broad definition and framework for thinking about them.
Hopefully this provides a useful context for the specific decisions being made today, in addition to being useful framing for future conversations.
### Stages of errors
For the purposes of this conversation, in Rust, we can broadly sort _errors that indicate bugs_ into these stages. They are sorted in ascending order of how much information they incorporate.
1. **Syntactic**: Invalid syntax or use of an unknown name. The compiler does not understand what the user is trying to say.
2. **Semantic**: Usually, type system or borrow checker errors. The compiler understood the user to mean something, but it went against language rules that protect against undesired states. Pinpointing the cause of the error is relatively easy, because the language rules are designed to enable local reasoning.
3. **Non-local**: The focus of this doc. Similar to (2), except pinpointing the cause of the error is relatively hard in the general case. From the compiler's perspective something has gone wrong at the level of an entire call/use chain in the program, rather than a specific line of code or function.
4. **Global**: Similar to (3), except something has gone wrong at the level of the entire binary. There are two pieces of code that are inherently incompatible and cannot be linked together.
5. **Runtime panic**: Some unanticipated error condition arose, usually signaling a bug in the program, but it was not caught until runtime. Sometimes "at runtime" means "in production". At this point pinpointing the exact cause is usually hard and must be done by a human. Bugs caught in production can be much more expensive than bugs caught at compile time.
### Non-local errors
A non-local error is a compile-time error that appears in code only in some cases, **contingent on external factors** not controlled by the code itself.
Non-local errors are often more difficult for a user to act on than regular errors, because:
* They **violate abstraction boundaries**, requiring a user to gain more context than they generally need to use a component.
* **Pinpointing the cause** is difficult for tooling to do in an intuitive way.
* They are **reported later** in the development process, making fixing their true cause more costly. In some cases the user **does not control** the offending code.
There are a few overlapping categories of possible non-local errors:
* Language level
    * **Parameter-dependent errors** only show up for certain values of generic parameters (types or consts).
    * **Use-dependent errors** only show up when code is transitively used from a public function or `main`.
    * **Config-dependent errors** only show up when building code for certain `cfg` values.
* Tool level
    * **CLI-dependent errors** only show up when passing certain flags or commands, e.g. `cargo build` rather than `cargo check`.
    * **Implementation-dependent errors** depend on implementation details of the compiler frontend, such as how optimizations get applied.
    * **Backend-dependent errors** only show up when generating code on a particular backend, possibly for a particular target.
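As a small illustration of a config-dependent error, the sketch below hides a deliberate type error behind a `cfg` predicate. The function name `describe_width` is hypothetical and chosen only for this example. On targets where the predicate is false, the item is stripped before type checking, so the mistake would only surface for someone building for a matching target.

```rust
// Hypothetical example; `describe_width` is not from any real crate.

// Only compiled on 16-bit targets. The body contains a deliberate type
// error, but the whole item is removed before type checking elsewhere.
#[cfg(target_pointer_width = "16")]
pub fn describe_width() -> &'static str {
    let width: u32 = "sixteen"; // type error, reported only on 16-bit targets
    "16-bit"
}

// Compiled on every other target; this is the version most developers see.
#[cfg(not(target_pointer_width = "16"))]
pub fn describe_width() -> &'static str {
    "not 16-bit"
}
```

The author of this code can build and test it on a typical 32-bit or 64-bit host without ever seeing the error, which is what makes it contingent on external factors.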
#### Evaluating a non-local error
There are several axes we can use to judge how problematic a non-local error might be.
* **Axis of context**: How much information does the user need to effectively _reason_ about whether an error will occur, or how to fix it?
* **Axis of support**: How precisely can _tooling_ pinpoint the cause and suggest a remedy?
* **Axis of actionability**: How easily can the user _correct_ the error?
Generally, the closer to its actual cause an error can be reported, the better.
:::info
**What happened to post-monomorphization errors?**
An earlier version of this doc referred to _post-monomorphization errors_, a term that is tied to a compiler implementation and can mean something different based on who you talk to. It has therefore been replaced with the terms defined in the above section.
:::
## Local reasoning in Rust
### Rust traits, parametricity, and composition
Rust's trait system is designed to eliminate non-local errors, specifically parameter-dependent errors, by requiring generic code to declare all requirements in its signature.
This in turn enables users to create **high-fidelity abstractions** that cover over details of their implementation while being reusable in many different applications. These abstractions are, arguably, a key part of what enables the crates.io ecosystem to develop and interoperate in a scalable way.
However, as we'll see below, there are times when Rust chooses to pierce the veil of abstraction in favor of more practical concerns.
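To make the "requirements in the signature" point concrete, here is a minimal sketch. The function `join_all` is a hypothetical name used only for illustration. The body can use nothing beyond what the `T: Display` bound grants, and it is type-checked once, at its definition.

```rust
use std::fmt::Display;

// Hypothetical example function. The `T: Display` bound is the whole
// contract: the body may only rely on what the signature declares.
pub fn join_all<T: Display>(items: &[T], sep: &str) -> String {
    items
        .iter()
        .map(|item| item.to_string())
        .collect::<Vec<_>>()
        .join(sep)
}
```

A caller that passes a type without a `Display` impl gets an unsatisfied-bound error at the call site, rather than an error pointing somewhere inside `join_all`.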
:::info
**Comparison to C++ templates**
C++ templates are based on text substitution, meaning they don't type check until instantiated with a given type.
This can lead to cryptic errors like "no member function `length` on value of type `int`" within the template itself, when the real problem was that the template should only have been instantiated with a container type. Template instantiation errors come with a "backtrace" of instantiations for the user to search through, looking for the actual cause.[^concepts]
[^concepts]: C+\+20 introduced [concepts and constraints](https://en.cppreference.com/w/cpp/language/constraints), which are basically named boolean expressions at the level of the type system that are checked at instantiation time to provide better error messages. These are not used to check the template definition, as far as I know, and they can therefore afford to support arbitrary constraints such as `A || B` disjunctions. C++20 does not appear to be in widespread use at the time of writing.
The stark difference between Rust and C++ in this regard has led to a common belief that "Rust has no post-monomorphization errors". That isn't really true, but they are _much_ less common.
:::
### Uses of non-local errors
#### Const eval
Const eval is useful when you need to guarantee an invariant that can't be expressed in the type system. This is great for internal consistency checks (static assertions), but it can also be used to enforce non-local interface contracts. These come in several flavors:
* Type properties: Unsafe code that doesn't work with ZSTs, or requires size/alignment to be a power of 2.[^type-props]
* Domain restrictions: Generic code that doesn't work with some values of const parameters, like zero.[^as-chunks]
* Non-local static assertions: Checking that the size of a type in an external crate doesn't change, or that an anonymous future doesn't exceed a certain threshold.
[^type-props]: See for example [`sub_ptr`](https://doc.rust-lang.org/stable/std/primitive.pointer.html#method.sub_ptr), mentioned by scottmcm in [this comment](https://github.com/rust-lang/rust/pull/104087#issuecomment-1386041775).
[^as-chunks]: See for example [`as_chunks`](https://doc.rust-lang.org/stable/std/primitive.slice.html#method.as_chunks), where a decision on how to enforce `N != 0` is [currently the blocker for stabilization](https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/How.20essential.20is.20the.20compile-time.20check.20for.20empty.20arrays.3F).
In each case, reporting an error at compile time is preferred to reporting it at runtime.
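As a rough sketch of the "domain restrictions" flavor, the code below rejects `N == 0` at compile time using an associated const. All names are hypothetical. The function is only loosely modelled on the `N != 0` requirement discussed for `as_chunks` and is not the standard library implementation.

```rust
// Hypothetical helper type whose only job is to host the assertion.
struct AssertNonZero<const N: usize>;

impl<const N: usize> AssertNonZero<N> {
    // Evaluated at compile time for each `N` that is actually used.
    const OK: () = assert!(N != 0, "chunk size must be non-zero");
}

// Hypothetical function: counts how many full chunks of length `N` fit.
pub fn count_full_chunks<T, const N: usize>(items: &[T]) -> usize {
    // Referencing the const forces its evaluation for this particular `N`.
    let () = AssertNonZero::<N>::OK;
    items.len() / N
}
```

Under these assumptions, an instantiation such as `count_full_chunks::<u8, 0>` is expected to be rejected during a build that instantiates it, with the error pointing at the assertion rather than at the caller's signature. That is the parameter-dependent behavior this doc is concerned with.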
#### Other mechanisms
Sometimes Rust itself defers errors from local to non-local, global, or runtime.
* `Cell` and `RefCell` work around borrow checker restrictions by deferring checking to runtime.
* Opaque types leak auto traits, eliminating the need to write them in a bunch of places, and working around the lack of implication bounds (e.g. `impl Future + (Send if T: Send)`). This can violate abstraction boundaries and create semver hazards.
* The global allocator can only be defined once; this eliminates the need to pass an allocator parameter through every type.
* `mem::transmute` gives an error if transmuting between two types of different sizes.
* `Any` and `TypeId` allow a user to do dynamic type checking at runtime.
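Two of the items above, `RefCell` and `Any`, can be sketched briefly to show what deferring a check to runtime looks like. The function names are hypothetical; the standard library calls (`try_borrow_mut`, `downcast_ref`) are real.

```rust
use std::any::Any;
use std::cell::RefCell;

// Returns true if a second mutable borrow is refused while the first is
// still live. The conflict is detected at runtime, not by the borrow checker.
pub fn second_borrow_is_refused(cell: &RefCell<Vec<u32>>) -> bool {
    let _first = cell.borrow_mut();
    cell.try_borrow_mut().is_err()
}

// Dynamic type check: yields the length only if `value` is a `String`.
pub fn string_len(value: &dyn Any) -> Option<usize> {
    value.downcast_ref::<String>().map(|s| s.len())
}
```

In both cases the code compiles regardless of whether the check will pass; the failure, if any, only becomes visible when the program runs.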
:::info
**Avoiding global errors in Rust**
Note that linking errors, which are global, can be a common source of pain in other systems languages. Rust tries hard to avoid these with features like:
* Orphan rules in the trait system
* Lack of a global namespace (undone by `#[no_mangle]`)
* Hashes in mangled symbol names
* Managing the linker invocation for you
* The `-sys` crate pattern
Similarly, dynamic languages run into global conflicts when multiple libraries "monkey patch" the same code in an incompatible way (common in Ruby, for instance). Rust prevents this with a strong static type system that enforces abstraction boundaries.
In situations where global errors can arise (such as the global allocator), they are expected to be easily diagnosed and actionable by the binary crate author.
:::
### Non-local errors in APIs
Library authors sometimes have to choose a level at which to encode an invariant.
For instance, a crate author could choose to parameterize their entire crate on some pervasive concern like logging or an async executor. Or they could choose to require that a global context is set and defer the check to runtime. The latter is usually more ergonomic; it can be a good choice when the user is likely to encounter and fix any problems during development.
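The two choices might look roughly like the sketch below. Everything here (`Sink`, `Prefixed`, `report_with`, `report`, `set_global_sink`) is hypothetical and exists only to contrast the two stages. A real crate might panic instead of returning `None` when the global is missing.

```rust
use std::sync::OnceLock;

// Hypothetical sink trait; the names here are illustrative only.
pub trait Sink {
    fn write(&self, line: &str) -> String;
}

pub struct Prefixed(pub &'static str);

impl Sink for Prefixed {
    fn write(&self, line: &str) -> String {
        format!("{}{}", self.0, line)
    }
}

// Option 1: the requirement lives in the signature and is checked locally,
// at compile time, wherever `report_with` is called.
pub fn report_with<S: Sink>(sink: &S, line: &str) -> String {
    sink.write(line)
}

// Option 2: the requirement is a global that must be set before use.
static GLOBAL_SINK: OnceLock<Prefixed> = OnceLock::new();

// Returns false if a global sink was already installed.
pub fn set_global_sink(sink: Prefixed) -> bool {
    GLOBAL_SINK.set(sink).is_ok()
}

// The check is deferred to runtime: a missing sink shows up as `None`.
pub fn report(line: &str) -> Option<String> {
    GLOBAL_SINK.get().map(|sink| sink.write(line))
}
```

The second form spares every caller from threading a `Sink` parameter through their code, at the cost of moving the "was a sink configured?" check from the type system to runtime.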
Moving a requirement up in the stage hierarchy is almost always a breaking change. Some requirements (such as where clauses) can be moved farther down, though I don't know if this is common in practice.
## Decision: Use- and CLI-dependent errors
**Question:** Should _any_ errors depend on `check` vs `build`, whether code is used, or on how code is optimized?
**Status quo:** Type system errors are always surfaced, unless they are cfg-dependent. Const eval errors are use- and CLI-dependent; they **do not show up** in unused code unless `-Clink-dead-code` is passed, and do not show up at all in `cargo check`.
**Proposed ideal world:** Frontend errors are always surfaced. `build` vs `check`, codegen flags, and the details of MIR optimizations do not affect which errors are shown. Errors from the backend and linker can still occur when using native libraries, in lower tier targets, or in highly unusual situations unsupported by the backend.
* Pro: Users expect `cargo check` to surface all errors.
* Con: Could impact compiler performance.
Implemented in [#107510](https://github.com/rust-lang/rust/pull/107510).
**Alternative 1**: Accept the ideal world, but make exceptions in cases where compiler performance is severely affected.
**Alternative 2**: Codify the status quo, or some part of it. Errors are defined to occur in code that is reachable from `main` for a binary crate or any publicly-reachable function for a library crate.
[^exclude-backends]: May exclude errors from codegen backends in rare cases, like enormous arrays. [Source](https://github.com/rust-lang/rust/pull/107510#issuecomment-1410180431)
## Decision: inline const
#### Background
Parameter-dependent errors can already be expressed directly using associated consts. Example:[^require-zst]
```rust
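use std::marker::PhantomData;

fn require_zst<T>() {
    // Helper type that carries the outer `T` into an associated const.
    struct Anon<U>(PhantomData<U>);
    impl<U> Anon<U> {
        // Evaluated per instantiation; fails const eval if `U` is not zero-sized.
        const A: () = assert!(std::mem::size_of::<U>() == 0);
    }
    // Referencing the const forces its evaluation for this `T`.
    Anon::<T>::A
}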
```
[^require-zst]: Adapted from [this comment by RalfJung](https://github.com/rust-lang/rust/pull/104087#issuecomment-1375713665).
Parameter-dependent errors have been possible since associated consts were stable in 1.20,[^subtraction-hack] since panicking operations were available in const since then. `size_of` was const-stabilized in 1.24, making the general form of this error available.[^index-hack] I'm not sure if there was ever discussion within the lang team about the implications of these features on parameter-dependent errors.
`const_panic`, stabilized in 1.57 (Dec 2021),[^const-panic] made implementing one feel like less of a hack and provided for nicer error messages. The justification seems to be that this was already possible with various hacks, so we might as well make it nice.
[^subtraction-hack]: See [this example from oli-obk](https://rust.godbolt.org/z/eMhE58e1Y). It depends on manually defining associated consts for every type, since `size_of` was not const stable until 1.24, and the fact that integer underflow panics.
[^index-hack]: Here is [another way](https://rust.godbolt.org/z/7ePeMzjT7) to implement this assertion without panicking directly in const, using only the index operator. The [static_assertions](https://docs.rs/static_assertions/latest/static_assertions/macro.const_assert.html) crate was a collection of similar hacks.
[^const-panic]: https://github.com/rust-lang/rust/pull/89508
The [**`inline_const`**](https://github.com/rust-lang/rust/pull/104087) feature makes the above example significantly easier to write. Note that since the const context is no longer embedded in an item, it is free to name the parameters in scope directly.
```rust
#![feature(inline_const)]
fn require_zst<T>() {
    const { assert!(std::mem::size_of::<T>() == 0); }
}
```
Finally, since `inline_const` also makes const contexts easier to access in general, it could exacerbate problems with use- and CLI-dependent errors identified above.
#### Questions
Based on the previous decision and the last paragraph, are we concerned with how `inline_const` could further proliferate use- and CLI-dependent errors? Does that concern affect stabilization?
Are we concerned with how `inline_const` further expands access to, and endorses, parameter-dependent errors? Does that concern affect stabilization?
## Decision: Stance on non-local errors
Should the lang team have an overall stance on non-local errors? Is it important to justify the status quo, and how would we justify it?
This [comment by RalfJung](https://github.com/rust-lang/rust/pull/104087#issuecomment-1382712576) suggests:
> I view post-monomorphization errors in consts as a fallback plan that library authors can use when expressing their constraints with traits doesn’t work or becomes too unergonomic, or when it concerns conditions that are very unlikely to be violated in practice so it’s not worth burdening all users with this concern.
## Future possibilities and interactions
### Controlling monomorphization
scottmcm [suggests](https://github.com/rust-lang/rust/pull/104087#issuecomment-1386041775) something like `if const` that would not monomorphize anything in its body if the const expression evaluates to false. This makes non-local const eval errors actionable.
### Uplifting parameter-dependent errors
If we added new bounds like `T: ZeroSized` or `N > 0` to the type system, users might need to produce these bounds from code that previously only used static assertions to enforce them.
There are a couple ideas for how to do this.
#### Ad-hoc specialization
```rust
pub fn old_thing_that_requires_zst<T>(x: T) {
    const { assert!(size_of::<T>() == 0); }
    where T: ZeroSized {
        new_thing_that_requires_zst::<T>();
    }
}
```
As far as I know the only thing that makes this "ad hoc" is the syntax; it's equivalent to full specialization.
#### Deferring trait bounds to after monomorphization
We could opt to enforce some/all trait bounds at monomorphization time, only making it a lint if you don't have bounds in your environment that guarantee they will always be met.
This would allow existing APIs to add the bounds without a breaking change.
:::success
It was noted in the discussion that a version of the above inline `where` syntax that introduces mono-time _assertions_ that can then be relied on, instead of compiling code conditionally, would be a feasible way of introducing these deferred bounds (and would not rely on specialization).
:::
### Future use cases for NLEs
* **C++ interop**: Many legacy codebases that fit the use case of Rust are C++ and cannot be rewritten wholesale. Interop with C++ is an area of active research and development, but use of template APIs, which are very common, will always be limited without some mechanism to instantiate templates with type parameters. This requires parameter-dependent errors. The solutions in the above section would also help here.
* **Contracts**: Some of these might qualify as NLEs.
* Maybe others?
---
## Potential discussion questions
### How do I add a question?
Ferris: To add a question, create a section (`###`) like this one. That helps with the markdown formatting and gives something to link to. Then you can type your name (e.g., `ferris:`) and your question. During the meeting, others might write some quick responses, and when we actually discuss the question, we can take further minutes in this section.
### Observation: Mono-time *name lookup* feels like the most horrific part of C++ templates.
scottmcm: Upon reflection, I think that the main horror with C++ templates is the *ad-hoc extension points*. Not knowing whether `foo(x)` is supposed to be an in-scope-at-template-definition function or an overloaded extension point seems like it causes most of the weirdness. In comparison, Rust's principled extension points (`MyTrait::blah(x)`) avoid that: the `MyTrait` would have to be resolved at generic definition time, and the corresponding error for the type not implementing that would be quite clear, albeit potentially highly nested and thus non-local. [More details on zulip](https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/post-monomorphization.20errors/near/321922957).
scottmcm/niko: Being able to say "T does not implement Trait" is better than "I don't know about this method".
gary: Const eval is similar in that it has a clear error message.
### Philosophically, what's the line between trait generics and value generics here?
scottmcm: For example, is there a difference between needing to add `where const { N + M < 100 }` all the way up the chain and needing to add `where T: Debug` all the way up the chain?
gary: I think there is value in deferring trait bound errors to mono-time. People seem to be more amenable to const eval errors than this though.
scottmcm: ~~Someone~~ Tyler mentioned in Zulip that if we deferred `Send` checks on `async fn` for example, maybe that would be good enough.
nikomatsakis: I think that cuts both ways.. feels very non-local today.
tmandry: Could imagine a compromise where you need to put `Send` bounds at public crate boundaries, but on private code you don't need them (enforced at mono-time).
nikomatsakis: Enforcing const eval errors at mono time: If you're enforcing a precondition, fine, but other times it seems like you're leaking implementation details. ...
gary: ...still better than duck typing and templates.
### Do we need strict language rules for what gets monomorphized?
scottmcm: Is a compiler allowed to monomorphize things spuriously, and have compilation fail if one of those "extra" monomorphizations fails? Would that mean that `-C link-dead-code` makes it "not Rust"?
Similarly, is it a legal optimization to stop monomorphizing `foo::<T>` if it's seen in `if false { foo::<T>() }`, or could people be relying on that NLE for a safety property?
### What's the plan for doing bounded const generics *without* NLEs?
scottmcm: Is there a feasible way to prove "early" the kinds of things that people will want to bound in const generics?
- Will we possibly be stuck with NLEs being the only feasible way to do this anyway, given that we wouldn't want to solve NP-hard problems at compile time?
- Is there some sufficiently-useful subset that we can check feasibly? I think of how we don't try to do exhaustiveness for general expressions (`x if x > 0`) in `match`, but have a more-constrained problem space (`x @ 1..`) for which it's feasible. What operations are needed in common cases today? How can we bound `concat([T; N], [T; M]) -> [T; {N + M}]`?
nikomatsakis: I don't think we have one, requires an SMT solver basically
gary: we have `generic_const_exprs` but that's *very* limited.
josh: At some point if it starts becoming complicated to echo the body of the function as five different conditions that need to be met, it stops being worth it. It's still better to have a mono-time error than a runtime error.
nikomatsakis: I think this feels similar to contracts. I would like to have contracts that are enforced dynamically but with option for an add-on tool to check statically.
scottmcm: And those contract tools are already running SMT solvers.
nikomatsakis: And I'd rather keep those out of the core language.
### Is there an "in the signature" middle-ground?
Disclaimer: Unbaked idea, might be terrible.
scottmcm: Inspired by the TAITs conversation, maybe there's a way to attack this by making it *visible*? For example, if the errors had to happen in associated constants of a trait mentioned in the signature, then they'd be more obvious, though still depend on implementation details of those impls, not just on bounds, so still wouldn't be as transparent as trait errors -- but `typenum` shows that trait errors aren't necessarily clear/obvious either.
(Though that substantially nerfs `inline_const`, so I'm not sure I actually like it.)
### Interaction of inline-const and use- or cli-dependent errors
nikomatsakis: The doc says...
> Based on the previous decision and the last paragraph, are we concerned with how `inline_const` could further proliferate use- and CLI-dependent errors? Does that concern affect stabilization?
...but it's not obvious to me how much those use- and CLI-dependent errors are "inherent" to inline const. The previous "decision point" was saying "no, let's not have use- and cli-dependent errors". It seems clear that inline const will result in use-dependent errors, because we can't know if a check fails until we know the value of the const generics. But this doesn't necessarily imply cli-dependent errors, right?
tmandry: To clarify, "use-dependent" doesn't include parameter-dependent.
### Performance impact
Doc says:
> **Proposed ideal world:** Frontend errors are always surfaced. `build` vs `check`, codegen flags, and the details of MIR optimizations do not affect which errors are shown. Errors from the backend and linker can still occur when using native libraries, in lower tier targets, or in highly unusual situations unsupported by the backend.
>
> * Pro: Users expect `cargo check` to surface all errors.
> * Con: Could impact compiler performance.
>
> Implemented in [#107510](https://github.com/rust-lang/rust/pull/107510).
is there a sense of the compiler performance impact of this PR?
### [eager] inline const?
The 8472: Would marking specific inline consts for eager evaluation in `cargo check` provide a middle ground?
### Actionable?
nikomatsakis: What does "actionable" mean here?
> This makes non-local const eval errors actionable.
I think it means: it is possible to write a const error in dead code without aborting compilation. e.g., in something like this?
```rust
// Sketch using the proposed `if const` syntax, which is not valid Rust today.
const fn foo<const C: usize>() -> f32 {
    if const C != 0 { 1.0 / C as f32 } else { 0.0 }
}
```
### NLE vs runtime panic
gary: I think one key point to consider is whether we would prefer an NLE to a runtime panic. Currently people tend to defer the panic to runtime for many functions that don't accept ZSTs or zero. In my view an NLE is always better than a runtime panic for catching bugs.
gary: Also in Rust-for-Linux we try to avoid runtime panic enough that we make code not compile depending on whether optimiser can get rid of checks. See https://rust-for-linux.github.io/docs/kernel/macro.build_assert.html
scottmcm: One huge reason that runtime panic can be better is that it's context-sensitive, which the current NLE errors are not. So if you have a *runtime* check for the condition, the error in the unreachable monomorphized code is obnoxious.
### Breaking changes to move to a better experience
scottmcm: To look at the `as_chunks` example again, if we stabilize it as `const { assert!(N > 0) }`, it'd be a breaking change to move to the "better" `where const { N > 0 }`, should that become possible in future. Does that mean that `core` still shouldn't use NLEs for things?
gary: Maybe not std, but if it's a crate people would start using it.
nikomatsakis: Could we use editions? In earlier editions it would not be checked, in later ones it would.
gary: Could prevent library from upgrading editions because they would have to break their API.
nikomatsakis: Could assert that something is true, which is a non-local error,
tmandry: Equivalent to specialization? Not if it's just an assertion, rather than conditionals.
gary: Since consts are just values, don't have to worry about lifetime issues in specialization.
### Decision: Use- and CLI-dependent errors
nikomatsakis: Remember GCC did stuff like this. Obviously a worse user experience to have CLI-dependent errors, but so is slower compilation time.
scottmcm: If we're going to say this exists we have to define what monomorphizes; does this mean that `-C link-dead-code` isn't Rust anymore?
nikomatsakis: Is there a way to think about these as "deny-by-default" lints?
gary: Miri people would not like to make another const eval-like lint.
scottmcm: I'm not sure what allowing it would mean.
nikomatsakis: Semantics is if it runs it panics, but it's not going to run.
Would this be desirable? Defining what gets monomorphized isn't that bad.
niko: Interesting example: You could imagine a compiler that starts from your main function and only builds exactly what's needed in all your dependencies. In fact I would like that.
gary: It's not causing an issue if it's not being used..
scottmcm: that breaks a bunch of stuff like the `fn _must_support_dyn(_: &dyn Foo){}` pattern?
nikomatsakis: There may be limits to what we can cut out. My assumption is that
scottmcm: Obligatory mention of [dead-code optimize if const { expr } even in opt-level=0 #85836](https://github.com/rust-lang/rust/issues/85836) (which is also mentioned below).
tmandry: Do we have any consensus?
scottmcm: If we don't have a plan to _not_ have these, don't we need to.. have them?
joshtriplett: It's strictly better than a runtime assertion.
scottmcm: Only if it's reachable... that's where it gets awkward to me. Also examples of "if T is ZST then call this thing" in std, if that was a PME it would be seriously broken. Don't think we can say you should move all your panics on const generics to NLEs.
scottmcm: For example, there's lots of things like `if T::IS_ZST { return unsafe { mem::zeroed() } }`. That works great, but if `zeroed` got an NLE for things where it's insta-UB to zero them, it'd blow up horribly (unless we had [#85836](https://github.com/rust-lang/rust/issues/85836)).
### Follow up: Summary of discussion
tmandry:
- Consensus that this is a useful framework for thinking about the problem
- Consensus that we prefer compile time errors to runtime
- No consensus on whether we should eagerly surface errors
- Consensus that we would probably have to allow PMEs, because there was no way we were going to be able to express everything in the type system?
- Lingering question about whether we would ever want these in std... maybe not, but useful for ecosystem libraries.
- Some options available for making it possible to migrate in the future, though we haven't covered these in detail yet. Seems like they would work but can't say for certain.
### Follow up: Decisions on inline const
tmandry: There is a summary of the consensus reached on inline_const in [this comment](https://github.com/rust-lang/rust/pull/104087#issuecomment-1449080210).