non-fatal overflow in the new solver

this is a collection of notes for the final overflow doc

The new solver should avoid hangs as much as possible. It must consider constraints from overflowing branches because of https://github.com/rust-lang/trait-system-refactor-initiative/issues/70.

Overflow can happen in multiple places:

The main concern with non-fatal overflow is how we should handle exponential blowup. This can be caused in multiple different ways yet again:

multiple candidates for a trait or project goals
multiple nested goals of a trait or project candidate
multiple nominal obligations from well-formed goals
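To get a feeling for how quickly these sources blow up, here is a toy model, not solver code. It assumes every goal spawns a fixed number of nested goals (or candidates) until a depth limit is hit. The function name `evaluations` and both parameters are made up for illustration.

```rust
/// Counts goal evaluations in a toy proof tree where every goal spawns
/// `branching` nested goals until `depth_limit` is exhausted.
/// Hypothetical model, not how the solver actually counts anything.
fn evaluations(branching: u64, depth_limit: u32) -> u64 {
    if depth_limit == 0 {
        // The goal which hits the depth limit and overflows.
        return 1;
    }
    // This goal plus everything evaluated below it.
    1 + branching * evaluations(branching, depth_limit - 1)
}

fn main() {
    // A single nested goal per level grows linearly with the depth limit.
    println!("branching 1: {}", evaluations(1, 16));
    // Two nested goals (or two candidates) per level grow exponentially.
    println!("branching 2: {}", evaluations(2, 16));
}
```

With two nested goals per level the count is `2^(depth_limit + 1) - 1`, so the depth limit alone does not bound the work in any useful way.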

more complex overflow issues

These blowup and overflow sources can be combined for even more fun.

solver fixpoint + try_evaluate_added_goals

https://github.com/rust-lang/rust/pull/118774

trait Trait {}

struct W<T: ?Sized>(*const T);

impl<T: ?Sized> Trait for W<W<T>>
where
    W<T>: Trait,
    W<T>: Trait,
{}

fn impls_trait<T: Trait>() {}

fn main() {
    impls_trait::<W<_>>();
    //~^ ERROR overflow evaluating the requirement
}

solver cycle fixpoint at each level + multiple nested goals

#![feature(rustc_attrs)]
#![allow(internal_features)]

#[rustc_coinductive]
trait Trait {}

struct W<T: ?Sized>(T);

impl<T: ?Sized> Trait for W<W<T>>
where
    Self: Assistant,
    W<T>: Trait,
{
}

#[rustc_coinductive]
trait Assistant {}
impl<T: ?Sized> Assistant for W<T>
where
    T: Assistant,
    Self: Trait,
{
}

fn impls_trait<T: Trait + ?Sized>() {}

fn main() {
    impls_trait::<W<_>>();
}

solver cycle fixpoint at each level + multiple candidates

// This has exponential growth because of the growing impl,
// even though it does not apply.
trait Trait {}

struct W<T>(T);
struct U<T>(T);

trait NotImplemented {}

impl<T> Trait for W<T>
where
    W<W<T>>: Trait,
    W<T>: NotImplemented,
{}

impl<T: Other> Trait for T {}

trait Other {}
impl<T: Other + Trait> Other for W<T> {}
impl Other for () {}

fn impls_trait<T: Trait + ?Sized>() {}

fn main() {
    impls_trait::<W<_>>();
}

overflow in `try_evaluate_added_goals` and anything else

Rerunning overflowing goals after applying their constraints very easily results in hangs, because we recompute the overflowing goal in each loop iteration, increasing the size of the inferred type even more, e.g. ui/traits/new-solver/overflow/exponential-trait-goals.rs:

trait Trait {}

struct W<T>(T);

impl<T, U> Trait for W<(W<T>, W<U>)>
where
    W<T>: Trait,
    W<U>: Trait,
{
}

fn impls<T: Trait>() {}

fn main() {
    impls::<W<_>>();
}

This also makes results unstable. Once we stop applying inference constraints because of overflow, the solver can make additional progress the next time the goal is computed.

`assemble_candidates_after_normalizing_self_ty` and `try_normalize_ty` blowup

For cyclic projections, normalizing the self type results in recursion_depth nested Projection(Alias, ?new_infer) goals. ?new_infer gets instantiated as Alias which, due to the way the current impl is set up, ends up resulting in a nested AliasRelate(Alias, Alias) goal. That goal again normalizes the alias, resulting in recursion_depth many nested goals.

trait Overflow<U: ?Sized> {
    type Assoc;
}

impl<U: ?Sized> Overflow<U> for () {
    type Assoc = <() as Overflow<(U,)>>::Assoc;
}

fn main() {}

This results in nested alias relate goals because when generalizing, the generalized type has no unresolved inference variables while the original one does, preventing the structural eq fast path from firing when equating at the end of CombineFields::instantiate. The following diff prevents that overflow:

--- a/compiler/rustc_trait_selection/src/solve/project_goals/mod.rs
+++ b/compiler/rustc_trait_selection/src/solve/project_goals/mod.rs
@@ -227,6 +227,7 @@ fn consider_impl_candidate(
             //
             // And then map these args to the args of the defining impl of `Assoc`, going
             // from `[u32, u64]` to `[u32, i32, u64]`.
+            let impl_args = ecx.resolve_vars_if_possible(impl_args);
             let impl_args_with_gat = goal.predicate.projection_ty.args.rebase_onto(
                 tcx,
                 goal_trait_ref.def_id,

random thoughts and summary

we need to apply inference constraints even if there's overflow for backcompat
try_evaluate_added_goals causes other overflow to very quickly result in hangs. Overflowing nested goals have to be heavily penalized or avoided.
changing the layout of the proof tree after stabilization is theoretically breaking, either because of hangs or because paths which used to be visited are no longer visited.
avoiding the recomputation of parts of the proof tree is fully backwards compatible and something we can do after stabilization
normalization needs the "full depth"
cycle handling must not allow the full depth as it otherwise hangs
we will probably re-add a provisional cache at some point, this may reduce the cost of the cycle fixpoint step
ignoring constraints from overflow is very good for perf, may special-case the constraints from where-clauses

new idea

stash goals resulting in overflow in try_evaluate_added_goals and avoid evaluating them in following evaluations
do the same in fulfillment
maybe also try to prove them once more at the end
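A rough sketch of the stashing idea, using a toy loop instead of the real `try_evaluate_added_goals`. The `Outcome` enum, the function `evaluate_added_goals`, and the closure-based evaluator are all stand-ins invented for this sketch.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Outcome {
    Yes,
    Ambiguous,
    Overflow,
}

/// Toy fixpoint loop: reevaluates ambiguous goals until nothing changes,
/// but moves goals which overflowed into a stash so later iterations skip them.
/// Returns `(still ambiguous goals, goals which still overflow)`.
fn evaluate_added_goals<G>(
    mut pending: Vec<G>,
    mut eval: impl FnMut(&G) -> Outcome,
) -> (Vec<G>, Vec<G>) {
    let mut stashed = Vec::new();
    loop {
        let mut progress = false;
        let mut next = Vec::new();
        for goal in pending {
            match eval(&goal) {
                Outcome::Yes => progress = true,
                Outcome::Ambiguous => next.push(goal),
                // Do not reevaluate this goal in the following iterations.
                Outcome::Overflow => stashed.push(goal),
            }
        }
        pending = next;
        if !progress || pending.is_empty() {
            break;
        }
    }
    // "maybe also try to prove them once more at the end"
    stashed.retain(|goal| eval(goal) != Outcome::Yes);
    (pending, stashed)
}

fn main() {
    // Goal 0 always overflows, goal 1 needs two attempts, goal 2 holds at once.
    let mut calls = [0u32; 3];
    let (ambiguous, overflowed) = evaluate_added_goals(vec![0usize, 1, 2], |&g| {
        calls[g] += 1;
        match (g, calls[g]) {
            (0, _) => Outcome::Overflow,
            (1, 1) => Outcome::Ambiguous,
            _ => Outcome::Yes,
        }
    });
    println!("ambiguous: {:?}, overflowed: {:?}, calls: {:?}", ambiguous, overflowed, calls);
}
```

In this model the overflowing goal is evaluated once in the loop and once in the final retry, regardless of how many iterations the other goals need.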

cache usage of the new solver

current implementation (without dependencies) as of 2023.12.04.

crate	overflow	global cache	cycle	compute
syn	0	470628	0	49688
rand (slightly changed)	0	118929	8586	29173
serde	0	4032167	0	122229
bitflags	0	10901	0	3477
regex-syntax	0	400924	28	51540
regex-automata	0	826261	4	139092
typenum	76924	345245	4854	65233

IDEA: Learning from CTFE

Trait solving has similar constraints to CTFE. We can use a similar approach to CTFE to avoid hangs in a backwards compatible way.

Have a simple counter in the trait solver, which is incremented whenever we evaluate a nested goal. If that counter hits some arbitrary limit, we emit a deny-by-default lint telling the user that the solver seems to be hanging due to their code. If that lint results in an error (i.e. has not been changed to allow/warn), we abort compilation.

If it has been allowed or changed to warn, we repeatedly emit a warning with some exponential backoff.
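A minimal sketch of such a counter, assuming a doubling threshold as the "exponential backoff". `GoalCounter`, `LintLevel`, `Step`, and `on_nested_goal` are hypothetical names, not existing rustc APIs.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LintLevel {
    Allow,
    Warn,
    Deny,
}

#[derive(Debug, PartialEq, Eq)]
enum Step {
    Continue,
    Warn,
    Abort,
}

struct GoalCounter {
    evaluated: u64,
    /// Number of evaluated goals at which we report next.
    next_report: u64,
    level: LintLevel,
}

impl GoalCounter {
    fn new(limit: u64, level: LintLevel) -> Self {
        // A limit of 0 would report on every goal, so clamp it to 1.
        GoalCounter { evaluated: 0, next_report: limit.max(1), level }
    }

    /// Call this once for every nested goal the solver evaluates.
    fn on_nested_goal(&mut self) -> Step {
        self.evaluated += 1;
        if self.evaluated < self.next_report {
            return Step::Continue;
        }
        // Exponential backoff: the next report happens at twice the count.
        self.next_report = self.next_report.saturating_mul(2);
        match self.level {
            // The lint is still an error: abort compilation.
            LintLevel::Deny => Step::Abort,
            // As described above, both allow and warn keep warning.
            LintLevel::Allow | LintLevel::Warn => Step::Warn,
        }
    }
}

fn main() {
    let mut counter = GoalCounter::new(4, LintLevel::Warn);
    for goal in 1..=16u64 {
        if counter.on_nested_goal() == Step::Warn {
            println!("warning: solver still running after {} goals", goal);
        }
    }
}
```

The limit of 4 is only there to keep the example short. A real limit would have to be large enough for crates like typenum.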

This has one significant issue: the number of evaluated nested goals is not a good approximation of the solver runtime. typenum has 95000 uncached goal evaluations in less than a second, while tests/ui/traits/new-solver/cycles/coinduction/fixpoint-exponential-growth.rs hangs with fewer than 700 uncached goal evaluations.

The time needed to evaluate a single goal can differ widely depending on the amount and size of inference constraints from nested goals. However, we can combine this counter with a size check of constraints, or include the size of constraints when incrementing the counter.
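One way to include the constraint size when incrementing the counter, sketched with made-up numbers. `WeightedBudget`, `charge`, and the limit of 1_000_000 are assumptions for illustration only.

```rust
/// Budget which charges every goal one unit plus the size of its constraints.
struct WeightedBudget {
    spent: u64,
    limit: u64,
}

impl WeightedBudget {
    fn new(limit: u64) -> Self {
        WeightedBudget { spent: 0, limit }
    }

    /// Charges one goal evaluation. Returns `false` once the budget is exhausted.
    fn charge(&mut self, constraint_size: u64) -> bool {
        self.spent = self.spent.saturating_add(1).saturating_add(constraint_size);
        self.spent <= self.limit
    }
}

fn main() {
    // Many cheap goals, roughly the typenum situation: stays within budget.
    let mut many_small = WeightedBudget::new(1_000_000);
    let within_budget = (0..95_000).all(|_| many_small.charge(1));
    println!("95000 small goals within budget: {}", within_budget);

    // Few goals whose constraints double in size each time: exhausted quickly.
    let mut few_large = WeightedBudget::new(1_000_000);
    let mut size = 1u64;
    let mut goals = 0u32;
    while few_large.charge(size) {
        goals += 1;
        size = size.saturating_mul(2);
    }
    println!("growing constraints exhaust the budget after {} goals", goals);
}
```

Whether the constraint size is a good enough proxy for the actual runtime is an open question. This only shows that the two cases above can be told apart by such a counter.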

IDEA: split "overflow" and "recursion limit based overflow"

Only yeet constraints from recursion limit based overflow. This does not avoid the hang in tests/ui/traits/new-solver/cycles/coinduction/fixpoint-exponential-growth.rs.

Many crates depend on the "overflow project where bounds" behavior

https://github.com/rust-lang/trait-system-refactor-initiative/issues/70 may just be acceptable breakage. It may be very positive for perf, at least doing it for recursion limit based overflow.

https://github.com/rust-lang/rust/issues/90662 was originally caused by only causing this pattern to error for global goals (if it otherwise cycles), this broke https://github.com/AzureMarker/shaku. It feels likely that always doing so is too impactful.

IDEA: checking the size of the var_values constraints

When canonicalizing a response, check the size of the var_values and discard them if they grow too large, resulting in overflow.

This can result in bistable cycle fixpoint computations, but that seems alright.
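The size check could look roughly like this. The `Ty` enum is a toy stand-in for real types, and `make_response`, `Response`, and `max_size` are hypothetical names rather than the solver's canonicalization code.

```rust
/// Toy stand-in for the types stored in `var_values`.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Infer,
    Unit,
    W(Box<Ty>),
    Tuple(Vec<Ty>),
}

impl Ty {
    /// Number of nodes in the type tree.
    fn size(&self) -> usize {
        match self {
            Ty::Infer | Ty::Unit => 1,
            Ty::W(inner) => 1 + inner.size(),
            Ty::Tuple(elems) => 1 + elems.iter().map(Ty::size).sum::<usize>(),
        }
    }
}

#[derive(Debug, PartialEq)]
enum Response {
    /// The constraints are small enough to be returned to the caller.
    Constrained(Vec<Ty>),
    /// The constraints were discarded, the goal is treated as overflow.
    Overflow,
}

/// Discards `var_values` if their total size exceeds `max_size`.
fn make_response(var_values: Vec<Ty>, max_size: usize) -> Response {
    let total: usize = var_values.iter().map(Ty::size).sum();
    if total > max_size {
        Response::Overflow
    } else {
        Response::Constrained(var_values)
    }
}

/// Builds `W<W<...<?infer>...>>` with `depth` layers of `W`.
fn nested(depth: usize) -> Ty {
    (0..depth).fold(Ty::Infer, |inner, _| Ty::W(Box::new(inner)))
}

fn main() {
    let small = vec![Ty::Tuple(vec![Ty::Unit, nested(2)])];
    println!("{:?}", make_response(small, 8));
    println!("{:?}", make_response(vec![nested(20)], 8));
}
```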

TODO: impl header eq constraints old solver

Write tests where we rely on these constraints both for Projection and Trait goals, nested goals either resulting in inductive cycle or hitting the recursion limit (should also be fine if there's just a single candidate).

also write tests where we rely on these constraints from a nested goal. So we need the impl header eq constraints of a goal from the where-bounds.

QUESTION: Can we delay stabilizing our non-fatal overflow behavior

I personally think it is acceptable to delay the stabilization of a "ready" implementation of -Ztrait-solver=next-coherence to collect more data while working on full -Ztrait-solver=next. We should still publish a blogpost asking for testing and stating that it is ready for stabilization.

We cannot completely avoid non-fatal overflow in -Ztrait-solver=next-coherence as typenum should continue to compile. We could emit a deny-by-default lint when people depend on our current overflow handling. However, this lint would hit quite a few crates, so it's not ideal.

There's also the question of whether and where to drop constraints from overflow. If this behavior should affect non-recursion depth based overflow, e.g. inductive cycle fixpoints, then we pretty much have to decide on that behavior already.

Advantages of stabilizing `-Ztrait-solver=next-coherence`

remove the coherence support from the old solver, reducing complexity
get more testing of the new solver, at least for the behavior relied upon in coherence
milestone showing that the new solver is making progress, alleviate concerns about the type system being stuck
fix bugs in coherence and have a sensible behavior wrt binders. Mostly negligible: https://hackmd.io/ABcskdRCRj6WuE3TeX9zEQ
the positive impact is overwhelmingly social. there are limited technical benefits from stabilizing it.
