
Overflow handling is currently fundamentally broken and has to be redesigned.

A document for a WIP solution is at https://hackmd.io/QY0dfEOgSNWwU4oiGnVRLw

## how overflow should be handled in an ideal world

Ideally we would be able to perfectly differentiate between "true overflow" and "hitting the recursion limit". Goals which overflow because they were just too complex and hit the recursion limit should be treated differently from goals which actually diverge. This is an undecidable decision problem, so we can at best approximate it.

If we did not have incompleteness, both kinds of overflow could be treated the same inside of the solver and we would only fatally abort if we hit the recursion limit and returned from the root goal. Because of incompleteness it is de facto impossible to hide any overflow caused by hitting the recursion limit, as the overflowing goal could have incompletely guided inference if the limit were higher.

## overflow in the old solver

Overflow in the old solver is always fatal, except when inside of canonical queries, in which case we rerun the solver outside of the canonical query and get fatal overflow that way.

The old solver is also somewhat overeager in detecting overflow, most notably in https://github.com/rust-lang/rust/blob/b14fd2359f47fb9a14bbfe55359db4bb3af11861/compiler/rustc_trait_selection/src/traits/select/mod.rs#L1172-L1211. The comment there is somewhat wrong and could use some cleanup. This check returns non-fatal overflow errors for `Vec<Vec<?1>>: Trait` while `Vec<?0>: Trait` is already on the stack.
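Roughly, the check treats a goal as overflowing if its self type looks like a strictly deeper instantiation of a goal that is already on the stack. A self-contained toy sketch of that idea (the `Ty` representation and `is_deeper_instantiation` are made up for illustration, not rustc's actual types or logic):

```rust
#[derive(PartialEq)]
enum Ty {
    Infer(u32),                 // an inference variable like `?0`
    Adt(&'static str, Vec<Ty>), // a nominal type like `Vec<T>`
}

/// True if `goal_ty` can be obtained from `stack_ty` by replacing at least one
/// of its inference variables with a composite type, i.e. the new goal looks
/// like a strictly deeper instantiation of a goal we are already proving.
fn is_deeper_instantiation(stack_ty: &Ty, goal_ty: &Ty) -> bool {
    match (stack_ty, goal_ty) {
        // An inference variable on the stack matches anything, but only
        // counts as "deeper" if it is now instantiated with a composite type.
        (Ty::Infer(_), Ty::Adt(..)) => true,
        (Ty::Infer(_), Ty::Infer(_)) => false,
        (Ty::Adt(a, args_a), Ty::Adt(b, args_b)) => {
            a == b
                && args_a.len() == args_b.len()
                && args_a
                    .iter()
                    .zip(args_b)
                    .all(|(s, g)| s == g || is_deeper_instantiation(s, g))
                && args_a
                    .iter()
                    .zip(args_b)
                    .any(|(s, g)| is_deeper_instantiation(s, g))
        }
        _ => false,
    }
}

fn main() {
    // `Vec<?0>: Trait` is on the stack, the new goal is `Vec<Vec<?1>>: Trait`.
    let on_stack = Ty::Adt("Vec", vec![Ty::Infer(0)]);
    let new_goal = Ty::Adt("Vec", vec![Ty::Adt("Vec", vec![Ty::Infer(1)])]);
    assert!(is_deeper_instantiation(&on_stack, &new_goal));
}
```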

## overflow in the new solver

The new solver does a depth-first search. When it reaches the recursion limit it returns `Certainty::Maybe(MaybeCause::Overflow)`.

This causes hangs because exponential growth with a depth of 256 means that we can easily end up proving on the order of 2^256 goals. Because of this the overflow limit gets reduced to an eighth of the actual recursion limit.

We also increment the depth by one in any other place where infinite loops could happen.
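A rough sketch of this control flow (toy code with made-up bodies; only the `Certainty` and `MaybeCause` names mirror the solver):

```rust
#[derive(Debug, Clone, Copy)]
enum MaybeCause { Ambiguity, Overflow }

#[derive(Debug, Clone, Copy)]
enum Certainty { Yes, Maybe(MaybeCause) }

struct Goal; // stand-in for an actual trait or projection goal

fn evaluate(goal: &Goal, depth: usize, limit: usize) -> Certainty {
    if depth > limit {
        // Instead of aborting compilation, report non-fatal overflow.
        return Certainty::Maybe(MaybeCause::Overflow);
    }
    // ... assemble candidates and evaluate nested goals with `depth + 1`,
    // also incrementing the depth in the other places where we could loop ...
    let _ = goal;
    Certainty::Yes
}

fn main() {
    // Only use an eighth of the configured recursion limit to avoid
    // an exponential number of goals.
    let recursion_limit = 256;
    let limit = recursion_limit / 8;
    println!("{:?}", evaluate(&Goal, 0, limit));
}
```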

## where overflow can happen

- proving nested goals
- rerunning goals to get towards a fixpoint for coinduction
- looping over nested goals in `try_evaluate_added_goals`
- `assemble_candidates_after_normalizing_self_ty`
- `normalize_non_self_ty`

## issues and constraints

### caching and stack depth -> unstable results

If we have a stack like A -> B -> C -> D and a recursion limit of 3, proving A results in overflow. If however, we first prove C and put it into the global cache, A does not hit overflow anymore. This is unavoidable unless we have a stable way to track the depth of cached goals, which would allow us to overflow on A regardless of whether any nested goals are already cached.
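A toy model of this (plain Rust, nothing to do with the actual solver types) shows the instability: whether `A` overflows depends purely on whether `C` already happens to be in the global cache:

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, PartialEq)]
enum Res { Ok, Overflow }

fn prove(
    goal: char,
    deps: &HashMap<char, Vec<char>>,
    cache: &mut HashSet<char>,
    remaining_depth: usize,
) -> Res {
    if cache.contains(&goal) {
        return Res::Ok; // cache hits don't consume any depth
    }
    if remaining_depth == 0 {
        return Res::Overflow;
    }
    for &nested in deps.get(&goal).into_iter().flatten() {
        if prove(nested, deps, cache, remaining_depth - 1) == Res::Overflow {
            return Res::Overflow;
        }
    }
    cache.insert(goal);
    Res::Ok
}

fn main() {
    // A -> B -> C -> D with a recursion limit of 3.
    let deps = HashMap::from([('A', vec!['B']), ('B', vec!['C']), ('C', vec!['D'])]);

    // Fresh cache: proving A needs more depth than allowed and overflows.
    assert_eq!(prove('A', &deps, &mut HashSet::new(), 3), Res::Overflow);

    // If C was proven (and cached) first, A no longer overflows.
    let mut cache = HashSet::new();
    assert_eq!(prove('C', &deps, &mut cache, 3), Res::Ok);
    assert_eq!(prove('A', &deps, &mut cache, 3), Res::Ok);
}
```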

### coinductive cycles in stack depth/complexity

Because we cache entries in coinductive cycles, they do not have a stable stack depth or any other measure of "complexity". Consider

```mermaid
graph TB
    A --> B
    B --> BA
    BA --> B
    B --> C
    C --> A
```

If we start this by first proving B, then A, we get the following traces:

```
B -> (C -> A -> B [cycle] && BA -> B [cycle]) (only cache B)
A -> B [cache hit]
```

If we instead were to start with C, then B, then A, we would get the following traces:

```
C -> A -> B -> (C [cycle] && BA -> B [cycle]) (only cache C)
B -> (C [cache hit] && BA -> B [cycle])
A -> B [cache hit]
```

I cannot think of any way to detect overflow via "work done" which is the same regardless of how we enter coinductive cycles. Some ideas considered:

- stack depth: the maximum depth reached to compute the result of a goal, e.g. for X -> Y -> Z the depth of Z would be 0 and X would be 2.
- stack depth+: stack depth, but for all other goals Q involved in coinductive cycles, remember the longest path from Q to the other involved goal.
- stack depth*: stack depth, but for all other goals Q in cycles, remember the longest path stopping at Q. Use this instead of the stack depth when encountering the goal as Q.
- stack depthX: stack depth*, but also computing the longest path through a hit goal by adding its list; if that goal wasn't actually involved in a cycle with our goal, who cares, it will never be used.
|              | ex1 A | ex1 B | ex2 A | ex2 B | ex2 C |
|--------------|-------|-------|-------|-------|-------|
| stack depth  | 4     | 3     | 6     | 5     | 4     |
| stack depth+ | 9     | ?     | 8⁴    | ?     | ?     |
| stack depth* | 4     | ?     | 3⁶ ⁷  | 4⁵    | ?     |
| stack depthX | ?     | ?     | 3⁶ ⁸  | 4⁵    | ?     |

¹: with distances: [C: 1, A: 2, BA: 1]
²: distance to B + depth of B + distance from B to A: 1 + 3 + 2 = 6
³: with distances: [A: 1, B: max(2, 4) = 4, BA: 3]
⁴: with distances: [BA: 1]
⁵: with distances: [A: 1, B: 2, BA: 3]
⁶: distance to C + distance from C to B: 1 + 2 = 3
⁷: with distances: [BA: 1]. This is incorrect as it is missing A. We would somehow figure out how to get the right cycle distance for A from the result of C.
⁸: with distances: [BA: 1, A: 2, BA: 4?]
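
To make the plain "stack depth" measure concrete, here is a toy model (again plain Rust, not solver code) which treats a cycle hit as depth 0, reuses the cached depth on global cache hits, and only caches goals that are not part of a cycle reaching further down the stack. It reproduces the "stack depth" row above: ex1 gives A = 4, B = 3 and ex2 gives A = 6, B = 5, C = 4.

```rust
use std::collections::HashMap;

/// Returns the "stack depth" of `goal` plus the lowest stack index any cycle
/// below `goal` pointed back to (`usize::MAX` if there was no such cycle).
fn depth(
    goal: &'static str,
    deps: &HashMap<&'static str, Vec<&'static str>>,
    stack: &mut Vec<&'static str>,
    cache: &mut HashMap<&'static str, usize>,
) -> (usize, usize) {
    if let Some(&d) = cache.get(goal) {
        return (d, usize::MAX); // global cache hit: reuse the cached depth
    }
    if let Some(idx) = stack.iter().position(|&g| g == goal) {
        return (0, idx); // cycle hit: depth 0, remember where the cycle starts
    }

    let my_idx = stack.len();
    stack.push(goal);
    let mut d = 0;
    let mut head = usize::MAX;
    for &nested in deps.get(goal).into_iter().flatten() {
        let (nested_depth, nested_head) = depth(nested, deps, stack, cache);
        d = d.max(1 + nested_depth);
        head = head.min(nested_head);
    }
    stack.pop();

    if head >= my_idx {
        // Not part of a cycle with anything lower on the stack, so this goal
        // goes into the global cache ("only cache B" above).
        cache.insert(goal, d);
        head = usize::MAX;
    }
    (d, head)
}

fn main() {
    // The graph from above: A -> B, B -> BA, BA -> B, B -> C, C -> A.
    let deps = HashMap::from([
        ("A", vec!["B"]),
        ("B", vec!["BA", "C"]),
        ("BA", vec!["B"]),
        ("C", vec!["A"]),
    ]);

    // ex1: prove B first, then A.
    let mut cache = HashMap::new();
    let b = depth("B", &deps, &mut Vec::new(), &mut cache).0;
    let a = depth("A", &deps, &mut Vec::new(), &mut cache).0;
    assert_eq!((a, b), (4, 3));

    // ex2: prove C, then B, then A.
    let mut cache = HashMap::new();
    let c = depth("C", &deps, &mut Vec::new(), &mut cache).0;
    let b = depth("B", &deps, &mut Vec::new(), &mut cache).0;
    let a = depth("A", &deps, &mut Vec::new(), &mut cache).0;
    assert_eq!((a, b, c), (6, 5, 4));
}
```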

we would need to track actual paths in the query result for cycles?

This feels quite messy. We also have to correctly deal with the provisional cache, which I fully ignored for now. It can however be mapped to a simpler problem on graphs, so we could probably experiment with approaches and end up somewhat sure that they are correct thanks to fuzzing.

### incompleteness and overflow

Unstable results due to overflow can not only go from successful compilation to failure, but can actually result in a different success.

If a goal were to incompletely constrain inference variables, then first incompletely constraining these variables via another goal would result in a different success.

### the new solver hits more overflow

The new solver is less eager in detecting failure due to deferred projection equality and lazy normalization. This causes us to hit previously avoided overflow cases.

We also have not added anything like https://github.com/rust-lang/rust/blob/b14fd2359f47fb9a14bbfe55359db4bb3af11861/compiler/rustc_trait_selection/src/traits/select/mod.rs#L1172-L1211 to the new solver. This hack is however quite useful for filtering out overflow.

## potential solutions

I would love us to still keep overflow from hitting the recursion limit as a hard error. This means we need to detect overflow which shouldn't be a hard error via some different mechanism.

### type size checks

When proving a goal, add some type size limit and fail with overflow if the input type is too big, e.g. one could consider a type with a depth greater than the `recursion_limit` to be too big.
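A minimal sketch of what such a check could look like (all names here are made up for illustration):

```rust
enum Ty {
    Leaf,         // e.g. `u32`
    Adt(Vec<Ty>), // e.g. `Vec<T>`, `Box<T>`, ...
}

fn type_depth(ty: &Ty) -> usize {
    match ty {
        Ty::Leaf => 1,
        Ty::Adt(args) => 1 + args.iter().map(type_depth).max().unwrap_or(0),
    }
}

/// Fail with overflow before even trying to prove the goal if its input type
/// is deeper than the recursion limit.
fn check_input_type_size(input_ty: &Ty, recursion_limit: usize) -> Result<(), &'static str> {
    if type_depth(input_ty) > recursion_limit {
        Err("overflow: input type too big")
    } else {
        Ok(())
    }
}

fn main() {
    // Vec<Vec<...Vec<u32>...>> with 300 nested `Vec`s.
    let mut ty = Ty::Leaf;
    for _ in 0..300 {
        ty = Ty::Adt(vec![ty]);
    }
    assert!(check_input_type_size(&ty, 256).is_err());
}
```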

### approximate recursive instantiations

Add a check similar to https://github.com/rust-lang/rust/blob/b14fd2359f47fb9a14bbfe55359db4bb3af11861/compiler/rustc_trait_selection/src/traits/select/mod.rs#L1172-L1211 to the new solver. This check is stack dependent and only an approximation: whether or not a nested goal is already in the global cache changes whether it evaluates successfully or overflows, which changes the result. This results in incremental compilation bugs.

### pseudo-breadth-first search by increasing the recursion depth

Example approach:

```rust
fn evaluate_goal(
    goal: Goal<'tcx>,
    remaining_depth: usize,
) -> Result<QueryResult<'tcx>, FatalOverflow> {
    if remaining_depth == 0 {
        return Err(FatalOverflow);
    }

    let mut nested_goals = nested_goals(goal);
    let mut overflowing_nested_goals = Vec::new();
    // First pass: evaluate nested goals with a far smaller depth budget and
    // defer the ones which overflow.
    for nested in mem::take(&mut nested_goals) {
        match evaluate_goal(nested, remaining_depth / 2) {
            Ok(..) => {} // normal `try_evaluate_added_goals` behavior
            Err(FatalOverflow) => overflowing_nested_goals.push(nested),
        }
    }

    // Second pass: evaluate any newly added goals together with the deferred
    // overflowing goals using the full remaining depth.
    for nested in mem::take(&mut nested_goals)
        .into_iter()
        .chain(overflowing_nested_goals)
    {
        evaluate_goal(nested, remaining_depth - 1)?;
        // normal `try_evaluate_added_goals` behavior, but propagating
        // fatal overflow.
    }

    // ...
}
```

Why is this useful: is it far cheaper to reject very deep overflowing branches? Not really, though something like this may still make sense.


### pseudo fatal overflow

We should not directly abort compilation in the solver. We don't have a useful span and more importantly, users like rustdoc want to handle overflow without failing.

We could have pseudo-fatal overflow, which is fatal but allowed to be hidden if `try_evaluate_added_goals` has another hard error, or if `try_merge_candidates` has a candidate which does not have any external constraints.
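A rough sketch of what the candidate-merging half of that rule could look like (the types and the function body here are hypothetical, not the solver's actual API):

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum GoalResult {
    Ok { has_external_constraints: bool },
    Error,
    PseudoFatalOverflow,
}

/// Merge candidate results: a candidate which succeeds without any external
/// constraints is allowed to hide pseudo-fatal overflow in other candidates.
fn try_merge_candidates(candidates: &[GoalResult]) -> GoalResult {
    if candidates.contains(&GoalResult::Ok { has_external_constraints: false }) {
        return GoalResult::Ok { has_external_constraints: false };
    }
    if candidates.contains(&GoalResult::PseudoFatalOverflow) {
        // Nothing may hide it, so the overflow stays (pseudo-)fatal.
        return GoalResult::PseudoFatalOverflow;
    }
    // ... otherwise merge the remaining candidates as usual ...
    candidates.first().copied().unwrap_or(GoalResult::Error)
}

fn main() {
    let merged = try_merge_candidates(&[
        GoalResult::PseudoFatalOverflow,
        GoalResult::Ok { has_external_constraints: false },
    ]);
    assert_eq!(merged, GoalResult::Ok { has_external_constraints: false });
}
```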

This behavior still results in unstable incremental caching if the trait solver has any incompleteness: if we were to first evaluate the overflowing goals in `try_evaluate_added_goals`, they could end up not overflowing and constrain inference variables in a different way.
