Method probing algorithm

# Method probing algorithm ## Status quo, Deref + Receiver chain First we will explain the current method probing procedure with a worked example. ### Basics - `Autoderef` chains ###### [Playground](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2024&gist=49af6a19e87a428d38b7c4607308664f) ```rust= struct Inner; impl Inner { fn resolve(self: Myself) {} } struct Myself; impl Receiver for Myself { type Target = Inner; } let value: Box<Box<Box<Myself>>>; value.resolve(); // <-- here ``` A core concept of method probing is the idea of auto-deref chain. As of today, this chain consists of types obtained by iteratively applying the trait projection `<_ as Deref>::Target` or `<_ as Receiver>::Target`. ```mermaid flowchart LR subgraph "Receiver chain" direction TB m --->|Myself as Receiver| i("Inner") end bbb("Box<Box<Box<Myself>>>") ===>|Box as Deref| bb("Box<Box<Myself>>") ===>|Box as Deref| b("Box<Myself>") ===>|Box as Deref| m("Myself") ``` One important remark is that, due to the blanket implementation `impl<T: Deref> Receiver for T` today, following a type down through `Deref`s produces a sequence of types that is also an initial sequence produced by following `Receiver`s. ### Basics - picking The Autoderef process is followed by `pick`ing the method by matching the associated items in both the inherent `impl $ty` blocks and trait `impl Trait for $ty` blocks with the requested name `resolve`. Each block on the Autoderef chain is referred to as a `step`. An important maneuver in method picking as it walks down an Autoderef chain is to **apply adjustments** to the candidate receiver type under the current pick. Before looking up the method in the `impl` blocks, given a type `$self_ty` at each step, the picker tries the lookup with the unmodified `$self_ty`, with these three adjustments at the moment in this order. 1. Make a reference `&[mut] $self_ty` 2. In case`$self_ty := *mut $ty`, make `*const $ty` 3. Reborrow the pin in case `$self_ty := Pin<&[mut] U>`. ```mermaid flowchart TB subgraph "Receiver chain" direction TB m ===>|Myself as Receiver| i("Inner") subgraph "Pick #4" direction TB m -.->|tried| ma("Myself") -.->|tried| mar("&Myself") end subgraph "Pick #5" direction TB i -.->|tried| ia("<Inner>::resolve 🏁") -.->|would try| iar("&Inner") end end bbb("Box<Box<Box<Myself>>>") ===>|Box as Deref| bb("Box<Box<Myself>>") ===>|Box as Deref| b("Box<Myself>") ===>|Box as Deref| m("Myself") subgraph "Pick #1" direction TB bbb -.->|tried| bbba("Box<Box<Myself>>") -.->|tried| bbbar("&Box<Box<Myself>>") end subgraph "Pick #2" direction TB bb -.->|tried| bba("Box<Box<Myself>>") -.-> |tried| bbar("&Box<Box<Myself>>") end subgraph "Pick #3" direction TB b -.->|tried| ba("Box<Myself>") -.->|tried| bar("&Box<Myself>") end ``` Picking proceeds in two stages in general. 1. Assembly stage: collection of all methods in `impl` blocks, where `Self` can be resolved to each receiving type in the chain, with the matching name. - At this stage, methods are scanned first by moving down the Autoderef chain. From each type on the chain, adjustments are applied in the order as mentioned. - When methods are collected, they are binned into three categories while their order of appearence is preseved: **Inherent methods**, **Trait methods** and **Private methods**. Private methods are those inherent methods with in sufficient visibility. They are collected for diagnostics. - Usually, if a method is sourced from a `impl` block without a trait reference, like `impl Type { .. }`, it is an Inherent method. Otherwise, the method is a trait method implementation from a `impl Trait for Type { .. }` block, so it is a Trait method. - There are notable exceptions to the categorisation into Inherent, Trait and Private methods. - If the type of value `self` is `dyn Trait`, a trait method associated with `Trait` is callable on `dyn Trait` values through dynamic dispatch, but it belongs to **Inherent** method. In the current trait system, and probably into the distant future, `dyn Trait: Trait` shall not necessarily and generally hold. - If the type of value `self` is a generic parameter, associated trait methods sourced from the environment, or at least the `where` clauses, belong to the **Inherent** methods as well. - Order is preserved so that the shadowing is shadowing is enforced, which is explained in the next section. 2. Picking stage: starting from Inherent, followed by Trait and Private methods, the picker tries to unify the candidate methods in-order without committing to the resultant typing result to collect unsatisfied predicates for diagnostics, sort out those that are behind an unstable feature gates. Checks on whether an inherent method shadows another inherent method by applying another autoref on the receiving type are performed at the end. - An important check happens for each inspection into a candidate method is the satisfiability of the nested predicates from `where` clauses when the candidate method is to be called in this environment. In `fn consider_probe`, this check is essentially wrapped in a transaction, to be rolled back when the inspection completes, in which the predicates are collected and solved for diagnostic purposes. If there are obligation selection errors due to unprovable predicates, they are recorded for diagnostics and the candidate is effectively skipped over. - We could arrive at multiple applicable candidates in the current Inherent, Trait or Private bin, after winnowing down those with unprovable predicates. This results in an ambiguity error. 3. Final stage: here the final pick is contructed, the diagnostic information is propagated, the number of autoderefs to apply and the pointer adjustment is reported. By looking into the inherent implementation `impl Inner`, the item with a matching name is `fn resolve(self: Myself)`. Indeed this item is picked. ### Array and unsizing Due to possibly historical design (@dingxf please consult historians) `impl<T, const N: usize> Deref for [T; N]` does not exist. For ergonomics, method probing simulate the receiver relation between `[T; N]` and `[T]` so that methods receiving `&[T]`, `&mut [T]`, `*const/mut [T]` can be dispatched, when the **final** type at the end of the Autoderef chain is an array `[T; N]`. The end effect is that given a value of type `[T; N]`, `&[T; N]` or `&mut [T; N]`, important slice methods like `fn len(&[T])`, `fn first(&mut [T])` and `fn sort_unstable(&mut [T])` can be dispatched to. ### Shadowing and short-circuit As a general rule, the picking process has preferences of some candidates over others. 1. Shallower receiving types in the autoderef chain over those deeper; 2. Original type `T` over referenced and reborrowed `&T`, over `&mut T`, then over weakened const pointer `*const U` for applicable `T := *mut U`, over reborrowed pins `Pin<&[mut] U>` for applicable `T := Pin<&[mut] U>` 3. Inherent method over trait method ```rs= struct Ptr<T>(T); impl<T> Receiver for Ptr<T> { type Target = T; } impl<T> Ptr<T> { fn foo(&self) {} } let ptr = Ptr(1); (ptr as Ptr<_>).foo(); // <-- *** ``` The picking flowchart is as followed. ```mermaid flowchart TB pt("Ptr<?0t>") ===>|"Ptr<?0t> as Receiver"| t("?0t") subgraph "Pick #1" pt -.->|would try| pta("<Ptr<?0t>>::foo 🏁") end subgraph "Pick #2 ❌ cancelled" t end ``` ### No commitment to "guessing" Adapted from `tests/ui/methods/method-probe-no-guessing-dyn-trait.rs`, this demonstrate the interaction between autoderefs, method picking and type inference. ```rs= //@ run-pass trait Trait { fn f(&self) {} } struct Foo<T>(T); impl Trait for Foo<u32> {} let foo: Option<Foo<_>> = None; let foo_ref = foo.as_ref().unwrap(); // ... of type &Foo<_> foo_ref.f(); // <-- *** // == now assert that Deref is not used == // == so that foo is not inferred into Option<Foo<()>> == impl Deref for Foo<()> { type Target = dyn Trait + 'static; fn deref(&self) -> &(dyn Trait + 'static) { panic!() } } ``` So the autoref would produce the following chain, without committing anything to inference... ```mermaid flowchart LR bbb("&Foo<?0t>") ===>|"Foo<()> as Deref"| bb("dyn Trait + 'static") ``` ... and the method picking should be ... ```mermaid flowchart LR f("&Foo<?0t>") ===>|"Foo<()> as Deref"| d("dyn Trait + 'static") subgraph "Pick #1" direction TB f -.->|would try| fa("<Foo<?0t>>::f with ?0t := u32 🏁") end subgraph "Pick #2 ❌ cancelled" d end ``` ### `LegacyReceiver` This trait is an assertion on `Rc`, `Arc` and `Box` that, in abscence of `arbitrary_self_types` feature, these types work as if it has `Receiver` implementation with a `type Target` matching the `<_ as Deref>::Target`. This constraint is enshrined in the well-formedness check on `fn` items, `fn receiver_is_valid`. ### Current implementation #### Method lookup `fn lookup_method` @ `compiler/rustc_hir_typeck/src/method/mod.rs` #### Autoderef probing flow * `fn method_autoderef_steps` @ `compiler/rustc_hir_typeck/src/method/probe.rs` <- * `fn probe_op` @ `compiler/rustc_hir_typeck/src/method/probe.rs` <- * `fn probe_op` @ `compiler/rustc_hir_typeck/src/method/probe.rs` <- * `fn lookup_probe` @ `compiler/rustc_hir_typeck/src/method/mod.rs` <- * `fn lookup_method` @ `compiler/rustc_hir_typeck/src/method/mod.rs` <- * `fn check_expr_method_call` *** #### Method picking flow * `fn consider_probe` @ `compiler/rustc_hir_typeck/src/method/probe.rs` <- * `fn consider_candidates` <- * `fn pick_method` <- * `fn pick_by_value_method`/`fn pick_autorefd_method`/`fn pick_reborrow_pin_method`/`fn pick_const_ptr_method` <- * `fn pick_all_method` <- * `fn pick_core` <- * `fn pick` <- * `fn probe_for_name` <- * `fn lookup_probe` @ `compiler/rustc_hir_typeck/src/method/mod.rs` <- *** #### Method well-formed checks `fn receiver_is_valid` @ `compiler/rustc_hir_analysis/src/check/wfcheck.rs` ## Proposed change In the past discussion, it is hinted at that making `Deref` a subtrait of `Receiver` is probably unnecessary. On a high level, `Deref` concerns the automatic dereferences that conceptually "reduces" a type to another "simpler" type. On the contrary, `Receiver` has always meant to control the type of the receiver variable `self` in the method signature. For types with `impl Deref`, given that their `impl Receiver` agrees with `impl Deref`, chasing through `<_ as Deref>::Target` or `<_ as Receiver>::Target` makes no difference: the receiving types overlap. What would interest us is the case when `<_ as Deref>::Target` and `<_ as Receiver>::Target` are allowed to disagree. ### `Receiver` side-chains Notice that `Receiver` has been appending new possible receiving types to the Autoderef chain. The extension is where we can generalise by attaching a `Receiver` side-chains not just at the end of a chain of types following `Deref`s, but also at each step of the autoderefs. On a high level the complete would look like the following. ```mermaid flowchart TB StartType ===>|"Deref::Target, <br> by calling .deref()"| Deref1 ===>|"Deref::Target, <br> by calling .deref()"| eps1("...") ===>|"Deref::Target, <br> by calling .deref()"| DerefN subgraph "Receiver chain #0" StartType --->|Receiver::Target| Receiver1("Receiver₀¹") --->|Receiver::Target| eps2("...") --->|Receiver::Target| ReceiverN("Receiver₀ⁿ") StartType -.->|adjust| a0 subgraph a0["Adjustments"] direction TB S1("StartType") -.->|then| S2("&[mut] StartType, by reborrowing") -.->|then| eps21("...") end end subgraph "Receiver chain #1" Deref1 --->|Receiver::Target| Receiver11("Receiver₁¹") --->|Receiver::Target| eps3("...") --->|Receiver::Target| Receiver1N("Receiver₁ⁿ") Deref1 -.->|adjust| a1 subgraph a1["Adjustments"] direction TB D1("Deref1") -.->|then| eps31("...") end end subgraph "Receiver chain #N" DerefN --->|Receiver::Target| ReceiverN1("Receiverₙ¹") --->|Receiver::Target| eps4("...") --->|Receiver::Target| ReceiverNN("Receiverₙⁿ") DerefN -.->|adjust| an subgraph an["Adjustments"] direction TB DN("DerefN") -.->|then| epsn1("...") end end ``` ### Interleaving `Deref` and `Receiver`; "method probing explosion" The frequently asked question about one of the consequences of separating `Deref` and `Receiver` completely is the possible explosion of methods to look up, through chasing `Deref::Target` and `Receiver::Target` alternatively. By assuming that we are only allowed to transform the method receiver **value** with application of automatic dereferences and adjustments like reborrowing with different mutability, we would argue that chasing down `Deref::Target` after chasing through a `Receiver::Target` side-chain does not add any value to method probing. The purpose of `Receiver` is now to determine the generic type `Self` of a potentially viable `impl` block that contains an applicable method. One can already tell from the form of `trait Receiver` and note that it only contains associated `type Target`. Given that only the `Deref` trait provides the means to reduce a **value** of some type into another **value** of another simplified type, we can conclude that `Receiver` operates purely on the type level and is essentially responsible only for select a `impl` block with an applicable `Self` type in the block header. On the contrary, if we are to honour the principle that we would only adapt the method receiver value with auto-derefs and adjustments, it is clear that chasing down `Deref` again after running into a `Receiver` side-chain is not productive because methods discovered with those types cannot receive a `self` value that is derived from the original value. There is no corresponding value on those side-chains to begin with. ```rs= struct Myself; type SomeType<T> = ...; // Receiver dictates whether a method can receive a value of type // `SomeType<Self>` based on the criterion that the type `Self` // is transitively reachable by following `Receiver::Target`s // and whose `impl` block contains an applicable method. trait Trait { // ^ Receiver decides what is the applicable `Self` here fn method( self: SomeType<Self>, // ^~~~~~~~~~~~~~ this type sits on the main `Deref` chain // Receiver is involved only for the well-formedness of this method definition ); } impl Trait for SomeOtherType { // ^~~~~~~~~~~~~ // Receiver decides whether `SomeOtherType` is applicable ... fn method( self: SomeType<SomeOtherType>, // ^~~~~~~~~~~~~~~~~~~~~~~ // ... but not this type; rather, Deref is in control here. // Receiver is involved only for // the well-formedness of this method definition. ) } ``` ### Proposed changes - For implementation, we are not changing method picking and finalisation stages. - We will restore `Autoderef` to only use `Deref` - We will change the assembly stage where we will solve the `Receiver::Target` projection iteratively. Candidates are reported in the same order as they emerge from the Receiver side-chain.