“Weak” type aliases in Rust 2024

---
title: "Design meeting 2023-11-29: Weak type aliases in Rust 2024"
date: 2023-11-29
tags: ["T-lang", "design-meeting", "minutes"]
discussion: https://rust-lang.zulipchat.com/#narrow/stream/410673-t-lang.2Fmeetings/topic/Design.20meeting.202023-11-29
url: https://hackmd.io/LCUSAX5LSfqc4m7N4CWNPw
---

# "Weak" type aliases in Rust 2024

# Background

## Type aliases today leak impl details

Rust type aliases are currently implemented as a "desugaring" step in the compiler that takes place before type and trait checking. Given an alias like this one...

```rust
type Foo<T> = Vec<T> where T: Ord;
```

...any reference like `Foo<u32>` will be "eagerly" converted to the desugared type `Vec<u32>` before the type checker and trait system see it. As a result, where clauses (like `T: Ord` here) are completely ignored. This is why the language currently warns when where-clauses are present -- because we don't want people to think the compiler is checking them.

The one time that where-clauses have an influence is when elaborating an associated type reference like `T::Item`. This is because elaboration from `T::Item` to a fully qualified associated type reference like `<T as Iterator>::Item` takes place in that same desugaring phase and is based on a syntactic analysis of the where-clauses in scope. Therefore, `type Foo<T> = Vec<T::Item>` does not compile, but it will compile if `where T: Iterator` is added or if `T::Item` is written `<T as Iterator>::Item`.

## Concerns about eager desugaring

The "eager desugaring" approach used for type aliases in the compiler has several downsides.

### Long, confusing diagnostics and leaky abstractions

Type aliases are often used to make a nicer name for common type combinations. But right now users are confronted with the full type on a regular basis. This is because the diagnostic code never sees the type alias, only the desugared form.

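To make the background concrete, here is a minimal sketch of how eager expansion behaves on stable compilers as of this meeting. The aliases `SortedVec` and `Items` and both helper functions are hypothetical names chosen for illustration. The where-clauses are written before the `=`, which is the position stable compilers accept on free type aliases.

```rust
// Hypothetical alias: the `T: Ord` where-clause is accepted but not enforced;
// the compiler only emits the `type_alias_bounds` lint (silenced here).
#[allow(type_alias_bounds)]
type SortedVec<T> where T: Ord = Vec<T>;

// `f64` is not `Ord`, yet this compiles: `SortedVec<f64>` is expanded to
// `Vec<f64>` before the trait system ever sees the alias.
fn unordered_floats() -> SortedVec<f64> {
    vec![2.5, 1.0]
}

// Here the where-clause does matter: it lets the shorthand `T::Item`
// be elaborated to `<T as Iterator>::Item` during the same expansion step.
#[allow(type_alias_bounds)]
type Items<T> where T: Iterator = Vec<T::Item>;

fn collect_items<T: Iterator>(iter: T) -> Items<T> {
    iter.collect()
}
```
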
### It's confusing and inconsistent to ignore where-clauses

Naturally the fact that where-clauses are ignored on type-aliases has long been considered a bug.

## Advantages of eager desugaring

Eager desugaring does however have some advantages.

### Concise

For one thing, it gives something similar to implied bounds. For example here...

```rust
struct NeedsClone<T: Clone> { t: T }
type Foo<T> = NeedsClone<T>;
```

...there is no need to write `T: Clone` on the declaration of `Foo`.

# Proposal

## Lazy/weak type aliases

With the new trait solver, as well as other recent rustc refactoring, the types team is in a position to implement "lazy" type aliases (aka, "weak" type aliases) which are not eagerly normalized. Conceptually, such a type alias can be seen as equivalent to an associated type reference, so that...

```rust
type Foo<T> = Vec<T>;
```

...would be equivalent to...

```rust
trait FooTrait<T> {
    type Output;
}

impl<T> FooTrait<T> for () {
    type Output = Vec<T>;
}
```

and references to `Foo<T>` would effectively be desugared to `<() as FooTrait<T>>::Output`. Note that, in this case, all the information needed to resolve the associated type is already known. The main difference is that the compiler would retain the "unnormalized" (i.e., type alias) form internally. This also makes it easy to support where clauses: they are conceptually added to the `impl`. A side benefit is that we can print out type aliases in their "alias" form (i.e., print `Foo<u32>` instead of `Vec<u32>`).

# Question

The primary question then becomes "what kinds of where-clauses should be required and when"? Here are two examples worth looking at.

## Ampersand

The first example, `Ampersand`, indicates a case where the inner type, `&'a T`, has a well-formedness requirement that `T: 'a`:

```rust
type Ampersand<'a, T> = &'a T;
```

and

```rust
type OrdVec<T> = Vec<T> where T: Ord;
```

## Considerations

There are a few ways to think about this question.

### Analogy to structs

One way to decide what where-clauses should be required is to go by analogy to structs. In this model, `type Foo<T> = XXX` requires bounds if and only if

```rust
struct Foo<T> {
    field: XXX
}
```

would require bounds. By this model, [`XXX = Ampersand<T>`](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=2bd92b9bde1491d1bdae4073dc5a5506) does not require bounds (because we infer `T: 'a` based on struct fields, thanks to [RFC 2093]), but [`XXX = OrdVec<T>`](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=c3c7eaf707b4011a09c1c7f478e68758) does require `T: Ord`.

[RFC 2093]: https://github.com/rust-lang/rfcs/blob/master/text/2093-infer-outlives.md

### Analogy to impls

Another way to decide what where-clauses should be required is to go by analogy to impls. In this model, `type Foo<T> = XXX` requires bounds if and only if

```rust
impl FooTrait<T> for () {
    type Output = XXX;
}
```

would require bounds. By this model, both [`XXX = Ampersand<T>`](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=da8a1f17992b3a19b0079cccd0306453) and [`XXX = OrdVec<T>`](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=43b8c40ce69ee536c387f7a8aeee4062) require bounds. Arguably, though, impls should infer `T: 'a` where-clauses from the values of their associated types -- [the RFC does not elaborate on why we chose not to do this](https://github.com/rust-lang/rfcs/blob/master/text/2093-infer-outlives.md#where-explicit-annotations-would-still-be-required).

### What do we eventually want with implied bounds?

We have talked for some time about having some more expressive kind of implied bounds, such that given `struct OrdSet<T: Ord> { ... }` and `fn take_set<T>(o: &OrdSet<T>)`, the function could assume that `T: Ord` because it appears in the struct where-clauses.
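For reference, this sketch shows what stable Rust requires today under the struct analogy and for the `take_set` function. The field names and the `Wrapper` struct are hypothetical, and `OrdSet` is given a `BTreeSet` body purely for illustration.

```rust
use std::collections::BTreeSet;

// Hypothetical struct with an explicit, non-outlives bound.
struct OrdSet<T: Ord> {
    items: BTreeSet<T>,
}

// No `T: 'a` is written here; RFC 2093 infers it from the field type.
struct Ampersand<'a, T> {
    value: &'a T,
}

// `T: Ord` has to be restated: bounds other than outlives are not inferred
// from field types.
struct Wrapper<T: Ord> {
    set: OrdSet<T>,
}

// Likewise, the function must repeat `T: Ord` today; it is not implied by
// the argument type `&OrdSet<T>`.
fn take_set<T: Ord>(o: &OrdSet<T>) -> Option<&T> {
    o.items.iter().next()
}
```
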
You might then choose to say that type aliases have implied bounds that make their right-hand side be well-formed, such that (effectively) `type Foo<T> = OrdSet<T>` would implicitly have a `T: Ord` requirement.

As of now though we don't have a clearly articulated principle for what bounds are implied and where: **We allow implied bounds on functions based on the arguments, and impls based on the input types, but only for outlives bounds, and not for other bounds.**

### What message are we sending?

Currently the compiler warns against adding where-clauses on type aliases. If we do a full 180, and start *requiring* where-clauses on type aliases in Rust 2024, that's quite the whiplash -- particularly if we later go with an implied bounds approach. An alternative might be to do some kind of warnings.

## Recommendation

The doc author (nikomatsakis) recommends the following:

* Unify the "analogy to structs" and "analogy to impls" by extending [RFC 2093][] inference to add outlives clauses to impls
* Adopt the "analogy to impls" as our official model for type aliases; thus we would infer outlives but not other kinds of bounds (in Rust 2024)
* Type aliases defined in Rust 2021 and before (as defined by the span information on the `type` keyword) are eagerly expanded.
* Type aliases defined in Rust 2024 and later are weakly expanded.

## FAQ

### Aren't you worried about sending weird messages where we flip flop on where-clauses being required or not?

Not really, I think aligning to structs/impls and then either migrating those to use implied bounds or *not* makes sense.

### Why the span info from the `type` keyword?

Because I expect that people will write macros like `type $name<X> = Something where X: Ord` or whatever. But I'm open on this point.

### Is one easier to implement?

The more closely we follow analogy to impls, the easier to implement.

### Other notes?

[Here are the notes from the types team meetup](https://hackmd.io/hnJkp_ZdT6aVUu-7piUdsQ?view).

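As a sketch of the "analogy to impls" that the recommendation builds on, the code below shows the bounds stable Rust currently demands when the two example aliases are spelled as impls. The trait names, the tuple-struct body of `OrdSet`, and the helper functions are hypothetical. Under the first bullet of the recommendation, the explicit `T: 'a` would become inferable, while `T: Ord` would still be required.

```rust
// Hypothetical trait standing in for the `Ampersand` alias.
trait AmpersandTrait<'a, T> {
    type Output;
}

// `T: 'a` must be written explicitly: unlike struct fields, the value of an
// associated type does not currently trigger RFC 2093 inference.
impl<'a, T: 'a> AmpersandTrait<'a, T> for () {
    type Output = &'a T;
}

struct OrdSet<T: Ord>(Vec<T>);

trait OrdSetTrait<T> {
    type Output;
}

// `T: Ord` must be written too, so that `OrdSet<T>` is well formed.
impl<T: Ord> OrdSetTrait<T> for () {
    type Output = OrdSet<T>;
}

// The projection normalizes to `&'a T`, so the reference is returned as is.
fn through_alias<'a, T: 'a>(x: &'a T) -> <() as AmpersandTrait<'a, T>>::Output {
    x
}

// The projection normalizes to `OrdSet<T>`.
fn sorted<T: Ord>(mut v: Vec<T>) -> <() as OrdSetTrait<T>>::Output {
    v.sort();
    OrdSet(v)
}
```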

---

# Minutes

People: TC, nikomatsakis, tmandry, scottmcm, pnkfelix, Josh, fmease, waffle, compiler-errors

Minutes: TC

## Must this be an "either/or" decision?

pnkfelix: This may be a silly Q.

pnkfelix: I can imagine a world where we first analyze the nature of the type-alias, and how its where-clauses relate to the where-clauses of the types it references, and use *that* to inform whether we do Eager or Late resolution of that type alias.

pnkfelix: I.e.: if the type-alias is adding new where clauses, then that *must* use late resolution. If the type-alias has removed where clauses (and isn't adding any new ones; i.e. all of its where clauses are implied by the where clauses attached to the type(s) in the RHS), then that ... could (?) continue to use eager resolution?

TC: That would still have the downside then of continuing to show the normalized type in, e.g., error messages.

NM: I think we could do this, but I'm not sure what purpose it would serve.

pnkfelix: I'm probably in favor of moving overall to late normalization. The question is whether we could serve both use cases.

NM: There is an alternative like that. We could say that "where the RHS of the type alias is well formed." So the check would get pushed out to each usage site.

pnkfelix: What are the downsides to that?

NM: The question before us is "what do we really expect?" Do we expect this to be more like a field on a struct, an associated type, or a parameter to a function? For the latter, you get to assume that they're well formed. That's more like an input. For struct fields and associated types, that's more like an output. You have to add enough bounds to make them well typed.

Josh: Is there an alternative where you don't have to declare the bound to use the type, but you have to declare it to use the bound? (e.g. you could use `NeedsOrd<T>` without declaring `T: Ord`, but you couldn't actually call `T::cmp` without declaring `T: Ord`)

pnkfelix: It's tough.
It'd be painful if we deferred checking anything about the type alias's bounds until the type alias is used.

Josh: We should check the bounds of the type alias themselves when they're declared. I'm talking about the implied bounds. The argument here would be not making something part of an interface you may not mean to.

NM: I don't think that applies here. The mechanism Josh is describing.... implied bounds say, "... where each of my types is well formed." Is that an `iff` relation, or `if`? If we check that the RHS is well formed, it's not clear it matters.

NM: Again, this comes back to what we expect. Do we *expect* to have to write the bounds on a function using the type alias, or not?

pnkfelix: Is the problem that we don't have enough information here to know whether it's an input or output?

NM: Just looking at the struct header, I shouldn't have to know the types of its fields to know whether it's valid. That's not true today because of outlives, which arguably isn't important, but it's mostly true. The same is true for GATs. You look at the header to know whether the impl applies. You don't have to look at the value of the GAT. So for a type alias, do I need to look at the RHS to know whether it's valid, or just the LHS?

NM: I don't want there to be extra WCs derived from the body, as in C++, to check validity.

Josh: You're saying that bringing in bounds from the signature is OK?

NM: ...

Josh: I'd suggest we treat this as orthogonal. If we had to do this over again, we certainly wouldn't have where clauses on type aliases that aren't enforced. So enforcing them is the right answer. But whether we have implied bounds or not is a problem for other things too (e.g. generic struct bounds). So we shouldn't block the enforcement of type aliases on any decision about implied bounds.

NM: We have two different kinds of implied bounds. One is the one on functions. "where this is well formed."
The other is like what we do for structs, where we look at the fields, and we add bounds implicitly. So there's a reverse flow thing. Flowing things out from the inner to the outer, or from the outer signature to the inner bounds.

pnkfelix: I wanted to come back to the point about whether type aliases should be opaque. There is also the newtype pattern. If we make them opaque, how would we express that they should be translucent?

NM: This is more analogous to type aliases. No-one is arguing for them to be opaque as in `impl Trait`.

NM: In my mind, both type aliases and associated types are exactly the same thing. In T-types, we call these alias types now. So it seems obvious to me that they should behave analogously.

TC: NM, perhaps you could discuss how WCs on a trait impl are enforced differently than WCs on an associated type, how that might connect to the input/output discussion here, and what the appropriate desugaring might be?

NM:

```rust
// Niko's argument: apart from where-clauses and WF'edness, and how they are implemented today,
// these two constructs seem equivalent in Rust today:

type Foo<T> = Vec<T>;

trait FooTrait<T> {
    type Output;
}

impl<T> FooTrait<T> for ()
where
    XXX // <-- these can be independent from what appears on the trait, condition on which trait is implemented
{
    type Output = Vec<T>
    where
        XXX; // <-- these must be a subset of what appears on the trait (part of the trait contract)

    fn foo<T>()
    where
        T: Ord // <-- part of *trait* contract
    {}
}

// always use `<() as FooTrait<T>>::Output` instead of `Foo<T>`

// an impl is basically a module indexed by a type
```

(NM had to drop. CE picks up the thread...)

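Leaving the `XXX` placeholders aside, the basic equivalence Niko describes can be checked on stable Rust today. This is a minimal sketch with no where-clauses involved, and the three helper functions are hypothetical names added for illustration.

```rust
type Foo<T> = Vec<T>;

trait FooTrait<T> {
    type Output;
}

impl<T> FooTrait<T> for () {
    type Output = Vec<T>;
}

// Spelled with the alias.
fn via_alias() -> Foo<u32> {
    vec![1, 2, 3]
}

// Spelled with the associated type projection.
fn via_projection() -> <() as FooTrait<u32>>::Output {
    vec![1, 2, 3]
}

// Both spellings normalize to `Vec<u32>`, so a value of one type can be
// returned where the other is expected without any conversion.
fn same_type(v: Foo<u32>) -> <() as FooTrait<u32>>::Output {
    v
}
```
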
FSK (attempting to encode CE's words):

```rust
type Foo<T> where T: Copy = Vec<T>;

// desugaring 1
trait FooTrait<T> {
    type Output;
}

impl<T> FooTrait<T> for () where T: Copy {
    type Output = Vec<T>;
}

// desugaring 2 (CE would prefer 3 to this one)
trait FooTrait<T> {
    type Output where T: Copy;
}

impl<T> FooTrait<T> for () {
    type Output = Vec<T> where T: Copy;
}

// desugaring 3
trait FooTrait {
    type Output<T> where T: Copy;
}

impl FooTrait for () {
    type Output<T> = Vec<T> where T: Copy;
}
```

CE: if we are trying to propagate the implied where-clauses, we *do* propagate where clauses attached to the (GAT?), but we would not propagate where-clauses from the impls.

CE: If we wrote a function like:

```rust
fn foo<T>(x: <() as FooTrait>::Output<T>) ...
```

CE: The compiler will never be smart enough to pull in the implied bounds from desugaring number 1. Because that requires us to know which implementation to normalize.

tmandry: Desugaring 3 is closest to type aliases. It looks closer to it. And it's what I'm leaning toward. But from a perspective of conservatism, we don't want to implicitly copy-paste all WCs on the type alias everywhere it gets used. We want to make people write them out to say what bounds are being implied in their interfaces.

scottmcm: I'm not sure I'd describe that as conservatism. It being non-breaking to remove an unnecessary bound would be nice. It's more of a SemVer question. We want to let people simplify things. But maybe in a type alias we should think of it as being a promise. You can always make a new type alias if you want different bounds.

CE: This gets to conversations T-lang has had before about WCs that come before and after the definition of the type alias. It's also come up in trait aliases.

scottmcm: `type (OrdVec<T> where T: Ord) = Vec<T>` vs `type OrdVec<T> = (Vec<T> where T: Ord);` could potentially be different promises, given sufficient syntax.

CE: Yes, that's what I was thinking.
If I have WCs on my type alias, do we want a function using that type alias to be able to assume those WCs as implied bounds, or do we want the function to have to restate those WCs within its own signature?

tmandry: The analogy to structs is seeming more compelling to me. My feeling is that the where clauses should be the same in these cases.

waffle: +1.

CE: We could make those rules stronger in the future also. For now, we could follow the same rules as structs. We could relax it in the future, but that would be part of a greater move toward assuming things are well formed everywhere.

waffle: What CE said is similar to what Josh said earlier. We could make the decision to make type aliases similar to structs, then we could make a decision about implied bounds.

tmandry: We're talking about propagation of explicit where clauses, and we're talking about whether we implicitly add WCs for well formedness of bounds based on the type on the RHS of the type alias.

TC: What desugarings above is this forward compatible with?

CE: Doing it as structs is compatible with later moving to any of the desugarings above. The cost of not doing that now is that users would need to write redundant bounds at the use sites as well as on the type alias itself.

TC: What's the next step to move it forward?

CE: We should run an experiment to see the fallout from enforcing these WCs.

CE: And we should write something up about the semantics we want to follow.

TC: The reason Niko suggested treating these as non-orthogonal was to prevent the whiplash of forcing people to add these bounds and then removing the need for that later.

scottmcm: Here's an example of a type alias I used where not having to include the traits for a private type alias was certainly convenient <https://github.com/rust-lang/rust/blob/abe34e9ab14c0a194152b4f9acc3dcbb000f3e98/library/core/src/ops/try_trait.rs#L366>, though it's more about simplifying the usage sites, so adding more to the declaration of the type alias wouldn't be a big deal.

tmandry: This needs to be an edition migration?

all: Yes.

waffle: What will we learn in an experiment?

CE: We'll learn about type aliases that do need bounds.

tmandry: +1 on going forward with what CE has proposed.

TC: Proposed consensus: We're OK with going forward with type aliases working analogously with structs as a minimum. We accept the fact that people may have to add redundant where clauses, even though we may relax that later. But we're also open to a proposal that would make type aliases more analogous to associated types (and alias types more generally).

scottmcm: I'm in favor of that, but I'm a bit unsure what the difference is between the impl version and the struct version.

scottmcm: I'd love to see some formalized analysis of exactly where we require these things, here's how we can avoid writing useless things in common cases, etc.

Josh: The proposed consensus above sounds right, with the addition that we're also open to proposals that would extend implied bounds more generally on all kinds of alias types.

scottmcm: +1 and for the 2024 edition, we're OK doing this without further extension to implied bounds.

*Consensus*: We're OK with going forward in the 2024 edition with type aliases being weak and working analogously with structs. We accept the fact that people may have to add redundant where clauses, even though we may relax that later. This path is forward compatible with later relaxing these rules within an edition. We won't block stabilizing this in Rust 2024 on other improvements to implied bounds being made.
At the same time, we're open to future or concurrent work along two axes: 1) better unifying type aliases with other alias types (e.g. associated types), and 2) improving implied bounds generally for all alias types.

(The meeting ended here.)

## Analogy to desugaring?

(skip this, we talked about it above in a better way)

scottmcm: Something niko said made me think about trait aliases here -- that trait aliases are kinda like a trait with a blanket impl -- and thus there's a model you could think of here where a generic type alias is the same kind of thing.

Thinking out loud, is this kinda like one of these?

```rust
struct OrdVec<T>(PhantomData<T>);
impl<T: Ord> OrdVec<T> {
    type Output = Vec<T>;
}
```

...vs...

```rust
struct OrdVec<T: Ord>(PhantomData<T>);
impl<T: Ord> OrdVec<T> {
    type Output = Vec<T>;
}
```

...hmm, having written that I'm not sure that makes any sense any more...

Oh, niko's example above on `()` instead of a struct makes more sense here.

## Does "analogy to impls" require strictly more bounds?

scottmcm: We can presumably always require fewer bounds later (subject to not wanting to churn too much), so if impls is the suggested and easier one and going to the other later would still be *possible*, that's pretty persuasive to me.

## Discussing implied bounds

(Intentionally further down; probably not critical for this decision)

scottmcm: I would love to move the proof requirement for types, rather than imply them. I don't know what exactly those rules would be, but allowing you to say `Cow<T>` for everything, but move the proof requirement to the constructors, or something. (We had a discussion about this around `Box`, IIRC, but I forget what all that happened.)