
"Weak" type aliases in Rust 2024

Background

Type aliases today leak impl details

Rust type aliases are currently implemented as a "desugaring" step in the compiler that takes place before type and trait checking. Given an alias like this one

type Foo<T> = Vec<T>
where
    T: Ord;

any reference like Foo<u32> will be "eagerly" converted to the desugared type Vec<u32> before the type checker and trait system see it. As a result, where-clauses (like T: Ord here) are completely ignored. This is why the language currently warns when where-clauses are present: we don't want people to think the compiler is checking them.
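A minimal sketch of this behavior as it can be observed on current stable Rust. It assumes that an inline bound (`T: Ord`) is treated the same way as the where-clause form shown above; `make_floats` is a hypothetical helper, and the `allow` attribute only silences the warning just mentioned.

```rust
// The bound on the alias is not enforced: `Foo<f64>` is accepted even
// though `f64` does not implement `Ord`.
#[allow(type_alias_bounds)]
type Foo<T: Ord> = Vec<T>;

// Hypothetical helper: the signature mentions `Foo<f64>`, but after eager
// expansion the type checker only ever sees `Vec<f64>`.
fn make_floats() -> Foo<f64> {
    vec![1.5, 2.5]
}
```

Calling `make_floats()` yields an ordinary `Vec<f64>`; nothing about the alias survives past the desugaring step.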

The one time that where-clauses have an influence is when elaborating an associated type reference like T::Item. This is because elaboration from T::Item to a fully qualified associated type reference like <T as Iterator>::Item takes place in that same desugaring phase and is based on a syntactic analysis of the where-clauses in scope. Therefore, type Foo<T> = Vec<T::Item> does not compile, but it will compile if where T: Iterator is added or if T::Item is written <T as Iterator>::Item.
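The two accepted spellings can be sketched as follows. The alias names and the `collect_items` helpers are hypothetical, and the first alias uses an inline bound on the assumption that it drives elaboration the same way a where-clause does.

```rust
// Compiles: the bound is used syntactically to elaborate `T::Item`
// into `<T as Iterator>::Item`.
#[allow(type_alias_bounds)]
type ItemsViaBound<T: Iterator> = Vec<T::Item>;

// Compiles without any bound: the path is already fully qualified.
type ItemsQualified<T> = Vec<<T as Iterator>::Item>;

// Hypothetical helpers that collect an iterator into each alias.
fn collect_items<I: Iterator>(iter: I) -> ItemsQualified<I> {
    iter.collect()
}

fn collect_items_via_bound<I: Iterator>(iter: I) -> ItemsViaBound<I> {
    iter.collect()
}
```

Both helpers return the same underlying `Vec` type; the only difference is how the alias was allowed to name the associated type.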

Concerns about eager desugaring

The "eager desugaring" approach used for type aliases in the compiler has several downsides.

Long, confusing diagnostics and leaky abstractions

Type aliases are often used to make a nicer name for common type combinations. But right now users are confronted with the full type on a regular basis. This is because the diagnostic code never sees the type alias, only the desugared form.

It's confusing and inconsistent to ignore where-clauses

Naturally the fact that where-clauses are ignored on type-aliases has long been considered a bug.

Advantages of eager desugaring

Eager desugaring does however have some advantages.

Concise

For one thing, it gives something similar to implied bounds. For example here

struct NeedsClone<T: Clone> { t: T }

type Foo<T> = NeedsClone<T>;

there is no need to write T: Clone on the declaration of Foo.
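A small runnable version of this example, as it behaves on current stable Rust. The `wrap` function is a hypothetical helper added only to show the alias in use.

```rust
struct NeedsClone<T: Clone> {
    t: T,
}

// No `T: Clone` is written here; the right-hand side is not checked
// for well-formedness at the alias declaration today.
type Foo<T> = NeedsClone<T>;

// Hypothetical helper: the bound is only checked once a concrete
// argument (`String`, which is `Clone`) is substituted.
fn wrap(name: String) -> Foo<String> {
    NeedsClone { t: name }
}
```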

Proposal

Lazy/weak type aliases

With the new trait solver, as well as other recent rustc refactoring, the types team is in a position to implement "lazy" type aliases (aka, "weak" type aliases) which are not eagerly normalized. Conceptually, such a type alias can be seen as equivalent to an associated type reference, so that

type Foo<T> = Vec<T>;

would be equivalent to

trait FooTrait<T> { type Output; }
impl<T> FooTrait<T> for () {
    type Output = Vec<T>;
}

and references to Foo<T> would effectively be desugared to <() as FooTrait<T>>::Output. Note that, in this case, all the information needed to resolve the associated type is already known. The main difference is that the compiler would retain the "unnormalized" (i.e., type alias) form internally. This also makes it easy to support where-clauses: they are conceptually added to the impl.

A side benefit is that we can print out type aliases in their "alias" form (i.e., print Foo<u32> instead of Vec<u32>).
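The associated-type model described above can be written out by hand on stable Rust. This is only a sketch of the conceptual equivalence, not of how rustc implements lazy aliases; `make_numbers` is a hypothetical function standing in for any code that would mention `Foo<u32>`.

```rust
trait FooTrait<T> {
    type Output;
}

impl<T> FooTrait<T> for () {
    type Output = Vec<T>;
}

// Stand-in for a signature that would say `Foo<u32>` under a lazy alias.
// The projection normalizes to `Vec<u32>`, so a `Vec` can be returned.
fn make_numbers() -> <() as FooTrait<u32>>::Output {
    vec![1, 2, 3]
}
```

Because the impl is for `()` and is fully generic over `T`, the projection can always be resolved, which matches the observation that all the information needed is already known.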

Question

The primary question then becomes "what kinds of where-clauses should be required and when"? Here are two examples worth looking at.

Ampersand

The first example, Ampersand, indicates a case where the inner type, &'a T, has a well-formedness requirement that T: 'a:

type Ampersand<'a, T> = &'a T;

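The second example, OrdVec, adds a where-clause T: Ord that the inner type Vec<T> does not itself require: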

type OrdVec<T> = Vec<T>
where
    T: Ord;

Considerations

There are a few ways to think about this question.

Analogy to structs

One way to decide what where-clauses should be required is to go by analogy to structs. In this model, type Foo<T> = XXX requires bounds if and only if

struct Foo<T> { field: XXX }

would require bounds. By this model, XXX = Ampersand<T> does not require bounds (because we infer T: 'a based on struct fields, thanks to RFC 2093), but XXX = OrdVec<T> does require T: Ord.
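The struct side of this analogy can be checked on stable Rust today. In the sketch below, `OrdVec` is written as a struct with an enforced bound, as a stand-in for an alias whose where-clause is enforced; all other names are hypothetical.

```rust
// Compiles without writing `T: 'a`: the outlives bound is inferred
// from the field type (RFC 2093).
struct HoldsAmpersand<'a, T> {
    field: &'a T,
}

// Stand-in for an alias with an enforced `T: Ord` bound.
struct OrdVec<T: Ord> {
    items: Vec<T>,
}

// `T: Ord` has to be restated here; without it the field type
// `OrdVec<T>` would not be well-formed and the struct is rejected.
struct HoldsOrdVec<T: Ord> {
    field: OrdVec<T>,
}

fn peek<'a, T>(holder: &HoldsAmpersand<'a, T>) -> &'a T {
    holder.field
}

fn smallest<T: Ord + Copy>(holder: &HoldsOrdVec<T>) -> Option<T> {
    holder.field.items.iter().copied().min()
}
```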

Analogy to impls

Another way to decide what where-clauses should be required is to go by analogy to impls. In this model, type Foo<T> = XXX requires bounds if and only if

impl<T> FooTrait<T> for () {
    type Output = XXX;
}

would require bounds. By this model, both XXX = Ampersand<T> and XXX = OrdVec<T> require bounds. Arguably, though, impls should infer T: 'a where-clauses from the values of their associated types; the RFC does not elaborate on why we chose not to do this.
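For the Ampersand case, the impl side of the analogy looks like this on stable Rust. The trait name `AmpersandTrait` and the `borrow` helper are hypothetical; the point is only that the outlives bound has to be spelled out on the impl.

```rust
trait AmpersandTrait<'a, T> {
    type Output;
}

// Per the model above, the `where T: 'a` clause is required here:
// `&'a T` is only well-formed when `T: 'a`, and impls do not infer it.
impl<'a, T> AmpersandTrait<'a, T> for ()
where
    T: 'a,
{
    type Output = &'a T;
}

// Hypothetical helper returning the projected type.
fn borrow<'a, T>(value: &'a T) -> <() as AmpersandTrait<'a, T>>::Output
where
    T: 'a,
{
    value
}
```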

What do we eventually want with implied bounds?

We have talked for some time about having some more expressive kind of implied bounds, such that given struct OrdSet<T: Ord> { ... } and fn take_set<T>(o: &OrdSet<T>), the function could assume that T: Ord because it appears in the struct where-clauses. You might then choose to say that type aliases have implied bounds that make their right-hand side be well-formed, such that (effectively) type Foo<T> = OrdSet<T> would implicitly have a T: Ord requirement. As of now, though, we don't have a clearly articulated principle for what bounds are implied and where: we allow implied bounds on functions based on the arguments, and on impls based on the input types, but only for outlives bounds, not for other bounds.
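The current split between outlives bounds and other bounds can be sketched on stable Rust as follows. The `items` field and the `flatten` function are hypothetical additions for illustration.

```rust
struct OrdSet<T: Ord> {
    items: Vec<T>,
}

// Trait bounds are not implied today: `T: Ord` must be restated,
// even though `OrdSet<T>` appears in the argument list.
fn take_set<T: Ord>(o: &OrdSet<T>) -> Option<&T> {
    o.items.iter().max()
}

// Outlives bounds are implied: `'b: 'a` is never written, but it
// follows from the well-formedness of the argument type `&'a &'b T`.
fn flatten<'a, 'b, T>(x: &'a &'b T) -> &'a T {
    *x
}
```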

What message are we sending?

Currently the compiler warns against adding where-clauses on type aliases. If we do a full 180 and start requiring where-clauses on type aliases in Rust 2024, that's quite the whiplash, particularly if we later go with an implied bounds approach. An alternative might be to do some kind of warnings.

Recommendation

The doc author (nikomatsakis) recommends the following:

  • Unify the "analogy to structs" and "analogy to impls" by extending RFC 2093 inference to add outlives clauses to impls
  • Adopt the "analogy to impls" as our official model for type aliases; thus we would infer outlives but not other kinds of bounds (in Rust 2024)
    • Type aliases defined in Rust 2021 and before (as defined by the span information on the type keyword) are eagerly expanded.
    • Type aliases defined in Rust 2024 and later are weakly expanded.

FAQ

Aren't you worried about sending weird messages where we flip-flop on where-clauses being required or not?

Not really, I think aligning to structs/impls and then either migrating those to use implied bounds or not makes sense.

Why the span info from the type keyword?

Because I expect that people will write macros like type $name<X> = Something where X: Ord or whatever. But I'm open on this point.
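A sketch of the kind of macro in question, written so that it compiles on stable Rust today. The macro name `ord_alias`, the alias `Scores`, and the helper `scores` are hypothetical, and an inline bound is used in place of the where-clause. The `type` keyword token is written inside the macro definition, so under the recommendation above its span, and hence presumably the edition of the crate defining the macro, would decide whether the alias is eager or weak.

```rust
// Hypothetical macro that stamps out bounded type aliases.
macro_rules! ord_alias {
    ($name:ident) => {
        // The `type` token below originates in the macro definition.
        #[allow(type_alias_bounds)]
        type $name<X: Ord> = Vec<X>;
    };
}

ord_alias!(Scores);

// Hypothetical helper using the generated alias.
fn scores() -> Scores<u32> {
    vec![3, 1, 2]
}
```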

Is one easier to implement?

The more closely we follow analogy to impls, the easier to implement.

Other notes?

Here are the notes from the types team meetup.


Minutes

People: TC, nikomatsakis, tmandry, scottmcm, pnkfelix, Josh, fmease, waffle, compiler-errors

Minutes: TC

Must this be an "either/or" decision?

pnkfelix: This may be a silly Q.

pnkfelix: I can imagine a world where we first analyze the nature of the type-alias, and how its where-clauses relate to the where-clauses of the types it references, and use that to inform whether we do Eager or Late resolution of that type alias.

pnkfelix: I.e.: if the type-alias is adding new where clauses, then that must use late resolution. If the type-alias has removed where clauses (and isn't adding any new ones; i.e. all of its where clauses are implied by the where clauses attached to the type(s) in the RHS), then that could (?) continue to use eager resolution?

TC: That would still have the downside then of continuing to show the normalized type in, e.g., error messages.

NM: I think we could do this, but I'm not sure what purpose it would serve.

pnkfelix: I'm probably in favor of moving overall to late normalization. The question is whether we could serve both use cases.

NM: There is an alternative like that. We could say that "where the RHS of the type alias is well formed." So the check would get pushed out to each usage site.

pnkfelix: What are the downsides to that?

NM: The question before us is "what do we really expect?" Do we expect this to be more like a field on a struct, an associated type, or a parameter to a function. For the latter, you get to assume that they're well formed. That's more like an input. For struct fields and associated types, that's more like an output. You have to add enough bounds to make them well typed.

Josh: Is there an alternative where you don't have to declare the bound to use the type, but you have to declare it to use the bound? (e.g. you could use NeedsOrd<T> without declaring T: Ord, but you couldn't actually call T::cmp without declaring T: Ord)

pnkfelix: It's tough. It'd be painful if we deferred checking anything about the type alias's bounds until the type alias is used.

Josh: We should check the bounds of the type alias themselves when they're declared. I'm talking about the implied bounds. The argument here would be not making something part of an interface you may not mean to.

NM: I don't think that applies here. The mechanism Josh is describing, implied bounds, says "where each of my types is well formed." Is that an iff relation, or an if? If we check that the RHS is well formed, it's not clear it matters.

NM: Again, this comes back to what we expect. Do we expect to have to write the bounds on a function using the type alias, or not?

pnkfelix: Is the problem that we don't have enough information here to know whether it's an input or output?

NM: Just looking at the struct header, I shouldn't have to know the types of its fields to know whether it's valid. That's not true today because of outlives, which arguably isn't important, but it's mostly true. The same is true for GATs. You look at the header to know whether the impl applies. You don't have to look at the value of the GAT. So for a type alias, do I need to look at the RHS to know whether it's valid, or just the LHS?

NM: I don't want there to be extra WCs derived from the body, as in C++, to check validity.

Josh: You're saying that bringing in bounds from the signature is OK?

NM:

Josh: I'd suggest we treat this as orthogonal. If we had to do this over again, we certainly wouldn't have where clauses on type aliases that aren't enforced. So enforcing them is the right answer. But whether we have implied bounds or not is a problem for other things too (e.g. generic struct bounds). So we shouldn't block the enforcement of type aliases on any decision about implied bounds.

NM: We have two different kinds of implied bounds. One is the one on functions. "where this is well formed." The other is like what we do for structs, where we look at the fields, and we add bounds implicitly. So there's a reverse flow thing. Flowing things out from the inner to the outer, or from the outer signature to the inner bounds.

pnkfelix: I wanted to come back to the point about whether type aliases should be opaque. There is also the newtype pattern. If we make them opaque, how would we express that they should be translucent?

NM: This is more analogous to type aliases. No-one is arguing for them to be opaque as in impl Trait.

NM: In my mind, both type aliases and associated types are exactly the same thing. In T-types, we call these alias types now. So it seems obvious to me that they should behave analogously.

TC: NM, perhaps you could discuss how WCs on a trait impl are enforced differently than WCs on an associated type, how that might connect to the input/output discussion here, and what the appropriate desugaring might be?

NM:

// Niko's argument: apart from where-clauses and WF'edness, and how they are implemented today,
// these two constructs seem equivalent in Rust today:


type Foo<T> = Vec<T>;


trait FooTrait<T> { type Output; }
impl<T> FooTrait<T> for ()
where
    XXX // <-- these can be independent from what appears on the trait, condition on which trait is implemented
{
    type Output = Vec<T>
    where
        XXX; // <-- these must be a subset of what appears on the trait (part of the trait contract)

    fn foo<T>()
    where
        T: Ord // <-- part of *trait* contract
    {}
}
// always use `<() as FooTrait<T>>::Output` instead of `Foo<T>`

// an impl is basically a module indexed by a type

(NM had to drop. CE picks up the thread)

FSK (attempting to encode CE's words):

type Foo<T> where T: Copy = Vec<T>;

// desugaring 1
trait FooTrait<T> { type Output; }
impl<T> FooTrait<T> for () where T: Copy {
    type Output = Vec<T>;
}

// desugaring 2 (CE would prefer 3 to this one)
trait FooTrait<T> { type Output where T: Copy; }
impl<T> FooTrait<T> for () {
    type Output = Vec<T> where T: Copy;
}

// desugaring 3
trait FooTrait { type Output<T> where T: Copy; }
impl FooTrait for () {
    type Output<T> = Vec<T> where T: Copy;
}

CE: If we are trying to propagate the implied where-clauses, we do propagate where-clauses attached to the (GAT?), but we would not propagate where-clauses from the impls.

CE: If we wrote a function like:

fn foo<T>(x: <() as FooTrait>::Output<T>) ...

CE: The compiler will never be smart enough to pull in the implied bounds from desugaring number 1. Because that requires us to know which implementation to normalize.
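A minimal sketch of desugarings 1 and 3 side by side, as they behave on current stable Rust. The traits are renamed to the hypothetical `FooTrait1` and `FooTrait3` so both fit in one file, and `first1` / `first3` are hypothetical helpers. No bounds are implied from either form today, so both functions restate `T: Copy`.

```rust
// Desugaring 1: the bound lives on the impl.
trait FooTrait1<T> {
    type Output;
}
impl<T> FooTrait1<T> for ()
where
    T: Copy,
{
    type Output = Vec<T>;
}

// Desugaring 3: the bound lives on the generic associated type.
trait FooTrait3 {
    type Output<T>
    where
        T: Copy;
}
impl FooTrait3 for () {
    type Output<T> = Vec<T>
    where
        T: Copy;
}

// In desugaring 1, `T: Copy` is what makes `(): FooTrait1<T>` hold at all.
fn first1<T: Copy>(x: <() as FooTrait1<T>>::Output) -> Option<T> {
    x.first().copied()
}

// In desugaring 3, `T: Copy` is part of the associated type's own contract.
fn first3<T: Copy>(x: <() as FooTrait3>::Output<T>) -> Option<T> {
    x.first().copied()
}
```

In the third form the where-clause is visible on the trait's associated type without selecting an impl, which is the distinction being drawn between the two.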

tmandry: Desugaring 3 is closest to type aliases. It looks closer to it. And it's what I'm leaning toward. But from a perspective of conservatism, we don't want to implicitly copy-paste all WCs on the type alias everywhere it gets used. We want to make people write them out to say what bounds are being implied in their interfaces.

scottmcm: I'm not sure I'd describe that as conservatism. It being non-breaking to remove an unnecessary bound would be nice. It's more of a SemVer question. We want to let people simplify things. But maybe in a type alias we should think of it as being a promise. You can always make a new type alias if you want different bounds.

CE: This gets to conversations T-lang has had before about WCs that come before and after the definition of the type alias. It's also come up in trait aliases.

scottmcm: type (OrdVec<T> where T: Ord) = Vec<T> vs type OrdVec<T> = (Vec<T> where T: Ord); could potentially be different promises, given sufficient syntax.

CE: Yes, that's what I was thinking. If I have WCs on my type alias, do we want a function using that type alias to be able to assume those WCs as implied bounds, or do we want the function to have to restate those WCs within its own signature?

tmandry: The analogy to structs is seeming more compelling to me. My feeling is that the where clauses should be the same in these cases.

waffle: +1.

CE: We could make those rules stronger in the future also. For now, we could follow the same rules as structs. We could relax it in the future, but that would be part of a greater move toward assuming things are well formed everywhere.

waffle: What CE said is similar to what Josh said earlier. We could make the decision to make type aliases similar to structs, then we could make a decision about implied bounds.

tmandry: We're talking about propagation of explicit where clauses, and we're talking about whether we implicitly add WCs for well formedness of bounds based on the type on the RHS of the type alias.

TC: What desugarings above is this forward compatible with?

CE: Doing it as structs is compatible with later moving to any of the desugarings above. The cost of not doing that now is that users would need to write redundant bounds at the use sites as well as on the type alias itself.

TC: What's the next step to move it forward?

CE: We should run an experiment to see the fallout from enforcing these WCs.

CE: And we should write something up about the semantics we want to follow.

TC: The reason Niko suggested treating these as non-orthogonal was to prevent the whiplash of forcing people to add these bounds and then removing the need for that later.

scottmcm: Here's an example of a type alias I used where not having to include the traits for a private type alias was certainly convenient https://github.com/rust-lang/rust/blob/abe34e9ab14c0a194152b4f9acc3dcbb000f3e98/library/core/src/ops/try_trait.rs#L366, though it's more about simplifying the usage sites, so adding more to the declaration of the type alias wouldn't be a big deal.

tmandry: This needs to be an edition migration?

all: Yes.

waffle: What will we learn in an experiment?

CE: We'll learn about type aliases that do need bounds.

tmandry: +1 on going forward with what's been proposed by CE for moving forward.

TC: Proposed consensus: We're OK with going forward with type aliases working analogously with structs as a minimum. We accept the fact that people may have to add redundant where clauses, even though we may relax that later. But we're also open to a proposal that would make type aliases more analogous to associated types (and alias types more generally).

scottmcm: I'm in favor of that, but I'm a bit unsure what the difference is between the impl version and the struct version.

scottmcm: I'd love to see some formalized analysis of exactly where we require these things, here's how we can avoid writing useless things in common cases, etc.

Josh: The proposed consensus above sounds right, with the addition that we're also open to proposals that would extend implied bounds more generally on all kinds of alias types.

scottmcm: +1 and for the 2024 edition, we're OK doing this without further extension to implied bounds.

Consensus: We're OK with going forward in the 2024 edition with type aliases being weak and working analogously with structs. We accept the fact that people may have to add redundant where clauses, even though we may relax that later. This path is forward compatible with later relaxing these rules within an edition. We won't block stabilizing this in Rust 2024 on other improvements to implied bounds being made. At the same time, we're open to future or concurrent work along two axes: 1) better unifying type aliases with other alias types (e.g. associated types), and 2) improving implied bounds generally for all alias types.

(The meeting ended here.)

Analogy to desugaring?

(skip this, we talked about it above in a better way)

scottmcm: Something niko said made me think about trait aliases here: trait aliases are kinda like a trait with a blanket impl, and thus there's a model you could think of here where a generic type alias is the same kind of thing.

Thinking out loud, is this kinda like one of these?

struct OrdVec<T>(PhantomData<T>);
impl<T: Ord> OrdVec<T> {
    type Output = Vec<T>;
}

vs

struct OrdVec<T: Ord>(PhantomData<T>);
impl<T: Ord> OrdVec<T> {
    type Output = Vec<T>;
}

hmm, having written that I'm not sure that makes any sense any more

Oh, niko's example above on () instead of a struct makes more sense here.

Does "analogy to impls" require strictly more bounds?

scottmcm: We can presumably always require fewer bounds later (subject to not wanting to churn too much), so if impls is the suggested and easier one and going to the other later would still be possible, that's pretty persuasive to me.

Discussing implied bounds

(Intentionally further down; probably not critical for this decision)

scottmcm: I would love to move the proof requirement for types, rather than imply them. I don't know what exactly those rules would be, but allowing you to say Cow<T> for everything, but move the proof requirement to the constructors, or something. (We had a discussion about this around Box, IIRC, but I forget what all happened.)
