# Design meeting: consts in patterns
We allow using (some) constants in patterns. However, we cannot allow all of them: some just don't have a way of being compared, such as unions; others get rejected for being "not structural-match", as defined in [RFC 1445][rfc-1445]. The structural-match check had some holes so some constants that we do accept get linted as "this will be an error in the future". These lints were introduced [a long time ago](https://github.com/rust-lang/rust/pull/62339) and have been [warn-by-default](https://github.com/rust-lang/rust/pull/70743) since Rust 1.48, close to 3 years ago. Since then the pattern matching implementation in the compiler changed *a lot* and our ideas of what we do and don't want to do with pattern matching also changed. We also realized there are some gaps in what the RFC discusses, such as raw pointers.
[rfc-1445]: https://rust-lang.github.io/rfcs/1445-restrict-constants-in-patterns.html
It's time to figure out where we want to go with this: enforce RFC 1445 by making (some of) these lints hard errors, or change our mind and remove the lints.
The main questions to figure out are:
- Which exact consts do we want to accept in patterns?
- What should their semantics be, i.e., when are they considered to "match" the place-being-matched-on?
Some smaller questions that also show up are:
- How should consts interact with exhaustiveness checking, i.e., when will the constant value be taken into account to compute which cases are already covered by match arms?
- Should we lint against some consts that we do allow, e.g. because they have surprising semantics?
## Terminology: "structural match"
Before we dive deeper, we need to define some words that will come up again and again.
We say that **a type is structural-match** if its `PartialEq` instance is (syntactically or semantically) equivalent to the one that would be generated by `derive(PartialEq)`. This term only really applies to ADTs. The `StructuralPartialEq` trait reflects this property. Note that this is *non-recursive*: it only talks about the `PartialEq` instance of this type, not about its fields!
We say that **a type is recursively structural-match** if all ADTs that recursively appear in fields are structural-match.
We say that **a value is recursively structural-match** if all ADTs that appear *in this value* are structural-match. Due to `enum`s, it is possible to have non-structural-match types where some values are structural-match, such as the `None` value of `Option<MyNonStructuralMatchType>`.
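To make the type/value distinction concrete, here is a minimal sketch. The names `Derived`, `Custom`, `ZERO` and `NO_CUSTOM` are made up for illustration, and the comments describe how the value-based check is understood to behave, not a guarantee.

```rust
// `Derived` is structural-match: its `PartialEq` comes from the derive.
#[derive(PartialEq, Eq)]
struct Derived(i32);

// `Custom` is NOT structural-match: it has a hand-written `PartialEq`
// (hypothetical semantics: equality modulo sign).
struct Custom(i32);

impl PartialEq for Custom {
    fn eq(&self, other: &Self) -> bool {
        self.0.abs() == other.0.abs()
    }
}

// The *type* `Option<Custom>` is not recursively structural-match, but the
// *value* `None` is: no `Custom` appears anywhere inside it.
const NO_CUSTOM: Option<Custom> = None;
const ZERO: Derived = Derived(0);

fn is_none(x: Option<Custom>) -> bool {
    match x {
        NO_CUSTOM => true,
        _ => false,
    }
}

fn is_zero(x: Derived) -> bool {
    match x {
        ZERO => true,
        _ => false,
    }
}
```

A constant such as `Some(Custom(1))` of the same type would not be a recursively structural-match value, since a `Custom` appears in it.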
## Possible design options
Largely, there are two "main points" in the design space. Of course one could also consider a design that sits somewhere in between those two points.
Option 1: **Consts desugar to a pattern.**
This design considers a constant used in pattern position to be basically syntactic sugar for a pattern that one could have also written otherwise.
For instance, matching on a constant with value `(0i32, None)` is completely equivalent to writing the pattern `(0i32, None)`.
Option 2: **Consts desugar to `==`.**
This design considers a constant used in pattern position to be basically equivalent to a `==` guard, except that possibly exhaustiveness checking can take into account the concrete value of `C`.
For instance, `matches!(x, C)` aka `match x { C => true, _ => false }` would desugar to `x == C`.
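As a rough sketch of the two readings, the following spells out both desugarings by hand for the constant from the example above. The function names are hypothetical. For a type like this one, where everything is structural-match, the two readings cannot be told apart; the options only differ for the more exotic cases discussed below.

```rust
const C: (i32, Option<bool>) = (0, None);

// The constant used directly in pattern position.
fn via_const(x: (i32, Option<bool>)) -> bool {
    match x {
        C => true,
        _ => false,
    }
}

// Option 1 reading: the pattern one could have written out by hand.
fn via_pattern(x: (i32, Option<bool>)) -> bool {
    match x {
        (0i32, None) => true,
        _ => false,
    }
}

// Option 2 reading: a plain `==` comparison against the constant.
fn via_eq(x: (i32, Option<bool>)) -> bool {
    x == C
}
```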
## Design option discussion
### Option 1: Desugaring to a pattern
Reading through the historic record, this is likely the originally intended design.
In particular, [RFC 1445][rfc-1445], which introduced the "structural match" restriction, seems to pre-suppose that we want constants in patterns to behave like a regular pattern desugared from the computed value of the constant.
One consequence of this design is that *the value of the const must be visible* -- we need the exact pattern to build the MIR, and to do exhaustiveness checking, so if the const cannot be evaluated (since it depends on some generic parameters), we have to reject it.
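For illustration, here is a small sketch of what this restriction looks like from the user's side. The trait `HasSentinel` and the types `A` and `B` are invented for this example. The associated const cannot be used as a pattern inside the generic function because its value depends on `T`, so the code falls back to an explicit `==` guard.

```rust
trait HasSentinel {
    const SENTINEL: u32;
}

struct A;
struct B;

impl HasSentinel for A {
    const SENTINEL: u32 = 0;
}

impl HasSentinel for B {
    const SENTINEL: u32 = u32::MAX;
}

fn is_sentinel<T: HasSentinel>(x: u32) -> bool {
    match x {
        // Writing `T::SENTINEL => true` here is rejected, because the value
        // of the const is not known while `T` is still generic.
        // A guard with `==` works, but is invisible to exhaustiveness checking.
        v if v == T::SENTINEL => true,
        _ => false,
    }
}
```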
With this design we could accept constants as patterns that do not implement `PartialEq`, let alone `Eq`. (In fact we currently do accept some such constants, though the [examples](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=a20f2b055ddbda810f19b11c24ce3e2e) are contrived and this is likely an accident -- it involves unnecessary bounds on the derived `PartialEq` instance. Non-`PartialEq` constants in patterns used to be more common when not all function pointer types implemented `PartialEq`, but that has been fixed.)
Of course we might want to require `PartialEq` to keep our bets open for the future, e.g. to be forward-compatible with option 2.
This design makes `match` as a language construct completely independent of any user-defined code such as `==`.
#### Pointers in const-patterns
However, to really say that constants desugar to a pattern, we must make sure that the "leaf types" (after traversing all tuples, structs, enums, and arrays) can actually be used as patterns.
And that is not actually the case: we allow matching on constants that involve raw pointers and function pointers, which are not otherwise allowed as patterns.
So, to use this design option, we must pick one of:
- Consider the Rust language of patterns to include raw pointer and function pointer patterns. These patterns do not have a direct surface syntax, they can only come about through the desugaring from constants. They behave like `==` on these types (which is a built-in primitive, so no user-defined `==` is sneaking in here).
- Deprecate and eventually remove support for matching on consts that involve raw pointers or function pointers. (Currently, we emit a future-incompat lint for function pointers and for raw pointers to unsized types, but *not* for other kinds of raw pointers.)
Of course this choice can be made independently for function pointers, and raw pointers with different pointee types.
RFC 1445 does not mention raw pointers or function pointers at all, and arguably they are not very "structural" in their equality.
(It does mention floats and calls them out as non-structural.)
If we want to not have function pointer or `dyn Trait` raw pointer "leaf patterns" because their notion of equality is wonky and definitely not structural, `pointer_structural_match` (or a subset of it that accepts raw slice pointers) needs to be made a hard error.
If we want to not have any raw pointer "leaf patterns", we need a new lint; however, such patterns are used widely, including [in the standard library](https://github.com/rust-lang/rust/blob/b18db7a13e52f71e94bdf221a7a013fd9ace4c7f/library/std/src/sys/windows/thread_parking.rs#L225-L226).
We could consider only allowing matching against raw pointer constants such as `4 as *const i32` but not `&42`, i.e., only constants with a fixed integer value rather than some dynamic memory location.
We make few to no guarantees about pointer identity for pointer equality on `const`s, so this can be justified -- but it would require a new future-incompat lint since currently we just accept such code.
(The fallout from rejecting such patterns is thus also unknown, a crater run would be required.)
Integer-constant raw pointers arguably are as structural as regular integers, so allowing `match` on them is in line with the general philosophy of this design option.
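A minimal sketch of the "integer pointer" case, assuming a made-up sentinel constant named `SENTINEL` (the value `4` is taken from the example above and has no special meaning). The constant is a fixed address rather than the location of some allocation, so comparing against it does not depend on any pointer-identity guarantees.

```rust
// An "integer" raw pointer constant: a fixed address, not a pointer to memory
// that the compiler allocated for us.
const SENTINEL: *const i32 = 4 as *const i32;

fn is_sentinel(p: *const i32) -> bool {
    match p {
        // Compared with the built-in `==` on raw pointers (address equality).
        SENTINEL => true,
        _ => false,
    }
}
```

A constant like `&42 as *const i32` is the other kind of pointer constant: it points to an actual memory location whose identity is not well specified, which is why the text above considers treating it differently.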
#### Floats in const-patterns
Floats are also an interesting question here.
They have been allowed as primitive patterns (even without constants) since Rust 1.0, but RFC 1445 called them out as non-structural and we have had a future-incompat warning against float patterns for a long time now.
However, when the proposal came up to turn this warning into a hard error, that PR was [rejected by t-lang](https://github.com/rust-lang/rust/pull/84045).
Floats have strange equality on NaNs (which are never equal to anything) and zeroes (where positive and negative zero compare equal).
We could consider entirely rejecting NaNs in patterns since those arms can never be reached anyway, but accepting other float constants.
That would still allow the use-cases people brought up in [the tracking issue](https://github.com/rust-lang/rust/issues/41620).
(For floats we also allow range patterns, but NaN seems to be rejected in range patterns.)
Rejecting zeroes would likely be a lot more surprising, so here we likely have to live with the fact that `match` will consider both zeroes to be equal (if we want to allow float matching at all).
Another option for matching on NaN would be to make the `f32::NAN` pattern match *any* NaN.
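The zero case can be sketched as follows. The names `ZERO`, `is_zero` and `zeroes_differ_bitwise` are hypothetical. NaN is deliberately left out of the pattern, since how to treat NaN patterns is exactly the open question here.

```rust
const ZERO: f64 = 0.0;

// Matching a float constant uses `==` semantics, not bitwise equality.
fn is_zero(x: f64) -> bool {
    match x {
        ZERO => true,
        _ => false,
    }
}

// Positive and negative zero have different bit patterns...
fn zeroes_differ_bitwise() -> bool {
    0.0f64.to_bits() != (-0.0f64).to_bits()
}
```

So `is_zero(-0.0)` returns `true` even though the two zeroes are distinguishable, and a NaN input always falls through to the wildcard arm because NaN compares unequal to everything.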
#### Structural match restriction
Based on this design, the entire "structural match" story was born out of a desire to reject cases where the desugared pattern does not behave like `==` would. (That's what RFC 1445 was all about; also see [the motivation there](https://rust-lang.github.io/rfcs/1445-restrict-constants-in-patterns.html#motivation).)
To achieve this we must reject consts whose value is not recursively structural-match.
This was originally intended (in the RFC) to be a hard error, but arguably could also be made a lint.
(This also explains why the `StructuralPartialEq` trait is a safe one -- it isn't really load-bearing in this design.)
One notable downside of the structural match checks is that it makes switching from the derived `PartialEq` to a custom one (e.g. one that is more efficient or avoids unnecessary bounds) a breaking change, even if the behavior of `==` remains unchanged.
If we want a hard guarantee that pattern semantics and `==` semantics agree to avoid any potential confusion, this cannot really be avoided.
We could make the trait `unsafe` and off-load the guarantee partially to the user writing `unsafe impl StructuralPartialEq`.
We could also make the trait safe if we treat the structural match check like a lint.
This restriction, when implemented as a hard error, ensures that option 1 is forward-compatible with option 2.
#### Changes from today's behavior
To desugar consts into patterns we need to either reject raw pointer values or consider them legal "leaf patterns" (which likely means we need to permit constructing valtrees with raw pointers, since pattern construction goes through valtrees).
At least the `pointer_structural_match` lint should be made a hard error.
If we want the structural match check to be a hard error, the `indirect_structural_match` future-compat lint also has to be turned into a hard error.
If we want to be able to analyze whether a value is recursively structural-match without computing it (using logic similar to how we determine whether a const value needs dropping), we need to make `nontrivial_structural_match` a hard error; but we could alternatively just say that we will compute the value of the constant and check *that* as basis for the hard error/lint.
(We have to compute that value anyway.)
To follow this design we also have to change the behavior of some existing code such as [this one](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=4736e230824cbd7ce2e443ccc8f04336), where currently one can tell that we are *not* actually desugaring consts to native patterns.
This code already has a future-incompat lint (and has had it for a long time), so we could avoid silently changing semantics by making such code a hard error instead.
### Option 2: Desugar to `==`
The alternative to the above is to say that constants used as patterns behave like `==`, and everything else is linting and quality-of-life improvements.
Why would we explain consts-in-patterns via `==` rather than desugared patterns?
We already allow using `float`s in patterns without any constants being involved (this has worked [since Rust 1.0](https://rust.godbolt.org/z/8E6orGEPc), though we started linting against it at some point many years ago), and that uses `==` semantics rather than "exact bitwise equality" and one can argue about whether this is "structural".
We also allow raw pointers to sized types and don't even lint against that; whether one considers `==` on `*const u8` to be "structural" is probably a matter of opinion -- arguably that type doesn't really have any "structure", and its notion of equality is a very low-level machine detail.
So saying that all consts use `==` is not a total surprise.
(OTOH, as discussed above, if we restrict this to "integer pointers" then raw pointer equality can be argued to be structural.)
One big advantage of this option is that we will be able to allow matching against generic consts, associated consts of generic types, and other consts whose value we cannot know at MIR building time.
One big downside is that if people expect consts in patterns to behave as if they desugar to a pattern, then they are not getting the semantics they are expecting.
Generally people might be [expecting stricter equality from `match` than what `==` provides](https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/matching.20and.20.60Eq.60/near/392838566).
(However, if that's the deciding argument, we should do something about matching on floats, and possibly raw pointers as well.)
People also expressed the opinion that `match` behavior should never depend on user-defined code like custom `==`.
We can of course still detect and lint against "non-structural-match" cases where if the final value were to be written as a pattern, it would behave differently (at least we can do that for consts whose value we can compute).
This would somewhat preserve the spirit of [RFC 1445][rfc-1445].
The details of what gets linted would have to be determined -- if we do allow opaque consts in patterns, we probably don't want to lint against every use of them, so the lint would necessarily miss some warnings.
This option was not a real possibility during prior discussion in past years, since not all function pointer types implemented `PartialEq`.
However, that issue has been resolved now, so (except for what looks like accidents), all types we allow matching on currently do have `PartialEq`.
We could consider requiring `Eq` and not just `PartialEq`, but that would rule out matching on floats.
#### Exhaustiveness checking
As an example for a quality-of-life aspect, if we can determine the value of the constant, we might want to take it into account for exhaustiveness checking -- though of course we can only do that if `==` actually behaves like the desugared pattern, i.e., if the constant value is recursively structural-match.
In those cases we can transparently rewrite the `==` check to a pattern, knowing it does not change program behavior, and then we can do exhaustiveness checking on that pattern.
(This makes the `StructuralPartialEq` trait's promise load-bearing for soundness, and the trait should be made `unsafe`.)
We could say that only constants whose *type* is recursively structural-match are taken into account for exhaustiveness checking; this would entirely avoid having to run the analysis of whether the concrete *value* is recursively structural-match.
(However, this would reject some code that we currently accept and don't lint against.)
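As a small illustration of the quality-of-life aspect, the sketch below uses a hypothetical enum `Light` and constant `STOP`. Because the value of `STOP` is known and recursively structural-match, the compiler can see that the `Red` variant is already covered and accepts the `match` without a wildcard arm.

```rust
#[derive(PartialEq, Eq, Clone, Copy)]
enum Light {
    Red,
    Green,
}

const STOP: Light = Light::Red;

fn describe(l: Light) -> &'static str {
    match l {
        // The constant's value counts as covering `Light::Red`.
        STOP => "stop",
        Light::Green => "go",
        // No `_` arm needed: the match is considered exhaustive.
    }
}
```

If `STOP` were treated as an opaque `==` check, this `match` would need an extra arm to be accepted.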
#### Changes from today's behavior
For this option we definitely need to reject all constants in patterns that do not implement `PartialEq`.
There are already forward-compatibility warnings against basically every possible such case, though one [corner case was missed](https://github.com/rust-lang/rust/pull/115893).
Other than that we can remove all the structural-equality forward-compatibility lints.
We might consider turning some of them into general lints about potentially surprising behavior.
This is also a massive breaking change for matching on consts in `const fn`, which is currently sometimes allowed but would never work under this option since `==` is not `const fn`.
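To illustrate the `const fn` concern, here is a sketch with a made-up enum `Mode` and constant `DEFAULT`. It relies on the constant being turned into a pattern: at the time of writing, calling the derived `==` inside a `const fn` is not possible on stable, so the same function written with `m == DEFAULT` would be rejected.

```rust
#[derive(PartialEq, Eq, Clone, Copy)]
enum Mode {
    Fast,
    Safe,
}

const DEFAULT: Mode = Mode::Safe;

const fn is_default(m: Mode) -> bool {
    // Works in a `const fn` because no call to `PartialEq::eq` is needed;
    // the constant behaves like the pattern `Mode::Safe`.
    match m {
        DEFAULT => true,
        _ => false,
    }
}

// Evaluated at compile time.
const CHECKED_AT_COMPILE_TIME: bool = is_default(Mode::Safe);
```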
### Further options
Of course we don't have to decide to be on either end of this design spectrum.
We could say that some consts behave like desugared to a pattern, while others behave like `==`.
This could be decided based on some trait, or the value of the constant, or other things.
This [document by @lcnr](https://hackmd.io/J3H6jwwQRw-MKTnTdX3zGw#Defining-the-StructuralEq-trait) describes a variant of this. The trait is called `StructuralEq` there, but `StructuralMatch` would probably be a more apt name so we will use that here.
The compiler checks that `StructuralMatch` is only implemented when all fields are `StructuralMatch` and not implemented for unions (this ensures a pattern can always be constructed for all values of this type), but otherwise the trait is safe and can be arbitrarily implemented by users.
Consts that implement `StructuralMatch` get pattern behavior and exhaustiveness checking, all other consts get `==` behavior and no exhaustiveness checking.
(Floats and raw pointers could be considered non-`StructuralMatch` to avoid having to ever consider them as primitive patterns.)
This is a slight breaking change compared to today: if an enum has some variants that are `StructuralPartialEq` and others that are not, a constant whose value is a structural variant currently *can* participate in exhaustiveness checking.
[Here's an example](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=d4e1a53daeee917e48456b5fb924ae10).
@lcnr's proposal would treat this constant opaquely and match via `==`.
We don't have a future-compat lint for that so we don't know how much breakage this would cause.
One consequence of this design is that when a constant has type `(T, U)`, whether or not the `T` part is compared using `==` or by pattern desugaring depends on whether `U: StructuralMatch`.
That is a potentially concerning semantic discontinuity.
As another example, if we eventually allow matching on const generics, the same constant value might behave differently when it is used as a pattern via a const generic vs a regular const: in the first case the value is unknown at MIR building time so it uses `==` semantics; in the second case the value is known so it could be turned into a pattern (if the type is `StructuralMatch`).
Overall this variant is very similar to option 2 with exhaustiveness checks only for `StructuralMatch` types, *except* that we don't promise that all consts behave like `==`, but instead say that consts of `StructuralMatch` type whose value is available at MIR building time behave like the desugared pattern.
### Summary and comparison
Desugaring to a pattern:
- Matches the historic intent
- Can (optionally) ensure that pattern semantics and `==` semantics are equivalent with a hard error, at the cost of ruling out matching on consts of types with a custom `PartialEq` *even if* that `PartialEq` is equivalent to pattern semantics.
- Makes `match` behavior completely independent from potentially user-defined code such as `==`.
- Is not quite a desugaring, since we need to add support for raw pointer "leaf" patterns despite their notion of equality being a lot more subtle and low-level than anything one can write as a native pattern.
- Cannot support opaque constants or constants that contain a `union`, even if they have a sensible notion of equality.
- Holds custom ADTs to a higher standard than float types and raw pointers, which can be matched on despite not being really fully "structural". (Floats are currently being linted against, but t-lang previously rejected [turning that lint into a hard error](https://github.com/rust-lang/rust/pull/84045), so I assume we want to keep allowing float patterns with `==` semantics. Raw pointer patterns with sized pointees do not even have a lint; they are [used in the standard library](https://github.com/rust-lang/rust/blob/b18db7a13e52f71e94bdf221a7a013fd9ace4c7f/library/std/src/sys/windows/thread_parking.rs#L225-L226) and presumably widely used in the ecosystem to check against sentinel values, so I assume we want to keep allowing those as well.)
- There are possible sub-options here that could avoid some of these issues, such as allowing matching only on particular *values* when raw pointers and floats are involved. For raw pointers we could rule out constant pointers that are not integer constants -- we guarantee little about their identity anyway. For floats we could rule out NaNs; those match arms are unreachable anyway. It's unclear how surprised people would be about such value-based restrictions -- there is precedent; after all we already allow matching on `None` but not `Some(MaybeUninit::new(...))` even when those both have the same type. However, for floats, NaNs are not the only values with "strange" equality, there is also the fact that `+0.0 == -0.0` despite those having different bit patterns, so if we disallow NaNs the question comes up what we should do about zeroes.
- Changes behavior of some [code we used to accept without warnings](https://rust.godbolt.org/z/q1M5GaeMW) where `==` and pattern semantics disagree. However, all such code has had future-incompat lints since Rust 1.48 (November 2020).
Desugaring to `==`:
- Can support opaque consts and other consts one could not write as a pattern (e.g., a type that involves a custom tagged `union` with a `PartialEq` instance).
- Can only best-effort lint against possible cases of semantic mismatch between the pattern and `==`, leading to possibly surprising behavior that the programmer did not expect.
- Defies expectations of programmers that expect `match` to have a more strict notion of equality than `==`.
- "Opens the floodgates"; suddenly one will be able to match on *tons* of things, based on their `==` semantics. That can be seen as a good thing or a bad thing.
- Is not quite a desugaring, since we want to take into account some constant values for exhaustiveness checking to avoid unnecessary "non-exhaustive match" errors.
Neither of these options cares much about the `Eq` trait; only `PartialEq` is relevant.
Requiring `Eq` would anyway be inconsistent with allowing matching on floats.
# Post-meeting notes
Some of the main arguments:
1. For option 1: refactoring a binderless pattern into a `const` should not change behavior. In particular for fieldless enum variants, which are almost identical to consts, this would be really surprising. It also violates the "consts behave as if inlined" principle we've been repeating a lot.
2. Against unrestricted option 1, for option 2: we shouldn't expose operations on a type that a user didn't decide to expose -- if they gave no `==`, we shouldn't allow matching on those consts.
Argument 1 rules out option 2.
Argument 2 means we need to restrict option 1. But how?
The current scheme is geared towards allowing matching on a const if the *value* is recursively structural-match, and furthermore the type must implement `PartialEq`.
That means if you have no `==` nobody can match on your types, so that's good -- we don't expose syntactic capabilities that the user didn't explicitly expose.
And it means if we allow matching then its behavior is the same as that of `==`, so we also don't expose semantic capabilities that the user didn't choose to expose.
If we want the refactoring from argument 1 to always result in compiling code (as opposed to just ensuring that if it compiles, it is a semantic NOP), we need to relax this check in a scope-based way, where if we can see all the fields of a type (we are in the same module or they are public), then we allow matching even if there is no `PartialEq` and the value is not structural match.
For all-`pub` types this would mean everyone can match any constant no matter which traits are derived or manually implemented!
But it means if you `derive(PartialEq)` that's a semver promise that your consts *can* be matched on, so you can't ever have a non-structural `PartialEq` in the future.
If we want to avoid that we need to decouple `derive(PartialEq)` from "allow matching on consts of this type".
This can only be changed via an edition transition.
This proposal does *not* let one define a `MyBool` type with an unconventional equality and have reasonable `match` behavior for that type.
But it *does* let one define such a type and at least be sure users are not circumventing the abstraction with `match`.
Supporting that would require much more fundamental changes to our `match` system.
Ideally we'd remain forward-compatible with such changes, but what exactly would be required to ensure this?
Compared to today, this proposal only breaks code that we are already linting against with future-compatibility warnings. Specifically this affects the `indirect_structural_match` lint (which identifies const values that are not recursively structural-match) and the `const_patterns_without_partial_eq` lint (which identifies const values of non-`PartialEq` type).
The latter is very recent (not on stable yet, riding the train for 1.74) but appears in cargo's future compatibility reports; the former is ancient but does not appear in cargo's future compatibility report.
If we want to determine whether a const value is recursively structural-match before evaluating it, and instead do an analysis based on the MIR source that computes the const value, then we'd also need to make the `nontrivial_structural_match` lint a hard error -- but it's unclear what the motivation for that would be.
However this isn't a complete proposal yet, since no answer is given for:
- floats
- raw pointers (thin/wide, integer or "actually pointer")
- function pointers
- potentially letting non-`derive(PartialEq)` types opt-in to allowing matching (with structural semantics)