We allow using (some) constants in patterns. However, we cannot allow all of them: some just don't have a way of being compared, such as unions; others get rejected for being "not structural-match", as defined in RFC 1445. The structural-match check had some holes so some constants that we do accept get linted as "this will be an error in the future". These lints were introduced a long time ago and have been warn-by-default since Rust 1.48, close to 3 years ago. Since then the pattern matching implementation in the compiler changed a lot and our ideas of what we do and don't want to do with pattern matching also changed. We also realized there are some gaps in what the RFC discusses, such as raw pointers.
It's time to figure out where we want to go with this: enforce RFC 1445 by making (some of) these lints hard errors, or change our mind and remove the lints.
The main questions to figure out are:
Some smaller questions that also show up are:
Before we dive deeper, we need to define some words that will come up again and again.
We say that a type is structural-match if its `PartialEq` instance is (syntactically or semantically) equivalent to the one that would be generated by `derive(PartialEq)`. This term only really applies to ADTs. The `StructuralPartialEq` trait reflects this property. Note that this is non-recursive: it only talks about the `PartialEq` instance of this type, not about its fields!
We say that a type is recursively structural-match if all ADTs that recursively appear in fields are structural-match.
We say that a value is recursively structural-match if all ADTs that appear in this value are structural-match. Due to `enum`s, it is possible to have non-structural-match types where some values are structural-match, such as the `None` value of `Option<MyNonStructuralMatchType>`.
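To make the type/value distinction concrete, here is a minimal sketch. The names `CaseInsensitive`, `NOTHING`, and `is_nothing` are made up for illustration and do not come from the RFC or the compiler; the sketch assumes the value-based check described above is what the compiler applies.

```rust
// Hypothetical type with a hand-written `PartialEq`. It is not structural-match,
// because the impl differs from what `derive(PartialEq)` would generate.
struct CaseInsensitive(&'static str);

impl PartialEq for CaseInsensitive {
    fn eq(&self, other: &Self) -> bool {
        self.0.eq_ignore_ascii_case(other.0)
    }
}
impl Eq for CaseInsensitive {}

// The *type* `Option<CaseInsensitive>` is not recursively structural-match,
// but the *value* `None` contains no `CaseInsensitive`, so the value is.
const NOTHING: Option<CaseInsensitive> = None;

fn is_nothing(x: Option<CaseInsensitive>) -> bool {
    match x {
        NOTHING => true,
        _ => false,
    }
}
```

A constant such as `Some(CaseInsensitive("a"))` of the same type would not be recursively structural-match, since its value does contain the custom-equality ADT.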
Largely, there are two "main points" in the design space. Of course one could also consider a design that sits somewhere in between those two points.
Option 1: Consts desugar to a pattern.
This design considers a constant used in pattern position to be basically syntactic sugar for a pattern that one could have also written otherwise.
For instance, matching on a constant with value `(0i32, None)` is completely equivalent to writing the pattern `(0i32, None)`.
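The equivalence claimed by option 1 can be sketched as follows. The function names are illustrative only; the constant `C` carries the value used in the text.

```rust
// A constant whose value is `(0i32, None)`.
const C: (i32, Option<bool>) = (0i32, None);

// Matching against the constant...
fn matches_const(x: (i32, Option<bool>)) -> bool {
    match x {
        C => true,
        _ => false,
    }
}

// ...is meant to behave exactly like matching against the written-out pattern.
fn matches_literal(x: (i32, Option<bool>)) -> bool {
    match x {
        (0i32, None) => true,
        _ => false,
    }
}
```

Under option 1, the two functions are interchangeable for every input.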
Option 2: Consts desugar to `==`.
This design considers a constant used in pattern position to be basically equivalent to a `==` guard, except that possibly exhaustiveness checking can take into account the concrete value of the constant `C`.
For instance, `matches!(x, C)` aka `match x { C => true, _ => false }` would desugar to `x == C`.
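As a small sketch of the option 2 reading, the two functions below are what the desugaring would equate. `Version`, `via_match`, and `via_eq` are hypothetical names; the type derives `PartialEq`, so both readings agree here.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Version {
    major: u8,
    minor: u8,
}

const C: Version = Version { major: 1, minor: 0 };

// The constant used in pattern position.
fn via_match(x: Version) -> bool {
    matches!(x, C)
}

// What option 2 says the match above means.
fn via_eq(x: Version) -> bool {
    x == C
}
```

For a type with a derived `PartialEq` there is no observable difference between the two; the design question is about types where `==` and the desugared pattern could disagree.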
Reading through the historic record, desugaring to a pattern (option 1) is likely the originally intended design.
In particular, RFC 1445, which introduced the "structural match" restriction, seems to pre-suppose that we want constants in patterns to behave like a regular pattern desugared from the computed value of the constant.
One consequence of this design is that the value of the const must be visible – we need the exact pattern to build the MIR, and to do exhaustiveness checking, so if the const cannot be evaluated (since it depends on some generic parameters), we have to reject it.
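The visibility requirement shows up with associated consts of a generic parameter. The sketch below uses made-up names (`Protocol`, `V1`, `is_magic`) and falls back to an explicit `==` guard, since the constant itself cannot be written as a pattern there.

```rust
trait Protocol {
    const MAGIC: u32;
}

struct V1;

impl Protocol for V1 {
    const MAGIC: u32 = 0xCAFE;
}

fn is_magic<P: Protocol>(x: u32) -> bool {
    match x {
        // Writing `P::MAGIC => true` here is rejected by the compiler today:
        // the value depends on the generic parameter `P`, so no pattern can be built.
        v if v == P::MAGIC => true,
        _ => false,
    }
}
```

The guard works because it is evaluated at run time after monomorphization, whereas a pattern would have to be known when the match is lowered.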
With this design we could accept constants as patterns that do not implement `PartialEq`, let alone `Eq`. (In fact we currently do accept some such constants, though the examples are contrived and this is likely an accident: it involves unnecessary bounds on the derived `PartialEq` instance. Non-`PartialEq` constants in patterns used to be more common when not all function pointer types implemented `PartialEq`, but that has been fixed.)
Of course we might want to require `PartialEq` to keep our options open for the future, e.g. to be forward-compatible with option 2.
This design makes `match` as a language construct completely independent of any user-defined code such as `==`.
However, to really say that constants desugar to a pattern, we must make sure that the "leaf types" (after traversing all tuples, structs, enums, and arrays) can actually be used as patterns.
And that is not actually the case: we allow matching on constants that involve raw pointers and function pointers, which are not otherwise allowed as patterns.
So, to use this design option, we must pick one of:
`==` on these types (which is a built-in primitive, so no user-defined `==` is sneaking in here).
Of course this choice can be made independently for function pointers, and raw pointers with different pointee types.
RFC 1445 does not mention raw pointers or function pointers at all, and arguably they are not very "structural" in their equality.
(It does mention floats and calls them out as non-structural.)
If we want to not have function pointer or `dyn Trait` raw pointer "leaf patterns" because their notion of equality is wonky and definitely not structural, `pointer_structural_match` (or a subset of it that accepts raw slice pointers) needs to be made a hard error.
If we want to not have any raw pointer "leaf patterns", we need a new lint; however, such patterns are used widely, including in the standard library.
We could consider only allowing matching against raw pointer constants such as `4 as *const i32` but not `&42`, i.e., only constants with a fixed integer value rather than some dynamic memory location.
We make few to no guarantees about pointer identity for pointer equality on `const`s, so this can be justified, but it would require a new future-incompat lint since currently we just accept such code.
(The fallout from rejecting such patterns is thus also unknown; a crater run would be required.)
Integer-constant raw pointers arguably are as structural as regular integers, so allowing `match` on them is in line with the general philosophy of this design option.
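A sentinel check against an integer-valued raw pointer constant could look like the following sketch. `SENTINEL` and `is_sentinel` are illustrative names; no pointer is ever dereferenced.

```rust
// A raw pointer constant with a fixed integer value (no real memory location).
const SENTINEL: *const u8 = 1 as *const u8;

fn is_sentinel(p: *const u8) -> bool {
    match p {
        SENTINEL => true,
        _ => false,
    }
}
```

Because the constant is just the address `1`, comparing against it involves no questions of pointer identity, unlike a constant that points at some allocation.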
Floats are also an interesting question here.
They have been allowed as primitive patterns (even without constants) since Rust 1.0, but RFC 1445 called them out as non-structural and we have had a future-incompat warning against float patterns for a long time now.
However, when the proposal came up to turn this warning into a hard error, that PR was rejected by t-lang.
Floats have strange equality on NaNs (which are never equal to anything) and zeroes (where positive and negative zero compare equal).
We could consider entirely rejecting NaNs in patterns since those arms can never be reached anyway, but accepting other float constants.
That would still allow the use-cases people brought up in the tracking issue.
(For floats we also allow range patterns, but NaN seems to be rejected in range patterns.)
Rejecting zeroes would likely be a lot more surprising, so here we likely have to live with the fact that `match` will consider both zeroes to be equal (if we want to allow float matching at all).
Another option for matching on NaN would be to make the `f32::NAN` pattern match any NaN.
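The two float oddities mentioned above can be observed directly. This is a small sketch with made-up helper names; the float literal pattern has been accepted since Rust 1.0, though some compiler versions warn about it.

```rust
// A float literal pattern: it uses `==` semantics, not bitwise equality.
fn is_zero(x: f64) -> bool {
    match x {
        0.0 => true,
        _ => false,
    }
}

// Plain float equality, for comparison with the pattern above.
fn float_eq(a: f64, b: f64) -> bool {
    a == b
}
```

Here `is_zero(-0.0)` is true even though `0.0` and `-0.0` have different bit patterns, and `float_eq(f64::NAN, f64::NAN)` is false, which is why an arm matching on a NaN constant could never be reached.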
Based on this design, the entire "structural match" story was born out of a desire to reject cases where the desugared pattern does not behave like `==` would. (That's what RFC 1445 was all about; also see the motivation there.)
To achieve this we must reject consts whose value is not recursively structural-match.
This was originally intended (in the RFC) to be a hard error, but arguably could also be made a lint.
(This also explains why the `StructuralPartialEq` trait is a safe one: it isn't really load-bearing in this design.)
One notable downside of the structural match checks is that they make switching from the derived `PartialEq` to a custom one (e.g. one that is more efficient or avoids unnecessary bounds) a breaking change, even if the behavior of `==` remains unchanged.
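The breaking-change concern can be sketched with two versions of the same hypothetical type, one with a derived and one with a hand-written but behaviorally identical `PartialEq`. All names here are illustrative.

```rust
// Version 1 of a library type: derived equality, so it is structural-match.
#[derive(Clone, Copy, PartialEq, Eq)]
struct IdDerived(u32);

// Version 2: a hand-written `PartialEq` with exactly the same behavior.
#[derive(Clone, Copy)]
struct IdManual(u32);

impl PartialEq for IdManual {
    fn eq(&self, other: &Self) -> bool {
        self.0 == other.0
    }
}
impl Eq for IdManual {}

const ROOT_DERIVED: IdDerived = IdDerived(0);
const ROOT_MANUAL: IdManual = IdManual(0);

fn is_root_derived(x: IdDerived) -> bool {
    // Accepted: the value of the constant is recursively structural-match.
    matches!(x, ROOT_DERIVED)
}

fn is_root_manual(x: IdManual) -> bool {
    // `matches!(x, ROOT_MANUAL)` is rejected, because `IdManual` is not
    // structural-match; downstream code has to fall back to `==`.
    x == ROOT_MANUAL
}
```

Users who matched on `ROOT_DERIVED` would stop compiling after the switch to the manual impl, even though `==` returns the same results.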
If we want a hard guarantee that pattern semantics and `==` semantics agree to avoid any potential confusion, this cannot really be avoided.
We could make the trait `unsafe` and off-load the guarantee partially to the user writing `unsafe impl StructuralPartialEq`.
We could also make the trait safe if we treat the structural match check like a lint.
This restriction, when implemented as a hard error, ensures that option 1 is forward-compatible with option 2.
To desugar consts into patterns we need to either reject raw pointer values or consider them legal "leaf patterns" (which likely means we need to permit constructing valtrees with raw pointers, since pattern construction goes through valtrees).
At least the `pointer_structural_match` lint should be made a hard error.
If we want the structural match check to be a hard error, the `indirect_structural_match` future-compat lint also has to be turned into a hard error.
If we want to be able to analyze whether a value is recursively structural-match without computing it (using logic similar to how we determine whether a const value needs dropping), we need to make `nontrivial_structural_match` a hard error; but we could alternatively just say that we will compute the value of the constant and check that as basis for the hard error/lint.
(We have to compute that value anyway.)
To follow this design we also have to change the behavior of some existing code such as this one, where currently one can tell that we are not actually desugaring consts to native patterns.
This code already has a future-incompat lint (and has had it for a long time), so we could avoid silently changing semantics by making such code a hard error instead.
Option 2: Consts desugar to `==`.
The alternative to the above is to say that constants used as patterns behave like `==`, and everything else is linting and quality-of-life improvements.
Why would we explain consts-in-patterns via `==` rather than desugared patterns?
We already allow using floats in patterns without any constants being involved (this has worked since Rust 1.0, though we started linting against it at some point many years ago), and that uses `==` semantics rather than "exact bitwise equality", and one can argue about whether this is "structural".
We also allow raw pointers to sized types and don't even lint against that; whether one considers `==` on `*const u8` to be "structural" is probably a matter of opinion: arguably that type doesn't really have any "structure", and its notion of equality is a very low-level machine detail.
So saying that all consts use `==` is not a total surprise.
(OTOH, as discussed above, if we restrict this to "integer pointers" then raw pointer equality can be argued to be structural.)
One big advantage of this option is that we will be able to allow matching against generic consts, associated consts of generic types, and other consts whose value we cannot know at MIR building time.
One big downside is that if people expect consts in patterns to behave as if they desugar to a pattern, then they are not getting the semantics they are expecting.
Generally people might be expecting stricter equality from `match` than what `==` provides.
(However, if that's the deciding argument, we should do something about matching on floats, and possibly raw pointers as well.)
People also expressed the opinion that `match` behavior should never depend on user-defined code like custom `==`.
We can of course still detect and lint against "non-structural-match" cases where if the final value were to be written as a pattern, it would behave differently (at least we can do that for consts whose value we can compute).
This would somewhat preserve the spirit of RFC 1445.
The details of what gets linted would have to be determined – if we do allow opaque consts in patterns, we probably don't want to lint against every use of them, so the lint would necessarily miss some warnings.
This option was not a real possibility during prior discussion in past years, since not all function pointer types implemented `PartialEq`.
However, that issue has been resolved now, so (except for what looks like accidents), all types we allow matching on currently do have `PartialEq`.
We could consider requiring `Eq` and not just `PartialEq`, but that would rule out matching on floats.
As an example for a quality-of-life aspect, if we can determine the value of the constant, we might want to take it into account for exhaustiveness checking, though of course we can only do that if `==` actually behaves like the desugared pattern, i.e., if the constant value is recursively structural-match.
In those cases we can transparently rewrite the `==` check to a pattern, knowing it does not change program behavior, and then we can do exhaustiveness checking on that pattern.
(This makes the `StructuralPartialEq` trait's promise load-bearing for soundness, and the trait should be made `unsafe`.)
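Exhaustiveness checking that sees through constants can be illustrated with a two-variant enum. This is a minimal sketch with hypothetical names; it relies on the constants' values being visible to the compiler and on the enum deriving `PartialEq`.

```rust
#[derive(Clone, Copy, PartialEq, Eq)]
enum Dir {
    Left,
    Right,
}

const LEFT: Dir = Dir::Left;
const RIGHT: Dir = Dir::Right;

fn index(d: Dir) -> u8 {
    // No wildcard arm: the compiler looks at the values of `LEFT` and `RIGHT`
    // and concludes that the two arms cover every variant.
    match d {
        LEFT => 0,
        RIGHT => 1,
    }
}
```

If the constants were treated as opaque `==` comparisons, the compiler could not know that the match is exhaustive and a catch-all arm would be required.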
We could say that only constants whose type is recursively structural-match are taken into account for exhaustiveness checking; this would entirely avoid having to run the analysis of whether the concrete value is recursively structural-match.
(However, this would reject some code that we currently accept and don't lint against.)
For this option we definitely need to reject all constants in patterns whose type does not implement `PartialEq`.
There are already forward-compatibility warnings against basically every possible such case, though one corner case was missed.
Other than that we can remove all the structural-equality forward-compatibility lints.
We might consider turning some of them into general lints about potentially surprising behavior.
This is also a massive breaking change for matching on consts in `const fn`, which is currently sometimes allowed but would never work under this option since `==` is not `const fn`.
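The `const fn` concern can be sketched as follows. `Mode`, `DEFAULT`, and `is_default` are made-up names; the sketch assumes the match is lowered to a discriminant test rather than a call to `PartialEq::eq`.

```rust
#[derive(Clone, Copy, PartialEq, Eq)]
enum Mode {
    Fast,
    Safe,
}

const DEFAULT: Mode = Mode::Safe;

const fn is_default(m: Mode) -> bool {
    // Writing `m == DEFAULT` would call the derived `PartialEq::eq`,
    // which cannot be called from a `const fn` on stable Rust.
    // The constant pattern, in contrast, needs no call at all.
    match m {
        DEFAULT => true,
        _ => false,
    }
}

// The function can be evaluated at compile time.
const CHECKED_AT_COMPILE_TIME: bool = is_default(Mode::Safe);
```

If constants in patterns were defined to mean `==`, a function like `is_default` would stop being usable in const contexts.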
Of course we don't have to decide to be on either end of this design spectrum.
We could say that some consts behave as if desugared to a pattern, while others behave like `==`.
This could be decided based on some trait, or the value of the constant, or other things.
This document by @lcnr describes a variant of this. The trait is called `StructuralEq` there, but `StructuralMatch` would probably be a more apt name so we will use that here.
The compiler checks that `StructuralMatch` is only implemented when all fields are `StructuralMatch` and not implemented for unions (this ensures a pattern can always be constructed for all values of this type), but otherwise the trait is safe and can be arbitrarily implemented by users.
Consts whose type implements `StructuralMatch` get pattern behavior and exhaustiveness checking; all other consts get `==` behavior and no exhaustiveness checking.
(Floats and raw pointers could be considered non-`StructuralMatch` to avoid having to ever consider them as primitive patterns.)
This is a slight breaking change compared to today: if an enum has some variants that are `StructuralPartialEq` and others that are not, a constant whose value is a structural variant currently can participate in exhaustiveness checking.
Here's an example.
@lcnr's proposal would treat this constant opaquely and match via `==`.
We don't have a future-compat lint for that so we don't know how much breakage this would cause.
One consequence of this design is that when a constant has type `(T, U)`, whether the `T` part is compared using `==` or by pattern desugaring depends on whether `U: StructuralMatch`.
That is a potentially concerning semantic discontinuity.
As another example, if we eventually allow matching on const generics, the same constant value might behave differently when it is used as a pattern via a const generic vs a regular const: in the first case the value is unknown at MIR building time so it uses `==` semantics; in the second case the value is known so it could be turned into a pattern (if the type is `StructuralMatch`).
Overall this variant is very similar to option 2 with exhaustiveness checks only for `StructuralMatch` types, except that we don't promise that all consts behave like `==`, but instead say that consts of `StructuralMatch` type whose value is available at MIR building time behave like the desugared pattern.
Desugaring to a pattern:
`==` semantics are equivalent with a hard error, at the cost of ruling out matching on consts of types with a custom `PartialEq` even if that `PartialEq` is equivalent to pattern semantics.
`match` behavior completely independent from potentially user-defined code such as `==`.
`union`, even if they have a sensible notion of equality.
`==` semantics. Raw pointer patterns with sized pointees do not even have a lint; they are used in the standard library and presumably widely used in the ecosystem to check against sentinel values, so I assume we want to keep allowing those as well.)
`None` but not `Some(MaybeUninit::new(...))` even when those both have the same type. However, for floats, NaNs are not the only values with "strange" equality: there is also the fact that `+0.0 == -0.0` despite those having different bit patterns, so if we disallow NaNs the question comes up what we should do about zeroes.
`==` and pattern semantics disagree. However all such code has had future-incompat lints since Rust 1.48 (November 2020).
Desugaring to `==`:
`union` with a `PartialEq` instance).
`==`, leading to possibly surprising behavior that the programmer did not expect.
`match` to have a more strict notion of equality than `==`.
`==` semantics. That can be seen as a good thing or a bad thing.
Neither of these options cares much about the `Eq` trait; only `PartialEq` is relevant.
Requiring `Eq` would anyway be inconsistent with allowing matching on floats.
Some of the main arguments:
`const` should not change behavior. In particular for fieldless enum variants, which are almost identical to consts, this would be really surprising. It also violates the "consts behave as if inlined" principle we've been repeating a lot.
`==`, we shouldn't allow matching on those consts.
Argument 1 rules out option 2.
Argument 2 means we need to restrict option 1. But how?
The current scheme is geared towards allowing matching on a const if the value is recursively structural-match, and furthermore the type must implement `PartialEq`.
That means if you have no `==`, nobody can match on your types, so that's good: we don't expose syntactic capabilities that the user didn't explicitly expose.
And it means if we allow matching then its behavior is the same as that of `==`, so we also don't expose semantic capabilities that the user didn't choose to expose.
If we want the refactoring from argument 1 to always result in compiling code (as opposed to just ensuring that if it compiles, it is a semantic NOP), we need to relax this check in a scope-based way, where if we can see all the fields of a type (we are in the same module or they are public), then we allow matching even if there is no `PartialEq` and the value is not structural match.
For all-`pub` types this would mean everyone can match any constant no matter which traits are derived or manually implemented!
But it means if you `derive(PartialEq)` that's a semver promise that your consts can be matched on, so you can't ever have a non-structural `PartialEq` in the future.
If we want to avoid that we need to decouple `derive(PartialEq)` from "allow matching on consts of this type".
This can only be changed via an edition transition.
This proposal does not let one define a `MyBool` type with an unconventional equality and have reasonable `match` behavior for that type.
But it does let one define such a type and at least be sure users are not circumventing the abstraction with `match`.
Supporting that would require much more fundamental changes to our `match` system.
Ideally we'd remain forward-compatible with such changes, but what exactly would be required to ensure this?
Compared to today, this proposal only breaks code that we are already linting against with future-compatibility warnings. Specifically this affects the `indirect_structural_match` lint (which identifies const values that are not recursively structural-match) and the `const_patterns_without_partial_eq` lint (which identifies const values of non-`PartialEq` type).
The latter is very recent (not on stable yet, riding the train for 1.74) but appears in cargo's future compatibility reports; the former is ancient but does not appear in cargo's future compatibility report.
If we want to determine whether a const value is recursively structural-match before evaluating it, and instead do an analysis based on the MIR source that computes the const value, then we'd also need to make the `nontrivial_structural_match` lint a hard error, but it's unclear what the motivation for that would be.
However this isn't a complete proposal yet, since no answer is given for: