
Design meeting: consts in patterns

We allow using (some) constants in patterns. However, we cannot allow all of them: some just don't have a way of being compared, such as unions; others get rejected for being "not structural-match", as defined in RFC 1445. The structural-match check had some holes so some constants that we do accept get linted as "this will be an error in the future". These lints were introduced a long time ago and have been warn-by-default since Rust 1.48, close to 3 years ago. Since then the pattern matching implementation in the compiler changed a lot and our ideas of what we do and don't want to do with pattern matching also changed. We also realized there are some gaps in what the RFC discusses, such as raw pointers.

It's time to figure out where we want to go with this: enforce RFC 1445 by making (some of) these lints hard errors, or change our mind and remove the lints.

The main questions to figure out are:

  • Which exact consts do we want to accept in patterns?
  • What should their semantics be, i.e., when are they considered to "match" the place-being-matched-on?

Some smaller questions that also show up are:

  • How should consts interact with exhaustiveness checking, i.e., when will the constant value be taken into account to compute which cases are already covered by match arms?
  • Should we lint against some consts that we do allow, e.g. because they have surprising semantics?

Terminology: "structural match"

Before we dive deeper, we need to define some words that will come up again and again.

We say that a type is structural-match if its PartialEq instance is (syntactically or semantically) equivalent to the one that would be generated by derive(PartialEq). This term only really applies to ADTs. The StructuralPartialEq trait reflects this property. Note that this is non-recursive: it only talks about the PartialEq instance of this type, not about its fields!

We say that a type is recursively structural-match if all ADTs that recursively appear in fields are structural-match.

We say that a value is recursively structural-match if all ADTs that appear in this value are structural-match. Due to enums, it is possible to have non-structural-match types where some values are structural-match, such as the None value of Option<MyNonStructuralMatchType>.
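
To make the type-versus-value distinction concrete, here is a minimal sketch. The type `CaseInsensitive` and the constant `NOTHING` are hypothetical names invented for this illustration; the sketch assumes the value-based check described above, under which a `None` constant is accepted even though its type is not recursively structural-match.

```rust
// Hypothetical type with a hand-written `PartialEq`, so it is NOT structural-match.
struct CaseInsensitive(&'static str);

impl PartialEq for CaseInsensitive {
    fn eq(&self, other: &Self) -> bool {
        self.0.eq_ignore_ascii_case(other.0)
    }
}

// The type `Option<CaseInsensitive>` is not recursively structural-match,
// but the value `None` contains no `CaseInsensitive`, so the value is.
const NOTHING: Option<CaseInsensitive> = None;

fn is_nothing(x: Option<CaseInsensitive>) -> bool {
    match x {
        NOTHING => true,
        _ => false,
    }
}

fn main() {
    assert!(is_nothing(None));
    assert!(!is_nothing(Some(CaseInsensitive("abc"))));
}
```

A constant such as `Some(CaseInsensitive("abc"))` of the same type would not be recursively structural-match as a value, which is the case the lints discussed in this document are about.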

Possible design options

Largely, there are two "main points" in the design space. Of course one could also consider a design that sits somewhere in between those two points.

Option 1: Consts desugar to a pattern. This design considers a constant used in pattern position to be basically syntactic sugar for a pattern that one could have also written otherwise. For instance, matching on a constant with value (0i32, None) is completely equivalent to writing the pattern (0i32, None).

Option 2: Consts desugar to ==. This design considers a constant used in pattern position to be basically equivalent to a == guard, except that possibly exhaustiveness checking can take into account the concrete value of C. For instance, matches!(x, C) aka match x { C => true, _ => false } would desugar to x == C.
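
The two readings can be sketched side by side. This is only an illustration with hypothetical function names; for a constant like this one, whose type has a derived (built-in) equality, both readings give the same answer, and the options only differ once custom PartialEq instances, floats, or pointers get involved.

```rust
const C: (i32, Option<u8>) = (0, None);

// The constant used directly as a pattern.
fn via_const(x: (i32, Option<u8>)) -> bool {
    matches!(x, C)
}

// Option 1 reading: the constant is sugar for the pattern of its value.
fn via_pattern(x: (i32, Option<u8>)) -> bool {
    matches!(x, (0i32, None))
}

// Option 2 reading: the constant is sugar for an `==` check.
fn via_eq(x: (i32, Option<u8>)) -> bool {
    x == C
}

fn main() {
    for x in [(0, None), (0, Some(1)), (1, None)] {
        assert_eq!(via_const(x), via_pattern(x));
        assert_eq!(via_const(x), via_eq(x));
    }
}
```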

Design option discussion

Option 1: Desugaring to a pattern

Reading through the historic record, this is likely the originally intended design. In particular, RFC 1445, which introduced the "structural match" restriction, seems to pre-suppose that we want constants in patterns to behave like a regular pattern desugared from the computed value of the constant.

One consequence of this design is that the value of the const must be visible: we need the exact pattern to build the MIR and to do exhaustiveness checking, so if the const cannot be evaluated (since it depends on some generic parameters), we have to reject it.

With this design we could accept constants as patterns that do not implement PartialEq, let alone Eq. (In fact we currently do accept some such constants, though the examples are contrived and this is likely an accident; it involves unnecessary bounds on the derived PartialEq instance. Non-PartialEq constants in patterns used to be more common when not all function pointer types implemented PartialEq, but that has been fixed.) Of course we might want to require PartialEq to keep our options open for the future, e.g. to be forward-compatible with option 2.

This design makes match as a language construct completely independent of any user-defined code such as ==.

Pointers in const-patterns

However, to really say that constants desugar to a pattern, we must make sure that the "leaf types" (after traversing all tuples, structs, enums, and arrays) can actually be used as patterns. And that is not actually the case: we allow matching on constants that involve raw pointers and function pointers, which are not otherwise allowed as patterns. So, to use this design option, we must pick one of:

  • Consider the Rust language of patterns to include raw pointer and function pointer patterns. These patterns do not have a direct surface syntax, they can only come about through the desugaring from constants. They behave like == on these types (which is a built-in primitive, so no user-defined == is sneaking in here).
  • Deprecate and eventually remove support for matching on consts that involve raw pointers or function pointers. (Currently, we emit a future-incompat lint for function pointers and for raw pointers to unsized types, but not for other kinds of raw pointers.)

Of course this choice can be made independently for function pointers, and raw pointers with different pointee types. RFC 1445 does not mention raw pointers or function pointers at all, and arguably they are not very "structural" in their equality. (It does mention floats and calls them out as non-structural.)

If we want to not have function pointer or dyn Trait raw pointer "leaf patterns" because their notion of equality is wonky and definitely not structural, pointer_structural_match (or a subset of it that accepts raw slice pointers) needs to be made a hard error. If we want to not have any raw pointer "leaf patterns", we need a new lint; however, such patterns are used widely, including in the standard library.

We could consider only allowing matching against raw pointer constants such as 4 as *const i32 but not &42, i.e., only constants with a fixed integer value rather than some dynamic memory location. We make few to no guarantees about pointer identity for pointer equality on consts, so this can be justified but it would require a new future-incompat lint since currently we just accept such code. (The fallout from rejecting such patterns is thus also unknown, a crater run would be required.) Integer-constant raw pointers arguably are as structural as regular integers, so allowing match on them is in line with the general philosophy of this design option.
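
A small sketch of the distinction drawn here, using hypothetical constant names. `SENTINEL` has a fixed integer value, while `ADDR` points to some memory location chosen by the compiler. The sketch only matches on the integer-valued constant, on the assumption that this is the uncontroversial case; whether `ADDR` should be usable as a pattern is exactly the open question.

```rust
// A raw pointer constant with a fixed integer value.
const SENTINEL: *const i32 = 4 as *const i32;

// A raw pointer constant that points to actual memory (the reference is
// coerced to a raw pointer). Its address is not something we guarantee much about.
const ADDR: *const i32 = &42;

fn is_sentinel(p: *const i32) -> bool {
    match p {
        SENTINEL => true,
        _ => false,
    }
}

fn main() {
    assert!(is_sentinel(4 as *const i32));
    assert!(!is_sentinel(std::ptr::null()));
    // `ADDR` is only inspected here, not used as a pattern.
    assert!(!ADDR.is_null());
}
```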

Floats in const-patterns

Floats are also an interesting question here. They have been allowed as primitive patterns (even without constants) since Rust 1.0, but RFC 1445 called them out as non-structural and we have had a future-incompat warning against float patterns for a long time now. However, when the proposal came up to turn this warning into a hard error, that PR was rejected by t-lang.

Floats have strange equality on NaNs (which are never equal to anything) and zeroes (where positive and negative zero compare equal). We could consider entirely rejecting NaNs in patterns since those arms can never be reached anyway, but accepting other float constants. That would still allow the use-cases people brought up in the tracking issue. (For floats we also allow range patterns, but NaN seems to be rejected in range patterns.) Rejecting zeroes would likely be a lot more surprising, so here we likely have to live with the fact that match will consider both zeroes to be equal (if we want to allow float matching at all).
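
The float behavior described above can be checked with a short sketch. The function name `is_zero` is hypothetical; the sketch assumes that float literal patterns are accepted (possibly with a warning on older compilers) and that they compare using == semantics.

```rust
// Matches using a float literal pattern.
fn is_zero(x: f32) -> bool {
    match x {
        0.0 => true,
        _ => false,
    }
}

fn main() {
    // NaN is never equal to anything, including itself.
    let nan = f32::NAN;
    assert!(nan != nan);

    // Positive and negative zero compare equal despite different bit patterns.
    assert!(0.0f32 == -0.0f32);
    assert_ne!(0.0f32.to_bits(), (-0.0f32).to_bits());

    // The pattern `0.0` therefore matches both zeroes, and never matches NaN.
    assert!(is_zero(0.0));
    assert!(is_zero(-0.0));
    assert!(!is_zero(nan));
}
```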

Another option for matching on NaN would be to make the f32::NAN pattern match any NaN.

Structural match restriction

Based on this design, the entire "structural match" story was born out of a desire to reject cases where the desugared pattern does not behave like == would. (That's what RFC 1445 was all about; also see the motivation there.) To achieve this we must reject consts whose value is not recursively structural-match. This was originally intended (in the RFC) to be a hard error, but arguably could also be made a lint. (This also explains why the StructuralPartialEq trait is a safe one: it isn't really load-bearing in this design.)

One notable downside of the structural match checks is that it makes switching from the derived PartialEq to a custom one (e.g. one that is more efficient or avoids unnecessary bounds) a breaking change, even if the behavior of == remains unchanged. If we want a hard guarantee that pattern semantics and == semantics agree to avoid any potential confusion, this cannot really be avoided. We could make the trait unsafe and off-load the guarantee partially to the user writing unsafe impl StructuralPartialEq. We could also make the trait safe if we treat the structural match check like a lint.

This restriction, when implemented as a hard error, ensures that option 1 is forward-compatible with option 2.

Changes from today's behavior

To desugar consts into patterns we need to either reject raw pointer values or consider them legal "leaf patterns" (which likely means we need to permit constructing valtrees with raw pointers, since pattern construction goes through valtrees). At least the pointer_structural_match lint should be made a hard error.

If we want the structural match check to be a hard error, the indirect_structural_match future-compat lint also has to be turned into a hard error. If we want to be able to analyze whether a value is recursively structural-match without computing it (using logic similar to how we determine whether a const value needs dropping), we need to make nontrivial_structural_match a hard error; but we could alternatively just say that we will compute the value of the constant and check that as basis for the hard error/lint. (We have to compute that value anyway.)

To follow this design we also have to change the behavior of some existing code such as this one, where currently one can tell that we are not actually desugaring consts to native patterns. This code already has a future-incompat lint (and has had it for a long time), so we could avoid silently changing semantics by making such code a hard error instead.

Option 2: Desugar to ==

The alternative to the above is to say that constants used as patterns behave like ==, and everything else is linting and quality-of-life improvements.

Why would we explain consts-in-patterns via == rather than desugared patterns? We already allow using floats in patterns without any constants being involved (this has worked since Rust 1.0, though we started linting against it at some point many years ago), and that uses == semantics rather than "exact bitwise equality", so one can argue about whether this is "structural". We also allow raw pointers to sized types and don't even lint against that; whether one considers == on *const u8 to be "structural" is probably a matter of opinion: arguably that type doesn't really have any "structure", and its notion of equality is a very low-level machine detail. So saying that all consts use == is not a total surprise. (OTOH, as discussed above, if we restrict this to "integer pointers" then raw pointer equality can be argued to be structural.)

One big advantage of this option is that we will be able to allow matching against generic consts, associated consts of generic types, and other consts whose value we cannot know at MIR building time.
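
As an illustration of what is at stake, the sketch below uses a hypothetical trait `HasSentinel` with an associated const. The commented-out arm is the form that, to our understanding, is rejected today because the value depends on a generic parameter; the guard is the workaround that is available now and that option 2 would effectively make implicit.

```rust
// Hypothetical trait with an associated const, for illustration only.
trait HasSentinel {
    const SENTINEL: u32;
}

struct A;
impl HasSentinel for A {
    const SENTINEL: u32 = 0;
}

struct B;
impl HasSentinel for B {
    const SENTINEL: u32 = u32::MAX;
}

fn is_sentinel<T: HasSentinel>(x: u32) -> bool {
    match x {
        // T::SENTINEL => true, // rejected: the value is unknown at MIR building time
        v if v == T::SENTINEL => true,
        _ => false,
    }
}

fn main() {
    assert!(is_sentinel::<A>(0));
    assert!(!is_sentinel::<B>(0));
    assert!(is_sentinel::<B>(u32::MAX));
}
```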

One big downside is that if people expect consts in patterns to behave as if they desugar to a pattern, then they are not getting the semantics they are expecting. Generally people might be expecting stricter equality from match than what == provides. (However, if that's the deciding argument, we should do something about matching on floats, and possibly raw pointers as well.) People also expressed the opinion that match behavior should never depend on user-defined code like custom ==.

We can of course still detect and lint against "non-structural-match" cases where if the final value were to be written as a pattern, it would behave differently (at least we can do that for consts whose value we can compute). This would somewhat preserve the spirit of RFC 1445. The details of what gets linted would have to be determined: if we do allow opaque consts in patterns, we probably don't want to lint against every use of them, so the lint would necessarily miss some warnings.

This option was not a real possibility during prior discussion in past years, since not all function pointer types implemented PartialEq. However, that issue has been resolved now, so (except for what looks like accidents), all types we allow matching on currently do have PartialEq.

We could consider requiring Eq and not just PartialEq, but that would rule out matching on floats.

Exhaustiveness checking

As an example for a quality-of-life aspect, if we can determine the value of the constant, we might want to take it into account for exhaustiveness checking, though of course we can only do that if == actually behaves like the desugared pattern, i.e., if the constant value is recursively structural-match. In those cases we can transparently rewrite the == check to a pattern, knowing it does not change program behavior, and then we can do exhaustiveness checking on that pattern. (This makes the StructuralPartialEq trait's promise load-bearing for soundness, and the trait should be made unsafe.)

We could say that only constants whose type is recursively structural-match are taken into account for exhaustiveness checking; this would entirely avoid having to run the analysis of whether the concrete value is recursively structural-match. (However, this would reject some code that we currently accept and don't lint against.)
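
A minimal sketch of constants taking part in exhaustiveness checking, with a hypothetical enum `Dir`. The match has no wildcard arm; it is accepted only because the values of both constants are taken into account and together cover every variant.

```rust
#[derive(PartialEq, Eq, Clone, Copy)]
enum Dir {
    Left,
    Right,
}

const LEFT: Dir = Dir::Left;
const RIGHT: Dir = Dir::Right;

fn code(d: Dir) -> u8 {
    // No `_` arm: exhaustiveness relies on the values of LEFT and RIGHT.
    match d {
        LEFT => 0,
        RIGHT => 1,
    }
}

fn main() {
    assert_eq!(code(Dir::Left), 0);
    assert_eq!(code(Dir::Right), 1);
}
```

If these constants were treated opaquely (pure == semantics with no knowledge of their values), this match would need a wildcard arm.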

Changes from today's behavior

For this option we definitely need to reject all constants in patterns that do not implement PartialEq. There are already forward-compatibility warnings against basically every possible such case, though one corner case was missed.

Other than that we can remove all the structural-equality forward-compatibility lints. We might consider turning some of them into general lints about potentially surprising behavior.

This is also a massive breaking change for matching on consts in const fn, which is currently sometimes allowed but would never work under this option since == is not const fn.
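
A sketch of the kind of code that would be affected, using a hypothetical enum `Mode`. It assumes the current behavior in which the constant is lowered to a native pattern, so no call to PartialEq::eq is needed inside the const fn.

```rust
#[derive(PartialEq, Eq, Clone, Copy)]
enum Mode {
    Off,
    On,
}

const DEFAULT: Mode = Mode::Off;

// Accepted today because the const is turned into a native pattern.
// Writing `m == DEFAULT` here instead would call the non-const
// `PartialEq::eq`, which is not allowed in a const fn.
const fn is_default(m: Mode) -> bool {
    match m {
        DEFAULT => true,
        _ => false,
    }
}

// Evaluated at compile time.
const CHECK: bool = is_default(Mode::Off);

fn main() {
    assert!(CHECK);
    assert!(!is_default(Mode::On));
}
```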

Further options

Of course we don't have to decide to be on either end of this design spectrum. We could say that some consts behave like desugared to a pattern, while others behave like ==. This could be decided based on some trait, or the value of the constant, or other things.

This document by @lcnr describes a variant of this. The trait is called StructuralEq there, but StructuralMatch would probably be a more apt name so we will use that here. The compiler checks that StructuralMatch is only implemented when all fields are StructuralMatch and not implemented for unions (this ensures a pattern can always be constructed for all values of this type), but otherwise the trait is safe and can be arbitrarily implemented by users. Consts that implement StructuralMatch get pattern behavior and exhaustiveness checking, all other consts get == behavior and no exhaustiveness checking. (Floats and raw pointers could be considered non-StructuralMatch to avoid having to ever consider them as primitive patterns.)

This is a slight breaking change compared to today: if an enum has some variants that are StructuralPartialEq and others that are not, a constant whose value is a structural variant currently can participate in exhaustiveness checking. Here's an example. @lcnr's proposal would treat this constant opaquely and match via ==. We don't have a future-compat lint for that so we don't know how much breakage this would cause.

One consequence of this design is that when a constant has type (T, U), whether or not the T part is compared using == or by pattern desugaring depends on whether U: StructuralMatch. That is a potentially concerning semantic discontinuity. As another example, if we eventually allow matching on const generics, the same constant value might behave differently when it is used as a pattern via a const generic vs a regular const: in the first case the value is unknown at MIR building time so it uses == semantics; in the 2nd case the value is known so it could be turned into a pattern (if the type is StructuralMatch).

Overall this variant is very similar to option 2 with exhaustiveness checks only for StructuralMatch types, except that we don't promise that all consts behave like ==, but instead say that consts of StructuralMatch type whose value is available at MIR building time behave like the desugared pattern.

Summary and comparison

Desugaring to a pattern:

  • Matches the historic intent
  • Can (optionally) ensure that pattern semantics and == semantics are equivalent with a hard error, at the cost of ruling out matching on consts of types with a custom PartialEq even if that PartialEq is equivalent to pattern semantics.
  • Makes match behavior completely independent from potentially user-defined code such as ==.
  • Is not quite a desugaring, since we need to add support for raw pointer "leaf" patterns despite their notion of equality being a lot more subtle and low-level than anything one can write as a native pattern.
  • Cannot support opaque constants or constants that contain a union, even if they have a sensible notion of equality.
  • Holds custom ADTs to a higher standard than float types and raw pointers, which can be matched on despite not being really fully "structural". (Floats are currently being linted against, but t-lang previously rejected turning that lint into a hard error, so I assume we want to keep allowing float patterns with == semantics. Raw pointer patterns with sized pointees do not even have a lint; they are used in the standard library and presumably widely used in the ecosystem to check against sentinel values, so I assume we want to keep allowing those as well.)
    • There are possible sub-options here that could avoid some of these issues, such as allowing matching only on particular values when raw pointers and floats are involved. For raw pointers we could rule out constant pointers that are not integer constants; we guarantee little about their identity anyway. For floats we could rule out NaNs; those match arms are unreachable anyway. It's unclear how surprised people would be about such value-based restrictions, but there is precedent: after all, we already allow matching on None but not Some(MaybeUninit::new(...)) even when those both have the same type. However, for floats, NaNs are not the only values with "strange" equality: there is also the fact that +0.0 == -0.0 despite those having different bit patterns, so if we disallow NaNs the question comes up what we should do about zeroes.
  • Changes behavior of some code we used to accept without warnings where == and pattern semantics disagree. However all such code has future-incompat lints since Rust 1.48 (November 2020).

Desugaring to ==:

  • Can support opaque consts and other consts one could not write as a pattern (e.g., a type that involves a custom tagged union with a PartialEq instance).
  • Can only best-effort lint against possible cases of semantic mismatch between the pattern and ==, leading to possibly surprising behavior that the programmer did not expect.
  • Defies expectations of programmers that expect match to have a more strict notion of equality than ==.
  • "Opens the floodgates"; suddenly one will be able to match on tons of things, based on their == semantics. That can be seen as a good thing or a bad thing.
  • Is not quite a desugaring, since we want to take into account some constant values for exhaustiveness checking to avoid unnecessary "non-exhaustive match" errors.

Neither of these options cares much about the Eq trait; only PartialEq is relevant. Requiring Eq would anyway be inconsistent with allowing matching on floats.

Post-meeting notes

[This section was added after the meeting.]

Some of the main arguments:

  1. For option 1: refactoring a binderless pattern into a const should not change behavior. In particular for fieldless enum variants, which are almost identical to consts, this would be really surprising. It also violates the "consts behave as if inlined" principle we've been repeating a lot.
  2. Against unrestricted option 1, for option 2: we shouldn't expose operations on a type that a user didn't decide to expose. If they gave no ==, we shouldn't allow matching on those consts.
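
The refactoring from argument 1 can be sketched as follows, with a hypothetical enum `Expr`. The two functions differ only in whether the arm names the variant directly or goes through a constant; the argument is that they should never behave differently.

```rust
#[derive(PartialEq, Clone, Copy)]
enum Expr {
    Null,
    Lit(i32),
}

const NULL: Expr = Expr::Null;

// Binderless pattern: names the fieldless variant directly.
fn by_variant(e: Expr) -> bool {
    matches!(e, Expr::Null)
}

// Same check after refactoring the pattern into a const.
fn by_const(e: Expr) -> bool {
    matches!(e, NULL)
}

fn main() {
    for e in [Expr::Null, Expr::Lit(0), Expr::Lit(7)] {
        assert_eq!(by_variant(e), by_const(e));
    }
}
```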

Argument 1 rules out option 2. Argument 2 means we need to restrict option 1. But how? The current scheme is geared towards allowing matching on a const if the value is recursively structural-match, and furthermore the type must implement PartialEq. That means if you have no == nobody can match on your types, so that's good: we don't expose syntactic capabilities that the user didn't explicitly expose. And it means if we allow matching then its behavior is the same as that of ==, so we also don't expose semantic capabilities that the user didn't choose to expose.

If we want the refactoring from argument 1 to always result in compiling code (as opposed to just ensuring that if it compiles, it is a semantic NOP), we need to relax this check in a scope-based way, where if we can see all the fields of a type (we are in the same module or they are public), then we allow matching even if there is no PartialEq and the value is not structural match. For all-pub types this would mean everyone can match any constant no matter which traits are derived or manually implemented!

But it means if you derive(PartialEq) that's a semver promise that your consts can be matched on, so you can't ever have a non-structural PartialEq in the future. If we want to avoid that we need to decouple derive(PartialEq) from "allow matching on consts of this type". This can only be changed via an edition transition.

This proposal does not let one define a MyBool type with an unconventional equality and have reasonable match behavior for that type. But it does let one define such a type and at least be sure users are not circumventing the abstraction with match. Supporting that would require much more fundamental changes to our match system. Ideally we'd remain forward-compatible with such changes, but what exactly would be required to ensure this?

Compared to today, this proposal only breaks code that we are already linting against with future-compatibility warnings. Specifically this affects the indirect_structural_match lint (which identifies const values that are not recursively structural-match) and the const_patterns_without_partial_eq lint (which identifies const values of non-PartialEq type). The latter is very recent (not on stable yet, riding the train for 1.74) but appears in cargo's future compatibility reports; the former is ancient but does not appear in cargo's future compatibility report. If we want to determine whether a const value is recursively structural-match before evaluating it, and instead do an analysis based on the MIR source that computes the const value, then we'd also need to make the nontrivial_structural_match lint a hard error, but it's unclear what the motivation for that would be.

However this isn't a complete proposal yet, since no answer is given for:

  • floats
  • raw pointers (thin/wide, integer or "actually pointer")
  • function pointers
  • potentially letting non-derive(PartialEq) types opt-in to allowing matching (with structural semantics)

Design meeting minutes

Attendance: TC, nikomatsakis, RalfJ, oli, pnkfelix, eholk, scottmcm, waffle

Minutes, driver: TC

Risks associated with "function pointer pattern" support

pnkfelix: TODO (after I finish reading doc); basic point I wanted to cover is the potential for compiler optimizations to merge or duplicate function definitions, and one needs to clarify if that can be observed in some fashion if one adds support for function pointer patterns.

pnkfelix: E.g. if we do add formal support for function pointer patterns, would that imply that we would then 1. inhibit the ability of the compiler to merge/duplicate functions that participate in such patterns, and/or 2. allow code to observe that such merging/duplication has taken place, and/or 3. make function pointer patterns have to lower into code that enumerates all the duplicate entries (which would handle duplication, but pnkfelix doesn't think there's a sensible way for it to handle merging)

RJ: We already allow code to observe that with ==. It's only a question of whether we also want the same to be observable via match.

Historical note

nikomatsakis: As the author of RFC 1445, allow me to clarify its intent

"This would somewhat preserve the spirit of RFC 1445."

its intent was not to endorse any design, but rather to leave us room to select a design later. Originally I believed strongly that match x { A => .., _ => .. } and if x == A { .. } else { .. } should be equivalent. But the compiler implemented matching on constants via desugaring into patterns, which had various surprising (to me) implications, e.g., the PartialEq impl could diverge, you could view private fields (since fixed, I think?), etc.

However, as pointed out in this document, there were some concerns, like executing user-given code in pattern match handling (and especially if that introduces some potential for unsoundness around exhaustiveness checking). [RJ: we currently have no soundness concerns here, I'm confident we can handle opaque ==-matched constants in a sound way]

Therefore since 1.0 was coming up we compromised with "let's just leave room to hash this out later". Fast forward uh 8 years (holy potatoes) and we still haven't.

At least that's how I remember it.

pnkfelix: We have to get to the heart of whether match and == are the same.

Clarification

nikomatsakis: I do not understand this sentence:

"We could consider only allowing matching against raw pointer constants such as 4 as *const i32 but not &42, i.e., only constants with a fixed integer value rather than some dynamic memory location."

Help me Ralf, you're my only hope! How is &42 a raw pointer constant?

Ralf: I just omitted the coercion, sorry for the confusion. const X: *const i32 = &42; compiles.

nikomatsakis:

ok

NaN handling

pnkfelix: doc says "Another option for matching on NaN would be to make the f32::NAN pattern match any NaN."; a further alternative option would be to add a new kind of pattern, "IsNaN", that would only be usable in pattern contexts and that would match any NaN. One could imagine similarly having IsZero, IsPosZero, IsNegZero, though maybe appetite for a zero-specific change may not be as large as that for NaN. None of these would be expressible as consts, and I think that is okay.

scottmcm: And is f32::NAN special in some way for the quoted option, or would any NAN value match any NAN in this world?

RJ: Yes there's a whole bunch of options in this particular corner of the design space. I didn't mean to get bogged down in such details since we need to figure out the "big picture" first. Also we need to say what matching on the literals 0.0, +0.0, -0.0 means. (I presume IsPosZero etc are magic consts, not just defined as that literal. Really they couldn't be consts in the current rustc implementation I think, they'd have to be a new thing: a named pattern.)

TC: In favor of something like this, I'll mention that NaN feels like an enum variant. In SQL, NULL is never equal to itself either, but if we were modeling that in Rust, Null would be an enum variant and one could match on it.

pnkfelix: note I edited/augmented my question after the fact, which might make this thread confusing. (An argument for not doing these things async.) My intent was that you wouldn't be able to write e.g. IsPosZero in a const, only in a pattern.

const values and semver in exhaustiveness

(low pri; probably skip in discussion)

scottmcm: I guess the ship has already sailed on consts values changing being breaking changes, so it's fine to consider them for exhaustiveness. But part of me feels like it's odd to include them in these checks.

RJ: With array sizes and const generics, there's other ways besides match where changing a const value is semver-breaking.

scottmcm: yeah, that's what I meant by "the ship has sailed".

current behavior?

nikomatsakis: It's not very clear to me what our current behavior is. Long ago, we used to desugar to nested patterns. Then when we wrote the MIR code I secretly lobbied for my agenda by making patterns desugar to == (ha ha!), but in a way that I at least thought wouldn't be important, because it was only for types where the behavior should be the same. But then there are examples like this that I guess say that we use == still?

RJ: Current behavior is hard to describe.^^ Very roughly speaking, if I remember correctly, we

  • First try to build a valtree of the const. This will succeed iff the const value recursively consists only of integers, floats, char, bool, references, tuples, structs, enums (but no raw or fn pointers nor unions).
  • If that succeeds, we try to turn the valtree into a primitive pattern
  • If either of these steps fail, then we fall back to ==
  • Valtree-to-pattern conversion also applies all sorts of checks that can lead to the constant being rejected, to a lint, or to a fallback to ==
  • And there's some checks that reject unions as well rather than falling back to ==, I forgot where exactly they come in

My plan for the meeting was to mostly ignore the current implementation since it's a historic accident, not a design.

NM: There are consts that if desugared would be rejected for reasons of field privacy.

RJ: That's a good point.

Expectations vs Expressiveness

pnkfelix: People might expect == to behave like match, but there's a legitimate concern here that there are some things that may simply be inexpressible if you try to enforce that the two are always the same. My gut feeling is that we should weigh expressiveness concerns first, but I can understand wanting to put a heavy weight on meeting people's intuitions.

nikomatsakis: Not a response, but to tack on a thought, the other side of this is the ability to create complete abstractions. If I define a notion of PartialEq for my type, is it weird that matching on a constant can "peek in" and see a more specific notion? I'd like ideally to be privacy-respecting, but I'm not sure if we've worked out the full implications there (i.e., if we went desugaring, and respecting privacy, maybe that would be a lot of breakage?). I don't think we've tested that, but maybe I am wrong.

NM: There may not be an answer to what users expect. They expect different things.

RJ: Some people don't expect user code to be run during match. I find the argument convincing.

NM:

Axes for comparison. If we desugar to patterns then

  • we require the specific value BUT we get
    • no user-given code (if you care about that)
    • exhaustiveness checking
    • matching where the specific value is ok, but the type may not be
      • we support this on construction, it's nice to have it go the other way
    • consistency between enum variants and constants

NM: We could add a new syntax for equality matching in patterns that's less annoying than _ if ....

scottmcm: (I also have a thing below about == syntax.)

wffl: (I like the idea that match is independent from user code)

eholk: +1; it feels to me like match should be a tool you can use to implement PartialEq, rather than something that depends on PartialEq.

enum Expr { Null, .. }

match expr {
    Expr::Null => // <-- this is a VARIANT match, but it acts a LOT like a pattern
}

nikomatsakis: I am concerned about mismatch between match and == because it means I can't build a real abstraction. Consider a struct doing floating point. There are many NaN bit patterns, but you can't observe a difference between them.

enum Expr { Null, }

const C: Expr = Expr::Null;
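
The consistency between enum variants and constants can be sketched as follows. This is a minimal illustration, not something shown in the meeting: the `Num` variant and the two `describe_*` functions are hypothetical names added so the example has more than one arm. With a derived `PartialEq`/`Eq`, replacing the variant pattern by the constant keeps the match exhaustive without a `_` arm.

```rust
// Hypothetical enum; `Num` is added only so the match has a second arm.
#[derive(PartialEq, Eq)]
enum Expr {
    Null,
    Num(i64),
}

// A constant whose value is a single, binderless enum variant.
const C: Expr = Expr::Null;

// Matching on the variant directly.
fn describe_variant(e: &Expr) -> &'static str {
    match *e {
        Expr::Null => "null",
        Expr::Num(_) => "num",
    }
}

// Matching on the constant instead. No `_` arm is needed: the value of `C`
// is taken into account when checking which cases are already covered.
fn describe_const(e: &Expr) -> &'static str {
    match *e {
        C => "null",
        Expr::Num(_) => "num",
    }
}
```

Both functions return the same result for every input, which is the "extract a binderless pattern into a constant" refactoring discussed in these notes.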

scottmcm: Maybe people really want pattern aliases.

RalfJ: That would be really powerful.

What kinds of code are accepted / not-accepted

nikomatsakis: the desugaring to a pattern case also does not support generic or associated constants, right? But do we permit this in some cases today?

Musing on value vs type

scottmcm: I liked the distinction on value vs type structural match; that was helpful to me. The exhaustiveness checks being const value-aware is making me like value-based restrictions here, but maybe the associated const problem forces us into type-based?

More syntax support for if?

scottmcm: today if isn't part of patterns. I wonder if allowing x if x == FOO in more places could help? Like today Some(x if x == FOO) isn't allowed, making things like (Foo(x) | Bar(y)) if x == FOO || y == BAR not possible, but maybe it could be fine to have Foo(x if x == FOO) | Bar(y if y == BAR) and thus make people say == if they want PartialEq semantics?
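
For reference, here is a sketch of how this has to be written today, with one guarded arm per alternative. The `Tag` type, its case-insensitive `PartialEq`, and the `Item` enum are assumptions made up for the illustration; only the names `FOO` and `BAR` come from the remark above.

```rust
// Hypothetical type with a hand-written (non-structural) PartialEq.
struct Tag(char);

impl PartialEq for Tag {
    fn eq(&self, other: &Tag) -> bool {
        // Case-insensitive comparison: Tag('f') == Tag('F').
        self.0.eq_ignore_ascii_case(&other.0)
    }
}

const FOO: Tag = Tag('f');
const BAR: Tag = Tag('b');

// Hypothetical enum standing in for the `Foo(..)` / `Bar(..)` patterns.
enum Item {
    Foo(Tag),
    Bar(Tag),
    Other,
}

fn hits_constant(item: &Item) -> bool {
    match item {
        // Guards apply to a whole arm, so each alternative gets its own arm.
        // The `==` makes it explicit that PartialEq semantics are wanted.
        Item::Foo(x) if *x == FOO => true,
        Item::Bar(y) if *y == BAR => true,
        _ => false,
    }
}
```

Because the guard calls `PartialEq::eq`, `Item::Foo(Tag('F'))` is accepted even though its field differs from the one stored in `FOO`.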

Could we potentially still allow matching on constants from generics with the first approach?

wffl: e.g. we could treat T::CONST as an opaque const (i.e. so that it wouldn't affect exhaustiveness) and allow them if typeof(T::CONST): SomeTraitRecursiveOkForMatch.
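
To make the situation concrete, a small sketch of what generic code has to do today. `HasConst`, `Seven`, and `is_const` are hypothetical names for this illustration. An associated constant that depends on a type parameter cannot be written as a pattern, so the comparison goes through a guard, which is "opaque" to exhaustiveness checking in the same way the proposal describes.

```rust
// Hypothetical trait with an associated constant.
trait HasConst {
    const VALUE: u32;
}

struct Seven;

impl HasConst for Seven {
    const VALUE: u32 = 7;
}

fn is_const<T: HasConst>(x: u32) -> bool {
    match x {
        // Writing `T::VALUE => true` here is rejected by the compiler,
        // because the constant depends on the generic parameter `T`.
        // The guard compares with `==` and does not affect exhaustiveness,
        // so the `_` arm below is still required.
        v if v == T::VALUE => true,
        _ => false,
    }
}
```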

NM: Today we build MIR from patterns. We do that generically. RalfJ, what you're saying

NM: We could delay this in the process. In principle we could embed some sort of "compare against constant" MIR instruction and then, at monomorphization time, expand it.

RJ: Yes, but that's antithetical to how MIR works.

NM: but we can't integrate it into exhaustiveness etc.

Do we have to allow an answer for all possible consts?

pnkfelix: Not sure if it's covered in the doc, but: is one option here to say that some consts simply cannot be used in patterns at all? (E.g. a const with a NaN in it is disallowed?)

Why should we pick option 1?

NM: Surprisingly, I'm leaning toward option 1, despite having previously advocated option 2. Does anyone want to argue why we shouldn't do that?

Niko's position in favor of compiling to patterns:

  • Consistency between enum variant and constant
    • it means you can extract a binderless pattern into a constant and it continues to work
    • Const semantics for values are generally "copy and paste" (in terms of e.g. temporary lifetimes), but it's not true here
  • Exhaustiveness and more precise analysis

Another argument Jubilee made that Niko finds less persuasive:

  • No user-given code should be invoked during pattern matching

Felix makes an expressiveness argument:

  • Matching against constant values that don't implement PartialEq: this is something you would not be able to do otherwise

Niko is concerned about "leaking" around privacy:

  • If I make a constant, I allow people to observe whether values of my private fields are equal in ways that wouldn't otherwise be possible.
mod private {
    pub struct MyBool {
        x: u32,
        y: u32,
    }

    impl PartialEq for MyBool {
        /* ... checks if x == y to decide true value of this ... */
        fn eq(&self, other: &Self) -> bool {
            (self.x == self.y) == (other.x == other.y)
        }
    }

    // Can't do this:
    pub const FALSE: MyBool = MyBool { x: 0, y: 1 }; // should be equivalent to ALL OTHER MyBool values where x and y are not equal

    pub fn mk_bool() -> MyBool { MyBool { x: 0, y: 2 } }

    fn nasty(...) {
        // How should this behave?
        match ... {
            MyBool { x: 0, y: 1 } => ...
        }
        // Or this?
        match ... {
            MyBool { x, y } => ...
        }
    }
}

mk_bool() == FALSE // true
matches!(mk_bool(), FALSE) // false (today: compiler error)

Today, the above gives

error: to use a constant of type `MyBool` in a pattern, `MyBool` must be annotated with `#[derive(PartialEq, Eq)]`
  --> src/lib.rs:19:9
   |
19 |         private::FALSE => (),
   |         ^^^^^^^^^^^^^^
   |
   = note: the traits must be derived, manual `impl`s are not sufficient
   = note: see https://doc.rust-lang.org/stable/std/marker/trait.StructuralEq.html for details
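
The gap between the two notions of equality can still be observed today from inside the module, where the fields are visible. The following is a sketch under that assumption; `structurally_false` is a hypothetical helper that spells out the field-by-field pattern that `FALSE` would desugar to.

```rust
mod private {
    pub struct MyBool {
        x: u32,
        y: u32,
    }

    impl PartialEq for MyBool {
        // The "truth value" is whether x == y; two values are equal
        // when their truth values agree.
        fn eq(&self, other: &Self) -> bool {
            (self.x == self.y) == (other.x == other.y)
        }
    }

    pub const FALSE: MyBool = MyBool { x: 0, y: 1 };

    pub fn mk_bool() -> MyBool {
        MyBool { x: 0, y: 2 }
    }

    // Hypothetical helper: the structural pattern for the value of `FALSE`,
    // written out by hand. It compares fields and never calls `eq`.
    pub fn structurally_false(b: &MyBool) -> bool {
        matches!(b, MyBool { x: 0, y: 1 })
    }
}
```

Here `private::mk_bool() == private::FALSE` holds, while `private::structurally_false(&private::mk_bool())` does not. Allowing `private::FALSE` as a pattern with desugaring semantics would expose the second behavior to code outside the module.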

Niko was trying to make this point but it may not make sense, and TC's "copy and paste" point is better:

  • Consistency between creating constants from a specific value and then matching on them

NM: E.g., you need some way to implement your PartialEq.

const C: Option<Box<u32>> = None;
match x {
    C => { }
    _ => { }
}

concern

NM: Here's my discomfort with the current behavior. If I derive PartialEq for a type and then publish a constant of that type, I'm promising for all time that the type will remain structurally equal.

playground

When I derive PartialEq and Eq and have constants, I have actually committed to structural equality, or else it's a breaking change. With Option 2, that's not a problem.

wffl: matching in const context with == semantics is sketchy because it requires const ==, which we don't require today

Exhaustiveness

Additional example that shows we are committed to a certain leakage today

mod private {
    #[derive(PartialEq, Eq)]
    pub struct MyBool {
        a: bool,
    }

    pub const FALSE: MyBool = MyBool { a: false };
    pub const TRUE: MyBool = MyBool { a: true };

    pub fn mk_bool() -> MyBool { MyBool { a: true } }
}

use private::MyBool;

pub fn main () {
    let x = private::mk_bool();

    match x {
        // we leak the fact that `a: false` and `a: true` exhaustively covers the space
        private::FALSE => { }
        private::TRUE => { }
    }
}

playground

Why should we pick option 2?

RJ (devil's advocate): If the user writes a PartialEq, we should respect that everywhere (e.g. the SQL NULL example).

scottmcm: one could think of foo IS NULL in SQL as being a match (like matches!(foo, NULL), or a hypothetical foo is Null syntax), so it's correct that there's a way to "match" it even though == works differently. (And the way around this being visible would be to add private enum variants if this needs to be hidden.)
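
A minimal sketch of that reading of the SQL NULL example, assuming a hypothetical `SqlInt` type: `==` follows SQL-style semantics where NULL is never equal to anything, while a variant match plays the role of `IS NULL`.

```rust
// Hypothetical nullable integer with SQL-like equality.
#[derive(Debug, Clone, Copy)]
enum SqlInt {
    Null,
    Value(i64),
}

impl PartialEq for SqlInt {
    fn eq(&self, other: &Self) -> bool {
        match (self, other) {
            // Only two non-NULL values can compare equal.
            (SqlInt::Value(a), SqlInt::Value(b)) => a == b,
            // Anything involving NULL is "not equal", including NULL vs NULL.
            _ => false,
        }
    }
}

// The analogue of `foo IS NULL`: a structural match that ignores PartialEq.
fn is_null(v: SqlInt) -> bool {
    matches!(v, SqlInt::Null)
}
```

Note that the `PartialEq` impl is itself written with `match`, so here matching is the lower-level tool and `==` is built on top of it.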

tmandry's taxonomy of concerns

  • Is pattern matching lower level than ==
  • Exhaustiveness checking
  • Refactoring pattern to constant
  • Expectation around weird equality like NAN, MyBool above
  • Bypassing privacy (exposing structural eq on private fields when matching on a const)