
Deref Patterns Design Proposal(s)

Goal: Be able to match through a Deref or DerefMut smart pointer ergonomically.

let x: Option<Rc<bool>> = ...;
match x {
    Some(deref true) => ...,
    Some(deref false) => ...,
    None => ...,
}
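For comparison, here is a minimal sketch of what the same match looks like on stable Rust today, without any deref pattern. The function name `describe` and the returned strings are illustrative only; the point is the nested match needed to see through the `Rc`.

```rust
use std::rc::Rc;

/// Today's workaround: bind the `Rc` first, then match again on its pointee.
fn describe(x: &Option<Rc<bool>>) -> &'static str {
    match x {
        // `rc: &Rc<bool>`; `**rc` is the `bool` behind the reference and the `Rc`.
        Some(rc) => match **rc {
            true => "true",
            false => "false",
        },
        None => "none",
    }
}
```

The inner match is still checked for exhaustiveness, but the pattern no longer mirrors the shape of the data in a single place.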

Prior discussions include:

From these and my own considerations, I have come up with two related proposals.

We have a working implementation (without exhaustiveness checking) of one of the proposals available under the deref_patterns experimental feature gate.

Design principles (WIP)

  • Patterns do what I mean: writing the right pattern doesn't require sigil-golf to appease the type-checker;
  • Patterns can be precise: users should be able to spell out exactly which places they want to access with a pattern;
    • In an unsafe block I have confidence I'm not triggering anything unexpected
  • Smart pointers get out of the way: the ergonomic distance between using a &T and a smart pointer should be made as small as possible;
    • I don't need to make match-in-match just because I boxed a field, or turned an array into a vec, or whatever.
    • I still get exhaustiveness checking even though the integer/enum/array is inside a box
  • Patterns accommodate semantically-irrelevant code changes: switching from Rc to Arc should require minimal code changes;
  • Users can be trusted with powerful features;
  • Safety rails make power safer to wield.
  • Patterns are readable
  • Patterns are predictable
  • Deconstruction mirrors use
  • Mutability is explicit (?)

Q: should & vs &mut indicate the mutability of the type, or of the borrow?
A: today it is mostly the type; match ergonomics 2024 makes it sometimes the borrow. E.g. let &mut (ref x) = &mut place; shares place.

Proposal(s)

The two proposals differ in syntax but share how they work: a &<pat>/*<pat>/deref <pat>/etc pattern would be allowed for stdlib smart pointers like Box or Rc, where <pat> would match on the pointed-to value.

let x: Option<Rc<bool>> = ...;
match x {
    Some(&true) => ..., // syntax choice 1
    Some(deref true) => ..., // syntax choice 2
    Some(*true) => ..., // syntax choice 2 bis
    Some(Rc(true)) => ..., // syntax choice 2 ter
    Some(true) => ..., // possible extension (implicit deref patterns)
    _ => ...,
}

Everything else about patterns works the same: we can nest them, we can get immutable or mutable access, they are subject to exhaustiveness checking and the dead arm lint.

Before we discuss the two syntactic options, let's start with some common design details.

Enabled for all std smart pointers

The feature would be enabled for the following std types: Box, Rc, Arc, Vec, String, Cow, Pin, ManuallyDrop, Ref, RefMut, LazyCell, LazyLock. This is sound because all those impls are sufficiently idempotent.

Extending the feature to user-defined Deref impls is outside the scope of this proposal.

Exhaustiveness

Patterns are treated exhaustively as expected:

// This works
match Box::new(true) {
    deref true => ...,
    deref false => ...,
}

This is sound because we only enable the feature for a trusted set of types whose Deref impls behave well enough.

Mixing deref and normal patterns

Some Deref std types like Cow can be matched normally. For simplicity we forbid mixing deref and normal patterns for now.

match Cow::Owned(false) {
    Cow::Owned(_) => ...,
    deref true => ..., // ERROR: don't mix deref patterns and normal patterns
    _ => ...,
}

Explicit vs implicit syntax

The precedent with match ergonomics and the general way rust tends to work suggests that implicit deref patterns (if we want them) should desugar into an explicit form.

Moreover, we need explicit syntax to disambiguate cases like:

let cowcow: Cow<Cow<bool>> = ...;
match cowcow {
    // How do I match the inner cow?
    Cow::Borrowed(_) => ...,
    _ => ...,
}

As such, both proposals focus on the explicit syntax. Implicit patterns are an optional extension.
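To make the ambiguity concrete, here is a sketch on stable Rust of the two questions one might ask about a nested `Cow`. The helper names `outer_is_borrowed` and `inner_is_borrowed` are made up for illustration; today the only way to pick the inner layer is to dereference the scrutinee by hand.

```rust
use std::borrow::Cow;

/// Matches the outer `Cow` itself (a "normal" pattern).
fn outer_is_borrowed(c: &Cow<'_, Cow<'_, bool>>) -> bool {
    matches!(c, Cow::Borrowed(_))
}

/// Matches the inner `Cow` by going through the outer one's `Deref` impl.
/// `**c` is the place of type `Cow<bool>`; the `_` pattern moves nothing.
fn inner_is_borrowed(c: &Cow<'_, Cow<'_, bool>>) -> bool {
    matches!(**c, Cow::Borrowed(_))
}
```

An explicit deref pattern would let the second question be written inside the pattern rather than on the scrutinee.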

Types

We follow how the rest of rust works for matches and Deref: we work on places.

This means that &<pat>/deref <pat> operates on a place x: P where P: Deref<Target=T>, and matches <pat> against the place *x of type T.

It does not operate on a place of type &P/&mut P (except insofar as match ergonomics make it seem like it does). It also does not return a place of type &T/&mut T (again modulo match ergonomics). As we will see in Unresolved Questions, this poses an issue for string literals. Yet this is the consistent choice wrt the rest of rust.
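As a rough illustration of the place-based view, the following stable-Rust sketch matches on the place `*x` directly, which is what a deref pattern on `x` would do under this proposal. The function name `name_len` is hypothetical.

```rust
use std::rc::Rc;

/// `x: P` with `P: Deref<Target = Option<String>>`; we match on the place `*x`.
fn name_len(x: Rc<Option<String>>) -> usize {
    match *x {
        // `ref s` borrows from inside the `Rc`: `s: &String`, nothing is moved.
        Some(ref s) => s.len(),
        None => 0,
    }
}
```

Note that the scrutinee has type `Option<String>`, not `&Option<String>`, which is why the binding needs `ref` here.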

The two syntax options

Option 1: &<pat>

match Box::new(0) {
    &0 => ..., // works
    &x => ..., // `x: u32`
    _ => ...,
}

No match ergonomics in the explicit form

Because of how & works today, we don't really have much choice:

match &Box::new(Some(0)) {
    &x => ..., // ERROR: can't move `Box` out
    &Some(_) => ..., // ERROR: type mismatch
    &&Some(0) => ..., // ok
    &&Some(x) => ..., // `x: u32`
    _ => ...,
}
match &mut Box::new(Some(0)) {
    &mut &mut Some(ref mut x) => ..., // `x: &mut u32`
//       ^^^^ this derefs the `Box`
//  ^^^^ this digs into a `&mut T` as normal
    _ => ...,
}
// Just like for `&T`, `&<pat>` eats the inherited reference if there is one.
match &Some(Box::new(0)) {
    Some(&x) => ..., // `x: u32`
    Some(&0) => ..., // ok
    _ => ...,
}
// See also the discussions below about match ergonomics 2024.

As you can see, this proposal isn't very convenient without implicit deref patterns.

Mutability

As could be expected, &<pat> calls deref and &mut <pat> calls deref_mut.

let mut x = Box::new(Some(0));
match x {
    &Some(ref x) => ..., // `x: &u32`
    _ => ...,
}
match x {
    &mut Some(ref mut x) => ..., // `x: &mut u32`
    _ => ...,
}
match x {
    &Some(ref mut x) => ..., // ERROR
    _ => ...,
}

Interestingly, because &mut T: Deref, this allows matching on &mut T with a &<pat> pattern:

let mut x = 0;
match Some(&mut x) {
    Some(&x) => ..., // `x: u32`
    None => ...,
}
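The correspondence between the pattern sigil and the trait method can be sketched on stable Rust by calling the methods explicitly. This is only an approximation of what the patterns would do (for `Box` the compiler uses a built-in deref rather than the trait call), and the function names `read` and `bump` are made up for the example.

```rust
use std::ops::{Deref, DerefMut};

/// Roughly what `&Some(ref v)` would do: shared access through `deref`.
fn read(x: &Box<Option<u32>>) -> u32 {
    match Deref::deref(x) {
        Some(v) => *v, // `v: &u32`
        None => 0,
    }
}

/// Roughly what `&mut Some(ref mut v)` would do: unique access through `deref_mut`.
fn bump(x: &mut Box<Option<u32>>) {
    match DerefMut::deref_mut(x) {
        Some(v) => *v += 1, // `v: &mut u32`
        None => {}
    }
}
```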

Future-compatibility with moving out/DerefMove

Because we distinguish & and &mut the way we do, to move out we'd probably want some other syntax. Maybe move <pat>, maybe *<pat>, idk.

Interaction with match ergonomics 2024

The match ergonomics 2024 proposal includes some possible changes to &<pat> patterns. Depending on the exact choices made, this could conflict with deref patterns.

For example:

if let Some(&x) = &Some(Box::new(0)) {
    // ...
}
  • Under current Rust, this is a type error.
  • Under deref patterns only, x: u32.
  • Under "&<pat> everywhere" only, x: Box<u32> (which gives a move error).
  • Under both if we take option eat-both-layers, what makes most sense is x: u32.
  • Under both if we take option eat-one-layer, what makes most sense is x: Box<u32>.

Hence combining the features could accidentally create a breaking change.

The safest choice is to disallow &<pat> deref patterns in the presence of match-ergonomics-inherited references, and to disallow "&<pat> everywhere" on Deref types (should be fine because Deref types are rarely Copy). That gives us freedom to land either feature in any order. We can relax restrictions once both are stable.

For custom Deref, this may mean that implementing Deref on a Copy type can break downstream crates:

// crate A
#[derive(Copy, Clone)]
struct Container<T>(T);
impl<T> Container<T> {
    fn contents(&self) -> &T { &self.0 }
}

// crate B
fn contains_true(x: &Option<Container<bool>>) -> bool {
    match x {
        // With eat-both-layers, this copies out the `Container`.
        // If `Container: Deref`, this would instead copy the `bool`.
        Some(&c) => *c.contents(),
        None => false,
    }
}

Option 2: some other syntax, e.g. deref <pat>, *<pat>, or <Pointer>(<pat>)

For the sake of presentation, I'll use deref. Any syntax different from & can work the same.

Works with match ergonomics

match Box::new(0) {
    deref 0 => ...,
    deref x => ..., // `x: u32`
    _ => ...,
}
match &Box::new(0) {
    // The `deref` passes through references as necessary
    deref 0 => ..., // ok
    deref x => ..., // `x: &u32`
    _ => ...,
}
match &mut Box::new(0) {
    // The `deref` passes through references as necessary
    deref 0 => ..., // ok
    deref x => ..., // `x: &mut u32`
    _ => ...,
}

This does not require implicit deref patterns to be practical.

Mutability

Mutability is inferred from the kinds of bindings we do inside the pattern.

let mut x = Box::new(Some(0));
match x {
    // Calls `deref()`
    deref Some(ref x) => ..., // `x: &u32`
    _ => ...,
}
match x {
    // Calls `deref_mut()`
    deref Some(ref mut x) => ..., // `x: &mut u32`
    _ => ...,
}
// With match ergonomics
match &x {
    // Calls `deref()`
    deref Some(x) => ..., // `x: &u32`
    _ => ...,
}
match &mut x {
    // Calls `deref_mut()`
    deref Some(x) => ..., // `x: &mut u32`
    _ => ...,
}
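Inferring mutability from the bindings is how place expressions already behave on stable Rust: when a match scrutinee goes through an overloaded `*`, the compiler picks `deref` or `deref_mut` depending on whether the arms contain `ref mut` bindings. A small sketch with `Vec` (function names are illustrative):

```rust
/// Only `ref` bindings and literals: the `*v` place is reached via `deref()`.
fn first_is_zero(v: Vec<i32>) -> bool {
    match *v {
        [ref first, ..] => *first == 0,
        [] => false,
    }
}

/// A `ref mut` binding: the same `*v` place is now reached via `deref_mut()`.
fn bump_first(mut v: Vec<i32>) -> Vec<i32> {
    match *v {
        [ref mut first, ..] => *first += 1,
        [] => {}
    }
    v
}
```

Under option 2, `deref <pat>` would apply the same rule inside the pattern instead of on the scrutinee expression.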

Future-compatibility with moving out/DerefMove

Compatible. This could in fact replace the box_patterns feature entirely.

Specific syntax choice

Here are the proposals I've seen:

  • deref <pat>;
  • *<pat>;
  • box <pat> (already reserved keyword);
  • Box(<pat>)/Rc(<pat>)/String(<pat>)… type-specific pattern.

Note that deref <pat> would require reserving a keyword, since deref(x, y) could be a tuple struct pattern and deref ::Type could be a path.

Note that Pointer(<pat>) would require some rule to not conflict with tuple struct syntax. Maybe we disallow it on tuple structs, maybe some visibility-based hack. Also it doesn't work in generic contexts, unless we allow T(<pat>) for generic T.

Summary

  • Option 1: &<pat>;

    • drawback: likely requires the implicit form in practice which we may or may not want;
    • drawback: not future-compatible with something like DerefMove;
    • drawback: using a & pattern on something that isn't a reference could feel weird (e.g. matches!(Rc::new(true), &true)).
  • Option 2: deref <pat>, *<pat> or Pointer(<pat>).

    • drawback: deref would require a keyword i.e. new edition;
    • drawback: *<pat> goes the wrong way around, e.g. &*<pat> looks like a reborrow but is actually two dereferences;
    • drawback?: Pointer(<pat>) doesn't fit well with the implicit syntax;
    • drawback: Pointer(<pat>) conflicts with tuple structs.

Compatibility matrix:

| Option | +implicit | only explicit | +move | +custom Deref | works in all editions | consistent with today's rust |
|---|---|---|---|---|---|---|
| &<pat> | [icon missing] | noisy | [icon missing] | iffy with ergonomics 2024 | [icon missing] | [icon missing] |
| deref <pat> | [icon missing] | [icon missing] | [icon missing] | [icon missing] | [icon missing] | sort of |
| Pointer(<pat>) | weird | [icon missing] | [icon missing] | [icon missing] except tuple structs | [icon missing] | [icon missing] |
| *<pat> | [icon missing] | [icon missing] | [icon missing] | [icon missing] | [icon missing] | [icon missing] |

Unresolved questions

The string literal issue

A fun consequence of how we deal with places.

let x: String = ...;
match x {
    // ERROR type mismatch: place has type `str`, literal has type `&str`
    deref "foo" => ...,
    _ => ...,
}

A similar issue exists today when matching with constants of type &T (playground):

const FOO: &u32 = &0;
match 0 {
    FOO => ..., // ERROR: expected `u32`, found `&u32`
    _ => ...,
}

According to @compiler-errors this should be easy to solve for the specific case of string literals, so let's ignore it for now.
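For reference, the usual stable-Rust workaround is to turn the `String` into a `&str` before matching, so that the scrutinee and the literal agree on type `&str`. A minimal sketch (the function name `is_foo` is illustrative):

```rust
/// Matches a `String` against a string literal by first borrowing it as `&str`.
fn is_foo(x: &String) -> bool {
    match x.as_str() {
        "foo" => true,
        _ => false,
    }
}
```

This works because the scrutinee is a `&str` value rather than a place of type `str`, which is exactly the distinction that trips up the deref pattern above.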

Future Possibilities

Implicit deref patterns

As discussed above, we could extend match ergonomics to add implicit deref patterns as needed.

Desugaring algorithm

Today, when a concrete pattern p which isn't of the form &<pat> is used to match on a place x: &T, we adjust the binding mode and keep matching p on the place *x.

We would extend this behavior to places x of type P where P: Deref is one of the supported std types. This would insert implicit deref patterns. E.g.

let x: &Rc<Option<u32>> = ...;
match x {
    Some(x) => ..., // `x: &u32`
    // desugars to:
    &&Some(ref x) => ..., // syntax choice 1
    &deref Some(ref x) => ..., // syntax choice 2
    _ => ...,
}
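The effect of the desugaring can be approximated on stable Rust by dereferencing the scrutinee by hand and letting today's match ergonomics handle the remaining reference. This is a sketch of the observable behavior, not of the proposed implementation; `get_or_zero` is a made-up name.

```rust
use std::rc::Rc;

/// `&**x` has type `&Option<u32>`, so match ergonomics binds `n: &u32`,
/// the same type the implicit deref pattern would give.
fn get_or_zero(x: &Rc<Option<u32>>) -> u32 {
    match &**x {
        Some(n) => *n,
        None => 0,
    }
}
```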

User-defined Deref

I (Nadri) am reasonably confident that we can make this sound (cannot cause UB) for arbitrary user-defined Derefs as long as we disable exhaustiveness (i.e. require a _ arm).

struct MyBox(..);
match MyBox::new(true) {
    deref true => ...,
    deref false => ...,
    _ => ..., // required because `MyBox` could be doing evil things.
}
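To illustrate what "evil things" could look like, here is a sketch of a user-defined `Deref` impl that is not idempotent: it returns a different value on each call. The type `Flipper` and the helper `observe_twice` are hypothetical and exist only for this example.

```rust
use std::cell::Cell;
use std::ops::Deref;

/// A smart pointer whose target alternates between `true` and `false`.
struct Flipper {
    calls: Cell<u32>,
}

impl Flipper {
    fn new() -> Self {
        Flipper { calls: Cell::new(0) }
    }
}

impl Deref for Flipper {
    type Target = bool;
    fn deref(&self) -> &bool {
        let n = self.calls.get();
        self.calls.set(n + 1);
        // Even-numbered calls see `true`, odd-numbered calls see `false`.
        if n % 2 == 0 { &true } else { &false }
    }
}

/// Dereferences the same pointer twice and reports both observations.
fn observe_twice(f: &Flipper) -> (bool, bool) {
    (**f, **f)
}
```

If a match were to dereference such a value once per arm, `deref true` and `deref false` could both fail to match, so treating the two arms as exhaustive would not be justified.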

That said, there remain many design questions:

  • should types have to opt-in to deref patterns?
  • can types opt-in to exhaustiveness (with an unstable trait)?
  • what are the exact requirements for soundness (inspiration?)?
  • how ok is it that patterns would be running arbitrary user code?
  • how do we stop users from relying on details of match lowering (which we want the freedom to change in the future)?
  • much more; see Zulip, IRLO, and the linked documents.

For that reason they're not included in the current proposal. That said, I personally think they're too cool to forbid, even if they stay perma-unstable or behind an opt-in unsafe trait.

Note that this is in tension with implicit deref patterns, as this means patterns could implicitly run arbitrary user code. We could e.g. force explicit patterns when the Deref impl is untrusted.

Moving out of deref patterns

We could implement this today for Box, to match the existing deref magic (and replace box patterns). For other types, this would require a solution to the tricky DerefMove design issue.

In either case, this requires a compatible syntax for the explicit form as discussed above.
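The existing `Box` magic referred to above can be seen on stable Rust: matching on `*b` with a by-value binding moves the contents out of the box. A small sketch, with an illustrative function name:

```rust
/// Moves the `String` out of the box; this is only allowed for `Box` today.
fn take_name(b: Box<Option<String>>) -> String {
    match *b {
        Some(s) => s, // `s: String`, moved out of the heap allocation
        None => String::new(),
    }
}
```

The same code with `Rc` or another smart pointer is rejected, which is the gap a `DerefMove`-style design would need to fill.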

Mixed exhaustiveness

We could allow mixing normal and deref patterns:

// Considered exhaustive because the normal patterns are exhaustive by themselves.
match Cow::Owned(false) {
    Cow::Owned(_) => ...,
    deref true => ...,
    Cow::Borrowed(_) => ...,
}
// Considered exhaustive because the deref patterns are exhaustive by themselves.
match Cow::Owned(false) {
    deref true => ...,
    Cow::Borrowed(_) => ...,
    deref false => ...,
}

(note: the hypothetical impl Cow: DerefMut that clones implicitly would stop Cow from being eligible for exhaustiveness)

The two types of patterns are treated independently. In particular, exhaustiveness won't try to figure out that this is exhaustive:

match AssertUnwindSafe(true) {
    deref true => ...,
    AssertUnwindSafe(false) => ...,
    // ERROR: missing patterns `deref false` and `AssertUnwindSafe(true)`
}