# Deref Patterns Design Proposal(s)
Goal: Be able to match through a `Deref` or `DerefMut` smart pointer ergonomically.
```rust
let x: Option<Rc<bool>> = ...;
match x {
Some(deref true) => ...,
Some(deref false) => ...,
None => ...,
}
```
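For comparison, here is a minimal sketch of how this particular match can be written on stable Rust today, without the feature. It only works because `Option` happens to offer `as_deref`; the function name `describe` and the returned strings are illustrative assumptions, not part of the proposal.

```rust
use std::rc::Rc;

/// Stable-Rust counterpart of the `Some(deref true)` match above.
/// `describe` is a hypothetical helper used only for illustration.
fn describe(x: &Option<Rc<bool>>) -> &'static str {
    // `as_deref` turns `&Option<Rc<bool>>` into `Option<&bool>`,
    // after which ordinary patterns (and exhaustiveness checking) apply.
    match x.as_deref() {
        Some(true) => "true",
        Some(false) => "false",
        None => "none",
    }
}
```

This trick does not generalize to smart pointers nested inside arbitrary user types, which is the gap deref patterns aim to close.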
Prior discussions include:
- Tracking issue https://github.com/rust-lang/rust/issues/87121
- The whole #project-deref-patterns stream on Zulip
- https://hackmd.io/GBTt4ptjTh219SBhDCPO4A?view=
- https://hackmd.io/vZzzEVIMRiWN_7eptYxaOg?view#deref-patterns
- https://hackmd.io/rGK3QbFGSU66i1vLbNhoUg?view
- https://internals.rust-lang.org/t/somewhat-random-idea-deref-patterns/13813/2, which links to:
- https://github.com/rust-lang/rfcs/pull/462
- https://github.com/rust-lang/rfcs/issues/2099
- https://internals.rust-lang.org/t/pre-rfc-string-literals-through-prefixes/2928/9
- Many threads on IRLO
- [May 2024 lang-team design meeting](https://hackmd.io/fj68a_l1QS2xyQQFUWz0GA?view=)
From these and my own considerations, I have come up with two related proposals.
We have a working implementation (without exhaustiveness checking) of one of the proposals available under the `deref_patterns` experimental feature gate.
# Design principles (WIP)
- Patterns do what I mean: writing the right pattern doesn't require sigil-golf to appease the type-checker;
- Patterns can be precise: users should be able to spell out exactly which places they want to access with a pattern;
- In an unsafe block I have confidence I'm not triggering anything unexpected
- Smart pointers get out of the way: the ergonomic distance between using a `&T` and a smart pointer should be made as small as possible;
- I don't need to make match-in-match just because I boxed a field, or turned an array into a vec, or whatever.
- I still get exhaustiveness checking even though the integer/enum/array is inside a box
- Patterns accommodate semantically-irrelevant code changes: switching from `Rc` to `Arc` should require minimal code changes;
- Users can be trusted with powerful features;
- Safety rails make power safer to wield.
- Patterns are readable
- Patterns are predictable
- Deconstruction mirrors use
- Mutability is explicit (?)
Q: should `&` vs `&mut` indicate the mutability of the type, or of the borrow?
A: today is mostly type, ergo2024 makes it sometimes borrow. E.g. `let &mut (ref x) = &mut place;` shares `place`.
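To make the "match-in-match" principle concrete, here is a small sketch of what boxing a field forces on stable Rust today. The `Expr` type and `strip_double_neg` function are made up for this illustration.

```rust
/// Hypothetical recursive type; the `Box` is only there to give it a finite size.
#[derive(Debug, PartialEq)]
enum Expr {
    Lit(i64),
    Neg(Box<Expr>),
}

/// Rewrites `--e` to `e` at the top level, otherwise returns the input unchanged.
fn strip_double_neg(e: &Expr) -> &Expr {
    match e {
        // The pattern cannot look through the `Box`, so a second `match`
        // on the explicitly dereferenced place is needed.
        Expr::Neg(inner) => match &**inner {
            Expr::Neg(x) => &**x,
            _ => e,
        },
        _ => e,
    }
}
```

With deref patterns the nested `match` would presumably collapse into a single arm such as `Expr::Neg(deref Expr::Neg(deref x))`, depending on the syntax chosen.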
# Proposal(s)
The two proposals differ in syntax but share how they work: a `&<pat>`/`*<pat>`/`deref <pat>`/etc pattern would be allowed for stdlib smart pointers like `Box` or `Rc`, where `<pat>` would match on the pointed-to value.
```rust
let x: Option<Rc<bool>> = ...;
match x {
Some(&true) => ..., // syntax choice 1
Some(deref true) => ..., // syntax choice 2
Some(*true) => ..., // syntax choice 2 bis
Some(Rc(true)) => ..., // syntax choice 2 ter
Some(true) => ..., // possible extension (implicit deref patterns)
_ => ...,
}
```
Everything else about patterns works the same: we can nest them, we can get immutable or mutable access, they are subject to exhaustiveness checking and the dead arm lint.
Before we discuss the two syntactic options, let's start with some common design details.
## Enabled for all std smart pointers
The feature would be enabled for the following std types: `Box`, `Rc`, `Arc`, `Vec`, `String`, `Cow`, `Pin`, `ManuallyDrop`, `Ref`, `RefMut`, `LazyCell`, `LazyLock`. This is sound because all those impls are sufficiently idempotent.
Extending the feature to user-defined `Deref` impls is outside the scope of this proposal.
## Exhaustiveness
Patterns are treated exhaustively as expected:
```rust!
// This works
match Box::new(true) {
deref true => ...,
deref false => ...,
}
```
This is sound because we only enable the feature for a trusted set of types whose `Deref` impls behave well enough.
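To illustrate what "behaving well enough" rules out, here is a sketch of a deliberately misbehaving `Deref` impl in safe, stable Rust. The `Flaky` type is an assumption made up for this example; nothing like it exists in std.

```rust
use std::cell::Cell;
use std::ops::Deref;

/// Hypothetical smart pointer whose target changes on every `deref` call.
struct Flaky {
    calls: Cell<u32>,
    values: [bool; 2],
}

impl Deref for Flaky {
    type Target = bool;

    fn deref(&self) -> &bool {
        // Alternate between the two stored values on each call.
        let n = self.calls.get();
        self.calls.set(n + 1);
        &self.values[(n % 2) as usize]
    }
}

/// Dereferences the same pointer twice and returns both observations.
fn observe_twice(f: &Flaky) -> (bool, bool) {
    (**f, **f)
}
```

If match lowering dereferenced such a value once per arm, a `deref true`/`deref false` pair could fail to match either arm, so exhaustiveness checking cannot be trusted for impls like this one.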
#### Mixing deref and normal patterns
Some `Deref` std types like `Cow` can be matched normally. For simplicity we forbid mixing deref and normal patterns for now.
```rust!
match Cow::Owned(false) {
Cow::Owned(_) => ...,
deref true => ..., // ERROR: don't mix deref patterns and normal patterns
_ => ...,
}
```
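For reference, the two ways of looking at a `Cow` can already be written separately on stable Rust, as in this sketch. The function names are illustrative assumptions.

```rust
use std::borrow::Cow;

/// "Normal" pattern: inspects the `Cow` enum itself.
fn is_owned(c: &Cow<'_, str>) -> bool {
    match c {
        Cow::Owned(_) => true,
        Cow::Borrowed(_) => false,
    }
}

/// "Deref" view: matches on the pointed-to `str`, regardless of variant.
fn is_foo(c: &Cow<'_, str>) -> bool {
    match &**c {
        "foo" => true,
        _ => false,
    }
}
```

The restriction above only concerns combining both views inside a single `match`.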
## Explicit vs implicit syntax
The precedent set by match ergonomics, and the general way Rust tends to work, suggest that implicit deref patterns (if we want them) should desugar into an explicit form.
Moreover, we need explicit syntax to disambiguate cases like:
```rust!
let cowcow: Cow<Cow<bool>> = ...;
match cowcow {
// How do I match the inner cow?
Cow::Borrowed(_) => ...,
_ => ...,
}
```
As such, both proposals focus on the explicit syntax. Implicit patterns are an optional extension.
## Types
We follow how the rest of rust works for matches and `Deref`: we work on places.
This means that `&<pat>`/`deref <pat>` operates on a place `x: P` where `P: Deref<Target=T>`, and matches `<pat>` against the place `*x` of type `T`.
It does _not_ operate on a place of type `&P`/`&mut P` (except insofar as match ergonomics make it seem like it does). It also does not return a place of type `&T`/`&mut T` (again modulo match ergonomics). As we will see in Unresolved Questions, this poses an issue for string literals. It is nonetheless the choice that is consistent with the rest of Rust.
<!--To get a `&T`/`&mut T` out of this, one must as usual bind the matched-on place with `ref x`/`ref mut x` (or let match ergonomics do it for you).-->
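The place-based behavior described here mirrors what an explicit `match **b` already does on stable Rust. A minimal sketch, with `first_len` being a hypothetical helper:

```rust
/// Matches on the place behind a borrowed `Box` without moving out of it.
fn first_len(b: &Box<Option<String>>) -> usize {
    // `**b` names the place of type `Option<String>`; it is not a `&Option<String>`.
    match **b {
        // `ref` borrows from that place, so `s: &String` and nothing is moved.
        Some(ref s) => s.len(),
        None => 0,
    }
}
```

A deref pattern would perform the `**b` step inside the pattern instead of in the scrutinee expression.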
## The two syntax options
### Option 1: `&<pat>`
```rust
match Box::new(0) {
&0 => ..., // works
&x => ..., // `x: u32`
_ => ...,
}
```
#### No match ergonomics in the explicit form
Because of how `&` works today, we don't really have much choice:
```rust!
match &Box::new(Some(0)) {
&x => ..., // ERROR: can't move `Box` out
&Some(_) => ..., // ERROR: type mismatch
&&Some(0) => ..., // ok
&&Some(x) => ..., // `x: u32`
_ => ...,
}
match &mut Box::new(Some(0)) {
&mut &mut Some(ref mut x) => ..., // `x: &mut u32`
// The outer `&mut` digs into the `&mut T` as normal;
// the inner `&mut` derefs the `Box`.
_ => ...,
}
// Just like for `&T`, `&<pat>` eats the inherited reference if there is one.
match &Some(Box::new(0)) {
Some(&x) => ..., // `x: u32`
Some(&0) => ..., // ok
_ => ...,
}
// See also the discussions below about match ergonomics 2024.
```
As you can see, this proposal isn't very convenient without implicit deref patterns.
#### Mutability
As could be expected, `&<pat>` calls `deref` and `&mut <pat>` calls `deref_mut`.
```rust!
let mut x = Box::new(Some(0));
match x {
&Some(ref x) => ..., // `x: &u32`
_ => ...,
}
match x {
&mut Some(ref mut x) => ..., // `x: &mut u32`
_ => ...,
}
match x {
&Some(ref mut x) => ..., // ERROR
_ => ...,
}
```
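As a rough model of which trait method each form would call, the two accepted matches can be hand-desugared on stable Rust as follows. This is a sketch under the assumption that the lowering is a plain `deref`/`deref_mut` call; `peek` and `bump` are hypothetical names.

```rust
use std::ops::{Deref, DerefMut};

/// Models `&Some(ref x)`: only shared access is needed, so `deref` suffices.
fn peek<P: Deref<Target = Option<u32>>>(p: &P) -> Option<u32> {
    match p.deref() {
        Some(x) => Some(*x), // `x: &u32`
        None => None,
    }
}

/// Models `&mut Some(ref mut x)`: mutable access requires `deref_mut`.
fn bump<P: DerefMut<Target = Option<u32>>>(p: &mut P) {
    match p.deref_mut() {
        Some(x) => *x += 1, // `x: &mut u32`
        None => {}
    }
}
```

Note that `bump` cannot be called with an `Rc`, since `Rc` does not implement `DerefMut`; the same limitation would apply to `&mut <pat>` patterns.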
Interestingly, because `&mut T: Deref`, this allows matching on `&mut T` with a `&<pat>` pattern:
```rust
let mut x = 0;
match Some(&mut x) {
Some(&x) => ..., // `x: u32`
None => ...,
}
```
#### Future-compatibility with moving out/`DerefMove`
Because we distinguish `&` and `&mut` the way we do, to move out we'd probably want some other syntax. Maybe `move <pat>`, maybe `*<pat>`; this is undecided.
#### Interaction with match ergonomics 2024
The [match ergonomics 2024 proposal](https://github.com/rust-lang/rust/issues/123076) includes some possible changes to `&<pat>` patterns. Depending on the exact choices made, this could conflict with deref patterns.
For example:
```rust
if let Some(&x) = &Some(Box::new(0)) {
// ...
}
```
- Under current Rust, this is a type error.
- Under deref patterns only, `x: u32`.
- Under "`&<pat>` everywhere" only, `x: Box<u32>` (which gives a move error).
- Under both if we take [option eat-both-layers](https://hackmd.io/YLKslGwpQOeAyGBayO9mdw?both=#Alternate-rules-for-the-behavior-of-ampltpatgt), what makes most sense is `x: u32`.
- Under both if we take [option eat-one-layer](https://hackmd.io/YLKslGwpQOeAyGBayO9mdw?both=#Alternate-rules-for-the-behavior-of-ampltpatgt), what makes most sense is `x: Box<u32>`.
Hence combining the features could accidentally create a breaking change.
The safest choice is to disallow `&<pat>` deref patterns in the presence of match-ergonomics-inherited references, and to disallow "`&<pat>` everywhere" on `Deref` types (should be fine because `Deref` types are rarely `Copy`). That gives us freedom to land either feature in any order. We can relax restrictions once both are stable.
For custom `Deref`, this may mean that implementing `Deref` on a `Copy` type can break downstream crates:
```rust
// crate A
#[derive(Copy, Clone)]
struct Container<T>(T);
impl<T> Container<T> {
fn contents(&self) -> &T { &self.0 }
}
// crate B
fn contains_true(x: &Option<Container<bool>>) -> bool {
match x {
// With eat-both-layers, this copies out the `Container`.
// If `Container: Deref`, this would instead copy the `bool`.
Some(&c) => *c.contents(),
None => false,
}
}
```
### Option 2: some other syntax, e.g. `deref <pat>`, `*<pat>`, or `<Pointer>(<pat>)`
For the sake of presentation, I'll use `deref`. Any syntax different from `&` can work the same.
#### Works with match ergonomics
```rust!
match Box::new(0) {
deref 0 => ...,
deref x => ..., // `x: u32`
_ => ...,
}
match &Box::new(0) {
// The `deref` passes through references as necessary
deref 0 => ..., // ok
deref x => ..., // `x: &u32`
_ => ...,
}
match &mut Box::new(0) {
// The `deref` passes through references as necessary
deref 0 => ..., // ok
deref x => ..., // `x: &mut u32`
_ => ...,
}
```
This does not require implicit deref patterns to be practical.
#### Mutability
Mutability is inferred from the kinds of bindings we do inside the pattern.
```rust!
let mut x = Box::new(Some(0));
match x {
// Calls `deref()`
deref Some(ref x) => ..., // `x: &u32`
_ => ...,
}
match x {
// Calls `deref_mut()`
deref Some(ref mut x) => ..., // `x: &mut u32`
_ => ...,
}
// With match ergonomics
match &x {
// Calls `deref()`
deref Some(x) => ..., // `x: &u32`
_ => ...,
}
match &mut x {
// Calls `deref_mut()`
deref Some(x) => ..., // `x: &mut u32`
_ => ...,
}
```
#### Future-compatibility with moving out/`DerefMove`
Compatible :heavy_check_mark:. This could in fact replace the `box_patterns` feature entirely.
#### Specific syntax choice
Here are the proposals I've seen:
- `deref <pat>`;
- `*<pat>`;
- `box <pat>` (already reserved keyword);
- `Box(<pat>)`/`Rc(<pat>)`/`String(<pat>)`... type-specific pattern.
Note that `deref <pat>` would require reserving a keyword, since `deref(x, y)` could be a tuple struct pattern and `deref ::Type` could be a path.
Note that `Pointer(<pat>)` would require some rule to not conflict with tuple struct syntax. Maybe we disallow it on tuple structs, maybe some visibility-based hack. Also it doesn't work in generic contexts, unless we allow `T(<pat>)` for generic `T`.
# Summary
- Option 1: `&<pat>`;
- drawback: likely requires the implicit form in practice, which we may or may not want;
- drawback: not future-compatible with something like `DerefMove`;
- drawback: using a `&` pattern on something that isn't a reference could feel weird (e.g. `matches!(Rc::new(true), &true)`).
- Option 2: `deref <pat>`, `*<pat>` or `Pointer(<pat>)`.
- drawback: `deref` would require a new keyword, i.e. a new edition;
- drawback: `*<pat>` goes the wrong way around, e.g. `&*<pat>` looks like a reborrow but is actually two dereferences;
- drawback?: `Pointer(<pat>)` doesn't fit well with the implicit syntax;
- drawback: `Pointer(<pat>)` conflicts with tuple structs.
Compatibility matrix:
| Option | +implicit | only explicit | +move | +custom `Deref` | works in all editions | consistent with today's rust |
| - | - | - | - | - | - | - |
| `&<pat>` | :heavy_check_mark: | noisy | :question: | iffy with ergonomics 2024 | :heavy_check_mark: | :heavy_check_mark: |
| `deref <pat>` | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :x: | sort of |
| `Pointer(<pat>)` | weird | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: except tuple structs | :heavy_check_mark: | :heavy_check_mark: |
| `*<pat>` | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :x: |
# Unresolved questions
## The string literal issue
A fun consequence of how we deal with places.
```rust!
let x: String = ...;
match x {
// ERROR type mismatch: place has type `str`, literal has type `&str`
deref "foo" => ...,
_ => ...,
}
```
A similar issue exists today when matching with constants of type `&T` ([playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=9b76a0ee3f5486f569d312e50c968531)):
```rust!
const FOO: &u32 = &0;
match 0 {
FOO => ..., // ERROR: expected `u32`, found `&u32`
_ => ...,
}
```
According to `@compiler-errors`, this should be easy to solve for the specific case of string literals, so let's set it aside for now.
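For context, both mismatches have a stable-Rust workaround today: make the scrutinee a reference so that its type agrees with the pattern. A minimal sketch, with `classify` and `is_foo` as hypothetical helpers:

```rust
/// Matches a `String` against literals by producing a `&str` scrutinee,
/// which is the type string literal patterns have.
fn classify(x: &String) -> u8 {
    match x.as_str() {
        "foo" => 1,
        "bar" => 2,
        _ => 0,
    }
}

const FOO: &u32 = &0;

/// Matches against a constant of type `&u32` by borrowing the scrutinee.
fn is_foo(n: u32) -> bool {
    match &n {
        FOO => true,
        _ => false,
    }
}
```

A place-based deref pattern lands on the `str` place itself rather than on a `&str`, which is why the literal no longer lines up.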
<!--
Possible solutions:
- Special-case string literals with a bit of inference magic. I am told this is not hard;
- Auto-deref constant patterns as needed. Explicit form could be spelled `*CONST`, e.g. `deref (*"foo")`;
- Auto-ref matched places as needed. Explicit form could be spelled `ref <pat>`, e.g. `deref (ref "foo")`.
-->
# Future Possibilities
## Implicit deref patterns
As discussed above, we could extend match ergonomics to add implicit deref patterns as needed.
#### Desugaring algorithm
Today, when a concrete pattern `p` which isn't of the form `&<pat>` is used to match on a place `x: &T`, we adjust the binding mode and keep matching `p` on the place `*x`.
We would extend this behavior to places `x` of type `P` where `P: Deref` is one of the supported std types. This would insert implicit deref patterns. E.g.
```rust
let x: &Rc<Option<u32>> = ...;
match x {
Some(x) => ..., // `x: &u32`
// desugars to:
&&Some(ref x) => ..., // syntax choice 1
&deref Some(ref x) => ..., // syntax choice 2
_ => ...,
}
```
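The existing behavior being extended here can be seen on stable Rust with plain references. The sketch below shows an implicit match, its explicit desugaring, and the manual deref that a smart pointer currently requires; all three function names are illustrative assumptions.

```rust
use std::rc::Rc;

/// Match ergonomics as they exist today: the pattern is not a `&<pat>`,
/// so the binding mode is adjusted and `n: &u32`.
fn implicit(x: &Option<u32>) -> u32 {
    match x {
        Some(n) => *n,
        None => 0,
    }
}

/// The explicit form that the match above desugars to.
fn explicit(x: &Option<u32>) -> u32 {
    match x {
        &Some(ref n) => *n,
        &None => 0,
    }
}

/// With an `Rc` in the way, the deref must currently be written by hand
/// in the scrutinee; implicit deref patterns would insert it in the pattern.
fn through_rc(x: &Rc<Option<u32>>) -> u32 {
    match &**x {
        Some(n) => *n,
        None => 0,
    }
}
```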
## User-defined `Deref`
I (Nadri) am reasonably confident that we can make this *sound* (cannot cause UB) for arbitrary user-defined `Deref`s as long as we disable exhaustiveness (i.e. require a `_` arm).
```rust!
struct MyBox(..);
match MyBox::new(true) {
deref true => ...,
deref false => ...,
_ => ..., // required because `MyBox` could be doing evil things.
}
```
That said, there remain many design questions:
- should types have to opt-in to deref patterns?
- can types opt-in to exhaustiveness (with an unstable trait)?
- what are the exact requirements for soundness ([inspiration?](http://kimundi.github.io/owning-ref-rs/stable_deref_trait/trait.StableDeref.html))?
- how ok is it that patterns would be running arbitrary user code?
- how do we stop users from relying on details of match lowering (which we want the freedom to change in the future)?
- much more; see Zulip, IRLO, and the linked documents.
For that reason they're not included in the current proposal. That said, I personally think they're too cool to forbid, even if they stay perma-unstable or behind an opt-in unsafe trait.
Note that this is in tension with implicit deref patterns, as this means patterns could implicitly run arbitrary user code. We could e.g. force explicit patterns when the `Deref` impl is untrusted.
## Moving out of deref patterns
We could implement this today for `Box`, to match the existing deref magic (and replace box patterns). For other types, this would require a solution to the tricky `DerefMove` design issue.
In either case, this requires a compatible syntax for the explicit form as discussed above.
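The existing `Box` deref magic mentioned above is already usable in the scrutinee position on stable Rust, as this sketch shows. `unbox` is a hypothetical helper name.

```rust
/// Moves the `String` out of the box by matching on the place `*b`.
fn unbox(b: Box<Option<String>>) -> String {
    match *b {
        // Binding by value moves out of the `Box`; this works only because
        // dereferencing `Box` is built into the compiler.
        Some(s) => s,
        None => String::new(),
    }
}
```

A moving deref pattern for `Box` would offer the same capability in nested positions, where today only the unstable `box` patterns can reach.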
## Mixed exhaustiveness
We could allow mixing normal and deref patterns:
```rust
// Considered exhaustive because the normal patterns are exhaustive by themselves.
match Cow::Owned(false) {
Cow::Owned(_) => ...,
deref true => ...,
Cow::Borrowed(_) => ...,
}
// Considered exhaustive because the deref patterns are exhaustive by themselves.
match Cow::Owned(false) {
deref true => ...,
Cow::Borrowed(_) => ...,
deref false => ...,
}
```
(note: the hypothetical `impl Cow: DerefMut` that clones implicitly would stop `Cow` from being eligible for exhaustiveness)
The two types of patterns are treated independently. In particular, exhaustiveness won't try to figure out that this is exhaustive:
```rust!
match AssertUnwindSafe(true) {
deref true => ...,
AssertUnwindSafe(false) => ...,
// ERROR: missing patterns `deref false` and `AssertUnwindSafe(true)`
}
```