Try   HackMD

Pattern Types Design Doc

Pattern types are the general idea of having a type T is p (e.g. u32 is 0..10), that indicates a value of type T that matches pattern p.

I want this document to gather all the constraints and possibilities we've uncovered around this idea.

Design Goals (opinionated)

  • Should not break any existing code.
  • Pattern inference should not influence runtime behavior (like UB).
  • The same function should not need to be monomorphised multiple times for different patterns. This could be allowed as an optimization, e.g. when inlining integer division.
  • Small changes to the code should not suddenly require more or fewer match arms.
  • Small changes to type inference also shouldn't do that.
  • Which arms a match has should not influence type inference of its scrutinee.

Code examples and design constraints

I've gathered here code we'd more or less like to write. Some of these examples are pretty core to the idea of pattern types, others will need to change in the final design.

This section does not discuss solutions, only constraints and desires.

Sources for this document:

Basic Idea

// Type can appear anywhere a normal type can.
pub struct NonZeroU32(u32 is 1..);

impl NonZeroU32 {
    // Accept pattern types in function input.
    pub fn new(x: u32 is 1..) -> Self;
    // Accept pattern types in function output.
    pub fn as_ref(&self) -> &(u32 is 1..);
}

let x: u32 is 0..40 = ...;

// Type aliases
type Infinity = f32 is f32::INFINITY | f32::NEG_INFINITY

pub fn as_chunks<const N: usize is 1..>(self: &[T]) -> (&[[T; N]], &[T]);

// We can have arbitrary patterns in the type.
let x: Option<(bool, u32)> is None | Some((true, 1..)) = ...;

// Matching with `x @ p` remembers the matched pattern. This is the main way
// to introduce a value of a pattern type.
match foo() {
    x @ Some(_) => {
        // `x` has type `x is Some(_)`
    }
    x @ Some((true, 1.., _)) => {
        // `x` has type `x is Some((true, 1.., _))`
    }
}

Representation

// We want our `NonZeroU32` to have a niche .
pub struct NonZeroU32(u32 is 1..);
size_of::<NonZeroU32>() == 4
size_of::<Option<NonZeroU32>>() == 4

// `T` and `T as p` must have the same representation.
match &Some(number()) {
    Some(x @ 42) => {
        // Here `x: &(u32 as 42)`, so we must represent `u32 as 42` exactly like
        // `u32` even though is morally could be a ZST.
    }
    _ => {}
}

// This allows us to write, using unsafe code:
impl NonZeroU32 {
    // Possibly even `&mut`.
    pub fn option_from_ref(x: &u32) -> &Option<Self> {
        unsafe { transmute(x) }
    }
}

Exhaustiveness

// Match exhaustiveness takes pattern types into account.
let x: i32 as 0 | 1;
match x {
    0 => ...,
    1 => ...,
    // No need for more branches
}

fn minimum(slice: &[i32] is [_, ..]) -> i32 {
    let [first, rest @ ..] = slice;
    ...
}

match foo() {
    Some(false) => ...,
    x @ (None | Some(true)) => {
        match x {
            None => ...,
            Some(true) => ...,
        }
    }
}

Subtyping

let x: i32 is 0.. = ...;
let y: i32 is 1.. = x; // coercion/subtyping happens
let z: i32 = y; // coercion/subtyping happens

let x: Option<i32 is 0..> = ...;
let y: Option<i32> = x; // ok

let x: &RefCell<i32 is 0..> = ...;
let y: &RefCell<i32> = x; // ERROR

impl<I: Iterator> Iterator for Repeat<I> {
    // Refine the trait method with a more precise signature.
    fn next(&mut self) -> Option<I::Item> is Some(_) {
        ...
    }
}

match 42 {
    x @ 42 => {
        // This must keep working, so a type-change of some sort must happen.
        // Note that taking a reference is not a coercion site.
        let y: &mut &i32 = &mut &x;
    }
    _ => ...,
}

// Is `T` identical to `T is _`?
// Is `Option<T> is Some(_)|None` identical to `Option<T> is _`?
// What's the relationship between `Option<i32 is 1..> is Some(_)` and `Option<i32> is Some(1..)`?

// `T as !` would be a subtype of `T` such that we know it has no inhabitants.
trait Trait {
    fn method() -> i32;
}
impl Trait for () {
    // `-> !` would not have satisfied the trait's signature
    fn method() -> i32 in ! {
        panic!()
    }
}
fn main() {
    // Never-to-any coercion
    let _: u16 = ().method();
}

Inference

// TODO: figure out possible backwards-compabitility footguns.
let val = 4;
let bounded: i32 is 2..=42 = val; // Works, because val can only be 4.

let val: &[i32] is [_, _, ..] = /* ... */;
let bounded: &[i32] is [_, ..] = val; // Works because is a more general pattern.

let slice: &[bool] as [true, ..];
match slice {
    // Matching intersects patterns. Here `x` has type `[true]`
    x @ [_] => ...,
    _ => ...,
}

let mut x = 0;
if foo() {
    x = 1;
}
// Inferred to `0 | 1`.
match x {
    0 | 1 => ...,
}

let m = &mut 1;
*m = 2; // What is the type of `m` now?

// Care must be taken to not infer `x: i32 as 0` here.
let mut x = 0;
unsafe {
    ptr::from_mut(&mut x).cast::<i32>().write(1);
}

// Same here.
let x = 0;
let cell = RefCell::new(x);
*cell.borrow_mut() = 1;

// This also must keep working.
match 0 {
  x @ 42 => {
    // We musn't infer `y: u32 is 42`.
    let mut y = x;
    unsafe {
        ptr::from_mut(&mut y).cast::<i32>().write(1);
    }
  }
}

// Here's a trickier case:
let mut a = 42_i128;
unsafe {
    set_equal_to_0(&mut a);
}
// now `a == 0`.
/// # Safety
///
/// `T` must be at least as large as `u32`,
/// and it must be valid to zero the first 4 bytes of it.
unsafe fn set_equal_to_0<T>(mut_ref: &mut T) {
    let ptr = mut_ref as *mut T as *mut u32;
    *ptr = 0;
}

if let y @ Some(_) = &mut x {
    // This must keep working.
    *y = None;
}

// Flow-sensitivity?
fn take_positive(x: i32 is 1..) { ... }
fn take_less_than_ten(x: i32 is ..10) { ... }
let mut x = 20;
take_positive(x);
x = 5;
take_less_than_ten(x);

// How about via mutable reference?
let mut x = 20;
take_positive(x);
let r = &mut x;
*r = 5;
take_less_than_ten(x);

// TODO: more interaction with invariance

Const generics

// These two could be recognized to be disjoint.
// Both together would be recognized to cover any `N` when `T: Default`.
impl<T> Default for [T; 0] {
    fn default() -> [T; 0] {
        []
    }
}
impl<T: Default, const N: usize is 1..> Default for [T; N] {
    fn default() -> [T; N] {
        array::from_fn(|_| T::default())
    }
}

// Note that this similar case isn't considered valid for all `B` in today's rust.
struct WithConst<const B: bool> {}
impl SomeTrait for WithConst<false> {}
impl SomeTrait for WithConst<true> {}

// Need to handle const-dependent typechecking.
// In particular, can we deduce an instance forall `V` if the `ASSOC` happen to be identical? Whatever is decided for the bool case should apply here.
impl<const V: usize is 0..5> SomeTrait for WithConst<V> { type ASSOC = u32; }
impl<const V: usize is 5..> SomeTrait for WithConst<V> { type ASSOC = String; }

// Is that sane?
impl<const L: usize, const U: usize is L.., const R: usize is L..=U>

Traits and methods

// Allow inherent impl on a pattern type.
enum MyEnum { This, That }
impl MyEnum is MyEnum::This {
    fn method(self) {}
}
// Allowing a second `impl` block could complicate inference.
impl MyEnum is MyEnum::That {
    fn method(self) {}
}

// Allow trait impl on a pattern type.
impl SomeTrait for i32 is 0 | 1 {}
// Allowing a second `impl` block could complicate trait resolution.
impl SomeTrait for i32 is 42.. {}
// Allowing a more precise `impl` block likely requires specialization.
impl SomeTrait for i32 is 0 {}

fn generic<T: Copy>(t: T) {}
// Because of the "one impl per base type", this doesn't work.
generic::<i32 is 0>(0); // ERROR: type does not implement `Copy`
generic(0); // Works as T is inferred as `i32`.

// This however should work.
fn copy(v: i32 is 0) -> (i32 is 0, i32 is 0) {
    (v, v)
}
// Unclear if we can make this work.
fn copy(v: &i32 is 0) -> (i32 is 0, i32 is 0) {
    (*v, *v)
}

// Would be nice:
impl<T, pat p> TryFrom<T> for (T is p) {
    ...
}
impl From<u8 as 0|1> for bool {
    ...
}

// Would be nice:
#[derive(Clone)]
struct Foo(Option<String>);
impl Copy for Foo is Foo(None) {}

// Would be cute: non-panicking division
impl Div<i32 is ..=-1 | 1..> for i32 {
    ...
}

Any

There's a backwards-compatilibity subtlety around Any:

let val = 42;
let foo: &dyn Any = match &val {
    x @ 42 => &x as &dyn Any,
    _ => panic!()
};
// This works today, so it must keep working.
let y: &i32 = foo.downcast_ref().unwrap();
// This would be nice.
let y: &i32 is 42 = foo.downcast_ref().unwrap();
// This MUST NOT be allowed since *error == val == 42.
let error: &i32 is 0 = foo.downcast_ref().unwrap();
// All of this is incompatible with having both `T is p: Any` and `TypeId` working like it does now. We could change `TypeId` or change the bound somehow`.
// Conclusion: either change `Any`, or no existing code is allowed to produce a `T is p` (i.e. maybe use `@@` instead).

API Migration

We'd like to be able to migrate an API to a more precise one without breaking downstream crates.

// old
impl u32 {
    /// Panics if `radix` is not in the range from 2 to 36.
    pub fn from_str_radix(src: &str, radix: u32) -> Result<u32, ParseIntError>;
}
// new
impl u32 {
    pub fn from_str_radix(src: &str, radix: u32 is 2..36) -> Result<u32, ParseIntError>;
}

// old
trait Iterator {
    /// Panics if `step == 0`
    fn step_by(self, step: usize) -> StepBy<Self>;
}
// new
trait Iterator {
    fn step_by(self, step: usize as 1..) -> StepBy<Self>;
}

// The kind of thing that might break:
fn foo(a: i32 in 1..) {}
fn bar(f: impl Fn(i32)) {}
bar(foo);

Unexplored potential footguns

  • trait+subtyping madness
  • fn(T as p)? what about its TypeId?

Open Design Questions

  • how precise subtyping
  • performance: this will require a lot of pattern checking. E.g. to prove that None|Some(_) is the same as _ is the same as exhaustiveness-checking (NP-complete). That said, we'd likely only have small patterns around.
  • Do we want an easy mechanism to insert runtime panic to satisfy the pattern type? .try_into().unwrap() would be nice (but requires advanced trait features).

Proposed Solutions

  • Mostly: see the pre-RFC
  • For backcompat, crates opt-in to pattern types. Each function defines two functions: one with the pattern type, the other with the base type and a runtime panic. Crates see one or the other (at resolution time) depending on their opt-in. Opt-in crates can do the runtime check by hand with .try_into().unwrap() if they want.