# Common Rust Lifetime Misconceptions

_19 May 2020 · #rust · #lifetimes_

**Table of Contents**
- [Intro](#intro)
- [The Misconceptions](#the-misconceptions)
    - [1) `T` only contains owned types](#1-t-only-contains-owned-types)
    - [2) if `T: 'static` then `T` must be valid for the entire program](#2-if-t-static-then-t-must-be-valid-for-the-entire-program)
    - [3) `&'a T` and `T: 'a` are the same thing](#3-a-t-and-t-a-are-the-same-thing)
    - [4) my code isn't generic and doesn't have lifetimes](#4-my-code-isnt-generic-and-doesnt-have-lifetimes)
    - [5) if it compiles then my lifetime annotations are correct](#5-if-it-compiles-then-my-lifetime-annotations-are-correct)
    - [6) boxed trait objects don't have lifetimes](#6-boxed-trait-objects-dont-have-lifetimes)
    - [7) compiler error messages will tell me how to fix my program](#7-compiler-error-messages-will-tell-me-how-to-fix-my-program)
    - [8) lifetimes can grow and shrink at run-time](#8-lifetimes-can-grow-and-shrink-at-run-time)
    - [9) downgrading mut refs to shared refs is safe](#9-downgrading-mut-refs-to-shared-refs-is-safe)
    - [10) closures follow the same lifetime elision rules as functions](#10-closures-follow-the-same-lifetime-elision-rules-as-functions)
- [Conclusion](#conclusion)
- [Discuss](#discuss)
- [Notifications](#notifications)
- [Further Reading](#further-reading)

## Intro

I've held all of these misconceptions at some point and I see many beginners struggle with these misconceptions today. Some of my terminology might be non-standard, so here's a table of shorthand phrases I use and what I intend for them to mean.

| Phrase | Shorthand for |
|-|-|
| `T` | 1) a set containing all possible types _or_<br>2) some type within that set |
| owned type | some non-reference type, e.g. `i32`, `String`, `Vec`, etc |
| 1) borrowed type _or_<br>2) ref type | some reference type regardless of mutability, e.g. `&i32`, `&mut i32`, etc |
| 1) mut ref _or_<br>2) exclusive ref | exclusive mutable reference, i.e. `&mut T` |
| 1) immut ref _or_<br>2) shared ref | shared immutable reference, i.e. `&T` |

## The Misconceptions

In a nutshell: A variable's lifetime is how long the data it points to can be statically verified by the compiler to be valid at its current memory address. I'll now spend the next ~6500 words going into more detail about where people commonly get confused.

### 1) `T` only contains owned types

This misconception is more about generics than lifetimes but generics and lifetimes are tightly intertwined in Rust so it's not possible to talk about one without also talking about the other. Anyway:

When I first started learning Rust I understood that `i32`, `&i32`, and `&mut i32` are different types. I also understood that some generic type variable `T` represents a set which contains all possible types. However, despite understanding both of these things separately, I wasn't able to understand them together. In my newbie Rust mind this is how I thought generics worked:

| | | | |
|-|-|-|-|
| **Type Variable** | `T` | `&T` | `&mut T` |
| **Examples** | `i32` | `&i32` | `&mut i32` |

`T` contains all owned types. `&T` contains all immutably borrowed types. `&mut T` contains all mutably borrowed types. `T`, `&T`, and `&mut T` are disjoint finite sets. Nice, simple, clean, easy, intuitive, and completely totally wrong. This is how generics actually work in Rust:

| | | | |
|-|-|-|-|
| **Type Variable** | `T` | `&T` | `&mut T` |
| **Examples** | `i32`, `&i32`, `&mut i32`, `&&i32`, `&mut &mut i32`, ... | `&i32`, `&&i32`, `&&mut i32`, ... | `&mut i32`, `&mut &mut i32`, `&mut &i32`, ... |

`T`, `&T`, and `&mut T` are all infinite sets, since it's possible to borrow a type ad-infinitum. `T` is a superset of both `&T` and `&mut T`. `&T` and `&mut T` are disjoint sets.

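
One way to poke at these set relationships is a generic function with an unconstrained `T`. This is only a sketch: `type_name_of` and `inferred_names` are hypothetical helpers made up for illustration, and the exact strings returned by `std::any::type_name` are documented as diagnostic output rather than a stable guarantee.

```rust
use std::any::type_name;

// Hypothetical helper: accepts any `T` and reports which type was inferred.
fn type_name_of<T>(_value: T) -> &'static str {
    type_name::<T>()
}

// The same type variable `T` is instantiated with an owned type,
// a shared ref, a mut ref, and a ref to a ref.
fn inferred_names() -> [&'static str; 4] {
    let mut n = 5_i32;
    [
        type_name_of(n),      // T = i32
        type_name_of(&n),     // T = &i32
        type_name_of(&mut n), // T = &mut i32
        type_name_of(&&n),    // T = &&i32
    ]
}
```

Every call goes through the same `T` parameter, which is what we'd expect if `T` really does contain the reference types too.
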
Here are a couple of examples which validate these concepts:

```rust
trait Trait {}

impl<T> Trait for T {}

impl<T> Trait for &T {} // ❌

impl<T> Trait for &mut T {} // ❌
```

As expected, the above program doesn't compile:

```none
error[E0119]: conflicting implementations of trait `Trait` for type `&_`:
 --> src/lib.rs:5:1
  |
3 | impl<T> Trait for T {}
  | ------------------- first implementation here
4 |
5 | impl<T> Trait for &T {}
  | ^^^^^^^^^^^^^^^^^^^^ conflicting implementation for `&_`

error[E0119]: conflicting implementations of trait `Trait` for type `&mut _`:
 --> src/lib.rs:7:1
  |
3 | impl<T> Trait for T {}
  | ------------------- first implementation here
...
7 | impl<T> Trait for &mut T {}
  | ^^^^^^^^^^^^^^^^^^^^^^^^ conflicting implementation for `&mut _`
```

The compiler doesn't allow us to define an implementation of `Trait` for `&T` and `&mut T` since it would conflict with the implementation of `Trait` for `T` which already includes all of `&T` and `&mut T`. The program below compiles as expected, since `&T` and `&mut T` are disjoint:

```rust
trait Trait {}

impl<T> Trait for &T {} // ✅

impl<T> Trait for &mut T {} // ✅
```

**Key Takeaways**
- `T` is a superset of both `&T` and `&mut T`
- `&T` and `&mut T` are disjoint sets

### 2) if `T: 'static` then `T` must be valid for the entire program

**Misconception Corollaries**
- `T: 'static` should be read as _"`T` has a `'static` lifetime"_
- `&'static T` and `T: 'static` are the same thing
- if `T: 'static` then `T` must be immutable
- if `T: 'static` then `T` can only be created at compile time

Most Rust beginners get introduced to the `'static` lifetime for the first time in a code example that looks something like this:

```rust
fn main() {
    let str_literal: &'static str = "str literal";
}
```

They get told that `"str literal"` is hardcoded into the compiled binary and is loaded into read-only memory at run-time so it's immutable and valid for the entire program and that's what makes it `'static`.

These concepts are further reinforced by the rules surrounding defining `static` variables using the `static` keyword.

```rust
// Note: This example is purely for illustrative purposes.
// Never use `static mut`. It's a footgun. There are
// safe patterns for global mutable singletons in Rust but
// those are outside the scope of this article.

static BYTES: [u8; 3] = [1, 2, 3];
static mut MUT_BYTES: [u8; 3] = [1, 2, 3];

fn main() {
    MUT_BYTES[0] = 99; // ❌ - mutating static is unsafe

    unsafe {
        MUT_BYTES[0] = 99;
        assert_eq!(99, MUT_BYTES[0]);
    }
}
```

Regarding `static` variables
- they can only be created at compile-time
- they should be immutable, mutating them is unsafe
- they're valid for the entire program

The `'static` lifetime was probably named after the default lifetime of `static` variables, right? So it makes sense that the `'static` lifetime has to follow all the same rules, right?

Well yes, but a type _with_ a `'static` lifetime is different from a type _bounded by_ a `'static` lifetime. The latter can be dynamically allocated at run-time, can be safely and freely mutated, can be dropped, and can live for arbitrary durations.

It's important at this point to distinguish `&'static T` from `T: 'static`.

`&'static T` is an immutable reference to some `T` that can be safely held indefinitely long, including up until the end of the program. This is only possible if `T` itself is immutable and does not move _after the reference was created_. `T` does not need to be created at compile-time. It's possible to generate random dynamically allocated data at run-time and return `'static` references to it at the cost of leaking memory, e.g.

```rust
use rand;

// generate random 'static str refs at run-time
fn rand_str_generator() -> &'static str {
    let rand_string = rand::random::<u64>().to_string();
    Box::leak(rand_string.into_boxed_str())
}
```

`T: 'static` is some `T` that can be safely held indefinitely long, including up until the end of the program.
`T: 'static` includes all `&'static T` however it also includes all owned types, like `String`, `Vec`, etc. The owner of some data is guaranteed that data will never get invalidated as long as the owner holds onto it, therefore the owner can safely hold onto the data indefinitely long, including up until the end of the program. `T: 'static` should be read as _"`T` is bounded by a `'static` lifetime"_ not _"`T` has a `'static` lifetime"_. A program to help illustrate these concepts:

```rust
use rand;

fn drop_static<T: 'static>(t: T) {
    std::mem::drop(t);
}

fn main() {
    let mut strings: Vec<String> = Vec::new();
    for _ in 0..10 {
        if rand::random() {
            // all the strings are randomly generated
            // and dynamically allocated at run-time
            let string = rand::random::<u64>().to_string();
            strings.push(string);
        }
    }

    // strings are owned types so they're bounded by 'static
    for mut string in strings {
        // all the strings are mutable
        string.push_str("a mutation");
        // all the strings are droppable
        drop_static(string); // ✅
    }

    // all the strings have been invalidated before the end of the program
    println!("I am the end of the program");
}
```

**Key Takeaways**
- `T: 'static` should be read as _"`T` is bounded by a `'static` lifetime"_
- if `T: 'static` then `T` can be a borrowed type with a `'static` lifetime _or_ an owned type
- since `T: 'static` includes owned types that means `T`
    - can be dynamically allocated at run-time
    - does not have to be valid for the entire program
    - can be safely and freely mutated
    - can be dynamically dropped at run-time
    - can have lifetimes of different durations

### 3) `&'a T` and `T: 'a` are the same thing

This misconception is a generalized version of the one above.

`&'a T` requires and implies `T: 'a` since a reference to `T` of lifetime `'a` cannot be valid for `'a` if `T` itself is not valid for `'a`.
For example, the Rust compiler will never allow the construction of the type `&'static Ref<'a, T>` because if `Ref` is only valid for `'a` we can't make a `'static` reference to it.

`T: 'a` includes all `&'a T` but the reverse is not true.

```rust
// only takes ref types bounded by 'a
fn t_ref<'a, T: 'a>(t: &'a T) {}

// takes any types bounded by 'a
fn t_bound<'a, T: 'a>(t: T) {}

// owned type which contains a reference
struct Ref<'a, T: 'a>(&'a T);

fn main() {
    let string = String::from("string");

    t_bound(&string); // ✅
    t_bound(Ref(&string)); // ✅
    t_bound(&Ref(&string)); // ✅

    t_ref(&string); // ✅
    t_ref(Ref(&string)); // ❌ - expected ref, found struct
    t_ref(&Ref(&string)); // ✅

    // string var is bounded by 'static which is bounded by 'a
    t_bound(string); // ✅
}
```

**Key Takeaways**
- `T: 'a` is more general and more flexible than `&'a T`
- `T: 'a` accepts owned types, owned types which contain references, and references
- `&'a T` only accepts references
- if `T: 'static` then `T: 'a` since `'static` >= `'a` for all `'a`

### 4) my code isn't generic and doesn't have lifetimes

**Misconception Corollaries**
- it's possible to avoid using generics and lifetimes

This comforting misconception is kept alive thanks to Rust's lifetime elision rules, which allow you to omit lifetime annotations in functions because the Rust borrow checker will infer them following these rules:
- every input ref to a function gets a distinct lifetime
- if there's exactly one input lifetime it gets applied to all output refs
- if there are multiple input lifetimes but one of them is `&self` or `&mut self` then the lifetime of `self` is applied to all output refs
- otherwise output lifetimes have to be made explicit

That's a lot to take in so let's look at some examples:

```rust
// elided
fn print(s: &str);

// expanded
fn print<'a>(s: &'a str);

// elided
fn trim(s: &str) -> &str;

// expanded
fn trim<'a>(s: &'a str) -> &'a str;

// illegal, can't determine output lifetime, no inputs
fn get_str() -> &str;

// explicit options include
fn get_str<'a>() -> &'a str; // generic version
fn get_str() -> &'static str; // 'static version

// illegal, can't determine output lifetime, multiple inputs
fn overlap(s: &str, t: &str) -> &str;

// explicit (but still partially elided) options include
fn overlap<'a>(s: &'a str, t: &str) -> &'a str; // output can't outlive s
fn overlap<'a>(s: &str, t: &'a str) -> &'a str; // output can't outlive t
fn overlap<'a>(s: &'a str, t: &'a str) -> &'a str; // output can't outlive s & t
fn overlap(s: &str, t: &str) -> &'static str; // output can outlive s & t
fn overlap<'a>(s: &str, t: &str) -> &'a str; // no relationship between input & output lifetimes

// expanded
fn overlap<'a, 'b>(s: &'a str, t: &'b str) -> &'a str;
fn overlap<'a, 'b>(s: &'a str, t: &'b str) -> &'b str;
fn overlap<'a>(s: &'a str, t: &'a str) -> &'a str;
fn overlap<'a, 'b>(s: &'a str, t: &'b str) -> &'static str;
fn overlap<'a, 'b, 'c>(s: &'a str, t: &'b str) -> &'c str;

// elided
fn compare(&self, s: &str) -> &str;

// expanded
fn compare<'a, 'b>(&'a self, s: &'b str) -> &'a str;
```

If you've ever written
- a struct method
- a function which takes references
- a function which returns references
- a generic function
- a trait object (more on this later)
- a closure (more on this later)

then your code has generic elided lifetime annotations all over it.

**Key Takeaways**
- almost all Rust code is generic code and there are elided lifetime annotations everywhere

### 5) if it compiles then my lifetime annotations are correct

**Misconception Corollaries**
- Rust's lifetime elision rules for functions are always right
- Rust's borrow checker is always right, technically _and semantically_
- Rust knows more about the semantics of my program than I do

It's possible for a Rust program to be technically compilable but still semantically wrong.
Take this for example:

```rust
struct ByteIter<'a> {
    remainder: &'a [u8]
}

impl<'a> ByteIter<'a> {
    fn next(&mut self) -> Option<&u8> {
        if self.remainder.is_empty() {
            None
        } else {
            let byte = &self.remainder[0];
            self.remainder = &self.remainder[1..];
            Some(byte)
        }
    }
}

fn main() {
    let mut bytes = ByteIter { remainder: b"1" };
    assert_eq!(Some(&b'1'), bytes.next());
    assert_eq!(None, bytes.next());
}
```

`ByteIter` is an iterator that iterates over a slice of bytes. We're skipping the `Iterator` trait implementation for conciseness. It seems to work fine, but what if we want to check a couple bytes at a time?

```rust
fn main() {
    let mut bytes = ByteIter { remainder: b"1123" };
    let byte_1 = bytes.next();
    let byte_2 = bytes.next();
    if byte_1 == byte_2 { // ❌
        // do something
    }
}
```

Uh oh! Compile error:

```none
error[E0499]: cannot borrow `bytes` as mutable more than once at a time
  --> src/main.rs:20:18
   |
19 |     let byte_1 = bytes.next();
   |                  ----- first mutable borrow occurs here
20 |     let byte_2 = bytes.next();
   |                  ^^^^^ second mutable borrow occurs here
21 |     if byte_1 == byte_2 {
   |        ------ first borrow later used here
```

I guess we can copy each byte. Copying is okay when we're working with bytes but if we turned `ByteIter` into a generic slice iterator that can iterate over any `&'a [T]` then we might want to use it in the future with types that may be very expensive or impossible to copy and clone. Oh well, I guess there's nothing we can do about that, the code compiles so the lifetime annotations must be right, right?

Nope, the current lifetime annotations are actually the source of the bug! It's particularly hard to spot because the buggy lifetime annotations are elided.
Let's expand the elided lifetimes to get a clearer look at the problem:

```rust
struct ByteIter<'a> {
    remainder: &'a [u8]
}

impl<'a> ByteIter<'a> {
    fn next<'b>(&'b mut self) -> Option<&'b u8> {
        if self.remainder.is_empty() {
            None
        } else {
            let byte = &self.remainder[0];
            self.remainder = &self.remainder[1..];
            Some(byte)
        }
    }
}
```

That didn't help at all. I'm still confused. Here's a hot tip that only Rust pros know: give your lifetime annotations descriptive names. Let's try again:

```rust
struct ByteIter<'remainder> {
    remainder: &'remainder [u8]
}

impl<'remainder> ByteIter<'remainder> {
    fn next<'mut_self>(&'mut_self mut self) -> Option<&'mut_self u8> {
        if self.remainder.is_empty() {
            None
        } else {
            let byte = &self.remainder[0];
            self.remainder = &self.remainder[1..];
            Some(byte)
        }
    }
}
```

Each returned byte is annotated with `'mut_self` but the bytes are clearly coming from `'remainder`! Let's fix it.

```rust
struct ByteIter<'remainder> {
    remainder: &'remainder [u8]
}

impl<'remainder> ByteIter<'remainder> {
    fn next(&mut self) -> Option<&'remainder u8> {
        if self.remainder.is_empty() {
            None
        } else {
            let byte = &self.remainder[0];
            self.remainder = &self.remainder[1..];
            Some(byte)
        }
    }
}

fn main() {
    let mut bytes = ByteIter { remainder: b"1123" };
    let byte_1 = bytes.next();
    let byte_2 = bytes.next();
    std::mem::drop(bytes); // we can even drop the iterator now!
    if byte_1 == byte_2 { // ✅
        // do something
    }
}
```

Now that we look back on the previous version of our program it was obviously wrong, so why did Rust compile it? The answer is simple: it was memory safe.

The Rust borrow checker only cares about the lifetime annotations in a program to the extent it can use them to statically verify the memory safety of the program. Rust will happily compile programs even if the lifetime annotations have semantic errors, and the consequence of this is that the program becomes unnecessarily restrictive.

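
To make the cost of the overly restrictive annotation more concrete, here's a rough sketch of the generic slice iterator mentioned earlier, using the corrected annotation. The names `SliceIter` and `first_two_equal` are made up for illustration and aren't part of the original program. Because the items borrow from `'remainder` rather than from the iterator, two of them can be held at once without copying or cloning.

```rust
// Hypothetical generic version of `ByteIter`.
struct SliceIter<'remainder, T> {
    remainder: &'remainder [T],
}

impl<'remainder, T> SliceIter<'remainder, T> {
    // Items are tied to the slice's lifetime, not to `&mut self`.
    fn next(&mut self) -> Option<&'remainder T> {
        let (first, rest) = self.remainder.split_first()?;
        self.remainder = rest;
        Some(first)
    }
}

// Returns true if `items` has at least two items and the first two are equal.
// No `Copy` or `Clone` bound is needed on `T`.
fn first_two_equal<T: PartialEq>(items: &[T]) -> bool {
    let mut iter = SliceIter { remainder: items };
    let first = iter.next();
    let second = iter.next();
    drop(iter); // both items outlive the iterator
    first.is_some() && first == second
}
```

With the elided `&mut self` lifetime on the return type instead, `first_two_equal` would hit the same E0499 error we saw above.
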
Here's a quick example that's the opposite of the previous example: Rust's lifetime elision rules happen to be semantically correct in this instance but we unintentionally write a very restrictive method with our own unnecessary explicit lifetime annotations.

```rust
#[derive(Debug)]
struct NumRef<'a>(&'a i32);

impl<'a> NumRef<'a> {
    // my struct is generic over 'a so that means I need to annotate
    // my self parameters with 'a too, right? (answer: no, not right)
    fn some_method(&'a mut self) {}
}

fn main() {
    let mut num_ref = NumRef(&5);
    num_ref.some_method(); // mutably borrows num_ref for the rest of its lifetime
    num_ref.some_method(); // ❌
    println!("{:?}", num_ref); // ❌
}
```

If we have some struct generic over `'a` we almost never want to write a method with a `&'a mut self` receiver. What we're communicating to Rust is _"this method will mutably borrow the struct for the entirety of the struct's lifetime"_. In practice this means Rust's borrow checker will only allow at most one call to `some_method` before the struct becomes permanently mutably borrowed and thus unusable. The use-cases for this are extremely rare but the code above is very easy for confused beginners to write and it compiles.
The fix is to not add unnecessary explicit lifetime annotations and let Rust's lifetime elision rules handle it:

```rust
#[derive(Debug)]
struct NumRef<'a>(&'a i32);

impl<'a> NumRef<'a> {
    // no more 'a on mut self
    fn some_method(&mut self) {}

    // above line desugars to
    fn some_method_desugared<'b>(&'b mut self){}
}

fn main() {
    let mut num_ref = NumRef(&5);
    num_ref.some_method();
    num_ref.some_method(); // ✅
    println!("{:?}", num_ref); // ✅
}
```

**Key Takeaways**
- Rust's lifetime elision rules for functions are not always right for every situation
- Rust does not know more about the semantics of your program than you do
- give your lifetime annotations descriptive names
- try to be mindful of where you place explicit lifetime annotations and why

### 6) boxed trait objects don't have lifetimes

Earlier we discussed Rust's lifetime elision rules _for functions_. Rust also has lifetime elision rules for trait objects, which are:
- if a trait object is used as a type argument to a generic type then its lifetime bound is inferred from the containing type
    - if there's a unique bound from the containing type then that's used
    - if there's more than one bound from the containing type then an explicit bound must be specified
- if the above doesn't apply then
    - if the trait is defined with a single lifetime bound then that bound is used
    - if `'static` is used for any lifetime bound then `'static` is used
    - if the trait has no lifetime bounds then its lifetime is inferred in expressions and is `'static` outside of expressions

All of that sounds super complicated but can be simply summarized as _"a trait object's lifetime bound is inferred from context."_ After looking at a handful of examples we'll see the lifetime bound inferences are pretty intuitive so we don't have to memorize the formal rules:

```rust
use std::cell::Ref;

trait Trait {}

// elided
type T1 = Box<dyn Trait>;
// expanded, Box<T> has no lifetime bound on T, so inferred as 'static
type T2 = Box<dyn Trait + 'static>;

// elided
impl dyn Trait {}
// expanded
impl dyn Trait + 'static {}

// elided
type T3<'a> = &'a dyn Trait;
// expanded, &'a T requires T: 'a, so inferred as 'a
type T4<'a> = &'a (dyn Trait + 'a);

// elided
type T5<'a> = Ref<'a, dyn Trait>;
// expanded, Ref<'a, T> requires T: 'a, so inferred as 'a
type T6<'a> = Ref<'a, dyn Trait + 'a>;

trait GenericTrait<'a>: 'a {}

// elided
type T7<'a> = Box<dyn GenericTrait<'a>>;
// expanded
type T8<'a> = Box<dyn GenericTrait<'a> + 'a>;

// elided
impl<'a> dyn GenericTrait<'a> {}
// expanded
impl<'a> dyn GenericTrait<'a> + 'a {}
```

Concrete types which implement traits can have references and thus they also have lifetime bounds, and so their corresponding trait objects have lifetime bounds. Also you can implement traits directly for references which obviously have lifetime bounds:

```rust
trait Trait {}

struct Struct {}
struct Ref<'a, T>(&'a T);

impl Trait for Struct {}
impl Trait for &Struct {} // impl Trait directly on a ref type
impl<'a, T> Trait for Ref<'a, T> {} // impl Trait on a type containing refs
```

Anyway, this is worth going over because it often confuses beginners when they refactor a function from using trait objects to generics or vice versa.
Take this program for example:

```rust
use std::fmt::Display;

fn dynamic_thread_print(t: Box<dyn Display + Send>) {
    std::thread::spawn(move || {
        println!("{}", t);
    }).join();
}

fn static_thread_print<T: Display + Send>(t: T) { // ❌
    std::thread::spawn(move || {
        println!("{}", t);
    }).join();
}
```

It throws this compile error:

```none
error[E0310]: the parameter type `T` may not live long enough
  --> src/lib.rs:10:5
   |
9  | fn static_thread_print<T: Display + Send>(t: T) {
   |                        -- help: consider adding an explicit lifetime bound...: `T: 'static +`
10 |     std::thread::spawn(move || {
   |     ^^^^^^^^^^^^^^^^^^
   |
note: ...so that the type `[closure@src/lib.rs:10:24: 12:6 t:T]` will meet its required lifetime bounds
  --> src/lib.rs:10:5
   |
10 |     std::thread::spawn(move || {
   |     ^^^^^^^^^^^^^^^^^^
```

Okay great, the compiler tells us how to fix the issue so let's fix the issue.

```rust
use std::fmt::Display;

fn dynamic_thread_print(t: Box<dyn Display + Send>) {
    std::thread::spawn(move || {
        println!("{}", t);
    }).join();
}

fn static_thread_print<T: Display + Send + 'static>(t: T) { // ✅
    std::thread::spawn(move || {
        println!("{}", t);
    }).join();
}
```

It compiles now but these two functions look awkward next to each other. Why does the second function require a `'static` bound on `T` where the first function doesn't? That's a trick question. Using the lifetime elision rules Rust automatically infers a `'static` bound in the first function so both actually have `'static` bounds.
This is what the Rust compiler sees:

```rust
use std::fmt::Display;

fn dynamic_thread_print(t: Box<dyn Display + Send + 'static>) {
    std::thread::spawn(move || {
        println!("{}", t);
    }).join();
}

fn static_thread_print<T: Display + Send + 'static>(t: T) {
    std::thread::spawn(move || {
        println!("{}", t);
    }).join();
}
```

**Key Takeaways**
- all trait objects have some inferred default lifetime bounds

### 7) compiler error messages will tell me how to fix my program

**Misconception Corollaries**
- Rust's lifetime elision rules for trait objects are always right
- Rust knows more about the semantics of my program than I do

This misconception is the previous two misconceptions combined into one example:

```rust
use std::fmt::Display;

fn box_displayable<T: Display>(t: T) -> Box<dyn Display> { // ❌
    Box::new(t)
}
```

Throws this error:

```none
error[E0310]: the parameter type `T` may not live long enough
 --> src/lib.rs:4:5
  |
3 | fn box_displayable<T: Display>(t: T) -> Box<dyn Display> {
  |                    -- help: consider adding an explicit lifetime bound...: `T: 'static +`
4 |     Box::new(t)
  |     ^^^^^^^^^^^
  |
note: ...so that the type `T` will meet its required lifetime bounds
 --> src/lib.rs:4:5
  |
4 |     Box::new(t)
  |     ^^^^^^^^^^^
```

Okay, let's fix it how the compiler is telling us to fix it, never mind the fact that it's automatically inferring a `'static` lifetime bound for our boxed trait object without telling us and its recommended fix is based on that unstated fact:

```rust
use std::fmt::Display;

fn box_displayable<T: Display + 'static>(t: T) -> Box<dyn Display> { // ✅
    Box::new(t)
}
```

So the program compiles now... but is this what we actually want? Probably, but maybe not. The compiler didn't mention any other fixes but this would have also been appropriate:

```rust
use std::fmt::Display;

fn box_displayable<'a, T: Display + 'a>(t: T) -> Box<dyn Display + 'a> { // ✅
    Box::new(t)
}
```

This function accepts all the same arguments as the previous version plus a lot more!
Does that make it better? Not necessarily, it depends on the requirements and constraints of our program. This example is a bit abstract so let's take a look at a simpler and more obvious case:

```rust
fn return_first(a: &str, b: &str) -> &str { // ❌
    a
}
```

Throws:

```none
error[E0106]: missing lifetime specifier
 --> src/lib.rs:1:38
  |
1 | fn return_first(a: &str, b: &str) -> &str {
  |                    ----     ----     ^ expected named lifetime parameter
  |
  = help: this function's return type contains a borrowed value, but the signature does not say whether it is borrowed from `a` or `b`
help: consider introducing a named lifetime parameter
  |
1 | fn return_first<'a>(a: &'a str, b: &'a str) -> &'a str {
  |                ^^^^    ^^^^^^^     ^^^^^^^     ^^^
```

The error message recommends annotating both inputs and the output with the same lifetime. If we did this our program would compile but this function would overly-constrain the return type. What we actually want is this:

```rust
fn return_first<'a>(a: &'a str, b: &str) -> &'a str { // ✅
    a
}
```

**Key Takeaways**
- Rust's lifetime elision rules for trait objects are not always right for every situation
- Rust does not know more about the semantics of your program than you do
- Rust compiler error messages suggest fixes which will make your program compile which is not the same as fixes which will make your program compile _and_ best suit the requirements of your program

### 8) lifetimes can grow and shrink at run-time

**Misconception Corollaries**
- container types can swap references at run-time to change their lifetime
- Rust borrow checker does advanced control flow analysis

This does not compile:

```rust
struct Has<'lifetime> {
    lifetime: &'lifetime str,
}

fn main() {
    let long = String::from("long");
    let mut has = Has { lifetime: &long };
    assert_eq!(has.lifetime, "long");

    {
        let short = String::from("short");
        // "switch" to short lifetime
        has.lifetime = &short;
        assert_eq!(has.lifetime, "short");

        // "switch back" to long lifetime (but not really)
        has.lifetime = &long;
        assert_eq!(has.lifetime, "long");
        // `short` dropped here
    }

    assert_eq!(has.lifetime, "long"); // ❌ - `short` still "borrowed" after drop
}
```

It throws:

```none
error[E0597]: `short` does not live long enough
  --> src/main.rs:11:24
   |
11 |         has.lifetime = &short;
   |                        ^^^^^^ borrowed value does not live long enough
...
15 |     }
   |     - `short` dropped here while still borrowed
16 |     assert_eq!(has.lifetime, "long");
   |     --------------------------------- borrow later used here
```

This also does not compile, throws the exact same error as above:

```rust
struct Has<'lifetime> {
    lifetime: &'lifetime str,
}

fn main() {
    let long = String::from("long");
    let mut has = Has { lifetime: &long };
    assert_eq!(has.lifetime, "long");

    // this block will never run
    if false {
        let short = String::from("short");
        // "switch" to short lifetime
        has.lifetime = &short;
        assert_eq!(has.lifetime, "short");

        // "switch back" to long lifetime (but not really)
        has.lifetime = &long;
        assert_eq!(has.lifetime, "long");
        // `short` dropped here
    }

    assert_eq!(has.lifetime, "long"); // ❌ - `short` still "borrowed" after drop
}
```

Lifetimes have to be statically verified at compile-time and the Rust borrow checker only does very basic control flow analysis, so it assumes every block in an `if-else` statement and every match arm in a `match` statement can be taken and then chooses the shortest possible lifetime for the variable. Once a variable is bounded by a lifetime it is bounded by that lifetime _forever_. The lifetime of a variable can only shrink, and all the shrinkage is determined at compile-time.

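
For contrast, here's a sketch of one way to get the intended behavior, assuming the short-lived data only needs to be inspected inside the inner block. Instead of swapping references in and out of a single variable, the short borrow goes into its own value. `read_long_after_short` and `has_short` are hypothetical names used only for this illustration.

```rust
struct Has<'lifetime> {
    lifetime: &'lifetime str,
}

// Hypothetical helper: keeps the long and short borrows in separate values.
fn read_long_after_short() -> String {
    let long = String::from("long");
    let has = Has { lifetime: &long };

    {
        let short = String::from("short");
        // a second value gets its own, shorter lifetime
        let has_short = Has { lifetime: &short };
        assert_eq!(has_short.lifetime, "short");
        // `short` and `has_short` are dropped here
    }

    // `has` was never tied to `short`, so the borrow checker accepts this
    has.lifetime.to_owned()
}
```

Nothing changes at run-time here either: each variable simply gets one lifetime at compile-time and keeps it.
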
**Key Takeaways**
- lifetimes are statically verified at compile-time
- lifetimes cannot grow or shrink or change in any way at run-time
- Rust borrow checker will always choose the shortest possible lifetime for a variable assuming all code paths can be taken

### 9) downgrading mut refs to shared refs is safe

**Misconception Corollaries**
- re-borrowing a reference ends its lifetime and starts a new one

You can pass a mut ref to a function expecting a shared ref because Rust will implicitly re-borrow the mut ref as immutable:

```rust
fn takes_shared_ref(n: &i32) {}

fn main() {
    let mut a = 10;
    takes_shared_ref(&mut a); // ✅
    takes_shared_ref(&*(&mut a)); // above line desugared
}
```

Intuitively this makes sense, since there's no harm in re-borrowing a mut ref as immutable, right? Surprisingly no, as the program below does not compile:

```rust
fn main() {
    let mut a = 10;
    let b: &i32 = &*(&mut a); // re-borrowed as immutable
    let c: &i32 = &a;
    dbg!(b, c); // ❌
}
```

Throws this error:

```none
error[E0502]: cannot borrow `a` as immutable because it is also borrowed as mutable
 --> src/main.rs:4:19
  |
3 |     let b: &i32 = &*(&mut a);
  |                     -------- mutable borrow occurs here
4 |     let c: &i32 = &a;
  |                   ^^ immutable borrow occurs here
5 |     dbg!(b, c);
  |          - mutable borrow later used here
```

A mutable borrow does occur, but it's immediately and unconditionally re-borrowed as immutable and then dropped. Why is Rust treating the immutable re-borrow as if it still has the mut ref's exclusive lifetime?
While there's no issue in the particular example above, allowing the ability to downgrade mut refs to shared refs does indeed introduce potential memory safety issues:

```rust
use std::sync::Mutex;

struct Struct {
    mutex: Mutex<String>
}

impl Struct {
    // downgrades mut self to shared str
    fn get_string(&mut self) -> &str {
        self.mutex.get_mut().unwrap()
    }
    fn mutate_string(&self) {
        // if Rust allowed downgrading mut refs to shared refs
        // then the following line would invalidate any shared
        // refs returned from the get_string method
        *self.mutex.lock().unwrap() = "surprise!".to_owned();
    }
}

fn main() {
    let mut s = Struct {
        mutex: Mutex::new("string".to_owned())
    };
    let str_ref = s.get_string(); // mut ref downgraded to shared ref
    s.mutate_string(); // str_ref invalidated, now a dangling pointer
    dbg!(str_ref); // ❌ - as expected!
}
```

The point here is that when you re-borrow a mut ref as a shared ref you don't get that shared ref without a big gotcha: it extends the mut ref's lifetime for the duration of the re-borrow even if the mut ref itself is dropped. Using the re-borrowed shared ref is very difficult because it's immutable but it can't overlap with any other shared refs. The re-borrowed shared ref has all the cons of a mut ref and all the cons of a shared ref and has the pros of neither. I believe re-borrowing a mut ref as a shared ref should be considered a Rust anti-pattern.
Being aware of this anti-pattern is important so that you can easily spot it when you see code like this:

```rust
// downgrades mut T to shared T
fn some_function<T>(some_arg: &mut T) -> &T;

struct Struct;

impl Struct {
    // downgrades mut self to shared self
    fn some_method(&mut self) -> &Self;

    // downgrades mut self to shared T
    fn other_method(&mut self) -> &T;
}
```

Even if you avoid re-borrows in function and method signatures Rust still does automatic implicit re-borrows so it's easy to bump into this problem without realizing it like so:

```rust
use std::collections::HashMap;

type PlayerID = i32;

#[derive(Debug, Default)]
struct Player {
    score: i32,
}

fn start_game(player_a: PlayerID, player_b: PlayerID, server: &mut HashMap<PlayerID, Player>) {
    // get players from server or create & insert new players if they don't yet exist
    let player_a: &Player = server.entry(player_a).or_default();
    let player_b: &Player = server.entry(player_b).or_default();

    // do something with players
    dbg!(player_a, player_b); // ❌
}
```

The above fails to compile. `or_default()` returns a `&mut Player` which we're implicitly re-borrowing as `&Player` because of our explicit type annotations. To do what we want we have to:

```rust
use std::collections::HashMap;

type PlayerID = i32;

#[derive(Debug, Default)]
struct Player {
    score: i32,
}

fn start_game(player_a: PlayerID, player_b: PlayerID, server: &mut HashMap<PlayerID, Player>) {
    // drop the returned mut Player refs since we can't use them together anyway
    server.entry(player_a).or_default();
    server.entry(player_b).or_default();

    // fetch the players again, getting them immutably this time, without any implicit re-borrows
    let player_a = server.get(&player_a);
    let player_b = server.get(&player_b);

    // do something with players
    dbg!(player_a, player_b); // ✅
}
```

Kinda awkward and clunky but this is the sacrifice we make at the Altar of Memory Safety.

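
If all we need from each player is a small `Copy` value, another option is to copy it out so that no reference survives past the statement that created it. This is just a sketch under that assumption. `starting_scores` is a hypothetical variant of `start_game`, not a replacement for the approach above.

```rust
use std::collections::HashMap;

type PlayerID = i32;

#[derive(Debug, Default)]
struct Player {
    score: i32,
}

// Hypothetical variant of `start_game` that only needs the scores.
fn starting_scores(
    player_a: PlayerID,
    player_b: PlayerID,
    server: &mut HashMap<PlayerID, Player>,
) -> (i32, i32) {
    // `i32` is `Copy`, so each mutable borrow of `server` ends
    // at the end of the statement that created it
    let score_a = server.entry(player_a).or_default().score;
    let score_b = server.entry(player_b).or_default().score;
    (score_a, score_b)
}
```

This avoids the second lookup, but it obviously doesn't help when we need references to the players themselves.
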

**Key Takeaways**
- try not to re-borrow mut refs as shared refs, or you're gonna have a bad time
- re-borrowing a mut ref doesn't end its lifetime, even if the ref is dropped

### 10) closures follow the same lifetime elision rules as functions

This is more of a Rust Gotcha than a misconception.

Closures, despite being functions, do not follow the same lifetime elision rules as functions.

```rust
fn function(x: &i32) -> &i32 {
    x
}

fn main() {
    let closure = |x: &i32| x; // ❌
}
```

Throws:

```none
error: lifetime may not live long enough
 --> src/main.rs:6:29
  |
6 |     let closure = |x: &i32| x;
  |                       -   - ^ returning this value requires that `'1` must outlive `'2`
  |                       |   |
  |                       |   return type of closure is &'2 i32
  |                       let's call the lifetime of this reference `'1`
```

After desugaring we get:

```rust
// input lifetime gets applied to output
fn function<'a>(x: &'a i32) -> &'a i32 {
    x
}

fn main() {
    // input and output each get their own distinct lifetimes
    let closure = for<'a, 'b> |x: &'a i32| -> &'b i32 { x };
    // note: the above line is not valid syntax, but we need it for illustrative purposes
}
```

There's no good reason for this discrepancy. Closures were first implemented with different type inference semantics than functions and now we're stuck with it forever because to unify them at this point would be a breaking change. So how can we explicitly annotate a closure's type?
Our options include:

```rust
fn main() {
    // cast to trait object, becomes unsized, oops, compile error
    let identity: dyn Fn(&i32) -> &i32 = |x: &i32| x;

    // can allocate it on the heap as a workaround but feels clunky
    let identity: Box<dyn Fn(&i32) -> &i32> = Box::new(|x: &i32| x);

    // can skip the allocation and just create a static reference
    let identity: &dyn Fn(&i32) -> &i32 = &|x: &i32| x;

    // previous line desugared :)
    let identity: &'static (dyn for<'a> Fn(&'a i32) -> &'a i32 + 'static) = &|x: &i32| -> &i32 { x };

    // this would be ideal but it's invalid syntax
    let identity: impl Fn(&i32) -> &i32 = |x: &i32| x;

    // this would also be nice but it's also invalid syntax
    let identity = for<'a> |x: &'a i32| -> &'a i32 { x };

    // since "impl trait" works in the function return position
    fn return_identity() -> impl Fn(&i32) -> &i32 {
        |x| x
    }
    let identity = return_identity();

    // more generic version of the previous solution
    fn annotate<T, F>(f: F) -> F where F: Fn(&T) -> &T {
        f
    }
    let identity = annotate(|x: &i32| x);
}
```

As I'm sure you've already noticed from the examples above, when closure types are used as trait bounds they do follow the usual function lifetime elision rules.

There's no real lesson or insight to be had here, it just is what it is.
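
For anyone who wants to poke at this, here's a minimal runnable sketch of the trait-bound workaround on its own. It's the same `annotate` idea as in the list above, just stripped of the lines that intentionally don't compile; the helper name is only illustrative and not part of any library.

```rust
// the `Fn(&T) -> &T` bound is elided like a function signature,
// i.e. it is read as `for<'a> Fn(&'a T) -> &'a T`
fn annotate<T, F>(f: F) -> F
where
    F: Fn(&T) -> &T,
{
    f
}

fn main() {
    // the closure's signature is now inferred from the bound on `annotate`,
    // so input and output share one lifetime
    let identity = annotate(|x: &i32| x);

    let number = 5;
    assert_eq!(5, *identity(&number)); // ✅
}
```

The helper does nothing at run-time, it just hands the closure back. Its only job is to give the compiler an expected signature to check the closure against.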

**Key Takeaways**
- every language has gotchas 🤷

## Conclusion

- `T` is a superset of both `&T` and `&mut T`
- `&T` and `&mut T` are disjoint sets
- `T: 'static` should be read as _"`T` is bounded by a `'static` lifetime"_
- if `T: 'static` then `T` can be a borrowed type with a `'static` lifetime _or_ an owned type
- since `T: 'static` includes owned types that means `T`
    - can be dynamically allocated at run-time
    - does not have to be valid for the entire program
    - can be safely and freely mutated
    - can be dynamically dropped at run-time
    - can have lifetimes of different durations
- `T: 'a` is more general and more flexible than `&'a T`
    - `T: 'a` accepts owned types, owned types which contain references, and references
    - `&'a T` only accepts references
    - if `T: 'static` then `T: 'a` since `'static` >= `'a` for all `'a`
- almost all Rust code is generic code and there's elided lifetime annotations everywhere
- Rust's lifetime elision rules are not always right for every situation
- Rust does not know more about the semantics of your program than you do
- give your lifetime annotations descriptive names
- try to be mindful of where you place explicit lifetime annotations and why
- all trait objects have some inferred default lifetime bounds
- Rust compiler error messages suggest fixes which will make your program compile, which is not the same as fixes which will make your program compile _and_ best suit the requirements of your program
- lifetimes are statically verified at compile-time
- lifetimes cannot grow or shrink or change in any way at run-time
- Rust borrow checker will always choose the shortest possible lifetime for a variable assuming all code paths can be taken
- try not to re-borrow mut refs as shared refs, or you're gonna have a bad time
- re-borrowing a mut ref doesn't end its lifetime, even if the ref is dropped
- every language has gotchas 🤷

## Discuss

Discuss this article on
- [learnrust
subreddit](https://www.reddit.com/r/learnrust/comments/gmrcrq/common_rust_lifetime_misconceptions/) - [official Rust users forum](https://users.rust-lang.org/t/blog-post-common-rust-lifetime-misconceptions/42950) - [Twitter](https://twitter.com/pretzelhammer/status/1263505856903163910) - [rust subreddit](https://www.reddit.com/r/rust/comments/golrsx/common_rust_lifetime_misconceptions/) - [Hackernews](https://news.ycombinator.com/item?id=23279731) - [Github](https://github.com/pretzelhammer/rust-blog/discussions) # Sizedness in Rust _22 July 2020 · #rust · #sizedness_ **Table of Contents** - [Intro](#intro) - [Sizedness](#sizedness) - [`Sized` Trait](#sized-trait) - [`Sized` in Generics](#sized-in-generics) - [Unsized Types](#unsized-types) - [Slices](#slices) - [Trait Objects](#trait-objects) - [Trait Object Limitations](#trait-object-limitations) - [Cannot Cast Unsized Types to Trait Objects](#cannot-cast-unsized-types-to-trait-objects) - [Cannot create Multi-Trait Objects](#cannot-create-multi-trait-objects) - [User-Defined Unsized Types](#user-defined-unsized-types) - [Zero-Sized Types](#zero-sized-types) - [Unit Type](#unit-type) - [User-Defined Unit Structs](#user-defined-unit-structs) - [Never Type](#never-type) - [User-Defined Pseudo Never Types](#user-defined-pseudo-never-types) - [PhantomData](#phantomdata) - [Conclusion](#conclusion) - [Discuss](#discuss) - [Notifications](#notifications) - [Further Reading](#further-reading) ## Intro Sizedness is lowkey one of the most important concepts to understand in Rust. It intersects a bunch of other language features in often subtle ways and only rears its ugly head in the form of _"x doesn't have size known at compile time"_ error messages which every Rustacean is all too familiar with. In this article we'll explore all flavors of sizedness from sized types, to unsized types, to zero-sized types while examining their use-cases, benefits, pain points, and workarounds. 
Table of phrases I use and what they're supposed to mean: | Phrase | Shorthand for | |-|-| | sizedness | property of being sized or unsized | | sized type | type with a known size at compile time | | 1) unsized type _or_<br>2) DST | dynamically-sized type, i.e. size not known at compile time | | ?sized type | type that may or may not be sized | | unsized coercion | coercing a sized type into an unsized type | | ZST | zero-sized type, i.e. instances of the type are 0 bytes in size | | width | single unit of measurement of pointer width | | 1) thin pointer _or_<br>2) single-width pointer | pointer that is _1 width_ | | 1) fat pointer _or_<br>2) double-width pointer | pointer that is _2 widths_ | | 1) pointer _or_<br>2) reference | some pointer of some width, width will be clarified by context | | slice | double-width pointer to a dynamically sized view into some array | ## Sizedness In Rust a type is sized if its size in bytes can be determined at compile-time. Determining a type's size is important for being able to allocate enough space for instances of that type on the stack. Sized types can be passed around by value or by reference. If a type's size can't be determined at compile-time then it's referred to as an unsized type or a DST, Dynamically-Sized Type. Since unsized types can't be placed on the stack they can only be passed around by reference. 
Some examples of sized and unsized types: ```rust use std::mem::size_of; fn main() { // primitives assert_eq!(4, size_of::<i32>()); assert_eq!(8, size_of::<f64>()); // tuples assert_eq!(8, size_of::<(i32, i32)>()); // arrays assert_eq!(0, size_of::<[i32; 0]>()); assert_eq!(12, size_of::<[i32; 3]>()); struct Point { x: i32, y: i32, } // structs assert_eq!(8, size_of::<Point>()); // enums assert_eq!(8, size_of::<Option<i32>>()); // get pointer width, will be // 4 bytes wide on 32-bit targets or // 8 bytes wide on 64-bit targets const WIDTH: usize = size_of::<&()>(); // pointers to sized types are 1 width assert_eq!(WIDTH, size_of::<&i32>()); assert_eq!(WIDTH, size_of::<&mut i32>()); assert_eq!(WIDTH, size_of::<Box<i32>>()); assert_eq!(WIDTH, size_of::<fn(i32) -> i32>()); const DOUBLE_WIDTH: usize = 2 * WIDTH; // unsized struct struct Unsized { unsized_field: [i32], } // pointers to unsized types are 2 widths assert_eq!(DOUBLE_WIDTH, size_of::<&str>()); // slice assert_eq!(DOUBLE_WIDTH, size_of::<&[i32]>()); // slice assert_eq!(DOUBLE_WIDTH, size_of::<&dyn ToString>()); // trait object assert_eq!(DOUBLE_WIDTH, size_of::<Box<dyn ToString>>()); // trait object assert_eq!(DOUBLE_WIDTH, size_of::<&Unsized>()); // user-defined unsized type // unsized types size_of::<str>(); // compile error size_of::<[i32]>(); // compile error size_of::<dyn ToString>(); // compile error size_of::<Unsized>(); // compile error } ``` How we determine the size of sized types is straight-forward: all primitives and pointers have known sizes and all structs, tuples, enums, and arrays are just made up of primitives and pointers or other nested structs, tuples, enums, and arrays so we can just count up the bytes recursively, taking into account extra bytes needed for padding and alignment. 
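
The padding and alignment part of that recursive count is easy to overlook, so here's a small sketch of it. It assumes `#[repr(C)]`, which keeps fields in declaration order (the default Rust representation is free to reorder fields, so its exact sizes aren't guaranteed), and it assumes a target where `u32` is 4-byte aligned, which is the case on mainstream platforms. The `Padded` and `Reordered` structs are made up purely for illustration.

```rust
use std::mem::{align_of, size_of};

// fields laid out in declaration order because of repr(C)
#[allow(dead_code)]
#[repr(C)]
struct Padded {
    a: u8,  // offset 0, followed by 3 bytes of padding
    b: u32, // offset 4, must start on a 4-byte boundary
    c: u8,  // offset 8, followed by 3 bytes of trailing padding
}

// same fields, ordered so the small ones share a 4-byte slot
#[allow(dead_code)]
#[repr(C)]
struct Reordered {
    b: u32, // offset 0
    a: u8,  // offset 4
    c: u8,  // offset 5, followed by 2 bytes of trailing padding
}

fn main() {
    // a struct is as aligned as its most-aligned field
    assert_eq!(4, align_of::<Padded>());
    assert_eq!(4, align_of::<Reordered>());

    // 6 bytes of actual data in both cases, but different total sizes
    assert_eq!(12, size_of::<Padded>());
    assert_eq!(8, size_of::<Reordered>());
}
```

Either way the size is still a compile-time constant, which is all that matters for sizedness.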

We can't determine the size of unsized types for similarly straight-forward reasons: slices can have any number of elements in them and can thus be of any size at run-time, and traits can be implemented by any number of structs or enums, so the value behind a trait object can also be of any size at run-time.

**Pro tips**
- pointers of dynamically sized views into arrays are called slices in Rust, e.g. a `&str` is a _"string slice"_, a `&[i32]` is an _"i32 slice"_
- slices are double-width because they store a pointer to the array and the number of elements in the array
- trait object pointers are double-width because they store a pointer to the data and a pointer to a vtable
- pointers to unsized structs are double-width because they store a pointer to the struct data and the metadata of the struct's unsized field, e.g. the number of elements in a trailing slice
- unsized structs can only have 1 unsized field and it must be the last field in the struct

To really hammer home the point about double-width pointers for unsized types here's a commented code example comparing arrays to slices:

```rust
use std::mem::size_of;

const WIDTH: usize = size_of::<&()>();
const DOUBLE_WIDTH: usize = 2 * WIDTH;

fn main() {
    // data length stored in type
    // an [i32; 3] is an array of three i32s
    let nums: &[i32; 3] = &[1, 2, 3];

    // single-width pointer
    assert_eq!(WIDTH, size_of::<&[i32; 3]>());

    let mut sum = 0;

    // can iterate over nums safely
    // Rust knows it's exactly 3 elements
    for num in nums {
        sum += num;
    }

    assert_eq!(6, sum);

    // unsized coercion from [i32; 3] to [i32]
    // data length now stored in pointer
    let nums: &[i32] = &[1, 2, 3];

    // double-width pointer required to also store data length
    assert_eq!(DOUBLE_WIDTH, size_of::<&[i32]>());

    let mut sum = 0;

    // can iterate over nums safely
    // Rust reads the length (3) from the pointer at run-time
    for num in nums {
        sum += num;
    }

    assert_eq!(6, sum);
}
```

And here's another commented code example comparing structs to trait objects:

```rust
use std::mem::size_of;

const WIDTH: usize = size_of::<&()>();
const DOUBLE_WIDTH: usize = 2 * WIDTH;

trait
Trait { fn print(&self); } struct Struct; struct Struct2; impl Trait for Struct { fn print(&self) { println!("struct"); } } impl Trait for Struct2 { fn print(&self) { println!("struct2"); } } fn print_struct(s: &Struct) { // always prints "struct" // this is known at compile-time s.print(); // single-width pointer assert_eq!(WIDTH, size_of::<&Struct>()); } fn print_struct2(s2: &Struct2) { // always prints "struct2" // this is known at compile-time s2.print(); // single-width pointer assert_eq!(WIDTH, size_of::<&Struct2>()); } fn print_trait(t: &dyn Trait) { // print "struct" or "struct2" ? // this is unknown at compile-time t.print(); // Rust has to check the pointer at run-time // to figure out whether to use Struct's // or Struct2's implementation of "print" // so the pointer has to be double-width assert_eq!(DOUBLE_WIDTH, size_of::<&dyn Trait>()); } fn main() { // single-width pointer to data let s = &Struct; print_struct(s); // prints "struct" // single-width pointer to data let s2 = &Struct2; print_struct2(s2); // prints "struct2" // unsized coercion from Struct to dyn Trait // double-width pointer to point to data AND Struct's vtable let t: &dyn Trait = &Struct; print_trait(t); // prints "struct" // unsized coercion from Struct2 to dyn Trait // double-width pointer to point to data AND Struct2's vtable let t: &dyn Trait = &Struct2; print_trait(t); // prints "struct2" } ``` **Key Takeaways** - only instances of sized types can be placed on the stack, i.e. can be passed around by value - instances of unsized types can't be placed on the stack and must be passed around by reference - pointers to unsized types are double-width because aside from pointing to data they need to do an extra bit of bookkeeping to also keep track of the data's length _or_ point to a vtable ## `Sized` Trait The `Sized` trait in Rust is an auto trait and a marker trait. Auto traits are traits that get automatically implemented for a type if it passes certain conditions. 
Marker traits are traits that mark a type as having a certain property. Marker traits do not have any trait items such as methods, associated functions, associated constants, or associated types. All auto traits are marker traits but not all marker traits are auto traits. Auto traits must be marker traits so the compiler can provide an automatic default implementation for them, which would not be possible if the trait had any trait items.

A type gets an auto `Sized` implementation if all of its members are also `Sized`. What "members" means depends on the containing type, for example: fields of a struct, variants of an enum, elements of an array, items of a tuple, and so on. Once a type has been "marked" with a `Sized` implementation that means its size in bytes is known at compile time.

Other examples of auto marker traits are the `Send` and `Sync` traits. A type is `Send` if it is safe to send that type across threads. A type is `Sync` if it's safe to share references to that type between threads. A type gets auto `Send` and `Sync` implementations if all of its members are also `Send` and `Sync`. What makes `Sized` somewhat special is that, unlike the other auto marker traits, it's not possible to opt out of it.

```rust
#![feature(negative_impls)]

// this type is Sized, Send, and Sync
struct Struct;

// opt-out of Send trait
impl !Send for Struct {} // ✅

// opt-out of Sync trait
impl !Sync for Struct {} // ✅

// can't opt-out of Sized
impl !Sized for Struct {} // ❌
```

This seems reasonable since there might be reasons why we wouldn't want our type to be sent or shared across threads, however it's hard to imagine a scenario where we'd want the compiler to "forget" the size of our type and treat it as an unsized type as that offers no benefits and merely makes the type more difficult to work with.
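
The negative impls above need a nightly compiler, but the auto-implementation rule for `Send` and `Sync` can be observed on stable too. Here's a minimal sketch of one way to check it; `assert_send`, `assert_sync`, `AllSend`, and `HasRc` are hypothetical names introduced only for this example.

```rust
use std::rc::Rc;

// these only compile when T implements the given auto trait
fn assert_send<T: Send>() {}
fn assert_sync<T: Sync>() {}

// every member is Send + Sync, so the struct is too
#[allow(dead_code)]
struct AllSend {
    number: i32,
    text: String,
}

// Rc<i32> is neither Send nor Sync, so neither is this struct
#[allow(dead_code)]
struct HasRc {
    shared: Rc<i32>,
}

fn main() {
    assert_send::<AllSend>(); // ✅
    assert_sync::<AllSend>(); // ✅

    // uncommenting either line below is a compile error
    // assert_send::<HasRc>(); // ❌
    // assert_sync::<HasRc>(); // ❌
}
```

Nothing was implemented by hand for either struct, the compiler derived both answers from the members alone.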
Also, to be super pedantic `Sized` is not technically an auto trait since it's not defined using the `auto` keyword but the special treatment it gets from the compiler makes it behave very similarly to auto traits so in practice it's okay to think of it as an auto trait. **Key Takeaways** - `Sized` is an "auto" marker trait ## `Sized` in Generics It's not immediately obvious that whenever we write any generic code every generic type parameter gets auto-bound with the `Sized` trait by default. ```rust // this generic function... fn func<T>(t: T) {} // ...desugars to... fn func<T: Sized>(t: T) {} // ...which we can opt-out of by explicitly setting ?Sized... fn func<T: ?Sized>(t: T) {} // ❌ // ...which doesn't compile since it doesn't have // a known size so we must put it behind a pointer... fn func<T: ?Sized>(t: &T) {} // ✅ fn func<T: ?Sized>(t: Box<T>) {} // ✅ ``` **Pro tips** - `?Sized` can be pronounced _"optionally sized"_ or _"maybe sized"_ and adding it to a type parameter's bounds allows the type to be sized or unsized - `?Sized` in general is referred to as a _"widening bound"_ or a _"relaxed bound"_ as it relaxes rather than constrains the type parameter - `?Sized` is the only relaxed bound in Rust So why does this matter? Well, any time we're working with a generic type and that type is behind a pointer we almost always want to opt-out of the default `Sized` bound to make our function more flexible in what argument types it will accept. Also, if we don't opt-out of the default `Sized` bound we'll eventually get some surprising and confusing compile error messages. Let me take you on the journey of the first generic function I ever wrote in Rust. 
I started learning Rust before the `dbg!` macro landed in stable so the only way to print debug values was to type out `println!("{:?}", some_value);` every time which is pretty tedious so I decided to write a `debug` helper function like this: ```rust use std::fmt::Debug; fn debug<T: Debug>(t: T) { // T: Debug + Sized println!("{:?}", t); } fn main() { debug("my str"); // T = &str, &str: Debug + Sized ✅ } ``` So far so good, but the function takes ownership of any values passed to it which is kinda annoying so I changed the function to only take references instead: ```rust use std::fmt::Debug; fn dbg<T: Debug>(t: &T) { // T: Debug + Sized println!("{:?}", t); } fn main() { dbg("my str"); // &T = &str, T = str, str: Debug + !Sized ❌ } ``` Which now throws this error: ```none error[E0277]: the size for values of type `str` cannot be known at compilation time --> src/main.rs:8:9 | 3 | fn dbg<T: Debug>(t: &T) { | - required by this bound in `dbg` ... 8 | dbg("my str"); | ^^^^^^^^ doesn't have a size known at compile-time | = help: the trait `std::marker::Sized` is not implemented for `str` = note: to learn more, visit <https://doc.rust-lang.org/book/ch19-04-advanced-types.html#dynamically-sized-types-and-the-sized-trait> help: consider relaxing the implicit `Sized` restriction | 3 | fn dbg<T: Debug + ?Sized>(t: &T) { | ``` When I first saw this I found it incredibly confusing. Despite making my function more restrictive in what arguments it takes than before it now somehow throws a compile error! What is going on? I've already kinda spoiled the answer in the code comments above, but basically: Rust performs pattern matching when resolving `T` to its concrete types during compilation. 
Here's a couple tables to help clarify: | Type | `T` | `&T` | |------------|---|----| | `&str` | `T` = `&str` | `T` = `str` | | Type | `Sized` | |-|-| | `str` | ❌ | | `&str` | ✅ | | `&&str` | ✅ | This is why I had to add a `?Sized` bound to make the function work as intended after changing it to take references. The working function below: ```rust use std::fmt::Debug; fn debug<T: Debug + ?Sized>(t: &T) { // T: Debug + ?Sized println!("{:?}", t); } fn main() { debug("my str"); // &T = &str, T = str, str: Debug + !Sized ✅ } ``` **Key Takeaways** - all generic type parameters are auto-bound with `Sized` by default - if we have a generic function which takes an argument of some `T` behind a pointer, e.g. `&T`, `Box<T>`, `Rc<T>`, et cetera, then we almost always want to opt-out of the default `Sized` bound with `T: ?Sized` ## Unsized Types ### Slices The most common slices are string slices `&str` and array slices `&[T]`. What's nice about slices is that many other types coerce to them, so leveraging slices and Rust's auto type coercions allow us to write flexible APIs. Type coercions can happen in several places but most notably on function arguments and at method calls. The kinds of type coercions we're interested in are deref coercions and unsized coercions. A deref coercion is when a `T` gets coerced into a `U` following a deref operation, i.e. `T: Deref<Target = U>`, e.g. `String.deref() -> str`. An unsized coercion is when a `T` gets coerced into a `U` where `T` is a sized type and `U` is an unsized type, i.e. `T: Unsize<U>`, e.g. `[i32; 3] -> [i32]`. ```rust trait Trait { fn method(&self) {} } impl Trait for str { // can now call "method" on // 1) str or // 2) String since String: Deref<Target = str> } impl<T> Trait for [T] { // can now call "method" on // 1) any &[T] // 2) any U where U: Deref<Target = [T]>, e.g. 
Vec<T> // 3) [T; N] for any N, since [T; N]: Unsize<[T]> } fn str_fun(s: &str) {} fn slice_fun<T>(s: &[T]) {} fn main() { let str_slice: &str = "str slice"; let string: String = "string".to_owned(); // function args str_fun(str_slice); str_fun(&string); // deref coercion // method calls str_slice.method(); string.method(); // deref coercion let slice: &[i32] = &[1]; let three_array: [i32; 3] = [1, 2, 3]; let five_array: [i32; 5] = [1, 2, 3, 4, 5]; let vec: Vec<i32> = vec![1]; // function args slice_fun(slice); slice_fun(&vec); // deref coercion slice_fun(&three_array); // unsized coercion slice_fun(&five_array); // unsized coercion // method calls slice.method(); vec.method(); // deref coercion three_array.method(); // unsized coercion five_array.method(); // unsized coercion } ``` **Key Takeaways** - leveraging slices and Rust's auto type coercions allows us to write flexible APIs ### Trait Objects Traits are `?Sized` by default. This program: ```rust trait Trait: ?Sized {} ``` Throws this error: ```none error: `?Trait` is not permitted in supertraits --> src/main.rs:1:14 | 1 | trait Trait: ?Sized {} | ^^^^^^ | = note: traits are `?Sized` by default ``` We'll get into why traits are `?Sized` by default soon but first let's ask ourselves what are the implications of a trait being `?Sized`? Let's desugar the above example: ```rust trait Trait where Self: ?Sized {} ``` Okay, so by default traits allow `self` to possibly be an unsized type. As we learned earlier we can't pass unsized types around by value, so that limits us in the kind of methods we can define in the trait. 
It should be impossible to write a method that takes or returns `self` by value and yet this surprisingly compiles:

```rust
trait Trait {
    fn method(self); // ✅
}
```

However the moment we try to implement the method, either by providing a default implementation or by implementing the trait for an unsized type, we get compile errors:

```rust
trait Trait {
    fn method(self) {} // ❌
}

impl Trait for str {
    fn method(self) {} // ❌
}
```

Throws:

```none
error[E0277]: the size for values of type `Self` cannot be known at compilation time
 --> src/lib.rs:2:15
  |
2 |     fn method(self) {}
  |               ^^^^ doesn't have a size known at compile-time
  |
  = help: the trait `std::marker::Sized` is not implemented for `Self`
  = note: to learn more, visit <https://doc.rust-lang.org/book/ch19-04-advanced-types.html#dynamically-sized-types-and-the-sized-trait>
  = note: all local variables must have a statically known size
  = help: unsized locals are gated as an unstable feature
help: consider further restricting `Self`
  |
2 |     fn method(self) where Self: std::marker::Sized {}
  |                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

error[E0277]: the size for values of type `str` cannot be known at compilation time
 --> src/lib.rs:6:15
  |
6 |     fn method(self) {}
  |               ^^^^ doesn't have a size known at compile-time
  |
  = help: the trait `std::marker::Sized` is not implemented for `str`
  = note: to learn more, visit <https://doc.rust-lang.org/book/ch19-04-advanced-types.html#dynamically-sized-types-and-the-sized-trait>
  = note: all local variables must have a statically known size
  = help: unsized locals are gated as an unstable feature
```

If we're determined to pass `self` around by value we can fix the first error by explicitly binding the trait with `Sized`:

```rust
trait Trait: Sized {
    fn method(self) {} // ✅
}

impl Trait for str { // ❌
    fn method(self) {}
}
```

Now throws:

```none
error[E0277]: the size for values of type `str` cannot be known at compilation time
 --> src/lib.rs:7:6
  |
1 | trait Trait: Sized {
  |              ----- required by this bound in
`Trait` ... 7 | impl Trait for str { | ^^^^^ doesn't have a size known at compile-time | = help: the trait `std::marker::Sized` is not implemented for `str` = note: to learn more, visit <https://doc.rust-lang.org/book/ch19-04-advanced-types.html#dynamically-sized-types-and-the-sized-trait> ``` Which is okay, as we knew upon binding the trait with `Sized` we'd no longer be able to implement it for unsized types such as `str`. If on the other hand we really wanted to implement the trait for `str` an alternative solution would be to keep the trait `?Sized` and pass `self` around by reference: ```rust trait Trait { fn method(&self) {} // ✅ } impl Trait for str { fn method(&self) {} // ✅ } ``` Instead of marking the entire trait as `?Sized` or `Sized` we have the more granular and precise option of marking individual methods as `Sized` like so: ```rust trait Trait { fn method(self) where Self: Sized {} } impl Trait for str {} // ✅!? fn main() { "str".method(); // ❌ } ``` It's surprising that Rust compiles `impl Trait for str {}` without any complaints, but it eventually catches the error when we attempt to call `method` on an unsized type so all is fine. It's a little weird but affords us some flexibility in implementing traits with some `Sized` methods for unsized types as long as we never call the `Sized` methods: ```rust trait Trait { fn method(self) where Self: Sized {} fn method2(&self) {} } impl Trait for str {} // ✅ fn main() { // we never call "method" so no errors "str".method2(); // ✅ } ``` Now back to the original question, why are traits `?Sized` by default? The answer is trait objects. Trait objects are inherently unsized because any type of any size can implement a trait, therefore we can only implement `Trait` for `dyn Trait` if `Trait: ?Sized`. 
To put it in code: ```rust trait Trait: ?Sized {} // the above is REQUIRED for impl Trait for dyn Trait { // compiler magic here } // since `dyn Trait` is unsized // and now we can use `dyn Trait` in our program fn function(t: &dyn Trait) {} // ✅ ``` If we try to actually compile the above program we get: ```none error[E0371]: the object type `(dyn Trait + 'static)` automatically implements the trait `Trait` --> src/lib.rs:5:1 | 5 | impl Trait for dyn Trait { | ^^^^^^^^^^^^^^^^^^^^^^^^ `(dyn Trait + 'static)` automatically implements trait `Trait` ``` Which is the compiler telling us to chill since it automatically provides the implementation of `Trait` for `dyn Trait`. Again, since `dyn Trait` is unsized the compiler can only provide this implementation if `Trait: ?Sized`. If we bound `Trait` by `Sized` then `Trait` becomes _"object unsafe"_ which is a term that means we can't cast types which implement `Trait` to trait objects of `dyn Trait`. As expected this program does not compile: ```rust trait Trait: Sized {} fn function(t: &dyn Trait) {} // ❌ ``` Throws: ```none error[E0038]: the trait `Trait` cannot be made into an object --> src/lib.rs:3:18 | 1 | trait Trait: Sized {} | ----- ----- ...because it requires `Self: Sized` | | | this trait cannot be made into an object... 2 | 3 | fn function(t: &dyn Trait) {} | ^^^^^^^^^^ the trait `Trait` cannot be made into an object ``` Let's try to make an `?Sized` trait with a `Sized` method and see if we can cast it to a trait object: ```rust trait Trait { fn method(self) where Self: Sized {} fn method2(&self) {} } fn function(arg: &dyn Trait) { // ✅ arg.method(); // ❌ arg.method2(); // ✅ } ``` As we saw before everything is okay as long as we don't call the `Sized` method on the trait object. 
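
Here's a runnable sketch of that last point, with the offending call commented out. `Struct` and `via_trait_object` are just illustrative names, and the methods return strings only so there's something to assert on.

```rust
trait Trait {
    // only callable on sized types, so it's left out of the trait object
    fn method(self) -> &'static str
    where
        Self: Sized,
    {
        "by value"
    }

    // callable on everything, including trait objects
    fn method2(&self) -> &'static str {
        "by reference"
    }
}

struct Struct;

impl Trait for Struct {}

fn via_trait_object(arg: &dyn Trait) -> &'static str {
    // arg.method(); // ❌ - dyn Trait is unsized
    arg.method2() // ✅
}

fn main() {
    assert_eq!("by reference", via_trait_object(&Struct));

    // the concrete type is sized so the by-value method is still available
    assert_eq!("by value", Struct.method());
}
```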
**Key Takeaways** - all traits are `?Sized` by default - `Trait: ?Sized` is required for `impl Trait for dyn Trait` - we can require `Self: Sized` on a per-method basis - traits bound by `Sized` can't be made into trait objects ### Trait Object Limitations Even if a trait is object-safe there are still sizedness-related edge cases which limit what types can be cast to trait objects and how many and what kind of traits can be represented by a trait object. #### Cannot Cast Unsized Types to Trait Objects ```rust fn generic<T: ToString>(t: T) {} fn trait_object(t: &dyn ToString) {} fn main() { generic(String::from("String")); // ✅ generic("str"); // ✅ trait_object(&String::from("String")); // ✅ - unsized coercion trait_object("str"); // ❌ - unsized coercion impossible } ``` Throws: ```none error[E0277]: the size for values of type `str` cannot be known at compilation time --> src/main.rs:8:18 | 8 | trait_object("str"); | ^^^^^ doesn't have a size known at compile-time | = help: the trait `std::marker::Sized` is not implemented for `str` = note: to learn more, visit <https://doc.rust-lang.org/book/ch19-04-advanced-types.html#dynamically-sized-types-and-the-sized-trait> = note: required for the cast to the object type `dyn std::string::ToString` ``` The reason why passing a `&String` to a function expecting a `&dyn ToString` works is because of type coercion. `String` implements `ToString` and we can convert a sized type such as `String` into an unsized type such as `dyn ToString` via an unsized coercion. `str` also implements `ToString` and converting `str` into a `dyn ToString` would also require an unsized coercion but `str` is already unsized! How do we unsize an already unsized type into another unsized type? `&str` pointers are double-width, storing a pointer to the data and the data length. `&dyn ToString` pointers are also double-width, storing a pointer to the data and a pointer to a vtable. 
To coerce a `&str` into a `&dyn ToString` would require a triple-width pointer to store a pointer to the data, the data length, and a pointer to a vtable. Rust does not support triple-width pointers so casting an unsized type to a trait object is not possible.

Previous two paragraphs summarized in a table:

| Type | Pointer to Data | Data Length | Pointer to VTable | Total Width |
|-|-|-|-|-|
| `&String` | ✅ | ❌ | ❌ | 1 ✅ |
| `&str` | ✅ | ✅ | ❌ | 2 ✅ |
| `&String as &dyn ToString` | ✅ | ❌ | ✅ | 2 ✅ |
| `&str as &dyn ToString` | ✅ | ✅ | ✅ | 3 ❌ |

#### Cannot create Multi-Trait Objects

```rust
trait Trait {}
trait Trait2 {}

fn function(t: &(dyn Trait + Trait2)) {}
```

Throws:

```none
error[E0225]: only auto traits can be used as additional traits in a trait object
 --> src/lib.rs:4:30
  |
4 | fn function(t: &(dyn Trait + Trait2)) {}
  |                      -----   ^^^^^^
  |                      |       |
  |                      |       additional non-auto trait
  |                      |       trait alias used in trait object type (additional use)
  |                      first non-auto trait
  |                      trait alias used in trait object type (first use)
```

Remember that a trait object pointer is double-width: storing 1 pointer to the data and another to the vtable, but there's 2 traits here so there's 2 vtables which would require the `&(dyn Trait + Trait2)` pointer to be 3 widths. Auto-traits like `Sync` and `Send` are allowed since they don't have methods and thus don't have vtables.

The workaround for this is to combine vtables by combining the traits using another trait like so:

```rust
trait Trait {
    fn method(&self) {}
}

trait Trait2 {
    fn method2(&self) {}
}

trait Trait3: Trait + Trait2 {}

// auto blanket impl Trait3 for any type that also impls Trait & Trait2
impl<T: Trait + Trait2> Trait3 for T {}

// from `dyn Trait + Trait2` to `dyn Trait3`
fn function(t: &dyn Trait3) {
    t.method(); // ✅
    t.method2(); // ✅
}
```

One downside of this workaround is that, at the time of writing, Rust does not support supertrait upcasting.
What this means is that if we have a `dyn Trait3` we can't use it where we need a `dyn Trait` or a `dyn Trait2`. This program does not compile: ```rust trait Trait { fn method(&self) {} } trait Trait2 { fn method2(&self) {} } trait Trait3: Trait + Trait2 {} impl<T: Trait + Trait2> Trait3 for T {} struct Struct; impl Trait for Struct {} impl Trait2 for Struct {} fn takes_trait(t: &dyn Trait) {} fn takes_trait2(t: &dyn Trait2) {} fn main() { let t: &dyn Trait3 = &Struct; takes_trait(t); // ❌ takes_trait2(t); // ❌ } ``` Throws: ```none error[E0308]: mismatched types --> src/main.rs:22:17 | 22 | takes_trait(t); | ^ expected trait `Trait`, found trait `Trait3` | = note: expected reference `&dyn Trait` found reference `&dyn Trait3` error[E0308]: mismatched types --> src/main.rs:23:18 | 23 | takes_trait2(t); | ^ expected trait `Trait2`, found trait `Trait3` | = note: expected reference `&dyn Trait2` found reference `&dyn Trait3` ``` This is because `dyn Trait3` is a distinct type from `dyn Trait` and `dyn Trait2` in the sense that they have different vtable layouts, although `dyn Trait3` does contain all the methods of `dyn Trait` and `dyn Trait2`. The workaround here is to add explicit casting methods: ```rust trait Trait {} trait Trait2 {} trait Trait3: Trait + Trait2 { fn as_trait(&self) -> &dyn Trait; fn as_trait2(&self) -> &dyn Trait2; } impl<T: Trait + Trait2> Trait3 for T { fn as_trait(&self) -> &dyn Trait { self } fn as_trait2(&self) -> &dyn Trait2 { self } } struct Struct; impl Trait for Struct {} impl Trait2 for Struct {} fn takes_trait(t: &dyn Trait) {} fn takes_trait2(t: &dyn Trait2) {} fn main() { let t: &dyn Trait3 = &Struct; takes_trait(t.as_trait()); // ✅ takes_trait2(t.as_trait2()); // ✅ } ``` This is a simple and straight-forward workaround that seems like something the Rust compiler could automate for us. Rust is not shy about performing type coercions as we have seen with deref and unsized coercions, so why isn't there a trait upcasting coercion? 
This is a good question with a familiar answer: at the time of writing, the Rust core team is working on other higher-priority and higher-impact features. Fair enough. (Trait upcasting coercion was later stabilized in Rust 1.86, so the explicit casting methods are only needed on older compilers.)

**Key Takeaways**
- Rust doesn't support pointers wider than 2 widths so
    - we can't cast unsized types to trait objects
    - we can't have multi-trait objects, but we can work around this by coalescing multiple traits into a single trait

### User-Defined Unsized Types

```rust
struct Unsized {
    unsized_field: [i32],
}
```

We can define an unsized struct by giving the struct an unsized field. Unsized structs can only have 1 unsized field and it must be the last field in the struct. This is a requirement so that the compiler can determine the starting offset of every field in the struct at compile-time, which is important for efficient and fast field access. Furthermore, a single unsized field is the most that can be tracked using a double-width pointer, as more unsized fields would require more widths.

So how do we even instantiate this thing? The same way we do with any unsized type: by first making a sized version of it then coercing it into the unsized version. However, `Unsized` is always unsized by definition, so there's no way to make a sized version of it! The only workaround is to make the struct generic so that it can exist in both sized and unsized versions:

```rust
struct MaybeSized<T: ?Sized> {
    maybe_sized: T,
}

fn main() {
    // unsized coercion from MaybeSized<[i32; 3]> to MaybeSized<[i32]>
    let ms: &MaybeSized<[i32]> = &MaybeSized { maybe_sized: [1, 2, 3] };
}
```

So what are the use-cases of this? There aren't any particularly compelling ones. User-defined unsized types are a pretty half-baked feature right now and their limitations outweigh any benefits. They're mentioned here purely for the sake of comprehensiveness.

**Fun fact:** `std::ffi::OsStr` and `std::path::Path` are 2 unsized structs in the standard library that you've probably used before without realizing!
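
To make the sized-then-coerced idea a little more concrete, here is a minimal sketch that reuses the `MaybeSized` struct from above and checks pointer widths with `std::mem::size_of`. The helper `sum` is a hypothetical name introduced only for this illustration, and the sizes are expressed relative to `usize` rather than as fixed byte counts.

```rust
use std::mem::{size_of, size_of_val};

struct MaybeSized<T: ?Sized> {
    maybe_sized: T,
}

// hypothetical helper: accepts the unsized version, so arrays of any length work
fn sum(ms: &MaybeSized<[i32]>) -> i32 {
    ms.maybe_sized.iter().sum()
}

fn main() {
    let sized = MaybeSized { maybe_sized: [1, 2, 3] };

    // unsized coercion from &MaybeSized<[i32; 3]> to &MaybeSized<[i32]>
    let unsized_ref: &MaybeSized<[i32]> = &sized;

    // pointer to the sized version is single-width
    assert_eq!(size_of::<&MaybeSized<[i32; 3]>>(), size_of::<usize>());
    // pointer to the unsized version is double-width: data pointer + length
    assert_eq!(size_of::<&MaybeSized<[i32]>>(), 2 * size_of::<usize>());

    // the length stored in the wide pointer lets us recover the size at run-time
    assert_eq!(size_of_val(unsized_ref), 3 * size_of::<i32>());
    assert_eq!(sum(unsized_ref), 6);
}
```

The same coercion also happens at function call sites, so `sum(&MaybeSized { maybe_sized: [4, 5] })` works without an explicit annotation.
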

**Key Takeaways**
- user-defined unsized types are a half-baked feature right now and their limitations outweigh any benefits

## Zero-Sized Types

ZSTs sound exotic at first but they're used everywhere.

### Unit Type

The most common ZST is the unit type: `()`. All empty blocks `{}` evaluate to `()` and if the block is non-empty but the last expression is discarded with a semicolon `;` then it also evaluates to `()`. Example:

```rust
fn main() {
    let a: () = {};
    let b: i32 = {
        5
    };
    let c: () = {
        5;
    };
}
```

Every function which doesn't have an explicit return type returns `()` by default.

```rust
// with sugar
fn function() {}

// desugared
fn function() -> () {}
```

Since `()` is zero bytes all instances of `()` are the same which makes for some really simple `Default`, `PartialEq`, and `Ord` implementations:

```rust
use std::cmp::Ordering;

impl Default for () {
    fn default() {}
}

impl PartialEq for () {
    fn eq(&self, _other: &()) -> bool {
        true
    }
    fn ne(&self, _other: &()) -> bool {
        false
    }
}

impl Ord for () {
    fn cmp(&self, _other: &()) -> Ordering {
        Ordering::Equal
    }
}
```

The compiler understands `()` is zero-sized and optimizes away interactions with instances of `()`. For example, a `Vec<()>` will never make any heap allocations, and pushing and popping `()` from the `Vec` just increments and decrements its `len` field:

```rust
fn main() {
    // zero capacity is all the capacity we need to "store" infinitely many ()
    let mut vec: Vec<()> = Vec::with_capacity(0);
    // causes no heap allocations or vec capacity changes
    vec.push(()); // len++
    vec.push(()); // len++
    vec.push(()); // len++
    vec.pop(); // len--
    assert_eq!(2, vec.len());
}
```

The above example has no practical applications, but is there any situation where we can take advantage of the above idea in a meaningful way?
Surprisingly yes, we can get an efficient `HashSet<Key>` implementation from a `HashMap<Key, Value>` by setting the `Value` to `()` which is exactly how `HashSet` in the Rust standard library works:

```rust
// std::collections::HashSet
pub struct HashSet<T> {
    map: HashMap<T, ()>,
}
```

**Key Takeaways**
- all instances of a ZST are equal to each other
- Rust compiler knows to optimize away interactions with ZSTs

### User-Defined Unit Structs

A unit struct is any struct without any fields, e.g.

```rust
struct Struct;
```

Properties that make unit structs more useful than `()`:
- we can implement whatever traits we want on our own unit structs, whereas Rust's trait orphan rules prevent us from implementing foreign traits (traits defined in other crates) for `()`, since `()` is not a type local to our crate
- unit structs can be given meaningful names within the context of our program
- unit structs, like all structs, are non-Copy by default, which may be important in the context of our program

### Never Type

The second most common ZST is the never type: `!`. It's called the never type because it represents computations that never resolve to any value at all.

A couple of interesting properties of `!` that make it different from `()`:
- `!` can be coerced into any other type
- it's not possible to create instances of `!`

The first interesting property is very useful for ergonomics and allows us to use handy macros like these:

```rust
// nice for quick prototyping
fn example<T>(t: &[T]) -> Vec<T> {
    unimplemented!() // ! coerced to Vec<T>
}

fn example2() -> i32 {
    // we know this parse call will never fail
    match "123".parse::<i32>() {
        Ok(num) => num,
        Err(_) => unreachable!(), // ! coerced to i32
    }
}

fn example3(some_condition: bool) -> &'static str {
    if !some_condition {
        panic!() // ! coerced to &str
    } else {
        "str"
    }
}
```

`break`, `continue`, and `return` expressions also have type `!`:

```rust
fn example() -> i32 {
    // we can set the type of x to anything here
    // since the block never evaluates to any value
    let x: String = {
        return 123 // ! coerced to String
    };
}

fn example2(nums: &[i32]) -> Vec<i32> {
    let mut filtered = Vec::new();
    for num in nums {
        filtered.push(
            if *num < 0 {
                break // ! coerced to i32
            } else if *num % 2 == 0 {
                *num
            } else {
                continue // ! coerced to i32
            }
        );
    }
    filtered
}
```

The second interesting property of `!` allows us to mark certain states as impossible on a type level. Let's take this function signature as an example:

```rust
fn function() -> Result<Success, Error>;
```

We know that if the function returns and was successful the `Result` will contain some instance of type `Success` and if it errored `Result` will contain some instance of type `Error`. Now let's compare that to this function signature:

```rust
fn function() -> Result<Success, !>;
```

We know that if the function returns and was successful the `Result` will hold some instance of type `Success` and if it errored... but wait, it can never error, since it's impossible to create instances of `!`. Given the above function signature we know this function will never error. How about this function signature:

```rust
fn function() -> Result<!, Error>;
```

The inverse of the previous is now true: if this function returns we know it must have errored as success is impossible.
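
Both signatures can be approximated on stable Rust, in the same order as above. This is only a sketch: it assumes `std::convert::Infallible` (an enum with no variants) as a stand-in for `!`, and the function names `never_fails`, `never_succeeds`, `unwrap_infallible`, and `error_of` are hypothetical names invented for the illustration.

```rust
use std::convert::Infallible;

// stand-in for `Result<Success, !>`: an Err value can never be constructed
fn never_fails(input: &str) -> Result<String, Infallible> {
    Ok(input.to_uppercase())
}

// stand-in for `Result<!, Error>`: an Ok value can never be constructed
fn never_succeeds(reason: &str) -> Result<Infallible, String> {
    Err(format!("failed: {}", reason))
}

fn unwrap_infallible(result: Result<String, Infallible>) -> String {
    match result {
        Ok(value) => value,
        // `never` has no variants, so an empty match is exhaustive
        Err(never) => match never {},
    }
}

fn error_of(result: Result<Infallible, String>) -> String {
    match result {
        // the success arm is statically unreachable for the same reason
        Ok(never) => match never {},
        Err(error) => error,
    }
}

fn main() {
    assert_eq!(unwrap_infallible(never_fails("abc")), "ABC");
    assert_eq!(error_of(never_succeeds("disk full")), "failed: disk full");
}
```

Neither helper needs `unwrap` or `unreachable!()`: the compiler accepts the empty `match` because the impossible state is ruled out by the type itself.
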

A practical application of the former example would be the `FromStr` implementation for `String` as it's impossible to fail converting a `&str` into a `String`:

```rust
#![feature(never_type)]

use std::str::FromStr;

impl FromStr for String {
    type Err = !;
    fn from_str(s: &str) -> Result<String, Self::Err> {
        Ok(String::from(s))
    }
}
```

A practical application of the latter example would be a function that runs an infinite loop that's never meant to return, like a server responding to client requests, unless there's some error:

```rust
#![feature(never_type)]

fn run_server() -> Result<!, ConnectionError> {
    loop {
        let (request, response) = get_request()?;
        let result = request.process();
        response.send(result);
    }
}
```

The feature flag is necessary because, while the never type exists and works within Rust internals, using it in user-code is still considered experimental.

**Key Takeaways**
- `!` can be coerced into any other type
- it's not possible to create instances of `!` which we can use to mark certain states as impossible at a type level

### User-Defined Pseudo Never Types

While it's not possible to define a type that can coerce to any other type, it is possible to define a type which is impossible to create instances of, such as an `enum` without any variants:

```rust
enum Void {}
```

This allows us to remove the feature flag from the previous two examples and implement them using stable Rust:

```rust
enum Void {}

// example 1
impl FromStr for String {
    type Err = Void;
    fn from_str(s: &str) -> Result<String, Self::Err> {
        Ok(String::from(s))
    }
}

// example 2
fn run_server() -> Result<Void, ConnectionError> {
    loop {
        let (request, response) = get_request()?;
        let result = request.process();
        response.send(result);
    }
}
```

This is the technique the Rust standard library uses, as the `Err` type for the `FromStr` implementation of `String` is `std::convert::Infallible` which is defined as:

```rust
pub enum Infallible {}
```

### PhantomData

The third most commonly used ZST is
probably `PhantomData`. `PhantomData` is a zero-sized marker struct which can be used to "mark" a containing struct as having certain properties. It's similar in purpose to its auto marker trait cousins such as `Sized`, `Send`, and `Sync` but being a marker struct is used a little bit differently. Giving a thorough explanation of `PhantomData` and exploring all of its use-cases is outside the scope of this article so let's only briefly go over a single simple example. Recall this code snippet presented earlier:

```rust
#![feature(negative_impls)]

// this type is Send and Sync
struct Struct;

// opt-out of Send trait
impl !Send for Struct {}

// opt-out of Sync trait
impl !Sync for Struct {}
```

It's unfortunate that we have to use a feature flag, can we accomplish the same result using only stable Rust? As we've learned, a type is only `Send` and `Sync` if all of its members are also `Send` and `Sync`, so we can add a `!Send` and `!Sync` member to `Struct` like `Rc<()>`:

```rust
use std::rc::Rc;

// this type is not Send or Sync
struct Struct {
    // adds 8 bytes to every instance
    _not_send_or_sync: Rc<()>,
}
```

This is less than ideal because it adds size to every instance of `Struct` and we now also have to conjure a `Rc<()>` from thin air every time we want to create a `Struct`. Since `PhantomData` is a ZST it solves both of these problems:

```rust
use std::rc::Rc;
use std::marker::PhantomData;

type NotSendOrSyncPhantom = PhantomData<Rc<()>>;

// this type is not Send or Sync
struct Struct {
    // adds no additional size to instances
    _not_send_or_sync: NotSendOrSyncPhantom,
}
```

**Key Takeaways**
- `PhantomData` is a zero-sized marker struct which can be used to "mark" a containing struct as having certain properties

## Conclusion

- only instances of sized types can be placed on the stack, i.e.
can be passed around by value
- instances of unsized types can't be placed on the stack and must be passed around by reference
- pointers to unsized types are double-width because aside from pointing to data they need to do an extra bit of bookkeeping to also keep track of the data's length _or_ point to a vtable
- `Sized` is an "auto" marker trait
- all generic type parameters are auto-bound with `Sized` by default
- if we have a generic function which takes an argument of some `T` behind a pointer, e.g. `&T`, `Box<T>`, `Rc<T>`, et cetera, then we almost always want to opt-out of the default `Sized` bound with `T: ?Sized`
- leveraging slices and Rust's auto type coercions allows us to write flexible APIs
- all traits are `?Sized` by default
- `Trait: ?Sized` is required for `impl Trait for dyn Trait`
- we can require `Self: Sized` on a per-method basis
- traits bound by `Sized` can't be made into trait objects
- Rust doesn't support pointers wider than 2 widths so
    - we can't cast unsized types to trait objects
    - we can't have multi-trait objects, but we can work around this by coalescing multiple traits into a single trait
- user-defined unsized types are a half-baked feature right now and their limitations outweigh any benefits
- all instances of a ZST are equal to each other
- Rust compiler knows to optimize away interactions with ZSTs
- `!` can be coerced into any other type
- it's not possible to create instances of `!` which we can use to mark certain states as impossible at a type level
- `PhantomData` is a zero-sized marker struct which can be used to "mark" a containing struct as having certain properties
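
Several of the size-related points in this list can be spot-checked with `std::mem::size_of`. The following is a rough sketch rather than an exhaustive test: `pointer_words`, `UnitStruct`, `Void`, `WithRc`, and `WithPhantom` are hypothetical names introduced for the illustration, and widths are measured in multiples of `usize` so the checks do not depend on a 64-bit target.

```rust
use std::marker::PhantomData;
use std::mem::size_of;
use std::rc::Rc;

// hypothetical helper: width of a `&T` measured in machine words
fn pointer_words<T: ?Sized>() -> usize {
    size_of::<&T>() / size_of::<usize>()
}

#[allow(dead_code)]
struct UnitStruct;

#[allow(dead_code)]
enum Void {}

#[allow(dead_code)]
struct WithRc {
    _not_send_or_sync: Rc<()>,
}

#[allow(dead_code)]
struct WithPhantom {
    _not_send_or_sync: PhantomData<Rc<()>>,
}

fn main() {
    // pointers to sized types are single-width
    assert_eq!(pointer_words::<String>(), 1);
    // pointers to unsized types are double-width: length or vtable pointer
    assert_eq!(pointer_words::<str>(), 2);
    assert_eq!(pointer_words::<[i32]>(), 2);
    assert_eq!(pointer_words::<dyn ToString>(), 2);

    // zero-sized types really do occupy zero bytes
    assert_eq!(size_of::<()>(), 0);
    assert_eq!(size_of::<UnitStruct>(), 0);
    assert_eq!(size_of::<Void>(), 0);

    // an Rc<()> field costs one pointer, a PhantomData<Rc<()>> field costs nothing
    assert_eq!(size_of::<WithRc>(), size_of::<usize>());
    assert_eq!(size_of::<WithPhantom>(), 0);
}
```
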