# Common Rust Lifetime Misconceptions
_19 May 2020 · #rust · #lifetimes_
**Table of Contents**
- [Intro](#intro)
- [The Misconceptions](#the-misconceptions)
    - [1) `T` only contains owned types](#1-t-only-contains-owned-types)
    - [2) if `T: 'static` then `T` must be valid for the entire program](#2-if-t-static-then-t-must-be-valid-for-the-entire-program)
    - [3) `&'a T` and `T: 'a` are the same thing](#3-a-t-and-t-a-are-the-same-thing)
    - [4) my code isn't generic and doesn't have lifetimes](#4-my-code-isnt-generic-and-doesnt-have-lifetimes)
    - [5) if it compiles then my lifetime annotations are correct](#5-if-it-compiles-then-my-lifetime-annotations-are-correct)
    - [6) boxed trait objects don't have lifetimes](#6-boxed-trait-objects-dont-have-lifetimes)
    - [7) compiler error messages will tell me how to fix my program](#7-compiler-error-messages-will-tell-me-how-to-fix-my-program)
    - [8) lifetimes can grow and shrink at run-time](#8-lifetimes-can-grow-and-shrink-at-run-time)
    - [9) downgrading mut refs to shared refs is safe](#9-downgrading-mut-refs-to-shared-refs-is-safe)
    - [10) closures follow the same lifetime elision rules as functions](#10-closures-follow-the-same-lifetime-elision-rules-as-functions)
- [Conclusion](#conclusion)
- [Discuss](#discuss)
- [Notifications](#notifications)
- [Further Reading](#further-reading)
## Intro
I've held all of these misconceptions at some point and I see many beginners struggle with these misconceptions today. Some of my terminology might be non-standard, so here's a table of shorthand phrases I use and what I intend for them to mean.
| Phrase | Shorthand for |
|-|-|
| `T` | 1) a set containing all possible types _or_<br>2) some type within that set |
| owned type | some non-reference type, e.g. `i32`, `String`, `Vec`, etc |
| 1) borrowed type _or_<br>2) ref type | some reference type regardless of mutability, e.g. `&i32`, `&mut i32`, etc |
| 1) mut ref _or_<br>2) exclusive ref | exclusive mutable reference, i.e. `&mut T` |
| 1) immut ref _or_<br>2) shared ref | shared immutable reference, i.e. `&T` |
## The Misconceptions
In a nutshell: A variable's lifetime is how long the data it points to can be statically verified by the compiler to be valid at its current memory address. I'll now spend the next ~6500 words going into more detail about where people commonly get confused.
### 1) `T` only contains owned types
This misconception is more about generics than lifetimes but generics and lifetimes are tightly intertwined in Rust so it's not possible to talk about one without also talking about the other. Anyway:
When I first started learning Rust I understood that `i32`, `&i32`, and `&mut i32` are different types. I also understood that some generic type variable `T` represents a set which contains all possible types. However, despite understanding both of these things separately, I wasn't able to understand them together. In my newbie Rust mind this is how I thought generics worked:
| | | | |
|-|-|-|-|
| **Type Variable** | `T` | `&T` | `&mut T` |
| **Examples** | `i32` | `&i32` | `&mut i32` |
`T` contains all owned types. `&T` contains all immutably borrowed types. `&mut T` contains all mutably borrowed types. `T`, `&T`, and `&mut T` are disjoint finite sets. Nice, simple, clean, easy, intuitive, and completely totally wrong. This is how generics actually work in Rust:
| | | | |
|-|-|-|-|
| **Type Variable** | `T` | `&T` | `&mut T` |
| **Examples** | `i32`, `&i32`, `&mut i32`, `&&i32`, `&mut &mut i32`, ... | `&i32`, `&&i32`, `&&mut i32`, ... | `&mut i32`, `&mut &mut i32`, `&mut &i32`, ... |
`T`, `&T`, and `&mut T` are all infinite sets, since it's possible to borrow a type ad infinitum. `T` is a superset of both `&T` and `&mut T`. `&T` and `&mut T` are disjoint sets. Here are a couple of examples which validate these concepts:
```rust
trait Trait {}
impl<T> Trait for T {}
impl<T> Trait for &T {} // ❌
impl<T> Trait for &mut T {} // ❌
```
As expected, the above program doesn't compile:
```none
error[E0119]: conflicting implementations of trait `Trait` for type `&_`:
--> src/lib.rs:5:1
|
3 | impl<T> Trait for T {}
| ------------------- first implementation here
4 |
5 | impl<T> Trait for &T {}
| ^^^^^^^^^^^^^^^^^^^^ conflicting implementation for `&_`
error[E0119]: conflicting implementations of trait `Trait` for type `&mut _`:
--> src/lib.rs:7:1
|
3 | impl<T> Trait for T {}
| ------------------- first implementation here
...
7 | impl<T> Trait for &mut T {}
| ^^^^^^^^^^^^^^^^^^^^^^^^ conflicting implementation for `&mut _`
```
The compiler doesn't allow us to define an implementation of `Trait` for `&T` and `&mut T` since it would conflict with the implementation of `Trait` for `T` which already includes all of `&T` and `&mut T`. The program below compiles as expected, since `&T` and `&mut T` are disjoint:
```rust
trait Trait {}
impl<T> Trait for &T {} // ✅
impl<T> Trait for &mut T {} // ✅
```
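As a rough illustration of `T` covering reference types too, the sketch below passes owned and borrowed values to one generic function. `size_of_arg` is a helper name made up for this example, and the only assumption is that references to sized types are pointer-sized:

```rust
use std::mem::size_of;

// hypothetical helper: `T` can be any type, including reference types
fn size_of_arg<T>(_arg: T) -> usize {
    size_of::<T>()
}

fn main() {
    let mut n: i32 = 5;
    assert_eq!(size_of_arg(n), 4); // T = i32
    assert_eq!(size_of_arg(&n), size_of::<usize>()); // T = &i32
    assert_eq!(size_of_arg(&mut n), size_of::<usize>()); // T = &mut i32
    assert_eq!(size_of_arg(&&n), size_of::<usize>()); // T = &&i32
}
```

The same function body handles every call; only the type that `T` gets bound to changes.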
**Key Takeaways**
- `T` is a superset of both `&T` and `&mut T`
- `&T` and `&mut T` are disjoint sets
### 2) if `T: 'static` then `T` must be valid for the entire program
**Misconception Corollaries**
- `T: 'static` should be read as _"`T` has a `'static` lifetime"_
- `&'static T` and `T: 'static` are the same thing
- if `T: 'static` then `T` must be immutable
- if `T: 'static` then `T` can only be created at compile time
Most Rust beginners get introduced to the `'static` lifetime for the first time in a code example that looks something like this:
```rust
fn main() {
let str_literal: &'static str = "str literal";
}
```
They get told that `"str literal"` is hardcoded into the compiled binary and is loaded into read-only memory at run-time so it's immutable and valid for the entire program and that's what makes it `'static`. These concepts are further reinforced by the rules surrounding defining `static` variables using the `static` keyword.
```rust
// Note: This example is purely for illustrative purposes.
// Never use `static mut`. It's a footgun. There are
// safe patterns for global mutable singletons in Rust but
// those are outside the scope of this article.
static BYTES: [u8; 3] = [1, 2, 3];
static mut MUT_BYTES: [u8; 3] = [1, 2, 3];

fn main() {
    MUT_BYTES[0] = 99; // ❌ - mutating static is unsafe

    unsafe {
        MUT_BYTES[0] = 99;
        assert_eq!(99, MUT_BYTES[0]);
    }
}
```
Regarding `static` variables:
- they can only be created at compile-time
- they should be immutable, mutating them is unsafe
- they're valid for the entire program
The `'static` lifetime was probably named after the default lifetime of `static` variables, right? So it makes sense that the `'static` lifetime has to follow all the same rules, right?
Well yes, but a type _with_ a `'static` lifetime is different from a type _bounded by_ a `'static` lifetime. The latter can be dynamically allocated at run-time, can be safely and freely mutated, can be dropped, and can live for arbitrary durations.
It's important at this point to distinguish `&'static T` from `T: 'static`.
`&'static T` is an immutable reference to some `T` that can be safely held indefinitely long, including up until the end of the program. This is only possible if `T` itself is immutable and does not move _after the reference was created_. `T` does not need to be created at compile-time. It's possible to generate random dynamically allocated data at run-time and return `'static` references to it at the cost of leaking memory, e.g.
```rust
use rand;
// generate random 'static str refs at run-time
fn rand_str_generator() -> &'static str {
let rand_string = rand::random::<u64>().to_string();
Box::leak(rand_string.into_boxed_str())
}
```
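Here's a variation on the same idea that doesn't need the `rand` crate, as a minimal sketch. `leak_greeting` is a hypothetical helper; the point is only that the returned reference outlives the run-time input it was built from:

```rust
// hypothetical helper: builds a string at run-time and leaks it
fn leak_greeting(name: &str) -> &'static str {
    let owned: String = format!("hello, {}", name);
    // the heap allocation is never freed, so the ref stays valid forever
    Box::leak(owned.into_boxed_str())
}

fn main() {
    let greeting: &'static str;
    {
        let name = String::from("world");
        greeting = leak_greeting(&name);
        // `name` dropped here
    }
    // the leaked str is still valid after `name` is gone
    assert_eq!(greeting, "hello, world");
}
```

Every call leaks one allocation, so this is only reasonable for data that really should live until the program exits.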
`T: 'static` is some `T` that can be safely held indefinitely long, including up until the end of the program. `T: 'static` includes all `&'static T`; however, it also includes all owned types, like `String`, `Vec`, etc. The owner of some data is guaranteed that the data will never get invalidated as long as the owner holds onto it, therefore the owner can safely hold onto the data indefinitely long, including up until the end of the program. `T: 'static` should be read as _"`T` is bounded by a `'static` lifetime"_, not _"`T` has a `'static` lifetime"_. A program to help illustrate these concepts:
```rust
use rand;

fn drop_static<T: 'static>(t: T) {
    std::mem::drop(t);
}

fn main() {
    let mut strings: Vec<String> = Vec::new();
    for _ in 0..10 {
        if rand::random() {
            // all the strings are randomly generated
            // and dynamically allocated at run-time
            let string = rand::random::<u64>().to_string();
            strings.push(string);
        }
    }

    // strings are owned types so they're bounded by 'static
    for mut string in strings {
        // all the strings are mutable
        string.push_str("a mutation");
        // all the strings are droppable
        drop_static(string); // ✅
    }

    // all the strings have been invalidated before the end of the program
    println!("I am the end of the program");
}
```
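A place where most people meet the `T: 'static` bound in practice is `std::thread::spawn`, which requires its closure to be `'static`. The sketch below (with `sum_in_thread` being a name I made up) moves a run-time allocated, freely mutated `Vec` into a thread, which works because owned types satisfy the bound:

```rust
use std::thread;

// hypothetical helper: the closure owns `values`, so it is bounded by 'static
fn sum_in_thread(values: Vec<i32>) -> i32 {
    let handle = thread::spawn(move || values.iter().sum::<i32>());
    handle.join().expect("worker thread panicked")
}

fn main() {
    // built at run-time, mutated, then moved into the thread
    let mut values: Vec<i32> = (1..=3).collect();
    values.push(4);
    assert_eq!(sum_in_thread(values), 10);
}
```

The `Vec` is dropped when the thread finishes, long before the program ends, and nothing about that conflicts with the `'static` bound.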
**Key Takeaways**
- `T: 'static` should be read as _"`T` is bounded by a `'static` lifetime"_
- if `T: 'static` then `T` can be a borrowed type with a `'static` lifetime _or_ an owned type
- since `T: 'static` includes owned types that means `T`
    - can be dynamically allocated at run-time
    - does not have to be valid for the entire program
    - can be safely and freely mutated
    - can be dynamically dropped at run-time
    - can have lifetimes of different durations
### 3) `&'a T` and `T: 'a` are the same thing
This misconception is a generalized version of the one above.
`&'a T` requires and implies `T: 'a` since a reference to `T` of lifetime `'a` cannot be valid for `'a` if `T` itself is not valid for `'a`. For example, the Rust compiler will never allow the construction of the type `&'static Ref<'a, T>` because if `Ref` is only valid for `'a` we can't make a `'static` reference to it.
`T: 'a` includes all `&'a T` but the reverse is not true.
```rust
// only takes ref types bounded by 'a
fn t_ref<'a, T: 'a>(t: &'a T) {}
// takes any types bounded by 'a
fn t_bound<'a, T: 'a>(t: T) {}
// owned type which contains a reference
struct Ref<'a, T: 'a>(&'a T);
fn main() {
let string = String::from("string");
t_bound(&string); // ✅
t_bound(Ref(&string)); // ✅
t_bound(&Ref(&string)); // ✅
t_ref(&string); // ✅
t_ref(Ref(&string)); // ❌ - expected ref, found struct
t_ref(&Ref(&string)); // ✅
// string var is bounded by 'static which is bounded by 'a
t_bound(string); // ✅
}
```
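To see why the extra flexibility of `T: 'a` is useful, here's a small sketch of a function that stores its argument inside a boxed closure. `make_counter` and its `AsRef<str>` bound are assumptions chosen for the example, not anything from the code above:

```rust
// hypothetical helper: accepts owned types and refs alike, as long as T: 'a
fn make_counter<'a, T: AsRef<str> + 'a>(text: T) -> Box<dyn Fn() -> usize + 'a> {
    Box::new(move || text.as_ref().len())
}

fn main() {
    // T = String, an owned type, which is bounded by any 'a
    let owned_counter = make_counter(String::from("abc"));
    assert_eq!(owned_counter(), 3);

    // T = &str, a ref type bounded by the lifetime of `text`
    let text = String::from("hello");
    let borrowed_counter = make_counter(text.as_str());
    assert_eq!(borrowed_counter(), 5);
}
```

Had the parameter been written as `&'a T`, the first call with an owned `String` would not have been possible.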
**Key Takeaways**
- `T: 'a` is more general and more flexible than `&'a T`
- `T: 'a` accepts owned types, owned types which contain references, and references
- `&'a T` only accepts references
- if `T: 'static` then `T: 'a` since `'static` >= `'a` for all `'a`
### 4) my code isn't generic and doesn't have lifetimes
**Misconception Corollaries**
- it's possible to avoid using generics and lifetimes
This comforting misconception is kept alive thanks to Rust's lifetime elision rules, which allow you to omit lifetime annotations in functions because the compiler will infer them according to these rules:
- every input ref to a function gets a distinct lifetime
- if there's exactly one input lifetime it gets applied to all output refs
- if there are multiple input lifetimes but one of them is `&self` or `&mut self` then the lifetime of `self` is applied to all output refs
- otherwise output lifetimes have to be made explicit
That's a lot to take in so let's look at some examples:
```rust
// elided
fn print(s: &str);
// expanded
fn print<'a>(s: &'a str);
// elided
fn trim(s: &str) -> &str;
// expanded
fn trim<'a>(s: &'a str) -> &'a str;
// illegal, can't determine output lifetime, no inputs
fn get_str() -> &str;
// explicit options include
fn get_str<'a>() -> &'a str; // generic version
fn get_str() -> &'static str; // 'static version
// illegal, can't determine output lifetime, multiple inputs
fn overlap(s: &str, t: &str) -> &str;
// explicit (but still partially elided) options include
fn overlap<'a>(s: &'a str, t: &str) -> &'a str; // output can't outlive s
fn overlap<'a>(s: &str, t: &'a str) -> &'a str; // output can't outlive t
fn overlap<'a>(s: &'a str, t: &'a str) -> &'a str; // output can't outlive s & t
fn overlap(s: &str, t: &str) -> &'static str; // output can outlive s & t
fn overlap<'a>(s: &str, t: &str) -> &'a str; // no relationship between input & output lifetimes
// expanded
fn overlap<'a, 'b>(s: &'a str, t: &'b str) -> &'a str;
fn overlap<'a, 'b>(s: &'a str, t: &'b str) -> &'b str;
fn overlap<'a>(s: &'a str, t: &'a str) -> &'a str;
fn overlap<'a, 'b>(s: &'a str, t: &'b str) -> &'static str;
fn overlap<'a, 'b, 'c>(s: &'a str, t: &'b str) -> &'c str;
// elided
fn compare(&self, s: &str) -> &str;
// expanded
fn compare<'a, 'b>(&'a self, s: &'b str) -> &'a str;
```
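The signatures above have no bodies, so here's a compilable sketch of the first and third elision rules in action. `first_word`, `first_word_expanded`, and `Prefix` are all hypothetical names invented for this example:

```rust
// elided: one input lifetime, so it's applied to the output
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

// expanded: means exactly the same thing as the elided version
fn first_word_expanded<'a>(s: &'a str) -> &'a str {
    s.split_whitespace().next().unwrap_or("")
}

struct Prefix {
    text: String,
}

impl Prefix {
    // elided: the output gets the lifetime of `self`, not of `other`
    fn common(&self, other: &str) -> &str {
        let len: usize = self
            .text
            .chars()
            .zip(other.chars())
            .take_while(|(a, b)| a == b)
            .map(|(a, _)| a.len_utf8())
            .sum();
        &self.text[..len]
    }
}

fn main() {
    assert_eq!(first_word("hello world"), "hello");
    assert_eq!(first_word_expanded("hello world"), "hello");

    let prefix = Prefix { text: String::from("lifetime") };
    let result;
    {
        let other = String::from("lifeboat");
        result = prefix.common(&other);
        // `other` dropped here, but `result` only borrows from `prefix`
    }
    assert_eq!(result, "life");
}
```

The inner block only compiles because the `self` rule ties the output to `prefix` rather than to `other`.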
If you've ever written
- a struct method
- a function which takes references
- a function which returns references
- a generic function
- a trait object (more on this later)
- a closure (more on this later)
then your code has generic elided lifetime annotations all over it.
**Key Takeaways**
- almost all Rust code is generic code and there are elided lifetime annotations everywhere
### 5) if it compiles then my lifetime annotations are correct
**Misconception Corollaries**
- Rust's lifetime elision rules for functions are always right
- Rust's borrow checker is always right, technically _and semantically_
- Rust knows more about the semantics of my program than I do
It's possible for a Rust program to be technically compilable but still semantically wrong. Take this for example:
```rust
struct ByteIter<'a> {
    remainder: &'a [u8]
}

impl<'a> ByteIter<'a> {
    fn next(&mut self) -> Option<&u8> {
        if self.remainder.is_empty() {
            None
        } else {
            let byte = &self.remainder[0];
            self.remainder = &self.remainder[1..];
            Some(byte)
        }
    }
}

fn main() {
    let mut bytes = ByteIter { remainder: b"1" };
    assert_eq!(Some(&b'1'), bytes.next());
    assert_eq!(None, bytes.next());
}
```
`ByteIter` is an iterator that iterates over a slice of bytes. We're skipping the `Iterator` trait implementation for conciseness. It seems to work fine, but what if we want to check a couple of bytes at a time?
```rust
fn main() {
let mut bytes = ByteIter { remainder: b"1123" };
let byte_1 = bytes.next();
let byte_2 = bytes.next();
if byte_1 == byte_2 { // ❌
// do something
}
}
```
Uh oh! Compile error:
```none
error[E0499]: cannot borrow `bytes` as mutable more than once at a time
--> src/main.rs:20:18
|
19 | let byte_1 = bytes.next();
| ----- first mutable borrow occurs here
20 | let byte_2 = bytes.next();
| ^^^^^ second mutable borrow occurs here
21 | if byte_1 == byte_2 {
| ------ first borrow later used here
```
I guess we can copy each byte. Copying is okay when we're working with bytes but if we turned `ByteIter` into a generic slice iterator that can iterate over any `&'a [T]` then we might want to use it in the future with types that may be very expensive or impossible to copy and clone. Oh well, I guess there's nothing we can do about that, the code compiles so the lifetime annotations must be right, right?
Nope, the current lifetime annotations are actually the source of the bug! It's particularly hard to spot because the buggy lifetime annotations are elided. Let's expand the elided lifetimes to get a clearer look at the problem:
```rust
struct ByteIter<'a> {
remainder: &'a [u8]
}
impl<'a> ByteIter<'a> {
fn next<'b>(&'b mut self) -> Option<&'b u8> {
if self.remainder.is_empty() {
None
} else {
let byte = &self.remainder[0];
self.remainder = &self.remainder[1..];
Some(byte)
}
}
}
```
That didn't help at all. I'm still confused. Here's a hot tip that only Rust pros know: give your lifetime annotations descriptive names. Let's try again:
```rust
struct ByteIter<'remainder> {
remainder: &'remainder [u8]
}
impl<'remainder> ByteIter<'remainder> {
fn next<'mut_self>(&'mut_self mut self) -> Option<&'mut_self u8> {
if self.remainder.is_empty() {
None
} else {
let byte = &self.remainder[0];
self.remainder = &self.remainder[1..];
Some(byte)
}
}
}
```
Each returned byte is annotated with `'mut_self` but the bytes are clearly coming from `'remainder`! Let's fix it.
```rust
struct ByteIter<'remainder> {
    remainder: &'remainder [u8]
}

impl<'remainder> ByteIter<'remainder> {
    fn next(&mut self) -> Option<&'remainder u8> {
        if self.remainder.is_empty() {
            None
        } else {
            let byte = &self.remainder[0];
            self.remainder = &self.remainder[1..];
            Some(byte)
        }
    }
}

fn main() {
    let mut bytes = ByteIter { remainder: b"1123" };
    let byte_1 = bytes.next();
    let byte_2 = bytes.next();
    std::mem::drop(bytes); // we can even drop the iterator now!
    if byte_1 == byte_2 { // ✅
        // do something
    }
}
```
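The same fix carries over to the generic slice iterator mentioned earlier. What follows is only a sketch of how that might look: `SliceIter` is a hypothetical name, and unlike `ByteIter` it implements the `Iterator` trait, where `type Item = &'remainder T` forces the correct annotation:

```rust
struct SliceIter<'remainder, T> {
    remainder: &'remainder [T],
}

impl<'remainder, T> Iterator for SliceIter<'remainder, T> {
    // items borrow from the slice, not from the iterator
    type Item = &'remainder T;

    fn next(&mut self) -> Option<Self::Item> {
        let (first, rest) = self.remainder.split_first()?;
        self.remainder = rest;
        Some(first)
    }
}

fn main() {
    // `String` isn't `Copy`, so copying items out is not an option here
    let words = vec![String::from("a"), String::from("b")];
    let mut iter = SliceIter { remainder: &words[..] };
    let first = iter.next();
    let second = iter.next();
    assert!(iter.next().is_none());
    std::mem::drop(iter); // items remain usable after the iterator is gone
    assert_eq!(first, Some(&words[0]));
    assert_eq!(second, Some(&words[1]));
}
```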
Now that we look back on the previous version of our program it was obviously wrong, so why did Rust compile it? The answer is simple: it was memory safe.
The Rust borrow checker only cares about the lifetime annotations in a program to the extent it can use them to statically verify the memory safety of the program. Rust will happily compile programs even if the lifetime annotations have semantic errors, and the consequence of this is that the program becomes unnecessarily restrictive.
Here's a quick example that's the opposite of the previous example: Rust's lifetime elision rules happen to be semantically correct in this instance but we unintentionally write a very restrictive method with our own unnecessary explicit lifetime annotations.
```rust
#[derive(Debug)]
struct NumRef<'a>(&'a i32);
impl<'a> NumRef<'a> {
// my struct is generic over 'a so that means I need to annotate
// my self parameters with 'a too, right? (answer: no, not right)
fn some_method(&'a mut self) {}
}
fn main() {
let mut num_ref = NumRef(&5);
num_ref.some_method(); // mutably borrows num_ref for the rest of its lifetime
num_ref.some_method(); // ❌
println!("{:?}", num_ref); // ❌
}
```
If we have some struct generic over `'a` we almost never want to write a method with a `&'a mut self` receiver. What we're communicating to Rust is _"this method will mutably borrow the struct for the entirety of the struct's lifetime"_. In practice this means Rust's borrow checker will only allow at most one call to `some_method` before the struct becomes permanently mutably borrowed and thus unusable. The use-cases for this are extremely rare but the code above is very easy for confused beginners to write and it compiles. The fix is to not add unnecessary explicit lifetime annotations and let Rust's lifetime elision rules handle it:
```rust
#[derive(Debug)]
struct NumRef<'a>(&'a i32);
impl<'a> NumRef<'a> {
// no more 'a on mut self
fn some_method(&mut self) {}
// above line desugars to
fn some_method_desugared<'b>(&'b mut self){}
}
fn main() {
let mut num_ref = NumRef(&5);
num_ref.some_method();
num_ref.some_method(); // ✅
println!("{:?}", num_ref); // ✅
}
```
**Key Takeaways**
- Rust's lifetime elision rules for functions are not always right for every situation
- Rust does not know more about the semantics of your program than you do
- give your lifetime annotations descriptive names
- try to be mindful of where you place explicit lifetime annotations and why
### 6) boxed trait objects don't have lifetimes
Earlier we discussed Rust's lifetime elision rules _for functions_. Rust also has lifetime elision rules for trait objects, which are:
- if a trait object is used as a type argument to a generic type then its lifetime bound is inferred from the containing type
    - if there's a unique bound from the containing type then that's used
    - if there's more than one bound from the containing type then an explicit bound must be specified
- if the above doesn't apply then
    - if the trait is defined with a single lifetime bound then that bound is used
    - if `'static` is used for any lifetime bound then `'static` is used
    - if the trait has no lifetime bounds then its lifetime is inferred in expressions and is `'static` outside of expressions
All of that sounds super complicated but can be simply summarized as _"a trait object's lifetime bound is inferred from context."_ After looking at a handful of examples we'll see the lifetime bound inferences are pretty intuitive so we don't have to memorize the formal rules:
```rust
use std::cell::Ref;
trait Trait {}
// elided
type T1 = Box<dyn Trait>;
// expanded, Box<T> has no lifetime bound on T, so inferred as 'static
type T2 = Box<dyn Trait + 'static>;
// elided
impl dyn Trait {}
// expanded
impl dyn Trait + 'static {}
// elided
type T3<'a> = &'a dyn Trait;
// expanded, &'a T requires T: 'a, so inferred as 'a
type T4<'a> = &'a (dyn Trait + 'a);
// elided
type T5<'a> = Ref<'a, dyn Trait>;
// expanded, Ref<'a, T> requires T: 'a, so inferred as 'a
type T6<'a> = Ref<'a, dyn Trait + 'a>;
trait GenericTrait<'a>: 'a {}
// elided
type T7<'a> = Box<dyn GenericTrait<'a>>;
// expanded
type T8<'a> = Box<dyn GenericTrait<'a> + 'a>;
// elided
impl<'a> dyn GenericTrait<'a> {}
// expanded
impl<'a> dyn GenericTrait<'a> + 'a {}
```
Concrete types which implement traits can have references and thus they also have lifetime bounds, and so their corresponding trait objects have lifetime bounds. Also you can implement traits directly for references which obviously have lifetime bounds:
```rust
trait Trait {}
struct Struct {}
struct Ref<'a, T>(&'a T);
impl Trait for Struct {}
impl Trait for &Struct {} // impl Trait directly on a ref type
impl<'a, T> Trait for Ref<'a, T> {} // impl Trait on a type containing refs
```
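To make that concrete, here's a minimal sketch of a boxed trait object that holds a borrowed value. The `Shout` trait, the `Borrowed` struct, and the `boxed` function are all made up for illustration:

```rust
trait Shout {
    fn shout(&self) -> String;
}

// a type containing a ref, so it's only valid for 'a
struct Borrowed<'a>(&'a str);

impl<'a> Shout for Borrowed<'a> {
    fn shout(&self) -> String {
        self.0.to_uppercase()
    }
}

// the explicit `+ 'a` overrides the default 'static bound of Box<dyn Shout>
fn boxed<'a>(text: &'a str) -> Box<dyn Shout + 'a> {
    Box::new(Borrowed(text))
}

fn main() {
    let text = String::from("hi");
    let shouter = boxed(&text);
    assert_eq!(shouter.shout(), "HI");
}
```

Writing the return type as plain `Box<dyn Shout>` would ask for a `'static` trait object, which `Borrowed<'a>` can't satisfy for an arbitrary `'a`.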
Anyway, this is worth going over because it often confuses beginners when they refactor a function from using trait objects to generics or vice versa. Take this program for example:
```rust
use std::fmt::Display;
fn dynamic_thread_print(t: Box<dyn Display + Send>) {
std::thread::spawn(move || {
println!("{}", t);
}).join();
}
fn static_thread_print<T: Display + Send>(t: T) { // ❌
std::thread::spawn(move || {
println!("{}", t);
}).join();
}
```
It throws this compile error:
```none
error[E0310]: the parameter type `T` may not live long enough
--> src/lib.rs:10:5
|
9 | fn static_thread_print<T: Display + Send>(t: T) {
| -- help: consider adding an explicit lifetime bound...: `T: 'static +`
10 | std::thread::spawn(move || {
| ^^^^^^^^^^^^^^^^^^
|
note: ...so that the type `[closure@src/lib.rs:10:24: 12:6 t:T]` will meet its required lifetime bounds
--> src/lib.rs:10:5
|
10 | std::thread::spawn(move || {
| ^^^^^^^^^^^^^^^^^^
```
Okay great, the compiler tells us how to fix the issue so let's fix the issue.
```rust
use std::fmt::Display;
fn dynamic_thread_print(t: Box<dyn Display + Send>) {
std::thread::spawn(move || {
println!("{}", t);
}).join();
}
fn static_thread_print<T: Display + Send + 'static>(t: T) { // ✅
std::thread::spawn(move || {
println!("{}", t);
}).join();
}
```
It compiles now but these two functions look awkward next to each other. Why does the second function require a `'static` bound on `T` when the first function doesn't? That's a trick question. Using the lifetime elision rules, Rust automatically infers a `'static` bound in the first function, so both actually have `'static` bounds. This is what the Rust compiler sees:
```rust
use std::fmt::Display;
fn dynamic_thread_print(t: Box<dyn Display + Send + 'static>) {
std::thread::spawn(move || {
println!("{}", t);
}).join();
}
fn static_thread_print<T: Display + Send + 'static>(t: T) {
std::thread::spawn(move || {
println!("{}", t);
}).join();
}
```
**Key Takeaways**
- all trait objects have some inferred default lifetime bounds
### 7) compiler error messages will tell me how to fix my program
**Misconception Corollaries**
- Rust's lifetime elision rules for trait objects are always right
- Rust knows more about the semantics of my program than I do
This misconception is the previous two misconceptions combined into one example:
```rust
use std::fmt::Display;
fn box_displayable<T: Display>(t: T) -> Box<dyn Display> { // ❌
Box::new(t)
}
```
Throws this error:
```none
error[E0310]: the parameter type `T` may not live long enough
--> src/lib.rs:4:5
|
3 | fn box_displayable<T: Display>(t: T) -> Box<dyn Display> {
| -- help: consider adding an explicit lifetime bound...: `T: 'static +`
4 | Box::new(t)
| ^^^^^^^^^^^
|
note: ...so that the type `T` will meet its required lifetime bounds
--> src/lib.rs:4:5
|
4 | Box::new(t)
| ^^^^^^^^^^^
```
Okay, let's fix it the way the compiler tells us to. Never mind that it's silently inferring a `'static` lifetime bound for our boxed trait object and that its recommended fix rests on that unstated fact:
```rust
use std::fmt::Display;
fn box_displayable<T: Display + 'static>(t: T) -> Box<dyn Display> { // ✅
Box::new(t)
}
```
So the program compiles now... but is this what we actually want? Probably, but maybe not. The compiler didn't mention any other fixes but this would have also been appropriate:
```rust
use std::fmt::Display;
fn box_displayable<'a, T: Display + 'a>(t: T) -> Box<dyn Display + 'a> { // ✅
Box::new(t)
}
```
This function accepts all the same arguments as the previous version plus a lot more! Does that make it better? Not necessarily, it depends on the requirements and constraints of our program. This example is a bit abstract so let's take a look at a simpler and more obvious case:
```rust
fn return_first(a: &str, b: &str) -> &str { // ❌
a
}
```
Throws:
```none
error[E0106]: missing lifetime specifier
--> src/lib.rs:1:38
|
1 | fn return_first(a: &str, b: &str) -> &str {
| ---- ---- ^ expected named lifetime parameter
|
= help: this function's return type contains a borrowed value, but the signature does not say whether it is borrowed from `a` or `b`
help: consider introducing a named lifetime parameter
|
1 | fn return_first<'a>(a: &'a str, b: &'a str) -> &'a str {
| ^^^^ ^^^^^^^ ^^^^^^^ ^^^
```
The error message recommends annotating both inputs and the output with the same lifetime. If we did this our program would compile but this function would overly-constrain the return type. What we actually want is this:
```rust
fn return_first<'a>(a: &'a str, b: &str) -> &'a str { // ✅
a
}
```
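Here's a sketch of a caller that shows the difference. It compiles with the signature above, but it would be rejected if `b` were also annotated with `'a` as the compiler suggested, because `short` doesn't live as long as `result`:

```rust
fn return_first<'a>(a: &'a str, b: &str) -> &'a str {
    a
}

fn main() {
    let long = String::from("long-lived");
    let result;
    {
        let short = String::from("short-lived");
        // the output is only tied to `long`, so `short` can die early
        result = return_first(&long, &short);
        // `short` dropped here
    }
    assert_eq!(result, "long-lived");
}
```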
**Key Takeaways**
- Rust's lifetime elision rules for trait objects are not always right for every situation
- Rust does not know more about the semantics of your program than you do
- Rust compiler error messages suggest fixes which will make your program compile, which is not the same as fixes which will make your program compile _and_ best suit the requirements of your program
### 8) lifetimes can grow and shrink at run-time
**Misconception Corollaries**
- container types can swap references at run-time to change their lifetime
- Rust borrow checker does advanced control flow analysis
This does not compile:
```rust
struct Has<'lifetime> {
lifetime: &'lifetime str,
}
fn main() {
let long = String::from("long");
let mut has = Has { lifetime: &long };
assert_eq!(has.lifetime, "long");
{
let short = String::from("short");
// "switch" to short lifetime
has.lifetime = &short;
assert_eq!(has.lifetime, "short");
// "switch back" to long lifetime (but not really)
has.lifetime = &long;
assert_eq!(has.lifetime, "long");
// `short` dropped here
}
assert_eq!(has.lifetime, "long"); // ❌ - `short` still "borrowed" after drop
}
```
It throws:
```none
error[E0597]: `short` does not live long enough
--> src/main.rs:11:24
|
11 | has.lifetime = &short;
| ^^^^^^ borrowed value does not live long enough
...
15 | }
| - `short` dropped here while still borrowed
16 | assert_eq!(has.lifetime, "long");
| --------------------------------- borrow later used here
```
This also does not compile, throws the exact same error as above:
```rust
struct Has<'lifetime> {
lifetime: &'lifetime str,
}
fn main() {
let long = String::from("long");
let mut has = Has { lifetime: &long };
assert_eq!(has.lifetime, "long");
// this block will never run
if false {
let short = String::from("short");
// "switch" to short lifetime
has.lifetime = &short;
assert_eq!(has.lifetime, "short");
// "switch back" to long lifetime (but not really)
has.lifetime = &long;
assert_eq!(has.lifetime, "long");
// `short` dropped here
}
assert_eq!(has.lifetime, "long"); // ❌ - `short` still "borrowed" after drop
}
```
Lifetimes have to be statically verified at compile-time and the Rust borrow checker only does very basic control flow analysis, so it assumes every block in an `if-else` statement and every match arm in a `match` statement can be taken and then chooses the shortest possible lifetime for the variable. Once a variable is bounded by a lifetime it is bounded by that lifetime _forever_. The lifetime of a variable can only shrink, and all the shrinkage is determined at compile-time.
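One possible way around this, sketched below, is to stop reusing the long-lived container for short-lived data and give the short-lived reference its own value instead. The `observe` function is a hypothetical wrapper that exists only so the result can be checked:

```rust
struct Has<'lifetime> {
    lifetime: &'lifetime str,
}

// hypothetical wrapper: returns what each `Has` saw
fn observe() -> (String, String) {
    let long = String::from("long");
    let has = Has { lifetime: &long };
    let seen_short;
    {
        let short = String::from("short");
        // a separate value gets its own, shorter lifetime
        let has_short = Has { lifetime: &short };
        seen_short = has_short.lifetime.to_owned();
        // `short` and `has_short` dropped here
    }
    // `has` was never tied to `short`, so it's still usable
    (seen_short, has.lifetime.to_owned())
}

fn main() {
    assert_eq!(observe(), (String::from("short"), String::from("long")));
}
```

Since `has` never holds `&short`, its lifetime is never shrunk to that of `short`.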
**Key Takeaways**
- lifetimes are statically verified at compile-time
- lifetimes cannot grow or shrink or change in any way at run-time
- the Rust borrow checker will always choose the shortest possible lifetime for a variable assuming all code paths can be taken
### 9) downgrading mut refs to shared refs is safe
**Misconception Corollaries**
- re-borrowing a reference ends its lifetime and starts a new one
You can pass a mut ref to a function expecting a shared ref because Rust will implicitly re-borrow the mut ref as immutable:
```rust
fn takes_shared_ref(n: &i32) {}
fn main() {
let mut a = 10;
takes_shared_ref(&mut a); // ✅
takes_shared_ref(&*(&mut a)); // above line desugared
}
```
Intuitively this makes sense, since there's no harm in re-borrowing a mut ref as immutable, right? Surprisingly no, as the program below does not compile:
```rust
fn main() {
let mut a = 10;
let b: &i32 = &*(&mut a); // re-borrowed as immutable
let c: &i32 = &a;
dbg!(b, c); // ❌
}
```
Throws this error:
```none
error[E0502]: cannot borrow `a` as immutable because it is also borrowed as mutable
--> src/main.rs:4:19
|
3 | let b: &i32 = &*(&mut a);
| -------- mutable borrow occurs here
4 | let c: &i32 = &a;
| ^^ immutable borrow occurs here
5 | dbg!(b, c);
| - mutable borrow later used here
```
A mutable borrow does occur, but it's immediately and unconditionally re-borrowed as immutable and then dropped. Why is Rust treating the immutable re-borrow as if it still has the mut ref's exclusive lifetime? While there's no issue in the particular example above, allowing the ability to downgrade mut refs to shared refs does indeed introduce potential memory safety issues:
```rust
use std::sync::Mutex;

struct Struct {
    mutex: Mutex<String>
}

impl Struct {
    // downgrades mut self to shared str
    fn get_string(&mut self) -> &str {
        self.mutex.get_mut().unwrap()
    }
    fn mutate_string(&self) {
        // if Rust allowed downgrading mut refs to shared refs
        // then the following line would invalidate any shared
        // refs returned from the get_string method
        *self.mutex.lock().unwrap() = "surprise!".to_owned();
    }
}

fn main() {
    let mut s = Struct {
        mutex: Mutex::new("string".to_owned())
    };
    let str_ref = s.get_string(); // mut ref downgraded to shared ref
    s.mutate_string(); // str_ref invalidated, now a dangling pointer
    dbg!(str_ref); // ❌ - as expected!
}
```
The point here is that when you re-borrow a mut ref as a shared ref you don't get that shared ref without a big gotcha: it extends the mut ref's lifetime for the duration of the re-borrow even if the mut ref itself is dropped. Using the re-borrowed shared ref is very difficult because it's immutable but it can't overlap with any other shared refs. The re-borrowed shared ref has all the cons of a mut ref and all the cons of a shared ref and has the pros of neither. I believe re-borrowing a mut ref as a shared ref should be considered a Rust anti-pattern. Being aware of this anti-pattern is important so that you can easily spot it when you see code like this:
```rust
// downgrades mut T to shared T
fn some_function<T>(some_arg: &mut T) -> &T { some_arg }

struct Struct<T>(T);

impl<T> Struct<T> {
    // downgrades mut self to shared self
    fn some_method(&mut self) -> &Self { self }

    // downgrades mut self to shared T
    fn other_method(&mut self) -> &T { &self.0 }
}
```
Even if you avoid re-borrows in function and method signatures Rust still does automatic implicit re-borrows so it's easy to bump into this problem without realizing it like so:
```rust
use std::collections::HashMap;
type PlayerID = i32;
#[derive(Debug, Default)]
struct Player {
score: i32,
}
fn start_game(player_a: PlayerID, player_b: PlayerID, server: &mut HashMap<PlayerID, Player>) {
// get players from server or create & insert new players if they don't yet exist
let player_a: &Player = server.entry(player_a).or_default();
let player_b: &Player = server.entry(player_b).or_default();
// do something with players
dbg!(player_a, player_b); // ❌
}
```
The above fails to compile. `or_default()` returns a `&mut Player` which we're implicitly re-borrowing as `&Player` because of our explicit type annotations. To do what we want we have to:
```rust
use std::collections::HashMap;
type PlayerID = i32;
#[derive(Debug, Default)]
struct Player {
score: i32,
}
fn start_game(player_a: PlayerID, player_b: PlayerID, server: &mut HashMap<PlayerID, Player>) {
// drop the returned mut Player refs since we can't use them together anyway
server.entry(player_a).or_default();
server.entry(player_b).or_default();
// fetch the players again, getting them immutably this time, without any implicit re-borrows
let player_a = server.get(&player_a);
let player_b = server.get(&player_b);
// do something with players
dbg!(player_a, player_b); // ✅
}
```
Kinda awkward and clunky but this is the sacrifice we make at the Altar of Memory Safety.
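To check that the workaround behaves as described, here's a small variation that returns a value we can assert on instead of printing. `combined_score` is a hypothetical helper that is not part of the original example, and it indexes the map directly on the assumption that both players were just inserted:

```rust
use std::collections::HashMap;

type PlayerID = i32;

#[derive(Debug, Default)]
struct Player {
    score: i32,
}

// Hypothetical variation of start_game that returns the summed scores.
fn combined_score(a: PlayerID, b: PlayerID, server: &mut HashMap<PlayerID, Player>) -> i32 {
    // insert defaults, discarding the returned mut refs right away
    server.entry(a).or_default();
    server.entry(b).or_default();
    // the exclusive borrows above have ended, so both
    // of these shared refs can be alive at the same time
    let player_a = &server[&a];
    let player_b = &server[&b];
    player_a.score + player_b.score
}

fn main() {
    let mut server = HashMap::new();
    server.insert(1, Player { score: 10 });
    // player 2 doesn't exist yet and gets created with a default score of 0
    assert_eq!(combined_score(1, 2, &mut server), 10);
    assert_eq!(server.len(), 2);
}
```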
**Key Takeaways**
- try not to re-borrow mut refs as shared refs, or you're gonna have a bad time
- re-borrowing a mut ref doesn't end its lifetime, even if the ref is dropped
### 10) closures follow the same lifetime elision rules as functions
This is more of a Rust Gotcha than a misconception.
Closures, despite being functions, do not follow the same lifetime elision rules as functions.
```rust
fn function(x: &i32) -> &i32 {
x
}
fn main() {
let closure = |x: &i32| x; // ❌
}
```
Throws:
```none
error: lifetime may not live long enough
--> src/main.rs:6:29
|
6 | let closure = |x: &i32| x;
| - - ^ returning this value requires that `'1` must outlive `'2`
| | |
| | return type of closure is &'2 i32
| let's call the lifetime of this reference `'1`
```
After desugaring we get:
```rust
// input lifetime gets applied to output
fn function<'a>(x: &'a i32) -> &'a i32 {
x
}
fn main() {
// input and output each get their own distinct lifetimes
let closure = for<'a, 'b> |x: &'a i32| -> &'b i32 { x };
// note: the above line is not valid syntax, but we need it for illustrative purposes
}
```
There's no good reason for this discrepancy. Closures were first implemented with different type inference semantics than functions and now we're stuck with it forever because to unify them at this point would be a breaking change. So how can we explicitly annotate a closure's type? Our options include:
```rust
fn main() {
// cast to trait object, becomes unsized, oops, compile error
let identity: dyn Fn(&i32) -> &i32 = |x: &i32| x;
// can allocate it on the heap as a workaround but feels clunky
let identity: Box<dyn Fn(&i32) -> &i32> = Box::new(|x: &i32| x);
// can skip the allocation and just create a static reference
let identity: &dyn Fn(&i32) -> &i32 = &|x: &i32| x;
// previous line desugared :)
let identity: &'static (dyn for<'a> Fn(&'a i32) -> &'a i32 + 'static) = &|x: &i32| -> &i32 { x };
// this would be ideal but it's invalid syntax
let identity: impl Fn(&i32) -> &i32 = |x: &i32| x;
// this would also be nice but it's also invalid syntax
let identity = for<'a> |x: &'a i32| -> &'a i32 { x };
// since "impl trait" works in the function return position
fn return_identity() -> impl Fn(&i32) -> &i32 {
|x| x
}
let identity = return_identity();
// more generic version of the previous solution
fn annotate<T, F>(f: F) -> F where F: Fn(&T) -> &T {
f
}
let identity = annotate(|x: &i32| x);
}
```
As I'm sure you've already noticed from the examples above, when closure types are used as trait bounds they do follow the usual function lifetime elision rules.
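Here's a runnable sketch of that last point, using a non-generic variation of the `annotate` helper. `annotate_str` is a hypothetical name, and the only thing it does is give the compiler a trait bound to infer the closure's signature from:

```rust
// Hypothetical helper: the Fn bound follows the function elision rules,
// so it desugars to for<'a> Fn(&'a str) -> &'a str.
fn annotate_str<F>(f: F) -> F
where
    F: Fn(&str) -> &str,
{
    f
}

fn main() {
    // the closure's signature is inferred from the bound above,
    // tying the output lifetime to the input lifetime
    let trim = annotate_str(|s| s.trim());
    let owned = String::from("  padded  ");
    let trimmed: &str = trim(owned.as_str());
    assert_eq!(trimmed, "padded");
}
```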
There's no real lesson or insight to be had here, it just is what it is.
**Key Takeaways**
- every language has gotchas 🤷
## Conclusion
- `T` is a superset of both `&T` and `&mut T`
- `&T` and `&mut T` are disjoint sets
- `T: 'static` should be read as _"`T` is bounded by a `'static` lifetime"_
- if `T: 'static` then `T` can be a borrowed type with a `'static` lifetime _or_ an owned type
- since `T: 'static` includes owned types that means `T`
- can be dynamically allocated at run-time
- does not have to be valid for the entire program
- can be safely and freely mutated
- can be dynamically dropped at run-time
- can have lifetimes of different durations
- `T: 'a` is more general and more flexible than `&'a T`
- `T: 'a` accepts owned types, owned types which contain references, and references
- `&'a T` only accepts references
- if `T: 'static` then `T: 'a` since `'static` >= `'a` for all `'a`
- almost all Rust code is generic code and there's elided lifetime annotations everywhere
- Rust's lifetime elision rules are not always right for every situation
- Rust does not know more about the semantics of your program than you do
- give your lifetime annotations descriptive names
- try to be mindful of where you place explicit lifetime annotations and why
- all trait objects have some inferred default lifetime bounds
- Rust compiler error messages suggest fixes which will make your program compile which is not the same as fixes which will make your program compile _and_ best suit the requirements of your program
- lifetimes are statically verified at compile-time
- lifetimes cannot grow or shrink or change in any way at run-time
- the Rust borrow checker will always choose the shortest possible lifetime for a variable assuming all code paths can be taken
- try not to re-borrow mut refs as shared refs, or you're gonna have a bad time
- re-borrowing a mut ref doesn't end its lifetime, even if the ref is dropped
- every language has gotchas 🤷
## Discuss
Discuss this article on
- [learnrust subreddit](https://www.reddit.com/r/learnrust/comments/gmrcrq/common_rust_lifetime_misconceptions/)
- [official Rust users forum](https://users.rust-lang.org/t/blog-post-common-rust-lifetime-misconceptions/42950)
- [Twitter](https://twitter.com/pretzelhammer/status/1263505856903163910)
- [rust subreddit](https://www.reddit.com/r/rust/comments/golrsx/common_rust_lifetime_misconceptions/)
- [Hackernews](https://news.ycombinator.com/item?id=23279731)
- [Github](https://github.com/pretzelhammer/rust-blog/discussions)
# Sizedness in Rust
_22 July 2020 · #rust · #sizedness_
**Table of Contents**
- [Intro](#intro)
- [Sizedness](#sizedness)
- [`Sized` Trait](#sized-trait)
- [`Sized` in Generics](#sized-in-generics)
- [Unsized Types](#unsized-types)
- [Slices](#slices)
- [Trait Objects](#trait-objects)
- [Trait Object Limitations](#trait-object-limitations)
- [Cannot Cast Unsized Types to Trait Objects](#cannot-cast-unsized-types-to-trait-objects)
- [Cannot create Multi-Trait Objects](#cannot-create-multi-trait-objects)
- [User-Defined Unsized Types](#user-defined-unsized-types)
- [Zero-Sized Types](#zero-sized-types)
- [Unit Type](#unit-type)
- [User-Defined Unit Structs](#user-defined-unit-structs)
- [Never Type](#never-type)
- [User-Defined Pseudo Never Types](#user-defined-pseudo-never-types)
- [PhantomData](#phantomdata)
- [Conclusion](#conclusion)
- [Discuss](#discuss)
- [Notifications](#notifications)
- [Further Reading](#further-reading)
## Intro
Sizedness is lowkey one of the most important concepts to understand in Rust. It intersects a bunch of other language features in often subtle ways and only rears its ugly head in the form of _"x doesn't have size known at compile time"_ error messages which every Rustacean is all too familiar with. In this article we'll explore all flavors of sizedness from sized types, to unsized types, to zero-sized types while examining their use-cases, benefits, pain points, and workarounds.
Table of phrases I use and what they're supposed to mean:
| Phrase | Shorthand for |
|-|-|
| sizedness | property of being sized or unsized |
| sized type | type with a known size at compile time |
| 1) unsized type _or_<br>2) DST | dynamically-sized type, i.e. size not known at compile time |
| ?sized type | type that may or may not be sized |
| unsized coercion | coercing a sized type into an unsized type |
| ZST | zero-sized type, i.e. instances of the type are 0 bytes in size |
| width | single unit of measurement of pointer width |
| 1) thin pointer _or_<br>2) single-width pointer | pointer that is _1 width_ |
| 1) fat pointer _or_<br>2) double-width pointer | pointer that is _2 widths_ |
| 1) pointer _or_<br>2) reference | some pointer of some width, width will be clarified by context |
| slice | double-width pointer to a dynamically sized view into some array |
## Sizedness
In Rust a type is sized if its size in bytes can be determined at compile-time. Determining a type's size is important for being able to allocate enough space for instances of that type on the stack. Sized types can be passed around by value or by reference. If a type's size can't be determined at compile-time then it's referred to as an unsized type or a DST, Dynamically-Sized Type. Since unsized types can't be placed on the stack they can only be passed around by reference. Some examples of sized and unsized types:
```rust
use std::mem::size_of;
fn main() {
// primitives
assert_eq!(4, size_of::<i32>());
assert_eq!(8, size_of::<f64>());
// tuples
assert_eq!(8, size_of::<(i32, i32)>());
// arrays
assert_eq!(0, size_of::<[i32; 0]>());
assert_eq!(12, size_of::<[i32; 3]>());
struct Point {
x: i32,
y: i32,
}
// structs
assert_eq!(8, size_of::<Point>());
// enums
assert_eq!(8, size_of::<Option<i32>>());
// get pointer width, will be
// 4 bytes wide on 32-bit targets or
// 8 bytes wide on 64-bit targets
const WIDTH: usize = size_of::<&()>();
// pointers to sized types are 1 width
assert_eq!(WIDTH, size_of::<&i32>());
assert_eq!(WIDTH, size_of::<&mut i32>());
assert_eq!(WIDTH, size_of::<Box<i32>>());
assert_eq!(WIDTH, size_of::<fn(i32) -> i32>());
const DOUBLE_WIDTH: usize = 2 * WIDTH;
// unsized struct
struct Unsized {
unsized_field: [i32],
}
// pointers to unsized types are 2 widths
assert_eq!(DOUBLE_WIDTH, size_of::<&str>()); // slice
assert_eq!(DOUBLE_WIDTH, size_of::<&[i32]>()); // slice
assert_eq!(DOUBLE_WIDTH, size_of::<&dyn ToString>()); // trait object
assert_eq!(DOUBLE_WIDTH, size_of::<Box<dyn ToString>>()); // trait object
assert_eq!(DOUBLE_WIDTH, size_of::<&Unsized>()); // user-defined unsized type
// unsized types
size_of::<str>(); // compile error
size_of::<[i32]>(); // compile error
size_of::<dyn ToString>(); // compile error
size_of::<Unsized>(); // compile error
}
```
How we determine the size of sized types is straightforward: all primitives and pointers have known sizes and all structs, tuples, enums, and arrays are just made up of primitives and pointers or other nested structs, tuples, enums, and arrays so we can just count up the bytes recursively, taking into account extra bytes needed for padding and alignment. We can't determine the size of unsized types for similarly straightforward reasons: slices can have any number of elements in them and can thus be of any size at run-time, and a trait can be implemented by any number of structs or enums so a trait object can also be of any size at run-time.
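To make the padding and alignment part a bit more concrete, here's a small sketch. `Padded` and `Reordered` are made-up structs, `#[repr(C)]` is only there to keep the fields in declaration order so the padding is predictable, and the numbers assume a typical target where `u32` is 4-byte aligned:

```rust
use std::mem::{align_of, size_of};

// repr(C) keeps the fields in declaration order
#[repr(C)]
struct Padded {
    a: u8,  // 1 byte + 3 bytes of padding so that b is 4-byte aligned
    b: u32, // 4 bytes
    c: u8,  // 1 byte + 3 bytes of trailing padding
}

#[repr(C)]
struct Reordered {
    b: u32, // 4 bytes
    a: u8,  // 1 byte
    c: u8,  // 1 byte + 2 bytes of trailing padding
}

fn main() {
    // assumption: u32 has 4-byte alignment on this target
    assert_eq!(4, align_of::<u32>());
    // same fields, different order, different size
    assert_eq!(12, size_of::<Padded>());
    assert_eq!(8, size_of::<Reordered>());
}
```

Without `#[repr(C)]` the compiler is free to reorder fields, so the default layout of `Padded` may well end up as compact as `Reordered`.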
**Pro tips**
- pointers to dynamically sized views into arrays are called slices in Rust, e.g. a `&str` is a _"string slice"_, a `&[i32]` is an _"i32 slice"_
- slices are double-width because they store a pointer to the array and the number of elements in the array
- trait object pointers are double-width because they store a pointer to the data and a pointer to a vtable
- unsized struct pointers are double-width because they store a pointer to the struct data and the metadata of the struct's unsized field, e.g. the number of elements if that field is a slice
- unsized structs can only have 1 unsized field and it must be the last field in the struct
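A quick sketch of the "pointer plus length" idea from the tips above: two slices into the same array can share a data pointer and differ only in the length stored alongside it. The variable names are just for illustration:

```rust
fn main() {
    let array = [10, 20, 30, 40];
    let all: &[i32] = &array;
    let first_two: &[i32] = &array[..2];

    // both slices point at the same data...
    assert_eq!(all.as_ptr(), first_two.as_ptr());
    // ...and only differ in the length stored in the pointer
    assert_eq!(4, all.len());
    assert_eq!(2, first_two.len());

    // for a string slice the stored length is in bytes, not chars
    let s: &str = "héllo";
    assert_eq!(6, s.len());
    assert_eq!(5, s.chars().count());
}
```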
To really hammer home the point about double-width pointers for unsized types here's a commented code example comparing arrays to slices:
```rust
use std::mem::size_of;
const WIDTH: usize = size_of::<&()>();
const DOUBLE_WIDTH: usize = 2 * WIDTH;
fn main() {
// data length stored in type
// an [i32; 3] is an array of three i32s
let nums: &[i32; 3] = &[1, 2, 3];
// single-width pointer
assert_eq!(WIDTH, size_of::<&[i32; 3]>());
let mut sum = 0;
// can iterate over nums safely
// Rust knows it's exactly 3 elements
for num in nums {
sum += num;
}
assert_eq!(6, sum);
// unsized coercion from [i32; 3] to [i32]
// data length now stored in pointer
let nums: &[i32] = &[1, 2, 3];
// double-width pointer required to also store data length
assert_eq!(DOUBLE_WIDTH, size_of::<&[i32]>());
let mut sum = 0;
// can iterate over nums safely
// Rust reads the length (3) from the pointer
for num in nums {
sum += num;
}
assert_eq!(6, sum);
}
```
And here's another commented code example comparing structs to trait objects:
```rust
use std::mem::size_of;
const WIDTH: usize = size_of::<&()>();
const DOUBLE_WIDTH: usize = 2 * WIDTH;
trait Trait {
fn print(&self);
}
struct Struct;
struct Struct2;
impl Trait for Struct {
fn print(&self) {
println!("struct");
}
}
impl Trait for Struct2 {
fn print(&self) {
println!("struct2");
}
}
fn print_struct(s: &Struct) {
// always prints "struct"
// this is known at compile-time
s.print();
// single-width pointer
assert_eq!(WIDTH, size_of::<&Struct>());
}
fn print_struct2(s2: &Struct2) {
// always prints "struct2"
// this is known at compile-time
s2.print();
// single-width pointer
assert_eq!(WIDTH, size_of::<&Struct2>());
}
fn print_trait(t: &dyn Trait) {
// print "struct" or "struct2" ?
// this is unknown at compile-time
t.print();
// Rust has to check the pointer at run-time
// to figure out whether to use Struct's
// or Struct2's implementation of "print"
// so the pointer has to be double-width
assert_eq!(DOUBLE_WIDTH, size_of::<&dyn Trait>());
}
fn main() {
// single-width pointer to data
let s = &Struct;
print_struct(s); // prints "struct"
// single-width pointer to data
let s2 = &Struct2;
print_struct2(s2); // prints "struct2"
// unsized coercion from Struct to dyn Trait
// double-width pointer to point to data AND Struct's vtable
let t: &dyn Trait = &Struct;
print_trait(t); // prints "struct"
// unsized coercion from Struct2 to dyn Trait
// double-width pointer to point to data AND Struct2's vtable
let t: &dyn Trait = &Struct2;
print_trait(t); // prints "struct2"
}
```
**Key Takeaways**
- only instances of sized types can be placed on the stack, i.e. can be passed around by value
- instances of unsized types can't be placed on the stack and must be passed around by reference
- pointers to unsized types are double-width because aside from pointing to data they need to do an extra bit of bookkeeping to also keep track of the data's length _or_ point to a vtable
## `Sized` Trait
The `Sized` trait in Rust is an auto trait and a marker trait.
Auto traits are traits that get automatically implemented for a type if it passes certain conditions. Marker traits are traits that mark a type as having a certain property. Marker traits do not have any trait items such as methods, associated functions, associated constants, or associated types. All auto traits are marker traits but not all marker traits are auto traits. Auto traits must be marker traits so the compiler can provide an automatic default implementation for them, which would not be possible if the trait had any trait items.
A type gets an auto `Sized` implementation if all of its members are also `Sized`. What "members" means depends on the containing type, for example: fields of a struct, variants of an enum, elements of an array, items of a tuple, and so on. Once a type has been "marked" with a `Sized` implementation that means its size in bytes is known at compile time.
Other examples of auto marker traits are the `Send` and `Sync` traits. A type is `Send` if it is safe to send that type across threads. A type is `Sync` if it's safe to share references of that type between threads. A type gets auto `Send` and `Sync` implementations if all of its members are also `Send` and `Sync`. What makes `Sized` somewhat special is that it's not possible to opt out of it, unlike the other auto marker traits.
```rust
#![feature(negative_impls)]
// this type is Sized, Send, and Sync
struct Struct;
// opt-out of Send trait
impl !Send for Struct {} // ✅
// opt-out of Sync trait
impl !Sync for Struct {} // ✅
// can't opt-out of Sized
impl !Sized for Struct {} // ❌
```
This seems reasonable since there might be reasons why we wouldn't want our type to be sent or shared across threads, however it's hard to imagine a scenario where we'd want the compiler to "forget" the size of our type and treat it as an unsized type as that offers no benefits and merely makes the type more difficult to work with.
Also, to be super pedantic `Sized` is not technically an auto trait since it's not defined using the `auto` keyword but the special treatment it gets from the compiler makes it behave very similarly to auto traits so in practice it's okay to think of it as an auto trait.
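Since negative impls need nightly, here's a sketch on stable Rust of how auto traits propagate from a type's members. The `assert_*` helpers and both structs are hypothetical, the helpers do nothing at run-time and only compile if the bound holds:

```rust
use std::rc::Rc;

// compile-time checks: a call only compiles if the bound holds
fn assert_send<T: Send>() {}
fn assert_sync<T: Sync>() {}
fn assert_sized<T: Sized>() {}

// all members are Send + Sync, so the struct is too
struct Plain {
    id: u32,
    name: String,
}

// Rc is neither Send nor Sync, so neither is this struct
struct NotThreadSafe {
    shared: Rc<u32>,
}

fn main() {
    assert_send::<Plain>();
    assert_sync::<Plain>();
    assert_sized::<Plain>();
    // still Sized, since Rc<u32> is a single-width pointer
    assert_sized::<NotThreadSafe>();
    // uncommenting either line below is a compile error
    // assert_send::<NotThreadSafe>();
    // assert_sync::<NotThreadSafe>();
}
```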
**Key Takeaways**
- `Sized` is an "auto" marker trait
## `Sized` in Generics
It's not immediately obvious that whenever we write any generic code every generic type parameter gets auto-bound with the `Sized` trait by default.
```rust
// this generic function...
fn func<T>(t: T) {}
// ...desugars to...
fn func<T: Sized>(t: T) {}
// ...which we can opt-out of by explicitly setting ?Sized...
fn func<T: ?Sized>(t: T) {} // ❌
// ...which doesn't compile since it doesn't have
// a known size so we must put it behind a pointer...
fn func<T: ?Sized>(t: &T) {} // ✅
fn func<T: ?Sized>(t: Box<T>) {} // ✅
```
**Pro tips**
- `?Sized` can be pronounced _"optionally sized"_ or _"maybe sized"_ and adding it to a type parameter's bounds allows the type to be sized or unsized
- `?Sized` in general is referred to as a _"widening bound"_ or a _"relaxed bound"_ as it relaxes rather than constrains the type parameter
- `?Sized` is the only relaxed bound in Rust
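As a rough sketch of what relaxing the bound buys us, compare these two hypothetical helpers. `sized_only` keeps the default bound and can ask for the size of `T` at compile time, while `maybe_sized` opts out and has to read the size from the value itself:

```rust
use std::mem::{size_of, size_of_val};

// implicit T: Sized, so the size is a compile-time constant
fn sized_only<T>(_t: &T) -> usize {
    size_of::<T>()
}

// T may be unsized, so the size is read from the value at run-time
fn maybe_sized<T: ?Sized>(t: &T) -> usize {
    size_of_val(t)
}

fn main() {
    assert_eq!(4, sized_only(&1_i32));
    assert_eq!(12, sized_only(&[1_i32, 2, 3])); // T = [i32; 3]
    // sized_only("str"); // compile error, T = str is not Sized

    assert_eq!(4, maybe_sized(&1_i32)); // sized types still work
    assert_eq!(3, maybe_sized("str")); // T = str
    assert_eq!(8, maybe_sized(&[1_i32, 2][..])); // T = [i32]
}
```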
So why does this matter? Well, any time we're working with a generic type and that type is behind a pointer we almost always want to opt-out of the default `Sized` bound to make our function more flexible in what argument types it will accept. Also, if we don't opt-out of the default `Sized` bound we'll eventually get some surprising and confusing compile error messages.
Let me take you on the journey of the first generic function I ever wrote in Rust. I started learning Rust before the `dbg!` macro landed in stable so the only way to print debug values was to type out `println!("{:?}", some_value);` every time which is pretty tedious so I decided to write a `debug` helper function like this:
```rust
use std::fmt::Debug;
fn debug<T: Debug>(t: T) { // T: Debug + Sized
println!("{:?}", t);
}
fn main() {
debug("my str"); // T = &str, &str: Debug + Sized ✅
}
```
So far so good, but the function takes ownership of any values passed to it which is kinda annoying so I changed the function to only take references instead:
```rust
use std::fmt::Debug;
fn dbg<T: Debug>(t: &T) { // T: Debug + Sized
println!("{:?}", t);
}
fn main() {
dbg("my str"); // &T = &str, T = str, str: Debug + !Sized ❌
}
```
Which now throws this error:
```none
error[E0277]: the size for values of type `str` cannot be known at compilation time
--> src/main.rs:8:9
|
3 | fn dbg<T: Debug>(t: &T) {
| - required by this bound in `dbg`
...
8 | dbg("my str");
| ^^^^^^^^ doesn't have a size known at compile-time
|
= help: the trait `std::marker::Sized` is not implemented for `str`
= note: to learn more, visit <https://doc.rust-lang.org/book/ch19-04-advanced-types.html#dynamically-sized-types-and-the-sized-trait>
help: consider relaxing the implicit `Sized` restriction
|
3 | fn dbg<T: Debug + ?Sized>(t: &T) {
|
```
When I first saw this I found it incredibly confusing. Despite making my function more restrictive in what arguments it takes than before it now somehow throws a compile error! What is going on?
I've already kinda spoiled the answer in the code comments above, but basically: Rust performs pattern matching when resolving `T` to its concrete types during compilation. Here's a couple tables to help clarify:
| Type | `T` | `&T` |
|------------|---|----|
| `&str` | `T` = `&str` | `T` = `str` |
| Type | `Sized` |
|-|-|
| `str` | ❌ |
| `&str` | ✅ |
| `&&str` | ✅ |
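The first table can also be observed from inside a program. This is just a sketch: `by_value` and `by_ref` are hypothetical helpers, and `std::any::type_name` is meant for diagnostics so the exact strings it returns aren't guaranteed. Note that `by_ref` already needs the relaxed bound to accept a `&str`:

```rust
use std::any::type_name;

// matches the argument's type against the pattern `T`
fn by_value<T>(_t: T) -> &'static str {
    type_name::<T>()
}

// matches the argument's type against the pattern `&T`
fn by_ref<T: ?Sized>(_t: &T) -> &'static str {
    type_name::<T>()
}

fn main() {
    // same argument, different T depending on the parameter's pattern
    println!("{}", by_value("my str")); // T resolves to &str
    println!("{}", by_ref("my str")); // T resolves to str
}
```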
This is why I had to add a `?Sized` bound to make the function work as intended after changing it to take references. The working function below:
```rust
use std::fmt::Debug;
fn debug<T: Debug + ?Sized>(t: &T) { // T: Debug + ?Sized
println!("{:?}", t);
}
fn main() {
debug("my str"); // &T = &str, T = str, str: Debug + !Sized ✅
}
```
**Key Takeaways**
- all generic type parameters are auto-bound with `Sized` by default
- if we have a generic function which takes an argument of some `T` behind a pointer, e.g. `&T`, `Box<T>`, `Rc<T>`, et cetera, then we almost always want to opt-out of the default `Sized` bound with `T: ?Sized`
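The same applies to smart pointers. Here's a sketch with two hypothetical helpers, `debug_boxed` and `debug_shared`, which return the formatted string instead of printing it so the result is easy to check:

```rust
use std::fmt::Debug;
use std::rc::Rc;

fn debug_boxed<T: Debug + ?Sized>(t: Box<T>) -> String {
    format!("{:?}", t)
}

fn debug_shared<T: Debug + ?Sized>(t: &Rc<T>) -> String {
    format!("{:?}", t)
}

fn main() {
    let boxed_str: Box<str> = "boxed".into();
    assert_eq!("\"boxed\"", debug_boxed(boxed_str)); // T = str

    let boxed_slice: Box<[i32]> = vec![1, 2, 3].into_boxed_slice();
    assert_eq!("[1, 2, 3]", debug_boxed(boxed_slice)); // T = [i32]

    let shared: Rc<str> = Rc::from("shared");
    assert_eq!("\"shared\"", debug_shared(&shared)); // T = str

    assert_eq!("7", debug_boxed(Box::new(7))); // T = i32, sized types still work
}
```

Without the `?Sized` bounds only the last call would compile.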
## Unsized Types
### Slices
The most common slices are string slices `&str` and array slices `&[T]`. What's nice about slices is that many other types coerce to them, so leveraging slices and Rust's auto type coercions allows us to write flexible APIs.
Type coercions can happen in several places but most notably on function arguments and at method calls. The kinds of type coercions we're interested in are deref coercions and unsized coercions. A deref coercion is when a `T` gets coerced into a `U` following a deref operation, i.e. `T: Deref<Target = U>`, e.g. `String.deref() -> str`. An unsized coercion is when a `T` gets coerced into a `U` where `T` is a sized type and `U` is an unsized type, i.e. `T: Unsize<U>`, e.g. `[i32; 3] -> [i32]`.
```rust
trait Trait {
fn method(&self) {}
}
impl Trait for str {
// can now call "method" on
// 1) str or
// 2) String since String: Deref<Target = str>
}
impl<T> Trait for [T] {
// can now call "method" on
// 1) any &[T]
// 2) any U where U: Deref<Target = [T]>, e.g. Vec<T>
// 3) [T; N] for any N, since [T; N]: Unsize<[T]>
}
fn str_fun(s: &str) {}
fn slice_fun<T>(s: &[T]) {}
fn main() {
let str_slice: &str = "str slice";
let string: String = "string".to_owned();
// function args
str_fun(str_slice);
str_fun(&string); // deref coercion
// method calls
str_slice.method();
string.method(); // deref coercion
let slice: &[i32] = &[1];
let three_array: [i32; 3] = [1, 2, 3];
let five_array: [i32; 5] = [1, 2, 3, 4, 5];
let vec: Vec<i32> = vec![1];
// function args
slice_fun(slice);
slice_fun(&vec); // deref coercion
slice_fun(&three_array); // unsized coercion
slice_fun(&five_array); // unsized coercion
// method calls
slice.method();
vec.method(); // deref coercion
three_array.method(); // unsized coercion
five_array.method(); // unsized coercion
}
```
**Key Takeaways**
- leveraging slices and Rust's auto type coercions allows us to write flexible APIs
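As a sketch of what such a flexible API might look like in practice, here are two hypothetical functions, `total` and `shout`, that only ask for slices and can therefore be called with a bunch of different argument types:

```rust
// accepts anything that coerces to &[i32]
fn total(nums: &[i32]) -> i32 {
    nums.iter().sum()
}

// accepts anything that coerces to &str
fn shout(s: &str) -> String {
    s.to_uppercase()
}

fn main() {
    let vec = vec![1, 2, 3];
    let array = [4, 5];
    assert_eq!(6, total(&vec)); // deref coercion from &Vec<i32>
    assert_eq!(9, total(&array)); // unsized coercion from &[i32; 2]
    assert_eq!(5, total(&vec[1..])); // already a slice

    let string = String::from("hi");
    let boxed: Box<str> = "yo".into();
    assert_eq!("HI", shout(&string)); // deref coercion from &String
    assert_eq!("YO", shout(&boxed)); // deref coercion from &Box<str>
    assert_eq!("OK", shout("ok")); // already a slice
}
```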
### Trait Objects
Traits are `?Sized` by default. This program:
```rust
trait Trait: ?Sized {}
```
Throws this error:
```none
error: `?Trait` is not permitted in supertraits
--> src/main.rs:1:14
|
1 | trait Trait: ?Sized {}
| ^^^^^^
|
= note: traits are `?Sized` by default
```
We'll get into why traits are `?Sized` by default soon but first let's ask ourselves what are the implications of a trait being `?Sized`? Let's desugar the above example:
```rust
trait Trait where Self: ?Sized {}
```
Okay, so by default traits allow `self` to possibly be an unsized type. As we learned earlier we can't pass unsized types around by value, so that limits us in the kind of methods we can define in the trait. It should be impossible to write a method that takes or returns `self` by value and yet this surprisingly compiles:
```rust
trait Trait {
fn method(self); // ✅
}
```
However the moment we try to implement the method, either by providing a default implementation or by implementing the trait for an unsized type, we get compile errors:
```rust
trait Trait {
fn method(self) {} // ❌
}
impl Trait for str {
fn method(self) {} // ❌
}
```
Throws:
```none
error[E0277]: the size for values of type `Self` cannot be known at compilation time
--> src/lib.rs:2:15
|
2 | fn method(self) {}
| ^^^^ doesn't have a size known at compile-time
|
= help: the trait `std::marker::Sized` is not implemented for `Self`
= note: to learn more, visit <https://doc.rust-lang.org/book/ch19-04-advanced-types.html#dynamically-sized-types-and-the-sized-trait>
= note: all local variables must have a statically known size
= help: unsized locals are gated as an unstable feature
help: consider further restricting `Self`
|
2 | fn method(self) where Self: std::marker::Sized {}
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
error[E0277]: the size for values of type `str` cannot be known at compilation time
--> src/lib.rs:6:15
|
6 | fn method(self) {}
| ^^^^ doesn't have a size known at compile-time
|
= help: the trait `std::marker::Sized` is not implemented for `str`
= note: to learn more, visit <https://doc.rust-lang.org/book/ch19-04-advanced-types.html#dynamically-sized-types-and-the-sized-trait>
= note: all local variables must have a statically known size
= help: unsized locals are gated as an unstable feature
```
If we're determined to pass `self` around by value we can fix the first error by explicitly binding the trait with `Sized`:
```rust
trait Trait: Sized {
fn method(self) {} // ✅
}
impl Trait for str { // ❌
fn method(self) {}
}
```
Now throws:
```none
error[E0277]: the size for values of type `str` cannot be known at compilation time
--> src/lib.rs:7:6
|
1 | trait Trait: Sized {
| ----- required by this bound in `Trait`
...
7 | impl Trait for str {
| ^^^^^ doesn't have a size known at compile-time
|
= help: the trait `std::marker::Sized` is not implemented for `str`
= note: to learn more, visit <https://doc.rust-lang.org/book/ch19-04-advanced-types.html#dynamically-sized-types-and-the-sized-trait>
```
Which is okay, as we knew upon binding the trait with `Sized` we'd no longer be able to implement it for unsized types such as `str`. If on the other hand we really wanted to implement the trait for `str` an alternative solution would be to keep the trait `?Sized` and pass `self` around by reference:
```rust
trait Trait {
fn method(&self) {} // ✅
}
impl Trait for str {
fn method(&self) {} // ✅
}
```
Instead of marking the entire trait as `?Sized` or `Sized` we have the more granular and precise option of marking individual methods as `Sized` like so:
```rust
trait Trait {
fn method(self) where Self: Sized {}
}
impl Trait for str {} // ✅!?
fn main() {
"str".method(); // ❌
}
```
It's surprising that Rust compiles `impl Trait for str {}` without any complaints, but it eventually catches the error when we attempt to call `method` on an unsized type so all is fine. It's a little weird but affords us some flexibility in implementing traits with some `Sized` methods for unsized types as long as we never call the `Sized` methods:
```rust
trait Trait {
fn method(self) where Self: Sized {}
fn method2(&self) {}
}
impl Trait for str {} // ✅
fn main() {
// we never call "method" so no errors
"str".method2(); // ✅
}
```
Now back to the original question, why are traits `?Sized` by default? The answer is trait objects. Trait objects are inherently unsized because any type of any size can implement a trait, therefore we can only implement `Trait` for `dyn Trait` if `Trait: ?Sized`. To put it in code:
```rust
trait Trait: ?Sized {}
// the above is REQUIRED for
impl Trait for dyn Trait {
// compiler magic here
}
// since `dyn Trait` is unsized
// and now we can use `dyn Trait` in our program
fn function(t: &dyn Trait) {} // ✅
```
If we try to actually compile the above program we get:
```none
error[E0371]: the object type `(dyn Trait + 'static)` automatically implements the trait `Trait`
--> src/lib.rs:5:1
|
5 | impl Trait for dyn Trait {
| ^^^^^^^^^^^^^^^^^^^^^^^^ `(dyn Trait + 'static)` automatically implements trait `Trait`
```
Which is the compiler telling us to chill since it automatically provides the implementation of `Trait` for `dyn Trait`. Again, since `dyn Trait` is unsized the compiler can only provide this implementation if `Trait: ?Sized`. If we bound `Trait` by `Sized` then `Trait` becomes _"object unsafe"_ which is a term that means we can't cast types which implement `Trait` to trait objects of `dyn Trait`. As expected this program does not compile:
```rust
trait Trait: Sized {}
fn function(t: &dyn Trait) {} // ❌
```
Throws:
```none
error[E0038]: the trait `Trait` cannot be made into an object
--> src/lib.rs:3:18
|
1 | trait Trait: Sized {}
| ----- ----- ...because it requires `Self: Sized`
| |
| this trait cannot be made into an object...
2 |
3 | fn function(t: &dyn Trait) {}
| ^^^^^^^^^^ the trait `Trait` cannot be made into an object
```
Let's try to make a `?Sized` trait with a `Sized` method and see if we can cast it to a trait object:
```rust
trait Trait {
fn method(self) where Self: Sized {}
fn method2(&self) {}
}
fn function(arg: &dyn Trait) { // ✅
arg.method(); // ❌
arg.method2(); // ✅
}
```
As we saw before everything is okay as long as we don't call the `Sized` method on the trait object.
**Key Takeaways**
- all traits are `?Sized` by default
- `Trait: ?Sized` is required for `impl Trait for dyn Trait`
- we can require `Self: Sized` on a per-method basis
- traits bound by `Sized` can't be made into trait objects
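Putting the last few takeaways together, here's a sketch of a trait that mixes a by-value `Sized` method with a by-reference method and is still usable as a trait object. `Shape`, `Square`, and `total_area` are hypothetical names made up for this example:

```rust
trait Shape {
    fn area(&self) -> f64;

    // only callable on sized implementors, never on dyn Shape
    fn scaled(self, factor: f64) -> Self
    where
        Self: Sized;
}

struct Square(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
    fn scaled(self, factor: f64) -> Self {
        Square(self.0 * factor)
    }
}

// only uses the by-reference method, so trait objects are fine here
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    // the by-value method works on the concrete, sized type
    let big = Square(1.0).scaled(3.0);
    assert_eq!(9.0, big.area());
    // and the trait can still be made into a trait object
    let shapes: Vec<Box<dyn Shape>> = vec![Box::new(big), Box::new(Square(2.0))];
    assert_eq!(13.0, total_area(&shapes));
}
```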
### Trait Object Limitations
Even if a trait is object-safe there are still sizedness-related edge cases which limit what types can be cast to trait objects and how many and what kind of traits can be represented by a trait object.
#### Cannot Cast Unsized Types to Trait Objects
```rust
fn generic<T: ToString>(t: T) {}
fn trait_object(t: &dyn ToString) {}
fn main() {
generic(String::from("String")); // ✅
generic("str"); // ✅
trait_object(&String::from("String")); // ✅ - unsized coercion
trait_object("str"); // ❌ - unsized coercion impossible
}
```
Throws:
```none
error[E0277]: the size for values of type `str` cannot be known at compilation time
--> src/main.rs:8:18
|
8 | trait_object("str");
| ^^^^^ doesn't have a size known at compile-time
|
= help: the trait `std::marker::Sized` is not implemented for `str`
= note: to learn more, visit <https://doc.rust-lang.org/book/ch19-04-advanced-types.html#dynamically-sized-types-and-the-sized-trait>
= note: required for the cast to the object type `dyn std::string::ToString`
```
The reason why passing a `&String` to a function expecting a `&dyn ToString` works is because of type coercion. `String` implements `ToString` and we can convert a sized type such as `String` into an unsized type such as `dyn ToString` via an unsized coercion. `str` also implements `ToString` and converting `str` into a `dyn ToString` would also require an unsized coercion but `str` is already unsized! How do we unsize an already unsized type into another unsized type?
`&str` pointers are double-width, storing a pointer to the data and the data length. `&dyn ToString` pointers are also double-width, storing a pointer to the data and a pointer to a vtable. To coerce a `&str` into a `&dyn ToString` would require a triple-width pointer to store a pointer to the data, the data length, and a pointer to a vtable. Rust does not support triple-width pointers so casting an unsized type to a trait object is not possible.
Previous two paragraphs summarized in a table:
| Type | Pointer to Data | Data Length | Pointer to VTable | Total Width |
|-|-|-|-|-|
| `&String` | ✅ | ❌ | ❌ | 1 ✅ |
| `&str` | ✅ | ✅ | ❌ | 2 ✅ |
| `&String as &dyn ToString` | ✅ | ❌ | ✅ | 2 ✅ |
| `&str as &dyn ToString` | ✅ | ✅ | ✅ | 3 ❌ |
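One possible way around this, sketched below, is to add a level of indirection: a `&str` is itself a sized type which also implements `ToString`, so a `&&str` is a single-width pointer that can be coerced to a `&dyn ToString`. The `trait_object` function here is a variation of the one above that returns the string so we can check it:

```rust
use std::mem::size_of;

fn trait_object(t: &dyn ToString) -> String {
    t.to_string()
}

fn main() {
    let s: &str = "str";
    // &s is a &&str: a single-width pointer to the sized, double-width &str
    assert_eq!(size_of::<&()>(), size_of::<&&str>());
    // so the unsized coercion to &dyn ToString is possible
    assert_eq!("str", trait_object(&s));
    // an owned String is sized as well
    assert_eq!("str", trait_object(&s.to_owned()));
}
```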
#### Cannot create Multi-Trait Objects
```rust
trait Trait {}
trait Trait2 {}
fn function(t: &(dyn Trait + Trait2)) {}
```
Throws:
```none
error[E0225]: only auto traits can be used as additional traits in a trait object
--> src/lib.rs:4:30
|
4 | fn function(t: &(dyn Trait + Trait2)) {}
| ----- ^^^^^^
| | |
| | additional non-auto trait
| | trait alias used in trait object type (additional use)
| first non-auto trait
| trait alias used in trait object type (first use)
```
Remember that a trait object pointer is double-width, storing 1 pointer to the data and another to the vtable. There are 2 traits here, so there are 2 vtables, which would require the `&(dyn Trait + Trait2)` pointer to be 3 widths. Auto traits like `Sync` and `Send` are allowed since they don't have methods and thus don't have vtables.
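To see the auto trait exception in action, here is a small sketch. The `name` method and the `Struct` type are assumptions added for illustration; the point is only that `Send` and `Sync` can be tacked onto a trait object type without widening the pointer.

```rust
use std::mem::size_of;

trait Trait {
    fn name(&self) -> &'static str;
}

struct Struct;

impl Trait for Struct {
    fn name(&self) -> &'static str {
        "Struct"
    }
}

// `Send` and `Sync` are auto traits, so they may follow the first trait
fn function(t: &(dyn Trait + Send + Sync)) -> &'static str {
    t.name()
}

fn main() {
    assert_eq!(function(&Struct), "Struct");
    // the extra auto traits need no vtables, so the pointer stays double-width
    assert_eq!(
        size_of::<&(dyn Trait + Send + Sync)>(),
        size_of::<&dyn Trait>()
    );
}
```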
The workaround for this is to combine vtables by combining the traits using another trait like so:
```rust
trait Trait {
    fn method(&self) {}
}

trait Trait2 {
    fn method2(&self) {}
}

trait Trait3: Trait + Trait2 {}

// auto blanket impl Trait3 for any type that also impls Trait & Trait2
impl<T: Trait + Trait2> Trait3 for T {}

// from `dyn Trait + Trait2` to `dyn Trait3`
fn function(t: &dyn Trait3) {
    t.method(); // ✅
    t.method2(); // ✅
}
```
One downside of this workaround is that, at the time of writing, Rust does not support supertrait upcasting. What this means is that if we have a `dyn Trait3` we can't use it where we need a `dyn Trait` or a `dyn Trait2`. This program does not compile:
```rust
trait Trait {
    fn method(&self) {}
}

trait Trait2 {
    fn method2(&self) {}
}

trait Trait3: Trait + Trait2 {}

impl<T: Trait + Trait2> Trait3 for T {}

struct Struct;
impl Trait for Struct {}
impl Trait2 for Struct {}

fn takes_trait(t: &dyn Trait) {}
fn takes_trait2(t: &dyn Trait2) {}

fn main() {
    let t: &dyn Trait3 = &Struct;
    takes_trait(t); // ❌
    takes_trait2(t); // ❌
}
```
Throws:
```none
error[E0308]: mismatched types
--> src/main.rs:22:17
|
22 | takes_trait(t);
| ^ expected trait `Trait`, found trait `Trait3`
|
= note: expected reference `&dyn Trait`
found reference `&dyn Trait3`
error[E0308]: mismatched types
--> src/main.rs:23:18
|
23 | takes_trait2(t);
| ^ expected trait `Trait2`, found trait `Trait3`
|
= note: expected reference `&dyn Trait2`
found reference `&dyn Trait3`
```
This is because `dyn Trait3` is a distinct type from `dyn Trait` and `dyn Trait2` in the sense that they have different vtable layouts, although `dyn Trait3` does contain all the methods of `dyn Trait` and `dyn Trait2`. The workaround here is to add explicit casting methods:
```rust
trait Trait {}
trait Trait2 {}

trait Trait3: Trait + Trait2 {
    fn as_trait(&self) -> &dyn Trait;
    fn as_trait2(&self) -> &dyn Trait2;
}

impl<T: Trait + Trait2> Trait3 for T {
    fn as_trait(&self) -> &dyn Trait {
        self
    }
    fn as_trait2(&self) -> &dyn Trait2 {
        self
    }
}

struct Struct;
impl Trait for Struct {}
impl Trait2 for Struct {}

fn takes_trait(t: &dyn Trait) {}
fn takes_trait2(t: &dyn Trait2) {}

fn main() {
    let t: &dyn Trait3 = &Struct;
    takes_trait(t.as_trait()); // ✅
    takes_trait2(t.as_trait2()); // ✅
}
```
This is a simple and straightforward workaround that seems like something the Rust compiler could automate for us. Rust is not shy about performing type coercions, as we have seen with deref and unsized coercions, so why isn't there a trait upcasting coercion? This is a good question with a familiar answer: the Rust core team is working on other higher-priority and higher-impact features. Fair enough.
**Key Takeaways**
- Rust doesn't support pointers wider than 2 widths so
    - we can't cast unsized types to trait objects
    - we can't have multi-trait objects, but we can work around this by coalescing multiple traits into a single trait
### User-Defined Unsized Types
```rust
struct Unsized {
    unsized_field: [i32],
}
```
We can define an unsized struct by giving the struct an unsized field. Unsized structs can only have 1 unsized field and it must be the last field in the struct. This is a requirement so that the compiler can determine the starting offset of every field in the struct at compile-time, which is important for efficient and fast field access. Furthermore, a single unsized field is the most that can be tracked using a double-width pointer, as more unsized fields would require more widths.
So how do we even instantiate this thing? The same way we do with any unsized type: by first making a sized version of it and then coercing it into the unsized version. However, `Unsized` is always unsized by definition, so there's no way to make a sized version of it! The only workaround is to make the struct generic so that it can exist in both sized and unsized versions:
```rust
struct MaybeSized<T: ?Sized> {
    maybe_sized: T,
}

fn main() {
    // unsized coercion from MaybeSized<[i32; 3]> to MaybeSized<[i32]>
    let ms: &MaybeSized<[i32]> = &MaybeSized { maybe_sized: [1, 2, 3] };
}
```
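As a rough sketch of what the unsized version buys us, the function below accepts `&MaybeSized<[i32]>` and therefore works for any array length. The `sum` function is a hypothetical addition for illustration, not something from a real API.

```rust
use std::mem::{size_of, size_of_val};

struct MaybeSized<T: ?Sized> {
    maybe_sized: T,
}

// Hypothetical helper: takes the unsized version, so any array length is accepted.
fn sum(ms: &MaybeSized<[i32]>) -> i32 {
    ms.maybe_sized.iter().sum()
}

fn main() {
    let sized = MaybeSized { maybe_sized: [1, 2, 3] };
    // unsized coercion from &MaybeSized<[i32; 3]> to &MaybeSized<[i32]>
    let ms: &MaybeSized<[i32]> = &sized;
    assert_eq!(sum(ms), 6);

    // the reference is double-width: pointer to data + slice length
    assert_eq!(size_of::<&MaybeSized<[i32]>>(), 2 * size_of::<usize>());
    // the size of the value is only known at run-time, via the stored length
    assert_eq!(size_of_val(ms), 3 * size_of::<i32>());
}
```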
So what are the use-cases of this? There aren't any particularly compelling ones. User-defined unsized types are a pretty half-baked feature right now and their limitations outweigh any benefits. They're mentioned here purely for the sake of comprehensiveness.
**Fun fact:** `std::ffi::OsStr` and `std::path::Path` are 2 unsized structs in the standard library that you've probably used before without realizing!
**Key Takeaways**
- user-defined unsized types are a half-baked feature right now and their limitations outweigh any benefits
## Zero-Sized Types
ZSTs sound exotic at first but they're used everywhere.
### Unit Type
The most common ZST is the unit type: `()`. All empty blocks `{}` evaluate to `()` and if the block is non-empty but the last expression is discarded with a semicolon `;` then it also evaluates to `()`. Example:
```rust
fn main() {
    let a: () = {};
    let b: i32 = {
        5
    };
    let c: () = {
        5;
    };
}
```
Every function which doesn't have an explicit return type returns `()` by default.
```rust
// with sugar
fn function() {}
// desugared
fn function() -> () {}
```
Since `()` is zero bytes, all instances of `()` are the same, which makes for some really simple `Default`, `PartialEq`, and `Ord` implementations:
```rust
use std::cmp::Ordering;

impl Default for () {
    fn default() {}
}

impl PartialEq for () {
    fn eq(&self, _other: &()) -> bool {
        true
    }
    fn ne(&self, _other: &()) -> bool {
        false
    }
}

impl Ord for () {
    fn cmp(&self, _other: &()) -> Ordering {
        Ordering::Equal
    }
}
```
The compiler understands `()` is zero-sized and optimizes away interactions with instances of `()`. For example, a `Vec<()>` will never make any heap allocations, and pushing and popping `()` from the `Vec` just increments and decrements its `len` field:
```rust
fn main() {
    // zero capacity is all the capacity we need to "store" infinitely many ()
    let mut vec: Vec<()> = Vec::with_capacity(0);
    // causes no heap allocations or vec capacity changes
    vec.push(()); // len++
    vec.push(()); // len++
    vec.push(()); // len++
    vec.pop(); // len--
    assert_eq!(2, vec.len());
}
```
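We can poke at these claims a bit further. The sketch below uses a hypothetical helper, `push_units`, to push many `()` values and report the resulting length and capacity. `Vec` reports a capacity of `usize::MAX` for zero-sized element types, which is consistent with it never needing to allocate.

```rust
use std::mem::size_of;

// Hypothetical helper: push `n` unit values and return (len, capacity).
fn push_units(n: usize) -> (usize, usize) {
    let mut vec: Vec<()> = Vec::new();
    for _ in 0..n {
        vec.push(()); // len++
    }
    (vec.len(), vec.capacity())
}

fn main() {
    assert_eq!(size_of::<()>(), 0);
    // an array of ZSTs is itself a ZST
    assert_eq!(size_of::<[(); 1000]>(), 0);
    // the capacity never changes no matter how many () we push
    assert_eq!(push_units(0), (0, usize::MAX));
    assert_eq!(push_units(1_000), (1_000, usize::MAX));
}
```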
The above example has no practical applications, but is there any situation where we can take advantage of the above idea in a meaningful way? Surprisingly, yes: we can get an efficient `HashSet<Key>` implementation from a `HashMap<Key, Value>` by setting the `Value` to `()`, which is exactly how `HashSet` in the Rust standard library works:
```rust
// std::collections::HashSet
pub struct HashSet<T> {
    map: HashMap<T, ()>,
}
```
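To make the idea concrete, here's a toy set built the same way. `SimpleSet` and its methods are hypothetical names for this sketch; it is not the standard library's actual implementation, just the same trick in miniature.

```rust
use std::collections::HashMap;
use std::hash::Hash;

// Hypothetical toy type: a set is just a map whose values carry no data.
struct SimpleSet<T> {
    map: HashMap<T, ()>,
}

impl<T: Eq + Hash> SimpleSet<T> {
    fn new() -> Self {
        SimpleSet { map: HashMap::new() }
    }

    // returns true if the value was not already present
    fn insert(&mut self, value: T) -> bool {
        self.map.insert(value, ()).is_none()
    }

    fn contains(&self, value: &T) -> bool {
        self.map.contains_key(value)
    }

    fn len(&self) -> usize {
        self.map.len()
    }
}

fn main() {
    let mut set = SimpleSet::new();
    assert!(set.insert("a"));
    assert!(!set.insert("a")); // duplicate, nothing added
    assert!(set.contains(&"a"));
    assert!(!set.contains(&"b"));
    assert_eq!(set.len(), 1);
}
```

Since the values are zero-sized, each entry should cost no more storage than the key itself plus the map's own bookkeeping.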
**Key Takeaways**
- all instances of a ZST are equal to each other
- Rust compiler knows to optimize away interactions with ZSTs
### User-Defined Unit Structs
A unit struct is any struct without any fields, e.g.
```rust
struct Struct;
```
Properties that make unit structs more useful than `()`:
- we can implement whatever traits we want on our own unit structs, whereas Rust's trait orphan rules prevent us from implementing traits defined in other crates for `()` as it's defined in the standard library
- unit structs can be given meaningful names within the context of our program
- unit structs, like all structs, are non-Copy by default, which may be important in the context of our program
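Here's a quick sketch of the first two properties. The `Greeter` trait and the `English` and `Spanish` unit structs are made up for illustration.

```rust
use std::fmt;
use std::mem::size_of;

// Hypothetical trait and unit structs, named for illustration only.
trait Greeter {
    fn greet(&self) -> &'static str;
}

struct English;
struct Spanish;

impl Greeter for English {
    fn greet(&self) -> &'static str {
        "hello"
    }
}

impl Greeter for Spanish {
    fn greet(&self) -> &'static str {
        "hola"
    }
}

// a standard library trait implemented for our own unit struct,
// which the orphan rules allow because `English` is a local type
impl fmt::Display for English {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "English")
    }
}

fn main() {
    assert_eq!(English.greet(), "hello");
    assert_eq!(Spanish.greet(), "hola");
    assert_eq!(English.to_string(), "English");
    // still zero-sized, just like ()
    assert_eq!(size_of::<English>(), 0);
    assert_eq!(size_of::<Spanish>(), 0);
}
```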
### Never Type
The second most common ZST is the never type: `!`. It's called the never type because it represents computations that never resolve to any value at all.
A couple of interesting properties of `!` that make it different from `()`:
- `!` can be coerced into any other type
- it's not possible to create instances of `!`
The first interesting property is very useful for ergonomics and allows us to use handy macros like these:
```rust
// nice for quick prototyping
fn example<T>(t: &[T]) -> Vec<T> {
    unimplemented!() // ! coerced to Vec<T>
}

fn example2() -> i32 {
    // we know this parse call will never fail
    match "123".parse::<i32>() {
        Ok(num) => num,
        Err(_) => unreachable!(), // ! coerced to i32
    }
}

fn example3(some_condition: bool) -> &'static str {
    if !some_condition {
        panic!() // ! coerced to &str
    } else {
        "str"
    }
}
```
`break`, `continue`, and `return` expressions also have type `!`:
```rust
fn example() -> i32 {
    // we can set the type of x to anything here
    // since the block never evaluates to any value
    let x: String = {
        return 123 // ! coerced to String
    };
}

fn example2(nums: &[i32]) -> Vec<i32> {
    let mut filtered = Vec::new();
    for num in nums {
        filtered.push(
            if *num < 0 {
                break // ! coerced to i32
            } else if *num % 2 == 0 {
                *num
            } else {
                continue // ! coerced to i32
            }
        );
    }
    filtered
}
```
The second interesting property of `!` allows us to mark certain states as impossible on a type level. Let's take this function signature as an example:
```rust
fn function() -> Result<Success, Error>;
```
We know that if the function returns and was successful the `Result` will contain some instance of type `Success` and if it errored `Result` will contain some instance of type `Error`. Now let's compare that to this function signature:
```rust
fn function() -> Result<Success, !>;
```
We know that if the function returns and was successful the `Result` will hold some instance of type `Success` and if it errored... but wait, it can never error, since it's impossible to create instances of `!`. Given the above function signature we know this function will never error. How about this function signature:
```rust
fn function() -> Result<!, Error>;
```
The inverse of the previous is now true: if this function returns we know it must have errored as success is impossible.
A practical application of the former example would be the `FromStr` implementation for `String` as it's impossible to fail converting a `&str` into a `String`:
```rust
#![feature(never_type)]

use std::str::FromStr;

impl FromStr for String {
    type Err = !;
    fn from_str(s: &str) -> Result<String, Self::Err> {
        Ok(String::from(s))
    }
}
```
A practical application of the latter example would be a function that runs an infinite loop that's never meant to return, like a server responding to client requests, unless there's some error:
```rust
#![feature(never_type)]

fn run_server() -> Result<!, ConnectionError> {
    loop {
        let (request, response) = get_request()?;
        let result = request.process();
        response.send(result);
    }
}
```
The feature flag is necessary because, while the never type exists and works within Rust internals, using it in user-code is still considered experimental.
**Key Takeaways**
- `!` can be coerced into any other type
- it's not possible to create instances of `!` which we can use to mark certain states as impossible at a type level
### User-Defined Pseudo Never Types
While it's not possible to define a type that can coerce to any other type, it is possible to define a type which is impossible to create instances of, such as an `enum` without any variants:
```rust
enum Void {}
```
This allows us to remove the feature flag from the previous two examples and implement them using stable Rust:
```rust
use std::str::FromStr;

enum Void {}

// example 1
impl FromStr for String {
    type Err = Void;
    fn from_str(s: &str) -> Result<String, Self::Err> {
        Ok(String::from(s))
    }
}

// example 2
fn run_server() -> Result<Void, ConnectionError> {
    loop {
        let (request, response) = get_request()?;
        let result = request.process();
        response.send(result);
    }
}
```
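The server example above leans on helpers that aren't defined, so here is a self-contained sketch of the same shape that runs on stable Rust. Everything besides `Void` is a stand-in invented for illustration: the "requests" are just numbers in a `Vec`, `next_request` is a hypothetical helper, and `ConnectionError` is a placeholder error type.

```rust
// an enum with no variants has no values, so `Ok(Void)` can never be built
enum Void {}

// placeholder error type for this sketch
#[derive(Debug, PartialEq)]
struct ConnectionError(&'static str);

// hypothetical helper: fails once the queue of fake requests is drained
fn next_request(queue: &mut Vec<u32>) -> Result<u32, ConnectionError> {
    queue.pop().ok_or(ConnectionError("no more requests"))
}

fn run_server(queue: &mut Vec<u32>, handled: &mut Vec<u32>) -> Result<Void, ConnectionError> {
    loop {
        let request = next_request(queue)?;
        handled.push(request * 2); // "process" the request
    }
}

fn main() {
    let mut queue = vec![1, 2, 3];
    let mut handled = Vec::new();
    match run_server(&mut queue, &mut handled) {
        // no variants to match on, so this arm can never execute
        Ok(void) => match void {},
        Err(error) => assert_eq!(error, ConnectionError("no more requests")),
    }
    // requests were popped from the back of the queue: 3, 2, 1
    assert_eq!(handled, vec![6, 4, 2]);
}
```

The empty `match void {}` is how we convince the compiler that the `Ok` arm is dead code without resorting to `unreachable!()`.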
This is the technique the Rust standard library uses, as the `Err` type for the `FromStr` implementation of `String` is `std::convert::Infallible` which is defined as:
```rust
pub enum Infallible {}
```
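Because `Infallible` has no variants, we can unwrap a `Result<T, Infallible>` without any possibility of panicking. A minimal sketch, where `unwrap_infallible` is a hypothetical helper written for this example:

```rust
use std::convert::Infallible;
use std::mem::size_of;

// Hypothetical helper: the Err arm is provably unreachable, no panic needed.
fn unwrap_infallible<T>(result: Result<T, Infallible>) -> T {
    match result {
        Ok(value) => value,
        // an empty match on a type with no variants covers every case
        Err(never) => match never {},
    }
}

fn main() {
    // parsing a &str into a String can't fail, and its error type says so
    let parsed: Result<String, Infallible> = "hello".parse::<String>();
    assert_eq!(unwrap_infallible(parsed), "hello");
    // a type with no values takes up no space
    assert_eq!(size_of::<Infallible>(), 0);
}
```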
### PhantomData
The third most commonly used ZST is probably `PhantomData`. `PhantomData` is a zero-sized marker struct which can be used to "mark" a containing struct as having certain properties. It's similar in purpose to its auto marker trait cousins such as `Sized`, `Send`, and `Sync` but being a marker struct is used a little bit differently. Giving a thorough explanation of `PhantomData` and exploring all of its use-cases is outside the scope of this article so let's only briefly go over a single simple example. Recall this code snippet presented earlier:
```rust
#![feature(negative_impls)]
// this type is Send and Sync
struct Struct;
// opt-out of Send trait
impl !Send for Struct {}
// opt-out of Sync trait
impl !Sync for Struct {}
```
It's unfortunate that we have to use a feature flag. Can we accomplish the same result using only stable Rust? As we've learned, a type is only `Send` and `Sync` if all of its members are also `Send` and `Sync`, so we can add a `!Send` and `!Sync` member to `Struct` like `Rc<()>`:
```rust
use std::rc::Rc;

// this type is not Send or Sync
struct Struct {
    // adds a pointer's worth of size to every instance (8 bytes on 64-bit targets)
    _not_send_or_sync: Rc<()>,
}
```
This is less than ideal because it adds size to every instance of `Struct` and we now also have to conjure an `Rc<()>` from thin air every time we want to create a `Struct`. Since `PhantomData` is a ZST it solves both of these problems:
```rust
use std::rc::Rc;
use std::marker::PhantomData;

type NotSendOrSyncPhantom = PhantomData<Rc<()>>;

// this type is not Send or Sync
struct Struct {
    // adds no additional size to instances
    _not_send_or_sync: NotSendOrSyncPhantom,
}
```
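We can compare the two approaches side by side. In this sketch `WithRc`, `WithPhantom`, and `assert_send` are hypothetical names introduced for illustration; the commented-out line shows the check that the compiler would reject.

```rust
use std::marker::PhantomData;
use std::mem::size_of;
use std::rc::Rc;

// hypothetical names: the two variants of `Struct` discussed above
struct WithRc {
    _not_send_or_sync: Rc<()>,
}

struct WithPhantom {
    _not_send_or_sync: PhantomData<Rc<()>>,
}

// hypothetical helper: only compiles when called with a Send type
fn assert_send<T: Send>() {}

fn main() {
    // the Rc version needs a real allocation and costs one pointer per instance
    let _with_rc = WithRc { _not_send_or_sync: Rc::new(()) };
    assert_eq!(size_of::<WithRc>(), size_of::<usize>());

    // the PhantomData version needs nothing and costs nothing
    let _with_phantom = WithPhantom { _not_send_or_sync: PhantomData };
    assert_eq!(size_of::<WithPhantom>(), 0);

    assert_send::<i32>(); // ✅
    // assert_send::<WithPhantom>(); // ❌ `Rc<()>` cannot be sent between threads safely
}
```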
**Key Takeaways**
- `PhantomData` is a zero-sized marker struct which can be used to "mark" a containing struct as having certain properties
## Conclusion
- only instances of sized types can be placed on the stack, i.e. can be passed around by value
- instances of unsized types can't be placed on the stack and must be passed around by reference
- pointers to unsized types are double-width because aside from pointing to data they need to do an extra bit of bookkeeping to also keep track of the data's length _or_ point to a vtable
- `Sized` is an "auto" marker trait
- all generic type parameters are auto-bound with `Sized` by default
- if we have a generic function which takes an argument of some `T` behind a pointer, e.g. `&T`, `Box<T>`, `Rc<T>`, et cetera, then we almost always want to opt-out of the default `Sized` bound with `T: ?Sized`
- leveraging slices and Rust's auto type coercions allows us to write flexible APIs
- all traits are `?Sized` by default
- `Trait: ?Sized` is required for `impl Trait for dyn Trait`
- we can require `Self: Sized` on a per-method basis
- traits bound by `Sized` can't be made into trait objects
- Rust doesn't support pointers wider than 2 widths so
    - we can't cast unsized types to trait objects
    - we can't have multi-trait objects, but we can work around this by coalescing multiple traits into a single trait
- user-defined unsized types are a half-baked feature right now and their limitations outweigh any benefits
- all instances of a ZST are equal to each other
- Rust compiler knows to optimize away interactions with ZSTs
- `!` can be coerced into any other type
- it's not possible to create instances of `!` which we can use to mark certain states as impossible at a type level
- `PhantomData` is a zero-sized marker struct which can be used to "mark" a containing struct as having certain properties