Pin Ergonomics and Move
Thanks to TC and Yosh for their feedback in improving this document!
Users sometimes need to work with address-sensitive types. A common example is with futures, but there are other cases such as intrusive linked lists.[1] Because these types are address sensitive, they cannot move.
Address-sensitive types overlap strongly with self-referential types, but they are not exactly the same. There are self-referential types that are not address-sensitive, and there are address-sensitive types that are not self-referential.
Today, Pin
gives us the ability to work with address-sensitive types, yet the current ergonomics of Pin
are seriously lacking.
We have gotten by so far because Pin
is only used in a few use cases and most users do not directly interact with Pin
.
Upcoming language features, such as async closures and async iterators[2], will create more places where users interact with Pin
and create new opportunities to define self-referential and address-sensitive types.
Possible solutions to Pin
's poor ergonomics would likely be in one of these categories.
Pin
with more language support.
Pin
that supports address-sensitive types.
This document primarily discusses a way to do Option 1.
We will also touch on a design for a Move
trait.
Note that because this document is primarily rooted in the narrow motivation of improving the ergonomics around address-sensitive types, this document will undersell the possible benefits of Move
.
Below is a code example that demonstrates the kinds of data structures that come up more often with async closures.
The example has roughly the same shape as merging two streams.
It shows an Interleave
struct that contains two async closures.
The call_both
method calls each closure, producing two futures.
The call_both
method then polls each future until one completes.
The function then returns the value of whichever future completes first but, crucially, it does not destroy the other future.
Instead, the slower future persists until the next call to call_both
.
There are a couple of aspects that make this program interesting:
call_both
leaves a future in a partially completed state across calls. This means those futures have been pinned, which means the Interleave
struct must be pinned, which means call_both
must take a pinned self argument.
Without these properties, you can generally get by with an async block or similar to let the compiler handle all the self-referential details.
Note that this will also need unsafe<..>
binders to work, but that's another feature for a different design discussion.
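As a rough point of comparison, the same shape can be approximated with today's Pin. This is a sketch under several assumptions, not the example described above: plain closures returning futures stand in for async closures so that the future types can be named as generic parameters, and the field names, poll_both, Countdown, block_on, and demo are all made up for illustration. It does show the manual pin projection and the repeated .as_mut() calls that the current design pushes onto users.

```rust
use std::future::{poll_fn, Future};
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Stand-in for `Interleave`: closures returning futures replace async closures.
pub struct Interleave<A, B, FutA, FutB> {
    make_a: A,
    make_b: B,
    fut_a: Option<FutA>, // in-flight future from `make_a`, kept across calls
    fut_b: Option<FutB>, // in-flight future from `make_b`, kept across calls
}

impl<A, B, FutA, FutB, T> Interleave<A, B, FutA, FutB>
where
    A: FnMut() -> FutA,
    B: FnMut() -> FutB,
    FutA: Future<Output = T>,
    FutB: Future<Output = T>,
{
    pub fn new(make_a: A, make_b: B) -> Self {
        Interleave { make_a, make_b, fut_a: None, fut_b: None }
    }

    /// Resolves with whichever future finishes first; the other stays stored.
    pub async fn call_both(mut self: Pin<&mut Self>) -> T {
        // `.as_mut()` reborrows the pinned reference for every poll.
        poll_fn(|cx| self.as_mut().poll_both(cx)).await
    }

    fn poll_both(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<T> {
        // SAFETY: stored futures are only polled or dropped in place, never moved.
        let this = unsafe { self.get_unchecked_mut() };
        if this.fut_a.is_none() {
            this.fut_a = Some((this.make_a)());
        }
        if this.fut_b.is_none() {
            this.fut_b = Some((this.make_b)());
        }
        // SAFETY: `this` is pinned, so the futures in its fields do not move.
        let a = unsafe { Pin::new_unchecked(this.fut_a.as_mut().unwrap()) };
        if let Poll::Ready(value) = a.poll(cx) {
            this.fut_a = None; // drop the finished future in place
            return Poll::Ready(value);
        }
        // SAFETY: as above.
        let b = unsafe { Pin::new_unchecked(this.fut_b.as_mut().unwrap()) };
        if let Poll::Ready(value) = b.poll(cx) {
            this.fut_b = None; // drop the finished future in place
            return Poll::Ready(value);
        }
        Poll::Pending
    }
}

/// Demo future: pending for `left` polls, then ready with `name`.
struct Countdown {
    left: u32,
    name: &'static str,
}

impl Future for Countdown {
    type Output = &'static str;
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.left == 0 {
            return Poll::Ready(self.name);
        }
        self.left -= 1;
        Poll::Pending
    }
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal busy-polling executor, only suitable for this demonstration.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

/// Calls `call_both` three times on one pinned `Interleave`.
pub fn demo() -> Vec<&'static str> {
    let mut both = pin!(Interleave::new(
        || Countdown { left: 2, name: "fast" },
        || Countdown { left: 3, name: "slow" },
    ));
    (0..3).map(|_| block_on(both.as_mut().call_both())).collect()
}
```

In demo, the second call returns "slow" only because the slower future kept its progress from the first call, which is the property that forces the struct to stay pinned between calls.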
Pin
!Unpin
as a bound is a double negation and hard to reason about
Unpin
only has meaning when combined with Pin
Pin<T>
types from constructors
.as_mut()
all the time.
unsafe
for a crate that encapsulates this unsafety.
Pin<&mut T>
).
&pin mut T
One path is to add pinned references. There have been many parallel ongoing discussions about these. In general, these discussions propose adding the following:
Pin
pointers to avoid needing to write .as_mut()
everywhere&pin mut T
and &pin const T
as sugar for Pin<&mut T>
and Pin<&T>
self
let pin x = ...
as a way of indicating places that can be pinned
drop
for types with pinned fields
With these changes, the earlier example would look like this:
It seems like we might be able to rely on Unpin
to handle projection.
If we did this, we would say that projecting through a pinned reference always yields a pinned reference, but if the target is Unpin
then we could convert the reference to an unpinned one using DerefMut
.
Unfortunately, this does not work because with Pin
, immovability is a temporal property.
For example, we might have a structure like this:
Here F::Output
might be !Unpin
, but we would still be allowed to move out of it in get_result
because it hasn't been pinned yet.
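A sketch of the kind of structure meant here, written against today's Pin. The names Stash, poll_once, Token, and demo are hypothetical, and the details are assumptions made for illustration; the point is only that get_result moves a !Unpin output out of a pinned wrapper, which is fine because the output itself was never pinned.

```rust
use std::future::{ready, Future};
use std::marker::PhantomPinned;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical wrapper: runs `fut` and stashes its output until asked for it.
pub struct Stash<F: Future> {
    fut: F,                    // pinned along with the wrapper
    result: Option<F::Output>, // stored by value, never handed out pinned
    done: bool,
}

impl<F: Future> Stash<F> {
    pub fn new(fut: F) -> Self {
        Stash { fut, result: None, done: false }
    }

    /// Polls the inner future once unless it has already completed.
    pub fn poll_once(self: Pin<&mut Self>, cx: &mut Context<'_>) {
        // SAFETY: `fut` is never moved out of `this`.
        let this = unsafe { self.get_unchecked_mut() };
        if this.done {
            return;
        }
        // SAFETY: `this` is pinned, so `this.fut` is pinned too.
        let fut = unsafe { Pin::new_unchecked(&mut this.fut) };
        if let Poll::Ready(value) = fut.poll(cx) {
            this.result = Some(value);
            this.done = true;
        }
    }

    /// Moves the output out, even when `F::Output` is `!Unpin`.
    pub fn get_result(self: Pin<&mut Self>) -> Option<F::Output> {
        // SAFETY: only `result`, which was never pinned, is moved.
        let this = unsafe { self.get_unchecked_mut() };
        this.result.take()
    }
}

/// A `!Unpin` output type, standing in for an address-sensitive value
/// that has not been pinned yet.
pub struct Token {
    id: u32,
    _pin: PhantomPinned,
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

pub fn demo() -> Option<u32> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut stash = pin!(Stash::new(ready(Token { id: 7, _pin: PhantomPinned })));
    stash.as_mut().poll_once(&mut cx);
    // The `Token` is moved out of the pinned `Stash`, then moved again by `map`.
    stash.as_mut().get_result().map(|token| token.id)
}
```

A projection rule keyed only on Unpin would hand out a pinned reference to the result field and forbid this move, even though nothing has ever relied on the output's address.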
Move
trait
Another potential option is to add a new Move
trait that is added by default to all bounds but can be opted out of with ?Move
, like Sized
.
If this were 2014 or even 2015, we could probably add Move
and end up with a nice design.
What's less clear is whether it's possible, and what it would take to adopt Move
now that it's 2024.
Let's explore.
First, let's see what our running example would look like:
It's similar to before.
The pin
keywords have been removed since movability is carried in the types.
To assign an immovable value to a field, however, we need to use some kind of emplacement feature.
This example uses a plausible version of it, but we haven't designed that feature yet.
Move
would be a marker trait that indicates you are allowed to do move operations on a type.
These include:
swap
or take
.
Pin
/&pin mut
These restrictions make the type somewhat hard to work with without additional language features.
For example, the usual MyType::new()
pattern does not work for ?Move
types because you cannot return a value without a Move
impl.
Instead, with the features available today you would have to do something like:
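One shape this could take today is an out-parameter built on MaybeUninit, where the caller provides the storage and the constructor initializes the value in place. This is a sketch under assumptions: Node, new_in, is_in_place, and demo are hypothetical names, and a real ?Move design would need the compiler to enforce what the unsafe block promises here.

```rust
use std::mem::MaybeUninit;
use std::ptr;

/// Hypothetical address-sensitive type: it records its own address.
pub struct Node {
    value: u32,
    self_addr: *const Node,
}

impl Node {
    /// Initializes a `Node` directly in caller-provided storage, so the
    /// value is never returned (and therefore never moved) by value.
    pub fn new_in(slot: &mut MaybeUninit<Node>, value: u32) -> &mut Node {
        let place = slot.as_mut_ptr();
        // SAFETY: `place` points to writable storage for a `Node`, and every
        // field is written before the slot is treated as initialized.
        unsafe {
            ptr::addr_of_mut!((*place).value).write(value);
            ptr::addr_of_mut!((*place).self_addr).write(place as *const Node);
            slot.assume_init_mut()
        }
    }

    /// True while the node still lives at the address it recorded.
    pub fn is_in_place(&self) -> bool {
        ptr::eq(self, self.self_addr)
    }
}

pub fn demo() -> (u32, bool) {
    let mut slot = MaybeUninit::uninit();
    let node = Node::new_in(&mut slot, 5);
    (node.value, node.is_in_place())
}
```

The caller has to declare the slot before it can construct the value, which is the ergonomic cost that an emplacement feature would be meant to remove.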
Pin
allows types to exist in either a pinned state or an unpinned state, and it is common for values to exist in both.
For example, someone might compose a number of futures together and move them freely in the process.
However, in order to call poll
the future must be pinned and it remains pinned from that point.
With Move
, a type is either always movable or always immovable.
This means the pinned and unpinned phases have to be represented by different types.
One natural way to make this split with futures is to take advantage of IntoFuture
.
A type that implements IntoFuture + Move
could be used to represent the unpinned phase of the life cycle, and then when you call .into_future()
, the result is a ?Move
type that implements Future
.
(Of course, this is not possible without additional language features because into_future()
cannot return a ?Move
value.)
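The two-phase split can be approximated in today's Rust, with the caveat already noted: into_future still returns by value, so the future here is only !Unpin (via PhantomPinned) and is not truly immovable. Request, RequestFuture, block_on, and demo are hypothetical names used for this sketch.

```rust
use std::future::{Future, IntoFuture};
use std::marker::PhantomPinned;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Unpinned phase: a freely movable builder.
pub struct Request {
    retries: u32,
}

impl Request {
    pub fn new() -> Self {
        Request { retries: 0 }
    }

    pub fn retries(mut self, n: u32) -> Self {
        self.retries = n;
        self
    }
}

/// Pinned phase: `PhantomPinned` makes this `!Unpin`, standing in for `?Move`.
pub struct RequestFuture {
    retries: u32,
    _pin: PhantomPinned,
}

impl Future for RequestFuture {
    type Output = u32;
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<u32> {
        // A real future would do work here; this one just reports its setting.
        Poll::Ready(self.retries)
    }
}

impl IntoFuture for Request {
    type Output = u32;
    type IntoFuture = RequestFuture;
    fn into_future(self) -> RequestFuture {
        RequestFuture { retries: self.retries, _pin: PhantomPinned }
    }
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal busy-polling executor, only suitable for this demonstration.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

pub fn demo() -> u32 {
    let request = Request::new().retries(3);
    let moved = request; // still in the movable phase
    // `.await` calls `into_future()` and pins the result before polling it.
    block_on(async { moved.await })
}
```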
Pin projection is an annoyance with Pin
.
It's less present with &pin mut T
, but it's still there in the form of needing to annotate fields for projection purposes.
On the other hand, with Move
, there is nothing special about pinned projections.
You just take references to fields, and if the type of those fields is ?Move
then you can't move out of those references.
With &pin mut T
, we require a different version of the drop trait if T
has fields to which we can safely pin project.
With Move
, we would expect that we do not need to do anything special.
If the value being dropped has ?Move
fields, then you will not be able to move out of them in drop
.
Unfortunately, some of these fields might be both Move
and ?Unpin
meaning they might have been pinned and then we are not allowed to move out of them.
See below.
Pin
exists now, so there's nothing to be backwards compatible with.
&pin mut T
is sugar for Pin<&mut T>
with extra language support, so the backwards compatibility story is good there too.
The backwards compatibility story for Move
is essentially impossible.
The reason is that there exist values that would be both Move
and ?Unpin
, and in fact these are quite common (the future returned by any async fn
is in this category today).
See Interop with Pin
for more details.
Even if we could solve this, another significant backwards compatibility challenge is with associated type bounds.
For example, we would want to add a ?Move
bound to Deref::Target
.
Doing so would break code like this:
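As one illustration of the kind of code at risk (an assumed example, not necessarily the one originally intended here), the function below compiles today because every Sized target can be moved. If Deref::Target were relaxed to ?Move, it would need an additional Move bound on the target. The names replace_target and demo are hypothetical.

```rust
use std::mem;
use std::ops::DerefMut;

/// Swaps a new value into whatever `p` points at and returns the old value.
/// Both the argument and the return value move a `P::Target` by value.
pub fn replace_target<P>(mut p: P, new: P::Target) -> P::Target
where
    P: DerefMut,
    P::Target: Sized,
{
    mem::replace(&mut *p, new)
}

pub fn demo() -> (i32, String, String) {
    let old_int = replace_target(Box::new(1), 2);
    let mut text = String::from("a");
    let old_text = replace_target(&mut text, String::from("b"));
    (old_int, old_text, text)
}
```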
It might be possible to add some kind of edition-dependent bound mechanism to make this work.
It's also quite likely that Move
would require building a new version of IntoFuture
and Future
that work with the move trait, as well as adapters from the existing Pin
-based Future
trait.
Pin
Given that Pin
already exists in the language, we want Move
-based code to work as seamlessly as possible with existing Pin
-based code.
Move
equivalent to Unpin
?
No.
The reason is that Unpin
is concerned with what happens after a pinned reference to a value has been created, while Move
applies to the whole lifetime of a value.
There are some relationships though, including:
A type that is ?Unpin
may still be Move
.
In these cases, we must rely on Pin
to represent the pinned typestate.
Values in this category occur, for example, when doing the pattern where you compose futures together, then pin them before polling.
Today any async { ... }
has a type that is Move + ?Unpin
.
A type that is ?Move
may be Unpin
.
Unpin
ultimately means it's safe to get a &mut T
to the contents of a Pin<&mut T>
.
If T
is ?Move
, then you can't move out of a &mut T
.
Move
does not imply Unpin
. See the first bullet point for a counterexample.
A type that is !Move
is Unpin
.
We could probably write impl<T: !Move> Unpin for T
, although this would surely conflict with lots of other impls.
Pin<&mut T>
and Move
First, let's consider if we have a Pin<&mut T>
, what additional things would Move
enable?
It's tempting to try and relax Pin::get_mut
to something like:
After all, if T
is not Move
, then we can't move out of the reference at all, regardless of whether it's Unpin
, and if it is Move
then we should be able to move it.
This doesn't work though, because T
might be Move
but not Unpin
, like many futures today.
Furthermore, if we could do this, that would enable code like this:
The best we could do is probably:
Move
into Pin
Does Move
let us work with Pin
a little easier, at least?
Let's imagine when Move
would let us coerce &mut T
into Pin<&mut T>
.
If we know T: !Move
, we can freely convert between &mut T
and Pin<&mut T>
.
Otherwise, it doesn't seem like there's much more we could do.
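For comparison, the analogous free conversion that exists today is gated on Unpin rather than on a negative Move bound. A minimal sketch using only Pin::new and Pin::get_mut from the standard library; round_trip and demo are names made up for this illustration.

```rust
use std::mem;
use std::pin::Pin;

/// Round-trips `&mut T` -> `Pin<&mut T>` -> `&mut T`.
/// Today both directions are safe only because of the `T: Unpin` bound.
pub fn round_trip<T: Unpin>(value: &mut T) -> &mut T {
    let pinned: Pin<&mut T> = Pin::new(value);
    Pin::get_mut(pinned)
}

pub fn demo() -> (i32, i32) {
    let (mut a, mut b) = (1, 2);
    // Swapping through references that were pinned is fine for `Unpin` types.
    mem::swap(round_trip(&mut a), round_trip(&mut b));
    (a, b)
}
```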
To help guide what we should do going forward, this section considers a few hypothetical situations and what we might do in those. The goal is to suss out to what extent these features are independent and have value on their own.
Move
was added pre-1.0
Move
is challenging to add to Rust now, and doing so will likely require the ability to make new categories of changes across an edition.
Before Rust 1.0, we could have broken backwards compatibility and added Move
.
Suppose we had done that.
Would there be any reason to add Pin
or &pin mut T
references now?
If the answer is yes, this suggests that Pin
is not an inherently poorer solution to immovable values that we only adopted because it was backwards compatible.
Instead, we would conclude that Pin
had value on its own.
If Pin
has inherent value, independent of Move
, then it stands to reason that improving the ergonomics of Pin
would be a good thing too.
I don't immediately know what this value would be, but given the existence of, or desire for, features like &raw T
, UnsafeCell
, NonNull<T>
, etc., it would not be surprising to find there are cases where lower level, dynamic control over immovability is useful.
Move
The questions here are essentially the same as the previous scenario, but the perspective is slightly different.
Let's assume we were able to solve the backwards compatibility and migration challenges with Move
and so we adopted it and declared Move
to be the new way to talk about immovable types.
The question is then, what to do with Pin
, given that it is already part of the language?
One option is we leave it in its current state and encourage everyone to migrate their Pin
-based code to Move
as soon as possible.
Can we do that for all code though?
Will Pin
-based code continue to be maintained?
If we cannot migrate all code, or there's a good reason to continue to maintain Pin
-based code, then that suggests that Pin
has inherent value, independent of Move
.
In that case, we should improve the lives of maintainers of Pin
-based code by making Pin
easier to work with.
Pin
, do we need Move
?
In the two scenarios we've seen so far, it seems plausible that if we had Move
there might also be a reason to have Pin
for advanced use cases.
Let's say we improve Pin
's ergonomics to the point where pinning is roughly the same level of difficulty as mutable references (i.e. &mut T
).
Would it make sense to add Move
at that point?
I think this needs more exploration. Here are some possible improvements though:
Move
makes it easier to write code that doesn't need to move a value to be generic over movability. With Pin
you'd have to write two versions of a function, one that takes a &mut T
and another that takes a &pin mut T
.
This section is a collection of a few questions to explore in this space.
Unpin
be unsafe?
?Move
does not imply Pin
eholk: Prefilling a point raised in review:
A type that is
?Move
may be Unpin
. Unpin ultimately means it's safe to get a
&mut T
to the contents of a Pin<&mut T>
. If T
is ?Move
, then you can't move out of a &mut T
, which means there's no problem with
TC: Analyzing the 2nd bullet point (quoted above), if we have:
…that would definitely be wrong. This shows we can't treat a ?Move
type parameter as Unpin
.
(Discussion about modeling in terms of pinned places and temporalness.)
TC:
Yosh: Example of when values can be moved after construction:
Move + ?Unpin
outside of async?
Josh: The discussion of Move
observes that it can't fully replace use of Pin
because it doesn't handle types that are Move + ?Unpin
, which come up in async. Do such types come up in non-async usage of address-sensitive types, such as intrusive lists and other uses in RfL?
Asking because if they don't then we may want to consider if we should solve async with one mechanism and other address-sensitive types with another mechanism. However, if such cases do arise in non-async address-sensitive types as well, and there isn't a clear story for how to handle those cases with Move
, that seems like more of a nail in the coffin of Move
.
We need some reasonable way to support address-sensitive types. We need a solution for async. It's not obvious that both of those need to, or should, use the same mechanism. We could, in particular, decide that address-sensitive types should use a different mechanism, and that async should avoid exposing Pin
much more wherever possible, the combination of which would then mean that we shouldn't spend language budget on making Pin
more ergonomic (and should instead spend that language budget on making async more ergonomic).
Yosh: Post on how Move
would work for this, as well as how it generalizes to arbitrary self-referential types.
TC: There almost seems to be an orthogonal axis about how much we push people into an IntoFuture
-like pattern ("betting on IntoFuture
"). If that pattern were used consistently, which would be required for Move
anyway, then the ergonomics of Pin
might be similar to the ergonomics of Move
.
tmandry: The biggest risk I see in Yosh's post is the reliance on emplacement. But there are also similar ergonomic hurdles to that design that might be on the same scale as with Pin
.
Pin
with existing APIs
yosh: the doc makes the following statement about backwards-compatibility of both Pin
and Move
(emphasis mine):
Pin exists now, so there's nothing to be backwards compatible with. &pin mut T is sugar for
Pin<&mut T>
with extra language support, so the backwards compatibility story is good there too.
The backwards compatibility story for Move is essentially impossible. The reason is that there exist values that would be both Move and ?Unpin, and in fact these are quite common (the future returned by any async fn is in this category today). See Interop with Pin for more details.
This only considers the narrow scope for which Pin
is being used today. It does not consider the broader uses for which Pin
isn't a good fit because it's incompatible. Consider for example the std::io::Read
trait. Making that work with Pin
requires minting an entirely new trait:
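A sketch of what such a trait could look like. PinnedRead is a hypothetical name, not an existing standard library trait, and the blanket impl is an assumption added to show that only Unpin readers can be bridged for free.

```rust
use std::io::{self, Read};
use std::pin::Pin;

/// Hypothetical pinned counterpart of `std::io::Read`.
pub trait PinnedRead {
    fn read(self: Pin<&mut Self>, buf: &mut [u8]) -> io::Result<usize>;
}

/// Existing readers can only be adapted when they are `Unpin`;
/// address-sensitive readers need their own implementation.
impl<R: Read + Unpin> PinnedRead for R {
    fn read(self: Pin<&mut Self>, buf: &mut [u8]) -> io::Result<usize> {
        Read::read(self.get_mut(), buf)
    }
}

pub fn demo() -> (usize, [u8; 4]) {
    let mut src: &[u8] = b"pinned";
    let mut buf = [0u8; 4];
    // Fully qualified call, since both traits have a method named `read`.
    let n = PinnedRead::read(Pin::new(&mut src), &mut buf).unwrap();
    (n, buf)
}
```

Every API that is generic over Read would need a parallel version for the new trait, which is the polymorphism cost being described.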
We also can't just impl Read for Pin<&mut impl Read>
because:
This is again a polymorphism problem, but not one we can solve with e.g. effect generics. In contrast Move
would be more compatible with existing traits, because it is an auto-trait. That would make it have the same behavior as e.g. Send
- which we can improve as a class.
tmandry: We should talk about the actual example of backward compatibility.
E.g.:
Or, e.g.:
Josh: We could have a different trait that works with ?Move
, for instance, and use that where feasible. It'd make ?Move
types less conveniently usable at first, but Pin
is already substantially less usable in many places in the language.
tmandry: We could have a trait modifier, ?Move Deref
.
TC: This gets back to Move
having similar ergonomics problems to Pin
. We've talked about pin
trait modifiers too.
Or, e.g.:
eholk: I don't actually think we want + ?Move
on FnOnce::Output
Drop::drop
signature
different version of the drop trait
Josh: As I understand it, users aren't actually allowed to call Drop::drop
directly. So we could make Drop
special such that implementing it on a type may require a different signature for drop
. We may or may not want to, but I think we can without breaking backwards compatibility.
tmandry: Yes, I think we can. That's exactly what is proposed in a recent proposal to make pin more ergonomic: If you have pin
fields, you must take a &pin mut self
in your Drop
impl.
tmandry: I've noticed an assumption that it would be more natural to express movability as a property of a type, but I'm feeling less convinced of that now. The fact that it requires emplacement is a sign of how awkward it can be to work with such types.
The big question in my mind is whether we should express movability in terms of a type or a value (or place) containing that type.
There are at least two reasons why I think we might want this as a property of a place:
One reason I think we might want it as a property of a type:
yosh: The doc makes the following statement
The question is then, what to do with Pin, given that it is already part of the language?
This is a fine line to tread, but if we're being specific: Pin
is currently not a language-level item. It definitely borders on being a language item given its safety invariants and uses in async
- but there is nothing inherent to Pin
that makes it special. The closest we get to that is that Unpin
is an auto-trait, which cannot be defined outside of the stdlib.
The only part of Rust where we use Pin
today in the stdlib is in the Future
trait - which means that if we wanted to fully deprecate Pin
that would intersect with stable Rust. It seems important to call this out, because mechanically it means we can think about it more like an implementation detail of Future
than as something more inherent to the language like e.g. Deref
.
Move + !Unpin
make Move
impossible?
eholk: I wrote this in the doc:
The backwards compatibility story for
Move
is essentially impossible. The reason is that there exist values that would be both Move
and ?Unpin
, and in fact these are quite common (the future returned by any async fn
is in this category today). See Interop with Pin
for more details.
Upon further reflection, I'm not sure this is true.
Right now everything is considered movable and it's up to Pin
to prevent immovable values from being moved.
Thus, if we add Move
, Pin
will still need to uphold the expectations of code relying on the Move
trait.
The part that is true is that Pin
and Move
will always have an impedance mismatch.
Josh: This is likely to be a question with a well-explored answer, and I'm not expecting to blaze new ground here, just understand the well-explored answer: what is the underlying reason why we cannot, or do not want to, make futures able to be self-referential without being address-sensitive, such as via base-relative addressing?
Yosh: because you can have segmented address spaces using the heap - relative pointers when the heap gets involved seem like they wouldn't work out well.
tmandry: Dropping this because I have to leave – I remember discussion of this exact point here.. https://without.boats/blog/pin/
Yosh:
Yosh: The ptr
family of APIs also run into some pretty gnarly problems. Any form of offset-based pointers requires that it's updated when moved. However, it's legal today to use ptr::*
to do things with types - and so we'd need to somehow encode additional rules onto them.
Yosh: Adding additional requirements to safely use ptr::*
is backwards-incompatible, which means that e.g. Vec
or other data structures which depend on these operations would run into some pretty gnarly issues.
Yosh: So even beyond whether offset-based pointers are possible, they would inherit a lot of the compat problems immovable systems have. Which unfortunately means we can't just treat relative pointers as self-contained, but they need to communicate externally somehow that internally they're relative.
(The meeting ended here.)
Another example is in the [Rust SymCrypt] bindings, because the underlying SymCrypt library data structures often include a checksum field that is derived in part from the address of the structure. ↩︎
This is true regardless of whether we use an async fn next
design for async iterators or a poll_next
design. ↩︎