unsafe_pinned
Add a type UnsafePinned that indicates to the compiler that this field is "pinned" and there might be pointers elsewhere that point to the same memory.
This means, in particular, that &mut UnsafePinned is not necessarily a unique pointer, and thus the compiler cannot just make aliasing assumptions.
However, &mut UnsafePinned can still be mem::swapped, so this is not a free ticket for arbitrary aliasing of mutable references.
You need to use mechanisms such as Pin to ensure that mutable references cannot be used in incorrect ways by clients.
This type is then used in generator lowering, finally fixing #63818.
Let's say you want to write a type with a self-referential pointer:
#![feature(negative_impls)]
use std::ptr;
use std::pin::{pin, Pin};
pub struct S {
data: i32,
ptr_to_data: *mut i32,
}
impl !Unpin for S {}
impl S {
pub fn new() -> Self {
S { data: 42, ptr_to_data: ptr::null_mut() }
}
pub fn get_data(self: Pin<&mut Self>) -> i32 {
// SAFETY: We're not moving anything.
let this = unsafe { Pin::get_unchecked_mut(self) };
if this.ptr_to_data.is_null() {
this.ptr_to_data = ptr::addr_of_mut!(this.data);
}
// SAFETY: if the pointer is non-null, then we are pinned and it points to the `data` field.
unsafe { this.ptr_to_data.read() }
}
pub fn set_data(self: Pin<&mut Self>, data: i32) {
// SAFETY: We're not moving anything.
let this = unsafe { Pin::get_unchecked_mut(self) };
if this.ptr_to_data.is_null() {
this.ptr_to_data = ptr::addr_of_mut!(this.data);
}
// SAFETY: if the pointer is non-null, then we are pinned and it points to the `data` field.
unsafe { this.ptr_to_data.write(data) }
}
}
fn main() {
let mut s = pin!(S::new());
s.as_mut().set_data(42);
println!("{}", s.as_mut().get_data());
}
This kind of code is implicitly generated by rustc all the time when an async fn has a local variable of reference type that is live across a yield point.
The problem is that this code has UB under our aliasing rules: the &mut S inside the self argument of get_data aliases with ptr_to_data!
(If you run this code in Miri, remove the impl !Unpin to see the UB. Miri treats Unpin as magic as otherwise the entire async ecosystem would cause errors.
But that is not how Unpin was actually designed.)
This simple code only has UB under Stacked Borrows but not under the LLVM aliasing rules; more complex variants of this – still in the realm of what async fn generates – also have UB under the LLVM aliasing rules.
A more complex variant
The following roughly corresponds to a generator with this code:
let mut data = 0;
let ptr_to_data = &mut data;
yield;
*ptr_to_data = 42;
println!("{}", data);
return;
When implemented by hand, it looks as follows, and causes aliasing issues:
#![feature(negative_impls)]
use std::ptr;
use std::pin::{pin, Pin};
use std::task::Poll;
pub struct S {
state: i32,
data: i32,
ptr_to_data: *mut i32,
}
impl !Unpin for S {}
impl S {
pub fn new() -> Self {
S { state: 0, data: 0, ptr_to_data: ptr::null_mut() }
}
fn poll(self: Pin<&mut Self>) -> Poll<()> {
// SAFETY: We're not moving anything.
let this = unsafe { Pin::get_unchecked_mut(self) };
match this.state {
0 => {
// The first time, set up the pointer.
this.ptr_to_data = ptr::addr_of_mut!(this.data);
// Now yield.
this.state += 1;
Poll::Pending
}
1 => {
// After coming back from the yield, write to the pointer.
unsafe { this.ptr_to_data.write(42) };
// And read our local variable `data`.
// THIS IS UB! `this` is derived from the `noalias` pointer
// `self` but we did a write to `this.data` in the previous
// line when writing to `ptr_to_data`. The compiler is allowed
// to reorder this and the previous line and then the output
// would change.
println!("{}", this.data);
// Now yield and be done.
this.state += 1;
Poll::Ready(())
}
_ => unreachable!(),
}
}
}
fn main() {
let mut s = pin!(S::new());
while let Poll::Pending = s.as_mut().poll() {}
}
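For comparison, here is a sketch of how the same generator shape can arise from an ordinary async fn on stable Rust. The names NoopWake, YieldOnce, generator_like and run_generator_like are hypothetical helpers made up for this illustration; the sketch only shows that a reference into the future's own state stays live across a suspension point, and it says nothing about how rustc actually lays out the state machine.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical helper: a waker that does nothing, enough to poll by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical helper: returns `Pending` once, standing in for `yield`.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            Poll::Pending
        }
    }
}

/// Same shape as the generator above: `ptr_to_data` points into the
/// future's own state and is live across the suspension point.
async fn generator_like() -> i32 {
    let mut data = 0;
    let ptr_to_data = &mut data;
    YieldOnce(false).await;
    *ptr_to_data = 42;
    data
}

/// Polls the future by hand: the first poll suspends, the second completes.
pub fn run_generator_like() -> (bool, i32) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(generator_like());
    let first_is_pending = fut.as_mut().poll(&mut cx).is_pending();
    let value = match fut.as_mut().poll(&mut cx) {
        Poll::Ready(v) => v,
        Poll::Pending => panic!("expected completion on the second poll"),
    };
    (first_is_pending, value)
}

fn main() {
    println!("{:?}", run_generator_like());
}
```

Because the future must be pinned before it can be polled, the self-reference stays valid between the two polls; the aliasing question discussed above concerns what the compiler may assume about the &mut handed to poll.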
Beyond self-referential types, a similar problem also comes up with intrusive linked lists: the nodes of such a list often live on the stack frames of the functions participating in the list, but also have incoming pointers from other list elements.
When a function takes a mutable reference to its stack-allocated node, that will alias the pointers from the neighboring elements.
This is an example of an intrusive list in the standard library that is breaking Rust's aliasing rules.
Pin is sometimes used to ensure that the list elements don't just move elsewhere (which would invalidate those incoming pointers) and provide a safe API, but there still is the problem that an &mut Node is actually not a unique pointer due to these aliases – so we need a way for such types to opt out of the aliasing rules.
The goal of this RFC is to offer a way of writing such self-referential types and intrusive collections without UB. We don't want to change the rules for mutable references in general (that would also affect all the code that doesn't do anything self-referential); instead we want to be able to tell the compiler that this code is doing funky aliasing and that should be taken into account for optimizations.
To write this code in a UB-free way, wrap the fields that are targets of self-referential pointers in an UnsafePinned:
pub struct S {
data: UnsafePinned<i32>, // ---- here
ptr_to_data: *mut i32,
}
impl S {
pub fn new() -> Self {
S { data: UnsafePinned::new(42), ptr_to_data: ptr::null_mut() }
}
pub fn get_data(self: Pin<&mut Self>) -> i32 {
// SAFETY: We're not moving anything.
let this = unsafe { Pin::get_unchecked_mut(self) };
if this.ptr_to_data.is_null() {
this.ptr_to_data = UnsafePinned::raw_get_mut(ptr::addr_of_mut!(this.data));
}
// SAFETY: if the pointer is non-null, then we are pinned and it points to the `data` field.
unsafe { this.ptr_to_data.read() }
}
pub fn set_data(self: Pin<&mut Self>, data: i32) {
// SAFETY: We're not moving anything.
let this = unsafe { Pin::get_unchecked_mut(self) };
if this.ptr_to_data.is_null() {
this.ptr_to_data = UnsafePinned::raw_get_mut(ptr::addr_of_mut!(this.data));
}
// SAFETY: if the pointer is non-null, then we are pinned and it points to the `data` field.
unsafe { this.ptr_to_data.write(data) }
}
}
This lets the compiler know that mutable references to data might still have aliases, and so optimizations cannot assume that no aliases exist.
That's entirely analogous to how UnsafeCell lets the compiler know that shared references to this field might undergo mutation.
However, what is not analogous is that &mut S, when handed to safe code you do not control, must still be unique pointers!
This is because of methods like mem::swap that can still assume that their two arguments are non-overlapping.
(mem::swap on two &mut UnsafePinned<i32> may soundly assume that they do not alias.)
In other words, the safety invariant of &mut S still requires full ownership of the entire memory range S is stored at; for the duration that a function holds on to the borrow, nobody else may read and write this memory.
But again, this is a safety invariant; it only applies to safe code you do not control. You can write your own code handling &mut S and as long as that code is careful not to make use of this memory in the wrong way, potential aliasing is fine.
To hand such references to safe code, use Pin: the type Pin<&mut S> can be safely given to external code, since the Pin wrapper blocks access to operations like mem::swap.
Similarly, the intrusive linked list from the motivation can be fixed by wrapping the entire UnsafeListEntry in UnsafePinned.
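The claim that Pin blocks mem::swap can be checked on stable Rust today. The sketch below uses a hypothetical Node type that opts out of Unpin via PhantomPinned (UnsafePinned itself is not assumed to be available); the helper names swap_unpinned and read_pinned are likewise made up for illustration.

```rust
use std::marker::PhantomPinned;
use std::mem;
use std::pin::pin;

/// Hypothetical stand-in for a type with self-referential state.
pub struct Node {
    pub value: i32,
    _pin: PhantomPinned, // makes `Node: !Unpin` on stable Rust
}

impl Node {
    pub fn new(value: i32) -> Self {
        Node { value, _pin: PhantomPinned }
    }
}

/// With plain `&mut Node`, safe code may swap the two values freely.
pub fn swap_unpinned() -> (i32, i32) {
    let mut a = Node::new(1);
    let mut b = Node::new(2);
    mem::swap(&mut a, &mut b);
    (a.value, b.value)
}

/// Once pinned, safe code can still read through the `Pin`, but it can no
/// longer obtain the `&mut Node` that `mem::swap` would need.
pub fn read_pinned() -> (i32, i32) {
    let a = pin!(Node::new(1));
    let b = pin!(Node::new(2));
    // mem::swap(&mut *a, &mut *b);
    // ^ does not compile: `Pin<&mut Node>` has no `DerefMut` since `Node: !Unpin`.
    (a.value, b.value)
}

fn main() {
    println!("{:?} {:?}", swap_unpinned(), read_pinned());
}
```

This is why handing out Pin<&mut S> is safe while handing out an aliased &mut S is not: the only way from the former to the latter goes through unsafe functions such as Pin::get_unchecked_mut.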
API sketch:
/// The type `UnsafePinned<T>` lets unsafe code violate
/// the rule that `&mut UnsafePinned<T>` may never alias anything else.
///
/// However, even if you define your type like `pub struct Wrapper(UnsafePinned<...>)`,
/// it is still very risky to have an `&mut Wrapper` that aliases
/// anything else. Many functions that work generically on `&mut T` assume that the
/// memory that stores `T` is uniquely owned (such as `mem::swap`). In other words,
/// while having aliasing with `&mut Wrapper` is not immediate Undefined
/// Behavior, it is still unsound to expose such a mutable reference to code you do
/// not control! Techniques such as pinning via `Pin` are needed to ensure soundness.
///
/// Similar to `UnsafeCell`, `UnsafePinned` will not usually show up in the public
/// API of a library. It is an internal implementation detail of libraries that
/// need to support aliasing mutable references.
///
/// Further note that this does *not* lift the requirement that shared references
/// must be read-only! Use `UnsafeCell` for that.
///
/// This type blocks niches the same way `UnsafeCell` does.
#[lang = "unsafe_pinned"]
#[repr(transparent)]
struct UnsafePinned<T: ?Sized> {
value: T,
}
/// When this type is used, that almost certainly means safe APIs need to use pinning
/// to avoid the aliases from becoming invalidated. Therefore let's mark this as `!Unpin`.
impl<T> !Unpin for UnsafePinned<T> {}
/// The type is `Copy` when `T` is to avoid people assuming that `Copy` implies there
/// is no `UnsafePinned` anywhere. (This is an issue with `UnsafeCell`: people use `Copy` bounds
/// to mean `Freeze`.) Given that there is no `unsafe impl Copy for ...`, this is also
/// the option that leaves the user more choices (as they can always wrap this in a `!Copy` type).
impl<T: Copy> Copy for UnsafePinned<T> {}
impl<T: Copy> Clone for UnsafePinned<T> {
fn clone(&self) -> Self { *self }
}
// `Send` and `Sync` are inherited from `T`. This is similar to `SyncUnsafeCell`, since
// we eventually concluded that `UnsafeCell` implicitly making things `!Sync` is sometimes
// unergonomic. A type that needs to be `!Send`/`!Sync` should really have an explicit
// opt-out itself, e.g. via an `PhantomData<*mut T>` or (one day) via `impl !Send`/`impl !Sync`.
impl<T: ?Sized> UnsafePinned<T> {
/// Constructs a new instance of `UnsafePinned` which will wrap the specified
/// value.
pub fn new(value: T) -> UnsafePinned<T> where T: Sized {
UnsafePinned { value }
}
pub fn into_inner(self) -> T where T: Sized {
self.value
}
/// Get read-write access to the contents of an `UnsafePinned`.
///
/// You should usually be using `get_mut_pinned` instead to explicitly track
/// the fact that this memory is "pinned" due to there being aliases.
pub fn get_mut_unchecked(&mut self) -> *mut T {
ptr::addr_of_mut!(self.value)
}
/// Get read-write access to the contents of a pinned `UnsafePinned`.
pub fn get_mut_pinned(self: Pin<&mut Self>) -> *mut T {
// SAFETY: we're not using `get_unchecked_mut` to unpin anything
unsafe { ptr::addr_of_mut!(self.get_unchecked_mut().value) }
}
/// Get read-only access to the contents of a shared `UnsafePinned`.
/// Note that `&UnsafePinned<T>` is read-only if `&T` is read-only.
/// This means that if there is mutation of the `T`, future reads from the
/// `*const T` returned here are UB!
///
/// ```rust
/// unsafe {
/// let mut x = UnsafePinned::new(0);
/// let ref1 = &mut *addr_of_mut!(x);
/// let ref2 = &mut *addr_of_mut!(x);
/// let ptr = ref1.get(); // read-only pointer, assumes immutability
/// ref2.get_mut_unchecked().write(1);
/// ptr.read(); // UB!
/// }
/// ```
pub fn get(&self) -> *const T {
self as *const _ as *const T
}
pub fn raw_get_mut(this: *mut Self) -> *mut T {
this as *mut T
}
pub fn raw_get(this: *const Self) -> *const T {
this as *const T
}
}
The comment about aliasing &mut being "risky" refers to the fact that their safety invariant still asserts exclusive ownership.
This implies that duplicate in the following example, while not causing immediate UB, is still unsound:
pub struct S {
data: UnsafePinned<i32>,
}
impl S {
fn new(x: i32) -> Self {
S { data: UnsafePinned::new(x) }
}
fn duplicate<'a>(s: &'a mut S) -> (&'a mut S, &'a mut S) {
let s1 = unsafe { mem::transmute_copy(&s) };
let s2 = s;
(s1, s2)
}
}
The unsoundness is easily demonstrated by using safe code to cause UB:
let mut s = S::new(42);
let (s1, s2) = S::duplicate(&mut s); // no UB
mem::swap(s1, s2); // UB
We could even soundly make get_mut_unchecked return an &mut T, given that the safety invariant is not affected by UnsafePinned.
But that would probably not be useful and only cause confusion.
Here is a polyfill on current Rust that uses the Unpin hack to achieve mostly the same effect as this API.
("Mostly" because a safe impl Unpin for ... can un-do the effect of this, which would not be the case with the real UnsafePinned.)
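The linked polyfill is not reproduced here. As a rough idea of what an Unpin-hack-based wrapper could look like, here is a minimal sketch; the name PinnedPolyfill, its reduced method set, and the restriction to sized T are assumptions of this sketch and not the RFC's actual polyfill.

```rust
use std::marker::PhantomPinned;
use std::pin::{pin, Pin};
use std::ptr;

/// Hypothetical polyfill: the `PhantomPinned` field makes the wrapper
/// `!Unpin`, which is what the current `Unpin` hack keys on.
pub struct PinnedPolyfill<T> {
    _pin: PhantomPinned,
    value: T,
}

impl<T> PinnedPolyfill<T> {
    pub fn new(value: T) -> Self {
        PinnedPolyfill { _pin: PhantomPinned, value }
    }

    pub fn into_inner(self) -> T {
        self.value
    }

    /// Get read-write access to the contents of a pinned wrapper.
    pub fn get_mut_pinned(self: Pin<&mut Self>) -> *mut T {
        // SAFETY: we only compute a raw pointer; nothing is moved out.
        let this = unsafe { self.get_unchecked_mut() };
        ptr::addr_of_mut!(this.value)
    }
}

/// Writes through the raw pointer and reads the value back.
pub fn polyfill_demo() -> i32 {
    let mut cell = pin!(PinnedPolyfill::new(1));
    let p = cell.as_mut().get_mut_pinned();
    // SAFETY: `p` points into the pinned, still-live wrapper, and no other
    // reference to the value is used while `p` is in use.
    unsafe {
        p.write(7);
        p.read()
    }
}

fn main() {
    println!("{}", polyfill_demo());
}
```

As the text notes, a safe impl Unpin on a type containing such a wrapper would undo the effect, which is the gap the real UnsafePinned is meant to close.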
Reference diff:
* Breaking the [pointer aliasing rules]. `&mut T` and `&T` follow LLVM’s scoped
- [noalias] model, except if the `&T` contains an [`UnsafeCell<U>`].
+ [noalias] model, except if the `&T` contains an [`UnsafeCell<U>`] or
+ the `&mut T` contains an [`UnsafePinned<U>`].
Async generator lowering changes:
[…] UnsafePinned.
rustc and Miri changes:
[…] UnsafeUnpin auto trait similar to Freeze that is implemented if the type does not contain any by-val UnsafePinned. This trait is an internal implementation detail and (for now) not exposed to users.
noalias on mutable references is only emitted for UnsafeUnpin types. (This replaces the current hack where it is only emitted for Unpin types.)
[…] UnsafePinned.
[…] SharedReadWrite retagging inside UnsafePinned similar to what it does inside UnsafeCell already. (This replaces the current Unpin-based hack.)
UnsafeCell: disables aliasing (and affects but does not fully disable dereferenceable) behind shared refs, i.e. &UnsafeCell<T> is special. UnsafeCell<&T> (by-val, fully owned) is not special at all and basically like &T; &mut UnsafeCell<T> is also not special. Safe wrappers around this type can expose mutability behind shared references, such as &RefCell<T>.
UnsafePinned: disables aliasing (and affects but does not fully disable dereferenceable) behind mutable refs, i.e. &mut UnsafePinned<T> is special. UnsafePinned<&mut T> (by-val, fully owned) is not special at all and basically like &mut T; &UnsafePinned<T> is also not special. Safe wrappers around this type can expose sharing that involves pinned mutable references, such as Pin<&mut MyFuture>.
MaybeDangling: disables aliasing and dereferenceable of all references (and boxes) directly inside it, i.e. MaybeDangling<&[mut] T> is special. &[mut] MaybeDangling<T> is not special at all and basically like &[mut] T.
It's yet another wrapper type adjusting our aliasing rules and very easy to mix up with UnsafeCell or MaybeDangling.
Furthermore, it is an extremely subtle wrapper type, as the duplicate example shows.
UnsafeUnpin is a somewhat unfortunate twin to Unpin.
The purpose of UnsafeUnpin really is only to search for UnsafePinned fields, so that we can use the trait solver to determine whether an &mut reference gets noalias or not.
The actual safety promise of UnsafeUnpin is likely going to be exactly the same as Unpin, but we can't use a stable and safe trait to determine noalias: impl UnsafeUnpin for T would add the noalias back to &mut T, and that can lead to very surprising aliasing issues as the poll_fn debacle showed.
(Note that PollFn has already been fixed, but that doesn't mean nobody will make similar mistakes in the future so it is worth discussing how the original, problematic PollFn would fare under this RFC.)
Splitting up the traits partially mitigates such issues: after impl<T> Unpin for PollFn<T>, PollFn<T> is (in general) Unpin + !UnsafeUnpin.
The known examples of UB that Miri found all were caused by bad aliasing assumptions, which no longer occur when the aliasing assumptions are tied to UnsafeUnpin rather than Unpin.
Actually moving the PollFn would still cause problems (and that can be done in safe code since it implements Unpin), but now the chances of code causing UB are much reduced since one must both pin data that's moved into a closure, and move that closure – even though the Rust compiler will not help prevent such moves, programmers thinking carefully about pinning are hopefully less likely to then try to move that closure.
In conclusion, Unpin + !UnsafeUnpin types are somewhat foot-gunny but less foot-gunny than the status quo.
Maybe in a future edition Unpin can be transitioned to an unsafe trait and then this situation can be re-evaluated; for now, UnsafeUnpin remains an unstable implementation detail similar to Freeze.
(UnsafeUnpin + !Unpin types are harmless, one just loses the ability to call Pin::deref_mut for no good reason.)
It is unfortunate that &mut UnsafePinned and &mut TypeThatContainsUnsafePinned lose their no-alias assumptions even when they are not currently pinned.
However, since pinning was implemented as a library concept, there's not really any way the compiler can know whether an &mut UnsafePinned is pinned or not – working with Pin<&mut TypeThatContainsUnsafePinned> generally requires using Pin::get_unchecked_mut and Pin::map_unchecked_mut, which exposes &mut TypeThatContainsUnsafePinned and &mut UnsafePinned that still need to be considered aliased.
The proposal in this RFC matches what was discussed with the lang team a long time ago.
However, of course one could imagine alternatives:
Keep the status quo. The current situation is that we only make aliasing requirements on mutable references if the type they point to is Unpin. This is unsatisfying: Unpin was never meant to have this job. A consequence is that a stray impl Unpin on a Wrapper<T>-style type can lead to subtle miscompilations since it re-adds aliasing requirements for the inner T.
Contrast this with the UnsafeCell situation, where it is not possible for (stable) code to just impl Freeze for T in the wrong way – UnsafeCell is always recognized by the compiler.
On the other hand, UnsafePinned is rather quirky in its behavior and having two marker traits (UnsafeUnpin and Unpin) might be too confusing, so sticking with Unpin might not be too bad in comparison.
If we do that, however, it seems preferable to transition Unpin to an unsafe trait. There is a clear statement about the types' invariants associated with Unpin, so an impl Unpin already comes with a proof obligation. It just happens to be the case that in a module without unsafe, one can always arrange all the pieces such that the proof obligation is satisfied.
This is mostly a coincidence and related to the fact that we don't have safe field projections on Pin. That said, solving this also requires solving the trouble around Drop and Pin, where effectively an impl Drop does an implicit Pin::get_unchecked_mut, i.e., implicitly assumes the type is Unpin.
UnsafePinned could affect aliasing guarantees both on mutable and shared references. This would avoid the currently rather subtle situation that arises when one of many aliasing &mut UnsafePinned<T> is cast or coerced to &UnsafePinned<T>: that is a read-only shared reference and all aliases must stop writing!
It would make this type strictly more 'powerful' than UnsafeCell in the sense that replacing UnsafeCell by UnsafePinned would always be correct. (Under the RFC as written, UnsafeCell and UnsafePinned can be nested to remove aliasing requirements from both shared and mutable references.)
If we don't do this, we could consider removing get since it seems too much like a foot-gun.
But that makes shared references to UnsafePinned fairly pointless. Shared references to generators/futures are basically useless so it is unclear what the potential use-cases here are.
Instead of introducing a new type, we could say that UnsafeCell affects both shared and mutable references. That would lose some optimization potential on types like &mut Cell<T>, but would avoid the footgun of coercing an &mut UnsafePinned<T> to &UnsafePinned<T>. That said, so far the author is not aware of Miri detecting code that would run into this footgun (and Miri is able to detect such issues).
We could entirely avoid all these problems by not having aliasing restrictions on mutable references.
But that is completely against the direction Rust has had for 8 years now, and it would mean removing LLVM noalias annotations for mutable references (and likely boxes) entirely.
That is sacrificing optimization potential for the common case in favor of simplifying niche cases such as self-referential structs – which is against the usual design philosophy of Rust.
Instead of adding a new type that needs to be used as Pin<&mut UnsafePinned<T>>, can't we just make Pin<&mut T> special?
The answer is no, because working with Pin<&mut T> in unsafe code usually involves getting direct access to the &mut and then using it "carefully".
But being careful is not enough when the compiler makes non-aliasing assumptions!
We need to preserve the fact that the &mut T may have aliases even after Pin::get_unchecked_mut was used and inside Pin::map_unchecked_mut.
In a different universe where pinning is a first-class concept with native support for projections and no need for get_unchecked_mut, this might not have been required, but with pinning being introduced as a library type, there is no (currently known) alternative to UnsafePinned.
In terms of rationale, the question that comes to mind first is why is this so different from UnsafeCell.
UnsafeCell opts-out of read-only guarantees for shared references, can't we just have a type that opts-out of uniqueness guarantees for mutable references?
The answer is no, because mutable references have some universal operations that exploit their uniqueness – in particular, mem::swap.
In contrast, there exists no operation available on all shared references that exploits their immutability.
This is why we need pinning to make APIs around UnsafePinned actually sound.
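The asymmetry can be illustrated with types that already exist. In the sketch below (function names are made up for illustration), two shared references to the same Cell may alias and mutate without any problem, because nothing available on every &T relies on immutability; the generic operation mem::swap, by contrast, is available on every pair of &mut T and relies on the two being distinct.

```rust
use std::cell::Cell;
use std::mem;

/// `a` and `b` may be the same cell; interior mutability through shared
/// references is sound because no generic operation on `&T` assumes the
/// pointee never changes.
pub fn bump_both(a: &Cell<i32>, b: &Cell<i32>) -> i32 {
    a.set(a.get() + 1);
    b.set(b.get() + 1);
    a.get()
}

/// Passes the same cell twice: both increments land on one value.
pub fn shared_aliasing_demo() -> i32 {
    let c = Cell::new(0);
    bump_both(&c, &c)
}

/// `mem::swap` works for any `T` and assumes its arguments do not overlap.
pub fn unique_demo() -> (i32, i32) {
    let (mut x, mut y) = (1, 2);
    mem::swap(&mut x, &mut y);
    // mem::swap(&mut x, &mut x); // rejected by the borrow checker
    (x, y)
}

fn main() {
    println!("{} {:?}", shared_aliasing_demo(), unique_demo());
}
```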
An earlier proposal suggested to call the type UnsafeAliased, since the type is not inherently tied to pinning.
However, it is not possible to write sound wrappers around UnsafeAliased such that we can have aliasing &mut MyType. One has to use pinning for that: Pin<&mut MyType>.
Because of that, the RFC proposes that we suggestively put pinning into the name of the type, so that people don't confuse it with a general mechanism for aliasing mutable references.
It is more like the core primitive behind pinning, where whenever a type is pinned that is caused by an UnsafePinned field somewhere inside it.
For instance, it may be tempting to use an UnsafeAliased type to mark a single field in some struct as "separately aliased", and then a Mutex<Struct> would acquire ownership of the entire struct except for that field.
However, due to mem::swap, that would not be sound.
One cannot hand out an &mut to such aliased memory as part of a safe-to-use abstraction – except by using pinning.
This is somewhat like UnsafeCell, but for mutable instead of shared references.
Adding something like this to Rust has been discussed many times throughout the years. Here are some links for recent discussions:
[…] Unpin opting-out of aliasing guarantees to the new type? Here's a plan: define the PhantomPinned type as UnsafePinned<()>.
Places in the standard library that use impl !Unpin for and the generator lowering are adjusted to use UnsafePinned instead.
Then as long as nobody outside the standard library used the unstable impl !Unpin for, switching the noalias-opt-out to the UnsafeUnpin trait is actually backwards compatible with the (never explicitly supported) Unpin hack!
However, if we ever make UnsafePinned location-relevant (i.e., only data inside the UnsafePinned is allowed to have aliases), then unsafe code in the ecosystem needs to be adjusted.
The justification for this is that currently this code relies on undocumented unstable behavior (using !Unpin to opt-out of aliasing guarantees), so we are in our right to declare it unsound.
Of course we should give the ecosystem time to migrate to the new approach before we actually start doing optimizations that exploit this UB.
[…] Unpin also affects the dereferenceable attribute, so the same would happen for this type. Is that something we want to guarantee, or do we hope to get back dereferenceable when better semantics for it materialize on the LLVM side?
[…] &UnsafeCell<Struct> to &UnsafeCell<Field>, the same kind of projection could also be interesting for UnsafePinned.
Add a type UnsafePinned that indicates to the compiler that this field is "pinned" and there might be pointers elsewhere that point to the same memory. … You need to use mechanisms such as Pin to ensure that mutable references cannot be used in incorrect ways by clients.
Josh: This description alone already makes the name seem confusing. UnsafePinned indicates the field is pinned, but you need to use mechanisms such as Pin with it? What's the proposed pattern here?
TC: This was originally known as UnsafeAliased.
Josh: I saw that in the RFC thread. What was the rationale for the rename? It seems like this type is primarily about aliasing?
Ralf: There is no known way to use this type soundly in a public API without pinning. UnsafeAliased doesn't convey a ton of meaning (UnsafeCell is also about aliasing), so the name was meant to help guide people in the direction of sound APIs. With UnsafeCell, you just have to be careful which operations you expose on &MyType, but with UnsafePinned things are a lot more complicated.
TC: Also, RalfJ had earlier written about it:
After the discussion over yonder I concluded that we really have to pick a different name. When you have an API that involves shared mutable state, it is not the case that you can freely pick between &UnsafeCell and &mut UnsafeAliased. Wrappers around UnsafeCell that provide mutability behind shared references are possible, and we have plenty of them. Wrappers around UnsafeAliased that provide sharing behind mutable references are not possible. One has to use pinning.
So, I updated the RFC to call the type UnsafePinned. This hopefully makes it less likely that people mistake the type for a general mechanism for aliasing mutable references. (There can be no such mechanism.)
This is also discussed in the above document here.
Josh: (Aside: Hypothetically, is it possible to have that mutability without pinning by using a relative pointer and offsetting? This is a tangent, and we probably shouldn't explore it in depth here.)
Josh: I can certainly understand "putting Pinned in the name makes it be perceived as a lot more complicated". :) More seriously, though: does UnsafePinned already include Pin as part of its functionality, or are you expected to pin the type containing the UnsafePinned field?
Ralf: Pin and UnsafePinned are two sides of the same coin. The only way to soundly expose UnsafePinned types is by using Pin, and every use of Pin will bottom out at some UnsafePinned somewhere that is the reason why a Pin reference is needed.
You are expected to use Pin like the initial examples at the top of the document do. (I have added the full code for the fixed version of the example in the guide-level section.)
Josh: Thank you for clarifying the usage and how that motivated the name. I'll think about that and try to keep the bikeshedding to a minimum. :)
Without actually suggesting an excessively awkward name here, it seems like the problem is that UnsafePinned is really UnsafeNeedsPin or similar? Since UnsafePinned doesn't make something pinned, it's the thing at the bottom of a Pin that's the reason why Pin is needed.
TC: Perhaps we could say that it doesn't make it pinned, but (assuming the program is correct) it is pinned.
Josh: The name still seems like it isn't helping understanding. And with pinning we could use all the help we can get on that front. In any case, let's not spend substantial meeting time on a naming bikeshed. Let's make sure we have everything else answered first.
TC: Absolutely.
Ralf: Naming is hard, yeah. I haven't yet found a name I am really happy with.
Josh: I know that &mut is intended to be non-aliased. I'd assumed that meant "you can't have two &mut pointing to the same place, or a &mut and a &". But it looks like the rule being used by this RFC is that you also can't have a &mut and a *const/*mut pointing to the same place, as well? Has that always been the (intended) rule, and people just widely ignored it in unsafe code? Or, can you have a &mut and a raw pointer pointing to the same place but you can't write through the raw pointer?
Ralf: &mut having no aliases at all has always been the rule, as far as I know. It's the only way for that rule to actually be useful: if we allow aliasing an &mut and a *mut, then we absolutely can not use noalias in our LLVM codegen.
The saving grace is that we allow reborrowing. So the rule is that an &mut cannot alias any raw pointer that is not derived from it. But if you have an &mut and then create 15 raw pointers from that, those can all alias with each other as much as they want.
(Stacked Borrows wants you to not use the &mut itself while the raw pointers are active. LLVM noalias and Tree Borrows do allow you to mix and match the &mut and the *mut derived from it. But the "derived from it" part is extremely important here.)
Josh: Does "derived from it" here imply that you can't have a &mut and a *mut both derived from the same T, you have to ensure that the *mut is derived from the &mut?
Ralf: What is a reference "derived from a T"? References/pointers get derived from other references/pointers.
Josh: Sigh, terminology. Does "derived from it" here imply that you can't have a &mut and a *mut both taken from the same T (e.g. via two separate take-the-address-of operations), you have to ensure that the *mut is derived from the &mut?
Ralf: let-bound variables are places so they are basically pointers. Maybe you mean derived from that? (If it's not a let-bound variable then it is another pointer: &addr_of!(*x) is derived from x.) Yeah I think we need a concrete code example as we don't seem to have common terms, sorry.
Josh: In code:
let x = Thing;
let p1: &mut Thing = &mut x;
...
let p2: *mut Thing = &mut x; // assume the compiler doesn't just instantly flag this as a problem.
// p1 and p2 are both taken from x, so does that imply p2 is not "derived from" p1?
Ralf: p2 is not derived from p1 and hence using both is UB (under Stacked Borrows and Tree Borrows).
Josh: Got it. Whereas here:
let x = Thing;
let p1: &mut Thing = &mut x;
...
let p2: *mut Thing = &mut *addr_of_mut!(*p1);
// Now p2 is derived from p1?
Ralf: Yes now p2 is derived from p1.
(It boils down to provenance, as you will probably not be surprised to hear.)
Josh: Thanks. I think this entire section is now resolved and answered. +1.
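(Editorial sketch, not part of the meeting: a compilable version of the "derived from" case discussed above, under the assumption that i32 stands in for Thing. The function name is made up. The raw pointers are all derived from p1, so they may alias each other, and p1 is used again only after the raw pointers are no longer used, which is the pattern Ralf describes as acceptable even under Stacked Borrows.)

```rust
use std::ptr;

/// All raw pointers below are derived from `p1`, so they may alias each other.
pub fn derived_from_reference() -> i32 {
    let mut x = 0;
    let p1: &mut i32 = &mut x;
    let p2: *mut i32 = ptr::addr_of_mut!(*p1); // derived from `p1`
    let p3: *mut i32 = p2; // a copy of `p2`, same provenance
    // SAFETY: `p2` and `p3` are derived from `p1`, which is valid for writes,
    // and `p1` itself is not used while they are in use.
    unsafe {
        *p2 += 1;
        *p3 += 1;
    }
    // The raw pointers are not used again after this point.
    *p1 += 1;
    x
}

fn main() {
    println!("{}", derived_from_reference());
}
```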
duplicate example
pnkfelix: I agree the duplicate example is subtle. I want to double check my understanding of the statement "while not causing immediate UB, is still unsound".
pnkfelix: I would have thought, intuitively, being in a state where you have a function that returns a (&'a mut S, &'a mut S) where the two tuple elements alias each other is a state of immediate UB.
Ralf: The entire point of UnsafePinned is to make that not immediate UB.
UnsafePinned downgrades what would be immediate UB (violation of language/validity rules) to a violation of the safety invariant, which can lead to UB later – but that "UB later" can be avoided by using careful abstractions such as Pin.
pnkfelix: Okay. So then the only case where it is not immediate UB is when S has an UnsafePinned contained (transitively) within, right?
Ralf: Specifically the overlapping parts must all be in an UnsafePinned.
(And then if they are next to UnsafePinned we enter undecided territory, similar to the discussions about the granularity of UnsafeCell.)
pnkfelix: and the overlapping parts cannot be themselves behind some level of indirection, like a Box? I might need to write out an example.
Ralf: Not sure I follow. UnsafePinned only has a by-value effect, similar to UnsafeCell.
So two UnsafePinned<&mut T> are not allowed to alias. An example might help :)
pnkfelix: Okay yes I think that matches my understanding.
pnkfelix: I think I was going for whether this would be immediate UB:
struct S(Box<UnsafePinned<u32>>);
fn duplicate_mut_of_box<'a>(s: &'a mut S) -> (&'a mut S, &'a mut S) {
let s1 = unsafe { mem::transmute_copy(&s) };
let s2 = s;
(s1, s2)
}
pnkfelix: and my current understanding is that that would (still) be immediate UB.
Ralf: Yes, that would still be immediate UB.
tmandry: General comment that even though it can be considered separately from this proposal, I would like to see us move to using field attributes eventually.
pub struct S {
#[unsafe(pinned)]
data: i32,
ptr_to_data: *mut i32,
}
Josh: Don't we still need that represented in the type system somehow, though?
tmandry: Yeah I was looking for the gap in my thinking; that's probably it. An alternative would be "let's make newtypes less annoying to use".
Josh: +1 to that. I wonder if some kind of generalized solution to "types you can project through" would help there. (That's a tangent though.)
tmandry: But as I type this I realize that the accessors on UnsafePinned
are in some sense the point, and a mutable reference to the underlying type is not what you want. Therefore more convenient access to the inner value is not actually desirable.
scottmcm: in particular, any time &mut s.foo
doesn't give &mut typeof(S::foo)
is pretty awkward – it can be really bad for proc macros especially, since often every such macro needs to know about the attribute specifically instead of being able to delegate to a real type.
Josh: Suppose, in the future, with some additional work on the compiler / borrow checker, someone wants to put forth a design for safe self-referential data structures, ideally one that doesn't necessarily require pinning. (e.g. references using relative pointers, or some other solution.) Does the introduction of this type close any doors on those possibilities, or substantially constrain the design of those possibilities? (This RFC does not have to solve that problem, but I'd like to understand the degree to which this RFC constrains future solutions.)
Ralf: That's hard without having an example of what that design would look like. You had earlier mentioned relative pointers. If you have those, e.g., within a struct, you can do that today.
Josh: So there are many possibilities that wouldn't intersect with this feature?
Ralf: That sounds right.
Ralf: E.g.:
let mut s1 = S::new(42);
let mut s2 = S::new(43);
mem::swap(&mut s1, &mut s2);
(Felix's summary: fn duplicate
is buggy; or at least, someone using it has to understand that they are handling nuclear waste that may violate assumptions of other code that is built upon the guarantees of &mut T
. The distinction between "immediate UB" vs "is unsound" is a matter of what we might allow, in the language, for modules/crates to use in a local, controlled manner, and not as part of a general publicly accessible API.)
tmandry: Would this be compatible with adding pinning to the language later?
Ralf: As far as I can tell.
tmandry: It seems like we need to do this in any case.
tmandry: Adding a topic to make sure we walk through all the open questions.
TC: Let's set out subsections for this.
Ralf: Concretely, do we want &UnsafePinned<T>
to disable aliasing guarantees? My inclination is no, as we want orthogonal language features. So it would seem odd to tie these together, as one could simply nest UnsafePinned<UnsafeCell<T>>
. We could of course relax this later.
UnsafePinned could affect aliasing guarantees both on mutable and shared references. This would avoid the currently rather subtle situation that arises when one of many aliasing &mut UnsafePinned<T> is cast or coerced to &UnsafePinned<T>: that is a read-only shared reference and all aliases must stop writing! It would make this type strictly more 'powerful' than UnsafeCell in the sense that replacing UnsafeCell by UnsafePinned would always be correct. (Under the RFC as written, UnsafeCell and UnsafePinned can be nested to remove aliasing requirements from both shared and mutable references.)
If we don't do this, we could consider removing get since it seems too much like a foot-gun. But that makes shared references to UnsafePinned fairly pointless. Shared references to generators/futures are basically useless so it is unclear what the potential use-cases here are.
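For comparison, the UnsafeCell half of this question can be illustrated on stable Rust. This is only a sketch of what "opting out of read-only guarantees for shared references" means; the function name `bump_through_shared` is hypothetical, and UnsafePinned itself is not shown because it is not available on stable:

```rust
use std::cell::UnsafeCell;

/// Hypothetical helper: mutates the contents of an `UnsafeCell` through a
/// shared reference and returns the new value.
pub fn bump_through_shared(shared: &UnsafeCell<i32>) -> i32 {
    let p: *mut i32 = shared.get();
    // SAFETY: single-threaded use, and no reference to the inner `i32` is
    // live while we read and write through the raw pointer.
    unsafe {
        *p += 1;
        *p
    }
}
```

Two `&UnsafeCell<i32>` pointing at the same cell can both be passed to this function in turn. Under the RFC as written, `&UnsafePinned<T>` would not grant this on its own, which is why nesting the two types comes up in the discussion.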
Unpin
opting-out of aliasing guarantees to the new type? Here's a plan: define the PhantomPinned
type as UnsafePinned<()>
.
Places in the standard library that use impl !Unpin for
and the generator lowering are adjusted to use UnsafePinned
instead.
Then as long as nobody outside the standard library used the unstable impl !Unpin for
, switching the noalias
-opt-out to the UnsafeUnpin
trait is actually backwards compatible with the (never explicitly supported) Unpin
hack!
However, if we ever make UnsafePinned
location-relevant (i.e., only data inside the UnsafePinned
is allowed to have aliases), then unsafe code in the ecosystem needs to be adjusted.
The justification for this is that currently this code relies on undocumented unstable behavior (using !Unpin
to opt-out of aliasing guarantees), so we are within our rights to declare it unsound.
Of course we should give the ecosystem time to migrate to the new approach before we actually start doing optimizations that exploit this UB.
Ralf: For UnsafeCell
we have Freeze
. For UnsafePinned
, we'd have e.g. an UnsafeUnpinned
. It'd be annoying to have both UnsafePinned
and Pinned
, but I don't see a way around it.
tmandry: How widespread is it that people are relying on this? How much code would need to change if we added this rule that people needed to wrap this generator-like code in UnsafePinned
?
TC: That is already the rule in a sense; people are already doing this; it's just unsound today.
Ralf: Once we give people these tools, people will be able to fix their code. And we'll be able to see in Miri how much code blows up.
Ralf: Long term it might be good to deprecate PhantomPinned
. If we end up with the semantics where UnsafeUnpin
on one field doesn't affect all fields, then PhantomPinned
is a bit of a trap.
(Added later by Felix: regarding UnsafeCell
's effects on neighboring fields, see e.g. ucg#236, and also this zulip chat on TreeBorrows.)
Josh: std::pin
? If it's so closely tied to pinning.
Ralf: That sounds reasonable.
Josh: Maybe we should bundle this with the question of the name. If its name is tied to pin, it may make sense there.
dereferenceable
?
Unpin
also affects the dereferenceable
attribute, so the same would happen for this type. Is that something we want to guarantee, or do we hope to get back dereferenceable
when better semantics for it materialize on the LLVM side?
TC: What are the next steps?
tmandry: It does seem we need to do this. The name did make me raise an eyebrow initially, but the explanation made sense to me. I do have concerns how we teach this.
Ralf: So do I.
pnkfelix: This makes sense to me also, as far as I understand.
Josh: I don't have any blocking concerns here either.
TC: How do we want to handle the bikeshed on the name raised by Josh?
Ralf: We could ship this on nightly of course.
Josh: +1 on shipping this in nightly. We could consider the name later.
tmandry: I'll propose FCP merge.
UnsafeCell
?
tmandry: This gives me pause; I would like to imagine what explanatory text looks like that compares and contrasts these two types.
In terms of rationale, the question that comes to mind first is why is this so different from UnsafeCell. UnsafeCell opts-out of read-only guarantees for shared references, can't we just have a type that opts-out of uniqueness guarantees for mutable references? The answer is no, because mutable references have some universal operations that exploit their uniqueness – in particular, mem::swap. In contrast, there exists no operation available on all shared references that exploits their immutability. This is why we need pinning to make APIs around UnsafePinned actually sound.
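A rough sketch of why mem::swap is the problem, on stable Rust. The type `SelfRef` and its methods are hypothetical, loosely modelled on the `S` example from the RFC text. The code only compares addresses and never dereferences the stored pointer, so it stays free of UB:

```rust
use std::{mem, ptr};

/// Hypothetical self-referential type: `ptr_to_data` is meant to point
/// at this value's own `data` field.
pub struct SelfRef {
    data: i32,
    ptr_to_data: *const i32,
}

impl SelfRef {
    pub fn new(data: i32) -> Self {
        SelfRef { data, ptr_to_data: ptr::null() }
    }

    /// Records the current address of `data`.
    pub fn init(&mut self) {
        self.ptr_to_data = ptr::addr_of!(self.data);
    }

    /// Checks whether the stored pointer still refers to our own `data`.
    pub fn is_consistent(&self) -> bool {
        ptr::eq(self.ptr_to_data, ptr::addr_of!(self.data))
    }
}

/// Returns (consistent before swap, either value consistent after swap).
pub fn swap_breaks_invariant() -> (bool, bool) {
    let mut s1 = SelfRef::new(42);
    let mut s2 = SelfRef::new(43);
    s1.init();
    s2.init();
    let before = s1.is_consistent() && s2.is_consistent();
    // Any holder of two `&mut SelfRef` may do this in safe code.
    mem::swap(&mut s1, &mut s2);
    let after = s1.is_consistent() || s2.is_consistent();
    (before, after)
}
```

After the swap, each value's pointer refers to the other value's field, so neither is self-consistent. Handing out `Pin<&mut SelfRef>` instead of `&mut SelfRef` is what rules this out for clients.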
(The meeting ended here.)