`unsafe_pinned`

Feature Name: unsafe_pinned
Start Date: 2022-11-05
RFC PR: rust-lang/rfcs#3467
Rust Issue: rust-lang/rust#0000

Summary

Add a type UnsafePinned that indicates to the compiler that this field is "pinned" and there might be pointers elsewhere that point to the same memory. This means, in particular, that &mut UnsafePinned is not necessarily a unique pointer, and thus the compiler cannot just make aliasing assumptions. However, &mut UnsafePinned can still be passed to mem::swap, so this is not a free ticket for arbitrary aliasing of mutable references. You need to use mechanisms such as Pin to ensure that mutable references cannot be used in incorrect ways by clients.

This type is then used in generator lowering, finally fixing #63818.

Motivation

Let's say you want to write a type with a self-referential pointer:

#![feature(negative_impls)]
use std::ptr;
use std::pin::{pin, Pin};

pub struct S {
    data: i32,
    ptr_to_data: *mut i32,
}

impl !Unpin for S {}

impl S {
    pub fn new() -> Self {
        S { data: 42, ptr_to_data: ptr::null_mut() }
    }

    pub fn get_data(self: Pin<&mut Self>) -> i32 {
        // SAFETY: We're not moving anything.
        let this = unsafe { Pin::get_unchecked_mut(self) };
        if this.ptr_to_data.is_null() {
            this.ptr_to_data = ptr::addr_of_mut!(this.data);
        }
        // SAFETY: if the pointer is non-null, then we are pinned and it points to the `data` field.
        unsafe { this.ptr_to_data.read() }
    }

    pub fn set_data(self: Pin<&mut Self>, data: i32) {
        // SAFETY: We're not moving anything.
        let this = unsafe { Pin::get_unchecked_mut(self) };
        if this.ptr_to_data.is_null() {
            this.ptr_to_data = ptr::addr_of_mut!(this.data);
        }
        // SAFETY: if the pointer is non-null, then we are pinned and it points to the `data` field.
        unsafe { this.ptr_to_data.write(data) }
    }
}

fn main() {
    let mut s = pin!(S::new());
    s.as_mut().set_data(42);
    println!("{}", s.as_mut().get_data());
}

This kind of code is implicitly generated by rustc all the time when an async fn has a local variable of reference type that is live across a yield point. The problem is that this code has UB under our aliasing rules: the &mut S inside the self argument of get_data aliases with ptr_to_data! (If you run this code in Miri, remove the impl !Unpin to see the UB. Miri treats Unpin as magic as otherwise the entire async ecosystem would cause errors. But that is not how Unpin was actually designed.)

This simple code only has UB under Stacked Borrows but not under the LLVM aliasing rules; more complex variants of this – still in the realm of what async fn generates – also have UB under the LLVM aliasing rules.

A more complex variant

The following roughly corresponds to a generator with this code:

let mut data = 0;
let ptr_to_data = &mut data;
yield;
*ptr_to_data = 42;
println!("{}", data);
return;

When implemented by hand, it looks as follows, and causes aliasing issues:

#![feature(negative_impls)]
use std::ptr;
use std::pin::{pin, Pin};
use std::task::Poll;

pub struct S {
    state: i32,
    data: i32,
    ptr_to_data: *mut i32,
}

impl !Unpin for S {}

impl S {
    pub fn new() -> Self {
        S { state: 0, data: 0, ptr_to_data: ptr::null_mut() }
    }

    fn poll(self: Pin<&mut Self>) -> Poll<()> {
        // SAFETY: We're not moving anything.
        let this = unsafe { Pin::get_unchecked_mut(self) };
        match this.state {
            0 => {
                // The first time, set up the pointer.
                this.ptr_to_data = ptr::addr_of_mut!(this.data);
                // Now yield.
                this.state += 1;
                Poll::Pending
            }
            1 => {
                // After coming back from the yield, write to the pointer.
                unsafe { this.ptr_to_data.write(42) };
                // And read our local variable `data`.
                // THIS IS UB! `this` is derived from the `noalias` pointer
                // `self` but we did a write to `this.data` in the previous
                // line when writing to `ptr_to_data`. The compiler is allowed
                // to reorder this and the previous line and then the output
                // would change.
                println!("{}", this.data);
                // Now yield and be done.
                this.state += 1;
                Poll::Ready(())
            }
            _ => unreachable!(),
        }
    }
}

fn main() {
    let mut s = pin!(S::new());
    while let Poll::Pending = s.as_mut().poll() {}
}

Beyond self-referential types, a similar problem also comes up with intrusive linked lists: the nodes of such a list often live on the stack frames of the functions participating in the list, but also have incoming pointers from other list elements. When a function takes a mutable reference to its stack-allocated node, that will alias the pointers from the neighboring elements. This is an example of an intrusive list in the standard library that is breaking Rust's aliasing rules. Pin is sometimes used to ensure that the list elements don't just move elsewhere (which would invalidate those incoming pointers) and provide a safe API, but there still is the problem that an &mut Node is actually not a unique pointer due to these aliases – so we need a way to opt out of the aliasing rules.

The goal of this RFC is to offer a way of writing such self-referential types and intrusive collections without UB. We don't want to change the rules for mutable references in general (that would also affect all the code that doesn't do anything self-referential); instead, we want to be able to tell the compiler that this code is doing funky aliasing and that this should be taken into account for optimizations.

Guide-level explanation

To write this code in a UB-free way, wrap the fields that are targets of self-referential pointers in an UnsafePinned:

pub struct S {
    data: UnsafePinned<i32>, // ---- here
    ptr_to_data: *mut i32,
}

impl S {
    pub fn new() -> Self {
        S { data: UnsafePinned::new(42), ptr_to_data: ptr::null_mut() }
    }

    pub fn get_data(self: Pin<&mut Self>) -> i32 {
        // SAFETY: We're not moving anything.
        let this = unsafe { Pin::get_unchecked_mut(self) };
        if this.ptr_to_data.is_null() {
            this.ptr_to_data = UnsafePinned::raw_get_mut(ptr::addr_of_mut!(this.data));
        }
        // SAFETY: if the pointer is non-null, then we are pinned and it points to the `data` field.
        unsafe { this.ptr_to_data.read() }
    }

    pub fn set_data(self: Pin<&mut Self>, data: i32) {
        // SAFETY: We're not moving anything.
        let this = unsafe { Pin::get_unchecked_mut(self) };
        if this.ptr_to_data.is_null() {
            this.ptr_to_data = UnsafePinned::raw_get_mut(ptr::addr_of_mut!(this.data));
        }
        // SAFETY: if the pointer is non-null, then we are pinned and it points to the `data` field.
        unsafe { this.ptr_to_data.write(data) }
    }
}

This lets the compiler know that mutable references to data might still have aliases, and so optimizations cannot assume that no aliases exist. That's entirely analogous to how UnsafeCell lets the compiler know that shared references to this field might undergo mutation.

However, what is not analogous is that an &mut S, when handed to safe code you do not control, must still be a unique pointer! This is because of methods like mem::swap that can still assume that their two arguments are non-overlapping. (mem::swap on two &mut UnsafePinned<i32> may soundly assume that they do not alias.) In other words, the safety invariant of &mut S still requires full ownership of the entire memory range S is stored at; for the duration that a function holds on to the borrow, nobody else may read or write this memory. But again, this is a safety invariant; it only applies to safe code you do not control. You can write your own code handling &mut S and as long as that code is careful not to make use of this memory in the wrong way, potential aliasing is fine.

To hand such references to safe code, use Pin: the type Pin<&mut S> can be safely given to external code, since the Pin wrapper blocks access to operations like mem::swap.
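How Pin blocks mem::swap can be sketched on stable Rust. This is only an illustration under assumptions: `Node`, `swap_pinned` and `swap_demo` are hypothetical names, and `PhantomPinned` stands in for the `UnsafePinned` field that would make the type `!Unpin` under this RFC.

```rust
use std::marker::PhantomPinned;
use std::mem;
use std::pin::{pin, Pin};

/// Hypothetical stand-in for a self-referential type. `PhantomPinned` makes
/// it `!Unpin` on stable Rust; under this RFC an `UnsafePinned` field would.
pub struct Node {
    pub data: i32,
    _pin: PhantomPinned,
}

impl Node {
    pub fn new(data: i32) -> Self {
        Node { data, _pin: PhantomPinned }
    }
}

/// Safe code can only reach the `&mut T` behind a `Pin` when `T: Unpin`,
/// because `Pin::get_mut` carries that bound.
pub fn swap_pinned<T: Unpin>(a: Pin<&mut T>, b: Pin<&mut T>) {
    mem::swap(a.get_mut(), b.get_mut());
}

pub fn swap_demo() -> (i32, i32, i32) {
    // `i32` is `Unpin`, so pinning it restricts nothing and the swap is allowed.
    let mut a = pin!(1);
    let mut b = pin!(2);
    swap_pinned(a.as_mut(), b.as_mut());

    // `Node` is `!Unpin`: reading through the pin is fine, swapping is not.
    let n = pin!(Node::new(3));
    // let m = pin!(Node::new(4));
    // swap_pinned(n, m); // rejected at compile time: `Node` is not `Unpin`
    (*a, *b, n.data)
}
```

The only ways to get an `&mut Node` out of a `Pin<&mut Node>` are the unsafe `get_unchecked_mut` and `map_unchecked_mut`, which is what keeps mem::swap out of reach for external safe code.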

Similarly, the intrusive linked list from the motivation can be fixed by wrapping the entire UnsafeListEntry in UnsafePinned.

Reference-level explanation

API sketch:

/// The type `UnsafePinned<T>` lets unsafe code violate
/// the rule that `&mut UnsafePinned<T>` may never alias anything else.
///
/// However, even if you define your type like `pub struct Wrapper(UnsafePinned<...>)`,
/// it is still very risky to have an `&mut Wrapper` that aliases
/// anything else. Many functions that work generically on `&mut T` assume that the
/// memory that stores `T` is uniquely owned (such as `mem::swap`). In other words,
/// while having aliasing with `&mut Wrapper` is not immediate Undefined
/// Behavior, it is still unsound to expose such a mutable reference to code you do
/// not control! Techniques such as pinning via `Pin` are needed to ensure soundness.
///
/// Similar to `UnsafeCell`, `UnsafePinned` will not usually show up in the public
/// API of a library. It is an internal implementation detail of libraries that
/// need to support aliasing mutable references.
///
/// Further note that this does *not* lift the requirement that shared references
/// must be read-only! Use `UnsafeCell` for that.
///
/// This type blocks niches the same way `UnsafeCell` does.
#[lang = "unsafe_pinned"]
#[repr(transparent)]
struct UnsafePinned<T: ?Sized> {
    value: T,
}

/// When this type is used, that almost certainly means safe APIs need to use pinning
/// to avoid the aliases from becoming invalidated. Therefore let's mark this as `!Unpin`.
impl<T: ?Sized> !Unpin for UnsafePinned<T> {}

/// The type is `Copy` when `T` is to avoid people assuming that `Copy` implies there
/// is no `UnsafePinned` anywhere. (This is an issue with `UnsafeCell`: people use `Copy` bounds
/// to mean `Freeze`.) Given that there is no `unsafe impl Copy for ...`, this is also
/// the option that leaves the user more choices (as they can always wrap this in a `!Copy` type).
impl<T: Copy> Copy for UnsafePinned<T> {}
impl<T: Copy> Clone for UnsafePinned<T> {
    fn clone(&self) -> Self { *self }
}

// `Send` and `Sync` are inherited from `T`. This is similar to `SyncUnsafeCell`, since
// we eventually concluded that `UnsafeCell` implicitly making things `!Sync` is sometimes
// unergonomic. A type that needs to be `!Send`/`!Sync` should really have an explicit
// opt-out itself, e.g. via a `PhantomData<*mut T>` or (one day) via `impl !Send`/`impl !Sync`.

impl<T: ?Sized> UnsafePinned<T> {
    /// Constructs a new instance of `UnsafePinned` which will wrap the specified
    /// value.
    pub fn new(value: T) -> UnsafePinned<T> where T: Sized {
        UnsafePinned { value }
    }

    pub fn into_inner(self) -> T where T: Sized {
        self.value
    }

    /// Get read-write access to the contents of an `UnsafePinned`.
    ///
    /// You should usually be using `get_mut_pinned` instead to explicitly track
    /// the fact that this memory is "pinned" due to there being aliases.
    pub fn get_mut_unchecked(&mut self) -> *mut T {
        ptr::addr_of_mut!(self.value)
    }

    /// Get read-write access to the contents of a pinned `UnsafePinned`.
    pub fn get_mut_pinned(self: Pin<&mut Self>) -> *mut T {
        // SAFETY: we're not using `get_unchecked_mut` to unpin anything
        unsafe { ptr::addr_of_mut!(self.get_unchecked_mut().value) }
    }

    /// Get read-only access to the contents of a shared `UnsafePinned`.
    /// Note that `&UnsafePinned<T>` is read-only if `&T` is read-only.
    /// This means that if there is mutation of the `T`, future reads from the
    /// `*const T` returned here are UB!
    ///
    /// ```rust
    /// unsafe {
    ///     let mut x = UnsafePinned::new(0);
    ///     let ref1 = &mut *addr_of_mut!(x);
    ///     let ref2 = &mut *addr_of_mut!(x);
    ///     let ptr = ref1.get(); // read-only pointer, assumes immutability
    ///     ref2.get_mut_unchecked().write(1);
    ///     ptr.read(); // UB!
    /// }
    /// ```
    pub fn get(&self) -> *const T {
        self as *const _ as *const T
    }

    pub fn raw_get_mut(this: *mut Self) -> *mut T {
        this as *mut T
    }

    pub fn raw_get(this: *const Self) -> *const T {
        this as *const T
    }
}

The comment about aliasing &mut being "risky" refers to the fact that their safety invariant still asserts exclusive ownership. This implies that duplicate in the following example, while not causing immediate UB, is still unsound:

pub struct S {
    data: UnsafePinned<i32>,
}

impl S {
    fn new(x: i32) -> Self {
        S { data: UnsafePinned::new(x) }
    }

    fn duplicate<'a>(s: &'a mut S) -> (&'a mut S, &'a mut S) {
        let s1 = unsafe { std::mem::transmute_copy(&s) };
        let s2 = s;
        (s1, s2)
    }
}

The unsoundness is easily demonstrated by using safe code to cause UB:

let mut s = S::new(42);
let (s1, s2) = S::duplicate(&mut s); // no UB
mem::swap(s1, s2); // UB

We could even soundly make get_mut_unchecked return an &mut T, given that the safety invariant is not affected by UnsafePinned. But that would probably not be useful and only cause confusion.

Here is a polyfill on current Rust that uses the Unpin hack to achieve mostly the same effect as this API. ("Mostly" because a safe impl Unpin for ... can un-do the effect of this, which would not be the case with the real UnsafePinned.)
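For orientation, a polyfill along these lines might look roughly like the following sketch. It is not necessarily the polyfill referred to above: the name `PinnedPolyfill`, the `repr(C)` layout choice, the helper `data_ptr` and the function `polyfill_demo` are all assumptions made for illustration, and only a small part of the API sketch is mirrored.

```rust
use std::marker::PhantomPinned;
use std::pin::{pin, Pin};
use std::ptr;

/// Hypothetical polyfill type. `repr(C)` puts `value` at offset 0, so a
/// pointer to the wrapper can be cast to a pointer to the value.
#[repr(C)]
pub struct PinnedPolyfill<T> {
    value: T,
    // Makes every containing type `!Unpin`, which is what the current
    // `Unpin` hack keys the aliasing opt-out on.
    _pin: PhantomPinned,
}

impl<T> PinnedPolyfill<T> {
    pub fn new(value: T) -> Self {
        PinnedPolyfill { value, _pin: PhantomPinned }
    }

    /// Mirrors `UnsafePinned::raw_get_mut` from the API sketch.
    pub fn raw_get_mut(this: *mut Self) -> *mut T {
        this.cast::<T>()
    }
}

pub struct S {
    data: PinnedPolyfill<i32>,
    ptr_to_data: *mut i32,
}

impl S {
    pub fn new() -> Self {
        S { data: PinnedPolyfill::new(42), ptr_to_data: ptr::null_mut() }
    }

    /// Lazily sets up the self-referential pointer and returns it.
    fn data_ptr(self: Pin<&mut Self>) -> *mut i32 {
        // SAFETY: we never move out of `this`.
        let this = unsafe { self.get_unchecked_mut() };
        if this.ptr_to_data.is_null() {
            this.ptr_to_data = PinnedPolyfill::raw_get_mut(ptr::addr_of_mut!(this.data));
        }
        this.ptr_to_data
    }

    pub fn set_data(self: Pin<&mut Self>, data: i32) {
        // SAFETY: `self` is pinned, so the pointer targets our own `data` field.
        unsafe { self.data_ptr().write(data) }
    }

    pub fn get_data(self: Pin<&mut Self>) -> i32 {
        // SAFETY: as above.
        unsafe { self.data_ptr().read() }
    }
}

pub fn polyfill_demo() -> i32 {
    let mut s = pin!(S::new());
    s.as_mut().set_data(7);
    s.as_mut().get_data()
}
```

Because `S` contains a `PhantomPinned` by value, it is automatically `!Unpin` and no `impl !Unpin` is needed. A stray safe `impl Unpin for S {}` would silently undo that, which is the weakness of the hack noted above.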

Reference diff:

  * Breaking the [pointer aliasing rules]. `&mut T` and `&T` follow LLVM’s scoped
-   [noalias] model, except if the `&T` contains an [`UnsafeCell<U>`].
+   [noalias] model, except if the `&T` contains an [`UnsafeCell<U>`] or
+   the `&mut T` contains an [`UnsafePinned<U>`].

Async generator lowering changes:

Fields that represent local variables whose address is taken across a yield point must be wrapped in UnsafePinned.

rustc and Miri changes:

We have an UnsafeUnpin auto trait similar to Freeze that is implemented if the type does not contain any by-val UnsafePinned. This trait is an internal implementation detail and (for now) not exposed to users.
noalias on mutable references is only emitted for UnsafeUnpin types. (This replaces the current hack where it is only emitted for Unpin types.)
Niches are blocked on UnsafePinned.
Miri will do SharedReadWrite retagging inside UnsafePinned similar to what it does inside UnsafeCell already. (This replaces the current Unpin-based hack.)
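The "by-val" rule for such an auto trait can be illustrated by analogy with the existing Unpin auto trait on stable Rust. The sketch below is an analogy only, since UnsafeUnpin is an internal trait: `Probe`, `Fallback` and the three example types are hypothetical, and `PhantomPinned` plays the role of UnsafePinned.

```rust
use std::marker::{PhantomData, PhantomPinned};

// Compile-time probe for "does this type implement `Unpin`?". An inherent
// associated const takes priority over a trait const during lookup, so
// `IS_UNPIN` is `true` exactly when the bounded inherent impl applies.
trait Fallback {
    const IS_UNPIN: bool = false;
}
impl<T: ?Sized> Fallback for T {}

struct Probe<T: ?Sized>(PhantomData<T>);
impl<T: ?Sized + Unpin> Probe<T> {
    const IS_UNPIN: bool = true;
}

/// Contains the marker directly.
pub struct ByValue {
    _marker: PhantomPinned,
}

/// Contains the marker by value through another struct.
pub struct Nested {
    _inner: ByValue,
}

/// Contains the marker only behind a pointer indirection.
pub struct Indirect {
    _inner: Box<ByValue>,
}

pub fn by_value_is_unpin() -> bool {
    <Probe<ByValue>>::IS_UNPIN
}

pub fn nested_is_unpin() -> bool {
    <Probe<Nested>>::IS_UNPIN
}

pub fn indirect_is_unpin() -> bool {
    <Probe<Indirect>>::IS_UNPIN
}
```

The first two functions return `false` and the last returns `true`: the auto trait is lost through by-value containment but not through `Box`. Under this RFC, UnsafeUnpin would propagate the same way, so `&mut Indirect` would keep its noalias while `&mut Nested` would not.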

Comparison with some other types that affect aliasing

UnsafeCell: disables aliasing (and affects but does not fully disable dereferenceable) behind shared refs, i.e. &UnsafeCell<T> is special. UnsafeCell<&T> (by-val, fully owned) is not special at all and basically like &T; &mut UnsafeCell<T> is also not special. Safe wrappers around this type can expose mutability behind shared references, such as &RefCell<T>.
UnsafePinned: disables aliasing (and affects but does not fully disable dereferenceable) behind mutable refs, i.e. &mut UnsafePinned<T> is special. UnsafePinned<&mut T> (by-val, fully owned) is not special at all and basically like &mut T; &UnsafePinned<T> is also not special. Safe wrappers around this type can expose sharing that involves pinned mutable references, such as Pin<&mut MyFuture>.
MaybeDangling: disables aliasing and dereferenceable of all references (and boxes) directly inside it, i.e. MaybeDangling<&[mut] T> is special. &[mut] MaybeDangling<T> is not special at all and basically like &[mut] T.
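The UnsafeCell half of this comparison is already expressible on stable Rust. As a minimal sketch (the function name is hypothetical), two shared references to the same UnsafeCell can both be used for mutation without violating the aliasing rules:

```rust
use std::cell::UnsafeCell;

pub fn mutate_through_shared_refs() -> i32 {
    let cell = UnsafeCell::new(0);
    // Two shared references to the same memory: `&UnsafeCell<T>` is the
    // "special" case that does not promise immutability.
    let a: &UnsafeCell<i32> = &cell;
    let b: &UnsafeCell<i32> = &cell;
    // SAFETY: single-threaded, and no reference into the interior is live
    // while the raw pointers are used.
    unsafe {
        a.get().write(41);
        *b.get() += 1;
        a.get().read()
    }
}
```

UnsafePinned is meant to be the mirror image for `&mut`. The difference is that a safe wrapper over this pattern can be handed out as a plain shared reference, whereas the mutable counterpart has to be handed out behind Pin.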

Drawbacks

It's yet another wrapper type adjusting our aliasing rules and very easy to mix up with UnsafeCell or MaybeDangling. Furthermore, it is an extremely subtle wrapper type, as the duplicate example shows.
UnsafeUnpin is a somewhat unfortunate twin to Unpin. The purpose of UnsafeUnpin really is only to search for UnsafePinned fields, so that we can use the trait solver to determine whether an &mut reference gets noalias or not. The actual safety promise of UnsafeUnpin is likely going to be exactly the same as Unpin, but we can't use a stable and safe trait to determine noalias: impl UnsafeUnpin for T would add the noalias back to &mut T, and that can lead to very surprising aliasing issues as the poll_fn debacle showed. (Note that PollFn has already been fixed, but that doesn't mean nobody will make similar mistakes in the future so it is worth discussing how the original, problematic PollFn would fare under this RFC.) Splitting up the traits partially mitigates such issues: after impl<T> Unpin for PollFn<T>, PollFn<T> is (in general) Unpin + !UnsafeUnpin. The known examples of UB that Miri found all were caused by bad aliasing assumptions, which no longer occur when the aliasing assumptions are tied to UnsafeUnpin rather than Unpin. Actually moving the PollFn would still cause problems (and that can be done in safe code since it implements Unpin), but now the chances of code causing UB are much reduced since one must both pin data that's moved into a closure, and move that closure – even though the Rust compiler will not help prevent such moves, programmers thinking carefully about pinning are hopefully less likely to then try to move that closure. In conclusion, Unpin + !UnsafeUnpin types are somewhat foot-gunny but less foot-gunny than the status quo. Maybe in a future edition Unpin can be transitioned to an unsafe trait and then this situation can be re-evaluated; for now, UnsafeUnpin remains an unstable implementation detail similar to Freeze. (UnsafeUnpin + !Unpin types are harmless, one just loses the ability to call Pin::deref_mut for no good reason.)
It is unfortunate that &mut UnsafePinned and &mut TypeThatContainsUnsafePinned lose their no-alias assumptions even when they are not currently pinned. However, since pinning was implemented as a library concept, there's not really any way the compiler can know whether an &mut UnsafePinned is pinned or not – working with Pin<&mut TypeThatContainsUnsafePinned> generally requires using Pin::get_unchecked_mut and Pin::map_unchecked_mut, which exposes &mut TypeThatContainsUnsafePinned and &mut UnsafePinned that still need to be considered aliased.

Rationale and alternatives

The proposal in this RFC matches what was discussed with the lang team a long time ago.

However, of course one could imagine alternatives:

Keep the status quo. The current situation is that we only make aliasing requirements on mutable references if the type they point to is Unpin. This is unsatisfying: Unpin was never meant to have this job. A consequence is that a stray impl Unpin on a Wrapper<T>-style type can lead to subtle miscompilations since it re-adds aliasing requirements for the inner T. Contrast this with the UnsafeCell situation, where it is not possible for (stable) code to just impl Freeze for T in the wrong way – UnsafeCell is always recognized by the compiler.

On the other hand, UnsafePinned is rather quirky in its behavior and having two marker traits (UnsafeUnpin and Unpin) might be too confusing, so sticking with Unpin might not be too bad in comparison.

If we do that, however, it seems preferable to transition Unpin to an unsafe trait. There is a clear statement about the types' invariants associated with Unpin, so an impl Unpin already comes with a proof obligation. It just happens to be the case that in a module without unsafe, one can always arrange all the pieces such that the proof obligation is satisfied. This is mostly a coincidence and related to the fact that we don't have safe field projections on Pin. That said, solving this also requires solving the trouble around Drop and Pin, where effectively an impl Drop does an implicit Pin::get_unchecked_mut, i.e., implicitly assumes the type is Unpin.
UnsafePinned could affect aliasing guarantees both on mutable and shared references. This would avoid the currently rather subtle situation that arises when one of many aliasing &mut UnsafePinned<T> is cast or coerced to &UnsafePinned<T>: that is a read-only shared reference and all aliases must stop writing! It would make this type strictly more 'powerful' than UnsafeCell in the sense that replacing UnsafeCell by UnsafePinned would always be correct. (Under the RFC as written, UnsafeCell and UnsafePinned can be nested to remove aliasing requirements from both shared and mutable references.)

If we don't do this, we could consider removing get since it seems too much like a foot-gun. But that makes shared references to UnsafePinned fairly pointless. Shared references to generators/futures are basically useless so it is unclear what the potential use-cases here are.
Instead of introducing a new type, we could say that UnsafeCell affects both shared and mutable references. That would lose some optimization potential on types like &mut Cell<T>, but would avoid the footgun of coercing an &mut UnsafePinned<T> to &UnsafePinned<T>. That said, so far the author is not aware of Miri detecting code that would run into this footgun (and Miri is able to detect such issues).
We could entirely avoid all these problems by not having aliasing restrictions on mutable references. But that is completely against the direction Rust has had for 8 years now, and it would mean removing LLVM noalias annotations for mutable references (and likely boxes) entirely. That is sacrificing optimization potential for the common case in favor of simplifying niche cases such as self-referential structs – which is against the usual design philosophy of Rust.
Instead of adding a new type that needs to be used as Pin<&mut UnsafePinned<T>>, can't we just make Pin<&mut T> special? The answer is no, because working with Pin<&mut T> in unsafe code usually involves getting direct access to the &mut and then using it "carefully". But being careful is not enough when the compiler makes non-aliasing assumptions! We need to preserve the fact that the &mut T may have aliases even after Pin::get_unchecked_mut was used and inside Pin::map_unchecked_mut. In a different universe where pinning is a first-class concept with native support for projections and no need for get_unchecked_mut, this might not have been required, but with pinning being introduced as a library type, there is no (currently known) alternative to UnsafePinned.

In terms of rationale, the question that comes to mind first is why this is so different from UnsafeCell. UnsafeCell opts out of read-only guarantees for shared references; can't we just have a type that opts out of uniqueness guarantees for mutable references? The answer is no, because mutable references have some universal operations that exploit their uniqueness – in particular, mem::swap. In contrast, there exists no operation available on all shared references that exploits their immutability. This is why we need pinning to make APIs around UnsafePinned actually sound.

Naming

An earlier proposal suggested calling the type UnsafeAliased, since the type is not inherently tied to pinning. However, it is not possible to write sound wrappers around UnsafeAliased such that we can have aliasing &mut MyType. One has to use pinning for that: Pin<&mut MyType>. Because of that, the RFC proposes that we suggestively put pinning into the name of the type, so that people don't confuse it with a general mechanism for aliasing mutable references. It is more like the core primitive behind pinning: whenever a type is pinned, that is because of an UnsafePinned field somewhere inside it.

For instance, it may be tempting to use an UnsafeAliased type to mark a single field in some struct as "separately aliased", and then a Mutex<Struct> would acquire ownership of the entire struct except for that field. However, due to mem::swap, that would not be sound. One cannot hand out an &mut to such aliased memory as part of a safe-to-use abstraction – except by using pinning.

Prior art

This is somewhat like UnsafeCell, but for mutable instead of shared references.

Adding something like this to Rust has been discussed many times throughout the years. Here are some links for recent discussions:

Unresolved questions

How do we transition code that relies on Unpin opting-out of aliasing guarantees to the new type? Here's a plan: define the PhantomPinned type as UnsafePinned<()>. Places in the standard library that use impl !Unpin for and the generator lowering are adjusted to use UnsafePinned instead. Then as long as nobody outside the standard library used the unstable impl !Unpin for, switching the noalias-opt-out to the UnsafeUnpin trait is actually backwards compatible with the (never explicitly supported) Unpin hack! However, if we ever make UnsafePinned location-relevant (i.e., only data inside the UnsafePinned is allowed to have aliases), then unsafe code in the ecosystem needs to be adjusted. The justification for this is that currently this code relies on undocumented unstable behavior (using !Unpin to opt-out of aliasing guarantees), so we are within our rights to declare it unsound. Of course we should give the ecosystem time to migrate to the new approach before we actually start doing optimizations that exploit this UB.
Relatedly, in which module should this type live?
Unpin also affects the dereferenceable attribute, so the same would happen for this type. Is that something we want to guarantee, or do we hope to get back dereferenceable when better semantics for it materialize on the LLVM side?

Future possibilities

Similar to how we might want the ability to project from &UnsafeCell<Struct> to &UnsafeCell<Field>, the same kind of projection could also be interesting for UnsafePinned.

Discussion

Attendance

People: TC, pnkfelix, tmandry, scottmcm, Josh, RalfJ

Meeting roles

Minutes, driver: TC

(Low priority) Naming bikeshed containment zone

Add a type UnsafePinned that indicates to the compiler that this field is "pinned" and there might be pointers elsewhere that point to the same memory. … You need to use mechanisms such as Pin to ensure that mutable references cannot be used in incorrect ways by clients.

Josh: This description alone already makes the name seem confusing. UnsafePinned indicates the field is pinned, but you need to use mechanisms such as Pin with it? What's the proposed pattern here?

TC: This was originally known as UnsafeAliased.

Josh: I saw that in the RFC thread. What was the rationale for the rename? It seems like this type is primarily about aliasing?

Ralf: There is no known way to use this type soundly in a public API without pinning. UnsafeAliased doesn't convey a ton of meaning (UnsafeCell is also about aliasing), so the name was meant to help guide people in the direction of sound APIs. With UnsafeCell, you just have to be careful which operations you expose on &MyType, but with UnsafePinned things are a lot more complicated.

TC: Also, RalfJ had earlier written about it:

After the discussion over yonder I concluded that we really have to pick a different name. When you have an API that involves shared mutable state, it is not the case that you can freely pick between &UnsafeCell and &mut UnsafeAliased. Wrappers around UnsafeCell that provide mutability behind shared references are possible, and we have plenty of them. Wrappers around UnsafeAliased that provide sharing behind mutable references are not possible. One has to use pinning.

So, I updated the RFC to call the type UnsafePinned. This hopefully makes it less likely that people mistake the type for a general mechanism for aliasing mutable references. (There can be no such mechanism.)

This is also discussed in the above document here.

Josh: (Aside: Hypothetically, is it possible to have that mutability without pinning by using a relative pointer and offsetting? This is a tangent, and we probably shouldn't explore it in depth here.)

Josh: I can certainly understand "putting Pinned in the name makes it be perceived as a lot more complicated". :) More seriously, though: does UnsafePinned already include Pin as part of its functionality, or are you expected to pin the type containing the UnsafePinned field?

Ralf: Pin and UnsafePinned are two sides of the same coin. The only way to soundly expose UnsafePinned types is by using Pin, and every use of Pin will bottom out at some UnsafePinned somewhere that is the reason why a Pin reference is needed. You are expected to use Pin like the initial examples at the top of the document do. (I have added the full code for the fixed version of the example in the guide-level section.)

Josh: Thank you for clarifying the usage and how that motivated the name. I'll think about that and try to keep the bikeshedding to a minimum. :)

Without actually suggesting an excessively awkward name here, it seems like the problem is that UnsafePinned is really UnsafeNeedsPin or similar? Since UnsafePinned doesn't make something pinned, it's the thing at the bottom of a Pin that's the reason why Pin is needed.

TC: Perhaps we could say that it doesn't make it pinned, but (assuming the program is correct) it is pinned.

Josh: The name still seems like it isn't helping understanding. And with pinning we could use all the help we can get on that front. In any case, let's not spend substantial meeting time on a naming bikeshed. Let's make sure we have everything else answered first.

TC: Absolutely.

Ralf: Naming is hard, yeah. I haven't yet found a name I am really happy with.

(Answered) Understanding the aliasing rules

Josh: I know that &mut is intended to be non-aliased. I'd assumed that meant "you can't have two &mut pointing to the same place, or a &mut and a &". But it looks like the rule being used by this RFC is that you also can't have a &mut and a *const/*mut pointing to the same place, as well? Has that always been the (intended) rule, and people just widely ignored it in unsafe code? Or, can you have a &mut and a raw pointer pointing to the same place but you can't write through the raw pointer?

Ralf: &mut having no aliases at all has always been the rule, as far as I know. It's the only way for that rule to actually be useful: if we allow aliasing an &mut and a *mut, then we absolutely can not use noalias in our LLVM codegen. The saving grace is that we allow reborrowing. So the rule is that an &mut cannot alias any raw pointer that is not derived from it. But if you have an &mut and then create 15 raw pointers from that, those can all alias with each other as much as they want. (Stacked Borrows wants you to not use the &mut itself while the raw pointers are active. LLVM noalias and Tree Borrows do allow you to mix and match the &mut and the *mut derived from it. But the "derived from it" part is extremely important here.)

Josh: Does "derived from it" here imply that you can't have a &mut and a *mut both derived from the same T, you have to ensure that the *mut is derived from the mut?

Ralf: What is a reference "derived from a T"? References/pointers get derived from other references/pointers.

Josh: Sigh, terminology. Does "derived from it" here imply that you can't have a &mut and a *mut both taken from the same T (e.g. via two separate take-the-address-of operations), you have to ensure that the *mut is derived from the mut?

Ralf: let-bound variables are places so they are basically pointers. Maybe you mean derived from that? (If it's not a let-bound variable then it is another pointer: &addr_of!(*x) is derived from x.) Yeah I think we need a concrete code example as we don't seem to have common terms, sorry.

Josh: In code:

let mut x = Thing;
let p1: &mut Thing = &mut x;
...
let p2: *mut Thing = &mut x; // assume the compiler doesn't just instantly flag this as a problem.
// p1 and p2 are both taken from x, so does that imply p2 is not "derived from" p1?

Ralf: p2 is not derived from p1 and hence using both is UB (under Stacked Borrows and Tree Borrows).

Josh: Got it. Whereas here:

let mut x = Thing;
let p1: &mut Thing = &mut x;
...
let p2: *mut Thing = &mut *addr_of_mut!(*p1);
// Now p2 is derived from p1?

Ralf: Yes now p2 is derived from p1. (It boils down to provenance, as you will probably not be surprised to hear.)

Josh: Thanks. I think this entire section is now resolved and answered. +1.
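
(Illustration added to the notes.) A minimal sketch of the "derived from it" rule discussed above, assuming an `i32` stands in for `Thing`. The function name `write_through_derived_pointers` is hypothetical. Both raw pointers are derived from `p1`, so they may alias each other; `p1` itself is not used while they are active, which keeps the sketch within what Stacked Borrows asks for as well.

```rust
use std::ptr;

/// Writes through two raw pointers that are both derived from the same `&mut`.
/// (Hypothetical helper, not part of the RFC.)
fn write_through_derived_pointers() -> i32 {
    let mut x = 0i32;
    let p1: &mut i32 = &mut x;

    // `r1` is derived from `p1`; `r2` is a copy of `r1`, so it shares that derivation.
    let r1: *mut i32 = ptr::addr_of_mut!(*p1);
    let r2: *mut i32 = r1;

    // SAFETY: both pointers are derived from `p1`, point to the live local `x`,
    // and `p1` is not used again while they are in use.
    unsafe {
        r1.write(1);
        r2.write(r2.read() + 1);
    }

    // The raw pointers are no longer used, so reading `x` directly is fine.
    x
}

fn main() {
    assert_eq!(write_through_derived_pointers(), 2);
}
```

Taking `r2` from `&mut x` instead of from `p1` (as in Josh's first snippet) would produce a pointer that is not derived from `p1`, which is the case Ralf identifies as UB.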

The `duplicate` example

pnkfelix: I agree the duplicate example is subtle. I want to double check my understanding of the statement "while not causing immediate UB, is still unsound".

pnkfelix: I would have thought, intuitively, being in a state where you have a function that returns a (&'a mut S, &'a mut S) where the two tuple elements alias each other is a state of immediate UB.

Ralf: The entire point of UnsafePinned is to make that not immediate UB. UnsafePinned downgrades what would be immediate UB (violation of language/validity rules) to a violation of the safety invariant, which can lead to UB later – but that "UB later" can be avoided by using careful abstractions such as Pin.

pnkfelix: Okay. So then the only case where it is not immediate UB is when S has an UnsafePinned contained (transitively) within, right?

Ralf: Specifically the overlapping parts must all be in an UnsafePinned. (And then if they are next to UnsafePinned we enter undecided territory, similar to the discussions about the granularity of UnsafeCell.)

pnkfelix: and the overlapping parts cannot be themselves behind some level of indirection, like a Box? I might need to write out an example.

Ralf: Not sure I follow. UnsafePinned only has a by-value effect, similar to UnsafeCell. So two UnsafePinned<&mut T> are not allowed to alias. An example might help :)

pnkfelix: Okay yes I think that matches my understanding.

pnkfelix: I think I was going for whether this would be immediate UB:

struct S(Box<UnsafePinned<u32>>);
fn duplicate_mut_of_box<'a>(s: &'a mut S) -> (&'a mut S, &'a mut S) {
    // Bitwise-copy the reference itself, yielding a second `&mut S` to the same place.
    let s1 = unsafe { mem::transmute_copy(&s) };
    let s2 = s;
    (s1, s2)
}

pnkfelix: and my current understanding is that that would (still) be immediate UB.

Ralf: Yes, that would still be immediate UB.

Ergonomics of wrapper types

tmandry: General comment that even though it can be considered separately from this proposal, I would like to see us move to using field attributes eventually.

pub struct S {
    #[unsafe(pinned)]
    data: i32,
    ptr_to_data: *mut i32,
}

Josh: Don't we still need that represented in the type system somehow, though?

tmandry: Yeah I was looking for the gap in my thinking; that's probably it. An alternative would be "let's make newtypes less annoying to use".

Josh: +1 to that. I wonder if some kind of generalized solution to "types you can project through" would help there. (That's a tangent though.)

tmandry: But as I type this I realize that the accessors on UnsafePinned are in some sense the point, and a mutable reference to the underlying type is not what you want. Therefore more convenient access to the inner value is not actually desirable.

scottmcm: in particular, any time &mut s.foo doesn't give &mut typeof(S::foo) is pretty awkward – it can be really bad for proc macros especially, since often every such macro needs to know about the attribute specifically instead of being able to delegate to a real type.

Does this close any doors or add any constraints on future possibilities for safe self-referential types?

Josh: Suppose, in the future, with some additional work on the compiler / borrow checker, someone wants to put forth a design for safe self-referential data structures, ideally one that doesn't necessarily require pinning. (e.g. references using relative pointers, or some other solution.) Does the introduction of this type close any doors on those possibilities, or substantially constrain the design of those possibilities? (This RFC does not have to solve that problem, but I'd like to understand the degree to which this RFC constrains future solutions.)

Ralf: That's hard without having an example of what that design would look like. You had earlier mentioned relative pointers. If you have those, e.g., within a struct, you can do that today.

Josh: So there are many possibilities that wouldn't intersect with this feature?

Ralf: That sounds right.
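
(Illustration added to the notes.) One possible reading of "relative pointers within a struct" is an offset stored next to the data it refers to. The sketch below assumes that reading; the type `Window` and the function `target_after_move` are hypothetical names. Because the offset is resolved against wherever the struct currently lives, moving the value does not invalidate the internal reference, and no pinning or `unsafe` is involved.

```rust
/// Hypothetical example type: a buffer plus an offset into itself.
struct Window {
    bytes: [u8; 4],
    /// Position of the referenced element, relative to the start of `bytes`.
    offset: usize,
}

impl Window {
    fn new(bytes: [u8; 4], offset: usize) -> Self {
        assert!(offset < bytes.len());
        Window { bytes, offset }
    }

    /// Resolves the relative reference against the current location of `self`.
    fn target(&self) -> &u8 {
        &self.bytes[self.offset]
    }
}

/// Moves the value to the heap (changing its address), then resolves the offset.
fn target_after_move() -> u8 {
    let w = Window::new([10, 20, 30, 40], 2);
    let moved = Box::new(w);
    *moved.target()
}

fn main() {
    assert_eq!(target_after_move(), 30);
}
```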

Felix's question

Ralf: E.g.:

let mut s1 = S::new(42);
let mut s2 = S::new(43);
mem::swap(&mut s1, &mut s2);

(Felix's summary: fn duplicate is buggy; or at least, someone using it has to understand that they are handling nuclear waste that may violate assumptions of other code that is built upon the guarantees of &mut T. The distinction between "immediate UB" vs "is unsound" is a matter of what we might allow, in the language, for modules/crates to use in a local, controlled manner, and not as part of a general publicly accessible API.)

Adding pinning to the language

tmandry: Would this be compatible with adding pinning to the language later?

Ralf: As far as I can tell.

tmandry: It seems like we need to do this in any case.

Going through open questions

tmandry: Adding a topic to make sure we walk through all the open questions.

TC: Let's set out subsections for this.

Should this affect shared references?

Ralf: Concretely, do we want &UnsafePinned<T> to disable aliasing guarantees? My inclination is no, as we want orthogonal language features. So it would seem odd to tie these together, as one could simply nest UnsafePinned<UnsafeCell<T>>. We could of course relax this later.

UnsafePinned could affect aliasing guarantees both on mutable and shared references. This would avoid the currently rather subtle situation that arises when one of many aliasing &mut UnsafePinned<T> is cast or coerced to &UnsafePinned<T>: that is a read-only shared reference and all aliases must stop writing! It would make this type strictly more 'powerful' than UnsafeCell in the sense that replacing UnsafeCell by UnsafePinned would always be correct. (Under the RFC as written, UnsafeCell and UnsafePinned can be nested to remove aliasing requirements from both shared and mutable references.)

If we don't do this, we could consider removing get since it seems too much like a foot-gun. But that makes shared references to UnsafePinned fairly pointless. Shared references to generators/futures are basically useless so it is unclear what the potential use-cases here are.

How do we transition code that relies on Unpin opting-out of aliasing guarantees to the new type?

How do we transition code that relies on Unpin opting-out of aliasing guarantees to the new type? Here's a plan: define the PhantomPinned type as UnsafePinned<()>. Places in the standard library that use impl !Unpin for and the generator lowering are adjusted to use UnsafePinned instead. Then as long as nobody outside the standard library used the unstable impl !Unpin for, switching the noalias-opt-out to the UnsafeUnpin trait is actually backwards compatible with the (never explicitly supported) Unpin hack! However, if we ever make UnsafePinned location-relevant (i.e., only data inside the UnsafePinned is allowed to have aliases), then unsafe code in the ecosystem needs to be adjusted. The justification for this is that currently this code relies on undocumented unstable behavior (using !Unpin to opt-out of aliasing guarantees), so we are within our rights to declare it unsound. Of course we should give the ecosystem time to migrate to the new approach before we actually start doing optimizations that exploit this UB.

Ralf: For UnsafeCell we have Freeze. For UnsafePinned, we'd have e.g. an UnsafeUnpinned. It'd be annoying to have both UnsafePinned and Pinned, but I don't see a way around it.

tmandry: How widespread is it that people are relying on this? How much code would need to change if we added this rule that people needed to wrap this generator-like code in UnsafePinned?

TC: That is already the rule in a sense; people are already doing this; it's just unsound today.

Ralf: Once we give people these tools, people will be able to fix their code. And we'll be able to see in Miri how much code blows up.

Ralf: Long term it might be good to deprecate PhantomPinned. If we end up with the semantics where UnsafeUnpin on one field doesn't affect all fields, then PhantomPinned is a bit of a trap.

(Added later by Felix: regarding UnsafeCell's effects on neighboring fields, see e.g. ucg#236, and also this zulip chat on TreeBorrows.)

In which module should this type live?

Josh: std::pin? If it's so closely tied to pinning.

Ralf: That sounds reasonable.

Josh: Maybe we should bundle this with the question of the name. If its name is tied to pin, it may make sense there.

Handling of `dereferenceable`?

Unpin also affects the dereferenceable attribute, so the same would happen for this type. Is that something we want to guarantee, or do we hope to get back dereferenceable when better semantics for it materialize on the LLVM side?

Next steps

TC: What are the next steps?

tmandry: It does seem we need to do this. The name did make me raise an eyebrow initially, but the explanation made sense to me. I do have concerns about how we teach this.

Ralf: So do I.

pnkfelix: This makes sense to me also, as far as I understand.

Josh: I don't have any blocking concerns here either.

TC: How do we want to handle the bikeshed on the name raised by Josh?

Ralf: We could ship this on nightly of course.

Josh: +1 on shipping this in nightly. We could consider the name later.

tmandry: I'll propose FCP merge.

How do we teach this and `UnsafeCell`?

tmandry: This gives me pause; I would like to imagine what explanatory text looks like that compares and contrasts these two types.

In terms of rationale, the question that comes to mind first is why is this so different from UnsafeCell. UnsafeCell opts-out of read-only guarantees for shared references, can't we just have a type that opts-out of uniqueness guarantees for mutable references? The answer is no, because mutable references have some universal operations that exploit their uniqueness – in particular, mem::swap. In contrast, there exists no operation available on all shared references that exploits their immutability. This is why we need pinning to make APIs around UnsafePinned actually sound.
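
(Illustration added to the notes.) A minimal sketch of why `mem::swap` is the problem, assuming a simplified self-referential type. The names `SelfRef`, `init`, `is_self_referential`, and `swap_breaks_self_reference` are hypothetical. The sketch only compares addresses and never dereferences the stale pointer, so it stays free of UB while still showing that an unpinned `&mut` lets safe code invalidate the self-reference.

```rust
use std::mem;
use std::ptr;

/// Hypothetical self-referential type: `ptr_to_data` is meant to point at `data`.
struct SelfRef {
    data: i32,
    ptr_to_data: *const i32,
}

impl SelfRef {
    fn new(data: i32) -> Self {
        SelfRef { data, ptr_to_data: ptr::null() }
    }

    /// Records the current address of `data` in `ptr_to_data`.
    fn init(&mut self) {
        self.ptr_to_data = ptr::addr_of!(self.data);
    }

    /// Checks, by address comparison only, whether the pointer still targets our own field.
    fn is_self_referential(&self) -> bool {
        ptr::eq(self.ptr_to_data, ptr::addr_of!(self.data))
    }
}

/// Returns whether the self-reference holds before and after a `mem::swap`.
fn swap_breaks_self_reference() -> (bool, bool) {
    let mut s1 = SelfRef::new(42);
    let mut s2 = SelfRef::new(43);
    s1.init();
    let before = s1.is_self_referential();

    // Available on every `&mut T`: the contents of `s1` now live in `s2`,
    // but the stored pointer still targets the `data` field of `s1`.
    mem::swap(&mut s1, &mut s2);
    let after = s2.is_self_referential();

    (before, after)
}

fn main() {
    assert_eq!(swap_breaks_self_reference(), (true, false));
}
```

With `Pin<&mut SelfRef>` and a `!Unpin` type, safe code cannot obtain the `&mut SelfRef` needed to call `mem::swap`, which is the role pinning plays in making APIs around `UnsafePinned` sound.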

(The meeting ended here.)

Felix S Klock II

2024/05/29 16:41:30

to be clear: causes aliasing issues with LLVM, right? (And perhaps any other reasonable backend one might imagine, maybe?) (Edited)

Tyler Mandry

2024/05/29 16:47:39

disables aliasing

Meaning, disables aliasing rules?

2024/05/29 16:51:56

// SAFETY: we're not using `get_unchecked_mut` to unpin anything

should this say `get_mut_unchecked`, instead?

2024/05/29 17:14:42

Relatedly, in which module should this type live

`std::pin` seems natural to me, given that `UnsafeCell` is in `std::cell` ...
