dyn* Conversion Traits

We want to have a trait or collection of traits to facilitate converting things into dyn* Trait objects. Here we discuss some of the desired properties, possible designs, and attempt to settle on the right way to go.

Design Meeting 2022-10-14

Objectives

  1. Decide what version of the traits we want to implement to start with (without necessarily committing to it as the long-term design)
  2. Identify requirements and nice-to-have properties

Conclusions:

We have a choice. To cast T to dyn* Trait, we can require one of the following:

  1. T: PointerSized + Trait
  2. T: PointerSized + Deref*<impl Trait>

Option 1 allows small value inline storage but forces double indirection if T is already a pointer. Option 2 does not force double indirection but gives up on the small value inline storage optimization.

Avoiding double indirection seems more important than the inline storage optimization.

Open Question: Can we get both inline storage and single-indirection using IntoDyn?

Currently option 1 is implicitly implemented. Do we want to start implementing option 2 instead, or wait until we resolve IntoDyn?

Option 2 is harder because there are a lot of fiddly rules we'll need to add.

Background:

Questions/Comments:

  • (tmandry) IntoDyn requires us to decide on a per-type basis how to do the conversion. Is that true of the others?
    • (eholk) You can use adapters like Boxing, but that seems like it's actually independent.
    • (tmandry) Not sure why they would have to be connected.
  • (tmandry) IntoDyn allows for a blanket impl that can be specialized for specific types, but it's not clear if this is an important advantage.
  • Can we support by-value self? Probably? The vtable should know the size, and dyn* is an owned value. Box<Self> works currently, and dyn* basically functions as a box, so by-value self should work.
  • Connection to safe transmute - similar because we have the layout and alignment requirement, but we don't need the safety requirements. Safe transmute is one-way (you can go from a reference to an integer, but not back), but we need to be able to go both directions. The compiler guarantees this is safe because it knows it's casting back to the same type.
  • (tmandry) Pinning
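The by-value self comment above leans on the fact that Box<Self> receivers already work through a trait object. A minimal stable-Rust illustration of that existing behavior follows; the trait name Consume and the helper consume_erased are made up for this sketch, and it uses Box<dyn Trait> rather than dyn*, which is not available on stable.

```rust
trait Consume {
    // `self: Box<Self>` takes ownership of the receiver while keeping
    // the trait usable as a trait object.
    fn consume(self: Box<Self>) -> usize;
}

impl Consume for usize {
    fn consume(self: Box<Self>) -> usize {
        *self
    }
}

// Hypothetical helper: dispatches through the vtable and consumes the box.
fn consume_erased(value: Box<dyn Consume>) -> usize {
    value.consume()
}

fn main() {
    let erased: Box<dyn Consume> = Box::new(41usize);
    println!("consumed value: {}", consume_erased(erased));
}
```

If dyn* behaves like an owned box, a by-value self method could presumably be dispatched the same way, with the vtable entry taking ownership of the data word.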

Double Indirection

trait Counter {
    fn increment(&mut self);
}

impl Counter for usize {
    fn increment(&mut self) {
        *self += 1
    }
}

impl Counter for Box<usize> {
    fn increment(&mut self) { ... }
}

// compiler-generated
impl Counter for dyn* Counter {
    fn increment(&mut self) {
        self.vtable.increment(&mut self.0)
    }
}

fn main() {
    let mut x = 0usize as dyn* Counter;
    // x is (0, <usize as Counter>::VTABLE)
    x.increment();
    // desugars to:
    // <dyn* Counter>::increment(&mut x);
    // increment gets inlined, but we have to spill x to the
    // stack to take its address
    
    let mut y = Box::new(0usize) as dyn* Counter;
    y.increment();
    // y is (Box(0), <Box<usize> as Counter>::VTABLE)
    // desugars to:
    // <dyn* Counter>::increment(&mut y);
    //
    // which can be inlined because it's static dispatch to dyn* impl
    {
        //<dyn* Counter>::increment(&mut y);
        
        //(&mut y).vtable.increment(&mut (&mut y).0)
        y.vtable.increment(&mut y.0)
        
        // &mut (y.0) is a pointer to a pointer
    }
}

The double indirection may not be too bad, because the first pointer is on the stack, which is probably in L1 cache.

We could avoid the double indirection if we require the first field to always be a pointer and give up on small value inline storage.

(tmandry) Is this what was motivating IntoRawPointer and FromRawPointer?

(eholk) What if we require PointerSized + Deref<impl Trait> to cast into dyn* Trait?

  • (tmandry) That's enough to get rid of the double indirection
  • We probably want DerefMut, DerefOwned, etc. to support different Self types
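The double indirection discussed in this section can be observed on stable Rust with an ordinary &mut dyn Counter, assuming the same shape applies to dyn*: when the trait is implemented for Box<usize>, the receiver is a pointer to the Box slot, which in turn points to the heap value. The self_addr method and the observe helper are additions made only for this sketch.

```rust
trait Counter {
    fn increment(&mut self);
    // Sketch-only helper: reports the address the receiver points at.
    fn self_addr(&self) -> usize;
}

impl Counter for Box<usize> {
    fn increment(&mut self) {
        // First deref reaches the Box slot, second reaches the heap value.
        **self += 1;
    }

    fn self_addr(&self) -> usize {
        self as *const Box<usize> as usize
    }
}

// Returns (receiver address, Box slot address, heap address, final value).
fn observe() -> (usize, usize, usize, usize) {
    let mut boxed: Box<usize> = Box::new(0);
    let heap_addr = &*boxed as *const usize as usize;
    let slot_addr = &boxed as *const Box<usize> as usize;

    let erased: &mut dyn Counter = &mut boxed;
    erased.increment();
    let receiver_addr = erased.self_addr();

    (receiver_addr, slot_addr, heap_addr, *boxed)
}

fn main() {
    let (receiver, slot, heap, value) = observe();
    println!("receiver points at the Box slot: {}", receiver == slot);
    println!("receiver points at the heap value: {}", receiver == heap);
    println!("value after increment: {}", value);
}
```

The receiver address matches the stack slot holding the Box rather than the heap allocation, so reaching the counter takes two loads.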

Desiderata

Below are some features we might want.

  • Allow for customizable conversion into a dyn* type. For example, many types will want to do this by automatically boxing self, but we want to support other strategies like using an arena allocator or inline storage where possible.
  • Sensible defaults
    • Things that are already safe pointers (e.g. Box, Rc, Arc, &) should automatically implement the correct traits.
    • Do we want default impls for things like usize and isize? It might be better for users to write a newtype wrapper and specify the behavior they want.
  • Traits can be implemented in Safe Rust
  • Supports methods with different self types. We at least should support the ones that are already dyn-safe, but if we can support more that would be great.

Background: Converting into dyn*

Let's consider the following program:

trait Counter {
    fn get_value(&self) -> usize;
    fn increment(&mut self);
}

impl Counter for &mut usize {
    fn get_value(&self) -> usize {
        // self is &&mut usize, so two derefs reach the value
        **self
    }

    fn increment(&mut self) {
        // self is &mut &mut usize
        **self += 1;
    }
}

fn main() {
    let mut counter = &mut 0usize as dyn* Counter;

    counter.increment();

    println!("The value is {}", counter.get_value());
}

To start with, let's assume the compiler does everything by transmuting. Then the compiler might elaborate the program like this:

// dyn* objects are basically a struct of { data, vtable }

// impl automatically generated by compiler
impl Counter for dyn* Counter {
    fn get_value(&self) -> usize {
        self.vtable.get_value(self.data)
    }

    fn increment(&mut self) {
        self.vtable.increment(self.data)
    }
}

fn main() {
    let mut counter: dyn* Counter =
        dyn_star_from_raw_parts(
            transmute::<*const ()>(&mut 0),
            VTable {
                get_value: |this: *const ()| {
                    let this = transmute::<&mut usize>(this);
                    this.get_value()
                },
                increment: |this: *const ()| {
                    let mut this = transmute::<&mut usize>(this);
                    this.increment()
                }
            });

    counter.increment();

    println!("The value is {}", counter.get_value());
}

This roughly corresponds to the code that rustc currently generates for dyn*.
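For readers who want to run something, the elaboration above can be approximated by hand on stable Rust. This is only a sketch under assumptions: DynStarCounter, CounterVTable, the shim functions, and from_mut are hypothetical stand-ins for what the compiler would generate, and raw pointer casts are used in place of transmute.

```rust
use std::marker::PhantomData;

trait Counter {
    fn get_value(&self) -> usize;
    fn increment(&mut self);
}

impl Counter for &mut usize {
    fn get_value(&self) -> usize {
        **self
    }
    fn increment(&mut self) {
        **self += 1;
    }
}

// Hand-written stand-in for the compiler-generated vtable.
struct CounterVTable {
    get_value: unsafe fn(*const ()) -> usize,
    increment: unsafe fn(*const ()),
}

// Hand-written stand-in for `dyn* Counter`: one data word plus a vtable.
struct DynStarCounter<'a> {
    data: *const (),
    vtable: &'static CounterVTable,
    _borrow: PhantomData<&'a mut usize>,
}

unsafe fn get_value_shim(this: *const ()) -> usize {
    // SAFETY: `data` is only ever created from a live `&mut usize`.
    let this: &mut usize = unsafe { &mut *(this as *mut usize) };
    <&mut usize as Counter>::get_value(&this)
}

unsafe fn increment_shim(this: *const ()) {
    // SAFETY: same invariant as above.
    let mut this: &mut usize = unsafe { &mut *(this as *mut usize) };
    <&mut usize as Counter>::increment(&mut this)
}

static REF_MUT_USIZE_VTABLE: CounterVTable = CounterVTable {
    get_value: get_value_shim,
    increment: increment_shim,
};

impl<'a> DynStarCounter<'a> {
    // Plays the role of `&mut x as dyn* Counter`.
    fn from_mut(value: &'a mut usize) -> Self {
        DynStarCounter {
            data: value as *mut usize as *const (),
            vtable: &REF_MUT_USIZE_VTABLE,
            _borrow: PhantomData,
        }
    }
}

impl Counter for DynStarCounter<'_> {
    fn get_value(&self) -> usize {
        unsafe { (self.vtable.get_value)(self.data) }
    }
    fn increment(&mut self) {
        unsafe { (self.vtable.increment)(self.data) }
    }
}

fn run() -> usize {
    let mut storage = 0usize;
    let mut counter = DynStarCounter::from_mut(&mut storage);
    counter.increment();
    counter.get_value()
}

fn main() {
    println!("The value is {}", run());
}
```

The PhantomData field only ties the erased pointer to the borrow of the original value; a real dyn* would carry that lifetime in its type.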

Option 1: PointerSized

This trait would basically signify that the transmute into a *const () in the example above would work.

Currently the compiler has no verification that it will actually be able to convert a value to a dyn* Trait other than making sure the value implements Trait. Having a pointer-sized trait would allow us to guarantee at type checking time that the conversion will succeed.

PointerSized would be a compiler-implemented trait like Sized.

Option 1a: SameLayout<T>

This is more general and preserves the option for large dyn* objects in the future. If we had SameLayout<T>, we could implement PointerSized as:

trait PointerSized: SameLayout<*const ()> {}
impl<T> PointerSized for T where T: SameLayout<*const ()> {}

The main advantage of this version is that it would give us a way to parameterize dyn* by a layout and allow more storage space for inline storage.

For example, currently you can do 123usize as dyn* Debug and 123 would be stored in the data field of the dyn* object, rather than a pointer to it. If we wanted to allow larger types to work, we could imagine something like foo as dyn*<[usize; 4]> Debug, where the data field now uses [usize; 4] as its layout and can store up to four usize values, and we'd require foo: Debug + SameLayout<[usize; 4]> to do the conversion.

The ergonomics of using this feature would probably not be great, so it's probably not worth complicating the design too much for what is likely to be a niche use case. Using it in the large would probably be similar to using StackFuture.
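As a rough illustration of what these bounds would check, the layout comparison can be approximated on stable Rust with const functions. This is a sketch, not the proposed trait: same_layout, pointer_sized, and FourWords are hypothetical names, and treating "same layout" as equal size and equal alignment is an assumption made here.

```rust
use std::mem::{align_of, size_of};

// Hypothetical stand-in for a `T: SameLayout<U>` bound.
const fn same_layout<T, U>() -> bool {
    size_of::<T>() == size_of::<U>() && align_of::<T>() == align_of::<U>()
}

// Hypothetical stand-in for `T: PointerSized`, defined via `same_layout`
// the same way the note defines PointerSized via SameLayout<*const ()>.
const fn pointer_sized<T>() -> bool {
    same_layout::<T, *const ()>()
}

// Example type that would fit the imagined dyn*<[usize; 4]> storage.
#[repr(C)]
#[allow(dead_code)]
struct FourWords {
    a: usize,
    b: usize,
    c: usize,
    d: usize,
}

// Evaluated at compile time, similar in spirit to a type-check-time bound.
const _: () = assert!(pointer_sized::<usize>());
const _: () = assert!(pointer_sized::<Box<u64>>());
const _: () = assert!(!pointer_sized::<u8>());
const _: () = assert!(same_layout::<FourWords, [usize; 4]>());

fn main() {
    println!("usize pointer-sized: {}", pointer_sized::<usize>());
    println!("&str pointer-sized: {}", pointer_sized::<&str>());
    println!(
        "FourWords fits [usize; 4]: {}",
        same_layout::<FourWords, [usize; 4]>()
    );
}
```

A compiler-implemented trait would surface a failed check as a trait-bound error at the cast site instead of a failed const assertion.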

Option 2: Flexible Option - IntoDyn / CoerceSized

In this option, we introduce a trait that allows types to control some of the conversion mechanism. The trait would probably look something like this:

trait IntoDyn {
    type Ptr: PointerSized; // PointerSized bound is optional if we want to
                            // support larger dyn* objects
    fn into_dyn(this: Self) -> Self::Ptr;

    fn as_ref_self(this: &Self::Ptr) -> &Self;
    fn as_mut_self(this: &mut Self::Ptr) -> &mut Self;
    // ...additional conversions for other kinds of self arguments
    //
    // These would probably all be split into separate traits
    // so types would not have to support all self types.
}

Going back to our example above, the compiler would use these traits when filling in the dyn* vtable:

fn main() {
    let mut counter: dyn* Counter =
        dyn_star_from_raw_parts(
            transmute::<*const ()>(
                <&mut usize as IntoDyn>::into_dyn(&mut 0)),
            VTable {
                get_value: |this: *const ()| {
                    let this
                        = transmute::<<&mut usize as IntoDyn>::Ptr>(this);
                    let this = <&mut usize as IntoDyn>::as_ref_self(&this);
                    this.get_value()
                },
                increment: |this: *const ()| {
                    let mut this
                        = transmute::<<&mut usize as IntoDyn>::Ptr>(this);
                    let this = <&mut usize as IntoDyn>::as_mut_self(&mut this);
                    this.increment()
                }
            });

    counter.increment();
    
    println!("The value is {}", counter.get_value());
}

Types would then be responsible for defining how to do this coercion. For things that are already pointer sized, this would likely just be a transmute. In fact, we could provide a blanket implementation:

impl<T> IntoDyn for T
where T: PointerSized {
    type Ptr = T;
    fn into_dyn(this: Self) -> Self::Ptr {
        this
    }
    // ...
}

We also have the option of auto-boxing impls:

impl IntoDyn for BigStruct {
    type Ptr = Box<BigStruct>;
    
    fn into_dyn(this: Self) -> Self::Ptr {
        Box::new(this)
    }
    
    fn as_ref_self(this: &Self::Ptr) -> &Self {
        this.as_ref()
    }
    
    // ...
}

One key point about this design is that the Ptr associated type means the compiler can do all of the transmuting in code it generates, and impls of IntoDyn (or whatever we call it) can be implemented completely in safe code.
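To make the "completely in safe code" point concrete, here is a sketch of user-side impls that compiles on stable Rust. It assumes a simplified IntoDyn without the PointerSized bound (which does not exist on stable), and the BigStruct fields and the bump_first helper are invented for this example. The compiler-side transmutes are not modeled.

```rust
trait IntoDyn: Sized {
    // In the proposal this would carry a `PointerSized` bound.
    type Ptr;

    fn into_dyn(this: Self) -> Self::Ptr;
    fn as_ref_self(ptr: &Self::Ptr) -> &Self;
    fn as_mut_self(ptr: &mut Self::Ptr) -> &mut Self;
}

// Too large to fit in a pointer-sized data word.
struct BigStruct {
    values: [u64; 8],
}

// Auto-boxing impl: no unsafe code needed on the user's side.
impl IntoDyn for BigStruct {
    type Ptr = Box<BigStruct>;

    fn into_dyn(this: Self) -> Self::Ptr {
        Box::new(this)
    }
    fn as_ref_self(ptr: &Self::Ptr) -> &Self {
        &**ptr
    }
    fn as_mut_self(ptr: &mut Self::Ptr) -> &mut Self {
        &mut **ptr
    }
}

// Identity impl for a type that is already pointer-sized.
impl IntoDyn for usize {
    type Ptr = usize;

    fn into_dyn(this: Self) -> Self::Ptr {
        this
    }
    fn as_ref_self(ptr: &Self::Ptr) -> &Self {
        ptr
    }
    fn as_mut_self(ptr: &mut Self::Ptr) -> &mut Self {
        ptr
    }
}

// Sketch-only helper exercising the round trip through `Ptr`.
fn bump_first(big: BigStruct) -> u64 {
    let mut ptr = BigStruct::into_dyn(big);
    BigStruct::as_mut_self(&mut ptr).values[0] += 1;
    BigStruct::as_ref_self(&ptr).values[0]
}

fn main() {
    let big = BigStruct { values: [1; 8] };
    println!("first value after bump: {}", bump_first(big));
    println!(
        "Ptr is pointer-sized: {}",
        std::mem::size_of::<<BigStruct as IntoDyn>::Ptr>()
            == std::mem::size_of::<*const ()>()
    );
}
```

Only the conversion between Ptr and the erased data word would need unsafe code, and under this design that part lives in compiler-generated code.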

Is this extra functionality useful?

While these traits are significantly more flexible, it's unclear whether that buys us much. For example, the auto-boxing example could be done similarly to this:

trait Foo {
    async fn foo(&self);
}

impl Foo for BigStruct {
    #[refine]
    fn foo(&self) -> dyn* Future<Output = ()> {
        Box::new(async {
            // ...
        })
    }
}

It's more verbose, but this could be done automatically with macros or trait transformers or some other feature.

Because there can only be one impl for a type, the IntoDyn trait requires us to decide on a per-type basis how the storage for the result futures is handled. In practice, we may want to make this decision at the call site or somewhere else instead. This could be done using something like Boxing::new(my_big_struct), but that does not require the more complex IntoDyn trait.

Have your cake and eat it too

Can we design a set of traits that let us have inline storage and also single indirection?

See also tmandry's version at https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=4a3d7f992f2f8d1710e2926ea9914021

trait Foo {
    fn by_value(self);
    fn by_ref(&self);
    fn by_mut_ref(&mut self);
}

impl Foo for usize { ... }

trait CoerceDyn {
    // PointerSized is a compiler-implemented trait that
    // indicates that a type can be transmuted into *const ()
    type Ptr: PointerSized;

    fn into_ptr(this: Self) -> Self::Ptr;
    fn as_value(ptr: &Self::Ptr) -> Self;
    fn as_ref(ptr: &Self::Ptr) -> &Self;
    fn as_mut_ref(ptr: &mut Self::Ptr) -> &mut Self;
}

// inline storage version
impl CoerceDyn for usize {
    type Ptr = usize;

    fn into_ptr(x: Self) -> Self::Ptr { x }
    fn as_value(x: &Self::Ptr) -> Self { *x }
    fn as_ref(x: &Self::Ptr) -> &Self { x }
    fn as_mut_ref(x: &mut Self::Ptr) -> &mut Self { x }
}

// Pointer version, using Box as an example
impl<T> CoerceDyn for Box<T>
where
    T: Sized, // so Box<T> is a thin pointer
{
    type Ptr = Box<T>;

    fn into_ptr(x: Self) -> Self::Ptr { x }

    // FIXME: as_value needs to take a pointer that
    // we can deinitialize
    fn as_value(x: &Self::Ptr) -> Self { *x }

    fn as_ref(x: &Self::Ptr) -> &Self {
        x // automatically goes through Deref
    }

    fn as_mut_ref(x: &mut Self::Ptr) -> &mut Self { x }
}

// compiler-generated
struct DynStarVTable<Foo> {
    by_value: fn(*const ()),
    by_ref: fn(*const ()),
    by_mut_ref: fn(&mut *const ()),
    drop: fn(*const ()),
}

// compiler-generated
fn cast_dyn_star<T>(x: T) -> dyn* Foo
where
    T: CoerceDyn,
{
    <dyn* Foo> {
        data: CoerceDyn::into_ptr(x),
        table: &DynStarVTable<Foo> {
            by_value: fn(this: *const ()) {
                let this: T::Ptr = transmute(this);
                let this = <T as CoerceDyn>::as_value(&this);
                <T as Foo>::by_value(this)
            },
            by_ref: fn(this: *const ()) {
                let this: T::Ptr = transmute(this);
                let this = <T as CoerceDyn>::as_ref(&this);
                <T as Foo>::by_ref(this)
            },
            by_mut_ref: fn(this_orig: &mut *const ()) {
                let mut this: T::Ptr = transmute(*this_orig);
                let result = <T as Foo>::by_mut_ref(
                    <T as CoerceDyn>::as_mut_ref(&mut this));
                *this_orig = CoerceDyn::into_ptr(this);
                result
            },
            drop: todo!(),
        }
    }
}

// compiler generated
struct <dyn* Foo> {
    data: *const (),
    table: &'static DynStarVTable<Foo>,
}

// compiler-generated
//
// Called by static dispatch essentially always, so LLVM
// should pretty much always inline them.
impl Foo for dyn* Foo {
    fn by_value(self) { todo!() }
    fn by_ref(&self) { todo!() }
    fn by_mut_ref(&mut self) {
        self.table.by_mut_ref(&mut self.data)
    }
}
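The write-back step in by_mut_ref is what lets a &mut self method mutate a value stored inline in the data word. The following stable-Rust sketch emulates just that one vtable entry for a Counter trait, under several assumptions: CoerceDyn is trimmed to the two functions used here and has no PointerSized bound, increment_shim, inline_demo, and boxed_demo are hypothetical names, and the size check happens at run time rather than at type-checking time. The write-back uses a bit copy in place of the into_ptr call in the pseudocode above.

```rust
use std::mem::{self, size_of, ManuallyDrop};

trait Counter {
    fn increment(&mut self);
}
impl Counter for usize {
    fn increment(&mut self) {
        *self += 1;
    }
}
impl Counter for Box<usize> {
    fn increment(&mut self) {
        **self += 1;
    }
}

// Trimmed-down version of the CoerceDyn trait sketched above.
trait CoerceDyn: Sized {
    type Ptr;
    fn into_ptr(this: Self) -> Self::Ptr;
    fn as_mut_ref(ptr: &mut Self::Ptr) -> &mut Self;
}
impl CoerceDyn for usize {
    type Ptr = usize;
    fn into_ptr(this: Self) -> Self::Ptr { this }
    fn as_mut_ref(ptr: &mut Self::Ptr) -> &mut Self { ptr }
}
impl CoerceDyn for Box<usize> {
    type Ptr = Box<usize>;
    fn into_ptr(this: Self) -> Self::Ptr { this }
    fn as_mut_ref(ptr: &mut Self::Ptr) -> &mut Self { ptr }
}

// Stand-in for the compiler-generated `by_mut_ref` vtable entry.
//
// SAFETY contract: `*data` must hold the bits of a valid `T::Ptr`.
unsafe fn increment_shim<T: CoerceDyn + Counter>(data: &mut *const ()) {
    assert_eq!(size_of::<T::Ptr>(), size_of::<*const ()>());
    // Reinterpret the data word as `T::Ptr` without taking ownership.
    let mut ptr: ManuallyDrop<T::Ptr> =
        ManuallyDrop::new(unsafe { mem::transmute_copy(&*data) });
    T::as_mut_ref(&mut *ptr).increment();
    // Write the (possibly changed) bits back into the data word.
    *data = unsafe { mem::transmute_copy(&*ptr) };
}

// Inline storage: the data word is the usize value itself.
fn inline_demo() -> usize {
    let mut data = 5usize as *const ();
    unsafe { increment_shim::<usize>(&mut data) };
    data as usize
}

// Pointer storage: the data word is the Box's heap pointer.
fn boxed_demo() -> usize {
    let mut data = Box::into_raw(Box::new(5usize)) as *const ();
    unsafe { increment_shim::<Box<usize>>(&mut data) };
    // Reclaim ownership so the allocation is freed.
    let boxed: Box<usize> = unsafe { Box::from_raw(data as *mut usize) };
    *boxed
}

fn main() {
    println!("inline: {}", inline_demo());
    println!("boxed: {}", boxed_demo());
}
```

In the inline case the method mutates a temporary copy and the result only becomes visible because of the write-back; in the Box case the write-back stores the same pointer it read. Whether the compiler can reliably elide the redundant copy is part of the open question about extra memory references.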

Open Questions

  • How do we handle mutable self with inline storage?
  • Do we even need these? What do we actually need?
  • Does the trait design influence whether we introduce more memory references in generated code?