# `dyn*` Conversion Traits
We want to have a trait or collection of traits to facilitate converting things into `dyn* Trait` objects. Here we discuss some of the desired properties, possible designs, and attempt to settle on the right way to go.
## Design Meeting 2022-10-14
Objectives
1. Decide what version of the traits we want to implement to start with (not necessarily committing to being the long term design)
2. Identify requirements and nice-to-have properties
Conclusions:
We have a choice. To cast `T` to `dyn* Trait`, we can require one of the following:
1. `T: PointerSized + Trait`
2. `T: PointerSized + Deref*<impl Trait>`
Option 1 allows small value inline storage but forces double indirection if `T` is already a pointer. Option 2 does not force double indirection but gives up on the small value inline storage optimization.
Avoiding double indirection seems more important than the inline storage optimization.
Open Question: Can we get both inline storage and single-indirection using `IntoDyn`?
Currently option 1 is implicitly implemented. Do we want to start implementing option 2 instead, or wait until we resolve `IntoDyn`?
Option 2 is harder because there are a lot of fiddly rules we'll need to add.
Background:
- eholk's post on async-fn-in-traits: https://blog.theincredibleholk.org/blog/2022/04/18/how-async-functions-in-traits-could-work-in-rustc/
Questions/Comments:
- (tmandry) `IntoDyn` requires us to decide on a per-type basis how to do the conversion. Is that true of the others?
- (eholk) You can use adapters like `Boxing`, but that seems like it's actually independent.
- (tmandry) Not sure why they would have to be connected.
- (tmandry) `IntoDyn` allows for a blanket impl that can be specialized for specific types, but it's not clear if this is an important advantage.
- Can we support by-value self? Probably? The vtable should know the size, and `dyn*` is an owned value. `Box<Self>` works currently, and `dyn*` basically functions as a box, so by-value self should work.
- Connection to safe transmute - similar because we have the layout and alignment requirement, but we don't need the safety requirements. Safe transmute is one-way (you can go from a reference to an integer, but not back), but we need to be able to go both directions. The compiler guarantees this is safe because it knows it's casting back to the same type.
- (tmandry) Pinning
- https://rust-lang.github.io/async-fundamentals-initiative/explainer/async_fn_in_dyn_trait/generalizing_from_box_to_dynx.html#the-shim-function-builds-the-dynx
- In that example we add an `Unpin` bound on the pointer-sized type to convert it to a `dyn*` *because Future has a `Pin<&mut Self>` method*
### Double Indirection
```rust
trait Counter {
    fn increment(&mut self);
}

impl Counter for usize {
    fn increment(&mut self) {
        *self += 1
    }
}

impl Counter for Box<usize> {
    fn increment(&mut self) { ... }
}

// compiler-generated
impl Counter for dyn* Counter {
    fn increment(&mut self) {
        self.vtable.increment(&mut self.0)
    }
}

fn main() {
    let mut x = 0usize as dyn* Counter;
    // x is (0, <usize as Counter>::VTABLE)
    x.increment();
    // desugars to:
    // <dyn* Counter>::increment(&mut x);
    // increment gets inlined, but we have to spill x to the
    // stack to take its address

    let mut y = Box::new(0usize) as dyn* Counter;
    y.increment();
    // y is (Box(0), <Box<usize> as Counter>::VTABLE)
    // desugars to:
    // <dyn* Counter>::increment(&mut y);
    //
    // which can be inlined because it's static dispatch to dyn* impl
    {
        //<dyn* Counter>::increment(&mut y);
        //(&mut y).vtable.increment(&mut (&mut y).0)
        y.vtable.increment(&mut y.0)
        // &mut (y.0) is a pointer to a pointer
    }
}
```
The double indirection may not be too bad, because the first one is on the stack, which is probably in L1.
We could avoid the double indirection if we require the first field to always be a pointer and give up on small value inline storage.
(tmandry) Is this what was motivating `IntoRawPointer` and `FromRawPointer`?
(eholk) What if we require `PointerSized + Deref<impl Trait>` to cast into `dyn* Trait`?
- (tmandry) That's enough to get rid of the double indirection
- We probably want `DerefMut`, `DerefOwned`, etc. to support different `Self` types
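
To make the difference between the two shim shapes concrete, here is a minimal sketch on stable Rust. It does not use `dyn*`: the data word is a plain `*const ()`, and the names `increment_word`, `increment_pointee`, and `shims_demo`, as well as the extra `value` method, are made up for illustration. The first shim mirrors option 1 (it gets the address of the data word), the second mirrors option 2 (it gets the data word itself, which must already be a pointer).

```rust
pub trait Counter {
    fn increment(&mut self);
    // Not part of the example above; added so the sketch can read the count back.
    fn value(&self) -> usize;
}

impl Counter for usize {
    fn increment(&mut self) { *self += 1; }
    fn value(&self) -> usize { *self }
}

impl Counter for Box<usize> {
    fn increment(&mut self) { **self += 1; }
    fn value(&self) -> usize { **self }
}

/// Option 1 shape: the shim receives the address of the data word and
/// views the word itself as a `T`.
///
/// Safety: `word` must point to a valid `T` that is pointer-sized.
unsafe fn increment_word<T: Counter>(word: *mut *const ()) {
    let this: &mut T = unsafe { &mut *(word as *mut T) };
    this.increment()
}

/// Option 2 shape: the shim receives the data word by value, and the
/// word already points at the `T`.
///
/// Safety: `word` must point to a valid `T`.
unsafe fn increment_pointee<T: Counter>(word: *const ()) {
    let this: &mut T = unsafe { &mut *(word as *mut T) };
    this.increment()
}

pub fn shims_demo() -> (usize, usize, usize) {
    // Option 1, inline storage: the word *is* the usize.
    let mut inline: *const () = 5usize as *const ();
    unsafe { increment_word::<usize>(&mut inline) };

    // Option 1, Box: the shim sees a pointer (to the word) to a pointer
    // (the Box), so reaching the count takes two loads.
    let mut boxed: *const () = Box::into_raw(Box::new(5usize)) as *const ();
    unsafe { increment_word::<Box<usize>>(&mut boxed) };
    let boxed = unsafe { Box::from_raw(boxed as *mut usize) };

    // Option 2, Box: the word is passed by value and points straight at
    // the count, so there is a single indirection.
    let direct: *const () = Box::into_raw(Box::new(5usize)) as *const ();
    unsafe { increment_pointee::<usize>(direct) };
    let direct = unsafe { Box::from_raw(direct as *mut usize) };

    (inline as usize, *boxed, *direct)
}
```

Note that the option 2 shim has no way to handle the inline `usize` case, which is the trade-off described above.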
---
## Desiderata
Below are some features we might want.
- Allow for customizable conversion into a `dyn*` type. For example, many types will want to do this by automatically boxing self, but we want to support other strategies like using an arena allocator or inline storage where possible.
- Sensible defaults
- Things that are already safe pointers (e.g. `Box`, `Rc`, `Arc`, `&`) should automatically implement the correct traits.
- Do we want default impls for things like `usize` and `isize`? It might be better for users to write a newtype wrapper and specify the behavior they want.
- Traits can be implemented in Safe Rust
- Supports methods with different self types. We at least should support the ones that are already dyn-safe, but if we can support more that would be great.
- by-value self types currently don't work with `dyn Trait`. [[playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=35b42a89584ba378300780674889a628)]
- `Arc<Self>` does work: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=b4a623a6e7edbc835f7e9e3889c5eba3
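
For reference, the receiver types that already work through `dyn Trait` can be checked on stable Rust. The sketch below uses made-up names (`Shape`, `Square`, `receivers_demo`) and only shows that `self: Box<Self>` and `self: Arc<Self>` methods are callable on trait objects; it says nothing about how `dyn*` would handle them.

```rust
use std::sync::Arc;

pub trait Shape {
    // Consumes the box; callable on `Box<dyn Shape>`.
    fn name_boxed(self: Box<Self>) -> String;
    // Consumes one `Arc` handle; callable on `Arc<dyn Shape>`.
    fn name_shared(self: Arc<Self>) -> String;
}

pub struct Square;

impl Shape for Square {
    fn name_boxed(self: Box<Self>) -> String {
        String::from("square (boxed)")
    }

    fn name_shared(self: Arc<Self>) -> String {
        String::from("square (shared)")
    }
}

pub fn receivers_demo() -> (String, String) {
    let boxed: Box<dyn Shape> = Box::new(Square);
    let shared: Arc<dyn Shape> = Arc::new(Square);
    // Both calls dispatch through the vtable of the trait object.
    (boxed.name_boxed(), shared.name_shared())
}
```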
## Background: Converting into `dyn*`
Let's consider the following program:
```rust=
trait Counter {
    fn get_value(&self) -> usize;
    fn increment(&mut self);
}

impl Counter for &mut usize {
    fn get_value(&self) -> usize {
        // `self` is `&&mut usize`, so it takes two derefs to reach the `usize`
        **self
    }

    fn increment(&mut self) {
        **self += 1;
    }
}

fn main() {
    let mut counter = &mut 0usize as dyn* Counter;
    counter.increment();
    println!("The value is {}", counter.get_value());
}
```
To start with, let's assume the compiler does everything by transmuting. Then the compiler might elaborate the program like this:
```rust=
// dyn* objects are basically a struct of { data, vtable }

// impl automatically generated by compiler
impl Counter for dyn* Counter {
    fn get_value(&self) -> usize {
        self.vtable.get_value(self.data)
    }

    fn increment(&mut self) {
        self.vtable.increment(self.data)
    }
}

fn main() {
    let mut counter: dyn* Counter =
        dyn_star_from_raw_parts(
            transmute::<_, *const ()>(&mut 0usize),
            VTable {
                get_value: |this: *const ()| {
                    let this = transmute::<_, &mut usize>(this);
                    this.get_value()
                },
                increment: |this: *const ()| {
                    let mut this = transmute::<_, &mut usize>(this);
                    this.increment()
                }
            });
    counter.increment();
    println!("The value is {}", counter.get_value());
}
```
This roughly corresponds to the code that rustc currently generates for `dyn*`.
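
The same elaboration can be written by hand on stable Rust, which may help when experimenting. This is only a sketch: `DynStarCounter`, `VTable`, the shim functions, and `transmute_demo` are hypothetical stand-ins for what the compiler would generate, and the lifetime parameter is there to keep the borrowed `usize` alive, something the real `dyn*` type would track itself.

```rust
use std::marker::PhantomData;
use std::mem::transmute;

pub trait Counter {
    fn get_value(&self) -> usize;
    fn increment(&mut self);
}

impl Counter for &mut usize {
    fn get_value(&self) -> usize { **self }
    fn increment(&mut self) { **self += 1; }
}

// Hand-written stand-in for the compiler-generated vtable.
struct VTable {
    get_value: unsafe fn(*const ()) -> usize,
    increment: unsafe fn(*const ()),
}

// Safety (both shims): `this` must be a data word produced from a live
// `&mut usize` by `DynStarCounter::new`.
unsafe fn get_value_shim(this: *const ()) -> usize {
    let this: &mut usize = unsafe { transmute::<*const (), &mut usize>(this) };
    <&mut usize as Counter>::get_value(&this)
}

unsafe fn increment_shim(this: *const ()) {
    let mut this: &mut usize = unsafe { transmute::<*const (), &mut usize>(this) };
    <&mut usize as Counter>::increment(&mut this)
}

static REF_MUT_USIZE_VTABLE: VTable = VTable {
    get_value: get_value_shim,
    increment: increment_shim,
};

// Hypothetical emulation of `dyn* Counter`: one data word plus a vtable.
pub struct DynStarCounter<'a> {
    data: *const (),
    vtable: &'static VTable,
    _borrow: PhantomData<&'a mut usize>,
}

impl<'a> DynStarCounter<'a> {
    pub fn new(value: &'a mut usize) -> Self {
        DynStarCounter {
            // The reference itself becomes the data word.
            data: unsafe { transmute::<&mut usize, *const ()>(value) },
            vtable: &REF_MUT_USIZE_VTABLE,
            _borrow: PhantomData,
        }
    }
}

impl Counter for DynStarCounter<'_> {
    fn get_value(&self) -> usize {
        unsafe { (self.vtable.get_value)(self.data) }
    }

    fn increment(&mut self) {
        unsafe { (self.vtable.increment)(self.data) }
    }
}

pub fn transmute_demo() -> usize {
    let mut value = 0usize;
    let mut counter = DynStarCounter::new(&mut value);
    counter.increment();
    counter.get_value()
}
```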
## Option 1: `PointerSized`
This trait would basically signify that the `transmute` into a `*const ()` in the example above would work.
Currently the compiler has no verification that it will actually be able to convert a value to a `dyn* Trait` other than making sure the value implements `Trait`. Having a pointer-sized trait would allow us to guarantee at type checking time that the conversion will succeed.
`PointerSized` would be a compiler-implemented trait like `Sized`.
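
Until such a trait exists, the property it would express can be approximated with a layout check. The function below is a hypothetical helper, not a proposed API: it assumes "pointer-sized" means the same size and alignment as `*const ()`, and unlike a real trait it cannot be used as a bound.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical approximation of `T: PointerSized`.
pub const fn is_pointer_sized<T>() -> bool {
    size_of::<T>() == size_of::<*const ()>()
        && align_of::<T>() == align_of::<*const ()>()
}

// Because the check is a `const fn`, it can be enforced at compile time.
const _: () = assert!(is_pointer_sized::<usize>());
```

Thin pointers such as `&u8` and `Box<u64>` pass this check, while `u8`, `[usize; 2]`, and wide pointers such as `&str` do not.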
## Option 1a: `SameLayout<T>`
This is more general and preserves the option for large `dyn*` objects in the future. If we had `SameLayout<T>`, we could implement `PointerSized` as:
```rust=
trait PointerSized: SameLayout<*const ()> {}
impl<T> PointerSized for T where T: SameLayout<*const ()> {}
```
The main advantage of this version is that it would give us a way to parameterize `dyn*` by a layout and allow more storage space for inline storage.
For example, currently you can do `123usize as dyn* Debug` and `123` would be stored in the data field of the `dyn*` object, rather than a pointer to it. If we wanted to allow larger types to work, we could imagine something like `foo as dyn*<[usize; 4]> Debug`, where the data field now uses `[usize; 4]` as its layout and can store up to four `usize` values, and we'd require `foo: Debug + SameLayout<[usize; 4]>` to do the conversion.
The ergonomics of using this feature would probably not be great, so it's probably not worth complicating the design too much for what is likely to be a niche use case. Using it in the large would probably be similar to using [StackFuture](https://github.com/microsoft/stackfuture).
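
As a rough illustration of what `dyn*<[usize; 4]> Debug` might look like under the hood, here is a stable-Rust sketch. `InlineDebug`, `Storage`, and `fmt_shim` are invented names. Two simplifications are assumptions of this sketch rather than part of the proposal: it accepts any value that fits in the storage (not only values with exactly the same layout), and it requires `Copy` so that no drop entry is needed in the vtable.

```rust
use std::fmt::{self, Debug};
use std::marker::PhantomData;
use std::mem::{align_of, size_of, MaybeUninit};

// Layout used for the inline data field.
type Storage = [usize; 4];

/// Hypothetical stand-in for `dyn*<[usize; 4]> Debug`.
pub struct InlineDebug {
    data: MaybeUninit<Storage>,
    // One-entry "vtable": how to format the erased value.
    fmt_fn: unsafe fn(*const Storage, &mut fmt::Formatter<'_>) -> fmt::Result,
    // The erased type may not be thread-safe, so opt out of Send and Sync.
    _not_send_sync: PhantomData<*const ()>,
}

// Safety: `data` must hold a valid `T` written by `InlineDebug::new::<T>`.
unsafe fn fmt_shim<T: Debug>(
    data: *const Storage,
    f: &mut fmt::Formatter<'_>,
) -> fmt::Result {
    let value: &T = unsafe { &*(data as *const T) };
    Debug::fmt(value, f)
}

impl InlineDebug {
    /// Returns `None` when `T` does not fit in the inline storage.
    pub fn new<T: Debug + Copy + 'static>(value: T) -> Option<Self> {
        if size_of::<T>() > size_of::<Storage>()
            || align_of::<T>() > align_of::<Storage>()
        {
            return None;
        }
        let mut data = MaybeUninit::<Storage>::uninit();
        // The size and alignment checks above make this write valid.
        unsafe { (data.as_mut_ptr() as *mut T).write(value) };
        Some(InlineDebug {
            data,
            fmt_fn: fmt_shim::<T>,
            _not_send_sync: PhantomData,
        })
    }
}

impl Debug for InlineDebug {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        unsafe { (self.fmt_fn)(self.data.as_ptr(), f) }
    }
}
```

With a real `SameLayout` bound, the runtime `None` case would instead be a type error at the cast site.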
## Option 2: Flexible Option - `IntoDyn` / `CoerceSized`
In this option, we introduce a trait that allows types to control some of the conversion mechanism. The trait would probably look something like this:
```rust
trait IntoDyn {
    type Ptr: PointerSized; // PointerSized bound is optional if we want to
                            // support larger dyn* objects

    fn into_dyn(this: Self) -> Self::Ptr;
    fn as_ref_self(ptr: &Self::Ptr) -> &Self;
    fn as_mut_self(ptr: &mut Self::Ptr) -> &mut Self;
    // ...additional conversions for other kinds of self arguments
    //
    // These would probably all be split into separate traits
    // so types would not have to support all self types.
}
```
Going back to our example above, the compiler would use these traits when filling in the `dyn*` vtable:
```rust
fn main() {
    let mut counter: dyn* Counter =
        dyn_star_from_raw_parts(
            transmute::<_, *const ()>(
                <&mut usize as IntoDyn>::into_dyn(&mut 0usize)),
            VTable {
                get_value: |this: *const ()| {
                    let this
                        = transmute::<_, <&mut usize as IntoDyn>::Ptr>(this);
                    let this = <&mut usize as IntoDyn>::as_ref_self(&this);
                    this.get_value()
                },
                increment: |this: *const ()| {
                    let mut this
                        = transmute::<_, <&mut usize as IntoDyn>::Ptr>(this);
                    let this = <&mut usize as IntoDyn>::as_mut_self(&mut this);
                    this.increment()
                }
            });
    counter.increment();
    println!("The value is {}", counter.get_value());
}
```
Types would then be responsible for defining how to do this coercion. For things that are already pointer sized, this would likely just be a transmute. In fact, we could provide a blanket implementation:
```rust
impl<T> IntoDyn for T
where T: PointerSized {
    type Ptr = T;

    fn into_dyn(this: Self) -> Self::Ptr {
        this
    }
    // ...
}
```
We also have the option of auto-boxing impls:
```rust
impl IntoDyn for BigStruct {
    type Ptr = Box<BigStruct>;

    fn into_dyn(this: Self) -> Self::Ptr {
        Box::new(this)
    }

    fn as_ref_self(this: &Self::Ptr) -> &Self {
        this.as_ref()
    }
    // ...
}
```
One key point about this design is that the `Ptr` associated type means the compiler can do all of the transmuting in code it generates, and impls of `IntoDyn` (or whatever we call it) can be implemented completely in safe code.
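
The claim that impls need no `unsafe` can be checked with a cut-down version of the trait that compiles on stable Rust. This is a sketch under assumptions: the `PointerSized` bound is dropped because that trait does not exist yet, `BigStruct` is given a made-up field, and `into_dyn_demo` is a hypothetical driver standing in for compiler-generated code.

```rust
pub trait IntoDyn: Sized {
    // The `PointerSized` bound is omitted here; it is not available on stable.
    type Ptr;

    fn into_dyn(this: Self) -> Self::Ptr;
    fn as_ref_self(ptr: &Self::Ptr) -> &Self;
    fn as_mut_self(ptr: &mut Self::Ptr) -> &mut Self;
}

// Inline storage: the value is its own pointer-sized representation.
impl IntoDyn for usize {
    type Ptr = usize;

    fn into_dyn(this: Self) -> Self::Ptr { this }
    fn as_ref_self(ptr: &Self::Ptr) -> &Self { ptr }
    fn as_mut_self(ptr: &mut Self::Ptr) -> &mut Self { ptr }
}

// Too large for one word, so it is boxed on conversion.
pub struct BigStruct {
    pub values: [u64; 8],
}

impl IntoDyn for BigStruct {
    type Ptr = Box<BigStruct>;

    fn into_dyn(this: Self) -> Self::Ptr { Box::new(this) }
    fn as_ref_self(ptr: &Self::Ptr) -> &Self { &**ptr }
    fn as_mut_self(ptr: &mut Self::Ptr) -> &mut Self { &mut **ptr }
}

pub fn into_dyn_demo() -> (usize, u64) {
    let mut small = <usize as IntoDyn>::into_dyn(41);
    *<usize as IntoDyn>::as_mut_self(&mut small) += 1;

    let mut big = <BigStruct as IntoDyn>::into_dyn(BigStruct { values: [0; 8] });
    <BigStruct as IntoDyn>::as_mut_self(&mut big).values[0] = 7;

    (
        *<usize as IntoDyn>::as_ref_self(&small),
        <BigStruct as IntoDyn>::as_ref_self(&big).values[0],
    )
}
```

Both impls are entirely safe code; any transmuting between `Ptr` and the raw data word would live in the generated shims.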
### Is this extra functionality useful?
While these traits are significantly more flexible, it's unclear whether that buys us much. For example, the auto-boxing example could be done similarly to this:
```rust
trait Foo {
    async fn foo(&self);
}

impl Foo for BigStruct {
    #[refine]
    fn foo(&self) -> dyn* Future<Output = ()> {
        Box::new(async {
            // ...
        })
    }
}
```
It's more verbose, but this could be done automatically with macros or trait transformers or some other feature.
Because there can only be one impl per type, the `IntoDyn` trait forces us to decide on a per-type basis how storage for the resulting futures is handled. In practice, we may want to make this decision at the call site or somewhere else instead. That could be done with something like `Boxing::new(my_big_struct)`, which does not require the more complex `IntoDyn` trait.
## Have your cake and eat it too
Can we design a set of traits that let us have inline storage and also single indirection?
See also tmandry's version at https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=4a3d7f992f2f8d1710e2926ea9914021
```rust=
trait Foo {
    fn by_value(self);
    fn by_ref(&self);
    fn by_mut_ref(&mut self);
}

impl Foo for usize {
    ...
}

trait CoerceDyn {
    // PointerSized is a compiler-implemented trait that
    // indicates that a type can be transmuted into *const ()
    type Ptr: PointerSized;

    fn into_ptr(this: Self) -> Self::Ptr;
    fn as_value(ptr: &Self::Ptr) -> Self;
    fn as_ref(ptr: &Self::Ptr) -> &Self;
    fn as_mut_ref(ptr: &mut Self::Ptr) -> &mut Self;
}

// inline storage version
impl CoerceDyn for usize {
    type Ptr = usize;

    fn into_ptr(x: Self) -> Self::Ptr {
        x
    }

    fn as_value(x: &Self::Ptr) -> Self {
        *x
    }

    fn as_ref(x: &Self::Ptr) -> &Self {
        x
    }

    fn as_mut_ref(x: &mut Self::Ptr) -> &mut Self {
        x
    }
}

// Pointer version, using Box as an example
impl<T> CoerceDyn for Box<T>
where
    T: Sized, // so Box<T> is a thin pointer
{
    type Ptr = Box<T>;

    fn into_ptr(x: Self) -> Self::Ptr {
        x
    }

    // FIXME: as_value needs to take a pointer that
    // we can deinitialize
    fn as_value(x: &Self::Ptr) -> Self {
        *x
    }

    fn as_ref(x: &Self::Ptr) -> &Self {
        x // automatically goes through Deref
    }

    fn as_mut_ref(x: &mut Self::Ptr) -> &mut Self {
        x
    }
}

// compiler-generated
struct DynStarVTable<Foo> {
    by_value: fn(*const ()),
    by_ref: fn(*const ()),
    by_mut_ref: fn(&mut *const ()),
    drop: fn(*const ()),
}

// compiler-generated
fn cast_dyn_star<T>(x: T) -> dyn* Foo
where
    T: CoerceDyn + Foo,
{
    <dyn* Foo> {
        data: transmute(CoerceDyn::into_ptr(x)),
        table: &DynStarVTable<Foo> {
            by_value: fn(this: *const ()) {
                let this: T::Ptr = transmute(this);
                let this = <T as CoerceDyn>::as_value(&this);
                <T as Foo>::by_value(this)
            },
            by_ref: fn(this: *const ()) {
                let this: T::Ptr = transmute(this);
                let this = <T as CoerceDyn>::as_ref(&this);
                <T as Foo>::by_ref(this)
            },
            by_mut_ref: fn(this_orig: &mut *const ()) {
                let mut this: T::Ptr = transmute(*this_orig);
                let result = <T as Foo>::by_mut_ref(
                    <T as CoerceDyn>::as_mut_ref(&mut this));
                // write the (possibly updated) data word back
                *this_orig = transmute(this);
                result
            },
            drop: todo!(),
        }
    }
}

// compiler-generated
struct <dyn* Foo> {
    data: *const (),
    table: &'static DynStarVTable<Foo>
}

// compiler-generated
//
// Called by static dispatch essentially always, so LLVM
// should pretty much always inline them.
impl Foo for dyn* Foo {
    fn by_value(self) {
        todo!()
    }

    fn by_ref(&self) {
        todo!()
    }

    fn by_mut_ref(&mut self) {
        self.table.by_mut_ref(&mut self.data)
    }
}
```
## Open Questions
- How do we handle mutable self with inline storage?
- Do we even need these? What do we actually need?
- Does the trait design influence whether we introduce more memory references in generated code?
## Related Work
- [DST Coercions RFC](https://github.com/rust-lang/rfcs/blob/master/text/0982-dst-coercion.md)
- [`Pointee` trait](https://doc.rust-lang.org/std/ptr/trait.Pointee.html)