# Opinions on allocator dyn-compatibility

This is an opinion doc hoping to gather perspectives on how to handle dyn-compatibility for `Allocator`. After you've taken a look through the background section, please add your stance below alongside your reasoning. If possible, please point us to your usecase(s) and why they might/might not be served by a given option.

## Background

Several members of the libs{-api} team have expressed wanting support for dyn-compatible allocators, at least so long as they do not significantly cut into performance, ergonomics, ecosystem usage, or our ability to express an allocator's semantics. The two main proposals being looked at for enabling dyn-compatibility then are:

### (1) Split trait

`trait Allocator` and `trait DynAllocator` would be separate traits. `Allocator` would be free to have associated types or constants, so long as a notion of "most conservative default" exists. That is, something like `const MIN_ALIGN: usize` for the minimum alignment an allocator *always* returns would be okay (setting this value to 1 would never be wrong). The methods on the two traits would be identical, except that `DynAllocator` would substitute concrete types for any associated types.

#### Example

```rust
trait Allocator {
    const PANICS: bool;
    type Err: Into<AllocationError>;

    fn allocate(&self, ...) -> Result<NonNull<[u8]>, Self::Err>;
    /* ... */
}

trait DynAllocator {
    fn dyn_allocate(&self, ...) -> Result<NonNull<[u8]>, AllocationError>;
}

impl<A: DynAllocator> Allocator for A {
    const PANICS: bool = true;
    type Err = AllocationError;

    #[inline(always)]
    fn allocate(&self, ...) -> ... {
        self.dyn_allocate(...)
    }
    /* ... */
}
```

#### Noted advantages

- Does not rely on any typesystem or language changes
- Allows us to have associated types, such as for the error

#### Noted disadvantages

- Quite clunky, as it requires us to duplicate the whole `Allocator` trait
- The proposal as written forces newtyping dyn-compatible allocators, which may mean duplicating the implementations of many *more* traits if a library author wants to expose both dyn-compatible and dyn-incompatible allocators that implement other traits too (see below)

#### Unresolved questions

Should the provided impl be the one above (`Allocator for A: DynAllocator`), or:

```rust
impl Allocator for dyn DynAllocator {...}
impl Allocator for dyn DynAllocator + Sync {...}
impl Allocator for dyn DynAllocator + Send {...}
impl Allocator for dyn DynAllocator + Send + Sync {...}
```

(in which case we'd likely add a blanket impl the other way, `DynAllocator for A: Allocator`)? Doing it like this would mean we get around the newtyping issue, but any added autotrait would combinatorially increase the number of impls we have. Alternatively, we can see if there is some language mechanism we can ask for in order to simplify this.

### (2) Keep `dyn Allocator`

We guarantee that `Allocator` never gets dyn-incompatible items; the trait as-is would be dyn-compatible. The semantics of associated items could in the future be expressed with maybe bounds, a lang proposal which would allow types generic over an `A: Allocator + maybe OtherTrait` to have syntax for fallibly attempting to downcast to `A: Allocator + OtherTrait`. This would allow the relevant constants to be included in `OtherTrait`.

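As a rough illustration of what option (2) preserves, here is a minimal sketch that compiles on stable Rust. It assumes a simplified stand-in `Allocator` trait (two methods, thin `NonNull<u8>` pointers) plus a hypothetical `GlobalHeap` type and `roundtrip` function; none of these are the real `core::alloc` API. It only shows that, as long as the trait has no associated items, non-generic code can take `&dyn Allocator` directly.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::ptr::NonNull;

// Hypothetical stand-ins for the unstable `core::alloc` items;
// the real trait has more methods and different signatures.
#[derive(Debug)]
pub struct AllocError;

pub unsafe trait Allocator {
    fn allocate(&self, layout: Layout) -> Result<NonNull<u8>, AllocError>;

    /// Safety: `ptr` must come from `allocate` on this allocator
    /// with the same `layout`, and must not be used afterwards.
    unsafe fn deallocate(&self, ptr: NonNull<u8>, layout: Layout);
}

/// Illustrative allocator that forwards to the global allocator.
pub struct GlobalHeap;

unsafe impl Allocator for GlobalHeap {
    fn allocate(&self, layout: Layout) -> Result<NonNull<u8>, AllocError> {
        if layout.size() == 0 {
            // Keep the sketch simple: reject zero-sized requests.
            return Err(AllocError);
        }
        // SAFETY: `layout` has a non-zero size, as `alloc` requires.
        NonNull::new(unsafe { alloc(layout) }).ok_or(AllocError)
    }

    unsafe fn deallocate(&self, ptr: NonNull<u8>, layout: Layout) {
        // SAFETY: forwarded from this method's own contract.
        unsafe { dealloc(ptr.as_ptr(), layout) }
    }
}

/// Not generic: one compiled copy serves every allocator via the vtable.
pub fn roundtrip(a: &dyn Allocator, value: u64) -> Result<u64, AllocError> {
    let layout = Layout::new::<u64>();
    let ptr = a.allocate(layout)?.cast::<u64>();
    // SAFETY: `ptr` is valid and aligned for one `u64`, and it is
    // released with the same layout it was allocated with.
    unsafe {
        ptr.as_ptr().write(value);
        let out = ptr.as_ptr().read();
        a.deallocate(ptr.cast(), layout);
        Ok(out)
    }
}

fn main() {
    assert_eq!(roundtrip(&GlobalHeap, 7).unwrap(), 7);
}
```

Adding an associated constant or type to this stand-in trait would make `&dyn Allocator` stop compiling, which is the guarantee option (2) would have to uphold.
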
#### Noted advantages

- Much cleaner ergonomically
- Possibly less confusing to implementors, too

#### Noted disadvantages

- Reliant on maybe bounds being stabilised soon-ish for the info in associated items to be usable
- Cannot nicely express associated types in method signatures, notably the case of `Err = !`

## Other designs

### (3) `dyn`-compatible `Alloc`, container-optimized `Allocator`

#### Background

There are two distinct use cases for custom allocators:

- Low-level code such as `no_std` libraries may want to perform dynamic memory allocation (e.g. for scratch buffers), but don't want a hard dependency on the `alloc` crate's global allocator. In C/C++ it is common to define a small vtable of function pointers that approximate the `malloc` / `free` APIs; the Rust equivalent is a `dyn`-compatible trait.
- Higher-level code wants to either use `std` containers (e.g. `Box` or `Vec`) or define its own container types with similar ergonomics. Trait impls such as `Clone for Box<T, A: Allocator>` place additional requirements on the relationship between the container type and the allocator. Additionally, depending on the container, there may be performance-critical optimizations that require monomorphization or non-`dyn`-compatible trait properties.

Using a single `Allocator` trait requires either a `dyn`-compatible trait (foreclosing monomorphization-based optimizations) or a non-`dyn`-compatible trait (leaving the `no_std` use case completely unsatisfied).

#### Design

This proposal puts the `dyn`-compatible portions into an `Alloc` trait and the container-oriented portions into `Allocator`, then explores how those two traits are related.

##### `dyn`-compatible `trait Alloc`

The `Alloc` trait is an abstraction of the existing free functions in `alloc::alloc`. It defines the interface for allocating and deallocating blocks of memory.

```rust
pub struct AllocError;

pub unsafe trait Alloc {
    fn allocate(&self, layout: Layout) -> Result<NonNull<u8>, AllocError>;

    unsafe fn deallocate(&self, ptr: NonNull<u8>, layout: Layout);

    fn allocate_zeroed(
        &self,
        layout: Layout,
    ) -> Result<NonNull<u8>, AllocError> { ... }

    unsafe fn grow(
        &self,
        ptr: NonNull<u8>,
        old_layout: Layout,
        new_layout: Layout,
    ) -> Result<NonNull<u8>, AllocError> { ... }

    unsafe fn grow_zeroed(
        &self,
        ptr: NonNull<u8>,
        old_layout: Layout,
        new_layout: Layout,
    ) -> Result<NonNull<u8>, AllocError> { ... }

    unsafe fn shrink(
        &self,
        ptr: NonNull<u8>,
        old_layout: Layout,
        new_layout: Layout,
    ) -> Result<NonNull<u8>, AllocError> { ... }
}
```

##### (3a) `trait Allocator` (as supertrait)

This is the most direct conversion of the current `Allocator` trait into a split-trait design. An `Allocator` can be used to allocate memory (via its supertrait), may have associated types / constants / functions, and places additional restrictions on its implementations that allow it to be used as a field in cloneable containers such as `Box<T, A: Allocator>`.

```rust
#[rustc_dyn_incompatible_trait]
pub unsafe trait Allocator: Alloc { }
```

From the perspective of a user who wants a high-level container-based API to memory allocation, this version would have the same ergonomics as a single non-`dyn` `Allocator` trait.

##### (3b) `trait Allocator` (as a handle)

This design for `Allocator` is based on the observation that the properties desired for `A` in `Box<T, A: Allocator>` are actually those of a *handle to an allocator*. Cloning a `Box<T, A>` requires cloning a value of type `A`, and the expectation is that this doesn't actually clone the entire state of the allocator. So `A` is semantically a sort of reference with special identity guarantees.

Thus we can define `Allocator` as an `unsafe` fusion of `Deref + Clone`:

```rust
// Safety: all clones of an allocator must return behaviorally equivalent
// `Alloc` instances.
pub unsafe trait Allocator {
    type Alloc: Alloc + ?Sized;

    // names are illustrative
    fn allocator_instance(&self) -> &Self::Alloc;
    fn allocator_clone(&self) -> Self;
}
```

The relationship between `Allocator` and `Alloc` matches that of references (or smart pointers) and their pointee, so the following blanket impls are both sound and trivial:

```rust
unsafe impl<'a> Allocator for &'a dyn Alloc { ... }
unsafe impl Allocator for Rc<dyn Alloc> { ... }
unsafe impl Allocator for Arc<dyn Alloc> { ... }

/*
This section previously included the following impls, but they would
conflict with `<A: Allocator> Allocator for &A`.

unsafe impl<'a, A: Alloc + ?Sized> Allocator for &'a A { ... }
unsafe impl<A: Alloc + ?Sized> Allocator for Rc<A> { ... }
unsafe impl<A: Alloc + ?Sized> Allocator for Arc<A> { ... }
*/
```

Implementations for ZSTs (such as `Global`) or custom wrappers around smart pointers are straightforward:

```rust
pub struct Global;
// `Alloc` and `Allocator` are `unsafe` traits, so their impls are `unsafe impl`.
unsafe impl Alloc for Global { ... }
unsafe impl Allocator for Global {
    type Alloc = Self;
    fn allocator_instance(&self) -> &Self::Alloc { self }
    fn allocator_clone(&self) -> Self { Global }
}

pub struct MyAlloc(Arc<InnerState>);
unsafe impl Alloc for MyAlloc { ... }
unsafe impl Allocator for MyAlloc {
    type Alloc = Self;
    fn allocator_instance(&self) -> &Self::Alloc { self }
    fn allocator_clone(&self) -> Self { MyAlloc(self.0.clone()) }
}
```

##### (3c) Like (3b) but split up `Allocator`

Not every use case for an `Allocator` requires cloning, so the `allocator_clone()` function could be split into a separate trait.

```rust
// Safety: must return a behaviorally equivalent instance each time
pub unsafe trait Allocator {
    type Alloc: Alloc + ?Sized;
    fn allocator_instance(&self) -> &Self::Alloc;
}

// Safety: all clones of an allocator must return behaviorally
// equivalent `Alloc` instances.
pub unsafe trait AllocatorClone: Allocator {
    fn allocator_clone(&self) -> Self;
}
```

Signatures on containers such as `Box` could then be more specific:

```rust
impl<T, A: Allocator> Box<T, A> {
    pub fn new_in(x: T, alloc: A) -> Box<T, A> { ... }
}

impl<T, A: AllocatorClone> Clone for Box<T, A> { ... }
```

There are some high-quality bikeshedding opportunities in the names of such traits. For example, this is semantically equivalent but terser:

```rust
pub unsafe trait AllocRef {
    type Alloc: Alloc + ?Sized;
    fn alloc_ref(&self) -> &Self::Alloc;
}

pub unsafe trait AllocClone: AllocRef {
    fn alloc_clone(&self) -> Self;
}
```

#### Advantages

- The `trait Alloc` portion is well-understood and has no outstanding blockers to stabilization. Stabilizing it by itself unlocks immediate value for `no_std` library authors.
- Both designs for `Allocator` can be implemented in today's Rust, with future type system changes (e.g. associated type defaults) being purely additive.
- Clearly separating the memory allocation portion from the monomorphized container optimization portion will reduce the scope of the latter, with third-party crates (e.g. `allocator-api2`) being able to explore the design space while depending on shared stable `Alloc`.
- New features in `Alloc` (allocator size class queries) or `Allocator` (variants of `allocate` that take flags or use associated error types) can be added incrementally.

#### Disadvantages

- The API surface is more complex -- authors of crates that provide custom containers would need to understand the separation between `Alloc` and `Allocator`.
- If `trait Alloc` is stabilized first and `trait Allocator` stabilized later, then there would be a period of time where `core::alloc::Alloc` is stable but no stable users of `Alloc` exist in the standard library (since extant containers are all `A: Allocator`).
- The API of stabilized `Alloc` functions would be frozen, compared to `Allocator` which has more ways to evolve its function signatures in a backwards-compatible way (e.g. via defaulted associated types).
- The "`Allocator` as handle" design is a significant change from the current unstable `allocator_api` API, so crates such as `hashbrown` (which consume `core::alloc::Allocator` directly) and `allocator-api2` (which imitate it) would experience churn.

### (4) blanket impl `DynAllocator`

`trait Allocator` is the primary trait for allocation and is not dyn-compatible. There is a second trait, `DynAllocator`, to facilitate dynamic dispatch. `DynAllocator` is not implemented manually, but via a blanket impl for all `Allocator` implementors. It is a supertrait of `Allocator` so that `Allocator` objects can be coerced to `dyn DynAllocator`.

#### Implementation Sketch

```rust
trait Allocator: DynAllocator {
    const MAY_PANIC: bool;
    type Error;

    fn allocate(&self, ...) -> Result<NonNull<[u8]>, Self::Error>;
    ...
}

trait DynAllocator {
    // note the entirely different return type
    fn allocate(&self, ...) -> Result<NonNull<u8>, AllocError>;
}

impl<A: Allocator> DynAllocator for A {
    fn allocate(&self, ...) -> Result<NonNull<u8>, AllocError> { ... }
}

// for $PtrLike in Box, Arc, Rc, & ...
impl<A: DynAllocator> Allocator for $PtrLike<A> { ... }
```

Additionally, all sorts of smart pointers will need to implement `Allocator`, which is no different from the other options.

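To check that the supertrait-plus-blanket-impl shape can be expressed today, the sketch can be filled in as follows on stable Rust. Several details here are assumptions made only for illustration: the `DynAllocator` methods are renamed `dyn_allocate` / `dyn_deallocate` so that calls are unambiguous, a `deallocate` method is added so the example does not leak, only `Box` is covered as a pointer type (with a `?Sized` bound so that `Box<dyn DynAllocator>` qualifies), and `Heap` and `roundtrip` are hypothetical names.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::ptr::NonNull;

#[derive(Debug)]
pub struct AllocError; // stand-in for `core::alloc::AllocError`

// Dyn-compatible trait; never implemented by hand.
pub trait DynAllocator {
    fn dyn_allocate(&self, layout: Layout) -> Result<NonNull<u8>, AllocError>;
    unsafe fn dyn_deallocate(&self, ptr: NonNull<u8>, layout: Layout);
}

// Primary trait; the associated const and type make it dyn-incompatible.
pub trait Allocator: DynAllocator {
    const MAY_PANIC: bool;
    type Error;
    fn allocate(&self, layout: Layout) -> Result<NonNull<[u8]>, Self::Error>;
    unsafe fn deallocate(&self, ptr: NonNull<u8>, layout: Layout);
}

// Blanket impl: every (sized) `Allocator` is usable behind `dyn DynAllocator`.
impl<A: Allocator> DynAllocator for A {
    fn dyn_allocate(&self, layout: Layout) -> Result<NonNull<u8>, AllocError> {
        match A::allocate(self, layout) {
            Ok(block) => Ok(block.cast::<u8>()),
            Err(_) => Err(AllocError), // the custom error is erased
        }
    }
    unsafe fn dyn_deallocate(&self, ptr: NonNull<u8>, layout: Layout) {
        unsafe { A::deallocate(self, ptr, layout) }
    }
}

// One pointer-like impl; goes through `DynAllocator` even for concrete `A`.
impl<A: DynAllocator + ?Sized> Allocator for Box<A> {
    const MAY_PANIC: bool = true; // conservative: the pointee is opaque here
    type Error = AllocError;
    fn allocate(&self, layout: Layout) -> Result<NonNull<[u8]>, AllocError> {
        let ptr = A::dyn_allocate(&**self, layout)?;
        Ok(NonNull::slice_from_raw_parts(ptr, layout.size()))
    }
    unsafe fn deallocate(&self, ptr: NonNull<u8>, layout: Layout) {
        unsafe { A::dyn_deallocate(&**self, ptr, layout) }
    }
}

/// Illustrative allocator forwarding to the global allocator.
pub struct Heap;

impl Allocator for Heap {
    const MAY_PANIC: bool = false;
    type Error = AllocError;
    fn allocate(&self, layout: Layout) -> Result<NonNull<[u8]>, AllocError> {
        if layout.size() == 0 {
            return Err(AllocError); // keep the sketch simple
        }
        // SAFETY: `layout` has a non-zero size.
        let ptr = NonNull::new(unsafe { alloc(layout) }).ok_or(AllocError)?;
        Ok(NonNull::slice_from_raw_parts(ptr, layout.size()))
    }
    unsafe fn deallocate(&self, ptr: NonNull<u8>, layout: Layout) {
        unsafe { dealloc(ptr.as_ptr(), layout) }
    }
}

/// Non-generic consumer using dynamic dispatch only.
pub fn roundtrip(a: &dyn DynAllocator, byte: u8) -> Result<u8, AllocError> {
    let layout = Layout::new::<u8>();
    let ptr = a.dyn_allocate(layout)?;
    // SAFETY: `ptr` is valid for one byte and freed with the same layout.
    unsafe {
        ptr.as_ptr().write(byte);
        let out = ptr.as_ptr().read();
        a.dyn_deallocate(ptr, layout);
        Ok(out)
    }
}

fn main() {
    let erased: Box<dyn DynAllocator> = Box::new(Heap);
    assert_eq!(roundtrip(&*erased, 42).unwrap(), 42);
}
```

Note that `Box<Heap>` here gets `MAY_PANIC = true` and `Error = AllocError` rather than the values from `Heap`, which matches the disadvantage listed below about pointer types being implemented in terms of `DynAllocator`.
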
#### Noted advantages

- Does not rely on any typesystem or language changes
- Allows us to have both associated types/consts and dyn compatibility
- Straightforward for allocator implementors
- Allows making different API-design tradeoffs for the different optimization capabilities of monomorphization vs dynamic dispatch
- We can stabilize `Allocator` before figuring out ABI optimization details relevant to `DynAllocator`

#### Noted disadvantages

- Separately designing an API that is optimal for dynamic dispatch will take time, delaying `DynAllocator` stabilization. As far as I am aware, the only open question is about how the allocators should communicate available allocation size.
- `Allocator` will be implemented in terms of `DynAllocator` even for concrete pointer types like `Arc<UserDefinedAllocator>`. This can presumably be fixed if we ever get maybe bounds (?).

## Opinions

(please include a zulip username here so we can ping you)

### Nia

I prefer (2) on ergonomics grounds, but ultimately would accept either. It's also the less complex option for both us in the library and for implementors. While associated errors are quite nice, I feel like I'd be willing to sacrifice them for cleanliness since we don't have much of a usecase for this in std beyond merely propagating them (and the `Layout` that failed the allocation can be included by the caller if they so desire) and I think serving what we do in std internally is the main purpose of `Allocator`.

I'm deeply worried about (1) + swapped impl, since `Move` and `Forget` are both relevant autotraits that are likely to be added soon and would up the number of combinations considerably.

### jmillikin

I prefer (3) on the theory that it's better for both major use cases to be satisfied by a slightly more complex API than if a simpler API left neither side happy.

The habit of Rust language features to languish indefinitely in "almost ready" state is also a factor; I hesitate to support designs that depend on in-progress changes to the type system. It's possible that the Rust in 10 years could provide an ergonomic single-trait `Allocator` that supports both `dyn Allocator` and `Box<T, A: Allocator>`, but do we want to wait that long?

The blanket trait impl of `impl<A: DynAllocator> Allocator for A` in (1) seems like it combines the disadvantages of both (3a) and (3b). Like (3a) it requires that each `DynAllocator` have at most one `Allocator` impl, but like (3b) the user is forced to use newtypes if they want to control the `Allocator` impl properties. Without that blanket impl it would be more flexible, but note that it would then be a less ergonomic variant of (3b) semantically.

(2) would work for my use cases *if stabilized*, but it depends on in-progress language features to provide what `Box<T, A: Allocator>` needs, so it seems unlikely to stabilize in the near future.

The tiny ergonomic hit of not having `dyn Allocator` seems fine if `dyn core::alloc::<Something>` exists and doesn't require the per-consumer glue code of `struct DynAllocator`.

### Marcus Müller

I prefer (4).

I think (1) arose from a misunderstanding. Nobody wants this as far as I can tell.

As for (2), I am strongly opposed to stabilizing dyn-compatible `Allocator` before maybe bounds are stable. I have been disappointed by <future_specialization_feature> promises too often. Once we have maybe bounds, I am still slightly opposed.

The 'duplication' is not a bug, but a feature. Different tradeoffs apply to `Allocator` and `DynAllocator`, so it makes sense for them to have different APIs. Namely, `Allocator` should return an associated error type, while `DynAllocator` should return ZST errors. We will probably also want differences in returning `NonNull<u8>` vs `NonNull<[u8]>`, though the sizing part still needs more design work.

Personally, I do not care much about dyn compatibility or custom errors, so if we design the common API in a way that works for monomorphization with ZST errors, that works for me. I don't think it would be fair to the people who want these features, but it's not the hill I'm going to die on.

The most significant difference between (4) and all variants of (3) is that in (3) there is no blanket impl. Users would need to implement both `DynAllocator`/`Alloc` and `Allocator`. Assuming `Allocator` actually grows the intended features, like custom error types, this will be significant implementation duplication. Note that stabilizing `DynAllocator`/`Alloc` before `Allocator` will make the blanket impl impossible.

(3b) is a worse version of (3c). I think forcing all allocators to be cloneable is bad.

#### on maybe bounds

We may end up in the situation where maybe bounds can only be used in the standard library but not outside, due to soundness concerns. In that case, under (2), only std collections will be able to use non-dyn features while also supporting dyn. Under (4), only smart pointers defined in std will be able to forward to wrapped non-dyn allocator implementations while also supporting dyn. I think (4) is the better tradeoff, as non-std collections are more common than non-std smart pointers.