# Abstract Values

_A brain dump (or maybe pre-pre-RFC) to capture my thoughts on a future extension of [RFC 3729][RFC3729]._

This is based on [RFC 3729 "Hierarchy of Sized traits"][RFC3729], so you should be familiar with the concepts introduced there.

:::danger
This proposal currently does not properly capture the "can be stored in a local variable, but not used as a pointee" semantics, especially not in a generic context. Large parts of the proposed design are still applicable, though.
:::

## Motivation

On some architectures, there exist types that cannot be put into memory: they have neither a size, an alignment, nor a bit pattern that could be read. This is the case, for example, for WebAssembly's reference types (such as `externref`) or SPIR-V's opaque types (generally handles to some resource).

Such types are severely restricted in how they can be used, and this document describes how these restrictions could be modeled in Rust.

## Design

[RFC 3729][RFC3729] introduces the `Pointee` trait "at the bottom of the \["sized"] trait hierarchy" to mark "types that which can be used from behind a pointer" and states that "in practice, every type will implement `Pointee`".

This RFC introduces `!Pointee` types (i.e. types to which a pointer cannot exist), which will be called "abstract values". The trait to identify such types will be `AbstractValue`. `AbstractValue` would replace `Pointee` at the bottom of the "sized" trait hierarchy, and every type would implement `AbstractValue`.

When this RFC mentions "abstract values", it always refers to types that are _only_ abstract values, i.e. those that are `!Pointee`.

:::info
There is an ongoing discussion on [RFC 3729][RFC3729] on whether relaxing the default bound should be expressed as `: SuperTrait` or `: ?Trait`. If the decision ends up being for `: ?Trait`, then adding the `AbstractValue` trait may not be necessary, though the "abstract value" terminology would still be applicable.
(In case [RFC 3729][RFC3729] decides not to introduce the `Pointee` trait, that trait would also need to be added to express `?Pointee` bounds.)
:::

:::info
[RFC 3729][RFC3729] mentions abstract values as a future possibility, though it calls the corresponding trait just `Value`. This RFC adds "abstract" to try to better capture the nature of such types, and to have a common term to use in other places (see `AbstractCopy` below). ("Opaque values" could also be used instead, but I think "abstract" fits better, and "opaque" is also already used for a number of other concepts.)
:::

### Low-level properties

At the machine code / hardware / ABI level, Rust assumes that `AbstractValue` types can be:

* Stored in local variables
  * Including being usable as function arguments and return values
* Copied
  * If a value cannot be copied even at the hardware level, its usefulness would be extremely restricted (similar to what is discussed in the next section).
  * It would be possible to support such types (at the expense of further complicating things with additional traits or semantic restrictions), but I don't know of a strong enough use case to make that worth it.

### Restrictions

Under current Rust semantics, abstract value types would be extremely restricted in how they can be used:

* They cannot implement `Copy` (and thus cannot be copied).
* A reference to such types cannot exist.
  * Under the definition of a reference as a pointer with a lifetime.
* They cannot be part of (non-transparent) aggregate types.

Essentially, an abstract value could be used within a single function, and then optionally returned or passed to exactly one other function (which could return the abstract value again, but that workaround breaks very quickly when multiple abstract values are involved).

### References to abstract values

Lifetimes are an important part of Rust semantics and API design.
Without support for references to abstract values, lifetimes could essentially only be used via transparent newtype wrappers (with a `PhantomData` lifetime). References to such wrappers also could not exist, so modifying (e.g. (re-)borrowing) that lifetime would be difficult.

As such, references to abstract values will be allowed and called "abstract references". At the ABI level, such references will be represented by the abstract value itself. An abstract reference is itself an abstract value, and multiple levels of abstract references are supported (i.e. `&&&T` would be represented as just `T` in the ABI).

This only works as long as the original abstract value is not modified. As such:

* The original abstract value must not contain any interior mutability. (`UnsafeCell` must never contain an abstract value.)
* If mutable abstract references are allowed, then they may never be written to.
  * This RFC proposes to allow mutable abstract references, because mutability is such an important concept in Rust API design.

Raw pointers to abstract values are **not** allowed. (Many operations on abstract raw pointers could not be supported, so it seems cleaner to disallow the reference->raw pointer conversion, instead of disallowing the individual raw pointer operations.)

### Copyable abstract values

Duplicating abstract values must be supported in some form. It could be possible to support this via inline ASM or intrinsics, but there are several reasons to provide proper support for it:

* (Immutable) abstract references should be copyable.
* It would be nice to support safe duplication of abstract values. This essentially requires a type-level opt-in.

A new `AbstractCopy` trait will be added.

* This trait will be used to decide whether local variables are moved or copied.
* `AbstractCopy` will be a supertrait of `Copy`, and a blanket impl of `AbstractCopy` will exist for `Copy`.
* The semantics of the existing `Copy` trait will essentially become "supports a bitwise copy" (though obviously renaming that trait to `BitwiseCopy` is infeasible).
* `AbstractCopy` will have similar implementation restrictions as the current `Copy`.

### Minimally useful product

I believe that `AbstractCopy` and abstract references are required to make abstract values minimally useful. (Where minimally useful means good enough to properly use in real-world programs without undue performance overhead, _**though large parts of the standard library and ecosystem may be unusable**_.)

With only one of the two features, abstract values could still be usable in most situations, but the possible API designs would be significantly restricted. Without either of the two features, abstract values would be mostly restricted to the FFI boundary.

There are still (sometimes significant) limitations on how abstract values can be used (see below), but I believe that `AbstractCopy` and abstract references allow a large enough group of programs that they are worthwhile to support.

## Limitations and future possibilities

### Standard library changes

The standard library APIs would need to be evaluated to decide which parts of it should support abstract values.

### Aggregate types

This RFC only supports abstract values in transparent aggregates.

* This makes it impossible to return more than one abstract value from a function.
* This prevents the common pattern of returning `Result<SomeAbstractValue, E>`.

Aggregate types could be allowed to contain abstract values, which would make the aggregate itself an abstract value. In the general case, the individual components of an abstract aggregate would need to be stored (and passed around) separately.

* This could have a non-trivial performance impact.
* This requires that the ABI supports returning multiple abstract values from a function.
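
To make the "stored and passed around separately" idea more concrete, here is a minimal sketch in today's Rust. It is only an analogy under stated assumptions: `Handle` is a hypothetical stand-in for an abstract value (today's Rust cannot express a type without a memory representation), and `Pair`, `swap` and `swap_lowered` are invented names. The tuple return type of `swap_lowered` stands in for an ABI-level multi-value return; it does not describe how any compiler actually lowers such code.

```rust
/// Hypothetical stand-in for an abstract value (e.g. a WebAssembly `externref`).
/// In today's Rust this is an ordinary type; a real abstract value would have
/// no size, alignment, or readable bit pattern.
#[derive(Debug, PartialEq, Eq)]
struct Handle(u32);

/// Source-level view: an aggregate holding two abstract values.
/// Under the possible extension, this type would itself be an abstract value.
#[derive(Debug, PartialEq, Eq)]
struct Pair {
    first: Handle,
    second: Handle,
}

/// What a user would write: take and return the aggregate by value.
fn swap(pair: Pair) -> Pair {
    Pair { first: pair.second, second: pair.first }
}

/// Sketch of a lowered form: each component travels as its own argument,
/// and the tuple return models an ABI that can return several values at once.
fn swap_lowered(first: Handle, second: Handle) -> (Handle, Handle) {
    (second, first)
}

fn main() {
    let swapped = swap(Pair { first: Handle(1), second: Handle(2) });
    let (first, second) = swap_lowered(Handle(1), Handle(2));
    // Both forms carry the same components, just packaged differently.
    assert_eq!(swapped, Pair { first, second });
}
```

The sketch also hints at the ABI requirement from the list above: `swap_lowered` only works if the target can return two abstract values from one function.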

In WebAssembly it is possible to define custom aggregate types (which can contain abstract values, and are abstract values themselves). It is unclear how such types would fit into this. One option would be for the compiler to use them transparently as an optimization, though supporting them as user-defined types would obviously also be desirable.

### Coroutines and async

One direct consequence of the limitation on aggregates is that storing abstract values across suspension points would only be possible if abstract aggregates are supported. Doing so would also make the coroutine itself an abstract value, severely limiting how it can be used.

Abstract values would not play well with the existing async ecosystem at all.

### Trait objects

Abstract values cannot be used as trait objects (and I cannot think of a way around that).

Some targets (e.g. WebAssembly) may support converting abstract values into a type that can be stored in memory and thus used as a pointee and trait object. (E.g. WebAssembly can store abstract values in "tables", so they can be referred to by their index in the table.)

### format_args

A direct consequence of the trait object restriction is that abstract values cannot be used as arguments to `format_args!`. One workaround would be to have library or language support for eager formatting (i.e. turning the abstract value into a string, or some other intermediate value, before passing it to `format_args!`).

### Arrays (and slices)

Some targets could support arrays or slices of abstract values. It is unclear how to represent those in the language. Fixed-size arrays could be handled similarly to the abstract aggregates described above.

[RFC3729]: https://github.com/rust-lang/rfcs/pull/3729