Enum layout and discriminant

# Enum layout and discriminant The proposal below details the AM definition of enum layous and `std::mem::discriminant`, the degree to which the behavior of these is implementation defined, and all the fun problems this creates. For now, we pretend that all types are sized. This should extend fairly naturally to unsized types though. ### Enum Layouts The layout of an enum in the AM is defined as follows: ```rust= type Variant = VariantName; type Field = FieldName; struct EnumLayout { /// Size and align layout: Layout, /// Specifies how to "write" the enum variant_layouts: Map<Variant, VariantLayout>, /// Specifies how to compute the current variant, see below discriminator: Discriminator, } struct VariantLayout { field_offsets: Map<Field, usize>, tag: Map<usize, u8>, } ``` The `field_offsets` should be fairly self-explanatory. The `tag` is a set of offsets and the values for the bytes at those offsets. Encoding the enum just consists of encoding the fields at the specified offsets, writing `AbstractByte::Init(val, /* provenance */ None)` to `offset` for each `(offset, val)` in `tag` and leaving the remaining bytes uninit. Importantly, the set of keys in the `tag` is *not* required to be the same among all variants. This allows us to do niche optimizations and similar things. For example, with a tagged layout for ```rust= enum E { A(u16), B(u16), } ``` the `VariantLayout`s might look like this: ```rust= E::A : VariantLayout { field_offsets: map!(A.0 => 2_usize), tag: map!(0_usize => 0_u8), } E::B : VariantLayout { field_offsets: map!(B.0 => 2_usize), tag: map!(0_usize => 1_u8), } ``` We can model niched layouts too. An `Option<NonZeroU16>` has layout: ```rust= Option<NonZeroU16>::Some : VariantLayout { field_offsets: map!(Some.0 => 0), tag: map!(), } Option<NonZeroU16>::None : VariantLayout { field_offsets: map!(), tag: map!(0_usize => 0_u8, 1_usize => 0_u8), } ``` ### Variant Computation Each `EnumLayout` also specifies how the active variant of the enum is calculated. The basic idea is that computing the layout is a decision tree over a subset of the bytes. We define a `Discriminator` that reads a byte, and depending on the value of that byte either goes to read another one or makes a decision as to which variant to return. ```rust= enum Discriminator { Known(Variant), Unknown { offset: usize, children: Map<u8, Discriminator>, } } ``` Because `discriminant` calculation is potentially done both on a place (via `std::mem::discriminant`/`Rvalue::Discriminant`) and on a `List<AbstractByte>` (for `decode`), we make the computation generic over ```rust= trait ReadByte { fn get(&self, i: usize) -> Result<AbstractByte>; } ``` We implement this for `Place`, where `get` just does a normal AM read of the byte at offset `i`, and we implement it for `List<AbstractByte>`, where `get` is just indexing. `discriminant` is then defined as follows (there is nothing interesting here, this is literally just a decision tree): ```rust= fn discriminant<R: ReadByte>( bytes: &R, EnumLayout { discriminator, .. }: EnumLayout ) -> Result<Variant> { let mut disc = discriminator; let var = loop { match disc { Discriminator::Known(var) => break var, Discriminator::Unknown { offset, children } => { let byte = bytes.get(offset); let AmByte::Init(val, /* provenance */ _) = val else { throw_ub!() }; let Some(new_disc) = children.get(val) else { throw_ub!() }; disc = new_disc; } } }; Ok(var) } ``` Let's once more consider the examples from above. For ```rust= enum E { A(u16), B(u16), } ``` with the layout as before, the `Discriminator` would just pick a variant on the basis of the byte that is used for all tags. It would look like this: ```rust= // Root Discriminator::Unknown { offset: 0, children: map!(0 => dscA, 1 => dscB) } dscA: Discriminator::Known(E::A) dscB: Discriminator::Known(E::B) ``` For `Option<NonZeroU16>`, the `Discriminator` instead has to check whether the memory is `[0, 0]`: ```rust= // Root Discriminator::Unknown { offset: 0, children: map!(0 => dsc1, 1.. => dscS), } dsc1 : Discriminator::Unknown { offset: 1, children: map!(0 => dscN, 1.. => dscS) } dscN: Discriminator::Known(None), dscS: Discriminator::Known(Some), ``` `decode` for an enum computes the discriminant and then reads all the fields of that variant at the appropriate offset. `std::mem::discriminant` just computes the discriminant. ### Requirements on Implementation The `EnumLayout` of each monomorphized enum is chosen by the implementation and is a parameter to the abstract machine. However, we impose some conditions on which layouts may be chosen. The less important of these are all the obvious ones, which I won't go through exhaustively: all variants/fields must be present where appropriate, fields and tags must be in-bounds and pairwise non-overlapping, the discriminator must be finite, etc. The remaining requirements both concern the behavior of `discriminant(encode(EnumValue { variant, fields }))`. The first of these I am calling the "agreement requirement." It specifies that the expression above must yield `variant` - in other words, the discriminator and variant layouts agree regarding which byte patterns imply which active variant. The second of these is the "interior mutability requirement." We first define the notion of a byte being "inside an `UnsafeCell`." The `encode` operation may at some point recurse down to encoding an `UnsafeCell<T>`, which will generate bytes that end up in the output of the `encode` (which bytes these are depends on the value). These are the bytes that are inside an `UnsafeCell`. Let `Unknown { offset: o_1, .. } -> Unknown { offset: o_2, .. } -> ... -> Unknown { offset: o_n, .. } -> Known(_)` be the path in the discriminator tree taken when computing the `discriminant` in the above expression. The interior mutability requirement states that none of the offsets among `o_1 ... o_n` may be inside an `UnsafeCell`. This turns out to be a more precise way of saying `UnsafeCell` can't have niches. It is necessary because we need to ensure that `Option<Mutex<u8>>::is_some` does not cause a data race or invalidate any pointers into the mutex. This is discussed a little more below. ## Discussion These are basically a bunch of proofs/short discussions that the requirements above are well-behaved in a bunch of ways. It's probably not worth your time to read all of these. --- #### We can model structs as well We can easily specify struct layouts with this too. Structs are laid out as single-variant enums, with the additional requirement that the `tag` is empty for that variant and that the discriminator is exactly `Known(TheOneVariant)`. This indirectly covers all of the other requirements on struct layouts like fields being well-aligned and the validity requirement of a struct being the conjunction of its fields' validity requirements. This in turn allows us to actually define a discriminator (and hence `mem::discriminant`) for all types. Primitives and such just get treated the same way as structs. This simplifies the model quite a bit and we will make use of this in the next section. --- #### Validity invariants of non-`UnsafeCell` fields can be enforced **Theorem:** Let `Ty` be some type with discriminator `d`. We show that there is an alternative discriminator `d'` that is identical to `d` in every way except that `discriminant::<Ty>` is UB if the value does not satisfy its bit-validity requirements for any bytes that are not inside an `UnsafeCell`. **Proof:** We proceed via induction on the "structure" of `Ty`. Because we only consider bit-validity, we do not run into circularity issues even when checking references (should they end up having memory-related validity constraints). We do case work on the kinds of the field types. Pointers and references can simply be read to check that they are initialized, and furthermore their numeric values can be inspected to ensure that references are aligned and non-null and such. Unions are always valid, so the discriminator to check this is a nop. `UnsafeCell` validity invariants do not have to be checked since these bytes are inside an `UnsafeCell`. For integer types, the general idea is easy. The discriminator reads the bytes to ensure that they are initialized. For `bool` and `char`, the discriminator additionally uses its decision tree power to check that the integers are in the valid range. However, if ptr2int transmutes are UB, we must additionally check that there is no provenance. This is discussed in detail later. Finally, for all aggregates (arrays, tuples, ADTs, etc.): Begin by taking `d_0`, the standard discriminator for the type. Now consider each field `f_i: F_i` (`i in 1..=n`) in succession. We define `d_i` by replacing every node `Known(x)` in `d_(i-1)` with a copy of the discriminator for `F_i` that checks validity (this uses the induction hypothesis), modified so that all leaves of `F` are `Known(x)` and the offsets are changed appropriately for the offset of `f`. This effectively checks the validity of the field `f_i`. Furthermore, the validity condition of any aggregate type is "`discriminator` is not UB and all fields are valid" and so `d_n` is a suitable discriminator to show that the data is valid at type `Ty`. █ In other words, what this means is that up to the barrier presented by any `UnsafeCell`s, implementations are free to *always* give this validity checking behavior to `#[repr(Rust)]` enum layouts (and are subject to no increased restrictions on layout in exchange). Implementations are also free to define a layout where this is not checked, but a user writing implementation-agnostic code must assume that it is always being checked. --- #### Match Guards Rust allows arbitrary code in the middle of a match statement via if-guards (and possibly in the future via deref patterns). To ensure that this does not break match exhaustiveness checking, match guards are inserted. Essentially, when matching on some `T`, we keep a `&T` around up to the beginning of each match arm to ensure that the value doesn't change (and we do the same thing for any references to fields we create). <details> <summary> Sidenote About `FakeRead` </summary> The way we ensure that the borrows actually stick around until the beginning of each match arm is by inserting a `FakeRead` to them at that point. But this can't actually be an AM read, because it might read into `UnsafeCell`s causing pointer invalidation, data races, etc. Instead, I think we need to either insert a `Retag` or possibly have the tags associated with the match guard references carry protectors that cannot be popped until we enter some match arm (more or less). </details><br /> We would like to show that this is sufficient to make sure that match exhaustiveness checking works. **Theorem:** For all `E::Variant`, `v1...vn`, and `opaque`, the assert in the code below cannot fail in a program without UB: ```rust let x: E = E::Variant(v1, v2, ..., vN); let e: &E = &x; let r1: &F1 = &(x as Variant).f1; let r2: &F2 = &(x as Variant).f2; // for all fields in `E::Variant` let rn: &FN = &(x as Variant).fN; opaque(e, r1, r2, ..., rn); // See the note above for what this `read` really means read(e); read(r1); read(r2); // ... read(rn); assert_eq!(std::mem::discriminant(&x), E::Variant); ``` Essentially, we wish to show that shared references cannot change the active variant of a type. **Proof:** By the interior mutability requirement, the `discriminant` calculation does not read any bytes that are inside an `UnsafeCell`. But the references created all have SRO permissions to bytes that are not inside an `UnsafeCell`, so they would be invalidated by any write to these bytes within `opaque` and the program would exhibit UB at one of the `read`s. As such, the result of `discriminant` after `opaque` must be the same as the result would have been before `opaque`, which is `E::Variant` by the agreement requirement. █ Unfortunately, this proof relies on the interior mutability requirement and the aliasing model's retagging logic agreeing on what it means for a byte to be inside an `UnsafeCell`. With SB as currently implemented, this is not the case, and that presents a potential problem. As far as I can see, there are at least three solutions: 1. Accept that the theorem is false. It follows that matches may be non-exhaustive and that a program that is not otherwise UB may reach the end of a match statement having matched none of the patterns. The only reasonable way to treat this behavior would seem to be to call it UB. This seems possible, but it is at least non-trivial that this does not affect safe code, and it has other unfortunate consequences. 2. Modify the interior mutability requirement's definition of "inside an `UnsafeCell`" to match the aliasing model. What we currently implement is that if an enum has any non-freeze field, it is all inside an `UnsafeCell`. This would imply that layout optimizing `{ () | UnsafeCell<u8> } | ()` cannot be optimized to size 2 (the part inside the `{}` is a separate type). 3. Modify the aliasing model's definition of "inside an `UnsafeCell`" to match the interior mutability requirement's. This is only possible if retagging a `&T` reads from memory. Fortunately, the ideas presented here actually give a very concrete way to do that - retagging reads the discriminant of all fields that are aggregates until it finds all the `UnsafeCell`s. This requires only bare minimum validity (eg no validity requirement on `&bool`). Furthermore, we're also guaranteed to get no data races (since we never read into an `UnsafeCell`). Still though, this potentially introduces a lot of UB. <details> <summary> Uninhabited types in option 3 </summary> In [UCG#77] , one popular-ish opinion seems to be that `&(u8, !)` should not be uninhabited, but `&!` should. If we select option 3 above, there is actually a super straight-forward way to ensure this. `discriminant` for `!` is UB, and so retagging `&!` is UB. However retagging `&(u8, !)` only computes `discriminant::<(u8, !)>` which is not UB as the discriminator on struct-like types is just `Known(TheOneVariant)`. There is no need to recurse into computing `discriminant` for the fields, as all fields are non-aggregates. This is kind of nice. [UCG#77]: https://github.com/rust-lang/unsafe-code-guidelines/issues/77 </details> --- #### ptr2int transmutes You may have noticed that when `Discriminator::Unknown` gets some bytes, it ignores the provenance. If ptr2int transmutes are DB and just throw out provenance, this is a fairly natural choice and fine (as far as I can tell). However, if ptr2int transmutes are UB, this won't do as we once more end up with self-assignments being unsound to remove. As a first example, consider ```rust= enum E { A = 0, B = 1, } ``` If we write a `0` with provenance to the memory of `x: E`, and then self assign `x`, by ignoring provenance we effectively strip it. When we remove the self-assignment, we then fail to strip that provenance, which is in general unsound. However, we also cannot unconditionally ban provenance. The discriminator for `Option<&T>` must be able to read provenance bytes. It's unclear how to fix this without significant complexity. For the example above, it suffices to check that tag bytes carry no provenance before returning from `discriminant`, but that is not the case if we want to be able to define a discriminator that checks integer validity (as above) and ptr2int transmutes are UB. Supporting that would require modifying the discriminator definition to allow it to *optionally* check that there is no provenance. Definitely doable, but potentially ugly. --- #### The "interior mutability requirement" is strict enough and not too strict The requirement is sufficiently strict in the sense that (as I said before) we can show that `discriminant::<Option<Mutex<u8>>>` cannot cause a data race or invalidate pointers to the `u8`. This is guaranteed by the interior mutability requirement, as the `u8` in the `Mutex` is inside an `UnsafeCell` and so cannot be read. Making the requirement any more strict seems challenging; indeed, another thread may have a legitimate claim to the `UnsafeCell<u8>` field as `&mut u8` and we cannot allow the implementation to invalidate that pointer. The requirement is also not so strict as to ban layout optimizations. As shown above, the discriminator is flexible enough to allow niche optimizing `Option<NonZeroU16>`. Making things even less strict appears challenging. If we require that the discriminator never read any field of the active variant, then it is not possible to niche optimize `Option<NonZeroU16>`. --- #### Validity invariants are flexible The section above shows that validity invariants *can* be checked. But they also do not have to be. This may be desirable for eg `#[repr(u8)]` enums. Specifically, the spec could require that the discriminator for these enums be a single `Discriminator::Unknown` reading the tag byte and pointing to the appropriate set of `Discriminator::Known` children. People often want to treat these like "real" tagged unions, and allowing them to use `mem::discriminant` on the in-memory version of `Variant(Uninit)` might be desirable.