# Zero-sized memory accesses

We currently have code in the standard library which assumes that code like this is okay:

```rust=
let ptr = 16 as *mut ();
unsafe { ptr.copy_from(ptr, 1) };
```

Generally, according to [the `ptr` docs](https://doc.rust-lang.org/nightly/std/ptr/index.html#safety), it is valid to do a zero-sized well-aligned read from or write to any non-null pointer that was created via `<integer literal> as *const/mut T`. When generating LLVM IR, zero-sized accesses either entirely disappear (for loads/stores), or they become calls to `memcpy`/`memset` intrinsics which LLVM explicitly documents as being always valid for size 0.

However, Miri will reject the following code as UB:

```rust=
let b = Box::new(0);
let ptr = &*b as *const i32 as *const ();
drop(b);
unsafe { ptr.read() };
```

More precisely, a zero-sized access is *not* considered okay if the pointer has provenance pointing to a no-longer-existing allocation. Similarly, if the pointer is out-of-bounds of the allocation indicated by its provenance, that is UB:

```rust=
let b = Box::new(0);
let ptr = &*b as *const i32 as *const ();
// `wrapping_byte_add`, so that computing the offset is not itself UB.
unsafe { ptr.wrapping_byte_add(128).read() };
```

This has several problems:

- We have pointers where `ptr.offset(0)` is allowed but `ptr.cast::<()>().read()` is not. That seems a bit strange: what exactly is the justification for disallowing the read? (And similarly for writes.)
- We violate "provenance monotonicity": adding arbitrary provenance to bytes that do not have provenance in the original program must be an allowed transformation. (This is crucial for things such as dead store elimination.)

Both of these could be fixed in one fell swoop by saying that zero-sized accesses are always allowed if the pointer is sufficiently aligned. However, to fix provenance monotonicity, it is sufficient to allow zero-sized reads/writes on arbitrary *non-null* pointers (even if their provenance may make them OOB or dangling). The null pointer question is somewhat separate. So let's branch off.

Note that we are mostly unconstrained by LLVM here -- zero-sized loads/stores do not generate LLVM IR, and zero-sized copies are already explicitly allowed by LLVM. Only attributes such as alignment and nonnull are enforced.

## Allow zero-sized accesses on non-null pointers with arbitrary provenance?

This is the easiest way to regain provenance monotonicity. We definitely want provenance monotonicity (or we need to make ptr2int transmutes UB if the pointer had provenance -- or we don't get to remove dead self-assignments `*x = *x`, which we clearly want to allow).

### Potential problem: zero-sized accesses as optimization helpers

Consider code like this:

```rust=
// `unk` stands for some unknown function returning a `usize`.
unsafe fn example1(ptr: *const [u8; 4]) {
    ptr.read();
    // `ptr` is dereferenceable(4) here
    let n = unk();
    let slice = slice::from_raw_parts(ptr, n);
    // `slice` is dereferenceable(n) here, but we don't know n.
    for elem in slice {
        // ...
    }
}
```

Under the current semantics, if we further assume that allocations cannot be partially freed, the compiler can assume that `slice` is dereferenceable for at least 4 bytes, and can prefetch loads before the loop. This assumption can be made because even if `n` is zero, creating `slice` would be UB if `ptr` had dangling provenance. Therefore the allocation `ptr` points to is still alive, therefore it still has its original size, and therefore `ptr` is `dereferenceable(4)`.

Under the proposed semantics of allowing zero-sized accesses on arbitrary non-null pointers, if `n` is 0, `unk` might actually have freed this allocation, so this assumption no longer works.

However, there is also some desire from people thinking about allocators to have operations like `shrink_in_place` that could partially free an allocation, which would likewise be incompatible with such an assumption. It is [unknown whether LLVM makes such an assumption](https://discourse.llvm.org/t/does-llvm-assume-that-optimizations-cannot-be-partially-freed/72416).

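To make the optimization in question concrete, here is a hand-written sketch of what hoisting the first load out of the loop would amount to. This is an illustration only, not actual compiler output: `unk` is passed in as a closure, and the loop body (summing the first byte of each element) is invented for the example.

```rust
use std::slice;

/// Mirrors `example1` from the text: only reads elements `0..n`.
///
/// Safety: `ptr` must be valid for reading `max(1, n)` elements.
unsafe fn example1_plain(ptr: *const [u8; 4], unk: impl FnOnce() -> usize) -> u32 {
    // `ptr` is dereferenceable for 4 bytes here.
    let _ = unsafe { ptr.read() };
    let n = unk();
    let elems = unsafe { slice::from_raw_parts(ptr, n) };
    let mut total = 0u32;
    for elem in elems {
        total += u32::from(elem[0]);
    }
    total
}

/// Same computation, but with the first element loaded before the loop,
/// even when `n == 0`. This extra load is only sound if `ptr` is still
/// dereferenceable for 4 bytes after `unk` returns.
unsafe fn example1_speculated(ptr: *const [u8; 4], unk: impl FnOnce() -> usize) -> u32 {
    let _ = unsafe { ptr.read() };
    let n = unk();
    // Speculative load: happens regardless of the value of `n`.
    let first = unsafe { ptr.read() };
    let elems = unsafe { slice::from_raw_parts(ptr, n) };
    let mut total = 0u32;
    for (i, elem) in elems.iter().enumerate() {
        // Reuse the speculatively loaded value for the first element.
        let elem = if i == 0 { first } else { *elem };
        total += u32::from(elem[0]);
    }
    total
}
```

The two functions agree whenever the allocation behind `ptr` is still live after `unk` returns. As described above, under the proposed semantics a `unk` that frees the allocation and returns 0 would be fine for `example1_plain` but not for `example1_speculated`, so the compiler could no longer justify this rewrite from the zero-length slice creation alone.
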
### Alternative to achieve provenance monotonicity: "not dereferenceable" provenance for integers

It has been proposed that provenance monotonicity could be achieved by having `ptr::invalid_mut(16)` generate a pointer that has provenance indicating "this allows zero-sized accesses", while `transmute(16usize) as *mut ()` would generate a pointer with *no* provenance, and that would disallow even zero-sized accesses.

This would probably work, but would generate yet another distinct kind of "invalid" pointer:

- Pointer with no provenance
- Pointer with "zero-sized only" provenance
- Pointer with provenance of no-longer-existing allocation ("dangling pointer")
- Pointer with provenance of allocation for which this pointer is not in-bounds ("out-of-bounds pointer")

We would need a new intrinsic to implement `ptr::invalid` (which is supposed to return a pointer with "zero-sized only" provenance), destroying its symmetry with `addr` (currently both are implemented via `transmute`).

Therefore, in the author's opinion, this should only be pursued if there are serious problems with the alternative of just allowing zero-sized accesses even on dangling or out-of-bounds pointers. Is the problem described in the previous subsection sufficiently serious?

## Allow zero-sized accesses on null pointers?

Arguably, if we allow `offset(0)` on the null pointer, it would be more consistent to also allow reads and writes. In particular, this means the zero-sized exception can be put in a single place in the operational semantics. (In MiniRust terms: a single early-return in `check_ptr` is sufficient to cover everything from `offset` to loads and stores. `check_ptr(ptr, size)` is called before each load/store, and it is also called by `offset` to ensure the pointer remains in-bounds. Something similar happens in Miri.)

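As a rough illustration of the "single place" idea, here is a toy model in Rust. It is not MiniRust or Miri code: the `Ptr`, `Alloc`, `Memory`, and `Ub` types are made up for this sketch, alignment is not modeled, and `offset` only handles non-negative byte offsets. The only point is that one early return in `check_ptr` covers both `offset` and loads.

```rust
/// Toy pointer: an address plus optional provenance (an allocation index).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Ptr {
    addr: usize,
    prov: Option<usize>,
}

/// Toy allocation: an address range and a liveness flag.
struct Alloc {
    base: usize,
    size: usize,
    live: bool,
}

/// Toy memory: just a list of allocations (contents are not modeled).
struct Memory {
    allocs: Vec<Alloc>,
}

/// The ways in which a pointer check can fail in this model.
#[derive(Debug, PartialEq)]
enum Ub {
    NoProvenance,
    Dangling,
    OutOfBounds,
}

impl Memory {
    /// Checks that `size` bytes starting at `ptr` lie inside a live allocation.
    fn check_ptr(&self, ptr: Ptr, size: usize) -> Result<(), Ub> {
        // The single zero-sized exception: null, dangling, and
        // out-of-bounds pointers are all accepted for size 0.
        if size == 0 {
            return Ok(());
        }
        let id = ptr.prov.ok_or(Ub::NoProvenance)?;
        let alloc = &self.allocs[id];
        if !alloc.live {
            return Err(Ub::Dangling);
        }
        let end = ptr.addr.checked_add(size).ok_or(Ub::OutOfBounds)?;
        if ptr.addr < alloc.base || end > alloc.base + alloc.size {
            return Err(Ub::OutOfBounds);
        }
        Ok(())
    }

    /// A load only performs the check here; the loaded data is omitted.
    fn load(&self, ptr: Ptr, size: usize) -> Result<(), Ub> {
        self.check_ptr(ptr, size)
    }

    /// An offset checks the range between the old and the new pointer.
    fn offset(&self, ptr: Ptr, bytes: usize) -> Result<Ptr, Ub> {
        self.check_ptr(ptr, bytes)?;
        Ok(Ptr { addr: ptr.addr + bytes, prov: ptr.prov })
    }
}
```

In this model, `load(ptr, 0)` and `offset(ptr, 0)` succeed for every pointer, including the null pointer, without `load` or `offset` containing any special case of their own.
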
~~However, we currently [do add `nonnull` attributes](https://play.rust-lang.org/?version=stable&mode=release&edition=2021&gist=84e66a74a7a649d9641d8aeff252afd8) to raw pointers that are being accessed; we'd have to stop doing that.~~ (That example was bogus; the attribute was inferred by LLVM.)

Also, `&()` would still be required to be non-null -- so there would be pointers that are allowed to be read but creating a reference to them is illegal. The justification for making references nonnull would be "we want the niche", without there always being underlying UB.

The alternative is to keep the nonnull rule for accesses but not `offset`. In the spec/MiniRust/Miri, that would require treating zero-sized offsets and zero-sized load/store separately in the operational semantics, which is at least somewhat unsatisfying. Likely, we would need a special case in `offset` that allows size 0, and only for non-zero sizes does it then actually check "is the memory range between old and new pointer inside some allocation". In contrast, for memory accesses, on size 0 it would still check for null, but the non-zero-sized case uses the same check as `offset`.

## Allow zero-sized accesses on unaligned pointers?

Generally, alignment is completely separate from the other restrictions imposed on pointers, so it probably makes sense to keep enforcing it on zero-sized accesses.

# Meeting notes and questions (2023-09-05)

## Connor: ZST Accesses to recover existing optimizations for user-defined slice references

To clarify my inline comment more, `Span<T>` (as well as `SpanMut<T>`, which needs even more creativity to get back full behaviour) is used in many places throughout lccc, even outside of ABI boundaries, to avoid the syntactic overhead of interconversion. This can impact optimizability in LLVM due to the lack of dereferenceable.
While lccc's dereferenceable is more flexible here, it may not be sufficient for all desirable speculations without some way to preserve it across an opaque function call.

## carbotaniuman: Is this sufficient to achieve provenance monotonicity? Are there other alternatives?

Ralf: sufficient -- to my knowledge, yes. AFAIK this is the last remaining bit of Miri semantics that is not provenance-monotone.

Ralf: alternatives -- one has been mentioned in the text.

## Jakob: Unsized locals?

I'm trying to think of situations where the semantics for zero-sized accesses are less "separable" from the remaining load/store semantics, and this seems like a situation where something might come up. It's not clear to me that there is an actual problem though; I can't think of anything concrete.

RJ: I don't quite follow. When doing a load/store we always (dynamically) know the size. When I talk about zero-sized access, I mean "dynamically zero-sized". Unsized locals do not make a difference.

Jakob: Are there many other situations where we only dynamically know the size? `memset` and `memcpy` seem like obvious examples.

Connor: Slices in general, since retags do (AM) reads IIRC.