# Refs to statics in constants

## Summary

This document proposes to stabilize [refs-to-static-in-constants](https://github.com/rust-lang/rust/issues/119618). This feature permits one to create a constant expression that references a static:

```rust=
struct Vtable { ptr: *const u32 }

// Raw pointers are not `Sync`, which a `static` requires; the pointer here is never written through.
unsafe impl Sync for Vtable {}

static VT: Vtable = Vtable { ptr: std::ptr::null() };
const C: &Vtable = &VT;
```

On stable Rust, this stabilization does not introduce any surprising behavior. The resulting constant `C` will be equal to the address of `VT` at runtime -- note that this value is not knowable at compilation time (certainly not early on in the compilation) and so must be represented abstractly (i.e., the compiler thinks of the value of `C` as "the address of `VT`", whereas `std::ptr::null()` is an example of a constant pointer whose value is known to be 0).

Given the limited surface area, this stabilization has no interactions with **stable** const generics. However, it does have some implications for **future** const generics; those are discussed in the [Future interactions](#Future-interactions) section. The conclusion is that supporting refs-to-statics does not introduce new challenges for const generics that were not already present in some other form.

## Procedural note

Const-refs-to-statics never had an RFC. This stabilization report could be made into an RFC if we think that makes sense. Niko's general opinion is that incremental extensions to const generics do not all seem worthy of RFCs, and yet there is some value in establishing principles like 'statics have significant addresses that ought to be preserved'.

## What is being proposed for stabilization

### Background: const evaluation and const values

The term **const evaluation** refers to evaluating a constant expression at compilation time. The result of const evaluation is a **const value**. Const values can contain abstract pointers (e.g., the result of `&VT` is "the address of the static `VT`") whose concrete values are not known at compilation time.
We cannot always know whether two const values should be considered equal or whether they would compare as equal at runtime.

Const values are used to store the initializer for a named static (`static S: T = /* initializer */`), the values of named constants (`const C: T = ...`), and the values of associated constants (`<T as Something>::SIZE`).

### What `const_refs_to_static` allows

Currently, const evaluation does not allow a `const` value to reference a `static`. A program like this one therefore requires a feature gate:

```rust
#![feature(const_refs_to_static)]

static S: u32 = 66;
const C: u32 = S;

fn main() {
    println!("{C}"); // prints 66
}
```

The feature gate allows not only reading the value of a static but also taking references (and even dereferencing them):

```rust
#![feature(const_refs_to_static)]

static S: u32 = 66;
const C: &u32 = &S;
const D: u32 = *C;

fn main() {
    println!("{D}"); // prints 66
}
```

### The "significant address" property

The key distinguishing feature of a `static` versus any other form of variable is that it has a **significant address**. In short, `&S` for some static `S` is often expected to be the same pointer everywhere in the program whenever it occurs (but see the caveat below). This is distinct from a local variable, say, which may have different addresses on each invocation of the function[^move]; it is also distinct from a constant like `const { &22 }`, which can also refer to different memory locations (though they will always hold `22`).

[^move]: And potentially even within a single function call, if the value is moved or becomes dead -- though arguably that is a separate variable. Precise limitations here still TBD.

Const evaluation and constants preserve this property ([playground](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=10fa873401bb8df0cfb116a46c7134f2)):

```rust
#![feature(const_refs_to_static)]

static S: usize = 44;
const S_X: &usize = &S;
const S_Y: &usize = &S;

static T: usize = 44;
const T_X: &usize = &T;
const T_Y: &usize = &T;

fn main() {
    // These assertions are guaranteed to be true.
    assert!(std::ptr::eq(&S, S_X));
    assert!(std::ptr::eq(S_X, S_Y));
    assert!(std::ptr::eq(&T, T_X));
    assert!(std::ptr::eq(T_X, T_Y));
    assert!(!std::ptr::eq(&S, &T));
}
```

In contrast, the pointer values of constants are not guaranteed to be equal, and hence equivalent assertions would not be guaranteed to be true for these declarations ([playground](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=debd44117e3a04f08a6d9ab75faa310f)):

```rust
const S: usize = 44;
const S_X: &usize = &S;
const S_Y: &usize = &S;

const T: usize = 44;
const T_X: &usize = &T;
const T_Y: &usize = &T;

fn main() {
    // These assertions hold in practice because LLVM coalesces
    // pointers to constants, but that is a "best effort" optimization
    // and they are not guaranteed to hold:
    assert!(std::ptr::eq(&S, S_X));
    assert!(std::ptr::eq(S_X, S_Y));
    assert!(std::ptr::eq(&T, T_X));
    assert!(std::ptr::eq(T_X, T_Y));

    // As above, `S` and `T` are distinct constants, but they are coalesced
    // in practice (not guaranteed):
    assert!(std::ptr::eq(&S, &T));
}
```

**Caveat (generic statics):** Statics are currently forbidden from having generic parameters, in large part because it is not clear if and how the significant address property could be maintained given monomorphization. Future extensions of statics to support generics may revise the precise guarantee being offered here (e.g., to say that generic statics instantiated in distinct compilation units may sometimes have distinct addresses) and they would have to address how that interacts with constants.
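
As a further illustration of the significant-address guarantee described in this section, here is a minimal sketch of a sentinel check that compares identity rather than contents. The names `Config`, `DEFAULT_CONFIG`, `OTHER_CONFIG`, `DEFAULT`, and `is_default` are hypothetical, and the sketch assumes a toolchain on which references to statics in constants are accepted without a feature gate (on nightlies that predate stabilization, `#![feature(const_refs_to_static)]` would be needed).

```rust
// Hypothetical names throughout; assumes refs-to-statics in constants
// are accepted without a feature gate on the toolchain in use.
#[derive(Debug, PartialEq)]
struct Config {
    verbose: bool,
}

static DEFAULT_CONFIG: Config = Config { verbose: false };
static OTHER_CONFIG: Config = Config { verbose: false };

// A constant whose value is "the address of `DEFAULT_CONFIG`".
const DEFAULT: &Config = &DEFAULT_CONFIG;

/// Identity check: compares addresses, not contents.
fn is_default(c: &Config) -> bool {
    std::ptr::eq(c, DEFAULT)
}

fn main() {
    // The same static, reached directly and through the constant.
    assert!(is_default(&DEFAULT_CONFIG));
    // Equal contents, but a different static and hence a different address.
    assert!(OTHER_CONFIG == DEFAULT_CONFIG);
    assert!(!is_default(&OTHER_CONFIG));
}
```

The check relies only on the guarantees stated above: a constant that refers to a static denotes that static's address, and distinct statics have distinct addresses.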
### Extern statics

Extern statics are treated conservatively. It is possible to get their address as a raw pointer but it is not possible to read from them (what would the value be?) or to include a safe reference to them in your final value ([playground](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=9688fd07c61b5fb3539915112f2fcdbe)):

```rust
#![feature(const_refs_to_static)]

extern {
    static S: u32;
}

// ERROR cannot access extern static
const C: u32 = unsafe { S };

// ERROR encountered reference to `extern` static in `const`
const D: &u32 = unsafe { &S };

// OK
const E: *const u32 = unsafe { std::ptr::addr_of!(S) };
```

### Freeze requirement

Const evaluation is not allowed to

- access the contents of any mutable static (whether that is via interior mutability or `static mut`).
- result in values that safely reference anything mutable (whether that is via interior mutability or `&mut`). "Safely reference" here refers to recursively traversing the value in the same way safe code could (but ignoring visibility), i.e. recursing through references but not through raw pointers or unions.

It is possible to create static values with `UnsafeCell` contents, but they typically cannot be used from constants except in very narrow ways.
For example, creating a constant whose value includes an `UnsafeCell` ([or a reference to memory contained in an unsafe cell](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=52d233e6d20987f990bd02cea0ac780d)) triggers an error that ["it is undefined behavior to use this value"](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=284c41407fdb84548ff48a11012cec05):

```rust
#![feature(const_refs_to_static)]
#![feature(sync_unsafe_cell)] // required to use `SyncUnsafeCell`, trivial to do on stable

use std::cell::SyncUnsafeCell;

static S: SyncUnsafeCell<u32> = SyncUnsafeCell::new(66);
const C: &SyncUnsafeCell<u32> = &S; // ERROR: undefined behavior to use this value
```

Similarly, attempting to access the contents of an unsafe cell results in ["constant accesses mutable global memory"](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=f739e0236a8d23c2467bd19dceb7c604):

```rust
#![feature(const_refs_to_static)]
#![feature(const_mut_refs)] // required to deref the raw pointer
#![feature(sync_unsafe_cell)] // required to use `SyncUnsafeCell`, trivial to do on stable

use std::cell::SyncUnsafeCell;

static S: SyncUnsafeCell<u32> = SyncUnsafeCell::new(66);
const C: u32 = unsafe { *S.get() }; // ERROR: constant accesses mutable global memory
```

It is however possible to use statics that have `UnsafeCell` in other ways, e.g.
[returning a raw pointer to their contents](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=5e987b02c68877bb3516f8b0523a08d5):

```rust
#![feature(const_refs_to_static)]
#![feature(sync_unsafe_cell)]

use std::cell::SyncUnsafeCell;

static S: SyncUnsafeCell<u32> = SyncUnsafeCell::new(66);
const C: *mut u32 = S.get(); // OK
```

### Static mut

Statics declared as `static mut` generally behave "as if" they were enclosed in an unsafe cell ([playground](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=138b51bd473f9787f67ca5b88da4676d)):

```rust
#![feature(const_refs_to_static)]
#![feature(const_mut_refs)]

static mut S: u32 = 0;

// ERROR constant accesses mutable global memory
const C: u32 = unsafe { S };

// ERROR it is undefined behavior to use this value
const D: &u32 = unsafe { &S };

// OK, requires feature(const_mut_refs)
const E: *mut u32 = unsafe { std::ptr::addr_of_mut!(S) };
```

The same is true of external statics ([playground](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=2c371bb0348050929242d8539027e378)):

```rust
#![feature(const_refs_to_static)]
#![feature(const_mut_refs)]

extern {
    static mut S: u32;
}

// ERROR constant accesses mutable global memory
const C: u32 = unsafe { S };

// ERROR it is undefined behavior to use this value
const D: &u32 = unsafe { &S };

// OK, requires feature(const_mut_refs)
const E: *mut u32 = unsafe { std::ptr::addr_of_mut!(S) };
```

## Future interactions

**Const generics** refers to Rust items with generic parameters of kind `const`, such as `fn foo<const C: usize>()`. Stable Rust requires that const generic parameters have simple scalar types like `usize` or `i32`. This limitation means that there is no real interaction between the stable surface area of const generics and `const_refs_to_static`.
So long as we do not extend const generics to permit values of `&`-type, then there are no problems at all (but of course we limit what users can do, and in particular don't support `&str` values).

If however we wish to extend const generics to permit parameters of `&`-type (e.g., `fn foo<const C: &usize>()`), then we will need to extend the current implementation to preserve the "significant address" property. This section dives into detail as to why that property is not currently preserved, the various options to fix that, and some related challenges.

### Background: Const generics and monomorphization

Given a function `fn foo<const C: SomeType>()`, Rust's type system must be able to decide whether `foo::<X>` and `foo::<Y>` represent two different instances of the same generic function (or, equivalently, given `struct Foo<const C: SomeType>`, whether `Foo<X>` and `Foo<Y>` are the same type). This requires being able to determine whether `X` and `Y` are equal (i.e., the same value).

This equality comparison cannot be done for all const values since some of them lack a well-defined notion of equality (e.g., two values of type `fn()`). Stable Rust sidesteps this issue by only permitting const generics where the type is a scalar value (e.g., `u32`) and the constant expression can be evaluated to a fixed constant (in particular, the expression is not allowed to reference generic types).

### Introducing valtrees

To support a richer set of values in const generics, nightly Rust makes use of **valtrees**. A valtree ("value tree") is a simplified form of const value consisting of "branch nodes" and "leaf nodes", which carry simple scalar values. The "value" of a const generic parameter is always a valtree, not an arbitrary const value.

For the simple types supported in const generics today, valtree conversion is infallible -- simply convert the scalar value to a leaf node. The same is true for ADTs composed of those simple types.
Converting a `(u32, u32)` tuple like `(22, 44)` for example simply means you get a valtree like `(U32LeafNode(22_u32), U32LeafNode(44_u32))`.

Valtrees do not carry type information. The same valtree `(U32LeafNode(22_u32), U32LeafNode(44_u32))` that represents a tuple would also represent a fixed-length array like `[22, 44]` or a value of `struct Point { x: u32, y: u32 }`. At monomorphization time, generic constants have both a type and an associated valtree suitable for that type, and that type can be used to instantiate the valtree into an actual value.

Values of more complex types may not have a well-defined valtree. For example, there is no way to represent a `fn()` value as a valtree.

In the nightly version of const generics, whenever a const value is given as the value for a const generic, the compiler internally attempts to convert that const value to a valtree. This process can fail, in which case an error results. But if it succeeds, then the const generic can be compiled. Whenever the const generic argument is referenced, the valtree will be converted into a const value which can in turn be converted into a real value at runtime.

**Example.** Let's walk through an example [supported on stable today](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=5796a004ae39675851492a68d50f70c1):

```rust
fn test<const C: u32>() {
    let x = C;
    println!("{x}");
}

fn main() {
    test::<{22 + 44}>();
}
```

* In `main`, the expression `22 + 44` is const evaluated into a const value `ConstVal(66)`.
* `ConstVal(66)` is then converted into a valtree `U32LeafNode(66)`.
* During codegen time, the function `test::<U32LeafNode(66)>` is compiled.
* When `let x = C` is compiled, `U32LeafNode(66)` is converted back to `ConstVal(66)` and from there the code is compiled to load a constant. Execution proceeds as expected.
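
The claim above that valtrees carry no type information can be made concrete with a small model. This is only a sketch: `ValTree`, `Leaf`, `Branch`, and the conversion helpers are hypothetical names chosen for illustration and are not the compiler's actual data structures.

```rust
// Hypothetical model of a valtree; these are illustrative names,
// not rustc's real types.
#[derive(Debug, Clone, PartialEq, Eq)]
enum ValTree {
    /// A scalar, stored as raw bits with no type attached.
    Leaf(u128),
    /// Ordered children: tuple fields, array elements, or struct fields.
    Branch(Vec<ValTree>),
}

struct Point {
    x: u32,
    y: u32,
}

fn from_tuple(t: (u32, u32)) -> ValTree {
    ValTree::Branch(vec![
        ValTree::Leaf(u128::from(t.0)),
        ValTree::Leaf(u128::from(t.1)),
    ])
}

fn from_array(a: [u32; 2]) -> ValTree {
    ValTree::Branch(a.iter().map(|&v| ValTree::Leaf(u128::from(v))).collect())
}

fn from_point(p: &Point) -> ValTree {
    ValTree::Branch(vec![
        ValTree::Leaf(u128::from(p.x)),
        ValTree::Leaf(u128::from(p.y)),
    ])
}

/// Going back to a value needs the type: here the caller asks for a `Point`.
fn to_point(v: &ValTree) -> Option<Point> {
    match v {
        ValTree::Branch(children) => match children.as_slice() {
            [ValTree::Leaf(x), ValTree::Leaf(y)] => Some(Point {
                x: *x as u32,
                y: *y as u32,
            }),
            _ => None,
        },
        ValTree::Leaf(_) => None,
    }
}

fn main() {
    let t = from_tuple((22, 44));
    // Three differently typed values share one valtree.
    assert_eq!(t, from_array([22, 44]));
    assert_eq!(t, from_point(&Point { x: 22, y: 44 }));
    // The type supplied at the use site turns the tree back into a value.
    let p = to_point(&t).expect("a branch with two leaves");
    assert_eq!((p.x, p.y), (22, 44));
}
```

In this model the tree alone cannot say whether it came from a tuple, an array, or a struct; the target type has to be supplied separately, which mirrors the role the type plays at monomorphization time in the description above.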
### Supporting references in valtrees

As currently implemented, references are ignored when creating a valtree, so the valtrees for `22` and `&22` and even `&&22` are all the same (just `I32LeafNode(22)`). This preserves the property that, given two values `X` and `Y`, if `valtree(X) == valtree(Y)` then `X == Y`. For references, this means that pointer equality ought not be considered part of identity, since the `==` operator for `&T` says that two references are equal if their referents are equal (and it doesn't consider the pointer address). Put another way, the `Eq` trait doesn't respect "significant addresses", and valtrees are currently defined to align with `Eq`, so they do not either.

The current definition of valtrees implies that const generics of type `&usize` (or any reference) will preserve the **value** of the referent but not its **address** (as that is not part of the valtree). This can create observable behavior on nightly. Consider this example from [#120961]:

[#120961]: https://github.com/rust-lang/rust/issues/120961

```rust
#![feature(const_refs_to_static)]
#![feature(adt_const_params)]

static FOO: usize = 42;
const BAR: &usize = &FOO;

fn foo<const X: &'static usize>() {
    if std::ptr::eq(X, &FOO) {
        // Never prints! But isn't `X == BAR == &FOO`??
        println!("activating special mode");
    }
}

fn main() {
    foo::<BAR>();
}
```

When executed, this example does NOT print anything, even though you might expect that it would. What is happening?

* The value of `BAR` is `ConstVal(&FOO)`, which tracks that it is the address of the static `FOO`.
* The value of `BAR` is converted into a valtree, which results in just `42` (the value of the static is used to create the valtree).
* When `foo::<Leaf(42)>` is compiled, the valtree must be converted into a `&usize`. A new temporary value is synthesized. The `std::ptr::eq` call (which observes the physical pointer address) compares the address of this temporary to `FOO` and they have different addresses.
* In practice, an anonymous constant like `const BAZ: &'static usize = &42` would typically be equal to `X`, but that is because LLVM deduplicates such constants into a single allocation; such deduplication is also not *guaranteed* to occur, particularly across codegen units.

**There is general agreement that this behavior is surprising and not desirable.** But note that it requires *multiple* feature gates -- `const_refs_to_static` AND `adt_const_params` (and as of very recently, `unsized_const_params`). Stabilizing just `const_refs_to_static` does not really change anything. In other words, the problem with the above example is not due to permitting references to statics in constants, it's due to valtrees encoding references in a surprising way (though if you didn't have references to statics, you couldn't observe it).

### Options to support references in const generics

So, what are the options for supporting reference types in const generics, while avoiding surprising examples like the one from [#120961] above?

#### Option A: Disallow creating valtrees from references to statics

We could make valtree construction fail if it encounters a reference to a static (but succeed for references to anonymous constants). This would avoid the issues but only by preventing users from doing something they likely want to do. This program would not compile, for example, since it invokes `foo` with the constant `&S`:

```rust=
fn foo<const C: &usize>() { }

static S: usize = 22;

fn main() {
    foo::<&S>(); // People will want to do this!
}
```

This option is not very appealing, because users likely *want* to create valtrees that reference statics.

#### Option B: Extend valtree to represent ref-to-static

A more appealing option is to extend valtrees so that "ref-to-static" is something they can directly encode, and thus sacrifice the invariant that `valtree(X) == valtree(Y)` implies `X == Y`.
This recognizes the fact that there are additional properties to values that we may wish to preserve beyond what is compared by the `Eq` trait. Significant addresses are not the only examples of such properties; there are many that arise when const functions use unsafe code, such as the value of padding bits, provenance, and potentially things like which NaN is in use (if we wished to support `f64`). We will have to decide which of them we wish to make observable in const evaluation.

#### The upshot: Stuff to figure out, but refs-to-statics doesn't make it harder

[As BoxyUwu put it:](https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/Refs.20to.20statics.20in.20constants/near/453376283)

> Notably that any of the solutions to making refs to statics not behave weirdly in const generics, wind up being strongly related to existing problems in const generics that already need to be solved. So while there are open questions here they don't actually really make anything worse (in my opinion).
>
> I wouldn't want anyone to read this and come away wondering whether the feature should be blocked for a while until const generics stuff is figured out.

## Links

* [Extensive Zulip discussion](https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/Refs.20to.20statics.20in.20constants/near/443446036)