This RFC proposes extending Rust's tooling support for safety hygiene to named fields that carry library safety invariants. Consequently, Rust programmers will be able to use the unsafe keyword to denote when a named field carries a library safety invariant; e.g.:
    use std::marker::PhantomData;

    struct UnalignedRef<'a, T> {
        /// # Safety
        ///
        /// `ptr` is a shared reference to a valid-but-unaligned instance of `T`.
        unsafe ptr: *const T,
        _lifetime: PhantomData<&'a T>,
    }
Rust will enforce that potentially-invalidating uses of unsafe fields only occur in the context of an unsafe block, and Clippy's missing_safety_doc lint will check that unsafe fields have accompanying safety documentation.
Safety hygiene is the practice of denoting and documenting where memory safety obligations arise and where they are discharged. Rust provides some tooling support for this practice. For example, if a function has safety obligations that must be discharged by its callers, that function should be marked unsafe and documentation about its invariants should be provided (this is optionally enforced by Clippy via the missing_safety_doc lint). Consumers, then, must use the unsafe keyword to call it (this is enforced by rustc), and should explain why its safety obligations are discharged (again, optionally enforced by Clippy).
Functions are often marked unsafe because they concern the safety invariants of fields. For example, Vec::set_len is unsafe because it directly manipulates its Vec's length field, which carries the invariants that it is less than or equal to the capacity of the Vec and that all elements in the Vec<T> between 0 and len are valid T. It is critical that these invariants are upheld; if they are violated, invoking many of Vec's other, safe methods induces undefined behavior.
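As a minimal sketch of how a caller discharges these obligations today, the hypothetical helper zeroed_bytes below (not part of this RFC) initializes a Vec's spare capacity and only then calls Vec::set_len. Each unsafe block carries a SAFETY comment naming the invariant it relies on.

```rust
/// Hypothetical helper: builds a `Vec<u8>` of `n` zero bytes by writing
/// through the raw pointer and then fixing up the length with `set_len`.
fn zeroed_bytes(n: usize) -> Vec<u8> {
    let mut v: Vec<u8> = Vec::with_capacity(n);

    // SAFETY: `with_capacity(n)` guarantees room for at least `n` bytes,
    // so writing `n` bytes starting at `as_mut_ptr()` stays in bounds.
    unsafe { std::ptr::write_bytes(v.as_mut_ptr(), 0, n) };

    // SAFETY: `n <= v.capacity()`, and the first `n` elements were
    // initialized by the write above.
    unsafe { v.set_len(n) };

    v
}
```

If the two unsafe calls were swapped, safe methods on the Vec could observe a length that covers uninitialized elements between them, which is exactly the kind of broken invariant described above.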
To help ensure these invariants are upheld, programmers may apply safety hygiene techniques to fields, denoting when they carry invariants and documenting why their uses satisfy their invariants. For example, the zerocopy crate maintains the policy that fields with safety invariants have # Safety documentation, and that uses of those fields occur in the lexical context of an unsafe block with a suitable // SAFETY comment.
Unfortunately, Rust does not yet provide tooling for declaring, discharging, or documenting the safety invariants of fields. Since the unsafe keyword cannot be applied to field definitions, Rust cannot enforce that potentially-invalidating uses of fields occur in the context of unsafe blocks, and thus Clippy cannot enforce that safety comments are present either at definition or use sites. This RFC is motivated by the benefits of closing this tooling gap.
The absence of safety tooling support for fields makes the practice of good field safety hygiene entirely a matter of programmer discipline, and, consequently, that practice is nascent.
Rust's visibility mechanisms can, to some extent, be (ab)used to help enforce good field safety hygiene. For example, zerocopy's Ptr type is defined in a private def module, which solely contains the datatype definition and an impl containing pub(super) unsafe constructors, getters and setters. All other impls of Ptr are defined outside of this module and therefore must mediate their access to Ptr's private fields through these unsafe functions. This roundabout approach poses significant linguistic friction and may be untenable when split borrows are required. Consequently, this approach is uncommon in the Rust ecosystem.
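The visibility-based pattern can be sketched in stable Rust roughly as follows. The names counter, def and EvenCounter are hypothetical stand-ins rather than zerocopy's actual code, and for brevity the getter here is safe, whereas the pattern described above makes getters unsafe as well.

```rust
mod counter {
    // The private `def` module holds only the type definition and the
    // accessors that touch the field directly.
    mod def {
        /// Hypothetical type whose field carries a safety invariant.
        pub struct EvenCounter {
            /// # Safety
            ///
            /// `value` is always even.
            value: u32,
        }

        impl EvenCounter {
            /// # Safety
            ///
            /// `value` must be even.
            pub(super) unsafe fn new_unchecked(value: u32) -> Self {
                Self { value }
            }

            /// Copies the value out; this cannot break the invariant.
            pub(super) fn value(&self) -> u32 {
                self.value
            }

            /// # Safety
            ///
            /// `value` must be even.
            pub(super) unsafe fn set_value(&mut self, value: u32) {
                self.value = value;
            }
        }
    }

    pub use def::EvenCounter;

    // Outside `def` the field is not nameable, so every mutation has to
    // go through the unsafe functions above.
    impl EvenCounter {
        pub fn new() -> Self {
            // SAFETY: 0 is even.
            unsafe { Self::new_unchecked(0) }
        }

        pub fn bump(&mut self) {
            let next = self.value().wrapping_add(2);
            // SAFETY: an even number plus 2 is even, and wrapping at
            // 2^32 preserves parity.
            unsafe { self.set_value(next) }
        }

        pub fn get(&self) -> u32 {
            self.value()
        }
    }
}
```

Even in this small sketch the extra module, re-export and accessor boilerplate is apparent, and because every access goes through a method on the whole value, borrowing two fields independently is no longer possible.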
We hope that less friction and better tooling will make good field safety hygiene more common in the Rust ecosystem.
Rust's safety tooling ensures that unsafe operations may only occur in the lexical context of an unsafe block or function. If the safety obligations of an operation cannot be discharged entirely by an unsafe block, then the surrounding function must, itself, be unsafe. This tooling cue nudges programmers towards good function hygiene.
But, presently, it has a shortcoming: dangerous field uses are not linted against. The unsafe Vec::set_len method, for example, contains entirely safe code. There is no tooling cue that suggests this function should be unsafe; there is only programmer knowledge. Extending safety tooling to fields will close this gap.
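To illustrate the gap, consider a hypothetical fixed-capacity buffer, SmallBuf, written in today's Rust. It is a simplified stand-in and does not reflect Vec's real layout. The body of its set_len needs no unsafe block at all, so the compiler gives no hint that the function ought to be unsafe.

```rust
use std::mem::MaybeUninit;

/// Hypothetical fixed-capacity byte buffer.
pub struct SmallBuf {
    bytes: [MaybeUninit<u8>; 8],
    /// # Safety
    ///
    /// `len <= 8`, and `bytes[..len]` is initialized.
    len: usize,
}

impl SmallBuf {
    pub fn new() -> Self {
        Self { bytes: [MaybeUninit::uninit(); 8], len: 0 }
    }

    /// # Safety
    ///
    /// `new_len <= 8`, and `bytes[..new_len]` must be initialized.
    pub unsafe fn set_len(&mut self, new_len: usize) {
        // A plain field write is a safe operation, so nothing here
        // forces (or even suggests) the `unsafe` on the signature.
        self.len = new_len;
    }

    /// Appends a byte; returns `false` if the buffer is full.
    pub fn push(&mut self, byte: u8) -> bool {
        if self.len == 8 {
            return false;
        }
        self.bytes[self.len] = MaybeUninit::new(byte);
        self.len += 1;
        true
    }

    pub fn as_slice(&self) -> &[u8] {
        // SAFETY: by the invariant on `len`, the first `len` bytes are
        // initialized and in bounds.
        unsafe { std::slice::from_raw_parts(self.bytes.as_ptr().cast::<u8>(), self.len) }
    }
}
```

Deleting the unsafe keyword from set_len would still compile, yet it would let safe callers make as_slice read uninitialized memory.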
To evaluate the soundness of unsafe code (i.e., code which relies on safety invariants being upheld), it is not enough for reviewers to check the contents of unsafe blocks; they must check all places (including safe contexts) in which safety invariants might be violated. (See The Scope of Unsafe.) This is, in large part, because safety tooling does not extend to fields. Consequently, safety invariants may be violated at-a-distance in safe code, and safety audits must therefore carefully consider distant safe code.
Crates that practice good field safety hygiene will be easier to review. While reviewers must still ensure that fields which carry safety invariants are actually marked unsafe, having done so, they may largely limit their review to unsafe code and (in the absence of unsafe local bindings) safe code in the same function.
Copy gets this treatment; Unpin and UnwindSafe may also be compelling candidates.
The design of unsafe fields is guided by three tenets:
1. A field should be marked unsafe if it carries arbitrary library safety invariants with respect to its enclosing type.
2. Uses of unsafe fields which could violate their invariants must occur in the scope of an unsafe block.
3. Uses of unsafe fields which cannot violate their invariants should not require an unsafe block.
NonZeroU8 must never be 0.
str encapsulates valid UTF-8 bytes, and much of its API assumes this to be true. However, this invariant may be temporarily violated, so long as no code that assumes this safety invariant holds is invoked.
struct Foo(u8) implicitly introduces a function named Foo which consumes a u8 and produces a Foo.
A field should be marked unsafe if it carries library safety invariants with respect to its enclosing type.
This purpose is consistent with the purpose of the unsafe keyword in other declaration positions, where it signals to consumers of the unsafe item that their consumption is conditional on upholding safety invariants; for example:
- An unsafe trait denotes that it carries safety invariants which must be upheld by implementors.
- An unsafe function denotes that it carries safety invariants which must be upheld by callers.
A field carrying safety invariants should (not must) be marked unsafe.
We cannot programmatically enforce that fields which carry safety invariants are marked unsafe, just as we cannot enforce that functions with safety invariants are marked unsafe. The use of unsafe in declaration position is a social contract.
We also cannot immediately change Rust's social contract, since doing so would mean that code which is currently compliant with Rust's social contract (which does not and cannot require that unsafe fields are marked with unsafe) would cease to be compliant. At best, we may be able to evolve Rust's social contract over an edition boundary.
In the simplest case, a field's safety invariant is a restriction of the invariants imposed by the field type, and concerns only the immediate value of the field; e.g.:
    struct Alignment {
        /// SAFETY: `pow` must be between 0 and 29.
        pub unsafe pow: u8,
    }
A field might carry an invariant with respect to its referent; e.g.:
    use std::sync::Arc;

    struct CacheArcCount<T> {
        /// SAFETY: This `Arc`'s `ref_count` must equal the
        /// value of the `ref_count` field.
        unsafe arc: Arc<T>,
        /// SAFETY: See [`CacheArcCount::arc`].
        unsafe ref_count: usize,
    }
A field might carry an invariant with respect to data outside of the Rust abstract machine.
    use std::os::fd::OwnedFd;

    struct Zeroator {
        /// SAFETY: The fd points to a uniquely-owned file,
        /// and the bytes from the start of the file to the
        /// offset `cursor` (exclusive) are zero.
        unsafe fd: OwnedFd,
        /// SAFETY: See [`Zeroator::fd`].
        unsafe cursor: usize,
    }
A field safety invariant might also be a relaxation of the safety invariants imposed by the field type. For example, a str is bound by both the language safety invariant that it is initialized bytes, and by the library safety invariant that it contains valid UTF-8. It is sound to temporarily violate the library invariant of str, so long as the invalid str is not exposed to code that might assume str validity.
Below, MaybeInvalidStr encapsulates an initialized-but-potentially-invalid str as an unsafe field:
    struct MaybeInvalidStr<'a> {
        /// SAFETY: `maybe_invalid` may not contain valid
        /// UTF-8. It MUST always contain initialized
        /// bytes (per language safety invariant on `str`).
        pub unsafe maybe_invalid: &'a str,
    }
Field unsafety is orthogonal to field visibility. An unsafe field may be pub, just as a safe field may be pub(self); e.g.:
    struct NuclearBriefcase {
        /// Do not expose carelessly!
        pub(self) launch_code: [u8; 32],
    }
Uses of unsafe fields which could violate their invariants must occur in the scope of an unsafe block.
This requirement is consistent with the requirements of the unsafe keyword when applied to other declarations; for example:
- An unsafe trait may only be implemented with an unsafe impl.
- An unsafe function is only callable in the scope of an unsafe block.
These requirements are not negotiable; likewise, the requirement that risky operations on unsafe fields require unsafe should also be non-negotiable.
The implicit constructor of a struct or enum variant with an unsafe field must require unsafe. Writing, referencing or reading an unsafe field must require unsafe.
An unsafe field with a suspended invariant can only be read from its enclosing type if the reader respects that the value might be in an invalid state. This amounts to a safety invariant: if the value is in an invalid state, subsequent (potentially safe) uses must not require that it is in a valid state.
Consequently, reading an unsafe field must require unsafe.
A field with both a suspended invariant and a non-trivial Drop impl may not be sound to drop, as its Drop impl may depend on the value being in a valid state. Consequently, unsafe fields must either be Copy or ManuallyDrop, both of which preclude non-trivial drops.
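The ManuallyDrop half of this rule can be approximated in stable Rust. In the sketch below, Holder is a hypothetical type and its text field stands in for a field that would be marked unsafe under this RFC. Because the field is wrapped in ManuallyDrop, the compiler never drops it implicitly; the enclosing type decides when dropping is sound.

```rust
use std::mem::ManuallyDrop;

/// Hypothetical wrapper; `text` stands in for a field whose invariant
/// might be suspended at some point in its lifetime.
pub struct Holder {
    text: ManuallyDrop<String>,
}

impl Holder {
    pub fn new(s: &str) -> Self {
        Self { text: ManuallyDrop::new(s.to_owned()) }
    }

    pub fn len(&self) -> usize {
        self.text.len()
    }
}

impl Drop for Holder {
    fn drop(&mut self) {
        // SAFETY: `text` is in a valid state here and is never touched
        // again after this call.
        unsafe { ManuallyDrop::drop(&mut self.text) };
    }
}
```

Without the explicit Drop impl the String would simply leak, which is safe; the point is that any drop of the field now happens inside an unsafe block with a stated justification.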
Uses of unsafe fields which cannot violate their invariants should not require an unsafe block.
Given that the use of unsafe on fields is a social contract, adherence to that social contract will depend on the UX of using unsafe fields. We should take care to minimize how often users will be prompted to use unsafe for field accesses that clearly cannot violate the field's safety invariant.
Given an enum whose variants contain a mix of safe and unsafe fields; e.g.:
    enum Example {
        Safe(u8, u8, u8),
        Unsafe(u8, unsafe u8, u8),
    }
It should be safe to initialize and destructure Example::Safe, but not Example::Unsafe.
Fields with local, non-suspended invariants are potentially always safe to read. For example, consider reading out the field pow from Alignment:
    struct Alignment {
        /// SAFETY: `pow` must be between 0 and 29.
        pub unsafe pow: u8,
    }
Outside of the context of Alignment, u8 has no special meaning. It has no library safety invariants (and thus no library safety invariants that might be suspended by the field pow), and it is not a pointer or handle to another resource.
The set of safe-to-read types, \(S\), includes:
A type-directed analysis could make reads of these field types safe.
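The asymmetry between reading and writing such a field can be approximated in stable Rust with a private field. This is only a sketch: the method names new, new_unchecked and pow are assumptions for illustration, not an API proposed by this RFC.

```rust
/// Stable-Rust approximation of `Alignment`: the field is private
/// rather than marked `unsafe`.
pub struct Alignment {
    /// # Safety
    ///
    /// `pow` must be between 0 and 29.
    pow: u8,
}

impl Alignment {
    /// Checked constructor: rejects values that break the invariant.
    pub fn new(pow: u8) -> Option<Self> {
        if pow <= 29 { Some(Self { pow }) } else { None }
    }

    /// # Safety
    ///
    /// `pow` must be between 0 and 29.
    pub unsafe fn new_unchecked(pow: u8) -> Self {
        Self { pow }
    }

    /// Reading is safe: copying out a plain `u8` cannot invalidate the
    /// invariant, and the copy carries no obligations of its own.
    pub fn pow(&self) -> u8 {
        self.pow
    }
}
```

Here the safe getter plays the role that a safe read of a field whose type is in \(S\) would play under a type-directed analysis, while construction without a check stays behind unsafe.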
The unsafe modifier is applicable to the fields of struct-like variants; e.g.:
    struct ExampleStruct {
        a: u8,
        unsafe b: u8,
    }
    enum ExampleEnum {
        UnitLike,
        TupleLike(u8, u8),
        StructLike {
            a: u8,
            unsafe b: u8,
        },
    }
    union ExampleUnion {
        a: u8,
        unsafe b: u8,
    }
It is not applicable to tuple-like variants, as this would admit ambiguous parses; e.g.:
    struct ExampleAmbiguous(unsafe fn());
Copy
The Copy trait is, semantically, an unsafe trait whose safety contract is that all members must be Copy. However, it is not marked unsafe since the compiler enforces this condition automatically on all implementations.
The introduction of unsafe fields creates a declaration-site unsafe obligation (namely, that reading is unsafe) that would not be discharged at any use site in a Copy impl (which has no methods that would mention the unsafe fields).
To resolve this, we make Copy conditionally (un)safe: If Self contains unsafe fields, Copy is unsafe to implement; otherwise it remains safe to implement.
If a type has unsafe fields, its safety invariants are not simply the conjunction of its field types' safety invariants. Consequently, it's invalid to reason about the safety properties of these types in a purely structural manner (i.e., the manner in which auto traits are implemented). Auto implementations of unsafe auto traits should therefore not be generated for types with unsafe fields.
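Today's Rust already behaves this way for one class of fields: a raw pointer field suppresses the auto implementations of Send and Sync, and the author must opt in with an explicit unsafe impl. The sketch below uses a hypothetical type, Owned, to show what that opt-in looks like; under this RFC, types with unsafe fields would be handled analogously.

```rust
/// Hypothetical owner of a heap allocation, stored as a raw pointer.
pub struct Owned {
    /// # Safety
    ///
    /// `ptr` came from `Box::into_raw` and is uniquely owned by `self`.
    ptr: *mut u32,
}

impl Owned {
    pub fn new(value: u32) -> Self {
        Self { ptr: Box::into_raw(Box::new(value)) }
    }

    pub fn get(&self) -> u32 {
        // SAFETY: by the field invariant, `ptr` is valid for reads.
        unsafe { *self.ptr }
    }
}

impl Drop for Owned {
    fn drop(&mut self) {
        // SAFETY: `ptr` came from `Box::into_raw` and is reclaimed
        // exactly once, here.
        unsafe { drop(Box::from_raw(self.ptr)) };
    }
}

// No auto impl of `Send` is generated because of the raw pointer, so
// the author states and justifies it explicitly.
// SAFETY: `Owned` uniquely owns its `u32`, so moving it to another
// thread cannot introduce aliasing.
unsafe impl Send for Owned {}

/// Compiles only if `T: Send`.
pub fn require_send<T: Send>(_: &T) {}
```

Removing the unsafe impl line makes any attempt to move an Owned into another thread fail to compile, which is the conservative default that the paragraph above argues for.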
unsafe(invalid)