Field projections are a way to turn a pointer to a struct into a pointer to a field of that struct. The definition of "pointer" is rather broad: it includes any smart pointer type or reference as well as raw pointers. Essentially, the operation is adding the field offset to the pointer and then casting the pointer to the type of the field in the struct.
The main motivation for field projections is to make pin projections ergonomic. However, they also make it possible to express other operations succinctly, such as writing addr_of!((*ptr).field) / &raw const (*ptr).field as ptr->field. In addition, they often come up in Rust for Linux as a way to safely wrap an existing C API.
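To make the "add the field offset, then cast" description concrete, here is a minimal sketch in today's Rust for a raw pointer. The function names `project_by_offset`, `project_by_place`, and `projections_agree` are made up for this illustration, and the structs are trimmed-down copies of the ones used in the examples of this document. It only shows what `ptr->cfg` would stand for on `*const Data`; it is not how the feature would be implemented.

```rust
use std::mem::offset_of;
use std::ptr::addr_of;

// Trimmed-down copies of the example structs, so that this block stands alone.
#[allow(dead_code)]
struct Config {
    name: &'static str,
    port: u16,
}

#[allow(dead_code)]
struct Data {
    cfg: Config,
    items: Vec<i32>,
}

/// Projects by hand: add the offset of `cfg` to the pointer, then cast.
///
/// # Safety
/// `ptr` must point into an allocation that is valid for a whole `Data`.
unsafe fn project_by_offset(ptr: *const Data) -> *const Config {
    unsafe { ptr.cast::<u8>().add(offset_of!(Data, cfg)).cast::<Config>() }
}

/// The same projection written as a place expression, as one does today.
///
/// # Safety
/// Same requirement as `project_by_offset`.
unsafe fn project_by_place(ptr: *const Data) -> *const Config {
    unsafe { addr_of!((*ptr).cfg) }
}

/// Hypothetical helper: checks that both ways yield the same field pointer.
fn projections_agree() -> bool {
    let data = Data {
        cfg: Config { name: "demo", port: 8080 },
        items: vec![1, 2, 3],
    };
    let ptr: *const Data = &data;
    let by_offset = unsafe { project_by_offset(ptr) };
    let by_place = unsafe { project_by_place(ptr) };
    by_offset == by_place && unsafe { (*by_offset).port } == 8080
}
```

Both functions need `unsafe` and repeat the field name inside a macro or place expression, which is the boilerplate that `ptr->cfg` is meant to remove.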
In this document, all of the main use cases for abstract field projections are explained, and examples outline what they would look like. The word "abstract" is used to convey that there might be additional use cases for the specific implementation of field projections. An implementation using a compiler-internal type for every field of a struct will be explained in a future document. That approach has other potential use cases, also in the context of Rust for Linux.
This document assumes that a new operator "->" will be introduced solely for field projections. This new operator is crucial in ensuring the ergonomics of this new feature.
While any kind of feedback is of course welcome, this document is only concerned with conveying the use cases for field projections, and thus the motivation for bringing them into the language. Discussion of implementation, edge cases, and general bikeshedding should be postponed until the RFC is updated.
Throughout the examples, we will use the following struct definitions:
struct Data {
cfg: Config,
items: Vec<i32>,
}
struct Config {
name: &'static str,
port: u16,
stats: StatsConfig,
}
struct StatsConfig {
level: u8,
}
There is some low-hanging fruit that we get in addition to the more complicated use cases given below. While not the main motivation, these simple cases might help you understand the concept of field projection better. Additionally, the field projection operator -> will make these use cases much more ergonomic than they are at the moment.
All pointers listed below have the same behavior: the projection operation just returns a pointer to the projected field. So given a variable ptr: P<Data> (where P<T> is one of the pointer types listed below), one can use ptr->cfg to get a P<Config>.
&T
&mut T
*const T
*mut T
NonNull<T>
cell::Ref<'a, T>
cell::RefMut<'a, T>
Users should be able to define their own.
impl Data {
unsafe fn raw_init(ptr: NonNull<Self>) {
unsafe {
let cfg: NonNull<Config> = ptr->cfg;
let cfg: *mut Config = cfg.as_ptr();
Config::raw_init(cfg);
ptr->items.write(vec![]);
}
}
}
impl Config {
unsafe fn raw_init(ptr: *mut Self) {
unsafe {
ptr->port.write(8080);
ptr->name.write("no name configured");
StatsConfig::raw_init(ptr->stats);
}
}
}
impl StatsConfig {
unsafe fn raw_init(ptr: *mut Self) {
unsafe {
ptr->level.write(0);
}
}
}
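For comparison, this is roughly how the `Config` part of the raw initialization above has to be written today. It is a sketch under the assumption that `addr_of_mut!` is used for every field; the names `raw_init_config` and `init_values` are hypothetical and only serve this illustration.

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

struct StatsConfig {
    level: u8,
}

struct Config {
    name: &'static str,
    port: u16,
    stats: StatsConfig,
}

/// Initializes every field of `*ptr` without ever creating a reference
/// to the (still uninitialized) `Config`.
///
/// # Safety
/// `ptr` must be valid for writes of a whole `Config`.
unsafe fn raw_init_config(ptr: *mut Config) {
    unsafe {
        addr_of_mut!((*ptr).port).write(8080);
        addr_of_mut!((*ptr).name).write("no name configured");
        addr_of_mut!((*ptr).stats.level).write(0);
    }
}

/// Hypothetical helper: runs the initializer and reports the field values.
fn init_values() -> (&'static str, u16, u8) {
    let mut slot = MaybeUninit::<Config>::uninit();
    // SAFETY: `slot` is valid for writes, and `raw_init_config` writes
    // every field before `assume_init` is called.
    let cfg = unsafe {
        raw_init_config(slot.as_mut_ptr());
        slot.assume_init()
    };
    (cfg.name, cfg.port, cfg.stats.level)
}
```

Each `addr_of_mut!((*ptr).field)` here corresponds to one `ptr->field` in the version above.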
RefCell
let cell = RefCell::new(Data { /* ... */ });
let cfg = cell.borrow_mut()->cfg;
*cfg->port = 42;
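Today the same effect is available through `RefMut::map`, one closure per field. The sketch below uses a reduced `Config` with only a `port` field, and the helper names `set_port` and `port_after_update` are made up for this example.

```rust
use std::cell::{RefCell, RefMut};

struct Config {
    port: u16,
}

#[allow(dead_code)]
struct Data {
    cfg: Config,
    items: Vec<i32>,
}

/// Narrows the `RefMut<Data>` down to the `port` field step by step.
fn set_port(cell: &RefCell<Data>, port: u16) {
    let data: RefMut<'_, Data> = cell.borrow_mut();
    // Corresponds to `data->cfg` in the proposed syntax.
    let cfg: RefMut<'_, Config> = RefMut::map(data, |d| &mut d.cfg);
    // Corresponds to `cfg->port`.
    let mut port_ref: RefMut<'_, u16> = RefMut::map(cfg, |c| &mut c.port);
    *port_ref = port;
}

/// Hypothetical helper: updates the port and reads it back.
fn port_after_update(new_port: u16) -> u16 {
    let cell = RefCell::new(Data {
        cfg: Config { port: 8080 },
        items: Vec::new(),
    });
    set_port(&cell, new_port);
    let port = cell.borrow().cfg.port;
    port
}
```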
A "container" is a generic repr(transparent) type that has the same layout as one of its generic parameters. Containers often add or remove "properties" of the wrapped value. Since they affect every field of a struct in the same way, projecting through them is possible. Every container type C<T> can be projected through any previously mentioned pointer type: given ptr: P<C<Data>>, we have ptr->cfg: P<C<Config>>. Container types are:
MaybeUninit<T>
UnsafeCell<T>
Cell<T>
Users should be able to define their own.
fn set_stats_level(data: &Cell<Data>, level: u8) {
data->cfg->stats->level.set(level);
}
fn safer_init(data: &mut MaybeUninit<Data>) {
data->cfg->stats->level.write(0);
data->cfg->port.write(8080);
data->cfg->name.write("no name configured");
data->items.write(vec![]);
}
// doesn't fit on the stack!
struct BigData {
data: [u8; 1024 * 1024 * 1024],
}
fn init_big_data_to_zero(data: &mut MaybeUninit<BigData>) {
let ptr: *mut [u8; 1024 * 1024 * 1024] = data->data.as_mut_ptr();
unsafe { ptr.write_bytes(0, 1) };
}
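To illustrate why projecting through a container is possible, here is a hand-written version of the `&Cell<Data>` to `&Cell<u8>` projection used by `set_stats_level` above. It is a sketch that relies on `Cell<T>` having the same in-memory representation as `T`; the function `project_level` and the helper `level_after_set` are hypothetical names, and the structs are reduced to the fields on the projected path.

```rust
use std::cell::Cell;
use std::ptr::addr_of_mut;

struct StatsConfig {
    level: u8,
}

struct Config {
    stats: StatsConfig,
}

struct Data {
    cfg: Config,
}

/// Hand-written stand-in for `data->cfg->stats->level`.
fn project_level(data: &Cell<Data>) -> &Cell<u8> {
    // SAFETY: `as_ptr` points to a live `Data`; only a field address is
    // computed here, nothing is read or written.
    let field: *mut u8 = unsafe { addr_of_mut!((*data.as_ptr()).cfg.stats.level) };
    // SAFETY: `Cell<u8>` has the same layout as `u8`, the pointer comes from
    // a live `&Cell<Data>`, and `Cell` never hands out references to its
    // contents, so viewing the field as a `Cell<u8>` creates no conflicts.
    unsafe { &*field.cast::<Cell<u8>>() }
}

fn set_stats_level(data: &Cell<Data>, level: u8) {
    project_level(data).set(level);
}

/// Hypothetical helper: sets the level through the projection and reads it back.
fn level_after_set(level: u8) -> u8 {
    let data = Cell::new(Data {
        cfg: Config {
            stats: StatsConfig { level: 0 },
        },
    });
    set_stats_level(&data, level);
    data.into_inner().cfg.stats.level
}
```

The two `unsafe` blocks are the part that a built-in projection would take over, for every field and every container at once.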
In addition to the simple use cases above, field projection will make these more complicated operations not only safe, but also ergonomic. These more complex use cases require a way to mark the fields of a struct in a particular way.
Pin<P>
Fields of a struct can be structurally pinned. If the field cfg is structurally pinned, then given ptr: Pin<P<Data>>, projection yields ptr->cfg: Pin<P<Config>>. If cfg is not structurally pinned, we get ptr->cfg: P<Config>, so the wrapper type Pin is only preserved when the field is structurally pinned. One way of marking the fields would be to annotate them with #[pin].
This complicates the idea of field projection quite a lot: the return type of the -> operator now depends on a property of the field. With this added complexity, we also gain new expressiveness that Rust for Linux can take advantage of.
struct FairRaceFuture<F1, F2> {
#[pin]
f1: F1,
#[pin]
f2: F2,
fair: bool,
}
impl<F1, F2> Future for FairRaceFuture<F1, F2>
where
F1: Future,
F2: Future<Output = F1::Output>,
{
type Output = F1::Output;
fn poll(self: Pin<&mut Self>, ctx: &mut Context<'_>) -> Poll<Self::Output> {
// Since `fair` is not marked with `#[pin]`, we don't get a pinned reference here
let fair: &mut bool = self->fair;
*fair ^= true;
if *fair {
// For `f1` we do get a pinned reference, since `f1` is annotated with `#[pin]`.
let f1: Pin<&mut F1> = self->f1;
match f1.poll(ctx) {
Poll::Ready(value) => Poll::Ready(value),
Poll::Pending => self->f2.poll(ctx),
}
} else {
match self->f2.poll(ctx) {
Poll::Ready(value) => Poll::Ready(value),
Poll::Pending => self->f1.poll(ctx),
}
}
}
}
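For comparison, the two kinds of projection used in `poll` above look roughly like this when written by hand in today's Rust. This is a sketch with a reduced, hypothetical struct `Race<F>`; in the helper `toggle_and_read` a plain `u32` stands in for a future so that no executor is needed.

```rust
use std::pin::Pin;

struct Race<F> {
    f1: F,      // treated as structurally pinned
    fair: bool, // not structurally pinned
}

impl<F> Race<F> {
    /// Stand-in for `self->f1`: the `Pin` wrapper is preserved.
    fn project_f1<'a>(self: Pin<&'a mut Self>) -> Pin<&'a mut F> {
        // SAFETY: `f1` is never moved out of a pinned `Race`, and this is
        // the only way this sketch hands out access to it.
        unsafe { self.map_unchecked_mut(|this| &mut this.f1) }
    }

    /// Stand-in for `self->fair`: the `Pin` wrapper is dropped.
    fn project_fair<'a>(self: Pin<&'a mut Self>) -> &'a mut bool {
        // SAFETY: `fair` is never considered pinned, so a plain `&mut bool`
        // cannot be used to move pinned data.
        unsafe { &mut self.get_unchecked_mut().fair }
    }
}

/// Hypothetical helper: toggles `fair` and reads the pinned field.
fn toggle_and_read() -> (bool, u32) {
    let mut race = Box::pin(Race { f1: 5u32, fair: false });
    *race.as_mut().project_fair() ^= true;
    let f1: Pin<&mut u32> = race.as_mut().project_f1();
    let value = *f1;
    (race.fair, value)
}
```

Which of the two functions is correct for a given field is exactly the per-field property that the `#[pin]` annotation would communicate to the `->` operator.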
RCU stands for read, copy, update. It is a creative locking mechanism that is very efficient for data that is seldom updated but read very often. Below you can find a short summary of how I understand it to work. I cannot guarantee that it is 100% correct; if you want to make sure that you have a correct understanding of how RCU works, please read the sources provided in the next section.
It takes quite a lot of explanation before I can express why field projection comes up in this instance. However, in this case (similar to Pin) it is, to my knowledge, impossible to write a safe API without field projections, so they would be invaluable for this use case.
For a much more extensive explanation, please see https://docs.kernel.org/RCU/whatisRCU.html. Since the first paragraph of the first section is invaluable in understanding RCU, it is quoted here for the reader's convenience:
The basic idea behind RCU is to split updates into “removal” and “reclamation” phases. The removal phase removes references to data items within a data structure (possibly by replacing them with references to new versions of these data items), and can run concurrently with readers. The reason that it is safe to run the removal phase concurrently with readers is the semantics of modern CPUs guarantee that readers will see either the old or the new version of the data structure rather than a partially updated reference. The reclamation phase does the work of reclaiming (e.g., freeing) the data items removed from the data structure during the removal phase. Because reclaiming data items can disrupt any readers concurrently referencing those data items, the reclamation phase must not start until readers no longer hold references to those data items.
In C, RCU is used like this:
Readers use the rcu_read_lock() and rcu_read_unlock() functions when accessing any data protected by RCU. Within this critical section, blocking is forbidden.
Readers load protected pointers via rcu_dereference(<pointer>).
Writers publish a new pointer via rcu_assign_pointer(<old-pointer>, <new-pointer>).
Before reclaiming the old data, writers call synchronize_rcu().
synchronize_rcu() waits for all existing read-side critical sections to complete. It does not have to wait for new read-side critical sections that begin after it has been called.
The big advantage of RCU is that in certain kernel configurations, (un)locking the RCU read lock is achieved with absolutely no instructions.
In Rust, we will of course use a guard for the RCU read lock, so we have
mod rcu {
pub struct Guard(/* ... */);
impl Drop for Guard { /* ... */ }
pub fn read_lock() -> Guard;
}
The pointers that are protected by RCU must be specially tagged, so we introduce the Rcu type. It exposes the Rust equivalents of rcu_dereference and rcu_assign_pointer:
mod rcu {
pub struct Rcu<P> {
inner: UnsafeCell<P>,
// we require this to opt-out of uniqueness of `&mut`.
// if `UnsafePinned` were available, we would use that instead.
_phantom: PhantomPinned,
}
impl<P: Deref> Rcu<P> {
pub fn read<'a>(&'a self, _guard: &'a Guard) -> &'a P::Target;
pub fn set(self: Pin<&mut Self>, new: P) -> Old<P>;
}
pub struct Old<P>(/* ... */);
impl<P> Drop for Old<P> {
fn drop(&mut self) {
unsafe { bindings::synchronize_rcu() };
}
}
}
The Old type is responsible for calling synchronize_rcu before dropping the old value.
Note that set takes a pinned mutable reference to Rcu. It might not be obvious why pinning is involved here, so this deserves an explanation. Firstly, we need to take a mutable reference, since writers still need to be synchronized with each other. Secondly, since there may still be concurrent shared references, we must not allow users to use mem::swap, since that would change the value without the required compiler and CPU barriers in place.
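The effect of pinning on mem::swap can be seen in isolation with a small sketch. The type `Slot` and the functions below are hypothetical and have nothing RCU-specific in them; they only show that a plain `&mut` permits swapping, while `Pin<&mut T>` for a type containing `PhantomPinned` does not give out the `&mut T` that swapping would need.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

/// Stand-in for a type that must not be overwritten by a plain move.
struct Slot {
    value: u32,
    _pin: PhantomPinned,
}

/// With plain `&mut` access, nothing stops a caller from swapping values.
fn swap_plain(a: &mut u32, b: &mut u32) {
    std::mem::swap(a, b);
}

/// With `Pin<&mut Slot>`, shared access through `Deref` still works, but
/// `std::mem::swap(&mut *slot, ..)` would not compile: `Pin<&mut Slot>`
/// only implements `DerefMut` when `Slot: Unpin`, which `PhantomPinned`
/// rules out.
fn read_pinned(slot: Pin<&mut Slot>) -> u32 {
    slot.value
}

/// Hypothetical helper: exercises both functions.
fn swap_and_read() -> (u32, u32, u32) {
    let (mut a, mut b) = (1, 2);
    swap_plain(&mut a, &mut b);
    let mut slot = Box::pin(Slot { value: 3, _pin: PhantomPinned });
    (a, b, read_pinned(slot.as_mut()))
}
```

With this restriction in place, `set` remains the only safe way to change the value behind a `Pin<&mut Rcu<P>>`.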
Now to the crux of the issue and why field projection comes up here: we have to wrap data that is protected by RCU with a lock. However, locks do not allow access to the inner value without locking (that's kind of their whole point…). So we need a way to get to the Rcu<P> without taking the lock. Using field projection, we would allow projections for fields of type Rcu from &Lock to &Rcu<P>.
This way, readers can use field projection together with the Rcu::read function, and writers can continue to lock the lock and then use Rcu::set.
Please read the previous section to understand the RCU API in Rust.
struct BufferConfig {
flush_sensitivity: u8,
}
struct Buffer {
// We also require `Rcu` to be pinned, because `&mut Rcu` must not exist (otherwise one could
// call mem::swap).
#[pin]
cfg: Rcu<Box<BufferConfig>>,
buf: Vec<u8>,
}
struct MyDriver {
// The `Mutex` in the kernel needs to be pinned
#[pin]
buf: Mutex<Buffer>,
}
impl MyDriver {
fn set_buffer_config(&self, flush_sensitivity: u8) {
let mut guard: Pin<MutexGuard<'_, Buffer>> = self.buf.lock();
let buf: Pin<&mut Buffer> = guard.as_mut();
// We can use pin-projections since we marked `cfg` as `#[pin]`
let cfg: Pin<&mut Rcu<Box<BufferConfig>>> = buf->cfg;
cfg.set(Box::new(BufferConfig { flush_sensitivity }));
}
fn buffer_config<'a>(&'a self, rcu_guard: &'a rcu::Guard) -> &'a BufferConfig {
let buf: &Mutex<Buffer> = &self.buf;
// Here we use the special projections set up for `Mutex` with fields of type `Rcu<T>`
let cfg: &Rcu<Box<BufferConfig>> = buf->cfg;
cfg.read(rcu_guard)
}
fn read_to_buffer(&self, data: &[u8]) -> Result {
let mut buf: Pin<MutexGuard<'_, Buffer>> = self.buf.lock();
// This method allocates, so it must be fallible.
// `buf.as_mut()->buf` again uses the field projection for `Pin` to yield a `&mut Vec<u8>`.
buf.as_mut()->buf.extend_from_slice(data)
}
}
Rust for Linux would heavily utilize field projections for:
Initializing &mut MaybeUninit<T> in place without overflowing the stack.
Avoiding unsafe and having to use addr_of!((*ptr).field) / &raw const (*ptr).field.
Projecting through UnsafeCell<T>.
Projecting through VolatileMem<T>.
The untrusted data patch series introduces the Untrusted<T> type. It is used to mark data from userspace or hardware as untrusted. Kernel developers are supposed to validate such data before it is used to drive logic within the kernel.
One use case of untrusted data will be ioctls. They are discussed in this reply (code slightly adapted):
Example in pseudo-rust:
struct IoctlParams {
input: u32,
output: u32,
}
The thing is that ioctl that use the struct approach like drm does, use the same struct if there's both input and output parameters, and furthermore we are not allowed to overwrite the entire struct because that breaks ioctl restarting. So the flow is roughly
let userptr: UserSlice;
let params: Untrusted<IoctlParams>;
userptr.read(params);
// validate params, do something interesting with it params.input
// this is _not_ allowed to overwrite params.input but must leave it
// unchanged
params.write(|x| { x.output = 42; });
userptr.write(params);
Your current write doesn't allow this case, and I think that's not good enough. The one I proposed in private does:
Untrusted<T>::write(&mut self, impl Fn(&mut T))
Importantly, we would like to overwrite only the output field of the IoctlParams struct. This is exactly the pattern that field projections can help with: instead of exposing a mutable reference to the untrusted data via the write function, we can have:
impl<T> Untrusted<T> {
fn write(&mut self, value: T);
}
This is combined with allowing projections from Untrusted<IoctlParams> to Untrusted<u32>, so that a caller can write to the output field alone.
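A minimal sketch of that combination in today's Rust is given below. Everything except the `write` signature is an assumption made for illustration: the `repr(transparent)` layout of this stand-in `Untrusted`, the constructor `new`, the hand-written `project_output` (which plays the role of `params->output`), and the helper `overwrite_output_only`. The actual type in the patch series may look different.

```rust
/// Stand-in for the kernel's `Untrusted<T>`; the layout is an assumption.
#[repr(transparent)]
struct Untrusted<T>(T);

impl<T> Untrusted<T> {
    /// Hypothetical constructor, only needed to make the sketch runnable.
    fn new(value: T) -> Self {
        Untrusted(value)
    }

    /// Overwrites the whole value without exposing `&mut T` to the caller.
    fn write(&mut self, value: T) {
        self.0 = value;
    }
}

struct IoctlParams {
    input: u32,
    output: u32,
}

/// Hand-written stand-in for `params->output`.
fn project_output(params: &mut Untrusted<IoctlParams>) -> &mut Untrusted<u32> {
    let field: *mut u32 = &mut params.0.output;
    // SAFETY: `Untrusted<u32>` is `repr(transparent)` over `u32`, and the
    // pointer is derived from a unique borrow that lasts as long as the
    // returned reference.
    unsafe { &mut *field.cast::<Untrusted<u32>>() }
}

/// Hypothetical helper: writes only `output` and reports both fields.
fn overwrite_output_only(input: u32, new_output: u32) -> (u32, u32) {
    let mut params = Untrusted::new(IoctlParams { input, output: 0 });
    project_output(&mut params).write(new_output);
    (params.0.input, params.0.output)
}
```

The caller never sees a `&mut IoctlParams`, and `input` is left untouched, which is the property the ioctl flow above asks for.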
This document does not consider:
Cow<'_, T>
They will be considered when writing the RFC (either as part of the feature or as future possibilities).