owned this note
owned this note
Published
Linked with GitHub
---
title: "RfL meeting 2024-03-06: Arc in the Linux Kernel"
tags: ["T-lang", "design-meeting", "minutes"]
date: 2024-03-06
discussion: https://rust-lang.zulipchat.com/#narrow/stream/410673-t-lang.2Fmeetings/topic/RfL.20meeting.202024-03-06
url: https://hackmd.io/OCz8EfzrRXeogXEDcOrL2w
---
# Arc in the Linux Kernel
This document outlines how the Linux Kernel is using the unstable features arbitrary-self-types and dispatch_from_dyn/unsize.
But first, an introduction to the custom types that the Kernel is using.
(The pre-meeting version of this document is here: https://hackmd.io/cuKNW7xxRzGsOUMiThs73w)
## The Kernel's custom Arc
The Linux Kernel needs to use a custom implementation of `Arc`. The most important reason is that we need to use the Kernel's `refcount_t` type for the atomic instructions on the refcount. There are two reasons for this:
1. The standard Rust `Arc` will call `abort` on overflow. This is not acceptable in the Kernel; instead we want to saturate the count when it hits `isize::MAX`. This effectively leaks the `Arc`.
2. Using Rust atomics raises various issues with the memory model. We are using the LKMM rather than the usual C++ model, which means that all atomic operations should be implemented with an `asm!` block or similar that matches what Kernel C does, rather than an LLVM intrinsic.
We also make a few other changes to our `Arc`:
1. We need to interact with a lot of different C types that need to be pinned, so our custom `Arc` is implicitly pinned.
2. We do not need weak references, so our refcount can be half the size.
Our `Arc` also comes with two utility types:
1. `ArcBorrow<'a, T>`. Similar to `&'a Arc<T>`, but only one level of indirection.
2. `UniqueArc<T>`. Mutable access to an `Arc`. Used to split allocation and initialization into two steps, which is important since we cannot allocate memory while holding a spinlock.
## Intrusive linked lists
The Kernel uses a lot of intrusive linked lists, which are extremely rare in userspace Rust. This is a consequence of a unique limitation in kernel code related to memory allocations:
1. Memory allocations are always fallible and failures must be handled gracefully.
2. When you are in an atomic context (e.g. when holding a spinlock), you are not allowed to allocate memory at all.
(Technically there are special ways to allocate memory in atomic context, but it should be used sparingly.)
These limitations greatly affect how we design code in the kernel. There are some functions where having a failure path is not acceptable (e.g. destroying something), and other places where we cannot allocate at all. This means that we need data structures that do not need to allocate. Or where the allocation and insert steps are separate. Imagine a map protected by a spinlock. How do you implement that if `insert` simply cannot allocate memory?
One answer to this is to use a linked list (and similar, e.g. a red/black tree can work with the same idea). The value you wish to insert takes this form:
```rs=
struct MyValue {
// In practice, these are wrapped into one field using a struct.
next: *mut MyValue,
prev: *mut MyValue,
foo: Foo,
bar: Bar,
}
```
Then, given an `Arc<MyValue>`, you can insert that into a linked list without having to allocate memory. The only thing you have to do is adjust the `next`/`prev` pointers.
Additionally, there are a bunch of C apis that work using the same principle, so we are also forced into this pattern when we want to use those C apis. For example, this includes the workqueue which stores the list of tasks to run in a linked list.
### The ListArc type
You may have noticed one problem with the above design: The value we are inserting is an `Arc<MyValue>`, so how can you get mutable access to `next`/`prev`? And how do you know that it's not already in a linked list? What about data races — someone could attempt to push (two clones of) the same `Arc<MyValue>` to two different linked lists, which would constitute a data race on the `next`/`prev` fields.
You could solve these issues by adding an `AtomicBool` for keeping track of whether it is in a list, but this isn't great. We really want to avoid the `AtomicBool`.
Our answer is another custom smart pointer type: `ListArc`. The `ListArc` type is just a newtype wrapper around `Arc` with the invariant that each `MyStruct` has at most one `ListArc` reference. However, unlike `UniqueArc`, you are allowed to have `Arc` references to a value that also has an `ListArc`. This way, the `ListArc` reference can be given exclusive access to the `next`/`prev` fields, which is enough to design a safe API for intrusive linked lists containing reference counted values.
One consequence of this is that (unlike `Arc`), we are using a smart pointer where ownership of the pointer is extremely important. You cannot just `clone` a `ListArc`.
Our `ListArc` type also has a second generic parameter, which allows you to have multiple `next`/`prev` pairs. So `ListArc<T, 1>` has exclusive access to the first `next`/`prev` pair, and `ListArc<T, 2>` has exclusive access to the second such pair. This means that you can have multiple list arcs as long as their parameter is different (one per value of the extra generic parameter).
### Prev/next pointers and dynamic dispatch
We want to be able to use linked lists with `dyn Trait`. However, the offset of the `next`/`prev` fields needs to be uniform no matter what the concrete type is. To do that, we use a wrapper type:
```rust=
struct Wrapper<T: ?Sized> {
next: *mut Wrapper<T>,
prev: *mut Wrapper<T>,
value: T,
}
```
And the actual type ends up being `Arc<Wrapper<dyn Trait>>`.
## Arbitrary self types
We wish to use arbitrary self types in various places. Some examples:
1. Many methods need to call `self.clone()` and get a new `Arc` to the current struct. To do that, we need `self: &Arc<T>` or `self: ArcBorrow<'_, T>`.
2. We often need to do `linked_list.push(self)`. To do that, we need `self: ListArc<T>`.
For the struct methods, we _could_ work around this by not using `self` parameters, and calling `MyStruct::foo(my_arc)`. However, we also need to do these things in traits where we perform dynamic dispatch on the value. For example:
```rs=
trait WorkItem {
fn run1(self: ListArc<Self>);
// or, actually:
fn run2(self: ListArc<Wrapper<Self>>);
}
```
This use-case needs both arbitrary self types and the dynamic dispatch feature mentioned in the next section. Arbitrary self types are needed because dynamic dispatch is only performed on self parameters.
## Dynamic dispatch
We wish to use these linked lists to store dynamic trait objects. This is used for a "todo list" of events that need to be delivered to userspace. There are many different event types, and we use a trait object to store a queue of them.
There is a need to have both `Arc<MyStruct>` and `Arc<dyn MyTrait>` references to the same object.
All of the smart pointers that we want to use dynamic dispatch with are newtype wrappers around either `NonNull` or other smart pointers (e.g. `ListArc` is a wrapper around `Arc`). They may also have a `PhantomData` field. These requirements match what is listed on [`DispatchFromDyn`](https://doc.rust-lang.org/stable/std/ops/trait.DispatchFromDyn.html).
A related feature is the [`Unsize`](https://doc.rust-lang.org/stable/std/marker/trait.Unsize.html) trait. Most likely, adding the `DispatchFromDyn` trait depends on also having `Unsize`, so we need it for that reason. But we do not otherwise need the `Unsize` trait.
### Stabilizing part of the feature
Stabilizing the entire `DispatchFromDyn` trait may not be necessary for stabilizing enough for what the kernel needs. One suggestion by compiler-errors is as follows:
```rs=
#[derive(SmartPointer)]
struct Foo<#[pointee] T> {
ptr: *const T,
}
```
As an initial solution, we could stabilize a derive macro for creating smart pointers without stabilizing the underlying `DispatchFromDyn` and `Unsize` traits themselves.
### C-to-Rust Dynamic dispatch
The kernel also has some other uses of dynamic dispatch that trait objects don't help with. Mainly, these are cases where C defines a vtable using a struct with function pointers. Here, we must match the vtable layout that C dictates, so we will manually implement the vtable unsafely to handle these cases. We still use a trait for providing a safe API to these things, but the implementation needs only generics and not trait objects.
However, for Rust-to-Rust dynamic dispatch, trait objects appear to satisfy our needs. As long as we are able to use them with our smart pointers, that is.
---
# Discussion
## Attendance
- People: TC, nikomatsakis, Josh, Alic Ryhl, Andreas Hindborg, Benno Lossin, Boqun Feng, Gary Guo, Lukas Wirth, Miguel Ojeda, Nadri, Trevor Gross
## Meeting roles
- Minutes, driver: TC
## Will the `derive(SmartPointer)` proposal work for Rust-for-Linux?
Josh: Would like to check whether providing this in stable Rust seems likely to meet Rust-for-Linux's needs, without needing to stabilize the full `DispatchFromDyn`. `derive(SmartPointer)` could enforce the requirements needed for that to be sound.
Gary: I guess that depends on the detail of the `derive(SmartPointer)`. Will it work for all cases covered by today's `DispatchFromDyn` and `CoerceUnsized` implemented-together? I.e. will it cover both raw pointers, `NonNull`s and types that are themselves wrapper of pointer types?
Josh: We were I think hesitant to stabilize `DispatchFromDyn` due to the large number of unsafe criteria that have to be satisfied, and whether this was the right interface. Adding a `derive` here could allow us to enforce those.
Alice: What we really need here are raw pointers to work with dynamic dispatch.
Josh: We can iterate with Rust-for-Linux folks to try to cover the specific needs.
## Examples?
nikomatsakis: I think it'd be really useful to have some medium-length snippet of code (or multiple small snippets) showing how these arcs are used in practice. Like how they're allocated, inserted, etc.
Alice: The binder driver is a good example of this. The process has references to all of the threads. Then it has a stack (list) of threads ("ready threads") that can receive incoming transactions right now.
Alice: The key use case here are events that we need to insert and remove from lists, e.g. "deliver to read".
NM:
Patterns Niko wants to see:
* inserting onto a list ("ready threads")
* invoking methods on an Arc or something like that (e.g., the `MyType` example below)
* a list where dyn dispatch is needed ("deliver to read")
E.g.:
```rust
let deliver_to_read: List<dyn Event, 1> = List::new();
let item: ListArc<Wrapper<T>, 1> = some_list_arc;
let item: ListArc<Wrapper<dyn Event>, 1> = some_list_arc;
deliver_to_read.insert(item)
impl<T, const N: usize> List<T, N>
where
T: ?Sized + ListItem<N>,
{
fn insert(list: ListArc<T, N>) {
}
}
#[repr(C)]
pub struct Wrapper<T> {
prev: *const Wrapper<T>,
next: *const Wrapper<T>,
value
}
unsafe impl<T: ?Sized> ListItem<1> for Wrapper<T> {
const PREV_OFFSET = 0;
}
unsafe trait ListItem<const N: usize> {
const PREV_OFFSET: usize;
}
```
## The `Unsize` trait
NM: We didn't really talk about this.
Alice: We don't use this directly. But it comes up with `DispatchFromDyn`. If a macro generates these, then all of our users of this go away.
Gary: If we could use our `Arc` on stable with dynamic dispatch, we could probably work around the other issues.
## Next steps
NM: The first step is to get code examples.
NM: Then, what CE suggested with respect to the derive macro seems feasible. I'd like to play around with that.
## Obstacles to stabilizing
Gary: What are the obstacles to stabiling the feature as is?
NM: `Unsize` probably has fewer questions. There's also `CoerceUnsized`. For `DispatchFromDyn`, there are just more questions.
NM: To meet the needs of RfL here, we don't need to stabilize the traits in full, so it makes sense to keep flexibility here.
NM: Do your types implement `Deref`?
Alice: Yes.
Gary: When we wrap a C struct, we put it into an "opaque" wrapper, which is `UnsafeCell + MaybeUninit + !Unpin`.
Alice: There's some relation to `UnsafeAliased` from RalfJ here.
NM: Intrusive linked lists will want to allow mut pointers from the outside which needs similar pinning guarantees to our `Futures`.
## Direction on arbitrary self types
TC: We might discuss the issue of raw pointers and `NonNull` for arbitrary self types, since we were leaning a direction there.
NM: (Fills in background.)
Gary: This largely isn't an issue for us, though we might want dynamic dispatch on raw pointers.
Alice: We could always use a newtype.
Gary: Yes, we could put another opaque wrapper around it.
## The derive macro
NM: Who should lead putting together a proposal here?
Josh: CE is one option, of course, if interested. The author of the arbitrary self types proposal is also interested in pushing things forward. We should at least get feedback from these parties. CE in particular knows what requirements are needed to make things sound.
TC: Adrian has been working with the RfL team. He might be willing to take this and work with CE on it.
NM: +1. It'd be good to ensure this also meets the `CppRef` use case.
Lukas: I'd also be happy to help here.
NM: Maybe it'd be useful for us to have `libprocmacro` so that you have to explicitly pull this into scope. That may make it easier for us to deprecate this if needed. But I like the idea of exposing macros for these kind of use cases so we can leave ourselves more room on other design questions.
(The meeting ended here.)
---
## Self type requirements
nikomatsakis: The doc says:
> We wish to use arbitrary self types in various places. Some examples:
>
> 1. Many methods need to call `self.clone()` and get a new `Arc` to the current struct. To do that, we need `self: &Arc<T>` or `self: ArcBorrow<'_, T>`.
> 2. We often need to do `linked_list.push(self)`. To do that, we need `self: ListArc<T>`.
For that first point, this is not obviously covered by existing proposals, right? What seems to be missing is wanting something like "auto-deref" to from `Arc<T>` to `ArcBorrow<'_, T>`? i.e., I imagine you want to be able to do:
```rust!
impl MyType {
fn clone(self: ArcBorrow<'_, Self>) -> Arc<Self> {
...
}
}
let x: Arc<MyType> = ...;
x.clone() // ?
```
Alice: Here we just do the inconvenient thing and explicitly convert to `ArcBorrow`.
Gary: I recall that we discussed about the idea of exposing `&ArcInner` in RfL meeting. @Wedson I didn't exactly recall but what issue did we find with that issue?
Boqun: Pointer provenance.
Gary: Ah yes, so this rhymes with the discussion on having arbitrary self type that works with types that do not create reference.
## More info on `ArcBorrow`?
Josh: Could we get the struct definition for `ArcBorrow`? Is there some way we could reduce indirection without needing a unique type?
Alice:
```rs
pub struct ArcBorrow<'a, T: ?Sized + 'a> {
inner: NonNull<ArcInner<T>>,
_p: PhantomData<&'a ()>,
}
```
## Confirmation: wide pointer is ok for dyn traits?
nikomatsakis: To confirm, you say that dyn meets your needs, but you are anticipating "wide pointer"-style semantics?
Alice: Yes, our uses of dyn are wide pointers. The places where this might be a potential problem are also places where we have to do it manually anyway because C defines the layout of the vtable.
## Confirmation: the whole set of types
nikomatakis: I want to be sure I understand the set of types as things are currently setup. It seems like there is...
* `Arc<T>` which is in fact an alias for `ArcInner<Wrapper<T>>`
* `Wrapper` which adds the next/prev fields -- and presumably there is sometimes space for more than one.
* when you allocate a new Arc, you get back a:
* UniqueArc, which can be converted into a:
* ListArc, which can be converted into an:
* Arc.
* Arc.
* something like this?
* I guess all of those are kind of "newtypes" of the `ArcInner` in some sense.
## Opaque type
```rust
/// Stores an opaque value.
///
/// This is meant to be used with FFI objects that are never interpreted by Rust code.
#[repr(transparent)]
pub struct Opaque<T> {
value: UnsafeCell<MaybeUninit<T>>,
_pin: PhantomPinned,
}
```