# Reviewers' Recommendations (WIP) This is a list of topics about which developers may want rules of thumb or checklists to start with. This also helps reviewers to understand the code quickly and provide useful feedbacks. Note that among all the reviewers, there is one we care most: _the future yourself_. These recommendations may be incomplete, since both Rust and Linux are moving targets. In case where this document doesn't cover, please consider the following: * Be Rust idiomatic as hard as possible. * Being explict first and then improving ergonomic usually work. * If you find a good and reasonable way for a certain problem, please do add it in this document! ## Pointers ### Casting When casting raw pointers avoid `as _`. This reduces readability and since it relies on type inference, the behavior might change when the context changes. It also allows implicitly changing `*const` to `*mut`. So prefer the [`cast`](https://doc.rust-lang.org/std/primitive.pointer.html#method.cast) and [`cast_mut`](https://doc.rust-lang.org/std/primitive.pointer.html#method.cast_mut)/[`cast_const`](https://doc.rust-lang.org/std/primitive.pointer.html#method.cast_const) functions for casting raw pointers. * Gary: `as _` can be quite useful for doing FFI (e.g. convert from/to usize) ^ Boqun: I feel like avoid unwanted casting is the point of avoiding `as _`? ### Raw pointers at `pub` interfaces * const vs *mut in pub APIs (arguments and returns): if there exists a mirror (or similar) version of the API with & or &mut, use *const to mirror & and *mut for &mut, otherwise ??? ## Wrapping a kernel struct Say there is a kernel struct: ```clike= struct foo { int f_a; int f_b; ... } ``` * Wrapping the struct ```rust= // The prefer way of wrapping a C side struct is using // `Opaque`. struct Foo(Opaque<bindings::foo>); // Default // Otherwise: `foo` is just plain old data struct Foo(bindings::foo) // <- find some way to notify us. ``` Benno: considering always use `Opaque<_>` Wedson: Better to find the examples in Rust tree -> TODO: boqun * Wrapping the pointer ```rust= // If `foo` is self reference counted, e.g. task_struct. impl AlwaysRefCounted for Foo { ... } // And then use ARef<Foo> // as the pointer wrapper. ``` ## `Send` and `Sync` * If an object of a type can be created on one thread and dropped on the another thread, then it's `Send`. * If multiple threads can have immutable references (`&T`) to the same object, then `T` is `Sync`. ## `Pin`ning (XXX: Introduction of Pin from y86-dev) Pinning is Rust's way of enforcing the address stability of a value. When a value gets pinned it will be impossible for safe code to move it to another location. This is done by wrapping pointers to said object with `Pin<P>`. This wrapper prevents safe code from creating mutable references to the object, preventing mutable access, which is needed to move the value. `Pin<P>` provides `unsafe` functions to circumvent this and allow modifications regardless. It is then the programmer's responsibility to uphold the pinning guarantee. Many kernel data structures require a stable address, because there are foreign pointers to them which would get invalidated by moving the structure. Since these data structures are usually embedded in structs to use them, this pinning property propagates to the container struct. Resulting in most structs in both Rust and C code needing to be pinned. There are two things to think about when you are dealing with pinning: * `Pin<_>` pointers in method signatures: describe * `Unpin` trait: describe ### `Pin<_>` in API When do you need the function to take `Pin<_>` as the argument? If any of the following is true: * The function expects the object pinned (i.e. the object should have a fixed address), this is a sign of starting relying on pin safety guarantee. * [ ] **???** Should all C side functions that take a pointer be considered as a "pinned function"? Reason: C side can do the following without letting us know: ```clike= void do_something(struct foo* ptr, ...) { ... a_debug_obj->data = ptr; ... } ``` * The function is a mutable one (i.e. it would take a `&mut self`) and may be called after the object is pinned. Reason: otherwise users may need to use `unsafe` to call these function. * [ ] **???** What about the function may be called before pinned? A duplicate `&mut self` version or rely on `pin-init` so all mutable functions take `Pin<&mut Self>`. Example: ```rust= struct APinStruct { ... } impl APinStruct { // If a _immutable_ function doesn't rely on the stable address of the object. pub fn immutable_foo(&self) { ... } // If a _immutable_ function relies on the stable address of the object. pub fn immutable_pinned_foo(self: Pin<&Self>) { ... } // If a _mutable_ function relies on the stable address of the object // or it guarantees not moving the object. pub fn mutable_foo(self: Pin<&mut Self>) { ... } } ``` ```rust= #[pin_data] struct APinInitedStruct { ... } impl APinInitedStruct { // If a _immutable_ function doesn't rely on the stable address of the object. pub fn immutable_foo(&self) { ... } // If a _immutable_ function relies on the stable address of the object. // Pin-init types are initialized since creation, // so `&self` is the same as `Pin<&self>`. pub fn immutable_pinned_foo(&self) { ... } // If a _mutable_ function relies on the stable address of the object // or it guarantees not moving the object. pub fn mutable_foo(self: Pin<&mut Self>) { ... } } ``` ### Adding `PhantomPinned` to a type to make it `!Unpin` When writing abstractions, it often is needed to make a type `!Unpin` to ensure it not moving. When doing this, document the reason for why the type cannot be `Unpin`: ```rust /// # Unpin /// /// This type is `!Unpin`, since it wraps `struct mutex` from C. It contains a /// `struct list_head` which is self-referential. pub struct Mutex<T> { pin: PhantomPinned, /* ... */ } ``` ### `UNPIN` functions: The Pin safety guarantee (once pinned, always pinned) may be too restrict sometimes, and * Put an `# Unpin` comments on the functions that make the object movable again (e.g. removing the object from a global list), notes these functions themselves have to be a `Pin<_>` function. For example, ```rust= struct A { ... _pin: PhantomPinned, } impl A { /// Puts on a list for execution. pub fn queue(self: Pin<&mut Self>) /// # Unpin /// /// Removes the object from the global list when return, /// can be moved after this. pub fn wait_for_finish(self: Pin<&mut Self>) } ``` Then if these `# Unpin` functions are used to make the objects movable. Add a `UNSOUND: UNPIN:` comment as the move point, e.g. ```rust= let a = A::new(..); // SAFETY: `a` may be moved afterwards, but that's after // a Unpin function. let pin = unsafe { Pin::new_unchecked(&mut a)}; pin.as_mut().queue(); // UNPIN: `wait_for_finish()` makes `a` movable again. pin.wait_for_finish(); let b = Box::new(a); ``` ## Intrusive Data structure ```rust= struct RawFoo(bindings::foo); struct TypedFoo<T> { raw: RawFoo, data: T, } struct FooEmbeddedA { data_1: A2, raw: RawFoo, data_2: A2, } ``` ## Raw Pointers * `*const` vs `*mut` in `pub` APIs (arguments and returns): if there exists a mirror (or similar) version of the API with `&` or `&mut`, use `*const` to mirror `&` and `*mut` for `&mut`, otherwise **???** * [ ] **???** Casting: ## Guard vs closure ## When to `impl Deref`? ---- ---- # Benno's Recommendations ## Casting When casting raw pointers avoid `as _`. This reduces readability and since it relies on type inference, the behavior might change when the context changes. It also allows implicitly changing `*const` to `*mut`. So prefer the [`cast`](https://doc.rust-lang.org/std/primitive.pointer.html#method.cast) and [`cast_mut`](https://doc.rust-lang.org/std/primitive.pointer.html#method.cast_mut)/[`cast_const`](https://doc.rust-lang.org/std/primitive.pointer.html#method.cast_const) functions for casting raw pointers. * Gary: `as _` can be quite useful for doing FFI (e.g. convert from/to usize) ## Memory indirections When creating structs avoid adding extra indirections like `Arc<T>` for fields that need to be pinned. In that case it is better to use pin-init to embedd the struct. This point is especially important for abstractions. Driver data and other single-use structs can break this "rule", as they might know better. ## Documentation ### General things - doc strings should be full english sentences with periods at the end. - the first letter of a sentence is capitalized. ### Function docs ### Redundant/too simple For example: ```rust impl Bar { /// Sets the Foo size. fn set_foo_size(&self, foo_size: usize); } ``` I think we should strive to avoid these, sometimes in cases where it really is simple and everyone immediately knows what is meant this is fine. However, in cases where someone new does not know, we should have more information/examples. ### C abstractions I think we could add more documentation to the Rust abstractions that explain the C api they abstract. For example in the `block_mq` stuff we could have an explainer what that even means. Linking to C docs is of course a good idea and we should always do that anyways, but adding more might also be helpful. ### Function documentation Every `pub(crate)`/`pub` function should document non-trivial arguments, we define what types are trivial, we could for example start with this list: - `len: usize` - `self: $any` - `this: $any` Similarly for return types: - `len() -> usize` - `-> Result<T, E>` where `T` is a trivial return type and `E` a trivial error type - `-> Option<T>` where `T` is a trivial return type - `-> ()` - `new() -> Self` or variations of `new` - `new() -> impl PinInit<Self>` or variations of `new` or `impl Init` Trivial error types: - `AllocError` - `kernel::error::Error` ## Trivialized `SAFETY` Comments For [example](https://github.com/Rust-for-Linux/linux/blob/ac9a78681b921877518763ba0e89202254349d1b/rust/kernel/sync/arc.rs#L550): ```rust /// Initialize `self` using the given initializer. pub fn init_with<E>(mut self, init: impl Init<T, E>) -> core::result::Result<UniqueArc<T>, E> { // SAFETY: The supplied pointer is valid for initialization. match unsafe { init.__init(self.as_mut_ptr()) } { // SAFETY: Initialization completed successfully. Ok(()) => Ok(unsafe { self.assume_init() }), Err(err) => Err(err), } } ``` I propose that we standardize the way we write safety requirements and the way we discharge safety obligations. For example: ```rust /// # Invariants /// - `msg` is R-valid, struct Foo { msg: *const [u8], } /// # Safety /// Callers have to guarantee: /// - `this` is RW-valid for the duration of the call, /// - `msg` is R-valid[..`len`] for the lifetime of `*this`, fn foo_set_msg(this: *mut Foo, msg: *const u8, len: usize) { // SAFETY: `ptr` is valid by FR. // INVARIANT: `msg` is R-valid[..`len`] as long as `*ptr` lives, so this produces an // R-valid slice. unsafe { (*this).msg = core::ptr::slice_from_raw_parts(msg, len); } } ``` ## `inline` attributes How do we go about adding them? * Gary: We should add them for all non-generic small chunk of code