# Linux Kernel Project: Writing LKMPG Kernel Modules in Rust

> Contributor: fourcolor
> [Project walkthrough video](https://youtu.be/xCw1w-DjEe8)

:::success
:question: Question list
* ?
:::

## Task Overview

[LKMPG](https://sysprog21.github.io/lkmpg/) is an e-book recommended by the [Linux kernel's built-in documentation](https://www.kernel.org/doc/Documentation/process/kernel-docs.rst) and one of the very few freely distributed books on the Linux kernel. This task attempts to rewrite LKMPG's existing Linux kernel module examples in Rust (new examples may also be added), targeting Linux v6.1+.

## TODO: Rust for Linux materials

Read the following materials:
* [Linux: Rust](https://www.kernel.org/doc/html/latest/rust/)
* [Status of Rust adoption in the Linux kernel](https://hackmd.io/@linD026/rust-in-linux-organize)
* [Talks on writing Linux kernel modules in Rust](https://hackmd.io/@0xff07/doxolinux/%2F%400xff07%2FSJZEVrDXP)

Record what you learned and what remained unclear while reading, and answer the following:
* What preparation is needed to write Linux kernel modules in Rust?
* Now that Linux v6.1 supports Rust, what kinds of kernel modules can be written, and what are the limitations?
* How does Rust for Linux handle mutex lock, irq, softirq, tasklet, workqueue, and kernel thread when developing kernel modules?

## TODO: Rewrite existing LKMPG code

At a minimum, this should cover:
* the hello series
* character devices
* procfs
* workqueue

Record the process in detail, especially Rust build and compatibility issues encountered on Linux v6.1+.

## Introduction

Linux merged Rust support in v6.1, making Rust the second programming language of the Linux kernel. This project draws on articles and talks to understand why the kernel developers made that decision, and then learns how to develop kernel modules for Linux in Rust.

* [What Problem with C ? Safety and Undefined Behavior](#What-Problem-with-C-?-Safety-and-Undefined-Behavior)
* [Why Rust?](#Why-Rust?)
    * [Interoperability](#Interoperability)
        * [A little C with your Rust](#A-little-C-with-your-Rust)
        <!-- * A little Rust with your C -->
    <!-- * Architecture support -->
    <!-- * ABI compatibility -->
* [What can we do in Linux 6.1 ~ 6.x ?](#What-can-we-do-in-Linux-6.1-~-6.x-?)
* Rust-For-Linux(TODO)
* [Setting Up an Environment for Writing Linux Kernel Modules in Rust](#Setting-Up-an-Environment-for-Writing-Linux-Kernel-Modules-in-Rust)
* [Write Kernel Modules in Rust](#Write-Kernel-Modules-in-Rust)
    * [Hello](#Hello)
    * [Character Device](#Character-Device)
    * [Misc Device](#Misc-Device)
    * [Procfs](#Procfs)
    * [Workqueue](#Workqueue)
    * Semaphore(TODO)
    * Out-of-tree module(TODO)

## What Problem with C ? Safety and Undefined Behavior

Linux has been written in C for 30 years, but C has a large amount of Undefined Behavior (UB), which easily leads to security problems. The following example is from [What Every C Programmer Should Know About Undefined Behavior #2/3](https://blog.llvm.org/2011/05/what-every-c-programmer-should-know_14.html):

```c
void process_something(int size) {
    // Catch integer overflow.
    if (size > size+1)
        abort();
    ...
    // Error checking from this code elided.
    char *string = malloc(size+1);
    read(fd, string, size);
    string[size] = 0;
    do_something(string);
    free(string);
}
```

Because signed integer overflow is UB in C, the compiler may assume during optimization that `size > size+1` is always false, so the code above may well become equivalent to:

```c
void process_something(int size) {
    char *string = malloc(size+1);
    read(fd, string, size);
    string[size] = 0;
    do_something(string);
    free(string);
}
```

When `size` is `INT_MAX`, this code clearly goes wrong. In other words, even when the developer remembered to check for overflow, the check can be optimized away and the bug remains. This is why UB so often raises security concerns.

Further reading: [memory-safety](https://www.memorysafety.org/docs/memory-safety/)

## Why Rust?

[lkml.org [PATCH 00/13] [RFC] Rust support](https://lkml.org/lkml/2021/4/14/1023?fbclid=IwAR04IjpsgkDcrVlQfhmkJFxexp8ea5xAUjtrAbzpmINvkscXai0UThLINO0) explains why the developers chose Rust as the second language for Linux. Besides offering more language features than C, the first point it makes is that the safe subset of Rust has no UB. Rust also provides more advanced detection of invalid memory use.

```
## Why Rust?

Rust is a systems programming language that brings several key advantages over C in the context of the Linux kernel:

- No undefined behavior in the safe subset (when unsafe code is sound), including memory safety and the absence of data races.
- Stricter type system for further reduction of logic errors.
- A clear distinction between safe and `unsafe` code.
- Featureful language: sum types, pattern matching, generics, RAII, lifetimes, shared & exclusive references, modules & visibility, powerful hygienic and procedural macros...
- Extensive freestanding standard library: vocabulary types such as `Result` and `Option`, iterators, formatting, pinning, checked/saturating/wrapping integer arithmetic, etc.
- Integrated out of the box tooling: documentation generator, formatter and linter all based on the compiler itself.

Overall, Rust is a language that has successfully leveraged decades of experience from system programming languages as well as functional ones, and added lifetimes and borrow checking on top.
```

The following Use-After-Free (UAF) example is taken from [Rust for Linux - Miguel Ojeda](https://www.youtube.com/watch?v=46Ky__Gid7M):

```rust
pub fn main() {
    let a = Box::new(42);
    drop(a);
    print!("{}", *a);
}
```

```c
#include <stdio.h>
#include <stdlib.h>

int main() {
    int *const a = malloc(sizeof(int));
    if (!a) {
        abort();
    }
    *a = 42;
    free(a);
    printf("%d", *a);
}
```

The C version compiles, but it reads freed memory, so its output is unpredictable from run to run. The Rust version is rejected at compile time with the following error:

```bash
error[E0382]: borrow of moved value: `a`
 --> test.rs:4:18
  |
2 |     let a = Box::new(42);
  |         - move occurs because `a` has type `Box<i32>`, which does not implement the `Copy` trait
3 |     drop(a);
  |          - value moved here
4 |     print!("{}", *a);
  |                  ^^ value borrowed here after move
  |
  = note: this error originates in the macro `$crate::format_args` which comes from the expansion of the macro `print` (in Nightly builds, run with -Z macro-backtrace for more info)
```

Rust can do this thanks to its memory management mechanism, [Ownership](https://doc.rust-lang.org/book/ch04-01-what-is-ownership.html):
* Each value in Rust has an owner.
* There can only be one owner at a time.
* When the owner goes out of scope, the value will be dropped.

The same RFC also states up front that Rust is not being introduced in order to rewrite the whole kernel:

> Please note that the Rust support is intended to enable writing drivers and similar "leaf" modules in Rust, at least for the foreseeable future. In particular, we do not intend to rewrite the kernel core nor the major kernel subsystems (e.g. `kernel/`, `mm/`, `sched/`...). Instead, the Rust support is built on top of those.
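
Returning to the three ownership rules listed above, the following is a minimal user-space sketch of how they play out at run time. It uses the standard library rather than the `kernel` crate, and the names `Noisy` and `ownership_demo` are hypothetical, introduced only for illustration.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical type that records its own drop in a shared log.
struct Noisy {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

/// Runs the demo and returns the recorded events in order.
fn ownership_demo() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let a = Noisy { name: "a", log: Rc::clone(&log) };
        // Rule 2: only one owner at a time. Ownership moves from `a` to `b`;
        // using `a` after this line would be a compile-time error (E0382).
        let b = a;
        log.borrow_mut().push(format!("b owns {}", b.name));
        let _c = Noisy { name: "c", log: Rc::clone(&log) };
        // Rule 3: at the end of this scope the owners go away and the values
        // are dropped, in reverse declaration order (`_c` first, then `b`).
    }
    let events = log.borrow().clone();
    events
}

fn main() {
    for event in ownership_demo() {
        println!("{event}");
    }
}
```

The value originally bound to `a` is dropped exactly once, through its final owner `b`. This is the same mechanism that lets a kernel module release its resources in `Drop` without an explicit cleanup call.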
In addition, [Supporting Linux kernel development in Rust](https://lwn.net/Articles/829858/) lists three topics discussed when bringing Rust into the kernel: binding to existing C APIs, architecture support, and ABI compatibility with the kernel.

### Interoperability

Rust provides its foreign function interface (FFI) to other languages through the [std::ffi module](https://doc.rust-lang.org/std/ffi/index.html), and the [bindgen crate](https://docs.rs/bindgen/latest/bindgen/) builds on this to generate Rust FFI bindings to C/C++ automatically.

#### [A little C with your Rust](https://docs.rust-embedded.org/book/interoperability/c-with-rust.html)

Because the goal of bringing Rust into Linux is not to rewrite the whole kernel, the most important requirement is that Rust can call the existing C kernel APIs. Here we use a static library as an example.

Suppose we want to use the following C code from Rust.

```c
/* File: cool.h */
typedef struct Item {
    int x;
    int y;
} Item;

void cool_function(Item* cs);
```

```c
/* File: cool.c */
#include "cool.h"
#include <stdio.h>

void cool_function(Item *cs)
{
    printf("cool item %d %d\n", cs->x, cs->y);
}
```

Compile the code above into a static library:

```bash
$ gcc -c cool.c
$ ar -rcs libcool.a cool.o
```

Create a new Rust project with cargo:

```bash
$ cargo new project_name --bin
```

Add a [build.rs](https://doc.rust-lang.org/cargo/reference/build-scripts.html):

```rust
fn main() {
    println!("cargo:rustc-link-search=/home/fourcolor/Documents/rust/clib");
    println!("cargo:rustc-link-lib=static=cool");
}
```

To call the C function, the matching definitions must then be declared on the Rust side:

```rust
#[repr(C)]
pub struct Item {
    x: i32,
    y: i32,
}

extern "C" {
    fn cool_function(cs: *const Item);
}

fn main() {
    let item = Item { x: 12, y: 32 };
    unsafe {
        cool_function(&item);
    }
}
```

The result is as follows:

```bash
cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 0.01s
     Running `target/debug/rusttoc`
cool item 12 32
```

Instead of defining the structs and functions by hand, we can also generate them automatically with bindgen:

```bash
bindgen ../clib/cool.c -o bindings.rs
```

For more usage details, see
* https://doc.rust-lang.org/cargo/reference/build-scripts.html
* https://docs.rust-embedded.org/book/interoperability/c-with-rust.html

<!-- #### [A little Rust with your C](https://docs.rust-embedded.org/book/interoperability/rust-with-c.html) -->

## What can we do in Linux 6.1 ~ 6.x ?
### Linux 6.1

[A first look at Rust in the 6.1 kernel](https://lwn.net/Articles/910762/) notes that 6.1, as the first release to include Rust, cannot do very much yet. It adds rust_minimal.rs to the samples.

:::spoiler rust_minimal
```rust
// SPDX-License-Identifier: GPL-2.0

//! Rust minimal sample.

use kernel::prelude::*;

module! {
    type: RustMinimal,
    name: "rust_minimal",
    author: "Rust for Linux Contributors",
    description: "Rust minimal sample",
    license: "GPL",
}

struct RustMinimal {
    numbers: Vec<i32>,
}

impl kernel::Module for RustMinimal {
    fn init(_name: &'static CStr, _module: &'static ThisModule) -> Result<Self> {
        pr_info!("Rust minimal sample (init)\n");
        pr_info!("Am I built-in? {}\n", !cfg!(MODULE));

        let mut numbers = Vec::new();
        numbers.try_push(72)?;
        numbers.try_push(108)?;
        numbers.try_push(200)?;

        Ok(RustMinimal { numbers })
    }
}

impl Drop for RustMinimal {
    fn drop(&mut self) {
        pr_info!("My numbers are {:?}\n", self.numbers);
        pr_info!("Rust minimal sample (exit)\n");
    }
}
```
:::

The sample shows how the Rust `module!` macro takes the place of MODULE_DESCRIPTION() and MODULE_LICENSE(). Implementing `init` of the [trait](https://doc.rust-lang.org/book/ch10-02-traits.html) kernel::Module (a trait is conceptually similar to an interface) and `drop` of the trait Drop plays the role of module_init() and module_exit(). Vec provides a growable array. try_push() returns a Result<T, E> that indicates success or failure, and [`?`](https://doc.rust-lang.org/reference/expressions/operator-expr.html#the-question-mark-operator) makes init return the error early when try_push fails, or yields the T value when it succeeds.

### Linux 6.2

[Rust in the 6.2 kernel](https://lwn.net/Articles/914458/) shows that this release supports all of the [Linux log levels](https://docs.kernel.org/next/core-api/printk-basics.html) and adds the rust_print.rs sample.

:::spoiler
```rust
// SPDX-License-Identifier: GPL-2.0

//! Rust printing macros sample.

use kernel::pr_cont;
use kernel::prelude::*;

module! {
    type: RustPrint,
    name: "rust_print",
    author: "Rust for Linux Contributors",
    description: "Rust printing macros sample",
    license: "GPL",
}

struct RustPrint;

impl kernel::Module for RustPrint {
    fn init(_name: &'static CStr, _module: &'static ThisModule) -> Result<Self> {
        pr_info!("Rust printing macros sample (init)\n");

        pr_emerg!("Emergency message (level 0) without args\n");
        pr_alert!("Alert message (level 1) without args\n");
        pr_crit!("Critical message (level 2) without args\n");
        pr_err!("Error message (level 3) without args\n");
        pr_warn!("Warning message (level 4) without args\n");
        pr_notice!("Notice message (level 5) without args\n");
        pr_info!("Info message (level 6) without args\n");

        pr_info!("A line that");
        pr_cont!(" is continued");
        pr_cont!(" without args\n");

        pr_emerg!("{} message (level {}) with args\n", "Emergency", 0);
        pr_alert!("{} message (level {}) with args\n", "Alert", 1);
        pr_crit!("{} message (level {}) with args\n", "Critical", 2);
        pr_err!("{} message (level {}) with args\n", "Error", 3);
        pr_warn!("{} message (level {}) with args\n", "Warning", 4);
        pr_notice!("{} message (level {}) with args\n", "Notice", 5);
        pr_info!("{} message (level {}) with args\n", "Info", 6);

        pr_info!("A {} that", "line");
        pr_cont!(" is {}", "continued");
        pr_cont!(" with {}\n", "args");

        Ok(RustPrint)
    }
}

impl Drop for RustPrint {
    fn drop(&mut self) {
        pr_info!("Rust printing macros sample (exit)\n");
    }
}
```
:::

This release introduces the #[vtable] macro. The Linux kernel uses function pointers in many places. struct file_operations is a typical example: the user defines read() and write() and stores the corresponding function pointers in file_operations. This looks like something a Rust trait could express directly, but Linux allows any operation that is not relevant to be left out, which leaves a null pointer in the table. remap_file_range() is one such example, since it is not useful most of the time. Null pointers, however, are something Rust goes to great lengths to avoid. #[vtable] solves this by adding a `HAS_XXX` boolean constant for each function XXX, set to `true` if the function is implemented. At compile time these constants are consulted to generate the corresponding struct, and a null pointer is stored where the constant is `false`. See this [patch](https://lwn.net/ml/linux-kernel/20221110164152.26136-7-ojeda@kernel.org/) for the implementation.

It introduces the [`declare_err!()`](https://lwn.net/ml/linux-kernel/20221110164152.26136-10-ojeda@kernel.org/) macro, which generates the corresponding error codes.

[TODO: underlying mechanism](https://lwn.net/ml/linux-kernel/20221110164152.26136-11-ojeda@kernel.org/) It introduces CStr and CString as counterparts of C strings, guaranteeing that a string ends with `NUL`.

TODO: introduction of the `dbg!` macro

### Linux 6.3

This [patch](https://lore.kernel.org/lkml/20230212183249.162376-1-ojeda@kernel.org/) shows that Linux 6.3 adds support for the [Arc](https://docs.rs/triomphe/latest/triomphe/struct.Arc.html), [ArcBorrow](https://docs.rs/triomphe/latest/triomphe/struct.ArcBorrow.html), and [UniqueArc](https://docs.rs/triomphe/latest/triomphe/struct.UniqueArc.html) types, as well as ForeignOwnable and ScopeGuard.

## [Setting Up an Environment for Writing Linux Kernel Modules in Rust](https://www.youtube.com/watch?v=tPs1uRqOnlk)

### Arch Support

Currently, Rust for Linux supports only two architectures: um (user-mode Linux) and x86.

### Prerequisite

First install the required packages:

```shell
$ sudo apt -y install \
  flex \
  bison \
  build-essential \
  wget \
  lld \
  qemu-kvm \
  clang \
  llvm
```

### rustup

Next install rustup, the tool that installs and manages Rust toolchains.

```bash
$ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
```

### rustc

Then install the rustc (Rust compiler) version that Linux specifies. The `scripts/min-tool-version.sh` script lives in the kernel source tree, so run this command (and the bindgen one below) from inside that tree.

```bash
$ rustup override set $(scripts/min-tool-version.sh rustc)
```

Because Rust-For-Linux currently relies on many unstable features, it requires one specific rustc version rather than merely a minimum version.

unstable feature: https://github.com/Rust-for-Linux/linux/issues/2

### [bindgen](https://docs.rs/bindgen/latest/bindgen/)

bindgen is the Rust tool that generates the FFI bindings to C/C++. Install it with cargo, Rust's package manager:

```bash
$ cargo install --locked --version $(scripts/min-tool-version.sh bindgen) bindgen
```

### linux

Download the [Linux kernel source](https://www.kernel.org/) (version 6.1 or later), or the [Rust-for-Linux](https://github.com/Rust-for-Linux/linux) project (recommended).

```bash
$ wget https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-6.3.6.tar.xz
$ tar xvf linux-6.3.6.tar.xz
$ cd linux-6.3.6
```

Next we build a test environment with qemu and busybox. We take the qemu-busybox-min.config file from [Rust-for-Linux/linux](https://github.com/Rust-for-Linux/linux) and place it under kernel/configs.
:::spoiler qemu-busybox-min.config
```bash
# This is a minimal configuration for running a busybox initramfs image with
# networking support.
#
# The following command can be used create the configuration for a minimal
# kernel image:
#
# make allnoconfig qemu-busybox-min.config
#
# The following command can be used to build the configuration for a default
# kernel image:
#
# make defconfig qemu-busybox-min.config
#
# On x86, the following command can be used to run qemu:
#
# qemu-system-x86_64 -nographic -kernel vmlinux -initrd initrd.img -nic user,model=rtl8139,hostfwd=tcp::5555-:23
#
# On arm64, the following command can be used to run qemu:
#
# qemu-system-aarch64 -M virt -cpu cortex-a72 -nographic -kernel arch/arm64/boot/Image -initrd initrd.img -nic user,model=rtl8139,hostfwd=tcp::5555-:23

CONFIG_SMP=y
CONFIG_PRINTK=y
CONFIG_PRINTK_TIME=y

CONFIG_PCI=y

# We use an initramfs for busybox with elf binaries in it.
CONFIG_BLK_DEV_INITRD=y
CONFIG_RD_GZIP=y
CONFIG_BINFMT_ELF=y

# This is for /dev file system.
CONFIG_DEVTMPFS=y

# Core networking (packet is for dhcp).
CONFIG_NET=y
CONFIG_PACKET=y
CONFIG_INET=y

# RTL8139 NIC support.
CONFIG_NETDEVICES=y
CONFIG_ETHERNET=y
CONFIG_NET_VENDOR_REALTEK=y
CONFIG_8139CP=y

# To get GDB symbols and script.
CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
CONFIG_GDB_SCRIPTS=y

# For the power-down button (triggered by qemu's `system_powerdown` command).
CONFIG_INPUT=y
CONFIG_INPUT_EVDEV=y
CONFIG_INPUT_KEYBOARD=y
```
:::

```bash
# in ./linux
$ make LLVM=1 allnoconfig qemu-busybox-min.config rust.config
```

After this step you should see `CONFIG_RUST=y` in .config. If not, use `make LLVM=1 menuconfig` to check the dependencies of `CONFIG_RUST`:

```bash
depends on: ( CONFIG_HAVE_RUST ) && ( CONFIG_RUST_IS_AVAILABLE ) && (! CONFIG_MODVERSIONS ) && (! CONFIG_GCC_PLUGINS ) && (! CONFIG_RANDSTRUCT ) && (! CONFIG_DEBUG_INFO_BTF || CONFIG_PAHOLE_HAS_LANG_EXCLUDE )
```

### busybox

> Alternatively, use the [disk.img](https://drive.google.com/drive/folders/1nmUhDVJxCKolkDN64U7Brm4kBQrvUGPq) provided by [Writing Linux Kernel Modules in Rust](https://www.youtube.com/watch?v=-l-8WrGHEGI) directly.

We use busybox to build the file system that Linux mounts and the initrd.

First download the busybox source from GitHub:

```bash
$ git clone https://github.com/mirror/busybox.git
```

Then, inside the busybox directory, start from the default configuration:

```bash
$ make defconfig
```

We want the libraries to be statically linked, so set the following in menuconfig:

```
Busybox Settings --->
    Build Options --->
        Build BusyBox as a static binary (no shared libs) ---> yes
```

Once configured, build busybox and create an initramfs:

```bash
$ make -j$(nproc)
$ make install
$ cd _install && ls
bin linuxrc sys dev init sbin usr
```

These commands create an `_install` directory containing the basic directory layout and some common tools (such as ps, a shell, cat, ...). We then make a few adjustments to it:

```bash
$ mkdir etc
$ cp ../examples/inittab etc/.
```

By default busybox opens tty2 through tty5, which the test environment does not use, so edit inittab:

```diff=
-tty2::askfirst:-/bin/sh
-tty3::askfirst:-/bin/sh
-tty4::askfirst:-/bin/sh
-
-# /sbin/getty invocations for selected ttys
-tty4::respawn:/sbin/getty 38400 tty5
-tty5::respawn:/sbin/getty 38400 tty6
```

We also need the system to mount `/proc`, so add `etc/init.d/rcS`:

```
#!/bin/sh
mount -t proc none /proc
mount -t sysfs none /sys
ifconfig lo up
```

Finally, pack everything into an image:

```bash
# in busybox/_install
$ find . | cpio -H newc -o | gzip > ../ramdisk.img
```

### Run qemu

```bash
# in linux
$ qemu-system-x86_64 -kernel arch/x86_64/boot/bzImage -initrd ../busybox/ramdisk.img -nic user,model=rtl8139 -nographic -append "console=ttyS0" -enable-kvm
```

## [Writing Linux Kernel Modules in Rust](https://www.youtube.com/watch?v=-l-8WrGHEGI)

With the environment built above, run `make LLVM=1 menuconfig` in the linux tree to enable the samples under samples/rust:

```
Kernel hacking
    -> Sample kernel code
        -> Rust samples
            -> [choose the sample you want]
```

Alternatively, edit samples/rust/Kconfig and samples/rust/Makefile to add your own kernel modules:

```diff
# samples/rust/Kconfig
if SAMPLES_RUST

+config SAMPLE_RUST_HELLO
+	tristate "HELLO"
+	help
+	  This option builds the Rust hello module sample.
+
+	  To compile this as a module, choose M here:
+	  the module will be called rust_hello.
+
+	  If unsure, say N.
```

```diff
# samples/rust/Makefile
+ obj-$(CONFIG_SAMPLE_RUST_HELLO) += rust_hello.o
```

Then write the corresponding source file:

```bash
$ vim rust_hello.rs
```

### Hello

```rust
use kernel::prelude::*;

module! {
    type: Hello,
    name: "rust_hello",
    license: "GPL",
}

struct Hello;

impl kernel::Module for Hello {
    fn init(_module: &'static ThisModule) -> Result<Self> {
        pr_info!("Hello\n");
        Ok(Hello)
    }
}

impl Drop for Hello {
    fn drop(&mut self) {
        pr_info!("Bye\n");
    }
}
```

After booting, the following appears in the boot log:

```
...
[    0.538949] rust_hello: Hello
...
```

The module starts with [module!](https://rust-for-linux.github.io/docs/macros/macro.module.html), which declares a type. The Module [trait](https://doc.rust-lang.org/book/ch10-02-traits.html) must then be implemented for that type. `fn init` in the [trait `kernel::Module`](https://rust-for-linux.github.io/docs/kernel/trait.Module.html) corresponds to `int init_module(void)` in a C kernel module and is the module's entry point. `fn drop` in the [trait `Drop`](https://rust-for-linux.github.io/docs/core/ops/trait.Drop.html) corresponds to `void cleanup_module(void)` and runs when the module is removed with rmmod. [`pr_info!`](https://rust-for-linux.github.io/docs/kernel/macro.pr_info.html) corresponds to [`pr_info`](https://www.kernel.org/doc/html/next/core-api/printk-basics.html).

### Character Device

#### The file_operations Structure

The Linux kernel's file operations are defined as follows:

:::spoiler struct file_operations
```c
struct file_operations {
	struct module *owner;
	loff_t (*llseek) (struct file *, loff_t, int);
	ssize_t (*read) (struct file *, char __user *, size_t, loff_t *);
	ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *);
	ssize_t (*read_iter) (struct kiocb *, struct iov_iter *);
	ssize_t (*write_iter) (struct kiocb *, struct iov_iter *);
	int (*iopoll)(struct kiocb *kiocb, bool spin);
	int (*iterate) (struct file *, struct dir_context *);
	int (*iterate_shared) (struct file *, struct dir_context *);
	__poll_t (*poll) (struct file *, struct poll_table_struct *);
	long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
	long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
	int (*mmap) (struct file *, struct vm_area_struct *);
	unsigned long mmap_supported_flags;
	int (*open) (struct inode *, struct file *);
	int (*flush) (struct file *, fl_owner_t id);
	int (*release) (struct inode *, struct file *);
	int (*fsync) (struct file *, loff_t, loff_t, int datasync);
	int (*fasync) (int, struct file *, int);
	int (*lock) (struct file *, int, struct file_lock *);
	ssize_t (*sendpage) (struct file *, struct page *, int, size_t, loff_t *, int);
	unsigned long (*get_unmapped_area)(struct file *, unsigned long, unsigned long, unsigned long, unsigned long);
	int (*check_flags)(int);
	int (*flock) (struct file *, int, struct file_lock *);
	ssize_t (*splice_write)(struct pipe_inode_info *, struct file *, loff_t *, size_t, unsigned int);
	ssize_t (*splice_read)(struct file *, loff_t *, struct pipe_inode_info *, size_t, unsigned int);
	int (*setlease)(struct file *, long, struct file_lock **, void **);
	long (*fallocate)(struct file *file, int mode, loff_t offset,
			  loff_t len);
	void (*show_fdinfo)(struct seq_file *m, struct file *f);
	ssize_t (*copy_file_range)(struct file *, loff_t, struct file *,
			loff_t, size_t, unsigned int);
	loff_t (*remap_file_range)(struct file *file_in, loff_t pos_in,
				   struct file *file_out, loff_t pos_out,
				   loff_t len, unsigned int remap_flags);
	int (*fadvise)(struct file *, loff_t, loff_t, int);
} __randomize_layout;
```
:::

The Rust counterpart is [Trait kernel::file::Operations](https://rust-for-linux.github.io/docs/kernel/file/trait.Operations.html):

:::spoiler Trait kernel::file::Operations
```rust
#[vtable]
pub trait Operations {
    /// The type of the context data returned by [`Operations::open`] and made available to
    /// other methods.
    type Data: PointerWrapper + Send + Sync = ();

    /// The type of the context data passed to [`Operations::open`].
    type OpenData: Sync = ();

    /// Creates a new instance of this file.
    ///
    /// Corresponds to the `open` function pointer in `struct file_operations`.
    fn open(context: &Self::OpenData, file: &File) -> Result<Self::Data>;

    /// Cleans up after the last reference to the file goes away.
    ///
    /// Note that context data is moved, so it will be freed automatically unless the
    /// implementation moves it elsewhere.
    ///
    /// Corresponds to the `release` function pointer in `struct file_operations`.
    fn release(_data: Self::Data, _file: &File) {}

    /// Reads data from this file to the caller's buffer.
    ///
    /// Corresponds to the `read` and `read_iter` function pointers in `struct file_operations`.
    fn read(
        _data: <Self::Data as PointerWrapper>::Borrowed<'_>,
        _file: &File,
        _writer: &mut impl IoBufferWriter,
        _offset: u64,
    ) -> Result<usize> {
        Err(EINVAL)
    }

    /// Writes data from the caller's buffer to this file.
    ///
    /// Corresponds to the `write` and `write_iter` function pointers in `struct file_operations`.
    fn write(
        _data: <Self::Data as PointerWrapper>::Borrowed<'_>,
        _file: &File,
        _reader: &mut impl IoBufferReader,
        _offset: u64,
    ) -> Result<usize> {
        Err(EINVAL)
    }

    /// Changes the position of the file.
    ///
    /// Corresponds to the `llseek` function pointer in `struct file_operations`.
    fn seek(
        _data: <Self::Data as PointerWrapper>::Borrowed<'_>,
        _file: &File,
        _offset: SeekFrom,
    ) -> Result<u64> {
        Err(EINVAL)
    }

    /// Performs IO control operations that are specific to the file.
    ///
    /// Corresponds to the `unlocked_ioctl` function pointer in `struct file_operations`.
    fn ioctl(
        _data: <Self::Data as PointerWrapper>::Borrowed<'_>,
        _file: &File,
        _cmd: &mut IoctlCommand,
    ) -> Result<i32> {
        Err(ENOTTY)
    }

    /// Performs 32-bit IO control operations on that are specific to the file on 64-bit kernels.
    ///
    /// Corresponds to the `compat_ioctl` function pointer in `struct file_operations`.
    fn compat_ioctl(
        _data: <Self::Data as PointerWrapper>::Borrowed<'_>,
        _file: &File,
        _cmd: &mut IoctlCommand,
    ) -> Result<i32> {
        Err(ENOTTY)
    }

    /// Syncs pending changes to this file.
    ///
    /// Corresponds to the `fsync` function pointer in `struct file_operations`.
    fn fsync(
        _data: <Self::Data as PointerWrapper>::Borrowed<'_>,
        _file: &File,
        _start: u64,
        _end: u64,
        _datasync: bool,
    ) -> Result<u32> {
        Err(EINVAL)
    }

    /// Maps areas of the caller's virtual memory with device/file memory.
    ///
    /// Corresponds to the `mmap` function pointer in `struct file_operations`.
    fn mmap(
        _data: <Self::Data as PointerWrapper>::Borrowed<'_>,
        _file: &File,
        _vma: &mut mm::virt::Area,
    ) -> Result {
        Err(EINVAL)
    }

    /// Checks the state of the file and optionally registers for notification when the state
    /// changes.
    ///
    /// Corresponds to the `poll` function pointer in `struct file_operations`.
    fn poll(
        _data: <Self::Data as PointerWrapper>::Borrowed<'_>,
        _file: &File,
        _table: &PollTable,
    ) -> Result<u32> {
        Ok(bindings::POLLIN | bindings::POLLOUT | bindings::POLLRDNORM | bindings::POLLWRNORM)
    }
}
```
:::

#### Registering A Device

A char device must be registered with the kernel before use, and the kernel accesses the device through its major number and minor number. In C we register with

```c
int register_chrdev(unsigned int major, const char *name, struct file_operations *fops);
```

or with one of the following two functions:

```c
int register_chrdev_region(dev_t from, unsigned count, const char *name);
int alloc_chrdev_region(dev_t *dev, unsigned baseminor, unsigned count, const char *name);
```

We then create a char device with `cdev_alloc()`, implement its ops through file_operations, and finally add it to the system with `int cdev_add(struct cdev *p, dev_t dev, unsigned count);`.

In Rust, [Struct kernel::chrdev::Registration](https://rust-for-linux.github.io/docs/kernel/chrdev/struct.Registration.html) takes a parameter N, the maximum number of devices that this registration can hold. `pub fn new_pinned` pins the char device data at a fixed memory address (TODO: [reason](https://lwn.net/Articles/907876/)), and `pub fn register<T: Operations<OpenData = ()>>` adds the char device to the system.

:::info
In Rust, `pub fn register<T: Operations<OpenData = ()>>` includes the calls to the C functions `alloc_chrdev_region`, `cdev_alloc()`, and `cdev_add()`. [source code](https://rust-for-linux.github.io/docs/src/kernel/chrdev.rs.html#3-206)
:::

#### Example

```rust
use kernel::io_buffer::{IoBufferReader, IoBufferWriter};
use kernel::prelude::*;
use kernel::{chrdev, file};

module! {
    type: RustChrdev,
    name: "rust_chrdev",
    author: "Rust for Linux Contributors",
    description: "Rust character device sample",
    license: "GPL",
}

struct RustFile;

#[vtable]
impl file::Operations for RustFile {
    fn open(_shared: &(), _file: &file::File) -> Result {
        pr_info!("File opened\n");
        Ok(())
    }

    // `Data` keeps its default type `()`, so the borrowed context is `()` as well.
    fn read(
        _data: (),
        _file: &file::File,
        _writer: &mut impl IoBufferWriter,
        _offset: u64,
    ) -> Result<usize> {
        pr_info!("File read\n");
        Ok(0)
    }

    fn write(
        _data: (),
        _file: &file::File,
        _reader: &mut impl IoBufferReader,
        _offset: u64,
    ) -> Result<usize> {
        pr_info!("File write\n");
        Ok(0)
    }
}

struct RustChrdev {
    _dev: Pin<Box<chrdev::Registration<1>>>,
}

impl kernel::Module for RustChrdev {
    fn init(name: &'static CStr, module: &'static ThisModule) -> Result<Self> {
        pr_info!("Rust character device sample (init) {}\n", name);

        let mut chrdev_reg = chrdev::Registration::new_pinned(name, 0, module)?;

        chrdev_reg.as_mut().register::<RustFile>()?;

        Ok(RustChrdev { _dev: chrdev_reg })
    }
}

impl Drop for RustChrdev {
    fn drop(&mut self) {
        pr_info!("Rust character device sample (exit)\n");
    }
}
```

:::info
TODO: It is still unclear why rust_chrdev does not appear under /dev/ even though init is confirmed to have run (the device node was created manually). A likely cause is that `cdev_add()` only registers the device with the kernel; a node under /dev appears only when a device is also created in a device class, or when the node is made by hand with mknod.
:::

### Misc Device

A misc device is used when a device does not fit any other category. All misc drivers share major number 10, and creating a misc device automatically sets up the underlying char device, which greatly simplifies writing one. In C a misc device is registered with misc_register. In Rust this is done with `pub fn new_pinned()` of [`Struct kernel::miscdev::Registration`](https://rust-for-linux.github.io/docs/kernel/miscdev/struct.Registration.html).

```rust
use kernel::io_buffer::{IoBufferReader, IoBufferWriter};
use kernel::prelude::*;
use kernel::{file, miscdev};

module! {
    type: Misc,
    name: "rust_miscdev",
    license: "GPL",
}

struct Misc {
    _dev: Pin<Box<miscdev::Registration<Misc>>>,
}

#[vtable]
impl file::Operations for Misc {
    fn open(_context: &(), _file: &file::File) -> Result {
        pr_info!("File opened\n");
        Ok(())
    }

    fn read(
        _data: (),
        _file: &file::File,
        _writer: &mut impl IoBufferWriter,
        _offset: u64,
    ) -> Result<usize> {
        pr_info!("File read\n");
        Ok(0)
    }

    fn write(
        _data: (),
        _file: &file::File,
        reader: &mut impl IoBufferReader,
        _offset: u64,
    ) -> Result<usize> {
        pr_info!("File written\n");
        Ok(reader.len())
    }
}

impl kernel::Module for Misc {
    fn init(_name: &'static CStr, _module: &'static ThisModule) -> Result<Self> {
        pr_info!("Hello world!\n");
        let reg = miscdev::Registration::new_pinned(fmt!("Misc"), ())?;
        Ok(Self { _dev: reg })
    }
}
```

### Procfs

The proc file system lets a kernel module pass information to processes.

:::info
No corresponding Rust support has been found so far. Options under consideration are using kernel::fs or implementing it as an out-of-tree module.
:::

### Workqueue

A [Workqueue](https://www.kernel.org/doc/html/latest/core-api/workqueue.html) is commonly used when several asynchronous tasks need to run. Its advantage is that developers do not have to manage the allocation and release of the resources needed to run each new task themselves. In C, the three main APIs are:
* `alloc_workqueue`: allocates a workqueue, with flags selecting its scheduling properties
* `queue_work`: adds a task to the workqueue
* `destroy_workqueue`: releases the workqueue

In Rust, a workqueue is used through the following APIs:
1. [Struct kernel::workqueue::Queue](https://rust-for-linux.github.io/docs/kernel/workqueue/struct.Queue.html)
    * kernel::workqueue::Queue::try_new(): allocates a workqueue
    * kernel::workqueue::Queue::enqueue(): adds a task to the workqueue
    * kernel::workqueue::Queue::try_spawn(): adds a task given as a function
2. [kernel::workqueue::xxx](https://rust-for-linux.github.io/docs/kernel/workqueue/index.html): selects a different system workqueue depending on the need
    * system()
    * system_freezable(): `WQ_FREEZABLE`
    * system_freezable_power_efficient
    * system_highpri: `WQ_HIGHPRI`
    * system_long:
    * system_power_efficient
    * system_unbound: `WQ_UNBOUND`

Before using a workqueue we need to create a work item. Each work item is bound to a function pointer that represents the work to run. In C, struct work_struct is defined as follows:

```c
struct work_struct {
	atomic_long_t data;
	struct list_head entry;
	work_func_t func;
#ifdef CONFIG_LOCKDEP
	struct lockdep_map lockdep_map;
#endif
};
```

In Rust, binding a task to a work item means implementing run() of the trait kernel::workqueue::WorkAdapter, which can be done with the kernel::impl_self_work_adapter! macro.

The following are a few simple workqueue examples.

```rust
// SPDX-License-Identifier: GPL-2.0

//! Rust work_queue macros sample.

use kernel::prelude::*;

module! {
    type: RustQueue,
    name: "rust_workq",
    description: "Rust work_queue macros sample",
    license: "GPL",
}

struct RustQueue;

fn work() {
    pr_info!("Hello from a work item\n");
}

impl kernel::Module for RustQueue {
    fn init(_name: &'static CStr, _module: &'static ThisModule) -> Result<Self> {
        let wq = kernel::workqueue::Queue::try_new(fmt!("workq"))?;
        static LOCK: kernel::sync::LockClassKey = kernel::sync::LockClassKey::new();
        // Propagate a failure to queue the work instead of ignoring the result.
        kernel::workqueue::Queue::try_spawn(&wq, &LOCK, work)?;
        Ok(RustQueue)
    }
}

impl Drop for RustQueue {
    fn drop(&mut self) {
        pr_info!("Rust work_queue sample (exit)\n");
    }
}
```

Result:

```
[    0.337483] rust_workq: Hello from a work item
```

```rust
use core::result::Result::Ok;

use kernel::prelude::*;
use kernel::workqueue::*;
use core::sync::atomic::{AtomicU32, Ordering};
use kernel::sync::UniqueArc;

module! {
    type: RustQueue,
    name: "rust_workq",
    author: "Rust for Linux Contributors",
    description: "Rust work_queue macros sample",
    license: "GPL",
}

struct RustQueue;

struct Example {
    count: AtomicU32,
    work: Work,
}

kernel::impl_self_work_adapter!(Example, work, |w| {
    let count = w.count.fetch_add(1, Ordering::Relaxed);
    pr_info!("Called with count={}\n", count);

    // Queue again if the count is less than 10.
    if count < 10 {
        kernel::workqueue::system().enqueue(w);
    }
});

impl kernel::Module for RustQueue {
    fn init(_name: &'static CStr, _module: &'static ThisModule) -> Result<Self> {
        let e = UniqueArc::try_new(Example {
            count: AtomicU32::new(0),
            // SAFETY: `work` is initialised below.
            work: unsafe { Work::new() },
        })?;

        kernel::init_work_item!(&e);

        // Queue the first time.
        kernel::workqueue::system().enqueue(e.into());
        Ok(RustQueue)
    }
}

impl Drop for RustQueue {
    fn drop(&mut self) {
        pr_info!("Rust work_queue macros sample (exit)\n");
    }
}
```

Result:

```
[    0.339465] rust_workq: Called with count=0
[    0.339998] rust_workq: Called with count=1
[    0.340543] rust_workq: Called with count=2
[    0.341074] rust_workq: Called with count=3
[    0.341706] rust_workq: Called with count=4
[    0.343028] rust_workq: Called with count=5
[    0.343577] rust_workq: Called with count=6
[    0.344080] rust_workq: Called with count=7
[    0.344592] rust_workq: Called with count=8
[    0.345200] rust_workq: Called with count=9
[    0.345726] rust_workq: Called with count=10
```

#### rust_echo server

rust_echo server is a sample added to Rust-for-Linux in [7ee240](https://github.com/Rust-for-Linux/linux/commit/7ee240a384854068b898e08e7e3366226047bacf). It shows how to combine Rust's async with workqueue workers to build a kernel module similar in function to [kecho](https://github.com/sysprog21/kecho).

```rust
use kernel::{
    kasync::executor::{workqueue::Executor as WqExecutor, AutoStopHandle, Executor},
    kasync::net::{TcpListener, TcpStream},
    net::{self, Ipv4Addr, SocketAddr, SocketAddrV4},
    prelude::*,
    spawn_task,
    sync::{Arc, ArcBorrow},
};

async fn echo_server(stream: TcpStream) -> Result {
    let mut buf = [0u8; 1024];
    loop {
        let n = stream.read(&mut buf).await?;
        if n == 0 {
            return Ok(());
        }
        stream.write_all(&buf[..n]).await?;
    }
}

async fn accept_loop(listener: TcpListener, executor: Arc<impl Executor>) {
    loop {
        if let Ok(stream) = listener.accept().await {
            let _ = spawn_task!(executor.as_arc_borrow(), echo_server(stream));
        }
    }
}

fn start_listener(ex: ArcBorrow<'_, impl Executor + Send + Sync + 'static>) -> Result {
    pr_info!("Start Listen\n");
    let addr = SocketAddr::V4(SocketAddrV4::new(Ipv4Addr::ANY, 8080));
    let listener = TcpListener::try_new(net::init_ns(), &addr)?;
    spawn_task!(ex, accept_loop(listener, ex.into()))?;
    Ok(())
}

struct RustEchoServer {
    _handle: AutoStopHandle<dyn Executor>,
}

impl kernel::Module for RustEchoServer {
    fn init(_name: &'static CStr, _module: &'static ThisModule) -> Result<Self> {
        pr_info!("Workq init\n");
        let handle = WqExecutor::try_new(kernel::workqueue::system())?;
        start_listener(handle.executor())?;
        Ok(Self {
            _handle: handle.into(),
        })
    }
}

module! {
    type: RustEchoServer,
    name: "rust_echo_server",
    author: "Rust for Linux Contributors",
    description: "Rust tcp echo sample",
    license: "GPL v2",
}
```

Messages can be sent to the echo server with `telnet 127.0.0.1 8080`:

```bash
$ telnet 127.0.0.1 8080
aaa
aaa
hello
hello
```

## Reference

* [What Every C Programmer Should Know About Undefined Behavior #2/3](https://blog.llvm.org/2011/05/what-every-c-programmer-should-know_14.html)
* [Rust for Linux - Miguel Ojeda](https://www.youtube.com/watch?v=46Ky__Gid7M)
* [LKML: [PATCH 00/13] [RFC] Rust support](https://lkml.org/lkml/2021/4/14/1023)
* [Resource acquisition is initialization](https://en.wikipedia.org/wiki/Resource_acquisition_is_initialization)
* [Rust-for-Linux](https://github.com/Rust-for-Linux/linux)
* [The perils of pinning](https://lwn.net/Articles/907876/)
* https://wusyong.github.io/posts/rust-kernel-module-03/
* [rustdoc code docs (rust 2023-03-13)](https://rust-for-linux.github.io/docs/v6.3-rc2/kernel/index.html)
* [rustdoc code docs](https://rust-for-linux.github.io/docs/kernel/index.html)