# Book : The Linux Kernel Module Programming Guide

> Reference \:
> * The Linux Kernel Module Programming Guide *by Peter Jay Salzman, Michael Burian, Ori Pomerantz, Bob Mottram, Jim Huang*
> https://sysprog21.github.io/lkmpg/
> * Github Repo
> https://github.com/sysprog21/lkmpg
> * Linux Kernel Module Development (in Chinese)
> https://ithelp.ithome.com.tw/users/20001007/ironman/958
> * EmbeTronicX
> https://embetronicx.com/

*This is a free book published on GitHub. I am reading it for possible future use, and I see it as a top-down approach to learning Linux driver programming that finally reaches bare-metal programming on an SoC.*

## My Other Posts About RPI Development

* Book : Raspberry Pi GPU Audio Video Programming
https://hackmd.io/kRW20PfmTpGkCL5e7-NSEw
* Course : Raspberry Pi Bare Metal Tutorial by LLD
https://hackmd.io/E2udBWTNSseyz69cMdevsA

:::info
:information_source: **Hardware Used**
https://hackmd.io/@Erebustsai/SJPwQzOvh

WSL Kernel does not support loading kernel modules. Therefore, my current setup is to program and compile natively \(01\/07\/2025\).

**Temperature Monitoring**
> Reference \:
> * How to find out Raspberry Pi GPU and ARM CPU temperature on Linux
> https://www.cyberciti.biz/faq/linux-find-out-raspberry-pi-gpu-and-arm-cpu-temperature-command/

On my system the temperature monitoring tool `vcgencmd` is located at `/usr/bin/vcgencmd`, which differs from the path shown in the reference above.

```bash
# sysfs reports the CPU temperature in millidegrees Celsius
cpu=$(</sys/class/thermal/thermal_zone0/temp)
echo "$(date) @ $(hostname)"
echo "-------------------------------------------"
echo "GPU => $(/usr/bin/vcgencmd measure_temp)"
echo "CPU => $((cpu/1000))'C"
```

**Swap File Increase**
> Reference \:
> * Increase size of existing swap file
> https://forums.raspberrypi.com/viewtopic.php?t=46472

```bash
sudo vim /etc/dphys-swapfile
```

The *swap size* and *swap location* settings can be found in the above file. Notice that `sudo` is required in order to modify the file.

**Cooling Raspberry Pi 3B+**
> Reference \:
> * Raspberry Pi 4 with cooling 5v\/3v?
> https://forums.raspberrypi.com/viewtopic.php?t=248918

The Raspberry Pi 3B+ runs above 51 degrees Celsius even in a fairly cold room \(20 degrees Celsius\). Therefore, a 2-pin fan was added to the case I have. The result is pretty good: the temperature drops to 36 degrees even under some load.

**Disable GUI**

```bash
sudo init 3
```

**Disable WiFi Temporarily**
> Reference \:
> * How do I turn off wifi?
> https://forums.raspberrypi.com/viewtopic.php?t=342095

```bash
ip link show # this will list all the network interfaces
sudo ip link set wlan0 down
```

**RPi-Monitor**
> Reference \:
> * Github Repo
> https://github.com/XavierBerger/RPi-Monitor?tab=readme-ov-file

This provides information through a web page. I think it would consume more resources than I want \(I would prefer something shell-based\). Therefore, I only list it here as a reference and will not deploy it on the current system.
:::

:::info
:information_source: **What’s the Difference Between Kernel Drivers and Kernel Modules?**
https://www.baeldung.com/linux/kernel-drivers-modules-difference#:~:text=While%20kernel%20drivers%20are%20an,from%20the%20kernel%20during%20runtime
:::

# Development Environment

> Reference \:
> https://www.youtube.com/watch?v=0SIqVkzDAuM

## Header Dependency

Basically, most tutorials are outdated; therefore, I am simply going to list what I did.

* Install packages and required module headers

```bash
sudo apt install raspberrypi-kernel raspberrypi-kernel-headers
```

* Make sure the module headers and the running kernel have matching version numbers.

```bash
ls -d /lib/modules/6.1.21-v7+/build/include # this dir should exist
uname --all
# Linux raspberrypi 6.1.21-v7+ #1642 SMP Mon Apr 3 17:20:52 BST 2023 armv7l GNU/Linux
```

* Follow the tutorial to create the `Makefile`. \(*TODO \: The detail might be covered later*\)

## VsCode Environment Setting \(Optional\)

This step only matters for the editor; the source code can be compiled without it.
VS Code errors are just annoying to see, but they do not affect compilation.
https://hackmd.io/@Eroiko/vscode-for-linux-kernel

The following configuration is what I currently use. Some false errors are still shown, but it does provide documentation for functions and macros.

```json
{
    "configurations": [
        {
            "name": "Linux",
            "includePath": [
                "${workspaceFolder}/**",
                "/usr/include",
                "/usr/local/include",
                "/usr/src/linux-headers-6.1.21-v7+/include",
                "/usr/src/linux-headers-6.1.21-v7+/arch/arm/include",
                "/usr/lib/gcc/arm-linux-gnueabihf/10/include"
            ],
            "defines": [
                "__GNUC__",
                "__KERNEL__"
            ],
            "compilerPath": "/bin/gcc",
            "cStandard": "c99",
            "cppStandard": "c++17",
            "intelliSenseMode": "linux-gcc-arm"
        }
    ],
    "version": 4
}
```

## Connect RPi3 with Serial Connection \(USB to TTL\)

> Reference \:
> * CH340G USB-to-TTL Module (in Chinese)
> https://www.taiwansensor.com.tw/product/ch340g-usb%E8%BD%89ttl%E6%A8%A1%E7%B5%84-ch340g%E5%88%B7%E6%A9%9F%E7%B7%9A-%E6%94%AF%E6%8F%B4-win10-linux/

The one I use is based on the CH340G. There are different chips to choose from, and they differ only in minor ways. See the following reference \:
> \[Most Complete Ever\] Feature Comparison of Common USB-to-Serial Chips (in Chinese)
> https://www.cnblogs.com/xiaobaibai2021/p/15716893.html

Using `putty` to connect to a serial port is the simplest way to do it. The other software I tested \(`windterm`, `mobaxterm`\) either had weird bugs or outright did not work. They might work properly if set up correctly; however, `putty` just works out of the box.

## Power Button for RPi3

> Reference \:
> * A Button to Switch your Pi Safe On and Off
> https://www.youtube.com/watch?v=h6GFXtKkXco

![RPi3_power_button](https://hackmd.io/_uploads/SkeG4CqCJl.png)

# Introduction

A Linux kernel module is precisely defined as a code segment capable of dynamic loading and unloading within the kernel as needed. These modules enhance kernel capabilities without necessitating a system reboot.
A notable example is the device driver module, which lets the kernel interact with hardware components attached to the system.

:::info
:information_source: **Difference between Linux Loadable and built-in modules**
https://stackoverflow.com/questions/22929065/difference-between-linux-loadable-and-built-in-modules
:::

:::success
:bulb: **List Kernel Modules in a Linux System**
Notice that `lsmod` only lists dynamically loaded modules, not the built-in ones.

```bash
lsmod
cat /proc/modules | column -t # Similar output if column is used
```
:::

## Hello Kernel

*TODO \: Add Github Repo*

### Kernel Module Code

#### Style \#1

```cpp
#include <linux/module.h> /* Needed by all modules */
#include <linux/printk.h> /* Needed for pr_info() */

/* Default entry point: called when the module is loaded with insmod. */
int init_module(void)
{
	pr_info("Hello world 1.\n");

	/* A non-zero return value means the module failed to load. */
	return 0;
}

/* Default exit point: called just before the module is removed with rmmod. */
void cleanup_module(void)
{
	pr_info("Goodbye world 1.\n");
}

MODULE_LICENSE("GPL");
```

#### Style \#2

```cpp
#include <linux/init.h>   /* Needed for the __init and __exit macros */
#include <linux/module.h> /* Needed by all modules */

/* static: the symbol does not need to be visible outside this file. */
static int __init hello_init(void)
{
	printk(KERN_ALERT "I am inside the kernel\n");
	return 0;
}

static void __exit hello_exit(void)
{
	printk(KERN_ALERT "Leaving the kernel\n");
}

/* Register the entry and exit points under custom names. */
module_init(hello_init);
module_exit(hello_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("ErebusTsai");
MODULE_DESCRIPTION("A Hello World Module.");
```

:::info
:information_source: **Difference between `printk` and `pr_info`**
Basically, `pr_info` is a macro wrapper around `printk` that fills in the `KERN_INFO` log level.
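
To make "macro wrapper" concrete, here is a small user-space sketch of the same pattern. `demo_printk`, `demo_pr_info`, `DEMO_KERN_INFO` and `demo_log` are hypothetical stand-ins that I made up so the code can run outside the kernel; the real definitions live in `include/linux/printk.h` and `include/linux/kern_levels.h`, and the real `pr_info` additionally passes the format string through `pr_fmt()`.

```c
#include <stdarg.h>
#include <stdio.h>

/* Same values as the kernel's KERN_SOH / KERN_INFO, under demo names. */
#define DEMO_KERN_SOH  "\001"
#define DEMO_KERN_INFO DEMO_KERN_SOH "6"

/* Hypothetical log buffer standing in for the kernel ring buffer. */
static char demo_log[128];

/* Stand-in for printk(): formats the message into demo_log. */
int demo_printk(const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(demo_log, sizeof(demo_log), fmt, ap);
	va_end(ap);
	return n;
}

/*
 * The wrapper: string-literal concatenation glues the log level in front
 * of the format. ##__VA_ARGS__ is a GNU extension (the kernel uses it too).
 */
#define demo_pr_info(fmt, ...) demo_printk(DEMO_KERN_INFO fmt, ##__VA_ARGS__)
```

So `demo_pr_info("Hello world %d.\n", 1)` expands to `demo_printk("\001" "6" "Hello world %d.\n", 1)`: the only thing the wrapper adds is the log-level prefix.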
* The macro wrappers for `printk` can be found in the source file `printk.h`
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/include/linux/printk.h
* Difference between `printk` and `pr_info`
https://stackoverflow.com/questions/42243185/difference-between-printk-and-pr-info
:::

### `Makefile` structure

#### `Makefile` basic for kernel modules

> Reference \:
> * \[Learning the Linux Kernel Step by Step\] Quick Start with Makefile and Kbuild Makefile (in Chinese)
> https://meetonfriday.com/posts/5523c739/

```makefile
obj-m += hello.o

PWD := $(CURDIR)

all:
	$(MAKE) -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

clean:
	$(MAKE) -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean
```

#### `Makefile` detail

> Reference \:
> * The Linux Kernel Module Programming Guide *by Peter Jay Salzman, Michael Burian, Ori Pomerantz, Bob Mottram, Jim Huang*
> https://sysprog21.github.io/lkmpg/

> If there is no `PWD := $(CURDIR)` statement in `Makefile`, then it may not compile correctly with `sudo make`. Because some environment variables are specified by the security policy, they can’t be inherited.

:::info
:information_source: Listing `Makefile` variables

```bash
sudo -E make -p | grep PWD # grep is recommended because the list is very, very big.
```
:::

> Kernel modules must have at least two functions: a "start" (initialization) function called `init_module()` which is called when the module is insmoded into the kernel, and an "end" (cleanup) function called `cleanup_module()` which is called just before it is removed from the kernel.

:::info
:information_source: **Kernel Programming Indentation**
Kernel programming requires **tabs, not spaces** for indentation. This is the coding convention of the kernel.
:::

The `obj-$(CONFIG_FOO)` entries you see everywhere in kernel Makefiles expand into `obj-y` or `obj-m`, depending on whether the `CONFIG_FOO` variable has been set to `y` or `m`.
While we are at it, those were exactly the kind of variables that you set in the `.config` file in the top-level directory of the Linux kernel source tree, the last time you ran `make menuconfig` or something like that.

## The `__init` and `__exit` Macros

> Reference \:
> * Linux Device Driver Programming Lecture 18\: `__init` and `__exit` macros
> https://fastbitlab.com/linux-device-driver-programming-lecture-18-__init-and-__exit-macros/#:~:text=__init%20is%20for%20code,functions%20during%20the%20build%20process.

## Licensing and Module Documentation

> Reference \:
> * `root/include/linux/module.h` in git.kernel.org
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/include/linux/module.h

Loading a proprietary module causes the Linux kernel to be tainted. Refer to [What is a tainted Linux kernel?](https://unix.stackexchange.com/questions/118116/what-is-a-tainted-linux-kernel)

## Building modules for a precompiled kernel

> Reference \:
> * Solved: disagrees about version of symbol module_layout (in Chinese)
> https://www.cnblogs.com/wanglouxiaozi/p/17767576.html

:::info
:information_source: **Linux Kernel Design: System call** (in Chinese)
https://hackmd.io/@RinHizakura/S1wfy6nQO
:::

# Preliminaries

## Functions Available to Modules

> Modules are object files whose symbols get resolved upon running insmod or modprobe. The definition for the symbols comes from the kernel itself; the only external functions you can use are the ones provided by the kernel.

Note that "provided by the kernel" here means symbols exported by the kernel \(see `/proc/kallsyms`\), not system calls. System calls are the interface that user-space programs use to request kernel services.

:::info
:information_source: **Linux System Call Table**
This website lists the system calls available on different computer architectures.
https://syscall.sh/
:::

## Name Space

> When writing kernel code, even the smallest module will be linked against the entire kernel, so this is definitely an issue. **The best way to deal with this is to declare all your variables as static and to use a well-defined prefix for your symbols**. **By convention, all kernel prefixes are lowercase**.
If you do not want to declare everything as static, another option is to declare a symbol table and register it with the kernel.

## Code Space

> Since a module is code which can be dynamically inserted and removed in the kernel \(as opposed to a semi-autonomous object\), it shares the kernel’s codespace rather than having its own. Therefore, if your module segfaults, the kernel segfaults. And if you start writing over data because of an **off-by-one error**, then you’re trampling on kernel data \(or code\). This is even worse than it sounds, so try your best to be careful.

## Device Drivers

> So the `es1370.ko` sound card device driver might connect the `/dev/sound` device file to the Ensoniq ES1370 sound card. A userspace program like mp3blaster can use `/dev/sound` without ever knowing what kind of sound card is installed.

```bash
crw-rw---- 1 root video 241, 0 Jan 7 17:05 media0
crw-rw---- 1 root video 241, 1 Jan 7 17:05 media1
crw-rw---- 1 root video 241, 2 Jan 7 17:05 media2
crw-r----- 1 root kmem    1, 1 Jan 7 17:05 mem
brw-rw---- 1 root disk  179, 0 Jan 7 17:05 mmcblk0
brw-rw---- 1 root disk  179, 1 Jan 7 17:05 mmcblk0p1
brw-rw---- 1 root disk  179, 2 Jan 7 17:05 mmcblk0p2
```

* `c` or `b` \: character device or block device
* Column with two numbers \: the first \(major\) number identifies the driver that handles the hardware, and the second \(minor\) number is used by that driver to differentiate between the devices it controls.

# Character Device Drivers

> Reference \:
> * Linux Driver ALIENTEK Course Notes 3 - My First Linux Driver (in Chinese)
> https://meetonfriday.com/posts/62f55520/

## `file_operations` Structure

> Reference \:
> * `file_operations` Structure Definition in `/include/linux/fs.h`
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/include/linux/fs.h

Notice that not all the functions need to be implemented. If an entry is not used by the driver, it should be set to `NULL`.
However, we don't need to assign them one by one: when the structure is initialized with designated initializers, any member that is not explicitly initialized is set to `NULL` \(this is standard C behavior, not something specific to `gcc`\).

:::info
:information_source: **Function Pointer in C**
[My other post](https://hackmd.io/7n-VHaZtQBmLUgHEmc2uIA?view#Chapter-3--Pointer-and-Functions)
:::

> Since Linux v3.14, the read, write and seek operations are guaranteed for **thread-safe** by using the `f_pos` specific lock, which makes the file position update to become the mutual exclusion. So, we can safely implement those operations without unnecessary locking.

## The file Structure

> Each device is represented in the kernel by a file structure, which is defined in `include/linux/fs.h`. Be aware that a file is a kernel level structure and never appears in a user space program. It is not the same thing as a `FILE`, which is defined by glibc and would never appear in a kernel space function.

## Registering a Device

```cpp
static inline int register_chrdev(unsigned int major, const char *name,
				  const struct file_operations *fops);
```

* `fops` is a pointer to the `file_operations` table
* A negative return value means the registration failed

### Dynamically Assign a Major Number

Simply pass a major number of 0 to `register_chrdev`; the return value will then be the dynamically allocated major number.

> The downside is that you can not make a device file in advance, since you do not know what the major number will be. There are a couple of ways to do this. First, the driver itself can print the newly assigned number and we can make the device file by hand. Second, the newly registered device will have an entry in `/proc/devices`, and we can either make the device file by hand or write a shell script to read the file in and make the device file. **The third method is that we can have our driver make the device file using the `device_create` function after a successful registration and `device_destroy` during the call to `cleanup_module`**.

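
The second option from the quote \(looking up the number in `/proc/devices`\) could be sketched in user space roughly like this. `find_char_major` and the device name `chardev` are hypothetical names of my own; the function works on a text buffer so it can be tried on a sample without a real driver loaded, and it assumes the usual layout of `/proc/devices` \(a `Character devices:` section followed by a `Block devices:` section, one `major name` pair per line\).

```c
#include <stdio.h>
#include <string.h>

/*
 * Returns the major number registered under `name` in the
 * "Character devices:" section of /proc/devices-style text, or -1.
 */
int find_char_major(const char *text, const char *name)
{
	int in_char_section = 0;

	while (*text) {
		/* Copy the current line out without modifying the input. */
		size_t len = strcspn(text, "\n");
		char line[128];
		size_t n = len < sizeof(line) - 1 ? len : sizeof(line) - 1;

		memcpy(line, text, n);
		line[n] = '\0';

		if (strncmp(line, "Character devices:", 18) == 0) {
			in_char_section = 1;
		} else if (strncmp(line, "Block devices:", 14) == 0) {
			in_char_section = 0;
		} else if (in_char_section) {
			int major;
			char dev[64];

			if (sscanf(line, "%d %63s", &major, dev) == 2 &&
			    strcmp(dev, name) == 0)
				return major;
		}

		/* Advance to the start of the next line. */
		text += len;
		if (*text == '\n')
			text++;
	}
	return -1;
}
```

With the contents of `/proc/devices` read into a buffer, `find_char_major(buf, "chardev")` gives the number to pass to `mknod`, e.g. `sudo mknod /dev/chardev c <major> 0`.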
### Unregistering a Device

> We can not allow the kernel module to be rmmod’ed whenever root feels like it. If the device file is opened by a process and then we remove the kernel module, using the file would cause a call to the memory location where the appropriate function \(read\/write\) used to be. If we are lucky, no other code was loaded there, and we’ll get an ugly error message. If we are unlucky, another kernel module was loaded into the same location, which means a jump into the middle of another function within the kernel. The results of this would be impossible to predict, but they can not be very positive.

A counter that keeps track of how many processes are using the module indicates whether the module can be removed safely. It is shown in the `Used by` column of `lsmod`:

```bash
Module                  Size  Used by
uvcvideo              102400  0
sg                     28672  0
uas                    24576  0
8021q                  32768  0
garp                   16384  1 8021q
stp                    16384  1 garp
llc                    16384  2 garp,stp
binfmt_misc            20480  1
spidev                 20480  0
vc4                   315392  2
snd_soc_hdmi_codec     16384  1
brcmfmac              335872  0
drm_display_helper     16384  1 vc4
cec                    49152  1 vc4
drm_dma_helper         20480  1 vc4
```

### Writing Modules for Multiple Kernel Versions

> The system calls, which are the major interface the kernel shows to the processes, generally stay the same across versions. A new system call may be added, but usually the old ones will behave exactly like they used to.

> The way to do this is to compare the macro `LINUX_VERSION_CODE` to the macro `KERNEL_VERSION`. In version `a.b.c` of the kernel, the value of this macro would be `2^16 * a + 2^8 * b + c`.

# The `/proc` File System \(VFS\)

> Reference \:
> * The `/proc` File System (in Chinese)
> https://ithelp.ithome.com.tw/articles/10160143
> * wiki
> https://zh.wikipedia.org/zh-tw/Procfs

## Code Example

> Reference \:
> * `procfs1.c`
> https://github.com/sysprog21/lkmpg/blob/master/examples/procfs1.c

## The `proc_ops` Structure

The `proc_ops` structure is defined in `include/linux/proc_fs.h` in `Linux v5.6+`. My rule of thumb is to write things the newest way and only learn the old ways when they are needed.

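
The version comparison quoted earlier is also what decides between `proc_ops` and the older `file_operations` here; as far as I can tell, the book's `procfs1.c` tests `LINUX_VERSION_CODE >= KERNEL_VERSION(5, 6, 0)`. The following user-space sketch only reproduces the arithmetic so that the numbers can be checked by hand. `DEMO_KERNEL_VERSION` and `demo_has_proc_ops` are hypothetical names of my own; in a real module the macros come from `<linux/version.h>`.

```c
#include <stdio.h>

/* User-space copy of the classic formula: 2^16 * a + 2^8 * b + c. */
#define DEMO_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Returns 1 if kernel a.b.c is new enough (v5.6+) to provide proc_ops. */
int demo_has_proc_ops(int a, int b, int c)
{
	return DEMO_KERNEL_VERSION(a, b, c) >= DEMO_KERNEL_VERSION(5, 6, 0);
}
```

For the `6.1.21` kernel used in this note the encoded value is `6 * 65536 + 1 * 256 + 21 = 393493`, which is above the `5.6.0` threshold of `329216`, so the `proc_ops` path applies.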
## Important Note

The Stack Overflow post linked in the box below discusses how to choose between `sys` and `proc`. Notice that the accepted answer recommends using `sys`.

:::success
:bulb: **How to choose between "sys" and "proc" files in linux kernel**
https://stackoverflow.com/questions/34133497/how-to-choose-between-sys-and-proc-files-in-linux-kernel
:::

# `sysfs`: Interacting with your module

> Reference \:
> * Github Resource Code
> https://github.com/sysprog21/lkmpg/blob/master/examples/hello-sysfs.c

In this chapter, the book provides sample code that can be used as a starting point when this functionality is needed. Starting from an example is much easier than working from the ground up.

## Description of `kobject` in the Book

> ... After a bit of mission creep, it is now the glue that holds much of the device model and its sysfs interface together.

:::success
:bulb: **Unified Device Model: An Analysis of kobj and kset** (in Chinese)
http://www.wowotech.net/device_model/421.html
:::

:::success
:bulb: **Linux Kernel: Device Driver Model (1) sysfs and the kobject Base Class** (in Chinese)
https://www.cnblogs.com/schips/p/linux_device_model_1.html
:::

# `ioctl` Talking to Device Files

:::success
:bulb: **Linux driver: ioctl or sysfs?**
https://stackoverflow.com/questions/40529308/linux-driver-ioctl-or-sysfs
:::

Basically, the driver defines what it will do for each command when a user-space process calls `ioctl` on its device file.

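
As a rough illustration of that idea, the sketch below defines two command numbers with the macros from `<linux/ioctl.h>` and dispatches on them the way a driver's handler typically does. Everything prefixed with `DEMO_` or `demo_` is hypothetical and made up for this note. It runs in user space, so `arg` is a plain `int *`; a real handler has the signature `long (*unlocked_ioctl)(struct file *, unsigned int, unsigned long)` and must copy data with helpers such as `copy_from_user()` instead of dereferencing the pointer directly.

```c
#include <errno.h>
#include <linux/ioctl.h>

#define DEMO_IOC_MAGIC 'k' /* hypothetical magic number for this sketch */
#define DEMO_IOC_SET _IOW(DEMO_IOC_MAGIC, 0, int) /* user space -> driver */
#define DEMO_IOC_GET _IOR(DEMO_IOC_MAGIC, 1, int) /* driver -> user space */

/* State that the pretend driver keeps between calls. */
static int demo_value;

/* Mimics the shape of an ioctl handler: one switch over the command. */
long demo_ioctl(unsigned int cmd, int *arg)
{
	/* Reject commands that belong to some other driver. */
	if (_IOC_TYPE(cmd) != DEMO_IOC_MAGIC)
		return -ENOTTY;

	switch (cmd) {
	case DEMO_IOC_SET:
		demo_value = *arg;
		return 0;
	case DEMO_IOC_GET:
		*arg = demo_value;
		return 0;
	default:
		return -ENOTTY;
	}
}
```

The command number itself carries the direction, the magic type, the sequence number and the argument size, which is why `_IOC_TYPE()` can be used to filter out foreign commands before the `switch`.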
:::success
:bulb: **Learning Embedded Linux with Raspberry Pi \(3\): Exchanging Messages with a Kernel Module via ioctl** (in Chinese)
https://medium.com/@sepfy95/用raspberry-pi學embedded-linux-3-kernel-module用ioctl交換訊息-6e49a5c25fa2
:::

:::success
:bulb: **\[Learning the Linux Kernel Step by Step\] Difference between `ioctl`, `unlocked_ioctl` and `compat_ioctl`** (in Chinese)
https://meetonfriday.com/posts/736969d7/
:::

> Reference \:
> * Github Resource Code
> https://github.com/sysprog21/lkmpg/blob/master/examples/ioctl.c

> Most physical devices are used for output as well as input, so there has to be some mechanism for device drivers in the kernel to get the output to send to the device from processes. This is done by opening the device file for output and writing to it, just like writing to a file. ... However, this leaves open the question of what to do when you need to talk to the serial port itself, for example to configure the rate at which data is sent and received.

Notice that, in the book's wording, an `ioctl` read sends information to the kernel and an `ioctl` write receives information from the kernel; basically, things are described from the kernel's point of view. The `_IOR` and `_IOW` macros in `<linux/ioctl.h>` are named the other way around, from user space's point of view: `_IOW` means that user space writes and the kernel reads.

> The ioctl function is called with three parameters: the file descriptor of the appropriate device file, the ioctl number, and a parameter, which is of type long so you can use a cast to use it to pass anything. You will not be able to pass a structure this way, but you will be able to pass a pointer to the structure.

```cpp
#include <linux/ioctl.h>
```

:::info
:information_source: **`Documentation/userspace-api/ioctl/ioctl-number.rst`**
https://github.com/torvalds/linux/blob/master/Documentation/userspace-api/ioctl/ioctl-number.rst
:::

TODO \: Include and Read Example `chardev2.c`, `ioctl.c`

# System Calls

> Reference \:
> * System Call - Open Source Dungeon from Scratch (in Chinese)
> https://hackmd.io/@combo-tw/Linux-%E8%AE%80%E6%9B%B8%E6%9C%83/%2F%40combo-tw%2FBJPoAcqQS

> The location in the kernel a process can jump to is called `system_call`.
The procedure at that location checks the system call number, which tells the kernel what service the process requested. Then, it looks at the table of system calls \(`sys_call_table`\) to see the address of the kernel function to call. Then it calls the function, and after it returns, does a few system checks and then returns back to the process.

If we want to change how a system call works, we can write our own version of the call and then change only the pointer in `sys_call_table` so that it points to our function. It is recommended to be able to restore the original table entry afterwards.

# Blocking Process and threads

## Sleep

> Reference \:
> * Code Example
> https://github.com/sysprog21/lkmpg/blob/master/examples/sleep.c

A task sleeping in `wait_event_interruptible` can be interrupted by a signal such as the one sent by `Ctrl+C`, while a task sleeping in `wait_event` ignores `Ctrl+C`.

> This function changes the status of the task \(a task is the kernel data structure which holds information about a process and the system call it is in, if any\) to `TASK_INTERRUPTIBLE`, which means that the task will not run until it is woken up somehow, and adds it to WaitQ, the queue of tasks waiting to access the file. Then, the function calls the scheduler to context switch to a different process, one which has some use for the CPU.

:::success
In my opinion, the synchronization mechanisms needed for kernel modules are much simpler than what general concurrent algorithms need.
:::

## Completions

> Reference \:
> * Code Example
> https://github.com/sysprog21/lkmpg/blob/master/examples/completions.c

`wait_for_completion()` and `complete()` are used to provide synchronization between threads in a module. This mechanism is used to establish a **happens-before** relationship: the waiting thread does not continue until the other thread has signaled that its work is done.

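
Since the kernel API cannot be run outside a module, here is a user-space analogue of the same happens-before pattern, built on a pthread mutex and condition variable instead of the kernel's `struct completion`. This is a swapped-in technique for illustration only, and all `demo_` names are hypothetical. The crank and flywheel threads loosely follow the roles in the book's `completions.c`.

```c
#include <pthread.h>
#include <stdbool.h>

/* User-space stand-in for the kernel's struct completion. */
struct demo_completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

static void demo_init_completion(struct demo_completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

/* Roughly complete_all(): mark the event and wake every waiter. */
static void demo_complete(struct demo_completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_broadcast(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

/* Roughly wait_for_completion(): sleep until the event has happened. */
static void demo_wait_for_completion(struct demo_completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

static struct demo_completion crank_done;
static int order[2];
static int next_slot;

static void *crank_thread(void *unused)
{
	(void)unused;
	order[next_slot++] = 1; /* work that must happen first */
	demo_complete(&crank_done);
	return NULL;
}

static void *flywheel_thread(void *unused)
{
	(void)unused;
	demo_wait_for_completion(&crank_done);
	order[next_slot++] = 2; /* runs only after the crank has finished */
	return NULL;
}

/* Returns 1 if the crank step was recorded before the flywheel step. */
int demo_run(void)
{
	pthread_t crank, flywheel;

	next_slot = 0;
	demo_init_completion(&crank_done);
	pthread_create(&flywheel, NULL, flywheel_thread, NULL); /* waiter first */
	pthread_create(&crank, NULL, crank_thread, NULL);
	pthread_join(flywheel, NULL);
	pthread_join(crank, NULL);
	return order[0] == 1 && order[1] == 2;
}
```

Even though the waiting thread is started first, `demo_run()` should always report the crank step before the flywheel step. On older toolchains this needs to be compiled with `gcc -pthread`.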
# Avoiding Collisions and Deadlocks

## Spinlock

> Reference \:
> * Linux Kernel Synchronization (2): Spinlock (in Chinese)
> https://blog.csdn.net/zhoutaopower/article/details/86598839
> * Spinlock in Linux Kernel Part 1 – Linux Device Driver Tutorial Part 23
> https://embetronicx.com/tutorials/linux/device-drivers/spinlock-in-linux-kernel-1/#SpinLock

> Sleeping in atomic contexts may leave the system hanging, as the occupied CPU devotes 100% of its resources doing nothing but sleeping. In some worse cases the system may crash. Thus, sleeping in atomic contexts is considered a bug in the kernel. They are sometimes called “sleep-in-atomic-context” in some materials.

## Atomics

> Before the C11 standard adopted the built-in atomic types, the kernel already provided a small set of atomic types by using a bunch of tricky architecture-specific codes. Implementing the atomic types by C11 atomics may allow the kernel to throw away the architecture-specific codes and make the kernel code more friendly to the people who understand the standard. But there are some problems, such as the memory model of the kernel doesn’t match the model formed by the C11 atomics.

# Scheduling Tasks

## Tasklets

> Although tasklet is easy to use, it comes with several drawbacks, and developers are discussing getting rid of tasklets in the Linux kernel. The tasklet callback runs in atomic context, inside a software interrupt, meaning that it cannot sleep or access user-space data, so not all work can be done in a tasklet handler. Also, the kernel only allows one instance of any given tasklet to be running at any given time; multiple different tasklet callbacks can run in parallel.

Basically, tasklets are gradually being replaced by other mechanisms, such as work queues.

## Work Queues

See [Example](https://github.com/sysprog21/lkmpg/blob/master/examples/sched.c)

:::info
:information_source: **Completely Fair Scheduler \(CFS\)**
https://w.wiki/D6wg
:::

# Interrupt Handlers

> Reference \:
> * Understanding Interrupts, Softirqs, and Softnet in Linux
> https://www.netdata.cloud/blog/understanding-interrupts-softirqs-and-softnet-in-linux/#what-are-interrupts
> * Linux Kernel Design: Interrupt Handling and Modern Architecture Considerations (in Chinese)
> https://hackmd.io/@sysprog/linux-interrupt
> * What is the difference between FIQ and IRQ interrupt system?
> https://stackoverflow.com/questions/973933/what-is-the-difference-between-fiq-and-irq-interrupt-system

> Linux kernel solves the problem by splitting interrupt handling into two parts. The first part executes right away and masks the interrupt line. Hardware interrupts must be handled quickly, and that is why we need the second part to handle the heavy work deferred from an interrupt handler.

## Examples

* Handling GPIO with interrupts
https://github.com/sysprog21/lkmpg/blob/master/examples/intrpt.c
* Top and bottom half interrupt handling
https://github.com/sysprog21/lkmpg/blob/master/examples/bottomhalf.c
* Top and bottom half interrupt handling with threads
https://github.com/sysprog21/lkmpg/blob/master/examples/bh_threaded.c

# Standardizing the interfaces\: The Device Model

Basically, use the following reference source code as the example to work from.
https://github.com/sysprog21/lkmpg/blob/master/examples/devicemodel.c

# Optimization

## Likely and Unlikely Conditions

:::info
:information_source: **Similar to How We Optimize User Space Program**
https://hackmd.io/@Erebustsai/HJ4PO0bV6
:::

## Static Keys

> Reference \:
> * Linux Static Key: Principles and Applications (in Chinese)
> https://blog.csdn.net/SGchi/article/details/132859345