# 2025q1 Homework3 (kxo)
contributed by < [otischung](https://github.com/otischung) >
{%hackmd NrmQUGbRQWemgwPfhzXj6g %}
## Environment
```bash
❯ uname -a
Linux scream-Ubuntu-24 6.11.0-24-generic #21~24.04.1-Ubuntu SMP PREEMPT_DYNAMIC Mon Feb 24 16:52:15 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
```
```bash
❯ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-linux-gnu/13/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 13.3.0-6ubuntu2~24.04' --with-bugurl=file:///usr/share/doc/gcc-13/README.Bugs --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-13 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/libexec --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-libstdcxx-backtrace --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --enable-cet --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-13-fG75Ri/gcc-13-13.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-fG75Ri/gcc-13-13.3.0/debian/tmp-gcn/usr --enable-offload-defaulted --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-build-config=bootstrap-lto-lean --enable-link-serialization=2
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.3.0 (Ubuntu 13.3.0-6ubuntu2~24.04)
```
```bash
❯ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 24
On-line CPU(s) list: 0-23
Vendor ID: GenuineIntel
Model name: 13th Gen Intel(R) Core(TM) i7-13700
CPU family: 6
Model: 183
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 1
Stepping: 1
CPU(s) scaling MHz: 21%
CPU max MHz: 5200.0000
CPU min MHz: 800.0000
BogoMIPS: 4224.00
```
## Settings of `c_cpp_properties.json`
If you are using Visual Studio Code to develop the code, you may need the following settings to update the custom include path for this repository.
If there is no such file in your workspace, press ctrl+shift+p, enter "C/C++: Edit Configurations (JSON)", and you'll get the json file.
```json
{
"configurations": [
{
"name": "Linux Kernel Module",
"includePath": [
"${workspaceFolder}/**",
"/usr/include",
"/usr/local/include",
"/usr/lib/gcc/x86_64-linux-gnu/13/include",
"/usr/src/linux-hwe-6.11-headers-6.11.0-21/include",
"/usr/src/linux-hwe-6.11-headers-6.11.0-21/arch/x86/include",
"/usr/src/linux-headers-6.11.0-21-generic/include",
"/usr/src/linux-headers-6.11.0-21-generic/arch/x86/include/generated"
],
"defines": [
"__KERNEL__",
"MODULE"
],
"compilerPath": "/usr/bin/gcc",
"cStandard": "c11",
"cppStandard": "c++17",
"intelliSenseMode": "linux-gcc-x64"
}
],
"version": 4
}
```
:::info
Question: What is the difference between
- `linux-hwe-6.11-headers-6.11.0-24`
- `linux-headers-6.11.0-24-generic`
:::
## Code-Structure Analysis
This project involves *kernel OOT (out-of-tree) modules*. A bug in module code can cause a kernel panic, so make sure you understand the project structure and know exactly what needs to be done before making any modifications.
### Basic Usage
In the project directory, after running `make` in the terminal, you're expected to get the `kxo.ko` kernel object file. You can then insert the kernel module with:
```bash
sudo insmod ./kxo.ko
```
To unload the kernel module, run the following command:
```bash
sudo rmmod kxo
```
This project provides a userspace interface called `xo-user` to display the current Tic-Tac-Toe game board and control the kernel module. The following control commands are available:
- `ctrl+p`: Toggle pause/resume for the game board display.
- `ctrl+q`: Terminate all Tic-Tac-Toe games running in kernel space.
To enable terminal input shortcuts, run the following command:
```bash
stty start '^-' stop '^-'
```
After completing the above settings, you can run the userspace control program with root privileges.
```bash
sudo ./xo-user
```
### Make Kernel Modules
`make all` expands to the following command:
```bash
make -C /lib/modules/$(shell uname -r)/build M=$(shell pwd) modules
```
The directory specified in the `-C` option is actually a symbolic link that points to `/usr/src/linux-headers-$(shell uname -r)`.
```bash
ls -lash /lib/modules/6.11.0-21-generic/build
0 lrwxrwxrwx 1 root root 40 Feb 24 23:23 /lib/modules/6.11.0-21-generic/build -> /usr/src/linux-headers-6.11.0-21-generic
```
Next, we read the [manual](https://www.man7.org/linux/man-pages/man1/make.1.html) of the `make` command.
> -C dir, --directory=dir
> Change to directory dir before reading the makefiles or doing anything else. If multiple -C options are specified, each is interpreted relative to the previous one: -C / -C etc is equivalent to -C /etc. This is typically used with recursive invocations of make.
Therefore, the build also relies on the `Makefile` in the Linux header directory, which is where the `modules` target is defined.
The definition of `M=dir` is shown below:
> Use make M=dir or set the environment variable KBUILD_EXTMOD to specify the directory of external module to build. Setting M= takes precedence.
The definition of `O=dir` is shown below:
> If you want to save output files in a different location, there are two syntaxes to specify it.
> 1) O=
> Use "make O=dir/to/store/output/files/"
> 2) Set KBUILD_OUTPUT
> Set the environment variable KBUILD_OUTPUT to point to the output directory.
> export KBUILD_OUTPUT=dir/to/store/output/files/; make
>
> The O= assignment takes precedence over the KBUILD_OUTPUT environment variable.
There are other interesting variables as well, such as `CROSS_COMPILE` and `INSTALL_MOD_PATH`. I’m excited to discover that the kernel compilation for the NVIDIA Jetson Orin Nano in my [previous work](https://github.com/otischung/jetson_linux_36.4.3) is similar to this project. I successfully enabled custom features like exFAT, nftables, and WireGuard, then compiled and flashed them to the board.
### `xo-user`
This is a userspace program that reads the status of the kernel module, displays the current Tic-Tac-Toe game board, and controls the kernel module via the `kxo_state` sysfs attribute file.
```cpp
#define XO_DEVICE_ATTR_FILE "/sys/class/kxo/kxo/kxo_state"
```
The main process uses `select` on 2 file descriptors, `stdin` and `XO_DEVICE_FILE`, to perform I/O multiplexing.
```cpp
#define XO_DEVICE_FILE "/dev/kxo"
```
When the user presses a key, `stdin` becomes readable, `select` returns, and `listen_keyboard_handler` is called.
The handler opens `XO_DEVICE_ATTR_FILE` to retrieve the current status of the `kxo` kernel module. It then modifies the values and writes back to the file to control the kernel module.
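The multiplexing step described above can be sketched as follows. This is a minimal illustration rather than the code in `xo-user`: the helper `wait_for_input` and the `ready_mask` enum are hypothetical names, and any two readable descriptors can stand in for `stdin` and `/dev/kxo`.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Which descriptors became readable; the bits can be OR-ed together. */
enum ready_mask { READY_NONE = 0, READY_KEYBOARD = 1, READY_DEVICE = 2 };

/*
 * Block in select() until keyboard_fd or device_fd is readable.
 * Returns a combination of ready_mask bits, or -1 if select() fails.
 * (Hypothetical helper, not part of xo-user.)
 */
static int wait_for_input(int keyboard_fd, int device_fd)
{
    assert(keyboard_fd >= 0 && device_fd >= 0);

    fd_set readset;
    FD_ZERO(&readset);
    FD_SET(keyboard_fd, &readset);
    FD_SET(device_fd, &readset);

    /* select() takes the highest descriptor number plus one. */
    int max_fd = keyboard_fd > device_fd ? keyboard_fd : device_fd;
    if (select(max_fd + 1, &readset, NULL, NULL, NULL) < 0)
        return -1;

    int ready = READY_NONE;
    if (FD_ISSET(keyboard_fd, &readset))
        ready |= READY_KEYBOARD;
    if (FD_ISSET(device_fd, &readset))
        ready |= READY_DEVICE;
    return ready;
}
```

A caller would loop on `wait_for_input(STDIN_FILENO, device_fd)`, run the keyboard handler when `READY_KEYBOARD` is set, and `read()` the board when `READY_DEVICE` is set.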
### `main`
#### Define a Read-Write Device Attribute
The functions, `kxo_state_show` and `kxo_state_store`, aren't called directly in the code; rather, they're used indirectly by the sysfs subsystem. They are assigned as the callbacks for a device attribute via the macro:
```cpp
static DEVICE_ATTR_RW(kxo_state);
```
This macro creates a device attribute (named `dev_attr_kxo_state`) that ties `kxo_state_show` to the read (show) operation and `kxo_state_store` to the write (store) operation.
In `linux/device.h`:
```cpp
/**
* DEVICE_ATTR_RW - Define a read-write device attribute.
* @_name: Attribute name.
*
* Like DEVICE_ATTR(), but @_mode is 0644, @_show is <_name>_show,
* and @_store is <_name>_store.
*/
#define DEVICE_ATTR_RW(_name) \
struct device_attribute dev_attr_##_name = __ATTR_RW(_name)
```
In `linux/sysfs.h`:
```cpp
#define __ATTR_RW(_name) __ATTR(_name, 0644, _name##_show, _name##_store)
/*
* Use these macros to make defining attributes easier.
* See include/linux/device.h for examples..
*/
#define __ATTR(_name, _mode, _show, _store) { \
.attr = {.name = __stringify(_name), \
.mode = VERIFY_OCTAL_PERMISSIONS(_mode) }, \
.show = _show, \
.store = _store, \
}
struct attribute {
const char *name;
umode_t mode;
#ifdef CONFIG_DEBUG_LOCK_ALLOC
bool ignore_lockdep:1;
struct lock_class_key *key;
struct lock_class_key skey;
#endif
};
```
Again, in `linux/device.h`:
```cpp
/**
* DEVICE_ATTR - Define a device attribute.
* @_name: Attribute name.
* @_mode: File mode.
* @_show: Show handler. Optional, but mandatory if attribute is readable.
* @_store: Store handler. Optional, but mandatory if attribute is writable.
*
* Convenience macro for defining a struct device_attribute.
*
* For example, ``DEVICE_ATTR(foo, 0644, foo_show, foo_store);`` expands to:
*
* .. code-block:: c
*
* struct device_attribute dev_attr_foo = {
* .attr = { .name = "foo", .mode = 0644 },
* .show = foo_show,
* .store = foo_store,
* };
*/
#define DEVICE_ATTR(_name, _mode, _show, _store) \
struct device_attribute dev_attr_##_name = __ATTR(_name, _mode, _show, _store)
/**
* struct device_attribute - Interface for exporting device attributes.
* @attr: sysfs attribute definition.
* @show: Show handler.
* @store: Store handler.
*/
struct device_attribute {
struct attribute attr;
ssize_t (*show)(struct device *dev, struct device_attribute *attr,
char *buf);
ssize_t (*store)(struct device *dev, struct device_attribute *attr,
const char *buf, size_t count);
};
```
According to these definitions, the macro expands to:
```cpp
struct device_attribute dev_attr_kxo_state = {
.attr = {.name = "kxo_state",
.mode =
(((int) (sizeof(struct { int : (-!!((0644) < 0)); }))) +
((int) (sizeof(struct { int : (-!!((0644) > 0777)); }))) +
((int) (sizeof(struct {
int : (-!!((((0644) >> 6) & 4) < (((0644) >> 3) & 4)));
}))) +
((int) (sizeof(struct {
int : (-!!((((0644) >> 3) & 4) < ((0644) & 4)));
}))) +
((int) (sizeof(struct {
int : (-!!((((0644) >> 6) & 2) < (((0644) >> 3) & 2)));
}))) +
((int) (sizeof(struct { int : (-!!((0644) & 2)); }))) +
(0644))},
.show = kxo_state_show,
.store = kxo_state_store,
};
```
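Each `sizeof(struct { int : (-!!(cond)); })` term in the expansion is a compile-time check: if `cond` is true the bit-field width becomes negative and compilation fails, otherwise the term contributes 0 and the whole expression evaluates to the mode itself. The sketch below restates the same six conditions as an ordinary run-time function so they are easier to read. The names `octal_perms_ok` and `verified_mode` are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Run-time restatement of the six conditions visible in the expansion.
 * In the kernel a violation is a compile error, not a false return value.
 */
static bool octal_perms_ok(int mode)
{
    if (mode < 0 || mode > 0777)
        return false; /* outside the permission-bit range */
    if (((mode >> 6) & 4) < ((mode >> 3) & 4))
        return false; /* group-readable but not owner-readable */
    if (((mode >> 3) & 4) < (mode & 4))
        return false; /* other-readable but not group-readable */
    if (((mode >> 6) & 2) < ((mode >> 3) & 2))
        return false; /* group-writable but not owner-writable */
    if (mode & 2)
        return false; /* world-writable is rejected */
    return true;
}

/* Same shape as the macro: validate first, then yield the mode unchanged. */
static int verified_mode(int mode)
{
    assert(octal_perms_ok(mode));
    return mode;
}
```

With these rules `0644` and `0444` pass, while `0666` (world-writable) and `0464` (group-writable without owner-writable) are rejected.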
Later in the initialization function (`kxo_init`), the attribute is registered with the device by this call:
```cpp
ret = device_create_file(kxo_dev, &dev_attr_kxo_state);
```
Once registered, whenever a user reads from or writes to the sysfs file (typically found at `/sys/class/kxo/kxo/kxo_state`), the corresponding function is automatically invoked by the sysfs infrastructure.
```
ls -lash /sys/class/kxo/kxo/kxo_state
0 -rw-r--r-- 1 root root 4.0K May 1 21:15 /sys/class/kxo/kxo/kxo_state
```
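The indirection can be modeled in userspace to see how a single name yields the attribute and both callbacks. This is only a sketch of the dispatch idea: `struct demo_attr`, `DEMO_ATTR_RW`, `attr_read`, and `attr_write` are hypothetical stand-ins, the handlers drop the `struct device` arguments, and the one-character state is illustrative rather than the format `kxo_state` actually uses.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h> /* ssize_t */

/* Hypothetical stand-in for struct device_attribute. */
struct demo_attr {
    const char *name;
    ssize_t (*show)(char *buf, size_t size);
    ssize_t (*store)(const char *buf, size_t count);
};

/* Same token-pasting idea as DEVICE_ATTR_RW(): one name, three symbols. */
#define DEMO_ATTR_RW(_name) \
    struct demo_attr demo_attr_##_name = {#_name, _name##_show, _name##_store}

static char demo_state = '1'; /* illustrative one-character state */

static ssize_t demo_state_show(char *buf, size_t size)
{
    return snprintf(buf, size, "%c\n", demo_state);
}

static ssize_t demo_state_store(const char *buf, size_t count)
{
    if (count == 0)
        return -1;
    demo_state = buf[0];
    return (ssize_t) count; /* report how many bytes were consumed */
}

static DEMO_ATTR_RW(demo_state);

/* What the sysfs core does on read(2), reduced to a pointer call. */
static ssize_t attr_read(const struct demo_attr *a, char *buf, size_t size)
{
    assert(a && a->show);
    return a->show(buf, size);
}

/* What the sysfs core does on write(2), reduced to a pointer call. */
static ssize_t attr_write(const struct demo_attr *a, const char *buf, size_t n)
{
    assert(a && a->store);
    return a->store(buf, n);
}
```

Neither handler is called by name anywhere: `attr_read` and `attr_write` only see the function pointers stored in `demo_attr_demo_state`, which is the same reason `kxo_state_show` and `kxo_state_store` have no direct callers in the module.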
#### Interrupt Request (IRQ)
The timer periodically triggers events (simulated IRQs) that update the game state.
#### Game-Board Drawing Buffer
The updated game board is rendered into the `draw_buffer`.
```cpp
#define BOARD_SIZE 4
#define DRAWBUFFER_SIZE \
((BOARD_SIZE * (BOARD_SIZE + 1) << 1) + (BOARD_SIZE * BOARD_SIZE) + \
((BOARD_SIZE << 1) + 1) + 1)
static char draw_buffer[DRAWBUFFER_SIZE]; // 66
```
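The `// 66` comment can be checked by evaluating the four summands separately. Note that `*` binds tighter than `<<`, so the first term is `(4 * 5) << 1`. The `TERM_*` names and `drawbuffer_size` below are hypothetical and exist only for this check.

```c
#include <assert.h>

#define BOARD_SIZE 4
#define DRAWBUFFER_SIZE \
    ((BOARD_SIZE * (BOARD_SIZE + 1) << 1) + (BOARD_SIZE * BOARD_SIZE) + \
     ((BOARD_SIZE << 1) + 1) + 1)

/* The four summands of DRAWBUFFER_SIZE, named for inspection only. */
enum {
    TERM_1 = (BOARD_SIZE * (BOARD_SIZE + 1)) << 1, /* (4 * 5) << 1 = 40 */
    TERM_2 = BOARD_SIZE * BOARD_SIZE,              /* 4 * 4 = 16 */
    TERM_3 = (BOARD_SIZE << 1) + 1,                /* 8 + 1 = 9 */
    TERM_4 = 1
};

/* Returns the macro value after confirming it equals the sum of the terms. */
static int drawbuffer_size(void)
{
    assert(TERM_1 + TERM_2 + TERM_3 + TERM_4 == DRAWBUFFER_SIZE);
    return DRAWBUFFER_SIZE;
}
```

For `BOARD_SIZE` equal to 4 the sum is 40 + 16 + 9 + 1 = 66.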
The board message is shown below:
```
<- Note that there are 2 newlines '\n' at the beginning.
| | | <- Note that there is a newline '\n' here.
-------
|X|X|
-------
O|O|X|
-------
| | |
------- <- Note that there is a newline '\n' here.
```
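A userspace sketch of this layout is shown below. It assumes that each row alternates cells and `|` with no trailing separator and is followed by a line of dashes, which is the reading of the diagram that makes the total come to 66 bytes. `render_board`, `clear_table`, `ROW_WIDTH`, and `BOARD_BYTES` are hypothetical names, and the module's own drawing function may be structured differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BOARD_SIZE 4
#define N_GRIDS (BOARD_SIZE * BOARD_SIZE)
#define ROW_WIDTH ((BOARD_SIZE << 1) - 1) /* 4 cells + 3 separators = 7 */
/* 2 leading newlines + 4 rows of (cell line + dash line), each 7 + '\n' */
#define BOARD_BYTES (2 + BOARD_SIZE * 2 * (ROW_WIDTH + 1))

/* Fill every cell with a blank. */
static void clear_table(char *table)
{
    memset(table, ' ', N_GRIDS);
}

/*
 * Render table (N_GRIDS cells holding ' ', 'O' or 'X') into out.
 * Returns the number of bytes written; out is not NUL-terminated.
 */
static size_t render_board(const char *table, char *out, size_t out_size)
{
    size_t i = 0;
    int k = 0;

    assert(out_size >= (size_t) BOARD_BYTES);

    out[i++] = '\n';
    out[i++] = '\n';
    for (int row = 0; row < BOARD_SIZE; row++) {
        for (int j = 0; j < ROW_WIDTH; j++)
            out[i++] = (j & 1) ? '|' : table[k++]; /* odd columns separate */
        out[i++] = '\n';
        for (int j = 0; j < ROW_WIDTH; j++)
            out[i++] = '-';
        out[i++] = '\n';
    }
    return i;
}
```

Under these assumptions every row costs 16 bytes, so four rows plus the two leading newlines give 66, matching `DRAWBUFFER_SIZE`.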
#### Kernel FIFO Buffer and Mutex Lock
Data from `draw_buffer` is then transferred into the `rx_fifo` FIFO buffer.
```cpp
static DECLARE_KFIFO_PTR(rx_fifo, unsigned char);
```
This macro expands to:
```cpp
struct {
union {
struct __kfifo kfifo;
unsigned char *type;
const unsigned char *const_type;
char (*rectype)[0];
unsigned char *ptr;
unsigned char const *ptr_const;
};
unsigned char buf[0];
} rx_fifo
```
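The expansion hides the indexing scheme inside `struct __kfifo`. The sketch below models only that scheme in userspace, assuming a fixed power-of-two capacity and free-running `in`/`out` counters that are masked on each access. `struct demo_fifo` and its functions are hypothetical and much simpler than the real kfifo API.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_FIFO_SIZE 8u /* must be a power of two */

/* Hypothetical byte FIFO; in and out run freely and are masked on use. */
struct demo_fifo {
    unsigned int in;  /* total bytes ever written */
    unsigned int out; /* total bytes ever read */
    unsigned char data[DEMO_FIFO_SIZE];
};

/* Bytes currently stored; unsigned wrap-around keeps this correct. */
static unsigned int demo_fifo_len(const struct demo_fifo *f)
{
    return f->in - f->out;
}

/* Copy up to n bytes in; returns how many actually fit. */
static unsigned int demo_fifo_in(struct demo_fifo *f,
                                 const unsigned char *src, unsigned int n)
{
    assert(f && src);
    unsigned int space = DEMO_FIFO_SIZE - demo_fifo_len(f);
    if (n > space)
        n = space;
    for (unsigned int i = 0; i < n; i++)
        f->data[(f->in + i) & (DEMO_FIFO_SIZE - 1)] = src[i];
    f->in += n;
    return n;
}

/* Copy up to n bytes out; returns how many were available. */
static unsigned int demo_fifo_out(struct demo_fifo *f,
                                  unsigned char *dst, unsigned int n)
{
    assert(f && dst);
    unsigned int len = demo_fifo_len(f);
    if (n > len)
        n = len;
    for (unsigned int i = 0; i < n; i++)
        dst[i] = f->data[(f->out + i) & (DEMO_FIFO_SIZE - 1)];
    f->out += n;
    return n;
}
```

Because the counters are never reset, a full buffer (`in - out == size`) and an empty one (`in == out`) are distinguishable without sacrificing a slot.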
A user-space process reading the device file blocks on `rx_wait` until data becomes available.
Once data is available, it is read while holding the `read_lock` mutex, and the user sees the current state of the Tic-Tac-Toe game.
```cpp
static DEFINE_MUTEX(read_lock);
static DECLARE_WAIT_QUEUE_HEAD(rx_wait);
```
These macros expand to:
```cpp
struct mutex read_lock = {
.owner = {(0)},
.wait_lock =
(raw_spinlock_t){
.raw_lock = {{.val = {(0)}}},
},
.wait_list = {&(read_lock.wait_list), &(read_lock.wait_list)}}
struct wait_queue_head rx_wait = {
.lock = (spinlock_t){{.rlock =
{
.raw_lock = {{.val = {(0)}}},
}}},
.head = {&(rx_wait.head), &(rx_wait.head)}}
```
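The block-until-data behaviour of a wait queue has a close userspace analogue in a condition variable. The sketch below uses POSIX threads to show the sleep, re-check, and wake-up pattern. All `demo_*` names are hypothetical, and unlike the module, where the wait queue and `read_lock` are separate objects, a condition variable is always paired with the mutex that guards the condition.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t demo_wait = PTHREAD_COND_INITIALIZER;
static size_t demo_pending; /* bytes waiting to be read */

/* Producer side: record new data, then wake every sleeping reader. */
static void demo_data_arrived(size_t n)
{
    assert(n > 0);
    pthread_mutex_lock(&demo_lock);
    demo_pending += n;
    pthread_cond_broadcast(&demo_wait);
    pthread_mutex_unlock(&demo_lock);
}

/* Reader side: sleep until data is pending, then take all of it. */
static size_t demo_blocking_read(void)
{
    pthread_mutex_lock(&demo_lock);
    while (demo_pending == 0) /* re-check: wake-ups can be spurious */
        pthread_cond_wait(&demo_wait, &demo_lock);
    size_t n = demo_pending;
    demo_pending = 0;
    pthread_mutex_unlock(&demo_lock);
    return n;
}
```

The `while` loop matters: the reader must re-test the condition after every wake-up rather than assume that being woken means data is present.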
:::danger
Use Ftrace to analyze the above interconnect.
:::
#### Producer and Consumer
The producer mutex is used within workqueue handlers to serialize writes to the kfifo buffer. Since multiple work items (e.g., drawing the board or processing AI moves) might try to produce output concurrently, this lock ensures that only one producer writes to the FIFO at any given time, preventing data corruption.
Although the fast circular buffer (described next) is used in an interrupt context, its consumers run in a workqueue (kernel thread context). The consumer mutex serializes access to the fast buffer, ensuring that only one consumer retrieves data at a time, which avoids race conditions during the transfer of data from the fast buffer to the kfifo.
The fast buffer is an additional circular buffer intended to capture data quickly from the interrupt context. Because interrupt handlers must finish quickly, this buffer allows data to be temporarily stored with minimal overhead. Later, the workqueue handler can safely retrieve and process this data (using the `consumer_lock`) before moving it into the main kfifo buffer for userspace consumption.
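A non-blocking circular buffer of this kind can be sketched as follows. The source does not show the fast buffer's definition, so this assumes a head/tail ring with a power-of-two size that always keeps one slot empty. `struct demo_circ`, `DEMO_CNT`, `DEMO_SPACE`, `fast_put`, and `fast_get` are hypothetical names, and the memory barriers a real producer/consumer pair needs between the data access and the index update are omitted.

```c
#include <assert.h>

#define FAST_BUF_SIZE 8u /* power of two; one slot always stays empty */

/* Elements stored, and free slots left, for a head/tail ring. */
#define DEMO_CNT(head, tail) (((head) - (tail)) & (FAST_BUF_SIZE - 1))
#define DEMO_SPACE(head, tail) DEMO_CNT((tail), ((head) + 1))

struct demo_circ {
    unsigned int head; /* next slot the producer writes */
    unsigned int tail; /* next slot the consumer reads */
    char buf[FAST_BUF_SIZE];
};

/* Producer side (interrupt context in the module): never blocks.
 * Returns 0 on success, -1 if the ring is full. */
static int fast_put(struct demo_circ *r, char c)
{
    assert(r);
    if (DEMO_SPACE(r->head, r->tail) < 1)
        return -1;
    r->buf[r->head] = c;
    r->head = (r->head + 1) & (FAST_BUF_SIZE - 1);
    return 0;
}

/* Consumer side (workqueue handler in the module).
 * Returns 0 on success, -1 if the ring is empty. */
static int fast_get(struct demo_circ *r, char *c)
{
    assert(r && c);
    if (DEMO_CNT(r->head, r->tail) < 1)
        return -1;
    *c = r->buf[r->tail];
    r->tail = (r->tail + 1) & (FAST_BUF_SIZE - 1);
    return 0;
}
```

The producer only ever writes `head` and the consumer only ever writes `tail`, which is why the interrupt side needs no lock. The `consumer_lock` described above serializes multiple consumers, not the producer.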
#### Draw the Board into Draw Buffer
The drawing routine starts by writing two newline characters at the beginning of `draw_buffer` to provide some spacing at the top, as shown [above](#Game-Board-Drawing-Buffer).
```cpp
int i = 0, k = 0;
draw_buffer[i++] = '\n';
smp_wmb();
draw_buffer[i++] = '\n';
smp_wmb();
```
After each write, `smp_wmb()` (a write memory barrier) is called to ensure that the write operations are globally visible in the intended order. This is important in SMP (Symmetric Multi-Processing) environments to prevent reordering of memory writes.
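The ordering guarantee can be illustrated in userspace with C11 fences. This is an analogue rather than an equivalent: `atomic_thread_fence(memory_order_release)` is somewhat stronger than a write-only barrier, and `payload`, `ready`, `publish`, and `try_consume` are hypothetical names. The point is the pairing: the writer fences between the data and the flag, and the reader fences between the flag and the data.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

static char payload[2];
static atomic_bool ready; /* zero-initialized, so it starts out false */

/* Writer: store the data, fence, then raise the flag. */
static void publish(char a, char b)
{
    payload[0] = a;
    payload[1] = b;
    atomic_thread_fence(memory_order_release); /* plays the role of smp_wmb() */
    atomic_store_explicit(&ready, true, memory_order_relaxed);
}

/* Reader: check the flag, fence, then read the data. */
static bool try_consume(char out[2])
{
    assert(out != NULL);
    if (!atomic_load_explicit(&ready, memory_order_relaxed))
        return false;
    atomic_thread_fence(memory_order_acquire); /* pairs with the release fence */
    out[0] = payload[0];
    out[1] = payload[1];
    return true;
}
```

A write barrier is only useful when some reader performs the matching read-side ordering. A reader that sees `ready` as true is then guaranteed to see both payload bytes.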
The `smp_wmb()` expands to:
```cpp
__asm__ __volatile__("": : :"memory")
```
:::danger
Not exactly. Check your system configurations.
:::
In `include/generated/autoconf.h`, we have the following definition:
```
#define CONFIG_SMP 1
```
In `include/asm-generic/barrier.h`, we have the following definition:
```
#define smp_wmb() do { kcsan_wmb(); __smp_wmb(); } while (0)
```
:::info
We have to figure out why VSCode doesn't include the `autoconf.h`.
:::
#### Workqueue Handler
The following checks confirm that the function is not executing in interrupt or softirq context, since workqueue handlers run in process context. If either check fails, a warning is issued once.
```cpp
WARN_ON_ONCE(in_softirq());
WARN_ON_ONCE(in_interrupt());
```
Three workqueue handlers run in process context:
- `drawboard_work_func`
- If the `display` attribute is set to '1', the board is drawn from `static char table[N_GRIDS]` into `static char draw_buffer[DRAWBUFFER_SIZE]`, serialized by the producer lock.
- The entire game board is inserted into the `kfifo` buffer, serialized by the consumer lock.
- The following call wakes up any user-space process that is blocked (waiting) on the `rx_wait` wait queue, informing it that new data is available to be read from the device.
```cpp
wake_up_interruptible(&rx_wait);
```
- `ai_one_work_func`
- `ai_two_work_func`
:::danger
Use eBPF to trace the above.
:::