---
# System prepended metadata

title: Device Drivers

---

# Device Driver Notes

# Learning Objectives
- Interrupt management
- Data structures for storage
- Locks
- Device management


# Interrupt Management
![Screenshot 2025-01-25 at 2.00.03 PM](https://hackmd.io/_uploads/rJfff4O1ll.png)
Source: Prof. 羅習五's course

Macros for checking which context the code is running in: [ref](https://elixir.bootlin.com/linux/v6.14.3/source/include/linux/preempt.h#L141)

```cpp!

/*
 * The following macros are deprecated and should not be used in new code:
 * in_irq()       - Obsolete version of in_hardirq()
 * in_softirq()   - We have BH disabled, or are processing softirqs
 * in_interrupt() - We're in NMI,IRQ,SoftIRQ context or have BH disabled
 */
#define in_irq()		(hardirq_count())
#define in_softirq()		(softirq_count())
#define in_interrupt()		(irq_count())
```
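
The three macros above are just masks over the per-CPU preempt count. The following is a minimal userspace sketch of that idea, assuming the bit layout described in the comment at the top of `include/linux/preempt.h` for a non-PREEMPT_RT kernel (preempt bits 0-7, softirq 8-15, hardirq 16-19, NMI 20-23). The `model_*` names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout (non-PREEMPT_RT), taken from the preempt.h header comment. */
#define PREEMPT_MASK   0x000000ffu
#define SOFTIRQ_MASK   0x0000ff00u
#define HARDIRQ_MASK   0x000f0000u
#define NMI_MASK       0x00f00000u

#define SOFTIRQ_OFFSET (1u << 8)
#define HARDIRQ_OFFSET (1u << 16)

/* Hypothetical userspace stand-in for the per-CPU preempt count. */
static uint32_t model_preempt_count;

static uint32_t model_hardirq_count(void)
{
    return model_preempt_count & HARDIRQ_MASK;
}

static uint32_t model_softirq_count(void)
{
    return model_preempt_count & SOFTIRQ_MASK;
}

/* Non-zero in NMI, hard IRQ or softirq context, or when BH is disabled. */
static uint32_t model_irq_count(void)
{
    return model_preempt_count & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_MASK);
}

/* Entering a hard IRQ bumps the hardirq field by one nesting level. */
static void model_hardirq_enter(void)
{
    /* The 4-bit hardirq field must not overflow into the NMI bits. */
    assert((model_preempt_count & HARDIRQ_MASK) != HARDIRQ_MASK);
    model_preempt_count += HARDIRQ_OFFSET;
}

static void model_hardirq_exit(void)
{
    assert(model_hardirq_count() != 0);
    model_preempt_count -= HARDIRQ_OFFSET;
}
```

Because each context owns its own bit field, a single integer read is enough to answer "am I in interrupt context?", which is why these checks are cheap enough to use in assertions.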

## Top Half
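Usually the hard IRQ handler. It has to finish quickly (for example, handing packets that the NIC has already DMA'd into memory over to a socket buffer) and typically does the following:
- Identify the interrupt source: on a shared interrupt line, check the hardware status to decide whether this device raised the interrupt.
- Acknowledge: signal the hardware that the interrupt has been received, so that it is not raised again.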
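
The identify-then-acknowledge flow can be sketched in userspace as follows. This is only a model: `struct fake_dev`, `FAKE_STATUS_PENDING` and `fake_irq_handler` are hypothetical, and the status register is a plain variable rather than memory-mapped I/O. The handler signature and the return values mirror the kernel's `irqreturn_t` convention (`IRQ_NONE` = 0, `IRQ_HANDLED` = 1).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the kernel's irqreturn_t values; redefined here for userspace. */
enum model_irqreturn { MODEL_IRQ_NONE = 0, MODEL_IRQ_HANDLED = 1 };

/* Hypothetical device: bit 0 of status means "this device raised the IRQ". */
#define FAKE_STATUS_PENDING 0x1u

struct fake_dev {
    uint32_t status;      /* stands in for a memory-mapped status register */
    unsigned int handled; /* number of interrupts this driver has serviced */
};

static enum model_irqreturn fake_irq_handler(int irq, void *dev_id)
{
    struct fake_dev *dev = dev_id;

    (void)irq;
    assert(dev != NULL);

    /* 1. Identify the source: on a shared line, bail out if it is not ours. */
    if (!(dev->status & FAKE_STATUS_PENDING))
        return MODEL_IRQ_NONE;

    /* 2. Acknowledge: clear the pending bit so the device stops asserting. */
    dev->status &= ~FAKE_STATUS_PENDING;

    /* 3. Do the minimum here and leave the rest to a bottom half. */
    dev->handled++;
    return MODEL_IRQ_HANDLED;
}
```

Returning the "none" value when the status bit is clear is what lets other handlers registered on the same shared line get their turn.
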
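### Common interrupt numbers
- 0x80: the legacy system call vector (`int 0x80`). System calls generally no longer go through a traditional interrupt routine; x86-64 uses the `syscall` instruction instead.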
```
cat /proc/interrupts 
           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7       
  0:         23          0          0          0          0          0          0          0   IO-APIC   2-edge      timer
  1:          0          0         10          0          0          0          0          0  xen-pirq   1-ioapic-edge  i8042
  4:          0          0          0          0        674          0          0          0  xen-pirq   4-ioapic-edge  ttyS0
  8:          0          0          0          1          0          0          0          0  xen-pirq   8-ioapic-edge  rtc0
  9:          0          0          0          0          0          0          0          0  xen-pirq   9-ioapic-level  acpi
 12:          0        154          0          0          0          0          0          0  xen-pirq  12-ioapic-edge  i8042
 14:          0          0          0          0          0          0          0          0   IO-APIC  14-edge      ata_piix
 15:          0          0          0          0          0          0          0          0   IO-APIC  15-edge      ata_piix
 48:  504264750          0          0          0          0          0          0          0  xen-percpu    -virq      timer0
 49:    8872691          0          0          0          0          0          0          0  xen-percpu    -ipi       resched0
 50:    6620259          0          0          0          0          0          0          0  xen-percpu    -ipi       callfunc0
 51:          0          0          0          0          0          0          0          0  xen-percpu    -virq      debug0
 52:  499657445          0          0          0          0          0          0          0  xen-percpu    -ipi       callfuncsingle0
 53:          0          0          0          0          0          0          0          0  xen-percpu    -ipi       spinlock0
 54:          0  493474153          0          0          0          0          0          0  xen-percpu    -virq      timer1

```
## Bottom Half
- softirq
- tasklet
- workqueue
- timer

### softirq
Each CPU has its own `ksoftirqd` kernel thread.
```CPP!
static struct softirq_action softirq_vec[NR_SOFTIRQS] __cacheline_aligned_in_smp;

DEFINE_PER_CPU(struct task_struct *, ksoftirqd);

// names shown in /proc/softirqs
const char * const softirq_to_name[NR_SOFTIRQS] = {
	"HI", "TIMER", "NET_TX", "NET_RX", "BLOCK", "IRQ_POLL",
	"TASKLET", "SCHED", "HRTIMER", "RCU"
};
```
Each one shows up as `[ksoftirqd/<cpu id>]`; the square brackets indicate a kernel thread.
```
root          13  0.0  0.0      0     0 ?        S    Mar17   0:34 [ksoftirqd/0]
root          22  0.0  0.0      0     0 ?        S    Mar17   0:34 [ksoftirqd/1]
root          28  0.0  0.0      0     0 ?        S    Mar17   0:34 [ksoftirqd/2]
root          34  0.0  0.0      0     0 ?        S    Mar17   0:34 [ksoftirqd/3]
root          40  0.0  0.0      0     0 ?        S    Mar17   0:34 [ksoftirqd/4]
root          46  0.0  0.0      0     0 ?        S    Mar17   0:34 [ksoftirqd/5]
root          52  0.0  0.0      0     0 ?        S    Mar17   0:35 [ksoftirqd/6]
root          58  0.0  0.0      0     0 ?        S    Mar17   1:46 [ksoftirqd/7]
```

Per-CPU softirq counters:
```
cat /proc/softirqs 
                    CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7       CPU8       CPU9       CPU10      CPU11      CPU12      CPU13      CPU14      
          HI:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0
       TIMER:   92838560   69051016   63276173   60445512   60199216   57178732   55909473  116221079          0          0          0          0          0          0          0
      NET_TX:    5795331    4509853    3413238    2841642    2533658    2285522    2271241  717866587          0          0          0          0          0          0          0
      NET_RX:   14420089   14164785   13537241   13100689   12562658   11933342   11388569  662091911          0          0          0          0          0          0          0
       BLOCK:       1907     508270       1426       5067   13658611     117387   29877921       9561          0          0          0          0          0          0          0
    IRQ_POLL:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0
     TASKLET:      79605      73784      67074      63043      61516      59482      58815    7760535          0          0          0          0          0          0          0
       SCHED:  138948366  115153178  101612606   96263402   95116351   92100004   91851021  151182969          0          0          0          0          0          0          0
     HRTIMER:          9         13         13         11         10         13          5       2159          0          0          0          0          0          0          0
         RCU:  178229514  168995553  161812932  157542764  155516781  151663593  151334579  230644429          0          0          0          0          0          0          0

```

## tasklet
- Built on top of softirq, so it runs in softirq context, which is an interrupt context; a tasklet must not sleep
  - TASKLET_SOFTIRQ
  - HI_SOFTIRQ
- A given tasklet runs on at most one CPU at a time
- The kernel source comments mark the API as deprecated


[Source](https://elixir.bootlin.com/linux/v6.14.3/source/include/linux/interrupt.h#L696)
```
This API is deprecated. Please consider using threaded IRQs instead:
   https://lore.kernel.org/lkml/20200716081538.2sivhkj4hcyrusem@linutronix.de

   Main feature differing them of generic softirqs: tasklet
   is running **only on one CPU** simultaneously.

   Main feature differing them of BHs: different tasklets
   may be run simultaneously on different CPUs.
```

#### tasklet struct
```cpp!
struct tasklet_struct
{
    struct tasklet_struct *next;
    unsigned long state;
    atomic_t count;
    bool use_callback;
    union {
        void (*func)(unsigned long data);
        void (*callback)(struct tasklet_struct *t);
    };
    unsigned long data;
};
```

There are two ways to initialize a tasklet.

Static definition at compile time:
- DECLARE_TASKLET(name, _callback)
- DECLARE_TASKLET_OLD(name, _func)

As the names suggest, they differ only in the callback's signature.

Dynamic initialization at run time:
- void tasklet_init(struct tasklet_struct *t,
  void (*func)(unsigned long), unsigned long data)
- void tasklet_setup(struct tasklet_struct *t,
  void (*callback)(struct tasklet_struct *))

Schedule a tasklet for execution with:
- void tasklet_schedule(struct tasklet_struct *t)
- void tasklet_hi_schedule(struct tasklet_struct *t)
```cpp!
static inline void tasklet_schedule(struct tasklet_struct *t)
{
	/* Queue the tasklet only if TASKLET_STATE_SCHED was not already set */
	if (!test_and_set_bit(TASKLET_STATE_SCHED, &t->state))
		__tasklet_schedule(t);
}
```
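
The effect of the `test_and_set_bit()` guard is that scheduling an already-pending tasklet is a no-op. Below is a minimal userspace sketch of that behaviour, assuming C11 atomics as a stand-in for the kernel's bit operations. `struct model_tasklet` and the `model_*` functions are hypothetical and heavily simplified; in particular, the model has no per-CPU list and no RUN state.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_STATE_SCHED 0x1ul /* plays the role of TASKLET_STATE_SCHED */

struct model_tasklet {
    atomic_ulong state;
    unsigned int queued; /* how many times it was actually put on the list */
    unsigned int runs;   /* how many times the callback ran */
};

/* Returns true if this call queued the tasklet. */
static bool model_tasklet_schedule(struct model_tasklet *t)
{
    /* atomic_fetch_or returns the old value, like test_and_set_bit. */
    unsigned long old = atomic_fetch_or(&t->state, MODEL_STATE_SCHED);

    if (old & MODEL_STATE_SCHED)
        return false; /* already pending: nothing to do */
    t->queued++;
    return true;
}

/* Models the softirq side: clear SCHED, then run the callback once. */
static void model_tasklet_run(struct model_tasklet *t)
{
    unsigned long old = atomic_fetch_and(&t->state, ~MODEL_STATE_SCHED);

    assert(old & MODEL_STATE_SCHED); /* must have been scheduled */
    t->runs++;
}
```

In this model, two schedule calls before the tasklet runs produce a single execution, which matches the "schedule it only if it is not already scheduled" comment in the kernel code.
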
Remove a tasklet:
- tasklet_kill(struct tasklet_struct *t)

### References
- https://linux-kernel-labs.github.io/refs/pull/189/merge/labs/deferred_work.html

## workqueue
[Source](https://elixir.bootlin.com/linux/v6.14.3/source/include/linux/workqueue.h)

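- Runs in process context (in kernel worker threads), so a work handler may sleep
- Work items can run concurrently on different CPUs

The worker threads show up much like ksoftirqd: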
```
supplyframe-fcl@app-00.nc0.as1:~$ ps aux | grep kworker
root           8  0.0  0.0      0     0 ?        I<   Mar17   0:00 [kworker/0:0H-events_highpri]
root          24  0.0  0.0      0     0 ?        I<   Mar17   0:00 [kworker/1:0H-kblockd]
root          30  0.0  0.0      0     0 ?        I<   Mar17   0:00 [kworker/2:0H-events_highpri]
root          36  0.0  0.0      0     0 ?        I<   Mar17   0:00 [kworker/3:0H-kblockd]
root          42  0.0  0.0      0     0 ?        I<   Mar17   0:00 [kworker/4:0H-kblockd]
root          48  0.0  0.0      0     0 ?        I<   Mar17   0:00 [kworker/5:0H-kblockd]
root          54  0.0  0.0      0     0 ?        I<   Mar17   0:00 [kworker/6:0H-kblockd]
root          60  0.0  0.0      0     0 ?        I<   Mar17   0:00 [kworker/7:0H-kblockd]
root         130  0.0  0.0      0     0 ?        I<   Mar17   0:19 [kworker/0:1H-kblockd]
root         149  0.0  0.0      0     0 ?        I<   Mar17   2:14 [kworker/4:1H-kblockd]
root         154  0.0  0.0      0     0 ?        I<   Mar17   0:15 [kworker/2:1H-kblockd]
```
Two work structures are commonly used:
- work_struct: the ordinary work item
- delayed_work: a work item that runs after a specified delay; it embeds a work_struct
```cpp!
struct delayed_work {
	struct work_struct work;
	struct timer_list timer;

	/* target workqueue and CPU ->timer uses to queue ->work */
	struct workqueue_struct *wq;
	int cpu;
};
```
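
Because `delayed_work` embeds its `work_struct`, a handler that receives only the `struct work_struct *` can recover the enclosing object with `container_of()`; the kernel's `to_delayed_work()` helper does exactly this step. The sketch below reproduces the pattern in userspace. All `model_*` types, `struct my_device` and `my_handler` are hypothetical, and the macro is a simplified `container_of()` without type checking.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified form of the kernel's container_of(), without type checking. */
#define model_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct model_work {
    void (*func)(struct model_work *work);
};

struct model_delayed_work {
    struct model_work work; /* embedded, as in struct delayed_work */
    unsigned long delay;    /* hypothetical stand-in for the timer */
};

/* Hypothetical driver state that embeds a delayed work item. */
struct my_device {
    int id;
    int handled;
    struct model_delayed_work dwork;
};

static void my_handler(struct model_work *work)
{
    /* Step 1: work -> delayed work (what to_delayed_work() does). */
    struct model_delayed_work *dw =
        model_container_of(work, struct model_delayed_work, work);
    /* Step 2: delayed work -> the driver's own structure. */
    struct my_device *dev = model_container_of(dw, struct my_device, dwork);

    assert(&dev->dwork == dw);
    dev->handled++;
}
```

This is why work handlers take no private data argument: the work item's address already identifies the object it lives in.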

Create a workqueue:
- struct workqueue_struct *
alloc_workqueue(const char *fmt, unsigned int flags, int max_active, ...)
```cpp!
simrupt_workqueue = alloc_workqueue("simruptd", WQ_UNBOUND, WQ_MAX_ACTIVE);
```

Define and initialize a work item statically:
- DECLARE_WORK(name, void (*function)(struct work_struct *))
- DECLARE_DELAYED_WORK(name, void (*function)(struct work_struct *))

Initialize an existing work item at run time:
- INIT_WORK(struct work_struct *work, void (*function)(struct work_struct *))
- INIT_DELAYED_WORK(struct delayed_work *work, void (*function)(struct work_struct *))

```cpp!
#include <linux/workqueue.h>

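void my_work_handler(struct work_struct *work);

/* Static: define and initialize a work item at compile time */
DECLARE_WORK(my_work, my_work_handler);

/* Dynamic: initialize an existing work_struct at run time.
 * INIT_WORK() is a statement, so call it from a function such as module init. */
struct work_struct my_work2;

INIT_WORK(&my_work2, my_work_handler);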
```
Schedule work:
- bool schedule_work(struct work_struct *work): queue the work on the system-wide workqueue
- bool queue_work(struct workqueue_struct *wq, struct work_struct *work): queue the work on a specific workqueue
- bool queue_work_on(int cpu, struct workqueue_struct *wq,
			struct work_struct *work): like queue_work, but also targets a specific CPU

Tear down the workqueue:
```cpp!
flush_workqueue(simrupt_workqueue);
destroy_workqueue(simrupt_workqueue);
```

## timer
### jiffies
```cpp!
jiffies_value = seconds_value * HZ;
seconds_value = jiffies_value / HZ;
```
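HZ is the number of clock interrupts the system raises per second; many older systems default to 100.
jiffies can be thought of as the unit in which the kernel measures time: a counter that advances by one on every tick, and therefore by HZ every second.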
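
As a worked example of the conversions, here is a small userspace sketch. It assumes `HZ` is 250 (the real value comes from `CONFIG_HZ`), and the `model_*` helpers are hypothetical simplifications of the kernel's `msecs_to_jiffies()`, `jiffies_to_msecs()` and `time_after()`. The millisecond conversion rounds up so that a short non-zero delay never turns into zero ticks, and the comparison is written to survive the counter wrapping around.

```c
#include <assert.h>
#include <limits.h>

#define MODEL_HZ 250UL /* assumed tick rate; the real value is CONFIG_HZ */

/* Milliseconds to ticks, rounding up. */
static unsigned long model_msecs_to_jiffies(unsigned long ms)
{
    /* Guard against overflow in the multiplication below. */
    assert(ms <= (ULONG_MAX - 999UL) / MODEL_HZ);
    return (ms * MODEL_HZ + 999UL) / 1000UL;
}

/* Ticks to milliseconds. */
static unsigned long model_jiffies_to_msecs(unsigned long j)
{
    assert(j <= ULONG_MAX / 1000UL);
    return j * 1000UL / MODEL_HZ;
}

/* Wraparound-safe "a is after b", the same idea as time_after(a, b). */
static int model_time_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}
```

With `MODEL_HZ` at 250, one tick is 4 ms, so 1 ms and 4 ms both become one tick while 5 ms becomes two. Comparing with a signed difference instead of `a > b` keeps the result correct when the counter has wrapped between the two samples.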

**Timer callbacks run in softirq context, which is an interrupt context, so they must not sleep.**

Struct: [timer_list](https://elixir.bootlin.com/linux/v6.14.3/source/include/linux/timer_types.h#L8)
```cpp!
struct timer_list {
	/*
	 * All fields that change during normal runtime grouped to the
	 * same cacheline
	 */
	struct hlist_node	entry;
	unsigned long		expires;
	void			(*function)(struct timer_list *);
	u32			flags;

#ifdef CONFIG_LOCKDEP
	struct lockdep_map	lockdep_map;
#endif
};
```
Set up a timer: void timer_setup(struct timer_list *timer,
                           void (*function)(struct timer_list *),
                           unsigned int flags)
```cpp!
struct timer_list timer;
timer_setup(&timer, timer_handler, 0);
```
Schedule a timer: int mod_timer(struct timer_list *timer, unsigned long expires)
- Roughly equivalent to: del_timer(timer); timer->expires = expires; add_timer(timer);
```cpp!
mod_timer(&timer, jiffies + msecs_to_jiffies(delay));
```
Remove a timer:
- int del_timer(struct timer_list *timer)
- int [del_timer_sync(struct timer_list *timer)](https://elixir.bootlin.com/linux/v6.14.3/source/include/linux/timer.h#L171)

Prefer timer_delete_sync or timer_delete; the kernel header says not to use the del_timer names in new code.

#### reference
- [man time](https://man7.org/linux/man-pages/man7/time.7.html)
- [clock interrupt](https://www.sciencedirect.com/topics/computer-science/clock-interrupt)
- [timers](https://linux-kernel-labs.github.io/refs/pull/189/merge/labs/deferred_work.html#timers)

# Data Structures for Storage
## kfifo
[kfifo](https://archive.kernel.org/oldlinux/htmldocs/kernel-api/kfifo.html) is the Linux kernel's First-In-First-Out (FIFO) structure. It is safe in the Single Producer Single Consumer (SPSC) case, that is, no extra lock is needed, as the comments in the source code also note.


Declare a kfifo object at compile time: DECLARE_KFIFO_PTR(rx_fifo, unsigned char)
Allocate its buffer dynamically: kfifo_alloc(fifo, size, gfp_mask)

Copy data into the kfifo: kfifo_in(fifo, buf, n)
```cpp!
len = kfifo_in(&rx_fifo, &val, sizeof(val));
if (unlikely(len < sizeof(val)))
    pr_warn_ratelimited("%s: %zu bytes dropped\n", __func__,
                        sizeof(val) - len);
```
Read data out of the kfifo: kfifo_out(fifo, buf, n)
Free the kfifo: kfifo_free(fifo)
Move at most len bytes from the fifo to userspace: [kfifo_to_user](https://archive.kernel.org/oldlinux/htmldocs/kernel-api/API-kfifo-to-user.html)(fifo, to, len, copied)
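
To make the `kfifo_in` / `kfifo_out` semantics concrete, here is a simplified single-threaded userspace model of a byte FIFO. It assumes the common design of free-running `in` / `out` indices masked by a power-of-two size; `struct model_fifo` and the `model_fifo_*` functions are hypothetical and are not the kernel's implementation (there are no memory barriers and no record mode). As in the `kfifo_in` example above, both calls return the number of bytes actually transferred, which may be less than requested.

```c
#include <assert.h>

#define MODEL_FIFO_SIZE 8u /* must be a power of two */

struct model_fifo {
    unsigned int in;  /* free-running write index, written by the producer */
    unsigned int out; /* free-running read index, written by the consumer */
    unsigned char data[MODEL_FIFO_SIZE];
};

static unsigned int model_fifo_len(const struct model_fifo *f)
{
    unsigned int len = f->in - f->out; /* correct even after wraparound */

    assert(len <= MODEL_FIFO_SIZE);
    return len;
}

/* Copy up to n bytes in; returns how many were actually stored. */
static unsigned int model_fifo_in(struct model_fifo *f, const void *buf,
                                  unsigned int n)
{
    const unsigned char *src = buf;
    unsigned int avail = MODEL_FIFO_SIZE - model_fifo_len(f);
    unsigned int i;

    if (n > avail)
        n = avail;
    for (i = 0; i < n; i++)
        f->data[(f->in + i) & (MODEL_FIFO_SIZE - 1)] = src[i];
    f->in += n; /* publish only after the data is in place */
    return n;
}

/* Copy up to n bytes out; returns how many were actually read. */
static unsigned int model_fifo_out(struct model_fifo *f, void *buf,
                                   unsigned int n)
{
    unsigned char *dst = buf;
    unsigned int len = model_fifo_len(f);
    unsigned int i;

    if (n > len)
        n = len;
    for (i = 0; i < n; i++)
        dst[i] = f->data[(f->out + i) & (MODEL_FIFO_SIZE - 1)];
    f->out += n;
    return n;
}
```

Since the producer only ever writes `in` and the consumer only ever writes `out`, the two sides never update the same index, which is the property that lets the SPSC case work without a lock.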


## Circular buffer
[Data structure](https://www.kernel.org/doc/Documentation/core-api/circular-buffers.rst)
(1) A 'head' index - the point at which the producer inserts items into the buffer.

(2) A 'tail' index - the point at which the consumer finds the next item in the buffer.

The head index is incremented when items are added, and the tail index when
items are removed.

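How to tell whether the buffer is:
- Empty: head == tail
- Full: (head + 1), wrapped to the buffer size, == tail; that is, head is one slot behind tail

Header: #include <linux/circ_buf.h>
It usually has to be used together with head/tail indices and memory barriers.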

```cpp!
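// Data structure
static struct circ_buf fast_buf;

// In module init: allocate the backing storage (check for NULL in real code)
fast_buf.buf = vmalloc(PAGE_SIZE);

static void fast_buf_clear(void)
{
    fast_buf.head = fast_buf.tail = 0;
}

// Producer side. game_moves is a u64 defined elsewhere in the driver.
static void fast_buf_put(void)
{
    struct circ_buf *ring = &fast_buf;
    unsigned long head = ring->head;

    /* prevent the compiler from merging or refetching accesses to tail */
    unsigned long tail = READ_ONCE(ring->tail);

    /* need room for a whole u64, not just a single byte */
    if (unlikely(CIRC_SPACE(head, tail, PAGE_SIZE) < sizeof(u64))) {
        pr_warn_ratelimited("%s: fast buffer full, dropping data\n", __func__);
        return;
    }

    memcpy(&ring->buf[head], &game_moves, sizeof(u64));

    /* commit the item before publishing the new head */
    smp_wmb();

    ring->head = (head + sizeof(u64)) & (PAGE_SIZE - 1);
}

// Consumer side. Returns 0 when there is nothing to read.
static u64 fast_buf_get(void)
{
    struct circ_buf *ring = &fast_buf;
    unsigned long head = READ_ONCE(ring->head), tail = ring->tail;
    u64 move;

    /* nothing to read */
    if (unlikely(!CIRC_CNT(head, tail, PAGE_SIZE)))
        return 0;

    /* read the index before reading the contents at that index */
    smp_rmb();
    memcpy(&move, &ring->buf[tail], sizeof(u64));

    /* finish reading the item before advancing tail */
    smp_mb();

    ring->tail = (tail + sizeof(u64)) & (PAGE_SIZE - 1);
    return move;
}

// In module exit: release the storage
vfree(fast_buf.buf);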
```
- CIRC_SPACE(head_index, tail_index, buffer_size): how much free space is left in the buffer
- CIRC_CNT(head_index, tail_index, buffer_size): how much data is available to read
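
The empty and full conditions can be checked in userspace by copying the two macro definitions out of `include/linux/circ_buf.h`. In the sketch below, `RING_SIZE` and the `ring_*` helpers are hypothetical names added for illustration; the buffer size has to be a power of two for the masking to work.

```c
#include <assert.h>

/* Same definitions as include/linux/circ_buf.h; size must be a power of two. */
#define CIRC_CNT(head, tail, size)   (((head) - (tail)) & ((size) - 1))
#define CIRC_SPACE(head, tail, size) CIRC_CNT((tail), ((head) + 1), (size))

#define RING_SIZE 8

/* Empty when head and tail coincide. */
static int ring_is_empty(int head, int tail)
{
    return CIRC_CNT(head, tail, RING_SIZE) == 0;
}

/* Full when advancing head by one slot would make it equal to tail. */
static int ring_is_full(int head, int tail)
{
    return CIRC_SPACE(head, tail, RING_SIZE) == 0;
}

/* Advance an index by one slot, wrapping at RING_SIZE. */
static int ring_next(int idx)
{
    assert(idx >= 0 && idx < RING_SIZE);
    return (idx + 1) & (RING_SIZE - 1);
}
```

One slot is always left unused, so a ring of 8 slots holds at most 7 items; that is the price of telling "full" apart from "empty" using only the two indices.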


# Studying simrupt
[Source](https://github.com/sysprog21/simrupt/blob/main/simrupt.c)

# Device Driver Management
## Devices

## Device sysfs registration