看習五 OS 影片

影片

教材

同步機制

在 gdb 下達 disassemble /m <function name> 進行反組譯

critical section 的解決要件：

mutual exclusion
progress
bounded waiting

適用長度：

spinlock ：短區塊
semaphore ：長區塊
mutex ： adaptive

Peterson's Solution

atomic and fence

atomic_thread_fence() 可防止指令重排的越界

x86 instructions :

lfence fence for load
sfence fence for store

Semaphore

acquire 失敗幾乎附帶 context switch

Spinlock

acquire 失敗會進入 loop 不斷嘗試

Mutex

Optional feature
- ownership
- priority inheritance
- nested lock
- adaptive

Adaptive mutex

sun solaris 依持鎖人當下的情況來判定

Q 在 waiting queue ： P sleep and kernel context switch
Q 執行中： spinlock
P, Q 在同一 processor 上： P sleep

glibc 根據之前 lock 等待的時間，決定 spinlock 的 loop 次數

Linux futex

man futex

glibc 沒有對此 system call 的包裝，需要直接使用 syscall()

TODO
看 jserv 版本的解說交互參照

SPSC

single producer single cosumer problem

an example solution

mutex
full
empty

Readers-Writers problem

constrain：

readers 和 writer 不可以同時存取同個資料結構
不同 writer 不可同時存取同個資料結構

/* reader */
do {
    wait(mutex);
    readcount++;
    if (readcount == 1)
        wait(wrt);
    signal(mutex);
    // reading here
    wait(mutex);
    readcount--;
    if (readcount == 0)
        signal(wrt);
    signal(mutex);
} while (1);

pthread 有提供 read write lock 供同時讀寫用

pthread_rwlock_wrlock()

哲學家用餐問題

所有人都先拿某一邊

⟹

deadlock

依奇偶數決定拿取順序

⟹

可能違反 bounded waiting

輪流獲得高優先權

⟹

可能造成效能瓶頸

每次使用 semaphore 都會進入 kernel mode，因此過多的 semaphore 會對系統帶來效能負擔。

Cache

block

CHA : cashing and home agent
SF : snoop filter
LLC : last level cache

Cache coherent 算法

snoop：使用 broadcast
dictionary：使用額外空間來紀錄分散的情況，可以做點對點傳輸

目前的 CPU 通常是混用兩種方法

cache coherent 不保證 atomic operation

atomic

傳統上有 instruction： test_n_set 和 swap

test_n_set 相當於 C 程式碼：

test_n_set

int test_n_set(int *value)
{
    register tmp = value;
    *value = 1;
    return tmp;
}

使用範例：

https://en.wikipedia.org/wiki/Test-and-set

volatile int lock = 0;

void foo() {
    while (test_n_set(&lock) == 1)
        ;
    // critical section
    lock = 0;
}

C11 有 atomic_exchange() 類似於 test_n_set

swap

swap 相當於 C 程式碼：

void swap(int *a, int *b)
{
    register tmp;
    tmp = *a;
    *a = *b;
    *b = tmp;
}

此二方法一直更新記憶體，表示一直觸發 cache coherence，造成效率負擔。

因此以 compare exchange 來改善，compare exchange 只會讀取值做比較，通常不需要額外 cache coherence。

compare exchange

bool atomic_compare_exchange_strong(volatile atomic_int *obj,
                                    int *expected,
                                    int desired)
{
    if (obj == expected) {
        *obj = desired;
        return true;
    } else {
        expected = obj;    // 失敗時只有讀 obj
        return false;
    }
}

expected 是 local variable 不需要做 coherent

使用範例：

void foo()
{
    int expected;
    while (1) {
        while (!atomic_compare_exchange_weak(&current, &expected, 1))
            expected = 0;
        // critical section
        current = 0;
    }
}

RISC 系列常使用 load-link (LL) 或 store-condition (SC) 實現

/* example of doing value + 1 */
do {
    tmp = LL(&value);
    tmp += 1;
} while (SC(&value, tmp) == false);

Spinlock

檢查 - 進入會出錯，因為中間的過程可能導致很多 thread 都進到 critical section。正確的方法是改變狀態 - 檢查 - 確認是否成功 lock，或是使用 atomic 方式檢查 - 進入 (上鎖)。

Use atomic_int rather than volatile int.

The volatile types do not provide inter-thread synchronization, memory ordering, or atomicity.
https://en.cppreference.com/w/c/language/atomic

Sequential lock

Ticket lock

FIFO

有一個最大門票數值

N

，writer 每次要拿到

N

張票才能進入， reader 只需要

1

張票就能進入。

busy waiting 可以改用 inline assembly asm("pause") 來省電

memory order

C11 memory order

#include <stdatomic.h>
typedef enum memory_order {
    memory_order_relaxed,
    memory_order_consume,
    memory_order_acquire,
    memory_order_release,
    memory_order_acq_rel,
    memory_reder_seq_cst
} memory_order;

e.g.

atomic_load_explicit(const volatile int *obj, memory_order_release);

relaxed：不保證任何 order，只保證 atomic，適用於統計次數
consume：與該變數相關的 instruction 不可以往前跨越
acquire：之後的 instruction 都不可以往前跨越，適用於 lock
release：之前的 instruction 都不可以往後跨越，適用於 unlock
acq_rel：前後都不可跨越，適用於同時有 read, modify, write 的資料結構
seq_cst：滿足 program order 和 write atomicity，所有 thread 看到的順序都一樣

seq_cst 等同在硬體上一個大鎖，大幅影響多核效能

看習五 OS 影片

同步機制

Peterson's Solution

atomic and fence

Semaphore

Spinlock

Mutex

Adaptive mutex

Linux futex

SPSC

Readers-Writers problem

哲學家用餐問題

Cache

Cache coherent 算法

atomic

test_n_set

swap

compare exchange

Spinlock

Sequential lock

Ticket lock

memory order

Read more

讀 CS:APP 簡中版

學習 Linux 流水筆記

2022q1 Homework3 (fibdrv)

fibdrv 2