Copyright (慣C) 2018 宅色夫
直播錄影
What every systems programmer should know about concurrency
自我檢查清單
點題 / 小處做起
Linux 的 spinlock 出現在裝置驅動程式之際,有以下形式: (只列出常見者)
- spin_lock() / spin_unlock(),
- spin_lock_irq() / spin_unlock_irq(),
- spin_lock_irqsave / spin_unlock_irqrestore()
- spin_lock_bh() / spin_unlock_bh()
- local_irq_disable / local_irq_enable
- local_bh_disable / local_bh_enable
這一切很難「透明」地一眼望穿,遑論核心改版時,陸續添增新的 spinlock 機制,這該怎麼辦呢?不僅「Bug 出不去」,而且「程式碼進不來」。
本講座試著帶大家回頭研究 多核處理器的設計,重新學習 context switch 在 process 的意義,確認釐清多核之間的執行路徑,再來則是 IRQ 和核心間的關聯,最後才是思考 spinlock 存在和真正要解決的問題層面。過程中也會搭配 Linux 邁向即時處理機制 (不僅有工業控制,也有自動駕駛的控制需求,Google 公司也是推手之一) 所著重的 kernel preemption,我們會提及 PREEMPT_RT 對 spinlock 的影響,後者會轉換為 sleeping spinlocks,也就是支援 priority inheritance 的 RT-mutex。
如果以上就讓你頭昏,多核處理器之間的 cache coherence 也是 spinlock 實作的關鍵考量,這十年來 Linux 核心針對 scalability 做了巨大的變革,許多廠商為了「發大財」,相繼投入可觀的工程資源,我們也會解析相關議題,並且用機率統計 (不是只有人工智慧需要數學,你真要理解作業系統核心運作,也逃不掉的,孩子,誠實面對自己吧) 來解釋箇中運作機制。
在議程進行之前,請自行評估對以下提問的認知:
- 知道 Linux 核心的 atomic context 出現於哪些場合嗎?
- spinlock 作為 busy-waiting (教科書術語),到底在瞎忙什麼?對應於現代處理器做了哪些事?
- spinlock 的實作可能涉及 disable interrupt,後者對作業系統和 latency 及 scalability 的影響為何?
- 你應該知道 deadlock,但 spinlock 可能會導致 livelock,你知道為什麼嗎?
- 中斷處理機制對許多程式開發者不陌生,特別是待過 IC 設計公司和系統廠的朋友,但為了提升 kernel preemption,核心開發者避免 disable interrupt 的發生,那該如何確保資料的正確呢? (提示: preempt_count)
- 如果 spinlock 變成可以 sleeping (在 PREEMPT_RT),還有 critical section 可言嗎?
- Threaded IRQ 是什麼?又如何對應到現代處理器的實作呢?
- hard IRQ 和 soft IRQ (BH) 的關聯為何?你知道 ksoftirqd 的運作機制嗎?
若上述問題你沒辦法在三分鐘內回覆,那沒關係,就過來跟我們一起學習吧。
相關講座: Linux 核心設計: 淺談同步機制
重新思考 CS 和互斥
CS:APP 課程大綱 (會隨著課程開課時間逐步開放,且前面的編號也可能改變)
CS:APP Concurrent Programming, Page 37-56
Image Not Showing
Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
CS:APP Synchronization: Advanced, Page 40-49
Image Not Showing
Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
計數器的程式範例:
執行結果每次不盡相同:
分析 cnt
變數操作的部分,其中執行緒函式的迴圈程式碼:
對應的組合語言程式碼:
cnt 使用 volatile
關鍵字聲明,意思是避免編譯器產生的程式碼中,透過暫存器來保存數值,無論是讀取還是寫入,都在主記憶體操作。
細部的步驟分成 5 步:H
-> L
-> U
-> S
-> T
,尤其要注意 LUS 這三個操作,這三個操作必須在一次執行中完成,一旦次序打亂,就會出現問題,不同執行緒拿到的值就不一定是最新的。
也就是說該函式的正確執行和指令的執行順序有關
Image Not Showing
Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
將 n 個 concurrent 執行緒的執行模型描述為 n 維空間中的軌跡線,每條軸 k 對應 k 的進度。下圖展示對於 H1, L1, U1, H2, L2, S1, T1, U2, S2, T2 的軌跡線的進度圖:
Image Not Showing
Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
對於執行緒編號 i
,操作共享變數 cnt
內含值的指令 (,Li,Ui,Si) 構成了一個 CS,每個 CS 不應該和其他執行緒的 CS 交替執行。換言之,需要對共享變數進行互斥的存取:
- 當執行緒繞過不安全區,叫做安全軌跡線;
- 反之,為不安全軌跡線;
Image Not Showing
Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
mutual exclusion (互斥) 手段的選擇,不是根據 CS 的大小,而是根據 CS 的性質,以及有哪些部分的程式碼,也就是,仰賴於核心內部的執行路徑。
semaphore 和 spinlock 屬於不同層次的互斥手段,前者的實現仰賴於後者,可類比於 HTTP 和 TCP/IP 的關係,儘管都算是網路通訊協定,但層次截然不同:
- POSIX semaphore 用於多個 process 之間對資源的互斥
- 雖然也是在核心中,但是該核心執行路徑是以 process 的身份,代表 process 來爭奪資源
- 如果競爭不上,會有 context switch,process 可以去 sleep,但 CPU 不會停下來,會接著運作其他的執行路徑
- 從概念上說,這和單核 CPU 或多核 CPU 沒有直接的關係,只是在
semaphore 本身的實做層面上,為了保證 semaphore 的 atomics,在多核 CPU 中需要 spinlock 來互斥
- 在多核 CPU 中, 加上了其他 CPU 的干擾,因此需要 spinlock 來幫助。一旦 CPU 進入 spinlock,就不會做別的事,直到鎖定成功為止。因此,這就決定了被 spinlock 鎖住的 CS 不能停
spinlock 的本意和目的就是保證資料修改的 atomics,因此也沒有理由在spinlock 鎖住的 CS 中停留。
Synchronization primitives in the Linux kernel. Part 3 指出:
- The main idea behind spinlock concept is a lock which will be acquired for a very short time. We can't sleep when a lock acquired by a process or thread, because other processes wait us. Context switch is not allowed because preemption is disabled to avoid deadlocks.
- semaphores is a good solution for locks which may be acquired for a long time. In other way this mechanism is not optimal for locks that acquired for a short time.
CS:APP Thread-Level Parallelism, Page 24-37
核心同步機制從 spinlock 開始
General properties of spinlocks:
- Loop until another processor releases the lock
- Should only be used for short critical sections
- Should always be released before the process/thread goes to sleep
- Need to have interrupts disabled when locked by ordinary threads, if shared by an interrupt handler
- Must be initialized explicitly, before use
取自 Donald E. Porter 教授的 CSE 506: Operating Systems 教材
Image Not Showing
Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
- Big Kernel Lock: 鎖定整個 Linux 核心的機制,透過
lock_kernel()
與 unlock_kernel()
函式。這是為了支援多處理器,暫時引進的鎖定機制,已在 v2.6.39 徹底清除。
- commit
- Killing the Big Kernel Lock
- BKL 用於保護整個核心,而 spinlock 用於保護非常特定的某一共享資源。行程 (process) 持有 BKL 時允許發生排程。實作機制:在執行
schedule
時,schedule
將檢查行程是否擁有 BKL,若有,它將被釋放,以致於其它的行程能夠獲得該 lock,而當輪到該行程執行之際,再讓它重新獲得 BKL。注意:在持有 spinlock 期間不會發生排程。
- 整個核心僅有一個 BKL
Image Not Showing
Possible Reasons
- The image was uploaded to a note which you don't have access to
- The note which the image was originally uploaded to has been deleted
Learn More →
Image Not Showing
Possible Reasons
- The image was uploaded to a note which you don't have access to
- The note which the image was originally uploaded to has been deleted
Learn More →
- 延伸閱讀: Evaluating the Cost of Atomic Operations on
Modern Architectures
Image Not Showing
Possible Reasons
- The image was uploaded to a note which you don't have access to
- The note which the image was originally uploaded to has been deleted
Learn More →
Image Not Showing
Possible Reasons
- The image was uploaded to a note which you don't have access to
- The note which the image was originally uploaded to has been deleted
Learn More →
- Linux spinlock 的對等 C 語言實作
- Test-and-Set Lock
- 注意: Write Back + Evict (驅逐) Cache Line

Linux Spinlocks
- Are currently based on the "ticket lock" mechanism
- Also bundle in scheduler-level non-preemption
- Optionally bundle in interrupt-masking (spin_lock_irqsave()).
搭配閱讀:
Preemptable ticket spinlocks: improving consolidated performance in the cloud
Symmetric Multi-Processing (SMP) 的衝擊

搭配閱讀 Symmetric Multi-Processing
會造成 race condition 的程式碼:

透過 atomics 改寫:

Disabling preemption (interrupts)
在早期 i386 的實作
x86 中斷相關指令: Basic x86 interrupts
- enabled interrupts (sti)
- Disable interrupt (cli)
- Push bits 0-15 of RFLAGS to stack (pushf) 主要是要儲存 RFLAGS register 的 bit 9 IF: Interrupt enable flag
- Pop Stack into EFLAGS Register (popf)
queued spin lock

搭配閱讀:
User vs. Kernel Locks
Linux Locking Mechanisms
livelock?
查閱 Mutual exclusion, synchronization,communicationin shared memoryOperating systems 及 Kernels and Locking


ELC 2017 Effectively Measure and Reduce Kernel Latency for Real-time Constraints