Non-scalable locks

# Non-scalable locks LWN 上對於 spinlock 的描述[^LWN]，目前 kernel 是使用 ticket spinlock，加入了 FIFO 的佇列，讓單一個執行緒不會永久地等待鎖，改善以前版本的 spinlock 造成的 starvation 問題，以及增進整體取得鎖的延遲 (即時性能也可以提升)。以下為簡化的 spinlock 實作[^fig1] ```c=1 struct spinlock_t { int current_ticket; int next_ticket; } void spin_lock(spinlock_t *lock) { int t = atomic_fetch_and_inc(&lock->next_ticket); while (t != lock->current_ticket) ; /* spin */ } void spin_unlock(spinlock_t *lock) { lock->current_ticket++; } ``` > 第七行 `atomic_fetch_and_inc` 做的事可以想成原子式的 `lock->next_ticket++` spinlock 定義有兩個欄位 `current_ticket` 和 `next_ticket`，分別代表目前可取得鎖的號碼以及目前已發放的號碼。 ticket spinlock 可想成櫃台發放號碼牌，當一個客人 (task) 要進行一項服務時，他會先向櫃台取的一個號碼 (現在 `next_ticket` 的值)，櫃台這時會已發放的號碼 + 1，接著客人會比對手上的號碼牌與螢幕上的號碼 (`current_ticket`)，如果不相同就等待，當有其他客人完成後就會將螢幕上的號碼 + 1，直到螢幕上的號碼與手上的號碼牌相同時，就能進行服務。 spinlock 進行初始化時兩個值都會設為 0，取得鎖的條件為目前發放的號碼 `next_ticket` 與可取得鎖的號碼 `current_ticket` 相同，釋放鎖後將 `current_ticket` + 1，通知在等待的執行緒誰可以取得鎖。 > `current_ticket` 和 `next_ticket` 的型別會限制硬體的核心數，假設兩者都只有 8bit，那最多只能有 256 個核心等待一個 spinlock，比 256 個核心更多的處理器可能會讓多個核心上的 `t` 值相同。 ## Cache corherence 現代處理器會使用快取 (cache) 加速記憶體存取的延遲，每個處理器核心會有自己私有快取，當多核同時存取一個共享變數，其值會被放到快取裡面，可以說每個核心都有自己的一份 copy，以下圖[^4]為例，P1 和 P2 共享一個變數 `x`，所以兩個核心的快取都有 x 的值 (1000)。 ![](https://i.imgur.com/yrfFUbF.png) 當某一核修改了存在快取的變數的值，硬體必須提供機制讓其他核也能看到修改的值，這就是 cache coherence 的目的，讓軟體開發者不用處理這個問題。 cache coherence 的方式: 1. update-based protocols: 當一個共享的記憶體位址的值被修改，其值會被更新到任何有此記憶體位址的快取上 2. invalidation-based protocols: 當一個共享的記憶體位址的值被修改，將其他有此記憶體位址的快取設為 invalid，直到其他處理器核心產生一個 miss (存取此位址) 才會讀取資料，如較知名的 MSI, MESI 協議 Coherence 必須提供: - **Write propagation**: guarantee that updates will propagate - **Write serialization**: provide a consistent order seen by all processors for the same memory location 下圖[^3] ![](https://i.imgur.com/NbHp3bq.png) [^LWN]: LWM spinlock implementation: https://lwn.net/Articles/267968/ [^fig1]: Non-scalable locks are dangerous. Figure 1 [^3]: David Culler, J.P. Singh, Anoop Gupta, Parallel Computer Architecture: A Hardware Software Approach (The Morgan Kaufmann Series in Computer Architecture and Design) (1998) [^4]: ETHZ fall2019-lecture22-cachecoherence-afterlecture