# RCU Notes

###### tags: `rcu`

## TODO list

- ==Finish the RCU readings (Linux doc, LWN)== — not all read yet
- Watch the CppCon talks on RCU, hazard pointers, and concurrency when time allows

### some issues

- [Double compare-and-swap](https://en.wikipedia.org/wiki/Double_compare-and-swap)
- What is the cohort structure in [Hazard Pointer Synchronous Reclamation Beyond Concurrency TS2 - Maged Michael - CppCon 2021](https://youtu.be/lsy8RRq2hHM)?
- The ABA problem probably deserves a write-up.

---

## Terminology

> - [Comparative Performance of Memory Reclamation Strategies for Lock-free and Concurrently-readable Data Structures](https://www.cs.utoronto.ca/~tomhart/papers/tomhart_thesis.pdf)
> - [Practical lock-freedom](https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-579.pdf)

### Linearizability

[What is "Linearizability"?](https://stackoverflow.com/questions/9762101/what-is-linearizability)

![](https://i.imgur.com/hkbkLhH.png)

**Linearizability means that modifications happen instantaneously**: once a register value is written, any subsequent read operation will find that very same value until the register undergoes another modification.

### non-blocking properties

![](https://i.imgur.com/pDliuTE.png)

The goal is to guarantee that a stalled process cannot cause all other processes to stall indefinitely. A non-blocking algorithm is lock-free if there is guaranteed system-wide progress, and wait-free if there is also guaranteed per-thread progress.

----

### compare and swap (CAS)

The most common use of CAS is to read from a memory location, perform some computation on the value, and then use CAS to write the modified value back while ensuring that the location has not been altered in the meantime. Most modern processors support CAS in hardware; where they do not, it can be emulated at a small overhead.

----

### lock free

> https://en.wikipedia.org/wiki/Non-blocking_algorithm#Lock-freedom
> Lock-freedom allows individual threads to starve but guarantees system-wide throughput.
> An algorithm is lock-free if, when the program threads are run for a sufficiently long time, at least one of the threads makes progress (for some sensible definition of progress). All wait-free algorithms are lock-free.
>
> In particular, if one thread is suspended, then a lock-free algorithm guarantees that the remaining threads can still make progress. Hence, if two threads can contend for the same mutex lock or spinlock, then the algorithm is not lock-free. (If we suspend one thread that holds the lock, then the second thread will block.)

Lock-freedom is the most popular non-blocking property. Each operation must be carefully designed so that each execution step updates shared memory at most once, no matter how many times it is "replayed" by different processes.

----

### wait free

A data structure is wait-free if and only if every operation on the structure completes after it has executed a finite number of steps. This condition ensures that no operation can experience permanent livelock and, in principle, a worst-case execution time can be calculated for any operation.

It is very difficult to implement efficient wait-free algorithms on commodity hardware, since fair access to memory is usually not guaranteed.

---

## Making Lockless Synchronization Fast: Performance Implications of Memory Reclamation

> Locking is also susceptible to priority inversion, convoying, deadlock, and blocking due to thread failure

Locking is prone to priority inversion, convoying, and deadlock, while reference counting carries a high overhead.

### Grace period

A grace period is a time interval $[a, b]$ such that, after time $b$, all nodes removed before time $a$ may safely be reclaimed (freed).
QSBR and EBR both build on the concept of a grace period.

----

### Quiescent-State-Based Reclamation (QSBR)

> A grace period for QSBR is any interval of time during which all threads pass through at least one quiescent state.

==A Blocking Scheme==

#### Quiescent State

> dpdk, [7.1. What is Quiescent State](https://doc.dpdk.org/guides/prog_guide/rcu_lib.html)
> Quiescent State can be defined as “any point in the thread execution where the thread does not hold a reference to shared memory”.

> A quiescent state for thread T is a state in which T holds no references to shared nodes

#### Features

- The choice of quiescent states is application-dependent.
- Detecting minimal grace periods is unnecessary.
- QSBR is blocking.
- Low overhead:
  > EBR requires two fences per operation: one when setting a flag when entering a critical region, and one when clearing it upon exit. Since QSBR has no per-operation fences, its per-operation overhead can be very low

----

### Epoch-Based Reclamation (EBR)

> p80, [Practical lock-freedom](https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-579.pdf)
> Processes that have observed epoch e may therefore still hold private references to objects in the limbo list associated with epoch e − 1, so it is not safe to reuse those objects until epoch e + 1.

==A Blocking Scheme==

#### Features

- As with QSBR, reclamation can be stalled by threads which fail in critical regions, but threads not in critical regions cannot stall EBR. (By contrast, QSBR's reclamation is stalled by any thread that never goes quiescent, even one outside every critical section.)
- Bookkeeping is invisible to the application programmer, but this property imposes significant overhead on EBR.
- This drawback may also affect preemptively-scheduled systems, in which a process may be descheduled in the middle of a shared-memory operation with no guarantee of when it will be rescheduled.
> p80, [Practical lock-freedom](https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-579.pdf)

#### In [Practical lock-freedom](https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-579.pdf)

The thesis also discusses the overhead of the memory barriers this scheme requires on modern processors.

> a global array of per-process pointers must be checked to ensure no private references remain.

Scanning this global array hurts performance; amortising the scan across many reclamations reduces the impact, but at the cost of worse cache locality and an increased heap size. (p81)

----

### Hazard-Pointer-Based Reclamation (HPBR)

==A non-blocking scheme==

- There are H = NK hazard pointers in total:
  - N: number of threads
  - K: data-structure-dependent (2 for a list, 1 for a stack)

> After removing a node, a thread places that node in a private list. When the list grows to a predefined size R, the thread reclaims each node lacking a corresponding hazard pointer. Increasing R amortizes reclamation overhead across more nodes, but increases memory usage.

#### Features

- Prevents the ABA problem for protected nodes, since a node covered by a hazard pointer cannot be reclaimed and reused.
- Amortizes reclamation overhead across more nodes, but increases memory usage — a trade-off shared by every reclamation scheme: the later memory is reclaimed, the lower the overhead, but the higher the memory usage.

----

### Lock-Free Reference Counting (LFRC)

==A non-blocking scheme==

- Reclaims any node whose reference count is zero.
- Uses CAS and fetch-and-add (FAA).
- LFRC introduces overhead which often makes lockless algorithms perform worse than lock-based versions.

----

### Reclamation Performance Factors

#### Memory Consistency

Blocking schemes may perform poorly with update-heavy structures, since the risk of memory exhaustion is higher. Conversely, non-blocking schemes may perform poorly with operations such as list or tree traversal which visit many nodes, since they require per-node fences.
Note that "blocking" here refers to the reclamation scheme, not to the program's progress guarantees.

### issues

A blocking reclamation algorithm can exhaust memory, so the QSBR-style scheme Linux uses copes poorly with DDoS attacks.

### Conclusion

When choosing between blocking and non-blocking reclamation, the trade-off is the risk of running out of memory (especially under write-heavy workloads).

---

## Comparisons

TODO

> - [C++ Lock-free Hazard Pointer](https://sf-zhou.github.io/programming/hazard_pointer.html) in Chinese; covers folly's hazard pointers
> - [Hazard Pointers vs RCU](http://concurrencyfreaks.blogspot.com/2016/08/hazard-pointers-vs-rcu.html) includes a comparison table
> - [Hazard Pointers: Proposed Interface and Wording for Concurrency TS 2](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p1121r3.pdf) includes a comparison table
> - [P0461R1: Proposed RCU C++ API](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0461r1.pdf) includes a comparison table

### folly

> - [folly/synchronization/Hazptr.h](https://github.com/facebook/folly/blob/main/folly/synchronization/Hazptr.h)
> - [folly/synchronization/Rcu.h](https://github.com/facebook/folly/blob/main/folly/synchronization/Rcu.h)
> - [Proposed Wording for Concurrent Data Structures: Hazard Pointer and Read-Copy-Update (RCU)](https://open-std.org/JTC1/SC22/WG21/docs/papers/2017/p0566r3.pdf)
> - [Working Draft, Extensions to C++ for Concurrency Version 2](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/n4895.pdf)

- Locking (exclusive or shared):
  - Pros: simple to reason about.
  - Cons: serialization, high reader overhead, high contention, deadlock.
  - When to use: when speed and contention are not critical, and when deadlock avoidance is simple.
- Reference counting (atomic shared_ptr):
  - Pros: automatic reclamation, thread-anonymous, independent of support for thread-local data, immune to deadlock.
  - Cons: high reader (and writer) overhead, high reader (and writer) contention.
  - When to use: when thread-local support is lacking and deadlock can be a problem, or automatic reclamation is needed.
- Read-copy-update (RCU):
  - Pros: simple, fast, scalable.
  - Cons: sensitive to blocking.
  - When to use: when speed and scalability are important and objects do not need to be protected while blocking.

#### Hazard Pointers vs RCU

Hazard pointers protect specific objects; RCU sections protect all protectable objects.

| | Reference Counting with DCAS | RCU | Hazard Pointers |
|:----------------:|:------------------:|:---:|:------:|
| Unreclaimed objects | Bounded | Unbounded | Bounded |
| Contention among readers | Can be very high | None | None |
| Traversal forward progress | lock-free | wait-free | lock-free |
| Reclamation forward progress | lock-free | blocking <br>(memory grows) | bounded wait-free |
| Automatic reclamation | yes | no | no |
| Ease of use | easy | easy | hard |

---

## Linux doc / RCU API

[What is RCU? – “Read, Copy, Update”](https://www.kernel.org/doc/html/latest/RCU/whatisRCU.html)

```
        CPU 0                  CPU 1                 CPU 2
    ----------------- ------------------------- ---------------
1.  rcu_read_lock()
2.                    enters synchronize_rcu()
3.                                               rcu_read_lock()
4.  rcu_read_unlock()
5.                    exits synchronize_rcu()
6.                                               rcu_read_unlock()
```

### APIs

- rcu_read_lock()
- rcu_read_unlock()

#### `call_rcu()` & `synchronize_rcu()`

Marks the end of **updater** code and the beginning of reclaimer code.

> The call_rcu() API is a callback form of synchronize_rcu(), and is described in more detail in a later section. Instead of blocking, it registers a function and argument which are invoked after all ongoing RCU read-side critical sections have completed. This callback variant is particularly useful in situations where it is illegal to block or where update-side performance is critically important.

In the common case, prefer synchronize_rcu() over call_rcu() for the sake of simpler code; synchronize_rcu() also automatically limits the update rate when grace periods are delayed, which keeps the system resilient against DoS attacks.

#### `rcu_assign_pointer()` macro

> The updater uses this function to assign a new value to an RCU-protected pointer, in order to safely communicate the change in value from the updater to the reader.
The **updater** uses it to install a new value into an RCU-protected pointer, safely communicating the change to the reader.

> it serves to document (1) which pointers are protected by RCU and (2) the point at which a given structure becomes accessible to other CPUs. That said, rcu_assign_pointer() is most frequently used indirectly, via the _rcu list-manipulation primitives such as list_add_rcu().

In other words, `rcu_assign_pointer()` is usually not called directly, but indirectly through functions such as list_add_rcu().

#### `rcu_dereference()` macro

The **reader** uses `rcu_dereference()` to fetch an RCU-protected pointer.

> Note that rcu_dereference() does not actually dereference the pointer, instead, it protects the pointer for later dereferencing.

Use rcu_dereference() to copy an RCU-protected pointer into a local variable, then dereference that local variable, for example as follows:

```c
p = rcu_dereference(head.next);
return p->data;
```

This can also be combined into a single statement:

```c
return rcu_dereference(head.next)->data;
```

> Note that the value returned by rcu_dereference() is valid only within the enclosing RCU read-side critical section

- The value returned by `rcu_dereference()` may only be used inside the read-side critical section.
- `rcu_dereference()` is usually not called directly, but indirectly through functions such as list_for_each_entry_rcu().
- `rcu_dereference_protected()` is a variant that may be used outside a critical section; it avoids lockdep warnings and also helps compiler optimization.

> [RCU and lockdep checking](https://www.kernel.org/doc/html/latest/RCU/lockdep.html)

#### Diagram

```
rcu_assign_pointer()
                        +--------+
+---------------------->| reader |---------+
|                       +--------+         |
|                           |              |
|                           |              | Protect:
|                           |              | rcu_read_lock()
|                           |              | rcu_read_unlock()
|        rcu_dereference()  |              |
+---------+                 |              |
| updater |<----------------+              |
+---------+                                V
|                                    +-----------+
+----------------------------------->| reclaimer |
                                     +-----------+
  Defer:
  synchronize_rcu() & call_rcu()
```

Note that `rcu_assign_pointer()` protects readers from the updater; it does not coordinate concurrent updaters with each other (that still requires a lock, a semaphore, or the like).

----

### EXAMPLE USES OF CORE RCU API

- synchronize_rcu()

```c
void foo_update_a(int new_a)
{
    struct foo *new_fp;
    struct foo *old_fp;

    new_fp = kmalloc(sizeof(*new_fp), GFP_KERNEL);
    spin_lock(&foo_mutex);
    old_fp = rcu_dereference_protected(gbl_foo, lockdep_is_held(&foo_mutex));
    *new_fp = *old_fp;
    new_fp->a = new_a;
    rcu_assign_pointer(gbl_foo, new_fp);
    spin_unlock(&foo_mutex);
    synchronize_rcu();
    kfree(old_fp);
}
```

#### IF MY UPDATING THREAD CANNOT BLOCK?

- call_rcu()

```c
void foo_update_a(int new_a)
{
    struct foo *new_fp;
    struct foo *old_fp;

    new_fp = kmalloc(sizeof(*new_fp), GFP_KERNEL);
    spin_lock(&foo_mutex);
    old_fp = rcu_dereference_protected(gbl_foo, lockdep_is_held(&foo_mutex));
    *new_fp = *old_fp;
    new_fp->a = new_a;
    rcu_assign_pointer(gbl_foo, new_fp);
    spin_unlock(&foo_mutex);
    call_rcu(&old_fp->rcu, foo_reclaim);
}
```

TODO: complete these notes after finishing the reading.

----

### LIST OF RCU APIs

However, given that there are no fewer than four families of RCU APIs in the Linux kernel, how do you choose which one to use? The following list can be helpful:

#### Will readers need to block?

If so, you need SRCU.

#### What about the -rt patchset?

If readers would need to block in a non-rt kernel, you need SRCU. If readers would block in a -rt kernel, but not in a non-rt kernel, SRCU is not necessary. (The -rt patchset turns spinlocks into sleeplocks, hence this distinction.)

#### Do you need to treat NMI handlers, hardirq handlers, and code segments with preemption disabled (whether via preempt_disable(), local_irq_save(), local_bh_disable(), or some other mechanism) as if they were explicit RCU readers?

If so, RCU-sched is the only choice that will work for you.

#### Do you need RCU grace periods to complete even in the face of softirq monopolization of one or more of the CPUs? For example, is your code subject to network-based denial-of-service attacks?
If so, you should disable softirq across your readers, for example, by using rcu_read_lock_bh().

#### Is your workload too update-intensive for normal use of RCU, but inappropriate for other synchronization mechanisms?

If so, consider SLAB_TYPESAFE_BY_RCU (which was originally named SLAB_DESTROY_BY_RCU). But please be careful!

#### Do you need read-side critical sections that are respected even though they are in the middle of the idle loop, during user-mode execution, or on an offlined CPU?

If so, SRCU is the only choice that will work for you.

#### Otherwise, use RCU.

---

## Sleepable RCU (SRCU)

- [Sleepable RCU](https://lwn.net/Articles/202847/)

The primary challenge in designing an SRCU is to prevent any given task sleeping in an RCU read-side critical section from blocking an unbounded number of RCU callbacks.

> SRCU uses two strategies to achieve this goal:
> 1. refusing to provide asynchronous grace-period interfaces, such as the Classic RCU's call_rcu() API
> 2. isolating grace-period detection within each subsystem using SRCU.

:::info
[srcu: Implement call_srcu()](https://lwn.net/Articles/478293/) — `call_srcu()` was added in January 2012.
:::

### Example

#### Read-Side Primitives

The read-side srcu_read_lock() and srcu_read_unlock() primitives are used as shown:

```c
idx = srcu_read_lock(&ss);
/* read-side critical section. */
srcu_read_unlock(&ss, idx);
```

#### Update-Side Primitives

The synchronize_srcu() primitive may be used as shown below:

```c
list_del_rcu(p);
synchronize_srcu(&ss);
kfree(p);
```

#### Table 1: SRCU Update and Read-Side Critical Sections

![](https://i.imgur.com/5GFQAaq.png)

> Here, CPU 1 need only wait for the completion of CPU 0's SRCU read-side critical section. It need not wait for the completion of CPU 2's SRCU read-side critical section, because CPU 2 did not start this critical section until after CPU 1 began executing synchronize_srcu().
> Finally, CPU 1's synchronize_srcu() need not wait for CPU 3's SRCU read-side critical section, because CPU 3 is using s2 rather than s1 as its struct srcu_struct. CPU 3's SRCU read-side critical section is thus related to a different set of grace periods than those of CPUs 0 and 2.

In short, CPU 1 only needs to wait for CPU 0, not for CPU 3, because CPU 3 belongs to a different subsystem (a different srcu_struct).

----

### SRCU vs. preemptible RCU

> - [The design of preemptible read-copy-update](https://lwn.net/Articles/253651/)
> LWN article by Paul McKenney, 2007

> The RCU implementation for the -rt patchset is unusual in that it permits read-side critical sections to be preempted and to be blocked waiting for locks. However, it does not handle general blocking (for example, via the wait_event() primitive): if you need that, you should instead use SRCU. In contrast to SRCU, preemptible RCU only permits blocking within primitives that are both subject to priority inheritance and non-blocking in a non-CONFIG_PREEMPT kernel. This ability to acquire blocking locks and to be preempted within RCU read-side critical sections is required for the aggressive real-time capabilities provided by Ingo Molnar's -rt patchset.

- Why does preemptible RCU have this restriction? If a reader calls wait_event() inside an RCU read-side critical section and the condition is never satisfied, the grace period is extended indefinitely, eventually exhausting memory (OOM). If the reader is instead waiting on a lock, priority-boosting techniques such as priority inheritance can force the reader to make progress.

#### preemptible RCU

> Because this implementation of preemptible RCU does not require memory barriers in rcu_read_lock() and rcu_read_unlock(), a multi-stage grace-period detection algorithm is required. Instead of using a single wait queue of callbacks (which has sufficed for earlier RCU implementations), this implementation uses an array of wait queues, so that RCU callbacks are enqueued on each element of this array in turn.
> This difference in callback flow is shown in the following figure for a preemptible RCU implementation with two waitlist stages per grace period

Because preemptible RCU does not require memory barriers in rcu_read_lock() and rcu_read_unlock(), it needs a multi-stage grace-period detection algorithm: the single wait queue used by classic RCU becomes an array of wait queues, as shown below:

![](https://i.imgur.com/NYNLaXn.png)

----

TODO: add notes on "Cleaning Up Safely" and the implementation later.

----

#### SRCU Summary

SRCU provides an RCU-like set of primitives that permit general sleeping in SRCU read-side critical sections. However, it is important to note that SRCU has been used only in prototype code, though it has passed the RCU torture test. It will be very interesting to see what use, if any, SRCU sees in the future.

----

### SRCU Quick Quizzes

#### Why is sleeping prohibited within Classic RCU read-side critical sections?

> Because sleeping implies a context switch, which in Classic RCU is a quiescent state, and RCU's grace-period detection requires that quiescent states never appear in RCU read-side critical sections.

In other words, sleeping implies a context switch, and a context switch is a quiescent state; Classic RCU does not allow a quiescent state to appear inside a read-side critical section.

#### Why not permit sleeping in Classic RCU read-side critical sections by eliminating context switch as a quiescent state, leaving user-mode execution and idle loop as the remaining quiescent states?

> This would mean that a system undergoing heavy kernel-mode execution load (e.g., due to kernel threads) might never complete a grace period, which would cause it to exhaust memory sooner or later.
That is, kernel threads might then prevent a grace period from ever completing, quickly exhausting memory.

---

## [Using RCU to Protect Read-Mostly Linked Lists](https://docs.kernel.org/RCU/listRCU.html)

This document discusses the application scenarios that suit RCU; for instance, the lists of a [Concurrent hash table](https://en.wikipedia.org/wiki/Concurrent_hash_table) in the Linux kernel are protected by RCU.

> One of the best applications of RCU is to protect read-mostly linked lists (struct list_head in list.h). One big advantage of this approach is that all of the required memory barriers are included for you in the list macros. This document describes several applications of RCU, with the best fits first.

- Example 1: Read-mostly list: Deferred Destruction
- Example 2: Read-Side Action Taken Outside of Lock: No In-Place Updates (routing table)
- Example 3: Handling In-Place Updates
- Example 4: Eliminating Stale Data
- Example 5: Skipping Stale Objects

### Summary

> Read-mostly list-based data structures that can tolerate stale data are the most amenable to use of RCU. The simplest case is where entries are either added or deleted from the data structure (or atomically modified in place), but non-atomic in-place modifications can be handled by making a copy, updating the copy, then replacing the original with the copy. If stale data cannot be tolerated, then a deleted flag may be used in conjunction with a per-entry spinlock in order to allow the search function to reject newly deleted data.

Workloads that can tolerate stale data are the best fit for RCU. If fresh data is required, set a deleted flag on removal; a search that finds the deleted flag set pretends that the entry does not exist.

---

## Reference

### MISC

- ==[Introduction to RCU](https://docs.google.com/document/d/1X0lThx8OK0ZgLMqVoXiR4ZrGURHrXK6NyLRbeXe3Xac/edit#)== Paul E. McKenney's curated collection of RCU material
- [Linux 核心設計: RCU 同步機制](https://hackmd.io/@sysprog/linux-rcu)
- [並行程式設計: Atomics 操作](https://hackmd.io/@sysprog/concurrency/https%3A%2F%2Fhackmd.io%2F%40sysprog%2Fconcurrency-atomics)
- [CppCon 2017: Fedor Pikus “Read, Copy, Update, then what? RCU for non-kernel programmers”](https://www.youtube.com/watch?v=rxQ5K9lo034) CppCon talk
- [Lock Free中的Epoch Based Reclamation](http://www.yebangyu.org/blog/2016/09/09/epochbasedreclamation/) a third-party article
- [Lock Free中的Hazard Pointer(上)](http://www.yebangyu.org/blog/2015/12/10/introduction-to-hazard-pointer/) a third-party article
- [Quiescent State Based Reclamation vs Epoch Based Reclamation](https://stackoverflow.com/questions/36573370/quiescent-state-based-reclamation-vs-epoch-based-reclamation) Stack Overflow question
- [Memory Barriers: a Hardware View for Software Hackers](http://www.rdrop.com/~paulmck/scalability/paper/whymb.2010.06.07c.pdf) paper, Paul E. McKenney, 2010
- [Hazard pointer, 2022q1 第 5 週測驗題](https://hackmd.io/@sysprog/linux2022-quiz5#%E6%B8%AC%E9%A9%97-2)
- [Concurrency I: Types of Concurrency](https://www.suse.com/c/concurrency-i-types-of-concurrency/), [Concurrency II: Concurrently Readable Structures](https://www.suse.com/c/concurrency-ii-concurrently-readable-structures/) SUSE introduction to concurrency; discusses concurrently readable data structures
  > A concurrently readable structure allows one writer and multiple parallel readers. An interesting property is that when the reader “begins” to read, the view for that reader is guaranteed not to change until the reader completes. This means that the structure is transactional.

----

### books & papers

- [Using Read-Copy-Update Techniques for System V IPC in the Linux 2.5 Kernel](https://www.usenix.org/conference/2003-usenix-annual-technical-conference/using-read-copy-update-techniques-system-v-ipc) USENIX 2003, Paul E. McKenney (paper)
- [READ-COPY UPDATE: USING EXECUTION HISTORY TO SOLVE CONCURRENCY PROBLEMS](http://www.rdrop.com/~paulmck/scalability/paper/rclockpdcsproof.pdf) 1998/2002, Paul E. McKenney (paper)
- [Comparative Performance of Memory Reclamation Strategies for Lock-free and Concurrently-readable Data Structures](https://www.cs.utoronto.ca/~tomhart/papers/tomhart_thesis.pdf) 2005, Thomas Edward Hart (thesis)
- [Making lockless synchronization fast: performance implications of memory reclamation](https://dl.acm.org/doi/10.5555/1898953.1898956) ([PDF](http://www.cecs.uci.edu/~papers/ipdps06/pdfs/1568974892-IPDPS-paper-1.pdf)) IPDPS 2006, Thomas Edward Hart, Paul E. McKenney (paper)
- [Is Parallel Programming Hard, And, If So, What Can You Do About It?](https://mirrors.edge.kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook.html) perfbook, Paul E. McKenney (book)
- [Tornado: Maximizing Locality and Concurrency in a Shared Memory Multiprocessor Operating System](https://www.usenix.org/legacy/events/osdi99/full_papers/gamsa/gamsa.pdf) USENIX 1999, Ben Gamsa, Orran Krieger
- [Performance of memory reclamation for lockless synchronization](https://dl.acm.org/doi/10.1016/j.jpdc.2007.04.010) 2007, Thomas Edward Hart, Paul E. McKenney (paper)
- [A marriage of pointer- and epoch-based reclamation](https://dl.acm.org/doi/abs/10.1145/3385412.3385978) PLDI 2020, Jeehoon Kang, Jaehwang Jung
- [Practical lock-freedom](https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-579.pdf) origin of EBR, 2004, Keir Fraser
- [Harnessing epoch-based reclamation for efficient range queries](https://dl.acm.org/doi/abs/10.1145/3178487.3178489) PPoPP 2018 (paper) (probably not needed?)
- [Safe memory reclamation for dynamic lock-free objects using atomic reads and writes](https://dl.acm.org/doi/10.1145/571825.571829) origin of hazard pointers, PODC 2002, Maged M. Michael
- [Hazard Pointers: Safe Memory Reclamation for Lock-Free Objects](https://dl.acm.org/doi/10.1109/TPDS.2004.8) origin of hazard pointers, IEEE Transactions on Parallel and Distributed Systems 2004, Maged M. Michael
- [Scalable lock-free dynamic memory allocation](https://dl.acm.org/doi/10.1145/996893.996848) ACM SIGPLAN Notices 2004, Maged M. Michael
- [Lock-Free Data Structures with Hazard Pointers](https://erdani.org/publications/cuj-2004-12.pdf) an implementation using hazard pointers
- [Towards Hard Realtime Response from the Linux Kernel on SMP Hardware](https://www.researchgate.net/publication/244519278_Towards_Hard_Realtime_Response_from_the_Linux_Kernel_on_SMP_Hardware) paper, 2005, Paul E. McKenney; discusses quiescent states
- [Avoiding RCU Stalls in the real-time kernel](https://access.redhat.com/solutions/2260151) discusses RCU stall issues
- [RCU-preempt: What happens on a context switch](https://www.joelfernandes.org/linuxinternals/2018/05/10/5-rcu-preempt-context-switch.html) discusses RCU under the PREEMPT_RT config; best read together with [Expedited RCU](https://www.kernel.org/doc/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.html)
- [RCU Usage In the Linux Kernel: Eighteen Years Later](https://dl.acm.org/doi/10.1145/3421473.3421481) paper, 2020, Paul E. McKenney; surveys RCU usage in the Linux kernel (as of v5.6)

----

### RCU doc / articles

- [User-space RCU](https://lwn.net/Articles/573424/)
- [7. RCU Library](https://doc.dpdk.org/guides/prog_guide/rcu_lib.html) dpdk
- [A Tour Through RCU's Requirements](https://docs.kernel.org/RCU/Design/Requirements/Requirements.html) 2015, Paul E. McKenney; a curated RCU document containing many articles

#### LWN articles

- [What is RCU, Fundamentally?](https://lwn.net/Articles/262464/) the three fundamental concepts of RCU:
  - a publish-subscribe mechanism for adding new data,
  - a way of waiting for pre-existing RCU readers to finish, and
  - a discipline of maintaining multiple versions to permit change without harming or unduly delaying concurrent RCU readers.
#### Linux kernel doc

- [RCU Concepts](https://01.org/linuxgraphics/gfx-docs/drm/RCU/index.html)

----

### Unused here (useless now?)

- [Why Hazard Pointers Should Be in C++26](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2530r0.pdf)
- [LOCK-FREE ALGORITHMS AN INTRODUCTION - SAMY BAHRA / APPNEXUS.COM](http://concurrencykit.org/presentations/lockfree_introduction/index.html)
- [RCU(Read, Copy, Update) - What is RCU](https://ithelp.ithome.com.tw/articles/10218032) ithome; contains links to several related articles

---

## Application / library

- [Paul E. McKenney's Journal](https://paulmck.livejournal.com/)
- [folly/Rcu.h](https://github.com/facebook/folly/blob/main/folly/synchronization/Rcu.h)
- [dpdk/lib/rcu/](https://github.com/DPDK/dpdk/tree/main/lib/rcu) https://www.dpdk.org/
- [oscarlab/versioning](https://github.com/oscarlab/versioning) Framework for creating simple, efficient, and composable lock-free data structures
- [wuxb45/wormhole](https://github.com/wuxb45/wormhole), [wuxb45/remixdb](https://github.com/wuxb45/remixdb) Wormhole internally uses QSBR RCU to synchronize readers/writers, so every holder of a reference (ref) needs to actively perform index operations.
- [laurynas-biveinis/unodb](https://github.com/laurynas-biveinis/unodb) UnoDB is an implementation of Adaptive Radix Tree in two flavors – a regular one and an Optimistic Lock Coupling-based concurrent one with Quiescent State Based Reclamation (QSBR).
- [rmind/libqsbr](https://github.com/rmind/libqsbr) Quiescent-State and Epoch based reclamation
- [khizmax/libcds](https://github.com/khizmax/libcds) CDS C++ library
- [urcu/userspace-rcu](https://github.com/urcu/userspace-rcu) Userspace RCU Implementation, http://liburcu.org/
- [sysprog21/concurrent-programs](https://github.com/sysprog21/concurrent-programs)
- [rculist.h](https://github.com/torvalds/linux/blob/master/include/linux/rculist.h)

----

### Atomic

- [Atomic operations library](https://en.cppreference.com/w/c/atomic)
- [Atomic types](https://en.cppreference.com/w/c/language/atomic)
- [memory_order](https://en.cppreference.com/w/c/atomic/memory_order)

---

## Other

- [Concurrent hash table](https://en.wikipedia.org/wiki/Concurrent_hash_table) A concurrent hash table (concurrent hash map) is an implementation of hash tables allowing concurrent access by multiple threads using a hash function
- [内核工具 – Sparse 简介](https://www.cnblogs.com/wang_yb/p/3575039.html) an introduction (in Chinese) to the kernel Sparse tool
- [McKenney: So You Want to Rust the Linux Kernel?](https://lwn.net/Articles/872444/) 2021, Paul McKenney