### Generalize Synchronization Mechanisms for Analyzing Properties Trade-offs and Exploring New Mechanisms <small style="color: #999">(Link to the paper: https://arxiv.org/abs/2309.11972)</small>

---

## Introduction

How shared resources are synchronized.

---

### The origin

* Multiple workers for scalability and speedup
* No shared resources
  * <span style="color: #999">No data race. Simply divide the workload</span>
* With shared resources
  * <span style="color: #999">Need synchronization to prevent data races</span>

---

### Existing sync. mechanisms

* Shared memory
  * <span style="color: #999">Atomics, locks, semaphores</span>
* Distributed memory
  * <span style="color: #999">Paxos family, Raft, CRDTs</span>

---

### No perfect sync. mechanism

* Enforcing strong consistency and low latency may sacrifice R/W freedom, or even fault tolerance
* Weaker consistency trades for higher writing freedom and lower latency

---

### The proposed framework

* Generalize synchronization mechanisms for easier reasoning
  * <span style="color: #999">From the perspective of multi-writer to single-writer convergence</span>
* Formally analyze and reason about how mechanisms converge to obtain their properties
  * <span style="color: #999">State trade-offs and limits on properties</span>
* Is there room for improvement? Are existing mechanisms optimal?
---

## Related works

* $ROLL$ theorem
  * Models leaderless SMR with a framework
  * Quorum size <span style="color: #999">vs.</span> fault tolerance <span style="color: #999">trade-off</span>
  * $2F + f - 1 \leq n$

Note: This slide can probably be skipped

**Why a framework?**

* Explains how mechanisms work in the same clear way
* Checks whether existing ones can be improved to reach the limits
* A template for new mechanisms

---

### Our framework

* From a different perspective
  * More universal
* **Consistency**, <span style="color: #83aec9">writing freedom</span>, <span style="color: #9dd1a4">latency</span>, <span style="color: #e8e2b0">loading</span>, <span style="color: #fcb3b3">fault tolerance</span>
* More focused on finding trade-offs and limits
* More understandable

---

## Synchronization Framework

---

### System Model

Give context

---

**Shared memory**

A register $r$ is a sequence of memory locations $r=\{a_1, a_2,...\}$

* <span style="color: #aaa">A writer performs $write$ in a new location, appending to the end</span>
* Readers perform $read$ by reading through the sequence to the latest possible location.
* <span style="color: #aaa">Only focus on writers</span>

---

**Distributed memory**

* Shared memory illusion
* Each replica is another register

---

### A valid synchronization

* Syncs all the register(s) to the same projection result
  * <span style="color: #aaa">correctness</span>
  * <span style="color: #aaa">progress</span>
* Projection result: accumulate all the values; readers determine the final outcome by
  * <span style="color: #aaa">Taking the latest one</span>
  * <span style="color: #aaa">Summarizing</span>
  * <span style="color: #aaa">or other rules</span>

---

* An invalid sync
  * <span style="color: #aaa">Ends up with a wrong result</span>
  * <span style="color: #aaa">Gets stuck</span>
* Sync
  * <span style="color: #aaa">multiple writers on 1 register</span>
  * <span style="color: #aaa">multiple writers across multiple registers (1 on each)</span>
* Failure mode: fail-recover. No Byzantine faults.
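The register model above can be made concrete with a minimal sketch. This is illustrative only: `Register`, `take_latest`, and `summarize` are names assumed here, not from the paper; they model a register as an append-only sequence of locations, with readers applying a projection rule over it.

```python
# Minimal sketch of the system model (illustrative names, not from the paper):
# a register is an append-only sequence of locations; writers append,
# readers scan the sequence and apply a projection rule to it.

class Register:
    def __init__(self):
        self.locations = []          # the sequence {a_1, a_2, ...}

    def write(self, writer_id, value):
        # A writer appends a new location to the end of the sequence.
        self.locations.append((writer_id, value))

    def read(self, projection):
        # A reader reads through the sequence and projects a final outcome.
        return projection(self.locations)

# Two example projection rules from the slide:
take_latest = lambda locs: locs[-1][1] if locs else None   # take the latest one
summarize   = lambda locs: sum(v for _, v in locs)         # summarize (e.g. a counter)

r = Register()
r.write("w1", 3)
r.write("w2", 5)
assert r.read(take_latest) == 5
assert r.read(summarize) == 8
```

In the distributed-memory setting, each replica would hold its own `Register`; a valid synchronization is then one in which every replica ends up projecting the same result.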
---

**Synchronization is a process of Multi-writer to Single-writer convergence**<br>

How synchronization mechanisms work in general.

---

* Among concurrent writes <span style="color: #999">(no $hb$)</span>, different write orders could lead to different conflicting result states
  * <span style="color: #bbb">Sync. shrinks the possible states to the desired one by MW$\to$SW convergence</span>
* Resolve them
  * <span style="color: #bbb">By rules: giving order or priority</span>
  * <span style="color: #bbb">By making them inherently commutative</span>

---

* In a shared memory environment
  * <span style="color: #aaa">Concurrent writes without mutual exclusion cause a race condition. Undefined behavior in C11.</span>
* Synchronization mechanisms are designed around two core aspects
  * <span style="color: #aaa">The spread of facts</span>
  * <span style="color: #aaa">What to do after receiving the facts</span>

---

### Properties in Synchronization Mechanisms

---

### Properties

* Consistency
* <span style="color: #83aec9">Writing freedom</span>
* <span style="color: #9dd1a4">Latency</span>
* <span style="color: #e8e2b0">Loading</span>
* <span style="color: #fcb3b3">Fault tolerance</span>

---

### Consistency

![](https://hackmd.io/_uploads/Sy1UPkZZh.png =700x)

<small>ref: Consistency in Non-transactional Distributed Storage Systems</small>

The order guarantee of the result from $read$

---

### Writing freedom

The number of writers that can propose values: $1 \sim n$

* Avoids a bottleneck on a single writer
* Load balancing
* <span style="color: #999">Costs latency to coordinate</span>

---

### Latency

* How many round trips does it take to reach a certain consistency level?
* The coordination during/before writing introduces latency

---

### Loading

* The size of the set of communication targets in each round (quorum size)
* Lowers the message count. Avoids high-ping targets.
* Impacts the density of facts, leading to different fault tolerance and latency.

---

### Fault tolerance

At most $f$ writers can fail at any moment during the sync. 
<span style="color: #999">If exceeded, no progress or an incorrect final result</span>

---

## The framework

Synchronization is a process of Multi-writer to Single-writer convergence

---

### A valid pass of sync.

* Writers converged
* Enough writers have learned the converged result to ensure safety.

---

### Horizontal

![](https://hackmd.io/_uploads/BJfyMz1i2.png =700x)

$write(v, m)$

---

* Consistency: $|w|$, 1 ~ Any
* Latency: RTTs in stages
* Writing freedom: The number of writers in $exe$

Note: Use Paxos as an example

---

### Vertical

* **Quorum**: The writers to communicate with in a round.
* **Arbitration**: Rules on how to respond upon receiving a fact.
  * <span style="color: #999">Only accept one</span>
  * <span style="color: #999">Merge into one</span>

---

* Arbiters: Live and hold facts between rounds
* Static
  * <span style="color: #999">Known before $write$ with a value</span>
  * <span style="color: #999">Decided explicitly</span>
* Dynamic
  * <span style="color: #999">Known after $write$ with a value</span>
  * <span style="color: #999">Decided by intersection</span>

---

* Loading: quorum size
* Fault tolerance: $f$

$ROLL$ theorem: $2F + f - 1 \leq n$

---

## Model and Analyze

---

### Raft / Multi-Paxos

![](https://hackmd.io/_uploads/HkpKzVbb2.png =600x)

* **Consistency**: 1
* **Latency**: 1 RTT, both electing and elected
* **Writing freedom**: 1
* **Loading**: $n$
* **Fault tolerance**: $\lfloor\frac{n-1}{2}\rfloor$

---

### EPaxos

![](https://hackmd.io/_uploads/rJLMrEbWn.png =650x)

* **Consistency**: 1
* **Latency**: 1 RTT, fast; 2 RTT, slow
* **Writing freedom**: $n$
* **Loading**: $\lfloor\frac{3n}{4}\rfloor$, fast. 
  $\lceil\frac{n+1}{2}\rceil$, slow
* **Fault tolerance**: $\lfloor\frac{n-1}{2}\rfloor$

---

### CRDTs

![](https://hackmd.io/_uploads/H1tBp4ZZn.png)

(The simplest one)

* **Consistency**: Any
* **Latency**: 0.5 RTT
* **Writing freedom**: $n$
* **Loading**: Any
* **Fault tolerance**: $n-1$

---

## Trade-offs and limits

---

### Observation

* Different levels of **consistency** need different levels of **coordination**
* Differences in **writing freedom** determine the dynamic/static **arbiter type** and the **coordination latency**

---

* A bigger **loading** trades for better **fault tolerance**
* Dynamic and static **arbiters** have different limitations on trading off **loading** and **fault tolerance**

---

### Analysis

* Assuming linearizability for consistency
* The split-brain problem
  * <span style="color: #999">Two or more projection results at the end of a pass</span>
  * <span style="color: #999">Not converged</span>
* Caused by insufficient fact spreading

---

**Lemma 1.** In the context of linearizability, the number of faults does not exceed $\lfloor \frac{n-1}{2} \rfloor$

$proof.$ By contradiction. Having half or more writers fail may end up with two different projection results.

---

* Static arbiter
  * 1 RTT to decide in $pre$
  * 1 RTT to use in $exe$
* Dynamic arbiter
  * 1 RTT to decide
  * 2 RTT or even more to linearize

---

**Theorem 1.** Under $n$ writing freedom, one round trip is optimal for deciding to write or abort.

$proof.$ The arbitration rule responds with a negative message to all the other writers, and a positive one to the converged writer.

---

* A sufficient quorum size is needed to decide a static/dynamic arbiter
* Combined with Lemma 1, all properties are related to the quorum. 
* Key concept for **loading** trading for **fault tolerance**
  * Focus not only on fact spreading, but also on which writers do not get the facts

---

* The principle
  * There must be at least one writer holding facts across two rounds to deliver them
* Count the writers that cannot deliver
  * writers without facts + faulty writers
  * A writer is either without facts or faulty

---

**Theorem 2.** To decide a dynamic arbiter, $2(n − |Q|) + f − 2 < n$ needs to be satisfied

$proof.$ Writers without facts: $2(n − |Q|)$; faulty writers: $f$. Exclude 2 writers for holding additional facts.

---

* EPaxos uses a $\lfloor \frac{3n}{4} \rfloor$ quorum size to max out its fault tolerance
* The $ROLL$ theorem is the theorem for the fixed-size quorum and dynamic arbiter situation, under the consistency level of linearizability.

---

**Theorem 3.** To decide a static arbiter, $(n − |Q|) + f \leq \lceil \frac{n-1}{2} \rceil$ needs to be satisfied.

$proof.$ Writers without facts: $(n − |Q|)$; faulty writers: $f$. According to Lemma 1, to leave sufficient writers that may fail, the combination should not exceed $\lceil \frac{n-1}{2} \rceil$

---

* Raft and VR use an $n$-sized quorum to tolerate minority faults.
* The key concept behind Theorems 2 and 3 can be applied to more situations.

---

* Eventual consistency trades the weakest consistency for every other property being optimal
* For consistency levels in between eventual and linearizability, decompose them into a combination of the strongest and weakest, to be analyzed in the known way.

---

## The Discussion

---

**Extra findings and details**

* Properties do not indicate performance.
* Clocks can only help local resolving, reducing latency by 1 RTT
* Apply the key concept to dynamic or non-homogeneous quorums
* Application logic in synchronization
  * TOQ uses time to reduce 1 RTT in $exe$ for EPaxos

---

### Exploring New Mechanisms

---

Use a priority tree to resolve conflicts locally. Order conflicting writes by applying fixed priorities.

* **Consistency**: 1
* **Latency**: ++1 RTT, fast. 
1 RTT, slow++
* **Writing freedom**: $n$
* **Loading**: $\lfloor\frac{3n}{4}\rfloor$, fast. $\lceil\frac{n+1}{2}\rceil$, slow
* **Fault tolerance**: $\lfloor\frac{n-1}{2}\rfloor$

---

### Future work

* There are still more details to discuss, especially on applying the framework to real implementations.
* Some optimizations performed in implementations cannot be covered by the framework.
* Write a specification of the framework in TLA+, thinking in a more formal way.

---

## The end

---

## Special Thanks

Thanks to Prof. Chen for allowing me to research the topic I've been looking forward to studying.

Thanks to Prof. Hsieh for emphasizing the importance of simplicity during the graph theory course.

---

## FAQ

---

* **Why is the analysis only applicable to valid synchronization mechanisms? Why can it not be applied to others?**
  <span style="color: #999">`Without guaranteeing both correctness and progress, a mechanism can easily be designed to have any property at its optimal level. Correctness and progress come at a price only when they are guaranteed, so that there are trade-offs and limits.`</span>