Generalize Synchronization Mechanisms for Analyzing Properties Trade-offs and Exploring New Mechanisms
(Link to the paper: https://arxiv.org/abs/2309.11972)
Introduction
The synchronization of shared resources.
The origin
- Multiple workers for scalability and speedup
- No shared resources
- No data race. Simply divide the workload
- With shared resources
- Need synchronization to prevent data races
Existing sync. mechanisms
- Shared memory
- Atomics, locks, semaphores
- Distributed memory
- Paxos family, Raft, CRDTs
No perfect sync. mechanism
- Enforcing strong consistency and low latency may sacrifice R/W freedom, or even fault tolerance
- Weaker consistency trades for higher writing freedom and lower latency
The proposed framework
- Generalize synchronization mechanism for easier reasoning
- From the perspective of multi-writer to single-writer convergence
- Formally analyze and reason about how mechanisms converge to get properties
- State trade-offs and limits on properties
- Is there room for improvement? Are existing mechanisms optimal?
- \(ROLL\) theorem
- Model leaderless SMR with a framework
- quorum size vs fault tolerance trade-off
- \(2F + f - 1 \leq n\)
Note:
This slide can probably be skipped.
Why a framework?
- Explain how mechanisms work in the same clear way
- Check whether existing ones can be improved to reach the limits
- A template for new mechanisms
Our framework
- From a different perspective
- More universal
- Consistency, writing freedom, latency, loading, fault tolerance
- Focuses more on finding trade-offs and limits
- More understandable
Synchronization Framework
System Model
Give context
Shared memory
A register \(r\) is a sequence of memory locations
\(r=\{a_1, a_2,...\}\)
- A writer performs \(write\) into a new location, appending to the end
- Readers perform \(read\) by reading through the sequence to the latest possible location.
- We only focus on writers (a minimal register sketch follows below)
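A minimal sketch of this register model, assuming plain Python (the class and names are illustrative, not from the paper):

```python
# Illustrative model only: a register as an append-only sequence of locations.
class Register:
    def __init__(self):
        self.locations = []            # r = {a1, a2, ...}

    def write(self, value):
        self.locations.append(value)   # a writer appends to a new location at the end

    def read(self):
        # a reader scans through the sequence to the latest possible location
        return self.locations[-1] if self.locations else None

r = Register()
r.write("v1")
r.write("v2")
assert r.read() == "v2"
```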
Distributed memory
- Shared memory illusion
- Each replica is another register
A valid synchronization
- Sync all the register(s) to the same projection result
- Projection result: accumulate all the values; readers determine the final outcome by one of the rules below (sketched after this list)
- Take the latest one
- Summarize
- or other rules
- An invalid sync
- Ends up with a wrong result
- Gets stuck
- Sync
- multiple writers on 1 register
- multiple writers across multiple registers (1 on each)
- Failure mode: fail-recovery; no Byzantine faults
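A minimal sketch of the projection rules named above (illustrative Python, not the paper's code). After a valid sync every register holds the same accumulated values, so any agreed rule yields the same outcome for every reader:

```python
# After a valid sync, every replica's register holds the same accumulated values.
synced_values = [3, 5, 2]

# Readers determine the final outcome by an agreed projection rule:
take_latest = synced_values[-1]    # rule 1: take the latest value      -> 2
summarize   = sum(synced_values)   # rule 2: summarize, e.g. a counter  -> 10

# Because the sequences are identical everywhere, each rule gives one result on all readers.
assert take_latest == 2 and summarize == 10
```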
Synchronization is a process of Multi-writer to Single-writer convergence
How synchronization mechanisms work in general.
- Among concurrent writes (no \(hb\) relation), different write orders could lead to different, conflicting states
- Sync. shrinks possible states to the desired one by MW\(\to\)SW convergence
- Resolve them (both ways are sketched after this list)
- By rules: giving order or priority
- By making them inherently commutative
- In a shared-memory environment
- Concurrent writes without mutual exclusion cause a data race (undefined behavior in C11)
- Synchronization mechanisms are designed around two core aspects
- The spread of facts
- What to do after receiving the fact
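As referenced in the list above, a minimal sketch of the two resolution styles (illustrative Python, not from the paper): ordering concurrent writes by a rule, or making them inherently commutative:

```python
# Two concurrent writes with no happens-before (hb) relation between them.
concurrent = [("writer_a", 7), ("writer_b", 9)]

# (1) Resolve by a rule: give the writes an order or priority.
#     Here the smaller writer id wins, so every replica keeps the same single value.
winner = min(concurrent, key=lambda w: w[0])

# (2) Resolve by making the writes inherently commutative.
#     With addition, any apply order converges to the same state.
state_ab = 0 + 7 + 9
state_ba = 0 + 9 + 7
assert state_ab == state_ba

print("rule-based winner:", winner, "| commutative state:", state_ab)
```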
Properties in Synchronization Mechanisms
Properties
- Consistency
- Writing freedom
- Latency
- Loading
- Fault Tolerance
Consistency

ref: Consistency in Non-Transactional Distributed Storage Systems
The ordering guarantee on the results of \(read\)
Writing freedom
The number of writers that can propose values: 1 to \(n\)
- Avoid bottleneck on a writer
- Load balancing
- Costs latency to coordinate
Latency
- How many round trips does it take to reach a certain consistency level?
- The coordination during/before writing introduces latency
Loading
- The size of the set of communication targets in each round (quorum size)
- Lowers message counts; avoids high ping
- Impacts the density of facts, leading to different fault tolerance and latency
Fault tolerance
At most \(f\) writers can fail at any moment during the sync.
If exceeded, there is either no progress or an incorrect end result
The framework
Synchronization is a process of Multi-writer to Single-writer convergence
A valid pass of sync.
- Writers converged
- Enough writers have learned the converged result to ensure safety
Horizontal

\(write(v, m)\)
- Consistency: \(|w|\), 1 ~ Any
- Latency: RTTs in stages
- Writing freedom: The number of writers in \(exe\)
Note:
Use Paxos as an example.
Vertical
- Quorum: The writers to communicate with in a round.
- Arbitration: Rules on how to respond upon receiving a fact.
- Only accept one
- Merge into one
- Arbiters: Live and hold facts between rounds
- Static
- Known before \(write\) with value
- Decided explicitly
- Dynamic
- Known after \(write\) with value
- Decided by quorum intersection
- Loading: quorum size
- Fault tolerance: \(f\)
\(ROLL\) theorem: \(2F + f - 1 \leq n\)
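A tiny numeric check of the stated inequality (illustrative Python; reading \(F\) as the writers a quorum may miss and \(f\) as the tolerated faults is my assumption, in line with Theorem 2 later):

```python
def roll_ok(n, big_f, f):
    """Check the stated ROLL inequality 2F + f - 1 <= n."""
    return 2 * big_f + f - 1 <= n

# Example with n = 5 writers:
print(roll_ok(5, 2, 2))  # True:  a quorum may miss 2 writers while tolerating 2 faults
print(roll_ok(5, 2, 3))  # False: missing 2 writers and tolerating 3 faults exceeds n
```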
Model and Analyze
Raft / Multi-Paxos

- Consistency: 1
- Latency: 1 RTT, both when electing and once elected
- Writing freedom: 1
- Loading: \(n\)
- Fault tolerance: \(\lfloor\frac{n-1}{2}\rfloor\)
EPaxos

- Consistency: 1
- Latency: 1 RTT (fast), 2 RTT (slow)
- Writing freedom: \(n\)
- Loading: \(\lfloor\frac{3n}{4}\rfloor\) (fast), \(\lceil\frac{n+1}{2}\rceil\) (slow)
- Fault tolerance: \(\lfloor\frac{n-1}{2}\rfloor\)
CRDTs

(The simplest case; a minimal sketch follows this list)
- Consistency: Any
- Latency: 0.5 RTT
- Writing freedom: \(n\)
- Loading: Any
- Fault tolerance: \(n-1\)
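As referenced above, a minimal CRDT sketch (a state-based grow-only counter in Python; illustrative, not the paper's code). Every writer writes locally, dissemination is a one-way 0.5 RTT, and the merge is commutative, so replicas converge without coordination:

```python
# Grow-only counter (G-Counter): the classic state-based CRDT.
class GCounter:
    def __init__(self, writer_id, n):
        self.writer_id = writer_id
        self.counts = [0] * n                 # one slot per writer

    def increment(self):                      # local write, no coordination needed
        self.counts[self.writer_id] += 1

    def merge(self, other):                   # commutative, associative, idempotent
        self.counts = [max(a, b) for a, b in zip(self.counts, other.counts)]

    def value(self):                          # projection rule: summarize
        return sum(self.counts)

a, b = GCounter(0, 2), GCounter(1, 2)
a.increment(); b.increment(); b.increment()
a.merge(b); b.merge(a)                        # any merge order converges
assert a.value() == b.value() == 3
```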
Trade-offs and limits
Observation
- Different levels of consistency need different levels of coordination
- Differences in writing freedom enable the dynamic/static arbiter types and different coordination latencies
- Bigger loading trades off for better fault tolerance
- Dynamic and static arbiters have different limitations when trading off loading and fault tolerance
Analysis
- Assuming linearizability for consistency
- The split brain problem
- Two or more projection results at the end of a pass
- Not converged
- Caused by insufficient fact spreading
Lemma 1. In the context of linearizability, the number of faults must not exceed \(\lfloor \frac{n-1}{2} \rfloor\)
\(proof.\) By contradiction: having half or more of the writers fail may end up with two different projection results.
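One standard way to see the bound, as a small check (illustrative Python): with \(f\) failures, two passes that each reach \(n - f\) writers are guaranteed to overlap only when \(f \leq \lfloor \frac{n-1}{2} \rfloor\); otherwise two disjoint groups can produce different projection results (split brain):

```python
import math

def overlap_guaranteed(n, f):
    # Two groups of n - f writers must intersect iff 2 * (n - f) > n.
    return 2 * (n - f) > n

n = 5
for f in range(n):
    assert overlap_guaranteed(n, f) == (f <= math.floor((n - 1) / 2))
# Overlap (no split brain) holds exactly when f <= floor((n - 1) / 2).
```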
- Static arbiter
- 1 RTT to decide in \(pre\)
- 1 RTT to use in \(exe\)
- Dynamic arbiter
- 1 RTT to decide
- 2 RTT or even more to linearize
Theorem 1. Under \(n\) writing freedom, one round trip is optimal for deciding to write or abort.
\(proof.\) The arbitration rule determines the responses: a negative message to every other writer, and a positive one to the converged writer.
- A sufficient quorum size is needed to decide a static/dynamic arbiter
- Combined with Lemma 1, all properties are related to the quorum
- The key concept for trading loading for fault tolerance
- Focus not only on fact spreading, but also on which writers do not get the facts
- The principle
- There must be at least one writer holding facts across two rounds to deliver them
- Count which writers cannot deliver
- Writers without facts + faulty writers
- A writer is either without facts or faulty
Theorem 2. To decide a dynamic arbiter, \(2(n − |Q|) + f − 2 < n\) needs to be satisfied
\(proof.\) Writers without facts: \(2(n − |Q|)\); faulty writers: \(f\); exclude 2 writers that hold additional facts.
- EPaxos uses a quorum size of \(\lfloor \frac{3n}{4} \rfloor\) to max out the fault tolerance
- The \(ROLL\) theorem is the theorem for the fixed-size-quorum, dynamic-arbiter situation, under the consistency level of linearizability.
Theorem 3. To decide a static arbiter, \((n − |Q|) + f \leq \lceil \frac{n-1}{2} \rceil\) needs to be satisfied.
\(proof.\) Writers without facts: \((n − |Q|)\); faulty writers: \(f\). According to Lemma 1, to leave sufficient surviving writers, the combination should not exceed \(\lceil \frac{n-1}{2} \rceil\)
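A small numeric check of the two stated inequalities (illustrative Python; the EPaxos and Raft quorum sizes are the ones used elsewhere in this talk):

```python
import math

def max_f_dynamic(n, q):
    # Theorem 2: 2(n - |Q|) + f - 2 < n  =>  f <= n + 1 - 2(n - |Q|)
    return n + 1 - 2 * (n - q)

def max_f_static(n, q):
    # Theorem 3: (n - |Q|) + f <= ceil((n - 1) / 2)
    return math.ceil((n - 1) / 2) - (n - q)

n = 5
print(max_f_dynamic(n, q=3 * n // 4))  # |Q| = floor(3n/4) = 3 (EPaxos fast path) -> f <= 2
print(max_f_static(n, q=n))            # |Q| = n (Raft / VR)                      -> f <= 2
# Both reach the minority bound of Lemma 1 for n = 5.
```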
- Raft and VR use an \(n\)-sized quorum to tolerate minority faults.
- The key concept behind Theorems 2 and 3 can be applied to more situations.
- Eventual consistency trades the weakest consistency for every other property being optimal
- For consistency levels in between eventual and linearizability, decompose them into a combination of the strongest and the weakest, to be analyzed in the known way.
Discussion
Extra findings and details
- Properties do not indicate performance.
- Clocks can only help local resolution, reducing latency by 1 RTT
- Apply the key concept to dynamic or non-homogeneous quorums
- Application logic in synchronization
- TOQ uses time to reduce 1 RTT in \(exe\) for EPaxos
Exploring a New Mechanism
Use a priority tree to resolve conflicts locally: order conflicting writes by applying a fixed priority (a rough sketch follows the property list below).
- Consistency: 1
- Latency: 1 RTT (fast), 1 RTT (slow)
- Writing freedom: \(n\)
- Loading: \(\lfloor\frac{3n}{4}\rfloor\) (fast), \(\lceil\frac{n+1}{2}\rceil\) (slow)
- Fault tolerance: \(\lfloor\frac{n-1}{2}\rfloor\)
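A rough sketch of the local, fixed-priority conflict resolution idea as I read it from this slide (the names and the flat priority map are my own illustration; the paper's priority tree structure is not reproduced here): every replica knows the same fixed priorities, so conflicting writes can be ordered locally without an extra coordination round:

```python
# Illustrative only: order conflicting writes by a fixed, globally known priority.
FIXED_PRIORITY = {"w1": 0, "w2": 1, "w3": 2}   # assumed static priority assignment

def resolve_locally(conflicting_writes):
    # conflicting_writes: list of (writer_id, value) pairs with no happens-before order
    return sorted(conflicting_writes, key=lambda w: FIXED_PRIORITY[w[0]])

# Replicas may receive the same conflicting set in different orders ...
replica_a = resolve_locally([("w3", "z"), ("w1", "x")])
replica_b = resolve_locally([("w1", "x"), ("w3", "z")])
assert replica_a == replica_b   # ... yet converge to the same order without extra RTTs
```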
Future work
- There are still more details to discuss, especially on applying the framework to real implementations.
- Some optimizations performed in implementations cannot be covered by the framework.
- Write a specification of the framework in TLA+, to think in a more formal-methods way.
The end
Special Thanks
Thanks to Prof. Chen for allowing me to research the topic I have been looking forward to studying.
Thanks to Prof. Hsieh for emphasizing the importance of simplicity during the graph theory course.
FAQ
- Why is the analysis only applicable to valid synchronization mechanisms? Why can it not be applied to others?
Without guaranteeing both correctness and progress, a mechanism can easily be designed to have any property at an optimal level. Correctness and progress come at a price only when they are guaranteed, which is what creates the trade-offs and limits.