Clock Synchronization in Distributed System - HackMD

<style> H2{color:#BF0060 !important;} H3{color:#009393 !important;} p{color:Black !important;} li strong {color:#4682b4 !important;} </style> # Clock Synchronization in Distributed System ## **Physiacal clock** * 時鐘有 skew * 電腦 are synchronized to UTC(=GMT), which is based on Atomic Clock. ### **名詞定義** * **Perfect clock C~p~(t) = t** * **Frequency** * Frequency is the rate at which a clock progresses * ==C~p~'(t)== is the frequency at time t of clock C~p~ * Cp'(t) ＝ ${dCp(t)\over dt} = \lim_{Δ\to\ 0}{Cp(t)-Cp(t-Δ) \over Δ}$ = 1 for perfect clock <img src="https://i.imgur.com/EUY6Sm6.png" width = "200"/></img> * **Offset** * 時間差 <img src="https://i.imgur.com/7TOG6eT.png" width = "350"/></img> * **Precision and Accuracy** * Precision π * 精確度是相對概念 $∀t,\ ∀p, q : |C~p~(t)-C~q~(t)| ≤ π$ * the offset between two clocks is bounded by π * Accuracy α * 準確度代表絕對 * $∀t,\ ∀p : |C~p~(t)-t| ≤ α$ * the offset between a clock and a reference point (e.g., UTC) is bounded by α * Note : * clocks that are accurate within bound α, will be precise within bound π = 2α * **Skew** * ==C~a~’(t) – C~b~’(t)== : The skew of a clock C~a~ `relative to `clock C~b~ at time t * ==C~a~’(t) – 1== : The skew of C~a~, since perfect clock C~p~(t)' = 1 * **Skew Tolerance: Max Drift Rate** * Clock drift rate : the difference per unit of time from a perfect reference clock * ==Max Drift Rate $ρ$== is specified by manufactors * $1-ρ ≤ {dCp(t)\over dt} ≤ 1+ρ \ , \ ∀t$ * 2 clock may differ by ==2$ρ$Δt in Δt== * **題目** * Given Max Drift Rate $ρ$, 要求 Precision π * 解： * $2ρΔt ≤ π$ $Δt ≤ {π \over 2ρ}$ ，所以每 ${π \over 2ρ}$ 秒應 resynchronize 一次 ### **Cristian’s Algorithm** * **原因** * 電腦向 UTC =t 對齊時，msg delay Δ 導致收到的 t 實際上是 t+Δ * **假設** * Clock A B 有相同 offset & frequency * ==round-trip delay = delay(m) + delay(m’)== : 一來一回的 delay * round-trip delay = ==(T4-T3) - (T1-T2)== <img src="https://i.imgur.com/zjCo4P5.png" width = "300"/></img> * **Estimate the Offset** * Assume * O = C~B~(t) – C~A~(t) * delay(m) ~= delay(m’) * ==O = (T1 - T3 + T2 - T4) /2== <img src="https://i.imgur.com/rTpQF59.png" width = "250"/></img> * 例題 <img src="https://i.imgur.com/BfJOZ12.png" width = "500"/></img> ### **Network Time Protocol (NTP)** * **特色** * 向階層化 server 要時間 : primary, secondary...server * accuracy : 1~50ms * statically technique for filtering (去除outlier) * 保留最近 8 次 (O~i~,D~i~) pair * O~i~ = offset, D~i~ = round-trip time * ==O = 最小 D~i~ 那次的 O~i~== * survive lengthy losses of connectivity (克服斷線) * Reduntant server and paths to server , which provides protection against malicious interference ### **Berkeley Algorithm** * **特色** * 電腦間彼此同步 * 一台為 master 定期 poll slaves，取大家 Avg time 差值 <img src="https://i.imgur.com/CqRWJU9.png" width = "500"/></img> ### **Decentralized Averaging Algo** * **特色** * 每台電腦定期廣播 local time，算 Avg time 自我校正 ### **Reference Broadcast Sync (RBS)** * **前情提要** * 同步時的 4 種 time error <img src="https://i.imgur.com/OVed7Yw.png" width = "400"/></img> * **原理** * <img src="https://i.imgur.com/1gUUYCt.png" width = "300"/></img> * Sender 同時發 m∈M 的 referenced packets 給 receivers * 假設所有 receivers 應該同時收到 packets * removes send and access time errors (因為同 sender 送出) * sensitive to receive time and propagation delay (因為此 Algo 假設這兩種 error 一樣) * 範例：上圖 receiver 1 & 2 彼此內部時間差 1s * 所有 receivers 互相同步，計算彼此時間差 Offset[p,q] * 忽略 clock skew : `Offset[p,q]` = ${\sum\limits_{m∈M}^{}Tp,m - Tq,m \over |M|}$ * 考慮 clock skew：令 `Offset[p,q](t)` = 時間函數 αt + β * **特性** * Broadcast 使用 relative time reference * 因為實際時間不重要，Msg 只是當作共同參考點 * Each receiver synchronizes to a reference packet * Msg ==沒有 timestamp== * 所以任一種 broadcast protocol 都可以 e.g. ARP, RTS/CTS, route discovery packets, etc. ### **Multihop RBS** * **前提： Phase Offset Estimation** * Receivers 紀錄收到脈衝時的 local time * Receivers exchange observations * form a local (relative) timescale * **原理**： * <img src="https://i.imgur.com/DpQLgCJ.png" width = "500"/></img> ## **Model of distributed computation** ### **Actions and events** * The states of processes and channels change as the events occur. * **3 type of events** | | internal event| msg send event | msg receive event | | --------| -------- | -------- | -------- | | Change the state of | process itself | sending process & channel | receiving process & channel| ### **From Causality(因果) to Event ordering** * 利用 Event ordering 推斷因果關係 * Event Relation 是 partial order，因為 DS 中無 global clock，不可能有 total order * **Happened-Before (HB) Relation** * Induced By Message Passing * Strict Partial Order * irrefexivity, antisymmetry(因到果，不能果到因), transitivity ### **2 Rules** 1. e~i~ 到不了 e~j~ ，但 e~j~ 到/到不了 e~i~ 都有可能 2. e~i~ 能到 e~j~，則 e~j~ 必到不了 e~i~ <img src="https://i.imgur.com/V9H2xI7.png" width = "400"/></img> ### **Concurrent** * **Def** * Neither e~i~ -> e~j~ nor e~j~ -> e~i~ 則 e~i~ || e~j~ * <img src="https://i.imgur.com/srGP4UD.png" width = "200"/></img> * **Concurrent relation 沒有遞移性** * <img src="https://i.imgur.com/O6ttTNT.png" width = "400"/></img> * **將 Logically concur 視作 physically concur** * 因為不影響 causality * LogiC 出錯，不代表 PhyC 會出錯 ### **結論** * <img src="https://i.imgur.com/uxkde99.png" width = "400"/></img> ## **Logical time and logical clock** ### **Logical clock** * **Definition** * C: H ↦ T * ==Maps event e to timestamp C(e)==, e∈H, C(e)∈T * ==Clock consistent conditions== * (1) ei ➞ ej ⇒ C(ei) < C(ej) * satisfied protocal : Lamport's Scalar Time * (2) ei ➞ ej ⇔ C(ei) < C(ej) : strongly consistent * satisfied protocal : Vector time * **physical vs. logocal clock** * <img src="https://i.imgur.com/uUW79CC.png" width = "300"/></img> ### **Implementing logical clock** * **2 issue** * (1) Data Structure * local logical clock:lc~i~ * global logical clock:gc~i~ * (2) ==Protocal== to update the DS that ensures the consistency * 2 rules * ==R1== : how ==lc ~i~== is udated as the events (send/receive/internal) occurs * ==R2== : how ==gc~i~== is udated * **Protocal** * ==(1) Lamport's scalar time== * Use C~i~ to represent both lc~i~ and gc~i~ * R1 : C~i~ := C~i~ + d * R2 : C~i~ = max(C~i~, C~msg~)+d * d=1 as an example * <img src="https://i.imgur.com/oluSybO.png" width = "300"/></img> * <img src="https://i.imgur.com/GnEyOj8.png" width = "350"/></img> * ==(2) Vector time== * R1 : vt~i~[i] := vt~i~[i] + d * R2 : * 1 ≤ k ≤ n : vt~i~[k] := max(vt~i~[k], vt[k]) + d * vt~i~[i] := vt~i~[i] + d * d=1 as an example * <img src="https://i.imgur.com/qkKIv6I.png" width = "400"/></img> ## **Lamport’s Scalar Time** ### **Construct Consistent Total Order** * **問題** * Scalar Clocks 可以排出 Consistent Total Order，但不能排出所有可能 * **目標 : Construct Unique Total Order** * 作法：時間相同時比 process id * 範例： * 原來有 2!2!種排法，變成 1 種 * <img src="https://i.imgur.com/lIWA8hk.png" width = "300"/></img> ### **Totally-ordered Multicasting** * **原理** * Msg timestamped 後 multicast 出去 * 收到 Msg 後放進 ==local queue== 按照 C(m) ==排序==，回覆 ==Ack== * Scalar time 中，若有 hb 關係則 C(m1) < C(m2) * Msg 在 queue 的 head 且收到所有 processes 的 Ack，才會把 Msg 交給 Application * **Ack 用處** * 因為 channel 是 FIFO 所以當 Pa 收到 Pb 的 Ack，表示 Pb 在 Ack 以前的訊息都送完了 ## **Vector Time** ### **Relation** * `vh < vk` : less than, 存在 vh < vk ，其他維度 vh <= vk * `vh || vk` : incomparable, 某些維度上 vh < vk 有些 vk < vh * **Isomorphism** * Set of events <---isomorphic to---> Set of timestamp * Strong consistency holds * ei ➞ ej ⇔ C(ei) < C(ej) * ei || ej ⇔ C(ei) || C(ej) ### **Singhal–Kshemkalyani’s differential technique** * **目的** * 減少 msg overhead * **原理** * 每個 process 只花 2n storage * 存上次 send/update 的本地時間vt~i~[i] * LS~i~[j] : last sent to process j * LU~i~[j] : 上次 update vt~i~[j] * 當 LS~i~[j] < LU~i~[k] 時，發送 {(k, vt~i~[k])} 給 process j * 範例： * LS~i~[j] = 4, LU~i~[k]=9, LU~i~[l]=10 * 發送 {(k,vt~i~[k]),(l,vt~i~[l])} 給 process j * **例題** * <img src="https://i.imgur.com/EStmH3O.png" width = "500"/></img> ###### tags: `OS`