<style>
@import url('https://fonts.googleapis.com/css2?family=PT+Serif&display=swap');
@import url('https://fonts.googleapis.com/css?family=Noto+Serif+TC&display=swap');
@import url('https://fonts.googleapis.com/css2?family=Fira+Code&family=Noto+Serif+TC&display=swap');
* {
font-family: 'PT Serif', 'Noto Serif TC', sans-serif;
letter-spacing: 0.02em;
/* font-size: 16px;
letter-spacing: 0.02em; */
}
.easy{
color:green;
}
.medium{
color:#eda35e;
}
.red{
color:red;
}
.proof {
padding:10px;
background-color: #f7f7f7;
}
</style>
# Disjoint Sets
## Introduction
本篇將介Set如何利用Data Strucutre表達一個集合(Set),以便在Graph Algorithm中使用
## Content
---
### 1. Set Operations
Set的操作定義如下
> #### ***Union($S_i$ , $S_j$)***
if $S_i$ and $S_j$ are two disjoint sets, then their union
\begin{align}
S_i\cup\text{ }S_j = \{x\text{ } |\text{ } x \in S_i \text{ or } x \in S_j\}
\end{align}
> #### ***Find(i)***
Find the set containing element $i$
接下來方便介紹Set的各個種類,例子都預設如下
\begin{split}
&S\text{ = { 0, 1 , 2 , 3 , 4 , 5 }}\\
&S_1\text{ = { 0, 2 , 3 }}\\
&S_2\text{ = { 1 }}\\
&S_3\text{ = { 4 , 5 }}\\
\end{split}
其中 $S_1$ , $S_2$ , $S_3$ 分別為 $S$ 的 Subsets
### 2. Union with Quick Find
> #### ***Using sequential mapping array***
> #### ***Find(i)***
使用array來表達每個元素屬於哪個Set,例如element 1只要利用S[1]馬上就能知道屬於$S_2$

> #### ***Union($S_i$ , $S_j$)***
$S_2$ = Union($S_2$ , $S_3$),將$S_3$的element改成屬於Set 2

> #### ***Time Complexity***
\begin{align}
&Find(i) : O(1)\\
&Union(S_i , S_j) : O(n)\\
\end{align}
> 我們可以發現如果使用Union則必須將每個元素改為對應的Set顯得很沒有效率,下一個Set的種類,將介紹怎樣可以使Union更具效率
---
### 3. Union with Simple Find
> #### ***Using Tree Representation***
> #### ***Find(i)***
使用Link list表達Set,若須查找對應的Set則需拜訪根節點屬於哪個Set

使用Array 版本

由上面例子,Find(2)需查看父節點0,再從父節點查看得知屬於Set 1
> #### ***Union($S_i$ , $S_j$)***
$S_1$ = Union($S_1$ , $S_3$),將代表根結點的4指向0


從上面我們可以知道4原本指向$S_3$,如今指向0再透過0查找屬於哪一個Set,合併動作只需要一 次就可以將所有element union
> #### ***Time Complexity***
\begin{align}
&Find(i) : O(n)\\
&Union(S_i , S_j) : O(1)\\
\end{align}
我們可以注意到,雖然Union大幅改善,但隨之而來的代價卻是Find Complexity變為 $O(n)$
ex: Union($S_0$ , $S_1$) , Union($S_1$ , $S_2$) , ... , Union($S_{n-2}$ , $S_{n-1}$) 皆合併至右邊集合,將造成以下情形發生

> 因此必須審視Union的規則以免到最後Find(i)時間複雜度皆須$O(n)$
---
### 4. Weighted Union
為了改善Simple Find,所造成的worst case情形發生,故Union採取權重的形式,避免形成歪斜情況,其所引用的規則稱為Weighting rule
> #### ***Weighting rule***
\begin{align}
&S_i = S_i \cup S_j \;\;\; \text{, if |} S_i \text{| >=|} S_j \text{|} \\
&S_j = S_i \cup S_j \;\;\; \text{, if |} S_i \text{| <|} S_j \text{|} \\
\end{align}
其核心規則為將元素個數小集合的合併至元素個數多的一方
> #### ***Time Complexity***
\begin{align}
&Find(i) : O(log(n))\\
&Union(S_i , S_j) : O(1)\\
\end{align}
>**Lemma : Let T be a tree with m nodes created by a sequence of weighting union the height of $T\;{\small \le}\text{ } \lfloor log_2\;m \rfloor + 1$**
<div class="proof">
*Proof by Mathematical Induction*
  *The Lemma is clearly true for m = 1.*
  *Assume it is true for all trees with i $\small \le\;$ m-1 nodes*
  *Let T be a tree with m node created by Weighting Union*
  *The last Union is Union(k , j).*
  Let $|j| = a$ nodes in tree j , $|k| = m - a$ nodes in tree k
   Without loss of generality, assume 1 $\small \le\;$ $a$$\;\small \le\;$ $m/2$
   The tree height of T is either the same as that of k
   That is, the height of T $\small \le\;$ $\lfloor log_2\;(m -a) \rfloor + 1$ $\small \le\;$ $\lfloor log_2\;(m) \rfloor + 1$
   or the tree height of T is one more than that of j
   That is, the height of T $\small \le\;$ $\lfloor log_2\;(a) \rfloor + 2$ $\small \le\;$ $\lfloor log_2\;(m/2) \rfloor + 2$ $\small \le\;$ $\lfloor log_2\;(m) \rfloor + 1$
  So, the lemma is also true for i = m.
  We know that the lemma is true by Mathematical Induction.
</div>
---
### 5. Weighted Union with Collapsing Find
改善Union的樹結構,有更好的方法稱為Collapsing rule
> #### ***Collapsing rule***
\begin{align}
if\;j\;is\;a\;node\;on\;the\;path\;from\;i\;to\;the\;root,\;set\;parent[j]\;to\;root(i)
\end{align}
透過Find(i)時,將查找的路徑節點皆指向根節點以此改善Find的執行效率

> #### ***Time Complexity***
\begin{align}
&Find(i) : O(\alpha(n))\\
&Union(S_i , S_j) : O(1)\\
\end{align}
---
### Reference
[1] Data Structure slides made by Prof. REN-SONG TSAY @ National Tsing Hua University
[2] Fundamentals of data structures in C++ 2nd edition , Horowitz, Sahni, Mehta
###### tags: `DataStructure`