2021 Week5 - Basics of Graph

--- tags: 成大高階競技程式設計 2021 --- # 2021 Week5 - Basics of Graph 註：:bulb: 代表相對少見、較進階或與主題關聯性較低的內容，讀者可自行斟酌觀看。 --- 圖論（grpah theory）在資料結構與演算法中，可說是一大主題，本篇將介紹一些圖（grpah）的基本知識與術語。 # 圖（Graph）圖（graph），是點（vertex）與邊（edge）的集合；點是圖的基本元素，而邊描述點之間的關係。 ```graphviz graph { rankdir=RL; node [shape=circle, label=""]; A -- B; A -- C; B -- C; B -- E; C -- D; D -- E; D -- F; } ``` * **點／節點（vertex）** * 圖的基本元素 * **邊（edge）** * 點與點的關係 * 鄰點／鄰居（neighbors） * 透過一條邊直接相連的點 </br> * 道路（walk） * 點邊相間的序列（e.g., $v_0e_1v_1e_2v_2\dots e_nv_n$） * 行跡（trail） * 邊不重複的道路 * **路徑（path）** * **點不重複**的道路 * **環（cycle）** * 只有起點與終點重複的行跡 </br> * 度數（degree） * 與點連接的邊數 * 入度（in-degree） * 點作為終點的邊數 * 出度（out-degree） * 點作為起點的邊數 </br> * 連通（connected） * 點 * 二個點之間有路徑 * 圖 * 任意二點之間連通 * 非連通（disconnected） </br> * **（無向）圖（(undirected) graph）** * 邊無方向性，或說是**雙向**的 * **有向圖（directed graph）** * 邊有**方向性** * **賦權圖（weighted graph）** * 邊有權重 * 二分圖 * 點可以分為二個集合 $X,Y$，所有邊都是 $X$ 到 $Y$ 或 $Y$ 到 $X$ * 完全圖 * 任意二點間都有邊圖論常用 **$V$** 代表**點集合**，**$E$** 代表**邊集合。** <hr class="dashed"> ## 圖的資料結構 ### 鄰接矩陣（Adjacency matrix） ```graphviz graph { Adj [shape=none, label= <<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0" CELLPADDING="10"> <TR> <TD BGCOLOR="DimGray"></TD> <TD BGCOLOR="DimGray"></TD> <TD></TD> <TD BGCOLOR="DimGray"></TD> </TR> <TR> <TD BGCOLOR="DimGray"></TD> <TD BGCOLOR="DimGray"></TD> <TD></TD> <TD></TD> </TR> <TR> <TD></TD> <TD></TD> <TD BGCOLOR="DimGray"></TD> <TD></TD> </TR> <TR> <TD BGCOLOR="DimGray"></TD> <TD COLSPAN="1"></TD> <TD></TD> <TD BGCOLOR="DimGray"></TD> </TR> </TABLE>> ]; } ``` 用一個二維的矩陣 $adj$，$adj[u][v]$ 代表邊 $(u,v)$，可能是 `true`, `false` 代表邊存在與否，或是儲存邊的權重。 ```cpp! int n, m; // 點數，邊數 cin >> n >> m; vector<vector<int>> adj(n, vector<int>(n, false)); while (m--) { int u, v; cin >> u >> v; adj[u][v] = adj[v][u] = true; // 無向圖 } ``` 空間複雜度 $O(|V|^2)$。 > 好處是可以快速存取到某條邊。 <hr class="dotted"> ### 鄰接表（Adjacency list） ```graphviz digraph { rankdir=LR; V [shape=record, height=2.4, label="<1>1|<2>2|<3>3|<4>4"] 1 [shape=record, width=0.5, label="{3}"]; 2 [shape=record, width=1.0, label="{3|4}"]; 3 [shape=record, width=1.5, label="{1|2|4}"]; 4 [shape=record, width=1.0, label="{2|3}"]; V:1 -> 1; V:2 -> 2; V:3 -> 3; V:4 -> 4; } ``` 並不是任意二點之間都有邊，因此可以只記錄存在的邊，以起點的角度，$(u,v)$ 把它存到 $u$ 之下。 > 也可以從終點的角度進行紀錄，取決於後續操作的需求。 ```cpp! int n, m; // 點數，邊數 cin >> n >> m; vector<vector<pair<int, int>>> adj(n); while (m--) { int u, v, w; cin >> u >> v >> w; adj[u].emplace_back(v, w); // 賦權有向圖 } ``` 空間複雜度 $O(|V| + |E|)$。 > 好處是只遍歷存在的邊，壞處則是比較難找特定的邊， > 當然真的有需要查詢的話，也可用 `vector<set<int>>` 來記錄。 > 一般**鄰接表較常被使用**，避免無謂的儲存空間與執行時間浪費； > 完全圖（或稠密圖（dense graph）[^dg]）才比較會使用鄰接矩陣。 <hr class="dashed"> ## 圖的遍歷（Graph Traversal）圖的遍歷也被稱為圖的搜尋（Graph Search），是指拜訪（visit）圖上各個點的一個過程。 > 不一定要拜訪所有點，工作完成就可以提早結束。 ```graphviz graph { node [shape=circle, label="", fixedsize=true]; a -- e; a -- b; b -- c; c -- d -- b; c -- e; c -- f; b -- g; b -- h -- a; h -- i -- b; h -- j; a -- k; k -- l; a -- m; m -- n; m -- o -- a; } ``` <hr class="dotted"> ### 深度優先搜尋（DFS, Depth-First Search）顧名思義，要盡量先往深層的地方搜尋，每次都繼續往鄰點拜訪下去（除非遇到拜訪過的點）。 > 以 $1$ 為根進行 DFS，點上數字代表拜訪順序。 ```graphviz graph { node [shape=circle, fixedsize=true]; edge [dir=forward, style=bold, color="#8B0000"]; 1 -- 2; 3 -- 5 -- 6; 3 -- 2 [dir=back]; 3 -- 4; 6 -- 10; 8 -- 7 [dir=back]; 6 -- 7; 8 -- 9; 1 -- 11; 11 -- 12; 1 -- 13; 13 -- 14; 13 -- 15; edge [dir=none, style="", color=black]; 1 -- 6; 6 -- 3; 6 -- 8; 8 -- 1; 15 -- 1; } ``` DFS 走訪的邊，形成一棵[有根樹](#有根樹（Rooted-Tree）)，稱為 **DFS 樹**。 ```graphviz digraph { node [shape=circle, fixedsize=true]; edge [color="#8B0000"]; 1 -> 2; 2 -> 3; 3 -> 4; 3 -> 5; 5 -> 6; 6 -> 7 -> 8 -> 9; 6 -> 10; 1 -> 11; 11 -> 12; 1 -> 13; 13 -> 14; 13 -> 15; } ``` <hr class="thin"> DFS 過程中，點會有三種狀態： * 未拜訪 * 拜訪中 * DFS 樹中的子孫尚未拜訪完 * 函數尚在呼叫堆疊（Call stack）中 * 拜訪完 * DFS 樹中的所有子孫都已拜訪完 * 函數已結束 > **拜訪中與拜訪完都稱為已被拜訪。** ```cpp void dfs(int u, int dep = 0) { vis[u] = true; for (auto& v : adj[u]) { if (vis[v]) continue; dfs(v, dep + 1); } } ``` > * dep：節點深度 > * vis：是否已被拜訪 > * adj：鄰接表 ```cpp dfs(root); ``` <hr class="thin"> > 非連通圖的話，只跑一次 DFS 沒辦法遍歷所有點。 > > ```cpp! > for (int i{0}; i < n; ++i) > if (!vis[i]) dfs(i); > ``` 每個點都被拜訪一次，而每個點都要檢查所有的鄰點（邊），複雜度 $O(|V|+|E|)$。 <hr class="dotted"> ### 暴力搜尋 DFS 不只能運用在圖上，一些問題也能透過 DFS 進行暴力搜尋，而 DFS 本身就有著可以回溯（backtracking）的特性，所以能夠輕易地知道從初始狀態到答案的過程。 > 回溯就是透過回到父節點，或者說實作上的遞迴返回。 <hr class="thin"> ### [Transformation: from A to B](https://codeforces.com/contest/727/problem/A) :::info 將一個數字，每次乘上 $2$，或是乘以 $10$ 再加 $1$，請問如何透過若干次操作，將 $a$ 變為 $b$？（或指出這樣是不可能的） ::: 因為二種操作都是指數增長，很快 $a$ 就會大於等於 $b$ 了，所以可以將所有可能的情況都搜尋一次。 ```cpp! optional<vector<long long>> dfs(long long a, long long b) { if (a > b) return nullopt; if (a == b) return vector<long long>{a}; auto r{dfs(a * 10 + 1, b)}; if (r) return r.value().push_back(a), r; r = dfs(a * 2, b); if (r) return r.value().push_back(a), r; return nullopt; // a 無法變為 b } ``` <hr class="thin"> 有時候透過問題特性，知道繼續搜尋下去也找不到答案，可以提早返回，這樣的技巧稱為「**（暴力）剪枝**」；不太會降低複雜度，卻有機會能大幅減少執行時間。 <hr class="dotted"> ### 廣度優先搜尋（BFS, Breadth-First Search）顧名思義，要盡量以寬廣的方式搜尋，將距離搜尋起點較近的點拜訪完後，才去拜訪較遠的點。 ```graphviz graph { node [shape=circle, fixedsize=true]; edge [dir=forward, style=bold, color="#8B0000"]; 1 -- 2; 1 -- 3; 3 -- 10; 8 -- 2 [dir=back]; 8 -- 15; 3 -- 9; 1 -- 4; 3 -- 11; 4 -- 12; 1 -- 5; 5 -- 13; 1 -- 6; 6 -- 14; 1 -- 7; edge[dir=none, style="", color=black]; 3 -- 8; 8 -- 10; 3 -- 4; 4 -- 11; 6 -- 7; } ``` BFS 經過的邊也是形成一棵[樹](#有根樹（Rooted-Tree）)，稱為 **BFS 樹**，同一深度的節點都拜訪完，才會拜訪到更深的節點。 ```graphviz digraph { node [shape=circle, fixedsize=true]; subgraph cluster_0 { style=dotted; 1; node [style=invis]; A0, B0, C0, D0, E0, F0; } subgraph cluster_1 { style=dotted; 2; 3; 4; 5; 6; 7; node [style=invis]; A1; } subgraph cluster_2 { style=dotted; 8; 9; 10; 11; 12; 13; 14; } subgraph cluster_3 { style=dotted; 15; node [style=invis]; A3, B3, C3, D3, E3, F3; } edge [style="invis"]; A0 -> A1 -> 8 -> 15; B0 -> 2 -> 9 -> A3; C0 -> 3 -> 10 -> B3; 1 -> 4 -> 11 -> C3; D0 -> 5 -> 12 -> D3; E0 -> 6 -> 13 -> E3; F0 -> 7 -> 14 -> F3; edge [style="", color="#8B0000"]; 1 -> 2; 1 -> 3; 1 -> 4; 1 -> 5; 1 -> 6; 1 -> 7; 2 -> 8; 3 -> 9; 3 -> 10; 3 -> 11; 4 -> 12; 5 -> 13; 6 -> 14; 8 -> 15; } ``` <hr class="thin"> 1. 將起點加入佇列（queue）中 2. 每次將一個拜訪中的點 $u$ 取出拜訪，並把 $u$ 所有**未拜訪的鄰點**加進佇列中等待 3. 重複步驟 $2$ 直到佇列為空過程中，點一樣有三種狀態： * 未拜訪 * 未進入佇列 * 拜訪中 * BFS 樹中的子節點尚未拜訪 * 在佇列中 * 拜訪完 * BFS 樹中的子節點已被拜訪 * 已離開佇列 ```cpp! void bfs(int u) { queue<int> qu{}; qu.push(u), vis[u] = true; while (!qu.empty()) { u = qu.front(), qu.pop(); for (auto& v : adj[u]) { if (vis[v]) continue; qu.push(v), vis[v] = true; } } } ``` > 若是非聯通圖，一次 BFS 也沒辦法拜訪完所有點。複雜度與 DFS 相同，為 $O(|V|+|E|)$。 <hr class="dotted"> ### 最短路徑（無權重）搜索圖上**起點到任意點**的最短路徑，就很適合 BFS，因為它先拜訪完距離近的點，才慢慢往距離遠的點拜訪。尤其在二維矩陣上，可以直接執行 BFS，也不需要真的有一個圖。 * 牆（不能走）：`*` * 路：` ` * 起點：`@` * 終點：`#` ``` * * * * * * * * * * * * * * @ * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * # * * * * * * * * * * * * * ``` 數字代表距離（BFS 樹的深度）。 ```haskell * * * * * * * * * * * * * * 1 0 1 2 * * * 5 4 3 2 * * 3 * * * 6 * 4 * * * 4 * * * 7 * 5 6 7 * 5 * * * 8 * 6 * 8 7 6 * * * 9 10 * * 9 * 7 8 * * * 11 12 11 10 * * 9 * * * * 12 * 12 11 10 * * * * 13 * 13 * 11 * * 17 16 15 14 15 14 * * * * * * * * * * * * * ``` <hr class="thin"> ### [Fire!](https://onlinejudge.org/index.php?option=onlinejudge&Itemid=8&page=show_problem&problem=2671) :::info 在一個矩陣上，有 Joe 與好幾個起火點，`#` 代表牆，`.` 代表路。 Joe 每分鐘可以往上下左右移動一格，火則是每分鐘會往四個方向擴散一格，請問 Joe 能夠順利逃出這個矩陣範圍嗎？可以的話要多久呢？ ::: ```cpp! int R, C; cin >> R >> C; vector<vector<int>> maze(R, vector<int>(C)); pair<int, int> J; vector<pair<int, int>> F{}; for (int r{0}; r < R; ++r) for (int c{0}; c < C; ++c) { char x; cin >> x; if (x == 'J') J = {r, c}; if (x == 'F') F.emplace_back(r, c); maze[r][c] = (x == '#' ? -1 : R + C); // 先給一個夠大的數，代表走不到 } ``` Joe 不能被火燒到，所以 Joe 一定要**走得比火快**，因此，先算出火何時會燒到各個點（搜尋火源到各點的**最短路徑**）。 ```cpp! /* 上,下,左,右 */ const array<int, 4> dr{-1, 1, 0, 0}; const array<int, 4> dc{0, 0, -1, 1}; vector<vector<bool>> vis(R, vector<bool>(C, false)); queue<pair<int, int>> qu{}; for (auto& f : F) { maze[f.first][f.second] = 0; vis[f.first][f.second] = true; qu.push(f); } while (!qu.empty()) { auto [r, c]{qu.front()}; qu.pop(); for (int d{0}; d < 4; ++d) { int nr{r + dr[d]}, nc{c + dc[d]}; if (nr < 0 || nr >= R || nc < 0 || nc >= C) continue; // 超出邊界 if (maze[nr][nc] == -1 || vis[nr][nc]) continue; // 牆壁 or 走過的點 maze[nr][nc] = maze[r][c] + 1; vis[nr][nc] = true; qu.emplace(nr, nc); } } ``` 現在已經計算完火到各個點的時間了，接著就可以讓 Joe 也找一次最短路徑，除了牆壁不能走以外，火更早抵達的點也不能走。 ```cpp for (int r{0}; r < R; ++r) fill(vis[r].begin(), vis[r].end(), false); // 重複使用 vis，先初始化一下 maze[J.first][J.second] = 0; // 因為火源與 Joe 不在同個點上，所以至少 Joe 一開始是安全的 vis[J.first][J.second] = true; qu.push(J); while (!qu.empty()) { auto [r, c]{qu.front()}; qu.pop(); for (int d{0}; d < 4; ++d) { int nr{r + dr[d]}, nc{c + dc[d]}; if (nr < 0 || nr >= R || nc < 0 || nc >= C) { // 逃出邊界 cout << maze[r][c] + 1 << '\n'; return ; } if (maze[nr][nc] == -1 || vis[nr][nc]) continue; // 牆壁 or 走過的點 if (maze[nr][nc] <= maze[r][c] + 1) continue; // 火先到達的點 maze[nr][nc] = maze[r][c] + 1; vis[nr][nc] = true; qu.emplace(nr, nc); } } cout << "IMPOSSIBLE\n"s; ``` 如果能走出邊界，Joe 就順利逃脫啦！ --- # 樹（Tree）樹是一個連通無環無向圖（connected acyclic undirected graph）。 ```graphviz graph { node [shape=circle, label=""]; A -- B; A -- C; A -- D; B -- E; B -- F; } ``` <hr class="dashed"> ## 有根樹（Rooted Tree） > 有根樹很常見，但通常不會強調「有根」，請依情境判斷； > 而「樹」在處理時，常常也會選擇一個點作為根，例如 DFS 或 BFS。有根樹（rooted tree）是一棵一個節點被指定為根（root）的樹，如此一來樹就有方向（orientation），像一棵倒著的[樹](https://en.wikipedia.org/wiki/Tree)。 > 方向可以是遠離根，也可以是指向根。 ```graphviz digraph { node [shape=circle, label=""]; A [label="R"]; A -> B; A -> C; A -> D; B -> E; B -> F; a [label="R"]; a -> b [dir=back]; a -> c [dir=back]; a -> d [dir=back]; b -> e [dir=back]; b -> f [dir=back]; } ``` * **節點（node）** * 樹的基本元素 * **根（root）** * **父節點（parent）** * 靠近根的鄰點 * **子節點（child）** * 遠離根的鄰點 * 祖先（ancestor） * 往父節點移動多次可以到達的節點 * 子孫（descendant） * 往子節點移動多次可以到達的節點 * **葉（leaf）** * 沒有子節點的節點 </br> * 高度（height） * 節點高度（height of a node） * 往下到某個葉的最長距離 * 樹高度（height of the tree） * 根的高度 * 深度（depth） * 節點深度（depth of a node） * 到根的距離 </br> * 子樹（subtree） * 以某節點 $u$ 為根，包含 $u$ 所有的子孫而形成的樹 * 森林（forest） * 多棵不相交（disjoint）的樹形成的集合 <hr class="dotted"> ## 二元樹 ```graphviz graph { node [shape=circle, label=""]; A :sw -- B; B :sw -- D; B :se -- E; A :se -- C; C :se -- F; F :sw -- G; F :se -- H; } ``` 每個節點**至多有二個子節點**的有根樹，稱為二元樹；二個子節點分別稱為左子節點（left child）與右子節點（right child）。 <hr class="thin"> > 深度相同的節點稱為同一層。 **二元樹中，深度為 $d$ 的節點數最多 $2^d$ 個。** > 1. $d=0$ 時成立。 > 2. 若 $d=k$ 時，節點數不超過 $2^k$， > 則 $d=k+1$ 時，節點數不超過 $2\cdot2^k$。 > > 根據數學歸納法，命題為真。 **樹高為 $h$ 的二元樹，節點數不超過 $2^{h+1}-1$。** > $1+2+...+2^h=2^{h+1}-1$ <hr class="thin"> * 完美二元樹（perfect binary tree） * 每一層的節點都是滿的 > ```graphviz > graph { > node [shape=circle, label=""]; > > A :sw -- B; > B :sw -- C; > B :se -- D; > A :se -- E; > E :sw -- F; > E :se -- G; > } > ``` * 完全二元樹（complete binary tree） * 除了最底層以外，其餘的節點是滿的，並且最底層的節點也是靠左填滿的 > ```graphviz > graph { > node [shape=circle, label=""]; > > A :sw -- B; > B :sw -- C; > B :se -- D; > A :se -- E; > E :sw -- F; > G [style=invis]; > E :se -- G [style=invis]; > } > ``` <hr class="thin"> 對於完全二元樹，我們可以用編碼的方式，將其儲存於陣列之中。 ```graphviz graph { node [shape=circle]; 1 :sw -- 2; 2 :sw -- 4; 2 :se -- 5; 1 :se -- 3; 3 :sw -- 6; 7 [style=invis]; 3 :se -- 7 [style=invis]; } ``` 如果由上至下，由左至右進行編號， **編號為 $v$ 的節點，左子節點編號為 $2v$，右子節點編號為 $2v+1$。** > 1. 當 $v=1$ 時，命題為真。 > 2. 如果 $v=k$ 時命題也為真， > 因為 $k+1$ 的左子節點緊隨在 $k$ 的右子節點後面， > 編號為 $2k+2$，而右子節點編號為 $2k+3$，命題也為真。 > > 根據數學歸納法，對任意數命題都為真。 <hr class="dashed"> ## 樹的資料結構樹是圖的一種，可以用鄰接表儲存；為了動態加入與刪除節點，也可以用指標的方式來儲存。 ```cpp! struct Node { Node* p; vector<Node*> ch; // 其他資料 }; ``` <hr class="dashed"> ## 樹的性質 ```graphviz graph { rankdir=LR; node [shape=circle, fixedsize=true, label=""]; 6 [style=filled]; 1 -- 2 -- 5; 3 -- 4 -- 5 -- 6 [style=bold]; 6 -- 7 -- 8; 7 -- 9; 5 -- 10 -- 11; 5 -- 12; 5 -- 13; 14 -- 5; } ``` > 這裡討論的是樹，不是有根樹。 #### 葉（leaf）度數為 $1$ 的點。 #### 離心率（Eccentricity）點 $u$ 的離心率，定義為 $u$ 到其他點的最長距離。 <hr class="dotted"> ### 樹的中心（Centers of a tree）中心是**離心率最小**的點。對於任意點數大於 $2$ 的樹，所有的葉都不會是中心。 > 假設點 $u$ 是葉，$v$ 是它唯一相鄰的點，$u$ 到其他點 $w$ 的距離 $d(u, w) = d(v,w) + 1$， > $u$ 的離心率大於 $v$，所以 $u$ 不是中心。任意點 $u$ 到其他點的最長路徑，終點一定是葉，所以**把所有的葉移除**後，所有非葉的點離心率會減 $1$，並且**中心不變**。因此要尋找中心的話，只需要持續把葉移除，直到剩餘 $1$ 或 $2$ 個點。 > **樹的中心只有 $1$ 或 $2$ 個**；如果有 $2$ 個中心，彼此相鄰。 ```cpp! pair<int, int> center(const vector<vector<int>>& adj) { int n{adj.size()}; vector<int> d(n); // degree vector<int> leaf{}; for (int u{0}; u < n; ++u) { d[u] = adj[u].size(); if (d[u] == 1) leaf.push_back(u); } int ts{n}; // tree size while (ts > 2) { vector<int> nl{}; ts -= leaf.size(); for (auto& u : leaf) for (auto& v : adj[u]) if (--d[v] == 1) nl.push_back(v); leaf.swap(nl); } return {leaf[0], leaf.size() == 2 ? leaf[1] : -1}; } ``` <hr class="dotted"> ### 樹的半徑（Radius of a tree）半徑是**中心為起點的最長路徑**；半徑（長度）是中心的離心率。 > 找中心時，移除了葉幾次，就可以知道半徑長度； > 以中心為根，進行 DFS 也可以找出半徑。 <hr class="dotted"> ### 樹的直徑（Diameter of a tree）直徑是樹上**最長路徑**；直徑（長度）是所有點的離心率的最大值。一條最長的路徑在 DFS 樹會有一個深度最小的點，所以路徑就可以分為此點往子樹最長的二條路徑。 > 也有可能只有一棵子樹。 ```cpp! int dfs(int u, int w = -1) { int mx{0}; for (auto& v : adj[u]) if (v != w) { auto len{1 + dfs(v, u)}; diam = max(diam, mx + len); // 二條路徑透過 u 接在一起 mx = max(mx, len); } return mx; } ``` > 如果要真的找出路徑，紀錄每個節點最長與第二長的路徑，是往哪個子節點， > 並紀錄到底哪個點，是直徑在 DFS 樹中深度最小的點。 <hr class="dotted"> ### 關聯（Relationship）:bulb: > 令 $D$ 為直徑長度，$R$ 為半徑長度。中心 $c$，假設某鄰點 $x$，$c$ 往 $x$ 走到 $y$ 為半徑，至少還有另外一鄰點 $a$，$c$ 往 $a$ 走到某個葉 $b$，$d(c,b)\ge R-1$。 :::spoiler 證明 > ```graphviz > graph { > rankdir=LR; > node [shape=circle]; > n0 [style=dotted, label=""]; > n1 [style=dotted, label=""]; > n2 [style=dotted, label=""]; > > c -- x -- n0; > n0 -- n1 -- y [style=dotted]; > c -- a -- n2; > n2 -- b [style=dotted]; > } > ``` > > 以 $c$ 為根，在子樹 $x$ 這邊的節點 $u$，$d(x,u)=d(c,u)-1$； > 而其餘節點 $v$，$d(x,v)=d(c,v)+1$。 > 如果 $d(c,v)$ 都小於 $R-1$，則 $d(x,v)\le R-1$， > $x$ 的離心率只有 $R-1$，與 $c$ 是中心矛盾，所以假設錯誤。 > > > 從找中心的過程，因為 $c$ 的度數一直大於 $1$，也可以得到相同結論。 ::: <hr class="thin"> 通過中心的最長路徑至少 $2R-1$，代表直徑的下界；而直徑的上界為 $2R$，若是大於 $2R$，沒有點的離心率會等於 $R$。如果 $D=2R$，只有一個點的離心率能夠等於 $R$，所以只有一個中心；若 $D=2R-1$，則有二個相鄰的點離心率都是 $R$，有二個中心。 :::danger **若只有 $1$ 個中心，$D=2R$；若有 $2$ 個中心，$D=2R-1$**；並且中心在直徑上。 ::: <hr class="dotted"> ### 樹的重心（Centroids of a tree）以某個點為根，其**最大子樹最小的點**，即為重心。進行 DFS 的時候，對於某一點 $u$，除了 DFS 樹上的所有子樹外，還有往父節點的也是子樹，大小可以由其他子樹的節點數推算。 ```cpp! int n; vector<int> mxst(n, 0); // max subtree int dfs(int u, int w = -1) { int sz{1}; // u 與所有子樹的節點數量 for (auto& v : adj[u]) if (v != w) { int s{dfs(v, u)}; mxst[u] = max(mxst[u], s); sz += s; } mxst[u] = max(mxst[u], n - sz); // 往父節點的子樹 return sz; } ``` **樹最多有二個重心。** :::spoiler 證明 > 令 $s$ 為重心的最大子樹大小； > 如果有三個點 $x,y,z$ 都是重心，$y$ 在 $x\to z$ 的路徑上； > 對 $y$ 來說，如果 $z$ 所在子樹大小為 $s$， > 則 $x$ 的最大子樹大於 $s$，$x$ 不會是重心； > 如果 $x$ 所在子樹大小為 $s$，$y$ 不會是重心； > 如果有另一棵子樹大小為 $s$，而且 $x,z$ 都不在其中，$x,z$ 都不是重心。 ::: <hr class="dashed"> ## 並查集（Disjoint Set Union-Find Forest, DSU） **對於不交集的集合（Disjoint Set），支援合併與查詢。** 利用樹狀結構來記錄，**一棵樹代表一個集合**； **合併**，只需將二個樹合在一起，最簡單的就是把**根接到另一棵樹的根上**；而**查詢**，只需**找到樹根**，用根的編號代表該集合，就能讓每個集合有一個獨特（unique）的編號。一開始每個元素都是一個獨立的集合。 ```graphviz graph { 0; 1; 2; 3; 4; } ``` <hr class="dotted"> ### Weighted Union ```graphviz graph { 0; 1; 2; 3; 4; 2 -- 3; } ``` 如果 $\{2,3\}$ 再跟 $\{0\}$ 合併，由 $0$ 當根的話，樹高會變得更高。 ```graphviz graph { 0; 1; 2; 3; 4; 0 -- 2 -- 3; } ``` 所以透過集合的大小來決定誰當根，可以將樹高限制在 $O(\lfloor\log_2 n\rfloor)$。 ```graphviz graph { 0; 1; 2; 3; 4; 2 -- 0; 2 -- 3; } ``` > #### 證明 > 1. $n=1$ 時，命題成立。 > 2. 假設二個集合個數為 $n_1,n_2$，高度為 $h_1,h_2$ 並符合命題， > 不失一般性假設 $n_1\ge n_2$，合併之後樹高為 $\max(h_1,h_2+1)$， > 則合併後的樹也符合命題。 > * $\lfloor\log_2(n_1+n_2)\rfloor\ge\lfloor\log_2n_1\rfloor\ge h_1$ > * $\lfloor\log_2(n_1+n_2)\rfloor\ge \lfloor\log_2(2n_2)\rfloor=\lfloor \log_2(n_2) \rfloor+1\ge h_2+1$ > > 根據數學歸納法，命題成立。 <hr class="dotted"> ### Collapsing Find 每次查詢的過程，都要一直往父節點尋找找，而所有祖先一樣都在這個集合之中，所以就可以將他們都接到根下，縮小節點深度，避免下次查詢重複計算。 ```graphviz graph { 1 [style=filled]; 6 -- 2 [dir=back]; 2 -- 0 [dir=back]; 0 -- 1 [dir=back]; 1 -- 4; 1 -- 5; 2 -- 3; } ``` > 查詢後： > ```graphviz > graph { > 6 -- 0; > 6 -- 1; > 1 -- 4; > 1 -- 5; > 6 -- 2; > 2 -- 3; > } > ``` <hr class="dotted"> ### Implementation 不管合併或查詢，都只會往父節點的方向移動，所以用一個陣列 $p$ 來記錄父節點就足夠了。 > 根的父節點可以設定為 `-1`，或是根自己。 > `find()` 在遞迴返回得知根結點後，才能把路徑上的點的父節點也都設為根結點。 ```cpp! // fast disjoint set union class DSU { vector<int> p, sz; public: explicit DSU(int n) : p(n, -1), sz(n, 1) {} int find(int x) { // collapsing find return p[x] == -1 ? x : p[x] = find(p[x]); } void unionn(int x, int y) { // weighted union auto rx{find(x)}, ry{find(y)}; if (rx == ry) return ; if (sz[rx] < sz[ry]) swap(rx, ry); p[ry] = rx, sz[rx] += sz[ry]; } }; ``` Weighted Union 搭配 Collapsing Find，均攤複雜度為 $\alpha(n)$，是個成長非常緩慢的函數，inverse Ackermann function，在實務上不超過 $5$，可以**將並查集的操作視為均攤常數時間**（amortized constant time）。 > 複雜度分析過於錯綜複雜，就不討論。 <hr class="thin"> :bulb::bulb::bulb:有一個較簡單的分析[^DSU]，可以得出複雜度上界為 $O(\lg^*n)$。 > In computer science, $\lg^*$ is often used to indicate the binary iterated logarithm.[^itlog] :::spoiler 定義一個節點 $u$ 的大小，如果 $u$ 是根，$u$ 的大小為樹的節點數；如果不是根，$u$ 的大小為最後為根時的大小。 > 與程式碼中的 `size` 一致。 > :::danger > （推論 1.）節點 $u$ 的大小為 $s$，父節點的大小至少為 $2s$。 > ::: 將結點依照大小分籃，$[1:2),[2:4),[4:16),[16:65536),\dots$，如果某一籃的範圍是 $[2^B:2^{2^B})$，下一籃的範圍是 $[2^{2^B}:2^{2^{2^B}})$。 > :::danger > （推論 2.）籃子數為 $\log^*n$。 > ::: > :::danger > （推論 3.）籃子 $[2^B:2^{2^B})$ 中的元素不多於 $2n/2^B$ 個。 > ::: > 將籃子分隔為 $[2^B:2^{B+1}), [2^{B+1}:2^{B+2})+\dots+[2^{(2^B-1)}:2^{2^B})$， > 先討論區間 $[2^B:2^{B+1})$， > 根據推論 1.，區間中元素的祖先與子孫都不在區間中， > 所以區間中的元素的子節點與曾經的子節點都不重複， > 因此區間中的元素最多有 $n/2^B$ 個； > 籃子中其他區間亦同，因此得到結論，籃子中的元素， > 最多有 $n/2^B+n/2^{B+1}+\dots+n/2^{(2^B-1)}\le2n/2^B$ 個。 <hr class="thin"> $m$ 次 `find()`，將所有一步一步往上找的過程，依照 $u$ 到父節點 $w$ 的差別分成三類。 1. $w$ 是根 * 每次 `find()` 出現一次，**複雜度為 $O(m)$**。 2. $u,w$ 屬於不同籃子（$w$ 不是根） * 因為往上找的過程節點大小一直變大，根據推論 2.，每次 `find()` 最多有 $\log^*n$ 個（籃子數）， **複雜度為 $O(m\lg^*n)$**。 3. $u,w$ 屬於同個籃子 $[2^B:2^{2^B})$（$w$ 不是根） * 固定某個節點 $u$，根據推論 1.，可以當 $w$ 的最多只有 $2^B$ 個，而因為 Collapsing Find，這樣的 $(u,w)$ 只會走一次；根據推論 3.，籃子最多有 $2n/2^B$ 個節點，所以一個籃子中，最多走 $(2n/2^B)\cdot2^B=2n$ 個這樣的 $(u,w)$；又根據推論 2.，最多有 $\lg^*n$ 個籃子，**複雜度為 $O(n\lg^*n)$**。總複雜度為 $O(n+m\lg^*n+n\lg^*n)=O(m\lg^*n)$，每次操作的**均攤複雜度為 $O(\lg^*n)$**。 > `unionn()` 除了 `find()` 以外是常數時間，所以複雜度相同。 > $m$ 如果小於 $n$，$O(n\lg^*n)$ 也跟初始化的 $O(n)$ 差不多了。 > $\lg^*(2^{65536})=5$ ::: --- # 有向無環圖（DAG, Directed Acyclic Graph） ```graphviz digraph { A -> B; A -> C; A -> D; A -> F; B -> C; C -> E; D -> C; E -> F; } ``` ## 拓樸排序（Topological Sort）拓樸排序是一個排序，使得**對於任意有向邊 $(u,v)$，$u$ 都在 $v$ 之前；一個圖有拓樸排序若且唯若它是一個 DAG**。 > ABDCEF 或 ADBCDF 都是一個拓樸排序。要找到一個拓樸排序也很簡單，只要持續把入度為 $0$ 的點加入排序中，接著從圖中移除並更新其他點的入度，直到處理完所有點，或是找不到拓樸排序為止。 ```cpp! optional<vector<int>> topSort(vector<vector<int>>& adj) { vector<int> res{}; int n{adj.size()}; vector<int> cnt(n, 0); // 入度 for (int u{0}; u < n; ++u) for (auto& v : adj[u]) ++cnt[v]; queue<int> qu{}; for (int u{0}; u < n; ++u) if (!cnt[u]) qu.push(u); // 一開始入度為 0 的點 while (!qu.empty()) { auto u{qu.front()}; qu.pop(); res.push_back(u); for (auto& v : adj[u]) if (!--cnt[v]) qu.push(v); // 入度變為 0 即可放入排序中 } if (res.size() != adj.size()) return nullopt; // 不存在拓樸排序 return res; } ``` --- # References * Ellis Horowitz, Sartaj Sahni, Susan Anderson-Freed (2007). “Fundamentals of Data Structures in C, 2/e”. * 張鎮華、蔡牧村（2020）。演算法觀點的圖論（修訂版）。 * Wikipedia - [Graph theory](https://en.wikipedia.org/wiki/Graph_theory) * Wikipedia - [Tree (graph theory)](https://en.wikipedia.org/wiki/Tree_(graph_theory)) * Wikipedia - [Disjoint-set data structure](https://en.wikipedia.org/wiki/Disjoint-set_data_structure) * CP-Algorithms - [Disjoint Set Union](https://cp-algorithms.com/data_structures/disjoint_set_union.html) * Wikipedia - [Graph traversal](https://en.wikipedia.org/wiki/Graph_traversal) [^dg]: 稠密圖（dense graph），意指邊數接近上限的圖；與之相對的是稀疏圖（sparse graph）。 [^DSU]: 內容參考自 Wikipedia - [Disjoint-set data structure, Time complexity](https://en.wikipedia.org/wiki/Disjoint-set_data_structure#Proof_of_O(log*(n))_time_complexity_of_Union-Find) [^itlog]: Wikipedia - [Iterated logarithm](https://en.wikipedia.org/wiki/Iterated_logarithm) --- # 練習題 | Problem | Tags | |:-:|:- | | [新手訓練系列 ~ 圖論](https://zerojudge.tw/ShowProblem?problemid=a290) | `BFS`, `DFS` | | [迷宮問題#1](https://zerojudge.tw//ShowProblem?problemid=a982) | `BFS` | | [00532 - Dungeon Master](https://zerojudge.tw/ShowProblem?problemid=c124) | `BFS` | | [肥水之戰](https://tioj.ck.tp.edu.tw/problems/1602) | `BFS` | | [00572 - Oil Deposits](https://zerojudge.tw/ShowProblem?problemid=c129) | `BFS` / `DFS` | | [圖論之二分圖測試](https://tioj.ck.tp.edu.tw/problems/1209) | `BFS` / `DFS` | | [老闆阿我要退貨](https://zerojudge.tw/ShowProblem?problemid=b298) | `DFS` | | [一筆畫問題](https://tioj.ck.tp.edu.tw/problems/1084) | `DFS` | | [第 4 題血緣關係](https://zerojudge.tw/ShowProblem?problemid=b967) | `tree diameter` | | [黑暗部落](https://zerojudge.tw/ShowProblem?problemid=d808) | `DSU` | | [APCS 2017-0304-2小群體](https://zerojudge.tw/ShowProblem?problemid=c291) | `DSU` | | [TOI2010 第二題：專案時程](https://zerojudge.tw/ShowProblem?problemid=a454) | `Topological sort` | | [Fox And Names](https://codeforces.com/contest/510/problem/C) | `Topological sort` | | **Challenging** | | [UVa 1267 - Network](https://onlinejudge.org/index.php?option=onlinejudge&page=show_problem&problem=3708) | | <style> hr.thin { height: 0.5px; } hr.dotted { border-top: dotted; height: .0em; color: #777777; background-color: #ffffff; } hr.dashed { border-top: dashed; height: .0em; color: #777777; background-color: #ffffff; } /* Gradient transparent - color - transparent */ hr.gradient { border: 0; height: 1px; background-image: linear-gradient(to right, rgba(0, 0, 0, 0), rgba(0, 0, 0, 0.75), rgba(0, 0, 0, 0)); } /* Flaired edges, by Tomas Theunissen */ hr.flaired { overflow: visible; height: 30px; border-style: solid; border-color: black; border-width: 1px 0 0 0; border-radius: 20px; background-color: #ffffff; } hr.flaired:before { /* Not really supposed to work, but does */ display: block; content: ""; height: 30px; margin-top: -31px; border-style: solid; border-color: black; border-width: 0 0 1px 0; border-radius: 20px; } </style>