【C++】競程筆記（資料結構：Binary Tree）

【C++】競程筆記（資料結構：Binary Tree） === [TOC] 程式碼範例參考：NTUCPC Guide，此筆記僅為個人學習用途。簡介 --- 樹（Tree）在電腦科學領域是一個由「節點」（Node）和「邊」所組成的圖形，如圖： ![image](https://hackmd.io/_uploads/Hk2LbR1Ngg.png) Image Source：https://www.geeksforgeeks.org/dsa/binary-tree-data-structure/ ### 基本術語 - 節點（Node）： - 如 1、2、3、4 等這些數字被圓圈包起來的就是一個節點。 - 根節點（Root）： - 指的是最上層的節點，沒有父節點，如 1 就是根節點 Root。 - 子節點（Child）： - 位於某節點之下的直接後繼節點，如 1 底下的 2 跟 3 都是 1 的子節點；2 底下的 4 跟 5 都是 2 的子節點。 - 父節點（Parent）： - 反之，父節點為子節點的上層節點。1 是 2 跟 3 的父節點，2 是 4 跟 5 的父節點。 - 葉節點（Leaf）： - 沒有任何子節點的節點。如 8, 5, 9, 10 都沒有子節點，所以稱為葉節點。 - 深度（Depth）： - 根節點的深度定義為 0，子節點的深度等於其父節點深度＋1。如 2 的深度是 1，4 的深度是 2。 - 高度（Height）： - 從某節點到其最遠葉節點之最大邊數或節點數。整棵樹的高度即根節點的高度。如 1 的高度為 4（依節點看）或 3（依邊數看）。 - 子樹（Subtree）： - 以某節點為根所構成的整個樹，如 6、9、10 成為一棵子樹。 ### 二元樹類型 - 滿二元樹（Full Binary Tree）： - 每個非葉節點恰好有兩個子節點。 - ![image](https://hackmd.io/_uploads/rybf20J4ge.png) - Image Source：https://www.geeksforgeeks.org/dsa/types-of-binary-tree/ - 完全二元樹（Complete Binary Tree）： - 除了最底層外，每一層的節點都達到最大數；最底層的節點集中在左側。 - ![image](https://hackmd.io/_uploads/rJJD2RyVgl.png) - Image Source：https://www.geeksforgeeks.org/dsa/types-of-binary-tree/ - 完美二元樹（Perfect Binary Tree）： - 是一個滿二元樹，且所有葉節點深度相同；節點數 $= 2^{(h+1)} – 1$ ， $h$ 為高度。 - ![image](https://hackmd.io/_uploads/SkAs3R1Nxe.png) - Image Source：https://www.geeksforgeeks.org/dsa/types-of-binary-tree/ - 平衡二元樹（Balanced Binary Tree）： - 任一節點的兩個子樹高度差不超過 1。如：AVL 樹（Adelson‑Velsky and Landis Tree）、紅黑樹（Red‑Black Tree）。 ### n 元樹 > n 元樹的意思就是每個節點至多有 n 個子節點。根據定義，一元樹也是一種樹，但是他其實跟陣列沒兩樣。通常這種樹會被稱作一個鏈（Chain）。 > From NTUCPC Guide Binary Search Tree（BST） --- 二元搜尋樹中的每個節點最多有兩個子節點，左子節點會比右子節點的值還小，當然反過來說，右子節點的值會比左子節點還大，如圖： ![image](https://hackmd.io/_uploads/H1vyX1xNle.png) Image Source：https://www.geeksforgeeks.org/binary-search-tree-data-structure/ 在 C++ 中存一棵 BST： ```cpp= // from NTUCPC Guide struct node { node *parent, *lson, *rson; int val; node(const int &data : parent(NULL), lson(NULL), rson(NULL), val(data)) {} } ``` node：節點。 parent：父節點。 lson、rson：左子節點、右子節點。 val：節點值。這裡的第五行冒號（`:`）為建構子初始化列表（Constructor Initialization List）的行為。語法為： ``` 類別名稱(參數) : 成員1(值1), 成員2(值2), ... { // 建構子主體 } ``` 這種寫法等價於： ```cpp= // ... node(const int &data){ parent = NULL; lson = NULL; rson = NULL; val = data; } // ... ``` 但用 `:` 去寫會更有效率、簡潔一點。 > 一個指標通常需要占 8 個 bytes，所以用指標實作的東西都要花滿多的記憶體，在時間上常數也會比較大。 > From NTUCPC Guide ### 插入一個元素若要在 BST 中插入一個數字，需遵循以下規則： 1. 如果插入值比目前節點小，則往左子樹找插入位置。 2. 如果插入值比目前節點大，則往右子樹找插入位置。 3. 若目前節點為 NULL，表示找到插入點，則創建新節點。 ```cpp= // From NTUCPC Guide // Comment by LukeTseng void Insert(node *&x, int val) { // 空節點處理 if (!x) { // 若 x 是空指標，表示可放新節點 x = new node(val); // 建立新節點，並使 x 指向它 return x->parent = NULL, void(); // 將父節點設定為 NULL } // 插入左子樹 if (x->val > val) { // 若插入值小於目前節點，往左子節點遞迴 if (x->lson) Insert(x->lson, val); // 如果左子節點已存在，直接遞迴 // 否則新增新節點作為左子節點，並設定其 parent 為目前節點 else x->lson = new node(val), x->lson->parent = x; } // 插入右子樹 if (x->val < val) { // 反之，邏輯基本上與插入左子樹操作相同 if (x->rson) Insert(x->rson, val); else x->rson = new node(val), x->rson->parent = x; } } ``` 在此程式中沒有處理到相等的情況，代表會不允許重複的元素插入。 ### 建立二元搜尋樹若有 $n$ 個數字 $[a_1, a_2, \cdots , a_n]$ ，設定 $a_1$ 為 Root，再對其他數字依序插入。 ### Inorder Traversal 二元搜尋樹的特性：左子樹比右子樹小。中序遍歷（Inorder Traversal）先走左子樹，再走右子樹，即可輸出一個由小到大的序列。 :::info 中序遍歷走訪順序：左子樹 → 根節點 → 右子樹 ::: ```cpp= // From NTUCPC Guide // Comment by LukeTseng void travel(node *x) { if (!x) return; // 檢查目前節點是否為空 travel(x->lson); // 走訪左子樹 printf("%d ",x->val); // 印出目前節點值 travel(x->rson); // 走訪右子樹 } ``` ### 查詢節點假設要搜尋一值 $x$，目前所在節點值為 $y$。若 $x < y$ ，往左走；反之， $x > y$ 就往右走。 ```cpp= // NTUCPC Guide // Comment by LukeTseng node *Find(node *x, int val) { if (!x) return NULL; // 若目前節點是空指標（如走到葉節點的子節點），代表樹不存在目標值 if (x->val == val) return x; // 找到了 if (x->val > val) return Find(x->lson, val); // 目標值比目前節點值小，往左走 return Find(x->rson, val); // 表示前面三個條件不成立，那就是要往右了 } ``` ### 刪除節點要刪除節點時有三種情形： 1. 葉節點直接刪。 2. 其中一個子節點為空，則將其拉上來。如下圖，4 連到 7。 - ![image](https://hackmd.io/_uploads/Hy4iZ-eEgl.png) - From NTUCPC Guide 3. 找中序後繼（inorder successor）或前驅來取代被刪除節點。（用另一個「替代節點」的值來取代 x 的值，再刪除那個替代節點） :::info 中序前驅（in-order predecessor）： - `x->lson` 中的最大節點（也就是 `x->lson` 子樹最右邊的節點） - 該值 `< x->val`，是小於 `x` 的最大值 ::: :::info 中序後繼（in-order successor）： * `x->rson` 中的最小節點（也就是 `x->rson` 子樹最左邊的節點） * 該值 `> x->val`，是大於 `x` 的最小值 ::: ```cpp= // From NTUCPC Guide // Comment by LukeTseng bool Delete(node *&root, node *x) { if (!x) return false; // 葉節點 if (!x->lson && !x->rson) { if (x->parent) { // 若有父節點，從其父節點的左或右子樹中移除 x。 if (x->parent->val > x->val) x->parent->lson = NULL; else x->parent->rson = NULL; } delete x; } else if (!x->lson) { // 僅有右子樹 if (x->parent) { // 這部分邏輯同屬刪除葉節點的方式，但不同的點是會直接連到 x-> rson if (x->parent->val > x->val) x->parent->lson = x->rson; else x->parent->rson = x->rson; } else root = x->rson; x->rson->parent = x->parent; delete x; } else if (!x->rson) { // 僅有左子樹 if (x->parent) { // 邏輯同僅有右子樹的情形 if (x->parent->val > x->val) x->parent->lson = x->lson; else x->parent->rson = x->lson; } else root = x->lson; x->lson->parent = x->parent; delete x; } else { // 有左右兩棵子樹 node *exchange = x->lson; while (exchange->rson) exchange = exchange->rson; x->val = exchange->val; // copy the data Delete(root, exchange); } return true; } ``` ### 刪除一個值 ```cpp= // From NTUCPC Guide bool Delete_Val(node *&root, int val) { return Delete(root, Find(root, val)); } ``` ### 複雜度分析設 BST 含有 $n$ 個節點， $h$ 為樹的高度。則： | 操作 | 最佳情況 | 平均情況 | 最差情況 | | -------------- | -------------- | -------------- | ---------- | | 搜尋 Search | 平衡樹 | 大致平衡 | 退化為鏈結串列 | | 插入 Insert | 平衡樹 | 大致平衡 | 退化為鏈結串列 | | 刪除 Delete | 平衡樹 | 大致平衡 | 退化為鏈結串列 | | 最小/最大查詢 | $O(log n)$ | $O(log n)$ | $O(n)$ | | 中序走訪 Traversal | $O(n)$ | $O(n)$ | $O(n)$ | 當插入的資料為已排序的序列時，BST 不再分叉，整棵樹會變成單一路徑，因此樹會不平衡，退化成一個鏈結串列，高度會變成 $h = n$ ，效率變差，直接變成 $O(n)$ 。然後二元搜尋樹是要去平衡的，有些測資如習題練習第一題，還蠻大的，沒去好好平衡的話，時間複雜度甚至會到 $O(n^2)$ 。習題練習 --- ### 二元搜尋樹 (TRVBST) Source：https://tioj.ck.tp.edu.tw/problems/1609 問中序遍歷結果。 TLE 解（手刻實作二元搜尋樹，且未平衡）： ```cpp= #include <bits/stdc++.h> using namespace std; struct Node { int val; Node *left, *right; Node(int v) : val(v), left(nullptr), right(nullptr) {} }; void insert(Node*& root, int val) { if (root == nullptr) { root = new Node(val); return; } if (val < root->val) insert(root->left, val); else insert(root->right, val); } void inorder(Node* root, vector<int>& result) { if (!root) return; inorder(root->left, result); result.push_back(root->val); inorder(root->right, result); } int main() { ios::sync_with_stdio(false); cin.tie(nullptr); int n; cin >> n; Node* root = nullptr; for (int i = 0; i < n; ++i) { int val; cin >> val; insert(root, val); } vector<int> output; output.reserve(n); inorder(root, output); for (int i = 0; i < n; ++i) { if (i > 0) cout << ' '; cout << output[i]; } return 0; } ``` ~~AC 解（內建函式 sort() 大法）~~： ```cpp= #include <bits/stdc++.h> using namespace std; int main() { ios::sync_with_stdio(false); cin.tie(nullptr); int n; cin >> n; vector<int> values(n); for (int i = 0; i < n; ++i) { cin >> values[i]; } sort(values.begin(), values.end()); for (int i = 0; i < n; ++i) { if (i > 0) cout << ' '; cout << values[i]; } return 0; } ``` ### 1d-kd-tree Source：https://tioj.sprout.tw/problems/46 查詢操作是比較難的（查詢一個點，問距離哪一個點的距離最短），但可以用二元搜尋樹找 predecessor（不大於 x 的最大點）跟 successor（大於 x 的最小點），並比較這些點與 x 的距離。 ```cpp= #include <bits/stdc++.h> using namespace std; int main() { ios::sync_with_stdio(false); cin.tie(nullptr); int N; cin >> N; set<long long> points; while (N--) { string op; long long x; cin >> op >> x; if (op == "insert") { points.insert(x); } else if (op == "delete") { points.erase(x); } else if (op == "query") { auto it = points.upper_bound(x); long long dist1 = LLONG_MAX, dist2 = LLONG_MAX; long long ans1 = LLONG_MAX, ans2 = LLONG_MAX; if (it != points.end()) { // > x 的最小值 dist2 = *it - x; // 右側距離 ans2 = *it; // 候選數值 } if (it != points.begin()) { // <= x 的最大值 --it; dist1 = x - *it; ans1 = *it; } if (dist1 < dist2) { // 表示左側近，輸出左側元素 cout << ans1 << '\n'; } else if (dist2 < dist1) { // 表示右側近，輸出右側元素 cout << ans2 << '\n'; } else if (dist1 == dist2 && dist1 != LLONG_MAX) { cout << min(ans1, ans2) << ' ' << max(ans1, ans2) << '\n'; } } } return 0; } ```