Data structure

# Data structure ###### tags: `DataStructure` ## Tree http://alrightchiu.github.io/SecondRound/binary-tree-introjian-jie.html - Tree definition * Full Binary Tree : A Binary Tree is if every node has 0 or 2 children. The following are the examples of a full binary tree. We can also say a full binary tree is a binary tree in which all nodes except leaf nodes have two children. * Complete Binary Tree: A Binary Tree is if all the levels are completely filled except possibly the last level and the last level has all keys as left as possible * Perfect Binary Tree :A Binary tree is all the internal nodes have two children and all leaf nodes are at the same level. ![](https://i.imgur.com/zPm1Uku.png) * Binary Search Tree: a. 若任意節點的左子樹不空，則左子樹上所有節點的值均小於它的根節點的值； b. 若任意節點的右子樹不空，則右子樹上所有節點的值均大於它的根節點的值； c. 任意節點的左、右子樹也分別為二元搜尋樹； - Tree traversal (post, pre, in-order) https://shubo.io/iterative-binary-tree-traversal/ ### Pre-order ```cpp= /** * Definition for a binary tree node. * struct TreeNode { * int val; * TreeNode *left; * TreeNode *right; * TreeNode() : val(0), left(nullptr), right(nullptr) {} * TreeNode(int x) : val(x), left(nullptr), right(nullptr) {} * TreeNode(int x, TreeNode *left, TreeNode *right) : val(x), left(left), right(right) {} * }; */ class Solution { public: void traverse(TreeNode* cur, vector<int>& res){ if(cur == NULL){ return; } res.push_back(cur->val); traverse(cur->left, res); traverse(cur->right, res); } vector<int> preorderTraversal(TreeNode* root) { vector<int> res; traverse(root, res); return res; } }; ``` ### In-order ```cpp= /** * Definition for a binary tree node. * struct TreeNode { * int val; * TreeNode *left; * TreeNode *right; * TreeNode() : val(0), left(nullptr), right(nullptr) {} * TreeNode(int x) : val(x), left(nullptr), right(nullptr) {} * TreeNode(int x, TreeNode *left, TreeNode *right) : val(x), left(left), right(right) {} * }; */ class Solution { public: vector<int>res; vector<int> inorderTraversal(TreeNode* root) { if(root!=NULL) { inorderTraversal(root->left); res.push_back(root->val); inorderTraversal(root->right); } return res; } }; ``` ### Post-order ```cpp= /** * Definition for a binary tree node. * struct TreeNode { * int val; * TreeNode *left; * TreeNode *right; * TreeNode() : val(0), left(nullptr), right(nullptr) {} * TreeNode(int x) : val(x), left(nullptr), right(nullptr) {} * TreeNode(int x, TreeNode *left, TreeNode *right) : val(x), left(left), right(right) {} * }; */ class Solution { private : void dfs(TreeNode * node, vector <int> &ans) { if(node == NULL) return; dfs(node->left, ans); dfs(node->right, ans); ans.push_back(node->val); } public: vector<int> postorderTraversal(TreeNode* root) { vector <int> ans; dfs(root, ans); return ans; } }; ``` ## Hashtable 1. 主要優點是時間複雜度為常數的O(1)完成**查詢、新增和刪除資料** 2. 可解決用Array組成的Direct Access Table的兩個重大缺陷: * a.Key一定要是「非負整數(non-negative integer)」，才能作為Array的index * b.若Key的「範圍」非常大，可是Key的「數量」相對很少，那麼會非常浪費記憶體空間。 3. 可能發生Collision * a.Chaining:使用Linked list把「Hashing到同一個slot」的資料串起來。對hash table搜尋目標，並用linked list traversal作修改或刪除 * b.Open Addressing：使用Probing Method來尋找Table中「空的slot」存放資料。(Linear Probing, Quadratic Probing, Double Hashing) 4. Hash function: a. Division Method：把大範圍的key對應到較小範圍的table size m(m有限制，但是比較快。) ex. modulus b. Multiplication Method：無法預先得知「Key的範圍」以及「在該範圍內Key的分佈情形」(m沒有限制，但是比較慢。) Note: Probing的Hash Function與Chaining的Hash Function不同，在於一個是資料的Key和Probing的「次數」。 * Reference: 1. Introduction http://alrightchiu.github.io/SecondRound/hash-tableintrojian-jie.html 2. C++ Set STL https://shengyu7697.github.io/std-set/ 3. C++ unoredr_map STL https://shengyu7697.github.io/std-unordered_map/ ## Stack and Queue * Stack: 1. **Last-In-First-Out**, 2. 最主要的功能是「記得先前的資訊」，所以可以用來處理需要「回溯到先前的狀態」的問題，也稱為**Back-Tracking**問題，所以常用在**DFS**或任何**Recursion**形式的演算法 3. Implement with linked list or array * Queue: 1. **First-In-First-Out**,可以快速pop掉最先存進來的資料(第一筆) 2. 在Queue中新增資料又稱為enqueue、刪除資料又稱為dequeue 3. 常用在**BFS** 4. Implement with linked list or array 5. Implement with linked list is common 6. Implement with array have two types: * Sequencial queue: A) front與back之初始值為−1 B) Pop並沒有真正把Array的記憶體位置釋放，只是調整front，使得Queue「看起來」有刪除資料，故會浪費記憶體空間 * Circular queue: A) front與back之初始值為0，因此會犧牲Array中的其中一個記憶體位置 B) 被Pop後的記憶體空間將會重複使用，將會用上餘數(mod)來計算index * Reference: 1. introduction https://alrightchiu.github.io/SecondRound/queue-introjian-jie-bing-yi-linked-listshi-zuo.html 2. Queue in c++ std https://shengyu7697.github.io/std-queue/ 3. Stack in c++ std https://www.csie.ntu.edu.tw/~b01902011/material.php?type=cpp&&id=1 4. What is Call stack? https://www.youtube.com/watch?v=uG_JOJgwbco&ab_channel=RichardKlein ## Linked list ### Array 跟 Linked list比較 * Array: A) **Random access**：只要利用index即可在O(1)時間對Array的資料做存取 B) **較Linked list為節省記憶體空間**：因為Linked list需要多一個pointer來記錄下一個node的記憶體位置 C) **新增/刪除資料很麻煩**：若要在第一個位置新增資料，就需要O(N)時間把矩陣中所有元素往後移動。同理，若要刪除第一個位置的資料，也需要O(N)時間把矩陣中剩餘的元素往前移動。 D) **若資料數量時常在改變，要時常調整矩陣的大小**，會花費O(N)的時間在搬動資料(把資料從舊的矩陣移動到新的矩陣)。 >結論: >1. 希望能夠快速存取資料。 >2. 已知欲處理的資料數量，便能確認矩陣的大小。 >3. 要求記憶體空間的使用越少越好。 * Linked list: A)新增/刪除資料較Array簡單，只要對O(1)個node(所有與欲新增/刪除的node有pointer相連的node)調整pointer即可，不需要如同Array般搬動其餘元素。 * 若是在Linked list的「開頭」新增node，只要O(1)。 * 若要刪除特定node，或者在特定位置新增node，有可能需要先執行O(N)的「搜尋」 B) Linked list的資料數量可以是動態的，不像Array會有resize的問題。 C) 因為Linked list沒有index，若要找到特定node，需要從頭(ListNode *first)開始找起，搜尋的時間複雜度為O(N)。 D) 需要額外的記憶體空間來儲存pointer E) 能夠不使用連續的記憶體空間的情況下，能夠保有並使用一份連續的資料；相對來看，陣列則需要使用連續的記憶體空間，實作出其他的資料結構，例如堆疊(Stack)和佇列(Queue)等。 >結論: >1. 無法預期資料數量時，使用Linked list就沒有resize的問題。 >2. 需要頻繁地新增/刪除資料時。 >3. 不需要快速查詢資料。 > * Reference: 1. http://alrightchiu.github.io/SecondRound/linked-list-xin-zeng-zi-liao-shan-chu-zi-liao-fan-zhuan.html 2. http://alrightchiu.github.io/SecondRound/linked-list-introjian-jie.html 3. https://pisces1026.wordpress.com/2017/09/21/cc-linked-list/ 4. C++ list STL https://shengyu7697.github.io/std-list/ - Linked list reverse 由於linked list是一種動態資料結構，可有效的使用記憶體資源，但在建立節點、增加刪除節點和反轉節點時，較為複雜，以下為四個節點例子，如何用三個指標變數(previous, current , preceding)進行反轉。 ![](https://i.imgur.com/H8MAMh0.png) ## Graph * Introduction: * Reference: http://alrightchiu.github.io/SecondRound/graph-introjian-jie.html https://web.ntnu.edu.tw/~algo/Graph.html ## AVL tree ## Heap(Priority Queue) - Min Heap - Max Heap [Example](https://leetcode.com/problems/kth-largest-element-in-an-array/solutions/3670158/c-easy-explained-3-solutions-using-sort-min-heap-max-heap/) [C++ STL Intro.](https://yuihuang.com/cpp-stl-priority-queue/)