## 【Data Structures & Algorithms 資料結構和演算法】 :::info - 什麼是Algorithms演算法? - 如何評估一個算法? - 資料結構和演算法,兩者關係為? - Trees Travesal - Depth First Search, DFS 深度優先搜尋 (stack 先進後出,後進先出) - Breadth First Search, BFS 廣度優先搜尋 (queue 先進先出) - Graph Travesal - Bubble Sort 氣泡排序 - Selection Sort 選擇排序 - Insertion Sort 插入排序 - Merge Sort 歸併排序 - QuickSort 快速排序 [【Data Structures 資料結構】](https://hackmd.io/CUexQ6_XTR-7ZGtuBLCN0A) ::: :::warning 參考書籍: ![](https://hackmd.io/_uploads/SkO8yn_y6.jpg) <br/> 參考網站: [NTNU](https://web.ntnu.edu.tw/~algo/) <br/> 參考網站: [VISUALGO](https://visualgo.net/en) <br/> 參考網站: [十大经典排序算法](https://www.runoob.com/w3cnote/ten-sorting-algorithm.html) <br/> 參考網站: [Leetcode 學習 探索知識地圖](https://leetcode.cn/leetbook/) ![螢幕擷取畫面 2023-12-22 160601](https://hackmd.io/_uploads/SytsQTMPp.png) ::: <br/> ## 什麼是Algorithms演算法? 演算法是一系列明確定義的指令,按照順序執行,得到某種特定的結果 - 如何評估一個算法? - 清晰明確性:每一步必須清晰明確,沒有解釋或混淆的餘地 - 有限性 : 要有所限制,意思是不能無限循環 - 有效性 : 可以跑得動的 - 一般性 : 廣泛通用,並非為了單一問題設計 - 明確性 : 每一步驟都要很明確,不存在模糊 - 輸入輸出 : 算法必須有出入和輸出。輸入,是初始數據;輸出,是完成算法任務後產生的結果或解決方案 - 考慮時間複雜度 (Time Complexity)和空間複雜度 (Space Complexity): ![螢幕擷取畫面 2023-12-21 163507](https://hackmd.io/_uploads/HkbROuZDT.png) ![螢幕擷取畫面 2023-12-21 163511](https://hackmd.io/_uploads/ryERuuZvp.png) Big-O : 是一種用於描述算法時間複雜度的表示法,用於表示一個算法的執行時間相對於輸入大小的增長率 主要關注算法執行時間的「上界」,意思是在最壞情況下,算法執行時間的增長不會超過某個常數倍數 ![](https://hackmd.io/_uploads/BkjEWDOJT.jpg) 【O(1) 常數複雜度 (Constant Complexity)】 效率最高 無論輸入大小如何增加,算法的執行時間都是固定的,使用index,算法具有恆定的執行時間,"不受輸入大小的影響" ```= # o(1) num = [1,2,3,4,5,6] num[2] ``` ![截圖 2024-09-01 下午3.59.42](https://hackmd.io/_uploads/BJ-W1sb2R.png) 【O(logn) 對數複雜度 (Logarithmic Complexity)】 效率僅次於O(1) 當輸入大小增加時,執行時間以"對數"的方式增長,表示算法的執行時間增長率比線性更慢。例如二分搜尋算法,經過排序後,從中間開始找(1og 16 2 = 4,只要找四次,不需要歷遍整個list),對於大規模問題效率更高 ```= # 二分搜索 (Binary Search) def binary_search(arr, target): left, right = 0, len(arr) - 1 while left <= right: mid = (left + right) // 2 if arr[mid] == target: return mid # 返回目標元素的索引 elif arr[mid] < target: left = mid + 1 else: right = mid - 1 return -1 # 如果目標元素不在數組中,返回 -1 # 範例使用 arr = [1, 3, 5, 7, 9, 11, 13] target = 11 result = binary_search(arr, target) if result != -1: print(f"Element found at index: {result}") else: print("Element not found in the array.") ``` ![截圖 2024-09-01 下午4.34.50](https://hackmd.io/_uploads/BJ1HPsb2A.png) 【O(n) 線性複雜度 (Linear Complexity)】 當輸入大小增加時,執行時間呈現"線性"增長,遍歷數組中的所有元素,表示算法的執行時間和輸入大小成正比 ```= # o(n) def bigo(n): for i in range(n): print(i) print('-') bigo(3) ``` ![截圖 2024-09-01 下午4.17.06](https://hackmd.io/_uploads/BkuzQi-h0.png) ```= # o(2n) def bigo(n): for i in range(n): print(i) for j in range(n): print(j) print('-') bigo(3) ``` ![截圖 2024-09-01 下午4.20.13](https://hackmd.io/_uploads/SyKAmsZn0.png) 【O(nlogn) 線性對數複雜度 (Loglinear Complexity)】 當輸入大小增加時,執行時間以"線性對數"的方式增長,是介於線性和對數之間的增長率,通常比純線性複雜度更好 例子: 合併排序、快速排序等分治算法 ```= ``` 【O(n^x) 多項式複雜度 (Polynomial Complexity)】 當輸入大小增加時,執行時間呈現多項式級別的增長,執行時間與輸入大小的多項式次數成正比 ```= # o(n**2) def bigo(n): for i in range(n): print(i) for j in range(n): print(j) print('-') bigo(3) ``` ![截圖 2024-09-01 下午4.32.08](https://hackmd.io/_uploads/rydiUiWhR.png) ```= # o(n**3) def bigo(n): for i in range(n): print(i) for j in range(n): print(j) for k in range(n): print(k) print('-') bigo(2) ``` ![截圖 2024-09-01 下午4.32.16](https://hackmd.io/_uploads/HkMjLjZn0.png) 【O(X^n) 指數時間 (Exponential Time)】 當輸入大小增加時,執行時間呈"指數"級別的增長,執行時間增長非常快,通常是效能最差的情況之一,例如求解TSP旅行推銷員問題 ```= ``` 【O(n!): 階乘複雜度 (Factorial Complexity)】 當輸入大小增加時,執行時間呈"階乘"級別的增長,執行時間增長極其快速,通常是效能最差的情況之一,僅適用於極小的輸入,例如求解某些排列組合問題 ```= ``` <br/> [BigO時間性比較](https://docs.google.com/spreadsheets/d/1TIK5thry_L9-zDOUIv67suzZr2FD3aQv/edit#gid=1275087821) ![螢幕擷取畫面 2023-12-25 172746](https://hackmd.io/_uploads/HJ3Soa8va.png) ![螢幕擷取畫面 2023-12-25 172754](https://hackmd.io/_uploads/BknSo68wT.png) ![螢幕擷取畫面 2023-12-25 172801](https://hackmd.io/_uploads/r1hrs6UP6.png) ![螢幕擷取畫面 2023-12-25 172807](https://hackmd.io/_uploads/rJnSiaIP6.png) ![螢幕擷取畫面 2023-12-25 172816](https://hackmd.io/_uploads/H12rjpUw6.png) <br/> ## 資料結構和演算法,兩者關係為? 資料結構和演算法兩者相輔相成 資料結構提供了【存儲和組織數據】的方式,而演算法則【操作這些資料】以解決特定的問題 一個有效的演算法需要建立在合適的資料結構之上 一個合適的資料結構可以使演算法更加高效 <br/> ## Algorithms演算法 ### Trees Travesal - Depth First Search, DFS 深度優先搜尋 (stack 先進後出,後進先出) ```= A / \ B C / \ \ D E F / \ \ G H I ``` ```= DFS: Pre-order: A, B, D, E, G, H, C, F, I 根節點 -> 左子樹先序 -> 右子樹先序 In-order: D, B, G, E, H, A, C, F, I 左子樹中序 -> 根節點 -> 右子樹中序 (由小到大) Post-order: D, G, H, E, B, I, F, C, A 左子樹後序 -> 右子樹後序 -> 根節點 ``` Pre-order 一直把左子樹加進stack和results,直到沒有左子樹 pop最後的值出stack,當pop的點有右子樹時,將右子樹和右子樹的左子樹同時加入stack, result ```= # python stack =[A,B,D],results = [A,B,D] # stack =[A,B],results = [A,B,D] # stack =[A],results = [A,B,D] # stack =[A,E,G],results = [A,B,D,E,G] # stack =[A,E],results = [A,B,D,E,G] # stack =[A],results = [A,B,D,E,G] # stack =[A,H],results = [A,B,D,E,G,H] # stack =[A],results = [A,B,D,E,G,H] # stack =[A,C],results = [A,B,D,E,G,H,C] # stack =[A,C,F],results = [A,B,D,E,G,H,C,F] # stack =[A,C,F,I],results = [A,B,D,E,G,H,C,F,I] # stack =[A,C,F],results = [A,B,D,E,G,H,C,F,I] # stack =[A,C],results = [A,B,D,E,G,H,C,F,I] # stack =[A],results = [A,B,D,E,G,H,C,F,I] # stack =[],results = [A,B,D,E,G,H,C,F,I] ``` In-order 一直把左子樹加進stack,直到沒有左子樹 pop最後的值出stack,同時把pop的數字加到results 當pop的點有右子樹時,將右子樹和右子樹的左子樹同時加入stack ```= # python stack =[A,B,D],results = [] # stack =[A,B],results = [D] # stack =[A],results = [D,B] # stack =[A,E,G],results = [D,B] # stack =[A,E],results = [D,B,G] # stack =[A],results = [D,B,G,E] # stack =[A,H],results = [D,B,G,E] # stack =[A],results = [D,B,G,E,H] # stack =[],results = [D,B,G,E,H,A] # stack =[C,F,I],results = [D,B,G,E,H,A] # stack =[],results = [D,B,G,E,H,A,C,F,I] ``` Post-order 一直把左子樹加進stack,直到沒有左子樹 pop最後的值出stack 當pop的點沒有左子樹和右子樹時,才能加到results; 如果有左子樹或右子樹,就加進stack ```= # python stack =[A,B,D],results = [] # stack =[A,B],results = [D] # stack =[A,B,E,G],results = [D] # stack =[A,B,E,H],results = [D,G] # stack =[A,B,E],results = [D,G,H] # stack =[A,B],results = [D,G,H,E] # stack =[A,C,F,I],results = [D,G,H,E,B] # stack =[A,C,F],results = [D,G,H,E,B,I] # stack =[A,C],results = [D,G,H,E,B,I,F] # stack =[A],results = [D,G,H,E,B,I,F,C] # stack =[],results = [D,G,H,E,B,I,F,C,A] ``` <br/> ```= # DFS O(n) class Node: def __init__(self, value) -> None: self.value = value self.left = None self.right = None class BinarySearchTree: def __init__(self, value) -> None: self.root = Node(value) def insert(self, value): new_node = Node(value) if self.root is None: self.root = new_node return True tmp_root = self.root while True: if new_node.value == tmp_root.value: return False elif new_node.value < tmp_root.value: if tmp_root.left is None: tmp_root.left = new_node return True tmp_root = tmp_root.left else: if tmp_root.right is None: tmp_root.right = new_node return True tmp_root = tmp_root.right def dfs_pre_order(self): results = [] def traversal(current_node): results.append(current_node.value) if current_node.left is not None: traversal(current_node=current_node.left) if current_node.right is not None: traversal(current_node=current_node.right) traversal(self.root) return results def dfs_in_order(self): results = [] def traversal(current_node): if current_node.left is not None: traversal(current_node=current_node.left) results.append(current_node.value) if current_node.right is not None: traversal(current_node=current_node.right) traversal(self.root) return results def dfs_post_order(self): results = [] def traversal(current_node): if current_node.left is not None: traversal(current_node=current_node.left) if current_node.right is not None: traversal(current_node=current_node.right) results.append(current_node.value) traversal(self.root) return results if __name__ == "__main__": bst = BinarySearchTree(8) for i in [23,11,1,16,15,13]: bst.insert(i) print("pre_order:" , bst.dfs_pre_order()) print("in_order:" , bst.dfs_in_order()) print("post_order:" , bst.dfs_post_order()) ``` ![螢幕擷取畫面 2023-12-26 143355](https://hackmd.io/_uploads/SJfJNxdwp.png) ```= # 深度優先搜索 (Depth-First Search, DFS) def dfs(graph, start, visited=None): if visited is None: visited = set() visited.add(start)S print(start) # 處理當前節點 for neighbor in graph[start]: if neighbor not in visited: dfs(graph, neighbor, visited) # 範例使用 graph = { 'A': ['B', 'C'], 'B': ['D', 'E'], 'C': ['F'], 'D': [], 'E': ['F'], 'F': [] } dfs(graph, 'A') ``` ![截圖 2024-09-08 晚上11.04.47](https://hackmd.io/_uploads/rk77T4inC.png) <br/> - Breadth First Search, BFS 廣度優先搜尋 (queue 先進先出) ```= A / \ B C / \ \ D E F / \ \ G H I ``` ```= BFS : A, B, C, D, E, F, G, H, I ``` ```= # BFS O(n) # python # 邏輯是先設一個 queue[]、一個 results[] # 當queue有值時,值自動移到results # 原本queue的下一層,自動移到queue # queue = [A],results = [] # queue = [B,C], results = [A] # queue = [C,D,E], results = [A,B] # queue = [D,E,F], results = [A,B,C] # queue = [F,G,H], results = [A,B,C,D,E] # queue = [G,H,I], results = [A,B,C,D,E,F] # queue = [], results = [A,B,C,D,E,F,G,H,I] class Node: def __init__(self, value) -> None: self.value = value self.left = None self.right = None class BinarySearchTree: def __init__(self, value) -> None: self.root = Node(value) def insert(self, value): new_node = Node(value) if self.root is None: self.root = new_node return True tmp_root = self.root while True: if new_node.value == tmp_root.value: return False elif new_node.value < tmp_root.value: if tmp_root.left is None: tmp_root.left = new_node return True tmp_root = tmp_root.left else: if tmp_root.right is None: tmp_root.right = new_node return True tmp_root = tmp_root.right def bfs(self): node_queue = [] results = [] current_node = self.root node_queue.append(current_node) while len(node_queue) > 0: current_node = node_queue.pop(0) results.append(current_node.value) if current_node.left is not None: node_queue.append(current_node.left) if current_node.right is not None: node_queue.append(current_node.right) return(results) if __name__ == "__main__": bst = BinarySearchTree(10) bst.insert(5) bst.insert(15) bst.insert(1) bst.insert(8) print(bst.bfs()) print(bst.root.value) print(bst.root.left.value) print(bst.root.left.left.value) ``` ![螢幕擷取畫面 2023-12-26 123618](https://hackmd.io/_uploads/S1t8uRPwT.png) ```= # 廣度優先搜索 (Breadth-First Search, BFS) from collections import deque def bfs(graph, start): visited = set() queue = deque([start]) # deque(雙端佇列)實現 FIFO(先進先出) while queue: vertex = queue.popleft() if vertex not in visited: print(vertex) # 處理當前節點 visited.add(vertex) queue.extend(neighbor for neighbor in graph[vertex] if neighbor not in visited) # graph = { 'A': ['B', 'C'], 'B': ['D', 'E'], 'C': ['F'], 'D': [], 'E': ['F'], 'F': [] } bfs(graph, 'A') ``` ![截圖 2024-09-08 晚上10.54.03](https://hackmd.io/_uploads/HJ1T5Nsh0.png) <br/> ### Graph Travesal Graph 沒有root概念,需要找一個起始點,假設是A ```= E / B / \ A - C F \ / D A : [B, C, D] B : [A, E, F] C : [A] D : [A, F] E : [B] F : [B, D] ``` - Depth First Search, DFS 深度優先搜尋 (stack 先進後出,後進先出) 當stack有值時,先進後出,值移到results,它的下一層都加進stack ```= # GRAPH_DFS O(n) # stack = [A],results = [] # stack = [B,C,D],results = [A] # stack = [E,F,C,D],results = [A,B] # stack = [D,C,D],results = [A,B,E,F] # stack = [C,D],results = [A,B,E,F,D] # stack = [D],results = [A,B,E,F,D,C] # stack = [],results = [A,B,E,F,D,C] from collections import deque graph = { "A": ["B", "C", "D"], "B": ["A", "E", "F"], "C": ["A"], "D": ["A", "F"], "E": ["B"], "F": ["B", "D"], } results = [] stack = deque() def dfs(graph, node): stack.append(node) # 初始節點加入堆疊 while stack: current_node = stack.pop() # 從堆疊中彈出一個節點 if current_node not in results: results.append(current_node) # 訪問節點,加入結果列表 # 將鄰居節點逆序加入堆疊,確保 DFS 按照給定順序遍歷 for neighbor in reversed(graph[current_node]): if neighbor not in results: stack.append(neighbor) dfs(graph, "A") print(results) ``` ![截圖 2024-09-09 晚上11.54.36](https://hackmd.io/_uploads/SkkL9q33A.png) <br/> - Breadth First Search, BFS 廣度優先搜尋 (queue 先進先出) 當stack有值時,先進先出,值移到results,它的下一層都加進stack ```= # GRAPH_BFS O(n) # stack = [A],results = [] # stack = [B,C,D],results = [A] # stack = [C,D,E,F],results = [A,B] # stack = [D,E,F],results = [A,B,C] # stack = [E,F],results = [A,B,C,D] # stack = [],results = [A,B,C,D,E,F] graph = { "A": ["B", "C", "D"], "B": ["A", "E", "F"], "C": ["A"], "D": ["A", "F"], "E":["B"], "F" : ["B", "D"], } results = [] queue = [] def bfs(graph, node): results.append(node) queue.append(node) while queue: p = queue.pop(0) for neighbor in graph[p]: if neighbor not in results: results.append(neighbor) queue.append(neighbor) bfs(graph=graph, node='A') print(results) ``` ![截圖 2024-09-08 晚上11.01.37](https://hackmd.io/_uploads/S10Dh4ohC.png) <br/> ### Bubble Sort 氣泡排序 [參考圖](https://favtutor.com/blogs/bubble-sort-python) 氣泡排序一次比較兩個數字,如果順序錯誤就交換(更大交換,或更小交換),將較大的數字往右推,每一輪都從頭開始比,直到整個數列排序完成 假設array = [1,2,3,4,5,6],需要進行5 (6-1=5) 輪循環 ```= # Bubble Sort 氣泡排序 O(n**2) def bubble_sort(array): n = len(array) for i in range(n): for j in range(0, n-1): if array[j] > array[j+1]: array[j], array[j+1] = array[j+1], array[j] array = [5, 2, 4, 16, 10] bubble_sort(array=array) print(array) ``` ![螢幕擷取畫面 2023-12-26 173148](https://hackmd.io/_uploads/B1M9Tz_Pp.png) <br/> ### Selection Sort 選擇排序 [參考圖](https://studyalgorithms.com/array/selection-sort/) 第一輪,預設index0為最小值的index,內迴圈所有數字都和array[0]比較 第二輪,預設index1為最小值的index,內迴圈所有數字都和array[1]比較... 如果找到的值,比index最小值還要小,則進行交換 選擇排序的特點是"不穩定",如果有兩個相同元素 A -> B,第二次選擇排序,有可能變成 B -> A 不影響數據的正確性,但需要保持相同元素的相對順序的情況下,不適合使用 ```= # Selection Sort 選擇排序 O(n**2) def selection_sort(array): n = len(array) for i in range(n): min_index = i for j in range(i+1,n): if array[j] < array[min_index]: min_index = j if min_index != i: array[i], array[min_index] = array[min_index], array[i] array = [5, 2, 4, 16, 10] selection_sort(array=array) print(array)y[min_index] = array[min_index], array[i] ``` ![螢幕擷取畫面 2023-12-26 173148](https://hackmd.io/_uploads/B1M9Tz_Pp.png) <br/> ### Insertion Sort 插入排序 [參考圖](https://www.testingdocs.com/questions/what-is-insertion-sort-write-a-java-program-to-demonstrate-it/) 將一個數列分為已排序、未排序,從未排序取出一個數字,插入到已排序的最右邊,往左比較,找到適當位置,直到整個數列排序完成 插入排序的特點是"穩定",相同元素的相對位置在排序前後不會改變 ```= # Insertion Sort 插入排序 O(n**2) def insertion_sort(array): for i in range(1, len(array)): current_value = array[i] position = i while position > 0 and current_value < array[position - 1]: array[position] = array[position - 1] position -= 1 array[position] = current_value return array array = [5, 2, 4, 16, 10] insertion_sort(array=array) print(array) ``` ![螢幕擷取畫面 2023-12-26 173148](https://hackmd.io/_uploads/B1M9Tz_Pp.png) <br/> ### Merge Sort 歸併排序 [參考圖](https://www.kirupa.com/sorts/mergesort.htm) 將待排序的數列分成兩部分,分別對兩部分進行排序,然後將排序好的兩部分合併成一個有序的序列 歸併排序的過程包括兩個主要步驟:分割和合併,通常使用遞迴實現,適用大型數列的排序 ```= def merge(l1, l2): result = [] i = j = 0 while i < len(l1) and j < len(l2): if l1[i] > l2[j]: result.append(l2[j]) j += 1 elif l1[i] <= l2[j]: result.append(l1[i]) i += 1 while i < len(l1): result.append(l1[i]) i += 1 while j < len(l2): result.append(l2[j]) j += 1 return result result2 = merge([1,3,5], [2,4,7]) print(result2) ``` ![截圖 2024-09-12 晚上11.58.23](https://hackmd.io/_uploads/Bkf0ycepC.png) ```= # 遞迴 def merge_sort(my_list): if len(my_list) == 1: return my_list mid_index = int(len(my_list) / 2) left = merge_sort(my_list[:mid_index]) right = merge_sort(my_list[mid_index:]) return merge(left, right) result = merge_sort([6, 6, 1, 8, 5]) print(result) ``` ![截圖 2024-09-13 凌晨12.01.47](https://hackmd.io/_uploads/ryyKgqxpC.png) <br/> ### Quick Sort 快速排序 [參考圖](https://fahsa.cn/post/structure-algorithms/mergesort-quicksort/) 通過一趟排序將數列分為兩部分,自己放在中間,左半部分的元素都 < 基準元素,右半部分的元素都 > 基準元素,然後分別對左右兩部分進行遞迴 ```= # Quick Sort 快速排序 # pivot_index = 0, end_index = 4 # swap_index = 0 交換指摽 # swap=false # i 1 # my_list[1] < my_list[0], swap_index=1, i=2, swap=false # my_list[2] > my_list[0], i=3, swap=true # my_list[3] > my_list[0], i=4, swap=true # my_list[4] < my_list[0], i=4, swap_index=2, swap=false # my_list[0], my_list[2] def pivot(my_list, start_index, end_index): pivot_index = start_index swap_index = pivot_index swap = False for i in range(pivot_index + 1, end_index + 1): if my_list[i] < my_list[pivot_index]: if swap: swap_index += 1 my_list[swap_index], my_list[i] = my_list[i], my_list[swap_index] else: swap_index += 1 i += 1 elif my_list[i] > my_list[pivot_index]: i += 1 swap = True my_list[pivot_index], my_list[swap_index] = ( my_list[swap_index], my_list[pivot_index], ) return swap_index my_list = [6, 5, 8, 7, 1] pivot(my_list, 0, len(my_list) - 1) print(my_list) ``` ![截圖 2024-09-13 凌晨12.50.05](https://hackmd.io/_uploads/BJe0iqepA.png) ```= # 遞迴 def quick_sort(my_list, start_index, end_index): if start_index < end_index: pivot_index = pivot(my_list, start_index, end_index) quick_sort(my_list, start_index, pivot_index) quick_sort(my_list, pivot_index + 1, end_index) my_list = [6, 5, 8, 7, 1] quick_sort(my_list, 0, len(my_list) - 1) print(my_list) ``` ![截圖 2024-09-13 凌晨12.50.05](https://hackmd.io/_uploads/BJe0iqepA.png)