【Data Structures & Algorithms 資料結構和演算法】(procressing)

## 【Data Structures & Algorithms 資料結構和演算法】 :::info - 什麼是Algorithms演算法? - 如何評估一個算法? - 資料結構和演算法，兩者關係為? - Trees Travesal - Depth First Search, DFS 深度優先搜尋 (stack 先進後出，後進先出) - Breadth First Search, BFS 廣度優先搜尋 (queue 先進先出) - Graph Travesal - Bubble Sort 氣泡排序 - Selection Sort 選擇排序 - Insertion Sort 插入排序 - Merge Sort 歸併排序 - QuickSort 快速排序 [【Data Structures 資料結構】](https://hackmd.io/CUexQ6_XTR-7ZGtuBLCN0A) ::: :::warning 參考書籍: ![](https://hackmd.io/_uploads/SkO8yn_y6.jpg) 參考網站: [NTNU](https://web.ntnu.edu.tw/~algo/) 參考網站: [VISUALGO](https://visualgo.net/en) 參考網站: [十大经典排序算法](https://www.runoob.com/w3cnote/ten-sorting-algorithm.html) 參考網站: [Leetcode 學習探索知識地圖](https://leetcode.cn/leetbook/) ![螢幕擷取畫面 2023-12-22 160601](https://hackmd.io/_uploads/SytsQTMPp.png) ::: ## 什麼是Algorithms演算法? 演算法是一系列明確定義的指令，按照順序執行，得到某種特定的結果 - 如何評估一個算法? - 清晰明確性：每一步必須清晰明確，沒有解釋或混淆的餘地 - 有限性 : 要有所限制，意思是不能無限循環 - 有效性 : 可以跑得動的 - 一般性 : 廣泛通用，並非為了單一問題設計 - 明確性 : 每一步驟都要很明確，不存在模糊 - 輸入輸出 : 算法必須有出入和輸出。輸入，是初始數據；輸出，是完成算法任務後產生的結果或解決方案 - 考慮時間複雜度 (Time Complexity)和空間複雜度 (Space Complexity): ![螢幕擷取畫面 2023-12-21 163507](https://hackmd.io/_uploads/HkbROuZDT.png) ![螢幕擷取畫面 2023-12-21 163511](https://hackmd.io/_uploads/ryERuuZvp.png) Big-O : 是一種用於描述算法時間複雜度的表示法，用於表示一個算法的執行時間相對於輸入大小的增長率主要關注算法執行時間的「上界」，意思是在最壞情況下，算法執行時間的增長不會超過某個常數倍數 ![](https://hackmd.io/_uploads/BkjEWDOJT.jpg) 【O(1) 常數複雜度 (Constant Complexity)】效率最高無論輸入大小如何增加，算法的執行時間都是固定的，使用index，算法具有恆定的執行時間，"不受輸入大小的影響" ```= # o(1) num = [1,2,3,4,5,6] num[2] ``` ![截圖 2024-09-01 下午3.59.42](https://hackmd.io/_uploads/BJ-W1sb2R.png) 【O(logn) 對數複雜度 (Logarithmic Complexity)】效率僅次於O(1) 當輸入大小增加時，執行時間以"對數"的方式增長，表示算法的執行時間增長率比線性更慢。例如二分搜尋算法，經過排序後，從中間開始找（1og 16 2 = 4，只要找四次，不需要歷遍整個list)，對於大規模問題效率更高 ```= # 二分搜索 (Binary Search) def binary_search(arr, target): left, right = 0, len(arr) - 1 while left <= right: mid = (left + right) // 2 if arr[mid] == target: return mid # 返回目標元素的索引 elif arr[mid] < target: left = mid + 1 else: right = mid - 1 return -1 # 如果目標元素不在數組中，返回 -1 # 範例使用 arr = [1, 3, 5, 7, 9, 11, 13] target = 11 result = binary_search(arr, target) if result != -1: print(f"Element found at index: {result}") else: print("Element not found in the array.") ``` ![截圖 2024-09-01 下午4.34.50](https://hackmd.io/_uploads/BJ1HPsb2A.png) 【O(n) 線性複雜度 (Linear Complexity)】當輸入大小增加時，執行時間呈現"線性"增長，遍歷數組中的所有元素，表示算法的執行時間和輸入大小成正比 ```= # o(n) def bigo(n): for i in range(n): print(i) print('-') bigo(3) ``` ![截圖 2024-09-01 下午4.17.06](https://hackmd.io/_uploads/BkuzQi-h0.png) ```= # o(2n) def bigo(n): for i in range(n): print(i) for j in range(n): print(j) print('-') bigo(3) ``` ![截圖 2024-09-01 下午4.20.13](https://hackmd.io/_uploads/SyKAmsZn0.png) 【O(nlogn) 線性對數複雜度 (Loglinear Complexity)】當輸入大小增加時，執行時間以"線性對數"的方式增長，是介於線性和對數之間的增長率，通常比純線性複雜度更好例子：合併排序、快速排序等分治算法 ```= ``` 【O(n^x) 多項式複雜度 (Polynomial Complexity)】當輸入大小增加時，執行時間呈現多項式級別的增長，執行時間與輸入大小的多項式次數成正比 ```= # o(n**2) def bigo(n): for i in range(n): print(i) for j in range(n): print(j) print('-') bigo(3) ``` ![截圖 2024-09-01 下午4.32.08](https://hackmd.io/_uploads/rydiUiWhR.png) ```= # o(n**3) def bigo(n): for i in range(n): print(i) for j in range(n): print(j) for k in range(n): print(k) print('-') bigo(2) ``` ![截圖 2024-09-01 下午4.32.16](https://hackmd.io/_uploads/HkMjLjZn0.png) 【O(X^n) 指數時間 (Exponential Time)】當輸入大小增加時，執行時間呈"指數"級別的增長，執行時間增長非常快，通常是效能最差的情況之一，例如求解TSP旅行推銷員問題 ```= ``` 【O(n!): 階乘複雜度 (Factorial Complexity)】當輸入大小增加時，執行時間呈"階乘"級別的增長，執行時間增長極其快速，通常是效能最差的情況之一，僅適用於極小的輸入，例如求解某些排列組合問題 ```= ``` [BigO時間性比較](https://docs.google.com/spreadsheets/d/1TIK5thry_L9-zDOUIv67suzZr2FD3aQv/edit#gid=1275087821) ![螢幕擷取畫面 2023-12-25 172746](https://hackmd.io/_uploads/HJ3Soa8va.png) ![螢幕擷取畫面 2023-12-25 172754](https://hackmd.io/_uploads/BknSo68wT.png) ![螢幕擷取畫面 2023-12-25 172801](https://hackmd.io/_uploads/r1hrs6UP6.png) ![螢幕擷取畫面 2023-12-25 172807](https://hackmd.io/_uploads/rJnSiaIP6.png) ![螢幕擷取畫面 2023-12-25 172816](https://hackmd.io/_uploads/H12rjpUw6.png) ## 資料結構和演算法，兩者關係為? 資料結構和演算法兩者相輔相成資料結構提供了【存儲和組織數據】的方式，而演算法則【操作這些資料】以解決特定的問題一個有效的演算法需要建立在合適的資料結構之上一個合適的資料結構可以使演算法更加高效 ## Algorithms演算法 ### Trees Travesal - Depth First Search, DFS 深度優先搜尋 (stack 先進後出，後進先出) ```= A / \ B C / \ \ D E F / \ \ G H I ``` ```= DFS: Pre-order: A, B, D, E, G, H, C, F, I 根節點 -> 左子樹先序 -> 右子樹先序 In-order: D, B, G, E, H, A, C, F, I 左子樹中序 -> 根節點 -> 右子樹中序 (由小到大) Post-order: D, G, H, E, B, I, F, C, A 左子樹後序 -> 右子樹後序 -> 根節點 ``` Pre-order 一直把左子樹加進stack和results，直到沒有左子樹 pop最後的值出stack，當pop的點有右子樹時，將右子樹和右子樹的左子樹同時加入stack, result ```= # python stack =[A,B,D]，results = [A,B,D] # stack =[A,B]，results = [A,B,D] # stack =[A]，results = [A,B,D] # stack =[A,E,G]，results = [A,B,D,E,G] # stack =[A,E]，results = [A,B,D,E,G] # stack =[A]，results = [A,B,D,E,G] # stack =[A,H]，results = [A,B,D,E,G,H] # stack =[A]，results = [A,B,D,E,G,H] # stack =[A,C]，results = [A,B,D,E,G,H,C] # stack =[A,C,F]，results = [A,B,D,E,G,H,C,F] # stack =[A,C,F,I]，results = [A,B,D,E,G,H,C,F,I] # stack =[A,C,F]，results = [A,B,D,E,G,H,C,F,I] # stack =[A,C]，results = [A,B,D,E,G,H,C,F,I] # stack =[A]，results = [A,B,D,E,G,H,C,F,I] # stack =[]，results = [A,B,D,E,G,H,C,F,I] ``` In-order 一直把左子樹加進stack，直到沒有左子樹 pop最後的值出stack，同時把pop的數字加到results 當pop的點有右子樹時，將右子樹和右子樹的左子樹同時加入stack ```= # python stack =[A,B,D]，results = [] # stack =[A,B]，results = [D] # stack =[A]，results = [D,B] # stack =[A,E,G]，results = [D,B] # stack =[A,E]，results = [D,B,G] # stack =[A]，results = [D,B,G,E] # stack =[A,H]，results = [D,B,G,E] # stack =[A]，results = [D,B,G,E,H] # stack =[]，results = [D,B,G,E,H,A] # stack =[C,F,I]，results = [D,B,G,E,H,A] # stack =[]，results = [D,B,G,E,H,A,C,F,I] ``` Post-order 一直把左子樹加進stack，直到沒有左子樹 pop最後的值出stack 當pop的點沒有左子樹和右子樹時，才能加到results; 如果有左子樹或右子樹，就加進stack ```= # python stack =[A,B,D]，results = [] # stack =[A,B]，results = [D] # stack =[A,B,E,G]，results = [D] # stack =[A,B,E,H]，results = [D,G] # stack =[A,B,E]，results = [D,G,H] # stack =[A,B]，results = [D,G,H,E] # stack =[A,C,F,I]，results = [D,G,H,E,B] # stack =[A,C,F]，results = [D,G,H,E,B,I] # stack =[A,C]，results = [D,G,H,E,B,I,F] # stack =[A]，results = [D,G,H,E,B,I,F,C] # stack =[]，results = [D,G,H,E,B,I,F,C,A] ``` ```= # DFS O(n) class Node: def __init__(self, value) -> None: self.value = value self.left = None self.right = None class BinarySearchTree: def __init__(self, value) -> None: self.root = Node(value) def insert(self, value): new_node = Node(value) if self.root is None: self.root = new_node return True tmp_root = self.root while True: if new_node.value == tmp_root.value: return False elif new_node.value < tmp_root.value: if tmp_root.left is None: tmp_root.left = new_node return True tmp_root = tmp_root.left else: if tmp_root.right is None: tmp_root.right = new_node return True tmp_root = tmp_root.right def dfs_pre_order(self): results = [] def traversal(current_node): results.append(current_node.value) if current_node.left is not None: traversal(current_node=current_node.left) if current_node.right is not None: traversal(current_node=current_node.right) traversal(self.root) return results def dfs_in_order(self): results = [] def traversal(current_node): if current_node.left is not None: traversal(current_node=current_node.left) results.append(current_node.value) if current_node.right is not None: traversal(current_node=current_node.right) traversal(self.root) return results def dfs_post_order(self): results = [] def traversal(current_node): if current_node.left is not None: traversal(current_node=current_node.left) if current_node.right is not None: traversal(current_node=current_node.right) results.append(current_node.value) traversal(self.root) return results if __name__ == "__main__": bst = BinarySearchTree(8) for i in [23,11,1,16,15,13]: bst.insert(i) print("pre_order:" , bst.dfs_pre_order()) print("in_order:" , bst.dfs_in_order()) print("post_order:" , bst.dfs_post_order()) ``` ![螢幕擷取畫面 2023-12-26 143355](https://hackmd.io/_uploads/SJfJNxdwp.png) ```= # 深度優先搜索 (Depth-First Search, DFS) def dfs(graph, start, visited=None): if visited is None: visited = set() visited.add(start)Ｓ print(start) # 處理當前節點 for neighbor in graph[start]: if neighbor not in visited: dfs(graph, neighbor, visited) # 範例使用 graph = { 'A': ['B', 'C'], 'B': ['D', 'E'], 'C': ['F'], 'D': [], 'E': ['F'], 'F': [] } dfs(graph, 'A') ``` ![截圖 2024-09-08 晚上11.04.47](https://hackmd.io/_uploads/rk77T4inC.png) - Breadth First Search, BFS 廣度優先搜尋 (queue 先進先出) ```= A / \ B C / \ \ D E F / \ \ G H I ``` ```= BFS : A, B, C, D, E, F, G, H, I ``` ```= # BFS O(n) # python # 邏輯是先設一個 queue[]、一個 results[] # 當queue有值時，值自動移到results # 原本queue的下一層，自動移到queue # queue = [A]，results = [] # queue = [B,C], results = [A] # queue = [C,D,E], results = [A,B] # queue = [D,E,F], results = [A,B,C] # queue = [F,G,H], results = [A,B,C,D,E] # queue = [G,H,I], results = [A,B,C,D,E,F] # queue = [], results = [A,B,C,D,E,F,G,H,I] class Node: def __init__(self, value) -> None: self.value = value self.left = None self.right = None class BinarySearchTree: def __init__(self, value) -> None: self.root = Node(value) def insert(self, value): new_node = Node(value) if self.root is None: self.root = new_node return True tmp_root = self.root while True: if new_node.value == tmp_root.value: return False elif new_node.value < tmp_root.value: if tmp_root.left is None: tmp_root.left = new_node return True tmp_root = tmp_root.left else: if tmp_root.right is None: tmp_root.right = new_node return True tmp_root = tmp_root.right def bfs(self): node_queue = [] results = [] current_node = self.root node_queue.append(current_node) while len(node_queue) > 0: current_node = node_queue.pop(0) results.append(current_node.value) if current_node.left is not None: node_queue.append(current_node.left) if current_node.right is not None: node_queue.append(current_node.right) return(results) if __name__ == "__main__": bst = BinarySearchTree(10) bst.insert(5) bst.insert(15) bst.insert(1) bst.insert(8) print(bst.bfs()) print(bst.root.value) print(bst.root.left.value) print(bst.root.left.left.value) ``` ![螢幕擷取畫面 2023-12-26 123618](https://hackmd.io/_uploads/S1t8uRPwT.png) ```= # 廣度優先搜索 (Breadth-First Search, BFS) from collections import deque def bfs(graph, start): visited = set() queue = deque([start]) # deque（雙端佇列）實現 FIFO（先進先出) while queue: vertex = queue.popleft() if vertex not in visited: print(vertex) # 處理當前節點 visited.add(vertex) queue.extend(neighbor for neighbor in graph[vertex] if neighbor not in visited) # graph = { 'A': ['B', 'C'], 'B': ['D', 'E'], 'C': ['F'], 'D': [], 'E': ['F'], 'F': [] } bfs(graph, 'A') ``` ![截圖 2024-09-08 晚上10.54.03](https://hackmd.io/_uploads/HJ1T5Nsh0.png) ### Graph Travesal Graph 沒有root概念，需要找一個起始點，假設是A ```= E / B / \ A - C F \ / D A : [B, C, D] B : [A, E, F] C : [A] D : [A, F] E : [B] F : [B, D] ``` - Depth First Search, DFS 深度優先搜尋 (stack 先進後出，後進先出) 當stack有值時，先進後出，值移到results，它的下一層都加進stack ```= # GRAPH_DFS O(n) # stack = [A]，results = [] # stack = [B,C,D]，results = [A] # stack = [E,F,C,D]，results = [A,B] # stack = [D,C,D]，results = [A,B,E,F] # stack = [C,D]，results = [A,B,E,F,D] # stack = [D]，results = [A,B,E,F,D,C] # stack = []，results = [A,B,E,F,D,C] from collections import deque graph = { "A": ["B", "C", "D"], "B": ["A", "E", "F"], "C": ["A"], "D": ["A", "F"], "E": ["B"], "F": ["B", "D"], } results = [] stack = deque() def dfs(graph, node): stack.append(node) # 初始節點加入堆疊 while stack: current_node = stack.pop() # 從堆疊中彈出一個節點 if current_node not in results: results.append(current_node) # 訪問節點，加入結果列表 # 將鄰居節點逆序加入堆疊，確保 DFS 按照給定順序遍歷 for neighbor in reversed(graph[current_node]): if neighbor not in results: stack.append(neighbor) dfs(graph, "A") print(results) ``` ![截圖 2024-09-09 晚上11.54.36](https://hackmd.io/_uploads/SkkL9q33A.png) - Breadth First Search, BFS 廣度優先搜尋 (queue 先進先出) 當stack有值時，先進先出，值移到results，它的下一層都加進stack ```= # GRAPH_BFS O(n) # stack = [A]，results = [] # stack = [B,C,D]，results = [A] # stack = [C,D,E,F]，results = [A,B] # stack = [D,E,F]，results = [A,B,C] # stack = [E,F]，results = [A,B,C,D] # stack = []，results = [A,B,C,D,E,F] graph = { "A": ["B", "C", "D"], "B": ["A", "E", "F"], "C": ["A"], "D": ["A", "F"], "E":["B"], "F" : ["B", "D"], } results = [] queue = [] def bfs(graph, node): results.append(node) queue.append(node) while queue: p = queue.pop(0) for neighbor in graph[p]: if neighbor not in results: results.append(neighbor) queue.append(neighbor) bfs(graph=graph, node='A') print(results) ``` ![截圖 2024-09-08 晚上11.01.37](https://hackmd.io/_uploads/S10Dh4ohC.png) ### Bubble Sort 氣泡排序 [參考圖](https://favtutor.com/blogs/bubble-sort-python) 氣泡排序一次比較兩個數字，如果順序錯誤就交換(更大交換，或更小交換)，將較大的數字往右推，每一輪都從頭開始比，直到整個數列排序完成假設array = [1,2,3,4,5,6]，需要進行5 (6-1=5) 輪循環 ```= # Bubble Sort 氣泡排序 O(n**2) def bubble_sort(array): n = len(array) for i in range(n): for j in range(0, n-1): if array[j] > array[j+1]: array[j], array[j+1] = array[j+1], array[j] array = [5, 2, 4, 16, 10] bubble_sort(array=array) print(array) ``` ![螢幕擷取畫面 2023-12-26 173148](https://hackmd.io/_uploads/B1M9Tz_Pp.png) ### Selection Sort 選擇排序 [參考圖](https://studyalgorithms.com/array/selection-sort/) 第一輪，預設index0為最小值的index，內迴圈所有數字都和array[0]比較第二輪，預設index1為最小值的index，內迴圈所有數字都和array[1]比較... 如果找到的值，比index最小值還要小，則進行交換選擇排序的特點是"不穩定"，如果有兩個相同元素 A -> B，第二次選擇排序，有可能變成 B -> A 不影響數據的正確性，但需要保持相同元素的相對順序的情況下，不適合使用 ```= # Selection Sort 選擇排序 O(n**2) def selection_sort(array): n = len(array) for i in range(n): min_index = i for j in range(i+1,n): if array[j] < array[min_index]: min_index = j if min_index != i: array[i], array[min_index] = array[min_index], array[i] array = [5, 2, 4, 16, 10] selection_sort(array=array) print(array)y[min_index] = array[min_index], array[i] ``` ![螢幕擷取畫面 2023-12-26 173148](https://hackmd.io/_uploads/B1M9Tz_Pp.png) ### Insertion Sort 插入排序 [參考圖](https://www.testingdocs.com/questions/what-is-insertion-sort-write-a-java-program-to-demonstrate-it/) 將一個數列分為已排序、未排序，從未排序取出一個數字，插入到已排序的最右邊，往左比較，找到適當位置，直到整個數列排序完成插入排序的特點是"穩定"，相同元素的相對位置在排序前後不會改變 ```= # Insertion Sort 插入排序 O(n**2) def insertion_sort(array): for i in range(1, len(array)): current_value = array[i] position = i while position > 0 and current_value < array[position - 1]: array[position] = array[position - 1] position -= 1 array[position] = current_value return array array = [5, 2, 4, 16, 10] insertion_sort(array=array) print(array) ``` ![螢幕擷取畫面 2023-12-26 173148](https://hackmd.io/_uploads/B1M9Tz_Pp.png) ### Merge Sort 歸併排序 [參考圖](https://www.kirupa.com/sorts/mergesort.htm) 將待排序的數列分成兩部分，分別對兩部分進行排序，然後將排序好的兩部分合併成一個有序的序列歸併排序的過程包括兩個主要步驟：分割和合併，通常使用遞迴實現，適用大型數列的排序 ```= def merge(l1, l2): result = [] i = j = 0 while i < len(l1) and j < len(l2): if l1[i] > l2[j]: result.append(l2[j]) j += 1 elif l1[i] <= l2[j]: result.append(l1[i]) i += 1 while i < len(l1): result.append(l1[i]) i += 1 while j < len(l2): result.append(l2[j]) j += 1 return result result2 = merge([1,3,5], [2,4,7]) print(result2) ``` ![截圖 2024-09-12 晚上11.58.23](https://hackmd.io/_uploads/Bkf0ycepC.png) ```= # 遞迴 def merge_sort(my_list): if len(my_list) == 1: return my_list mid_index = int(len(my_list) / 2) left = merge_sort(my_list[:mid_index]) right = merge_sort(my_list[mid_index:]) return merge(left, right) result = merge_sort([6, 6, 1, 8, 5]) print(result) ``` ![截圖 2024-09-13 凌晨12.01.47](https://hackmd.io/_uploads/ryyKgqxpC.png) ### Quick Sort 快速排序 [參考圖](https://fahsa.cn/post/structure-algorithms/mergesort-quicksort/) 通過一趟排序將數列分為兩部分，自己放在中間，左半部分的元素都 < 基準元素，右半部分的元素都 > 基準元素，然後分別對左右兩部分進行遞迴 ```= # Quick Sort 快速排序 # pivot_index = 0, end_index = 4 # swap_index = 0 交換指摽 # swap=false # i 1 # my_list[1] < my_list[0], swap_index=1, i=2, swap=false # my_list[2] > my_list[0], i=3, swap=true # my_list[3] > my_list[0], i=4, swap=true # my_list[4] < my_list[0], i=4, swap_index=2, swap=false # my_list[0], my_list[2] def pivot(my_list, start_index, end_index): pivot_index = start_index swap_index = pivot_index swap = False for i in range(pivot_index + 1, end_index + 1): if my_list[i] < my_list[pivot_index]: if swap: swap_index += 1 my_list[swap_index], my_list[i] = my_list[i], my_list[swap_index] else: swap_index += 1 i += 1 elif my_list[i] > my_list[pivot_index]: i += 1 swap = True my_list[pivot_index], my_list[swap_index] = ( my_list[swap_index], my_list[pivot_index], ) return swap_index my_list = [6, 5, 8, 7, 1] pivot(my_list, 0, len(my_list) - 1) print(my_list) ``` ![截圖 2024-09-13 凌晨12.50.05](https://hackmd.io/_uploads/BJe0iqepA.png) ```= # 遞迴 def quick_sort(my_list, start_index, end_index): if start_index < end_index: pivot_index = pivot(my_list, start_index, end_index) quick_sort(my_list, start_index, pivot_index) quick_sort(my_list, pivot_index + 1, end_index) my_list = [6, 5, 8, 7, 1] quick_sort(my_list, 0, len(my_list) - 1) print(my_list) ``` ![截圖 2024-09-13 凌晨12.50.05](https://hackmd.io/_uploads/BJe0iqepA.png)