## 【Data Structures & Algorithms 資料結構和演算法】
:::info
- 什麼是Algorithms演算法?
- 如何評估一個算法?
- 資料結構和演算法,兩者關係為?
- Trees Travesal
- Depth First Search, DFS 深度優先搜尋 (stack 先進後出,後進先出)
- Breadth First Search, BFS 廣度優先搜尋 (queue 先進先出)
- Graph Travesal
- Bubble Sort 氣泡排序
- Selection Sort 選擇排序
- Insertion Sort 插入排序
- Merge Sort 歸併排序
- QuickSort 快速排序
[【Data Structures 資料結構】](https://hackmd.io/CUexQ6_XTR-7ZGtuBLCN0A)
:::
:::warning
參考書籍:
![](https://hackmd.io/_uploads/SkO8yn_y6.jpg)
<br/>
參考網站: [NTNU](https://web.ntnu.edu.tw/~algo/)
<br/>
參考網站: [VISUALGO](https://visualgo.net/en)
<br/>
參考網站: [十大经典排序算法](https://www.runoob.com/w3cnote/ten-sorting-algorithm.html)
<br/>
參考網站: [Leetcode 學習 探索知識地圖](https://leetcode.cn/leetbook/)
![螢幕擷取畫面 2023-12-22 160601](https://hackmd.io/_uploads/SytsQTMPp.png)
:::
<br/>
## 什麼是Algorithms演算法?
演算法是一系列明確定義的指令,按照順序執行,得到某種特定的結果
- 如何評估一個算法?
- 清晰明確性:每一步必須清晰明確,沒有解釋或混淆的餘地
- 有限性 : 要有所限制,意思是不能無限循環
- 有效性 : 可以跑得動的
- 一般性 : 廣泛通用,並非為了單一問題設計
- 明確性 : 每一步驟都要很明確,不存在模糊
- 輸入輸出 : 算法必須有出入和輸出。輸入,是初始數據;輸出,是完成算法任務後產生的結果或解決方案
- 考慮時間複雜度 (Time Complexity)和空間複雜度 (Space Complexity):
![螢幕擷取畫面 2023-12-21 163507](https://hackmd.io/_uploads/HkbROuZDT.png)
![螢幕擷取畫面 2023-12-21 163511](https://hackmd.io/_uploads/ryERuuZvp.png)
Big-O : 是一種用於描述算法時間複雜度的表示法,用於表示一個算法的執行時間相對於輸入大小的增長率
主要關注算法執行時間的「上界」,意思是在最壞情況下,算法執行時間的增長不會超過某個常數倍數
![](https://hackmd.io/_uploads/BkjEWDOJT.jpg)
【O(1) 常數複雜度 (Constant Complexity)】
效率最高
無論輸入大小如何增加,算法的執行時間都是固定的,使用index,算法具有恆定的執行時間,"不受輸入大小的影響"
```=
# o(1)
num = [1,2,3,4,5,6]
num[2]
```
![截圖 2024-09-01 下午3.59.42](https://hackmd.io/_uploads/BJ-W1sb2R.png)
【O(logn) 對數複雜度 (Logarithmic Complexity)】
效率僅次於O(1)
當輸入大小增加時,執行時間以"對數"的方式增長,表示算法的執行時間增長率比線性更慢。例如二分搜尋算法,經過排序後,從中間開始找(1og 16 2 = 4,只要找四次,不需要歷遍整個list),對於大規模問題效率更高
```=
# 二分搜索 (Binary Search)
def binary_search(arr, target):
left, right = 0, len(arr) - 1
while left <= right:
mid = (left + right) // 2
if arr[mid] == target:
return mid # 返回目標元素的索引
elif arr[mid] < target:
left = mid + 1
else:
right = mid - 1
return -1 # 如果目標元素不在數組中,返回 -1
# 範例使用
arr = [1, 3, 5, 7, 9, 11, 13]
target = 11
result = binary_search(arr, target)
if result != -1:
print(f"Element found at index: {result}")
else:
print("Element not found in the array.")
```
![截圖 2024-09-01 下午4.34.50](https://hackmd.io/_uploads/BJ1HPsb2A.png)
【O(n) 線性複雜度 (Linear Complexity)】
當輸入大小增加時,執行時間呈現"線性"增長,遍歷數組中的所有元素,表示算法的執行時間和輸入大小成正比
```=
# o(n)
def bigo(n):
for i in range(n):
print(i)
print('-')
bigo(3)
```
![截圖 2024-09-01 下午4.17.06](https://hackmd.io/_uploads/BkuzQi-h0.png)
```=
# o(2n)
def bigo(n):
for i in range(n):
print(i)
for j in range(n):
print(j)
print('-')
bigo(3)
```
![截圖 2024-09-01 下午4.20.13](https://hackmd.io/_uploads/SyKAmsZn0.png)
【O(nlogn) 線性對數複雜度 (Loglinear Complexity)】
當輸入大小增加時,執行時間以"線性對數"的方式增長,是介於線性和對數之間的增長率,通常比純線性複雜度更好
例子: 合併排序、快速排序等分治算法
```=
```
【O(n^x) 多項式複雜度 (Polynomial Complexity)】
當輸入大小增加時,執行時間呈現多項式級別的增長,執行時間與輸入大小的多項式次數成正比
```=
# o(n**2)
def bigo(n):
for i in range(n):
print(i)
for j in range(n):
print(j)
print('-')
bigo(3)
```
![截圖 2024-09-01 下午4.32.08](https://hackmd.io/_uploads/rydiUiWhR.png)
```=
# o(n**3)
def bigo(n):
for i in range(n):
print(i)
for j in range(n):
print(j)
for k in range(n):
print(k)
print('-')
bigo(2)
```
![截圖 2024-09-01 下午4.32.16](https://hackmd.io/_uploads/HkMjLjZn0.png)
【O(X^n) 指數時間 (Exponential Time)】
當輸入大小增加時,執行時間呈"指數"級別的增長,執行時間增長非常快,通常是效能最差的情況之一,例如求解TSP旅行推銷員問題
```=
```
【O(n!): 階乘複雜度 (Factorial Complexity)】
當輸入大小增加時,執行時間呈"階乘"級別的增長,執行時間增長極其快速,通常是效能最差的情況之一,僅適用於極小的輸入,例如求解某些排列組合問題
```=
```
<br/>
[BigO時間性比較](https://docs.google.com/spreadsheets/d/1TIK5thry_L9-zDOUIv67suzZr2FD3aQv/edit#gid=1275087821)
![螢幕擷取畫面 2023-12-25 172746](https://hackmd.io/_uploads/HJ3Soa8va.png)
![螢幕擷取畫面 2023-12-25 172754](https://hackmd.io/_uploads/BknSo68wT.png)
![螢幕擷取畫面 2023-12-25 172801](https://hackmd.io/_uploads/r1hrs6UP6.png)
![螢幕擷取畫面 2023-12-25 172807](https://hackmd.io/_uploads/rJnSiaIP6.png)
![螢幕擷取畫面 2023-12-25 172816](https://hackmd.io/_uploads/H12rjpUw6.png)
<br/>
## 資料結構和演算法,兩者關係為?
資料結構和演算法兩者相輔相成
資料結構提供了【存儲和組織數據】的方式,而演算法則【操作這些資料】以解決特定的問題
一個有效的演算法需要建立在合適的資料結構之上
一個合適的資料結構可以使演算法更加高效
<br/>
## Algorithms演算法
### Trees Travesal
- Depth First Search, DFS 深度優先搜尋 (stack 先進後出,後進先出)
```=
A
/ \
B C
/ \ \
D E F
/ \ \
G H I
```
```=
DFS:
Pre-order: A, B, D, E, G, H, C, F, I
根節點 -> 左子樹先序 -> 右子樹先序
In-order: D, B, G, E, H, A, C, F, I
左子樹中序 -> 根節點 -> 右子樹中序 (由小到大)
Post-order: D, G, H, E, B, I, F, C, A
左子樹後序 -> 右子樹後序 -> 根節點
```
Pre-order
一直把左子樹加進stack和results,直到沒有左子樹
pop最後的值出stack,當pop的點有右子樹時,將右子樹和右子樹的左子樹同時加入stack, result
```=
# python
stack =[A,B,D],results = [A,B,D]
# stack =[A,B],results = [A,B,D]
# stack =[A],results = [A,B,D]
# stack =[A,E,G],results = [A,B,D,E,G]
# stack =[A,E],results = [A,B,D,E,G]
# stack =[A],results = [A,B,D,E,G]
# stack =[A,H],results = [A,B,D,E,G,H]
# stack =[A],results = [A,B,D,E,G,H]
# stack =[A,C],results = [A,B,D,E,G,H,C]
# stack =[A,C,F],results = [A,B,D,E,G,H,C,F]
# stack =[A,C,F,I],results = [A,B,D,E,G,H,C,F,I]
# stack =[A,C,F],results = [A,B,D,E,G,H,C,F,I]
# stack =[A,C],results = [A,B,D,E,G,H,C,F,I]
# stack =[A],results = [A,B,D,E,G,H,C,F,I]
# stack =[],results = [A,B,D,E,G,H,C,F,I]
```
In-order
一直把左子樹加進stack,直到沒有左子樹
pop最後的值出stack,同時把pop的數字加到results
當pop的點有右子樹時,將右子樹和右子樹的左子樹同時加入stack
```=
# python
stack =[A,B,D],results = []
# stack =[A,B],results = [D]
# stack =[A],results = [D,B]
# stack =[A,E,G],results = [D,B]
# stack =[A,E],results = [D,B,G]
# stack =[A],results = [D,B,G,E]
# stack =[A,H],results = [D,B,G,E]
# stack =[A],results = [D,B,G,E,H]
# stack =[],results = [D,B,G,E,H,A]
# stack =[C,F,I],results = [D,B,G,E,H,A]
# stack =[],results = [D,B,G,E,H,A,C,F,I]
```
Post-order
一直把左子樹加進stack,直到沒有左子樹
pop最後的值出stack
當pop的點沒有左子樹和右子樹時,才能加到results; 如果有左子樹或右子樹,就加進stack
```=
# python
stack =[A,B,D],results = []
# stack =[A,B],results = [D]
# stack =[A,B,E,G],results = [D]
# stack =[A,B,E,H],results = [D,G]
# stack =[A,B,E],results = [D,G,H]
# stack =[A,B],results = [D,G,H,E]
# stack =[A,C,F,I],results = [D,G,H,E,B]
# stack =[A,C,F],results = [D,G,H,E,B,I]
# stack =[A,C],results = [D,G,H,E,B,I,F]
# stack =[A],results = [D,G,H,E,B,I,F,C]
# stack =[],results = [D,G,H,E,B,I,F,C,A]
```
<br/>
```=
# DFS O(n)
class Node:
def __init__(self, value) -> None:
self.value = value
self.left = None
self.right = None
class BinarySearchTree:
def __init__(self, value) -> None:
self.root = Node(value)
def insert(self, value):
new_node = Node(value)
if self.root is None:
self.root = new_node
return True
tmp_root = self.root
while True:
if new_node.value == tmp_root.value:
return False
elif new_node.value < tmp_root.value:
if tmp_root.left is None:
tmp_root.left = new_node
return True
tmp_root = tmp_root.left
else:
if tmp_root.right is None:
tmp_root.right = new_node
return True
tmp_root = tmp_root.right
def dfs_pre_order(self):
results = []
def traversal(current_node):
results.append(current_node.value)
if current_node.left is not None:
traversal(current_node=current_node.left)
if current_node.right is not None:
traversal(current_node=current_node.right)
traversal(self.root)
return results
def dfs_in_order(self):
results = []
def traversal(current_node):
if current_node.left is not None:
traversal(current_node=current_node.left)
results.append(current_node.value)
if current_node.right is not None:
traversal(current_node=current_node.right)
traversal(self.root)
return results
def dfs_post_order(self):
results = []
def traversal(current_node):
if current_node.left is not None:
traversal(current_node=current_node.left)
if current_node.right is not None:
traversal(current_node=current_node.right)
results.append(current_node.value)
traversal(self.root)
return results
if __name__ == "__main__":
bst = BinarySearchTree(8)
for i in [23,11,1,16,15,13]:
bst.insert(i)
print("pre_order:" , bst.dfs_pre_order())
print("in_order:" , bst.dfs_in_order())
print("post_order:" , bst.dfs_post_order())
```
![螢幕擷取畫面 2023-12-26 143355](https://hackmd.io/_uploads/SJfJNxdwp.png)
```=
# 深度優先搜索 (Depth-First Search, DFS)
def dfs(graph, start, visited=None):
if visited is None:
visited = set()
visited.add(start)S
print(start) # 處理當前節點
for neighbor in graph[start]:
if neighbor not in visited:
dfs(graph, neighbor, visited)
# 範例使用
graph = {
'A': ['B', 'C'],
'B': ['D', 'E'],
'C': ['F'],
'D': [],
'E': ['F'],
'F': []
}
dfs(graph, 'A')
```
![截圖 2024-09-08 晚上11.04.47](https://hackmd.io/_uploads/rk77T4inC.png)
<br/>
- Breadth First Search, BFS 廣度優先搜尋 (queue 先進先出)
```=
A
/ \
B C
/ \ \
D E F
/ \ \
G H I
```
```=
BFS : A, B, C, D, E, F, G, H, I
```
```=
# BFS O(n)
# python
# 邏輯是先設一個 queue[]、一個 results[]
# 當queue有值時,值自動移到results
# 原本queue的下一層,自動移到queue
# queue = [A],results = []
# queue = [B,C], results = [A]
# queue = [C,D,E], results = [A,B]
# queue = [D,E,F], results = [A,B,C]
# queue = [F,G,H], results = [A,B,C,D,E]
# queue = [G,H,I], results = [A,B,C,D,E,F]
# queue = [], results = [A,B,C,D,E,F,G,H,I]
class Node:
def __init__(self, value) -> None:
self.value = value
self.left = None
self.right = None
class BinarySearchTree:
def __init__(self, value) -> None:
self.root = Node(value)
def insert(self, value):
new_node = Node(value)
if self.root is None:
self.root = new_node
return True
tmp_root = self.root
while True:
if new_node.value == tmp_root.value:
return False
elif new_node.value < tmp_root.value:
if tmp_root.left is None:
tmp_root.left = new_node
return True
tmp_root = tmp_root.left
else:
if tmp_root.right is None:
tmp_root.right = new_node
return True
tmp_root = tmp_root.right
def bfs(self):
node_queue = []
results = []
current_node = self.root
node_queue.append(current_node)
while len(node_queue) > 0:
current_node = node_queue.pop(0)
results.append(current_node.value)
if current_node.left is not None:
node_queue.append(current_node.left)
if current_node.right is not None:
node_queue.append(current_node.right)
return(results)
if __name__ == "__main__":
bst = BinarySearchTree(10)
bst.insert(5)
bst.insert(15)
bst.insert(1)
bst.insert(8)
print(bst.bfs())
print(bst.root.value)
print(bst.root.left.value)
print(bst.root.left.left.value)
```
![螢幕擷取畫面 2023-12-26 123618](https://hackmd.io/_uploads/S1t8uRPwT.png)
```=
# 廣度優先搜索 (Breadth-First Search, BFS)
from collections import deque
def bfs(graph, start):
visited = set()
queue = deque([start]) # deque(雙端佇列)實現 FIFO(先進先出)
while queue:
vertex = queue.popleft()
if vertex not in visited:
print(vertex) # 處理當前節點
visited.add(vertex)
queue.extend(neighbor for neighbor in graph[vertex] if neighbor not in visited)
#
graph = {
'A': ['B', 'C'],
'B': ['D', 'E'],
'C': ['F'],
'D': [],
'E': ['F'],
'F': []
}
bfs(graph, 'A')
```
![截圖 2024-09-08 晚上10.54.03](https://hackmd.io/_uploads/HJ1T5Nsh0.png)
<br/>
### Graph Travesal
Graph 沒有root概念,需要找一個起始點,假設是A
```=
E
/
B
/ \
A - C F
\ /
D
A : [B, C, D]
B : [A, E, F]
C : [A]
D : [A, F]
E : [B]
F : [B, D]
```
- Depth First Search, DFS 深度優先搜尋 (stack 先進後出,後進先出)
當stack有值時,先進後出,值移到results,它的下一層都加進stack
```=
# GRAPH_DFS O(n)
# stack = [A],results = []
# stack = [B,C,D],results = [A]
# stack = [E,F,C,D],results = [A,B]
# stack = [D,C,D],results = [A,B,E,F]
# stack = [C,D],results = [A,B,E,F,D]
# stack = [D],results = [A,B,E,F,D,C]
# stack = [],results = [A,B,E,F,D,C]
from collections import deque
graph = {
"A": ["B", "C", "D"],
"B": ["A", "E", "F"],
"C": ["A"],
"D": ["A", "F"],
"E": ["B"],
"F": ["B", "D"],
}
results = []
stack = deque()
def dfs(graph, node):
stack.append(node) # 初始節點加入堆疊
while stack:
current_node = stack.pop() # 從堆疊中彈出一個節點
if current_node not in results:
results.append(current_node) # 訪問節點,加入結果列表
# 將鄰居節點逆序加入堆疊,確保 DFS 按照給定順序遍歷
for neighbor in reversed(graph[current_node]):
if neighbor not in results:
stack.append(neighbor)
dfs(graph, "A")
print(results)
```
![截圖 2024-09-09 晚上11.54.36](https://hackmd.io/_uploads/SkkL9q33A.png)
<br/>
- Breadth First Search, BFS 廣度優先搜尋 (queue 先進先出)
當stack有值時,先進先出,值移到results,它的下一層都加進stack
```=
# GRAPH_BFS O(n)
# stack = [A],results = []
# stack = [B,C,D],results = [A]
# stack = [C,D,E,F],results = [A,B]
# stack = [D,E,F],results = [A,B,C]
# stack = [E,F],results = [A,B,C,D]
# stack = [],results = [A,B,C,D,E,F]
graph = {
"A": ["B", "C", "D"],
"B": ["A", "E", "F"],
"C": ["A"],
"D": ["A", "F"],
"E":["B"],
"F" : ["B", "D"],
}
results = []
queue = []
def bfs(graph, node):
results.append(node)
queue.append(node)
while queue:
p = queue.pop(0)
for neighbor in graph[p]:
if neighbor not in results:
results.append(neighbor)
queue.append(neighbor)
bfs(graph=graph, node='A')
print(results)
```
![截圖 2024-09-08 晚上11.01.37](https://hackmd.io/_uploads/S10Dh4ohC.png)
<br/>
### Bubble Sort 氣泡排序
[參考圖](https://favtutor.com/blogs/bubble-sort-python)
氣泡排序一次比較兩個數字,如果順序錯誤就交換(更大交換,或更小交換),將較大的數字往右推,每一輪都從頭開始比,直到整個數列排序完成
假設array = [1,2,3,4,5,6],需要進行5 (6-1=5) 輪循環
```=
# Bubble Sort 氣泡排序 O(n**2)
def bubble_sort(array):
n = len(array)
for i in range(n):
for j in range(0, n-1):
if array[j] > array[j+1]:
array[j], array[j+1] = array[j+1], array[j]
array = [5, 2, 4, 16, 10]
bubble_sort(array=array)
print(array)
```
![螢幕擷取畫面 2023-12-26 173148](https://hackmd.io/_uploads/B1M9Tz_Pp.png)
<br/>
### Selection Sort 選擇排序
[參考圖](https://studyalgorithms.com/array/selection-sort/)
第一輪,預設index0為最小值的index,內迴圈所有數字都和array[0]比較
第二輪,預設index1為最小值的index,內迴圈所有數字都和array[1]比較...
如果找到的值,比index最小值還要小,則進行交換
選擇排序的特點是"不穩定",如果有兩個相同元素 A -> B,第二次選擇排序,有可能變成 B -> A
不影響數據的正確性,但需要保持相同元素的相對順序的情況下,不適合使用
```=
# Selection Sort 選擇排序 O(n**2)
def selection_sort(array):
n = len(array)
for i in range(n):
min_index = i
for j in range(i+1,n):
if array[j] < array[min_index]:
min_index = j
if min_index != i:
array[i], array[min_index] = array[min_index], array[i]
array = [5, 2, 4, 16, 10]
selection_sort(array=array)
print(array)y[min_index] = array[min_index], array[i]
```
![螢幕擷取畫面 2023-12-26 173148](https://hackmd.io/_uploads/B1M9Tz_Pp.png)
<br/>
### Insertion Sort 插入排序
[參考圖](https://www.testingdocs.com/questions/what-is-insertion-sort-write-a-java-program-to-demonstrate-it/)
將一個數列分為已排序、未排序,從未排序取出一個數字,插入到已排序的最右邊,往左比較,找到適當位置,直到整個數列排序完成
插入排序的特點是"穩定",相同元素的相對位置在排序前後不會改變
```=
# Insertion Sort 插入排序 O(n**2)
def insertion_sort(array):
for i in range(1, len(array)):
current_value = array[i]
position = i
while position > 0 and current_value < array[position - 1]:
array[position] = array[position - 1]
position -= 1
array[position] = current_value
return array
array = [5, 2, 4, 16, 10]
insertion_sort(array=array)
print(array)
```
![螢幕擷取畫面 2023-12-26 173148](https://hackmd.io/_uploads/B1M9Tz_Pp.png)
<br/>
### Merge Sort 歸併排序
[參考圖](https://www.kirupa.com/sorts/mergesort.htm)
將待排序的數列分成兩部分,分別對兩部分進行排序,然後將排序好的兩部分合併成一個有序的序列
歸併排序的過程包括兩個主要步驟:分割和合併,通常使用遞迴實現,適用大型數列的排序
```=
def merge(l1, l2):
result = []
i = j = 0
while i < len(l1) and j < len(l2):
if l1[i] > l2[j]:
result.append(l2[j])
j += 1
elif l1[i] <= l2[j]:
result.append(l1[i])
i += 1
while i < len(l1):
result.append(l1[i])
i += 1
while j < len(l2):
result.append(l2[j])
j += 1
return result
result2 = merge([1,3,5], [2,4,7])
print(result2)
```
![截圖 2024-09-12 晚上11.58.23](https://hackmd.io/_uploads/Bkf0ycepC.png)
```=
# 遞迴
def merge_sort(my_list):
if len(my_list) == 1:
return my_list
mid_index = int(len(my_list) / 2)
left = merge_sort(my_list[:mid_index])
right = merge_sort(my_list[mid_index:])
return merge(left, right)
result = merge_sort([6, 6, 1, 8, 5])
print(result)
```
![截圖 2024-09-13 凌晨12.01.47](https://hackmd.io/_uploads/ryyKgqxpC.png)
<br/>
### Quick Sort 快速排序
[參考圖](https://fahsa.cn/post/structure-algorithms/mergesort-quicksort/)
通過一趟排序將數列分為兩部分,自己放在中間,左半部分的元素都 < 基準元素,右半部分的元素都 > 基準元素,然後分別對左右兩部分進行遞迴
```=
# Quick Sort 快速排序
# pivot_index = 0, end_index = 4
# swap_index = 0 交換指摽
# swap=false
# i 1
# my_list[1] < my_list[0], swap_index=1, i=2, swap=false
# my_list[2] > my_list[0], i=3, swap=true
# my_list[3] > my_list[0], i=4, swap=true
# my_list[4] < my_list[0], i=4, swap_index=2, swap=false
# my_list[0], my_list[2]
def pivot(my_list, start_index, end_index):
pivot_index = start_index
swap_index = pivot_index
swap = False
for i in range(pivot_index + 1, end_index + 1):
if my_list[i] < my_list[pivot_index]:
if swap:
swap_index += 1
my_list[swap_index], my_list[i] = my_list[i], my_list[swap_index]
else:
swap_index += 1
i += 1
elif my_list[i] > my_list[pivot_index]:
i += 1
swap = True
my_list[pivot_index], my_list[swap_index] = (
my_list[swap_index],
my_list[pivot_index],
)
return swap_index
my_list = [6, 5, 8, 7, 1]
pivot(my_list, 0, len(my_list) - 1)
print(my_list)
```
![截圖 2024-09-13 凌晨12.50.05](https://hackmd.io/_uploads/BJe0iqepA.png)
```=
# 遞迴
def quick_sort(my_list, start_index, end_index):
if start_index < end_index:
pivot_index = pivot(my_list, start_index, end_index)
quick_sort(my_list, start_index, pivot_index)
quick_sort(my_list, pivot_index + 1, end_index)
my_list = [6, 5, 8, 7, 1]
quick_sort(my_list, 0, len(my_list) - 1)
print(my_list)
```
![截圖 2024-09-13 凌晨12.50.05](https://hackmd.io/_uploads/BJe0iqepA.png)