# A Guided Note on Algorithms
Algorithms play a huge part in achieving better performance in coding. This article covers algorithms that are commonly used in practice and important for LeetCode practice problems.
### <span style="color:#B3A1FF">Table of contents</span>
- βοΈ Algorithm concepts
- π§© Matching algorithms
- πΈοΈ Graphs
- π° Greedy algorithms
- βοΈ Divide and conquer
- π§ Dynamic programming
- π Flow Network
- πΏοΈ Beyond polynomial running times
- π Linear Programming
*\* This article is a set of study notes based on online teaching materials. Some content and examples are adapted from those resources and are not fully created by me.*
*\* These notes are derived from the original public videos in this [YouTube playlist](https://youtube.com/playlist?list=PLj6E8qlqmkFtoRpLn6IXnH_eboef-3QvZ&si=SaB5LEQCwgWY7SpF). Some examples and explanations have been adapted or rephrased to facilitate better understanding and for my own practice.*
*\* Some of the graphics/text in this article were generated/refined with the help of ChatGPT to enhance clarity and improve understanding.*
<br>
# βοΈ Algorithm concepts
### What is a <span style="color:#B3A1FF">good algorithm</span>?
- Runs quickly and consumes little memory.
- When the input size doubles, the algorithm should only slow down by a constant factor `c`.
### Proposed definition
- Runtime is <span style="color:#bf9ded">platform-independent</span> and <span style="color:#bf9ded">instance-independent</span>
- <b style="color:#CA8EFF">Polynomial running time</b>: there exist constants c > 0 and d > 0 s.t. for input size N, the running time is bounded by <span style="color:#bf9ded">cNᵈ</span>.
- <b style="color:#CA8EFF">Primitive computational step</b> refers to a basic operation whose cost does not depend on the input size.
- An algorithm is efficient if it has a polynomial running time.
### Time Complexity
- Classify algorithms by their highest-order term.
Ex. `10n²+5n+2` → `n²` complexity.
- Let `T(n)` be a function to describe the worst-case running time of a certain algorithm on an input of size `n`:
    - <span style="color:#bf9ded">Asymptotic upper bound</span>: `T(n) = O(f(n))` if there exist constants `c > 0` and `n₀ ≥ 0` s.t. for all `n > n₀` we have `T(n) ≤ c·f(n)`.
    - <span style="color:#bf9ded">Asymptotic lower bound</span>: `T(n) = Ω(f(n))` if there exist constants `c > 0` and `n₀ ≥ 0` s.t. for all `n > n₀` we have `T(n) ≥ c·f(n)`.
    - <span style="color:#bf9ded">Asymptotic tight bound</span>: `T(n) = Θ(f(n))` if `T(n)` is both `O(f(n))` and `Ω(f(n))`.
<img style="width: 40%" src="https://hackmd.io/_uploads/HkELX9hGxe.png" />
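To make the definitions concrete, here is a quick numeric check with one valid pair of witnesses (the constants `c = 11` and `n0 = 7` below are assumptions chosen for this example; witnesses are never unique):

```python
# Verify that T(n) = 10n^2 + 5n + 2 is O(n^2) by exhibiting
# witnesses c = 11 and n0 = 7: for all n >= n0, T(n) <= c * n^2.
def T(n):
    return 10 * n * n + 5 * n + 2

c, n0 = 11, 7
assert all(T(n) <= c * n * n for n in range(n0, 10_000))

# The witnesses c = 10, n0 = 1 also give a lower bound,
# so T(n) is Ω(n^2) and therefore Θ(n^2).
assert all(T(n) >= 10 * n * n for n in range(1, 10_000))
```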
<br>
# π§© Matching algorithms
## <span style="color:#B3A1FF">Gale-Shapley (Stable Marriage Problem)</span>
The **Gale-Shapley algorithm** (also known as the **Stable Marriage Problem algorithm**) solves the problem of matching two equally sized sets (e.g., men and women) based on individual preferences, s.t. the resulting matching is **stable**.
>[!Important]
> A matching is considered <span style="color:#bf9ded">stable</span> if there is no pair (A, B) s.t. both A and B would prefer each other over their current matches.
### π‘ Algorithm notes:
- Individuals from one group <span style="color:#bf9ded">propose</span> to individuals in the other group based on their preferences (rankings).
- Members of the other group temporarily hold the best proposal they have received, rejecting the others, or dumping the current partner if the incoming proposer ranks higher.
- Rejected (dumped) individuals continue proposing to their next preferred option.
- The process repeats <span style="color:#bf9ded">until everyone is matched</span>.
### π Time complexity
1. Initialize all people to be free: `O(n)`
2. Initialize women's preference rankings and current status: `O(n²)`
3. Engagement:
- Store each woman's preference ranking in a map or array: `O(n²)`
>[!Note]
> The data is stored by ranking, using the index to represent each man's number. For example, [3, 1, 2] means the woman ranks man 1 as her 3rd choice, man 2 as 1st, and man 3 as 2nd. This allows preference comparisons in O(1) time instead of O(n).
- Use a stack or linked list to track free men: `O(1)` (per operation)
- For each free man:
- Propose to the next woman on his list: `O(1)`
- Check if the woman is engaged: `O(1)`
- Compare current proposer with her current partner using ranking: `O(1)`
- Update engagement accordingly: `O(1)`
- The total number of proposals is at most `n²`, so this phase is `O(n²)`
4. Return the engaged result: `O(n)`
**⇒ Overall time complexity: `O(n²)`**
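The proposal loop described above can be sketched as runnable Python (the preference data at the bottom is hypothetical):

```python
from collections import deque

def gale_shapley(men_prefs, women_prefs):
    """Return a stable matching as {man: woman}.

    men_prefs[m]   = list of women in m's preference order
    women_prefs[w] = list of men in w's preference order
    """
    # rank[w][m] = position of man m in woman w's list, for O(1) comparisons
    rank = {w: {m: i for i, m in enumerate(prefs)}
            for w, prefs in women_prefs.items()}
    next_choice = {m: 0 for m in men_prefs}  # next woman each man proposes to
    engaged_to = {}                          # woman -> current partner
    free_men = deque(men_prefs)              # queue of free men

    while free_men:
        m = free_men.popleft()
        w = men_prefs[m][next_choice[m]]
        next_choice[m] += 1
        if w not in engaged_to:                    # woman is free: accept
            engaged_to[w] = m
        elif rank[w][m] < rank[w][engaged_to[w]]:  # she prefers m: dump current
            free_men.append(engaged_to[w])
            engaged_to[w] = m
        else:                                      # rejected: m stays free
            free_men.append(m)
    return {m: w for w, m in engaged_to.items()}

# Tiny example with hypothetical preferences
men = {"A": ["x", "y"], "B": ["x", "y"]}
women = {"x": ["B", "A"], "y": ["A", "B"]}
print(sorted(gale_shapley(men, women).items()))  # [('A', 'y'), ('B', 'x')]
```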
### β Example:
<img style="width: 50%" src="https://hackmd.io/_uploads/SJuaHViGgg.png" />
<br>
<br>
# πΈοΈ Graphs
Graphs are made up of <b style="color:#CA8EFF">vertices (nodes)</b> and <b style="color:#CA8EFF">edges</b>. Each edge joins two vertices (nodes).
### π‘ Key points
#### Basic concepts
- A graph is written as `G = (V, E)`, where `V` is the set of vertices and `E` the set of edges.
- The edge format `e: {u, v}` indicates an <span style="color:#bf9ded">undirected</span> connection between vertices `u` and `v`.
<img style="width: 50%" src="https://hackmd.io/_uploads/rJbmapnzex.png" />
- The edge format `e: (u, v)` indicates a <span style="color:#bf9ded">directed</span> connection from `u` to `v`.
<img style="width: 50%" src="https://hackmd.io/_uploads/B1Jcnp2Mxg.png" />
The drawn <span style="color:#bf9ded">length</span> of an edge <span style="color:#bf9ded">does not represent the distance</span> between two vertices.
`|V| = the number of vertices = n`
`|E| = the number of edges = m`
It takes `O(m+n)` to read the input.
#### <b style="color:#CA8EFF">Path</b>
A path is a sequence of vertices traversed in order (e.g., `P = <v₁, v₂, v₃>`).
A path that visits each vertex at most once is called a <b style="color:#CA8EFF">simple path</b>.
#### <b style="color:#CA8EFF">Cycle</b>
A cycle is a path that <span style="color:#bf9ded">starts and ends at the same vertex</span>, visits all other vertices exactly once, and contains at least three vertices.
#### <b style="color:#CA8EFF">Connected</b>
An undirected graph is connected if, for every pair of nodes `u` and `v`, there is a <span style="color:#bf9ded">path</span> from `u` to `v`.
#### <b style="color:#CA8EFF">Distance</b>
The distance between vertices `u` and `v` is the minimum number of <span style="color:#bf9ded">edges</span> over all u-v paths.
#### <b style="color:#CA8EFF">Adjacency matrix</b>
The connections between vertices can be represented using an `n × n` matrix, known as an adjacency matrix.
<img src="https://hackmd.io/_uploads/SyvtCcpMle.png" style="width: 60%;" />
- The time complexity to look for the relationship between two vertices is `O(1)`.
- The time complexity to find all neighbors for a vertex is `O(n)`.
- The space complexity to store the matrix is `O(n²)`.
#### <b style="color:#CA8EFF">Adjacency list</b>
The connections between vertices can also be represented using an array of lists, where each index corresponds to a vertex and stores a list of its neighboring vertices. This structure is known as an adjacency list.
<img src="https://hackmd.io/_uploads/BJZTVoafee.png" style="width: 60%;" />
- The time complexity to check whether two vertices are connected is `O(k)`, where k is the degree of the vertex.
- The time complexity to find all neighbors of a vertex is `O(k)`.
- The space complexity to store the graph is `O(n + e)`, where `n` is the number of vertices and `e` is the number of edges.
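Both representations, sketched for the same small example graph (the vertex numbering and edges are illustrative):

```python
# The same 4-vertex undirected graph in both representations.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
n = 4

# Adjacency matrix: O(1) edge lookup, O(n) neighbor scan, O(n^2) space.
matrix = [[0] * n for _ in range(n)]
for u, v in edges:
    matrix[u][v] = matrix[v][u] = 1

# Adjacency list: O(k) edge lookup and neighbor scan, O(n + e) space.
adj = [[] for _ in range(n)]
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)

assert matrix[0][1] == 1 and matrix[0][3] == 0  # O(1) lookup
assert adj[2] == [0, 1, 3]                      # neighbors of vertex 2
```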
#### <b style="color:#CA8EFF">Strongly connected</b>
A directed graph is strongly connected if every pair of vertices is <b style="color:#CA8EFF">mutually reachable</b>, meaning there is a directed path from each vertex to the other.
To check whether a graph is strongly connected, we can:
1. pick any vertex `s` in graph `G`
2. save the BFS result `R = BFS(s, G)`
3. reverse all edges and run BFS again: `Rʳᵉᵛ = BFS(s, Gʳᵉᵛ)`
4. if `R = V = Rʳᵉᵛ` then return true, else false
The main idea is to check whether all vertices can reach a chosen vertex `s`, and whether `s` can reach all vertices. This approach avoids checking each pair individually, and the entire process can be completed in linear time using two BFS traversals.
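The two-BFS idea can be sketched as follows (adjacency lists in a plain `dict`; the example graphs are illustrative):

```python
from collections import deque

def reachable(graph, s):
    """Set of vertices reachable from s, via BFS."""
    seen = {s}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in graph[u]:
            if v not in seen:
                seen.add(v)
                q.append(v)
    return seen

def strongly_connected(graph):
    s = next(iter(graph))              # pick any vertex s
    rev = {u: [] for u in graph}       # reverse every edge to build G_rev
    for u in graph:
        for v in graph[u]:
            rev[v].append(u)
    all_v = set(graph)
    # s reaches every vertex, and every vertex reaches s (BFS on G_rev).
    return reachable(graph, s) == all_v == reachable(rev, s)

print(strongly_connected({0: [1], 1: [2], 2: [0]}))  # True  (directed 3-cycle)
print(strongly_connected({0: [1], 1: [2], 2: []}))   # False (no path back to 0)
```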
#### <b style="color:#CA8EFF">Strong component</b>
The strong component containing `s` in a directed graph is the maximal set of all `v` s.t. `s` and `v` are mutually reachable.
---
### π Tree
An undirected graph is a <b style="color:#CA8EFF">tree</b> if it is <span style="color:#bf9ded">connected</span> and does <span style="color:#bf9ded">not</span> contain a <span style="color:#bf9ded">cycle</span>.
<div style="display: flex; justify-content: space-between;">
<img src="https://hackmd.io/_uploads/BJewYV6fgg.png" style="width: 90%;" />
<img src="https://hackmd.io/_uploads/HkSPYNpzgg.png" style="width: 90%;" />
</div>
The trees above are the same. It's easy to see that they are connected and contain no cycles.
#### Rooted tree
A <b style="color:#CA8EFF">rooted tree</b> is a tree with a clear starting point: the <span style="color:#bf9ded">root</span> node. It follows a <b style="color:#CA8EFF">hierarchical</b> structure with roles like parent, child, sibling, leaf, and more.
<img src="https://hackmd.io/_uploads/Hk_FJrpzgl.png" style="width: 50%;" />
---
### π <span style="color:#B3A1FF">Breadth-First Search (BFS)</span>
<b style="color:#CA8EFF">Breadth-First Search (BFS)</b> visits vertices <span style="color:#bf9ded">level by level</span>, starting from the source vertex and expanding outward. Using BFS, the original graph (left) can be transformed into the <b style="color:#CA8EFF">BFS tree</b> shown on the right.
<div style="display: flex; justify-content: space-between;">
<img src="https://hackmd.io/_uploads/r1bEaB6Mlg.png" style="width: 90%;" />
<img src="https://hackmd.io/_uploads/B1KlyIpGee.png" style="width: 90%;" />
</div>
- Layer `L₀` is the starting point `s`, and layer `L₁` = all neighbors of `L₀`.
- `i` represents the distance from `s` to the vertices belonging to layer `Lᵢ`.
- In a <span style="color:#bf9ded">BFS tree</span>, if there is an edge between two vertices, their layers (or levels) differ by at most 1.
- Dotted lines represent non-tree edges.
Python implementation:
```python
from collections import deque

def bfs(graph, start):
    """Visit vertices level by level; graph maps each vertex to its neighbors."""
    visited = {start}
    queue = deque([start])
    order = []
    while queue:
        current = queue.popleft()
        order.append(current)              # visit(current)
        for neighbor in graph[current]:
            if neighbor not in visited:    # enqueue each vertex at most once
                visited.add(neighbor)
                queue.append(neighbor)
    return order
```
>[!Tip]
>BFS is widely used in practical applications, such as the bucket fill tool in graphics programs. It starts from a target pixel and systematically explores neighboring pixels in a breadth-first manner to apply the fill to all connected regions of the same color.
---
### π <span style="color:#B3A1FF">Depth-First Search (DFS)</span>
<b style="color:#CA8EFF">Depth-First Search (DFS)</b> explores as deeply as possible before backtracking. Think of it like recursively visiting the smallest (or next) child vertex each time and marking it as visited in an array, continuing until a leaf node is reached. Then, it backtracks to explore other unvisited paths.
Using DFS, the original graph (left) can be transformed into the <b style="color:#CA8EFF">DFS tree</b> shown on the right.
<div style="display: flex; justify-content: space-between;">
<img src="https://hackmd.io/_uploads/Sylj5_aGlg.png" style="width: 90%;" />
<img src="https://hackmd.io/_uploads/HJlx6dTfgg.png" style="width: 90%;" />
</div>
- DFS can be implemented using <span style="color:#bf9ded">recursion</span> or an explicit <span style="color:#bf9ded">stack</span>.
Python implementation:
```python
def dfs(graph, start):
    """Iterative DFS with an explicit stack; returns the visiting order."""
    visited = set()
    order = []
    stack = [start]
    while stack:
        current = stack.pop()
        if current not in visited:
            visited.add(current)
            order.append(current)          # process current
            # Push neighbors in reverse so the smallest is popped first.
            for neighbor in reversed(graph[current]):
                if neighbor not in visited:
                    stack.append(neighbor)
    return order
```
To preserve the DFS visiting order, neighbor vertices are pushed in reverse order so that the smallest one is processed first when popped from the stack.
---
### π DAGs and Topological Ordering
#### <b style="color:#CA8EFF">Directed Acyclic Graphs (DAGs)</b>
A DAG is a directed graph that contains no cycles. For example, trees and forests are DAGs. DAGs are often used to represent <span style="color:#bf9ded">dependencies</span> or <span style="color:#bf9ded">precedence constraints</span>.
<img src="https://hackmd.io/_uploads/HJcDk-yQex.png" style="width: 40%;" />
#### <b style="color:#CA8EFF">Topological Ordering</b>
Given a directed graph G, a topological ordering is an ordering of its vertices as `v₁, v₂, ..., vₙ` so that for every edge `(vᵢ, vⱼ)`, we have `i < j`.
By repeatedly selecting a vertex with no incoming edges and numbering it next, we can find the topological ordering as shown below:
<img src="https://hackmd.io/_uploads/rJyKVZkQel.png" style="width: 70%;" />
<img src="https://hackmd.io/_uploads/r14BEbkmgx.png" style="width: 50%;" />
This can be implemented using the following steps:
1. Initialize a set `S` with all vertices that have no incoming edges.
2. While `S` is not empty, do the following:
a. Remove a vertex `v` from `S` and add it to the topological ordering.
b. For each neighbor `u` of `v`, decrease the in-degree of `u` by 1.
c. If the in-degree of `u` becomes 0, add `u` to `S`.
This makes the overall time complexity `O(m+n)`, where `n` is the number of vertices and `m` is the number of edges.
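The steps above (often called Kahn's algorithm) can be sketched as:

```python
from collections import deque

def topological_order(graph):
    """graph: {vertex: [out-neighbors]}. Returns a topological ordering,
    or raises ValueError if the graph has a cycle (i.e., is not a DAG)."""
    indegree = {u: 0 for u in graph}
    for u in graph:
        for v in graph[u]:
            indegree[v] += 1
    s = deque(u for u in graph if indegree[u] == 0)  # S: no incoming edges
    order = []
    while s:
        v = s.popleft()
        order.append(v)
        for u in graph[v]:          # removing v lowers each neighbor's in-degree
            indegree[u] -= 1
            if indegree[u] == 0:
                s.append(u)
    if len(order) != len(graph):
        raise ValueError("graph has a cycle")
    return order

print(topological_order({"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}))
# ['a', 'b', 'c', 'd']
```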
<br>
<br>
# π° Greedy algorithms
An algorithm is <b style="color:#CA8EFF">greedy</b> if it builds up a solution in small steps, choosing a decision at each step myopically to optimize some underlying criterion.
- "The greedy algorithm stays ahead": show that after each step of the greedy algorithm, its partial solution is at least as good as the corresponding part of any optimal solution.
### π‘ Key points
#### Basic concepts
- Use a <span style="color:#bf9ded">simple</span> rule to select the first request `i₁`.
- Once `i₁` is selected, reject all requests not compatible with `i₁`.
- To free the resource as soon as possible, the best option is to accept the request that <span style="color:#bf9ded">finishes first</span>.
<hr>
### π Interval Scheduling Problem
The <b style="color:#CA8EFF">interval scheduling problem</b> is a classic problem. To solve it using a greedy algorithm, we can follow the steps below:
- `jobs`: the set of all unscheduled requests
- `schedule`: the set of selected non-overlapping requests
```
1. schedule = ∅
// Start with an empty set of scheduled jobs
2. while jobs is not empty:
3. pick a job j ∈ jobs with the earliest finish time ← greedy choice
4. add j to schedule
5. remove j and all jobs that conflict with j from jobs
6. return schedule
```
<img src="https://hackmd.io/_uploads/BkW4371mge.png" style="width: 50%;" />
Always choose the job that finishes earliest.
#### Time Complexity:
1. Sort the `jobs` by their finish time `f(i)` in ascending order: `O(n log n)`
2. Construct an array of job start times: `O(n)`
3. Instead of removing all overlapping jobs in each iteration, we simply move to the next job whose start time is after the current jobβs finish time. This avoids repeatedly checking all remaining jobs and ensures that each job is considered at most once, spending only: `O(n)`
**⇒ Overall time complexity: `O(n log n)`**
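The greedy rule and the linear scan can be sketched as follows (the job list is hypothetical):

```python
def interval_schedule(jobs):
    """jobs: list of (start, finish). Greedy: earliest finish time first."""
    schedule = []
    last_finish = float("-inf")
    for start, finish in sorted(jobs, key=lambda j: j[1]):  # O(n log n) sort
        if start >= last_finish:   # compatible with everything chosen so far
            schedule.append((start, finish))
            last_finish = finish   # skip jobs that start before this finish
    return schedule

jobs = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9), (6, 10), (8, 11)]
print(interval_schedule(jobs))  # [(1, 4), (5, 7), (8, 11)]
```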
#### Depth
In interval problems, the <b style="color:#CA8EFF">depth</b> refers to the maximum number of overlapping intervals at any point in time (in other words, it represents the lower bound on the number of resources needed to process all intervals concurrently).
To determine the depth, we need to examine all relevant time points and track how many intervals overlap at each one.
<img src="https://hackmd.io/_uploads/r1iC2Eg7gx.png" style="width: 50%;" />
<hr>
### π <span style="color:#B3A1FF">Interval Partitioning</span>
<b style="color:#CA8EFF">Interval partitioning</b> assigns the whole set of intervals to the fewest possible resources (labels) so that intervals sharing a resource do not overlap. This can be implemented using the following steps:
```
1. {I₁, ..., Iₙ} = sort intervals in ascending order by their starting times
2. for j from 1 to n do
3. exclude labels of all assigned intervals that are not compatible with Iⱼ
4. if (there's a nonexcluded label for Iⱼ)
5. assign a nonexcluded label to Iⱼ
```
- To implement the exclusion of incompatible labels, we can <span style="color:#bf9ded">record the current usage time for all labels</span> to make it easy to compare against the starting time of the current interval Iⱼ.
#### Illustration
Using the same interval graph from last section, first we sort them by their <span style="color:#bf9ded">starting time</span>:
<img src="https://hackmd.io/_uploads/Sy2ByUlXlx.png" style="width: 100%;" />
- The illustration shows that the depth is 3 (which means we need 3 labels).
- After sorting intervals by their start time, we <span style="color:#bf9ded">assign each interval to the first available label</span> s.t. it does not overlap with previously assigned intervals in that label.
- After assigning an interval, remember to <span style="color:#bf9ded">update the current usage time</span> for that label.
>[!Note]
>If, at some moment, one interval finishes exactly when another starts, process the finishing one first, so that its label is freed before the new assignment and no repeated calculation occurs.
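The labeling scheme can be sketched with a min-heap that records each label's current usage time (the interval data is illustrative):

```python
import heapq

def partition_intervals(intervals):
    """intervals: list of (start, finish). Returns {interval: label},
    using the minimum possible number of labels (= the depth)."""
    labels = {}
    heap = []          # (finish time of the label's last interval, label)
    next_label = 0
    for start, finish in sorted(intervals):   # ascending start time
        if heap and heap[0][0] <= start:      # a label is already free
            _, label = heapq.heappop(heap)    # (finishing processed first)
        else:                                 # all labels busy: open a new one
            label = next_label
            next_label += 1
        labels[(start, finish)] = label
        heapq.heappush(heap, (finish, label)) # update the label's usage time
    return labels

result = partition_intervals([(0, 3), (1, 4), (2, 5), (3, 6)])
print(max(result.values()) + 1)  # 3 labels needed (depth = 3)
```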
<hr>
### π <span style="color:#B3A1FF">Scheduling to Minimize Lateness</span>
<b style="color:#CA8EFF">Scheduling to minimize lateness</b> means that each interval has a deadline by which it should be completed. We evaluate the arrangement by measuring the <span style="color:#bf9ded">maximum lateness</span>, and aim to minimize that value.
<img src="https://hackmd.io/_uploads/SJv1cLl7xg.png" style="width: 50%;" />
To implement this algorithm, we simply sort all intervals by their deadlines and schedule them in that order:
- `f`: the finishing time of the last scheduled request
- `s`: the starting time
```
1. {d₁, ..., dₙ} = sort requests in ascending order of their deadlines
2. f = s
3. for i from 1 to n do
4. assign request i to the time interval [f, f + tᵢ]
5. f = f + tᵢ
6. return the set of scheduled intervals
```
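The earliest-deadline-first rule can be sketched as follows (requests are hypothetical `(processing time, deadline)` pairs):

```python
def schedule_min_lateness(requests, s=0):
    """requests: list of (t_i, d_i) = (processing time, deadline).
    Earliest-deadline-first; returns (schedule, max lateness)."""
    f = s                                   # finishing time of last request
    schedule, max_lateness = [], 0
    for t, d in sorted(requests, key=lambda r: r[1]):  # ascending deadlines
        schedule.append((f, f + t))         # run this request in [f, f + t]
        f += t
        max_lateness = max(max_lateness, f - d)
    return schedule, max_lateness

sched, lateness = schedule_min_lateness([(1, 2), (2, 4), (3, 6)])
print(sched, lateness)  # [(0, 1), (1, 3), (3, 6)] 0
```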
<hr>
### π Shortest Paths
The <b style="color:#CA8EFF">shortest path</b> problem: given a directed graph `G = (V, E)` and a starting point `s`, we want to find the shortest path `Pᵥ` from `s` to every other vertex `v ∈ V - {s}`.
#### <span style="color:#B3A1FF">Dijkstra's Algorithm</span>
We aim to compute the shortest path distances from a source vertex `s` to all other vertices in a weighted graph `G = (V, E)` with non-negative edge weights `I(u, v)`.
##### Implementation Dijkstra(G, I)
- `S`: The set of vertices whose shortest path distances from `s` have been finalized.
- `d(u)`: The shortest known distance from `s` to vertex `u`.
- `d′(v)`: The tentative shortest distance to a vertex `v ∉ S`, calculated via a neighbor `u ∈ S`.
```
1. Initialize: S = {s}, d(s) = 0
2. While S ≠ V:
3. Select a vertex v ∉ S that is adjacent to at least one u ∈ S,
4. s.t. d′(v) = min { d(u) + I(u, v) | u ∈ S and (u, v) ∈ E }
5. Add v to S and set d(v) = d′(v)
```
<img src="https://hackmd.io/_uploads/H1n6gCxmlx.png" style="width: 40%;" />
##### Ex.
<img src="https://hackmd.io/_uploads/r1dYHDZXeg.png" style="width: 100%;" />
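The selection rule `d′(v) = min{d(u) + I(u, v)}` can be sketched with a binary min-heap (the graph and weights are illustrative):

```python
import heapq

def dijkstra(graph, s):
    """graph: {u: [(v, weight), ...]} with non-negative weights.
    Returns d = shortest-path distances from s."""
    d = {s: 0}
    heap = [(0, s)]                 # (tentative distance d'(v), v)
    while heap:
        dist, u = heapq.heappop(heap)
        if dist > d.get(u, float("inf")):
            continue                # stale entry: u was already finalized
        for v, w in graph[u]:
            if dist + w < d.get(v, float("inf")):
                d[v] = dist + w     # improved d'(v) via neighbor u
                heapq.heappush(heap, (d[v], v))
    return d

g = {"s": [("a", 1), ("b", 4)], "a": [("b", 2), ("c", 6)],
     "b": [("c", 3)], "c": []}
print(dijkstra(g, "s"))  # {'s': 0, 'a': 1, 'b': 3, 'c': 6}
```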
<hr>
### π Priority Queue
A <b style="color:#CA8EFF">priority queue</b> gives every element its own priority. There are two types: the <span style="color:#bf9ded">max priority queue</span> and the <span style="color:#bf9ded">min priority queue</span>.
##### Related actions:
- FindMin(Max)
- ExtractMin(Max)
- Insert
- ChangeKey
#### Heap (Recall)
A <b style="color:#CA8EFF">heap</b> is a specialized <span style="color:#bf9ded">tree-based data structure</span> that satisfies the heap property: in a max-heap, each parent node is greater than or equal to its children; in a min-heap, each parent node is less than or equal to its children.
<img src="https://hackmd.io/_uploads/r1rCMtbXlx.png" style="width: 80%;" />
The only possible empty space appears at the bottom right, which ensures the heap remains a complete binary tree. A heap can be implemented using an array, since a complete binary tree occupies contiguous positions:
<img src="https://hackmd.io/_uploads/HydsGtW7eg.png" style="width: 80%;" />
If a new vertex is added to the heap but does not satisfy the heap property, it needs to be swapped with its parent until the property is restored.
<img src="https://hackmd.io/_uploads/SJXwV2WQxl.png" style="width: 90%;" />
To remove the root vertex, we replace it with the last leaf node and then percolate it down the heap until the heap property is maintained.
<img src="https://hackmd.io/_uploads/Bkhu33-Qge.png" style="width: 100%;" />
#### <b style="color:#CA8EFF">Max(Min) heapify</b>
<span style="color:#bf9ded">Heapify</span> is the process of adjusting a binary tree to maintain the max-heap or min-heap property.
- In a max-heap, each parent must be greater than or equal to its children.
- In a min-heap, each parent must be less than or equal to its children.
When a node violates this rule, we compare it with its children and swap it with the larger (or smaller) child, then repeat the process recursively until the property is restored.
##### Ex. Max heapify
<img src="https://hackmd.io/_uploads/HyEdepb7ll.png" style="width: 100%;" />
Check the parent nodes from the bottom to the top. This gives an `O(n log n)` bound (a tighter analysis shows that building a heap this way is actually `O(n)`).
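The sift-down step and the bottom-up build can be sketched as:

```python
def max_heapify(a, i, n):
    """Sift a[i] down until the subtree rooted at i satisfies the max-heap
    property. Children of index i live at 2i + 1 and 2i + 2 in the array."""
    while True:
        left, right, largest = 2 * i + 1, 2 * i + 2, i
        if left < n and a[left] > a[largest]:
            largest = left
        if right < n and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]   # swap with the larger child
        i = largest

def build_max_heap(a):
    # Check parent nodes from the bottom up; leaves are already valid heaps.
    for i in range(len(a) // 2 - 1, -1, -1):
        max_heapify(a, i, len(a))
    return a

print(build_max_heap([3, 9, 2, 1, 4, 5]))  # [9, 4, 5, 1, 3, 2]
```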
<hr>
### π Minimum Spanning Trees(MST)
The <b style="color:#CA8EFF">minimum spanning tree</b> is a problem that involves finding a subset of edges in a weighted, connected graph s.t. <span style="color:#bf9ded">all vertices are connected</span>, the <span style="color:#bf9ded">total edge weight is minimized</span>, and no cycles are formed.
- Let `T` be a minimum-cost solution, we want to get `(V, T)`.
<img src="https://hackmd.io/_uploads/S1cjDGMQee.png" style="width: 40%;" />
#### <span style="color:#B3A1FF">Kruskal's algorithm</span>
- Start with `T = ∅`
- Consider edges in ascending order of cost
- Insert edge `e` in `T` as long as it does not create a cycle, otherwise, discard `e` and continue.
<img src="https://hackmd.io/_uploads/BksdwMfmxe.png" style="width: 100%;" />
Always choose the cheapest edge that does not create a cycle.
##### Implementation
```
1. {e₁, e₂, ..., eₘ} = sort edges in ascending order of their costs
2. T = {}
3. for each eᵢ = (u, v) do
4. if (u and v are not connected by edges in T) then
5. T = T ∪ {eᵢ}
```
#### <span style="color:#B3A1FF">Prim's algorithm</span>
- Start with a root node `s`.
- <span style="color:#bf9ded">Greedily</span> grow a tree `T` from `s` outward.
- At each step, add the cheapest edge `e` to the partial tree `T` that has exactly one endpoint in `T`.
<img src="https://hackmd.io/_uploads/B1fC9zz7gx.png" style="width: 100%;" />
##### Implementation
- Start with a node `s`, add it to the explored set `S`.
```
1. Initialize: S = {s}
2. While S ≠ V:
3. Select a vertex v ∉ S that is adjacent to at least one u ∈ S,
4. s.t. d′(v) = min { cost(u, v) | u ∈ S and (u, v) is an edge }
5. Add v to S and set d(v) = d′(v)
```
#### <span style="color:#B3A1FF">Reverse-delete algorithm</span>
- Start with `T = E`.
- Consider edges in descending order of cost.
- Delete edge `e` from `T` unless doing so would disconnect `T`.
<img src="https://hackmd.io/_uploads/BypSnfGQgx.png" style="width: 100%;" />
**This method is hard to implement, just a concept.**
#### Cut property
Let `S` be any subset of nodes, and let `e = (v, w)` be the <span style="color:#bf9ded">minimum cost edge</span> with one end in `S` and the other in `V-S`. Then every minimum spanning tree contains `e`.
<img src="https://hackmd.io/_uploads/rkLPyXMQlx.png" style="width: 50%;" />
#### Cycle property
Let `C` be any cycle in `G`, and let `e = (v, w)` be the <span style="color:#bf9ded">maximum cost edge</span> in `C`. Then `e` does not belong to any minimum spanning tree.
#### The Union-Find data structure
The union-find data structure represents disjoint sets. Each set has a representative. Its operations include:
- MakeUnionFind(S): Initialize a set for each element in `S`.
- Find(u): Return the representative of the set containing `u`.
- Union(A, B): Merge sets `A` and `B`.
<img src="https://hackmd.io/_uploads/HkEo_tM7eg.png" style="width: 70%;" />
The find operation updates the structure by compressing paths, making future find operations faster.
<img src="https://hackmd.io/_uploads/SJzR_FzXxl.png" style="width: 30%;" />
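Kruskal's rule combined with union-find (here with path compression; the edge costs are illustrative) can be sketched as:

```python
def kruskal(n, edges):
    """n vertices (0..n-1), edges = [(cost, u, v), ...].
    Returns (total cost, chosen edges) of a minimum spanning tree."""
    parent = list(range(n))                 # MakeUnionFind: each vertex alone

    def find(u):                            # Find, with path compression
        while parent[u] != u:
            parent[u] = parent[parent[u]]   # point u one level higher
            u = parent[u]
        return u

    total, tree = 0, []
    for cost, u, v in sorted(edges):        # ascending order of cost
        ru, rv = find(u), find(v)
        if ru != rv:                        # no cycle: u, v in different sets
            parent[ru] = rv                 # Union
            total += cost
            tree.append((u, v))
    return total, tree

edges = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]
print(kruskal(4, edges))  # (7, [(0, 1), (1, 2), (2, 3)])
```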
<br>
<br>
# βοΈ Divide and conquer
### π‘ Key points
#### Mathematical theory
- <b style="color:#CA8EFF">Weak Induction</b> - Given the proposition `P(n)` where `n ∈ X`, a proof by mathematical induction is of the form:
    - <span style="color:#bf9ded">Basis step</span>: The proposition `P(0)` is shown to be true.
    - <span style="color:#bf9ded">Inductive step</span>: The implication `P(k)` → `P(k+1)` is shown to be true for every `k ∈ X`.
- <b style="color:#CA8EFF">Strong Induction</b> - Given the proposition `P(n)` where `n ∈ X`, a proof by the second principle of <span style="color:#bf9ded">mathematical induction</span> (or <span style="color:#bf9ded">strong induction</span>) is of the form:
    - <span style="color:#bf9ded">Basis step</span>: The proposition `P(0)` is shown to be true.
    - <span style="color:#bf9ded">Inductive step</span>: The implication `P(0) ∧ P(1) ∧ ... ∧ P(k)` → `P(k+1)` is shown to be true for every `k ∈ X`.
#### <b style="color:#CA8EFF">Divide and conquer</b>
- <span style="color:#bf9ded">Divide</span>: Break the input into several parts of <span style="color:#bf9ded">the same type</span>.
- <span style="color:#bf9ded">Conquer</span>: Solve the problem in each part <span style="color:#bf9ded">recursively</span>.
- <span style="color:#bf9ded">Combine</span>: Combine the solutions to sub-problems into an overall solution.
**Spend only linear time for the initial division and final recombining.**
<hr>
### π <span style="color:#B3A1FF">Merge sort</span>
Merge sort divides the input into two halves and sorts each half recursively.
- `A`: The array of elements to sort.
- `p`: Index at which sorting starts.
- `r`: Index at which sorting stops.
```
Mergesort(A, p, r)
1. if (p < r) then
2. q = ⌊(p+r)/2⌋
3. Mergesort(A, p, q)
4. Mergesort(A, q+1, r)
5. Merge(A, p, q, r)
```
<img src="https://hackmd.io/_uploads/Sy645GmQxg.png" style="width: 40%;" />
#### Time complexity
- `T(n)` for input size `n`
- Divide takes `D(n)`; for an array split it is `O(1)`.
- Conquer: `2T(n/2)`
- Combine takes `C(n)`; we want it to be `O(n)`.
⇒ `T(n) = D(n) + 2T(n/2) + C(n) = O(n log n)`.
>[!Note]
>In MergeSort, each level does `O(n)` work (as the total data per level is still `n`), and there are `log n` levels, so the total time is `O(n log n)`.
#### Merge operation
At each step, take the smaller of the two front elements and append it to the result array.
<img src="https://hackmd.io/_uploads/B10-rw7mex.png" style="width: 60%;" />
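The recursion and the merge step can be sketched as:

```python
def merge_sort(a):
    if len(a) <= 1:
        return a
    mid = len(a) // 2                      # q = floor((p + r) / 2)
    left = merge_sort(a[:mid])             # conquer each half recursively
    right = merge_sort(a[mid:])
    # Merge: always take the smaller front element — O(n) per level.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]   # append the leftover tail

print(merge_sort([5, 2, 8, 1, 9, 3]))  # [1, 2, 3, 5, 8, 9]
```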
<hr>
### π <span style="color:#B3A1FF">Counting inversions</span>
Assume one array (`A`) is fixed in ascending order; the inversions with respect to the other array (`B`) can be counted in two ways:
- <span style="color:#bf9ded">Method 1</span>: For each element in array `B`, <span style="color:#bf9ded">find and count</span> the number of elements <span style="color:#bf9ded">before it</span> that are <span style="color:#bf9ded">greater than the current</span> element. This represents the number of inversions involving that element.
- <span style="color:#bf9ded">Method 2</span>: Draw lines <span style="color:#bf9ded">connecting corresponding elements between arrays `A` and `B`</span>, and count the number of intersections between the lines. Each intersection represents an inversion.
<img src="https://hackmd.io/_uploads/rJa6Oh7Qgx.png" style="width: 60%;" />
Implement divide and conquer in it:
<img src="https://hackmd.io/_uploads/SyptjnmXlx.png" style="width: 60%;" />
To ensure the comparison runs in linear time, we first sort the two divided subarrays.
- During the merge step, we use two pointers: one for each subarray. As we compare elements from left to right, if an <span style="color:#bf9ded">element from the left subarray is greater than the current element from the right subarray</span>, it forms an <span style="color:#bf9ded">inversion</span>.
- Because both subarrays are sorted, we know that <span style="color:#bf9ded">all remaining elements in the left subarray will also be greater</span>, so we can count multiple inversions at once.
- This allows us to <span style="color:#bf9ded">avoid restarting the comparison</span> and ensures each element is only compared once.
<img src="https://hackmd.io/_uploads/ryMqDTmXex.png" style="width: 60%;" />
#### Implementation
```
Sort-and-Count(L, p, q)
1. if (p = q) then return 0
2. else
3. m = ⌊(p+q)/2⌋
4. rp = Sort-and-Count(L, p, m)
5. rq = Sort-and-Count(L, m+1, q)
6. rm = Merge-and-Count(L, p, m, q)
7. return r = rp + rq + rm
```
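Sort-and-Count can be sketched as runnable Python; when an element from the right half is placed before the remaining elements of the left half, all of those remaining elements count as inversions at once:

```python
def sort_and_count(a):
    """Return (sorted copy of a, number of inversions in a)."""
    if len(a) <= 1:
        return a, 0
    mid = len(a) // 2
    left, r_left = sort_and_count(a[:mid])
    right, r_right = sort_and_count(a[mid:])
    merged, i, j, crossings = [], 0, 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            # right[j] jumps ahead of everything left in `left`:
            # that's len(left) - i inversions counted at once.
            crossings += len(left) - i
            merged.append(right[j])
            j += 1
    merged += left[i:] + right[j:]
    return merged, r_left + r_right + crossings

print(sort_and_count([2, 4, 1, 3, 5])[1])  # 3 inversions: (2,1), (4,1), (4,3)
```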
<hr>
### π Closest pair of points
Given a set of `n` points in the plane, where `pᵢ` is located at `(xᵢ, yᵢ)`, we want to find a pair `(pᵢ, pⱼ)` with the smallest Euclidean distance between them: `√[(xᵢ-xⱼ)²+(yᵢ-yⱼ)²]`.
- Using <span style="color:#bf9ded">divide and conquer</span>, we split the problem into smaller subproblems of the same type. First, we <span style="color:#bf9ded">find the closest pair on each side</span> (the orange edges), and then check if a closer pair exists <span style="color:#bf9ded">across the two halves</span> (the red edge):
<img src="https://hackmd.io/_uploads/HJkmG7Emgg.png" style="width: 40%;" />
- When searching for the closest pair across the two halves, we only need to consider <span style="color:#bf9ded">the smaller of the two closest distances from each half</span> as the <b style="color:#CA8EFF">threshold</b> for cross-pair comparisons:
<img src="https://hackmd.io/_uploads/SyG6M7N7xe.png" style="width: 40%;" />
#### Implementation
```
Closest-Pair(P)
1. construct Px and Py // O(n log n)
2. (p*₁, p*₂) = Closest-Pair-Rec(Px, Py) // T(n)
Closest-Pair-Rec(Px, Py)
1. if |P| ≤ 3 then return closest pair measured by all pair-wise distances
2. x* = (⌊n/2⌋)-th smallest x-coordinate in Px
3. construct Qx, Qy, Rx, Ry // O(n)
4. (q*₁, q*₂) = Closest-Pair-Rec(Qx, Qy) // T(n/2)
5. (r*₁, r*₂) = Closest-Pair-Rec(Rx, Ry) // T(n/2)
6. δ = min(d(q*₁, q*₂), d(r*₁, r*₂))
7. L = {(x, y): x = x*}; S = {points in P within distance δ of L}
8. construct Sy // O(n)
9. for each s ∈ S do
10. compute distance from s to each of next 15 points in Sy
11. d(s, s') = min distance of all computed distances // O(n)
12. if d(s, s') < δ then return (s, s')
13. else if d(q*₁, q*₂) < d(r*₁, r*₂) then return (q*₁, q*₂)
14. else return (r*₁, r*₂)
```
- In the closest pair algorithm, we only need to <span style="color:#bf9ded">compare each point with up to 15 nearby points</span> in the sorted strip because of geometric packing constraints.
- Within a vertical strip of width 2Ξ΄, at most 15 points can <span style="color:#bf9ded">lie within Ξ΄ distance</span> of each other without <span style="color:#bf9ded">violating the minimal distance found so far</span>. This limit ensures the algorithm remains efficient at `O(n log n)` time.
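As a concrete sketch, the recursion above can be written in Python as follows (the function name and point representation are chosen here for illustration; it assumes at least two distinct points given as tuples):

```python
from math import dist  # Euclidean distance, Python 3.8+

def closest_pair(points):
    """Smallest pairwise Euclidean distance, O(n log n).
    Assumes at least two distinct (x, y) tuples."""
    px = sorted(points)                       # sorted by x
    py = sorted(points, key=lambda p: p[1])   # sorted by y

    def rec(px, py):
        n = len(px)
        if n <= 3:  # base case: brute force over all pairs
            return min(dist(px[i], px[j])
                       for i in range(n) for j in range(i + 1, n))
        mid = n // 2
        x_star = px[mid - 1][0]               # dividing vertical line
        qx, rx = px[:mid], px[mid:]
        left = set(qx)
        qy = [p for p in py if p in left]     # keep both halves y-sorted
        ry = [p for p in py if p not in left]
        delta = min(rec(qx, qy), rec(rx, ry))
        # strip of width 2*delta around the line, already sorted by y
        strip = [p for p in py if abs(p[0] - x_star) < delta]
        best = delta
        for i, s in enumerate(strip):
            for t in strip[i + 1:i + 16]:     # only 15 successors matter
                best = min(best, dist(s, t))
        return best

    return rec(px, py)
```

Filtering `py` by membership keeps both halves sorted by y without re-sorting, which is what preserves the `O(n log n)` bound.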
<br>
<br>
# π§ Dynamic programming
<span style="color:#B3A1FF">Dynamic programming (DP)</span> came from the term mathematical programming. Basic idea:
- One <span style="color:#bf9ded">implicitly</span> explores the space of all possible solutions.
- Carefully <span style="color:#bf9ded">decomposing</span> things into a series of subproblems.
- <span style="color:#bf9ded">Building up</span> correct solutions to larger and <span style="color:#bf9ded">larger subproblems</span>.
### π‘ Key points
#### Basic concepts
Suppose the interval scheduling problem is illustrated in the left image. We begin by sorting the intervals based on their end times, as shown in the right image.
<img src="https://hackmd.io/_uploads/r1RQpUN7gg.png" style="width: 90%;" />
To determine how to solve the problem, we first break it down into smaller subproblems and then explore different strategies for combining their solutions into a complete answer.
1. Solve based on <span style="color:#bf9ded">time</span>
Construct the overall solution by combining subproblems defined over time intervals. However, it can be difficult to determine the optimal threshold between intervals.
<img src="https://hackmd.io/_uploads/r1-H6LNmee.png" style="width: 90%;" />
2. Solve based on <span style="color:#bf9ded">weight gain</span>
Define `p(j)` as the largest index `i < j` such that interval `i` does not overlap with interval `j`.
- `Oⱼ` means the optimal solution for intervals `1, ..., j`.
- `OPT(j)` means the value of the optimal solution for intervals `1, ..., j`.
<img src="https://hackmd.io/_uploads/rJv3Jw4Qge.png" style="width: 50%;" />
In this example:
- `O₆ = {6} ∪ O₃` or `O₆ = O₅`
- `OPT(6) = max{v₆ + OPT(3), OPT(5)}`
#### Implementation
```
// Preprocessing
// 1. Sort intervals by finish times: f₁ ≤ f₂ ≤ ... ≤ fₙ
// 2. Compute p(1), p(2), ..., p(n)
Compute-Opt(j)
1. if (j = 0) then return 0
2. else return max{vⱼ + Compute-Opt(p(j)), Compute-Opt(j-1)}
```
##### Problem: This recursive process recalculates the same subproblems many times.
<img src="https://hackmd.io/_uploads/ByeJImvVXgg.png" style="width: 40%;" />
To solve the problem, we need to <span style="color:#bf9ded">store the calculated results</span> so that they can be reused when needed. This is called <b style="color:#CA8EFF">memoization</b>. The implementation thus becomes:
```
M-Compute-Opt(j)
1. if (j = 0) then return 0
2. else if (M[j] is not empty) then return M[j]
3. else return M[j] = max{vⱼ + M-Compute-Opt(p(j)), M-Compute-Opt(j-1)}
```
This is the top-down method for saving calculation time; it can also be implemented bottom-up, which is called <b style="color:#CA8EFF">iteration</b>:
```
I-Compute-Opt
1. M[0] = 0
2. for j = 1, 2, ..., n do
3. M[j] = max{vⱼ + M[p(j)], M[j-1]}
```
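A minimal Python version of the bottom-up iteration, assuming intervals are given as `(start, finish, value)` triples (this representation and the function name are assumptions for illustration):

```python
from bisect import bisect_right

def max_weight_schedule(intervals):
    """Weighted interval scheduling, bottom-up DP.
    intervals: list of (start, finish, value) triples."""
    jobs = sorted(intervals, key=lambda iv: iv[1])   # sort by finish time
    finishes = [f for _, f, _ in jobs]
    n = len(jobs)
    # p[j]: number of jobs finishing no later than job j starts,
    # i.e. the index of the last compatible job
    p = [bisect_right(finishes, jobs[j][0]) for j in range(n)]
    M = [0] * (n + 1)                                # M[j] = OPT(j)
    for j in range(1, n + 1):
        value = jobs[j - 1][2]
        M[j] = max(M[j - 1], value + M[p[j - 1]])    # skip j vs. take j
    return M[n]
```

Computing each `p(j)` by binary search keeps the whole routine at `O(n log n)`.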
#### <span style="color:#bf9ded">When to use DP</span>
<span style="color:#bf9ded">Dynamic programming</span> can be used if the problem satisfies the following properties:
- There are <span style="color:#bf9ded">only a polynomial number of subproblems</span>.
- The solution to the original problem can be <span style="color:#bf9ded">easily computed from the solutions to the subproblems</span>.
- There is a <span style="color:#bf9ded">natural ordering on subproblems</span> from "smallest" to "largest", together with an easy-to-compute recurrence.
If a problem can be solved using DP, it needs the following properties:
- <b style="color:#CA8EFF">Optimal substructure</b>: An optimal solution contains within it <span style="color:#bf9ded">optimal solutions to subproblems</span>.
- <b style="color:#CA8EFF">Overlapping subproblem</b>: A recursive algorithm <span style="color:#bf9ded">revisits the same problem over and over again</span>; typically, the total number of distinct subproblems is a <span style="color:#bf9ded">polynomial</span> in the input size.
<hr>
### π Fibonacci Sequence
Recurrence relation: `Fₙ = Fₙ₋₁ + Fₙ₋₂`, `F₀ = 0`, `F₁ = 1`.
If we want to get `fib(5)`, we need:
```
fib(5) = fib(4) + fib(3)
= (fib(3) + fib(2)) + (fib(2) + fib(1))
= ((fib(2) + fib(1)) + (fib(1) + fib(0))) + ((fib(1) + fib(0)) + fib(1))
= (((fib(1) + fib(0)) + fib(1)) + (fib(1) + fib(0))) + ((fib(1) + fib(0)) + fib(1))
```
So many <span style="color:#bf9ded">repeated calculations</span>, can use <span style="color:#bf9ded">dynamic programming</span> to improve.
<img src="https://hackmd.io/_uploads/HJzGTsEXxx.png" style="width: 80%;" />
#### Implementation (Memoization)
```
fib(n)
1. Initialize f[0...n] with -1 // -1: unfilled
2. f[0] = 0, f[1] = 1
3. fibonacci(n, f)
fibonacci(n, f)
1. if f[n] == -1 then
2. f[n] = fibonacci(n-1, f) + fibonacci(n-2, f)
3. return f[n] // if f[n] is already filled, it is returned directly
```
The array `f` stores computed values so they are never recalculated.
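In Python, the memo array can be replaced by the standard library's `functools.lru_cache`, which gives the same top-down memoized behavior:

```python
from functools import lru_cache

@lru_cache(maxsize=None)   # plays the role of the array f
def fib(n):
    """F(0) = 0, F(1) = 1; each value is computed exactly once."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)
```

With memoization, `fib(50)` returns immediately, whereas the naive recursion would make billions of calls.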
<hr>
### π Maze Routing
Given a grid-based layout with a <span style="color:#bf9ded">start</span> and an <span style="color:#bf9ded">target</span> point, maze routing is the process of <span style="color:#bf9ded">finding a valid path that connects the two points while avoiding obstacles</span>. It is commonly used in VLSI design and robotics pathfinding.
- A planar <span style="color:#bf9ded">rectangular grid graph</span>.
- Two points `S` and `T` on the graph.
- Obstacles modeled as blocked vertices.
We want to find the shortest path connecting `S` and `T`.
<div style="display: flex; justify-content: space-between;">
<img src="https://hackmd.io/_uploads/BkbWZhEQlg.png" style="width: 90%;" />
<img src="https://hackmd.io/_uploads/HJOuGn4Xgl.png" style="width: 90%;" />
</div>
The path is determined by exploring all possible moves at each step, and there may be multiple valid solutions.
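Since maze routing on an unweighted grid is essentially breadth-first search (the wave propagation at the core of Lee's algorithm), a minimal sketch might look like this (the grid encoding with `'#'` obstacles is an assumption):

```python
from collections import deque

def maze_route(grid, start, target):
    """Shortest S-T path length on a grid via BFS.
    '#' cells are obstacles; returns -1 if target is unreachable."""
    rows, cols = len(grid), len(grid[0])
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (r, c), steps = queue.popleft()
        if (r, c) == target:
            return steps
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] != '#' and (nr, nc) not in seen):
                seen.add((nr, nc))
                queue.append(((nr, nc), steps + 1))
    return -1
```

BFS explores cells in order of distance from `S`, so the first time `T` is reached the path is guaranteed shortest.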
<hr>
### π Subset Sums & Knapsacks
Given a set of `n` items and a knapsack, where item `i` weighs `wᵢ > 0` and the knapsack has capacity `W`, we want to fill the knapsack so as to maximize the total weight.
<img src="https://hackmd.io/_uploads/S1--PrHQgg.png" style="width: 40%;" />
#### <b style="color:#CA8EFF">Optimization problem</b> formulation
- `max Σᵢ∈S wᵢ`
s.t. `Σᵢ∈S wᵢ ≤ W`, `S ⊆ {1, ..., n}`
#### <b style="color:#CA8EFF">OPT(i, w)</b> = the total weight of the optimal solution for items 1, ..., i with remaining capacity w
- `OPT(i, w) = max_S Σⱼ∈S wⱼ` s.t. `S ⊆ {1, ..., i}` and `Σⱼ∈S wⱼ ≤ w`
Consider `OPT(n, W)`, i.e., the total weight of the final solution `O`. There are two cases:
- Case 1: `n ∉ O` (`O` does not include item `n`) → `OPT(n, w) = OPT(n-1, w)`.
- Case 2: `n ∈ O` (`O` includes item `n`) → `OPT(n, w) = wₙ + OPT(n-1, w-wₙ)`.
<img src="https://hackmd.io/_uploads/SkBxYYSXex.png" style="width: 60%;" />
#### Implementation
```
Subset-Sum(n, w₁, ..., wₙ, W)
1. for w = 0, 1, ..., W do // no item selected yet
2. M[0, w] = 0
3. for i = 0, 1, ..., n do // no remaining capacity
4. M[i, 0] = 0
5. for i = 1, 2, ..., n do
6. for w = 1, 2, ..., W do
7. if (wᵢ > w) then // item i does not fit in capacity w
8. M[i, w] = M[i-1, w]
9. else
10. M[i, w] = max {M[i-1, w], wᵢ + M[i-1, w-wᵢ]}
```
<img src="https://hackmd.io/_uploads/rkCAz8BXxx.png" style="width: 70%;" />
**Time complexity**: `O(nW)` → **not polynomial, only pseudo-polynomial time** (the input encodes `W` in `log W` bits, but the runtime grows with `W` itself).
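A direct Python translation of the subset-sum table-filling DP, where the fit test compares the item's weight against the current capacity `w` (the function name is illustrative):

```python
def subset_sum(weights, W):
    """M[i][w] = best total weight using the first i items
    with capacity w; O(nW) time and space."""
    n = len(weights)
    M = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        wi = weights[i - 1]
        for w in range(1, W + 1):
            if wi > w:                         # item i cannot fit
                M[i][w] = M[i - 1][w]
            else:                              # skip item i vs. take it
                M[i][w] = max(M[i - 1][w], wi + M[i - 1][w - wi])
    return M[n][W]
```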
#### π The Knapsack Problem
Given a set of `n` items and a knapsack, where item `i` weighs `wᵢ > 0` and has value `vᵢ > 0`, and the knapsack has capacity `W`, we want to fill the knapsack so as to maximize total value.
#### <b style="color:#CA8EFF">Optimization problem</b> formulation
- `max Σᵢ∈S vᵢ`
s.t. `Σᵢ∈S wᵢ ≤ W`, `S ⊆ {1, ..., n}`
<img src="https://hackmd.io/_uploads/SkaQtrBXee.png" style="width: 60%;" />
<hr>
### π <span style="color:#B3A1FF">Bellman-Ford algorithm</span>
If `G` has no negative cycles, then there is a <span style="color:#bf9ded">shortest path</span> from `s` to `t` that is <span style="color:#bf9ded">simple</span> (i.e., does not repeat nodes), and hence has at most `n-1` edges.
`OPT(i, v)` means the length of the shortest v-t path `P` using at most `i` edges.
- `OPT(n-1, s)` means the length of the shortest s-t path.
- <span style="color:#bf9ded">Case 1</span>: `P` uses at most `i-1` edges.
  - `OPT(i, v) = OPT(i-1, v)`.
- <span style="color:#bf9ded">Case 2</span>: `P` uses exactly `i` edges.
  - `OPT(i, v) = c(v, w) + OPT(i-1, w)`, where `w` is the first node after `v` on `P` and `c(v, w)` is the cost of edge `(v, w)`.
<img src="https://hackmd.io/_uploads/B1bn85SXex.png" style="width: 60%;" />
#### Implementation
```
Bellman-Ford(G, s, t)
// n = # of nodes in G
// M[0...n-1, V]: table recording optimal solutions of subproblems
1. M[0, t] = 0
2. foreach v ∈ V−{t} do
3. M[0, v] = ∞
4. for i = 1 to n-1 do
5. for v ∈ V in any order do
6. M[i, v] = min{M[i-1, v], min over (v, w) ∈ E of {c(v, w) + M[i-1, w]}}
```
<img src="https://hackmd.io/_uploads/BkfZfoH7xx.png" style="width: 50%;" />
<img src="https://hackmd.io/_uploads/B1aGmiSQgl.png" style="width: 70%;" />
**Time complexity**: `O(nm)`, where `n` is the number of nodes and `m` is the number of edges.
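A compact Python sketch of the recurrence. The 2-D table is rolled into one dimension, since row `i` only ever reads row `i-1`; the edge-list representation is an assumption:

```python
from math import inf

def bellman_ford(n, edges, t):
    """Length of the shortest path from every node to t.
    edges: list of (v, w, cost) triples for a directed edge v -> w."""
    M = [inf] * n
    M[t] = 0
    for _ in range(n - 1):            # a simple path has <= n-1 edges
        for v, w, cost in edges:
            if M[w] + cost < M[v]:    # relax: go v -> w, then w ~> t
                M[v] = M[w] + cost
    return M
```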
<hr>
### π Currency Conversion
Given `n` currencies and the exchange rates between each pair, we want to detect an arbitrage opportunity; that is, check whether someone can end up with more money by exchanging currencies around a cycle.
To detect such opportunities, we need to <span style="color:#bf9ded">check whether the product of exchange rates along a cycle is greater than 1</span>. However, since shortest path algorithms work with sums rather than products, we <span style="color:#bf9ded">convert the problem</span> by taking the logarithm of each rate and using the <span style="color:#bf9ded">negative log values</span>. This transforms the product into a sum, allowing us to apply <span style="color:#bf9ded">shortest path algorithms</span> like <b style="color:#CA8EFF">Bellman-Ford</b> to detect <b style="color:#CA8EFF">negative-weight cycles</b>, which correspond to arbitrage opportunities.
<img src="https://hackmd.io/_uploads/BkNAfnSmxl.png" style="width: 50%;" />
#### Negative cycle detection
If a graph has `n` nodes and `OPT(n, v) < OPT(n-1, v)` for some `v`, then paths are still improving after `n-1` iterations, so the graph contains a negative cycle.
To detect it, we can add a new vertex and connect all vertices to it with edge weight `0`. This ensures all vertices can reach the new one, so if there's a negative cycle, it will be detected when running Bellman-Ford algorithm.
<img src="https://hackmd.io/_uploads/S1XE9hBmeg.png" style="width: 40%;" />
Also, remember to run `n+1` iterations to detect negative cycles.
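A sketch of the whole pipeline in Python: take negative logs, run Bellman-Ford as if from an extra zero-weight source, and check one more relaxation pass (the rate-matrix representation and function name are assumptions):

```python
from math import log

def has_arbitrage(rates):
    """rates[i][j]: units of currency j obtained per unit of currency i.
    A profitable cycle has rate product > 1, i.e. sum of -log(rate) < 0,
    so arbitrage = negative cycle under weights -log(rate)."""
    n = len(rates)
    w = [[-log(rates[i][j]) for j in range(n)] for i in range(n)]
    # extra source with 0-weight edges to all nodes == start all dists at 0
    dist = [0.0] * n
    for _ in range(n):                # (n+1 nodes) - 1 relaxation rounds
        for i in range(n):
            for j in range(n):
                if i != j and dist[i] + w[i][j] < dist[j]:
                    dist[j] = dist[i] + w[i][j]
    # if any edge still relaxes, a negative cycle exists
    return any(dist[i] + w[i][j] < dist[j] - 1e-12
               for i in range(n) for j in range(n) if i != j)
```

The small tolerance guards against floating-point noise when a cycle's product is exactly 1.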
<hr>
### π Travelling Salesman Problem
<b style="color:#CA8EFF">Travelling Salesman Problem (TSP)</b>: A salesman must visit each of `n` different cities exactly once, starting from a base city and returning to it. What path minimizes the total distance travelled by the salesman?
For each subset `S` of the cities with `|S| ≥ 2` and each `u, v ∈ S`, `OPT(S, u, v)` means the length of the shortest path that <span style="color:#bf9ded">starts at</span> `u`, <span style="color:#bf9ded">ends at</span> `v`, and <span style="color:#bf9ded">visits all cities</span> in `S`.
- <span style="color:#bf9ded">Case 1</span>: `S = {u, v}`
- `OPT(S, u, v) = d(u, v)`
- <span style="color:#bf9ded">Case 2</span>: `|S| > 2`
- Assume `w ∈ S−{u, v}` is visited first:
`OPT(S, u, v) = d(u, w) + OPT(S−{u}, w, v)`
- `OPT(S, u, v) = min over w ∈ S−{u, v} of {d(u, w) + OPT(S−{u}, w, v)}`
##### Time complexity: `O(2ⁿn³)`.
Although this is much better than the `O(n!)` of brute force, it is still exponential: the number of subproblems here is exponential, whereas DP is most effective when the number of subproblems is polynomial.
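The idea can be sketched with the standard Held-Karp variant (fixed start city, `O(2ⁿn²)` states with an `O(n)` min each), which is slightly simpler than the three-parameter `OPT(S, u, v)` formulation above:

```python
from itertools import combinations

def tsp(d):
    """Held-Karp: shortest tour starting/ending at city 0 over an
    n x n distance matrix d."""
    n = len(d)
    # C[(S, v)]: shortest path from 0 that visits exactly the cities
    # in bitmask S (city 0 included) and ends at v
    C = {(1 | 1 << v, v): d[0][v] for v in range(1, n)}
    for size in range(3, n + 1):
        for subset in combinations(range(1, n), size - 1):
            S = 1 | sum(1 << v for v in subset)
            for v in subset:
                C[(S, v)] = min(C[(S ^ (1 << v), u)] + d[u][v]
                                for u in subset if u != v)
    full = (1 << n) - 1
    # close the tour by returning to city 0
    return min(C[(full, v)] + d[v][0] for v in range(1, n))
```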
<br>
<br>
# π Flow Network
A <b style="color:#CA8EFF">flow network</b> `G = (V, E)` is a directed graph in which each edge `e` has a nonnegative capacity `c(e)`, together with a source node `s` and a sink node `t`.
- In a flow network, an augmenting path may include <span style="color:#bf9ded">both forward and backward edges</span>. Forward edges represent <span style="color:#bf9ded">unused capacity and allow additional flow</span>, while backward edges represent <span style="color:#bf9ded">previously pushed flow</span> that can be canceled or "undone."
- This mechanism enables the algorithm to <span style="color:#bf9ded">redirect</span> flow by decreasing flow on certain edges and increasing it on others, ultimately leading to a more optimal overall flow.
### π‘ Key points
- `s-t flow f: E → ℝ⁺`
- A function f that maps each edge e to a nonnegative real number, `f(e)`: flow carried by edge e.
- `v(f)`: The value of a flow f.
#### Flow properties
- <b style="color:#CA8EFF">Capacity conditions</b>
  - `∀e∈E, 0 ≤ f(e) ≤ c(e)`
- <b style="color:#CA8EFF">Conservation conditions</b>
  - `∀u∈V\{s, t}, Σ_{e into u} f(e) = Σ_{e out of u} f(e)`
<img src="https://hackmd.io/_uploads/ByG4vW8Xex.png" style="width: 100%;" />
#### Upper bounds of the maximum s-t flow
To find the upper bound of the s-t flow, we divide the nodes into two sets, A and B, so that `s β A` and `t β B`.
- Any s-t flow must cross from `A` into `B` at some point.
- The s-t flow uses up some of the edge capacity from `A` to `B`.
<hr>
### π Ford-Fulkerson: Residual Graph
Assume we have a pushing flow:
- Push <span style="color:#bf9ded">forward</span> on edges with leftover capacity.
- Push <span style="color:#bf9ded">backward</span> on edges with flow.
The <span style="color:#B3A1FF">residual graph</span> `Gf` of `G` w.r.t. `f`:
- `V(Gf) = V(G)`
<img src="https://hackmd.io/_uploads/SJ-p1K8Qxg.png" style="width: 70%;" />
The **blue edges** mean the <span style="color:#bf9ded">leftover capacity</span> of an edge, and the **red edges** mean the <span style="color:#bf9ded">maximum available push-back</span> flow.
At this point, we can determine whether more flow can be sent from `s` to `t` by <span style="color:#bf9ded">checking for a path</span> in the residual graph. By finding augmenting paths step by step, we can <span style="color:#bf9ded">maximize the overall flow</span>.
##### Update pushing flow
<img src="https://hackmd.io/_uploads/BJQR1YI7le.png" style="width: 70%;" />
#### Implementation
- `e`: an edge.
- `E`: the edge set of the flow network.
- `Gf`: the residual graph.
```
Ford-Fulkerson(G, s, t)
1. foreach (e β E) do
2. f(e) = 0
3. construct Gf
4. while (∃ an s-t path P in Gf) do
5. b = bottleneck(P, f)
6. foreach (e ∈ P) do
7. if (e ∈ E) then f(e) = f(e) + b
8. else f(eᴿ) = f(eᴿ) - b
9. update Gf
10. return f
```
#### Ex.
<img src="https://hackmd.io/_uploads/Sy5l0KLQex.png" style="width: 100%;" />
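A Python sketch of the algorithm; this version picks augmenting paths with BFS (i.e., Edmonds-Karp), and the dict-of-dicts graph representation is an assumption:

```python
from collections import deque

def max_flow(capacity, s, t):
    """Ford-Fulkerson with BFS-chosen augmenting paths (Edmonds-Karp).
    capacity: {u: {v: c}} adjacency dict; the residual graph is kept
    in the same structure, with reverse edges starting at 0."""
    res = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u in capacity:
        for v in capacity[u]:
            res.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        parent = {s: None}                      # BFS for an s-t path in Gf
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:                     # no augmenting path left
            return flow
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        b = min(res[u][v] for u, v in path)     # bottleneck(P, f)
        for u, v in path:
            res[u][v] -= b                      # push forward
            res[v][u] += b                      # allow pushing back later
        flow += b
```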
<hr>
### π <b style="color:#CA8EFF">Bipartite Matching</b>
Given `n` men, `m` women and their feasible partners `G(X, Y, E)`, we want to find the matching `M ⊆ E` with the maximum number of marriages.
The <span style="color:#bf9ded">Bipartite Matching problem</span> can also be solved using a <span style="color:#bf9ded">residual graph</span>. We only need to <span style="color:#bf9ded">define a start and an end node</span>, and set all <span style="color:#bf9ded">edge capacities</span> to 1.
<img src="https://hackmd.io/_uploads/HyW7L587xx.png" style="width: 100%;" />
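Because every capacity is 1, the max-flow computation simplifies to repeatedly finding augmenting paths directly on the bipartite graph (Kuhn's algorithm); a sketch, with the adjacency-list representation assumed:

```python
def max_matching(adj, n_left, n_right):
    """Maximum bipartite matching via augmenting paths;
    equivalent to max flow with all capacities 1.
    adj[u]: list of right-side nodes feasible for left node u."""
    match_r = [-1] * n_right          # match_r[v] = left partner of v

    def augment(u, seen):
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                # take v if it is free, or if its current partner
                # can be re-matched elsewhere (the "push back" step)
                if match_r[v] == -1 or augment(match_r[v], seen):
                    match_r[v] = u
                    return True
        return False

    return sum(augment(u, set()) for u in range(n_left))
```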
<hr>
### π Maximum Flows and Minimum Cuts
#### Termination and Running Time
Assumption: All capacities are <span style="color:#bf9ded">integers</span>. Throughout Ford-Fulkerson, the flow values `{f(e): e∈E}` and the residual capacities in `Gf` are <span style="color:#bf9ded">integers</span>.
- `v(f') = v(f) + bottleneck(P, f)`. Since `bottleneck(P, f) > 0`, `v(f') > v(f)`.
- `C = Σ_{e out of s} c(e) ≥ Σ_{e out of s} f(e) = v(f)`. Ford-Fulkerson terminates in at most `C` iterations of the while loop.
Running time: `O(mC)`. For certain pathological casesβlike the one shownβeach augmenting path may only contribute 1 unit of flow, resulting in `C` iterations and thus <span style="color:#bf9ded">worst-case performance</span>.
<img src="https://hackmd.io/_uploads/ryw-U2I7xx.png" style="width: 40%;" />
#### Choose augmenting paths
To avoid the above worst-case, we should choose augmenting paths with:
- <span style="color:#bf9ded">Max bottleneck capacity</span>.
- <span style="color:#bf9ded">Fewest number of edges</span>.
- <span style="color:#bf9ded">Sufficiently large bottleneck capacity</span>.
- <b style="color:#CA8EFF">Δ-scaling</b>: look for paths with bottleneck capacity of <span style="color:#bf9ded">at least Δ</span> (choosing larger edges at the beginning).
#### Max-flow min-cut theorem
The value of the <span style="color:#bf9ded">max flow</span> is equal to the value of the <span style="color:#bf9ded">min cut</span>.
- There exists a cut `(A, B)` s.t. `v(f) = cap(A, B)`.
- Flow `f` is a max flow.
- There is no augmenting path relative to `f`.
##### Ex.
<img src="https://hackmd.io/_uploads/BJExR08Xlx.png" style="width: 40%;" />
The cut shows the maximum flow is `8+8=16`, which is the same as the final flow that goes into `t` (`6+6+4`).
<br>
<br>
# 🅿️ Beyond polynomial running times
Problems with <b style="color:#CA8EFF">polynomial-time</b> algorithms are the ones we can expect to solve in practice.
### π Decision & Optimization Problems
#### <b style="color:#CA8EFF">Decision problems</b>
Problems having <span style="color:#bf9ded">yes/no</span> answers are called <span style="color:#bf9ded">decision problems</span>.
- <b style="color:#CA8EFF">MST</b>: Given a graph `G=(V, E)` and a bound `K`, is there a spanning tree with a cost at most `K`?
- <b style="color:#CA8EFF">TSP</b>: Given a set of <span style="color:#bf9ded">cities</span>, <span style="color:#bf9ded">distance</span> between each pair of cities, and a bound `B`, is there a <span style="color:#bf9ded">route</span> that starts and ends at a given city, visits every city <span style="color:#bf9ded">exactly once</span>, and has total distance <span style="color:#bf9ded">at most</span> `B`?
#### <b style="color:#CA8EFF">Optimization problems</b>
Problems finding a legal configuration such that its cost is minimized (or maximized).
<b style="color:#CA8EFF">Binary search</b> on the bound of a <span style="color:#bf9ded">decision problem</span> can be used to obtain the solution to the corresponding <span style="color:#bf9ded">optimization problem</span>.
<hr>
### π Complexity Classes
- The class <b style="color:#CA8EFF">P</b> is the class of problems that can be <span style="color:#bf9ded">solved</span> in <span style="color:#bf9ded">polynomial time</span> in the size of input.
- The class <b style="color:#CA8EFF">NP (Nondeterministic Polynomial)</b> class is the class of problems that can be <span style="color:#bf9ded">verified</span> in polynomial time in the size of input.
- The class <b style="color:#CA8EFF">NP-complete (NPC)</b> consists of problems `Y` in <span style="color:#bf9ded">NP</span> with the property that for every problem `X` in <span style="color:#bf9ded">NP</span>, `X ≤ₚ Y`.
- Suppose `Y` is <span style="color:#bf9ded">NPC</span>, then `Y` is solvable in polynomial time iff `P = NP`.
#### Fundamental question: Do there exist "natural" NPC problems?
<hr>
### π The First Proved NPC: Circuit Satisfiability
Q: Given a combinational circuit built out of AND, OR, and NOT gates, is there a way to set the circuit inputs so that the output is 1?
A: Yes: `1 0 1`
<img src="https://hackmd.io/_uploads/r1Aaflw7le.png" style="width: 50%;" />
**This is an NP-complete problem (also the first one in history).**
<hr>
### π Polynomial-Time Reduction
Problem `X` <b style="color:#CA8EFF">polynomially reduces</b> to problem `Y` (`X ≤ₚ Y`) if, given an arbitrary instance `x` of `X`, we can construct in <span style="color:#bf9ded">polynomial time</span> an input `y` to `Y` such that `x` is a <span style="color:#bf9ded">yes</span> instance of `X` iff `y` is a <span style="color:#bf9ded">yes</span> instance of `Y`. If we could <span style="color:#bf9ded">solve</span> `Y` <span style="color:#bf9ded">in polynomial time</span>, we could then solve `X` as well.
- <b style="color:#CA8EFF">Design algorithms</b>
If `X ≤ₚ Y` and `Y` <span style="color:#bf9ded">can</span> be solved in polynomial-time, then `X` <span style="color:#bf9ded">can</span> also be solved in polynomial-time.
- <b style="color:#CA8EFF">Establish intractability</b>
If `X ≤ₚ Y` and `X` <span style="color:#bf9ded">cannot</span> be solved in polynomial-time, then `Y` <span style="color:#bf9ded">cannot</span> be solved in polynomial-time.
- <b style="color:#CA8EFF">Establish equivalence</b>
If `X ≤ₚ Y` and `Y ≤ₚ X`, then `X ≡ₚ Y`.
<hr>
### π Polynomial Reduction: HC ≤ₚ TSP
The <b style="color:#CA8EFF">Hamiltonian Circuit (Cycle) Problem (HC)</b>: given an <span style="color:#bf9ded">undirected graph</span> `G=(V, E)`, is there a <span style="color:#bf9ded">cycle</span> in `G` that includes <span style="color:#bf9ded">every vertex exactly once</span>?
- <b style="color:#CA8EFF">TSP</b>: The Traveling Salesman Problem.
- Claim: `HC ≤ₚ TSP`.
#### Steps
1. Define a reduction function `f` for `HC ≤ₚ TSP`.
- Given an HC instance `G = (V, E)` with `n` vertices, create a set of `n` cities labeled with names in `V`.
- Assign distance `d(u, v) = 1` if `(u, v) ∈ E`, and `d(u, v) = 2` otherwise.
<img src="https://hackmd.io/_uploads/S14EiLPXlx.png" style="width: 40%;" />
- Set bound `B = n`.
- `f` can be computed in `O(V²)` time.
<img src="https://hackmd.io/_uploads/r1nJAIwXgl.png" style="width: 60%;" />
2. `G` has a <span style="color:#bf9ded">HC</span> iff the reduced instance has a <span style="color:#bf9ded">TSP</span> tour with `distance ≤ B`.
- `x ∈ HC` → `f(x) ∈ TSP`.
  - Suppose the <span style="color:#bf9ded">HC</span> is `h = <v₁, v₂, ..., vₙ>`. Then `h` is also a tour in the transformed <span style="color:#bf9ded">TSP</span> instance.
  - The distance of the tour `h` is `n = B`, since all `n` <span style="color:#bf9ded">consecutive edges</span> of `h` are in `E` and therefore each has distance 1 in `f(x)`.
  - Thus, `f(x) ∈ TSP` (`f(x)` has a TSP tour with `distance ≤ B`).
- `f(x) ∈ TSP` → `x ∈ HC`
  - Suppose there is a <span style="color:#bf9ded">TSP</span> tour with `distance ≤ B = n`. Let it be `<v₁, v₂, ..., vₙ>`.
  - Since the distance of the tour is at most `n` and there are `n` edges in the <span style="color:#bf9ded">TSP</span> tour, every edge must have distance 1, i.e., the tour contains only edges in `E`.
  - Thus, `<v₁, v₂, ..., vₙ>` is a Hamiltonian cycle (`x ∈ HC`).
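The reduction `f` is simple enough to write down and sanity-check by brute force on tiny graphs (the function names and representations here are illustrative only):

```python
from itertools import permutations

def hc_to_tsp(n, edges):
    """The reduction f: cities = vertices, d(u, v) = 1 if (u, v) is a
    graph edge and 2 otherwise, bound B = n."""
    E = {frozenset(e) for e in edges}
    d = [[1 if frozenset((u, v)) in E else 2 for v in range(n)]
         for u in range(n)]
    return d, n

def tsp_decision(d, B):
    """Brute-force TSP decision (exponential; only for tiny checks)."""
    n = len(d)
    return any(sum(d[tour[i]][tour[(i + 1) % n]] for i in range(n)) <= B
               for tour in permutations(range(n)))
```

A 4-cycle (which has a HC) maps to a yes instance, while a star graph (no HC) maps to a no instance.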
<hr>
### π NP-Completeness
<span style="color:#bf9ded">Definition</span>: A <span style="color:#bf9ded">decision</span> problem `L` (a language `L ⊆ {0, 1}*`) is <b style="color:#CA8EFF">NP-complete (NPC)</b> if:
- `L ∈ NP`, and
- `L' ≤ₚ L` for every `L' ∈ NP`.
<b style="color:#CA8EFF">NP-hard</b>: If `L` satisfies property 2, but not necessarily property 1, we say that `L` is <b style="color:#CA8EFF">NP-hard</b>.
Suppose `L ∈ NPC`. Is `P = NP`?
- If `L ∈ P`, then there exists a <span style="color:#bf9ded">polynomial-time algorithm</span> for every `L' ∈ NP` (i.e., `P = NP`).
- If `L ∉ P`, then there exists no <span style="color:#bf9ded">polynomial-time algorithm</span> for any `L' ∈ NPC` (i.e., `P ≠ NP`).
<br>
<br>
# π Linear Programming
<b style="color:#CA8EFF">Linear programming</b> describes a broad class of optimization tasks in which both the <span style="color:#bf9ded">optimization criterion</span> and the <span style="color:#bf9ded">constraints</span> are <span style="color:#bf9ded">linear</span> functions.
### π‘ Key points
#### Basic concepts
Linear programming contains three parts:
- A set of decision variables.
- An <span style="color:#bf9ded">objective function</span>.
- A set of <span style="color:#bf9ded">constraints</span>.
#### Example: Profit Maximization
A boutique chocolatier has two products:
- A (Pyramide): profit $1 per box.
- B (Nuit): profit $6 per box.
##### <span style="color:#bf9ded">Constraints</span>
- The daily demand for these exclusive chocolates is limited to at most <span style="color:#bf9ded">200 boxes of A</span> and <span style="color:#bf9ded">300 boxes of B</span>.
- The current workforce can produce a total of <span style="color:#bf9ded">at most 400</span> boxes of chocolate per day.
##### <span style="color:#bf9ded">Decision variables</span>
- `x₁` = Boxes of A.
- `x₂` = Boxes of B.
##### <span style="color:#bf9ded">Objective Function</span>
- Maximize profit
```
max x₁ + 6x₂
Subject to x₁ ≤ 200        (1)
           x₂ ≤ 300        (2)
           x₁ + x₂ ≤ 400   (3)
           x₁, x₂ ≥ 0
```
<img src="https://hackmd.io/_uploads/HJEU4Zd7xe.png" style="width: 30%;" />
The <span style="color:#bf9ded">simplex method</span> then hill-climbs from vertex to vertex of the feasible region to find the best solution.
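Since a 2-D LP attains its optimum at a vertex of the feasible polygon, the chocolate example (with the 400 boxes/day workforce limit) can be sanity-checked by enumerating all pairwise constraint-line intersections; this is a toy check, not a real solver:

```python
from itertools import combinations

def lp_max_2d(c, cons):
    """Maximize c.x over {x : a.x <= b for every (a, b) in cons}, in 2-D.
    Every vertex is the intersection of two constraint lines,
    so we enumerate them and keep the feasible ones."""
    best = None
    for (a1, b1), (a2, b2) in combinations(cons, 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if det == 0:
            continue                           # parallel lines: no vertex
        x = (b1 * a2[1] - b2 * a1[1]) / det    # Cramer's rule
        y = (a1[0] * b2 - a2[0] * b1) / det
        if all(a[0] * x + a[1] * y <= b + 1e-9 for a, b in cons):
            value = c[0] * x + c[1] * y
            if best is None or value > best:
                best = value
    return best

# the chocolate example: max x1 + 6*x2 (x >= 0 written as -x <= 0)
chocolate = [((1, 0), 200), ((0, 1), 300), ((1, 1), 400),
             ((-1, 0), 0), ((0, -1), 0)]
```

The simplex method visits these same vertices, but walks between neighbours instead of checking all of them.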
##### Multipliers
To find an upper bound, we can also combine the constraints:
- (1) + 6·(2): `x₁ + 6x₂ ≤ 2000`
- 0·(1) + 5·(2) + (3): `x₁ + 6x₂ ≤ 1900`
#### <b style="color:#CA8EFF">Duality</b>
The problem can be transformed into the following form, so that we can obtain the tightest upper bound by minimizing.
```
Multiplier   Inequality
y₁           x₁ ≤ 200
y₂           x₂ ≤ 300
y₃           x₁ + x₂ ≤ 400
⇒ (y₁+y₃)x₁ + (y₂+y₃)x₂ ≤ 200y₁ + 300y₂ + 400y₃
```
We want a <span style="color:#bf9ded">tight</span> bound for the <span style="color:#bf9ded">dual problem</span>; thus we minimize `200y₁ + 300y₂ + 400y₃` subject to `y₁ + y₃ ≥ 1`, `y₂ + y₃ ≥ 6`, `y ≥ 0`.
##### Generic form:
```
Primal LP: Dual LP:
max cᵀx             min yᵀb
Ax ≤ b              yᵀA ≥ cᵀ
x ≥ 0               y ≥ 0
```
#### <span style="color:#bf9ded">Standard Form</span>
Variants
- Either a maximization or a minimization problem.
- Constraints can be equations and/or inequalities.
- Variables are restricted to be nonnegative or unrestricted in sign.