# A Guided Note on Algorithms
Algorithms play a huge part in achieving better performance in coding. This article covers algorithms that are commonly used in practice and important for LeetCode practice problems.
### <span style="color:#B3A1FF">Table of contents</span>
- βοΈ Algorithm concepts
- π§© Matching algorithms
- πΈοΈ Graphs
- π° Greedy algorithms
- βοΈ Divide and conquer
- π§ Dynamic programming
- π Flow Network
- πΏοΈ Beyond polynomial running times
- π Linear Programming
*\* This article is a set of study notes based on online teaching materials. Some content and examples are adapted from those resources and are not fully created by me.*
*\* These notes are derived from the original public videos in this [YouTube playlist](https://youtube.com/playlist?list=PLj6E8qlqmkFtoRpLn6IXnH_eboef-3QvZ&si=SaB5LEQCwgWY7SpF). Some examples and explanations have been adapted or rephrased to facilitate better understanding and for my own practice.*
*\* Some of the graphics/text in this article were generated/refined with the help of ChatGPT to enhance clarity and improve understanding.*
<br>
# βοΈ Algorithm concepts
### What is a <span style="color:#B3A1FF">good algorithm</span>?
- Runs quickly and consumes little memory.
- When the input size doubles, the algorithm should only slow down by a constant factor `c`.
### Proposed definition
- Runtime is <span style="color:#bf9ded">platform-independent</span> and <span style="color:#bf9ded">instance-independent</span>
- <b style="color:#CA8EFF">Polynomial running time</b>: there exist constants c > 0 and d > 0 s.t. for input size N, the running time is bounded by <span style="color:#bf9ded">cNᵈ</span>.
- <b style="color:#CA8EFF">Primitive computational step</b> refers to a basic operation whose cost does not depend on the input size.
- An algorithm is efficient if it has a polynomial running time.
### Time Complexity
- Classify algorithms by their highest-order term.
Ex. `10n²+5n+2` → `n²` complexity.
- Let `T(n)` be a function to describe the worst-case running time of a certain algorithm on an input of size `n`:
    - <span style="color:#bf9ded">Asymptotic upper bound</span>: `T(n) = O(f(n))` if there exist constants `c > 0` and `n₀ ≥ 0` s.t. for all `n > n₀` we have `T(n) ≤ c·f(n)`.
    - <span style="color:#bf9ded">Asymptotic lower bound</span>: `T(n) = Ω(f(n))` if there exist constants `c > 0` and `n₀ ≥ 0` s.t. for all `n > n₀` we have `T(n) ≥ c·f(n)`.
    - <span style="color:#bf9ded">Asymptotic tight bound</span>: `T(n) = Θ(f(n))` if `T(n)` is both `O(f(n))` and `Ω(f(n))`.
<img style="width: 40%" src="https://hackmd.io/_uploads/HkELX9hGxe.png" />
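To make the definitions concrete, here is a quick numeric check with one valid pair of witnesses (the constants `c = 11` and `n0 = 7` below are assumptions chosen for this example; witnesses are never unique):

```python
# Verify that T(n) = 10n^2 + 5n + 2 is O(n^2) by exhibiting
# witnesses c = 11 and n0 = 7: for all n >= n0, T(n) <= c * n^2.
def T(n):
    return 10 * n * n + 5 * n + 2

c, n0 = 11, 7
assert all(T(n) <= c * n * n for n in range(n0, 10_000))

# The witnesses c = 10, n0 = 1 also give a lower bound,
# so T(n) is Ω(n^2) and therefore Θ(n^2).
assert all(T(n) >= 10 * n * n for n in range(1, 10_000))
```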
<br>
# π§© Matching algorithms
## <span style="color:#B3A1FF">Gale-Shapley (Stable Marriage Problem)</span>
The **Gale-Shapley algorithm** (also known as the **Stable Marriage Problem algorithm**) solves the problem of matching two equally sized sets (e.g., men and women) based on individual preferences, s.t. the resulting matching is **stable**.
>[!Important]
> A matching is considered <span style="color:#bf9ded">stable</span> if there is no pair (A, B) s.t. both A and B would prefer each other over their current matches.
### π‘ Algorithm notes:
- Individuals from one group <span style="color:#bf9ded">propose</span> to individuals in the other group based on their preferences (rankings).
- Members of the other group temporarily hold the best proposal they have received, rejecting the others, or dumping the current partner if the incoming proposer ranks higher.
- Rejected (dumped) individuals continue proposing to their next preferred option.
- The process repeats <span style="color:#bf9ded">until everyone is matched</span>.
### π Time complexity
1. Initialize all people to be free: `O(n)`
2. Initialize women's preference rankings and current status: `O(n²)`
3. Engagement:
- Store each woman's preference ranking in a map or array: `O(n²)`
>[!Note]
> The data is stored by ranking, using the index to represent each man's number. For example, [3, 1, 2] means the woman ranks man 1 as her 3rd choice, man 2 as 1st, and man 3 as 2nd. This allows preference comparisons in O(1) time instead of O(n).
- Use a stack or linked list to track free men: `O(1)` (per operation)
- For each free man:
- Propose to the next woman on his list: `O(1)`
- Check if the woman is engaged: `O(1)`
- Compare current proposer with her current partner using ranking: `O(1)`
- Update engagement accordingly: `O(1)`
- The total number of proposals is at most `n²`, so this phase is `O(n²)`
4. Return the engaged result: `O(n)`
**⇒ Overall time complexity: `O(n²)`**
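The proposal loop described above can be sketched as runnable Python (the preference data at the bottom is hypothetical):

```python
from collections import deque

def gale_shapley(men_prefs, women_prefs):
    """Return a stable matching as {man: woman}.

    men_prefs[m]   = list of women in m's preference order
    women_prefs[w] = list of men in w's preference order
    """
    # rank[w][m] = position of man m in woman w's list, for O(1) comparisons
    rank = {w: {m: i for i, m in enumerate(prefs)}
            for w, prefs in women_prefs.items()}
    next_choice = {m: 0 for m in men_prefs}  # next woman each man proposes to
    engaged_to = {}                          # woman -> current partner
    free_men = deque(men_prefs)              # queue of free men

    while free_men:
        m = free_men.popleft()
        w = men_prefs[m][next_choice[m]]
        next_choice[m] += 1
        if w not in engaged_to:                    # woman is free: accept
            engaged_to[w] = m
        elif rank[w][m] < rank[w][engaged_to[w]]:  # she prefers m: dump current
            free_men.append(engaged_to[w])
            engaged_to[w] = m
        else:                                      # rejected: m stays free
            free_men.append(m)
    return {m: w for w, m in engaged_to.items()}

# Tiny example with hypothetical preferences
men = {"A": ["x", "y"], "B": ["x", "y"]}
women = {"x": ["B", "A"], "y": ["A", "B"]}
print(sorted(gale_shapley(men, women).items()))  # [('A', 'y'), ('B', 'x')]
```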
### β Example:
<img style="width: 50%" src="https://hackmd.io/_uploads/SJuaHViGgg.png" />
<br>
<br>
# πΈοΈ Graphs
Graphs are made up of <b style="color:#CA8EFF">vertices (nodes)</b> and <b style="color:#CA8EFF">edges</b>. Each edge joins two vertices (nodes).
### π‘ Key points
#### Basic concepts
- A graph is written as `G = (V, E)`, where `V` is the set of vertices and `E` the set of edges.
- The edge format `e: {u, v}` indicates an <span style="color:#bf9ded">undirected</span> connection between vertices `u` and `v`.
<img style="width: 50%" src="https://hackmd.io/_uploads/rJbmapnzex.png" />
- The edge format `e: (u, v)` indicates a <span style="color:#bf9ded">directed</span> connection from `u` to `v`.
<img style="width: 50%" src="https://hackmd.io/_uploads/B1Jcnp2Mxg.png" />
The drawn <span style="color:#bf9ded">length</span> of an edge <span style="color:#bf9ded">does not represent the distance</span> between two vertices.
`|V| = the number of vertices = n`
`|E| = the number of edges = m`
It takes `O(m+n)` to read the input.
#### <b style="color:#CA8EFF">Path</b>
A path is a sequence of vertices traversed in order (e.g., `P = <v₁, v₂, v₃>`).
A path that visits each vertex at most once is called a <b style="color:#CA8EFF">simple path</b>.
#### <b style="color:#CA8EFF">Cycle</b>
A cycle is a path that <span style="color:#bf9ded">starts and ends at the same vertex</span>, visits all other vertices exactly once, and contains at least three vertices.
#### <b style="color:#CA8EFF">Connected</b>
An undirected graph is connected if, for every pair of nodes `u` and `v`, there is a <span style="color:#bf9ded">path</span> from `u` to `v`.
#### <b style="color:#CA8EFF">Distance</b>
The distance between vertices `u` and `v` is the minimum number of <span style="color:#bf9ded">edges</span> over all u-v paths.
#### <b style="color:#CA8EFF">Adjacency matrix</b>
The connections between vertices can be represented using an `n × n` matrix, known as an adjacency matrix.
<img src="https://hackmd.io/_uploads/SyvtCcpMle.png" style="width: 60%;" />
- The time complexity to look for the relationship between two vertices is `O(1)`.
- The time complexity to find all neighbors for a vertex is `O(n)`.
- The space complexity to store the matrix is `O(n²)`.
#### <b style="color:#CA8EFF">Adjacency list</b>
The connections between vertices can also be represented using an array of lists, where each index corresponds to a vertex and stores a list of its neighboring vertices. This structure is known as an adjacency list.
<img src="https://hackmd.io/_uploads/BJZTVoafee.png" style="width: 60%;" />
- The time complexity to check whether two vertices are connected is `O(k)`, where k is the degree of the vertex.
- The time complexity to find all neighbors of a vertex is `O(k)`.
- The space complexity to store the graph is `O(n + e)`, where `n` is the number of vertices and `e` is the number of edges.
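Both representations, sketched for the same small example graph (the vertex numbering and edges are illustrative):

```python
# The same 4-vertex undirected graph in both representations.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
n = 4

# Adjacency matrix: O(1) edge lookup, O(n) neighbor scan, O(n^2) space.
matrix = [[0] * n for _ in range(n)]
for u, v in edges:
    matrix[u][v] = matrix[v][u] = 1

# Adjacency list: O(k) edge lookup and neighbor scan, O(n + e) space.
adj = [[] for _ in range(n)]
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)

assert matrix[0][1] == 1 and matrix[0][3] == 0  # O(1) lookup
assert adj[2] == [0, 1, 3]                      # neighbors of vertex 2
```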
#### <b style="color:#CA8EFF">Strongly connected</b>
A directed graph is strongly connected if every pair of vertices is <b style="color:#CA8EFF">mutually reachable</b>, meaning there is a directed path from each vertex to the other.
To check whether a graph is strongly connected, we can:
1. pick any vertex `s` in graph `G`
2. save the BFS result `R = BFS(s, G)`
3. reverse all edges and run BFS again: `Rʳᵉᵛ = BFS(s, Gʳᵉᵛ)`
4. if `R = V = Rʳᵉᵛ` then return true, else false
The main idea is to check whether all vertices can reach a chosen vertex `s`, and whether `s` can reach all vertices. This approach avoids checking each pair individually, and the entire process can be completed in linear time using two BFS traversals.
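The two-BFS idea can be sketched as follows (adjacency lists in a plain `dict`; the example graphs are illustrative):

```python
from collections import deque

def reachable(graph, s):
    """Set of vertices reachable from s, via BFS."""
    seen = {s}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in graph[u]:
            if v not in seen:
                seen.add(v)
                q.append(v)
    return seen

def strongly_connected(graph):
    s = next(iter(graph))              # pick any vertex s
    rev = {u: [] for u in graph}       # reverse every edge to build G_rev
    for u in graph:
        for v in graph[u]:
            rev[v].append(u)
    all_v = set(graph)
    # s reaches every vertex, and every vertex reaches s (BFS on G_rev).
    return reachable(graph, s) == all_v == reachable(rev, s)

print(strongly_connected({0: [1], 1: [2], 2: [0]}))  # True  (directed 3-cycle)
print(strongly_connected({0: [1], 1: [2], 2: []}))   # False (no path back to 0)
```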
#### <b style="color:#CA8EFF">Strong component</b>
The strong component containing `s` in a directed graph is the maximal set of all `v` s.t. `s` and `v` are mutually reachable.
---
### π Tree
An undirected graph is a <b style="color:#CA8EFF">tree</b> if it is <span style="color:#bf9ded">connected</span> and does <span style="color:#bf9ded">not</span> contain a <span style="color:#bf9ded">cycle</span>.
<div style="display: flex; justify-content: space-between;">
<img src="https://hackmd.io/_uploads/BJewYV6fgg.png" style="width: 90%;" />
<img src="https://hackmd.io/_uploads/HkSPYNpzgg.png" style="width: 90%;" />
</div>
The trees above are the same. It's easy to see that they are connected and contain no cycles.
#### Rooted tree
A <b style="color:#CA8EFF">rooted tree</b> is a tree with a clear starting point: the <span style="color:#bf9ded">root</span> node. It follows a <b style="color:#CA8EFF">hierarchical</b> structure with roles like parent, child, sibling, leaf, and more.
<img src="https://hackmd.io/_uploads/Hk_FJrpzgl.png" style="width: 50%;" />
---
### π <span style="color:#B3A1FF">Breadth-First Search (BFS)</span>
<b style="color:#CA8EFF">Breadth-First Search (BFS)</b> visits vertices <span style="color:#bf9ded">level by level</span>, starting from the source vertex and expanding outward. Using BFS, the original graph (left) can be transformed into the <b style="color:#CA8EFF">BFS tree</b> shown on the right.
<div style="display: flex; justify-content: space-between;">
<img src="https://hackmd.io/_uploads/r1bEaB6Mlg.png" style="width: 90%;" />
<img src="https://hackmd.io/_uploads/B1KlyIpGee.png" style="width: 90%;" />
</div>
- Layer `L₀` is the starting point `s`, and layer `L₁` = all neighbors of `L₀`.
- `i` represents the distance from `s` to the vertices belonging to layer `Lᵢ`.
- In a <span style="color:#bf9ded">BFS tree</span>, if there is an edge between two vertices, their layers (or levels) differ by at most 1.
- Dotted lines represent non-tree edges.
Python implementation:
```python
from collections import deque

def bfs(graph, start):
    """Visit vertices level by level; graph maps each vertex to its neighbors."""
    visited = {start}
    queue = deque([start])
    order = []
    while queue:
        current = queue.popleft()
        order.append(current)              # visit(current)
        for neighbor in graph[current]:
            if neighbor not in visited:    # enqueue each vertex at most once
                visited.add(neighbor)
                queue.append(neighbor)
    return order
```
>[!Tip]
>BFS is widely used in practical applications, such as the bucket fill tool in graphics programs. It starts from a target pixel and systematically explores neighboring pixels in a breadth-first manner to apply the fill to all connected regions of the same color.
---
### π <span style="color:#B3A1FF">Depth-First Search (DFS)</span>
<b style="color:#CA8EFF">Depth-First Search (DFS)</b> explores as deeply as possible before backtracking. Think of it like recursively visiting the smallest (or next) child vertex each time and marking it as visited in an array, continuing until a leaf node is reached. Then, it backtracks to explore other unvisited paths.
Using DFS, the original graph (left) can be transformed into the <b style="color:#CA8EFF">DFS tree</b> shown on the right.
<div style="display: flex; justify-content: space-between;">
<img src="https://hackmd.io/_uploads/Sylj5_aGlg.png" style="width: 90%;" />
<img src="https://hackmd.io/_uploads/HJlx6dTfgg.png" style="width: 90%;" />
</div>
- DFS can be implemented using <span style="color:#bf9ded">recursion</span> or an explicit <span style="color:#bf9ded">stack</span>.
Python implementation:
```python
def dfs(graph, start):
    """Iterative DFS with an explicit stack; returns the visiting order."""
    visited = set()
    order = []
    stack = [start]
    while stack:
        current = stack.pop()
        if current not in visited:
            visited.add(current)
            order.append(current)          # process current
            # Push neighbors in reverse so the smallest is popped first.
            for neighbor in reversed(graph[current]):
                if neighbor not in visited:
                    stack.append(neighbor)
    return order
```
To preserve the DFS visiting order, neighbor vertices are pushed in reverse order so that the smallest one is processed first when popped from the stack.
---
### π DAGs and Topological Ordering
#### <b style="color:#CA8EFF">Directed Acyclic Graphs (DAGs)</b>
A DAG is a directed graph that contains no cycles. For example, trees and forests are DAGs. DAGs are often used to represent <span style="color:#bf9ded">dependencies</span> or <span style="color:#bf9ded">precedence constraints</span>.
<img src="https://hackmd.io/_uploads/HJcDk-yQex.png" style="width: 40%;" />
#### <b style="color:#CA8EFF">Topological Ordering</b>
Given a directed graph G, a topological ordering is an ordering of its vertices as `v₁, v₂, ..., vₙ` so that for every edge `(vᵢ, vⱼ)`, we have `i < j`.
By repeatedly selecting a vertex with no incoming edges and numbering it next, we can find the topological ordering as shown below:
<img src="https://hackmd.io/_uploads/rJyKVZkQel.png" style="width: 70%;" />
<img src="https://hackmd.io/_uploads/r14BEbkmgx.png" style="width: 50%;" />
This can be implemented using the following steps:
1. Initialize a set `S` with all vertices that have no incoming edges.
2. While `S` is not empty, do the following:
a. Remove a vertex `v` from `S` and add it to the topological ordering.
b. For each neighbor `u` of `v`, decrease the in-degree of `u` by 1.
c. If the in-degree of `u` becomes 0, add `u` to `S`.
This makes the overall time complexity `O(m+n)`, where `n` is the number of vertices and `m` is the number of edges.
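The steps above (often called Kahn's algorithm) can be sketched as:

```python
from collections import deque

def topological_order(graph):
    """graph: {vertex: [out-neighbors]}. Returns a topological ordering,
    or raises ValueError if the graph has a cycle (i.e., is not a DAG)."""
    indegree = {u: 0 for u in graph}
    for u in graph:
        for v in graph[u]:
            indegree[v] += 1
    s = deque(u for u in graph if indegree[u] == 0)  # S: no incoming edges
    order = []
    while s:
        v = s.popleft()
        order.append(v)
        for u in graph[v]:          # removing v lowers each neighbor's in-degree
            indegree[u] -= 1
            if indegree[u] == 0:
                s.append(u)
    if len(order) != len(graph):
        raise ValueError("graph has a cycle")
    return order

print(topological_order({"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}))
# ['a', 'b', 'c', 'd']
```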
<br>
<br>
# π° Greedy algorithms
An algorithm is <b style="color:#CA8EFF">greedy</b> if it builds up a solution in small steps, choosing a decision at each step myopically to optimize some underlying criterion.
- "The greedy algorithm stays ahead": show that after each step of the greedy algorithm, its partial solution is at least as good as the corresponding part of any optimal solution.
### π‘ Key points
#### Basic concepts
- Use a <span style="color:#bf9ded">simple</span> rule to select the first request `i₁`.
- Once `i₁` is selected, reject all requests not compatible with `i₁`.
- To free the resource as soon as possible, the best option is to accept the request that <span style="color:#bf9ded">finishes first</span>.
<hr>
### π Interval Scheduling Problem
The <b style="color:#CA8EFF">interval scheduling problem</b> is a classic problem. To solve it using a greedy algorithm, we can follow the steps below:
- `jobs`: the set of all unscheduled requests
- `schedule`: the set of selected non-overlapping requests
```
1. schedule = ∅
// Start with an empty set of scheduled jobs
2. while jobs is not empty:
3. pick a job j ∈ jobs with the earliest finish time ← greedy choice
4. add j to schedule
5. remove j and all jobs that conflict with j from jobs
6. return schedule
```
<img src="https://hackmd.io/_uploads/BkW4371mge.png" style="width: 50%;" />
Always choose the job that finishes earliest.
#### Time Complexity:
1. Sort the `jobs` by their finish time `f(i)` in ascending order: `O(n log n)`
2. Construct an array of job start times: `O(n)`
3. Instead of removing all overlapping jobs in each iteration, we simply move to the next job whose start time is after the current jobβs finish time. This avoids repeatedly checking all remaining jobs and ensures that each job is considered at most once, spending only: `O(n)`
**⇒ Overall time complexity: `O(n log n)`**
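The greedy rule and the linear scan can be sketched as follows (the job list is hypothetical):

```python
def interval_schedule(jobs):
    """jobs: list of (start, finish). Greedy: earliest finish time first."""
    schedule = []
    last_finish = float("-inf")
    for start, finish in sorted(jobs, key=lambda j: j[1]):  # O(n log n) sort
        if start >= last_finish:   # compatible with everything chosen so far
            schedule.append((start, finish))
            last_finish = finish   # skip jobs that start before this finish
    return schedule

jobs = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9), (6, 10), (8, 11)]
print(interval_schedule(jobs))  # [(1, 4), (5, 7), (8, 11)]
```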
#### Depth
In interval problems, the <b style="color:#CA8EFF">depth</b> refers to the maximum number of overlapping intervals at any point in time (in other words, it represents the lower bound on the number of resources needed to process all intervals concurrently).
To determine the depth, we need to examine all relevant time points and track how many intervals overlap at each one.
<img src="https://hackmd.io/_uploads/r1iC2Eg7gx.png" style="width: 50%;" />
<hr>
### π <span style="color:#B3A1FF">Interval Partitioning</span>
<b style="color:#CA8EFF">Interval partitioning</b> assigns the whole set of intervals to the fewest possible resources (labels) so that intervals sharing a resource do not overlap. This can be implemented using the following steps:
```
1. {I₁, ..., Iₙ} = sort intervals in ascending order by their starting times
2. for j from 1 to n do
3. exclude labels of all assigned intervals that are not compatible with Iⱼ
4. if (there's a nonexcluded label for Iⱼ)
5. assign a nonexcluded label to Iⱼ
```
- To implement the exclusion of incompatible labels, we can <span style="color:#bf9ded">record the current usage time for all labels</span> to make it easy to compare against the starting time of the current interval Iⱼ.
#### Illustration
Using the same interval graph from last section, first we sort them by their <span style="color:#bf9ded">starting time</span>:
<img src="https://hackmd.io/_uploads/Sy2ByUlXlx.png" style="width: 100%;" />
- The illustration shows that the depth is 3 (which means we need 3 labels).
- After sorting intervals by their start time, we <span style="color:#bf9ded">assign each interval to the first available label</span> s.t. it does not overlap with previously assigned intervals in that label.
- After assigning an interval, remember to <span style="color:#bf9ded">update the current usage time</span> for that label.
>[!Note]
>If, at some moment, one interval finishes exactly when another starts, process the finishing one first, so that its label is freed before the new assignment and no repeated calculation occurs.
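The labeling scheme can be sketched with a min-heap that records each label's current usage time (the interval data is illustrative):

```python
import heapq

def partition_intervals(intervals):
    """intervals: list of (start, finish). Returns {interval: label},
    using the minimum possible number of labels (= the depth)."""
    labels = {}
    heap = []          # (finish time of the label's last interval, label)
    next_label = 0
    for start, finish in sorted(intervals):   # ascending start time
        if heap and heap[0][0] <= start:      # a label is already free
            _, label = heapq.heappop(heap)    # (finishing processed first)
        else:                                 # all labels busy: open a new one
            label = next_label
            next_label += 1
        labels[(start, finish)] = label
        heapq.heappush(heap, (finish, label)) # update the label's usage time
    return labels

result = partition_intervals([(0, 3), (1, 4), (2, 5), (3, 6)])
print(max(result.values()) + 1)  # 3 labels needed (depth = 3)
```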
<hr>
### π <span style="color:#B3A1FF">Scheduling to Minimize Lateness</span>
<b style="color:#CA8EFF">Scheduling to minimize lateness</b> means that each interval has a deadline by which it should be completed. We evaluate the arrangement by measuring the <span style="color:#bf9ded">maximum lateness</span>, and aim to minimize that value.
<img src="https://hackmd.io/_uploads/SJv1cLl7xg.png" style="width: 50%;" />
To implement this algorithm, we simply sort all intervals by their deadlines and schedule them in that order:
- `f`: the finishing time of the last scheduled request
- `s`: the starting time
```
1. {d₁, ..., dₙ} = sort requests in ascending order of their deadlines
2. f = s
3. for i from 1 to n do
4. assign request i to the time interval [f, f + tᵢ]
5. f = f + tᵢ
6. return the set of scheduled intervals
```
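The earliest-deadline-first rule can be sketched as follows (requests are hypothetical `(processing time, deadline)` pairs):

```python
def schedule_min_lateness(requests, s=0):
    """requests: list of (t_i, d_i) = (processing time, deadline).
    Earliest-deadline-first; returns (schedule, max lateness)."""
    f = s                                   # finishing time of last request
    schedule, max_lateness = [], 0
    for t, d in sorted(requests, key=lambda r: r[1]):  # ascending deadlines
        schedule.append((f, f + t))         # run this request in [f, f + t]
        f += t
        max_lateness = max(max_lateness, f - d)
    return schedule, max_lateness

sched, lateness = schedule_min_lateness([(1, 2), (2, 4), (3, 6)])
print(sched, lateness)  # [(0, 1), (1, 3), (3, 6)] 0
```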
<hr>
### π Shortest Paths
The <b style="color:#CA8EFF">shortest path</b> problem: given a directed graph `G = (V, E)` and a starting point `s`, we want to find the shortest path `Pᵥ` from `s` to every other vertex `v ∈ V - {s}`.
#### <span style="color:#B3A1FF">Dijkstra's Algorithm</span>
We aim to compute the shortest path distances from a source vertex `s` to all other vertices in a weighted graph `G = (V, E)` with non-negative edge weights `I(u, v)`.
##### Implementation Dijkstra(G, I)
- `S`: The set of vertices whose shortest path distances from `s` have been finalized.
- `d(u)`: The shortest known distance from `s` to vertex `u`.
- `d′(v)`: The tentative shortest distance to a vertex `v ∉ S`, calculated via a neighbor `u ∈ S`.
```
1. Initialize: S = {s}, d(s) = 0
2. While S ≠ V:
3. Select a vertex v ∉ S that is adjacent to at least one u ∈ S,
4. s.t. d′(v) = min { d(u) + I(u, v) | u ∈ S and (u, v) ∈ E }
5. Add v to S and set d(v) = d′(v)
```
<img src="https://hackmd.io/_uploads/H1n6gCxmlx.png" style="width: 40%;" />
##### Ex.
<img src="https://hackmd.io/_uploads/r1dYHDZXeg.png" style="width: 100%;" />
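The selection rule `d′(v) = min{d(u) + I(u, v)}` can be sketched with a binary min-heap (the graph and weights are illustrative):

```python
import heapq

def dijkstra(graph, s):
    """graph: {u: [(v, weight), ...]} with non-negative weights.
    Returns d = shortest-path distances from s."""
    d = {s: 0}
    heap = [(0, s)]                 # (tentative distance d'(v), v)
    while heap:
        dist, u = heapq.heappop(heap)
        if dist > d.get(u, float("inf")):
            continue                # stale entry: u was already finalized
        for v, w in graph[u]:
            if dist + w < d.get(v, float("inf")):
                d[v] = dist + w     # improved d'(v) via neighbor u
                heapq.heappush(heap, (d[v], v))
    return d

g = {"s": [("a", 1), ("b", 4)], "a": [("b", 2), ("c", 6)],
     "b": [("c", 3)], "c": []}
print(dijkstra(g, "s"))  # {'s': 0, 'a': 1, 'b': 3, 'c': 6}
```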
<hr>
### π Priority Queue
A <b style="color:#CA8EFF">priority queue</b> gives every element its own priority. There are two types: the <span style="color:#bf9ded">max priority queue</span> and the <span style="color:#bf9ded">min priority queue</span>.
##### Related actions:
- FindMin(Max)
- ExtractMin(Max)
- Insert
- ChangeKey
#### Heap (Recall)
A <b style="color:#CA8EFF">heap</b> is a specialized <span style="color:#bf9ded">tree-based data structure</span> that satisfies the heap property: in a max-heap, each parent node is greater than or equal to its children; in a min-heap, each parent node is less than or equal to its children.
<img src="https://hackmd.io/_uploads/r1rCMtbXlx.png" style="width: 80%;" />
The only possible empty space appears at the bottom right, which ensures the heap remains a complete binary tree. A heap can be implemented using an array, since a complete binary tree occupies contiguous positions:
<img src="https://hackmd.io/_uploads/HydsGtW7eg.png" style="width: 80%;" />
If a new vertex is added to the heap but does not satisfy the heap property, it needs to be swapped with its parent until the property is restored.
<img src="https://hackmd.io/_uploads/SJXwV2WQxl.png" style="width: 90%;" />
To remove the root vertex, we replace it with the last leaf node and then percolate it down the heap until the heap property is maintained.
<img src="https://hackmd.io/_uploads/Bkhu33-Qge.png" style="width: 100%;" />
#### <b style="color:#CA8EFF">Max(Min) heapify</b>
<span style="color:#bf9ded">Heapify</span> is the process of adjusting a binary tree to maintain the max-heap or min-heap property.
- In a max-heap, each parent must be greater than or equal to its children.
- In a min-heap, each parent must be less than or equal to its children.
When a node violates this rule, we compare it with its children and swap it with the larger (or smaller) child, then repeat the process recursively until the property is restored.
##### Ex. Max heapify
<img src="https://hackmd.io/_uploads/HyEdepb7ll.png" style="width: 100%;" />
Check the parent nodes from the bottom to the top. This gives an `O(n log n)` bound (a tighter analysis shows that building a heap this way is actually `O(n)`).
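The sift-down step and the bottom-up build can be sketched as:

```python
def max_heapify(a, i, n):
    """Sift a[i] down until the subtree rooted at i satisfies the max-heap
    property. Children of index i live at 2i + 1 and 2i + 2 in the array."""
    while True:
        left, right, largest = 2 * i + 1, 2 * i + 2, i
        if left < n and a[left] > a[largest]:
            largest = left
        if right < n and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]   # swap with the larger child
        i = largest

def build_max_heap(a):
    # Check parent nodes from the bottom up; leaves are already valid heaps.
    for i in range(len(a) // 2 - 1, -1, -1):
        max_heapify(a, i, len(a))
    return a

print(build_max_heap([3, 9, 2, 1, 4, 5]))  # [9, 4, 5, 1, 3, 2]
```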
<hr>
### π Minimum Spanning Trees(MST)
The <b style="color:#CA8EFF">minimum spanning tree</b> is a problem that involves finding a subset of edges in a weighted, connected graph s.t. <span style="color:#bf9ded">all vertices are connected</span>, the <span style="color:#bf9ded">total edge weight is minimized</span>, and no cycles are formed.
- Let `T` be a minimum-cost solution, we want to get `(V, T)`.
<img src="https://hackmd.io/_uploads/S1cjDGMQee.png" style="width: 40%;" />
#### <span style="color:#B3A1FF">Kruskal's algorithm</span>
- Start with `T = ∅`
- Consider edges in ascending order of cost
- Insert edge `e` in `T` as long as it does not create a cycle, otherwise, discard `e` and continue.
<img src="https://hackmd.io/_uploads/BksdwMfmxe.png" style="width: 100%;" />
Always choose the cheapest edge that does not create a cycle.
##### Implementation
```
1. {e₁, e₂, ..., eₘ} = sort edges in ascending order of their costs
2. T = {}
3. for each eᵢ = (u, v) do
4. if (u and v are not connected by edges in T) then
5. T = T ∪ {eᵢ}
```
#### <span style="color:#B3A1FF">Prim's algorithm</span>
- Start with a root node `s`.
- <span style="color:#bf9ded">Greedily</span> grow a tree `T` from `s` outward.
- At each step, add the cheapest edge `e` to the partial tree `T` that has exactly one endpoint in `T`.
<img src="https://hackmd.io/_uploads/B1fC9zz7gx.png" style="width: 100%;" />
##### Implementation
- Start with a node `s`, add it to the explored set `S`.
```
1. Initialize: S = {s}
2. While S ≠ V:
3. Select a vertex v ∉ S that is adjacent to at least one u ∈ S,
4. s.t. d′(v) = min { cost(u, v) | u ∈ S and (u, v) is an edge }
5. Add v to S and set d(v) = d′(v)
```
#### <span style="color:#B3A1FF">Reverse-delete algorithm</span>
- Start with `T = E`.
- Consider edges in descending order of cost.
- Delete edge `e` from `T` unless doing so would disconnect `T`.
<img src="https://hackmd.io/_uploads/BypSnfGQgx.png" style="width: 100%;" />
**This method is hard to implement, just a concept.**
#### Cut property
Let `S` be any subset of nodes, and let `e = (v, w)` be the <span style="color:#bf9ded">minimum cost edge</span> with one end in `S` and the other in `V-S`. Then every minimum spanning tree contains `e`.
<img src="https://hackmd.io/_uploads/rkLPyXMQlx.png" style="width: 50%;" />
#### Cycle property
Let `C` be any cycle in `G`, and let `e = (v, w)` be the <span style="color:#bf9ded">maximum cost edge</span> in `C`. Then `e` does not belong to any minimum spanning tree.
#### The Union-Find data structure
The union-find data structure represents disjoint sets. Each set has a representative. Its operations include:
- MakeUnionFind(S): Initialize a set for each element in `S`.
- Find(u): Return the representative of the set containing `u`.
- Union(A, B): Merge sets `A` and `B`.
<img src="https://hackmd.io/_uploads/HkEo_tM7eg.png" style="width: 70%;" />
The find operation updates the structure by compressing paths, making future find operations faster.
<img src="https://hackmd.io/_uploads/SJzR_FzXxl.png" style="width: 30%;" />
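Kruskal's rule combined with union-find (here with path compression; the edge costs are illustrative) can be sketched as:

```python
def kruskal(n, edges):
    """n vertices (0..n-1), edges = [(cost, u, v), ...].
    Returns (total cost, chosen edges) of a minimum spanning tree."""
    parent = list(range(n))                 # MakeUnionFind: each vertex alone

    def find(u):                            # Find, with path compression
        while parent[u] != u:
            parent[u] = parent[parent[u]]   # point u one level higher
            u = parent[u]
        return u

    total, tree = 0, []
    for cost, u, v in sorted(edges):        # ascending order of cost
        ru, rv = find(u), find(v)
        if ru != rv:                        # no cycle: u, v in different sets
            parent[ru] = rv                 # Union
            total += cost
            tree.append((u, v))
    return total, tree

edges = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]
print(kruskal(4, edges))  # (7, [(0, 1), (1, 2), (2, 3)])
```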
<br>
<br>
# βοΈ Divide and conquer
### π‘ Key points
#### Mathematical theory
- <b style="color:#CA8EFF">Weak Induction</b> - Given the proposition `P(n)` where `n ∈ X`, a proof by mathematical induction is of the form:
    - <span style="color:#bf9ded">Basis step</span>: The proposition `P(0)` is shown to be true.
    - <span style="color:#bf9ded">Inductive step</span>: The implication `P(k)` → `P(k+1)` is shown to be true for every `k ∈ X`.
- <b style="color:#CA8EFF">Strong Induction</b> - Given the proposition `P(n)` where `n ∈ X`, a proof by the second principle of <span style="color:#bf9ded">mathematical induction</span> (or <span style="color:#bf9ded">strong induction</span>) is of the form:
    - <span style="color:#bf9ded">Basis step</span>: The proposition `P(0)` is shown to be true.
    - <span style="color:#bf9ded">Inductive step</span>: The implication `P(0) ∧ P(1) ∧ ... ∧ P(k)` → `P(k+1)` is shown to be true for every `k ∈ X`.
#### <b style="color:#CA8EFF">Divide and conquer</b>
- <span style="color:#bf9ded">Divide</span>: Break the input into several parts of <span style="color:#bf9ded">the same type</span>.
- <span style="color:#bf9ded">Conquer</span>: Solve the problem in each part <span style="color:#bf9ded">recursively</span>.
- <span style="color:#bf9ded">Combine</span>: Combine the solutions to sub-problems into an overall solution.
**Spend only linear time for the initial division and final recombining.**
<hr>
### π <span style="color:#B3A1FF">Merge sort</span>
Merge sort divides the input into two halves and sorts each half recursively.
- `A`: The array of elements to sort.
- `p`: Index at which sorting starts.
- `r`: Index at which sorting stops.
```
Mergesort(A, p, r)
1. if (p < r) then
2. q = ⌊(p+r)/2⌋
3. Mergesort(A, p, q)
4. Mergesort(A, q+1, r)
5. Merge(A, p, q, r)
```
<img src="https://hackmd.io/_uploads/Sy645GmQxg.png" style="width: 40%;" />
#### Time complexity
- `T(n)` for input size `n`
- Divide takes `D(n)`; for an array split it is `O(1)`.
- Conquer: `2T(n/2)`
- Combine takes `C(n)`; we want it to be `O(n)`.
⇒ `T(n) = D(n) + 2T(n/2) + C(n) = O(n log n)`.
>[!Note]
>In MergeSort, each level does `O(n)` work (as the total data per level is still `n`), and there are `log n` levels, so the total time is `O(n log n)`.
#### Merge operation
At each step, take the smaller of the two front elements and append it to the result array.
<img src="https://hackmd.io/_uploads/B10-rw7mex.png" style="width: 60%;" />
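The recursion and the merge step can be sketched as:

```python
def merge_sort(a):
    if len(a) <= 1:
        return a
    mid = len(a) // 2                      # q = floor((p + r) / 2)
    left = merge_sort(a[:mid])             # conquer each half recursively
    right = merge_sort(a[mid:])
    # Merge: always take the smaller front element — O(n) per level.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]   # append the leftover tail

print(merge_sort([5, 2, 8, 1, 9, 3]))  # [1, 2, 3, 5, 8, 9]
```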
<hr>
### π <span style="color:#B3A1FF">Counting inversions</span>
Assume one array (`A`) is fixed in ascending order; the inversions with respect to the other array (`B`) can be counted in two ways:
- <span style="color:#bf9ded">Method 1</span>: For each element in array `B`, <span style="color:#bf9ded">find and count</span> the number of elements <span style="color:#bf9ded">before it</span> that are <span style="color:#bf9ded">greater than the current</span> element. This represents the number of inversions involving that element.
- <span style="color:#bf9ded">Method 2</span>: Draw lines <span style="color:#bf9ded">connecting corresponding elements between arrays `A` and `B`</span>, and count the number of intersections between the lines. Each intersection represents an inversion.
<img src="https://hackmd.io/_uploads/rJa6Oh7Qgx.png" style="width: 60%;" />
Implement divide and conquer in it:
<img src="https://hackmd.io/_uploads/SyptjnmXlx.png" style="width: 60%;" />
To ensure the comparison runs in linear time, we first sort the two divided subarrays.
- During the merge step, we use two pointers: one for each subarray. As we compare elements from left to right, if an <span style="color:#bf9ded">element from the left subarray is greater than the current element from the right subarray</span>, it forms an <span style="color:#bf9ded">inversion</span>.
- Because both subarrays are sorted, we know that <span style="color:#bf9ded">all remaining elements in the left subarray will also be greater</span>, so we can count multiple inversions at once.
- This allows us to <span style="color:#bf9ded">avoid restarting the comparison</span> and ensures each element is only compared once.
<img src="https://hackmd.io/_uploads/ryMqDTmXex.png" style="width: 60%;" />
#### Implementation
```
Sort-and-Count(L, p, q)
1. if (p = q) then return 0
2. else
3. m = ⌊(p+q)/2⌋
4. rp = Sort-and-Count(L, p, m)
5. rq = Sort-and-Count(L, m+1, q)
6. rm = Merge-and-Count(L, p, m, q)
7. return r = rp + rq + rm
```
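Sort-and-Count can be sketched as runnable Python; when an element from the right half is placed before the remaining elements of the left half, all of those remaining elements count as inversions at once:

```python
def sort_and_count(a):
    """Return (sorted copy of a, number of inversions in a)."""
    if len(a) <= 1:
        return a, 0
    mid = len(a) // 2
    left, r_left = sort_and_count(a[:mid])
    right, r_right = sort_and_count(a[mid:])
    merged, i, j, crossings = [], 0, 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            # right[j] jumps ahead of everything left in `left`:
            # that's len(left) - i inversions counted at once.
            crossings += len(left) - i
            merged.append(right[j])
            j += 1
    merged += left[i:] + right[j:]
    return merged, r_left + r_right + crossings

print(sort_and_count([2, 4, 1, 3, 5])[1])  # 3 inversions: (2,1), (4,1), (4,3)
```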
<hr>
### π Closest pair of points
Given a set of `n` points in the plane, where `pᵢ` is located at `(xᵢ, yᵢ)`, we want to find a pair `(pᵢ, pⱼ)` with the smallest Euclidean distance between them: `√[(xᵢ-xⱼ)²+(yᵢ-yⱼ)²]`.
- Using <span style="color:#bf9ded">divide and conquer</span>, we split the problem into smaller subproblems of the same type. First, we <span style="color:#bf9ded">find the closest pair on each side</span> (the orange edges), and then check if a closer pair exists <span style="color:#bf9ded">across the two halves</span> (the red edge):
<img src="https://hackmd.io/_uploads/HJkmG7Emgg.png" style="width: 40%;" />
- When searching for the closest pair across the two halves, we only need to consider <span style="color:#bf9ded">the smaller of the two closest distances from each half</span> as the <b style="color:#CA8EFF">threshold</b> for cross-pair comparisons:
<img src="https://hackmd.io/_uploads/SyG6M7N7xe.png" style="width: 40%;" />
#### Implementation
```
Closest-Pair(P)
1. construct Px and Py // O(n log n)
2. (p*₁, p*₂) = Closest-Pair-Rec(Px, Py) // T(n)
Closest-Pair-Rec(Px, Py)
1. if |P| ≤ 3 then return closest pair measured by all pair-wise distances
2. x* = (⌊n/2⌋)-th smallest x-coordinate in Px
3. construct Qx, Qy, Rx, Ry // O(n)
4. (q*₁, q*₂) = Closest-Pair-Rec(Qx, Qy) // T(n/2)
5. (r*₁, r*₂) = Closest-Pair-Rec(Rx, Ry) // T(n/2)
6. δ = min(d(q*₁, q*₂), d(r*₁, r*₂))
7. L = {(x, y): x = x*}; S = {points in P within distance δ of L}
8. construct Sy // O(n)
9. for each s ∈ S do
10. compute distance from s to each of next 15 points in Sy
11. d(s, s') = min distance of all computed distances // O(n)
12. if d(s, s') < δ then return (s, s')
13. else if d(q*₁, q*₂) < d(r*₁, r*₂) then return (q*₁, q*₂)
14. else return (r*₁, r*₂)
```
- In the closest pair algorithm, we only need to <span style="color:#bf9ded">compare each point with up to 15 nearby points</span> in the sorted strip because of geometric packing constraints.
- Within a vertical strip of width 2Ξ΄, at most 15 points can <span style="color:#bf9ded">lie within Ξ΄ distance</span> of each other without <span style="color:#bf9ded">violating the minimal distance found so far</span>. This limit ensures the algorithm remains efficient at `O(n log n)` time.
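As a concrete sketch, the recursion above can be written in Python as follows (the function name and point representation are chosen here for illustration; it assumes at least two distinct points given as tuples):

```python
from math import dist  # Euclidean distance, Python 3.8+

def closest_pair(points):
    """Smallest pairwise Euclidean distance, O(n log n).
    Assumes at least two distinct (x, y) tuples."""
    px = sorted(points)                       # sorted by x
    py = sorted(points, key=lambda p: p[1])   # sorted by y

    def rec(px, py):
        n = len(px)
        if n <= 3:  # base case: brute force over all pairs
            return min(dist(px[i], px[j])
                       for i in range(n) for j in range(i + 1, n))
        mid = n // 2
        x_star = px[mid - 1][0]               # dividing vertical line
        qx, rx = px[:mid], px[mid:]
        left = set(qx)
        qy = [p for p in py if p in left]     # keep both halves y-sorted
        ry = [p for p in py if p not in left]
        delta = min(rec(qx, qy), rec(rx, ry))
        # strip of width 2*delta around the line, already sorted by y
        strip = [p for p in py if abs(p[0] - x_star) < delta]
        best = delta
        for i, s in enumerate(strip):
            for t in strip[i + 1:i + 16]:     # only 15 successors matter
                best = min(best, dist(s, t))
        return best

    return rec(px, py)
```

Filtering `py` by membership keeps both halves sorted by y without re-sorting, which is what preserves the `O(n log n)` bound.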
<br>
<br>
# π§ Dynamic programming
<span style="color:#B3A1FF">Dynamic programming (DP)</span> came from the term mathematical programming. Basic idea:
- One <span style="color:#bf9ded">implicitly</span> explores the space of all possible solutions.
- Carefully <span style="color:#bf9ded">decomposing</span> things into a series of subproblems.
- <span style="color:#bf9ded">Building up</span> correct solutions to larger and <span style="color:#bf9ded">larger subproblems</span>.
### π‘ Key points
#### Basic concepts
Suppose the interval scheduling problem is illustrated in the left image. We begin by sorting the intervals based on their end times, as shown in the right image.
<img src="https://hackmd.io/_uploads/r1RQpUN7gg.png" style="width: 90%;" />
To determine how to solve the problem, we first break it down into smaller subproblems and then explore different strategies for combining their solutions into a complete answer.
1. Solve based on <span style="color:#bf9ded">time</span>
Construct the overall solution by combining subproblems defined over time intervals. However, it can be difficult to determine the optimal threshold between intervals.
<img src="https://hackmd.io/_uploads/r1-H6LNmee.png" style="width: 90%;" />
2. Solve based on <span style="color:#bf9ded">weight gain</span>
Define `p(j)` as the largest index `i < j` such that interval `i` does not overlap with interval `j`.
- `Oⱼ` means the optimal solution for intervals `1, ..., j`.
- `OPT(j)` means the value of the optimal solution for intervals `1, ..., j`.
<img src="https://hackmd.io/_uploads/rJv3Jw4Qge.png" style="width: 50%;" />
In this example:
- `O₆ = {6} ∪ O₃` or `O₆ = O₅`
- `OPT(6) = max{v₆ + OPT(3), OPT(5)}`
#### Implementation
```
// Preprocessing
// 1. Sort intervals by finish times: f₁ ≤ f₂ ≤ ... ≤ fₙ
// 2. Compute p(1), p(2), ..., p(n)
Compute-Opt(j)
1. if (j = 0) then return 0
2. else return max{vⱼ + Compute-Opt(p(j)), Compute-Opt(j-1)}
```
##### Problem: This recursive process recalculates the same subproblems many times.
<img src="https://hackmd.io/_uploads/ByeJImvVXgg.png" style="width: 40%;" />
To solve the problem, we need to <span style="color:#bf9ded">store the calculated results</span> so that they can be reused when needed. This is called <b style="color:#CA8EFF">memoization</b>. The implementation thus becomes:
```
M-Compute-Opt(j)
1. if (j = 0) then return 0
2. else if (M[j] is not empty) then return M[j]
3. else return M[j] = max{vⱼ + M-Compute-Opt(p(j)), M-Compute-Opt(j-1)}
```
This is the top-down method for saving calculation time; it can also be implemented bottom-up, which is called <b style="color:#CA8EFF">iteration</b>:
```
I-Compute-Opt
1. M[0] = 0
2. for j = 1, 2, ..., n do
3. M[j] = max{vⱼ + M[p(j)], M[j-1]}
```
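A minimal Python version of the bottom-up iteration, assuming intervals are given as `(start, finish, value)` triples (this representation and the function name are assumptions for illustration):

```python
from bisect import bisect_right

def max_weight_schedule(intervals):
    """Weighted interval scheduling, bottom-up DP.
    intervals: list of (start, finish, value) triples."""
    jobs = sorted(intervals, key=lambda iv: iv[1])   # sort by finish time
    finishes = [f for _, f, _ in jobs]
    n = len(jobs)
    # p[j]: number of jobs finishing no later than job j starts,
    # i.e. the index of the last compatible job
    p = [bisect_right(finishes, jobs[j][0]) for j in range(n)]
    M = [0] * (n + 1)                                # M[j] = OPT(j)
    for j in range(1, n + 1):
        value = jobs[j - 1][2]
        M[j] = max(M[j - 1], value + M[p[j - 1]])    # skip j vs. take j
    return M[n]
```

Computing each `p(j)` by binary search keeps the whole routine at `O(n log n)`.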
#### <span style="color:#bf9ded">When to use DP</span>
<span style="color:#bf9ded">Dynamic programming</span> can be used if the problem satisfies the following properties:
- There are <span style="color:#bf9ded">only a polynomial number of subproblems</span>.
- The solution to the original problem can be <span style="color:#bf9ded">easily computed from the solutions to the subproblems</span>.
- There is a <span style="color:#bf9ded">natural ordering on subproblems</span> from "smallest" to "largest", together with an easy-to-compute recurrence.
If a problem can be solved using DP, it needs the following properties:
- <b style="color:#CA8EFF">Optimal substructure</b>: An optimal solution contains within it <span style="color:#bf9ded">optimal solutions to subproblems</span>.
- <b style="color:#CA8EFF">Overlapping subproblem</b>: A recursive algorithm <span style="color:#bf9ded">revisits the same problem over and over again</span>; typically, the total number of distinct subproblems is a <span style="color:#bf9ded">polynomial</span> in the input size.
<hr>
### π Fibonacci Sequence
Recurrence relation: `Fₙ = Fₙ₋₁ + Fₙ₋₂`, `F₀ = 0`, `F₁ = 1`.
If we want to get `fib(5)`, we need:
```
fib(5) = fib(4) + fib(3)
= (fib(3) + fib(2)) + (fib(2) + fib(1))
= ((fib(2) + fib(1)) + (fib(1) + fib(0))) + ((fib(1) + fib(0)) + fib(1))
= (((fib(1) + fib(0)) + fib(1)) + (fib(1) + fib(0))) + ((fib(1) + fib(0)) + fib(1))
```
So many <span style="color:#bf9ded">repeated calculations</span>, can use <span style="color:#bf9ded">dynamic programming</span> to improve.
<img src="https://hackmd.io/_uploads/HJzGTsEXxx.png" style="width: 80%;" />
#### Implementation (Memoization)
```
fib(n)
1. Initialize f[0...n] with -1 // -1: unfilled
2. f[0] = 0, f[1] = 1
3. fibonacci(n, f)
fibonacci(n, f)
1. if f[n] == -1 then
2. f[n] = fibonacci(n-1, f) + fibonacci(n-2, f)
3. return f[n] // if f[n] is already filled, it is returned directly
```
The array `f` stores computed values so they are never recalculated.
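In Python, the memo array can be replaced by the standard library's `functools.lru_cache`, which gives the same top-down memoized behavior:

```python
from functools import lru_cache

@lru_cache(maxsize=None)   # plays the role of the array f
def fib(n):
    """F(0) = 0, F(1) = 1; each value is computed exactly once."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)
```

With memoization, `fib(50)` returns immediately, whereas the naive recursion would make billions of calls.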
<hr>
### π Maze Routing
Given a grid-based layout with a <span style="color:#bf9ded">start</span> and an <span style="color:#bf9ded">target</span> point, maze routing is the process of <span style="color:#bf9ded">finding a valid path that connects the two points while avoiding obstacles</span>. It is commonly used in VLSI design and robotics pathfinding.
- A planar <span style="color:#bf9ded">rectangular grid graph</span>.
- Two points `S` and `T` on the graph.
- Obstacles modeled as blocked vertices.
We want to find the shortest path connecting `S` and `T`.
<div style="display: flex; justify-content: space-between;">
<img src="https://hackmd.io/_uploads/BkbWZhEQlg.png" style="width: 90%;" />
<img src="https://hackmd.io/_uploads/HJOuGn4Xgl.png" style="width: 90%;" />
</div>
The path is determined by exploring all possible moves at each step, and there may be multiple valid solutions.
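Since maze routing on an unweighted grid is essentially breadth-first search (the wave propagation at the core of Lee's algorithm), a minimal sketch might look like this (the grid encoding with `'#'` obstacles is an assumption):

```python
from collections import deque

def maze_route(grid, start, target):
    """Shortest S-T path length on a grid via BFS.
    '#' cells are obstacles; returns -1 if target is unreachable."""
    rows, cols = len(grid), len(grid[0])
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (r, c), steps = queue.popleft()
        if (r, c) == target:
            return steps
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] != '#' and (nr, nc) not in seen):
                seen.add((nr, nc))
                queue.append(((nr, nc), steps + 1))
    return -1
```

BFS explores cells in order of distance from `S`, so the first time `T` is reached the path is guaranteed shortest.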
<hr>
### π Subset Sums & Knapsacks
Given a set of `n` items and a knapsack, where item `i` weighs `wᵢ > 0` and the knapsack has capacity `W`, we want to fill the knapsack so as to maximize the total weight.
<img src="https://hackmd.io/_uploads/S1--PrHQgg.png" style="width: 40%;" />
#### <b style="color:#CA8EFF">Optimization problem</b> formulation
- `max Σᵢ∈S wᵢ`
s.t. `Σᵢ∈S wᵢ ≤ W`, `S ⊆ {1, ..., n}`
#### <b style="color:#CA8EFF">OPT(i, w)</b> = the total weight of the optimal solution for items 1, ..., i with remaining capacity w
- `OPT(i, w) = max_S Σⱼ∈S wⱼ` s.t. `S ⊆ {1, ..., i}` and `Σⱼ∈S wⱼ ≤ w`
Consider `OPT(n, W)`, i.e., the total weight of the final solution `O`. There are two cases:
- Case 1: `n ∉ O` (`O` does not include item `n`) → `OPT(n, w) = OPT(n-1, w)`.
- Case 2: `n ∈ O` (`O` includes item `n`) → `OPT(n, w) = wₙ + OPT(n-1, w-wₙ)`.
<img src="https://hackmd.io/_uploads/SkBxYYSXex.png" style="width: 60%;" />
#### Implementation
```
Subset-Sum(n, w₁, ..., wₙ, W)
1. for w = 0, 1, ..., W do // no item selected yet
2. M[0, w] = 0
3. for i = 0, 1, ..., n do // no remaining capacity
4. M[i, 0] = 0
5. for i = 1, 2, ..., n do
6. for w = 1, 2, ..., W do
7. if (wᵢ > w) then // item i does not fit in capacity w
8. M[i, w] = M[i-1, w]
9. else
10. M[i, w] = max {M[i-1, w], wᵢ + M[i-1, w-wᵢ]}
```
<img src="https://hackmd.io/_uploads/rkCAz8BXxx.png" style="width: 70%;" />
**Time complexity**: `O(nW)` → **not polynomial, only pseudo-polynomial time** (the input encodes `W` in `log W` bits, but the runtime grows with `W` itself).
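A direct Python translation of the subset-sum table-filling DP, where the fit test compares the item's weight against the current capacity `w` (the function name is illustrative):

```python
def subset_sum(weights, W):
    """M[i][w] = best total weight using the first i items
    with capacity w; O(nW) time and space."""
    n = len(weights)
    M = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        wi = weights[i - 1]
        for w in range(1, W + 1):
            if wi > w:                         # item i cannot fit
                M[i][w] = M[i - 1][w]
            else:                              # skip item i vs. take it
                M[i][w] = max(M[i - 1][w], wi + M[i - 1][w - wi])
    return M[n][W]
```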
#### π The Knapsack Problem
Given a set of `n` items and a knapsack, where item `i` weighs `wᵢ > 0` and has value `vᵢ > 0`, and the knapsack has capacity `W`, we want to fill the knapsack so as to maximize total value.
#### <b style="color:#CA8EFF">Optimization problem</b> formulation
- `max Σᵢ∈S vᵢ`
s.t. `Σᵢ∈S wᵢ ≤ W`, `S ⊆ {1, ..., n}`
<img src="https://hackmd.io/_uploads/SkaQtrBXee.png" style="width: 60%;" />
<hr>
### π <span style="color:#B3A1FF">Bellman-Ford algorithm</span>
If `G` has no negative cycles, then there is a <span style="color:#bf9ded">shortest path</span> from `s` to `t` that is <span style="color:#bf9ded">simple</span> (i.e., does not repeat nodes), and hence has at most `n-1` edges.
`OPT(i, v)` means the length of the shortest v-t path `P` using at most `i` edges.
- `OPT(n-1, s)` means the length of the shortest s-t path.
- <span style="color:#bf9ded">Case 1</span>: `P` uses at most `i-1` edges.
  - `OPT(i, v) = OPT(i-1, v)`.
- <span style="color:#bf9ded">Case 2</span>: `P` uses exactly `i` edges.
  - `OPT(i, v) = c(v, w) + OPT(i-1, w)`, where `w` is the first node after `v` on `P` and `c(v, w)` is the cost of edge `(v, w)`.
<img src="https://hackmd.io/_uploads/B1bn85SXex.png" style="width: 60%;" />
#### Implementation
```
Bellman-Ford(G, s, t)
// n = # of nodes in G
// M[0...n-1, V]: table recording optimal solutions of subproblems
1. M[0, t] = 0
2. foreach v ∈ V−{t} do
3. M[0, v] = ∞
4. for i = 1 to n-1 do
5. for v ∈ V in any order do
6. M[i, v] = min{M[i-1, v], min over (v, w) ∈ E of {c(v, w) + M[i-1, w]}}
```
<img src="https://hackmd.io/_uploads/BkfZfoH7xx.png" style="width: 50%;" />
<img src="https://hackmd.io/_uploads/B1aGmiSQgl.png" style="width: 70%;" />
**Time complexity**: `O(nm)`, where `n` is the number of nodes and `m` is the number of edges.
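A compact Python sketch of the recurrence. The 2-D table is rolled into one dimension, since row `i` only ever reads row `i-1`; the edge-list representation is an assumption:

```python
from math import inf

def bellman_ford(n, edges, t):
    """Length of the shortest path from every node to t.
    edges: list of (v, w, cost) triples for a directed edge v -> w."""
    M = [inf] * n
    M[t] = 0
    for _ in range(n - 1):            # a simple path has <= n-1 edges
        for v, w, cost in edges:
            if M[w] + cost < M[v]:    # relax: go v -> w, then w ~> t
                M[v] = M[w] + cost
    return M
```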
<hr>
### π Currency Conversion
Given `n` currencies and the exchange rates between each pair, we want to detect an arbitrage opportunity; that is, check whether someone can end up with more money by exchanging currencies around a cycle.
To detect such opportunities, we need to <span style="color:#bf9ded">check whether the product of exchange rates along a cycle is greater than 1</span>. However, since shortest path algorithms work with sums rather than products, we <span style="color:#bf9ded">convert the problem</span> by taking the logarithm of each rate and using the <span style="color:#bf9ded">negative log values</span>. This transforms the product into a sum, allowing us to apply <span style="color:#bf9ded">shortest path algorithms</span> like <b style="color:#CA8EFF">Bellman-Ford</b> to detect <b style="color:#CA8EFF">negative-weight cycles</b>, which correspond to arbitrage opportunities.
<img src="https://hackmd.io/_uploads/BkNAfnSmxl.png" style="width: 50%;" />
#### Negative cycle detection
If a graph has `n` nodes and `OPT(n, v) < OPT(n-1, v)` for some `v`, then paths are still improving after `n-1` iterations, so the graph contains a negative cycle.
To detect it, we can add a new vertex and connect all vertices to it with edge weight `0`. This ensures all vertices can reach the new one, so if there's a negative cycle, it will be detected when running Bellman-Ford algorithm.
<img src="https://hackmd.io/_uploads/S1XE9hBmeg.png" style="width: 40%;" />
Also, remember to run `n+1` iterations to detect negative cycles.
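A sketch of the whole pipeline in Python: take negative logs, run Bellman-Ford as if from an extra zero-weight source, and check one more relaxation pass (the rate-matrix representation and function name are assumptions):

```python
from math import log

def has_arbitrage(rates):
    """rates[i][j]: units of currency j obtained per unit of currency i.
    A profitable cycle has rate product > 1, i.e. sum of -log(rate) < 0,
    so arbitrage = negative cycle under weights -log(rate)."""
    n = len(rates)
    w = [[-log(rates[i][j]) for j in range(n)] for i in range(n)]
    # extra source with 0-weight edges to all nodes == start all dists at 0
    dist = [0.0] * n
    for _ in range(n):                # (n+1 nodes) - 1 relaxation rounds
        for i in range(n):
            for j in range(n):
                if i != j and dist[i] + w[i][j] < dist[j]:
                    dist[j] = dist[i] + w[i][j]
    # if any edge still relaxes, a negative cycle exists
    return any(dist[i] + w[i][j] < dist[j] - 1e-12
               for i in range(n) for j in range(n) if i != j)
```

The small tolerance guards against floating-point noise when a cycle's product is exactly 1.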
<hr>
### π Travelling Salesman Problem
<b style="color:#CA8EFF">Travelling Salesman Problem (TSP)</b>: A salesman must visit each of `n` different cities exactly once, starting from a base city and returning to it. What path minimizes the total distance travelled by the salesman?
For each subset `S` of the cities with `|S| ≥ 2` and each `u, v ∈ S`, `OPT(S, u, v)` means the length of the shortest path that <span style="color:#bf9ded">starts at</span> `u`, <span style="color:#bf9ded">ends at</span> `v`, and <span style="color:#bf9ded">visits all cities</span> in `S`.
- <span style="color:#bf9ded">Case 1</span>: `S = {u, v}`
- `OPT(S, u, v) = d(u, v)`
- <span style="color:#bf9ded">Case 2</span>: `|S| > 2`
- Assume `w ∈ S−{u, v}` is visited first:
`OPT(S, u, v) = d(u, w) + OPT(S−{u}, w, v)`
- `OPT(S, u, v) = min over w ∈ S−{u, v} of {d(u, w) + OPT(S−{u}, w, v)}`
##### Time complexity: `O(2ⁿn³)`.
Although this is much better than the `O(n!)` of brute force, it is still exponential: the number of subproblems here is exponential, whereas DP is most effective when the number of subproblems is polynomial.
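The idea can be sketched with the standard Held-Karp variant (fixed start city, `O(2ⁿn²)` states with an `O(n)` min each), which is slightly simpler than the three-parameter `OPT(S, u, v)` formulation above:

```python
from itertools import combinations

def tsp(d):
    """Held-Karp: shortest tour starting/ending at city 0 over an
    n x n distance matrix d."""
    n = len(d)
    # C[(S, v)]: shortest path from 0 that visits exactly the cities
    # in bitmask S (city 0 included) and ends at v
    C = {(1 | 1 << v, v): d[0][v] for v in range(1, n)}
    for size in range(3, n + 1):
        for subset in combinations(range(1, n), size - 1):
            S = 1 | sum(1 << v for v in subset)
            for v in subset:
                C[(S, v)] = min(C[(S ^ (1 << v), u)] + d[u][v]
                                for u in subset if u != v)
    full = (1 << n) - 1
    # close the tour by returning to city 0
    return min(C[(full, v)] + d[v][0] for v in range(1, n))
```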
<br>
<br>
# π Flow Network
A <b style="color:#CA8EFF">flow network</b> `G = (V, E)` is a directed graph in which each edge `e` has a nonnegative capacity `c(e)`, together with a source node `s` and a sink node `t`.
- In a flow network, an augmenting path may include <span style="color:#bf9ded">both forward and backward edges</span>. Forward edges represent <span style="color:#bf9ded">unused capacity and allow additional flow</span>, while backward edges represent <span style="color:#bf9ded">previously pushed flow</span> that can be canceled or "undone."
- This mechanism enables the algorithm to <span style="color:#bf9ded">redirect</span> flow by decreasing flow on certain edges and increasing it on others, ultimately leading to a more optimal overall flow.
### π‘ Key points
- `s-t flow f: E → ℝ⁺`
- A function f that maps each edge e to a nonnegative real number, `f(e)`: flow carried by edge e.
- `v(f)`: The value of a flow f.
#### Flow properties
- <b style="color:#CA8EFF">Capacity conditions</b>
  - `∀e∈E, 0 ≤ f(e) ≤ c(e)`
- <b style="color:#CA8EFF">Conservation conditions</b>
  - `∀u∈V\{s, t}, Σ_{e into u} f(e) = Σ_{e out of u} f(e)`
<img src="https://hackmd.io/_uploads/ByG4vW8Xex.png" style="width: 100%;" />
#### Upper bounds of the maximum s-t flow
To find the upper bound of the s-t flow, we divide the nodes into two sets, A and B, so that `s β A` and `t β B`.
- Any s-t flow must cross from `A` into `B` at some point.
- The s-t flow uses up some of the edge capacity from `A` to `B`.
<hr>
### π Ford-Fulkerson: Residual Graph
Assume we have a pushing flow:
- Push <span style="color:#bf9ded">forward</span> on edges with leftover capacity.
- Push <span style="color:#bf9ded">backward</span> on edges with flow.
The <span style="color:#B3A1FF">residual graph</span> `Gf` of `G` w.r.t. `f`:
- `V(Gf) = V(G)`
<img src="https://hackmd.io/_uploads/SJ-p1K8Qxg.png" style="width: 70%;" />
The **blue edges** mean the <span style="color:#bf9ded">leftover capacity</span> of an edge, and the **red edges** mean the <span style="color:#bf9ded">maximum available push-back</span> flow.
At this point, we can determine whether more flow can be sent from `s` to `t` by <span style="color:#bf9ded">checking for a path</span> in the residual graph. By finding augmenting paths step by step, we can <span style="color:#bf9ded">maximize the overall flow</span>.
##### Update pushing flow
<img src="https://hackmd.io/_uploads/BJQR1YI7le.png" style="width: 70%;" />
#### Implementation
- `e`: an edge.
- `E`: the edge set of the flow network.
- `Gf`: the residual graph.
```
Ford-Fulkerson(G, s, t)
1. foreach (e β E) do
2. f(e) = 0
3. construct Gf
4. while (∃ an s-t path P in Gf) do
5. b = bottleneck(P, f)
6. foreach (e ∈ P) do
7. if (e ∈ E) then f(e) = f(e) + b
8. else f(eᴿ) = f(eᴿ) - b
9. update Gf
10. return f
```
#### Ex.
<img src="https://hackmd.io/_uploads/Sy5l0KLQex.png" style="width: 100%;" />
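A Python sketch of the algorithm; this version picks augmenting paths with BFS (i.e., Edmonds-Karp), and the dict-of-dicts graph representation is an assumption:

```python
from collections import deque

def max_flow(capacity, s, t):
    """Ford-Fulkerson with BFS-chosen augmenting paths (Edmonds-Karp).
    capacity: {u: {v: c}} adjacency dict; the residual graph is kept
    in the same structure, with reverse edges starting at 0."""
    res = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u in capacity:
        for v in capacity[u]:
            res.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        parent = {s: None}                      # BFS for an s-t path in Gf
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:                     # no augmenting path left
            return flow
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        b = min(res[u][v] for u, v in path)     # bottleneck(P, f)
        for u, v in path:
            res[u][v] -= b                      # push forward
            res[v][u] += b                      # allow pushing back later
        flow += b
```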
<hr>
### π <b style="color:#CA8EFF">Bipartite Matching</b>
Given `n` men, `m` women and their feasible partners `G(X, Y, E)`, we want to find the matching `M ⊆ E` with the maximum number of marriages.
The <span style="color:#bf9ded">Bipartite Matching problem</span> can also be solved using a <span style="color:#bf9ded">residual graph</span>. We only need to <span style="color:#bf9ded">define a start and an end node</span>, and set all <span style="color:#bf9ded">edge capacities</span> to 1.
<img src="https://hackmd.io/_uploads/HyW7L587xx.png" style="width: 100%;" />
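Because every capacity is 1, the max-flow computation simplifies to repeatedly finding augmenting paths directly on the bipartite graph (Kuhn's algorithm); a sketch, with the adjacency-list representation assumed:

```python
def max_matching(adj, n_left, n_right):
    """Maximum bipartite matching via augmenting paths;
    equivalent to max flow with all capacities 1.
    adj[u]: list of right-side nodes feasible for left node u."""
    match_r = [-1] * n_right          # match_r[v] = left partner of v

    def augment(u, seen):
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                # take v if it is free, or if its current partner
                # can be re-matched elsewhere (the "push back" step)
                if match_r[v] == -1 or augment(match_r[v], seen):
                    match_r[v] = u
                    return True
        return False

    return sum(augment(u, set()) for u in range(n_left))
```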
<hr>
### π Maximum Flows and Minimum Cuts
#### Termination and Running Time
Assumption: All capacities are <span style="color:#bf9ded">integers</span>. Throughout Ford-Fulkerson, the flow values `{f(e): e∈E}` and the residual capacities in `Gf` are <span style="color:#bf9ded">integers</span>.
- `v(f') = v(f) + bottleneck(P, f)`. Since `bottleneck(P, f) > 0`, `v(f') > v(f)`.
- `C = Σ_{e out of s} c(e) ≥ Σ_{e out of s} f(e) = v(f)`. Ford-Fulkerson terminates in at most `C` iterations of the while loop.
Running time: `O(mC)`. For certain pathological casesβlike the one shownβeach augmenting path may only contribute 1 unit of flow, resulting in `C` iterations and thus <span style="color:#bf9ded">worst-case performance</span>.
<img src="https://hackmd.io/_uploads/ryw-U2I7xx.png" style="width: 40%;" />
#### Choose augmenting paths
To avoid the above worst-case, we should choose augmenting paths with:
- <span style="color:#bf9ded">Max bottleneck capacity</span>.
- <span style="color:#bf9ded">Fewest number of edges</span>.
- <span style="color:#bf9ded">Sufficiently large bottleneck capacity</span>.
- <b style="color:#CA8EFF">Δ-scaling</b>: look for paths with bottleneck capacity of <span style="color:#bf9ded">at least Δ</span> (choosing larger edges at the beginning).
#### Max-flow min-cut theorem
The value of the <span style="color:#bf9ded">max flow</span> is equal to the value of the <span style="color:#bf9ded">min cut</span>.
- There exists a cut `(A, B)` s.t. `v(f) = cap(A, B)`.
- Flow `f` is a max flow.
- There is no augmenting path relative to `f`.
##### Ex.
<img src="https://hackmd.io/_uploads/BJExR08Xlx.png" style="width: 40%;" />
The cut shows the maximum flow is `8+8=16`, which is the same as the final flow that goes into `t` (`6+6+4`).
<br>
<br>
# 🅿️ Beyond polynomial running times
Problems with <b style="color:#CA8EFF">polynomial-time</b> algorithms are the ones we can expect to solve in practice.
### π Decision & Optimization Problems
#### <b style="color:#CA8EFF">Decision problems</b>
Problems having <span style="color:#bf9ded">yes/no</span> answers are called <span style="color:#bf9ded">decision problems</span>.
- <b style="color:#CA8EFF">MST</b>: Given a graph `G=(V, E)` and a bound `K`, is there a spanning tree with a cost at most `K`?
- <b style="color:#CA8EFF">TSP</b>: Given a set of <span style="color:#bf9ded">cities</span>, <span style="color:#bf9ded">distance</span> between each pair of cities, and a bound `B`, is there a <span style="color:#bf9ded">route</span> that starts and ends at a given city, visits every city <span style="color:#bf9ded">exactly once</span>, and has total distance <span style="color:#bf9ded">at most</span> `B`?
#### <b style="color:#CA8EFF">Optimization problems</b>
Problems finding a legal configuration such that its cost is minimized (or maximized).
<b style="color:#CA8EFF">Binary search</b> on the bound of a <span style="color:#bf9ded">decision problem</span> can be used to obtain the solution to the corresponding <span style="color:#bf9ded">optimization problem</span>.
<hr>
### π Complexity Classes
- The class <b style="color:#CA8EFF">P</b> is the class of problems that can be <span style="color:#bf9ded">solved</span> in <span style="color:#bf9ded">polynomial time</span> in the size of input.
- The class <b style="color:#CA8EFF">NP (Nondeterministic Polynomial)</b> class is the class of problems that can be <span style="color:#bf9ded">verified</span> in polynomial time in the size of input.
- The class <b style="color:#CA8EFF">NP-complete (NPC)</b> consists of problems `Y` in <span style="color:#bf9ded">NP</span> with the property that for every problem `X` in <span style="color:#bf9ded">NP</span>, `X ≤ₚ Y`.
- Suppose `Y` is <span style="color:#bf9ded">NPC</span>, then `Y` is solvable in polynomial time iff `P = NP`.
#### Fundamental question: Do there exist "natural" NPC problems?
<hr>
### π The First Proved NPC: Circuit Satisfiability
Q: Given a combinational circuit built out of AND, OR, and NOT gates, is there a way to set the circuit inputs so that the output is 1?
A: Yes: `1 0 1`
<img src="https://hackmd.io/_uploads/r1Aaflw7le.png" style="width: 50%;" />
**This is an NP-complete problem (also the first one in history).**
<hr>
### π Polynomial-Time Reduction
Problem `X` <b style="color:#CA8EFF">polynomially reduces</b> to problem `Y` (`X ≤ₚ Y`) if, given an arbitrary instance `x` of `X`, we can construct in <span style="color:#bf9ded">polynomial time</span> an input `y` to `Y` such that `x` is a <span style="color:#bf9ded">yes</span> instance of `X` iff `y` is a <span style="color:#bf9ded">yes</span> instance of `Y`. If we could <span style="color:#bf9ded">solve</span> `Y` <span style="color:#bf9ded">in polynomial time</span>, we could then solve `X` as well.
- <b style="color:#CA8EFF">Design algorithms</b>
If `X ≤ₚ Y` and `Y` <span style="color:#bf9ded">can</span> be solved in polynomial-time, then `X` <span style="color:#bf9ded">can</span> also be solved in polynomial-time.
- <b style="color:#CA8EFF">Establish intractability</b>
If `X ≤ₚ Y` and `X` <span style="color:#bf9ded">cannot</span> be solved in polynomial-time, then `Y` <span style="color:#bf9ded">cannot</span> be solved in polynomial-time.
- <b style="color:#CA8EFF">Establish equivalence</b>
If `X ≤ₚ Y` and `Y ≤ₚ X`, then `X ≡ₚ Y`.
<hr>
### π Polynomial Reduction: HC ≤ₚ TSP
The <b style="color:#CA8EFF">Hamiltonian Circuit (Cycle) Problem (HC)</b>: given an <span style="color:#bf9ded">undirected graph</span> `G=(V, E)`, is there a <span style="color:#bf9ded">cycle</span> in `G` that includes <span style="color:#bf9ded">every vertex exactly once</span>?
- <b style="color:#CA8EFF">TSP</b>: The Traveling Salesman Problem.
- Claim: `HC ≤ₚ TSP`.
#### Steps
1. Define a reduction function `f` for `HC ≤ₚ TSP`.
- Given an HC instance `G = (V, E)` with `n` vertices, create a set of `n` cities labeled with names in `V`.
- Assign distance `d(u, v) = 1` if `(u, v) ∈ E`, and `d(u, v) = 2` otherwise.
<img src="https://hackmd.io/_uploads/S14EiLPXlx.png" style="width: 40%;" />
- Set bound `B = n`.
- `f` can be computed in `O(V²)` time.
<img src="https://hackmd.io/_uploads/r1nJAIwXgl.png" style="width: 60%;" />
2. `G` has a <span style="color:#bf9ded">HC</span> iff the reduced instance has a <span style="color:#bf9ded">TSP</span> tour with `distance ≤ B`.
- `x ∈ HC` → `f(x) ∈ TSP`.
  - Suppose the <span style="color:#bf9ded">HC</span> is `h = <v₁, v₂, ..., vₙ>`. Then `h` is also a tour in the transformed <span style="color:#bf9ded">TSP</span> instance.
  - The distance of the tour `h` is `n = B`, since all `n` <span style="color:#bf9ded">consecutive edges</span> of `h` are in `E` and therefore each has distance 1 in `f(x)`.
  - Thus, `f(x) ∈ TSP` (`f(x)` has a TSP tour with `distance ≤ B`).
- `f(x) ∈ TSP` → `x ∈ HC`
  - Suppose there is a <span style="color:#bf9ded">TSP</span> tour with `distance ≤ B = n`. Let it be `<v₁, v₂, ..., vₙ>`.
  - Since the distance of the tour is at most `n` and there are `n` edges in the <span style="color:#bf9ded">TSP</span> tour, every edge must have distance 1, i.e., the tour contains only edges in `E`.
  - Thus, `<v₁, v₂, ..., vₙ>` is a Hamiltonian cycle (`x ∈ HC`).
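The reduction `f` is simple enough to write down and sanity-check by brute force on tiny graphs (the function names and representations here are illustrative only):

```python
from itertools import permutations

def hc_to_tsp(n, edges):
    """The reduction f: cities = vertices, d(u, v) = 1 if (u, v) is a
    graph edge and 2 otherwise, bound B = n."""
    E = {frozenset(e) for e in edges}
    d = [[1 if frozenset((u, v)) in E else 2 for v in range(n)]
         for u in range(n)]
    return d, n

def tsp_decision(d, B):
    """Brute-force TSP decision (exponential; only for tiny checks)."""
    n = len(d)
    return any(sum(d[tour[i]][tour[(i + 1) % n]] for i in range(n)) <= B
               for tour in permutations(range(n)))
```

A 4-cycle (which has a HC) maps to a yes instance, while a star graph (no HC) maps to a no instance.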
<hr>
### π NP-Completeness
<span style="color:#bf9ded">Definition</span>: A <span style="color:#bf9ded">decision</span> problem `L` (a language `L ⊆ {0, 1}*`) is <b style="color:#CA8EFF">NP-complete (NPC)</b> if:
- `L ∈ NP`, and
- `L' ≤ₚ L` for every `L' ∈ NP`.
<b style="color:#CA8EFF">NP-hard</b>: If `L` satisfies property 2, but not necessarily property 1, we say that `L` is <b style="color:#CA8EFF">NP-hard</b>.
Suppose `L ∈ NPC`. Is `P = NP`?
- If `L ∈ P`, then there exists a <span style="color:#bf9ded">polynomial-time algorithm</span> for every `L' ∈ NP` (i.e., `P = NP`).
- If `L ∉ P`, then there exists no <span style="color:#bf9ded">polynomial-time algorithm</span> for any `L' ∈ NPC` (i.e., `P ≠ NP`).
<br>
<br>
# π Linear Programming
<b style="color:#CA8EFF">Linear programming</b> describes a broad class of optimization tasks in which both the <span style="color:#bf9ded">optimization criterion</span> and the <span style="color:#bf9ded">constraints</span> are <span style="color:#bf9ded">linear</span> functions.
### π‘ Key points
#### Basic concepts
Linear programming contains three parts:
- A set of decision variables.
- An <span style="color:#bf9ded">objective function</span>.
- A set of <span style="color:#bf9ded">constraints</span>.
#### Example: Profit Maximization
A boutique chocolatier has two products:
- A (Pyramide): profit $1 per box.
- B (Nuit): profit $6 per box.
##### <span style="color:#bf9ded">Constraints</span>
- The daily demand for these exclusive chocolates is limited to at most <span style="color:#bf9ded">200 boxes of A</span> and <span style="color:#bf9ded">300 boxes of B</span>.
- The current workforce can produce a total of <span style="color:#bf9ded">at most 400</span> boxes of chocolate per day.
##### <span style="color:#bf9ded">Decision variables</span>
- `x₁` = Boxes of A.
- `x₂` = Boxes of B.
##### <span style="color:#bf9ded">Objective Function</span>
- Maximize profit
```
max x₁ + 6x₂
Subject to x₁ ≤ 200        (1)
           x₂ ≤ 300        (2)
           x₁ + x₂ ≤ 400   (3)
           x₁, x₂ ≥ 0
```
<img src="https://hackmd.io/_uploads/HJEU4Zd7xe.png" style="width: 30%;" />
The <span style="color:#bf9ded">simplex method</span> then hill-climbs from vertex to vertex of the feasible region to find the best solution.
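Since a 2-D LP attains its optimum at a vertex of the feasible polygon, the chocolate example (with the 400 boxes/day workforce limit) can be sanity-checked by enumerating all pairwise constraint-line intersections; this is a toy check, not a real solver:

```python
from itertools import combinations

def lp_max_2d(c, cons):
    """Maximize c.x over {x : a.x <= b for every (a, b) in cons}, in 2-D.
    Every vertex is the intersection of two constraint lines,
    so we enumerate them and keep the feasible ones."""
    best = None
    for (a1, b1), (a2, b2) in combinations(cons, 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if det == 0:
            continue                           # parallel lines: no vertex
        x = (b1 * a2[1] - b2 * a1[1]) / det    # Cramer's rule
        y = (a1[0] * b2 - a2[0] * b1) / det
        if all(a[0] * x + a[1] * y <= b + 1e-9 for a, b in cons):
            value = c[0] * x + c[1] * y
            if best is None or value > best:
                best = value
    return best

# the chocolate example: max x1 + 6*x2 (x >= 0 written as -x <= 0)
chocolate = [((1, 0), 200), ((0, 1), 300), ((1, 1), 400),
             ((-1, 0), 0), ((0, -1), 0)]
```

The simplex method visits these same vertices, but walks between neighbours instead of checking all of them.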
##### Multipliers
To find an upper bound, we can also combine the constraints:
- (1) + 6·(2): `x₁ + 6x₂ ≤ 2000`
- 0·(1) + 5·(2) + (3): `x₁ + 6x₂ ≤ 1900`
#### <b style="color:#CA8EFF">Duality</b>
The problem can be transformed into the following form, so that we can obtain the tightest upper bound by minimizing.
```
Multiplier   Inequality
y₁           x₁ ≤ 200
y₂           x₂ ≤ 300
y₃           x₁ + x₂ ≤ 400
⇒ (y₁+y₃)x₁ + (y₂+y₃)x₂ ≤ 200y₁ + 300y₂ + 400y₃
```
We want a <span style="color:#bf9ded">tight</span> bound for the <span style="color:#bf9ded">dual problem</span>; thus we minimize `200y₁ + 300y₂ + 400y₃` subject to `y₁ + y₃ ≥ 1`, `y₂ + y₃ ≥ 6`, `y ≥ 0`.
##### Generic form:
```
Primal LP: Dual LP:
max cᵀx             min yᵀb
Ax ≤ b              yᵀA ≥ cᵀ
x ≥ 0               y ≥ 0
```
#### <span style="color:#bf9ded">Standard Form</span>
Variants
- Either a maximization or a minimization problem.
- Constraints can be equations and/or inequalities.
- Variables are restricted to be nonnegative or unrestricted in sign.