---
title: Algorithms (演算法)
---

# Introduction to Algorithms (NTNU)

##### [Back to Note Overview](https://reurl.cc/XXeYaE)

{%hackmd @ntnucsie112/DSA_ABB %}
{%hackmd @sophie8909/pink_theme %}

###### tags: `NTNU` `CSIE` `必修` `Algorithm`

## Video
<iframe width="560" height="315" src="https://www.youtube.com/embed/videoseries?list=PLcFF2oPoIi7LMV5fnrFW2AXUzUKlSyA_g" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

## [Calendar](https://calendar.google.com/calendar?cid=ZHA5MTV1bHRtbjhtYWFhNmxsYWNtOGx1NG9AZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ)
<iframe src="https://calendar.google.com/calendar/embed?src=dp915ultmn8maaa6llacm8lu4o%40group.calendar.google.com&ctz=Asia%2FTaipei" style="border: 0" width="600" height="400" frameborder="0" scrolling="no"></iframe>

## Score
- Midterm Exam (04/21 Tue.) 25%
- Final Exam (06/16 Tue.) 30%
- Attendance 5%
- Assignment 40%
    - Handwritten 30%
    - pA 40%
    - pB 30%
- [Normal OJ](https://noj.tw/)
    - HW#1 due: 2020/03/27 00:00:00
    - HW#2 due: 2020/04/11 00:00:00
    - HW#3 due: 2020/05/01 00:00:00

## Private Note
- [Midterm](https://hackmd.io/@SK11/H1DOBTtuI)

---

## Ch. 1 Overview
### What is an algorithm?
- A sequence of computational steps that transform the input into the output
- A tool for solving a well-specified computational problem
- A procedure for solving a mathematical problem in a finite number of steps that frequently **involves repetition of an operation**
- A step-by-step procedure for solving a problem or accomplishing some end, especially **by a computer**

### Requirements
- **Finite**: terminates after a finite number of steps
- **Definite**: each step is unambiguously specified
- **Input**: valid inputs are clearly specified
- **Output**: can be proved to produce the correct output given a valid input
- **Effective**: each step is sufficiently basic to be carried out exactly

### Computational thinking
- “Computational Thinking is the thought processes involved in formulating problems and their solutions so that the solutions are represented in a form that _**can be effectively carried out by computers.**_” -- Jeannette Wing

---

## Ch. 3 Growth of Functions
### What is a good algorithm?
- Correct
- Time efficient
- Space efficient
- Simple
- General

### Algorithm design and analysis process
![](https://i.imgur.com/yb2HY5e.png)

### Analysis Framework
- Run time vs. input size
    - Run time = $f(\text{input size})$
- Count basic operations (the operations that cost the most time)
- Order of growth
    - E.g. $\log n,\ n,\ n\log n,\ n^2,\ n^3,\ 2^n,\ n!$
- Worst-case, best-case, average-case
    - Note: the average case is **not** simply the average of the worst and best cases; it depends on the input distribution.
- Time (space) efficiency is measured as a function of the algorithm's input size.
- Time efficiency is measured by counting the number of times the basic operation is executed.
- Need to distinguish between the worst-case, average-case, and best-case efficiencies.
- We care about the run time as the input size grows toward $\infty$ (asymptotic behavior).

### 3.1 Asymptotic Notations
#### $O$-notation
- Upper bound ($\geqslant$)
> $O(g(n))$ is the set of functions with a **smaller or same** order of growth as $g(n)$.
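To build intuition for the definition above, the $O$-notation bound can be checked numerically for concrete functions. A minimal sketch; the helper `is_big_o` and its witness constants are illustrative, not part of the course material:

```python
# Empirically check the big-O definition: f(n) <= c*g(n) for all n >= n0
# (over a finite test range; the real definition quantifies over all n >= n0).
def is_big_o(f, g, c, n0, n_max=10_000):
    """Return True if f(n) <= c*g(n) holds for every n in [n0, n_max]."""
    return all(f(n) <= c * g(n) for n in range(n0, n_max + 1))

# 3n + 5 is O(n): the constants c = 4, n0 = 5 witness the bound.
print(is_big_o(lambda n: 3 * n + 5, lambda n: n, c=4, n0=5))   # True
# n^2 is not O(n): even a generous constant fails once n is large enough.
print(is_big_o(lambda n: n * n, lambda n: n, c=100, n0=1))     # False
```

A passing check over a finite range does not prove the bound, but a failing one disproves the chosen witnesses, which makes this a handy sanity check when guessing orders of growth.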
#### $o$-notation
- Strict upper bound ($>$)
> $o(g(n))$ is the set of functions with a strictly **smaller** order of growth than $g(n)$.

#### $\Omega$-notation
- Lower bound ($\leqslant$)
> $\Omega(g(n))$ is the set of functions with a **larger or same** order of growth as $g(n)$.

#### $\omega$-notation
- Strict lower bound ($<$)
> $\omega(g(n))$ is the set of functions with a strictly **larger** order of growth than $g(n)$.

#### $\Theta$-notation ($=$)
> $\Theta(g(n))$ is the set of functions that have the **same** order of growth as $g(n)$.

---

## Ch. 4 Divide and Conquer
### 4.3-4.5 _**Solving Recurrences**_
#### The substitution method
- Forward substitution
    - e.g.
        - Recurrence
            - $x(n) = 2x(n-1) + 1$ for $n > 1$
            - $x(1) = 1$
        - Solution
            - $x(1) = 1$
            - $x(2) = 2*$<font color=#c04851>$1$</font>$+ 1 = 3$
            - $x(3) = 2*$<font color=#c04851>$3$</font>$+ 1 = 7$
            - $x(4) = 2*$<font color=#c04851>$7$</font>$+ 1 = 15$
            - $x(n) = 2^n - 1 = \Theta(2^n)$
- Backward substitution
    - e.g.
        - Recurrence
            - $x(n)=x(n-1)+n$
            - $x(1)=1$
        - Solution
            - $x(n)=x(n-1)+n$
            $=[x(n-2)+n-1]+n$
            $=[x(n-3)+n-2]+(n-1)+n$
            $=x(n-i) + (n-i+1) + (n-i+2) + \dots + n$
            - $x(n)=1+2+3+\dots+n=n(n+1)/2=\Theta(n^2)$

#### The recursion tree method
- e.g.
    - Recursion tree
        - Put the non-recursive cost at the root node and the recursive terms $T(\cdot)$ as its children.
        ![](https://i.imgur.com/pkAqrS2.png)

#### The master method
- $T(n) = aT(\frac{n}{b})+f(n)$, where $a\geq 1$, $b>1$, and $f$ is asymptotically positive.
- <font color=#c04851>Compare $f(n)$ vs $n^{\log_ba}$</font>
- Case 1: $f(n)$ **<** $n^{\log_ba}$
> <font color=#c04851>$T(n) = \Theta(n^{\log_b a})$</font>
- Case 2: $f(n)$ **=** $n^{\log_ba}$
> <font color=#c04851>$T(n) = \Theta(n^{\log_b a}\lg n)$</font>
- Case 3: $f(n)$ **>** $n^{\log_ba}$
> <font color=#c04851>$T(n) = \Theta(f(n))$</font>
- In Cases 1 and 3, $f(n)$ must be **<font color=#c04851>polynomially</font>** smaller/larger than $n^{\log_ba}$.
- Case 3 also requires the **regularity condition**
> $af(\frac{n}{b}) \leqslant cf(n)$ for some constant $c <1$ and all sufficiently large $n$
- Idea of the master theorem
![](https://i.imgur.com/1c6cKWH.png)
- e.g.
    - $T(n) = aT(\frac{n}{b}) +f(n)$
    > $T(n) = 4T(\frac{n}{2}) + n$
    > $a = 4, b = 2 \Rightarrow n^{\log_ba} = n^2$
    > $f(n) = n \Rightarrow f(n) =O(n^{2 - \epsilon})$ for $\epsilon = 1$
    - Case 1: $T(n) = \Theta(n^2)$

### 4.1-4.2 _**Divide and Conquer**_
#### General plan
1. **Divide**
2. **Solve**
3. **Combine**

#### Typical case
```graphviz
digraph hierarchy {
"problem of size n"->{"subproblem1 of size n/2" "subproblem2 of size n/2"}
"subproblem1 of size n/2" -> "solution of subproblem1"
"subproblem2 of size n/2" -> "solution of subproblem2"
{"solution of subproblem1" "solution of subproblem2"} -> "solution of origin problem"
}
```

#### Example
##### Binary Search 二元搜尋
1. Divide: Check the middle element.
2. Conquer: Recursively search one sub-array.
    - Recurrence: $T(n) = T(n/2) + \Theta(1)$
    - $n^{\log_b a}= n^{\log_21} = n^{0} = 1 \Rightarrow$ by the master method, Case 2, $T(n) = \Theta(\lg n)$
3. Combine: Trivial.

##### Fast Power 快速冪
- Naïve algorithm: $\Theta(n)$
- Divide and Conquer
    - \begin{aligned} a^n = \left\{\begin{array}{ll} a^{\frac{n}{2}}\cdot a^{\frac{n}{2}}, & \mbox{if $n$ is even} \\ a^{\frac{n-1}{2}}\cdot a^{\frac{n-1}{2}}\cdot a, & \mbox{if $n$ is odd} \\ \end{array} \right. \end{aligned}
    - Recurrence: $T(n) = T(n/2) + \Theta(1)$
    - $n^{\log_b a}= n^{\log_21} = n^{0} = 1\Rightarrow$ by [the master method, Case 2](#The-master-method), $T(n) = \Theta(\lg n)$

##### Matrix multiplication 矩陣乘法
- Naïve algorithm: $\Theta(n^3)$
- Divide and Conquer
    - $n\times n$ matrix $= 2 \times 2$ matrix of $\frac{n}{2}\times\frac{n}{2}$ sub-matrices
    ![](https://i.imgur.com/GuOYu51.png)
    - $T(n) = 8T(n/2) + \Theta(n^2)$
    - $n^{\log_b a}= n^{\log_28} = n^{3} \Rightarrow$ by [the master method, Case 1](#The-master-method) $(n^{\log_b a}>f(n))$
    - $T(n) = \Theta(n^3)$
- Strassen's algorithm (Divide and Conquer)
    - Multiply $2 \times 2$ matrices with only **<font color = #c04851>7</font>** recursive mults
    ![](https://i.imgur.com/XGc46Uv.png)
    1. Divide: Partition A and B into $\frac{n}{2}\times\frac{n}{2}$ sub-matrices.
    Form terms to be multiplied using + and -.
    2. Conquer: Perform 7 multiplications of $\frac{n}{2}\times\frac{n}{2}$ sub-matrices recursively.
    3. Combine: Form C using + and - on $\frac{n}{2}\times\frac{n}{2}$ sub-matrices.
    - $T(n) = 7T(\frac{n}{2}) + \Theta(n^2)$
    - $n^{\log_ba} = n^{\log_27} \approx n^{2.81} \Rightarrow$ by [the master method, Case 1](#The-master-method), $T(n) = \Theta(n^{2.81})$

##### Closest-Pair Problem
> Find the two closest points in a set of $n$ points using $d(P_i,P_j) = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2 }$

- Brute-Force Algorithm
    - $C^n_2$ pairs
    - $C(n) = \sum_{i = 1}^{n - 1}\sum_{j = i + 1}^{n}1 = \frac{(n-1)n}{2} \in \Theta(n^2)$
- Divide and Conquer
    1. Divide: Bisect the point set into sets $P_L$ and $P_R$ of equal size.
    2. Conquer: Make two recursive calls, one to find the closest pair in $P_L$ and the other to find the closest pair in $P_R$. Let $d = \min(d_L , d_R)$.
        - The strip around the dividing line can be scanned in $O(n)$ using pre-sorted lists.
        - At most 1 point can reside in each $d/2\times d/2$ square, so each point in the strip is checked against at most the following 7 points.
        - $T(n) = 2T(n/2)+O(n)$
        - $T(n) = O(n\log n) < O(n^2)$, by [the master method, Case 2](#The-master-method)
    3. Combine: The answer is either $d$ or a closer pair of points with one point in $P_L$ and the other in $P_R$.

##### Convex-Hull Problem
- Brute-Force Algorithm: $O(n^3)$
- Divide and Conquer
    1. Split into halves
    2. 
    Find $P_{max}$, the point farthest from $\overline{P_1P_n}$, and recurse on the two outer subsets.
    $T(n) = 2T(n/2) + O(n) = O(n\lg n)$ on average

## Ch. 7 Sorting
### Brute-force algorithms ==$O(n^2)$==
{%hackmd @SK11/selection_sort %}
{%hackmd @SK11/bubble_sort %}
{%hackmd @SK11/insertion_sort %}

### Comparison-based algorithms
{%hackmd @SK11/merge_sort %}
{%hackmd @SK11/quick_sort %}
{%hackmd @SK11/randomized_quick_sort %}
- **Heap Sort** (Ch. 6)
    - Build a max-heap, then repeatedly extract the maximum.
    - Time complexity: $O(n\log n)$

### Linear-time algorithms
{%hackmd @SK11/counting_sort %}
{%hackmd @SK11/radix_sort %}
- Bucket sort

### Some Definitions
#### Indicator random variable
- An indicator random variable associated with event A is defined as
![](https://i.imgur.com/ztIULu4.png =400x)
- Example: flip a fair coin and see a head
![](https://i.imgur.com/mLn2h8x.png =400x)

#### Decision tree
![](https://i.imgur.com/Cw7JFYc.png =x270)
- Each leaf contains a permutation.
- ==#comparisons of the algorithm = the length of the path taken==
- Worst-case running time = height of tree

#### Stable sorting
- Preserves the input order among equal elements
- How can quick sort be made stable?

#### Lower bound for comparison sorts
> Any comparison sort algorithm requires $\Omega(n\log n)$ comparisons in the worst case.

pf. Consider a decision tree of height $h$ with $L$ leaves.
$n!\leq L \leq 2^h$
$h \geq\log (n!) = \Omega(n\log n)$

##### $\log (n!)=\Omega(n\log n)$
1. Informally, $\log(n!) > \log((\frac{n}{2})^{\frac{n}{2}}) = \frac{n}{2}\log(\frac{n}{2}) = \Omega(n\log n)$
2. Stirling's Approximation: $n!>(\frac{n}{e})^n$
$\log (n!)>\log(\frac{n}{e})^n = n(\log n-\log e) = \Theta(n\log n)$

{%hackmd @SK11/sort_compare %}

## Ch. 9 The Selection Problem
### Selection
> **Input**: A set $A$ of $n$ distinct numbers and an integer $i$, with $1 \leq i \leq n$
>
> **Output**: The element $x \in A$ that is larger than exactly $i-1$ other elements of $A$.
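The selection problem above can be solved in expected linear time with randomized divide and conquer (quickselect, in the spirit of Ch. 9). A minimal sketch, assuming distinct inputs as in the problem statement:

```python
import random

def select(A, i):
    """Return the i-th smallest (1-indexed) element of the distinct
    numbers in A, i.e. the element larger than exactly i-1 others."""
    pivot = random.choice(A)
    smaller = [x for x in A if x < pivot]   # elements beaten by the pivot
    larger = [x for x in A if x > pivot]    # elements beating the pivot
    if i <= len(smaller):
        return select(smaller, i)           # answer lies left of the pivot
    if i == len(smaller) + 1:
        return pivot                        # pivot has exactly i-1 smaller elements
    return select(larger, i - len(smaller) - 1)  # answer lies right of the pivot

print(select([7, 1, 5, 9, 3], 2))  # 3, the second smallest
```

Only one recursive call is made per level, so the expected running time is $O(n)$, unlike quicksort's $O(n \log n)$; the worst case is still $O(n^2)$ unless the median-of-medians pivot rule is used.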
## Dynamic Programming

## Greedy Algorithms
### The Activity-Selection Problem
> Given the time table of activities, select a maximum-size subset of mutually compatible activities.

#### Dynamic Programming
![](https://i.imgur.com/8y0Be2J.png)
- $S_{ij}$: the set of activities that start after $a_i$ finishes and finish before $a_j$ starts
- $c[i,j]$: the size of an optimal solution for $S_{ij}$
- Recurrence: $c[i, j]=\max\limits_{a_k \in S_{ij}}\{c[i,k]+c[k,j]+1\}$, and $c[i,j]=0$ if $S_{ij}=\emptyset$

#### Greedy
- Choose the activity with the earliest finish time.
- Once we make the greedy choice, only one sub-problem remains.
```c=
GREEDY-ACTIVITY-SELECTOR(s, f)
    n = s.length
    A = {a1}
    k = 1
    for m = 2 to n
        if s[m] ≥ f[k]
            A = A ∪ {am}
            k = m
    return A
```

#### Huffman code
$\begin{aligned}&cost(T)-cost(T'')\\ =&\sum_{c\in C}c.freq\cdot d_T(c) - \sum_{c\in C}c.freq\cdot d_{T''}(c)\\ =&(a.freq - x.freq)(d_T(a)-d_T(x))+(b.freq - y.freq)(d_T(b) - d_T(y))\geq 0\end{aligned}$

## Graph Algorithms
### Directed acyclic graph (DAG)
DAG: a directed graph that does not contain any directed cycle.
#### Topological sort

### Strongly connected components
#### A linear $\Theta(V+E)$-time algorithm
1. Call DFS(G) to compute the finishing time $u.f$ for each vertex $u$
2. Compute $G^T$ (the transpose of $G$)
3. Call DFS($G^T$), considering the vertices in order of decreasing $u.f$
4. Output the components found in step 3

### Minimum Spanning Tree
> connects all of the vertices

- Input:
    - A connected, undirected, weighted graph $G = (V, E)$
- Output:
    - An acyclic subset $T \subseteq E$ that connects all of the vertices and whose total weight is minimized.
#### Two commonly used algorithms
- Kruskal's algorithm
- Prim's algorithm

**Both run in $O(E \log V)$; both greedy methods yield an optimal answer.**

##### Kruskal's algorithm
```c=
MST-KRUSKAL(G, w)
    A = Ø
    for each v ∈ G.V
        MAKE-SET(v)
    sort the edges of G.E into non-decreasing order by weight w
    for each edge (u, v) ∈ G.E, taken in non-decreasing order by w
        if FIND-SET(u) ≠ FIND-SET(v)   // check if u and v are in different sets
            A = A ∪ {(u, v)}           // add the edge
            UNION(u, v)                // merge the disjoint sets containing u and v
    return A
```

**Simplified**
```pseudo=
weight of each edge = W_i
sort edges by weight
for W_0 to W_n:
    if the edge with weight W_i would form a cycle:
        skip it
    else:
        add it
```

**Operations**
- MAKE-SET(v): creates a one-element set {v}
- FIND-SET(u): returns the subset containing u
- UNION(u, v): constructs the union of the disjoint sets U and V containing u and v

## Ch. 24 Single-Source Shortest Paths
### 24.1 The Bellman-Ford Algorithm

### 24.3 Dijkstra's Algorithm
```c=
DIJKSTRA(G, w, s)
    INITIALIZE-SINGLE-SOURCE(G, s)   // initialize d and π values
    S = Ø
    Q = G.V
    while Q ≠ Ø
        u = EXTRACT-MIN(Q)
        S = S ∪ {u}
        for each v in G.Adj[u]
            RELAX(u, v, w)
```

## Ch. 25 All-Pairs Shortest Paths
### 25.1 Shortest paths and matrix multiplication
- DP method (1)
> $T(n) = O(n^3\log n)$

Define $l_{ij}^{(m)}$: the minimum weight of any path from vertex $i$ to vertex $j$ that contains at most $m$ edges.
The initial condition:
$$
l_{ij}^{(0)} =
\begin{cases}
0 &, \text{if } i = j\\
\infty &, \text{if } i \neq j\\
\end{cases}
$$
$$l_{ij}^{(1)} = w_{ij} $$
Recurrence:
$$
l_{ij}^{(m)} = \min\left(l_{ij}^{(m-1)},\ \min_{1\leq k\leq n}\left(l_{ik}^{(m-1)} + w_{kj}\right)\right)
$$

### 25.2 The Floyd-Warshall algorithm
- DP method (2)

$T(n) = O(n^3\log n) \rightarrow O(n^3)$
$d_{ij}^{(k)}$: the weight of a shortest path from vertex $i$ to vertex $j$ for which all intermediate vertices are in the set $\{1, 2, \dots, k\}$ (the vertices the path may pass through, not the number of vertices)
The initial condition:
$$ d_{ij}^{(0)} = w_{ij} $$
Recurrence:
$$
d_{ij}^{(k)} = \min\left(d_{ij}^{(k-1)},\ d_{ik}^{(k-1)} + d_{kj}^{(k-1)}\right)
$$

### 25.3 Johnson's Algorithm for sparse graphs
- Run Dijkstra once for each vertex $\to O(n^3)$ with a simple Dijkstra implementation
- **Reweighting** makes all edge weights non-negative so Dijkstra applies
    - Assign a weight $h(v)$ to each vertex $v$ of $G$
    - Let $w'(u,v) = w(u,v) + h(u) - h(v)$
    - Let $h(v)$ be the shortest-path weight from a new source vertex (connected to every vertex by a weight-0 edge) to $v$
    - Computed with Bellman-Ford

## Ch. 26 Maximum Flow
![](https://i.imgur.com/yF89XWO.png)
- source: s
- sink: t

**Capacity Constraint**
![](https://i.imgur.com/b6jDMYC.png)
**Flow Conservation**
![](https://i.imgur.com/9i3sFOp.png)

### Ford-Fulkerson method
1. Initialize flow $f$ to 0
2. While there exists an augmenting path $p$ in the residual network $G_f$, augment flow $f$ along $p$
3. Return $f$

### Correctness of the method
$f$ is not maximum for $G$ if and only if reach($G_f$, s, t) = yes, i.e. an augmenting path still exists.

### A cut of a flow network
The capacity of the cut $c(S, T) = \sum_{u\in S}\sum_{v\in T} c(u, v)$
The source $s$ is in partition $S$; the sink $t$ is in $T$.

## Theory of P, NP, NP-Complete
Polynomial vs. Exponential
- Tractable problem
> Can be solved by a polynomial-time algorithm
> [color=#061EB9]
- The halting problem
> Given a program and an input, determine whether the program will eventually halt.
> [color=#061EB9]
- Solving Diophantine equations
> Given a polynomial equation, is there a solution in integers?
> $$\text{Example: } 4xy^2+ 2xy^2z^3 - 11x^3y^2z^2 = -1164$$
> [color=#061EB9]

### P
**Problems that can be solved in polynomial time**
- Examples: sorting, finding shortest paths, ….

### NP
**Problems that can be verified in polynomial time**
- Equivalently: problems that can be solved in non-deterministic polynomial time

$P\subseteq NP$
$P=NP?$ (open problem)

### Reducibility
$D_1 \leq_P D_2$: problem $D_1$ is polynomial-time reducible to $D_2$, i.e. any instance of $D_1$ can be transformed into an instance of $D_2$ in polynomial time.

<!-- ## Advance Topic -->
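The phrase "verified in polynomial time" can be made concrete: for a problem in NP, a proposed solution (a certificate) can be checked quickly even when finding one may take exponential time. A minimal sketch using SUBSET-SUM, a classic NP-complete problem; the function name and instance are illustrative, not from the lecture:

```python
from collections import Counter

def verify_subset_sum(numbers, target, certificate):
    """Polynomial-time verifier: check that `certificate` is a
    sub-multiset of `numbers` whose elements sum to `target`."""
    available = Counter(numbers)
    used = Counter(certificate)
    # every certificate element must actually come from the input multiset
    if any(used[x] > available[x] for x in used):
        return False
    # the certificate must hit the target sum
    return sum(certificate) == target

print(verify_subset_sum([3, 34, 4, 12, 5, 2], 9, [4, 5]))  # True
print(verify_subset_sum([3, 34, 4, 12, 5, 2], 9, [3, 4]))  # False
```

The verifier runs in $O(n)$ time, while no polynomial-time algorithm is known for *finding* such a certificate in general; this gap between checking and finding is exactly the $P = NP$ question above.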