<style> .reveal { font-size: 30px; } .reveal strong { color: #E12315; } .reveal h1 { font-size: 55px; } </style>

# Notes for CLRS (Part I - Foundations)

### Cory Chu (2022)

[\[Return to Table of Content\]](https://hackmd.io/@c0rychu/dsa/)

**Reference**: Introduction to Algorithms by Thomas H. **C**ormen, Charles E. **L**eiserson, Ronald L. **R**ivest, and Clifford **S**tein.

---

## Ch1 - The Role of Algorithms in Computing

----

Muḥammad ibn Mūsā al-Khwārizmī or al-Khwarizmi was a Persian polymath from Khwarazm -- Wikipedia

According to Knuth, the word “algorithm” is derived from the name “al-Khowârizmî,” a ninth-century Persian mathematician. -- CLRS

----

### Correct Algorithm (P.29)

- An algorithm for a computational problem is **correct** if, for every problem instance provided as input, it **halts** (finishes its computation in finite time) and **outputs the correct solution** to that problem instance.
- An incorrect algorithm might not halt at all on some input instances, or it might halt with an incorrect answer.
- Contrary to what you might expect, *incorrect algorithms can sometimes be useful, if you can control their error rate.*

---

## Ch2 - Getting Started

----

### 2.1 - Insertion Sort

```
INSERTION-SORT(A, n)                    // output from small to large
    for i = 2 to n                      // subarray A[1:i-1] is sorted
        pickup = A[i]
        j = i-1
        // move elements in the subarray to the right,
        // making space for inserting the pickup
        while j>0 and A[j] > pickup :
            A[j+1] = A[j]
            j = j-1
        A[j+1] = pickup                 // insert the pickup
```

- Prove the correctness of insertion sort with a **Loop Invariant** (essentially *mathematical induction*)
    - **Loop Invariant**: `A[1:i-1]` is sorted
    - Initialization: holds for i = 2 (`A[1:1]` is a single element, hence sorted)
    - Maintenance: i = k $\rightarrow$ i = k+1
    - Termination: the loop exits with i = n+1, so the invariant says `A[1:n]` is sorted (unlike MI, the induction stops here)

----

### 2.2 - Analyzing algorithms

- **Random-Access Machine (RAM)** model
    - In the RAM model, each instruction (`+ - * / =` ...) or data access takes a constant amount of time
- Gray Area in the RAM model
    - Is `x^n` a constant-time instruction?
        - Generally, it takes $O(\log n)$ multiplications (eq. 31.34 / page 1205)
        - For $x=2,\ n\in\mathbb{N}$: `x^n == x << (n-1)` is a single shift, i.e., $O(1)$
    - We simply treat all of them as $O(1)$ in the RAM model

----

### 2.2 - Analyzing algorithms

- Running time depends on the input
- Input size:
    - number of
        - items (sorting)
        - bits (multiplying integers)
        - vertices/edges (graph problems)
- The running time of an algorithm on a particular input is the number of instructions and data accesses executed (RAM model)
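----

### Insertion Sort in Python (a sketch)

A small runnable Python version of the pseudocode above, as a sanity check before the cost analysis on the next slide. It is 0-indexed (unlike the 1-indexed pseudocode), and the returned `shifts` counter, which counts executions of the `A[j+1] = A[j]` line, is an illustrative addition rather than anything in CLRS.

```python
def insertion_sort(A):
    """Sort A in place (ascending); return the number of element shifts performed."""
    shifts = 0
    for i in range(1, len(A)):           # pseudocode: for i = 2 to n
        pickup = A[i]
        j = i - 1
        # shift larger elements of the sorted prefix one slot to the right
        while j >= 0 and A[j] > pickup:
            A[j + 1] = A[j]
            j -= 1
            shifts += 1
        A[j + 1] = pickup                # insert the pickup
    return shifts

print(insertion_sort(list(range(100))))         # already sorted: 0 shifts (best case)
print(insertion_sort(list(range(100, 0, -1))))  # reverse sorted: 4950 = 100*99/2 shifts (worst case)
```

The two calls match the best and worst cases analyzed on the following slides: the shift count is $0$ for a sorted input and $\binom{n}{2}$ for a reversed one.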
----

### Analysis of insertion sort

```=
INSERTION-SORT(A, n)                   # cost   times
    for i = 2 to n                     # c1     n
        pickup = A[i]                  # c2     n-1
        j = i-1                        # c3     n-1
        while j>0 and A[j] > pickup :  # c4     sum_{i=2}^{n} m_i
            A[j+1] = A[j]              # c5     sum_{i=2}^{n} (m_i - 1)
            j = j-1                    # c6     sum_{i=2}^{n} (m_i - 1)
        A[j+1] = pickup                # c7     n-1
```

$$
\begin{aligned}
T(n) &= c_1 n + [c_2+c_3+c_7] (n-1) \\
&+ c_4 \sum_{i=2}^n m_i + [c_5+c_6] \sum_{i=2}^n (m_i - 1)
\end{aligned}
$$

- $m_i$ is the number of times the `while` loop test is executed for a given $i$, i.e., the number of times the loop body runs, plus 1
- $m_i=1$ for the best case (A is already sorted)
- $m_i=i$ for the worst case (A is reverse ordered)

----

### Analysis of insertion sort (**best** case)

$$
\begin{aligned}
T(n) &= c_1 n + [c_2+c_3+c_7] (n-1) \\
&+ c_4 \sum_{i=2}^n m_i + [c_5+c_6] \sum_{i=2}^n (m_i - 1)
\end{aligned}
$$

- $m_i=1$ for the best case (A is already sorted)

$c_4 \sum_{i=2}^n m_i = c_4 (n-1)$ and $[c_5+c_6] \sum_{i=2}^n (m_i - 1) = 0$

$$
\begin{aligned}
T(n) &= c_1 n + [c_2+c_3+c_7+c_4] (n-1) \\
&= a n + b
\end{aligned}
$$

- **$\Theta(n)$** (Pronounced "theta of n" or "theta n"), i.e., Linear Time

----

### Analysis of insertion sort (**worst** case)

$$
\begin{aligned}
T(n) &= c_1 n + [c_2+c_3+c_7] (n-1) \\
&+ c_4 \sum_{i=2}^n m_i + [c_5+c_6] \sum_{i=2}^n (m_i - 1)
\end{aligned}
$$

- $m_i=i$ for the worst case (A is reverse ordered)

$\sum_{i=2}^n m_i = \frac{(2+n)(n-1)}{2}=\frac{n^2+n-2}{2}$

$\sum_{i=2}^n (m_i-1) = \frac{n^2+n-2}{2}-(n-1)=\frac{n^2-n}{2}$

$T(n)=cn^2+dn+e$

- **$\Theta(n^2)$** (Pronounced "theta of n-squared" or "theta n-squared"), i.e., Quadratic Time

----

Due to constant factors and lower-order terms, an algorithm whose running time has a higher order of growth might take less time for small inputs than an algorithm whose running time has a lower order of growth.

----

### 2.3 - Designing algorithms

- Incremental method: e.g., Insertion Sort
- Divide and Conquer: e.g., Merge Sort

```python
def MergeSort(A, left, right):
    """Sort A[left:right+1] in place."""
    if left >= right:                # zero or one element: already sorted
        return
    mid = (left + right) // 2        # floor((left + right) / 2)
    MergeSort(A, left, mid)
    MergeSort(A, mid+1, right)
    Merge(A, left, mid, right)       # merge the two sorted halves (sketched on the next slide)
```
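----

### 2.3 - A possible `Merge` (sketch)

The `Merge` procedure is not spelled out above. Below is one possible Python sketch that works with the `MergeSort` driver on the previous slide; it copies the two sorted halves and merges them back in order. The exact interface (0-indexed, inclusive bounds) is an assumption here; CLRS's MERGE is built on the same idea.

```python
def Merge(A, left, mid, right):
    """Merge the sorted runs A[left:mid+1] and A[mid+1:right+1] back into A."""
    L = A[left:mid + 1]              # copy of the left sorted half
    R = A[mid + 1:right + 1]         # copy of the right sorted half
    i = j = 0
    k = left
    while i < len(L) and j < len(R):
        if L[i] <= R[j]:             # "<=" keeps equal keys in order (stable)
            A[k] = L[i]
            i += 1
        else:
            A[k] = R[j]
            j += 1
        k += 1
    A[k:right + 1] = L[i:] + R[j:]   # append whichever half still has leftovers

A = [5, 2, 4, 7, 1, 3, 2, 6]
MergeSort(A, 0, len(A) - 1)
print(A)                             # [1, 2, 2, 3, 4, 5, 6, 7]
```

Each call to `Merge` does work proportional to the number of elements merged, which is what gives merge sort its $\Theta(n \lg n)$ total running time.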
---

## Ch3 - Characterizing Running Times

----

### 3.1 - $O$, $\Omega$, and $\Theta$-notation

- Popularized by Aho, Hopcroft, and Ullman [1], the $O$, $\Omega$, and $\Theta$-notations have become the standard way to describe the **asymptotic** behavior of algorithms.
- They are general notations that can be used for **Time Complexity**, **Space Complexity**, ...
- **$O(f(n))$** is an asymptotic **upper bound**
    - e.g., $5n^3+3n$ is $O(n^3)$, $O(n^4)$, $O(n^5)$, ...
- **$\Omega(f(n))$** is an asymptotic **lower bound**
    - e.g., $5n^3+3n$ is $\Omega(n^3)$, $\Omega(n^2)$, $\Omega(n)$, ...
- **$\Theta(f(n))$** is an asymptotically **tight bound**
    - e.g., $5n^3+3n$ is $\Theta(n^3)$
- As a result, if you can show that a function is both $O(n^3)$ and $\Omega(n^3)$, then it must be $\Theta(n^3)$.

[1] Alfred V. Aho, John E. Hopcroft, and Jeffrey D. Ullman. *The Design and Analysis of Computer Algorithms*, Addison-Wesley, 1974.

----

### Revisit Insertion Sort (step 1: **$O(n^2)$**)

```python=
INSERTION-SORT(A, n)
    for i = 2 to n
        pickup = A[i]
        j = i-1
        while j>0 and A[j] > pickup :
            A[j+1] = A[j]   # Line 6 - Move element by one position
            j = j-1
        A[j+1] = pickup
```

- The outer `for` loop runs **`n-1`** times
- The inner `while` loop runs between `0` (best) and **`i-1` (worst)** times, depending on the input array `A`
    - `i <= n`
- The total number of inner-loop iterations is therefore **at most** $(n-1)(n-1)$, which is **$O(n^2)$**

----

### Revisit Insertion Sort (step 2: **$\Omega(n^2)$**)

```python=
INSERTION-SORT(A, n)
    for i = 2 to n
        pickup = A[i]
        j = i-1
        while j>0 and A[j] > pickup :
            A[j+1] = A[j]   # Line 6 - Move element by one position
            j = j-1
        A[j+1] = pickup
```

- Assume $n \equiv 0 \pmod 3$, so we can divide `A` into three pieces: `A[1:n/3]`, `A[n/3+1:2n/3]`, `A[2n/3+1:n]`.
- Assume the $\frac{n}{3}$ numbers in the first piece `A[1:n/3]` are all greater than the rest of the input array `A`.
- After sorting, all of these $\frac{n}{3}$ numbers must end up somewhere in the third piece `A[2n/3+1:n]`, moved there by Line 6 in the code above.
- Each of these numbers is moved at least $\frac{n}{3}$ times just to cross the middle piece `A[n/3+1:2n/3]`.
- So Line 6 is executed **at least** $\frac{n}{3}\cdot\frac{n}{3}=\frac{n^2}{9}$ times.
- For this **particular case**, the running time is **$\Omega(n^2)$**

----

### Revisit Insertion Sort (step 3: **$\Theta(n^2)$**)

```python=
INSERTION-SORT(A, n)
    for i = 2 to n
        pickup = A[i]
        j = i-1
        while j>0 and A[j] > pickup :
            A[j+1] = A[j]   # Line 6 - Move element by one position
            j = j-1
        A[j+1] = pickup
```

- Step 1: $O(n^2)$ in all cases
- Step 2: $\Omega(n^2)$ in a particular case
- Step 3: the worst case is at least as slow as that particular case, so the worst-case running time is both $O(n^2)$ and $\Omega(n^2)$; hence Insertion Sort is $\Theta(n^2)$ in the worst case.

----

### Remarks <small>(Page 95)</small>

- ✅ Insertion sort's best-case running time is $O(n)$, $\Omega(n)$, and $\Theta(n)$
- ✅ Insertion sort's worst-case running time is $O(n^2)$, $\Omega(n^2)$, and $\Theta(n^2)$
- ❌ Insertion sort's running time is $\Theta(n^2)$
    - Because in the best case it is $\Theta(n)$
- ✅ **Insertion sort's running time is $O(n^2)$**
- ✅ Insertion sort's running time is $\Omega(n)$
- ✅ Merge sort runs in $\Theta(n \lg n)$ time (in all cases)

----

### **Amortized**

- Discussed in detail in Ch. 16
- "In an amortized analysis, you average the time required to perform a sequence of data-structure operations over all the operations performed."
- "Amortized analysis differs from average-case analysis in that probability is not involved. An amortized analysis guarantees the *average performance of each operation in the worst case*."

----

### Misc.

- Formally, $f(n) = O(g(n))$ means $f(n) \in O(g(n))$, where $O(g(n))$ is a **set** of functions.
- There are also $o$ and $\omega$ notations

$f(n) = O(g(n))$ is like $f \le g$
$f(n) = \Omega(g(n))$ is like $f \ge g$
$f(n) = \Theta(g(n))$ is like $f = g$
$f(n) = o(g(n))$ is like $f < g$
$f(n) = \omega(g(n))$ is like $f > g$

----

### 3.3 - Standard notations and common functions

- Floor/Ceiling
    - `floor(1.7)` $= \lfloor 1.7 \rfloor = 1$
    - `ceil(1.7)` $= \lceil 1.7 \rceil = 2$
    - $\lceil x \rceil \ge x \ge \lfloor x \rfloor \quad \forall x \in \mathbb{R}$
- Modular arithmetic
    - `a%n` $= (a \bmod n) = a - n \lfloor \frac{a}{n}\rfloor$
    - $(a \bmod n) = (b \bmod n) \Rightarrow a \equiv b \pmod n$
      i.e., $a$ is equivalent to $b$, modulo $n$
- $\lg n = \log_2 n$
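----

### 3.3 - A quick check in Python

A purely illustrative check of the identities above, using only standard-library functions (`math.floor`, `math.ceil`, `math.log2`, and the built-in `%` operator):

```python
import math

print(math.floor(1.7), math.ceil(1.7))      # 1 2

# a mod n = a - n * floor(a/n); Python's % operator follows this
# "floor" definition, so the identity also holds for negative a
n = 5
for a in (17, -17):
    assert a % n == a - n * math.floor(a / n)
    print(a, "mod", n, "=", a % n)          # 17 mod 5 = 2, -17 mod 5 = 3

print(math.log2(1024))                      # lg 1024 = 10.0
```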
---

## Ch4 - Divide-and-Conquer

----

### Intro

- Divide-and-Conquer
    - **Base Case**: Solve it directly
    - **Recursive Case**: Divide it recursively until the recursion reaches a base case (*bottoms out*).

This involves three steps:
1. **Divide** the problem into subproblems
2. **Conquer** the subproblems *recursively*
3. **Combine** the subproblem solutions into a solution to the original problem

----

### Compared to Dynamic Programming (Ch14)

- In contrast to divide-and-conquer, dynamic programming applies when subproblems share subsubproblems.
- A dynamic-programming algorithm **solves each subsubproblem just once** and then **saves its answer in a table for looking up later**.

----

### Algorithmic recurrences

- A recurrence $T(n)$ is **algorithmic** if, for every sufficiently large **threshold** constant $n_0>0$:
    1. $\forall n<n_0$, we have $T(n) = \Theta(1)$.
    2. $\forall n \ge n_0$, every path of recursion terminates in a defined base case within a finite number of recursive invocations.
- An example of such a recurrence $T(n)$:
    - The running time of Strassen’s algorithm can be described by the recurrence: $$T(n) = 7 T(n/2)+\Theta(n^2)$$
    - The solution is $$T(n) = \Theta(n^{\lg 7})\approx\Theta(n^{2.81})$$

----

### 4.1 - Multiplying Square Matrices (directly)

```python
def Matrix_Multiply(A, B, C, n):
    # A, B, C are n x n matrices (e.g., NumPy arrays); accumulates the product A·B into C
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i,j] += A[i,k]*B[k,j]
```

- $\Theta(n^3)$ time complexity

----

### 4.1 - Multiplying Square Matrices (Simple Divide-and-Conquer)

- Assume $n$ is an exact power of 2.
- Divide the original $n \times n$ matrices ($A, B, C$) into $\frac{n}{2}\times\frac{n}{2}$ ones ($A_{11}, A_{12}, \ldots$) and expand the multiplication:

$$
\begin{align}
C &= A \cdot B \\
\Rightarrow
\begin{pmatrix}
C_{11} & C_{12} \\
C_{21} & C_{22}
\end{pmatrix}
&=
\begin{pmatrix}
A_{11} & A_{12} \\
A_{21} & A_{22}
\end{pmatrix}
\begin{pmatrix}
B_{11} & B_{12} \\
B_{21} & B_{22}
\end{pmatrix} \\
&=
\begin{pmatrix}
A_{11} \cdot B_{11} + A_{12} \cdot B_{21} & A_{11} \cdot B_{12} + A_{12} \cdot B_{22} \\
A_{21} \cdot B_{11} + A_{22} \cdot B_{21} & A_{21} \cdot B_{12} + A_{22} \cdot B_{22}
\end{pmatrix}
\end{align}
$$

----

### 4.1 - Multiplying Square Matrices (Simple Divide-and-Conquer)

```python
def Matrix_Multiply_Recursive(A, B, C, n):
    if n == 1:
        # Base Case (1-indexed pseudocode, following the book)
        C[1,1] += A[1,1]*B[1,1]
        return
    else:
        # Divide: Part() splits each matrix into its four n/2 x n/2 quadrants
        [A11, A12, A21, A22, B11, B12, B21, B22, C11, C12, C21, C22] = Part(A,B,C)
        # Conquer
        Matrix_Multiply_Recursive(A11, B11, C11, n/2)
        Matrix_Multiply_Recursive(A11, B12, C12, n/2)
        Matrix_Multiply_Recursive(A21, B11, C21, n/2)
        Matrix_Multiply_Recursive(A21, B12, C22, n/2)
        Matrix_Multiply_Recursive(A12, B21, C11, n/2)
        Matrix_Multiply_Recursive(A12, B22, C12, n/2)
        Matrix_Multiply_Recursive(A22, B21, C21, n/2)
        Matrix_Multiply_Recursive(A22, B22, C22, n/2)
```

- $T(n)=8T(n/2)+\Theta(1)$
- From the master method in Sec. 4.5, we can show it has the solution: $T(n) = \Theta(n^3)$
- Could we make it better than $\Theta(n^3)$?

----

### 4.2 - Strassen’s algorithm for matrix multiplication

- It is hard to imagine that any matrix multiplication algorithm could take less than $\Theta(n^3)$ time, since the natural definition of matrix multiplication requires $n^3$ scalar multiplications. Indeed, many mathematicians presumed that it was not possible to multiply matrices in $o(n^3)$ time until 1969, when V. Strassen [424] published a remarkable recursive algorithm for multiplying $n \times n$ matrices.
- The running time of Strassen’s algorithm is $\Theta(n^{\lg 7})\approx\Theta(n^{2.81})$
- [424] Volker Strassen. *Gaussian elimination is not optimal*, Numerische Mathematik, 14(3):354–356, 1969.

----

### 4.2 - Strassen’s algorithm for matrix multiplication (cont.)

- The basic idea is similar to $x^2-y^2=(x-y)(x+y)$: we trade 2 multiplications and 1 "addition" for 1 multiplication and 2 "additions".
- For scalar operations, the cost difference between a multiplication and an addition is not huge. For matrices, however, a multiplication is far more expensive than the $\Theta(n^2)$ cost of an addition.
- Instead of dividing the original $n \times n$ matrix multiplication into **8 multiplications of $\frac{n}{2}\times\frac{n}{2}$ matrices**, Strassen performs only **7 multiplications of $\frac{n}{2}\times\frac{n}{2}$ matrices**, at the cost of some **extra addition (subtraction) operations**.
- So, the recurrence for the running time of Strassen’s algorithm is $$T(n) = 7T(n/2)+\Theta(n^2)$$
- [Link: Wikipedia - Strassen’s Algorithm](https://en.wikipedia.org/wiki/Strassen_algorithm) (see also the sketch on the next slide)
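----

### 4.2 - Strassen’s algorithm (NumPy sketch)

For concreteness, here is one possible NumPy sketch of the recursion, following the seven products as given in the Wikipedia article linked above. The names `M1` to `M7`, the block splitting via slicing, and the final check against `A @ B` are choices made for this sketch rather than CLRS pseudocode, and $n$ is assumed to be an exact power of 2.

```python
import numpy as np

def strassen(A, B):
    """Multiply two n x n matrices, n an exact power of 2 (illustrative sketch)."""
    n = A.shape[0]
    if n == 1:                       # base case: 1x1 "matrices"
        return A * B
    h = n // 2                       # split into four n/2 x n/2 blocks
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]

    # 7 recursive multiplications instead of 8, plus extra additions/subtractions
    M1 = strassen(A11 + A22, B11 + B22)
    M2 = strassen(A21 + A22, B11)
    M3 = strassen(A11, B12 - B22)
    M4 = strassen(A22, B21 - B11)
    M5 = strassen(A11 + A12, B22)
    M6 = strassen(A21 - A11, B11 + B12)
    M7 = strassen(A12 - A22, B21 + B22)

    # recombine the blocks with additions only
    C11 = M1 + M4 - M5 + M7
    C12 = M3 + M5
    C21 = M2 + M4
    C22 = M1 - M2 + M3 + M6
    return np.block([[C11, C12], [C21, C22]])

# quick check against NumPy's own matrix product
A = np.random.rand(8, 8)
B = np.random.rand(8, 8)
assert np.allclose(strassen(A, B), A @ B)
```

The 7 recursive calls plus the $\Theta(n^2)$ block additions are exactly what the recurrence $T(n) = 7T(n/2)+\Theta(n^2)$ counts.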
----

### 4.3 - The substitution method for solving recurrences

----

### 4.4 - The recursion-tree method for solving recurrences

----

### 4.5 - The master method for solving recurrences

- The master method provides a “cookbook” method for solving algorithmic recurrences of the form: $$T(n) = aT(n/b)+f(n)$$
- A recurrence of this general form is called a **master recurrence**.
- $f(n)$ is the **driving function**
- The recurrence describes dividing a problem of size $n$ into $a$ subproblems, each of size $n/b<n$.
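----

### 4.5 - Reading off the earlier recurrences (sketch)

A sketch of how the cookbook is used on the two recurrences quoted earlier; the precise case conditions appear later in CLRS Section 4.5, and both examples fall into the case where $f(n)$ grows polynomially more slowly than $n^{\log_b a}$, which yields $T(n) = \Theta(n^{\log_b a})$.

- Simple divide-and-conquer matrix multiplication: $T(n) = 8T(n/2)+\Theta(1)$, so $a=8$, $b=2$, $n^{\log_b a}=n^{\lg 8}=n^3$, and $f(n)=\Theta(1)$; hence $T(n)=\Theta(n^3)$.
- Strassen’s algorithm: $T(n) = 7T(n/2)+\Theta(n^2)$, so $a=7$, $b=2$, $n^{\log_b a}=n^{\lg 7}\approx n^{2.81}$, and $f(n)=\Theta(n^2)$; hence $T(n)=\Theta(n^{\lg 7})$.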
{"metaMigratedAt":"2023-06-17T07:05:57.828Z","metaMigratedFrom":"YAML","title":"Notes for CLRS (Part I - Fundations)","breaks":true,"slideOptions":"{\"theme\":\"serif\"}","description":"[Return to Table of Content]","contributors":"[{\"id\":\"de7b5032-6489-4d89-b444-ed16c56171f8\",\"add\":16705,\"del\":1833,\"latestUpdatedAt\":1765919008188}]"}
    771 views