<style>
.reveal {
font-size: 30px;
}
.reveal strong {
color: #E12315;
}
.reveal h1 {
font-size: 55px;
}
</style>
# Notes for CLRS (Part I - Foundations)
### Cory Chu (2022)
[\[Return to Table of Content\]](https://hackmd.io/@c0rychu/dsa/)
**Reference**: Introduction to Algorithms by Thomas H. **C**ormen, Charles E. **L**eiserson, Ronald L. **R**ivest, and Clifford **S**tein.
---
## Ch1 - The Role of Algorithms in Computing
----
Muḥammad ibn Mūsā al-Khwārizmī or al-Khwarizmi was a Persian polymath from Khwarazm -- Wikipedia
According to Knuth, the word “algorithm” is derived from the name “al-Khowârizmî,” a ninth-century Persian mathematician. -- CLRS
----
### Correct Algorithm (P.29)
- An algorithm for a computational problem is **correct** if, for every problem instance provided as input, it **halts** — finishes its computing in finite time — and **outputs the correct solution** to the problem instance.
- An incorrect algorithm might not halt at all on some input instances, or it might halt with an incorrect answer.
- Contrary to what you might expect, *incorrect algorithms can sometimes be useful, if you can control their error rate.*
---
## Ch2 - Getting Started
----
### 2.1 - Insertion Sort
```
INSERTION-SORT(A, n)            // output from small to large
    for i = 2 to n              // subarray A[1:i-1] is sorted
        pickup = A[i]
        j = i-1
        // move elements in the sorted subarray to the right,
        // making space for inserting the pickup
        while j > 0 and A[j] > pickup:
            A[j+1] = A[j]
            j = j-1
        A[j+1] = pickup         // insert the pickup
```
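For reference, a runnable Python translation of this pseudocode (a sketch of my own, using 0-based indexing instead of the book's 1-based arrays):
```python
def insertion_sort(A):
    """Sort list A in place, ascending (0-based translation of the pseudocode)."""
    for i in range(1, len(A)):          # subarray A[0:i] is already sorted
        pickup = A[i]
        j = i - 1
        # Shift larger elements one slot to the right to make room for pickup
        while j >= 0 and A[j] > pickup:
            A[j + 1] = A[j]
            j -= 1
        A[j + 1] = pickup               # insert the pickup

A = [5, 2, 4, 6, 1, 3]
insertion_sort(A)
print(A)  # [1, 2, 3, 4, 5, 6]
```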
- Prove the correctness of insertion sort via a **Loop Invariant** (essentially *Mathematical Induction*)
    - **Loop Invariant**: at the start of each iteration, `A[1:i-1]` is sorted
    - Initialization: true for i = 2 (`A[1:1]` is a single element, trivially sorted)
    - Maintenance: i = k $\rightarrow$ i = k+1 (the loop body preserves the invariant)
    - Termination: the loop stops at i = n+1, so the invariant says `A[1:n]` is sorted (unlike MI, the "induction" stops at a finite point)
----
### 2.2 - Analyzing algorithms
- **Random-Access Machine (RAM)** model
- In the RAM model, each instruction (`+ - * / =` ...) or data access takes a constant amount of time
- Gray area in the RAM model:
    - Is `x^n` a constant-time instruction?
    - In general, no: it takes $O(\log n)$ multiplications via repeated squaring (eq. 31.34 / page 1205); see the sketch after this list
    - For $x=2, n\in\mathbb{N}$: `x^n == x<<(n-1)`, a single shift, so $O(1)$
    - In the RAM model we simply treat all of these as $O(1)$
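A quick illustration of the $O(\log n)$ claim: exponentiation by repeated squaring (my own sketch, not code from the book) uses about $\lg n$ squarings instead of $n-1$ multiplications:
```python
def power(x, n):
    """Compute x**n with Theta(lg n) multiplications (repeated squaring)."""
    result = 1
    while n > 0:
        if n & 1:        # if the lowest bit of the exponent is set
            result *= x
        x *= x           # square the base
        n >>= 1          # move to the next bit of the exponent
    return result

print(power(2, 10))  # 1024
```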
----
### 2.2 - Analyzing algorithms
- Running time depends on the input
- Input size: the number of
    - items (sorting)
    - bits (multiplying two integers)
    - vertices and edges (graph problems)
- The running time of an algorithm on a particular input is the number of instructions and data accesses executed (in the RAM model)
----
### Analysis of insertion sort
```=
INSERTION-SORT(A, n)                  # cost   times
    for i = 2 to n                    # c1     n
        pickup = A[i]                 # c2     n-1
        j = i-1                       # c3     n-1
        while j>0 and A[j] > pickup:  # c4     sum_{i=2}^{n} m_i
            A[j+1] = A[j]             # c5     sum_{i=2}^{n} (m_i - 1)
            j = j-1                   # c6     sum_{i=2}^{n} (m_i - 1)
        A[j+1] = pickup               # c7     n-1
```
$$
\begin{aligned}
T(n) &= c_1 n + [c_2+c_3+c_7] (n-1) \\
&+ c_4 \sum_{i=2}^n m_i +
[c_5+c_6] \sum_{i=2}^n (m_i - 1)
\end{aligned}
$$
- $m_i$ is the number of times the `while`-loop test is executed for that value of $i$ (i.e., the number of times the loop body runs, plus 1)
- $m_i=1$ for best case (A is already sorted)
- $m_i=i$ for worst case (A is reverse ordered)
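A quick way to see the two extremes of $m_i$ is to instrument the inner loop and count how many times its test runs (a small sketch of my own, not from CLRS):
```python
def insertion_sort_count(A):
    """Sort A in place and return the total number of while-loop tests (sum of m_i)."""
    tests = 0
    for i in range(1, len(A)):
        pickup, j = A[i], i - 1
        while True:
            tests += 1                              # one execution of the while test
            if not (j >= 0 and A[j] > pickup):
                break
            A[j + 1] = A[j]
            j -= 1
        A[j + 1] = pickup
    return tests

n = 8
print(insertion_sort_count(list(range(n))))         # sorted input: m_i = 1, total n-1 = 7
print(insertion_sort_count(list(range(n, 0, -1))))  # reversed input: m_i = i, total (n^2+n-2)/2 = 35
```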
----
### Analysis of insertion sort (**best** case)
$$
\begin{aligned}
T(n) &= c_1 n + [c_2+c_3+c_7] (n-1) \\
&+ c_4 \sum_{i=2}^n m_i +
[c_5+c_6] \sum_{i=2}^n (m_i - 1)
\end{aligned}
$$
- $m_i=1$ for the best case (A is already sorted), so $c_4 \sum_{i=2}^n m_i = c_4 (n-1)$ and $\sum_{i=2}^n (m_i - 1) = 0$, which kills the $[c_5+c_6]$ term
$$
\begin{aligned}
T(n) &= c_1 n + [c_2+c_3+c_7+c_4] (n-1) \\
&= a n + b
\end{aligned}
$$
- **$\Theta(n)$** (Pronounced "theta of n" or "theta n"), i.e., Linear Time
----
### Analysis of insertion sort (**worst** case)
$$
\begin{aligned}
T(n) &= c_1 n + [c_2+c_3+c_7] (n-1) \\
&+ c_4 \sum_{i=2}^n m_i +
[c_5+c_6] \sum_{i=2}^n (m_i - 1)
\end{aligned}
$$
- $m_i=i$ for worst case (A is reverse ordered)
$\sum_{i=2}^n m_i = \frac{(2+n)(n-1)}{2}=\frac{n^2+n-2}{2}$
$\sum_{i=2}^n (m_i-1) = \frac{n^2+n-2}{2}-(n-1)=\frac{n^2-n}{2}$
$T(n)=cn^2+dn+e$
- **$\Theta(n^2)$** (Pronounced "theta of n-squared" or "theta n-squared"), i.e., Quadratic Time
----
Due to constant factors and lower-order terms, an algorithm whose running time has a higher order of growth might take less time for small inputs than an algorithm whose running time has a lower order of growth.
----
### 2.3 Designing algorithms
- Incremental method: e.g., Insertion Sort
- Divide and Conquer: e.g., Merge Sort
```python
def MergeSort(A, left, right):
    if left >= right:            # base case: zero or one element
        return
    mid = (left + right) // 2    # floor((left+right)/2)
    MergeSort(A, left, mid)      # conquer: sort the left half A[left:mid]
    MergeSort(A, mid+1, right)   # conquer: sort the right half A[mid+1:right]
    Merge(A, left, mid, right)   # combine: merge the two sorted halves (see below)
```
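The `Merge` routine is the combine step; a minimal sketch (my own, using temporary copies instead of the book's $\infty$ sentinels), plus a small usage check:
```python
def Merge(A, left, mid, right):
    """Merge the sorted subarrays A[left:mid+1] and A[mid+1:right+1] in place."""
    L = A[left:mid + 1]               # copy of the sorted left half
    R = A[mid + 1:right + 1]          # copy of the sorted right half
    i = j = 0
    k = left
    while i < len(L) and j < len(R):  # repeatedly take the smaller head element
        if L[i] <= R[j]:
            A[k] = L[i]; i += 1
        else:
            A[k] = R[j]; j += 1
        k += 1
    A[k:right + 1] = L[i:] + R[j:]    # copy whatever remains in either half

A = [5, 2, 4, 7, 1, 3, 2, 6]
MergeSort(A, 0, len(A) - 1)
print(A)  # [1, 2, 2, 3, 4, 5, 6, 7]
```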
---
## Ch3 - Characterizing Running Times
----
### 3.1 - $O$, $\Omega$, and $\Theta$-notation
- As suggested by Aho, Hopcroft, and Ullman [1], it has become prevalent to use $O$-, $\Omega$-, and $\Theta$-notation to describe the **asymptotic** behavior of algorithms.
- They are general notations that can be used for **Time Complexity**, **Space Complexity**, ...
- **$O(f(n))$** is an asymptotic **upper bound**
- e.g., $5n^3+3n$ is $O(n^3)$, $O(n^4)$, $O(n^5)$,...
- **$\Omega(f(n))$** is an asymptotic **lower bound**
- e.g., $5n^3+3n$ is $\Omega(n^3)$, $\Omega(n^2)$, $\Omega(n)$,...
- **$\Theta(f(n))$** is the asymptotic **tight bound**
- e.g., $5n^3+3n$ is $\Theta(n^3)$
- As a result, if you can show a function is both $O(n^3)$ and $\Omega(n^3)$, then it must be $\Theta(n^3)$.
[1] Alfred V. Aho, John E. Hopcroft, and Jeffrey D. Ullman. *The Design and Analysis of Computer Algorithms*, Addison-Wesley, 1974.
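For reference, the formal definitions behind these bounds (each is a set of functions):
$$
\begin{aligned}
O(g(n)) &= \{\, f(n) : \exists\, c, n_0 > 0 \text{ s.t. } 0 \le f(n) \le c\,g(n)\ \forall n \ge n_0 \,\} \\
\Omega(g(n)) &= \{\, f(n) : \exists\, c, n_0 > 0 \text{ s.t. } 0 \le c\,g(n) \le f(n)\ \forall n \ge n_0 \,\} \\
\Theta(g(n)) &= \{\, f(n) : \exists\, c_1, c_2, n_0 > 0 \text{ s.t. } 0 \le c_1 g(n) \le f(n) \le c_2 g(n)\ \forall n \ge n_0 \,\}
\end{aligned}
$$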
----
### Revisit Insertion Sort (step1: **$O(n^2)$**)
```python=
INSERTION-SORT(A, n)
    for i = 2 to n
        pickup = A[i]
        j = i-1
        while j>0 and A[j] > pickup:
            A[j+1] = A[j]          # Line 6 - Move element by one position
            j = j-1
        A[j+1] = pickup
```
- The outer `for` loop runs **`n-1`** times
- The inner `while` loop runs anywhere from `0` (best) to **`i-1` (worst)** times, depending on the input array `A`
- Since `i <= n`, each run of the inner loop has at most `n-1` iterations
- The total number of inner-loop iterations is therefore **at most** $(n-1)(n-1)$, which is **$O(n^2)$**
----
### Revisit Insertion Sort (step2: **$\Omega(n^2)$**)
```python=
INSERTION-SORT(A, n)
    for i = 2 to n
        pickup = A[i]
        j = i-1
        while j>0 and A[j] > pickup:
            A[j+1] = A[j]          # Line 6 - Move element by one position
            j = j-1
        A[j+1] = pickup
```
- Assume $n \equiv 0 \pmod 3$, so we can divide `A` into three pieces: `A[1:n/3]`, `A[n/3+1:2n/3]`, `A[2n/3+1:n]`.
- Suppose the $\frac{n}{3}$ numbers in the first piece `A[1:n/3]` are all greater than the rest of the input array `A`.
- After sorting, each of these $\frac{n}{3}$ numbers must end up somewhere in the third piece `A[2n/3+1:n]`, moved there by Line 6 of the code above.
- Each of these numbers is therefore moved at least $\frac{n}{3}$ positions, just to cross the middle piece `A[n/3+1:2n/3]`.
- So, Line 6 is executed **at least** $\frac{n}{3}\cdot\frac{n}{3}=\frac{n^2}{9}$ times.
- For this **particular kind of input**, the running time is **$\Omega(n^2)$**.
----
### Revisit Insertion Sort (step3: **$\Theta(n^2)$**)
```python=
INSERTION-SORT(A, n)
    for i = 2 to n
        pickup = A[i]
        j = i-1
        while j>0 and A[j] > pickup:
            A[j+1] = A[j]          # Line 6 - Move element by one position
            j = j-1
        A[j+1] = pickup
```
- Step 1: $O(n^2)$ for every input
- Step 2: $\Omega(n^2)$ for one particular family of inputs, which is enough to lower-bound the worst case
- Step 3: Therefore, insertion sort's worst-case running time is $\Theta(n^2)$.
----
### Remarks <small>(Page 95)</small>
- ✅ Insertion sort's best-case running time is $O(n)$, $\Omega(n)$, and $\Theta(n)$
- ✅ Insertion sort's worst-case running time is $O(n^2)$, $\Omega(n^2)$, and $\Theta(n^2)$
- ❌ Insertion sort's running time is $\Theta(n^2)$
- Because in the best case it is $\Theta(n)$, so no single $\Theta$ bound covers all inputs
- ✅ **Insertion sort's running time is $O(n^2)$**
- ✅ Insertion sort's running time is $\Omega(n)$
- ✅ Merge sort runs in $\Theta(n \lg n)$ time (in all cases)
----
### **Amortized**
- Detail discussed in Ch.16
- "In an amortized analysis, you average the time required to perform a sequence of data-structure operations over all the operations performed."
- "Amortized analysis differs from average-case analysis in that probability is not involved. An amortized analysis guarantees the *average performance of each operation in the worst case*."
----
### Misc.
- Formally, $f(n) = O(g(n))$ means $f(n) \in O(g(n))$, where $O(g(n))$ is a **set** of functions.
- There are also $o$ and $\omega$ notations; intuitively:
    - $f(n) = O(g(n))$ is like $f \le g$
    - $f(n) = \Omega(g(n))$ is like $f \ge g$
    - $f(n) = \Theta(g(n))$ is like $f = g$
    - $f(n) = o(g(n))$ is like $f < g$
    - $f(n) = \omega(g(n))$ is like $f > g$
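A handy (informal) way to characterize the strict bounds, for asymptotically positive $f$ and $g$ (a limit characterization, not the formal definition):
$$
f(n) = o(g(n)) \iff \lim_{n \to \infty} \frac{f(n)}{g(n)} = 0,
\qquad
f(n) = \omega(g(n)) \iff \lim_{n \to \infty} \frac{f(n)}{g(n)} = \infty
$$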
----
### 3.3 - Standard notations and common functions
- Floor/Ceiling
- `floor(1.7)`$= \lfloor 1.7 \rfloor = 1$
- `ceil(1.7)`$= \lceil 1.7 \rceil = 2$
- $\lceil x \rceil \ge x \ge \lfloor x \rfloor \quad \forall x \in \mathbb{R}$
- Modular
- `a%n`$= (a \bmod n) = a - n \lfloor \frac{a}{n}\rfloor$
- $(a \bmod n) = (b \bmod n) \Rightarrow a \equiv b \pmod n$
i.e., $a$ is equivalent to $b$, modulo $n$
- $\lg n = \log_2 n$
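A quick Python sanity check of these conventions (note that Python's `%` already follows the floor definition above, even for negative operands):
```python
from math import floor, ceil, log2

print(floor(1.7), ceil(1.7))        # 1 2
a, n = -7, 3
print(a % n, a - n * floor(a / n))  # 2 2  -> Python's % matches the floor definition
print(log2(8))                      # 3.0  -> lg n = log base 2 of n
```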
---
## Ch4 - Divide-and-Conquer
----
### Intro - Divide-and-Conquer
- **Base Case**: Solve it directly
- **Recursive Case**: Divide it recursively until reaching the base case (*bottoms out*). This involves three steps:
1. **Divide** the problem into subproblems
2. **Conquer** the subproblems *recursively*
3. **Combine** the subproblems to solve the original problem
----
### Compared to Dynamic Programming (Ch14)
- In contrast to divide-and-conquer, dynamic programming applies when subproblems share subsubproblems.
- A dynamic-programming algorithm **solves each subsubproblem just once** and then **saves its answer in a table for looking up later**.
----
### Algorithmic recurrences
- Recurrence $T(n)$ is **algorithmic** if, for every sufficiently large **threshold** constant $n_0>0$, we have:
1. $\forall n<n_0$, we have $T(n) = \Theta(1)$.
2. $\forall n \ge n_0$, all recursion terminates in a defined base case within a finite number of recursive invocations.
- An example of recurrence $T(n)$:
- The running time of Strassen’s algorithm can be described by the recurrence: $$T(n) = 7 T(n/2)+\Theta(n^2)$$
- The solution is $$T(n) = \Theta(n^{\lg 7})\approx\Theta(n^{2.81})$$
----
### 4.1 - Multiplying Square Matrices (directly)
```python
def Matrix_Multiply(A, B, C, n):
    # Accumulate the product A*B into C (C is assumed to start at all zeros)
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i,j] += A[i,k]*B[k,j]
```
- $\Theta(n^3)$ time complexity
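A small usage sketch (assuming numpy arrays and a zero-initialized `C`, which is my assumption; the text only gives the function itself), checking the result against numpy's built-in product:
```python
import numpy as np

n = 4
A = np.random.rand(n, n)
B = np.random.rand(n, n)
C = np.zeros((n, n))           # accumulator must start at all zeros
Matrix_Multiply(A, B, C, n)    # the Theta(n^3) triple loop defined above
print(np.allclose(C, A @ B))   # True
```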
----
### 4.1 - Multiplying Square Matrices (Simple Divide-and-Conquer)
- Assume $n$ is an exact power of 2.
- Divide the original $n \times n$ matrices ($A, B, C$) into $\frac{n}{2}\times\frac{n}{2}$ submatrices ($A_{11}, A_{12}, \ldots$) and expand the multiplication:
$$
\begin{aligned}
C &= A \cdot B \\
\Rightarrow
\begin{pmatrix}
C_{11} & C_{12} \\
C_{21} & C_{22}
\end{pmatrix}
&=
\begin{pmatrix}
A_{11} & A_{12} \\
A_{21} & A_{22}
\end{pmatrix}
\begin{pmatrix}
B_{11} & B_{12} \\
B_{21} & B_{22}
\end{pmatrix}
\\
&=
\begin{pmatrix}
A_{11} \cdot B_{11} + A_{12} \cdot B_{21} & A_{11} \cdot B_{12} + A_{12} \cdot B_{22} \\
A_{21} \cdot B_{11} + A_{22} \cdot B_{21} & A_{21} \cdot B_{12} + A_{22} \cdot B_{22}
\end{pmatrix}
\end{aligned}
$$
----
### 4.1 - Multiplying Square Matrices (Simple Divide-and-Conquer)
```python
def Matrix_Multiply_Recursive(A, B, C, n):
    if n == 1:  # Base case: 1x1 matrices, a single scalar multiply-add
        C[0, 0] += A[0, 0] * B[0, 0]
        return
    else:
        # Divide: partition each matrix into four n/2 x n/2 submatrices
        [A11, A12, A21, A22,
         B11, B12, B21, B22,
         C11, C12, C21, C22] = Part(A, B, C)
        # Conquer: 8 recursive multiplications of n/2 x n/2 matrices
        Matrix_Multiply_Recursive(A11, B11, C11, n//2)
        Matrix_Multiply_Recursive(A11, B12, C12, n//2)
        Matrix_Multiply_Recursive(A21, B11, C21, n//2)
        Matrix_Multiply_Recursive(A21, B12, C22, n//2)
        Matrix_Multiply_Recursive(A12, B21, C11, n//2)
        Matrix_Multiply_Recursive(A12, B22, C12, n//2)
        Matrix_Multiply_Recursive(A22, B21, C21, n//2)
        Matrix_Multiply_Recursive(A22, B22, C22, n//2)
```
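The `Part` helper is not spelled out above; a minimal sketch (my assumption, using numpy views) that returns the twelve quadrants so that writing into `C11`...`C22` updates `C` in place:
```python
import numpy as np

def Part(A, B, C):
    # Split each n x n matrix into four n/2 x n/2 quadrant views (no copying)
    h = A.shape[0] // 2
    quads = []
    for M in (A, B, C):
        quads += [M[:h, :h], M[:h, h:], M[h:, :h], M[h:, h:]]
    return quads  # [A11, A12, A21, A22, B11, ..., B22, C11, ..., C22]
```
Because these are views rather than copies, the divide step costs only $\Theta(1)$ index arithmetic, which is what the recurrence below assumes.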
- $T(n)=8T(n/2)+\Theta(1)$
- From the master method in Sec. 4.5, we can show it has the solution: $T(n) = \Theta(n^3)$
- Could we make it better than $\Theta(n^3)$?
----
### 4.2 - Strassen’s algorithm for matrix multiplication
It is hard to imagine that any matrix multiplication algorithm could take less than $\Theta(n^3)$ time, since the natural definition of matrix multiplication requires $n^3$ scalar multiplications. Indeed, many mathematicians presumed that it was not possible to multiply matrices in $o(n^3)$ time until 1969, when V. Strassen [424] published a remarkable recursive algorithm for multiplying $n \times n$ matrices.
- The running time of Strassen’s algorithm is $\Theta(n^{\lg 7})\approx\Theta(n^{2.81})$
- [424] Volker Strassen. *Gaussian elimination is not optimal*, Numerische Mathematik, 14(3):354–356, 1969.
----
### 4.2 - Strassen’s algorithm for matrix multiplication (cont.)
- The basic idea is similar to $x^2-y^2=(x-y)(x+y)$, where 2 multiplications and 1 "addition" are traded for 1 multiplication and 2 "additions".
- For scalars, the cost difference between a multiplication and an addition is small. For $\frac{n}{2}\times\frac{n}{2}$ matrices, however, a multiplication is far more expensive than an addition ($\Theta(n^3)$ vs. $\Theta(n^2)$ naively), so the trade pays off.
- Instead of dividing the original $n \times n$ matrix multiplication into **8 matrix multiplications of $\frac{n}{2}\times\frac{n}{2}$ matrices**, Strassen performs only **7 matrix multiplications of $\frac{n}{2}\times\frac{n}{2}$ matrices**, at the cost of some **extra addition (subtraction) operations**.
- So, the recurrence for the running time of Strassen’s algorithm is $$T(n) = 7T(n/2)+\Theta(n^2)$$
- [Link: Wikipedia - Strassen’s Algorithm](https://en.wikipedia.org/wiki/Strassen_algorithm)
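For reference, the seven products and the recombination, in the form commonly given (e.g., on the Wikipedia page linked above):
$$
\begin{aligned}
M_1 &= (A_{11}+A_{22})(B_{11}+B_{22}) & M_5 &= (A_{11}+A_{12})\,B_{22} \\
M_2 &= (A_{21}+A_{22})\,B_{11} & M_6 &= (A_{21}-A_{11})(B_{11}+B_{12}) \\
M_3 &= A_{11}(B_{12}-B_{22}) & M_7 &= (A_{12}-A_{22})(B_{21}+B_{22}) \\
M_4 &= A_{22}(B_{21}-B_{11}) & &
\end{aligned}
$$
$$
\begin{aligned}
C_{11} &= M_1+M_4-M_5+M_7, & C_{12} &= M_3+M_5, \\
C_{21} &= M_2+M_4, & C_{22} &= M_1-M_2+M_3+M_6
\end{aligned}
$$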
----
### 4.3 - The substitution method for solving recurrences
----
### 4.4 - The recursion-tree method for solving recurrences
----
### 4.5 - The master method for solving recurrences
- The master method provides a “cookbook” method for solving algorithmic recurrences of the form: $$T(n) = aT(n/b)+f(n)$$
- A recurrence of this general form is called a **master recurrence**, and $f(n)$ is called the **driving function**.
- It means we divide a problem of size $n$ into $a$ subproblems, each of size $n/b<n$, and spend $f(n)$ time dividing the problem and combining the subproblem solutions.
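As a worked example, take the Strassen recurrence from earlier in this chapter: $a = 7$, $b = 2$, $f(n) = \Theta(n^2)$. The watershed function is $n^{\log_b a} = n^{\lg 7} \approx n^{2.81}$, and since $f(n) = O(n^{\lg 7 - \epsilon})$ for some constant $\epsilon > 0$, the driving function grows polynomially more slowly than the watershed, so the master theorem gives
$$
T(n) = 7\,T(n/2) + \Theta(n^2) \quad\Rightarrow\quad T(n) = \Theta(n^{\lg 7}) \approx \Theta(n^{2.81}),
$$
matching the solution quoted in the introduction to this chapter.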
{"metaMigratedAt":"2023-06-17T07:05:57.828Z","metaMigratedFrom":"YAML","title":"Notes for CLRS (Part I - Fundations)","breaks":true,"slideOptions":"{\"theme\":\"serif\"}","description":"[Return to Table of Content]","contributors":"[{\"id\":\"de7b5032-6489-4d89-b444-ed16c56171f8\",\"add\":16705,\"del\":1833,\"latestUpdatedAt\":1765919008188}]"}