###### tags: `ADA 2.4` `Merge Sort` # ADA 2.4 Merge Sort 合併排序法 # Sorting Problem Input: unsorted list of size n 6,3,5,1,8,7,2,4 :::info ❓ What are the **base case** and **recursive case**? ::: Output: sorted list of size n 1,2,3,4,5,6,7,8 --- # Divide-and-Conquer - Base case (n = 1) - Directly output the list - Recursive case (n > 1) - Divide the list into two sub-lists - Sort each sub-list recursively - Merge the two sorted lists How? 1,3,5,6. 2,4,7,8. 2 sublists of size **n/2** 兩個指標指向兩個 list 的頭一一作比較,比較小的放下來,直到兩個 list 都清空 :::info 💡 # of comparison = $\Theta(n)$ ::: --- # Illustration for n = 10 ![](https://i.imgur.com/aHdVnNT.png) --- # Pseudocode for Merge Sort ```c MergeSort(A, p, r) // base case if p == r return // recursive case // divide q = [(p+r-1)/2] // conquer MergeSort(A, p, q) MergeSort(A, q+1, r) // combine Merge(A, p, q, r) ``` p, r: 左邊和右邊的邊界 q: 中間的切點 1. Divide - Divide a list of size n into 2 sublists of size n/2 2. Conquer - Recursive case (n > 1) - Sort 2 sublists **recursively** using **merge sort** - Base case (n = 1) - Return itself 3. Combine - Merge 2 sorted sublists into one sorted list in **linear** time --- # Time Complexity for Merge Sort - T(n) = time for running MergeSort(A, p, r) with r-p+1 = n - Divide $\Theta(1)$ - Recursive $T(\lceil n/2\rceil)+T(\lfloor n/2\rfloor)$ - Base case $\Theta(1)$ - Merge $\Theta(n)$ $T(n)=\begin{cases} O(1) &\text{if } n=1\\ T(\lceil n/2\rceil)+T(\lfloor n/2\rfloor)+O(n) &\text{if } n\ge2 \end{cases}$ - Simplify recurrences - Ingore floors and ceilings (boundary conditions) - Assume base cases are constant (for small n) $T(n)=\begin{cases} O(1) &\text{if } n=1\\ 2T(n/2)+O(n) &\text{if } n\ge2 \end{cases}$ $T(n)\le2T(\frac n2)+cn$ $1^{st}$ expansion $\le2[2T(\frac n4)+c\frac n2]+cn=4T(\frac n4)+2cn$ $2^{st}$ expansion $\le4[2T(\frac n8)+c\frac n4]+2cn=8T(\frac n8)+3cn$ ... $\le2^kT(\frac n{2^k})+kcn$ $k^{th}$ expansion The expansion stops when $2^k = n$ :::info 💡 遞迴樹法 ::: $T(n)\le nT(1)+cn\log_2n$ $=O(n)+O(n\log n)$ $=O(n\log n)$ --- # Theorem1 - Theorem $T(n)=\begin{cases} O(1) &\text{if } n=1\\ T(\lceil n/2\rceil)+T(\lfloor n/2\rfloor)+O(n) &\text{if } n\ge2 \end{cases}$ → $T(n) = O(n\log n)$ - Proof - There exists positive constant a, b s.t. $T(n)\le\begin{cases} a &\text{if } n=1\\ T(\lceil n/2\rceil)+T(\lfloor n/2\rfloor)+b\cdot n &\text{if } n\ge2 \end{cases}$ - Use induction(歸納法) to prove $T(n)\le2b\cdot n\log_2 n+a\cdot n$ - n = 1, trivial - n > 1, $\lceil \frac n2\rceil \le \frac n{\sqrt2}$ $T(n)\le T(\lceil n/2\rceil)+T(\lfloor n/2\rfloor)+b\cdot n$ Inductive hypothesis $\le 2b\cdot(\lceil n/2\rceil\log_2 \lceil n/2\rceil)+a\cdot \lceil n/2\rceil+2b\cdot(\lfloor n/2\rfloor\log_2 \lfloor n/2\rfloor)+a\cdot\lfloor n/2\rfloor+b\cdot n$ $\le 2b\cdot(\lceil n/2\rceil\log_2 \frac n{\sqrt2})+a\cdot \lceil n/2\rceil+2b\cdot(\lfloor n/2\rfloor\log_2 \frac n{\sqrt2})+a\cdot\lfloor n/2\rfloor+b\cdot n$ $=2b\cdot n(\log n - \log_2\sqrt2)+a\cdot n+b\cdot n=2b\cdot n\log_2 n+a\cdot n$ :::info 💡 取代法 ::: :::warning 自己幫自己回答: 原來是數學都還給數學老師了,去查了一下對數的運算 $\log_a \frac bc = \log_ab-\log_ac$ 所以倒數第二條可以整理成 $\le 4b\cdot( n/2\log_2 \frac n{\sqrt2})+a\cdot n+b\cdot n$ $\le 2b\cdot n(\log_2 \frac n{\sqrt2})+a\cdot n+b\cdot n$ $=2b\cdot n(\log_2n - \log_2\sqrt2)+a\cdot n+b\cdot n$ 然後因為$\sqrt2$又可以寫作$2^{1/2}$所以$\log_2\sqrt2=1/2$了,再整理一下 $=2b\cdot n\log_2n - bn +an + bn$ $=2b\cdot n\log_2n +an$ ::: --- # How to Solve Recurrence Relations? 1. Subsititution Method (取代法) - Guess a bound and then prove by induction 2. Recursion-Tree Method (遞迴樹法) - Expand the recurrence into a tree and sum up the cost 3. Master Method (套公式大法/大師法) (還沒講) - Apply Master Theorem to a specific form of recurrences Let’s see more examples first and come back to this later