---
# System prepended metadata

title: 演算法_comparision sort
tags: [大二, 資工]

---

# 演算法_comparision sort
###  In-place sorting algorithm
> A sorting algorithm is in-place if the numbers are rearranged within the array A, **with at most a constant number of them sorted outside** the array at any time.
* In-place: Insertion sort, Quicksort, Heapsort
* Not in-place: Merge sort, Counting sort, Radix sort
### worst-case running time
> It is an upper bound on the running time.
> The worst case occurs fair often
> The average case is often as bad as the worst case
### Stable
> keys with same value appear in same order in output as they did in input
* Stable: Insertion sort, Merge sort, Counting sort, Radix sort
* Unstable: Heapsort, Quicksort

### Sorting in Linear Time
> violates the ground rules θ(nlogn)


---

# Insertion sort
* Sorted in place
* Loop invariant
    * At the start of each iteration of the for loop of line 1-8, the subarray A[1,..., j – 1] consists of the elements originally in A[1,..., j – 1] but in sorted order

## pseudocode
```!=
Insertion-sort(A)
    for j⭠2 to length[A]
        do key⭠A[j]
        i⭠j-1
        while i>0 and A[i]>key
            do A[i+1]⭠A[i]
            i⭠i-1
        A[i]⭠key
```
![image](https://hackmd.io/_uploads/Byh-0ge-A.png =60%x)


---

# Merge sort
* NOT in-place
* T(n) = θ(nlogn)

## pseudocode
```!=
MERGE(A, p, q, r)
    n1 ⭠ q – p + 1
    n2 ⭠ r – q
    create array L[1,..., n1 + 1] and R[1,..., n2 + 1]
    for i ⭠ 1 to n1
        do L[i] ⭠ A[p + i – 1]
    for j ⭠ 1 to n2
        do R[j] ⭠ A[q + j]
    L[n1 + 1] ⭠ ∞ //比較最後一個元素時會用到
    R[n2 + 1] ⭠ ∞
    i ⭠ 1
    j ⭠ 1
    for k ⭠ p to r
    do if L[i] <= R[j]
        then A[k] ⭠ L[i]
            i ⭠ i + 1
        else A[k] ⭠ R[j]
            j ⭠ j + 1
            
MERGE-SORT(A, p, r)
    if p < r
        then q ⭠ [(p + r)/2] //取下高斯
        MERGE-SORT(A, p, q)
        MERGE-SORT(A, q + 1, r)
        MERGE(A, p, q, r)
```
![image](https://hackmd.io/_uploads/HJs74WgWR.png =80%x)


---

# Asymptotic notation
* g(n) is an asymptotic **upper bound** for f(n)
    O(g(n)) = { f(n) : ∃ positive constants c, n0 s.t. ∀ n ≥ n0, **0 ≤ f(n) ≤ cg(n)**}
    o(g(n)) = { f(n) : <font color="#f00">∀</font> constants c>0, ∃ constants n0>0 s.t. ∀ n ≥ n0, 0 ≤ f(n) ==<== cg(n)}
    
* g(n) is an asymptotic **lower bound** for f(n)
    Ω(g(n)) = { f(n) : ∃ positive constants c, n0 s.t. ∀ n ≥ n0, **0 ≤ cg(n) ≤ f(n)**}
    ω(g(n)) = { f(n) : <font color="#f00">∀</font> constants c>0, ∃ constants n0>0 s.t. ∀ n ≥ n0, 0 ≤ cg(n) ==<== f(n)}
    
* g(n) is an asymptotic **tight bound** for f(n)
    Θ(g(n)) = { f(n) : ∃ positive constants c1, c2, n0 s.t. ∀ n ≥ n0, 0 ≤ c1g(n) ≤ f(n) ≤ c2g(n)}

## Theorem

:::success
f(n) = Θ(g(n)) if and only if f(n) = O(g(n)) and f(n) = Ω(g(n))
:::
    
(⇒):
f(n) = Θ(g(n)) imply ∃ positive constants c1 , c2 , n0 s.t. ∀ n ≥ n0 , 0 ≤ c1g(n) ≤ f(n) ≤ c2g(n),
from 0 ≤ c1g(n) ≤ f(n) , we get f(n) = Ω(g(n)), 
from f (n) ≤ c2g(n) , we get f (n) = O(g(n)).

(⇐):
f(n) = O(g(n)) imply ∃ positive constants c1, n1 s.t. ∀ n ≥ n1 , f (n) ≤ c1g(n),
f(n) = Ω(g(n)) imply ∃ positive constants c2, n2 s.t. ∀ n ≥ n2 , 0 ≤ c2g(n) ≤ f(n),
therefore, let n0 = max{n1, n2},
∃ positive constants c1, c2 , n0 s.t. ∀ n ≥ n0 , 0 ≤ c2g(n) ≤ f(n) ≤ c1g(n),
we get that f(n) = Θ(g(n)).
    
## Properties
#### Transitivity
![image](https://hackmd.io/_uploads/HkW0UGx-R.png =70%x)![image](https://hackmd.io/_uploads/Syu8vzl-R.png =25%x)

#### Reflexivity
![image](https://hackmd.io/_uploads/H1LJPfgZ0.png =20%x)![image](https://hackmd.io/_uploads/HJvePMlZ0.png =20%x)

#### Symmetry
![image](https://hackmd.io/_uploads/ry8hwMgWA.png =60%x)

#### Transpose symmetry
![image](https://hackmd.io/_uploads/SJiCwGgZA.png =60%x)

## Exmaple
1.![217A9031-B47C-4760-B724-45F17759325D](https://hackmd.io/_uploads/HJgmIGxbC.jpg)
2.![F04A871A-FE82-4114-A246-92B331A85BC2](https://hackmd.io/_uploads/ByUJ7XlZ0.jpg)
3.![image](https://hackmd.io/_uploads/SyYnQ7gbC.png)


---


# Recurrences
## Substitution method
1. Guess the solution
2. Upper bound、Lower bound
![3A364978-8BC9-4B47-AD08-9E929DE529BF](https://hackmd.io/_uploads/BJ-8qRlbA.jpg)


## Recurrence trees
![image](https://hackmd.io/_uploads/SJudB1--A.png)


## Master method
> 取最大的，差$lg^{k}n$平手變$lg^{k+1}n$

:::info
T(n) = aT(n/b) + f(n), Where a≥1, b>1, and f(n)>0.
:::

* Case 1：$f(n) < O(n^{log_ba})$ 
    Solution：$T(n) = O(n^{log_ba})$
    
* ==Case 2：$f(n) = O(n^{log_ba}lg^{k}n),where\  k ≥ 0$== 平手
    ==Solution：$T(n) = O(n^{log_ba}lg^{k+1}n)$==

* Case 3：$f(n) > O(n^{log_ba})$ 
    Solution：$T(n) = O(f(n))$
    

---
# Heapsort
* Heap A is a nearly complete **binary tree**
* Height of node = # of edges on a longest simple path from the **node down to a leaf**
* Height of heap = height of root = Θ(lgn)
* Maximum numbers of elements in a heap of height h =  $2^{h+1}$ − 1
* property
    * Max-heap: for all nodes i, excluding the root, A[PARENT(i)] ≥ A[i].
    * Min-heap: for all nodes i, excluding the root, A[PARENT(i)] ≤ A[i].
* A heap can be stores as an array A:
    * Root of trees is A[1]
    * Parent of A[𝑖] = A[⌊𝑖/2⌋]
    * Left child of A[𝑖] = A[2𝑖]
    * Right child of A[𝑖] = A[2𝑖+1]
## MAX-HEAPIFY
> 先找自己、左右的child誰最大，最後把最大的換上來
* Time: O(lg n)

```!=
MAX-HEAPIFY(A, i, n)
    l ⭠ LEFT(i)
    r ⭠ RIGHT(i)
    if l ≤ n and A[l] > A[i]
        then largest ⭠ l
        else largest ⭠ i
    if r ≤ n and A[r] > A[largest]
        then largest ⭠ r
    if largest ≠ i
        then exchange A[i]⭤ A[largest]
    MAX-HEAPIFY(A, largest, n)
```
![image](https://hackmd.io/_uploads/HkWnsyWWC.png)

## BUILD-MAX-HEAP
> 從最小的子樹走上去
* **Time: O(n)** //Heapify takes a different time for each node
* Loop invariant
    * At start of every iteration of for loop, each node i + 1, i + 2,..., n is root of a max-heap
```!=
BUILD-MAX-HEAP(A, n)
    for i ⭠ ⌊n/2⌋ downto 1
        do MAX-HEAPIFY(A, i, n)
```
![image](https://hackmd.io/_uploads/r1iqTk-bR.png)

## HEAPSORT
> 跟最小的node換，再HEAPIFY
* Time: O(nlgn)
```!=
BUILD-MAX-HEAP(A, n)
    for i ⭠ n downto 2
        do exchange A[1] ⭤  A[i]
            MAX-HEAPIFY(A, 1, i – 1)
```
![image](https://hackmd.io/_uploads/B1xNZe-bC.png)

## Implementation of Priority Queue
### Max Priority Queues
* INSERT(S, x): inserts element x into set S
    * 插入新的node，**設為–∞**，再進行HEAP-INCREASE-KEY
    * O(lg n)
* MAXIMUM(S): returns elements of S with largest key
    * return Root
    * Time: Θ(1)

* EXTRACT-MAX(S): removes and returns element of S with largest key
    * 把root跟最小node交換，pop largest key，再MAX-HEAPIFY
    * Time: O(lg n)

* INCREASE-KEY(S, x, k): increases value of element x’s key to k. Assume k ≥ x’s current key value
    * 跟parent比，若key較大就交換
    * Time: O(lg n)
```!=
INSERT(A, key, n)
    A[n + 1] ⭠ –∞
    INCREASE-KEY(A, n + 1, key)

MAXIMUM(A)
    return A[1]

EXTRACT-MAX(A, n)
    if n < 1
        then error “heap underflow”
    max ⭠ A[1]
    A[1] ⭠ A[n]
    MAX-HEAPIFY(A, 1, n – 1)
    return max
    
INCREASE-KEY(A, i, key)
    if key < A[i]
        then error “new key is smaller than current key”
    A[i] ⭠ key
    while i > 1 and A[PARENT(i)] < A[i]
    do exchange A[i] ⭤  A[PARENT(i)]
        i ⭠ PARENT(i)
```


---
# Quicksort
> A[j]小於pivot, i = i+1, A[i]⭤ A[j]
* Sorts in place
* Loop invariant
* Time (Quicksort)
    * Worst-case running time: Θ($n^2$)
    * Expected running time: Θ(nlg n), constants hidden in are small
* Time (PARTITION): Θ(n)
    * For loop iterates exactly r−p times, each iteration takes constant time.
    * The section outside the for loop also takes constant time.
    * As r−p represents the size of the subarray, PARTITION takes time proportional to the size of the subarray it's called on

```!=
QUICKSORT(A, p, r)
    if p < r
    then q ← PARTITION(A, p, r)
    QUICKSORT(A, p, q-1)
    QUICKSORT(A, q+1, r)
    
PARTITION(A, p, r)
    x ← A[r]
    i ← p – 1
    for j ← p to r – 1  //A[r] is pivot
        do if A[j] ≤ x  // 小於pivot, 換去前面
            then i ← i +1
                exchange A[i]⭤ A[j]
    exchange A[i+1]⭤ A[r]
    return i + 1
```
![image](https://hackmd.io/_uploads/H1xKXk---R.png =50%x)![image](https://hackmd.io/_uploads/BkhDyW--A.png =45%x)

### Performance
#### Worst case
𝑇(𝑛) = 𝑇(𝑛 − 1) + 𝑇(0) + 𝑂(𝑛) = **𝑂($𝑛^2$)**
* Have 0 elements in one subarray and n-1 elements in the other subarray
* When input is sorted in either **ascending order** or **descending order**.
* **Insertion sort** runs in O(n) time in this case

#### Best case
T(n) = 2T(n/2) + Θ(n) = **Θ(nlgn)**
* Each subarray has ≤ n/2 elements

#### Balanced case
T(n) ≤ T(9n/10) + T(n/10) + Θ(n) = **O(nlgn)**
* Imagine that PARTITION always produces a 9-to-1 split

## Randomized Quicksort
*  an **already-sorted array** causes worst-case behavior in non-randomized QUICKSORT, but not in RANDOMIZED QUICKSORT
*  Why analyze the expected running time of a randomized algorithm?
    *  because it's more representative of **typical time costs**.
    *  Additionally, considering randomness in computation **prevents adversarial input generation** compared to analyzing all possible inputs
```!=
RANDOMIZED-PARTITION(A, p, r)
    i ← RANDOM(p, r)
    exchange A[r]⭤ A[i]
    return PARTITION(A, p, r)
    
RANDOMIZED-QUICKSORT(A, p, r)
    if p < r
    then q ← RANDOMIZED-PARTITION(A, p, r)
        RANDOMIZED-QUICKSORT(A, p, q-1)
        RANDOMIZED-QUICKSORT(A, q+1, r)
```
### Performance
#### ==Worst case== 必考
![image](https://hackmd.io/_uploads/BJzzwr-Z0.png =60%x)

![image](https://hackmd.io/_uploads/S16b9BZ-0.png =35%x)

![image](https://hackmd.io/_uploads/SkCGqHWZR.png =50%x)

![image](https://hackmd.io/_uploads/HyiN5HZbC.png =90%x)
* When the (n/4)th smallest element is selected as the pivot using an O(n) time algorithm
    * recursion expression is T(n) = T(n/4) + T(3n/4) + cn.
    * upon solving the recursion, determine that the time complexity is θ(nlogn)
* Prove that the running time of QUICKSORT is O(n^2) when the array A is sorted in non-increasing order
    * the partitioning routine produces one region with n - 1 elements and one region with 1 element
    * This results in a recursion tree with a height of O(log n), which gives a worst-case running time of O(n^2)

#### Average-case


---
# Decision tree
* Abstraction of any comparison sort
* counting only comparisons
* Each internal node is labeled by indices of array elements from their original positions

![91C0485D-DC86-4A68-B7D3-D747175EBFD9](https://hackmd.io/_uploads/rJci98Wb0.jpg)

* **Big-O = Decision tree 的樹高 (Root到Leaf最長的Path) = O(n lg n)**
* the smallest possible depth of a leaf in a decision tree for a comparison sort
    * n-1 for an input array of length n
    * 對一個 leaf 而言，best case 為去比較sort好的數列，只要比較 n-1 次就可以判斷大小順序完成 sorting，其depth 為 n-1(假設 root depth 為 0)
## Theorem
> Any decision tree that sorts n elements has height Ω(nlgn)
> **Heapsort** and **merge sort** are asymptotically optimal comparison sort
*  **≧ n!** leaves on the decision tree
    *  because every permutation appears at least once
* Lemma: Any binary tree of height h has ≦ $2^h$ leaves
    * l = # of leaves, h = height, Then l ≦ $2^h$

#### ==Proof== 必考
:::info
l ≧ n!
By lemma, n! ≦ l ≦ $2^h$ or $2^h$ ≧ n!

Take logs: h ≧ lg(n!)
Use Stirling’s approximation: n! ＞ $(n/e)^n$ (by equation (3.16))
h ≧ $lg(n/e)^n$
= n lg(n/e)
= n lg n － n lg e
= Ω( n lg n)
:::


---
