## Lecture 07. Sorting
- Sorting problem
- Sorting Algorithms
- Sorting in Python
---
### Sorting problem
- Given an array that contains n elements, your task is to sort the elements in **increasing** order.
---
### Sorting algorithms
- $O(n^2)$ algorithms
- $O(n\log n)$ algorithms
- $O(n)$ algorithms
---
#### $O(n^2)$ algorithms
- Bubble sort
- Selection sort
- Insertion sort
---
#### Bubble sort?
```python=
for i in range(len(a)):
for j in range(i+1, len(a)):
if a[i] > a[j]:
a[i], a[j] = a[j], a[i]
```
---
#### Exchange sort
<img src="https://cdn.ucode.vn/uploads/1/upload/hnAkYESK.png"/>
---
#### Exchange sort
```python=
for i in range(len(a)):
for j in range(i+1, len(a)):
if a[i] > a[j]:
a[i], a[j] = a[j], a[i]
```
---
#### Bubble sort
<img src="https://cdn.ucode.vn/uploads/1/upload/eMxaKJIY.png"/>
---
#### Bubble sort?
```python=
for i in range(len(a)):
for j in range(len(a)-1, i, -1):
if a[j-1] > a[j]:
a[j-1], a[j] = a[j], a[j-1]
```
---
#### Selection Sort
<img src="https://cdn.ucode.vn/uploads/1/upload/OSJVFHoM.png"/>
---
#### Selection Sort
```python=
for i in range(len(a)):
imin = i
for j in range(i + 1, len(a)):
if a[j] < a[imin]:
imin = j
a[i], a[imin] = a[imin], a[i]
```
---
#### Insertion Sort
<img src="https://cdn.ucode.vn/uploads/1/upload/xDYzhvuX.png"/>
---
#### Insertion Sort
```python=
for i in range(1, len(a)):
key = a[i]
j = i - 1
while j >= 0 and key < a[j]:
a[j + 1] = a[j]
j -= 1
a[j + 1] = key
```
---
#### Insertion Sort: improvement
- Binary search?
- Shell Sort: insertion sort with **decreasing gaps**
---
#### Shell Sort
<img src="https://cdn.ucode.vn/uploads/1/upload/CZywjgXt.png"/>
<img src="https://cdn.ucode.vn/uploads/1/upload/dQSINuma.png"/>
Note:
gap = 4
---
#### Shell Sort
<img src="https://cdn.ucode.vn/uploads/1/upload/zPxFYGqa.png"/>
<img src="https://cdn.ucode.vn/uploads/1/upload/sbSuhWGr.png"/>
Note:
gap = 2
---
#### Shell Sort
```python=
gap = len(a) // 2
while gap > 0:
for i in range(gap, len(a)):
key = a[i]
j = i
while j >= gap and a[j - gap] > key:
a[j] = a[j - gap]
j -= gap
a[j] = key
gap //= 2
```
---
#### Shell Sort: Gap sequence
- Original: $\frac{n}{2}$, $\frac{n}{4}$, ..., $1$ $\rightarrow$ $O(n(\log n)^2)$
- Knuth's: $1, 4, 13, ..., \frac{3^k-1}{2}$ $\rightarrow$ $O(n^{\frac{3}{2}})$
- Hibbard's: $1, 3, 7, ..., 2^k-1$ $\rightarrow$ $O(n^{\frac{3}{2}})$
- Sedgewick's: $1, 8, 23, 77, ..., 4^{i+1}+3\times 2^k+1$ $\rightarrow$ $O(n^{\frac{4}{3}})$
---
### $O(n \log n)$ algorithms
- Merge Sort
- Quick Sort
- Heap Sort
---
#### Merge Sort
<img src="https://cdn.ucode.vn/uploads/1/upload/xWeezgWJ.png" width="60%"/>
---
#### Merge Sort
```python=
def merge_sort(a):
if len(a) < 2:
return
r = len(a) // 2
L, R = a[:r], a[r:]
merge_sort(L)
merge_sort(R)
i = j = k = 0
while i < len(L) and j < len(R):
if L[i] < R[j]: a[k] = L[i]; i += 1
else: a[k] = R[j]; j += 1
k += 1
while i < len(L):
a[k] = L[i]; i += 1; k += 1
while j < len(R):
a[k] = R[j]; j += 1; k += 1
```
---
#### Quick Sort
<img src="https://cdn.ucode.vn/uploads/1/upload/lssbmYAi.png"/>
---
#### Quick Sort
```python=
def quick_sort(a, L, R):
if L >= R: return
i, j = L, R
pivot = a[R]
while i <= j:
while a[i] < pivot: i += 1
while a[j] > pivot: j -= 1
if i <= j:
a[i], a[j] = a[j], a[i]
i, j = i + 1, j - 1
quick_sort(a, L, j)
quick_sort(a, i, R)
quick_sort(a, 0, len(a) - 1)
```
---
#### Heap Sort
- Using heap data structure: next lesson (tree)
- Manual implementation (without `heapq`): next lesson
---
#### Heap Sort
```python=
from heapq import heappush, heappop
def heapsort(a):
h = []
for v in a:
heappush(h, v)
return [heappop(h) for i in range(len(h))]
```
---
### $O(n)$ algorithms
- Bucket Sort: $O(n + k)$
- Counting sort
- Radix Sort
---
#### Bucket Sort
<img src="https://cdn.ucode.vn/uploads/1/upload/DjGIeOfV.png" width="80%"/>
---
#### Bucket Sort
- Complexity: heavily depends on the bucket mechanism
- Average: $O(n + k)$, where $k$ - average number of items in each bucket.
Note:
Assume using O(n) sorting algorithm to sort these k elements
---
#### Counting Sort
- Bucket sort with single value in each bucket
- Count the number of occurrences of each bucket's value
- Can use a distribution counting array, or simply a `dict`.
---
#### Counting Sort
<img src="https://cdn.ucode.vn/uploads/1/upload/exTXIWIi.png"/>
<img src="https://cdn.ucode.vn/uploads/1/upload/vjvhZDTl.png"/>
---
#### Counting Sort
Complexity:
- Using counting array: $O(n + k)$
---
#### Radix Sort
<img src="https://cdn.ucode.vn/uploads/1/upload/kDqonhzx.png"/>
---
#### Sorting algorithms: complexity
<img src="https://cdn.ucode.vn/uploads/1/upload/xRPvaPBG.png"/>
---
#### Sorting algorithms: stability
- A **stable** sorting algorithm **maintains the relative order** of the items with equal sort keys.
<img src="https://cdn.ucode.vn/uploads/1/upload/chSEzKXM.png"/>
---
#### Unstable Sorting algorithms
<img src="https://cdn.ucode.vn/uploads/1/upload/nqvPEAlr.png"/>
---
#### Sorting algorithms: stability
- Stable: Bubble, Insertion, Merge, Timsort
- Unstable: Selection, Quick Sort, Shell Sort, Heap Sort
---
### Sorting in Python
Sorting list `a`:
- `a.sort()`: increasing, in place, no returns
- `b = sorted(a)`: return a new sorted list
- `a.sort(reverse=True)`: decreasing
---
### Sorting in Python
- Sorts are guaranteed to be **stable**, this lets us build complex sorts in a series of sorting steps
- Algorithm: **Timsort**: a hybrid algorithm derived from **merge sort** and **insertion sort**.
- $O(n\log n)$ on average, best case: $O(n)$
---
#### Key Functions
A function to be called on each list element prior to making comparisons.
```python=
s = "This is a test string from Andrew"
print(sorted(s.split(), key=str.lower))
# ['a', 'Andrew', 'from', 'is', 'string', 'test', 'This']
```
---
#### Key Functions
```python=
students = [
('john', 'A', 15), # name, grade, age
('jane', 'B', 12),
('dave', 'B', 10),
]
# sort by age
print(sorted(students, key=lambda student: student[2]))
# [('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
```
---
#### Key Functions
```python=
from operator import itemgetter
students = [
('john', 'A', 15), # name, grade, age
('jane', 'B', 12),
('dave', 'B', 10),
]
# sort by age
print(sorted(students, key=itemgetter(2)))
# [('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
```
---
#### Key Functions
```python=
student_objects = [
Student('john', 'A', 15),
Student('jane', 'B', 12),
Student('dave', 'B', 10),
]
# sort by age
print(sorted(student_objects, key=lambda student: student.age))
# [('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
```
---
#### Key Functions
```python=
from operator import attrgetter
student_objects = [
Student('john', 'A', 15),
Student('jane', 'B', 12),
Student('dave', 'B', 10),
]
# sort by age
print(sorted(student_objects, key=attrgetter("age")))
# [('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
```
---
#### Comparison Functions
- Computes the **relative ordering** for two inputs.
- `cmp(a, b)` will return
- a **negative** value for **less-than**
- **zero** if the inputs are **equal**
- a **positive** value for **greater-than**
---
#### Comparison Functions
- `sort()` and `sorted()` functions don't support comparison function as a parameter.
- `functools.cmp_to_key`: wrap the comparison function to make it usable as a **key function**.
---
#### Comparison Functions
```python=
from functools import cmp_to_key
def cmp(r1, r2):
if r1.w * r1.h < r2.w * r2.h: return -1
if r1.w * r1.h > r2.w * r2.h: return 1
if r1.w < r2.w: return -1
return 0
# sort list of rectangles in ascending order of their areas
rects.sort(key=cmp_to_key(cmp))
```
---
## The End
---
{"metaMigratedAt":"2023-06-17T19:46:11.787Z","metaMigratedFrom":"YAML","title":"Thuc Nguyen's ADSA Course - Lecture 07. Sorting","breaks":true,"description":"View the slide with \"Slide Mode\".","slideOptions":"{\"theme\":\"white\"}","contributors":"[{\"id\":\"476f9158-9a39-4a2d-bb09-ce08dd7eb978\",\"add\":9012,\"del\":389}]"}