## Lecture 07. Sorting - Sorting problem - Sorting Algorithms - Sorting in Python --- ### Sorting problem - Given an array that contains n elements, your task is to sort the elements in **increasing** order. --- ### Sorting algorithms - $O(n^2)$ algorithms - $O(n\log n)$ algorithms - $O(n)$ algorithms --- #### $O(n^2)$ algorithms - Bubble sort - Selection sort - Insertion sort --- #### Bubble sort? ```python= for i in range(len(a)): for j in range(i+1, len(a)): if a[i] > a[j]: a[i], a[j] = a[j], a[i] ``` --- #### Exchange sort <img src="https://cdn.ucode.vn/uploads/1/upload/hnAkYESK.png"/> --- #### Exchange sort ```python= for i in range(len(a)): for j in range(i+1, len(a)): if a[i] > a[j]: a[i], a[j] = a[j], a[i] ``` --- #### Bubble sort <img src="https://cdn.ucode.vn/uploads/1/upload/eMxaKJIY.png"/> --- #### Bubble sort? ```python= for i in range(len(a)): for j in range(len(a)-1, i, -1): if a[j-1] > a[j]: a[j-1], a[j] = a[j], a[j-1] ``` --- #### Selection Sort <img src="https://cdn.ucode.vn/uploads/1/upload/OSJVFHoM.png"/> --- #### Selection Sort ```python= for i in range(len(a)): imin = i for j in range(i + 1, len(a)): if a[j] < a[imin]: imin = j a[i], a[imin] = a[imin], a[i] ``` --- #### Insertion Sort <img src="https://cdn.ucode.vn/uploads/1/upload/xDYzhvuX.png"/> --- #### Insertion Sort ```python= for i in range(1, len(a)): key = a[i] j = i - 1 while j >= 0 and key < a[j]: a[j + 1] = a[j] j -= 1 a[j + 1] = key ``` --- #### Insertion Sort: improvement - Binary search? - Shell Sort: insertion sort with **decreasing gaps** --- #### Shell Sort <img src="https://cdn.ucode.vn/uploads/1/upload/CZywjgXt.png"/> <img src="https://cdn.ucode.vn/uploads/1/upload/dQSINuma.png"/> Note: gap = 4 --- #### Shell Sort <img src="https://cdn.ucode.vn/uploads/1/upload/zPxFYGqa.png"/> <img src="https://cdn.ucode.vn/uploads/1/upload/sbSuhWGr.png"/> Note: gap = 2 --- #### Shell Sort ```python= gap = len(a) // 2 while gap > 0: for i in range(gap, len(a)): key = a[i] j = i while j >= gap and a[j - gap] > key: a[j] = a[j - gap] j -= gap a[j] = key gap //= 2 ``` --- #### Shell Sort: Gap sequence - Original: $\frac{n}{2}$, $\frac{n}{4}$, ..., $1$ $\rightarrow$ $O(n(\log n)^2)$ - Knuth's: $1, 4, 13, ..., \frac{3^k-1}{2}$ $\rightarrow$ $O(n^{\frac{3}{2}})$ - Hibbard's: $1, 3, 7, ..., 2^k-1$ $\rightarrow$ $O(n^{\frac{3}{2}})$ - Sedgewick's: $1, 8, 23, 77, ..., 4^{i+1}+3\times 2^k+1$ $\rightarrow$ $O(n^{\frac{4}{3}})$ --- ### $O(n \log n)$ algorithms - Merge Sort - Quick Sort - Heap Sort --- #### Merge Sort <img src="https://cdn.ucode.vn/uploads/1/upload/xWeezgWJ.png" width="60%"/> --- #### Merge Sort ```python= def merge_sort(a): if len(a) < 2: return r = len(a) // 2 L, R = a[:r], a[r:] merge_sort(L) merge_sort(R) i = j = k = 0 while i < len(L) and j < len(R): if L[i] < R[j]: a[k] = L[i]; i += 1 else: a[k] = R[j]; j += 1 k += 1 while i < len(L): a[k] = L[i]; i += 1; k += 1 while j < len(R): a[k] = R[j]; j += 1; k += 1 ``` --- #### Quick Sort <img src="https://cdn.ucode.vn/uploads/1/upload/lssbmYAi.png"/> --- #### Quick Sort ```python= def quick_sort(a, L, R): if L >= R: return i, j = L, R pivot = a[R] while i <= j: while a[i] < pivot: i += 1 while a[j] > pivot: j -= 1 if i <= j: a[i], a[j] = a[j], a[i] i, j = i + 1, j - 1 quick_sort(a, L, j) quick_sort(a, i, R) quick_sort(a, 0, len(a) - 1) ``` --- #### Heap Sort - Using heap data structure: next lesson (tree) - Manual implementation (without `heapq`): next lesson --- #### Heap Sort ```python= from heapq import heappush, heappop def heapsort(a): h = [] for v in a: heappush(h, v) return [heappop(h) for i in range(len(h))] ``` --- ### $O(n)$ algorithms - Bucket Sort: $O(n + k)$ - Counting sort - Radix Sort --- #### Bucket Sort <img src="https://cdn.ucode.vn/uploads/1/upload/DjGIeOfV.png" width="80%"/> --- #### Bucket Sort - Complexity: heavily depends on the bucket mechanism - Average: $O(n + k)$, where $k$ - average number of items in each bucket. Note: Assume using O(n) sorting algorithm to sort these k elements --- #### Counting Sort - Bucket sort with single value in each bucket - Count the number of occurrences of each bucket's value - Can use a distribution counting array, or simply a `dict`. --- #### Counting Sort <img src="https://cdn.ucode.vn/uploads/1/upload/exTXIWIi.png"/> <img src="https://cdn.ucode.vn/uploads/1/upload/vjvhZDTl.png"/> --- #### Counting Sort Complexity: - Using counting array: $O(n + k)$ --- #### Radix Sort <img src="https://cdn.ucode.vn/uploads/1/upload/kDqonhzx.png"/> --- #### Sorting algorithms: complexity <img src="https://cdn.ucode.vn/uploads/1/upload/xRPvaPBG.png"/> --- #### Sorting algorithms: stability - A **stable** sorting algorithm **maintains the relative order** of the items with equal sort keys. <img src="https://cdn.ucode.vn/uploads/1/upload/chSEzKXM.png"/> --- #### Unstable Sorting algorithms <img src="https://cdn.ucode.vn/uploads/1/upload/nqvPEAlr.png"/> --- #### Sorting algorithms: stability - Stable: Bubble, Insertion, Merge, Timsort - Unstable: Selection, Quick Sort, Shell Sort, Heap Sort --- ### Sorting in Python Sorting list `a`: - `a.sort()`: increasing, in place, no returns - `b = sorted(a)`: return a new sorted list - `a.sort(reverse=True)`: decreasing --- ### Sorting in Python - Sorts are guaranteed to be **stable**, this lets us build complex sorts in a series of sorting steps - Algorithm: **Timsort**: a hybrid algorithm derived from **merge sort** and **insertion sort**. - $O(n\log n)$ on average, best case: $O(n)$ --- #### Key Functions A function to be called on each list element prior to making comparisons. ```python= s = "This is a test string from Andrew" print(sorted(s.split(), key=str.lower)) # ['a', 'Andrew', 'from', 'is', 'string', 'test', 'This'] ``` --- #### Key Functions ```python= students = [ ('john', 'A', 15), # name, grade, age ('jane', 'B', 12), ('dave', 'B', 10), ] # sort by age print(sorted(students, key=lambda student: student[2])) # [('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)] ``` --- #### Key Functions ```python= from operator import itemgetter students = [ ('john', 'A', 15), # name, grade, age ('jane', 'B', 12), ('dave', 'B', 10), ] # sort by age print(sorted(students, key=itemgetter(2))) # [('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)] ``` --- #### Key Functions ```python= student_objects = [ Student('john', 'A', 15), Student('jane', 'B', 12), Student('dave', 'B', 10), ] # sort by age print(sorted(student_objects, key=lambda student: student.age)) # [('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)] ``` --- #### Key Functions ```python= from operator import attrgetter student_objects = [ Student('john', 'A', 15), Student('jane', 'B', 12), Student('dave', 'B', 10), ] # sort by age print(sorted(student_objects, key=attrgetter("age"))) # [('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)] ``` --- #### Comparison Functions - Computes the **relative ordering** for two inputs. - `cmp(a, b)` will return - a **negative** value for **less-than** - **zero** if the inputs are **equal** - a **positive** value for **greater-than** --- #### Comparison Functions - `sort()` and `sorted()` functions don't support comparison function as a parameter. - `functools.cmp_to_key`: wrap the comparison function to make it usable as a **key function**. --- #### Comparison Functions ```python= from functools import cmp_to_key def cmp(r1, r2): if r1.w * r1.h < r2.w * r2.h: return -1 if r1.w * r1.h > r2.w * r2.h: return 1 if r1.w < r2.w: return -1 return 0 # sort list of rectangles in ascending order of their areas rects.sort(key=cmp_to_key(cmp)) ``` --- ## The End ---
{"metaMigratedAt":"2023-06-17T19:46:11.787Z","metaMigratedFrom":"YAML","title":"Thuc Nguyen's ADSA Course - Lecture 07. Sorting","breaks":true,"description":"View the slide with \"Slide Mode\".","slideOptions":"{\"theme\":\"white\"}","contributors":"[{\"id\":\"476f9158-9a39-4a2d-bb09-ce08dd7eb978\",\"add\":9012,\"del\":389}]"}
    154 views