Chap. 07- Sort

# Chap. 07 - Sort > 參考書目 : Fundamentals of Data Structures in C (2nd), Ellis H., Sartaj S., Susan A.-F. > > 其他科目內容請參見 [[Here]](https://hackmd.io/@ChenZE/By4SOO6Jyl) ## Content * [簡單排序](#簡單排序) * [1. Bubble sort](#1-Bubble-sort) * [2. Selection sort](#2-Selection-sort) * [3. Insertion sort](#3-Insertion-sort) * [4. Shell sort](#4-Shell-sort) * [高效排序](#高效排序) * [1. Merge sort](#1-Merge-sort) * [1.1 Top-down merge sort](#11-Top-down-merge-sort) * [1.2 Bottom-up merge sort](#12-Bottom-up-merge-sort) * [2. Quick sort](#2-Quick-sort) * [3. Heap sort](#3-Heap-sort) * [排序演算法的極限](#排序演算法的極限) * [混合排序](#混合排序) * [1. Tim sort](#1-Tim-sort) * [特殊排序](#特殊排序) * [1. Bucket sort](#1-Bucket-sort) * [2. Counting sort](#2-Counting-sort) * [3. Key-based sort](#3-Key-based-sort) * [4. Radix sort](#4-Radix-sort) ## 簡單排序 ### 1. Bubble sort > The following code is saved in `sorted.c`. ```c= // bubble sort void bubbleSort(int arr[], int n){ int temp; for (int i = 0; i < n-1; i++){ for (int j = 0; j < n-i-1; j++){ if (arr[j] > arr[j+1]){ temp = arr[j+1]; arr[j+1] = arr[j]; arr[j] = temp; } } } } ``` ### 2. Selection sort 最簡單的排序法之一，**每次挑選==最小==(最大)元素放到正確的位置**，過程如下： * 從陣列左側(index=0)開始，搜尋整個 unsorted array 並挑選最小值 * 將最小值與 unsorted array 的左側元素(index=0)交換 * 此時左邊(index=0)會是 sorted array，右邊會是 unsorted array * 從下一個 index 開始繼續上述步驟，直到排序完成 > The following code is saved in `sorted.c`. ```c= // selection sort void selectionSort(int arr[], int n){ int temp; for (int i = 0; i < n - 1; i++){ int minIndex = i; // compare and search the minimum element for (int j = i + 1; j < n; j++){ if (arr[j] < arr[minIndex]){ minIndex = j; } } // swap the minimum and current element temp = arr[i]; arr[i] = arr[minIndex]; arr[minIndex] = temp; } } ``` #### Analysis * Time complexity: $O(n^2)$ 因為排序過程中每個位置都需要尋找最小值才能決定 * Space complexity: $O(1)$ * Advantage * Unstable * In-place ### 3. Insertion sort <img src="https://media.geeksforgeeks.org/wp-content/uploads/20240802210251/Insertion-sorting.png" width="600"> (圖片來源：[GeeksforGeeks](https://www.geeksforgeeks.org/insertion-sort-algorithm/)) Insertion sort 的基本概念是將元素**插入到一個已經排序好的 array 之中**並放在**正確的位置**，插入後的陣列依然是**已經排序**好的。假設有兩個陣列，一個已經排序好(sorted)，另一個尚未排序(unsorted)。Insertion sort 的過程就是每次從 unsorted array 中挑選一個數字插入到 sorted array 中的正確位置。 :::info 可以想像成類似卡牌的插入，手上有兩疊卡牌，一疊有排序一疊沒有，從沒有排序的卡牌中每次挑選一張放在已排序卡牌中的正確位置。 ::: Insertion 的**關鍵第一步**是如何在一個**尚未排序的 array 中==選出已經排序好的 sub-array==** * 將 array 的**首項元素 `arr[0]` 視為 sorted array** * 將第 2 個元素 `arr[1]` 與第 1 個元素 `arr[0]` 做**比較(comparison)** * 若 `arr[1] < arr[0]` 則交換(swap)兩者位置 * 若 `arr[1] >= arr[0]` 則不變 * 繼續拿第 3 個元素往前與前 2 個元素**依序做比較插入** :::info 交換(swap)的過程中需要第三方元素來暫存舊的值，最後再換回。以 $A \leftrightarrow B$ 互換為例： * $A \rightarrow X$ * $B \rightarrow A$ * $X \rightarrow B$ 在進行 insertion sort 時因為需要做連續的比較交換，所以可以等**最後確定後再將元素放入正確位置** ::: > The following code is saved in `sorted.c`. ```c= // insertion sort void insertionSort(int arr[], int n){ int i; // i point the bound of sorted array. int j; // j used to compare with key and swap them. int key; // an element which is going to be compared and inserted correct position. for (i = 1; i < n; i++){ key = arr[i]; j = i - 1; while ((j >= 0) && (arr[j] > key)){ arr[j+1] = arr[j]; /** if you want to track the change of input array in each iteration * try these two lines. * arr[j] = key; * printArray(arr, n); */ j = j - 1; } arr[j+1] = key; } } ``` 每一次迭代過程中都要將 `key` 放入正確的位置，所以將 `key` 不斷往前比較 * 比 `key` 大，交換：`arr[j+1] = arr[j];` * 比 `key` 小，正確位置，將 `key` 插入：`arr[j+1] = key;` #### Analysis * Time complexity * Best case 最好的的情況是 array 本身已經排序好，時間複雜度為 $O(n)$ * Worse case 如果 array 是反向的遞減排序(reverse sorted)，則每次插入的元素都需要不斷比較交換直到放在第一個，時間複雜度為 $O(n^2)$ * Average case 如果 array 是隨機排序，時間複雜度為 $O(n^2)$ * Space complexity 排序過程不需要額外的空間，空間複雜度為 $O(1)$，是一種 in-place 的排序法。 * Advantage * 對長度較小且快排好的 array 有比較好的效率，適用於長度較小的 array * Stable * In-place :::info 排序演算法的穩定性(stability)指的兩個元素值相同的情況下，排序後的順序與排序前的順序相同。例如考慮以下陣列 $$ [6,\ 2,\ 3^*,\ 4,\ 1,\ 3, 9] $$ 若某個排序演算法執行後的結果是 $$ [1,\ 2,\ 3^*,\ 3,\ 4,\ 6,\ 9] $$ 則該排序演算法具有 stable 的性質，因為 $3^*$ 與 $3$ 的順序相同。反之若結果為 $$ [1,\ 2,\ 3,\ 3^*,\ 4,\ 6,\ 9] $$ 則為 unstable 的排序法 ::: ### 4. Shell sort Shell sort 是基於 insertino sort 的以下性質進行改良的排序演算法： * 對幾乎排好的資料來說 insertion sort 的執行效率高 * 但大部分情況效率很差，因為排序過程中每次只能移動一個元素為了解決 insertion sort 的第二個缺點，shell sort 採用的是將原始資料切割成小資料，因為小資料可以視為幾乎排序好的陣列，對每個陣列都進行 insertion sort 後可將原始資料排序完成 Step-by-step： * 依照間隔長度 h，每 h 個元素取出來形成 subarray * 以 insertion sort 將每個 subarray 排序(排序後的 subarray 稱為 h-sorted array) * 縮小間隔長度 h 重複第一步以 [23, 1, 10, 5, 2] 為例，若 h = 2，則 subarray 為 [23, 10, 2] 與 [1, 5] > The following code is saved in `sorted.c`. ```c= //shell sort void shellSort(int arr[], int n){ int h; // control subarray size int i, j; // used to perform insertion sort for (h = n / 2; h > 0; h /= 2){ for (i = h; i < n; i++){ int temp = arr[i]; for (j = i; j >= h && arr[j-h] > temp; j -= h){ arr[j] = arr[j-h]; } arr[j] = temp; } } } ``` ## 高效排序 ### 1. Merge sort 合併排序法(merge sort)是一種利用 **divide and conquer** 的概念所衍生的排序演算法。依照合併方向的不同可以分為兩種作法： * Top-down(recursion ver.) * Bottom-up(iteration ver.) #### 1.1 Top-down merge sort <img src="https://media.geeksforgeeks.org/wp-content/uploads/20230706153706/Merge-Sort-Algorithm-(1).png" width=600> (圖片來源：[GreekforGreek](https://www.geeksforgeeks.org/merge-sort/)) 第一種 merge sort 是**由上而下(top-down)的方向**，是遞迴版本的 merge sort。將長度為 n 的陣列不斷分割成大小相同的兩個子陣列，每個子陣列，將子陣列排序完後再依序合併。 **Step-by-step：** * 使用遞迴的方式不斷**分割輸入陣列**，直到不能分割為止(長度=1) * 從最**小的子陣列開始獨自排序** * 排序完後使用 **merge algorithm 兩兩合併**，直到所有子陣列合併成輸入陣列長度 > The following code is saved in `sorted.c`. ```c= // merge sort void merge(int arr[], int left, int middle, int right, int *mergedArray){ /** merge left and right sub-array and assign to mergedArray * Parameter: * left: index of left point which is beginning of left sub-array * middle: index of middle point which is the partition point also * right: index of right point which is the end of right sub-array */ int i = left; // control left sub-array int j = middle + 1; // control right sub-array int k = left; // control merged list while ((i <= middle) && (j <= right)){ if (arr[i] <= arr [j]){ mergedArray[k++] = arr[i++]; // assign i/k then plus 1 } else{ mergedArray[k++] = arr[j++]; } } // assign remaining sub-array if another done while (i <= middle){ mergedArray[k++] = arr[i++]; } while (j <= right){ mergedArray[k++] = arr[j++]; } // copy to original array for (i=left; i<=right; i++){ arr[i] = mergedArray[i]; } } void topDownMergeSort(int arr[], int left, int right, int *mergedArray){ int middle = left + (right - left) / 2; if (left < right){ topDownMergeSort(arr, left, middle, mergedArray); // divide left sub-array topDownMergeSort(arr, middle+1, right, mergedArray); // divide right sub-array merge(arr, left, middle, right, mergedArray); /** * if you want to show the sorting procedure * add a parameter 'n' which is the lenght of input array 'arr' * then uncomment the following code * printArray(arr, n); // show sorting pdocedure */ } } ``` #### 1.2 Bottom-up merge sort Top-down 的 merge sort 是以遞迴的方式呈現，比較**適合用在小資料**上，若是在大資料上做遞迴函式則會需要較大的 stack space，空間不夠可能會導致**溢位(overflow)** Bottom-up 是 top-down 的反向做法，排序過程**不會涉及陣列的分割(partition)**，因為預設每次的分割長度都是以 $2^n$ 為單位。 **Step-by-step：** 給定一個 size = n 的輸入陣列， * 對於 size = 1 的子陣列可以視為已經排序好的子陣列 * 將所有 size = 1 的子陣列**兩兩合併**，形成 size = 2 的已排序子陣列 * 對於所有 size = 2 的子陣列兩兩合併，形成 size = 4 的已排序子陣列 * 依此類推，直到合併兩個 size = n / 2 的已排序子陣列 > The following code is saved in `sorted.c`. ```c= // merge sort #define MIN(x, y) ((x) < (y) ? (x) : (y)) void bottomUpMergeSort(int arr[], int n, int *mergedArray){ int currSize;// current size of each sub-array int left; // leftmost index of left sub-array // merge sub-arry of size n to create sorted array of size 2*n for (currSize = 1; currSize <= n-1; currSize *= 2){ // pick beginning point of each sub-array of size currSize for (left = 0; left < n-1 ; left += 2 * currSize){ int middle = MIN(left + currSize - 1, n - 1); // end index of left sub-array int right = MIN(left + 2 * currSize - 1, n - 1); // end index of right sub-array merge(arr, left, middle, right, mergedArray); //printArray(arr, n); // if you want to show the sorting process, uncomment this line } } } ``` * **外層**迴圈控制的是 **array size** * 合併的陣列長度不能超過輸入陣列(`currSize = n - 1`) * 每次合併的陣列長度是以**指數形式遞增**(`currSize *= 2`) * **內層**迴圈是計算兩兩**合併過程中陣列的==頭尾==位置** * 左子陣列的起點必須小於倒數第 2 個位置(`left < n - 1`) * 合併後的子陣列長度為 2 * currSize，所以左子陣列的計算必須每次遞增 2 倍長度(`left += 2 * currSize`) * 左子陣列的起點位置必須在 `left + currSuze - 1` 與最後一個元素 `n-1` 做選擇，避免超出輸入陣列的邊界 * 右子陣列的結束位置必須在 `left + 2 * currSize - 1` 與最後一個元素 `n - 1` 做選擇，避免超出輸入陣列的邊界 #### Analysis * Time complexity 兩個方式的時間複雜度都是 $O(n\cdot \log n)$，因為分割過程中樹的高度是 $\log n$ 且每層都有 n 個元素需要做排序。 * Space complexity 不論是 top-down 或 bottom-up 的做法，在排序過程中都需要一個額外的陣列來儲存合併後的已排序陣列(not in-place)，所以空間複雜度都是 $O(n)$。但因為遞迴版本會不斷呼叫函式並儲存在 stack 中，因此從記憶體的角度上看，遞迴版本的空間複雜度是 $O(n+\log n)$ * Advantage * Stable 在合併過程中，會優先選擇較小的元素，但當元素大小相同時，會優先選擇放在左子陣列，所以元素會保持輸入時的相對順序，是一種 stable 的排序方法。 * Not inplace ### 2. Quick sort ![image](https://media.geeksforgeeks.org/wp-content/uploads/20240926172924/Heap-Sort-Recursive-Illustration.webp) (圖片來源：[GreekforGreek](https://www.geeksforgeeks.org/quick-sort-algorithm/)) 快速排序法(quick sort)同樣是依據 **divide and conquer** 的精神來進行排序，將要排序的**大陣列依序拆解成許多小陣列**，當小陣列排序好之後，則大陣列也會排序完成。 * 選定一個 **樞紐(pivot)** 並將 array 中的元素依照下列規則**重新排列並分解**成兩個 sub-array * $\le$ pivot 的數放在 pivot 左邊 * $\ge$ pivot 的數放在 pivot 右邊 * 對左右兩個 sub-array 繼續依照上述步驟遞迴分割(partition) * 直到最後的 array 只剩一個元素時就排序完成 (類似 binary search tree 的規則) :::info 做 array partition 的目的是將每次**選定的 pivot 放在正確的位置上** ::: 快速排序法的實作關鍵在於以下兩點：(1) 如何挑選 pivot 與 (2) 如何對 array 做 partition (1) How to choose pivot 一般來說 pivot 會直接挑選每次迭代過程中 array 的首項或末項元素。但如果剛好選到最大或最小值，則樹的高度會 = n。而如果挑到的 pivot 剛好是中位數，則可以形成一個高度為 $\log n$ 的 balance binary tree。因次為了避免選到最大或最小值，可以每次選擇 3 個數並取三數的中位數做為 pivot (2) How to partition an array 迭代過程中分割 array 的過程有很多種，常見的有 1. Naive partition Naive partition 在分割過程中，會複製整個 array。將較小的元素放在 left sub-array，較大的元素放在 right sub-array。空間複雜度為 $O(n)$ 2. Lomuto partition Lomuto partition 在分割過程中會追蹤較小的元素。從陣列首項開始迭代，當遇到較小(smaller than pivot)的元素就跟當前的元素交換位置 3. **Hoave's partition** Hoave's partition 使用 two pointer 的方式來分割陣列，選定 pivot 之後(通常是陣列第一個元素) * 當左指標元素 > pivot 且右指標元素 < pivot 時交換位置 * 直到左指標 index >= 右指標 index 則完成分割 > The following code is saved in `sorted.c`. ```c= // quick sort void swap(int *a, int *b){ int temp; temp = *a; *a = *b; *b = temp; } int partition(int arr[], int left, int right){ /** Hoare's Partition Algorithm * use two pointers to divide the input array * this method also called Hoare's Partition Algorithm * Parameter: * left: index of the beginning element of arr * right: index of the last element of arr */ int leftPointer = left - 1; int rightPointer = right + 1; int pivot = arr[left]; while (1){ do { // left pointer leftPointer++; } while (arr[leftPointer] < pivot); do { // right pointer rightPointer--; } while (arr[rightPointer] > pivot); if (leftPointer >= rightPointer) return rightPointer; swap(&arr[leftPointer], &arr[rightPointer]); } } void quickSort(int arr[], int left, int right){ /** use Hoare's partition to implement quick sort * Parameter: * left: index of beginning element. * right: index of last element. */ if (left < right){ int partiionIndex = partition(arr, left, right); quickSort(arr, left, partiionIndex); // sort left array quickSort(arr, partiionIndex+1, right); // sort right array } } ``` #### Analysis * Time complexity * Worse case 如上所述，每次在挑選 pivot 時，如果剛好都選到最大或最小，則會形成一個高度 = n 的 skewed tree，時間複雜度為 $O(n^2)$ * Best case 同樣的，如果每此都剛好選到中位數，則會形成一顆高度 = $\log n$ 的 balanced tree，時間複雜度為 $O(n \log n)$ * Average case：$O(n \log n)$ * Space complexity 依照迭代不同，空間複雜度也不同，通常是 $O(\log n)$ 或 $O(n)$ * Advantage * Unstable * Not in-place ### 3. Heap sort 堆積排序(heap sort)是使用 max heap 的結構來實現排序的過程。因此在實際排序之前，必須先將輸入陣列的順序改成 max heap 的結構。 * 將輸入陣列調整成 max heap * 將 root 與 max heap 中的最後元素 last 交換位置，此時 last 的值最大，已經排序完成 * 移除 last * 最後將剩下元素再次調整成 max heap，重回第二步驟 :::info Heap 的陣列表示法中，習慣上會用 1-index 表示根節點的位置來簡化計算。此處因為要進行排序所以使用 0-index 來表示根節點的位置。 ::: 在 heapSort 前半段的目的是要讓輸入陣列形成最大堆積，所以從底層的葉子節點開始往上處理，但因為葉子節點可以視為已經排序好的 heap，所以只要找最後一個非葉節點($\lfloor \frac{n}{2} \rfloor - 1$)的位置，再往上處理即可。 > The following code is saved in `sorted.c`. ```c= // heap sort void heapify(int arr[], int root, int n){ /** rearrange the array(0-index) to form a max heap * Parameter: * arr: input array * root: index of root * n: array length */ int child = 2 * root + 1; // left node by default int rootValue = arr[root]; while (child < n){ if ( (child + 1 < n) && (arr[child] < arr[child + 1])){ child++; } if (arr[child] <= rootValue){ break; } else { arr[root] = arr[child]; root = child; // sink down and check next level child = 2 * root + 1; } } arr[root] = rootValue; } void heapSort(int arr[], int n){ // heapify arr to form max heap for (int i = (n-1)/2; i >= 0; i--){ heapify(arr, i, n); } for (int j = n-1; j > 0; j--){ swap(&arr[0], &arr[j]); // printArray(arr, n); // if you want to show the procedure, uncomment this line heapify(arr, 0, j); } } ``` #### Analysis * Time complexity * Worse/best/average case 的時間複雜度均為 $O(n \log n)$。 * Space complexity * 需看實作的過程，如果使用迴圈實現，則空間複雜度為 $O(1)$，但如果 `heapify()` 使用遞迴方式進行，則空間複雜度為 $O(n \log n)$。 * Advantage * Unstable * In-place ## 排序演算法的極限以比較和交換兩個動作為主的排序演算法中，假設有 n 個元素的陣列需要排列，則可能的排序情況有 $n!$ 種，以樹狀圖結構的示意圖如下 <img src="https://hackmd.io/_uploads/rJtRyKtr1x.jpg" width=500> 因為有 $n!$ 種可能的排序，所以會有 $n!$ 個葉節點。以高度為 $k$ 的 binary tree 來說，葉節點的個數為 $2^k$，故當葉節點數為 $n!$ 時 $$ n! = 2^k \Rightarrow k = \log_2(n!) $$ 考慮到 0-index 的編號系統，因此樹的高度為 $k+1 = \log_2(n!) + 1$ 因次，以比較和交換為主的排序演算法中，最差的時間複雜度為 $O(n \log n)$ ## 混合排序 ### 1. Tim sort ## 特殊排序 ### 1. Bucket sort 容器排序法(bucket sort)又稱為 bin sort ，是一種特殊排序法，能夠在 best case 的情況下做到 $O(n)$ 的時間複雜度。 Step-by-Step： * 將元素分配到不同的 group/bucket * 對於每個 group/bucket 內部在進行排序，通常是用 insertion sort，因為此時陣列長度變小 * 最後把所有的 group/bucket 依序合併在進行分組時，會使用類似長方圖找組距的方式決定每一組的長度，再將元素進行映射，轉換到對應的組別。 > THe following code is saved in `sorted.c`. ```c= // bucket sort int getMax(int arr[], int n){ int max = arr[0]; for (int i=0; i<n; i++){ if (arr[i] > max) max = arr[i]; } return max; } int getMin(int arr[], int n){ int min = arr[0]; for (int i=0; i<n; i++){ if (arr[i] < min) min = arr[i]; } return min; } void bucketSort(int arr[], int n){ /** bucket sort * Variable: * bucketCount: setup the bucket number ,if the tange between max and * min is small, then bucketCount can't too large. * buckets: an 2-D array which save all buckets * bucketSize: an 1-D array which save bucket size corresponding to buckets[i] */ int bucketCount = 2; int min = getMin(arr, n); int max = getMax(arr, n); int range = (max - min) / bucketCount; int **buckets = (int **) malloc(bucketCount * sizeof(int *)); // each bucket int *bucketSize = (int *) calloc(bucketCount, sizeof(int *)); // record each bucket size // allocate each bucket size for (int i=0; i<bucketCount; i++){ buckets[i] = (int *) malloc(n * sizeof(int*)); // maximum capacity of each bucket } // scatter the element to each bucket for (int i=0; i<n; i++){ int index = (arr[i] - min) / range; if (index == bucketCount) index -= 1; buckets[index][bucketSize[index]++] = arr[i]; } // sort each bucket for (int i=0; i<bucketCount; i++){ if (bucketSize[i] > 0) insertionSort(buckets[i], bucketSize[i]); } // merge all bucket int index = 0; for (int i=0; i<bucketCount; i++){ for (int j=0; j<bucketSize[i]; j++){ arr[index++] = buckets[i][j]; } free(buckets[i]); } free(buckets); free(bucketSize); } ``` #### Analysis * Worse case: $O(n^2)$ 當 bucket 中的元素分佈不均，某個 bucket 拿到所有元素時就會退化成 insertion sort。但複雜度還是依據使用的排序方式為主。 * Best case: $O(n+k)$ or $O(n)$ 當 bucket 中的元素均勻分佈 * Space complexity: $O(n+k)$ ### 2. Counting sort 計數排序(counting sort) 是一種非比較的排序法，核心思想是計算元素出現的次數進行排序，適合用在輸入數值不大的非負整數陣列中，能夠做到 $O(n)$ 的時間，但需要 $O(n+m)$ 空間，其中 m 是陣列的最大值。整個過程可以分成 3 個步驟： <img src="https://hackmd.io/_uploads/rJsXTy-8ke.jpg" width=400> 1. 計算出現次數(頻率) * 找到輸入陣列的最大值 m * 建立 size = m + 1 的計數陣列 `countArr` 並計算元素出現的次數 2. 計算計數陣列 `countArr` 的累加和(presum) <img src="https://hackmd.io/_uploads/HJXDTJZL1g.jpg" width=400> 3. 從輸入陣列的末端開始迭代 * 找 `countArr` 中 index = `inputArr[i]` 的元素 * `countArr[index] - 1` 的值就是 `inputArr[i]`` 需要放置的位置 * 更新 `countArr[index]` > The following code is saved in `sorted.c`. ```c= // counting sort void countSort(int arr[], int n){ int max = getMax(arr, n); // compute frequency of each element int *countArr = calloc(max + 1, sizeof(arr[0])); for (int i = 0; i < n; i++) countArr[arr[i]]++; // compute presum of countArr for (int i = 1; i < max + 1; i++) countArr[i] += countArr[i-1]; // sort from last to beginning int *sortedArr = malloc(sizeof(arr[0])*n); for (int i = n - 1; i >= 0; i--) sortedArr[--countArr[arr[i]]] = arr[i]; // copy to original array for (int i = 0; i < n; i++) arr[i] = sortedArr[i]; free(countArr); free(sortedArr); } ``` ### 3. Key-based sort 想像 array 中的元素有多個鍵值 $K^1$, $K^2$, $K^n$可以做排序。以撲克牌為例有分為花色跟點數兩種鍵值 * 鍵值 $K^1$(花色)：♠️ > ♥️ > ♦️ > ♣️ * 鍵值 $K^2$(點數)：1 > 2 > ... > 13 依據鍵值(key)的重要性進行排序可分為兩種：最重要位優先(Most-Significant-Digit-First, MSD)與最不重要位優先(Least-Significant-Digits-First, LSD) 假設 key 的重要性從 $K^1$ 開始依序遞減，以上述撲克牌為例，對於 MSD 來說： * 先根據 $K^1$ 的值排序成四堆 * 每堆中再依據 $K^2$ 的值排序對於 LSD 來說： * 先依據 $K^2$ 的值分成 13 堆 * 再依據 $K^1$ 的值使用 stable 的排序法分成四堆 * 最後將每堆結果合併 :::info 不論是 MSD 還是 LSD 都只是描述鍵值的順序，不會涉及排序。 ::: ### 4. Radix sort 基數排序(radix sort)也是一種 key-based 的排序方法，藉由比較每個位數(digit)來達到線性的複雜度。適合用在固定長度的整數(integer)或字串(string)等資料的排序。因為是多鍵值的排序方法，所以可以依據鍵值的重要性分為 LSD 與 MSD。假設輸入陣列如下，因為最大數字有三位(digit=3)，所以必須迭代三次才能排序完成。 $$ [170,\ 45,\ 75,\ 90,\ 802,\ 84,\ 2,\ 66] $$ 根據個位數(digit=1)的排序結果為： $$ [17 \underline0,\ 9 \underline0,\ 80 \underline2,\ \underline2,\ 2 \underline4,\ 4 \underline5,\ 7 \underline5,\ 6 \underline6] $$ 再根據十位數(digit=2)的排序結果為： $$ [8 \underline02,\ 2,\ \underline24,\ \underline45,\ \underline66,\ 1 \underline70,\ \underline75,\ \underline90] $$ 最後一句百位數(digit=3)的排序結果為： $$ [2,\ 24,\ 45,\ 66,\ 75,\ 90,\ \underline170,\ \underline802] $$ 在每一個 digit 排序的過程中，原則上可以使用任何的排序演算法，只要保證 stability 即可，不過習慣上會使用 counting sort 進行。 > The following code is saved in `sorted.c`. ```c= // radix sort void sortedDigit(int arr[], int n, int digit){ // compute frequency int *countArr = calloc(10, sizeof(arr[0])); // digit from 0 to 9 for (int i = 0; i < n; i++) countArr[(arr[i] / digit) % 10]++; // conpute presum for (int i = 1; i < 10; i++) countArr[i] += countArr[i-1]; // sort int *sortedArr = malloc(n * sizeof(arr[0])); for (int i = n - 1; i >= 0; i--) sortedArr[--countArr[(arr[i] / digit) % 10]] = arr[i]; // copy to original array for (int i = 0; i < n; i++) arr[i] = sortedArr[i]; free(countArr); free(sortedArr); } void radixSort(int arr[], int n){ int max = getMax(arr, n); for (int digit = 1; max / digit > 0; digit *= 10){ sortedDigit(arr, n, digit); /** printArray(arr, n); * if want to show sorting process * uncomment this line */ } } ``` ## Reference * [Rust Algorithm CLub](https://rust-algo.club/index.html) * [GeeksforGeeks](https://www.geeksforgeeks.org) ## Interview Ques. ### 1. Quick sort ```c= #include <stdio.h> void swap(int *, int *); void quickSort(int, int, int []); int main(void) { int arr[] = {1, 5, 4, 3, 2}; int n = sizeof(arr) / sizeof(int); quickSort(0, n-1, arr); printf("arr = "); for (int i = 0; i < n; i++) { printf("%d ", arr[i]); } return 0; } void swap(int *a, int *b) { int temp = *a; *a = *b; *b = temp; } void quickSort(int left, int right, int arr[]) { if (left >= right) { return; } int mid = (left + right) / 2; int pivot = arr[mid]; int L = left; int R = right; while (L <= R) { // find smaller than pivot while (arr[L] < pivot) { L++; } // find greater than pivot while (arr[R] > pivot) { R--; } if (L <= R) { swap(&arr[L], &arr[R]); L++; R--; } } quickSort(left, R, arr); quickSort(L, right, arr); } ```