# Sorting
###### tags: `LeetCode Notes`
The focus is on knowing **the tradeoffs of the various algorithms**, along with the overarching themes and implementation details.
Tutorial animations:
https://www.youtube.com/playlist?list=PLOMGwySGA-vZgkBTPMLK5VzW78g8iCqG-
## :memo: 0. C++ sort() with a Custom Comparator
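This section is otherwise empty, so here is a minimal sketch of the three common ways to pass a custom ordering to `std::sort` (the names `byAbs`, `sortedByAbs`, etc. are my own, for illustration only):

```
#include <algorithm>
#include <cstdlib>
#include <functional>
#include <vector>

// A named comparator: order by absolute value, ascending.
// std::sort expects a "less-than" predicate that returns true
// exactly when lhs must be placed before rhs.
bool byAbs(int lhs, int rhs) {
    return std::abs(lhs) < std::abs(rhs);
}

// Sort a copy three different ways, purely to demonstrate the forms.
std::vector<int> sortedByAbs(std::vector<int> v) {
    std::sort(v.begin(), v.end(), byAbs);               // 1. function pointer
    return v;
}

std::vector<int> sortedDescending(std::vector<int> v) {
    std::sort(v.begin(), v.end(),
              [](int a, int b) { return a > b; });      // 2. lambda
    return v;
}

std::vector<int> sortedDescendingGreater(std::vector<int> v) {
    std::sort(v.begin(), v.end(), std::greater<int>()); // 3. function object
    return v;
}
```

Note that `std::sort` requires a strict weak ordering: the comparator must return `false` for equal elements, which is why the examples use strict `<` and `>` rather than `<=` or `>=`.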

## :memo: Queue Reconstruction by Height (Medium)


### Approach
Sort by height in descending order, breaking ties by k in ascending order; then insert each person at index k of the result.
```
class Solution {
public:
    vector<vector<int>> reconstructQueue(vector<vector<int>>& people) {
        sort(people.begin(), people.end(), [&](const auto& L, const auto& R) {
            if (L[0] == R[0]) {
                return L[1] < R[1];
            }
            return L[0] > R[0];
        });
        vector<vector<int>> res;
        for (int i = 0; i < people.size(); i++) {
            res.insert(res.begin() + people[i][1], {people[i][0], people[i][1]});
        }
        return res;
    }
};
```
## :memo: 1. Comparison-Based Sorts
* Bubble Sort
* Selection Sort
* Insertion Sort
* Heap Sort
## :memo: Bubble Sort
1. The core idea of bubble sort is to repeatedly sweep through the list, swapping adjacent elements that are out of order, until a full pass makes no swaps; at that point the list is sorted.
2. stable
3. time: O(n^2)
4. space: O(1)
5. Overall, bubble sort is simple to implement and stable, but beyond that it has few desirable properties: it is **slow for most inputs** and is therefore rarely used in practice.
```
#include <stdbool.h>

void BubbleSort(int* nums, int numsSize){
    bool hasSwapped = true;
    while (hasSwapped)
    {
        hasSwapped = false;
        for (int i = 0; i < numsSize - 1; i++)
        {
            if (nums[i] > nums[i + 1])
            {
                int temp = nums[i];
                nums[i] = nums[i + 1];
                nums[i + 1] = temp;
                hasSwapped = true;
            }
        }
    }
}
```
## :memo: Selection Sort
1. Performs few swaps: each pass moves only that pass's minimum element, so it suits situations where swaps are expensive
2. Not stable: with equal-valued elements, e.g. [B1, B2, A] where A < B, the first pass swaps A with B1, producing [A, B2, B1] and reversing the relative order of B1 and B2
3. time: O(n^2)
4. space: O(1)
```
void SelectionSort(int* nums, int numsSize){
    for (int i = 0; i < numsSize; i++)
    {
        int min_index = i;
        for (int j = i + 1; j < numsSize; j++)
        {
            if (nums[min_index] > nums[j])
            {
                min_index = j;
            }
        }
        int temp = nums[i];
        nums[i] = nums[min_index];
        nums[min_index] = temp;
    }
}
```
## :memo: Insertion Sort
1. Well suited to nearly sorted arrays: walk forward through the array, swapping the current element backward until the element before it is no larger
2. stable
3. time: O(n^2)
4. space: O(1)
5. Many library sorting functions check the size of the collection and, below a threshold, fall back to insertion sort.
```
public class Solution {
    public void insertionSort(int[] arr) {
        // Mutates elements in arr by inserting out of place elements into
        // appropriate index repeatedly until arr is sorted
        for (int i = 1; i < arr.length; i++) {
            int currentIndex = i;
            while (currentIndex > 0 && arr[currentIndex - 1] > arr[currentIndex]) {
                // Swap elements that are out of order
                int temp = arr[currentIndex];
                arr[currentIndex] = arr[currentIndex - 1];
                arr[currentIndex - 1] = temp;
                currentIndex -= 1;
            }
        }
    }
}
```
## :memo: Insertion Sort List (Medium)


### Approach
My own three-pointer attempt got Time Limit Exceeded. The sample code also uses three pointers: initially a = dummy, b = dummy->next, c = dummy->next->next, with dummy->next = head. During each pass, whenever c->val < b->val, set a->next = c, b->next = c->next, c->next = b; then advance with a = b, b = c, c = c->next. Repeat passes until no swap occurs, and finally return ptr->next.
```
struct ListNode* insertionSortList(struct ListNode* head){
    if (head == NULL)
    {
        return NULL;
    }
    struct ListNode dummy;
    struct ListNode* ptr = &dummy;
    ptr->next = head;
    int flag = 0;
    while (!flag)
    {
        flag = 1;
        struct ListNode* a = ptr;
        struct ListNode* b = ptr->next;
        struct ListNode* c = ptr->next->next;
        while (c)
        {
            if (c->val < b->val)
            {
                flag = 0;
                a->next = c;
                b->next = c->next;
                c->next = b;
            }
            a = b;
            b = c;
            c = c->next;
        }
    }
    return ptr->next;
}
```
## :memo: Heap Sort
1. Improves on selection sort by using a special data structure, the heap, to find the next element quickly
2. not stable
3. time: O(nlogn)
4. space: O(1)
5. In practice, heapsort performs worse than other O(nlogn) sorts because of poor cache locality: it swaps elements based on their positions in the heap, so reads jump to seemingly random indices, causing many cache misses and a practical performance hit.

```
public class Solution {
    public void heapSort(int[] arr) {
        // Mutates elements in arr by utilizing the heap data structure
        for (int i = arr.length / 2 - 1; i >= 0; i--) {
            maxHeapify(arr, arr.length, i);
        }
        for (int i = arr.length - 1; i > 0; i--) {
            // swap last element with first element
            int temp = arr[i];
            arr[i] = arr[0];
            arr[0] = temp;
            // note that we reduce the heap size by 1 every iteration
            maxHeapify(arr, i, 0);
        }
    }

    private void maxHeapify(int[] arr, int heapSize, int index) {
        int left = 2 * index + 1;
        int right = 2 * index + 2;
        int largest = index;
        if (left < heapSize && arr[left] > arr[largest]) {
            largest = left;
        }
        if (right < heapSize && arr[right] > arr[largest]) {
            largest = right;
        }
        if (largest != index) {
            int temp = arr[index];
            arr[index] = arr[largest];
            arr[largest] = temp;
            maxHeapify(arr, heapSize, largest);
        }
    }
}
```
## :memo: 2. Non-Comparison-Based Sorts
* Counting Sort
* Radix Sort
* Bucket Sort
## :memo: Counting Sort
1. Example: A = [5, 4, 5, 5, 1, 1, 3] → count_array = [0, 2, 0, 1, 1, 3, 0], startingIndices (prefix sums of the counts) = [0, 0, 2, 2, 3, 4, 7]
2. stable
3. time: O(n+k), where k is the maximum value in the array
4. space: O(n+k), since we initialize a new array of size n and count arrays of size k+1
5. It requires extra memory, while many comparison sorts can be implemented without any.
6. When the range of possible values k is large compared to n, counting sort may actually perform worse than a theoretically slower O(nlogn) sort as a result of the extra memory overhead and the additional k operations that must be performed.
```
#include <stdlib.h>

void CountingSort(int* nums, int numsSize){
    // k is the maximum value in the array; the count arrays need k + 1 slots
    int k = 0;
    for (int i = 0; i < numsSize; i++)
    {
        if (nums[i] > k)
        {
            k = nums[i];
        }
    }
    int* nums_copy = (int*) malloc(sizeof(int) * numsSize);
    int* CountArray = (int*) calloc(k + 1, sizeof(int));
    int* StartingIndices = (int*) calloc(k + 1, sizeof(int));
    int sum = 0;
    for (int i = 0; i < numsSize; i++)
    {
        CountArray[nums[i]]++;
        nums_copy[i] = nums[i];
    }
    for (int i = 1; i <= k; i++)
    {
        sum += CountArray[i - 1];
        StartingIndices[i] = sum;
    }
    for (int i = 0; i < numsSize; i++)
    {
        nums[StartingIndices[nums_copy[i]]] = nums_copy[i];
        StartingIndices[nums_copy[i]]++;
    }
    free(nums_copy);
    free(CountArray);
    free(StartingIndices);
}
```
## :memo: Minimum Absolute Difference (Easy)

### Approach
Sort with counting sort and compute the minimum difference first, then scan the sorted array for consecutive elements whose difference equals that minimum. It had been a long time since I wrote dynamic memory in C, so the *returnSize handling felt rusty.
```
int** minimumAbsDifference(int* arr, int arrSize, int* returnSize, int** returnColumnSizes){
    // values lie in [-10^6, 10^6]; shift by 10^6 into [0, 2 * 10^6]
    // static: two 2,000,001-int arrays as locals would overflow the stack
    static int CountArray[2000001];
    static int StartingIndices[2000001];
    memset(CountArray, 0, sizeof(CountArray));
    memset(StartingIndices, 0, sizeof(StartingIndices));
    int* arr_copy = (int*) malloc(sizeof(int) * arrSize);
    int sum = 0;
    int** result = (int**) malloc(sizeof(int*) * arrSize);
    *returnSize = 0;
    int minimumabs = INT_MAX;
    for (int i = 0; i < arrSize; i++)
    {
        arr_copy[i] = arr[i];
        CountArray[arr[i] + 1000000]++;
    }
    for (int i = 1; i < 2000001; i++)
    {
        sum += CountArray[i - 1];
        StartingIndices[i] = sum;
    }
    for (int i = 0; i < arrSize; i++)
    {
        arr[StartingIndices[arr_copy[i] + 1000000]] = arr_copy[i];
        StartingIndices[arr_copy[i] + 1000000]++;
    }
    free(arr_copy);
    for (int i = 0; i < arrSize - 1; i++)
    {
        int diff = arr[i + 1] - arr[i]; // array is sorted, so non-negative
        if (diff < minimumabs)
        {
            minimumabs = diff;
        }
    }
    *returnColumnSizes = (int*) malloc(sizeof(int) * arrSize);
    for (int i = 0; i < arrSize - 1; i++)
    {
        if (arr[i + 1] - arr[i] == minimumabs)
        {
            (*returnColumnSizes)[*returnSize] = 2;
            result[*returnSize] = (int*) malloc(sizeof(int) * 2);
            result[*returnSize][0] = arr[i];
            result[(*returnSize)++][1] = arr[i + 1];
        }
    }
    return result;
}
```
## :memo: Radix Sort
1. Least Significant Digit (LSD) radix sort: sorts on one digit position per pass, from the ones place up through the tens, hundreds, and so on
2. stable
3. time: O(w(n+k)), where w is the maximum digit length among the integers, n is the size of the input array, and k is the alphabet size (10 for decimal digits)
4. space: O(n+k)
5. requires some overhead memory
6. It must examine every digit, since more significant digits processed in later passes have more influence on the final sorted order.
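Unlike the other algorithms in these notes, radix sort has no standalone snippet, so here is a minimal sketch of LSD radix sort for non-negative integers (decimal digits, k = 10); `radixSort` and its internals are my own names:

```
#include <algorithm>
#include <vector>

// LSD radix sort for non-negative integers.
// Each pass is a stable counting sort on one decimal digit,
// from least significant to most significant.
void radixSort(std::vector<int>& nums) {
    if (nums.empty()) return;
    int maxVal = *std::max_element(nums.begin(), nums.end());
    std::vector<int> aux(nums.size());
    for (long long exp = 1; maxVal / exp > 0; exp *= 10) {
        int count[10] = {0};
        for (int x : nums) count[(x / exp) % 10]++;
        // prefix sums turn counts into the end position of each digit bucket
        for (int d = 1; d < 10; d++) count[d] += count[d - 1];
        // fill from the back so each pass stays stable
        for (int i = (int)nums.size() - 1; i >= 0; i--) {
            aux[--count[(nums[i] / exp) % 10]] = nums[i];
        }
        nums = aux;
    }
}
```

The stability of each counting pass is what makes the whole sort correct: ties on the current digit preserve the order established by the previous, less significant digits.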
## :memo: Query Kth Smallest Trimmed Number (Medium)

### Approach
Studied the sample code to understand it.
```
class Solution {
public:
    vector<int> smallestTrimmedNumbers(vector<string>& v, vector<vector<int>>& queries) {
        vector<int> res;
        for (auto& q : queries) {
            vector<int> init;
            for (int i = 0; i < v.size(); i++) {
                init.push_back(i);
            }
            res.push_back(helper(v, q, init));
        }
        return res;
    }
    int helper(vector<string>& v, vector<int>& q, vector<int>& subset) {
        vector<int> mp[10];
        for (auto i : subset) {
            mp[v[i].at(v[i].length() - q[1]) - '0'].push_back(i);
        }
        q[1]--;
        for (int j = 0; j <= 9; j++) {
            if (q[0] > mp[j].size()) {
                q[0] -= mp[j].size();
            }
            else if (q[1] == 0) {
                return mp[j][q[0] - 1];
            }
            else {
                return helper(v, q, mp[j]);
            }
        }
        return -1;
    }
};
```
## :memo: Maximum Gap (Hard)

### Approach
Sort with radix sort (any sort works), then answer with the maximum difference between consecutive elements.
```
class Solution {
public:
    int maximumGap(vector<int>& nums) {
        if (nums.size() < 2) {
            return 0;
        }
        int maxVal = *max_element(nums.begin(), nums.end());
        int exp = 1;
        int radix = 10;
        vector<int> aux(nums.size());
        while (maxVal / exp > 0) {
            vector<int> count(radix, 0);
            for (int i = 0; i < nums.size(); i++) {
                count[(nums[i] / exp) % 10]++;
            }
            for (int i = 1; i < count.size(); i++) {
                count[i] += count[i - 1];
            }
            for (int i = nums.size() - 1; i >= 0; i--) {
                aux[--count[(nums[i] / exp) % 10]] = nums[i];
            }
            for (int i = 0; i < nums.size(); i++) {
                nums[i] = aux[i];
            }
            exp *= 10;
        }
        int maxGap = 0;
        for (int i = 0; i < nums.size() - 1; i++) {
            maxGap = max(nums[i + 1] - nums[i], maxGap);
        }
        return maxGap;
    }
};
```
## :memo: Bucket Sort
1. Place every element of the input into a bucket, where each bucket accepts a range of values.
2. stable (when the per-bucket sort is stable, e.g. insertion sort)
3. time: O(n+k) on average, O(n^2) in the worst case
4. space: O(n+k) for k buckets
5. The worst-case time complexity is O(n^2) when the sorting algorithm used on each bucket is insertion sort.
6. In the worst case, all elements land in one bucket, and the running time degrades to the worst case of insertion sort (all elements in reverse order).
7. If the intermediate sort has a worst-case running time of O(nlogn), then the worst case of bucket sort is also O(nlogn).
8. On average, when elements are distributed reasonably uniformly across buckets, bucket sort runs in O(n+k) for k buckets.
```
import java.util.*;

public class Solution {
    public void bucketSort(int[] arr, int K) {
        List<List<Integer>> buckets = new ArrayList<>(K);
        for (int i = 0; i < K; i++) {
            buckets.add(new ArrayList<>());
        }
        int shift = Arrays.stream(arr).min().getAsInt();
        int maxValue = Arrays.stream(arr).max().getAsInt() - shift;
        // place elements into buckets
        double bucketSize = (double) maxValue / K;
        if (bucketSize < 1) {
            bucketSize = 1.0;
        }
        for (int elem : arr) {
            // same as K * (elem - min) / (max - min)
            int index = (int) ((elem - shift) / bucketSize);
            if (index == K) {
                // put the max value in the last bucket
                buckets.get(K - 1).add(elem);
            } else {
                buckets.get(index).add(elem);
            }
        }
        // sort individual buckets
        for (List<Integer> bucket : buckets) {
            Collections.sort(bucket);
        }
        // convert sorted buckets into final output
        List<Integer> sortedList = new ArrayList<Integer>();
        for (List<Integer> bucket : buckets) {
            sortedList.addAll(bucket);
        }
        // common practice is to mutate the original array with the sorted elements
        for (int i = 0; i < arr.length; i++) {
            arr[i] = sortedList.get(i);
        }
    }
}
```
## :memo: 3. Sorting Algorithm Summary

| Algorithm | Time | Space | Stable |
| --- | --- | --- | --- |
| Bubble Sort | O(n^2) | O(1) | yes |
| Selection Sort | O(n^2) | O(1) | no |
| Insertion Sort | O(n^2) | O(1) | yes |
| Heap Sort | O(nlogn) | O(1) | no |
| Counting Sort | O(n+k) | O(n+k) | yes |
| Radix Sort (LSD) | O(w(n+k)) | O(n+k) | yes |
| Bucket Sort | avg O(n+k), worst O(n^2) | O(n+k) | yes |

## :memo: 4. Segment Tree
Tutorial video: https://www.youtube.com/watch?v=xztU7lmDLv8&t=2s&ab_channel=StableSort
Tutorial page: https://www.hackerearth.com/practice/data-structures/advanced-data-structures/segment-trees/tutorial/
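As a companion to the links, a minimal sketch of the bottom-up iterative segment tree for range sums (the `SegmentTree` struct and its API are my own names; the same leaf-shift and parent-halving index arithmetic appears in the Count of Smaller solution later in these notes):

```
#include <vector>

// Iterative segment tree for range-sum queries.
// tree[1] is the root; the n leaves live at indices [n, 2n).
struct SegmentTree {
    int n;
    std::vector<long long> tree;

    explicit SegmentTree(const std::vector<int>& data)
        : n((int)data.size()), tree(2 * data.size(), 0) {
        for (int i = 0; i < n; i++) tree[n + i] = data[i];
        for (int i = n - 1; i >= 1; i--) tree[i] = tree[2 * i] + tree[2 * i + 1];
    }

    // Point update: set data[index] = value, then fix all ancestors.
    void update(int index, long long value) {
        int i = index + n;
        tree[i] = value;
        for (i /= 2; i >= 1; i /= 2) tree[i] = tree[2 * i] + tree[2 * i + 1];
    }

    // Sum over the half-open range [left, right).
    long long query(int left, int right) const {
        long long result = 0;
        for (int l = left + n, r = right + n; l < r; l /= 2, r /= 2) {
            if (l % 2 == 1) result += tree[l++]; // l is a right child
            if (r % 2 == 1) result += tree[--r]; // r - 1 is a left child
        }
        return result;
    }
};
```

Both update and query walk one root-to-leaf path (or two), so each costs O(logn).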

## :memo: 5. Fenwick (Binary Indexed) Tree
Tutorial video: https://www.youtube.com/watch?v=uSFzHCZ4E-8&ab_channel=StableSort
Tutorial page: https://www.hackerearth.com/practice/data-structures/advanced-data-structures/fenwick-binary-indexed-trees/tutorial/
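To complement the links, a minimal prefix-sum sketch of the Fenwick tree (the `Fenwick` struct is my own name). The key trick is that `index & -index` isolates the lowest set bit, which determines how large a range each tree node covers:

```
#include <vector>

// Fenwick (Binary Indexed) Tree for prefix sums.
// Internally 1-indexed; tree[0] is an unused dummy slot.
struct Fenwick {
    std::vector<long long> tree;

    explicit Fenwick(int n) : tree(n + 1, 0) {}

    // Add delta to the element at 0-based position index.
    void update(int index, long long delta) {
        for (int i = index + 1; i < (int)tree.size(); i += i & (-i)) {
            tree[i] += delta;
        }
    }

    // Sum of the 0-based half-open prefix [0, index).
    long long query(int index) const {
        long long result = 0;
        for (int i = index; i >= 1; i -= i & (-i)) {
            result += tree[i];
        }
        return result;
    }
};
```

Compared with a segment tree it supports only prefix (not arbitrary associative) queries, but it needs half the memory and both operations are still O(logn).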

## :memo: Count of Smaller Numbers After Self (Hard)
Given an integer array nums, return an integer array counts where counts[i] is the number of smaller elements to the right of nums[i].

### Approach: Segment Tree
```
class Solution {
public:
    vector<int> countSmaller(vector<int>& nums) {
        int offset = 1e4;       // offset negative to non-negative
        int size = 2 * 1e4 + 1; // total possible values in nums
        vector<int> tree(size * 2);
        vector<int> result;
        for (int i = nums.size() - 1; i >= 0; i--) {
            int smaller_count = query(0, nums[i] + offset, tree, size);
            result.push_back(smaller_count);
            update(nums[i] + offset, 1, tree, size);
        }
        reverse(result.begin(), result.end());
        return result;
    }
    // implement segment tree
    void update(int index, int value, vector<int>& tree, int size) {
        index += size; // shift the index to the leaf
        // update from leaf to root
        tree[index] += value;
        while (index > 1) {
            index /= 2;
            tree[index] = tree[index * 2] + tree[index * 2 + 1];
        }
    }
    int query(int left, int right, vector<int>& tree, int size) {
        // return sum of [left, right)
        int result = 0;
        left += size; // shift the index to the leaf
        right += size;
        while (left < right) {
            // if left is a right node
            // bring the value and move to parent's right node
            if (left % 2 == 1) {
                result += tree[left];
                left++;
            }
            // else directly move to parent
            left /= 2;
            // if right is a right node
            // bring the value of the left node and move to parent
            if (right % 2 == 1) {
                right--;
                result += tree[right];
            }
            // else directly move to parent
            right /= 2;
        }
        return result;
    }
};
```
### Approach: Binary Indexed Tree (Fenwick Tree)
```
class Solution {
public:
    vector<int> countSmaller(vector<int>& nums) {
        int offset = 1e4;       // offset negative to non-negative
        int size = 2 * 1e4 + 2; // total possible values in nums plus one dummy
        vector<int> tree(size);
        vector<int> result;
        for (int i = nums.size() - 1; i >= 0; i--) {
            int smaller_count = query(nums[i] + offset, tree);
            result.push_back(smaller_count);
            update(nums[i] + offset, 1, tree, size);
        }
        reverse(result.begin(), result.end());
        return result;
    }
    // implement Binary Indexed Tree
    void update(int index, int value, vector<int>& tree, int size) {
        index++; // index in BIT is 1 more than the original index
        while (index < size) {
            tree[index] += value;
            index += index & -index;
        }
    }
    int query(int index, vector<int>& tree) {
        // return sum of [0, index)
        int result = 0;
        while (index >= 1) {
            result += tree[index];
            index -= index & -index;
        }
        return result;
    }
};
```
## :memo: Review
There are many sorting algorithms; remember each one with an animation like those in the tutorial videos above.