Divide and Conquer

# Divide and Conquer ###### tags: `Divide and Conquer` >Prerequisite: None ## Content [ToC] ## Recurrence We have studied recurrence during high school. Recursion can be implemented on the problem that the current state of the problem can be solved by the previous state of the problem. For example, we should be familiar with Fibonacci sequence, where its recurrence relation can be represent as follow: ``` Let Fn be the n-th Fibonacci number, F1 = 1, F2 = 1, Fn = Fn-1+ Fn-2, if n ≥ 3. ``` If we use computer language to represent a state of a Fibonacci number, then it can use a function to represent. The pseudo code is as follow: ``` function Fibonacci(n): if n == 1 || n == 2: return 1; else return Fibonicci(n-1) + Fibonicci(n-2); ``` ## Divide and Conquer Divide and Conquer can be implement in three steps: **Divide** the problem into a number of subproblems that are smaller instances of the same problem. **Conquer** the subproblems by solving them recursively. If the subproblem sizes are small enough, however, just solve the subproblems in a straightforward manner. **Combine** the solutions to the subproblems into the solution for the original problem. ### Merge Sort We have discussed merge sort in [sorting part](https://hackmd.io/@h-ZsK3uKQU-EzblplsWowA/DrAmor_Sorting). In merge sort, we divide the array into two smaller subarrays and we can directly get the sorted subarray when the size of the subarray is equal to 1. Then, we can merge those subarrays into the original array. Therefore, we can formally write the procedure of merge sort as follow: **Divide** the array into two similar size subarrays. **Conquer** the subarray when the size is equal to 1 (directly return the number). **Combine** two subarray into the bigger subarray with linear merging. ## The maximum Subarray Problem ### Problem statement ``` Input: An array A[1,2,...,n]. Output: A nonempty subarray A[i,i+1,...,j] that having the largest sum S[i,j] Where S[i,j] = ai+...+aj. ``` The problem can be solved by the brute-force method in which we can examine all possible subarrays and calculate the largest among all subarrays. Obviously, we will need $O(n^2)$ to find all possible subarrays. Also, we will need $O(n)$ to search the maximum subarray. The overall complexity will be $O(n^3)$. However, we can use Divide & Conquer to solve the problem. Consider that if we divide the original array into two parts, we will get the maximum subarray in three conditions, entire in left part, entire in right part, and the subarray cross the left and right part. Therefore, the recursion can be the function as follow: ``` function Max-Subarray(A[], p, r): if p == q: return (p,p,A[p]); else: q = (p+r)/2; (i1,j1,s1) = Max-Subarray(A, p, q); (i2,j2,s2) = Max-Subarray(A, q+1, r); (i3,j3,s3) = Cross(A, p, q, r); if s1 ≥ s2 && s1 ≥ s3: return (i1,j1,s1); else if s2 ≥ s3: return (i2,j2,s2); else: return (i3,j3,s3); ``` Then, we can handle the condition that the maximum is crossing the midpoint of the array. If the maximum is crossing the array, we can first find the maximum start from the midpoint to the left-most number of the array. Also, find the maximum array start from the midpoint to the end of the array as the code below: ``` function Cross(A[], p, q, r): s1 = inf; sum = 0; for i = q down 1: sum += A[i]; if sum > s1: s1 = sum; max_i = i; sum = 0; s2 = inf; for j = q+1 to r: sum += A[j]; if sum > s2: s2 = sum; max_right = j; return (max_i, max_j, s1+s2); ``` ### C++ Code ```cpp= #include <iostream> #include <stdio.h> using namespace std; struct pairs{ int left; int right; int sum; }; pairs Cross(int A[], int p, int q, int r){ pairs P; int s1 = INT_MIN; int sum = 0; int max_left = 0, max_right = 0; for(int i = q; i >= p; i--){ sum += A[i]; if(sum >= s1){ s1 = sum; max_left = i; } } int s2 = INT_MIN; sum = 0; for(int i = q+1; i <= r; i++){ sum += A[i]; if(sum >= s2){ s2 = sum; max_right = i; } } P.left = max_left; P.right = max_right; P.sum = s1+s2; return P; } pairs FindMaxSubarray(int A[], int p, int r){ if(p == r){ pairs tmp; tmp.left = p; tmp.right = p; tmp.sum = A[p]; return tmp; } else{ int q = (p+r)/2; pairs left, right, cross; left = FindMaxSubarray(A, p, q); right = FindMaxSubarray(A, q+1, r); cross = Cross(A, p, q, r); if(left.sum >= right.sum && left.sum >= cross.sum) return left; else if(right.sum >= cross.sum) return right; else return cross; } } int main(){ int A[] = {13,-3,-25,20,-3,-16,-23,18,20,-7,12,-5,-22,15,-4,7}; int n = sizeof(A)/sizeof(A[0]); pairs P = FindMaxSubarray(A, 0, n-1); cout << P.sum << endl; return 0; } ``` ## Kth Largest Element in an Array ### Problem Statment The question is selected from Leetcode, the original problem can be referred to [here](https://leetcode.com/problems/kth-largest-element-in-an-array/). ``` Given an integer array nums and an integer k, Return the kth largest element in the array. ``` ### Example ``` Input: A[] = {1,4,35,444,3,46,76,88,24,23,2345,65} k = 5; Output: 65 ``` ### Algorithm The straightforward method will be that we first sort the array and directly return the k-th largest element of the sorted array. We can use [Quick Sort](https://hackmd.io/@h-ZsK3uKQU-EzblplsWowA/DrAmor_Sorting) we have discussed in the sorting topic. However, the average complexity of sorting will be $O(nlogn)$. We hope to get the answer in complexity $O(n)$. Therefore, consider the procedure in Quick Sort. Once we have decided the position of the pivot in function, we can get the correct position of the pivot. Therefore, we can next check whether the k-th largest number is pivot or not. If the pivot position is smaller than the k-1, then we will now move to the left subarray divided by pivot, vice versa. If the pivot is equal to the kth largest number, then we can directly return the pivot. As the pseudo code below. ### Pseudo Code ``` function k-largest(A[], p, r, k): q = partition(A,p,r); if k-1 == q: return A[q]; else if k-1 < q: return k-largest(A, p, q-1, k); else: return k-largest(A, q+1, r, k); ``` ### Complexity The average case will be that the pivot will always divide the array into two same size subarrays. Therefore, the recursion relation can be described as $T(n) = T(n/2) + O(n)$. By recursion tree method, we can get the average complexity to be $O(n)$. ### C++ Code ```cpp= #include <iostream> using namespace std; void Swap(int &a, int &b){ int tmp = a; a = b; b = tmp; } int partition(int A[], int p, int r){ int i = p-1; int key = A[r]; for(int j = p; j < r; j++){ if(key <= A[j]){ i++; Swap(A[i], A[j]); } } Swap(A[i+1], A[r]); return i+1; } int QuickSort(int A[], int p, int r, int k){ int q = partition(A,p,r); if(k-1 == q) return A[q]; else if(k-1 < q) return QuickSort(A,p,q-1,k); else return QuickSort(A,q+1,r,k); } int main(){ int A[] = {1,4,35,444,3,46,76,88,24,23,2345,65}; int n = sizeof(A)/sizeof(A[0]); cout << QuickSort(A,0,n-1,12); return 0; } ``` ## Median of Two Sorted Array ### Problem Statement The question is selected from Leetcode, the original problem can be referred to [here](https://leetcode.com/problems/median-of-two-sorted-arrays/). ``` Given two sorted arrays A and B of size m and n respectively. Return the median of the two sorted arrays. The overall complexity should be O(log(m+n)). ``` ### Example ``` Input: A[] = {5,5,6,7,8,9,14}; B[] = {2,3,5,5,8}; Output: 5.5 ``` ### Algorithm The straightforward method will be that we can merge two arrays into one bigger array (with the help of the merge function in merge sort). Therefore, we can directly return the median in position (m+n)/2 if m+n is odd; the half sum of position (m+n)/2 and (m+n)/2+1 if m+n is even. However, the straightforward method will need to go through two arrays, whose complexity will be $O(m+n)$. If we first compare the median of two sorted arrays, there will be three conditions: median(A) > median(B), median(A) < median(B) and median(A) = median(B). If median(A) < median(B), we will say that any number smaller than median(A) will definitely not be the median of A and B. Therefore, we can prune the half of A array. The condition will also hold when median(B) < median(A). When median(A) = median(B), we can directly return the value of median(A) since half of element of A is smaller than median(A) and half of element of B is smaller than median(B). Therefore, we can now construct a function that returns the k-th number in two arrays, where k = median of A and B in this problem. The function should always compare the k/2 of A and B, if k/2 is larger than the size of A or B. Assume that the size of A is always smaller than the size of B. If it is not, we can exchange the input order of A and B. When array A is empty, we can directly return the k-th element in array B. When k is equal to one, we can return the minimum value of the first element of A and B. Finally, we apply the divide and conquer method after comparing the current two values of A and B as the pseudo code below. ### Pseudo Code ``` function median(A[], B[], p1, q1, p2, q2, k): m = q1-p1+1; n = q2-p2+1; if m > n: return median(B, A, p2, q2, p1, q1, k); if m == 0: return B[p2+k-1]; if k == 1: return min(A[p1], B[p2]); pa = min(k/2, m); pb = k-pa; if A[p1+pa-1] < B[p2+pb-1]: return median(A,B,p1+pa,q1,p2,q2,k-pa); else if A[p1+pa-1] > B[p2+pb-1]: return median(A,B,p1,q1,p2+pb,q2,k-pb); else return A[p1+pa-1]; ``` The variables p1, q1, p1, and q2 represent the current start point and end point of the subarray of A and. The variables m,n represent the size of current subarray of A and B. In order to find out the k-th variable in array A and B, we will need to assign pa and pb rather than directly compare k/2 position in A and B since we the size of array might be smaller than k/2. Note that we will have the array started at index 0. Therefore, the k-th position in the array needs to be minus one in order to get the correct position. After we finish the comparison of A[p1+pa-1] and B[p2+pb-1], we can assure that either pa or pb elements will not be the median of array A and B. Therefore, we can keep finding either k-pa or k-pb position of A and B. ### Complexity Since we will prune half of the array of either A or B, the overall complexity will be $O(log(m+n))$. ### C++ Code ```cpp= #include <iostream> #include <cmath> using namespace std; double median(double A[], double B[], int p1, int q1, int p2, int q2, int k){ int m = q1-p1+1;//A int n = q2-p2+1;//B if(m > n) return median(B, A, p2, q2, p1, q1, k); if(m == 0) return B[p2+k-1]; if(k == 1) return min(A[p1], B[p2]); int pa = min(m, k/2); int pb = k-pa; if(A[p1+pa-1] < B[p2+pb-1]) return median(A, B, p1+pa, q1, p2, q2, k-pa); else if(A[p1+pa-1] > B[p2+pb-1]) return median(A, B, p1, q1, p2+pb, q2, k-pb); else return A[p1+pa-1]; } int main(){ double A[] = {5,5,6,7,8,9,14}; double B[] = {2,3,5,5,8}; int m = sizeof(A)/sizeof(A[0]); int n = sizeof(B)/sizeof(B[0]); if((m+n)%2 == 0) { double k = (median(A, B, 0, m-1, 0, n-1, (m+n)/2) + median(A, B, 0, m-1, 0, n-1, (m+n)/2+1))/2; printf("%f\n",k); } else { double k = median(A, B, 0, m-1, 0, n-1, (m+n)/2+1); printf("%f\n",k); } return 0; } ```