# COMP 3711 – Design and Analysis of Algorithms III
**2022 Fall Semester – Written Assignment 3**
**Distributed:** October 26, 2022
**Due:** November 13, 2022, 23:59

**Name:** Kwong Cheuk Lam
**SID:** 20776617
**Email:** clkwongaa@connect.ust.hk

**Implementation of Pseudocode of Q1-Q4:** https://colab.research.google.com/drive/1V1v5eHEvIWTjd30m2MZUc3ISZA8HHS4x?usp=sharing

**Online version:** https://hackmd.io/@j1Jsew7VSb-GQ9oE3SBz8g/H1w473PEi

**Collaborators:**
1. Lam Ho Wai (hwlamah@connect.ust.hk)

### Question 1

#### Definition

Suppose `root` is the root node of the input binary tree. Define `resRoot` as the root node of a resulting binary tree generated by our algorithm, where:

1. it has the same structure as the input binary tree, and
2. the value of each of its nodes is the maximum non-adjacent sum of the subtree rooted at the corresponding input node.

The maximum non-adjacent sum of the whole input tree is then the value of `resRoot`.

#### Recurrence

We solve the problem with a bottom-up approach: we find the maximum non-adjacent sums of the subtrees from the lowest level to the highest level. (It does not matter whether we go left-to-right or right-to-left within the same level.)

**Base case:** The node is a leaf node. The maximum non-adjacent sum of the tree rooted at this node is just the value of the node.

**Recursive cases:**

*Case 1: Select the current node.* We cannot select the immediate children (both left and right), but we can select the four grandchildren if there are any, and for optimality we must include the grandchildren's optimal solutions. Since we work bottom-up, their optimal solutions have already been calculated. Hence, the maximum non-adjacent sum of the subtree rooted at this node is
`resNode.value = node.value + resNode.left.left.value + resNode.left.right.value + resNode.right.left.value + resNode.right.right.value`.
(A term is treated as 0 if the corresponding node does not exist.)

*Case 2: Do not select the current node.* We can select the immediate children if there are any, and for optimality we must include the optimal solutions of the two immediate children:
`resNode.value = resNode.left.value + resNode.right.value`.

***Combination of the two cases:*** The optimal solution is chosen by
`resNode.value = max{resNode.left.value + resNode.right.value, node.value + resNode.left.left.value + resNode.left.right.value + resNode.right.left.value + resNode.right.right.value}`

#### Pseudocode

```c=
def findMax(root, n):
    // deep copy a resulting binary tree with the same structure as the input tree,
    // and initialize all of its node values to zero
    resRoot <- deepStructureCopy(root)

    // bfs: generate two lists of pointers to the nodes,
    // ordered from top to bottom, left to right
    nodeList <- bfs(root)
    resNodeList <- bfs(resRoot)

    // reverse the lists, so that the nodes are now
    // ordered from bottom to top, right to left
    reverse nodeList
    reverse resNodeList

    // loop through the nodes
    for i <- 1 to n:
        node <- nodeList[i]
        res <- resNodeList[i]

        // if the node is a leaf node, the max sum is just its own value
        if(node.left = None and node.right = None) then
            res.value <- node.value
        // recursive case
        else:
            // get the optimal solutions of the children and grandchildren,
            // zero if that node does not exist at all
            if(res.left exists) then l <- res.left.value else l <- 0
            if(res.right exists) then r <- res.right.value else r <- 0
            if(res.left.left exists) then ll <- res.left.left.value else ll <- 0
            if(res.left.right exists) then lr <- res.left.right.value else lr <- 0
            if(res.right.left exists) then rl <- res.right.left.value else rl <- 0
            if(res.right.right exists) then rr <- res.right.right.value else rr <- 0

            res.value <- max(l + r, node.value + ll + lr + rl + rr)

    return resRoot.value
```

#### Runtime

1. Line 4: Since the original tree has only n nodes, the deep copy takes $\mathcal{O}(n)$ time.
2. Lines 8-9 and 13-14: BFS and list reversal over n nodes take $\mathcal{O}(n)$ time.
3. Lines 17-35: The loop runs n times. The variable assignments, comparisons, `max()`, array accesses and data member accesses take constant time, so the loop takes $\mathcal{O}(n)$ time.

Therefore, the algorithm takes $\mathcal{O}(n)$ time.
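As a cross-check, here is a minimal runnable Python sketch of the same recurrence, written as a memoized post-order traversal instead of the explicit BFS-and-reverse pass above (the `Node` class and function name are illustrative, not part of the assignment):

```python
# Minimal sketch of the recurrence: best(v) = max(skip v, take v), where
# skipping v takes the children's optima and taking v takes v.value plus
# the grandchildren's optima.

class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def max_nonadjacent_sum(root):
    memo = {}

    def best(node):
        if node is None:
            return 0
        if node in memo:
            return memo[node]
        # Case 2: skip this node and take the children's optimal sums
        skip = best(node.left) + best(node.right)
        # Case 1: take this node, skip the children, take the grandchildren
        take = node.value
        for child in (node.left, node.right):
            if child is not None:
                take += best(child.left) + best(child.right)
        memo[node] = max(skip, take)
        return memo[node]

    return best(root)

# Example: root 4 with children 1 and 9, where 1 has children 2 and 3.
tree = Node(4, Node(1, Node(2), Node(3)), Node(9))
print(max_nonadjacent_sum(tree))  # 14 (select 2, 3 and 9)
```

Each node's optimum is computed once, so this sketch also runs in $\mathcal{O}(n)$ time.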
### Question 2

#### Definition

Let P[i, j] be the length of the longest possible increasing path ending at position (i, j).

#### Recurrence

**Base case:** $M[0,j] = M[i,0] = \infty$ for all $i, j \in [1,n]$. These sentinel boundary values are set to infinity so that no cell can be "reached" from outside the matrix.

**Recursive cases:**

**Case 1:** $M[i,j] > M[i-1,j]$ and $M[i,j] > M[i,j-1]$
M[i,j] can be reached from the left or from the top. Therefore, the longest path length ending at M[i,j] is max{P[i-1,j], P[i,j-1]} + 1, obtained by extending the longest path ending at M[i-1,j] or M[i,j-1], whichever is longer.

**Case 2:** $M[i,j] > M[i-1,j]$ and $M[i,j] \leq M[i,j-1]$
M[i,j] can only be reached from the top. Therefore, the longest path length ending at M[i,j] is P[i-1,j] + 1: the path extends the optimal solution at M[i-1,j] one step further down.

**Case 3:** $M[i,j] \leq M[i-1,j]$ and $M[i,j] > M[i,j-1]$
M[i,j] can only be reached from the left. Therefore, the longest path length ending at M[i,j] is P[i,j-1] + 1: the path extends the optimal solution at M[i,j-1] one step further right.

**Case 4:** $M[i,j] \leq M[i-1,j]$ and $M[i,j] \leq M[i,j-1]$
M[i,j] cannot be reached from the left or the top. Therefore, no path ends at M[i,j] except the path containing only M[i,j] itself, so P[i,j] = 1 and the path is [M[i,j]].

***Combination of the four cases:*** Since we solve the problem top-to-bottom, left-to-right, the previous optimal results needed for $P[i,j]$ are already available.

$P[i,j] = \left\{ \begin{array}{rcl} \max\{P[i,j-1], P[i-1,j]\} + 1 & \text{if} & M[i,j] > M[i-1,j] \text{ and } M[i,j] > M[i,j-1] \\ P[i-1,j] + 1 & \text{if} & M[i,j] > M[i-1,j] \text{ and } M[i,j] \leq M[i,j-1] \\ P[i,j-1] + 1 & \text{if} & M[i,j] \leq M[i-1,j] \text{ and } M[i,j] > M[i,j-1] \\ 1 & \text{if} & M[i,j] \leq M[i-1,j] \text{ and } M[i,j] \leq M[i,j-1] \end{array} \right.$
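For instance, on a small hypothetical 2 × 2 input the recurrence gives

$M = \begin{pmatrix} 1 & 2 \\ 4 & 3 \end{pmatrix} \quad\Rightarrow\quad P = \begin{pmatrix} 1 & 2 \\ 2 & 3 \end{pmatrix}$

Here P[1,1] = 1 (Case 4, both sentinel neighbours are $\infty$), P[1,2] = P[1,1] + 1 = 2 (Case 3), P[2,1] = P[1,1] + 1 = 2 (Case 2), and P[2,2] = P[1,2] + 1 = 3 (Case 2, since 3 > 2 above it but 3 ≤ 4 to its left). The longest increasing path therefore has length 3, namely 1 → 2 → 3.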
#### Reconstructing the optimal solution

We can locate the end point of the longest path and trace the path back. We first find the position (i, j) at which the longest path ends and add M[i,j] to the resulting array `output`.

(\*) Then we repeat the following steps until P[i,j] = 1, i.e. until the start of the path is reached.

1. Determine where the current cell is reached from.
   **Case 1:** if P[i-1,j] = P[i,j] - 1 and M[i-1,j] < M[i,j]
   The path is extended from the top, and we trace back to the cell above, i.e. i <- i - 1.
   **Case 2:** if P[i,j-1] = P[i,j] - 1 and M[i,j-1] < M[i,j], and Case 1 is not satisfied
   The path is extended from the left, and we trace back to the cell on the left, i.e. j <- j - 1.
2. Add the previous cell to the output: we add M[i,j] to the front of the output array (in the pseudocode below we instead append it and reverse the array at the end).

Note that the two cases are collectively exhaustive, as a path of length greater than 1 ending at M[i,j] must come from either the top or the left. Also, the predecessor in (\*) may not be unique: the cell may be reachable from both the top and the left (Case 1 and Case 2 satisfied at the same time) because the optimal solution itself may not be unique. In that case we simply pick one viable predecessor.

#### Pseudocode

```c=
def longestPath(M, n):
    P[0...n, 0...n] <- 0    // initialize the DP table
    M[0, 0...n] <- ∞        // sentinel row above the matrix
    M[0...n, 0] <- ∞        // sentinel column to the left of the matrix

    // variables storing the end point and the length of the longest path found so far
    endI <- 0, endJ <- 0
    maxLen <- 0

    // fill the table top-to-bottom, left-to-right
    for i <- 1 to n do
        for j <- 1 to n do
            // recursive cases
            if(M[i][j] > M[i][j-1] and M[i][j] > M[i-1][j]) do
                P[i][j] <- max(P[i][j-1], P[i-1][j]) + 1
            else if(M[i][j] > M[i][j-1]) do
                P[i][j] <- P[i][j-1] + 1
            else if(M[i][j] > M[i-1][j]) do
                P[i][j] <- P[i-1][j] + 1
            else do
                P[i][j] <- 1

            // if the current length is the largest so far, remember it and its end point
            if(P[i][j] > maxLen) do
                maxLen <- P[i][j]
                endI <- i, endJ <- j

    // initialize the output array with the number at the end of the longest path
    output <- [M[endI][endJ]]

    // reconstruct the optimal solution:
    // loop until maxLen = 1, i.e. the start of the path is reached
    while(maxLen > 1):
        if(P[endI-1][endJ] = P[endI][endJ] - 1 and M[endI-1][endJ] < M[endI][endJ]):
            endI <- endI - 1    // the path came from the top
        else:
            endJ <- endJ - 1    // the path came from the left
        add M[endI][endJ] to the end of output
        maxLen <- maxLen - 1

    // reverse the output array so that it starts with the starting number of the path
    reverse output
    return output
```

#### Runtime

1. Lines 2-4: Initialization of the (n+1) × (n+1) array and the sentinel boundary takes $\mathcal{O}(n^2)$ time.
2. Lines 11-26: The nested loops iterate $n^2$ times. The comparisons, assignments and max() all take constant time, so the runtime is $\mathcal{O}(n^2)$.
3. Lines 33-39: The longest possible path in an n × n matrix has length 2n - 1, hence the while loop iterates fewer than 2n times. The comparisons and assignments take constant time, so the runtime is $\mathcal{O}(n)$.
4. Line 42: Reversal of an array of size at most 2n - 1 takes $\mathcal{O}(n)$ time.
5. Other assignments take constant time $\mathcal{O}(1)$.

Therefore, the total runtime is $\mathcal{O}(n^2)$.

#### Space complexity

1. Line 2: The array P is of size $\mathcal{O}(n^2)$.
2. Lines 29 and 33-39: The output array has a maximum size of 2n - 1, i.e. $\mathcal{O}(n)$.
3. Other variables take constant space $\mathcal{O}(1)$.

Therefore, the total space is $\mathcal{O}(n^2)$.
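Below is a minimal runnable Python sketch of the table fill and the traceback above, using 0-based indexing and explicit bounds checks in place of the $\infty$ sentinels (the function name and test values are illustrative):

```python
# Minimal sketch: P[i][j] = length of the longest strictly increasing
# path (moving only right or down) that ends at cell (i, j).

def longest_increasing_path(M):
    n = len(M)
    P = [[1] * n for _ in range(n)]
    end_i, end_j, max_len = 0, 0, 1

    for i in range(n):
        for j in range(n):
            from_top = i > 0 and M[i][j] > M[i - 1][j]
            from_left = j > 0 and M[i][j] > M[i][j - 1]
            if from_top and from_left:
                P[i][j] = max(P[i - 1][j], P[i][j - 1]) + 1
            elif from_top:
                P[i][j] = P[i - 1][j] + 1
            elif from_left:
                P[i][j] = P[i][j - 1] + 1
            # otherwise the path consists of the cell alone: P[i][j] stays 1
            if P[i][j] > max_len:
                max_len, end_i, end_j = P[i][j], i, j

    # trace the path back from its end point
    path = [M[end_i][end_j]]
    while P[end_i][end_j] > 1:
        if (end_i > 0 and P[end_i - 1][end_j] == P[end_i][end_j] - 1
                and M[end_i - 1][end_j] < M[end_i][end_j]):
            end_i -= 1
        else:
            end_j -= 1
        path.append(M[end_i][end_j])
    return path[::-1]

print(longest_increasing_path([[1, 2], [4, 3]]))  # [1, 2, 3]
```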
### Question 3 (1)

#### Definition

Let `S` be the set we are partitioning and `n` be the size of the set. Let `V[i,w]` be the largest obtainable value of a subset whose total sum is at most `w`, choosing only from the first `i` items of `S`. Let `totalSum` be the sum of all elements in `S`.

To partition the set `S` into two subsets of equal sum, each subset must have a sum equal to half of the total, i.e. `totalSum/2`. To determine whether such a partition exists, we look at `V[n, totalSum/2]`. If `V[n, totalSum/2] = totalSum/2`, we can pick a subset of `S` with sum `totalSum/2`, and the remaining unpicked elements then form a subset that also has sum `totalSum/2`; we return True. Otherwise, we return False. (Note that `totalSum` is assumed to be even; if it is odd, no equal-sum partition exists and we can return False immediately.)

#### Recurrence

The recurrence is similar to the knapsack problem. Consider V[i, w].

**Base cases:**
The maximum sum is 0: $V[i,0] = 0, \forall i \in [1,n]$.
We pick from an empty set: $V[0,w] = 0, \forall w \in [1, \text{totalSum}/2]$.

**Recursive cases:**

**Case 1:** We choose $S[i]$ and add it to the subset. This requires $w \geq S[i]$. We then also include the optimal solution $V[i-1, w-S[i]]$, which gives the greatest sum over the first i-1 elements while leaving room for $S[i]$. Then $V[i,w] = S[i] + V[i-1, w-S[i]]$.

**Case 2:** We do not choose $S[i]$. Then the optimal solution for $V[i,w]$ is just the optimal solution $V[i-1,w]$.

**Combination of the two cases:** Case 1 applies when $w \geq S[i]$ and adding S[i] to the subset gives a greater sum, i.e. $V[i-1,w] < S[i] + V[i-1, w-S[i]]$. Therefore,

$V[i,w] = \left\{ \begin{array}{rcl} S[i] + V[i-1, w-S[i]] & \text{if} & w \geq S[i] \text{ and } V[i-1,w] < S[i] + V[i-1, w-S[i]]\\ V[i-1,w] & & \text{otherwise} \\ \end{array} \right.$

Or equivalently,

$V[i,w] = \left\{ \begin{array}{rcl} \max\{V[i-1,w], S[i] + V[i-1, w-S[i]]\} & \text{if} & w \geq S[i] \\ V[i-1,w] & & \text{otherwise} \\ \end{array} \right.$

where $i \in [1,n]$ and $w \in [1,\text{totalSum}/2]$.

#### Pseudocode

```c=
def setPartitioning(S, n):
    totalSum <- sum of S
    W <- totalSum / 2
    V[0...n, 0...W] <- 0

    for i <- 1 to n do
        for w <- 1 to W do
            if(S[i] <= w and S[i] + V[i-1][w-S[i]] > V[i-1][w]) do
                V[i][w] <- S[i] + V[i-1][w-S[i]]
            else do
                V[i][w] <- V[i-1][w]

    if(V[n][W] = W) do
        return True
    else do
        return False
```

#### Runtime

1. Line 2: Computing the sum of S takes $\mathcal{O}(n)$ time.
2. Line 4: Array initialization takes $\mathcal{O}(Wn)$ time.
3. Lines 6-11: The loops iterate Wn times. The comparisons and variable assignments inside take constant time, so this takes $\mathcal{O}(Wn)$ time.
4. Other comparisons and variable assignments take constant time $\mathcal{O}(1)$.

The total runtime is $\mathcal{O}(Wn)$, where W is half of the total sum of the set and n is the number of elements in the set.

#### Space complexity

1. Line 4: The array is of size $\mathcal{O}(Wn)$.
2. Other variables take constant space $\mathcal{O}(1)$.

The total space is $\mathcal{O}(Wn)$.
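A minimal runnable Python sketch of this table fill, using 0-based indexing (the function name and test values are illustrative):

```python
# Minimal sketch: V[i][w] = largest sum <= w achievable using only the
# first i elements of S. The set splits into two equal halves exactly
# when V[n][total // 2] == total // 2.

def can_partition(S):
    total = sum(S)
    if total % 2 == 1:          # an odd total can never be split evenly
        return False
    W = total // 2
    n = len(S)
    V = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for w in range(1, W + 1):
            take = S[i - 1] + V[i - 1][w - S[i - 1]] if S[i - 1] <= w else 0
            V[i][w] = max(V[i - 1][w], take)
    return V[n][W] == W

print(can_partition([3, 1, 4, 2]))   # True:  {3, 2} and {1, 4}
print(can_partition([3, 1, 4, 3]))   # False: total is 11
```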
### Question 3 (2)

#### Description

Suppose the sums of the two subsets partitioned from $S$ are $S_1$ and $S_2$, where $S_1 \leq \text{totalSum}/2 \leq S_2$. Let their difference be $k = S_2 - S_1$. Then

$k = (\text{totalSum} - S_1) - S_1 = 2 \times (\text{totalSum}/2 - S_1)$

so $k$ is minimized when $S_1$ is maximized under the constraint $S_1 \leq \text{totalSum}/2$, and we can reuse and extend the algorithm above.

#### Definition

Let `S` be the set we are partitioning and `n` be the size of the set. Let `V[i,w]` be the largest obtainable value of a subset whose total sum is at most `w`, choosing only from the first `i` items of `S`. Let `keep[i,w]` be 1 if we select S[i] in the optimal solution for `V[i,w]`, and 0 otherwise. Let `totalSum` be the sum of all elements in `S`.

Since `totalSum` may be odd, we minimize the difference of the two subsets by finding the subset $S_1$ with the largest sum not exceeding `⌊totalSum/2⌋`.

#### Recurrence

The recurrence is the same as in Question 3 (1):

$V[i,w] = \left\{ \begin{array}{rcl} S[i] + V[i-1, w-S[i]] & \text{if} & w \geq S[i] \text{ and } V[i-1,w] < S[i] + V[i-1, w-S[i]]\\ V[i-1,w] & & \text{otherwise} \\ \end{array} \right.$

$\text{keep}[i,w] = \left\{ \begin{array}{rcl} 1 & \text{if} & w \geq S[i] \text{ and } V[i-1,w] < S[i] + V[i-1, w-S[i]]\\ 0 & & \text{otherwise} \\ \end{array} \right.$

where $i \in [1,n]$ and $w \in [1, \lfloor\text{totalSum}/2\rfloor]$.

#### Reconstructing the optimal solution

We start with k <- W and repeat the following steps for i <- n down to 1:

1. **Case 1: keep[i,k] = 1.** For the current remaining capacity k, we include S[i] in $S_1$ and update the remaining capacity, k <- k - S[i].
2. **Case 2: keep[i,k] = 0.** For the current remaining capacity k, we do not include S[i] in $S_1$; instead it goes to $S_2$.

We then return the two sets.

#### Pseudocode

```c=
def minSubsetDifference(S, n):
    totalSum <- sum of S
    W <- ⌊totalSum / 2⌋
    V[0...n, 0...W] <- 0
    keep[0...n, 0...W] <- 0

    for i <- 1 to n do
        for w <- 1 to W do
            if(S[i] <= w and S[i] + V[i-1][w-S[i]] > V[i-1][w]) do
                V[i][w] <- S[i] + V[i-1][w-S[i]]
                keep[i][w] <- 1
            else do
                V[i][w] <- V[i-1][w]
                keep[i][w] <- 0

    // reconstruct the two subsets from the keep table
    k <- W
    S1 <- Ø
    S2 <- Ø
    for i <- n downto 1 do
        if(keep[i][k] = 1) do
            add S[i] to S1
            k <- k - S[i]
        else do
            add S[i] to S2

    return (S1, S2)
```

#### Runtime

1. Line 2: Computing the sum of S takes $\mathcal{O}(n)$ time.
2. Lines 4-5: Array initializations take $\mathcal{O}(Wn)$ time.
3. Lines 7-14: The loops iterate Wn times. The comparisons and variable assignments inside take constant time, so this runs in $\mathcal{O}(Wn)$ time.
4. Lines 20-25: The loop iterates n times, with constant work per iteration, so it runs in $\mathcal{O}(n)$ time.
5. Other comparisons and variable assignments take constant time $\mathcal{O}(1)$.

The total runtime is $\mathcal{O}(Wn)$, where W = ⌊totalSum/2⌋ and n is the number of elements in the set.

#### Space complexity

1. Lines 4-5: The arrays are of size $\mathcal{O}(Wn)$.
2. Lines 18-19: The sets are of size $\mathcal{O}(n)$.
3. Other variables take constant space $\mathcal{O}(1)$.

The total space is $\mathcal{O}(Wn)$.
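A minimal runnable Python sketch of the pseudocode above, again with 0-based indexing (the function name and test values are illustrative):

```python
# Minimal sketch: fill the knapsack table, remember which elements were
# taken, and rebuild the two subsets whose sums differ the least.

def min_difference_partition(S):
    total = sum(S)
    W = total // 2
    n = len(S)
    V = [[0] * (W + 1) for _ in range(n + 1)]
    keep = [[False] * (W + 1) for _ in range(n + 1)]

    for i in range(1, n + 1):
        for w in range(1, W + 1):
            V[i][w] = V[i - 1][w]
            if S[i - 1] <= w and S[i - 1] + V[i - 1][w - S[i - 1]] > V[i][w]:
                V[i][w] = S[i - 1] + V[i - 1][w - S[i - 1]]
                keep[i][w] = True

    # trace the keep table back to recover the two subsets
    s1, s2, k = [], [], W
    for i in range(n, 0, -1):
        if keep[i][k]:
            s1.append(S[i - 1])
            k -= S[i - 1]
        else:
            s2.append(S[i - 1])
    return s1, s2

s1, s2 = min_difference_partition([3, 1, 4, 2, 2])
print(s1, s2, abs(sum(s1) - sum(s2)))  # [2, 1, 3] [2, 4] 0
```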
### Question 4

#### Description

In the following algorithm, we treat splitting `A` into `m` subarrays as making `m-1` cuts in the array `A`.

#### Definition

Assume the input array is `A` of size `n` and we make `m` subarrays. Let `V[j,i]` be the minimized largest sum if we cut the array `A[1...i]` j times, i.e. the largest sum in the optimal partition of `A[1...i]` into j+1 contiguous subarrays. Let `lastCut[j,i]` be the position of the last (j-th) cut in a partition of A[1...i] into j+1 contiguous subarrays that minimizes the largest sum.

Note that "cutting at position i" means splitting A[x...y] into A[x...i-1] and A[i...y]. E.g. if we cut [1, 2, 3, 4, 5] at position 4, we get [1, 2, 3] and [4, 5].

#### Recurrence

Now consider V[j,i].

**Base cases:** $j = 0$ or $j \geq i$.

*$j = 0$:* We do not cut the array at all, so $V[0, i] = V[0, i-1] + A[i]$, i.e. the cumulative sum up to i.

*$j \geq i$:* We ignore these cases, as we cannot make j cuts and divide i elements into j+1 subarrays. These entries are initialized to 0.

**Recursive cases:**

**Case 1:** We cut the array at position i. This creates a new subarray [A[i]] of size 1, and we must combine it with the optimal solution of j-1 cuts on A[1...i-1], i.e. V[j-1,i-1]. The largest sum for this case is therefore $\max\{V[j-1,i-1], A[i]\}$.

**Case 2:** We do not cut the array at position i. This means we keep the optimal solution of j cuts on A[1...i-1], V[j,i-1], but append one more element to the last subarray of that solution. Define $\text{acc}$ as the sum of the last subarray in the optimal solution of $V[j,i-1]$, or equivalently the sum of $A[\text{lastCut}[j,i-1]...i-1]$. The largest sum is then either in the last subarray, $\text{acc}+A[i]$, or in one of the previous subarrays, $V[j,i-1]$. Hence, the largest sum for this case is $\max\{\text{acc}+A[i], V[j,i-1]\}$.

![](https://i.imgur.com/QVMZRj3.png)

**Combination of the two cases:** Since we want to minimize the largest sum, we cut the array at position i only if doing so yields a smaller largest sum. The recurrence is:

$V[j,i] = \min \left\{ \begin{array}{l} \max\{V[j-1,i-1], A[i]\} \\ \max\{\text{acc}+A[i], V[j,i-1]\} \end{array} \right.$

$\text{lastCut}[j,i] = \left\{ \begin{array}{lcl} i & \text{if} & \max\{V[j-1,i-1], A[i]\} \leq \max\{\text{acc}+A[i], V[j,i-1]\} \\ \text{lastCut}[j,i-1] & & \text{otherwise} \end{array} \right.$

At position (j, i), the new value of acc is:

$\text{acc} \leftarrow \left\{ \begin{array}{lcl} A[i] & \text{if} & \max\{V[j-1,i-1], A[i]\} \leq \max\{\text{acc}+A[i], V[j,i-1]\} \\ \text{acc} + A[i] & & \text{otherwise} \end{array} \right.$

where $j \in [1,m-1]$ and $i \in [j+1,n]$. For the case where $\max\{V[j-1,i-1], A[i]\} = \max\{\text{acc}+A[i], V[j,i-1]\}$, we assume we make a cut, so that $\text{acc}$ is reset to $A[i]$.

#### Reconstructing the Solution

Since the `lastCut` array stores, for each state, the position of the last cut, we can start from lastCut[m-1, n]. If lastCut[m-1, n] = k, the last subarray of the optimal partition is A[k...n]. We then recover the previous cut from lastCut[m-2, k-1], and so on, until all m-1 cuts have been found.
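For concreteness, here is a short trace of the recurrence and the reconstruction on a small hypothetical input A = [2, 3, 4, 1] with m = 2, i.e. a single cut (j = 1):

- Base case: V[0, i] holds the prefix sums 2, 5, 9, 10.
- i = 2: cut = max(A[2], V[0,1]) = max(3, 2) = 3 and notCut = ∞ (acc starts at ∞), so we cut: V[1,2] = 3, acc = 3, lastCut[1,2] = 2.
- i = 3: cut = max(4, V[0,2]) = max(4, 5) = 5 and notCut = max(3 + 4, 3) = 7, so we cut: V[1,3] = 5, acc = 4, lastCut[1,3] = 3.
- i = 4: cut = max(1, V[0,3]) = max(1, 9) = 9 and notCut = max(4 + 1, 5) = 5, so we do not cut: V[1,4] = 5, acc = 5, lastCut[1,4] = lastCut[1,3] = 3.

The minimized largest sum is V[1,4] = 5, and tracing back from lastCut[1,4] = 3 gives the partition [2, 3] and [4, 1], both with sum 5.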
#### Pseudocode

```c=
def continuousSubarrayPartition(A, n, m):
    // initialize the arrays needed
    // the cut index only goes up to m-1, since m subarrays need only m-1 cuts
    V[0...m-1, 0...n] <- 0
    lastCut[0...m-1, 0...n] <- 0
    acc <- 0

    // base case j = 0: no cuts, V[0][i] is the prefix sum up to i
    for i <- 1 to n:
        acc <- acc + A[i]
        V[0][i] <- acc

    for j <- 1 to m-1:
        // reset acc to infinity so that the first considered position
        // always "cuts" at the beginning of the row
        acc <- ∞
        // ignore the cases where j >= i
        for i <- j+1 to n:
            cut <- max(A[i], V[j-1][i-1])
            notCut <- max(acc + A[i], V[j][i-1])
            if(notCut >= cut):
                V[j][i] <- cut
                acc <- A[i]
                lastCut[j][i] <- i
            else:
                V[j][i] <- notCut
                acc <- acc + A[i]
                lastCut[j][i] <- lastCut[j][i-1]

    // find the cut positions of the optimal solution, from the last cut to the first
    cuts <- []
    k <- n
    for cut <- m-1 downto 1:
        add lastCut[cut][k] to the front of cuts
        k <- lastCut[cut][k] - 1

    // add two sentinel "cuts" at the front and the end for convenience
    cuts <- [1] + cuts + [n+1]

    // reconstruct the m subarrays based on the cuts
    res <- []
    for i <- 1 to m:
        add A[cuts[i]...cuts[i+1]-1] to res

    return res
```

#### Runtime

1. Lines 4-5: Array initialization takes $\mathcal{O}(nm)$ time.
2. Lines 9-11: The loop iterates n times, so the runtime is $\mathcal{O}(n)$.
3. Lines 13-28: The nested loops iterate $\mathcal{O}(nm)$ times. The comparisons, variable assignments and max() take constant time.
4. Lines 33-35: The loop iterates $\mathcal{O}(m)$ times.
5. Lines 42-43: The loop iterates $\mathcal{O}(m)$ times, and copying the subarrays touches each of the n elements once, so it takes $\mathcal{O}(n)$ time in total.
6. Other variable assignments take $\mathcal{O}(1)$ time.

The runtime is $\mathcal{O}(nm)$. If m is regarded as a constant (m is much smaller than n), this is $\mathcal{O}(n)$.

#### Space Complexity

1. Lines 4-5: The arrays V and lastCut take $\mathcal{O}(mn)$ space.
2. Line 31: The cuts array takes $\mathcal{O}(m)$ space.
3. Lines 41-43: The res array contains m subarrays, but the total number of elements is still $\mathcal{O}(n)$.
4. Other variables take constant space $\mathcal{O}(1)$.

The total space is $\mathcal{O}(nm)$. Similarly, if m is regarded as a constant, the space is $\mathcal{O}(n)$.
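Finally, a minimal runnable Python sketch that directly transcribes the pseudocode above into 0-based Python; it mirrors the pseudocode rather than being an independent algorithm, and the function name and test values are illustrative:

```python
import math

def split_array(A, m):
    """Transcription of the pseudocode above: split A into m contiguous
    subarrays, trying to minimize the largest subarray sum."""
    n = len(A)
    A = [None] + list(A)                       # 1-based indexing, as in the pseudocode
    V = [[0] * (n + 1) for _ in range(m)]      # V[j][i], j = number of cuts
    last_cut = [[0] * (n + 1) for _ in range(m)]

    # base case j = 0: prefix sums
    for i in range(1, n + 1):
        V[0][i] = V[0][i - 1] + A[i]

    for j in range(1, m):
        acc = math.inf                         # forces a cut at the first position
        for i in range(j + 1, n + 1):
            cut = max(A[i], V[j - 1][i - 1])
            not_cut = max(acc + A[i], V[j][i - 1])
            if not_cut >= cut:
                V[j][i], acc, last_cut[j][i] = cut, A[i], i
            else:
                V[j][i], acc, last_cut[j][i] = not_cut, acc + A[i], last_cut[j][i - 1]

    # recover the cut positions, then the subarrays
    cuts, k = [], n
    for c in range(m - 1, 0, -1):
        cuts.insert(0, last_cut[c][k])
        k = last_cut[c][k] - 1
    cuts = [1] + cuts + [n + 1]
    return [A[cuts[i]:cuts[i + 1]] for i in range(m)]

print(split_array([2, 3, 4, 1], 2))  # [[2, 3], [4, 1]]  (largest sum 5)
```

On the hypothetical input [2, 3, 4, 1] with m = 2 this reproduces the trace shown before the pseudocode.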