ADA 103 期中考

# ADA 103 期中考 ###### tags: `Exams` # Problem 1 Short Answer Question (30 points) Answer the following questions and briefly justify your answer. For (a)-(d), decide whether you think the statement, particularly the underlined phrase, is true or false. Briefly explain your choice to receive full credit. (a) (3 points) True or False: To prove the correctness of a greedy algorithm, we must prove that __every optimal solution contains our greedy choice__. 旭君是不是連續3年出同一題阿 False, just that one containg greedy choice is optimal ___ (b) (3 points) True or False: When items have __non-integer weights__, it may be extremely inefficient to solve the 0/1 Knapsack Problem using dynamic programming because of __lack of overlapping subproblems__. True. Because non-integer addition/subtraction will result in a weight that is most likely not integer, and it is infeasible for us to make a dp table containing non-integer weights (since infinite of such values). The problem becomes easier if we can find pattern, like increments of set length. ___ (c) (3 points) True or False: (m_1,w_3),(m_2,w_2),(m_3,w_1) is one stable matching in the following preference lists: ┌──┬───────┐ ┌──┬───────┐ │ │ 1st 2nd 3rd│ │ │ 1st 2nd 3rd│ ├──┼───────┤ ├──┼───────┤ │ m_1│ w_3 w_2 w_1│ │ w_1│ m_1 m_2 m_3│ │ m_2│ w_2 w_1 w_3│ │ w_2│ m_1 m_2 m_3│ │ m_3│ w_3 w_2 w_1│ │ w_3│ m_2 m_1 m_3│ └──┴───────┘ └──┴───────┘ 其實陳蘊儂沒有教ㄝ可ㄅ如我 True. By the algorithm, we first let m1 and w3 become engaged, then we let m2 and w2 become engaged. m3 asks w3 but w3 prefer m1 over m3, m3 as w2 over but w2 also prefer m2 over m3. m3 finally left with w1 ##### p.s. m3是甚麼死肥宅，大家都討厭他可是就算這樣還是有老婆我哭哭 ___ (d) (3 points) True or False: If f(n) is O(g(n)), then 2^f(n) is O(2^g(n)). False. let f(n) = 2n, g(n) = n, but 4^n is not O(2^n) ___ (e) (3 points) Given the recurrence relation A(i,j) = F(A(i,j-1),A(i-1,j-1), A(i-1,j+1)), provide a valid traversal order to fill the DP table or justify why no valid traversal exists. By row. these are all to the top or to the left of i,j. (i-1,j+1) is top right. If we fill the previous row, no problem ___ (f) (3 points) Given the recurrence relation A(i,j) = F(A(i-1,j-1),A(i+1,j+1)), provide a valid traversla order to fill the DP table or justify why no valid traversal exists. No valid traversal, we need to fill top left and bottom right, which is impossible? ==有沒有什麼系統一點的方式去寫阿== ___ (g) (3 points) Suppose you have designed a Divide and Conquer algorithm whose running time T(n) can be expressed by T(n) = aT(n/b)+f(n). Briefly explain what a,b, and f(n) represent in your algorithm. a = # subproblem b = size of each subproblem f(n) = time needed to conquer and merge the subproblems ___ (h) (3 points) Solve the following recurrence by giving a tight Θ-notation bound: $$ T(n) = 2T(n/2)+(n^2) log n $$ Master's theorem, f(n) > n^1.00001, therefore time is f(n) = n^2logn ___ (i) (6 points) Describe a real-world problem that you have solved or have thought about solving using technique learned in the class. Make an educated guess about how fast your algorithm might be, and how much time it would take to solve the same problem using a brute-force approach instead. Greedy 演算法之cashier problem. 又出現了！！！大家是不是該想一個好的演算法了呢？ ___ # Problem 2: nth Smallest in Two Databases (15 points) You're given two databases D_1 and D_2, each of which contains n numbers. These 2n numbers are all distinct. For each database, you can query for the kth smallest number for any 1≦k≦n. You want to find the kth smallest number among these 2n numbers. (a) (10 points) Please design a Divide-and-Conquer algorithm to find the nth smallest number among these 2n numbers in O(log n) queries. Hint: Let m = \floor{n/2} and let v_1,m and v_2,m be the mth smallest number in database D_1 and D_2, respectively. Let x be the nth smallest number among these 2n numbers. Consider where x is in three cases: v_1,m > v_2,m, v_1,m = v_2,m, v_1,m < v_2,m. If v1m < v2m, we know that the first m elements in v1 must be smaller than v2m (although we don't know whether m+1, m+2... is). If we know for sure they are smaller, and m < n, we can simply delete them. The next time we query, we need to find the N-m = N/2 largest value, and eliminating another part ==是這樣嘛？一次刪1/4== ___ (b) (5 points) Briefly justify the running time of your algorithm is indeed O(log N) queries. O(log N) queries by drawing recursion trees. If we delete in O(1), this is really T(n) = T(3n/4) + O(1) Which is just log(N) levels each taking O(1). ___ # Problem 3: Longest Path (20 points) (a) (3 points) Briefly explain what optimal substructure is. The smaller subproblems in an optimal solution is also optimal. We can divide a large OPT into composition of smaller OPT. ___ (b) (5 points) Consider a graph G = (V,E), a start node s ∈ V, and end node t ∈ V, and weight w_e of each edge e ∈ E. The Longest Path Problem is to find a longest simple path going from s to t on this graph. (The length of a path is the sum of weights of all edges on the path. A simple path is a path that visits a node at most once, that is, a path without cycles.) Please construct a counterexample to show that the Longest Path Problem has no optimal substructure. 上課說過 QRST 成正方形, cycle ___ (c) Your friend made a interesting observation that the Longest Path Problem on certain types of graphs does exibit optimal substructure. To test your friend's conjecture, you decide to use a dynamic programming algorithm to find one longest path from top-left to bottom-right on a grid-like directed graph with edges going downward and rightward only. As an example, Figure 1 shows a 4-by-4 grid in which each edge is labeled with a weight. Answer the following questions: 1. (3 points) Find one longest path in the graph in Figure 1. length 39. ![](https://i.imgur.com/u7qmc9s.jpg) 2. (6 points) Consider a n-by-n grid. Suppose you want to find the length of a longest path using dynamic programming. Please define your subproblems and represent the value of the optimal solution using a recurrence. $$ dp[i][j] = max(dp[i-1][j], dp[i][j-1]) $$ 3. (3 points) What's special about this type of graphs? Please explain why the Longest Path Problem has no optimal substructure in general but has optimal substructure in this special type of graphs. An informal observation would be sufficient. This is a DAG (Directed Acyclic Graph), so we know that it is impossible for us to "backtrack" at all. This means that we won't have the problem that might result in travelling in parts of a cycle. This graph is especially nice for how it organized it into a grid, but that is not neccessary. 3 4 5 ｓ→→→○→→→○→→→○ 8↓ 5↓ 5↓ 8↓ ↓ 8 ↓ 7 ↓ 3 ↓ ○→→→○→→→○→→→○ 3↓ 6↓ 5↓ 4↓ ↓ 3 ↓ 9 ↓ 7 ↓ ○→→→○→→→○→→→○ 7↓ 5↓ 3↓ 1↓ ↓ 3 ↓ 1 ↓ 4 ↓ ○→→→○→→→○→→→ｔ Figure 1: Find one longest path from s to t. # Problem 4: The Rise of the Planet of the Apes (15 points) Being a brilliant ape, Caesar is learning a sign language to communicate with his trainer, Will. In this sign language, each word, is represented by a sequence of gestures. This sign language should be prefix-free, meaning that no word is a prefix of another word, since Caesar is smart enough to combine words into phrases. Will wants to teach Caesar the following seven words that are frequently used in their daily activities: ┌─────┬─────────────────────┐ │ Word │food sleep play drink open who come │ ├─────┼─────────────────────┤ │ Frequency│0.25 0.15 0.12 0.19 0.07 0.14 0.08 │ └─────┴─────────────────────┘ (a) (5 points) Will knows that Caesar can do two gestures─pointing his index finger upward (UP) pointing his index finger downward (DOWN)─without difficulty. Please help Will to create a prefix-free binary sign language so as to minimize the average number of gestures per word. This sign language should contain the above seven words and each word is represented by a sequence of UP and DOWN gestures. Huffman coding 簡單化一下就好 ___ (b) (5 points) Caesar now can do three gestures─UP, DOWN, and WAVE. Please create a prefix-free ternary sign language so as to minimize the average number of gestures per word. The sign language should contain the above seven words and each word is represented by a sequence of UP, DOWN, and WAVE gesture. 還是簡單變成3個小孩剛好都是full的不用掉換 ___ c) (5 points) Following (a), Will observes that the UP gesture consumes three times of energy than the DOWN gesture. Will comes up with greedy heuristic in the hope to minimize the average energy consumption per word. His greedy heuristic is as follows. Given n words and their frequencies, Will first draws a binary prefix tree that minimizes the average number of gestures per word in its corresponding binary sign language. He then re-labels each edge such that for every node v and its left node v_l and right node v_r, edge (v,v_l) is re-labeled with UP and edge (v,v_r) is re-labeled with DOWN if the sum of the frequencies of the leaf nodes in v's left subtree is lower than it of its right subtree. The labels are switched otherwise. Please show that this greedy algorithm is incorrect. You don't need to use the same input (i.e., the 7 words and their frequency distributions) in your counterexample. ![](https://i.imgur.com/VEn9xGM.jpg) # Problem 5: Counting Inversions (10 points + 10 bonus points) (a) Counting inversions revisited (10 points) Given a sequence of unique numbers B = b_1,b_2,...b_n, an inversion is a pair of numbers b_i and b_j in this sequence such that i < j and b_i > b_j. Let I(B) be the set of inversions in B. For example, if B = <1, 3, 5, 2>, I(B) = {(3,2),(5,2)}, and the number of inversions |I(B)| is 2. Given an unsorted sequence B with n numbers, please design an efficient algorithm to calculate the number of inversions |I(B)| in O(n log n) time. Please briefly explain why your algorithm indeed runs in O(n log n). Like homework, use a modified merge sort ___ (b) Bonus: Counting significant inversions (10 points) Given a sequence of unique numbers B = b_1,b_2,...b_n, a significant inversion is a pair of numbers b_i and b_j in this sequence such that i < j and b_i > 2b_j. Let SI(B) be the set of significant inversions in B. For example, if B = <1, 3, 5, 2>, SI(B) = {(5,2)}, and the number of significant inversions |SI(B)| is 1. Given an unsorted sequence B with n numbers, please describe how to modify your algorithm in part (a) for calculating the number of significant inversions |SI(B)| in O(n log n) time. Please briefly explain why your algorithm indeed runs in O(n log n). Same question as 104, modified merge sort with pointer pointing at the value Vr that is at least 2Vl. ___ # Problem 6: Finding closest pair of points (10 points + 10 bonus points) (a) (10 points) There are n points in a two-dimensional space, and point i's x-y coordinate is (x_i,y_i). However, in this bizarre world, the distance between points i and j is defined as $$d(i,j) = min {(|x_i-x_j|,|y_i-y_j|)} $$, which is not Euclidean distance. Suppose you are given arrays X_n and Y_n containing these n points sorted by their x-coordinates and y-coordinates, respectively. Please design a greedy algorithm to find a closest pair of points in O(n) time, and explain why your algorithm is correct. You can prove it by showing your algorithm has the greedy choice property and optimal substructure. Alternatively, you can also use other proof techniques as long as you can show that your solution is indeed optimal. Hint: first consider a 1D case where the distance is defined as d(i,j) = |x_i-x_j| and then extend your solution to 2D. Since the arrays are already sorted, we can just loop through i and i+1. We do it once in array X, once in array Y. The absolute minimum in the 2 is the closest point, thus O(n) Optimal substructure: Say we divide it into 2 random groups, the global minimum must be the minimum of these 2 groups Greedy property: since they are sorted, the distance between i and i+1 must be minimum. If OPT exist that don't contain i and i+1, but instead i and i+2, then we know i and i+1 is at least as good ___ (b) Bonus (10 points) There are n points in a two-dimensional space, and point i's x-y coordinate is (x_i,y_i). However, in this bizarre world, the distance between points i and j is defined as $$d(i,j) = max{(|x_i-x_j|,|y_i-y_j|)}$$, which is not Euclidean distance. Please design an algorithm to find a closet pair of points in O(n log n) time, and explain why your algorithm is correct. DC approach similar to how we solve it in euclidean space. We first take O(nlogn) to sort each x and y. We then DC into 2 smaller parts, lets say by the x coordinate. We do this until base case. When we merge, we choose the min distance to be min(left, right, across). We can calculate across in a similar way. Let delta = min(left, right), we can allocate an area of 2delta by delta. Because of this "max" value, it make sense that any point must be within this delta x 2delta area or else either its x difference or y difference will be larger than the min. We can find the min by comparing 7 closest points. ==感覺這個至多8個點的性質會繼續存在。如果有的話根本是一樣的作法？== ___ Problem 7: Fair Division of Christmas Gifts (10 bonus points) Christmas is approaching. You're helping Santa Clauses to distribute gifts to children. For ease of delivery, you are asked to divide n gifts into two groups such that the weight difference of these two groups is minimized. The weight of each gift is a postive integer. Please designe an algorithm to find an optimal division minimizing the value difference. The algorithm should find the minimal weight difference as well as the groupings in O(nS) time, where S is the total weight of these n gifts. Briefly justify the correctness of your algorithm. Hint: This problem can be converted into making one set as close to S/2 as possible. Taught in class, knapsack problem basically. ___ Appendix Master Theorem. Let a ≧ 1 and b > 1 be constants, let f(n) be a function, and let T(n) be defined on the nonnegative integers by the recurrence T(n)=aT(n/b) + f(n), where we interpret n/b to mean either \floor{n/b} or \ceiling{n/b}. Then T(n) has the following asymptotic bounds: 1. If f(n) = O(n^(log_b a-ε) for some constant ε> 0, then T(n) = Θ(n^(log_b a)). 2. If f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a)log n). 3. If f(n) = Ω(n^(log_b a+ε)) for some constant ε> 0 and if af(n/b) ≦ cf(n) for some constant c < 1 and all sufficiently large n, then T(n) = Θ(f(n)).