# ADA 103 期中考
###### tags: `Exams`
# Problem 1 Short Answer Question (30 points)
Answer the following questions and briefly justify your answer. For (a)-(d),
decide whether you think the statement, particularly the underlined phrase, is
true or false. Briefly explain your choice to receive full credit.
(a) (3 points) True or False: To prove the correctness of a greedy algorithm,
we must prove that __every optimal solution contains our greedy choice__.
旭君是不是連續3年出同一題阿
False, just that one containg greedy choice is optimal
___
(b) (3 points) True or False: When items have __non-integer weights__, it may
be extremely inefficient to solve the 0/1 Knapsack Problem using dynamic
programming because of __lack of overlapping subproblems__.
True. Because non-integer addition/subtraction will result in a weight that is most likely not integer, and it is infeasible for us to make a dp table containing non-integer weights (since infinite of such values). The problem becomes easier if we can find pattern, like increments of set length.
___
(c) (3 points) True or False: (m_1,w_3),(m_2,w_2),(m_3,w_1) is one stable
matching in the following preference lists:
┌──┬───────┐ ┌──┬───────┐
│ │ 1st 2nd 3rd│ │ │ 1st 2nd 3rd│
├──┼───────┤ ├──┼───────┤
│ m_1│ w_3 w_2 w_1│ │ w_1│ m_1 m_2 m_3│
│ m_2│ w_2 w_1 w_3│ │ w_2│ m_1 m_2 m_3│
│ m_3│ w_3 w_2 w_1│ │ w_3│ m_2 m_1 m_3│
└──┴───────┘ └──┴───────┘
其實陳蘊儂沒有教ㄝ 可ㄅ如我
True. By the algorithm, we first let m1 and w3 become engaged, then we let m2 and w2 become engaged. m3 asks w3 but w3 prefer m1 over m3, m3 as w2 over but w2 also prefer m2 over m3. m3 finally left with w1
##### p.s. m3是甚麼死肥宅,大家都討厭他 可是就算這樣還是有老婆我哭哭
___
(d) (3 points) True or False: If f(n) is O(g(n)), then 2^f(n) is O(2^g(n)).
False. let f(n) = 2n, g(n) = n, but 4^n is not O(2^n)
___
(e) (3 points) Given the recurrence relation A(i,j) = F(A(i,j-1),A(i-1,j-1),
A(i-1,j+1)), provide a valid traversal order to fill the DP table or justify
why no valid traversal exists.
By row. these are all to the top or to the left of i,j. (i-1,j+1) is top right. If we fill the previous row, no problem
___
(f) (3 points) Given the recurrence relation A(i,j) = F(A(i-1,j-1),A(i+1,j+1)),
provide a valid traversla order to fill the DP table or justify why no valid
traversal exists.
No valid traversal, we need to fill top left and bottom right, which is impossible?
==有沒有什麼系統一點的方式去寫阿==
___
(g) (3 points) Suppose you have designed a Divide and Conquer algorithm whose
running time T(n) can be expressed by T(n) = aT(n/b)+f(n). Briefly explain
what a,b, and f(n) represent in your algorithm.
a = # subproblem
b = size of each subproblem
f(n) = time needed to conquer and merge the subproblems
___
(h) (3 points) Solve the following recurrence by giving a tight Θ-notation
bound:
$$
T(n) = 2T(n/2)+(n^2) log n
$$
Master's theorem, f(n) > n^1.00001, therefore time is f(n) = n^2logn
___
(i) (6 points) Describe a real-world problem that you have solved or have thought about solving using technique learned in the class. Make an educated
guess about how fast your algorithm might be, and how much time it would take to solve the same problem using a brute-force approach instead.
Greedy 演算法之cashier problem.
又出現了!!! 大家是不是該想一個好的演算法了呢?
___
# Problem 2: nth Smallest in Two Databases (15 points)
You're given two databases D_1 and D_2, each of which contains n numbers. These 2n numbers are all distinct. For each database, you can query for the kth
smallest number for any 1≦k≦n. You want to find the kth smallest number among
these 2n numbers.
(a) (10 points) Please design a Divide-and-Conquer algorithm to find the nth
smallest number among these 2n numbers in O(log n) queries.
Hint: Let m = \floor{n/2} and let v_1,m and v_2,m be the mth smallest
number in database D_1 and D_2, respectively. Let x be the nth smallest number
among these 2n numbers. Consider where x is in three cases: v_1,m > v_2,m,
v_1,m = v_2,m, v_1,m < v_2,m.
If v1m < v2m, we know that the first m elements in v1 must be smaller than v2m (although we don't know whether m+1, m+2... is). If we know for sure they are smaller, and m < n, we can simply delete them. The next time we query, we need to find the N-m = N/2 largest value, and eliminating another part
==是這樣嘛? 一次刪1/4==
___
(b) (5 points) Briefly justify the running time of your algorithm is indeed
O(log N) queries.
O(log N) queries by drawing recursion trees. If we delete in O(1), this is really
T(n) = T(3n/4) + O(1)
Which is just log(N) levels each taking O(1).
___
# Problem 3: Longest Path (20 points)
(a) (3 points) Briefly explain what optimal substructure is.
The smaller subproblems in an optimal solution is also optimal. We can divide a large OPT into composition of smaller OPT.
___
(b) (5 points) Consider a graph G = (V,E), a start node s ∈ V, and end node
t ∈ V, and weight w_e of each edge e ∈ E. The Longest Path Problem is to find
a longest simple path going from s to t on this graph. (The length of a path is
the sum of weights of all edges on the path. A simple path is a path that
visits a node at most once, that is, a path without cycles.) Please construct a
counterexample to show that the Longest Path Problem has no optimal
substructure.
上課說過 QRST 成正方形, cycle
___
(c) Your friend made a interesting observation that the Longest Path Problem on
certain types of graphs does exibit optimal substructure. To test your friend's
conjecture, you decide to use a dynamic programming algorithm to find one
longest path from top-left to bottom-right on a grid-like directed graph with
edges going downward and rightward only. As an example, Figure 1 shows a 4-by-4
grid in which each edge is labeled with a weight.
Answer the following questions:
1. (3 points) Find one longest path in the graph in Figure 1.
length 39.

2. (6 points) Consider a n-by-n grid. Suppose you want to find the length
of a longest path using dynamic programming. Please define your
subproblems and represent the value of the optimal solution using a
recurrence.
$$ dp[i][j] = max(dp[i-1][j], dp[i][j-1]) $$
3. (3 points) What's special about this type of graphs? Please explain why
the Longest Path Problem has no optimal substructure in general but has
optimal substructure in this special type of graphs. An informal
observation would be sufficient.
This is a DAG (Directed Acyclic Graph), so we know that it is impossible for us to "backtrack" at all. This means that we won't have the problem that might result in travelling in parts of a cycle. This graph is especially nice for how it organized it into a grid, but that is not neccessary.
3 4 5
s→→→○→→→○→→→○
8↓ 5↓ 5↓ 8↓
↓ 8 ↓ 7 ↓ 3 ↓
○→→→○→→→○→→→○
3↓ 6↓ 5↓ 4↓
↓ 3 ↓ 9 ↓ 7 ↓
○→→→○→→→○→→→○
7↓ 5↓ 3↓ 1↓
↓ 3 ↓ 1 ↓ 4 ↓
○→→→○→→→○→→→t
Figure 1: Find one longest path from s to t.
# Problem 4: The Rise of the Planet of the Apes (15 points)
Being a brilliant ape, Caesar is learning a sign language to communicate with
his trainer, Will. In this sign language, each word, is represented by a
sequence of gestures. This sign language should be prefix-free, meaning that no
word is a prefix of another word, since Caesar is smart enough to combine words
into phrases. Will wants to teach Caesar the following seven words that are
frequently used in their daily activities:
┌─────┬─────────────────────┐
│ Word │food sleep play drink open who come │
├─────┼─────────────────────┤
│ Frequency│0.25 0.15 0.12 0.19 0.07 0.14 0.08 │
└─────┴─────────────────────┘
(a) (5 points) Will knows that Caesar can do two gestures─pointing his index
finger upward (UP) pointing his index finger downward (DOWN)─without
difficulty. Please help Will to create a prefix-free binary sign language so as
to minimize the average number of gestures per word. This sign language should
contain the above seven words and each word is represented by a sequence of UP
and DOWN gestures.
Huffman coding 簡單化一下就好
___
(b) (5 points) Caesar now can do three gestures─UP, DOWN, and WAVE. Please
create a prefix-free ternary sign language so as to minimize the average
number of gestures per word. The sign language should contain the above seven
words and each word is represented by a sequence of UP, DOWN, and WAVE gesture.
還是簡單 變成3個小孩 剛好都是full的不用掉換
___
c) (5 points) Following (a), Will observes that the UP gesture consumes three
times of energy than the DOWN gesture. Will comes up with greedy heuristic in
the hope to minimize the average energy consumption per word. His greedy
heuristic is as follows. Given n words and their frequencies, Will first draws
a binary prefix tree that minimizes the average number of gestures per word in
its corresponding binary sign language. He then re-labels each edge such that
for every node v and its left node v_l and right node v_r, edge (v,v_l) is
re-labeled with UP and edge (v,v_r) is re-labeled with DOWN if the sum of the
frequencies of the leaf nodes in v's left subtree is lower than it of its right
subtree. The labels are switched otherwise. Please show that this greedy
algorithm is incorrect. You don't need to use the same input (i.e., the 7 words
and their frequency distributions) in your counterexample.

# Problem 5: Counting Inversions (10 points + 10 bonus points)
(a) Counting inversions revisited (10 points) Given a sequence of unique
numbers B = b_1,b_2,...b_n, an inversion is a pair of numbers b_i and b_j in
this sequence such that i < j and b_i > b_j. Let I(B) be the set of inversions
in B. For example, if B = <1, 3, 5, 2>, I(B) = {(3,2),(5,2)}, and the number of
inversions |I(B)| is 2.
Given an unsorted sequence B with n numbers, please design an efficient
algorithm to calculate the number of inversions |I(B)| in O(n log n) time.
Please briefly explain why your algorithm indeed runs in O(n log n).
Like homework, use a modified merge sort
___
(b) Bonus: Counting significant inversions (10 points) Given a sequence of
unique numbers B = b_1,b_2,...b_n, a significant inversion is a pair of numbers
b_i and b_j in this sequence such that i < j and b_i > 2b_j. Let SI(B) be the
set of significant inversions in B. For example, if B = <1, 3, 5, 2>,
SI(B) = {(5,2)}, and the number of significant inversions |SI(B)| is 1.
Given an unsorted sequence B with n numbers, please describe how to modify
your algorithm in part (a) for calculating the number of significant inversions
|SI(B)| in O(n log n) time. Please briefly explain why your algorithm indeed
runs in O(n log n).
Same question as 104, modified merge sort with pointer pointing at the value Vr that is at least 2Vl.
___
# Problem 6: Finding closest pair of points (10 points + 10 bonus points)
(a) (10 points) There are n points in a two-dimensional space, and point i's
x-y coordinate is (x_i,y_i). However, in this bizarre world, the distance
between points i and j is defined as $$d(i,j) = min {(|x_i-x_j|,|y_i-y_j|)} $$, which is not Euclidean distance.
Suppose you are given arrays X_n and Y_n containing these n points sorted
by their x-coordinates and y-coordinates, respectively. Please design a greedy
algorithm to find a closest pair of points in O(n) time, and explain why your
algorithm is correct. You can prove it by showing your algorithm has the greedy
choice property and optimal substructure. Alternatively, you can also use other
proof techniques as long as you can show that your solution is indeed optimal.
Hint: first consider a 1D case where the distance is defined as
d(i,j) = |x_i-x_j| and then extend your solution to 2D.
Since the arrays are already sorted, we can just loop through i and i+1. We do it once in array X, once in array Y. The absolute minimum in the 2 is the closest point, thus O(n)
Optimal substructure:
Say we divide it into 2 random groups, the global minimum must be the minimum of these 2 groups
Greedy property:
since they are sorted, the distance between i and i+1 must be minimum. If OPT exist that don't contain i and i+1, but instead i and i+2, then we know i and i+1 is at least as good
___
(b) Bonus (10 points) There are n points in a two-dimensional space, and point
i's x-y coordinate is (x_i,y_i). However, in this bizarre world, the distance
between points i and j is defined as $$d(i,j) = max{(|x_i-x_j|,|y_i-y_j|)}$$, which is not Euclidean distance.
Please design an algorithm to find a closet pair of points in O(n log n)
time, and explain why your algorithm is correct.
DC approach similar to how we solve it in euclidean space. We first take O(nlogn) to sort each x and y. We then DC into 2 smaller parts, lets say by the x coordinate. We do this until base case.
When we merge, we choose the min distance to be min(left, right, across). We can calculate across in a similar way. Let delta = min(left, right), we can allocate an area of 2delta by delta. Because of this "max" value, it make sense that any point must be within this delta x 2delta area or else either its x difference or y difference will be larger than the min. We can find the min by comparing 7 closest points.
==感覺這個至多8個點的性質會繼續存在。如果有的話根本是一樣的作法?==
___
Problem 7: Fair Division of Christmas Gifts (10 bonus points)
Christmas is approaching. You're helping Santa Clauses to distribute gifts to
children.
For ease of delivery, you are asked to divide n gifts into two groups such
that the weight difference of these two groups is minimized. The weight of each
gift is a postive integer. Please designe an algorithm to find an optimal
division minimizing the value difference. The algorithm should find the minimal
weight difference as well as the groupings in O(nS) time, where S is the total
weight of these n gifts. Briefly justify the correctness of your algorithm.
Hint: This problem can be converted into making one set as close to S/2 as
possible.
Taught in class, knapsack problem basically.
___
Appendix
Master Theorem. Let a ≧ 1 and b > 1 be constants, let f(n) be a function, and
let T(n) be defined on the nonnegative integers by the recurrence
T(n)=aT(n/b) + f(n),
where we interpret n/b to mean either \floor{n/b} or \ceiling{n/b}. Then T(n)
has the following asymptotic bounds:
1. If f(n) = O(n^(log_b a-ε) for some constant ε> 0, then
T(n) = Θ(n^(log_b a)).
2. If f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a)log n).
3. If f(n) = Ω(n^(log_b a+ε)) for some constant ε> 0 and if
af(n/b) ≦ cf(n) for some constant c < 1 and all sufficiently large n,
then T(n) = Θ(f(n)).