1092Algorithm HW

# 1092Algorithm HW ###### tags: `courses`, `assinment` ## HW5_Ex_6 Let $D[i,j]$ is the **edit distance** between x[1..i] and y[1..y]. Let the cost of each operation is $cost_{insert},\ cost_{delete},\ cost_{replace},\ cost_{twiddle}$ $$ D[i,j] = \min \begin{cases} 0 & \text{if } i = j = 0 \\ D[i, j-1] + cost_{insert} & \text{if } j > 0 \\ D[i-1, j] + cost_{delete} & \text{if } i > 0 \\ D[i-1,j-1] + cost_{replace} & \text{if } i,j > 0 \\ D[i-2, j-2] + cost_{twiddle} & \text{if } i,j > 1 \text{ and } x[i] = y[j-1] \text{and} x[i-1] = y[j-1] \\ \end{cases} $$ #### pseudo-code: ```python= # Get edit distance D = {Inf} D[0][0] = 0 for i in range(0, m): for j in range(0, n): D[i][j+1] = max(D[i][j+1], D[i][j] + c_insert) D[i+1][j] = max(D[i+1][j], D[i][j] + c_delete) D[i+1][j+1] = max(D[i+1][j+1], D[i][j] + c_replace) D[i-2, j-2] = max(D[i+1][j+1], D[i][j] + c_twiddle) ``` ```python= # Trace back i = n j = m while true: if i == 0 and j == 0: break if j > 0 and D[i][j] == D[i][j-1] + c_insert: j = j-1 operation_queue.push(insert) if i > 0 and D[i][j] == D[i-1][j] + c_delete: i = i-1 operation_queue.push(insert) if i > 0 and j > 0 and D[i][j] == D[i-1][j-1] + cost_replace i = i-1 j = j-1 operation_queue.push(replace) if i > 1 and j > 1 and D[i][j] == D[i-2][j-2] + cost_twiddle i = i-2 j = j-2 operation_queue.push(twiddle) ``` #### Analyze Time: $\Theta(m \times n)$ Space: $\Theta(m \times n)$ ## HW6_Ex4 #### 題目：題目可簡化為: 給定一擁有 n+1 段線段的集合，排列在數線上 $$ U=\{(x_0, x_0+m),(x_1, x_1+m),...,(x_{n+1}, x_{n+1})\} $$ 從 $U$ 中挑部份線段得到若干種子集合 $S$ ，其中 $S$ 中所有的線段必須覆蓋 $(x_0, x_{n+1})$ 求所有 $S$ 集合中線段數量最小的那一個 $$ \begin{eqnarray} &P &=& \{S | S \in U, \ \forall (y_i, y_i+m) \in S, \ y_i+m > y_{i+1} \} \\ &S_{ans} &=& \text{min}(|S_0|, |S_1|, ...) \end{eqnarray} $$ #### 推斷：每次取離現在的補水站能到的最遠的補水站： $$ \begin{eqnarray} e_0 &=& x_0\\ e_{i+1} &=& max(\{x | x \in U, x \leq e_i + m\}) \end{eqnarray} $$ #### prove: 假設此算法不是最佳解，則： $$ \begin{align} & \exists \ S_{better} \ne S_{my} \\ \\ & \text{Let } x = |S_{my}| \\ & \text{Let } y = |S_{better}| \\ & \text{Let } a_i \text{ be the }i \text{ th element in } S_{my}\\ & \text{Let } b_i \text{ be the }i \text{ th element in } S_{better}\\ & a_0 = b_0 = x_0 \\ \\ & \text{For } b_2,...,b_y \text{ they needs to cover up the lasting } (b1+m, x_{n+1}) \text{ area} \\ & \text{For } a_2,...,a_x \text{ they needs to cover up the lasting } (a1+m, x_{n+1}) \text{ area} \\ & \text{By definition }b_1 < a_1 \\ & \text{Then } (b_1+m, x_{n+1}) \geq (a_1+m, x_{n+1}) \\ & \text{So } b_1,...,b_y \text{ needs equal or more lines to cover up the area} \\ & \text{Then } y - 2 \geq x - 2 \\ & \text{Then } y \geq x \\ & \text{Thus } S_{better} \text{ is not a better solution} \end{align} $$ #### Analyze Time: $\Theta(n)$ ## HW7_Ex5 let every incoming activity $i$ in form $t_s$: means the time when the activity start $t_e$: means the time when the activity ends $a_i = (t_s, t_e)$ then, due to every activity only needs only one hell. So, the min hell we needs, is the max number of the overlapping activities ### Stratagy 每次都把編號最小的教室借出就可以了。 ### 分析 Time: $\Theta(n)$ --- ## HW10_Ex6 ### 1 因為每個 task 所需時間一樣 (unit task)，因此不會出現拿掉一個 task 的排程而獲得兩個 task 的排程的現象。並因此，讓最大懲罰的 task 優先排序必為最優解。若 $A_i <= D_i$ 則把盡量往 $D_i$ 排，若 $A_i > D_i$ 則把 $t$ 盡量往最後排這兩個步驟則是盡量減少對其他 task 的影響。 ### 2 ```python= # pseudocode struct set_element { parent : set_elemet latest : integer } solution(T : task_array): set_array = array of pointer to set_element latest_set = make_set(T.latest_one) for a in T: if set_array[a.deadline] is Nil: # 如果是空的就直接插入 a.time = a.deadline else: root = find_set(set_array[a.deadline]) if root.latest >= 0: a.time = root.latest # 如果前面有得插，就插在前面最後一個 else: a.time = latest_set.latest # 如果前面沒得插，則盡量往後插 # End set a.time # 將 a 前面與後面的 set 與 a union 起來 set_array[a.index] = make_set(a) if set_array[a.index - 1] is not Nil: union(set_array[a.index], set_array[a.index - 1]) if set_array[a.index + 1] is not Nil: union(set_array[a.index], set_array[a.index + 1]) ``` 假設 task 數量為 $N$，而迴圈每次都會做一次 `find_set`, `make_set`, `union` 若使用 Weighted Union + Path Compression 在 set 的處理上則 Time: $O(n\alpha(N))$ 大約等於 $O(N)$ ## HW11_Ex7 Bellman ford 1. 其實我們可以只做“被改變的那些點”所連結的邊做 relax 就可以了 2. 當沒有點的 _d_ 值被改變時就可以停了。 ```cpp= Bellman-Ford(G,s) Inint(G, s) que.push(s) // 將起始點塞入 que while que not empty // 將目前可能可以 relax 的點做 relax，若沒有點是可能可以再被改變的便結束 let u := que.pop() // 取出 for each v adjacent to u if d[v] > d[u] + u(u, v) then d[v] := d[u] + w(w, v) pi[v] := u que.push(v) // 若一個點 d 值被更新，則與他相鄰的點也有可能可以更新 ``` ## HW11_Ex8 令一 Vertex 內容是`(x, y)` 分別代表 dp 矩陣中的兩個 string 的 index 而矩陣內的運算我們可以改成 DAG 來實作 ```python= Judge(V, X, Y) if X[V.x] == Y[V.y] return 1 else return 0 LCS(X, Y) X = input() # 字串 X, 長度 N, 內容 0 ~ N-1 Y = input() # 字串 Y, 長度 M, 內容 0 ~ M-1 DP[X,Y] init(G, X, Y, DP) # 初始化 Graph: 每個點 V(i, j) 都會連到 v(i+1, j+1), v(i+1, j), v(i, j+1) # 初始化 DP[any, any] = 0 que.push((0, 0)) # 起始點 while que not empty: u := que.pop() if Judge(u, X, Y) == 1 and u.x + 1 < N and u.y + 1 < M DP[u.x + 1, u.y + 1] = DP[u] + 1 que.push((u.x + 1, u.y + 1)) else for each v adjacent to u if DP[v] < DP[u] DP[v] = DP[u] que.push(v) ``` ## HW12_Ex8 $m^n \space \text{mod} \space p$ $$ f() = \begin{cases} f() \\ f() \end{cases} $$ ``` let ans := m let n := n while n > 1 then if (n mod 2) == 1 then n := n - 1 ans = (ans * m) mod p else n = n / 2 ans = (ans * ans) mod p ``` ``` 2 ^ 11 (2^10) * 2 mod p (2 ^ 5) * (2 ^ 5) mod p (2 ^ 4) * 2 mod p (2^2) * (2^2) mod p (2 * 2) mod p 2 ``` $(A \times B)\ \text{mod}\ p$ $= ((A\ \text{mod}\ p) \times (B\ \text{mod}\ p))\ \text{mod}\ p$ ``` a / p = m ... n a = m_a * p + n_a b = m_b * p + n_b a * b = (m_a * p + n_a) * (m_b * p + n_b) = p(m_a + m_b) ``` TODO: 完成他 | Column 1 | Column 2 | Column 3 | | | -------- | -------- | -------- | --- | | Text | Text | Text | | ## HW13_Ex3 > Show how to determined the edge connectivity by a maximum-flow algorithm > connectivity：一個圖至少要刪除 $1 條邊才能使有點被隔離。 min cut 會等於 max flow 而我們把這張無向圖的所有邊的 cap 設為 1 時，global min cut 就會是 edge connectivity ### Ford–Fulkerson algorithm Ford–Fulkerson algorithm 的複雜度為：$O(E|f_{max}|)$ 由於這題中的每條邊的 capbility 都是 1，因此 $f_{max}$ 最糟就是 $V$ 因此複雜度最糟就是： $O(EV)$ 而我們在直接指定ㄧ點 $v$，計算所有 $f(v, u), u \neq v, u \in V$ 在這裡面所有的 min-cut 中選出最低的 global min-cut 這們做的複雜度就變成 $O(EV^2)$ $O(V^2E|f_{max}|)$  #### Faster way 我們不用嘗試 $V^2$ 種組合，只要 $V$ 組就可以了。我們先固定一點 $u$ 接著對於其他所有點做一次找 max-flow 即可。 ```python= def edge_connectivity(G): k = inf u = G.V[0] 將 G 轉換成新的樣態 for v in G.V: if u != v: k = min(k, max_flow(G, u, v)) return k ```` ### Stoer-Wagner 有一個很複雜的證明，我看不懂但是使用此演算前需要先講另一個 #### Maximum Cardinality Search 令 $A$ 為空集合，將任意一點 $v$ 放進去每回合去找一點與目前 $A$ 集合中的點連結權重最高的點，將其加入 $A$ #### Stoer-Wagner 想法做一次 Max Cardinality Search 降 A[n] A[n-1] 的 flow 紀錄然後把兩者合併注意如果存在點 v 則，w({A[n]mergeA[n-1]}, v) = w(A[n], v) + w(A[n-1], v) ## Hw14_ex1 > Show that any comparison-based algorithm needs (n log n) time to solve the following problem: Given n points in the 2D plane, findthe points that form the smallest convex polygon from these points which can contain all the points on the plane.(convex hull problem) convex hull problem 可以使用 Graham's Scan 來解 ::: info ### Graham's Scan 找一任一點 u 將其他所有點以（與 u 連線後的直線角度）做排序依排序順序將點與 u 連線若當前的 uv 連線為內凹，則往前做調整 ::: 而 Graham's scan 的連線調整過程所需時間為 $\Omega(n)$，因此 lower bound 決定在 sorting 身上由 Decision tree model 可證明 sorting lower bound 為 $\Omega(nlgn)$ 因此 Graham's scan 的 lower bound 為 $\Omega(nl)$ ## Hw15_ex4 ==TODO: 順序錯了，應該是要用 SAT 轉成此問題== 問題： $$ \begin{bmatrix} a_{11} & a_{12} &...&a_{1_n} \\ a_{21} & a_{22} &...&a_{2_n} \\ : & : &...& : \\ a_{m1} & a_{m2} &...&a_{m_n} \\ \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ : \\ x_m \end{bmatrix} ≤ \begin{bmatrix} b_1 \\ b_2 \\ : \\ b_m \end{bmatrix} $$ 證明，要解此問題的難度是 NP-complete ### 1 要驗成答案只須 polynomial time（做矩陣乘法）因此為 NP 問題 ### 2 裡面的數值都只有 {0, 1} 兩種因此對於每一行: 如果 b == 0 除了係數等於 0 的 x 們，其他 x 都要是 0 如果 b == 1 除了係數等於 0 的 x 們，其他所有 x 之種只能有 1 個 1 $$ \begin{cases} a_{11}x_1 + a_{12}x_2 + ... + a_{1n}x_n ≤ b_1 \\ a_{21}x_1 + a_{22}x_2 + ... + a_{2n}x_n ≤ b_2 \\ : \\ a_{m1}x_1 + a_{m2}x_2 + ... + a_{mn}x_n ≤ b_m \end{cases} $$ b== 0 全都 0 $$ \lnot (\bigvee_{\forall a_{ki} ≠ 0} x_i) \\ (\bigwedge_{\forall a_{ki} ≠ 0} \lnot x_i) $$ $$ \lnot(x_1 \lor x_2 \lor x_3) \\ \lnot x_1 \land \lnot x_2 \land \lnot x_3 $$ b == 1 兩兩不相同（只有一個一） $$ (\bigwedge_{\forall a_{ki} ≠ 0, a_{kj} ≠ 0, i ≠ j} \lnot x_i \lor \lnot x_j) \land (\bigvee_{\forall a_{k_i} ≠ 0} x_i) $$ $$ f(k) = \begin{cases} (\bigwedge_{\forall a_{ki} ≠ 0} \lnot x_i) & \text{if }b == 0 \\ (\bigwedge_{\forall a_{ki} ≠ 0, a_{kj} ≠ 0, i ≠ j} \lnot x_i \lor \lnot x_j) \land (\bigvee_{\forall a_{k_i} ≠ 0} x_i) & \text{if }b == 1 \end{cases} $$ $$ f(1) \land f(2) \land ... \land f(m) = 1 $$ 總之整理起來後，可轉換成 SAT 問題轉換時間在 polynomial time 內，其又為 NP Problem 因此得證此問題為 NP-complete ## Some algo about graph ### DFS tree - DFS - 顏色 - tree edge - back edge - 時間 - forward - across edge - [slide](https://docs.google.com/presentation/d/1dNBiqsBS7MzEjCrynwJjt_dvm3WwgZgSIyhTSg65uRY/edit#slide=id.gd8ba87fe58_1_3) ### Shortest path - Ballman ford - 可以對付負圈 negtive cir - 用所有頂點的邊的資訊做 n-1 (頂點為 n 個) 輪的鬆弛 - 我都叫它倒水 - time: - O ( | V | ⋅ | E | ) - Dijasktra - kind of greedy - 不能有 negtive weight - 因為 Dijasktra 是 Greedy - 所以在之後才會便好的路線不會走（後面出現負的），因此答案會是錯的 - 每次選一個離以確定點集合最近的點加入（並更新其距離） - solution - 選一點非確定其集合中離起點最近的點u - 將 u 計為確定集合 - 將點u 旁邊的人 relax - time: - adjacency list + heap - $O((m+n)logn) = O(mlogn)$ - adjacy mat + unsorted list - $O(n^2)$ - ad list + Fibonoacci heap - $O(nlogn + m)$ ### All-pair shortest path - floyd warshall - 試試看假設加入 k 點可不可以簡短 uv 之間的距離 - $V_u \rightarrow V_k \rightarrow V_v$ - D(u, v) = min(D(u, v), D(u, k) + D(k, v)) - 使用類似 DP - johnson - 1. 設一 dummy vertex 連到所有 vertex - 2. 算 V 次 D(dummy, any vertex) - 3. 從新計算所有 edge，（條成正的） - 4. 把所有點當成 source 做一次 Dijkstra - time: - 在稀疏圖的狀況下比 floyd warshell fast ### spanning tree - Prime's - 選點 - 從誰便一點開始，加入一點與目前集合最近的 - kurskal - 選邊 - 再沒有 cir 的條件下，選最小邊 - 問題：已知一 mini span tree，修改一個不在此 tree 上的一邊權重，求新的 mini spanning tree - 問題：次小 span tree - 在圖中拔掉一個在 mini span tree 中的邊，再全部重作一次 span tree - 總共 V 次 ### max-flow min-cut - ford forkersul - 每次從起點找一個 argumenting path to sink - 用 residual graph 去跑 - residual graph 可以回朔，因此可以修正之前的 flow - 跑到不能在跑得時候就完成了 - adman's cut - 用 BFS 找 argumenting path