Intro. to Algorithm HW1

# Intro. to Algorithm HW1 ###### tags: `Intro. to Algorithm` - [name=曾文鼎 0716023] ## 1 ### a * 存在 $c=1, n_0=716023, h(n)=n^2$ 使得 $\forall n\geq n_0, 0\leq f(n)\leq ch(n)$ 。故 $f(n)=O(h(n))=O\left(n^2\right)$ 。 ### b * 以下證明 $\displaystyle\lim_{x\to\infty} \dfrac {g(x)}{\sqrt x}=0$ 。 $$ \begin{align}\lim_{x\to\infty} \frac {g(x)}{\sqrt x} &= \lim_{x\to\infty} \frac {\frac {\mathrm d}{\mathrm dx} g(x)}{\frac {\mathrm d}{\mathrm dx} \sqrt x} \\&= \lim_{x\to\infty} \frac{\frac 1 x}{\frac 1 {2\sqrt x}} \\ &= \lim_{x\to\infty} \frac 2 {\sqrt x} \\ &=0 \end{align}$$ 故 $g(n)=o(\sqrt n) = O\left(\sqrt n\right)$ 。 ### c 對於任意函數 $f(n)=O(F(n)), g(n)=O(G(n))$ ，存在 $c>0$ 和 $n_0>0$ 使得 $$\forall n>n_0, |f(n)|\leq|cF(n)| \wedge |g(n)|\leq|cG(n)|$$ 因此 $$\exists c>0, |f(n)g(n)| \leq c^2|F(n)G(n)|$$ 故得知 $f(n)g(n)=O(F(n)G(n))$ 。所以 $f(n)g(n)=O\left(n^2\sqrt n\right)=O\left(n^{2.5}\right)$ 。 ## 2 ### a 根據 Master Theorem： * 已知 $f(n)=O(n)$ * 存在 $\varepsilon=1$ 使得 $f(n)=\Omega(n^{\log_b a+\varepsilon})$ ，其中 $a=1,b=2$ 。 * 存在 $c=0.716023<1$ 使得 $af\left(\dfrac n b\right)\leq cf(n)$ ，因為 $$af\left(\dfrac n b\right)=f\left(\dfrac n 2\right)=O\left(\dfrac n 2\right)\leq O(n)=0.716023~f(n)=cf(n)$$ 故原式 $T(n)=\Theta(f(n))=\Theta(n)$ 。 ### b 因為 $\left\lceil\dfrac n 2\right\rceil - \left\lfloor \dfrac n 2\right\rfloor \leq1=O(1)$ ，故 $\left\lceil\dfrac n 2\right\rceil = \left\lfloor \dfrac n 2\right\rfloor + O(1)$ 。所以原式可以化為 $$ \begin{aligned} & T(n)= \begin{equation}\begin{cases} 2T\left(\left\lfloor \dfrac n 2\right\rfloor\right) + O(n)+O(1) &\mathrm{if}~n>1 \\ O(1) &\mathrm{if}~n=1 \end{cases}\end{equation} \\ \Longrightarrow~& T(n)= \begin{cases} 2T\left(\left\lfloor \dfrac n 2\right\rfloor\right) + O(n) &\mathrm{if}~n>1 \\ O(1) &\mathrm{if}~n=1 \end{cases}\end{aligned}$$ 根據 Master Theorem： * $f(n)=O(n)=\Theta\left(n^{\log_b a}\right)$ ，其中 $a=b=2$ 。故 $T(n)=\Theta\left(n^{\log_2 2}\log n\right)=\Theta\left(n\log n\right)$ 。 ### c 理由同 2b ，原式可化為 $$ T(n)= \begin{equation}\begin{cases} 3T\left(\left\lfloor \dfrac n 2\right\rfloor\right)+O(n) & \mathrm{if}~n>1 \\ O(1) & \mathrm{if}~n=1 \end{cases}\end{equation}$$ 根據 Master Theorem： * $f(n)=O(n)=O\left(n^{\log_b a-\varepsilon}\right)$ ，其中 $a=3, b=2, \varepsilon=\log_2 3-1$ 。故 $T(n)=\Theta\left(n^{\log_b a}\right)=\Theta\left(n^{\log_2 3}\right)$ 。 ### d 令 $n=2^k, S(k)=T\left(2^k\right)=T(n)$ ，可以推出 $$\begin{align} & T(n)=T(\sqrt n) +O(1) \\ \Longrightarrow ~& T\left(2^k\right)=T\left(2^{k/2}\right) + O(1) \\ \Longrightarrow ~& S(k)=2S\left(\dfrac k 2\right) + O(1)\end{align}$$ 亦即原式可以化為 $$S(k)=\begin{equation}\begin{cases}2S\left(\dfrac k 2\right)+O(1) &\text{if}~k>1 \\ O(1) &\text{if}~ k=1\end{cases}\end{equation}$$ 根據 Master Theorem： * $f(k)=O(1)=O\left(k^{\log_a b-\varepsilon}\right)$ ，其中 $a=b=2, \varepsilon=1$ 。因此 $S(k)=\Theta\left(k^{\log_2 2}\right)=\Theta(k)$ ，亦即 $T(n)=\Theta(\log n)$ 。 ## 3 ### a 定義此 heap 的節點（最小的節點）為 $V_1$ 至 $V_{15}$ ， $V_1$ 為 heap 頂端， $\forall i>1, V_i$ 的父節點為 $V_{\lfloor i/2\rfloor}$ 。為了方便敘述，令次小的節點為 $A$ 、再次小的的節點為 $B$ 。顯然 $V_1$ 擁有最小值， $V_2$ 和 $V_3$ 之一擁有次小值。 * 令 $V_2$ 與 $V_3$ 比較，這裡假設較小者為 $A=V_2$ 。在一個「比誰小」的遊戲中，$B$ 只可能輸給 $V_1$ 或 $A$ ，且在 heap 建造過程中只會輸給兩者之一恰一次。 * 若 $B$ 輸給 $V_1$ ，則 $B=V_3$ 。 * 若 $B$ 輸給 $A$ ，則 $B=V_4$ 或 $B=V_5$ 。簡之， $B$ 必為 $V_3$ 、 $V_4$ 或 $V_5$ 其中之一。為了找出答案，可以： * 先令 $V_3$ 與 $V_4$ 比較，令較小者為 $Z$ 。 * 再令 $Z$ 與 $V_5$ 比較，較小者即為答案 $B$ 。故只需要用到三次比較。考量所有情況， $B$ 可能為除了根和最底層以外的所有節點。 ### b 令第四小者的節點為 $C$ 。 $C$ 只可能輸給 $V_1$ 、 $A$ 或 $B$ 。因此 $C$ 必為該三節點其中之一的子節點。因此具體找出 $C$ 的方法為： * 先找出 $A$ 。 * 再找出 $B$ 。 * 若 $A$ 與 $B$ 為 $V_1$ 的子節點，則 $C$ 為 $V_4$ 至 $V_7$ 其中之一，即第三層（根為第一層）。 * 若 $B$ 為 $A$ 的子節點，則 $C$ 可能為 $V_0$ 的另一個子節點、或 $A$ 的另一個子節點、或 $B$ 的兩個子節點其中之一。考量所有的情況， $C$ 可能為 $V_2$ 至 $V_{15}$ 的任一節點，即除了根以外的所有其他節點。 ## 4 ### a 因為 $\forall i<j,S[i]>S[j]$ ，所以 $\forall i,S[i]+i\geq S[i+1]+(i+1)$ 。可以想像一個新陣列 $S[i]+i$ ，因為它是一個非嚴格遞減數列，故仍然可以使用單純的二分搜來找 $S[i]+i=0$ 。以下給出 pseudo code 的範例（陣列 index 刻意模仿題目從 $1$ 開始計算）。 ```javascript function judge(array) { let l = 1, r = array.length while (l < r) { let m = floor((l + r) / 2) if (array[m] + m <= 0) r = m else l = m + 1 } return array[l] + l == 0 } ``` ### b 在 $\forall i<j,S[i]\geq S[j]$ 的情況下，陣列 $S[i]+i$ 相當於未排序的陣列。判斷 $\exists i, S[i]+i=0$ 所需的複雜度顯然為 $\Omega(n)$ 。下面給出一個遊戲的例子。假設玩家詢問任意索引 $i$ 的 $S[i]+i$ 值，我們可以總是回答 $716023$ （亦即 $\forall i, S[i]=716023-i$）。因為玩家無法藉由這個回答推斷其他索引 $S[j]+j\neq0$ 的可能，因此玩家至少要詢問 $n$ 次才能完成遊戲。複雜度 $\Omega(n)$ 。 ## 5 ### a 假設我有現成的 second mode 演算法 $A$ ，那我可以用 $A$ 和其他 $O(n)$ 內的步驟，來解決 element uniquness problem 。以下給出 pseudo code 和步驟。 ```javascript import function A(array) function is_unique(v, array) { let c = 0 for (e in array) if (e == v) c++ return c == 1 } function solve_element_uniqueness_problem(array) { let v = array[0] if (!is_unique(v, array)) return false // O(n) let n = array.length for (i=0; i<n; i++) array.append(v) // O(n) let s = A(array) // O(?) return is_unique(s, array) // O(n) } ``` 1. 先檢查陣列第一個元素 $v$ 是否只出現一次。若否，答案為假。這個步驟需要 $O(n)$ 。 2. 將 $n$ 個 $v$ 加入陣列中。這個步驟需要 $O(n)$ 。 3. 呼叫 $A$ 演算法，得到陣列的 second mode 元素 $s$。時間複雜度待定。 4. 若 $s$ 只在陣列中出現一次，答案為真，否則為假。這個步驟需要 $O(n)$ 。因為既成事實是 element uniqueness problem 的複雜度至低是 $\Omega\left(n\log n\right)$ ，所以可以肯定 $A$ 的複雜度至低為 $\Omega(n\log n)$ 。 ### b $N$-based 的 Radix sort 可以把排序複雜度降到 $O(\log_n n^4)O(n)=O(n)$ 。以下給出 psuedo code。 ```javascript function count_sort(array, exp) { let n = array.length // Declare an array of queue let q = queue[n] for (val in array) { let i = val / exp % n q[i].push(val) } let c = 0 for (i=0; i<n; i++) { while (!q[i].empty()) { array[c++] = q[i].front() q[i].pop() } } } function radix_sort(array) { let n = array.length // array is passed by reference count_sort(array, 1) // O(n) count_sort(array, n) // O(n) count_sort(array, n*n) // O(n) count_sort(array, n*n*n) // O(n) // Now array is sorted } ``` 排序好之後，就可以在 $O(n)$ 算出 second mode 。 ## 6 先講結論，答案為真，可以在 $O(n)$ 內判別 $$\min_{x\in S}\text{freq}(x)+\max_{x\in S}\text{freq}(x)>n\left(1-\frac 1 {\log n}\right)$$ 的真偽。整個證明的大綱如下： * #1 若陣列中 distinct 元素少於等於兩個，答案恆真。 * #2 若陣列中 distinct 元素大於兩個，分兩情況。 * #2-1 若 $\displaystyle\max_{x\in S}\text{freq}(x)\leq n \left(1-\frac 2 {\log n}\right)$ ，答案恆假。 * #2-2 若 $\displaystyle\max_{x\in S}\text{freq}(x) > n \left(1-\frac 2 {\log n}\right)$ ，可以在 $O(n)$ 內判斷真偽。 ### 證明首先判斷陣列的中 distinct 元素是否少於等於兩個，這可以在 $O(n)$ 內完成。以下給出 pseudo code 範例： ```javascript function is_distinct_value_leq_2(array) { let v0 = null, v1 = null for (i=1; i<array.length; i++) { // O(n) if (array[i] == array[i-1]) continue else if (v0 == null) { v0 = min(array[i], array[i-1]) v1 = max(array[i], array[i-1]) } else if(array[i] < array[i-1]) { if (array[i] != v0) return false } else { if (array[i] != v1) return false } } return true } ``` ### #1 如果陣列的中 distinct 元素少於等於兩個，則顯然 $$\min_{x\in S}\text{freq}(x)+\max_{x\in S}\text{freq}(x) = n ~\text{or}~ 2n$$ 答案為真。 ### #2 接下來考慮陣列中 distinct 元素多於兩個的情況。首先檢驗 $$\max_{x\in S}\text{freq}(x) > n\left(1-\frac 2 {\log n}\right)$$ 是否成立。 * 因為 $\displaystyle\frac n 2 \leq n\left(1-\frac 2 {\log n}\right)$ ，這可以透過 Boyer–Moore majority vote algorithm 在 $O(n)$ 之內判斷，並找出 $\displaystyle\max_{x\in S}\text{freq}(x)$ 的值。以下給出 pseudo code 。 ```javascript function find_majority(array) { let maj = 0, c = 1; for (i=1; i<array.length; i++) { // O(n) if (array[maj] == a[i]) c++ else c-- if (c == 0) { maj = i c = 1 } } return array[maj]; } function find_frequency(array, val) { let c = 0 for (i in array) if (i == val) c++ // O(n) return c } // Queries if maxfreq(x)>n(1-2/log(n)) function judge(array) { let n = array.length let maj_candidate = find_majority(array) // O(n) // We can get the value of greatest frequency let max_freq = find_frequency(array, maj_candidate) // O(n) return max_freq > n * (1 - 2 / log(n)) } ``` ### #2-1 至此我們已經知道 $$\max_{x\in S}\text{freq}(x) \leq n\left(1-\frac 2 {\log n}\right)$$ 因為 $$\min_{x\in S}\text{freq}(x) \leq \dfrac 1 2\left(n-\max_{x\in S}\text{freq}(x)\right)$$ 故 $$\begin{align}\min_{x\in S}\text{freq}(x)+\max_{x\in S}\text{freq}(x) &\leq \dfrac 1 2\left(n-\max_{x\in S}\text{freq}(x)\right)+\max_{x\in S}\text{freq}(x) \\ &= \frac n 2 + \frac 1 2\left(\max_{x\in S}\text{freq}(x)\right) \\ &\leq\frac n 2 + \frac n 2 \left(1-\frac 2 {\log n}\right) \\ &= n\left(1-\frac 1 {\log n}\right)\end{align}$$ 得證答案為假。 ### #2-2 至此我們已經知道 $$\max_{x\in S}\text{freq}(x) > n\left(1-\frac 2 {\log n}\right)$$ 並且得出 $\displaystyle\max_{x\in S}\text{freq}(x)$ 的值，則只要計算 $\displaystyle\min_{x\in S}\text{freq}(x)$ 的值，就可以判斷原命題的真偽。以下證明給出 $O(n)$ 內的方法。 1. 因為 $$\max_{x\in S}\text{freq}(x) > n\left(1-\frac 2 {\log n}\right)$$ 則可以先將 majority 從陣列中去除，使得陣列的長度至多為 $\dfrac {2n} {\log n}$ 。顯然這個操作可以在 $O(n)$ 之內完成。 2. 已知對長度為 $n$ 的陣列可以在 $O(n\log n)$ 內排序完成；則對一個長度為 $\dfrac {2n} {\log n}$ 的陣列排序，可以在 $O\left(\dfrac{2n}{\log n}\log\dfrac{2n}{\log n}\right)=O(n)$ 內完成。以下給出在 $n$ 夠大 $(n>200)$ 的情況下的證明。 $$\begin{align}& n > \frac {2n}{\log n} \\ \Longrightarrow ~& \log n > \log \frac {2n}{\log n} \\ \Longrightarrow ~& 1 > \frac 1 {\log n}\log\frac{2n}{\log n} \\ \Longrightarrow ~& O(n)=2n>\frac{2n}{\log n}\log\frac{2n}{\log n}\end{align}$$ 3. 在排序剩餘的陣列元素之後，就可以在 $O(n)$ 內計算 $\displaystyle\min_{x\in S}\text{freq}(x)$ 的值。至此，我們可以在 $O(n)$ 內判斷原命題的真偽。