# 【資料結構與演算法】 02 Analysis Tools [TOC] <br> --- ## Part 1. Properties of Good Programs #### Good Program - Meet requirements and Correctness are basic - External for clear usage document - Internal for readability #### Proper Resource Usage - Efficient use of computation resources (CPU,GPU,etc.) $\to$ <span style="color:blue">time complexity</span> - Efficient use of storage resources (memory,disk,etc.) $\to$ <span style="color:blue">space complexity</span> 通常會先專注在**速度 (time complexity)**,然後再來探討**記憶體 (space complexity)**。 <br> --- ## Part 2. Operation Code (電腦運算碼) 首先了解電腦需要處理的運算符號,然後是運算碼數量。而運算碼的大小 (常數),其實對電腦不太重要。 - 所有操作都花費**一樣**的時間 $\to$ <span style="color:blue">運算符號</span> - 把所有操作需要的「**次數**」都計算出來 $\to$ <span style="color:blue">運算碼數量</span> - 根據**複雜度**,並且評估執行時間 <br> ### 2.1 運算符號 在下圖中,無論 $n$ 是多大,電腦所需處理的都是 $3$ 個運算符號。 <img src='https://miro.medium.com/v2/resize:fit:828/format:webp/1*sfXSZGZ9HvPF0LnQhOfIng.png'> ### 2.2 運算碼數量 <img src='https://miro.medium.com/v2/resize:fit:828/format:webp/1*hPJUtVnCaUtl1cPwYX6EaA.png'> <br> --- ## Part 3. 實例介紹 ### 3.1 **Space Complexity** of GET-MIN-INDEX ```c++= int GET-MIN-INDEX(A) { m = 1 // store current min.index for i=2 to A.length // update if i-th element smaller if A[m] > A[i] m = i return m } ``` - array A : pointer size is <span style="color:blue">$d_1$</span> (not counting the actual input elements) - integer m : size is <span style="color:blue">$d_2$</span> - integer i : size is <span style="color:blue">$d_2$</span> Total space $\implies$ 這個函數用了多少變數 $\to$ <span style="color:blue">$\ d_1 + 2d_2 \text{ (constant)}$</span> $\implies$ array 需要的空間為 <span style="color:blue">$d_1$</span> array 中的每個 element 各自需要 <span style="color:blue">$2d_2$</span> 的空間 <br> ### 3.2 **Time Complexity** of Insertion Sort ```c++= int INSERTION-SORT(A) { for m = 2 to A.length key = A[m] // Insert A[m] into the sorted sequence A[1...m-1] i = m - 1 while i > 0 and A[i] > key A[i+1] = A[i] i = i - 1 A[i+1] = key } ``` Total time $T(n)$ = $\color{blue}{d_2}n + \color{blue}{d_3}(n - 1) + \color{blue}{d_5}(n - 1) + \color{blue}{d_6}\sum_\limits{m=2}^{n}\color{red}{t_m} + \color{blue}{d_7}\sum_\limits{m=2}^{n}(\color{red}{t_m} - 1) + \color{blue}{d_8}\sum_\limits{m=2}^{n}(\color{red}{t_m} - 1) + \color{blue}{d_9}(n - 1)$ > $d_i$ 對應到第 $i$ 行程式碼 <br> --- ## Part 4. 複雜度概略 ### 4.1 複雜度種類 - **Big-O** ( $O$ ) 是複雜度量級的**上界 (Upper Bound)**,取最緊的 Upper Bound > - $3n^3 + 5n^2 + 10n + 3 \in O(n^3)$ > - $1000 \in O(1)$ - **Little-o** ( $o$ ) 是複雜度量級的**嚴格**上界。 > - $3n^3 + 5n^2 + 10n + 3 \notin o(n^3)$ > - $3n^3 + 5n^2 + 10n + 3 \in o(n^4)$ - **Big-omega** ( $\Omega$ ) 是複雜度量級的**下界(Lower Bound)**。 - **Little-omega** ( $\omega$ ) 是複雜度量級的**嚴格**下界(Lower Bound)。 - **Big-theta** ( $\Theta$ ) 是複雜度量級的**嚴格上下界** (完全相同)。 > - $3n^3 + 5n^2 + 10n + 3 \in \Theta(n^3)$ > - $3n^3 + 5n^2 + 10n + 3 \notin \Theta(n^4)$ > - $3n^3 + 5n^2 + 10n + 3 \notin \Theta(n^2)$ <br> ### 4.2 Three Big Asymptotic Notation <img src='https://miro.medium.com/v2/resize:fit:1194/0*AYlHuaDJIDKBXEyn.png'><br> - $f(n)$ grows slower than or similar to $g(n)$ : ("$\le$") $f(n) = O(g(n))$ iff exist $c$,$n_0$ such that $f(n) \le c \cdot g(n)$ for all $n \ge n_0$ - $f(n)$ grows faster than or similar to $g(n)$ : ("$\ge$") $f(n) = \Omega(g(n))$ iff exist $c$,$n_0$ such that $f(n) \ge c \cdot g(n)$ for all $n \ge n_0$ - $f(n)$ grows similar to $g(n)$ : ("$\approx$") $f(n) = \Theta(g(n))$ iff $f(n) = O(g(n))$ and $f(n) = \Omega(g(n))$ <br> ### 4.3 實際情況 在實際運用中,我們只在意**最糟糕情況 (worst case)** $\implies$ <span style="color:red">Big-O</span> <br> <img src='https://i.ytimg.com/vi/bxgTDN9c6rg/hq720.jpg?sqp=-oaymwEhCK4FEIIDSFryq4qpAxMIARUAAAAAGAElAADIQj0AgKJD&rs=AOn4CLCjK6zfDlJKEW4wsVyY2auG_bOH7g'> <br> --- ## Part 5. **Rough** Notation : Big-Theta and Big-O <img src='https://cstaleem.com/wp-content/uploads/2023/03/Theta-Notation-%CE%98-notation.png'><br><br> ### 5.1 Representing 'Rough' by Asymptotic Behavior $$ \underbrace{cn^2 + bn + a}_\text{f(n)} = \underbrace{\Theta(n^2)}_\text{g(n)} $$ - growth of $bn + a$ slower than $g(n) = n^2$ : removed by dividing $g(n)$ for large $n$ - asymptotically, two functions only differ by $c > 0$ $\lim\limits_{n\to\infty}\frac{f(n)}{g(n)} = c$ <br> In conclusion, for <span style="color:red">positive</span> $f(n)$ and $g(n)$, $$ f(n) = \Theta{g(n)}\ \text{ if } \ \lim_{n\to\infty}\frac{f(n)}{g(n)} = c > 0 $$ <br> ### 5.2 Big-$\Theta$ : roughly the same - Back to the before, definition meets criteria : - care about larger $n$ : yes $n \to \infty$ - leading term more important : yes, $n + \sqrt{n} + log{n} = \Theta(n)$ - insensitive to constants: yes, for example : $1126n = \Theta(n)$ - meaning : $f(n)$ grows roughly the same as $g(n)$ - "$=\Theta(\cdot)$" actually "$\in$" <br> | | $\sqrt{n}$ | $0.1126n$ | $n$ | $112.6n$ | $n^{1.1}$ | $e^n$ | $n+\sqrt{n}$ | | ------------ | ---------- | --------- | --- | -------- | --------- | ----- | ----------- | | $\Theta(n)?$ | N | Y | Y | Y | N | N | Y | <br> ### 5.3 Big-$O$ Notation 在實務上,我們通常只會使用到 Big-$O$。 For example, Binary Search, which best case $O(1)$ and worst case $\Theta(log n)$ for 'any' binary search task, needed time roughly no more than $log(n)$. <br> $f(n)$ grows slower than or similar to $g(n)$ : ("$\le$") One side of $\Theta(\cdot)$ definition : $$ f(n) = O(g(n)) \iff \text{ exist positive($c_2$, $n_0$) that } f(n) \le c_2 \cdot g(n) \text{ for all } n \ge n_0 $$ <br> --- ## Part 6. Time Complexity (時間複雜度) ### 6.1 Time Complexity O(1) ```javascript= function addUpTo(n) { return n * (n + 1) / 2 } ``` $\because$ 所有加減乘除的運算元都視為一樣的 $\to O(1)$ <br> ### 6.2 Time Complexity O(n) ```javascript= function addUpTo(n) { let total = 0 for (let i = 1; i <= n; i++) { total += i } return total } ``` $\because \text{One loop } \to O(n)$ <br> ### 6.3 Time Complexity O(n) : independent ```javascript= function countUpAndDown(n) { console.log('Going up!') for (let i = 0; i < n; i++) { console.log(i) } console.log('At the top!\nGoing down...') for (let j = n - 1; j>=0; j--) { console.log(j) } console.log('Back down. Bye!') } ``` $\because \text{Two loops but they are independent } \to O(n)$ <br> ### 6.4 Time Complexity O(n^2) ```javascript= function printAllPairs(n) { for (var i = 0; i < n; i++) { for (var j = 0; j < n; j++) { console.log(i, j) } } } ``` $\because \text{Nested loop(2) } \to O(n^2)$ <br> ### 6.5 Types of Time Complexity Order (由小到大) : $$log_2n < n < nlog_2n < n^2 < n^3 < 2^n < n!$$ 如圖 : <img src='https://miro.medium.com/v2/resize:fit:828/format:webp/1*beqf0L-9RJVuJbI5A3bQ5A.png'> <br><br> --- ## Part 7. Space Complexity (空間複雜度) 在空間複雜度中的定義 : - 大部分的原始型別(booleans、numbers、undefined、null)都是固定的空間。 - Strings 字串則是 O(n) space,這裡的 n 則是字串長度。 - 物件型別的陣列與物件也是 O(n) space,n 則是陣列的長度或是物件的 key 數量。 <br> ### 7.1 Space Complexity O(1) <img src='https://miro.medium.com/v2/resize:fit:828/format:webp/1*guRnGL9cW0NjUasnKLOSsg.png'><br> $\because$ 在上圖中,前者是assign一個變數,後者是assign另一個變數,兩者獨立且變數所需的空間為 1 $\to O(1)$ <br> ### 7.2 Space Complexity O(n) <img src='https://miro.medium.com/v2/resize:fit:828/format:webp/1*Jp5b635BiAEk0jBB_60Q6w.png'> $\because$ 每當原始陣列擴展 $n$ 倍,新陣列會需要 $n$ 倍的空間 $\to O(n)$ <br> --- ## Part 8. Big-O of Objects <img src='https://miro.medium.com/v2/resize:fit:828/format:webp/1*qeJ3D_PY5Hdtw1UqutYuZg.png'><br> #### 常見的物件方法 ```javascript= var obj = { 'Name': '3nGryCa7', 'HasJob': true, 'FavFood': ['gourmet', 'coffee'] } ``` - **Object.keys -O(n)** 將物件中的所有 key 值以陣列方式回傳。 ```javascript= Object.keys(obj); // Output -> ['Name', 'HasJob', 'FavFood'] ``` - **Object.values-O(n)** 將物件中的所有 value 值以陣列方式回傳。 ```javascript= Object.values(obj); // Output -> ['3nGryCa7', true, Array(2)] ``` - **Object.entries-O(n)** 將物件中的所有 key-value pairs 若干陣列方式回傳。 ```javascript= Object.entries(obj); // Output -> { 'Name': '3nGryCa7', 'HasJob': true, 'FavFood': ['gourmet', 'coffee'] } ``` - **hasOwnProperty-O(1)** 根據物件內是否有某項屬性回傳 Boolean 值,因此只需要檢查物件的某個 key 是否有對應的回傳資料。 ```javascript= obj.hasOwnProperty('Name'); // Output -> true ```