# 【資料結構與演算法】 01 Introduction [TOC] <br> <hr> ## Part 0. 為什麼需要 Data Structures and Algorithms? <span style="color:red">Data structures</span> and <span style="color:blue">Algorithms</span> : for writing good <span style="color:blueviolet">Program</span> - **Space Resources** $\to$ <span style="color:red">Data structure</span> - momory - disk(s) - transmission bandwidth - **Computation Resources** $\to$ <span style="color:blue">Algorithm</span> - CPU(s) - GPU(s) - computation power - Other Resources - manpower - budget usually cared by <span style="color:green">management</span> <br> --- ## Part 1. Accessing Algorithms 好的 Data Structure 需要拿出來與放進去的方法。 ### GET Generally assume to **read without deleting** - GET-BY-INDEX : for arrays - GET-NEXT : for sequential access - GET(key) : for search > Often-<span style="color:red">GET</span> $\iff$ <span style="color:blue">place</span> "nearby" (常用到的東西會放的近一點) ### INSERT Generally assume to **add without overriding** - INSERT-BY-INDEX : for arrays - INSERT-AFTER : for sequential access - INSERT(key, data) <br> --- ## Part 2. Maintenance Algorithms 資料結構的隱含成本 : 維護所需花費的時間 (esp. <span style="color:blueviolet">REMOVE</span> & <span style="color:orange">UPDATE</span>) ### CONSTRUCT - usually possible with multiple <span style="color:red">INSERT</span> - sometimes possible to be <span style="color:green">faster</span> than so ### REMOVE - often viewed as deleting <span style="color:blue">after GET</span> - $\sim$ <span style="color:blueviolet">UN</span><span style="color:red">INSERT</span> : often harder than <span style="color:red">INSERT</span> ### UPDATE - usually possible with <span style="color:blueviolet">REMOVE</span> + <span style="color:red">INSERT</span> - can be viewed as <span style="color:red">INSERT</span> <span style="color:orange">with overriding</span> <br> --- ## Part 3. **INSERT** of Ordered Array ### SWAP Version 先插入在最後,再移動到適當的位置。 ```c++= // INSERT(A, data) n = A.length A[n+1] = data // put in the back for i=n downto 1 if A[i+1] < A[i] SWAP(A[i], A[i+1]) // cut in consecutive else return ``` ### Direct Cut-in Version 先找到適當的位置,再插入該位置。 ```c++= // INSERT(A, data) key = data i = A.length while i > 0 and A[i] > key A[i+1] = A[i] i = i - 1 A[i+1] = key return ``` <br> --- ## Part 4. **CONSTRUCT** of Ordered Array ### Selection Sort 從未排序中,取得最小值,放在已排序部分的最後。<br> <img src='https://he-s3.s3.amazonaws.com/media/uploads/2888f5b.png'><br> #### 虛擬碼 ```c++= // SELECTION-SORT(A) for i=1 to A.length GET-MIN-INDEX(A,i,A.length) SWAP(A[i],A[m]) return A // GET-MIN-INDEX(A,l,r) m = l // store current min.index for i=l+1 to r // update if i-th element smaller if A[m] > A[i] m = i return m ``` <br> ### Insertion Sort Sequential 整個序列,把當前 element 插入適當的位置。 > CONSTRUCT with multiple <span style="color:red">INSERT</span> <img src='https://miro.medium.com/v2/resize:fit:818/1*_NOe6jL9veyR4yAyf85dzA.png'><br> #### 虛擬碼 ```c++= // INSERTION-SORT(A) for i=1 to A.length INSERT(A,i) return A // INSERT(A,m) key = A[m] i = m-1 while i > 0 and A[i] > key A[i+1] = A[i] i = i - 1 A[i+1] = key ``` <br> --- ## Part 5. **REMOVE and UPDATE** of Ordered Array ### REMOVE ```c++= // REMOVE(A,m) i = m + 1 while i < A.length A[i-1] = A[i] // fill in i = i + 1 A.length = A.length - 1 ``` ### UPDATE ```c++= // UPDATE(A,m,data) key = data i = m if A[i] > key // insert to front i = i - 1 while i > 0 and A[i] > key A[i+1] = A[i] i = i - 1 A[i+1] = key else // insert to back i = i + 1 while i < A.length and A[i] < key A[i-1] = A[i] i = i + 1 A[i+1] = key ``` <br> --- ## Part 6. Sequential Search Algorithm <span style="color:blue">SEQ-SEARCH</span> : structurally similar to <span style="color:red">GET-MIN-INDEX</span> #### Review of GET-MIN-INDEX ```c++= // GET-MIN-INDEX(A,i,r) m = l // store current min.index for i=l+1 to r // update if i-th element smaller if A[m] > A[i] m = i return m ``` ### Sequential Search ```c++= // SEQ-SEARCH(A,key,l,r) for i = l to r // return when found if A[i] equals key return i return NIL ``` ### Sequential Search with Shortcut 若是在已排序的情況下,有機會可以提早結束。 ```c++= // SEQ-SEARCH-SHORTCUT(A,key,l,r) for i = l to r // return when found if A[i] equals key return i else if A[i] > key return NIL return NIL ``` <br> --- ## Part 7. Binary Search Algorithm <span style="color:red">Binary Search</span> can multiple shortcuts by <span style="color:red">quickly checking the middle</span> #### 虛擬碼 ```c++= // BIN-SEARCH(A,key,l,r) while l <= r m = floor((l+r)/2) if A[m] equals key return m else if A[m] > key r = m - 1 // cut out end else if A[m] < key l = m + 1 // cut out begin return NIL ``` #### 程式碼 ( java.util.Arrays ) ```java= private static int binarySearch(int[] a, int key) int low = 0; int high = a.length - 1; while (low <= high) { int mid = (low + high) >>> 1; int midVal = a[mid]; if (midVal < key) low = mid + 1; else if (midVal > key) high = mid - 1; else return mid; // key found } return -(low + 1) // key not found ``` <br> --- ## Part 8. Conclusion ### Trade-off Different Factors <span style="color:blueviolet">good program</span> needs understanding trade-off - faster GET $\iff$ slower INSERT and/or maintenance - more space $\iff$ faster computation - harder to implement/debug $\iff$ faster computation ### Summary of Data Structure - Definition of data structure : organize data with <span style="color:orange">access/maintenance</span> algorithms - Ordered array as data structure : <span style="color:orange">insert by cut-in, remove by fill-in</span> - GET (search) in ordered array : <span style="color:orange">binary search</span> using <span style="color:orange">order</span> for shortcuts - why data structures and algorithms : study <span style="color:orange">trade-off</span> to move from coding to programming <br> --- ## Appendix :question: 生成測試資料的技巧 1. 思考什麼情形最容易出錯,例如最極端情形/邊緣情形 (boundary case),需思考到全面狀況 2. 如何生成大規模測試資料