jhengjhe
    • Create new note
    • Create a note from template
      • Sharing URL Link copied
      • /edit
      • View mode
        • Edit mode
        • View mode
        • Book mode
        • Slide mode
        Edit mode View mode Book mode Slide mode
      • Customize slides
      • Note Permission
      • Read
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Write
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Engagement control Commenting, Suggest edit, Emoji Reply
    • Invite by email
      Invitee

      This note has no invitees

    • Publish Note

      Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

      Your note will be visible on your profile and discoverable by anyone.
      Your note is now live.
      This note is visible on your profile and discoverable online.
      Everyone on the web can find and read all notes of this public team.
      See published notes
      Unpublish note
      Please check the box to agree to the Community Guidelines.
      View profile
    • Commenting
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
      • Everyone
    • Suggest edit
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
    • Emoji Reply
    • Enable
    • Versions and GitHub Sync
    • Note settings
    • Note Insights New
    • Engagement control
    • Make a copy
    • Transfer ownership
    • Delete this note
    • Save as template
    • Insert from template
    • Import from
      • Dropbox
      • Google Drive
      • Gist
      • Clipboard
    • Export to
      • Dropbox
      • Google Drive
      • Gist
    • Download
      • Markdown
      • HTML
      • Raw HTML
Menu Note settings Note Insights Versions and GitHub Sync Sharing URL Create Help
Create Create new note Create a note from template
Menu
Options
Engagement control Make a copy Transfer ownership Delete this note
Import from
Dropbox Google Drive Gist Clipboard
Export to
Dropbox Google Drive Gist
Download
Markdown HTML Raw HTML
Back
Sharing URL Link copied
/edit
View mode
  • Edit mode
  • View mode
  • Book mode
  • Slide mode
Edit mode View mode Book mode Slide mode
Customize slides
Note Permission
Read
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Write
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Engagement control Commenting, Suggest edit, Emoji Reply
  • Invite by email
    Invitee

    This note has no invitees

  • Publish Note

    Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

    Your note will be visible on your profile and discoverable by anyone.
    Your note is now live.
    This note is visible on your profile and discoverable online.
    Everyone on the web can find and read all notes of this public team.
    See published notes
    Unpublish note
    Please check the box to agree to the Community Guidelines.
    View profile
    Engagement control
    Commenting
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    • Everyone
    Suggest edit
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    Emoji Reply
    Enable
    Import from Dropbox Google Drive Gist Clipboard
       Owned this note    Owned this note      
    Published Linked with GitHub
    • Any changes
      Be notified of any changes
    • Mention me
      Be notified of mention me
    • Unsubscribe
    # Fundamentals Of Data Structures [交大開放式課程](http://ocw.nctu.edu.tw/course_detail-v.php?bgid=9&gid=0&nid=412) [Ctutor](http://pythontutor.com/c.html#mode=edit) ## Chapter 1. Basic concepts ### Overview: System Life Cycle #### Requirement 系統的需求 #### Analysis * Bottom-up * Top-down #### Design * Data objects: abstract data types * Operations: specification(function) & design of algorithms(設計演算法達到 function 的目的) #### Refinement & Coding * Choose representations for data objects * Write algorithms for each operation on data obejcts #### Verification * Correctness proofs: selecting proved algorithms * Testing: correctness & efficiency * Error removal: well-document ### Evaluative judgments about programs * Meet the original specification? * Work correctly? * Document? * Use functions to create logical units? * Code readable? * User storage efficiently? * Running time acceptable?(as fast as possible) ### Data Abstraction * Predefined & user defined data type ``` Struct student{ char last_name; int student_id; char grade; } ``` * Data type: objects & operations (加減乘除等等) * Reoresentation: char 1 byte, int 4 byte * Abstract Data Type (ADT): data type specification(object & operation) is separated from representation. * ADT is implementation-independent(與實作無關,ADT 僅定義如何做,但沒有實作細節。p.20 Abstract data type NaturalNumber) ### Algorithm Specification #### Algorithm criteria * Input * Output * Definiteness * Finiteness * Effectiveness 演算法是用來解決問題的方法,演算法中的 input and output 表示了你的問題。我們會希望我們的演算法能夠有效率且能夠在有限的步驟內,解決問題。 最簡單的演算法,排序。 input: 一串數值。 output: 一串數值,但需依照大小順序排好。 **program doesn't have to be finite(e.g. OS scheduling)** ### Example 1: Selection Sort * From those integers that are currently unsorted, find the smallest and place it next in the sorted list. ```csharp= // 虛擬程式碼 for(i=0; i < n; i++){ Examine list[i] to list[n-1] and suppose that smallest integer is list[min] Interchange list[i] & list[min] } ``` ```csharp= void sort(int list[], int n) { for (i=0; i < n-1; i++){ min = i; for(j = i+1; j < n; j++){ if(list[j] < list[min]) min = j; } SWAP(list[i], list[min], temp) } } } ``` ### Example of Binary Search Search 最暴力的方法:一個一個去確認搜尋。 在執行 Binary Search 前需要先進行排序。 圖中 * 為二分點 不一定找得到輸入的數值 ![Binary Search](https://i.imgur.com/4ejevyt.png) ```csharp= // pseudo code While (there are moree integers to check){ middle = (left + right) / 2; if(searchnum < list[middle]) right = middle -1 ; else if(searchnum == list[middle]) return middle; else left = middle+1; } ``` ```csharp= int compare(int x, int y) /* return -1 for less than, 0 for equal */ int binsearch(int list[], int searchno, int left, int right) { while (left <= right){ middle = (left + right) / 2; switch (COMPARE(list[middle], searchno)){ case -1: left = middle +1; break; case 0: return middle; case 1: right = middle -1 } } } ``` **不要為了學語言而學語言,學會解決問題的方法。** ### Example 3: Selection Problem * Selection problem: select the kth largest among N number * Solutions: ``` * Approach 1 * read N numbers into an array * sort the array in decreasing order * return the element in position k * Approach 2 * read k elements into an array * sort them in decreasing order * for each remaining elements, read one by one * ignored if it is smaller than the kth element * otherwise, place in correct place and bumping one out of array ``` * Which is better? * More efficient algorithm? ### Recursive Algorithms * Direct recursion: functions that call themselves * Indirect recursion: Functions that call other functions that invoke calling function again * C(n, m) = n!/[m!(n-m)!] -> C(n, m) = C(n-1, m) + C(n-1, m-1) * Boundary condition for recursion (停止條件 Boundary) **不要讓演算法進入無限迴圈** ### Recursive Factorial * n! = n * (n-1) -> fact(n) = n * fact(n-1) * boundary -> 0! = 1 ```csharp= int fact(int n){ if( n == 0 ) // Boundary condition return (1) else return (n*fact(n-1)) } ``` ### Recursive Multiplication * a * b = a * (b-1) +a * a * 1 = a ```csharp= int mult(int a, int b) { if(b == 1) return (a); else return ( mult(a, b-1) + a ) } ``` ### Recursive Summation * sum(1, n) = sum(1, n-1) + n * sum(1, 1) = 1 ```csharp= int sum(int n){ if(n == 1) return (1); else return ( sum(n-1) + n ) } ``` 也可以直接使用 for 迴圈寫完 ### Recursive binary search ```csharp= int binsearch(int list[], int searchno, int left, int right) { if(left <= right){ middle = (left + right) / 2; switch(COMPARE(list[middle], searchno)){ case -1: retrun binsearch(list, searchno, middle + 1, right) case 0: return middle; case 1: return binsearch(list, searchno, left, middle - 1) } } return -1; } ``` ### Recursive Permutations * Permutation of { a, b, c } ``` (a, b, c), (a, c, b) (b, a, c), (b, c, a) (c, a, b), (c, b, a) ``` * Recursion? * a + Perm({b, c}) -> {a, b, c} and {a, c, b} * b + Perm({a, c}) -> {b, a, c} and {b, c, a} * c + Perm({a, b}) -> {c, a, b} and {c, b, a} ![Perm Function](https://i.imgur.com/2NGAjOe.png) ```csharp= void perm(char *;list, int i , int n){ if(i == n){ for(j = 0; j <=n; j++) cout<<list[j]; }else{ for(j = i; j <= n; j++){ SWAP(list[i], list[j], temp); perm(list, i+1, n) SWAP(list[i], list[j], temp) } } } ``` Q. 字元排序問題如果不用 recursive 要怎麼解決? for loop? ### Performance Evaluation * Performance analysis: machine independent * Performance measurement: machine dependent ### Performance Analysis * Complexity theory * Space complexity: amount of memory * Time complexity: amount of computer time ### Space Complexity ![Space Complexity](https://i.imgur.com/j7NhWFa.png) ### Time Complexity ![Time Complexity](https://i.imgur.com/lQng23w.jpg) ![Time Complexity 1](https://i.imgur.com/xSbGxYw.jpg) * Worst case * Best case * Average case ![](https://i.imgur.com/gg7flgS.png) ### Asymptotic Notation - Big "oh" ![Big "oh"](https://i.imgur.com/ILwyfKi.png) ### Asymptotic Notation - Omega ![Omega](https://i.imgur.com/liKbk1T.png) The most important is most lower bound. :::info 如何計算 recursive 的時間複雜度。 ![](https://i.imgur.com/cSnTuOh.jpg) ![](https://i.imgur.com/vV0vHu4.jpg) ![](https://i.imgur.com/ReASqSx.jpg) ![](https://i.imgur.com/ttRm8lZ.jpg) ![](https://i.imgur.com/U9i51yr.jpg) ![](https://i.imgur.com/DpOLudE.png) ![](https://i.imgur.com/4biv4iU.jpg) ![](https://i.imgur.com/pxSX4zR.png) https://github.com/illuminate/support/blob/master/Arr.php#L207 10 ans. D ::: ```csharp= Int a = 0 Int* c = &a *c = 1 Int b = 10 c = &b *c = 3 ``` ## Array Array: a set of index and value * data structure For each index, there is a value associated with that index Array ADT ![](https://i.imgur.com/nDJuGiG.jpg) :::info ![](https://i.imgur.com/RfrIBq5.jpg) ![](https://i.imgur.com/RgoR4CW.jpg) ::: https://youtu.be/cgaPEsMJkSc?t=3132 ![](https://i.imgur.com/aTAS6kQ.png) https://blog.gtwang.org/programming/memory-layout-of-c-program/ ``` 001080 000000 040300 600000 000000 050000 000000 ``` array[7][6] | - | - | | - | - | | 0 | 765 | | 1 | 021 | | 2 | 048 | | 3 | 214 | | 4 | 233 | | 5 | 515 | ``` 12345 06789 00abc 000de 0000f ``` index of e = 13 index of any (x,y) where x >= y in n by n matrix = ((n+1) + n-i +1) * i / 2 + (j - i) --- https://docs.google.com/presentation/d/1PfNCvxJV_pOozArub_N-99-fXwu7yNw_Lt3EjlVKLiU/edit#slide=id.p ## Array int list[5]: 存放5個int的array int* plist[5]: 存放list[5]內5個int的pointer的array 陣列內value存放的記憶體空間為連續 *plist = list[0]: base address = alpha list[1]: address = alpha + sizeof(int) list[2]: address = alpha + 2*sizeof(int) ... --- ### loser me 10歲的james薪水就有35000 QQ ![](https://i.imgur.com/YpwaPi7.png) --- ### polynomial * purpose:do the basic oparation like addition or substraction #### 表示方法1: x^4^+10x^3^+3x^2^+1 = [1, 0, 3, 10, 1] | 0 (constant) | 1 | 2 | 3 | 4 | | - | - | - | - | - | | 1 | 0 | 3 | 10 | 1 | Example: a = 3x^20^+2x^5^+4 = [4, 0, 0, 0, 0, 2, 0, ... , 0, 3] b = x^4^+10x^3^+3x^2^+1 = [1, 0, 3, 10, 1] result = [] 1. compare a.count and b.count: a>b -> case 1 result[20] = a[20] a.remove(index: 20) (到a[5]前皆同) 沒有2,因為我懶得打惹 #### 表示方法3: 若多項式為 2x^1000^+1 = [1, 0, ... , 0, 2] 則Array多數空間被浪費 -> 用一個global array存放所有多項式 A(x) = x^1000^+1 B(x) = x^4^+10x^3^+3x^2^+1 | | A Start | A Finish | B Start | | | B Finish | free | | - | - | - | - | - | - | - | - | | coef | 2 | 1 | 1 | 10 | 3 | 1 | | | exp | 1000 | 0 | 4 | 3 | 2 | 0 | | | index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | * free: 允許加入其他多項式 or 存放相加後結果 * 須清楚各多項式的起始及終點位置所對應的index值 Example: 1. Compare A-start and B-start's exp: A-start>B-start -> case 1 copy A-start to free free++ #### complexity = O(n+m) * 若free超過了一開始宣告的maxTerm: * 回收(destruture)已做過運算且不會再使用的polynomial * 缺點: release後的空間不連續 * 須改寫程式碼以避免問題,大家自己回家想一波,期中考會考 😈 --- ### Sparse Matrix * 矩陣內大部分element為0,使用2-dimension array紀錄hen浪費空間 * 使用1-dimension array紀錄非0的element的<row, column, value> * 排序先row再column ### 矩陣轉置 #### dumby method * row與column值互換 * 互換後需再重新檢查並排序array,調整時間成本高 * O(column * term) #### smart method * 先統計transport後各個row有幾個element,紀錄位置後再進行transport * O(row * column) ### 矩陣相乘 * result(i,j) = sigma(a(i,k)*b(j,k)) ### 矩陣存放 * 先存row再存column(row major) --- ### String的存放 #### 英文 | H | E | L | L | O | | W | O | R | L | D | \0 | | - | - | - | - | - | - | - | - | - | - | - | - | #### 中文 * 中文的字串處理(自己google) * 資料檢索 ### 字串的運算 * string match * 字串串接 * tring copy * string insert * trim * print ### String match * 查詢P字串有無在T字串內出現 #### dumby method * 從P1/T1開始比對,若不吻合則shift一格,比對P1/T2直到吻合或回報-1 * O(mn) #### KMP * 建構可快速移動的failure array,再由此加速字串比對 * O(m+n) ### KMP #### first case * 從P1/T1開始比對,若在P4/T4處不吻合,則直接shift至比對P1/T4 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | | - | - | - | - | - | - | - | - | - | | A | G | C | **C** | T | A | T | C | A | | A | G | C | **G** | G | | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | | - | - | - | - | - | - | - | - | - | | A | G | C | C | T | A | T | C | A | | | | | A | G | C | G | G | #### second case * 當字串不吻合時,發生不吻合的字元前(已比對過的字元)若有與字首吻合的字元,則不需再做比對 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | | - | - | - | - | - | - | - | - | - | - | - | - | - | | A | G | C | C | T | **A** | **G** | C | T | C | A | T | T | | **A** | G | C | C | T | **A** | **C** | | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | | - | - | - | - | - | - | - | - | - | - | - | - | - | | A | G | C | C | T | **A** | G | C | T | C | A | T | T | | | | | | | **A** | G | C | C | T | A | C | #### third case * second case在非單一字元的情況下同樣適用(failure array) | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | | A | G | C | C | G | **A** | **G** | **G** | T | C | A | T | T | A | G | T | A | A | | **A** | **G** | C | C | G | **A** | **G** | **C** | A | G | G | C | | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | | A | G | C | C | G | **A** | **G** | G | T | C | A | T | T | A | G | T | A | A | | | | | | | **A** | **G** | C | C | G | A | G | C | A | G | G | C | ### failure function ![](https://i.imgur.com/5WINRzM.png) #### example | q | 1 | 2 | 3 | 4 | 5 | 6 | 7 | | - | - | - | - | - | - | - | - | | p | a | b | a | b | a | c | a | | ㄇ | 0 | 0 | 1 | 2 | 3 | 0 | 1 | ## Stack and Queue ### Template * 當data type改變時,對function本身沒有影響 ### Stack * last-in-first-out(LIFO) * top * abstract data type(ADT) * top * finite order list * Boolean IsFull() * void Push/Add(const KeyType & item) * Boolean IsEmpty() * KeyType* Delete/Pop(KeyType &) * 可用array inplement(但array不是唯一) #### 迷宮老鼠 ![](https://i.imgur.com/l07waBm.png) * 有幾種走法(8個方位) * 用stack存放行走路徑[row][column][dir],若無法前進則pop回前一步 * 圍牆用1表示 * maze記錄曾經走過的path * move決定行走方位 * mark記錄有無走過 * 存不存在一條路徑到達終點 * 是否為最短路徑 * 出口位置未知 * 出口數量不只1個 #### Operate * Postfix evaluation ![](https://i.imgur.com/Uf3if3r.png) * infix轉postfix * 先括弧、移項後刪除 * */的priority大於+- ![](https://i.imgur.com/tVAVcgn.png) * priority table ![](https://i.imgur.com/QmIt69C.png) ### Queue * first-in-first-out(FIFO) * job schedule * 排隊即queue * Priority Queue(插隊仔) * abstract data type(ADT) * front / rear * 其他跟stack差不多 * 因資料delete後不進行搬移,故queue size(IsFull())可能造成誤判 * data movement * 修改procedure * 環狀queue(保留一個空間去做判斷) * front = rear: empty * rear + 1 = front: full * delete front + 1 ## Linked List ### Array * 連續的記憶體空間 * 有大小限制 * 有空間浪費問題 * 當資料做增減時,可能產生誤判 ### Linked List * 有序 * element中存放資料以及pointer(存放下一筆資料的記憶體位置) * insert資料時,assign插入資料的記憶體位置給上一筆資料的pointer,並將插入資料的pointer指向下一筆資料的記憶體位置。 * 刪除資料時,需先依序搜尋資料是否存在以及存放位置,再對pointer重新做assign,搜尋成本高。(element可刪除可不刪) * double linked list * 可從頭尾同時搜尋,降低搜尋成本。 * 可避免pointer鏈結斷掉造成找不到下一筆資料的情況。 ### Implement Linked List * Composite Classes * class Node: data, next pointer * class List: first node * 優點:class Node可共用 * 缺點:自己想 * Nested Classes * class List: class Node * class Node: data, next pointer * 缺點:共用需要再複製一份 * 優點:自己想 ![](https://i.imgur.com/zmtzTQz.png) ### insert ### delete ### list iterator * 一個 class 存放 scan, number of element, first, isEmpty...(因重複使用率高) ### invert a list * 反轉 linked list 所有 pointer * first重新assignment ![](https://i.imgur.com/u9P70pY.png) ### Concatenating * if p,link = null {p -> link = b.first} else {p.link = p} ### list desctrutor * delete first ### circular list * 需註明first與last `linked list也可用來表示polynomial以及stack and queue` ### polynomial * 相加 * 指數部分相同->係數相加 * 不同-> copy較大者係數 * 複製後pointer往後走 * 直到兩者pointer都為null為止 ![](https://i.imgur.com/wMsTXAG.png) ### free pool * delete掉的node丟進free pool裡,不先destructor。要創建新的node時可從free pool中取用(first)再修改值,可減少系統請求記憶體空間的程序。 * 環狀linked list -> 可將polynomial一次return進free pool ### equivalence relations * symmetric * reflexive * transitive ![](https://i.imgur.com/zsjXcl0.png) * phase1 - 讀完所有的pair,process未處理的pair(seq array w/ directive equivalence linked list) * phase2 - output結果(output array: output過的index轉成true) (使用stack尋找indirect的equivalence number(transitive)) * input順序不同會造成output的set內容順序不同,但element相同 ### Sparse Matrix 每個row與column都會對應到一個環狀的linked list ![](https://i.imgur.com/ZphBndY.png) * 藍色代表是4 by 4的matrix * header node僅有一藍四黃(左與上為同一組head node) * 紅線為column構成的linked list * 黑線為row構成的linked list * 數字的左上/右上代表pointer,指向對應的row及column * 如果要做transport,需調換row/column及改變pointer ![](https://i.imgur.com/mMsQtuS.png) ### Doubly linked list * 有llink及rlink 2個pointer * header node 的 rlink 指向linked list的first,llink指向linked list的last * insert(修改4個pointer)/delete(修改2個pointer) ## Tree * a collection of nodes * a distinguished node root * zero or more nonempty (sub)trees * level: height of tree(長得像大ray哥的老師在他家祖譜排level 32) * degree: number of pointing out * parent: 最原始的祖先 ### 表示法 #### linked list表示法 ![](https://i.imgur.com/OFBuD5K.png) * ( A ( B ( E ( K , L ) , F ) , C ( G ) , D ( H ( M ) , I , J ) ) ) * 如果n個node就件n筆資料(n個pointer),很浪費memory(?) * 整理成2元樹,每個node只記錄left child及right child #### binary tree ![](https://i.imgur.com/ZmSQeZo.png) * child只有2個(right/left) * 允許zero node * 在binary tree裡,如果level是i,最多的node數為2^i-1^ * maximum number of pointer (B): B+1=n **(the properties of binary tree沒聲音)** * full binary tree: has maximum number of node * complete tree: 照次序排列的樹 * skill tree: child偏向依邊的歪斜樹 #### Array表示法 * 次序編號: index -> 丟入相對應的value * left child: even value = 2i right child: odd value = 2i+1 i=1: parent node #### 優缺點 array: 可快速查找,但skill tree容易遇見空間的浪費 linked list: 查找困難 ![](https://i.imgur.com/PSq9g57.png) ### binary tree traversals * 快速得知tree的結構,以便比較及複製 * **LVR(inorder), LRV(postorder), VLR(preorder)**, VRL, RVL, RLV * L: 往左查訪 * R: 往右查訪 * V: 印出資料 * recursive function or using a stack/queue #### inorder * recursive * inorder(currentNode -> left child); * cout << currentNode -> data; * inorder(currentNode -> right child); * stack * 往左走訪 * 有child -> 加入stack * 沒child -> pop out the last element in stack and visit the right child #### preorder * recursive * cout << currentNode -> data; * inorder(currentNode -> left child); * inorder(currentNode -> right child); #### postorder * recursive * inorder(currentNode -> left child); * inorder(currentNode -> right child); * cout << currentNode -> data; #### level order * 照index順序 ![](https://i.imgur.com/QyZ4eqK.png) * only inoder can check two trees equal ### Propositional calculus expression #### boolean tree(供三小) ![](https://i.imgur.com/IkzMWDU.png) ![](https://i.imgur.com/zfrZE9f.png) * use postorder -> evaluate expression value * the last element is the root ### Threaded Binary Trees(ㄅ重要) * 讓 tree 裡沒有被使用的node 被充分利用 -> 使用 inorder 讓空的 node 指向他上一個或下一個 node * 可快速查找 inorder 順序 ![](https://i.imgur.com/BepaEB9.png) * 實線:原本的 tree * 虛線:thread binary tree,每個 element 都指向他前後的 2 個 element #### 缺點 * 多增加了 2 個欄位空間 * 增加了 insert 和 delete 的難度 ### Huffman code * 目的在於用最少的 code text 進行編碼 * 根據 message 的使用頻率來決定表達方式 * 例如出現頻率若最高,就用最少的單位(minimum code size)表示,使解碼的速度加快 ![](https://i.imgur.com/1Df7ClW.png) * M1 = 000 M2 = 001 ...依此類推 #### huffman function * 建立方法 * bottom-up * 先選頻率最小的 2 個 message 並 merge 出現頻率 * 將 merge 後的頻率加入比較,再選出最小的 2 個 message 並 merge ![](https://i.imgur.com/UvVAOzN.png) * 程式 * 宣告一個 binary tree 的 priority queue * 選最小的 2 個 element(最前面 2 個) merge * delete 已被 merge 的資料 * insert merge 後的資料(priority queue) * 呼叫自己 * huffman code 並不唯一 ![](https://i.imgur.com/zqSkzEB.png) ![](https://i.imgur.com/P5H7caN.png) ### Priority Queue(heap) * 為 complete binary tree * 分 maximum 與 minimun,差在 return 值以及 root * maximum/minimum heap 只在乎 tree/subtree 的 root 是 maximum/minimum,其餘 leaves 的排列隨便 ![](https://i.imgur.com/8EXPN68.png) #### insert * bottom-up * insert element 到最一個位置 * 若 leaf 比 root 大則進行交換,最多交換 logn次 #### delete * top-down * 將最上面的 element output * 把最後一個 element 放到 root 的位置 * 若 leaf 比 root 大則進行交換,最多交換 logn次 #### heap sort * 建立 minimum/maximum heap tree * insert n 個 element * delete(pop out) n 個 element ### Binary Search Tree * delete max/min: O(logn) * delete arbitary: O(n) * search: O(n) --- * element of left-subtree < root < element of right-subtree #### search * 若所蒐數值比 root 小 -> 查找左邊,比 root 大 -> 查找右邊,直到找到 null 為止 * 找第 n 大的數字 -> sorting(maximum heap tree) #### size (???-36:30) #### Insert * 跟 search 差不多 #### delete * 刪除指定 element * 只有左/右子樹 -> 直接拉上去 * 同時有左右子樹 -> 拉左子樹裡最大的 element 或右子樹裡最小的 element #### AVL tree * 將 binary search tree 調整成左右 level 相差不大於 1 ### selection tree * 每個 leaf node 分別是一個 queue * 可輕鬆做排序 #### winner tree ??? - (52:00) * 大量數字排序,memory 無法開另一個空間存放時 #### loser tree ### Forest * a set of tree #### forest traversal ![](https://i.imgur.com/1vOHkuj.png) #### disjoint set union * 兩種做法的優缺點取決於 element 被查找的頻率 ![](https://i.imgur.com/M3XYg8K.png) #### weighting rule * root i 的 leaf 若比 root j 少,則 union 時把 j 當成 root * 猶豫的ppt https://www.csie.ntu.edu.tw/~hsinmu/courses/_media/dsa_17spring/disjoint_set.pdf ## Graph * vertice(V) & edge(E) * 一比劃問題 * 電路分析 * 最短路徑問題 * 專案規劃 * social science(fb) ### 一筆劃問題 * 七條橋(經典) * Euler's graph -> 每個點的 degree 僅能為偶數 ### 最短路徑問題 * 地圖 ### graph 種類 * complete gragh -> 每個點都和其他點有連接 * directed graph -> 有向的 ![](https://i.imgur.com/XKxIdtQ.png) * multigragh -> 當 edge 有其他記憶條件因素時 * DAG -> 有向無環圖 * connected component -> 沒有人是局外人 * strongly -> 每個點都有相互連接(左) or isolated component(右) ![](https://i.imgur.com/G68MFus.png) ### ADT * 使用矩陣表示點和點之間是否有相連 * 在 undirected graph 下為 symmetry ![](https://i.imgur.com/y2HKf6D.png) * 若不是 complete graph,使用 matrix 較浪費空間,可改用 linked list 表示 ![](https://i.imgur.com/hWYyCDX.png) * 用 one-dimension array(有點複雜) * 紅色(起始點) -> 與 component 有連接的資料的放置位置 * 缺點 -> 刪除資料麻煩(需要一直 shift) ![](https://i.imgur.com/EZ1mM1G.png) * adjacent list * 記載 component 的 out-degree * inverse adjacent list * 記載 component 的 in-degree * multilists * 以 edge 為出發點 * 記錄跟該 edge 有連接的 node 及其 value * N 的個數為 number of edge ![](https://i.imgur.com/Usubg3k.png) ### weight edge * 光復路的故事 ### traversal * DFS * depth preorder traversal * 向一個方向走到不能再走後,退回上一個component 繼續走訪 * 如果表示方法為 list,複雜度 = O(e);若為 matrix,複雜度為O(n^2) * BFS * level by level * 先走訪一個component 所有連接的 component * 走過的 mark 為 1 (boolean array) * 皆不唯一 ### Spanning tree * sub-graph * minimum cost spanning tree -> 能用最少的 edge 連接所有 vertice * 不唯一 * 有 3 個 greedy 的演算法 #### kruskal * 以 edge 為主 * 從權重最小的 edge 開始選擇 * 不能有 cycle (how to detect loop) * 使用 priority queue #### print * 以 vertice 為主 * 決定一個起點,選擇與起點連接權重最小的 edge * 再往外擴張 #### sollin * 一次決定多個起點,選擇起點們連接的最小權重邊 * 結合 kruskal 及 print ### Biconnected components * articulation point ->在 graph 中若拿掉會使 graph 形成 2 個獨立的 connected components 的 vertice * 在 graph 中沒有 articulation point 則為 bioconnected component ![](https://i.imgur.com/D0v63Nv.png) * 若不被任一 subgraph 包含的話則為 maximum biconnected component * 可用 DFS 查找,查找完將 graph 中原有的 edge 補完(back edge) ![](https://i.imgur.com/wjknCQs.png) * low -> 以 root 為出發點,DFS 時可以拜訪到對最小數值 (48:00) 為何右邊的最小拜訪值不是 1 ? 567 為何一樣? #### find articulation point in programming * make the DFS spanning tree -> decide DFN(越靠近 root, DFN 值越小) * 補上 back edge * 於自己本身的 DFN, child node 的 DFN, 以及 back edge 連接的 node 的 DFN 中取最小值(?)(bottom up) * root 一定為 articulation point(若選擇非 articulation point 做 root?),若 u 不為 root ,則與 u 連接的 child node 1 皆無法不通過 u 拜訪 chide node 2, 則 u 為一個 articulation point ### Shortest Path * Dijkstra * 從起點開始選擇最小權重邊延伸,直到所有點都被拜訪過為止 * 選定起始點加入陣列,與起始點相連的點更新權重,其餘為無限大 * 將權重較小的點加進陣列裡(已走訪過),並做為新的起始點,更新與之相連的點的權重 * 重複以上動作 * 全部點都被走訪完畢後,得知起點至所有點的最短路徑 * 要知道 path 則需另外紀錄 * 為 greedy 演算法 * 無法處理有負權重的情況 * Bellman-Ford * 當只允許走一條路徑時,列出到每個點的最小 cost * 開放走 2 條路徑,經由上一次的結果更新 cost * 不斷更新 cost 直到走訪 edge - 1 次 * 若第 edge 次仍可以更新權重,代表 graph 中有負迴圈 ### All-Pairs * a^k^[i][j]: 從 i 走到 j ,最少經過 k 個頂點 = a^k-1^[i][k] + a^k-1^[k][j] #### Transitive closure * 做完 all-pair 後,大於1的在 transitive closure 中皆為1,表示可連通, 0 表示無路徑可連通 ### Activity-on-Vertex (AOV) Network * 每個 vertex 代表一個 activity,有先後順序關係。若先決條件的 activity 沒發生,則該 activity 不會發生 * in-degree 為 0 的 output 出來(不拘順序),output 後對有與 output 連接的 vertex 更新 in-degree * order 並不唯一 ![](https://i.imgur.com/K7VcKii.png) ![](https://i.imgur.com/HkHhDPe.png) ### Activity-on-Edge (AOE) Network * activity 在 edge 上,vertex 只代表 activity 的開始即結束 * edge 可同時執行 * vertex 紀錄最早完成時間 * 利用 compage order 依序走訪 ![](https://i.imgur.com/SRxX1KD.png) ![](https://i.imgur.com/LaouOIj.png) ### critical path * 在 graph 中不能有 delay 的空間 * 先計算每個 activity 的時間 * 在 in-degree 不為 1 的 vertex 前建立 vertex',edge 為 ?,order 為前置作業的 maximum 值 * 從左往右推 -> 完成需花費最短時間 ![](https://i.imgur.com/aXljYUL.png) * 從右往左推(選 minimum) -> 最多可 delay 時間 ![](https://i.imgur.com/0MSRZs1.png) * critical path: 從右往左及從左往右推的時間相減為 0 的即為 critical path * critical path 是否一定存在? ## Sorting * data base: records(key) * get / set * scan / match * return data * ex: 電話簿 * binary search * 須先做排序 * 可將 search 複雜度從 O(n) 降至 O(logn) * internal sort: 要排序的資料皆可放入 memory 中 * insert * n-1 筆 data 已排序,將第 n 筆 data 插入已排序 data 中適當位置 * recursive * 將 unsolve 的資料加入已排序資料最後項(A(j)),若A(j)<A(j-1)則做swap,j- - * selection * 從未排序資料中挑選最小的資料加入已排序資料 * 最小的資料與第一筆資料 swap 後,將第一筆資料加入已排序資料 * bubble * 兩兩做比較後如有需要則 swap * merge * 合併 2 個已經排序完成的資料 * recursive * divide and conquer * 先將資料切成 n 組最小的 set 後,每 2 組做排序合併,合併時需新開一個 array 存放合併後的結果 * complexity: O(nlogn) * quick * comparison base 中最快的 * recursive * divide and conquer * 在資料中隨機選一筆資料作為分段點,將資料分成比分段點小及分段點大 2 堆,各自再選擇分段點,重複動作 * sampling: 為避免隨機選擇的數字過於偏大或偏小而失去 quick sort 的優勢,可隨機選擇多筆資料,以中位數的資料作為分段點 * 不適合資料筆數少的排序 * complexity: O(nlogn) * heap * 建立一個 minimum heap tree,delete root 後重新做調整(O(logn)),共做 n 次 delete(O(n)) * 須建立額外的空間存放 delete 的值 * 若使用 maximum heap tree 則不需額外建立 array 存放 delete 的 element * 建立 heap: top-down or bottom-up(O(logn)) * bucket * 根據資料特性建立 bucket,並將資料放入對應的 bucket 中 * implement in queue * radix * approach from bucket sort * most significant bit(MSB): radix-exchange sort,根據最重要的位元進行排序 ![](https://i.imgur.com/RhgZ2X8.png) * least significant bit(LSB): straight-radix sort,根據最小的位元進行排序,recursive 至最大位元後,從最小的 bucket 依序從最底端 output 出來(queue) ![](https://i.imgur.com/4SvpMBe.png) * external: data 太大無法放進 memory(目前較少使用,未來趨勢) * merge * 假設 memory 只能放進 750 個 record,而資料有 4500 筆 * 將 data 做 partition,分段處理 ![](https://i.imgur.com/nFPd1zG.png) ### weighted external path lenght * element 的個數及被 merge 的次數乘積的加總 * 可用 huffman-code 解決 ### Proof Lower Bound of Sorting * 在 comparison sorting algorism 中,最小複雜度為O(nlogn) ![](https://i.imgur.com/gLmoahb.png) ## Hashing * hash key(bucket) * hash funciton * static v.s. dynamic * static - hash function 不會隨資料的比數或狀態改變 * dynamic - hash function 會隨資料的比數或狀態調整 * identifier density: n(所有資料)/T(可能的key) * loading density: hash table 被佔用的情況。n(所有資料)/sb(每個bucket所能容納資料數 * bucket) * collision: 2 筆資料都要進同一 bucket * overflow: 資料要進已滿的 bucket * cardinality: 資料筆數 * degree: 欄位數 ### symbol * directory base * 可作新增、搜尋、刪除 ### Hash Function * 目標 * 發生 collision 的情況越少越好 * 每個 bucket 的容納資料筆數平均 * 常見形式 * Mid-Square * 將資料轉成二進位 * 取中間數(不限定幾位) * 將取出數開平方 * Division * 取mode * Fold * 將資料轉成二進位 * 將轉換後的資料分為好幾個區間,分別去做運算(23:44?) * 對折 * solve overflow * open new bucket * linear: 找附近的空 bucket * quadratic: 找遠一點的空 bucket,若 bucket key 為 i,則一次跳 i 平方格查找 * rehash: 共有 m 個 hash function,再 hash 一次 * chaining * 使用 linked list 串接 * 搜索長度:所有可能走訪的 linked list/總資料筆數 ### Dynamic Hashing * 用資料分布的狀態來決定 hash * 將 key 轉換成二進位表示法 * 轉換成 binary search tree * sstime(travers tree's time,與樹的高度有關) * 樹的歪斜問題 * extendible hash(?) * 將 tree 建一個目錄(避免 travers 的時間) * directoryless hash(?) * 沒有 directory * overflow -> linked list? ![](https://i.imgur.com/h2qpiLv.png) ## search structure ### AVL Tree * binary tree 受 input 的影響,可能不 balance * high balance - 左右樹高度差(balance factory)只能為 1, 0, -1 * 在 insert 的過程中只要有 node 違反 BF 值則必須做調整 * 依據 insert 位置,可分為 LL, LR, RR, RL #### 調整(只能意會) * LL * RR * LR(31:00) * RL * 將 insert 後產生衝突的 C 往上拉 * 其餘4顆子樹照 binary search tree 順序排放 #### Complexity ![](https://i.imgur.com/oKwuXUF.png) * 建 AVL tree 的 cost? ### Optimal Binary Search Tree * 在最壞的狀況下有幾次的 comparison(與 tree 的高度有關) * 當每個 node 都有 priority 時,與搜索的頻率有關(與 path 長度有關) * internal path - 搜索到 node 的路徑 * external path - fail node, node 並不存在 binary search tree 中 * cost: internal path * frequency + external path * freqiency * 資料為 static set * if the data is dynamic set? #### Determine(?) * Tij - optimal binary search tree, 包含 a(i+1)-a(j) 個 words * Cij - Tij 的 cost * Cij = cost(L) + cost (R) + weight(L) + weight(R) * Rij - Tij 的 root * Wij - weight, 所有 priority 的加總(static) ##### 若 tree 的 subtree 為 optimal,則 tree 本身有高機率為 optimal * 假設 a(k) 為 Tij 的 root(Rij) 則左子樹包含 a(i+1)-a(k-1),右子樹包含 a(k+1)-a(j) * Cij = Pk + cost(L) + cost(R) + weight(L) + weight(R),cost(L) 與 cost (R) 再繼續 recursive ![](https://i.imgur.com/laAvWnA.png) ![](https://i.imgur.com/NVbanc0.png) * 橫軸代表編號,縱軸代表 word 個數 ### M-Way Search * 最多有 M 個 subtree(M 個 pointer) * if x=K -> search complete if x=/=K -> 尋找 x 的坐落區間 ### B-Tree * 是一個 m-way search tree * 若 m = n,則叫做 2-3-...-(n-1) tree * root 最少要 2 個 pointer * 除了 root 外,每個 node 最少要有 m/2 個 children * 所有的 failure node 都在同一個 level * 可以解決 binary tree 當 data 過多時樹高過高的問題 * 缺點:在每個 level 比較時耗時較長,故在比較時會使用 binary search #### insert * 尋找 word 是否已存在,若不存在則判斷適當插入位置 * 當 leaf node還有空間 -> 直接插入 overflow -> create node(split) 新 node 包含衝突的資料內的最大值 middle 值拉到 parent ![](https://i.imgur.com/7xH4w2t.png) ![](https://i.imgur.com/3iwz5EP.png) #### delete * 尋找數字是否存在 * 刪除後符合 B-tree 特性 -> 直接刪除 * 刪除後不符合: 隔壁有 node 可以提 -> rotate (隔壁 node 提到 parent,parent 送一個 node 給原本的 pointer) 隔壁沒有 node 可以提,parent 也沒 node 可以送 -> combine ![](https://i.imgur.com/2QEwcFX.png) ### B+ tree * 可一次找一堆值 * 將 leaf node 用 linked list 串接 * balance * 可把想搜索的範圍鎖定在某 subtree 內 * search 到 leaf node 後透過 linked list 做 range query ![](https://i.imgur.com/Hm3hcwS.png) ### R tree(不在課程範圍內) * 可處理多維空間(ex:經緯度)

    Import from clipboard

    Paste your markdown or webpage here...

    Advanced permission required

    Your current role can only read. Ask the system administrator to acquire write and comment permission.

    This team is disabled

    Sorry, this team is disabled. You can't edit this note.

    This note is locked

    Sorry, only owner can edit this note.

    Reach the limit

    Sorry, you've reached the max length this note can be.
    Please reduce the content or divide it to more notes, thank you!

    Import from Gist

    Import from Snippet

    or

    Export to Snippet

    Are you sure?

    Do you really want to delete this note?
    All users will lose their connection.

    Create a note from template

    Create a note from template

    Oops...
    This template has been removed or transferred.
    Upgrade
    All
    • All
    • Team
    No template.

    Create a template

    Upgrade

    Delete template

    Do you really want to delete this template?
    Turn this template into a regular note and keep its content, versions, and comments.

    This page need refresh

    You have an incompatible client version.
    Refresh to update.
    New version available!
    See releases notes here
    Refresh to enjoy new features.
    Your user state has changed.
    Refresh to load new user state.

    Sign in

    Forgot password

    or

    By clicking below, you agree to our terms of service.

    Sign in via Facebook Sign in via Twitter Sign in via GitHub Sign in via Dropbox Sign in with Wallet
    Wallet ( )
    Connect another wallet

    New to HackMD? Sign up

    Help

    • English
    • 中文
    • Français
    • Deutsch
    • 日本語
    • Español
    • Català
    • Ελληνικά
    • Português
    • italiano
    • Türkçe
    • Русский
    • Nederlands
    • hrvatski jezik
    • język polski
    • Українська
    • हिन्दी
    • svenska
    • Esperanto
    • dansk

    Documents

    Help & Tutorial

    How to use Book mode

    Slide Example

    API Docs

    Edit in VSCode

    Install browser extension

    Contacts

    Feedback

    Discord

    Send us email

    Resources

    Releases

    Pricing

    Blog

    Policy

    Terms

    Privacy

    Cheatsheet

    Syntax Example Reference
    # Header Header 基本排版
    - Unordered List
    • Unordered List
    1. Ordered List
    1. Ordered List
    - [ ] Todo List
    • Todo List
    > Blockquote
    Blockquote
    **Bold font** Bold font
    *Italics font* Italics font
    ~~Strikethrough~~ Strikethrough
    19^th^ 19th
    H~2~O H2O
    ++Inserted text++ Inserted text
    ==Marked text== Marked text
    [link text](https:// "title") Link
    ![image alt](https:// "title") Image
    `Code` Code 在筆記中貼入程式碼
    ```javascript
    var i = 0;
    ```
    var i = 0;
    :smile: :smile: Emoji list
    {%youtube youtube_id %} Externals
    $L^aT_eX$ LaTeX
    :::info
    This is a alert area.
    :::

    This is a alert area.

    Versions and GitHub Sync
    Get Full History Access

    • Edit version name
    • Delete

    revision author avatar     named on  

    More Less

    Note content is identical to the latest version.
    Compare
      Choose a version
      No search result
      Version not found
    Sign in to link this note to GitHub
    Learn more
    This note is not linked with GitHub
     

    Feedback

    Submission failed, please try again

    Thanks for your support.

    On a scale of 0-10, how likely is it that you would recommend HackMD to your friends, family or business associates?

    Please give us some advice and help us improve HackMD.

     

    Thanks for your feedback

    Remove version name

    Do you want to remove this version name and description?

    Transfer ownership

    Transfer to
      Warning: is a public team. If you transfer note to this team, everyone on the web can find and read this note.

        Link with GitHub

        Please authorize HackMD on GitHub
        • Please sign in to GitHub and install the HackMD app on your GitHub repo.
        • HackMD links with GitHub through a GitHub App. You can choose which repo to install our App.
        Learn more  Sign in to GitHub

        Push the note to GitHub Push to GitHub Pull a file from GitHub

          Authorize again
         

        Choose which file to push to

        Select repo
        Refresh Authorize more repos
        Select branch
        Select file
        Select branch
        Choose version(s) to push
        • Save a new version and push
        • Choose from existing versions
        Include title and tags
        Available push count

        Pull from GitHub

         
        File from GitHub
        File from HackMD

        GitHub Link Settings

        File linked

        Linked by
        File path
        Last synced branch
        Available push count

        Danger Zone

        Unlink
        You will no longer receive notification when GitHub file changes after unlink.

        Syncing

        Push failed

        Push successfully