資料結構 Data Structure

# 資料結構 Data Structure :::info :::spoiler Click to Open TOC [TOC] ::: :::spoiler 成績算法 * 小考+手寫作業：10% * 程式作業：20% * 出席：5% * 期中考2次，各15%，共30% * 期末考：20% * 上機考：15% (據說考的跟作業一樣) * 加分：考CPE，對1題加總分1分 ::: :::spoiler 修課規定 * 程式語言：限 C or C++ * 每週都有程式作業 - 用[online judge](https://dsoj.ncu.edu.tw/)交 - 期限為一週(太難則會延長) ::: ## Chapter 1 Basic Concepts ### **1-1 System Life Cycle** #### 【Five Points】 * Requirements * Analysis： **bottom-up vs. top-down** * Design： **data objects** and **operations** * Refinement(精細) and Coding * Verification * Program Proving * Testing * Debugging $\quad$ ### **1-2 Alorithm Specification** #### 【Selection sort 選擇排序法】 定義：在未排序序列中找到最小or大的元素　　　存放到排序序列的起始位置(放到最左邊) 　　　範例：有一未排序過陣列A如右 `{30, 10, 50, 40, 20}`，欲透過**選擇排序法**排序 1. 從A[0]開始向右找一輪(意即從A[0]~A[4])，找出最小的元素 2. 找一輪後發現最小的是A[1]的10，要將A[1]的10放到陣列的最左邊。此時有兩種做法： - 用新陣列：將10放到新陣列的最左邊 - 用原來的：將A[1]與A[0]對調，使陣列變成 `{10, 30, 50, 40, 20}` (老師用的方法是在原來的陣列進行交換，本共筆也以此為範例) 3. 第二輪開始，由於上一輪已將最小值10排好，排在A[0] 因此這輪從A[1]開始往後找最小值，找一輪後會發現最小值為20，位於A[4] 4. 將20移到10的後面(意即目前 *A[1]的30* 與 *A[4]的20* **對調**) 移動好的陣列會變成 `{10, 20, 50, 40, 30}` 5. 第三輪開始，以此類推往下做。應該會得到 `{10, 20, 30, 40, 50}` > 補充：如何將陣列兩元素對調 > 上面的範例有提到將陣列中的兩個元素對調，在程式當中要如何呈現呢？ > 我們可以自定義函式來完成它，範例程式碼如下： > ```c= > #define SWAP(x,y,t)((t)=(x),(x)=(y),(y)=(t)) > //t是暫存的變數，透過t來暫存x，讓x可以去取得y的值 > temp = list[i]; > list[i] = list[j]; > list[j] = temp; > ``` 老師的選擇排序法程式碼： ```c= void sort(int list[], int n) { int i, j, min, temp; for (i = 0; i < n-1; i++) { min = i; for (j = i+1; j < n; j++) { if (list[j] < list[min]) min = j; } SWAP(list[i], list[min], temp); } } ``` $\quad$ #### 【Binary search 二分搜尋法 (簡稱：二分搜)】 定義：資料 **排序好** 的情況下尋找特定數字特色：不從頭開始找，而是從 **中間** 開始找範例：有一排序過陣列B如右 `{8, 14, 26, 30, 43, 50, 52}`，欲透過**二分搜尋法**搜尋 `43` 1. 陣列B總共有7個元素，分別從B[0]~B[6]，此時取B[$\frac{0+6}{2}$]=B[3]=30 2. 拿30與想要搜尋的數字(43)進行比較，發現30<43。因此可以推斷，43一定在B[3]的右邊or不存在陣列B裡 3. 改從B[4]~B[6]當中來找43，此時取B[$\frac{4+6}{2}$]=B[5]=50 4. 拿50與想要搜尋的數字(43)進行比較，發現50>43 因此可以推斷，43一定在B[5]的左邊or不存在陣列B裡 5. 綜合 `2`、`4` 所述，唯一還沒被找過的是B[4]。若B[4]=43，則表示搜尋成功；若B[4] != 43，表示43不在陣列B裡面範例程式碼如下： ```c= int binsearch (int list[], int searchNum, int left, int right) { /* search list[0] <= list[1] <= … <= list[n-1] for searchnum. Return its position if found. Otherwise return -1 */ int middle; while (left <= right) { middle = (left + right)/2; switch ( COMPARE (list[middle], searchnum)) { case -1: left = middle + 1; break; case 0 : return middle; case 1 : right = middle – 1; } } return -1; } ``` :::warning :warning: $\;$請留意！上面程式碼第9行的 `COMPARE` 係Java的用法，但考試時只能用C、C++ ::: $\quad$ #### 【1.2.2 Recursive Algorithms】 > ##### A funtion as something that is invoked (called) by another function. They may call other functions that invoke the calling function again ( **indirect recursion** 間接). * ex. A call B, B call A, which is **extremely powerful**. This perspective ignores the fact that functions can call themselves(direction recursion). * ex. A call A. > Use this algorithm when the problm itself is defined recursively. 範例程式碼如下： ```c= int binsearch (int list[], int searchNum, int left, int right) { /* search list[0] <= list[1] <= … <= list[n-1] for searchnum. Return its position if found. Otherwise return -1 */ int middle; if (left <= right) { middle = (left + right)/2; switch ( COMPARE (list[middle], searchNum)) { case -1: return binsearch(list, searchNum, middle + 1, right); case 0 : return middle; case 1 : return binsearch(list, searchNum, left, middle - 1); } } return -1; } ``` ### **1-3 Permutation** 透過程式的方式，完成排列(Permutation)。例如，`a, b, c` 三個字母會有6種排列方式定義：產生 list[i] 到 list[n] 的所有排列程式碼： > [name=songchiu] ```c= #include<stdio.h> #define SWAP(x,y,t) ((t)=(x), (x)=(y), (y)=(t)) void perm(char *list, int i, int n) { int j,temp; if(i == n) { for(j=0; j<n; j++) { printf("%c", list[j]);//print out result } printf("\n"); } else { /*list list[i] to list[n] has more than one permutatio, generate these resursively*/ for(j=i; j<n;j++) { SWAP(list[i], list[j], temp); perm(list, i+1, n);//用遞迴的方式去執行 SWAP(list[i], list[j], temp); } } } int main(void) { char list[5]={'a', 'b', 'c', 'd', 'e'}; perm(list, 0, 5);//列出5個字母的所有排列情況 return 0; } ``` $\quad$ ### 1-4 Performance Analysis [【演算法】時間複雜度與空間複雜度Time & Space Complexity](https://jason-chen-1992.weebly.com/home/time-space-complexity) #### 【Space Complexity 空間複雜度】 公式：$S(P)=C+S_P(I)$ - $S(P)$: 總共的空間複雜度 - $C$: 固定會用到的一些空間，與輸入、輸出的內容無關 - $S_P(I)$ : 執行時所用的空間，與型態、大小、輸入輸出...等內容有關由於 $C$ 是常數，因此空間複雜度取決於 $S_P(I)$ 範例： * $S_{abc}(I)=0$ ```c= float abc(float a, float b, float c) { return a+b+b*c+ (a+b-c)/(a+b)+4.00; } ``` * $S_{sum}(I)=S_{sum}(n)=0$ ```c= float sum(float list[], int n) { float tempsum=0; int i; for(i=0; i<n; i++) { tempsum += list[i]; } return tempsum; } ``` $\quad$ #### 【Time Complexity 時間複雜度】 公式：$T(P)=C+T_P(I)$ - $T(P)$: 總共的時間複雜度 - $C$: 固定會用到的一些時間，例如編譯程式(compile time) - $T_P(I)$ : 運作這支程式所花的時間，例如跑迴圈(run or execution time) 一些會看到的符號： - $O \rightarrow$ 上限 **(最常使用)** - $\Omega \rightarrow$ 下限 - $\Theta \rightarrow$ 比較貼近答案的估計 $\quad$ ## Chapter 2 Arrays and Structures ### **2-1 The Array as an ADT** #### 【Implementation of 1-D array】 - list[0] $\quad \quad \quad$ //base address = $\alpha$ - list[1] $\quad \quad \quad$ //address = $\alpha$ + $1 \ast$ sizeof(int) - list[n] $\quad \quad \quad$ //address = $\alpha$ + $n \ast$ sizeof(int) - 以此類推 #### 【$int$ *list1 vs. $int$ list2[5]】 **Same**: list1 and list2 are pointers. **Difference**: list2 reserves **5 locations**. **Notation:** list2 － a pointer to list2[0] (list2 + i) － a pointer to list2[i] (&list2[i]) ***(list2 + i) － list2[i]** ### **2-2 Structures and Unions** 【**Padding**】 ```c= Struct { char a; char b; int c; } var; ``` ![](https://i.imgur.com/ea83zqf.jpg =400x150) 【**Unions**】 >only one field of the union is "active" at any given time. ```c= typedef struct sex_type{ enum tag_field{female, male} sex; union{ int children; int beard; } u; }; typedef struct human_being{ char name[10]; int age; float salary; date dob; sex_tyoe sex_info; }; ``` ![reference link](https://i.imgur.com/X6G7Ru4.png =200x200) $\quad$ 【**Self-Referential Structures**】 ```c= typedef struct list { char data; list *link; //指向list的一個資料指標 } ``` ### **2-3 The Polynomial ADT** #### 【Polynomial】 ```c= #define MAX_degree 101 /*MAX degree of polynomial+1*/ //指數跨度大不適用 typedef struct{ int degree; float coef [MAX_degree]; }polynomial; #define MAX_TERMS 100 /*size of terms array*/ //指數跨度大適用 typedef struct{ float coef; int expon; }polynomial; ``` #### 【Global array】 >Use one global array to store all polynomials * 適合長短差異大、稀疏的多項式 * 相加完的結果放在 avail ![](https://i.imgur.com/MBw6QeB.png) ### **2-4 The Sparse Matrix ADT** #### 【Introduction】 * Standard repretation of a matrix * a[MAX_ROWS][MAX_COLS] * Sparse Matrix wastes space * store only nonzero elements * characterized by **(row, col, value)** ```c= #define MAX_TERMS 101 //maximum number of terms +1 typedef struct{ int col; int row; int value; }term; term a[MAX_TERMS]; ``` ### 【**Fast Transpose** 快速轉置矩陣】定義: Assign $A[i][j]$ to $AT[j][i]$ 時間複雜度: $O(colsB + totalb)$ ```c= void fast_transpose(term a[], b[]){ int row_terms[MAX_COL], starting_terms[MAX_COL]; int i, j, num_cols = a[0].col, num_terms = a[0].vaule; b[0].row = num_cols; b[0].col = a[0].row; b[0].value = num_terms; if(num_terms > 0){ for(int i = 0; i < num_cols; i++) row_terms[i] = 0; for(int i = 0; i <= num_terms; i++) row_terms[a[i].col]++; starting_pos[0] = 1; for(int i = 0; i < num_cols; i++) starting_pos[i] = starting_pos[i-1] + row_terms[i-1]; for(int i = 0; i <= num_terms; i++){ j = starting_pos[a[i].col]++; b[j].row = a[i].col; b[j].col = a[i].row; b[j].value = a[i].value; } } } ``` ## Chapter 4 Lists ### 4-1 Pointer #### 【**Problem**】 * Arbitrary(隨心所欲) insertion and deletion from arrays can be very time-consuming. * Waste storage **Solution：** using **linked** representation * In a linked, representation these two sequences need **not** to be the same as in the ordered list. * Store the **address, or location, of the next element** in that list in order to access elements in the correct order with each element. * Associated with each list element is a node which contains both a data component and a pointer to the next item in the list. ### 4-6 Circularly Linked List #### 【**Advantage**】 1. **Any node can be a starting point**. We can **traverse the whole list** by starting from any point. We just need to **stop** when the first visited node is visited again. 2. Useful for implementation of queue. Unlike Queue – Linked List implementation, we **don’t need to maintain two pointers** for front and rear if we use circular linked list. We can **maintain** a pointer to **the last inserted node** and **front** can always be **obtained as next of last**. 3. Circular lists are useful in applications to **repeatedly go around the list**. 4. 讓最後一個node很好找 #### 【**Head Node**】 >讓我們知道**最後一個**節點或**第一個**節點 ### 4-7 Equivalence Relations #### 【Input Equivalence Pairs i **$\equiv$** j】 ```c= scanf("%d%d", &i, &j); while( i >= 0){ x = (node_pointer)malloc(sizeof(node)); if(IS_FULL(x)){ fprintf(stderr, "The memory is full\n"); exit(1); } x->data = j; x->link = seq[i]; seq[i] = x; x = (node_pointer)malloc(sizeof(node)); if(IS_FULL(x)){ fprintf(stderr, "The memory is full\n"); exit(1); } x->data = i; x->link = seq[j]; seq[j] = x; } scanf("%d%d", &i, &j); ``` #### 【Find all pairs of the form **<i,j>**】 ```c= for(int i = 0; i < n; i++){ if(out[i]){ printf("%d", i); out[i] = FALSE; //標記被找到了 x = seq[i]; top = NULL; // 初始化stack } for(;;){ //find rest of class while(x){ j = x->data; if(out[j]){ printf("%d", j); out[j] = FALSE; y = x->link; x->link = top; top = x; x = y; } else x = x->link; } if(!top) break; x = sep[top->data]; top = top->link; //unstack } } ``` ### 4-8 Doubly Linked List #### 【**Disadvantage** of Singly Linked List】原因：因為我們可以藉由更改link的指向，輕易地移動Singly Linked List ```c= typedef struct node *node_pointer; typedef struct node{ node_pointer llink; element item; node_pointer rlink; }; ``` #### 【**Insertion**】 ```c= void dinsert(node_pointer node, node_pointer newnode){ newnode->llink = node; newnode->rlink = node->rlink; node->rlink->llink = newnode; node->rlink = newnode; } ``` #### 【**Deletion**】 ```c= void ddelete(node_pointer node, node_pointer deleted){ if(node == deleted) printf("Deletion denied"); else{ deleted->llink->rlink = deleted->rlink; deleted->rlink->llink = deleted->llink; free(deleted); } } ``` ## Chapter 5 Trees **A tree is a finite set of on or more nodes** * There is a specially designated node called **root**. * The remaining nodes are partitioned into $n\geq0$ disjoint set $T_{1},...,T_{n}$, which are called the **subtrees** of the root.