# 2018q3 Homework1 contributed by < [kevin110604](https://github.com/kevin110604) > :::danger 你的提問呢?回顧你過去開發的程式,難道沒想到概念上有衝突或者激發出更多想法嗎? :notes: jserv ::: ###### tags: `2018q3` ## Content - [ ] [為什麼要深入學習 C 語言?](https://hackmd.io/s/HJpiYaZfl) - [ ] [你所不知道的 C 語言: 指標篇](https://hackmd.io/s/HyBPr9WGl#) - [ ] [你所不知道的 C 語言: 函式呼叫篇](https://hackmd.io/s/SJ6hRj-zg) - [ ] [你所不知道的 C 語言: 遞迴呼叫篇](https://hackmd.io/s/rJ8BOjGGl) - [ ] [你所不知道的 C 語言: 前置處理器應用篇](https://hackmd.io/s/S1maxCXMl) - [ ] [你所不知道的 C 語言: goto 和流程控制](https://hackmd.io/s/B1e2AUZeM) - [ ] [你所不知道的 C 語言: linked list 和非連續記憶體操作](https://hackmd.io/s/SkE33UTHf) - [ ] [你所不知道的 C 語言:技巧篇](https://hackmd.io/s/HyIdoLnjl) - [ ] [GNU/Linux 開發工具](https://hackmd.io/c/rJKbX1pFZ) ## 為什麼要深入學習 C 語言? ## 指標篇 ### 一些難懂的指標... * case1 ```c int main() { typedef void (*funcptr)(); (* (funcptr) (void *) 0)(); // 呼叫地址為0的函式 } ``` * case2 ```c void **(*d) (int &, char **(*)(char *, char **)); ``` `d` is a pointer to a function that takes two parameters: * a reference to an int and * a pointer to a function that takes two parameters: * a pointer to a char and * a pointer to a pointer to a char and returns a pointer to a pointer to a char and returns a pointer to a pointer to void * Note: C 語言最開始是用來開發 Unix 作業系統,可以說 C 語言是 Unix 的母語 ### object * 在 C 當中,在執行時期,只要能儲存資料的區域,它的空間不為零,就是物件 * incomplete type 沒有佔用一個固定明確的空間,所以不是物件。 * 可以宣告一個 incomplete type ,但不能利用它來建立 instance ,但可以用一個指標指向他 * Array, function, and pointer types are collectively called **derived declarator types**. A declarator type derivation from a type T is the construction of a derived declarator type from T by the application of an array-type, a function-type, or a pointer-type derivation to T. ### Alignment 對某硬體架構,像是 ARM,我們需要額外的 alignment。ARMv5 (含) 以前,若要操作 32-bit 整數 `uint32_t`,該指標必須對齊 32-bit 邊界 (否則會在 dereference 時觸發 exception)。於是,當要從 `void *` 位址讀取 `uint16_t` 時,需要這麼做: ```c /* may receive wrong value if ptr is not 2-byte aligned */ /* portable way of reading a little-endian value */ uint16_t value = *(uint16_t *) ptr; uint16_t value = *(uint8_t *) ptr | ((*(uint8_t *) (ptr + 1)) << 8); ``` ### pointer to pointer ```c int B = 2; void func(int *p) { p = &B; } int main() { int A = 1, C = 3; int *ptrA = &A; func(ptrA); printf("%d\n", *ptrA); return 0; } ``` 因為 function call 傳入的是一個變數的副本,對副本作更動並不會影響原本傳入變數的值 所以可以做以下更改: ```c int B = 2; void func(int **p) { *p = &B; } int main() { int A = 1, C = 3; int *ptrA = &A; func(&ptrA); printf("%d\n", *ptrA); return 0; } ``` 再多加 `*` 使得 `p` 的 life time 跟 `ptrA` 一樣 * Note: 在 C 語言當中,只有 call-by-value * forward declaration 前置宣告 ### Pointers vs. Arrays * 在宣告的時候 * 陣列前面加上 `extern` 之後,不可和指標互換 * 陣列宣告了固定的長度,不可和指標互換 * parameter of function 可和指標互換: `func(char x[])` <=> `func(char *x)` ps: 嚴格上來說是 `func(char x[])` <=> `func(char * const x)` 因為指標 `x` 不能指到別的地方 * in expression * array 與 pointer 都可互換:`a[i]` <=> `*(a+i)` * e.g. ```c int main() { int x[10] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}; printf("%d %d %d %d\n", x[4], *(x + 4), *(4 + x), 4[x]); // output 4 4 4 4 } ``` Because we know `*(x + 4)` equal to `*(4 + x)` and `*(x + 4)` could be written to `x[4]`, so `*(4 + x)` also could be written to `4[x]`. * In C, there is no real array. Actually, `[]` is just a operator. * Array subscripting * Array subscripting 背後的意義就是指標 ### GDB debugger ``` $ gcc -o file_exe -Og -g file.c $ gdb -q file_exe ``` `-g`: 產生讓 debugger 使用的除錯訊息 `-Og`: 針對 debugger 做最佳化 :::info 注意是大寫的 "O",才是編譯器最佳化選項 :notes: jserv ::: * Note: `sizeof` is an operator, not function. * e.g. ```c int a[3]; struct { double v[3]; double length; } b[17]; int calendar[12][31]; ``` ```shell (gdb) p &b $5 = (struct {...} (*)[17]) 0x601060 <b> (gdb) p &b+1 $6 = (struct {...} (*)[17]) 0x601280 <a> (gdb) p &b[0]+1 $8 = (struct {...} *) 0x601080 <b+32> ``` `&b + 1` 遞移一整個 `b` 佔用的空間 `&b[0] + 1` 遞移一個 `b[0]` 佔用的空間 * Note: It is possible to fail when you call `malloc`. If that happens, it will return `NULL`. However, when you call `malloc(0)`, it will return `NULL`, too. e.g. So, instead write something like this ```c char r[strlen(s) + strlen(t)]; // OK in C99 strcpy(r, s); strcat(r, t); ``` or this ```c char *r = malloc(strlen(s) + strlen(t)); strcpy(r, s); strcat(r, t); ``` the correct one should be: ```c char *r = malloc(strlen(s) + strlen(t) + 1); // use sbrk; change progrram break if (!r) exit(1); /* print some error and exit */ strcpy(r, s); strcat(r, t); /* later */ free(r); r = NULL; /* Try to reset free’d pointers to NULL */ ``` ### Function pointer ```c int main() { return (********puts)("Hello"); } ``` * C99 [ 6.3.2.1 ] A **function designator** is an expression that has function type. Except when it is the operand of the `sizeof` operator or the unary `&` operator, a function designator with type ‘‘function returning type’’ is converted to an expression that has type ‘‘pointer to function returning type’’. * So why it works? * `*` is Right associative operator * C99 [6.5.1 ] It is an lvalue, a function designator, or a `void` expression if the unparenthesized expression is, respectively, an lvalue, a function designator, or a `void` expression. * C99 [6.5.3.2-4] The unary `*` operator denotes indirection. If the operand points to a function, the result is a function designator. ### String * `gets()` 極容易會有 buffer overflow 的問題,不應再使用。 * 自 C99 以來,C 有 2 種字串 * byte string: 使用 char 作為最小操作單位,每個 char 至少要 8 bits * wide string: 使用 [wchar_t](http://pubs.opengroup.org/onlinepubs/007908775/xsh/wchar.h.html) 作為最小操作單位,以便支援 Unicode 或多國語文 (意味著變動程度編碼) * C 語言規格中定義了 string literals 會被分配於 “static storage” 當中 (C99 [6.4.5]),並說明如果程式嘗試修改 string literals 的內容,將會造成未定義行為 * `char *p = "hello world"` vs `char p[] = "hello world"` e.g. ```c char *f4() { char *x = "apple"; return x; } char *f5() { char x[] = "apple"; return x; // warning: function returns address of local variable } int main() { printf("%s\n", f4()); // apple printf("%s\n", f5()); // (null) } ``` ## 函式呼叫篇 * In standard C, there is no nest function. * 一般的情況,程式並不會接觸到實體記憶體,而是透過 MMU 將 virtual address 對應到 physical address ![](https://i.imgur.com/8f6M5nK.png) ### Program Memory Layout ![](https://hackpad-attachments.s3.amazonaws.com/embedded2015.hackpad.com_2q5oxqltYTG_p.299401_1449482197657_undefined) [source](http://gghh.name/dibtp/2015/11/10/function-calls-in-c-the-boring-specs.html) * Stack: used by function call * Heap: `malloc` or `free` ### Recursive * Stack ![](https://i.imgur.com/XtaMV0o.png) [source](https://www.slideshare.net/jserv/helloworld-internals) * Stack frame * Frame pointer * Stack pointer * 在 Recursive call 的時候, stack 裡面會存放 parameters, local variables and return addresses * return value 一般來說是放在暫存器裡。 ### Heap * Q: 為什麼使用 `malloc()` 時需要給定空間,而 `free()` 則不用? * `malloc()` 配置給你的空間往往會比預期的在大一點點,這牽扯到 aloghnment 的問題... * `free()` ```c #include <stdlib.h> int main() { int *p = (int *) malloc(1024); free(p); free(p); //double free return 0; } ``` 上面的程式執行時會出錯,但如果多加一行... ```c #include <stdlib.h> int main() { int *p = (int *) malloc(1024); free(p); p = NULL; // do this after free() to avoid double free free(p); // if p is NULL, no operation is performed return 0; } ``` 就不會有問題了。 ## 遞迴呼叫篇 * 由於現代的編譯器最佳化技術,遞迴的程式不見得會比迭代 的版本來得慢。 * **Tail recursion** 透過增加參數,並把計算過程直接放入增加的參數裡回傳的方法,以減少重複的計算、stack 和 local variables 的使用。 ### Greatest Common Divisor ### Sum ### Hanoi Tower ### Binary Search ### Fibonacci Sequence * recursive ```c int fib(int n) { if (n == 0) return 0; if (n == 1) return 1; return fib(n - 1) + fib (n - 2); } ``` * tail recursion ```c int fib(int n, int a, int b) { if (n == 0) return a; return fib(n - 1 , b, a + b); } ``` ### Reverse a String ## 前置處理器應用篇 <img style="display: block; margin: auto;" src="https://i.imgur.com/JMOk7Mz.png"></img> * 如果善用 macro ,可以讓程式碼變得更精簡、更容易維護,從而提昇程式設計思考的層面。 ### Preprocessor * Stringification/Stringizing: 把一個表示式變成字串 * Concatenation * Note: 用 `gcc -E` 觀察 C preprocessor 的輸出結果。 ### Object Oriented * Object system/model 1. class-based: 模板長什麼樣子就是什麼樣子 2. prototype-based: 可逐步擴充屬性 * e.g. ```c typedef enum { NORTH, SOUTH, EAST, WEST} Direction; typedef struct { char *description; int (*init)(void *self); void (*describe)(void *self); void (*destroy)(void *self); void *(*move)(void *self, Direction direction); int (*attack)(void *self, int damage); } Object; int Object_init(void *self); void Object_destroy(void *self); void Object_describe(void *self); void *Object_move(void *self, Direction direction); int Object_attack(void *self, int damage); void *Object_new(size_t size, Object proto, char *description); #define NEW(T, N) Object_new(sizeof(T), T##Proto, N) #define _(N) proto.N ``` ### _Generic * C11 新增了一個關鍵字: _Generic,可以用來達到類似 C++ template 的效果。(type-generic functions) * e.g.1 cube root ```c #include <math.h> #define cbrt(X) \ _Generic((X), \ long double: cbrtl, \ default: cbrt, \ const float: cbrtf, \ float: cbrtf \ )(X) ``` * e.g.2 ```c #include <stdio.h> void funci(int x) { printf("func value = %d\n", x); } void funcc(char c) { printf("func char = %c\n", c); } void funcdef(double v) { printf("Def func's value = %lf\n", v); } #define func(X) \ _Generic((X), \ int: funci, char: funcc, default: funcdef \ )(X) ``` ### 應用: Exception Handling * `setjmp()`, `longjmp()` ## goto 和流程控制 ### goto * 善用 goto 來處理"善後",可以避免掉很多巢狀的 if-else 。 ## linked list 和非連續記憶體操作 ### Remove list node [從刪除 linked-list node 看程式設計的品味](https://medium.com/fcamels-notes/%E5%BE%9E%E5%88%AA%E9%99%A4-linked-list-node-%E7%9C%8B%E7%A8%8B%E5%BC%8F%E8%A8%AD%E8%A8%88%E7%9A%84%E5%93%81%E5%91%B3-b597cc5af785) ps: We assume that the target we search for must be in the list. Ver 1. ```c void remove_list_node(List *list, Node *target) { Node *prev = NULL; Node *current = list->head; // Walk the list while (current != target) { prev = current; current = current->next; } // Remove the target by updating the head or the previous node. if (!prev) list->head = target->next; else prev->next = target->next; } ``` Ver 2. ```c void remove_list_node(List *list, Node *target) { // The "indirect" pointer points to the *address* // of the thing we'll update. Node **indirect = &list->head; // Walk the list, looking for the thing that // points to the node we want to remove. while (*indirect != target) indirect = &(*indirect)->next; *indirect = target->next; } ``` `indirect` 一開始是一個指向 `list->head` 的指標,所以 `*indirect` 就是 `list->head` 的 alias。執行到 while 時,我們檢查 `*indirect` 跟 `target` 一不一樣 (兩者的型態都是指向 `Node` 的指標) 。如果不一樣,就把 `indirect` 改成指向 `indirect` 現在指向的指標 (也就是 `*indirect` ) 指向的 `Node` 的 member `next`。 如此重複,直到找到目標跳出迴圈後,由於 `*indirect` 是指向我們找到的目標 (也就是它是目標的前一個的 `next` ) ,所以把它改成指向 `target->next` 就可以成功刪除 `*target`。 #### Abstract data type ### Circular linked list [include/linux/list.h](https://github.com/torvalds/linux/blob/master/include/linux/list.h) ```c /** * list_for_each - iterate over a list * @pos: the &struct list_head to use as a loop cursor. * @head: the head for your list. */ #define list_for_each(pos, head) \ for (pos = (head)->next; pos != (head); pos = pos->next) ``` 另一個"安全"的版本: ```c /** * list_for_each_safe - iterate over a list safe against removal of list entry * @pos: the &struct list_head to use as a loop cursor. * @n: another &struct list_head to use as temporary storage * @head: the head for your list. */ #define list_for_each_safe(pos, n, head) \ for (pos = (head)->next, n = pos->next; pos != (head); \ pos = n, n = pos->next) ``` `n`: 下一個的下一個 glist contain of ## 技巧篇