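# 2018q3 Homework1
contributed by < [`TerryShu`](https://github.com/TerryShu) >

:::danger
Where are your questions? Looking back at the programs you have written, did nothing here conflict with your understanding or spark new ideas?

:notes: jserv
:::

## Why study the C language in depth?
C was originally created to develop the UNIX system.

### Why not discuss C++?
C++ bills itself as an OOP language, but it also includes generic programming and functional programming.
* The feature set is broad, so it is hard to learn
* The language is revised very quickly
* Because of the fast revisions, compilers are often incompatible with one another
* C and C++ have drifted so far apart that they can be treated as different languages

### Differences between C and C++
Since C99, C supports designated initializers:
```clike
struct point p = { .y = yvalue, .x = xvalue };
```
C++ achieves this with constructors. C developers, however, care more about execution order: **constructors (for example, those of global objects) may run before `main`**, which can make porting across platforms harder or produce results that differ from what was expected.

### License differences
* MIT: fairly permissive, as long as the author attribution is not removed
* BSD
* LGPL

:::warning
Look up the terms of BSD and LGPL and how they differ
:::

### C language
An object merely denotes something that occupies storage!

### How was the first C compiler developed?
[How was the first C compiler written?](http://blog.jobbole.com/94311/)
* First, a compiler for a subset of C was written in assembly language
* That subset compiler was then used to build, step by step, C compilers that support more and more of the language
* C~0~ -> C~1~ -> ... -> C~N~ -> C: the first C compiler was completed by stacking one stage on another

### [ISO/IEC 9899](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf)
* object: a region of data storage in the execution environment whose contents can represent values
  > C99 [3.14] Region of data storage in the execution environment, the contents of which can represent values.
* `&` is read "and" only when it is the bitwise operator; when it takes the address of its operand it should be read "address of"
  > C99 [6.5.3.2] The unary & operator yields the address of its operand.
* `*` should be read "value of" or "dereference", not "star". The operand of `*` must have pointer type, and the result denotes the value the operand points to.
  > C99 [6.5.3.2] The operand of the unary * operator shall have pointer type.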
* `sizeof` is an operator, not a function
  > C99 [6.5.3.4] The sizeof operator yields the size (in bytes) of its operand, which may be an expression or the parenthesized name of a type

## The C You Didn't Know: Pointers
### What I should know
A simple pointer experiment:
```clike=
#include <stdio.h>

int main(void)
{
    int num = 8;
    int *pointer = &num; // pointer holds the address of num

    // %p expects a void *, so the addresses are cast explicitly
    printf("the value of num: %d\n", num);
    printf("the address of num : %p\n", (void *) &num);
    printf("the value of pointer : %p\n", (void *) pointer);
    printf("the value of *pointer : %d\n", *pointer);
    printf("the address of pointer : %p\n\n", (void *) &pointer);

    *pointer = 87; // store 87 into the object pointer points to, i.e. num

    printf("the value of num: %d\n", num);
    printf("the address of num : %p\n", (void *) &num);
    printf("the value of pointer : %p\n", (void *) pointer);
    printf("the value of *pointer : %d\n", *pointer);
    printf("the address of pointer : %p\n", (void *) &pointer);
    return 0;
}
```
Result:
```
the value of num: 8
the address of num : 0x7ffcff8bfedc
the value of pointer : 0x7ffcff8bfedc
the value of *pointer : 8
the address of pointer : 0x7ffcff8bfee0

the value of num: 87
the address of num : 0x7ffcff8bfedc
the value of pointer : 0x7ffcff8bfedc
the value of *pointer : 87
the address of pointer : 0x7ffcff8bfee0
```

### The C language specification
### Type
Object: in C, anything that actually occupies memory is called an "object". Do not confuse it with an object in OOP!

Pointer types do not support full arithmetic (`+` and `-` are allowed, `*` and `/` are not).

There are three kinds of types:
> C99 [6.2.5] object types (types that fully describe objects), function types (types that describe functions), and incomplete types (types that describe objects but lack information needed to determine their sizes).

* Object types: types that fully describe objects
* Function types: types that describe functions
* Incomplete types: types that have been declared but lack the information needed to determine their size

### The mystery of `void *`
* `void` did not exist in early C; it was only established in C89

### There is no "double pointer", only a "pointer to a pointer"
"Double" suggests symmetry, but a pointer to a pointer is a hierarchical relationship!

In C, everything is a value!
So C has only call by value.

A simple "pointer to a pointer" experiment:

Goal: with A = 1 and B = 2, use a function call to make `ptrA` point to `B`, so that `*ptrA` no longer yields the value of A

Incorrect version
```clike=
#include <stdio.h>

int B = 2;

void func(int *p)
{
    p = &B; // only changes the local copy p
}

int main()
{
    int A = 1;
    int *ptrA = &A;
    func(ptrA);
    printf("%d\n", *ptrA);
    return 0;
}
```
Result
```
1
```
Correct version
```clike=
#include <stdio.h>

int B = 2;

void func(int **p)
{
    *p = &B; // changes the caller's pointer through its address
}

int main()
{
    int A = 1;
    int *ptrA = &A;
    func(&ptrA);
    printf("%d\n", *ptrA);
    return 0;
}
```
Result
```
2
```
* Analysis: in the incorrect version
  > `func(ptrA)` -> what is passed in is a copy of `ptrA`
  > so when `func` executes `p = &B;`, only the copy changes; `ptrA` in `main` still points to `A`
* Variable lifetime matters!
* The local copy `p` disappears when `func` returns; passing a pointer to a pointer lets the change outlive the call

:::warning
Find where `***` (a pointer to a pointer to a pointer) appears in open source projects.
The hard part is not how many `*` are used but the scenario in which they are used
:::

### Pointers vs. Arrays
* C has no true two- or three-dimensional matrices, only one-dimensional arrays
* Multidimensional arrays are mapped to a one-dimensional array in row-major order

### Array Subscripting
* Array subscripting is syntactic sugar in C
  > x[i] is converted to (\*((x)+(i))) in C
  > x[i] -> (\*((x)+(i))) -> (\*((i)+(x))) -> i[x]

A simple "array subscripting" experiment:
```clike=
#include <stdio.h>

int main()
{
    int x[5] = { 0, 1, 2, 3, 4 };

    // %p expects a void *, so the addresses are cast explicitly
    printf("x[2] value : %d\n", x[2]);
    printf("x[2] address : %p\n", (void *) &x[2]);
    printf("2[x] value : %d\n", 2[x]);
    printf("2[x] address : %p\n", (void *) &2[x]);
    return 0;
}
```
Result:
```
x[2] value : 2
x[2] address : 0x7ffe7e31d968
2[x] value : 2
2[x] address : 0x7ffe7e31d968
```
> Declare pointers as **int \*p**
> rather than int* p
> Otherwise int* p, value is easily misread **(here value is an int, not an int\*)**

C memory layout and pointer experiment:
```clike=
#include <stdio.h>

int a[3];

struct {
    double v[3];
    double length;
} b[17];

int calendar[12][31];

int main() {}
```
Using GDB
```
$ gcc -o try -Og -g tryDebugger.c
$ gdb -q try
(gdb) b main          # set a breakpoint at main
(gdb) r               # run
(gdb) p sizeof(b)
$1 = 544              # size of the whole array b: 4 doubles * 8 bytes * 17 elements
(gdb) p sizeof(b[0])
$2 = 32               # size of one struct element: 4 doubles * 8 bytes
# To get the address of the next element:
(gdb) p &b
$3 = (struct {...} (*)[17]) 0x555555755040 <b>    # address of the array b, the same as that of element 0
(gdb) p &b+1          # Careful! This steps over the entire array b
$4 = (struct {...} (*)[17]) 0x555555755260 <a>
(gdb) p &b[0]+1       # this is how to step over a single element
$5 = (struct {...} *) 0x555555755060 <b+32>
```

### Number representation
* `int` and `float`/`double` (IEEE 754) are stored differently

### The mystery of `int main(int argc, char *argv[], char *envp[])`
```clike=
#include <stdio.h>

/* argv is declared as a pointer to a zero-length array (a GNU extension)
 * and cast back to char ** before use. */
int main(int argc, char (*argv)[0])
{
    puts(((char **) argv)[0]);
    return 0;
}
```
Question:
:::warning
Looking through the C99 specification, I found the following rules for `argc` and `argv`:
> C99 [5.1.2.2.1] If they are declared, the parameters to the main function shall obey the following constraints:
> — The value of argc shall be nonnegative.
> — argv[argc] shall be a null pointer.
> — If the value of argc is greater than zero, the array members argv[0] through argv[argc-1] inclusive shall contain pointers to strings, which are given implementation-defined values by the host environment prior to program startup. The intent is to supply to the program information determined prior to program startup from elsewhere in the hosted environment. If the host environment is not capable of supplying strings with letters in both uppercase and lowercase, the implementation shall ensure that the strings are received in lowercase.
> — If the value of argc is greater than zero, the string pointed to by argv[0] represents the program name; argv[0][0] shall be the null character if the program name is not available from the host environment. If the value of argc is greater than one, the strings pointed to by argv[1] through argv[argc-1] represent the program parameters.
> — The parameters argc and argv and the strings pointed to by the argv array shall be modifiable by the program, and retain their last-stored values between program startup and program termination.

However, I could not find what the contents of argv are supposed to be. Does what is placed in argv vary with the OS, or is it governed by some other specification?
:::
:::info
That is because you have not finished reading the specification. Several more sections deal with the runtime environment; go find them.

:notes: jserv
:::

### strdup vs. strcpy
* `strcpy` takes an explicit destination and source
* The string returned by `strdup` lives in memory obtained with `malloc()`, so it must be released with `free()`; its lifetime therefore runs from that `malloc` to the `free`
* Beware that `strdup` can fail (because `malloc` can fail, for example on a very long string, or when the source has no terminating null...)

### Function pointer
Lvalue: the L stands for locator.

In C99, a function designator is still a function designator no matter how many times it is operated on (however many `*` are prepended).

### string literals
> C99 [6.4.5] A character string literal is a sequence of zero or more multibyte characters enclosed in double-quotes, as in "xyz". A wide string literal is the same, except prefixed by the letter L.
>
> The same considerations apply to each element of the sequence in a character string literal or a wide string literal as if it were in an integer character constant or a wide character constant, except that the single-quote ' is representable either by itself or by the escape sequence \', but the double-quote " shall be represented by the escape sequence \".

`char *p = "hello world"` and `char p[] = "hello world"` look alike, but **their underlying behavior is very different**.

In C99, string literals are placed in "static storage", and attempting to modify their contents is undefined behavior (UB).
* With gcc's ELF targets, for example, string literals are placed in the read-only data section

## The C You Didn't Know: Function Calls
### Nested function
* Standard C does not support nested functions, which keeps compiler design simpler
```clike=
#include <stdio.h>

/* Nested function: not ISO C, although GCC accepts it as an extension. */
void f(void)
{
    void g(void) { printf("Hello"); }
    g();
}
```

### Memory layout
![](https://i.imgur.com/3GOa6hB.png)
* Stack: during a function call, holds each function's local variables, return address, and so on
* Heap: memory allocated with `malloc()` lives here
* BSS Segment: global and static variables that have not been initialized
* Data Segment: global and static variables that have been initialized
* Text Segment: the program's binary code

Question:
:::warning
:question: I have doubts about why BSS and the Data Segment are deliberately kept apart.
I recall that some compilers automatically initialize uninitialized variables to 0 or null.
But once the compiler has done that, are the variables not effectively moved into the Data Segment?
That seems somewhat redundant, so I would like to ask: in which situations are the uninitialized variables in BSS mostly used?
Could BSS and the Data Segment be merged into one?
:::
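
As a rough companion to the segment list and the BSS question above, the sketch below declares one object per region and prints where each one lives. All names in it (`initialized_global`, `uninitialized_global`, `zeroed_buffer`, `all_zero`, `check_static_zero_init`, `print_layout`) are hypothetical and exist only for this illustration. The C standard does not say which segment an object goes into, so the segment comments assume a typical gcc/ELF toolchain. What C99 [6.7.8] does guarantee is that objects with static storage duration and no explicit initializer start out as zero, and that is the part the code asserts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical names, chosen only for this sketch. */
int initialized_global = 42;     /* has an initializer: typically the data segment */
int uninitialized_global;        /* no initializer: typically BSS */
static char zeroed_buffer[4096]; /* no initializer: typically BSS */

/* Return 1 if the first n bytes at buf are all zero, 0 otherwise. */
int all_zero(const void *buf, size_t n)
{
    const unsigned char *bytes = buf;
    for (size_t i = 0; i < n; i++) {
        if (bytes[i] != 0)
            return 0;
    }
    return 1;
}

/* Assert the zero initialization C99 [6.7.8] promises for static storage. */
void check_static_zero_init(void)
{
    assert(uninitialized_global == 0);
    assert(all_zero(zeroed_buffer, sizeof zeroed_buffer));
}

/* Print one address per region; the values differ between runs and platforms. */
void print_layout(void)
{
    int local = 0;                    /* automatic storage: on the stack */
    int *heap = malloc(sizeof *heap); /* allocated storage: on the heap */

    printf("data  : %p\n", (void *) &initialized_global);
    printf("bss   : %p\n", (void *) &uninitialized_global);
    printf("bss   : %p\n", (void *) zeroed_buffer);
    printf("stack : %p\n", (void *) &local);
    if (heap != NULL)
        printf("heap  : %p\n", (void *) heap);
    free(heap); /* free(NULL) is a no-op, so this is safe either way */
}
```

Calling `check_static_zero_init()` and then `print_layout()` from `main` is enough to try it. On a typical ELF system, running the binutils `size` tool on the resulting executable reports the text, data and bss sizes separately, which is one way to see that `zeroed_buffer` is counted under bss rather than data.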