# 2018q3 Homework4 (assessment)
contributed by < `kevin110604` >
###### tags: `2018q3`
## Week 4 Quiz `1`
### Problem
Consider the following code that computes an absolute value:
```C
#include <stdint.h>
int64_t abs64(int x) {
if (x < 0) return -x;
return x;
}
```
Remove the branch and take advantage of [two's complement](https://en.wikipedia.org/wiki/Two%27s_complement) properties to rewrite it as the code below:
```C
#include <stdint.h>
int64_t abs64(int64_t x) {
int64_t y = x A1 (A2 - 1);
return (x A3 y) - y;
}
```
Fill in the blanks. `A1` and `A3` are both operators.
==Answer area==
A1 = ?
* `(a)` &
* `(b)` |
* `(c)` ^
* `(d)` <<
* `(e)` >>
A2 = ?
* `(a)` 0
* `(b)` 1
* `(c)` 61
* `(d)` 62
* `(e)` 63
* `(f)` 64
A3 = ?
* `(a)` &
* `(b)` |
* `(c)` ^
* `(d)` <<
* `(e)` >>
:::success
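Follow-up questions:
1. Explain how it works, and discuss possible overflow/underflow issues;
2. Using the pseudo-random number generator (PRNG) below and taking (1) into account, write a test program for `abs64`, and discuss the engineering issues (e.g., can the whole int64_t value range be tested in a finite amount of time?)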
```C
static uint64_t r = 0xdeadbeef;
int64_t rand64() {
r ^= r >> 12;
r ^= r << 25;
r ^= r >> 27;
return (int64_t) (r * 2685821657736338717);
}
```
3. Find a project on GitHub that uses a similar technique and discuss it. Hint: cryptography-related
:::
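### Ideas & Reasoning
Based on the properties of two's complement, my guess is that the code first stores the sign-bit information of `x` in `y`, so the first line of the function should right-shift `x` by (64-1) bits (that is, `A1` is `>>` and `A2` is `64`). In other words, if `x` is positive or `0`, `y` becomes `0x0000000000000000`; if `x` is negative, `y` becomes `0xffffffffffffffff`.
The absolute value returns `x` when the input `x` is positive and `-x` when it is negative, i.e. its two's complement. To obtain the two's complement of a number, toggle all of its bits and then add `1`. Toggling a 64-bit number means XORing it with `0xffffffffffffffff`. Since subtracting negative one is the same as adding one, if `x` is negative its two's complement is exactly `(x ^ y) - y` (so `A3` is `^`); if `x` is non-negative, XORing `x` with `0` leaves it unchanged, and subtracting `0` leaves it unchanged as well.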
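Putting the reasoning above together, the completed function can be sketched as follows. This is a minimal sketch that assumes `>>` on a negative `int64_t` is an arithmetic (sign-extending) shift, which is what gcc and clang do on x86_64 but is not guaranteed by the C standard.

```c
#include <stdint.h>

/* Branch-free absolute value with A1 = >>, A2 = 64, A3 = ^.
 * Assumption: right-shifting a negative int64_t sign-extends. */
int64_t abs64(int64_t x)
{
    int64_t y = x >> (64 - 1); /* 0 if x >= 0, all ones (-1) if x < 0 */
    return (x ^ y) - y;        /* x < 0: ~x + 1 == -x; x >= 0: x unchanged */
}
```

For example, `abs64(-5)` gives `y == -1`, `x ^ y == 4`, and `4 - (-1) == 5`.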
:::info
According to item 5 of 6.5.7 Bitwise shift operators in the specification, right-shifting a negative integer is actually implementation-defined:
> 5. The result of `E1 >> E2` is `E1` right-shifted `E2` bit positions. If `E1` has an unsigned type or if `E1` has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of E1 / 2^E2^. If `E1` has a signed type and a negative value, the resulting value is implementation-defined.
So can `abs64()` really be used everywhere?
:::
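One possible way to avoid depending on the implementation-defined shift is to build the mask with unsigned arithmetic only. The helper name `abs64_portable` and the `uint64_t` return type are my own assumptions for illustration, not part of the quiz.

```c
#include <stdint.h>

/* Hypothetical portable variant: only unsigned shifts and wrap-around
 * arithmetic are used, so no step is implementation-defined or undefined. */
uint64_t abs64_portable(int64_t x)
{
    uint64_t ux = (uint64_t) x;     /* conversion is defined modulo 2^64 */
    uint64_t mask = 0 - (ux >> 63); /* 0 if x >= 0, 0xffffffffffffffff if x < 0 */
    return (ux ^ mask) - mask;      /* unsigned subtraction wraps, never overflows */
}
```

Because the result is unsigned, the magnitude of `INT64_MIN` ($2^{63}$) is representable as well.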
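### Follow-up questions
The maximum value of a 64-bit signed integer is $2^{63}-1$ and the minimum is $-2^{63}$. Passing $-2^{63}$ to this function does not produce the correct answer: $2^{63}$ is not representable in `int64_t`, so the final subtraction overflows.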
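For the second follow-up question, a test can be sketched along these lines. The driver `count_abs64_mismatches` and the reference `abs64_ref` are hypothetical names of mine; the PRNG is the one from the problem statement, with a `ULL` suffix added to the multiplier. `INT64_MIN` is skipped because both implementations overflow on it.

```c
#include <stddef.h>
#include <stdint.h>

static uint64_t r = 0xdeadbeef;

/* PRNG from the problem statement (ULL suffix added to the constant). */
int64_t rand64(void)
{
    r ^= r >> 12;
    r ^= r << 25;
    r ^= r >> 27;
    return (int64_t) (r * 2685821657736338717ULL);
}

/* Function under test; assumes an arithmetic right shift. */
int64_t abs64(int64_t x)
{
    int64_t y = x >> (64 - 1);
    return (x ^ y) - y;
}

/* Branching reference; the caller must not pass INT64_MIN. */
static int64_t abs64_ref(int64_t x) { return x < 0 ? -x : x; }

/* Hypothetical driver: checks a few edge cases plus n random inputs
 * and returns the number of mismatches found. */
unsigned long count_abs64_mismatches(unsigned long n)
{
    static const int64_t edge[] = { 0, 1, -1, INT64_MAX, -INT64_MAX };
    unsigned long bad = 0;

    for (size_t i = 0; i < sizeof edge / sizeof edge[0]; i++)
        bad += abs64(edge[i]) != abs64_ref(edge[i]);

    for (unsigned long i = 0; i < n; i++) {
        int64_t x = rand64();
        if (x == INT64_MIN) /* -x would overflow, skip it */
            continue;
        bad += abs64(x) != abs64_ref(x);
    }
    return bad;
}
```

Exhaustive testing is not practical: there are $2^{64} \approx 1.8 \times 10^{19}$ inputs, so even at an assumed rate of $10^9$ checks per second a full sweep would take roughly 585 years. Random sampling plus hand-picked boundary values is the realistic compromise.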
### References
* [C99 specification (N1256)](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf)
---
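## Week 4 Quiz `2`
### Problem
Consider [tco-test](https://github.com/sysprog21/tco-test), a program that tests a C compiler's [Tail Call Optimization](https://en.wikipedia.org/wiki/Tail_call) (TCO) capability. Compiling it with gcc-8.2.0 with optimization suppressed (i.e. the `-O0` compiler option) gives the following output: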
```shell
$ gcc -Wall -Wextra -Wno-unused-parameter -O0 main.c first.c second.c -o chaining
$ ./chaining
No arguments: no TCO
One argument: no TCO
Additional int argument: no TCO
Dropped int argument: no TCO
char return to int: no TCO
int return to char: no TCO
int return to void: no TCO
```
Compiling with optimization enabled (here at the `-O2` level) gives the following output:
```shell
$ gcc -Wall -Wextra -Wno-unused-parameter -O2 main.c first.c second.c -o chaining
$ ./chaining
No arguments: TCO
One argument: TCO
Additional int argument: TCO
Dropped int argument: TCO
char return to int: no TCO
int return to char: no TCO
int return to void: TCO
```
Note that [__builtin_return_address](https://gcc.gnu.org/onlinedocs/gcc/Return-Address.html) is a gcc built-in function:
> This function returns the return address of the current function, or of one of its callers. The level argument is number of frames to scan up the call stack. A value of 0 yields the return address of the current function, a value of 1 yields the return address of the caller of the current function, and so forth. When inlining the expected behavior is that the function returns the address of the function that is returned to. To work around this behavior use the noinline function attribute.
> The level argument must be a constant integer.
The experiment shows that TCO cannot be applied to the call to `g` in the program below:
```C
#include <stdio.h>

void g(int *p);
void f(void) {
int x = 3;
g(&x);
}
void g(int *p) { printf("%d\n", *p); }
```
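This is because a tail call would release the stack frame of `f`, so its local variable `x` would no longer exist on the stack while `g` still needs it. Consider the following code: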
```C=
int *global_var;
void f(void)
{
int x = 3;
global_var = &x;
...
/* Can the compiler perform TCO here? */
g();
}
```
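Think about the comment in the program: can TCO be applied at line 8? Choose the most appropriate explanation.
==Answer area==
* `(a)` The compiler cannot possibly apply TCO
* `(b)` The compiler can always apply TCO
* `(c)` As long as function `g` does not dereference the `global_var` pointer, TCO is possible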
:::success
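Follow-up questions:
1. Discuss the principles of TCO and recursive programs
2. Analyze the behavior of the experiment above and explain how gcc handles TCO
3. Find a use of [__builtin_return_address](https://gcc.gnu.org/onlinedocs/gcc/Return-Address.html) in the [Android source code](https://android.googlesource.com/) and explain it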
:::
---
### Ideas & Reasoning
To be completed.
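As a starting point for the first follow-up question, here is a minimal sketch of the difference between an ordinary recursive call and a tail call. The function names are hypothetical and are not taken from tco-test; whether a compiler actually turns the tail call into a jump depends on the compiler and the optimization level.

```c
/* Not a tail call: the addition still has to run after the
 * recursive call returns, so the caller's frame must stay alive. */
unsigned long sum_plain(unsigned long n)
{
    if (n == 0)
        return 0;
    return n + sum_plain(n - 1);
}

/* Tail call: the recursive call is the last action and its result is
 * returned unchanged, so the caller's frame is no longer needed and an
 * optimizing compiler may reuse it instead of pushing a new one. */
unsigned long sum_tail(unsigned long n, unsigned long acc)
{
    if (n == 0)
        return acc;
    return sum_tail(n - 1, acc + n);
}
```

Both compute the same value, e.g. `sum_plain(100)` and `sum_tail(100, 0)` are both 5050; only the second form leaves nothing to do in the caller after the call.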
### Follow-up questions
### References
---
## Week 4 Quiz `3`
### Problem
After compiling and running the following code on x86_64 GNU/Linux, a memory access error occurs:
```shell
$ cat ptr.c
int main() {
int *ptr = 0;
return *ptr;
}
$ gcc -o ptr ptr.c
$ ./ptr
Segmentation fault: 11
```
Consider each of the following 4 programs and discuss its behavior.
- [ ] `ptr1.c`
```C
int main() { return *((int *) 0); }
```
- [ ] `ptr2.c`
```C
int main() { return &*((int *) 0); }
```
- [ ] `ptr3.c`
```C
#include <stddef.h>
int main() { return &*NULL; }
```
- [ ] `ptr4.c`
```C
#include <stddef.h>
int main() {
return &*(*main - (ptrdiff_t) **main);
}
```
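==Answer area==
K1 = ?
* `(a)` `ptr1.c` causes a Segmentation fault at run time
* `(b)` For `ptr1.c`, the C language specification states that this is undefined behavior or a syntax error
* `(c)` `ptr1.c` is a valid C program; after running it, `echo $?` shows an exit code of `0`
K2 = ?
* `(a)` `ptr2.c` causes a Segmentation fault at run time
* `(b)` For `ptr2.c`, the C language specification states that this is undefined behavior or a syntax error
* `(c)` `ptr2.c` is a valid C program; after running it, `echo $?` shows an exit code of `0`
K3 = ?
* `(a)` `ptr3.c` causes a Segmentation fault at run time
* `(b)` For `ptr3.c`, the C language specification states that this is undefined behavior or a syntax error
* `(c)` `ptr3.c` is a valid C program; after running it, `echo $?` shows an exit code of `0`
K4 = ?
* `(a)` `ptr4.c` causes a Segmentation fault at run time
* `(b)` For `ptr4.c`, the C language specification states that this is undefined behavior or a syntax error
* `(c)` `ptr4.c` is a valid C program; after running it, `echo $?` shows an exit code of `0`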
:::success
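Follow-up questions:
1. Referring to the C language specification, fully explain the underlying principles
2. Analyze the warning messages that the clang/gcc compilers emit for the code above
3. Think about how the `Segmentation fault` message gets displayed, using GNU/Linux as the example. Hint: Page fault handler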
:::
### Ideas & Reasoning
Section 6.3.2.3 Pointers of the specification gives a clear definition of a null pointer:
> 3. An integer constant expression with the value 0, or such an expression cast to type `void *`, is called a **null pointer constant**.^55)^ If a null pointer constant is converted to a pointer type, the resulting pointer, called a **null pointer**, is guaranteed to compare unequal to a pointer to any object or function.
> 55\) The macro `NULL` is defined in `<stddef.h>` (and other headers) as a null pointer constant; see 7.17.
So in the code from the problem
```c
int main() {
int *ptr = 0;
return *ptr;
}
```
we know that `ptr` is a null pointer, so dereferencing it to read a value is not valid.
The same reasoning applies to ptr1.c:
```c
int main() { return *((int *) 0); }
```
`((int *) 0)` is a null pointer, so it cannot be dereferenced either.
Then why do ptr2.c and ptr3.c work?
```c
int main() { return &*((int *) 0); }
```
```c
#include <stddef.h>
int main() { return &*NULL; }
```
First, look at what 6.5.3.2 Address and indirection operators in the specification says:
> 3. The unary `&` operator yields the address of its operand. If the operand has type ‘‘type’’, the result has type ‘‘pointer to type’’. ==If the operand is the result of a unary `*` operator, neither that operator nor the `&` operator is evaluated and the result is as if both were omitted,== except that the constraints on the operators still apply and the result is not an lvalue. Similarly, if the operand is the result of a `[]` operator, neither the `&` operator nor the unary `*` that is implied by the `[]` is evaluated and the result is as if the `&` operator were removed and the `[]` operator were changed to a `+` operator. Otherwise, the result is a pointer to the object or function designated by its operand.
> 4. The unary `*` operator denotes indirection. If the operand points to a function, the result is a function designator; if it points to an object, the result is an lvalue designating the object. If the operand has type ‘‘pointer to type’’, the result has type ‘‘type’’. If an invalid value has been assigned to the pointer, the behavior of the unary `*` operator is undefined.^87)^
> 87\) ==Thus, `&*E` is equivalent to `E` (even if `E` is a null pointer),== and `&(E1[E2])` to `((E1)+(E2))`. It is always true that if `E` is a function designator or an lvalue that is a valid operand of the unary `&` operator, `*&E` is a function designator or an lvalue equal to `E`. If `*P` is an lvalue and `T` is the name of an object pointer type, `*(T)P` is an lvalue that has a type compatible with that to which `T` points.
> Among the invalid values for dereferencing a pointer by the unary `*` operator are a null pointer, an address inappropriately aligned for the type of object pointed to, and the address of an object after the end of its lifetime.
So `&*((int *) 0)` is just `((int *) 0)`, and `&*NULL` is just `NULL`.
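The rule quoted above can be illustrated with two small helpers (the names are hypothetical). This is a sketch of the `&*E` and `&(E1[E2])` equivalences only; it does not make reading through a null pointer valid.

```c
#include <stddef.h>

/* `&*p` is equivalent to `p`: neither operator is evaluated,
 * so passing a null pointer here does not dereference it. */
int *addr_of_deref(int *p)
{
    return &*p;
}

/* `&base[i]` is equivalent to `base + i`: the element is not read. */
int *addr_of_elem(int *base, size_t i)
{
    return &base[i];
}
```

Calling `addr_of_deref(NULL)` simply returns a null pointer, and `addr_of_elem(a, 2)` returns `a + 2` for an array `a` with at least two elements.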
ptr4.c follows the same idea:
```c
#include <stddef.h>
int main() {
return &*(*main - (ptrdiff_t) **main);
}
```
`&*(*main - (ptrdiff_t) **main)` is just `(*main - (ptrdiff_t) **main)`. `main` without a trailing `()` acts as a pointer to the `main` function, `*main` is the `main` function itself, and `**main` is... (I still need to look into this)
### Follow-up questions
```shell
$ gcc-8 test4-3-2.c
test4-3-2.c: In function 'main':
test4-3-2.c:1:21: warning: returning 'int *' from a function with return type 'int' makes integer from pointer without a cast [-Wint-conversion]
int main() { return &*((int *) 0); }
^~~~~~~~~~~~~
```
```shell
$ gcc-8 test4-3-3.c
test4-3-3.c: In function 'main':
test4-3-3.c:2:22: warning: dereferencing 'void *' pointer
int main() { return &*NULL; }
^
test4-3-3.c:2:21: warning: returning 'void *' from a function with return type 'int' makes integer from pointer without a cast [-Wint-conversion]
int main() { return &*NULL; }
^
```
```shell
$ gcc-8 test4-3-4.c
test4-3-4.c: In function 'main':
test4-3-4.c:3:12: warning: returning 'int (*)()' from a function with return type 'int' makes integer from pointer without a cast [-Wint-conversion]
return &*(*main - (ptrdiff_t) **main);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```
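For the third follow-up question, the rough picture on GNU/Linux is: the invalid access triggers a page fault, the kernel's page fault handler finds no valid mapping and sends `SIGSEGV` to the process, the default action terminates it, and the parent (normally the shell) learns this from the wait status and prints the message. The parent's side can be sketched as below. The helper name is hypothetical, and the null read itself is undefined behavior in C, so this only shows what typically happens on a POSIX system.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: performs a null-pointer read in a child process.
 * Returns the signal that killed the child, 0 if it exited normally,
 * or -1 if fork/waitpid failed. */
int signal_from_null_read(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* volatile keeps the compiler from removing the access */
        volatile int *volatile p = 0;
        int v = *p;          /* expected to fault here */
        _exit(v == 0 ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    /* A shell does the same check before printing "Segmentation fault" */
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On x86_64 GNU/Linux the returned value is expected to equal `SIGSEGV`.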
### References
* [What is the difference between NULL, '\0' and 0](https://stackoverflow.com/questions/1296843/what-is-the-difference-between-null-0-and-0)
* [NULL pointer in C](https://www.geeksforgeeks.org/few-bytes-on-null-pointer-in-c/)
* [ptrdiff_t](https://en.cppreference.com/w/c/types/ptrdiff_t)
---
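## Takeaways from "The Year I Delayed Graduation Because of an Automatic Drink Machine"
[因為自動飲料機而延畢的那一年 (The Year I Delayed Graduation Because of an Automatic Drink Machine)](http://opass.logdown.com/posts/1273243-the-story-of-auto-beverage-machine-1)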