2018q3 Homework3 (review)
===
contributed by < [Jyun-Neng](https://github.com/Jyun-Neng) >
## [第三周測驗 1](https://hackmd.io/s/BkyQskjK7#%E6%B8%AC%E9%A9%97-1)
考慮以下程式碼:
```C=
#include <stdio.h>
#include <stdint.h>
struct test {
unsigned int x : 5;
unsigned int y : 5;
unsigned int z;
};
int main() {
struct test t;
printf("Offset of z in struct test is %ld\n",
(uintptr_t) &t.z - (uintptr_t) &t);
return 0;
}
```
在 GNU/Linux x86_64 環境中執行,得到以下輸出:
```
Offset of z in struct test is 4
```
倘若將第 10 和第 11 換為以下:
```C=10
printf("Address of t.x is %p", &t.x);
```
會發生什麼事?
---
C99 規格書對於 alignment 的定義:
* requirement that objects of a particular type be located on storage boundaries with addresses that are particular multiples of a byte address
> objects 的儲存位址必須以 byte 為單位
C99 規格書 6.2.5 第 27 點:
* A pointer to void shall have the same representation and alignment requirements as a pointer to a character type. Similarly, pointers to qualified or unqualified versions of compatible types shall have the same representation and alignment requirements. All pointers to structure types shall have the same representation and alignment requirements as each other. All pointers to union types shall have the same representation and alignment requirements as each other. Pointers to other types need not have the same representation or alignment requirements.
> (1) pointer to void, (2) pointer to structure, (3) pointer to union, and (4) pointer to qualified or unqualified versions of compatible types 需要 alignment。
根據 C99 規格書 6.5.3.2 第 1 點以及 6.7.2.1:
* The operand of the unary & operator shall be either a function designator, the result of a [] or unary * operator, or an lvalue that designates an object that is not a bit-field and is not declared with the register storage-class specifier.
* The unary & (address-of) operator cannot be applied to a bit-field object; thus, there are no pointers to or arrays of bit-field objects.
由此可知,`&t.x` 因為 `x` 是 bit-field 所以不能用 `&` operator。
## [第三周測驗 2](https://hackmd.io/s/BkyQskjK7#%E6%B8%AC%E9%A9%97-2)
考慮以下程式碼,在 little-endian 硬體上,會返回 `1`,反之,在 big-endian 硬體上會返回 `0`:
```C
int main() {
union { int a; char b;
} c = { .a = 1 };
return c.b == 1;
}
```
---
Union type 的記憶體空間配置為 union 裡需配置最大記憶體空間的變數,所有的成員都使用相同的記憶體空間。本題的 union `c` 中需要最大記憶體配置空間的就是 `int a` 為 4 bytes。
若硬體為 little-endian,則記憶體的存放方式會如下圖:
![](https://i.imgur.com/X7HLgXX.png)
若硬體為 big-endian,則記憶體的存放方式會如下圖:
![](https://i.imgur.com/g2ns34S.png)
所以改為以下程式碼,則在 big-endian 硬體上,會返回 `1`,反之,在 big-endian 硬體上會返回 `0`:
```C
int main() {
union { int a; char b;
} c = { .a = 1 };
return c.b == 0;
}
```
:::info
若改為以下程式碼,應當也可以在 little-endian 硬體上,會返回 `1`,反之,在 big-endian 硬體上會返回 `0`
```C
int main() {
union { int a; char b;
} c = { .a = 254 };
return c.b == 254;
}
```
:::
> `char` 通常被視爲 `signed char`,無法表示 254,要改爲 `unsigned char`
> [name=siahuat0727] [time=Fri, Oct 5, 2018 4:31 PM]
## [第三周測驗 3](https://hackmd.io/s/BkyQskjK7#%E6%B8%AC%E9%A9%97-3)
以下程式碼計算 parity bit:
```C=
#include <stdint.h>
int parity(uint32_t x)
x ^= x >> 1;
x ^= x >> 2;
x = (x & 0x11111111U) * 0x11111111U;
return (x >> 28) & 1;
}
```
---
使用乘法計算 parity 這個方法只需要 8 個運算。
由 2, 3 行程式碼先行算出 4 bits 的 parity bit
`xxxAxxxBxxxCxxxDxxxExxxFxxxGxxxH`
其中 `ABCDEFGH` 分別代表自己的 bit 及前面四個 bit 的 parity bit,因此其他 bit 為多少已經不需在意。
第 5 行程式碼一開始的 AND 運算留下 `ABCDEFGH` 並將其他 bit 設為 0。乘上 `0x11111111U` 的目的其實等同於左移 0, 4, 8, 12, 16, 20, 24, 28 bits 後全部相加:
```
000A000B000C000D000E000F000G000H
+ 000B000C000D000E000F000G000H0000
+ 000C000D000E000F000G000H00000000
+ 000D000E000F000G000H000000000000
+ 000E000F000G000H0000000000000000
+ 000F000G000H00000000000000000000
+ 000G000H000000000000000000000000
+ 000H0000000000000000000000000000
```
在第 28 bit 上會發現所有 parity bit `ABCDEFGH` 都在這個 bit 相加,而加法實際上就是 XOR 及 AND 的組合,AND 是為了計算 carry bit,XOR 計算的是仍然在該 bit 位置的值,此外因為與前一個值隔了 4 bit 中間都是 0,若要受到 carry bit 的影響必須要加上 16 個 1,但這裡很明顯頂多就 7 個 1 相加,所以不會影響。因此 32-bit 的 parity bit 就在第 28 bit 上,所以第 6 行程式碼將乘完的結果右移 28 bits 再 AND 1 即為答案。
一篇有名的 bit operation 手冊: [Bit Twiddling Hacks](https://graphics.stanford.edu/~seander/bithacks.html)
### 更快的實作機制
將兩個 32-bit data 合併成 64-bit data 使用 SIMD 的方法用相同的運算數完成兩筆 32-bit data 的 parity bit 計算:
```C=
#include <stdint.h>
int parity(uint64_t x)
x ^= x >> 1;
x ^= x >> 2;
x = (x & 0x1111111111111111UL) * 0x1111111111111111UL;
return (x >> 28) & 0x100000001;
}
```
最後回傳的結果,兩筆 32-bit data 的 parity bit 計算結果分別在第 1 個及第 33 個 bit 上。
## [第二周測驗 5](https://hackmd.io/s/BJA6LjbK7#%E6%B8%AC%E9%A9%97-5)
以下程式是合法 C99 程式嗎?
```C
#include <stdio.h>
int main() { return (********puts)("Hello"); }
```
請搭配 C 語言規格書解釋
繼續思考以下是否合法:
```C
#include <stdio.h>
int main() { return (**&puts)("Hello"); }
```
繼續變形:
```C
#include <stdio.h>
int main() { return (&**&puts)("Hello"); }
```
也會合法嗎?為什麼?請翻閱 C 語言規格書解說。
---
根據 C99 規格書
- [ 6.3.2.1 ] A function designator is an expression that has function type
- **Except** when it is the operand of the **sizeof** operator or the unary **&** operator, a **function designator** with type ‘‘function returning type’’ **is converted to** an expression that has type ‘‘**pointer to function returning type’**’.
- `*` is Right associative operator
- [ 6.5.3.2 ] Thus, **&*E is equivalent to E (even if E is a null pointer)**, and &(E1[E2]) to ((E1)+(E2)). It is always true that if E is a function designator or an lvalue that is a valid operand of the unary & operator, ***&E is a function designator or an lvalue equal to E**. If *P is an lvalue and T is the name of an object pointer type, *(T)P is an lvalue that has a type compatible with that to which T points.
- [ 6.5.3.2-1 ] The operand of the unary & operator shall be either a function designator, the result of a [] or unary * operator, or an lvalue that designates an object that is not a bit-field and is not declared with the register storage-class specifier.
- [ 6.5.3.2-4 ] The unary * operator denotes indirection. **If the operand points to a function, the result is a function designator**
`puts()` 為 function designator,對其 dereference 後,根據 [6.5.4.2-4] 最後還是一個 function designator,所以第一個程式碼不論做了幾次 dereference 最後都是 function designator。
根據 [ 6.5.3 ] `(*&puts)()` 仍為 function designator 所以合法。
根據 [ 6.5.3 ] `(&**&puts)()` 對等價於 `(*&puts)()` 所以合法。
若連續兩個 `&` 則會有問題 ex: `(& &**&puts)()` 兩個 `&` 之間空格是防止判斷成 `&&`。
```shell
cannot take the address of an rvalue
```
根據 [6.5.3.2-1],因為 `(&**&puts)()` 會得到 `puts()` 回傳的值 (rvalue),所以無法使用 `&`。
## [第二周測驗 2](https://hackmd.io/s/BJA6LjbK7#測驗-2)
```C
void (*signal(int sig, void (*handler)(int))) (int);
```
---
`signal(sig, signal_handler)` 的功能為收到 `sig` 時會帶入 `signal_handler` 執行。
[qtest.c](https://github.com/sysprog21/lab0-c/blob/67654f5d1fe108930633595dc90d3c68d234a1a9/qtest.c) 中便有使用到 signal 函式。當收到 `SIGSEGV` 時會由 `sigsegvhandler` 處理此信號 ; 當收到 `SIGALRM` 時,會由 `sigalrmhandler` 處理此信號。
```clike=
static void queue_init()
{
fail_count = 0;
q = NULL;
signal(SIGSEGV, sigsegvhandler);
signal(SIGALRM, sigalrmhandler);
}
```
* `SIGSEGV` (segmentation violation) 當存取了無效記憶體時會產生的信號。當產生此信號,`sigsegvhandler` 就會拋出一個例外,顯示 "Segmentation fault occurred. You dereferenced a NULL or invalid pointer" 的例外訊息。
* `SIGALRM` 是計時器產生的信號,目的為偵測長時間的程式執行。當此信號產生,`sigalrmhandler` 就會拋出一個例外,顯示 "Time limit exceeded. Either you are in an infinite loop, or your code is too inefficient" 的例外訊息。
```clike=
/* Signal handlers */
void sigsegvhandler(int sig)
{
trigger_exception(
"Segmentation fault occurred. You dereferenced a NULL or invalid "
"pointer");
}
void sigalrmhandler(int sig)
{
trigger_exception(
"Time limit exceeded. Either you are in an infinite loop, or your "
"code is too inefficient");
}
```