2018q3 Homework3 (review)

# 2018q3 Homework3 (review) contributed by <`AlecJY`> ## 1-2 一開始會先建立一個 $2^8$ 的 lookup table | 十進位 | 二進位 | Parity Check | | ----- | ----- | ------------ | | 0 | 0000 | 0 | | 1 | 0001 | 1 | | 2 | 0010 | 1 | | 3 | 0011 | 0 | | 4 | 0100 | 1 | | 5 | 0101 | 0 | | 6 | 0110 | 0 | | 7 | 0111 | 1 | | 8 | 1000 | 1 | | 9 | 1001 | 0 | | 10 | 1010 | 0 | | 11 | 1011 | 1 | | 12 | 1100 | 0 | | 13 | 1101 | 1 | | 14 | 1110 | 1 | | 15 | 1111 | 0 | 上面這張 lookup table 只包含 4 bit 的部分，可以發現每四個 bit 依循著 n, n^1, n^1, n 的規則，而每 4 個 bit 作為一組的時候每一組也依循著 n, n^1, n^1, n 的規則由於輸入的是一個 32 bit 的數字， lookup table 只有做 8 bit 數字的部分，因此在迴圈的地方先做了 `num ^= 16;` 和 `num ^= 8;` 將問題化簡為 8 bit。 ## 2-3 在某些情況可以避免因為 operator 優先順序問題而做了錯誤的操作。例如 `head` 傳入一個變數 `*l` ，因為 `->` 的優先順序比 `*` 高，因此在沒有 `()` 的情況下就會先對 l 做 access member through pointer 再把得到的值 dereference ## 3-1 ``` C= #include <stdio.h> #include <stdint.h> struct test { unsigned int x : 5; unsigned int y : 5; unsigned int z; }; int main() { struct test t; printf("Offset of z in struct test is %ld\n", (uintptr_t) &t.z - (uintptr_t) &t); return 0; } ``` 首先先看一下在 GNU/Linux x86_64 環境中執行是怎麼得到 ``` Offset of z in struct test is 4 ``` 變數 x 和 y 是 bit-field ，根據 C99 規格書的說法 > An implementation may allocate any addressable storage unit large enough to hold a bitfield. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. > -- ISO/IEC 9899:1999 p.102 x 佔了 5 bits ， y 也佔了 5 bits，兩個相加為 10 bits ，假設一個 unsigned int 的大小為 32 bits ，夠同時放進 x 和 y ，所以 &t.z 到 &t 之間只使用了 4 bytes。按照這種說法，如果我們把程式碼改一下 ``` C= #include <stdio.h> #include <stdint.h> struct test { unsigned int x : 30; unsigned int y : 5; unsigned int z; }; int main() { struct test t; printf("Offset of z in struct test is %ld\n", (uintptr_t) &t.z - (uintptr_t) &t); return 0; } ``` 這個時候 x 和 y 所佔的 bit 數總和就是 35 ，大於一個 unsigned int 的大小，所以這個時候占用的記憶體空間就變成 ``` Offset of z in struct test is 8 ``` 這題題目在問的是把程式碼改成底下這樣 ``` C= #include <stdio.h> #include <stdint.h> struct test { unsigned int x : 5; unsigned int y : 5; unsigned int z; }; int main() { struct test t; printf("Address of t.x is %p", &t.x); return 0; } ``` 到底可不可以編譯，結果會如何，而答案是 ``` (e) 編譯失敗，不能將指標指向沒有對齊 1 byte 的結構體成員; ``` 會讓人想如果把 x 的 bit 數換成 8 或是 32 能不能編譯成功，結果很顯然的是不行的， GCC 都會產生類似以下的錯誤 ``` test.c: In function ‘main’: test.c:10:36: error: cannot take address of bit-field ‘x’ printf("Address of t.x is %p", &t.x); ``` 這個錯誤訊息已經很明白的說明了不能取 bit-field 的 address ，如果去翻閱 C99 規格書也可以找到以下解釋 > The operand of the unary & operator shall be either a function designator, the result of a [] or unary * operator, or an lvalue that designates an object that is not a bit-field and is not declared with the register storage-class specifier > -- ISO/IEC 9899:1999 p.78 在 102 頁的註釋也有再次提醒 > The unary & (address-of) operator cannot be applied to a bit-field object; thus, there are no pointers to or arrays of bit-field objects. > -- ISO/IEC 9899:1999 p.102 所以並不存在任何指向 bit-field 的指標，也沒辦法取得 bit-field 的 address ## 3-2 這題利用 union 將不同 data type 存在同一個記憶體空間的特性來判斷 byte order ，在 `c.a` 存入 1 ，在 big endian 的環境下記憶體裡會存成 16 進位的 `00 00 00 01` ，在 little endian 的環境下會儲存成 `01 00 00 00` ，這樣讀取第一個 byte 看看是 0 還是 1 就知道是 little endian 還是 big endian ### 找出類似的程式碼在 Stack Overflow 上面有找到類似的程式碼 [來源](https://stackoverflow.com/a/8979034) ``` C inline int IsBigEndian() { int i=1; return ! *((char *)&i); } ``` 一樣利用了 int 在 big endian 環境下還有 little endian 環境下儲存方式不同的概念來找出 byte order ，只是這邊是利用轉換指標的 type 來取得 int 的第一個 byte ### 其他判斷 Byte Order 的方法在第一次作業的時候有找到某段程式碼跟轉換 Byte Order 有關係，想說從那份程式碼找找看他怎麼判斷 Byte Order，結果找到下面這段[程式碼](https://github.com/Stichting-MINIX-Research-Foundation/minix/blob/03ac74ede908465cc64c671bbd209e761dc765dc/include/arpa/nameser_compat.h#L52) ``` C #if defined(vax) || defined(ns32000) || defined(sun386) || defined(i386) || \ defined(MIPSEL) || defined(_MIPSEL) || defined(BIT_ZERO_ON_RIGHT) || \ defined(__i386__) || defined(__i386) || defined(__amd64__) || \ defined(__x86_64__) || defined(MIPSEL) || defined(_MIPSEL) || \ defined(BIT_ZERO_ON_RIGHT) || defined(__alpha__) || defined(__alpha) || \ (defined(__Lynx__) && defined(__x86__)) #define BYTE_ORDER LITTLE_ENDIAN #endif #if defined(sel) || defined(pyr) || defined(mc68000) || defined(sparc) || \ defined(is68k) || defined(tahoe) || defined(ibm032) || defined(ibm370) || \ defined(MIPSEB) || defined(_MIPSEB) || defined(_IBMR2) || defined(DGUX) ||\ defined(apollo) || defined(__convex__) || defined(_CRAY) || \ defined(__hppa) || defined(__hp9000) || \ defined(__hp9000s300) || defined(__hp9000s700) || \ defined(__hp3000s900) || defined(__hpux) || defined(MPE) || \ defined (BIT_ZERO_ON_LEFT) || defined(m68k) || defined(__sparc) || \ (defined(__Lynx__) && \ (defined(__68k__) || defined(__sparc__) || defined(__powerpc__))) #define BYTE_ORDER BIG_ENDIAN #endif ``` 由於 C 標準裡面沒有定義跟判斷 byte order 相關的方法，所以要在編譯時期找出 byte order 也只能用像是這種直接列表列出來的方法。如果編譯器是用 GCC 的話則有提供 `__BYTE_ORDER__` 這個 Macro 來判斷。 ## 3-3 前兩行 `x ^= x >> 1;` 和 `x ^= x >> 2;` 運算後會將每一個 bit 與其前面三個 bit 做 xor ， `0x11111111` 在二進位表示的時候是 `0b1_0001_0001_0001_0001_0001_0001_0001` ，與 x 做 and 即為每四個 bit 中取最低的 bit，之後再與 `0b1_0001_0001_0001_0001_0001_0001_0001` 相乘，在最高 bit 往下數四個 bit 那位剛好是將所有位數相加的結果，因此在 32 bit 的時候平移 28 位之後做 and 1 取得那位數即為將最初 x 值中二進位每四個位數做 xor 後再把它們相加的結果，也是 parity check 的結果