# 2020q3 Homework2 (quiz2) contributed by < ``yceugene`` > ###### tags: `linux2020` `sysprog2020` 題目連結: https://hackmd.io/@sysprog/2020-quiz2 ## :memo: 目錄 [TOC] ## 1. ASCII 編碼 * 題意: 在明確界定 7 位元編碼的字元隸屬於 ASCII 的情況下,判斷指定的記憶體範圍內是否全是有效的 ASCII 字元。 * 觀念: 只需要判斷數值大小範圍是否藉於: 0~127,(若超過就不是有效的 ASCII 字元)。 * 完整 ASCII table 可參考: [ASCII Code - The extended ASCII table](https://www.ascii-code.com/)、[ASCII table , ascii codes](https://theasciicode.com.ar/ascii-printable-characters/at-sign-atpersand-ascii-code-64.html) 範例程式 1: 輸入字串記憶體位址和字串長度,針對字串長度,透過迭代**依序比較每個字元**的數值大小。 ``` c #include <stddef.h> bool is_ascii(const char str[], size_t size) { if (size == 0) return false; for (int i = 0; i < size; i++) if (str[i] & 0x80) /* i.e. (unsigned) str[i] >= 128 */ return false; return true; } ``` 範例程式 2: 較快的版本,作法和前一段程式類似,但改成: **每次處理 64 bits 的資料 (即 word size)**: ``` c #include <stdbool.h> #include <stddef.h> #include <stdint.h> #include <string.h> bool is_ascii(const char str[], size_t size) { if (size == 0) return false; int i = 0; while ((i + 8) <= size) { uint64_t payload; memcpy(&payload, str + i, 8); if (payload & MMM) return false; i += 8; } while (i < size) { if (str[i] & 0x80) return false; i++; } return true; } ``` * 這邊 ``MMM`` 應填上: ``0x8080808080808080`` * 觀念: * 在第一支程式碼中,是逐一比對每個 char ,看值的大小是否超出 128 的範圍: ``str[i] & 0x80`` 這行的作用是將 char 和 ``1000 0000(二進制)`` 做 ``AND`` 運算,意思就是只留下 char 的最高位元,其他部分都變為 0。 * 新的程式碼要在允許情況下,一次比對 64 位元的資料,意思就是一次處理 8 個 char,套用上面的做法,程式會逐一將將每 8 個 char,一併和 ``0x8080808080808080`` 做 ``AND`` 運算。 * 為何用到 memcpy? ## 2. 開發解析器 (parser) * 題意: 以不使用分支的方式,將十六進位表示的字串轉為數值 (不區分大小寫)。例如 `0xF` (大寫 `F` 字元) 和 `0xf` (小寫 `f` 字元) 都該轉換為數值 `15`。 * 已知 `'in'` 一定隸屬於 \`0`\``, `1`, `2`, …, `9` 及 `a`, `b`, `c`, …, `f` (小寫) 及 `A`, `B`, `C`, …, `F` (大寫) 這些字元。預期 `hexchar2val('F')` 和 `hexchar2val('f')` 回傳 `15`,且 `hexchar2val('0')` 回傳 `0`,其他輸入都有對應的數值。 * 程式碼: ``` c uint8_t hexchar2val(uint8_t in) { const uint8_t letter = in & MASK; const uint8_t shift = (letter >> AAA) | (letter >> BBB); return (in + shift) & 0xf; } ``` * 作答: * MASK: 0x40 (用來保留字串的第 5 個 bit 的值) * AAA: 3 (結合 AAA 和 BBB 來計算出數值 9 或 0) * BBB: 6 * 觀察: 先觀察 **數字字串**、**小寫字母**和**大寫字母**的二進制表示,以及其轉換後數值的二進制表示: | 數字字串 | 字串的 16 進制表示 | 字串的二進制表示| => | 轉換後的數值(二進制表示) | | -------- | -------- | -------- | -------- | -------- | | `'0'` | `0x30` | `0011 0000` | => | `0000` | | `'1'` | `0x31` | `0011 0001` | => | `0001` | | `'2'` | `0x32` | `0011 0010` | => | `0010` | | `'3'` | `0x33` | `0011 0011` | => | `0011` | | `'4'` | `0x34` | `0011 0100` | => | `0100` | | `'5'` | `0x35` | `0011 0101` | => | `0101` | | `'6'` | `0x36` | `0011 0110` | => | `0110` | | `'7'` | `0x37` | `0011 0111` | => | `0111` | | `'8'` | `0x38` | `0011 1000` | => | `1000` | | `'9'` | `0x39` | `0011 1001` | => | `1001` | | 小寫字母 | 16 進制表示 | 字串的二進制表示| => | 轉換後的數值(二進制表示) | | -------- | -------- | -------- | -------- | -------- | | `'a'` | `0x61` | `0110 0001` | => | `1010` | | `'b'` | `0x62` | `0110 0010` | => | `1011` | | `'c'` | `0x63` | `0110 0011` | => | `1100` | | `'d'` | `0x64` | `0110 0100` | => | `1101` | | `'e'` | `0x65` | `0110 0101` | => | `1110` | | `'f'` | `0x66` | `0110 0110` | => | `1111` | | 大寫字母 | 16 進制表示 | 字串的二進制表示| => | 轉換後的數值(二進制表示) | | -------- | -------- | -------- | -------- | -------- | | `'A'` | `0x41` | `0100 0001` | => | `1010` | | `'B'` | `0x42` | `0100 0010` | => | `1011` | | `'C'` | `0x43` | `0100 0011` | => | `1100` | | `'D'` | `0x44` | `0100 0100` | => | `1101` | | `'E'` | `0x45` | `0100 0101` | => | `1110` | | `'F'` | `0x46` | `0100 0110` | => | `1111` | * 歸納三種字串的轉換如下: * **數字字串**: 轉二進制表示時,其右邊 4 個 bits 恰好就是最終轉換出的數值大小。 * **小寫字母** 轉二進制表示時,其右邊 4 個 bits 轉成數值後再加 `9`,恰好就是最終轉換出的數值大小。(另外左半邊 4 個 bits 全為 `0110`) * **大寫字母**: 轉二進制表示時,其右邊 4 個 bits 轉成數值後再加 `9`,恰好就是最終轉換出的數值大小。(另外左半邊 4 個 bits 全為 `0100`) * 分析: * 程式碼最後一行回傳 ``(in + shift) & 0xf``,表示回傳時只考慮了``in + shift`` 二進制中的右邊 4 個 bits,這代表 ``in + shift`` 的右半部即為最終數值的二進制表示。 * 再搭配上述表格,我們可以如下分析: * 針對數值字串轉二進制時,我們只需要把數值字串的左半邊清除成 `0000`,即可得到結果。因次在處理數值字串時, `shift` 須為 `0000`,程式才會輸出正確結果。 * 針對英文母轉二進制時,我們需要把字串右半部加上 `9`,左半邊清除成 `0000`,即可得到結果。因次在處理字母字串時,不論大寫或小寫字母, `shift` 都必須是 `1001`,程式才會輸出正確結果。(剩餘目標是思考如何讓 `shift` 為 `1001`) * 考慮大小寫問題: 根據上述表格我們觀察到 "小寫字母的二進制表示中,左半邊全為 `0110`; 大寫字母的二進制表示中,左半邊全為 `0100`。 * 搭配程式碼,整個轉換過程可以這樣理解: * 把字母都視為大寫: 因此在``letter = in & MASK`` 這一行裡,MASK 為 `0100 0000`。 * 轉換字母字串時,``shift`` 要設為數值 9 (即 `1001`);轉換數字字串時,``shift`` 要設為數值 `0` (即 `0000`)。因此在``shift = (letter >> AAA) | (letter >> BBB)``這一行中: AAA 和 BBB 必須為 `3` 和 `6`(順序顛倒也可以)。 * 最後回傳數值時,只留下右邊 4 個 bits,左邊 4 個 bits 要清為 0,因此回傳 ``(in + shift) & 0xf``。 ## 3. 除法運算 * 題意: * 對已知的除數 `d (或 D)` 和被除數 `n` 做計算,求出餘數 `n - quotient * D`。 * 利用乘法運算、位移操作、預先計算,來代替除法運算 * 程式碼: ``` c const uint32_t D = 3; #define M ((uint64_t)(UINT64_C(XXX) / (D) + 1)) /* compute (n mod d) given precomputed M */ uint32_t quickmod(uint32_t n) { uint64_t quotient = ((__uint128_t) M * n) >> 64; return n - quotient * D; } ``` * 作答 XXX = 0xFFFFFFFFFFFFFFFF * 程式原理: * 要先計算出商數 `quotient` 才能計算餘數。而`quotient`就是 `n/D` 的整數部分。 * 根據提示`n/D` 可以表示成 $\frac{n}{D}=n \times \frac{\frac{2^{N}}{D}}{2^{N}}$,因此可將計算拆解成: * 預先計算: $2^{N} / d$ * 計算: n * $2^{N} / d$ ,再將結果向右 shift `N bits`。 (搭配 `quotient = ((__uint128_t) M * n) >> 64;` 可知 `N` 為 `64`。 * 處理預先計算會 overflow 的問題: 已知 `N` 為 `64`,且 `M` 的型別為 `uint64_t`(上限為 $2^{64}-1$),所以 `M` 不能直接設為 $2^{N} / d$。因此這裡把 M 設為 $(2^{N}-1)/d + 1$,因此 `XXX = 0xFFFFFFFFFFFFFFFF`。 * examination: Modify source code to be: ``` c #define MDebug printf const uint32_t D = 3; #define M ((uint64_t)(UINT64_C(0xFF) / (D) +1 )) #define MM ((uint64_t)(UINT64_C(0xFF) / (D) )) /* compute (n mod d) given precomputed M */ uint32_t quickmod(uint32_t n) { uint64_t quotient = ( M * n) >> 8; uint64_t quotientMM = ( MM * n) >> 8; MDebug("\tdata:%d, \tMM:0x%lx, MM*%d=0x%lX, q:%ld(%lx)\tq*D=%ld \n", n, MM, n, ( MM * n),quotientMM,quotientMM, quotientMM*D); MDebug("\tdata:%d, \t M:0x%lx, M*%d=0x%lX, q:%ld(%lx)\tq*D=%ld \n\t=>", n, M, n, ( M * n), quotient,quotient, quotient*D); return n - quotient * D; } int main(int argc, char const *argv[]) { uint32_t aData[8] = {5, 55, 555, 12, 9, 100, 4, 3}; uint32_t result; for(int i=0; i<8; i++) { result = quickmod(aData[i]); printf("mode of %d is: %d \n", aData[i], result); } } ``` In order to prevent from overflow, $2^{8}$ was used instead of $2^{64}$ in this examination. Please find executed result below: ``` data:5, MM:0x55, MM*5=0x1A9, q:1(1) q*D=3 data:5, M:0x56, M*5=0x1AE, q:1(1) q*D=3 =>mode of 5 is: 2 data:55, MM:0x55, MM*55=0x1243, q:18(12) q*D=54 data:55, M:0x56, M*55=0x127A, q:18(12) q*D=54 =>mode of 55 is: 1 data:555, MM:0x55, MM*555=0xB847, q:184(b8) q*D=552 data:555, M:0x56, M*555=0xBA72, q:186(ba) q*D=558 =>mode of 555 is: -3 data:12, MM:0x55, MM*12=0x3FC, q:3(3) q*D=9 data:12, M:0x56, M*12=0x408, q:4(4) q*D=12 =>mode of 12 is: 0 data:9, MM:0x55, MM*9=0x2FD, q:2(2) q*D=6 data:9, M:0x56, M*9=0x306, q:3(3) q*D=9 =>mode of 9 is: 0 data:100, MM:0x55, MM*100=0x2134, q:33(21) q*D=99 data:100, M:0x56, M*100=0x2198, q:33(21) q*D=99 =>mode of 100 is: 1 data:4, MM:0x55, MM*4=0x154, q:1(1) q*D=3 data:4, M:0x56, M*4=0x158, q:1(1) q*D=3 =>mode of 4 is: 1 data:3, MM:0x55, MM*3=0xFF, q:0(0) q*D=0 data:3, M:0x56, M*3=0x102, q:1(1) q*D=3 =>mode of 3 is: 0 ``` > :warning: study: why +1 is required in M? >> * M (with '+1') is correct, but MM (without '+1') is sometimes wrong, when data is 555, 12, 9, and 3. >> * ## 4. (延伸測驗 3), * 題意: 判斷某個數值能否被指定的除數所整除 (即 n 是否為 D 的倍數,若是則回傳 1 ,若否則回傳 0) * 在 `D` 和 `M` 都沿用的狀況下 (D = 3, M = `0xFFFFFFFFFFFFFFFF / 3 + 1 = 0x5555555555555556`),程式碼如下: ``` c bool divisible(uint32_t n) { return n * M <= YYY; } ``` * 注意條件: n 的資料型別為 uint32_t,M 的資料型別為 uint64_t,又 M = 0x5555555555555556,故 `n < M` * 根據 n mod 3 的特性,n 有三種可能: 3k, 3k + 1, 3k + 2 (k 為整數),每種對應的 n * M 為: 1. 若 n = 3k, n * M = 3k * `0x5555555555555556` = 3k * (`0x5555555555555555` + `0x0000000000000001`) = k * (`0xFFFFFFFFFFFFFFFF` + `0x0000000000000003`) = k * `0x0000000000000002` = 2k (小於 M) 2. 若 n = 3k + 1, n * M = (3k + 1) * M = 3kM + M = 2k + M (大於 M) 3. 若 n = 3k + 2, n * M = (3k + 2) * M = 3kM + 2M = 2k + 2M, 即 M < n * M (大於 M) * 因為 n 不會比 M 大,因此可以搭配上述特性,比較 n*M 的計算結果和 M 的大小關係,判斷 n 是否整除 3 * 問題: 但我不清楚,為何要選 M-1 不是 M。 ## 5. 英文字母大小寫轉換 `strlower` 的 in-place 實作如下: ``` c #include <ctype.h> #include <stddef.h> /* in-place implementation for converting all characters into lowercase. */ void strlower(char *s, size_t n) { for (size_t j = 0; j < n; j++) { if (s[j] >= 'A' && s[j] <= 'Z') s[j] ^= 1 << 5; else if ((unsigned) s[j] >= '\x7f') /* extended ASCII */ s[j] = tolower(s[j]); } } ``` 實作向量化的 `strlower`: ``` c #include <ctype.h> #include <stddef.h> #include <stdint.h> #define PACKED_BYTE(b) (((uint64_t)(b) & (0xff)) * 0x0101010101010101u) /* vectorized implementation for in-place strlower */ void strlower_vector(char *s, size_t n) { size_t k = n / 8; for (size_t i = 0; i < k; i++, s += 8) { uint64_t *chunk = (uint64_t *) s; if ((*chunk & PACKED_BYTE(VV1)) == 0) { /* is ASCII? */ uint64_t A = *chunk + PACKED_BYTE(128 - 'A' + VV2); uint64_t Z = *chunk + PACKED_BYTE(128 - 'Z' + VV3); uint64_t mask = ((A ^ Z) & PACKED_BYTE(VV4)) >> 2; *chunk ^= mask; } else strlower(s, 8); } k = n % 8; if (k) strlower(s, k); } ``` * 在 64 位元處理架構中,以 word 為單位來處理字串,因此要把輸入字串拆成每 8 個一組 (一個 char 佔一位元)。 `#define PACKED_BYTE(b) (((uint64_t)(b) & (0xff)) * 0x0101010101010101u)` 這行的作用是把 8 位元的資料擴充成內容重複的 64 位元,在 strlower_vector() 的程式碼中,我們需要一次判斷 8 個字元是否為 ASCII 字元 (根據 [測驗1]() 我們知道可以透過 `str[i] & 0x80` 來判斷一個字元是否落在 ASCII 範圍內),這裡的作法是利用 `#define PACKED_BYTE(b)` 將 `0x80` 擴展成 `0x0808080808080808`。因此 `if ((*chunk & PACKED_BYTE(VV1)) == 0)` 中,`VV1` 應填上 `0x80`。 * 解讀以下程式碼 ``` c uint64_t A = *chunk + PACKED_BYTE(128 - 'A' + VV2); uint64_t Z = *chunk + PACKED_BYTE(128 - 'Z' + VV3); uint64_t mask = ((A ^ Z) & PACKED_BYTE(VV4)) >> 2; *chunk ^= mask; ``` 先觀察 `128 - 'A' + VV2` 和 `128 - 'Z' + VV3`,推測這段程式碼應該是在判斷字元是否落於 `'A' ~ 'Z'` 的範圍。 把 ASCII 字元,分成三個範圍: 小於 'A', 大於等於 'A' 且 小於等於 'Z', 大於 'Z'。 從三個範圍中各取一個字元放到下方表格。 | char | Decimal| => | 與 127 的距離 | | -------- | -------- | -------- | -------- | | `'c'` (大於 'Z') | `99` | => | `28` | | `'Z'` | `90` | => | `37` (128 - `'Z'` + 1) | | `'C'` (大於等於 'A' 且 小於等於 'Z')| `67` | => | `60` | | `'A'` | `65` | => | `62` (128 - `'A'` + 1)| | `' '` (小於 'A')| `32` | => | `95` | 要判斷一個字元是否落於 `'A' ~ 'Z'` 的範圍,除了直接比較字元和 'A'、'Z' 的大小以外,也可以考慮把字元加上 `127 - 'A'` 及 `127 - 'Z'`,判斷結果是否會超過 ASCII 的範圍 (>127)。 * 若輸入字元 > 'A',則輸入字元 + `127 - 'A'` 會大於 127,其二進制表示中最左 bit 會是1。 * 若輸入字元 >= 'A',則輸入字元 + `127 - 'A' + 1` 會大於 127 (**即輸入字元 + `128 - 'A'`**),其二進制表示中最左 bit 會是1。 * 若輸入字元 < 'Z',則輸入字元 + `127 - 'Z'` 會小於 127,其二進制表示中最左 bit 會是 0。 * 若輸入字元 <= 'Z',則輸入字元 + `127 - 'Z'` 會小等於 127 (**即輸入字元 + `128 - 'Z' - 1`**),其二進制表示中最左 bit 會是 0。 因此 `'A' ~ 'Z'` 範圍判斷方式應為: 輸入字元分別加上 `128 - 'A'` 及 `128 - 'Z' - 1`。 In order to set bit 5 for each char in 64bits word, 0x80 (VV4) was required to & (A ^ Z) and followed by >> 2. > * examination: modify source code to show more debug information as below: ```c /* vectorized implementation for in-place strlower */ void strlower_vector(char *s, size_t n) { size_t k = n / 8; for (size_t i = 0; i < k; i++, s += 8) { uint64_t *chunk = (uint64_t *) s; char acBuff[9] = {0,0,0,0,0,0,0,0,0}; memcpy((void*)acBuff, (void*)s, 8); MDebug("<%s> if(%lX==0)\n", acBuff, (*chunk & PACKED_BYTE(VV1)) ); if ((*chunk & PACKED_BYTE(VV1)) == 0) { /* is ASCII? */ uint64_t A = *chunk + PACKED_BYTE(128 - 'A' + VV2); uint64_t Z = *chunk + PACKED_BYTE(128 - 'Z' + VV3); uint64_t mask = ((A ^ Z) & PACKED_BYTE(VV4)) >> 2; *chunk ^= mask; memcpy((void*)acBuff, (void*)s, 8); MDebug("\tA=%lx, Z=%lx, mask=%016lx, *chunk=%lx\n\t=>%s\n\n", A, Z, mask, *chunk, acBuff); } else strlower(s, 8); } k = n % 8; if (k) strlower(s, k); } ``` > * the execution data can be found below: ```c <This eBo> if(0==0) A=ae81a45fb2a8a793, Z=94678a45988e8d79, mask=0020000000000020, *chunk=6f62652073696874 =>this ebo <ok is fo> if(0==0) A=aea55fb2a85faaae, Z=948b45988e459094, mask=0000000000000000, *chunk=6f66207369206b6f =>ok is fo <r the us> if(0==0) A=b2b45fa4a7b35fb1, Z=989a458a8d994597, mask=0000000000000000, *chunk=7375206568742072 =>r the us <e of any> if(0==0) A=b8ada05fa5ae5fa4, Z=9e9386458b94458a, mask=0000000000000000, *chunk=796e6120666f2065 =>e of any <one anyw> if(0==0) A=b6b8ada05fa4adae, Z=9c9e9386458a9394, mask=0000000000000000, *chunk=77796e6120656e6f =>one anyw <here at > if(0==0) A=5fb3a05fa4b1a4a7, Z=459986458a978a8d, mask=0000000000000000, *chunk=2074612065726568 =>here at <no cost > if(0==0) A=5fb3b2aea25faead, Z=4599989488459493, mask=0000000000000000, *chunk=2074736f63206f6e =>no cost <and with> if(0==0) A=a7b3a8b65fa3ada0, Z=8d998e9c45899386, mask=0000000000000000, *chunk=6874697720646e61 =>and with < almost > if(0==0) A=5fb3b2aeacaba05f, Z=4599989492918645, mask=0000000000000000, *chunk=2074736f6d6c6120 => almost <no restr> if(0==0) A=b1b3b2a4b15faead, Z=9799988a97459493, mask=0000000000000000, *chunk=7274736572206f6e =>no restr <ictions > if(0==0) A=5fb2adaea8b3a2a8, Z=459893948e99888e, mask=0000000000000000, *chunk=20736e6f69746369 =>ictions <whatsoev> if(0==0) A=b5a4aeb2b3a0a7b6, Z=9b8a949899868d9c, mask=0000000000000000, *chunk=76656f7374616877 =>whatsoev <er. You> if(0==0) A=b4ae985f5f6db1a4, Z=9a947e454553978a, mask=0000200000000000, *chunk=756f7920202e7265 =>er. you < may cop> if(0==0) A=afaea25fb8a0ac5f, Z=959488459e869245, mask=0000000000000000, *chunk=706f632079616d20 => may cop <y it, gi> if(0==0) A=a8a65f6bb3a85fb8, Z=8e8c4551998e459e, mask=0000000000000000, *chunk=6967202c74692079 =>y it, gi <ve it aw> if(0==0) A=b6a05fb3a85fa4b5, Z=9c8645998e458a9b, mask=0000000000000000, *chunk=7761207469206576 =>ve it aw <ay or re> if(0==0) A=a4b15fb1ae5fb8a0, Z=8a97459794459e86, mask=0000000000000000, *chunk=657220726f207961 =>ay or re <-use it > if(0==0) A=5fb3a85fa4b2b46c, Z=45998e458a989a52, mask=0000000000000000, *chunk=207469206573752d =>-use it <under th> if(0==0) A=a7b35fb1a4a3adb4, Z=8d9945978a89939a, mask=0000000000000000, *chunk=6874207265646e75 =>under th <e terms > if(0==0) A=5fb2acb1a4b35fa4, Z=459892978a99458a, mask=0000000000000000, *chunk=20736d7265742065 =>e terms <of the P> if(0==0) A=8f5fa4a7b35fa5ae, Z=75458a8d99458b94, mask=2000000000000000, *chunk=702065687420666f =>of the p <roject G> if(0==0) A=865fb3a2a4a9aeb1, Z=6c4599888a8f9497, mask=2000000000000000, *chunk=67207463656a6f72 =>roject g <utenberg> if(0==0) A=a6b1a4a1ada4b3b4, Z=8c978a87938a999a, mask=0000000000000000, *chunk=677265626e657475 =>utenberg < License> if(0==0) A=a4b2ada4a2a88b5f, Z=8a98938a888e7145, mask=0000000000002000, *chunk=65736e6563696c20 => license < include> if(0==0) A=a4a3b4aba2ada85f, Z=8a899a9188938e45, mask=0000000000000000, *chunk=6564756c636e6920 => include <d with t> if(0==0) A=b35fa7b3a8b65fa3, Z=99458d998e9c4589, mask=0000000000000000, *chunk=7420687469772064 =>d with t <his eBoo> if(0==0) A=aeae81a45fb2a8a7, Z=9494678a45988e8d, mask=0000200000000000, *chunk=6f6f626520736968 =>his eboo <k or onl> if(0==0) A=abadae5f99805faa, Z=919394457f664590, mask=0000000020200000, *chunk=6c6e6f207a61206b =>k az onl <ine at w> if(0==0) A=b65fb3a05fa4ada8, Z=9c459986458a938e, mask=0000000000000000, *chunk=7720746120656e69 =>ine at w <ww.guten> if(0==0) A=ada4b3b4a66db6b6, Z=938a999a8c539c9c, mask=0000000000000000, *chunk=6e657475672e7777 =>ww.guten <berg.net> if(0==0) A=b3a4ad6da6b1a4a1, Z=998a93538c978a87, mask=0000000000000000, *chunk=74656e2e67726562 =>berg.net this ebook is for the use of anyone anywhere at no cost and with almost no restrictions whatsoever. you may copy it, give it away or re-use it under the terms of the project gutenberg license included with this ebook or online at www.gutenberg.net ``` > :warning: the information above, low byte was displayed first for each long word. For example, 'mask=0020000000000020' in the second line, means 'T' and 'B' upper case were found from 'This eBo' string. 'T' is the first char but displayed last, while B was the 7th char, but displayed as 2nd byte. This is litte-endian feature of Intel processor. ## 6. LeetCode 137. Single Number II ### 關於 KKK = ? 觀察返回值 lower 其初值是 0, 第一個操作 ^= 之後, 設為輸入值, 第二個動作 &= 是清除, 因此: * (a) higher: 清除為0, 這項不對. * (b) lower: 無動作, 也不對. * (d) ~lower: 清除 lower 自己, 這項不對。 * (e) 0xFFFFFFFF: 無動作, 也不對. 因此選 ( C) ~higher. ### 關於 JJJ = ? 同 lower 其初值是 0, 第一個操作 ^= 之後, 設為輸入值, 第二個動作 &= 是清除, 因此: * (a) higher: 無動作, 不對. * (b) lower: 保留 lower,清除其餘的 bits, 怪怪的! * (c) ~higher: 清除 higher 自己, 這項不對。 * (e) 0xFFFFFFFF: 無動作, 也不對. 因此選 (d) ~lower. 加上 MDebug 來偵錯, 完整程式如下: ``` c #define MDebug printf #define KKK (~higher) #define JJJ (~lower) int singleNumber(int *nums, int numsSize) { int lower = 0, higher = 0; for (int i = 0; i < numsSize; i++) { lower ^= nums[i]; lower &= KKK; higher ^= nums[i]; higher &= JJJ; MDebug("%d \thigher:0x%08x, lower:0x%08x\n", nums[i], higher, lower); } return lower; } int main() { int iExample1[] = {2,2,2,3}; int iExample2[] = {0,1,0,1,0,1,99}; int iExample3[] = {21,66,66,66}; printf("Output: %d\n", singleNumber(iExample1, sizeof(iExample1)/sizeof(int))); printf("Output: %d\n", singleNumber(iExample2, sizeof(iExample2)/sizeof(int))); printf("Output: %d\n", singleNumber(iExample3, sizeof(iExample3)/sizeof(int))); } ``` * 執行結果如下: ``` 2 higher:0x00000000, lower:0x00000002 2 higher:0x00000002, lower:0x00000000 2 higher:0x00000000, lower:0x00000000 3 higher:0x00000000, lower:0x00000003 Output: 3 0 higher:0x00000000, lower:0x00000000 1 higher:0x00000000, lower:0x00000001 0 higher:0x00000000, lower:0x00000001 1 higher:0x00000001, lower:0x00000000 0 higher:0x00000001, lower:0x00000000 1 higher:0x00000000, lower:0x00000000 99 higher:0x00000000, lower:0x00000063 Output: 99 21 higher:0x00000000, lower:0x00000015 66 higher:0x00000000, lower:0x00000057 66 higher:0x00000042, lower:0x00000015 66 higher:0x00000000, lower:0x00000015 Output: 21 ```