hsieh
    • Create new note
    • Create a note from template
      • Sharing URL Link copied
      • /edit
      • View mode
        • Edit mode
        • View mode
        • Book mode
        • Slide mode
        Edit mode View mode Book mode Slide mode
      • Customize slides
      • Note Permission
      • Read
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Write
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Engagement control Commenting, Suggest edit, Emoji Reply
    • Invite by email
      Invitee

      This note has no invitees

    • Publish Note

      Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

      Your note will be visible on your profile and discoverable by anyone.
      Your note is now live.
      This note is visible on your profile and discoverable online.
      Everyone on the web can find and read all notes of this public team.
      See published notes
      Unpublish note
      Please check the box to agree to the Community Guidelines.
      View profile
    • Commenting
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
      • Everyone
    • Suggest edit
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
    • Emoji Reply
    • Enable
    • Versions and GitHub Sync
    • Note settings
    • Note Insights New
    • Engagement control
    • Transfer ownership
    • Delete this note
    • Save as template
    • Insert from template
    • Import from
      • Dropbox
      • Google Drive
      • Gist
      • Clipboard
    • Export to
      • Dropbox
      • Google Drive
      • Gist
    • Download
      • Markdown
      • HTML
      • Raw HTML
Menu Note settings Note Insights Versions and GitHub Sync Sharing URL Create Help
Create Create new note Create a note from template
Menu
Options
Engagement control Transfer ownership Delete this note
Import from
Dropbox Google Drive Gist Clipboard
Export to
Dropbox Google Drive Gist
Download
Markdown HTML Raw HTML
Back
Sharing URL Link copied
/edit
View mode
  • Edit mode
  • View mode
  • Book mode
  • Slide mode
Edit mode View mode Book mode Slide mode
Customize slides
Note Permission
Read
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Write
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Engagement control Commenting, Suggest edit, Emoji Reply
  • Invite by email
    Invitee

    This note has no invitees

  • Publish Note

    Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

    Your note will be visible on your profile and discoverable by anyone.
    Your note is now live.
    This note is visible on your profile and discoverable online.
    Everyone on the web can find and read all notes of this public team.
    See published notes
    Unpublish note
    Please check the box to agree to the Community Guidelines.
    View profile
    Engagement control
    Commenting
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    • Everyone
    Suggest edit
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    Emoji Reply
    Enable
    Import from Dropbox Google Drive Gist Clipboard
       Owned this note    Owned this note      
    Published Linked with GitHub
    • Any changes
      Be notified of any changes
    • Mention me
      Be notified of mention me
    • Unsubscribe
    # 2020q3 Homework2 (quiz2) contributed by <`hsiehong`> [github](https://github.com/hsiehong/quiz02) ###### tags: `進階電腦系統理論與實作` > [quiz2](https://hackmd.io/@sysprog/2020-quiz2) > [已提交的作業共筆](https://hackmd.io/@sysprog/2020-homework2) ---- ## 測驗1 #### 判斷指定的記憶體範圍內是否全是有效的 ASCII 字元 ```c= #include <stdbool.h> #include <stddef.h> #include <stdint.h> #include <string.h> bool is_ascii(const char str[], size_t size) { if (size == 0) return false; int i = 0; while ((i + 8) <= size) { uint64_t payload; memcpy(&payload, str + i, 8); if (payload & MMM) return false; i += 8; } while (i < size) { if (str[i] & 0x80) return false; i++; } return true; } ``` `MMM = 0x8080808080808080` #### 題目原理解釋 因為有效的 ASCII 範圍是 0 - 127,二進制表示是 `0x0000 0000` - `0x0111 1111` ,所以可以透過 `alpha & 1000 0000` 來檢測字母的 `MSB` 是否為 1,為 `1` 表示不為有效的 ASCII 字元 (ASCII > 127),在 64 位元觀點此 mask 即為 `0x8080808080808080` #### 為何用到 `memcpy` ? * 參閱 [memcpy - Linux manual page](https://man7.org/linux/man-pages/man3/memcpy.3.html) 可以得知若系統沒有 memory alignment,在 `memcpy` 會做 memory alignment,可以避免 [Unaligned Memory Accesses](https://www.kernel.org/doc/Documentation/unaligned-memory-access.txt),因為並不是所有的硬體都有支援 unaligned memory acces,`memcpy` 就是透過軟體來達到 memory alignment,雖然執行速度會較硬體慢,但不在這邊的討論範圍。 * 其中 `OPSIZ` 會因不同的架構而異,可能是 8 bytes 或 16 bytes,而把多出來不足 8( or 16) bytes 的部分先行複製 ```c=38 len -= (-dstp) % OPSIZ; BYTE_COPY_FWD (dstp, srcp, (-dstp) % OPSIZ); ``` :::info Question 1. why (-dstp) ? ::: * reference: 1. [memcpy](https://blog.xiocs.com/archives/181/) 2. [alignment fault](https://www.itread01.com/content/1544492598.html) #### 將上述技巧應用於「給予一個已知長度的字串,檢測裡頭是否包含有效的英文大小寫字母」 * 判斷是否為字母的部分使用第 5 題的技巧,利用 `A` `Z` 的 MSB 判斷是否是大寫字母,利用 `a` `z` 的 MSB 判斷是否是小寫字母,並利用 `PACK_BYTE` 來增加可讀性 * [detectEffectiveChar.c](-) #### 考慮常見的英文標點符號,判斷輸入字串是否為有效的字元集,避免逐一字元比對。Intel SSE 4.2 加入 STTNI (String and Text New Instructions) 指令,可加速字串比對,閱讀 [Schema Validation with Intel® Streaming SIMD Extensions 4](https://software.intel.com/content/www/us/en/develop/articles/schema-validation-with-intel-streaming-simd-extensions-4-intel-sse4.html) :::success TODO reference: [`eecheng`](https://hackmd.io/@eecheng/HJgpR7RNw) ::: --- ## 測驗2 #### 將十六進位表示的字串 (hexadecimal string) 轉為數值 ```c= uint8_t hexchar2val(uint8_t in) { const uint8_t letter = in & MASK; const uint8_t shift = (letter >> AAA) | (letter >> BBB); return (in + shift) & 0xf; } ``` `MASK = 0x40` `AAA = 3` `BBB = 6` #### 題目原理解釋 觀察一個 byte 的狀況 | 字元 | ASCII(in) | output(dec) | output(bin) | |:----:|:---------:|:-----------:|:-----------:| | '0' | 0011 0000 | 0 | 0000 0000 | | '6' | 0011 0110 | 6 | 0000 0110 | | '9' | 0011 1001 | 9 | 0000 1001 | | 'A' | 0100 0001 | 10 | 0000 1010 | | 'a' | 0110 0001 | 10 | 0000 1010 | | 'F' | 0100 0110 | 15 | 0000 1111 | | 'f' | 0110 0110 | 15 | 0000 1111 | * `MASK` 為 `0100 0000`,推測是判斷字元是**數字**還是**字母**,若是字母則 `letter` 為`0100 0000`,數字的話 `letter` 為`0000 0000` * 從 line5 回傳值 `(in + shift) & 0xf` 推測,`0xf` 二進制為 `0000 1111`,即保留 `(in + shift)`末四位,所以只須探討這末四位是何方神聖 * 若字元是字母,`A` 和 `a` 的末四位都是 `0001`,要變成正確輸出的 `1010` 需要加上 `1001`,可以推斷正確的 `offset` 為 `1001` 。這裡利用 `letter` `0100 0000` 來產生 shift,`letter >> 3 | letter >> 6` 可以得到正確的 offset `0000 1001`。 * 若字元是數字,`letter` 就是 `0`,line 5 就沒有作用,`in` 末四碼就是正確的結果 #### 將 hexchar2val 擴充為允許 "0xFF", "0xCAFEBABE", "0x8BADF00D" 等字串輸入,且輸出對應的十進位數值 * 將參數改成一次處理一個字串,且能處理的範圍不大於 64 bits (即最多只能處理八個字元,不包含 `0x`),並檢查是否為 [Hexspeak](https://en.wikipedia.org/wiki/Hexspeak) * [hexchar2val.c](-) ```c= uint64_t hexchar2val(char *in) { uint64_t result = 0; int len = strlen(in); if (in[0] == '0' && in[1] == 'x') { for (int i = 2; i < len; i++) { const uint8_t letter = *(in + i) & 0x40; uint8_t shift = (letter >> 3) | (letter >> 6); result = (result << 4) | (in[i] + shift) & 0xf; } } return result; } ``` * 測試結果 ```c= int main(void) { printf("%llu\n", hexchar2val("0xFF")); printf("%llu\n", hexchar2val("0xCAFEBABE")); printf("%llu\n", hexchar2val("0x8BADF00D")); return 0; } ``` ``` 255 3405691582 2343432205 ``` --- ## 測驗3 #### 將除法改為乘法運算 ```c= const uint32_t D = 3; #define M ((uint64_t)(UINT64_C(XXX) / (D) + 1)) /* compute (n mod d) given precomputed M */ uint32_t quickmod(uint32_t n) { uint64_t quotient = ((__uint128_t) M * n) >> 64; return n - quotient * D; } ``` `XXX = 0xFFFFFFFFFFFFFFFF` #### 題目原理解釋 根據數學式 $$ \dfrac{n}{d} = n \times (\dfrac{\dfrac{2^N}{d}}{2^N}) = (n \times \dfrac{2^N}{D}) \times \dfrac{1}{2^N} $$ * line 6 可以推得是除以$2^{64}$,所以 N 應為64,`((__uint128_t) M * n) >> 64` 即為 $n/d$ 的整數部份( quotient ) * 精確值 `XXX` 應為 $2^{64}$,但 `XXX` 最大只能儲存 $2^{64}-1$,就是 `0xFFFFFFFFFFFFFFFF`,最後 `+1` 代表無條件進位,確保 $\dfrac{2^{64}}{D}$ 與 $\dfrac{2^{64}-1}{D}+1$ 的整數部份是會相同的 * 從上述可以得知, `M` 是對應到 $\lceil{\dfrac {2^{32}}{d}}\rceil$ * line7 算出商之後,`n - q * D` 即可得到餘數 [測試程式](https://github.com/hsiehong/quiz02/blob/master/quickmod/measure.c) #### fastmod 實驗 * 對 `fast_mod` 與 `normal_mod` 分別跑 10 萬次 * [繪圖 script](https://github.com/hsiehong/quiz02/blob/master/quickmod/perf.gp) * 比較不做最佳化與做 O1, O2, O3 最佳化的結果 * 可以看出有做最佳化的話其實結果不會差太多,但 normal 版本還是會稍微較好 * O0 ![](https://i.imgur.com/XKPzbKH.png) * O1 ![](https://i.imgur.com/kQpITf7.png) * O2 ![](https://i.imgur.com/FbENpg9.png) * O3 ![](https://i.imgur.com/3spZyrO.png) :::info Question 1. `fast_mod` 在全部的情況下表現都沒有比較優異? 2. 如何驗證實驗結果是正確的? ::: #### 關於 [jemalloc 快速除法](https://github.com/jemalloc/jemalloc) * 由 Facebook 公司所維護的 [jemalloc](https://github.com/jemalloc/jemalloc) 是個高效能的記憶體配置器 (memory allocator,即 malloc 和 free 函式對應的實作),特別在多核多執行緒的環境有優異表現,在其原始程式碼 [include/jemalloc/internal/div.h](https://github.com/jemalloc/jemalloc/blob/dev/include/jemalloc/internal/div.h) 和 [src/div.c](https://github.com/jemalloc/jemalloc/blob/dev/src/div.c) 也運用類似的技巧,請閱讀原始程式碼,並抽離快速除法實作為獨立的程式碼,說明運作原理,也該撰寫測試程式,比較透過處理器的整數除法指令及本技巧的效能差距 * 假設 $n = q\times d$,$n, q, d$ 都是整數,已知 $n, d$ 求 $q$,$q = \dfrac{n}{d}$,可以轉換成下面的形式 $q =$$\left \lfloor \lceil \dfrac{2^k}{d} \rceil \times \dfrac{n}{2^k} \right \rfloor = \left \lfloor \dfrac{2^k+r}{d} \times \dfrac{n}{2^k} \right \rfloor$,其中 $r$ = $(-2)^k$ mod $d$ $=\left \lfloor \dfrac{2^k}{d} \times \dfrac{n}{2^k} + \dfrac{r}{d} \times \dfrac{n}{2^k} \right \rfloor = \left \lfloor \dfrac{n}{d} + \dfrac{r}{d} \times \dfrac{n}{2^k} \right \rfloor$,因為 $n = q\times d$,所以$\dfrac{n}{d}$ 為整數 $=\dfrac{n}{d} + \left \lfloor \dfrac{r}{d} \times \dfrac{n}{2^k} \right \rfloor$,推導到這邊,只要能確保$\left \lfloor \dfrac{r}{d} \times \dfrac{n}{2^k} \right \rfloor < 1$ ,就代表 $\dfrac{n}{d}$ 可以被改寫成$\left \lfloor \lceil \dfrac{2^k}{d} \rceil \times \dfrac{n}{2^k} \right \rfloor$ * 其中 $r$ = $(-2)^k$ mod $d$,可以確定 $r < d$,而 $\dfrac{n}{2^k}$ 可以令 $k = 32$ 來確保其 $< 1$ * 推導完再來看快速除法的資料結構 [div.h](https://github.com/jemalloc/jemalloc/blob/dev/include/jemalloc/internal/div.h) ```c= #ifndef JEMALLOC_INTERNAL_DIV_H #define JEMALLOC_INTERNAL_DIV_H #include "jemalloc/internal/assert.h" /* * This module does the division that computes the index of a region in a slab, * given its offset relative to the base. * That is, given a divisor d, an n = i * d (all integers), we'll return i. * We do some pre-computation to do this more quickly than a CPU division * instruction. * We bound n < 2^32, and don't support dividing by one. */ typedef struct div_info_s div_info_t; struct div_info_s { uint32_t magic; #ifdef JEMALLOC_DEBUG size_t d; #endif }; void div_init(div_info_t *div_info, size_t divisor); static inline size_t div_compute(div_info_t *div_info, size_t n) { assert(n <= (uint32_t)-1); /* * This generates, e.g. mov; imul; shr on x86-64. On a 32-bit machine, * the compilers I tried were all smart enough to turn this into the * appropriate "get the high 32 bits of the result of a multiply" (e.g. * mul; mov edx eax; on x86, umull on arm, etc.). */ size_t i = ((uint64_t)n * (uint64_t)div_info->magic) >> 32; #ifdef JEMALLOC_DEBUG assert(i * div_info->d == n); #endif return i; } #endif /* JEMALLOC_INTERNAL_DIV_H */ ``` * 在 structure `div_info_s` 中有一個 `uint32_t magic`,`magic` 就是上述推導中的 $\left \lceil \dfrac{2^k}{d} \right \rceil$,在 [div.c](https://github.com/jemalloc/jemalloc/blob/dev/src/div.c) 中實做 * `div_compute` 回傳的是 $\left \lfloor \lceil \dfrac{2^k}{d} \rceil \times \dfrac{n}{2^k} \right \rfloor$ * 再來實做部份 [div.c](https://github.com/jemalloc/jemalloc/blob/dev/src/div.c) (省略推導過程) ```c= #include "jemalloc/internal/jemalloc_preamble.h" #include "jemalloc/internal/div.h" #include "jemalloc/internal/assert.h" void div_init(div_info_t *div_info, size_t d) { /* Nonsensical. */ assert(d != 0); /* * This would make the value of magic too high to fit into a uint32_t * (we would want magic = 2^32 exactly). This would mess with code gen * on 32-bit machines. */ assert(d != 1); uint64_t two_to_k = ((uint64_t)1 << 32); uint32_t magic = (uint32_t)(two_to_k / d); /* * We want magic = ceil(2^k / d), but C gives us floor. We have to * increment it unless the result was exact (i.e. unless d is a power of * two). */ if (two_to_k % d != 0) { magic++; } div_info->magic = magic; #ifdef JEMALLOC_DEBUG div_info->d = d; #endif } ``` * 這邊有避免除數為 0 的狀況,在 [ISO/IEC 9899](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf) 中 > 6.5.5-5 The result of the / operator is the quotient from the division of the first operand by the second; the result of the % operator is the remainder. In both operations, if the value of the second operand is zero, the behavior is undefined. * `two_to_k` 即 $2^{32}$ * `magic` 為 $\left \lceil \dfrac{2^k}{d} \right \rceil$,但 `/` 預設是取 floor,所以在 line 24 中還要做取 ceiling 的動作 :::info Questions 1. 在這裡 `.h` file 與 `.c` file 的區分 ? 2. 在 `div.c` 中使用到 `%`,影響? ::: #### jemalloc 實驗 測試程式 ```c= int main(void) { div_info_t DIV; size_t temp; struct timespec start, end; srand(time(NULL)); printf("divisor fast_time norm_time\n"); for (size_t i = 2; i <= 10000; i++) { div_init(&DIV, i); uint64_t dividend = rand() % UINT_MAX; clock_gettime(CLOCK_MONOTONIC, &start); temp = div_compute(&DIV, dividend); clock_gettime(CLOCK_MONOTONIC, &end); long long fast_cost = (long long)(end.tv_sec * 1e9 + end.tv_nsec) - \ (long long)(start.tv_sec * 1e9 + start.tv_nsec); clock_gettime(CLOCK_MONOTONIC, &start); temp = dividend / i; clock_gettime(CLOCK_MONOTONIC, &end); long long norm_cost = (long long)(end.tv_sec * 1e9 + end.tv_nsec) - \ (long long)(start.tv_sec * 1e9 + start.tv_nsec); printf("%-7ld %-7lld %-7lld\n", i, fast_cost, norm_cost); } return 0; } ``` 下面分別比較編譯時分別使用不同最佳化的結果 * O0 ![](https://i.imgur.com/wGWcBuv.png) * O1 ![](https://i.imgur.com/vQfjt3i.png) * O2 ![](https://i.imgur.com/HbOda7S.png) * O3 ![](https://i.imgur.com/ElrQ03E.png) :::info Question 1. 與同學的實驗結果有落差 >參考同學的實驗: >[WeiCheng14159](https://hackmd.io/@WeiCheng14159/HkesfMvBP#%E5%BB%B6%E4%BC%B8%E5%95%8F%E9%A1%8C2) >[RinHizakura](https://hackmd.io/i22yIu2aTveDbrregY3Hug?view#%E6%B8%AC%E9%A9%97-3) 2. 實驗結果的震盪探討 ::: --- ## 測驗4 ```c= bool divisible(uint32_t n) { return n * M <= YYY; } ``` ```c const uint32_t D = 3; #define M ((uint64_t)(UINT64_C(XXX) / (D) + 1)) ``` `YYY = M-1` #### 題目原理解釋 * 已知 `0xFFFFFFFFFFFFFFFF / 3` = `0x5555555555555556` = `M`,又 `d` = 3,`n` 只可能有 3 種情況: `3k + 1`, `3k + 2`, `3k`,k 為正整數,分別討論 `n` 可能的 3 種狀況 * `n` = `3k`: $n * M = 2k$ * `n` = `3k + 1`: $n * M = (3k + 1) * M = 3KM + M = 2k + M$ * `n` = `3k + 2`: $n * M = (3k + 2) * M = 3KM + 2M = 2k + 2M$ * 可以得知只有 `n = 3k` 時 `n` 才能被整除,所以 `n * M = 2k` 才有可能,也可以表示成 `n * M <= M - 1` 或 `n * M < M` (`k` 有可能為 0) #### 閱讀 [Faster remainders when the divisor is a constant: beating compilers and libdivide](https://lemire.me/blog/2019/02/08/faster-remainders-when-the-divisor-is-a-constant-beating-compilers-and-libdivide/) * [note](https://hackmd.io/@hsieh22/r1XfXlP0w) --- ## 測驗5 #### 將指定的字串 (或部分字串) 的英文字母向量化地全改為小寫(一次改變8個 byte) ,使用 in-place 作法 ```c= #include <ctype.h> #include <stddef.h> #include <stdint.h> #define PACKED_BYTE(b) (((uint64_t)(b) & (0xff)) * 0x0101010101010101u) /* vectorized implementation for in-place strlower */ void strlower_vector(char *s, size_t n) { size_t k = n / 8; for (size_t i = 0; i < k; i++, s += 8) { uint64_t *chunk = (uint64_t *) s; if ((*chunk & PACKED_BYTE(VV1)) == 0) { /* is ASCII? */ uint64_t A = *chunk + PACKED_BYTE(128 - 'A' + VV2); uint64_t Z = *chunk + PACKED_BYTE(128 - 'Z' + VV3); uint64_t mask = ((A ^ Z) & PACKED_BYTE(VV4)) >> 2; *chunk ^= mask; } else strlower(s, 8); } k = n % 8; if (k) strlower(s, k); } ``` `VV1 = 0x80` `VV2 = 0` `VV3 = -1` `VV4 = 0x80` #### 題目原理解釋 * `define PACKED_BYTE(b)` `PACKED_BYTE(b)` 的作用是取出 b 的最後 8 個位元並複製 8 次形成 8 個 byte,比如說 `b` 的後 8 bits 是 `0xab`,則 `PACKED_BYTE(b)` = `0xabababababababab` * line13: 要確認是否為有效的 ASCII 字元,使用技巧同第一題,又這裡要一次處理 8 個 char (8 bytes),故將 `0x80` -> `0x8080808080808080` * line17,推測 `mask` 的作用,若大寫字母才需要做轉換,小寫字母不須轉換,字母為大寫的話 `mask` 是 `0b0010 0000` (原本是`0x80`,在 line16 做了 `>> 2` 的動作),反之是小寫字母的話 `mask` 為 `0b0000 0000` * 我們要判斷一個字母 `c` 是否在大寫字母範圍簡單的方法如下,有兩個部分, `c` > 'A' **且** `c` < 'Z' ```c c > 'A' && c < 'Z' ``` * 128 二進制表示是 `1000 0000`,這裡設計兩個變數 `A` `Z` 檢查字母是否在 `A`-`Z` 之間,變數 `A` 判斷 c 是否大於等於 'A',若 c 大於 'A' 的話 `128 - 'A' + VV2` 的 MSB 會是 1 (>127),故 `vv2 = 0` 。變數 `Z` 同理,紀錄字母 c 是否大於 'Z',但因為 'Z' 是允許的(計算`127 - 'Z'`的距離就好,故 `vv3 = -1`),大於 'Z' 的話變數 `Z` 的 MSB 才要為 1,故要再減 1 。 * 這裡只需要 `A^Z` MSB 的資訊來判斷是否在大寫字母的範圍,所以此處 mask 取 `0x1000 0000` ,取出 MSB 的資訊,來判斷要做大寫還是小寫的動作。 #### 延伸問題 ```c= #include <stdio.h> #include <string.h> int main() { /* quote from https://www.gutenberg.org/files/74/74-h/74-h.htm */ char str[] = "This eBook is for the use of anyone anywhere at no cost and with \ almost no restrictions whatsoever. You may copy it, give it away or \ re-use it under the terms of the Project Gutenberg License included \ with this eBook or online at www.gutenberg.net"; int n = strlen(str); strlower_vector(str, n); puts(str); } ``` * 若將 `char str[] ...` 改成 `char *str ...` 執行時會產生錯誤 參閱 [ISO/IEC 9899](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf) (即 C99 規格) * **6.4.5 String literals** * 第 2 點 > 6.4.5 > A character string literal is a sequence of zero or more multibyte characters enclosed in double-quotes, as in "xyz". > 5.1.1.2-6 > Adjacent string literal tokens are concatenated. * 第 5 點 > In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to **initialize an array of static storage** duration and **length just sufficient to contain the sequence**. For character string literals, the `array elements` have type `char`, and are initialized with the individual bytes of the multibyte character sequence; for wide string literals, the array elements have type `wchar_t`, and are initialized with the sequence of wide characters corresponding to the multibyte character sequence, as defined by the `mbstowcs` function with an implementation-defined current locale. The value of a string literal containing a multibyte character or escape sequence not represented in the execution character set is implementation-defined. * 第 6 點 > It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined. * string literal 的型態是 `array of char` * char array 會在 static storage 做初始化的動作,且被分配到的空間大小是剛好的 (大小是有加入 `null byte` 的) * 試圖修改其內容是 **undefine behavior** #### supplement :::spoiler Description of `mbstowcs` function in **7.20.8.1** > The mbstowcs function converts a sequence of multibyte characters that begins in the initial shift state from the array pointed to by s into a sequence of corresponding wide characters and stores not more than n wide characters into the array pointed to by pwcs. No multibyte characters that followanull character (which is converted into a null wide character) will be examined or converted. Each multibyte character is converted as if by a call to the mbtowc function, except that the conversion state of the mbtowc function is not affected. ::: :::spoiler Description of `wchar_t` in **7.17** > which is an integer type whose range of values can represent distinct codes for all members of the largest extended character set specified among the supported locales; the null character shall have the code value zero. Each member of the basic character set shall have a code value equal to its value when used as the lone character in an integer character constant if an implementation does not define > ::: #### `char str[]` 與 `char *str` 的差別參考: * **6.7.8 Initialization** * 第 32 點 ``` char s[] = "abc" ``` > defines **plain** char array objects `s` whose elements are initialized with character string literals. > The contents of the arrays are modifiable. 定義一個字元陣列,其內容是 string literal 所初始的,且陣列的內容是可以更改的 ``` char *p = "abc"; ``` > defines `p` with type **pointer to char** and initializes it to point to an object with type **array of char** with length 4 whose elements are initialized with a character string literal. If an attempt is made to use `p` to modify the contents of the array, the behavior is undefined. 首先會宣告一個指向 `char` 的 pointer,並將其初始化,初始化會指向一個 **字元陣列**,這個陣列是 string literal 的,如上所述,嘗試對其內容修改是 undefined behavior 的 * some other introduction of [string literal](https://www.cs.uic.edu/~jbell/CourseNotes/C_Programming/CharacterStrings.html) --- ## 測驗6 LeetCode [題目敘述](https://leetcode.com/problems/single-number-ii/),完成以下程式碼 ```c= int singleNumber(int *nums, int numsSize) { int lower = 0, higher = 0; for (int i = 0; i < numsSize; i++) { lower ^= nums[i]; lower &= KKK; higher ^= nums[i]; higher &= JJJ; } return lower; } ``` `KKK = ~higher` `JJJ = ~lower` #### 題目原理解釋 * 這題是運用 bitwise 操作中,**每個 bits 都是獨立的個體**的特性,透過 {higher,lower} 紀錄每個 bit 出現的次數,因為每個數只會出現3次或1次,所以有三種狀態要記錄,分別是出現1次,出現2次和出現3次的,{higher,lower} 有 2 bits,其實最多可以記錄4種狀態。 * 可以利用 01 表示該 bit 出現了1次,10 表示該 bit 出現2次,00 表示出現3次,其實就是一個 Mealy 有現狀態機,當前的狀態會與輸入還有目前的狀態有關,狀態轉換如下 * current state 和 after state 分別是紀錄每個位元在該 iteration 的狀態 {`higher`, `lower`} | current state | input | after state | |:-------------:|:-----:|:-----------:| | 00 | 0 | 00 | | 00 | 1 | 01 | | 01 | 0 | 01 | | 01 | 1 | 10 | | 10 | 0 | 10 | | 10 | 1 | 00 | ```graphviz digraph{ rankdir=LR 00->01 [label=1] 01->10 [label=1] 10->00 [label=1] 00->00 [label=0] 01->01 [label=0] 10->10 [label=0] } ``` ```c lower ^= nums[i]; ``` 將 `nums[i]` 加入 `lower` (LSB),用 `xor` operation。若此 bit 是 `1` (代表前面已經出現過1次了),那會被消成 `0`,代表這個位置出現兩次了,後面要再加入 `higher` ```c lower &= ~higher; ``` 檢查 `higher` 的狀態,若 `higher` 是 `1` (表示前面已經出現過 2 次了),那最終的狀態應該要是 `0` `0`。若 `higher` 是 0 ,表示還沒出現過,所以做 `~higher` 運算可以達到這個效果 ```c higher ^= nums[i]; ``` 將 `nums[i]` 加入 `higher` (MSB),用 `xor` operation,若此 bit 是 `1` (代表前面已經出現過 2 次了,現在是第 3 次出現,這樣狀態應該要是 `0`),那會被消成 `0` ```c higher &= ~lower; ``` 檢查 `lower` 的狀態,若 `lower` 是 1 (表示這個 bit 是第 1 次出現,所以並不用加到 `higher`),那狀態應該要是 `0` `1`。若 `lower` 是 0 ,表示不是第 1 次出現,可能是第 2 或 3 次,此時上一個動作就已經幫我們做好處理了,所以做 `~higher` 運算可以達到這個效果 #### 另解 * 思路:利用一個陣列 `bitCnt[32]`,記錄每個位置的 bit 出現的次數,利用本題只會有出現3次或1次的狀況,所以可以用 `bitCnt[i] % 3` 來判斷該位置的 bit 是否是只出現一次的數字 * 缺點:另外宣告了一個 int 陣列,問題是這個陣列並不會被有效的使用,而且下面的迴圈內用到了 mod 運算,所以還有蠻多地方可以加強的 ```c= int singleNumber(vector<int>& nums) { int numSize = nums.size(); int bitCnt[32] = {0}; for (int i = 0; i < numSize; ++i) { int num = nums[i]; for (int j = 0; j < 32; ++j) { if (num & (1 << j)) bitCnt[j]++; } } int res = 0; for (int i = 0; i < 32; ++i) { if (bitCnt[i] % 3) res |= (1 << i); } return res; } ``` #### 執行結果 ![](https://i.imgur.com/qXxvory.png) #### 推廣 * 推廣到出現 n 次,可以先分成兩種狀況,出現偶數次跟奇數次 * 偶數次比較簡單,根據 `xor` operation 的特性: `A ^ A = 0`,出現偶數次的 bit 在運算時可以直接被消去,所以只需記錄兩種狀態,出現 1 次或出現偶數次,這樣一個 bit 就可以代表了,實作方式跟 [Single Nnumber](https://leetcode.com/problems/single-number/) 一樣 ```c= int evenNumber(int *nums, int numsSize) { int one = 0; for (int i = 0; i < numsSize; i++) one ^= nums[i]; return one; } ``` * 討論奇數次的狀況,若出現 n 次,理論上需要 $\lceil$ log~2~n $\rceil$ 個變數來記錄狀態 * 比較直覺的方法同上面,利用 1 個大小為 32 個 int 的 `int` 陣列 `bitCnt` 記錄每個 bit 出現的次數,最後改成 `bitCnt[i] % n` 即可,但問題也相同,可能大多數並不需要用到 32 個 int,只需要$\lceil$ log~2~n $\rceil$ 個 int 即可,在 n < 2^31^ 的情況下都能較有效的利用到 memory,同時也盡量避免 module operation * 參考 [這篇](https://leetcode.com/problems/single-number-ii/discuss/43295/Detailed-explanation-and-generalization-of-the-bitwise-operation-method-for-single-numbers) 的解說我覺得講的非常仔細,圖示也把核心概念解釋得很清楚如下: ![](https://i.imgur.com/HskKEtE.png) 其中的 `m` 就是 $\lceil$ log~2~n $\rceil$,也是所需 counter 的數量 * 這裡考慮出現 5 次的要求,因為最多會出現 5 次,所以需要 $\lceil$ log~2~5 $\rceil$ = 3 個 counter `x1`, `x2`, `x3`, bit 的轉換狀態如下: ``` 000 -> 001 -> 010 -> 011 -> 100 -> 101 -> 000 一次 兩次 三次 四次 五次 ``` 注意第 X~i~ 個 bit 從 `0 -> 1` 的過程中,X~i-1~, X~i-2~, ... , X~1~ 這些 bits 必須全為 1 才能成立,要注意實做時要 X~m~ 計算到 X~1~ ,不然會 data loss 導致進位錯誤,到這邊我們已經可以正常統計每個位置的 bit 出現的次數,但接下來會有一個問題,m bits counter 理論上會統計到 2^m^ - 1 而不是我們要的 n,根據上面的範例解釋就是該如何讓 `100 -> 000` 而不是 `100 -> 101`,所以還需要設立一個機制能統計到 m 時歸零而不是繼續累計。 這裡很巧妙的利用到了 `cutting` 的機制,當 counter 累積到 `n` 的時時候會歸零,可以透過使用一個 `mask` 對所有的 counter 做檢查,mask 是個人覺得這題最 tricky 的地方,mask 的建制如下: mask = ~ (y~1~ & y~2~ & ... & y~m~) y~i~ = x~i~ if k~i~ is 1 y~i~ = ~ x~i~ if k~i~ is 0 (k~i~ 是數字出現次數二進制表示的第 i 個 bit) 說明:假設各 counter 的狀態如下 MSB 是 `101`,即將要備歸零,而 LSB 的狀態是 `110`,各 `counter` 表示如下: ``` x1: 1xx...x1 x2: 0xx...x1 x3: 1xx...x0 ``` 做取 mask 的動作後 `MSB` 會得到 `1` (需要歸零的 signal) (1 & ~0 & 1),`LSB` 會得到 `0` (不需要歸零,所以並沒有效果)(1 & ~1 & 0),mask 再取 `~` 即可達到將 `counter` 歸零的效果 `mask` 建立完成之後再依序對每個 counter 做檢查即可 #### 程式碼 ```c= #include <stdio.h> int main(void) { int arr[] = {2,5,9,7,7,6,9,2,5,9,5,2,7,2,9,5,5,7,7,9,2}; int x1, x2, x3; x1 = x2 = x3 = 0; int size = sizeof(arr) / sizeof(int); for (int i = 0; i < size; i++) { x3 ^= (x2 & x1 & arr[i]); x2 ^= (x1 & arr[i]); x1 ^= arr[i]; /* in this situation k = 5, which is 101 */ int mask = ~(x1 & ~x2 & x3); x3 &= mask; x2 &= mask; x1 &= mask; } printf("The single number is %d\n", x1); return 0; } ``` 執行結果 ``` The single number is 6 ``` * 推廣到 r 次也是以此類推,並利用`陣列`的形式紀錄 `counter` 就可以了 ---

    Import from clipboard

    Paste your markdown or webpage here...

    Advanced permission required

    Your current role can only read. Ask the system administrator to acquire write and comment permission.

    This team is disabled

    Sorry, this team is disabled. You can't edit this note.

    This note is locked

    Sorry, only owner can edit this note.

    Reach the limit

    Sorry, you've reached the max length this note can be.
    Please reduce the content or divide it to more notes, thank you!

    Import from Gist

    Import from Snippet

    or

    Export to Snippet

    Are you sure?

    Do you really want to delete this note?
    All users will lose their connection.

    Create a note from template

    Create a note from template

    Oops...
    This template has been removed or transferred.
    Upgrade
    All
    • All
    • Team
    No template.

    Create a template

    Upgrade

    Delete template

    Do you really want to delete this template?
    Turn this template into a regular note and keep its content, versions, and comments.

    This page need refresh

    You have an incompatible client version.
    Refresh to update.
    New version available!
    See releases notes here
    Refresh to enjoy new features.
    Your user state has changed.
    Refresh to load new user state.

    Sign in

    Forgot password

    or

    By clicking below, you agree to our terms of service.

    Sign in via Facebook Sign in via Twitter Sign in via GitHub Sign in via Dropbox Sign in with Wallet
    Wallet ( )
    Connect another wallet

    New to HackMD? Sign up

    Help

    • English
    • 中文
    • Français
    • Deutsch
    • 日本語
    • Español
    • Català
    • Ελληνικά
    • Português
    • italiano
    • Türkçe
    • Русский
    • Nederlands
    • hrvatski jezik
    • język polski
    • Українська
    • हिन्दी
    • svenska
    • Esperanto
    • dansk

    Documents

    Help & Tutorial

    How to use Book mode

    Slide Example

    API Docs

    Edit in VSCode

    Install browser extension

    Contacts

    Feedback

    Discord

    Send us email

    Resources

    Releases

    Pricing

    Blog

    Policy

    Terms

    Privacy

    Cheatsheet

    Syntax Example Reference
    # Header Header 基本排版
    - Unordered List
    • Unordered List
    1. Ordered List
    1. Ordered List
    - [ ] Todo List
    • Todo List
    > Blockquote
    Blockquote
    **Bold font** Bold font
    *Italics font* Italics font
    ~~Strikethrough~~ Strikethrough
    19^th^ 19th
    H~2~O H2O
    ++Inserted text++ Inserted text
    ==Marked text== Marked text
    [link text](https:// "title") Link
    ![image alt](https:// "title") Image
    `Code` Code 在筆記中貼入程式碼
    ```javascript
    var i = 0;
    ```
    var i = 0;
    :smile: :smile: Emoji list
    {%youtube youtube_id %} Externals
    $L^aT_eX$ LaTeX
    :::info
    This is a alert area.
    :::

    This is a alert area.

    Versions and GitHub Sync
    Get Full History Access

    • Edit version name
    • Delete

    revision author avatar     named on  

    More Less

    Note content is identical to the latest version.
    Compare
      Choose a version
      No search result
      Version not found
    Sign in to link this note to GitHub
    Learn more
    This note is not linked with GitHub
     

    Feedback

    Submission failed, please try again

    Thanks for your support.

    On a scale of 0-10, how likely is it that you would recommend HackMD to your friends, family or business associates?

    Please give us some advice and help us improve HackMD.

     

    Thanks for your feedback

    Remove version name

    Do you want to remove this version name and description?

    Transfer ownership

    Transfer to
      Warning: is a public team. If you transfer note to this team, everyone on the web can find and read this note.

        Link with GitHub

        Please authorize HackMD on GitHub
        • Please sign in to GitHub and install the HackMD app on your GitHub repo.
        • HackMD links with GitHub through a GitHub App. You can choose which repo to install our App.
        Learn more  Sign in to GitHub

        Push the note to GitHub Push to GitHub Pull a file from GitHub

          Authorize again
         

        Choose which file to push to

        Select repo
        Refresh Authorize more repos
        Select branch
        Select file
        Select branch
        Choose version(s) to push
        • Save a new version and push
        • Choose from existing versions
        Include title and tags
        Available push count

        Pull from GitHub

         
        File from GitHub
        File from HackMD

        GitHub Link Settings

        File linked

        Linked by
        File path
        Last synced branch
        Available push count

        Danger Zone

        Unlink
        You will no longer receive notification when GitHub file changes after unlink.

        Syncing

        Push failed

        Push successfully