Huang Adrian
    • Create new note
    • Create a note from template
      • Sharing URL Link copied
      • /edit
      • View mode
        • Edit mode
        • View mode
        • Book mode
        • Slide mode
        Edit mode View mode Book mode Slide mode
      • Customize slides
      • Note Permission
      • Read
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Write
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Engagement control Commenting, Suggest edit, Emoji Reply
    • Invite by email
      Invitee

      This note has no invitees

    • Publish Note

      Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

      Your note will be visible on your profile and discoverable by anyone.
      Your note is now live.
      This note is visible on your profile and discoverable online.
      Everyone on the web can find and read all notes of this public team.
      See published notes
      Unpublish note
      Please check the box to agree to the Community Guidelines.
      View profile
    • Commenting
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
      • Everyone
    • Suggest edit
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
    • Emoji Reply
    • Enable
    • Versions and GitHub Sync
    • Note settings
    • Note Insights New
    • Engagement control
    • Transfer ownership
    • Delete this note
    • Save as template
    • Insert from template
    • Import from
      • Dropbox
      • Google Drive
      • Gist
      • Clipboard
    • Export to
      • Dropbox
      • Google Drive
      • Gist
    • Download
      • Markdown
      • HTML
      • Raw HTML
Menu Note settings Note Insights Versions and GitHub Sync Sharing URL Create Help
Create Create new note Create a note from template
Menu
Options
Engagement control Transfer ownership Delete this note
Import from
Dropbox Google Drive Gist Clipboard
Export to
Dropbox Google Drive Gist
Download
Markdown HTML Raw HTML
Back
Sharing URL Link copied
/edit
View mode
  • Edit mode
  • View mode
  • Book mode
  • Slide mode
Edit mode View mode Book mode Slide mode
Customize slides
Note Permission
Read
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Write
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Engagement control Commenting, Suggest edit, Emoji Reply
  • Invite by email
    Invitee

    This note has no invitees

  • Publish Note

    Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

    Your note will be visible on your profile and discoverable by anyone.
    Your note is now live.
    This note is visible on your profile and discoverable online.
    Everyone on the web can find and read all notes of this public team.
    See published notes
    Unpublish note
    Please check the box to agree to the Community Guidelines.
    View profile
    Engagement control
    Commenting
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    • Everyone
    Suggest edit
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    Emoji Reply
    Enable
    Import from Dropbox Google Drive Gist Clipboard
       Owned this note    Owned this note      
    Published Linked with GitHub
    • Any changes
      Be notified of any changes
    • Mention me
      Be notified of mention me
    • Unsubscribe
    # 2020q1 Homework3 (quiz4) Contributed by < [AdrianHuang](https://github.com/AdrianHuang/bitcpy) > ###### tags: `linux2020` ## 測驗 `1` - bitcpy ([Github](https://github.com/AdrianHuang/bitcpy)) ### 程式運作的原理 bitcpy 程式將指定的位元偏移量及位元數複製到目標位址,觀察 bitcpy 函數原型,其中包含: ```cpp= void bitcpy(void *_dest, /* Address of the buffer to write to */ size_t _write, /* Bit offset to start writing to */ const void *_src, /* Address of the buffer to read from */ size_t _read, /* Bit offset to start reading from */ size_t _count) ``` * input/_src: 長度為 8 個 uint8 陣列 (總共 64 位元)。注意: 其位元順序布局由每個位元組的 MSB (Most Significant Bit) 往上增加,如下圖一所示。 * output/_dest: 長度為 8 個 uint8 陣列 (總共 64 位元),其位元順序布局如 `input/_dest` 所述。 * _write: 從 _dest 的第 `_write` 個位元開始寫入 `_count` 位元數。 * _read: 從 _src 的第 `_read` 個位元開始讀取 `_count` 位元數。 * _count: 讀取/寫入的位元數。 ![](https://i.imgur.com/WpjBDuk.jpg) > $\Uparrow$ 圖一、input/output 變數位元順序示意圖 > ### 讀取位元 讀取位元可區分兩種不同情況: 1. `read_lhs = 0`: 起始位元對齊位元組的MSB。因此,只要將 input/_src 資料與 `read_mask` 做遮罩即可,即底下程式碼。 ```cpp void bitcpy(void *_dest, size_t _write, const void *_src, size_t _read, size_t _count) { .... while (_count > 0) { data = *source++; bitsize = (_count > 8) ? 8 : _count; ... if (bitsize < 8) data &= read_mask[bitsize]; original = *dest; ... } } ``` 2. `read_lhs > 0`: 起始位元沒有對齊位元組的MSB,又可分為兩種情況: 2.1 `bitsize <= read_rhs`: 欲讀取位元數,不需跨兩個位元組 2.2 `bitsize > read_rhs`:欲讀取位元數,需跨兩個位元組 ```cpp void bitcpy(void *_dest, size_t _write, const void *_src, size_t _read, size_t _count) { .... while (_count > 0) { data = *source++; bitsize = (_count > 8) ? 8 : _count; if (read_lhs > 0) { data <<= read_lhs; if (bitsize > read_rhs) data |= (*source >> read_rhs); } if (bitsize < 8) data &= read_mask[bitsize]; ... } } ``` 上述兩段程式碼將欲讀取 _src 資料先放在 `data` 變數,一次最多只能讀取 8 個位元 (這就是為何 `read_rhs = 8 - read_lhs`)。其中,如需跨兩個位元組讀取,則需要讀取兩次,並將這兩次讀取資料放在 `data`變數,最後依據讀取位元數大小 `bitsize` 找出對應位元遮罩,其細節/範例如圖二所示。隨後,再將 `data` 資料複製到目的地,重複上述流程直到所有的位元數讀取完畢。 **注意 (參考圖二、三的示意圖)** {read, write}_lhs: 該位元組從最左邊 (MSB) 的第幾個位元開始讀取或寫入 {read, write}_rhs (需跨兩個位元組讀取或寫入,才需要參考此變數): `{read, write}_lhs`的下一個位元組從最右邊 (LSB) 的第幾個位元,代表讀取或寫入的最終位元。 ![](https://i.imgur.com/N6lSLtW.jpg) > $\Uparrow$ 圖二、讀取 `_src` 並將資料放在 `data`變數示意圖/範例 (需跨兩個位元組) > #### `read_mask` 用途 參考圖一位元順序布局圖,其位元順序布局由每個位元組的 MSB (Most Significant Bit) 往上增加。對照 `bitsize`,該變數表示欲讀取的位元數,即表示 `bitsize = 1` 時,其對應的 `read_mask = 0x80`,以此類推。 ### 寫入位元 寫入位元可區分兩種不同情況: 1. `write_lhs = 0`: 起始位元對齊位元組的MSB。因此,只要將 output/_dest 資料與 `write_mask` 做遮罩,此遮罩把欲寫入的位元清為零,其它不寫入的位元則保留,最後再把欲寫入的資料 (`data`) 寫入目的地。 2. `write_lhs > 0`: 起始位元沒有對齊位元組的MSB,又可分為兩種情況: 2.1 `bitsize <= write_rhs`: 欲寫入位元數,不需跨兩個位元組 2.2 `bitsize > write_rhs`:欲寫入位元數,需跨兩個位元組 #### `write_mask` 用途 參考圖一位元順序布局圖,其位元順序布局由每個位元組的 MSB (Most Significant Bit) 往上增加。對照 `bitsize`,該變數表示欲寫入的位元數,即表示 `bitsize = 1` 時,其對應的 `write_mask = 0x7F` (即把欲寫入的位元先清掉,並保留不需寫入的位元),以此類推。 ## Bug 除錯 ### Bug 1 - size_t 型態造成的錯誤 `if` 判斷 bitsize 與 write_lhs 型態都為 `size_t`,如果 `bitsize < write_lhs` 的話,則`if ((bitsize - write_lhs) > 0)` 還是會成立,這不是我們預期的結果。且 `write_mask[8 - (bitsize - write_lhs)]` 的陣列索引有可能會越界。底下 patch 修復此問題。 ```diff= diff --git a/bitcpy.c b/bitcpy.c index 17f80ac..65f9a31 100644 --- a/bitcpy.c +++ b/bitcpy.c @@ -62,7 +62,7 @@ void bitcpy(void *_dest, /* Address of the buffer to write to */ original = *dest & write_mask[bitsize - write_rhs]; *dest = original | (data << write_rhs); } else { - if ((bitsize - write_lhs) > 0) + if (bitsize > write_lhs) mask = mask | write_mask[8 - (bitsize - write_lhs)]; *dest++ = (original & mask) | (data >> write_lhs); } ``` ### Bug 2 - 錯誤的 mask (位元寫入 - 不需跨兩個位元組寫入) 如果把 input 所有位元清為 0,output 所有位元設為 1 ,如底下程式碼所示: ```diff= diff --git a/bitcpy.c b/bitcpy.c index 65f9a31..526f534 100644 --- a/bitcpy.c +++ b/bitcpy.c @@ -92,12 +92,12 @@ static inline void dump_binary(uint8_t *_buffer, size_t _length) int main(int _argc, char **_argv) { - memset(&input[0], 0xFF, sizeof(input)); + memset(&input[0], 0, sizeof(input)); for (int i = 1; i <= 32; ++i) { for (int j = 0; j < 16; ++j) { for (int k = 0; k < 16; ++k) { - memset(&output[0], 0x00, sizeof(output)); + memset(&output[0], 0xFF, sizeof(output)); printf("%2d:%2d:%2d ", i, k, j); bitcpy(&output[0], k, &input[0], j, i); dump_binary(&output[0], 8); ``` 其輸出結果是錯的: ```shell= 1: 0: 0 0111111111111111111111111111111111111111111111111111111111111111 1: 1: 0 1000000011111111111111111111111111111111111111111111111111111111 1: 2: 0 1100000011111111111111111111111111111111111111111111111111111111 1: 3: 0 1110000011111111111111111111111111111111111111111111111111111111 1: 4: 0 1111000011111111111111111111111111111111111111111111111111111111 1: 5: 0 1111100011111111111111111111111111111111111111111111111111111111 1: 6: 0 1111110011111111111111111111111111111111111111111111111111111111 1: 7: 0 1111111011111111111111111111111111111111111111111111111111111111 ``` 直接看修正 patch,細節詳見 [commit message](https://github.com/AdrianHuang/bitcpy/commit/53d62c68d6bc97352af580b71c4883148586233e): ```diff diff --git a/bitcpy.c b/bitcpy.c index 65f9a31..ad26bcd 100644 --- a/bitcpy.c +++ b/bitcpy.c @@ -62,8 +62,11 @@ void bitcpy(void *_dest, /* Address of the buffer to write to */ original = *dest & write_mask[bitsize - write_rhs]; *dest = original | (data << write_rhs); } else { - if (bitsize > write_lhs) - mask = mask | write_mask[8 - (bitsize - write_lhs)]; + /* + * write_lhs + bitsize is never greater than 8, so + * no out-of-bound access is observed. + */ + mask |= write_mask[write_lhs + bitsize]; *dest++ = (original & mask) | (data >> write_lhs); ``` 套用修正 patch 後,其結果正確: ```shell 1: 0: 0 0111111111111111111111111111111111111111111111111111111111111111 1: 1: 0 1011111111111111111111111111111111111111111111111111111111111111 1: 2: 0 1101111111111111111111111111111111111111111111111111111111111111 1: 3: 0 1110111111111111111111111111111111111111111111111111111111111111 1: 4: 0 1111011111111111111111111111111111111111111111111111111111111111 1: 5: 0 1111101111111111111111111111111111111111111111111111111111111111 1: 6: 0 1111110111111111111111111111111111111111111111111111111111111111 1: 7: 0 1111111011111111111111111111111111111111111111111111111111111111 1: 8: 0 1111111101111111111111111111111111111111111111111111111111111111 1: 9: 0 1111111110111111111111111111111111111111111111111111111111111111 1:10: 0 1111111111011111111111111111111111111111111111111111111111111111 1:11: 0 1111111111101111111111111111111111111111111111111111111111111111 1:12: 0 1111111111110111111111111111111111111111111111111111111111111111 1:13: 0 1111111111111011111111111111111111111111111111111111111111111111 1:14: 0 1111111111111101111111111111111111111111111111111111111111111111 1:15: 0 1111111111111110111111111111111111111111111111111111111111111111 .... 32:14:14 1111111111111100000000000000000000000000000000111111111111111111 32:15:14 1111111111111110000000000000000000000000000000011111111111111111 32: 0:15 0000000000000000000000000000000011111111111111111111111111111111 32: 1:15 1000000000000000000000000000000001111111111111111111111111111111 32: 2:15 1100000000000000000000000000000000111111111111111111111111111111 32: 3:15 1110000000000000000000000000000000011111111111111111111111111111 32: 4:15 1111000000000000000000000000000000001111111111111111111111111111 32: 5:15 1111100000000000000000000000000000000111111111111111111111111111 32: 6:15 1111110000000000000000000000000000000011111111111111111111111111 32: 7:15 1111111000000000000000000000000000000001111111111111111111111111 32: 8:15 1111111100000000000000000000000000000000111111111111111111111111 32: 9:15 1111111110000000000000000000000000000000011111111111111111111111 32:10:15 1111111111000000000000000000000000000000001111111111111111111111 32:11:15 1111111111100000000000000000000000000000000111111111111111111111 32:12:15 1111111111110000000000000000000000000000000011111111111111111111 32:13:15 1111111111111000000000000000000000000000000001111111111111111111 32:14:15 1111111111111100000000000000000000000000000000111111111111111111 32:15:15 1111111111111110000000000000000000000000000000011111111111111111 ``` ## 改寫更精簡且等效的程式碼 `Bug 2 - 錯誤的 mask (位元寫入 - 不需跨兩個位元組寫入)` 修正 patch 中,可利用此程式碼 `mask |= write_mask[write_lhs + bitsize];` 消除邊界條件,使其轉換成一般條件,如下所示: ```diff diff --git a/bitcpy.c b/bitcpy.c index ad26bcd..6c70271 100644 --- a/bitcpy.c +++ b/bitcpy.c @@ -54,25 +54,19 @@ void bitcpy(void *_dest, /* Address of the buffer to write to */ data &= read_mask[bitsize]; original = *dest; - if (write_lhs > 0) { - mask = read_mask[write_lhs]; - if (bitsize > write_rhs) { - /* Cross multiple bytes */ - *dest++ = (original & mask) | (data >> write_lhs); - original = *dest & write_mask[bitsize - write_rhs]; - *dest = original | (data << write_rhs); - } else { - /* - * write_lhs + bitsize is never greater than 8, so - * no out-of-bound access is observed. - */ - mask |= write_mask[write_lhs + bitsize]; - *dest++ = (original & mask) | (data >> write_lhs); - } + mask = read_mask[write_lhs]; + if (bitsize > write_rhs) { + /* Cross multiple bytes */ + *dest++ = (original & mask) | (data >> write_lhs); + original = *dest & write_mask[bitsize - write_rhs]; + *dest = original | (data << write_rhs); } else { - if (bitsize < 8) - data = data | (original & write_mask[bitsize]); - *dest++ = data; + /* + * write_lhs + bitsize is never greater than 8, so + * no out-of-bound access is observed. + */ + mask |= write_mask[write_lhs + bitsize]; + *dest++ = (original & mask) | (data >> write_lhs); } _count -= bitsize; ``` ## 在 Linux 核心原始程式碼找出逐 bit 進行資料複製的程式碼,並解說相對應的情境 Linux 核心定義 bitmap ,bitmap 為 unsigned long 陣列 (詳見: include/linux/types.h)。其中,`bitmap_set()` 設定某一特定區間位元全為 1,程式碼如下所示: 1. __builtin_constant_p() 編譯器優化: 此 GCC 內建函式判斷其參數值是否在編譯時期就已經知道。如果是,則回傳 1 (代表編譯器可做 [constant folding](https://en.wikipedia.org/wiki/Constant_folding))。`bitmap_set ()` 有兩個相關優化: * nbits = 1: 底下範例程式碼將數值 `1` 傳給 `nbits`,所以 `if (__builtin_constant_p(nbits) && nbits == 1)` 條件成立。 ```cpp /* Source: arch/mips/kernel/smp-cps.c */ static void boot_core(unsigned int core, unsigned int vpe_id) { ... /* The core is now powered up */ bitmap_set(core_power, core, 1); } ``` * `start` 與 `nbits` 對齊 8 個位元組 (bitmap 為 unsigned long 陣列,所以最小單位為 `unsigned long`): 此狀況可呼叫 memset。 ```cpp /* Source: include/linux/bitmap.h */ static __always_inline void bitmap_set(unsigned long *map, unsigned int start, unsigned int nbits) { if (__builtin_constant_p(nbits) && nbits == 1) __set_bit(start, map); else if (__builtin_constant_p(start & BITMAP_MEM_MASK) && IS_ALIGNED(start, BITMAP_MEM_ALIGNMENT) && __builtin_constant_p(nbits & BITMAP_MEM_MASK) && IS_ALIGNED(nbits, BITMAP_MEM_ALIGNMENT)) memset((char *)map + start / 8, 0xff, nbits / 8); else __bitmap_set(map, start, nbits); } ``` 2. 呼叫 __bitmap_set(): 此函式依據參數 `start` 與 `len` 設定其對應的位元為 1。說明如下: * `p`: 依據起始位元 (`start`) 找出對應的 bitmap 陣列元素。 * `size`: 紀錄最後一個 bitmap 陣列元素中,欲設定的位元總數。 * `bits_to_set`: 此次所設定 bitmap 陣列元素的位元總數。 * `mask_to_set`: 此次所設定 bitmap 陣列元素的位元遮罩。 * `while迴圈`: 此迴圈設定每一個陣列元素的所有位元為 1 (當所欲設定的位元總數超過 `sizeof (unsigned long)` ,也就是超過 64 個位元。 * `最後一個 bitmap 陣列元素設定`: 依據 `start` 與 `len` 找出對應的 `mask_to_set`。 ```cpp /* Source: lib/bitmap.c */ void __bitmap_set(unsigned long *map, unsigned int start, int len) { unsigned long *p = map + BIT_WORD(start); const unsigned int size = start + len; int bits_to_set = BITS_PER_LONG - (start % BITS_PER_LONG); unsigned long mask_to_set = BITMAP_FIRST_WORD_MASK(start); while (len - bits_to_set >= 0) { *p |= mask_to_set; len -= bits_to_set; bits_to_set = BITS_PER_LONG; mask_to_set = ~0UL; p++; } if (len) { mask_to_set &= BITMAP_LAST_WORD_MASK(size); *p |= mask_to_set; } } ``` * 巨集 `BITMAP_LAST_WORD_MASK`,如同其名稱所述,即找出最後一個 bitmap 陣列元素位元遮罩 (從最右邊位元算起)。此巨集巧妙地使用 `nbits` 的負數並與 `BITS_PER_LONG - 1` 做 bitwise AND 後,再做右移,即可得出其值。 ```cpp #define BITMAP_LAST_WORD_MASK(nbits) (~0UL >> (-(nbits) & (BITS_PER_LONG - 1))) ``` 然而,比較直覺的巨集定義為 `#define BITMAP_LAST_WORD_MASK_V2(nbits) ((1UL << (nbits)) - 1)`,進而改寫為`#define BITMAP_LAST_WORD_MASK_V2(nbits) ((1 << ((nbits) & (BITS_PER_LONG - 1))) - 1)`。試想,為何 Linux 核心開發者要使用此方法呢? 差異性如下: * BITMAP_LAST_WORD_MASK(nbits) * `nbits = 0`,則回傳 0xFFFF_FFFF_FFFF_FFFF。然而,使用巨集時,必須確保 `nbits > 0` (可觀察 Linux 程式碼)。 * 如底下分析,此方法花費 6 CPU cycles。 * BITMAP_LAST_WORD_MASK_V2(nbits) * `nbits = 0`,則回傳 0。 * 如底下分析,此方法花費 7 CPU cycles。 * 使用底下範例程式,比較其組合語言差異。 ```cpp #include <stdio.h> #define BITS_PER_LONG 64 #define BITMAP_LAST_WORD_MASK(nbits) (~0UL >> (-(nbits) & (BITS_PER_LONG - 1))) int main(void) { const unsigned int size = 3; unsigned long mask_to_set = BITMAP_LAST_WORD_MASK(size); printf("%d 0x%x\n", mask_to_set, mask_to_set); return 0; } ``` * `BITMAP_LAST_WORD_MASK`: 總共六道指令,對照 Intel Software Optimization Reference Manual 文件 [Intel® Xeon® Scalable Processor Throughput and Latency](https://software.intel.com/en-us/download/intel-xeon-scalable-processor-throughput-and-latency),這六道指令共花費 6 CPU cycles (1 + 1 + 1 + 0 [你沒看錯就是 0 !] + 3 + 0) ``` # Code generated from 'define BITMAP_LAST_WORD_MASK(nbits) (~0UL >> (-(nbits) & (BITS_PER_LONG - 1)))' 65c: f7 d8 neg %eax 65e: 83 e0 3f and $0x3f,%eax 661: 48 c7 c2 ff ff ff ff mov $0xffffffffffffffff,%rdx 668: 89 c1 mov %eax,%ecx 66a: 48 d3 ea shr %cl,%rdx 66d: 48 89 d0 mov %rdx,%rax ``` * `BITMAP_LAST_WORD_MASK_V2`: 總共七道指令,對照 Intel Software Optimization Reference Manual 文件 Intel® Xeon® Scalable Processor Throughput and Latency,這六道指令共花費 7 CPU cycles (1 + 1 + 0 + 3 + 0 + 1 + 1,`cltq` 指令等效 Intel 指令 `cdqe`)。 ``` 65c: 83 e0 3f and $0x3f,%eax 65f: ba 01 00 00 00 mov $0x1,%edx 664: 89 c1 mov %eax,%ecx 666: d3 e2 shl %cl,%edx 668: 89 d0 mov %edx,%eax 66a: 83 e8 01 sub $0x1,%eax 66d: 48 98 cltq ``` ## 測驗 `2` - Vector 實作 ([Github](https://github.com/AdrianHuang/vec)) ### 程式運作的原理 `STRUCT_BODY` 與 `v` 巨集實作 [C++ vector (STL)](https://zh.wikipedia.org/wiki/Vector_(STL)) 雛形。 * `STRUCT_BODY`: 紀錄 metadata 與存放資料的起始位址。 * `size`: 紀錄 vector 已存放的資料個數。 * `on_heap`: vector 的資料是否存放在 `heap`。 * `capacity`: vector 可容納資料的總個數。 * `ptr`: 資料如果存放在 `heap`,`ptr` 用以紀錄起始位址。如果存放在 `stack`,其起始位址為 `buf` 成員。 * `v`: 此巨集宣告一個新 vector 變數。由於,vector 存放的資料有可能會在 `heap`,因此需要一個機制將分配到的記憶體空間釋放,進而避免 `memory leak`。所以當 vector 變數不在宣告此變數的 scope 內 (譬如: 在某一函式宣告 vector 變數,當此函式程式執行完畢準備返回時),可使用 GNU attribute [cleanup](https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html) 自行定義 `cleanup function` 用以釋放記憶體空間。 * 使用 `v` 宣告新的 vector 變數,其資料預設存放空間位於 `stack` 。當使用 `vec_push_back` 或 `vec_reserve`,如有需要擴充 vector capacity大小,則使用 `malloc` 配置記憶體空間 (也就是存放空間改為 `heap`)。 ```cpp #define STRUCT_BODY(type) \ struct { \ size_t size : 54, on_heap : 1, capacity : 6, flag1 : 1, flag2 : 1, \ flag3 : 1; \ type *ptr; \ } #define NEXT_POWER_OF_2(s) \ !(s & (s - 1)) \ ? s \ : (size_t) 1 << (64 - __builtin_clzl(s)) /* next power of 2 */ #define v(t, s, name, ...) \ union { \ STRUCT_BODY(t); \ struct { \ size_t filler; \ t buf[NEXT_POWER_OF_2(s)]; \ }; \ } name __attribute__((cleanup(vec_free))) = {.buf = {__VA_ARGS__}}; \ name.size = sizeof((__typeof__(name.buf[0])[]){0, __VA_ARGS__}) / \ sizeof(name.buf[0]) - 1; \ name.capacity = sizeof(name.buf) / sizeof(name.buf[0]) ``` ### 指出上述實作的限制並提出改進方案 宣告一個 vector,其大小超過 1024 * 1024,如下所示: ```cpp v(float, 1024 * 1024 + 1, vec1); ``` 編譯並執行,說明如下: * 編譯警告訊息: `capacity` 只有 6 個位元,所以會造成 overflow (原始程式碼使用 `sizeof(v.buf) / sizeof(v.buf[0])` 取得存放在 stack 的 vector大小,可以避免讀取溢位的 `capacity`)。 * Segmentation fault: Linux Proecess stack 大小預設為 8MB (可用`ulimit -s`指令確認)。該範例故意從 stack 分配超過 8MB,因此造成 stack overflow,進而發生 segmentation fault。 ```shell $ make gcc -g -c -o vec.o vec.c vec.c: In function ‘main’: vec.c:30:21: warning: large integer implicitly truncated to unsigned type [-Woverflow] name.capacity = sizeof(name.buf) / sizeof(name.buf[0]) ^ vec.c:125:5: note: in expansion of macro ‘v’ v(float, 1024 * 1024 + 1, vec1); ^ gcc -o vec vec.o $ ./vec Segmentation fault (core dumped) ``` #### 改進方案 ##### 方案一: 使用 static assertion 在宣告 vector 時,使用 `static assertion` 確保 vector capacity 不超過所定義的數字 (此範例 capacity = 8)。注意,宣告 vector 時,如果 capacity 數值太大的話,會有此情況發生: 當 vector capacity 需要增加時 (也就是存放空間 `stack` -> `heap`),`memcpy` 所花費時間會增加 (因為需要將原本存放在 stack 資料搬到 `heap`)。 ```diff diff --git a/vec.c b/vec.c index 7fc3909..a6095cd 100644 --- a/vec.c +++ b/vec.c @@ -18,6 +18,10 @@ : (size_t) 1 << (64 - __builtin_clzl(s)) /* next power of 2 */ #define v(t, s, name, ...) \ + (void) ((struct { \ + _Static_assert(s <= 8, "it is too big"); \ + int dummy; \ + }){1}); \ ```

    Import from clipboard

    Paste your markdown or webpage here...

    Advanced permission required

    Your current role can only read. Ask the system administrator to acquire write and comment permission.

    This team is disabled

    Sorry, this team is disabled. You can't edit this note.

    This note is locked

    Sorry, only owner can edit this note.

    Reach the limit

    Sorry, you've reached the max length this note can be.
    Please reduce the content or divide it to more notes, thank you!

    Import from Gist

    Import from Snippet

    or

    Export to Snippet

    Are you sure?

    Do you really want to delete this note?
    All users will lose their connection.

    Create a note from template

    Create a note from template

    Oops...
    This template has been removed or transferred.
    Upgrade
    All
    • All
    • Team
    No template.

    Create a template

    Upgrade

    Delete template

    Do you really want to delete this template?
    Turn this template into a regular note and keep its content, versions, and comments.

    This page need refresh

    You have an incompatible client version.
    Refresh to update.
    New version available!
    See releases notes here
    Refresh to enjoy new features.
    Your user state has changed.
    Refresh to load new user state.

    Sign in

    Forgot password

    or

    By clicking below, you agree to our terms of service.

    Sign in via Facebook Sign in via Twitter Sign in via GitHub Sign in via Dropbox Sign in with Wallet
    Wallet ( )
    Connect another wallet

    New to HackMD? Sign up

    Help

    • English
    • 中文
    • Français
    • Deutsch
    • 日本語
    • Español
    • Català
    • Ελληνικά
    • Português
    • italiano
    • Türkçe
    • Русский
    • Nederlands
    • hrvatski jezik
    • język polski
    • Українська
    • हिन्दी
    • svenska
    • Esperanto
    • dansk

    Documents

    Help & Tutorial

    How to use Book mode

    Slide Example

    API Docs

    Edit in VSCode

    Install browser extension

    Contacts

    Feedback

    Discord

    Send us email

    Resources

    Releases

    Pricing

    Blog

    Policy

    Terms

    Privacy

    Cheatsheet

    Syntax Example Reference
    # Header Header 基本排版
    - Unordered List
    • Unordered List
    1. Ordered List
    1. Ordered List
    - [ ] Todo List
    • Todo List
    > Blockquote
    Blockquote
    **Bold font** Bold font
    *Italics font* Italics font
    ~~Strikethrough~~ Strikethrough
    19^th^ 19th
    H~2~O H2O
    ++Inserted text++ Inserted text
    ==Marked text== Marked text
    [link text](https:// "title") Link
    ![image alt](https:// "title") Image
    `Code` Code 在筆記中貼入程式碼
    ```javascript
    var i = 0;
    ```
    var i = 0;
    :smile: :smile: Emoji list
    {%youtube youtube_id %} Externals
    $L^aT_eX$ LaTeX
    :::info
    This is a alert area.
    :::

    This is a alert area.

    Versions and GitHub Sync
    Get Full History Access

    • Edit version name
    • Delete

    revision author avatar     named on  

    More Less

    Note content is identical to the latest version.
    Compare
      Choose a version
      No search result
      Version not found
    Sign in to link this note to GitHub
    Learn more
    This note is not linked with GitHub
     

    Feedback

    Submission failed, please try again

    Thanks for your support.

    On a scale of 0-10, how likely is it that you would recommend HackMD to your friends, family or business associates?

    Please give us some advice and help us improve HackMD.

     

    Thanks for your feedback

    Remove version name

    Do you want to remove this version name and description?

    Transfer ownership

    Transfer to
      Warning: is a public team. If you transfer note to this team, everyone on the web can find and read this note.

        Link with GitHub

        Please authorize HackMD on GitHub
        • Please sign in to GitHub and install the HackMD app on your GitHub repo.
        • HackMD links with GitHub through a GitHub App. You can choose which repo to install our App.
        Learn more  Sign in to GitHub

        Push the note to GitHub Push to GitHub Pull a file from GitHub

          Authorize again
         

        Choose which file to push to

        Select repo
        Refresh Authorize more repos
        Select branch
        Select file
        Select branch
        Choose version(s) to push
        • Save a new version and push
        • Choose from existing versions
        Include title and tags
        Available push count

        Pull from GitHub

         
        File from GitHub
        File from HackMD

        GitHub Link Settings

        File linked

        Linked by
        File path
        Last synced branch
        Available push count

        Danger Zone

        Unlink
        You will no longer receive notification when GitHub file changes after unlink.

        Syncing

        Push failed

        Push successfully