KYG
    • Create new note
    • Create a note from template
      • Sharing URL Link copied
      • /edit
      • View mode
        • Edit mode
        • View mode
        • Book mode
        • Slide mode
        Edit mode View mode Book mode Slide mode
      • Customize slides
      • Note Permission
      • Read
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Write
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Engagement control Commenting, Suggest edit, Emoji Reply
    • Invite by email
      Invitee

      This note has no invitees

    • Publish Note

      Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

      Your note will be visible on your profile and discoverable by anyone.
      Your note is now live.
      This note is visible on your profile and discoverable online.
      Everyone on the web can find and read all notes of this public team.
      See published notes
      Unpublish note
      Please check the box to agree to the Community Guidelines.
      View profile
    • Commenting
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
      • Everyone
    • Suggest edit
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
    • Emoji Reply
    • Enable
    • Versions and GitHub Sync
    • Note settings
    • Note Insights New
    • Engagement control
    • Make a copy
    • Transfer ownership
    • Delete this note
    • Save as template
    • Insert from template
    • Import from
      • Dropbox
      • Google Drive
      • Gist
      • Clipboard
    • Export to
      • Dropbox
      • Google Drive
      • Gist
    • Download
      • Markdown
      • HTML
      • Raw HTML
Menu Note settings Note Insights Versions and GitHub Sync Sharing URL Create Help
Create Create new note Create a note from template
Menu
Options
Engagement control Make a copy Transfer ownership Delete this note
Import from
Dropbox Google Drive Gist Clipboard
Export to
Dropbox Google Drive Gist
Download
Markdown HTML Raw HTML
Back
Sharing URL Link copied
/edit
View mode
  • Edit mode
  • View mode
  • Book mode
  • Slide mode
Edit mode View mode Book mode Slide mode
Customize slides
Note Permission
Read
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Write
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Engagement control Commenting, Suggest edit, Emoji Reply
  • Invite by email
    Invitee

    This note has no invitees

  • Publish Note

    Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

    Your note will be visible on your profile and discoverable by anyone.
    Your note is now live.
    This note is visible on your profile and discoverable online.
    Everyone on the web can find and read all notes of this public team.
    See published notes
    Unpublish note
    Please check the box to agree to the Community Guidelines.
    View profile
    Engagement control
    Commenting
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    • Everyone
    Suggest edit
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    Emoji Reply
    Enable
    Import from Dropbox Google Drive Gist Clipboard
       Owned this note    Owned this note      
    Published Linked with GitHub
    • Any changes
      Be notified of any changes
    • Mention me
      Be notified of mention me
    • Unsubscribe
    # 2020q1 Homework2 (quiz2) contributed by < [`KYG-yaya573142`](https://github.com/KYG-yaya573142/xstring) > > [第 2 周測驗題](https://hackmd.io/@sysprog/linux2020-quiz2) ## 解題思路 ### Q1 首先觀察 xs 的資料結構定義 ```cpp typedef union { /* allow strings up to 15 bytes to stay on the stack * use the last byte as a null terminator and to store flags * much like fbstring: * https://github.com/facebook/folly/blob/master/folly/docs/FBString.md */ char data[16]; struct { uint8_t filler[15], /* how many free bytes in this stack allocated string * same idea as fbstring */ space_left : 4, /* if it is on heap, set to 1 */ is_ptr : 1, flag1 : 1, flag2 : 1, flag3 : 1; }; /* heap allocated */ struct { char *ptr; /* supports strings up to 2^54 - 1 bytes */ size_t size : 54, /* capacity is always a power of 2 (unsigned)-1 */ capacity : 6; /* the last 4 bits are important flags */ }; } xs; ``` ![](https://i.imgur.com/xpC5dPX.jpg) * 最多可存放 16 bytes 的資料 * 最後一個 byte 的前 4 bits 存放剩餘的空間 `space_left`,後 4 bits 的每一個 bit 都作為 flag 使用 (目前只有第一個 flag 有用途) * `space_left` 共佔 4 bits,可以表示的無號整數範圍是 0~15 * 最巧妙的地方為,當存放大小為 16 bytes 的字串時,剩餘空間為 0,同時字串末端的 `\0` 剛好也是 0,此時第 16 個 byte 的位置剛好可以同時表達 null terminator 以及剩餘空間是 0 * 當字串長度 > 16 bytes,會動態配置記憶體 * ptr 紀錄動態記憶體的指標 * size 紀錄字串的長度 (不包含 \0`) * capacity 紀錄動態記憶體的大小,需特別注意會一律配置 2 的次方 byte 的大小,因此需配合 `ilog2` 使用 ```cpp xs *xs_new(xs *x, const void *p) { *x = xs_literal_empty(); size_t len = strlen(p) + 1; if (len > AAA) { x->capacity = ilog2(len) + 1; x->size = len - 1; x->is_ptr = true; x->ptr = malloc((size_t) 1 << x->capacity); memcpy(x->ptr, p, len); } else { memcpy(x->data, p, len); x->space_left = 15 - (len - 1); } return x; } ``` `if (len > AAA)` 的目的是根據字串長度決定是否需要動態配置記憶體,根據 xs 的資料結構知道最大可以使用的空間為 16 bytes,因此 `AAA` 應填入 16 ### Q2 & Q3 觀察 `xs_trim` 的預期結果,可以了解其用途是根據 pattern 來去除原本字串的頭和尾,本例的 pattern 為 `"\n "`,因此會將頭尾的空白以及 `\n` 去除 ```cpp xs string = *xs_tmp(" foobarbar \n\n"); xs_trim(&string, "\n "); printf("[%s] : %2zu\n", xs_data(&string), xs_size(&string)); xs prefix = *xs_tmp("((("), suffix = *xs_tmp(")))"); xs_concat(&string, &prefix, &suffix); printf("[%s] : %2zu\n", xs_data(&string), xs_size(&string)); ``` 接著觀察 `xs_trim` 16~25 行的部分 * 首先以 byte 為單位掃描 `trimset`,使用 `set_bit` 來對 `mask` 進行一些操作 * 再對字串分別從頭尾掃描,使用 `check_bit` 來根據 mask 決定是否刪減頭尾的 byte * 因此 `mask` 與 `set_bit` 及 `check_bit` 是 `xs_trim` 運作的關鍵 ```cpp= xs *xs_trim(xs *x, const char *trimset) { if (!trimset[0]) return x; char *dataptr = xs_data(x), *orig = dataptr; /* similar to strspn/strpbrk but it operates on binary data */ uint8_t mask[BBB] = {0}; #define check_bit(byte) (mask[(uint8_t) byte / 8] CCC 1 << (uint8_t) byte % 8) #define set_bit(byte) (mask[(uint8_t) byte / 8] |= 1 << (uint8_t) byte % 8) size_t i, slen = xs_size(x), trimlen = strlen(trimset); for (i = 0; i < trimlen; i++) set_bit(trimset[i]); for (i = 0; i < slen; i++) if (!check_bit(dataptr[i])) break; for (; slen > 0; slen--) if (!check_bit(dataptr[slen - 1])) break; dataptr += i; slen -= i; /* reserved space as a buffer on the heap. * Do not reallocate immediately. Instead, reuse it as possible. * Do not shrink to in place if < 16 bytes. */ memmove(orig, dataptr, slen); /* do not dirty memory unless it is needed */ if (orig[slen]) orig[slen] = 0; if (xs_is_ptr(x)) x->size = slen; else x->space_left = 15 - slen; return x; #undef check_bit #undef set_bit } ``` 觀察 `set_bit` 與 `check_bit` * `mask[(uint8_t) byte / 8]` 可以理解為將輸入的 byte 以 8 為倍數進行分組 * 接著 `1 << (uint8_t) byte % 8` 可以用來設置對應 `mask` 中的某個 bit * 因此每個不同的 ASCII 字元都會對應到特定 `mask` 中的特定 bit,bit 被設置為 1 就代表對應的 ASCII 字元須被刪除,以 `char *trimset = "a"` 為例 ```cpp char *trimset = "a"; byte = 0x61 = 97 mask[12] |= 1 << 1 mask[12] = 00000010 ``` * 1 byte 能呈現的範圍是 0~255,因此 `mask` 的總 bit 數量至少要大於 256 * $BBB \times 8 = 256 \rightarrow BBB = 32$ * 觀察第 19 及 22 行,若 `check_bit` 返回 1 就會刪除一個字元 (藉由移動指標的方式),這同時表示 `mask` 中對應的 bit 有被 `set_bit` 設置,因此 `check_bit` 中的 `CCC` 應填入 `&` ### `xs_tmp` 整個 xs.c 程式碼中我最看不懂的部分是 `xs_tmp` ```cpp /* Memory leaks happen if the string is too long but it is still useful for * short strings. * "" causes a compile-time error if x is not a string literal or too long. */ #define xs_tmp(x) \ ((void) ((struct { \ _Static_assert(sizeof(x) <= 16, "it is too big"); \ int dummy; \ }){1}), \ xs_new(&xs_literal_empty(), "" x)) ``` * [_Static_assert](https://en.wikichip.org/wiki/c/static_assertion) 是 [C11](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf) 增加的 assertion > C11 [6.7.10] Static assertions > The constant expression shall be an integer constant expression. **If the value of the constant expression compares unequal to 0, the declaration has no effect. Otherwise, the constraint is violated and the implementation shall produce a diagnostic message** that includes the text of the string literal, except that characters not in the basic source character set are not required to appear in the message. * 因此若字串長度超過 16 bytes,在編譯時就會出現錯誤,而功編譯後 `xs_tmp` 會展開為 ```cpp ((void) ((struct {int dummy;}){1}), xs_new(&xs_literal_empty(), "" x)) ``` * `(struct {int dummy}){1}` 的部分就是定義一個 struct 物件且值為 1 * 接著看不懂的部分就變成符號 `,`,查閱規格書後知道它其實是一個運算子 comma perator,規格書中相關的定義如下 * comma operator 會**先** evaluate 左方的內容,且會將其視作 void expression,這也是為何左邊的 struct 物件需要 `(void)` 型態轉換 * 這也是左方 struct 物件存在的意義,如果 comma operator 左方只寫 `_Static_assert`,會因為缺乏 left operand 而造成編譯錯誤 * comma operator 最終的結果會具有右方內容的資料型態與數值 > C11 [6.5.17] Comma operator > The left operand of a comma operator is evaluated as a void expression; there is a sequence point between its evaluation and that of the right operand. **Then** the right operand is evaluated; the result has its type and value. * 因此展開後的 `xs_tmp` 實質上會變成 ```cpp xs_new(&xs_literal_empty(), "" x) ``` * 剩下最後一個部分 `""`,理解這個部分的靈感來自 [colinyoyo26 的共筆](https://hackmd.io/@colinyoyo26/xs)以及[你所不知道的C語言:指標篇](https://hackmd.io/@sysprog/c-pointer?type=view) * 根據 C99 [6.4.5],當兩個 string literal 連在一起時,會被 concatenated 成一個 string literal,因此只有當 x 也是 string literal 時才不會編譯錯誤 * 使用以下程式碼 test.c 測試,可以觀察到 * `xs_tmpa(x)` 的 `x` 代入 `ptr` 或是 string literal 都可以正常編譯 * `xs_tmpb(x)` 的 `x` 只能代入 string literal,代入 `ptr` 會 compilation error ```cpp #include <stdio.h> #define xs_tmpa(x) printf("%s\n", x) #define xs_tmpb(x) printf("%s\n", "" x) int main() { char *ptr = "foo"; xs_tmpa(ptr); xs_tmpa("bar"); xs_tmpb(ptr); //compilation error xs_tmpb("bar"); return 0; } ``` * 從 C99 [6.4.5] 中可以知道 string literals 會被分配於 static storage,去除程式碼中的 `xs_tmpb` 進行測試,以 `objdump` 觀察可以確認 string literal 確實在 section .rodata ```shell $ gcc -g -c test.c -o test.o $ objdump -s test.o | grep -B 1 bar Contents of section .rodata: 0000 666f6f00 62617200 foo.bar. ``` ## 用法改善 首先觀察 `xs_literal_empty` 的實作,會發現使用此 macro 的部分在展開後其實等同定義一個無名的 xs 物件 (此物件在 stack 上)。因此 `xs_new` 做的事情是將 `xs_literal_empty` 中定義的物件的值複製給 x 指向的物件 (初始化),再更新 x 指向的 xs 物件的內容 ```cpp #define xs_literal_empty() (xs) { .space_left = 15 } xs *xs_new(xs *x, const void *p) { *x = xs_literal_empty(); size_t len = strlen(p) + 1; if (len > 16) { x->capacity = ilog2(len) + 1; x->size = len - 1; x->is_ptr = true; x->ptr = malloc((size_t) 1 << x->capacity); memcpy(x->ptr, p, len); } else { memcpy(x->data, p, len); x->space_left = 15 - (len - 1); } return x; } ``` 接著觀察原本例題中的程式碼使用 xs 物件的方式 ```cpp xs string = *xs_tmp(" foobarbar \n\n"); ``` 根據上面對[`xs_tmp` 的討論](#xs_tmp),展開上述的表達 ```cpp xs string = xs_new(&xs_literal_empty(), " foobarbar \n\n"); ``` 會發現這種使用的方式其實定義了兩個物件,一個是等號左方的 string,另一個是等號右方 `xs_literal_empty` 定義的無名物件,接著根據給定的 string literal 以 `xs_new` 初始化等號右方的物件,最後再將等號右方物件的值複製給等號左方的物件 string。 仔細想想會發現這樣根本多此一舉,應該直接定義一個物件然後直接用 `xs_new` 進行初始化就好,因此原本例題中的程式碼可以改成如下用法 ```cpp int main() { xs *string = xs_tmp("\n foobarbar \n\n\n"); xs_trim(string, "\n "); printf("[%s] : %2zu\n", xs_data(string), xs_size(string)); xs *prefix = xs_tmp("((("), *suffix = xs_tmp(")))"); xs_concat(string, prefix, suffix); printf("[%s] : %2zu\n", xs_data(string), xs_size(string)); return 0; } ``` * 物件空間實質上是在 `xs_tmp` 的位置定義的 (位於 stack) * 改變後的用法更符合原本 C 語言使用字串的方式 * 原始碼不需更動 ## 字串複製功能的函式實作 (應考慮到 CoW) ### 未考慮 CoW 的版本 ```cpp xs *xs_cpy(xs *dest, xs* src) { /* data on heap */ if (xs_is_ptr(src)) { xs_grow(dest, xs_size(src) + 1); memcpy(dest->ptr, src->ptr, xs_size(src) + 1); dest->size = src->size; } else { /* src data on stack */ if (xs_is_ptr(dest)) { free(dest->ptr); dest->is_ptr = false; } memcpy(dest->data, src->data, xs_size(src) + 1); dest->space_left = src->space_left; } return dest; } ``` * 如果 `src` 字串在 stack (短字串),空間一定夠所以直接複製到 `dest` * 如果 `dest` 本來有動態配置記憶體,要釋放掉避免 memory leak * 如果 `src` 字串在 heap,使用 `xs_grow` 來處理並確保 `dest` 有足夠的空間可以使用,再將 `src` 複製過去 使用以下程式碼測試,結果符合預期 ```cpp int main() { xs *src = xs_tmp("foobarbar"); xs *dest = xs_tmp("original"); printf("src: [%s] size: %2zu\n", xs_data(src), xs_size(src)); printf("dest: [%s] size: %2zu\n", xs_data(dest), xs_size(dest)); xs_cpy(dest, src); printf("after xs_cpy(dest, src)\n"); printf("src: [%s] size: %2zu\n", xs_data(src), xs_size(src)); printf("dest: [%s] size: %2zu\n", xs_data(dest), xs_size(dest)); printf("----\n"); xs *prefix = xs_tmp("I like "), *suffix = xs_tmp("!!!"); xs_concat(src, prefix, suffix); printf("src: [%s] size: %2zu\n", xs_data(src), xs_size(src)); printf("dest: [%s] size: %2zu\n", xs_data(dest), xs_size(dest)); xs_cpy(dest, src); printf("after xs_cpy(dest, src)\n"); printf("src: [%s] size: %2zu\n", xs_data(src), xs_size(src)); printf("dest: [%s] size: %2zu\n", xs_data(dest), xs_size(dest)); return 0; } ``` ``` src: [foobarbar] size: 9 dest: [original] size: 8 after xs_cpy(dest, src) src: [foobarbar] size: 9 dest: [foobarbar] size: 9 ---- src: [I like foobarbar!!!] size: 19 dest: [foobarbar] size: 9 after xs_cpy(dest, src) src: [I like foobarbar!!!] size: 19 dest: [I like foobarbar!!!] size: 19 ``` ### CoW 的版本 這裡的參考 [CS:APP 第 10 章](https://hackmd.io/@sysprog/S1vGugaDQ/https%3A%2F%2Fhackmd.io%2Fs%2FH1TtmVTTz?type=book) 當中描述的 fork 後 child process 共用檔案的方式,將動態配置的記憶體前 4 bytes 用來記錄 reference count (refcnt)。 ```graphviz digraph xs { node [shape=record]; //default node style xs [label="<f0> *ptr|size|capacity|flag" xlabel="xs" width=4.0] heap [label="refcnt|string" width=2.0]; xs:f0 -> heap } ``` 為了更簡易的存取 `refcnt`,首先定義以下 3 個 macro 以及 2 個輔助函式 `xs_is_ref` 與 `xs_cow` ```cpp #define REFCNT_NUM(ptr) (*((int*)(ptr) - 1)) #define REFCNT_INCRE(ptr) (REFCNT_NUM(ptr)++) #define REFCNT_DECRE(ptr) (REFCNT_NUM(ptr)--) ``` #### `xs_is_ref` ```cpp static inline size_t xs_is_ref(const xs *x) { if (xs_is_ptr(x)) { if (REFCNT_NUM(x->ptr) > 1) return true; } return false; } ``` * 檢查 `x->ptr` 是否有被參照 #### `xs_cow` ```cpp static void xs_cow(xs *x) { if (xs_is_ref(x)) { REFCNT_DECRE(x->ptr); xs_new(x, xs_data(x)); } } ``` * 如果 `x->ptr` 有被參照的話,另開一個新的空間給 `x` #### `xs_new` ```diff xs *xs_new(xs *x, const void *p) { *x = xs_literal_empty(); size_t len = strlen(p) + 1; if (len > 16) { x->capacity = ilog2(len) + 1; x->size = len - 1; x->is_ptr = true; - x->ptr = malloc((size_t) 1 << x->capacity); + x->ptr = malloc((size_t) 1 << x->capacity + 4); /* additional 4 bytes for refcnt */ + x->ptr += 4; /* leading 4 bytes = refcnt */ + REFCNT_NUM(x->ptr) = 1; memcpy(x->ptr, p, len); } else { memcpy(x->data, p, len); x->space_left = 15 - (len - 1); } return x; } ``` * 動態配置的記憶體需多出 4 bytes 作為 `refcnt` 使用 * 初始化 `refcnt` 數值為 1 #### `xs_grow` ```diff xs *xs_grow(xs *x, size_t len) { + xs_cow(x); if (len <= xs_capacity(x)) return x; len = ilog2(len) + 1; if (xs_is_ptr(x)) { + x->ptr -=4; x->ptr = realloc(x->ptr, (size_t) 1 << len + 4); + x->ptr +=4; } else { char buf[16]; memcpy(buf, x->data, 16); x->ptr = malloc((size_t) 1 << len + 4); + x->ptr -=4; + REFCNT_NUM(x->ptr) = 1; memcpy(x->ptr, buf, 16); } x->is_ptr = true; x->capacity = len; return x; } ``` * 動態配置的記憶體前 4 bytes 是 `refcnt`,因此 `malloc` 和 `free` 使用前後需要特別處理代入的指標 * 可能會更動到動態配置的記憶體,因此開頭前先 `xs_cow` #### `xs_free` ```diff static inline xs *xs_free(xs *x) { if (xs_is_ptr(x)) { - free(xs_data(x)); + if (REFCNT_NUM(x->ptr) > 1) { + REFCNT_DECRE(x->ptr); + x->ptr = NULL; + } else { + x->ptr -= 4; /* leading 4 bytes = refcnt */ + free(x->ptr); + } } return xs_newempty(x); } ``` * 如果 `refcnt` 大於 1 代表還有別的物件會參照此動態記憶體,因此只需將 `refcnt` 減 1 * `refcnt` 為 1 的時候代表無其他物件參照此動態記憶體,所以需要 `free` #### `xs_concat` ```diff xs *xs_concat(xs *string, const xs *prefix, const xs *suffix) { size_t pres = xs_size(prefix), sufs = xs_size(suffix), size = xs_size(string), capacity = xs_capacity(string); char *pre = xs_data(prefix), *suf = xs_data(suffix), *data = xs_data(string); + xs_cow(string); if (size + pres + sufs <= capacity) { memmove(data + pres, data, size); memcpy(data, pre, pres); memcpy(data + pres + size, suf, sufs + 1); - string->space_left = 15 - (size + pres + sufs); } else { xs tmps = xs_literal_empty(); xs_grow(&tmps, size + pres + sufs); char *tmpdata = xs_data(&tmps); memcpy(tmpdata + pres, data, size); memcpy(tmpdata, pre, pres); memcpy(tmpdata + pres + size, suf, sufs + 1); xs_free(string); *string = tmps; - string->size = size + pres + sufs; } + if (xs_is_ptr(string)) { + string->size = size + pres + sufs; + } else { + string->space_left = 15 - (size + pres + sufs); + } return string; } ``` * 在開頭先呼叫 `xs_grow` 來執行 copy on write 的相關處理 * 修改原本題目的 `size` 中的邏輯漏洞 * 原本如果同時符合 `size + pres + sufs <= capacity` 以及 `xs_is_ptr(string)`,就會錯誤的去修改 `string->space_left` #### `xs_trim` ```diff xs *xs_trim(xs *x, const char *trimset) { if (!trimset[0]) return x; + xs_cow(x); ... } ``` * 開頭先呼叫 `xs_grow` 來執行 copy on write 的相關處理 #### `xs_cpy` ```cpp xs *xs_cpy(xs *dest, xs* src) { xs_free(dest); memcpy(dest->data, src->data, 16); if (xs_is_ptr(src)) REFCNT_INCRE(src->ptr); return dest; } ``` * `dest` 可能原本就有使用或是參照的動態記憶體,需先 `xs_free` 來處理 * 複製時,只需將 `dest->ptr` 指向 `src->ptr`,並將 `refcnt` 增加 1 * 這裡直接使用 `memcpy` 複製整個 `src` 內容 * 如此一來,字串無論是在 stack 還是在 heap 都可以直接使用 `xs_cpy` #### 檢驗結果 用以下程式碼驗證,結果正確 ```cpp int main() { printf("---original---\n"); xs *src = xs_tmp("foobarbar"); xs *dest = xs_tmp("original"); xs *prefix = xs_tmp("@I like "), *suffix = xs_tmp("!!!"); printf("src: [%s] size: %2zu\n", xs_data(src), xs_size(src)); printf("dest: [%s] size: %2zu\n", xs_data(dest), xs_size(dest)); printf("prefix: [%s] suffix: [%s]\n", xs_data(prefix), xs_data(suffix)); xs_concat(src, prefix, suffix); printf("---after xs_concat(src, prefix, suffix)---\n"); printf("src: [%s] size: %2zu\n", xs_data(src), xs_size(src)); printf("dest: [%s] size: %2zu\n", xs_data(dest), xs_size(dest)); xs_cpy(dest, src); printf("---after xs_cpy(dest, src)---\n"); printf("src: [%s] size: %2zu\n", xs_data(src), xs_size(src)); printf("dest: [%s] size: %2zu\n", xs_data(dest), xs_size(dest)); printf("src refcnt: %d dest refcnt: %d\n", REFCNT_NUM(src->ptr), REFCNT_NUM(dest->ptr)); printf("src: %p\ndest: %p\n", src->ptr, dest->ptr); xs_trim(dest, "!@"); printf("---after xs_trim(dest, \"@!\")---\n"); printf("src: [%s] size: %2zu\n", xs_data(src), xs_size(src)); printf("dest: [%s] size: %2zu\n", xs_data(dest), xs_size(dest)); printf("src: %p\ndest: %p\n", src->ptr, dest->ptr); return 0; } ``` ``` ---original--- src: [foobarbar] size: 9 dest: [original] size: 8 prefix: [@I like ] suffix: [!!!] ---after xs_concat(src, prefix, suffix)--- src: [@I like foobarbar!!!] size: 20 dest: [original] size: 8 ---after xs_cpy(dest, src)--- src: [@I like foobarbar!!!] size: 20 dest: [@I like foobarbar!!!] size: 20 src refcnt: 2 dest refcnt: 2 src: 0x5600d4a8366c dest: 0x5600d4a8366c ---after xs_trim(dest, "@!")--- src: [@I like foobarbar!!!] size: 20 dest: [I like foobarbar] size: 16 src: 0x5600d4a8366c dest: 0x5600d4a83884 ``` ## 設計實驗,確認上述字串處理最佳化 (針對空間效率) 帶來的效益,應考慮到 [locality of reference](https://en.wikipedia.org/wiki/Locality_of_reference) ### 時間效益 使用以下程式碼進行測試 ```cpp #include <stdio.h> #include <time.h> int main() { int sample_size = 500; xs *src = xs_tmp("foobarbar"); xs *prefix = xs_tmp("@I like "), *suffix = xs_tmp("!!!"); xs_concat(src, prefix, suffix); /* let string on heap */ xs dest[sample_size]; for (int i = 0; i < sample_size; i++) dest[i] = xs_literal_empty(); struct timespec t_start, t_end; for (int i = 1; i < (sample_size + 1); i++) { clock_gettime(CLOCK_MONOTONIC, &t_start); for (int n = 0; n < i; n++) { xs_cpy(&dest[n], src); } clock_gettime(CLOCK_MONOTONIC, &t_end); long long tt = (long long)(t_end.tv_sec * 1e9 + t_end.tv_nsec) - (long long)(t_start.tv_sec * 1e9 + t_start.tv_nsec); printf("%d %lld\n", i, tt); } return 0; } ``` 使用 `gnuplot` 繪製結果 ![](https://i.imgur.com/a2uNtpM.png) * xs_cpy 配合 CoW 明顯更快速 接著以不同長度的字串進行測試 ```cpp #include <stdio.h> #include <time.h> int main() { int max_length = 1000; xs *src = xs_tmp("f"); xs *prefix = xs_tmp(""), *suffix = xs_tmp("o"); xs dest = xs_literal_empty(); struct timespec t_start, t_end; for (int i = 1; i < (max_length + 1); i++) { clock_gettime(CLOCK_MONOTONIC, &t_start); xs_cpy(&dest, src); clock_gettime(CLOCK_MONOTONIC, &t_end); long long tt = (long long)(t_end.tv_sec * 1e9 + t_end.tv_nsec) - (long long)(t_start.tv_sec * 1e9 + t_start.tv_nsec); printf("%d %lld\n", i, tt); xs_concat(src, prefix, suffix); /* increase src length by 1 byte */ xs_free(&dest); } return 0; } ``` 使用 `gnuplot` 繪製結果 ![](https://i.imgur.com/zYEF91u.png) * xs_cpy w/o CoW 實質上是使用 memcpy 複製,因此會隨著字串長度線性上升 * xs_cpy w/ CoW 無論字串長度都只複製指標,因此呈現常數時間 ### 空間效益 以下列實驗程式碼進行實驗,將 `src` 的字串複製給總共 1000 項的 `dest` ```cpp int main() { int sample_size = 1000; xs *src = xs_tmp("foobarbar"); xs *prefix = xs_tmp("@I like "), *suffix = xs_tmp("!!!"); xs_concat(src, prefix, suffix); /* let string on heap */ xs dest[sample_size]; for (int i = 0; i < sample_size; i++) dest[i] = xs_literal_empty(); for (int i = 0; i < sample_size; i++) xs_cpy(&dest[i], src); for (int i = 0; i < sample_size; i++) xs_free(&dest[i]); return 0; } ``` 使用 `valgrind` 來分析 heap 的用量 ```shell $ valgrind --tool=massif --time-unit=B ./xs ``` 以 `ms_print massif.out.<pid>` 觀察結果 ``` xs_cpy w/o CoW -------------------------------------------------------------------------------- n time(B) total(B) useful-heap(B) extra-heap(B) stacks(B) -------------------------------------------------------------------------------- 31 40,040 40,040 32,032 8,008 0 xs_cpy w/ CoW -------------------------------------------------------------------------------- n time(B) total(B) useful-heap(B) extra-heap(B) stacks(B) -------------------------------------------------------------------------------- 1 56 56 36 20 0 ``` * CoW 可以節省非常大量的 heap 空間 * 以本例 (src 字串大小 20 bytes,複製 1000 次) 來說,heap 使用量差距約為 700 倍 ### locality of reference 使用以下程式碼進行實驗,第 15~18 行的目的是追加取用字串的次數 ```cpp= int main() { int sample_size = 1000; xs *src = xs_tmp("foobarbar"); xs *prefix = xs_tmp("@I like "), *suffix = xs_tmp("!!!"); xs_concat(src, prefix, suffix); /* let string on heap */ xs dest[sample_size]; for (int i = 0; i < sample_size; i++) dest[i] = xs_literal_empty(); for (int i = 0; i < sample_size; i++) xs_cpy(&dest[i], src); for (int i = 0; i < 20; i++) { for (int i = 0; i < sample_size; i++) printf("%s", xs_data(&dest[i])); } return 0; } ``` 用 [`perf`](http://wiki.csie.ncku.edu.tw/embedded/perf-tutorial) 進行分析 ``` $ sudo perf stat --repeat 5 -e cache-misses,cache-references,instructions,cycles ./xs_old ``` #### `xs_cpy` w/ CoW ``` Performance counter stats for './xs' (5 runs): 3,371 cache-misses # 7.211 % of all cache refs ( +- 9.76% ) 46,751 cache-references ( +- 11.18% ) 24,062,939 instructions # 1.78 insn per cycle # 0.00 stalled cycles per insn ( +- 0.04% ) 13,518,560 cycles ( +- 3.82% ) 0.006644 +- 0.000514 seconds time elapsed ( +- 7.73% ) ``` #### `xs_cpy` w/o CoW ``` Performance counter stats for './xs_old' (5 runs): 4,575 cache-misses # 8.177 % of all cache refs ( +- 12.67% ) 55,947 cache-references ( +- 4.58% ) 24,502,092 instructions # 1.85 insn per cycle # 0.00 stalled cycles per insn ( +- 0.02% ) 13,280,095 cycles ( +- 2.08% ) 0.006589 +- 0.000263 seconds time elapsed ( +- 3.99% ) ``` * CoW 版本的 cache-misses 比較少,但不顯著 * 這主要是因為,雖然在 CoW 版本下 `dest[i]->ptr` 位置都一樣,但是每個 `dest[i]` 仍是獨立的變數,因此系統較難善用 locality * 如果整個 xs 物件都用 CoW 的方式改寫,可能會大幅度改善,之後有空再考慮嘗試此作法 (當務之急是先寫其他作業) * 如果將第 15~18 行刪除,會發現 cache-misses 的差距就更不明顯了 ``` Performance counter stats for './xs' (5 runs): 525 cache-misses # 4.377 % of all cache refs ( +- 83.11% ) 11,990 cache-references ( +- 1.41% ) 706,297 instructions # 0.85 insn per cycle # 0.00 stalled cycles per insn ( +- 0.83% ) 826,928 cycles ( +- 2.04% ) 0.0007727 +- 0.0000104 seconds time elapsed ( +- 1.35% ) ``` ``` Performance counter stats for './xs_old' (5 runs): 616 cache-misses # 4.864 % of all cache refs ( +- 91.30% ) 12,669 cache-references ( +- 1.04% ) 1,152,005 instructions # 1.13 insn per cycle # 0.00 stalled cycles per insn ( +- 0.12% ) 1,017,068 cycles ( +- 1.49% ) 0.0008488 +- 0.0000158 seconds time elapsed ( +- 1.86% ) ``` ## 類似 strtok 功能的函式實作 * 首先從 [man strtok](http://man7.org/linux/man-pages/man3/strtok.3.html) 了解其運作結果 * 可以參考 [glibc 實作的程式碼](https://code.woboq.org/userspace/glibc/string/strtok_r.c.html#__strtok_r) 直接仿造 [glibc 實作的程式碼](https://code.woboq.org/userspace/glibc/string/strtok_r.c.html#__strtok_r) 來實作 ```cpp /* Reentrant xs string tokenizer */ char *xs_strtok_r(xs *x, const char *delim, char **save_ptr) { int src_flag = 0; char *s = NULL; char *end = NULL; if (x == NULL) { s = *save_ptr; } else { /* use the source x */ xs_cow(x); s = xs_data(x); src_flag = 1; } if (*s == '\0') { *save_ptr = s; return NULL; } /* Scan leading delimiters */ s += strspn(s, delim); if (*s == '\0') { *save_ptr = s; return NULL; } /* Find the end of the token */ end = s + strcspn(s, delim); if (*end == '\0') { *save_ptr = end; return s; } /* Terminate the token and make *SAVE_PTR point past it */ *end = '\0'; *save_ptr = end + 1; /* contents after the first tok is no longer accessible for x */ if (src_flag) { if (xs_is_ptr(x)) { x->size = (size_t) (end - xs_data(x)); } else { x->space_left = 15 - (size_t) (end - xs_data(x)); } } return s; } char *xs_strtok(xs *x, const char *delim) { static char *old; return xs_strtok_r(x, delim, &old); } ``` * 如同 glibc 的作法製作兩個版本的 api * [MT-Safe](https://www.gnu.org/software/libc/manual/html_node/POSIX-Safety-Concepts.html) 的 `xs_strtok_r` * MT-Unsafe 的 `xs_strtok` * CoW 的部分會在開頭就使用 `xs_cow` 處理 * `xs_strtok` 會將字串中符合 `delim` 的部分以 `'\0'` 替換,因此原先字串的內容從第一個 tok 以後的部分會被 `'\0'` 截斷 * 所以第一次使用 `xs_strtok` 時 (也就是代入的參數有 `xs` 物件),會將 `src_flag` 設置為 1 並且更新 `xs` 中紀錄的字串長度 ### 檢驗結果 以下列程式碼進行檢驗 ```cpp int main() { printf("---original---\n"); printf("---after xs_cpy(dest, src)---\n"); xs *src = xs_new(&xs_literal_empty(), "|foo|bar|||bar|bar!!!|||"); xs *dest = xs_tmp("original"); xs_cpy(dest, src); printf("src: [%s] size: %2zu\n", xs_data(src), xs_size(src)); printf("dest: [%s] size: %2zu\n", xs_data(dest), xs_size(dest)); printf("src refcnt: %d dest refcnt: %d\n", REFCNT_NUM(src->ptr), REFCNT_NUM(dest->ptr)); printf("src: %p\ndest: %p\n", src->ptr, dest->ptr); printf("---after xs_strtok(dest, \"|\")---\n"); printf("tok: %s\n", xs_strtok(dest, "|")); printf("src: [%s] size: %2zu\n", xs_data(src), xs_size(src)); printf("dest: [%s] size: %2zu\n", xs_data(dest), xs_size(dest)); printf("src refcnt: %d dest refcnt: %d\n", REFCNT_NUM(src->ptr), REFCNT_NUM(dest->ptr)); printf("src: %p\ndest: %p\n", src->ptr, dest->ptr); for (int i = 0; i < 5; i++) { printf("---after xs_strtok(NULL, \"|\")---\n"); printf("tok: %s\n", xs_strtok(NULL, "|")); printf("dest: [%s] size: %2zu\n", xs_data(dest), xs_size(dest)); } return 0; } ``` 結果正確 ``` ---original--- ---after xs_cpy(dest, src)--- src: [|foo|bar|||bar|bar!!!|||] size: 24 dest: [|foo|bar|||bar|bar!!!|||] size: 24 src refcnt: 2 dest refcnt: 2 src: 0x558e6d756674 dest: 0x558e6d756674 ---after xs_strtok(dest, "|")--- tok: foo src: [|foo|bar|||bar|bar!!!|||] size: 24 dest: [|foo] size: 4 src refcnt: 1 dest refcnt: 1 src: 0x558e6d756674 dest: 0x558e6d756884 ---after xs_strtok(NULL, "|")--- tok: bar dest: [|foo] size: 4 ---after xs_strtok(NULL, "|")--- tok: bar dest: [|foo] size: 4 ---after xs_strtok(NULL, "|")--- tok: bar!!! dest: [|foo] size: 4 ---after xs_strtok(NULL, "|")--- tok: (null) dest: [|foo] size: 4 ---after xs_strtok(NULL, "|")--- tok: (null) dest: [|foo] size: 4 ``` ## 在 Linux 核心原始程式碼中,找出類似手法的 SSO 或 CoW 案例並解說 (待補) 之前閱讀 CS:APP 第 10 章就有提到類似的用法,打算從這裡進行探討 ![](https://i.imgur.com/t5GW0Wj.png) ###### tags: `linux 2020`

    Import from clipboard

    Paste your markdown or webpage here...

    Advanced permission required

    Your current role can only read. Ask the system administrator to acquire write and comment permission.

    This team is disabled

    Sorry, this team is disabled. You can't edit this note.

    This note is locked

    Sorry, only owner can edit this note.

    Reach the limit

    Sorry, you've reached the max length this note can be.
    Please reduce the content or divide it to more notes, thank you!

    Import from Gist

    Import from Snippet

    or

    Export to Snippet

    Are you sure?

    Do you really want to delete this note?
    All users will lose their connection.

    Create a note from template

    Create a note from template

    Oops...
    This template has been removed or transferred.
    Upgrade
    All
    • All
    • Team
    No template.

    Create a template

    Upgrade

    Delete template

    Do you really want to delete this template?
    Turn this template into a regular note and keep its content, versions, and comments.

    This page need refresh

    You have an incompatible client version.
    Refresh to update.
    New version available!
    See releases notes here
    Refresh to enjoy new features.
    Your user state has changed.
    Refresh to load new user state.

    Sign in

    Forgot password

    or

    By clicking below, you agree to our terms of service.

    Sign in via Facebook Sign in via Twitter Sign in via GitHub Sign in via Dropbox Sign in with Wallet
    Wallet ( )
    Connect another wallet

    New to HackMD? Sign up

    Help

    • English
    • 中文
    • Français
    • Deutsch
    • 日本語
    • Español
    • Català
    • Ελληνικά
    • Português
    • italiano
    • Türkçe
    • Русский
    • Nederlands
    • hrvatski jezik
    • język polski
    • Українська
    • हिन्दी
    • svenska
    • Esperanto
    • dansk

    Documents

    Help & Tutorial

    How to use Book mode

    Slide Example

    API Docs

    Edit in VSCode

    Install browser extension

    Contacts

    Feedback

    Discord

    Send us email

    Resources

    Releases

    Pricing

    Blog

    Policy

    Terms

    Privacy

    Cheatsheet

    Syntax Example Reference
    # Header Header 基本排版
    - Unordered List
    • Unordered List
    1. Ordered List
    1. Ordered List
    - [ ] Todo List
    • Todo List
    > Blockquote
    Blockquote
    **Bold font** Bold font
    *Italics font* Italics font
    ~~Strikethrough~~ Strikethrough
    19^th^ 19th
    H~2~O H2O
    ++Inserted text++ Inserted text
    ==Marked text== Marked text
    [link text](https:// "title") Link
    ![image alt](https:// "title") Image
    `Code` Code 在筆記中貼入程式碼
    ```javascript
    var i = 0;
    ```
    var i = 0;
    :smile: :smile: Emoji list
    {%youtube youtube_id %} Externals
    $L^aT_eX$ LaTeX
    :::info
    This is a alert area.
    :::

    This is a alert area.

    Versions and GitHub Sync
    Get Full History Access

    • Edit version name
    • Delete

    revision author avatar     named on  

    More Less

    Note content is identical to the latest version.
    Compare
      Choose a version
      No search result
      Version not found
    Sign in to link this note to GitHub
    Learn more
    This note is not linked with GitHub
     

    Feedback

    Submission failed, please try again

    Thanks for your support.

    On a scale of 0-10, how likely is it that you would recommend HackMD to your friends, family or business associates?

    Please give us some advice and help us improve HackMD.

     

    Thanks for your feedback

    Remove version name

    Do you want to remove this version name and description?

    Transfer ownership

    Transfer to
      Warning: is a public team. If you transfer note to this team, everyone on the web can find and read this note.

        Link with GitHub

        Please authorize HackMD on GitHub
        • Please sign in to GitHub and install the HackMD app on your GitHub repo.
        • HackMD links with GitHub through a GitHub App. You can choose which repo to install our App.
        Learn more  Sign in to GitHub

        Push the note to GitHub Push to GitHub Pull a file from GitHub

          Authorize again
         

        Choose which file to push to

        Select repo
        Refresh Authorize more repos
        Select branch
        Select file
        Select branch
        Choose version(s) to push
        • Save a new version and push
        • Choose from existing versions
        Include title and tags
        Available push count

        Pull from GitHub

         
        File from GitHub
        File from HackMD

        GitHub Link Settings

        File linked

        Linked by
        File path
        Last synced branch
        Available push count

        Danger Zone

        Unlink
        You will no longer receive notification when GitHub file changes after unlink.

        Syncing

        Push failed

        Push successfully