2021q1 Homework3 (quiz3)

# 2021q1 Homework3 (quiz3) contributed by < `gyes00205` > ###### tags: `linux2021` > [作業要求](https://hackmd.io/@sysprog/ryCRU-Hmu) ## folly::fbstring ### 結構定義 [folly::fbstring](https://github.com/facebook/folly/blob/master/folly/FBString.h) 實作主要手法透過下方的 union: ```cpp union { struct { char *data; size_t size; size_t capacity; } heap; struct { char data[23]; int8_t length; // align 1 } stack; }; ``` union 的記憶體空間為 24 bytes，1 個 char 為 1 byte 因此 `data[]` 為 23 bytes ， `int_8` 為 1 byte 。短字串就是保存在 `char data[23]` ，而中長字串則透過 `data` 這個指標和 malloc 的使用，保存在 heap 上。 ## 仿效 folly::fbstring `xs.c` ### 結構定義 ```cpp typedef union { /* allow strings up to 15 bytes to stay on the stack * use the last byte as a null terminator and to store flags * much like fbstring: * https://github.com/facebook/folly/blob/master/folly/docs/FBString.md */ char data[16]; struct { uint8_t filler[15], /* how many free bytes in this stack allocated string * same idea as fbstring */ space_left : 4, /* if it is on heap, set to 1 */ is_ptr : 1, flag1 : 1, flag2 : 1, flag3 : 1; }; /* heap allocated */ struct { char *ptr; /* supports strings up to 2^54 - 1 bytes */ size_t size : 54, /* capacity is always a power of 2 (unsigned)-1 */ capacity : 6; /* the last 4 bits are important flags */ }; } xs; ``` union 16 個 bytes 。這裡的 `space_left : 4` 透過 bit field 指定 4 bits，其他用這種表示法的單位都是 bit 。 * 字串長度小於或等於 15 個位元組，則放在 stack。 * 字串長度大於或等於 16 個位元組，則放在 heap (透過 malloc 系統呼叫配置所需字串大小) * heap struct * ptr: 8 個位元組指標 (64 位元架構: x86_64/aarch64 等等)。 * size: 字串長度。因定義 54 位元，故最大字串長度為 2^54^ − 1 位元組。 * capacity: 從 heap 配置的空間大小，其單位為 2capacity，故用 6 個位元即可涵蓋 size 長度。 * 有 4 個位元沒有定義，是為了避免覆寫另一結構成員: is_ptr, flag1, flag 2 與 flag3。 * 短字串(字串長度小於或等於 15 )。 * 中字串(字串長度大於或等於 16 且小於 `LARGE_STRING_LEN`) * 長字串(字串長度大於或等於 `LARGE_STRING_LEN` ) :::warning TODO: 透過 GDB 確認記憶體佈局是否符合程式碼 :notes: jserv 編譯 xs.c > $ gcc -o xs xs.c -O0 -g 確認 `xs` 記憶體空間為 16 bytes > $ gdb xs > (gdb) p sizeof(xs) > $1 = 16 其餘部分還在研究 gdb command 的用法 ::: ### `xs_is_ptr` 判斷是否有配置記憶體在 heap (字串長度大於或等於 16 才有配置記憶體)。如果有， `is_ptr` 為 1，則回傳 `True` ，反之則回傳 `False` 。 ```cpp static inline bool xs_is_ptr(const xs *x) { return x->is_ptr; } ``` ### `xs_is_large_string` 判斷是否為長字串 (字串長度大於或等於 `LARGE_STRING_LEN` )。 `is_large_string` 為 1 ，則回傳 `True` 。 ```cpp static inline bool xs_is_large_string(const xs *x) { return x->is_large_string; } ``` ### `xs_size` 如果有配置記憶體在 heap ，回傳 `x->size` ，因為字串長度紀錄在 `size` 。如果是短字串，則回傳 `15 - x->space_left` 。因為 `space_left` 等於 15 - 短字串長度，所以 15 - (15 - 短字串長度)等於短字串長度 ```cpp static inline size_t xs_size(const xs *x) { return xs_is_ptr(x) ? x->size : 15 - x->space_left; } ``` ### `xs_data` 如果 `x` 為短字串，則回傳 `x->data` 。如果是長字串(字串長度大於或等於 `LARGE_STRING_LEN`) ，回傳 `x->ptr + 4` ，因為長字串的前 4 個 bytes 用來保存 reference count (在 xs_allocate_data 有解釋)。如果是中字串，回傳 `x->ptr` 。 ```cpp static inline char *xs_data(const xs *x) { if (!xs_is_ptr(x)) return (char *) x->data; if (xs_is_large_string(x)) return (char *) (x->ptr + 4); return (char *) x->ptr; } ``` ### `xs_capacity` 回傳該字串所能保存的最大長度。中長字串 `((size_t) 1 << x->capacity) - 1 ` ，短字串 15 。 ```cpp static inline size_t xs_capacity(const xs *x) { return xs_is_ptr(x) ? ((size_t) 1 << x->capacity) - 1 : 15; } ``` ### reference count :::danger 應該先討論 reference counting 的設計考量 :notes: jserv > reference count 紀錄有多少字串共用同一個記憶體空間。 ::: 對長字串的 reference count 運算， `x->ptr` 原本的資料型態為 8 bytes 的 `char *` ，這裡將它轉型成 8 bytes 的 `int *` 並進行運算。 :::danger 為什麼？ :notes: jserv > 從 `xs_set_refcnt(const xs *x, int val)` 我們可以知道，函式的功能是想將 integer type 的 `val` assign 給 `*(x->ptr)` ，但是 `*(x->ptr)` 的 type 為 `char` (1 byte)， `val` 的 type 為 `int` (4 bytes) ，如果強行將 `val` assign 給 `*(x->ptr)` 勢必造成 overflow 。 > 不過對於長字串來說， `x->ptr` 的前 4 個 bytes ，就是設計來保存 reference count ，因此我們只須先將 `x->ptr` 強制轉型成 `int *` ，再取 dereference 便能將 `val` assign 給 `*(x->ptr)` 。 ::: ```cpp static inline void xs_set_refcnt(const xs *x, int val) { *((int *) ((size_t) x->ptr)) = val; } static inline void xs_inc_refcnt(const xs *x) { if (xs_is_large_string(x)) ++(*(int *) ((size_t) x->ptr)); } static inline int xs_dec_refcnt(const xs *x) { if (!xs_is_large_string(x)) return 0; return --(*(int *) ((size_t) x->ptr)); } static inline int xs_get_refcnt(const xs *x) { if (!xs_is_large_string(x)) return 0; return *(int *) ((size_t) x->ptr); } ``` ### `xs_literal_empty` 將 `space_left` 初始為 15，即短字串長度為 0 。 ```cpp #define xs_literal_empty() \ (xs) { .space_left = 15 } ``` ### `ilog2` ```cpp /* lowerbound (floor log2) */ static inline int ilog2(uint32_t n) { return 32 - __builtin_clz(n) - 1; } ``` `uint32_t` 所能表示最大正整數為 $2^{32} - 1$，因此 lowerbound 最大為 31。所以答案為 `32 - __builtin_clz(n) - 1` 。以 $n$ $=$ $32$ $=$ $100000_2$ 為例: * `__builtin_clz(n) = 26` * 所以 $32 - 26 - 1 = 5$ ### `xs_allocate_data` 分配記憶體給中字串和長字串，其中長字串需要額外配置 4 bytes 的空間給 reference count 。 ```cpp static void xs_allocate_data(xs *x, size_t len, bool reallocate) { /* Medium string */ if (len < LARGE_STRING_LEN) { x->ptr = reallocate ? realloc(x->ptr, (size_t) 1 << x->capacity) : malloc((size_t) 1 << x->capacity); return; } /* Large string */ x->is_large_string = 1; /* The extra 4 bytes are used to store the reference count */ x->ptr = reallocate ? realloc(x->ptr, (size_t)(1 << x->capacity) + 4) : malloc((size_t)(1 << x->capacity) + 4); xs_set_refcnt(x, 1); } ``` ### `xs_new` 如果為中長字串，就需要分配記憶體給 heap ， `len > 16` 判斷是否為中長字串。如果是短字串的話 `strlen(p)` 最多為 15 ，因此不會大於 16 。 ```cpp xs *xs_new(xs *x, const void *p) { *x = xs_literal_empty(); size_t len = strlen(p) + 1; if (len > 16) { x->capacity = ilog2(len) + 1; x->size = len - 1; x->is_ptr = true; xs_allocate_data(x, x->size, 0); memcpy(xs_data(x), p, len); } else { memcpy(x->data, p, len); x->space_left = 15 - (len - 1); } return x; } ``` ### xs_tmp 如果字串長度大於 `MAX_STR_LEN` 回傳 "it is too big" 。如果沒有，使用 `xs_new` 並依據 `x` 的長度，判斷是否需要配置記憶體在 heap 。 ```cpp #define xs_tmp(x) \ ((void) ((struct { \ _Static_assert(sizeof(x) <= MAX_STR_LEN, "it is too big"); \ int dummy; \ }){1}), \ xs_new(&xs_literal_empty(), x)) ``` ### `xs_grow` 增加 `x` 的保存空間，因此先判斷 `len <= xs_capacity(x)` ，如果足夠則直接回傳。 ```cpp= /* grow up to specified size */ xs *xs_grow(xs *x, size_t len) { char buf[16]; if (len <= xs_capacity(x)) return x; /* Backup first */ if (!xs_is_ptr(x)) memcpy(buf, x->data, 16); x->is_ptr = true; x->capacity = ilog2(len) + 1; if (xs_is_ptr(x)) { xs_allocate_data(x, len, 1); } else { xs_allocate_data(x, len, 0); memcpy(xs_data(x), buf, 16); } return x; } ``` :::warning 在第 13 行將 `x->is_ptr` 設為 `true` ，那第 16 行 `if (xs_is_ptr(x))` 不就永遠都成立，這樣不就永遠執行不到 `else` 的部份嗎? > 路見不平，拿 patch 來填！ > :notes: jserrv > > 修改方式如下：參考同學 [ccs100203](https://hackmd.io/@cccccs100203/linux2021-quiz3) > > 將 `x->is_ptr = true;` 移至 else 下方，確保原本的短字串內容能成功複製到 heap 上。而 `x->capacity = ilog2(len) + 1;` 則在原本位置，因為在 `xs_allocate_data` 中需要利用更新後的 `x->capacity` 分配記憶體空間。 ::: ```cpp= /* grow up to specified size */ xs *xs_grow(xs *x, size_t len) { char buf[16]; if (len <= xs_capacity(x)) return x; /* Backup first */ if (!xs_is_ptr(x)) memcpy(buf, x->data, 16); x->capacity = ilog2(len) + 1; if (xs_is_ptr(x)) { xs_allocate_data(x, len, 1); } else { xs_allocate_data(x, len, 0); memcpy(xs_data(x), buf, 16); } x->is_ptr = true; return x; } ``` ### `xs_newempty` 將 `x->space_left` 設為 15 。 ```cpp static inline xs *xs_newempty(xs *x) { *x = xs_literal_empty(); return x; } ``` ### `xs_free` 如果 `x` 為長字串且 reference count 小於等於 1 (因為 `xs_dec_refcnt` 會將 reference count 減 1 ，所以當 reference count 等於 1 就可以釋放記憶體) ，釋放 `x->ptr` 的記憶體空間。 ```cpp static inline xs *xs_free(xs *x) { if (xs_is_ptr(x) && xs_dec_refcnt(x) <= 0) free(x->ptr); return xs_newempty(x); } ``` ### `xs_cow_lazy_copy` 如果 reference count 小於等於 1 ，就沒必要使用 CoW 。 ```cpp static bool xs_cow_lazy_copy(xs *x, char **data) { if (xs_get_refcnt(x) <= 1) return false; /* Lazy copy */ xs_dec_refcnt(x); xs_allocate_data(x, x->size, 0); if (data) { memcpy(xs_data(x), *data, x->size); /* Update the newly allocated pointer */ *data = xs_data(x); } return true; } ``` ### `xs_trim` `if (!trimset[0])` 首先判斷分割符號是否存在。 `mask` 透過 `trimset[i]` 的前 5 個 bit 當作 index ，後 3 個 bit 決定將 1 設在哪個位置。以換行符號 `\n` $=$ $10$ $=$ $00001010_2$ 為例: `mask[1] |= 00000100'b` 第二個和第三個 for 迴圈分別從頭和尾巴檢查是否符合 `trimset` 如果符合，則 `check_bit` 的結果會大於 1 ，如果不符合，則結果等於 0 。 ```cpp xs *xs_trim(xs *x, const char *trimset) { if (!trimset[0]) return x; char *dataptr = xs_data(x), *orig = dataptr; if (xs_cow_lazy_copy(x, &dataptr)) orig = dataptr; /* similar to strspn/strpbrk but it operates on binary data */ uint8_t mask[32] = {0}; #define check_bit(byte) (mask[(uint8_t) byte / 8] & 1 << (uint8_t) byte % 8) #define set_bit(byte) (mask[(uint8_t) byte / 8] |= 1 << (uint8_t) byte % 8) size_t i, slen = xs_size(x), trimlen = strlen(trimset); for (i = 0; i < trimlen; i++) set_bit(trimset[i]); for (i = 0; i < slen; i++) if (!check_bit(dataptr[i])) break; for (; slen > 0; slen--) if (!check_bit(dataptr[slen - 1])) break; dataptr += i; slen -= i; /* reserved space as a buffer on the heap. * Do not reallocate immediately. Instead, reuse it as possible. * Do not shrink to in place if < 16 bytes. */ memmove(orig, dataptr, slen); /* do not dirty memory unless it is needed */ if (orig[slen]) orig[slen] = 0; if (xs_is_ptr(x)) x->size = slen; else x->space_left = 15 - slen; return x; #undef check_bit #undef set_bit } ``` :::warning 有一種情況是 `x` 原本為中長字串，在做完 `x_trim` 後 `slen < 16` ，但是依然使用到 heap 的記憶體空間。不過註解中有提到 `Do not shrink to in place if < 16 bytes.` ::: ### `xs_concat` * 首先使用 `xs_cow_lazy_data` 複製新的 `xs *string` 。 * 如果總長度小於等於 `capacity` 不需要用 `xs_grow` * 如果總長度超過 `capacity` ，先宣告 `tmps` 使其為短字串，並透過 `xs_grow` 增加 `tmps->capacity`。而 `tmpdata` 為 `tmps->ptr` (如果為中長字串) 或 `tmps->data` (如果為短字串)。最後釋放 `string` 的記憶體空間，並將 `tmps` assign 給 `string` 。 * 其中 `data` 的移動方式如下:參考同學 [ccs100203](https://hackmd.io/@cccccs100203/linux2021-quiz3) `data` 移到 `data + pres` 的位置，空出前面的空間給 `prefix` 。 `data` 後方 `sufs + 1` 的長度用來保存 `suffix` (+ 1 的部分為 '\0')。 ```graphviz digraph { rankdir=LR block2 [shape="record", label="{prefix|data|suffix}"]; block1 [shape="record", label="{____|data|____}"]; block [shape="record", label="{data}"]; block -> block1 -> block2 } ``` ```cpp xs *xs_concat(xs *string, const xs *prefix, const xs *suffix) { size_t pres = xs_size(prefix), sufs = xs_size(suffix), size = xs_size(string), capacity = xs_capacity(string); char *pre = xs_data(prefix), *suf = xs_data(suffix), *data = xs_data(string); xs_cow_lazy_copy(string, &data); if (size + pres + sufs <= capacity) { memmove(data + pres, data, size); memcpy(data, pre, pres); memcpy(data + pres + size, suf, sufs + 1); if (xs_is_ptr(string)) string->size = size + pres + sufs; else string->space_left = 15 - (size + pres + sufs); } else { xs tmps = xs_literal_empty(); xs_grow(&tmps, size + pres + sufs); char *tmpdata = xs_data(&tmps); memcpy(tmpdata + pres, data, size); memcpy(tmpdata, pre, pres); memcpy(tmpdata + pres + size, suf, sufs + 1); xs_free(string); *string = tmps; string->size = size + pres + sufs; } return string; } ```