# 2021q1 Homework3 (quiz3) contributed by < `jonathanyang0227` > :::danger 注意作業規範:登記在作業區的網址,應為「瀏覽模式」下對應的發表網址 :notes: jserv ::: >已更改謝謝老師 # Week 3 - [ ] 並行和多執行緒程式設計 - [ ] C 語言: 函式呼叫 - [ ] C 語言: 遞迴呼叫 - [ ] C 語言: 前置處理器應用 - [ ] C 語言: goto 和流程控制 - [ ] C 語言程式設計技巧 ## 作業要求 - [ ] 1. 解釋上述程式碼運作原理,並提出改進方案; - [ ] 2. 以上述程式碼為基礎,提供字串複製 (應考慮到 CoW) 的函式實作; - [ ] 3. 設計實驗,確認上述字串處理最佳化 (針對空間效率) 帶來的效益,應考慮到 locality of reference,善用 GDB 追蹤和分析,確認字串的確符合預期地配置於 stack 或 heap; - [ ] 4. 嘗試將 quiz2 提到的 string interning 機制整合,提出對空間高度最佳化的字串處理工具函式庫 - [ ] 5. 在 Linux 核心原始程式碼中,找出類似手法的 SSO 或 CoW 案例並解說; ## 解釋程式碼運作原理 ### 前言 此題目為仿效 [folly::fbstring](https://github.com/facebook/folly/blob/master/folly/docs/FBString.md) 對字串進行 SSO (small string optimization) 和 CoW (copy on write) 這兩種最佳化手法,而 `folly::fbstring` 概念為將字串依據長度分為短字串、中字串跟長字串: 1. 當字串長度小於等於 23 時,會使用堆疊 (stack) 上的空間來保存字串。此時被看作「短字串」; 2. 當字串長度介於 24 和 254 (含) 之間,視作「中等長度的字串」,採用動態配置記憶體; 3. 當字串長度大於 255 時,視作「長字串」,會透過 CoW 手段進行最佳化:進行字串複製操作時,倘若字串本身沒有修改,就會共享記憶體空間,從而減少記憶體佔用 ### 結構定義 ```c typedef union { /* allow strings up to 15 bytes to stay on the stack * use the last byte as a null terminator and to store flags * much like fbstring: * https://github.com/facebook/folly/blob/master/folly/docs/FBString.md */ char data[16]; struct { uint8_t filler[15], /* how many free bytes in this stack allocated string * same idea as fbstring */ space_left : 4, /* if it is on heap, set to 1 */ is_ptr : 1, is_large_string : 1, flag2 : 1, flag3 : 1; }; /* heap allocated */ struct { char *ptr; /* supports strings up to 2^MAX_STR_LEN_BITS - 1 bytes */ size_t size : MAX_STR_LEN_BITS, /* capacity is always a power of 2 (unsigned)-1 */ capacity : 6; /* the last 4 bits are important flags */ }; } xs; ``` `xs` 為效仿 `folly::fbstring` 的結構,其原理也為將字串依據長度分類處理 * 若字串不超過15位元組,存放在 stack 內 ,其中: * 儲存在 filter 中, `sace_left` 則是 15 - filter 內位元組 的數量。 * * 有 4 個位元沒有定義,是為了避免覆寫另一結構成員: `is_ptr`, `flag1`, `flag 2` 與 `flag3`。 * 若字串長度大於或等於 16 個位元組,則放在 heap (透過 malloc 系統呼叫配置所需字串大小) * 長度以 size 表示,範圍為 0~2^54^-1 * 分配大小以 capacity 表示 (power of two) * 若字串大於 `LARGE_STRING_LE` 則為長字串 ### xs_is_ptr ```c static inline bool xs_is_ptr(const xs *x) { return x->is_ptr; } ``` 此函式匯回傳 `is_ptr` 而當字串存放到 heap 時, `is_ptr` 會為 1,固可判斷為畔對是否存放在 heap ### xs_is_large_string ```c static inline bool xs_is_large_string(const xs *x) { return x->is_large_string; } ``` 判斷字串是否為長字串(大於 `LARGE_STRING_LEN` ) ### xs_size ```c static inline size_t xs_size(const xs *x) { return xs_is_ptr(x) ? x->size : 15 - x->space_left; } ``` 若為長字串,回傳 `x->size`,則回傳 15-`x->space_left` ( `x->space_left` = 15 - 短字串長度) ### xs_data ```c static inline char *xs_data(const xs *x) { if (!xs_is_ptr(x)) return (char *) x->data; if (xs_is_large_string(x)) return (char *) (x->ptr + OFF); return (char *) x->ptr; } ``` ### xs_capacity ```c static inline size_t xs_capacity(const xs *x) { return xs_is_ptr(x) ? ((size_t) 1 << x->capacity) - 1 : 15; } ``` 回傳字串容量,若為長字串則回傳 2^capacity^-1 ### xs_allocate_data ```c static void xs_allocate_data(xs *x, size_t len, bool reallocate) { /* Medium string */ if (len < LARGE_STRING_LEN) { x->ptr = reallocate ? realloc(x->ptr, (size_t) 1 << x->capacity) : malloc((size_t) 1 << x->capacity); return; } /* Large string */ x->is_large_string = 1; /* The extra 4 bytes are used to store the reference count */ x->ptr = reallocate ? realloc(x->ptr, (size_t)(1 << x->capacity) + 4) : malloc((size_t)(1 << x->capacity) + 4); xs_set_refcnt(x, 1); } ``` :::info **realloc(void *ptr, size_t size)**:重新配置記憶體,同時維持內容不變 ::: `xs_allocate_data` 針對中長字串做處理,如果為長字串,則多配置 4 byte 儲存 reference count ### xs_set_refcnt ```c static inline void xs_set_refcnt(const xs *x, int val) { *((int *) ((size_t) x->ptr)) = val; } ``` 設定 reference count 為 val :::info 確認現在字串被哪些參照 ::: :::success TODO atomic operation atomic_fetch_add/sub 有對值做操作都要加 ::: ### xs_inc_refcnt ```c static inline void xs_inc_refcnt(const xs *x) { if (xs_is_large_string(x)) ++(*(int *) ((size_t) x->ptr)); } ``` 增加 reference count ### xs_dec_refcnt ```c static inline int xs_dec_refcnt(const xs *x) { if (!xs_is_large_string(x)) return 0; return --(*(int *) ((size_t) x->ptr)); } ``` 減少 reference count ### xs_get_refcnt ```c static inline int xs_get_refcnt(const xs *x) { if (!xs_is_large_string(x)) return 0; return *(int *) ((size_t) x->ptr); } ``` 取得 reference count 的值 ### ilog2 ```c static inline int ilog2(uint32_t n) { return LLL; } ``` :::info **__builtin_clz (unsigned int x)** :返回字左起第一個1以前0的個數,即最高位的位數 ::: 從 `xs_new` 中 `x->capacity = ilog2(len) + 1` 可看出 `ilgo2()` 應為設定字串大小的函式 因此應該為 `32-(__builtin_clz(n)+1)` (因為需要大於最高位,故需+1) ### xs_new ```c xs *xs_new(xs *x, const void *p) { *x = xs_literal_empty(); size_t len = strlen(p) + 1; if (len > NNN) { x->capacity = ilog2(len) + 1; x->size = len - 1; x->is_ptr = true; xs_allocate_data(x, x->size, 0); memcpy(xs_data(x), p, len); } else { memcpy(x->data, p, len); x->space_left = 15 - (len - 1); } return x; } ``` 此寒視為新增一個 xs 的結構 從 `x->is_ptr = true;` 可知,在 len>NNN 的情況下,要儲存在 heap 中,故 len 要大於16,`NNN = 16` ### xs_tmp ```c /* Memory leaks happen if the string is too long but it is still useful for * short strings. */ #define xs_tmp(x) \ ((void) ((struct { \ _Static_assert(sizeof(x) <= MAX_STR_LEN, "it is too big"); \ int dummy; \ }){1}), \ xs_new(&xs_literal_empty(), x)) ``` :::info **_Static_assert ( expression , message )** : 若 expression =0,會產生 compile-time error 並顯示 massage **編譯時就會做確認** ::: 此處為測試是否超過 `MAX_STR_LEN` ,若沒有則建立一個新的字串 ### xs_grow ```c /* grow up to specified size */ xs *xs_grow(xs *x, size_t len) { char buf[16]; if (len <= xs_capacity(x)) return x; /* Backup first */ if (!xs_is_ptr(x)) memcpy(buf, x->data, 16); x->is_ptr = true; x->capacity = ilog2(len) + 1; if (xs_is_ptr(x)) { xs_allocate_data(x, len, 1); } else { xs_allocate_data(x, len, 0); memcpy(xs_data(x), buf, 16); } return x; } ``` 此函式用於擴張字串大小,若原為短字串,則須先複製內容到 buf,因為原本的位置會被拿來儲存 heap 指標 ### xs_newempty ```c static inline xs *xs_newempty(xs *x) { *x = xs_literal_empty(); return x; } ``` ```c #define xs_literal_empty() \ (xs) { .space_left = 15 } ``` 新增一個字串,內容只有 space_left =15 ### xs_free ```c static inline xs *xs_free(xs *x) { if (xs_is_ptr(x) && xs_dec_refcnt(x) <= 0) free(x->ptr); return xs_newempty(x); } ``` 清空字串所有內容,並改成只有 space_left=15 ### xs_cow_lazy_copy ```c static bool xs_cow_lazy_copy(xs *x, char **data) { if (xs_get_refcnt(x) <= 1) return false; /* Lazy copy */ xs_dec_refcnt(x); xs_allocate_data(x, x->size, 0); if (data) { memcpy(xs_data(x), *data, x->size); /* Update the newly allocated pointer */ *data = xs_data(x); } return true; } ``` 此函式應為 Copy On Write 的實作,將 x 的內容複製到 data :::warning 仍在理解為什麼 `xs_get_refcnt(x) <= 1` 的情況不用複製 ::: ### xs_concat ```c xs *xs_concat(xs *string, const xs *prefix, const xs *suffix) { size_t pres = xs_size(prefix), sufs = xs_size(suffix), size = xs_size(string), capacity = xs_capacity(string); char *pre = xs_data(prefix), *suf = xs_data(suffix), *data = xs_data(string); xs_cow_lazy_copy(string, &data); if (size + pres + sufs <= capacity) { memmove(data + pres, data, size); memcpy(data, pre, pres); memcpy(data + pres + size, suf, sufs + 1); if (xs_is_ptr(string)) string->size = size + pres + sufs; else string->space_left = 15 - (size + pres + sufs); } else { xs tmps = xs_literal_empty(); xs_grow(&tmps, size + pres + sufs); char *tmpdata = xs_data(&tmps); memcpy(tmpdata + pres, data, size); memcpy(tmpdata, pre, pres); memcpy(tmpdata + pres + size, suf, sufs + 1); xs_free(string); *string = tmps; string->size = size + pres + sufs; } return string; } ``` 從 `main` 判斷,此函示應為將 prefix 串接至 string 前, suffix 串接到 string 後 ```c if (size + pres + sufs <= capacity) { memmove(data + pres, data, size); memcpy(data, pre, pres); memcpy(data + pres + size, suf, sufs + 1); if (xs_is_ptr(string)) string->size = size + pres + sufs; else string->space_left = 15 - (size + pres + sufs); } ``` 若串接後沒有超過 capacity ,則直接複製並修改 size即可 ```c else { xs tmps = xs_literal_empty(); xs_grow(&tmps, size + pres + sufs); char *tmpdata = xs_data(&tmps); memcpy(tmpdata + pres, data, size); memcpy(tmpdata, pre, pres); memcpy(tmpdata + pres + size, suf, sufs + 1); xs_free(string); *string = tmps; string->size = size + pres + sufs; } ``` 若超過 capacity,則需使用 `xs_grow()` 擴增容量 ### xs_trim ```c xs *xs_trim(xs *x, const char *trimset) { if (!trimset[0]) return x; char *dataptr = xs_data(x), *orig = dataptr; if (xs_cow_lazy_copy(x, &dataptr)) orig = dataptr; /* similar to strspn/strpbrk but it operates on binary data */ uint8_t mask[32] = {0}; #define check_bit(byte) (CCC) #define set_bit(byte) (SSS) size_t i, slen = xs_size(x), trimlen = strlen(trimset); for (i = 0; i < trimlen; i++) set_bit(trimset[i]); for (i = 0; i < slen; i++) if (!check_bit(dataptr[i])) break; for (; slen > 0; slen--) if (!check_bit(dataptr[slen - 1])) break; dataptr += i; slen -= i; /* reserved space as a buffer on the heap. * Do not reallocate immediately. Instead, reuse it as possible. * Do not shrink to in place if < 16 bytes. */ memmove(orig, dataptr, slen); /* do not dirty memory unless it is needed */ if (orig[slen]) orig[slen] = 0; if (xs_is_ptr(x)) x->size = slen; else x->space_left = 15 - slen; return x; #undef check_bit #undef set_bit } ``` TODO