2021q1 Homework3 (quiz3)

contributed by < ccs100203 >

tags: `linux2021`

第 3 週測驗題

測驗 1

仿效 folly::fbstring，提出 C 語言的實作:

程式解說

xs string 結構

看出是以 16 byte 設為一個 string 的大小，
短字串為長度小於 16 的字串，並存放在 stack。
中字串在 16 ~ 256 之間，並利用動態記憶體配置在 heap。
長字串為長度大於 256 的字串，一樣存放在 heap，但多了 CoW 的機制。
LARGE_STRING_LEN 定義了中字串跟長字串的長度閥值。



























#define MAX_STR_LEN_BITS (54)
#define MAX_STR_LEN ((1UL << MAX_STR_LEN_BITS) - 1)

#define LARGE_STRING_LEN 256
typedef union {
    char data[16];

    struct {
        uint8_t filler[15],
            /* how many free bytes in this stack allocated string
             * same idea as fbstring
             */
            space_left : 4,
            /* if it is on heap, set to 1 */
            is_ptr : 1, is_large_string : 1, flag2 : 1, flag3 : 1;
    };

    /* heap allocated */
    struct {
        char *ptr;
        /* supports strings up to 2^MAX_STR_LEN_BITS - 1 bytes */
        size_t size : MAX_STR_LEN_BITS,
                      /* capacity is always a power of 2 (unsigned)-1 */
                      capacity : 6;
        /* the last 4 bits are important flags */
    };
} xs;

xs_is_ptr

透過 is_ptr flag，判斷 string 是否透過 pointer 儲存在 heap 內。

static inline bool xs_is_ptr(const xs *x) { return x->is_ptr; }

xs_is_large_string

透過 is_large_string flag，判斷 string 是否為 large string。

static inline bool xs_is_large_string(const xs *x)
{
    return x->is_large_string;
}

xs_size

此函式回傳 string 的大小，透過 xs_is_ptr 判斷 string 的儲存位置，再利用不同方式取得長度。

stack
space_left 儲存的是 string 所剩下的長度，所以透過總長度 15 - space_left 就可以得到 string 的當前長度。
heap
長度儲存在 size 之中，直接調用即可。

static inline size_t xs_size(const xs *x)
{
    return xs_is_ptr(x) ? x->size : 15 - x->space_left;
}

xs_data

此函數用來取得 string 所儲存的資料，透過 xs_is_ptr 判斷 string 的儲存位置。

stack
可以直接調用 x->data
heap
需要判斷 string 是否屬於長字串，是的話需要回傳 padding 4 之後的位置，之後會補充說明。
而中字串的話，則可以直接調用 x->ptr。

static inline char *xs_data(const xs *x)
{
    if (!xs_is_ptr(x))
        return (char *) x->data;

    if (xs_is_large_string(x))
        return (char *) (x->ptr + 4);
    return (char *) x->ptr;
}

xs_capacity

用來取得 string 的當前最大容量，透過 xs_is_ptr 判斷 string 的儲存位置。

stack
直接回傳短字串的最大長度 15
heap
藉由 x->capacity 作為 shift 的數量，表達 2 的冪的效果。
對應到 #define MAX_STR_LEN ((1UL << MAX_STR_LEN_BITS) - 1)

static inline size_t xs_capacity(const xs *x)
{
    return xs_is_ptr(x) ? ((size_t) 1 << x->capacity) - 1 : 15;
}

reference count

有關取用次數的相關實作，大部分都會先判斷字串是否屬於長字串才做操作，做了一層保護。

set 用來設立當前的 reference count
increment 會將 reference count 加 1
decrement 會將 reference count 減 1
get用來取得當前的　reference count

static inline void xs_set_refcnt(const xs *x, int val)
{
    *((int *) ((size_t) x->ptr)) = val;
}

static inline void xs_inc_refcnt(const xs *x)
{
    if (xs_is_large_string(x))
        ++(*(int *) ((size_t) x->ptr));
}

static inline int xs_dec_refcnt(const xs *x)
{
    if (!xs_is_large_string(x))
        return 0;
    return --(*(int *) ((size_t) x->ptr));
}

static inline int xs_get_refcnt(const xs *x)
{
    if (!xs_is_large_string(x))
        return 0;
    return *(int *) ((size_t) x->ptr);
}

xs_allocate_data

此函式會負責分配 xs string 所需要的記憶體空間，並且有 reallocate 的 flag，用來判斷是否需要藉由 realloc 重新配置空間。
並且只有在 string 需要儲存在 heap 時才會呼叫到此函式，因為存放在 stack 的短字串是不需要進行記憶體配置的。
由於只有中字串跟長字串會進入到此函式，所以使用 LARGE_STRING_LEN 作為分辨的依據。
由於程式需要引入 CoW 的機制，對長度大於 LARGE_STRING_LEN 的字串會在最前面做 4 byte 的 padding，將其保留給 reference count 使用，這也是為什麼 xs_data 在使用長字串時需要先做
$+ 4$

static void xs_allocate_data(xs *x, size_t len, bool reallocate)
{
    /* Medium string */
    if (len < LARGE_STRING_LEN) {
        x->ptr = reallocate ? realloc(x->ptr, (size_t) 1 << x->capacity)
                            : malloc((size_t) 1 << x->capacity);
        return;
    }

    /* Large string */
    x->is_large_string = 1;

    /* The extra 4 bytes are used to store the reference count */
    x->ptr = reallocate ? realloc(x->ptr, (size_t)(1 << x->capacity) + 4)
                        : malloc((size_t)(1 << x->capacity) + 4);

    xs_set_refcnt(x, 1);
}

xs_new

此函式用來建立一個新的 xs string。
由字串的長度決定要如何儲存他，小於 16 時會將他存放在 stack。
中字串跟長字串則是藉由上面提到的 xs_allocate_data 去配置大小。

xs *xs_new(xs *x, const void *p)
{
    *x = xs_literal_empty();
    size_t len = strlen(p) + 1;
    if (len > 16) {
        x->capacity = ilog2(len) + 1;
        x->size = len - 1;
        x->is_ptr = true;
        xs_allocate_data(x, x->size, 0);
        memcpy(xs_data(x), p, len);
    } else {
        memcpy(x->data, p, len);
        x->space_left = 15 - (len - 1);
    }
    return x;
}

兩個重要的 macro

xs_literal_empty
用來協助宣告，並且將字串的空間初始化為 0。
xs_tmp
此程式其實不直接呼叫 xs_new，而都是用此 macro 包裝，做了一層判斷字串長度是否超出最大長度限制的判斷及保護。
_Static_assert
根據 c11 6.7.10 Static assertions
_Static_assert ( constant-expression , string-literal ) ;

The constant expression shall be an integer constant expression. If the value of the constant expression compares unequal to 0, the declaration has no effect. Otherwise, the constraint is violated and the implementation shall produce a diagnostic message that includes the text of the string literal, except that characters not in the basic source character set are not required to appear in the message.

如果 expression 不為 0 的話無任何作用，反之會告知 string-literal 內的訊息。














#define xs_literal_empty() \
    (xs) { .space_left = 15 }

/* Memory leaks happen if the string is too long but it is still useful for
 * short strings.
 */
#define xs_tmp(x)                                                   \
    ((void) ((struct {                                              \
         _Static_assert(sizeof(x) <= MAX_STR_LEN, "it is too big"); \
         int dummy;                                                 \
     }){1}),                                                        \
     xs_new(&xs_literal_empty(), x))
     
xs string = *xs_tmp("\n foobarbar \n\n\n");

2021/05/02 新增 xs_tmp 的說明

首先知道 xs_tmp 是為了 xs_new 所作的一層封裝，其作用在新增新的空間時，會提前檢查字串大小是否超出 MAX_STR_LEN 的範圍。
上面的第 14 行是程式中使用 xs_tmp 的情形，因為需要確保進行檢查的過程是不會影響到原先函式的呼叫方式與運作，所以這邊使用了 comma expression 的技巧。
根據 c11 6.5.17 Comma operator

The left operand of a comma operator is evaluated as a void expression; there is a
sequence point between its evaluation and that of the right operand. Then the right
operand is evaluated; the result has its type and value.

可以了解 comma 左邊的 expression 會先處理，然後再處理以及回傳右邊的 expression，這樣可以確保在進行 xs_new 之前會先通過 _Static_assert 的檢查，並且 _Static_assert 不會回傳所以不影響到原先的使用方式。
但是如果只在 comma 左邊放一個 _Static_assert() 會遇到問題，因為 _Static_assert 本身是一個 macro 並不是 expression，所以他不符合 comma expression 所規範的 syntax。
所以程式透過把 _Static_assert 包裝在一個 structure 內，並利用 compound literal 的方式將其轉變為一個 expression。此時的程式就可以保有在 macro 內進行 _Static_assert 檢查且不影響其原先使用方式的效果。 (參考 c11 6.5.2.5 Compound literals，其歸類在 Postfix operators 之中)

使用 (void) 的部份還尚未釐清，有參考同學的共筆做實驗，但不論我有沒有加入 (void) 都不會在編譯時產生 warning，所以還搞不懂原因。

xs_grow

此函式的作用在結合上述的 xs_allocate_data 重新動態配置新的空間給 xs string。

如果空間還夠或是屬於存放在 stack 內的短字串，在第一個 if 判斷式就會將其過濾掉。
第 9 行的 if 用來備份儲存在 stack 內的資料，因為要轉換到 heap 去。
最後呼叫 xs_allocate_data 的地方會根據當前資料的存放位置有所不同
- stack
  會配置一塊全新的記憶體空間給 xs string。
  記得要把剛剛備份的資料填回來。
- heap
  將 reallocate 設為 1，藉此調用函式中的 realloc 而不是重新配置一整塊空間。






















xs *xs_grow(xs *x, size_t len)
{
    char buf[16];

    if (len <= xs_capacity(x))
        return x;

    /* Backup first */
    if (!xs_is_ptr(x))
        memcpy(buf, x->data, 16);

    x->is_ptr = true;
    x->capacity = ilog2(len) + 1;

    if (xs_is_ptr(x)) {
        xs_allocate_data(x, len, 1);
    } else {
        xs_allocate_data(x, len, 0);
        memcpy(xs_data(x), buf, 16);
    }
    return x;
}

這段程式碼原先有寫錯的地方，第 12 行會導致後面的 else 根本不會進入，勢必需要改寫，所以做了以下分析:

短字串 -> 短字串
由於短字串的 capacity 都會是 15，所以會在第 5 行的 if 就結束。
短字串 -> 中長字串
在第 9 行進行資料的備份，並接著進行其餘的動作，最後要再把備份好的資料移入新的空間內。
中長字串 -> 中長字串
與第二點的差別只在不需要進行資料的備份。

綜合以上的分析，將程式碼改為下列這樣：























xs *xs_grow(xs *x, size_t len)
{
    char buf[16];

    if (len <= xs_capacity(x))
        return x;

    /* Backup first */
    if (!xs_is_ptr(x))
        memcpy(buf, x->data, 16);

    x->capacity = ilog2(len) + 1;
    
    if (xs_is_ptr(x)) {
        xs_allocate_data(x, len, 1);
    } else {
        xs_allocate_data(x, len, 0);
        memcpy(xs_data(x), buf, 16);
    }
    
    x->is_ptr = true;
    return x;
}

第 20 行 x->capacity = ilog2(len) + 1; 移至第 12 行 if (xs_is_ptr(x)) 前，因為在 xs_allocate_data 函式中，會根據 x->capacity 配置記憶體，所以如果沒將 x->capacity 更新的話，配置的記憶體空間會和原本相同。
Chia-Lun Hsu <gyes00205>
感謝，之前沒注意到。
Shao-Tse Hung <ccs100203>

TODO 增加長度超過 MAX_STR_LEN 之後的判斷，但還在思考好的寫法，也還沒搞懂 xs_tmp 中的寫法。

釋放空間的函式

xs_literal_empty 會將其重新指向一塊空的結構體
xs_free
如果字串儲存在 heap 內的話，會先透過 xs_dec_refcnt 對 reference count 減 1，再決定是否要 free 掉。

static inline xs *xs_newempty(xs *x)
{
    *x = xs_literal_empty();
    return x;
}

static inline xs *xs_free(xs *x)
{
    if (xs_is_ptr(x) && xs_dec_refcnt(x) <= 0)
        free(x->ptr);
    return xs_newempty(x);
}

xs_cow_lazy_copy

此函式的作用在於，如果要對 reference count 大於 1 的長字串進行操作，那麼必須要先複製一份新的字串出來，因為原先是由多個字串使用同一份記憶體空間。
函式的回傳值則是此字串是否為長字串
藉由 xs_allocate_data 配置新的空間後，將原先的指標 data 指向新配出的空間。

static bool xs_cow_lazy_copy(xs *x, char **data)
{
    if (xs_get_refcnt(x) <= 1)
        return false;

    /* Lazy copy */
    xs_dec_refcnt(x);
    xs_allocate_data(x, x->size, 0);

    if (data) {
        memcpy(xs_data(x), *data, x->size);

        /* Update the newly allocated pointer */
        *data = xs_data(x);
    }
    return true;
}

xs_concat

此函式的作用是對於給定的參數進行字串串接。

第 9 行是對 CoW 機制的操作，避免改動到多個字串共用的記憶體空間
後面的操作會分為串接後的長度超過 capacity 跟未超過 capacity
- 未超過 capacity
  - 通過 memmove 移動原先的 string，空出 prefix 所需要的空間
  - 再藉由 memcpy 將需要串接的資料放入預計接上的位置。
  - 之後再根據 xs string 不同的存放位置去更新他的 size
- 超過 capacity
  - 超過 capacity 就意味著此字串一定要存放在 heap 內
  - 通過 xs_grow 配置出串接後的字串所需要的空間
  - 後續會將資料串接在剛剛新增的 tmps 內
  - 接著釋放掉原先 string 所佔用的記憶體，並接其重新指向剛剛的 tmps
  - 最後記得要更新字串的 size
補充圖解
一開始的 data 會從結構的最前端，透過 memmove 移動到空出 prefix 的位置，最後在將資料接上。
































xs *xs_concat(xs *string, const xs *prefix, const xs *suffix)
{
    size_t pres = xs_size(prefix), sufs = xs_size(suffix),
           size = xs_size(string), capacity = xs_capacity(string);

    char *pre = xs_data(prefix), *suf = xs_data(suffix),
         *data = xs_data(string);

    xs_cow_lazy_copy(string, &data);

    if (size + pres + sufs <= capacity) {
        memmove(data + pres, data, size);
        memcpy(data, pre, pres);
        memcpy(data + pres + size, suf, sufs + 1);

        if (xs_is_ptr(string))
            string->size = size + pres + sufs;
        else
            string->space_left = 15 - (size + pres + sufs);
    } else {
        xs tmps = xs_literal_empty();
        xs_grow(&tmps, size + pres + sufs);
        char *tmpdata = xs_data(&tmps);
        memcpy(tmpdata + pres, data, size);
        memcpy(tmpdata, pre, pres);
        memcpy(tmpdata + pres + size, suf, sufs + 1);
        xs_free(string);
        *string = tmps;
        string->size = size + pres + sufs;
    }
    return string;
}

memmove
將 src 內 n bytes 的資料按順序從第一個開始，先放入 buffer 再複製到 dest，不斷進行到移動結束。此函式可以接受 memory area overlap。
```
void *memmove(void *dest, const void *src, size_t n);
```
memcpy
將資料從 src 開始，複製 n bytes 到 dest。此函式不可接受 memory area overlap。
```
void *memcpy(void *restrict dest, const void *restrict src, size_t n);
```

將字串 concat 到超過 MAX_STR_LEN 時沒有做偵測，我認為需要在 xs_grow 中設立判斷機制。

xs_trim

此函式的用處在於修剪掉字串頭尾的指定字元。(註解寫的內容尚未能搞懂)

第 8 行的 if 會維持 CoW 的機制，在長字串並有需要時複製一份記憶體空間，才來做 trim 的動作
透過第 18 行的 for 將所需要的剔除的字元放入 mask 中
再透過接下來的兩個 for，分別從頭跟尾開始判斷每一個字元是否需要剔除掉，直到找到第一個不需要剔除的則停止
透過 memmove 將剩下的資料重新放到指標的一開始，並確保字串的結尾是 \0
最後要記得更改字串的大小

解釋 check_bit 及 set_bit 的運作原理：

ASCII 總共有 256 個字元，程式利用一個 bit 去代表一種字元，所以創造了 8 * 32 的 mask 陣列。
查找每一個字元時，利用 mask[(uint8_t) byte / 8] 以及 1 << (uint8_t) byte % 8 作為 index，用來索引當前字元所對應到的 bit。
有了 index 之後，使用位元運算子 and & 即可判斷該字元是否同時存在於 trimset 與 string 之中。而位元運算子 or | 則可將需要刪減掉的字元新增進去 trimset 中。













































xs *xs_trim(xs *x, const char *trimset)
{
    if (!trimset[0])
        return x;

    char *dataptr = xs_data(x), *orig = dataptr;

    if (xs_cow_lazy_copy(x, &dataptr))
        orig = dataptr;

    /* similar to strspn/strpbrk but it operates on binary data */
    uint8_t mask[32] = {0};

#define check_bit(byte) (mask[(uint8_t) byte / 8] & 1 << (uint8_t) byte % 8)
#define set_bit(byte) (mask[(uint8_t) byte / 8] |= 1 << (uint8_t) byte % 8)
    size_t i, slen = xs_size(x), trimlen = strlen(trimset);

    for (i = 0; i < trimlen; i++)
        set_bit(trimset[i]);
    for (i = 0; i < slen; i++)
        if (!check_bit(dataptr[i]))
            break;
    for (; slen > 0; slen--)
        if (!check_bit(dataptr[slen - 1]))
            break;
    dataptr += i;
    slen -= i;

    /* reserved space as a buffer on the heap.
     * Do not reallocate immediately. Instead, reuse it as possible.
     * Do not shrink to in place if < 16 bytes.
     */
    memmove(orig, dataptr, slen);
    /* do not dirty memory unless it is needed */
    if (orig[slen])
        orig[slen] = 0;

    if (xs_is_ptr(x))
        x->size = slen;
    else
        x->space_left = 15 - slen;
    return x;
#undef check_bit
#undef set_bit
}

提供字串複製 (應考慮到 CoW) 的函式實作

TODO