owned this note
owned this note
Published
Linked with GitHub
# 2021q1 Homework3 (quiz3)
contributed by < `RZHuangJeff` >
###### tags: `linux2021`
> [Problem set](https://hackmd.io/@sysprog/linux2021-quiz3)
## Problem 1
### Structure Description
```cpp=
#define MAX_STR_LEN_BITS (54)
#define MAX_STR_LEN ((1UL << MAX_STR_LEN_BITS) - 1)
#define LARGE_STRING_LEN 256
typedef union {
char data[16];
struct {
uint8_t filler[15],
space_left : 4,
is_ptr : 1, is_large_string : 1, flag2 : 1, flag3 : 1;
}; /* space and flag */
struct {
char *ptr;
size_t size : MAX_STR_LEN_BITS,
capacity : 6;
}; /* heap */
} xs;
```
This union is the header of simplified implementation of [folly::fbstring](https://github.com/facebook/folly/blob/master/folly/FBString.h), for this implementation (mentioned as `xs_string` below), it can hold upto 15 characters (exclude null character) on stack with in its 16-byte size header, and upto ($2^{53} - 1$) characters (include null character) on heap.
Following macros are defined:
1. `MAX_STR_LEN_BITS`: This macro defines how many bits that are used to represent the size of `xs_string`.
2. `MAX_STR_LEN`: With `MAX_STR_LEN_BITS` defined above, the max length of `xs_string` can be calculated with following expression: `((1UL << MAX_STR_LEN_BITS) - 1)`.
3. `LARGE_STRING_LEN`: This macro defines the minimal size of large string category.
With in `xs` union, there are three parts of structures.
1. `data`: This array of character makes it easy to access string content while given `xs` holds short string on stack.
2. `space and flag`: This structure contains two part of members:
* `space_left`: This field occupied 4 bits, represents the length of string content. Notice that the actual value stored is 15(max length of on-stack `xs_string`) minus length of current string content, which will ensure the last byte to be zero (acts as null character) to terminate string content properly. This field is meaningful only for on-stack `xs_string`.
* `flags`: There are 4 slots avaliable, only two of them are meaningful. `is_ptr` indicates whether string content is stored on heap, `is_large_string` indicates whether the length of string content of given `xs_string` is greater than `LARGE_STRING_LEN`, in order to perform cow feature properly. Notice that such flags will be zero for on-stack `xs_string`, which ensures there is a null character for the on-stack `xs_string` with full sized string content.
3. `heap`: The fields in this structure will be referenced only when `is_ptr` is set.
* `ptr`: A pointer that points to the buffer that stores string content of given `xs_string`.
* `size`: This field indicates the length of string content of given `xs_string`.
* `capacity`: This field represents the entire size of allocated buffer with in format $log_2$(buffer size) to reduce space consumption.
Since that `data` and following two structures are unioned together, we have to ensure that flags in first structure and fields in second structure will not affect to each other, otherwise, unexpected behavior may happend.
The key of such problem is to check the memory organization of `xs`, from ISO/IEC 9899:
* ISO/IEC 9899 [6.7.2.1] Structure and union specifiers
> An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified.
With in spec of C99, it only mentioned that the successive bit-fields should be packed into adjacent bits if there is enough space remains. But the packing order of succeding bit-fields is implementation-defined. To figure out the actual packing order, following program are designed as experiment:
```cpp
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>
struct {
uint32_t a : 4,
b : 8,
c : 20;
} ts;
void print_bits(uint8_t byte)
{
for (int8_t i = 7; i >= 0; i--)
printf("%d", byte & (1 << i) ? 1 : 0);
}
void print_layout(const void *ptr, size_t len)
{
uint8_t *p = (uint8_t *)ptr;
for (size_t i = 0; i < len; i++, p++) {
printf("%p: ", p);
print_bits(*p);
printf("\n");
}
}
int main()
{
printf("Default:\n");
print_layout(&ts, sizeof(ts));
// set ts.a
ts.a = 0xF;
printf("\nts.a is set\n");
print_layout(&ts, sizeof(ts));
// set ts.b
ts.b = 0xFF;
printf("\nts.a & ts.b are set\n");
print_layout(&ts, sizeof(ts));
return 0;
}
```
And is compiled and run under following environment:
```shell
$ gcc --version | head -n 1
gcc (Ubuntu 10.2.0-13ubuntu1) 10.2.0
$ lscpu | head -n 3
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
```
The experiment first defines a structure that contains 3 bit-fields with size of 4, 8 and 20 bits, respectively. The field `a` and `b` are set successively and the content of that structure is printed after each setting, which generates following output:
```shell
Default:
0x557b7da95014: 00000000
0x557b7da95015: 00000000
0x557b7da95016: 00000000
0x557b7da95017: 00000000
ts.a is set
0x557b7da95014: 00001111
0x557b7da95015: 00000000
0x557b7da95016: 00000000
0x557b7da95017: 00000000
ts.a & ts.b are set
0x557b7da95014: 11111111
0x557b7da95015: 00001111
0x557b7da95016: 00000000
0x557b7da95017: 00000000
```
And the organization of 4-byte integer in little endian byte order is as follows:
```graphviz
graph {
n [shape=record label="0x557b7da95017|0x557b7da95016|0x557b7da95015|0x557b7da95014"]
}
```
With this experiment, we can figure out that the successive bit-fields are packed from low-order to high-order.
After all, following figure shows the memory layout of `xs`:
```
_______________________________________________________________________
|_____________________________data[16]__________________________________|
|____________________________filler[15]____________________|_sl_|_|_|_|_|
|______________ptr_______________|_________size________|capacity|\flags/
sl stands for space_left
```
### Code Description
* **xs_literal_empty**
```cpp=
#define xs_literal_empty() \
(xs) { .space_left = 15 }
```
This macro defines an initializer of `xs_string`, which will set field `space_left` to 15.
* **xs_new**
```cpp=
xs *xs_new(xs *x, const void *p)
{
*x = xs_literal_empty();
size_t len = strlen(p) + 1;
if (len > 16 /* NNN */) {
x->capacity = ilog2(len) + 1;
x->size = len - 1;
x->is_ptr = true;
xs_allocate_data(x, x->size, 0);
memcpy(xs_data(x), p, len);
} else {
memcpy(x->data, p, len);
x->space_left = 15 - (len - 1);
}
return x;
}
```
This function aims to initialize a `xs_string` with given string content. Since that `xs` string supports string content with length 16 or less (includes null character) to be stored on stack, the length of given string content will be checked to decide where to place the content.
While the length of content is greater than 16, string content will be placed on heap, to do so, the capacity of the buffer should be set up first, it will be set to logarithm of 2 to the closest power of 2 that is greater or equals to the length of string content. After `size` and `is_ptr` flag are set and the space is allocated, the content is copied to the buffer (line 5 - 10). Otherwise, the content will be placed inside `x`, and the `space_left` will be set properly (line 12 - 13).
* **xs_newempty**
```cpp=
static inline xs *xs_newempty(xs *x)
{
*x = xs_literal_empty();
return x;
}
```
This function is a wrapper of `xs_literal_empty`, that will initialize given `xs_string` to empty string.
* **xs_free**
```cpp=
static inline xs *xs_free(xs *x)
{
if (xs_is_ptr(x) && xs_dec_refcnt(x) <= 0)
free(x->ptr);
return xs_newempty(x);
}
```
This function tries to free the allocated space on heap while the reference count of that string content shrinks to 0 (line 3 - 4), and `x` will be initialized to empty `xs_string` before returned (line 5).
* **xs_allocate_data**
```cpp=
static void xs_allocate_data(xs *x, size_t len, bool reallocate)
{
if (len < LARGE_STRING_LEN) {
x->ptr = reallocate ? realloc(x->ptr, (size_t) 1 << x->capacity)
: malloc((size_t) 1 << x->capacity);
return;
}
x->is_large_string = 1;
x->ptr = reallocate ? realloc(x->ptr, (size_t)(1 << x->capacity) + 4)
: malloc((size_t)(1 << x->capacity) + 4);
xs_set_refcnt(x, 1);
}
```
This function aims to allocate the space of buffer on heap. There are two kinds of allocation:
1. **Medium length string**: For the string with length less than `LARGE_STRING_LEN`, the space with size $2^{x->capacity}$ bytes is allocated (line 3 - 6).
2. **Large string**: Otherwise, not only $2^{x->capacity}$ bytes is allocated, there are additional 4 bytes that holds the reference count to that string content (line 11).
* **xs_is_ptr & xs_is_large_string**
```cpp=
static inline bool xs_is_ptr(const xs *x)
{
return x->is_ptr;
}
static inline bool xs_is_large_string(const xs *x)
{
return x->is_large_string;
}
```
These two functions will access to flag `is_ptr` and `is_large_string` respectively.
`xs_is_ptr` will return whether the string content of given `xs_string` is stored on head or not.
The same as `xs_is_ptr`, `xs_is_large_string` returns whether a given `xs_string` is large string or not.
* **xs_size & xs_data & xs_capacity**
```cpp=
static inline size_t xs_size(const xs *x)
{
return xs_is_ptr(x) ? x->size : 15 - x->space_left;
}
static inline char *xs_data(const xs *x)
{
if (!xs_is_ptr(x))
return (char *) x->data;
if (xs_is_large_string(x))
return (char *) (x->ptr + 4 /* OFF */);
return (char *) x->ptr;
}
static inline size_t xs_capacity(const xs *x)
{
return xs_is_ptr(x) ? ((size_t) 1 << x->capacity) - 1 : 15;
}
```
These three functions will return different components of given `xs_string`.
`xs_size`: This function reports the size of string content of given `xs_string`, for the content stored on stack is (15 - `space_left`), otherwise, the size is stored in field `size`.
`xs_data`: This function returns the start address of string content. For the string stored on stack, this would be its `data` field (line 8 - 9), otherwise, it would be `ptr` field, furthermore, there are additional 4 bytes at the begining of buffer, which needs to be omitted (line 12), while given `xs_string` is large string.
* **xs_*_refcnt**
```cpp=
static inline void xs_set_refcnt(const xs *x, int val)
{
*((int *) ((size_t) x->ptr)) = val;
}
static inline int xs_get_refcnt(const xs *x)
{
if (!xs_is_large_string(x))
return 0;
return *(int *) ((size_t) x->ptr);
}
static inline void xs_inc_refcnt(const xs *x)
{
if (xs_is_large_string(x))
++(*(int *) ((size_t) x->ptr));
}
static inline int xs_dec_refcnt(const xs *x)
{
if (!xs_is_large_string(x))
return 0;
return --(*(int *) ((size_t) x->ptr));
}
```
These four functions provide an interface to access the reference count of large `xs_string`. Since the reference count occupies the begining 4 bytes of allocated buffer, all we need to do to access it is casting field `ptr` of given `xs_string` to `int`.
* **ilog2**
```cpp=
static inline int ilog2(uint32_t n)
{
return 32 - __builtin_clz(n) - 1; /* LLL */
}
```
This function aims to calculate logarithm of 2 to n by counting leading zero of `n`. Since a 1 in binary representation at position `i` indicates $2^i$ and $log_2(2^i) = i$, with fact that logarithm of 2 to numbers between $2^i$ and $2^{i + 1}$ will land between $i$ and $i + 1$, we only need to check where the first 1 appears to calculate logarithm of 2 to `n`.
* **xs_tmp**
```cpp=
#define xs_tmp(x) \
((void) ((struct { \
_Static_assert(sizeof(x) <= MAX_STR_LEN, "it is too big"); \
int dummy; \
}){1}), \
xs_new(&xs_literal_empty(), x))
```
This macro will create a `xs_string` as local variable by calling `xs_new`. This macro also performs length checking with macro `_Static_assert`, with ISO/IEC 9899:201x:
* ISO/IEC 9899:201x [6.7.10] Static assertions
> The constant expression shall be an integer constant expression. If the value of theconstant expression compares unequal to 0, the declaration has no effect. Otherwise, the constraint is violated and the implementation shall produce a diagnostic message that includes the text of the string literal, except that characters not in the basic source character set are not required to appear in the message.
* ISO/IEC 9899:201x [6.5.17] Comma operator
> The left operand of a comma operator is evaluated as a void expression; there is a sequence point between its evaluation and that of the right operand. Then the right operand is evaluated; the result has its type and value.
The spec indicates that `_Static_assert` expression has no effect while the length of given string content is less or equal to `MAX_STR_LEN`, otherwise, the message will be shown and the program will abort. If length checking passes, the result of this macro will be the return value of `xs_new` due to the property of comma operator.
* **xs_grow**
```cpp=
xs *xs_grow(xs *x, size_t len)
{
char buf[16];
if (len <= xs_capacity(x))
return x;
/* Backup first */
if (!xs_is_ptr(x))
memcpy(buf, x->data, 16);
// x->is_ptr = true;
x->capacity = ilog2(len) + 1;
if (xs_is_ptr(x)) {
xs_allocate_data(x, len, 1);
} else {
xs_allocate_data(x, len, 0);
memcpy(xs_data(x), buf, 16);
x->is_ptr = true;
}
return x;
}
```
This function aims to extend its buffer to be capable of holding string with size `len`. It will first check whether `len` fits in the size of current buffer, if so, there is no further action taken (line 5 - 6), otherwise, the buffer should be extended. The string that is stored in field `data` will be backup first (line 9 - 10), then, the field `capability` will be set to fit `len`, finally, the space will be allocated (line 15 - 20).
:::danger
Flag `is_ptr` is set at line 12 originally, which will cause the condition of if statement at line 15 always be true, and an error will occur while data is originally stored in field `data`, since the external buffer is not allocated yet, while program will try to reallocate it later. This error can be re-produced by following test program:
```cpp
int main(int argc, char *argv[])
{
xs string = *xs_tmp("foobarbar");
xs_grow(&string, 17);
return 0;
}
```
which will cause segmentation fault.
To fix it, just move the assignment to `is_ptr` into the body of `else` statement (line 21).
:::
* **xs_cow_lazy_copy**
```cpp=
static bool xs_cow_lazy_copy(xs *x, char **data)
{
if (xs_get_refcnt(x) <= 1)
return false;
/* Lazy copy */
xs_dec_refcnt(x);
xs_allocate_data(x, x->size, 0);
if (data) {
memcpy(xs_data(x), *data, x->size);
/* Update the newly allocated pointer */
*data = xs_data(x);
}
return true;
}
```
This function tries to allocate a new space for given `xs_string` and detach it from the existed buffer (line 7 - 8) while there are more than 1 references to its string content (line 3 - 4), and the content of allocated buffer will be filled with the content that `*data` points to (line 10 - 15).
* **xs_concat**
```cpp=
xs *xs_concat(xs *string, const xs *prefix, const xs *suffix)
{
size_t pres = xs_size(prefix), sufs = xs_size(suffix),
size = xs_size(string), capacity = xs_capacity(string);
char *pre = xs_data(prefix), *suf = xs_data(suffix),
*data = xs_data(string);
xs_cow_lazy_copy(string, &data);
if (size + pres + sufs <= capacity) {
memmove(data + pres, data, size);
memcpy(data, pre, pres);
memcpy(data + pres + size, suf, sufs + 1);
if (xs_is_ptr(string))
string->size = size + pres + sufs;
else
string->space_left = 15 - (size + pres + sufs);
} else {
xs tmps = xs_literal_empty();
xs_grow(&tmps, size + pres + sufs);
char *tmpdata = xs_data(&tmps);
memcpy(tmpdata + pres, data, size);
memcpy(tmpdata, pre, pres);
memcpy(tmpdata + pres + size, suf, sufs + 1);
xs_free(string);
*string = tmps;
string->size = size + pres + sufs;
}
return string;
}
```
This function will wrap `string` by `prefix` and `suffix`. Since the content of `string` will change, it should be detached from its original content by calling `xs_cow_lazy_copy` (line 9). After that, the content of `prefix` and `suffix` will be copied onto buffer of `string` directly while the total length fits in buffer of `string` (line 11 - 14). The reason why use `memmove` (line 12) instead of `memcpy` is that `memmove` will take care of memory overlapping, as descriped in [Linux manual page](https://man7.org/linux/man-pages/man3/memmove.3.html). Otherwise, a new `xs_string` object is created and is extended to capable of holding modified string content by calling `xs_grow` (line 21 - 22), and three parts of string are copied onto it (line 23 - 26), and, the old one is freed (line 27). Finally, the size of string is updated (line 16 - 19, 29).
* **xs_trim**
```cpp=
xs *xs_trim(xs *x, const char *trimset)
{
if (!trimset[0])
return x;
char *dataptr = xs_data(x), *orig = dataptr;
if (xs_cow_lazy_copy(x, &dataptr))
orig = dataptr;
/* similar to strspn/strpbrk but it operates on binary data */
uint8_t mask[32] = {0};
#define check_bit(byte) (mask[(uint8_t) byte / 8] & 1 << (uint8_t) byte % 8) /* CCC */
#define set_bit(byte) (mask[(uint8_t) byte / 8] |= 1 << (uint8_t) byte % 8) /* SSS */
size_t i, slen = xs_size(x), trimlen = strlen(trimset);
for (i = 0; i < trimlen; i++)
set_bit(trimset[i]);
for (i = 0; i < slen; i++)
if (!check_bit(dataptr[i]))
break;
for (; slen > 0; slen--)
if (!check_bit(dataptr[slen - 1]))
break;
dataptr += i;
slen -= i;
/* reserved space as a buffer on the heap.
* Do not reallocate immediately. Instead, reuse it as possible.
* Do not shrink to in place if < 16 bytes.
*/
memmove(orig, dataptr, slen);
/* do not dirty memory unless it is needed */
if (orig[slen])
orig[slen] = 0;
if (xs_is_ptr(x))
x->size = slen;
else
x->space_left = 15 - slen;
return x;
#undef check_bit
#undef set_bit
}
```
This function will remove user defined characters that specified in `trimset` from the begining and the end of given `xs_string`. If there is no character specified in `trimset`, there is no further action taken (line 3 -4). Just like `xs_concat`, the content of given may change after trimming, it should be detached from its original string content by calling `xs_cow_lazy_copy` (line 8 - 9).
To improve the performance of triming, an `uint8_t` array with 32 entries that acts as a table with each bit in it represents a character is introduced (line 12). The characters in `trimset` is mapped onto that table by macro `set_bit` which will set the corresponding bit in table of given character (line 18 - 19). The `for` loops around line 20 and 23 perform similar task, they will step through each character form begining and end of string content, until they find a character not matchs characters in `trimset`. To discard characters from head and tail, we only need to move pointer to the new begining and shrink its length, respectively (line 26 - 27), and the modified content is shifted forward to the start of buffer (line 33). After setting the null character at the end of string (line 35 - 36) and updating the size of given `xs_string` (line 38 - 41), the trimming process is complete.