owned this note
owned this note
Published
Linked with GitHub
# 2021q1 Homework2 (quiz2)
contributed by < `RZHunagJeff` >
###### tags: `linux2021`
> [Problem set](https://hackmd.io/@sysprog/linux2021-quiz2)
:::warning
See [What is the difference between a problem and a question?](https://www.quora.com/What-is-the-difference-between-a-problem-and-a-question)
Problem usually means the major issue that leads to the need for a study.
:notes: jserv
:::
## Problem 1
### Description
This section will illustrate the effect of some important functions. For the graph that visualize the work that functions done, **red** edge represents the edge being added, while **gray** one stands for the edge being removed.
* **list_add_tail**
```cpp
static inline void list_add_tail(struct list_head *node,
struct list_head *head)
{
struct list_head *prev = head->prev;
prev->next = node;
node->next = head;
node->prev = prev;
head->prev = node;
}
```
This function will attach `node` to the tail of list that `head` points to.
```graphviz
digraph {
rankdir=LR
node [shape="box"]
edge [dir="both"]
omit1 [label="..." shape="none"]
pnode [label="prev of head"]
nnode [label="node"]
head [label="head"]
omit2 [label="..." shape="none"]
omit1 -> pnode
pnode -> head [color=gray]
head -> omit2
pnode -> nnode -> head [color=red]
}
```
* **list_del**
```cpp
static inline void list_del(struct list_head *node)
{
struct list_head *next = node->next, *prev = node->prev;
next->prev = prev; prev->next = next;
}
```
This function will disattach `node` from the list that it belonged to.
```graphviz
digraph {
rankdir=LR
node [shape="box"]
edge [dir="both"]
omit1 [label="..." shape="none"]
pnode [label="prev of node"]
n [label="node"]
nnode [label="next of node"]
omit2 [label="..." shape="none"]
omit1 -> pnode
pnode -> nnode [color=red]
nnode -> omit2
pnode -> n -> nnode [color=gray]
}
```
* **list_splice_tail**
```cpp
static inline void list_splice_tail(struct list_head *list,
struct list_head *head)
{
struct list_head *head_last = head->prev;
struct list_head *list_first = list->next, *list_last = list->prev;
if (list_empty(list))
return;
head->prev = list_last;
list_last->next = head;
list_first->prev = head_last;
head_last->next = list_first;
}
```
This function aims to attach list that `list` points to to the tail of `head`.
```graphviz
digraph {
node [shape=box]
edge [minlen=2 dir=both]
lom1, lom2, hom1, hom2 [label="..." shape=none]
lprev [label="prev of list"]
lnext [label="next of list"]
hprev [label="prev of head"]
{
rank=same
lom1 -> lprev
lprev -> list [dir=forward color=gray]
list -> lprev [dir=forward]
list -> lnext [dir=forward]
lnext -> list [dir=forward color=gray]
lnext -> lom2
}
{
rank=same
hom1 -> hprev
hprev -> head [color=gray]
head -> hom2
}
lnext -> hprev [color=red]
lprev -> head [color=red]
}
```
* **list_cut_position**
```cpp
static inline void list_cut_position(struct list_head *head_to,
struct list_head *head_from,
struct list_head *node)
{
struct list_head *head_from_first = head_from->next;
if (list_empty(head_from))
return;
if (head_from == node) {
INIT_LIST_HEAD(head_to);
return;
}
head_from->next = node->next;
head_from->next->prev = head_from;
head_to->prev = node;
node->next = head_to;
head_to->next = head_from_first;
head_to->next->prev = head_to;
}
```
For this function, a list will be cut into 2 sublists from the cut point that `node` specified.
```graphviz
digraph {
rankdir=LR
node [shape=record]
edge [dir=both]
omit1, omit2, omit3 [label="..." shape=none]
cut [label="node"]
{
omit1 -> head_from
head_from -> head_from_first [color=gray]
head_from_first -> omit2 -> cut
cut -> node_next [color=gray]
node_next ->omit3
head_from -> node_next [color=red]
}
head_to -> cut [color=red]
head_to -> head_from_first [color=red]
}
```
### Analysis
Following are some type definitions used in this problem:
```cpp
typedef struct __element {
char *value;
struct __element *next;
struct list_head list;
} list_ele_t;
typedef struct {
list_ele_t *head;
list_ele_t *tail;
size_t size;
struct list_head list;
} queue_t;
```
And following is the main part of program that performs merge sort on given `queue_t`:
```c=
void list_merge_sort(queue_t *q)
{
if (list_is_singular(&q->list))
return;
queue_t left;
struct list_head sorted;
INIT_LIST_HEAD(&left.list);
list_cut_position(&left.list, &q->list, &get_middle(&q->list)->list /* MMM */);
list_merge_sort(&left);
list_merge_sort(q);
list_merge(&left.list, &q->list, &sorted);
INIT_LIST_HEAD(&q->list);
list_splice_tail(&sorted, &q->list);
}
```
The work done by `list_merge_sort` can be described briefly:
1. Check how many elements are there in `q`(line 3): if there is only one element in `q`, then it doesn't need to be sorted.
2. Divide elements of `q` into 2 sublists(line 9)
3. Do recursive function calls (line 10 - 11): call `list_merge_sort` with 2 sublists divided from `q` as arguments.
4. Merge sublists: after recursive function calls return, the 2 sublists should both contain elements in acensing order, a call to `list_merge` will merge these sublists into final result.
To make the program works properly, the first thing needs to be figured out is `MMM` at line 9. According to comment of this function, `node`, the third argument of `list_cut_position`, is the pointer to the cutting point and with type `struct list_head *`, in all options, there is only one option matches that type, which is `&get_middle(&q->list)->list`.
```cpp
static list_ele_t *get_middle(struct list_head *list)
{
struct list_head *fast = list->next, *slow;
list_for_each (slow, list) {
if (fast->next != list /* COND1 */ || fast->next->next != list /* COND2 */)
break;
fast = fast->next->next;
}
return list_entry(slow /* TTT */, list_ele_t, list);
}
```
Next, we have to figure out the other 3 expressions in function `get_middle`. There are several ways to split a list to two, the method implemented here is to maintain two pointers, one is `slow` which will move forward a node with in an iteration, while `fast` moves two nodes at the same time. After `fast` moved through all nodes in list, `slow` will point to the node at the middle. Since that list in this example is circular list, to check whether `fast` has moved through all nodes, we can check whether the next node of that `fast` points to is `list` itself, that is, back to the head of given list. But the number of nodes in the list may be odd or even, the condition mentioned above will never match in the even case, to fix that, we can check whether the node two after that `fast` pointed to is `list`. From here, we can figure out the answer of `COND1` and `COND2` is `fast->next == list` and `fast->next->next == list`.
Following shows two situations that `COND1` and `COND2` consider to:
* odd case
```graphviz
digraph {
rankdir=BT
node [shape=box]
{
rank=same
list -> node1 -> node2 -> node3 -> node4 -> node5 -> list
}
slow_0 [label="slow" shape=none fontcolor=blue]
fast_0 [label="fast" shape=none fontcolor=blue]
{slow_0, fast_0} -> node1
slow_1 [label="slow" shape=none fontcolor=red]
fast_1 [label="fast" shape=none fontcolor=red]
slow_1 -> node2
fast_1 -> node3
slow_2 [label="slow" shape=none fontcolor=green]
fast_2 [label="fast" shape=none fontcolor=green]
slow_2 -> node3
fast_2 -> node5
}
```
* even case
```graphviz
digraph {
rankdir=BT
node [shape=box]
{
rank=same
list -> node1 -> node2 -> node3 -> node4 -> node5 -> node6 -> list
}
slow_0 [label="slow" shape=none fontcolor=blue]
fast_0 [label="fast" shape=none fontcolor=blue]
{slow_0, fast_0} -> node1
slow_1 [label="slow" shape=none fontcolor=red]
fast_1 [label="fast" shape=none fontcolor=red]
slow_1 -> node2
fast_1 -> node3
slow_2 [label="slow" shape=none fontcolor=green]
fast_2 [label="fast" shape=none fontcolor=green]
slow_2 -> node3
fast_2 -> node5
}
```
Finally, the list element at the middle will be returned, that is, the `list_ele_t` that `slow` belongs to, with macro `list_entry`, we can find the structure which contains `slow`.
:::warning
TODO:
1. Since we were able to find the middle element in the given linked list, it is feasible to integrate the hybrid sorting algorithm, which modifies the merge sort algorithm to switch to insertion sort when the input gets sufficiently small. See [Algorithmic Analysis and Sorting: Part Three](https://stanford.edu/class/archive/cs/cs106b/cs106b.1138/lectures/13/Slides13.pdf)
2. Linked lists are interesting while you attempt to perform typical algorithms and find out the expected performance. Figure out the facts!
:notes: jserv
:::
---
## Problem 2
### Description
```cpp
uint16_t func(uint16_t N) {
/* change all right side bits to 1 */
N |= N >> 1;
N |= N >> 2; // X
N |= N >> 4; // Y
N |= N >> 8; // Z
return (N + 1) >> 1;
}
```
According to description, `func` aims to find the greatest power-of-2 that is smaller than or equals to N. To fulfill this task, we have to take a look at binary rerpesentation of integer:
```
1 : 00000000 00000001 (2^0)
2 : 00000000 00000010 (2^1)
4 : 00000000 00000100 (2^2)
8 : 00000000 00001000 (2^3)
16 : 00000000 00010000 (2^4)
32 : 00000000 00100000 (2^5)
64 : 00000000 01000000 (2^6)
128 : 00000000 10000000 (2^7)
256 : 00000001 00000000 (2^8)
512 : 00000010 00000000 (2^9)
...
32768: 10000000 00000000 (2^15)
```
As shown above, we can find that the numbers which are power-of-2 will have only 1 bit be set, and the $n^{th}$ bit be set represents 2 to the $n^{th}$. That is, to find the greatest power-of-2 under N, all bits under the left most 1 should be cleared.
And the work that `func` done are as follow (take 272($00000001 00010000_2$) for example):
1. Set all right side bits: (00000001 00010000) -> (00000001 11111111)
2. Add 1 to it: (00000001 11111111) -> (00000010 00000000)
3. Right shift one bit: (00000010 00000000) -> (00000001 00000000)($256_{10}$)
The key point is how a set of oring and right shifting will cause all right side bits be set. With binary representations shown above, we can see that for all `N` greater than zero, there should be a left most 1 in binary representation of `N`, which can be shown as: (0...01...). After oring `N` with `N >> 1`, `N` becomes (0...011...), as shown, we can determine that there should be at least 2 succeeding 1s around left most 1. To set as most bits as possible at a time, `N` is ored with `N >> 2`, which causes (0...01111...). Then oring `N` with `N >> 4`, `N` will be (0...011111111...).
To sum up, if we keep oring `N` with `N` right shift 1, 2, 4, 8 ... bits, respectively, `N` results in all right side bits are set. And since that `N` is type `unit16_t` in this case, we only need to or `N` up to `N >> 8`, then all right side bits of `N` will be set even if the most significant bit of `N` is 1.
---
## Problem 3
### Description
* **bitcpy**
```cpp=
void bitcpy(void *_dest, /* Address of the buffer to write to */
size_t _write, /* Bit offset to start writing to */
const void *_src, /* Address of the buffer to read from */
size_t _read, /* Bit offset to start reading from */
size_t count)
{
size_t read_lhs = _read & 7;
size_t read_rhs = 8 - read_lhs;
const uint8_t *source = (const uint8_t *) _src + (_read / 8);
size_t write_lhs = _write & 7;
size_t write_rhs = 8 - write_lhs;
uint8_t *dest = (uint8_t *) _dest + (_write / 8);
static const uint8_t read_mask[] = {
0x00, /* == 0 00000000b */
0x80, /* == 1 10000000b */
0xC0, /* == 2 11000000b */
0xE0, /* == 3 11100000b */
0xF0, /* == 4 11110000b */
0xF8, /* == 5 11111000b */
0xFC, /* == 6 11111100b */
0xFE, /* == 7 11111110b */
0xFF /* == 8 11111111b */
};
static const uint8_t write_mask[] = {
0xFF, /* == 0 11111111b */
0x7F, /* == 1 01111111b */
0x3F, /* == 2 00111111b */
0x1F, /* == 3 00011111b */
0x0F, /* == 4 00001111b */
0x07, /* == 5 00000111b */
0x03, /* == 6 00000011b */
0x01, /* == 7 00000001b */
0x00 /* == 8 00000000b */
};
while (count > 0) {
uint8_t data = *source++;
size_t bitsize = (count > 8) ? 8 : count;
if (read_lhs > 0) {
data <<= read_lhs; /* RRR */
if (bitsize > read_rhs)
data |= (*source >> read_rhs);
}
if (bitsize < 8)
data &= read_mask[bitsize];
uint8_t original = *dest;
uint8_t mask = read_mask[write_lhs];
if (bitsize > write_rhs) {
/* Cross multiple bytes */
*dest++ = (original & mask) | (data >> write_lhs);
original = *dest & write_mask[bitsize - write_rhs];
*dest = original | (data << write_rhs);
} else {
// Since write_lhs + bitsize is never >= 8,
// no out-of-bound access.
original |= write_mask[write_lhs + bitsize]; /* DDD */
*dest++ = (original & mask) | (data >> write_lhs);
}
count -= bitsize;
}
}
```
This function is going to copy `count` bits starts from address that `_src` points to with offset `_read` bits to the address that `_dest` points to with offset `_write`.
This function can be divided into two parts, first part (line 7 - 36) is preprocessing that definding variables, second one (line 38 - 65) is a loop which will finish the job of copying.
Since that there are 8 bits within a byte for recent computer architecture, we can write `_read` and `_write` into ($8*n+k$), where $n$ is the offset to bytes and $k$ is the offset of bits in the byte, that is, we actually start to read from address that (`_src` + $n$) points to with offset $k$. And this takes the same to where to write to. Following shows the relationship between variables:
```
source/dest = _src/_dest + n
_src/_dest | k (take k = 3, for example)
| | |
V V V
byte 0 | byte 1 | | byte n | byte n+1
00000000 | 00000000 | ... | 00000000 | 000|00000
\ /|\ /
*_lhs| *_rhs
the read/write process starts here ->|
```
Next, we are going to take a look at the second part, the while loop around line 38 - 65. At line 40, `bitsize` will be set to 8 if `count` is greater than 8, and with the type of `data` is `uint8_t`, we can figure out that there is a byte(8 bits) being copied within an iteration. The whole process starts from copying a byte from where `source` points to to `data`(line 39). If `read_lhs` greater than zero($k$ > 0), which means that there are $k$ bits should be read from next byte to form a complete byte, and this situation is handled around line 41 - 45. To leave place for bits from next byte, `data` should be shifted left by `read_lhs` bits. After that, `read_lhs` bits from next byte will be read to that place by oring them together while there are not enough bits from the previous read(`bitsize` > `read_rhs`). And to discard unnecessary bits, `data` is ored with proper `read_mask`, which will leave upper `bitsize` unmodified and clear all other bits.
Finally, the read byte needs to be written to proper place, where pointed by `dest` with `write_lhs` bits offset. There are two situations need to be handled, one is the byte should be written into mutiple bytes, while the other one only need to modify one byte. To handle first case, `data` is divided into two parts, the upper `write_lhs` bits and lower `write_rhs` bits. The process that modifies two bytes are similar(line 50 - 51, 54 - 56):
1. Clear all bits that need to be modified by anded it with proper mask
2. Write data into it by oring `data` with proper shift
Process to handle the second case is the same as above.
Following shows the writing process in detail:
```
data: 11001100
dest
|
v
01010101 10101100 ...
Case 1: cross mutilpe bytes
(take write_lhs = 3 and bitsize = 7)
(1)
original = *dest;
=> original: 01010101
(2)
mask = read_mask[write_lhs];
=> mask: 11100000
(3)
original & mask => 01000000
data >> write_lhs => 00011001
*dest++ = (original & mask) | (data >> write_lhs);
=> *dest: 01011001
(4)
write_mask[bitsize - write_rhs] => 00111111
original = *dest & write_mask[bitsize - write_rhs];
=> original: 00101100
(5)
data << write_rhs => 10000000
*dest = original | (data << write_rhs);
=> *dest = 10101100
After all:
dest
|
V
010[11001 10]101100 ...
Case 2: no need to cross mutiple bytes
(take write_lhs = 5 and bitsize = 2)
(1)
original = *dest;
=> original: 01010101
(2)
mask = read_mask[write_lhs];
=> mask: 11111000
(3)
write_mask[bitsize + write_lhs] => 00000001
mask |= write_mask[bitsize + write_lhs];
mask = 11111001
(4)
original & mask => 01010001
data >> write_lhs => 00000110
*dest++ = (original & mask) | (data >> write_lhs);
=> *dest = 01010111
After all:
dest
|
V
01010[11]1 10101100 ...
```
:::warning
TODO: visualize with Graphviz or something advanced.
:notes: jserv
:::
---
## Problem 4
### Structures
* **cstring**
```cpp
typedef struct __cstr_data {
char *cstr;
uint32_t hash_size;
uint16_t type;
uint16_t ref;
} * cstring;
```
`struct __cstr_data` is the basic structure that wraps required variables for string interning, and `cstring` is defined as a pointer type that points to `struct __cstr_data`.
`cstr`: address of string content.
`hash_size`: the hash of string `cstr` points to, or the length of string while `type` is `CSTR_ONSTACK`.
`type`: a flag that marks whether the string content is stored on stack(`CSTR_ONSTACK`), on heap(`CSTR_PERMANENT`) or in the interning pool(`CSTR_INTERNING`).
`ref`: reference count on undefined type `cstring`.
:::warning
:heavy_check_mark:
`ref` is used for the [reference counting](https://en.wikipedia.org/wiki/Reference_counting), which is crucial in multithreaded environments.
:notes: jserv
:::
* **cstr_buffer**
```cpp
typedef struct __cstr_buffer {
cstring str;
} cstr_buffer[1];
```
A buffer that contains a `cstring`.
`str`: the corresponding `cstring`.
:::danger
Still trying to figure out why definding as `cstr_buffer[1]` instead of just `cstr_buffer`.
> TODO: modify and experiment!
> :notes: jserv
:::
* **struct __cstr_node**
```cpp
struct __cstr_node {
char buffer[CSTR_INTERNING_SIZE];
struct __cstr_data str;
struct __cstr_node *next;
};
```
This is a structure that aims to hold interned string.
`buffer`: a `char` buffer that holds content of interned string.
`str`: the corresponding `cstring` instance.
`next`: used to form the hash table.
* **struct __cstr_pool**
```cpp
struct __cstr_pool {
struct __cstr_node node[INTERNING_POOL_SIZE];
};
```
This structure collects a set of `struct __cstr_node`s.
`node`: an array that holds upto `INTERNING_POOL_SIZE` of `struct __cstr_node`s.
* **struct __cstr_interning**
```cpp
struct __cstr_interning {
int lock;
int index;
unsigned size;
unsigned total;
struct __cstr_node **hash;
struct __cstr_pool *pool;
};
```
This is the main structure that packs all needed fields for string interning.
`lock`: a lock that used for atomic operations on corresponding `struct __cstr_interning` object.
`index`: the index that points to next position in `pool`.
`size`: size of current hash table.
`total`: the count of interned strings.
`hash`: a pointer that points to the hash table.
`pool`: a pointer that points to the interned string pool.
### Code Description
* **xalloc**
```cpp
static void *xalloc(size_t n)
{
void *m = malloc(n);
if (!m)
exit(-1);
return m;
}
```
This function is a wrapper of `malloc`, which performs valid checking on allocated pointer.
* **CSTR_BUFFER**
```cpp
#define CSTR_BUFFER(var) \
char var##_cstring[CSTR_STACK_SIZE] = {0}; \
struct __cstr_data var##_cstr_data = {var##_cstring, 0, CSTR_ONSTACK, 0}; \
cstr_buffer var; \
var->str = &var##_cstr_data;
```
This is a macro that sets up a `cstr_buffer` named by `var` and its corresponding variables. Structures used in this macro will be described in detail later. Takes `CSTR_BUFFER(a)` as example:
```cpp
char a_cstring[CSTR_STACK_SIZE] = {0};
struct __cstr_data a_cstr_data = {a_cstring, 0, CSTR_ONSTACK, 0};
cstr_buffer a;
a->str = &a_cstr_data;
```
Notice that `##` in macro means concatenate which is mentioned in ISO/IEC 9899:
* ISO/IEC 9899[6.10.3.3] **Semantics** - 3
> For both object-like and function-like macro invocations, before the replacement list isreexamined for more macro names to replace, each instance of a ## preprocessing token in the replacement list (not from an argument) is deleted and the preceding preprocessing token is concatenated with the following preprocessing token.
* **CSTR_LITERAL**
```cpp
#define CSTR_LITERAL(var, cstr) \
static cstring var = NULL; \
if (!var) { \
cstring tmp = cstr_clone("" cstr, (sizeof(cstr) / sizeof(char)) - 1); \
if (tmp->type == 0) { \
tmp->type = CSTR_PERMANENT; \
tmp->ref = 0; \
} \
if (!__sync_bool_compare_and_swap(&var, NULL, tmp)) { \
if (tmp->type == CSTR_PERMANENT) \
free(tmp); \
} \
}
```
This macro is used to define a `cstring` object named by `var` with content of `cstr`. Take `CSTR_LITERAL(a, "example")` as example:
```cpp
static cstring a = NULL;
if(!a) {
cstring tmp = cstr_clone("" "example", (sizeof("example") / sizeof(char)) - 1);
if (tmp->type == 0) {
tmp->type = CSTR_PERMANENT;
tmp->ref = 0;
}
if (!__sync_bool_compare_and_swap(&a, NULL, tmp)) {
if (tmp->type == CSTR_PERMANENT)
free(tmp);
}
}
```
In this example, `a` is declared as static variable, which will only be initialized once. While `a` is first time to be created, a temporary variable `tmp` is set up with content of `cstr`, `"example"` in this case, by calling `cstr_clone` which will be described in detail later, and `tmp` is copied to `a` by calling `__sync_bool_compare_and_swap`.
`__sync_bool_compare_and_swap` a built-in function of [GNU C Extension for atomic memory access](https://gcc.gnu.org/onlinedocs/gcc-4.1.1/gcc/Atomic-Builtins.html#Atomic-Builtins). With this function, it can avoid memory leak while several threads are trying to initialize `a` at the same time, only one of them will be able to initialize `a` successfully, while others may need to free their `tmp` if its type is `CSTR_PERMANENT`.
* **CSTR_CLOSE**
```cpp
#define CSTR_CLOSE(var) \
do { \
if (!(var)->str->type) \
cstr_release((var)->str); \
} while (0)
```
This is a macro that closes given `cstr_buffer` by freeing its corresponding `cstring`. The `do { ... } while(0)` statement which wraps the entire macro aims to avoid syntax error or unexpected behavior.
One of common unexpected behavior is dangling else, which may happend while `CSTR_CLOSE` is not wrapped by `do {...} while(0)` and is invoked in an `if ... else` statement like following:
```cpp
if (condition)
CSTR_CLOSE(var);
else
...
```
Which will be expended as follows:
```cpp
if (condition)
if (!(var)->str->type)
cstr_release((var)->str);;
else
...
```
With the definition of selection statement and compound statement in ISO/IEC 9899:
* ISO/IEC 9899[6.8.4] Selection statement
```
selection-statement:
if (expression) statement
if (expression) statement else statement
switch (expression) statement
```
* ISO/IEC 9899[6.8.2] Compound statement
```
compound-statement:
{ block-item-listopt }
block-item-list:
block-item
block-item-list block-item
block-item:
declaration
statement
```
We can figure out that the inner `if` statement is ended by the first semicolon, the second semicolon will be explaned as `null` statement, and since that there is no `{}` packing the inner `if` statement and `null` statement together, which makes `else` no match any `if` statement and will rise a syntax error.
And if we remove the semicolon after `CSTR_CLOSE(var)` as following:
```cpp
if (condition)
CSTR_CLOSE(var)
else
...
```
Which will be expended to:
```cpp
if (condition)
if (!(var)->str->type)
cstr_release((var)->str);
else
...
```
This seems better than the previous one, but here comes an unexpected behavior. With ISO/IEC 9899:
* ISO/IEC 9899[6.8.4.1] Sematics - 3
> An else is associated with the lexically nearest preceding if that is allowed by the syntax.
Compiler will explain the program as following:
```cpp
if (condition) {
if (!(var)->str->type)
cstr_release((var)->str);
else
...
}
```
While we expect the program should be:
```cpp
if (condition) {
if (!(var)->str->type)
cstr_release((var)->str);
} else
...
```
:::warning
:heavy_check_mark:
TODO: explain [Dangling else](https://en.wikipedia.org/wiki/Dangling_else)
:notes: jserv
:::
* **interning**
```cpp=
static cstring interning(struct __cstr_interning *si,
const char *cstr,
size_t sz,
uint32_t hash)
{
if (!si->hash)
return NULL;
int index = (int) (hash & (si->size - 1));
struct __cstr_node *n = si->hash[index];
while (n) {
if (n->str.hash_size == hash) {
if (!strcmp(n->str.cstr, cstr))
return &n->str;
}
n = n->next;
}
// 80% (4/5) threshold
if (si->total * 5 >= si->size * 4)
return NULL;
if (!si->pool) {
si->pool = xalloc(sizeof(struct __cstr_pool));
si->index = 0;
}
n = &si->pool->node[si->index++];
memcpy(n->buffer, cstr, sz);
n->buffer[sz] = 0;
cstring cs = &n->str;
cs->cstr = n->buffer;
cs->hash_size = hash;
cs->type = CSTR_INTERNING;
cs->ref = 0;
n->next = si->hash[index];
si->hash[index] = n;
return cs;
}
```
This is one of the most important function for this implementation of string interning, since that it aims to "intern" a given string by adding it into string pool and onto hash table if the given string is not interned yet.
In order to save memory consumption, it will check whether there is an identical string has already interned, if so, that string will be returned (line 9-17). Notice that since `si->size` will always be power of 2 (mention later), `hash % si->size` can be simplified to `hash & (si->size - 1)`.
It will return `NULL` while `si->total` is at 80% of `si->size` (line 19 - 20), which indicates interning failed.
The space for string pool will be allocated while `interning` is called first time, and the index of pool, `si->index` is initialized to 0 (line 21 - 24).
If an identical string cannot be found in hash table, the next free `struct __cstr_node` will be picked from string pool (line 25), and the given string, `cstr`, will be copied into its corresponding `buffer` (line 26 - 27).
For the newly interned string, its hash, type and reference count will be initialized after content is copied (line 29 - 33). Notice that its `ref` is set to 0, since we don't need to concern the reference count of interned string, which will never be released until program termination.
Finally, that newly interned string is inserted into hash table (line 35 - 36).
:::danger
It seems wired because `si->total` and `si->index` are increased while a string is interned or is added into pool respectively, but they are never decreased.
(Requires further experiment)
:::
* **expand**
```cpp=
static void expand(struct __cstr_interning *si)
{
unsigned new_size = si->size * 2;
if (new_size < HASH_START_SIZE)
new_size = HASH_START_SIZE;
struct __cstr_node **new_hash =
xalloc(sizeof(struct __cstr_node *) * new_size);
memset(new_hash, 0, sizeof(struct __cstr_node *) * new_size);
for (unsigned i = 0; i < si->size; ++i) {
struct __cstr_node *node = si->hash[i];
while (node) {
struct __cstr_node *tmp = node->next;
insert_node(new_hash, new_size, node);
node = tmp;
}
}
free(si->hash);
si->hash = new_hash;
si->size = new_size;
}
```
This function aims to expand the hash table. For the first time this function being called, `new_size` will be set to `HASH_START_SIZE`, and will be double of original size, otherwise (line 3 - 5). Next, a new hash table will be allocated, and will be initialized to 0 (line 7 - 9). After that, the already existed nodes in the original hash table will be inserted into the new one (line 11 - 18). Finally, the old hash table will be replaced by the new one (line 20 - 22).
* **insert_node**
```cpp=
static inline void insert_node(struct __cstr_node **hash,
int sz,
struct __cstr_node *node)
{
uint32_t h = node->str.hash_size;
int index = h & (sz - 1);
node->next = hash[index];
hash[index] = node;
}
```
This function aims to insert `node` to hash table `hash` which has `sz` entries.
* **cstr_interning**
```cpp=
static cstring cstr_interning(const char *cstr, size_t sz, uint32_t hash)
{
cstring ret;
CSTR_LOCK();
ret = interning(&__cstr_ctx, cstr, sz, hash);
if (!ret) {
expand(&__cstr_ctx);
ret = interning(&__cstr_ctx, cstr, sz, hash);
}
++__cstr_ctx.total;
CSTR_UNLOCK();
return ret;
}
```
This function is a wrapper of `interning`, which performs atomic locking / unlocking on `__cstr_ctx` and calls `expand` while interning failed due to full (over 80% threashold) hash table.
* **hash_blob**
```cpp=
static inline uint32_t hash_blob(const char *buffer, size_t len)
{
const uint8_t *ptr = (const uint8_t *) buffer;
size_t h = len;
size_t step = (len >> 5) + 1;
for (size_t i = len; i >= step; i -= step)
h = h ^ ((h << 5) + (h >> 2) + ptr[i - 1]);
return h == 0 ? 1 : h;
}
```
This function aims to calculate a simplified hash of given string (`buffer`). With `step` at line 5, which is set to ceiling of `len/32`, it takes at most 32 iterations to calculate hash (line 6 - 7). But only parts of string will be included into hash, that is why this function is named `hash_blob`.
* **cstr_hash**
```cpp=
static size_t cstr_hash(cstring s)
{
if (s->type == CSTR_ONSTACK)
return hash_blob(s->cstr, s->hash_size);
if (s->hash_size == 0)
s->hash_size = hash_blob(s->cstr, strlen(s->cstr));
return s->hash_size;
}
```
This function is a wrapper of `hash_blob` which performs different action for different type of given `cstring`. For that with type `CSTR_ONSTACK`, it returns the result of `hash_blob` directly (line 3 - 4). For other types, the hash is not only calculated, but also stored to `hash_size` (line 5 - 7).
* **cstr_clone**
```cpp=
cstring cstr_clone(const char *cstr, size_t sz)
{
if (sz < CSTR_INTERNING_SIZE)
return cstr_interning(cstr, sz, hash_blob(cstr, sz));
cstring p = xalloc(sizeof(struct __cstr_data) + sz + 1);
if (!p)
return NULL;
void *ptr = (void *) (p + 1);
p->cstr = ptr;
p->type = 0;
p->ref = 1;
memcpy(ptr, cstr, sz);
((char *) ptr)[sz] = 0;
p->hash_size = 0;
return p;
}
```
This function is used to clone a given string (`cstr`) to a `cstring` string. This function will try to interning the given string first, when the length of given string is less than `CSTR_INTERNING_SIZE`, otherwise, it will allocate space on heap for the given string (line 3 - 5). The `cstring` stored on heap has following memory orgnization:
```graphviz
digraph{
node [shape=record]
cstr [label="<head>struct __cstr_data|(sz + 1) bytes for given string"]
p -> cstr:head
}
```
To reference to the start address of allcated buffer, `p` is increased by 1, with ISO/IEC 9899:
* ISO/IEC 9899 [6.5.6] Additive operators - 7
> For the purposes of these operators, a pointer to an object that is not an element of anarray behaves the same as a pointer to the first element of an array of length one with the type of the object as its element type.
* ISO/IEC 9899 [6.5.6] Additive operators - 8
> ... If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. ... If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; ...
Since that `p` has type `cstring`(`struct __cstr_data *`), `p + 1` will jump to address right after allocated `struct __cstr_data`, that is, start address of buffer, which is assigned to `ptr` and is casted to type `void *`.
A call to `memcpy` copies `cstr` to where `ptr` points to, and the attributes of new `cstring` is set. Notice that the `type` of allocated `cstring` is set to 0, which means undifined, should be set externally.
* **cstr_grab**
```cpp=
cstring cstr_grab(cstring s)
{
if (s->type & (CSTR_PERMANENT | CSTR_INTERNING))
return s;
if (s->type == CSTR_ONSTACK)
return cstr_clone(s->cstr, s->hash_size);
if (s->ref == 0)
s->type = CSTR_PERMANENT;
else
__sync_add_and_fetch(&s->ref, 1);
return s;
}
```
This function will grab a new `cstring` from an existed `cstring`. With given `s` has type `CSTR_PERMANENT` or `CSTR_INTERNING` (being tested at line 3), `s` will be returned without further action. With type `CSTR_ONSTACK`, a new copy of `s` will be created by calling `cstr_clone`. For undefined type `s`, its `type` will be set to `CSTR_PERMANENT` if `ref` is 0, otherwise, its `ref` will be increased by 1.
`__sync_add_and_fetch` is an [atomic built-in functon](https://gcc.gnu.org/onlinedocs/gcc-4.1.1/gcc/Atomic-Builtins.html) of GNU extention, which will perform an addition atomically, and the result is returned.
* **cstr_release**
```cpp=
void cstr_release(cstring s)
{
if (s->type || !s->ref)
return;
if (__sync_sub_and_fetch(&s->ref, 1) == 0)
free(s);
}
```
This function will decrease the reference count of `s` if s is an undefined type `cstring`. For other types of `cstring`, this function will simply return (line 3 - 4). When the decreased reference count of `s` reaches 0, the space occupied by `s` will be freed.
* **cstr_equal**
```cpp=
int cstr_equal(cstring a, cstring b)
{
if (a == b)
return 1;
if ((a->type == CSTR_INTERNING) && (b->type == CSTR_INTERNING))
return 0;
if ((a->type == CSTR_ONSTACK) && (b->type == CSTR_ONSTACK)) {
if (a->hash_size != b->hash_size)
return 0;
return memcmp(a->cstr, b->cstr, a->hash_size) == 0;
}
uint32_t hasha = cstr_hash(a);
uint32_t hashb = cstr_hash(b);
if (hasha != hashb)
return 0;
return !strcmp(a->cstr, b->cstr);
}
```
This function will check whether two given `cstring`s are reference to the same string content.
The given two pointers, `a` and `b`, will be checked whether they point to the same `cstring` structure, if so, their content must be the same (line 3 - 4).
Otherwise, they will be checked whether their type are identical and are `CSTR_INTERNING`, if so, they must reference to different string content, since that there is only one copy of each interned string.
Otherwise, if both `a` and `b` are with type `CSTR_ONSTACK`, their size will be checked first, follows their content.
Otherwise, their hash will be compared first, before checking their content.
* **cstr_cat**
```cpp=
cstring cstr_cat(cstr_buffer sb, const char *str)
{
cstring s = sb->str;
if (s->type == CSTR_ONSTACK) {
int i = CCC;
while (i < CSTR_STACK_SIZE - 1) {
s->cstr[i] = *str;
if (*str == 0)
return s;
++s->hash_size;
++str;
++i;
}
s->cstr[i] = 0;
}
cstring tmp = s;
sb->str = cstr_cat2(tmp->cstr, str);
cstr_release(tmp);
return sb->str;
}
```
This function aims to concatenate `str` to the `cstring` stored in `sb`. This function will try to make in-placed concatenation while the type of `sb->str` is `CSTR_ONSTACK` and the result size fits buffer size (line 4 - 15). Otherwise, `cstr_cat2` will take place to handle further action.
* **cstr_cat2**
```cpp=
static cstring cstr_cat2(const char *a, const char *b)
{
size_t sa = strlen(a), sb = strlen(b);
if (sa + sb < CSTR_INTERNING_SIZE) {
char tmp[CSTR_INTERNING_SIZE];
memcpy(tmp, a, sa);
memcpy(tmp + sa, b, sb);
tmp[sa + sb] = 0;
return cstr_interning(tmp, sa + sb, hash_blob(tmp, sa + sb));
}
cstring p = xalloc(sizeof(struct __cstr_data) + sa + sb + 1);
if (!p)
return NULL;
char *ptr = (char *) (p + 1);
p->cstr = ptr;
p->type = 0;
p->ref = 1;
memcpy(ptr, a, sa);
memcpy(ptr + sa, b, sb);
ptr[sa + sb] = 0;
p->hash_size = 0;
return p;
}
```
This function will be invoked while `cstr_cat` failed to concatenate string in-placed on stack. It will try to intern the result string of concatenation first while the length of result string fits in `CSTR_INTERNING_SIZE` (line 4 - 10). Otherwise, The result will be placed on heap, and its type will be set to undefined with reference count to 1 (line 11 - 23).