# 2023q1 Homework1 (lab0)
contributed by < [Raymonna](https://github.com/komark06) >
## Environment Information
```shell
$ gcc --version
gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 39 bits physical, 48 bits virtual
CPU(s): 6
On-line CPU(s) list: 0-5
Thread(s) per core: 1
Core(s) per socket: 6
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 158
Model name: Intel(R) Core(TM) i5-8400 CPU @ 2.80GHz
Stepping: 10
CPU MHz: 800.062
CPU max MHz: 4000.0000
CPU min MHz: 800.0000
BogoMIPS: 5599.85
Virtualization: VT-x
L1d cache: 192 KiB
L1i cache: 192 KiB
L2 cache: 1.5 MiB
L3 cache: 9 MiB
```
## Implementation of `queue.c`
### q_new
```c
struct list_head *q_new()
{
struct list_head *newh = malloc(sizeof(struct list_head));
if(!newh){
free(newh);
return NULL;
}
INIT_LIST_HEAD(newh);
return newh;
}
```
In `q_new` function, two things need to be consider:
- Satisfy the structure of circular linked list
- The first list_head object `head` is not a member of element_t, which indicates that it contains no value, only consisting of `list_head *next` and `list_head *prev`
### q_free
```c
void q_free(struct list_head *l)
{
if (!l)
return;
element_t *cur, *safe;
list_for_each_entry_safe (cur, safe, l, list) {
q_release_element(cur);
}
free(l);
}
```
Let's look at what the macro `list_for_each_entry_safe` looks like.
```c
for(cur = container_of(l->next, element_t, list), safe = container_of(cur->list.next, element_t, list);
&cur->list != l;
cur = safe, safe = container_of(safe->list.next, element_t, list))
```
- The macro `container_of` could give you the address of the structure(`element_t`) while giving it the information of the member of this structure(`list_head *`). For more details, please refer to [the lecture provided by Professor Huang](https://hackmd.io/@sysprog/linux-macro-containerof#container_of-%E5%B7%A8%E9%9B%86%E4%BD%9C%E7%82%BA%E8%B3%87%E6%96%99%E5%B0%81%E8%A3%9D%E7%9A%84%E5%9F%BA%E7%A4%8E)
- If you look at the first time, you may wonder why it need another list_head `safe` for iteration. Remember that we've mentioned the first list_head structure `head` is not a member of `element_t` structure; therefore, if we ignore it, segmentation fault would occur!
- While taking a look at the `q_release_element`, we could dive into the details of the implementation of `test_free` and see how the magic number has been used.([Details of test_free implementation in the lecture note](https://hackmd.io/@sysprog/linux2022-lab0#%E8%BF%BD%E8%B9%A4%E8%A8%98%E6%86%B6%E9%AB%94%E9%85%8D%E7%BD%AE%E5%92%8C%E9%87%8B%E6%94%BE%E7%9A%84%E7%8B%80%E6%B3%81))
### q_insert_head and q_insert_tail
In the insert and remove sections, I refer to the implementation provided by [chiangkd](https://hackmd.io/2ug0Xq4bTNij2eWQgYR4mg?view#q_insert_head-%E5%92%8C-q_insert_tail-%E5%AF%A6%E4%BD%9C)
```c
static inline element_t *new_element(const char *s)
{
if (!s)
return NULL;
element_t *new_e = malloc(sizeof(element_t));
if (!new_e)
return NULL;
size_t len = strlen(s);
new_e->value = malloc(len + 1);
if (!new_e->value) {
free(new_e);
return NULL;
}
memcpy(new_e->value, s, len);
new_e->value[len] = '\0';
return new_e;
}
```
One thing needs to be noticed is that the length of `new_e` needs to be `strlen(s) + 1` for putting the **'\0'** at the end of the string.
```c
bool q_insert_head(struct list_head *head, char *s)
{
if (!head)
return false;
element_t *ne = new_element(s);
if (!ne)
return false;
list_add(&ne->list, head);
return true;
}
```
```c
bool q_insert_tail(struct list_head *head, char *s)
{
if (!head)
return false;
element_t *ne = new_element(s);
if (!ne)
return false;
list_add_tail(&ne->list, head);
return true;
}
```
### q_remove_head and q_remove_tail
```c
static inline element_t *remove_element(element_t *obj,
char *sp,
size_t bufsize)
{
if (!obj)
return NULL;
if (sp && bufsize != 0) {
size_t len = strlen(obj->value);
if (len > bufsize - 1)
len = bufsize - 1;
memcpy(sp, obj->value, len);
sp[len] = '\0';
}
list_del(&obj->list);
return obj;
```
Notice that
1. since we need to return the obj after calling `q_remove_*`, we wouldn't release `q_release_element(obj)` here.
2. we need to check the size between len and bufsize in order to avoid from buffer overflow.
```c
element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize)
{
if (!head || list_empty(head))
return NULL;
return remove_element(list_first_entry(head, element_t, list), sp, bufsize);
}
```
Be aware of the `list_empty(head)`. The lack of it may cause segmentation fault.
```c
element_t *q_remove_tail(struct list_head *head, char *sp, size_t bufsize)
{
if (!head || list_empty(head))
return NULL;
return remove_element(list_last_entry(head, element_t, list), sp, bufsize);
}
```
### q_delete_mid
```c
bool q_delete_mid(struct list_head *head)
{
if (!head || list_empty(head))
return false;
struct list_head *tail = head->prev, *fast = head->next, *slow = fast;
while (fast != tail && fast->next != tail) {
fast = fast->next->next;
slow = slow->next;
}
list_del(slow);
q_release_element(container_of(slow, element_t, list));
return true;
}
```
Use the property that size of `fast` would always 2 times larger than `slow`, which indicates the `slow` would be the middle list_head when `fast` reaches the end of the list.
### q_delete_dup
Before designing the code, we have to notice that in this problem, we need to delete all duplicated nodes, which includes the node itself. For example, original is [1, 2, 2, 3, 3, 5], and the one after deleting the repeated nodes is [1, 5].
```c=
bool q_delete_dup(struct list_head *head)
{
if (!head)
return false;
struct list_head *cur = head->next;
while(cur != head && cur->next != head){
struct list_head *cur_next = cur->next;
element_t *cur_e = container_of(cur, element_t, list);
element_t *next_e = container_of(cur->next, element_t, list);
bool is_dup = false;
while(!strcmp(cur_e->value, next_e->value)){
is_dup = true;
struct list_head *tmp_next = next_e->list.next;
list_del(&next_e->list);
q_release_element(next_e);
if(tmp_next == head)
break;
next_e = container_of(tmp_next, element_t, list);
}
cur_next = cur->next;
if(is_dup){
list_del(&cur_e->list);
q_release_element(cur_e);
}
cur = cur_next;
}
return true;
}
```
`q_delete_dup` code explanation:
1. Line 7: Check if the unvisited list contains 0 or 1 node
2. Line 12-16: If the current string equals to the next undeleted string, we need to delete the **next element**(not current one). And we get a new `next_e`.
3. Notice that the `cur_next` save the information of the newlest next list_head of `cur`. We couldn't ignore it since we may need to delete `cur_e`(the element_t structure containing `cur`), which may cause we couldn't use `cur->next`
### q_swap
In this function, we have to notice that the `cur` is an pointer to pointer. If we use pointer in this case, either you would inevitably change the value inside the `cur` or you have to do extra work on dealing with the boundary condition.
```c=
void q_swap(struct list_head *head)
{
if (!head)
return;
struct list_head **cur = &head->next;
for (; *cur != head && (*cur)->next != head; cur = &(*cur)->next->next) {
struct list_head *first = *cur, *second = first->next;
first->next = second->next;
second->next = first;
second->prev = first->prev;
first->prev = second;
*cur = second;
}
}
```
### q_reverse
```c=
void q_reverse(struct list_head *head)
{
if (!head)
return;
struct list_head *cur, *safe;
list_for_each_safe (cur, safe, head) {
list_move(cur, head);
}
}
```
### q_reverseK
This is the code I saw from [ chiangkd ](https://hackmd.io/duPg38XOTL-XKa7_h0Ielw?view#q_reverseK), which is clear and uses the macros wisely.
```c=
void q_reverseK(struct list_head *head, int k)
{
if (!head || k <= 1)
return;
struct list_head *cur, *safe, *tmp = head;
LIST_HEAD(dummy);
int count = 0;
list_for_each_safe (cur, safe, head) {
count++;
if (count == k) {
list_cut_position(&dummy, tmp, cur);
q_reverse(&dummy);
list_splice_init(&dummy, tmp);
count = 0;
tmp = safe->prev;
}
}
}
```
`q_reverse` code explanation:
1. In line 12, it split the list into 2 sub lists, which sets `dummy` and `tmp` as the first list_head, respectively.
2. In line 13, everse the sub list beginning with `dummy`
3. In line 14, combine the two sub lists. Notice that when looking at the implementation of `list_spice_init`, the `dummy` has been initialized as a legal circular linked list.
### q_sort
It is nothing but a typical divide and conquer algorithm
```c=
void q_sort(struct list_head *head)
{
if (!head || head->next == head->prev)
return;
head->prev->next = NULL;
head->next = merge_sort(head->next);
struct list_head *ptr = head;
for (; ptr->next; ptr = ptr->next)
ptr->next->prev = ptr;
ptr->next = head;
head->prev = ptr;
}
```
```c=
struct list_head *merge_two_queue(struct list_head *L1, struct list_head *L2)
{
struct list_head *head = NULL, **ptr = &head;
for (struct list_head **cur = NULL; L1 && L2; *cur = (*cur)->next) {
if (strcmp(container_of(L1, element_t, list)->value,
container_of(L2, element_t, list)->value) >= 0)
cur = &L2;
else
cur = &L1;
*ptr = *cur;
ptr = &(*ptr)->next;
}
*ptr = (struct list_head *) (void *) ((uintptr_t) (void *) L1 |
(uintptr_t) (void *) L2);
return head;
}
```
In `merge_sort`, we firstly need to find the middle list_head(in line 7-9), and combine it(call `merge_two_queue`)
```c=
//A process of dividing
static struct list_head *merge_sort(struct list_head *head)
{
if (!head || !head->next)
return head;
struct list_head *slow = head, *fast = head->next;
for (; fast && fast->next; fast = fast->next->next, slow = slow->next)
;
fast = slow->next;
slow->next = NULL;
return merge_two_queue(merge_sort(head), merge_sort(fast));
}
```
### q_descend
```c
int q_descend(struct list_head *head)
{
if (!head)
return 0;
int len = 0;
struct list_head *cur = head->prev;
while (cur != head) {
len++;
if (cur->prev == head)
break;
element_t *c = container_of(cur, element_t, list),
*p = container_of(cur->prev, element_t, list);
if (strcmp(c->value, p->value) > 0) {
list_del(&p->list);
q_release_element(p);
len--;
} else {
cur = cur->prev;
}
}
return len;
}
```
### Current Grade
There is a strange situation.
When I use `make test`, it sometimes shows 95/100, sometimes shows 100/100.
While I use `make valgrind`, it always shows 95/100.
According to the error message, it should be some memory leakage in the function `q_insert_tail`, but I'm still working on it, trying to deal with the solution.
## Add new command -- Shuffle
```c=
//in queue.c
void q_shuffle(struct list_head *head)
{
if (!head || list_empty(head))
return;
int size = q_size(head);
// shuffle
for (int i = 0; i < size;) { // not iterate i , iterate size
struct list_head *start = head->next;
int rand_idx = rand() % size; // random number in range 0~ (size-1)
for (int j = 0; j < rand_idx; j++) {
start = start->next;
}
list_del(start);
list_add_tail(start, head);
size--;
}
return;
}
```
```c=
//in queue.h
void q_shuffle(struct list_head *head);
```
```c=
//in qtest.c
static bool do_shuffle(int argc, char *argv[])
{
if (argc != 1) {
report(1, "%s takes no arguments", argv[0]);
return false;
}
if (!l_meta.l)
report(3, "Warning: Try to access null queue");
error_check();
set_noallocate_mode(true);
if (exception_setup(true))
q_shuffle(l_meta.l);
exception_cancel();
set_noallocate_mode(false);
show_queue(3);
return !error_check();
}
....
static void console_init()
{
ADD_COMMAND(shuffle, "shuffle", "");
}
```
## Performance analysis for list_sort
### Modify the source code in [list_sort.c](https://github.com/torvalds/linux/blob/master/lib/list_sort.c) to embed it to lab0
- Step 1: add three functions.(Notice that we have remove the unnecessary variables, `cmp` and `priv`)
- `static struct list_head *merge(struct list_head *a, struct list_head *b)`
- `static void merge_final(struct list_head *head,
struct list_head *a, struct list_head *b)`
- `void list_sort(struct list_head *head)`
- Step 2: add compare function
```c=
static int cmp(struct list_head *a, struct list_head *b)
{
return strcmp(container_of(a, element_t, list)->value, container_of(b, element_t, list)->value);
}
```
- Step 3: add necessary macros, `likely` and `unlikely`
```c=
# define likely(x) __builtin_expect(!!(x), 1)
# define unlikely(x) __builtin_expect(!!(x), 0)
```
- Step 4: add the new command for `list_sort`(just like we've done for `shuffle` command)
### Performance between q_sort and list_sort
- For q_sort,
```shell=
$perf stat --repeat 5 -e instructions,cycles ./qtest -f traces/trace-15-perf.cmd
Performance counter stats for './qtest -f traces/trace-15-perf.cmd' (5 runs):
7,5665,9045 instructions # 0.43 insn per cycle ( +- 0.01% )
17,6220,2427 cycles ( +- 0.19% )
0.45850 +- 0.00111 seconds time elapsed ( +- 0.24% )
```
- For list_sort,
```shell=
$ perf stat --repeat 5 -e instructions,cycles ./qtest -f traces/trace-15-perf-lsort.cmd
Performance counter stats for './qtest -f traces/trace-15-perf-lsort.cmd' (5 runs):
7,0725,6998 instructions # 0.51 insn per cycle ( +- 0.01% )
13,8684,4861 cycles ( +- 0.64% )
0.36797 +- 0.00706 seconds time elapsed ( +- 1.92% )
```
### <s>Time complexity</s> analysis between q_sort and list_sort
:::warning
We don't tend to analyze the time complexity. Instead, we care about the measured time and the factors to make programs run faster.
:notes: jserv
:::