Try   HackMD

2020q3 Homework (quiz1)

contributed by < hankluo6 >

Linked List

add_entry

  • new_node 加入到 linked list 的尾端,使用 pointer to pointer 來指向要加入的節點位置的「地址」。
void add_entry(node_t **head, int new_value)
{
    node_t **indirect = head;

    node_t *new_node = malloc(sizeof(node_t));
    new_node->value = new_value;
    new_node->next = NULL;

    AA1
    while (*indirect)
        indirect = &(*indirect)->next;
    
    AA2
}

AA1

  • (a) assert(new_node)

先使用 assert 來檢驗 malloc 是否成功。

AA2

  • (b) *indirect = new_node

當找到位置後,將該位址的值設為 new_node







linked_list


cluster_0

pointer to pointer


cluster_1

linked list



indirect

indirect



head

head



indirect->head





node1

node1



head->node1





node2

node2



node1->node2





new_node

new_node



  • 先將 indirect 設定為 head 的 address






linked_list


cluster_0

pointer to pointer


cluster_1

linked list



indirect

indirect



node1

node1



indirect->node1





head

head



head->node1





node2

node2



node1->node2





new_node

new_node



  • while 中持續尋找有 NULL 值的位置,indirect&node->next,也就是存有 node 值的 address






linked_list


cluster_0

pointer to pointer


cluster_1

linked list



indirect

indirect



node2

node2



indirect->node2





head

head



node1

node1



head->node1





node1->node2





new_node

new_node



  • 持續尋找 NULL 值的位置






linked_list


cluster_0

pointer to pointer


cluster_1

linked list



indirect

indirect



NULL

NULL



indirect->NULL





head

head



node1

node1



head->node1





node2

node2



node1->node2





node2->NULL





new_node

new_node



  • 找到後跳出迴圈,indirectNULL 位置的 address






linked_list


cluster_0

pointer to pointer


cluster_1

linked list



indirect

indirect



new_node

new_node



indirect->new_node





head

head



node1

node1



head->node1





node2

node2



node1->node2





node2->new_node





  • new_node 設定為該 address 的內容,完成插入在 linked list 尾端

討論

如果 malloc 無法分配記憶體,malloc 會回傳 NULLNULL 的定義在規格書中有寫道:

An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant.

得知 NULL 定義為 0 或為 (void*)0。對 NULL 取其成員,應為未定義行為。

透過程式碼驗證

typedef struct my_struct {
    int val;
} node;

int main() {
    node *head = NULL;
    printf("%d\n", head->val);
}

出現 segmentation fault,與預期的一樣。

再看看 assert 的作用,從 man page 中寫道:

If expression is false (i.e., compares equal to zero), assert()
prints an error message to standard error and terminates the program
by calling abort(3).

可推知當 expression 為 NULL (0) 時,會呼叫 abort(3)

如果這邊 assert 的作用是用來檢驗 malloc 是否成功,我認為應放在 new_node->value 前比較合適,否則沒有作用,new_node 為 NULL 時,new_node->value 就會丟出 segmentation fault。

find_entry

node_t *find_entry(node_t *head, int value)
{
    node_t *current = head;
    for (; current && current->value != value; current = current->next)
        /* interate */;
    return current;
}

find_entry 函式用來尋找指定 value 的節點







linked_list


cluster_0

linked list



head

head



node1

node1



head->node1





node2

node2



node1->node2





current

current



current->head











linked_list


cluster_0

linked list



head

head



node1

node1



head->node1





node2

node2



node1->node2





current

current



current->node1











linked_list


cluster_0

linked list



head

head



node1

node1



head->node1





node2

node2



node1->node2





current

current



current->node2





remove_entry

void remove_entry(node_t **head, node_t *entry)
{
    node_t **indirect = head;

    while ((*indirect) != entry)
        indirect = &(*indirect)->next;

    *indirect = entry->next;
    free(entry);
}

remove_entry 使用 pointer to pointer,可以減少刪除 head 時需要的判斷式







linked_list


cluster_0

pointer to pointer


cluster_1

linked list



indirect

indirect



entry

entry



indirect->entry





head

head



head->entry





node2

node2



entry->node2





  • while 迴圈找到 entry 的位置






linked_list


cluster_0

pointer to pointer


cluster_1

linked list



indirect

indirect



node2

node2



indirect->node2





head

head



head->node2





  • 因為 indirect&head->next,所以直接將該 address 上的值更新為 node2 即可

swap_pair

node_t *swap_pair(node_t *head)
{
    for (node_t **node = &head; *node && (*node)->next; BB1) {
        node_t *tmp = *node;
        BB2;
        tmp->next = (*node)->next;
        (*node)->next = tmp;
    }
    return head;
}

BB1

  • (e) node = &(*node)->next->next

可以先把 type 不合的選項刪掉,(a), (d), (f) 的 pointer type 不合先移除。接著分析迴圈作用,從 *node && (*node)->next 可猜測每次迴圈應為移動兩個 node。注意迴圈內的程式碼,可看出我們沒設定 *node 前節點的 next,這是因為我們 node 是以之前 node->next 的 address 來更新,所以答案為 (e)

這邊需要兩次 next 的原因是因為 *node 已被設定到 tmp 的前面

BB2

  • (c) *node = (*node)->next

我們需要將 node 往後移動一個節點,型態吻合的只有 (b), (c) 兩個選項。

深入討論 (b) 選項,先執行 node_t *tmp = *nodetmp 目前為第一個節點,假如執行 node = &(*node)->next 的話,node 現在為第一個節點的 next 的 address,再來, tmp->next = (*node)->next 這行會讓第一個節點 tmpnext 指向 *nodenext。但 &tmp->next == node,對 tmp->next 的改動會影響到 *node 的值,所以 tmp->next = (*node)->next 事實上與 *node = (*node)->next 等效。造成迴圈的 node = &(*node)->next->next 有不正確的結果。

故答案應為 (c)







linked_list


cluster_0

pointer to pointer


cluster_1

linked list



node0

node



head

head



node0->head





node1

node1



head->node1





node2

node2



node1->node2





node3

node3



node2->node3





tmp

tmp



tmp->head





  • 先將 *tmp 指向 *nodetmp 為要 swap 的第一個節點






linked_list


cluster_0

pointer to pointer


cluster_1

linked list



node0

node



head

head




node1

node1



node0->node1





head->node1





node2

node2



node1->node2





node3

node3



node2->node3





tmp

tmp



tmp->head





  • *node = (*node)->next 將 node 位址的值前進一個節點,現在 *node 為要 swap 的第二個節點






linked_list


cluster_0

pointer to pointer


cluster_1

linked list



node0

node



head

head




node1

node1



node0->node1






node2

node2



head->node2





node1->node2





node3

node3



node2->node3





tmp

tmp



tmp->head





  • tmpnext 指向 *node 的後方,也就是將 swap 的第一個 node 的 next 指向第三個節點






linked_list


cluster_0

pointer to pointer


cluster_1

linked list



node0

node



head

head




node1

node1



node0->node1






node2

node2



head->node2





node1->head






node3

node3



node2->node3





tmp

tmp



tmp->head





  • (*node)->next 要指回 tmp,實現 swap 的部分






linked_list


cluster_0

pointer to pointer


cluster_1

linked list



node0

node



head

head




node2

node2



node0->node2





node1

node1




head->node2





node1->head






node3

node3



node2->node3





tmp

tmp



tmp->head





  • 因為 *node(*node)->next 已經完成 swap,將 node 更新為下兩個節點的位置,根據 loop invariant,可以兩兩 swap 每個節點

討論

用 gdb 來檢驗上述 BB2 選項的分析是否正確

假設將 BB2 設為 node = &(*node)->next

(gdb) n
32	        tmp->next = (*node)->next;
(gdb) p node
$1 = (node_t **) 0x555555756268
(gdb) p *node
$2 = (node_t *) 0x555555756280
(gdb) p &(*node)->next
$3 = (struct __node **) 0x555555756288
(gdb) p (*node)->next
$4 = (struct __node *) 0x5555557562a0
(gdb) n
33	        (*node)->next = tmp;
(gdb) p node
$5 = (node_t **) 0x555555756268
(gdb) p *node
$6 = (node_t *) 0x5555557562a0

可以看到 node 的值在 tmp->next = (*node)->next 後被更新為下個節點,與推測符合,所以該選項為非。

reverse

node_t *reverse(node_t *head)
{
    node_t *cursor = NULL;
    while (head) {
        node_t *next = head->next;
        CCC;
        head = next;
    }
    return cursor;
}

CCC

  • (b) head->next = cursor; cursor = head

從現有程式碼可以看到,node_t *next = head->next; 是將 next 移動到 head 後方,而 head = next 則是將 head 向後移動。所以缺少的部分應為將 next 反轉的部分。反轉的目標從 linked list 的頭之 prev 開始,隨著 head 移動而跟著移動。可以推測出反轉的目標應為 cursor。再從選項中找有執行上面步驟的程式碼 (b)







linked_list



a

a

 



b

b

 



a:c->b:data






c

c

 



b:c->c:data






d

d

 



c:c->d:data






cursor

cursor



NULL

NULL



cursor->NULL





head

head



head->a





  • 進入迴圈前的狀態,cursorNULL 表示 head 的前一個節點為 NULL






linked_list



a

a

 



b

b

 



a:c->b:data






c

c

 



b:c->c:data






d

d

 



c:c->d:data






cursor

cursor



NULL

NULL



cursor->NULL





head

head



head->a





next

next



next->b





  • 新增 next 指標用來指向 head->next,用來在 head 反轉後,還能知道原本 head->next 的值






linked_list



a

a

 



NULL

NULL



a:c->NULL






b

b

 



c

c

 



b:c->c:data






d

d

 



c:c->d:data






cursor

cursor



cursor->NULL





head

head



head->a





next

next



next->b





  • head->next 反轉,指向為 cursor,這邊 cursorhead 前節點 NULL






linked_list



a

a

 



NULL

NULL



a:c->NULL






b

b

 



c

c

 



b:c->c:data






d

d

 



c:c->d:data






cursor

cursor



cursor->a





head

head



head->a





next

next



next->b





cursor 需移動到 head,紀錄下次反轉需指向的節點







linked_list



a

a

 



NULL

NULL



a:c->NULL






b

b

 



c

c

 



b:c->c:data






d

d

 



c:c->d:data






cursor

cursor



cursor->a





head

head



head->b





next

next



next->b





  • head 往後移動,準備反轉第二個節點 b,而 a 已經反轉完畢,根據 loop invariant,最後將反轉整個 linked list

改寫 swap_pairreverse

void swap_pair(node_t **head)
{
    for (node_t **node = head; *node && (*node)->next; node = &((*node)->next->next)) {
        node_t *tmp = *node;
        *node = (*node)->next;
        tmp->next = (*node)->next;
        (*node)->next = tmp;
    }
}

只需將傳進去的參數 head 改為 pointer to pointer 的形式就行了,因為本身 **node 就已經在 &head 上操作,所以其餘部分皆不用改動。

void reverse(node_t **head)
{
    node_t *cursor = NULL;
    while (*head) {
        node_t *next = (*head)->next;
        (*head)->next = cursor; 
        cursor = *head;
        *head = next;
    }
    *head = cursor;
}

head 改為 pointer to pointer,迴圈內每個步驟只會改變 *head 的值,並固定真正的記憶體位置,就不需要回傳 head。須注意迴圈的判斷式為 *head 是否為 NULL,故跳出迴圈後,*head == NULL,需將 *head 設定為最後一個節點 cursor

以遞迴方式實現 reverse

node_t *rev_recursive(node_t *prev, node_t *head)
{
    if (!head)
        return prev;
    node_t *next = head->next;
    head->next = prev;
    return rev_recursive(head, next);
}

void reverse(node_t **head)
{
    *head = rev_recursive(NULL, *head);
}

遞迴的方式很簡單,每次進入 rev_recursive 時,保證 prev 前的 node 皆已經 reverse,此時再將 head->next = prev 就能讓 head 以前皆完成 reverse,在繼續遞迴下去。終止條件為 head 為空,也就表示 prev 以前已經完成 reverse

Fisher–Yates shuffle

參考 wiki 上的 pseudocode

-- To shuffle an array a of n elements (indices 0..n-1):
for i from 0 to n−2 do
     j ← random integer such that i ≤ j < n
     exchange a[i] and a[j]

Version 1.

void shuffle(node_t **head)
{
    int size = 0;
    node_t *current = *head;
    node_t *target = *head;
    node_t *prev_current = NULL;
    node_t *prev_target = NULL;
    for (node_t *tmp = *head; tmp; tmp = tmp->next, ++size)
        /* calculate linked list size */;
    for (int i = 0; i < size - 1; ++i) {
        prev_target = target;
        target = current;
        int pos = rand() % (size - i);
        while (pos--) {
            prev_target = target;
            target = target->next;
        }

        if (prev_current)
            prev_current->next = target;
        if (prev_target)
            prev_target->next = current;
        
        node_t *tmp = current->next;
        current->next = target->next;
        target->next = tmp;

        prev_current = target;
        current = target->next;

        if (!i)
            *head = target;
    }
}

紀錄要交換的兩個 node 的 node->prevnode,再互相 swap 就行了。

既然有 node->prev 的話,我們可以不用特別紀錄需要交換的 node 的值,只要找要交換的 nodeprev 就行了。

Version 2.

void swap(node_t **x, node_t **y)
{
    node_t *tmp = *x;
    *x = *y;
    *y = tmp;
}

void shuffle(node_t **head)
{
    int size = 0;
    node_t *dummy = malloc(sizeof(node_t));
    dummy->next = *head;

    node_t *current = dummy;
    node_t *target = dummy;
    for (node_t *tmp = *head; tmp; tmp = tmp->next, ++size)
        /* calculate linked list size */;
    for (int i = 0; i < size; ++i) {
        target = current;
        int pos = rand() % (size - i);
        while (pos-- > 0) {
            target = target->next;
        }
        swap(&current->next, &target->next);
        current = current->next;
        target = target->next;
        swap(&current->next, &target->next);
    }
    *head = dummy->next;
    free(dummy);
}

增加 dummy 節點來減少 target 選到 head 的時的判斷式,程式的邏輯變得很簡單,找到要交換的兩節點之 prev 後,swap 目前兩節點之 next,再將節點往後移動,並再一次 swap 兩邊的 next







linked_list



dummy

dummy

 



a

a

 



dummy:c->a:data






b

b

 



a:c->b:data






c

c

 



b:c->c:data






d

d

 



c:c->d:data






current

current



current->dummy





target

target



target->dummy





  • currenttarget 設置為 dummy,一開始要交換的 node 為 a,只是我們的 current 目前指向要交換的節點之 prevtarget 也一樣指向為要交換的節點之 prev






linked_list



dummy

dummy

 



a

a

 



dummy:c->a:data






b

b

 



a:c->b:data






c

c

 



b:c->c:data






d

d

 



c:c->d:data






current

current



current->dummy





target

target



target->b





  • target 選到 b,所以 ac 需要交換






linked_list



dummy

dummy

 



c

c

 



dummy:c->c:data






a

a

 



b

b

 



a:c->b:data






b:c->a:data






d

d

 



c:c->d:data






current

current



current->dummy





target

target



target->b





  • a->prevnextc->prevnext 互換






linked_list



dummy

dummy

 



c

c

 



dummy:c->c:data






a

a

 



b

b

 



a:c->b:data






b:c->a:data






d

d

 



c:c->d:data






current

current



current->c





target

target



target->a





  • 移動 targetcurrent 至它們的 next 節點






linked_list



dummy

dummy

 



c

c

 



dummy:c->c:data






a

a

 



d

d

 



a:c->d:data






b

b

 



b:c->a:data






c:c->b:data






current

current



current->c





target

target



target->a





  • 重複上面的 swap 步驟,將 anextcnext 互換

Version 3.

void swap(node_t **x, node_t **y)
{
    node_t *tmp = *x;
    *x = *y;
    *y = tmp;
}

void shuffle(node_t **head)
{
    int size = 0;
    node_t **current = head;
    node_t **target;
    for (node_t *tmp = *head; tmp; tmp = tmp->next, ++size)
        /* calculate linked list size */;
    for (int i = 0; i < size - 1; ++i, current = &(*current)->next) {
        target = current;
        int pos = rand() % (size - i);
        while (pos-- > 0) 
            target = &(*target)->next;

        swap(target, current);
        swap(&(*target)->next, &(*current)->next);
    }
}

使用 pointer to pointer 來取代 dummy head,減少所需記憶體。







linked_list


cluster_0

pointer to pointer


cluster_1

linked list



cur

current



head

head



cur->head





tar

target



tar->head





node1

node1



head->node1





node2

node2



node1->node2





node3

node3



node2->node3





  • 一開始先將 currenttarget 指向 head 的 address






linked_list


cluster_0

pointer to pointer


cluster_1

linked list



cur

current



head

head



cur->head





tar

target



node2

node2



tar->node2





node1

node1



head->node1





node1->node2





node3

node3



node2->node3





  • target 選到 node2,指向 node2 的 adress






linked_list


cluster_0

pointer to pointer


cluster_1

linked list



cur

current



node2

node2



cur->node2





tar

target



head

head



tar->head





node1

node1



head->node1





node1->head





node3

node3



node2->node3





  • headnode2 交換,注意這邊交換的是節點,而 currenttarget 指向的位置不變






linked_list


cluster_0

pointer to pointer


cluster_1

linked list



cur

current



node2

node2



cur->node2





tar

target



head

head



tar->head





node3

node3



head->node3





node1

node1



node1->head





node2->node1





  • target->nextcurrent->next 交換,完成 swap






linked_list


cluster_0

pointer to pointer


cluster_1

linked list



cur

current



node1

node1



cur->node1





tar

target



tar->node1





head

head



node3

node3



head->node3





node1->head





node2

node2



node2->node1





  • 最後將 current 往後移動,進入下輪迴圈

Valgrind

使用 valgrind 來比較三種版本的記憶體

測試程式

int main(int argc, char const *argv[])
{
    node_t *head = NULL;
    
    for (int i = 0; i < 100000; ++i)
       add_entry(&head, rand() % 100);
       
    shuffle(&head);
    return 0;
}

使用 massif 分析,記得開啟 stacks 紀錄堆疊區塊

valgrind --tool=massif --stacks=yes ./a.out 

為了版面整潔,只列出程式結束前幾個 snapshots 的記憶體情形

  • Version 1.

--------------------------------------------------------------------------------
  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
 46 58,905,268,954        2,400,376        1,600,000       800,000          376
 47 60,293,938,629        2,400,376        1,600,000       800,000          376
 48 61,682,533,376        2,400,376        1,600,000       800,000          376
 49 63,534,166,248        2,400,376        1,600,000       800,000          376
 50 64,287,194,889        2,400,376        1,600,000       800,000          376
 51 65,040,246,906        2,400,376        1,600,000       800,000          376
  • Version 2.

--------------------------------------------------------------------------------
  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
 88 58,440,902,265        2,400,384        1,600,016       800,008          360
 89 58,903,802,669        2,400,384        1,600,016       800,008          360
 90 59,366,668,507        2,400,384        1,600,016       800,008          360
 91 59,829,453,069        2,400,384        1,600,016       800,008          360
  • Version 3.

​--------------------------------------------------------------------------------
​ n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
​88 58,443,219,786        2,400,360        1,600,000       800,000          360
​89 58,906,109,686        2,400,360        1,600,000       800,000          360
​90 59,368,915,633        2,400,360        1,600,000       800,000          360
​91 59,831,815,688        2,400,360        1,600,000       800,000          360
​92 60,294,835,559        2,400,360        1,600,000       800,000          360
​93 60,757,641,570        2,400,360        1,600,000       800,000          360
​94 61,220,406,226        2,400,360        1,600,000       800,000          360
​95 61,683,176,372        2,400,360        1,600,000       800,000          360
​96 62,145,982,621        2,400,360        1,600,000       800,000          360
  • 從 useful-heap 來看,因為 version 2 有額外 malloc 一個 dummy node,所以比其他兩種版本多了 16 bytes 的 heap。
  • 從 stacks 來看,version 1 較其他兩種版本多了兩個指標變數,在實驗環境中 ( 64位元 ),pointer 為 8 bytes,所以 version 1 多了 16 bytes。

最理想的版本為 version 3,使用到的 total memory 最小,為 2,400,360 bytes。

tags: linux2020