Homework 4 (mergesort-concurrent)

作業要求在這裡
作業交這裡
程式在這裡

Code

mergesort-concurrent

一開始發現註解有點奇特，突然想到搞不好是用了 doxygen 。然後就手癢：

$ doxygen .

然後就生出美美的文件了。真是開心。

接著看 code 。大致的概念是：把整個 linked list 拆成很多段，每一段給一個 thread 負責，最後再把每個執行序排好的東西合併。

首先要可以改成 phonebook 資料的版本。因為 list.c 當中定義的 node_t 中的 val_t 可以指到字串，所以在新增節點時把他變成指到一個字串：

static uint32_t build_list_from_file(llist_t *_list, const char *filename 
{
    ...
    
        char *name = (cher*)malloc(16);
        strcpy(name, buffer);
        list_add(_list, name);
        
    ...
}

不過因為字串比大小跟整數比大小不一樣，所以這需要另外調整。把 merge_sort.c 當中的 sort_n_merge()由：

        llist_t *small = (llist_t *)
                         ((intptr_t) a * (a->head->data <= b->head->data) +
                          (intptr_t) b * (a->head->data > b->head->data));

改成：

       llist *small = strcmp(a->head->data, b->head->data)>0?b:a;

更好的方式應該是在 merge_sort 函數當中多接收一個 cmp(), 像 qsort() 一樣。不過這樣就要連函數的原型都改掉。這裡先讓他能動再說。

我暫時用以下的方法去打亂字典檔：

$ sort -R words.txt > random_words.txt

concurrent-ll

然後研究 concurrent-ll 。

在這之前，可以先翻一下這個實作的基礎，也就是這篇文章。

文中首先提到對「天真版」的 linked-list 插入與刪除，就算直接用compare-and-swap 這個最小操作改寫，也沒有辦法保證結果正確。在只有 insert 的狀況下可以成立，但如果加上 delete 的話，如果 insert 跟 delete 發生時機很接近，有可能會發生以下的狀況：

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

情境是這樣：準備插入 20, 並且刪除 10。這是做到一半的狀況。

delete 做的事情是這樣：

CAS(&(H->next), head_next, head->next->next)
insert 準備執行的動作是這樣：

CAS(&(node_10->next), node_10_next, node_20)

但是兩個 CAS 的動作的先後是無法確定的。如果 insert 的先發生，那結果是正確的。但是如果 delete 的先發生，就會變成下面這個樣子：

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

結果顯然不太對，因為本來沒有要刪除 20 那個節點。那到底要怎麼辦呢？這篇文章提到一個作法：刪除時不要真的刪除，而是掛一個牌子說「此路段即將刪除」，但是還是過得去，像這樣：

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

然後等到真的要拆他的時候（比如說需要插入的時候），再把它拆掉。

所以需要有個方法標示「不需要的節點」，然後要有 atomic operation.

標示不用的節點

不過看了一陣子，發現了幾個問題：為什麼只用位元運算就可以做到「邏輯上」刪除節點？

答案應該是用 data alignment。首先是 C99 的規格中 6.7.2.1 提到：

12
Each non-bit-field member of a structure or union object is aligned in an implementation- defined manner appropriate to its type.
13
Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.

所以這個東西是 implementation-defined. 所以跑去看一下 gcc 的文件怎麼說：

The alignment of non-bit-field members of structures (C90 6.5.2.1, C99 and C11 6.7.2.1).
Determined by ABI.

囧

先回去翻 C99 的規格，裡面提到關於 intptr_t 這個資料型態：

7.18.1.4 Integer types capable of holding object pointers

The following type designates a signed integer type with the property that any valid pointer to void can be converted to this type, then converted back to pointer to void, and the result will compare equal to the original pointer:

intptr_t

所以 intptr_t 是個 integral type, 要至少可以裝得下 pointer

然後去查了一下 /usr/include/stdint.h, 查到以下內容：

/* Types for `void *' pointers.  */
#if __WORDSIZE == 64
# ifndef __intptr_t_defined
typedef long int        intptr_t;
#  define __intptr_t_defined
# endif
typedef unsigned long int   uintptr_t;
#else
# ifndef __intptr_t_defined
typedef int         intptr_t;
#  define __intptr_t_defined
# endif
typedef unsigned int        uintptr_t;
#endif

然後繼續找這份文件，裡面其中一個附表：

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

然後又說：

Aggregates and Unions
Structures and unions assume the alignment of their most strictly aligned compo-
nent. Each member is assigned to the lowest available offset with the appropriate
alignment. The size of any object is always a multiple of the object‘s alignment.
An array uses the same alignment as its elements, except that a local or global
array variable of length at least 16 bytes or a C99 variable-length array variable
always has alignment of at least 16 bytes. 4
Structure and union objects can require padding to meet size and alignment
constraints. The contents of any padding is undefined.

所以可以推論結構：

typedef intptr_t val_t;

typedef struct node {
    val_t data;
    struct node *next;
} node_t;

的定址當中，不會發生位址的最後一個 bit 被使用到的狀況(因為都是 4 的倍數 + 必須對齊)。所以就把最後一個 bit 設成 1 當作是刪掉這個節點的 mark. 而把最後一個 bit 設成 0, 就表示恢復原狀。

atomic operation

util.h 裡面定義了一堆 atomic operations. 估狗一下發現這篇。不過大部份都跟名字一樣，只是變成 atomic 版本。

注意到 _sync 系列的 atomics builtins 已經標註為 legacy: Legacy __sync Built-in Functions for Atomic Memory Access
–jserv

不過完全不知道怎麼從 atomic operation 生出 lock-free 演算法，所以只好繼續查資料。

insert

註：後面引用的 code 來自於論文。但是基本上跟 concurrent-ll 中的原始碼大同小異。

linked-list 插入都是插在兩個節點中間，左邊的就叫 left_node, 右邊的就叫 right_node 。

大概的概念是這樣：到寫入位址的前一刻，如果這兩個節點仍然是相鄰的，就寫入新値; 如果發現不一樣，就重新來過。原文中的 code 是這樣：

public boolean List::insert (KeyType key) { 
    Node *new_node = new Node(key);
    Node *right_node, *left_node;
    do {
        /* 尋找插入位置的左右相鄰的節點 */
        right_node = search (key, &left_node);
        
        /* 找到跟欲插入節點一樣的值，就不插入 */
        if ((right_node != tail) && (right_node.key == key)) /*T1*/
            return false;
            
        /* 準備插入時，確定剛剛找的左右是否相鄰，也就是驗證 right.next 在這時是否仍然等於 right_mode */
        /* 如果是，就 compare-and-swap */
        /* 如果不是，就重新搜尋適當的插入位置，並重新來過 */
        new_node.next = right_node;
        if (CAS (&(left_node.next), right_node, new_node)) /*C2*/
            return true;
    } while (true); /*B3*/
}

這裡就注意到一件事：要怎麼樣實作 search，使得它真的可以找到相鄰的兩個節點，而且中間沒有「被標記刪除」的節點。就算中間碰到其他人插了節點，也要能夠發現並且做出相應的動

search

search() 的功能大致上是這樣：

找到「插入位置前後」的兩個節點。既然是準備插入，所以一邊會比較大，一邊會比較小。小的那邊的節點叫 left_node ，大的那邊的節點叫 right_node。

因為只能回傳一個值，這裡的設計是函數回傳right_node, 並且提供一個 pointer to pointer 讓函數可以寫入left_node。

如果節點已經存在的話，那本來應該是「比較大」的那個節點，就存「大小相等」的那個節點位置。

這裡要注意 search 必需「真的找到相鄰的節點」。

之所以這樣說是因為如果有很多執行緒一邊插入，一邊刪除，那麼這一刻以為相鄰的節點，可能下一刻中間就立刻就被插入其他點，或是其中一個被刪除。這些都是單一執行緒不會發生的事。所以若找到的兩個節點中間有被標記為刪除的節點，這時候 search 必須刪除那些節點後再回傳。

然後因為本來確認的事實，如果稍微不留神，到下一刻可能就會立刻被其他執行緒改變，所以會一直反覆檢查。

全部的 code 是這樣：


private Node *List::search (KeyType search_key, Node **left_node) { 

    Node *left_node_next, *right_node;
    search_again:
        do {
            
            Node *t = head;
            Node *t_next = head.next;
            do {
                /* 因為while 的條件，如果找到沒被標記的點，它只有可能是值比較小的 */
                /* 這時候就把他們存起來 */
                if (!is_marked_reference(t_next)) {     
                (*left_node) = t;
                left_node_next = t_next;
            }
            
            /* 這裡就是單純的遍歷 */
            /* t = t->next 的感覺*/
            /* 但如果碰到被標記刪除的點，要記得接下來要走的是沒被標記的版本*/
            t = get_unmarked_reference(t_next);
            if (t == tail) break;
            t_next = t.next;
        } while (is_marked_reference(t_next) || (t.key<search_key));
        right_node = t;
    
        /* 如果中間沒有被標記刪除的點，做最後檢查後回傳 */ 
        if (left_node_next == right_node)
            if ((right_node != tail) && is_marked_reference(right_node.next)) 
                goto search_again; /*G1*/
            else
                return right_node; /*R1*/
        
        /* 如果中間有被標記刪除的點，就把他們刪掉 */
        if (CAS (&(left_node.next), left_node_next, right_node)) /*C1*/
            if ((right_node != tail) && is_marked_reference(right_node.next)) 
                goto search_again; /*G2*/
            else
                return right_node; /*R2*/
        } while (true); /*B2*/ 
}

詳細一點的解釋大致如下：

遍歷所有的節點。當前的點叫 t, 當前的點下一個位置叫 t_next。大致上是這樣：


do {
    if (!is_marked_reference(t_next)) {         
        (*left_node) = t;
        left_node_next = t_next;
    }
    t = get_unmarked_reference(t_next);
    if (t == tail) break;
    t_next = t.next;
} while (is_marked_reference(t_next) || (t.key<search_key)); /*B1*/ 
right_node = t;

可以注意 while 迴圈裡面的驗證順序：先驗證 t 是不是被標記為刪除，如果不是，則接著驗證 t 的值是否大於等於要搜尋的值

如果 t 是被刪掉的節點，那麼就要一直往下跑，直到遇到第一個沒被刪掉的點。這個節點有可能比較要搜尋的值小或大：
- 比較小：下一次回到迴圈頭時，就會把它設成左端點。
- 比較大或等於：就會跳出迴圈（因為沒被刪 + 值大於等於），並且把它設為右端點
如果 t 是沒被刪掉的點，也就是 is_marked_reference(t_next) 為假，但是裡面的值比要搜尋的還小，那就表示 t 是可能的左端點，下次回到迴圈頭時，就把它設成左端點。並繼續搜尋右端點。
因為要求回傳的點保證相鄰，所以剛找到左右之後，就要檢查是否相鄰：

if (left_node_next == right_node)
    if ((right_node != tail) && is_marked_reference(right_node.next))
        goto search_again; /*這是全部重做的意思*/
    else
        return right_node; /*R1*/

如果相鄰，皆大歡喜。準備回傳時再檢查有沒有被刪掉之後，就把東西回傳。

如果不相鄰，表示中間有一些標記被刪除的點。這時候就要先把中間那些節點全部刪掉，再回傳左右節點：

/* 3: Remove one or more marked nodes */
if (CAS (&(left_node.next), left_node_next, right_node)) /*C1*/
    if ((right_node != tail) && is_marked_reference(right_node.next))
        goto search_again; /*G2*/
    else
        return right_node; /*R2*/

這裡就是一口氣把它 CAS 成剛剛找到的右邊節點。

(然後可以發現這裡沒有 free 他。所以在 concurrent-ll 當中的註解有說要等人做垃圾回收)

這裡動不動就會要他全部重做。不管是 CAS 比較時失敗，或是好不容易找到的右端點被莫名其妙冒出來的 delete 刪掉、左右端點間突然被插入了奇怪的點等等。

delete

有了 search 之後一切就變得輕鬆許多了。delete 的程式碼如下：

public boolean List::delete (KeyType search_key) 
{ 
    Node *right_node, *right_node_next, *left_node;
    do {
        /* 找相鄰節點。要刪的那個會被存在 right_node */
        right_node = search (search_key, &left_node);
        
        /* 如果沒找到的話 */
        if ((right_node == tail) || (right_node.key != search_key)) /*T1*/
            return false;
        right_node_next = right_node.next;
        
        /* 如果 right_node 剛剛到現在沒被標記為刪除的話 */
        if (!is_marked_reference(right_node_next))
            if (CAS (&(right_node.next), right_node_next, get_marked_reference (right_node_next)))
                break;
    } while (true);
    
    /* 試著刪除剛剛標記的點。 */
    /* 不過只要呼叫 search() 都會順便執行刪除。就算這次沒成功也無妨*/
    if (!CAS (&(left_node.next), right_node, right_node_next)) /*C4*/ 
        right_node = search(right_node.key, &left_node);
    return true;
}

這個概念就相對單純：

找到要刪的點（search會把他回傳），然後檢查是不是要找的點那個。
如果確定找到，確認剛剛檢查的這段時間，這兩個點中間有沒有被插進奇怪的東西。

Lock-Free

查一查發現了一個研討的影片：

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

雖然主要是使用 C++, 但我想我可以把概念轉成相應的 C。

Think in Transaction

這裡 transaction 在這裡是「轉帳」的意思。這裡提到的想法是：把程式對資料的操作想成轉帳。

為什麼呢？在轉帳的時候，比如說轉 10,000 元到某人的帳戶。如果這個過程中他剛好去了查詢帳號中的存款，那他只應該看到兩種結果：

轉帳前的數字
轉帳後的數字

除了這兩個數字以外，他不應該看到其他數字，就算程式對帳號金額做一些處理，而使得這個資料處在某種未完成的中間狀態，他也不應該看到任何資料的中間狀態。

一直到對資料更改完畢之後，才使用 atomic write ，把改變寫入。這樣一來，其他人就只會看到改變前後的狀態。

這裡也提到 atomic write 隱含這樣的意義：本來對資料的改變，外面的人是看不到（至少不應該看到）的 (只看得到轉帳前的狀態)，而 atomic write 做的事是「把你做的所有改變讓大家看到」(就是把轉帳後的餘額顯示出來)。

Atomic Operation

聽過的比較貼合的中文是叫「最小操作」。就是某一個動作執行時，中間沒有辦法分割。

其中一種機制是 Compare and Swap (CAS)，概念上來說，就是「不可分割地」做以下的事：

    CAS(int *a, int old_val, int new_val)
    {
        if (*a == old) {
            *a = new;
            return old_val;
        }
        return *a;
    }

白話文就是：

「可以幫我把冰箱裡的大象換成長頸鹿嗎？什麼？你跟我說冰箱裡面放的不是大象？那冰箱裡面到底放什麼？」

註：「裡面放的東西一樣」不保證「剛剛到現在都沒有人動過裡面的東西」是不一樣的。比如 ABA 問題中的情境。

通常會回傳一些值來表示有沒有真的寫入，不過這可能有不同的機制(比如說傳 bool 或是回傳 *ptr 的值)，所以上面就寫個 psuedo code。

比如說 GCC 的：

bool __sync_bool_compare_and_swap (type *ptr, type oldval type newval, ...);

如果 *ptr 被更新成 new_val 的話，會回傳 true。但是另外一個：

type __sync_val_compare_and_swap (type *ptr, type oldval type newval, ...);

會回傳 *ptr 在這個動作之前的值。

但是不管怎麼樣，都有辦法事後知道那個當下有沒有成功把東西寫進去。

3 種 Lock-Free 的等級

wait-free : 所有執行緒都可以再有限的步驟內結束，不管其他執行序做了什麼事。
lock-free :

"Somebody makes progress if my thread has to wait"

就算有的執行序可能永遠都沒有做事(比如說因為 starvation)，但總之一定有一些執行續是有貢獻的，而且做的事情不是在裝忙(比如說 livelock)。
obstruction-free :

I will make progress as long as nobody else is doing anything

但是當有其他執行續在動的時候，可能會 livelock