# 2016q3 Homework3(mergesort-concurrent) contributed by <`nekoneko`> ## 預期目標 - 作為 concurrency 的展示案例 - 學習 POSIX Thread Programming,特別是 synchronization object - 為日後效能分析和 scalability 研究建構基礎建設 - 學習程式品質分析和相關的開發工具 ## 筆記 ### manger and workers 參考資料: [Chapter 2 - Designing Threaded Programs](http://maxim.int.ru/bookshelf/PthreadsProgram/htm/r_19.html) - manager 接收請求(request),根據請求分配工作給workers - worker完成任務時,可以回覆資訊給manager,交給manager處理 - 每當manager分配任務給worker時,必須建立一個新的執行緒(create a new thread)。建立新的執行緒是有成本(overhead)在的。其中一個解決的模式為`thread pool`。 - `thread pool`: 事先建立多個執行緒,我們稱之為thread pool,等到manager分配工作下來後,在從thread pool裡取出一個執行緒來做其工作。 ### thread pool - task queue - thread pool (many threads) ![](https://upload.wikimedia.org/wikipedia/commons/0/0c/Thread_pool.svg) ### Git hooks - 安裝 ```txt $ sudo apt-get install astyle cppcheck $ script/install-git-hooks ``` - intall-git-hooks是一個bash檔 ```txt #!/bin/sh if ! test -d .git; then echo "Execute scripts/install-git-hooks in the top-level directory." exit 1 fi ln -sf ../../scripts/pre-commit.hook .git/hooks/pre-commit || exit 1 echo echo "Git commit hooks are installed successfully." ``` - `test -d .git`: 檢查`.git`是否為存在且為資料夾 ### mutrace #### 執行 - 安裝 ```txt $ sudo apt-get install mutrace ``` - 執行 ```txt $ (for i in {1..8}; do echo $RANDOM; done) | mutrace ./sort 4 8 ``` - 輸出 ```txt mutrace: Showing statistics for process sort (pid 13148). mutrace: 3 mutexes used. Mutex #0 (0x0x19649b0) first referenced by: /usr/lib/mutrace/libmutrace.so(pthread_mutex_init+0xf2) [0x7f71d0c9f4b2] ./sort(tqueue_init+0x38) [0x401277] ./sort(tpool_init+0x6a) [0x4014cc] ./sort(main+0x161) [0x401c74] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7f71d06d6830] Mutex #2 (0x0x7f71ce2a5860) first referenced by: /usr/lib/mutrace/libmutrace.so(pthread_mutex_lock+0x49) [0x7f71d0c9f6b9] /lib/x86_64-linux-gnu/libgcc_s.so.1(_Unwind_Find_FDE+0x2c) [0x7f71ce0a1fec] [(nil)] Mutex #1 (0x0x603120) first referenced by: /usr/lib/mutrace/libmutrace.so(pthread_mutex_init+0xf2) [0x7f71d0c9f4b2] ./sort(main+0x11d) [0x401c30] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7f71d06d6830] mutrace: Showing 3 most contended mutexes: Mutex # Locked Changed Cont. tot.Time[ms] avg.Time[ms] max.Time[ms] Flags 0 153 38 25 0.040 0.000 0.002 Mx.--. 2 20 14 5 0.006 0.000 0.002 M-.--. 1 13 7 1 0.003 0.000 0.000 M-.--. |||||| /||||| Object: M = Mutex, W = RWLock /|||| State: x = dead, ! = inconsistent /||| Use: R = used in realtime thread /|| Mutex Type: r = RECURSIVE, e = ERRRORCHECK, a = ADAPTIVE /| Mutex Protocol: i = INHERIT, p = PROTECT / RWLock Kind: r = PREFER_READER, w = PREFER_WRITER, W = PREFER_WRITER_NONREC mutrace: Note that the flags column R is only valid in --track-rt mode! mutrace: Total runtime is 1.189 ms. ``` - 使用addr2line將address轉成程式碼所在的行號 ```txt $ addr2line -e sort 0x401277 0x4014cc 0x401c74 ``` - 輸出: ```txt /home/jacklin/Documents/sys2016/week3/mergesort-concurrent/threadpool.c:15 /home/jacklin/Documents/sys2016/week3/mergesort-concurrent/threadpool.c:80 /home/jacklin/Documents/sys2016/week3/mergesort-concurrent/main.c:175 ``` > 這邊共筆的範例是用括弧裡的位址(ex: 0x38),但在我電腦上要用中括弧裡的位址才能印出code的位置。[name=cheng hung lin] - main.c:175 ```clike=172 tpool_init(pool, thread_count, task_run); /* launch the first task */ task_t *_task = (task_t *) malloc(sizeof(task_t)); _task->func = cut_func; _task->arg = the_list; tqueue_push(pool->queue, _task); ``` - threadpool.c: 80, 在`tpool_init`中 ```clike=78 tqueue_init(the_pool->queue); pthread_attr_t attr; pthread_attr_init(&attr); pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE); ``` - threadpool.c: 15, 在`tqueue_init`中 ```clike=12 the_queue->head = NULL; the_queue->tail = NULL; pthread_mutex_init(&(the_queue->mutex), NULL); pthread_cond_init(&(the_queue->cond), NULL); the_queue->size = 0; return 0; ``` 這邊對印address的code似乎沒看到mutex發生的情況,覺的自己在某方面上有不了解的 #### 資料閱讀 - `Locked`: how often the mutex was locked during the entire runtime - `Changed`: how often the owning thread of the mutex changed - `Cont.`: how often the lock was already taken when we tried to take it and we had to wait - `tot.Time[ms]`: lock的總時間 - `avg.Time[ms]`: 平均每個lock的時間 - `max.Time[ms]`: lock最久的時間 #### `mutrace -h` ## 作業 ### TODO - [ ] 35萬筆資料亂序排序 - [ ] 改成35萬筆資料輸入 ### 程式碼分析 #### main.c - main.c: main - 初始化link-list的資料 - 初始化pthread_mutex `data_contex` - 初始化thread pool和schedule的函式 `task_run` - 將cut_fun放入thread pool裡的work queue ```graphviz digraph main { a -> b -> c; a [label="intialize link-list"]; b [label="intialize thread pool"]; c [label="put cut_fun into work queue"]; } ``` - main.c: main - max_cut 決定最後task做cut_funct的數量 ```clike= max_cut = thread_count * (thread_count <= data_count) + data_count * (thread_count > data_count) - 1; ``` - main.c: merge 總共有4個情況 1. list的長度小於data_count: 代表傳入merge的data都還是切分過得list,所以要執行左序列和右序列的合併。 1. \_t為NULL的情況: \_t是tmp_list的別名(alias),tmp_list是個全域變數,若tmp_list為NULL,則將傳入data pointer賦值,這種情況是左序列的資料傳入merge的情況; 2. \_t非NULL的情況: 即tmp_list非NULL pointer,代表是右序列的資料傳入merge,將\_list(tmp_list的copy值)和傳入的data交給下一個新的task,merge_list,tmp_list重設為NULL。 2. list的長度大於等於data_count: list已經merge完畢,創立一個func指像NULL pointer的task,放入wait queue裡,當作thread_pool裡task_run裡的中止信號。 - main.c: task_run(void \*) - 一直從work queue抓取新的task來執行,直到work queue 裡**task的funcptr為NULL**(第131行)。 ```clike=125 static void *task_run(void *data) { (void) data; while (1) { task_t *_task = tqueue_pop(pool->queue); if (_task) { if (!_task->func) { tqueue_push(pool->queue, _task); break; } else { _task->func(_task->arg); free(_task); } } } pthread_exit(NULL); } ``` - main.c: cut_func - 不斷將list分成兩半,再將左list和右list的cut_func工作丟回work queue,交由thread pool裡的thread再執行左半list和右半list的cut_func。 - 直到llist的長度小於2或是,cut_count大於等於main函式初始的max_count。這邊cut_count是data_context.cut_thread_count複製過來的,每當cut_func分切llist時,data_context.cut_thread_count會計數一次。 - threadpool.c: tpool_init - 建立thread pool,每個threads一開始做的task由`void *(*func)(void *)`初始。也就是說,每條thread做的就是func。 - 在main.c裡,傳task_run進入tpool_init。 - 所以每個thread pool裡的thread做的是,不斷從work queue裡取工作來執行。 - 設定pthread_attr為JOINALBLE,pthread_create預設本來就是JOINALBLE,但是考慮到不同平台上,posix thread背後實做上可能會有差異,在重新初始atrr一次。 - threadpool.c: `tqueue_*` - 無論是做tqueue_push還是tqueue_pop,都有tqueue_t裡的mutex包住會更動tqueue_t的critial section。 - struct \_task的成員命名不好,list的前後連結通常叫做prev和next。 - 這邊的queue是使用double link list來實做 - tqueue_init: - head 和 tail 初始化為NULL, mutex, condition variable使用pthread的函式初始化,size初始化為0 - tqueue_push: ```clike=47 int tqueue_push(tqueue_t *the_queue, task_t *task) { pthread_mutex_lock(&(the_queue->mutex)); task->last = NULL; task->next = the_queue->head; if (the_queue->head) the_queue->head->last = task; the_queue->head = task; if (the_queue->size++ == 0) the_queue->tail = task; pthread_mutex_unlock(&(the_queue->mutex)); return 0; } ``` - ~~第55行if的判斷式可以由第52行還取代,因為一開始初始時,head和tail都是NULL pointer,要判斷queue是emty的話,有size, head, 和 tail可以判斷。~~ - tqueue_push的queue member,做記憶體配置的時候,是在push函式之外,然而釋放的動作是在tqueue_free裡。 - 參考mingnus共筆裡提到的glib,裡面有實做一個double ended queue的object,其中free函式可允許傳入自訂的函式對queue裡element做記憶體釋放。 - `void g_queue_free_full (GQueue *queue, GDestroyNotify free_func);` - Convenience method, which frees all the memory used by a GQueue, and calls the specified destroy function on every element's data. - list.c, list.h: ```clike= #include <stdint.h> ... typedef intptr_t val_t; ... typedef struct node { val_t data; struct node *next; } node_t; ``` - intptr_t: c11的library提供的變數,允許void pointer用int做存取。參考[stackoverflow](http://stackoverflow.com/questions/35071200/what-is-the-use-of-intptr-t#comment57867575_35071243),假設使用pointer做bitwise運算,適合使用uintptr_t。另外,也有網友表示使用intptr_t做add/sub比較安全。 ### REFACTORING - ~~將struct node的 data改為void \*,允許存取多種資料型態~~intptr_t是將pointer轉成int儲存,但是void \*和intptr_t兩者之間比較是用,還是不太清楚。 ```clike typedef void *val_t; ``` - list_print的參數允許傳自訂的function pointer和file pointer來印出list的成員 ```clike typedef void (*print_funptr_t)(val_t,FILE *); void list_print(llist *list, print_funptr_t funptr, FILE *fp); ``` ### 疑問 - [ ]參考shelly4132同學的共筆和repo,知道將max_cut的部份去掉,所以原本cut_func的收斂條件變為llist size小於等於1的時候。mergesort-concurrent裡有兩種merge,一個是單一一個task裡的llist做merge,另一個是兩個task的merge。若切到每個task裡都只剩1個element的link list,效率會不會是不好的? > merge, merge_sort, merge_list還沒看完,等了解完,再確認上面想法有沒有錯。[name=cheng hung lin] ### 參考資料 - [Fzzz大大的共筆](https://hackmd.io/s/SJfa8wFR) -- CS50-mergesort - [FB--mingnus對mergesort舉出的缺失](https://www.facebook.com/groups/system.software2016/permalink/1140141602732010/?match=bWVyZ2Vzb3J0IGNvbmN1cnJlbnQ%3D) ###### tags: `nekoneko`