2016q3 Homework3 (mergesort-concurrent)

# 2016q3 Homework3 (mergesort-concurrent) contributed by <`FZzzz`> ## 預期目標 * 看資料 * [資料閱讀筆記區](https://hackmd.io/MYVgJgRgTA7AHGAtCAzGYiAscBsGCGwmIiE+2AjDgAwow4CmEQA=) * 學習 POSIX Thread Programming * 想辦法完成實驗 ## 主要開發環境 Distribution: Ubuntu 16.04 LTS ``` Architecture: x86_64 CPU 作業模式： 32-bit, 64-bit Byte Order: Little Endian CPU(s): 8 On-line CPU(s) list: 0-7 每核心執行緒數：2 每通訊端核心數：4 Socket(s): 1 NUMA 節點： 1 供應商識別號： GenuineIntel CPU 家族： 6 型號： 60 Model name: Intel(R) Xeon(R) CPU E3-1231 v3 @ 3.40GHz 製程： 3 CPU MHz： 3748.632 CPU max MHz: 3800.0000 CPU min MHz: 800.0000 BogoMIPS: 6784.07 虛擬： VT-x L1d 快取： 32K L1i 快取： 32K L2 快取： 256K L3 快取： 8192K ``` > 最近筆電剛弄好環境，有可能也會在上面測，如果有需要附上我再貼　[name=`chenyi`] ## TODO - [x] Git hooks - [x] 打亂字典檔（運用sort -R處理） - [x] 閱讀程式脈絡並做一些修改 - [ ] 提出實作層面不足 - [ ] lock-free實作 - [ ] 分析 ### 閱讀程式脈絡並思考改進原本的程式使用了若干個mutexes > (目前數下來是兩個，mutrace看是3個) > [name=`fzzzz`] * data_contex_mutex * task_queue_mutex task執行區塊 ```C static void *task_run(void *data) { (void) data; while (1) { task_t *_task = tqueue_pop(pool->queue); if (_task) { if (!_task->func) { tqueue_push(pool->queue, _task); break; } else { _task->func(_task->arg);//call function and run free(_task); } } } pthread_exit(NULL); } ``` task在開始前會被放入共用的work queue中，在tpool_init(pool, thread_count, task_run)中 ```C for (uint32_t i = 0; i < tcount; ++i) pthread_create(&(the_pool->threads[i]), &attr, func, NULL); ``` 放入task_run的讓各個thread從task queue撈工作，直到撈出來的 task == null。可以先算完會用至少幾個thread，再開幾個給他就好 > 不過這樣就不符題意了XD [name=`fzzzz`] >> 為何要說「假設多開很多thread，會造成浪費」呢？你當然要先作空間複雜度分析，然後透過預先保留適當的資源和調整 thread pool 即可。再一次可見數學分析的重要。 [name=jserv] 超酷的比對方式XD ```C llist_t *small = (llist_t *) ((intptr_t) a * (a->head->data <= b->head->data) + (intptr_t) b * (a->head->data > b->head->data)); ``` > * 可能改進方向 > * condition variable在整支程式裡似乎沒被用到(init卻不使用)，然後沒有destroy > * 沒做到平衡地分配工作 > [name=`fzzzz`] #### 修正test case給予方式 * 給予正確且平均的亂數分佈 * rand() and srand() * 能接受phonebook作業裡的資料 * 利用strcmp() 自動化測試(Makefile) ```bash for i in 1 2 4 8 16 32 64 128 256; do ./sort $$i input.txt; done ``` 字串比對 ```C llist_t* small = (strcmp(a->head->data , b->head->data) <= 0) ? a:b; ``` #### 原版時間測試利用clock_gettime()來作為量測時間的方法，這邊主要想放一下各個thread數目的效能比較 >> 等等補圖正在更改成可比對字串中比對字串版 ![](https://i.imgur.com/aRyXRHJ.png) 太少筆了不好看 ![](https://i.imgur.com/p9L1zgg.png) > 為什麼會振盪？ > 應該不是clock_gettime()的問題，肉眼在看程式執行時間就可以看到的差異 > 清理cache > [name=`fzzzz`] > 可以嘗試對特定的N-threads做多次，來看實驗數據分佈的狀況 > [name=cheng hung lin] 可以看出趨勢是thread數目愈多，花費時間就愈多 ## References * [作業說明](https://hackmd.io/s/rJ-GWtJ0) * [Mergesort CS50 youtube](https://www.youtube.com/watch?v=EeQ8pwjQxTM) * [Concurrent-II](https://github.com/jserv/concurrent-ll) ###### tags `fzzzz` `sysprog`