# 2017q1 Homework4 (phonebook-concurrent) contributed by <`twzjwang`> 作業說明: [B07: phonebook-concurrent](https://hackmd.io/s/rkOExTOoe) github: [twzjwang/phonebook-concurrent](https://github.com/twzjwang/phonebook-concurrent) # 開發環境 - Ubuntu Ubuntu 16.04.2 LTS Linux 4.8.0-36-generic - cpu version: Intel(R) Core(TM) i5-3337U CPU @ 1.80GHz Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian CPU(s): 4 L1d cache: 32K L1i cache: 32K L2 cache: 256K L3 cache: 3072K - memory size: 4GiB # 原始版本 - file_align - 將每一行用 '\0' 填滿 - Pthread - 實作 multithread - [mmap](http://man7.org/linux/man-pages/man2/mmap.2.html) - mmap() creates a new mapping in the virtual address space of the calling process. - On POSIX systems on which mmap(), msync(2), and munmap() are available. - 頻繁的檔案存取 - memory pool - 結果 - ![](https://i.imgur.com/vySOIPS.png) # 修改 ## 平行化 Connect the linked list of each thread - 原始 - ![](https://i.imgur.com/9dVbxU2.png) - n 個 thread 需要連接 n - 1 次 - thread 0 -> thread 1 - thread 1 -> thread 2 - thread 2 -> thread 3 ```C= for (int i = 0; i < THREAD_NUM; i++) { if (i == 0) { pHead = thread_args[i]->lEntry_head->pNext; DEBUG_LOG("Connect %d head string %s %p\n", i, pHead->lastName, thread_args[i]->data_begin); } else { e->pNext = thread_args[i]->lEntry_head->pNext; DEBUG_LOG("Connect %d head string %s %p\n", i, e->pNext->lastName, thread_args[i]->data_begin); } e = thread_args[i]->lEntry_tail; DEBUG_LOG("Connect %d tail string %s %p\n", i, e->lastName, thread_args[i]->data_begin); DEBUG_LOG("round %d\n", i); } ``` - 改成 tree structured - ![](https://i.imgur.com/SAm0dfN.png) - n 個 thread 需要連接 log n 次 - thread 0 -> thread 1 , thread 2 -> thread 3 - thread 0 -> thread 2 ```C= for(int i = 2; i <= t_arg->numOfThread; i*=2){ if (t_arg->threadID % i == 0) { int src = t_arg->threadID + i/2; while(!comm_ready[src]); t_arg->lEntry_tail->pNext = comm_data_head[src]; t_arg->lEntry_tail = comm_data_tail[src]; } else { comm_data_head[t_arg->threadID] = t_arg->lEntry_head->pNext; comm_data_tail[t_arg->threadID] = t_arg->lEntry_tail; comm_ready[t_arg->threadID] = true; break; } } ``` - 結果 (append time) thread num|2|4|8|16 ----------|-|-|-|-- 平行化前|0.004709|0.005546|0.007729|0.010000 平行化後|0.005471|0.005462|0.013722|0.024526 - 無論幾個 thread ,平行化 linked list 連接部分皆無明顯改善,甚至需要更多時間 - 僅需要把頭尾指標相連,沒有大量資料複製 - 不需要平行化 :::info 這議題可以很複雜,請見 [Efficient Lock-free Binary Search Trees](https://arxiv.org/pdf/1404.3272.pdf),特別在 Threaded BST 建立後,如何維護 insert/delete 的資料正確性 --jserv ::: # 刪除資料 # 參考資料 - [LanKuDot 學長筆記](https://hackmd.io/s/HJFhaiAp#效能分析) ###### tags: `twzjwang`