2016q3 Homework2 ( phonebook-concurrent )

contributed by <ierosodin>

Reading the phonebook_opt code

~~( It is a lot, but I want to understand the code completely!! )~~

See: 弱是罪惡 ("Weakness is a sin") jserv
The philosophy of the strong ><
ierosodin

main.c

timespec
Stores a time value with nanosecond precision.

struct timespec { 
    time_t   tv_sec; //seconds 
    long     tv_nsec; //nanoseconds 
};

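Generating ALIGN_FILE from DICT_FILE

Why? I don't understand why file.c does this; reading on for now. ierosodin
Why not run an experiment? jserv
After carefully reading how append() fetches its data and inspecting the generated align.txt, I understood that ALIGN_FILE pads every last name to 16 bytes, which makes fetching records more convenient. The price is a lot of extra space, which may add time during strcpy and mmap. ierosodin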
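
To make the padding idea concrete, here is a minimal sketch. It is not the actual file.c code: the helper names pad_line and record_at, and the choice of '\0' as the padding byte, are assumptions made for illustration. Only the 16-byte width comes from the notes above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_LAST_NAME_SIZE 16

/* Hypothetical helper: copy one name (without its trailing newline) into a
 * fixed 16-byte slot and fill the remainder with '\0'.
 * Returns 0 on success, -1 if the name does not fit with a terminator. */
int pad_line(const char *name, char *slot)
{
    assert(name != NULL && slot != NULL);
    size_t len = strcspn(name, "\n");   /* length up to newline or end */
    if (len >= MAX_LAST_NAME_SIZE)
        return -1;
    memset(slot, 0, MAX_LAST_NAME_SIZE);
    memcpy(slot, name, len);
    return 0;
}

/* With fixed-width slots, record i starts at a computable offset,
 * so no scanning for newlines is needed. */
const char *record_at(const char *map, size_t i)
{
    return map + i * MAX_LAST_NAME_SIZE;
}
```

The benefit is that any record can be addressed by index, which is what lets several threads start at different records without coordinating. The cost, as noted above, is that short names still occupy a full 16 bytes.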

Create two pointers of type entry, phead and e
in phonebook_opt.h


typedef struct __PHONE_BOOK_ENTRY {
    char *lastName;
    struct __PHONE_BOOK_ENTRY *pNext;
    pdetail dtl;
} entry;

__builtin___clear_cache
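Flushes the processor's instruction cache for a given memory range, which is needed after code in that range has been modified. How it actually works still needs a closer look.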
clock_gettime
Gets the system time; CLOCK_REALTIME is used here.

Why not use CLOCK_MONOTONIC_RAW? (raw hardware-based time that is not subject to NTP adjustments)
CLOCK_REALTIME is wall-clock time, which can be changed for many reasons.
ierosodin

From LINUX System Programming 2nd : The important aspect of a monotonic time source is NOT the current value, but the guarantee that the time source is strictly linearly increasing, and thus useful for calculating the difference in time between two samplings.
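
As a small illustration of measuring an interval with a monotonic clock, here is a sketch. The names elapsed_seconds and time_busy_loop are hypothetical and are not taken from main.c. CLOCK_MONOTONIC is used rather than CLOCK_MONOTONIC_RAW because the latter is Linux-specific.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Difference t2 - t1 in seconds, borrowing one second when the
 * nanosecond field of t2 is smaller than that of t1. */
double elapsed_seconds(struct timespec t1, struct timespec t2)
{
    struct timespec diff;
    if (t2.tv_nsec < t1.tv_nsec) {
        diff.tv_sec = t2.tv_sec - t1.tv_sec - 1;
        diff.tv_nsec = t2.tv_nsec - t1.tv_nsec + 1000000000L;
    } else {
        diff.tv_sec = t2.tv_sec - t1.tv_sec;
        diff.tv_nsec = t2.tv_nsec - t1.tv_nsec;
    }
    return (double) diff.tv_sec + (double) diff.tv_nsec / 1e9;
}

/* Example: time a busy loop of n iterations. Only the difference between
 * the two samples is meaningful, not either absolute value. */
double time_busy_loop(unsigned long n)
{
    struct timespec start, end;
    volatile unsigned long sink = 0;   /* volatile keeps the loop from being removed */

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (unsigned long i = 0; i < n; i++)
        sink += i;
    clock_gettime(CLOCK_MONOTONIC, &end);
    return elapsed_seconds(start, end);
}
```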
6. mmap(NULL, fs, PROT_READ, MAP_SHARED, fd, 0);
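As I understand it, mmap maps a file into a region of memory so that both reads and writes are fast, which suits frequently accessed data such as the dictionary file here. PROT_READ means the mapping is readable.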
7. assert
Confirms that the mapping above succeeded; otherwise it prints an error message and aborts.
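8. entry *entry_pool = (entry *) malloc(sizeof(entry) * fs / MAX_LAST_NAME_SIZE);
This builds the pool. Its size is the number of records (fs / MAX_LAST_NAME_SIZE) times sizeof(entry). An assert again confirms that the allocation succeeded.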
9. pthread_setconcurrency(THREAD_NUM + 1);
Finally, pthread!! My first reading was that this sets the maximum number of threads.
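
Steps 6 and 7 above (mmap and the success check) can be sketched as follows. This is an illustration under assumptions, not the code in main.c: the mapped_file wrapper and the function names are hypothetical, and errors are reported through a return value instead of assert. One detail worth knowing is that mmap signals failure with MAP_FAILED rather than NULL.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define MAX_LAST_NAME_SIZE 16

typedef struct {
    char *map;      /* read-only mapping of the aligned file */
    size_t size;    /* file size in bytes */
    size_t count;   /* number of 16-byte records */
} mapped_file;      /* hypothetical wrapper, not in the original code */

/* Map `path` read-only. Returns 0 on success, -1 on failure. */
int map_aligned_file(const char *path, mapped_file *out)
{
    assert(path != NULL && out != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return -1;
    }
    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {   /* an empty file cannot be mapped */
        close(fd);
        return -1;
    }
    void *p = mmap(NULL, (size_t) st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                       /* the mapping stays valid after close */
    if (p == MAP_FAILED) {           /* failure is MAP_FAILED, not NULL */
        perror("mmap");
        return -1;
    }
    out->map = p;
    out->size = (size_t) st.st_size;
    out->count = out->size / MAX_LAST_NAME_SIZE;
    return 0;
}

void unmap_aligned_file(mapped_file *mf)
{
    munmap(mf->map, mf->size);
}
```

The count field corresponds to the fs / MAX_LAST_NAME_SIZE factor used to size the entry pool in step 8.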

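Why the +1 in THREAD_NUM + 1? ierosodin
Code is not written for you to recite! Propose an idea, modify the code, then review the result. jserv
A first guess about [THREAD_NUM+1]: when setting the number of threads, the extra one is the main thread. That is, besides the threads that build the linked list in parallel, a main thread is needed to hand out the work. I also ran the following experiment. ierosodin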

Effect of pthread_setconcurrency(n)

The hardware has 2 threads per core and 6 cores per socket.
THREAD_NUM is therefore set to 12, and different values of n are compared.
n = 2

execution time of append() : 5.803311 sec
execution time of findName() : 0.005820 sec

n = 13

execution time of append() : 5.528366 sec
execution time of findName() : 0.004255 sec

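No significant difference was observed.

A guess first: if THREAD_NUM exceeds what the hardware provides, so that two pthreads are assigned to one CPU, does the concurrency level start to matter?

My understanding was that with THREAD_NUM = 24 and a concurrency level of 25, each CPU can run two pthreads concurrently, whereas with a concurrency level of 13 it cannot. So I compared n = 13 against n = 25 with THREAD_NUM = 24, printing the thread id during execution.

The results did not match this guess!

You cannot keep living in a world of guesswork. You have working hands and feet, so use them! jserv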

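I deliberately increased the workload of append() to raise CPU utilization and watched the CPUs with top. Findings:

Whatever the concurrency level, the number of busy CPUs equals THREAD_NUM (capped at 12, of course)
-> pthread_setconcurrency does not affect how pthreads are used
Whatever the concurrency level, the printf output of all pthreads is interleaved
-> pthread_setconcurrency does not affect the concurrency of pthreads on a single CPU

Re-read the man page!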
From Linux man page:
Concurrency levels are only meaningful for M:N threading implementations, where at any moment a subset of a process's set of user-level threads may be bound to a smaller number of kernel-scheduling entities. Setting the concurrency level allows the application to give the system a hint as to the number of kernel-scheduling entities that should be provided for efficient execution of the application. Both LinuxThreads and NPTL are 1:1 threading implementations, so setting the concurrency level has no meaning. In other words, on Linux these functions merely exist for compatibility with other systems, and they have no effect on the execution of a program.

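But the test machine has 6 cores that Intel Hyper-Threading exposes as 12 CPUs. Does that count as M:N threading(?), and if so, does the pthread_setconcurrency setting mean anything? ierosodin
pthread_setconcurrency (write the full name to avoid clashing with other sources; make it a habit) has no tangible effect in the Linux implementation, just as the man page says. jserv
Thank you, corrected!
ierosodin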

pthread_t *tid = (pthread_t *) malloc(sizeof(pthread_t) * THREAD_NUM);
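Allocates an array tid holding THREAD_NUM pthread_t handles.
append_a **app and new_append_a(...)
Because pthread_create passes only a single argument to the thread function, the arguments append() needs are packed by new_append_a into app[THREAD_NUM].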
pthread_create( &tid[i], NULL, (void *) &append, (void *) app[i]);
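Starts running the function in a thread. The arguments are, in order:
the thread handle being created; the thread attributes; the function the thread runs; the argument passed to that function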
pthread_join(tid[i], NULL);
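Waits for the thread to finish. If the thread returns a value, it can be retrieved through the second argument.
Next, look at phonebook_opt.c to see how append() processes the data!
The work is split across pthreads as follows: in main.c each thread is given its own starting address in the mapped data; inside append(), it then reads records at a stride of THREAD_NUM*MAX_LAST_NAME_SIZE and adds each one to its linked list; finally, back in main.c, the per-thread lists are merged.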
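
The create, join, and merge flow described above can be sketched as follows. This is a simplified stand-in, not the code in main.c or phonebook_opt.c: the node and worker_arg types and the build_list function are hypothetical, worker_arg plays the role of append_a, and records are addressed by index instead of by raw pointer stride. The thread function here has the void *(*)(void *) signature that pthread_create expects, so no cast is needed. The sketch assumes count > 0 and nthreads > 0.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define MAX_LAST_NAME_SIZE 16

typedef struct node {
    const char *lastName;      /* points into the shared buffer, no copy */
    struct node *pNext;
} node;

/* Hypothetical argument bundle: one per thread, since a thread
 * function receives exactly one void * argument. */
typedef struct {
    const char *map;     /* shared buffer of fixed-width records */
    size_t count;        /* total number of records */
    size_t first;        /* index of this thread's first record */
    size_t step;         /* number of threads, used as the stride */
    node *pool;          /* preallocated nodes, one per record */
    node *head, *tail;   /* local list built by this thread */
} worker_arg;

static void *worker(void *p)
{
    worker_arg *a = p;
    a->head = a->tail = NULL;
    for (size_t i = a->first; i < a->count; i += a->step) {
        node *n = &a->pool[i];             /* no locking: slots are disjoint */
        n->lastName = a->map + i * MAX_LAST_NAME_SIZE;
        n->pNext = NULL;
        if (a->tail)
            a->tail->pNext = n;
        else
            a->head = n;
        a->tail = n;
    }
    return NULL;
}

/* Build one list over `count` records using `nthreads` threads.
 * Returns the merged head; the caller frees *pool_out. */
node *build_list(const char *map, size_t count, size_t nthreads, node **pool_out)
{
    node *pool = malloc(sizeof(node) * count);
    pthread_t *tid = malloc(sizeof(pthread_t) * nthreads);
    worker_arg *args = malloc(sizeof(worker_arg) * nthreads);
    assert(pool && tid && args);

    for (size_t i = 0; i < nthreads; i++) {
        args[i] = (worker_arg){ map, count, i, nthreads, pool, NULL, NULL };
        int rc = pthread_create(&tid[i], NULL, worker, &args[i]);
        assert(rc == 0);
        (void) rc;
    }

    node *head = NULL, *tail = NULL;
    for (size_t i = 0; i < nthreads; i++) {
        pthread_join(tid[i], NULL);        /* thread i's list is complete after this */
        if (!args[i].head)
            continue;
        if (tail)
            tail->pNext = args[i].head;    /* splice the local list onto the result */
        else
            head = args[i].head;
        tail = args[i].tail;
    }
    free(tid);
    free(args);
    *pool_out = pool;
    return head;
}
```

Because each thread writes only its own nodes and its own head and tail, the build phase needs no mutex; the only synchronization is pthread_join before the merge. Note that the merged list is ordered thread by thread, not in the original file order.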

Comparing THREAD_NUM values

In the experiment, performance drops once n exceeds 8, which does not match expectations! (the hardware provides 12 threads) ierosodin

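Optimizing append()

Trying a thread pool

From C 的 Thread Pool 筆記 (notes on thread pools in C)

Replaced the original pthread_create calls in main.c with a threadpool (THREAD_NUM = 4)

With the threadpool but THREAD_NUM too small, there is no obvious difference

With THREAD_NUM = 12, the difference becomes visible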
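
The threadpool library used in the experiment is not reproduced here. As a rough sketch of the general idea only, a fixed set of worker threads can pull tasks from a shared queue protected by a mutex and a condition variable. All names below (tpool, tpool_create, tpool_add, tpool_destroy, mark_done) are hypothetical and are not that library's API.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdlib.h>

typedef struct task {
    void (*fn)(void *);
    void *arg;
    struct task *next;
} task;

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t notify;
    task *head, *tail;     /* FIFO queue of pending tasks */
    int shutdown;          /* set once no more tasks will be added */
    size_t nthreads;
    pthread_t *threads;
} tpool;

static void *tpool_worker(void *p)
{
    tpool *pool = p;
    for (;;) {
        pthread_mutex_lock(&pool->lock);
        while (!pool->head && !pool->shutdown)
            pthread_cond_wait(&pool->notify, &pool->lock);
        if (!pool->head) {             /* shutting down and queue drained */
            pthread_mutex_unlock(&pool->lock);
            return NULL;
        }
        task *t = pool->head;
        pool->head = t->next;
        if (!pool->head)
            pool->tail = NULL;
        pthread_mutex_unlock(&pool->lock);
        t->fn(t->arg);                 /* run the task outside the lock */
        free(t);
    }
}

tpool *tpool_create(size_t nthreads)
{
    tpool *pool = calloc(1, sizeof(*pool));
    if (!pool)
        return NULL;
    pool->threads = malloc(sizeof(pthread_t) * nthreads);
    if (!pool->threads) {
        free(pool);
        return NULL;
    }
    pthread_mutex_init(&pool->lock, NULL);
    pthread_cond_init(&pool->notify, NULL);
    for (size_t i = 0; i < nthreads; i++) {
        if (pthread_create(&pool->threads[i], NULL, tpool_worker, pool) != 0)
            break;
        pool->nthreads++;              /* count only threads that started */
    }
    return pool;
}

int tpool_add(tpool *pool, void (*fn)(void *), void *arg)
{
    task *t = malloc(sizeof(*t));
    if (!t)
        return -1;
    t->fn = fn;
    t->arg = arg;
    t->next = NULL;
    pthread_mutex_lock(&pool->lock);
    if (pool->tail)
        pool->tail->next = t;
    else
        pool->head = t;
    pool->tail = t;
    pthread_cond_signal(&pool->notify);
    pthread_mutex_unlock(&pool->lock);
    return 0;
}

/* Let queued tasks finish, then join all workers and free the pool. */
void tpool_destroy(tpool *pool)
{
    pthread_mutex_lock(&pool->lock);
    pool->shutdown = 1;
    pthread_cond_broadcast(&pool->notify);
    pthread_mutex_unlock(&pool->lock);
    for (size_t i = 0; i < pool->nthreads; i++)
        pthread_join(pool->threads[i], NULL);
    pthread_mutex_destroy(&pool->lock);
    pthread_cond_destroy(&pool->notify);
    free(pool->threads);
    free(pool);
}

/* Example task: set the int that arg points to. */
void mark_done(void *arg) { *(int *) arg = 1; }
```

In the phonebook setting, each queued task would correspond to one append() call with its argument bundle. A pool mainly saves thread creation cost when there are many more tasks than threads, which may be part of why little difference shows up with only a few threads.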