contributed by <ierosodin>
(This is a bit long, but I want to understand the code completely!)
See: 弱是罪惡 ("Weakness is a sin") jserv
The philosophy of the strong ><
ierosodin
timespec
struct timespec {
time_t tv_sec; //seconds
long tv_nsec; //nanoseconds
};
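Why? I don't understand why file.c does it this way; reading on for now. ierosodin
Why not run an experiment? jserv
After carefully reading how append fetches its data and inspecting the generated align.txt, I finally understood: ALIGN_FILE pads every lastname to 16 bytes, which makes fetching records much easier. However, all that extra space may cost extra time in strcpy and mmap. ierosodin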
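The padding idea can be sketched as below. This is only a minimal sketch, assuming a record width of 16 bytes (the width seen in align.txt) and '\0' as the padding byte; RECORD_SIZE, pad_record and record_at are hypothetical names and are not taken from file.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RECORD_SIZE 16 /* assumed record width, matching the 16 bytes seen in align.txt */

/* Copy `name` into a RECORD_SIZE-byte slot and fill the rest with '\0'.
 * Returns 0 on success, -1 if the name plus its terminator does not fit. */
int pad_record(char *slot, const char *name)
{
    size_t len = strlen(name);
    if (len >= RECORD_SIZE)
        return -1;
    memset(slot, 0, RECORD_SIZE);
    memcpy(slot, name, len);
    return 0;
}

/* With fixed-width records, record i starts at byte offset i * RECORD_SIZE,
 * so no scan for newlines is needed to find it. */
const char *record_at(const char *buf, size_t i)
{
    assert(buf != NULL);
    return buf + i * RECORD_SIZE;
}
```

The trade-off is the one noted above: a short name still occupies the full 16 bytes, so the file to map grows while each lookup becomes a simple multiplication.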
typedef struct __PHONE_BOOK_ENTRY {
char *lastName;
struct __PHONE_BOOK_ENTRY *pNext;
pdetail dtl;
} entry;
__builtin___clear_cache
clock_gettime
CLOCK_REALTIME
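Why not use CLOCK_MONOTONIC_RAW? (My note: the clock is recorded by jiffies; the clock_gettime man page describes it as raw hardware-based time that is not subject to NTP adjustments.)
CLOCK_REALTIME is wall-clock time, which can be modified for many reasons (for example NTP adjustments or someone setting the clock by hand).
ierosodin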
From Linux System Programming, 2nd edition: The important aspect of a monotonic time source is NOT the current value, but the guarantee that the time source is strictly linearly increasing, and thus useful for calculating the difference in time between two samplings.
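To make the "difference between two samplings" concrete, here is a small sketch that times a call with CLOCK_MONOTONIC instead of CLOCK_REALTIME. The helper names diff_in_second and time_call are my own choices for this sketch, and the feature-test macro is only there in case the compiler runs in a strict ISO C mode.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Return end - start in seconds, borrowing one second when the
 * nanosecond field of `end` is smaller than that of `start`. */
double diff_in_second(struct timespec start, struct timespec end)
{
    struct timespec diff;
    if (end.tv_nsec < start.tv_nsec) {
        diff.tv_sec = end.tv_sec - start.tv_sec - 1;
        diff.tv_nsec = end.tv_nsec + 1000000000L - start.tv_nsec;
    } else {
        diff.tv_sec = end.tv_sec - start.tv_sec;
        diff.tv_nsec = end.tv_nsec - start.tv_nsec;
    }
    return (double) diff.tv_sec + (double) diff.tv_nsec / 1e9;
}

/* Time one call of `work` with a clock that is never set backwards. */
double time_call(void (*work)(void))
{
    struct timespec start, end;
    int rc = clock_gettime(CLOCK_MONOTONIC, &start);
    assert(rc == 0);
    work();
    rc = clock_gettime(CLOCK_MONOTONIC, &end);
    assert(rc == 0);
    (void) rc; /* silence the unused warning when NDEBUG is defined */
    return diff_in_second(start, end);
}
```

Only the difference is meaningful here; the absolute value returned by CLOCK_MONOTONIC has an unspecified starting point.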
6. mmap(NULL, fs, PROT_READ, MAP_SHARED, fd, 0);
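My understanding is that mmap maps a file into a region of memory, so both reads and writes can be fast. That suits data accessed frequently, such as the dictionary file here. PROT_READ means the mapped pages are readable.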
7. assert
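Used to confirm that the mapping above succeeded; if the condition is false, assert prints an error message and aborts the program.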
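A minimal sketch of the map-then-check step, assuming we want the whole file mapped read-only. map_file_readonly is a hypothetical helper, not a function from the project. One detail worth keeping in mind: mmap reports failure by returning MAP_FAILED, not NULL, so a check written as a NULL test would not catch it.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map the whole file at `path` read-only.
 * On success, returns the mapping and stores its length in *len.
 * On failure (including an empty file), returns NULL. */
const char *map_file_readonly(const char *path, size_t *len)
{
    assert(path != NULL && len != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }

    void *map = mmap(NULL, (size_t) st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (map == MAP_FAILED) /* failure is MAP_FAILED, not NULL */
        return NULL;

    *len = (size_t) st.st_size;
    return map;
}
```

The caller releases the mapping with munmap when it is done with the data.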
8. entry *entry_pool = (entry *) malloc(sizeof(entry) * fs / MAX_LAST_NAME_SIZE);
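The pool is built here. Its size is the number of records times the size of an entry; because every record in the aligned file has the same fixed width, dividing the file size fs by that width gives the number of records. As before, an assert confirms that the allocation succeeded.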
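The sizing arithmetic can be sketched as follows: one malloc for all entries, each entry pointing into the mapped buffer instead of copying the name. This is an illustration under assumptions, not the project's code: RECORD_SIZE stands in for the divisor in the malloc call above and is assumed to be 16, and pool_entry and build_pool are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define RECORD_SIZE 16 /* assumed fixed record width of the aligned file */

typedef struct pool_entry {
    const char *lastName;     /* points into the mapped file, no copy */
    struct pool_entry *pNext;
} pool_entry;

/* Allocate one entry per fixed-width record in buf[0..fs) with a single
 * malloc and chain the entries in file order.
 * Returns NULL if there are no records or the allocation fails. */
pool_entry *build_pool(const char *buf, size_t fs)
{
    assert(buf != NULL);

    size_t count = fs / RECORD_SIZE; /* number of records in the file */
    if (count == 0)
        return NULL;

    pool_entry *pool = malloc(sizeof(pool_entry) * count);
    if (pool == NULL)
        return NULL;

    for (size_t i = 0; i < count; i++) {
        pool[i].lastName = buf + i * RECORD_SIZE;
        pool[i].pNext = (i + 1 < count) ? &pool[i + 1] : NULL;
    }
    return pool;
}
```

Allocating the pool once replaces one malloc per record, and the whole pool is released with a single free.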
9. pthread_setconcurrency(THREAD_NUM + 1);
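Finally we reach pthreads! My first reading was that this call sets the maximum number of threads (the man page quoted further down says otherwise).
Why +1? ierosodin
Code is not written for you to memorize! Propose an idea, modify the code, then review the result. jserv
First, a guess about [THREAD_NUM+1]: my understanding is that when setting the number of threads, the extra one is the main thread. That is, besides the threads that build the linked list in parallel, a main thread is needed to hand out the work. I also ran the following experiment. ierosodin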
The hardware has 2 threads per core and 6 cores per socket.
So with THREAD_NUM = 12, I compared different values of n (the argument passed to pthread_setconcurrency).
n = 2
execution time of append() : 5.803311 sec
execution time of findName() : 0.005820 sec
n = 13
execution time of append() : 5.528366 sec
execution time of findName() : 0.004255 sec
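There was no obvious difference.
Speculating first: if THREAD_NUM exceeds what the hardware provides and two pthreads are assigned to one CPU, does the notion of concurrency then come into play?
My understanding was that with THREAD_NUM = 24 and the concurrency level set to 25, each CPU can run two pthreads concurrently, whereas with the concurrency level set to 13 it cannot. So with THREAD_NUM = 24 I compared n = 13 against n = 25, printing the thread id during execution.
The results did not match this expectation!
You can't keep living in a world of speculation; you have working hands and feet, so use them! jserv
I deliberately increased the workload of append() to raise CPU utilization and watched the CPUs with top, and found
Re-read the man page!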
From Linux man page:
Concurrency levels are only meaningful for M:N threading implementations, where at any moment a subset of a process's set of user-level threads may be bound to a smaller number of kernel-scheduling entities. Setting the concurrency level allows the application to give the system a hint as to the number of kernel-scheduling entities that should be provided for efficient execution of the application.
Both LinuxThreads and NPTL are 1:1 threading implementations, so setting the concurrency level has no meaning. In other words, on Linux these functions merely exist for compatibility with other systems, and they have no effect on the execution of a program.
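But the experiment hardware has 6 cores, which Intel Hyper-Threading presents as 12 CPUs. Does that count as M:N threading(?), and is the pthread_setconcurrency setting meaningful in that case? ierosodin
pthread_setconcurrency (write the full name to avoid confusion with other sources; make it a habit) has no definite effect in the Linux implementation, just as the man page says. jserv
Thank you, teacher; corrected!
ierosodin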
pthread_t *tid = (pthread_t *) malloc(sizeof(pthread_t) * THREAD_NUM);
append_a **app and new_append_a(...)
pthread_create( &tid[i], NULL, (void *) &append, (void *) app[i]);
pthread_join(tid[i], NULL);
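The create/join pattern can be sketched in isolation as below. In the pthread_create call above, append is cast to (void *); pthread_create expects a start routine of type void *(*)(void *), and a function declared with exactly that signature needs no cast. The sketch sums an array instead of building a linked list, and THREAD_NUM = 4, worker_arg, sum_range and parallel_sum are all assumptions of mine rather than names from main.c. On older toolchains it needs to be compiled with -pthread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define THREAD_NUM 4 /* assumed worker count for this sketch */

typedef struct {
    const int *data;   /* shared, read-only input */
    size_t begin, end; /* half-open range handled by this thread */
    long sum;          /* per-thread result, so workers share no state */
} worker_arg;

/* Start routine with the exact signature pthread_create expects. */
void *sum_range(void *p)
{
    worker_arg *arg = p;
    arg->sum = 0;
    for (size_t i = arg->begin; i < arg->end; i++)
        arg->sum += arg->data[i];
    return NULL;
}

/* Split data[0..n) into THREAD_NUM chunks, sum each chunk in its own
 * thread, then join the threads and combine the partial sums. */
long parallel_sum(const int *data, size_t n)
{
    pthread_t tid[THREAD_NUM];
    worker_arg args[THREAD_NUM];
    size_t chunk = (n + THREAD_NUM - 1) / THREAD_NUM; /* ceiling division */

    for (int i = 0; i < THREAD_NUM; i++) {
        size_t begin = (size_t) i * chunk;
        size_t end = begin + chunk;
        args[i].data = data;
        args[i].begin = begin < n ? begin : n; /* clamp to the array size */
        args[i].end = end < n ? end : n;
        int rc = pthread_create(&tid[i], NULL, sum_range, &args[i]);
        assert(rc == 0);
        (void) rc; /* silence the unused warning when NDEBUG is defined */
    }

    long total = 0;
    for (int i = 0; i < THREAD_NUM; i++) {
        pthread_join(tid[i], NULL); /* partial sum is complete after join */
        total += args[i].sum;
    }
    return total;
}
```

Each thread writes only to its own worker_arg, so no lock is needed; the main thread reads the partial sums only after the corresponding pthread_join has returned.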
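In the experiment, the execution time stops improving once n exceeds 8, which does not match my expectation! (The hardware provides 12 threads.) ierosodin
Tried switching to a thread pool
From C 的 Thread Pool 筆記 (notes on thread pools in C)
Tried replacing the original pthread_create calls in main.c with a threadpool (THREAD_NUM = 4)
Switched to the threadpool, but THREAD_NUM is too small! There is no obvious difference
With THREAD_NUM = 12, the difference becomes visible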
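For reference, the core of a thread pool fits in under a hundred lines. The sketch below is my own minimal version (a mutex, a condition variable and a FIFO task queue) and is not the library from the notes cited above; every name in it (thread_pool, thread_pool_create, thread_pool_submit, thread_pool_destroy, add_one) is hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

typedef struct task {
    void (*fn)(void *);
    void *arg;
    struct task *next;
} task;

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t has_work;
    task *head, *tail; /* FIFO queue of pending tasks */
    int stopping;      /* set by thread_pool_destroy */
    int nthreads;      /* workers actually started */
    pthread_t *threads;
} thread_pool;

/* Each worker sleeps until a task is queued, runs it, and repeats. */
static void *worker(void *p)
{
    thread_pool *tp = p;
    for (;;) {
        pthread_mutex_lock(&tp->lock);
        while (!tp->head && !tp->stopping)
            pthread_cond_wait(&tp->has_work, &tp->lock);
        task *t = tp->head;
        if (!t) { /* stopping and the queue is drained */
            pthread_mutex_unlock(&tp->lock);
            return NULL;
        }
        tp->head = t->next;
        if (!tp->head)
            tp->tail = NULL;
        pthread_mutex_unlock(&tp->lock);
        t->fn(t->arg); /* run the task without holding the lock */
        free(t);
    }
}

thread_pool *thread_pool_create(int nthreads)
{
    assert(nthreads > 0);
    thread_pool *tp = calloc(1, sizeof(*tp));
    assert(tp != NULL);
    tp->threads = malloc(sizeof(pthread_t) * (size_t) nthreads);
    assert(tp->threads != NULL);
    pthread_mutex_init(&tp->lock, NULL);
    pthread_cond_init(&tp->has_work, NULL);
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tp->threads[i], NULL, worker, tp) != 0)
            abort(); /* sketch only: give up if a worker cannot start */
        tp->nthreads++;
    }
    return tp;
}

/* Queue one task; returns 0 on success, -1 if out of memory. */
int thread_pool_submit(thread_pool *tp, void (*fn)(void *), void *arg)
{
    task *t = malloc(sizeof(*t));
    if (!t)
        return -1;
    *t = (task){ .fn = fn, .arg = arg, .next = NULL };
    pthread_mutex_lock(&tp->lock);
    if (tp->tail)
        tp->tail->next = t;
    else
        tp->head = t;
    tp->tail = t;
    pthread_cond_signal(&tp->has_work);
    pthread_mutex_unlock(&tp->lock);
    return 0;
}

/* Let the workers drain the queue, then join them and free the pool. */
void thread_pool_destroy(thread_pool *tp)
{
    pthread_mutex_lock(&tp->lock);
    tp->stopping = 1;
    pthread_cond_broadcast(&tp->has_work);
    pthread_mutex_unlock(&tp->lock);
    for (int i = 0; i < tp->nthreads; i++)
        pthread_join(tp->threads[i], NULL);
    pthread_cond_destroy(&tp->has_work);
    pthread_mutex_destroy(&tp->lock);
    free(tp->threads);
    free(tp);
}

/* Example task: atomically add 1 to the atomic_long that arg points to. */
void add_one(void *arg)
{
    atomic_fetch_add((atomic_long *) arg, 1);
}
```

The difference from calling pthread_create per chunk is that the workers are created once and then pull tasks from the queue, so the work can be cut into many more pieces than there are threads without paying for thread creation each time.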