contributed by <tundergod
>
The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software
An Introduction to Lock-Free Programming
mov dword ptr [x],1
mov eax,dword ptr [x]
add eax,1
mov dword ptr [x],eax
_InterlockedCompareExchange
,該指令會在指定的address中讀取原始值再計算新值並檢查如果讀取位置值不變才寫入新值,例如:void LockFreeQueue::push(Node* newHead)
{
for (;;) {
// Copy a shared variable (m_Head) to a local.
Node* oldHead = m_Head;
// Do some speculative work, not yet visible to other threads.
newHead->next = oldHead;
// Next, attempt to publish our changes to the shared variable.
// If the shared variable hasn't changed, the CAS succeeds and we return.
// Otherwise, repeat.
if (_InterlockedCompareExchange(&m_Head, newHead, oldHead) == oldHead)
return;
}
}
Grandma Beck, discussing child(臭的尿布會幹擾孩子,應該換掉)
-rearing philosophy
void *mmap(void *start,size_t length,
int prot, int flags, int fd, off_t offsize);
O_RDONLY | O_NONBLOCK
代表的是read only和以nonblocking的方式去存取檔案int open(const char *pathname, int flags);
pthread_setconcurrency()
告訴系統明確的並行數量(請詳閱 man page: pthread_setconcurrency,最下方的 NOTES 要認真看 jserv
size of entry : 136 bytes
execution time of append() : 0.071213 sec
execution time of findName() : 0.005603 sec
3
size of entry : 24 bytes
execution time of append() : 0.005196 sec
execution time of findName() : 0.004604 sec
把 phonebook_opt建構的資料輸出並與dictionary中words.txt
利用linux內建功能diff
比較,結果:
0a1
> aaaa
發現在phonebook_opt少了aaaa,問題出現在main.c中
for (int i = 0; i < THREAD_NUM; i++) {
if (i == 0) {
pHead = app[i]->pHead->pNext;
dprintf("Connect %d head string %s %p\n", i,
app[i]->pHead->pNext->lastName, app[i]->ptr);
} else {
etmp->pNext = app[i]->pHead->pNext;
dprintf("Connect %d head string %s %p\n", i,
app[i]->pHead->pNext->lastName, app[i]->ptr);
}
.....
}
可以看到在第三行程式碼中,第一筆資料在設置pHead時指標指多一個pNext,所以他會跳過第一筆lastname,而第7行中etmp也犯了同樣的錯誤.只要把多指向的pNext拿掉就能夠得到正確的答案了.如下:
pHead = app[i]->pHead;
...
etmp->pNext = app[i]->pHead;
「執行時間」是台灣科技慣用術語,「運行」是對岸說法。 jserv
opt:
size of entry : 24 bytes
execution time of append() : 0.005021 sec
execution time of findName() : 0.003749 sec
orig:
size of entry : 136 bytes
execution time of append() : 0.048624 sec
execution time of findName() : 0.005534 sec
圖標:
4 thread
各thread的比較圖
普通的Thread Pool
Lock-Free Thread Pool
kill -l
指令查看
sigemptyset(&zeromask); //初始化
sigemptyset(&newmask); //初始化
sigaddset(&newmask, SIGUSR);
sigprocmask(SIG_BLOCK, &newmask, &oldmask) ;
while (!CONDITION) {
sigsuspend(&zeromask);
}
sigprocmask(SIG_SETMASK, &oldmask, NULL)
第一次接觸信號的東西以上是我的想法和理解如果不對請大家救救我XDDLim Wen Sheng
疑問:在程式碼中可以看到每一個signal都在sigsuspend下等待,這樣也算是lock-free嗎?Lim Wen Sheng
這裏以我的理解,還有觀察sigsuspend的condition後,我覺得signal並不會在這裏等太久,每個訊號都會一直執行下去(除非condition真的不成立),而且這部分都不是critical section。最後按照我的理解,每個process會停留的地方,是在處理share variable的時候,也就是compare_and_swap跟fetch_and_add。 TempoJiJi
這裏其實我也還不是很清楚sigprocmask(SIG_BLOCK, &newmask, &oldmask)
會不會像mutex_lock那樣阻止其它訊號執行下去,還需要做點實驗TempoJiJi
補充:我做了實驗後,發現sigprocmask(SIG_BLOCK, &newmask, &oldmask)
是不會阻止其它process執行下去,所以只有在queue裏沒worker時,才會暫停TempoJiJi
*tpool_init
初始化thread pool(定義好有幾個thread和work queue)spawn_new_threads()
生成對應的threadswait_for_thread_registration
檢查所有的threads是否已經準備好了*tpool_thread
去做信號的溝通(建立semophere),呼叫get_work_concurrently
建立work queue並把工作以concurrency的方式放進work queue中實作lock-free thread pool
int cpu_num = sysconf(_SC_NPROCESSORS_CONF);
void *tpool = tpool_init(cpu_num);
//thread num = 4
for (int i = 0; i < THREAD_NUM; i++){
app[i] = new_append_a(map + MAX_LAST_NAME_SIZE * i, map + fs,THREAD_NUM, entry_pool + i);
if(tpool_add_work(tpool, &append, (void*)app[i]) < 0)
tpool_destroy(tpool,1);
}
tpool_destroy(tpool,1);
cpu time就是work queue的數量(上述運用範例程式碼的做法,也就是把電腦cpu數量帶進去執行)
實驗經過正確驗證!
運行時間:
size of entry : 24 bytes
execution time of append() : 0.103644 sec
execution time of findName() : 0.003529 sec
發現到append的時間非常的久,比original版本都慢了大約兩倍的時間
可以看到在4個work queue和4個worker thread的時候是最快的.
tundergod
hw2
2016q3