owned this note
owned this note
Published
Linked with GitHub
# 2016q3 Homework3 (mergesort-concurrent)
contribute by <`jeff60907`>
[作業說明A07: mergesort-concurrent](https://hackmd.io/s/rJ-GWtJ0)
* merge sort 的實做改為可接受 phonebook-concurrent 的 35 萬筆資料輸入的資料檔
* 設計自動測試的機制
* 一併嘗試重構 (refactor) 給定的程式碼,使得程式更容易閱讀和維護
* 研究 thread pool 管理 worker thread 的實做,提出 lock-free 的實做
## 開發環境
```
作業系統 Lubuntu 16.04
CPU:Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
cpu cores: 4
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 8192K
```
## 先閱讀課程資料
* 複習[計算機結構](https://hackmd.io/s/H1sZHv4R) and [note](https://paper.dropbox.com/doc/Computer-Architecture-Notes-NfGP1aZt2pAkgTcFeiEmc)
> 發現約讀懂6-7成而已,很多都有印象學過卻很模糊QQ
>> 一句話:「學過,但是從來沒有學會過」 [name=jserv]
* 必讀[Modern Microprocessors A 90 Minute Guide!](http://www.lighterra.com/papers/modernmicroprocessors/)--用力嗑英文
* [SIMD Programming Introduction](https://docs.google.com/presentation/d/1LeVe7EAmZvqD3KN7p4Wynbd36xPOk9biBCFrdmsqzKs/edit#slide=id.p25)
### 不懂名詞
* Pipeline Depth [參考](http://systw.net/note/af/sblog/more.php?id=34)
> The pipeline depth is the number of stages
> 管線作業架構內含多個硬體階段電路,硬體階段個數就稱為pipeline depth(管線深度)
1.pipeline depth越大,表示各硬體階段完成子工作所需的時間越短,執行效能就越好
2.當pipeline depth增加超過一個臨界值,效能則不升反降
ps:若管線深度為k,通常效能只會增快sqrt(k)倍
* Out-of-order execution (just OOO for short sometimes written OoO or OOE).
> 亂序執行 [參考wiki](https://zh.wikipedia.org/wiki/%E4%B9%B1%E5%BA%8F%E6%89%A7%E8%A1%8C) [參考百度解釋](http://baike.baidu.com/view/326795.htm)
> 可以避免因為獲取下一條程序指令所引起的處理器等待,立即執行處理下一條指令。
> 例如遇到branch指令3時,可以先把其他非相關聯指令4-8運算好,等到branch指令3結束,就可以直接使用。這種作法可以讓CPU內部運算沒有浪費,但是會造成功耗上升
* FSB,Front Side Bus
> 前端匯流排 [參考wiki](https://zh.wikipedia.org/wiki/%E5%89%8D%E7%AB%AF%E6%80%BB%E7%BA%BF)
內文提到
increasing a modern processor's clock speed by a relatively modest 30% can take as much as double the power, and produce double the heat...
> 增加30%的 clock速度 可能會造成兩倍的功耗和熱,所以不能無限制的增加時脈速度。
#### POWER WALL
Up to a point this increase in power is okay, but at a certain point, currently somewhere around 150-200 watts, the power and heat problems become unmanageable, because it's simply not possible to provide that much power and cooling to a silicon chip in any practical fashion, even if the circuits could, in fact, operate at higher clock speeds. This is called the power wall.
#### ILP WALL
Pursuing more and more ILP also has definite limits, because unfortunately, normal programs just don't have a lot of fine-grained parallelism in them, due to a combination of load latencies, cache misses, branches and dependencies between instructions. This limit of available instruction-level parallelism is called the ILP wall.
## 閱讀mergesort-concurrent程式碼
照步驟先測試
```
$ make
gcc -std=gnu99 -Wall -g -pthread -o list.o -MMD -MF .list.o.d -c list.c
gcc -std=gnu99 -Wall -g -pthread -o threadpool.o -MMD -MF .threadpool.o.d -c threadpool.c
gcc -std=gnu99 -Wall -g -pthread -o main.o -MMD -MF .main.o.d -c main.c
gcc -std=gnu99 -Wall -g -pthread -o sort list.o threadpool.o main.o -rdynamic
```
> MMD MF 是什麼? [參考kobeyu介紹](https://hackmd.io/s/Bydtpi4R)
> -g 在編譯時已經加上去了,底下才能使用 mutrace 查詢到地址對應的行數
```
mutrace: Showing statistics for process sort (pid 17137).
mutrace: 3 mutexes used.
Mutex #0 (0x0x13fd9b0) first referenced by:
/usr/lib/mutrace/libmutrace.so(pthread_mutex_init+0xf2) [0x7faa075e84b2]
./sort(tqueue_init+0x38) [0x401277]
./sort(tpool_init+0x6a) [0x4014cc]
./sort(main+0x161) [0x401c74]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7faa0701f830]
Mutex #1 (0x0x603120) first referenced by:
/usr/lib/mutrace/libmutrace.so(pthread_mutex_init+0xf2) [0x7faa075e84b2]
./sort(main+0x11d) [0x401c30]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7faa0701f830]
Mutex #2 (0x0x7faa04bee860) first referenced by:
/usr/lib/mutrace/libmutrace.so(pthread_mutex_lock+0x49) [0x7faa075e86b9]
/lib/x86_64-linux-gnu/libgcc_s.so.1(_Unwind_Find_FDE+0x2c) [0x7faa049eafec]
[(nil)]
mutrace: Showing 3 most contended mutexes:
Mutex # Locked Changed Cont. tot.Time[ms] avg.Time[ms] max.Time[ms] Flags
0 133 15 10 0.026 0.000 0.002 Mx.--.
1 13 9 1 0.003 0.000 0.000 M-.--.
2 20 3 1 0.007 0.000 0.002 M-.--.
```
```
sort(tqueue_init+0x38) [0x401277]
$ addr2line -e sort 0x38
??:0
$ addr2line -e sort 0x401277
/home/jeff60907/mergesort-concurrent/threadpool.c:15
```
> mutrace: 3 mutexes used. 會顯示使用多少個 mutexes
> 0x38 反應是??:0 是因為實際執行的地址是後面 401277才正確嗎?
>> 所以要讀書呀!先讀手冊 [addr2line](https://sourceware.org/binutils/docs/binutils/addr2line.html),binutils 背後透過 [Binary File Descriptor library (BFD)](https://en.wikipedia.org/wiki/Binary_File_Descriptor_library) 實做。 [name=jserv]
* 閱讀 [addr2line](https://sourceware.org/binutils/docs/binutils/addr2line.html)
If the file name or function name can not be determined, addr2line will print two question marks in their place. If the line number can not be determined, addr2line will print 0.
* 閱讀[Binary File Descriptor library (BFD)](https://en.wikipedia.org/wiki/Binary_File_Descriptor_library)
[Big Endian和Little Endian的比較](http://www.prudentman.idv.tw/2007/11/big-endianlittle-endian.html) and [最後一段小故事](https://bcc16.ncu.edu.tw/pool/1.33.shtml)
如果是Big Endian的系統:
存到記憶體會變成 0x12 0x34 0x56 0x78,最高位元組在位址最低位元,最低位元組在位址最高位元,依次排列。
如果是Little Endian的系統:
存到記憶體會變成 0x78 0x56 0x34 0x12,最低位元組在最低位元,最高位元組在最高位元,反序排列。
* [參考Measuring Lock Contention mutrace解釋](http://0pointer.de/blog/projects/mutrace.html)
* The 'Locked' column tells how often the mutex was locked during the entire runtime of about 10s.
* The 'Changed' column tells us how often the owning thread of the mutex changed.
* The 'Cont.' column tells us how often the lock was already taken when we tried to take it and we had to wait.
* The fifth column tell us for how long during the entire runtime the lock was locked.
* The sixth tells us the average lock time.
* The seventh column tells us the longest time the lock was held.
* Finally, the last column tells us what kind of mutex this is (recursive, normal or otherwise).
## 程式分析
### list.h
Linked list 是由各個 node,透過指標串聯而成。先定義 node_t 型別
```c=
typedef intptr_t val_t; /* 不僅可放數值,也能放指標 */
typedef struct node {
val_t data; /* 欲儲存的資料 */
struct node *next; /* next 指標 */
} node_t;
```
> inptr_t 為什麼要使用這個型別呢? [參考SwimGlass](https://hackmd.io/s/r1Y9r_OR)很詳盡的內容!!
> 要將 pointer做型態轉換,因為電腦上 32bit 跟 64bit的pointer長度是不同的,強制轉型可能會造成資料loss的問題,所以透過`intptr_t`可以正確有效的轉換型態
> [表示法常見問題](https://zh.wikibooks.org/zh-tw/C_%E8%AA%9E%E8%A8%80%E5%B8%B8%E8%A6%8B%E8%AA%A4%E8%A7%A3/%E6%8C%87%E6%A8%99/%E8%A1%A8%E7%A4%BA%E6%B3%95%E8%88%87%E8%BD%89%E5%9E%8B)
list的操作,詳閱list.c實作內容
```c=
// 建立新的 list 物件
llist_t *list_new();
// 從 list 插入給定資料
int list_add(llist_t *the_list, val_t val);
// 逐一印列 list 內容
void list_print(llist_t *the_list);
// 產生新的節點
node_t *new_node(val_t val, node_t *next);
// 取得某個 index 的節點資訊
node_t *list_get(llist_t *the_list, uint32_t index);
```
### Thread pool
定義 task_t 來封裝 task:
```c=
typedef struct _task {
void (*func)(void *); /* 對應到最終執行的函式 */
void *arg; /* 傳入的參數 */
struct _task *next, *last;
/* 因為 queue 要用 doubly-linked list,
需要儲存 next 和 last */
} task_t;
```
> doubly-linked list 用法不太熟悉,回頭複習[資料結構課程](http://sjchen.im.nuu.edu.tw/Datastructure/98/course04.pdf)
>> 參考資料不要列出一堆中文的,請列出「有強度」的英文材料 [name=jserv]
>> 收到! [name=陳致佑]
#### 詳閱threadpool.c
[LitSnow講解操作內容](https://hackmd.io/BzDsGYE4BYFMAYC0BjAbAIwIaOqATACaLoCM84x88weomesJ+QA=#) 閱讀 [queue介紹](http://finalfrank.pixnet.net/blog/post/22382141-queue--%E4%BD%87%E5%88%97)
> 看了threadpool.c POP 跟 PUSH 有些操作看不太懂參考了`LitSnow`同學的共筆,但疑惑的是queue 應該是 FIFO 先進先出,所以POP 應該是從head 先拿,push 放進tail才對,但是threadpool.c卻寫相反了
>> 快去改,這個作業還有後續。以後要從 code refactoring 開始 [name=jserv]
>> 收到[name=陳致佑]
>>
### Code Refactoring
修改threadpool.c pop 跟 push,把頭尾交換,pop從head提出來,push從tail放進去。
```c=
task_t *tqueue_pop(tqueue_t *the_queue)
{
task_t *ret;
pthread_mutex_lock(&(the_queue->mutex));
ret = the_queue->head;
if (ret) {
the_queue->head = ret->next;
if (!the_queue->head) {
the_queue->tail = NULL;
}
the_queue->size--;
}
pthread_mutex_unlock(&(the_queue->mutex));
return ret;
}
```
```c=
int tqueue_push(tqueue_t *the_queue, task_t *task)
{
pthread_mutex_lock(&(the_queue->mutex));
task->next = NULL;
if (the_queue->tail)
the_queue->tail->next = task;
the_queue->tail = task;
if (the_queue->size++ == 0)
the_queue->head = task;
pthread_mutex_unlock(&(the_queue->mutex));
return 0;
}
```
### 嘗試實作可以讀取phonebook-word檔
[參考andy19950共筆](https://hackmd.io/JwFgrARmCmAcBsBaCBDCBGRIDMtPABN0AzRYgY2FlgIICZ5tjgg=)實作以下修改
修改 list.c and list.h
* 原本`struct node intptr_t `要改成字元可讀的型式 `typedef char* val_t;`
* `node_new()`將資料放進來方式變成`strcpy((char *)node->data, val);`
* `list_print()`方法也要更動輸出為`output.txt`
```c=
void list_print(llist_t *list)
{
node_t *cur = list->head;
/* FIXME: we have to validate the sorted results in advance. */
FILE *fp = fopen(output, "w");
while (cur) {
fprintf(fp,"%s\n", (char *)cur->data);
cur = cur->next;
}
fclose(fp);
}
```
修改 main.c
* 先將`word.txt` 透過指令亂序`$ uniq words.txt | sort -R > input.txt`
* 為了可以讀取`input.txt`,先開頭定義`#define USAGE "usage: ./sort [thread_count] [file_name]\n"`接著修改資料抓取方式,將`scanf` 改成 `fgets`讀檔案內資料,行的結尾要更改`'\0'`不然會紀錄原本的換行字元
```c=
for (int i = 0; i < data_count; ++i) {
long int data;
scanf("%ld", &data);
list_add(the_list, data);
}
while(fgets(line, sizeof(line), fp)!= NULL) {
line[strlen(line)-1] = '\0';
list_add(the_list, line);
}
```
* 原先數字比較大小更改為字串比較,`andy19950共筆`提出了`strcmp`比較大小回傳1 , 0 , -1實作`small`的內容很簡單明瞭,覺得很厲害不然原本直覺想到的就是硬幹而已@@
```c=
int cmp = strcmp(a->head->data, b->head->data);
llist_t *small = (llist_t *)
((intptr_t *) a * (cmp <= 0) +
(intptr_t *) b * (cmp > 0));
```
> 原本照`andy19950`程式碼實作出現一些Waring,主要是const char * 類似的型態類別有問題,指定轉換的格式後就OK了,所以資料在轉換過程中要遵循型態不然可能會有waring甚至會有奇怪的錯誤。
### 設計自動測試的機制
* 將原本的`word.txt` 跟 後來產生的`output.txt`做比較,紀錄第幾筆資料錯誤並顯示
```c=
void check_data()
{
FILE *word, *data;
data = fopen("output.txt", "r");
word = fopen("words.txt", "r");
char output[16], answer[16];
int count = 0;
while(fgets(output, sizeof(output), data) != NULL){
fgets(answer, sizeof(answer), word);
if(strcmp(answer, output) != 0) {
printf("Wrong %d data %s\n",++count, output);
}
++count;
}
printf("data are right\n");
fclose(word);
fclose(data);
}
```
### 測試未優化版本 thread 1 - 512
![](https://i.imgur.com/YP4poIz.png)
> thread 在64以上,時間大幅度上升,代表後面沒有task時, thread會一直耗費CPU的資源
* 使用 mutrace 了解程式 mutex 的狀況
`$ mutrace ./sort 64 input.txt `
```
mutrace: 0.2 sucessfully initialized for process sort (pid 10023).
64 0.4450
mutrace: Showing statistics for process sort (pid 10023).
mutrace: 3 mutexes used.
Mutex #2 (0x0x5258ba0) first referenced by:
/usr/lib/mutrace/libmutrace.so(pthread_mutex_init+0xf2) [0x7fead81b34b2]
./sort(tqueue_init+0x38) [0x401477]
./sort(tpool_init+0x6a) [0x4016ab]
./sort(main+0x180) [0x402011]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7fead7bea830]
Mutex #0 (0x0x7feab777d860) first referenced by:
/usr/lib/mutrace/libmutrace.so(pthread_mutex_lock+0x49) [0x7fead81b36b9]
/lib/x86_64-linux-gnu/libgcc_s.so.1(_Unwind_Find_FDE+0x2c) [0x7feab7579fec]
[(nil)]
mutrace: Showing 2 most contended mutexes:
Mutex # Locked Changed Cont. tot.Time[ms] avg.Time[ms] max.Time[ms] Flags
2 699815 657024 461951 84.738 0.000 0.020 Mx.--.
0 320 131 41 0.032 0.000 0.001 M-.--.
||||||
/|||||
Object: M = Mutex, W = RWLock /||||
State: x = dead, ! = inconsistent /|||
Use: R = used in realtime thread /||
Mutex Type: r = RECURSIVE, e = ERRRORCHECK, a = ADAPTIVE /|
Mutex Protocol: i = INHERIT, p = PROTECT /
RWLock Kind: r = PREFER_READER, w = PREFER_WRITER, W = PREFER_WRITER_NONREC
mutrace: Note that the flags column R is only valid in --track-rt mode!
mutrace: Total runtime is 529.922 ms.
mutrace: Results for SMP with 8 processors.
```
> mutex 2 被大量 locked 住了,熱點應該是 tqueue 之間的 task ,必須要等待 mutex 被釋放才能繼續進行,所以通通都卡在 lock 等待資源被釋放
## 嘗試管理 work thread
[額外整理Pthread探討](https://hackmd.io/OwEwRgbArAhgxgZgLQE5goAxICwFNjIwgpRIwCMYAHGAExQoTYgJA===?both)
[參考20組](https://hackmd.io/s/SJUfbulyx#) 使用 Condition variables 的 singal 來控制 mutex
[參照How to synchronize manager/worker pthreads without a join?](http://stackoverflow.com/questions/12282393/how-to-synchronize-manager-worker-pthreads-without-a-join)
修改 `main.c`在 負責 task 運作的 `task_run` 加上 `pthread_cond_wait` 等待信號回傳喚醒進行動作,但是調用`pthread_cond_wait` 前要搭配`pthread_mutex_lock`使用。
```c=
static void *task_run(void *data)
{
(void) data;
while (1) {
/* Lock must be taken to wait on conditional variable */
pthread_mutex_lock(&pool->queue->mutex);
/* Wait on condition variable, check for spurious wakeups.
When returning from pthread_cond_wait(), we own the lock. */
while(pool->queue->size == 0) {
pthread_cond_wait(&(pool->queue->cond), &(pool->queue->mutex));
}
if (pool->queue->size ==0) break;
/*Grab out task */
task_t *_task = tqueue_pop(pool->queue);
/* Unlock*/
pthread_mutex_unlock(&(pool->queue->mutex));
if (_task) {
if (!_task->func) {
tqueue_push(pool->queue, _task);
break;
} else {
_task->func(_task->arg);
free(_task);
}
}
}
pthread_exit(NULL);
}
```
修改 `threadpool.c` 將原本 `tqueue_pop`的 mutex 拿掉,留下 `tqueue_push`並建立`pthread_cond_signal`去喚醒 `pthread_cond_wait`的 thread
```c=
int tqueue_push(tqueue_t *the_queue, task_t *task)
{
pthread_mutex_lock(&(the_queue->mutex));
task->next = NULL;
if (the_queue->tail)
the_queue->tail->next = task;
the_queue->tail = task;
if (the_queue->size++ == 0)
the_queue->head = task;
pthread_cond_signal(&(the_queue->cond));
pthread_mutex_unlock(&(the_queue->mutex));
return 0;
}
```
![](https://i.imgur.com/DS8bffE.png)
> 後面大的 thread 時間大幅度下降了
* 使用 mutrace 了解用 condition variables 改善 mutex 的狀況
`$ mutrace ./sort 64 input.txt `
```
mutrace: 0.2 sucessfully initialized for process sort (pid 22346).
64 0.3986
mutrace: Showing statistics for process sort (pid 22346).
mutrace: 3 mutexes used.
Mutex #0 (0x0x7f9ecffe5860) first referenced by:
/usr/lib/mutrace/libmutrace.so(pthread_mutex_lock+0x49) [0x7f9ef0a1b6b9]
/lib/x86_64-linux-gnu/libgcc_s.so.1(_Unwind_Find_FDE+0x2c) [0x7f9ecfde1fec]
[(nil)]
Mutex #1 (0x0x36adba0) first referenced by:
/usr/lib/mutrace/libmutrace.so(pthread_mutex_init+0xf2) [0x7f9ef0a1b4b2]
./sort(tqueue_init+0x38) [0x401577]
./sort(tpool_init+0x6a) [0x4017a7]
./sort(main+0x180) [0x402197]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7f9ef0452830]
mutrace: Showing 2 most contended mutexes:
Mutex # Locked Changed Cont. tot.Time[ms] avg.Time[ms] max.Time[ms] Flags
0 320 112 18 0.032 0.000 0.001 M-.--.
1 195 129 0 0.213 0.001 0.012 Mx.--.
||||||
/|||||
Object: M = Mutex, W = RWLock /||||
State: x = dead, ! = inconsistent /|||
Use: R = used in realtime thread /||
Mutex Type: r = RECURSIVE, e = ERRRORCHECK, a = ADAPTIVE /|
Mutex Protocol: i = INHERIT, p = PROTECT /
RWLock Kind: r = PREFER_READER, w = PREFER_WRITER, W = PREFER_WRITER_NONREC
mutrace: Note that the flags column R is only valid in --track-rt mode!
mutrace: Total runtime is 492.383 ms.
mutrace: Results for SMP with 8 processors.
```
> 發現 mutex 的 locked 從幾十萬降到剩幾百了,代表之前都是等待 mutex 被釋放才能繼續進行,加入 condition wait/singal 喚醒的機制改善了 lock 等待的問題。
## 研究 Lock-free 實作
[分組作業-以 concurrent Linked List 實做 merge sort](https://hackmd.io/BwNgJgjArATADAZgLRQKYCMDGSAsOwCcS6AZuqksPtGHGDsAIaZA?view#)
#### 簡介
People often describe lock-free programming as programming without mutexes, which are also referred to as locks. That’s true, but it’s only part of the story. The generally accepted definition, based on academic literature, is a bit more broad. At its essence, lock-free is a property used to describe some code, without saying too much about how that code was actually written.
> lock-free 不表示一定沒有 mutex 或 lock 的互斥鎖,但是減少使用互斥鎖也是一種做法,然而最主要的是一種概念,並沒有要求 code 撰寫方式。
If a given part of your code doesn’t satisfy these conditions, then that part is not lock-free.
要完全符合下面的流程,才是一個完整的 lock-free 程式
![](https://i.imgur.com/VtJr7NL.png)
閱讀老師提供 [concurrent-ll](https://github.com/jserv/concurrent-ll)程式碼以及[20組的共筆資料](https://hackmd.io/s/SJUfbulyx#)
* 導入`atomic_ops_if.h` 並嘗試在原本的 `tqueue_pop` `tqueue_push` 加上 CAS 作法
```c=
task_t *tqueue_pop(tqueue_t *the_queue)
{
task_t *ret;
ret = the_queue->head;
if (ret) {
/*if head == tail means one node in queue*/
if(CAS_PTR(&(the_queue->tail), ret, NULL) == ret) {
return NULL;
} else {
CAS_PTR(&(the_queue->head), ret, ret->last);
FAD_U32(&(the_queue->size));
}
}
return ret;
}
```
```c=
nt tqueue_push(tqueue_t *the_queue, task_t *task)
{
task->next = NULL;
if (CAS_PTR(&(the_queue->tail), NULL, task) == NULL) {
CAS_PTR(&(the_queue->head), NULL, task);
} else {
CAS_PTR(&(the_queue->tail->next),the_queue->tail, task);
}
FAI_U32(&(the_queue->size));
pthread_cond_signal(&(the_queue->cond));
return 0;
}
```
![](https://i.imgur.com/WZaY74o.png)
> 看起來差別不大,可能是因為 CAS用法還不太會,不確定有沒有真正的 lcok-free 還需要再研究一下。
`$ mutrace ./sort 64 input.txt `
```
mutrace: 0.2 sucessfully initialized for process sort (pid 13784).
64 0.4127
mutrace: Showing statistics for process sort (pid 13784).
mutrace: 3 mutexes used.
Mutex #1 (0x0x7fd8cd7de860) first referenced by:
/usr/lib/mutrace/libmutrace.so(pthread_mutex_lock+0x49) [0x7fd8ee2146b9]
/lib/x86_64-linux-gnu/libgcc_s.so.1(_Unwind_Find_FDE+0x2c) [0x7fd8cd5dafec]
[(nil)]
Mutex #0 (0x0x4c6aba0) first referenced by:
/usr/lib/mutrace/libmutrace.so(pthread_mutex_init+0xf2) [0x7fd8ee2144b2]
./sort(tqueue_init+0x38) [0x4015a7]
./sort(tpool_init+0x6a) [0x4017ba]
./sort(main+0x180) [0x4021b4]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7fd8edc4b830]
mutrace: Showing 2 most contended mutexes:
Mutex # Locked Changed Cont. tot.Time[ms] avg.Time[ms] max.Time[ms] Flags
1 320 120 36 0.033 0.000 0.001 M-.--.
0 258 192 0 0.071 0.000 0.015 M!.--.
||||||
/|||||
Object: M = Mutex, W = RWLock /||||
State: x = dead, ! = inconsistent /|||
Use: R = used in realtime thread /||
Mutex Type: r = RECURSIVE, e = ERRRORCHECK, a = ADAPTIVE /|
Mutex Protocol: i = INHERIT, p = PROTECT /
RWLock Kind: r = PREFER_READER, w = PREFER_WRITER, W = PREFER_WRITER_NONREC
mutrace: Note that the flags column R is only valid in --track-rt mode!
mutrace: Total runtime is 505.703 ms.
mutrace: Results for SMP with 8 processors.
mutrace: WARNING: 386 inconsistent mutex uses detected. Results might not be reliable.
mutrace: Fix your program first!
```
> 使用 CAS 做法反而多出了`mutrace: WARNING: 386 inconsistent mutex uses detected. Results might not be reliable`,是因為 threads 執行中 CAS 會去檢查 新舊問題,如果沒問題就直接使用並改變,才造成 不一致的 mutex warning ?
---
課程內容
電子工程專輯
link-list locality 散落