2016q3 Homework3 (mergesort-concurrence)

contributed by <abba123>

開發環境

OS : ubuntu 16.04.1 LTS

lscpu

Architecture:          x86_64
CPU 作業模式：    	32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
每核心執行緒數：		2
每通訊端核心數：		2
Socket(s):             1
NUMA 節點：         	1
供應商識別號：  		GenuineIntel
CPU 家族：          	6
型號：              	58
Model name:            Intel(R) Core(TM) i5-3210M CPU @ 2.50GHz
製程：              	9
CPU MHz：             	1924.316
CPU max MHz:           3100.0000
CPU min MHz:           1200.0000
BogoMIPS:              4988.73
虛擬：              	VT-x
L1d 快取：          	32K
L1i 快取：         	 32K
L2 快取：          	 256K
L3 快取：          	 3072K
NUMA node0 CPU(s):     0-3

案例分析: mergesort-concurrent (對單向 Linked List)

Thread pool

ref:
http://openhome.cc/Gossip/DesignPattern/ThreadPool.htm
http://swind.code-life.info/posts/c-thread-pool.html

在一般使用 thread 的時候，當一個請求到來，就會建立一個新的 thread，用完之後，便不再使用。
但 thread 的建立需要系統資源，對於一個接受許多請求的情況，就會不斷的建立新執行緒，造成系統效能的降低。
thread pool 的概念就是重複使用建好的 thread。

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

task_t 裡面包含了要做的 function 和要傳的參數





typedef struct _task {
    void (*func)(void *);
    void *arg;
    struct _task *next, *last;
} task_t;

task_queue 的資料結構，包含了 lock 去控制 task 的執行






typedef struct {
    task_t *head, *tail;
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    uint32_t size;
} tqueue_t;

thread_pool 裡面包含了所有的 thread 以及 task_queue





typedef struct {
    pthread_t *threads;
    uint32_t count;
    tqueue_t *queue;
} tpool_t;

作業要求

將 merge sort 的實做改為可接受 phonebook-concurrent 的 35 萬筆資料輸入的資料檔
- 字典檔資料需要事先用 sort -R 處理過
- 思考如何得到均勻分佈的亂數排列，並且設計自動測試的機制

原本檔案接收的資料是數字，但我們要改成可以接受英文的排序，所以必須要做一點小更動

首先把 node 裡面改成可接受 string




typedef struct node {
    char str[20];
    struct node *next;
} node_t;

這邊我們改一下判斷式
strcmp 這邊的比較是依照字典排序
剛好符合我們的需求




while (a->size && b->size) {
        llist_t *small = (llist_t *)
                         ((intptr_t) a * (strcmp(a->head->str,b->head->str)<= 0) +
                          (intptr_t) b * (strcmp(a->head->str,b->head->str) > 0));

我們先把原本的資料打散

$ uniq words.txt | sort -R > words_input.txt

Makefile 的部份寫了一套自動比較的script
這邊我們用 diff 來比較正確性

plot: sort
	./sort 1
	diff words.txt output.txt 
	./sort 2
	diff words.txt output.txt 
	./sort 4
	diff words.txt output.txt 
	./sort 8
	diff words.txt output.txt 
	./sort 16
	diff words.txt output.txt 
	./sort 32
	diff words.txt output.txt 
	gnuplot plot.gp

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →