contributed by < ga639487
>
請加速,共筆請照指定格式撰寫亮谷
tina0405
>*在內容中提到HASH FUNCTION有好幾種不同的,可以都嘗試看看。
*BINARYTREE的結果我本身還在嘗試,想多看看別人不同的實做想法。
asus gl552vw
由於電腦問題,安裝作業系統有些困難,
同一型號可參考這篇。(另外感謝助教大力幫助)
Architecture: x86_64
CPU 作業模式: 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
每核心執行緒數:2
每通訊端核心數:4
Socket(s): 1
NUMA 節點: 1
供應商識別號: GenuineIntel
CPU 家族: 6
型號: 94
Model name: Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz
製程: 3
CPU MHz: 1284.867
CPU max MHz: 3500.0000
CPU min MHz: 800.0000
BogoMIPS: 5183.95
虛擬: VT-x
L1d 快取: 32K
L1i 快取: 32K
L2 快取: 256K
L3 快取: 6144K
NUMA node0 CPU(s): 0-7
由指令$ lscpu
取得
kernel的部份
Linux ga639487-GL552VW 4.4.0-31-generic
由指令$ uname -a
取得
先打開phonebook,執行
$ cd phonebook
$ ./phonebook_orig
結果如下
size of entry : 136 bytes
execution time of append() : 0.044711 sec
execution time of findName() : 0.004928 sec
再用perf stat測試cache-misses
$ sudo perf stat --repeat 100 -e cache-misses,cache-references,instructions,cycles ./phonebook_orig
結果為
4,445,825 cache-misses # 90.981 % of all cache refs ( +- 0.16% )
4,886,546 cache-references ( +- 0.11% )
260,738,922 instructions # 1.50 insns per cycle ( +- 0.02% )
174,137,398 cycles ( +- 0.15% )
0.057684120 seconds time elapsed ( +- 0.62% )
cache-misses為90.981%
讀過老師提供的資料及同學的共筆後,以下是幾個較常使用的改進方法。
phonebook_orig.c
entry *findName(char lastName[], entry *pHead)
{
while (pHead != NULL) {
if (strcasecmp(lastName, pHead->lastName) == 0)
return pHead;
pHead = pHead->pNext;
}
return NULL;
}
entry *append(char lastName[], entry *e)
{
/* allocate memory for the new entry and put lastName */
e->pNext = (entry *) malloc(sizeof(entry));
e = e->pNext;
strcpy(e->lastName, lastName);
e->pNext = NULL;
return e;
}
phonebook_orig.h
typedef struct __PHONE_BOOK_ENTRY {
char lastName[MAX_LAST_NAME_SIZE];
char firstName[16];
char email[16];
char phone[10];
char cell[10];
char addr1[16];
char addr2[16];
char city[16];
char state[2];
char zip[5];
struct __PHONE_BOOK_ENTRY *pNext;
} entry;
entry *findName(char lastName[], entry *pHead);
entry *append(char lastName[], entry *e);
#endif
從初始版本中可以看出,在orig.h的struct中宣告了許多變數,但在orig.c的findName和append中只使用到lastName,所以可以將其他沒有使用到的變數寫進另外一個struct裡。
因此,將opt.h的程式碼改為
typedef struct __PHONE_BOOK_ENTRYdetail {
char firstName[16];
char email[16];
char phone[10];
char cell[10];
char addr1[16];
char addr2[16];
char city[16];
char state[2];
char zip[5];
} entrydetail;
typedef struct __PHONE_BOOK_ENTRY {
char lastName[MAX_LAST_NAME_SIZE];
struct entrydetail *detail;
struct __PHONE_BOOK_ENTRY *pNext;
} entry;
entry *findName(char lastName[], entry *pHead);
entry *append(char lastName[], entry *e);
#endif
再次執行後得到
size of entry : 32 bytes
execution time of append() : 0.034947 sec
execution time of findName() : 0.001787 sec
Performance counter stats for './phonebook_opt' (100 runs):
1,437,032 cache-misses # 63.929 % of all cache refs ( +- 0.47% )
2,247,868 cache-references ( +- 0.16% )
244,743,154 instructions # 2.08 insn per cycle ( +- 0.02% )
117,680,863 cycles ( +- 0.15% )
0.038929938 seconds time elapsed ( +- 0.21% )
cache-misses從原本的90.981 %降到63.929 %
$ make plot