contributed by < steven0203
>
ryanwang522
tina0405
中英文字間請記得以空白隔開!
進度落後,趕工趕工!
課程助教
os: Ubuntu 16.04 LTS
Architecture: x86_64
CPU 作業模式: 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
每核心執行緒數:2
每通訊端核心數:4
Socket(s): 1
NUMA 節點: 1
供應商識別號: GenuineIntel
CPU 家族: 6
型號: 58
Model name: Intel(R) Core(TM) i7-3610QM CPU @ 2.30GHz
製程: 9
CPU MHz: 1200.132
CPU max MHz: 3300.0000
CPU min MHz: 1200.0000
BogoMIPS: 4589.56
虛擬: VT-x
L1d 快取: 32K
L1i 快取: 32K
L2 快取: 256K
L3 快取: 6144K
$./phonebook_orig
size of entry : 136 bytes
execution time of append() : 0.100508 sec
execution time of findName() : 0.006552 sec
$make cache-test
Performance counter stats for './phonebook_orig' (100 runs):
2,040,618 cache-misses # 94.610 % of all cache refs ( +- 0.14% )
2,156,864 cache-references ( +- 0.14% )
260,715,620 instructions # 1.42 insns per cycle ( +- 0.02% )
183,005,647 cycles ( +- 0.60% )
0.081705608 seconds time elapsed ( +- 1.88% )
$sudo perf record -e cycles ./phonebook_orig
$sudo perf report
因為再做搜尋的時候只需要 lastName 不需要其他資料所以可以將原本的資料結構縮小,來減少cache miss
typedef struct __PHONE_BOOK_DETAIL {
char firstName[16];
char email[16];
char phone[10];
char cell[10];
char addr1[16];
char addr2[16];
char city[16];
char state[2];
char zip[5];
}detail;
typedef struct __LAST_NAME_ENTRY{
char lastName[MAX_LAST_NAME_SIZE];
detail *entryDetail;
struct __LAST_NAME_ENTRY *pNext;
}entry;
$make plot
Performance counter stats for './phonebook_orig' (100 runs):
2,032,413 cache-misses # 95.041 % of all cache refs ( +- 0.20% )
2,138,451 cache-references ( +- 0.21% )
260,684,240 instructions # 1.38 insns per cycle ( +- 0.02% )
189,355,737 cycles ( +- 0.30% )
Performance counter stats for './phonebook_opt' (100 runs):
390,124 cache-misses # 67.908 % of all cache refs ( +- 0.35% )
574,490 cache-references ( +- 0.51% )
243,875,613 instructions # 1.85 insns per cycle ( +- 0.02% )
131,948,525 cycles ( +- 0.22% )
append()時間從 0.057sec 降到了 0.043sec
findName()時間也從 0.0062sec 降到了 0.0027sec
$sudo perf record -e cycles ./phonebook_orig
$sudo perf report
這裡使用 BKDRHash 來優化, hash function 如下,我將 HASH_TABLE_SIZE 設成42737,參考 nekoneko 的實驗結果
unsigned int BKDRHash(char *str)
{
unsigned int seed = 131;
unsigned int hash = 0;
while (*str) {
hash = hash * seed + (*str++);
}
return hash%MAX_HASH_TABLE_SIZE;
}
儲存每一個 entry 資料是使用跟優化版本一同樣的 struct , hash table 是儲存經過 hash 後同樣 index 的 linked list 的 head
typedef struct __HASH_TABLE_ENTRY {
entry *head;
} tableEntry;
tableEntry *initializeTable()
{
tableEntry *hashTable=(tableEntry*)malloc(sizeof(tableEntry)*MAX_HASH_TABLE_SIZE);
for(int i=0; i<MAX_HASH_TABLE_SIZE; ++i) {
(hashTable+i)->head=NULL;
}
return hashTable;
}
void append(char lastName[], tableEntry *hashTable)
{
int index=BKDRHash(lastName);
entry *newEntry=(entry *)malloc(sizeof(entry));
strcpy(newEntry->lastName,lastName);
newEntry->pNext=hashTable[index].head;
hashTable[index].head=newEntry;
}
entry *findName(char lastName[],tableEntry *hashTable)
{
int index=BKDRHash(lastName);
entry *head=hashTable[index].head;
while (head) {
if (strcasecmp(lastName, head->lastName) == 0)
return head;
head = head->pNext;
}
return NULL;
}
結果:
Performance counter stats for './phonebook_opt_hash' (100 runs):
347,546 cache-misses # 54.858 % of all cache refs ( +- 0.55% )
633,536 cache-references ( +- 1.14% )
235,882,703 instructions # 1.69 insns per cycle ( +- 0.02% )
139,960,783 cycles ( +- 0.66% )
0.060647815 seconds time elapsed ( +- 2.36% )
cache-misses 從之前的 68% 降到了 54.86%
append 的執行時間因為每次都要花時間在處理 hash function,所以比起只有減少資料結構大小時間有增加,但是 findName 的部份減到很少
Phonebook
Dada Chan
hash function 觀念和實務
哈希表之bkdrhash算法解析及扩展
各种字符串Hash函数比较
nekoneko