contributed by <Weinux
>
believe7028
使用指令:$ lstopo
可列出機器的 cache 架構。lstopo 由 hwloc
套件提供
Makefile 裡面幫我們整理要編譯哪些檔案,並且輸出成執行檔. 包含使用編譯器的最佳化等級警告訊息與遵循哪種c語言的規格…等
其中的 -DIMPL="" 是 #define IMPL是什麼.以下的程式碼代表在編譯 phonebook_orig 這個執行檔過程中,IMPL 為"phonebook_orig.h" 這樣在main.c中就可以根據不同的-DIMPL 來#include 不同的.h file 而只需要一個main.c檔案即可。
要注意的是為了顯示出雙引號 " 前面必須加入跳脫符號\
phonebook_orig: $(SRCS_common) phonebook_orig.c phonebook_orig.h
$(CC) $(CFLAGS_common) $(CFLAGS_orig) \
-DIMPL="\"$@.h\"" -o $@ \
$(SRCS_common) $@.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <assert.h>
#include IMPL
#define MAX_LAST_NAME_SIZE 16
typedef struct __PHONE_BOOK_ENTRY {
char lastName[MAX_LAST_NAME_SIZE];
char firstName[16];
char email[16];
char phone[10];
char cell[10];
char addr1[16];
char addr2[16];
char city[16];
char state[2];
char zip[5];
struct __PHONE_BOOK_ENTRY *pNext;
} entry;
size of entry : 136 bytes
execution time of append() : 0.062888 sec
execution time of findName() : 0.006345 sec
Performance counter stats for ./phonebook_orig (100 runs):
2,067,590 cache-misses # 91.407 % of all cache refs
2,263,739 cache-references
263,109,049 instructions # 1.26 insns per cycle
205,633,764 cycles
0.077700382 seconds time elapsed ( +- 1.27% )
./phonebook_orig & sudo perf top -p $!
30.50% libc-2.21.so [.] __strcasecmp_l_avx
20.31% phonebook_orig [.] findName
12.21% phonebook_orig [.] main
3.49% libc-2.21.so [.] __strcpy_sse2_unaligned
3.08% libc-2.21.so [.] _IO_getline_info
assert(findName(input, e) &&
"Did you implement findName() in " IMPL "?");
assert(0 == strcmp(findName(input, e)->lastName, "zyxel"));
level 1 cache 有 32 kbit,struct 的大小有 136 bytes,32 * 1024 / (136*8) =30.12,若把整個struct entry都丟進 findName() 尋找,level 1 cache 最多也只能放 30 個entry,一共有 35 萬個單字,必定會發生頻繁的 cache miss。
將entry內其他資料封裝到另一個結構 detail, 並且用指標加入原來的entry 結構中
typedef struct __PHONE_BOOK_DETAIL {
char firstName[16];
char email[16];
char phone[10];
char cell[10];
char addr1[16];
char addr2[16];
char city[16];
char state[2];
char zip[5];
} detail;
typedef struct __PHONE_BOOK_ENTRY {
char lastName[MAX_LAST_NAME_SIZE];
detail *pDetail;
struct __PHONE_BOOK_ENTRY *pNext;
} entry;
size of entry : 32 bytes
execution time of append() : 0.041963 sec
execution time of findName() : 0.002994 sec
Performance counter stats for './phonebook_opt' (100 runs):
369,581 cache-misses # 57.650 % of all cache refs
628,276 cache-references
245,601,976 instructions # 1.66 insns per cycle
154,732,401 cycles
0.057283161 seconds time elapsed ( +- 1.63% )
unsigned int BKDRHash(char *str)
{
unsigned int seed = 131;
unsigned int hash = 0;
while (*str) {
hash = hash * seed + (*str++);
}
return (hash & 0x7FFFFFFF);
}
unsigned int hash(hash_table *hashtable , char *str)
{
unsigned int hash_value = 0;
while(*str)
hash_value = (hash_value << 5) - hash_value + (*str++);
return (hash_value % SIZE);
}
phonebook
cache miss
Weinux