2016q3 Homework1 (phonebook)

前置作業

建立Github Repository: AnCheTeng/phonebook
GNU/Linux 開發工具建置
astyle與git環境建置
研究perf原理與使用

TODO

Read Makefile語法簡介
- [x]gcc 及 Makefile 教學筆記
- [x]Makefile語法簡介
- Other Reference:
  - GNU Make基礎
  - GNU Make中文手冊
Read Linux 效能分析工具: Perf
Read 30 天精通 Git 版本控管
Read GNU純畫圖教學

開發環境

Operating System: Ubuntu 14.04 LTS (64bit)
CPU: 8-core, 3.40GHz
Memory: 16GB
Cache:
- L1d cache: 32K
- L1i cache: 32K
- L2 cache: 256K
- L3 cache: 8192K

使用 lscpu 和 sudo lshw 可看到硬體資訊

參考吳彥寬同學共筆

優化嘗試方案

減少struct大小
利用hash加速查詢
Binary Tree
資料壓縮

開發紀錄 2016/09/27

astyle and pre-commit hook

按照教學安裝 Git pre-commit hook，順便測試看看是否能偵測：
astyle-pre-commit-hook

成功偵測，照著下面的指令就可以符合要求的團隊開發風格！

Make sure you have run astyle as the following:
    astyle --style=kr --indent=spaces=4 --indent-switches --suffix=none *.[ch]

Reduce the struct size

先用老師的例子牛刀小試，把struct size縮小來看看變化吧！phonebook_opt.h如下：

















typedef struct __PHONE_BOOK_DETAIL {
    char firstName[16];
    char email[16];
    char phone[10];
    char cell[10];
    char addr1[16];
    char addr2[16];
    char city[16];
    char state[2];
    char zip[5];
} detail;

typedef struct __PHONE_BOOK_ENTRY {
    char lastName[MAX_LAST_NAME_SIZE];
    detail* detail_info;
    struct __PHONE_BOOK_ENTRY *pNext;
} entry;

make plot結果：

其中phonebook_orig的cache-miss為87.364%：

而phonebook_opt的cache-miss為39.871%：

可以看到cache-miss有很顯著的差異

Optimization with Hash-Table

稍微Survey了一下hash的種類後，找到了這篇有各種hash測試數據的blog，因此打算試試看BDKR-Hash和AP-Hash的效能誰比較優秀，code如下：











hashIndex BDKRhash(char *key, hashTable *ht)
{
    unsigned int seed = 13131; // 31 131 1313 13131 131313 etc..
    hashIndex hashValue = 0;

    while (*key) {
        hashValue = hashValue * seed + (*key++);
    }

    return hashValue % ht->size;
}














hashIndex APhash(char *key, hashTable *ht)
{
    hashIndex hashValue = 0;

    for (int i=0; *key; i++) {
        if ((i & 1) == 0) {
            hashValue ^= ((hashValue << 7) ^ (*key++) ^ (hashValue >> 3));
        } else {
            hashValue ^= (~((hashValue << 11) ^ (*key++) ^ (hashValue >> 5)));
        }
    }

    return hashValue % ht->size;
}

TODO: GNUplot將origin, reduce-size, BDKR-hash, 和AP-hash四者的效能一起畫出比較，GNUplot還不夠熟～！

TODO: GNUplot將不同效能的圖片給區隔開，現在都長一樣有點不太對

以下是BDKR-Hash所跑出的結果：

，除了append()略高0.005左右外，findName()幾乎降為零！有非常顯著的效能成長。

phonebook_opt的cache-miss為20.020%：

在使用DJB2-Hash時發現很奇怪的現象，質數選用越低時cache-miss升高，但append()卻變快，原因還在查詢中

performance變差的cache-miss：