2016q3 Homework1 (phonebook)

contributed by <LanKuDot>

Git Hook

當某事件發生時，git 可以透過 hook 觸發對應腳本
所有 hook 腳本被存放在 .git/hooks 中，可以為 shell script, Perl, Python 等
在 .git/hooks 中，以 .sample 為副檔名為 git 提供的範例腳本，要使用去掉 .sample 即可
一些 hook
- pre-commit：在 commit 之前執行，可以用來檢查程式碼可否運行。當 hook 回傳非 0 值，git 會放棄這次的 commit，但可以用 git commit --no-verify 來忽略。
- commit-msg：可以用來查 commit message 是否符合特定格式。
- post-commit：commit 後執行，可以用作通知。

Astyle

查看作業 pre-commit 提供的 script

astyle option

--style=kr：用 k&r 風格的 brackets

int Foo(int bar)
{	/* Function, namespace, class 要斷行 */
	if (bar != 0) { /* 其他的不能斷行 */
		Foo2();
		return 0;
	} else
		return 1;
}

--indent=space=4：縮排用 4 空格
--indent-switchs：switch statement 中的所有 case block 都要 indent

switch (foo) {
	case 1:
		x += 19;
		break;
	default:
		x += 1;
		break;
}

實做
- [L17] 先用 git diff，指定 ACMR (Added, Copied, Modified, Renamed)，配合 grep 將檔名為 *.cpp, *.c, *.h 的檔案列出來
- 然後針對每一個檔案
  - [L19] 先將指定檔案從 index 複製到暫存檔中，輸出內容為 暫存檔名原本檔名，所以利用 cut -f 1 濾出第一個 section 的文字，也就是暫存檔名
  - [L20] 做一份這個暫存檔的暫存檔
  - [L21] astyle [options] < inputFile > beautifuledFile：將其中一個暫存檔丟給 astyle 做美化並輸出到另一個暫存檔
  - [L22] 只要比較這兩個檔案，有不同就代表被美化過，所以就讓回傳值為 1，並輸出訊息
在 vim 將 tab 轉換為 space

:set tabstop=4	"設定一個 tab 為 4 個 space
:set expandtab	"設定按下 tab 時為插入 spaces
:retab			"將既有的 tab 轉換成 spaces
:.retab			"只轉換目前所在之行的 tab

Hash table

這次使用 hash table 來建立搜尋的 index

hash function：將不同長度的資料轉換成相同大小的函式。以 hash 出來的結果呈 uniform distribution 為佳。
hash table：一種資料結構，元素為 key-vaule pair，以 hash 過後的資料作為 key，原始資料作為 value。通常會以 key 作為 index，而使用 linked list 儲存擁有同樣 key 的 values。

Array of pointers

使用 array of pointers 作為 hash table。array 的index 就是 hash table 的 index，每個 index 都指向一個 linked list。
必需要 pass "pointer to array of pointers" 到 append function，也就是 append(char lastname[], entry **e)，e 是 pointer of pointer to entry，所以只要 pass array of pointers 的名子到 function 就可以了。參考下面程式：
















void printElement(int **e, int index)
{
    printf("e: %p\n", e);
    printf("e[%d]: %p\n", index, e[index]);
    printf("*e[%d]: %d\n", index, *e[index]);
}

int main(void)
{
    int a[5] = {1, 2, 3, 4, 5}; 
    int *b[5] = {a, a+1, a+2, a+3, a+4};
    printf("a: %p\n", a);
    printf("b: %p\n", b);
    printElement(e, 0); 
    return 0;
}

輸出

a: 0x7ffc69a05830
b: 0x7ffc69a05850
e: 0x7ffc69a05850
e[0]: 0x7ffc69a05830
*e[0]: 1

Build hash table

Get hash value：以 lastName 作為 key，丟入 hash function 得到 hash value
Get hash index：將 hash value 依照 hash table 的大小做 modulo，找到對應的 index。如大小是 4096，則 index = (hashValue & 0xFFF)
Update linked list：將新的 entry 放到 linked list 的頭，所以越早加入的資料，會被放到越後面

Search in the hash table

Get hash index：跟 build hash table 中一樣
Search the entry：在指定 index 的 linked list 中搜尋 entry

這邊我將搜尋函式分成

entry *findName(char lastName[], entry **pHashTable)：用於外部呼叫，傳入整個 hash table 的指標
static entry *findEntry(const char value[], entry * const pHead)：findName 找到 hash index 後將對應的 linked list 傳入即可。可以在這裡實作搜尋 linked list 的優化。

效能分析

原始 vs Jenkins

觀察發現優化版的 append 所花的時間比原始版略高，推測因為其中要計算 hash value 而消耗比較多的時間。
至於 findName 則有顯著改進，主要是因為在這裡指定的搜尋字詞為字典最後一個字，而以我更新 linked list 的方式，最後一個字會在最前頭。
另外 cache-miss 在優化版本中少了 10%，branch-misses 則略高於原始版本

原始版本

 Performance counter stats for './phonebook_orig' (100 runs):

           875,020      cache-misses              #   41.627 % of all cache refs      ( +-  0.47% ) [33.30%]
         2,102,031      cache-references                                              ( +-  0.34% ) [33.88%]
       266,077,121      instructions              #    0.97  insns per cycle          ( +-  0.26% ) [51.48%]
       273,433,837      cycles                     ( +-  0.18% ) [50.88%]
         1,536,735      branch-misses             #    3.05% of all branches          ( +-  0.20% ) [50.94%]
        50,326,010      branches                                                      ( +-  0.10% ) [33.10%]

       0.217956438 seconds time elapsed                                          ( +-  0.17% )

優化版本

 Performance counter stats for './phonebook_opt' (100 runs):

           610,906      cache-misses              #   33.933 % of all cache refs      ( +-  0.71% ) [33.52%]
         1,800,335      cache-references                                              ( +-  0.34% ) [33.70%]
       308,570,975      instructions              #    1.00  insns per cycle          ( +-  0.16% ) [50.48%]
       309,035,581      cycles                     ( +-  0.13% ) [50.37%]
         1,815,016      branch-misses             #    3.47% of all branches          ( +-  0.38% ) [50.72%]
        52,364,968      branches                                                      ( +-  0.09% ) [33.47%]

       0.242260209 seconds time elapsed                                          ( +-  0.11% )

在 hash function 中使用 while 或 for 迴圈

有趣的是，我本來以為使用 while 迴圈來取得每一個 character 效能會比較好，但結果發現 cache-reference 變高了：

For loop:
           610,906      cache-misses              #   33.933 % of all cache refs      ( +-  0.71% ) [33.52%]
         1,800,335      cache-references                                              ( +-  0.34% ) [33.70%]
While loop:
           658,116      cache-misses              #   34.765 % of all cache refs      ( +-  0.58% ) [33.80%]
         1,893,031      cache-references                                              ( +-  0.28% ) [34.57%]

推測使用 while loop 必須要一直檢測指向 key 的 pointer 是否為 null character，所以在每次計算迴圈會 access cache 兩次，一次是為了 while loop 條件判斷，一次是計算 hash value。而如果使用 for loop 則只要判斷目前的 iteration 有沒有超過範圍就好了。

原始 vs djb2

由於 djb2 hash function 演算法迴圈中計算次數比 jenkins 少，可以發現 append 的耗費時間少了約 0.03 秒。另外 cache 的 reference 次數也降低了

 Performance counter stats for './phonebook_opt' (100 runs):

           504,582      cache-misses              #   29.810 % of all cache refs      ( +-  0.86% ) [32.27%]
         1,692,631      cache-references                                              ( +-  0.40% ) [33.96%]
       291,627,296      instructions              #    1.09  insns per cycle          ( +-  0.15% ) [52.19%]
       267,873,460      cycles                     ( +-  0.19% ) [52.11%]
         1,781,568      branch-misses             #    3.36% of all branches          ( +-  0.52% ) [51.16%]
        53,045,564      branches                                                      ( +-  0.11% ) [32.76%]

       0.214523920 seconds time elapsed                                          ( +-  0.33% )

原始 vs lose-lose

lose-lose hash function 就更簡單，純粹將每個字元的值加起來，作為 hash value。主要進步為 cache 的 reference 次數又更少了。

 Performance counter stats for './phonebook_opt' (100 runs):

           457,842      cache-misses              #   35.023 % of all cache refs      ( +-  0.56% ) [33.77%]
         1,307,246      cache-references                                              ( +-  0.59% ) [35.36%]
       275,425,010      instructions              #    1.04  insns per cycle          ( +-  0.06% ) [52.46%]
       263,738,002      cycles                     ( +-  0.09% ) [51.85%]
         1,814,252      branch-misses             #    3.38% of all branches          ( +-  0.33% ) [50.02%]
        53,643,758      branches                                                      ( +-  0.08% ) [31.75%]

       0.205535865 seconds time elapsed                                          ( +-  0.11% )

Hash function 資料分佈

觀察到 Jenkins 跟 djb2 分佈得很平均，約在 80~100 之間。但是 lose-lose 因為只是將字串的值做相加，所以集中分佈在某些區塊，當遇到 worst case，就要搜索 1300 多個 entry。

Jenkins
djb2
lose-lose

表格整理

Zyxel

執行時間 (sec)

hash	`append()`	`find()`
原始	0.141627	0.022409
Jenkins	0.178912	0.000001
djb2	0.150415	0.000001
lose-lose	0.148396	0.000001

Perf (K)

hash	CM	CR	%	Instr.	Cycles	I/C
原始	875	2102	41.6	266077	273433	0.97
Jenkins	610	1800	33.9	308570	309035	1.00
djb2	504	1692	29.8	291627	267873	1.09
lose-lose	457	1307	35.0	275425	263738	1.04