2017q1 Homework1 (phonebook)

Reviewed by Stanley

優化開始時，一開始放的圖是硬體環境(cache 大小)，可以先放一下 cache miss rate 的資訊，比較可以知道問題在哪，及為什麼要這樣開始做。
使用 Hash function 可以標註一下使用的 hash function 為何種？實際的程式碼或是數學式子，這樣在後面比較時效能時，可以看得出來是哪一種 hash function 造成有優化的效果
"再參考過上課連結embedded-summer2015裡的作法…"這邊可以放上連結，讓參考的同學可以直接連到相關討論串; 另外我對這個作法有點小疑問，就是把其他資訊完全用另外一個 struct儲存，這樣是用什麼樣的方式另外把 last name 所屬的 node 跟其他資訊的 node 連結起來
有點好奇最後 Binary search 建立 dictionary 去存的方式的cach-misses rate，再等妳的實驗結果！
有看到妳在 github 的code 有include c file，通常不會這樣做，原先版本有#include IMPL 可以利用Makefile 裡的這段 -DIMPL來達到切換不同 header file，指令可以下$ make phonebook_opt

phonebook_opt: $(SRCS_common) phonebook_opt.c phonebook_opt.h
	$(CC) $(CFLAGS_common) $(CFLAGS_opt) \
		-DIMPL="\"$@.h\"" -o $@ \
$(SRCS_common) $@.c

中文輸入法安裝

因為我本身的第一次使用LINUX系統，加上本身的 Ubuntu Software 沒有 gcin ，所以嘗試用另一種方式安裝。
步驟：

1.取得 APT 金鑰

sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 835AB0E3

2.去軟體來源 add 新套件

3.安裝 gcin

sudo apt-get install gcin

開發環境

tina@tina-X550VB:~$ lscpu
Architecture:          x86_64
CPU 作業模式：    32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
每核心執行緒數：2
每通訊端核心數：2
Socket(s):             1
NUMA 節點：         1
供應商識別號：  GenuineIntel
CPU 家族：          6
型號：              58
Model name:            Intel(R) Core(TM) i5-3230M CPU @ 2.60GHz
製程：              9
CPU MHz：             1270.750
CPU max MHz:           3200.0000
CPU min MHz:           1200.0000
BogoMIPS:              5188.47
虛擬：              VT-x
L1d 快取：          32K
L1i 快取：          32K
L2 快取：           256K
L3 快取：           3072K
NUMA node0 CPU(s):     0-3

執行原始檔

結果

size of entry : 136 bytes
execution time of append() : 0.085888 sec
execution time of findName() : 0.005778 sec

執行

在執行時必須先建執行檔，此時的 phonebook_orig 就是執行檔

tina@tina-X550VB:~$ make -C phonebook/
make: Entering directory '/home/tina/phonebook'

Git commit hooks are installed successfully.

cc -Wall -std=gnu99 -O0 \
	-DIMPL="\"phonebook_orig.h\"" -o phonebook_orig \
	main.c phonebook_orig.c
cc -Wall -std=gnu99 -O0 \
	-DIMPL="\"phonebook_opt.h\"" -o phonebook_opt \
	main.c phonebook_opt.c

執行時如果執行檔沒在當前目錄下，就會找不到檔案：

tina@tina-X550VB:~$ phonebook/phonebook_orig
cannot open the file

必須利用 cd 進入目錄再執行

tina@tina-X550VB:~$ cd phonebook/
tina@tina-X550VB:~/phonebook$ ls
calculate.c  main.c         phonebook_opt.c  phonebook_orig.c  scripts
dictionary   Makefile       phonebook_opt.h  phonebook_orig.h
LICENSE      phonebook_opt  phonebook_orig   README.md
tina@tina-X550VB:~/phonebook$ ./phonebook_orig

優化原始檔

想法：

想從　miss cache 下手：

L1d 快取：          32K
L1i 快取：          32K
L2 快取：           256K
L3 快取：           3072K

想用 struct 來定義新的型態，struct 就像是一種陣列來存放我們所定義類型的資料：

typedef struct __LAST_NAME_ENTRY{
    char lastName[MAX_LAST_NAME_SIZE];
    entry *detail;
    struct __LAST_NAME_ENTRY *pNext;
} lastNameEntry;

而此時因為我們只需先比較　word.txt　檔中的　lastname，就可以知道有沒有找到，因此把其他多餘資料都拿掉，結果：

 Performance counter stats for './main_optimal' (10 runs):

           501,481      cache-misses              #   54.415 % of all cache refs      ( +-  1.92% )  (63.64%)
           921,580      cache-references                                              ( +-  4.16% )  (68.84%)
         2,339,778      L1-dcache-load-misses                                         ( +-  3.06% )  (70.32%)
           361,228      L1-dcache-store-misses                                        ( +-  1.82% )  (70.50%)
         1,585,106      L1-dcache-prefetch-misses                                     ( +-  4.45% )  (67.48%)
            69,227      L1-icache-load-misses                                         ( +-  5.86% )  (62.09%)

       0.062762091 seconds time elapsed                                          ( +-  7.77% )

cache-misses 54.415 % of all cache refs

比較原始的執行檔：

Performance counter stats for './main_origin' (10 runs):

         3,762,064      cache-misses              #   94.490 % of all cache refs      ( +-  0.99% )  (65.35%)
         3,981,454      cache-references                                              ( +-  0.84% )  (65.65%)
         4,434,223      L1-dcache-load-misses                                         ( +-  0.74% )  (66.41%)
           932,969      L1-dcache-store-misses                                        ( +-  1.68% )  (68.18%)
         2,865,922      L1-dcache-prefetch-misses                                     ( +-  1.81% )  (69.35%)
           115,472      L1-icache-load-misses                                         ( +-  3.43% )  (67.26%)

       0.101997527 seconds time elapsed                                          ( +-  4.20% )

cache-misses 94.490 % of all cache refs

使用 HASH FUNCTION

tina@tina-X550VB:~/embedded-summer2015/phonebook$ ./main_hash
hash table size (prime number) : 42737
size of entry : 24 bytes
   uninvolved is found!
        zyxel is found!
   whiteshank is found!
   odontomous is found!
  pungoteague is found!
   reweighted is found!
 xiphisternal is found!
    yakattalo is found!
execution time of appendHash() : 0.088026
execution time of findNameHash() : 0.000041

Performance counter stats for './main_hash' (10 runs):

           351,564      cache-misses              #   57.129 % of all cache refs      ( +-  0.86% )  (64.90%)
           615,389      cache-references                                              ( +-  0.66% )  (65.77%)
           926,775      L1-dcache-load-misses                                         ( +-  0.52% )  (65.88%)
           664,744      L1-dcache-store-misses                                        ( +-  1.72% )  (69.24%)
            59,462      L1-dcache-prefetch-misses                                     ( +-  4.23% )  (72.27%)
            68,888      L1-icache-load-misses                                         ( +-  3.89% )  (67.91%)

       0.047243179 seconds time elapsed                                          ( +-  2.25% )

重新閱讀作業要求，你還沒回答指定的問題，及早跟上 –jserv

cache-misses 57.129 % of all cache refs

測試在Structure裡只留下Lastname

再參考過上課連結embedded-summer2015裡的作法（以上討論）後，就來做做看：

typedef struct __PHONE_BOOK_ENTRY {
    char lastName[MAX_LAST_NAME_SIZE];
    struct __PHONE_BOOK_ENTRY *pNext;
} entry;

typedef struct __PHONE_BOOK_EXCEPT_LASTNAME {

    char firstName[16];
    char email[16];
    char phone[10];
    char cell[10];
    char addr1[16];
    char addr2[16];
    char city[16];
    char state[2];
    char zip[5];
    struct __PHONE_BOOK_ENTRY *pNext;
} except;

將沒有使用到的資料放進另一個structure

github(https://github.com/tina0405/phonebook/commit/a52b59650bb02e212ad09bdf958e3450913f03ed)

一開始的執行時間
size of entry : 136 bytes
execution time of append() : 0.081552 sec
execution time of findName() : 0.006548 sec


改善後的執行時間
size of entry : 24 bytes
execution time of append() : 0.071361 sec
execution time of findName() : 0.004057 sec

使用Binary tree去測試

建構方法：
想把長度較長或一樣的放置rightchild，長度較短的放置leftchild，如此一來在search時就可以忽略一部分長度不相符的資料。
作法
看了以下的資料後我想先做的是將資料先由小到大排序。

先把資料存進dictionary的矩陣裡，只可惜我還是有改到main.c










    char dictionary[400000][50];   

    int n=0;
    void creat_array(char* line)
    {
        strcpy(dictionary[n++],line);
        printf(line);//for test
    }
    
    //in phonebook_orig.c

首先想到的是以前學過的QuickSort

資料上寫說：
- 在平均情況下，快速排序法執行時間的成長速率卻只有nlogn！
  可見在大多數情況下，快速排序法的效率仍然是相當優秀的。
我的想法：
- 因為我並不理解什麼是大多數情況，所以我想說做完這次，等等用
  其他像BubbleSort等排序方法，看看時間消耗。

程式碼：
在剛剛把資料存進dictionary矩陣後，就開始將資料排序好，我認為即使排序的確是會再多花時間，但在之後插入或搜尋好幾比資料，將會變快很多。




















































void quicksort(int left, int right)
{
    
    int pivot, i, j;
   
    if (left >= right) { return; }

    pivot = (unsigned)strlen(dictionary[left]);

    i = left + 1;
    j = right;

    while (1)
    {
        while (i <= right)
        {
            if ((unsigned)strlen(dictionary[i]) > pivot)
            {
                break;
            }

            i = i + 1;
        }

        while (j > left)
        {
            if ((unsigned)strlen(dictionary[j]) < pivot)
            {
                break;
            }

            j = j - 1;
        }

        if (i > j) { break; }
        swap(&dictionary[i], &dictionary[j]);
    }

    swap(&dictionary[left], &dictionary[j]);
    quicksort(left, j - 1);
    quicksort(j + 1, right);

}

void swap(char *a, char *b)
{
     strcpy(tmp[1],a);
     strcpy(a,b); 
     strcpy(b,tmp[1]);
    
}

不小心在指標和二維的dictionary亂掉，今天花了有點多時間。

先將words.txt削減至4筆測試（依短到長）：

tina@tina-X550VB:~/phonebook$ ./main
size of entry : 136 bytes
abobra
aboe
zyxel
abd

排序後的結果:
abd

aboe

zyxel

abobra

改成跑word.txt排序所花費的時間，感覺花好多時間。

execution time of sort : 24.669081 sec

而且我又多出一個想法，如果依長度排列之外，應該還要依字母排列，這樣不管在找尋或插入才會更有系統。

參考資料Binary tree

Binarytree_1
Binarytree_2
QuickSort_1
QuickSort_2