2016q3 Homework1 (phonebook)

contributed by <ruby0109>

Reviewed by `SarahYuHanCheng`

使用相同的commit可能造成日後專案維護不易

好的！！謝謝～
施雨妏

作業目標

要有所成長
分析Hash table
降低append
- memory pool
- ~~pthread~~

先前微微進度簡介ˊˋ

Phonebook
Gnuplot：
- 打上＄gnuplot 即可啟動
- 簡易指令如下：
set title 'my plot' ＠設定圖片名稱
set xlabel 'x axis' ＠設定XY軸座標名稱
set ylabel 'y axis'
set terminal png ＠設定輸出格式為 .png
set output 'output_plot.png' ＠設定輸出檔名
plot [1:10][0:1] sin(x) ＠畫出 sin(x) 函式，x軸座標範圍1 ~ 10, y軸座標範圍0 ~ 1
- 理解runtime
  - reset //清除之前的設定
  - set ylabel 'time(sec)'
  - set style fill solid
  - set title 'perfomance comparison'
  - set term png enhanced font 'Verdana,10'
  - set output 'runtime.png'
  - plot [:][:0.100]'output.txt' using 2:xtic(1) with histogram title 'original',
    這行代表original 是用第二個column的資料, X的label是取自第1個column
  - '' using ($0-0.06):($2+0.001):2 with labels title ' ',
    這行是控制在圖上的資料顯示, $0 用在只有一維的資料, 做x軸; $2是第二個column的資料位置; 2則是第二組的資料（放在圖上的資料內容） 0.06和0.001都是標籤位移量
  - '' using 3:xtic(1) with histogram title 'optimized' ,
    optimized 是用第三個column的資料, X的label是取自第1個column
  - '' using ($0+0.3):($3+0.0015):3 with labels title ' '
- 補充：
  - plot 'file', plot 'file' 1:2 和plot 'file' ($1):($2)有什麼不同呢？
    如果其中有幾列的column數比較少（有幾格有缺）, 那第一種會自己補一個值, 第二種會忽略那一列, 第三種則是會把遺失的data空下來
    如果每列有一些text, 1會有error, 第二種和第三種則是忽略
縮小struct:
- 原本：
- 改成只有lastname:







/*Optimal 1*/
//shrink the struct to lastname
typedef struct __LAST_NAME__ENTRY {
    char lastName[MAX_LAST_NAME_SIZE];
    detail *data;//data pointer to detail struct
    struct __LAST_NAME__ENTRY *pNext;// the address of next point value
} entry;

結果:
加上Hash table方法
- table大小和function的採用都是trial&error
- 結果：(BKDRhash)

改善

Github筆記

新手上路, 有錯請指正

記得建SSH key, 設定權限。方法：Generating a new SSH key and adding it to the ssh-agent
Github四種狀態
圖片來源：Git 教學
- 先用git status看放在staging area準備commit的檔案（本端repository）
- changes not staged for commit和untracked files都是沒有要commit的
- 想要新增要commit的檔案用： git add <File>, 如果想要新增全部檔案：git commit -A 或 git commit –all
- 再用一次git status>>changes to commit有綠色的檔案, 就是準備上傳的
- 接下來用git commit -m "title" -m "message"就可以放到本端的respository
- git push就可以送到遠端（web上）參考Git 教學研究站
- 如果上傳了想刪除：參考How can I delete a file from git repo?
  - git rm file1.txt
    git commit -m "remove file1.txt"

除掉warning

原本compile有waring:

  In function 'findname': 
warning : assignment from incompatible pointer type 
  HashHead[i] = HashHead[i]->pNext;



  In function 'HashFunction': 
  warning : suggest parentheses around assignment used as truth value 
  while((c=*str++))

「程式輸出的文字訊息」，請勿使用圖片表示，請改由文字貼上課程助教

收到ruby

assignment from incompatible pointer type：仔細看才發現是我在struct中的pointer宣告出錯, struct __LAST_NAME__ENTRY少一條底線, 之前居然可以執行。有點好奇編譯器是怎麼處理的？
suggest parentheses around assignment used as truth value, 因為要判斷裡面的真假值, 所以改成while((c = *str++))

分析Hash Table

之前的作業對Hash function的選擇和bukkets數沒有用數據下去做分析, 這次先把這部份補齊。
- 參考rnic大大的方式, 我用slot下去看目前的Hash function 分佈狀況：
  
  好悲劇的圖阿…不過這也讓我發現沒有做分析, 根本不了解自己的程式是怎麼跑的
- 做圖表看不同的table size出來效能如何
  　　原本Hash的分佈很不均勻, 所以想要看效能隨著hash table size不同是怎麼變化的, 再取一個適當值。我把main.c改成phonebook.c（function也改成phonebook）, 另外寫一個main把值輸給phonebook嘗試做測試, 可是卡在不知道怎麼寫, 才可以正確的宣告Hash table size。

降低append

memory pool: 用一次分配記憶體再一次歸還的方式增進效能。

參考implement own memory pool
先只考慮放last name structure的話, 用到的memory以word.txt約是350000*32bytes: 11200000。~~一次alloc這麼大的空間會不會影響cach miss呢？ >> 先試試看~~
我的想法是:
memory pool既然是先alloc一個空間, 也就是有一長串的位址, 那我每次要用的時候就用已知的需要大小, 跳到下一個address, 再用另一個變數紀錄剩下的空間來避免超過alloc的大小。不過跟上面的問題一樣，我不知道有沒有方式可以有個變數讓所有file共用（還是這樣的寫法不好？）,所以把變數寫在main.c裡, 有點凌亂。

在h file的宣告：






/* Memory Pool*/                                                               
typedef struct __POOL {                                                        
    char lastName[MAX_LAST_NAME_SIZE];                                         
    detail *data;//data pointer to detail struct                               
    struct __POOL *pNext;// the address of next point                          
} entry;

在c file的函式：

















entry *pool_create( size_t size ){

    entry *p = (entry*)malloc((size*sizeof(entry)));
    return p;
}
/*  return the memory space at once */
void pool_destroy( entry *p){
    free(p);
}

/*  each time alloc a entry size memory */
entry *palloc (entry *p){

    return (++p);

}

結果：

Performance counter stats for './phonebook_memory' (100 runs):

       373,575      cache-misses              #   33.211 % of all cache refs      ( +-  0.54% )
     1,124,861      cache-references                                              ( +-  0.73% )
   170,807,651      instructions              #    1.96  insns per cycle          ( +-  0.03% )
    87,203,719      cycles                                                        ( +-  0.10% )

   0.041390273 seconds time elapsed                                          ( +-  0.12% )



對比Hash funciton:

Performance counter stats for './phonebook_hash' (100 runs):

       488,578      cache-misses              #   31.665 % of all cache refs      ( +-  0.07% )
     1,542,935      cache-references                                              ( +-  0.54% )
   243,589,372      instructions              #    2.18  insns per cycle          ( +-  0.02% )
   111,539,167      cycles                                                        ( +-  0.08% )

   0.052506768 seconds time elapsed                                          ( +-  0.13% )


cache miss上升了一些, 不知道為什麼

效能：

「程式輸出的文字訊息」請勿使用圖片表示，請改由文字貼上課程助教

好開心ˊˇˋ, 比之前縮小struct還快一些～

Pthread

改寫Hash table的structure
- 原本笨笨的用兩個大大的structure來放HashHead和HashHead之後的linked list。



































entry *HashHead[HASH_SIZE], *Hashe[HASH_SIZE];
entry *findName(char lastname[], entry *pHead)
{
    int i;
    i = HashFunction(lastname);                                                

    while (HashHead[i] != NULL) {                                              
        if (strcasecmp(lastname, HashHead[i]->lastName) == 0) {                
            return HashHead[i];
        }
        HashHead[i] = HashHead[i]->pNext;
    }
    return NULL;                                                               
}

/* allocate memory for the new entry and put lastName*/                        
entry *append(char lastName[], entry *e)                                       
{
    e = palloc(e);
    int i;                                                                     
    i = HashFunction(lastName);                                                
    strcpy(e->lastName, lastName);

    if(HashHead[i] == NULL) {
        HashHead[i] = e; 
        Hashe[i]=HashHead[i];                                                  
        Hashe[i]->pNext=NULL;                                                  
    } else {                                                                   
        Hashe[i]->pNext = e;                                                   
        Hashe[i]=Hashe[i]->pNext;                                              
        Hashe[i]->pNext=NULL;                                                  
    }
    return e;
}

參考TempoJiJi的寫法之後, 發現在創一個新的Node的時候, 只要把新Node的pNext放入Hash Head的位址, 再把新Node的位址放進Hash Head中就可以不用像之前一樣判斷Head是否存在或陣列的index,方便做pthread。

除此之外, TempoJiJi在struct用pointer to pointer宣告, 之後再用tail[]做array放pointer的寫法讓我對pointer和array又了解了一點。













typedef struct __HASH_TABLE { 
    entry **tail;                               
}HashTable; 
HashTable *hash_ptr;

void Initial_HashTable(void){ 
    hash_ptr = (HashTable* )malloc(sizeof(HashTable)); 
	hash_ptr->tail = malloc(HASH_SIZE*sizeof(entry*)); 
	for(i=0;i<HASH_SIZE;i++){
	    hash_ptr->tail[i]=NULL;
	}
}

~~不過因為用另一個定義在h file裡的pointer來取代原本的phead和e,我不知道__builtin___clear_cache裡面的參數該放什麼。~~
原本的：



#if defined(__GNUC__)
    __builtin___clear_cache((char *) pHead, (char *) pHead + sizeof(entry));
#endif

在6.25 gcc

— Built-in Function: void __builtin___clear_cache (char *begin, char *end)
	This function is used to flush the processor's instruction cache 
	for the region of memory between begin inclusive and end exclusive. 
	Some targets require that the instruction cache be flushed, 
	after modifying memory containing code, in order to obtain deterministic behavior.

	If the target does not require instruction cache flushes, 
	__builtin___clear_cache has no effect. 
	Otherwise either instructions are emitted in-line to clear the instruction cache 
	or a call to the __clear_cache function in libgcc is made.

~~看不太懂…~~
這行是在清除之前在cache中pHead被分配的空間, 如果我使用hash_ptr, 不知道是否可以省略

把新改好的Hash 和原本的memory pool結合, 不知道為什麼cache miss大量增加
Performance counter stats for './phonebook_memory' (100 runs):

       374,597      cache-misses              #   43.750 % of all cache refs      ( +-  0.76% )
       856,231      cache-references                                              ( +-  0.53% )
   169,958,884      instructions              #    2.09  insns per cycle          ( +-  0.04% )
    81,406,863      cycles                                                        ( +-  0.14% )

   0.038805268 seconds time elapsed                                          ( +-  0.18% )

加上pthread之後：

Performance counter stats for './phonebook_thread' (100 runs):

     1,280,729      cache-misses              #    4.501 % of all cache refs      ( +-  0.56% )
    28,456,255      cache-references                                              ( +-  0.69% )
   414,598,663      instructions              #    0.54  insns per cycle          ( +-  0.34% )
   771,199,411      cycles                                                        ( +-  0.54% )

   0.116110237 seconds time elapsed                                          ( +-  1.47% )

似乎有地方出錯…

甚至連Hash的版本cache也升高了
Performance counter stats for './phonebook_hash' (100 runs):

       479,087      cache-misses              #   64.595 % of all cache refs      ( +-  0.09% )
       741,675      cache-references                                              ( +-  0.74% )
   241,084,858      instructions              #    2.22  insns per cycle          ( +-  0.02% )
   108,662,675      cycles                                                        ( +-  0.11% )

   0.051187177 seconds time elapsed                                          ( +-  0.15% )

看了TempoJiJi分享的網頁才知道threa在I/O發揮的效果不大, 雖然還是不太明白為什麼這樣append就會升高那麼多。

ˊˋ在debug很久很久之後發現原本的code在函式的地方沒有把指標放進去，不過cache miss還是蠻高的。試試看mmap()

我的code



































 #if defined(MMAP)
    int j, fd,pagesize;
    char *data;
    struct stat file_status;
    unsigned long int bytes_read,bytes_left, to_read;
    bytes_read=0; bytes_left=0; to_read=0;

    /* create a memory pool and initialize hash table*/
    hash_ptr = Initial();

    /* open the file*/
    fd = open(DICT_FILE,O_RDONLY);
    /* check the size of the file*/
    fstat(fd, &file_status);
    /* check the size of page for mmap to asign virtual address space*/
    pagesize = getpagesize();
    bytes_left = file_status.st_size;

/* clean cache*/
    #if defined(__GNUC__)                                                      
        __builtin___clear_cache((char *) hash_ptr, (char *) hash_ptr + HASH_SIZE*sizeof(HashTable));
    #endif

    /* if remain data bigger that pagesize, then just give pagesize data to mmap*/
    clock_gettime(CLOCK_REALTIME, &start);
    while(bytes_left > 0){
        if (bytes_left > pagesize){
            to_read = pagesize;
        }else{
            to_read = bytes_left;
        }
        /* mmap allocate virtual memory for data*/
        data = mmap((caddr_t)0, to_read, PROT_READ, MAP_SHARED, fd, bytes_read);

結果：

有稍微再降了一些些～

嘗試用pthread做:
code:











void *pmmap(void* ptr){


    Pmmap = (mmap_arg* )ptr;

        /* mmap allocate virtual memory for data*/
    Pmmap->data = mmap((caddr_t)0, Pmmap->to_read, PROT_READ, MAP_SHARED, Pmmap->fd, Pmmap->bytes_read);

    pthread_exit(NULL);
}

結果：
8 threads:

2 threads:

來不及做跟字串相關的實作, debug往往花太長時間

需不需要free在function裡的node?

實際運作

在輸入的data沒有很看得懂老師的提示：
```
  本例選用的 dataset (也就是輸入電話簿的姓名) 是否存在問題？
  有代表性嗎？跟真實英文姓氏差異大嗎？
  資料難道「數大就是美」嗎？
  如果我們要建立符合真實電話簿情境的輸入，該怎麼作？
```
先找了正常last name的資料：Free First-name and Last-name Databases (CSV and SQL), 我覺得跟之前的dataset做分析比起來, 在真實使用上的lastname會比較少, 可是可能有重複而要比對其他資料的情況。

C: How to free nodes in the linked list?

參考資料

rnic 大大的phonebook
implement own memory pool
Git hub教學參考：
- https://kingofamani.gitbooks.io/git-teach/content/chapter_2/repo.html
- https://zlargon.gitbooks.io/git-tutorial/content/
struct: Difference between -> and . in a struct?