2017q3 Homework2 (prefix-search)(uncomplete)

contributed by < kevin55216 >

Reviewed by `kevin550029`

可以更詳細的解釋個功能, 用自己的理解表達
開發日誌部分可以提供更多程式碼來做說明
嘗試說明結果部分, 並用圖表表達

功能觀察

a add word to the tree
加入文字到建立的樹中（不會寫入cities.txt）
f find word in tree
從樹中尋找文字（需要完全吻合）
s search words matching prefix
從樹中尋找文字（只需要吻合前綴）
d delete word from the tree
將文字從建立的樹中刪除（不會寫入cities.txt）若無此文字，則插入進樹中
q quit, freeing all data
離開

功能圖解

ternary search tree

          c
        / | \
       a  u  h
       |  |  | \
       t  t  e  u
     /  / |   / |
    s  p  e  i  s
    
    a ac  c hi  u
    s tu  u e   s
       p  t
          e
copy from wikipedia/Ternary_search_tree

delete word

delete "cup"
          c          stack           
        / | \       |__t__|
       a  u  h      |__u__|        
       |  |  | \    |__c__|    
       t  t  e  u                  
     /  / |   / |                    
    s  p  e  i  s                       
---------------------------------------------------------------------
node p is victim
node t is parrent
parent lokid = NULL, free victim, victim = NULL, return NULL
---------------------------------------------------------------------
          c                          
        / | \               
       a  u  h                     
       |  |  | \               
       t  t  e  u                  
     /    |   / |                    
    s     e  i  s
---------------------------------------------------------------------
delete "us" 
stack
|__u__|
|__h__|
|__c__|
---------------------------------------------------------------------
node s is victim, node u is parrent
parent eqkid = NULL, free victim, victim=parent, parent = node h (pop stack)
parent hikid (u) = victim lokid (i), free victim, victim = NULL
---------------------------------------------------------------------
          c                          
        / | \               
       a  u  h                     
       |  |  | \               
       t  t  e  i                  
     /    |                        
    s     e

insert

insert "cup"
          c                      
        / | \       
       a  u  h              
       |  |  | \        
       t  t  e  u                  
     /  / |   / |                    
    s  p  e  i  s                       
---------------------------------------------------------------------
p node->refcnt ++
so refcnt is a count that count how many "cup" word in tree.
---------------------------------------------------------------------
insert "tea"
t > c
t > h
t < u
t > i
---------------------------------------------------------------------
          c                      
        / | \       
       a  u  h              
       |  |  | \        
       t  t  e  u                  
     /  / |   / |                    
    s  p  e  i  s
              \
               t
               |
               e
               |
               a

---------------------------------------------------------------------
t e a node = 1

開發日誌

將 CPY 換成 REF ，發現差異在 tst.c 的 tst_ins_del()中
有 CPY 的情形，會做 const char *eqdata = strdup(*s)
strdup() 會先幫 *eqdata 做 malloc ，後在複製字串進 *eqdata 中
malloc fword 一段很大的連續記憶體空間，使每一個 node p 皆在 fword 上
問題點： free 的時候會 core dumped
參考 HMKRL 將 tst_del_word 中 1 改成 cpy , 按 q 仍會 dump 掉, d 可以使用了。
q 會 dump 掉原因， tst_free_all() 中 free(p->eqkid) 會導致錯誤
在 tst_free_all() 中加入可以控制是否執行 free(p->eqkid) 解決問題
加入 bench 功能，會將已經隨機取樣1000筆測試資料，進行 find 並將結果輸出成txt檔















/*取得tst_search cycle*/
while ((rtn = fscanf(bfp, "%s", word)) != EOF) {
    char *testbench = word;
    rmcrlf(testbench);
    get_cycles(&timec_high1, &timec_low1);
    res = tst_search(root, testbench);
    get_cycles_end(&timec_high2, &timec_low2);
    result[j] = diff_in_cycles(timec_high1, timec_low1, timec_high2, timec_low2);
    if (res)
        fprintf(brfp, "%"PRIu64"\n", result[j]);
    else
        fprintf(brfp,"%s not found.\n", testbench);
    j++;
}














/*計算平均、標準差*/
    double resultmid = 0.0;
    for(int i = 0; i < 1000; i++) {
        resultmid += (double)result[i];
    }
    resultmid /= 1000.0;
    double resultsd = 0.0;
    for(int i = 0; i < 1000; i++) {
        resultsd += (((double)result[i]-resultmid)*((double)result[i]-resultmid));
    }
    resultsd /= (1000.0-1.0);
    resultsd = sqrt(resultsd);
    printf("mid: %.9f\n", resultmid);
    printf("SD : %.9f\n", resultsd);

並print出平均與標準差。

請列出程式碼
課程助教

結果

單位為cycles, mid 為平均值, SD 為標準差
$ ./test_ref --bench
benching...
bench complete
mid: 889.605000000
SD : 772.879254034
$ ./test_cpy --bench
benching...
bench complete
mid: 1016.366000000
SD : 1758.118714267

2017q3 Homework2 (prefix-search)(uncomplete)

Reviewed by kevin550029

功能觀察

功能圖解

開發日誌

結果

Reviewed by `kevin550029`