2017q3 Homework2 (prefix-search)

contributed by <yuan922>

Reviewed by `ZixinYang`

建議修改程式碼的時候, 不要只是寫把 tst_free_all(root) 改成 tst_free(root), 而是去解釋該 function 的實作差異。
記得按照規定去 fork GitHub repository, 並 push 修改後的程式碼。

開發環境

Architecture:          x86_64
CPU 作業模式：    32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
每核心執行緒數：1
每通訊端核心數：4
Socket(s):             1
NUMA 節點：         1
供應商識別號：  GenuineIntel
CPU 家族：          6
型號：              60
Model name:            Intel(R) Core(TM) i5-4590 CPU @ 3.30GHz
製程：              3
CPU MHz：             3390.435
CPU max MHz:           3700.0000
CPU min MHz:           800.0000
BogoMIPS:              6584.89
虛擬：              VT-x
L1d 快取：          32K
L1i 快取：          32K
L2 快取：           256K
L3 快取：           6144K
NUMA node0 CPU(s):     0-3

執行程式

中英文字間請以空白隔開
課程助教

首先嘗試執行 test_cpy.c 檔

choice: f
find word in tree: Taiwan
found Taiwan in 0.000001 sec.

搜尋 Taiwan 正常執行

choice: f
find word in tree: Taipei
Taipei not found.

搜尋 Taipei 卻找不到

用其他指令找看看原因:

choice: s
find words matching prefix (at least 1 char): Taip
Taip - searched prefix in 0.000014 sec
suggest[0] : Taipa,
suggest[1] : Taipalsaari,
suggest[2] : Taipei,
suggest[3] : Taiping,
suggest[4] : Taipu,

用 s 搜尋之後發現不是找不到，而是城市名稱接著逗號再接國家名稱
而城市的名稱會包含中間區隔的逗號，所以要城市名稱加上逗號才能用 f 搜到

choice: f
find word in tree: Taipei,
found Taipei, in 0.000001 sec.

 while ((rtn = fscanf(fp, "%s", word)) != EOF) {
        char *p = word;
        if (!tst_ins_del(&root, &p, INS, CPY)) {
            fprintf(stderr, "error: memory exhausted, tst_insert.\n");
            fclose(fp);
            return 1;
        }
        idx++;
    }

參考 Lihikoka 同學的共筆之後得知，要修正這問題只要把word最後一個字移除就好。

在while裡面加上:

int len = strlen(word);
word[len - 1] = 0;

執行f

choice: f
find word in tree: Taipei
found Taipei in 0.000001 sec.

結果換成Taiwan找不到

choice: f
find word in tree: Taiwan
  Taiwan not found.

參考stackoverflow上面的寫法:

 while ((rtn = fscanf(fp, "%s", word)) != EOF) {
        char *p = word;
	char *w;
	char *r;
	for(w=r=p;*r;r++){
		if(*r!=','){
		   *w++ = *r;
		}
	}
	*w='\0';

直接檢查字串有無含逗號。

choice: f
find word in tree: Taipei
found Taipei in 0.000001 sec.

choice: f
find word in tree: Taiwan
found Taiwan in 0.000001 sec.

執行結果皆正常。

FIXME

先檢查 test_ref.c 檔，看到第一個 FIXME

  if (!tst_ins_del(&root, &p, INS, CPY))

先把 CPY 改成 REF

  if (!tst_ins_del(&root, &p, INS, REF))

但這樣在執行上會有幾個錯誤

指令為 s :

choice: s
find words matching prefix (at least 1 char): New
 New - searched prefix in 0.000222 sec

suggest[0] : New
suggest[1] : New
suggest[2] : New
suggest[3] : New
suggest[4] : New
...

可以發現，都會對應到同樣的字

觀察 test_ref.c

enum { INS, DEL, WRDMAX = 256, STKMAX = 512, LMAX = 1024 };
#define REF INS
#define CPY DEL

enum的賦值順序為0，1，2…（如果沒事先給定）
所以 INS=REF=0 , DEL=CPY=1,
再看到 tst.c

  if (*p++ == 0) {
            if (cpy) { /* allocate storage for 's' */
                const char *eqdata = strdup(*s);
                if (!eqdata)
                    return NULL;
                curr->eqkid = (tst_node *) eqdata;
                return (void *) eqdata;
            } else { /* save pointer to 's' (allocated elsewhere) */
                curr->eqkid = (tst_node *) *s;
                return (void *) *s;
            }
        }
        pcurr = &(curr->eqkid);
    }

將CPY 改成 REF 之後會直接將 s 指派給 curr->eqkid,而s又會對應到word，因此指標會一直指向同一個起始位址，造成錯誤。

參考st9007a同學的作法
使用 strdup() 修正

 char *p = strdup(word);

顯示正常

suggest[30] : Taiwala
suggest[31] : Taiwan
suggest[32] : Taixing
suggest[33] : Taiynsha
suggest[34] : Taiyuan
suggest[35] : Taizhou

指令為 q :

choice: q
*** Error in `./test_ref': free(): invalid pointer: 0x00007ffe767782d0 ***

指標指向無效的位址。

case 'q':
tst_free_all(root);

這裡要更正為

case 'q':
tst_free(root);

因為tst_free是用來釋放外部資料。

執行:

choice: q
yuan@yuan-OptiPlex-7020:~/prefix-search$

2017q3 Homework2 (prefix-search)

Reviewed by ZixinYang

開發環境

執行程式

FIXME

Reviewed by `ZixinYang`