2017q3 Homework2 (prefix-search)

# 2017q3 Homework2 (prefix-search) contributed by < `shanshow` > ### 其餘問題執行 make 時遇到問題 ``` error ‘for’ loop initial declarations are only allowed in c99 mode in ubuntu ``` * 解法為使用export CC=c99，主因在系統並非預設使用c99，通常是某些老舊系統會遇到這個問題 > 在 gcc 指定編譯參數 `-std=c99` 或 `-std=gnu99` (C99 with GNU extension) 即可。不全然是「老舊系統」，是相容性和預設組態議題 > [name="jserv"][color=red] ### tst_suggest 程式碼的隱憂與改進 >請您再重新檢視，並在中英文間以空白隔開 >[name=課程助教][color=red] 根據作業的教學，執行 ``` $ ./test_cpy ``` 會得到 ``` ternary_tree, loaded 259112 words in 0.151270 sec Commands: a add word to the tree f find word in tree s search words matching prefix d delete word from the tree q quit, freeing all data choice: ``` * a：新增文字到樹中 * f：在樹中找尋詞彙 * s：給定前幾個字母，自動搜尋符合前幾個字母的單字 * d：刪除樹中的詞彙根據老師在 facebook 上提供的[zhanyangch共筆](https://hackmd.io/s/Syc8yOhhW)，可以得知在 choice:s 的情況下搜尋"A"會產生錯誤。 ``` choice: s find words matching prefix (at least 1 char): A >Program received signal SIGSEGV, Segmentation fault. 0x0000000000401f8b in tst_suggest (p=0xd72680, c=65 'A', nchr=1, a=0x7fffffffbd50, n=0x7fffffffbd0c, max=1024) at tst.c:330 330 a[(*n)++] = (char *) p->eqkid; ``` * 它會出現在當程式企圖存取 CPU 無法定址的記憶體區段時。當錯誤發生時，硬體會通知作業系統產生了記憶體存取權限衝突的狀況。作業系統通常會產生核心傾印檔案（core dump）以方便程式員進行除錯。通常該錯誤是由於調用一個位址，而該位址為空（NULL）所造成的，例如鏈錶中調用一個未分配位址的空鏈錶單元的元素。陣列存取越界也可能產生這個錯誤。參考資料：[wiki](https://zh.wikipedia.org/wiki/%E8%A8%98%E6%86%B6%E9%AB%94%E5%8D%80%E6%AE%B5%E9%8C%AF%E8%AA%A4) [ubuntu論壇](https://www.ubuntu-tw.org/modules/newbb/viewtopic.php?post_id=56946) * 此錯誤為 segment fault，記憶體區段錯誤，而出錯的部份在 tst_suggest，那就先來找看看 tst_suggest 出現在哪。在tst.h中可找到介紹用法的註解 ``` /** tst_search_prefix() fills ptr array 'a' with words prefixed with 's'. * once the node containing the first prefix matching 's' is found * tst_suggest() is called to traverse the ternary_tree beginning * at the node filling 'a' with pointers to all words that contain * the prefix upto 'max' words updating 'n' with the number of words * in 'a'. a pointer to the first node is returned on success * NULL otherwise. */ ``` :::danger 1. 不要只會列表，你應該「摘要」 2. 同時思考這樣 API 設計的考量 :notes: jserv ::: 在tst.c中則可找到關於這<s>函數</s> 函式的用法 :::info function 不做數學運算用途時，應該翻譯為「函式」 :notes: jserv ::: ``` void tst_suggest(const tst_node *p, const char c, const size_t nchr, char **a, int *n, const int max) { if (!p || *n == max) return; tst_suggest(p->lokid, c, nchr, a, n, max); if (p->key) tst_suggest(p->eqkid, c, nchr, a, n, max); else if (*(((char *) p->eqkid) + nchr - 1) == c) a[(*n)++] = (char *) p->eqkid; tst_suggest(p->hikid, c, nchr, a, n, max); } ``` * 此函數會將符合的字串放入a[]中，設定一個max值，在重複呼叫自己的同時 * n 也會隨之增長，所以判斷式(!p || * n == max)中才會限制當 * n 為max時可觸發 return 。 * 其中(!p || * n == max)為檢查輸入的數值是否合法的重要部分，然而如果* n這樣寫，當* n超過 max 時仍會被判斷成 false ，也就是 * n > max 時系統不會認為這是不合理的，導致之後程式嘗試寫入非法的程式而錯誤。 * 解法：超過 max 部分的數值會造成錯誤的話，就讓它把大於max的數值也一起判斷為 true ，也就是 ``` (!p || *n >= max) ``` 再次執行後，可以發現它搜索了前1024個前頭為A的單詞 ``` suggest[1021] : Akzhal, suggest[1022] : Alà suggest[1023] : Alàs suggest[1024] : Al ``` ### REF的實作參考[st9007a共筆](https://hackmd.io/s/SkeKQEq3Z#) > 應該指出同學的名稱或 GitHub 代稱 > [name="jserv"][color=red] 作業中有提及，可從tst.h中的註解找到參考 /** tst_ins_del() ins/del copy or reference of 's' from ternary search tree. * insert all nodes required for 's' in tree at eqkid node of leaf. if 'del' * is non-zero deletes 's' from tree, otherwise insert 's' at node->eqkid * with node->key set to the nul-chracter after final node in search path. if * 'cpy' is non-zero allocate storage for 's', otherwise save pointer to 's'. * if 's' already exists in tree, increment node->refcnt. (to be used for del). * returns address of 's' in tree on successful insert (or on delete if refcnt * non-zero), NULL on allocation failure on insert, or on successful removal * of 's' from tree. */ * 註解提及 tst_ins_del() ，先行前往 test_ref.c 把此程式碼列出： ```Clike void *tst_ins_del(tst_node **root, char *const *s, const int del, const int CPY); ``` * 要對 CPY 及 REF 兩者進行效能分析，於是要先讓 REF 的方式能夠有效運作。前往 tst.c 找尋 tst_ins_del() ```clike void *tst_ins_del(tst_node **root, char *const *s, const int del, const int cpy) { /*......*/ if (*p++ == 0) { if (cpy) { /* allocate storage for 's' */ const char *eqdata = strdup(*s); if (!eqdata) return NULL; curr->eqkid = (tst_node *) eqdata; return (void *) eqdata; } else { /* save pointer to 's' (allocated elsewhere) */ curr->eqkid = (tst_node *) *s; return (void *) *s; } } /*......*/ } ``` * 可由上面的兩句註解得知， CPY 狀態時會直接指派一個空間給 's' ； REF 狀態時則是讓 's' 存一個指標。因此 REF 狀態時需要預先幫他劃好一個空間，讓它之後能夠用指標指向。 * CPY 狀態時， strdup()便預先指派好一塊記憶體空間放置複製的字串了。 * REF 狀態時，試著預先劃分好一塊記憶體空間，以便讓指標可以指定。參考共筆，在tst_ref.c中先劃好記憶體空間。 ```clike char (*city)[MAX] = malloc(sizeof (*city) * CMAX); ``` * 參考網站的[[C] 透過函式記憶體配置 malloc()](http://lee.logdown.com/posts/98518/c-through-the-function-malloc-memory-configuration)，使用malloc()分配記憶體，之後將整個tst_ref.c改寫成使用預先劃好記憶體空間的部份。在更動途中遇到下列錯誤： ``` Segmentation fault (core dumped) ``` * 原因是在設定常數 CMAX 時，數值定義過小，導致給定的空間不夠用而造成記憶體區段錯誤，目前解法為把數值增大，以後或許應嘗試使用動態增加方式來避免這種錯誤。 ``` choice: q *** Error in `./test_ref': free(): invalid pointer: 0x00007ffc11f6c1d0 *** ``` * 錯誤訊息提示了錯誤的程式為 free() ，於是從 tst.c 裡面找到 free 相關程式，進行分析。 ```clike void tst_free_all(tst_node *p) { if (!p) return; tst_free_all(p->lokid); if (p->key) tst_free_all(p->eqkid); tst_free_all(p->hikid); if (!p->key) free(p->eqkid); free(p); } /** free the ternary search tree rooted at p, data storage external. */ void tst_free(tst_node *p) { if (!p) return; tst_free(p->lokid); if (p->key) tst_free(p->eqkid); tst_free(p->hikid); free(p); } ``` * tst_free_all 多了幾行程式碼，而那個程式碼會把儲存字串的 tst_node 也一齊 free ，造成 free() 會使用到不合法的指標，導致錯誤。若更改成 tst_free ，便可解決問題。 ``` *** Error in `./test_ref': free(): invalid pointer: 0x00007f05d09b4810 *** ``` * 而 delete 功能也產生了錯誤，同樣找到刪除用的程式碼： ``` static void *tst_del_word(tst_node **root, tst_node *node, tst_stack *stk, const int freeword) { tst_node *victim = node; /* begin deletion w/victim */ tst_node *parent = tst_stack_pop(stk); /* parent to victim */ if (!victim->refcnt) { /* if last occurrence */ if (!victim->key && freeword) /* check key is nul */ free(victim->eqkid); /* free string (data) */ /*......*/ ``` * 此程式最後一段的程式碼，與造成 tst_free_all() 錯誤的程式碼相近。 ``` if (!victim->key && freeword) /* check key is nul */ free(victim->eqkid); /* free string (data) */ ``` VS ``` if (!p->key) free(p->eqkid); ``` * 註解寫明，如果最後的字串出現，會依照上列判斷式決定是否清除字串， CPY 模式下可以照常運作但是在 REF 狀態下會產生錯誤。 tst_del_word(root, curr, &stk, 1); * 因此，在 tst.c 中，為了避免錯誤而又不讓 CPY 模式下的環境出現變化，將常數1的位置上填入 CPY ，如此一來 CPY 模式下能夠判斷為 TRUE ，而 REF 模式下能夠判斷為 FALSE 從而避免錯誤。 ### 效能測試：參考[zhanyangch同學之共筆](https://hackmd.io/s/Syc8yOhhW#) 在 Makefile 內放入 bench ``` bench: echo 3 | sudo tee /proc/sys/vm/drop_caches perf stat --repeat 150 \ -e cache-misses,cache-references,instructions,cycles \ sudo ./test_cpy --bench echo 3 | sudo tee /proc/sys/vm/drop_caches perf stat --repeat 150 \ -e cache-misses,cache-references,instructions,cycles \ sudo ./test_ref --bench ``` * echo 3 > /proc/sys/vm/drop_caches 為釋放所有 cache 記憶體空間，而 tee 指令則是讓資料既可儲存亦可顯示在螢幕上 :::danger 「釋放所有 cache 記憶體空間」這段描述是錯誤的，請找 Linux 核心文件並修正 :notes: jserv ::: * chrt 可定義優先級， taskset 可以把特定的核心交給特定程式使用。 * Perf 全名是 Performance Event，是在 Linux 2.6.31 以後內建的系統效能分析工具 ```Clike double bench(const tst_node *root, char **a, int *n, const int max) { double t1, t2, ttime; char word[]="A"; ttime = 0; for (int i=0 ; i<25 ; i++) { t1 = tvgetf(); tst_search_prefix(root, word, a, n, max); t2 = tvgetf(); word[0]++; ttime += t2 - t1; } word[0] = 'a'; for (int i=0 ; i<25 ; i++) { t1 = tvgetf(); tst_search_prefix(root, word, a, n, max); t2 = tvgetf(); word[0]++; ttime += t2 - t1; } return ttime; } ``` * 從大寫 A 開始，搜尋到大寫 Z ，再接著從 a 到 z 搜索，藉此來測試兩者跑完 bench程式後的效率如何。 * Makefile 裡面這測試各要跑150次。 ``` Performance counter stats for 'sudo ./test_cpy --bench' (150 runs): 2,025,944 cache-misses # 56.530 % of all cache refs ( +- 0.13% ) 3,583,866 cache-references ( +- 0.11% ) 544,523,831 instructions # 0.90 insn per cycle ( +- 0.01% ) 604,969,702 cycles ( +- 0.15% ) 0.191043196 seconds time elapsed ( +- 1.71% ) Performance counter stats for 'sudo ./test_ref --bench' (150 runs): 3,517,770 cache-misses # 65.822 % of all cache refs ( +- 0.33% ) 5,344,407 cache-references ( +- 0.24% ) 525,117,217 instructions # 0.76 insn per cycle ( +- 0.01% ) 690,790,048 cycles ( +- 0.51% ) 0.218351956 seconds time elapsed ( +- 1.50% ) ``` * build 速度 ![](https://imgur.com/T8wCt1F.png) * search 速度 ![](https://i.imgur.com/IewO0tS.png) 原因： * REF 在TST建立時，使用的記憶體空間早已被指派，所以是固定的； CPY 則是 TST 建立後根據搜尋後開始指派，可根據搜尋的字數長短來改變大小。 * REF 要將資料存放在記憶體的不同處，需耗費額外計算。 CPY 則可以將資料的存放位置相鄰從而避免額外計算。 :::danger 1. 上述是實作層面的「限制」，而非終極設計「目標」 2. 應該提出改進效能的構想和手法 :notes: jserv ::: 參考資料： [跟我一起写Makefile:MakeFile介绍](http://wiki.ubuntu.org.cn/index.php?title=%E8%B7%9F%E6%88%91%E4%B8%80%E8%B5%B7%E5%86%99Makefile:MakeFile%E4%BB%8B%E7%BB%8D&variant=zh-hant) [釋放Linux記憶體](http://huenlil.pixnet.net/blog/post/26822270-%5B%E8%BD%89%5D-%E9%87%8B%E6%94%BElinux%E8%A8%98%E6%86%B6%E9%AB%94) [linux 指令 - tee](http://ccd9527.blogspot.tw/2009/05/linux-tee.html) [在 Linux 中以特定的 CPU 核心執行程式](https://blog.gtwang.org/linux/run-program-process-specific-cpu-cores-linux/) [Linux的进程优先级](http://liwei.life/2016/04/07/linux%E7%9A%84%E8%BF%9B%E7%A8%8B%E4%BC%98%E5%85%88%E7%BA%A7/) [Linux 效能分析工具: Perf](http://wiki.csie.ncku.edu.tw/embedded/perf-tutorial)

Syntax	Example	Reference
# Header	Header	基本排版
- Unordered List	Unordered List
1. Ordered List	Ordered List
- [ ] Todo List	Todo List
> Blockquote	Blockquote
Bold font	Bold font
Italics font	Italics font
~~Strikethrough~~	~~Strikethrough~~
19^th^	19^th
H~2~O	H₂O
++Inserted text++	Inserted text
==Marked text==	Marked text
[link text](https:// "title")	Link
![image alt](https:// "title")	Image
`Code`	`Code`	在筆記中貼入程式碼
```javascript var i = 0; ```	`var i = 0;`
:smile:		Emoji list
{%youtube youtube_id %}	Externals
$L^aT_eX$	L^aT_eX
:::info This is a alert area. :::	This is a alert area.