# 2017q3 Homework2 (prefix-search)
contributed by <`ChiuYiTang`>
### Reviewed by `HTYISABUG`
* 正如老師所說, 效能分析用圖來顯示更簡單明瞭
* 雖然比較無關緊要, `lscpu` 的結果用 code block 包起來吧, 用列表的方式讓人看不清楚也不想看啊
[GitHub](https://github.com/ChiuYiTang/prefix-search)
## 開發環境
```
Ubuntu 16.04.5
Linux 4.4.0-96-generic
gcc version 5.4.0 20160609
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 8192K
```
`$ lscpu`
```
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 94
Model name: Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz
Stepping: 3
CPU MHz: 1261.187
CPU max MHz: 4000.0000
CPU min MHz: 800.0000
BogoMIPS: 6816.00
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 8192K
NUMA node0 CPU(s): 0-7
```
> 列出處理器的型號
> [name="jserv"][color=red]
> 已修正
> [name="ChiuYiTang"][color=blue]
## TST 資料架構
* TST 為一種 Trie 資料結構,一般用來保存相關連的矩陣,常見於搜尋提示。
* 不同於 BST 僅有兩個子樹,TST 具備三個子樹架構,又稱為 prefix tree。
* 可藉此儲存大量字串,每個節點由一個字符構成,並包含分別指向三子樹的指標。分別為 equal kid, lower kid and higher kid,以及一個實現物件。
* 常用於 package routing 以及 Prefix matching (e.g. Google search)。
### 操作步驟 - 插入儲存
先加入cat 再加入apple 再加入cow
(a比c小,所以接左邊) (c相等,往下o大於a,所以接右邊)
c c c
| => / | => / |
a a a a a
| | | | | \
t p t p t o
| | |
p p w
| |
l l
| |
e e
### 應用場景
#### Autocompletion
* 輸入一個不完整字串,透過 prefix search 可以給出自動完成剩餘字串建議。
```
# Complete strings: States and regions that begin with "A" and "Al"
completeWord(US.CanadaTree, "A")
#> [1] "Alaska" "Alabama" "Alberta" "Arkansas" "Arizona"
completeWord(US.CanadaTree, "Al")
#> [1] "Alaska" "Alabama" "Alberta"
```
#### Spell checking
* 透過窮舉搜索方式,搜尋 TST 裡最接近(minimize Edit distance)輸入字串的元素。
## 修改 Makefile
* 參考大神們 [ryanwang522](https://github.com/ryanwang522/phonebook/blob/master/Makefile) 以及 [st9007a](https://hackmd.io/s/SkeKQEq3Z) ,修改 makefile,加入`$ make bench` :
```clike=
BIN = \
test_cpy \
test_ref
...
bench: $(BIN)
@for test in $(BIN); do\
perf stat --repeat 100 \
-e cache-misses,cache-references,instructions,cycles \
./$$test --bench;\
done
```
- [ ] 加入 bench 測試檔
## 修正 test_ref.c
* 透過 macro 觀察到兩個版本 `REF` 以及 `CPY`。
* 針對 FIXME 幾個部份做修正:
```clik=
while ((rtn = fscanf(fp, "%s", word)) != EOF) {
char *p = word;
/* FIXME: insert reference to each string */
if (!tst_ins_del(&root, &p, INS, CPY)) {
fprintf(stderr, "error: memory exhausted, tst_insert.\n");
fclose(fp);
return 1;
}
idx++;
}
t2 = t;
```
* 將`tst_ins_del(&root, &p, INS, CPY)`修正為`tst_ins_del(&root, &p, INS, REF)`後執行。
* 執行後發現以下幾點錯誤:
### tst_free_all
* 執行 Quit 會出現錯誤:
* 用 gdb 觀察:
```
choice: q
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7a91532 in __GI___libc_free (mem=0x7ffff70d1eea) at malloc.c:2967
2967 malloc.c: No such file or directory.
```
* 發現為 malloc 到不存在的地方,造成 [double free](https://www.owasp.org/index.php/Double_Free) 現象。
:::danger
這現象叫做 [double free](https://www.owasp.org/index.php/Double_Free),請調整共筆和 Git commit messages 用語
:notes: jserv
:::
* 前往 test_ref.c **151:Quit** 部份,並結合 tst.c 裡 tst_free_all 以及 tst_free Reference,兩者分別為` data storage internal`以及` data storage external`。
* 猜測分別為`test_cpy.c`以及`test_ref.c`所使用。
* 更正後:
```clik=
case 'q':
tst_free(root);
return 0;
break;
```
* 再次執行,Quit 錯誤消失。
### Error Search
* 執行 prefix search 發現
```
suggest[1013] : T
suggest[1014] : T
suggest[1015] : T
suggest[1016] : T
suggest[1017] : T
suggest[1018] : T
suggest[1019] : T
suggest[1020] : T
suggest[1021] : T
suggest[1022] : T
suggest[1023] : T
```
* Search prefix 可發現所有節點都變成輸入值。
* `test_cpy.c`以及`test_ref.c`兩者差異或許為使用到 Macro `REF` & `CPY`的部份。
```clike= if (*p++ == 0) {
void *tst_ins_del(tst_node **root, char *const *s, const int del, const int cpy)
{
...
for (;;) {
...
if (*p++ == 0) {
if (cpy) { /* allocate storage for 's' */
const char *eqdata = strdup(*s);
if (!eqdata)
return NULL;
curr->eqkid = (tst_node *) eqdata;
return (void *) eqdata;
} else { /* save pointer to 's' (allocated elsewhere) */
curr->eqkid = (tst_node *) *s;
return (void *) *s;
}
}
pcurr = &(curr->eqkid);
}
}
```
* 觀察發現`s`為指向`p`的指標,`p`為指向`word`的指標,`*s`為指向`word`的指標。
* 使用`cpy`時,將`*s`指向的 string 直接 malloc 一個空間,copy 到`eqdata`,因此每次皆可順利插入資料進 TST。
* 然使用`ref`時,直接將指向`word`的指標`*s`存入 TST。
* 再任何後續輸入都會修改`word`的值,進而使得整個 TST 的節點都指向以`word`為起點的連續記憶體位址。
* 因此需要透過每次動態配置新的記憶體位址,避免`*s`存取到`word`位址。
* 最簡單的方式即在`char *p = word;`前後,分配新的記憶體空間,或直接透過`strdup()`將`word`複製給`p`。
```
choice: s
find words matching prefix (at least 1 char): Taiwa
Taiwa - searched prefix in 0.000002 sec
suggest[0] : Taiwala,
suggest[1] : Taiwan
```
* 然而頻繁地動態分配,容易造成記憶體破碎,亦增加 Cache miss 的機會。
* 因此透過 phonebook 作業所使用之 memory pool 的方法,事先分配足夠記憶體位置來解決。
#### Memory pool
* 事前給定一塊巨大記憶體,並透過指標操作分配記憶體空間。
* 透過此種方式能避免記憶體破碎,並避免配置與釋放記憶體所需時間。
```clike=
/* Memory pool size */
#define MemoryPoolSize 10000000
...
int main(int argc, char **argv)
{
...
char *pptr = (char *) malloc(MemoryPoolSize * sizeof(char)); /* Memory pool */
char *pTop = pptr; /* assign a pointer to top of memory pool */
...
/* Use memory pool top pointer to insert reference to each string */
while ((rtn = fscanf(fp, "%s", pTop)) != EOF) {
char *p = pTop;
if (!tst_ins_del(&root, &p, INS, REF)) {
fprintf(stderr, "error: memory exhausted, tst_insert.\n");
fclose(fp);
return 1;
}
idx++;
/* If memory exhausted, pTop++ for next memory */
pTop += (strlen(pTop) + 1);
}
...
for (;;) {
...
switch (*word) {
char *p = NULL;
case 'a':
printf("enter word to add: ");
if (!fgets(pTop, sizeof word, stdin)) {
fprintf(stderr, "error: insufficient input.\n");
break;
}
rmcrlf(pTop);
p = pTop;
t1 = tvgetf();
/* FIXME: insert reference to each string */
res = tst_ins_del(&root, &p, INS, REF);
t2 = tvgetf();
if (res) {
idx++;
pTop += (strlen(pTop) + 1);
printf(" %s - inserted in %.6f sec. (%d words in tree)\n",
(char *) res, t2 - t1, idx);
} else
printf(" %s - already exists in list.\n", (char *) res);
break;
...
case 'd':
printf("enter word to del: ");
if (!fgets(pTop, sizeof word, stdin)) {
fprintf(stderr, "error: insufficient input.\n");
break;
}
rmcrlf(pTop);
p = pTop;
printf(" deleting %s\n", pTop);
t1 = tvgetf();
/* FIXME: remove reference to each string */
res = tst_ins_del(&root, &p, DEL, REF);
t2 = tvgetf();
if (res)
printf(" delete failed.\n");
else {
printf(" deleted %s in %.6f sec\n", word, t2 - t1);
idx--;
pTop -= (strlen(pTop) + 1);
}
break;
...
case 'q':
tst_free(root);
free(pptr);
...
...
}
}
return 0;
}
```
* 再次執行後,可以順利 Search。
```
...
suggest[962] : Tayuman,
suggest[963] : Taywanak
suggest[964] : Tayzhina,
suggest[965] : Taza,
suggest[966] : Tazacorte,
suggest[967] : Tazewell,
suggest[968] : Tazlău,
suggest[969] : Tazoult-Lambese,
suggest[970] : Tazovskiy,
```
### Search word 'A'
* Search Prefix : A,出現 Segmentation fault。
* 用 gdb 確認錯誤訊息
```
choice: s
find words matching prefix (at least 1 char): A
Program received signal SIGSEGV, Segmentation fault.
0x0000000000401e3a in tst_suggest (p=0x18985e0, c=65 'A', nchr=1, a=0x7fffffffba80, n=0x7fffffffba30, max=1024) at tst.c:331
331 a[(*n)++] = (char *) p->eqkid;
```
* 觀察程式碼
```clike=
void tst_suggest(const tst_node *p,
const char c,
const size_t nchr,
char **a,
int *n,
const int max)
{
if (!p || *n < max)
return;
tst_suggest(p->lokid, c, nchr, a, n, max);
if (p->key)
tst_suggest(p->eqkid, c, nchr, a, n, max);
else if (*(((char *) p->eqkid) + nchr - 1) == c)
a[(*n)++] = (char *) p->eqkid;
tst_suggest(p->hikid, c, nchr, a, n, max);
}
```
* 將`p->eqkid`儲存到`a[]`中時,因未正確檢驗輸入 index,造成讀取到危險區域 ,經修正後如下:
```clike=
void tst_suggest(const tst_node *p,
const char c,
const size_t nchr,
char **a,
int *n,
const int max)
{
if (*n >= max || !p)
return;
tst_suggest(p->lokid, c, nchr, a, n, max);
if (p->key)
tst_suggest(p->eqkid, c, nchr, a, n, max);
else if (*(((char *) p->eqkid) + nchr - 1) == c)
a[(*n)++] = (char *) p->eqkid;
tst_suggest(p->hikid, c, nchr, a, n, max);
}
```
* 確保`*n`不越界到不可存取之範圍。
## 效能分析
- [ ] 待補
```
Performance counter stats for './test_ref --bench':
4,509,511 cache-misses # 47.705 % of all cache refs
9,452,930 cache-references
505,184,022 instructions # 1.11 insns per cycle
454,265,288 cycles
2.513052543 seconds time elapsed
```
```
Performance counter stats for './test_cpy --bench':
5,229,904 cache-misses # 51.312 % of all cache refs
10,192,322 cache-references
533,208,563 instructions # 1.03 insns per cycle
518,309,114 cycles
2.138658243 seconds time elapsed
```
## 針對現代處理器架構之改善方式
* Linux Kernel 的 Slab Allocator 機制為一種 memory pool 的實現。
`sudo slabtop`
```
OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
140931 140918 99% 0.19K 6711 21 26844K dentry
96174 96174 100% 0.10K 2466 39 9864K buffer_head
52032 50612 97% 0.06K 813 64 3252K kmalloc-64
49860 49860 100% 1.05K 1662 30 53184K ext4_inode_cache
48360 47006 97% 0.20K 2418 20 9672K vm_area_struct
34986 34986 100% 0.12K 1029 34 4116K kernfs_node_cache
22338 22338 100% 0.04K 219 102 876K ext4_extent_status
20808 20076 96% 0.08K 408 51 1632K anon_vma
19328 18511 95% 0.03K 151 128 604K kmalloc-32
15988 15832 99% 0.57K 571 28 9136K radix_tree_node
15484 15484 100% 0.55K 553 28 8848K inode_cache
13504 12667 93% 0.25K 422 32 3376K kmalloc-256
10024 10024 100% 0.07K 179 56 716K Acpi-Operand
9472 9472 100% 0.02K 37 256 148K kmalloc-16
8192 8192 100% 0.01K 16 512 64K kmalloc-8
7310 7310 100% 0.05K 86 85 344K ftrace_event_field
7176 7009 97% 0.61K 276 26 4416K proc_inode_cache
5814 5814 100% 0.04K 57 102 228K Acpi-Namespace
5586 4738 84% 0.19K 266 21 1064K kmalloc-192
4074 4074 100% 0.09K 97 42 388K kmalloc-96
4448 4033 90% 0.12K 139 32 556K kmalloc-128
4176 3833 91% 0.64K 174 24 2784K shmem_inode_cache
2816 2650 94% 0.50K 88 32 1408K kmalloc-512
2720 2582 94% 1.00K 85 32 2720K kmalloc-1024
2380 2287 96% 0.56K 85 28 1360K ecryptfs_key_record_cache
1840 1840 100% 0.09K 40 46 160K trace_event_file
1428 1331 93% 0.12K 42 34 168K jbd2_journal_head
1325 1325 100% 0.62K 53 25 848K sock_inode_cache
1376 1275 92% 2.00K 86 16 2752K kmalloc-2048
1152 1152 100% 0.06K 18 64 72K ext4_free_data
1120 1120 100% 0.14K 40 28 160K btrfs_path
1008 961 95% 3.75K 126 8 4032K task_struct
896 896 100% 0.03K 7 128 28K jbd2_revoke_record_s
768 768 100% 0.02K 3 256 12K jbd2_revoke_table_s
680 680 100% 0.05K 8 85 32K jbd2_journal_handle
657 657 100% 0.05K 9 73 36K Acpi-Parse
```
* 透過 [slab allocation](https://en.wikipedia.org/wiki/Slab_allocation) 為記憶體動態智慧管理的 memory allocator。
* 若將 linux kernel 之 slab 技術運用至程式中,可進一步智慧地動態分配記憶體。
> 或許是殺雞焉用牛刀?
> [name="ChiuYiTang"][color=blue]
## tst_traverse_f 與 callback function 運作模式
* 觀察程式碼
```clike=
void tst_traverse_fn(const tst_node *p,
void(*fn)(const void *, void *),
void *data)
{
if (!p)
return;
tst_traverse_fn(p->lokid, fn, data);
if (p->key)
tst_traverse_fn(p->eqkid, fn, data);
else
fn(p, data);
tst_traverse_fn(p->hikid, fn, data);
}
```
* 此為 Inorder traversal of ternary search tree。
* 遍歷順序為:lower kid > middle kid > root > higher kid
* 以下圖為例,遍歷順序為:e > l > p > p > a > t > a > w > o > c
```
c
/ |
a a
| | \
p t o
| |
p w
|
l
|
e
```
* 每遍歷一個節點,透過 Call back function 回傳函式指標呼叫`void(*fn)(const void *, void *)`。
#### Call back function
* 當我們希望某個事件發生時,會有通知,以方便進行一些相對應的處理工作。
* 方法即透過呼叫某函式`tst_traverse_fn`時,於過程中傳入一個函式指標`fn`給他。待條件觸發(上例為『中序遍歷到該節點』),會呼叫函式指標。此時會回到主程式並傳入函式 `fn`,結束後再回到函式`tst_traverse_fn`繼續執行未完事項。
[[source](http://www.opensourceforu.com/wp-content/uploads/2012/02/function-callback-1.jpg)]

## 問題探討
### 詳細觀察 tst.h 和 tst.c 為何要 forward declaration 呢?分離介面和實作有何好處呢?這樣的思維又該如何運用於 phonebook 呢?
* 透過 forward declaration 除了與 function pointer 結合,可以使用 C 語言實現資料封裝、繼承與多型等等物件導向思維之程式碼,亦可實作介面快速抽換程式碼。例如:
> 用一致的 coding style 編排以下程式碼列表:
> [name="jserv"][color=red]
```clike=
//Person.h
typedef struct _Person Person;
//declaration of pointers to functions
typedef void(*fptrDisplayInfo)(Person*);
typedef void(*fptrWriteToFile)( Person*, const char*);
typedef void(*fptrDelete)( Person *) ;
//Note: In C all the members are by default public. We can achieve
//the data hiding (private members), but that method is tricky.
//For simplification of this article
// we are considering the data members //public only.
typedef struct _Person
{
char* pFName;
char* pLName;
//interface for function
fptrDisplayInfo Display;
fptrWriteToFile WriteToFile;
fptrDelete Delete;
}Person;
person* new_Person(const char* const pFirstName,
const char* const pLastName); //constructor
void delete_Person(Person* const pPersonObj); //destructor
void Person_DisplayInfo(Person* const pPersonObj);
void Person_WriteToFile(Person* const pPersonObj, const char* pFileName);
```
* 此外,透過 forward declaration 亦可於編譯時取代原先`#include <head file.h>`,於大型專案下可大幅減少重新編譯之時間。
```clike=
#include <A.h>
class B
{
private:
A* PtrA ;
public:
...
};
```
變為:
```clike=
class A;
class B
{
private:
A* PtrA ;
public:
...
};
```
#### 應用於 phonebook
- [ ] 待補
### 針對各國城市的 prefix search 有無缺陷呢?比方說無法區分城市和國家,並提出具體程式碼的修正
* 觀察測資:
```
Taipei, Taiwan
Kinshasa, Democratic Republic of the Congo
Lima, Peru
Cairo, Egypt
London, United Kingdom
Beijing, China
Tehrān, Iran
Nanchong, China
Bogotá, Colombia
Hong Kong, China
Lahore, Pakistan
```
* 測資格式為:**城市, 國家**。
* 城市與國家之差異在
* 城市以『,』結尾,以此線索修改:
```clike=
int main(int argc, char **argv)
{
for(;;) {
...
case 's':
printf("find words matching prefix (at least 1 char): ");
if (!fgets(word, sizeof word, stdin)) {
fprintf(stderr, "error: insufficient input.\n");
break;
}
rmcrlf(word);
t1 = tvgetf();
res = tst_search_prefix(root, word, sgl, &sidx, LMAX);
t2 = tvgetf();
if (res) {
printf(" %s - searched prefix in %.6f sec\n\n", word, t2 - t1);
for (int i = 0; i < sidx; i++){
char *flag = sgl[i] + strlen(sgl[i]) - 1;
if(*flag == ',')
printf("suggest[%d] for CITY : %s\n", i, sgl[i]);
else printf("suggest[%d] for COUNTRY: %s\n", i, sgl[i]);
}
} else
printf(" %s - not found\n", word);
break;
}
}
```
* 結果:
```clike=
...
suggest[23] for COUNTRY: Tairan
suggest[24] for CITY : Tairua,
suggest[25] for CITY : Taisen-ri,
suggest[26] for CITY : Taissy,
suggest[27] for COUNTRY: Taitung
suggest[28] for CITY : Taivalkoski,
suggest[29] for CITY : Taivassalo,
suggest[30] for CITY : Taiwala,
suggest[31] for COUNTRY: Taiwan
suggest[32] for CITY : Taixing,
suggest[33] for CITY : Taiynsha,
suggest[34] for CITY : Taiyuan,
suggest[35] for CITY : Taizhou,
```
## 參考資料
[嵌入式的復健筆記](http://fiend1120.pixnet.net/blog/post/196941375-callback-function)
[你所不知道的 C 語言:物件導向程式設計篇](https://hackmd.io/s/HJLyQaQMl)
[你所不知道的C語言:技巧篇](https://hackmd.io/s/HyIdoLnjl)
[Inheritance and Polymorphism in C](https://www.codeproject.com/Articles/108830/Inheritance-and-Polymorphism-in-C)
[Slab allocation
](http://neokentblog.blogspot.tw/2010/10/slab-allocation.html)
[Ternary search trees for autocompletion and spell checking](https://cran.r-project.org/web/packages/TSTr/vignettes/TSTr.html)