# 2018q1 Homework1 (phonebook) contributed by < `ryanpatiency` > ###### tags: `ryanpatiency` [github](https://github.com/ryanpatiency/phonebook)、 [作業區](https://hackmd.io/s/SJONH8fuz)、 [作業要求](https://hackmd.io/s/SykZ8IMOf) > [3d23f44](https://github.com/ryanpatiency/phonebook/commit/3d23f44bbbcdaeb03723f85a5001124253d68cc9) commit message 看不出做了哪些修改 > [3cce83d](https://github.com/ryanpatiency/phonebook/commit/3cce83d8761c4510554ba255a6fb8e7de5969f62) 未說明為什麼要修改 hash table size > [color=red][name=課程助教] >> git rebase 後 push 在 github 上了 ,但 3d23f44 是 merge 前的 commit ,我不是很明白要怎麼改訊息。下次上課再問助教 >> [3cce83d](https://github.com/ryanpatiency/phonebook/commit/3cce83d8761c4510554ba255a6fb8e7de5969f62) >> Update hash table size => Update hash table size to prime >> [name=ryanpatiency][color=green] ## 給初學者/無頭緒 的建議 老師的作業要求很多,但是如果你完全沒有頭緒,不知道從哪裡開始,這篇文章是針對你寫的 **重點整理** 1. 準備好 linux, git&github, hackmd 2. 看作業 1 教學影片 3. 整理歷屆學生的文件,找出值得效法者 ( cashe-miss 最低者) 4. 選一個效法 5. 紀錄過程在 hackmd 上,並 push 到 github 上 6. 回到第四步 7. 挑戰題 **第一步驟是:安裝 linux 、 學會用 git & github 、學會打 hackmd** 因為作業要運行在 linux 上,作業繳交是用 github & hackmd,所以你一定要會這 3 個。 * 安裝 linux 網路上有許多教學,大部份都是下面的步驟 1. 格式化一個 usb 2. 把 linux 安裝到 usb 3. 重開機 4. 剛開機後立刻按下 F1 or F2 or F11 or ... 進去 bios mode 5. 選擇開機順序 => 讓電腦先用 usb 開機 6. 重開機 => 成功!! * github 網路上有許多教學,你先學會以下就好 1. git add -A 2. <s>git commit -a -m "messadge"</s> 3. git pull 4. git push 5. git diff 6. gitk :::danger 避免用 `git commit -a -m “messadge”`,請用 `git commit` 然後仔細撰寫 Git commit message。 參見 [How to Write a Git Commit Message](https://chris.beams.io/posts/git-commit/) :notes: jserv ::: * hackmd 你看任何一個文件是怎麼打的,依樣畫葫蘆就好了(點工具欄的鉛筆符號) 如果你花了超過3個小時還沒有辦法把上面的東西弄好,找一個已經會的同學教你,因為 ~~ 你知道的,這次作業要求你付出16個小時以上,你不能夠在預備上就花超過 3 個小時 如果你弄好了,就可以開始做作業了: **第二步驟是:看作業1的教學影片 [link](https://www.youtube.com/watch?v=ZICRLKf_bVw)** 看影片大概要花一個小時的時間,看完並且跟著操作後,配合之前所學,你就能在[作業區](https://hackmd.io/s/SJONH8fuz)上傳一份空白的作業了。(包含空白的 hackmd 和 同步後的 github) 現在你的 github 上是沒有任何進度的,接下來你要做的,就是 commit 並 push 你的進度。並且紀錄在 hackmd 上。加油!! 看完影片後,你應該要知道: 知道這一次的作業目的是減少 cache miss ,並且提高程式執行速度。 **第三步驟是:站在巨人的肩膀上**(尋找強者學長和同學的文件)。 首先我們已經知道目標是要減少 cache miss ,那麼我們就應該先找找歷屆同學的範例程式裡面;哪一個的 case-miss 最低?我把我整理的名單放在後面,但是你應該要自己尋找、整理。 **第四步驟是:選個文件照著做** 如果文件看不懂,比如你不知道 `$ make cashe-test` 你可以: * 丟給 google 和 youtube * 問同學。 * 去找 official manual (man) * 在 fb 上的 課程討論區發問。 * 紀錄在 hackmd 上 * 跳過 <= important 這裡我給你一些在讀文件遇到問題時的範例,供你參考 * 了解 make 的方法 (also perf , gcc, ...) * 終端機打 ` $ man make` * YouTube/google :what is make in linux * 了解 memory pool 怎麼實作 * google: what is memory pool * google: memory pool in c * 了解 hash table 怎麼實作 * youtube: hash algorithm * google: hash table in c 你可以一直做第四步驟直到你有信心為止。 如果你還是不知道要怎麼做 接下來是我的作法記錄,你可以對照著看 最後最後,祝你修課順利:) 加油!! ## 內容大綱 * 作業一影片要求整理 * 見賢思齊的成績紀錄 * 實驗紀錄:重複 `ryanwang` 同學的實驗 ( struct size, hash table ) 達到已知最低的 cache-miss: 21% * 實驗紀錄:重複 `ierosodin` 同學的實驗 ( struct size, hash table, memory pool):cache-miss 不降反升、再次增快執行速度 * 實驗紀錄:其他實驗 * 問題紀錄 ## 作業一要求: ### 目的:實作自己的 * `phonebook_opt.c` * Achieve **low** cache miss and run time ### 手段: - 調查:find the lowest cache miss report in this class's history - examine the method he/she used - do it again myself - if possible, combine-other's method ### 加分題: - fuzzy search - 了解 race condition - 了解使用 [valgrind](http://valgrind.org/) - superscalar - out-of-order execution - dual-issue - branch prediction (& how to use gcc to optimize) - tracepoint - clock_gettime man page - gnuplot => 除了預設以外的 plot 先不管加分題,我們先開始調查: ## "見賢思齊" 中的成績紀錄: 同學的成果,以 cache miss 為標準,以下是成績優異的同學 - memory pool: 37.338% [link](https://hackmd.io/s/BJl6pFyzf) - mmap: 29.101% (but require extra time) [link](https://hackmd.io/s/H1g0s3sax) - hash function 26.915% [link](https://hackmd.io/s/Sk-qaWiYl) - memory pool: 29.716% [link](https://hackmd.io/s/SJ4uqs9Yx) - entry size 32.843% [link](https://hackmd.io/s/BJjL6cQ6) 調查結果,`ryan wang` 同學 26.915% 第1名 我們的第 1 要務是了解 26.915% 是怎麼做到的。 ###### 另外,記得影片中老師提到有一同學做到16% ,回去看影片,發現是在 [42分](https://youtu.be/ZICRLKf_bVw?t=42m50s)處。結果誤會一場 :) ## 實驗紀錄:重複 26.915% `ryanwang` 同學的實驗 以下的實驗可當作[`ryanwang`同學](https://hackmd.io/s/Sk-qaWiYl)的筆記的補充,請大家可以先看他的筆記,如果遇到難處,可以參考以下我找到的資料 有許多部份我沒有一步一步照著做,而是自己上網找資料,然後用自己的方法做。如果大家看到有些步驟沒有說明,只有一些連結,那代表我所參考的資料 **** `ryan wang` 的文件中有 `$ echo 1 | sudo tee /proc/sys/vm/drop_caches` 解說如下: * 不懂 | 和 > 和 &> 和 >&2 ... 的話,他們的搜尋關鍵字是 bash pipe, redirection, 參考 [link](http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO.html#toc4) * ` tee `是用來給 `stdout` 權限的,,大家可以 `man tee` 為什麼不 run `$ echo 1 > sudo /proc/sys/vm/drop_caches` 呢? 實驗看看下面兩行就能理解 `$ sudo echo 1 > /proc/sys/vm/drop_caches` `$ echo 1 > sudo /proc/sys/vm/drop_caches` 這個技巧再後來`$ perf record`的時候也會用到 * 1, 2, 3 各有不同的意思: - To free pagecache: `echo 1 > /proc/sys/vm/drop_caches` - To free dentries and inodes: `echo 2 > /proc/sys/vm/drop_caches` - To free pagecache, dentries and inodes: `echo 3 > /proc/sys/vm/drop_caches` 我 `$ make cache-test ` 的結果: ``` Performance counter stats for './phonebook_opt' (100 runs): 1,178,032 cache-misses # 78.818 % of all cache refs ( +- 0.97% ) 1,494,619 cache-references ( +- 0.90% ) 261,906,975 instructions # 1.17 insns per cycle ( +- 0.02% ) 223,419,587 cycles ( +- 1.14% ) 0.095444565 seconds time elapsed ( +- 1.55% ) ``` `$ make plot` 的結果 ![](https://i.imgur.com/KoDWHDm.png) 可以發現:我的電腦速度本來就比這位`ryanwang522`同學慢,他的初始執行時間是 0.04 秒 / 0.006 秒,我的是 0.06 秒 / 0.007 秒 ### # 如何縮小 struct __PHONE_BOOK_ENTRY 的大小 **我的實作方式:註解程式碼** ``` typedef struct __PHONE_BOOK_ENTRY { char lastName[MAX_LAST_NAME_SIZE]; // char firstName[16]; // char email[16]; // char phone[10]; // char cell[10]; // char addr1[16]; // char addr2[16]; // char city[16]; // char state[2]; // char zip[5]; struct __PHONE_BOOK_ENTRY *pNext; } entry; ``` * `ryanwang` 提到的 l1_cache ,可用 `lscpu` 找到 After Shrink the size I run `make cache-miss` again ``` Performance counter stats for './phonebook_opt' (100 runs): 103,879 cache-misses # 32.039 % of all cache refs ( +- 1.26% ) 324,226 cache-references ( +- 1.76% ) 241,106,445 instructions # 1.72 insns per cycle ( +- 0.02% ) 140,094,134 cycles ( +- 1.03% ) 0.059512604 seconds time elapsed ( +- 1.29% ) ``` :::success cashe-miss 從 80% 降到 30% ::: :::warning phonebook_opt.c 中有一個 TODO, 請大家要按照 TODO 做! ::: **改好後的 make plot** ![](https://i.imgur.com/Mzzjfil.png) ### # 如何實作 Hash table 參考以下連結 DIY - tutorial [link](https://www.youtube.com/watch?v=KyUTuwz_b7Q) - BKDR (backdoor) Hashtable tutorial [link](http://www.partow.net/programming/hashfunctions/) - According to the website, "This hash function comes from Brian Kernighan and Dennis Ritchie's book 'The C Programming Language'" > 可見我們老師多愛 C > [name=ryanpatiency][color=green] 我的實作很簡單,只要把 `e` & `pHead` 改成 `hash_e[key]` & `hash_pHead[key]` 就好了,github 上可以找到 * 結果 1 ( hash_table 數量為 100 ) phonebook_opt cache-misses # 28.637 % of all cache refs ( +- 0.88% ) * 結果 2 ( hash_table 數量為 1000) phonebook_opt ache-misses # 21.690 % of all cache refs ( +- 2.34% ) :::info 當 hash_table 數量為 1000 時,就連 phonebook_orig 都能跑到 27% 的 cache-miss 為了對照,最好把 HASH_TABLE_SIZE 的定義 放在 #ifdef OPT #else 之中,並定義 orig 的 hash table size 為 1 ::: :::success **21%!! 只是 follow 前人的腳步就能超過已知最佳成績,真的要感謝這位`ryanwang522`同學** ::: ## 實驗紀錄:重複 [`ierosodin`](https://hackmd.io/s/SJ4uqs9Yx#%E4%BD%BF%E7%94%A8-hash-function-BKDR) 同學的實驗 ### # 解說 `./phonebook_orig & perf top -p $!` * `&` make the program run in background. google key word: bash & * `$!` the last ip of command. google key word: bash $! * `perf stat` : please run `$perf help top` yourself In fact, this command is tricky way of `perf record` see th link below: [link: perf record tutorial](https://www.ibm.com/developerworks/library/l-analyzing-performance-perf-annotate-trs/) ### # 解說 `perf stat --repeat 100 -e cache-misses,cache-references,instructions,cycles ./phonebook_orig` DIY the following to understand: * `$ perf help stat` * `$ perf help list` We can find a spooky thing in the end of the `$ perf help list`: 1. Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 3B: System Programming Guide http://www.intel.com/Assets/PDF/manual/253669.pdf 2. AMD64 Architecture Programmer’s Manual Volume 2: System Programming http://support.amd.com/us/Processor_TechDocs/24593_APM_v2.pdf it is a 500 page manual, I am terrified * go to [wiki](https://perf.wiki.kernel.org/index.php/Tutorial#Events) Wiki also tells you to look manual, So to understand what is cache-misses, cache-references, I suggest you just go to look at the manual. * helpful [教學](http://www.brendangregg.com/perf.html#Events) 可以學會其他不錯的 event: L1-dcache-loads,L1-dcache-load-misses,L1-dcache-stores * 了解 cashe-store 和 cashe-miss 的差別:[教學](https://stackoverflow.com/questions/18408189/cpu-cache-performance-store-misses-vs-load-misses) :::info `ierosodin` also shrink struct size, but he add a `struct __PHONE_BOOK_INFO` to keep the rest of the info, which looks more proper. ::: ### # How to implement Memory pool `ierosodin` 同學有提供 memory pool 的 code,但由於沒有針對這次的作業,我建議大家上網去找 [link](https://stackoverflow.com/questions/11749386/implement-own-memory-pool) 我採用 link 中的方式實現 memory pool ``` typedef struct pool { char * next; char * end; } POOL; POOL * pool_create( size_t size ) { POOL * p = (POOL*)malloc( size + sizeof(POOL) ); p->next = (char*)&p[1]; p->end = p->next + size; return p; } void pool_destroy( POOL *p ) { free(p); } size_t pool_available( POOL *p ) { return p->end - p->next; } void * pool_alloc( POOL *p, size_t size ) { if( pool_available(p) < size ) return NULL; void *mem = (void*)p->next; p->next += size; return mem; } ``` 要用 memory pool 取代 malloc, 所以 ctrl-F 找尋所有 malloc 出現的地方,把他們換成 pool_alloc, 當然一開始要先 pool_create() (除了 main.c 以外,不要動到 orig比較好) 由於我這次打算只觀察 memory pool 的影響,所以沒有 shrink entry size, 所以一個 node size = 136 bytes, dictionary 有 350000 筆資料,總共要 alloc 350000 * 136 = 47600000 bytes :::danger Bug 紀錄: Error: double free or corruption ::: 解決方法: 在 `pool_create` 和 `pool_destroy` 外加上 `#ifdef OPT` 和 `#endif`,把原本的 free 放在 `#else` 中 結果 * phonebook_orig: cache-misses # 82.738% * phonebook_opt: cache-misses # 57.111% :::success **改善成功!** drop from 82 % to 57 % ::: 但是可以發現在 cashe-miss 上的改善不如 Hashfunction (82% to 27%) 及 Shrink entry size (82% to 30%) 明顯 ### # 介紹 valgrind 用法 `valgrind ./phonebook_opt` 可以發現產生許多有用的資訊,用一次就知道了。 範例: ``` $ valgrind ./phonebook_orig ... ==3341== HEAP SUMMARY: ==3341== in use at exit: 0 bytes in 0 blocks ==3341== total heap usage: 349,906 allocs, 349,906 frees, 47,596,856 bytes allocated ==3341== ==3341== All heap blocks were freed -- no leaks are possible ... $ valgrind ./phonebook_opt ... ==3291== HEAP SUMMARY: ==3291== in use at exit: 0 bytes in 0 blocks ==3291== total heap usage: 1,006 allocs, 1,006 frees, 47,634,336 bytes allocated ==3291== ==3291== All heap blocks were freed -- no leaks are possible ``` 可以發現 allocs 次數從 349906 降到 1006 :::info 現在要 alloc 的 memory 是 entry size (24 bytes) * 350000 = 8400000 但是上面的 terminal out 的時候忘記改了。所以 allocated bytes 才會相同 ::: `iero` 同學使用 `valgrind --leak-check=full`, 但我在 man valgrind 中沒看到 --leak-check=full 發現是執行 valgrind ./phonebook_o* 後,如果有 memory leak, `valgrind` 會告訴你請加上 --leak-check=full 實作 memory pool,並用 valgrind debug 完後,我想看看綜合之前的方法會怎麼樣 我將 mpool branch 與原本參考 `ryanwang` 的 master branch git-merge 並執行 執行結果 ``` $ ./phonebook_opt size of entry : 24 bytes execution time of append() : 0.063161 sec execution time of findName() : 0.000003 sec ... $ ./phonebook_orig size of entry : 136 bytes execution time of append() : 0.088766 sec execution time of findName() : 0.006353 sec ``` 執行時間都變快不少 ``` Performance counter stats for './phonebook_orig' (100 runs): 1,472,016 cache-misses # 88.575 % of all cache refs ( +- 0.46% ) 1,661,879 cache-references ( +- 0.36% ) 379,636,405 instructions # 1.49 insns per cycle ( +- 0.02% ) 255,468,254 cycles ( +- 0.36% ) 0.103079308 seconds time elapsed ( +- 0.75% ) Performance counter stats for './phonebook_opt' (100 runs): 73,289 cache-misses # 35.223 % of all cache refs ( +- 8.15% ) 208,071 cache-references ( +- 5.94% ) 177,799,568 instructions # 1.63 insns per cycle ( +- 0.29% ) 108,785,516 cycles ( +- 0.91% ) 0.044483962 seconds time elapsed ( +- 1.11% ) ``` 什麼?為什麼 cashe-miss **從 21% 上升到了 30% 了?** 我不相信,於是 git checkout 以前的版本試試看 ``` // 之前的版本 Performance counter stats for './phonebook_orig' (100 runs): 1,232,064 cache-misses # 85.881 % of all cache refs ( +- 0.48% ) 1,434,611 cache-references ( +- 0.43% ) 314,415,068 instructions # 1.39 insns per cycle ( +- 0.02% ) 225,707,514 cycles ( +- 0.40% ) 0.091467723 seconds time elapsed ( +- 0.76% ) Performance counter stats for './phonebook_opt' (100 runs): 77,994 cache-misses # 22.709 % of all cache refs ( +- 1.49% ) 343,453 cache-references ( +- 2.01% ) 238,434,085 instructions # 1.64 insns per cycle ( +- 0.02% ) 145,711,315 cycles ( +- 0.68% ) 0.059365206 seconds time elapsed ( +- 1.10% ) ``` `opt` 的執行結果 ,這次是 22% 。看起來實驗是沒問題的。git diff 後也證實的確只有 memory pool 不同。可能是 memory pool 哪裡出了問題吧? 不過大家可以發現,memory pool 把 instruction 數目 降低了 ,所以時間也從 0.06 變為 0.04 實驗示範到這裡結束,從分隔線後,是我的實驗紀錄。就沒有特別寫給初學者看了。不過看到這裡,你應該有能力去挑戰加分題了吧~ **加油!** *** ## `iero` 同學的最終版本還是快我 3 倍 我想 git pull 他的 code 來跑看看 ## gnuplot `iero` 同學有畫很特別的 hash_function plot 我也要學起來,並且了解他的橫軸是怎麼一回事?(MOD=17 時不是應該每一個要有 20000 左右的節點嗎? ## 執行結果 和 hash_table_size 的關係 根據[網路上的建議](http://cseweb.ucsd.edu/~kube/cls/100/Lectures/lec16/lec16-8.html),size 的選擇是質數比較好 ## Input Data 改成 all-names.txt add the line `#define INPUT "john"` because now input may change there is three john in the list, hm... I thought each will be unique ## fuzzy search [blog](https://blog.forrestthewoods.com/reverse-engineering-sublime-text-s-fuzzy-match-4cffeed33fdb) [demo](https://s3-us-west-2.amazonaws.com/forrestthewoods.staticweb/lib_fts/tests/fuzzy_match/fts_fuzzy_match_test.html) [source code](https://github.com/forrestthewoods/lib_fts/blob/master/code/fts_fuzzy_match.h) Match score rule: - Score starts at 0 - Matched letter: +0 points - Unmatched letter: -1 point - Consecutive match bonus: +5 points - Separator bonus: +10 points - Camel case bonus: +10 points - Unmatched leading letter: -3 points (max: -9). - There is a penalty if you DON’T match the first three letters. - Score system should be tuned for YOUR use case. Words, sentences, file names, or method names all prefer different tuning. author of gnu `grep` 's post: [link](https://lists.freebsd.org/pipermail/freebsd-current/2010-August/019310.html) the way he use to measure `grep`'s speed: ``` $ time sh -c 'find . -type f -print | xargs grep -l 123456789abcdef' real 0m1.530s user 0m0.230s sys 0m1.357s $ time sh -c 'find . -type f -print | xargs grep --mmap -l 123456789abcdef' real 0m1.201s user 0m0.330s sys 0m0.929s ``` Summary by the `grep` author: - Use Boyer-Moore (and unroll its inner loop a few times). - Roll your own unbuffered input using raw system calls. Avoid copying the input bytes before searching them. (Do, however, use buffered \*output\*. The normal grep scenario is that the amount of output is small compared to the amount of input, so the overhead of output buffer copying is small, while savings due to avoiding many small unbuffered writes can be large.) - Don't look for newlines in the input until after you've found a match. - Try to set things up (page-aligned buffers, page-sized read chunks, optionally use mmap) so the kernel can ALSO avoid copying the bytes. - The key to making programs fast is to make them do practically nothing. ;-) - read "Fast String Searching", by Andrew Hume and Daniel Sunday, => free pdf online My Makefile: ``` phonebook_fuzzy: $(SRCS_common) phonebook_fuzzy.c phonebook_fuzzy.h $(CC) $(CFLAGS_common) $(CFLAGS_opt) \ -DIMPL="\"$@.h\"" -o $@ \ $(SRCS_common) $@.c ``` findName in phonebook_fuzzy.c: ```clike= entry *findName(char lastName[], entry *pHead) { int best_score = INT_MIN; char *maybe = NULL; while (pHead != NULL) { if (strcasecmp(lastName, pHead->lastName) == 0) return pHead; int score = fuzzy_match(lastName, pHead->lastName); // * fuzzy_match is the c version of the code link in the above if(score > best_score) { best_score = score; maybe = pHead->pNext; } pHead = pHead->pNext; } printf("No result for %s, Did you mean: %s\n", lastName, maybe); return NULL; } ``` try search for my ip: ryanpatiency, the end: ``` $ ./phonebook_fuzzy size of entry : 136 bytes No result for ryanpatiency, Did you mean: ryan ``` :::success I did it. It was super super fun to make a fuzzy search and search for my id. ::: just for fun, here is the output if you search for jserv: ``` size of entry : 136 bytes No result for jserv, Did you mean: jose ``` hmm.. jose, quite unlike a name. I think jserv is better the jose. challenge: using priority queue. * using heap for heap is the maximally efficient implementation of priority queue (quote from wikipedia) 心得:看懂強者的 code 不好玩,看不懂更不好玩。最好玩的還是理解原理自己作 ps: after merge, current speed is: ``` $ ./phonebook_opt size of entry : 24 bytes execution time of append() : 0.003056 sec execution time of findName() : 0.000000 sec ``` It is because this all-name.txt is much small ## Data alignment ## Other homework: * the value of `LHS >> RHS` is implementation-defined where in most implementations, this performs arithmetic right shift (so that the result remains negative) Amazing work: from ```clike= uint32_t reverse_bits_32(uint32_t x) { uint32_t reverse_x = 0; for (int i = 0; i < 32; i++) { if((x & (1 << i))) reverse_x |= 1 << ((32 - 1) - i); } return reverse_x; } ``` to ```clike= uint32_t reverse_bits_32(uint32_t x) { uint32_t reverse_x = 0; for (int i = 0; i < 32; i++){ reverse_x |= ( x >> ( ( 32 - 1 ) - i ) & 1 ) << i; } return reverse_x; } ``` 2's compliment: * 0's 2's compliment = 0 * most negative nuber's 2's compliment = itself * x + (-x) = 1<<(16 or 32 or 64) positive -> + negative -> - carry out -> c most significant bit => m (+) + (+) => * overflow iff m=1 * c always = 0 (+) + (-) => c = 0 * c = 1 iff (+) + (-) >= 0 (m=0) * c = 0 iff (+) + (-) < 0 (m=1) * never overflow (-) + (-) => * c always = 1 * m = 0 iff overflow conclution => overflow (x,y, x+y) = x'y'(x+y) + xy(x+y)' ## 你所不知道的C語言 遞迴心得 && & || left first page fault: 3 kind invalid: page major: disk minor : 20:20 ulimit -s stack size stacksize observation TODO run the program around 39:30 type size rule: 50: 06 Kill tail recurtion, Q-matrix , 1:20:00 upper string means late unix variable; todo: modify find.c source code what is pure function. ## Introduction to Computer System 心得 * float x = x*x > 0 yes * int x = x*x > 0 ? (try gdb 50000 * 50000) * test program => gdb * 300*400*500*600 = 400*500*600 * 300 * all overflow, but follow some rule * (1e20+-1e20) + 3.14 != 1e20+(-1e20+3.14)\9 * bound tracing ```clike= typedef struct { int a[2]; double d; } s modify s.a[3] change the value of d; ``` * 1948 => start bits * 1/2 one-half * 1/4 one-forth * 1100 = c * p && *p avoid null pointer access * (gdb) p 123 << 64 => 123 * experiment compare if unsign short and short ```clike= unsigned short ux = 0xffff; short x = 0xffff; printf("x is %d, ux is %d, x==ux : %d\n", x, ux, x==ux); ``` `$ x is -1, ux is 65535, x==ux:0` * -1 > 0U * ``` (gdb) ptype -2147483648 type = unsigned int (gdb) ptype -2147483647-1 type = int ``` * abs(INT_MIN) = INT_MIN9 * sizeof return usigned, i-sizeof(char) >=0 forever * sign extension: -8 = -16 + 8 (1xxx = 11xxx) * negative multiplication9 * C will make both unsigned if one is unsigned number9 * unsigned loop ```clike= unsigned i ; for( i = cnt-2, i<cnt, i--){ // do any thing } ``` ## 問題紀錄和解答 問:在 ``` $ valgrind ./phonebook_orig size of entry : 136 bytes ==3341== HEAP SUMMARY: ==3341== in use at exit: 0 bytes in 0 blocks ==3341== total heap usage: **349,906 allocs**, 349,906 frees, 47,596,856 bytes allocated ==3341== ==3341== All heap blocks were freed -- no leaks are possible ... $ valgrind ./phonebook_opt size of entry : 24 bytes ... ==3291== HEAP SUMMARY: ==3291== in use at exit: 0 bytes in 0 blocks ==3291== total heap usage: **1,006 allocs**, 1,006 frees, 47,634,336 bytes allocated ==3291== ==3291== All heap blocks were freed -- no leaks are possible ``` 中 為什麼 明明 size of entry 不一樣,bytes allocated 卻差不多呢? 答:因為 `pool_alloc()` 的 size 沒變 問: linux 中 $! 代表什麼意思呢? [解答:]((https://stackoverflow.com/questions/21063765/get-pid-in-shell-bash)) - `$!` is the PID of the last backgrounded process. - `kill -0 $PID` checks whether it's still running. - `$$` is the PID of the current shell. 問: `perf stat -e r1a8 -a sleep 1` 中的 1a8 是 event, 但是前面的 r 是什麼意思 問: 為什麼初始情形,我執行 `./phonebook_orig & perf top -p $!` 的結果和 [ierosodin](https://hackmd.io/s/SJ4uqs9Yx) 不一樣,我執行了好幾次都是 100 % sample 到某個 kernel function 去,即使我的 sample rate 已經調到最高 問: 在 make cache-test 前面為什麼不用清除 cache? 問: 為什麼 main.c 不用 include "phone_book.o*.h" 就可以呼叫 append, findname 等 function 了呢? 問: 為什麼 valgrind 說我有以下的 error 呢? ``` ==714== Conditional jump or move depends on uninitialised value(s) ==714== at 0x4E67E6F: tolower (ctype.c:46) ==714== by 0x4C31899: strcasecmp (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==714== by 0x400DE3: findName (phonebook_opt.c:13) ==714== by 0x400B74: main (main.c:80) ==714== ==714== Use of uninitialised value of size 8 ==714== at 0x4E67E83: tolower (ctype.c:46) ==714== by 0x4C31899: strcasecmp (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==714== by 0x400DE3: findName (phonebook_opt.c:13) ==714== by 0x400B74: main (main.c:80) ``` 是說 tolower 錯了嗎? 問: 為什麼使用 memory pool 後,cache-misses 竟然從 19.9% 增加到 30 % 問: 為什麼 `git rerere` default enable = false ? 問: change git messadge `git rebase` 遇到 merge 問: ``` (gdb) ptype -2147483648 type = unsigned int (gdb) ptype -2147483647-1 type = int ``` Ans: According to manual: If both operands have the same type, then no further conversion is needed.