Try   HackMD

2018q1 Homework1 (phonebook)

contributed by < ryanpatiency >

tags: ryanpatiency

github作業區作業要求

3d23f44 commit message 看不出做了哪些修改
3cce83d 未說明為什麼要修改 hash table size
課程助教

git rebase 後 push 在 github 上了 ,但 3d23f44 是 merge 前的 commit ,我不是很明白要怎麼改訊息。下次上課再問助教
3cce83d
Update hash table size => Update hash table size to prime
ryanpatiency

給初學者/無頭緒 的建議

老師的作業要求很多,但是如果你完全沒有頭緒,不知道從哪裡開始,這篇文章是針對你寫的

重點整理

  1. 準備好 linux, git&github, hackmd
  2. 看作業 1 教學影片
  3. 整理歷屆學生的文件,找出值得效法者 ( cashe-miss 最低者)
  4. 選一個效法
  5. 紀錄過程在 hackmd 上,並 push 到 github 上
  6. 回到第四步
  7. 挑戰題

第一步驟是:安裝 linux 、 學會用 git & github 、學會打 hackmd

因為作業要運行在 linux 上,作業繳交是用 github & hackmd,所以你一定要會這 3 個。

  • 安裝 linux 網路上有許多教學,大部份都是下面的步驟
    1. 格式化一個 usb
    2. 把 linux 安裝到 usb
    3. 重開機
    4. 剛開機後立刻按下 F1 or F2 or F11 or 進去 bios mode
    5. 選擇開機順序 => 讓電腦先用 usb 開機
    6. 重開機 => 成功!!
  • github 網路上有許多教學,你先學會以下就好
    1. git add -A
    2. git commit -a -m "messadge"
    3. git pull
    4. git push
    5. git diff
    6. gitk

避免用 git commit -a -m “messadge”,請用 git commit 然後仔細撰寫 Git commit message。
參見 How to Write a Git Commit Message

Image Not Showing Possible Reasons
  • The image file may be corrupted
  • The server hosting the image is unavailable
  • The image path is incorrect
  • The image format is not supported
Learn More →
jserv

  • hackmd 你看任何一個文件是怎麼打的,依樣畫葫蘆就好了(點工具欄的鉛筆符號)

如果你花了超過3個小時還沒有辦法把上面的東西弄好,找一個已經會的同學教你,因為 ~~ 你知道的,這次作業要求你付出16個小時以上,你不能夠在預備上就花超過 3 個小時

如果你弄好了,就可以開始做作業了:

第二步驟是:看作業1的教學影片 link

看影片大概要花一個小時的時間,看完並且跟著操作後,配合之前所學,你就能在作業區上傳一份空白的作業了。(包含空白的 hackmd 和 同步後的 github)

現在你的 github 上是沒有任何進度的,接下來你要做的,就是 commit 並 push 你的進度。並且紀錄在 hackmd 上。加油!!

看完影片後,你應該要知道:

知道這一次的作業目的是減少 cache miss ,並且提高程式執行速度。

第三步驟是:站在巨人的肩膀上(尋找強者學長和同學的文件)。

首先我們已經知道目標是要減少 cache miss ,那麼我們就應該先找找歷屆同學的範例程式裡面;哪一個的 case-miss 最低?我把我整理的名單放在後面,但是你應該要自己尋找、整理。

第四步驟是:選個文件照著做

如果文件看不懂,比如你不知道 $ make cashe-test 你可以:

  • 丟給 google 和 youtube
  • 問同學。
  • 去找 official manual (man)
  • 在 fb 上的 課程討論區發問。
  • 紀錄在 hackmd 上
  • 跳過 <= important

這裡我給你一些在讀文件遇到問題時的範例,供你參考

  • 了解 make 的方法 (also perf , gcc, )

    • 終端機打 $ man make
    • YouTube/google :what is make in linux
  • 了解 memory pool 怎麼實作

    • google: what is memory pool
    • google: memory pool in c
  • 了解 hash table 怎麼實作

    • youtube: hash algorithm
    • google: hash table in c

你可以一直做第四步驟直到你有信心為止。

如果你還是不知道要怎麼做 接下來是我的作法記錄,你可以對照著看

最後最後,祝你修課順利:) 加油!!

內容大綱

  • 作業一影片要求整理
  • 見賢思齊的成績紀錄
  • 實驗紀錄:重複 ryanwang 同學的實驗 ( struct size, hash table ) 達到已知最低的 cache-miss: 21%
  • 實驗紀錄:重複 ierosodin 同學的實驗 ( struct size, hash table, memory pool):cache-miss 不降反升、再次增快執行速度
  • 實驗紀錄:其他實驗
  • 問題紀錄

作業一要求:

目的:實作自己的

  • phonebook_opt.c
  • Achieve low cache miss and run time

手段:

  • 調查:find the lowest cache miss report in this class's history
  • examine the method he/she used
  • do it again myself
  • if possible, combine-other's method

加分題:

  • fuzzy search
  • 了解 race condition
  • 了解使用 valgrind
  • superscalar
  • out-of-order execution
  • dual-issue
  • branch prediction (& how to use gcc to optimize)
  • tracepoint
  • clock_gettime man page
  • gnuplot => 除了預設以外的 plot

先不管加分題,我們先開始調查:

"見賢思齊" 中的成績紀錄:

同學的成果,以 cache miss 為標準,以下是成績優異的同學

  • memory pool: 37.338% link
  • mmap: 29.101% (but require extra time) link
  • hash function 26.915% link
  • memory pool: 29.716% link
  • entry size 32.843% link

調查結果,ryan wang 同學 26.915% 第1名

我們的第 1 要務是了解 26.915% 是怎麼做到的。

另外,記得影片中老師提到有一同學做到16% ,回去看影片,發現是在 42分處。結果誤會一場 :)

實驗紀錄:重複 26.915% ryanwang 同學的實驗

以下的實驗可當作ryanwang同學的筆記的補充,請大家可以先看他的筆記,如果遇到難處,可以參考以下我找到的資料

有許多部份我沒有一步一步照著做,而是自己上網找資料,然後用自己的方法做。如果大家看到有些步驟沒有說明,只有一些連結,那代表我所參考的資料


ryan wang 的文件中有 $ echo 1 | sudo tee /proc/sys/vm/drop_caches

解說如下:

  • 不懂 | 和 > 和 &> 和 >&2 的話,他們的搜尋關鍵字是 bash pipe, redirection, 參考 link

  • tee是用來給 stdout 權限的,,大家可以 man tee

    為什麼不 run $ echo 1 > sudo /proc/sys/vm/drop_caches 呢?
    實驗看看下面兩行就能理解
    $ sudo echo 1 > /proc/sys/vm/drop_caches
    $ echo 1 > sudo /proc/sys/vm/drop_caches
    這個技巧再後來$ perf record的時候也會用到

  • 1, 2, 3 各有不同的意思:

    • To free pagecache:
      echo 1 > /proc/sys/vm/drop_caches
    • To free dentries and inodes:
      echo 2 > /proc/sys/vm/drop_caches
    • To free pagecache, dentries and inodes:
      echo 3 > /proc/sys/vm/drop_caches

$ make cache-test 的結果:

 Performance counter stats for './phonebook_opt' (100 runs):

         1,178,032      cache-misses              #   78.818 % of all cache refs      ( +-  0.97% )
         1,494,619      cache-references                                              ( +-  0.90% )
       261,906,975      instructions              #    1.17  insns per cycle          ( +-  0.02% )
       223,419,587      cycles                                                        ( +-  1.14% )

       0.095444565 seconds time elapsed                                          ( +-  1.55% )

$ make plot 的結果

可以發現:我的電腦速度本來就比這位ryanwang522同學慢,他的初始執行時間是 0.04 秒 / 0.006 秒,我的是 0.06 秒 / 0.007 秒

# 如何縮小 struct __PHONE_BOOK_ENTRY 的大小

我的實作方式:註解程式碼

typedef struct __PHONE_BOOK_ENTRY {
    char lastName[MAX_LAST_NAME_SIZE];
//    char firstName[16];
//    char email[16];
//    char phone[10];
//    char cell[10];
//    char addr1[16];
//    char addr2[16];
//    char city[16];
//    char state[2];
//    char zip[5];
    struct __PHONE_BOOK_ENTRY *pNext;
} entry;
  • ryanwang 提到的 l1_cache ,可用 lscpu 找到

After Shrink the size I run make cache-miss again

Performance counter stats for './phonebook_opt' (100 runs):

       103,879      cache-misses              #   32.039 % of all cache refs      ( +-  1.26% )
       324,226      cache-references                                              ( +-  1.76% )
   241,106,445      instructions              #    1.72  insns per cycle          ( +-  0.02% )
   140,094,134      cycles                                                        ( +-  1.03% )

   0.059512604 seconds time elapsed                                          ( +-  1.29% )

cashe-miss 從 80% 降到 30%

phonebook_opt.c 中有一個 TODO, 請大家要按照 TODO 做!

改好後的 make plot

# 如何實作 Hash table

參考以下連結 DIY

  • tutorial link
  • BKDR (backdoor) Hashtable tutorial link
    • According to the website, "This hash function comes from Brian Kernighan and Dennis Ritchie's book 'The C Programming Language'"

可見我們老師多愛 C
ryanpatiency

我的實作很簡單,只要把 e & pHead 改成 hash_e[key] & hash_pHead[key] 就好了,github 上可以找到

  • 結果 1 ( hash_table 數量為 100 )
    phonebook_opt cache-misses # 28.637 % of all cache refs ( ± 0.88% )
  • 結果 2 ( hash_table 數量為 1000)
    phonebook_opt ache-misses # 21.690 % of all cache refs ( ± 2.34% )

當 hash_table 數量為 1000 時,就連 phonebook_orig 都能跑到 27% 的 cache-miss
為了對照,最好把 HASH_TABLE_SIZE 的定義 放在 #ifdef OPT #else 之中,並定義 orig 的 hash table size 為 1

21%!! 只是 follow 前人的腳步就能超過已知最佳成績,真的要感謝這位ryanwang522同學

實驗紀錄:重複 ierosodin 同學的實驗

# 解說 ./phonebook_orig & perf top -p $!

  • & make the program run in background. google key word: bash &
  • $! the last ip of command. google key word: bash $!
  • perf stat : please run $perf help top yourself
    In fact, this command is tricky way of perf record
    see th link below:
    link: perf record tutorial

# 解說 perf stat --repeat 100 -e cache-misses,cache-references,instructions,cycles ./phonebook_orig

DIY the following to understand:

  • $ perf help stat

  • $ perf help list

    We can find a spooky thing in the end of the $ perf help list:

    1. Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume
      3B: System Programming Guide
      http://www.intel.com/Assets/PDF/manual/253669.pdf

    2. AMD64 Architecture Programmer’s Manual Volume 2: System Programming
      http://support.amd.com/us/Processor_TechDocs/24593_APM_v2.pdf

    it is a 500 page manual, I am terrified

  • go to wiki
    Wiki also tells you to look manual, So to understand what is cache-misses, cache-references, I suggest you just go to look at the manual.

  • helpful 教學
    可以學會其他不錯的 event: L1-dcache-loads,L1-dcache-load-misses,L1-dcache-stores

  • 了解 cashe-store 和 cashe-miss 的差別:教學

ierosodin also shrink struct size, but he add a struct __PHONE_BOOK_INFO to keep the rest of the info, which looks more proper.

# How to implement Memory pool

ierosodin 同學有提供 memory pool 的 code,但由於沒有針對這次的作業,我建議大家上網去找 link

我採用 link 中的方式實現 memory pool

typedef struct pool
{
  char * next;
  char * end;
} POOL;

POOL * pool_create( size_t size ) {
    POOL * p = (POOL*)malloc( size + sizeof(POOL) );
    p->next = (char*)&p[1];
    p->end = p->next + size;
    return p;
}

void pool_destroy( POOL *p ) {
    free(p);
}

size_t pool_available( POOL *p ) {
    return p->end - p->next;
}

void * pool_alloc( POOL *p, size_t size ) {
    if( pool_available(p) < size ) return NULL;
    void *mem = (void*)p->next;
    p->next += size;
    return mem;
}

要用 memory pool 取代 malloc, 所以 ctrl-F 找尋所有 malloc 出現的地方,把他們換成 pool_alloc, 當然一開始要先 pool_create() (除了 main.c 以外,不要動到 orig比較好)
由於我這次打算只觀察 memory pool 的影響,所以沒有 shrink entry size, 所以一個 node size = 136 bytes, dictionary 有 350000 筆資料,總共要 alloc 350000 * 136 = 47600000 bytes

Bug 紀錄:
Error: double free or corruption

解決方法:
pool_createpool_destroy 外加上 #ifdef OPT#endif,把原本的 free 放在 #else
結果

  • phonebook_orig: cache-misses # 82.738%
  • phonebook_opt: cache-misses # 57.111%

改善成功! drop from 82 % to 57 %

但是可以發現在 cashe-miss 上的改善不如 Hashfunction (82% to 27%) 及 Shrink entry size (82% to 30%) 明顯

# 介紹 valgrind

用法 valgrind ./phonebook_opt

可以發現產生許多有用的資訊,用一次就知道了。

範例:

$ valgrind ./phonebook_orig
...
==3341== HEAP SUMMARY:
==3341==     in use at exit: 0 bytes in 0 blocks
==3341==   total heap usage: 349,906 allocs, 349,906 frees, 47,596,856 bytes allocated
==3341== 
==3341== All heap blocks were freed -- no leaks are possible
...
$ valgrind ./phonebook_opt
...
==3291== HEAP SUMMARY:
==3291==     in use at exit: 0 bytes in 0 blocks
==3291==   total heap usage: 1,006 allocs, 1,006 frees, 47,634,336 bytes allocated
==3291== 
==3291== All heap blocks were freed -- no leaks are possible

可以發現 allocs 次數從 349906 降到 1006

現在要 alloc 的 memory 是 entry size (24 bytes) * 350000 = 8400000
但是上面的 terminal out 的時候忘記改了。所以 allocated bytes 才會相同

iero 同學使用 valgrind --leak-check=full, 但我在 man valgrind 中沒看到 leak-check=full
發現是執行 valgrind ./phonebook_o* 後,如果有 memory leak, valgrind 會告訴你請加上 leak-check=full

實作 memory pool,並用 valgrind debug 完後,我想看看綜合之前的方法會怎麼樣

我將 mpool branch 與原本參考 ryanwang 的 master branch git-merge 並執行

執行結果

$ ./phonebook_opt 
size of entry : 24 bytes
execution time of append() : 0.063161 sec
execution time of findName() : 0.000003 sec
...
$ ./phonebook_orig 
size of entry : 136 bytes
execution time of append() : 0.088766 sec
execution time of findName() : 0.006353 sec

執行時間都變快不少

Performance counter stats for './phonebook_orig' (100 runs):

         1,472,016      cache-misses              #   88.575 % of all cache refs      ( +-  0.46% )
         1,661,879      cache-references                                              ( +-  0.36% )
       379,636,405      instructions              #    1.49  insns per cycle          ( +-  0.02% )
       255,468,254      cycles                                                        ( +-  0.36% )

       0.103079308 seconds time elapsed                                          ( +-  0.75% )
            
Performance counter stats for './phonebook_opt' (100 runs):

            73,289      cache-misses              #   35.223 % of all cache refs      ( +-  8.15% )
           208,071      cache-references                                              ( +-  5.94% )
       177,799,568      instructions              #    1.63  insns per cycle          ( +-  0.29% )
       108,785,516      cycles                                                        ( +-  0.91% )

       0.044483962 seconds time elapsed                                          ( +-  1.11% )

什麼?為什麼 cashe-miss 從 21% 上升到了 30% 了? 我不相信,於是 git checkout 以前的版本試試看

// 之前的版本
Performance counter stats for './phonebook_orig' (100 runs):

         1,232,064      cache-misses              #   85.881 % of all cache refs      ( +-  0.48% )
         1,434,611      cache-references                                              ( +-  0.43% )
       314,415,068      instructions              #    1.39  insns per cycle          ( +-  0.02% )
       225,707,514      cycles                                                        ( +-  0.40% )

       0.091467723 seconds time elapsed                                          ( +-  0.76% )


Performance counter stats for './phonebook_opt' (100 runs):

            77,994      cache-misses              #   22.709 % of all cache refs      ( +-  1.49% )
           343,453      cache-references                                              ( +-  2.01% )
       238,434,085      instructions              #    1.64  insns per cycle          ( +-  0.02% )
       145,711,315      cycles                                                        ( +-  0.68% )

       0.059365206 seconds time elapsed                                          ( +-  1.10% )

opt 的執行結果 ,這次是 22% 。看起來實驗是沒問題的。git diff 後也證實的確只有 memory pool 不同。可能是 memory pool 哪裡出了問題吧?

不過大家可以發現,memory pool 把 instruction 數目 降低了 ,所以時間也從 0.06 變為 0.04

實驗示範到這裡結束,從分隔線後,是我的實驗紀錄。就沒有特別寫給初學者看了。不過看到這裡,你應該有能力去挑戰加分題了吧~

加油!


iero 同學的最終版本還是快我 3 倍

我想 git pull 他的 code 來跑看看

gnuplot

iero 同學有畫很特別的 hash_function plot 我也要學起來,並且了解他的橫軸是怎麼一回事?(MOD=17 時不是應該每一個要有 20000 左右的節點嗎?

執行結果 和 hash_table_size 的關係

根據網路上的建議,size 的選擇是質數比較好

Input Data

改成 all-names.txt
add the line #define INPUT "john" because now input may change

there is three john in the list, hm I thought each will be unique

blog
demo
source code

Match score rule:

  • Score starts at 0
  • Matched letter: +0 points
  • Unmatched letter: -1 point
  • Consecutive match bonus: +5 points
  • Separator bonus: +10 points
  • Camel case bonus: +10 points
  • Unmatched leading letter: -3 points (max: -9).
    • There is a penalty if you DON’T match the first three letters.
  • Score system should be tuned for YOUR use case. Words, sentences, file names, or method names all prefer different tuning.

author of gnu grep 's post: link

the way he use to measure grep's speed:

  $ time sh -c 'find . -type f -print | xargs grep -l 123456789abcdef'
  real	0m1.530s
  user	0m0.230s
  sys	0m1.357s
  $ time sh -c 'find . -type f -print | xargs grep --mmap -l 123456789abcdef'
  real	0m1.201s
  user	0m0.330s
  sys	0m0.929s

Summary by the grep author:

  • Use Boyer-Moore (and unroll its inner loop a few times).

  • Roll your own unbuffered input using raw system calls. Avoid copying
    the input bytes before searching them. (Do, however, use buffered
    *output*. The normal grep scenario is that the amount of output is
    small compared to the amount of input, so the overhead of output
    buffer copying is small, while savings due to avoiding many small
    unbuffered writes can be large.)

  • Don't look for newlines in the input until after you've found a match.

  • Try to set things up (page-aligned buffers, page-sized read chunks,
    optionally use mmap) so the kernel can ALSO avoid copying the bytes.

  • The key to making programs fast is to make them do practically nothing. ;-)

  • read "Fast String Searching", by Andrew Hume and Daniel Sunday, => free pdf online

My Makefile:

phonebook_fuzzy: $(SRCS_common) phonebook_fuzzy.c phonebook_fuzzy.h
	$(CC) $(CFLAGS_common) $(CFLAGS_opt) \
		-DIMPL="\"$@.h\"" -o $@ \
		$(SRCS_common) $@.c

findName in phonebook_fuzzy.c:

entry *findName(char lastName[], entry *pHead) { int best_score = INT_MIN; char *maybe = NULL; while (pHead != NULL) { if (strcasecmp(lastName, pHead->lastName) == 0) return pHead; int score = fuzzy_match(lastName, pHead->lastName); // * fuzzy_match is the c version of the code link in the above if(score > best_score) { best_score = score; maybe = pHead->pNext; } pHead = pHead->pNext; } printf("No result for %s, Did you mean: %s\n", lastName, maybe); return NULL; }

try search for my ip: ryanpatiency, the end:

$ ./phonebook_fuzzy 
size of entry : 136 bytes
No result for ryanpatiency, Did you mean: ryan

I did it. It was super super fun to make a fuzzy search and search for my id.

just for fun, here is the output if you search for jserv:

size of entry : 136 bytes
No result for jserv, Did you mean: jose

hmm.. jose, quite unlike a name. I think jserv is better the jose.

challenge: using priority queue.

  • using heap for heap is the maximally efficient implementation of priority queue (quote from wikipedia)

心得:看懂強者的 code 不好玩,看不懂更不好玩。最好玩的還是理解原理自己作

ps: after merge, current speed is:

$ ./phonebook_opt 
size of entry : 24 bytes
execution time of append() : 0.003056 sec
execution time of findName() : 0.000000 sec

It is because this all-name.txt is much small

Data alignment

Other homework:

  • the value of LHS >> RHS is implementation-defined where in most implementations, this performs arithmetic right shift (so that the result remains negative)

Amazing work:

from

uint32_t reverse_bits_32(uint32_t x) { uint32_t reverse_x = 0; for (int i = 0; i < 32; i++) { if((x & (1 << i))) reverse_x |= 1 << ((32 - 1) - i); } return reverse_x; }

to

uint32_t reverse_bits_32(uint32_t x) { uint32_t reverse_x = 0; for (int i = 0; i < 32; i++){ reverse_x |= ( x >> ( ( 32 - 1 ) - i ) & 1 ) << i; } return reverse_x; }

2's compliment:

  • 0's 2's compliment = 0
  • most negative nuber's 2's compliment = itself
  • x + (-x) = 1<<(16 or 32 or 64)

positive -> +
negative -> -
carry out -> c
most significant bit => m
(+) + (+) =>

  • overflow iff m=1
  • c always = 0

(+) + (-) => c = 0

  • c = 1 iff (+) + (-) >= 0 (m=0)
  • c = 0 iff (+) + (-) < 0 (m=1)
  • never overflow

(-) + (-) =>

  • c always = 1
  • m = 0 iff overflow

conclution => overflow (x,y, x+y) = x'y'(x+y) + xy(x+y)'

你所不知道的C語言 遞迴心得

&& & || left first

page fault: 3 kind

invalid: page
major: disk
minor : 20:20

ulimit -s stack size

stacksize observation

TODO run the program around 39:30

type size rule: 50: 06

Kill tail recurtion, Q-matrix , 1:20:00

upper string means late unix variable;

todo: modify find.c source code

what is pure function.

Introduction to Computer System 心得

  • float x = x*x > 0 yes
  • int x = x*x > 0 ? (try gdb 50000 * 50000)
  • test program => gdb
  • 300400500600 = 400500*600 * 300
  • all overflow, but follow some rule
  • (1e20±1e20) + 3.14 != 1e20+(-1e20+3.14)\9
  • bound tracing
    ​​​​typedef struct { ​​​​ int a[2]; ​​​​ double d; ​​​​} s ​​​​modify s.a[3] change the value of d;
  • 1948 => start bits
  • 1/2 one-half
  • 1/4 one-forth
  • 1100 = c
  • p && *p avoid null pointer access
  • (gdb) p 123 << 64 => 123
  • experiment compare if unsign short and short
    ​​​​ unsigned short ux = 0xffff; ​​​​ short x = 0xffff; ​​​​ printf("x is %d, ux is %d, x==ux : %d\n", x, ux, x==ux); ​​​​
    $ x is -1, ux is 65535, x==ux:0
  • -1 > 0U
  • ​​​​(gdb) ptype -2147483648
    ​​​​type = unsigned int
    ​​​​(gdb) ptype -2147483647-1
    ​​​​type = int
    
  • abs(INT_MIN) = INT_MIN9
  • sizeof return usigned, i-sizeof(char) >=0 forever
  • sign extension: -8 = -16 + 8 (1xxx = 11xxx)
  • negative multiplication9
  • C will make both unsigned if one is unsigned number9
  • unsigned loop
unsigned i ; for( i = cnt-2, i<cnt, i--){ // do any thing }

問題紀錄和解答

問:在

$ valgrind ./phonebook_orig
size of entry : 136 bytes
==3341== HEAP SUMMARY:
==3341==     in use at exit: 0 bytes in 0 blocks
==3341==   total heap usage: **349,906 allocs**, 349,906 frees, 47,596,856 bytes allocated
==3341== 
==3341== All heap blocks were freed -- no leaks are possible
...
$ valgrind ./phonebook_opt
size of entry : 24 bytes
...
==3291== HEAP SUMMARY:
==3291==     in use at exit: 0 bytes in 0 blocks
==3291==   total heap usage: **1,006 allocs**, 1,006 frees, 47,634,336 bytes allocated
==3291== 
==3291== All heap blocks were freed -- no leaks are possible

中 為什麼 明明 size of entry 不一樣,bytes allocated 卻差不多呢?
答:因為 pool_alloc() 的 size 沒變

問: linux 中 $! 代表什麼意思呢?

解答:
- $! is the PID of the last backgrounded process.
- kill -0 $PID checks whether it's still running.
- $$ is the PID of the current shell.

問: perf stat -e r1a8 -a sleep 1 中的 1a8 是 event, 但是前面的 r 是什麼意思

問: 為什麼初始情形,我執行 ./phonebook_orig & perf top -p $! 的結果和 ierosodin 不一樣,我執行了好幾次都是 100 % sample 到某個 kernel function 去,即使我的 sample rate 已經調到最高

問: 在 make cache-test 前面為什麼不用清除 cache?

問: 為什麼 main.c 不用 include "phone_book.o*.h" 就可以呼叫 append, findname 等 function 了呢?

問: 為什麼 valgrind 說我有以下的 error 呢?

==714== Conditional jump or move depends on uninitialised value(s)
==714==    at 0x4E67E6F: tolower (ctype.c:46)
==714==    by 0x4C31899: strcasecmp (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==714==    by 0x400DE3: findName (phonebook_opt.c:13)
==714==    by 0x400B74: main (main.c:80)
==714== 
==714== Use of uninitialised value of size 8
==714==    at 0x4E67E83: tolower (ctype.c:46)
==714==    by 0x4C31899: strcasecmp (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==714==    by 0x400DE3: findName (phonebook_opt.c:13)
==714==    by 0x400B74: main (main.c:80)
​​​​是說 tolower 錯了嗎?

問: 為什麼使用 memory pool 後,cache-misses 竟然從 19.9% 增加到 30 %

問: 為什麼 git rerere default enable = false ?

問: change git messadge git rebase 遇到 merge

問:

(gdb) ptype -2147483648
type = unsigned int
(gdb) ptype -2147483647-1
type = int

Ans: According to manual: If both operands have the same type, then no further conversion is needed.