Po-Hsing Wu
    • Create new note
    • Create a note from template
      • Sharing URL Link copied
      • /edit
      • View mode
        • Edit mode
        • View mode
        • Book mode
        • Slide mode
        Edit mode View mode Book mode Slide mode
      • Customize slides
      • Note Permission
      • Read
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Write
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Engagement control Commenting, Suggest edit, Emoji Reply
    • Invite by email
      Invitee

      This note has no invitees

    • Publish Note

      Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

      Your note will be visible on your profile and discoverable by anyone.
      Your note is now live.
      This note is visible on your profile and discoverable online.
      Everyone on the web can find and read all notes of this public team.
      See published notes
      Unpublish note
      Please check the box to agree to the Community Guidelines.
      View profile
    • Commenting
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
      • Everyone
    • Suggest edit
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
    • Emoji Reply
    • Enable
    • Versions and GitHub Sync
    • Note settings
    • Note Insights New
    • Engagement control
    • Transfer ownership
    • Delete this note
    • Save as template
    • Insert from template
    • Import from
      • Dropbox
      • Google Drive
      • Gist
      • Clipboard
    • Export to
      • Dropbox
      • Google Drive
      • Gist
    • Download
      • Markdown
      • HTML
      • Raw HTML
Menu Note settings Note Insights Versions and GitHub Sync Sharing URL Create Help
Create Create new note Create a note from template
Menu
Options
Engagement control Transfer ownership Delete this note
Import from
Dropbox Google Drive Gist Clipboard
Export to
Dropbox Google Drive Gist
Download
Markdown HTML Raw HTML
Back
Sharing URL Link copied
/edit
View mode
  • Edit mode
  • View mode
  • Book mode
  • Slide mode
Edit mode View mode Book mode Slide mode
Customize slides
Note Permission
Read
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Write
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Engagement control Commenting, Suggest edit, Emoji Reply
  • Invite by email
    Invitee

    This note has no invitees

  • Publish Note

    Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

    Your note will be visible on your profile and discoverable by anyone.
    Your note is now live.
    This note is visible on your profile and discoverable online.
    Everyone on the web can find and read all notes of this public team.
    See published notes
    Unpublish note
    Please check the box to agree to the Community Guidelines.
    View profile
    Engagement control
    Commenting
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    • Everyone
    Suggest edit
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    Emoji Reply
    Enable
    Import from Dropbox Google Drive Gist Clipboard
       Owned this note    Owned this note      
    Published Linked with GitHub
    • Any changes
      Be notified of any changes
    • Mention me
      Be notified of mention me
    • Unsubscribe
    --- tags: ncku-course --- # 2016q3 Homework1 (phonebook) contributed by <`0140454`> ### Reviewed bt `finalallpass` * 可以嘗試更多的hash function來做比較。 --- ## 開發環境 Ubuntu 16.04 LTS Linux 4.4.0-38-generic 在筆電上灌了Ubuntu Linux,為了讓比較基準一致,所以重新檢視先前的進度。 ## 效能分析 - 優化前 * `./phonebook_orig` ``` size of entry : 136 bytes execution time of append() : 0.037285 sec execution time of findName() : 0.005002 sec ``` * `make cache-test` ``` Performance counter stats for './phonebook_orig' (100 runs): 104,2032 cache-misses # 86.554 % of all cache refs ( +- 0.24% ) 120,3911 cache-references ( +- 0.15% ) 2,6080,5148 instructions # 1.39 insns per cycle ( +- 0.02% ) 1,8753,3207 cycles ( +- 0.31% ) 0.056462097 seconds time elapsed ( +- 0.32% ) ``` * 透過perf找熱點 (`./phonebook_org & sudo perf top -p $!`) ``` PerfTop: 10440 irqs/sec kernel:15.3% exact: 100.0% [5000 cycles:pp], (target_pid: 4387) ------------------------------------------------------------------------------- 25.19% libc-2.23.so [.] __strcasecmp_l_avx 24.03% phonebook_orig [.] findName 9.70% phonebook_orig [.] main 4.49% libc-2.23.so [.] _int_malloc 4.10% libc-2.23.so [.] _IO_fgets 3.50% libc-2.23.so [.] __memcpy_sse2 2.72% libc-2.23.so [.] _IO_getline_info 2.13% libc-2.23.so [.] __strcpy_sse2_unaligned 1.84% [unknown] [k] 0x00007f9ed815602e 1.71% [kernel] [k] clear_page_c_e 1.71% [kernel] [.] native_irq_return_iret 1.66% phonebook_orig [.] append 1.55% libc-2.23.so [.] malloc 1.53% libc-2.23.so [.] memchr 1.02% [kernel] [k] release_pages 1.02% phonebook_orig [.] strcasecmp@plt 1.02% [kernel] [k] unmap_page_range 0.92% [kernel] [k] free_pcppages_bulk 0.78% libc-2.23.so [.] __strcasecmp_avx 0.70% [kernel] [k] free_hot_cold_page ``` >「程式輸出的文字訊息」請勿使用圖片表示,請改由文字貼上[name=課程助教] >>了解![name=吳勃興] 從上面的資料可以發現findName佔了24.03%,append則是佔了1.66%。但我們從執行秒數上面可以發現,append的時間卻是比findName長的。 另外,一開始在執行perf的時候,發現取樣的數量太少,導致append原本沒有出現在perf的列表上,所以用了`-c`的參數來增加取樣頻率。 ### Memory Leak 重新觀看main.c後,發現在最後並沒有完整將記憶體釋放。 ```clike= if (pHead->pNext) free(pHead->pNext); ``` 我們可以透過valgrind來觀看memory leak的程度 ``` ==24874== LEAK SUMMARY: ==24874== definitely lost: 136 bytes in 1 blocks ==24874== indirectly lost: 46,770,128 bytes in 343,898 blocks ==24874== possibly lost: 816,000 bytes in 6,000 blocks ==24874== still reachable: 0 bytes in 0 blocks ==24874== suppressed: 0 bytes in 0 blocks ==24874== Rerun with --leak-check=full to see details of leaked memory ``` 將程式碼改寫成 ```clike= while(pHead->pNext) { entry *next = pHead->pNext; pHead->pNext = next->pNext; free(next); } ``` valgrind的執行結果顯示沒有memory leak的問題 ``` ==24915== All heap blocks were freed -- no leaks are possible ``` ## 效能分析 - 優化後 ### 方法一:減少結構大小 這邊有參考一下先前學長姐的開發紀錄,另外定義一個結構 __PHONE_BOOK。 在分配 __PHONE_BOOK的時候也會順便分配 __PHONE_BOOK_ENTRY。 ```clike= typedef struct __PHONE_BOOK { char lastName[MAX_LAST_NAME_SIZE]; struct __PHONE_BOOK_ENTRY *pDetail; struct __PHONE_BOOK *pNext; } entry; ``` 雖然結構的大小縮減為32bytes,但cache miss卻沒有下降,時間反而變得更長。 ``` size of entry : 32 bytes execution time of append() : 0.047521 sec execution time of findName() : 0.010678 sec Performance counter stats for './phonebook_opt' (100 runs): 126,9535 cache-misses # 94.437 % of all cache refs ( +- 0.09% ) 134,4318 cache-references ( +- 0.07% ) 3,4116,6636 instructions # 1.23 insns per cycle ( +- 0.01% ) 2,7753,9644 cycles ( +- 0.18% ) 0.084211150 seconds time elapsed ( +- 0.17% ) ``` 在想應該是因為在append時做了兩次的malloc。 既然在dictionary內只有lastName的資訊,那append的時候只需要先分配名字的記憶體就好。未來如果有需要,再分配存放詳細資訊的空間。 ``` size of entry : 32 bytes execution time of append() : 0.030606 sec execution time of findName() : 0.002104 sec Performance counter stats for './phonebook_opt' (100 runs): 15,5705 cache-misses # 48.712 % of all cache refs ( +- 1.17% ) 31,9643 cache-references ( +- 1.02% ) 2,4467,7584 instructions # 1.90 insns per cycle ( +- 0.02% ) 1,2855,7826 cycles ( +- 0.33% ) 0.038822965 seconds time elapsed ( +- 0.36% ) ``` ![](https://i.imgur.com/3q8uWbv.png) ### 方法二:利用hash function 雖然減少結構大小已經有不錯的改善,但假如每一次搜尋都需要從頭找,多少會影響執行效率。 如果可以利用hash function將名字map到數個slot,使得搜尋時只要在特定slot內的尋找,這樣應該也能夠改善效能。 在這邊我使用的是djb2,並將TABLE_SIZE定為32767,也就是原本的 ```clike= entry *pHead, *e; ``` 變成 ```clike= entry pHead[TABLE_SIZE], *e[TABLE_SIZE]; ``` findName的時間接近0秒,但append的時間卻增加了0.025秒。可是cache miss卻增加到68%。 透過hash table,只要發生碰撞的機率很低,我們便可以查詢所花的時間。但是,在插入資料的時候,因為必須做額外的運算,所以append的部份有所增長也是預期的結果。 ``` size of entry : 32 bytes execution time of append() : 0.051574 sec execution time of findName() : 0.000000 sec Performance counter stats for './phonebook_hash' (100 runs): 102,5574 cache-misses # 68.005 % of all cache refs ( +- 0.56% ) 150,8084 cache-references ( +- 0.09% ) 2,8155,3172 instructions # 0.96 insns per cycle ( +- 0.02% ) 2,9422,2148 cycles ( +- 0.28% ) 0.090163917 seconds time elapsed ( +- 0.28% ) ( +- 0.59% ) ``` ![](https://i.imgur.com/y6hrY6f.png) ### 方法三:利用tree structure 利用binary search tree來實作的話,append的時間會增加很多。這是因為dictionary中的word都已經排好序了,使得binary search tree退化成一條線。 因此應該要用其他的樹來進行優化,才可能使執行時間縮短。 * 字典樹 trie 相較於普通的binary search tree,trie因為不會整個退化成一條線,所以在append的時間少很多,但因為每次都從root開始搜尋且entry大小比較大,所以還是比先前的優化方式慢。 ``` size of entry : 232 bytes execution time of append() : 0.134898 sec execution time of findName() : 0.000000 sec Performance counter stats for './phonebook_trie' (100 runs): 96,7428 cache-misses # 67.495 % of all cache refs ( +- 0.26% ) 143,3335 cache-references ( +- 0.19% ) 16,1746,5334 instructions # 1.61 insns per cycle ( +- 0.01% ) 10,0716,5615 cycles ( +- 0.04% ) 0.299867931 seconds time elapsed ( +- 0.05% ) ``` ![](https://i.imgur.com/duJmLWj.png) * 紅黑樹 red-black tree 因為dictionary的關係,binary search tree會退化,所以append的效率非常的差。還是要使用的話,則應該要使用自平衡樹,這邊是使用紅黑樹,在append的時候因為需要進行旋轉等操作,所以時間會比trie多一點點。 ``` size of entry : 56 bytes execution time of append() : 0.158551 sec execution time of findName() : 0.000000 sec Performance counter stats for './phonebook_rbtree' (100 runs): 21,0833 cache-misses # 53.825 % of all cache refs ( +- 0.47% ) 39,1698 cache-references ( +- 1.25% ) 9,7743,9464 instructions # 1.72 insns per cycle ( +- 0.00% ) 5,6885,8067 cycles ( +- 0.72% ) 0.169528500 seconds time elapsed ( +- 0.71% ) ``` ![](https://i.imgur.com/TGw21Jq.png) ### 方法四:利用memory pool 在觀看[陳品睿學長Hackpad](https://embedded2016.hackpad.com/2016q1-Homework1-utBhpkDFVMh)時有看到老師提到可以將malloc替換成memory pool的方式來做,先分配好一大塊記憶體,等append需要的時候直接取一小塊來用,這樣append的時間也能有所改善,畢竟頻繁的要求/釋放記憶體也會造成系統的負荷。 ![](https://i.imgur.com/lBMgvCs.png) 將memory pool套用至各種版本上之後可看到總時間比原本短。看來對於大量的記憶體操作時,可以優先考慮利用memory pool來取代傳統的malloc,進而提升執行效率。 | 版本 | 套用前後時間差異 | | --- | --------------| | 優化前 | -0.01173 | | 小結構 | -0.009254 | | Hash function | -0.007918 | | 字典樹 | -0.063499 | | 紅黑樹 | -0.014424 | ## 演算法和實作的小結 經由上面幾種優化方式後發現,減少結構大小可以使得cache中保存較多資料,因此append與findName的時間都有縮短的趨勢。 而hash與tree則是在findName較有優勢,append因為需要額外處理,所以可能會較原本的慢。再來就是記憶體分配,頻繁的malloc/free對系統來說也是個負擔,善用memory pool可以提升效率。 經過這次作業才知道以前所學科目的重要性,包括計算機組織、作業系統等。因為這次作業也發現自己其實並沒有那麼了解C語言跟硬體方面的東西,這一部份是未來要多加努力來補齊的。 ## 新進度 ### Dataset 的影響 首先,觀察程式中所使用的words.txt可以發現,已經照字典序排序好,那如果把他打亂 (`sort -R words.txt`)再測試一次的話,結果是否會有所不同? ![](https://i.imgur.com/tR805AS.png) 打亂後發現,未優化前與前兩種方式所受到的影響比較小。而後面兩種有關樹的方式,均有明顯的不同。原因在於順序的不同,導致在維護樹的時候有所差異。 下一步,應該尋找一個適當的lastname dataset來做測試,因為words.txt中包含了許多沒有名字意義的單字。 ### Fuzzy Search (Levenshtein Distance) 計算出Levenshtein Distance並列出距離為2的結果。 ```clike= int levenshtein(char *src, char *dst) { int i, j; int src_len = strlen(src); int dst_len = strlen(dst); int table[MAX_LAST_NAME_SIZE + 1][MAX_LAST_NAME_SIZE + 1]; // initialize distance table for (i = 0; i <= src_len; ++i) { table[i][0] = i; } for (j = 0; j <= dst_len; ++j) { table[0][j] = j; } // calculate distance for (j = 1; j <= dst_len; ++j) { for (i = 1; i <= src_len; ++i) { table[i][j] = table[i - 1][j - 1]; if (src[i - 1] != dst[j - 1]) { table[i][j] = min( table[i - 1][j] + 1, table[i][j - 1] + 1, table[i - 1][j - 1] + 1 ); } } } return table[src_len][dst_len]; } ``` 將原本要搜尋的`zyxel`改為`xyzel`的話,結果如下: ``` aczel (2) azel (2) etzel (2) eyfel (2) ezel (2) fazel (2) gazel (2) gyel (2) gymel (2) gysel (2) hazel (2) hywel (2) kydel (2) kyzer (2) kyzyl (2) lyel (2) mazel (2) mykel (2) mysel (2) nyel (2) oxymel (2) pyhel (2) rozel (2) sydel (2) vozel (2) xdel (2) xurel (2) xylem (2) xylol (2) xylyl (2) xyzzy (2) yeel (2) yoel (2) zyxel (2) ``` 原先想計算相似度,然後返回相似度最大的entry。但是從上面的結果可以看到,假如搜尋`xyzel`的話,`aczel`跟`zyxel`的相似度會是一樣的,所以不如將可能的清單列出,讓使用者判斷,或是增加其他搜尋條件來精確地過濾。 ## 參考資料 1. [案例分析:Phonebook](https://embedded2016.hackpad.com/ep/pad/static/1vvUk25C0fI) 2. [靜妃大神Hackpad](https://embedded2015.hackpad.com/note-hw2-phonebook-3mUOSk5J9cu) 3. [关于CPU Cache -- 程序猿需要知道的那些事](http://cenalulu.github.io/linux/all-about-cpu-cache/) 4. [各種字符串Hash函數比較](https://www.byvoid.com/zht/blog/string-hash-compare) 5. [平衡樹](http://user.frdm.info/ckhung/b/al/balst.php) 6. [C语言 实现红黑树](http://www.oschina.net/code/snippet_1178986_47779) 7. [陳品睿學長Hackpad](https://embedded2016.hackpad.com/2016q1-Homework1-utBhpkDFVMh) 8. [Wagner–Fischer algorithm - Wikipedia](https://en.wikipedia.org/wiki/Wagner%E2%80%93Fischer_algorithm)

    Import from clipboard

    Paste your markdown or webpage here...

    Advanced permission required

    Your current role can only read. Ask the system administrator to acquire write and comment permission.

    This team is disabled

    Sorry, this team is disabled. You can't edit this note.

    This note is locked

    Sorry, only owner can edit this note.

    Reach the limit

    Sorry, you've reached the max length this note can be.
    Please reduce the content or divide it to more notes, thank you!

    Import from Gist

    Import from Snippet

    or

    Export to Snippet

    Are you sure?

    Do you really want to delete this note?
    All users will lose their connection.

    Create a note from template

    Create a note from template

    Oops...
    This template has been removed or transferred.
    Upgrade
    All
    • All
    • Team
    No template.

    Create a template

    Upgrade

    Delete template

    Do you really want to delete this template?
    Turn this template into a regular note and keep its content, versions, and comments.

    This page need refresh

    You have an incompatible client version.
    Refresh to update.
    New version available!
    See releases notes here
    Refresh to enjoy new features.
    Your user state has changed.
    Refresh to load new user state.

    Sign in

    Forgot password

    or

    By clicking below, you agree to our terms of service.

    Sign in via Facebook Sign in via Twitter Sign in via GitHub Sign in via Dropbox Sign in with Wallet
    Wallet ( )
    Connect another wallet

    New to HackMD? Sign up

    Help

    • English
    • 中文
    • Français
    • Deutsch
    • 日本語
    • Español
    • Català
    • Ελληνικά
    • Português
    • italiano
    • Türkçe
    • Русский
    • Nederlands
    • hrvatski jezik
    • język polski
    • Українська
    • हिन्दी
    • svenska
    • Esperanto
    • dansk

    Documents

    Help & Tutorial

    How to use Book mode

    Slide Example

    API Docs

    Edit in VSCode

    Install browser extension

    Contacts

    Feedback

    Discord

    Send us email

    Resources

    Releases

    Pricing

    Blog

    Policy

    Terms

    Privacy

    Cheatsheet

    Syntax Example Reference
    # Header Header 基本排版
    - Unordered List
    • Unordered List
    1. Ordered List
    1. Ordered List
    - [ ] Todo List
    • Todo List
    > Blockquote
    Blockquote
    **Bold font** Bold font
    *Italics font* Italics font
    ~~Strikethrough~~ Strikethrough
    19^th^ 19th
    H~2~O H2O
    ++Inserted text++ Inserted text
    ==Marked text== Marked text
    [link text](https:// "title") Link
    ![image alt](https:// "title") Image
    `Code` Code 在筆記中貼入程式碼
    ```javascript
    var i = 0;
    ```
    var i = 0;
    :smile: :smile: Emoji list
    {%youtube youtube_id %} Externals
    $L^aT_eX$ LaTeX
    :::info
    This is a alert area.
    :::

    This is a alert area.

    Versions and GitHub Sync
    Get Full History Access

    • Edit version name
    • Delete

    revision author avatar     named on  

    More Less

    Note content is identical to the latest version.
    Compare
      Choose a version
      No search result
      Version not found
    Sign in to link this note to GitHub
    Learn more
    This note is not linked with GitHub
     

    Feedback

    Submission failed, please try again

    Thanks for your support.

    On a scale of 0-10, how likely is it that you would recommend HackMD to your friends, family or business associates?

    Please give us some advice and help us improve HackMD.

     

    Thanks for your feedback

    Remove version name

    Do you want to remove this version name and description?

    Transfer ownership

    Transfer to
      Warning: is a public team. If you transfer note to this team, everyone on the web can find and read this note.

        Link with GitHub

        Please authorize HackMD on GitHub
        • Please sign in to GitHub and install the HackMD app on your GitHub repo.
        • HackMD links with GitHub through a GitHub App. You can choose which repo to install our App.
        Learn more  Sign in to GitHub

        Push the note to GitHub Push to GitHub Pull a file from GitHub

          Authorize again
         

        Choose which file to push to

        Select repo
        Refresh Authorize more repos
        Select branch
        Select file
        Select branch
        Choose version(s) to push
        • Save a new version and push
        • Choose from existing versions
        Include title and tags
        Available push count

        Pull from GitHub

         
        File from GitHub
        File from HackMD

        GitHub Link Settings

        File linked

        Linked by
        File path
        Last synced branch
        Available push count

        Danger Zone

        Unlink
        You will no longer receive notification when GitHub file changes after unlink.

        Syncing

        Push failed

        Push successfully