施雨妏
    • Create new note
    • Create a note from template
      • Sharing URL Link copied
      • /edit
      • View mode
        • Edit mode
        • View mode
        • Book mode
        • Slide mode
        Edit mode View mode Book mode Slide mode
      • Customize slides
      • Note Permission
      • Read
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Write
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Engagement control Commenting, Suggest edit, Emoji Reply
    • Invite by email
      Invitee

      This note has no invitees

    • Publish Note

      Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

      Your note will be visible on your profile and discoverable by anyone.
      Your note is now live.
      This note is visible on your profile and discoverable online.
      Everyone on the web can find and read all notes of this public team.
      See published notes
      Unpublish note
      Please check the box to agree to the Community Guidelines.
      View profile
    • Commenting
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
      • Everyone
    • Suggest edit
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
    • Emoji Reply
    • Enable
    • Versions and GitHub Sync
    • Note settings
    • Note Insights
    • Engagement control
    • Transfer ownership
    • Delete this note
    • Save as template
    • Insert from template
    • Import from
      • Dropbox
      • Google Drive
      • Gist
      • Clipboard
    • Export to
      • Dropbox
      • Google Drive
      • Gist
    • Download
      • Markdown
      • HTML
      • Raw HTML
Menu Note settings Versions and GitHub Sync Note Insights Sharing URL Create Help
Create Create new note Create a note from template
Menu
Options
Engagement control Transfer ownership Delete this note
Import from
Dropbox Google Drive Gist Clipboard
Export to
Dropbox Google Drive Gist
Download
Markdown HTML Raw HTML
Back
Sharing URL Link copied
/edit
View mode
  • Edit mode
  • View mode
  • Book mode
  • Slide mode
Edit mode View mode Book mode Slide mode
Customize slides
Note Permission
Read
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Write
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Engagement control Commenting, Suggest edit, Emoji Reply
  • Invite by email
    Invitee

    This note has no invitees

  • Publish Note

    Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

    Your note will be visible on your profile and discoverable by anyone.
    Your note is now live.
    This note is visible on your profile and discoverable online.
    Everyone on the web can find and read all notes of this public team.
    See published notes
    Unpublish note
    Please check the box to agree to the Community Guidelines.
    View profile
    Engagement control
    Commenting
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    • Everyone
    Suggest edit
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    Emoji Reply
    Enable
    Import from Dropbox Google Drive Gist Clipboard
       owned this note    owned this note      
    Published Linked with GitHub
    Subscribed
    • Any changes
      Be notified of any changes
    • Mention me
      Be notified of mention me
    • Unsubscribe
    Subscribe
    # 2016q3 Homework1 (phonebook) contributed by <`ruby0109`> ### Reviewed by `SarahYuHanCheng` * 使用相同的commit可能造成日後專案維護不易 ![](https://i.imgur.com/RTs0u9X.png) >好的!! 謝謝~ >[name=施雨妏] ## **作業目標** * 要有所成長 * 分析Hash table * 降低append * memory pool * ~~pthread~~ ## 先前微微進度簡介ˊˋ * [Phonebook](https://embedded2016.hackpad.com/2016q1-Homework-1-HVI482VjSAk) * Gnuplot: * 打上 $gnuplot 即可啟動 * 簡易指令如下: > set title 'my plot' @設定圖片名稱 > set xlabel 'x axis' @設定XY軸座標名稱 > set ylabel 'y axis' > set terminal png @設定輸出格式為 .png > set output 'output_plot.png' @設定輸出檔名 > plot [1:10][0:1] sin(x) @畫出 sin(x) 函式,x軸座標範圍1 ~ 10, y軸座標範圍0 ~ 1 * 理解runtime * reset //清除之前的設定 * set ylabel 'time(sec)' * set style fill solid * set title 'perfomance comparison' * set term png enhanced font 'Verdana,10' * set output 'runtime.png' * plot [:][:0.100]'output.txt' using 2:xtic(1) with histogram title 'original', \ 這行代表original 是用第二個column的資料, X的label是取自第1個column * '' using ($0-0.06):($2+0.001):2 with labels title ' ', \ 這行是控制在圖上的資料顯示, $0 用在只有一維的資料, 做x軸; $2是第二個column的資料位置; 2則是第二組的資料(放在圖上的資料內容) 0.06和0.001都是標籤位移量 * '' using 3:xtic(1) with histogram title 'optimized' , \ optimized 是用第三個column的資料, X的label是取自第1個column * '' using ($0+0.3):($3+0.0015):3 with labels title ' ' * 補充: * plot 'file', plot 'file' 1:2 和plot 'file' ($1):($2)有什麼不同呢? 如果其中有幾列的column數比較少(有幾格有缺), 那第一種會自己補一個值, 第二種會忽略那一列, 第三種則是會把遺失的data空下來 如果每列有一些text, 1會有error, 第二種和第三種則是忽略 * 縮小struct: * 原本: ![](https://hackpad-attachments.imgix.net/embedded2016.hackpad.com_HVI482VjSAk_p.575151_1471774925975_2016-08-21-182040_1920x1029_scrot.png?fit=max&w=882) * 改成只有lastname: ```clike= /*Optimal 1*/ //shrink the struct to lastname typedef struct __LAST_NAME__ENTRY { char lastName[MAX_LAST_NAME_SIZE]; detail *data;//data pointer to detail struct struct __LAST_NAME__ENTRY *pNext;// the address of next point value } entry; ``` ![](https://hackpad-attachments.imgix.net/embedded2016.hackpad.com_HVI482VjSAk_p.575151_1471851263346_2016-08-22-153402_1920x1029_scrot.png?fit=max&w=882) * 結果: ![](https://hackpad-attachments.imgix.net/embedded2016.hackpad.com_HVI482VjSAk_p.575151_1457211571113_optimal1.png?fit=max&w=882) * 加上Hash table方法 * table大小和function的採用都是trial&error * 結果:(BKDRhash) ![](https://i.imgur.com/09KFumD.png) ## **改善** ### Github筆記 ==新手上路, 有錯請指正== * 記得建SSH key, 設定權限。方法:[Generating a new SSH key and adding it to the ssh-agent](https://help.github.com/articles/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent/) * Github四種狀態 ![](https://db.tt/FTDkPQUb)圖片來源:[Git 教學](https://kingofamani.gitbooks.io/git-teach/content/chapter_2/repo.html) * 先用git status看放在staging area準備commit的檔案(本端repository) ![](https://i.imgur.com/NVmZ2Rz.png) * changes not staged for commit和untracked files都是沒有要commit的 * 想要新增要commit的檔案用: git add <File>, 如果想要新增全部檔案:git commit -A 或 git commit --all * 再用一次git status>>changes to commit有綠色的檔案, 就是準備上傳的 ![](https://i.imgur.com/2j7tAt2.png) * 接下來用git commit -m "title" -m "message"就可以放到本端的respository * git push就可以送到遠端(web上)參考[Git 教學研究站](http://dylandy.github.io/Easy-Git-Tutorial/) * 如果上傳了想刪除:參考[How can I delete a file from git repo?](http://stackoverflow.com/questions/2047465/how-can-i-delete-a-file-from-git-repo) * git rm file1.txt git commit -m "remove file1.txt" ### 除掉warning * 原本compile有waring: In function 'findname': warning : assignment from incompatible pointer type HashHead[i] = HashHead[i]->pNext; In function 'HashFunction': warning : suggest parentheses around assignment used as truth value while((c=*str++)) >「程式輸出的文字訊息」,請勿使用圖片表示,請改由文字貼上[name=課程助教] > 收到[name=ruby][color=#bc1e3b] * assignment from incompatible pointer type:仔細看才發現是我在struct中的pointer宣告出錯, struct __LAST_NAME__ENTRY少一條底線, 之前居然可以執行。有點好奇編譯器是怎麼處理的? * suggest parentheses around assignment used as truth value, 因為要判斷裡面的真假值, 所以改成while((c = *str++)) ### 分析Hash Table * 之前的作業對Hash function的選擇和bukkets數沒有用數據下去做分析, 這次先把這部份補齊。 * 參考[rnic](https://embedded2016.hackpad.com/2016q1-Homework-1-9eSkToGwJvZ)大大的方式, 我用slot下去看目前的Hash function 分佈狀況: ![](https://i.imgur.com/RhlxyI7.png) 好悲劇的圖阿...不過這也讓我發現沒有做分析, 根本不了解自己的程式是怎麼跑的 * 做圖表看不同的table size出來效能如何   原本Hash的分佈很不均勻, 所以想要看效能隨著hash table size不同是怎麼變化的, 再取一個適當值。我把main.c改成phonebook.c(function也改成phonebook), 另外寫一個main把值輸給phonebook嘗試做測試, 可是卡在不知道怎麼寫, 才可以正確的宣告Hash table size。 ### 降低append #### memory pool: 用一次分配記憶體再一次歸還的方式增進效能。 * 參考[implement own memory pool](http://stackoverflow.com/questions/11749386/implement-own-memory-pool) * 先只考慮放last name structure的話, 用到的memory以word.txt約是350000*32bytes: 11200000。~~一次alloc這麼大的空間會不會影響cach miss呢? >> 先試試看~~ * 我的想法是: memory pool既然是先alloc一個空間, 也就是有一長串的位址, 那我每次要用的時候就用已知的需要大小, 跳到下一個address, 再用另一個變數紀錄剩下的空間來避免超過alloc的大小。不過跟上面的問題一樣,我不知道有沒有方式可以有個變數讓所有file共用(還是這樣的寫法不好?),所以把變數寫在main.c裡, 有點凌亂。 在h file的宣告: ```clike= /* Memory Pool*/ typedef struct __POOL { char lastName[MAX_LAST_NAME_SIZE]; detail *data;//data pointer to detail struct struct __POOL *pNext;// the address of next point } entry; ``` 在c file的函式: ```clike= entry *pool_create( size_t size ){ entry *p = (entry*)malloc((size*sizeof(entry))); return p; } /* return the memory space at once */ void pool_destroy( entry *p){ free(p); } /* each time alloc a entry size memory */ entry *palloc (entry *p){ return (++p); } ``` 結果: Performance counter stats for './phonebook_memory' (100 runs): 373,575 cache-misses # 33.211 % of all cache refs ( +- 0.54% ) 1,124,861 cache-references ( +- 0.73% ) 170,807,651 instructions # 1.96 insns per cycle ( +- 0.03% ) 87,203,719 cycles ( +- 0.10% ) 0.041390273 seconds time elapsed ( +- 0.12% ) 對比Hash funciton: Performance counter stats for './phonebook_hash' (100 runs): 488,578 cache-misses # 31.665 % of all cache refs ( +- 0.07% ) 1,542,935 cache-references ( +- 0.54% ) 243,589,372 instructions # 2.18 insns per cycle ( +- 0.02% ) 111,539,167 cycles ( +- 0.08% ) 0.052506768 seconds time elapsed ( +- 0.13% ) cache miss上升了一些, 不知道為什麼 效能: >「程式輸出的文字訊息」請勿使用圖片表示,請改由文字貼上[name=課程助教] > ![](https://i.imgur.com/TTimUlF.png) 好開心ˊˇˋ, 比之前縮小struct還快一些~ #### Pthread * 改寫Hash table的structure * 原本笨笨的用兩個大大的structure來放HashHead和HashHead之後的linked list。 ```clike= entry *HashHead[HASH_SIZE], *Hashe[HASH_SIZE]; entry *findName(char lastname[], entry *pHead) { int i; i = HashFunction(lastname); while (HashHead[i] != NULL) { if (strcasecmp(lastname, HashHead[i]->lastName) == 0) { return HashHead[i]; } HashHead[i] = HashHead[i]->pNext; } return NULL; } /* allocate memory for the new entry and put lastName*/ entry *append(char lastName[], entry *e) { e = palloc(e); int i; i = HashFunction(lastName); strcpy(e->lastName, lastName); if(HashHead[i] == NULL) { HashHead[i] = e; Hashe[i]=HashHead[i]; Hashe[i]->pNext=NULL; } else { Hashe[i]->pNext = e; Hashe[i]=Hashe[i]->pNext; Hashe[i]->pNext=NULL; } return e; } ``` 參考[TempoJiJi](https://hackmd.io/MYUwzMCGCckAwFoBMcAmB2BAWa1UMmCIRAEYkA2MddEADjmCyA==#使用pthread降低append時間)的寫法之後, 發現在創一個新的Node的時候, 只要把新Node的pNext放入Hash Head的位址, 再把新Node的位址放進Hash Head中就可以不用像之前一樣判斷Head是否存在或陣列的index,方便做pthread。 除此之外, [TempoJiJi](https://hackmd.io/MYUwzMCGCckAwFoBMcAmB2BAWa1UMmCIRAEYkA2MddEADjmCyA==#使用pthread降低append時間)在struct用pointer to pointer宣告, 之後再用tail[]做array放pointer的寫法讓我對pointer和array又了解了一點。 ```clike= typedef struct __HASH_TABLE { entry **tail; }HashTable; HashTable *hash_ptr; void Initial_HashTable(void){ hash_ptr = (HashTable* )malloc(sizeof(HashTable)); hash_ptr->tail = malloc(HASH_SIZE*sizeof(entry*)); for(i=0;i<HASH_SIZE;i++){ hash_ptr->tail[i]=NULL; } } ``` ~~不過因為用另一個定義在h file裡的pointer來取代原本的phead和e,我不知道__builtin___clear_cache裡面的參數該放什麼。~~ 原本的: ```clike= #if defined(__GNUC__) __builtin___clear_cache((char *) pHead, (char *) pHead + sizeof(entry)); #endif ``` 在6.25 gcc — Built-in Function: void __builtin___clear_cache (char *begin, char *end) This function is used to flush the processor's instruction cache for the region of memory between begin inclusive and end exclusive. Some targets require that the instruction cache be flushed, after modifying memory containing code, in order to obtain deterministic behavior. If the target does not require instruction cache flushes, __builtin___clear_cache has no effect. Otherwise either instructions are emitted in-line to clear the instruction cache or a call to the __clear_cache function in libgcc is made. ~~看不太懂...~~ 這行是在清除之前在cache中pHead被分配的空間, 如果我使用hash_ptr, 不知道是否可以省略 把新改好的Hash 和原本的memory pool結合, 不知道為什麼cache miss大量增加 Performance counter stats for './phonebook_memory' (100 runs): 374,597 cache-misses # 43.750 % of all cache refs ( +- 0.76% ) 856,231 cache-references ( +- 0.53% ) 169,958,884 instructions # 2.09 insns per cycle ( +- 0.04% ) 81,406,863 cycles ( +- 0.14% ) 0.038805268 seconds time elapsed ( +- 0.18% ) 加上pthread之後: Performance counter stats for './phonebook_thread' (100 runs): 1,280,729 cache-misses # 4.501 % of all cache refs ( +- 0.56% ) 28,456,255 cache-references ( +- 0.69% ) 414,598,663 instructions # 0.54 insns per cycle ( +- 0.34% ) 771,199,411 cycles ( +- 0.54% ) 0.116110237 seconds time elapsed ( +- 1.47% ) 似乎有地方出錯... ![](https://i.imgur.com/A5LSmeW.png) 甚至連Hash的版本cache也升高了 Performance counter stats for './phonebook_hash' (100 runs): 479,087 cache-misses # 64.595 % of all cache refs ( +- 0.09% ) 741,675 cache-references ( +- 0.74% ) 241,084,858 instructions # 2.22 insns per cycle ( +- 0.02% ) 108,662,675 cycles ( +- 0.11% ) 0.051187177 seconds time elapsed ( +- 0.15% ) 看了TempoJiJi分享的網頁才知道threa在I/O發揮的效果不大, 雖然還是不太明白為什麼這樣append就會升高那麼多。 ˊˋ在debug很久很久之後發現原本的code在函式的地方沒有把指標放進去,不過cache miss還是蠻高的。試試看mmap() 我的code ```clike= #if defined(MMAP) int j, fd,pagesize; char *data; struct stat file_status; unsigned long int bytes_read,bytes_left, to_read; bytes_read=0; bytes_left=0; to_read=0; /* create a memory pool and initialize hash table*/ hash_ptr = Initial(); /* open the file*/ fd = open(DICT_FILE,O_RDONLY); /* check the size of the file*/ fstat(fd, &file_status); /* check the size of page for mmap to asign virtual address space*/ pagesize = getpagesize(); bytes_left = file_status.st_size; /* clean cache*/ #if defined(__GNUC__) __builtin___clear_cache((char *) hash_ptr, (char *) hash_ptr + HASH_SIZE*sizeof(HashTable)); #endif /* if remain data bigger that pagesize, then just give pagesize data to mmap*/ clock_gettime(CLOCK_REALTIME, &start); while(bytes_left > 0){ if (bytes_left > pagesize){ to_read = pagesize; }else{ to_read = bytes_left; } /* mmap allocate virtual memory for data*/ data = mmap((caddr_t)0, to_read, PROT_READ, MAP_SHARED, fd, bytes_read); ``` 結果: ![](https://i.imgur.com/leOiZFw.png) 有稍微再降了一些些~ 嘗試用pthread做: code: ```clike= void *pmmap(void* ptr){ Pmmap = (mmap_arg* )ptr; /* mmap allocate virtual memory for data*/ Pmmap->data = mmap((caddr_t)0, Pmmap->to_read, PROT_READ, MAP_SHARED, Pmmap->fd, Pmmap->bytes_read); pthread_exit(NULL); } ``` 結果: 8 threads: ![](https://i.imgur.com/59OuT2L.png) 2 threads: --- 來不及做跟字串相關的實作, debug往往花太長時間 * 需不需要free在function裡的node? ### 實際運作 * 在輸入的data沒有很看得懂老師的提示: 本例選用的 dataset (也就是輸入電話簿的姓名) 是否存在問題? 有代表性嗎?跟真實英文姓氏差異大嗎? 資料難道「數大就是美」嗎? 如果我們要建立符合真實電話簿情境的輸入,該怎麼作? 先找了正常last name的資料:[Free First-name and Last-name Databases (CSV and SQL)](http://www.quietaffiliate.com/free-first-name-and-last-name-databases-csv-and-sql/), 我覺得跟之前的dataset做分析比起來, 在真實使用上的lastname會比較少, 可是可能有重複而要比對其他資料的情況。 [C: How to free nodes in the linked list?](http://stackoverflow.com/questions/6417158/c-how-to-free-nodes-in-the-linked-list) ## 參考資料 * [rnic 大大的phonebook](https://embedded2016.hackpad.com/2016q1-Homework-1-9eSkToGwJvZ#:h=OPT2-使用mempool) * [implement own memory pool](http://stackoverflow.com/questions/11749386/implement-own-memory-pool) * Git hub教學參考: * https://kingofamani.gitbooks.io/git-teach/content/chapter_2/repo.html * https://zlargon.gitbooks.io/git-tutorial/content/ * struct: [Difference between -> and . in a struct?](http://stackoverflow.com/questions/5998599/difference-between-and-in-a-struct) * ###### tags:`Ruby` `2016_Autumn` `HW1` ` Phonebook`

    Import from clipboard

    Paste your markdown or webpage here...

    Advanced permission required

    Your current role can only read. Ask the system administrator to acquire write and comment permission.

    This team is disabled

    Sorry, this team is disabled. You can't edit this note.

    This note is locked

    Sorry, only owner can edit this note.

    Reach the limit

    Sorry, you've reached the max length this note can be.
    Please reduce the content or divide it to more notes, thank you!

    Import from Gist

    Import from Snippet

    or

    Export to Snippet

    Are you sure?

    Do you really want to delete this note?
    All users will lose their connection.

    Create a note from template

    Create a note from template

    Oops...
    This template has been removed or transferred.
    Upgrade
    All
    • All
    • Team
    No template.

    Create a template

    Upgrade

    Delete template

    Do you really want to delete this template?
    Turn this template into a regular note and keep its content, versions, and comments.

    This page need refresh

    You have an incompatible client version.
    Refresh to update.
    New version available!
    See releases notes here
    Refresh to enjoy new features.
    Your user state has changed.
    Refresh to load new user state.

    Sign in

    Forgot password

    or

    By clicking below, you agree to our terms of service.

    Sign in via Facebook Sign in via Twitter Sign in via GitHub Sign in via Dropbox Sign in with Wallet
    Wallet ( )
    Connect another wallet

    New to HackMD? Sign up

    Help

    • English
    • 中文
    • Français
    • Deutsch
    • 日本語
    • Español
    • Català
    • Ελληνικά
    • Português
    • italiano
    • Türkçe
    • Русский
    • Nederlands
    • hrvatski jezik
    • język polski
    • Українська
    • हिन्दी
    • svenska
    • Esperanto
    • dansk

    Documents

    Help & Tutorial

    How to use Book mode

    Slide Example

    API Docs

    Edit in VSCode

    Install browser extension

    Contacts

    Feedback

    Discord

    Send us email

    Resources

    Releases

    Pricing

    Blog

    Policy

    Terms

    Privacy

    Cheatsheet

    Syntax Example Reference
    # Header Header 基本排版
    - Unordered List
    • Unordered List
    1. Ordered List
    1. Ordered List
    - [ ] Todo List
    • Todo List
    > Blockquote
    Blockquote
    **Bold font** Bold font
    *Italics font* Italics font
    ~~Strikethrough~~ Strikethrough
    19^th^ 19th
    H~2~O H2O
    ++Inserted text++ Inserted text
    ==Marked text== Marked text
    [link text](https:// "title") Link
    ![image alt](https:// "title") Image
    `Code` Code 在筆記中貼入程式碼
    ```javascript
    var i = 0;
    ```
    var i = 0;
    :smile: :smile: Emoji list
    {%youtube youtube_id %} Externals
    $L^aT_eX$ LaTeX
    :::info
    This is a alert area.
    :::

    This is a alert area.

    Versions and GitHub Sync
    Get Full History Access

    • Edit version name
    • Delete

    revision author avatar     named on  

    More Less

    Note content is identical to the latest version.
    Compare
      Choose a version
      No search result
      Version not found
    Sign in to link this note to GitHub
    Learn more
    This note is not linked with GitHub
     

    Feedback

    Submission failed, please try again

    Thanks for your support.

    On a scale of 0-10, how likely is it that you would recommend HackMD to your friends, family or business associates?

    Please give us some advice and help us improve HackMD.

     

    Thanks for your feedback

    Remove version name

    Do you want to remove this version name and description?

    Transfer ownership

    Transfer to
      Warning: is a public team. If you transfer note to this team, everyone on the web can find and read this note.

        Link with GitHub

        Please authorize HackMD on GitHub
        • Please sign in to GitHub and install the HackMD app on your GitHub repo.
        • HackMD links with GitHub through a GitHub App. You can choose which repo to install our App.
        Learn more  Sign in to GitHub

        Push the note to GitHub Push to GitHub Pull a file from GitHub

          Authorize again
         

        Choose which file to push to

        Select repo
        Refresh Authorize more repos
        Select branch
        Select file
        Select branch
        Choose version(s) to push
        • Save a new version and push
        • Choose from existing versions
        Include title and tags
        Available push count

        Pull from GitHub

         
        File from GitHub
        File from HackMD

        GitHub Link Settings

        File linked

        Linked by
        File path
        Last synced branch
        Available push count

        Danger Zone

        Unlink
        You will no longer receive notification when GitHub file changes after unlink.

        Syncing

        Push failed

        Push successfully