kuoyliu
    • Create new note
    • Create a note from template
      • Sharing URL Link copied
      • /edit
      • View mode
        • Edit mode
        • View mode
        • Book mode
        • Slide mode
        Edit mode View mode Book mode Slide mode
      • Customize slides
      • Note Permission
      • Read
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Write
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Engagement control Commenting, Suggest edit, Emoji Reply
    • Invite by email
      Invitee

      This note has no invitees

    • Publish Note

      Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

      Your note will be visible on your profile and discoverable by anyone.
      Your note is now live.
      This note is visible on your profile and discoverable online.
      Everyone on the web can find and read all notes of this public team.
      See published notes
      Unpublish note
      Please check the box to agree to the Community Guidelines.
      View profile
    • Commenting
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
      • Everyone
    • Suggest edit
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
    • Emoji Reply
    • Enable
    • Versions and GitHub Sync
    • Note settings
    • Note Insights
    • Engagement control
    • Transfer ownership
    • Delete this note
    • Save as template
    • Insert from template
    • Import from
      • Dropbox
      • Google Drive
      • Gist
      • Clipboard
    • Export to
      • Dropbox
      • Google Drive
      • Gist
    • Download
      • Markdown
      • HTML
      • Raw HTML
Menu Note settings Versions and GitHub Sync Note Insights Sharing URL Create Help
Create Create new note Create a note from template
Menu
Options
Engagement control Transfer ownership Delete this note
Import from
Dropbox Google Drive Gist Clipboard
Export to
Dropbox Google Drive Gist
Download
Markdown HTML Raw HTML
Back
Sharing URL Link copied
/edit
View mode
  • Edit mode
  • View mode
  • Book mode
  • Slide mode
Edit mode View mode Book mode Slide mode
Customize slides
Note Permission
Read
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Write
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Engagement control Commenting, Suggest edit, Emoji Reply
  • Invite by email
    Invitee

    This note has no invitees

  • Publish Note

    Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

    Your note will be visible on your profile and discoverable by anyone.
    Your note is now live.
    This note is visible on your profile and discoverable online.
    Everyone on the web can find and read all notes of this public team.
    See published notes
    Unpublish note
    Please check the box to agree to the Community Guidelines.
    View profile
    Engagement control
    Commenting
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    • Everyone
    Suggest edit
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    Emoji Reply
    Enable
    Import from Dropbox Google Drive Gist Clipboard
       owned this note    owned this note      
    Published Linked with GitHub
    Subscribed
    • Any changes
      Be notified of any changes
    • Mention me
      Be notified of mention me
    • Unsubscribe
    Subscribe
    # 2017q1 Homework1 (phonebook) contributed by <`davis8211`> ### Reviewed by `natetang` * 可以考慮將資料宣告為 pointer ,在確定有資料進入後在配置記憶體。[參考](https://hackmd.io/GbDGCYCMAYEMBYC0ATAzAU3Y+qBsBGRSAVnmG1mTHHGX2VwA4g==?both) * 第2張比較圖中,append() 部分中間的數字被遮住了,findName() 部分的資料怎麼會有0秒的?? * 使用hash function 後為何 append() 的時間不降反升?? * 試試不同的 hash function,以及試著縮小 hash table 看看會不會有什麼不同 [參考](https://hackmd.io/GbDGCYCMAYEMBYC0ATAzAU3Y+qBsBGRSAVnmG1mTHHGX2VwA4g==?both) * 尚未回答作業要求的問題 ## 開發環境 ```shell= Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian CPU(s): 4 On-line CPU(s) list: 0-3 Thread(s) per core: 2 Core(s) per socket: 2 Socket(s): 1 NUMA node(s): 1 Vendor ID: GenuineIntel CPU family: 6 Model: 58 Model name: Intel(R) Core(TM) i5-3210M CPU @ 2.50GHz Stepping: 9 CPU MHz: 1282.043 CPU max MHz: 3100.0000 CPU min MHz: 1200.0000 BogoMIPS: 4988.74 Virtualization: VT-x L1d cache: 32K L1i cache: 32K L2 cache: 256K L3 cache: 3072K NUMA node0 CPU(s): 0-3 ``` ## perf * cache * 從記憶體中讀取 instruction 和 data * cache 是一種 SRAM,與處理器處理速度較為接近 * 常用的資料放在 cache 中,處理器便無需等待。 * 早期 main memory 的速度非常慢,但當時的 CPU 速度也慢,因此未有 cache 出現 * 但後來兩者差距拉大,因此需要 cache 來彌補兩者之間的差距 * 為了讓資料存取的速度適應 CPU 的處理速度 * 允許 CPU 直接到 cache memory 查看資料是否存在,減少到 main memory 存取的次數,解決 main memory 存取不夠快的問題 * cache-line * cache 中的最小單位,目前主流中的 cache 的 cache line 大小都是 64B * 分層 * 當 cache 和 main memory 的速度的差距再拉大時,或許會再出現更多層 * L1 cache - 2KB ~ 64KB * L2 cache - 256KB ~ 2MB * L3 cache - 1MB ~ 8MB 仍比 RAM 快 * associativity * 快取記憶體的動態位址轉換,採用硬體實作的技術來完成,每個 CPU 包含 tag RAM,用來記錄一個 memory block 要對應到哪一個 cache block * direct mapping * 1:1,每個 cache block 儘可以對應到唯一的一個 main memory block * 優點,搜尋時間短 * 缺點,hit rate 低 * fully mapping * 任意 cache block 可以對應到任意的 main memory block * 優點,hit rate 高 * 缺點,搜尋時間長,CPU 必須掃果整個 cache 才能決定是否該繼續往 main memory 撈資料 * n-way associative * 把 cache 分成很多 set,CPU 必須檢查指定 set 內的每個 block 是否有可用的 cache * 優點,搜尋時間短且 hit rate 高 * 缺點,Fully Associative 的話,CPU 要檢查整個 cache * cache miss * 在 cache 裡,找不到需要的指令或資料,因此需繼續往 main memory 找,而 latency 時間會越長 * instruction read miss * data read miss * data write miss * pipeline, superscalar, out-of-order execution * pipeline * 同個 clock 可以用不同單元處理不同事情 * superscalar * 同個 clock 可以 issue 多道指令的 pipeline 機器架構 * Intel 的 Pentium 處理器,內部有兩個執行單元 - dual issue * out-of-order execution * 不同指令所需的執行時間和 clock 是不同的,若嚴格按照程序的執行順序執行,就無法充分利用處理器的 pipeline * 相同要求,相鄰的指令相互沒有依賴關係 - dependency * branch prediction * 對軟體效能影響較大 * 如果進入一道指令為 branch。假設處理器順序讀取指令,那麼如果分支的結果是跳躍到其他指令,那麼被處理器 pipeline 所 fetch 的後續兩道指令勢必被棄置,影響性能,因此提供 branch prediction * 根據同一道指令的歷史執行紀錄進行預測,讀取最可能的下一道指令,而非順序讀取指令。 * 依賴 clock 進行定期採樣的 profiler 模式無法闡述程式對這些處理器硬體特性的使用情況。 * PMU * 允許硬體針對某種事件設置 counter * 處理器便開始統計該事件的發生次數,發生次數超過 counter 內設定的數值後,便產生 interrupt。 * 比如 cache miss * 捕捉到 interrupt 後,便可以分析程式對這些硬體特性的使用率了。 * Tracepoints * 散亂在核心原始程式碼的一些 hook * 當核心 * 運行到這些 tracepoint 時,便會通知 perf * Perf 能觸發的事件分為三類 1. hardware - 由 PMU 產生的事件 * cache-misses * cpu-cycles * instructions * branch-misses * 通常是需要了解程序對硬體特性的使用情況會使用 2. software - 核心程式產生的事件 * context-switches * page-faults * cpu-clock * cpu-migrations 3. tracepoint - 核心中的靜態 tracepoint 所觸發的事件,這些 tracepoint 用來判斷在程式執行時期,核心的行為細節,比如 slab 記憶體配置器的配置次數等 * perf_stat_cache_miss ```clike= for (i = 0; i < 10000; i++) for (j = 0; j < 10000; j++) array[j][i]++; ``` * 3,266,946 cache-miss * 101,776,248 cache-reference * 1,609,402,505 instructions * 1,178,927,466 cycles * 0.389176277s * vs ```clike= for (i = 0; i < 10000; i++) for (j = 0; j < 10000; j++) array[i][j]++; ``` * 1,642,250 cacah-miss * 1,748,428 cache-reference * 1,609,304,923 instructions * 944,937,703 cycles * 0.312469410s * 可以發現 * cache-miss 下降了 1,624,696 次 * cache-reference 下降了 100,027,820 次 * cycle 下降了 233,989,763 次 * time 下降了 0.076 秒 * perf_record_example ```clike void normal_loop(int a) { int i; for (i = 0; i < N; i++) array[i] = array[i]+a; } void unroll_loop(int a) { int i; for (i = 0; i < N; i+=5){ array[i] = array[i]+1; array[i+1] = array[i+1]+a; array[i+2] = array[i+2]+a; array[i+3] = array[i+3]+a; array[i+4] = array[i+4]+a; } } ``` * normal_loop 83.65% overhead * unroll_loop 16.02% overhead ## phonebook * 目的 * 探討 cache miss * 複習 C 語言和資料結構 * 改進1 - 減少 srtuct 大小 * 首先改變資料結構,因為 findName 時,只用到 lastName,所以將其他資訊列為另一個 struct,將作為搜尋的 struct 減小。 ```clike typedef struct __PHONE_BOOK_ENTRY { char lastName[MAX_LAST_NAME_SIZE]; char firstName[16]; char email[16]; char phone[10]; char cell[10]; char addr1[16]; char addr2[16]; char city[16]; char state[2]; char zip[5]; struct __PHONE_BOOK_ENTRY *pNext; } entry_detail; typedef struct __LAST_NAME_ENTRY { char lastName[MAX_LAST_NAME_SIZE]; entry_detail *detail; struct __LAST_NAME_ENTRY *pNext; } entry; ``` * 中間 make cache-test 沒辦法直接列出兩個的 stat,重新 clone 一次就解決這個問題 * orig :::warning * size of entry : 136 bytes * execution time of append() : 0.056006 * execution time of findName() : 0.006254 * 1,910,372 cache-misses # 82.095 % of all cache refs * 2,327,026 cache-references * 262,913,899 instructions # 1.18 insns per cycle * 223,236,432 cycles * 0.078547712 seconds time elapsed ::: * opt :::warning * size of entry : 32 bytes * execution time of append() : 0.044978 * execution time of findName() : 0.002536 * 431,091 cache-misses # 63.907 % of all cache refs * 674,566 cache-references * 243,539,078 instructions # 1.57 insns per cycle * 155,166,359 cycles * 0.054267184 seconds time elapsed ::: * make plot 總是繪製出兩條相同的長條圖 * solution 1. 在 phonebook_opt.h 設定 #define OPT 1 2. 或者也可以修改 Makefile * ![](https://i.imgur.com/QdnHXlm.png) * 改進2 - 使用 hash table * djb2 * 專門處理文字字串的 hash function,不管字串長度為何,都可以得到一個 unsigned int 型態的 hash value * 發現 append 時,插入頭端或尾端效率不同,或許是新加入的聯絡人使用率較高? * 插入頭端 * 278,522 cache-misses # 50.214 % of all cache refs * 554,666 cache-reference * 238,986,949 instructions # 1.46 insn per cycle * 163,938,803 cycles * 0.056820536 seconds time elapsed * 插入尾端 * 453,517 cache-misses # 47.658 % of all cache refs * 951,609 cache-reference * 238,172,455 instructions # 1.40 insn per cycle * 170,300,003 cycles * 0.062036628 seconds time elapsed * 首先建立一個 hash table * 在 append 時,對 lastName 用 hash function 算出一個 hashing 值,並存進 hash table。 * 如果 hash table 原本為空,就直接放進去;若有值,就令新的值插入頭端。 * findName 時,先把要找的名字丟到 hash function,求出 hash value 後,到 hash table 依序找,可大幅降低 findName 的時間。 ```clike entry *append(char lastName[], entry *e) { entry *newEntry = (entry *) malloc(sizeof(entry)); strcpy(newEntry->lastName, lastName); // get hash value unsigned int hashValue = djb2Hash(lastName); // check whether hashTable is empty if(hashTable[hashValue] == NULL) { hashTable[hashValue] = newEntry; hashTable[hashValue]->pNext = NULL; } else { newEntry->pNext = hashTable[hashValue]; hashTable[hashValue]->pNext = newEntry; } return newEntry; } ``` * 成效 * ![](https://i.imgur.com/n600pxV.png) * 大幅縮減 findName 的時間,append 可以再想一想怎麼改善。 * 我的電腦配備 * 條件 * cache 總大小事 32KB * 一個 cache 的 block 是 64B * 一個 entry 的大小為 136B + memory control block(8B) = 144B * 8-way set associative * 計算 * 有多少 block? * 350000 筆資料x每筆 144b = 5040000B * 5040000B / 64B x 2次 function = 1574000 block * 得一個 block 是一次 ref * orig 共 2411581 次,與 1574000 量級相同 * cache 有幾個 set * cache 32KB / block 64Byte = 512個 block * 512/8-way set = 64個 * 一個 cache line 可以存幾筆 entry * 144B/64B=2.25 所以是3個 block * 已知是8-way set 所以可以存2個 * cache miss * 1574000(總block)/3(一個 entry 需要 3 block)=524667組 entry * 524667/16 (index數) = 32790(tag數,填到相同 index 的個數) * 由上知一個 cache line 最多可以放 2 組 entry,所以有兩次機會 * 32790/2=16395(可能被對到的次數) * 16395x16(index數)x3(entry 佔 block 數)=786960(cache hit 次數) * 1574000-786960= 787040(cache miss) * 787040/1574000=50%(miss rate) ## hash function * hash function 把訊息或資料壓縮成摘要,使得資料量變小,將資料的格式固定下來。該 function 將資料打散混合,重新建立一個叫做雜湊值(hash value, hash codes, hash sum)。這個雜湊值就是陣列的索引,資料就儲存在這個索引的位置中。 * hash function 性質 1. 運算速度快 2. 不可逆性:無法從雜湊值推回原本的資料 3. 如果兩個雜湊值不同,原始輸入也不相同 4. 如果兩個雜湊值相同,原始輸入也不一定相同 * collision * hash distribution * 好的 hash function 產生的 hash value 應盡可能的均勻分布 * hash function 應用 * 加密 * hash table * 使用雜湊表可以快速的按照關鍵字(雜湊值)尋找資料紀錄。 * reference * https://hackmd.io/s/rJYD4UPKe * https://hackmd.io/s/rkx2IG5T * https://hackmd.io/s/SkAVm0I6 * https://hackmd.io/s/S14L26-_l * https://hackmd.io/s/BJRMImUPg

    Import from clipboard

    Paste your markdown or webpage here...

    Advanced permission required

    Your current role can only read. Ask the system administrator to acquire write and comment permission.

    This team is disabled

    Sorry, this team is disabled. You can't edit this note.

    This note is locked

    Sorry, only owner can edit this note.

    Reach the limit

    Sorry, you've reached the max length this note can be.
    Please reduce the content or divide it to more notes, thank you!

    Import from Gist

    Import from Snippet

    or

    Export to Snippet

    Are you sure?

    Do you really want to delete this note?
    All users will lose their connection.

    Create a note from template

    Create a note from template

    Oops...
    This template has been removed or transferred.
    Upgrade
    All
    • All
    • Team
    No template.

    Create a template

    Upgrade

    Delete template

    Do you really want to delete this template?
    Turn this template into a regular note and keep its content, versions, and comments.

    This page need refresh

    You have an incompatible client version.
    Refresh to update.
    New version available!
    See releases notes here
    Refresh to enjoy new features.
    Your user state has changed.
    Refresh to load new user state.

    Sign in

    Forgot password

    or

    By clicking below, you agree to our terms of service.

    Sign in via Facebook Sign in via Twitter Sign in via GitHub Sign in via Dropbox Sign in with Wallet
    Wallet ( )
    Connect another wallet

    New to HackMD? Sign up

    Help

    • English
    • 中文
    • Français
    • Deutsch
    • 日本語
    • Español
    • Català
    • Ελληνικά
    • Português
    • italiano
    • Türkçe
    • Русский
    • Nederlands
    • hrvatski jezik
    • język polski
    • Українська
    • हिन्दी
    • svenska
    • Esperanto
    • dansk

    Documents

    Help & Tutorial

    How to use Book mode

    Slide Example

    API Docs

    Edit in VSCode

    Install browser extension

    Contacts

    Feedback

    Discord

    Send us email

    Resources

    Releases

    Pricing

    Blog

    Policy

    Terms

    Privacy

    Cheatsheet

    Syntax Example Reference
    # Header Header 基本排版
    - Unordered List
    • Unordered List
    1. Ordered List
    1. Ordered List
    - [ ] Todo List
    • Todo List
    > Blockquote
    Blockquote
    **Bold font** Bold font
    *Italics font* Italics font
    ~~Strikethrough~~ Strikethrough
    19^th^ 19th
    H~2~O H2O
    ++Inserted text++ Inserted text
    ==Marked text== Marked text
    [link text](https:// "title") Link
    ![image alt](https:// "title") Image
    `Code` Code 在筆記中貼入程式碼
    ```javascript
    var i = 0;
    ```
    var i = 0;
    :smile: :smile: Emoji list
    {%youtube youtube_id %} Externals
    $L^aT_eX$ LaTeX
    :::info
    This is a alert area.
    :::

    This is a alert area.

    Versions and GitHub Sync
    Get Full History Access

    • Edit version name
    • Delete

    revision author avatar     named on  

    More Less

    Note content is identical to the latest version.
    Compare
      Choose a version
      No search result
      Version not found
    Sign in to link this note to GitHub
    Learn more
    This note is not linked with GitHub
     

    Feedback

    Submission failed, please try again

    Thanks for your support.

    On a scale of 0-10, how likely is it that you would recommend HackMD to your friends, family or business associates?

    Please give us some advice and help us improve HackMD.

     

    Thanks for your feedback

    Remove version name

    Do you want to remove this version name and description?

    Transfer ownership

    Transfer to
      Warning: is a public team. If you transfer note to this team, everyone on the web can find and read this note.

        Link with GitHub

        Please authorize HackMD on GitHub
        • Please sign in to GitHub and install the HackMD app on your GitHub repo.
        • HackMD links with GitHub through a GitHub App. You can choose which repo to install our App.
        Learn more  Sign in to GitHub

        Push the note to GitHub Push to GitHub Pull a file from GitHub

          Authorize again
         

        Choose which file to push to

        Select repo
        Refresh Authorize more repos
        Select branch
        Select file
        Select branch
        Choose version(s) to push
        • Save a new version and push
        • Choose from existing versions
        Include title and tags
        Available push count

        Pull from GitHub

         
        File from GitHub
        File from HackMD

        GitHub Link Settings

        File linked

        Linked by
        File path
        Last synced branch
        Available push count

        Danger Zone

        Unlink
        You will no longer receive notification when GitHub file changes after unlink.

        Syncing

        Push failed

        Push successfully