Parallel_Programming
      • Sharing URL Link copied
      • /edit
      • View mode
        • Edit mode
        • View mode
        • Book mode
        • Slide mode
        Edit mode View mode Book mode Slide mode
      • Customize slides
      • Note Permission
      • Read
        • Owners
        • Signed-in users
        • Everyone
        Owners Signed-in users Everyone
      • Write
        • Owners
        • Signed-in users
        • Everyone
        Owners Signed-in users Everyone
      • Engagement control Commenting, Suggest edit, Emoji Reply
    • Invite by email
      Invitee

      This note has no invitees

    • Publish Note

      Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note No publishing access yet

      Your note will be visible on your profile and discoverable by anyone.
      Your note is now live.
      This note is visible on your profile and discoverable online.
      Everyone on the web can find and read all notes of this public team.

      Your account was recently created. Publishing will be available soon, allowing you to share notes on your public page and in search results.

      Your team account was recently created. Publishing will be available soon, allowing you to share notes on your public page and in search results.

      Explore these features while you wait
      Complete general settings
      Bookmark and like published notes
      Write a few more notes
      Complete general settings
      Write a few more notes
      See published notes
      Unpublish note
      Please check the box to agree to the Community Guidelines.
      View profile
    • Commenting
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
      • Everyone
    • Suggest edit
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
    • Emoji Reply
    • Enable
    • Versions and GitHub Sync
    • Note settings
    • Note Insights New
    • Engagement control
    • Make a copy
    • Transfer ownership
    • Delete this note
    • Insert from template
    • Import from
      • Dropbox
      • Google Drive
      • Gist
      • Clipboard
    • Export to
      • Dropbox
      • Google Drive
      • Gist
    • Download
      • Markdown
      • HTML
      • Raw HTML
Menu Note settings Note Insights Versions and GitHub Sync Sharing URL Help
Menu
Options
Engagement control Make a copy Transfer ownership Delete this note
Import from
Dropbox Google Drive Gist Clipboard
Export to
Dropbox Google Drive Gist
Download
Markdown HTML Raw HTML
Back
Sharing URL Link copied
/edit
View mode
  • Edit mode
  • View mode
  • Book mode
  • Slide mode
Edit mode View mode Book mode Slide mode
Customize slides
Note Permission
Read
Owners
  • Owners
  • Signed-in users
  • Everyone
Owners Signed-in users Everyone
Write
Owners
  • Owners
  • Signed-in users
  • Everyone
Owners Signed-in users Everyone
Engagement control Commenting, Suggest edit, Emoji Reply
  • Invite by email
    Invitee

    This note has no invitees

  • Publish Note

    Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note No publishing access yet

    Your note will be visible on your profile and discoverable by anyone.
    Your note is now live.
    This note is visible on your profile and discoverable online.
    Everyone on the web can find and read all notes of this public team.

    Your account was recently created. Publishing will be available soon, allowing you to share notes on your public page and in search results.

    Your team account was recently created. Publishing will be available soon, allowing you to share notes on your public page and in search results.

    Explore these features while you wait
    Complete general settings
    Bookmark and like published notes
    Write a few more notes
    Complete general settings
    Write a few more notes
    See published notes
    Unpublish note
    Please check the box to agree to the Community Guidelines.
    View profile
    Engagement control
    Commenting
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    • Everyone
    Suggest edit
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    Emoji Reply
    Enable
    Import from Dropbox Google Drive Gist Clipboard
       Owned this note    Owned this note      
    Published Linked with GitHub
    • Any changes
      Be notified of any changes
    • Mention me
      Be notified of mention me
    • Unsubscribe
    --- tags: para., MPI, distribution system, message passing --- # MPI Programming ## Q1: hello world ### 1. How do you control the number of MPI processes on each node? * 使用 -npernode 參數調整每個 node 的 processes,如下面指令。 ``` $ mpirun -np <# of processes> -npernode per node> --hostfile <hostfile> myprog ``` ### 2. Which functions do you use for retrieving the rank of an MPI process and the total number of processes? Retrieving the rank: ``` int rank; MPI_Comm_rank( MPI_COMM_WORLD, &rank); ``` Retrieving the total number of processes: ``` int size; MPI_Comm_size( MPI_COMM_WORLD, &size); ``` ### 3. We use Open MPI for this assignment. What else MPI implementation is commonly used? What is the difference between them? * 除了 Open MPI 以外 Intel MPI 也能實現 MPI。 | | Open MPI | Intel MPI | |:--- |:------------:|:--------------------- | | 優 | 硬體兼容性好 | 適用於 Intel 自家產品 | | 劣 | 效率較差 | 效率較佳 | <br><br> ## Q2: block_linear ### 1. Why MPI_Send and MPI_Recv are called “blocking” communication? (1 points) * 使用 MPI_Send 指令後,須等到有人接收訊息才會繼續往下執行,MPI_Recv 則是須等到有人送出訊息,讓該行程式接收到訊息才會繼續往下執行,因此稱為 “blocking” communication ### 2. Measure the performance (execution time) of the code for 2, 4, 8, 12, 16 MPI processes and plot it. (1 points) 執行所需花費的時間隨著 Processes total number 下降。 ![](https://i.imgur.com/A5bMFPX.png) <br><br> ## Q3: block_tree ### 1. Measure the performance (execution time) of the code for 2, 4, 8, 16 MPI processes and plot it. (1 points) * 執行所花費的時間如下圖 ![](https://i.imgur.com/Cg72t8E.png) ### 2. How does the performance of binary tree reduction compare to the performance of linear reduction? (2 points) 統計結果如下表,從結果來看兩者的表現沒有顯著差異,但是第三題的時候將問題放大後處理就能看出,在 processes 數多的情況下,linear 會更具優勢。 | method\NP | 2 | 4 | 8 | 16 | | --------- | ----- | ----- | ----- | ----- | | linear | 6.72s | 3.47s | 1.73s | 0.92s | | tree | 6.74s | 3.47s | 1.74s | 0.89s | ### 3. Increasing the number of processes, which approach (linear/tree) is going to perform better? Why? Think about the number of messages and their costs. (3 points) * 我將 tosses 次數增加成上面統計的 100 倍後,linear 和 tree 的表現如下圖。 * 在NP number 2 和 4 的時候,tree 的表現較佳,但是到 NP 8 和 16 的時候結果相反。 * 隨著 processes 數增加,tree 需要更多次進入迴圈執行 message commuication,因此相較於 linear 總是只在最後使用一個迴圈接收訊息,tree 在溝通上花費更高。 ![](https://i.imgur.com/gjLWMPR.png) <br><br> ## Q4: non-blocking linear ### 1. Measure the performance (execution time) of the code for 2, 4, 8, 12, 16 MPI processes and plot it. (1 points) * 執行所花費的時間如下圖 ![](https://i.imgur.com/A5yMUCj.png) ### 2. What are the MPI functions for non-blocking communication? (1 points) 在 non-blocking 的通訊中 `MPI_Isend()` 用來送出訊息,`MPI_Irecv()`用來接收訊息。 ### 3. How the performance of non-blocking communication compares to the performance of blocking communication? (3 points) 從結果我們可以發現 在這個任務中 non-blocking 雖然和 blocking 表現差不多,但的整體來說還是略好於 blocking 的表現。 ![](https://i.imgur.com/HSEcTSp.png) <br><br> ## Q5: gather ### 1. Measure the performance (execution time) of the code for 2, 4, 8, 12, 16 MPI processes and plot it. (1 points) * 執行所花費的時間如下圖 ![](https://i.imgur.com/xaSas53.png) <br><br> ## Q6: Reduce ### 1. Measure the performance (execution time) of the code for 2, 4, 8, 12, 16 MPI processes and plot it. (1 points) * 執行所花費的時間如下圖 ![](https://i.imgur.com/qijLwVX.png) <br><br> ## Q7: one-side communication ### 1. Measure the performance (execution time) of the code for 2, 4, 8, 12, 16 MPI processes and plot it. (1 points) * 執行所花費的時間如下圖 ![](https://i.imgur.com/t5UUIeC.png) ### 2. Which approach gives the best performance among the 1.2.1-1.2.6 cases? What is the reason for that? (3 points) * 從下表統計來看,nonblock linear 的表現是最佳的,reduce 次之,one-side 最差 * 我覺得也許使用 MPI_Irecv + MPI_Wait 會比使用 MPI_Recv 還快,即使他們看起來是等價的,但是不需要等待接收到的過程還是節省了一點時間,因此雖然差距不大但 nonblocking 通常有較好的結果。 | method\NP | 2 | 4 | 8 | 12 | 16 | |:--------------- |:----------- |:----------- |:----------- |:----------- |:----------- | | block_linear | 6.74 | 3.47 | 1.74 | 1.17 | 0.89 | | block_tree | <b>6.72</b> | 3.47 | <b>1.73</b> | | 0.92 | | nonblock_linear | <b>6.72</b> | <b>3.46</b> | <b>1.73</b> | 1.19 | <b>0.88</b> | | gather | 6.74 | 3.47 | 1.75 | <b>1.16</b> | 0.90 | | reduce | <b>6.72</b> | <b>3.46</b> | 1.74 | <b>1.16</b> | 0.89 | | one-side | 13.45 | 6.93 | 3.49 | 2.33 | 1.78 | ### 3. Which algorithm or algorithms do MPI implementations use for reduction operations? You can research this on the WEB focusing on one MPI implementation. (1 points) * 使用 reduction operation 來實現 rank-order-based deterministic algorithms。 * 運用實驗通常必須反覆一直 Validation 的特性,使用固定 reduction 的操作來加速例行公事。 * ref: [On the Reproducibility of MPI Reduction perations](https://www.mcs.anl.gov/papers/P4093-0713_1.pdf) <br><br> ## Q8: Ping-Pong > case 1: intra-node communication (two processes on the same node), and case 2: inter-node communication (two processes on different nodes). You should consider more different nodes in this case. ### 1. Plot ping-pong time in function of the message size for cases 1 and 2, respectively. (2 points) case1 指令: `mpicxx ping_pong.c -o ping_pong; mpirun -np 2 -npernode 2 --hostfile hosts ping_pong` ![](https://i.imgur.com/eQR2c5A.png) case2 指令: `mpicxx ping_pong.c -o ping_pong; mpirun -np 2 -npernode 1 --hostfile hosts ping_pong` (此實驗採用 two processes on four different nodes.) ![](https://i.imgur.com/7EEfz8m.png) ### 2. Calculate the bandwidth and latency for cases 1 and 2, respectively. (3 points) bandwidth: 斜率倒數,latency: 與y軸截面 case1: bandwidth: 6723345886.513239 ~= 6.72 GB/s latency: 0.0007381264050581177 ~= 7.381 * 10^-4 s case2: bandwidth: 97132541.8124317 ~= 0.97 GB/s latency: 0.0011550466383125624 ~= 1.155 * 10^-3 s ### 3. For case 2, how do the obtained values of bandwidth and latency compare to the nominal network bandwidth and latency of the NCTU-PP workstations. What are the differences and what could be the explanation of the differences if any? (4 points) 我使用 `ping -U 192.168.110.91` 指令來計算 pp6 的 bandwidth,ping 結果如下圖, ![](https://i.imgur.com/nHxsAyT.png) time 取平均值0.5235 ms。 NCTU-PP workstations 所聲稱的 bandwidth = 64*8 / 0.5235 = 978.032 ~= 0.978 GB / s NCTU-PP workstations 所聲稱的 lantency = 0.0005235 s ~= 5.2 * 10^-4s 我覺得利用 ping pong 演算法所計算出的 **bandwidth 和 lantency 已經算是很接近NCTU-PP所聲稱的結果**。但就可能導致算出來不太一樣的因素討論的話,我覺得有可能因為使用 ping pong 計算時間很久,如果有其他同學中途使用該工作站,就有可能會導致計算出現誤差。 <br><br> ## Q9: Part Two: Matrix Multiplication with MPI ### 1. Describe what approach(es) were used in your MPI matrix multiplication for each data set. * 在MPI 方面我先在 rank 0 將資料進行讀取後,使用 non-blocking 的方式來將 data 傳送到各個 processes 進行計算。再來將輸出的陣列依照列來分割給每個 processes 進行計算,最後一樣透過 non-blocking send 傳送結果回 rank 0 進行整合。 * 在矩陣乘法方面,A 矩陣乘以 B 矩陣,我將 B 矩陣進行"轉置"如此在計算的時候陣列讀取記憶體會變成連續的,這樣變能夠加快計算速度,這個方法幫助我在計算 data2的速度加快了將近2倍。 * 採用 SIMD AVX 256 bit register 來進行計算,如此可以一次計算八個整數(int)很直觀的加速了計算的速度。 * 我在矩陣乘法使用 `cache-friendly`的方式計算,這樣可以減少陣列讀取的次數,對於計算加速也有幫助,演算法如下: ``` // 如此一來,重複取用 a[i][k] 且每個矩陣都是以步長為 1 取用變數。 for (int i = 0; i < a.row; ++i) for (int k = 0; k < a.col; ++k) { int r = a.values[i][k]; for (int j = 0; j < b.col; ++j) dst->values[i][j] += r * b.values[k][j]; } ``` 最後在測試結果上在 data1 和 data2 的結果如下: > data1: 0.1897 s > data2: 3.8303 s data1: ![](https://i.imgur.com/W3YmT0q.png =340x) ![](https://i.imgur.com/A6REOd8.png =320x) 參考 MPI 筆記: http://cic-2012.pccu.edu.tw/HPC/pdf/MPI2.pdf 矩陣乘法: https://hackmd.io/@siahuat0727/HJscCbWgV?type=view#SIMD-AVX

    Import from clipboard

    Paste your markdown or webpage here...

    Advanced permission required

    Your current role can only read. Ask the system administrator to acquire write and comment permission.

    This team is disabled

    Sorry, this team is disabled. You can't edit this note.

    This note is locked

    Sorry, only owner can edit this note.

    Reach the limit

    Sorry, you've reached the max length this note can be.
    Please reduce the content or divide it to more notes, thank you!

    Import from Gist

    Import from Snippet

    or

    Export to Snippet

    Are you sure?

    Do you really want to delete this note?
    All users will lose their connection.

    Create a note from template

    Create a note from template

    Oops...
    This template has been removed or transferred.
    Upgrade
    All
    • All
    • Team
    No template.

    Create a template

    Upgrade

    Delete template

    Do you really want to delete this template?
    Turn this template into a regular note and keep its content, versions, and comments.

    This page need refresh

    You have an incompatible client version.
    Refresh to update.
    New version available!
    See releases notes here
    Refresh to enjoy new features.
    Your user state has changed.
    Refresh to load new user state.

    Sign in

    Forgot password
    or
    Sign in via Google Sign in via Facebook Sign in via X(Twitter) Sign in via GitHub Sign in via Dropbox Sign in with Wallet
    Wallet ( )
    Connect another wallet

    New to HackMD? Sign up

    By signing in, you agree to our terms of service.

    Help

    • English
    • 中文
    • Français
    • Deutsch
    • 日本語
    • Español
    • Català
    • Ελληνικά
    • Português
    • italiano
    • Türkçe
    • Русский
    • Nederlands
    • hrvatski jezik
    • język polski
    • Українська
    • हिन्दी
    • svenska
    • Esperanto
    • dansk

    Documents

    Help & Tutorial

    How to use Book mode

    Slide Example

    API Docs

    Edit in VSCode

    Install browser extension

    Contacts

    Feedback

    Discord

    Send us email

    Resources

    Releases

    Pricing

    Blog

    Policy

    Terms

    Privacy

    Cheatsheet

    Syntax Example Reference
    # Header Header 基本排版
    - Unordered List
    • Unordered List
    1. Ordered List
    1. Ordered List
    - [ ] Todo List
    • Todo List
    > Blockquote
    Blockquote
    **Bold font** Bold font
    *Italics font* Italics font
    ~~Strikethrough~~ Strikethrough
    19^th^ 19th
    H~2~O H2O
    ++Inserted text++ Inserted text
    ==Marked text== Marked text
    [link text](https:// "title") Link
    ![image alt](https:// "title") Image
    `Code` Code 在筆記中貼入程式碼
    ```javascript
    var i = 0;
    ```
    var i = 0;
    :smile: :smile: Emoji list
    {%youtube youtube_id %} Externals
    $L^aT_eX$ LaTeX
    :::info
    This is a alert area.
    :::

    This is a alert area.

    Versions and GitHub Sync
    Get Full History Access

    • Edit version name
    • Delete

    revision author avatar     named on  

    More Less

    Note content is identical to the latest version.
    Compare
      Choose a version
      No search result
      Version not found
    Sign in to link this note to GitHub
    Learn more
    This note is not linked with GitHub
     

    Feedback

    Submission failed, please try again

    Thanks for your support.

    On a scale of 0-10, how likely is it that you would recommend HackMD to your friends, family or business associates?

    Please give us some advice and help us improve HackMD.

     

    Thanks for your feedback

    Remove version name

    Do you want to remove this version name and description?

    Transfer ownership

    Transfer to
      Warning: is a public team. If you transfer note to this team, everyone on the web can find and read this note.

        Link with GitHub

        Please authorize HackMD on GitHub
        • Please sign in to GitHub and install the HackMD app on your GitHub repo.
        • HackMD links with GitHub through a GitHub App. You can choose which repo to install our App.
        Learn more  Sign in to GitHub

        Push the note to GitHub Push to GitHub Pull a file from GitHub

          Authorize again
         

        Choose which file to push to

        Select repo
        Refresh Authorize more repos
        Select branch
        Select file
        Select branch
        Choose version(s) to push
        • Save a new version and push
        • Choose from existing versions
        Include title and tags
        Available push count

        Pull from GitHub

         
        File from GitHub
        File from HackMD

        GitHub Link Settings

        File linked

        Linked by
        File path
        Last synced branch
        Available push count

        Danger Zone

        Unlink
        You will no longer receive notification when GitHub file changes after unlink.

        Syncing

        Push failed

        Push successfully