---
title: RDMA(紀見如)
tags: Session Three
---

# Ping-pong Measurements

![](https://i.imgur.com/xH3cfyO.png)

## Busy polling

* Polling I/O: when a process reaches a certain point, it issues I/O requests to check the device periodically. The device only needs to place its status in a register, which makes this the simplest way for the two sides to communicate.
* Consumes more CPU resources.

## How to reduce 100% CPU usage

* The cause is “busy polling” while waiting for completions.
* It burns CPU because most calls find nothing.

### Why is “busy polling” used at all?

* It is simple to write such a loop.
* It gives a very fast response to a completion.
* It gives low latency!

“Busy polling” to get completions:

1. start loop
2. `ibv_poll_cq()` to get any completion in the queue
3. exit loop if a completion is found
4. end loop

## How to eliminate “busy polling”

* Cannot make `ibv_poll_cq()` block.
    * Side note on `ibv_poll_cq()`: it queries the Completion Queue (CQ) for completed Work Requests. Every Receive Request, every signaled Send Request, and every Send Request that ends in error generates a Work Completion (WC) when it finishes, and that WC is placed in the CQ.
* The solution is a “wait-for-event” mechanism:
    * `ibv_req_notify_cq()`: tells the CA to send an “event” when the next WC enters the CQ. It requests a completion notification on the completion queue.
    * `ibv_get_cq_event()`: blocks until it gets an “event”. It waits for the next notification on a given completion channel.
    * `ibv_ack_cq_events()`: acknowledges the “event”. It confirms that the completion events have been handled.

## “wait-for-event” to get completions

1. start loop
2. `ibv_poll_cq()` to get any completion in the CQ
3. exit loop if a completion is found
4. `ibv_req_notify_cq()` to arm the CA to send an event on the next completion added to the CQ
5. `ibv_poll_cq()` again to pick up any completion that arrived between steps 2 and 4 (a completion that lands before the CQ is armed generates no event)
6. exit loop if a completion is found
7. `ibv_get_cq_event()` to wait until the CA sends an event
8. `ibv_ack_cq_events()` to acknowledge the event
9. end loop

![](https://i.imgur.com/5E78J6D.png)

![](https://i.imgur.com/RnYYo1f.png)

![](https://i.imgur.com/vjZIfdP.png)

## Collective Communication Operations

* Collective communication is communication that involves a group of processing elements (termed nodes in this entry) and effects a data transfer between all or some of these processing elements.
In plain terms: it is communication that involves a group of processing elements and moves data between all or some of them.
* The importance of collective communications is derived from the fact that many frequently used parallel algorithms, such as sorting, searching, and matrix manipulation, share data among groups of processes. Put simply: because many commonly used parallel algorithms share data, collective communication matters a great deal when handling that data.

## Types of collective communication

* Broadcast: A source process sends identical data to all other processes.
* Scatter: A source process sends a distinct message to each of the other processes.
* Gather: This operation is the reverse of scatter: one process collects a distinct message from each of the other processes.
* All-to-all broadcast: Every process communicates the same data to every other process.
* All-to-all personalized exchange: Every process communicates unique data to each other process.

##### This comes from https://www.sciencedirect.com/topics/computer-science/collective-communication
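
The five operations listed above can be illustrated with a small in-memory sketch. This is not real network code and is not taken from the source: each "process" is simply an index into a Python list, and all function names (`broadcast`, `scatter`, `gather`, `all_to_all_broadcast`, `all_to_all_personalized`) are hypothetical names chosen for this illustration. It also assumes, for simplicity, that the source process keeps one chunk for itself in scatter and gather.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def broadcast(root_value: T, num_procs: int) -> List[T]:
    """Broadcast: every process ends up holding the source's value."""
    return [root_value for _ in range(num_procs)]


def scatter(root_chunks: Sequence[T]) -> List[T]:
    """Scatter: process i receives the i-th distinct chunk from the source.

    Assumption for this sketch: the source also keeps one chunk itself.
    """
    return list(root_chunks)


def gather(per_proc_values: Sequence[T]) -> List[T]:
    """Gather (reverse of scatter): one process collects a value from
    every process, ordered by process index."""
    return list(per_proc_values)


def all_to_all_broadcast(per_proc_values: Sequence[T]) -> List[List[T]]:
    """All-to-all broadcast: each process sends its single value to all
    others, so every process ends up with the full list."""
    return [list(per_proc_values) for _ in per_proc_values]


def all_to_all_personalized(send: Sequence[Sequence[T]]) -> List[List[T]]:
    """All-to-all personalized exchange.

    send[src][dst] is the unique message that process src sends to
    process dst. The result recv[dst][src] is what dst received from src,
    i.e. the transpose of the send matrix.
    """
    n = len(send)
    if any(len(row) != n for row in send):
        raise ValueError("send must be an n x n matrix of messages")
    return [[send[src][dst] for src in range(n)] for dst in range(n)]


if __name__ == "__main__":
    print(broadcast("cfg", 3))
    print(scatter(["a", "b", "c"]))
    print(gather([10, 20, 30]))
    print(all_to_all_broadcast([1, 2, 3]))
    print(all_to_all_personalized([["0->0", "0->1"], ["1->0", "1->1"]]))
```

Seen this way, scatter and gather are inverses of each other, and the personalized exchange amounts to transposing the matrix of messages, which is why it moves the most distinct data of the five.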