COSCUP
      • Sharing URL Link copied
      • /edit
      • View mode
        • Edit mode
        • View mode
        • Book mode
        • Slide mode
        Edit mode View mode Book mode Slide mode
      • Customize slides
      • Note Permission
      • Read
        • Owners
        • Signed-in users
        • Everyone
        Owners Signed-in users Everyone
      • Write
        • Owners
        • Signed-in users
        • Everyone
        Owners Signed-in users Everyone
      • Engagement control Commenting, Suggest edit, Emoji Reply
      • Invitee
    • Publish Note

      Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

      Your note will be visible on your profile and discoverable by anyone.
      Your note is now live.
      This note is visible on your profile and discoverable online.
      Everyone on the web can find and read all notes of this public team.
      See published notes
      Unpublish note
      Please check the box to agree to the Community Guidelines.
      View profile
    • Commenting
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
      • Everyone
    • Suggest edit
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
    • Emoji Reply
    • Enable
    • Versions and GitHub Sync
    • Note settings
    • Engagement control
    • Transfer ownership
    • Delete this note
    • Insert from template
    • Import from
      • Dropbox
      • Google Drive
      • Gist
      • Clipboard
    • Export to
      • Dropbox
      • Google Drive
      • Gist
    • Download
      • Markdown
      • HTML
      • Raw HTML
Menu Note settings Sharing URL Help
Menu
Options
Versions and GitHub Sync Engagement control Transfer ownership Delete this note
Import from
Dropbox Google Drive Gist Clipboard
Export to
Dropbox Google Drive Gist
Download
Markdown HTML Raw HTML
Back
Sharing URL Link copied
/edit
View mode
  • Edit mode
  • View mode
  • Book mode
  • Slide mode
Edit mode View mode Book mode Slide mode
Customize slides
Note Permission
Read
Owners
  • Owners
  • Signed-in users
  • Everyone
Owners Signed-in users Everyone
Write
Owners
  • Owners
  • Signed-in users
  • Everyone
Owners Signed-in users Everyone
Engagement control Commenting, Suggest edit, Emoji Reply
Invitee
Publish Note

Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

Your note will be visible on your profile and discoverable by anyone.
Your note is now live.
This note is visible on your profile and discoverable online.
Everyone on the web can find and read all notes of this public team.
See published notes
Unpublish note
Please check the box to agree to the Community Guidelines.
View profile
Engagement control
Commenting
Permission
Disabled Forbidden Owners Signed-in users Everyone
Enable
Permission
  • Forbidden
  • Owners
  • Signed-in users
  • Everyone
Suggest edit
Permission
Disabled Forbidden Owners Signed-in users Everyone
Enable
Permission
  • Forbidden
  • Owners
  • Signed-in users
Emoji Reply
Enable
Import from Dropbox Google Drive Gist Clipboard
   owned this note    owned this note      
Published Linked with GitHub
Subscribed
  • Any changes
    Be notified of any changes
  • Mention me
    Be notified of mention me
  • Unsubscribe
Subscribe
從 C++11 規格和 Template Meta-Programming 的角度欣賞 Boost 對 atomic 和 memory order 的支援 - 翁敏維 === {%hackmd LcL4-VI0SGitSiINTuy4rA %} > [簡報模式](https://hackmd.io/@butastur/coscup-2019-boost-atomic-memory-order) --- ## Agenda - Introduction - Why needs Template metaprogramming? - What is memory order? - How [C++11](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3376.pdf) supports memory order? - Boost.Atomic --- ### Introduction 為什麼要看 C++11 規格? ```cpp= void push(const T &data) { node * n = new node; n->data = data; node * stale_head = head_.load(boost::memory_order_relaxed); do { n->next = stale_head; } while (!head_.compare_exchange_weak(stale_head, n, boost::memory_order_release)); } ... node * pop_all_reverse(void) { return head_.exchange(0, boost::memory_order_consume); } private: boost::atomic<node *> head_; ``` <small>Reference [Boost.Atomic, Usage example](https://www.boost.org/doc/libs/1_66_0/doc/html/atomic/usage_examples.html#boost_atomic.usage_examples.mp_queue.implementation)</small> ---- - <small>上面這段節錄自 [Boost.Atomic, Usage example](https://www.boost.org/doc/libs/1_66_0/doc/html/atomic/usage_examples.html#boost_atomic.usage_examples.mp_queue.implementation),由於直接看 code 並不是有效率的學習方式,<font color='red'>甚至可能得到錯誤的認知</font>,所以從理解 C++11 規格開始。</small> - <small>透過理解 C++11 規格,看到上面這段 code 就不必用猜的,也不必用 try and error ,是一種比較有效率的學習路線。</small> --- ## Why needs Template metaprogramming? ---- 先澄清一件事,<br> 不是直接用 templates 程式就會變快。 ---- 雖然 template 會在 compile-time 作展開和運算,但如果 <font color='red'>僅僅只是</font> 直接把 function 改成 template,就如同<font color='red'>直接換成 inline 或 macro一樣</font>,執行的時間是沒有差別的 ---- 作一個實驗來說明這個概念 ---- ```cpp // templates.cpp template <class T> T add(T a, T b){ return a + b; } int main(){ int ans = 0; int a = 1, b = 2; for(int i = 0; i < 1000000; ++i){ ans += add(a, b); } } ``` ```cpp // no-templates.cpp int add(int a, int b){ return a + b; } int main(){ int ans = 0; int a = 1, b = 2; for(int i = 0; i < 1000000; ++i){ ans += add(a, b); } } ``` ---- ``` Performance counter stats for './templates' (10000 runs): 8,911,140 cycles ( +- 0.09% ) 0.003526686 seconds time elapsed ( +- 0.11% ) Performance counter stats for './no-templates' (10000 runs): 9,010,668 cycles ( +- 0.11% ) 0.003570547 seconds time elapsed ( +- 0.13% ) ``` - 是不是 templates 的差異只有 0.05 ms - 0.05 ms 是執行 1 萬次的平均時間 - <font color='red'>並不是直接換成 template 就會變快</font> ---- <font color='red'>並不是 templates 不會讓執行時間縮短</font>,而是因為剛剛的測試範例只是直接把 function 換成 templates ---- 用 templates 實作出<font color='red'>在 compile-time 運算來縮短執行時間</font>,同時具有 inline 和 macro 所不具有的優點,<font color='red'>並不是這一次 talk 的主題</font> ---- 那 template metaprogramming 要探討什麼? ---- 有。 write generic code ---- 用對照組,體會一下是不是 generic code 的差異 ---- ```cpp unsigned int calc(unsigned int a, unsigned int b, unsigned int c){ return a ^ 2 + b * 3 + c * 2; } int calc(int a, int b, int c){ return a * 2 + b * 2 + c ^ 2; } long calc(long a, long b, long c){ return a * 3 + b ^ 2 + c ^ 3; } ``` <small>這段看起來,我們可能會這麼<font color='red'>覺得</font>:</small> 1. 這是 overloading,只是 <font color='red'>其中幾個寫錯</font> - <small>那麼哪一個寫錯?</small> - <small>還是有兩個寫錯?</small> - <small>哪麼是哪兩個寫錯?</small> 2. 不同的 types 的計算方式<font color='red'>本來就不一樣</font> ---- 剛剛我們示範了一個讓人用 <font color='red'>覺得</font> 來判斷的 code,這在可讀性上來說,<font color='red'>實在不妥</font> ---- 或許可以用加註解來解決可讀性, 不過上面這個例子有更好的解法 ---- 更好的解決方法是 write generic code,所以我們用 templates metaprogramming,修改後是這樣子 ```cpp template<class T> T calc(T a, T b, T c){ return a ^2 + b^2 + c^2; } ``` <small>當然,我們假設上面的例子,每一個 calc 的計算方式一樣</small> --- ## What is memory order? - reorder - out-of-order ---- ### reorder ``` shared memory: x = 0; y = 0 P0 | P1 ; a: MOV [x],$1 | c: MOV [y],$1 ; b: MOV EAX,[y] | d: MOV EAX,[x] ; ``` - a, b, c, d 的執行順序有沒有可能是 <font color='red'>b, d</font>, c, a - 由於 CPU 會進行 reorder,所以<font color='red'>未必會照著順序執行</font>,因此在這個範例中,processor 0 的 EAX 和 processor 1 的 EAX 有可能都是 0 (只是經過測試,100 萬次裡出現的次數是個位數) ---- reorder 在不同的 CPU 會有不一樣的行為 ---- reorder 之後<font color="red"> **有可能** </font>是照著這個順序執行 ``` shared memory: x = 0; y = 0 P0 | P1 ; b: MOV EAX,[y] | d: MOV EAX,[x] ; | c: MOV [y],$1 ; a: MOV [x],$1 | ; ``` ---- - compiler 會對 program order 作 reorder,而 CPU 也會對 instruction 執行的順序作 reorder。 - 阻止全部的 reorder 雖然也是一種方法,但這樣會產生 <font color='red'>效率低落</font> 的結果。所以需要研究 memory order,讓工程師能對 reorder 作一些控制。 ---- ## out-of-order ---- 我們從 [instruction pipelining](https://en.wikipedia.org/wiki/Instruction_pipelining) 來理解 out-of-order ### in-order ``` 1 2 3 4 5 6 7 8 fetch | fetch1 fetch2 decode | decode1 decode2 execute | execute1 execute2 write-back | write1 write1 ``` 如果是照著以上的順序來執行 2 個 instruction,會用掉 8 個 cycle 來執行 2 個 instruction。 ---- ### out-of-order 但如果是 OoOE,**可能**會是以下的順序,這樣就可以只用 5 個 cycle 執行 2 個 instruction ``` 1 2 3 4 5 fetch | fetch1 fetch2 decode | decode1 decode2 execute | execute1 execute2 write-back | write1 write1 ``` 但是 out-of-order 會有一個延伸的問題,`execute2` <font color='red'>如果</font>會從 register 讀取一個值,而這個值要 `write1` 執行後才會更新... <font color='red'>那 `execute2` 的結果?</font> ---- 現在我們知道 program order 是我們從 source code 看到的順序,但是實際執行的順序,未必和 program order 一樣。 ---- CPU 會對 instruction 作 reordering,所以我們看到的 assembly 的 program order 未必是執行順序 如果我們寫的是 C++11 的 code,program order 是不是也有可能不一樣? ---- 對,program order 是允許不一樣的,但 C++11 支援不同的 memory order,可以控制實際執行順序 --- #### How [C++11](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3376.pdf) supports memory order? - [data races](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3376.pdf#section.1.10) - synchronization operations / atomic operations - memory_order_seq_cst - memory_order_relaxed - 如何透過 memory model 建立 happens-before 的關係 ---- #### 從 C++11 規格來看 [data races](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3376.pdf#section.1.10) <small>C++11 1.10/21</small> <small>The execution of a program contains a **data race** if it contains two **conflicting** actions in different threads, at least one of which is **not atomic**, and neither **happens before** the other.</small> <small>Reference http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3376.pdf</small> ---- 其中一個 thread 的 operation 不是 atomic operation,而且也沒有 <font color='red'>happens before 這種關係</font> - synchronization operations - atomic operation - 如何建立 happens before 這種關係? ---- A **synchronization** operation on one or more memory locations is either a **consume** operation, an acquire operation, a **release** operation, or both an acquire and release operation. <small>Reference http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3376.pdf#section.1.10</small> <small>synchronization operation 是要在不同的執行緒之間同步,等一下我們會在範例中看到這是如何運作的</small> <small>在 wait-free multi-producer queue 會用到 memory_order_relaxed、memory_order_release 和 memory_order_consume</small> <small>wait-free multi-producer queue 就是一開始提到的 Boost 的範例</small> ---- 特別要留意的地方是: relaxed operations 雖然是 atomic,但不是 synchronization operations <small> In addition, there are relaxed atomic operations, which are <font color='red'>not synchronization operations</font>, and atomic read modify-write operations</small> <small>Reference [C++11 1.10 Multi-threaded executions and data races](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3376.pdf#section.1.10)</small> ---- [C++11, page 1115]() 說明了 memory_order_seq_cst 是一種 single total order <small>**consistent** with the “happens before” order and modification orders for **all** affected locations</small> ---- 因為 memory_order_seq_cst 會維持所有執行緒的同步,考慮到效能,所以需要再理解其他的 memor order,例如接下來要探討的 memory_order_relaxed 、memory_order_release 和 memory_order_consume。 ---- ```cpp // Thread 1: r1 = y.load(memory_order_relaxed); x.store(r1, memory_order_relaxed); // Thread 2: r2 = x.load(memory_order_relaxed); y.store(42, memory_order_relaxed); ``` <small>Reference: [C++11 規格, page 1116](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3376.pdf#section.1.10) </small> 有沒有可能最後的結果是 r1 = r2 = 42? ---- 因為 memory_order_relaxed 只是 atomic operaion **不是 synchronization operations** 所以執行順序有可能是這樣 ```cpp y.store(42, memory_order_relaxed); r1 = y.load(memory_order_relaxed); x.store(r1, memory_order_relaxed); r2 = x.load(memory_order_relaxed); ``` <small>Reference: [C++11 規格, page 1116](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3376.pdf#section.1.10) </small> ---- ## [Boost.Atomic](https://www.boost.org/doc/libs/1_66_0/doc/html/atomic/thread_coordination.html) ---- ### Example about atomic ---- ```cpp= void push(const T &data) { node * n = new node; n->data = data; node * stale_head = head_.load(boost::memory_order_relaxed); do { n->next = stale_head; } while (!head_.compare_exchange_weak(stale_head, n, boost::memory_order_release)); } ... node * pop_all_reverse(void) { return head_.exchange(0, boost::memory_order_consume); } private: boost::atomic<node *> head_; ``` <small>第 5 、8 行的 load、compare_exchange_weak 就是 atomic operation,底下會再討論到這兩個 atomic operation,也會提到 3 個 memory order:memory_order_relaxed、memory_order_release 和 memory_order_consume。</small> <small>Reference [Boost.Atomic, Usage examples](https://www.boost.org/doc/libs/1_66_0/doc/html/atomic/usage_examples.html#boost_atomic.usage_examples.mp_queue.implementation)</small> ---- [C++11, page 1124]() 對 compare_exchange_weak 是這麼說的 <small>Implementations should ensure that weak compare-and-exchange operations **do not** consistently return false unless either the atomic object has value **different** from expected or there are concurrent modifications to the atomic object.</small> ---- - compare_exchange_weak 在 exchange 前會先作比較,比較結果不一樣才會 exchange,如果 exchange 就會 return true,否則 return false。 - 所以範例裡會有一個 ```cpp while (!head_.compare_exchange_weak(stale_head, n, boost::memory_order_release)); ``` - 在這個範例,我們用 memory_order_release 作為 atomic operation 的標記,底下將會探討 memory_order_release 還有與之成對的 memory_order_consume ---- memory_order_release v.s. memory_order_consume ---- Boost 是這麼說的 The use of **memory_order_release** after creating and initializing the object and **memory_order_consume** before dereferencing the object provides this guarantee. ---- 再看看 [C++11, page 1114]() 怎麼說 - **memory_order_release**, memory_order_acq_rel, and memory_order_seq_cst: a store operation performs a release operation **on the affected memory location**. - memory_order_consume: a load operation performs a consume operation **on the affected memorylocation** ---- 如果用 memory_order_acquire 來理解 memory_order_consume,會容易點,但是<font color='red'> memory_order_consume 不等於 memory_order_acquire</font> ---- 可以把 memory_order_consume 換成 memory_order_acquire,<font color='red'>但不能把 memory_order_acquire 換成 memory_order_consume</font> ---- 透過 memory_order_release、memory_order_consume 建立了 happens-before 的關係 ```cpp head_.compare_exchange_weak(stale_head, n, boost::memory_order_release)); ``` ```cpp head_.exchange(0, boost::memory_order_consume) ``` --- ### Boost.Atomic - Boost 實作了一個 [Wait-free multi-producer queue](https://www.boost.org/doc/libs/1_66_0/doc/html/atomic/usage_examples.html#boost_atomic.usage_examples.mp_queue.implementation) - 透過以上對 C++11、memory order 和 atomic operation 的了解,我們來欣賞一下 Boost 如何用 memory order 和 atomic operation 來實作出一個 Wait-free multi-producer queue。 ---- - producer–consumer problem - multi-producer and single consumer queue ---- #### multi-producer and single consumer queue [producer–consumer problem](https://en.wikipedia.org/wiki/Producer%E2%80%93consumer_problem) 基本上分這幾種 - **multi-producer and single consumer** - single-producer and multi-consumer - multi-producer and multi-consumer - single producer and single consumer <small>由於 [producer–consumer problem](https://en.wikipedia.org/wiki/Producer%E2%80%93consumer_problem) 不是這裡的重點,<br>所以只會講到所以只會講到 **multi-producer and single consumer**</small> ---- ### multi-producer and single consumer queue <small>producer 會 push message 到 queue ,然後 consumer 會將 queue 裡的 message 取走</small> <small>queue 是一個先進先出的 FIFO, First-In-First-Out 的資料結構,但是如果加了<font color='red'>在多執行緒下有多個 producer</font> 同時 push to queue,情況就有點複雜</small> ```graphviz digraph G{ rankdir = "LR"; producer1 -> queue; producer2 -> queue; queue -> consumer queue[shape="quare"]; } ``` ---- 節錄自 [Boost.Atomic, Usage example](https://www.boost.org/doc/libs/1_66_0/doc/html/atomic/usage_examples.html#boost_atomic.usage_examples.mp_queue.implementation) ```cpp= void push(const T &data) { node * n = new node; n->data = data; node * stale_head = head_.load(boost::memory_order_relaxed); do { n->next = stale_head; } while (!head_.compare_exchange_weak(stale_head, n, boost::memory_order_release)); } ... node * pop_all_reverse(void) { return head_.exchange(0, boost::memory_order_consume); } private: boost::atomic<node *> head_; ``` <small>現在再看這個範例,是不是就可以不用靠 try and error 跟猜測了?</small> <small>Reference [Boost.Atomic, Usage example, multi-producer queue](https://www.boost.org/doc/libs/1_66_0/doc/html/atomic/usage_examples.html#boost_atomic.usage_examples.mp_queue.implementation)</small> --- ## Reference - [ ] [Thread coordination using Boost.Atomic, Boost.Atomic, version 1.66](https://www.boost.org/doc/libs/1_66_0/doc/html/atomic/thread_coordination.html) - [ ] [Wait-free multi-producer queue, implementation](https://www.boost.org/doc/libs/1_66_0/doc/html/atomic/usage_examples.html#boost_atomic.usage_examples.mp_queue.implementation) - [ ] [producer–consumer problem](https://en.wikipedia.org/wiki/Producer%E2%80%93consumer_problem) - [ ] [Lock-Free Multi-Producer Multi-Consumer Queue on Ring Buffer](https://www.linuxjournal.com/content/lock-free-multi-producer-multi-consumer-queue-ring-buffer) - [ ] [The Purpose of memory_order_consume in C++11](https://preshing.com/20140709/the-purpose-of-memory_order_consume-in-cpp11/) - [ ] [CppMem: Interactive C/C++ memory model](http://svr-pes20-cppmem.cl.cam.ac.uk/cppmem/) --- END ---- ###### tags: `COSCUP2019` `帶您讀源碼` `IE2102` ###### tags: `C++11` `memory order` `TMP` `Template Meta-Programming` `Boost` `atomic` `Boost.Atomic` `1.66`

Import from clipboard

Paste your markdown or webpage here...

Advanced permission required

Your current role can only read. Ask the system administrator to acquire write and comment permission.

This team is disabled

Sorry, this team is disabled. You can't edit this note.

This note is locked

Sorry, only owner can edit this note.

Reach the limit

Sorry, you've reached the max length this note can be.
Please reduce the content or divide it to more notes, thank you!

Import from Gist

Import from Snippet

or

Export to Snippet

Are you sure?

Do you really want to delete this note?
All users will lose their connection.

Create a note from template

Create a note from template

Oops...
This template has been removed or transferred.
Upgrade
All
  • All
  • Team
No template.

Create a template

Upgrade

Delete template

Do you really want to delete this template?
Turn this template into a regular note and keep its content, versions, and comments.

This page need refresh

You have an incompatible client version.
Refresh to update.
New version available!
See releases notes here
Refresh to enjoy new features.
Your user state has changed.
Refresh to load new user state.

Sign in

Forgot password

or

By clicking below, you agree to our terms of service.

Sign in via Facebook Sign in via Twitter Sign in via GitHub Sign in via Dropbox Sign in with Wallet
Wallet ( )
Connect another wallet

New to HackMD? Sign up

Help

  • English
  • 中文
  • Français
  • Deutsch
  • 日本語
  • Español
  • Català
  • Ελληνικά
  • Português
  • italiano
  • Türkçe
  • Русский
  • Nederlands
  • hrvatski jezik
  • język polski
  • Українська
  • हिन्दी
  • svenska
  • Esperanto
  • dansk

Documents

Help & Tutorial

How to use Book mode

Slide Example

API Docs

Edit in VSCode

Install browser extension

Contacts

Feedback

Discord

Send us email

Resources

Releases

Pricing

Blog

Policy

Terms

Privacy

Cheatsheet

Syntax Example Reference
# Header Header 基本排版
- Unordered List
  • Unordered List
1. Ordered List
  1. Ordered List
- [ ] Todo List
  • Todo List
> Blockquote
Blockquote
**Bold font** Bold font
*Italics font* Italics font
~~Strikethrough~~ Strikethrough
19^th^ 19th
H~2~O H2O
++Inserted text++ Inserted text
==Marked text== Marked text
[link text](https:// "title") Link
![image alt](https:// "title") Image
`Code` Code 在筆記中貼入程式碼
```javascript
var i = 0;
```
var i = 0;
:smile: :smile: Emoji list
{%youtube youtube_id %} Externals
$L^aT_eX$ LaTeX
:::info
This is a alert area.
:::

This is a alert area.

Versions and GitHub Sync
Get Full History Access

  • Edit version name
  • Delete

revision author avatar     named on  

More Less

Note content is identical to the latest version.
Compare
    Choose a version
    No search result
    Version not found
Sign in to link this note to GitHub
Learn more
This note is not linked with GitHub
 

Feedback

Submission failed, please try again

Thanks for your support.

On a scale of 0-10, how likely is it that you would recommend HackMD to your friends, family or business associates?

Please give us some advice and help us improve HackMD.

 

Thanks for your feedback

Remove version name

Do you want to remove this version name and description?

Transfer ownership

Transfer to
    Warning: is a public team. If you transfer note to this team, everyone on the web can find and read this note.

      Link with GitHub

      Please authorize HackMD on GitHub
      • Please sign in to GitHub and install the HackMD app on your GitHub repo.
      • HackMD links with GitHub through a GitHub App. You can choose which repo to install our App.
      Learn more  Sign in to GitHub

      Push the note to GitHub Push to GitHub Pull a file from GitHub

        Authorize again
       

      Choose which file to push to

      Select repo
      Refresh Authorize more repos
      Select branch
      Select file
      Select branch
      Choose version(s) to push
      • Save a new version and push
      • Choose from existing versions
      Include title and tags
      Available push count

      Pull from GitHub

       
      File from GitHub
      File from HackMD

      GitHub Link Settings

      File linked

      Linked by
      File path
      Last synced branch
      Available push count

      Danger Zone

      Unlink
      You will no longer receive notification when GitHub file changes after unlink.

      Syncing

      Push failed

      Push successfully