
# Thread

## Thread characteristics

- ==Each thread has its own==
  - thread ID
  - PC (program counter)
  - register set
  - stack
- ==Shared with the other threads of the same process==
  - text section
  - data section
  - OS resources
    - open files
    - flags

![image](https://hackmd.io/_uploads/SyrbU-hw6.png)

- Threads share memory
  > The benefit of threads sharing the code (text section) is that one app can have many threads active in the same address space
- Switching between threads still requires a context switch (it is just faster than switching between processes)

---

## Thread Models

Common thread models:

- POSIX $\rightarrow$ Pthreads
- Java $\rightarrow$ Green Threads
  > The JVM runs as a single process and schedules the app's threads itself
  > - The kernel does not know the process is multithreaded
  >> Advantage: ==thread switch== is efficient (no system call needed)
  >> Disadvantage: all threads share the CPU time allotted to this one process, so performance drops sharply as the number of threads grows (more threads share the same amount of CPU time)
- Solaris, System V UNIX $\rightarrow$ light-weight processes (LWPs)
  > - many-to-many
  > - Scheduling between LWPs is done by the ==user-mode thread library==, not by the kernel

![image](https://hackmd.io/_uploads/B1o93qe5a.png)

### Pthreads

![image](https://hackmd.io/_uploads/Bk1S9Z2Pa.png)

> Key point: the global variable sum is shared by the threads
> - If attributes are initialized with ``pthread_attr_init()``, the stack size, scheduling info, etc. can be set (each thread has its own attr)
> - The third parameter of ``pthread_create()`` (``runner()`` in this example) is where the newly created thread starts executing
> - If ``pthread_join()`` returns successfully, the target thread is guaranteed to have terminated, so the caller can then choose whether to clean up its resources; if a joinable thread is never joined, it becomes a zombie thread
>> **zombie**: a process that has terminated, but whose parent has not yet called ``wait()``

Because multicore systems are increasingly common, programs often contain many threads. One way to wait for many threads at once:

![image](https://hackmd.io/_uploads/B1WkZf2D6.png)

> Wrapping ``pthread_join()`` in a loop waits for many threads to terminate

#### Summary of commonly tested Pthread functions

==``pthread_join(pthread_t thread, void **retval)``==
$\rightarrow$ Waits for the specified thread to terminate

==``pthread_mutex_lock(pthread_mutex_t *mutex)``==
$\rightarrow$ Locks the specified mutex
$\rightarrow$ If the mutex is already locked by another thread when this function is invoked, the calling thread is blocked until the thread that owns the mutex calls ``pthread_mutex_unlock()`` and the mutex becomes available

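The figures above are images, so the following is only a minimal sketch of the same ideas, not the code from the figures: several threads update a shared global ``sum``, each update is protected by ``pthread_mutex_lock()``, and ``pthread_join()`` is called in a loop to wait for all of them. The names ``sum_lock`` and ``parallel_sum`` and the per-thread counting workload are assumptions made for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical names for illustration: sum_lock, parallel_sum. */
static long sum = 0;  /* global data, shared by every thread */
static pthread_mutex_t sum_lock = PTHREAD_MUTEX_INITIALIZER;

/* Thread entry point: adds 1 to the shared sum *param times. */
static void *runner(void *param)
{
    assert(param != NULL);
    long count = *(long *)param;
    for (long i = 0; i < count; i++) {
        pthread_mutex_lock(&sum_lock);   /* blocks while another thread holds the lock */
        sum++;
        pthread_mutex_unlock(&sum_lock); /* makes the mutex available again */
    }
    return NULL;
}

/* Starts nthreads workers, waits for all of them, and returns the final sum.
 * Returns -1 if the arguments are invalid or a thread could not be created. */
long parallel_sum(int nthreads, long per_thread)
{
    if (nthreads <= 0)
        return -1;

    pthread_t *tids = malloc(sizeof *tids * (size_t)nthreads);
    if (tids == NULL)
        return -1;

    sum = 0;
    int created = 0;
    for (; created < nthreads; created++) {
        /* Third argument: the function where the new thread starts executing. */
        if (pthread_create(&tids[created], NULL, runner, &per_thread) != 0)
            break;
    }

    /* pthread_join() in a loop: wait for every thread that was started. */
    for (int i = 0; i < created; i++)
        pthread_join(tids[i], NULL);

    free(tids);
    return created == nthreads ? sum : -1;
}
```

Calling ``parallel_sum(4, 1000)`` from a ``main()`` should return 4000, since every increment happens while the mutex is held. On older toolchains the program may need to be compiled with ``-pthread``.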
==``pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *mutex)``==

![image](https://hackmd.io/_uploads/BkLPmI0Pa.png)

> After calling ``pthread_mutex_lock()`` and acquiring the mutex, the thread checks the condition (the loop test ``a != b`` in this example). If the condition it needs does not hold yet, the thread invokes ``pthread_cond_wait()``, passing the mutex lock and the condition variable as parameters.
> Calling ``pthread_cond_wait()`` releases the mutex, so other threads can access the shared data and may change its value so that the condition becomes true.
> The thread that modified the shared data invokes ``pthread_cond_broadcast()`` or ``pthread_cond_signal()`` to signal the threads blocked on ``cond``.
>> The conditional clause (``a != b`` in this example) must be placed inside a loop, so that the condition is rechecked after the thread is signalled

$\rightarrow$ atomically release mutex and cause the calling thread to block on the condition variable ``cond``
$\rightarrow$ The thread calling ``pthread_cond_wait()`` must first acquire the mutex with ``pthread_mutex_lock()``, to prevent a race condition
$\rightarrow$ ``cond`` is a condition variable shared among threads; a thread that wants to change the state associated with ``cond`` must first acquire ``mutex``

==``pthread_cond_signal(pthread_cond_t *cond)``==

![image](https://hackmd.io/_uploads/HJKTu8RPT.png)

$\rightarrow$ Unblocks at least one thread that is blocked waiting on ``cond`` (wake up sleeping or waiting thread)
$\rightarrow$ If no thread is currently blocked on ``cond``, it does nothing
$\rightarrow$ Calling ``pthread_cond_signal()`` does not release the mutex lock; the mutex is released only when ``pthread_mutex_unlock()`` is called after ``pthread_cond_signal()``
$\rightarrow$ After calling ``pthread_cond_signal()``, the signalling thread does not immediately hand the CPU to the signalled thread. The signalled thread can return from ``pthread_cond_wait()`` only after the signalling thread calls ``pthread_mutex_unlock()`` and the signalled thread reacquires the mutex

==``int sem_post(sem_t *sem)``==
$\rightarrow$ ++unlock++ a semaphore referenced by sem
> If the semaphore value resulting from ``sem_post()`` is positive, no thread was blocked waiting for the semaphore to become unlocked, and ``sem_post()`` simply increments the value
> If the semaphore value resulting from ``sem_post()`` is 0, one of the threads blocked waiting for the semaphore is allowed to return successfully from its call to ``sem_wait()``
>> If several threads are blocked, the one to unblock is chosen according to the threads' scheduling attributes: the highest-priority thread returns first, and if several threads share the highest priority, the one that has waited the longest goes first

==``int sem_init(sem_t *sem, int pshared, unsigned value)``==
$\rightarrow$ ``value`` is the initial value of the semaphore ``sem``. If ``pshared`` is 0, the semaphore is shared only among the threads of the same process; if ``pshared`` $\not=$ 0, the semaphore is shared between processes

### Linux Threads

- Linux does not distinguish between processes and threads; it uses the term task for both

The Linux system call that creates a thread: ``clone()``
> When ``clone()`` is invoked, it is passed a set of **flags** that determine how much is shared between the parent and the child

![image](https://hackmd.io/_uploads/HJDyYG3wa.png)

If no flags are set, nothing is shared between the parent and the child
$\rightarrow$ In this case, ``clone()`` behaves much like ``fork()``

---

## Thread Pools

If the number of threads concurrently active in the system is not limited, system resources (such as CPU time and memory) may be exhausted. One solution to this problem is a thread pool.

The idea of a thread pool is to create a number of threads at start-up and place them in a pool, where they wait for work to be assigned to them
$\rightarrow$ When the server receives a request, it does not create a new thread directly; it submits the request to the thread pool and goes back to waiting for new requests
$\rightarrow$ When the thread pool receives a request, it services it immediately if a thread is available; if no thread is available, the task is placed in a queue until a thread finishes its work

### Thread Pool benefits

- Servicing a request with an existing thread is faster than creating a new thread when the request arrives
- It limits the number of threads that exist at any one point in time
- Separating task creation from task execution allows more flexible arrangements, for example scheduling tasks

The number of threads in a thread pool can be determined by factors such as the number of CPUs, the amount of physical memory, and the expected number of concurrent client requests. Some thread pool architectures can also adjust the pool size dynamically.

---

### Reference

- Dinosaur book (Operating System Concepts), pp. 302-303
- [pthread_join(3) - Linux manual page](https://man7.org/linux/man-pages/man3/pthread_join.3.html)
- [pthread_cond_wait - IBM](https://www.ibm.com/docs/en/zos/2.1.0?topic=functions-pthread-cond-wait-wait-condition-variable)
- [pthread_cond_signal - IBM](https://www.ibm.com/docs/en/zos/2.1.0?topic=functions-pthread-cond-signal-signal-condition#ptcsig)
- [pthread_mutex_lock - IBM](https://www.ibm.com/docs/en/zos/2.1.0?topic=functions-pthread-mutex-lock-wait-lock-mutex-object)
- [sem_post - Opengroup](https://pubs.opengroup.org/onlinepubs/009695399/functions/sem_post.html)
- [sem_init - Opengroup](https://pubs.opengroup.org/onlinepubs/009696699/functions/sem_init.html)
- [OpenCSF](https://w3.cs.jmu.edu/kirkpams/OpenCSF/Books/csf/html/ProcVThreads.html)