Ch04: Threads and Concurrency

# Threads and Concurrency

## Overview

**Thread**:
- A basic unit of CPU utilization
- Comprises
    - A thread ID
    - A program counter (PC)
    - A register set
    - A stack
- **Shares with other threads** belonging to the same process
    - code
    - data
    - other OS resources
        - files
        - signals

(Be able to draw the diagram: ==which parts are shared and which are private to each thread==)

![Single-threaded and multithreaded processes](https://hackmd.io/_uploads/S1Fcauo-6.png)

- Single-threaded process
    - A traditional process (heavyweight process)
    - Has a single thread of control
- Multi-threaded process
    - Has multiple threads of control
    - Can perform more than one task at a time

### Motivation
- Multithreaded server

![Multithreaded server architecture](https://hackmd.io/_uploads/HJMg1Fi-a.png)

### Benefits
- Responsiveness
    - Can respond quickly
    - May allow continued execution if part of the process is blocked; especially important for user interfaces
- Resource sharing
    - Threads share the resources of their process, which is easier than shared memory or message passing
- Economy
    - Thread creation is cheaper than process creation, and thread switching has lower overhead than context switching between processes
- Scalability
    - A process can take advantage of multiprocessor architectures

## Multicore Programming
- Multicore
    - Place multiple computing cores on a single chip.
    - Each core appears as a separate CPU to the OS.
- A single-core system
    ![Concurrent execution on a single-core system](https://hackmd.io/_uploads/H1cYcWSMa.png)
- A multi-core system
    ![Parallel execution on a multicore system](https://hackmd.io/_uploads/ryG25WSGp.png)
- Concurrency
    - Split a program into multiple tasks (not necessarily executed at the same time)
- Parallelism
    - Emphasizes executing tasks ==at the same time==

[Concurrency vs Parallelism: a brief look at the difference and the terminology (article in Chinese)](https://davidleitw.github.io/posts/concurrency01/)

### Programming Challenges
- Identifying tasks
- Balance
- Data splitting
- Data dependency
- Testing and debugging

### Types of Parallelism
- **Data** parallelism
    - Distributing subsets of the **same** data across multiple computing cores and performing the same operation on each core
- **Task** parallelism
    - Distributing tasks (threads) across multiple computing cores
    - Each thread is performing a **unique** operation

![Data and task parallelism](https://hackmd.io/_uploads/HkOA294Mp.png)

## Multithreading Models
- **User** threads
    - Are supported above the kernel and are managed **without kernel support**
- **Kernel** threads
    - Are supported and managed directly by the OS
- A relationship must exist between user threads and kernel threads
    - Many-to-one model
    - One-to-one model
    - Many-to-many model

![User and kernel threads](https://hackmd.io/_uploads/SJ6jyiEfa.png)

### Many-to-One Model
- Many user-level threads are mapped to a single kernel thread
- One thread **blocking** causes all to block
- Multiple threads may not run in parallel on a multicore system because **only one may be in the kernel at a time**
- Examples:
    - Solaris Green Threads
    - GNU Portable Threads

![Many-to-one model](https://hackmd.io/_uploads/H1E8loVGT.png)

### One-to-One Model

![One-to-one model](https://hackmd.io/_uploads/Sk3KgoNz6.png)

### Many-to-Many Model

![Many-to-many model](https://hackmd.io/_uploads/rJRjloVza.png)

- Two-level Model

![Two-level model](https://hackmd.io/_uploads/SyNJZi4M6.png)

## Thread Libraries
- A thread library
    - Provides the programmer with an API for creating and managing threads
- Two primary ways of implementing a thread library
    - Provide a library entirely in user space with no kernel support
    - Implement a kernel-level library supported directly by the OS

## Implicit Threading
- To address the difficulties and better support the design of concurrent and parallel applications
    - Transfer the creation and management of threading from application developers to compilers and run-time libraries

### Thread Pools
- Create a number of threads at process startup and place them into a pool
- Benefits
    - Usually **slightly faster** to service a request with an existing thread than to create a new thread
    - **Limits the number of threads** that exist at any one point
    - Separates the task to be **performed** from the mechanics of **creating** the task

### Fork Join
- Multiple threads (tasks) are **forked**, and then **joined**.

![Fork-join parallelism](https://hackmd.io/_uploads/SkQy4sNGa.png)

## Threading Issues

### The `fork()` and `exec()` System Calls
- `fork()`
- `exec()`

### Signal Handling
- Signal
    - To notify a process that a particular event has occurred
- The pattern
    - Generated
    - Delivered
        - To the thread to **which the signal applies**
        - To **every** thread in the process
        - To **certain** threads in the process
        - Assign **a specific thread** to receive all signals for the process
    - Handled
        - A **default** signal handler
        - A **user-defined** signal handler

### Thread Cancellation
- Involves terminating a thread before it has completed
- Two different scenarios
    - **Asynchronous** cancellation
        - Terminates the target thread **immediately**
    - **Deferred** cancellation
        - Allows the target thread to **periodically check** if it should be cancelled

### Thread-Local Storage
- Threads belonging to a process share the data of the process
- **Thread-Local Storage (TLS)**
    - In some circumstances, each thread **might** need its own copy of certain data

### Scheduler Activation

![Lightweight process (LWP)](https://hackmd.io/_uploads/SJN2Z3NMp.png)

- Both the M:M (many-to-many) and two-level models require communication to maintain the appropriate number of kernel threads allocated to the application
- **Lightweight process (LWP)**
    - A **virtual processor** on which the application can schedule a user thread to run
    - Each LWP is attached to a **kernel thread**
        - Kernel threads are what the OS schedules to run on physical processors
- One scheme for communication between the **user-thread library** and the **kernel** is known as **scheduler activation**
    - The kernel provides an application with a set of virtual processors (LWPs)
    - Upcall
        - The **kernel** informs an **application** about certain events
    - Upcall handler
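
Two ideas from the Threading Issues section, deferred cancellation and thread-local storage, can be tried out in a few lines. The following is only a minimal sketch in Python, not how any particular OS or thread library implements them. All names (`run_demo`, `worker`, `tls`, `cancel_requested`) are made up for illustration. It assumes that a shared `threading.Event` is an acceptable stand-in for a cancellation request: each worker checks it at a point of its own choosing, which is the "periodically check" behaviour of deferred cancellation. `threading.local()` gives each thread its own copy of the attributes stored on it.

```python
import threading
import time


def run_demo():
    """Run two workers, then cancel them cooperatively (deferred cancellation)."""
    tls = threading.local()               # thread-local storage: one copy per thread
    cancel_requested = threading.Event()  # shared flag acting as the cancel request
    results = {}                          # ordinary shared data, seen by all threads
    results_lock = threading.Lock()

    def worker(name):
        # These attributes exist only in the calling thread's copy of `tls`.
        tls.name = name
        tls.steps = 0
        while True:
            tls.steps += 1  # one unit of (pretend) work
            # Cancellation point: wait() returns True once the flag is set.
            if cancel_requested.wait(timeout=0.001):
                break
        # Clean up and publish this thread's private values into shared data.
        with results_lock:
            results[name] = (tls.name, tls.steps)

    threads = [threading.Thread(target=worker, args=(n,)) for n in ("A", "B")]
    for t in threads:
        t.start()

    time.sleep(0.05)        # let the workers run for a moment
    cancel_requested.set()  # request cancellation; workers stop at their next check
    for t in threads:
        t.join()

    # The main thread never stored `name` on `tls`, so it does not see one.
    main_sees_name = hasattr(tls, "name")
    return results, main_sees_name


if __name__ == "__main__":
    results, main_sees_name = run_demo()
    print(results)
    print("main thread sees tls.name:", main_sees_name)
```

Each worker reports its own `name` even though both wrote to the same `tls` object, while `results` is genuinely shared and therefore guarded by a lock. Python's `threading` module does not provide a call that terminates another thread immediately, so asynchronous cancellation is not shown here.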