Linux 核心設計: 不只挑選任務的排程器 === :::warning 注意!這是上課筆記 ::: :::info 原始課程在這裡 - [影片 Part 1](https://www.youtube.com/watch?v=X2T7U6AEWrY) - [影片 Part 2](https://www.youtube.com/watch?v=S2MP4hqNXX0) - [影片 Part 3](https://www.youtube.com/watch?v=-wn-ttOvQl0) - [講義](https://hackmd.io/@sysprog/linux-scheduler) Original by [jserv](http://wiki.csie.ncku.edu.tw/User/jserv) ::: --- # Part 1 | Section | Time| Description | | -------- | -------- | -------- | | 1 | [00:00:00](https://youtu.be/X2T7U6AEWrY?si=7yWZ2KqHT-XffnXv) | 1. 目標設定: 公平性(fairness) / 反應時間(interactiveness) / 多核處理器負載平衡 / 電源管理 / 及時處理。<br>2. machine learning in linux scheduler.<br>3. context switch.<br>4. real-time scheduling. | 2 | [00:19:51](https://youtu.be/X2T7U6AEWrY?si=XpERlpiNXIdFNKbi&t=1191) | 1. 重新認識 scheduling.<br>2. 其他排程器。<br>3. scheduling 就是增加一層 indirection.<br>4. 以少量的資源嘗試完成大量任務。| | 3 | [00:34:57](https://youtu.be/X2T7U6AEWrY?si=I4NCVY90tpBqGe92&t=2097) | 1. CPU scheduling 對系統的影響。<br>2. 如何測量?perf 工具。<br>3. pipe() 隱含任務的切換。<br>4. 利用 taskset() 指派特定 CPU core.| | 4 | [00:47:47](https://youtu.be/X2T7U6AEWrY?si=MUuNLdWiQJb7U3Up&t=2867) | 1. runqueue: 紀錄多少任務要執行。<br>2. 排程器的要求。<br>3. moving average.<br>4. per entity loading tracking(PELT).| | 5 | [01:03:45](https://youtu.be/X2T7U6AEWrY?si=YmMZ10FVrkFzupKu&t=3825) | 1. 投影片 “CPU Scheduling of the Linux Kernel”.<br>2. 介紹 "demystifying the linux cpu scheduler".<br>3. 介紹 “Operating Systems: Three Easy Pieces”.<br>4. container docker 與排程器的關係。<br>5. Agenda | | 6 | [01:20:54](https://youtu.be/X2T7U6AEWrY?si=JKofZi1e8CDVARQh&t=4854) | 1. clone() 可以建立 process and thread.<br>2. vfork(): 鳩佔鵲巢。<br>3. session / process group / process / thread.<br>4. signal.| | 7 | [01:37:42](https://youtu.be/X2T7U6AEWrY?si=b-D-BTksBlArFJ07&t=5862) | 1. task_struct.<br>2. linked list and array for process.<br>3. Process Identifier(支援 namespace).<br>4. nstack / netns.| | 8 | [01:52:00](https://youtu.be/X2T7U6AEWrY?si=aFYshoTocJriTNVr&t=6720) | 1. 面試分享。<br>2. PID hash table.<br>3. 繼續介紹 Process Identifier.<br>4. pidfd.<br>5. Linux 設計哲學:everything is a file descriptor.| | 9 | [02:09:55](https://youtu.be/X2T7U6AEWrY?si=OexUs8MrhChgisMW&t=7795) | 1. pointer to the current process: macro ***current***.<br>2. context switch: hardware context and process execution context.<br>3. Lazy FP state restore.| | 10 | [02:25:13](https://youtu.be/X2T7U6AEWrY?si=B-RGVp7le9kak7T5&t=8713) | 1. Task State Segment(TSS).<br>2. 文章 “Evolution of the x86 context switch in Linux”.<br>3. microarchitecture 的演化讓 system call 成本降低。<br>4. uniprocessing.<br>5. Hardware context switch with interrupt.| | 11 | [02:40:23](https://youtu.be/X2T7U6AEWrY?si=VG1pnnwYvn7FB5Op&t=9623) | 1. Linux 與鐵達尼號的關係。<br>2. multiprocessing.<br>3. switch to new address space.<br>4. git blame 觀察程式碼。| | 12 | [02:54:54](https://youtu.be/X2T7U6AEWrY?si=7pTTgboLC7_N75LC&t=10494) | 1. switch_to.<br>2. 介紹「痴漢水球」。<br>3. What about the third parameter? 指向要保留的data資源。<br>4. Scheduling classes(排程演算法): RR/FIFO/CFS.<br>5. 介紹 nice value: 人善被人欺。 # Part 2 | Section | Time| Description | | -------- | -------- | -------- | | 1 | [00:00:00](https://youtu.be/S2MP4hqNXX0?si=2HYrJTHtZzmNcrPz) | 1. 前情提要。<br>2. POSIX 規範。<br>3. sequence of scheduling classes.| | 2 | [00:10:53](https://youtu.be/S2MP4hqNXX0?si=mArYs7oY5Uk6Eb8x&t=653) | 1. task group.<br>2. TIF: thread information flag.<br>3. 介紹 User-mode Linux 來追蹤測試核心。<br>4. Linux 核心需要做實驗,不是用看的。<br>5. jiffies.| | 3 | [00:23:45](https://youtu.be/S2MP4hqNXX0?si=d7ZmhXnVQQVVhL36&t=1425) | 1. priorities.<br>2. functions of scheduler.<br>3. ttwu(Try to wake up): 確認是否可被搶佔。| | 4 | [00:35:40](https://youtu.be/S2MP4hqNXX0?si=gIsgBe25AaBphg1r&t=2140) | 1. CPU schedulers: O(n), O(1), RSDL, CFS, BFS, Deadline, MuQSS.<br>2. scheduler developers.<br>3. TUX 是 Linux 吉祥物的名字。| | 5 | [00:50:08](https://youtu.be/S2MP4hqNXX0?si=Kyc7a0NlbJWsNLV-&t=3008) | 1. 回顧 CPU scheduler 的發展。| | 6 | [00:59:22](https://youtu.be/S2MP4hqNXX0?si=kOp9WDnJVuDLYEfW&t=3562) | 1. 分析排程器的準則。<br>2. Turnaround time: 開始執行到完成的時間。<br>3. O(n) and O(1) scheduler.<br>4. O(1) scheduler 使用 bitmap 來達成。<br>5. 聊新酷音輸入法。| | 7 | [01:17:51](https://youtu.be/S2MP4hqNXX0?si=xAVwQiJd7eow-6sx&t=4671) | 1. O(n) scheduler 8核效率比4核差。<br>2. O(1) features.<br>3. NUMA: non-uniform memory access.<br>4. Priority and timeslice.| | 8 | [01:30:02](https://youtu.be/S2MP4hqNXX0?si=lhQ48LS6jPucsVLw&t=5402) | 1. batch job and interactive processes.<br>2. Bonus and process interactivity.| | 9 | [01:42:15](https://youtu.be/S2MP4hqNXX0?si=3qV9tTUXVhihxFq3&t=6135) | 1. Heuristics work fine in most case, but work badly in some case.(頭痛醫頭,腳痛醫腳)<br>2. xkcd 漫畫。<br>3. RSDL: Rotating Staircase Deadline.<br>4. 介紹 Linux foundation.| | 10 | [02:00:19](https://youtu.be/S2MP4hqNXX0?si=lgpUclCbh6wL2M57&t=7219) | 1. Completely Fair Scheduler(CFS).<br>2. virtual runtime.<br>3. Proportional-share scheduling.<br>4. CFS using red-black tree.| | 11 | [02:13:50](https://youtu.be/S2MP4hqNXX0?si=-d6ueV900wg_mICT&t=8030) | 1. CFS defined entities: process groups.<br>2. timeslice and context switch cost.| | 12 | [02:23:00](https://youtu.be/S2MP4hqNXX0?si=xbnJydxih7P9dmjH&t=8580) | 1. CFS user indirection layer: virtual runtime.<br>2. scheduler latency.<br>3. time slice 的計算。<br>4. prio_to_weight().| | 13 | [02:36:07](https://youtu.be/S2MP4hqNXX0?si=k2lX9I2HeE4nDwvk&t=9367) | 1. CFS runtime weight v.s. nice level<br>2. data structure 解說。<br>3. container_of().| | 14 | [02:44:00](https://youtu.be/S2MP4hqNXX0?si=ibKb_q6ma0oU9e4l&t=9840) | 1. 總結CFS。<br>2. plugshed: 動態更改排程器。<br>3. BFS: Brain Fuck Scheduler.<br>4. Scheduler benchmark.| | 15 | [02:58:50](https://youtu.be/S2MP4hqNXX0?si=6DC6W8mk8cZln_9H&t=10730) | 1. hackbench.<br>2. CK 和 Ingo 的恩怨情仇。<br>3. 排程器的測試場景是很重要的。 # Part 3 | Section | Time| Description | | -------- | -------- | -------- | | 1 | [00:00:15](https://www.youtube.com/live/-wn-ttOvQl0?si=vSFtamrnQoh4iuE-&t=15) | 1. 前情提要。<br>2. 2012 CPU schedulers compared.<br>3. make benchmark / video benchmark(ffmpeg).<br>4. Conclusion: 少核可以使用BFS; 多核CFS表現較好。| | 2 | [00:17:56](https://www.youtube.com/live/-wn-ttOvQl0?si=pxAdhk7YicN_6IQM&t=1076) | 1. scheduling classes.<br>2. deadline scheduler.<br>3. EDF: earliest deadline first.<br>4. real-time wcet(worst-case execution time).<br>5. 飛機記錄器的案例(ARINC 653)。| | 3 | [00:39:35](https://www.youtube.com/live/-wn-ttOvQl0?si=EnQumoQlaoqx6LL7&t=2375) | 1. 介紹 seL4: 軍工 航太的作業系統。<br>2. ARM big.LITTLE。<br>3. ARM DynamIQ.<br>4. 分享大學聯考的經歷。| | 4 | [00:56:15](https://www.youtube.com/live/-wn-ttOvQl0?si=Iuth4ZMADwGLJa-z&t=3375) | 1. CPU frequency framework.<br>2. PSI: pressure stall information.<br>3. CPU topology.| | 5 | [01:06:50](https://www.youtube.com/live/-wn-ttOvQl0?si=iTFPAaqObjJ_ngZW&t=4010) | 1. 關於 deadline 的閒聊。<br>2. MuQSS CPU scheduler.<br>3. multilevel feedback queue.| | 6 | [01:19:40](https://www.youtube.com/live/-wn-ttOvQl0?si=6RbfNTfkdHwbZKAz&t=4780) | 1. skip list.<br>2. Red-black tree.<br>3. virtual deadline.<br>4. MuQSS 改用 single runqueue, 因為不在意 scalability.| | 7 | [01:34:00](https://www.youtube.com/live/-wn-ttOvQl0?si=hb8v_Ma94nV6bH-u&t=5640) | 1. 2018 intel: CPU idle loop.<br>2. Dynamic voltage and frequency scaling.<br>3. thermal 降頻。| | 8 | [01:46:27](https://www.youtube.com/live/-wn-ttOvQl0?si=9WytAtEO13k0X92h&t=6387) | 1. CPU idle loop.<br>2. CPU and bus frequency.<br>3. CPUFreq governor.<br>4. EWMA: exponentially weighted moving average.| | 9 | [02:04:38](https://www.youtube.com/live/-wn-ttOvQl0?si=XA7TzTdVsbCwrCv5&t=7478) | 1. idle loops v.s. idle power.<br>2. SCHED_IDLE is for tasks with very low priority.<br>3. thermal pressure.<br>4. EAS: energy-aware scheduling.| | 10 | [02:21:12](https://www.youtube.com/live/-wn-ttOvQl0?si=Us0HPoJz3ZeGvwL2&t=8472) | 1. 開放原始碼的閒聊。<br>2. EAS: energy-aware scheduling.<br>3. scheduling for ARM big.LITTLE chip.<br>4. ARMv8 架構介紹。| | 11 | [02:32:56](https://www.youtube.com/live/-wn-ttOvQl0?si=IGkBnd3HvqDQNncT&t=9176) | 1. SMT: 硬體的平行化能力。<br>2. SMT 有安全問題(e.g. TLBleed)。<br>3. 2021 core scheduling 正式納入 linux kernel.| | 12 | [02:53:09](https://www.youtube.com/live/-wn-ttOvQl0?si=LPBqqqoDtGUNa8J7&t=10389) | 1. 回顧 Agenda。<br>2. Linux CPU scheduler 歷史發展。<br>3. Time keeping.<br>4. Tracing with ftrace.
×
Sign in
Email
Password
Forgot password
or
By clicking below, you agree to our
terms of service
.
Sign in via Facebook
Sign in via Twitter
Sign in via GitHub
Sign in via Dropbox
Sign in with Wallet
Wallet (
)
Connect another wallet
New to HackMD?
Sign up