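# Week 3 - Highlights

Nice notes: https://drive.google.com/drive/folders/1UJyV17jx2l3RdIVhpEY7sOfxqcWYRjKJ?usp=sharing

* @team1
* @team12
* @team15
* @team18
* @team23
* @team24
* @team28
* @team30

## @team28

### Unix Philosophy

- Make each program do one thing well and let programs work together
    - Users can quickly assemble the program they need from many single-purpose binaries
    - Everyone uses a computer for different needs, but in the end those needs break down into combinations of atomic functions. Rather than designing 1000 similar commands for 1000 similar features, it is better to design a few key functions and compose them to get what we want. This is much like functional programming: keep each function pure and single-purpose, then combine functions to produce the desired effect. The result is more readable, has a gentler learning curve, and is easier to optimize and maintain.
- Handle text streams
    - Microsoft PowerShell, by contrast, expects everything to be a `.NET` object
- Everything has the same simple interface, so not much domain knowledge is required

## @team15

We worked through the homework from the reading material and explored a few things beyond it. The code the homework needs is in the GitHub repository listed under Reference. The homework itself does not cover the double-fork technique, so we simulated it as follows. We ran `./fork.py -A a+b,b+c,b+d,d+e,d-,b- -c`

![image](https://hackmd.io/_uploads/ryfP0hMsex.png)

:::info
`-c` shows the answers. The homework normally gives you the actions and asks you to draw the process tree at each step, but here we were only curious what the double-fork tree looks like.
`-A` is followed by a custom action script: `a+b` means a forks b, and `b-` means b exits.
:::

So we get

![image](https://hackmd.io/_uploads/SJUckaGixg.png)
![image](https://hackmd.io/_uploads/SyOi1Tzieg.png)

We can see that when d exits, e is reparented directly to a; a plays the role of `PID = 1` on our machine. We were also curious about zombies, but unfortunately the `fork.py` provided with this homework cannot produce a zombie.

Reference:
1. Reading material homework on GitHub: https://github.com/remzi-arpacidusseau/ostep-homework/

## @team28

### Why does `fork` return different values even though the two processes still share the same memory right after the fork?

Consider the following code:

```c
pid_t pid = fork();
if (pid == 0) {
    // child: do something ...
}
```

Right after the fork, the two processes share the same memory. Only when one of them writes first is CoW triggered, after which that process writes into its own private copy. But if the return value of `fork()` differs between the two, how can the memory still be shared? After looking into it, we found:

1. Although `pid` appears to sit at the same (virtual) memory address in both processes, it actually maps to different physical memory.
2. The OS may have only changed the value of a CPU register. Unless it is necessary, that value is not written to memory, so no CoW happens. (To be verified)

Addendum: we later asked the professor during TA Time.

```c
pid_t pid = fork();
```

This triggers copy-on-write once the return value is stored into `pid` in memory: the page that holds `pid` gets copied.

In contrast, with

```c
if (fork() == 0) {
    printf("Child Processing...\n");
}
```

from an assembly point of view, the return value of `fork()` only needs to sit in a register for the check (the branch) to work. So although the parent and the child end up with different PCs, copy-on-write has not been triggered by the check itself: the only difference so far is in registers, and memory has not been touched.

## @team30

**After fork(), the parent and child processes have apparently identical memory, so why does modifying a variable in one not affect the other?**

A2. Although the parent and the child appear to have the same virtual address space after fork(), modern operating systems use Copy-On-Write (COW). fork does not copy every physical page right away; instead, the existing pages are marked shared and read-only, and both parent and child point to the same physical pages. When either side tries to write (for example, by modifying a variable), a page fault occurs. Only then does the kernel allocate a new physical page for that side and copy the contents over. That side then modifies its own new page, and the other side never sees the change. Modifications are therefore "private" and do not affect each other.

Reference: Copy-On-Write explanation: https://www.geeksforgeeks.org/operating-systems/copy-on-write/

## @team1

While looking things up, I ran into an interesting case and wrote some code to test it.

```cpp
for(int i=0;i<3;i++){
    cout << i;
    fork();
}
// output : 012012012012012012012012
```

This output is very counterintuitive: when i = 0 and i = 1 were printed, some of the child processes did not exist yet, but in the end every one of the 8 processes prints 012. The reason is that the output is first buffered in stdout. When a child is forked, the contents of the buffer are copied along with everything else, and nothing is actually written until the end.

```cpp
for(int i=0;i<3;i++){
    cout << i << flush;
    fork();
}
// output : 0112222
```

If the buffer is flushed before every fork, the child is guaranteed to start with a clean stdout buffer. (The exact interleaving of the digits can vary between runs, since the processes run concurrently.)

## @team23

Question: We learned that a zombie process occurs when a child process terminates but its parent does not call `wait()` or `waitpid()`.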
**Our question:** *If a zombie is really created, how can we observe it and how can it be reaped (collected)?*

- Experiment (Our Hands-on Test)
    - Creating a zombie
        - Program: use `fork()`, let the child immediately `exit(0)`, and let the parent **not** call `wait()` but sleep for 30 seconds.<img src="https://hackmd.io/_uploads/SJL0XZHsgx.png" width="350">
        - Execution: run `./zombie &` to place it in the background.<img src="https://hackmd.io/_uploads/SJG-HbHsee.png" width="350">
    - Observing the zombie
        - Use `ps` to filter processes with state `Z`:<img src="https://hackmd.io/_uploads/Sk8NrbHjel.png" width="650">
        - After 30 seconds, the parent process also terminates. Running `ps` again shows that the zombie process has disappeared, since it has been reparented to `init` and reaped automatically:<img src="https://hackmd.io/_uploads/SklWIbHixg.png" width="650">
    - If we want to remove the zombie before the parent exits
        :::warning
        Key point: a zombie cannot be killed directly. We must make its parent reap it (call `wait()`/`waitpid()`), or make the parent exit so that `init` will reap it.
        :::
        - Method 1: remind the parent to reap
            - We first tried to send `SIGCHLD` manually to the parent process while a zombie existed. Nothing happened, because the parent program had no signal handler and never called `waitpid()`.![image](https://hackmd.io/_uploads/SJCqUqHsgx.png)
            - Adding a signal handler: we modified the parent code to install a handler for `SIGCHLD`. The handler simply calls `waitpid(-1, &status, WNOHANG)` in a loop to reap all terminated children:![image](https://hackmd.io/_uploads/S1YeVcBoxg.png)
            - Result: with this handler, whenever the child terminates, the kernel sends `SIGCHLD` automatically. The handler executes immediately and reaps the child, so the zombie never appears.
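              **Therefore, in most cases, there is no need to manually send the signal.**
        - Method 2: terminate the parent
          **Let `init` adopt and reap the zombie by ending the parent:**![image](https://hackmd.io/_uploads/HkfJeMHslg.png)
            - Result: Success!
- Reference website
    - Baeldung – How to Clean a Linux Zombie Process https://www.baeldung.com/linux/clean-zombie-process?utm_source=chatgpt.com

## @team28

Q3. Why not just use command-line arguments: `docker run postgres --user=myuser --password=mypassword`?

In containerized deployments, there are two practical reasons to use environment variables instead of command-line arguments.

First, security. Command-line arguments are exposed in the process's `/proc/<pid>/cmdline` and in `ps` output, so other processes on the same host or in the same namespace can easily see them. Environment variables, by contrast, do not appear on the command line, and reading `/proc/<pid>/environ` requires more privilege (it is normally readable only by the process owner and root). This lowers the risk of accidentally leaking sensitive information such as a database password.

Second, environment variables make deployment more flexible. Across different environments or in multi-container systems, settings can be managed and injected centrally through docker-compose, Kubernetes ConfigMaps, or Secrets, without editing the startup command every time. This simplifies automated deployment and makes it easy to switch settings between environments, which is especially helpful for continuous integration and continuous deployment (CI/CD) pipelines.

## @team2

Speaking of Docker environment variables, we can also write them in a Docker Compose file or a Dockerfile instead of putting everything on the command line. Something like this:

```dockerfile
FROM ros:kinetic
LABEL org.opencontainers.image.authors="ohin"
ENV RUNNING_IN_DOCKER=true
ENV TERM=xterm-256color
ARG DEBIAN_FRONTEND=noninteractive
ARG USER
ARG USER_UID=1000
```

## @team30

**Why is a background bash started with `bash &` in the stopped state?**

A: bash tries to read user input from stdin, even when it runs in the background. Linux has a protection mechanism: a background job cannot read directly from the controlling terminal, so the kernel sends it a SIGTTIN signal. The default action for SIGTTIN is to stop the process.

**Q: How can we fix this?**

A: Redirect stdin, for example `bash < /dev/null &`. With stdin redirected to /dev/null, bash does not try to read from the keyboard. It finishes right after starting (there are no interactive commands to run) and exits, so the job is shown as `Done`.

## @team15

`jobs -l`: lists the current jobs, e.g.:

```bash
root@ubuntu:/home/vagrant# bash &
[2] 10190
root@ubuntu:/home/vagrant# jobs -l
[1]- 10172 Stopped (tty input)     bash
[2]+ 10190 Stopped (tty input)     bash
```

`fg %3`: the number after `%` selects the job to bring to the foreground (where it receives stdin).

![image](https://hackmd.io/_uploads/BkCFyMloxl.png)

## @team18

**Q: Why is a background bash started with `bash &` in the stopped state?**

A: bash tries to read user input from stdin, even when it runs in the background. Linux has a protection mechanism: a background job cannot read directly from the controlling terminal, so the kernel sends it a SIGTTIN signal. The default action for SIGTTIN is to stop the process.

**Q: How can we fix this?**

A: Redirect stdin, for example `bash < /dev/null &`

## @team15

**Q1, How is it possible to run so many “virtual CPUs” on a machine with only 2 physical CPU cores?**

In the virtualization layer (the `hypervisor`), the physical CPU's time slices are divided among multiple virtual machines, usually with an equal-share or fair-scheduling policy. At first glance this looks very similar to scheduling in an ordinary operating system, but in a virtualized environment it has its own name: `over-subscription`.

vCPUs (virtual CPUs) can work this way because the allocation is like several parallel highways on which multiple VM guests "queue up and move forward" in turn. Each VM believes it owns complete CPU resources, but in reality the VMs share the underlying physical cores and take turns using time slices to simulate "running at the same time". This is why a physical machine with only 2 cores can still run virtual machines with three or more vCPUs.

The key reason this design works is that vCPU utilization is not maxed out most of the time. In other words, most VMs do not continuously use 100% of their CPU, so the hypervisor can safely let several vCPUs share the same physical core without an immediate, severe performance drop. However, if one VM suddenly starts running a CPU-heavy program (such as a large game or a compute-intensive task), the sharing suffers: other VMs may slow down because they cannot get CPU time, and may even show latency or stuttering. This is why resource limits and load management have to be considered when designing a virtualized environment.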
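
To make the time-slice sharing described above more concrete, here is a toy simulation in C. It is only a sketch under our own assumptions, not how any real hypervisor schedules vCPUs: the function name `simulate_round_robin`, the `runnable` flags, and the idea of a fixed-length "tick" are all hypothetical and chosen for illustration. On every tick, the physical cores are handed to the next runnable vCPUs in round-robin order, and idle vCPUs are skipped.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of CPU over-subscription (an illustrative assumption, not a
 * real hypervisor algorithm). On each tick the physical cores are given
 * to the next runnable vCPUs in round-robin order; idle vCPUs are skipped.
 *
 * runnable[i]  : nonzero if vCPU i wants CPU time for the whole run
 * run_ticks[i] : output, number of ticks vCPU i actually ran
 */
void simulate_round_robin(size_t num_cores, size_t num_vcpus,
                          const int runnable[], size_t total_ticks,
                          unsigned run_ticks[])
{
    assert(num_cores > 0 && num_vcpus > 0);

    for (size_t i = 0; i < num_vcpus; i++)
        run_ticks[i] = 0;

    size_t next = 0; /* next vCPU in the round-robin queue */
    for (size_t tick = 0; tick < total_ticks; tick++) {
        size_t granted = 0; /* cores handed out during this tick */
        size_t scanned = 0; /* vCPUs examined during this tick */

        /* Visit each vCPU at most once per tick, so a single vCPU
         * never occupies two cores at the same time. */
        while (granted < num_cores && scanned < num_vcpus) {
            if (runnable[next]) {
                run_ticks[next]++;
                granted++;
            }
            next = (next + 1) % num_vcpus;
            scanned++;
        }
    }
}
```

For example, with 2 cores, 6 vCPUs, and 300 ticks: if only two vCPUs are runnable, each of them runs for all 300 ticks, so the over-subscription is invisible. If all six are runnable, each runs for only 100 ticks, about a third of a core, which matches the slowdown described above when VMs become CPU-hungry at the same time.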