Linux-Project2

# Linux-Project2

###### tags: `linux`

Group 9: 111522030 李孟潔, 111522061 鄭伊涵, 111526003 張友安

## Outline

[ToC]

## Kernel Space

Kernel Space details: https://hackmd.io/@linuxWarrior/H1So0AlHo

## User Space

### The 1st version

```c=
#include <syscall.h>
#include <sys/types.h>
#include <stdio.h>
#include <unistd.h>
#include <time.h>
#include <pthread.h>
#include <stdlib.h>
#include <dlfcn.h>

int init_data = 30;   /* initialized global: Data segment */
int non_init_data;    /* uninitialized global: BSS segment */
long ret = 0;
pthread_mutex_t mutex1 = PTHREAD_MUTEX_INITIALIZER;

/* Its address serves as a sample address in the Code segment. */
void* our_func() {
    printf("Hello World!\n");
    return NULL;
}

int main(){
    int local_data = 30;                      /* Stack */
    int* heapAddress = malloc(sizeof(int));   /* Heap */
    static __thread int tls = 0;              /* thread-local storage */
    char test = '\n';
    printf("start! %c\n", test);

    void* fHandle;
    void* func;
    fHandle = dlopen("/lib/x86_64-linux-gnu/libc.so.6", RTLD_LAZY);
    /* Check the handle before dereferencing it. */
    if (!fHandle) {
        printf("%s\n", dlerror());
        exit(1);
    }
    /* Relies on a glibc detail: the handle points to a struct link_map
     * whose first field is the load address of the library. */
    printf("libc base: %lx\n", *(size_t*)fHandle);

    dlerror();                         /* clear any stale error first */
    func = dlsym(fHandle, "printf");   /* an address inside the shared library */
    /* dlerror() resets its state when called, so read it only once. */
    char* err = dlerror();
    if (err != NULL) {
        printf("%s\n", err);
        exit(1);
    }

    size_t* va[7] = {
        (size_t*)&init_data,
        (size_t*)&non_init_data,
        (size_t*)&our_func,
        (size_t*)&local_data,
        (size_t*)func,
        (size_t*)heapAddress,
        (size_t*)&tls
    };
    size_t* pa[7];

    /* 333 is the number of our custom system call; it fills pa[] from va[]. */
    ret = syscall(333, va, pa, 7);

    printf("\nPid of the process is = %d\nAddresses info:", getpid());
    char printArray[7][20] = {"Data", "BSS", "Code", "Stack", "Library", "heap", "thread"};
    for (int i = 0; i < 7; i++) {
        printf("\n %d) %8s (va/pa) = %16p / %20p", i+1, printArray[i], (void*)va[i], (void*)pa[i]);
    }
    tls++;
    printf("\n\n");

    if (ret > 0)
        printf("error!!!QQQQQQQ");

    /* Keep the process alive so that two instances can be compared. */
    sleep(30);
    return 0;
}
```

### Process

* **Design flow**
    1. The basic structure is the same as in Project 1, except that the multi-threading part has been removed.
    2. sleep() is added at the end of the program so that two processes running at the same time can be compared easily.
    3. The program is run as shown below, with one instance in the background, so that two processes execute the same code at the same time. We then observe how memory is shared between the processes.

```shell
./getpa2 & ./getpa2
```

### Result of the 1st ver.

![](https://i.imgur.com/sQ084KV.jpg)

* **Result analysis**
    1. The two processes have the same physical addresses for Code and Library.
    2. The virtual addresses of Data and BSS are the same in both processes, but their physical addresses are different.

### The 2nd version

```c=
#include <syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <stdio.h>
#include <unistd.h>
#include <time.h>
#include <pthread.h>
#include <stdlib.h>
#include <dlfcn.h>

int init_data = 30;
int non_init_data;
long ret = 0;
pthread_mutex_t mutex1 = PTHREAD_MUTEX_INITIALIZER;

void* our_func() {
    printf("Hello World!\n");
    return NULL;
}

/* Translate va[] with our custom system call (number 333) and print
 * every va/pa pair. Called by both the parent and the child. */
static void print_addresses(size_t* va[7]) {
    size_t* pa[7];
    char printArray[7][20] = {"Data", "BSS", "Code", "Stack", "Library", "heap", "thread"};

    ret = syscall(333, va, pa, 7);
    for (int i = 0; i < 7; i++) {
        printf("\n %d) %8s (va/pa) = %16p / %20p", i+1, printArray[i], (void*)va[i], (void*)pa[i]);
    }
    printf("\n\n");
}

int main(){
    int local_data = 30;
    int* heapAddress = malloc(sizeof(int));
    static __thread int tls = 0;
    char test = '\n';
    printf("start! %c\n", test);

    void* fHandle;
    void* func;
    fHandle = dlopen("/lib/x86_64-linux-gnu/libc.so.6", RTLD_LAZY);
    /* Check the handle before dereferencing it. */
    if (!fHandle) {
        printf("%s\n", dlerror());
        exit(1);
    }
    printf("libc base: %lx\n\n", *(size_t*)fHandle);

    dlerror();                         /* clear any stale error first */
    func = dlsym(fHandle, "printf");
    /* dlerror() resets its state when called, so read it only once. */
    char* err = dlerror();
    if (err != NULL) {
        printf("%s\n", err);
        exit(1);
    }

    /* Built once: fork() gives the child the same virtual addresses. */
    size_t* va[7] = {
        (size_t*)&init_data,
        (size_t*)&non_init_data,
        (size_t*)&our_func,
        (size_t*)&local_data,
        (size_t*)func,
        (size_t*)heapAddress,
        (size_t*)&tls
    };

    fflush(stdout);   /* do not let buffered output be duplicated into the child */
    pid_t PID = fork();
    if (PID == -1) {
        printf("error\n");
    }
    else if (PID == 0) {
        printf("It's child process, PID is %d.", getpid());
        print_addresses(va);
        tls++;
        sleep(10);
    }
    else {
        printf("It's parent process!, PID is %d.", getpid());
        print_addresses(va);
        tls++;
        /* Reap the child; wait() returns the PID of the terminated child. */
        pid_t rpid = wait(NULL);
        printf("Catch the child process pid : %d\n", rpid);
    }

    if (ret > 0)
        printf("error!!!QQQQQQQ");
    return 0;
}
```

### Process

* **Design flow**
    1. The 2nd version differs from the 1st in how the two processes are started: this version uses fork() to create a new child process.
    2. An if-else on the PID returned by fork() decides whether the code is running in the parent or in the child (PID > 0: parent; PID == 0: child).
    3. At the end, the parent calls wait(), which reaps the resources of the terminated child and returns that child's PID.

```shell
./getpa2_fork
```

### Result of the 2nd ver.

![](https://i.imgur.com/DELOauy.jpg)

* **Result analysis**
    1. As in the 1st version, the physical addresses of Code and Library are the same in both processes.
    2. Unlike the 1st version, all virtual addresses are identical in the two processes, but every physical address other than Code and Library is different. (Identical virtual addresses do not imply identical physical addresses.)

## Bonus Questions

### Question

1. When the parent process calls wait4() or waitpid(), why does it collect information from the terminated child process?
2. What information is collected, and what is it used for?

### Answer

- wait(): When a process calls wait(), the calling process is suspended. wait() checks whether a child of the current process has already exited. If it finds a child that has become a zombie process, it collects that child's information and then destroys the zombie.
    - The status argument receives the child's termination information (exit code or terminating signal). Reaping the child also lets the kernel release the structures it still keeps for it, such as task_struct and thread_info.
- waitpid(): Similar to wait(), but with two additional parameters (pid and options).
    - pid: ID of the child process to wait for
    - options: additional flags that control the behavior of waitpid()
- wait(): Returns the status information of the child process.
- wait4(): Besides the status information, the rusage argument returns the child's resource usage (e.g. total user CPU time, total system CPU time, number of page faults, number of signals received).
- Purpose: to reclaim and release resources. After a child process terminates, if the parent does not reap it, the child remains a zombie process that still occupies resources.

## Supplementary Notes

### System call: sleep()

The snippet below is the kernel-side sleep() helper from drivers/staging/comedi/drivers/me_daq.c (Linux v3.9), linked in the references.

```c=
static inline void sleep(unsigned sec)
{
    current->state = TASK_INTERRUPTIBLE;
    schedule_timeout(sec * HZ);
}
```

* **Explanation**
    1. sleep() sets the state of the current process to TASK_INTERRUPTIBLE.
    2. It then calls schedule_timeout(), which in turn calls schedule() to perform the context switch.

### mmap() inside the system call: mmap_pgoff(...)
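```c=
SYSCALL_DEFINE6(mmap_pgoff, unsigned long, addr, unsigned long, len,
                unsigned long, prot, unsigned long, flags,
                unsigned long, fd, unsigned long, pgoff)
{
    struct file *file = NULL;
    unsigned long retval;

    if (!(flags & MAP_ANONYMOUS)) {
        audit_mmap_fd(fd, flags);
        file = fget(fd);
        if (!file)
            return -EBADF;
        if (is_file_hugepages(file))
            len = ALIGN(len, huge_page_size(hstate_file(file)));
        retval = -EINVAL;
        if (unlikely(flags & MAP_HUGETLB && !is_file_hugepages(file)))
            goto out_fput;
    } else if (flags & MAP_HUGETLB) {
        struct user_struct *user = NULL;
        struct hstate *hs;

        hs = hstate_sizelog((flags >> MAP_HUGE_SHIFT) & MAP_HUGE_MASK);
        if (!hs)
            return -EINVAL;

        len = ALIGN(len, huge_page_size(hs));
        /*
         * VM_NORESERVE is used because the reservations will be
         * taken when vm_ops->mmap() is called
         * A dummy user value is used because we are not locking
         * memory so no accounting is necessary
         */
        file = hugetlb_file_setup(HUGETLB_ANON_FILE, len,
                                  VM_NORESERVE,
                                  &user, HUGETLB_ANONHUGE_INODE,
                                  (flags >> MAP_HUGE_SHIFT) & MAP_HUGE_MASK);
        if (IS_ERR(file))
            return PTR_ERR(file);
    }

    flags &= ~(MAP_EXECUTABLE | MAP_DENYWRITE);

    retval = vm_mmap_pgoff(file, addr, len, prot, flags, pgoff);
out_fput:
    if (file)
        fput(file);
    return retval;
}
```

* **Explanation**
    1. sys_mmap calls sys_mmap_pgoff (which goes through vm_mmap_pgoff in the code above), and the do_mmap function finally implements the actual work.
    2. After its checks, the function calls get_unmapped_area(file, addr, len, pgoff, flags) to obtain an unmapped region, that is, a free range of virtual addresses. It then calls mmap_region, which checks whether the mapping is legal and whether it can be merged with an existing VMA, and returns the address.

### Signal handling mechanism

* **Explanation**
    1. In Linux, signals are a way for processes to send notifications to each other.
    2. A signal notifies a process that an asynchronous event has occurred. Processes can send software-interrupt signals to each other through the kill system call. The kernel can also send a signal to a process because of an internal event. Note that a signal only tells the process which event occurred; it does not pass any data to the process.
    3. A process that receives a signal can handle it in one of three ways. First, similar to an interrupt handler, the process can register a handler function for the signals it wants to handle. Second, it can ignore the signal and do nothing, as if the signal had never occurred. Third, it can keep the system default action; for most signals the default action terminates the process.
    4. A process uses the signal system call to specify how it handles a given signal.

### Segmentation Fault vs Page Fault

* **Explanation**
    1. Segmentation fault: a memory segment error, also called an access violation. It occurs when a program tries to access a memory region that the CPU cannot address for it or that the program is not permitted to access. When this happens, the hardware notifies the operating system that a memory access violation has occurred.
    2. Page fault: when a page is accessed, the page table is first searched for that page's PTE. The PTE stores the frame that the page maps to, together with other information about the page, such as whether the page is currently present in physical memory (whether it has a frame). If the accessed page is not present in physical memory, a page fault occurs.
    3. Segmentation fault and page fault are two entirely different things. A segmentation fault means the program tried to access an invalid or illegal memory address. A page fault means the data the program wants is not in physical memory yet, so after the MMU raises the fault the required page has to be loaded (swapped) into a frame. The former is an illegal access: the program is normally stopped with an error message. The latter is legal. The fewer page faults occur, the better the system performs, and fewer disk reads and writes can also extend the life of the hardware.

### fork() vs do_fork()

* **Explanation**
    1. fork(): this system call creates a child process. The kernel gives the child a memory space separate from the parent's, and its contents are a copy of the parent's. It requires `#include <unistd.h>`.
    2. do_fork(): the kernel-side fork works differently from an ordinary C library function call. An ordinary library call passes its arguments by placing the parameter values on the process's stack. The fork system call is a special kind of function call that interrupts the process and switches from user mode to kernel mode; eax identifies which system call is being requested, and the arguments are passed in CPU registers.

[do_fork identifier - Linux Source Code](https://elixir.bootlin.com/linux/v3.7/source/kernel/fork.c#L1555)

### Copy On Write

* **Explanation**

![](https://i.imgur.com/JwiOkZJ.png)

In general, every process has its own private memory, such as heap, stack, BSS and data, but processes may also use the same resources, for example the libc used by C programs. Resources that are not modified can be shared through the address translation provided by virtual addresses and the MMU, which avoids unnecessary copies. In practice, when several processes use the same data, only one copy is loaded at first and it is marked read-only. When a process tries to write to it, a page fault is raised to the kernel, and the page fault handler performs the copy-on-write operation.

The fork() system call creates a new process. After the parent forks a child, the child does not copy the parent's data right away. Instead it temporarily shares the data with the parent, and these pages are marked read-only in the page table. When either side tries to modify the data, a page fault is triggered. The do_wp_page() function handles pages that a process tried to modify while they were marked read-only: it allocates a new page and copies the contents of the old page into it. In other words, Linux first copies the original process's mm_struct, vm_area_struct and page table and sets the flag of each page to read-only. When a modification happens later, the COW mechanism takes care of it.

* **The flow is as follows:**

![](https://i.imgur.com/67wODMU.png)

## Reference

- [wait(), waitpid(), wait4](https://www.cnblogs.com/tongye/p/9558320.html)
- [wait(), waitpid(), wait4](https://blog.csdn.net/qq_40073459/article/details/104453740)
- [replace wait4() by waitpid()](https://stackoverflow.com/questions/35316374/why-did-wait4-get-replaced-by-waitpid)
- [fork()](https://wenyuangg.github.io/posts/linux/fork-use.html)
- [sleep()](https://elixir.free-electrons.com/linux/v3.9/source/drivers/staging/comedi/drivers/me_daq.c#L179)
- [signal](https://lyt0112.pixnet.net/blog/post/347001917-linux-%E4%BF%A1%E8%99%9Fsignal%E8%99%95%E7%90%86%E6%A9%9F%E5%88%B6)
- [segmentation fault](https://grantliblog.com/2022/05/05/%E6%B7%BA%E8%AB%87-c-%E4%B8%AD%E7%9A%84segmentation-fault-%E9%8C%AF%E8%AA%A4/)
- [page fault](https://magiclen.org/page-fault/)
- [do_fork](https://blog.csdn.net/gatieme/article/details/51569932)
- [system call vs normal function call](https://blog.csdn.net/weixin_40710708/article/details/105370454)
- [Commonly used system calls](https://hackmd.io/@dZfCcN4hT8aUuDPv3B8CWQ/B1zxcLJmK)
- [fork concepts, from basic to advanced](https://wenyuangg.github.io/posts/linux/fork-use.html)
- [copy on write](https://hackmd.io/@linD026/Linux-kernel-COW-Copy-on-Write)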
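
## Example: reaping a child with wait4()

The bonus answer and the Copy On Write notes can be tried out together in user space. The following is only a minimal sketch under stated assumptions, not part of the project code: the helper `run_and_reap`, the struct `child_report`, the global `shared_value` and the exit code used in the usage note are all hypothetical names chosen for illustration. It forks a child that overwrites a global and exits, then lets the parent collect the exit status and the resource usage through wait4().

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* lets glibc declare wait4() under strict -std modes */
#endif
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical global used to observe copy-on-write isolation. */
static int shared_value = 30;

/* Hypothetical summary of what the parent learns from wait4(). */
struct child_report {
    pid_t reaped_pid;       /* PID returned by wait4() */
    int exited_normally;    /* non-zero if the child ended via exit/_exit */
    int exit_code;          /* meaningful only when exited_normally is set */
    struct rusage usage;    /* child's CPU time, page faults, ... */
};

/* Fork a child that overwrites shared_value and exits with exit_code.
 * The parent blocks in wait4() until the child has terminated, then
 * reaps it. Returns 0 on success, -1 if fork() or wait4() failed. */
int run_and_reap(int exit_code, struct child_report *out)
{
    assert(out != NULL);
    fflush(stdout);              /* avoid duplicating buffered output */

    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        shared_value = 99;       /* the write goes to the child's own copy */
        _exit(exit_code);
    }

    int status = 0;
    out->reaped_pid = wait4(pid, &status, 0, &out->usage);
    if (out->reaped_pid == -1)
        return -1;

    out->exited_normally = WIFEXITED(status);
    out->exit_code = out->exited_normally ? WEXITSTATUS(status) : -1;
    return 0;
}
```

Calling `run_and_reap(7, &r)` from `main` should leave `r.exit_code` at 7 while `shared_value` in the parent still reads 30, because the child's write landed on its private copy of the page. The fields `r.usage.ru_utime`, `r.usage.ru_stime`, `r.usage.ru_minflt` and `r.usage.ru_majflt` then hold the child's CPU time and page-fault counts.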