# 2025 NCU Linux Project 1

## Description

In modern operating systems, every running program is a process, and each process has its own virtual address space. The CPU and the OS cooperate through the page table to translate a virtual address (VA) into its corresponding physical address (PA) in memory.

In this project, you will implement a new Linux system call that translates a given virtual address into its corresponding physical address. This lets you directly observe how the OS manages physical memory and how virtual-to-physical mappings evolve at runtime.

Your system call should follow the prototype below:

```c
void *my_get_physical_addresses(void *);
```

The return value should be:

* 0 -> if the virtual address is currently not mapped to any physical page
* Non-zero value -> the physical address corresponding to the input virtual address

## Question 1 (30 points)

In this part, we use `malloc()` to allocate memory. At first, the virtual addresses (VAs) are reserved, but they do not yet have corresponding physical addresses (PAs). Only when the program writes to a page does the operating system actually allocate a physical page. This mechanism, where physical memory is allocated only on first access, is called **lazy allocation**.
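Lazy allocation and address translation both work at page granularity, so it can help to see how a single address splits into page-table indices and an in-page offset. The sketch below is illustrative only. It assumes x86-64 4-level paging with 4 KiB pages, which the project text does not specify, and `split_va` and `compose_pa` are hypothetical helper names rather than kernel APIs. The level names follow Linux's PGD/PUD/PMD/PTE terminology.

```python
PAGE_SHIFT = 12                          # assumption: 4 KiB pages
PAGE_SIZE = 1 << PAGE_SHIFT
INDEX_BITS = 9                           # assumption: 512 entries per table level
LEVELS = ("pgd", "pud", "pmd", "pte")    # top level first


def split_va(va):
    """Split a 48-bit virtual address into per-level indices and a page offset."""
    parts = {"offset": va & (PAGE_SIZE - 1)}
    for depth, name in enumerate(LEVELS):
        # The PTE index sits just above the offset; each higher level is 9 bits up.
        shift = PAGE_SHIFT + INDEX_BITS * (len(LEVELS) - 1 - depth)
        parts[name] = (va >> shift) & ((1 << INDEX_BITS) - 1)
    return parts


def compose_pa(pfn, va):
    """Combine a page frame number with the in-page offset taken from the VA."""
    return (pfn << PAGE_SHIFT) | (va & (PAGE_SIZE - 1))


if __name__ == "__main__":
    va = 0x7FFDEADBE123                  # arbitrary example address
    for name, value in split_va(va).items():
        print(f"{name:>6}: 0x{value:x}")
    # 0x1A2B3 is a made-up frame number used only to show the arithmetic.
    print(f"PA with pfn 0x1a2b3: 0x{compose_pa(0x1A2B3, va):x}")
```

Only the offset passes through translation unchanged. The upper bits select page-table entries, which is why the test programs in this project probe one address per 4096-byte page.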
```c=
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define SYS_GET_PHY 449 // <-- replace with your syscall number

unsigned long my_get_physical_addresses(void *va)
{
    return syscall(SYS_GET_PHY, va);
}

int main(void)
{
    unsigned long length = 4 * 4096;

    // Allocate four pages' worth of memory with malloc()
    void *addr = malloc(length);
    if (addr == NULL) {
        perror("malloc");
        return 1;
    }
    printf("malloc() returned address = %p\n\n", addr);

    printf("Check VA -> PA before access:\n");
    for (size_t i = 0; i < length; i += 4096) {
        unsigned long pa = my_get_physical_addresses((char *)addr + i);
        printf(" VA: %p -> PA: %p\n", (char *)addr + i, (void *)pa);
    }

    printf("\nTouching memory (write 1 byte per page)...\n");
    for (size_t i = 0; i < length; i += 4096) {
        ((char *)addr)[i] = 42;
    }

    printf("\nCheck VA -> PA after access:\n");
    for (size_t i = 0; i < length; i += 4096) {
        unsigned long pa = my_get_physical_addresses((char *)addr + i);
        printf(" VA: %p -> PA: %p\n", (char *)addr + i, (void *)pa);
    }

    free(addr);
    return 0;
}
```

## Question 2 (30 points)

In this part, you need to wrap your system call in a C shared library so that it can be called from Python through the `ctypes` library.

```c=
// my_get_phy.c
#include <unistd.h>

#define SYS_hello XXX // the syscall number you assigned

// Python calls this function through ctypes
unsigned long my_get_physical_addresses(void *virtual_addr)
{
    return (unsigned long)syscall(SYS_hello, virtual_addr);
}
```

Compile the C code into a shared library using:

```bash
gcc -Wall -O2 -fPIC -shared -o lib_my_get_phy.so my_get_phy.c
```

---

Whatever the language, a program ultimately executes as machine code on top of the operating system (OS) and runs as a process managed by the OS. Therefore, all processes, whether written in C, Python, or another language, have their own heap, stack, and page table, just like a normal C program.

We use Python to observe heap growth with `sbrk()`.
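To make the claim that a Python process has its own heap and stack concrete, one option is to look the regions up in `/proc/self/maps`, where Linux labels them `[heap]` and `[stack]`. The sketch below is a side illustration and not part of the required solution. `parse_maps_line`, `find_regions`, and `SAMPLE` are hypothetical names, and the sample text is made up rather than captured from a real run.

```python
import os


def parse_maps_line(line):
    """Parse one /proc/<pid>/maps line into (start, end, perms, name)."""
    # Fields: address range, perms, offset, dev, inode, optional pathname.
    fields = line.split(None, 5)
    start_hex, end_hex = fields[0].split("-")
    name = fields[5].strip() if len(fields) > 5 else ""
    return int(start_hex, 16), int(end_hex, 16), fields[1], name


def find_regions(maps_text, wanted=("[heap]", "[stack]")):
    """Return {name: (start, end)} for the wanted regions that appear."""
    regions = {}
    for line in maps_text.splitlines():
        if not line.strip():
            continue
        start, end, _perms, name = parse_maps_line(line)
        if name in wanted:
            regions[name] = (start, end)
    return regions


# Illustrative sample only; real addresses differ on every run.
SAMPLE = """\
55d0c8a00000-55d0c8a21000 rw-p 00000000 00:00 0                          [heap]
7ffc1b2e0000-7ffc1b301000 rw-p 00000000 00:00 0                          [stack]
"""

if __name__ == "__main__":
    path = "/proc/self/maps"
    if os.path.exists(path):
        with open(path) as maps_file:
            text = maps_file.read()
    else:
        text = SAMPLE                    # fall back to the sample off Linux
    for name, (start, end) in find_regions(text).items():
        # Page count assumes 4 KiB pages.
        print(f"{name}: 0x{start:x}-0x{end:x} ({(end - start) // 4096} pages)")
```

The end of the `[heap]` region should normally coincide with the value that `sbrk(0)` reports, which gives a quick cross-check against the program break printed by the script in this question.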
After allocating more heap memory, you can call your system call to check the physical addresses of the new memory. Some virtual pages may not have a physical mapping yet, because physical memory is only assigned when the pages are accessed.

```python=
import ctypes

# -----------------------------
# C syscall wrapper
# -----------------------------
lib = ctypes.CDLL('./lib_my_get_phy.so')
lib.my_get_physical_addresses.argtypes = [ctypes.c_void_p]
lib.my_get_physical_addresses.restype = ctypes.c_ulong

def virt2phys(addr):
    """VA -> PA"""
    return lib.my_get_physical_addresses(ctypes.c_void_p(addr))

# -----------------------------
# sbrk(0) returns the current program break
# -----------------------------
libc = ctypes.CDLL("libc.so.6")
libc.sbrk.restype = ctypes.c_void_p
libc.sbrk.argtypes = [ctypes.c_long]

def get_heap_break():
    return libc.sbrk(0)

PAGE_SIZE = 4096

print("=== Initial program break ===")
break_before = get_heap_break()
print(f"program break: {hex(break_before)}\n")

malloc_size = 4096  # allocate 4 KB each time
buffers = []

# Allocate repeatedly and watch the break change
for i in range(50):
    buf = ctypes.create_string_buffer(malloc_size)
    buffers.append(buf)
    break_now = get_heap_break()

    # A changed break means the heap has grown
    if break_now != break_before:
        print(f"malloc {i+1}: break = {hex(break_now)}")
        print(f" -> program break grew by {break_now - break_before} bytes")

        # List VA -> PA for the newly added break range
        start_va = break_before
        end_va = break_now
        print("\n=== New heap VA -> PA ===")
        for va in range(start_va, end_va, PAGE_SIZE):
            try:
                pa = virt2phys(va)
                if pa == 0:
                    pa_str = "not allocated"
                else:
                    pa_str = f"0x{pa:x}"  # hexadecimal format
            except Exception:
                pa_str = "not allocated"
            print(f"VA: 0x{va:x} -> PA: {pa_str}")
        print()
        break_before = break_now
```

### Hint

* Two threads share a physical memory cell (one byte) if both of them have a virtual address that translates to the physical address of that memory cell.
* The kernel usually does not allocate physical memory to store all the code and data of a process when the process starts execution.
* Inside the Linux kernel, you need to use the functions `copy_from_user()` and `copy_to_user()` to copy data from and to a user-space buffer.
* Check the "Referenced Material" part of the course website to see how to add a new system call in Linux.

## Project Submission:

* Due date: 17th Nov
* The demo will be held on 21st Nov
* Please fill out this [form](https://docs.google.com/spreadsheets/d/1auJ64XT8Ew_H3FudmqsQAgkpTliuFNjxZ3CRho8UFPQ/edit?usp=sharing) to choose your demo time before 15th Nov
* An on-site demo of this project is required.
* During the on-site demo, the TAs will run several programs they have written to check the correctness of your system call.
* During the demo, the TAs will ask you some questions about your project. Part of your project grade **(40%)** is determined by your answers to these questions.
* Report Content:
    * Do not forget to write the names and student IDs of all members of your team.
    * Your report should be a HackMD document and contain:
        * Your source code
        * The execution results
* Submit the URL of your HackMD document to eeclass.
* Late submissions will NOT be accepted.