# 2025 NCU Linux Project 1
## Description
In modern operating systems, every running program is a process, and each process has its own virtual address space. The CPU and OS cooperate through the page table to translate a virtual address (VA) to its corresponding physical address (PA) in memory.
In this project, you will implement a new Linux system call that translates a given virtual address into its corresponding physical address. This allows you to directly observe how the OS manages physical memory and how virtual-to-physical mappings evolve at runtime.
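
To make the page-table walk concrete, here is a minimal sketch of how a virtual address splits into per-level table indices plus a page offset. It assumes x86-64 with 4-level paging and 4 KiB pages (9 index bits per level, 12 offset bits). The function name `split_virtual_address` and the level labels are illustrative, and your kernel configuration may differ (for example, 5-level paging).

```python
PAGE_SHIFT = 12   # assumption: 4 KiB pages -> 12 offset bits
INDEX_BITS = 9    # assumption: 512 entries per page-table level
LEVELS = ("pgd", "pud", "pmd", "pte")  # top level first (illustrative labels)


def split_virtual_address(va):
    """Split va into one table index per level and the offset inside the page."""
    offset = va & ((1 << PAGE_SHIFT) - 1)
    indices = {}
    # The top-level index sits just above the three lower-level indices.
    shift = PAGE_SHIFT + INDEX_BITS * (len(LEVELS) - 1)
    for name in LEVELS:
        indices[name] = (va >> shift) & ((1 << INDEX_BITS) - 1)
        shift -= INDEX_BITS
    return indices, offset


indices, offset = split_virtual_address(0x7FFDEADBEEF0)
print(indices, hex(offset))
```

Inside the kernel, each index selects one entry in the table at that level, and the final entry supplies the physical frame that the offset is added to.
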
Your system call should follow the prototype below:
```c
void *my_get_physical_addresses(void *virtual_address);
```
The return value should be:
* 0 -> if the virtual address is currently not mapped to any physical page
* Non-zero value -> the physical address corresponding to the input virtual address
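
The return-value rule above can be modeled in a few lines. This is a toy sketch, not kernel code: a flat dictionary stands in for the multi-level page table, `translate` is a hypothetical name, and 4 KiB pages are assumed. It shows that the offset within the page is preserved and that an unmapped page yields 0.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def translate(page_table, va):
    """Return the physical address for va, or 0 if its page is not mapped.

    page_table is a toy model: {virtual page number: physical frame number}.
    """
    vpn, offset = divmod(va, PAGE_SIZE)
    pfn = page_table.get(vpn)
    if pfn is None:
        return 0  # no physical page behind this virtual page
    return pfn * PAGE_SIZE + offset


toy_table = {0x7F000: 0x1A2B3}  # made-up mapping for illustration
print(hex(translate(toy_table, 0x7F000123)))
print(translate(toy_table, 0x1000))
```
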
## Question 1 (30 points)
In this part, we use `malloc()` to allocate memory. At first, the virtual addresses (VAs) are reserved, but they do not yet have corresponding physical addresses (PAs).
Only when the program writes to a page does the operating system actually allocate a physical page.
This mechanism, where physical memory is allocated only on first access, is called **lazy allocation**.
```c=
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#define SYS_GET_PHY 449 // <-- replace with your syscall number
unsigned long my_get_physical_addresses(void *va) {
    return (unsigned long)syscall(SYS_GET_PHY, va);
}
int main(void) {
    unsigned long length = 4 * 4096;
    // Allocate four pages' worth of memory with malloc()
    void *addr = malloc(length);
    if (addr == NULL) {
        perror("malloc");
        return 1;
    }
    printf("malloc() returned address = %p\n\n", addr);
    printf("Check VA -> PA before access:\n");
    for (size_t i = 0; i < length; i += 4096) {
        // Use char * arithmetic; arithmetic on void * is a GNU extension
        unsigned long pa = my_get_physical_addresses((char *)addr + i);
        printf(" VA: %p -> PA: %p\n", (void *)((char *)addr + i), (void *)pa);
    }
    printf("\nTouching memory (write 1 byte per page)...\n");
    for (size_t i = 0; i < length; i += 4096) {
        ((char *)addr)[i] = 42;
    }
    printf("\nCheck VA -> PA after access:\n");
    for (size_t i = 0; i < length; i += 4096) {
        unsigned long pa = my_get_physical_addresses((char *)addr + i);
        printf(" VA: %p -> PA: %p\n", (void *)((char *)addr + i), (void *)pa);
    }
    free(addr);
    return 0;
}
```
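
As a user-space cross-check on lazy allocation that does not need your system call, the sketch below reads `/proc/self/pagemap`, where bit 63 of each 64-bit entry is the "page present" flag. It is an illustration under assumptions: it runs on Linux, it uses an anonymous private `mmap` region rather than `malloc()` so that the pages are guaranteed to be untouched at first, and the helper name `page_present` is made up here. On recent kernels the frame-number bits of a pagemap entry read as zero without `CAP_SYS_ADMIN`, so only the present bit is used.

```python
import ctypes
import mmap
import os
import sys

PAGE_SIZE = mmap.PAGESIZE
NUM_PAGES = 4


def page_present(va):
    """Return the pagemap "present" bit for va, or None if pagemap is unreadable."""
    try:
        fd = os.open("/proc/self/pagemap", os.O_RDONLY)
    except OSError:
        return None
    try:
        # One 8-byte entry per virtual page, indexed by virtual page number.
        raw = os.pread(fd, 8, (va // PAGE_SIZE) * 8)
    except OSError:
        return None
    finally:
        os.close(fd)
    if len(raw) != 8:
        return None
    entry = int.from_bytes(raw, sys.byteorder)
    return bool((entry >> 63) & 1)


# Anonymous private mapping: address space is reserved, pages are not yet backed.
region = mmap.mmap(-1, NUM_PAGES * PAGE_SIZE,
                   flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
anchor = ctypes.c_char.from_buffer(region)  # kept alive so the address stays valid
base = ctypes.addressof(anchor)
page_vas = [base + i * PAGE_SIZE for i in range(NUM_PAGES)]

before = [page_present(va) for va in page_vas]
for i in range(NUM_PAGES):
    region[i * PAGE_SIZE] = 42  # write one byte per page to trigger allocation
after = [page_present(va) for va in page_vas]

print("present before touch:", before)
print("present after touch: ", after)
```

If the present bits change from false to true after the writes, that matches the before/after pattern your system call should report for the same kind of memory.
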
## Question 2 (30 points)
In this part, you wrap your system call in a C function and build it as a shared library so that it can be called from Python through the `ctypes` library.
```c=
// my_get_phy.c
#include <unistd.h>
#define SYS_hello XXX // your syscall number
// Python calls this function through ctypes
unsigned long my_get_physical_addresses(void *virtual_addr) {
    return (unsigned long)syscall(SYS_hello, virtual_addr);
}
```
Compile the C code into a shared library using:
```bash
gcc -Wall -O2 -fPIC -shared -o lib_my_get_phy.so my_get_phy.c
```
---
All programming languages are eventually translated into machine code that runs on the operating system (OS).
Each program, regardless of its language, is executed as a process managed by the OS.
Therefore, all processes - whether written in C, Python, or other languages - have their own heap, stack, and page table, just like a normal C program.
The Python program below uses `sbrk(0)` to observe how the heap grows.
Each time the program break moves, it calls your system call to check the physical addresses of the newly added heap range.
Some of these virtual pages may not have a physical mapping yet, because physical memory is assigned only when a page is first accessed.
```python=
import ctypes

# -----------------------------
# C syscall wrapper
# -----------------------------
lib = ctypes.CDLL('./lib_my_get_phy.so')
lib.my_get_physical_addresses.argtypes = [ctypes.c_void_p]
lib.my_get_physical_addresses.restype = ctypes.c_ulong

def virt2phys(addr):
    """VA -> PA"""
    return lib.my_get_physical_addresses(ctypes.c_void_p(addr))

# -----------------------------
# sbrk(0) returns the current program break
# -----------------------------
libc = ctypes.CDLL("libc.so.6")
libc.sbrk.restype = ctypes.c_void_p
libc.sbrk.argtypes = [ctypes.c_long]

def get_heap_break():
    return libc.sbrk(0)

PAGE_SIZE = 4096
print("=== Initial program break ===")
break_before = get_heap_break()
print(f"program break: {hex(break_before)}\n")
malloc_size = 4096  # allocate 4 KB each time
buffers = []
# Allocate repeatedly and watch the program break change
for i in range(50):
    buf = ctypes.create_string_buffer(malloc_size)
    buffers.append(buf)
    break_now = get_heap_break()
    # The break moved, so the heap has grown
    if break_now != break_before:
        print(f"malloc {i+1}: break = {hex(break_now)}")
        print(f" -> program break increased by {break_now - break_before} bytes")
        # List VA -> PA for the newly added heap range
        start_va = break_before
        end_va = break_now
        print("\n=== New heap VA->PA ===")
        for va in range(start_va, end_va, PAGE_SIZE):
            try:
                pa = virt2phys(va)
                if pa == 0:
                    pa_str = "not mapped"
                else:
                    pa_str = f"0x{pa:x}"  # hexadecimal format
            except Exception:
                pa_str = "not mapped"
            print(f"VA: 0x{va:x} -> PA: {pa_str}")
        print()
        break_before = break_now
```
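
The loop above steps from the old break in `PAGE_SIZE` increments, which visits each page exactly once only if the old break is page-aligned. That alignment is not guaranteed in general, so here is a hedged sketch of a helper that rounds the start down to a page boundary first. The name `pages_covering` is hypothetical and 4 KiB pages are assumed.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def pages_covering(start_va, end_va, page_size=PAGE_SIZE):
    """Return the page-aligned base address of every page overlapping [start_va, end_va)."""
    if end_va <= start_va:
        return range(0)  # empty range: nothing to visit
    first_page = start_va - (start_va % page_size)  # round down to a page boundary
    return range(first_page, end_va, page_size)


for page_va in pages_covering(0x1800, 0x3001):
    print(hex(page_va))
```

It could replace `range(start_va, end_va, PAGE_SIZE)` in the program above if you observe an unaligned break on your system.
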
### Hint
* Two threads share a physical memory cell (one byte) if both of them have a virtual address that is translated into the physical address of that memory cell.
* The kernel usually does not allocate physical memory for all of a process's code and data when the process starts execution.
* Inside the Linux kernel, you need to use `copy_from_user()` and `copy_to_user()` to copy data from and to a user-space buffer.
* Check the "Referenced Material" part of the course website to see how to add a new system call in Linux.
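
The first hint can be observed from user space even inside a single process. The sketch below is an illustration, not part of the required solution: it maps the same temporary file twice with shared mappings, so two different virtual addresses are backed by the same page, and a write through one is visible through the other. Variable names are made up. With your system call, both addresses would be expected to report the same physical page once the page has been touched, but verify that on your own kernel.

```python
import ctypes
import mmap
import tempfile

with tempfile.TemporaryFile() as f:
    f.truncate(mmap.PAGESIZE)  # give the file one page of backing store
    # Two independent shared mappings of the same file page.
    view_a = mmap.mmap(f.fileno(), mmap.PAGESIZE)
    view_b = mmap.mmap(f.fileno(), mmap.PAGESIZE)

    addr_a = ctypes.addressof(ctypes.c_char.from_buffer(view_a))
    addr_b = ctypes.addressof(ctypes.c_char.from_buffer(view_b))

    view_a[0] = 42          # write through the first virtual address
    seen_by_b = view_b[0]   # read through the second virtual address

print(f"VA a: 0x{addr_a:x}, VA b: 0x{addr_b:x}")
print("byte read through b:", seen_by_b)
```
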
## Project Submission:
* Due time: 17th Nov
* The demo will be held on 21st Nov
* Please fill out this [form](https://docs.google.com/spreadsheets/d/1auJ64XT8Ew_H3FudmqsQAgkpTliuFNjxZ3CRho8UFPQ/edit?usp=sharing) to choose your demo time before 15th Nov
* An on-site demo of this project is required.
* During the on-site demo, the TAs will run several of their own programs to check the correctness of your system call.
* During the demo, the TAs will also ask you questions about your project. Part of your project grade **(40%)** is determined by your answers to these questions.
* Report Content:
    * Do not forget to write the names and student IDs of all members of your team.
    * Your report should be a HackMD document and contain:
        * Your source code
        * The execution results
    * Submit the URL of your HackMD document to eeclass.
* Late submissions will NOT be accepted.