# CS 162 Project Design Doc
## Task 1: Argument Passing
### 1. Data structures and functions
In `process.c` modify
``` C
static bool setup_stack (void **esp)
```
to be
``` C
static bool setup_stack (void **esp, const char* arg_str)
```
The arguments will be extracted from `arg_str` and then will be used to set up the stack
### 2. Algorithms
In `load` function in `process.c`, we call `setup_stack(esp, filename)`.
Before `setup_stack` function return, we perform the following steps if `success = true`:
1. Move `esp` down to the length of `arg_str` then copy the `arg_str` to stack
2. Create a array of `char* argv_ptr[MAX_ARGS]` where `MAX_ARGS` is chosen. Init a local variable argc = 0
3. Tokenize the `arg_str` using `strtok_r`, and record the pointer for each token.
```C
for (token = strtok_r (arg, " ", &save_ptr); token != NULL; token = strtok_r (NULL, " ", &save_ptr)) {
argv_ptr[argc] = save_ptr;
argc++;
}
```
4. Move the `esp` to get space for `argc`, `argv`, `argv[0]...argv[argc]` => `esp -= 4*(argc + 3)`
5. Compute `stack_align = (esp % 16 - 12) % 16` :
```
For x is the number of bytes that need to move down to make stack align
(esp - x) % 16 = 12
=> (esp % 16 - x % 16) % 16 = 12
=> x = (esp % 16 - 12) % 16
```
6. Move `esp` down: `esp -= stack_align`
7. Set all the arguments
```C
*(esp + 4) = argc
*(esp + 8) = esp + 12 // argv
for i in range(0..argc):
*(esp + 12 + (i * 4)) = argv_ptr[i]
```
### 3. Synchronization
N/A
### 4. Rationale
An alternative way to record the pointers to tokens is using pintos linked-list instead of a array of pointer with a default `MAX_ARGS`.
Pros: do not need to declare explicit `MAX_ARGS` before hand
Cons: need to create a wrapper stuct that contain `list_elem` and `char*`
## Task 2: Process Control Syscalls
### 1. Data structures and functions
#### Wait
```/pintos/src/threads/thread.h```
wait status must reside in kernel?
```C
struct wait_status
{
//FROM DISCUSSION
//struct list_elem elem;
//struct lock lock;
//int ref_cnt:
//tid_t tid;
//int exit_code;
//struct semaphore dead;
}
```
```
struct exec_status
{
//struct semaphore load;
//tid_t tid;
}
```
### 2. Algorithms
#### Verification of Addresses
For every address obtained from the stack during the syscalls, we must check:
1) The address is not Null
2) The address is within userspace
3) Address is mapped - lookup_page(uint32_t pd) pagedir.c
To check that the address is within userspace we verify that the address is not smaller than `PHYS_BASE`. We can check this by calling `is_ser_vaddr(cons void *vaddr)` in `pintos/src/threads/vaddr.h`
To check if the address is mapped we call `lookup_page(uint32_t pd)` in `/pintos/src/userprog/pagedir.c`. We get the page directory from `thread_current()->pagedir`.
If any of these three are true we throw an error.
In pintos/src/userprog/syscall.c first we must check which syscall is being called by checking args[0].
Each syscall has a preassigned number defined in pintos/src/lib/syscall-nr.h
#### Halt
Practice calls `syscall0(NUMBER)` in `pintos/src/lib/user/syscall.c`
`NUMBER` = preassigned number defined in `pintos/src/lib/syscall-nr.h`
```C
12 ("pushl %[number]; int $0x30; addl $4, %%esp" \
13 : "=a" (retval) \
14 : [number] "i" (NUMBER) \
15 : "memory"); \
```
1) `syscall0` pushes `NUMBER` on to the stack.
2) `syscall0` runs the interrupt instruction `int $0x30`.
3) `syscall0` adds 4 bytes to the stack pointer: `addl $4, %%esp`.
4) `syscall0` returns the value in the eax register? what value
In the syscall handler in `pintos/src/userprog/syscall.c` args is a unsigned 32 bit int that is assigned the value of our stack pointer: `uint32_t* args = ((uint32_t*) f->esp)`. Since args is a 32 bit int pointer, we know that using array bracket notation to access each element will give us the 4 bytes of memory at a time: `args[2] = *(args + (2 * 4 bytes))`
From the information learned in syscall1 we know:
`args[0] = NUMBER`
First we check if `args[0]` is assigned the same number as SYS_PRACTICE.
Next we call shutdown_power_off() from devices/shutdown.h to halt.
```C
if (args[0] == SYS_HALT)
{
shutdown_power_off()
}
```
#### Exec
`pid_t exec (const char *cmd_line)`
`cmd_line` = name of executable
Will use syscall1
```C
if (args[0] == SYS_EXEC)
{
- make sure cmd_line is a valid executable name return -1 if does not exist
- lock file so that it cannot be modified once opened (maybe file system call will do this for us?)
- open file
- create process using /pintos/src/userprog/process.c tid_t process_execute(const char *file_name)
- if there are child processes, wait by calling sema_down
- if child processes successfully load their executables, call sema_up
- return pid by assigning frame's eax register to pid
}
```
#### Wait
`int wait (pid_t pid)`
```C
if (args[0] == SYS_WAIT)
{
/*FROM DISCUSSION*/
- Find the child in the list of shared data structures.
- (If none is found, return -1.)
- Wait for the child to die, by downing a semaphore in the shared data.
- Obtain the child’s exit code from the shared data.
- Destroy the shared data structure and remove it from the list.
- Return the exit code.
}
```
```C
pintos/src/userprog/process.c
/*FROM DISCUSSION*/
process_wait (tid_t child_tid) {
- iterate through list of child processes
- if child process tid matches tid parameter, call sema_down on the semaphore associated with that child process
- (when child process exits, it will sema_up on that same semaphore, waking up the parent process)
--- after waking ---
- set exit code to terminated child process’s exit code
- decrease ref_cnt of child //why? need to free memory, "zombie processes"
- return exit_code
}
process_exit (void) {
- sema_up on semaphore that parent might be sleeping on
- remove all child processes from child process list
- decrement ref_cnt
}
```
#### Practice
Practice calls `syscall1(NUMBER, ARG0)` in `pintos/src/lib/user/syscall.c`
`NUMBER` = preassigned number defined in `pintos/src/lib/syscall-nr.h`
`ARG0` = user entered number to be incremented by one
```x86
25 ("pushl %[arg0]; pushl %[number]; int $0x30; addl $8, %%esp" \
26 : "=a" (retval) \
27 : [number] "i" (NUMBER), \
28 [arg0] "g" (ARG0) \
29 : "memory"); \
```
1) `syscall1` pushes `ARG0` on to the stack.
2) `syscall1` pushes `NUMBER` on to the stack.
3) `syscall1` runs the interrupt instruction `int $0x30`.
4) `syscall1` adds 8 bytes to the stack pointer: `addl $8, %%esp`.
5) `syscall1` returns the value in the eax register
In the syscall handler in `pintos/src/userprog/syscall.c` args is a unsigned 32 bit int that is assigned the value of our stack pointer: `uint32_t* args = ((uint32_t*) f->esp)`. Since args is a 32 bit int pointer, we know that using array bracket notation to access each element will give us the 4 bytes of memory at a time: `args[2] = *(args + (2 * 4 bytes))`
From the information learned in syscall1 we know:
`args[0] = NUMBER`
`args[1] = ARG0`
First we check if `args[0]` is assigned the same number as SYS_PRACTICE.
Next we add one to `ARG0` and save it in the frame's eax register to be returned.
```
if (args[0] == SYS_PRACTICE)
{
f->eax = args[1] + 1;
}
```
### 3. Synchronization
#### Wait
By using semaphores, we are able to make sure that parent and child wait calls are synchronized. For example, if a parent process is called to wait before a child process exits, our implementation will call sema_down on the semaphore associated with that child process. When the child process exits it will sema_up on the same semaphore. When the semaphore is equal to 1, the child is dead. If the child has not yet exited, the semaphore will be less than 1.
#### Exec
We had to make sure that when we call exec, the parent process cannot return until it knows if the child process successfully loaded it's executable. Again we chose to use semaphores here to force the parent to wait if their children did not successfully load their executables yet. When the children's executables are loaded, the children will call sema_up to the same semaphore and will allow the parent to return.
### 4. Rationale
We chose to implement our synchronization using semaphores beccause we saw it as a useful method of abstraction to communicate between processes. While we could have used some binary or boolean logic, semaphores are especially helpful for counting if there exists more than one child process. For example, if there are many child processes that need to exit before the parent can call wait, we can make sure they have all exited by ensuring our semaphore is a certain size rather than checking many different variables.
We chose to implement our validation techniqies as the first option described in the spec: verify the validity of a user-provided pointer, then dereference it. We chose this implementation over only checking if a pointer is below PHYS_BASE and then handling the page fault. Even though the second implementation would be faster, we thought that it would be more straight forward to do some simple validity checks instead.
## Task 3: File Operation Syscalls
### 1. Data structures and functions
do we need a list to keep track of files, if another tread tries to read from the file, but its alraedy being read, add it to a queue???
syscall.c needs to keep track of which threads are trying to do a syscall on the file system.
We should keep a DS to keep track of when the file system is locked and by which process as well as which files are currently open.
```C
// new struct
struct open_file {
int fd; /* File descriptor for this open file. */
FILE * file /* Pointer to the file. */
list_elem elem;
}
//thread.h
struct thread {
...
int num_fds /* number of open files */
list open_files /* list of open_files */
...
}
//Global filesystem_lock
static lock filesystem_lock; /* only 1 thread can acquire the lock */
```
#### create
#### remove
#### open
#### filesize
#### read
#### write
#### seek
#### tell
#### close
### 2. Algorithms
scope pintos/src/filesys/
For each one of these check the f
#### create
first check the pointer passed in is valid
aquire a lock on the system.
check that the amount of bytes the user wants to create is valid.
If either of these fail return false
else call filesys#filesys_create() passing in the filename
that the user wants and an initial size
#### remove
check pointer if file exists/is valid
request a system lock
we could 0 the file but that could be costly if its big, instead well just delete the refreance to it.
#### open
check the ptr, if valid:
call file_open
add it to the list of files open by whichever thread opened it
#### filesize
check the file * ptr
if its valid call file#file_length()
#### read
first we check if what we are trying to read is all in user space by checking the pointr and pointer + filesize.
Aditionaly, check that the file buffer being passed in is valid, ensuring that the bytes we are loading are in the process' allocated space.
should be fine for multiple threads to read from the same open file, as long as there isnt a thread trying to write to it.
call file#file_read()
#### write
check file pointer
call file#file_deny_write()
then aquire a lock on the filesys
check that the file being writen ptr + size does not go over the programs allocated pages.
if its valid size then use file#file_write()
#### seek
We could use this boi to make fragmented files in the furture no?
check the pointer to see if its valid,
if it is call file#file_seek()
#### tell
check the pointer,
if its valid call file#file_tell()
#### close
check validity of pointer.
call file_close()
remove it from the threads list of open files.
call file_allow_write().
this does not necessaraly mean that the file can be written to as other processes may still be reading.
### 3. Synchronization
threads share both the file system and whatever files need to be read from or written to. Multiple threads can be reading from a file but only one thread can write and only one thread can use the file system manager at a time.
To ensure only one thread is using the system at once we request a lock for that process.
if a process wants to read from a file file_deny_write() is called.
Multiple processes can call this on multiple files, and a file can only be written to again when all processes have called file_allow_write() on that file.
This is like having a "shared" lock.
if a process wished to write, it must wait for all the "shared" locks to be released.
### 4. Rationale
for these we are just calling implementing functinos but doing saftey checks beforehand.
We can allow some sharing; if two threads on want to process some data on a file, its fine if they both have access to it as long as no other thread is trying to write, if another thread was allowed to write we would get undetermined behavior.
We also have to make sure the user program does not write data that is too big or they could overwrite memory that is not allocated to them.
## Additional Questions
### 1. Invalid Stack Pointer
`sc-bad-sp.c` tests bad stack pointer handling
line 18, a crazy number is put into `%esp` and should be handled accordingly in the interupt
### 2. Valid Stack Pointer
`sc-bad-arg.c` im sus about SYS_EXIT
### 3. Not Fully Tested
The tests dont check for possible dead locks, if locking isnt handled properley, we could end up with a loop where processess are waiting for eachother to release a lock.