# Assignment 4

## File data structure

![](https://i.imgur.com/AqKayGD.png)

- We need a file descriptor table per process entry
- In `proc.h`, add a `struct file_descriptor_entry *fd_table[OPEN_MAX]`

```
struct file_descriptor_entry {
    int fd_occupied;
    struct open_file_entry *fd_of;
    struct lock *fd_lock;
};
```

- `struct open_file_entry`

```
struct open_file_entry {
    /* kfree only when refcount is about to hit 0 */
    int of_refcount;
    // flags defined in src/kern/include/kern/fcntl.h
    int of_status;
    off_t of_offset;
    struct vnode *of_vn;
    struct lock *of_lock;
};
```

## Design & Implementation

- Important: Before sitting down to write code, get together with your partner and write down the following for every system call:
  - the arguments it takes
  - the return values it might return
  - the errors it must check for
  - the information it needs to access and update inside the file table
  - the functions and macros available in os161 that you can use for its implementation
  - the potential race conditions and how they must be prevented (assume no user-level threads for this assignment)

### open()

- `int open(const char *filename, int flags, mode_t mode);`
- Return: On success, open returns a nonnegative file handle. On error, -1 is returned, and errno is set according to the error encountered.
- Errors:
  - [ ] ENODEV The device prefix of filename did not exist.
  - [ ] ENOTDIR A non-final component of filename was not a directory.
  - [ ] ENOENT A non-final component of filename did not exist.
  - [ ] ENOENT The named file does not exist, and O_CREAT was not specified.
  - [ ] EEXIST The named file exists, and O_EXCL was specified.
  - [ ] EISDIR The named object is a directory, and it was to be opened for writing.
  - [ ] EMFILE The process's file table was full, or a process-specific limit on open files was reached.
  - [ ] ENFILE The system file table is full, if such a thing exists, or a system-wide limit on open files was reached.
  - [ ] ENXIO The named object is a block device with no filesystem mounted on it.
  - [ ] ENOSPC The file was to be created, and the filesystem involved is full.
  - [x] EINVAL flags contained invalid values.
  - [ ] EIO A hard I/O error occurred.
  - [ ] EFAULT filename was an invalid pointer.

### read()

- `ssize_t read(int fd, void *buf, size_t buflen);`
- Return: The count of bytes read is returned. This count should be positive. A return value of 0 should be construed as signifying end-of-file. On error, read returns -1 and sets errno to a suitable error code for the error condition encountered.
- Errors:
  - [ ] EBADF fd is not a valid file descriptor, or was not opened for reading.
  - [ ] EFAULT Part or all of the address space pointed to by buf is invalid.
  - [ ] EIO A hardware I/O error occurred reading the data.

### write()

- `ssize_t write(int fd, const void *buf, size_t nbytes);`
- Return: The count of bytes written is returned. This count should be positive. A return value of 0 means that nothing could be written, but that no error occurred; this only occurs at end-of-file on fixed-size objects. On error, write returns -1 and sets errno to a suitable error code for the error condition encountered.
- Errors:
  - [ ] EBADF fd is not a valid file descriptor, or was not opened for writing.
  - [ ] EFAULT Part or all of the address space pointed to by buf is invalid.
  - [ ] ENOSPC There is no free space remaining on the filesystem containing the file.
  - [ ] EIO A hardware I/O error occurred writing the data.

### lseek()

- `off_t lseek(int fd, off_t pos, int whence);`
- Return: On success, lseek returns the new position. On error, -1 is returned, and errno is set according to the error encountered.
- Errors:
  - [ ] EBADF fd is not a valid file handle.
  - [ ] ESPIPE fd refers to an object which does not support seeking.
  - [ ] EINVAL whence is invalid.
  - [ ] EINVAL The resulting seek position would be negative.
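
The two EINVAL cases listed for lseek() could be checked along the following lines. This is a minimal standalone sketch in plain C, not kernel code: `seek_newpos` is a hypothetical helper name, the file size is passed in as a parameter (in the kernel it would presumably have to be obtained from the vnode), and mapping arithmetic overflow to EINVAL is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>   /* SEEK_SET, SEEK_CUR, SEEK_END */

/*
 * Hypothetical helper (not part of os161): compute the offset that
 * lseek() would store in of_offset.
 *
 *   cur    - current offset of the open file entry
 *   size   - current size of the file
 *   pos    - offset argument passed by the user
 *   whence - SEEK_SET, SEEK_CUR or SEEK_END
 *
 * Returns 0 and stores the result in *newpos, or returns an errno value.
 */
int
seek_newpos(int64_t cur, int64_t size, int64_t pos, int whence,
            int64_t *newpos)
{
    int64_t base;

    /* Offsets and sizes tracked by the file table are never negative. */
    assert(cur >= 0 && size >= 0);

    switch (whence) {
    case SEEK_SET:
        base = 0;
        break;
    case SEEK_CUR:
        base = cur;
        break;
    case SEEK_END:
        base = size;
        break;
    default:
        return EINVAL;          /* whence is invalid */
    }

    /* Assumption: treat arithmetic overflow as EINVAL as well. */
    if (pos > 0 && base > INT64_MAX - pos) {
        return EINVAL;
    }
    if (base + pos < 0) {
        return EINVAL;          /* resulting position would be negative */
    }

    *newpos = base + pos;
    return 0;
}
```

In an actual implementation the caller would presumably hold `of_lock` while reading and updating `of_offset`, so that two descriptors sharing one open file entry cannot interleave their seeks.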

### close()

- `int close(int fd);`
- Return: On success, close returns 0. On error, -1 is returned, and errno is set according to the error encountered.
- Errors:
  - [ ] EBADF fd is not a valid file handle.
  - [ ] EIO A hard I/O error occurred.

### dup2()

- `int dup2(int oldfd, int newfd);` duplicates oldfd onto newfd, so the caller chooses which descriptor receives the copy. If newfd is already open, dup2() closes it before duplicating, unless newfd equals oldfd, in which case no duplication occurs and the file remains open.
- Return: On success, dup2 returns newfd. On error, -1 is returned, and errno is set according to the error encountered.
- Errors:
  - [ ] EBADF oldfd is not a valid file handle, or newfd is a value that cannot be a valid file handle.
  - [ ] EMFILE The process's file table was full, or a process-specific limit on open files was reached.
  - [ ] ENFILE The system's file table was full, or a global limit on open files was reached.

### chdir()

> os161/src/kern/vfs/vfscwd.c
> vfs_chdir() is what we need (probably); proc_setas() in os161/src/kern/proc/proc.c only swaps the address space

- `int chdir(const char *pathname);`
- Return: On success, chdir returns 0. On error, -1 is returned, and errno is set according to the error encountered.
- Errors:
  - [ ] ENODEV The device prefix of pathname did not exist.
  - [ ] ENOTDIR A non-final component of pathname was not a directory.
  - [ ] ENOTDIR pathname did not refer to a directory.
  - [ ] ENOENT pathname did not exist.
  - [ ] EIO A hard I/O error occurred.
  - [ ] EFAULT pathname was an invalid pointer.

### __getcwd()

- `int __getcwd(char *buf, size_t buflen);`
- Return: On success, __getcwd returns the length of the data returned. On error, -1 is returned, and errno is set according to the error encountered.
- Errors:
  - [ ] ENOENT A component of the pathname no longer exists.
  - [ ] EIO A hard I/O error occurred.
  - [ ] EFAULT buf points to an invalid address.

## Code reading exercises

1. What are the ELF magic numbers?
    - Ans: The four-byte ELF magic number (0x7F, 'E', 'L', 'F') is placed at the very beginning of every ELF file. The OS checks these bytes when it opens an ELF file to verify the file format.
2. What is the difference between UIO_USERISPACE and UIO_USERSPACE? When should one use UIO_SYSSPACE instead?
    - Ans: UIO_USERSPACE is a segment flag in a struct uio indicating that user data is transferred between the kernel and user space, whereas UIO_USERISPACE indicates that user code (instructions) is transferred between the kernel and user space. UIO_SYSSPACE is used when data is transferred within the kernel.
3. Why can the struct uio that is used to read in a segment be allocated on the stack in load_segment() (i.e., where does the memory read actually go)?
    - Ans: The uio can live on the stack because it is only a descriptor for the transfer; it does not hold the data. The data read goes to the user virtual address vaddr, which is stored in iov.iov_ubase and reached through u.uio_iov.
4. In runprogram(), why is it important to call vfs_close() before going to usermode?
    - Ans: vfs_close() calls vnode_decref in src/kern/vfs/vnode.c, which decrements the reference count of the ELF file (or does VOP_RECLAIM if vn_refcount == 1). If we don't call vfs_close() before going to usermode, the file may never be closed, causing a memory leak.
5. What function forces the processor to switch into usermode? Is this function machine dependent?
    - Ans: enter_new_process in src/kern/arch/mips/locore/trap.c sets up the trapframe for exception return. It then calls mips_usermode, which turns off interrupts and calls asm_usermode in src/kern/arch/mips/locore/exception-mips1.S, where the status register is updated to usermode and all registers are loaded from the trapframe. Yes, it is machine dependent: the code lives under arch/mips and manipulates MIPS registers directly.
6. In what file are copyin and copyout defined? memmove? Why can't copyin and copyout be implemented as simply as memmove?
    - Ans: copyin and copyout are defined in copyinout.c; memmove is defined in memmove.c.
      Because copyin and copyout cross between the user address space and the kernel address space, they must verify that the user-supplied range lies entirely within the user address space and does not overlap the kernel address space, and they must fail cleanly on a bad address. memmove does no domain crossing and performs no such checks.
7. What (briefly) is the purpose of userptr_t?
    - Ans: userptr_t is a pointer to a one-byte type, used to mark an address as belonging to user space. We interact with such addresses by copying memory from user space to kernel space, or vice versa, at the address the userptr_t specifies.
8. What is the numerical value of the exception code for a MIPS system call?
    - Ans: On line 91 in src/kern/arch/mips/include/trapframe.h, EX_SYS is defined as 8.
9. How many bytes is an instruction in MIPS? (Answer this by reading syscall() carefully, not by looking somewhere else.)
    - Ans: On line 141 in src/kern/arch/mips/syscall/syscall.c, upon syscall return the program counter stored in the trapframe, tf->tf_epc, is incremented by 4 bytes, which means an instruction in MIPS is 4 bytes.
10. Why do you "probably want to change" the implementation of kill_curthread()?
    - Ans: As written, kill_curthread() panics, so a fault in user code brings down the whole kernel. The kernel should not fail because of a usermode trap; it would be better to kill only the offending process or handle the error in some other way.
11. What would be required to implement a system call that took more than 4 arguments?
    - Ans: The additional arguments would have to be fetched from the user stack using copyin(). If the arguments are 64-bit, only 2 fit in registers before the rest must be fetched from the user stack.
12. What is the purpose of the SYSCALL macro?
    - Ans: The SYSCALL macro provides a template for generating the stub of every system call. For each SYSCALL(sym, num), SYS_##sym is loaded into v0 (e.g., SYSCALL(fork, 0) passes SYS_fork in v0) before the MIPS syscall instruction is invoked.
13.
    What is the MIPS instruction that actually triggers a system call? (Answer this by reading the source in this directory, not looking somewhere else.)
    - Ans: On line 85 of src/build/userland/lib/libc/syscalls.S, "syscall" is the MIPS instruction that actually triggers the system call (a software exception).
14. After reading syscalls-mips.S and syscall.c, you should be prepared to answer the following question: OS/161 supports 64-bit values; lseek() takes and returns a 64-bit offset value. Thus, lseek() takes a 32-bit file handle (arg0), a 64-bit offset (arg1), a 32-bit whence (arg2), and needs to return a 64-bit offset value. In void syscall(struct trapframe *tf) where will you find each of the three arguments (in which registers) and how will you return the 64-bit offset?
    - Ans: The first argument (arg0), the 32-bit file handle, is found in a0. Because 64-bit arguments are passed in aligned pairs of registers, a1 is unused and the second argument, the 64-bit offset (arg1), is found in a2/a3. The last argument, whence (arg2), no longer fits in registers, so it is read from the user stack at sp+16 using copyin(). To return the 64-bit offset, we place the value in the v0/v1 register pair.
15. As you were reading the code in runprogram.c and loadelf.c, you probably noticed how the kernel manipulates the files. Which kernel function is called to open a file? Which macro is called to read the file? What about to write a file? Which data structure is used in the kernel to represent an open file?
    - Ans: vfs_open() is called to open a file, the VOP_READ macro is called to read the file, and VOP_WRITE to write it. A struct vnode represents an open file; VOP_READ and VOP_WRITE operate on it.
16. What is the purpose of VOP_INCREF and VOP_DECREF?
    - Ans: They maintain the reference count of a vnode. When we open (or close) a file, device, or other kernel object, we call VOP_INCREF (or VOP_DECREF) so that the vnode is reclaimed only after its last reference is dropped.
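
For exercise 14, the 64-bit register handling could be modelled as below. This is a standalone sketch in plain C rather than kernel code: `join_hi_lo`, `split_hi_lo` and `check_roundtrip` are hypothetical names, and the word order (high word in the lower-numbered register, i.e. a2 for the argument and v0 for the return value) is an assumption based on OS/161's MIPS target being big-endian.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: rebuild the 64-bit offset from the register pair.
 * Assumption: `hi` models tf_a2 and `lo` models tf_a3.
 */
uint64_t
join_hi_lo(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | (uint64_t)lo;
}

/*
 * Hypothetical helper: split a 64-bit return value into two 32-bit words.
 * Assumption: `*hi` would be stored in tf_v0 and `*lo` in tf_v1.
 */
void
split_hi_lo(uint64_t value, uint32_t *hi, uint32_t *lo)
{
    *hi = (uint32_t)(value >> 32);
    *lo = (uint32_t)(value & 0xffffffffu);
}

/* Splitting a value and joining the halves again must give it back. */
void
check_roundtrip(uint64_t value)
{
    uint32_t hi, lo;

    split_hi_lo(value, &hi, &lo);
    assert(join_hi_lo(hi, lo) == value);
}
```

A negative offset passes through unchanged as long as the joined value is reinterpreted as a signed 64-bit type before it is used.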

## References

- https://www.freebsd.org/cgi/man.cgi?query=uio&sektion=9&format=html
- https://ops-class.org/asst/2/