--- tags: mth225, 225-spr22 --- # Compositions and Inverse Images ## Compositions of Functions Quite often in computer programming, we solve a complex problem by chaining together simple processes together one at a time in a particular order. For example, suppose you wanted to take a list of numbers and return the largest or "max" element in the list. For example, starting with `[23,55,12,21]` would produce `55`. There are a lot of ways to do this, but one way is to break that problem down into two steps: 1. *Sort the list from low to high*; and then 2. *Select the last element in the list*. ![](https://i.imgur.com/cTuoy2b.png) Notice a couple of things. First, there are two simpler processes involved that are done in a "chain" or sequence. (Although sorting a list is not as simple as it sounds, it's simpler than sorting and then doing something else.) Second, the order in which these processes are done matters --- if we were to pull off the last element in the list *first*, and *then* try to sort, it wouldn't make sense because the new first step produces a number, and you can't sort a number. For a more complicated example, look at the `XOR` encryption process featured in [AEP 1: Encrypting messages with binary](/bhGGWEuNRfuz3in0LRXzNQ). The process that Alice uses to encode a plain text message into a bunch of encrypted bitstrings looks complicated, but actually it's just a sequence of simple steps: ![](https://i.imgur.com/D1Eu9iq.png) In mathematics, each of these processes would be represented by *functions*. You can think of a function as a kind of abstract computer program. And when we chain together two functions, like in our examples above, this is known as **function composition**. You can probably think of many places in computer science where you have used function composition without realizing it. Any time you call a process, then call another process using the outcome of the first one as the input, that's function composition. In formal math language, we have this definition: :::info If $f: X \rightarrow Y$ and $g: Y \rightarrow Z$ are functions, then define the **composition** of $f$ and $g$ to be the function $g \circ f: X \rightarrow Z$, defined by $(g \circ f)(x) = g(f(x))$. ::: Math definitions are dense, with a lot of info packed into a small amount of space. Let's unpack this definition to make sure we understand what it says: - First, we start with two functions $f$ and $g$, and *the codomain of $f$ equals the domain of $g$*. Did you notice that $Y$ is on the "receiving end" of $f$ and is the "starting point" of $g$? ![](https://i.imgur.com/OWAkkdg.jpg) That makes sense when you think about the example of finding the maximum element of a list above. The first process (sorting) has to produce an output whose "data type" is compatible with the second process. In that example $f$ would be a function that sorts lists; its domain and its codomain is the collection of all possible lists of numbers. And $g$, the second process, is the "pick the last element" function, whose domain is the collection of all lists and whose codomain would be the set of all *numbers*. Since the "pick the last element" function requires a list to be plugged into it, the output of the first process has to be a list. Other things to note: - The circle symbol $\circ$ means "composition". Don't confuse this with the multiplication symbol, which is a dot like this: $\cdot$. - The result of composing two functions is *another function* that combines them both into one big process. If you plug something into that process, the way you evaluate the composition is to (1) take the input and plug into $f$, then (2) take the output of that and plug into $g$. ![](https://i.imgur.com/pky5h6D.jpg) And finally: Note that again, the order matters. Depending on what the domains and codomains are, the function composition $f \circ g$ (where we do $g$ first, then $f$) might not even be possible, or might be possible even if the composition $g \circ f$ (where we do $f$ first, then $g$) is doable. And even if it's possible to compose two functions in different orders, the results might be very different. Here's a more formal example. :::warning **Example:** Look at the two functions $f: \mathbb{R} \rightarrow \mathbb{R}$ and $g: \mathbb{R} \rightarrow \mathbb{R}$ given by the formulas $f(x) = x^2$ and $g(x) = x + 10$. Look at the composition $g \circ f$. This is a function; how do we evaluate it if we plug in $x = -3$, for instance? The definition says: $$(g \circ f)(-3) = g(f(-3))$$ What does this mean? Remember it's a two-stage process. First we have to find $f(-3)$, and that's $f(-3) = (-3)^2 = 9$. Then we take the result of that, and run it through $g$: $g(9) = 9 + 10 = 19.$ Therefore $(g \circ f)(-3) = 19$. What if we composed in a different order? It's still a two-stage process: $$(f \circ g)(-3) = f(g(-3))$$ But we do $g$ first this time: $g(-3) = -3 + 10 = 7$. Then we take that result and run it through $f$: $f(7) = 7^2 = 49$. So $(f \circ g)(-3) = 49$. So like we said, composing in different orders often gives different results. ::: ## Inverse Images Another task that comes up a lot in computing is when want to do the *opposite* of some operation. For example, a professor might have a database of her students so that if she needed to look up a student's final exam grade, she could type in the name and the exam grade would come up: ``` f(Alice Jones) = 89 f(Bob Smith) = 72 f(Chuck Brown) = 89 ``` So given the name, she can look up the grade. But what if she wanted to do the opposite --- find all students who made a particular grade on the exam? She would need some kind of function where she would enter in `89` and both Alice and Chuck's name would come up (along with everyone else who scored an 89). You might notice that this "opposite" action wouldn't actually be a *function*, because a single input might produce multiple outputs. For example inputting `89` would bring up at least two possible results (Alice and Chuck). But we *can* say that this single input produces a **set** of results --- take all the students who earned 89 on the final exam, put them in a set, and return that set. This is the idea behind the following definition: :::info Suppose $f: X \rightarrow Y$ be a function and let $y$ be any point in $Y$, the codomain. Then the **complete inverse image of $y$ under $f$** is the set of all points in $X$ that map to $y$. We write this set $f^{-1}(y)$. ::: :::warning **Example:** Here are the semester grades for a MTH 225 class: | Student | Grade | | -------- | -------- | | Alice | B+ | | Bob | A | | Chuck | C+ | | Darla | A | | Earl | B+ | | Fiona | A | | Gus | C | Let $G$ be the function from the set {Alice, Bob, Chuck, ..., Gus} to the set of all possible letter grades (A through F) that is given by this table. For example $G(\text{Earl}) = \text{B+}$. If we wanted a list of all students who made an A in the class, that would be... $$G^{-1}(\text{A}) = \{ \text{Bob, Darla, Fiona}\}$$ Likewise, $G^{-1}(\text{B+}) = \{ \text{Alice, Earl} \}$ because this is the set of all students who earned a B+. Notice $G^{-1}(\text{C-}) = \emptyset$ because nobody earned a grade of C-. ::: ![](https://i.imgur.com/RwRT1a7.jpg) :::warning **Examples:** Look at the floor function $f(x) = \lfloor x \rfloor$ that takes in a real number and rounds it down. The domain of this function is $\mathbb{R}$, the set of all real numbers; and the codomain is $\mathbb{Z}$, the set of all integers. Pick a $y$ out of the codomain --- for example $y = 4$. What is the "complete inverse image of $4$ under $f$"? This would be the set of all real numbers that you would plug into $f$ to get $4$ as an output. There are a lot of points in this set, for example $4.0001$, $4.2$, and $4.9$... and many others. In fact any number between $4$ and $5$, including $4$ but not including $5$, would work. So we'd say: $$f^{-1}(4) = \{x \in \mathbb{R} \, : \, 4 \leq x < 5\}$$ ::: We can extend the idea of an inverse image by starting not with a *single* output but with a *set* of outputs and asking which points in the domain map into that set: :::info Suppose $f: X \rightarrow Y$ be a function and let $B$ be any subset of $Y$, the codomain. Then the **inverse image of $B$ under $f$** is the set of all points in $X$ that map into $B$. We write this set $f^{-1}(B)$. That is, $$f^{-1}(B) = \{ x \in X \, : \, f(x) \in B\}$$ ::: :::warning **Example:** Going back to the MTH 225 gradebook, suppose we want to know which students had semester grades in the "C" range --- either C, C-, or C+. We'd want to know the inverse image of $\{\text{C, C-, C+}\}$ and that would be... $$G^{-1}(\{\text{C, C-, C+}\}) = \{\text{Chuck, Gus}\}$$ ::: Two final notes on this concept: - The $f^{-1}$ **does not mean** $1/f$, like a negative exponent from algebra. Dividing by $f$ doesn't really make sense for us. The "negative exponent" here isn't an exponent, it's just a symbol that means "inverse". - If $f$ is a *bijective* function, then there are no collisions, and every point in the codomain will have something mapping onto it. This means that the inverse image of a point in the codomain will be just a one-element set. That means $f^{-1}$ is *another function* that goes backwards from $f$. But this only happens when $f$ is bijective!