# Expression Trees (Friday)
## Motivating Expression Trees
We talked briefly before about how tree-like data pops up all over the place in computing and in life. Today we'll look at one more type of tree. Here's an example:

What do you notice about this tree?
* All the leaf nodes (nodes without children) are numbers;
* non-leaf nodes are operations like addition.
What might this tree represent? Arithmetic! The arithmetic expression `((9*4) + 5)` to be precise. The children of an operation node correspond to that operation's inputs. We'll read these inputs from left to right (although `+` and `*` are commutative, so for the moment I'll disregard ordering).
Both of these representations, the tree and the expression, are equally valid. For computation, though, the tree (we'll call it an _expression tree_) has advantages.
Here are two more perfectly good trees, representing `4` and `(5 + ((4 * 2) - 7 ))` respectively:

Notice how the parentheses in the expressions echo the "levels" of the corresponding expression tree.
What would an _invalid tree_ look like? Maybe something like:

What about

which represents a division by zero: `(10 / 0)`? Here something is wrong, but is it the same kind of problem? We can explore the difference in Python.
If I write `10 / 0` in Python, the program runs and I get a `ZeroDivisionError`. But if I write `1 2 3` in Python I get a `SyntaxError`: the program doesn't run, because Python _doesn't understand a meaning for it_.
## Implementing Expression Trees
There are a lot of things we might want to do with an expression tree. Three of the most common are:
* converting a string expression into a corresponding tree (_parsing_);
* converting a tree into its corresponding string expression (this is just writing `__repr__`); and
* _running_ the tree as a program.
### String to tree: parsing
Parsing is out of scope for 112, and is usually far harder than writing `__repr__`. (This is why we gave you a parser for HTML trees.)
Why is parsing hard? Lots of reasons, but consider producing a tree for the expression string `1 + 2 * 3`. Without knowing the rules of precedence for arithmetic, this expression is _ambiguous_: there are multiple potential trees it could correspond to (`(1+2)*3` and `1+(2*3)`), and those trees even produce different values!
If you're curious about how parsing works, check out CSCI 1260.
### Tree to string: `__repr__` (Part 1)
Before we can convert trees into strings, we need to implement trees themselves.
What do these arithmetic expressions look like in class form? There are multiple ways to represent them. Let's be guided by a convenient design principle: invalid trees should be, whenever possible, impossible to create. For instance, we'd like to make it impossible to create a tree with `12` as an internal node, or `+` as a leaf.
To accomplish this, we'll have different classes for different kinds of node. Here, we have two very different notions: _operations_, which can only be internal nodes of the tree, and _values_, which can only be leaf nodes.
```python
class Value:
def __init__(self, value):
self.value = value
class Operation:
def __init__(self, op:str, left, right):
self.op = op
self.left = left
self.right = right
```
We could improve on this (e.g., by raising an error if an `Operation` gets created with an invalid operator) but let's move forward for now.
### Tree to string: `__repr__` (Part 1)
Now let's write `__repr__` methods for each class. Actually, let's do a bit more. In the past, we've seen that there are two different "convert to string" methods in Python, `__repr__` (for the use of programmers) and `__str__` (for the use of regular users). Let's use this opportunity to demonstrate the different uses of `__repr__` and `__str__`.
#### Values
A value is just a single value. If we're representing it for _our own debugging use_, we should probably tag it as a `Value` class, and say what it contains. But if we're representing it for a user, we should probably just give the value itself. We'll do that like this:
```python
def __repr__(self):
return "Value({})".format(self.value)
def __str__(self):
return str(self.value)
```
#### Operations
Operations are a bit more complex. Not only do we need to account for their left and right children, but we need to make sure that the recursive structure of the whole tree gets explored.
Fortunately, this isn't tough to do; we'll just use `format` and make sure to call either `str` or `repr` appropriately:
```python
def __repr__(self):
return "Operation({}, left={}, right={})".format(self.op, repr(self.left), repr(self.right))
def __str__(self):
return "({} {} {})".format(str(self.left), self.op, str(self.right))
```
#### Testing
Let's try some expressions and make sure these methods work like we expect:
```python
Operation('+', Value(1), Value(2))
Operation('/', Operation('/', Value(2), Value(2)), Value(5))
```
## Running the program
One type of computation we might want to do with expression trees is _run_ them! For instance, if our expressions are arithmetic, we might write a method that runs the arithmetic and returns the result. We call such a function an _interpreter_.
Again, the interpreter for `Value` turns out to be pretty straightforward: we just return the value. But what about `Operation`? The key lies in realizing that we need to turn an operation node containing the string `+` into an actual addition operation (and likewise for the other operators):
```python
def interp(self):
if self.op == '+':
return self.left.interp() + self.right.interp()
if self.op == '*':
return self.left.interp() * self.right.interp()
if self.op == '-':
return self.left.interp() - self.right.interp()
if self.op == '/':
return self.left.interp() / self.right.interp()
raise ValueError(self.op) # built-in error type
```
If you've ever heard someone say that the Python prompt runs an "interpreter" for Python, now you know why.
### Testing
Suppose I write:
```python
expr1 = Operation('+', Value(1), Value(2))
expr2 = Operation('/', Operation('/', Value(2), Value(2)), Value(5))
assert expr1.interp() == 3
assert expr2.interp() == 0.2
```
What tests am I missing?