Testing programs with objects; inheritance

# Testing programs with objects; inheritance ## Table of Contents 0. [Another Example of Polymorphism: Python’s special methods](#another) 1. [Testing Classes](#testclasses) 2. [Inheritance](#inheritance) Some quick notes from the drill questions: * A "field" of a class (also called an attribute) is a variable in the scope of the class. For instance, the `author` and `title` variables were fields of `Book` last time. * Fields can be either class-specific or instance-specific. That is, you might have one variable that is _shared_ across all instances of a class (like `DJData`'s `queue` field), and another that is different for every instance (like the `title` of a book). To tell whether a field is class-specific or instance-specific, just see if it's defined in the `__init__` method (instance-specific) or outside the `__init__` methods (class-specific). Many other questions involved today's topics! ## Another Example of Polymorphism: Python’s special methods <A name="another"></A> Python uses polymorphism ubiquitously. For example, recall that `dataclass` instances printed out in a nice way. How can we get similar behavior with our own classes? We can implement the `__str__` and `__repr__` methods: ```python class Book: # other methods... # "human readable" string representation def __str__(self): return self.title + " by " + self.author # "unambiguous" string representation def __repr__(self): return 'Book("' + self.title + '", "' + self.author + '")' ``` These two methods sound similar, but they get called in different places internally by Python. By convention, `__str__` is for end-user focused output, and you want it to be human readable. In contrast, `__repr__` should be precise and unambiguous (so I wrote it to produce a string that is, itself, Python code). If I run `print(library[0])`, it prints `The Nickel Boys by Colson Whitehead`. But if I'm on the console and I just type `library[0]`, it will produce `Book("The Nickel Boys", "Colson Whitehead")`. Note that this only affects the way Python _displays_ the object; it doesn’t change anything about the object’s internal representation. It's generally good practice to define at least the `__repr__` method in all of your classes. It makes debugging a lot easier when you get an understandable string instead of an object type and ID! Methods that use the double-underscore notation are usually understood to be special, and interact with some internal Python functionality. By convention we don't call them directly, but via _syntactic sugar_ they provide. We don't invoke `__init__` to make a new object, we use the `ClassName(...)` syntax instead. Likewise, we'll use (e.g.) `repr(an_object)` and `str(an_object)`. _If you don't define a `__str__`, Python will fall back to using `__repr__` instead._ Python classes have quite a few of these standard method names. Another is `__eq__`, which lets you define how an object should decide whether it's equal to another. We'll talk about this, and other such methods, later. ## Testing Classes <A name="testclasses"></A> How should we test our library program? We should write tests for each of the methods. Let’s test the `Book` class: ```python from library import * def test_init(): b = Book("The Nickel Boys", "Colson Whitehead") assert b.title == "The Nickel Boys" assert b.author == "Colson Whitehead" def test_matches(): b = Book("The Nickel Boys", "Colson Whitehead") assert b.matches("Nickel") assert not b.matches("Parasite") assert b.matches("Colson") ``` Notice the pattern here: we’re creating objects in our test methods in order to test their methods. ### Adding checkout Let’s say we want to add "checkout" behavior to our library: we want to be able to record that certain items have been checked out or returned, and take this into account when searching. Let’s add this behavior to our tests first: ```python def test_init(): b = Book("The Nickel Boys", "Colson Whitehead") assert b.title == "The Nickel Boys" assert b.author == "Colson Whitehead" assert not b.checked_out def test_checkout(): b = Book("The Nickel Boys", "Colson Whitehead") b.checkout() assert b.checked_out def test_return(): b = Book("The Nickel Boys", "Colson Whitehead") b.checkout() b.return() assert not b.checked_out def test_matches(): b = Book("The Nickel Boys", "Colson Whitehead") assert b.matches("Nickel") assert not b.matches("Parasite") assert b.matches("Colson") b.checkout() assert not b.matches("Nickel") assert not b.matches("Parasite") assert not b.matches("Colson") ``` Then we can add the behavior to our Book class: ```python class Book: def __init__(self, title: str, author: str): self.title = title self.author = author self.checked_out = False def checkout(self): self.checked_out = True def return(self): self.checked_out = False def matches(self, query: str) -> bool: return (not self.checked_out) and (query in self.title or query in self.author) ``` Note that this is a design choice: we could put the check in the class, or we could put the check in the `search` function. I'd argue that putting the method here, in the class, is actually _less_ preferable: what if a librarian needs to search the library for books, so that they can track down what's missing? Better to allow the library functions to make that distinction, and not force it in the `matches` method. So, if I were writing this again, I'd design it differently. ## Inheritance <A name="testclasses"></A> Right now, we’re considering each class to be totally separate, with no shared code or data. For instance, even though `Book`s and `Movie`s both implement a `matches` method, the two methods have completely different implementations. Inheritance gives us a way to share code between classes. Classes can inherit from other classes, like this: ```python class A: pass # fill in "parent" class B(A): # note the parameter (A) pass # fill in "child" ``` When there’s an inheritance relationship like this between two classes, we say that `A` is the superclass of `B` and `B` is a subclass of `A`. Let’s see an example of inheritance in action. Suppose that our library wants to track which items have been checked out, and only return items in a search if they are actually available. First, we’ll implement a `LibraryItem` class, which we’ll use as the superclass for both books and movies: ```python class LibraryItem: def __init__(self): self.checked_out = False def checkout(self): self.checked_out = True ``` The `LibraryItem` class only handles checking items out and back in; it doesn’t know anything about what the items actually are. In order to use our new superclass, we’ll make some changes to the `Book` and `Movie` classes: ```python class Book(LibraryItem): def __init__(self, title: str, author: str): self.title = title self.author = author super().__init__() class Movie(LibraryItem): def __init__(self, title: str, director: str, actors: list): self.title = title self.director = director self.actors = actors super().__init__() ``` (This is one of the few places where the convention above is conventionally violated: we call `__init__` directly here, from within a subclass's `__init__`.) Both `Book` and `Movie` objects will now _automatically_ have a `checked_out` field, as well as the checkout method. We can use the field in each class’s matches method: ```python class Book(LibraryItem): # other methods elided... def matches(self, query: str) -> bool: return (not self.checked_out) and (query in self.title or query in self.author) class Movie(LibraryItem): # other methods elided... def matches(self, query: str) -> bool: return (not self.checked_out) and query in self.title or query in self.director or any([query in actor for actor in self.actors]) ``` ...although, again, if I were re-writing these, I'd probably put the check in the `search` method. Better yet, I'd probably create a `Library` class to handle the idea of checking books in and out, rather than making the actual item keep track itself.