Objects continued

:::warning Announcements - Assignment 8 due Thurs. - Help hours today 3:45-5:15pm, L037 - Coding workshop today 6-7pm, E111 ::: :::success Agenda for today - Object-oriented programming continued - Overriding/dynamic dispatch - More on inheritance/subtyping - covariant and contravariant types - Relation to functional programming ::: ## Method overriding Recall that method **overriding** lets a subclass provide a new definition for a method with the same name as a superclass. :::success This is distinct from method **overloading**. Could someone remind me of what that is? ::: **Answer** Method overloading is when multiple different methods happen to have the same name, but a different number (or type) of arguments. A common case for overloading in Java is object constructors: we might have a constructor for `new C()` that sets `C`'s fields to default values, and a constructor `new C(1)` that sets a field on object initialization. :::success **Question:** How do you think Java implements overloading? ::: Think, then discuss with folks near you. **Answer** Java's type checkers can use context (how many/what arguments are supplied, how the returned value is used) to resolve method overloading. This turns out to be semantically not-that-interesting, so we will just focus on the more interesting case: **overriding**. # Method overriding: dynamic dispatch Unlike method overloading, method **overriding** in languages like Java is when a subclass "replaces" a method in superclass. Overriding is implemented via a technique called [*dynamic dispatch*](https://en.wikipedia.org/wiki/Dynamic_dispatch#:~:text=In%20computer%20science%2C%20dynamic%20dispatch,(OOP)%20languages%20and%20systems.). Dynamic dispatch is also known as late binding or virtual methods/functions. Dynamic dispatch is a core programming language semantics feature that allows the specific implementation of a polymorphic method/function to be select *at runtime*. :::success **Question:** how do we type check that the function call be safe if will if we don't know which function is select? ::: **Answer**: we don't know what *specific* implementation will be called, but we know that the static, compile-time check ensures that based on the static type, the actual dynamic type must have the method defined. For example, consider an inheritance hierarchy and some function call elsewhere in the code: ```java= A // defines method f B extends A // overrides method f C extends B D extends C // overrides method f /* Somewhere else in the code, we have a function call */ void g(A a) { a.f(); // Which implementation of f is selected? } ``` Statically, at compile time when we examine the call at line 9, we can check that `f` is defined for the static/declared type `A`. However, we don't know at compile time which of the 3 implementations this might be! :::success **Question:** what code would call each implementation of `f`? ::: **Answer**: At **runtime**, the implementation of `f` is called based on the actual, dynamic type of the object. This might be different at different points in time within the same program! (This is, in fact, the *reason* to have polymorphism, so it's the common case). - A call `g(new A())` would dynamically select the `f` on line 1. - A call `g(new B())` would dynamically select the `f` on line 2. - A call `g(new C())` would also dynamically select the `f` on line 2 (because `C` does not override `B`, and `B` is the closest superclass that has `f` defined). - A call `g(new D())` would dynamically select the `f` on line 4. ## Evaluation rules for method overriding The typing rules for method overriding are similar to what we've seen before (just extended with our subclass handling). The evaluation rules, however, are more complicated: :::info Method call: `e.m()` Evaluation rule: 1. Under the current environment, evaluate `e` to value `v`. 2. Let `C` refer to the **actual, dynamic** class of the "receiver" object `v`. 3. Until class `C` contains a method definition `m() { body }` let `C` refer to the superclass of the current `C` and repeat step 3. 4. Under the environment of class `C` extended with `this` mapping to `v`, evaluate the body found in step 3. :star:. ::: Note that this is **not** exactly lexical scope or dynamic scope: it's its own thing! ### :star: What's this about `this`? In Java, `this` refers to current receiver object, not the containing class. In addition to adding the method parameters/arguments to the environment (which is elided here, since it follows what we've seen before), step 4 needs to extend the dynamic environment to map `this` to `v`! #### More evaluation rules notes - The subclass definition of `m` replaces the superclass definition of `m` when dispatching on an object of the subclass (or its descendant) in all contexts, even if dispatching from method in superclass. - More complicated than the rules for closures! #### Dynamic dispatch is not... Note that: `obj0.m(obj1, ..., objn)` ≠ `m(obj0, obj1, ..., objn)` That is, dynamic dispatch is fundamentally different from an implicit parameter that captures a first argument written in a different spot! "What m means" is determined by run-time class of `obj0`. The implementation must dynamically inspect `obj0` before starting to execute `m`. `this` is different than any other parameters. ## Dynamic dispatch implementation We'll start by covering how an simple interpreter (that runs a type check/compile step first) could implement dynamic dispatch. Then, we'll briefly touch on how optimized implementations actually work. ## Simpler interpreter implementation Here is one design for a basic interpreter: we have a type-checking/compilation step that runs first and check that all methods are defined on the static types. Then, we have a dynamic interpreter that actually does the dynamic dispatch. #### Step one: type-check/compile First, we compile (translate) the code to a lower-level representation. - In this step, check that all method calls are valid based on its static (declared) types. - Output an intermediate representation where: - Each object has a reference to its dynamic class. - We have a set of known classes: - A top-level `Object` class - Any user-defined class. - Classes have: - A set of methods defined by that class itself (including overrides). - A reference to their direct parents class (unless they are `Object`). #### Step two: dynamic dispatch during `interp` To dynamically dispatch at runtime, we: - Interpret the object and its argument to values. - Follow the object's reference to its dynamic class. - Check if the current class contains the method - If so, use that body to make the call, extending the environment with the arguments and with `this` mapping to the current object. - If not, repeat this step with the superclass. Since we ran the compile step first, this strategy is guaranteed to terminate by the time we reach `Object`! This implementation works, and is roughly what you will implement for Assignment 9. Implementing it is instructive for understanding dynamic dispatch; but it is too slow for optimized implementations, since we need to traverse the inheritance hierarchy at every call site. ## Optimized implementation: vtables Actual Java implementations (called ["Java Virtual Machines", or JVMs](https://en.wikipedia.org/wiki/Java_virtual_machine)) implement dynamic dispatch via a concept called a *virtual method table*, or *vtable* (this is the focus of Alexa's PhD research paper mentioned last class). We still have two steps: compiling to Java bytecode (including type checking based on static types), then executing that bytecode. In compiling to bytecode, JVMs vtable (typically one per *class*) that specify the implementation of *every* method for that class, including inherited methods. Each object then just needs a reference to the vtable for its dynamic class. For example, from our earlier `A/B/C/D` hierarchy, the vtable for `C` would specify that `f` maps to `B`'s implementation of `f` (at a lower level, the vtable includes a pointer to the actual relevant implementation of `f`). At runtime, dynamic dispatch then works by following the object's reference to the vtable for its dynamic class, then looking up the method by name. Vtables are more efficient because we only need to follow a reference from an object to its class's vtable, rather than traversing the object hierarchy. Vtables are extremely cool (in my biased opinion!), but this is the extent that we will cover them for CS251. # More on subclass polymorphism Next, we'll look more deeply at subclass polymorphism. We know that the point of polymorphism is to let us call a method on an object and get specific behavior depending on if its a particular subclass. We can write one function that takes in the superclass type, and call it with multiple subclass types: ```java class Cat extends Animal {} // Elsewhere void doSomething(Animal a) { a.noise(); } // We can call doSomething on an Animal: doSomething(new Animal()); // AND on a Cat subclass of Animal: doSomething(new Cat()); // where noise() may have different implementations ``` This example should be very familiar from CS230, and depends on this rule: :::info For method calls, we can pass a subclass where any of its superclasses are expected. ::: An interesting question to consider is this: when a class extends another and overrides functions, do those **function definitions** need to have the exact same return types, or can the return type be a subclass or superclass? ## Group exercise: sub/super for method overriding definitions Consider the following example: ```java /* Group exercise: discuss whether you think in the New subclass overriding the superclass methods, Java can change the: * 1: Return type: Java can safely let ??1 be a: * - subclass (Animal) * - superclass (Object) * - both * - neither: must be the exact type * 2: Argument type: Java can safely let ??1 be a: * - subclass (Animal) * - superclass (Object) * - both * - neither: must be the exact type * * Hint: consider how another class would use Groomer/New. */ class Groomer { Animal pickUpPet() { // ... } void dropOffPet(Animal a) { // ... } } class New extends Groomer { ?? pickUpPet() { // ... } void dropOffPet(?? a) { // ... } } ``` **Answer** The correct answer for what Java can *safely* do is: ``` What Java *could safely allow* for overriding ??1 (return type) : subclass ??2 (argument type): superclass ``` However, the *actual* Java rules are more conservative for object **argument types**: ``` What Java *actually allows* for overriding ??1 (return type) : subclass ??2 (argument type): neither (must be exact type) ``` To understand why we can use a subclass for the return type, consider this code of *using* the subclass: ```java // Declared type is Groomer, but dynamic type is New Groomer g = new New(); Animal ani = g.pickUpPet(); // Would it break things if ani was actually a Cat? // No, because Cat has all methods that Animal has! // Would it break things if ani was actually an Object? // Yes, this is unsafe! Object has no noise method defined. ``` To understand why we *could* safely use a superclass for the argument type, consider: ```java n.dropOffPet(new Dog()) // Consider: // Would it break things if the argument were actually a Dog? // Seeing the problem is a bit more complicated here, since // it requires another subclass. // Yes, because although dropOffPet is defined on Groomer, // the dynamic type of New only knows how to call dropOffPet // on Cat, not Dog. New's implementation of dropOffPet // might call Cat's "makeHairball", for example, which Dogs cannot do! // Would it break things if the argument were actually an Object? // No, this would not break things. Cat's dropOffPet // implementation could only call Object's methods, but // that is fine. // However, this case is less intuitive! So Java does not // allow it for method overriding, and instead would treat // the method definition as overloading. ``` ## contravariance and covariance In **general** safe OOP design (and other contexts with *subtyping*), when overriding methods defined in a superclass, the subclass does not need to use the **exact** same argument and return types. Rather: 1. The return type of the overridden method may be a **subclass** of the return type in the superclass. This is an example of **covariance**. 2. The argument type in the subclass may be a **superclass** of the argument type in the superclass. This is an example of **contravariance**. This is *safe* for OOP, but is **not actually implemented** by Java: Java makes a conservative choice to not support this for overridden methods. ### Covariance/contravariance in general **Covariance** means you can substitute a more specific (sub)type in a place where a more general (super)type is expected. - Applies to: Return types (including Java), arrays (though this can cause issues, which are out-of-scope today), some generic type parameters. - Intuition: “Going down the type hierarchy (more specific) is OK.” **Contravariance** means you can accept a more general (super)type in a place where a more specific (sub)type is expected. - Applies to: Some OOP function arguments (but not Java), some generic type parameters. - Intuition: “Going up the type hierarchy (more general) is OK.” Java doesn't allow contravariant parameter types in method overrides (even though it makes sense conceptually). There is lots of interesting further details on how covariance/contravariance impact Java generics, but we'll stop here for CS251. # Relationship between OOP and FP To close out our OOP unit, we'll zoom out and consider the difference again between OOP and FP, now that we understand the OOP mechanisms. # Mechanism: closed vs open functions/methods One core difference between the *mechanisms* of OOP and FP is how we can "shadow" function/method implementations. ### Shadowing for closures In OCaml and most functional languages: closures are, well, **closed** (at creation, they bundle up their environments into a closed context). For example: consider these mutually-recursively-defined functions in OCaml: ```ocaml (* Note, this is intentionally a slow, silly implementation! *) let rec even x = if x=0 then true else odd (x-1) and odd x = if x=0 then false else even (x-1) ``` Client code that uses these may shadow `even`, with a new definition, but calls to `odd` are **unaffected** (they use the "original" version): ```ocaml! (* Shadow "even" with a more efficient implementation: *) let even x = (x mod 2) = 0 (* This does not change odd (too bad, would be faster!) *) (* odd's particular implementation of even was CLOSED *) ``` ### Shadowing for "open" overriding In most OOP languages, subclasses can change the behavior of superclass methods, even ones they do not directly override! Here is the same example as before, now in Java: ```java class A { boolean even(int x) { if (x == 0) return true; else return odd(x-1); } // odd's particular implementation of even is OPEN boolean odd(int x) { if (x == 0) return false; else return even(x-1); } } class B extends A { // Improves odd in B objects, even though B does not // change/override odd directly! boolean even(int x) { return x % 2 == 0; } } ``` The new definition of `even` defined by `B` overrides `even` in **all** context for objects of dynamic type `B`, **including** in the `odd` implementation that is still only defined in `A`! Critically, `A`'s `odd` implementation was *not closed* when `A` was defined: the author of `A` **cannot know** what specific version of `even` might be called (unless `even` is modified with the `final` keyword). ### Closures vs. Objects Closures: – Capture code of function, by function definition. – Capture all bindings the code may use, by lexical scope of definition. Objects: – Capture code for all methods that could be called on it, by class hierarchy. – Capture bindings that may be used by that code, by instance variables declared in class hierarchy. ## FP vs OOP decomposition/extensibility ### FP vs. OOP decomposition FP and OOP represent two distinct world views! In FP, functions perform some operation, which may use pattern matching to specialize on variant types or other patterns. In OOP: classes/prototypes give behavior to some kind of data. Subclassing allows specialization for subtypes. Which of these approaches is better depends on the domain, how the software is expected to change over time, and (perhaps most importantly) engineer preference and social factors. ### Table view of FP vs OOP decomposition Consider implementing functionality on our `Expr` running example, where `Expr` expressions can be either a `Var`, `Add`, or `Num`. The functionality/behaviour we want to define is `interp` and `toString`, which should be specialized for each subtype. | | `interp` | `toString` | | -------- | -------- | -------- | | `Var` | | | | `Add` | | | | `Num` | | | #### FP view of the table In OCaml, we would create the `Expr` subtypes with a variant type. Each function would then be a definition **over all variants**, with specialization via pattern matching. We can think of this as a vertial grouping of columns in our table! Said another way, in FP, function implementations, with subtype behavior spread across function implementations. | | :large_blue_square: `interp` | :large_red_square: `toString` | | -------- | -------- | -------- | | `Var` | :large_blue_square: | :large_red_square: | | `Add` | :large_blue_square: | :large_red_square: | | `Num` | :large_blue_square: | :large_red_square: | #### OOP view of the table In Java, we would create the `Expr` as a superclass, then define `Var`, `Add`, and `Num` as subclasses that each extend `Expr`. Each subclass would then define *its own* specialization of `interp` and `toString` via method overriding. Said another way, in OOP, we group code based on subtype, with function implementations spread across subtypes. Each subclass would then be a definition **over all overridden methods**, with specialization via inheritance. We can think of this as a horizontal grouping of rows in our table! | | `interp` | `toString` | | -------- | -------- | -------- | | :large_blue_square: `Var` | :large_blue_square: | :large_blue_square: | | :large_red_square: `BinOp` | :large_red_square: | :large_red_square: | | :large_yellow_square: `Num` | :large_yellow_square: | :large_yellow_square: | #### Implications on extensibility Choose which view is better partially depends on how we expect the code to change over time. When code can be easily adapted to future changes, we call it **extensible**. Extensibility is an important goal in real-world software engineering, since most code is not "one and done": it continuously evolves over time! Which of FP vs OOP is better for extensibility depends on how we expect the code to change in this way: do we think it's more likely/frequent that we will change the set of supported **functions**, or **subtypes**? #### FP: good for new functions Since FP implementations are grouped by *function*, adding a new function is easy: we just define the new function without changing existing functions! For example, if we want to add a `typeCheck` function for our `Expr` language, we need only add a `fun type_check` definition. This is part of why programming language people love FP: in the long run, programming language ASTs are fairly stable, but the functions/analyses will define over them change more frequently! #### OOP: good for new subtypes Since OOP implementations are grouped by *subtype*, adding a new subtype is easy: we just `extend` the existing superclass, without changing existing classes! For example, if we want to add a `Neg` variant `Expr` language, we need only add a `class Neg extends Expr` class definition. This is part of why OOP is the paradigm of choice for most Graphical User Interface (GUI) frameworks: for GUIs, adding new components (e.g., a new button type) is the more common case! ### Concluding musings As we've discussed, FP and OOP are different worldviews, but both can be incredibly useful in different cases. You now are more prepared to choose between the two, including in emulated good designs from one in a language that primarily uses the other!