Lecture 19: Objects continued

Lecture 19: Objects continued --- :::warning Announcements - **Please sit in the front two rows of the room.*** - Assignment 8 due Tuesday. - CS Club Coding workshop today, 5:30pm, L039 - Next help hours: - Emily's 3-5pm Friday - Mine * 5:30-7:30 Monday ::: :::success Agenda for today - Object-oriented programming continued - Overriding vs. Overloading - Dynamic dispatch - (Next time) More on inheritance/subtyping - covariant and contravariant types - (Next time) Relation to functional programming ::: ## Method overloading vs. method overriding Recall that method **overriding** lets a subclass provide a new definition for a method with the same name as a superclass. :::success This is distinct from method **overloading**. Could someone remind me of what that is? ::: :::spoiler Think, then click! Method overloading is when multiple different methods happen to have the same name, but a different number (or type) of arguments. ::: A common case for overloading in Java is object constructors: we might have a constructor for `new C()` that sets `C`'s fields to default values, and a constructor `new C(1)` that sets a field on object initialization. :::success How do you think Java implements overloading? ::: :::spoiler Think, then click! Java's type checkers can use context (how many/what arguments are supplied, how the returned value is used) to resolve method overloading. ::: *Overloading* turns out to be not-that-interesting semantically, so we will focus on the more interesting case: **overriding**. # Method overriding: dynamic dispatch Unlike method overloading, method **overriding** in languages like Java is when a subclass "replaces" a method in superclass. Overriding is implemented via [*dynamic dispatch*](https://en.wikipedia.org/wiki/Dynamic_dispatch#:~:text=In%20computer%20science%2C%20dynamic%20dispatch,(OOP)%20languages%20and%20systems.). Dynamic dispatch is also known as late binding or virtual methods/functions. Dynamic dispatch is a core programming language semantics feature that allows the specific implementation of a polymorphic method/function to be select *at runtime*. :::success How do we type check that the function call be safe statically, before the code is run, if we don't know which function will actually be called at runtime? ::: :::spoiler Think, then click! We don't know what *specific* implementation will be called, but we know that the static, compile-time check ensures that based on the static type, the actual dynamic type must have the method defined. ::: For example, consider an inheritance hierarchy and some function call elsewhere in the code: ```java= A // defines method f B extends A // overrides method f C extends B D extends C // overrides method f /* Somewhere else in the code, we have a function call */ void g(A a) { a.f(); // Which implementation of f is selected? } ``` Statically, at compile time when we examine the call at line 9, we can check that `f` is defined for the static/declared type `A`. However, we don't know at compile time which of the 3 implementations this might be! :::success What code would call each implementation of `f`? ::: :::spoiler Think, then click! At **runtime**, the implementation of `f` is called based on the actual, dynamic type of the object. This might be different at different points in time within the same program! (This is, in fact, the *reason* to have polymorphism, so it's the common case). - A call `g(new A())` would dynamically select the `f` on line 1. - A call `g(new B())` would dynamically select the `f` on line 2. - A call `g(new C())` would also dynamically select the `f` on line 2 (because `C` does not override `B`, and `B` is the closest superclass that has `f` defined). - A call `g(new D())` would dynamically select the `f` on line 4. ::: ## Evaluation rules for method overriding The typing rules for method overriding are similar to what we've seen before (just extended with subclass logic, as we'll discuss a bit later this lecture). The evaluation rules, however, are more complicated: :::info Method call: `e.m()` Evaluation rule: 1. Under the current environment, evaluate `e` to value `v`. 2. Let `C` refer to the **actual, dynamic** class of the "receiver" object `v`. 3. Until class `C` contains a method definition `m() { body }` let `C` refer to the superclass of the current `C` and repeat step 3. 4. Under the environment of class `C` extended with `this` mapping to `v`, evaluate the body found in step 3. :star:. ::: Note that this is **not** exactly lexical scope or dynamic scope: it's its own thing! ### :star: What's this about `this`? In Java, `this` refers to current receiver object, not the containing class. In addition to adding the method parameters/arguments to the environment (which is elided here, since it follows what we've seen before), step 4 needs to extend the dynamic environment to map `this` to `v`! ### :star: What's this about `super`? In Java, `super` refers to the direct superclass of the current receiver object. #### More evaluation rules notes - The subclass definition of `m` replaces the superclass definition of `m` when dispatching on an object of the subclass (or its descendant) in all contexts, even if dispatching from method in superclass. - More complicated than the rules for closures! #### Dynamic dispatch is not... Note that: `obj0.m(obj1, ..., objn)` ≠ `m(obj0, obj1, ..., objn)` That is, dynamic dispatch is fundamentally different from an implicit parameter that captures a first argument written in a different spot! "What m means" is determined by run-time class of `obj0`. The implementation must dynamically inspect `obj0` before starting to execute `m`. `this` is different than any other parameters. # Dynamic dispatch implementation We'll start by covering how an simple interpreter (that runs a type check/compile step first) could implement dynamic dispatch. Here is one design for a basic Java implementation: we have a type-checking/compilation step that runs first and checks that all methods are defined on the static types. Dynamically, we choose specific methods by using "virtual method tables", or *vtables*. Actual Java implementations (typically called ["Java Virtual Machines", or JVMs](https://en.wikipedia.org/wiki/Java_virtual_machine)) implement dynamic dispatch with *vtables* (this is the focus of Alexa's PhD research paper mentioned last class). They also have two steps: compiling to Java bytecode (including type checking based on static types), then executing that bytecode. We will use OCaml ASTs instead of bytecode, but the core ideas are the same. ## Step one: type-check/compile First, we compile (translate) the code to a lower-level representation. - In this step, check that all method calls are valid based on its static (declared) types. - Output an intermediate representation where: - Each object has a reference to its dynamic class. - We have a set of known classes: - A top-level `Object` class - Any user-defined class. - Classes have: - A set of methods defined by that class. - A reference to their direct parents class (unless they are `Object`). - A vtable (conceptually, a map) that maps method names to the specific implementation of that method. - The methods could be defined in the class itself, or defined in any of its superclasses. - We can build the vtable once per class, not once per object (or use of the object). ## Step two: dynamic dispatch during `interp` To dynamically dispatch at runtime, we: - Interpret the object and its argument to values. - Follow the object's reference to its dynamic class. - Look up the method by name in the vtable of the dynamic class. - Call the method! Since we ran the compile step first, this strategy is guaranteed to find the right method in the vtable! This implementation is roughly what you will implement for Assignment 9: MiniJava! We'll look over the stencil code together on Monday. ## Vtable example Consider the example: ``` A // defines method f B extends A // overrides method f, defines g C extends B // overrides method g D extends C // overrides method f ``` The vtable for `A` just has method `f`: | Method | Implementation | |--------|----------------| | `f` | `A.f` | The vtable for `B` has `g` point to its own definition, and overrides the definition of `f` from `A` with its own: | Method | Implementation | |--------|----------------| | `f` | `B.f` (overrides `A.f`) | | `g` | `B.g` | :::success What are the vtables for `C` and `D`? ::: :::spoiler Think, then click! The vtable for `C` overrides `g`, but inherits the version of `f` defines by its superclass `B`: | Method | Implementation | |--------|----------------| | `f` | `B.f` (inherited) | | `g` | `C.g` (overrides `B.g`) | `D` overrides `f` again: | Method | Implementation | |--------|----------------| | `f` | `D.f` (overrides `B.f`) | | `g` | `C.g` (inherited) | ::: :::success Assume `A` and `C` both have public fields `x`. What version of `x` is used for a `new` object of each of the 4 classes? How can you change which version of `x` is used? ::: :::spoiler Think, then click! Recall from last lecture that fields are *shadowed*, not *overridden*. Fields need to be **per object, not per class** (different instantiated objects of the same class can have different field values!). So conceptually, fields live on the object, and the object references a shared class that defines the vtables. As we saw last lecture, field lookup is determined by **declared**, not **actual** type. But the declared type can be changed! So objects need potentially duplicate fields, one per possible declared class. So if both `A` and `C` declare a public field `x`, then: | Object created with `new` | Version of `x` stored in object | |---------------------------|--------------------------------| | `new A()` | Only `A.x` exists | | `new B()` | Only `A.x` exists (inherited) | | `new C()` | Has **both** `A.x` and `C.x` | | `new D()` | Has **both** `A.x` and `C.x` | We can change which version is used by changing the declared static type, e.g., by calling different methods or casting: ``` class A { int expectsA(A a) { return a.x; } int expectsC(C c) { return c.x; } class demo() { C c = new C(); expectsA(); // returns the A.x field from C expectsC(); // returns the C.x field from C } } ``` ::: # More on subclass polymorphism Next, we'll look more deeply at subclass polymorphism. We know that the point of polymorphism is to let us call a method on an object and get specific behavior depending on if its a particular subclass. We can write one function that takes in the superclass type, and call it with multiple subclass types: ```java class Cat extends Animal {} // Elsewhere void doSomething(Animal a) { a.noise(); } // We can call doSomething on an Animal: doSomething(new Animal()); // AND on a Cat subclass of Animal: doSomething(new Cat()); // where noise() may have different implementations ``` This example should be very familiar from CS230, and depends on this rule: :::info For method calls, we can pass a subclass where any of its superclasses are expected. ::: An interesting question to consider is this: when a class extends another and overrides functions, do those **function definitions** need to have the exact same return types, or can the return type be a subclass or superclass? ## Group exercise: sub/super for method overriding definitions :::success Consider the following example: ```java /* Group exercise: discuss whether you think in the * New subclass overriding the superclass methods, Java * can change the: * 1: Return type: Java can safely let ??1 be a: * - subclass (Animal) * - superclass (Object) * - both * - neither: must be the exact type * 2: Argument type: Java can safely let ??2 be a: * - subclass (Animal) * - superclass (Object) * - both * - neither: must be the exact type * * Hint: consider how another class would use Groomer/New. */ class Groomer { Animal pickUpPet() { // ... } void dropOffPet(Animal a) { // ... } } class New extends Groomer { ?? pickUpPet() { // ... } void dropOffPet(?? a) { // ... } } ``` ::: :::spoiler Think, then click! The correct answer for what Java can *safely* do is: ``` What Java *could safely allow* for overriding ??1 (return type) : subclass ??2 (argument type): superclass ``` However, the *actual* Java rules are more conservative for object **argument types**: ``` What Java *actually allows* for overriding ??1 (return type) : subclass ??2 (argument type): neither (must be exact type) ``` ::: To understand why we can use a subclass for the return type, consider this code *using* the subclass: ```java // Declared type is Groomer, but dynamic type is New Groomer g = new New(); Animal ani = g.pickUpPet(); // Would it break things if ani was actually a Cat? // No, because Cat has all methods that Animal has! // Would it break things if ani was actually an Object? // Yes, this is unsafe! Object has no noise method defined. ``` To understand why we *could* safely use a superclass for the argument type, consider: ```java n.dropOffPet(new Dog()) // Consider: // Would it break things if the argument were actually a Dog? // Seeing the problem is a bit more complicated here, since // it requires another subclass. // Yes, because although dropOffPet is defined on Groomer, // the dynamic type of New only knows how to call dropOffPet // on Cat, not Dog. New's implementation of dropOffPet // might call Cat's "makeHairball", for example, which Dogs cannot do! // Would it break things if the argument were actually an Object? // No, this would not break things. Cat's dropOffPet // implementation could only call Object's methods, but // that is fine. // However, this case is less intuitive! So Java does not // allow it for method overriding, and instead would treat // the method definition as overloading. ``` ## Contravariance and covariance In **general** safe OOP design (and other contexts with *subtyping*), when overriding methods defined in a superclass, the subclass does not need to use the **exact** same argument and return types. Rather: 1. The return type of the overridden method may be a **subclass** of the return type in the superclass. This is an example of **covariance**. 2. The argument type in the subclass may be a **superclass** of the argument type in the superclass. This is an example of **contravariance**. This is *safe* for OOP, but is **not actually implemented** by Java: Java makes a conservative choice to not support this for overridden methods. ### Covariance/contravariance in general **Covariance** means you can substitute a more specific (sub)type in a place where a more general (super)type is expected. - Applies to: Return types (including Java), arrays (though this can cause issues, which are out-of-scope today), some generic type parameters. - Intuition: “Going down the type hierarchy (more specific) is OK.” **Contravariance** means you can accept a more general (super)type in a place where a more specific (sub)type is expected. - Applies to: Some OOP function arguments (but not Java), some generic type parameters. - Intuition: “Going up the type hierarchy (more general) is OK.” Java doesn't allow contravariant parameter types in method overrides (even though it makes sense conceptually). There is lots of interesting further details on how covariance/contravariance impact Java generics, but we'll stop here for CS251. # Relationship between OOP and FP To close out our OOP unit, we'll zoom out and consider the difference again between OOP and FP, now that we understand the OOP mechanisms. # Mechanism: closed vs open functions/methods One core difference between the *mechanisms* of OOP and FP is how we can "shadow" function/method implementations. ### Shadowing for closures In OCaml and most functional languages: closures are, well, **closed** (at creation, they bundle up their environments into a closed context). For example: consider these mutually-recursively-defined functions in OCaml: ```ocaml (* Note, this is intentionally a slow, silly implementation! *) let rec even x = if x=0 then true else odd (x-1) and odd x = if x=0 then false else even (x-1) ``` Client code that uses these may shadow `even`, with a new definition, but calls to `odd` are **unaffected** (they use the "original" version): ```ocaml! (* Shadow "even" with a more efficient implementation: *) let even x = (x mod 2) = 0 (* This does not change odd (too bad, would be faster!) *) (* odd's particular implementation of even was CLOSED *) ``` ### Shadowing for "open" overriding In most OOP languages, subclasses can change the behavior of superclass methods, even ones they do not directly override! Here is the same example as before, now in Java: ```java class A { boolean even(int x) { if (x == 0) return true; else return odd(x-1); } // odd's particular implementation of even is OPEN boolean odd(int x) { if (x == 0) return false; else return even(x-1); } } class B extends A { // Improves odd in B objects, even though B does not // change/override odd directly! boolean even(int x) { return x % 2 == 0; } } ``` The new definition of `even` defined by `B` overrides `even` in **all** context for objects of dynamic type `B`, **including** in the `odd` implementation that is still only defined in `A`! Critically, `A`'s `odd` implementation was *not closed* when `A` was defined: the author of `A` **cannot know** what specific version of `even` might be called (unless `even` is modified with the `final` keyword). ## Closures vs. Objects In this way, we can consider the following differences between closures and objects: Closures: – Capture code of function, at function definition. – Capture all bindings the code may use, by lexical scope of definition. Objects: – Capture code for all methods that could be called on it, by class hierarchy. – Capture bindings that may be used by that code, by instance variables declared in class hierarchy. ## FP vs OOP decomposition/extensibility ### FP vs. OOP decomposition FP and OOP represent two distinct world views! In FP, functions perform some operation, which may use pattern matching to specialize on variant types or other patterns. In OOP: classes/prototypes give behavior to some kind of data. Subclassing allows specialization for subtypes. Which of these approaches is better depends on the domain, how the software is expected to change over time, and (perhaps most importantly) engineer preference and social factors. ### Table view of FP vs OOP decomposition Consider implementing functionality on our `Expr` running example, where `Expr` expressions can be either a `Var`, `Add`, or `Num`. The functionality/behavior we want to define is `interp` and `toString`, which should be specialized for each subtype. | | `interp` | `toString` | | -------- | -------- | -------- | | `Var` | | | | `Add` | | | | `Num` | | | :::success Consider the following: In FP, do we group code by rows of this table, or columns? Likewise, in OOP, do we group code by rows or columns? ::: :::spoiler Think, then click! FP groups code by columns (by function). OOP groups code by rows (by type of data). More on this below! ::: #### FP view of the table In OCaml, we would create the `Expr` subtypes with a variant type. Each function would then be a definition **over all variants**, with specialization via pattern matching. We can think of this as a vertical grouping of columns in our table! Said another way, in FP, function implementations, with subtype behavior spread across function implementations. | | :large_blue_square: `interp` | :large_red_square: `toString` | | -------- | -------- | -------- | | `Var` | :large_blue_square: | :large_red_square: | | `Add` | :large_blue_square: | :large_red_square: | | `Num` | :large_blue_square: | :large_red_square: | #### OOP view of the table In Java, we would create the `Expr` as a superclass, then define `Var`, `Add`, and `Num` as subclasses that each extend `Expr`. Each subclass would then define *its own* specialization of `interp` and `toString` via method overriding. Said another way, in OOP, we group code based on subtype, with function implementations spread across subtypes. Each subclass would then be a definition **over all overridden methods**, with specialization via inheritance. We can think of this as a horizontal grouping of rows in our table! | | `interp` | `toString` | | -------- | -------- | -------- | | :large_blue_square: `Var` | :large_blue_square: | :large_blue_square: | | :large_red_square: `BinOp` | :large_red_square: | :large_red_square: | | :large_yellow_square: `Num` | :large_yellow_square: | :large_yellow_square: | ### Implications on extensibility Choose which view is better partially depends on how we expect the code to change over time. When code can be easily adapted to future changes, we call it **extensible**. Extensibility is an important goal in real-world software engineering, since most code is not "one and done": it continuously evolves over time! Which of FP vs OOP is better for extensibility depends on how we expect the code to change in this way: do we think it's more likely/frequent that we will change the set of supported **functions**, or **subtypes**? #### FP: Good for new functions Since FP implementations are grouped by *function*, adding a new function is easy: we just define the new function without changing existing functions! For example, if we want to add a `typeCheck` function for our `Expr` language, we need only add a `fun type_check` definition. This is part of why programming language people love FP: in the long run, programming language ASTs are fairly stable, but the functions/analyses will define over them change more frequently! #### OOP: Good for new subtypes Since OOP implementations are grouped by *subtype*, adding a new subtype is easy: we just `extend` the existing superclass, without changing existing classes! For example, if we want to add a `Neg` variant `Expr` language, we need only add a `class Neg extends Expr` class definition. This is part of why OOP is the paradigm of choice for most Graphical User Interface (GUI) frameworks: for GUIs, adding new components (e.g., a new button type) is the more common case! ### Concluding musings As we've discussed, FP and OOP are different worldviews, but both can be incredibly useful in different cases. You now are more prepared to choose between the two, including in emulating good designs from one in a language that primarily uses the other!