Objects - HackMD

:::warning Announcements - Assignment 8 due Thurs. - Help hours today 3:45-5:15pm, L043 - MIT Programming Languages Review workshop Friday, April 25: http://plr.csail.mit.edu (small extra credit bump) ::: :::success Agenda for today - Concurrency - Bit more on `mutex` abstraction - Object-oriented programming - What is an object? - How are names resolved? - Group exercise: predicting Java output - (Next time) Overriding/dynamic dispatch - (Next time) Relation to functional programming - (Next time) More on inheritance/subtyping ::: # Concurrency: more on `mutex` Recall our example from last time: ```ocaml let sleep_then_print delay lines mutex = Unix.sleepf delay; Mutex.lock mutex; List.iter (fun line -> print_endline line ) lines; Mutex.unlock mutex (* lists not shown *) let main () = (* Create a mutex to manage stdout *) let mutex = Mutex.create () in (* Pass it to each sleep_then_print call*) let t1 = Domain.spawn (fun ()-> sleep_then_print 2. list1 mutex) in let t2 = Domain.spawn (fun ()-> sleep_then_print 2. list2 mutex) in Domain.join t1 ; Domain.join t2 let _ = main () ``` The OCaml library will ensure that at runtime, whichever domain reaches the `Mutex.lock mutex` line first gets to "have" the mutex and complete its printing; making the other domain wait until the mutex is unlocked before allowing it to enter that section of the code. Both domains can still sleep at the same time, allowing true parallelism for the independent/non-shared parts of their work. The `mutex` forces the code between the `lock` and `unlock` to happen sequentially. ### Deadlock If we made a mistake and forgot to call `unlock`, we could run into an issue where the second domain to get to the `Mutex.lock mutex` lock line can never actually make progress - it becomes stuck waiting for a `mutex`/lock that never becomes available. This is known as **deadlock**, and is again a shared term across different programming languages. **Deadlock** is when one or more threads (or other concurrent actors) are waiting for a resource held by other threads in the group, in a way that prevent progress. Avoiding deadlock is a substantial issue in distributed systems (see CS343!). Some languages provide higher level "synchronization" constructs that avoid the problem of remembering to unlock in the right place: see part 3 of homework 8! # Object-oriented programming ## Why study object-oriented programming Why study OOP in a programming languages class, especially when we mostly focused on functional programming? - OOP is frequently brought up as a core paradigm, along with imperative/functional programming. I don't find thinking of these as firm delinations to be very useful, but understanding some key features is important. - Many languages support some forms of objects, even if they are not primarily OOP. - Understanding the *implementation* of objects helps with debugging and improving the performance of real-world code. ## What is an object? Everyone has taken CS230 (or the equivalent), so you all have seen some Java. :::success What is an object? ::: **Answer**: definitions vary! We can think of an object as: - a *value* (cannot be evaluated further) - that maps *names* to other things: either: - other values (*fields*), and - specific flavors of functions (*methods*) ## How are *names* resolved? Names are key piece of semantics in any language. ### In Racket, OCaml In the Racket and OCaml we've studied, variables are the result of bindings. Closures follow lexical scope, which provides unambiguous binding semantics. We saw in OCaml, records have field names, but (1) field names are not truly variables, and (2) we do not have any difficult "lookup": the name always refers to a piece of the overall value. ### In Java (and other OOP languages) In Java, local variables follow lexical scope, and closures are less a key part of the language. We do have a new concern, though: how to handle name lookup for instance variables (fields) and methods. Because of *inheritance*, name lookup is more complicated! ## Name lookup in OOP languages ## Design space Language designers have have at least two core questions to consider when designing field/method name semantics (from the [PLAI textbook](https://www.plai.org)): 1. Is the set of method/field names statically fixed? Can programmers change it on-the-fly, dynamically? 2. Is the method/field being accessed at a program point statically fixed, or can it be computed dynamically? By this, we mean: can you construct a name at runtime by, for example, concatenating strings? This gives us a 2x2 design space where languages can sit: :::success Consider: Where do we think Java goes in this diagram? What about Javascript? OCaml has some OOP features, where do you think they go? ::: | | **Name is static** | **Name is computed** | |------------------------------|---------------------------------------------|-----------------------------------------------| | **Fixed set of field/methods** | Basic Java, OCaml OOP | Java with reflection. | | **Dynamic set of fields/methods** | Not useful | Most “scripting” languages, Javascript, Python | The bottom left quadrant does not really make sense for any language, for the following reason: if the *set* of names must be statically set before runtime, it wouldn't make sense to be able dynamically construct a name to *use* during runtime. ### Name lookup in scripting OO languages The bottom right quadrant is the most permissive/least strict, and is where most scripting languages sit. It's also where language with so called "duck typing" sit. :::info The name "duck typing" comes from the saying: > “If it walks like a duck and it quacks like a duck, it must be a duck.” In languages with "duck typing", instead of checking an object's type, the language just dynamically checks if the object has the method or field in question. If it does, the language allows the access/call to happen. ::: This quadrant includes both Python and Javascript. :::success In term of implementation, how do these languages work? ::: **Answer** they typically use a dictionary, or hash-table, to represent objects. The name of a field/method is just a string (or symbol) key into the dictionary. We can dynamically construct names, because we can dynamically construct strings. :::info Side note: as noted in older versions of the PLAI textbook, this "objects as dictionary" mentality is actually how the JSON data format was invented. JSON is a subset of the textual representation of Javascript objects. ::: This underlying dictionary strategy makes this quadrant less semantically interesting, so we won't focus on it further at this point. ### Name lookup in more statically-typed OO languages, like Java The top left quadrant is the most interesting in terms of (1) being traditional object-oriented programming, and (2) having interesting semantics for inheritance. We'll thus focus in on the flavor of objects that have a statically fixed set of names and static name references, such as Java. Within this segment of the design space, we can consider how names are resolved with Java inheritance as a concrete example. Let's first do an exercise. ## Group exercise What is printed by the following Java program (4 print statements in total)? Draw pictures of the objects to help you understand what's happening. :star: note, this uses *variable overriding*, a feature of Java that classes like CS230 often encourage students to avoid. So do not feel bad or worry if you are unsure of the answer! (Note: program adapted from a quiz originally by [Justin Pombrio](https://justinpombrio.net)). ```java class A { public int x = 1; public int f() { return x; } public int g() { return x + f(); } } class B extends A { public int x = 10; public int f() { return x; } } class C extends B { public int x = 100; public int g() { return x + f(); } } public class Tricky { public static void main(String[] args) { A declAC = new C(); System.out.println(declAC.x + declAC.f()); System.out.println(declAC.g()); C declCC = new C(); System.out.println(declCC.x + declCC.f()); System.out.println(declCC.g()); } } ``` # Solution ``` 11 110 110 110 ``` Explanation: We have 3 classes: `A`, `B`, and `C`, where: - `B extends A` (inherits from `A`) - `C extends B` (inherits from `B`) Each class has its own `x` field, and `B` and `C` **override** some methods: - `B` overrides `f()` - `C` overrides `g()` To figure out which values get used, we need two core rules from Java inheritance. First, we have a distinction between the *declared* (or static) type and the *actual* (or dynamic) type. In the snippet: ```java A declAC = new C(); ``` `A` is the declared/static type, and `C` is the actual/dynamic type. The declared/static type of an object is based on what the variable (or argument) is declared as. The actual/dynamic type of an object is the type that is was instantiated with at runtime (which `new` was actually called to make the object). Now, to the two core rules. The first, you should be more familiar with from CS230: :::success Do **method calls** use the declared or actual type? ::: **Answer**: the actual type. This is often seen as a selling point of polymorphism! :::info Rule 1: method name lookup is based on the actual/dynamic object type at runtime. Methods then follow inheritance rules: the most-specific (lowest in the inheritance diagram) implementation is used. ::: The second, you have less practice with from CS230: :::success Do **field name lookups** (instance variable lookups) use the declared or actual type? ::: **Answer**: field name lookups are based on the *declared* type. This does not come up as much in CS230, since often, OOP design encourages *private* fields that objects could not even try to override. *Fields are not overridden in the same way as methods are*. :::info Rule 2: field name lookup is based on the declared/static object type. ::: With these rules, we can examine each print in more detail: ## First declaration ```java A declAC = new C(); System.out.println(decAC.x + decAC.f()); System.out.println(decAC.g()); ``` ### `x` and `f()` - `decAC.x` => Declared type is `A`, so this accesses `A`'s field: `x = 1` - `decAC.f()` => This is a method call, so we look to the actual object type of `C`. `C` has no overridden definition of `f`, so we look up in the inheritance diagram. `B` does override `f`. So `B.f()` is used. - Inside `B`'s definition for `f`, the `x` that is referenced is referenced with static type `B`. So this returns `B`'s value for `x`, which is `10`. - So: `decAC.x + decAC.f() = 1 + 10 = 11` ### `g` - `decAC.g()` => This is a method call, which we base on the actual type of `C`. `C` does override `g`, so we look there. - Inside `C`'s `g` definition: - `x` is `C`'s `x = 100` field. - `f` again results in looking up `B`'s `f` body. This return's `B`'s `x = 10` - So: `decAC.g() = 100 + 10 = 110` ## Second declaration ```java C declCC = new C(); System.out.println(declCC.x + declCC.f()); System.out.println(declCC.g()); ``` Here, the declared and actual types are the same in `main` (though function definitions still use the `x` of that object). - `declCC.x` => again refers to `C`'s field: `x = 100` - `declCC.f()` => Inherited from `B`, so returns `B`'s `x = 10` - So: `cc.x + cc.f()` → `100 + 10 = 110` - `declCC.g()` → Defined in `C` - `x = 100` - `f()` = `B.f()` → returns `10` - So: `100 + 10 = 110` # Dynamic dispatch The general name for how methods are resolved dynamically is *dynamic dispatch*. This is a core feature of both OOP languages and less object-oriented languages, such as Rust. We'll cover this more next class! :::info Side note: Alexa has a published paper on Rust's dynamic dispatch implementation that arose out of a PhD intern project at AWS! [Verifying Dynamic Trait Objects in Rust](https://ieeexplore.ieee.org/document/9794041). ::: More on objects next time!