Concepts in Programming Languages

# Concepts in Programming Languages ###### `supervisions`, `Part 1B` ### 1 The Ancestors #### 1.1 An idealised computing device that can execute a specific programming language directly (a). Static storage machine (b). Both static allocation and heap allocation with garbage collection. (c\). All of the previous, but we introduce the stack, which allows nested scopes and recursion. #### 1.2 - Static type checking - #### 1.3 (a) Shorter specification means it's a simpler dialects, and it's cheap (timewise) to onboard; However a more complex dialect, while it can take more time to master, can then avoid excessive verbosity and provide a more efficient experience. (b) ```lisp (defun main () (f 1)) (defun f (x) (g 2 (lambda () x))) (defun g (x myfn) (apply myfn ())) ``` Static scoping should give 1 whereas dynamic will return 2. #### 1.4 (a) ```lisp (eval ’(+ 1 2 (eval ’(+ 3 4)))) ``` The parser evaluates expression (E), and sees E = `eval X`, so it will evaluate X and then apply eval to X. X = 'Y, hence X gets evaluated to Y. then (eval Y) means the interpreter will just have to evaluate Y = `(+ 1 2 (eval ’(+ 3 4)))`. Analogously, the third argument of the sum gets evaluated to 7, and then the sum can happen. ```lisp (cons 1 (list 2 3 (eval (cdr (cons 4 '(cdr '(5 6))))))) ``` (cons 4 '(cdr '(5 6))) just goes to (4 cdr '(5 6)), so cdr of that expression goes to (cdr '(5 6)), which evaluates to (6) - '(5 6) goes to (5 6); cdr takes the (6); then (list 2 3 (6)) goes to (2 3 (6)) and finally (cons 1 (2 3 (6))) goes to (1 2 3 (6)). (b) (c\) #### 1.5 (Errata: in the example of funcall, the parenthesis does not match) Mystery map a function to all the elements of the passed list i.e. ML map function ```lisp (defun map (f x) ( cond ((consp x) (cons (apply f ( cons (car x) nil)) (map f (cdr x)))) (T nil) ) ) ``` #### 1.6 (a) * call by value: in f(expr), expr gets evaluated. if it's an rvalue, it gets passed to the function; if it's an lvalue, its value (an rvalue) gets passed to the function. * call by reference: as before, but if expr evaluates to an lvalue, an alias to the storage represented by it gets passed, so that the function opertaing with that actually manipulates the passed variable * call by value result: the call behaves as by value, but when the call returns, the value of the corresponding callee's variable is written back into the caller's variable * call by name: in f(a), just 'a' is passed, and f * call by text: in f(expr), expr is not evaluated but passed to the program, and lazily evaluated by it only when necessary. (b) ```cpp int x = 100; int foo(int y){ y+= 10; x+=y; return y; } // call by value: // equivalent to calling foo(100): //will return 110 and x will be equal to 210 //call by reference: assume signature like int foo(int &y); // here we can substitue y with x in the body: hence x+=10; // -> x = 110 x+=x; // -> x = 220 return x; // -> return 220 //call by value-result: calling with y=100, with final writeback on x hence the function will return 110 and x will be equal to 110. ``` #### 1.7 (a) A discriminated union (or variant) is a types which unites types i.e. given types $\mathcal{T} = \{T_1,T_2,\ldots,T_n\}$, with not necessarily disjoint domains $D_{T_1},D_{T_2},\ldots,D_{T_n}$ their union is type $T$ whose domain is $$D_\mathcal{T} = \{(t,v)\mid t \in \mathcal{T},v \in D_t \}\subseteq \mathcal{T} \times \bigcup_{T \in \mathcal{T}} D_T$$. (b) However these are faulty implemented in Pascal, having simply $$D_\mathcal{T} = \mathcal{T} \times \bigcup_{T \in \mathcal{T}} D_T$$ i.e. the label might not match the value stored. (c\) In C we can go with ```c union T { t_1 v_1; t_2 v_2; ... t_2 v_2; } V; ``` C++ fanboy note: `std::variant<Ts...>` is cooler and actually enforces the rules above i.e. `void std::variant<T1,T2,T3>::set<T2>(const &T2 value)` will set the label to T2 and actually accepts a T2 only. (e) - #### 1.8 * Dynamic lookup: which member to access, or which method to execute, can be detrmined at runtime * Abstraction: the logic is exposed, the inner behaviour is not. * Subtyping: If T2 provides all the functionalities of T1, all the method accepting a T1 will behave seamlessly when handed a T2 instead * Inheritance: the ability to reuse the definition of one kind of object to define another kind of object #### 1.12 ##### Apology of Pascal naïveté Pascal was born as a didactic language. I cannot write this paragraph without being partial about it, since it was also my first language, and I thought it was a good first one. I guess the intent of the Niklaas was to discourage BASIC-like extensive use of unstructured jumps, hence he decided to only provide the basic flow control structures. On a more technical viewpoint, since for loops are based not on a condition which might become false at any time, but on a range, upon entrance in the loop we can know how many iterations there will be, and this could be a great place for compiler optimizations. ##### A critique Pascal has never really left the classroom - Even in its modern flavours like Lazarus or Delphi, we could argue it has never been a top language for any use case. It is not in the top 50 language for use (TIOBE index), and each of the top languages has its own: C is fast, Java allows worse engineers to code ;-) , Python offers a fast development iteration, C++ is both performant and powerful, etc. The lack of `break` and `continue` is just another of the missing feature of an immature language, which does not have much to offer in terms of versatility, ease of use or portability. ### 2 Types #### 2.1 (a) If a phrase is accepted by the language, it will not produce a type error. (b) In C, lots of function use void pointers to pass parameters around. (c\) Type checking is static if it's performed at runtime i.e. after it has succesfully compiled, no operation will produce a type error. On the other hand, if at runtime we still have to perform some type check, we'll say the type-checking is dynamic. (d) I don't think so - take any programming language which performs both static and dynamic type checking to ensure strong typing. Remove dynamic type checking and you'll have a static typechecked language which is not strongly typed. (e) Java is both. #### 2.2 1. Multiple, different functions with the same name, but different signature. For each call, statically or dynamically, the best-matching version is chosen, or an error is produced when no one is applicable. An example: ```java double expectedValue(discreteRandomVariable x) { double e = 0.0; for(Outcome o : x.outcomes()){ e += o.probability() * o.value(); } return e; } double expectedValue(continuousRandomVariable x) { return Calculus.integrate(x.probabilityDistribution(), Calculus.NegInfty, Calculus.Infty); } ``` 2. A single definition of a function may be applied to a subclass ```java char readCharacter(InputStream i){ return i.getchar(); } readCharacter(new FileStream(File.open("hello.txt"))); readCharacter(new NetworkStream(Os.openSocket(...))); ``` 3. Generic polymorphism (Generics in Java, Template in C++, 'a types in ML, duck-typing in Python etc.) ```java <T> T function duplicate(T x){ return x*2; } ``` #### 2.3 ```haskell fn x => fn y => fn z => (z (x y)) y ``` lets' assign types and make inferences! ``` x : a y : b z : c (z (x y)) y : d z (x y) : e = b -> d (x y) : f z : c = f -> e = f ->b -> d x : b -> f ``` Hence ``` fn x => fn y => fn z => (z (x y)) y : (b -> f) -> b -> (f -> b -> d) -> d ``` #### 2.4 Assume S is subtype of type T * A generic (or templated) type system is **covariant** when G\<S> is subtype of G<T> (e.g. when G is shared_ptr, we'd like a function accepting shared_ptr<T> to accept also shared_ptr<S\>) * A generic (or templated) type system is **contravariant** when G<T> is subtype of G<S\> * A generic (or templated) type system is **invariant** when none of the abolve holds #### 2.5 #### 2.6 In C++, B : A means that an A * can hold a B * (without entering the matter of ownership), but this doesn't mean A** can hold a B**. #### 2.7 At line 40, we assign ```java=30 Fruit [] r = av; r[3] = f; ``` f, which is a fruit, to r[3]. Remember r is in fact pointing to an array of Apple references, and we cannot make an Apple reference point to a Fruit. As for the commentend blocks: ```java=20 Apple ERRORa = f; ``` The exact thing we were talking before, but would get caught at compile time - an explicit cast would be needed, which could fail at runtime. ```java=31 ArrayList <Fruit> ERRORq = al; //ERROR! ``` We don't have covariance for doing this. We can at omst do this ```java=32 ArrayList <? extends Fruit > p = al; ``` But the covariance acts on the container as a whole: ```java=34 p.set(3,f); // ERROR! ``` i.e. if it is actually referencing an ArrayList of Apple, we cannot assign a Fruit to it. #### 2.8 The definition is quite arbitrary - but it definitely involves the automation of tasks in a run-time environment. While this rule out compiled languages like C/C++, this is more related to the use that people make of a particular language - e.g. JavaScript/Bash are more associated with the script use, than maybe Python or even Java (yeah, there's a Java Shell ;-) #### 2.10 Dynamic typing: the type of anything is determined at runtime. This allow following python function: ```python def f(x): return x*2 ``` To deal with `x` regardless of its runtime type. The function will succeed as long as `x` support the product operator where the second argument is an `int` (actually, just the number 2: it might not support negative integers). This way, the specification of the type accepted by `f` is defined by what `f` expects `x` to be able to perform. This concept is called duck typing. ##### 2.11 (a) ``` empty = fn : 'a queue -> bool pop = fn : 'a queue -> 'a * 'a queue push = fn : 'a * 'a queue -> 'a queue ``` (b) - single list: ``` datatype 'a queue = Q of 'a list; fun empty (Q([])) = true | empty _ = false; fun pop (Q(x::l)) = (x,Q(l)); fun push (x,Q(l)) = Q(l @ [x]); ``` - double list: ``` datatype 'a queue = Q of 'a list * 'a list fun norm (Q([],l)) = Q(rev l,[]) | norm q = q; fun empty (Q([],_)) = true | empty _ = false; fun pop (Q(x::l,b)) = (x,norm(Q(l,b))); fun push (x,Q(f,b)) = norm(Q(f,x::b)); ``` (c\) It is transparent in both cases (altought with different underlying types) because we could have just used `'a list` or `'a list * 'a list` to define our queue. However declared datatypes in ML, even if they have a single option, are considered different from their underlying type (like C/C++ in struct definitions). #### 2.12 A Java interface makes no assumption of the underlying data of the implementing class, nor provide any default behaviour for its logic, but rather a specific requirement. If we want to enforce a similar idea in ML, we could write down some functions we'd like to have for that type in order to implement a particular interface, we the functions name should depend on the type (no polymorphism) and the system can't actually enforce those rules. On the other hand, classes and structures are more similar, but again, we lack polymorphism, subtyping and the bond between an object and its logic in ML. ### 3 Further Concepts #### 3.2 (a) Concurrent machines is quite complicated, and we need to work with an abstracted model. However we have a compromise between complexity of the asbtraction and fidelity of it. (b) It abstracts how the threads organize themselves, ad just leaves the user the possibility to specify which work will be done by a different thread (spawn) and when to wait for all threads launched in the current scope (sync). #### 3.4 A is terribly wrong. Java is a very limited language, that while can boast an extensive library when it comes to various functionalities, the core language won't allow you to do certain things. To say some, Java is garbage collected, and cannot be reference counted whereas in C++ you can choose the strategy you prefer, or actually implement yours. Or if you wanted to avoid dynamic allocation at all, you could (always in C++) decide to allocate everything on the stack (or on the heap if you needed) while in Java you're forced to allocate primitives and references on the stack, and objects on the heap. Not to mention the lack of sensible copy/move semantics, that result in less flexible management of resources, and when a simulacrum of implementation appear, this is poorly done, solely relying on references, increasing terribly the indirection level. In addition each method is virtual, forcing you to opt-in for a more expensive runtime method resolution. All of this not to mention the lack of other (at least three) computation models (the preprocessor, the templates, the constexpr) which prevents you from performing satisfactory computation/definition at compile time. B is just uneducated in the field of parallel programming. Parallelism won't make your code magically faster. In general many #### 3.5 Where external iteration mixes the what work is being performed and also the how is that begin assigned (for loop/iterator), internal iteration lets the user to specify only the "what" but lets the library control the "how". This is particularly good for multithreading, as efficient libraries might to do some runtime predictions of the work, and the decide how to split the latter in equal parts by retaining some thread locality in the context of the iterated content. With external iterator you might end up writing some suboptimal strategy (e.g. work queue, work `i` at thread `i%n`, divide the index in equal chunks) which would also require greater efforts as this should be done in any place something like that is needed. #### 3.6 It's the problem of, when in a language of variant/union datatypes and case-structured function, we need to additional cases to both data and operations. This usually comes with the need of recompiling everything. #### 3.7 I'm quite confused by monads - I think I see the monad M<T> as a type T decorated by a computation M. The two operations are - building a monad : `T -> M<T>` - concatenating monadic functions : `M<T> ->(T -> M<T'>) -> M<T'>`