Lecture 0: Introductions

# Welcome!

Welcome to CS251: Principles of Programming Languages!

:::info
Agenda for today:
- Introductions.
- High-level course goals.
- Course expectations, learning outcomes.
- Group discussion on course AI policy.
- Programming paradigms, why Racket/OCaml.
:::

# Introductions

## Name magnets/name tents

If you missed class, I will have more on Monday!

## Professor introduction

I'm [Alexa VanHattum][avh]. You can call me "Alexa" (preferred!), "Prof. Alexa", or "Prof. VanHattum".

This is my third year as an Assistant Professor at Wellesley. My research sits at the intersection of programming languages and computer systems, with a focus on applying lightweight formal methods to compilers for systems languages.

Before Wellesley, I spent time as a software engineer for Apple Health, then completed grad school at Cornell. The focus of my PhD was programming languages and compilers for systems programming.

[avh]: https://cs.wellesley.edu/~avh/

## Student introductions

We each shared names, pronouns, favorite or least favorite programming language/programming project. This is a smaller elective, and we want to all get to know each other!

# Course learning goals

:::info
In this class, you will:
- Gain a more accurate view on what programs actually mean.
- Learn powerful programming language features (such as structural recursion, higher-order functions, and pattern matching) that will make you a stronger programmer across languages, even beyond the two functional languages (Racket and OCaml) we will use.
- Develop skills and strategies that will help you learn programming languages more quickly and effectively in the future.
- Approach problem-solving through the lens of language design and program analysis.
- Gain new experiences as a programmer and problem-solver.
:::

An undergraduate class on "programming languages" can reasonably mean a lot of different things.
In some iterations, a course might have students briefly study many different existing programming languages, without a ton of depth (like a "PL Zoo").

We'll instead take a different path, to focus on the *fundamentals* of programming languages:
- The properties and paradigms important across different PLs.
- The design, implementation, and underlying principles of PLs.

:::info
*Alexa's* version of CS251:
- Redesigned last year to focus more on implementation rather than theory.
- To deepen our study and expose you to more languages unlike what you've seen before, we'll first focus on picking up the *functional* programming paradigm.
- More focus on testing and other industry-relevant skills (which students from last semester shared has been especially useful!).
:::

# Go over the syllabus/website

:::success
Read the syllabus here! https://cs.wellesley.edu/~cs251/f25/syllabus/

Bookmark this page! https://cs.wellesley.edu/~cs251/f25
:::

- Assignments (roughly 1 per week)
- Help hours, Calendly (home page, click "CS251")
- Quiz and retake policy (3 quizzes, can retake 0-3 during finals)
- Gradescope (no Sakai, all assignments due here)
- Zulip (replacement for Piazza/Ed, Q&A, course announcements)
- Collaboration policy
- Expectations

# AI policy

All Wellesley Computer Science classes have explicit Generative AI Policies this year.

Last year, CS251 had a restrictive policy (no allowed uses of AI). This year, I'd like to experiment with opening this up as a topic of discussion, with some bounds on our choice as a class.

Generative AI tools (ChatGPT, Claude, Copilot, Cursor, etc.) know a lot about programming languages, but they also have serious limitations that at times are at odds with our course goals.

:::success
*(Note: green blocks in the course notes indicate questions to the class.
Often, there will be collapsible sections with an answer/outcome below these blocks; click to reveal it.)*

What are some "pros" you see to using generative AI for learning about programming languages? What are some "cons"?
:::

:::spoiler Think, then click!
Some of the pros we came up with:
- AI can be used to quickly reference new syntax.
- AI might be able to save time by helping you diagnose errors.
- AI might be able to help you convert from a language you do know to one you are learning.

Note that many of these "benefits" are not consistently supported by recent research on the topic. Some studies show that use of AI systems actually *slows down* programming/debugging in certain contexts.

Some of the cons:
- AI systems have serious environmental impacts.
- AI code can be wrong! This is more likely to be the case for non-Python, non-JavaScript languages, because there is less training data (less seriously, AI code is more likely to be "non-canonical" for less widely used languages).
- Using AI systems to write code for you, especially earlier in your programming experiences, can impede your own learning.
- AI systems may produce code you don't actually understand, or use features outside of the scope of the class.
- Copyright issues, along 2 fronts: (1) these systems were trained without authors' consent, and (2) these tools can regurgitate exact copies of code without proper attribution.
- If one of those copies has something like a security bug, the lack of attribution or actual tracing/use of libraries makes this harder to fix.
- LLMs can only extrapolate from what they have seen. I suspect they have a very limited ability to design new languages for new problems.
- You should be able to fluently discuss these topics in contexts where you don't have access to AI tools.
- AI can get things very wrong, and usually does a bad job fixing its own errors. You, the programmer, need code comprehension to be able to debug.

When I talk to working software engineers, some use these AI tools in their daily workflows, but with limitations. Some don't use these tools at all. The ones who use them often spent years learning to code "the traditional way" first.
:::

### How this applies to CS251

One goal of a college class is to assess how well a student has mastered the material. There are (at least!) 2 good reasons for this:

(1) It's a condensed "recommendation" from me as to your mastery of what we covered. An "A" indicates mastery of the material. A "C" indicates competence, but not as much mastery.

(2) For many students (including me, in the past!) it's motivating to have clear feedback on what you do and don't understand, and a grade may hold you accountable to learning more. (For example, as a student, I felt I learned a bit less in classes I took credit/non.)

How do grades then relate to AI use?

CS251 has quizzes, where AI is decidedly **not allowed**. The question then becomes: what is our balance between quizzes and other coursework (primarily, coding assignments)?

:::info
I have a proposal to the class, which is at a high level what is currently in the syllabus ("proposal A"). I also have an alternative proposal that allows more AI use, with fewer restrictions ("proposal B"). We'll decide as a class which proposal to go with this semester.
:::

:::warning
Notably, neither policy *requires* nor *encourages* the use of AI. There are serious ethical concerns with these tools, and unclear evidence around the benefits. The assignments are designed such that you **do not need** to use AI at all to finish them in a reasonable amount of time. At the same time, I want to recognize that you are adults entering a world where AI tools are increasingly common.
:::

#### ==Proposal A==

Written assessments (quizzes) without AI count for about half your grade (48%), with coding assignments and participation making up the remaining half.
This means the assignments themselves must be more of a real assessment of mastery, which I see as imposing limitations on AI use.

The AI policy will be specified on a per-assignment basis. Any use must be cited in your written README.

The default will be for assignments to allow AI to look up syntax or help interpret error messages, but to disallow copying any code.

On later assignments, after you have some experience in each language, AI can be used to generate some snippets of code, with the stipulations:
1. You must still follow attribution standards (we'll talk about this as it comes up).
2. You must describe how much of your code comes from AI.

#### ==Proposal B==

An alternative proposal is to allow less restricted AI use on all assignments. That is, to view the assignments as less substantial in terms of *assessment*, and to put more weight on the quizzes.

In this proposal, "anything goes" on assignments as long as your use of AI follows our citation and attribution standards.

On this alternative path, quizzes would be worth 70-75% of your grade.

:::info
As an experiment this semester, I'd like us to decide this policy as a course (one policy for the entire class). Assignment 0 will have a question on which of these two paths you'd prefer for this semester.

Note that Proposal A involves a good deal of trust: I implore you to only choose Proposal A if you're committed to abiding by these restrictions.
:::

I'll let you know the outcome of this poll next week. For Assignment 0, assume we are working under ==Proposal A==.

Now, let's get to the fun stuff! ==Let's dive in to programming languages!==

# Paradigms

Paradigms are specific models and shared features across different languages. These are usually not strict boundaries, but rather broad groupings (with "hand wavy" distinctions).

## The imperative paradigm

Most programming you've done up to this point is _imperative_.
We can think of imperative code as being "step-by-step": you tell the computer what to do via a series of instructions. This comes from the lineage of assembly code, which is organized as a sequence of instructions, with control flow expressed by jumping between blocks of code.

The primary means of processing data in imperative code is via loops: `for`, `while`, `do while`, etc.

Data in imperative languages is more closely aligned with its representation in memory: memory is a contiguous array of bytes. Types group the bytes into e.g. integers (like the `int` type in C/Java). The core composite data type in most imperative languages is an array, which is a contiguous group of values indexed via integers.

:::info
Examples of primarily imperative languages: C, Java, Python.
:::

## The functional paradigm

Functional programming languages are more _declarative_: they describe _what_ a computation is, rather than _how_ to do it step-by-step.

This is more closely aligned with pure mathematics, where a function uniquely defines one output for a fixed (set of) inputs.

:::warning
Note that in imperative languages, like Python, many functions do not match this pure mathematical definition. For example:

```python
y = 0  # mutable state defined outside of `f`

def f(x):
    return y  # the result depends on `y`, not just on the input `x`
```

`f` is a Python function, but the same input (`x = 0`) could produce many different outputs, based on the value of `y`. This is thus not a "pure" function when `y` is mutable (can be changed).
:::

Another feature of the functional paradigm is that the primary means of processing data is via recursion, instead of loops.

Functional languages avoid mutating data, and are thus further removed from the low-level representation of program memory. (In contrast, loops in imperative languages typically require mutating the loop index!)
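
To make the loop-versus-recursion contrast above more concrete, here is a minimal sketch in Python (used only because it is likely familiar; the function names `sum_loop` and `sum_rec` are hypothetical, not part of the course materials). Both functions sum a list of numbers, but only the first reassigns a variable as it goes:

```python
def sum_loop(nums):
    """Imperative style: a loop that repeatedly mutates `total`."""
    total = 0
    for n in nums:
        total = total + n  # reassignment: `total` changes on every iteration
    return total


def sum_rec(nums):
    """Functional style: recursion on the list, with no reassignment."""
    if not nums:  # base case: the empty list sums to 0
        return 0
    # recursive case: the first element plus the sum of the rest
    return nums[0] + sum_rec(nums[1:])
```

In the recursive version, each call works with its own `nums`, and no variable is ever updated after it is bound.
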
For example, you'd avoid the following pattern of mutation in most functional languages:

```python
# This code is more imperative than functional
x = 3
x = 5
```

Another primary feature of functional languages is that they _treat functions as values_ (which can be passed as arguments, stored in variables, etc). This is related to "first class functions" and "higher order functions", which we'll define more precisely over the coming weeks.

Imperative languages also often support some version of functions as values, even though functions are not necessarily "first class". This is an example of functional patterns bleeding into more popular languages!

For example:
- In C, you can pass around function pointers as a specific language feature.

:::success
- In Python, how do you sort a list of objects based on one field of the object? That is, if we have a `class C` with a field `id`, how would we sort a list `my_list` by each `C`'s `id`?
:::

:::spoiler (Think, then click!)
You can use a `lambda` or _anonymous function_:

```python
my_list.sort(key=lambda o: o.id)
```

`lambda o: o.id` is a function (with no name, and thus *anonymous*) that is passed as an optional argument to the list's `sort` method. The `key` argument of `sort` expects a function that tells it how to choose the sort key out of an object.

"Lambda" more broadly means an unnamed function. Later in the semester if we have time, we will briefly cover the _lambda calculus_.
:::

:::info
Examples of primarily functional languages: Racket, OCaml, Lisp, Rust
:::

## Mindset

Functional programming is a whole new way of doing things! I encourage you to feel comfortable approaching learning Racket and OCaml as fun new problem solving exercises, distinct from the programming you've done prior to now. Think of these as a new game, with new rules to learn! :)

## Back to PL paradigms: why Racket and OCaml?

:::info
_Static_ means before the program runs.

_Dynamic_ means as the program is running.
:::

Broadly: A static *type system* finds type errors before the program runs. A dynamic type system only hits the error when the program actually reaches it.

People tend to think of static as "stronger" and dynamic as "weaker" types.

If we take the cross product of these categories:

:::info
|                   | Primarily imperative | Primarily functional |
| ----------------- | -------------------- | -------------------- |
| Statically typed  | Java                 | 🆕 OCaml             |
| Dynamically typed | Python               | 🆕 Racket            |

Racket and OCaml are great languages to learn, because they cover the two quadrants you have less experience with!
:::

Next class, we'll dig more into "why functional programming" before getting into "what, exactly, is a programming language?".
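
As a small illustration of the static/dynamic distinction described above, the sketch below shows a type error in Python (dynamically typed) that goes unnoticed until execution reaches the faulty line. The function name `describe` is hypothetical and chosen just for this example:

```python
def describe(n):
    if n > 0:
        return "positive"
    # Bug: concatenating a str with an int is a type error, but Python
    # only detects it if execution actually reaches this line.
    return "non-positive: " + n


print(describe(5))  # runs fine: the buggy line is never reached

try:
    describe(-1)  # now the buggy line runs
except TypeError as err:
    print("TypeError at run time:", err)
```

A statically typed language such as OCaml would be expected to reject the analogous program before it runs at all, because the type checker examines both branches regardless of which inputs are ever used.
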