To sign up for this course, you probably had to complete a class with programming assignments. Take a moment to list a handful of such assignments: what did you have to build? Some examples might include Search in 0180, Tetris in 0150, or a kd-tree in 0320.
Now ask yourself:

* How did you know what your program was supposed to do?
* How did you know that it actually worked?
* How did you know which algorithms and data structures to use?
Some of these questions have "expected" answers. For instance, you might say you know your code worked because you tested it. But is that really the truth? In terms of consequences, the true bar for "correctness" in your programming classes is the grade you got. That is, you "know" your programs worked because someone else said they did. And you knew what to do because you were told what to do. And you knew which algorithms to use, probably, because they'd just been taught to you.
Some time from now, you might be a professional software engineer. You might be working on a program that manages the fate of billions of dollars, tempers geopolitical strife, or controls a patient's insulin pump. Would you trust a TA to tell you those programs worked "good enough"? Would you trust your boss to understand exactly what needed to happen, and to tell you exactly how to do it? Probably not!
So we need to take seriously the question of correctness, from both directions:

* How do we discover what the right behavior of a program even is?
* Once we know the intended behavior, how do we gain confidence that the program actually exhibits it?
To do that, we need to think about what "correctness" even means, and how we might triangulate that idea in a useful way. Maybe we can't get to 100% confidence in correctness, but the perfect should never be the enemy of the good.
We'll talk about both of these aspects in time, but today let's focus on the second. We'll start from unit-testing (as practiced in intro and 0320) and deconstruct it. What does it do well? What does it do poorly?
Exercise: Make two lists to answer the above questions. Why do we test? What could go wrong, and how can example-based testing let us down?
Hopefully we all agree that example-based testing has its virtues and that we should keep doing it. But let's focus on the things that testing doesn't do well. You might have observed that (for most interesting programs, anyway) tests cannot be exhaustive, because there are infinitely many possible inputs. And since we're forced to test non-exhaustively, we have to hope we pick good tests: tests that exercise not only our own implementation, but also others (like the implementation that eventually replaces yours).
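One way to widen the net, which the first assignment (listed as PBT in the schedule below) heads toward, is property-based testing: instead of hand-picking examples, state a property that every correct output must satisfy and let a framework generate inputs and hunt for counterexamples. Here's a minimal sketch using the Hypothesis library; the sorting example is illustrative, not part of any assignment:

```python
# Property-based testing: state what *every* correct output looks like,
# then let the framework generate many inputs and search for a violation.
from collections import Counter
from hypothesis import given, strategies as st

@given(st.lists(st.integers()))
def test_sort(xs):
    out = sorted(xs)
    # The output must be ordered...
    assert all(a <= b for a, b in zip(out, out[1:]))
    # ...and must be a rearrangement of the input: nothing added or lost.
    assert Counter(out) == Counter(xs)

test_sort()  # Hypothesis runs this against many generated lists
```

Notice that these properties say nothing about *how* the sort works, so the same test applies equally well to whatever implementation replaces yours. Still, property-based testing is sampling, not proof.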
Worse, we can't test the things we don't think of, or don't know about; we're vulnerable to our limited knowledge, the availability heuristic, confirmation bias, and so on. In fact, we humans are generally ill-equipped for logical reasoning. Let's see if I can convince you.
Suppose we're thinking about the workings of a small company. We're given some facts about the company, and have to answer a question based on those facts. Here's an example. We know that:

* Alice directly supervises Bob;
* Bob directly supervises Charlie;
* Alice graduated from Brown; and
* Charlie did not graduate from Brown.
Question: Does someone who graduated from Brown directly supervise someone who graduated from another university?
Humans tend to be very bad at reasoning by hypotheticals. There's a temptation to think that this puzzle isn't solvable because we don't know where Bob graduated from. Even Tim thought this when he first encountered the puzzle: in grad school! For logic!
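In fact, the puzzle is solvable by considering both hypothetical worlds: if Bob graduated from Brown, then Bob (a Brown graduate) directly supervises Charlie (not one); if Bob didn't, then Alice (a Brown graduate) directly supervises Bob. Either way, the answer is "yes." A few lines of Python can grind through the hypotheticals for us; this sketch assumes the facts as listed above, with "Other U" standing in for any non-Brown school:

```python
# Enumerate both hypothetical worlds for Bob's (unknown) alma mater and
# check the question in each one.
def brown_supervises_non_brown(bob_school):
    school = {"Alice": "Brown", "Bob": bob_school, "Charlie": "Other U"}
    supervises = [("Alice", "Bob"), ("Bob", "Charlie")]
    return any(school[boss] == "Brown" and school[worker] != "Brown"
               for boss, worker in supervises)

# The answer is "yes" in every case, so it's "yes", full stop.
assert all(brown_supervises_non_brown(s) for s in ("Brown", "Other U"))
```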
Now imagine a puzzle with a thousand of these unknowns. A thousand boolean variables means $2^{1000}$ cases to reason through. Want to use a computer yet?
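To make the blowup concrete, here's a quick back-of-the-envelope check (plain Python, nothing course-specific):

```python
# Naive case analysis doubles with every unknown: 2**n cases in total.
for n in (1, 10, 100, 1000):
    print(f"{n} unknowns -> a {len(str(2 ** n))}-digit number of cases")
```

At a thousand unknowns the case count runs to over 300 digits; no amount of patience will enumerate that by hand, and no computer will enumerate it naively either.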
There is a prison in a magical world where an evil wizard holds a family of four gnomes. Every day, the wizard forces the gnomes to play a game for their freedom: he lines them up, single-file, in one or more rooms, facing the wall. The gnomes cannot move or communicate with each other; they can only look straight ahead. The wizard then pulls out four hats: two orange and two blue, and magics one onto each gnome's head.
The wizard then walks the line, asking each gnome: "What is your hat color?" They may try to answer, or remain silent for 10 seconds (and then the wizard moves on). If a gnome guesses correctly, they all immediately go free. But if one guesses incorrectly, they become the wizard's magical servants forever. So it's in their best interest to not answer unless they are absolutely convinced that they have guessed correctly.
Neither the wizard nor the gnomes can cheat. It's against magical law. The gnomes are, however, very intelligent. Smarter than the wizard for sure: they're perfect logical reasoners.
Here's an example configuration of the puzzle room:
*(Figure: an example arrangement of the four gnomes and their hats. Why are they smiling?)*
In this configuration, can the gnomes escape? If so, why?
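(Spoiler warning: the sketch below gives part of the game away, so try the puzzle yourself first!) One honest answer is to make a computer do the case analysis. The room layout in this sketch is an assumption on my part, since the real answer depends on the picture: gnomes 0, 1, and 2 stand single-file in one room (so gnome 0 sees the hats of 1 and 2, and gnome 1 sees the hat of 2), while gnome 3 stands alone in another room. A perfect reasoner answers only when every hat assignment consistent with what they know agrees on their own hat:

```python
from itertools import permutations

HATS = ("orange", "orange", "blue", "blue")
WORLDS = set(permutations(HATS))           # every possible hat assignment
SEES = {0: [1, 2], 1: [2], 2: [], 3: []}   # who can see whose hat (assumed layout)

def can_answer(world, me, learned=lambda w: True):
    """A gnome answers only if every world consistent with what they see
    (plus anything learned from other gnomes' silence) agrees on their hat."""
    consistent = [w for w in WORLDS
                  if all(w[g] == world[g] for g in SEES[me]) and learned(w)]
    return len({w[me] for w in consistent}) == 1

for world in sorted(WORLDS):
    if can_answer(world, 0):
        continue  # gnome 0 sees two same-colored hats, so theirs is the other color
    # Gnome 0's silence tells everyone that hats 1 and 2 must differ:
    assert can_answer(world, 1, learned=lambda w: w[1] != w[2])
print("In this layout, the gnomes escape in all", len(WORLDS), "possible worlds.")
```

The interesting move is the `learned` parameter: gnome 0's *silence* is itself information, and gnome 1 can fold it into their own reasoning. Mechanizing this kind of reasoning-by-hypotheticals is exactly what the tools in this course are for.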
There's a real cryptographic protocol called the Needham-Schroeder public-key protocol. You can read about it here. Unfortunately the protocol has a bug: it's vulnerable to attack if one of the principals can be fooled into starting an exchange with a badly-behaved or compromised agent.
Since this isn't a security course, we won't go into specifics. Instead, let's focus on the fact that it's quite easy to get things like protocols wrong, and sometimes challenging for us humans to completely explore all possible behaviors – especially since there might be behaviors we'd never even considered! It sure would be nice if we could get a computer to help out with that.
Last year, a pair of former 1710 students did an ISP on modeling crypto protocols, using the tools you'll learn in class. Here's an example picture, generated by their model, of the flaw in the Needham-Schroeder protocol:
You don't need to understand the specifics of the visualization; the point is that someone who has studied crypto protocols would. And this really does show the classic attack on Needham-Schroeder. You may not be a crypto-protocol person, but you probably are an expert in something you'd like to model, and you might very well get the chance to do so this semester.
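If you'd like a taste of the specifics anyway, here's a toy Python replay of the classic man-in-the-middle attack (due to Lowe) on the three-message core of the protocol. The "encryption" here is just a tuple tagged with its intended reader; this sketches the message flow, not real cryptography:

```python
def enc(reader, *payload):   # "encrypt for reader": only they can open it
    return ("for", reader, payload)

def dec(reader, msg):
    tag, intended, payload = msg
    assert tag == "for" and intended == reader, "wrong key"
    return payload

# 1. Alice innocently opens a session with the compromised agent I.
m1 = enc("I", "Na", "Alice")
# 2. I decrypts it (it was addressed to him!) and replays it to Bob,
#    posing as Alice.
m2 = enc("B", *dec("I", m1))
# 3. Bob replies "to Alice" with both nonces; I forwards it unchanged.
na, who = dec("B", m2)
m3 = enc("A", na, "Nb")
# 4. Alice sees her nonce Na come back, believes her run with I is honest,
#    and returns Nb, which I can read.
_, nb = dec("A", m3)
m4 = enc("I", nb)
# 5. I now knows Nb and completes Bob's handshake as "Alice".
m5 = enc("B", *dec("I", m4))
(nb_at_bob,) = dec("B", m5)
assert who == "Alice" and nb_at_bob == "Nb"
print("Bob finished a run he attributes to Alice; the intruder knows Na and Nb.")
```

Lowe's fix is small: include the responder's identity alongside the nonces in step 3, so Alice would notice the reply came from Bob rather than from I.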
The human species has been so successful, in part, because of our ability to use assistive devices – tools! Eyeglasses, bicycles, hammers, bridges: all devices that assist us in navigating the world in our fragile meat-bodies. One of our oldest inventions, writing, is an assistive device that increases our long-term memory space and makes that memory persistent. Computers are only one in a long line of such inventions.
So why not (try to):

* amplify our limited powers of logical reasoning;
* check a design against *all* of its possible behaviors, rather than just the examples we happen to think of; and
* catch the cases that our blind spots and biases would otherwise hide from us?
There's a large body of work in Computer Science that tries to do all those things and more. That's what Logic for Systems is about.
The semester will look roughly like the table below, although some of it is subject to change. Bold assignments denote multi-week projects. Italicized assignments denote pair homeworks.
| Week | Topic | Lab | Assignment (Out) |
|---|---|---|---|
| Jan23 | Properties | Python (Optional) | PBT |
| Jan30 | Structural Modeling in Forge | N Queens | Forge 1 |
| Feb06 | Stateful Modeling in Forge | Ring Election | Forge 2 |
| Feb13 | Stateful Modeling in Forge | Curiosity | |
| Feb20 | Relational Modeling | Curiosity col-lab | Curiosity |
| Feb27 | Relational Modeling | Memory Management | Forge 3 |
| Mar06 | Model Checking (LTL) | Stack | Model Checking 1 |
| Mar13 | Model Checking (LTL) | | Model Checking 2 |
| Mar20 | By Request or Need, Guests | | |
| Mar27 | Spring Break | | |
| Apr03 | Boolean SAT, Herbrand | | SAT 1, Proposals |
| Apr10 | SMT | SAT col-lab | SMT |
| Apr17 | Resolution | | SAT 2 |
| Apr24 | By Request or Need, Guests | SAT 2 col-lab | Term Project |
| May01 | Reading Period | | Term Project |
| May08 | Reading Period | Project col-lab | Term Project |
The dates of final project presentations are still TBD, but they will be scheduled sometime during the finals period. (Students who need to travel beforehand will be able to schedule remote presentations.)
Before leaving class, please do this exercise via Google Forms. It shouldn't take longer than 5 minutes. Feel free to discuss it with other students!