To sign up for this course, you probably had to complete a class with programming assignments. Take a moment to list a handful of such assignments: what did you have to build? Some examples might include Search in 0180, Tetris in 0150, or a kd-tree in 0320.
Now ask yourself:

* How did you know what your program was supposed to do?
* How did you know that it actually worked?
* How did you know which algorithms and data structures to use?
Some of these questions have "expected" answers. For instance, you might say you know your code worked because you tested it. But is that really the truth? In terms of consequences, the true bar for "correctness" in your programming classes is the grade you got. That is, you "know" your programs worked because someone else said they did. And you knew what to do because you were told what to do. And you knew which algorithms to use, probably, because they'd just been taught to you.
Some time from now, you might be a professional software engineer. You might be working on a program that manages the fate of billions of dollars, tempers geopolitical strife, or controls a patient's insulin pump. Would you trust a TA to tell you those programs worked "good enough"? Would you trust your boss to understand exactly what needed to happen, and to tell you exactly how to do it? Probably not!
So we need to take seriously the question of correctness, from both directions:

* How do we discover what the right behavior of a program even is?
* Once we know the intended behavior, how do we gain confidence that the program actually exhibits it?
To do that, we need to think about what "correctness" even means, and how we might triangulate that idea in a useful way. Maybe we can't get to 100% confidence in correctness, but the perfect should never be the enemy of the good.
We'll talk about both of these aspects in time, but today let's focus on the second. We'll start from unit-testing (as practiced in intro and 0320) and deconstruct it. What does it do well? What does it do poorly?
Exercise: Make two lists to answer the above questions. Why do we test? What could go wrong, and how can example-based testing let us down?
Hopefully we all agree that example-based testing has its virtues and that we should keep doing it. But let's focus on the things that testing doesn't do well. You might have observed that (for most interesting programs, anyway) tests cannot be exhaustive, because there are infinitely many possible inputs. And since we're forced to test non-exhaustively, we have to hope we pick good tests: tests that exercise not only our own implementation, but also others (like the implementation that eventually replaces yours).
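One way to widen the net, which the first assignment (listed as PBT in the schedule below) heads toward, is property-based testing: instead of hand-picking examples, state a property that every correct output must satisfy and let a framework generate inputs and hunt for counterexamples. Here's a minimal sketch using the Hypothesis library; the sorting example is illustrative, not part of any assignment:

```python
# Property-based testing: state what *every* correct output looks like,
# then let the framework generate many inputs and search for a violation.
from collections import Counter
from hypothesis import given, strategies as st

@given(st.lists(st.integers()))
def test_sort(xs):
    out = sorted(xs)
    # The output must be ordered...
    assert all(a <= b for a, b in zip(out, out[1:]))
    # ...and must be a rearrangement of the input: nothing added or lost.
    assert Counter(out) == Counter(xs)

test_sort()  # Hypothesis runs this against many generated lists
```

Notice that these properties say nothing about *how* the sort works, so the same test applies equally well to whatever implementation replaces yours. Still, property-based testing is sampling, not proof.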
Worse, we can't test the things we don't think of, or don't know about; we're vulnerable to our limited knowledge, the availability heuristic, confirmation bias, and so on. In fact, we humans are generally ill-equipped for logical reasoning. Let's see if I can convince you.
Suppose we're thinking about the workings of a small company. We're given some facts about the company, and have to answer a question based on those facts. Here's an example. We know that:

* Alice directly supervises Bob;
* Bob directly supervises Charlie;
* Alice graduated from Brown; and
* Charlie did not graduate from Brown.
Question: Does someone who graduated from Brown directly supervise someone who graduated from another university?
Humans tend to be very bad at reasoning by hypotheticals. There's a temptation to think that this puzzle isn't solvable because we don't know where Bob graduated from. Even Tim thought this when he first encountered the puzzle: in grad school! For logic!
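In fact, the puzzle is solvable by considering both hypothetical worlds: if Bob graduated from Brown, then Bob (a Brown graduate) directly supervises Charlie (not one); if Bob didn't, then Alice (a Brown graduate) directly supervises Bob. Either way, the answer is "yes." A few lines of Python can grind through the hypotheticals for us; this sketch assumes the facts as listed above, with "Other U" standing in for any non-Brown school:

```python
# Enumerate both hypothetical worlds for Bob's (unknown) alma mater and
# check the question in each one.
def brown_supervises_non_brown(bob_school):
    school = {"Alice": "Brown", "Bob": bob_school, "Charlie": "Other U"}
    supervises = [("Alice", "Bob"), ("Bob", "Charlie")]
    return any(school[boss] == "Brown" and school[worker] != "Brown"
               for boss, worker in supervises)

# The answer is "yes" in every case, so it's "yes", full stop.
assert all(brown_supervises_non_brown(s) for s in ("Brown", "Other U"))
```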
Now imagine a puzzle with a thousand of these unknowns. A thousand boolean variables means $2^{1000}$ cases to reason through. Want to use a computer yet?
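To make the blowup concrete, here's a quick back-of-the-envelope check (plain Python, nothing course-specific):

```python
# Naive case analysis doubles with every unknown: 2**n cases in total.
for n in (1, 10, 100, 1000):
    print(f"{n} unknowns -> a {len(str(2 ** n))}-digit number of cases")
```

At a thousand unknowns the case count runs to over 300 digits; no amount of patience will enumerate that by hand, and no computer will enumerate it naively either.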
There is a prison in a magical world where an evil wizard holds a family of four gnomes. Every day, the wizard forces the gnomes to play a game for their freedom: he lines them up, single-file, in one or more rooms, facing the wall. The gnomes cannot move or communicate with each other; they can only look straight ahead. The wizard then pulls out four hats: two orange and two blue, and magics one onto each gnome's head.
The wizard then walks the line, asking each gnome: "What is your hat color?" They may try to answer, or remain silent for 10 seconds (and then the wizard moves on). If a gnome guesses correctly, they all immediately go free. But if one guesses incorrectly, they become the wizard's magical servants forever. So it's in their best interest to not answer unless they are absolutely convinced that they have guessed correctly.
Neither the wizard nor the gnomes can cheat. It's against magical law. The gnomes are, however, very intelligent. Smarter than the wizard for sure: they're perfect logical reasoners.
Here's an example configuration of the puzzle room:
*(Figure: an example arrangement of the four gnomes and their hats. Why are they smiling?)*
In this configuration, can the gnomes escape? If so, why?
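(Spoiler warning: the sketch below gives part of the game away, so try the puzzle yourself first!) One honest answer is to make a computer do the case analysis. The room layout in this sketch is an assumption on my part, since the real answer depends on the picture: gnomes 0, 1, and 2 stand single-file in one room (so gnome 0 sees the hats of 1 and 2, and gnome 1 sees the hat of 2), while gnome 3 stands alone in another room. A perfect reasoner answers only when every hat assignment consistent with what they know agrees on their own hat:

```python
from itertools import permutations

HATS = ("orange", "orange", "blue", "blue")
WORLDS = set(permutations(HATS))           # every possible hat assignment
SEES = {0: [1, 2], 1: [2], 2: [], 3: []}   # who can see whose hat (assumed layout)

def can_answer(world, me, learned=lambda w: True):
    """A gnome answers only if every world consistent with what they see
    (plus anything learned from other gnomes' silence) agrees on their hat."""
    consistent = [w for w in WORLDS
                  if all(w[g] == world[g] for g in SEES[me]) and learned(w)]
    return len({w[me] for w in consistent}) == 1

for world in sorted(WORLDS):
    if can_answer(world, 0):
        continue  # gnome 0 sees two same-colored hats, so theirs is the other color
    # Gnome 0's silence tells everyone that hats 1 and 2 must differ:
    assert can_answer(world, 1, learned=lambda w: w[1] != w[2])
print("In this layout, the gnomes escape in all", len(WORLDS), "possible worlds.")
```

The interesting move is the `learned` parameter: gnome 0's *silence* is itself information, and gnome 1 can fold it into their own reasoning. Mechanizing this kind of reasoning-by-hypotheticals is exactly what the tools in this course are for.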
There's a real cryptographic protocol called the Needham-Schroeder public-key protocol. You can read about it here. Unfortunately the protocol has a bug: it's vulnerable to attack if one of the principals can be fooled into starting an exchange with a badly-behaved or compromised agent.
Since this isn't a security course, we won't go into specifics. Instead, let's focus on the fact that it's quite easy to get things like protocols wrong, and sometimes challenging for us humans to completely explore all possible behaviors – especially since there might be behaviors we'd never even considered! It sure would be nice if we could get a computer to help out with that.
Last year, a pair of former 1710 students did an ISP on modeling crypto protocols, using the tools you'll learn in class. Here's an example picture, generated by their model, of the flaw in the Needham-Schroeder protocol:
You don't need to understand the specifics of the visualization; the point is that someone who has studied crypto protocols would. And this really does show the classic attack on Needham-Schroeder. You may not be a crypto-protocol person, but you probably are an expert in something you'd like to model, and you might very well get the chance to do so this semester.
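If you'd like a taste of the specifics anyway, here's a toy Python replay of the classic man-in-the-middle attack (due to Lowe) on the three-message core of the protocol. The "encryption" here is just a tuple tagged with its intended reader; this sketches the message flow, not real cryptography:

```python
def enc(reader, *payload):   # "encrypt for reader": only they can open it
    return ("for", reader, payload)

def dec(reader, msg):
    tag, intended, payload = msg
    assert tag == "for" and intended == reader, "wrong key"
    return payload

# 1. Alice innocently opens a session with the compromised agent I.
m1 = enc("I", "Na", "Alice")
# 2. I decrypts it (it was addressed to him!) and replays it to Bob,
#    posing as Alice.
m2 = enc("B", *dec("I", m1))
# 3. Bob replies "to Alice" with both nonces; I forwards it unchanged.
na, who = dec("B", m2)
m3 = enc("A", na, "Nb")
# 4. Alice sees her nonce Na come back, believes her run with I is honest,
#    and returns Nb, which I can read.
_, nb = dec("A", m3)
m4 = enc("I", nb)
# 5. I now knows Nb and completes Bob's handshake as "Alice".
m5 = enc("B", *dec("I", m4))
(nb_at_bob,) = dec("B", m5)
assert who == "Alice" and nb_at_bob == "Nb"
print("Bob finished a run he attributes to Alice; the intruder knows Na and Nb.")
```

Lowe's fix is small: include the responder's identity alongside the nonces in step 3, so Alice would notice the reply came from Bob rather than from I.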
The human species has been so successful, in part, because of our ability to use assistive devices – tools! Eyeglasses, bicycles, hammers, bridges: all devices that assist us in navigating the world in our fragile meat-bodies. One of our oldest inventions, writing, is an assistive device that increases our long-term memory space and makes that memory persistent. Computers are only one in a long line of such inventions.
So why not (try to):

* amplify our limited powers of logical reasoning;
* check a design against *all* of its possible behaviors, rather than just the examples we happen to think of; and
* catch the cases that our blind spots and biases would otherwise hide from us?
There's a large body of work in Computer Science that tries to do all those things and more. That's what Logic for Systems is about.
The semester will look roughly like the table below, although some of it is subject to change. Bold assignments denote multi-week projects. Italicized assignments denote pair homeworks.
| Week | Topic | Lab | Assignment (Out) |
|---|---|---|---|
| Jan23 | Properties | Python (Optional) | PBT |
| Jan30 | Structural Modeling in Forge | N Queens | Forge 1 |
| Feb06 | Stateful Modeling in Forge | Ring Election | Forge 2 |
| Feb13 | Stateful Modeling in Forge | Curiosity | |
| Feb20 | Relational Modeling | Curiosity col-lab | Curiosity |
| Feb27 | Relational Modeling | Memory Management | Forge 3 |
| Mar06 | Model Checking (LTL) | Stack | Model Checking 1 |
| Mar13 | Model Checking (LTL) | | Model Checking 2 |
| Mar20 | By Request or Need, Guests | | |
| Mar27 | Spring Break | | |
| Apr03 | Boolean SAT, Herbrand | | SAT 1, Proposals |
| Apr10 | SMT | SAT col-lab | SMT |
| Apr17 | Resolution | | SAT 2 |
| Apr24 | By Request or Need, Guests | SAT 2 col-lab | Term Project |
| May01 | Reading Period | | Term Project |
| May08 | Reading Period | Project col-lab | Term Project |
The dates of final project presentations are still TBD, but they will be scheduled sometime during the finals period. (Students who need to travel beforehand will be able to schedule remote presentations.)
Before leaving class, please do this exercise via Google Forms. It shouldn't take longer than 5 minutes. Feel free to discuss it with other students!