# 29 and 30: Satisfiability Modulo Theories (SMT)

###### tags: `Tag(sp22)`

## Logistics

* Remember we have upcoming guest lectures! You can see them all on this [calendar](https://calendar.google.com/calendar/embed?src=c_npj8brm6vkp1bjj2747tajk2i8%40group.calendar.google.com&ctz=America%2FNew_York).
* Tim's hours moved to Wednesday 11am this week.
* There's a new Forge build. The problem is that we couldn't get Racket's package system to accept it last night. You should be able to pull via `git` to get the new version, at least.

## What's "Modulo Theories"?

Boolean solvers are powerful, but not very expressive. If you want to use them to solve a problem involving (e.g.) arithmetic, you need to encode that idea with booleans. Forge does this by "bit-blasting": one boolean variable per bit in a fixed bitwidth, along with formulas that build boolean adders, multipliers, etc.

An SMT solver is a SAT solver that can handle various domain-specific concepts beyond boolean logic. Hence "modulo theories": it's satisfiability, but with the addition of (say) the theory of linear integer arithmetic. From a certain point of view, Forge is an "SMT" solver, because it includes concepts like relations and bit-vector integers. But this isn't usually how people understand the term in 2022.

SMT solvers can be either "eager" or "lazy". An eager solver translates the whole problem to boolean logic and then uses a boolean solver engine---this is Forge's approach. A lazy solver actually implements domain-specific algorithms and integrates those with a boolean core. Most modern SMT solvers are lazy, since they can benefit from clever domain algorithms.

Here are some common domains that SMT solvers tend to support:

* integer arithmetic (linear nearly always, non-linear sometimes)
* real arithmetic (yes, real numbers)
* bit vectors
* arrays
* datatypes

Some slightly less common domains include:

* relations
* lists
* strings

And of course there are many others. The solver we'll use this week supports many of these, but not all.

### A Key Difference

Most SMT solvers don't have "bounds" in the same way Forge does. You can declare a datatype that's bounded in size, but the addition of domains like mathematical integers or lists means that, unless you're working in a very restricted space, the set of possible objects is infinite. This can cause some confusion compared to what we're used to.

What does it mean to say "For all $x$ of type $A$, $P(x)$ is true"? In Forge, $A$ always has an upper bound, and so the quantifier can always be converted to a boolean constraint. But in SMT, suppose the type is _integer_. There are infinitely many integers, which means the solver can't convert the quantifier to a (finite) boolean constraint.

Try to avoid universal quantification in SMT, at least at first. If we had two weeks for SMT, we'd cover quantification in the second week.

Even without universal quantification, a problem might not necessarily be solvable. We'll talk more about why on Wednesday; for now, just be aware that the solver might return a third result in addition to sat and unsat: _unknown_ (the solver gave up or ran out of time).

## The Z3 Solver

### Setup

We'll be using the Python bindings for the Z3 solver, available [here](https://pypi.org/project/z3-solver/). You can also install via `pip`:

`pip3 install z3-solver`

To update to the latest version of the solver, you can run:

`pip install z3-solver --upgrade`

Another great solver is [CVC5](https://cvc5.github.io). Although we won't use it in class, it supports some things that Z3 doesn't (and vice versa). For instance: relations!
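If you'd like to confirm the bindings installed correctly before class, here's a minimal sketch (not part of any course stencil) you can run; the exact model printed may differ:

```python
# minimal smoke test for the z3-solver package
from z3 import Solver, Int, sat

s = Solver()
x = Int('x')
s.add(x > 0, x < 5)
print(s.check())   # expect: sat
print(s.model())   # some satisfying value for x, e.g. [x = 1]
```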
### Boolean

We've still got the boolean-logic capabilities of a normal SAT solver:

```python
def demoBool(self):
    # Create a new solver
    s = Solver()
    # declare some boolean *solver* variables
    p, q = Bools('p q')

    s.add(Or(p, q))
    if s.check() == sat:
        print(s.model())  # "model" ~= "instance" here :/
        # (Think: how would we get a different instance? See the sketch below.)

    # getting at pieces of a model for programmatic use
    print(s.model().evaluate(p))  # can pass a formula
```
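How *would* we get a different instance? One standard idiom (a sketch, not taken from the class demo file) is to block the model we just saw and ask again:

```python
# a sketch: enumerate instances by blocking each model we find
from z3 import Solver, Bools, Or, And, Not, sat

s = Solver()
p, q = Bools('p q')
s.add(Or(p, q))
while s.check() == sat:
    m = s.model()
    print(m)
    # rule out exactly this assignment to p and q, then re-check
    s.add(Not(And(p == m.evaluate(p, model_completion=True),
                  q == m.evaluate(q, model_completion=True))))
```

Each added constraint excludes one assignment, so repeated checks walk through the distinct instances until the solver returns unsat.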
### Uninterpreted Functions And Integer Inequalities

If a symbol (function, relation, constant, ...) is _interpreted_, then its meaning is encoded via constraints built into the solver. In Forge, we'd say that:

* `add` is an interpreted function, since Forge assigns it a meaning innately; but
* relations you add as sig fields are uninterpreted, since, absent constraints you add yourself, Forge treats their values as arbitrary.

With some exceptions, SMT solvers usually have functions, not relations. This is another reason for Froglet to be about functions---they're more useful as a foundation in other tools.

```python
def demoUninterpreted(self):
    s = Solver()
    # Ints, UNINTERPRETED functions (think of these as like relations in Alloy)
    a, b = Ints('a b')
    f = Function('f', IntSort(), IntSort())

    s.add(And(b > a, f(b) < f(a)))
    if s.check() == sat:
        print(s.model())
        print(s.model().evaluate(f(a)))
        print(s.model().evaluate(f(b)))
        print(s.model().evaluate(f(1000000)))
```

## Arithmetic

Let's try something that involves arithmetic, and also explore how the solver handles real numbers vs. integers.

**TODO**: we use a `ForAll` here?

```python
# Real numbers
def demoReals(self):
    s = Solver()
    x = Real('x')  # contrast with Int('x')
    s.add(x*x > 4)
    s.add(x*x < 9)
    result = s.check()
    if result == sat:
        print(s.model())
    else:
        print(result)

def demoFactoringInt(self):
    s = Solver()

    # (x - 2)(x + 2) = x^2 - 4
    # Suppose we know the RHS and want to find an *equivalent formula* LHS:
    #   (x + ROOT1)(x + ROOT2) = x^2 - 4
    xi, r1i, r2i = Ints('x root1 root2')  # int vars

    # Note: don't use xi ** 2 -- gives unsat?
    s.add(ForAll(xi, (xi + r1i) * (xi + r2i) == (xi * xi) - 4))
    result = s.check()
    if result == sat:
        print(s.model())
    else:
        print(result)

    s.reset()

    # (x - 123)(x + 321) = x^2 + 198x - 39483
    s.add(ForAll(xi, (xi + r1i) * (xi + r2i) == (xi * xi) + (198 * xi) - 39483))
    result = s.check()
    if result == sat:
        print(s.model())
    else:
        print(result)
    # Note how fast this is, even with numbers approaching 40k. Power of the theory solver.

def demoFactoringReals(self):
    s = Solver()
    x, r1, r2 = Reals('x root1 root2')  # real number vars
    # ^ Solve for r1, r2 because they are unbound in the outer constraints;
    #   x is quantified over and therefore not a var to solve for.

    # (x + ???)(x + ???) = x^2 + 198x - 39484
    s.add(ForAll(x, (x + r1) * (x + r2) == (x * x) + (198 * x) - 39484))
    result = s.check()
    if result == sat:
        print(s.model())
    else:
        print(result)

    s.reset()

    # Here's how to start using cores in Z3 if you want, but
    # see the docs -- it's a bit more annoying because you need to create
    # new boolean variables etc.
    # s.set(unsat_core=True)  # there are so many options, at many different levels
    # use s.assert_and_track; need to give a boolean
    # see: https://z3prover.github.io/api/html/classz3py_1_1_solver.html#ad1255f8f9ba8926bb04e1e2ab38c8c15

    # Now, for the demo!
    # Note e.g., x^2 - 2x + 5 has no real roots (b^2 - 4ac is negative)
    s.add(ForAll(x, (x + r1) * (x + r2) == (x * x) - (2 * x) + 5))
    result = s.check()
    if result == sat:
        print(s.model())
    else:
        print(result)

def coefficients(self):
    s = Solver()
    x, r1, r2, c1, c2 = Reals('x root1 root2 c1 c2')  # real number vars
    s.add(ForAll(x, ((c1*x) + r1) * ((c2*x) + r2) == (2 * x * x)))
    result = s.check()
    if result == sat:
        print(s.model())
    else:
        print(result)
```

### Another Demo: N-Queens

```python
def nQueens(self, numQ):
    s = Solver()
    # Model the board as a 2d list of booleans. Note the list is *Python*; the booleans are *solver* variables.
    cells = [[z3.Bool("cell_{i}{j}".format(i=i, j=j))
              for j in range(0, numQ)]
             for i in range(0, numQ)]
    # print(cells)

    # a queen on every row
    queenEveryRow = And([Or([cells[i][j] for j in range(0, numQ)]) for i in range(0, numQ)])
    # print(queenEveryRow)  # for demo only
    s.add(queenEveryRow)

    # for every i,j: a queen there implies no queen at various other places.
    # Recall: queens can move vertically, horizontally, and diagonally.
    # "Threaten" means that a queen could capture another in 1 move.
    queenThreats = And([Implies(cells[i][j],
                                # Prefix notation: And(x, y) means "x and y".
                                And([Not(cells[i][k]) for k in range(0, numQ) if k != j] +
                                    [Not(cells[k][j]) for k in range(0, numQ) if k != i] +
                                    # Break up the diagonals and don't try to be too smart about iteration
                                    [Not(cells[i+o][j+o]) for o in range(1, numQ) if (i+o < numQ and j+o < numQ)] +
                                    [Not(cells[i-o][j-o]) for o in range(1, numQ) if (i-o >= 0 and j-o >= 0)] +
                                    # flipped diagonals
                                    [Not(cells[i-o][j+o]) for o in range(1, numQ) if (i-o >= 0 and j+o < numQ)] +
                                    [Not(cells[i+o][j-o]) for o in range(1, numQ) if (i+o < numQ and j-o >= 0)]))
                        for j in range(0, numQ)
                        for i in range(0, numQ)])
    # print(queenThreats)  # for demo only
    s.add(queenThreats)

    if s.check() == sat:
        for i in range(0, numQ):
            print(' '.join(["Q" if s.model().evaluate(cells[i][j]) else "_"
                            for j in range(0, numQ)]))
    else:
        print("unsat")

if __name__ == "__main__":
    demo = SMTDemo()   # the class containing the demo methods above
    demo.coefficients()
```

## Wednesday

Note: class will cover:

* the boolean skeleton;
* congruence closure (intuition);
* the Goldbach argument; and
* undecidability.

I've had to pull from a bunch of different sources to construct this lecture given external time constraints, so expect the notes to be less than ideal.

## What's Going On In The Solver?

Modern SMT solvers tend to be _lazy_ (a technical term): they use a base boolean solver, and call out to domain-specific algorithms ("theory solvers") when needed. This is how Z3 manages to be so fast at algebraic reasoning.

But what shape of constraints do these theory solvers accept? And how does an SMT solver manage to provide that? Let's look at one potential theory: systems of linear inequalities.

### An Example Theory-Solver: Linear Inequalities

If I gave you a system of linear inequalities, like this:

```
x + y < 3
x < 2y
```

Could you find a solution? Probably (at least, if you Googled or had an old algebra textbook to hand). Or, if you're like me, you might [enter it into Wolfram Alpha](https://www.wolframalpha.com/input?i=plot++x+%2B+y+%3C+3+and+x+%3C+2y):

![](https://i.imgur.com/NAETuFr.png)
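A theory solver can handle this kind of purely conjunctive system directly. Here's the same system handed to Z3's real-arithmetic theory (a sketch; the particular model Z3 prints may differ):

```python
# a sketch: the linear system above, given directly to Z3
from z3 import Solver, Reals, sat

s = Solver()
x, y = Reals('x y')
s.add(x + y < 3, x < 2*y)
if s.check() == sat:
    print(s.model())   # one point in the shaded region from the plot
```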
But suppose we added a bunch of boolean operators into the mix. Now what? You can't solve a "system" if the input involves "or". SMT solvers use a technique to separate out the boolean portion of a problem from the theory-specific portion.

For example, if I wrote `x > 3 or y > 5`, the solver would convert this to a _boolean skeleton_, replacing all the theory terms with boolean variables: `T1 or T2`, where `T1` means `x > 3` and `T2` means `y > 5`. It can now find an assignment to that boolean skeleton with a normal SAT solver. And every such assignment is implicitly conjunctive: there are no "or"s left! Suppose the solver finds `T1=True, T2=True`. Then we have the system:

```
x > 3
y > 5
```

If, instead, the solver found `T1=False, T2=True`, we'd have the system:

```
x <= 3
y > 5
```

### Another Example Theory-Solver: Uninterpreted Functions With Equality

Here are some constraints:

```
! ( f(f(f(a))) != a  or  f(a) = a )
f(f(f(f(f(a))))) = a  or  f(f(a)) != f(f(f(f(a))))
```

Are these constraints satisfiable? Does there exist some value of `a` and `f` such that these constraints are satisfied?

As before, we'll convert these to a boolean skeleton:

```
!(T1 or T2)
T3 or T4
```

### An Unproductive Assignment

This skeleton is satisfied by the assignment `T1=False, T2=False, T3=True, T4=False`, which gives us the theory problem (writing `f3(a)` as shorthand for `f(f(f(a)))`, and so on):

```
f3(a) = a
f(a) != a
f5(a) = a
f2(a) = f4(a)
```

(I won't even need that fourth constraint to show this skeleton assignment won't work.)

This system of equalities can be solved via an algorithm called _congruence closure_. It goes something like this. First, collect all the terms involved in the equalities and make a node for each:

![](https://i.imgur.com/BKRgQH7.png)

Now draw undirected edges between the terms that the positive equality constraints force to be equivalent. Since we're being told that `f3(a) = a`, we'd draw an edge between those nodes. And similarly between `f5(a)` and `a`:

![](https://i.imgur.com/FnX80hb.png)

But equality is transitive! So we have learned that `f5(a)` and `f3(a)` are equivalent. (This is the "closure" part of the algorithm's name.) From there, what else can we infer? Well, if `f3(a)` is the same as `a`, we can substitute `a` for `f(f(f(a)))` inside `f(f(f(f(f(a)))))`, giving us that `f(f(a))` (which we're calling `f2(a)` for brevity) is equivalent to `a` as well, since `f5(a) = a`. And since equality is transitive, `f2(a)` equals all the other things that equal `a`.

![](https://i.imgur.com/NweNJdR.png)

And if `f2(a) = a`, we can substitute `a` for `f(f(a))` within `f(f(f(a)))`, to get `f(a) = a`. But this contradicts the negative equality constraint `f(a) != a`. So we've found a contradiction.

![](https://i.imgur.com/6s7Rv6N.png)

### A Productive Assignment

Another way of solving the boolean skeleton is with the assignment `T1=False, T2=False, T3=False, T4=True`, which gives us the theory problem:

```
f3(a) = a
f(a) != a
f5(a) != a
f2(a) != f4(a)
```

We'd proceed similarly. But this time we don't have many unavoidable equalities between terms: `f3(a)` is linked with `a`, and we could say that `f6(a) = a` via substitution---if we cared about `f6(a)`. None of the equalities we're forced to infer violate the negative constraints, so this assignment works: the original constraints are satisfiable.
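Of course, we can also hand the original constraints straight to Z3 and let it do this reasoning for us. Here's a sketch using an uninterpreted sort and an uninterpreted function (the names `A`, `a`, and `f` are just illustrative):

```python
# a sketch: the uninterpreted-function constraints above, posed to Z3 directly
from z3 import DeclareSort, Const, Function, Solver, Not, Or

A = DeclareSort('A')          # an uninterpreted sort
a = Const('a', A)
f = Function('f', A, A)       # an uninterpreted function f : A -> A

s = Solver()
s.add(Not(Or(f(f(f(a))) != a, f(a) == a)))                  # !(T1 or T2)
s.add(Or(f(f(f(f(f(a))))) == a, f(f(a)) != f(f(f(f(a))))))  # T3 or T4
print(s.check())   # should be sat, matching the productive assignment above
```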
## Return to Decidability

There's a famous unsolved problem in number theory called [Goldbach's conjecture](https://en.wikipedia.org/wiki/Goldbach%27s_conjecture). It states:

> Every even integer greater than 2 can be written as the sum of two primes.

This is simple to state. Yet, we don't know whether or not it holds for _all_ even integers greater than 2. We've looked for small (and not so small) counterexamples, and haven't found one yet. But that illustrates a problem.

To know whether Goldbach's conjecture is _false_, we just need to find an even integer greater than 2 that cannot be written as the sum of two primes. Here's an algorithm (in concrete Python, with a naive primality test) for disproving the conjecture:

```python
def is_prime(n):
    return n > 1 and all(n % d != 0 for d in range(2, n))

def goldbach_counterexample():
    i = 4
    while True:
        # if no prime p makes both p and i - p prime, i is a counterexample
        if not any(is_prime(p) and is_prime(i - p) for p in range(2, i)):
            return i
        i += 2
```

If Goldbach's conjecture is wrong, this computation will eventually terminate and give us the counterexample. But what about the other direction? What if the conjecture is actually true? Then this computation never terminates, and never gives us an answer. We never learn that the conjecture is true, because we're never done searching for counterexamples.

Now, just because this specific algorithm isn't great doesn't mean that a better one might not exist. Maybe it's possible to be very smart, and search in a way that will terminate with _either_ true or false.

Except that it's not possible. CSCI 1010 talks about this a lot more, but I want to give you a bit of a taste of the ideas today. Note that we can express Goldbach's conjecture in Z3...
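Here's a sketch of what that might look like. This uses quantifiers and non-linear integer arithmetic, so don't expect an answer; the helper `prime` and all the variable names are just illustrative:

```python
# a sketch: writing Goldbach's conjecture down in Z3 (not expecting Z3 to decide it!)
from z3 import Ints, Solver, ForAll, Exists, Implies, And, Not

n, p, q, d = Ints('n p q d')

def prime(x):
    # x is prime: x > 1 and no d with 1 < d < x divides it
    return And(x > 1, ForAll(d, Implies(And(d > 1, d < x), x % d != 0)))

goldbach = ForAll(n, Implies(And(n > 2, n % 2 == 0),
                             Exists([p, q], And(prime(p), prime(q), p + q == n))))

s = Solver()
s.set("timeout", 5000)     # milliseconds; don't wait forever
s.add(Not(goldbach))       # ask the solver for a counterexample
print(s.check())           # almost certainly: unknown
```

The point is only that the conjecture is *expressible*; whether any algorithm could always answer such questions is exactly what's at stake next.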
## Undecidability

I want to tell you a story---with only _some_ embellishment.

First, some context. How do we count things? Does $\{1,2,3\}$ have the same number of elements as $\{A, B, C\}$? What about $\mathbb{N}$ vs. $\mathbb{N} \cup \{BrownU\}$? If we're comparing infinite sets, then it seems reasonable to say that they have the same size if we can make a bijection between them: a one-to-one correspondence. But then, counter-intuitively, $\mathbb{N}$ and $\mathbb{N} \cup \{BrownU\}$ are the same size. Why? Because of Hilbert's Hotel. Here's the idea: suppose you work at the front desk of a hotel with a room for every natural number. And, that night, every room is occupied. A new guest arrives. Can you find room for them?

<details>
<summary>Think, then click!</summary>

Yes! Here's how. For every room $i$, tell that guest to move into room $i+1$. You'll never run out of rooms, and room 0 will be free for the new guest.

![](https://i.imgur.com/vLlsryz.png)

</details>

So, it's the late 1800s. Hilbert's Hotel (and related ideas) have excited the mathematical world. Indeed, can we use this trick to show that _every_ infinite set is the same size? Are all infinities one?

There was a non-famous but moderately successful mathematician named Georg Cantor, in his 40s at the time. Cantor suffered from depression, and indeed lived a tragic life (and died a tragic death). Cantor proved that the power set of $\mathbb{N}$, that is, the set of subsets of $\mathbb{N}$, must be strictly larger than $\mathbb{N}$. This was groundbreaking work, and it gives the lie to the conventional wisdom (thanks, Hardy) about young mathematicians. There was pandemonium. There was massive controversy. He was, no kidding, called a corrupter of youth. Later, mathematicians said that his ideas were 100 years too early, and Hilbert himself said, later, that "No one shall drive us from the paradise Cantor has created for us."

How did Cantor prove this? By contradiction. Assume you're given a bijection between a set $N$ and its power set. Then an infinite table must exist, with subsets of $N$ as rows and elements of $N$ as columns. The cells contain booleans: true if the subset contains the element, and false if it doesn't.

**PLACEHOLDER FOR FALL: sketch**

Cantor showed that there must always be a row (i.e., a subset of $N$) that isn't represented in the table. That is, a bijection cannot exist. Even _with_ the very permissive definition of "same size" we use for infinite sets, there are _still_ more subsets of (say) the natural numbers than there are natural numbers.

Why does this matter to *us*? Let me ask you two questions:

**QUESTION 1**: How many syntactically valid Java program source files are there?

**QUESTION 2**: How many mathematical functions from `int` inputs to `bool` outputs are there?

What is our conclusion? Try as you might, it is impossible to express all functions (of infinite type) in any programming language where program texts are finite. So we know that programs in any language must be unable to express _some_ things (indeed, most things). But is there anything NO language can express? Maybe all the things that are inexpressible are things nobody actually needs or cares about.

### Another Story

It's the early 1900s. Hilbert and others ask: *IS MATHEMATICS MECHANIZABLE?*

In the 1930s, Church, Turing, and Gödel answer: "No. At least not completely."

Why? A few reasons. Here's a challenge. Write for me a program `h(f, v)` that accepts two arguments:

* another program; and
* an input to that program.

It must:

* always terminate;
* return true IFF `f(v)` terminates in finite time; and
* return false IFF `f(v)` does not terminate in finite time.

Suppose `h` exists, and can be embodied in our language. Then consider this program:

```
def g(x):
    if h(g, x):       # AM I GOING TO HALT? (remember h always terminates)
        while(1);     # NUH UH I'M NOT!
    else:
        return;       # HAHAHAHAHA YES I AM
```

Argh! Assuming that our halting function `h` is a program itself: it CANNOT EXIST! This is called the "halting problem".

Exercise: what are the consequences for us? It's OK to be philosophical, or uncertain.

"Undecidability"

![](https://i.imgur.com/1VISrAR.png)

### What Does This Have To Do With SMT?

Gödel also proved that number theory is undecidable: if you've got the natural numbers, multiplication, and addition, it is impossible to write an algorithm that answers _arbitrary_ questions about number theory in an _always correct_ way, in _finite_ time. There are also tricks you'll learn in 1010 that let you say "Well, if I could solve arbitrary questions about number theory, then I could turn the halting problem into a question about number theory!"

There's so much more I'd like to talk about, but this lecture is already pretty disorganized, so I'm not going to plan on saying more today.

## Looking Ahead

Friday: CEGIS (counterexample-guided inductive synthesis).

* It deals with higher-order universals and other issues of optimization.
* It's powerful, and used with both SAT and SMT for, e.g., program or circuit synthesis.