When I wrote the model for Case Study 2, it was really an adaptation of a larger, more complex model that I wrote together with some former 1710 students. The version we adapted used a lot of Racket scripting to produce the Needham-Schroeder part of the model, and so I tried to flatten that down into a concrete Forge file instead.
Some of you noticed an error introduced along the way. If you commented out the `attack` predicate in the final run command, the result was still unsatisfiable. This is a major signal that the model is overconstrained!
When I initially wrote the N-S predicate, I quantified the timeslots in the wrong order.
The difference is one of ordering: in the original, `t1` has to come before `t0` (rendering the protocol predicate unsatisfiable when connected with other predicates, like `wellformed`).
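In rough, sketched form (binder names are assumed, not the actual model text), the two quantifications differed like this:

```
-- Buggy shape (sketch): t0 is drawn from t1's successors,
-- so t1 is forced to come *before* t0
some t1: Timeslot | some t0: t1.^next | { ... }

-- Intended shape: t1 is drawn from t0's successors,
-- so t0 comes first, as the protocol requires
some t0: Timeslot | some t1: t0.^next | { ... }
```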
). I fixed the bug in the original protocol predicate, wrote tests to confirm that, and then forgot to fix the bug in the predicate for the modified protocol. And since I didn't add tests for the modified protocol in the same way as for the original, this error slipped in undetected. E.g., I wrote:
but not:
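Concretely, the tests looked something like this (a sketch; the predicate names here are my guesses):

```
-- Present: the original protocol, conjoined with wellformedness, is satisfiable
test expect {
    originalProtocolSat: { wellformed and ns_original } is sat
}

-- Missing: the analogous check for the modified protocol
-- test expect {
--     modifiedProtocolSat: { wellformed and ns_modified } is sat
-- }
```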
After this fix, the model allowed runs where the secret and the identity in the first message were sent via separate encrypted messages. This allows an attacker to subtly replace the Alice identity with their own:
Why is this allowed? Because I'd written:
and the `.plaintext` join works the same across multiple ciphertexts in `t1.data`. The fix was to add `one t1.data` (and analogous constraints to every other multi-datum timestep in the protocol).
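A sketch of the issue (field and value names assumed): the constraint only looks at the join of *all* data sent in the timeslot, so it can't tell whether the identity and the secret travel in the same ciphertext.

```
-- Buggy shape: satisfied even if the identity and the secret arrive in
-- two *separate* ciphertexts, because t1.data.plaintext unions the
-- plaintexts of every ciphertext sent at t1
aliceIdentity + theSecret in t1.data.plaintext

-- Fix: require a single datum in the timeslot, so both values
-- must travel inside the same ciphertext
one t1.data
aliceIdentity + theSecret in t1.data.plaintext
```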
But we still got counterexamples in the final run!
The counterexample shows Alice opening a session with the attacker, and the attacker opening a session with Bob using the same original secret.
Nobody is fooled about identity: Alice believes she's talking with Eve, and Bob believes he's talking with Eve, even though the secrets are the same. So our `attack` predicate was underconstrained: we also needed to introduce a notion of the participants being confused about who is who.
Kudos to the group that found this. I'm surprised more didn't, to be honest. Had this been found earlier, I'd very likely have made detecting it part of the rubric. (But I haven't added it.)
You should never assume a model is correct just because someone else has written some tests for it. Assumptions like that could cause you a great deal of trouble. We explicitly prompted you to consider whether you trusted the model in the previous case study; this one is no different.
Consider modeling Kruskal's or the Prim–Jarník approach to finding a minimum spanning tree of a weighted graph.
I wrote a rough model of Prim's algorithm over winter break, intending to turn it into a lecture. It never appeared in that setting, so I didn't polish it. But here it is:
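The model itself isn't included here; the following is a rough reconstruction of its shape in Temporal Forge. All sig, field, and predicate names are my guesses, not the actual file:

```
#lang forge/temporal

sig Node {
    edges: set Node -> Int   -- weighted edges; kept symmetric below
}
one sig Prim {
    var pnodes: set Node,        -- nodes reached so far
    var ptree: set Node -> Node  -- tree edges chosen so far
}

pred wellformedGraph {
    -- undirected: every edge appears in both directions with the same weight
    all m, n: Node, w: Int | m->n->w in edges implies n->m->w in edges
}

pred init {
    one Prim.pnodes    -- start from a single, arbitrary node
    no Prim.ptree
}

pred addCheapestEdge {
    some m: Prim.pnodes, n: Node - Prim.pnodes, w: Int | {
        m->n->w in edges
        -- no strictly cheaper edge crosses the frontier
        no m2: Prim.pnodes, n2: Node - Prim.pnodes, w2: Int |
            m2->n2->w2 in edges and w2 < w
        Prim.pnodes' = Prim.pnodes + n
        Prim.ptree' = Prim.ptree + m->n + n->m
    }
}
```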
If we want to verify the correctness of this algorithm, using this model, we face a problem. What does it mean to find a minimum spanning tree for an undirected, weighted graph G? It must be a set of edges T such that:

- T forms a tree;
- T spans G, i.e., every node of the graph is an endpoint of some edge in T; and
- no other spanning tree of G has a strictly lower total edge weight.
Checking the final criterion requires higher-order universal quantification. We'd need to write something like this:
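In sketch form (with assumed helper predicates `spanningTree` and `totalWeight`), the check would be:

```
-- "There exists a spanning tree t such that no spanning tree is cheaper."
some t: Node -> Node | {
    spanningTree[t]
    all t2: Node -> Node |
        spanningTree[t2] implies totalWeight[t] <= totalWeight[t2]
}
```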
Forge can eliminate the outer `some` quantifier via Skolemization: turn it into a new relation to solve for. But it can't do that for the inner `all` quantifier. How many possible edge sets are there? If there are 5 possible `Node` objects, then there are 25 possible edges between those objects, and thus 2^25 (over 33 million) possible edge sets. While, technically, Forge probably could produce a big `and` formula with 33 million children, this approach doesn't scale. So Forge won't even try.
We need a different, more structured way of attacking this problem.
Suppose that, instead of the above shape, we had something like this, with respect to a fixed edge set `t`:
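Roughly (with the same assumed helpers), where `t` now names a concrete, fixed edge set:

```
-- t is fixed; we only search (existentially) for a better tree
some t2: Node -> Node | {
    spanningTree[t2]
    totalWeight[t2] < totalWeight[t]
}
```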
That is, suppose we had a prospective candidate solution `t`, and we want to search for a better solution. This is fine: Forge can handle higher-order `some`. So we can use Forge to check a candidate solution.
This suggests an iterative approach. Find a candidate spanning tree (any spanning tree). Then try to find something better. And again. Until nothing better can be found.
Since Forge is a Racket library, you can use this technique to check (e.g.) Kruskal's algorithm with a loop in Racket. It's a bit less straightforward, since you need to break out of the Forge language itself, and because this use of Forge isn't yet documented well, you'd probably need to ask questions if you needed this technique for your project.
Note, though, that since Z3py is a Python library, you can use this technique in Z3 as well.
This technique is pretty specialized, though. It relies on:

- being able to phrase the verification goal as a search for a "better" candidate solution; and
- a notion of "better" under which the iteration must terminate (here, strictly decreasing total weight, which cannot decrease forever).

Not all higher-order universal constraints exhibit these nice properties, and others which aren't higher-order can still benefit from this idea.
Here's a classical example from formal methods: program synthesis. Suppose we were trying to synthesize a program that takes a machine integer as input and outputs the number of `1` bits in that number. We might express the goal roughly as:
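In the notation used in the rest of this section (where `p` is the synthesized program and `countBitsInInteger` is the specification), the goal is roughly:

```
all i: Int | p[i] = countBitsInInteger[i]
```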
We might proceed as follows:

- Solve for a candidate program `p` that satisfies the constraints gathered so far (initially, none).
- Check whether `some i: Int | p[i] != countBitsInInteger[i]` is satisfiable.
  - If it's unsatisfiable, the candidate works on all inputs, and we're done.
  - If it's satisfiable, we get a concrete input `i` that the current program doesn't work for. Instantiate the formula `p[i] = countBitsInInteger[i]` with the concrete value, add it to our constraints, and repeat.

More sophisticated versions of this idea will try to infer root causes for failure (rather than just learning, essentially, "…yes, but make it work for `i`, too.").
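To make the loop concrete, here's a toy, self-contained sketch in plain Python. Both solver queries are replaced by brute-force searches over a tiny space of candidate "programs" (subsets of bit-shifts), so the names and the program representation here are mine, not a real synthesizer's:

```python
# Toy CEGIS sketch: synthesize a popcount function for 4-bit integers.
# A candidate "program" is a subset S of shift amounts {0,1,2,3}, and
# p_S(i) = sum of ((i >> s) & 1) for s in S. The correct program is
# S = {0,1,2,3}. The two next(...) searches stand in for solver calls.

from itertools import chain, combinations

def count_bits(i):
    return bin(i).count("1")          # the spec: number of 1 bits in i

def run(program, i):                  # interpret a candidate program
    return sum((i >> s) & 1 for s in program)

def cegis():
    examples = []                     # accumulated (input, output) constraints
    shifts = [0, 1, 2, 3]
    space = [set(s) for s in
             chain.from_iterable(combinations(shifts, r) for r in range(5))]
    while True:
        # "Synthesis" query: a candidate consistent with all examples so far.
        candidate = next(p for p in space
                         if all(run(p, i) == o for i, o in examples))
        # "Verification" query: some input where candidate disagrees with spec?
        cex = next((i for i in range(16)
                    if run(candidate, i) != count_bits(i)), None)
        if cex is None:
            return candidate          # no counterexample: synthesis succeeded
        examples.append((cex, count_bits(cex)))  # learn from it and repeat

print(sorted(cegis()))                # -> [0, 1, 2, 3]
```

Each iteration rules out the current wrong candidate by recording the input it fails on, exactly the "add it to our constraints, and repeat" step above.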
This broad technique is called CounterExample Guided Inductive Synthesis (or CEGIS). It and related ideas are used heavily in synthesis tools. Similar ideas are also used inside SMT solvers to help eliminate universal quantifiers.