Another title of this note could have been
(thanks to Samuel Balco for implementing the Isabelle and Idris code)
In this note we will again try to understand more deeply something we already know how to do. Namely how to reason with equations. And again, this understanding will allow us to put on a machine something that used to be the reserve of humans: automated reasoning.
But before we can go there we need to understand mathematical induction better.
Induction can come in many different forms:
We can show, to give just one example, that the equation
holds for all positive integers as follows. If
Exercise: Go through the equational reasoning above. Can you justify/explain each step?
Remark: Why is it correct to apply the equation we want to prove in the proof of the equation itself? Isn't circular reasoning unsound? Depends … some forms of circularity are justifiable, others are not. In the example above, we prove the equation for f
by f(n)= if n=0 then 1 else n*f(n-1)
I made a circular definition. But is this a definition? Does it actually define a function ? Or, in other words, does f
terminate on all inputs? Yes, f
does terminate on all inputs … because each recursive call f(n-1)
calls f
with a smaller argument n-1
.
We defined a class of arithmetic expressions in BNF as
num ::= 1 | num +1
exp ::= num | exp + exp | exp * exp
This is an inductive definition.
We defined the transitive closure of a relation
This is an inductive definition.
We added to the exp
above some equations. For example,
X + ( Y + Z ) = ( X + Y ) + Z
and then claimed that for all numbers n
and m
, the equation
n + m = m + n
follows already, even without having an equation for commutativity.
What do we mean when we say that an equation follows from some others?
Given a set of equations
An equation is in this set
This is an inductive definition.
Numbers, context-free grammars, transitive closure, equational reasoning
Can we explain how they all are instances of the same general phenomenon?
In all four examples, we define a set of elements: exp
,
In all cases, the set is infinite, so writing it down with a finite number of rules is doing something clever.
Example: In case of numbers we can write the two rules as
Of course, the trick is that the rule on the right has a free variable
What is inductive about this definition?
This is the crucial point:
When we say that the two rules above "define
In other words,
Exercise: What are the rules for the example of arithmetic expressions and for transitive closure? Write them in a form that resembles as much as possible the two rules for the natural numbers.
For a solution see the footnote. [2]
The case of equational reasoning is important for too many reasons to start making a list. We will discuss it in some detail in the next section.
We define inductively the set of all equations
that we want to study for expression. We write num
s and exp
s.
I write now
I omit the rule for multiplication because it is not needed in the following.
As an axiom on expressions we want to have for now only
and we want to show that
follows already from associativity. Of course, this is only an example. What we are really interested in is to understand what it means for one equation to follow from other equations.
As indicated above, the idea is to inductively define the set of all equations that follow. For this, we need to write out the rules of equational reasoning. They are as follows:
and
To these rules of equational reasoning we want to add, as we said, in our example, the axiom
To warm up, and to understand how equational reasoning is inductive, let us prove something even simpler then
The most important idea to understand in this note is the following.
To say that the equation
If you have any doubts about this statement, stop and think …
How do we show that
How would we do this in high-school algebra style?
If
If
Exercise: Can you justify all the steps in the chain of reasoning above?
Exercise: Write the chain of reasoning out in way that one can see at each step which rule among
This last exercise, if done properly, means that we understand in all detail how equational reasoning works. This should mean that we can implement it on a machine.
The next section will look at two programming languages, Isabelle and Idris, in which these proofs can be implemented.
The topic of the current lecture is not logic but the more general phenomenon of induction. Nevertheless, while analysing familiar reasoning with equations from the point of view of induction, we discovered almost all the rules of equational logic. So we may as well give them here.
Equational logic is the part of logic that is only concerned with proving new equations from old. The rules of equational logic are the one we have seen above (we use now
and, for all
and a rule for substitution of terms into variables that we won't explain now but encounter later again
Thus we have a set of analogies
mathematics | computer science | logic |
---|---|---|
numbers | programs | theorems |
with the sets of numbers/programs/theorems being defined inductively. [3] Of course, there is much more defined inductively, for example we could also define a set of proofs inductively.
So can we compute with programs and theorems as we can compute with numbers?
You could say that
The context-free grammar
num ::= 1 | num +1
exp ::= num | exp + exp | exp * exp
can be written as a set of rules as follows.
The mathematical definition of transitive closure
can be written as a set of rules as
As often in research, what starts out as a curious analogy can be turned into an important research programme. So let us take, for a moment, the analogy program vs theorem seriously. If programs are like theorems, then parsing is like proving. This is the basic idea of two papers by Lambek (1958, 1961) that spawned whole research areas in both linguistics and in logic. ↩︎