# Term Rewriting Systems (TRSs)

## Introduction

In this lecture we can put together almost everything we have seen so far. Let us look again at one of our examples.

- In one of the first lectures, we discussed the [high-school algebra problem](https://github.com/alexhkurz/programming-languages/blob/master/lecture-1.3.md) that asked whether the equations we know from high-school algebra are enough to do all possible computations allowed in the natural numbers.
- In the meanwhile, we have done enough work to be able to unpack the question above. By now we can be precise about all of the following.
  - "equations"
  - "are enough"
  - "all possible"
  - "computations"
  - "allowed in the natural numbers"

**Exercise:** Explain precisely the meaning of "equations", "are enough", "all possible", "computations", "allowed in the natural numbers".

Part of the answer to the exercise above was computation as rewriting to normal form. So far we took as a model of computation an abstract reduction system (ARS). While each computer, or virtual machine, or program can be understood as an ARS, there are many ARSs that are not models of computation. Therefore, in computer science various refinements of this notion are studied.

In this lecture, we refine ARSs as follows. The elements of an ARS $(A,\to)$ are now terms, that is, $A=Term_\Sigma(X)$ for some signature $\Sigma$ and some set $X$ of variables. A rewrite rule will be an equation in $Term_\Sigma(X)$. And the reduction relation $\to$ is now generated by a set of rewrite rules/equations $\sf E$. To emphasise that $\to$ is generated by $\mathsf E$ we write $\to_\mathsf E$.

To summarise the lecture, we will explain the move from ARSs
$$(A,\to)$$
to TRSs
$$(Term_\Sigma(X),\to_{\mathsf E})$$
As it will turn out, a TRS is completely specified by its equations $\mathsf E$.
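To make $Term_\Sigma(X)$ concrete, here is a minimal sketch in Python (my own illustration, not part of the lecture): terms are nested tuples, a signature is a map from operation symbols to arities, and variables are strings not occurring in the signature.

```python
# Terms over a signature, as nested tuples (illustrative representation).
SIGMA = {"0": 0, "s": 1, "max": 2}   # Peano naturals with a binary max
X = {"x", "y"}                       # a set of variables

def is_term(t, sigma=SIGMA, variables=X):
    """Check that t is a well-formed element of Term_Sigma(X)."""
    if isinstance(t, str):
        # a string is either a variable or a 0-ary operation (constant)
        return t in variables or sigma.get(t) == 0
    op, *args = t
    return sigma.get(op) == len(args) and all(
        is_term(a, sigma, variables) for a in args
    )

# s(max(x, 0)) is a term; max(x) is not (wrong arity).
assert is_term(("s", ("max", "x", "0")))
assert not is_term(("max", "x"))
```

Nothing hinges on this particular encoding; any representation that keeps track of operation symbols, arities, and variables would do.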
In a previous lecture, we presented the [mathematical background on equations](), but I hope that most of the lecture will be readable with an intuitive understanding of equations.

## Aims

The aim of the lecture is to answer precisely, among others, the following questions.

- How are the ARSs from the [exercises]() examples of TRSs?
- In what sense is computation in a TRS a particular case of reasoning in equational logic?
- In what sense are TRSs models of computation?

After the lecture you should test yourself by answering these questions.

## Examples of TRSs

To see what is at stake let us look back at the exercises.

### Finitely many rewrite rules vs infinitely many reductions

In the exercises on [string rewriting](https://hackmd.io/s/HJQNfRbtX#Exercise-String-rewriting) we defined an ARS $(A,\to)$ via

    ab -> ba
    ba -> ab
    aa ->
    b  ->

Notice that by writing these 4 rewrite rules, we define infinitely many possible reductions $w\to v$ for infinitely many words $w,v$. For example, the first rule not only defines a reduction $ab\to ba$ but also reductions such as $aabb\to abab$, etc. In fact, for all words $w,w'$, we have a corresponding reduction $wabw'\to wbaw'$.

**Summary:** The reason to go from an ARS $(A,\to)$ to a TRS is that we can make precise how finitely many rewrite rules generate infinitely many reductions. From a computational point of view, this is crucial. To implement an ARS, we must be able to decide whether there is a possibility to reduce a term or not. If there are only finitely many rewrite rules, then this can be done mechanically. We will have to say more about this in the Theorem below.

### Signature, Variables, Equations

Recall that an equation is a pair $(t,t')\in Term_\Sigma(X)\times Term_\Sigma(X)$. In this lecture, the set of equations of interest will be denoted by $\sf E$ and we write, as before,
$$ t\,\mathsf E\,t'$$
to say that $(t,t')\in \mathsf E$. The font has been chosen in order to make $t\,\mathsf E\,t'$ resemble $t\equiv t'$.
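For instance, the four string-rewriting rules above form such a finite set of pairs. Here is a sketch (my own illustration; names are not from the lecture) of how these finitely many pairs generate infinitely many one-step reductions, by matching a left-hand side at any position of a word:

```python
# The four string-rewriting rules as a finite set E of pairs (l, r);
# "" stands for the empty word.
E = [("ab", "ba"), ("ba", "ab"), ("aa", ""), ("b", "")]

def one_step(w):
    """All words reachable from w in one reduction step w -> v."""
    results = []
    for l, r in E:                        # finitely many rules ...
        for i in range(len(w) - len(l) + 1):
            if w[i:i + len(l)] == l:      # ... matched at any position
                results.append(w[:i] + r + w[i + len(l):])
    return results

# The rule ab -> ba applies to wabw' for every choice of w, w':
assert "abab" in one_step("aabb")         # aabb -> abab
assert one_step("") == []                 # the empty word is a normal form
```

Note that `one_step(w)` is always a finite set, which is why "is `w` a normal form?" is decidable here even though $\to$ itself is infinite.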
Note that, because an equation is just a pair of terms, every equation can also be understood as a rewrite rule. And vice versa, every rewrite rule should be an equation. Let us look at how this plays out in an example.

**Activity 0:** In the example

    ab -> ba
    ba -> ab
    aa ->
    b  ->

what is the signature $\Sigma$, what is the set of variables $X$, and what is the set of equations? (Hint: Imagine you want to program a text editor; think about what basic operations are needed to create a text file ... one certainly needs to be able to create an empty text and then to add letters as one types ... this should lead to the answer about the signature ... once you have the signature and the variables you know what the terms are, hence also pairs of terms, hence also equations.) [^strings]

### Termination

An important question in the discussion of ARSs was termination. Let us see what we have achieved so far by refining ARSs to TRSs.

In most interesting examples of an ARS $(A,\to)$, the reduction relation $\to$ has infinitely many elements. In general, if there are infinitely many reductions, we cannot be sure to be able to determine whether an element of $A$ is a normal form, as there could be infinitely many reductions to check. On the other hand, we have seen in Activity 0 an infinite set $\to$ which is generated by a finite set of equations $\sf E$. In this case, checking whether $a\in A$ is a normal form is decidable: $a$ is not a normal form if and only if the left-hand side of one of the finitely many equations matches $a$. We will come back to this observation in the Theorem below.

### Computing in a TRS

Maybe you have noticed that by going to a TRS, we have turned the meta-variables into object-variables.
In previous examples, for instance in the [equations](https://hackmd.io/hILQksyiTUW4mXxxOSF7eQ) of the lectures on syntax and semantics, we wrote variables $X,Y,Z$ in capital letters instead of the more familiar $x,y,z$ in order to emphasise that these variables are not part of the object language but rather are placeholders for terms of the object language. On the other hand, in the term algebra $Term_\Sigma(\{x,y,z\})$, the variables $x,y,z$ are themselves terms, so part of the object language. (In software engineering, the term [reification](https://en.wikipedia.org/wiki/Reification_(computer_science)) is sometimes used for "turning meta into object". So we could say here that we reified variables.)

This opens up the possibility that we can now perform the operation of pattern matching inside the calculus of equational logic. In fact, as we will see, the rules of substitution and congruence are exactly what is needed for pattern matching.

Let us do a simple computation. (I drop some brackets to make the notation easier to read.)

    max(ss0,sss0) -> s(max(s0,ss0)) -> ss(max(0,s0)) -> sss0

**Activity 1:** Justify each of the steps. [^activity1]

**Summary:** As an outcome of the activity, we now understand that when we formulate our rewrite-rules/equations *with variables* and use the rules of equational logic called *substitution* and *congruence*, then we can generate *infinitely* many rewrite rules from a *finite* set of equations containing variables using *pattern matching*. As a slogan, the rules of substitution and congruence compute pattern matchings.

What we still need to explain better is what we mean by "generating new rewrite-rules from equations using substitution and congruence".
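As a preview, here is a sketch in Python (the function names and the nested-tuple representation of terms are my own illustration) of how matching computes a substitution that turns the rule for `max` used above into its first step:

```python
def match(pattern, term, variables, subst=None):
    """Return a substitution sigma with pattern instantiated by sigma
    equal to term, or None if no such substitution exists."""
    subst = {} if subst is None else subst
    if isinstance(pattern, str) and pattern in variables:
        if subst.setdefault(pattern, term) != term:
            return None          # variable already bound to a different term
        return subst
    if isinstance(pattern, str) or isinstance(term, str):
        return subst if pattern == term else None
    if pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        if match(p, t, variables, subst) is None:
            return None
    return subst

def apply_subst(term, subst):
    """Instantiate a term by replacing each variable with its image."""
    if isinstance(term, str):
        return subst.get(term, term)
    return (term[0],) + tuple(apply_subst(a, subst) for a in term[1:])

# The rule max(sx,sy) -> s(max(x,y)) matched against max(ss0,sss0):
lhs = ("max", ("s", "x"), ("s", "y"))
rhs = ("s", ("max", "x", "y"))
t = ("max", ("s", ("s", "0")), ("s", ("s", ("s", "0"))))
sigma = match(lhs, t, {"x", "y"})
assert sigma == {"x": ("s", "0"), "y": ("s", ("s", "0"))}
# Instantiating the right-hand side yields the next term, s(max(s0,ss0)):
assert apply_subst(rhs, sigma) == ("s", ("max", ("s", "0"), ("s", ("s", "0"))))
```

The substitution computed here, $x\mapsto s0,\ y\mapsto ss0$, is exactly the one Activity 1 asks you to find for the first step.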
But first we should recall again what we mean by "use the rules of equational logic".

### Equational logic, recalled

As we have seen in the example above, using just the three rewrite-rules/equations

    max(sx,sy) -> s(max(x,y))
    max(0,x)   -> x
    max(x,0)   -> x

we can compute the `max` of arbitrarily large numbers. For example, we can compute `max(1024,2048)=2048`, where I write `1024` for a string of 1024 `s` followed by a `0`.

As we explained earlier, each step in such a computation involves the application of a rule of [equational logic](https://hackmd.io/s/H1panO_um#Excursion-on-Equational-Logic). Here are these rules again. The first three rules say that $\approx$ is an equivalence relation. From a computational point of view the important rule is (trans), which allows us to chain computations together: transitivity corresponds to having a loop in computational terms. The last two rules are what will allow us to do pattern matching, as explained below.

$$ \frac{}{t\approx t}{\ \rm (refl)\ } \quad\quad\quad\quad \frac{t_1\approx t_2}{t_2\approx t_1}{\ \rm (sym)\ } \quad\quad\quad\quad \frac{t_1\approx t_2\quad\quad t_2\approx t_3}{t_1\approx t_3}{\ \rm (trans)\ } $$

and, for all $n$-ary operations $f$, a congruence rule

$$ \frac{t_1\approx t_1'\quad\ldots\quad t_n\approx t_n'}{f(t_1,\ldots, t_n)\approx f(t_1',\ldots, t_n')}{\ \rm (cong)\ } $$

and a rule for substitution

$$ \frac{t_1\approx t_2}{t_1[x\mapsto t]\approx t_2[x\mapsto t]}{\ \rm (subst)\ } $$

Also recall that we write $\approx_E$ for the set of equations that can be derived from a set of "axioms" $E$ using the rules above.

### Pattern matching

Informally, I understand by pattern matching the following. Given a structure $S$ and a pattern $P$ containing variables from a set $X$, is there a substructure $S'$ of $S$ and a substitution such that $P$ instantiated by the substitution equals $S'$? It is common to think of the variables in $P$ as "holes" and of the substitution as "filling the holes". In our situation, we can be more precise.
The structure will be a term $t$, typically not containing variables, and the pattern will be the left-hand side $l$ of a rewrite rule/equation $(l,r)$, typically containing variables. We then are looking for a subterm $t'$ of $t$ and a substitution $\sigma$ such that $t'=l\sigma$. Such a subterm $t'=l\sigma$ is also called a redex, for reducible expression. To decide whether, given $t$ and $l$, there is a substitution $\sigma$ such that $t=l\sigma$, and to compute $\sigma$ if it exists, is called the matching problem, see also Baader-Nipkow, Section 4.1. If there are only finitely many equations, then it is decidable whether a term $t$ contains a redex.

### The rewrite relation associated to a set of equations

**Activity 2:** Looking back at the derivation

    max(ss0,sss0) -> s(max(s0,ss0)) -> ss(max(0,s0)) -> sss0

from the point of view of equational logic, how many steps of equational reasoning does it take to prove that `max(ss0,sss0) = sss0`? [^activity2]

The number coming out of the activity is considerably bigger than 3. But we would like to say that we took 3 rewrite steps. The solution is to define from the equations $E$ a new relation $\to_E$ so that, in the example above, if we let $E$ be the equations

    max(sx,sy) = s(max(x,y))
    max(0,x)   = x
    max(x,0)   = x

then $\to_E$ is defined in such a way that, taking `->` to be $\to_E$, the derivation considered in Activity 2 takes exactly three steps. So let us define $\to_{\sf E}$.

**Definition:** Let $E$ be a set of equations in a signature $\Sigma$ and a set $X$ of variables. Let $t_1,t_2\in Term_\Sigma(X)$. We define $$t_1\to_E t_2$$ if there is an equation $l=r$ in $E$ such that the equation $t_1 = t_2$ can be derived from $l=r$ with one application of the substitution rule and any number of applications of the congruence rule.

**Activity 3:** What is the relationship between $\approx_E$ and $\to_E$?
[^activity3]

We can now summarise what we learned from Activity 1 by saying that if $E$ is finite and $\to_E$ is [confluent and terminating](), then it is decidable whether two terms are equal according to $E$. We will formulate this as a theorem in the next section.

## Term rewriting

We summarise the work we have done in the examples above by listing again the most important definitions and results. As we only have time to present the bare bones of term rewriting, we recommend Baader-Nipkow to the reader interested in more detail.

Our definition of a TRS is the same as Def. 4.2.1 in Baader-Nipkow.

**Definition (TRS):** Fix a signature and a set of variables. Let $Term$ be the corresponding term algebra. Then a rewrite rule $l\to r$ is an equation on $Term$.

Recall the section on [substitution]().

**Definition (matching):** We say that a term $l$ matches $t$ if there is a substitution $\sigma$ such that $\sigma^\sharp(l)=t$.

**Proposition:** Matching is decidable.

For more on matching see Section 4.1 in Baader-Nipkow.

The rewrite relation $\to_E$ associated to a set of equations $E$ is Def. 3.1.8 in Baader-Nipkow. We give a more conceptual, less technical definition.

**Definition ($\to_E$):** Let $E$ be a set of equations in a signature $\Sigma$ and a set $X$ of variables. Let $t_1,t_2\in Term_\Sigma(X)$. We define $$t_1\to_E t_2$$ if there is an equation $(l,r)$ in $E$ such that the equation $t_1 = t_2$ can be derived from $l=r$ with one application of the substitution rule and any number of applications of the congruence and reflexivity rules.

The next theorem shows that every [confluent and terminating](https://hackmd.io/s/B1DPNGEdm) TRS is a model of computation. This is Thm 4.1.1 in Baader-Nipkow.

**Theorem:** If $E$ is finite and $\to_E$ is confluent and terminating, then it is decidable whether two terms are equal according to $E$.

*Proof:* We have seen in Activity 3 that $\stackrel{\ast\ \ \ }{\leftrightarrow_E}$ equals $\approx_E$.
So in order to decide whether $t_1\approx_E t_2$, we use confluence and termination to rewrite $t_1, t_2$ to normal form using $\to_E$. This takes a finite amount of time for the following reasons. First, $E$ is finite. Second, to check whether an equation $l=r$ in $E$ can be used to generate a rewrite $t\to_E t'$ for a term $t$ takes only a finite amount of time. Third, $\to_E$ is terminating. Now $t_1\approx_E t_2$ is true if and only if the normal forms of $t_1$ and $t_2$ are identical.

[^activity1]: In the first step, we apply the rewrite-rule/equation `max(sx,sy) -> s(max(x,y))`. In order to do this we need to apply a *substitution* `x:=s0, y:=ss0`, or, written in more mathematical notation, $x\mapsto s0, y\mapsto ss0$. This gives us the rewrite-rule/equation `max(ss0,sss0) -> s(max(s0,ss0))` as needed for the first step. $\quad$ In the second step, substitution gives us `max(s0,ss0) -> s(max(0,s0))`, from which, in turn, we derive the rewrite-rule/equation `s(max(s0,ss0)) -> ss(max(0,s0))`. The rule of equational logic that justifies this step is called the *congruence rule*.

[^strings]: $\Sigma = \{\epsilon, a, b\}$ where $\epsilon$ is a constant and $a,b$ are unary. For the set of variables we can choose $X=\{x\}$ and then the equations are \begin{align} a(b(x)) &\,\sf E\, b(a(x))\\ b(a(x)) &\,\sf E\, a(b(x)) \\ a(a(x)) &\,\sf E\, x \\ b(x) &\,\sf E\, x \end{align}

[^activity2]: If I counted correctly, it is 8 applications of rules of equational reasoning: (subst), (subst), (cong), (subst), (cong), (cong), (trans), (trans).

[^activity3]: The answer is that $\stackrel{\ast\ \ \ }{\leftrightarrow_E}$ equals $\approx_E$ (this is Theorem 3.1.12 in Baader-Nipkow). For the definition of $\stackrel{\ast}{\longleftrightarrow}$ see the [lecture on ARSs](https://hackmd.io/s/B1DPNGEdm).
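The decision procedure from the Theorem can be sketched in Python as follows (my own illustration under a nested-tuple representation of terms, not code from Baader-Nipkow): rewrite both terms to normal form with $\to_E$, exploiting congruence to look for a redex in any subterm, and compare the results.

```python
# The three max equations as pairs (l, r) of terms over Sigma = {0, s, max}.
E = [
    (("max", ("s", "x"), ("s", "y")), ("s", ("max", "x", "y"))),
    (("max", "0", "x"), "x"),
    (("max", "x", "0"), "x"),
]
VARS = {"x", "y"}

def match(pattern, term, subst):
    """Substitution with pattern instantiated == term, or None."""
    if isinstance(pattern, str) and pattern in VARS:
        if subst.setdefault(pattern, term) != term:
            return None
        return subst
    if isinstance(pattern, str) or isinstance(term, str):
        return subst if pattern == term else None
    if pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        if match(p, t, subst) is None:
            return None
    return subst

def apply_subst(term, subst):
    if isinstance(term, str):
        return subst.get(term, term)
    return (term[0],) + tuple(apply_subst(a, subst) for a in term[1:])

def step(t):
    """One ->_E step at the root or, by congruence, in a subterm."""
    for l, r in E:
        sigma = match(l, t, {})
        if sigma is not None:
            return apply_subst(r, sigma)
    if not isinstance(t, str):
        for i, sub in enumerate(t[1:], start=1):
            s = step(sub)
            if s is not None:
                return t[:i] + (s,) + t[i + 1:]
    return None                    # t is a normal form

def normal_form(t):                # terminates because ->_E terminates
    while (s := step(t)) is not None:
        t = s
    return t

def num(n):                        # the numeral s(s(...(0)...))
    return "0" if n == 0 else ("s", num(n - 1))

# max(2,3) ~_E max(3,2): both normalise to the numeral 3.
assert normal_form(("max", num(2), num(3))) == num(3)
assert normal_form(("max", num(3), num(2))) == num(3)
```

Since this $\to_E$ is confluent and terminating, equality of the normal forms decides $\approx_E$, exactly as in the proof above.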