# Notes on verifying hyperproperties with TLA
[reference](https://lamport.azurewebsites.net/pubs/hyper2.pdf)
## Broad idea (sketchy)
Start with a system described by a TLA formula $P$ and a hyperprop expressed by a formula $H$ involving $k$ behaviours. A formula $Q$ describes a new system comprising multiple copies of $P$. The assertion that the "system satisfies $H$" is given by a TLA formula $Q \Rightarrow R$, i.e., $Q$ satisfies property $R$.
## Preliminaries
A **behavior** is a sequence of states. A **step** is a pair of consecutive states.
### State Machines
A state machine is described by an **initial predicate** $\mathcal{I}$ and a **next-state predicate** $\mathcal{N}$. The set of states may be infinite. Behaviors generated by the state machine will have the first state satisfying $\mathcal{I}$ with $\mathcal{N}$ true at each succeeding step.
A concurrent system can be described by a state machine where each step represents operations performed synchronously by one or more processes.
##### Example 1
Let's model a state machine for a 12-hour hour-minute clock with initual state 12:00.
$$\mathcal{I} \triangleq (min = 0) \land (hour = 12)$$
The next state predicate would be
$$ \begin{align} \mathcal{N} \triangleq \; &\land\; min' = (min + 1) \bmod 60 \\
& \begin{aligned}\land\; hour' = &\text{ if } min = 59 \\
&\text{ then } hour' = (hour \bmod 12) + 1 \\
& \text{ else } hour\end{aligned} \end{align} $$
### Properties
A **property** is a predicate on behaviors. Denote that the property $P$ is true for behaviour $b$ by $b \vDash P$ and similarly for a set of behaviors $S$, $S\vDash P$ means $b\vDash$ for every $b\in S$.
For example, checking termination may be expressed as $S\vDash Term$ where $b\vDash Term$ is true if and only if $b$ reaches a terminating step.
There is a correspondence:
\begin{align}
\text{Subsets of a set S} \quad&\Leftrightarrow \quad\text{Predicates on elements of S}\\
U \quad &\leftrightarrow \quad \{s \in S : P(s)\}\\
\cup \quad&\leftrightarrow \quad\lor\\
\subseteq \quad &\leftrightarrow \quad \Rightarrow\\
= \quad &\leftrightarrow \quad \equiv\\
\lnot \quad &\leftrightarrow \quad (\;)' \end{align}
We may thus think of a property either as
- a predicate on behaviors
- a set of behaviors satisfying the predicate
> Note that the next state predicate (see example) is only concerned with steps that are *allowed* and not those which *must occur*. We may add supplementary conditions to ensure the latter.
## Hyperproperties
A **hyperproperty** is a predicate on sets of behaviors making it a predicate on properties *(unclear)*.
For example, for a property $P$, define the hyperproperty $\mathcal{H}(P)$ to be true if any two terminating behaviors satisfying $P$ having different initial values of $x$ have different terminal values of $y$.
> Note that here $P$ is viewed as a set of behaviours on which the hyperproperty may be defined.
A hyperproperty is **finitary** if $\mathcal{H}(P)$ can be written as a formula using propositional logic operators and quantinfication of the form $\forall b \in P$ with predicates $F(b_1,\dots, b_n)$ where F does not depend on P.
Since we can "pull out" all the quantifiers, $\mathcal{H}(P)$ may be written as
$$\circ\: b_1\in P: \dots :\circ b_k\in P : J(b_1,\dots, b_k)$$ where $\circ$ can be $\exists$ or $\forall$ and J doesn't depend on P.
Verifying that $P$ satisfies a property is the special case $k=1$.
$$\forall b\in P : J(b)$$
Methods for this special case need to be generalized to be applicable to the general case.
If $\circ = \forall$ then we can let $P^k$ be the state machine running $k$ copes of $P$ in parallel we can write
$$\forall\: b\in P^k : J(\pi_1(b), \dots, \pi_k(b))$$
where $\pi_i$ picks out the $i$-th component in the $k$-tuple $b$.
Consider the class of $\forall\exists$-hyperproperties which are of the form:
\begin{align}
\forall\: b_1 \in P: \dots \forall\: b_j\in P : K(b_1, \dots, b_j) \Rightarrow& \\
\exists\: b_{j+1}\in P: \dots \exists \:b_k \in P : L (b_1, \dots b_k).
\end{align}
Now if we switch to thinking of $P$ as a predicate on behaviors:
\begin{align}\forall b_1,\dots,b_j : P(b_1) \land \dots \land P(b_j) \land K(b_1,\dots, b_j) \Rightarrow \\ \exists\: b_{j+1}, \dots b_k : P(b_{j+1}) \land \dots \land P(b_k) \land L (b_1, \dots , b_k). \tag{1} \label{eq1}
\end{align}
In the preceding expression, we have replaced "$\forall\: b\in P$" by "$\forall\: b : P(b) \Rightarrow$" and "$\exists\: b\in P :$" by "$b: P(b) \land$" and done some simplifications.
### Temporal logic formulas
Temporal logic formulas (TLFs) are constructed by using temporal operators and ordinary predicate-logic operators.
The temporal operator $\Box$ (read *always* or *henceforth*) is defined as follows: $b \vDash \Box\: F$ is true if $c\vDash F$ is true for $c=b$ and all the succeeding states of $b$.
The temporal existential quantifier $\#$ (denoted differently in the reference) is used as follows. For a behavior $b$, $b\vDash \:\# x: F$ means there exists a behavior $\hat{b}$ which is the same as $b$ except the values of $x$ in $\hat{b}$ and $b$ may differ such that $\hat{b}\vDash F$ is true.
*Example?*
The expression $\#x:F$ is true for a behavior if there is a sequence of values of $x$, one for each state of the behavior, that verifies $F$.
We will now rewrite $\ref{eq1}$ as a TLF wherein the $k$ behaviors $b_i$ are encoded in a single variable $b$. We assume the formula depends only on the values that the states of $b_i$ assign to certain variables, say $V = \{v_1, \dots, v_n\}$. Assume that formula $P$ can then be written as a TLF $\tilde{P}$ containing only elements of $V$.
In order to write $\ref{eq1}$ in terms of a single behavior, we need to rewrite $P(b_i)$ with the formula obtained from $\tilde{P}$ upon substituting new variables in place of $V$, a different set for each $i$.
Use $\mathbf{x_i}$ to denote the set of new variables replacing $V$ in behavior $b_i$. We now replace $P(b_i)$ with $\tilde{P}(\mathbf{x_i})$. The values that $b$ assigns to $\mathbf{x_i}$ are thus the values that the behavior $b_i$ assigns to $V$.
\begin{align} \vDash \tilde{P}(\mathbf{x_1}) \land &\dots \tilde{P}(\mathbf{x_j})\land \tilde{K} (\mathbf{x_1}, \dots, \mathbf{x_j}) \Rightarrow \\
&\#\:\mathbf{x_{j+1}}, \dots, \mathbf{x_k} : \tilde{P}(\mathbf{x_{j+1}})\land \dots \land \tilde{P}(\mathbf{x_k}) \land \tilde{L}(\mathbf{x_1}, \dots, \mathbf{x_k})\end{align}
An assertion of the form $\vDash F$ means $F$ is true for all behaviors. We drop the twiddles henceforth for simplicity. If $P$, $K$ and $L$ can be written as TLFs, it implies $\ref{eq1}$ can be writted as a TLF too.
Thus we have written an assertion that a system satisfies a hyperproperty in terms of a TLF.
## RTLA
TLA builds its formulas from predicates on steps (called *actions*) rather than from state predicates. An action is written as a formula with primed and unprimed variables (see clock example above).
Let's describe a slightly simpler logic RTLA, whrein an action is true of a behavior iff it is true of the first step of the behavior. More concretely, for a behavior $b$:
$$b \vDash \mathcal{I}\land \Box\: \mathcal{N}.$$
##### Example 2
Consider the property $P$:
$$ P \triangleq (x\neq 1 \text{ unless } x = 42 \text{ previously}).$$
We will formulate this in terms of a Boolean-valued hidden variable $y$.
$$ P \triangleq \#\:y : ((y = (x=42)) \land \Box\: (\lnot y \Rightarrow (x' \neq 1) \land (y' = (y \lor (x=42))))) $$
Statements in predicate logic such as $S \vDash P$ are written in temporal logic as $\vDash (S\Rightarrow P)$ or using the $\#$ sign,
$$ (\#\:\mathbf{y} : S(\mathbf{x}, \mathbf{y})) \Rightarrow (\#\: \mathbf{z} : P (\mathbf{x}, \mathbf{z})), $$
which by predicate logic can be simplified to
$$ S(\mathbf{x}, \mathbf{y}) \Rightarrow (\# \: \mathbf{z} : P (\mathbf{x}, \mathbf{z})), \tag{2}\label{eq2} $$
where $\mathbf{x}$ are the free variables of $S$ and $P$, and $\mathbf{y}$ and $\mathbf{z}$ are their respective hidden variables. The value of $z$ in a state of the behavior may depend on values assigned to $x$ and $y$ in all the states of the behavior. However, verification becomes simple if $z$ depends only on values of $x$ and $y$ in that particular state only.
A **state function** is any expression containing constants and unprimed variables. Let $\mathbf{z} = z_1, \dots z_n$, we verify $\ref{eq2}$ by finding state functions $f_1(\mathbf{x}, \mathbf{y}), \dots, f_n(\mathbf{x}, \mathbf{y})$ where the following formula is true:
$$ \vDash S(\mathbf{x}, \mathbf{y}) \Rightarrow P(\mathbf{x}, \mathbf{f}(\mathbf{x}, \mathbf{y})) $$
The symbol $\mathbf{f}$ denotes the list of functions $\{f_1, \dots, f_n\}$ and these functions are called a **refinement mapping**.
In an RTLA formula, substituting $f$ in place of $v$ also entails substituting $f'$ for $v'$, where $f'$ is obtained by priming the arguments of $f$.
A refinement mapping may not always exist, but we can always find one if we replace S by an equivalent formula obtained by adding *auxillary variables* $\mathbf{a}$, where $$S(\mathbf{x}, \mathbf{y}) \Leftrightarrow \# \: \mathbf{a} : S^{\mathbf{a}} (\mathbf{x}, \mathbf{y}, \mathbf{a}). $$
#### Example: Generalized non-interference