Notes on verifying hyperproperties with TLA

# Notes on verifying hyperproperties with TLA [reference](https://lamport.azurewebsites.net/pubs/hyper2.pdf) ## Broad idea (sketchy) Start with a system described by a TLA formula $P$ and a hyperprop expressed by a formula $H$ involving $k$ behaviours. A formula $Q$ describes a new system comprising multiple copies of $P$. The assertion that the "system satisfies $H$" is given by a TLA formula $Q \Rightarrow R$, i.e., $Q$ satisfies property $R$. ## Preliminaries A **behavior** is a sequence of states. A **step** is a pair of consecutive states. ### State Machines A state machine is described by an **initial predicate** $\mathcal{I}$ and a **next-state predicate** $\mathcal{N}$. The set of states may be infinite. Behaviors generated by the state machine will have the first state satisfying $\mathcal{I}$ with $\mathcal{N}$ true at each succeeding step. A concurrent system can be described by a state machine where each step represents operations performed synchronously by one or more processes. ##### Example 1 Let's model a state machine for a 12-hour hour-minute clock with initual state 12:00. $$\mathcal{I} \triangleq (min = 0) \land (hour = 12)$$ The next state predicate would be $$ \begin{align} \mathcal{N} \triangleq \; &\land\; min' = (min + 1) \bmod 60 \\ & \begin{aligned}\land\; hour' = &\text{ if } min = 59 \\ &\text{ then } hour' = (hour \bmod 12) + 1 \\ & \text{ else } hour\end{aligned} \end{align} $$ ### Properties A **property** is a predicate on behaviors. Denote that the property $P$ is true for behaviour $b$ by $b \vDash P$ and similarly for a set of behaviors $S$, $S\vDash P$ means $b\vDash$ for every $b\in S$. For example, checking termination may be expressed as $S\vDash Term$ where $b\vDash Term$ is true if and only if $b$ reaches a terminating step. There is a correspondence: \begin{align} \text{Subsets of a set S} \quad&\Leftrightarrow \quad\text{Predicates on elements of S}\\ U \quad &\leftrightarrow \quad \{s \in S : P(s)\}\\ \cup \quad&\leftrightarrow \quad\lor\\ \subseteq \quad &\leftrightarrow \quad \Rightarrow\\ = \quad &\leftrightarrow \quad \equiv\\ \lnot \quad &\leftrightarrow \quad (\;)' \end{align} We may thus think of a property either as - a predicate on behaviors - a set of behaviors satisfying the predicate > Note that the next state predicate (see example) is only concerned with steps that are *allowed* and not those which *must occur*. We may add supplementary conditions to ensure the latter. ## Hyperproperties A **hyperproperty** is a predicate on sets of behaviors making it a predicate on properties *(unclear)*. For example, for a property $P$, define the hyperproperty $\mathcal{H}(P)$ to be true if any two terminating behaviors satisfying $P$ having different initial values of $x$ have different terminal values of $y$. > Note that here $P$ is viewed as a set of behaviours on which the hyperproperty may be defined. A hyperproperty is **finitary** if $\mathcal{H}(P)$ can be written as a formula using propositional logic operators and quantinfication of the form $\forall b \in P$ with predicates $F(b_1,\dots, b_n)$ where F does not depend on P. Since we can "pull out" all the quantifiers, $\mathcal{H}(P)$ may be written as $$\circ\: b_1\in P: \dots :\circ b_k\in P : J(b_1,\dots, b_k)$$ where $\circ$ can be $\exists$ or $\forall$ and J doesn't depend on P. Verifying that $P$ satisfies a property is the special case $k=1$. $$\forall b\in P : J(b)$$ Methods for this special case need to be generalized to be applicable to the general case. If $\circ = \forall$ then we can let $P^k$ be the state machine running $k$ copes of $P$ in parallel we can write $$\forall\: b\in P^k : J(\pi_1(b), \dots, \pi_k(b))$$ where $\pi_i$ picks out the $i$-th component in the $k$-tuple $b$. Consider the class of $\forall\exists$-hyperproperties which are of the form: \begin{align} \forall\: b_1 \in P: \dots \forall\: b_j\in P : K(b_1, \dots, b_j) \Rightarrow& \\ \exists\: b_{j+1}\in P: \dots \exists \:b_k \in P : L (b_1, \dots b_k). \end{align} Now if we switch to thinking of $P$ as a predicate on behaviors: \begin{align}\forall b_1,\dots,b_j : P(b_1) \land \dots \land P(b_j) \land K(b_1,\dots, b_j) \Rightarrow \\ \exists\: b_{j+1}, \dots b_k : P(b_{j+1}) \land \dots \land P(b_k) \land L (b_1, \dots , b_k). \tag{1} \label{eq1} \end{align} In the preceding expression, we have replaced "$\forall\: b\in P$" by "$\forall\: b : P(b) \Rightarrow$" and "$\exists\: b\in P :$" by "$b: P(b) \land$" and done some simplifications. ### Temporal logic formulas Temporal logic formulas (TLFs) are constructed by using temporal operators and ordinary predicate-logic operators. The temporal operator $\Box$ (read *always* or *henceforth*) is defined as follows: $b \vDash \Box\: F$ is true if $c\vDash F$ is true for $c=b$ and all the succeeding states of $b$. The temporal existential quantifier $\#$ (denoted differently in the reference) is used as follows. For a behavior $b$, $b\vDash \:\# x: F$ means there exists a behavior $\hat{b}$ which is the same as $b$ except the values of $x$ in $\hat{b}$ and $b$ may differ such that $\hat{b}\vDash F$ is true. *Example?* The expression $\#x:F$ is true for a behavior if there is a sequence of values of $x$, one for each state of the behavior, that verifies $F$. We will now rewrite $\ref{eq1}$ as a TLF wherein the $k$ behaviors $b_i$ are encoded in a single variable $b$. We assume the formula depends only on the values that the states of $b_i$ assign to certain variables, say $V = \{v_1, \dots, v_n\}$. Assume that formula $P$ can then be written as a TLF $\tilde{P}$ containing only elements of $V$. In order to write $\ref{eq1}$ in terms of a single behavior, we need to rewrite $P(b_i)$ with the formula obtained from $\tilde{P}$ upon substituting new variables in place of $V$, a different set for each $i$. Use $\mathbf{x_i}$ to denote the set of new variables replacing $V$ in behavior $b_i$. We now replace $P(b_i)$ with $\tilde{P}(\mathbf{x_i})$. The values that $b$ assigns to $\mathbf{x_i}$ are thus the values that the behavior $b_i$ assigns to $V$. \begin{align} \vDash \tilde{P}(\mathbf{x_1}) \land &\dots \tilde{P}(\mathbf{x_j})\land \tilde{K} (\mathbf{x_1}, \dots, \mathbf{x_j}) \Rightarrow \\ &\#\:\mathbf{x_{j+1}}, \dots, \mathbf{x_k} : \tilde{P}(\mathbf{x_{j+1}})\land \dots \land \tilde{P}(\mathbf{x_k}) \land \tilde{L}(\mathbf{x_1}, \dots, \mathbf{x_k})\end{align} An assertion of the form $\vDash F$ means $F$ is true for all behaviors. We drop the twiddles henceforth for simplicity. If $P$, $K$ and $L$ can be written as TLFs, it implies $\ref{eq1}$ can be writted as a TLF too. Thus we have written an assertion that a system satisfies a hyperproperty in terms of a TLF. ## RTLA TLA builds its formulas from predicates on steps (called *actions*) rather than from state predicates. An action is written as a formula with primed and unprimed variables (see clock example above). Let's describe a slightly simpler logic RTLA, whrein an action is true of a behavior iff it is true of the first step of the behavior. More concretely, for a behavior $b$: $$b \vDash \mathcal{I}\land \Box\: \mathcal{N}.$$ ##### Example 2 Consider the property $P$: $$ P \triangleq (x\neq 1 \text{ unless } x = 42 \text{ previously}).$$ We will formulate this in terms of a Boolean-valued hidden variable $y$. $$ P \triangleq \#\:y : ((y = (x=42)) \land \Box\: (\lnot y \Rightarrow (x' \neq 1) \land (y' = (y \lor (x=42))))) $$ Statements in predicate logic such as $S \vDash P$ are written in temporal logic as $\vDash (S\Rightarrow P)$ or using the $\#$ sign, $$ (\#\:\mathbf{y} : S(\mathbf{x}, \mathbf{y})) \Rightarrow (\#\: \mathbf{z} : P (\mathbf{x}, \mathbf{z})), $$ which by predicate logic can be simplified to $$ S(\mathbf{x}, \mathbf{y}) \Rightarrow (\# \: \mathbf{z} : P (\mathbf{x}, \mathbf{z})), \tag{2}\label{eq2} $$ where $\mathbf{x}$ are the free variables of $S$ and $P$, and $\mathbf{y}$ and $\mathbf{z}$ are their respective hidden variables. The value of $z$ in a state of the behavior may depend on values assigned to $x$ and $y$ in all the states of the behavior. However, verification becomes simple if $z$ depends only on values of $x$ and $y$ in that particular state only. A **state function** is any expression containing constants and unprimed variables. Let $\mathbf{z} = z_1, \dots z_n$, we verify $\ref{eq2}$ by finding state functions $f_1(\mathbf{x}, \mathbf{y}), \dots, f_n(\mathbf{x}, \mathbf{y})$ where the following formula is true: $$ \vDash S(\mathbf{x}, \mathbf{y}) \Rightarrow P(\mathbf{x}, \mathbf{f}(\mathbf{x}, \mathbf{y})) $$ The symbol $\mathbf{f}$ denotes the list of functions $\{f_1, \dots, f_n\}$ and these functions are called a **refinement mapping**. In an RTLA formula, substituting $f$ in place of $v$ also entails substituting $f'$ for $v'$, where $f'$ is obtained by priming the arguments of $f$. A refinement mapping may not always exist, but we can always find one if we replace S by an equivalent formula obtained by adding *auxillary variables* $\mathbf{a}$, where $$S(\mathbf{x}, \mathbf{y}) \Leftrightarrow \# \: \mathbf{a} : S^{\mathbf{a}} (\mathbf{x}, \mathbf{y}, \mathbf{a}). $$ #### Example: Generalized non-interference