# Constrained Hamiltonian dynamic in parameter space
$\newcommand{\reals}{\mathbb{R}}
\newcommand{\idmtx}{\mathbb{I}}
\newcommand{\nrm}{\mathcal{N}}
\newcommand{\d}{\mathop{}\!\mathrm{d}}
\newcommand{\jacob}{\mathop{}\!\partial}
\newcommand{\grad}{\mathop{}\!\nabla}
\newcommand{\tr}{^{\mkern-1.5mu\mathsf{T}}}$
## Generative model
Assume latent state $\theta \in \reals^P$ with Gaussian prior $\nrm(0,C)$, observation $y \in \reals^N$ and observation function $g: \reals^P \to \reals^N$ subject to additive Gaussian noise with covariance $R = SS\tr \succ 0$. This corresponds to a generative model
$$ y = g(\theta) + S \eta, \quad \theta \sim \nrm(0, C), \quad \eta \sim \nrm(0, \idmtx_N).$$
## Constrained Hamiltonian dynamic
Denote the overall unknown state $q = [\theta; \eta]$ and define a constraint function $c : \mathbb{R}^{P+N} \to \mathbb{R}^N$ which is zero at states consistent with the observations
$$c([\theta; \eta]) = g(\theta) + S \eta - y.$$
The posterior distribution has support on an implicitly defined manifold $\mathcal{M} = \lbrace q \in \reals^{P+N} : c(q) = 0\rbrace$ with unnormalised density $\pi_{\mathcal{M}}$ on $\mathcal{M}$ with respect to $P$-dimensional Hausdorff measure on ambient space $\reals^{P+N}$ equipped with the Euclidean metric $d(q, q') = q\tr q$ with $\pi_{\mathcal{M}}$ defined by
$$\log\pi_{\mathcal{M}}(q) = -\frac{1}{2}q\tr \begin{bmatrix} C^{-1} & 0 \\ 0 & \idmtx_N \end{bmatrix} q - \frac{1}{2}\log\det(\jacob c(q) \jacob c(q)\tr).$$
Augmenting the space with a momentum variable $p\in\reals^{P+N}$ with covariance $\idmtx_{P+N}$, the Hamiltonian for the constrained system is
$$
h(q, p) =
-\log\pi_{\mathcal{M}}(q) + \frac{1}{2}p\tr p +
c(q)\tr \lambda(p,q),
$$
with $\lambda$ implictly defined by the requirement $c(q) = 0$. The corresponding constrained Hamiltonian dynamic is defined by the system of differential algebraic equations
\begin{align}
\frac{\d q}{\d t}
&= \jacob_2 h(q, p)
= p,
\\
\frac{\d p}{\d t}
&= -\jacob_1 h(q, p)
= \grad\log\pi_{\mathcal{M}}(q) - \jacob c(q)\tr \lambda(p,q),\\
c(q)&= 0.
\end{align}
## Riemannian manifold in parameter space
The manifold $\mathcal{M}$ can also be defined via an explicit parameterisation
$$\phi(\theta) = \left[\theta; S^{-1}(y-g(\theta)) \right] \implies \mathcal{M} = \phi(\reals^P).$$
Following Leimkuhler and Reich (2004) ยง7.6.2, if we consider the transformation $q = \phi(\theta)$ then the canonical momentum $\nu$ associated with $\theta$ is defined by $\nu = \jacob\phi(\theta)\tr p$ and the Hamiltonian $h'$ in the new variables $(\theta, \nu)$ is
\begin{align}\require{cancel}
h'(\theta,\nu)
&=
h(\phi(\theta), (\jacob\phi(\theta)\tr)^+\nu)\\
&=
\frac{1}{2}\phi(\theta)\tr\begin{bmatrix} C^{-1} & 0 \\ 0 & \idmtx_N \end{bmatrix}\phi(\theta) +
\frac{1}{2}\log\det(\jacob c(\phi(\theta))\, \jacob c(\phi(\theta))\tr) + \,\\
& \phantom{=~}
\frac{1}{2}\nu\tr (\jacob\phi(\theta))^+(\jacob\phi(\theta)\tr)^+ \nu +
c(\phi(\theta))\tr\lambda(\phi(\theta), (\jacob\phi(\theta)\tr)^+\nu).
\end{align}
By construction $c(\phi(\theta)) = 0 ~\forall \theta \in \reals^P$ so the last term can be ignored.
By the properties of the pseudo-inverse we have that
$$
(\jacob\phi(\theta))^+ (\jacob\phi(\theta)\tr)^+ = (\jacob\phi(\theta)\tr \jacob\phi(\theta))^+ = (\jacob\phi(\theta)\tr \jacob\phi(\theta))^{-1}.
$$
The last equality arises as from the definition of $\phi$, the Jacobian $\jacob\phi$ has the block structure
$$
\jacob\phi(\theta) =
\begin{bmatrix}\idmtx_P \\ -S^{-1} \jacob g(\theta)\end{bmatrix},
$$
and so $\jacob\phi(\theta)$ has full-column rank for all $\theta \in \reals^P$. We therefore have that
$$
(\jacob\phi(\theta))^+ (\jacob\phi(\theta)\tr)^+ =
(\idmtx_P + \jacob g(\theta)\tr S^{-T} S^{-1} \jacob g(\theta))^{-1}.
$$
From the definition of the constraint function $c$, the Jacobian $\jacob c$ has a block structure
$$\jacob c(\phi(\theta)) = \begin{bmatrix} \jacob g(\theta) & S \end{bmatrix}.$$
Therefore we have that
$$
\det(\jacob c(\phi(\theta))\, \jacob c(\phi(\theta))\tr) =
\det(\jacob g(\theta)\jacob g(\theta)\tr + SS^{T} ).
$$
Applying the matrix determinant lemma gives
$$
\det(\jacob c(\phi(\theta))\, \jacob c(\phi(\theta))\tr) =
\det(SS\tr)
\det(\idmtx_P + \jacob g(\theta)\tr S^{-T} S^{-1} \jacob g(\theta)).
$$
Putting everything together we therefore have that
\begin{align}
h'(\theta,\nu)
&=
\frac{1}{2}\theta\tr C^{-1}\theta +
\frac{1}{2\sigma^2}(y-g(\theta))\tr R^{-1} (y-g(\theta)) + \frac{1}{2} \log\det(R) + \,\\
& \phantom{=~}
\frac{1}{2}\nu\tr (\idmtx_P + \jacob g(\theta)\tr R^{-1} \jacob g(\theta))^{-1} \nu +
\frac{1}{2}\log\det(\idmtx_P + \jacob g(\theta)\tr R^{-1} \jacob g(\theta)).
\end{align}
This corresponds to a Riemannian-manifold Hamiltonian
$$
h'(\theta, \nu) = -\log\pi(\theta) + \frac{1}{2}\nu\tr M(\theta)^{-1} \nu + \frac{1}{2}\log\det(M(\theta))
$$
for a target (posterior) distribution in the parameter space with unnormalised density
$$
\pi(\theta) = \frac{1}{\sqrt{\det(R)}}
\exp\left(-\frac{1}{2}(y-g(\theta))\tr R^{-1} (y-g(\theta))\right)
\exp\left(-\frac{1}{2}\theta\tr C^{-1}\theta\right)
$$
and metric (momentum covariance)
$$
M(\theta) = \idmtx_P + \jacob g(\theta)\tr R^{-1} \jacob g(\theta).
$$
## Generalised No-U-Turn criteria in parameter space
A heuristic 'No-U-Turn' criterion which terminates the expansion of a Hamiltonian trajectory when it begins to 'go back on itself', was proposed for Hamiltonian systems defined on Riemannian manifolds by Betancourt (2013).
For a system with metric $M(\theta)$, if the position and momentum pairs along a trajectory of $I$ integrator steps are denoted $\theta_{1:I}$ and $\nu_{1:I}$ respectively, then the integration is terminated when the following condition is first satisfied
$$
\nu_I\tr M^{-1}(\theta_I) \sum_{i=1}^I \nu_i < 0.
$$
This condition is expressed in terms of the position and conjugate momenta of a Hamiltonian trajectory in the *parameter space*. Using the relations $\nu_i = \jacob\phi(\theta)\tr p_i$ and $q_i = \phi(\theta_i)$ and the definition of $M(\theta)$ in the previous section, we can derive an equivalent condition in terms of the positions $q_{1:I}$ and momenta $p_{1:I}$ of a Hamiltonian trajectory on the manifold embedded in the extended space.
We first decompose each $P+N$ dimensional momentum vector $p_i$ as $p_i = [p^\theta_i; p^\eta_i]$ with $p^\theta_i$ a $P$ dimensional subvector and $p^\eta_i$ a $N$ dimensional subvector. Then we have that
$$
\nu_i = \jacob\phi(\theta_i)\tr p_i = \begin{bmatrix}\idmtx_P & -\jacob g(\theta)\tr S^{-1}\end{bmatrix} \begin{bmatrix}p^\theta_i \\ p^\eta_i\end{bmatrix} = p^\theta_i - \jacob g(\theta_i)\tr S^{-1} p^\eta_i.
$$
Further from the constraint that the momenta lie in the cotangent space to the manifold we have that
$$
\jacob c(q_i) p_i = 0 \implies \jacob c(\phi(\theta_i))p_i = \begin{bmatrix} \jacob g(\theta_i) & S \end{bmatrix} \begin{bmatrix}p^\theta_i \\ p^\eta_i\end{bmatrix}=
\jacob g(\theta_i) p^\theta_i + S p^\eta_i = 0.
$$
Rearranging we therefore have that
$$
p^\eta_i = -S^{-1}\jacob g(\theta_i) p^\theta_i,
$$
and so
$$
\nu_i =
p^\theta_i + \jacob g(\theta_i)\tr S^{-1} S^{-1}\jacob g(\theta_i) p^\theta_i =
(\idmtx_P + \jacob g(\theta_i)\tr R^{-1}\jacob g(\theta_i)) p^\theta_i = M(\theta_i) p^\theta_i.
$$
Substituting into the termination criterion we get the condition
$$
p^\theta_I \sum_{i=1}^I M(\theta_i) p^\theta_i < 0.
$$
## References
1. Leimkuhler, B. and Reich, S., 2004. Simulating Hamiltonian dynamics (Vol. 14). *Cambridge University Press*.
2. Betancourt, M.J., 2013. Generalizing the no-U-turn sampler to Riemannian manifolds. *[arXiv:1304.1920](https://arxiv.org/abs/1304.1920)*.