###### tags: `one-offs` `markov chains` `monte carlo` `sampling`
# Microcanonical Monte Carlo
**Overview**: In this note, I present an MCMC algorithm from the 1980s Lattice Gauge Theory community, and describe some connections to more modern MCMC.
## The microMC Algorithm
There's a cute little MCMC algorithm with its roots in the Lattice Gauge Theory community of the early 1980s. I think it's worth sharing. The paper is [available here](https://quark.phy.bnl.gov/~creutz/mypubs/pub070.pdf), and is titled 'Microcanonical Monte Carlo Simulation'; the author is M. Creutz, and the idea is as follows:
1. You have a state $x$ (living on e.g. a locally-finite graph), an energy function $E \left( x \right)$, and a 'demon' which carries a positive variable $p \sim \mathrm{Exp} \left(1\right)$.
2. At each step, you pick a neighbouring state $y$, and check the ordering of $\left( E \left(x\right) + p,E \left(y\right) \right)$.
(a) If $E\left(x\right)+p>E\left(y\right)$, then move to $\left(y,p+E\left(x\right) - E\left(y\right)\right)$.
(b) If $E\left(y\right)\geqslant E\left(x\right)+p$, then stay at $\left(x,p\right)$.
Note that $H := E\left(x\right)+p$ is conserved throughout.
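Here is a minimal Python sketch of a single update (the `energy`/`neighbours` interface and all names are my own illustrative choices, not from Creutz's paper, which works with concrete lattice systems):

```python
import random

def micro_step(x, p, energy, neighbours, rng=random):
    """One microcanonical ('demon') update.

    x          : current state (e.g. a node of a locally-finite graph)
    p          : demon energy, p >= 0
    energy     : callable, energy(state) -> float
    neighbours : callable, neighbours(state) -> list of candidate states
    """
    y = rng.choice(neighbours(x))
    dE = energy(y) - energy(x)
    if p > dE:             # case (a): E(x) + p > E(y), the demon pays for the move
        return y, p - dE   # new demon energy is p + E(x) - E(y) >= 0
    return x, p            # case (b): stay put
    # in either branch, H = energy(x) + p is unchanged
```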
The algorithm then merits a few basic remarks:
1. As written, the algorithm is deterministic, modulo how the 'pick a neighbouring state $y$' step is implemented.
2. As such, the resulting Markov chain is reducible (due to the conserved quantity) and will not necessarily sample correctly from the distribution $\left(x,p\right) \sim \exp\left(-E\left(x\right) - p \right) \cdot \mathbf{1}\left[p\geqslant0\right]$.
3. If you resample the variable $p \sim \mathrm{Exp} \left(1\right)$ afresh every now and again, then this should restore ergodicity.
4. If you resample the variable $p \sim \mathrm{Exp}\left(1\right)$ at every step, then this recovers a standard Metropolis algorithm.
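To see remark 4 concretely: with a freshly drawn $p \sim \mathrm{Exp}\left(1\right)$, the probability of moving to $y$ is $\mathbb{P}\left( p > E\left(y\right) - E\left(x\right) \right) = \min\left( 1, \exp\left( E\left(x\right) - E\left(y\right) \right) \right)$, which is exactly the Metropolis acceptance rate for the target $\pi\left(x\right) \propto \exp\left(-E\left(x\right)\right)$. Continuing the sketch above, here is a driver loop with the periodic resampling of remark 3 (the `resample_every` schedule and the toy example are again my own illustrative choices):

```python
def micro_mc(x0, energy, neighbours, n_steps, resample_every=50, rng=random):
    """Run microMC from x0, refreshing the demon p ~ Exp(1) every
    `resample_every` steps to restore ergodicity across level sets of H."""
    x, p = x0, rng.expovariate(1.0)
    samples = []
    for t in range(n_steps):
        if t % resample_every == 0:
            p = rng.expovariate(1.0)  # resample_every=1 recovers Metropolis
        x, p = micro_step(x, p, energy, neighbours, rng)
        samples.append(x)
    return samples

# toy usage: sample from pi(x) proportional to exp(-|x|) on the integers
xs = micro_mc(0, energy=abs, neighbours=lambda x: [x - 1, x + 1], n_steps=10_000)
```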
In [later work](https://latticeguy.net/mypubs/pub088.pdf), Creutz pushes the idea further, adding in _loads_ of demons (i.e. many $p_{i}$), one for each lattice site, which he begins to interpret as 'conjugate momenta'. From a modern vantage point, one can see the connections to the (at that time, yet-to-be-envisioned) Hamiltonian Monte Carlo algorithm, which indeed also has its roots in the LGT community of the 1980s.
Originally, I found myself hoping that this analogy would be useful for deriving some interesting 'Discrete-Space Hamiltonian Monte Carlo' algorithm, but in my opinion, this is not so easy. It is not even entirely clear to me what such an analogue ought to satisfy. If one thinks that the essence of HMC is to have dynamics which are { energy-preserving, deterministic, momentum-based, etc. }, then one can reproduce these features, but it is not clear that they are sufficient for obtaining a useful algorithm. At present, I would not say that I expect any discrete sampler to convincingly resemble HMC 'in full', by my own (vague) standards.
Tentatively, my suspicion is that the gap stems from trajectories in HMC being not just deterministic and energy-preserving, but also clever in other ways: they transport the bulk of the distribution convincingly, and this seems hard to replicate generically in discrete spaces.