###### tags: `one-offs` `generative models` `monte carlo`

# Rare Events from Transport Maps

**Overview**: In this note, I take a brief look at the problem of rare event estimation for probability measures defined via transport maps.

## Flows of Measure and Regular Transport

An exercise (to which I don't yet have a good solution): fix a probability distribution $p$, specified in the form of a cascade of pushforwards, as is common with Normalising Flows. That is, a draw from $p$ is obtained by

1. $x_{0} \sim \gamma = \mathcal{N} \left(0,I_{d}\right)$.
2. For $1 \leqslant t \leqslant T$, $x_{t} = f_{t} \left( x_{t-1} \right)$.
3. Return $x_{T}$.

with $\left\{ f_{t} : 1 \leqslant t \leqslant T \right\}$ a known sequence of diffeomorphisms.

Suppose now that it is of interest to estimate the probability of a rare event under $p$ (e.g. some tail probability). How can you best make systematic use of the model structure towards solving this estimation problem?

One interesting observation about this is the following: if the rare event is indeed only a function of the terminal state, the complexity of the maps $\left\{ f_{t}\right\}$ can to some extent be smoothed over, in the sense that we only care about the pushforward maps induced by applying these maps to the initial measure $\gamma$. Indeed, you can even imagine the fun scenario in which the $f_{t}$ are all equal to some expanding map, as one encounters in ergodic theory. Here, it can be that the iterates $\left\{ x_{t} \right\}$ form a deterministic, chaotic, ergodic system, and yet there exists a much simpler mapping which pushes $\gamma$ onto $\mathrm{Law} \left( x_{T} \right)$ (e.g. the optimal transport map).

I confess that I'm still not quite sure how to turn this into a good algorithm though. The best principle which occurs to me so far is to try to express the action of the $\left\{ f_{t} \right\}$ or their compositions in terms of more regular maps (e.g.
gradients of convex functions), and then somehow contract the problem. Still, this approach is barely about rare events, and mostly just relates to developing an understanding of the terminal law, through some sort of 'distillation' procedure.

An obliquely-related task would be to perform a similar sort of distillation for non-invertible systems, as observed in e.g. Generative Adversarial Networks. Here, the degeneracy of the pushforward maps may lead to some technical difficulties, but the core task should still be somehow possible.
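
To make the setup from the first section concrete, here is a minimal sketch of the sampling procedure together with two baseline estimators. It is not a solution to the exercise, only a reference point. Everything specific in it is an assumption made for illustration: the maps $f_{t}(x) = x + a \tanh(x)$ (applied elementwise), the event $\{ x_{T,1} > \text{level} \}$, and the helper names `make_maps`, `push_forward`, `naive_mc` and `shifted_is`. The second estimator uses the only piece of structure that comes for free: an event on $x_{T}$ is also an event on $x_{0}$, so one can do importance sampling in the base space with a mean-shifted Gaussian proposal.

```python
import math
import numpy as np


def make_maps(T, a=0.5):
    # Hypothetical diffeomorphisms: f_t(x) = x + a * tanh(x), elementwise.
    # Each coordinate map is smooth and strictly increasing for a > -1.
    def f(x):
        return x + a * np.tanh(x)
    return [f] * T


def push_forward(x0, maps):
    # Steps 2 and 3 of the sampling procedure: x_t = f_t(x_{t-1}), return x_T.
    x = x0
    for f in maps:
        x = f(x)
    return x


def naive_mc(event, maps, d, n, rng):
    # Plain Monte Carlo: draw x_0 ~ N(0, I_d), push forward, count hits.
    x0 = rng.standard_normal((n, d))
    return event(push_forward(x0, maps)).mean()


def shifted_is(event, maps, mu, n, rng):
    # Importance sampling in the base space with proposal q = N(mu, I_d).
    # Weight: gamma(x0) / q(x0) = exp(-<x0, mu> + |mu|^2 / 2).
    x0 = mu + rng.standard_normal((n, mu.shape[0]))
    log_w = -x0 @ mu + 0.5 * (mu @ mu)
    return (np.exp(log_w) * event(push_forward(x0, maps))).mean()


rng = np.random.default_rng(0)
maps = make_maps(T=5)

# Choose the level as the image of x_0 = (3, 0), so that by monotonicity
# the event {x_T[0] > level} coincides with {x_0[0] > 3}.
level = push_forward(np.array([[3.0, 0.0]]), maps)[0, 0]


def event(x):
    return x[:, 0] > level


p_exact = 0.5 * math.erfc(3.0 / math.sqrt(2.0))  # P(Z > 3) for Z ~ N(0, 1)
p_mc = naive_mc(event, maps, d=2, n=200_000, rng=rng)
p_is = shifted_is(event, maps, mu=np.array([3.0, 0.0]), n=200_000, rng=rng)
print(p_exact, p_mc, p_is)
```

In this toy case the pullback of the event to the base space is known in closed form, which is what makes a good shift `mu` obvious. For general $f_{t}$ that is exactly the missing ingredient, and choosing the proposal would itself need some understanding of the composed map.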