###### tags: `one-offs` `transport` `bregman`
# Bregman Cost Functions in Optimal Transport
**Overview**: In this note, I present a problem in Optimal Transport which involves Bregman divergences, and sketch its solution.
## Problem
Consider a measure transport problem with cost given by a Bregman divergence, i.e.
\begin{align}
c\left(x,y\right)=F\left(y\right)-F\left(x\right)-\left\langle \nabla F\left(x\right),y-x\right\rangle
\end{align}
with $F$ e.g. strictly convex and $C^{2}$, so that one is aiming to solve
\begin{align}
\min_{\gamma\in\Gamma\left(\mu,\nu\right)}\left\{ C\left(\gamma\right):=\int\gamma\left(\mathrm{d}x,\mathrm{d}y\right)\cdot c\left(x,y\right)\right\} .
\end{align}
**Task**: Show by a (nonlinear) change of coordinates that this is equivalent to solving an $L^{2}$ OT problem for another pair of probability measures.
**Bonus Task**: Deduce a Brenier-type structure theorem for the $c$-optimal coupling between the measures (being appropriately generous with assumptions, etc.).
## Solution
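Write $\hat{x}=\nabla F\left(x\right)$, $\hat{\mu}=\left(\nabla F\right)_{\#}\mu$, and let $F^{*}$ denote the convex conjugate of $F$, so that $F^{*}\left(\hat{x}\right)=\left\langle \hat{x},x\right\rangle -F\left(x\right)$. Substituting into the definition of $c$ gives $c\left(x,y\right)=:\hat{c}\left(\hat{x},y\right)=F^{*}\left(\hat{x}\right)+F\left(y\right)-\left\langle \hat{x},y\right\rangle$.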
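Similarly, translate between $\Gamma\left(\mu,\nu\right)$ and $\Gamma\left(\hat{\mu},\nu\right)$ through the map $\nabla F\otimes\mathrm{Id}$, i.e. set $\hat{\gamma}=\left(\nabla F\otimes\mathrm{Id}\right)_{\#}\gamma$. Since $F$ is strictly convex, $\nabla F$ is injective, so this correspondence between couplings is one-to-one, and $\int\gamma\left(\mathrm{d}x,\mathrm{d}y\right)\cdot c\left(x,y\right) = \int\hat{\gamma}\left(\mathrm{d}\hat{x},\mathrm{d}y\right)\cdot\hat{c}\left(\hat{x},y\right)$.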
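As a quick sanity check of the identity $c\left(x,y\right)=\hat{c}\left(\hat{x},y\right)$, here is a minimal numerical sketch. The potential $F\left(x\right)=\sum_{i}\exp\left(x_{i}\right)$ is my own illustrative choice (not part of the problem statement), for which $\nabla F\left(x\right)=\exp\left(x\right)$ componentwise and $F^{*}\left(p\right)=\sum_{i}\left(p_{i}\log p_{i}-p_{i}\right)$ on the positive orthant. All function names are hypothetical.

```python
import math
import random

def F(x):
    # Assumed example potential: F(x) = sum_i exp(x_i), strictly convex and smooth.
    return sum(math.exp(xi) for xi in x)

def grad_F(x):
    # Gradient of the example potential, taken componentwise.
    return [math.exp(xi) for xi in x]

def F_star(p):
    # Convex conjugate of the example potential, valid for p_i > 0.
    return sum(pi * math.log(pi) - pi for pi in p)

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def bregman_cost(x, y):
    # c(x, y) = F(y) - F(x) - <grad F(x), y - x>
    return F(y) - F(x) - dot(grad_F(x), [yi - xi for xi, yi in zip(x, y)])

def dual_cost(x_hat, y):
    # c_hat(x_hat, y) = F*(x_hat) + F(y) - <x_hat, y>
    return F_star(x_hat) + F(y) - dot(x_hat, y)

rng = random.Random(0)
max_gap = 0.0
for _ in range(100):
    x = [rng.uniform(-1, 1) for _ in range(3)]
    y = [rng.uniform(-1, 1) for _ in range(3)]
    gap = abs(bregman_cost(x, y) - dual_cost(grad_F(x), y))
    max_gap = max(max_gap, gap)

# Largest discrepancy over the random trials; expected to be at rounding level.
print(max_gap)
```

The two costs should agree up to floating-point rounding for any other strictly convex $F$ whose conjugate is available in closed form.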
Note now that
\begin{align}
\hat{c}\left(\hat{x},y\right) &=F^{*}\left(\hat{x}\right)+F\left(y\right)-\left\langle \hat{x},y\right\rangle \\
&=\left(F^{*}\left(\hat{x}\right)-\frac{1}{2}\left|\hat{x}\right|_{2}^{2}\right)+\left(F\left(y\right)-\frac{1}{2}\left|y\right|_{2}^{2}\right)+\frac{1}{2}\left|\hat{x}-y\right|_{2}^{2},
\end{align}
and so
\begin{align}
C\left(\gamma\right)=\hat{C}\left(\hat{\gamma}\right)=&\hat{\mu}\left(F^{*}-\frac{1}{2}\left|\cdot\right|_{2}^{2}\right)+\nu\left(F-\frac{1}{2}\left|\cdot\right|_{2}^{2}\right)\\
&+\int\hat{\gamma}\left(\mathrm{d}\hat{x},\mathrm{d}y\right)\cdot\left(\frac{1}{2}\left|\hat{x}-y\right|_{2}^{2}\right).
\end{align}
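The first two terms on the right-hand side depend only on the marginals $\hat{\mu}$ and $\nu$, and not on the chosen coupling. Provided that they are finite, the cost is therefore minimised by taking $\hat{\gamma}$ to be the $L^{2}$-optimal transport coupling between $\hat{\mu}$ and $\nu$, and then mapping back to a coupling $\gamma$ of $\mu$ and $\nu$ through $\nabla F\otimes\mathrm{Id}$. As such, naively 'Bregman-ising' the optimal transport problem seems not to lead to essentially new problems.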
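The reduction can be checked by brute force on a small discrete instance. The sketch below assumes uniform measures on $n=5$ random points in $\mathbb{R}^{2}$ (so that couplings can be restricted to permutations), again with the illustrative potential $F\left(x\right)=\sum_{i}\exp\left(x_{i}\right)$; the common weight $1/n$ is dropped since it does not affect the minimiser. It checks that the Bregman total and the $L^{2}$ total in the transformed coordinates differ by the same constant for every permutation, and hence share a minimiser.

```python
import itertools
import math
import random

def F(x):
    # Assumed example potential: F(x) = sum_i exp(x_i).
    return sum(math.exp(xi) for xi in x)

def grad_F(x):
    return [math.exp(xi) for xi in x]

def bregman_cost(x, y):
    # c(x, y) = F(y) - F(x) - <grad F(x), y - x>
    g = grad_F(x)
    return F(y) - F(x) - sum(gi * (yi - xi) for gi, xi, yi in zip(g, x, y))

def half_sq_dist(u, v):
    # (1/2) |u - v|^2
    return 0.5 * sum((ui - vi) ** 2 for ui, vi in zip(u, v))

rng = random.Random(1)
n, d = 5, 2
xs = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(n)]
ys = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(n)]
x_hats = [grad_F(x) for x in xs]  # support points of the pushed-forward measure

def total(cost, sources, perm):
    # Total cost of the coupling that sends source i to target perm[i].
    return sum(cost(sources[i], ys[j]) for i, j in enumerate(perm))

perms = list(itertools.permutations(range(n)))
bregman_totals = [total(bregman_cost, xs, p) for p in perms]
l2_totals = [total(half_sq_dist, x_hats, p) for p in perms]

# The difference should be the same coupling-independent constant for every permutation.
offsets = [b - q for b, q in zip(bregman_totals, l2_totals)]
offset_spread = max(offsets) - min(offsets)

# The L2-optimal permutation should also attain the minimal Bregman cost.
best_l2 = min(range(len(perms)), key=l2_totals.__getitem__)
optimality_gap = bregman_totals[best_l2] - min(bregman_totals)

print(offset_spread, optimality_gap)
```

Enumerating all permutations is only feasible for very small $n$; for larger instances one would pass the squared-distance cost matrix to a dedicated assignment or OT solver instead.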
For the bonus problem, recall that the Brenier theorem says that, provided $\hat{\mu}$ is absolutely continuous and both measures have finite second moments, the $L^{2}$-optimal transport from $\hat{\mu}$ to $\nu$ is attained by a map of the form $T\left(\hat{x}\right)=\nabla\Phi\left(\hat{x}\right)$, for some convex $\Phi$. Since $\hat{\mu}$ is obtained from $\mu$ by transport under the map $\nabla F$, it follows that the Bregman-optimal mapping from $\mu$ to $\nu$ takes the form $\hat{T}\left(x\right)=\left(\nabla\Phi\circ\nabla F\right)\left(x\right)$, and we conclude.
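In one dimension, the structure $\hat{T}=\nabla\Phi\circ\nabla F$ has a simple consequence: both $\Phi'$ and $F'$ are nondecreasing, so the optimal map should be monotone. The sketch below checks a discrete analogue of this by brute force, assuming uniform measures on $n=6$ random points and the illustrative potential $F\left(x\right)=\exp\left(x\right)$. Note that Brenier's theorem itself does not apply to discrete measures, so this is only a heuristic illustration rather than a verification of the theorem.

```python
import itertools
import math
import random

def bregman_cost(x, y):
    # Bregman cost for the assumed potential F(x) = exp(x) in one dimension:
    # c(x, y) = e^y - e^x - e^x * (y - x)
    return math.exp(y) - math.exp(x) - math.exp(x) * (y - x)

rng = random.Random(2)
n = 6
xs = sorted(rng.uniform(-2, 2) for _ in range(n))  # sources, in increasing order
ys = [rng.uniform(-2, 2) for _ in range(n)]        # targets, in arbitrary order

def total(perm):
    # Total Bregman cost of sending xs[i] to ys[perm[i]].
    return sum(bregman_cost(xs[i], ys[j]) for i, j in enumerate(perm))

# Brute-force search over all assignments of sources to targets.
best = min(itertools.permutations(range(n)), key=total)
matched = [ys[j] for j in best]

# A monotone map sends increasing sources to increasing targets.
is_monotone = matched == sorted(matched)
print(is_monotone)
```

The same check with a non-monotone $F'$ (i.e. a non-convex $F$) would not be expected to produce a monotone matching, which is one way to see where convexity of $F$ enters.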