# Notation.
- Consider an image $I$ with $n$ pixels, to be classified into $k$ classes
- Let $X_i\in\{0, 1,\ldots,255\}$ be a random variable denoting the intensity of pixel $i$
- We use $\mathcal{X}_B, \mathcal{X}_A$ to denote the before-treatment and after-treatment images, and $\mathcal{F}_B, \mathcal{F}_A$ to denote their respective distribution functions
- To train a better model, we first transform the after-treatment image so that its distribution changes from $\mathcal{F}_A$ to $\mathcal{F}_B$:
$$
P(\mathcal{F}^{-1}_B(\mathcal{F}_A(\mathcal{X}_A)) \leq x) = P(\mathcal{F}_A(\mathcal{X}_A) \leq \mathcal{F}_B(x)) = P(\mathcal{X}_A \leq \mathcal{F}^{-1}_A(\mathcal{F}_B(x))) = \mathcal{F}_A(\mathcal{F}^{-1}_A(\mathcal{F}_B(x))) = \mathcal{F}_B(x)
$$
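- Therefore, $\mathcal{F}^{-1}_B(\mathcal{F}_A(\mathcal{X}_A)) \sim \mathcal{F}_B$. The equalities hold exactly when $\mathcal{F}_A$ and $\mathcal{F}_B$ are continuous and strictly increasing; for discrete pixel intensities the match is approximate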
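
The transform $\mathcal{F}^{-1}_B(\mathcal{F}_A(\cdot))$ can be sketched in code as histogram matching. The following is a minimal illustration, not a definitive implementation. It assumes each image is given as a flat list of 8-bit intensities, that $\mathcal{F}_A$ and $\mathcal{F}_B$ are estimated by empirical CDFs, and that $\mathcal{F}^{-1}_B$ is taken as the generalized inverse (the smallest level whose CDF value reaches the target). The names `empirical_cdf` and `match_histogram` are hypothetical and chosen only for this example.

```python
from bisect import bisect_left
from itertools import accumulate

LEVELS = 256  # assumption: 8-bit intensities in {0, ..., 255}


def empirical_cdf(pixels):
    """Return a list F where F[v] is the fraction of pixels with value <= v."""
    counts = [0] * LEVELS
    for p in pixels:
        counts[p] += 1
    total = len(pixels)
    return [c / total for c in accumulate(counts)]


def match_histogram(after, before):
    """Map each after-treatment pixel x to F_B^{-1}(F_A(x))."""
    cdf_a = empirical_cdf(after)   # estimate of F_A
    cdf_b = empirical_cdf(before)  # estimate of F_B
    # Generalized inverse of F_B: the smallest level v with F_B(v) >= F_A(x).
    # bisect_left works because an empirical CDF is non-decreasing.
    lookup = [
        min(bisect_left(cdf_b, cdf_a[v]), LEVELS - 1)
        for v in range(LEVELS)
    ]
    return [lookup[p] for p in after]


# Toy example with hypothetical pixel values.
before = [10, 10, 20, 30]
after = [100, 100, 150, 200]
matched = match_histogram(after, before)
```

In this toy case the two images have the same number of pixels and the same rank structure, so `matched` reproduces the before-treatment intensities. With real images the result only approximates $\mathcal{F}_B$, because ties in discrete intensities cannot be split by a pointwise mapping.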