###### tags: `one-offs` `markov chains` `functional inequalities`
# Logarithmic Sobolev Inequalities and Logarithmic Lipschitz Regularity
**Overview**: In this note, I describe some ideas which arose in the study of Logarithmic Sobolev Inequalities in two separate works, and offer some rough interpretations of what the two notions have in common.
## Ollivier's Log-Sobolev Inequality under Ricci Curvature
A little while ago, I was curious about a proposed definition for a "generalised gradient norm"-type object in the work of Yann Ollivier on Ricci curvature of Markov chains, given for $\lambda\geqslant0$ as:
\begin{align}
\left(\mathrm{D}^{\lambda}f\right)\left(x\right)=\sup\left\{ \frac{\left|f\left(y\right)-f\left(z\right)\right|}{\mathsf{d}\left(y,z\right)}\cdot\exp\left(-\lambda\cdot\mathsf{d}\left(x,y\right)-\lambda\cdot\mathsf{d}\left(y,z\right)\right):y,z\in\mathcal{X},y\neq z\right\}.
\end{align}
While the motivation for such an object was reasonably clear, at the time I couldn't quite see why this definition is a priori the right one (the later derivations make clear that it works out as needed). Ollivier goes on to use this definition to prove a functional inequality (akin to the Logarithmic Sobolev Inequality) for Markov chains which satisfy only a Ricci curvature condition and some additional stability conditions.
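To make the definition more concrete, here is a brute-force sketch of $\mathrm{D}^{\lambda}f$ on a finite metric space (the function and variable names are my own, and the example space is just a path with unit edge lengths):

```python
import itertools
import math

def lambda_gradient(f, d, points, lam):
    """Ollivier-style lambda-range gradient on a finite metric space:
    (D^lam f)(x) = sup over y != z of |f(y) - f(z)| / d(y, z)
                   * exp(-lam * (d(x, y) + d(y, z)))."""
    def at(x):
        return max(
            abs(f[y] - f[z]) / d[y][z]
            * math.exp(-lam * (d[x][y] + d[y][z]))
            for y, z in itertools.permutations(points, 2)
        )
    return {x: at(x) for x in points}

# Toy example: the path 0 - 1 - 2 with unit edge lengths.
points = [0, 1, 2]
d = {x: {y: abs(x - y) for y in points} for x in points}
f = {0: 0.0, 1: 1.0, 2: 3.0}

# With lam = 0, the sup no longer depends on x, and every value is just
# the global Lipschitz constant of f; positive lam localises the sup around x.
print(lambda_gradient(f, d, points, lam=0.0))
```

The exponential weighting is what makes the object local: pairs $(y,z)$ far from $x$ are discounted, so for $\lambda>0$ the quantity interpolates between a local slope and the global Lipschitz constant.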
## Salez-Tikhomirov-Youssef and Upgrading MLSI to LSI
Separately, I recently came across a similar notion in a newly-uploaded paper of Salez-Tikhomirov-Youssef, entitled 'Upgrading MLSI to LSI for reversible Markov chains'. In this paper, the authors are interested in translating between two notions of Logarithmic Sobolev Inequality for Markov chains on discrete spaces. They observe that the relevant calculations are particularly nice if one is working with functions which are Lipschitz at a logarithmic scale, in the following sense: say that a positive function $f$ is $r$-regular if for all adjacent pairs $\left(x,y\right)$, it holds that $f\left(x\right)\leqslant r\cdot f\left(y\right)$. Then, any $r$-regular function $f$ satisfies
\begin{align}
\frac{\mathcal{E}\left(f,\log f\right)}{\mathcal{E}\left(f^{1/2},f^{1/2}\right)}\in\left[4,\frac{r^{1/2}+1}{r^{1/2}-1}\cdot\log r\right],
\end{align}
noting that the lower bound holds for arbitrary $f>0$.
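One can check this interval numerically. Below is a small sketch (function names are mine, and I take the standard Dirichlet form $\mathcal{E}(g,h)=\frac{1}{2}\sum_{x,y}\pi(x)P(x,y)(g(x)-g(y))(h(x)-h(y))$ for a reversible chain) using a lazy walk on a 4-cycle and a function whose adjacent ratios all equal $r$ exactly:

```python
import math

def dirichlet(P, pi, g, h):
    """Dirichlet form E(g, h) = (1/2) sum_{x,y} pi(x) P(x,y) (g(x)-g(y)) (h(x)-h(y))."""
    n = len(pi)
    return 0.5 * sum(
        pi[x] * P[x][y] * (g[x] - g[y]) * (h[x] - h[y])
        for x in range(n) for y in range(n)
    )

# Lazy simple random walk on the 4-cycle; its stationary law pi is uniform.
n = 4
P = [[0.0] * n for _ in range(n)]
for x in range(n):
    P[x][x] = 0.5
    P[x][(x + 1) % n] = 0.25
    P[x][(x - 1) % n] = 0.25
pi = [1.0 / n] * n

# An alternating function whose adjacent ratios all equal r, hence exactly r-regular.
r = 2.0
f = [1.0, r, 1.0, r]
ratio = dirichlet(P, pi, f, [math.log(v) for v in f]) / dirichlet(
    P, pi, [math.sqrt(v) for v in f], [math.sqrt(v) for v in f]
)
upper = (math.sqrt(r) + 1) / (math.sqrt(r) - 1) * math.log(r)
assert 4.0 <= ratio <= upper + 1e-9
```

For this alternating choice of $f$, every edge contributes $(r-1)\log r$ to the numerator and $(r^{1/2}-1)^2$ to the denominator, so the ratio equals $\frac{r^{1/2}+1}{r^{1/2}-1}\cdot\log r$ exactly, i.e. the upper end of the interval is attained.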
Building on this observation, the authors introduce a construction for converting a given positive function $f$ into a function $f_{*}$ which obeys this logarithmic regularity property, and is not too far from the original $f$. In particular, they define
\begin{align}
f_{*,r}\left(x\right):=\max\left\{ r^{-\mathsf{d}\left(x,y\right)}\cdot f\left(y\right):y\in\mathcal{X}\right\} ,
\end{align}
which is not exactly the same as the construction of Ollivier, but nevertheless exhibits some harmony with it. The authors then show that by taking $r$ to have a certain value (depending on the transition properties of the Markov chain in question, but independently of $f$), one can ensure that
\begin{align}
\mathcal{E}\left(f_{*,r}^{1/2},f_{*,r}^{1/2}\right) &\leqslant \frac{4}{3}\cdot\mathcal{E}\left(f^{1/2},f^{1/2}\right) \\
\mathcal{E}\left(f_{*,r},\log f_{*,r}\right) &\leqslant \frac{4}{3}\cdot\mathcal{E}\left(f,\log f\right) \\
\mathrm{Ent}\left(f\right) &\leqslant 2\cdot\mathrm{Ent}\left(f_{*,r}\right).
\end{align}
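The regularisation itself is easy to compute and check on a toy example. Here is a sketch (names are my own; the space is a path with graph distance $\mathsf{d}(x,y)=|x-y|$), illustrating that $f_{*,r}$ dominates $f$ and is $r$-regular by construction:

```python
def regularise(f, d, points, r):
    """The log-Lipschitz regularisation f_{*,r}(x) = max_y r^{-d(x,y)} * f(y)."""
    return {x: max(r ** (-d[x][y]) * f[y] for y in points) for x in points}

# Toy example: the path 0 - 1 - 2 - 3 with graph distance |x - y|.
points = [0, 1, 2, 3]
d = {x: {y: abs(x - y) for y in points} for x in points}
f = {0: 1.0, 1: 100.0, 2: 1.0, 3: 1.0}  # a spike: far from r-regular
r = 2.0
fs = regularise(f, d, points, r)

# f_{*,r} dominates f (take y = x in the max) ...
assert all(fs[x] >= f[x] for x in points)
# ... and is r-regular: by the triangle inequality,
# r^{-d(x,z)} f(z) <= r^{d(x,y)} * r^{-d(y,z)} f(z), so fs[x] <= r * fs[y] on edges.
assert all(fs[x] <= r * fs[y] for x in points for y in points if abs(x - y) == 1)
```

The spike at $x=1$ gets smeared out geometrically at rate $r^{-1}$ in the distance, which is exactly the log-Lipschitz decay that the Dirichlet-form comparisons above exploit.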
From here, the authors go on to show that the a priori weaker Modified Logarithmic Sobolev Inequality implies a 'genuine' Logarithmic Sobolev Inequality, with explicit control of the constants which are involved.
For me, the combined story thus appears to be a bit clearer now:
1. When studying Logarithmic Sobolev Inequalities (and their relatives), it is relevant to think about log-Lipschitz functions (or perhaps densities), and
2. Given a positive function $f$, there are 'natural' ways in which to construct a regularised approximation to $f$ which has favourable regularity on the logarithmic scale.
Both of the cited works then seem to be engaging with these observations in order to prove more substantial results. It will be interesting to understand this connection more clearly.

Published on **HackMD**