one-offs
optimisation
smoothness
Overview: In this note, I discuss a nice result about Taylor expansions of convex functions, and about when those expansions are themselves convex.
There's a cool result from a couple of years ago which I think is really neat, and whose proof is simple enough to be worth presenting widely. The topic is approximating convex functions with (regularised) Taylor polynomials, and the (near-)preservation of convexity under taking Taylor expansions.
To begin with: it is not so hard to see that Taylor approximations of convex functions can easily fail to be convex: consider any convex function with non-vanishing third derivative, and take its third-order Taylor polynomial, which is a cubic with nonzero leading coefficient, and hence nonconvex on all of $\mathbf{R}$. As such, if we are interested in preserving convexity, it is natural to consider constraints and regularisation.
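For a concrete instance, here is a small sketch in sympy (the choice of $f = \exp$ and the expansion point $0$ are mine, purely for illustration):

```python
import sympy as sp

x = sp.symbols("x")
f = sp.exp(x)  # convex, with everywhere-positive third derivative

# Third-order Taylor polynomial of exp around 0
taylor3 = sum(f.diff(x, k).subs(x, 0) / sp.factorial(k) * x**k for k in range(4))
print(sp.expand(taylor3))   # x**3/6 + x**2/2 + x + 1

# Its second derivative is x + 1, which is negative for x < -1,
# so the Taylor polynomial is not convex on all of R.
print(taylor3.diff(x, 2))   # x + 1
```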
Focus now on convex functions $f$ whose derivatives of order $p$ are Lipschitz continuous, i.e.

$$\| \nabla^p f(x) - \nabla^p f(y) \|_{\mathrm{op}} \leq L_p \| x - y \| \qquad \text{for all } x, y.$$
This is a sensible constraint to impose on $f$, in the sense that such functions will deviate from local approximations by controllable amounts.
Fortunately, this constraint also implies a number of additional bounds on the growth of $f$, by integration. For example, writing $\Phi_{x, p}$ for the $p$th-order Taylor approximation of $f$ around $x$, the difference between $\nabla^k f(y)$ and the corresponding derivative of the Taylor approximation from $x$ at $y$ can be bounded in operator norm as

$$\| \nabla^k f(y) - \nabla^k \Phi_{x, p}(y) \|_{\mathrm{op}} \leq \frac{L_p}{(p + 1 - k)!} \| y - x \|^{p + 1 - k}, \qquad 0 \leq k \leq p.$$

In particular, taking $k = 2$, we see that the Hessian of $f$ at $y$ can be bounded from above (in semidefinite order) by the Hessian of the $p$th-order Taylor approximation at $y$, plus $\frac{L_p}{(p - 1)!} \| y - x \|^{p - 1} \cdot I$.
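As a quick numerical sanity check of the $k = 2$ bound, here is a sketch in the 1-d setting (my illustrative choices: $f(x) = \log(1 + e^x)$, $p = 3$, and $L_3 = 1/8$, which is valid since $|f''''| \leq 1/8$ everywhere):

```python
import numpy as np
import sympy as sp

x, y = sp.symbols("x y")
f = sp.log(1 + sp.exp(x))  # convex; f''' is (1/8)-Lipschitz since |f''''| <= 1/8
p, L, x0 = 3, sp.Rational(1, 8), 0

# Third-order Taylor polynomial of f around x0, as a function of y
Phi = sum(f.diff(x, k).subs(x, x0) / sp.factorial(k) * (y - x0)**k
          for k in range(p + 1))

# Check |f''(y) - Phi''(y)| <= L / (p-1)! * |y - x0|^(p-1) on a grid
hess_gap = sp.lambdify(y, sp.Abs(f.diff(x, 2).subs(x, y) - Phi.diff(y, 2)))
bound = sp.lambdify(y, L / sp.factorial(p - 1) * sp.Abs(y - x0)**(p - 1))

grid = np.linspace(-4.0, 4.0, 201)
assert all(hess_gap(t) <= bound(t) + 1e-12 for t in grid)
```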
Separately, one can prove that the Hessian of the mapping $y \mapsto \frac{p L_p}{(p + 1)!} \| y - x \|^{p + 1}$ is bounded from below (again in semidefinite order) by this same matrix. As such, one can deduce that the Hessian of the regularised Taylor polynomial

$$\Omega_{x, p}(y) := \Phi_{x, p}(y) + \frac{p L_p}{(p + 1)!} \| y - x \|^{p + 1}$$
dominates the Hessian of $f$, where $\Phi_{x, p}$ is the Taylor polynomial from above. Recalling now that $f$ is convex, so that $\nabla^2 f \succeq 0$, it thus follows that this approximation is also convex! Moreover, it is also an upper bound to $f$. This upper-bound property actually holds already for the smaller regularisation constant $L_p$ in place of $p L_p$.
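Continuing the same 1-d example, the following sketch builds the regularised model and checks both properties on a grid (again, the expansion point $x = 1$ and all constants are my illustrative choices):

```python
import numpy as np
import sympy as sp

x, y = sp.symbols("x y")
f = sp.log(1 + sp.exp(x))          # convex, with L_3 = 1/8 as before
p, L, x0 = 3, sp.Rational(1, 8), 1

# Taylor polynomial Phi_{x0,p} and regularised model Omega_{x0,p}
Phi = sum(f.diff(x, k).subs(x, x0) / sp.factorial(k) * (y - x0)**k
          for k in range(p + 1))
Omega = Phi + p * L / sp.factorial(p + 1) * (y - x0)**(p + 1)  # p+1 even: no |.| needed in 1-d

Phi2 = sp.lambdify(y, Phi.diff(y, 2))
Omega2 = sp.lambdify(y, Omega.diff(y, 2))
fval = sp.lambdify(y, f.subs(x, y))
Oval = sp.lambdify(y, Omega)

grid = np.linspace(-6.0, 10.0, 401)
print(min(Phi2(t) for t in grid))           # negative: the bare cubic is nonconvex
assert all(Omega2(t) >= 0 for t in grid)    # the regularised model is convex
assert all(Oval(t) >= fval(t) - 1e-12 for t in grid)  # ...and majorises f
```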
As a consequence, we see that given a smooth, convex function $f$, it is feasible to form an approximation to it at a point $x$, with high-order accuracy in the neighbourhood of $x$, and global convexity. The practical implication is that one can construct improved optimisation methods based on these high-order majorants, with realistic hopes of actually minimising them at each step, by virtue of their convexity.
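As a rough sketch of what such a method might look like, the following loop repeatedly minimises the convex majorant from the current iterate (the objective, the starting point, and the use of scipy.optimize.minimize_scalar for the inner problem are all my own illustrative choices, not the algorithm from the paper):

```python
import sympy as sp
from scipy.optimize import minimize_scalar

x, y = sp.symbols("x y")
f = sp.log(1 + sp.exp(x)) + x**2 / 10  # convex; f''' is still (1/8)-Lipschitz
p, L = 3, sp.Rational(1, 8)

def model(x0):
    """Regularised Taylor model Omega_{x0,p} of f, as a callable in y."""
    Phi = sum(f.diff(x, k).subs(x, x0) / sp.factorial(k) * (y - x0)**k
              for k in range(p + 1))
    Omega = Phi + p * L / sp.factorial(p + 1) * (y - x0)**(p + 1)
    return sp.lambdify(y, Omega)

xk = 3.0  # arbitrary starting point
for _ in range(10):
    # Minimise the convex majorant; its minimiser decreases f, since
    # f(x_{k+1}) <= Omega(x_{k+1}) <= Omega(x_k) = f(x_k).
    xk = minimize_scalar(model(xk)).x
print(xk, float(f.subs(x, xk)))
```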
The main result here is given in Nesterov's paper on implementable tensor methods in unconstrained convex optimisation, and the algorithmic implications are explored in a number of related works by the same author and coauthors.