---
tags: optimization
---
# Convex Sets and Convex Functions
## Affine and Convex Sets
### Affine sets
> **Def**
>
> Suppose $x_1 \neq x_2$ are two points in $\mathbb R^n$. The set
> $$
> \left\{ \theta x_1 + (1-\theta) x_2 \middle | \theta \in \mathbb R \right\}
> $$
> forms the ***line*** passing through $x_1$ and $x_2$. If we constrain $\theta \in [0, 1]$, then the corresponding set is the (closed) ***line segment*** between $x_1$ and $x_2$.
> **Def**
>
> A set $C \subseteq \mathbb R^n$ is ***affine*** if the line through any two distinct points in $C$ lies in $C$. In other words, $C$ contains every combination $\theta x_1 + (1 - \theta) x_2$, $\theta \in \mathbb R$, of any two points $x_1, x_2 \in C$.
> **Def**
>
> A point of the form $\theta_1 x_1 + \cdots + \theta_k x_k$, where $\theta_1 + \cdots + \theta_k = 1$, is an ***affine combination*** of the points $x_1, \ldots, x_k$.
> **Proposition**
>
> An affine set contains every affine combination of its points.
>
> *proof*
>
> This follows by induction on the number of points: for $k > 2$ points with (say) $\theta_k \neq 1$, write $\theta_1 x_1 + \cdots + \theta_k x_k = (1 - \theta_k) \sum_{i=1}^{k-1} \frac{\theta_i}{1 - \theta_k} x_i + \theta_k x_k$, note that the inner coefficients sum to $1$, and apply the two-point case.
> **Proposition**
>
> If $C$ is an affine set and $x_0 \in C$, then the set
> $$
> V = C - x_0 := \left\{ x - x_0 \middle| x \in C \right\}
> $$
> is a subspace.
>
> *proof*
>
> Note first that $0 = x_0 - x_0 \in V$, so it suffices to show closure under linear combinations: for all $v_1, v_2 \in V$ and $\alpha, \beta \in \mathbb R$,
> $$
> \alpha v_1 + \beta v_2 \in V, \quad \text{i.e.,} \quad \alpha v_1 + \beta v_2 + x_0 \in C.
> $$
> Since $v_1, v_2 \in V$, there exist $x_1, x_2 \in C$ such that $v_1 = x_1 - x_0$ and $v_2 = x_2 - x_0$. Thus, we have
> $$
> \alpha v_1 + \beta v_2 + x_0 = \alpha x_1 + \beta x_2 + (1 - \alpha - \beta) x_0
> $$
> Since $\alpha + \beta + (1 - \alpha - \beta) = 1$ and $C$ is an affine set, we have $\alpha v_1 + \beta v_2 + x_0 \in C$, which completes the proof.
The subspace $V$ does *not* depend on the choice of $x_0$, so $x_0$ can be chosen as any point in $C$. This makes it well-defined to call $V$ the subspace *associated with* the affine set $C$, and to define the ***dimension*** of $C$ as the dimension of $V$.
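A quick numerical illustration (a minimal sketch assuming `numpy`; the hyperplane $\{x : a^T x = b\}$ and the helper `sample_C` are chosen just for this check):

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = np.array([1.0, 2.0, -1.0]), 3.0   # C = {x in R^3 : a^T x = b}, an affine set

def sample_C():
    # project a random point onto C along the normal direction a
    z = rng.normal(size=3)
    return z - ((a @ z - b) / (a @ a)) * a

x0 = sample_C()
v1 = sample_C() - x0                      # elements of V = C - x0
v2 = sample_C() - x0
alpha, beta = 2.5, -0.7                   # arbitrary, not restricted to sum to 1
w = alpha * v1 + beta * v2
assert abs(a @ (w + x0) - b) < 1e-9       # w + x0 in C, i.e. w in V
```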
### Convex Sets
> **Def**
>
> A set $C$ is ***convex*** if
> $$
> \forall x_1, x_2 \in C, \theta \in [0, 1], \theta x_1 + (1 - \theta) x_2 \in C
> $$
> **Def**
>
> A point of the form $\theta_1 x_1 + \cdots + \theta_k x_k$, where $\theta_1 + \cdots + \theta_k = 1$ and $\theta_i \ge 0$, $i = 1, \ldots, k$, is a ***convex combination*** of the points $x_1, \ldots, x_k$.
> **Proposition**
>
> A set is convex if and only if it contains every convex combination of its points.
> **Def**
>
> The ***convex hull*** of a set $C$, denoted $\operatorname{conv} C$, is the set of all convex combinations of points in $C$:
> $$
> \operatorname{conv} C = \{ \theta_1 x_1 + \cdots + \theta_k x_k \mid x_i \in C,\ \theta_i \ge 0,\ i = 1, \ldots, k,\ \theta_1 + \cdots + \theta_k = 1 \}
> $$
> **Remark**
>
> The convex hull of $C$ is the smallest convex set that contains $C$: if $B$ is any convex set with $C \subseteq B$, then $\operatorname{conv} C \subseteq B$.
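For a finite point set the convex hull can be computed directly; a small sketch, assuming `scipy` is installed:

```python
import numpy as np
from scipy.spatial import ConvexHull

rng = np.random.default_rng(0)
pts = rng.random((30, 2))          # 30 random points in the unit square
hull = ConvexHull(pts)
print(pts[hull.vertices])          # the extreme points, in counterclockwise order
# every other point is a convex combination of these hull vertices
```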
## Operations That Preserve Convexity
### Affine Functions
> **Def**
>
> A function $f: \mathbb R^n \rightarrow \mathbb R^m$ is ***affine*** if it is a sum of a linear function *and a constant*, i.e., it has the form
> $$
> f(x) = Ax + b,
> $$
> where $A \in \mathbb R^{m \times n}$ and $b \in \mathbb R^m$.
Affine functions preserve convexity:
> **Proposition**
>
> Suppose $S \subseteq \mathbb R^n$ is convex and $f: \mathbb R^n \rightarrow \mathbb R^m$ is an affine function. Then the image of $S$ under $f$,
> $$
> f(S) = \{f(x) \mid x \in S\},
> $$
> is convex.
>
> *proof*
>
> Suppose that $x_1, x_2 \in S$. Let $y_1 = f(x_1)$ and $y_2 = f(x_2)$. Since $f$ is affine, we can re-write $y_1$ and $y_2$ as
> $$
> y_1 = Ax_1 + b \\
> y_2 = Ax_2 + b
> $$
> Therefore, $\forall \theta \in [0, 1]$, we have
> $$
> \theta y_1 + (1 - \theta) y_2 = \theta(A x_1 + b) + (1 - \theta) (A x_2 + b) = A(\theta x_1 + (1 - \theta) x_2 ) + b
> $$
> Since $S$ is convex, $\theta x_1 + (1 - \theta) x_2 \in S$, so $\theta y_1 + (1 - \theta) y_2 = f(\theta x_1 + (1 - \theta) x_2) \in f(S)$. Hence an arbitrary convex combination of $y_1$ and $y_2$ is in $f(S)$, which implies that $f(S)$ is convex.
A similar result holds in the converse direction:
> **Proposition**
>
> Suppose that $f: \mathbb R^n \rightarrow \mathbb R^m$ is an affine function and $S \subseteq \mathbb R^m$ is convex. Then the inverse image of $S$ under $f$,
> $$
> f^{-1}(S) = \{x \mid f(x) \in S\},
> $$
> is convex.
>
> *proof*
>
> If $x_1, x_2 \in f^{-1}(S)$ and $\theta \in [0, 1]$, then $f(\theta x_1 + (1 - \theta) x_2) = \theta f(x_1) + (1 - \theta) f(x_2) \in S$, since $f$ is affine and $S$ is convex.
Some examples of affine functions:
* Scaling and translation.
* The sum function $f: \mathbb R^m \times \mathbb R^m \rightarrow \mathbb R^m$, defined by $f((x_1, x_2)) = x_1 + x_2$, is affine. If $S_1$ and $S_2$ are convex, so is $S_1 \times S_2$. Therefore, by the proposition above, the sum
$$
S_1 + S_2 = f(S_1 \times S_2) = \{x + y \mid x \in S_1, y \in S_2\}
$$
is also convex (a numerical sanity check follows below).
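A sketch of such a check, assuming `numpy`. It uses the fact that for the unit Euclidean ball $S_1$ and the box $S_2 = [-0.5, 0.5]^2$, a point $z$ lies in $S_1 + S_2$ exactly when its distance to $S_2$ is at most $1$ (the helper name `in_sum` is just illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def in_sum(z):
    # S1 = Euclidean unit ball, S2 = the box [-0.5, 0.5]^2 (both convex).
    # z in S1 + S2  iff  dist(z, S2) <= 1; the nearest box point is a clip.
    nearest = np.clip(z, -0.5, 0.5)
    return np.linalg.norm(z - nearest) <= 1.0

# random convex combinations of members should stay in the sum
for _ in range(10_000):
    z1, z2 = rng.uniform(-1.5, 1.5, size=(2, 2))
    if in_sum(z1) and in_sum(z2):
        theta = rng.uniform()
        assert in_sum(theta * z1 + (1 - theta) * z2)
```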
## Convex Functions
> **Definition**
>
> A function $f: \mathbb R^n \rightarrow \mathbb R$ is ***convex*** if $\operatorname{dom} f$ is a convex set and, $\forall x, y \in \operatorname{dom} f$ and $\theta \in [0, 1]$, we have
> $$
> \begin{align}
> f(\theta x + (1 - \theta) y) \le \theta f(x) + (1 - \theta) f(y) \tag{1} \label{convex function}
> \end{align}
> $$
> If the strict inequality holds in ($\ref{convex function}$) whenever $x \neq y$ and $0 < \theta < 1$, then we say $f$ is ***strictly convex***.
> **Definition**
>
> A function $f$ is ***concave*** if $-f$ is convex.
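The defining inequality suggests a simple randomized falsification test; a minimal sketch assuming `numpy` (the helper `convexity_violations` is mine, not a standard routine):

```python
import numpy as np

def convexity_violations(f, dim=1, trials=10_000, seed=0):
    """Random search for violations of f(tx + (1-t)y) <= t f(x) + (1-t) f(y).
    Finding none is evidence of convexity, not a proof."""
    rng = np.random.default_rng(seed)
    count = 0
    for _ in range(trials):
        x, y = rng.normal(size=(2, dim))
        t = rng.uniform()
        if f(t * x + (1 - t) * y) > t * f(x) + (1 - t) * f(y) + 1e-9:
            count += 1
    return count

print(convexity_violations(lambda x: float(x @ x)))         # 0: ||x||^2 is convex
print(convexity_violations(lambda x: float(np.sin(x[0]))))  # > 0: sin is not convex
```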
### Examples
The following are examples of convex functions.
> **Norms**
>
> If $f: \mathbb R^n \rightarrow \mathbb R$ is a norm, then $f$ is convex.
>
> *proof*
>
> For $x_1, x_2 \in \mathbb R^n$ and $\theta \in [0, 1]$, we have
> $$
> f(\theta x_1 + (1-\theta) x_2) \le f(\theta x_1) + f((1-\theta)x_2) = \theta f(x_1) + (1-\theta) f(x_2),
> $$
> where the inequality is the triangle inequality and the equality holds by the absolute homogeneity of the norm (since $\theta, 1 - \theta \ge 0$).
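The same inequality can be spot-checked numerically for several norms; a minimal sketch assuming `numpy`:

```python
import numpy as np

rng = np.random.default_rng(0)
for ord_ in (1, 2, np.inf):                     # l1, l2 and l-infinity norms
    for _ in range(5_000):
        x, y = rng.normal(size=(2, 4))
        t = rng.uniform()
        lhs = np.linalg.norm(t * x + (1 - t) * y, ord_)
        rhs = t * np.linalg.norm(x, ord_) + (1 - t) * np.linalg.norm(y, ord_)
        assert lhs <= rhs + 1e-9                # triangle inequality + homogeneity
```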
## Operations That Preserve Convexity
### Nonnegative Weighted Sums
> **Proposition**
>
> A nonnegative weighted sum of convex functions,
> $$
> f = w_1 f_1 + \cdots + w_n f_n,
> $$
> where $w_i \ge 0$ and each $f_i$ is convex, is convex.
>
> *proof*
>
> First show that $w_1 f_1 + w_2 f_2$ is convex for convex $f_1, f_2$ and $w_1, w_2 \ge 0$, by applying ($\ref{convex function}$) to each term and adding. Then generalize by induction on the number of terms.
This property extends to infinite sums and integrals. For example, if $f(x, y)$ is convex in $x$ for each $y \in \mathcal A$, and $w(y) \ge 0$ for all $y \in \mathcal A$, then the function $g$ defined as
$$
g(x) = \int_{\mathcal A} w(y) f(x, y) dy
$$
is convex in $x$ (provided the integral exists).
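A small numerical illustration of the integral case, assuming `scipy`. Here $f(x, y) = (x - y)^2$, which is convex in $x$ for every $y$, and $w \equiv 1$ on $\mathcal A = [0, 1]$, so $g(x) = x^2 - x + 1/3$:

```python
import numpy as np
from scipy.integrate import quad

def g(x):
    # g(x) = integral over [0, 1] of (x - y)^2 dy; closed form: x^2 - x + 1/3
    value, _ = quad(lambda y: (x - y) ** 2, 0.0, 1.0)
    return value

# second-difference test: convexity implies g(x-h) - 2 g(x) + g(x+h) >= 0
h = 0.1
for x in np.linspace(-2.0, 2.0, 9):
    assert g(x - h) - 2 * g(x) + g(x + h) >= -1e-9
```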
### Composition with an Affine Mapping
> **Proposition**
>
> Suppose $f: \mathbb R^n \rightarrow \mathbb R$ is convex, $A \in \mathbb R^{n \times m}$, and $b \in \mathbb R^n$. Define $g: \mathbb R^m \rightarrow \mathbb R$ by
> $$
> g(x) = f(Ax + b),
> $$
> with $\operatorname{dom} g = \{x \mid Ax + b \in \operatorname{dom} f\}$. Then $g$ is also convex.
>
> *proof*
>
> Suppose $x_1, x_2 \in \operatorname{dom}g$ and $\theta \in [0, 1]$, then
> $$
> \begin{align}
> g(\theta x_1 + (1 - \theta) x_2) &= f(\theta A x_1 + (1- \theta) Ax_2 + b) \\
> & = f(\theta(Ax_1 + b) + (1 - \theta)(Ax_2 + b)) \\
> & \le \theta g(x_1) + (1 - \theta) g(x_2)
> \end{align}
> $$
> The last inequality holds due to the convexity of $f$.
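For instance, composing a norm (convex, as shown earlier) with an affine map gives a convex function; a quick randomized check, assuming `numpy`:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 3))
b = rng.normal(size=5)
g = lambda x: np.linalg.norm(A @ x + b)   # convex norm composed with an affine map

for _ in range(5_000):
    x1, x2 = rng.normal(size=(2, 3))
    t = rng.uniform()
    assert g(t * x1 + (1 - t) * x2) <= t * g(x1) + (1 - t) * g(x2) + 1e-9
```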
### Pointwise Maximum and Supremum
> **Def**
>
> If $f_1$ and $f_2$ are convex functions, then their ***pointwise maximum*** $f$ is defined by
> $$
> f(x) = \max \{f_1(x), f_2(x)\},
> $$
> with $\operatorname{dom} f = \operatorname{dom}f_1 \cap \operatorname{dom} f_2$.
> **Proposition**
>
> If $f_1$ and $f_2$ are convex, then their pointwise maximum, denoted $f$, is also convex.
>
> *proof*
>
> For any $x_1, x_2 \in \operatorname{dom} f$ and $\theta \in [0, 1]$, we have
> $$
> \begin{align}
> f(\theta x_1 + (1-\theta)x_2)
> &= \max \{f_1(\theta x_1 + (1-\theta)x_2), f_2(\theta x_1 + (1-\theta)x_2)\} \\
> & \le \max \{\theta f_1(x_1) + (1-\theta)f_1(x_2), \theta f_2(x_1) + (1-\theta)f_2(x_2)\} \\
> & \le \theta \max \{f_1(x_1), f_2(x_1)\} + (1-\theta) \max \{f_1(x_2), f_2(x_2)\} \\
> & = \theta f(x_1) + (1-\theta) f(x_2)
> \end{align}
> $$
>
> **Remark**
>
> It can be easily shown that if $f_1, \ldots, f_n$ are convex, then their pointwise maximum
> $$
> f(x) = \max \{f_1(x), \ldots, f_n(x)\}
> $$
> is also convex.
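A classic instance is the pointwise maximum of finitely many affine functions, which is piecewise linear and convex; a sketch assuming `numpy`:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=6)                # six affine pieces a_i * x + b_i
b = rng.normal(size=6)
f = lambda x: np.max(a * x + b)       # pointwise max: piecewise linear, convex

for _ in range(5_000):
    x1, x2 = rng.normal(size=2)
    t = rng.uniform()
    assert f(t * x1 + (1 - t) * x2) <= t * f(x1) + (1 - t) * f(x2) + 1e-9
```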
The pointwise maximum property extends to the pointwise supremum over an infinite set of convex functions:
> **Proposition**
>
> If for each $y \in \mathcal A$, $f(x, y)$ is convex in $x$, then the function $g$ defined by
> $$
> g(x) = \sup_{y \in \mathcal A} f(x, y)
> $$
> is convex in $x$. Here the domain of $g$ is
> $$
> \operatorname{dom} g = \{x \mid (x, y) \in \operatorname{dom} f \text{ for all } y \in \mathcal A, \sup_{y\in\mathcal A} f(x, y) < \infty\}
> $$
> **Example**
>
> Let $C \subseteq \mathbb R^n$ with $C \neq \emptyset$. The ***support function*** $S_C$ associated with the set $C$ is defined as
> $$
> S_C(x) = \sup \{x^T y \mid y \in C\}.
> $$
> For each $y \in C$, $x^T y$ is a linear function of $x$, so $S_C$ is the pointwise supremum of a family of linear functions, **hence convex**.
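For a finite set $C$ the supremum is a maximum, which makes the convexity of $S_C$ easy to spot-check; a sketch assuming `numpy`:

```python
import numpy as np

rng = np.random.default_rng(0)
C = rng.normal(size=(30, 2))          # a finite set C in R^2
S = lambda x: np.max(C @ x)           # S_C(x) = max over y in C of x^T y

for _ in range(5_000):
    x1, x2 = rng.normal(size=(2, 2))
    t = rng.uniform()
    assert S(t * x1 + (1 - t) * x2) <= t * S(x1) + (1 - t) * S(x2) + 1e-9
```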
> **Example**
>
> Let $C \subseteq \mathbb R^n$. The distance to the farthest point of $C$,
> $$
> f(x) = \sup_{y \in C} \| x - y\|,
> $$
> where $\|\cdot\|$ is any norm, is convex.
>
> *proof*
>
> For each $y \in C$, $\| x - y \|$ is convex in $x$ (a norm composed with an affine map), so $f$ is the pointwise supremum of a family of convex functions and hence convex.
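A contrast worth noting: the supremum construction makes $f$ convex even when $C$ itself is not convex, whereas the distance to the *nearest* point of a nonconvex set can fail to be convex. A tiny check, assuming `numpy` (`far` and `near` are illustrative names):

```python
import numpy as np

C = np.array([-1.0, 1.0])                  # a *nonconvex* two-point set in R
far = lambda x: np.max(np.abs(x - C))      # sup over y in C of |x - y|: convex
near = lambda x: np.min(np.abs(x - C))     # inf over y in C of |x - y|: not convex here

# midpoint test with x1 = -1, x2 = 1, theta = 1/2:
assert far(0.0) <= 0.5 * (far(-1.0) + far(1.0))     # 1 <= 2, holds
assert near(0.0) > 0.5 * (near(-1.0) + near(1.0))   # 1 > 0: convexity fails
```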