---
tags: optimization
---

# Convex Sets and Convex Functions

## Affine and Convex Sets

### Affine Sets

> **Def**
>
> Suppose $x_1 \neq x_2$ are two points in $\mathbb R^n$. The set
> $$
> \left\{ \theta x_1 + (1-\theta) x_2 \,\middle|\, \theta \in \mathbb R \right\}
> $$
> forms the ***line*** passing through $x_1$ and $x_2$. If we constrain $\theta \in [0, 1]$, then the corresponding set is the (closed) ***line segment*** between $x_1$ and $x_2$.

> **Def**
>
> A set $C \subseteq \mathbb R^n$ is ***affine*** if the line through any two distinct points in $C$ lies in $C$. In other words, $C$ contains every linear combination of any two of its points, provided the coefficients sum to one.

> **Def**
>
> A point of the form $\theta_1 x_1 + \cdots + \theta_k x_k$, where $\theta_1 + \cdots + \theta_k = 1$, is an ***affine combination*** of the points $x_1, \ldots, x_k$.

> **Proposition**
>
> An affine set contains every affine combination of its points.
>
> *proof*
>
> This follows easily by induction on the number of points.

> **Proposition**
>
> If $C$ is an affine set and $x_0 \in C$, then the set
> $$
> V = C - x_0 := \left\{ x - x_0 \,\middle|\, x \in C \right\}
> $$
> is a subspace.
>
> *proof*
>
> We need to show that $\forall v_1, v_2 \in V$ and $\alpha, \beta \in \mathbb R$, we have $\alpha v_1 + \beta v_2 \in V$, which is equivalent to
> $$
> \alpha v_1 + \beta v_2 + x_0 \in C.
> $$
> Since $v_1, v_2 \in V$, there exist $x_1, x_2 \in C$ such that $v_1 = x_1 - x_0$ and $v_2 = x_2 - x_0$. Thus, we have
> $$
> \alpha v_1 + \beta v_2 + x_0 = \alpha x_1 + \beta x_2 + (1 - \alpha - \beta) x_0.
> $$
> Since $\alpha + \beta + (1 - \alpha - \beta) = 1$ and $C$ is an affine set, we have $\alpha v_1 + \beta v_2 + x_0 \in C$, which completes the proof.

The subspace $V$ does *not* depend on the choice of $x_0$, so $x_0$ can be chosen as any point in $C$. In particular, the ***dimension*** of an affine set $C$ is well-defined as the dimension of the subspace $V = C - x_0$ for any $x_0 \in C$.

### Convex Sets

> **Def**
>
> A set $C$ is ***convex*** if
> $$
> \forall x_1, x_2 \in C,\ \theta \in [0, 1]: \quad \theta x_1 + (1 - \theta) x_2 \in C.
> $$

> **Def**
>
> A point of the form $\theta_1 x_1 + \cdots + \theta_k x_k$, where $\theta_1 + \cdots + \theta_k = 1$ and $\theta_i \ge 0$ for $i = 1, \ldots, k$, is a ***convex combination*** of the points $x_1, \ldots, x_k$.

> **Proposition**
>
> A set is convex if and only if it contains every convex combination of its points.

> **Def**
>
> The ***convex hull*** of a set $C$, denoted $\operatorname{conv} C$, is the set of all convex combinations of points in $C$:
> $$
> \operatorname{conv} C = \{ \theta_1 x_1 + \cdots + \theta_k x_k \mid x_i \in C,\ \theta_i \ge 0,\ i = 1, \ldots, k,\ \theta_1 + \cdots + \theta_k = 1\}.
> $$

> **Remark**
>
> The convex hull of $C$ is the smallest convex set that contains $C$.
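To make the convex-combination definition concrete: for a *finite* set $C$, deciding whether a point lies in $\operatorname{conv} C$ is a small linear feasibility problem. Below is a minimal sketch in Python (assuming NumPy and SciPy are available; the triangle and the test points are made up for illustration):

```python
import numpy as np
from scipy.optimize import linprog

def in_convex_hull(x, points):
    """Feasibility test: is x a convex combination of the rows of `points`?"""
    k = len(points)
    # Constraints: points.T @ theta = x  and  sum(theta) = 1.
    A_eq = np.vstack([points.T, np.ones((1, k))])
    b_eq = np.append(x, 1.0)
    # Zero objective: we only need feasibility. linprog's default bounds
    # already enforce theta_i >= 0.
    res = linprog(c=np.zeros(k), A_eq=A_eq, b_eq=b_eq)
    return res.success

triangle = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
print(in_convex_hull(np.array([0.2, 0.2]), triangle))  # True: inside
print(in_convex_hull(np.array([1.0, 1.0]), triangle))  # False: outside
```

The `res.success` flag is `True` exactly when some convex combination of the rows of `points` equals `x`, i.e., when `x` belongs to the convex hull of those points.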
## Operations That Preserve Convexity

### Affine Functions

> **Def**
>
> A function $f: \mathbb R^n \rightarrow \mathbb R^m$ is ***affine*** if it is a sum of a linear function *and a constant*, i.e., it has the form
> $$
> f(x) = Ax + b,
> $$
> where $A \in \mathbb R^{m \times n}$ and $b \in \mathbb R^m$.

Affine functions preserve convexity:

> **Proposition**
>
> Suppose $S \subseteq \mathbb R^n$ is convex and $f: \mathbb R^n \rightarrow \mathbb R^m$ is an affine function. Then the image of $S$ under $f$,
> $$
> f(S) = \{f(x) \mid x \in S\},
> $$
> is convex.
>
> *proof*
>
> Suppose that $x_1, x_2 \in S$. Let $y_1 = f(x_1)$ and $y_2 = f(x_2)$. Since $f$ is affine, we can rewrite $y_1$ and $y_2$ as
> $$
> y_1 = Ax_1 + b, \qquad y_2 = Ax_2 + b.
> $$
> Therefore, $\forall \theta \in [0, 1]$, we have
> $$
> \theta y_1 + (1 - \theta) y_2 = \theta(A x_1 + b) + (1 - \theta) (A x_2 + b) = A(\theta x_1 + (1 - \theta) x_2) + b.
> $$
> Since $S$ is convex, $\theta x_1 + (1 - \theta) x_2 \in S$. Therefore, an arbitrary convex combination of $y_1$ and $y_2$ is in $f(S)$, which implies that $f(S)$ is convex.

A similar result holds in the converse direction:

> **Proposition**
>
> Suppose that $S \subseteq \mathbb R^m$ is convex and $f: \mathbb R^n \rightarrow \mathbb R^m$ is an affine function. Then the inverse image of $S$ under $f$,
> $$
> f^{-1}(S) = \{x \mid f(x) \in S\},
> $$
> is convex.

Some examples of affine functions:

* Scaling and translation.
* The function $f: \mathbb R^m \times \mathbb R^m \rightarrow \mathbb R^m$ defined by $f((x_1, x_2)) = x_1 + x_2$ is affine. If $S_1$ and $S_2$ are convex, so is the product $S_1 \times S_2$. Therefore, by the proposition above, the sum
  $$
  S_1 + S_2 = f(S_1 \times S_2) = \{x + y \mid x \in S_1, y \in S_2\}
  $$
  is also convex.

## Convex Functions

> **Def**
>
> A function $f: \mathbb R^n \rightarrow \mathbb R$ is ***convex*** if $\operatorname{dom} f$ is a convex set and $\forall x, y \in \operatorname{dom} f$ and $\theta \in [0, 1]$, we have
> $$
> \begin{align}
> f(\theta x + (1 - \theta) y) \le \theta f(x) + (1 - \theta) f(y) \tag{1} \label{convex function}
> \end{align}
> $$
> If the strict inequality holds in ($\ref{convex function}$) whenever $x \neq y$ and $0 < \theta < 1$, then we say $f$ is ***strictly convex***.

> **Def**
>
> A function $f$ is ***concave*** if $-f$ is convex.

### Examples

The following are examples of convex functions.

> **Norms**
>
> If $f: \mathbb R^n \rightarrow \mathbb R$ is a norm, then $f$ is convex.
>
> *proof*
>
> For $x_1, x_2 \in \mathbb R^n$ and $\theta \in [0, 1]$, we have
> $$
> f(\theta x_1 + (1-\theta) x_2) \le f(\theta x_1) + f((1-\theta)x_2) = \theta f(x_1) + (1-\theta) f(x_2),
> $$
> where the inequality is the triangle inequality and the equality holds by the absolute homogeneity of norms (note that $\theta, 1 - \theta \ge 0$).

## Operations That Preserve Convexity

### Nonnegative Weighted Sums

> **Proposition**
>
> A nonnegative weighted sum of convex functions,
> $$
> f = w_1 f_1 + \cdots + w_n f_n,
> $$
> where $w_i \ge 0$ and $f_i$ are convex, is convex.
>
> *proof*
>
> First we show that $w_1 f_1 + w_2 f_2$ is convex for convex functions $f_1, f_2$ and $w_1, w_2 \ge 0$: multiplying inequality ($\ref{convex function}$) for each $f_i$ by $w_i \ge 0$ and adding the results gives inequality ($\ref{convex function}$) for $w_1 f_1 + w_2 f_2$. The general case then follows by induction.

This property extends to infinite sums and integrals. For example, if $f(x, y)$ is convex in $x$ for each $y \in \mathcal A$, and $w(y) \ge 0$ for all $y \in \mathcal A$, then the function $g$ defined as
$$
g(x) = \int_{\mathcal A} w(y) f(x, y) \, dy
$$
is convex in $x$ (provided the integral exists).

### Composition with an Affine Mapping

> **Proposition**
>
> Suppose $f: \mathbb R^n \rightarrow \mathbb R$ is convex, $A \in \mathbb R^{n \times m}$, and $b \in \mathbb R^n$. Define $g: \mathbb R^m \rightarrow \mathbb R$ by
> $$
> g(x) = f(Ax + b),
> $$
> with $\operatorname{dom} g = \{x \mid Ax + b \in \operatorname{dom} f\}$. Then $g$ is also convex.
>
> *proof*
>
> Suppose $x_1, x_2 \in \operatorname{dom} g$ and $\theta \in [0, 1]$. Then
> $$
> \begin{align}
> g(\theta x_1 + (1 - \theta) x_2) &= f(\theta A x_1 + (1 - \theta) A x_2 + b) \\
> &= f(\theta(Ax_1 + b) + (1 - \theta)(Ax_2 + b)) \\
> &\le \theta g(x_1) + (1 - \theta) g(x_2)
> \end{align}
> $$
> The last inequality holds due to the convexity of $f$.
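As a quick sanity check of this proposition, one can sample inequality ($\ref{convex function}$) numerically. A minimal sketch (NumPy only; the choice $f = \|\cdot\|_2$, which is convex by the norm example above, and the random $A$, $b$ are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# f = Euclidean norm (convex); g(x) = f(Ax + b) should then be convex.
A = rng.normal(size=(4, 3))
b = rng.normal(size=4)

def g(x):
    return np.linalg.norm(A @ x + b)

# Spot-check inequality (1) at random point pairs.
for _ in range(1000):
    x1, x2 = rng.normal(size=3), rng.normal(size=3)
    theta = rng.random()
    lhs = g(theta * x1 + (1 - theta) * x2)
    rhs = theta * g(x1) + (1 - theta) * g(x2)
    assert lhs <= rhs + 1e-12
print("inequality (1) held at all sampled points")
```

Of course this only samples the inequality rather than proving it, but it is a useful pattern for catching mistaken convexity claims.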
### Pointwise Maximum and Supremum

> **Def**
>
> If $f_1$ and $f_2$ are convex functions, then their ***pointwise maximum*** $f$ is defined by
> $$
> f(x) = \max \{f_1(x), f_2(x)\},
> $$
> with $\operatorname{dom} f = \operatorname{dom} f_1 \cap \operatorname{dom} f_2$.

> **Proposition**
>
> If $f_1$ and $f_2$ are convex, then their pointwise maximum, denoted $f$, is also convex.
>
> *proof*
>
> For any $x_1, x_2 \in \operatorname{dom} f$ and $\theta \in [0, 1]$, we have
> $$
> \begin{align}
> f(\theta x_1 + (1-\theta)x_2)
> &= \max \{f_1(\theta x_1 + (1-\theta)x_2), f_2(\theta x_1 + (1-\theta)x_2)\} \\
> &\le \max \{\theta f_1(x_1) + (1-\theta)f_1(x_2), \theta f_2(x_1) + (1-\theta)f_2(x_2)\} \\
> &\le \theta \max \{f_1(x_1), f_2(x_1)\} + (1-\theta) \max \{f_1(x_2), f_2(x_2)\} \\
> &= \theta f(x_1) + (1-\theta) f(x_2)
> \end{align}
> $$

> **Remark**
>
> It can easily be shown that if $f_1, \ldots, f_n$ are convex, then their pointwise maximum
> $$
> f(x) = \max \{f_1(x), \ldots, f_n(x)\}
> $$
> is also convex.

The pointwise maximum property extends to the pointwise supremum over an infinite set of convex functions:

> **Proposition**
>
> If for each $y \in \mathcal A$, $f(x, y)$ is convex in $x$, then the function $g$ defined by
> $$
> g(x) = \sup_{y \in \mathcal A} f(x, y)
> $$
> is convex in $x$. Here the domain of $g$ is
> $$
> \operatorname{dom} g = \{x \mid (x, y) \in \operatorname{dom} f \text{ for all } y \in \mathcal A,\ \sup_{y \in \mathcal A} f(x, y) < \infty\}.
> $$

> **Example**
>
> Let $C \subseteq \mathbb R^n$ with $C \neq \emptyset$. The ***support function*** $S_C$ associated with the set $C$ is defined as
> $$
> S_C(x) = \sup \{x^T y \mid y \in C\}.
> $$
> For each $y \in C$, $x^T y$ is a linear function of $x$, so $S_C$ is a pointwise supremum of a family of linear functions, **hence convex**.

> **Example**
>
> Let $C \subseteq \mathbb R^n$. The distance to the farthest point of $C$,
> $$
> f(x) = \sup_{y \in C} \| x - y \|,
> $$
> where $\|\cdot\|$ is any norm, is convex.
>
> *proof*
>
> For each $y \in C$, $\|x - y\|$ is convex in $x$ (a norm composed with the affine map $x \mapsto x - y$), so $f$ is the pointwise supremum of a family of convex functions, hence convex.
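The support-function example lends itself to the same kind of numerical spot check as before. A short sketch (NumPy only; the finite set $C$ is made up, and for a finite $C$ the supremum is simply a maximum over its points):

```python
import numpy as np

rng = np.random.default_rng(1)

# A finite set C in R^2: S_C(x) = max_{y in C} x^T y is a pointwise
# maximum of linear functions of x, hence convex.
C = rng.normal(size=(6, 2))

def S(x):
    return np.max(C @ x)  # maximize x^T y over the rows y of C

# Spot-check inequality (1) at random point pairs.
for _ in range(1000):
    x1, x2 = rng.normal(size=2), rng.normal(size=2)
    theta = rng.random()
    lhs = S(theta * x1 + (1 - theta) * x2)
    rhs = theta * S(x1) + (1 - theta) * S(x2)
    assert lhs <= rhs + 1e-12
print("S_C satisfied inequality (1) at all sampled points")
```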