# Chapter 4.1. Static Optimization Methods
### 4.1.1. Unconstrained Static Optimization
Suppose we have an (at least twice-differentiable) objective function of $n$ variables $$\max_{\mathbf{x} = \left(x_{1}, \ldots, x_{n}\right) \in \mathbb{R}^{n}} f\left(x_{1},\ldots,x_{n}\right). \tag{4.1a}$$ The first-order conditions when maximizing $f$ are:
\begin{align}
f_{1} &= \dfrac{\partial f}{\partial x_{1}} &= 0, \\
f_{2} &= \dfrac{\partial f}{\partial x_{2}} &= 0, \tag{4.1b}\\
\vdots & & \vdots \\
f_{n} &= \dfrac{\partial f}{\partial x_{n}} &= 0.
\end{align} The ==Hessian matrix== associated to our objective function $f$ is
$$Hf = \begin{bmatrix}
f_{11} & f_{12} & \cdots & \cdots & f_{1n} \\
f_{21} & f_{22} & \cdots & \cdots & f_{2n} \\
\vdots & \vdots & \ddots & & \vdots \\
\vdots & \vdots & & \ddots & \vdots \\
f_{n1} & f_{n2} & \cdots & \cdots & f_{nn}
\end{bmatrix},$$ where for $i,j=1,2,\ldots,n$, $f_{ij} = \dfrac{\partial^{2} f}{\partial x_{i}\partial x_{j}}$. The second-order condition for maximization of the objective function $f$ requires that $Hf$ be a ==negative definite matrix==, i.e., $(-1)^{k}M_{k} > 0$ for every $k=1,2,\ldots,n$, where $M_{k}$ is the ==$k$-th order leading principal minor== of $Hf$, given by
\begin{align*}
M_{1} &= f_{11} \\
M_{2} &= \left| \begin{array}{cc}
f_{11} & f_{12} \\
f_{21} & f_{22}
\end{array} \right| \\
&\vdots \\
M_{n} &= |Hf|
\end{align*}
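As a quick numerical check (a sketch of our own; the function name `is_negative_definite` and the example Hessian are illustrative choices, not part of the text), the sign pattern $(-1)^{k}M_{k}>0$ can be tested directly:
```python
import numpy as np

def is_negative_definite(H, tol=1e-12):
    """Check (-1)^k * M_k > 0 for k = 1, ..., n, where M_k is the k-th
    leading principal minor of the (symmetric) matrix H."""
    H = np.asarray(H, dtype=float)
    n = H.shape[0]
    for k in range(1, n + 1):
        M_k = np.linalg.det(H[:k, :k])       # k-th leading principal minor
        if (-1) ** k * M_k <= tol:
            return False
    return True

# Hessian of the strictly concave function f(x, y) = -x**2 + x*y - y**2:
H = np.array([[-2.0, 1.0],
              [1.0, -2.0]])
print(is_negative_definite(H))   # True: M_1 = -2 < 0 and M_2 = 3 > 0
```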
### 4.1.2. Static Optimization Problems with Equality Constraints
Consider the two-variable problem $$\max_{x,y} f(x,y)$$ subject to $g(x,y) = c$.
The constraint set (i.e. the set of all pairs $(x,y)$ for which $g(x,y) = c$) is a set of points in $(x,y)$-space. Assume that the functions $f$ and $g$ are at least twice-differentiable. At a solution $(x^{*},y^{*})$ of the problem, the constraint curve is tangent to a level curve of $f$, i.e.,
$$-\frac{f'_{1}(x^{*},y^{*})}{f'_{2}(x^{*},y^{*})} = -\frac{g'_{1}(x^{*},y^{*})}{g'_{2}(x^{*},y^{*})}$$
or,
$$\frac{f'_{1}(x^{*},y^{*})}{g'_{1}(x^{*},y^{*})} = \frac{f'_{2}(x^{*},y^{*})}{g'_{2}(x^{*},y^{*})} \tag{4.2}$$
assuming that neither $g'_{1}(x^{*},y^{*})$ nor $g'_{2}(x^{*},y^{*})$ is zero. Now introduce a new variable, $\lambda$, and set it equal to the common value of the quotients: $$\lambda = \frac{f'_{1}(x^{*},y^{*})}{g'_{1}(x^{*},y^{*})} = \frac{f'_{2}(x^{*},y^{*})}{g'_{2}(x^{*},y^{*})}.$$ It might appear that introducing a new variable merely complicates the problem, but in fact it is a clever step that allows the condition for a maximum to be expressed in an appealing way. Given the definition of $\lambda$, the condition for $(x^{*},y^{*})$ to solve the problem may be written as the two equations
\begin{align*}
f'_{1}(x^{*},y^{*})-\lambda g'_{1}(x^{*}, y^{*}) &= 0 \\
f'_{2}(x^{*},y^{*})-\lambda g'_{2}(x^{*}, y^{*}) &= 0.
\end{align*}
At a solution of the problem we need, in addition, $c = g(x^{*},y^{*})$ (the constraint is satisfied). Thus, the following conditions must be satisfied at a solution $(x^{*},y^{*})$:
\begin{align}
f'_{1}(x^{*},y^{*})-\lambda g'_{1}(x^{*}, y^{*}) &= 0 \\
f'_{2}(x^{*},y^{*})-\lambda g'_{2}(x^{*}, y^{*}) &= 0 \tag{4.3}\\
c-g(x^{*},y^{*}) &= 0.
\end{align}
The first two equations can conveniently be viewed as the conditions that the partial derivatives of the ==Lagrangian== $$\mathcal{L}(x,y,\lambda) = f(x,y)-\lambda(g(x,y)-c), \tag{4.4}$$ with respect to $x$ and $y$ be equal to zero.
They are known as the first-order conditions for the problem, and $\lambda$ is called the ==Lagrange multiplier==. Now suppose we have a function of $n$ variables and face $m$ equality constraints, where $m<n$. The problem is
$$\max_{x_{1}, \ldots, x_{n}} f\left(x_1, \ldots, x_n\right), \tag{4.5a}$$ subject to
\begin{align}
g^{1}\left(x_{1}, \ldots, x_{n}\right) &= c_{1} \\
g^{2}\left(x_{1}, \ldots, x_{n}\right) & =c_{2} \tag{4.5b}\\
\vdots &= \vdots \\
g^{m}\left(x_{1},\ldots, x_{n}\right) & = c_{m}.
\end{align}
To solve (4.5a-b), form the Lagrangian by multiplying each constraint term $g^{j}\left(x_{1},\ldots,x_{n}\right)-c_{j}$ by its own Lagrange multiplier $\lambda_{j}$ and subtracting the sum of these terms from the objective function $f$. For $\mathbf{x}=\left(x_{1}, \ldots, x_{n}\right)$ and $\mathbf{\lambda}=\left(\lambda_{1}, \ldots, \lambda_{m}\right)$, we obtain the function of $n+m$ variables $$\mathcal{L}(\mathbf{x}, \lambda)=f(\mathbf{x})-\sum_{j=1}^{m} \lambda_j \left(g^j(\mathbf{x})-c_{j}\right). \tag{4.6}$$ The first-order conditions again require that all partial derivatives of $\mathcal{L}$ be equal to zero at the optimum. Because $\mathcal{L}$ has $n+m$ variables, there will be a system of $n+m$ equations determining the $n+m$ unknowns $\mathbf{x}^{*}$ and $\mathbf{\lambda}^{*}$:
\begin{align}
\frac{\partial \mathcal{L}}{\partial x_{i}} &= \frac{\partial f\left(\mathbf{x}^{*}\right)}{\partial x_{i}}-\sum_{j=1}^m \lambda_{j}^{*} \frac{\partial g^{j}\left(\mathbf{x}^*\right)}{\partial x_{i}}=0 \mbox{ for } i=1, \ldots, n \tag{4.7}\\
\frac{\partial \mathcal{L}}{\partial \lambda_{j}} &= c_{j}-g^{j}\left(\mathbf{x}^{*}\right)=0 \mbox{ for } j=1, \ldots, m.
\end{align}
As in the case of unconstrained optimization, there are second-order conditions: it is sufficient for a local constrained maximum that the Hessian of the Lagrangian with respect to $\mathbf{x}$, evaluated at $(\mathbf{x}^{*},\mathbf{\lambda}^{*})$, be negative definite on the set of directions satisfying the linearized constraints.
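To illustrate the system (4.7) concretely (the objective $f(x,y)=xy$, the constraint $x+y=10$, and the use of SymPy are illustrative choices, not part of the text), the first-order conditions of the Lagrangian can be solved symbolically:
```python
import sympy as sp

x, y, lam = sp.symbols("x y lambda", real=True)

# Illustrative data: f(x, y) = x*y subject to g(x, y) = x + y = 10.
f = x * y
g = x + y
c = 10

# Lagrangian as in (4.6): L = f - lambda * (g - c)
L = f - lam * (g - c)

# First-order conditions (4.7): dL/dx = 0, dL/dy = 0, and the constraint.
foc = [sp.diff(L, x), sp.diff(L, y), c - g]
print(sp.solve(foc, [x, y, lam], dict=True))   # [{x: 5, y: 5, lambda: 5}]
```
The solver reproduces the hand calculation: $y=\lambda$, $x=\lambda$ and $x+y=10$ give $x^{*}=y^{*}=5$ with $\lambda^{*}=5$.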
### 4.1.3. Static Optimization Problems with Inequality Constraints
We start by considering a general one-variable maximization problem (the analysis of the minimization problem is analogous, since minimizing $f(x)$ is the same as maximizing $-f(x)$) with a simple non-negativity constraint:
\begin{align}
\max_{x \in \mathbb{R}} & f(x), \tag{4.8} \\
\mbox{subject to } & x \geq 0.
\end{align}
With the non-negativity constraint on the domain of solutions, we need to modify the necessary conditions for $x^{*}$ to be a solution of the optimization problem. There are two cases: (i) if $x^{*}$ solves (4.8) and $x^{*} > 0$, then $x^{*}$ is an ==interior optimum== and we have the familiar result that the derivative of $f$ must be zero at $x^{*}$, i.e., $x^{*}> 0$ implies that $f'(x^{*})=0$; (ii) if $x^{*}$ solves (4.8) and $x^{*}=0$, then $x^{*}$ is a ==boundary optimum== and either $f'(x^{*})=0$ or $f'(x^{*})<0$, so that $f'(x^{*}) \leq 0$. Therefore, the first-order necessary conditions for $x^{*} \geq 0$ to solve (4.8) can be summarized as follows: $$f'(x^{*}) \leq 0 \mbox{ and } x^{*}f'(x^{*}) = 0.$$ The second condition is often referred to as a ==complementary slackness condition==: at the optimum at least one of $x^{*}$ and $f'(x^{*})$ must be zero, so if the constraint is ==slack== ($x^{*}>0$), the derivative condition must bind ($f'(x^{*})=0$).
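As a simple illustration (an example chosen here for concreteness, not taken from the text), consider maximizing $f(x) = -x^{2}-2x$ subject to $x \geq 0$. Since $$f'(x) = -2x-2 < 0 \quad \text{for all } x \geq 0,$$ the solution is the boundary optimum $x^{*}=0$, at which $f'(x^{*}) = -2 \leq 0$ and $x^{*}f'(x^{*}) = 0$, so both necessary conditions hold.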
We can generalize these observations immediately to an $n$-variable optimization problem:
\begin{align}
\max_{\mathbf{x} \in \mathbb{R}^{n}} & f(\mathbf{x}), \tag{4.9} \\
\mbox{subject to } & \mathbf{x} = (x_{1},\ldots,x_{n}) \geq (0,\ldots,0) = \mathbf{0}.
\end{align}
The first-order necessary conditions for $\mathbf{x}^{*} = (x_{1}^{*},\ldots,x_{n}^{*})$ to solve (4.9) are
\begin{equation}
f_{i}(\mathbf{x}^{*}) = \dfrac{\partial f(\mathbf{x}^{*})}{\partial x_{i}} \leq 0 \mbox{ and } x_{i}^{*}f_{i}(\mathbf{x}^{*}) = 0 \mbox{ for all } i=1,2,\ldots,n. \tag{4.10a}
\end{equation}
or, since for all $i=1,\ldots,n$, $x_{i}^{*} \geq 0$ and $f_{i}(\mathbf{x}^{*}) \leq 0$, (4.10a) is equivalent to
\begin{equation}
f_{i}(\mathbf{x}^{*}) = \dfrac{\partial f(\mathbf{x}^{*})}{\partial x_{i}} \leq 0 \mbox{ for all } i=1,2,\ldots,n, \mbox{ and } \sum_{i=1}^{n} x_{i}^{*}f_{i}(\mathbf{x}^{*}) = 0 \tag{4.10b}
\end{equation}
which in turn is equivalent to
\begin{equation}
\nabla f(\mathbf{x}^{*}) \leq \mathbf{0} \mbox{ and } \mathbf{x}^{*} \cdot \nabla f(\mathbf{x}^{*}) = 0 \tag{4.10c}
\end{equation}
where $\nabla f(\mathbf{x}^{*}) = \left(f_{1}(\mathbf{x}^{*}),\ldots,f_{n}(\mathbf{x}^{*})\right)$ is the ==gradient vector== of $f$ at $\mathbf{x}^{*}$. In the case of a minimization problem, the inequalities in (4.10a-c) are reversed. Note that the conditions in (4.10) are only necessary for a maximum; the usual second-order conditions are required for sufficiency.
Thus, for each $i$, the conditions ensure that either $x_{i}^{*} > 0$ and $f_{i}(\mathbf{x}^{*}) = 0$, or $x_{i}^{*} = 0$ and $f_{i}(\mathbf{x}^{*}) \leq 0$; this must hold for every $i=1,\ldots,n$.
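As a small numerical illustration of (4.10c) (the objective function, the candidate point and the helper `check_conditions` below are our own choices, not part of the text):
```python
import numpy as np

def grad_f(x):
    # Gradient of the illustrative objective f(x) = -(x1 - 1)^2 - (x2 + 1)^2,
    # whose maximizer over the non-negative orthant is (1, 0).
    return np.array([-2.0 * (x[0] - 1.0), -2.0 * (x[1] + 1.0)])

def check_conditions(x_star, tol=1e-8):
    # Conditions (4.10c): grad f(x*) <= 0 componentwise and x* . grad f(x*) = 0.
    g = grad_f(np.asarray(x_star, dtype=float))
    return bool(np.all(g <= tol) and abs(np.dot(x_star, g)) <= tol)

# At (1, 0): x1* > 0 with f_1 = 0, and x2* = 0 with f_2 = -2 < 0.
print(check_conditions([1.0, 0.0]))   # True
```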
So far, we have worked with simple non-negativity constraints on the domain of solutions. We are now in a position to analyse constrained optimization problems that involve functional inequality constraints:
\begin{align}
\max_{\mathbf{x} \in \mathbb{R}^{n}} & f(\mathbf{x}), \\
\mbox{subject to } & g(\mathbf{x}) \geq 0, \tag{4.11} \\
& \mathbf{x} \geq 0.
\end{align}
We convert the problem in (4.11) to a problem with equality constraints by introducing a ==slack variable== $s \geq 0$ as follows:
\begin{align}
\max_{\mathbf{x} \in \mathbb{R}^{n}} & f(\mathbf{x}), \\
\mbox{subject to } & g(\mathbf{x})-s = 0, \tag{4.12} \\
\mathbf{x} \geq 0 &\mbox{ and } s \geq 0.
\end{align}
The Lagrangian associated with the optimization problem in (4.12) is $$\mathcal{L}(\mathbf{x}, s, \lambda)=f(\mathbf{x})+\lambda\left(g(\mathbf{x})-s\right), \tag{4.13}$$ where $\lambda$ is the Lagrange multiplier.
A necessary condition for $\left(\mathbf{x}^{*}, s^{*}\right)$ to solve problem (4.12) is that there exists a $\lambda^{*}$ such that:
$$
\begin{align}
\mathcal{L}_{i}\left(\mathbf{x}^{*}, s^{*}, \lambda^{*}\right) &= f_{i}\left(\mathbf{x}^{*}\right)+\lambda^{*} g_{i}\left(\mathbf{x}^{*}\right) \leq 0 \quad \text { for all } i=1,2, \ldots, n, \\
\sum_{i=1}^n x_{i}^{*} \mathcal{L}_i\left(\mathbf{x}^{*}, s^{*}, \lambda^{*}\right) &= \sum_{i=1}^n x_{i}^{*}\left(f_{i}\left(\mathbf{x}^{*}\right)+\lambda^{*} g_{i}\left(\mathbf{x}^{*}\right)\right) = 0, \\
\mathcal{L}_{s}\left(\mathbf{x}^{*}, s^{*}, \lambda^{*}\right) &= -\lambda^{*} \leq 0, \tag{4.14} \\
s^* \mathcal{L}_s\left(\mathbf{x}^{*}, s^{*}, \lambda^{*}\right)&= -s^{*} \lambda^{*} = 0, \\
\mathcal{L}_{\lambda}\left(\mathbf{x}^{*}, s^{*}, \lambda^{*}\right) &= g\left(\mathbf{x}^{*}\right)-s^{*} = 0, \\
s^{*} \geq 0 \quad \text{ and } \quad &x_{i}^{*} \geq 0 \quad \text{ for all } i=1,2, \ldots, n .
\end{align}
$$
The conditions in (4.14) allow us to eliminate the slack variable $s^{*}$. From the third expression in (4.14) we have $\lambda^{*} \geq 0$, and from the fifth expression we have that $g\left(\mathbf{x}^{*}\right)=s^{*} \geq 0$. Combining these two results with the fourth expression in (4.14) gives us $\lambda^{*} g\left(\mathbf{x}^{*}\right)=0$. Replacing the expressions in (4.14) with the ones we just derived (which exclude $s^{*}$) and writing $\mathcal{L}$ for the reduced Lagrangian gives the following necessary conditions for $\mathbf{x}^{*} \in \mathbb{R}_{+}^{n}$ to solve the functionally constrained problem (4.12):
$$
\begin{align}
\mathcal{L}_{i}\left(\mathbf{x}^{*}, \lambda^{*}\right) & =f_{i}\left(\mathbf{x}^{*}\right)+\lambda^{*} g_{i}\left(\mathbf{x}^{*}\right) \leq 0 \quad \text { for all } i=1,2, \ldots, n, \\
\sum_{i=1}^n x_{i}^{*} \mathcal{L}_i\left(\mathbf{x}^{*}, \lambda^{*}\right) & =\sum_{i=1}^n x_{i}^{*}\left[f_i\left(\mathbf{x}^{*}\right)+\lambda^{*} g_{i}\left(\mathbf{x}^{*}\right)\right]=0, \\
\mathcal{L}_\lambda\left(\mathbf{x}^{*}, \lambda^{*}\right) & =g\left(\mathbf{x}^{*}\right) \geq 0, \tag{4.15} \\
\lambda^* \mathcal{L}_\lambda\left(\mathbf{x}^{*}, \lambda^{*}\right) & =\lambda^* g\left(\mathbf{x}^{*}\right)=0, \\
\lambda^* \geq 0 \quad & \text { and } \quad x_{i}^{*} \geq 0 \quad \text { for all } i=1,2, \ldots, n,
\end{align}
$$
where $\mathcal{L}(\mathbf{x},\lambda)=f(\mathbf{x})+\lambda g(\mathbf{x})$.
We are now in a position to extend the argument to the more general case of $m$ functional constraints. In the general $m$-constraint problem, there are $n$ variables and $m$ functional constraints with no necessary relationship between $n$ and $m$. The general problem is
\begin{align}
& \max_{\mathbf{x}} f(\mathbf{x}) \tag{4.16a}\\
& \text {subject to: } g^{1}(\mathbf{x}) \geq 0, \\
& g^{2}(\mathbf{x}) \geq 0, \\
& \vdots \tag{4.16b}\\
& g^{m}(\mathbf{x}) \geq 0, \\
& \mathbf{x} \geq \mathbf{0}.
\end{align}
The Lagrangian function $\mathcal{L}$ for the general problem (4.16) is
$$\mathcal{L}(\mathbf{x},\mathbf{\lambda})=f(\mathbf{x})+\sum_{j=1}^{m} \lambda_{j} g^{j}(\mathbf{x})=f(\mathbf{x})+\lambda^{T} g(\mathbf{x}) \tag{4.17}$$
where $\mathbf{\lambda}^{T}=\left(\lambda_1, \lambda_2, \ldots, \lambda_m\right)$ is the $m$-dimensional row vector of Lagrange multipliers (one for each constraint function) and $g(\mathbf{x})$ is the $m$-dimensional column vector of constraint functions with $[g(\mathbf{x})]^T=\left(g^{1}(\mathbf{x}), g^{2}(\mathbf{x}), \ldots, g^{m}(\mathbf{x})\right)$.
The main result of the theory of nonlinear programming is that if $\mathbf{x}^{*}$ is a solution to (4.16), then there must exist a $\mathbf{\lambda}^{*} \in \mathbb{R}_{+}^{m}$ such that the following ==Karush-Kuhn-Tucker (KKT) conditions== hold:
\begin{gather}
\mathcal{L}_{i}\left(\mathbf{x}^{*}, \mathbf{\lambda}^{*}\right) =f_{i}\left(\mathbf{x}^{*}\right)+\sum_{j=1}^{m} \lambda_{j}^{*} g_{i}^{j}\left(\mathbf{x}^{*}\right) \leq 0 \quad \text { for all } i=1,2, \ldots, n, \tag{4.18a} \\
x_{i}^{*} \mathcal{L}_{i}\left(\mathbf{x}^{*}, \mathbf{\lambda}^{*}\right) =x_{i}^{*}\left[f_{i}\left(\mathbf{x}^{*}\right)+ \sum_{j=1}^m \lambda_{j}^{*} g_{i}^{j}\left(\mathbf{x}^{*}\right)\right]=0 \quad \text { for all } i=1,2, \ldots, n, \tag{4.18b} \\
\mathcal{L}_{\lambda_j}\left(\mathbf{x}^{*}, \mathbf{\lambda}^{*}\right) =g^{j}\left(\mathbf{x}^{*}\right) \geq 0 \quad \text { for all } j=1,2, \ldots, m, \tag{4.18c} \\
\lambda_{j}^{*} \mathcal{L}_{\lambda_{j}}\left(\mathbf{x}^{*}, \mathbf{\lambda}^{*}\right) =\lambda_{j}^{*} g^{j}\left(\mathbf{x}^{*}\right)=0 \quad \text { for all } j=1,2, \ldots, m. \tag{4.18d}
\end{gather}
In addition, $\lambda_{j}^{*} \geq 0$ for all $j=1,2, \ldots, m$ and $x_{i}^{*} \geq 0$ for all $i=1,2, \ldots, n$.
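As a minimal numerical sketch (the helper `check_kkt` and the quadratic test problem below are illustrative choices, not part of the text), the conditions (4.18a-d) and the sign restrictions can be verified at a candidate pair $(\mathbf{x}^{*},\mathbf{\lambda}^{*})$:
```python
import numpy as np

def check_kkt(grad_f, grad_gs, gs, x_star, lam_star, tol=1e-8):
    """Check (4.18a-d) and the sign restrictions at a candidate (x*, lambda*).

    grad_f  : gradient of the objective f
    grad_gs : list of gradients of the constraint functions g^j
    gs      : list of the constraint functions g^j
    """
    x = np.asarray(x_star, dtype=float)
    lam = np.asarray(lam_star, dtype=float)
    # L_i = f_i(x*) + sum_j lambda_j^* g^j_i(x*)
    L_x = grad_f(x) + sum(l * dg(x) for l, dg in zip(lam, grad_gs))
    g_vals = np.array([g(x) for g in gs])
    return bool(
        np.all(L_x <= tol)                       # (4.18a)
        and np.all(np.abs(x * L_x) <= tol)       # (4.18b)
        and np.all(g_vals >= -tol)               # (4.18c)
        and np.all(np.abs(lam * g_vals) <= tol)  # (4.18d)
        and np.all(lam >= -tol) and np.all(x >= -tol)
    )

# Test problem: max -(x1-1)^2 - (x2-1)^2 subject to 1 - x1 - x2 >= 0, x >= 0;
# a candidate solution is x* = (1/2, 1/2) with lambda* = 1.
grad_f = lambda x: np.array([-2.0 * (x[0] - 1.0), -2.0 * (x[1] - 1.0)])
grad_gs = [lambda x: np.array([-1.0, -1.0])]
gs = [lambda x: 1.0 - x[0] - x[1]]
print(check_kkt(grad_f, grad_gs, gs, [0.5, 0.5], [1.0]))   # True
```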
The conditions in (4.18a) and (4.18c) suggest that the Lagrangian function is maximized with respect to the variables in $\mathbf{x}$ and minimized with respect to the variables in $\mathbf{\lambda}$. If these suggestions are correct, then the Lagrangian will satisfy the following condition:
\begin{align}
\mathcal{L}\left(\mathbf{x}, \mathbf{\lambda}^{*}\right) \leq \mathcal{L}\left(\mathbf{x}^{*}, \mathbf{\lambda}^{*}\right) \leq \mathcal{L}\left(\mathbf{x}^{*}, \mathbf{\lambda}\right) \quad \text { for all } \mathbf{x} \in \mathbb{R}_{+}^n \quad \text { and } \quad \mathbf{\lambda} \in \mathbb{R}_{+}^{m}. \tag{4.19}
\end{align}
The condition in (4.19) says that $(\mathbf{x}^{*}, \mathbf{\lambda}^{*})$ is a saddle point of the Lagrangian; the optimal value of the Lagrangian can be written as
$$\mathcal{L}\left(\mathbf{x}^{*}, \mathbf{\lambda}^{*}\right)=\max_{\mathbf{x} \geq \mathbf{0}}\min _{\mathbf{\lambda} \geq \mathbf{0}} \mathcal{L}\left(\mathbf{x}, \mathbf{\lambda}\right). \tag{4.20}$$
Thus on the basis of the KKT conditions, we conclude that solving (4.18) and finding a saddle point for the Lagrangian are equivalent problems.
> Example. Consider the following maximization problem with an inequality constraint, where the objective is $f(x,y)=xy$: \begin{align}
& \max_{(x,y) \in \mathbb{R}^{2}} f(x,y)\\
& \text {subject to: } g(x,y) = 2-x-y^{2} \geq 0, \\
& (x,y) \geq \mathbf{0},
\end{align}
The Lagrangian for this problem is $$\mathcal{L}\left(x,y,\lambda\right)=xy+\lambda\left(2-x-y^{2}\right).$$
The first-order conditions are:
\begin{align}
\mathcal{L}_{x}\left(x^{*},y^{*},\lambda^{*}\right)&=y^{*}-\lambda^{*} \leq 0 \\
x^{*}\mathcal{L}_{x}\left(x^{*},y^{*},\lambda^{*}\right)&=x^{*}\left(y^{*}-\lambda^{*}\right)=0 \\
\mathcal{L}_{y}\left(x^{*},y^{*},\lambda^{*}\right) &= x^{*}-2y^{*}\lambda^{*} \leq 0 \\
y^{*}\mathcal{L}_{y}\left(x^{*},y^{*},\lambda^{*}\right) &= y^{*}\left(x^{*}-2y^{*}\lambda^{*}\right) = 0 \\
\mathcal{L}_{\lambda}\left(x^{*},y^{*}, \lambda^{*}\right)&=2-x^{*}-\left(y^{*}\right)^{2} \geq 0 \\
\lambda^{*}\mathcal{L}_{\lambda}\left(x^{*},y^{*},\lambda^{*}\right)&=\lambda^{*}\left(2-x^{*}-\left(y^{*}\right)^{2}\right)=0
\end{align} where $x^{*},y^{*},\lambda^{*} \geq 0$.
We consider two cases.
**Case 1** ($\lambda^{*}=0$). The first condition gives $y^{*} \leq \lambda^{*} = 0$, and since $y^{*} \geq 0$ we have $y^{*}=0$. Similarly, the third condition gives $x^{*} \leq 2y^{*}\lambda^{*} = 0$, and since $x^{*} \geq 0$, we have $x^{*}=0$.
**Case 2** ($\lambda^{*}>0$). From the last FOC (the complementary slackness condition related to $\lambda^{*}$), we have $x^{*}+\left(y^{*}\right)^{2}=2$.
**Case 2.1.** Assume $x^{*}>0$. The second condition then implies $y^{*} = \lambda^{*} > 0$, so the fourth condition gives $x^{*}-2y^{*}\lambda^{*}=0$, i.e., $(y^{*})^{2} = \frac{x^{*}}{2}$. Substituting into the constraint, $x^{*}+\frac{x^{*}}{2}=2$, so $x^{*} = \frac{4}{3}$ and $y^{*} = \sqrt{\frac{2}{3}}$ (with $\lambda^{*} = \sqrt{\frac{2}{3}}$).
**Case 2.2.** Assume $x^{*}=0$. The constraint then gives $y^{*}=\sqrt{2}>0$, so the fourth condition requires $x^{*}-2y^{*}\lambda^{*} = 0$, which implies $\lambda^{*}=0$, contradicting the assumption $\lambda^{*}>0$. Hence the only candidates are $(0,0)$ from Case 1 and $\left(\frac{4}{3},\sqrt{\frac{2}{3}}\right)$ from Case 2.1; since $f\left(\frac{4}{3},\sqrt{\frac{2}{3}}\right)=\frac{4}{3}\sqrt{\frac{2}{3}} > 0 = f(0,0)$, the solution is $x^{*}=\frac{4}{3}$ and $y^{*}=\sqrt{\frac{2}{3}}$.
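The example's solution can also be confirmed numerically (a sketch using `scipy.optimize.minimize`; the starting point $(1,1)$ and the choice of the SLSQP method are arbitrary):
```python
import numpy as np
from scipy.optimize import minimize

# Maximize f(x, y) = x*y subject to 2 - x - y**2 >= 0 and x, y >= 0.
# scipy.optimize.minimize minimizes, so we pass -f as the objective.
objective = lambda v: -(v[0] * v[1])
constraints = [{"type": "ineq", "fun": lambda v: 2.0 - v[0] - v[1] ** 2}]
bounds = [(0.0, None), (0.0, None)]   # non-negativity of x and y

res = minimize(objective, x0=[1.0, 1.0], method="SLSQP",
               bounds=bounds, constraints=constraints)
print(res.x)                              # approximately [1.3333, 0.8165]
print([4.0 / 3.0, np.sqrt(2.0 / 3.0)])    # analytic solution from Case 2.1
```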