## Chapter 4.2. Dynamic Optimization Methods

### 4.2.1. Discrete Time

Consider a household that lives for $T$ time periods where $T \leq \infty$. Time evolves discretely, i.e., $t=0,1,2,\ldots,T$. The utility function of the household in time period $t$ is defined by $u(c_{t})$. Prices are normalized to $1$. The household has assets/wealth of $a_{t}$ at the beginning of period $t$. In each period $t$ it earns income $w_{t}$ and saves $s_{t}$, on which it earns interest at rate $r$. Lifetime utility is given by
\begin{align}
U_{0} = \sum_{t=0}^{T} \beta^{t} u(c_{t}) \tag{4.21}
\end{align}
where $0<\beta<1$ is the ==discount factor==. Households choose an optimal consumption plan $\{c_{t}\}_{t=0}^{T} = \{c_{0},c_{1},c_{2},\ldots,c_{T}\}$ that maximizes $U_{0}$ subject to the lifetime budget constraint:
\begin{align}
\sum_{t=0}^{T} \dfrac{c_{t}}{(1+r)^{t}} = a_{0} + \sum_{t=0}^{T}\dfrac{w_{t}}{(1+r)^{t}}. \tag{4.22}
\end{align}
One can use the Lagrangian method to solve this optimization problem. Set the Lagrangian to be
\begin{align}
\mathcal{L}(c_{0},c_{1},c_{2},\ldots,c_{T},\lambda) = \sum_{t=0}^{T} \beta^{t} u(c_{t}) + \lambda \left[a_{0} + \sum_{t=0}^{T}\dfrac{w_{t}}{(1+r)^{t}} - \sum_{t=0}^{T} \dfrac{c_{t}}{(1+r)^{t}}\right]. \tag{4.23}
\end{align}
The FOCs are:
\begin{align}
\mathcal{L}_{c_{t}} &= \beta^{t} u'(c_{t}) - \dfrac{\lambda}{(1+r)^{t}} = 0 \mbox{ for } t=0,1,2,\ldots,T, \tag{4.24}\\
\mathcal{L}_{\lambda} &= a_{0} + \sum_{t=0}^{T}\dfrac{w_{t}}{(1+r)^{t}} - \sum_{t=0}^{T} \dfrac{c_{t}}{(1+r)^{t}} = 0. \notag
\end{align}
Note that we have $T+2$ equations ($T+1$ FOCs for consumption plus the budget constraint) in $T+2$ unknowns ($c_{0},c_{1},\ldots,c_{T}$ and $\lambda$). Taking the ratio of the FOCs for $c_{t}$ and $c_{t+1}$ eliminates $\lambda$ and gives
\begin{align}
\dfrac{u'(c_{t})}{u'(c_{t+1})} = (1+r)\beta. \tag{4.25}
\end{align}
This is the ==Euler equation==, which relates optimal consumption growth to the discount factor and the interest rate through the factor $(1+r)\beta$.
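To see the Euler equation at work numerically, the following is a minimal Python sketch, not part of the model above: it assumes, purely for illustration, the CRRA utility $u(c_{t}) = (c_{t}^{1-\theta}-1)/(1-\theta)$ introduced in Section 4.2.1.1 below, a constant wage path, and arbitrary parameter values. Under CRRA, the Euler equation (4.25) implies $c_{t+1} = \left[(1+r)\beta\right]^{1/\theta} c_{t}$, and the lifetime budget constraint (4.22) pins down $c_{0}$.

```python
import numpy as np

# Illustrative parameter values (assumptions for this sketch, not from the text)
beta, r, theta, T = 0.95, 0.04, 2.0, 10
a0 = 1.0
w = np.ones(T + 1)                      # constant wage path w_0, ..., w_T

# Under CRRA utility the Euler equation (4.25) gives c_{t+1} = g * c_t
g = ((1 + r) * beta) ** (1 / theta)

# Lifetime budget constraint (4.22): sum_t c_0 g^t / (1+r)^t = a_0 + PV of wages
disc = (1 + r) ** (-np.arange(T + 1))   # discount factors 1/(1+r)^t
resources = a0 + np.sum(w * disc)
c0 = resources / np.sum(g ** np.arange(T + 1) * disc)
c = c0 * g ** np.arange(T + 1)          # optimal consumption plan

# Check the Euler equation: u'(c_t)/u'(c_{t+1}) = (1+r)*beta in every period
mu = c ** (-theta)                      # marginal utility under CRRA
print(np.allclose(mu[:-1] / mu[1:], (1 + r) * beta))   # True
print(np.isclose(np.sum(c * disc), resources))         # budget constraint holds
```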
The Lagrangian technique is, however, neither practical (especially when $T=\infty$) nor efficient: for instance, it does not exploit the recursive nature of the dynamic optimization problem. Therefore, we turn our attention to a powerful technique known as ==dynamic programming==.

Dynamic programming solves optimization problems by breaking them down into smaller, overlapping subproblems. The key idea is as follows: we focus on two consecutive time periods, $t$ and $t+1$, under the assumption that the economic agent acts rationally in period $t+1$ based on the decisions made in period $t$. By recursively optimizing these decisions over time and ensuring consistency across periods, we can deduce that the agent behaves optimally across all time periods. The key components of a dynamic programming problem are as follows:

(i) The ==state variable==, $x_{t}$, represents the condition of the system at time $t$. It summarizes all information needed to make decisions in the current period and to transition to the next. These variables are typically ==stocks==. The initial value of the state variable $x_{0}$ is exogenously determined. Wealth (the value of assets) at the beginning of each period $t$, $a_{t}$, in a consumption-savings problem is an example of a state variable.

(ii) The ==control variable== is the decision made at time $t$ to affect the system's state and generate rewards. These variables are typically ==flows==. Consumption in each period $t$, $c_{t}$, in a consumption-savings problem is an example of a control variable.

(iii) The ==transition function== or the ==flow constraint== describes how the current state and control determine the next state:
\begin{equation}
x_{t+1} = f(x_{t}, c_{t}). \tag{4.26}
\end{equation}
For example, in the consumption-savings problem, wealth next period depends on current wealth and how much is consumed today: $a_{t+1}=(1+r)a_{t}+w_{t}-c_{t}$.

This idea of breaking a large problem into smaller subproblems is formalized by the ==Bellman Principle of Optimality==, which states:

>"An optimal policy has the property that, regardless of the initial state and decision, the remaining decisions must form an optimal policy for the subproblem starting at the state resulting from the first decision."

The dynamic optimization problem is to maximize
$$U_{0} = \sum_{t=0}^{T} \beta^{t} u(c_{t}) \mbox{ subject to } a_{t+1} = f(a_{t}, c_{t}). \tag{4.27}$$
For each $s < T$, we can write
$$U_{s} = \sum_{t=s}^{T} \beta^{t-s} u(c_{t}) = u(c_{s}) + \beta\sum_{t=s+1}^{T} \beta^{t-s-1} u(c_{t}) = u(c_{s}) + \beta U_{s+1}. \tag{4.28}$$
The ==value function== at time period $s$ is the maximum attainable utility in the problem starting at period $s$:
$$V(a_{s}) = \max_{c_{s},c_{s+1},\ldots,c_{T}} U_{s} = \max_{c_{s},c_{s+1},\ldots,c_{T}} \sum_{t=s}^{T} \beta^{t-s} u(c_{t}). \tag{4.29}$$
Suppose the decision maker has already solved the problem that starts at $t+1$ for a given state variable $a_{t+1}$. Then, the value function at time $t$ can be broken down into instantaneous utility and the maximum attainable utility afterwards, given the choice $c_{t}$ that leads to $a_{t+1}$. Therefore,
$$V(a_{t}) = \max_{c_{t}}\left\{u(c_{t}) + \beta V(a_{t+1})\right\} \mbox{ subject to } a_{t+1} = f(a_{t}, c_{t}). \tag{4.30}$$
Equation (4.30) is called the ==Bellman Equation==, which defines the problem recursively. Solving the Bellman Equation for all time periods gives us the optimal sequence of control variables.

### 4.2.1.1. Finite Horizon Case

Consider the utility function:
$$u(c_{t}) = \frac{c_{t}^{1-\theta} - 1}{1-\theta},$$
and assume that the finite horizon is $T = 2$. We aim to solve the dynamic optimization problem:
$$\max_{\{c_{0},c_{1}, c_{2}\}} u(c_{0}) + \beta u(c_{1}) + \beta^{2} u(c_{2}),$$
subject to the flow budget constraints:
$$a_{1}=(1+r)a_{0}+w_{0}-c_{0},$$
$$a_{2}=(1+r)a_{1}+w_{1}-c_{1}.$$
When $T=2$, the agent consumes all of the assets available in period $2$, i.e., $c_{2}^{*} = a_{2}$. Therefore, $V_{2}(a_{2})=u(c_{2}^{*}) = \frac{{c_{2}^{*}}^{1-\theta}-1}{1-\theta}$. Next, write the Bellman Equation for $t=1$:
$$V_{1}(a_{1}) = \max_{c_{1}} u(c_{1}) + \beta\left[\frac{{c_{2}^{*}}^{1-\theta}-1}{1-\theta}\right] = \max_{c_{1}} u(c_{1}) + \beta\left[\frac{a_{2}^{1-\theta}-1}{1-\theta}\right],$$
subject to $a_{2}=(1+r)a_{1}+w_{1}-c_{1}$. Converting this into an unconstrained problem, we have
$$\max_{c_{1}} u(c_{1}) + \beta\left[\frac{\left((1+r)a_{1}+w_{1}-c_{1}\right)^{1-\theta}-1}{1-\theta}\right].$$
Taking the FOC with respect to $c_{1}$ and solving for $c_{1}^{*}$,
$$u'(c_{1}^{*}) -\beta \left( (1+r)a_{1} + w_{1} - c_{1}^{*} \right)^{-\theta} = {c_{1}^{*}}^{-\theta} -\beta \left( (1+r)a_{1} + w_{1} - c_{1}^{*} \right)^{-\theta} = 0.$$
Solving, we get $c_{1}^{*} = \dfrac{(1+r)a_{1} + w_{1}}{1+\beta^{\frac{1}{\theta}}}$, and hence $a_{2} = (1+r)a_{1}+w_{1}-c_{1}^{*} = \dfrac{\beta^{\frac{1}{\theta}}\left((1+r)a_{1} + w_{1}\right)}{1+\beta^{\frac{1}{\theta}}}$.
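As a quick numerical check on this closed form, the sketch below maximizes the period-1 objective directly and compares the maximizer with $c_{1}^{*}$; the parameter values and the use of `scipy.optimize.minimize_scalar` are illustrative choices, not part of the model.

```python
from scipy.optimize import minimize_scalar

# Illustrative parameter values (assumptions for this sketch)
beta, r, theta = 0.95, 0.04, 2.0
a1, w1 = 1.0, 1.0

def u(c):
    """CRRA utility u(c) = (c**(1 - theta) - 1) / (1 - theta)."""
    return (c ** (1 - theta) - 1) / (1 - theta)

# Period-1 objective u(c1) + beta * u(a2), with a2 = (1+r)*a1 + w1 - c1
resources = (1 + r) * a1 + w1
neg_objective = lambda c1: -(u(c1) + beta * u(resources - c1))

# Maximize numerically over the feasible interval (0, resources)
res = minimize_scalar(neg_objective, bounds=(1e-6, resources - 1e-6), method="bounded")

# Closed-form solution c1* = ((1+r)a1 + w1) / (1 + beta**(1/theta))
c1_star = resources / (1 + beta ** (1 / theta))
print(res.x, c1_star)   # the two values should agree up to solver tolerance
```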
Now, we write the Bellman Equation for $t=0$:
$$V_{0}(a_{0}) = \max_{c_{0}} u(c_{0}) + \beta\left[\left(\frac{{c_{1}^{*}}^{1-\theta}-1}{1-\theta}\right)+\beta\left(\frac{\left((1+r)a_{1}+w_{1}-c_{1}^{*}\right)^{1-\theta}-1}{1-\theta}\right)\right],$$
subject to $a_{1}=(1+r)a_{0}+w_{0}-c_{0}$. Converting this into an unconstrained problem, we have
$$\max_{c_{0}} \dfrac{{c_{0}}^{1-\theta}-1}{1-\theta} + \beta\left[\left(\frac{{c_{1}^{*}}^{1-\theta}-1}{1-\theta}\right)+\beta\left(\frac{\left((1+r)\left((1+r)a_{0}+w_{0}-c_{0}\right)+w_{1}-c_{1}^{*}\right)^{1-\theta}-1}{1-\theta}\right)\right].$$
Substituting the expression for $c_{1}^{*}$ (which depends on $a_{1}$ and hence on $c_{0}$) into this objective, one can solve the first-order condition for $c_{0}^{*}$ in terms of $a_{0}$, $w_{0}$, $w_{1}$, and $r$. Lastly, one can substitute $c_{0}^{*}$, $c_{1}^{*}$, and $c_{2}^{*}$ into the flow budget constraints to solve for $a_{1}$ and $a_{2}$, and then substitute these expressions back into $c_{1}^{*}$ and $c_{2}^{*}$ to express them in terms of the parameters of the model, i.e., $a_{0}$, the wage levels, and $r$.
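The whole backward-induction argument can also be carried out numerically. The sketch below (with the same purely illustrative parameter values and utility as before) plugs the period-1 policy $c_{1}^{*}(a_{1})$ into the period-0 objective, maximizes over $c_{0}$, and then recovers $a_{1}$, $c_{1}^{*}$, $a_{2}$, and $c_{2}^{*}$ from the flow budget constraints.

```python
from scipy.optimize import minimize_scalar

# Illustrative parameter values (assumptions for this sketch)
beta, r, theta = 0.95, 0.04, 2.0
a0, w0, w1 = 1.0, 1.0, 1.0

def u(c):
    """CRRA utility u(c) = (c**(1 - theta) - 1) / (1 - theta)."""
    return (c ** (1 - theta) - 1) / (1 - theta)

def c1_policy(a1):
    """Period-1 policy c1*(a1) = ((1+r)a1 + w1) / (1 + beta**(1/theta))."""
    return ((1 + r) * a1 + w1) / (1 + beta ** (1 / theta))

def neg_v0(c0):
    """Negative of the period-0 objective u(c0) + beta*u(c1*) + beta^2*u(c2*)."""
    a1 = (1 + r) * a0 + w0 - c0
    c1 = c1_policy(a1)
    c2 = (1 + r) * a1 + w1 - c1          # c2* = a2: consume everything in period 2
    return -(u(c0) + beta * u(c1) + beta ** 2 * u(c2))

# Maximize over feasible c0 in (0, (1+r)*a0 + w0)
upper = (1 + r) * a0 + w0
res = minimize_scalar(neg_v0, bounds=(1e-6, upper - 1e-6), method="bounded")

# Recover the full optimal plan from the flow budget constraints
c0_star = res.x
a1 = (1 + r) * a0 + w0 - c0_star
c1_star = c1_policy(a1)
a2 = (1 + r) * a1 + w1 - c1_star
c2_star = a2
print(c0_star, c1_star, c2_star)
```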
### 4.2.1.2. Infinite Horizon Case

In this section, we assume $T=\infty$. When $T=\infty$, we need an additional restriction, called the ==transversality condition (TVC)==, which ensures that the solution to the optimization problem does not lead to a situation where resources are *left unused* asymptotically or grow indefinitely without providing utility. It reflects the economic principle that resources must be optimally allocated over time. For the dynamic programming problem (4.30), the ==transversality condition== is typically expressed as:
$$\lim_{t \to \infty} \beta^{t} \frac{\partial V(x_{t})}{\partial x_{t}} x_{t} = 0. \tag{4.31}$$
The TVC ensures that the discounted marginal value of the remaining resources $x_{t}$ vanishes as $t \to \infty$. This means that the agent does not over-accumulate resources that yield no additional utility in the long run. Without the TVC, the solution may involve non-optimal paths, such as accumulating $x_{t}$ indefinitely without increasing utility (e.g., hoarding resources unnecessarily). The TVC applies directly to the state variable $x_{t}$: for any optimal sequence of $x_{t}$ and $c_{t}$, the transversality condition must hold. This ensures that the utility derived from $c_{t}$ is maximized across time, balancing current consumption with future possibilities.

We are now in a position to derive the Euler condition from the Bellman equation (4.30). The derivation involves two core steps: (i) establishing the First-Order Condition; and (ii) applying the Envelope Condition.

First, the FOC is found by differentiating the right-hand side of the Bellman equation with respect to the control variable, $c_{t}$, and setting it to zero to find the optimal choice. This yields
$$u'\left(c_{t}\right)+\beta V'\left(a_{t+1}\right)\dfrac{\partial f\left(a_{t}, c_{t}\right)}{\partial c_{t}}=0. \tag{4.32}$$
The analogous FOC at time $t+1$ is
$$u'\left(c_{t+1}\right)+\beta V'\left(a_{t+2}\right)\dfrac{\partial f\left(a_{t+1}, c_{t+1}\right)}{\partial c_{t+1}}=0. \tag{4.33}$$
Second, the Envelope Condition is derived by differentiating the value function with respect to the state variable, $a_{t}$, resulting in
$$V'\left(a_{t}\right)=\beta V'\left(a_{t+1}\right) \dfrac{\partial f(a_{t},c_{t})}{\partial a_{t}}, \tag{4.34}$$
and the analogous Envelope Condition at time $t+1$, obtained by differentiating with respect to $a_{t+1}$, is
$$V'\left(a_{t+1}\right)=\beta V'\left(a_{t+2}\right) \dfrac{\partial f(a_{t+1},c_{t+1})}{\partial a_{t+1}}. \tag{4.35}$$
Rearranging (4.33), we have
$$\beta V'\left(a_{t+2}\right) = -\dfrac{u'(c_{t+1})}{\frac{\partial f(a_{t+1},c_{t+1})}{\partial c_{t+1}}}, \tag{4.36}$$
and using (4.36) in (4.35), we get
$$ V'\left(a_{t+1}\right)=-\dfrac{u'(c_{t+1})}{\frac{\partial f(a_{t+1},c_{t+1})}{\partial c_{t+1}}} \dfrac{\partial f(a_{t+1},c_{t+1})}{\partial a_{t+1}}. \tag{4.37}$$
Rearranging (4.32) and using (4.37), we have
$$-\dfrac{u'(c_{t})}{\frac{\partial f(a_{t},c_{t})}{\partial c_{t}}}=\beta V'\left(a_{t+1}\right)=-\beta\dfrac{u'(c_{t+1})}{\frac{\partial f(a_{t+1},c_{t+1})}{\partial c_{t+1}}} \dfrac{\partial f(a_{t+1},c_{t+1})}{\partial a_{t+1}}. \tag{4.38}$$
Rearranging (4.38) establishes the Euler Condition:
$$\dfrac{u'\left(c_t\right)}{u'\left(c_{t+1}\right)}= \beta \dfrac{\frac{\partial f\left(a_{t}, c_{t}\right)}{\partial c_{t}}}{\frac{\partial f\left(a_{t+1}, c_{t+1}\right)}{\partial c_{t+1}}} \dfrac{\partial f(a_{t+1},c_{t+1})}{\partial a_{t+1}}. \tag{4.39}$$
In our example, where $a_{t+1}=f(a_{t},c_{t}) = (1+r)a_{t}+w_{t}-c_{t}$, we have $\frac{\partial f(a_{t},c_{t})}{\partial c_{t}}=\frac{\partial f(a_{t+1},c_{t+1})}{\partial c_{t+1}}=-1$ and $\frac{\partial f(a_{t+1},c_{t+1})}{\partial a_{t+1}}=1+r$, so (4.39) reduces to
$$\dfrac{u'\left(c_t\right)}{u'\left(c_{t+1}\right)}= \beta \dfrac{\frac{\partial f\left(a_{t},c_{t}\right)}{\partial c_{t}}}{\frac{\partial f\left(a_{t+1},c_{t+1}\right)}{\partial c_{t+1}}} \dfrac{\partial f(a_{t+1},c_{t+1})}{\partial a_{t+1}}=\beta(1+r),$$
which is the same as the Euler equation (4.25) derived using the Lagrangian method.
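In the infinite-horizon case, the Bellman equation (4.30) is typically solved numerically, for example by iterating on the value function over a grid for the state variable. The following is a minimal value-function-iteration sketch under the same illustrative CRRA utility and parameter values assumed earlier; the grid bounds, grid size, and tolerance are arbitrary choices for illustration only.

```python
import numpy as np

# Illustrative parameters and utility (assumptions for this sketch)
beta, r, theta, w = 0.95, 0.04, 2.0, 1.0

def u(c):
    """CRRA utility u(c) = (c**(1 - theta) - 1) / (1 - theta)."""
    return (c ** (1 - theta) - 1) / (1 - theta)

# Grid for the state variable a_t (wealth at the beginning of the period)
a_grid = np.linspace(0.0, 20.0, 400)
resources = (1 + r) * a_grid + w              # resources available at each a_t

# Consumption implied by each choice of next-period wealth a_{t+1}:
# c = (1+r)*a_t + w - a_{t+1}; non-positive consumption is infeasible
c = resources[:, None] - a_grid[None, :]
feasible = c > 0
flow_u = np.where(feasible, u(np.where(feasible, c, 1.0)), -np.inf)

# Value function iteration on V(a) = max_{a'} { u(c) + beta * V(a') }
V = np.zeros_like(a_grid)
for _ in range(2000):
    V_new = (flow_u + beta * V[None, :]).max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new

# Optimal policy: next-period wealth and consumption at each grid point
policy_idx = (flow_u + beta * V[None, :]).argmax(axis=1)
a_next = a_grid[policy_idx]
c_policy = resources - a_next
```

On a sufficiently fine grid, the consumption policy produced by this iteration satisfies the Euler equation $u'(c_{t}) = \beta(1+r)\,u'(c_{t+1})$ along the optimal path up to approximation error.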