Quick Presentation on indirect methods
===

###### tags: `Presentation`

## Hamiltonian formulation and Pontryagin's maximum principle

:::warning
The following content is rocket science. Literally. Its first application was the maximization of terminal rocket speed. :rocket: :rocket: :rocket: :rocket: :rocket: :rocket: :rocket: :rocket:
:::

Given an optimization problem of the form:

$$
\begin{split}
&\max_u&\bigg{[}\phi(x(t_1)) + \int_{t_0}^{t_1} f(t,x(t),u(t))dt \bigg{]} \\
&\mathrm{s.t.}\qquad &x' = g(t, x(t),u(t))\\
& & x(t_0) = x_0
\end{split}
\tag{1}
$$

one can instead study and solve the following system, __hoping__ for an identical solution.

$$
\begin{split}
&\frac{\partial H}{\partial u} = 0 \\
& \lambda' = -\frac{\partial H}{\partial x} \\
&\lambda(t_1) = \phi'(x^*(t_1))
\end{split} \\
\mathrm{where}\quad H = f(t,x(t),u(t)) + \lambda(t)g(t,x(t),u(t))
\tag{2}
$$

This follows from __Pontryagin's maximum principle__, which states that if an optimal pair $(x^*,u^*)$ exists for (1), then there exists a piecewise differentiable adjoint variable $\lambda(t)$ such that:

$$
H(t,x^*,u,\lambda^*)\leq H(t,x^*,u^*,\lambda^*)
$$

for all controls $u$ at every $t$, where $H$ is characterized by the system (2).

:::danger
The maximum principle is a __necessary condition__ that the pair $(x^*,u^*)$ must satisfy in order to be considered a potential solution to (1).

To ensure that a solution to (2) is also a solution to (1), one must verify additional conditions, in a similar fashion to the Lagrangian and the KKT conditions. A simple condition is to check for concavity (convexity) of both $f$ and $g$ w.r.t. $x$ and $u$.
:::

:::info
__Why indirect?__ Because you do not solve the problem directly. First you need to formulate the Hamiltonian and the control characterization; then you can solve the alternate problem.
:::

### Advantage of solving (2) over (1)

The optimality condition ($\frac{\partial H}{\partial u} = 0$) holds **pointwise**; this is an important simplification, as it means that instead of searching for the maximizing $u(t)$ among an infinitely large family of functions, you can find $u^*(t)$ by selecting the control that maximizes $H$ at every $t$.
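To make this pointwise idea concrete, here is a minimal sketch that recovers $u^*(t)$ on a time grid by maximizing $H$ over a grid of candidate controls. The choices of $f$, $g$ and the adjoint trajectory are toy assumptions made only for illustration, not part of any specific problem above.

```python
import numpy as np

# Toy ingredients (assumptions for illustration only).
def f(t, x, u):          # running reward
    return x - u**2

def g(t, x, u):          # dynamics x' = g(t, x, u)
    return x + u

def H(t, x, u, lam):     # Hamiltonian H = f + lambda * g
    return f(t, x, u) + lam * g(t, x, u)

ts = np.linspace(0.0, 1.0, 101)        # time grid
xs = np.ones_like(ts)                  # placeholder state trajectory
lams = np.exp(1.0 - ts) - 1.0          # placeholder adjoint trajectory
u_grid = np.linspace(-2.0, 2.0, 401)   # candidate controls

# The key point: u*(t) is found one t at a time, by maximizing H over u only.
u_star = np.array([
    u_grid[np.argmax(H(t, x, u_grid, lam))]
    for t, x, lam in zip(ts, xs, lams)
])
```

For this toy choice, $\frac{\partial H}{\partial u} = -2u + \lambda$, so the grid search should land close to the analytic maximizer $u^*(t) = \lambda(t)/2$.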
### Derivation

We can derive the Hamiltonian and the adjoint equation by leveraging tools from the calculus of variations. Let's define the objective we want to optimize w.r.t. $u$ by:

$$
J(u) = \phi(x(t_1)) + \int_{t_0}^{t_1} f(t,x(t),u(t)) dt
\tag{3}
$$

__Assumptions__: Let's assume a piecewise continuous optimal control $u^*(t)$ exists, along with a corresponding piecewise continuous state $x^*(t)$. Let's also assume that $f$ and $g$ are continuously differentiable w.r.t. their three arguments ($t,x,u$). Finally, $\phi$ is also assumed to be continuously differentiable w.r.t. $x$.

We want to study variations in the vicinity of $u^*(t)$. Take a constant $\epsilon \in \mathbb{R}$ and a piecewise continuous variation function $h(t)$, and define $u^{\epsilon}(t)$ as

$$
u^{\epsilon}(t) = u^*(t) + \epsilon h(t)
$$

It follows that the associated state $x^{\epsilon}$ must satisfy

$$
\frac{d}{dt}x^{\epsilon}(t) = g(t,x^{\epsilon}(t),u^{\epsilon}(t))
$$

with the boundary condition $x^{\epsilon}(t_0) = x_0$ still satisfied. As $\epsilon \to 0$ we have $u^{\epsilon}(t) \to u^*(t)$ and $x^{\epsilon}(t) \to x^*(t)$, since $g$ is differentiable w.r.t. $x$ and $u$.

Let's remark that:

$$
\frac{\partial u^{\epsilon}(t)}{\partial\epsilon}\Bigg|_{\epsilon =0} = h(t)
$$

and

$$
\frac{\partial x^{\epsilon}(t)}{\partial\epsilon}\Bigg|_{\epsilon =0} \;\mathrm{exists}
$$

since the solution $x^{\epsilon}$ of the state ODE depends smoothly on the parameter $\epsilon$ ($g$ being continuously differentiable).

With this, we now want to study the objective at $u^\epsilon$, that is:

$$
J(u^\epsilon) = \phi(x^\epsilon(t_1))+\int_{t_0}^{t_1} f(t,x^\epsilon(t),u^\epsilon(t)) dt
$$

We now introduce the adjoint $\lambda(t)$, which for now we just define as a piecewise differentiable function on $[t_0,t_1]$. From the fundamental theorem of calculus we have

$$
\int_{t_0}^{t_1} \frac{d}{dt}[\lambda(t)x^\epsilon(t)]dt + \lambda(t_0)x_0 - \lambda(t_1)x^\epsilon(t_1) = 0
$$

and thus

$$
\begin{split}
J(u^\epsilon) &= \phi(x^\epsilon(t_1))+\int_{t_0}^{t_1} \bigg[f(t,x^\epsilon(t),u^\epsilon(t))+\frac{d}{dt}[\lambda(t)x^\epsilon(t)]\bigg] dt+ \lambda(t_0)x_0 - \lambda(t_1)x^\epsilon(t_1)\\
&= \phi(x^\epsilon(t_1)) + \int_{t_0}^{t_1} \bigg[f(t,x^\epsilon(t),u^\epsilon(t))+\lambda'(t)x^\epsilon(t)+\lambda(t)g(t,x^{\epsilon}(t),u^{\epsilon}(t))\bigg] dt \\
& \qquad \qquad \qquad \qquad \qquad +\lambda(t_0)x_0 - \lambda(t_1)x^\epsilon(t_1)
\end{split}
$$

We know the maximum of $J(u^\epsilon)$ occurs at $\epsilon = 0$, i.e. at $u^*$ (by definition of optimality), so we have $\frac{d}{d\epsilon}J(u^\epsilon)\Big|_{\epsilon =0}=0$ and therefore

$$
0 = \int_{t_0}^{t_1} \frac{\partial}{\partial\epsilon}\bigg[f(t,x^\epsilon(t),u^\epsilon(t))+\lambda'(t)x^\epsilon(t)+\lambda(t)g(t,x^{\epsilon}(t),u^{\epsilon}(t))\bigg] dt\Bigg|_{\epsilon =0} - \frac{\partial}{\partial\epsilon}\lambda(t_1)x^\epsilon(t_1)\Big|_{\epsilon =0} \\
+\frac{\partial}{\partial\epsilon}\phi(x^\epsilon(t_1))\Big|_{\epsilon =0}
$$

Applying the derivative and expanding:

$$
\begin{split}
0 &= \int_{t_0}^{t_1} \bigg[f_x \frac{\partial x^{\epsilon}}{\partial\epsilon} + f_u \frac{\partial u^{\epsilon}}{\partial\epsilon} + \lambda'(t) \frac{\partial x^{\epsilon}}{\partial\epsilon} + \lambda(t)\bigg(g_x \frac{\partial x^{\epsilon}}{\partial\epsilon} + g_u \frac{\partial u^{\epsilon}}{\partial\epsilon} \bigg) \bigg]dt\Bigg|_{\epsilon =0} \\
& \qquad \qquad \qquad +\big(\phi'(x(t_1)) - \lambda(t_1)\big) \frac{\partial x^{\epsilon}}{\partial\epsilon}(t_1)\Bigg|_{\epsilon =0}\\
&= \int_{t_0}^{t_1} \bigg[ \big(f_x + \lambda(t)g_x + \lambda'(t) \big) \frac{\partial x^{\epsilon}}{\partial\epsilon}\Bigg|_{\epsilon =0} + \big(f_u + \lambda(t)g_u \big)h(t)\bigg]dt\\
& \qquad \qquad \qquad +\big(\phi'(x(t_1)) - \lambda(t_1)\big) \frac{\partial x^{\epsilon}}{\partial\epsilon}(t_1)\Bigg|_{\epsilon =0}\\
\end{split}
$$

Now, to exploit our freedom in choosing the adjoint, we want to simplify this expression as much as we can, in particular by getting rid of the terms involving $\frac{\partial x^{\epsilon}}{\partial\epsilon}$. We thus define the adjoint by:

$$
\lambda'(t) = -[f_x + \lambda(t)g_x]
\tag{4}
$$

with:

$$
\lambda(t_1) = \phi'(x(t_1))
\tag{5}
$$

Therefore, for this particular choice of adjoint we get

$$
0 = \int_{t_0}^{t_1} \big(f_u + \lambda(t)g_u \big)h(t)dt
$$

Since this must hold for every piecewise continuous $h(t)$, it must in particular hold for $h(t) = f_u + \lambda(t)g_u$, leading to:

$$
\begin{split}
0 &= \int_{t_0}^{t_1} \big(f_u + \lambda(t)g_u \big)^2dt\\
&\quad \implies f_u + \lambda(t)g_u = 0
\end{split}
\tag{6}
$$
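As a sanity check on (4)-(6), here is a small worked example using the same toy $f$ and $g$ as in the sketch above (again an illustrative assumption, not a problem from the notes): $\max_u \int_0^1 (x - u^2)\,dt$ subject to $x' = x + u$, $x(0) = x_0$, with no terminal reward ($\phi \equiv 0$).

$$
\begin{split}
H &= x - u^2 + \lambda (x + u) \\
\lambda' &= -\frac{\partial H}{\partial x} = -(1 + \lambda), \qquad \lambda(1) = 0 \;\implies\; \lambda(t) = e^{\,1-t} - 1 \\
\frac{\partial H}{\partial u} &= -2u + \lambda = 0 \;\implies\; u^*(t) = \frac{\lambda(t)}{2} = \frac{e^{\,1-t} - 1}{2}
\end{split}
$$

Since $H$ is concave in $u$, this stationary point is indeed the pointwise maximizer; substituting $u^*$ into $x' = x + u^*$ and integrating forward from $x_0$ gives the corresponding optimal state.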
#### Control bounds

Bounds, say $a\leq u\leq b$, can be taken into account when deriving the necessary conditions, which in the case of a maximization problem leads to the following modified version of condition (6) at an optimal point $u^*(t)$ (reverse the inequality signs for a minimization problem):

$$
\begin{cases}
u^*(t)=a &\mathrm{if} &\frac{\partial H(t)}{\partial u(t)} <0 \; \mathrm{over\; all\;}[a,b]\\
a \leq u^*(t)\leq b &\mathrm{if} &\frac{\partial H(t)}{\partial u(t)} =0 \; \mathrm{somewhere\; on\;}[a,b]\\
u^*(t)=b &\mathrm{if} &\frac{\partial H(t)}{\partial u(t)} >0 \; \mathrm{over\; all\;}[a,b]
\end{cases}
$$

This shouldn't be surprising: it is equivalent to saying that in order to maximize (minimize) $H(u)$ over $u \in [a,b]$, you can follow the gradient.

:::info
All the relationships we just derived extend directly to problems with multiple states and multiple controls.
:::

### Discrete time setting

We can also derive the Hamiltonian in discrete time by following similar steps to the above, replacing $\int$ by $\sum$, which leads to an analogous set of conditions.

#### Pierre-Luc's derivation in discrete time

Worth noting: in his class notes for IFT 6760C, Pierre-Luc derives the Hamiltonian and the adjoint equations for the discrete time setting in a very interesting manner, by first posing the adjoint in the setting where you only have a terminal reward, then extending it to a "Bolza discrete form". See pp. 55-59 of his notes.

### Connecting this to the Lagrangian formulation

The Hamiltonian is indeed very similar to the Lagrangian formulation. In fact, if you let the multiplier depend on $t$, the Lagrangian of (1) reads:

$$
L = \int_{t_0}^{t_1} \Big(f(t,x(t),u(t)) + \lambda(t)\big[g(t,x(t),u(t)) - x'(t) \big]\Big)dt
$$

With integration by parts, you can re-express the last term as:

$$
-\int_{t_0}^{t_1} \lambda(t)x'(t)\,dt = -\lambda(t_1)x(t_1) + \lambda(t_0)x(t_0) + \int_{t_0}^{t_1} \lambda'(t)x(t)\,dt
$$

Substituting this expression back into the Lagrangian, we recover an expression very similar to what we encountered previously.

## Forward-Backward Sweep Method (FBSM)

When you are able to find a characterization for $u^*$, that is, when solving the equation $\frac{\partial H}{\partial u} = 0$ leads to an expression of the form

$$
u^* = h(x(u^*),\lambda(u^*))
$$

one can be very tempted to apply fixed-point iteration, leading to the FBSM algorithm. The difficulty here resides in the fact that to evaluate $h$ in the above equation, we must solve the ODEs for $x$ and $\lambda$. Thus:

1) Pick an initial guess for $u$
2) With this $u$, solve for $x$ given $x'(u)$ and $x_0$ (forward sweep)
3) With $u$ and $x(u)$, solve for $\lambda$ given $\lambda'(x,u)$ and $\lambda(t_1)$ (backward sweep)
4) Calculate $h(x(u),\lambda(u))$ and take a step toward it
5) Repeat until convergence

Under some Lipschitz conditions on $f$ and $g$, the algorithm can be shown to converge; see [McAsey, Mou & Han](https://www.researchgate.net/publication/257549034_Convergence_of_the_forward-backward_sweep_method_in_optimal_control). But, as is often the case, it is much simpler to just try the algorithm and check whether the necessary conditions are met when it converges.
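Here is a minimal FBSM sketch in Python for the same toy problem as above ($f = x - u^2$, $g = x + u$, $\phi \equiv 0$, so the characterization is $u^* = \lambda/2$). The problem, the explicit Euler integrator, the relaxation weight and the tolerance are all illustrative assumptions, not taken from Lenhart's labs.

```python
import numpy as np

# FBSM sketch for: max of the integral of (x - u^2) dt over [0, 1],
# subject to x' = x + u, x(0) = x0.
# From dH/du = -2u + lambda = 0, the control characterization is u* = lambda / 2.

t0, t1, N = 0.0, 1.0, 1000
x0 = 1.0
dt = (t1 - t0) / N

u = np.zeros(N + 1)                       # 1) initial guess for the control

for it in range(200):
    # 2) forward sweep: integrate x' = x + u from x(t0) = x0 (explicit Euler)
    x = np.empty(N + 1)
    x[0] = x0
    for k in range(N):
        x[k + 1] = x[k] + dt * (x[k] + u[k])

    # 3) backward sweep: integrate lambda' = -(1 + lambda) from lambda(t1) = 0
    lam = np.empty(N + 1)
    lam[-1] = 0.0
    for k in range(N, 0, -1):
        lam[k - 1] = lam[k] + dt * (1.0 + lam[k])

    # 4) take a step toward the characterization u = lambda / 2 (simple averaging)
    u_new = 0.5 * u + 0.5 * (lam / 2.0)

    # 5) repeat until successive controls agree
    if np.max(np.abs(u_new - u)) < 1e-8:
        u = u_new
        break
    u = u_new
```

Up to discretization error, the converged control should match the analytic $u^*(t) = (e^{1-t} - 1)/2$ derived earlier, which is a convenient way to check that the necessary conditions are met.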
### Examples from Lenhart

__Example 1__: Bear population control

![](https://i.imgur.com/04SKDRH.png)
![](https://i.imgur.com/JXJOjNt.png)

__Example 2__: Bang-bang control -- Timber harvesting

The FBSM works well when we can find a characterization for $u^*$, that is

$$
\frac{\partial H}{\partial u} = 0 \implies u^* = \mathrm{something}
$$

This approach hits a wall when $H$ is linear w.r.t. $u$, as it is then impossible to find such a characterization. But we can still use the FBSM in the particular case where $u$ is bounded in $[a,b]$.

Indeed, if $\frac{\partial H}{\partial u} \neq 0$ everywhere in $[a,b]$ (and for all $t$), then we can follow the gradient to recover the maximum/minimum of $H$ on $[a,b]$ (as discussed previously when we introduced bounds). This leads to "bang-bang" solutions, so named because the control jumps from one bound to the other. Back to our example:

![](https://i.imgur.com/9oa5PRV.png =500x150)

where we have

$$
\frac{\partial H}{\partial u} = x(k\lambda - e^{-rt})
$$

After making sure that $x(t)>0$, the sign of $\frac{\partial H}{\partial u}$ is determined entirely by $k\lambda(t) - e^{-rt}$, so the optimal control switches between the bounds according to the sign of this quantity.

__Example 3__: Invasive plant species

By taking into account the modified necessary conditions for the discrete case, we can extend the FBSM to discrete-time problems.

$$
\begin{split}
&\min_u& \sum_{j=1}^5\bigg{[}r_{j,T} + B \sum_{t=0}^{T-1}u_{j,t}^2 \bigg{]} \\
&\mathrm{s.t.}\qquad &r_{j,\, t+1}=\bigg(r_{j,t}+\frac{r_{j,t}k}{\epsilon + r_{j,t}}\bigg)(1-u_{j,t}),\;r_{j,0} = \rho_j\\
& & 0\leq u_{j,t} \leq 1
\end{split}
$$

## Versatility

The indirect approach is, in general, very versatile, as it is often possible to re-express a given problem in the form for which we developed Pontryagin's maximum principle above.

### Higher-order differential equations

If we are given a problem where a state variable is defined via a higher-order differential equation, say

$$
\begin{split}
\mathrm{s.t.}\quad &x^{(n+1)}(t) = g(t,x(t),x'(t),...,x^{(n)}(t),u_1(t),...,u_m(t))\\
& x(t_0) = \alpha_1, x'(t_0)=\alpha_2, ... , x^{(n)}(t_0)=\alpha_{n+1}
\end{split}
$$

then we can easily convert this problem into the form we have been using by introducing $n+1$ state variables $x_1(t)=x(t), x_2(t)=x'(t),...,x_{n+1}(t)=x^{(n)}(t)$:

$$
\begin{split}
\mathrm{s.t.}\quad & x_1'(t) = x_2(t),\; x_1(t_0)=\alpha_1 \\
& x_2'(t) = x_3(t),\; x_2(t_0)=\alpha_2 \\
&\qquad\vdots \\
& x_n'(t) = x_{n+1}(t),\; x_n(t_0)=\alpha_n \\
& x_{n+1}'(t) = g(t,x_1(t),x_2(t),...,x_{n+1}(t),u_1(t),...,u_m(t)),\; x_{n+1}(t_0)=\alpha_{n+1}\\
\end{split}
$$
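As a concrete instance (an illustrative assumption, not an example from the notes), the double integrator $x''(t) = u(t)$ with $x(t_0)=\alpha_1$ and $x'(t_0)=\alpha_2$ corresponds to $n = 1$ and converts to:

$$
\begin{split}
& x_1'(t) = x_2(t),\; x_1(t_0)=\alpha_1 \\
& x_2'(t) = u(t),\; x_2(t_0)=\alpha_2
\end{split}
$$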
### Isoperimetric constraint

Imagine that we want to exhaust all the resources available to our control; for example, imagine $u$ is the amount of drug administered during a treatment and we know that it is safe to use a total amount $B$ over the course of the treatment. If we want to administer precisely $B$, we are facing a problem of the form:

$$
\begin{split}
&\max_u&\bigg{[}\phi(x(t_1)) + \int_{t_0}^{t_1} f(t,x(t),u(t))dt \bigg{]} \\
&\mathrm{s.t.}\qquad &x' = g(t, x(t),u(t)),\;x(t_0) = x_0\\
& & \int_{t_0}^{t_1} h(t, x(t),u(t))dt=B\\
& & a\leq u(t) \leq b
\end{split}
$$

where, for the case we just discussed, we would have $h(t, x(t),u(t)) = u(t)$.

This kind of problem can also be brought back to the form we have been working with, again by introducing a new state variable

$$
z(t) = \int_{t_0}^{t} h(s, x(s),u(s))\, ds
$$

The stated problem then becomes:

$$
\begin{split}
&\max_u&\bigg{[}\phi(x(t_1)) + \int_{t_0}^{t_1} f(t,x(t),u(t))dt \bigg{]} \\
&\mathrm{s.t.}\qquad &x' = g(t, x(t),u(t)),\;x(t_0) = x_0\\
& & z'(t) = h(t, x(t),u(t)),\;z(t_0) = 0,\, z(t_1)=B\\
& & a\leq u(t) \leq b
\end{split}
$$

### Free terminal time

It is worth noting that the maximum principle can be adapted to problems with free terminal time, that is, problems of the form:

$$
\begin{split}
&\max_{u,\, T}&\bigg{[}\phi(T,x(T)) + \int_{t_0}^{T} f(t,x(t),u(t))dt \bigg{]} \\
&\mathrm{s.t.}\qquad &x' = g(t, x(t),u(t)),\;x(t_0) = x_0\\
\end{split}
$$

In this case, a new necessary condition arises that the Hamiltonian must satisfy at an optimal control:

$$
H(T^*,x^*(T^*),u^*(T^*),\lambda^*(T^*)) + \phi_t(T^*,x^*(T^*)) = 0
$$

So if we are able to solve the initial problem as usual with $t_1=T^*$, an unknown quantity, then we can use this new necessary condition to try to determine $T^*$.

See Lenhart chapter 20 for additional information.

_________