## From noise shaping to direct distribution control

In the previous few lectures, we saw how, with an appropriate injection of noise into the system, one can shape the long-term behavior of its probability density. However, one can also pose the distribution control problem directly as an optimization problem. In this lecture, we revisit such an approach, which comes from [optimal transport theory](https://en.wikipedia.org/wiki/Transportation_theory_(mathematics)). Additionally, we will see the continuity equation, which is fundamental to many generative approaches such as diffusion models and flow matching.

---

## Continuity Equation

Let us review the continuity equation and its relation to ODEs and dynamical systems.

### ODE viewpoint

Consider an ordinary differential equation
\begin{align}
\dot{x} &= f(x) \\
x(0) &= x_0
\end{align}
For a general vector field $f:\mathbb{R}^d \rightarrow \mathbb{R}^d$ this is a nonlinear differential equation. An example is the Van der Pol oscillator
\begin{align}
\dot{x} &= v, \\
\dot{v} &= (1 - x^2)\,v - x .
\end{align}

![single](https://hackmd.io/_uploads/HJIesZL4-g.gif)

The plot above highlights the classical dynamical-systems picture: a single initial condition $x_0$ is pushed forward along the trajectory generated by the vector field.

### Density viewpoint

As in the diffusive case from the previous lectures, instead of looking at how the solution flows from an initial condition along the vector field, one can consider an alternative quantity: the probability density $\rho(y,t)$ of finding the state $x(t)$ near a point $y$ at time $t$, defined by $\mathbb{P}(x(t) \in dy) = \rho(y,t)\,dy$. This quantity evolves according to a linear partial differential equation, the [continuity equation](https://en.wikipedia.org/wiki/Continuity_equation):
\begin{align*}
\partial_t \rho +\nabla_x \cdot (f(x)\rho) &= 0 \\
\rho(x,0) &= \rho_0
\end{align*}
where $\nabla_x \cdot g = \sum_{i=1}^d \partial_{x_i} g_i$ is the divergence operator. Even though the original ODE is nonlinear, the continuity equation is linear in $\rho$. This perspective has been very fruitful in the study of chaotic systems, where it is often more tractable to follow the evolution of an ensemble of initial conditions than that of a single trajectory.

### From trajectories to ensembles

The following plot shows the evolution of the Van der Pol oscillator for a family of initial conditions sampled uniformly from $[-1,1]^2$.

![multiple](https://hackmd.io/_uploads/Syqys-IVZx.gif)

If one instead solves the continuity equation corresponding to this ODE, the density evolves as in the following plot.

![kde](https://hackmd.io/_uploads/H1GJs-UNWl.gif)

The key takeaway is that we can either (i) simulate many trajectories and estimate $\rho(x,t)$ empirically, or (ii) solve directly for $\rho(x,t)$ through the PDE. The second viewpoint is what will let us pose distribution control problems in a convex form.
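To make the ensemble picture concrete, here is a minimal Python sketch of option (i): it integrates a family of Van der Pol trajectories started uniformly in $[-1,1]^2$ and forms a kernel density estimate of $\rho(x,t)$ at a fixed time. This is not the code used to produce the figures above; the sample size, integration horizon, evaluation time, and the default KDE bandwidth are arbitrary choices made for illustration.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.stats import gaussian_kde

def van_der_pol(t, z):
    # z = (x, v); the Van der Pol vector field written above
    x, v = z
    return [v, (1.0 - x**2) * v - x]

rng = np.random.default_rng(0)
n_samples = 1000                                     # ensemble size (an assumption)
z0 = rng.uniform(-1.0, 1.0, size=(n_samples, 2))     # initial conditions in [-1,1]^2

t_eval = np.linspace(0.0, 10.0, 101)                 # time grid (an assumption)
trajectories = np.empty((n_samples, 2, t_eval.size))
for k in range(n_samples):
    sol = solve_ivp(van_der_pol, (t_eval[0], t_eval[-1]), z0[k], t_eval=t_eval)
    trajectories[k] = sol.y

# Empirical estimate of rho(x, t) at a fixed time via a Gaussian KDE.
t_index = 50
samples_at_t = trajectories[:, :, t_index]           # shape (n_samples, 2)
kde = gaussian_kde(samples_at_t.T)

# Evaluate the estimated density on a grid (e.g. for a contour plot).
grid_x, grid_v = np.meshgrid(np.linspace(-3, 3, 100), np.linspace(-3, 3, 100))
density = kde(np.vstack([grid_x.ravel(), grid_v.ravel()])).reshape(grid_x.shape)
```

Replacing the KDE step with a numerical solver for the continuity equation would give option (ii); for smooth densities the two estimates should agree as the number of samples grows.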
---

## Optimal control: trajectory formulation

Consider the optimal control problem
\begin{align}
&\inf_{x,u} \int_0^T L(x(t))dt + \int_0^T M(u(t))dt \\
&\dot{x} = f_0(x)+\sum_{i}u_i(t)f_i(x) := f(x,u) \\
& x(0) = x_0
\end{align}
For general vector fields $f_i(x)$ this problem is non-convex even if $L$ and $M$ are convex. This is the classical formulation: we choose a control to shape the evolution of a single trajectory. Next, we translate this into a problem over probability distributions.

---

## Density dynamics under control

The evolution of the probability density is again given by the continuity equation, now with the controlled vector field:
\begin{align*}
\partial_t \rho +\nabla_x \cdot (f(x,u)\rho) &= 0 \\
\rho(x,0) &= \rho_0
\end{align*}
We can now treat this continuity equation, rather than the ODE above, as our state equation. The conceptual shift, the same one that underlies generative modeling, is that instead of controlling a single state $x(t)$, we control how the whole distribution $\rho(x,t)$ moves through the state space.

---

## Feedback control and expected-cost objective

Suppose that, in addition, instead of *open-loop* controls $u_i(t)$ one looks for *feedback* controls $u_i(x,t)$. A natural extension of the cost above is
\begin{equation}
\inf_{\rho, u(x,t)} \int_0^T \int_{\mathbb{R}^d} L(x)\rho(x,t)dxdt + \int_0^T \int_{\mathbb{R}^d} M(u(x,t))\rho(x,t)dxdt
\end{equation}
This cost can be interpreted as the *expected cost* of states and controls, since one is averaging against the distribution $\rho(x,t)$ of states. At this point, the decision variables are the evolving density $\rho$ and the feedback control field $u(x,t)$. The difficulty is that the dynamics and the objective couple them multiplicatively.

---

## Convexification via flux variables

As written, this problem is not convex in the decision variables $u_i(x,t)$ and $\rho(x,t)$. However, there is a way to convexify it. Define the flux variables
\begin{equation}
m_i(x,t) = u_i(x,t) \rho(x,t).
\end{equation}
Then the optimization problem can be rewritten as the following **convex problem**:
\begin{align*}
\inf_{\rho, m} \int_0^T \int_{\mathbb{R}^d} L(x)\rho(x,t)dxdt + \int_0^T \int_{\mathbb{R}^d} M\Big(\frac{m(x,t)}{\rho(x,t)}\Big)\rho(x,t)dxdt \\
\partial_t \rho +\nabla_x \cdot \Big (f_0(x) \rho(x,t)+\sum_i f_i(x) m_i(x,t) \Big ) = 0 \\
\rho(x,0) = \rho_0 \\
\rho(x,t) \geq 0
\end{align*}
It can be checked that the functional $\int_0^T \int_{\mathbb{R}^d} M\big(\frac{m(x,t)}{\rho(x,t)}\big)\rho(x,t)\,dx\,dt$ is jointly convex in $m$ and $\rho$ whenever $M$ is convex (it is the perspective of $M$, integrated over space and time), and the continuity equation constraint is linear in the decision variables. This was a groundbreaking observation due to [Benamou-Brenier](https://link.springer.com/article/10.1007/s002110050002) in optimal transport theory, who considered the case $\dot{x} = u$. Hence, one advantage of the continuity-equation perspective is that the problem becomes convex.
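To see the convexified formulation in action, here is a minimal numerical sketch of the Benamou-Brenier case $\dot{x} = u$ in one spatial dimension with quadratic cost $M(u) = u^2$. The grid sizes, the periodic domain, the Gaussian endpoint densities, and the use of `cvxpy` are illustrative choices, not part of the lecture; the point is only that, in the variables $(\rho, m)$, the discretized problem is a standard convex (second-order cone) program.

```python
import numpy as np
import cvxpy as cp

# Coarse space-time grid on a periodic spatial domain (sizes are illustrative).
nx, nt = 40, 16
xs = np.linspace(0.0, 1.0, nx, endpoint=False)
dx, dt = 1.0 / nx, 1.0 / (nt - 1)

# Initial and target densities: two Gaussian bumps, normalized on the grid.
# The small additive floor keeps rho away from zero for the solver.
def bump(center, width=0.06):
    p = np.exp(-0.5 * ((xs - center) / width) ** 2) + 1e-3
    return p / p.sum()

rho0, rho1 = bump(0.25), bump(0.75)

# Decision variables: density rho and flux m = u * rho (m lives on time intervals).
rho = cp.Variable((nt, nx), nonneg=True)
m = cp.Variable((nt - 1, nx))

constraints = [rho[0, :] == rho0, rho[nt - 1, :] == rho1]
# Discrete continuity equation d_t rho + d_x m = 0, periodic in x.
constraints += [
    (rho[t + 1, i] - rho[t, i]) / dt + (m[t, i] - m[t, (i - 1) % nx]) / dx == 0
    for t in range(nt - 1) for i in range(nx)
]

# Objective: integral of M(m / rho) * rho with M(u) = u^2, i.e. m^2 / rho.
kinetic = sum(cp.quad_over_lin(m[t, i], rho[t, i])
              for t in range(nt - 1) for i in range(nx))

problem = cp.Problem(cp.Minimize(kinetic * dx * dt), constraints)
problem.solve()
print("discretized transport cost:", problem.value)

# The feedback control can be recovered a posteriori as u = m / rho
# wherever rho is bounded away from zero.
```

The key line is `quad_over_lin(m, rho)`, which is exactly the perspective function $m^2/\rho$ appearing in the objective above, and it is what makes the discretized problem recognizably convex to the solver.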
---

An example of this approach is the control of the double gyre flow considered in [1](https://arxiv.org/pdf/1612.01193). The controlled equations are
\begin{align}
\dot{x} &= -\pi A \sin(\pi f(x,t))\cos(\pi y) + u_1, \\
\dot{y} &= \pi A \cos(\pi f(x,t))\sin(\pi y)\frac{\partial f(x,t)}{\partial x} + u_2,
\end{align}
where
\begin{equation}
f(x,t) = \beta \sin(\omega t)x^2 + (1 - 2\beta \sin(\omega t))x
\end{equation}
is the time-periodic forcing in the system. A visualization of the transport of solutions from one family of configurations to another [can be found here](https://www.youtube.com/watch?v=Pu7sCkpm4RY).

### Curse of Dimensionality

However, note that this approach typically suffers from the curse of dimensionality: solving the continuity equation requires representing $\rho(x,t)$ over the whole state space, and the computational cost grows rapidly with dimension. As a result, it generally does not scale beyond $d \approx 3$ in practice. Later, we will see sample-based approaches, integrated with deep learning, that avoid gridding the full space and therefore exhibit much better scaling behavior.

---

References

[1] Karthik Elamvazhuthi and Piyush Grover, ["Optimal transport over nonlinear systems via infinitesimal generators on graphs."](https://arxiv.org/pdf/1612.01193)