# Obstacle Avoidance Motion Planning via Model Predictive Control
###### tags: `ROBOTICS`, `motion planning`
## Brief description
#### Tasks:
1. Gathered state data from Armbot and built a DataLoader of (state, action, next state) triplets via PyTorch (a minimal sketch follows this list).
2. Designed the Model Predictive Path Integral (MPPI) algorithm and a cost function that measures distance to the target state.
3. Demonstrated that Armbot can avoid obstacles in a cluttered environment using the MPPI algorithm.
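As a rough illustration of step 1, the sketch below shows one way the (state, action, next state) triplets could be wrapped in a PyTorch `Dataset` and `DataLoader`. The class name, field names, and tensor shapes are assumptions for illustration, not the project's exact collection code.

```python!
import torch
from torch.utils.data import Dataset, DataLoader

class TripletDataset(Dataset):
    """Hypothetical wrapper around collected (state, action, next state) triplets."""
    def __init__(self, states, actions, next_states):
        # states/next_states: (N, 3) for [x, y, theta]; actions: (N, 3) for [p, phi, l]
        self.states = torch.as_tensor(states, dtype=torch.float32)
        self.actions = torch.as_tensor(actions, dtype=torch.float32)
        self.next_states = torch.as_tensor(next_states, dtype=torch.float32)

    def __len__(self):
        return len(self.states)

    def __getitem__(self, idx):
        return {
            'state': self.states[idx],
            'action': self.actions[idx],
            'next_state': self.next_states[idx],
        }

# Example usage (assuming arrays s, a, s_next collected from Armbot):
# loader = DataLoader(TripletDataset(s, a, s_next), batch_size=64, shuffle=True)
```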
## Planar Pushing Learning

### State Space and Action Space
For the planar pushing task, we use the following state and action spaces.
The image below illustrates the state space, which is the block's planar pose on top of the table:
$$
\mathbf x = \begin{bmatrix} x & y & \theta\end{bmatrix}^\top\in \text{SE}(2)
$$
Note that the state contains only position elements, and not velocity terms. This is because we assume that the pushing task is *quasi-static*. This means that the pushing actions are slow enough that the velocity and inertia of the block are negligible. In other words, if the robot stops pushing, the block also stops.

Note that the robot is centered at the origin.
The following image shows the robot's action space.

Each action $\mathbf u = \begin{bmatrix} p & \phi & \ell\end{bmatrix}^\top\in \mathbb R^3$ consists of (see the sampling sketch after this list):
* $p \in [-1,1]$: pushing location along the lower block edge.
* $\phi \in [-\frac{\pi}{2},\frac{\pi}{2}]$: pushing angle.
* $\ell\in [0,1]$: pushing length as a fraction of the maximum pushing length, which is 0.1 m.
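To make the bounds concrete, here is a small sketch of sampling random actions within these limits and clamping arbitrary actions back into the valid set. The helper names and constants are illustrative assumptions, not part of the original code.

```python!
import torch

# Action bounds from the list above: p in [-1, 1], phi in [-pi/2, pi/2], l in [0, 1].
# These helpers are illustrative assumptions, not the project's API.
ACTION_LOW = torch.tensor([-1.0, -torch.pi / 2, 0.0])
ACTION_HIGH = torch.tensor([1.0, torch.pi / 2, 1.0])
MAX_PUSH_LENGTH = 0.1  # meters

def sample_actions(num_samples=1):
    # Uniform samples in the box [ACTION_LOW, ACTION_HIGH].
    u = torch.rand(num_samples, 3)
    return ACTION_LOW + u * (ACTION_HIGH - ACTION_LOW)

def clamp_action(action):
    # Project an arbitrary action back into the valid set.
    return torch.clamp(action, ACTION_LOW, ACTION_HIGH)

actions = sample_actions(5)
push_lengths_m = actions[:, 2] * MAX_PUSH_LENGTH  # l is a fraction of 0.1 m
```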
## Absolute Dynamics Learning
Given the collected data triplets $(\mathbf x_{t}, \mathbf u_{t}, \mathbf x_{t+1})$, we learn to predict the state transition dynamics directly:
$$\hat{\mathbf x}_{t+1} = f(\mathbf x_{t} , \mathbf u_{t}) $$
```python!
import torch
import torch.nn as nn

class AbsoluteDynamicsModel(nn.Module):
    """
    Model the absolute dynamics x_{t+1} = f(x_t, u_t)
    """
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.state_dim = state_dim
        self.action_dim = action_dim
        self.num_hidden = 100
        # The network input is the concatenated state and action;
        # the output is the predicted next state.
        self.input_dim = state_dim + action_dim
        self.output_dim = state_dim
        self.linear1 = nn.Linear(self.input_dim, self.num_hidden)
        self.act1 = nn.ReLU()
        self.linear2 = nn.Linear(self.num_hidden, self.num_hidden)
        self.act2 = nn.ReLU()
        self.linear3 = nn.Linear(self.num_hidden, self.output_dim)

    def forward(self, state, action):
        """
        Compute the next state resulting from applying the provided action to the provided state.
        :param state: torch tensor of shape (..., state_dim)
        :param action: torch tensor of shape (..., action_dim)
        :return: next_state: torch tensor of shape (..., state_dim)
        """
        # Concatenate along the last dimension so batched and unbatched inputs both work.
        xu = torch.cat((state, action), dim=-1)
        x = self.act1(self.linear1(xu))
        x = self.act2(self.linear2(x))
        next_state = self.linear3(x)
        return next_state
```
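For completeness, a minimal sketch of how such a model could be trained with an MSE loss on the predicted next state. The hyperparameters and the `loader` (from the DataLoader sketch above) are assumptions for illustration.

```python!
import torch
import torch.nn as nn

# Minimal training-loop sketch; hyperparameters and `loader` are assumed values.
model = AbsoluteDynamicsModel(state_dim=3, action_dim=3)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(100):
    for batch in loader:
        pred_next = model(batch['state'], batch['action'])
        loss = loss_fn(pred_next, batch['next_state'])
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```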
## Residual Dynamics Learning
Given the collected data triplets $(\mathbf x_{t}, \mathbf u_{t}, \mathbf x_{t+1})$, we now learn to predict the state transition dynamics as a residual on the current state:
$$\hat{\mathbf x}_{t+1} = \mathbf x_{t} + \Delta \mathbf x_{t} = \mathbf x_{t} + f(\mathbf x_{t}, \mathbf u_{t}) $$
```python!
class ResidualDynamicsModel(nn.Module):
    """
    Model the residual dynamics x_{t+1} = x_t + f(x_t, u_t)
    Observation: the network only needs to predict the state *difference*
    as a function of the state and action.
    """
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.state_dim = state_dim
        self.action_dim = action_dim
        self.num_hidden = 100
        self.input_dim = state_dim + action_dim
        self.output_dim = state_dim
        self.linear1 = nn.Linear(self.input_dim, self.num_hidden)
        self.act1 = nn.ReLU()
        self.linear2 = nn.Linear(self.num_hidden, self.num_hidden)
        self.act2 = nn.ReLU()
        self.linear3 = nn.Linear(self.num_hidden, self.output_dim)

    def forward(self, state, action):
        """
        Compute the next state resulting from applying the provided action to the provided state.
        :param state: torch tensor of shape (..., state_dim)
        :param action: torch tensor of shape (..., action_dim)
        :return: next_state: torch tensor of shape (..., state_dim)
        """
        xu = torch.cat((state, action), dim=-1)
        x = self.act1(self.linear1(xu))
        x = self.act2(self.linear2(x))
        # Predict the state delta and add it back to the current state (skip connection).
        next_state = state + self.linear3(x)
        return next_state
```
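Once trained, either model can be rolled out over a horizon by feeding each prediction back in as the next input; this multi-step rollout is the core operation an MPPI controller performs when scoring sampled action sequences. A minimal sketch, assuming a trained `model` and an action sequence tensor (both hypothetical here):

```python!
import torch

def rollout(model, initial_state, action_sequence):
    # initial_state: (state_dim,); action_sequence: (T, action_dim).
    # Returns the predicted trajectory of shape (T, state_dim).
    states = []
    state = initial_state
    with torch.no_grad():
        for action in action_sequence:
            state = model(state, action)
            states.append(state)
    return torch.stack(states)

# Example usage with hypothetical values: the same straight push repeated 10 times.
x0 = torch.zeros(3)
u_seq = torch.tensor([[0.0, 0.0, 1.0]]).repeat(10, 1)
# traj = rollout(model, x0, u_seq)
```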
## To be continued...