# Understanding New Model

## Sanity Check: Training on Push Recovery

Train the existing model on the push dataset, looking only at the data after the first push and before the subsequent push. Note that this is a very small dataset ($< 400$ training points). The error attained at the end of training is on the order of $10^{-4}$.

We plot:

- Dark green dots: individual velocity estimates, $\hat{v}(t)$
- Dark green curve: mean of the $T$ velocity estimates at time $t$, $\text{mean}(\hat{v}(t))$
- Shaded green region: three standard deviations from the mean (dark green curve), $\text{mean}(\hat{v}(t)) \pm 3\,\text{SD}(\hat{v}(t))$
- Orange curve: true velocity fed as input to the controller, $v(t)$

![](https://i.imgur.com/BdWqkT5.png)
![](https://i.imgur.com/k93S3iI.png)

Visualize the predicted trajectories in the down direction. Notice that the errors here are $< 5$ cm.

![](https://i.imgur.com/zEYmrDr.png)

## Velocity Estimates

Now, we visualize the velocity estimates made by our actual model. At each time $t$, we get a distribution of $T$ velocity estimates, where $T$ is the fixed time horizon (the number of forward steps we are predicting).

### Figure Eight (Evaluation)

![](https://i.imgur.com/wGhIdKL.png)
![](https://i.imgur.com/ueIa3Wy.png)
![](https://i.imgur.com/bvvjpVK.png)

### Manual Push

![](https://i.imgur.com/YvKNfyy.png)

Notice that the distribution of velocity estimates has a greater spread during recovery from the push ($5.75$ s to $6$ s) than when the model was trained on the push directly.
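For reference, the plotted mean and $\pm 3\,\text{SD}$ band over the $T$ estimates at each timestep can be computed as follows. This is a minimal NumPy sketch with synthetic data; the array shapes and values are illustrative placeholders, not taken from our logs.

```python
import numpy as np

# Hypothetical data: at each of 200 timesteps we have T = 10 velocity
# estimates (one per forward prediction step). Shapes are illustrative.
rng = np.random.default_rng(0)
v_hat = rng.normal(loc=1.0, scale=0.1, size=(10, 200))  # (T, timesteps)

mean_v = v_hat.mean(axis=0)         # dark green curve
sd_v = v_hat.std(axis=0, ddof=1)    # spread of the T estimates at each t
lower = mean_v - 3 * sd_v           # bottom of shaded green region
upper = mean_v + 3 * sd_v           # top of shaded green region
```

The individual estimates (dark green dots) are just the columns of `v_hat` scattered against time.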
## Force Predictions

### Figure Eight (Evaluation)

![](https://i.imgur.com/OdTzhK5.png)
![](https://i.imgur.com/fsvqOp6.png)

### Manual Push

![](https://i.imgur.com/Skju1wo.png)
![](https://i.imgur.com/uB2sPms.png)
![](https://i.imgur.com/6bNgg3h.png)

### Two-Drone Dataset (Eight-Hover)

First, visualize positions:

![](https://i.imgur.com/yjvhTSS.png)

Next, visualize acceleration predictions:

![](https://i.imgur.com/FIITZoT.png)
![](https://i.imgur.com/8LF7fPZ.png)

## Correcting for Bias

- Ideally, in the absence of external force, the acceleration predictions should be centered at $0$; otherwise, we are predicting external forces that do not exist.
- This is clearly not the case (see agent U1 in the figure eight evaluation dataset and agent U2 in the eight-hover dataset).
- **An illustrative example** (figure eight evaluation dataset):
  1. Control inputs are, on average, smaller (i.e., more negative) than those in the training data.
     ![](https://i.imgur.com/SOdtTZR.png)
  2. The neural network therefore believes the drone should have a more negative velocity than it actually does.
     ![](https://i.imgur.com/XnHQUgu.png)
     ![](https://i.imgur.com/pamoJ1d.png)
  3. This results in more negative acceleration predictions and hence more positive force predictions.
     ![](https://i.imgur.com/OdTzhK5.png)
- How can we correct for this?
  - A running mean: correct the force estimates by a running mean, rejecting outliers when computing it.
  - Ajay: bias estimation from our flights.

# Force Predictions Using Trajectories

Consider a two-drone system. Let $\mathbf{x}^{(1)}(t)$ be the state of the first drone at time $t \in \mathbb{N}$ and $\mathbf{x}^{(2)}(t)$ the state of the second drone at time $t$. The model from Neural Swarm is

$$\hat{f}^{(i)}(t) = h_{\boldsymbol{\theta}}\left(\mathbf{x}^{(i)}(t) - \mathbf{x}^{(i')}(t)\right), \quad \{i'\} = \{1, 2\} \setminus \{i\}.$$

In the two-drone case, we know each drone's planned trajectory for the next $T$ timesteps.
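A minimal sketch of this pairwise model, with a randomly initialized stand-in for $h_{\boldsymbol{\theta}}$. The single ReLU layer, the dimensions, and all weights here are illustrative assumptions, not the trained Neural Swarm network.

```python
import numpy as np

def h_theta(dx, W1, b1, W2, b2):
    """Toy stand-in for the interaction network h_theta: maps the
    relative state x_i - x_i' to a force estimate f_hat_i."""
    hidden = np.maximum(0.0, W1 @ dx + b1)  # single ReLU layer
    return W2 @ hidden + b2

rng = np.random.default_rng(0)
d, h = 6, 32  # illustrative state and hidden dimensions
W1, b1 = rng.normal(size=(h, d)), np.zeros(h)
W2, b2 = rng.normal(size=(3, h)), np.zeros(3)  # 3-D force output

x1, x2 = rng.normal(size=d), rng.normal(size=d)  # placeholder states
f_hat_1 = h_theta(x1 - x2, W1, b1, W2, b2)  # force on drone 1 from drone 2
f_hat_2 = h_theta(x2 - x1, W1, b1, W2, b2)  # force on drone 2 from drone 1
```

Note that the same network is evaluated for both drones; only the sign of the relative state flips.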
Why don't we model

$$\left(\hat{f}^{(i)}(t), \hat{f}^{(i)}(t + 1), \ldots, \hat{f}^{(i)}(t + T)\right) = h_{\boldsymbol{\theta}}\left(\mathbf{x}^{(i)}(t) - \mathbf{x}^{(i')}(t), \ldots, \mathbf{x}^{(i)}(t+T) - \mathbf{x}^{(i')}(t+T)\right)?$$

We could feed these future force estimates to the controller to plan a better trajectory over the next $T$ timesteps.

## Theoretical Guarantees

What are the errors in our force predictions? If we predict the force at time $t' = 50$, $\hat{f}(t')$, starting from $t = 20$, we would expect its error, on average, to be larger than had we predicted the same force starting from $t = 45$. In other words, our model performs better when predicting forces nearer in the future.

We can give [explicit bounds for the network architectures we are using](https://www.overleaf.com/3331692292fbtfthvcqtsj). They do not, however, take into account the sequential nature of our data. Our hope is that this is encoded in the network weights.

![](https://i.imgur.com/409FuoP.png)

The bounds we have proven are quite conservative (for example, $\Vert \mathbf{x}(t) \Vert_2$ can be quite large). However, we know how to tighten them.

## To-Dos

1. Now that we have better measurements of the drone's acceleration, can we pass them to the model and improve its performance?
2. Account for shifts in the distribution of control inputs. Right now, these shifts corrupt our force predictions.
3. Test whether our hypothesis is correct for the $T$-step ReLU network. That is, are the weight vectors in the output layer increasing in Euclidean distance?

One-dimensional case (d):

![](https://i.imgur.com/KGSUGXo.png)

One-dimensional case (e):

![](https://i.imgur.com/V4jPmw2.png)

One-dimensional case (n):

![](https://i.imgur.com/23VPf9M.png)

What does this mean intuitively?
The neural network

$$f(\mathbf{x}) = \mathbf{W}^K\sigma\left(\mathbf{W}^{K-1} \cdots \sigma(\mathbf{W}^0\mathbf{x})\right)$$

can be decomposed as $f(\mathbf{x}) = \mathbf{W}^K \varphi(\mathbf{x})$, where $\varphi(\mathbf{x})$ is a mapping into a Euclidean feature space. In the output layer, each prediction is a linear combination of these features. We are asserting that, for $i < j$, the $i$th and $j$th linear combination vectors should be closer in the $\ell^2$ sense than the $i$th and $(j+1)$th linear combination vectors.

Next: look at this for the three-dimensional model.
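The check could be run on a network's output layer as sketched below. Here a single ReLU layer with random placeholder weights stands in for everything below the output; for the real test one would load the trained $\mathbf{W}^K$ and examine whether the distances grow with the horizon gap.

```python
import numpy as np

rng = np.random.default_rng(1)
d, h, T = 4, 16, 5  # illustrative input, feature, and horizon sizes

W0 = rng.normal(size=(h, d))  # stands in for all layers below the output
WK = rng.normal(size=(T, h))  # output layer: row j predicts step j ahead

def phi(x):
    """Feature map: the network with its output layer stripped off."""
    return np.maximum(0.0, W0 @ x)

x = rng.normal(size=d)
f = WK @ phi(x)  # each prediction is a linear combination of features

# Quantities to examine: l2 distance from row 0 of W^K to row j, for
# each horizon j. The hypothesis predicts these increase in j for a
# trained network (random weights, as here, carry no such structure).
dists = np.array([np.linalg.norm(WK[0] - WK[j]) for j in range(1, T)])
```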