# Training in Simulation (Part 2)
## Recap
1. Training a neural network to model the nominal dynamics is equivalent to training a neural network $\mathbf{r}_{\theta}$ such that
\begin{align*}
\mathbf{f}_{\text{ext}}(t) = \underbrace{\mathbf{a}(t) - \mathbf{u}(t)}_{\text{ideal}} - \underbrace{\mathbf{r}_{\theta}(\mathbf{v}(t))}_{\text{residual}}.
\end{align*} Recall that $\mathbf{r}_{\theta} = \frac{d}{dt} \mathbf{f}_{\theta}(\mathbf{v}(t))$. In our "naive" approach, we take $\mathbf{r}_{\theta}(\mathbf{v}(t)) = [0, 0, -g]^T$. (A code sketch of this label extraction follows the recap.)
2. One issue with this model is that we need the forces other than the drone-drone forces to have mean zero in our training dataset: \begin{align*} \mathbf{f}_{\text{ext}}^{(i)} = \underbrace{\phi^{\star}\left( \bigcup_{j \in \mathcal{N}^{(i)}} \mathbf{x}^{(j)} - \mathbf{x}^{(i)}\right)}_{\text{drone-drone forces}} + \underbrace{\boldsymbol{\varepsilon}}_{\text{other forces}}, \quad \mathbb{E}[\boldsymbol{\varepsilon}] = \mathbf{0}. \end{align*}
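To make item 1 concrete, here is a minimal sketch of the label extraction in NumPy, assuming mass-normalized quantities so that $\mathbf{a}$, $\mathbf{u}$, and $\mathbf{r}_{\theta}$ share units; the function and variable names are hypothetical.

```python
import numpy as np

G = 9.81  # m/s^2

def naive_residual(v):
    """The 'naive' residual from the recap: the constant gravity
    vector [0, 0, -g] (sign convention copied from the notes)."""
    return np.array([0.0, 0.0, -G])

def force_labels(accel, u, vel, r_theta=naive_residual):
    """f_ext(t) = a(t) - u(t) - r_theta(v(t)), per the recap.

    accel, u, vel: (T, 3) arrays of measured acceleration, mass-normalized
    control input, and velocity at each timestep (hypothetical names).
    """
    residual = np.stack([r_theta(v) for v in vel])  # (T, 3)
    return accel - u - residual
```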
## Simulation Results
We visualize the curriculum learning procedure. As we iteratively augment our training dataset, the learned force function should become increasingly accurate.
Here, we are learning the force in the 'd' direction, $f_z$, as a function of the two drones' relative position.
### Learned Force Function
#### Stage 1 ($\Delta p = -2m$)

Force estimates are very small for $\Delta p > -2m$, which suggests we need more training data to accurately learn the true force function.
However, we can fly our drone accurately at $\Delta p = -2m$ (see the video Jan sent).
#### Stage 2 ($\Delta p = -1.5m$)

#### Stage 3 ($\Delta p = -1m$)

#### Stage 4 ($\Delta p = -0.5m$)

What is the issue with our previous training procedure? We used the mean-squared error, which gives more weight to accurately predicting large forces; errors on the small far-field forces contribute almost nothing to the loss.
Instead, try using the loss function
\begin{align*}
L(\theta) := \sum_{t=1}^T \sum_{i=1}^N \frac{1}{\left\Vert \mathbf{f}_t^{(i)} \right\Vert_2}\left\Vert \phi_{\theta}\left( \bigcup_{j \in \mathcal{N}_t^{(i)}} \mathbf{x}_t^{(j)} - \mathbf{x}_t^{(i)}\right) - \mathbf{f}_t^{(i)}\right\Vert_2
\end{align*}
where $N$ is the number of drones and $T$ is the number of training samples. The important thing is that we normalize by the $\ell^2$ norm of the force labels $\mathbf{f}_t^{(i)}$.
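A minimal PyTorch sketch of this loss, assuming `phi_theta` is the learned force network applied to batched relative positions; the small `eps` guard against zero-norm labels is our addition, not something from the notes.

```python
import torch

def normalized_force_loss(phi_theta, rel_positions, force_labels, eps=1e-6):
    """Weight each sample's error by 1 / ||f||_2 so small forces are not
    drowned out by large ones, as they are under plain MSE.

    rel_positions: (B, n_neighbors, 3) relative positions x_j - x_i,
                   flattened over timesteps t and drones i
    force_labels:  (B, 3) force labels f_t^{(i)}
    """
    pred = phi_theta(rel_positions)                        # (B, 3)
    err = torch.linalg.norm(pred - force_labels, dim=-1)   # ||phi - f||_2
    weights = 1.0 / (torch.linalg.norm(force_labels, dim=-1) + eps)
    return (weights * err).sum()
```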

## Force Compensation Model (Deployed in Simulation)

## Accounting for Controller Biases
For the real-world datasets, the biases in the 'n' and 'e' directions are, on average, $0.1$–$0.3\,\mathrm{m/s^2}$. In the 'd' direction, we had a very large bias of approximately $1\,\mathrm{m/s^2}$.
Ajay: We can correct the bias in the 'd' direction so that it is comparable to the biases in the 'n' and 'e' directions.
**Question**: If our biases are, on average, $0.1$–$0.3\,\mathrm{m/s^2}$, should we even worry about them at all? The drone-drone forces will have much larger magnitudes.
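A sketch of one way to implement Ajay's correction, assuming we can record a 'solo' segment with no interfering drone nearby, so the true drone-drone force is approximately zero and whatever remains in the labels is controller bias plus zero-mean noise; the names are hypothetical.

```python
import numpy as np

def estimate_controller_bias(f_ext_solo):
    """Per-axis (n, e, d) bias estimate from a solo-flight segment.

    f_ext_solo: (T, 3) force labels recorded with no interfering drone,
    so their mean is an estimate of the controller bias.
    """
    return f_ext_solo.mean(axis=0)

# Usage sketch: subtract the estimated bias from subsequent labels.
# f_labels_corrected = f_labels - estimate_controller_bias(f_solo)
```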
### Mounting
Jennifer's idea:
Rigidly mount the 'sufferer' (S) drone (i.e. the one we are performing the force estimation for) to a mast. Set the S drone to the approximate thrust necessary to counteract gravity. The rigid mounting prevents the S drone from actually moving, and with integrated load cells, we can directly measure the forces imposed on the mount/S-drone coupling.
In this setup, the S drone has equivalent aerodynamic interactions because it is at the correct thrust. When the interfering (I) drone is then flown in proximity to the S drone, the variations in force on the S drone are recorded, directly providing the data for our predictive model. Since S is not running a dynamic controller, no controller bias is present.
If the experiment is run with a sweep of the following factors, we should be able to expose the NN to the majority of effects that arise from prop-wash interactions (a sketch of such a sweep follows the list):
* S-I drone relative positions: builds a map of the basic static interaction forces.
* I drone velocity: corrects for velocity-induced delays in the force field coming off I.
* I drone acceleration: corrects for the I drone's thrust level changing the shape of the force field.
* S drone static thrust level: corrects for the changes in the force field due to S varying its thrust in response to the I drone's downwash.
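A small sketch of how such a sweep might be enumerated; every range below is a placeholder value, not a number from the notes.

```python
import itertools
import numpy as np

# Placeholder sweep ranges (hypothetical values, for illustration only).
rel_positions = [(0.0, 0.0, dz) for dz in np.arange(-2.0, -0.25, 0.25)]  # S-I offsets [m]
i_velocities = [0.0, 0.5, 1.0]       # I drone speed [m/s]
i_accelerations = [0.0, 0.5, 1.0]    # I drone acceleration [m/s^2]
s_thrust_levels = [0.9, 1.0, 1.1]    # S thrust as a fraction of hover thrust

# Every combination becomes one mounted-drone experiment.
experiments = list(itertools.product(
    rel_positions, i_velocities, i_accelerations, s_thrust_levels))
print(f"{len(experiments)} experiments in the sweep")
```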
#### Alternative
We may also be able to run a dynamic controller on the S drone that varies its thrust in response to the load-cell reading (i.e. attempts to return the load-cell loading to 0). Since this controller has no estimation process for location/orientation, it may make it easier to train the full model while accounting for the fact that controller actions have an impact on the S-I drone aerodynamic system.
## Training in the Real World
What do we need to consider when transferring our training procedure from simulation to reality?
- No oracle forces to validate our force labels
- Presence of biases: measured forces will not line up as well with the true drone-drone forces
- Force function will be higher-dimensional: it will be a function of relative velocity and perhaps relative acceleration (see the sketch after this list)
- In simulation, we have only modeled the drone-drone forces in the 'd' direction
- Force function will be continuous
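To illustrate the higher-dimensional force function, a minimal sketch of what $\phi_{\theta}$ could look like with relative velocity added as a per-neighbor input; the architecture and all sizes are hypothetical.

```python
import torch
import torch.nn as nn

class PairwiseForceNet(nn.Module):
    """Per-neighbor features are relative position and relative velocity
    (6-D instead of 3-D); summing over neighbors keeps the model
    permutation-invariant, matching the set-valued input in the loss above.
    """
    def __init__(self, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(6, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),
        )

    def forward(self, rel_pos, rel_vel):
        # rel_pos, rel_vel: (batch, n_neighbors, 3)
        feats = torch.cat([rel_pos, rel_vel], dim=-1)  # (batch, n_neighbors, 6)
        return self.net(feats).sum(dim=1)              # sum over neighbors -> (batch, 3)

# e.g. f_hat = PairwiseForceNet()(torch.randn(8, 4, 3), torch.randn(8, 4, 3))
```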