# Classification of Spherically Confined Supraparticles (and maybe more)

###### tags: `PhD projects` `Machine Learning`

---

### Owners (the only ones with permission to edit the main text; others can comment)

Alptug, Laura

---

## To do

- [x] VAE latent space
- [ ] VAE + discriminator (AVAE) latent space (GAN: https://bit.ly/3tO7FXW)
- [ ] ~~eVAE latent space (eVAE: https://bit.ly/3FIoVQA)~~
- [ ] ~~eVAE + discriminator latent space~~
- [ ] Try ELU, GELU in hidden layers
- [ ] Try sigmoid and/or tanh in bottleneck
- [ ] Compare all the models (PCA, AE, VAE, AVAE)
- [x] Quantify clustering quality (how? https://bit.ly/3KuY1za)

---

## Interesting Autoencoder papers

* Local Conformal AE: https://bit.ly/3fNbnbP
* Multiple loss functions: https://ai.googleblog.com/2020/04/optimizing-multiple-loss-functions-with.html

---

## Dataset

Every particle is represented by a 48-dimensional vector:

$$
\mathbf{v}= \left( \left\{ q_l \right\}_{l=1}^8, \left\{ w_{2l} \right\}_{l=1}^4, \left\{ q_l^{(ss)} \right\}_{l=1}^8, \left\{ w_{2l}^{(ss)} \right\}_{l=1}^4, \left\{ \bar q_l \right\}_{l=1}^8, \left\{ \bar w_{2l} \right\}_{l=1}^4, \left\{ \bar q_l^{(ss)} \right\}_{l=1}^8, \left\{ \bar w_{2l}^{(ss)} \right\}_{l=1}^4 \right)
$$

where barred quantities are averaged over nearest neighbors and $(ss)$ stands for "same species".

### Preprocessing

Before classification we have a few preprocessing options:

* **Standardize:** given a data set $\{\mathbf{v}^{(a)}\}$, transform every coordinate so that it has vanishing mean and unit variance:
$$
\tilde v^{(a)}_i = \frac{v^{(a)}_i - \mu_i}{\sigma_i}
$$
* **MinMaxScale:** scale every coordinate into $[0,1]$.
* **Normalize:** normalize each vector to unit length.

:::warning
Be careful about zero vectors.
:::

Of these three, only normalization is dataset-independent.

:::info
Idea: we can try a funky normalization, i.e., normalize each subsection of the 48D vector separately.
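The three options (and the per-subsection idea from the info box) can be sketched with scikit-learn; the random matrix stands in for the real BOP dataset, and `blockwise_normalize` is a helper name I made up for the "funky" variant:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler, MinMaxScaler, normalize

rng = np.random.default_rng(0)
V = rng.random((1000, 48))  # placeholder: 1000 particles, 48 BOP features

V_std = StandardScaler().fit_transform(V)  # zero mean, unit variance per coordinate
V_mm = MinMaxScaler().fit_transform(V)     # each coordinate scaled into [0, 1]
V_n = normalize(V)                         # each row scaled to unit L2 norm


def blockwise_normalize(X, blocks=(8, 4, 8, 4, 8, 4, 8, 4)):
    """Normalize each subsection of the 48D vector separately.

    The block sizes follow the dataset definition:
    {q_l} (8), {w_2l} (4), same-species and NN-averaged variants.
    """
    out = np.empty_like(X, dtype=float)
    start = 0
    for b in blocks:
        out[:, start:start + b] = normalize(X[:, start:start + b])
        start += b
    return out


V_bn = blockwise_normalize(V)
```

Note that `sklearn.preprocessing.normalize` leaves all-zero rows untouched rather than dividing by zero, which addresses the warning above.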
:::

### Processing

* PCA + GMM
* Autoencoder + GMM
* Autoclassifier
* Classifier + Autoclassifier

## Some Takes

### Preprocessing

#### Std+PCA+GMM

* `StandardScaler()`
* `PCA(n_components=4)`
* `GMM(n_components=8, random_state=48, max_iter=1000)`

Results:

* finds grain boundaries and icosahedra
* the best fcc/ico classification so far

![](https://i.imgur.com/pkgVzUD.png)
![](https://i.imgur.com/gzqrGeq.png)

#### Norm+PCA+GMM

* `normalize()`
* `PCA(n_components=4)`
* `GMM(n_components=8, random_state=48, max_iter=1000)`

Results:

* partially finds fcc/ico
* some structures resembling grain boundaries
* objectively worse than standardizing

![](https://i.imgur.com/ybDWcmv.png)
![](https://i.imgur.com/NnlSnZ5.png)

#### Norm+Std+PCA+GMM

* `normalize()`
* `StandardScaler()`
* `PCA(n_components=4)`
* `GMM(n_components=8, random_state=48, max_iter=1000)`

Results:

* partially finds fcc/ico, better than normalization alone
* no grain boundaries
* somewhere in between std and norm

![](https://i.imgur.com/ikMR9gK.png)
![](https://i.imgur.com/zPnOYwR.png)

---

From this I conclude that the norm of the input vector encodes important information. The reasoning is not difficult:

* Consider a particle and place neighboring particles around it such that only one BOP is non-vanishing.
* Take out half of the neighbors.
* As a result, the BOP decreases, but the unit BOP vector remains the same.

Therefore, the norm of the BOP vector is important for finding interfaces (grain boundaries, and surfaces in open boundary conditions). All in all, normalizing a vector is nothing but removing a dimension: we are looking for the smallest number of dimensions that can encode the "important" information, and normalizing the input is an uncontrolled dimension reduction.

### Number of Clusters

I fix the preprocessing to standardization. The latent dimension is still 4. Let's see when we lose the grain-boundary and fcc classification.
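The Std+PCA+GMM pipeline above is a few lines of scikit-learn (note that "GMM" corresponds to `GaussianMixture` in current sklearn; the random matrix is a placeholder for the real dataset):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(48)
V = rng.random((1000, 48))  # placeholder for the 48D BOP dataset

# Standardize, project to a 4D latent space, then fit an 8-component GMM.
X = StandardScaler().fit_transform(V)
pca = PCA(n_components=4)
Z = pca.fit_transform(X)

gmm = GaussianMixture(n_components=8, random_state=48, max_iter=1000)
labels = gmm.fit_predict(Z)

# How much variance the 4 principal components retain:
print(pca.explained_variance_ratio_.sum())
```

On the real data, `labels` is what gets colored in the snapshots below.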
#### 7 clusters

![](https://i.imgur.com/Rw4xjAe.png)
![](https://i.imgur.com/CrUFNwd.png)

#### 6 clusters

![](https://i.imgur.com/igBMKVT.png)
![](https://i.imgur.com/HLg16Yk.png)

#### 5 clusters

![](https://i.imgur.com/XHtBzUe.png)
![](https://i.imgur.com/kc1VKwL.png)

---

7 clusters can be tolerated, but not fewer.

:::info
PCA can find grain boundaries when the latent dimension is at least 4. I checked the 4th principal component and it is dominated by $w_6$ and $w_6^{(ss)}$.
:::

## Surface Deformations

Define the embedding:

\begin{align}
X^\mu :& \left[0,\pi\right]\times\left[0,2\pi\right) &\rightarrow\ &\mathbb{R}^3\\
&\left(\theta^1,\theta^2\right)&\mapsto\ &X^\mu\left(\theta^1,\theta^2\right)
\end{align}

The action due to surface tension is

$$
S = - \varepsilon\int_0^{2\pi}\mathrm{d}\theta^2\int_0^{\pi}\mathrm{d}\theta^1 \sqrt{\gamma},
$$

where $\varepsilon$ is the energy scale and $\gamma$ is the determinant of the pullback metric $\gamma_{ij}$, given by

$$
\gamma_{ij} = \delta_{\mu\nu} \frac{\partial X^\mu}{\partial\theta^i}\frac{\partial X^\nu}{\partial\theta^j} = \delta_{\mu\nu} \partial_i X^\mu\partial_j X^\nu.
$$

This action is minimized by a constant embedding. Now, introduce the volume via a Lagrange multiplier:

$$
S = \varepsilon\int_0^{2\pi}\mathrm{d}\theta^2\int_0^{\pi}\mathrm{d}\theta^1 \left[ -\sqrt{\gamma} +\frac{\lambda}{6}\epsilon_{\mu\nu\rho}\epsilon^{ij}X^\mu \partial_i X^\nu \partial_j X^\rho \right].
$$

If we choose $\lambda = 2/R$, the action is minimized by a sphere of radius $R$. Now we take a look at the energy introduced by a "bump" on the sphere embedding. WLOG, we put the bump on the north pole; for a function $\Delta(\theta^1)\ll 1$, the embedding is

$$
X^\mu = R\left(1+\Delta(\theta^1)\right)
\begin{pmatrix}
\cos\theta^2\sin\theta^1\\
\sin\theta^2\sin\theta^1\\
\cos\theta^1
\end{pmatrix}.
$$

Then the effective action is

$$
S_\mathrm{eff} = \pi\varepsilon R^2\int_0^{\pi}\mathrm{d}\theta\sin\theta\left[2\left(\Delta(\theta)\right)^2 - \left(\partial_\theta \Delta(\theta)\right)^2\right].
$$

We can interpret this as the excess energy caused by the bump. The force density is then

$$
f^\mu(\theta,\phi) = \frac{\varepsilon}{R} \partial_\theta \Delta(\theta)\left(2\Delta(\theta)-\partial^2_\theta \Delta(\theta)\right)
\begin{pmatrix}
\cos\phi\cos\theta\\
\sin\phi\cos\theta\\
-\sin\theta
\end{pmatrix}.
$$

We can further constrain the bump to leave the volume invariant:

$$
1=\frac{1}{2}\int_0^{\pi}\mathrm{d}\theta\sin\theta\left[1+\Delta(\theta)\right]^3.
$$

Now consider

$$
\Delta(\theta) = \frac{\Delta}{R}\rho (\theta),
$$

where $\rho(\theta)$ is a bump with unit maximum, giving a total bump height $\Delta$. The total force then becomes

$$
F^\mu = -\frac{2 \pi \varepsilon \Delta^2}{R^3} \hat{\mathbf{r}} \int_0^{\pi}\mathrm{d}\theta\sin^2\theta\, \partial_\theta \rho\left(2\rho-\partial^2_\theta \rho\right) = - \frac{\kappa}{R^3} \Delta^2 \hat{\mathbf{r}}.
$$

This implies

$$
V(r) = \kappa \left[ \frac{1}{3}\left(\frac{r}{R}\right)^3-\left(\frac{r}{R}\right)^2+\frac{r}{R} - \frac{1}{3} \right],
$$

and leaves us with the parameter $\kappa$ to determine. We will fix this parameter by introducing two new parameters. This might sound like a bad idea, since we are increasing the number of free parameters. However, we can choose the two new parameters such that they are constrained by the elastic approximation we have already applied, which makes the model better justified.

Define $R_\mathrm{max}>R$ as the maximum distance a particle can have from the origin; we put a hard wall at that distance. Since we are in the elastic regime,

$$
\Delta_\mathrm{sc} := \frac{R_\mathrm{max}-R}{R} = \frac{R_\mathrm{max}}{R} -1 \ll 1
$$

has to be satisfied.
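As a consistency check (a sympy sketch, not part of the derivation): the radial force $-\partial_r V$ evaluated at $r = R + \Delta$ should reproduce the magnitude $\kappa\Delta^2/R^3$ from the total-force equation, and $V$ at the hard wall $r = R(1+\Delta_\mathrm{sc})$ collapses to $\kappa\Delta_\mathrm{sc}^3/3$ exactly:

```python
import sympy as sp

r, R, kappa, Delta, Dsc = sp.symbols('r R kappa Delta Delta_sc', positive=True)

# The confinement potential V(r) derived above.
V = kappa * (sp.Rational(1, 3) * (r / R)**3 - (r / R)**2 + r / R - sp.Rational(1, 3))

# Radial force from the potential.
F = -sp.diff(V, r)

# At r = R + Delta the force magnitude is kappa * Delta^2 / R^3.
assert sp.simplify(F.subs(r, R + Delta) + kappa * Delta**2 / R**3) == 0

# At the hard wall r = R*(1 + Delta_sc), V reduces exactly to kappa * Delta_sc^3 / 3.
assert sp.simplify(V.subs(r, R * (1 + Dsc)) - kappa * Dsc**3 / 3) == 0
```

The second identity is the step used below to convert $V(R_\mathrm{max})$ into a condition on $\beta\kappa$.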
Now we have two types of surface collisions: (i) soft-wall collisions and (ii) hard-wall collisions. We then introduce the second parameter, $p_\mathrm{sc}$, i.e., the probability of a soft-wall collision rather than a hard one. For the soft wall to do its job effectively, $p_\mathrm{sc} \approx 1$ should hold. If our potential had a singularity, we could have set $p_\mathrm{sc} = 1$, but unfortunately it has none.

It is time to look at how these parameters set the value of $\kappa$. In an ensemble of particles, a soft-wall collision takes place when

$$
\beta V(r) = - \log u,
$$

where $u$ is a uniform random number in $(0,1]$, or equivalently,

$$
\beta V(r) = x,
$$

where $p(x)=e^{-x}$. Then,

$$
p_\mathrm{sc} = P(\beta V(R_\mathrm{max})>x) = \mathrm{CDF}_x(\beta V(R_\mathrm{max})) = 1 - e^{-\beta V(R_\mathrm{max})}.
$$

Therefore,

$$
V(R_\mathrm{max}) = -k_BT\log(1-p_\mathrm{sc}),
$$

which, since $V(R_\mathrm{max}) = \kappa\Delta_\mathrm{sc}^3/3$, implies

$$
\beta\kappa = -\frac{3}{\Delta_\mathrm{sc}^3}\log(1-p_\mathrm{sc}).
$$

If we set $\Delta_\mathrm{sc} = 10^{-2}$ and $p_\mathrm{sc} = 0.99$, we have

$$
\beta\kappa \approx 1.38\times 10^{7}.
$$
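The calibration formula is easy to evaluate numerically (a small sketch; the helper name `beta_kappa` is mine). For $\Delta_\mathrm{sc}=10^{-2}$ and $p_\mathrm{sc}=0.99$ it gives $\beta\kappa = 3\times 10^{6}\,\log 100 \approx 1.38\times 10^{7}$:

```python
import math


def beta_kappa(delta_sc, p_sc):
    """beta * kappa = -(3 / delta_sc**3) * log(1 - p_sc)."""
    return -3.0 / delta_sc**3 * math.log(1.0 - p_sc)


bk = beta_kappa(1e-2, 0.99)
print(f"{bk:.4e}")  # prints 1.3816e+07
```

Since $\beta\kappa$ scales as $\Delta_\mathrm{sc}^{-3}$, the choice of hard-wall distance dominates the stiffness far more than the choice of $p_\mathrm{sc}$.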