# Hard-sphere glasses (with Paddy) ###### tags: `PhD projects` --- ### Owners (the only one with the permission to edit the main text, others can comment) Marjolein, Laura --- ## To do - [x] Analyze Frank's simulation data of a binary HS glass to see if our method gives low correlations for that data as well (or if it is just the experimental data). --- ## Theoretical background ### Bond-orientational order parameters To prevent confusion, let me explain the different bond-orientational order parameters (BOPs) that I use. All BOPs have the same basis, namely the local BOPs introduced by Steinhardt et al. $$ q_{lm}(i) = \frac{1}{N_b(i)} \sum_{j\in\mathcal{N}_b(i)} Y_{lm}\left( \theta_{ij},\phi_{ij} \right), $$ where $N_b(i)$ is the number of neighbors of particle $i$, $\,\mathcal{N}_b(i)$ is the set of neighbors of $i$, $\,Y_{lm}\left( \theta,\phi \right)$ are the spherical harmonics with $m\in[-l,l]$, and $\theta_{ij}$ and $\phi_{ij}$ are the polar and azimuthal angles of $\mathbf{r}_{ij}=\mathbf{r}(j)-\mathbf{r}(i)$. The first option of BOPs that I use are simply the **local** rotationally invariant $q_l$ $$ q_l(i) = \sqrt{\frac{4\pi}{2l+1} \sum_{m=-l}^l |q_{lm}(i)|^2} . $$ Using these local rotationally invariant $q_l$, the second option of BOPs are the averaged $\bar{q}_l$ used by **Emanuele Boattini** in his UML paper on glasses [**[Boattini et al. (2020)]**](https://doi.org/10.1038/s41467-020-19286-8) $$ \bar{q}_{l}(i) = \frac{1}{N_b(i)+1} \sum_{j\in\{i,\mathcal{N}_b(i)\}} q_{l}(j) . $$ The third option of BOPs are the averaged $\bar{q}_l$ introduced by **Lechner & Dellago** [**[Lechner & Dellago (2008)]**](https://doi.org/10.1063/1.2977970) $$ \bar{q}_l(i) = \sqrt{\frac{4\pi}{2l+1} \sum_{m=-l}^l |\bar{q}_{lm}(i)|^2}, \;\;\text{ where }\;\; \bar{q}_{lm}(i) = \frac{1}{N_b(i)+1} \sum_{j\in\{i,\mathcal{N}_b(i)\}} q_{lm}(j) . $$ For all systems, I use the SANN algorithm to determine the nearest neighbors of each particle. [**[van Meel et al. (2012)]**](https://doi.org/10.1063/1.4729313) ### Dynamics For a simulation, you can either use the mobility or the propensity of a measure of the dynamics. The mobility is simply the absolute displacement of a particle after some time $\delta t$ based on a single trajectory $$ u_i(\delta t) = | \mathbf{r}_i(\delta t) - \mathbf{r}_i(0) |. $$ The propensity is the mobility averaged over multiple runs $$ D_i(\delta t) = \left\langle| \mathbf{r}_i(\delta t) - \mathbf{r}_i(0) | \right\rangle_c, $$ where $\langle \dots\rangle_c$ indicates the average over the collection $c$ of simulations all started from the same configuration but with different initial speeds and directions. The propensity generally gives better correlations with the structure than the mobility, as it gets rid of the directional dependency. However, you of course cannot calculate the propensity for experiments. **Note:** when necessary, I correct for drift using the center of mass of the system. --- ## Experimental polydisperse HS glass (Paddy) ### Overview of the data A short overview of the data and information that I got from Paddy. This is all on a 6% polydisperse hard-spheres system made from PMMA particles of diameter 3.23 $\mu$m. [**[Royall et al. (2018)]**](https://doi.org/10.1063/1.5000263) | Folder | Effective volume fraction | Alpha time (in units of Brownian time) | Alpha time (in frames) | Time between frames (in seconds) | Number of particles tracked | Total number of frames | | -------- | -------- | -------- | -------- | -------- | -------- | -------- | | 10-Dec-Tsuru-phi0-53 | 0.593114447 | 2453.94218 | 155.2 | 120 | 3865 | 239 | | 8-Dec-Tsuru-phi-050 | 0.586548134 | 298.86217 | 226.8 | 10.0 | 10859 | 256 | | 8-Dec-Tsuru-phi0-486-this | 0.551384372 | 13.59362 | 17.2 | 6.0 | 11775 | 256 | | 06-Jan-Bertie-I-driftRemoved-phi-0476 | 0.510965748 | ? | ? | 15.0 | 2489 | 459 | | 9-Dec-Tsuru-phi0-34 | 0.388074863 | 0.24829 | 15.6 | 5.0 | ~5000 | 419 | Brownian time = 7.59 s **Note:** so far I have only looked at *'8-Dec-Tsuru-phi-050'* and *'8-Dec-Tsuru-phi0-486-this'* as their snapshots contain enough particles. The other folders contain to few particles to have enough particles left after I "cut off" the boundary of the box to calculate the BOPs. ### Getting the mobility and the BOPs To calculate the mobility I use the file that contains the trajectory of all particles that are tracked during the entirety of the trajectory. To calculate the BOPs I use the separate snapshots that contain ALL particles and not just that ones that are tracked during the entirety of the trajectory. This way I am more certain that each particle has enough neighbors present. Later I link the BOPs to the correct particle ID in the trajectory. For the calculation of the BOPs you have to be careful to only calculate the local $q_{lm}$ and $q_{l}$ of the particles that have enough nearest neighbors present, and the averaged $\bar{q}_{lm}$ and $\bar{q}_l$ of the particles that have enough nearest neighbors and next-nearest neighbors present. To do this I simply split the box that contains all the particles in three parts: **box2**: the smaller box that is left after cutting 2 *dbox* of all sides of the original box. **box1**: the shell that is left after cutting 1 *dbox* of all sides of the original box and taking out box2. **box0**: the origional box minus box1 and box2. Here *dbox* is choosen such that it is slightly larger than the maximum radial cutoff found by the SANN algorithm. ![](https://i.imgur.com/BU1DBoP.png, =300x) I then calculate the local $q_{lm}$ and $q_{l}$ of the particles inside box1 and box2, and the averaged $\bar{q}_{lm}$ and $\bar{q}_l$ of the particles inside box0. ### 8-Dec-Tsuru-phi-050 ($\phi_{eff} =0.5865$) For the data in this folder I used *dbox*=15.0, which leaves on average 7953 particles per snapshot for the local BOPs and 4735 for the averaged BOPs. Using the BOPs with $l\in[1,8]$ as input, I performed principal component analysis (PCA) on i) just the local BOPs, ii) Emanuele's BOPs, and iii) Lechner & Dellago's BOPs. Underneath you can see the resulting scatter plots of the first and second principal components (PC-1 and PC-2) of all particles. The red arrows visualize how each BOP contributes to PC-1 and PC-2. Above and to the right of the scatter plots are the histograms of both PCs. I also give the percentage of information (=variance) explained by each PC. | **Local** <br> PC-1: 33.0% info <br> PC-2: 29.9% info | ![](https://i.imgur.com/GSURcHt.png) | | -------- | -------- | | **Emanuele Boattini** <br> PC-1: 40.2% info <br> PC-2: 24.6% info | ![](https://i.imgur.com/sn4Snzg.png) | | **Lechner & Dellago** <br> PC-1: 69.5% info <br> PC-2: 8.8% info | ![](https://i.imgur.com/YbsrF5G.png) | Note: I also tried using both the local BOPs and Emanuele's BOPs as input for PCA (so 16 input parameters in total), but the model gave higher importance to the local BOPs, so the result was extremely similar to using just the local BOPs. Underneath, you can see how a Gaussian mixture model (GMM) would divide the (PC-1,PC-2)-plane into two clusters. Here $P_{GMM}$ indicates the probability of belonging to the cluster with the highest PC-1. | Local | Emanuele Boattini | Lechner & Dellago | | -------- | -------- | -------- | | ![](https://i.imgur.com/I9CWLM5.png) | ![](https://i.imgur.com/OfjJZrz.png) | ![](https://i.imgur.com/89RyF4N.png) | Next, I calculated the Pearson correlation between the mobility and PC-1 for a wide range of $\delta t$. Additionally, I calculated the Pearson correlation between the mobility and $P_{GMM}$. Underneath you can see the correlation as a function of $\delta t$ for the different BOPs used. The solid lines give the correlation with PC-1 and the dashed lines with $P_{GMM}$. ![](https://i.imgur.com/hLszSgk.png) There are a couple of things I want to point out about the above figure: 1. The correlation for Lechner & Dellago's BOPs (red) is negative as this PC-1 largely depends on $\bar{q}_6$, and high $\bar{q}_6$ usually means high crystaline order. 2. Emanuele's BOPs (blue) give the highest correlation by far. It is, however, still a lot lower than the correlations found in his paper. 3. Using $P_{GMM}$ instead of PC-1 does not improve the correlation for Emanuele's and Lecher & Dellago's BOPs. 4. The peak of the correlation for the local BOPs (green) and Lechner & Dellago's BOPs (red) lies slightly below $\delta t = \tau_\alpha$ (as Emanuele found in his paper), whereas the peak of Emanuele's BOPs (blue) lies around $\delta t \approx 0.6\tau_\alpha$. 5. The correlation for Emanule's BOPs (blue) shoots up for $\delta t\gtrsim230$ frames. A reason for this could be the low statistics, as there are only a handful of snapshots for which we can calculate the mobility with such a high number of frames for $\delta t$. The check what the effect would be of using low statistics, underneath you can see the correlation between the mobility and PC-1 calculated using all available snapshots (solid lines, same as above figure) and using just the first snapshot of the series (dashed lines). We indeed see that the blue dashed line lies significantly above the blue solid line. However, the peak of the blue dashed line still lies around $\delta t\approx0.6\tau_\alpha$. ![](https://i.imgur.com/TvTgDVu.png) ### 8-Dec-Tsuru-phi0-486-this ($\phi_{eff} =0.5514$) For the data in this folder I used *dbox*=16.7, which leaves on average 9249 particles per snapshot for the local BOPs and 5393 for the averaged BOPs. Again, using the BOPs with $l\in[1,8]$ as input, I performed principal component analysis (PCA) on i) just the local BOPs, ii) Emanuele's BOPs, and iii) Lechner & Dellago's BOPs. Underneath you can see the resulting scatter plots of the first and second principal components (PC-1 and PC-2) of all particles. The red arrows visualize how each BOP contributes to PC-1 and PC-2. Above and to the right of the scatter plots are the histograms of both PCs. I also give the percentage of information (=variance) explained by each PC. | **Local** <br> PC-1: 31.5% info <br> PC-2: 29.3% info | ![](https://i.imgur.com/mjQGSgS.png) | | -------- | -------- | | **Emanuele Boattini** <br> PC-1: 43.3% info <br> PC-2: 20.3% info | ![](https://i.imgur.com/nzKiekH.png) | | **Lechner & Dellago** <br> PC-1: 64.1% info <br> PC-2: 10.1% info | ![](https://i.imgur.com/PeD0Qi7.png) | Underneath, you can see how a Gaussian mixture model (GMM) would divide the (PC-1,PC-2)-plane into two clusters. Here $P_{GMM}$ indicates the probability of belonging to the cluster with the highest PC-1. | Local | Emanuele Boattini | Lechner & Dellago | | -------- | -------- | -------- | | ![](https://i.imgur.com/C1aWj1i.png) | ![](https://i.imgur.com/lqmlLbX.png) | ![](https://i.imgur.com/lB7MWz8.png) | Next, I calculated the Pearson correlation between the mobility and PC-1 for a wide range of $\delta t$. Additionally, I calculated the Pearson correlation between the mobility and $P_{GMM}$. Underneath you can see the correlation as a function of $\delta t$ for the different BOPs used. The solid lines give the correlation with PC-1 and the dashed lines with $P_{GMM}$. ![](https://i.imgur.com/OUrpCoh.png) Again, we find that Emanuele's BOPs (blue) give the highest correlations. However, now the peak lies around $\delta t \approx \tau_\alpha$, instead of $\delta t\approx0.6\tau_\alpha$ of the previous section. Furthermore, notice that the local BOPs (green) give significantly higher correlations than in the previous section. The reason for this is the different "orientation" of the PCA decomposition. In the previous section, a high value of PC-1 means either a high $q_5$ or high $q_6$, and the GMM clusters the particles with $q_5$ or high $q_6$ together in the "red cluster". In this section, a high value of PC-1 means mostly a high $q_5$, and the GMM only clusters particles with high $q_5$ together in the "red cluster". Lastly, I also calculated the correlation using just the fist snapshot of the series. Underneath you can see the correlation between the mobility and PC-1 calculated using all available snapshots (solid lines, same as above figure) and using just the first snapshot of the series (dashed lines). ![](https://i.imgur.com/1sccHzY.png) Underneath you can see the first snapshot of the series colored by the mobility and $P_{GMM}$ using Emanuele's BOPs. (Here $\mu$ is the mean of the mobility at that $\delta t$). | mobility $\delta t=16$ frames (max correlation) | $P_{GMM}$ using Emanuele's BOPs | | -------- | -------- | | ![](https://i.imgur.com/j5i5Kam.png) | ![](https://i.imgur.com/jRNer1t.png) | | ![](https://i.imgur.com/FXfyj4z.png)|![](https://i.imgur.com/7CLC1Zs.png) | ### 10-Dec-Tsuru-phi0-53 ($\phi_{eff} =0.5931$) For the data in this folder I used *dbox*=15.9, which leaves on average 1923 particles per snapshot for the local BOPs and 560 for the averaged BOPs. For the averaged BOPs this means a slab of approximately 14x10x4 particles. | **Local** <br> PC-1: 27.5% info <br> PC-2: 23.2% info | ![](https://i.imgur.com/IhVAnCS.png) | | -------- | -------- | | **Emanuele Boattini** <br> PC-1: 43.0% info <br> PC-2: 17.7% info | ![](https://i.imgur.com/fP0UZNx.png) | | **Lechner & Dellago** <br> PC-1: 60.3% info <br> PC-2: 10.5% info | ![](https://i.imgur.com/395JYYr.png) | Underneath, you can see how a Gaussian mixture model (GMM) would divide the (PC-1,PC-2)-plane into two clusters. Here $P_{GMM}$ indicates the probability of belonging to the cluster with the highest PC-1. | Local | Emanuele Boattini | Lechner & Dellago | | -------- | -------- | -------- | | ![](https://i.imgur.com/iMpxFHj.png) | ![](https://i.imgur.com/6mlhYYN.png) | ![](https://i.imgur.com/UBy18o2.png) | Next, I calculated the Pearson correlation between the mobility and PC-1 for a wide range of $\delta t$. Additionally, I calculated the Pearson correlation between the mobility and $P_{GMM}$. Underneath you can see the correlation as a function of $\delta t$ for the different BOPs used. The solid lines give the correlation with PC-1 and the dashed lines with $P_{GMM}$. ![](https://i.imgur.com/MWKMHpW.png) ### Compare So if we compare the correlation of the three different packing fraction (using Emanuele's BOPs), we get ![](https://i.imgur.com/4LYj6TH.png) These really are not great correlations. *Note: I think that the correlation of the highest packingfraction, i.e. $\eta=0.5931$, is this bad because it only had 3865 particles tracked for the entire time series, which reduces to $\sim560$ particles after "substracting" the particles at the edges of the box. The small amount of data could then explain the significant difference in the PCA decomposition (see below). Comparing to the lower density PCA decomposition, we see that for $\eta=0.5931$ $\bar{q}_5$ lies mostly along the PC2 direction instead of the PC1 direction.* | ![](https://i.imgur.com/vRd9bMg.png) | ![](https://i.imgur.com/Hqk8LIu.png) | | -------- | -------- | | $\eta=0.5514$ | $\eta=0.5931$ | *However, I did also try to use the an already trained PCA (in this case on the simulation data of the 8% HS polydisperse glass at $\eta=0.580$) to classify the structure with, and then train a GMM on that. The scatterplot you then get is given below. Note that the shift in origin and the mirrored PC1 axis are both fine, because the GMM is not affected by that.* *So the scatterplot then looks quite alright, but the correlation with the dynamics is not improved (see below).* | ![](https://i.imgur.com/RLHSjKS.png) | ![](https://i.imgur.com/p63zW1I.png) | | -------- | -------- | | New scatter plot | Dashed green line indicates the new correlation | --- ## Simulation of binary HS glass (Frank) ### Overview of the data The data I got from Frank is of a binary HS system of 2000 particles, of which 600 particles large particles with diameter $\sigma$ and 1400 small particles with diameter $0.85\sigma$. The packing fraction is 0.580. For this system $\tau_\alpha\approx 10^4 \sqrt{\beta m \sigma^2}$. ### Results I calculated the different BOPs using the periodic boundary conditions of the system and made no distinction between the small and large particles. Then, using the BOPs with $l\in[1,8]$ as input, I performed principal component analysis (PCA) on i) just the local BOPs, ii) Emanuele's BOPs, and iii) Lechner & Dellago's BOPs. Underneath you can see the resulting scatter plots of the first and second principal components (PC-1 and PC-2) of all particles. The red arrows visualize how each BOP contributes to PC-1 and PC-2. Above and to the right of the scatter plots are the histograms of both PCs. I also give the percentage of information (=variance) explained by each PC. | **Local** <br> PC-1: 50.4% info <br> PC-2: 25.0% info | ![](https://i.imgur.com/PpH92Ec.png) | | -------- | -------- | | **Emanuele Boattini** <br> PC-1: 62.7% info <br> PC-2: 16.4% info | ![](https://i.imgur.com/qSJo6tq.png) | | **Lechner & Dellago** <br> PC-1: 69.8% info <br> PC-2: 8.5% info | ![](https://i.imgur.com/wyKm9vf.png) | Underneath, you can see how a Gaussian mixture model (GMM) would divide the (PC-1,PC-2)-plane into two clusters. Here $P_{GMM}$ indicates the probability of belonging to the cluster with the highest PC-1. | Local | Emanuele Boattini | Lechner & Dellago | | -------- | -------- | -------- | | ![](https://i.imgur.com/tFa0RY2.png) | ![](https://i.imgur.com/Lw36FYj.png) | ![](https://i.imgur.com/y19G7J7.png) | Next, I calculated the Pearson correlation between the mobility and PC-1 for a logarithmic range of $\delta t$. Additionally, I calculated the Pearson correlation between the mobility and $P_{GMM}$. Underneath you can see the correlation as a function of $\delta t/\tau_\alpha$ for the different BOPs used. The solid lines give the correlation with PC-1 and the dashed lines with $P_{GMM}$. ![](https://i.imgur.com/VvgFxuM.png) Notice that the correlation for Emanule's BOPs (blue) is negative in this case as PC-1 decreases with increasing $\bar{q}_5$, whereas, for the experimental HS glasses, PC-1 increased with increasing $\bar{q}_5$. Thus, to better compare the correlations, underneath are the same results, but now with the correlations for Emanule's BOPs (blue) multiplied by -1.0. ![](https://i.imgur.com/Tsix2JE.png) Again, we find that Emanuele's BOPs (blue) give the highest correlations, with the peak just below $\delta t/\tau_\alpha =1$. However, notice that the peak correlation is approximately 3 times as high as for the experimental HS glass. ### Getting better correlations: propensity, treating small and large particles separately, and $\bar{P}_{GMM}$ There are a few things that can still be done to get better correlations though. The first one is using the propensity instead of the mobility. This is of course not something we can do for experiments, but it is good to have a feeling for how much using the mobility instead of the propensity lowers the correlation. The second thing that can be done for this system is training the PCA (and GMM) on the data of the small and large particles separately. And the third improvement is averaging the final structural order parameter, so let's use $P_{GMM}$, over its neighbors. Emanuele showed that simply averaging over the neighbors within a radial distance of $2.0\sigma$ improves the correlation by quite a lot. Underneath a figure with the correlation for the large and small particles separately (red vs blue), for the propensity and mobility (solid vs dashed), and the average and normal $PC_1$ (dark vs light). ![](https://i.imgur.com/H41FDFR.png) Lastly, one could use the inherent or cage state of the system as input for the BOPs to improve the correlations. Rinske showed that the cage state, which is the configuration consisting of the average particle positions when particle rearrangement is forbidden, out performs the inherent state. Underneath, I compare the correlations obtained using the real state of the system, i.e. the same results as above, to the correlations obtained using the cage state of the system. For both the real and the cage state, PCA is trained separatedly on the large and small particles. The correlations shown are for $\overline{PC_1}$. ![](https://i.imgur.com/X0flDr5.png) ### Correlations with TCC Underneath the correlation between the structural order parameter, i.e. the first principal component, $P_{GMM}$, and $\bar{P}_{GMM}$, and the number of TCC clusters per particle. The results are only for the large particles. Green bars indicate clusters that consist of one or more tetrahedral subclusters, blue bars indicate clusters that consist of one or more square pyramidal subclusters and gray clusters contain neither or both. We see that mostly the blue clusters have a positive correlation with the structural order parameter and thus lie in the "fast" regions of the glass. Whereas the green clusters have a negative correlation and thus lie in the "slow" regions. *Note that for the first pricipal component, we take the negative correlation, such that the bars show the same trend as for $P_{GMM}$ and $\bar{P}_{GMM}$.* | PC1 | $P_{GMM}$ | $\bar{P}_{GMM}$ | | -------- | -------- | -------- | | ![](https://i.imgur.com/2sC7Gh7.png) | ![](https://i.imgur.com/dm8XGdY.png) | ![](https://i.imgur.com/1ukUOsW.png) | ### Effect of noise To see what effect (potential) tracking errors and other forms of noise would have on the correlation of the mobility and structure, I added Gaussian noise to the simulation data. I looked at 10 different magnitudes for the noise: Gaussian noise with a standard deviation of [0.01, 0.02, ..., 0.10]$\sigma$, with $\sigma$ the diameter of the large particle. Additionally, to see whether tracking errors would be more important for determining the structure or for the mobility, I treated these two cases separately. Since Emanuele's BOPs gave the best results, I focussed on these for the structure analysis. First, I added the noise to the particle coordinates of the first snapshot of each time series. Using these "noisy" coordinates I calculated the new BOPs, performed the entire PCA again, and used a GMM to split the PC-1 vs PC-2 scatterplot into two clusters. Underneath you can see, for different magnitudes of noise, the correlation between the mobility and $P_{GMM}$ for all particles (left) and $\bar{P}_{GMM}$ for just the large particles (right). The color of the lines indicates the magnitude of the Gaussian noise: from dark red (no noise) to light red (noise with a standard deviation of 0.10$\sigma$). *Note that the mobility used for determining the correlation is the original mobility, so the mobility without any noise.* | ![](https://i.imgur.com/WFpw4yi.png) | ![](https://i.imgur.com/KptO2Sd.png) | | -------- | -------- | | $P_{GMM}$ all particles | $\bar{P}_{GMM}$ large particles | Next, I added noise to the mobility of the particles, and used these mobilities to determine the correlation with $P_{GMM}$ from the coordinate files without noise. The resulting correlations our shown in the figure below. Again, the color of the lines indicates the magnitue of noise: from dark red (no noise) to light red (noise with a standard deviation of 0.10$\sigma$). | ![](https://i.imgur.com/MKeEZOB.png) | ![](https://i.imgur.com/QYYDd8Z.png) | | -------- | -------- | | $P_{GMM}$ all particles | $\bar{P}_{GMM}$ large particles | Lastly, I determined the correlation using both the "noisy" $P_{GMM}$ and "noisy" mobility. The results are shown underneath. | ![](https://i.imgur.com/erhHaHg.png) | ![](https://i.imgur.com/9YwtepB.png) | | -------- | -------- | | $P_{GMM}$ all particles | $\bar{P}_{GMM}$ large particles | Comparing the figures, we see that the noise in the particle coordinates has a more significant effect on the local structure, than on the mobility. Specifically, when we consider just the noise in the local structure, the maximum correlation drops from ~0.36 for no noise to ~0.16 for noise with a standard deviation of 0.10$\sigma$. When we consider just the noise in the mobility, the maximum correlation drops only to ~0.33 for noise with a standard deviation of 0.10$\sigma$. If we consider both the noise in the local structure and in the mobility, the maximum correlation drops to ~0.15 for noise with a standard deviation of 0.10$\sigma$. This is on the order of the maximum correlation for the experimental data. To illustrate the effect of noise on the structure classification, underneath you can see a snapshot colored by $P_{GMM}$ for different noise sizes. The first row shows a snapshot colored by the mobility ($\mu$ is the mean of the mobility at that $\delta t$) and $P_{GMM}$ without any noise. The correlation between the two is clear to see. The other rows show the same snapshot for different noise sizes. | Mobility $\delta t/\tau_\alpha = 0.5$ (max correlation) (no noise) | $P_{GMM}$, no noise $\qquad\qquad\quad$ | | -------- | -- | | ![](https://i.imgur.com/450ClUw.png) <br> ![](https://i.imgur.com/zDMlcly.png) | ![](https://i.imgur.com/Z0fIyxq.png) <br> ![](https://i.imgur.com/ey82FxU.png) | | $P_{GMM}$, $0.02\sigma$ noise | $P_{GMM}$, $0.04\sigma$ noise | | ![](https://i.imgur.com/ftIAGrz.png) | ![](https://i.imgur.com/Ykpa5gm.png) | | $P_{GMM}$, $0.06\sigma$ noise | $P_{GMM}$, $0.08\sigma$ noise | | ![](https://i.imgur.com/ZSLESRQ.png) | ![](https://i.imgur.com/3rjeRN1.png) | To get a better understanding of what the added noise does to the structure classification, I calculated the Pearson correlation of the BOPs ($l\in[1,8]$) for the different noise levels. Underneath are the results. We see that the BOPs with higher $l$ decrease more slowly in correlation. Yet, all BOPs have a correlation of less than 0.5 for a noise of "just" 8%. ![](https://i.imgur.com/34pzyhQ.png) Additionally, I calculated the Pearson correlation for the number of TCC cluster per particles for a handful of different TCC clusters. The results are shown underneath. Same as for the BOPs, we see that the correlation decreases to less than 0.5 for a noise of 8% or larger. ![](https://i.imgur.com/G1gzTOw.png) --- ## Simulation polydisperse HS glass (DynamO data Paddy) ### Overview of the data The simulation data that I got from Paddy was obtained using DynamO. It's on a weakly polydisperse five—component system [**[Royall & Kob (2017)]**](https://doi.org/10.1088/1742-5468/aa4e92). The system contains 1372 particles, with diameters $[0.888\sigma,0.95733\sigma,\sigma,1.04267\sigma,1.112\sigma]$, such that the polydispersity is 8%. **Note 1**: Paddy said that they also tried 6% polydispersity, but these systems crystallized. **Note 2**: DynamO requires the largest diameter to be equal to 1, so there they used diameters: [0.798561151, 0.860911271, 0.899280576, 0.93764988, 1]. They looked at packing fractions: 0.19, 0.42, 0.50, 0.54, 0.56, 0.57,0.575, 0.58, and 0.585. Underneath a short overview of the data (that I analyzed) and some information. *Note: I do not know the $\tau_\alpha$ times, but I read them from Fig. 2b where $\tau_\alpha$ is plotted against $Z_{CS}=(1+\phi+\phi^2-\phi^3)/(1-\phi)^3$. This figure is given below the table.* | packing fraction | $Z_{CS}$ |time between snapshots | total time | # files | $\tau_\alpha$ (guess) | box size | | -------- | ---| -------- | --- | --- | --- | -------- | | 0.585 | 24.16 | ? (guessed 100.0) | ? | 2977 | $3.0\cdot10^4$ | 9.65372 | | 0.580 | 23.23 | ? (guessed 100.0) | ? | 2997 | $3.0\cdot10^3$ | 9.70706 | | 0.570 | 21.50 | 10.0 | 40000 | 3999 | 100 | 9.7616 | ![](https://i.imgur.com/iYjpcgC.png, =300x) ### Results The center of mass of the systems where already fixed, so I did not have to correct for that. As for all the other systems, I calculated the different BOPs. Again, I did PCA analysis on them to get the first two principal components and used a GMM to split the PC-1 vs PC-2 scatterplot into two clusters. As for the other systems, I find that using Emanuele's BOPs give the highest correlations with the mobility. Underneath you can see the correlation as a function of time for the three packing fractions. *(Note that, as I guessed $\delta t$ and $\tau_\alpha$, the numbers on the horizontal axes are just a guide and shouldn't be taken as correct.)* ![](https://i.imgur.com/xtHdHVu.png) The correlation is significantly lower than for the binary HS system (at packing fraction 0.58) ... ... But PCA found a vary similar scatterplots | System | Scatterplot of PCA on Emanuele's BOPs | | -------- | -------- | | **Binary** <br> PC-1: 62.7% info <br> PC-2: 16.4% info | ![](https://i.imgur.com/qSJo6tq.png) | | **Polydisperse** <br> PC-1: 52.6% info <br> PC-2: 19.5% info | ![](https://i.imgur.com/93h0eev.png) | It can also just be the case that this analysis method works less well for this system. Emanuele looked at three different glassy systems in his paper (binary HS, Wahnström, and Kob-Andersen) and found significantly lower correlations for the Kob-Andersen glass. See the figure below. ![](https://i.imgur.com/SijpT7R.png) --- ## Simulation polydisperse HS glass (EDMD) We also wanted to do our own simulations on a polydisperse system, so Frank gave me his EDMD code. ### Notes on the simulations I looked at a system containing $N=5000$ particles at packing fraction 0.580 for a couple of different polydispersities: 6%, 8%, and 10%. But same as Paddy, I found that the system with 6% polydispersity crystallizes. To get a list of diameters with e.g. 8% polydispersity, I just simply drew $N$ numbers from a normal distribution with mean 1.0 and standard deviation 0.08. Then, I scale all numbers by the mean of the list to get a mean diameter precisely equal to 1.0, and scale the difference w.r.t. the mean to get a standard deviation precisely equal to 8%. **Note**: As the EDMD code only works for diameters $\leq 1.0$, I later divide all diameters with the maximum diameter in the list. So sometimes I express lengths in term of $\sigma^*=\sigma/\sigma_{max}$ instead of $\sigma$, because I could not be bothered to scale it back. ### Results $\eta=0.580$ Some information on the systems. | Polydispersity | $\sigma_{max}/\sigma$ | Relaxation time <br> $\tau_\alpha / \sqrt{\beta m (\sigma^*)^2}$ | Long-time diffusion time <br> $\tau_l /\sqrt{\beta m (\sigma^*)^2}$ | | -------- | ----| -------- | -------- | | 8% | 1.2829857098200959 | $9.4\cdot10^2$ | $3.3\cdot10^3$ | | 10% | 1.3639722719472986 | $5.7\cdot10^2$ | $2.4\cdot10^3$ | For each polydispersity, I equilibrated 10 different configurations until the MSD from the starting position reached $6(\sigma^*)^2$. In practice, this meant that the runs for the systems with 8% polydispersity equilibrated for $19\tau_\alpha$ on average, and the system with 10% polydispersity equilibrated for $22\tau_\alpha$ on average. From these equilibrated configuration I started new simulations to measure the mobility in exponential time steps. I also started 30 different simulations from the same configuration to measure the propensity. Underneath are the results using Emanuele's BOPs as input. *Note that, for the results underneath, I used the mobility from 10 independent startinging configurations, and the propensity from just one starting configuration. These are of course not the best statistics, but they show the general idea.* ![](https://i.imgur.com/QxvunG7.png) So the correlation for the polydisperse HS glass is indeed significantly lower than for the binary HS glass. Because I was curious by how much the correlation improves if we go to higher supercooling, I also ran some the simulations for packing fraction 0.585. ### Results $\eta=0.585$ Same particle size distributions as for $\eta=0.580$, but now the systems are compressed to a packing fraction of $\eta=0.585$. | Polydispersity | Relaxation time <br> $\tau_\alpha / \sqrt{\beta m (\sigma^*)^2}$ | Long-time diffusion time <br> $\tau_l /\sqrt{\beta m (\sigma^*)^2}$ | | -------- |------ | -------- | | 8% | $1.4\cdot10^4$ | $1.9\cdot10^4$ | | 10% | $5.7\cdot10^3$ | $1.1\cdot10^4$ | Same as for $\eta=0.580$, I equilibrated 10 different starting configurations for $\sim20\tau_\alpha$. Underneath are the results using Emanuele's BOPs as input. *Note that, again, I used the mobility from 10 independent startinging configurations, and the propensity from just one starting configuration.* ![](https://i.imgur.com/2AOuAF8.png) Notice that the difference in the maximum correlation for $\eta=0.580$ and $\eta=0.585$ is very small. Furthermore, because these correlations are not bases an many runs (especially not for the propensity), we cannot draw a firm conclusion on this difference. ### Better statistics To get some better statistics (which also comes in handy for studying the effect of noise), I focussed on the system with 8% polydispersity and measured the mobility of 40 more starting configurations, and obtained the propensity of an additional 9 configurations. This means that we now have a dataset for the mobility of 50 configurations, and a dataset for the propensity of 10 configurations. Underneath are the results. *Note that I did not include the correlation with the propensity for $\eta=0.585$, as I decided that running more simulations (that take ~2 weeks) for this was not worth it.* ![](https://i.imgur.com/QO4DKle.png) Now, with these better statistics, we can clearly see that $\eta=585$ has a slightly higher correlation than $\eta=0.580$. ### Averaging radius for $\bar{P}_{GMM}$ Before we look at the effect of noise, I wanted to show that the specific radius used for averaging $P_{GMM}$ to obtain $\bar{P}_{GMM}$ does not really influence the maximum of the correlations that we calculate. Underneath I show the correlation between $\bar{P}_{GMM}$ and the mobility (left) or propensity (right) for different averaging radii. These results are for a system with 8% polydispersity at $\eta=0.580$. The figures clearly show that the maximum of the correlation is nog significantly affected by the specific choice of averaging radius. | ![](https://i.imgur.com/gBYJksF.png) | ![](https://i.imgur.com/WPrJqjn.png) | | -------- | -------- | | Mobility | Propensity | ### Correlations with TCC As for the binary hard-spheres glass, I calculated the correlation between the structural order parameter, i.e. the first principal component, $P_{GMM}$, and $\bar{P}_{GMM}$, and the number of TCC clusters per particle. I did this for the system with 8% polydispersity at $\eta=0.580$. Green bars indicate clusters that consist of one or more tetrahedral subclusters, blue bars indicate clusters that consist of one or more square pyramidal subclusters and gray clusters contain neither or both. Note that for the first pricipal component, we take the negative correlation, such that the bars show the same trend as for $P_{GMM}$ and $\bar{P}_{GMM}$. The clusters are shown in the same order as for the binary HS glass. Again we see that mostly the blue clusters have a positive correlation with the structural order parameter and thus lie in the "fast" regions of the glass, whereas the green clusters have a negative correlation and thus lie in the "slow" regions. *Note that for the first pricipal component, we take the negative correlation, such that the bars show the same trend as for $P_{GMM}$ and $\bar{P}_{GMM}$.* | PC1 | $P_{GMM}$ | $\bar{P}_{GMM}$ | | -------- | -------- | -------- | | ![](https://i.imgur.com/wve5Rrg.png) | ![](https://i.imgur.com/0dRSquA.png) | ![](https://i.imgur.com/oi8GPrJ.png) | ### Effect of noise As for the binary hard-spheres glass, I wanted to see for the polydisperse hard-spheres glass what effect noise would have on the correlation between the mobility and the structure. We saw for the binary system that the noise in the structure classification was most important. Hence, I used the normal (without added noise) mobilities and I only added Gaussian noise to the coordinates of starting configurations for which I then redid the structural analysis. Again, I looked at 10 different magnitudes for the noise: Gaussian noise with a standard deviation of [0.01, 0.02, ..., 0.10]$\sigma$. Underneath are the results for the system with 8% polydispersity at $\eta=0.580$. On the left is the correlation between the mobility and $P_{GMM}$ gotten from the structure classification, and on the right between the mobility and $\bar{P}_{GMM}$ (where we an averaging radius of $2.0\sigma$). The color of the lines indicates the magnitude of the Gaussian noise: from dark red (no noise) to light red (noise with a standard deviation of 0.10$\sigma$). *Note that the PCA and GMM models were trained on 10 snapshots for each noise level and then used to classify 40 more snapshots. So I used 50 runs in total to calculate these correlations.* | ![](https://i.imgur.com/Ur6VTnQ.png) | ![](https://i.imgur.com/2Ex2UJc.png) | | -------- | -------- | | $P_{GMM}$ | $\bar{P}_{GMM}$ | Lastly, I checked whether classifying the "noisy" structure with the PCA and GMM models trained on the original zero-noise system would improve the classification, and thus the correlation. Underneath are the results for the correlation between the mobility and $P_{GMM}$. The solid lines are the same as in the left figure above. The dashed lines indicate the corresponding correlations with the structure classified with the zero-noise PCA and GMM models. One can clearly see that classifying with the zero-noise PCA and GMM models decreases the correlation with the dynamics. ![](https://i.imgur.com/CuDFSI6.png) To get a better understanding of what the added noise does to the structure classification, I calculated the Pearson correlation of the BOPs ($l\in[1,8]$) for the different noise levels. Underneath are the results. As for the binary HS glass, we see that the BOPs with higher $l$ decrease more slowly in correlation. Yet, all BOPs have a correlation of less than 0.5 for a noise of "just" 7%. ![](https://i.imgur.com/4sfNGNC.png) Additionally, I calculated the Pearson correlation for the number of TCC cluster per particles for a handful of different TCC clusters. The results are shown underneath. Same as for the BOPs, we see that the correlation decreases to less than 0.5 for a noise of 7% or larger. ![](https://i.imgur.com/MM7h3t7.png) ## Ideas for future improvements Things we could try to improve the correlations (or we can just leave it as it is): - We could use more / other input parameters for the PCA based on what Emanuele and Rinske have showed. So using the order parameters averaged over different distances and multiple generations. For the experimental data we then have to think about what is conceivable with regard to the non-periodic boundary conditions and loosing a thick layer of data at the boundary of the box. - We could try to get rid of some of the noise in the experimental data by either quenching it or finding the cage coordinates, and using these quenched/caged coordinates for the structural analysis. One big problem with this is that we do not know the diameters of the individual particles. Without those the quenching will be sort of impossible. ---