Machine learn steric and depletion interactions

# Machine learn steric and depletion interactions ###### tags: `PhD projects` ## Project discription Steric and depletion interactions are computationally very expensinve to implement in simulations. Our goal is therefore to try to train ML algorithms to predict multi-body potentials between particles based on the configuration of particle. This could f.e. be used in MC algorithms where particle moves are accepted or declined based on the energy difference. ## Benchmark research To get some feeling for the systems we are working with, we first consider a benchmark case where we analytically know the interaction energy: Namely a 2D mixture of polymers and monodisperse hard spheres. ### Theory of the effective potential of colloid-polymer mixtures Under the right conditions, the polymer-polymer interaction can be described as the interaction between ideal gas particles, while the interaction between colloids and polymers is hard-sphere like, with $$ \sigma_{cp} =\frac{\sigma_c+\sigma_p}{2}, $$ with $\sigma_c$ the diameter of the colloid and $\sigma_p$ the diameter of the polymer. Since the polymers behave like ideal gas particles w.r.t each other, we can integrate them out and instead consider an effective potential between the colloids (as will be shown below). The interaction of the Hamiltonian for all particles ($N_i$, with $i= 1,2$) explicitely is given by $$ \Phi(\mathbf{R}^{N_1}, \mathbf{r}^{N_2}) = \phi_{11}(\mathbf{R}^{N_1}) + \phi_{12}(\mathbf{R}^{N_1},\mathbf{r}^{N_2})+ \phi_{22}(\mathbf{r}^{N_2}). $$ where $\mathbf{R}^{N_1}$ is the set of colloidal coordinates and $\mathbf{r}^{N_2$ is the set of polymer coordinates. Note that $\phi_{22}(\mathbf{r}^{N_2})=0$. In the HS case all interactions between species 1 and 2 are pairwise additive, meaning that we can write $$ \exp[-\beta\Phi_{12}(\mathbf{R}^{N_1},\mathbf{r}^{N_2})] = \prod^{N_2}_{j=1}\exp[-\beta \sum_{i=1}^{N_1}\phi_{12}(\mathbf{R}_i-\mathbf{r_j})], $$ where $\Phi_{12}(\mathbf{R}^{N_1},\mathbf{r}^{N_2})$ is the overall interaction between all particles of species 1 and 2. We can then write down the effictive interaction between the colloids by integrating out the solvent (where the solvent of polymers is treated grand canonically with an associted chemical potential $\mu_2$) $$ \ $$ With some rewriting this can be written as $$ \exp[-\beta\Phi^\text{eff}(\mathbf{R}^{N_1},\mu_2, T)] =\exp[-\beta\phi_{11}+z_2 V_f(\mathbf{R}^{N_1})] $$ with the free volume defined as $$ V_f(\mathbf{R}^{N_1}) = \int d\mathbf{r}\exp[-\beta\sum_{i=1}^{N_1}\phi_{12}(\mathbf{R}_i-\mathbf{r})] $$ Since our interaction is HS-like, only regions that have $\mathbf{R}_i-\mathbf{r}>\sigma_{cp}$ contribute to this free volume. Since the free volume increases when colloids come closer to each other, there is an effective attractive potential when the distance $R_{ij}$ between two colloids $i$ and $j$ lies between $\sigma_{p}<R_{ij}<2\sigma_{cp}$. Although the polymer-colloid interactions are a pair-wise interaction, by integration out the polymers, the interactions between colloids can become multibody interactions when three colloids are close enough to eachother. (see figure below). <img src="https://i.imgur.com/DjOj4xp.png " alt="drawing" width="300" /> In this system the energy per particle is still very easy to compute (namely the amount of free volume that is added when the particle is removed from the system) and therefore it is an easy system to start with. ## Initial snapshots ### Unequilibrated data For the first half of this project we used unequilibrated data (i.e. snapshots randomly inserted colloids) to train the ML algorithms. This data was mainly used to investigate which method of capturing space worked the best. ##### Insert colloids *Found on ODIN /home/users/5670772/ML_multibodyinteract/OLD_2025_02_11/code_Insert_colloids. Data can be found on ODIN in /home/users/5670772/ML_multibodyinteract/DATA/* We wrote a code that randomly inserts HS particles with a diameter $\sigma_c$ in a 2D system at a desired packing fraction (the edges of the box are set such that for a number of particles *N*, we are at the desired packing fraction) Here we show that the code works for a system of $2000$ particles and packing fraction $0.4$ (see place_particles_random.nb) <img src="https://i.imgur.com/iyjaHkm.png " alt="drawing" width="200" /> Note that this picture is made using periodic boundary conditions, however, one could also use hard walls ### Equilibrated data *Data can be found on ODIN in /home/users/5670772/ML_multibodyinteract/OLD_2025_02_11/DATA_eq/* To train the models that are used for the actual simulation, we use equilibrated data. To obtain this data we run a hard disk code at various packing fractions, starting from a hexagonl crystal (higher densities) or a fluid (lower densities). The simulations were initalized for half a milion cycles and then equilibrated for 1 milion cycles. The code that we used for this is the hard disk code that is discussed later in this notebook. It's checks can be found on ODIN in the folder */home/users/5670772/ML_multibodyinteract/code_MC_voronoi_Rastko/CHECKS/NPT_code/* The code itself is run from the folder */home/users/5670772/ML_multibodyinteract/OLD_2025_02_11/DATA_eq/* and is called *MC_harddisks_NVT.cc*. ## ML Voronoi-triangle First we investigated, how well we are able to predict the free volume, where we use different methods to capture the free volume: - [ ] Delaunay triangles - [ ] Voronoi cells - [ ] Voronoi cells divided in triangles. This last method turned out to work the best - the results can be found below. The results of the other methods can be found at the end of the notebook. ### ML Voronoi-triangle To express the environment of each particle, we consider a Voronoi tesselation of the space and then use the triangles formed by each voronoi cell as an input for the NN. To classify the triangles, we use the following parameters * Area triangle * Side 1 * side 2 * side 3 * perimeter * angle between two side coming from the particle. :::danger Note that we do not order the sides by length when we feed them to NN, however, they are always in the same order (i.e. we always go counter clockwise or clockwise when looking at the sides). In my opinion if we provide enough data to train, this will be okay. ::: ### Voro++ code I wrote a code that takes in a snapshot, finds the Voronoi cell and the corersponding triangles of all particles and then outputs the free volume and all parameters the classify these triangles. Since voro++ is only applicaple for 3D systems, we needed to adjust it a little bit for our 2D system. The voronoi cell in our case is a container in the z direction, where both the bottom and the top contain the voronoi cell we are interested in (see f.e. figure below where all the corners are points associated with the voronoi cell). For our analysis we only take the lower points of the voronoi cell. <img src="https://i.imgur.com/YvQxphp.png " alt="drawing" width="300" /> To find the free volume, we use an insertion method. For each triangle that exists, we start inserting points. The points that are closer to one of the colloids than $\sigma_{cp}$ are rejected. With this we can define an acceptance probability $P_{acc} = N_{acc}/N_{tot}$, where, $N_{acc}$ is the total of accepted points and $N_{tot}$ is the total points sampled. The free volume is then given by: $$ V_f = V_\text{triangle}\cdot P_{acc}, $$ where $V_\text{triangle}$ is the volume of the triangle. ### CHECKS **Visual check** *Data can be found on /home/users/5670772/ML_multibodyinteract/OLD_2025_02_11/code_Free_volume_voronoi_triangle/CHECKS/freevolume_check/* The first check is a visual check. We look at a particle and the corresponding voronoi cell. Below we see the particle (blue), the excluded volume (yellow). THe non-allowed points (dark red) and accepted points (other collors). | | | | -------- | -------- | |![](https://i.imgur.com/K85gnuq.png)| ![](https://i.imgur.com/71o8UTX.png) | **Parameters** *Found on ODIN /home/users/5670772/ML_multibodyinteract/OLD_2025_02_11/code_Free_volume_voronoi_triangle/CHECKS/parameters_check/* We also check the parameters | Fig 1. | Fig 2. | | -------- | -------- | |![](https://i.imgur.com/P7QONxL.png)|![](https://i.imgur.com/OV3xvkp.png) | * Fig 1.:On the x-axis we plot the free volume measured. On the y-axis we plot the total volume of the triangle minus the excluded volume. To compute this we used the measured angle between the two base vectors of the triangle (that are connected to the particle), to find the fraction of the excluded particle volume that lies inside this triangle. This last quantity should (and is in our case) always smaller or equal to the actual free volume. * We checked that the total sum of all angles divided by $2\pi$ is equal to 1000 * We checked that the total perimeter of all triangles is equal to the total of all the three sides of all triangles. * We checked that the total area of the triangles is equal to the area of the system. * Finally we checked (Fig.2), that the maximum ratio of the $area/perimeter^2$ is equal to that of an equilateral triangle. Finally, since all snapshots have exactly 6000 triangles (i.e. on average each particle has 6 neighbours), we checked whether nothing was going wrong. Below we show both the voronoi analysis (with the colour of the particle indicating how many neighbours the particle has) and a histogram of the number of neighbours. As one can see, the number of neighbours is, as expected, a distribution around 6. That in total there are exactly 6000 particles, probabaly has to do with the fact that we are using periodic boundaries (and some mathematical stuff, that we did not look into). | Column 1 | Histogram | | -------- | -------- | | ![](https://hackmd.io/_uploads/HyLkNNIIh.png)| ![](https://hackmd.io/_uploads/S1aH4NL8n.png)| ## Training ### Unequilibrated data #### NN voronoi triangles *Found on ODIN /home/users/5670772/ML_multibodyinteract/code_Train_volume_NN_voronoi_triangle/ Results are found in the OUTPUT folder* We train a simple neural network to predict the free volume of triangles. We use a learningrate $= 0.0001$, BATCH_SIZE = $50$, EPOCHS = $10000$ and Relu. For every network we use all the parameters as an input. ##### Packing fraction $0.4$ | $\sigma_p$ | $0.1$ | $0.2$ | $0.3$ | $0.4$ | $0.5$ | $0.6$ | $0.7$ | $0.8$ | | ---------- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | | $\rho$: 2 (20, 5) 10.000 epochs |0.99998 |0.99998 |0.99997 | 0.99997 | 0.99997 |0.99996 |0.99988 |0.99989 | |$\rho$: 2 (20, 5) 100 epochs | | | |0.99942 | | | | | | $\rho$: 2 (5, 5) 100 epochs | | | |0.99918 | | | | | Below we indeed see that the results are very good (Results for the $\rho$: 2 (20, 5) network with 1000 epochs and ReLu ). | $\sigma_P = 0.3$ | $\sigma_P = 0.8$ | | ------------------------------------ | ------------------------------------ | | ![](https://hackmd.io/_uploads/HyaT1dv8n.png)|![](https://hackmd.io/_uploads/SJTpydD83.png)| #### Different packing fractions We furthermore also train our NN for different packing fractions and different $\sigma_P$'s. ##### (2 (5,5) network 10000 epochs) | $PF$ \ $\sigma_p$ | 0.1 | 0.4 | 0.8 | | ----------------- | ------- | ------- | ------- | | 0.1 | 0.99999 |0.99999| 0.99999| | 0.2 |0.99999 |0.99998| 0.99997| | 0.3 | 0.99999 |0.99997|0.99969| | 0.4 |0.99998|0.99990 | 0.99816 | | 0.5 |0.99997|0.99849|0.95053| ##### (2 (20,5) network 10000 epochs) | $PF$ \ $\sigma_p$ | 0.1 | 0.4 | 0.8 | | ----------------- | ------- | ------- | ------- | | 0.1 |0.99999|0.99999|0.99999 | | 0.2 |0.99999 |0.99999|0.99999| | 0.3 | 0.99999 |0.99999|0.99998| | 0.4 |0.99998|0.99998 |0.99989| | 0.5 |0.99997|0.99995|0.99967| As we can see the more complex model works significnatly better for higher $\sigma_P$'s. #### Transferibility Then we check how the models transfer, by applying models trained on different densities/$\sigma_P$'s on other densities/$\sigma_P$'s. **Same density, different $\sigma_p$'s.** | Data/Model| $\sigma_p = 0.1$ | $\sigma_p = 0.2$ | $\sigma_p = 0.3$ | $\sigma_p = 0.4$ | $\sigma_p = 0.5$ | $\sigma_p = 0.6$ | $\sigma_p = 0.7$ | $\sigma_p = 0.8$ | | ---------- | ---------------- | ---------------- | ---------------- | ---------------- | ---------------- | ---------------- | ---------------- | ---------------- | | $\sigma_p = 0.1$ | | | | 0.923 | | | |0.655 | |$\sigma_p = 0.2$ | | | | | | 0.863 | | | |$\sigma_p = 0.3$ | | 0.991 | | | 0.961 | | | | |$\sigma_p = 0.4$ |0.923 | | | | | 0.957 | | 0.837 | |$\sigma_p = 0.5$ | | | 0.961 | | | | 0.952 | | |$\sigma_p = 0.6$ | | 0.863 |0.957 | | | | | 0.946 | | $\sigma_p = 0.7$ | | | | | 0.952 | | | | | $\sigma_p = 0.8$ | 0.654 | | | | | 0.944 | | | **Different density, same $\sigma_p$'s.** Using $\sigma_p = 0.8$ | Data/Model | $pf = 0.1$ | $pf = 0.3$ | $pf = 0.5$ | | ---------------- | ---------- | ---------- | ---------- | | $pf = 0.1$ | | 0.99994 | 0.68684 | | $pf = 0.3$ |0.99988 | | 0.84793 | | $pf = 0.5$ | 0.92477 | 0.99919 | | As exptect the transferibility along different densities is higher (since at different densities but constant $\sigma_P$, equal triangles still have the same excluded volume). The reason that the predictions worsen when we use very different packing fractions is simply because there isn't enough good training data. #### Train on all packing fractions simultaniously *See ML_multibodyinteract/code_Train_volume_NN_voronoi_triangle/OUTPUT/HS_2D_N1000_pf_0.1_0.2_0.3_0.4_0.5__sigma_p_0.8/layers_2_[20, 5, 0, 0]_1000_Epochs/* If we train on all packing fractions simultaneously for $\sigma_P = 0.8$ we obtain an accuracy of $0.99999$, meaning that we can make perfect predictions accross all densities. ### Equilibrated data As mentioned before, for the actual training of the model we want to use snapshots of equilibrated data. Note that using this equilibrated data also allows us to include higher packing fractions, where the system is in a crystal. The packing fractions we use are $\eta\in\{0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9\}$ (note that the maximum packing fraction is $\frac{\pi}{2\sqrt{3}}\approx0.9069$). In total we equilibrated 25 snapshots per packing fraction. We then train a NN on this data (using various hyperparameters) | $\sigma_P$ | snaps per $\eta$ | Sig./ReLu | Epochs | Layers | Par. | Perimeter | $\rho$ | | ---------- | ---------------- | ------------ | ------ | ------- | --- | ---------------------------- | ----------------------------------- | | 0.01 | 10 | ReLu | 1000 | (20, 5) | 251 | Yes | 0.999997 | | 0.1 | 10 | ReLu | 1000 | (20, 5) | 251 | Yes | 0.999999 | | 0.4 | 10 | ReLu | 1000 | (20, 5) | 251 | Yes | 0.999997 | | 0.8 | 10 | ReLu | 1000 | (20, 5) | 251 | Yes | 0.999996 | | 0.8 | 10 | ReLu | 1000 | (20, 5) | 231 | **No** | **0.980584** | | 0.8 | 10 | Sigmoid | 1000 | (20, 5) | 251 | Yes | 0.329439 | | 0.8 | 10 | ReLu | 10000 | (20, 5) | 251 | Yes | 0.999999 | | 0.8 | 10 | ReLu | 1000 | (3,3,3) | 49 | Yes | 0.999994 | | 0.8 | 10 | ReLu | 1000 | (3,3,3) | 43 | No | 0.999995 | | 0.8 | 10 | ReLu | 1000 | (5,5,5) | 101 | Yes | 0.999998 | 0.8 | 24 | Sigmoid | 10000 | (20,5) | 251 | Yes | 0.999857 | | 0.8 | 24 | ReLu | 1000 | (20,5) | 251 | Yes |0.999999 | | 0.8 | 24 | ReLu | 10000 | (20,5) | 251 | Yes | | | 0.8 | 24 | ReLu | 10000 | (3,3,3) | 49 | Yes | | ::: warning Note that the last three NN are trained using a batchsize of 100, while the rest of the runs has a batchsize of 50. For these last two runs I saw that the testdata blew up giving the following error: RuntimeWarning: invalid val c /= stddev[None, :] This is probably because there are a lot of small triangles for which the freevolume is zero. With a too small batchsize, this means that there is a large probability of only having zero free volume triangles in a batch, leading to an SD of 0. ::: ## MC Voronoi *Found on ODIN /home/users/5670772/ML_multibodyinteract/code_MC_voronoi_Rastko/ The final code is called **MC_voronoi_2D_NN_PB_def_20231004.cc*** To implement the voronoi code in a MC simulation, we use the method described by Rastko. > 1. I use Python to construct a Delaunay triangulation (Python calls QHull in the background http://www.qhull.org/) for a set of points (random or coming from somewhere else). Delaunay triangulation is dual to Voronoi diagrams, i.e. centres of triangles of Delaunay triangulation are vertices of the dual Voronoi diagram, and vice versa, vertices of Delaunay triangulation are centres of Voronoi cells. So, knowing one gives you the other one right away. > 2. Then I load the Delaunay triangulation into my own half-edge data structure – a very elegant way to implement triangulated surfaces. For the half edge data structure, see also **active_juction_2d.pdf** on ODIN. > 3. I update the Delaunay triangulation and hence the Voronoi diagram using a so called equiangulation move (discussed in section 7.2 of the attached paper). The idea is simple – for each edge you check the sum of the angles facing it. If the sum is larger than \pi, you flip the edge to recover the Delaunay nature of the triangulation. The procedure is guaranteed to converge and in practice it does so very quickly. Note that each Delaunay triangle is associated with a voronoi vertex, which corresponds to the center of the circumscribed sphere of that triangle. To determine this center we use the algorithm found on https://gist.github.com/mutoo/5617691. I implement the flip moves as follows: Every particle move, I set up a linked list with all the edges that should be checked for a potential edge flip. For every particle move, this list at the beginning concludes all edges associated with the Delauny triangles that surround the moved particle. For each edge in the edgelist we then check whether we should flip the edge - if that is the case I then add all neighbouring edges (i.e. all the edges that surround the flipped edges) to the list of edges that should be checked for a potential flip (if they are not yet on that list already). Note that this is an iterative update scheme. Every edge that is flipped or is associated with a triangle with changed corners is then added to a linked list, for which the new Voronoi corner is computed. ### Different steps/checks #### CHECK: NPT hard disk code *Found on ODIN /home/users/5670772/ML_multibodyinteract/code_MC_voronoi_Rastko/CHECKS/NPT_code/* We first check whether our NPT simulation works for simple hard disks (i.e. without polymers) by looking at the equation of state. Red are our points. We run the simulation for multiple statepoints, every time starting from a non-equilibrated snapshot at $\rho\sigma^2=0.4$. MC simulations are run for in between $10^5$-$5\cdot10^7$ initialization cycles $10^6$-$10^7$ measuring cycles, where the volume is measured every $500$ steps. We checked whether the snapshots where equilibrated. Below we show the results | $\beta P\sigma^2$ versus $\rho\sigma^2$ |$\frac{\beta P\sigma^2-\rho}{\rho^2}$ versus $\rho\sigma^2$ | | -------- | -------- | | ![](https://i.imgur.com/rFw0C41.png)|![](https://i.imgur.com/9FX1Kuj.png)| We use the following points as check: - Blue are points found by *J. Kolafa and M. Rottner "Simulation-based equation of state of the hard disk fluid and prediction of higher-order virial coefficients", Molecular Physics 104 pp. 3435 - 3441 (2006)* - Green line is found by using Henderson hard disk equation of state - Yellow is ideal gas plus up to tenth order virial coefficients - Red are our simulated poins. Below we also show what the simulations looke like for the lowest and the highest pressure. | $\beta P\sigma^2= 1\sigma^2$ |$\beta P\sigma^2= 28$| | -------- | -------- | |![](https://i.imgur.com/8k0uBEX.png)|![](https://i.imgur.com/41Hhnhi.png)| #### Delaunay Tesselation *Code can be found on ODIN /home/users/5670772/ML_multibodyinteract/code_python_triangles/* As stated above, we start the Monte Carlo simulation with an initialized Delaunay triangulation that we obtain using the python package. This python package however doesn't have periodic boundary conditions. We therefore build them in ourselves usng the following steps (see figures) 1. Take the snapshot 2. Add the periodic images using nearest image convention (in this consider 1./4. of the box as a periodic image). 3. Do the Delauny triangulation 4. Only take into account * triangles that lie in the box completely * triangles that are connected to at least on particle inside the box and particles in the right lower quadrant (i.e. particles outside the box for which it is valid that ($y < boxy \text{ and } x > boxy$ and/or $y < 0 \text{ and } 0 < x$). * Triangles that that have one particle for which $y < 0\text{ and } x > boxy$, one for which $y < 0 \text{ and } 0< x boxx$ and one for which $0 < y < boxy \text{ and } x > boxx$. 5. The code then writes out a file that contains the coordinates of the three particles associated with each particle. The procedure is graphically shown below 1.<img src="https://i.imgur.com/z8THtMq.png " alt="drawing" width="250" />2.<img src="https://i.imgur.com/ad48Y1M.png " alt="drawing" width="250" /> 3.<img src="https://i.imgur.com/kUamYYh.png " alt="drawing" width="250" />4.<img src="https://i.imgur.com/Dppo6lK.png " alt="drawing" width="250" /> In the notebook checktriangulation.nb in the CHECKS folder, we show the Delaunay tesselation. We checked for multiple snapshost that the number of triangles is always 2000 with our method. This should be the case since there are 6000 voronoi cells. Since each delaunay triangle will be part of 3 voronoi cells (it has 3 sides), we know that it is a requirement to have 2000 particles. #### Triangulation implementation */home/users/5670772/ML_multibodyinteract/code_MC_voronoi_Rastko/CHECKS/check_delaunay_and_voronoi_triangulation/* \ We implemented the half edge code in our MC NPT simulation. This code thus reads in the Delaunay triangulation obtain with the python code and then uses the half edge flips to update the triangulation. At specific points in time we write out a snapshot, as well as the Delaunay triangulation and the associated voronoi vertices. In the notebook *CHECK_triangulation.nb* (that can be found in the CHECKS folder) we plot these tesselations, as well as the tesselation that we find when we feed the snapshots of the corresponing to the python algorithm. Below we show the reults. In these figures, we see * the particles (larger disks) * the Delaunay tesselation of our algorithm (black dotted lines) * the Delaunay tesselation of the python code (blue lines) * the Voronoi vertices of our algorithm (small red disks) * the Voronoi verticesof the python code (small yellow disks) | $0$ MC steps | $30.000$ MC steps| $90.000$ MC steps | | -------- | -------- | -------- | |![](https://hackmd.io/_uploads/HJCpKAST2.png)|![](https://hackmd.io/_uploads/SkGi9RHTn.png)|![](https://hackmd.io/_uploads/rktV9ABah.png)| #### Measure free volume *See ODIN /home/users/5670772/ML_multibodyinteract/code_MC_voronoi_Rastko/CHECKS/check_delaunay_and_voronoi_triangulation/* To find the free volume associated with each Voronoi triangle, we compute the centre of the circumscribed sphere using the following algorithm. https://gist.github.com/mutoo/5617691. Since each Delaunay triangle is associated with a Voronoi vertex, we can associate a free volume with each edge; namely the parallellogram formed by the to neighouring voroni vortices and the two particles. Note that each of these parallellogram contains two Voronoi triangles (see below). | $0$ MC steps | $30.000$ MC steps| $90.000$ MC steps | | -------- | -------- | -------- | |![](https://hackmd.io/_uploads/rkGCjCHTn.png)|![](https://hackmd.io/_uploads/HJypsAHpn.png)| ![](https://hackmd.io/_uploads/ryy2j0B63.png)| To check the code, there is a build in check that every particle move calculates $\Delta V_o$ and $\Delta V_n$, where $\Delta V_i$ is the volume associated with the edges that have a changed parallellogram (either because the edge flipped or because the Voronoi vertices changed). Since the total volume should not change it should be the cage that $\Delta V_n - \Delta V_o = 0$. Moreover, once every *n* moves, it is checked that the total volume formed by the Delaunay triangles, as well as by the parallollograms, is equal to the overal box volume $V_\text{box} = \text{side}_x \cdot \text{side}_y$. :::info Note that Voronoi triangles that are opposite to eachother have the same area (they share a side, and the other two sides most be equal to eachother, because the corners of the triangle are the centers of the circumscribed spheres). This means that we only have to compute the volume of half of the Voronoi triangles. ::: #### Problem code phase seeperation *See /home/users/5670772/ML_multibodyinteract/code_MC_voronoi_Rastko/CHECKS/bugfix_triangle_moves/* When the system we are looking at phase seperates, the triangles become more and more elongated. As a results we can obtain weird triangle moves. The first possibility is that a particle crosses over the edge of the adjacent triangle (see left picture). To fix this we check whether the angles of the quadrilateral sums up to 360 degrees. If not, we check whether the triangles lie in eacht other. If so, we flip the edge, if not, we still check whether the opposite angles sum up to more than 180 degrees and flip accorndingly. The second possibility is that a particle jumps more than 1 edge (see right picture). | Move inside triangle | Move two triangles | | -------- | -------- | |![](https://hackmd.io/_uploads/r1DJw29ep.jpg)| ![](https://hackmd.io/_uploads/SJK_yvWeT.png) | We can check when such a move happens because then the Delaunay triangulation does not tile the space anymore. If a move like this happens we solve it by iteratively breaking up the move in various little steps until no forbiden moves occur. Although this might not be the most efficient way of solving it, due to the rarereness of the move it does not matter. ::: info The code *MC_voronoi_2D_NN.cc* in the folder /home/users/5670772/ML_multibodyinteract/code_MC_voronoi_Rastko/ on ODIN is the final version of the code as of 24-01-2024 ::: #### Neural network implementation in C *See ODIN /home/users/5670772/ML_multibodyinteract/code_MC_voronoi_Rastko/CHECKS/check_NN_implementation_c/* Since we want to use the trained NN's to predict the free volume of the triangles, we need to implement the NN in c. First we use a python code (*outputweights.py*) that outputs the weights and biases of the NN. Per two layers this cod first outputs the dimensions of the layers (i.e. $n \times m$ means that we go from $m$ nodes to $n$ nodes) and then outputs the weights in a file, where we follow the structure $w_{11}, w_{21}, \dots, w_{m,1}, \dots,w_{mn}$. The biases are written out in the same way (i.e. we start witht the bias associated with the first node). To check whether our NN is implemented correctly, we check for several inputs whether the python network and the c network output the same value (where both networks are implemented with a reLu activation function). | Input | Python output | c output | | -------- | -------- | -------- | |$[1, 1, 1, 1, 1, 1]$ |$0.3108$ | $0.310835$ | |$[0, 0, 0, 0, 0, 0]$ |$0.0910$|$0.091048$| |$[3.4, 0.4, 0.2, 3, 2.5, 4.1]$ |$1.8697$|$1.869749$| The same test as above was executed when the NN was implemented in the actual MC code, yielding again agreement between the c code NN and the python NN. ::: warning Note that we put a relu function over the last layer in the NN to make sure that we cannot have negative volumes. ::: #### MC with effective potential As mentioned in the theory section, the effective potential of the depletent system is given by $$ \exp[-\beta\Phi^\text{eff}(\mathbf{R}^{N_1},\mu_2, T)] =\exp[-\beta\phi_{11}+z_p V_f(\mathbf{R}^{N_1})] $$ with $V_f(\mathbf{R}^{N_1})$ the free volume of the system and $z_2$ the fugacity, defined via $$ \eta_p = \frac{\pi\sigma_p^2z_p}{4} $$ with $\eta_p$ the polymer reservoir packing fraction. This means that the general acceptance rule for moves is given by $$ \text{acc} = \begin{cases} \text{min}(\exp[\beta z_p(V_f^n - V_f^o) - \beta P (V_\text{tot}^n- V_\text{tot}^o) +N\log(V_\text{tot}^n/V_\text{tot}^o)], 1) &\text{No overlap}\\ &\text{Overlap} \end{cases} $$ with $V_f$ the free volume and $V_\text{tot}$ the total volume. ### Results We consider a system with $\eta_c = 0.4$, $\sigma_P = 0.8$ and various densities of polymers. The NN that we use is the (3, 3, 3) network where we include the perimeter (see */home/users/5670772/ML_multibodyinteract/code_Train_volume_NN_voronoi_triangle/OUTPUT_data_eq/HS_2D_N1000_pf_0.05_to_0.9_sigma_p_0.8_snaps10/layers_3_[3, 3, 3, 0]_1000_Epochs_relu1_sigmoid0/*). | Initial snapshot | Delaunay triangulation| | -------- | -------- | | ![](https://hackmd.io/_uploads/Sk4HqPmR2.png)|![](https://hackmd.io/_uploads/SJWVqD7R2.png)| | $\eta_P$ | Final snapshot | Delaunay triangulation | | -------- | -------------- | ---------------------- | | $0.1$ |![](https://hackmd.io/_uploads/BkER0DmR3.png)|![](https://hackmd.io/_uploads/HyzZJum02.png)| |$1.0$ |![](https://hackmd.io/_uploads/BkTy6vX0n.png) |![](https://hackmd.io/_uploads/B1sa1uXC3.png)| | $5.0$ |![](https://hackmd.io/_uploads/BkKICPX03.png)| ![](https://hackmd.io/_uploads/B1ODCP7Ah.png)| | $10.0$ |![](https://hackmd.io/_uploads/By2L3vXRn.png)|![](https://hackmd.io/_uploads/r1nw2w702.png)| | $20.0$ | ![](https://hackmd.io/_uploads/SJN1jDX03.png)| ![](https://hackmd.io/_uploads/By-moPX03.png)| ## Real depletion system As a benchmark system we also implemented a real depletion system (i.e. where the polymers are simulated explicitely). For this we used the NPT code of hard disks, with a grand canonical ensemble of the polymers implemented. The code works as follows: We start at low density with $N_c$ polymers, where the number of polymers and is set such that it matches $\eta_p$. Then the system is quenched until, the size of the box matches $\eta_c$. Note that during this quenching the number of polymers does not change. After the quenching, we start equilibration cycles, where $\Delta R$ is tuned (for polymers and hard disks sepeteratly). Note that this is a $N_C\mu_P VT$ ensemble, i.e. the box size is not changed. Every cycle, we on average also try to add or remove one of the polymers. To do so, we first check whether the move is accepted by checking the acceptance rule. $$ \text{insert, } acc = \text{min}(1,\frac{V\cdot z_p}{N_p+1})\\ \text{remove, } acc = \text{min}(1,\frac{N_p}{V\cdot z_p}) $$ If the insertion move is accepted, we then pick a random $x$ and $y$ for the polymer and check whether that position would lead to overlap with a colloid. If yes, we decline the move, otherwise we accept. This way the packinfraction of polymers inside the free volume is on average equal to $\eta_P$. ##### CHECKS *See ODIN /home/users/5670772/ML_multibodyinteract/code_MC_col_pol_muVT/CHECKS/ideal_gas. Here the code MC_harddisks_polymers_muVT_CHECK_grandcanonical.cc outputs the density for a given $\beta\mu$. The notebook that makes the snapshot for a real depletion snapshot is called depletion_snap.nb and can be found on ODIN /home/users/5670772/ML_multibodyinteract/code_MC_col_pol_muVT/CHECKS/config/ . The trajectories that lead to snap_1000000.dat can be found on ODIN /home/users/5670772/ML_multibodyinteract/code_MC_voronoi_Rastko/CHECKS/check_paircor_realsystem/ML_benchmark/eta_c_0.40_eta_p_5.00/* To check this code, we test it for $N_c=0$, i.e. the case that we only have polymers and thus an ideal gas. Since for an ideal gas $z = e^{\beta\mu}$ and $z = \frac{4\eta_p}{\pi\sigma_p^2} = \frac{N}{V} = \rho$, we know that $\rho = e^{\beta\mu}$. Below we plot both the theoretical line (black dotted), as well as our measured points. As we can see they nicely overlap. |$\rho$ v.s. $\beta\mu$ | Snapshot $\eta_c =0.4$, $\eta_p = 5.$, $\sigma_p = 0.8$| | -------- | -------- | |![Screenshot from 2024-01-15 11-12-18](https://hackmd.io/_uploads/r1OzrYGtp.png)|![Screenshot from 2024-01-16 09-44-20](https://hackmd.io/_uploads/Sy9gMa7Yp.png)| On the right hand side we show a snapshot where we show the colloids (blue) and polymers (red) for $\sigma_c = 5.0$ and $\sigma_p = 0.8$ after a milion MC steps. The snapshot can be found in the * ## Comparing real depletion system and ML learned depletion system *See /home/users/5670772/ML_multibodyinteract/code_MC_voronoi_Rastko/CHECKS/check_paircor_realsystem/ Note that for the real system we use the code found and checked in /home/users/5670772/ML_multibodyinteract/code_MC_col_pol_muVT/* To compare the two systems we measure the pair correlation functions of colloid-colloid, colloid-polymer and polymer-polymer interactions. ##### Measuring $g(r)$ in Voronoi system Note that since we integrated out the polymeres in the ML system, we need to measure the pair correlation function that involve polymers in a different way (see *https://journals.aps.org/pre/pdf/10.1103/PhysRevE.73.041404*). In the paper they say *As the polymer coils are ideal, the number density of the polymer is constant ($\rho_p = \rho_p^r$) in the free volume or the holes of the system. The colloid-polymer and polymer-polymer correlations can therefore be obtained from the colloid-hole and holehole correlations. More precisely, we determine the colloidpolymer and polymer-polymer correlations from 10 000 randomly inserted polymer coils in an instantaneous colloid configuration, provided that no overlap exists between the polymer and colloids.* Note that this procedure breaks down when the probability of inserting a colloid is too small. Therefore, the method is restricted to $\eta_c\leq 0.4$ Note that since we try insert 10.000 polymers every time we measure the pair correlation function, the distribution of the ideal gas of polymers uses a density of $10.000/V$. ##### Measuring $g(r)$ in benchmark system Also in the benchmark system, measuring the colloid-polymer and polymer-polymer distributions are not trivial. The reason is that the number of polymers changes throughout the simulation, meaning that also the distribution of an ideal gas changes. Therefore, every time we measure the pair correlation function, we re-compute the distribution of an ideal gas with comparable density and then normalize the pair correlation function with that distribution. All different pair correlation functions are then averaged to obtain the actual $g(r)$. ### Results We measure the distributions by letting the system initialize for $5\cdot 10^5$ MC steps and then do $10^6$ measure cycles, where we measure the distribution every 100 steps for the benchmark system and every 1000 steps for the Voronoi system (since the insertion method takes a lot of time, we do lightly less measurements for this system). | Parameters | Pair distribution | | ---------------------------------------------------------------- | --------------------------------------------- | | $\sigma_C = 1.0$, $\sigma_P = 0.8$, $\eta_C = 0.4$, $\eta_P = 0.1$ |![Screenshot from 2024-01-16 09-37-41](https://hackmd.io/_uploads/SyIwlT7KT.png)| | $\sigma_C = 1.0$, $\sigma_P = 0.8$, $\eta_C = 0.4$, $\eta_P = 0.5$ | ![](https://hackmd.io/_uploads/BkkmWk802.png)| |$\sigma_C = 1.0$, $\sigma_P = 0.8$, $\eta_C = 0.4$, $\eta_P = 1.0$|![Screenshot from 2024-01-16 09-38-21](https://hackmd.io/_uploads/SkxS9ga7K6.png)| |$\sigma_C = 1.0$, $\sigma_P = 0.8$, $\eta_C = 0.4$, $\eta_P = 5.0$| ![Screenshot from 2024-01-16 09-39-31](https://hackmd.io/_uploads/r1v0eTmKp.png)| ## Better training data Although the results above look very nice we see that when we increase the polymer density, at one point the $g(r)$'s start to differ from each other. The most likely explanation for this is that the trainingdata is lacking configurations with phse seperation. Therefore we made a new trainingdata set where we perform a simulation with colloids and polymers explicitely simulaed for various densities $\eta_c \in [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]$ and various polymer chemical potentials $z_p \in [0.01, 5, 9.94, 15.92]$. Note that the colloid-polymer size ratio is always $1:0.8$. For each of these statepoints we do 10 runs. The data can be found on Odin in */home/users/5670772/ML_multibodyinteract/DATA_pol_col_id/ (the z_p* folders)* To fit the triangulation, we first use use *compute_free_volume_voronoi_triangle_data_col_pol_voro.cc* in */home/users/5670772/ML_multibodyinteract/OLD_2025_02_11/code_Free_volume_voronoi_triangle/* (execute with *ex_freevolume_data_pol_col*) to do the voronoi triangulation. Afterwards this data is read using the code *compute_free_volume_voronoi_triangle_data_col_pol_comp.cc* (same folder), which computes the freevolume (I've done this because to my feeling voro ++ makes the code slower, so that is only used in the first code). #### Checks *See ODIN /home/users/5670772/ML_multibodyinteract/code_Free_volume_voronoi_triangle/CHECKS/CHECKS_phase_seperation/*. Below we show the Voronoi triangulation (red) and the Voronoi triangles (yellow). Below that (for slightly different snapshots, but the same statepoints), we plot the fraction of free volume for that triangle (where dark red is 0 and dark blue is 1). | $z_p=15.92$ ,$\eta_c = 0.4$| $z_p=5$ $\eta_c = 0.4$ | $z_p=9.94$ $\eta_c = 0.4$ | $z_p=5$ $\eta_c = 0.2$ | | -------------------------------------------------------------------------------- | -------------------------------------------------------------------------------- | -------------------------------------------------------------------------------- | -------------------------------------------------------------------------------- | | ![Screenshot from 2024-03-12 09-47-54](https://hackmd.io/_uploads/r1N085Tpa.png) | ![Screenshot from 2024-03-12 09-48-13](https://hackmd.io/_uploads/SJN1w5TaT.png) | ![Screenshot from 2024-03-12 09-48-20](https://hackmd.io/_uploads/rk51vc66p.png) | ![Screenshot from 2024-03-12 10-57-27](https://hackmd.io/_uploads/BysSwoaaa.png) | | ![Screenshot from 2024-03-12 13-53-04](https://hackmd.io/_uploads/SJGueRaa6.png) | ![Screenshot from 2024-03-12 14-23-20](https://hackmd.io/_uploads/S16uvC666.png)|![Screenshot from 2024-03-12 14-23-34](https://hackmd.io/_uploads/H1awwAa6p.png)| ![Screenshot from 2024-03-12 13-53-10](https://hackmd.io/_uploads/rknvxC6aT.png) | ::: info The polymer-polymer pair correlation. is sometimes nan for higher densities of colloids. The reason is that at these high densities there will be snapshots where there are no polymers. Since we divie by the number of polymers this leads to an infinitely high pair correlation. ::: ### Results new training data #### Pair correlation function *The ML data can be found on ODIN in the folder /home/users/5670772/ML_multibodyinteract/SIM_ML_MC_col_pol/SIM_ML_MC_col_pol/sigma_c_1_sigma_p_0.8/, while the actual simulation data can be found on ODIN /home/users/5670772/ML_multibodyinteract/DATA_pol_col_id/ (the eta\* folders). The notebook can be found in the folder /home/users/5670772/ML_multibodyinteract/ANALYSIS/ and is called Paircorrelation_MC_EDMD_ML.nb* We check the trained ML model by measuring the pair correlation for the real system and the ML system for various densities of both polymers and colloids ($\eta_c\in[0.4, 0.6]$ and $\eta_p\in[0.1,0.5,5, 8]$). Since we ran the ML system in terms of $\eta_P$ instead of $z_P$ we also performed new real simulations with the exact same $\eta_P$ to make sure we are making a fair comparison. In principle we start the ML simulations using a snapshot with the correct $\eta_c$ that is equilibrated at a $z_p = 0.01$ (*found on ODIN /home/users/5670772/ML_multibodyinteract/DATA_pol_col_id/z_p_0.01/*), i.e. almost no polymers. However, since we also wanted to check what the influences is of starting in an equilibrated simulation with the correct $\eta_P$, for both $\eta_C$ we take one of the equilibrated snapshots of $\eta_P=5$ as a starting point for the MC simulation and check what the pair correlation is for this system (for $\eta_C=0.4$ we use run 1007 (for this snap showed a nice phase seperation) for this and for $\eta_C=0.6$ we use run 1007). *This DATA can be found at the same location as the other runs, i.e. f.e. for $\eta_c = 0.4$ and $\eta_P= 5$ it can be found in /home/users/5670772/ML_multibodyinteract/SIM_ML_MC_col_pol/SIM_ML_MC_col_pol/sigma_c_1_sigma_p_0.8/pf_c_0.4/Eta_c_0.40_eta_p_5.00/ however instead of run_\* the folder with this type of simulation is called CHECK*. Because at these polymer densities there is already phase seperation, we don't use the normal readtriangle function that uses normal PB. Instead there is a new version of the code *MC_voronoi_2D_NN_PB_init.cc* that already incorporates the PB method that we use throughout the rest of the simulation. **N.B. Note that this version of the code is not thoroughly tested, so one should always check whether the Voronoi triangulation indeed works correctly** Below we show the results. For both systems we run 10 runs, the open symbols are the average $g(r)$ as measured in the ML sysmte, the solid line is the average $g(r)$ measured in the real system. The colored zones indicate $2\sigma$ as measured for the 10 real runs. | | $\eta_P = 0.1$ |$\eta_P = 0.5$ | $\eta_P = 5$ | $\eta_P = 8$ | | -------- | -------- | --- | --- | -------- | | $\eta_C = 0.4$ |![Screenshot from 2024-03-28 14-01-25](https://hackmd.io/_uploads/S1m45yQ10.png) |![Screenshot from 2024-03-28 14-01-44](https://hackmd.io/_uploads/BJHSqJ71A.png) | ![Screenshot from 2024-03-28 14-01-59](https://hackmd.io/_uploads/S1BIqJmk0.png)| ![Screenshot from 2024-04-02 12-01-05](https://hackmd.io/_uploads/H1Z_PLYJR.png)| | $\eta_P = 0.6$ | ![Screenshot from 2024-03-28 14-03-59](https://hackmd.io/_uploads/r1lAqJmJA.png) | ![Screenshot from 2024-04-02 12-01-54](https://hackmd.io/_uploads/r1xsvLtJC.png)| ![Screenshot from 2024-03-28 14-04-38](https://hackmd.io/_uploads/r1UliyQJ0.png) | ![Screenshot from 2024-03-28 14-05-06](https://hackmd.io/_uploads/SkQzoy7JR.png)| We also compare the $g(r)$ that was measured in the ML system that uses an equilibrated snapshot at $\eta_p=5$, below we show the results. The black line indicates the run of the system that we took the snapshot of. The other solid lines indicate the other runs. | | $\eta_C = 0.4$ | $\eta_C = 0.6$ | | -------- | -------- | -------- | | $\eta_P = 5$ | ![Screenshot from 2024-03-28 14-02-23](https://hackmd.io/_uploads/SJnF9JQ1R.png)| ![Screenshot from 2024-04-02 12-03-28](https://hackmd.io/_uploads/rygZ_8t10.png)| ::: info The fact that the the phase seperation is hard to capture correctly in the $g(r)$ is because the phase boundaries have a lot of influence on the form of the function. Below we show for various runs what the system looks like (where the upper left corner shos the ML system). As we can see it really differes per run whether we still have a round phase boundary, or a flat boundary (which is the equilibritum configuration). | | | | | | | -------- | -------- | --- | --- | -------- | | ![Screenshot from 2024-03-25 09-54-33](https://hackmd.io/_uploads/By-L7pCRa.png)| ![Screenshot from 2024-03-25 09-54-05](https://hackmd.io/_uploads/HJKUmT0C6.png)|![Screenshot from 2024-03-25 09-53-51](https://hackmd.io/_uploads/Hkzv76ACp.png)| ![Screenshot from 2024-03-25 09-53-41](https://hackmd.io/_uploads/SJ_DQ6AC6.png)| ![Screenshot from 2024-03-25 09-53-32](https://hackmd.io/_uploads/Skaw7aRCp.png)| | ![Screenshot from 2024-03-25 09-53-09](https://hackmd.io/_uploads/SyV_mpACa.png)| ![Screenshot from 2024-03-25 09-52-36](https://hackmd.io/_uploads/HknFm6R0p.png)|![Screenshot from 2024-03-25 09-51-59](https://hackmd.io/_uploads/S125Q6CCp.png)| ![Screenshot from 2024-03-25 09-51-45](https://hackmd.io/_uploads/ryms7TCR6.png) | ![Screenshot from 2024-03-25 09-51-00](https://hackmd.io/_uploads/HJZhXp0RT.png) | ::: ::: danger **2024-04-22** For the lower size ratio it seems like the ML system works less well... It could be that the training data for this lower size ratio just simply does not work well. Ofcourse these polymers should be closer to eachother to still have an energy gain. Therefore it would be nice to also train on data where we explicitely simulate the system for $\sigma_P = 4$. Since we did start the simulations as a function of $eta_P$ and not $z_P$, this means that the training data has a different set-up from the trainingdata we used for $\sigma_P = 0.8$. To make sure the training data is well balanced, I want to retrain both size ratio's based on the $\eta_P$ data. This means that we need to do the following steps: - [ ] Wait for all training data to finish - [ ] Compute free volume all triangles - [ ] Retrain NN both size ratio's - [ ] Start SIM_ML simulations both size ratio's - [ ] Start SIM_ML simulations $\sigma_P=0.8$ that start in the equilibrated simulation - [ ] Start Long box simulations for both systems ::: ::: danger *See notebook Freevolmeasurement_0.4.nb in folder /home/users/5670772/ML_multibodyinteract/DATA_pol_col_id/* Maybe it works better to generate triangles in a different fashion. At this moment our data is quite unbalanced | Free volume |Volume | Area/perimeter$^2$ | | -------- | -------- | -------- | | ![Screenshot from 2024-04-23 15-40-21](https://hackmd.io/_uploads/Sy485VSW0.png) | ![Screenshot from 2024-04-23 15-41-51](https://hackmd.io/_uploads/SJ6j9NSZR.png)| ![Screenshot from 2024-04-23 15-42-21](https://hackmd.io/_uploads/rkATcNHb0.png) | Note that the max ratio between $A$ and perimeter$^2$ area and perimeter, is $\frac{1}{4} \sqrt{3}/3^2$, which is indeed the max value we find in our data. At this moment half of our data has a freevolume of zero and only a slight amount of the dataset has a Area/perimeter$^2$ that really deviates from an almost equilside traingle ::: ## Long box simulation To make the long box simulation method work, we first need a way to measure the pressure in a system with a discontinous potential ### Measure the pressure in an NVT ensemble of hard disks Since our particles have discontinuous potentials, to measure the pressure in our system, we need to to do trial volume moves along the long axis. The energy change associated with this will give us $P$ (note that this is kind of like Widom test insertion). Given a small volume change along one axis $\delta V_{i}$, we can write the free energy difference as $$ \begin{align} F(N, V+\delta V, T)- F(N, V, T) &=-k_BT\log\left(\frac{Z_{V+\delta V}}{Z_V}\right)\\ &=-k_BT\log\left(\frac{\int d\mathbf{s}^Ne^{-\beta U(\mathbf{s}^N, V+\delta V)}(V+\delta V)^N}{\int d\mathbf{s}^Ne^{-\beta U(\mathbf{s}^N, V)}V^N}\right)\\ &=-k_BT\log\left\langle e^{-\beta( U(\mathbf{s}^N, V+\delta V)-U(\mathbf{s}^N, V))+\log\frac{(V+\delta V)^N}{V^N}}\right\rangle\\ &=-k_BT\log\left\langle e^{-\beta\left[ U(\mathbf{s}^N, V') - U(\mathbf{s}^N, V)-Nk_BT\log\frac{V'}{V}\right]}\right\rangle \end{align} $$ Since we also know that $F(N, V+\delta V, T)- F(N, V, T) = -P(N,V,T)\delta V$ we thus find $$ \begin{align} \beta P(N,V,T)&=\frac{1}{\delta V}\log\left\langle e^{-\beta\left[ U(\mathbf{s}^N, V') - U(\mathbf{s}^N, V)-Nk_BT\log\frac{V'}{V}\right]}\right\rangle \end{align} $$ If we want to find the pressure in a certain direction, we have to change the volume in a certain direction. Assume we change the $\delta V_x$ in the $x$-direction, then $\delta V = \text{box length}_y\cdot \delta V_x$. This means that the pressure in the $x$-direction becomes $$ \begin{align} \ \beta P_{xx}(N,V,T)&=\frac{1}{\text{box length}_y\cdot \delta V_x}\log\left\langle e^{-\beta\left[ U(\mathbf{s}^N, V') - U(\mathbf{s}^N, V)-Nk_BT\log\frac{\text{box length}_{x}'}{\text{box length}_{x}}\right]}\right\rangle \end{align} $$ #### CHECKS: Hard disks *See /home/users/5670772/ML_multibodyinteract/OLD_2025_02_11/code_NVT_pressure/CHECKS_NVT_measurepress* As a starting point to test this method, we will use hard disks. For hard disks $e^{-\beta\left[ U(\mathbf{s}^N, V') - U(\mathbf{s}^N, V)-Nk_BT\log\frac{V'}{V}\right]}$ simplifies to either 0 when there is overlap or $\frac{V'^N}{V^N}$ when there is no overlap. By measuring the average value as a function of $\delta V$ we should regain the EOS for HS. We can do this for multiple values of $\delta V$ and see which values lead to the best results We measure the pressure for various ensities in the range $[0.2, 1.1]$ and various $\delta V$ in the range $[0.001, 0.5]$. We run the simulation for $500000$ init steps and $5000000$ measure steps. In the simulation we do both volume moves in the $x$- as well as in the $y$-direction, such that we obtain both $P_{xx}$ and $P_{yy}$. | Pressure| Ratio $P_{xx}/P_{yy}$ $5\cdot10^6$ cycles | -------- | -------- | | ![Screenshot from 2024-03-18 10-09-55](https://hackmd.io/_uploads/rkOlBFSA6.png)| ![Screenshot from 2024-03-18 10-07-54](https://hackmd.io/_uploads/B1vOEtB0p.png) | |Pressure with SD |SD as a function of $\delta V$ for various densities | | ![Screenshot from 2024-03-18 10-08-54](https://hackmd.io/_uploads/rkt24YS0T.png)| ![Screenshot from 2024-03-18 10-09-17](https://hackmd.io/_uploads/HJipVFB0a.png)| in the first graph the black line shows the EOS obtained using a normal NPT simulation, data for this graph can be found (See ODIN */home/users/5670772/ML_multibodyinteract/code_MC_voronoi_Rastko/CHECKS/NPT_code/*). Note that this the data that we used earlier on this page to check the EOS of our NPT simulation. ::: warning Note that in some of the plots we plot the SD of the value. Note however that this is not completely correct; the SD assumes a normal distribution, while we clearly have a binary probability distribution. It will however still give an indication of the spread in the data. In the lower right plot we plot the SD for various densities as a function of $\delta V$. For later data we use another method to select the best points. Since we want a good average we want the move to be accepted more or less 50 % of the time. ::: We also checked that running more cycles does not change the average or the ratio between $P_{xx}$ and $P_{yy}$ significantly. Finally we ran 10 simulation for a specific density and $\delta V$, after which we computed the spread in the data, which turns out to be rather small. ![Screenshot from 2024-03-20 09-43-55](https://hackmd.io/_uploads/BkibGQOC6.png) ### EDMD code Frank To test how well our ML method works, we test it against the EDMD code of Frank. Therefore we first need to test whether the EDMD code and the ML simulations give good EOS results. #### CHECK EDMD code *See ODIN /home/users/5670772/ML_multibodyinteract/OLD_2025_02_11/code_NVT_pressure/CHECKS_dif_simulation_pressures/, results can be found in the notebook EOS.nb* For $\eta_P= 5$ we meausure the EOS with four different methods: - EDMD (black) - MC with polymers simulated (blue) - MC with ML simulated and volume moves in both directions simultaneously (yellow) - MC with ML simulated and volume moves in both directions seperated (red) In this last method, we implement Voro++ in the ML simulation (see code *MC_voronoi_2D_NN_maesvol.c* in the folder *ML_int_xx_yy_moves*). Everytime a volume move is performed, we check whether new volume computed by the edges is different from the actual volume. If so, the triangulation changed and we need voro++ to calculate the new volume. (the voro triangles, calculated with voro++ are then used to compute the new free volume). *Note that Since the actual volume move is never accepted, we do not have to implement the new triangulation.* First we look at how the system responds to different values of $\delta V$. | | MC | ML equal volume moves |ML diff volume moves | | --- | ---- | --------------------- | -------- | | SD as function $\delta V$ | ![Screenshot from 2024-04-03 13-14-47](https://hackmd.io/_uploads/rkYhzTq1C.png)| ![Screenshot from 2024-04-03 13-15-07](https://hackmd.io/_uploads/BythMpq1C.png) |![Screenshot from 2024-04-03 13-15-26](https://hackmd.io/_uploads/SyF2fTqk0.png) | | SD v.s. fraction of zeros | ![Screenshot from 2024-04-03 13-14-52](https://hackmd.io/_uploads/BkthfT9kR.png) | ![Screenshot from 2024-04-03 13-57-36](https://hackmd.io/_uploads/S1eH465yC.png)|![Screenshot from 2024-04-03 13-15-32](https://hackmd.io/_uploads/H1dhMack0.png) | Below we plot the EOS (see colors in list above for legends). Open symbols are lowest SD, closed symbols are fraction 0 closests to $0.5$, i.e. about $50\%$ acceptence. | | EOS all |EOS and lowest SD |EOS and fraction zeroes lowest to 0. | | --- | ---- | --------------------- | -------- | | |![Screenshot from 2024-04-03 13-15-53](https://hackmd.io/_uploads/BydhM69kA.png) |![Screenshot from 2024-04-03 13-16-01](https://hackmd.io/_uploads/B1u2zT5kR.png)| ![Screenshot from 2024-04-03 13-16-09](https://hackmd.io/_uploads/Hk_hzpckR.png) | #### EOS different size ratios *Data can be found on ODIN in /home/users/5670772/ML_multibodyinteract/OLD_2025_02_11/DATA_EDMD/, code is run with the code found in /home/users/5670772/ML_multibodyinteract/OLD_2025_02_11/code_EDMD/code_Frank/. Notebooks that show the EOS are called EOS_sigma_p_\*.nb and can be found in the DATA_EDMD folder* To check for which statepoint we have crystal-fluid/gas coexistences we compute the EOS for both size ratio's below.The different labels correspond to different $\eta_P$'s. Note that all simulations ran for only 50.000 steps (because of computer time), so many are not well equilibrated. | Sizeratio $1:0.4$ | Sizeratio $1:0.8$ | | -------- | -------- | |![Screenshot from 2024-04-15 15-12-20](https://hackmd.io/_uploads/HkSRvoqg0.png)| ![Screenshot from 2024-04-08 15-55-06](https://hackmd.io/_uploads/rJWUwdbxR.png)| #### Size ratio $1:0.8$ As it turns out, the larger size ratio's haver has a crystal fluid coexistence, only a crystal-dilute gas phase. #### Size ratio $1:0.4$ *Snapshots are not saved, we took screenshots from the mov.\* files in the OLD_2025_02_11/DATA_EDMD/sigma_p_0.4 folder on ODIN* We want for the lower size ratio a statepoint where there is coexistence between a crystal and a fluid/gas | | $\eta_c = 0.2$ | $\eta_c = 0.3$ | $\eta_c = 0.4$ | $\eta_c = 0.5$ | $\eta_c = 0.6$ | $\eta_c = 0.7$ | | -------- | -------- | --- | --- | --- | --- | -------- | |$\eta_P = 1.5$ |![Screenshot from 2024-04-15 15-29-21](https://hackmd.io/_uploads/rJMTss5xC.png)| | | | | | |$\eta_P = 1.75$|![Screenshot from 2024-04-16 09-24-54](https://hackmd.io/_uploads/B1_0wsseA.png)||![Screenshot from 2024-04-16 09-25-23](https://hackmd.io/_uploads/BkBxdoilC.png)|![Screenshot from 2024-04-16 09-26-48](https://hackmd.io/_uploads/SJkIOjjl0.png)|![Screenshot from 2024-04-16 09-26-08](https://hackmd.io/_uploads/rkkXOjolC.png)|| |$\eta_P = 2.0$ |![Screenshot from 2024-04-15 15-30-33](https://hackmd.io/_uploads/BkYZhicgC.png)| ![Screenshot from 2024-04-15 15-30-46](https://hackmd.io/_uploads/SJEznjcgC.png)| ![Screenshot from 2024-04-15 15-30-57](https://hackmd.io/_uploads/SkQQnj5lR.png)|![Screenshot from 2024-04-15 15-31-09](https://hackmd.io/_uploads/BJCmhsqlR.png)| ![Screenshot from 2024-04-15 15-31-19](https://hackmd.io/_uploads/rks4ni5l0.png)| ![Screenshot from 2024-04-15 15-31-43](https://hackmd.io/_uploads/S1aShsce0.png)| |$\eta_P = 2.25$|![Screenshot from 2024-04-16 09-27-59](https://hackmd.io/_uploads/Bkb9dojg0.png)|||![Screenshot from 2024-04-16 09-27-31](https://hackmd.io/_uploads/HJ8ddoigR.png)||| |$\eta_P = 2.5$ |![Screenshot from 2024-04-15 15-31-51](https://hackmd.io/_uploads/r1rL3i5l0.png)|![Screenshot from 2024-04-15 15-32-00](https://hackmd.io/_uploads/Sktw3iqeA.png) | | | | | ::: info From these simulations, we see that probably $\eta_c = 0.5$ and $\eta_P=1.75$ will lead to good fluid/crystal coexistences ::: ### Long box Measurements #### Real system To make sure the system does not melt in the beginning, we start in an FCC colloid crystal and a void that is filled with polymers at the correct chemical potential (see blow). | Different crystal densities | Different crystal densities | Different crystal densities | | -------- | -------- | -------- | | ![Screenshot from 2024-04-05 16-51-46](https://hackmd.io/_uploads/S1kGe9TkA.png) | ![Screenshot from 2024-04-05 16-47-44](https://hackmd.io/_uploads/HyV51qTyR.png) | ![Screenshot from 2024-04-05 16-50-43](https://hackmd.io/_uploads/rk_A156JA.png) | #### Measurements EDMD *Data can be found on ODIN /home/users/5670772/ML_multibodyinteract/OLD_2025_02_11/DATA_EDMD_longbox/sigma_p_*/ Code is run with the code found in /home/users/5670772/ML_multibodyinteract/OLD_2025_02_11/code_EDMD/code_longbox/, notebook is located in the DATA folder and is called EOS_longbox_sigma_p_0.8.nb* Since Franks code did not contain a stress tensor measurement we implemented that. Moreover, we perform the reference simulation of the crystal for both $\eta_P = 0$ and $\eta_P = 7$. Below we show in the first two collums the reference simluations. ##### $\eta_P = 7$, size ratio is $1:0.8$ For this size ratio we look at quite a high $\eta_P$ and quite a high global density, such that the amount of polymer-gas (and thus the size of the triangles in the ML simulation) is not to big. | Check pressure tensor | Check different $\epsilon$ values | Long box simulation| | -------- | -------- | -------- | | ![Screenshot from 2024-04-08 15-12-46](https://hackmd.io/_uploads/HJa8pwWe0.png)| ![Screenshot from 2024-04-08 15-12-56](https://hackmd.io/_uploads/Sk8v6wZxC.png)| ![Screenshot from 2024-04-16 10-16-27](https://hackmd.io/_uploads/Hk4WEnig0.png)| In the last collumn we show the longbox simulation for a system with global colloid density given by $\eta_B = 0.7$. #### Size ratio $1:0.4$, multiple $\eta_P's$ We consider various $\eta_P$'s for this size ratio. As we can see for the lower $\eta_P$'s the system does in the end melt, i.e. there is no crystal fluid coexistence. For the $\eta_P =1.7$'s the system does stay a crystal, but it turns for the higher packing fractions, which is why the y-component goes | | $\eta_p = 1.6$, $\rho_c\sigma^2 = 0.81$ | $\eta_p = 1.7$,$\rho_c\sigma^2 = 0.81$ | $\eta_p = 1.75$,$\rho_c\sigma^2 = 0.81$ | | --------- | -------------------------------------------------------------------------------- | -------------------------------------------------------------------------------- | -------------------------------------------------------------------------------- | | Snapshots |![Screenshot from 2024-04-22 13-26-32](https://hackmd.io/_uploads/B1xKKaXb0.png)|![Screenshot from 2024-04-22 12-23-52](https://hackmd.io/_uploads/SkqCi2QZC.png)|![Screenshot from 2024-04-22 12-24-37](https://hackmd.io/_uploads/HJdTonX-A.png) | | Pressures | ![Screenshot from 2024-04-22 12-07-56](https://hackmd.io/_uploads/B1HrinQbC.png) | ![Screenshot from 2024-04-22 12-26-52](https://hackmd.io/_uploads/Byids2mZR.png) | ![Screenshot from 2024-04-22 12-16-49](https://hackmd.io/_uploads/BJEKjhX-0.png) | ::: danger What to do with this.... we still have basicallly crystal-very dilute gas ::: #### Measurement ML *Data can be found on ODIN /home/users/5670772/ML_multibodyinteract/OLD_2025_02_11/DATA_voronoi_longbox/sigma_p_0.8/ Code is run with the code found in /home/users/5670772/ML_multibodyinteract/OLD_2025_02_11/code_EDMD/OLD_2025_02_11/code_voronoi_longbox/, :warning: notebooks is still located on my desktop in the folder /home/rinskealkemade/Documents/ML_multibodyinteract/DATA_voronoi_longbox and is called EOS_LB_sigma_P_0.8_pf_X_0.70_compare_EDMD.nb* For the machine learning simulation we use the same hyper parameters, however now we also need to check which value of $\delta V$ works the best. Below we show the results for the reference crystal | SD x moves | SD y moves|SD x versus fraction moves 0 |SD y versus fraction moves 0 | | -------------------------------------------------------------------------------- | -------------------------------------------------------------------------------- | -------- | --- | | ![Screenshot from 2024-04-15 14-39-40](https://hackmd.io/_uploads/SytflsceA.png) | ![Screenshot from 2024-04-15 14-39-49](https://hackmd.io/_uploads/SJQ7xj5lC.png) | ![Screenshot from 2024-04-15 15-01-38](https://hackmd.io/_uploads/rJ18Bicg0.png)|![Screenshot from 2024-04-15 15-02-00](https://hackmd.io/_uploads/Hkkvrocl0.png)| ##### $\eta_P = 7$, size ratio is $1:0.8$ We select both in the x and the y -direction independently the runs with a $\delta V$'s that has the lowest SD/fraction of zeroes closest to $0.5$. Then compute $P = (P_{xx}+ P_{yy})/2$ to compte the pressure. Below we show the the results of selecting the best runs. |Lowest SD | Fraction of zero closes to 0.5 | | -------- | -------- | | ![Screenshot from 2024-04-15 14-47-24](https://hackmd.io/_uploads/HJejGj9e0.png)| ![Screenshot from 2024-04-15 14-47-28](https://hackmd.io/_uploads/HJsqMo5xC.png)| ::: danger How to choose a good value for $\delta V$? ::: For the coexistence long box simulation we do the same simulations. Together we get the following result |Lowest SD | Fraction of zero closes to 0.5 | | -------- | -------- | |![Screenshot from 2024-04-15 15-07-30](https://hackmd.io/_uploads/S1ViIj5xR.png)| ![Screenshot from 2024-04-15 15-07-46](https://hackmd.io/_uploads/HyX28jqg0.png)| With the EDMD together, this becomes for the SD plot. As we can see it is not perfect, however, there is quite a spread in the pressures as measured with the ML method (see picture on the right where all the green-yellow lines are different pressure runs). |Bases on lowest SD| Spread in resulst| | -------- | -------- | | ![Screenshot from 2024-04-16 10-23-11](https://hackmd.io/_uploads/HJwFS2jxR.png)| ![Screenshot from 2024-04-15 15-09-20](https://hackmd.io/_uploads/BkRZPs5lR.png)| ## Generate own trainingdata *Found on ODIN /home/users/5670772/ML_multibodyinteract/DATA_gen_tri/ we use the folders /$*$difrandom_smalltri and the notebook generaterandomtri_difmethod_small_tri.cc.* (see below which data belongs to which). We use two different schemes to produce the triangles 1. The first way to generate our own trainingdata, we perform the following scheme: - Generate a random triangle - Accept the triangle given the biaspotential $\exp[-\beta\left(\frac{1/4\sqrt{3}}{3^2}- \text{Area}/\text{Perimeter}^2\right)]$, where $\frac{1/4\sqrt{3}}{3^2}$ is the ratio between $\text{Area}/\text{Perimeter}^2$ for an equilateral triangle or the bias potential $\exp[-\beta\left(\frac{\pi}{3}- \theta \right)]$ with $\theta$ the angle between the two sides that are connected to the particle. - Scale the triangle such that the two sides connected to the particle and the height are at least the particle radius. - Scale the area of the triangle 100 times, of which 50 times with a scaling factor between 1 and $4\sigma_{cp}^2/\sigma_{c}^2$ and 50 times with a scaling factor between $4\sigma_{cp}^2/\sigma_{c}^2$ and 1000. - Only triangles with a volume less than 100 and a side length less than 20 are accepted. 2. The second way to generate our own trainingdata, we perform the following scheme: - Generate a random triangle by randomly picking an angle between $[0, \pi]$ and two side lengths between $[0,1]$. - Accept the triangle given the biaspotential $\exp[-\beta\left(-beta(dr_1-dr_3)^2\right)]$, where $dr_1$ and $dr_3$ are the two sides of the equilateral triangle connected to the particle. - Scale the triangle such that the two sides connected to the particle and the height are at least the particle radius. - Scale the area of the triangle 20 times, of which 10 times with a scaling factor between 1 and $4\sigma_{cp}^2/\sigma_{c}^2$ and 10 times with a scaling factor between $4\sigma_{cp}^2/\sigma_{c}^2$ and 1000. - Only triangles with a volume less than 100 and a side length less than 20 are accepted. ::: warning Moreover, for this method we bias particles with small angles (which are present a lot in the simulations), every first 50 triangles of every run we generate random triangles with an angle between $[0, 0.001]$. The problem was that there are almost no random triangles with an angle of $0$ degrees, while in the actual simulation those occur. Real simulation triangle angles ![image](https://hackmd.io/_uploads/HkY6atpBA.png) Generated triangle angles ![Screenshot from 2024-06-17 12-20-19](https://hackmd.io/_uploads/SyIbJcprR.png) ::: We try different bias potentials $\beta$, see figure below | Biase skewness | Bias angle | | -------- | -------- | | ![Screenshot from 2024-06-10 11-10-20](https://hackmd.io/_uploads/rkYWmr4r0.png) | ![Screenshot from 2024-06-10 11-54-15](https://hackmd.io/_uploads/BkBLTSEHA.png) | #### $\sigma_p = 0.8$ trainingdata skewness bias | | Simulated data | $\beta = 0$ | $\beta = 5$ | $\beta = 10$ | $\beta = 20$ | $\beta = 50$| $\beta = 1000$ | | ---------------------------- | -------------------------------------------------------------------------------- | -------------------------------------------------------------------------------- | -------------------------------------------------------------------------------- | -------------------------------------------------------------------------------- | -------------------------------------------------------------------------------- | -------------------------------------------------------------------------------- |-------------- | | Log Free area vs Area | ![Screenshot from 2024-06-10 11-24-39](https://hackmd.io/_uploads/SyVvUBNrA.png) | ![Screenshot from 2024-06-10 11-38-11](https://hackmd.io/_uploads/SyHqtHEBA.png) | ![Screenshot from 2024-06-10 11-33-39](https://hackmd.io/_uploads/By-FOSVHR.png) | ![Screenshot from 2024-06-10 11-31-41](https://hackmd.io/_uploads/rk_zdSErA.png) | ![Screenshot from 2024-06-10 11-29-45](https://hackmd.io/_uploads/HkdcPrNrC.png) | ![Screenshot from 2024-06-10 11-26-17](https://hackmd.io/_uploads/BJTTIr4SR.png) | ![Screenshot from 2024-06-10 11-18-54](https://hackmd.io/_uploads/SySNrrVSR.png) | | Free area vs Area | ![Screenshot from 2024-06-10 11-24-52](https://hackmd.io/_uploads/Hk4uLS4rC.png) | ![Screenshot from 2024-06-10 11-38-26](https://hackmd.io/_uploads/H1XsFB4BC.png) | ![Screenshot from 2024-06-10 11-33-52](https://hackmd.io/_uploads/BJ1kFSErA.png) | ![Screenshot from 2024-06-10 11-32-08](https://hackmd.io/_uploads/H1IXurNSR.png) | ![Screenshot from 2024-06-10 11-29-55](https://hackmd.io/_uploads/SJxhwBNr0.png) | ![Screenshot from 2024-06-10 11-26-34](https://hackmd.io/_uploads/S1oRLS4r0.png) | ![Screenshot from 2024-06-10 11-19-49](https://hackmd.io/_uploads/S1BHSr4BC.png) | | | Histogram Area/perimeter$^2$ | ![Screenshot from 2024-06-10 11-25-11](https://hackmd.io/_uploads/rJLYUrNBA.png) | ![Screenshot from 2024-06-10 11-38-42](https://hackmd.io/_uploads/BJDntS4H0.png) | ![Screenshot from 2024-06-10 11-35-34](https://hackmd.io/_uploads/r1SlYH4B0.png) | ![Screenshot from 2024-06-10 11-32-22](https://hackmd.io/_uploads/ByY4urNS0.png) | ![Screenshot from 2024-06-10 11-30-24](https://hackmd.io/_uploads/SJJpPSVSC.png) | ![Screenshot from 2024-06-10 11-26-52](https://hackmd.io/_uploads/By-xPS4BR.png) | ![Screenshot from 2024-06-10 11-20-39](https://hackmd.io/_uploads/S1dOBH4BA.png) | | | Angle distribution | ![Screenshot from 2024-06-10 11-23-46](https://hackmd.io/_uploads/SkEEISEBR.png) | ![Screenshot from 2024-06-10 11-39-07](https://hackmd.io/_uploads/Bk26FHVBR.png) | ![Screenshot from 2024-06-10 11-35-51](https://hackmd.io/_uploads/BkubKBNSR.png) | ![Screenshot from 2024-06-10 11-32-39](https://hackmd.io/_uploads/SJCSuBNH0.png) | ![Screenshot from 2024-06-10 11-30-41](https://hackmd.io/_uploads/H1QRPHVHR.png) | ![Screenshot from 2024-06-10 11-27-19](https://hackmd.io/_uploads/SJrWwS4HC.png) | ![Screenshot from 2024-06-10 11-28-57](https://hackmd.io/_uploads/Bk5wPB4SR.png) | | | Fraction freevol zero | $56\%$ | $2.1\%$ | $2.3\%$|$2.3\%$|$2.4\%$ | $2.7\%$| $3.2\%$| | Accuracy training |$0.99997317$ | | $0.99999997$ | $0.99999975$ || $0.99999987$ | $0.99999999$ | #### $\sigma_p = 0.8$ trainingdata angle bias | | Simulated data | $\beta = 0$ | $\beta = 0.5$ | $\beta = 1$ | $\beta = 2$ | $\beta = 3$ | $\beta = 10$ | | ----------------- | -------------- | ----------- | ----------- | ------------ | ------------ | ------------ | -------------- | | Free area vs Area |![Screenshot from 2024-06-10 11-24-52](https://hackmd.io/_uploads/Hk4uLS4rC.png) |![Screenshot from 2024-06-10 11-38-26](https://hackmd.io/_uploads/H1XsFB4BC.png)|![Screenshot from 2024-06-10 13-46-34](https://hackmd.io/_uploads/SkupPvNrR.png)|![Screenshot from 2024-06-10 13-49-23](https://hackmd.io/_uploads/Hy78dvEBA.png)|![Screenshot from 2024-06-10 13-54-59](https://hackmd.io/_uploads/HJUiKDNBA.png)|![Screenshot from 2024-06-10 13-56-33](https://hackmd.io/_uploads/S1IW9P4rR.png)|![Screenshot from 2024-06-10 13-58-28](https://hackmd.io/_uploads/rkNFcvEB0.png)| |Histogram Area/perimeter$^2$ | ![Screenshot from 2024-06-10 11-25-11](https://hackmd.io/_uploads/rJLYUrNBA.png)|![Screenshot from 2024-06-10 11-38-42](https://hackmd.io/_uploads/BJDntS4H0.png)|![Screenshot from 2024-06-10 13-47-56](https://hackmd.io/_uploads/SyJZOw4r0.png)|![Screenshot from 2024-06-10 13-50-19](https://hackmd.io/_uploads/Hkzq_DVS0.png)|![Screenshot from 2024-06-10 13-55-18](https://hackmd.io/_uploads/S1w3FwNBA.png)|![Screenshot from 2024-06-10 13-56-54](https://hackmd.io/_uploads/Hy8f9vNrR.png)|![Screenshot from 2024-06-10 13-59-01](https://hackmd.io/_uploads/rkf99vNBA.png)| |Angle distribution | ![Screenshot from 2024-06-10 11-23-46](https://hackmd.io/_uploads/SkEEISEBR.png) |![Screenshot from 2024-06-10 11-39-07](https://hackmd.io/_uploads/Bk26FHVBR.png)|![Screenshot from 2024-06-10 13-52-50](https://hackmd.io/_uploads/Bk8mFwEHR.png)|![Screenshot from 2024-06-10 13-53-59](https://hackmd.io/_uploads/rygdFwESA.png)|![Screenshot from 2024-06-10 13-55-30](https://hackmd.io/_uploads/Hk7pKvVSC.png)|![Screenshot from 2024-06-10 13-57-12](https://hackmd.io/_uploads/Bkt75PNSC.png)|![Screenshot from 2024-06-10 13-59-12](https://hackmd.io/_uploads/ryms5PESC.png)| |Fraction freevol zero | $56\%$ | $2.1\%$ | $2.3\%$ |$2.4\%$ |$2.4\%$ |$2.5\%$|$2.4\%$| | Accuracy training |$0.99997317$ | | $0.99999998$ |$0.99999998$ | $0.99999992$ | | $1.0$ | #### Documentation trainingdata Below we discuss how the different trainingdata is called In the directory SIM_ML_gen_tri_MC_col_pol we have the results for many different ways of generating the triangles. Training data can be found in *OUTPUT_data_gen_tri* and is evaluated using *NN_voronoi_triangle_train_generate_tri.py* - everything with no bias is triangles generated using three random points and without any bias (with generaterandomtri.cc code in DATA_gen_Tri) - everything with bias 1 is triangles generated using three random points and with angle bias (with generaterandomtri_angle.cc code in DATA_gen_Tri, data is in folder called sigma_p_\*_beta_\*_angle) - everything with bias2 is triangles generated using three random points and with ratio bias (with generaterandomtri_ratio.cc code in DATA_gen_Tri, data is in folder called sigma_p_\*_beta_\*) - everything with bias 3 is triangles generated using a random angle and two random sides (with generaterandomtri_difmethod.cc code in DATA_gen_Tri, data is in folder called sigma_p_\*_beta_\*_difmethod). The bias coupled to this is how equilateral the triangles are. - everything with bias 4 is triangles generated using a random angle and two random sides and with the first 50 triangles having small angles (with older version of generaterandomtri_difmethod_small_tri.cc code in DATA_gen_Tri, data is in folder called sigma_p_\*_beta_\*_difmethod_smalltri). The bias coupled to this is how equilateral the triangles are. - everything with bias 5 is triangles generated using a random angle and two random sides with first 100 triangles have an angle smaller than 0.001 and 100 smaller than 0.1 (with older version of generaterandomtri_difmethod_small_tri.cc code in DATA_gen_Tri, data is in folder called sigma_p_\*_beta_\*_difmethod_more_smalltri). The bias coupled to this is how equilateral the triangles are. In the directory SIM_ML_gen_tri_more_par_MC_col_pol we have the results for simulations based on different number of input parameters for the NN. Training data can be found in *OUTPUT_data_gen_tri_more_par* and is evaluated using NN_voronoi_triangle_train_generate_tri_more_par.py. Data is always generated using - triangles generated using a random angle and two random sides with first 100 triangles have an angle smaller than 0.0001 and 100 smaller than 0.1 (with older version of generaterandomtri_difmethod_small_tri_more_par.cc code in DATA_gen_Tri, data is in folder called sigma_p_\*_beta_\*_difmethod_more_smalltri_more_par). The bias coupled to this is how equilateral the triangles are. We then train a NN on $250\cdot 2\cdot 10^4$ triangles and test on the same amount of triangles. ### Testing the ML method ::: warning I found out during running that sometimes particles get stuck. The reason turned out to be that if the angles between triangles is too small, the computed angle becomes nan (which leads to a very large freevolume). To solve this I build in that if the angle is smaller than some small cutoff, $\theta$ is set to zero. Red particles are particles that are frozen (Paircorrelation_eta_p_0.8_bias_4_beta_0.00) | Frozen particles | DeoWn pERIXLWA QIRH REINflwa| | -------- | -------- | |![Screenshot from 2024-06-24 09-46-54](https://hackmd.io/_uploads/BJ3cNi8LC.png)|![Screenshot from 2024-06-24 09-58-18](https://hackmd.io/_uploads/By9XDsLIA.png) | This gives rise to Voronoi cells with very small angles Moreover, sometimes the free volume predicted by the NN is larger than the total volume of the triangle. To solve that, we say if the predicted free volume is larger than $V_{tot} - \frac{\theta}{2\pi}\cdot\pi\frac{\sigma_c^2}{4}$, the new free area is in approximation given by $\text{ReLU}[V_{tot} - \frac{\theta}{2\pi}\cdot\pi\frac{\sigma_{cp}^2}{4}]$ ::: danger For the earlier simulations in SIM_ML_gen_tri_MC_col_pol we had not yet fixed this, so always be aware that these particles might be stuck!!! Only the folders with test\* in *SIM_ML_gen_tri_MC_col_pol* have the correct c-code that fixes this problem. This means that only bias 4 and bias 5 are ran correctly. :::success ::: ## Results ### How many body is the potential? *Found on my computer /home/rinskealkemade/Documents/ML_multibodyinteract/DATA_pol_col_id/Manybodyinteractions, code MC_manybodypotentials.cc, picture comes from MB_col_pol.nb* ::: warning Put on ODIN!!! ::: To measure the amount of n-body interactions we take 5 snapshots of each pahsepoint (the last snapshot of that run) and shoot randompoints at it. If it does not fall in one or more depletion zones (i.e. overlap with a particle or not in the neighbourhood of any particles), we reject it. Otherwise we count the amount of overlap zones, which tells us how many body that interaction is. Below we plot the result for multiple phaasepoints. ![Screenshot from 2024-07-17 14-41-41](https://hackmd.io/_uploads/S10G34HdA.png) We test this in the folder */home/rinskealkemade/Documents/ML_multibodyinteract/DATA_pol_col_id/Manybodyinteractions/Test*, where for a snapshot we can plot the points that we should together with the snapshot (see notebook MB_interactions.c).Below we show it for one snapshot where blue is 1 body, orange is two body, red is three body, green is fourbody and where in this case we look at a snapshot at $\eta_p = 5, \eta_c = 0.4, \sigma_p = 0.8$ ![Screenshot from 2024-07-16 12-00-28](https://hackmd.io/_uploads/BJYxhaXuR.png) Below we show the free volume percentage associated with those statepoitns (same folder, ML_multibodyinteract/DATA_pol_col_id/Manybodyinteractions, code *MC_manybodypotentials_freevol.cc* and notebook MC_col_pol_Freevol.nb). Data is averaged over 5 runs and 9 snapshots. The opaque areas indicated $1\sigma$. | Column 1 | $\sigma_P = 0.8$ | | -------- | -------- | | ![Screenshot from 2024-07-12 17-13-45](https://hackmd.io/_uploads/rysVd6CPA.png)| ![Screenshot from 2024-07-12 17-02-05](https://hackmd.io/_uploads/ryZYBpCP0.png)| https://stats.stackexchange.com/questions/86708/how-to-calculate-relative-error-when-the-true-value-is-zero/201864#201864 |MSE only | MSE and Arctan alternately | | -------- | -------- | |![Train_only_RMSE](https://hackmd.io/_uploads/ryJ1difuA.png) | ![train on arc and MSE](https://hackmd.io/_uploads/ry97dif_0.png) || ## New Results 11-09-2024 With new logmarithmic loss function we find for $\sigma_P = 0.8$ and various input parameters | area1_s1_1_s2_1_s3_1_per0_angle1_height1_ratio1|area1_s1_1_s2_1_s3_1_per0_angle1_height1_ratio1 |area1_s1_1_s2_1_s3_1_per0_angle1_height1_ratio0| | -------- | -------- | -------- | |![NN_densityplot_log_sigma_p0.8_area1_s1_1_s2_1_s3_1_per0_angle1_height1_ratio1](https://hackmd.io/_uploads/S17XygyTA.png) | ![NN_densityplot_log_sigma_p0.8_area1_s1_1_s2_1_s3_1_per1_angle1_height1_ratio1](https://hackmd.io/_uploads/H1R_kxkpC.png) |![NN_densityplot_log_sigma_p0.8_area1_s1_1_s2_1_s3_1_per0_angle1_height0_ratio1](https://hackmd.io/_uploads/rkUjJxJpA.png)| ### Coexistence pressure If we use free volume form the ML | $\eta_P = 6.5$ | $\eta_P = 7.0$| $\eta_P =9$ | | -------- | -------- | -------- | | Still running | ![image](https://hackmd.io/_uploads/BkZ92ZkTA.png)| ![Screenshot from 2024-09-11 14-29-54](https://hackmd.io/_uploads/HkEUaW1TR.png)| ![Screenshot from 2024-09-11 16-07-07](https://hackmd.io/_uploads/ByzVVQ1aC.png) If we use the analytical free volume | $\eta_P = 8$ | $\eta_P = 9$| | -------- | -------- | |![Screenshot from 2024-09-18 09-26-17](https://hackmd.io/_uploads/ry58-b_TC.png)|![Screenshot from 2024-09-18 09-28-17](https://hackmd.io/_uploads/HymUbbd6R.png)| ## NO ML | Free area fraction | paircor| | -------- | -------- | |![Screenshot from 2024-09-17 10-15-08](https://hackmd.io/_uploads/Byisc3Ia0.png)|![Screenshot from 2024-09-17 10-16-05](https://hackmd.io/_uploads/B1gxi3IpA.png)| ## What data is used? All red folders are not used in the end - [ ] ANALYSIS - [ ] code_EDMD: Contains the EDMD codes. The source codes from Frank and the adjusted long box code. - [ ] code_Free_volume_triangle - [ ] code_Free_volume_voronoi - [ ] code_Free_volume_voronoi_triangle: Contains the code to compute the free volume associated with a Voronoi triangle. It is not used in the end, but it contains the CHECK folder that checks this code. - [ ] code_Free_volume_WCA_voronoi_triangle - [ ] code_Insert_colloids - [ ] code_MC_col_pol_muVT: Code that generates the compare data for the ideal polymer-colloid system. In the end we only use the MC_harddisks_polymers_muVT_obtaincomparedata.cc code that generates the data in the folders *DATA_col_pol/sigma_\** ## Figures - [ ] Figure about how many MB interactions we have, can be found in the folder */home/users/5670772/ML_multibodyinteract/DATA_pol_col_id/Manybodyinteractions/*. We use the code *MC_manybodypotentials.cc* to analyze the data from the folders *DATA_col_pol/sigma_\**, we use $1000000$ points per snapshot. In total we use 45 snapshots per statepoint. - [ ] Figures pair correlation: Made with notebook *Paircorrelation_only_1_1_1_0.\*.nb* in folder */home/users/5670772/ML_multibodyinteract/SIM_ML_gen_tri_more_par_other_loss_MC_col_pol/*. Uses per statepoint 5 runs to compute the average pair correlation function. We only use the pair correlation run for the model trained on all input parameters. Data comes from the folders *SIM_ML_gen_tri_more_par_other_loss_MC_col_pol/sigma\** for the ML data and the folder *DATA_col_pol/sigma_\** for the real data.