Arjen's project

# Arjen's project ###### tags: `bachelors projects` # Code designs ## Design code generation of cubic and FCC crystals File name: Generate_crystal.c ### Global variables (simulation parameters) double eta0: initial packing fraction int N: the number of particles (such that $\sqrt[3]{N}$ is an integer in case of a cubic crystal and $\sqrt[3]{N}/2$ in case of an FCC crystal). double r[N][NDIM + 1]: coordinates of the particles (first 3 columns) and their diameter int N_pd: the number of particle displacements at each MC cycle that is accepted int N_vc: the number of volume changes at each MC cycle that is accepted. ### Functions #### void set_coord #### void initialize_cubic Algorithm: * Function first calculates the grid spacing (see [note on calculation grid spacing](#Appendix:-note-on-calculation-grid-spacing)). #### void initialize_fcc Algorithm: * Function first calculates the grid spacing (see [note on calculation grid spacing](#Appendix:-note-on-calculation-grid-spacing)). * Function then places particles in an FCC-grid (see [note on FCC crystal generation](#Appendix:-note-on-FCC-crystal-generation)) for details on the calculation of the particle coordinates, see "Appendix: note on FCC crystal generation". #### initialize_box_cubic #### initialize_box_fcc #### double calculate_volume_fraction #### void write_data ## Design code Monte Carlo simulations ### Global variables NDIM: number of dimensions MAX: max number of particles int n_particles: number of particles (initialized when crystal setup file is read) box[NDIM][2]: Array containing the coordinates of the box walls r: Array containing the x, y, z coordinates (1st, 2nd, 3th column) and diameter (4th column) of the different particles (rows). const int mc_cycles: number of mc cycles double q: the particle charge. ### Functions #### double applyPeriodicBC Function to apply periodic boundary conditions on the (new) particle coordinate. Input: * double x: the coordinate of the particle in one of the directions * double boxLength: the length of the box Returns: the coordinate after application of the B.C.'s. #### double calculateDistance Function that calculates the distance between particle 1 with coordinates r1 and particle 2 with coordinates r2. Input: * double r1[NDIM + 1]: coordinates of particle 1 * double r2[NDIM + 1]: coordinates of particle 2 Returns: the distance r_12 #### void setCoordinates Function to set the values of some destination coordinate array to the coordinates of the source array. Input: * r_dest[MAX][NDIM+1]: the destination array. * r_source[MAX][NDIM+1]: the source array. #### potential Function which calculate the Yukawa-potential, $U(r_{ij}) = \frac{V_0}{r_{ij}} \exp[-\kappa (r_{ij} - d)]$, for 2 particles i and j, separated by a distance r_ij. * double r_ij: the distance between particles i and j * double dia: the diameter of the particles Returns: the value of the potential #### double calculatePairEnergy Function to calculate the energy $U(r_{ij})$ between two coloid particles with $\mathbf{r}_i$ and $\mathbf{r}_j$ according to the truncated and shifted Yukawa potential. Algorithm: * Initialize variables: * * Calculate $r_{ij}$ by calling calculate_distance * return V0/r_ij*exp(-$\kappa$*r_ij) #### double calculateParticleEnergy Function to calculate the contribution of one particle $i$ to the total energy of the system. Input: * double coord[MAX][NDIM + 1]: coordinate array * double boxLength: the length of the box * int i: the index of the particle in the coordinate array Algorithm: * double energy = 0 * for $j \in [0, N_{part}]$: * if $j \neq i$: energy += calculatePairEnergy (i, j) * return energy #### double updateSystemEnergy Function to calculate the total energy of the system after a particle displacement. Algorithm is of order $O(n)$. Input: * double r_new[MAX][NDIM + 1]: coordinate array * double boxLength: the length of the box * double Us_old: the old energy of the system * double Up_old: the contribution to the energy of the particle np in the old configuration (before displacement). * int i: index of the displaced particle. Algorithm: * Substract contribution of particle in the old configuration from total system energy; double Us_new = Us_old - Up_old * Calculate contribution of particle in the new configuration; double Up_new = energy_particle(i) * Us_new += Up_new * return Us_new #### double calculateSystemEnergy Function to calculate the total energy of the system. Algorithm is of order $O(n^2)$. Input: * double coord[MAX][NDIM + 1]: coordinate array Returns: total energy of the system. #### int cycle_NVT Function which performs one MC cycle of $N_{particles}$ steps (particle displacements). Input: Returns: the number of succesful particle displacements in the MC cycle (N_pd). Algorithm: * int N_pd = 0 * Loop steps $[0, N_{particles})$ * N_pd += move_particle * Return N_pd #### double initialize_NVT Function to get the value for the maximum particle displacement $dR$ Input: * double coord[MAX][NDIM + 1]: coordinate array * double boxLength: the length of the box * double dR0: the initial value of the max particle displacement ($\delta R_0$) * double DAR: the desired acceptance rate * int N_ic: the number of initialization cycles Algorithm: * Declare/initialize variables * double acc = 0 (Acceptance rate of particle displacements) * double dR = dR0 (Set particle displacement to given initial value) * Loop cycles $[1, N_{ic}]$ * acc += cycle_NVT (Note: at this point $acc$ is not the actual acceptance rate, but rather the sum of succesful particle displacements in the last 100 cycles) * Every 100 steps: update $\delta R$ * Calculate (the actual) acceptance rate: $acc = acc/(100 \ n_{part})$ * if $acc > DAR$: dR *= 1.05 * else if $acc < DAR$: dR *= 0.95 * if $dR > 5$: set $d R = 0.5$ and break * set value dR_max to dR #### read_coord Function that reads the initial coordinates of the crystal particles from a .dat-file, with in the first line the number of particles, the min and max coordinates of the box in lines 2-4 and the coordinates in all lines 5-end. #### int main Main function of program Algorithm: * Read crystal configuration data * In case NPT: open file to write volumes to * Initialization * NVT: tune $dR$ * NPT: tune $dR$ and $dV$ * Perform $N_{cyc}$ MC-cycles (loop) * NPT: save avg volume to file after every 100 cycles * In case NVT (optional): write new coordinates to file * In case NPT: close volume file ## Design code Brownian dynamics ### Global variables double visc: viscosity of the solvent ($\eta$) double d: diameter of the colloid double fric: the friction coefficient ($\gamma$) double dt: timestep double simT: the total simulation time int Npart: the number of particles in the system double coordinates[Npart][NDIM]: array with the present coordinates of the particles. double MSD[Niter]: array with the mean square displacement (MSD) values. double SDOM[Niter]: array with the SDOM of the MSD values. ### Functions #### double gaussian Function to generate a Gaussian number according to the Marsaglia polar method (source: https://www.projectrhea.org/rhea/index.php/The_principles_for_how_to_generate_random_samples_from_a_Gaussian_distribution) #### double applyPeriodicBC Function to apply boundary condition on the coodinate of a particle in one of the directions (see Notes>Brownian dynamics>Boundary conditions) Input: * double x: the coordinate of the particle without application boundary conditions ($x'$) Returns: coordinate after application of boundary conditions Algorithm: * if $x'$ > L: $x_{new} = x' - L*(int)(\frac{x'}{L})$ * else: $x_{new} = L [1 + (int)(-\frac{x'}{L})] + x'$ * Return $x_{new}$ #### double nearestImage Function to apply nearest image convention on some distance dx ($\delta x$, $\delta y$ or $\delta z$) between particles. Input: * double dx: the distance between two particles in a particular direction. Returns: the distance after nearest image convention has been applied. #### double* calculateDistanceVector Function to calculate the distance vector $\mathbf{r}_{12} = [x_2 - x_1, y_2 - y_1, z_2 - z_1]^T$ between two particles with $\mathbf{r}_1$ and $\mathbf{r}_2$. Input: * struct vector r1: the first vector * struct vector r2: the second vector Returns: the pointer to the distance vector $\mathbf{r}_{12}$ #### double* calculateInteractionForce Function which calculates the interaction force $\mathbf{F}_{ij}$ exerted by a particle at $\mathbf{r}_i$ on a particle at $\mathbf{r}_j$, according to the Yukawa-potential; $\mathbf{F} = \frac{V_0}{r^2} \exp\{-\kappa r\} (\frac{1}{r} + \kappa) [x_i - x_j, y_i - y_j, z_i - z_j]^T$. If $r_{ij}$ is larger than the cutoff radius, returns $\mathbf{F}_{ij} = [0, 0, 0]^T$ Input: * double r_i[NDIM] * double r_j[NDIM] Returns: the interaction force on particle $i$ exerted by particle $j$. #### double* calculateTotalInteractionForce Function to calculate the interaction force of all particles $j \neq i$ on particle $i$; $\mathbf{F}_{i}^{int} = \sum_{j \neq i} \mathbf{F}_{ji}$. Algorithm: $O(n)$. input: * double coord[MAX][NDIM]: coordinate array . * int ii: index of the particle Returns: pointer to array with $\mathbf{F}_{i}^{int}$ components. Algorithm: * Initialize * static double F_int[NDIM]: array in which components total interaction force stored. * double* F_ji: pointer to the array of interaction force components of a particle $j \neq i$ on particle $i$ * for $j \in [0, Npart)$: * if $j \neq i$: * Calculate $F_{ji}$ * Add components in each direction to total force on particle * Return: F_int #### double* EulerMaruyama Function which performs the integration step of one particle according to the Euler-Maruyama integration scheme. input: * double r_old[NDIM]: the old coordinates of the particle * double* F_int: pointer to the array with interaction force components on the particle. Returns: pointer to the displacement vector, $\mathbf{\delta r} = [\delta x, \delta y, \delta z]^T$, of a particle. Algorithm: * Initialize: * static double dr[NDIM] * for $k \in [0, dim)$ * dr[k] = sqrt(2D*dt)*GenGaussian(t) + F_int[k]/fric * return dr #### void calculateMSD Function which sets the values of the pointers MSD and SDOM. Input: * double SD[Npart]: the displacement values of the particles * int step: the current step in the simulation Algorithm: * Initialize * double msd = 0: mean square displacement * double sdom = 0: SDOM in mean square displacment. * double variance = 0: the variance in the MSD * Calculate SD of each particle and MSD * for $i \in N_{part}$: * ~~SD[i] = dr[i][0]*dr[i][0] + dr[i][1]*dr[i][1] + dr[i][2]*dr[i][2]~~ * msd += sd[i] * Calculate MSD: msd = msd/Npart. * MSD[step] = msd * Calculate the SDOM of the mean square displacement. * for $i \in N_{part}$: * variance += (sd[i] - msd)*(sd[i] - msd) * variance = variance/(Npart*(Npart - 1)) * sdom = sqrt(variance) * SDOM[step] = sdom #### void simStep Function which performs one step of the simulation by performing Euler Maruyama on every particle. Input: * int step: the current step in the simulation Algorithm: * Initialize: * double* F_int: pointer to the total interaction force on a particle $i$. * double* dr_ptr: pointer to the array with displacement vector of a particle. * double disp[Npart]; * for $i \in N_{part}$: * Calculate $\mathbf{F}_{i}^{int}$ * Peform Euler-Maruyama on particle $i$: dr_ptr = EulerMaruyama * for $j \in [0, dim): * disp[i] += dr[j]*dr[j] * coordinates[i][j] = applyPeriodicBC(coordinates[i][j] + dr[j]) * Update MSD: calculateMSD(disp, step) #### void readCoord Function which loads the coordinate data from a setup file to the coordinates array. Input: * struct vector coord[Npart]: array with the coordinates vectors of all the particles. * char setupFileName[]: the name of the data file #### void writeMSD Function to write the mean square displacement values and its associated SDOM to a file Input: * char fileName #### int main Main function of program Algorithm: * Initialize array with vectors particle coordinates, vector coord[Npart] and set the coordinates using the readCoord function. * for $np < Ncycles: * * coord[np] = EulerMaruyama(coord[np]) ## Design code GNN In this section the code for the GNN used to learn the forces from the particle coordinates is derived. We start by a very basic description of what the code should do. ### General description The code loads data from a folder with simulation data files containing the coordinates of all particles in the system. Every file corresponds to some time instance during the simulation time. The coordinates are put into a graph, which contains 1) structural information and 2) physical information. Structural information is only used by the network to know which particles are neighbours. The physical data is used in the NN calculations to make predictions. Graph contains/consists of: * Node feature matrix: the forces(physical) * Edge feature matrix: the distanes between the particles (physical) * Node position matrix (physical) * The number of nodes (structural) * Edge indices: matrix with indices particle which are connected by an edge (structural) GNN: updates physical values nodes and edges multiple times: graph layers. In every graph layer: * Update edge features (distances): edge function: * input: particle pair coordinates * output: estimate of 2-body force * Update node features: node function: * input: sum of 2-body forces, particle coordinate * output: estimated total force/particle velocity Algorithm: for each snapshot (file) of system: * Construct graph: * Get particles coordinates: import file data * Calculate distances * create edges for distances $< r_{cut}$ * Calculate all forces for some particle to use for testing???? * Training * Apply graph layers * Test edge and node functions: ?? * Edge function: compare prediction with Yukawa-Force * Node function: compare prediction with Yukawa force and drag force * Go to next snapshot. * Iterate over particles in system * Get coordinates of particle * Get all connections to other particles for this particle (egdes) * Iterate over all edges of particle, use edge function: * input: coordinate particle + distance to some other particle (edge features) * output: average interaction force between particle and other particle, and number of edges * Input coordinates of particle, avg interaction force and number of edges into node function to get velocity particle. * Calculate MSE velocity particle with respect to its ground-truth value. ### Global variables ### Classes #### EdgeFunc Class which describes a the Edge Function NN. Child class of torch.nn.Module ##### Constructor Constructor which sets the sequence of NN modules. Contains input layer with 3 nodes, 2 hidden layers with 300 nodes and output layer with 3 nodes (based on ActiveNet). ##### forward Function which describes a forward pass. Input: $dr$ (radial distance) Output: Force $\mathbf{F_{ji}$ corresponding to the edge $e_{ji}$. Note that direction is taken into account, since graph does not distinguish between directed and undirected edges. #### ForceSumLayer Class which descibes the message passing, i.e. summing the forces learned by the Edge Function NN and calculating the magnitude. Class is a child class of torch_geometric.nn.MessagePassing. ##### forward Method which sets the node features to the sum of all the interaction forces. Calls the self.propagate and self.update methods. ##### message Method which describes the message in the message passing scheme, in this case the distance $|\mathbf{x}_j - \mathbf{x}_i|$. Constructs messages from node $j$ to node $i$ in analogy to for each edge in edge_index. This function can take any argument as input which was initially passed to propagate(). #### GNN Class which describes the GNN. Child class of torch.nn.Module. ##### Constructor Constuctor method sets the layer (ForceSumLayer) ##### forward Function which describes a forward pass. #### trainData A class which contains the data of the GNN training procedure. ##### Constructor In the constructor, you need to set the number of epochs. ##### trainStop Method to call at the end of a simulation: select only the elements $n$ in the data arrays for which $n < N_{stop}$, where $N_{stop}$ is the number of training epochs and set $N_{epochs} = N_{stop}$. Input: Nstop ##### Instance variables * Nepochs * mse (array of float): array containing the MSE values of the predicted forces with respect to the ground-truth value * cod_train (array of float): array containing the COD values of the model applied on the training data. * cod_test (array of float): array containing the COD values of the model applied on the test data. ### Functions #### BatchTrainGNN Function to train the GNN using batch training. Algorithm: - Get train loader and test data - Initialize a trainData object. - Initialize temporary variables: - cod_test = -inf (best COD value of model on test values) - best_weights = None - for epoch < Nepochs: - for batch in training batches: - Calculate prediction of model on train data - Back propagation - Calculate prediction model on test data - Calculate MSE model prediction and test data and store - Calculate the COD values for both train and test data and store in trainData object - (update model parameters) - if COD train value > old COD: - Set best_weights - if # Notes ## Generation crystals ### Calculation grid spacing #### Cubic crystal Consider unit cell of a cubic crystal ![](https://hackmd.io/_uploads/Bkff7mU63.png) Cell contains 4 quarters (2D) 8 quarters (3D) of a particle. Now given the packing fraction $\eta$ (global variable) we can calculate the grid spacing $l$ in the following way $\eta = \frac{V_{part}}{V_{cell}} = \frac{V_{part}}{l^{dim}} \qquad l = \sqrt[dim]{\frac{V_{part}}{\eta}}$. ### Algorithm FCC crystal generation See sketch ![](https://hackmd.io/_uploads/ByHkxFlCh.png) Number particle positions by nx, ny, nz. Sketch is example of 2x2x2 crystal, the left front angle of the lowest cube being at (nx, ny, nz) = (-2, -2, -2). If we move along z-direction we see that 2D pattern of particles depend on whether nz is even or odd: ![](https://hackmd.io/_uploads/HymVQ8kCn.png) Within these two patterns we see that the x-coordinates of the particles depends on the whether ny is odd or even. For case I) ![](https://hackmd.io/_uploads/Hy9Um81Ah.png) and for case II) ![](https://hackmd.io/_uploads/Hk4vQIkCn.png) The general procedure of placing the particles can thus be summarized in the following scheme One problem with this scheme is that it only excepts half the particles, thus $N \neq N_x N_y N_z$, but $N = \frac{1}{2} N_x N_y N_z$. ## Monte Carlo simulations ### NVT Test simulation NVT by checking that at phase transition at $\eta \approx 0.5$. Run 2 simulations with $N_{particles} = 500$, $N_{ic} = 50,000$ and $\delta R = 0.1$ at start, the first one with $\eta = 0.4$ and the second one at $\eta =0.6$. Initialization $\eta = 0.4 \rightarrow \delta R = 0.153165$ $\eta = 0.6 \rightarrow \delta R = 0.053484$ After 100,000 cycles: $\eta = 0.4$: ![](https://hackmd.io/_uploads/rky43ImR2.png) $\eta = 0.6$: ![](https://hackmd.io/_uploads/Sk1I2IXA3.png) ### NPT To test the code for the NPT-ensemble, we tried to reproduce the Carnahan-Starling equation. First rewrite CS equation of state in terms of dimensionless quantities: For the crystal phase, the CS equation can be approximated as All simulations use $N_{particles} = 500$. Initialization: $dR_0 = 0.1$, $dV_0 = 0.4$, $DARR = 0.3$, $DARV = 0.09$, $N_{mc} = 500,000$ and $N_{ic} = 100,000$. From the simulations we get $\tilde{V}$ and we obtain $\eta$ via $\eta = \frac{\pi \tilde{\rho}}{6} = \frac{\pi N}{6 \tilde{V}}$. For the crystal phase we use an FCC crystal with $\eta_0 = 0.7$. The fluid phase is generated by running a FCC simulation at $\tilde{P} = 0.1$. Snapshot: ![](https://hackmd.io/_uploads/rkE62sWy6.png) The values of the volumes at the different pressures were saved to seperate files. In the plot_eta_P Python script, we calculate the average value of $\eta$ and we plot ($\eta$, $\tilde{P}$). For the fit curves we used $\eta$ for the fluid phase and $\eta$ for the crystal phase. Result: ![](https://hackmd.io/_uploads/Hy_tNkvy6.png) ### Yukawa particles Test energy functions Yukawa particles. 1. Test energy functions for the following setup: ![](https://hackmd.io/_uploads/Hk13stx16.png) 2. Test potential function by plotting it for range of $r$-values. #### Test calculatePairEnergy Test function for particles 1 and 2: ![](https://hackmd.io/_uploads/ByKAjKeJT.png) Code: $E_{12} = 8.988011$. Correct. #### Test calculateSystemEnergy ![](https://hackmd.io/_uploads/SyRg3Kl1T.png) Code: $E = 12.581735$. Correct. #### Test calculateParticleEnergy Code: $E = 10.133912$. Correct. #### Test updateSystemEnergy For the new configuration we change $\mathbf{r_3}$ to $\mathbf{r_3} = [1, 3, 1]^T$. ![](https://hackmd.io/_uploads/Bkby6txkp.jpg) Code: $E = 10.133912$ #### Plot potential ![](https://hackmd.io/_uploads/H1ZTgiZ1T.png) Seems correct. Now try to incorporate cutoff potential. Use $r_{cut} = 3$ ![](https://hackmd.io/_uploads/H1S_PsZ16.png). Seems correct. ![](https://hackmd.io/_uploads/ByWx7ibxa.png) #### Test move_particle First simple test: check whether energy correctly calculated by comparing with result calculateSystemEnergy after a large number of steps. Compare values after 50,000 initialization steps and 500 MC steps. Use FCC with $N_{part} = 108$, $\beta = 2$ and $\eta = 0.7$. Result: $\delta R = 0.011267$ The energies are equal and have value $U_{sys} = 129119.965613$ ## Brownian dynamics ### Boundary conditions Problem: in BD simulations you don't control max displacement value, hence situations in which $\delta x > L$ might occur. Easy solution would be modulus $x_{new} = x_{old}\%L$, but this is computationally slow. Modulus can be replaced by following: $C = A\%B \quad \rightarrow \quad C = A - B*(int)(A/B)$ Problem: how to handle boundary at $x = 0$ if $\delta x > L$? Example: $L = 4$, $x_0 = 1$ and $\delta x = -10$. Desired result $x_1 = 3$. Let $x'$ be the value of $x$ before application of boundary conditions $x' = -9$. To calculate $x_1$ you take $-x'$ and take the modulus of it. Then you substract modulus from L: $x_1 = L - (-x'\%L)$ Plugging in $C = A\%B \quad \rightarrow \quad C = A - B*(int)(A/B)$, we get $x_1 = L - (-x'-L*(int)(-x'/L)) = L [1 + (int)(-x'/L)] + x'$ **Test boundary conditions** After 10 iterations, FCC $\eta =0.7$, $N_{part} = 500$ and $L = 7.204827$. ![](https://hackmd.io/_uploads/H1ud4tK16.png) ### Calculating interaction force $V(r) = \frac{V_0}{r} \exp(-\kappa r) - U(r_{cut})$ $U(r) = q \frac{V_0}{r} \exp(-\kappa r)$ $r = \sqrt{x^2 + y^2 + z^2}$ $\mathbf{F}(x, y, z) = -\nabla U = \frac{V_0}{r^2} \exp(-\kappa r) (\frac{1}{r} + \kappa)[x, y, z]^T$ Test: $y = z = 0$ $F(x) = -\nabla U = \frac{V_0}{x^2} \exp(-\kappa x) (1 + \kappa x)$ Paul Moses: $F(r) = -\nabla U = \frac{V_0}{r^2} \exp(-\kappa r) (1 + \kappa r)$ $U^{F} = - \frac{V_0}{r_c^2} e^{-\kappa r_c} [1 + \kappa r_c] r + C$ $C = \frac{V_0}{r_c} e^{-\kappa r_c} [1 + \kappa r_c]$ ### Test gaussian() Test whether Gaussian numbers generated correctly by writing 10,000 generated numbers, generated with a distribution mean = 0 and std = 1, to a file and by calculating the mean and standard deviation afterwards. Result: mean = 0.018777924399999996 std = 1.0054006935538888 ### Test calculateDistanceVector() Test using same setup as in section Yukawa particles: ![](https://hackmd.io/_uploads/Hk_ngUu1T.png) Code: $r_{12} = [1, 1, -1]$. Correct. ### Test calculateInteractionForce() Same setup. Calculate force of particle 2 on particle 1; $r_{21} = [-1, -1, 1]$. Use $V_0 = 100$, $\kappa = 0.5$. Code: $\mathbf{F_{21}} = 11.526718[1,1,-1]$. Correct. ### Test calculateTotalInteractionForce() Same setup. Calculate total interaction force of particles 2 and 3 on particle 1. Code: $\mathbf{F} = [10.657286, 19.553053, -6.209403]$. Correct. ### Test calculateMSD() Test MSD for SD = [44, 8, 21]. Results: MSD = 24.333333 SDOM = 10.525102 Correct. ### Test displacements EulerMaruyama() Test EulerMaruyama() by saving the displacements of each particle to a file. Displacements should have shape of some distribution. Save for $N_{iter} = 10$ steps, $dt = 0.0001$ and setup "mc_fluid_NP=500.dat". Use 60 bins. Result: ![](https://hackmd.io/_uploads/BJ1R92Zg6.png) ### Test simulation one particle Test simulation for one particle: $L = 4$, $\mathbf{r}_0 = [2, 2, 2]^T$ and $N_{iter} = 100$. Note: you can't calculate MSD/SDOM since SDOM requires division by $N_{part} - 1 = 0$ Seems fine, particle just "wiggles" around its starting position. ### Test simulation simulation without interaction forces As a first test, we would like to measure the value of the mean square displacement for on particle (no interaction forces). To get an accurate result we want to average over many particles. To achieve this, we run the simulation with an initial FCC crystal $\eta = 0.7$, $N_{particles} = 500$, but we set the interaction force to zero, such that the particles behave as if no other particles in the system. Furthermore: $D = 1$, $dt = 0.001$, $N_{iter} = 10000$ and results are saved every 100 steps. From theory, we know that $<R^2(t)> = 2dDt$, in which $d$ is the number of dimensions. Comparing the results of the simulations with theory: ![](https://hackmd.io/_uploads/rkup6hA1p.png) ### Test simulation two particles Test simulation for two particles with interaction forces $V_0 = 100$ and $\kappa = 0.5$, $r_{cut} = 10$. $L = 10$, $\mathbf{r}^1_0 = [3, 5, 5]^T$, $\mathbf{r}^1_0 = [7, 5, 5]^T$. Test whether energies calculated correctly at first and final steps. Hand calculations: ![](https://hackmd.io/_uploads/B1W_ApRy6.jpg) Code: $E(step = 1) = 11.062471$ $E(step = 100) = 0.087936$ Correct. ### Adjust potential and force Adjust potential and force such that the force is 0 at $r_{cut}$. **Note that this derivation holds for POINT Yukawa particles.** Compare code output potential and force functions, in MC and BD simulations respectively, with expressions Rinske: ![](https://hackmd.io/_uploads/HkhGuh-lT.png) We see $F = -F_{Rinske}$. If force negative, however, all particles would move towards same point. Wikipedia does agree however. ### Debug Monte Carlo script Problem: ![](https://hackmd.io/_uploads/ryovxeQg6.png) **Particles can get out of box!** Settings used described in section "Compare energies Monte Carlo and Brownian Dynamics" Boundary conditions not applied in move_particle() Now particles don't seem to leave the box. To check whether it reproduces the right result, mimmick the hard sphere potential by $V = 10^6$ if $r_{ij} < \sigma$ $V = 0$ otherwise. and check whether it reproduces hard sphere results. For hard spheres: phase transition at $\eta = 0.5$ (see section NVT, also for simulation parameters) For $\eta = 0.4$ ![](https://hackmd.io/_uploads/Sy67fbEg6.png) For $\eta = 0.6$ ![](https://hackmd.io/_uploads/r1yvG-4e6.png) ### Compare energies Monte Carlo and Brownian Dynamics Now we want to compare energy Monte Carlo and Brownian Dynamics simulations. The particles will move towards a configuration in which the energy is minimum. If we take the volume too large, the energy will tend to 0, which doesn't give a nice comparison. Initial configuration: an FCC crystal with $N_{particles} = 1,000$, $\sigma = 1$, but with $L = 10$ (MC_BD_comparison_setup_v2.dat). For both simulations: $V_0 = 100$ and $\kappa = 0.5$, $r_{cut} = 3$, $N_{initialization} = 100,000$, $N_{steps} = 100,000$, save energy values every $100$ steps and coordinates every 1,000 steps. Brownian: $D = 1$, $\gamma = 1$, $dt = 0.0001$ Monte Carlo: $\beta = 1$, desired $acc = 0.3$ ![ComparisonSystemEnergy_MC_BD](https://hackmd.io/_uploads/S1eWUg1wp.png) MC: $<E/k_B T> = 164.5354$ with $\sigma_{<E/k_B T>} = 0.001278$ BD: $<E/k_B T> = 164.5535$ with $\sigma_{<E/k_B T>} = 0.001288$ $2 \sigma$-criterion: $\sigma = \sqrt{0.001278^2 + 0.001288^2} = 0.00181445$ $|E_{ref} - E_{measured}| = |E_{MC} - E_{BD}| = 0.0181$ Difference is significant. ### Build NN regression Build NN to fit to function $f(x) = 2x^3 + 5$. The test an training data are generated by applying f on a range of x-values $x \in [-2, 2]$, $dx = 0.01$ and by adding Gaussian noise ($\mu = 0$, $\sigma = 1$) to the values. Numpy seed is $1997$. From the data $0.75\%$ is used for training, the rest for testing. ![](https://hackmd.io/_uploads/S1GNTrBWp.png) We'll use pyramid shaped NN with 4 hidden layers (16, 8, 4, 2 nodes). Batch training: $N_{epochs} = 1,000$; $B = 100$ (batch size), $lr = 0.01$ (learning rate) Plot of the MSE during training. ![](https://hackmd.io/_uploads/S1mB6HH-T.png) Test fit on all data ![](https://hackmd.io/_uploads/SJfNJIHZT.png) Problem: sometimes when you run the algorithm, you get straight line, instead of nice fit and no MSE values. Example ![](https://hackmd.io/_uploads/Hkq7GDBWT.png) ![](https://hackmd.io/_uploads/Hyk4MPrZ6.png) Probably some PyTorch function uses random numbers. Try to fix torch seed. Set torch seed to $1997$. Seems to resolve the issue. ## Build GNN Message passing in GNN can be described as $\mathbf{x}_i^k = \gamma^k(\mathbf{x}_i^{k-1}, \Theta_{j \in N(i)} \phi^k (\mathbf{x}_i^{k-1}, \mathbf{x}_j^{k-1}, e_{j,i}))$ with $\gamma^k$ the update function (differentiable functions such as MLP), at message passing step $k$ and $e_{j, i}$ a matrix with edge features. $\Theta_{j \in N(i)} \phi^k$ is the aggregate function, where $\Theta_{j \in N(i)}$ denotes a differentiable, permutation invariant function, e.g., sum, mean or max and $\phi^k$ some differentiable function (MLP). This thesis: only one message passing step: $\mathbf{x}_i^1 = \sum_{j \in N(i)} \phi^1 (e_{j,i}))$ where $\mathbf{x}_i$ corresponds to the force value $\sum_{j \in N(i)} F_{ji}$, $\phi^1$ is a MLP (the Edge Function) which calculates the interaction force $F_{ji}$ given the adjacency matrix $e_{j, i}$. ## Test on simple test system In this test, we check whether the data is correctly converted to a graph by applying the GNN to a simple test system of $3$ particles. File: "test_network.dat" ### Pen and paper Test GNN for 2D system, consisting of 3 particles: $\mathbf{r}_1 = [0, 2]^T$ $\mathbf{r}_2 = [3, 1]^T$ $\mathbf{r}_3 = [1, 1]^T$ in a box with L = 4 and cut-off radius $r_c = 3$. The inter-particle distance vectors: $\mathbf{r}_{12} = [-1, -1]^T \qquad |\mathbf{r}_{12}| = \sqrt{2}$ $\mathbf{r}_{21} = [1, 1]^T \qquad |\mathbf{r}_{21}| = \sqrt{2}$ $\mathbf{r}_{13} = [1, -1]^T \qquad |\mathbf{r}_{13}| = \sqrt{2}$ $\mathbf{r}_{31} = [-1, 1]^T \qquad |\mathbf{r}_{31}| = \sqrt{2}$ $\mathbf{r}_{23} = [-2, 0]^T \qquad |\mathbf{r}_{23}| = 2$ $\mathbf{r}_{32} = [2, 0]^T \qquad |\mathbf{r}_{32}| = 2$ The interaction forces: $\mathbf{F}_{12} = -25.376638[1, 1]^T \qquad |\mathbf{F}_{12}| = 35.887985$ $\mathbf{F}_{21} = 25.376638[1, 1]^T \qquad |\mathbf{F}_{21}| = 35.887985$ $\mathbf{F}_{13} = 25.376638[1, -1]^T \qquad |\mathbf{F}_{13}| = 35.887985$ $\mathbf{F}_{31} = 25.376638[-1, 1]^T \qquad |\mathbf{F}_{31}| = 35.887985$ $\mathbf{F}_{23} = 12.195912[-1, 0]^T \qquad |\mathbf{F}_{23}| = 12.195912$ $\mathbf{F}_{32} = 12.195912[1, 0]^T \qquad |\mathbf{F}_{23}| = 12.195912$ The total interaction force on each particle: $\mathbf{F}_1^{int} = 50.753276 [0, 1]^T \qquad |\mathbf{F}_1^{int}| = 50.753276$ $\mathbf{F}_2^{int} = [-13.180726, -25.376638]^T \qquad |\mathbf{F}_2^{int}| = 28.595546$ $\mathbf{F}_1^{int} = [13.180726, -25.376638]^T \qquad |\mathbf{F}_3^{int}| = 28.595546$ ### Test import and conversion of data Test whether data is correctly imported and converted to a graph by plotting the graph and checking the edge values. Plot of graph: ![](https://hackmd.io/_uploads/r1Bw_luMT.png) Note that the edges don't necessarily represent the actual distance to the graph, since in this plot the nearest image convention is not taken into account. Coordinates are correct. Since no inter-particle distance is larger than $r_c$ all nodes should be connected: seems correct as well. The edge indices: [1, 2], [1, 3], [2, 1], [2, 3], [3, 1], [3, 2] The edge values: [ 1., 1.], [-1., 1.], [-1., -1.], [ 2., 0.], [ 1., -1.], [-2., 0.] Edge values have indices flipped with regards to their values, for example $[1, 1]^T$ (value) corresponds to $[2, 1]$ (indices) (and not $[1, 2]$ ) Fix by doing $dX = X^T - X$ instead of $dX = X - X^T$ . The edge indices: [1, 2], [1, 3], [2, 1], [2, 3], [3, 1], [3, 2] The edge values: [-1., -1.], [ 1., -1.], [ 1., 1.], [-2., 0.], [-1., 1.], [ 2., 0.] ### Test message passing Print distances and distance unit vectors in ForceSumLayer.message(). Distances: $[r_{12}, r_{13}, r_{21}, r_{23}, r_{31}, r_{32},] = [1.4142, 1.4142, 1.4142, 2.0000, 1.4142, 2.0000]$ is correct. Distance unit vectors are correct as well. ## Test EdgeFunc (separately) For training purposes, use NN with 2 layers with $100$ nodes. For the data, we take an array $r \in [0.5, r_c]$, $dr = 0.005$, and we calculate the ground-truth interaction force components. We generate a noise array of the same shape with ($\mu = 0$, $\sigma = 5$) and add it to the force. We do a train-test split of $50\% - 50\%$. As loss function, we take the added mean square errors $\sigma_x^2 + \sigma_y^2 + \sigma_z^2$. Batch training: $N_{epochs} = 1,000$; $B = 100$ (batch size), $lr = 0.01$ (learning rate) We set both the numpy and torch seeds to $1997$. Result: ![](https://hackmd.io/_uploads/Hy9ErKRbp.png) ![](https://hackmd.io/_uploads/rkZBHYCb6.png) Model doesn't work. Try Pytorch seed $1994$ Try different pytorch seed $1994$: ![](https://hackmd.io/_uploads/BkX0rY0ZT.png) ![](https://hackmd.io/_uploads/HkaRrtA-a.png) Model works. Note, however, that it only works if we use ReLu activation function in the Edge Function. ## Test Complete GNN To test the full GNN, we apply it to the simulation data for which $\eta = 0.10$. ### Test two graphs As a first test we try to learn the forces using two graphs; one for testing ("eta_0.100000_snapshot_00.dat") and one for training ("eta_0.100000_snapshot_01.dat"). Subsequently apply trained model to $r \in [0.5, 3)$, $\delta r = 0.005$. We use an edge function with 2 hidden layers of each $100$ nodes | Parameter | Value | | -------- | -------- | | $N_{epochs}$ | 100 | | $lr$ | 0.01 | | Torch seed | 1994 | | Nodes Edge Function | 100 | ![](https://hackmd.io/_uploads/r1JPotaG6.png) ![](https://hackmd.io/_uploads/rkqwiYpGT.png) Zoom in on range $r > 1.5$: ![](https://hackmd.io/_uploads/r1HyTYaMT.png) Converges fairly wel at higher $r$. At lower $r$ model fails, probably because particles don't approach one another closely at low $\eta$ (lack of data for low distance values). Try whether increasing number of epochs improves prediction at high $r$. | Parameter | Value | | -------- | -------- | | $N_{epochs}$ | 200 | | $lr$ | 0.01 | | Torch seed | 1994 | | Nodes Edge Function | 100 | ![](https://hackmd.io/_uploads/HyXfAt6Gp.png) ![](https://hackmd.io/_uploads/HyKB0YpGp.png) Zoom in: ![](https://hackmd.io/_uploads/SJ45Atpzp.png) Try adding more nodes to hidden layers Edge Function | Parameter | Value | | -------- | -------- | | $N_{epochs}$ | 5 | | $lr$ | 0.01 | | Torch seed | 1994 | | Nodes Edge Function | 300 | ![](https://hackmd.io/_uploads/BkdHwcaMT.png) ![](https://hackmd.io/_uploads/S1bvP5pfp.png) ![](https://hackmd.io/_uploads/S1r5vcpM6.png) Mininum MSE is $0.40121990442276$. ### Test on 50 graphs Test on all graphs $\eta = 0.50$ | Parameter | Value | | -------- | -------- | | $N_{epochs}$ | 5 | | $lr$ | 0.1 | | Torch seed | 1994 | | Nodes Edge Function | 300 | ![](https://hackmd.io/_uploads/ryGvoBCMT.png) ![](https://hackmd.io/_uploads/B13dir0za.png) ### Test for files without forces First test whether velocity calculation correctly by given $r(t)$ for which we know $v(t)$ analytically. Choose: $\mathbf{r}(t) = \mathbf{r_0} + [2 \cos(t), 2 \sin(t), t]$ $\frac{d \mathbf{r}}{dt} = [-2 \sin(t), 2 \cos(t), 1]$ $t \in [0, 0.1)$, $\delta t = 0.01$ Two particles: $\mathbf{r}_{1, 0} = [1, 0, 0]^T$ $\mathbf{r}_{2, 0} = [-1, 0, 0]^T$ (Note that these are not the initial coordinates of the particles because $\cos(0) = 1$, they're are a set off such that we can distinguish between particles 1 and 2). The ground truth velocities: $t = 0.00 \qquad [-0.0000, 2.0000, 1.0000]$ $t = 0.01 \qquad [-0.0200, 1.9999, 1.0000]$ $t = 0.02 \qquad [-0.0400, 1.9996, 1.0000]$ $t = 0.03 \qquad [-0.0600, 1.9991, 1.0000]$ $t = 0.04 \qquad [-0.0800, 1.9984, 1.0000]$ $t = 0.05 \qquad [-0.1000, 1.9975, 1.0000]$ $t = 0.06 \qquad [-0.1199, 1.9964, 1.0000]$ $t = 0.07 \qquad [-0.1399, 1.9951, 1.0000]$ $t = 0.08 \qquad [-0.1598, 1.9936, 1.0000]$ $t = 0.09 \qquad [-0.1798, 1.9919, 1.0000]$ Velocities calculated correctly. ## Improve GNN GNN should stop training when accuracy of model on train data becomes larger than accuracy model on test data. Accuracy measured in terms of coefficient of determination (COD). Calculation of COD implemented in calculateCOD(). Consider simple test system $\mathbf{F}_1^{int} = 50.753276 [0, 1]^T \qquad |\mathbf{F}_1^{int}| = 50.753276$ $\mathbf{F}_2^{int} = [-13.180726, -25.376638]^T \qquad |\mathbf{F}_2^{int}| = 28.595546$ $\mathbf{F}_3^{int} = [13.180726, -25.376638]^T \qquad |\mathbf{F}_3^{int}| = 28.595546$ The average forces in the $x$ and $y$ directions are $<\mathbf{F}> = [0, 0]^T$ For the MSE of the true with respect to their mean (denominator), we get $[347.42, 3863.85]$ Let's say you have a MSE of the prediction values with respect to the measurements of $3000$. Then we have $r^2 = 0.2876$ ## Generate data For the data generation, we generate cubic crystals with packing fraction $\eta$ and run initialization cycles to get a random initial particle configuration. After initialization, the configuration for each value of $\eta$ is saved to a file. We then run the simulation, in which we take snapshots of the system every $T = N_{save} \ dt$, where $N_{save}$ is the number of iterations between each save. Before the simulation, the random number seed is changed to $2$, such that the random force on the particles at the start of the simulation is not the same as the random on the particles at the start of the initialization. In each snapshot, we save the number of particles and box size (first 4 lines), and the coordinates and forces (number of lines equal to number of particles) to a file, where every row is formatted as $x$, $y$, $z$, $F^{tot}_x$, $F^{tot}_y$, $F^{tot}_z$, $F^{int}_x$, $F^{int}_y$, $F^{int}_z$ Furthermore, we generate data for $2$ different values of the energy $k_B T$. From the integration scheme, it follows that high values of $k_B T$ ($k_B T \approx 1$) result in random forces which are much larger than the interaction forces. Therefore we compare results for values of $k_B T$ at which $|\mathbf{F}^{R}| >> |\mathbf{F}^{int}|$ and at which $|\mathbf{F}^{R}| \le |\mathbf{F}^{int}|$. ### Data $|\mathbf{F}^{R}| \le |\mathbf{F}^{int}|$ Simulation parameters | Name | Value | | -------- | -------- | | $N_{ic}$ | $10^5$ | | $N_{iter}$ | $50$ | | $dt$ | $0.0001$ | | $\tilde{\zeta}$ | $1$ | | $\tilde{D}$ | $0.01$ | | $\tilde{V_0}$ | $100$ | | $r_c$ | $3$ | | $\sigma$ | $1$ | $\eta \in \{0.1, 0.15, 0.2, 0.25, 0.30\}$ $T \in \{0.1, 0.15, 0.2, 0.25, 0.3\} * dt$ At this value of $D$ $\zeta \frac{\Delta \mathbf{r}}{\Delta t} = \mathbf{F}^R + \mathbf{F}^{int} = \sqrt{\frac{2 \zeta k_B T}{\Delta t}} \mathbf{R} + \mathbf{F}^{int}$ $O(\mathbf{F}^{int}) \approx O(10) - O (100)$ $O(\mathbf{F}^R) \approx \sqrt{\frac{2 * 10^{-3}}{10^{-4}}} = 14.14$ Looking at the data it seems that the total force has decreased but that the interaction force has decreased as well. ### Data $|\mathbf{F}^{R}| >> |\mathbf{F}^{int}|$ Simulation parameters | Name | Value | | -------- | -------- | | $N_{ic}$ | $10^5$ | | $N_{iter}$ | $50$ | | $dt$ | $0.0001$ | | $\tilde{\zeta}$ | $1$ | | $\tilde{D}$ | $1$ | | $\tilde{V_0}$ | $100$ | | $r_c$ | $3$ | | $\sigma$ | $1$ | $\eta \in \{0.1, 0.15, 0.2, 0.25, 0.30\}$ $T \in \{0.1, 0.15, 0.2, 0.25, 0.3\} * dt$ At this value of $D$ $\zeta \frac{\Delta \mathbf{r}}{\Delta t} = \mathbf{F}^R + \mathbf{F}^{int} = \sqrt{\frac{2 \zeta k_B T}{\Delta t}} \mathbf{R} + \mathbf{F}^{int}$ $O(\mathbf{F}^{int}) \approx O(10) - O (100)$ $O(\mathbf{F}^R) \approx \sqrt{\frac{2}{10^{-4}}} = 141.42$ Make histogram of all distances between particles for full data ### Comparison datasets Although decreasing $k_B T$ should make the random force components relatively small compared to interaction force components, it seems from the data $k_B T = 0.01$ that random force has also become very small. Compare first snapshot $\eta = 0.1$ for both datasets. For $k_B T = 0.01$: ![image](https://hackmd.io/_uploads/ByLK0rkHT.png) For $k_B T = 1$: ![image](https://hackmd.io/_uploads/SyJACHkHp.png) Hence decreasing the energy causes the random force to decrease, but crystal formation occurs and therefore the interaction force becomes very small as well. ## Train on interaction forces As measure of the accuracy we take the COD between model prediction and true force values labels and between prediction and true force *curves* on $r \in [1, 3], \delta r = 0.0005$. ### Hyperparameter selection The accuracy of the model prediction depends on hyperparameter settings. Here we discuss the effect of different settings on the accuracy. Each simulation we change one hyperparameter with respect to the default settings | Parameter | Value | | -------- | -------- | | $N_{epochs}$ | 1000 | | $lr$ | 0.001 | | Batch size | 2 | | Ratio train/test | 0.5 | | Torch seed | 1 | | Numpy seed (to shuffle snapshots) | 1 | | Shape Edge Function | (4, 4, 4)| | Activation function Edge Function | Sigmoid | Accuracy with respect to curve $r^2_{int} = 0.9999977208735437$ #### Shape Edge Function | Shape | $r^2$, $F_{int}$ labels | $F_{int}$ curve | | ---------- | ------------ | ------------ | (3, 3, 3) | 0.9999879002571106 | | | (4, 4, 4) | 0.9999969005584717 | 0.9999977208735437 | | (5, 5, 5) | 0.9999998211860657 | As can be seen all neural networks do a fairly good job at predicting the interaction forces. Increasing the number of nodes barely improves $r^2$. #### Batch size Compare accuracy (with respect to interaction forces) for different batch sizes. | Batch size | $r^2$ | $r^2$ (curve) | | ---------- | ------------ | ------------ | 2 | 0.9999969005584717 | 0.9999977208735437 | | 4 | 0.999964714050293 | 0.9999994616154961 | | 8 | 0.9636549949645996 | 0.9999648472664697 | As we can see, the prediction gets worse if we increase the batch size. To understand this compare plots COD for each batch size. ![image](https://hackmd.io/_uploads/rJd6FPyUa.png) If we increase batch size, $r^2 \rightarrow 1$ more slowly. This is probably because the weights and biases of the Edge Function are updated less frequently. #### Activation function | Activation function | $r^2$ (labels) | $r^2$ (curve) | | ---------- | ------------ | | ReLu | 0.9865358471870422 |0.9967628725763361| | Leaky ReLu | 0.996918261051178 |0.9997181712381543 | | sigmoid | 0.9999969005584717 |0.9999977208735437| | tanh | 0.9999947547912598 |0.9999873125014568| Sigmoid has the best COD. If we look at plots of Edge Function corresponding to different activation functions, we see that sigmoid and tanh produce smooth curve predictions, whereas ReLU and Leaky ReLU produce "kinks". ![image](https://hackmd.io/_uploads/B1wMQdJ8T.png) ![image](https://hackmd.io/_uploads/rk8X7_1LT.png) #### Different (initial) learning rates | lr | $r^2$ | | ---------- | ------------ | | 0.0001 | | | 0.001 | | | 0.01 | | | 0.1 | | ## Train on total forces | Parameter | Value | | -------- | -------- | | $N_{epochs}$ | 1000 | | $lr$ | 0.001 | | Batch size | 2 | | Ratio train/test | 0.5 | | Torch seed | 1 | | Numpy seed (to shuffle snapshots) | 1 | | Shape Edge Function | (4, 4, 4)| Train GNN on total force labels from simulations, both with 50 SPD and 500 SPD. Edge Function predicts magnitude interaction force correctly. ![plot_EF_Ftot_comparison_spd](https://hackmd.io/_uploads/rJjqyaFva.png) Here left: 50 spd, right 500 spd Prediction better for 500 spd. For both spd values, MSE remains very high and COD very low througout training loop. ![plot_COD_Ftot](https://hackmd.io/_uploads/S1MvepYP6.png) The vertical lines indicate the epoch at which the model gave highest COD. For both spd values, the COD converges to some value $r^2 << 1$. Furthermore $r^2_{500 spd} < r^2_{50 spd}$. This might be caused by the fact that for these simulation values $\mathbf{F}^{R} >> \mathbf{F}^{int}$. Estimate order of magnitude: $\zeta \frac{\Delta \mathbf{r}}{\Delta t} = \mathbf{F}^R + \mathbf{F}^{int} = \sqrt{\frac{2 \zeta k_B T}{\Delta t}} \mathbf{R} + \mathbf{F}^{int}$ $O(\mathbf{F}^{int}) \approx 10$ $O(\mathbf{F}^R) \approx \sqrt{\frac{2}{10^{-4}}} = 141.2$ Ruiz-Garcia: $k_B T = 0.01$ $O(\mathbf{F}^{int}) \approx 10-10^2$ $O(\mathbf{F}^R) \approx \sqrt{\frac{2 *10^{-2}}{10^{-4}}} = 44.7$ If we add more snapshots, we get more points with high noise, causing the COD decrease as SPD increases. Still, Edge Function predicts magnitude interaction force quite accurately. Indeed, if we consider the $r^2$ of the prediction with respect to the interaction force labels, we see $r^2 \rightarrow 1$. ![plot_COD_Fint](https://hackmd.io/_uploads/BJiQz6KP6.png) Here, we that model trained on 500 spd tends to $r^2 = 1$ more quickly. Also 500 spd results in a slightly higher final COD in the end. If we compare different seeds for 50 spd, we see that all predicted force curves lie above the true force curve. ![image](https://hackmd.io/_uploads/rJUHOpFwT.png) This might have something to do with a bias in force values. Plot histograms of individual force components and generate a table with averages and SDOM per component. | Direction | 50 spd | 500 spd | |-----------|---------------------------------------------|-----------------------------------------------| | $x$ | $-0.3847 \pm 0.1633$ | $-0.244363471865654 \pm 0.0365435816347599$ | | $y$ | $-0.7215 \pm 0.1627$ | $-0.04648003354668617 \pm 0.036514390259981155$ | | $z$ | $0.1893 \pm 0.1632$ | $0.08742117136716843 \pm 0.036516886204481125$ | ## Train on trajectories Train the GNN on data with different time-intervals $T$ between snapshots. Consider $T \in \{1, 2, 5, 10\}*dt$ for both 50 and 500 spd. ![plot_EdgeFunction_compare_T=n_dt](https://hackmd.io/_uploads/BJxvWCFPp.png) Predictions 50 spd not as good as 500 spd. Especially at $r/ \sigma \approx 1$, model trained on 500 SPPF performs better. For both spd values, we see that prediction gets increasinly bad as $\tau$ increases. Deviation with respect to the true values increasingly large as $r/ \sigma$ decreases. The reason for this is that the forward difference gives the time-average of the total force on the time-interval $t \in [0, \tau]$. For increasing $\tau$ this average will deviate more from the value at the start of the interval, since the path of a particle can no longer be described by a straight line. Explained differently, the interactions during the interval will change the direction of a particle's motion. Most likely, the particle will move in another direction than the direction of the initial interaction force. Therefore the estimate of the interaction force based on the initial and final position will be smaller than the interaction force at the start of the interval. ![image](https://hackmd.io/_uploads/HkuKRlud6.png) At a value of $\tau /dt = 1$, the $500$ SPPF model clearly outperforms the $50$ SPPF model (the $500$ SPPF prediction overlaps with the true force, whilst the $50$ SPPF model overestimates the force). As $\tau /dt$ increases the predictions of the $50$ and $500$ SPPF model becomes increasingly similar. The "benefit" of more data seems to decrease. At small $\tau /dt$, the effect of the random force on the particle motion is quite large. At larger values of $\tau /dt$ the estimates of the total force obtained from numerical differentiation, will have less noise. At low $\tau /dt$ you need much data such that $<F^{\alpha}> \approx 0$. At high $\tau /dt$ you have less noise and hence you'll need less data to make a proper fit. Study more in depth, compare COD values (left = 50 spd, right = 500 spd). ![image](https://hackmd.io/_uploads/Hy6mAN2Dp.png) * For both datasets $r^2_{int}$ decreases as $\tau$ increases (expected). * For both datasets $r^2_{tot}$ increases as $\tau$ increases. * Logical (?): if $\tau$ increases you average out more noise and if timestep still sufficiently small, such that interaction force doesn't change much over during one $\tau$, the total force will approach the interaction force, causing $r^2_{tot}$ to increase. * For both data sets $r^2_{int} (\tau = 2 \ dt) > r^2_{int} (\tau = 1 \ dt)$ * Weird: the smaller the timestep the better the prediction should be. * However: model takes best prediction with respect to total forces, which is not best model with respect to interaction forces. * As $\tau$ increases, the accuracy of the model trained on $50$ SPPF approach model trained on $500$ SPPF. $\vec{F}_{int}$