# BioExcel/ENCCS GROMACS workshop Jan 2021

## Important links

This document: https://hackmd.io/@enccs/gromacs-jan-2021

Workshop webpage: https://enccs.se/events/2021/01/advanced-topics-in-simulations-with-gromacs/

EuroCC National Competence Center Sweden (ENCCS): https://enccs.se/

BioExcel: https://bioexcel.eu/

### Instructors

- [Mark Abraham, ENCCS, UU](https://enccs.se/people/mark-abraham/)
- [Christian Blau, BioExcel, Stockholm University](https://www.biophysics.se/index.php/members/christian-blau/)
- [Artem Zhmurov, ENCCS, UU](https://www.kth.se/profile/zhmurov?l=en)
- [Alessandra Villa, BioExcel, KTH](https://www.kth.se/profile/avilla)
- Host: [Thor Wikfeldt](https://enccs.se/people/kjartan-thor-wikfeldt)

---

:::danger
This is the place to ask questions about the workshop content! We use the Zoom chat only for posting links, reporting Zoom problems and such.
:::

---

## Code of conduct

We strive to follow the Code of Conduct developed by The Carpentries organisation to foster a welcoming environment for everyone (see https://docs.carpentries.org/topic_folders/policies/code-of-conduct.html). In short:
- Use welcoming and inclusive language
- Be respectful of different viewpoints and experiences
- Gracefully accept constructive criticism
- Focus on what is best for the community
- Show courtesy and respect towards other community members

---

## Schedule

- January 12 – Using umbrella-sampling ("pulling") simulations

|time| |
|----|----|
|9.00 |Introduction - Thor Wikfeldt|
|9.10 |Introduction to BioExcel - Rossen Apostolov|
|9.20 |Workshop details - Thor Wikfeldt|
|9.25 |Setting up for Umbrella Simulations - Mark Abraham|
|9.30 |Umbrella sampling - Mark Abraham|
|10.00 |Break|
|10.10 |Planning and running an umbrella sampling simulation - Mark Abraham|
|10.40 |Break|
|10.50 |Analysing umbrella sampling - Mark Abraham|
|11.20 |Break|
|11.30 |Might something have gone wrong? - Mark Abraham|
|12.00 |Break|
|12.10 |Questions - Mark Abraham|
|12.20 |Wrap-up - Thor Wikfeldt|
|12.30 |Close|

- January 13 – Using replica-exchange molecular dynamics

|time| |
|----|----|
|9.00 |Questions from day 1|
|9.20 |Replica exchange molecular dynamics simulations|
|10.00 |Break|
|10.10 |System preparation|
|10.30 |Break|
|10.40 |Analysis of equilibrium simulations|
|11.10 |Production simulations|
|11.30 |Break|
|11.40 |Analysis of the data|
|12.00 |Can we reproduce the simulations with a single MD run?|
|12.10 |Using a different set of temperatures|
|12.20 |Wrap-up|

- January 14 – Using accelerated weight histogram methods ("AWH")

|time| |
|----|----|
|9.00 |Starting notebooks, follow up on day 2|
|9.20 |Re-cap of AWH - theory|
|10.00 |Break|
|10.10 |First Part AWH|
|11.00 |Break|
|11.15 |Second Part AWH|
|12.00 |Wrap-up AWH|

- January 15 – Computing trajectories efficiently on GPUs

|time| |
|----|----|
|9.00 |Welcome, recap of day 3|
|9.10 |Introduction|
|9.30 |Setting up on Puhti|
|9.50 |Learning about hardware on Puhti|
|10.10 |Break|
|10.20 |Running the MD algorithm|
|11.00 |Break|
|11.10 |Running with PME|
|11.50 |Break|
|12.00 |General considerations for performance|
|12.20 |Wrap-up workshop|

---

# Day 1

## Using umbrella-sampling (pulling) simulations

Download the tutorial, and then unpack it with something like
```bash=
unzip ~/Downloads/enccs-bioexcel-gromacs-workshops-advanced-umbrella-sampling.zip
conda activate gromacs-tutorials   # Activate conda environment
cd advanced-umbrella-sampling
jupyter notebook tutorial.ipynb    # open notebook in new browser window
```

## Questions and answers

- Why should someone prefer umbrella sampling over other free energy methods, e.g. MBAR?
    - MBAR is an analysis alternative to WHAM. It can also be used to combine the results from the many biased umbrella runs into a free energy profile, and it will show up later in the tutorial.
    - There are more advanced methods (e.g. AWH, extended ensemble, and the whole zoo of methods implemented in `PLUMED` and `colvars`); umbrella sampling / pulling provides the basic framework for most of them, from generating the data to analysing it.
- Does umbrella sampling give more accurate results than FEP methods?
    - If we imagine a ligand binding free energy, then using an alchemical transformation to make a ligand vanish from a binding site and make water appear in its place can at times be more efficient. It all depends on the pathway: if it's easy to pull a ligand out of a binding site, umbrella sampling will win; if a tiny ligand is buried in a protein, I would expect FEP to be more efficient.
- is `pull-coord1-init` the center of the harmonic potential?
    - Yes
- could you comment a bit on the protocol you follow to generate the windows?
    - It's usually iterative - the perfect protocol could be derived from the free energy landscape / potential of mean force, because you would know how rugged it is, how large the energy barriers are, etc., but this is the exact thing we try to estimate.
    - An indication that something is off with your profile is if you see that the system is never at a certain distance - imagine you placed umbrellas at distances 0.5 and 1.0 and never see the system at a distance between 0.7 and 0.8. You would have to redefine your umbrellas, because there is most likely an energy barrier there, and with no sampling there you will never be able to estimate it correctly
    - To save computational effort you would start with small force constants and a large spacing
    - As discussed below, it's beneficial to choose the number of umbrellas so that the number of MPI ranks you have available is a multiple of it, to make use of `-multidir` efficiently.
- if you want to use multidir on HPC, for instance 7 MPIs/directory and 4 OMP, what could be a suggested command line?
    - In most cases you want to let GROMACS decide automatically how to balance `OpenMP` threads and `MPI` ranks, so just running `mpiexec -np 28 gmx_mpi mdrun -multidir sampling/run*` would give you the best performance (the 28 MPI ranks would now be evenly distributed over all directories, so you might want to set up the umbrellas so that the number of MPI ranks is a multiple of the number of umbrellas). You can try `mpiexec -np 28 gmx_mpi mdrun -multidir sampling/run* -ntomp 4` to force GROMACS, but this might result in GROMACS complaining that you have chosen an inefficient combination
    - Running that `mpiexec -np 28` command when there were 7 `sampling/run*` directories would make sense if there were e.g. 7 nodes (each with 16 cores) and you intended 4 MPI ranks to run on each node, each with 4 OpenMP threads. mdrun will keep all 4 ranks together on the same node (which is most efficient), etc.
    - I see, GROMACS will share the MPI ranks evenly, right?
        - yes
- could each window be run on GPUs if available with multidir?
    - Yes
- cylindrical pulling - where is it suitable? I can see it if you want to pull a lipid from a bilayer, but are there other cases?
    - Asymmetry in the system may introduce a need for this.
    - Think in terms of the reaction coordinate - is there something in the system that breaks symmetry in some way?
- can US also be done with PLUMED, and if that is the case, what would be the advantages/disadvantages compared to the setup we followed today?
    - Yes. GROMACS runs faster, is installed by default, and you will usually have the latest version installed. The GROMACS pull code is, so to say, bare-metal: very fast across lots of compute resources and robust, whereas the PLUMED-patched GROMACS has all kinds of algorithms and ways of pulling. Because of its generality, PLUMED will generally run slower than the equivalent using the pull code. So, if you can use the pull code, do so because you can reinvest your computer time in more sampling.
    If you have to use the PLUMED capabilities because they're not in GROMACS, then absolutely do so!
- is the machinery for US with QM/MM interfaces already available?
    - Dmitry Morozov at Jyväskylä University https://www.jyu.fi/science/en/chemistry/staff-and-administration/staff/morozov-dmitry in Finland did something like that with a new GROMACS fork that combines GROMACS and CP2K. That is all in "beta" phase, but do not hesitate to contact them if you're interested.

---

# Day 2

## Using replica-exchange molecular dynamics

Unpack tutorial and set up with
```bash
tar xfz enccs-bioexcell-remd-tutorial.tar.gz
cd alanine-tripeptide-replica-exchange-molecular-dynamics
jupyter notebook remd-tutorial.ipynb
```

* How optional is MPI GMX?
    - One will need it to run production simulations. But we have reference data to use for those who don't have GMX MPI.

**Quizzes**

You are studying interactions of a protein with a small ligand. Is replica-exchange molecular dynamics the right tool for you?
1. Yes
2. No

You are studying interactions of a protein with a small ligand. Is TEMPERATURE replica-exchange molecular dynamics the right tool for you?
1. Yes
2. No

---

* Can dynamics really be probed with REMD? (given that MC steps are used, which may be unphysical)
    - In REMD, you are using Monte Carlo to switch between the replicas. The actual simulations are propagated by MD integration. You lose the continuity of the simulation, though. If the energy landscape is properly sampled, you can always refer to that to get the dynamics.
* Can selecting a different force field lead to the same conclusions?
    - Maybe. That depends on the properties the force fields are parameterized to reproduce (and whether they succeed in the actual case). In all MD (not just REMD) it is wise to plan to study the same system with more than one force field (where possible) so that you get insight into whether your observations have limitations from the choice of force field.
    Of course, all the force fields might be wrong, in either similar or different ways...

Solution to sed problem on MacOS:
```
sed -i "" 's/XXXXXX/300/g' equil0/equil.mdp
sed -i "" 's/XXXXXX/310/g' equil1/equil.mdp
sed -i "" 's/XXXXXX/320/g' equil2/equil.mdp
sed -i "" 's/XXXXXX/330/g' equil3/equil.mdp
```

- Just wondering, can I force it to run on CUDA cores? I guess it uses only CPU cores, right?
    - by default it will use the GPU for PME and non-bondeds. That is if you have compiled with CUDA support. You can modify it by passing flags `-pme gpu -nb gpu -update gpu -bonded gpu` to `gmx mdrun`.
- so, 4 replicas 4 processors, and for let's say 6 replicas, 6 processors, right?
    - yes: `mpirun -np N gmx_mpi mdrun -multidir stage1/equil[0..N-1]`
- I will be running the simulations later, I hope that won't be any issue? It's really bumping my laptop a lot.
    - you can choose to only follow instead of running yourself, and use the reference output from Artem for the analysis
    - but you can also run gmx_mpi on mybinder.org, see instructions at https://github.com/enccs/gromacs-workshop-installation/#running-via-binder (will take a little while to set up though)
- If my replicas are spaced too close together, how should I think about reorganizing things?
    - General wisdom is to aim at about 20% average exchange probability. Mark wrote a paper on the topic, and included a section on "how to set up replica profiles" at https://doi.org/10.1021/ct800016r
- Does it make sense to calculate the energy barriers between the conformational states of replicas?
    - If you knew that already, you wouldn't need to run MD, unfortunately. (Or did I misunderstand the question? Ask another if you want to!)
    - The structure may remain kinetically trapped in a certain configuration (e.g. starting configuration) during simulation. I think the calculation of energy barriers between states may help us to understand this?
        - Yes, if you have insight into barrier heights that would be a good guide to the temperature range you need
- What would have happened if `gen-vel=yes` in the simulation mdp script?
    - You'd have randomized the velocities and lost some of the benefit of the equilibration. It's probably not too serious, but not desirable either. (This is the same as for regular MD.)
- how does the exchange scheme work? is it always from replica 0 to replica 1, from 1 to 2 etc., or is there a randomization?
    - Odd exchange attempts: 0<->1, 2<->3, etc.; even attempts: 1<->2, 3<->4, etc. Only neighbours exchange.
    - Random exchange partners would be ineffective, because they would come from replicas that have negligible overlap in either ensemble space or potential energy distribution
- I have problems installing the MPI version of GROMACS. What can I do?
    - what error message do you see?
    - You will need to install MPI first
    - I've installed OpenMPI. I've followed the installation instructions for GROMACS from the link in the mail, but at the end I don't find gmx_mpi as a command.
        - Once you've run `make install` then gmx_mpi is installed somewhere (e.g. `/usr/local/gromacs`). To put it into your `PATH` environment variable, you need to do something active. The easiest thing to do is `source /path/to/wherever/gmx_mpi/was/installed/bin/GMXRC`.
- Doesn't polyalanine have a propensity for PPII and isn't that seen in the Ramachandran plot?
    - Ordinarily, yes. Maybe the sampling wasn't enough yet, or this force field doesn't do a good job of it! Or the propensity is only observed for oligomer sizes above 2.

---

### notes from breakout room session

room 1:
- We need more windows since the probability of exchange is lower than 20%.
- With closer windows, we would probably need more temperature values if we wish to cover the entire range
- Probably window 2 can be moved closer to 1 as well.

room 2:
- Move the temperatures closer together, but then the range decreases.
- Shift the temperature of the second replica closer to the first and further from the third, to increase the exchange between 1 and 2, and decrease it between 2 and 3.

room 4:
- The windows should be closer, or there should be more windows, to increase the probability to 20%
- The temperatures do not have to be evenly spaced, e.g. 3 and 2 can be further apart since the number of exchanges is higher for these.

---

Questions:
- When I run `gmx chi -s topol.tpr -f traj_comp.xtc -rama`, I get the error: `Fatal error: reading tpx file (topol.tpr) version 119 with version 116 program`
    - That happens when you're running `gmx chi` from an older version of GROMACS, which can't read the .tpr files from newer versions of GROMACS. Probably you used a different terminal and had a different PATH. Running e.g. `conda activate gromacs-tutorials` from that terminal would re-activate the environment that has the same version of `gmx`.

What do you think would be a better temperature set for our system?
- Decrease temperature range from 30 to 25
- Possibly move replica 2 closer to replica 1 (questionable whether 400 ps is enough sampling for a ~3% difference in replica exchange to be significant enough to justify that.)

---

# Day 3

## Using accelerated weight histogram methods - AWH

Download the tutorial, then unzip and start it with the following commands (same procedure as the last two days)
```bash=
cd
cd Downloads
tar -xvf awh-tutorial.tar.gz
cd awh-dna-basepair-opening/
jupyter notebook awh-tutorial.ipynb
```

## Questions and answers

- Where should I go to get support for GROMACS?
    - https://gromacs.bioexcel.eu/
- Where can I find the slides of today's presentation?
    - we will make them available ASAP
- So is this essentially a Bayesian estimate of the reaction profile?
- If one has a multidimensional reaction coordinate in AWH does one still use one diffusion constant?
    - one can define a diffusion constant for each coordinate using the option in the mdp file (`awh1-dim1-diffusion`: estimated diffusion constant for this coordinate dimension, determining the initial biasing rate).
- I have a problem regarding installation of MPI-enabled GROMACS on my Core i7 laptop with a Windows operating system. I have an Oracle VM VirtualBox with Ubuntu 16 installed in that. I am trying to install GROMACS 2020.1 with MPI enabled. I first installed Open MPI v4.1. Then I tried to install GROMACS, but when I execute the cmake command as follows: `cmake .. -DCMAKE_C_COMPILER=mpicc -DCMAKE_C++_COMPILER=mpic++ -DGMX_MPI=ON -DGMX_DOUBLE=OFF -DGMX_BUILD_OWN_FFTW=ON -DREGRESSIONTEST_DOWNLOAD=ON` it gives this error:
    ```bash=
    The C compiler identification is unknown
    The CXX compiler identification is unknown
    ```
    How can I solve the problem??
    - can you try to run `sudo apt install mpich`?
    - try cmake with `-DGMX_MPI=ON -DGMX_BUILD_OWN_FFTW=ON -DREGRESSIONTEST_DOWNLOAD=ON` to have it pick up the compilers itself
    - I did `sudo apt install mpich`. When I try to execute the cmake command, it gives me this error: `MPI support requested but no MPI compiler found.`
        - what if you run cmake with `-DGMX_MPI=ON -DGMX_BUILD_OWN_FFTW=ON -DREGRESSIONTEST_DOWNLOAD=ON`?
    - I did. The problem still exists
        - can you check that the compilers and MPI were actually installed by running `mpicc --version` and `mpic++ --version`?
        - you can also type `mpi` in the terminal and then TAB to get an autocomplete list of available commands
    - when I execute `mpicc --version` it says: `mpicc: error while loading shared libraries: libopen-pal.so.40: cannot open shared object file`
        - one suggestion from an internet search is to run `sudo ldconfig` to configure dynamic linker run-time bindings. Does that work?
    - it seems yes. now when I execute `mpicc --version` it gives: `gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609`
        - great!
        now you should be able to compile GROMACS
- After installing GROMACS, I executed this command: `mpirun -np 2 gmx_mpi mdrun -deffnm md` and it gave me this error: `There are not enough slots available in the system to satisfy the 2 slots that were requested by the application: gmx_mpi`
    - Either request fewer slots for your application, or make more slots available for use.
    - A "slot" is the Open MPI term for an allocatable unit where we can launch a process. The number of slots available is defined by the environment in which Open MPI processes are run:
        1. Hostfile, via "slots=N" clauses (N defaults to number of processor cores if not provided)
        2. The --host command line parameter, via a ":N" suffix on the hostname (N defaults to 1 if not provided)
        3. Resource manager (e.g., SLURM, PBS/Torque, LSF, etc.)
        4. If none of a hostfile, the --host command line parameter, or an RM is present, Open MPI defaults to the number of processor cores
    - In all the above cases, if you want Open MPI to default to the number of hardware threads instead of the number of processor cores, use the --use-hwthread-cpus option.
    - Alternatively, you can use the --oversubscribe option to ignore the number of available slots when deciding the number of processes to launch.
    - Do you know what is the problem??
- Would multiple walkers also help when we have an RC which has large values?
    - Yes, in principle. In general it is always better to have an RC 'without large values' due to sampling issues

---

# Day 4

## Computing trajectories efficiently on GPUs

### How to run on Puhti

**Today we will be using the [Puhti cluster](https://docs.csc.fi/computing/systems-puhti/)**
- [Information on GROMACS on Puhti](https://docs.csc.fi/apps/gromacs/)

Then use ssh to connect to Puhti, like `ssh training0XX@puhti.csc.fi`

If you need to install ssh, then perhaps `sudo apt install ssh` will help. Or ask an instructor for help.

The common password is in the Zoom chat - ask an instructor if you don't have it.
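
To avoid retyping the full login each time, the `ssh` line above can be captured in an OpenSSH client config entry. This is only a minimal sketch: the helper name `puhti_ssh_stanza` and the `puhti` alias are made up for illustration and are not part of the workshop material, and `training0XX` stands for your own training account.

```bash
# Hypothetical helper: print an OpenSSH client config entry for Puhti.
# The function name and the "puhti" alias are illustrative assumptions.
puhti_ssh_stanza() {
    local user="$1"   # your training account, e.g. training0XX
    printf 'Host puhti\n    HostName puhti.csc.fi\n    User %s\n' "$user"
}

# Print the entry so it can be reviewed before use
puhti_ssh_stanza training0XX
```

If the printed entry looks right, it could be appended to `~/.ssh/config`, after which `ssh puhti` should behave like the longer command.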
:::danger
All work on Puhti should be done in the scratch folder, which is visible to all compute nodes. You need to make your own working directory inside that. The inputs for today's exercises are found in an archive file that you can then unzip:
```bash=
cd /scratch/project_2000745
mkdir training0XX
cd training0XX
unzip /scratch/project_2000745/gpu-performance.zip
```
:::

#### To use the workshop reservation make sure to include this in your batch script:

`#SBATCH --reservation=gromacs`

The sample scripts in the archive have this already!

## Questions, answers and discussions

- Is the latest version of GROMACS always the best version to install??
    - generally yes, and particularly the latest patch version (e.g. 4 in 2020.4), which has the latest bug fixes
    - newer major versions often not only have new features but also performance improvements
- When to switch the major version / how to make sure no dangerous discontinuities occur in a simulation series?
    - until a better expert answers this question, I would say that one shouldn't change version in the middle of a simulation (i.e. for a restart)
    - You should be able to switch versions in most cases. The simulation continuity should be ok. You may have differences in the energy output format, restart file format, etc. So it is generally not recommended to switch major version.
    - Many people keep the same major version for the whole of a related group of simulations, so that they eliminate that variable from questions like "why was this simulation different from the other ones?"
    - However you should update patch versions whenever you have the time to. The code is kept super stable, with strictly only bug fixes. We post on https://gromacs.bioexcel.eu/c/gromacs-announcements/7 when we make new releases, along with release notes that tell you what issues were fixed. Definitely update if you read about fixes in areas that might affect your work!
- has the GROMACS team started exploring the performance on Apple silicon (M1) and its GPU?
    - Yes. GROMACS has ARM SIMD support, and it compiled perfectly on M1. We also got M1-equipped Apple hardware for CI and testing. Making sure it uses the heterogeneous cores well might need a little work, however!
- Does Puhti mean something particular in Finnish? it means clean
    - I don't know Finnish but found this: https://en.wiktionary.org/wiki/puhti: puhti=energy, vigour/vigor, vitality, vim, zip
    - curiously, two Linux commands in there ;-)
- Sorry, what does '--gres=gpu:v100:1' stand for?
    - this option requests one V100 GPU on the node that gets assigned to the job
    - full documentation in case that's useful: https://slurm.schedmd.com/gres.html
- If you are running a single-line gmx mdrun, should you always have `--ntasks=1`? And is that the same as `--ntasks-per-node=1`?
    - normally you will want to use gmx_mpi because GROMACS can run in parallel on both CPUs and GPUs simultaneously, and in that case you don't give the `--ntasks-per-node=1` Slurm option. does that answer the question?
    - So with gmx_mpi you never specify `--ntasks-per-node`? And what if you have a node with only CPUs? A node with 2 x Intel CPUs.
- if someone does an MD simulation two times with the same starting molecular conformations and simulation parameters, would they get exactly the same results, for example regarding energy parameters or density, both times? and if the simulation is done once on CPU and once on GPU, would the results be exactly the same?
    - the trajectories will not be exactly the same since MD is inherently chaotic (tiny numerical noise will lead to large divergence over a sufficiently long time).
    But macroscopic properties like energy and density should be reproducible between different runs, also if you switch from CPU to GPU
- when I execute this command on my laptop, `mpirun -np 2 gmx_mpi mdrun -deffnm md`, I get the following error:
    ```
    There are not enough slots available in the system to satisfy the 2 slots that were requested by the application:
    gmx_mpi
    Either request fewer slots for your application, or make more slots available for use.
    ```
    - can you tell me how to solve it???
        - looks like this question was answered yesterday, see lines 399-413 of the [past-days hackMD](https://hackmd.io/@enccs/HyTIOXnRw)
            - does that work?
        - to begin with, try adding the `--oversubscribe` option to `mpirun`
    - hackMD from previous days has no line numbers to find 399-413. however I checked it and there was no answer to that question as I asked it around 4 pm yesterday
        - try adding the `--oversubscribe` option to `mpirun`. Does it work? This option lets you run more MPI ranks than there are available slots (cores)
        - btw you should be able to see line numbers in edit mode
    - I cannot go to edit mode for the hackMD for previous days. how can I do that? can you copy the lines 399-413 here for me??
        - it starts with the reply "Either request fewer slots for your application, or make more slots available for use."
    - but this is what GROMACS writes when the error happens and not an answer from your side
        - what happens when you add the `--oversubscribe` option to mpirun?
    - it did not help
        - sorry that we weren't able to solve this quickly. Do you get the same error when you use `--oversubscribe`?
        - you can also try using this flag with mpirun: `--use-hwthread-cpus`
- is there a rule of thumb guide for which approach to choose for some particular system size / algorithm? (I guess today's topic is this... but some quick and dirty ballpark estimate)
    - sadly, it depends :-) If you have lots of CPU cores, then just offload NB and maybe PME. If you have weak or few CPU cores, try to offload everything.
    Newer server-grade GPUs like V100 are more powerful, so trying to use more than one of them needs > 500K particles. If running on modern consumer GPUs (e.g. GTX 1080 or larger numbers) then you'd want maybe > 100K particles to run on more than one GPU.
- there is some issue with the batch script. If you get an error about trying to use 4 nodes, add this option: `#SBATCH --nodes=1`
    - (I'm not sure why this is needed...)
    - if that doesn't work, comment out the `#SBATCH --reservation` line with a `#` in front
- how does the time required for `-tunepme` to converge depend on the number of particles or cores used?
    - It's adaptive... if it sees variation, it tries for longer periods, discarding points that are clearly silly. The number of cores and particles determines the value of the optimum, but not the convergence of the tuning process.
- is there a way to know the meaning of the flags used in the ${options} variable? there is no explanation on the gmx mdrun web page nor when typing 'gmx_mpi mdrun -h' in the terminal
    - ``-noconfout`` is to not write the final coordinates
    - ``-resetstep 10000`` is to reset counters at step 10000, so that only the second half of the trajectory is used for the ns/day number
    - https://manual.gromacs.org/current/user-guide/mdrun-performance.html#finding-out-how-to-run-mdrun-better has a small amount of explanation
- Does the -multidir option also give a speed-up for CPU-only simulations?
    - No, because it relies on sharing the GPU between two simulations. While the update runs on the CPU, the GPU is idle *for that simulation*. But if there's another simulation wanting to use it, then it's not idle.
- How do you see from the autocorrelation how often you should save output for the trajectory?
    - `gmx energy` has support for looking at those autocorrelations in .edr files
    - Also `gmx analyze` can do that for output in .xvg files
    - There's no point sampling more often than the autocorrelation time
- I guess it differs from case to case but do you think sampling every 10 ps is way too often?
    - That depends on the observable. Kinetic energy has an autocorrelation time around 1 ps in typical biomolecular force fields. But observables based on atomic coordinates have much longer times.
- How to find pros and cons for different software like GROMACS, NAMD, AMBER?
    - https://www.livecomsjournal.org/ does a lot of that for methods, which does indirectly touch on the topic of software. But there's a niche that someone could fill for comparing software, perhaps by attracting collaborators from the specific fora for each tool. It's super hard to do a good job at such a comparison, and to keep it up to date. It quickly gets political too. There's nothing so toxic as in quantum chem (google "banned by Gaussian"), but people are human and sometimes feel threatened by others' perceived success.
    - https://doi.org/10.1021/acs.jctc.8b00544 did succeed at comparing reproducibility of some aspects of some of the codes. But they did the community a disservice by only comparing GROMACS in double precision in an ancient version (because one of them thinks GROMACS is always buggy because they once experienced a bug), while using bleeding-edge versions of other codes (whose bugs they fixed during the study...). Their conclusions are favorable about GROMACS (yay!), but invalid for how most people actually use GROMACS (boo).
    - https://link.springer.com/article/10.1007%2Fs10822-020-00290-5, https://doi.org/10.1007/s10822-016-9977-1, and https://doi.org/10.1080/00268976.2020.1742938 are also interesting contributions
    - Most people end up choosing software based on their ability to get practical help with it.
    You will find strong correlations between students' "choices" and what their PI used in their Ph.D. That's part of why fora like https://gromacs.bioexcel.eu/ are so valuable.
- Will the GROMACS GPU performance webpage stay for future reference?
    - Yes, and be updated. The archive will be made available there also!
- Will the HackMD document with all days collected be accessible in the future too, or will it be removed?
    - It's been collected and will be shared
- What if you are running on a cluster that only has CPUs, how can you optimize runtime then? And if you run on multiple CPU nodes, would it be beneficial to keep nonbonded on one node and everything else on the other? And how would you specify that, or does GROMACS automatically look for the best setup?
    - CPU only is quite a different beast. Now the units of parallelism are MPI ranks. You want many of them (depending on the size of your system and the quality of the cores), and to use a subset for PME-only. Check out the mdrun performance guide in the GROMACS documentation (https://manual.gromacs.org/current/user-guide/mdrun-performance.html), and the webinars at https://enccs.github.io/gromacs-gpu-performance/#related-material

:::info
*Always ask questions at the very bottom of this document, right above this.*
:::
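
For the CPU-only case in the last answer, a launch line with dedicated PME ranks might look like the sketch below. It only prints the command rather than running it. The helper name `cpu_mdrun_cmd`, the choice of a quarter of the ranks for PME, one OpenMP thread per rank, and the `md` file prefix are all assumptions for illustration; `-npme` and `-ntomp` are existing `mdrun` options, and the best split depends on the system and the hardware.

```bash
# Hypothetical helper: print (not run) a CPU-only mdrun launch line.
# Assumption: start with about a quarter of the MPI ranks as PME-only ranks
# and one OpenMP thread per rank; tune both for your own system.
cpu_mdrun_cmd() {
    local ranks="$1"              # total number of MPI ranks
    local npme=$(( ranks / 4 ))   # ranks dedicated to PME (integer division)
    echo "mpirun -np ${ranks} gmx_mpi mdrun -npme ${npme} -ntomp 1 -deffnm md"
}

# Example: 32 ranks, of which 8 would be PME-only
cpu_mdrun_cmd 32
```

Leaving out `-npme` lets `mdrun` guess the number of PME ranks itself, which is a reasonable first thing to try before tuning by hand.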