# Advanced GROMACS Workshop 2022
## Agenda and resources
> This is the collaborative "notebook" for the "Advanced GROMACS Workshop 2022" course organised in February 2022 by CSC - IT Center for Science together with ENCCS and BioExcel, supported by the EuroHPC Competence Center.
> [Course page](https://ssl.eventilla.com/advanced-gromacs-2022)
:::danger
This page has been frozen as a memo of the content and Q&A covered in the workshop. Thanks for participating!
:::
[ToC]
### Day 1, Monday 7 February (Times CET)
| Hands-on | Resources / Advance reservation | Time | Topic | Lecturer |
| -------- | -------- | -------- |-------- |-------- |
|login|interactive (1 core) | 9:00 – 9:10 |Introduction, organization, accounts, passwords |(Atte)|
|HackMD|-| 9:10 – 9:20 |Describe your challenge (you) |(Atte)|
|none| -| 9:20 – 9:50 |Overview on enhanced sampling methods| (Berk)|
|none| -| 10:00 – 11:00| Accelerated weight histogram method (AWH)| (Berk)|
||| 11:00 – 12:00 |Lunch Break||
|Notebook **AGW-gromacs-AWH**| 1 node/user for notebook "gmx1"| 12:00 – 12:50 |[AWH hands-on tutorial](https://gitlab.com/gromacs/online-tutorials/awh-tutorial/-/archive/main/awh-tutorial-main.zip) |(Alessandra)|
|Notebook **AGW-gromacs-cp2k**| 1 core for batch jobs "gmx1"| 13:00 – 14:00 |[QM/MM and enhanced sampling methods tutorial](https://github.com/bioexcel/2022-02-07-CSC-gromacs-cp2k-tutorial) and <br> [Slides](https://github.com/bioexcel/2022-02-07-CSC-gromacs-cp2k-tutorial/blob/master/CSC-AWH-Tutorial.pptx?raw=true)| (Dmitry)|
### Day 2, Tuesday 8 February
| Hands-on | Resources / Advance reservation | Time | Topic | Lecturer |
| -------- | -------- | ------ |-------- |-------- |
|-|| 9:00 – 9:10 |Quick summary of homework exercise(s)||
|Notebook **AGW-biobb-MDsetup**|1 node/user, "gmx2"| 9:10 – 9:50 |Gromacs and workflows - [BioBB Website](https://mmb.irbbarcelona.org/biobb/) and [Intro slides](https://docs.google.com/presentation/d/1nlUy05zZcLYl0A3KzsOxLtHSEfWj2lnw/edit#slide=id.p1) |(Adam)|
|Notebook **AGW-biobb-MDsetup**|1 node/user, "gmx2"| 10:00 – 11:00 |Workflows [Tutorial](https://docs.google.com/presentation/d/1RpPTW33CEwaP7MzWmXY5Gv_KRXdqk95a/edit#slide=id.p1) |(Adam)|
||gmx2| 11:00 – 12:00 |Lunch Break||
|Notebook, **AGW-gmxapi-dev**|1 node/user, gmx2| 12:00 – 12:50 |GMXAPI, intro and hands-on (materials in notebook) |(Eric)|
|Notebook, **AGW-gmxapi-dev**|1 node/user, gmx2| 13:00 – 14:00 |GMXAPI continued |(Eric)|
### Day 3, Wednesday 9 February
| Hands-on | Resources / Advance reservation | Time | Topic | Lecturer |
| -------- | -------- | ------ |-------- |-------- |
|-|-| 9:00 – 9:10 |Quick summary of homework exercise(s)| all|
|shell & std environment via OoD|GPUs, "gmx3"| 9:10 – 9:50 |Performance considerations, CPU + GPU [materials](https://enccs.github.io/gromacs-gpu-performance/) |(Artem)|
|shell & std environment via OoD|GPUs, "gmx3"| 10:00 – 11:00|Efficient use of GPU [hands-on](https://enccs.github.io/gromacs-gpu-performance/introduction/) |(Artem)|
||| 11:00 – 12:00 |Lunch Break||
|-|interactive/test partition| 12:00 – 12:50 |How the new functionality in GROMACS 2022 was realized |(Artem)|
|-|interactive/test partition| 13:00 – 14:00 |Q&A on challenges, tutorials, HackMD open issues| all|
### Advance reservations
Monday: 12-14, 40 nodes, small partition, name `gmx1`
Tuesday: 9-14, 40 nodes, small partition, name `gmx2`
Wednesday: 9-11, 4 GPU nodes, gpu partition, name `gmx3`
### Puhti user account
We have set up training accounts for the workshop. Only these can access the advance reservations (_i.e._ have guaranteed resources). You'll be sent your account name by email. Please use it, and only it, to access Puhti during the workshop.
### Access Puhti via the Web interface
* Point your browser to https://www.puhti.csc.fi
* use _your_ training account
* Starting a notebook:
* At the top select Apps
* On the left select "Jupyter for Courses"
* See the Agenda which "Course module" to select
* Check if an advance reservation should be specified
* Launch (wait for everything to be set up. If you suspect something is wrong, look at the log via the link)
* Once you get a "button" announcing the Notebook is ready, click and navigate to the tutorial material
* _If_ you also want a terminal on the same node, click the blue button after "Host:" (`> _nodename`) in the GUI
* The GPU session on Wednesday will not use a notebook. Instead, select "Compute node shell" at the bottom left
* 1 core (default settings) will be fine (it's for submitting jobs only)
* Use the Scratch area: `cd /scratch/project_2003752`
* Create a subfolder for yourself (everyone is using the same project) if it doesn't exist already `mkdir $USER`
* use this in your batch script to access the advance reservation
* `#SBATCH --reservation=gmx3`
Links for Wednesday:
* ENCCS: https://enccs.se/
* Lesson materials: https://enccs.github.io/gromacs-gpu-performance/
* Github repository: https://github.com/ENCCS/gromacs-gpu-performance
* To check-out: ``git clone https://github.com/ENCCS/gromacs-gpu-performance.git``
---
## 📝 Q & A
Write your questions here. We will answer them, and this document will store the answers for you for later use! :rocket:
:::info
Scroll :arrow_down: to the bottom of the page to submit a question
:::
### Topics I'd like to discuss with developers
- [ ] **Q: ...**
### General and practical matters
- [x] **Q: I have difficulty pasting my questions into HackMD (here). Do you have some instructions on how to write here?**
- A: Can you see the three icons in the top left corner, next to the HackMD text? There’s a pencil, a side-by-side symbol, and an eye. In the eye view you can’t edit, you are just viewing. The other two reveal the markdown (MD) version of the page, which you can edit. I find it easiest to edit with the side-by-side view.
:::info
:bulb: **Hint:** You can also apply styling from the toolbar at the top :arrow_upper_left: of the editing area.![](https://i.imgur.com/Cnle9f9.png)
:::
- [x] **Q: Slides available after the course?**
- A: They sure are! You will also have access to this HackMD document (save the link).
- The slides and tutorials have been linked to the agenda above.
- [x] **Q: Has the Zoom meeting already started? I'm just getting "The meeting has not started" -note.**
- A: Make sure you have copied the full Zoom link when connecting! It's long and contains the password :)
- [x] **Q: How can I get access to Puhti?**
- A: You will get a training account for the course. The password is given in the first Zoom session. Do not share the credentials. The training account is to be used only during the course and only for course purposes.
### Questions for Monday
- [x] **Q: I am interested in setting up a CG model of a protein and running plain MD with the Martini FF**
- Tutorials can be found on the Martini web page: http://cgmartini.nl
- [x] **Q: I would like to know the basics for setting up a steered MD in GROMACS**
- **Q: Hello everyone, just two items I would like to tackle for now**
- [x] Learn to use BioBB to be able to automate the setting of simulations
- This will be a topic for Tuesday, please also see the tutorial links on the registration page!
- [x] How to make enhanced sampling methods statistically rigorous
- See below (and Berk's slides) for discussion on AWH
- [ ] **Q: I am interested in analyzing peptide stability and calculation of binding free energy changes upon binding for both ligand-protein and peptide-protein complex formation.**
- Do you have a specific question related to these, or did the topics on Monday cover these?
- [x] **Q: I am interested in simulating complex systems in GROMACS and QM/MM MD for enzyme reactions in CP2K.**
- There will be a tutorial on this kind of thing this afternoon. For big and complex systems (such as enzymes) you will probably need much more computing time, but the general methods are the same as in the tutorial.
- [x] **Q: Interested in Dr. Morozov's work. Would like to learn enhanced sampling methods other than umbrella sampling and metadynamics. GPUs are cool to run on; it would be good to learn more about getting the most from them.**
- Thanks for your interest! Hope you will get new insights after the Workshop.
- [x] **Q: I'm interested in resolving conflicting distance restraints. (A fast way to test sets of distance restraints?)**
- [x] **Q: Can one use Umbrella sampling to sample the energy landscape first and then somehow use MSM to build the macrostates?**
- If there are multiple paths, you might want to generate them first and MSM would hence not help a lot
- [x] **Q: Maybe the TRAM family of methods help here? (see https://www.pnas.org/content/113/23/E3221)**
- Yes
- [x] **Q: Are there plans to include the possibility to bias rotation in GROMACS (say a peptide) to extract PMF?**
- There is some functionality for this already
- [x] **Q: Can we calculate the free energy required for conformational change in the protein via umbrella sampling? e.g. Free energy for switching between two conformations in the loop.**
- Yes! But this might be tricky, as loops can have many different conformations and umbrella sampling can fix the conformation.
- [x] **Q: Can you use a generalized coordinate for AWH in gromacs? Like in essential dynamics? I mean something like an eigenvector from PCA, as an example.**
- GROMACS 2022 will have transformation pull coordinates (collective variables) that could partially cover your needs.
- [x] **Q: Is there any internally implemented way to calculate the consistency error / bootstrap error for the PMF, à la umbrella sampling?**
- The only good way to compute error estimates is to run multiple independent AWH simulations, for example 4, and compute a standard error estimate from that.
- [x] **Q: Is there a sign that tells the force constant for AWH was chosen too small?**
- The coordinate distribution and the reference value distribution should agree; if they do not, there will be problems crossing barriers. The PMF and the coordinate bias should also agree. These distributions are useful diagnostics.
- Your system will be stuck in minima; that is how you will immediately see that something is going wrong. We will discuss that in the QM/MM tutorial. But be careful: your MD time step should be small enough to handle a high force constant (the vibration period along your coordinate with the applied potential should be at least 10 times longer than your MD time step).
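
The "period at least 10 times the time step" rule of thumb can be checked with a quick estimate. The sketch below assumes a purely harmonic bias on a distance coordinate and ignores the curvature of the underlying free energy; the function names and the example values are hypothetical and not taken from the tutorial.

```python
import math

def restraint_period_ps(force_constant, reduced_mass):
    """Period (ps) of a harmonic oscillator.

    force_constant: kJ mol^-1 nm^-2, reduced_mass: u.
    In GROMACS units 1 kJ/mol = 1 u nm^2 ps^-2, so no conversion is needed.
    """
    return 2.0 * math.pi * math.sqrt(reduced_mass / force_constant)

def timestep_is_safe(force_constant, reduced_mass, dt_ps, min_ratio=10.0):
    """True if the period is at least min_ratio times the MD time step."""
    return restraint_period_ps(force_constant, reduced_mass) >= min_ratio * dt_ps

# Hypothetical example values (assumptions, not tutorial settings):
k = 300000.0  # kJ mol^-1 nm^-2
mu = 6.0      # u, reduced mass of two carbon atoms
period = restraint_period_ps(k, mu)
print(f"period = {period * 1000:.1f} fs")
print("dt = 1 fs safe:", timestep_is_safe(k, mu, 0.001))
print("dt = 4 fs safe:", timestep_is_safe(k, mu, 0.004))
```

If the check fails, either the force constant or the time step would need to be reduced.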
- [x] **Q: Is there an intuitive physical relation for the force constant (e.g. as high as possible as long as the dt allows it)?**
- I think it's an educated guess for the underlying free energy barrier generally. Larger constants will allow you to constrain more in regions of high "steepness"
- [x] **Q: So it seems that the AWH outputs (PMF & friction) could be combined to get permeability coefficients based on the inhomogeneous solubility-diffusion model, right?**
- In principle yes, but depends on the details.
- [x] **Q: How does AWH scale across multiple nodes? Since it seems like a viable alternative for metadynamics, this could be a key factor. ADDITION: how costly is the communication of a single walker information between nodes (horribly slow in PLUMED)?**
- AWH with multiple walkers is trivially parallel, meaning that it can gain a lot from using a large number of nodes. Walkers only communicate during the update step (by default once per 100 MD steps), so their communication is negligible in practice.
- [x] **Q: Is it possible to use AWH parameters with CP2K and for QM/MM calculations? ((maybe it is stupid question but I have no idea :) thank u in advance)). ( I am trying to understand some drug interaction with DNA)**
- Yes, this is possible; moreover, it works great! We will discuss that in the GROMACS-CP2K tutorial.
- [x] **Q: Is it possible to use distances between center of masses in AWH? Or does it require something like PLUMED to define such variables?**
- Yes, you can use the distance between centers of mass as a reaction coordinate in AWH. Note that you can use any available reaction coordinate with AWH.
- By default, GROMACS pulls on the centers of mass of the index groups.
- [x] **Q: Could you recommend a publication showcasing the application of AWH for alchemical transformations?**
- [x] BioExcel webinar on AWH for alchemical transformations: https://youtu.be/E5nGLcbyqTQ
- [x] publication M. Lundborg, J. Lidmar and B Hess doi:10.1063/5.0044352
- [x] **Q: How does the efficiency of AWH compare to umbrella sampling and metadynamics?**
- Umbrella sampling will be faster along a slow reaction coordinate when there is a single, well defined pathway. Otherwise a dynamic method like AWH will be more efficient and sample all relevant paths.
- AWH skips a lot of the oversampling that typically occurs at some umbrella sampling points. The speedup can be up to 4x.
- Thanks!
- [x] **Q: How would we quantify an estimate of error along the awh outputs (specifically the PMF)? Is there an output that provides estimated error or something that can be used to derive one?**
- You can follow how the PMF evolves over time for a rough indication of the error. The only robust way to compute error estimates is to run multiple independent AWH simulations and compute the standard error estimate. To compute errors along the PMF, take two reaction coordinate values, compute the difference in PMF between them, and compute the error estimate of that difference.
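
The procedure described in this answer can be sketched in a few lines. This is a minimal illustration, assuming the PMF of each independent run has already been read onto a common reaction-coordinate grid; the function names and the numbers are made up for the example.

```python
import math
import statistics

def standard_error(values):
    """Standard error of the mean from independent estimates."""
    return statistics.stdev(values) / math.sqrt(len(values))

def pmf_difference_with_error(pmfs, i, j):
    """Mean and standard error of PMF[j] - PMF[i] over independent runs.

    pmfs: one PMF profile (list of floats) per run, all on the same grid.
    Taking the difference within each run removes the arbitrary offset
    of each PMF before averaging.
    """
    diffs = [pmf[j] - pmf[i] for pmf in pmfs]
    return statistics.mean(diffs), standard_error(diffs)

# Made-up PMF values (kJ/mol) from 4 hypothetical independent AWH runs
# on a 3-point reaction-coordinate grid.
runs = [
    [0.0, 10.0, 4.0],
    [0.0, 12.0, 5.0],
    [0.0, 11.0, 3.0],
    [0.0, 11.0, 4.0],
]
mean_barrier, err_barrier = pmf_difference_with_error(runs, 0, 1)
print(f"barrier = {mean_barrier:.2f} +/- {err_barrier:.2f} kJ/mol")
```

With only a handful of runs the standard error is itself uncertain, so it is best read as a rough indication.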
- [x] **Q: Why do we need to write output more frequently in the multi-walker scenario?**
- Strictly speaking you don't need to. But with N walkers you sample roughly N times faster, so you would normally also want output N times as frequent.
- Thanks!
- [x] **Q: How can we retain access to the notebooks post workshop?**
- The materials will be linked to this page (slides, etc.). The notebooks will be available on Puhti at least for a while. We'll also link their source here for reference. (like, AWH: https://gitlab.com/gromacs/online-tutorials/awh-tutorial/-/archive/main/awh-tutorial-main.zip)
- For the AWH tutorial and other GROMACS tutorials, see tutorials.gromacs.org (all the Jupyter notebooks can be run online thanks to mybinder.org, or locally by following the instructions).
- For GROMACS-CP2K: https://github.com/bioexcel/2022-02-07-CSC-gromacs-cp2k-tutorial
- For BioBB: https://mmb.irbbarcelona.org/biobb/workflows collection of demonstration workflows with links to BioExcel-Binder.
- [x] **Q: gmx_cp2k is only available for GROMACS 2022, isn't it?**
- A release candidate is out, see https://ftp.gromacs.org/gromacs/gromacs-2022-rc1.tar.gz. The final release is planned for the end of February. See the GROMACS announcements for updates (https://gromacs.bioexcel.eu/c/gromacs-announcements/7).
- [x] **Q: How should I choose a specific QM method for my problem? is there a convenient database to select from or does it involve a lot of literature review?**
- QM/MM will probably not become a black box, so you should know what to choose. Some good defaults/suggestions will be provided.
- [x] **Q: What about the mdp and other files? Would we have access to them in case trying to follow the tutorials later on?**
- Sure. We'll try to link all the materials to the Agenda so it's easy to locate.
- [x] **Q: Is it possible to use CP2K for simulating a periodic system with a non-periodic system?**
- No, you would have to make the whole system periodic.
- [x] **Q: The Pull coordinate force constant is very high?! Is it so because covalent bonds are very strong?**
- Yes. If you see empty windows, your force constant is likely too soft.
- [x] **Q: Does the CP2K v8.2 package have GROMACS QM/MM support, or do we need to update to CP2K v9.1? Is it possible to update an existing CP2K v8.2 installation, or do we need to download the new CP2K v9.1?**
- Both versions are fine, but with 9.1 it will be simpler to install the interface.
### Questions for Tuesday
- [x] **Q: I am interested in setting up the BioBB workflow for a membrane protein simulation.**
- Right now we don't have any workflow available for that, and some new tools would have to be integrated.
- [x] **Q: What enhanced sampling techniques are available in BioBB?**
- None, at the moment.
- [x] **Q: Do you have or have you planned to create a repository with customized workflows (beyond those provided by you)?**
- Yes. Actually we are about to launch a BioBB Workflows website: https://mmb.irbbarcelona.org/biobb-wfs/
- As for a repository, you can find here all the current workflows in Jupyter Notebook: https://mmb.irbbarcelona.org/biobb/workflows
- [x] **Q: Do you have any plans for adding AlphaFold into BioBB?**
- There is still no support for AlphaFold in the biobb_io package, but in the BioBB Workflows website you can run a structure workflow starting from AlphaFold: https://mmb.irbbarcelona.org/biobb-wfs/structure/step1#alphafold
- [x] **Q: Is there a functionality to add membranes during the building phase?**
- Yes, via MemProtMD. Requires that your protein is listed there. Otherwise you need to set up a protein-membrane system yourself. This functionality is in the pipeline, though.
- Please take a look at https://biobb-io.readthedocs.io/en/latest/readme.html
- [x] **Q: Is enhanced sampling supported by BioBB? And can we use a user defined function as a CV if we perform simulations using BioBB**
- We are thinking about integrating PLUMED.
- [x] **Q: Is it possible to fix missing loop regions in PDB structures?**
- It is possible to offload this to Modeller, if you have the license.
- Please take a look at https://biobb-model.readthedocs.io/en/latest/readme.html
- [x] **Q: Is there a functionality to use AWH / Pull code from bioBB interface or is that still through the MDP options in native gromacs?**
- Not yet, but it's in the roadmap
- [x] **Q: Will the QM/MM functionality coming with Gromacs 2022 be integrated in BioBB? Additionally are there any plans to add any individual QM packages?**
- It's in the roadmap
- [x] **Q: Do you have anything to offer for assessing the protonation states of DNA? For proteins there are, for example, the H++ server and the PROPKA software, but what about DNA?**
- Sorry, we don't have this in our plan. Can I ask what is the project you are running where this functionality would be useful?
- [x] **Q: Do you have suggestions for downloading CP2K? I mean some trustable video? (sometimes I get lost in the texts and I am missing some part) (my question for Dmitry Morozov :) )**
- As far as I know, unfortunately, there is no such video. Installing software on clusters is always a try-and-see approach. My suggestion is to try to compile it yourself for your particular cluster and/or ask the administrators for help.
- [x] **Q: Can you comment on the utilities in NGLview? Is it mature enough to create publication quality images?**
- It depends on the quality you need. You can export images up to 8x the size of the image on the screen. However, not all the items you see on the screen are always exported; e.g. you can't export labels.
- [x] **Q: How about coloring different parts per some model you want to highlight? (NGLView question following the previous one)**
- That's easy to do: just create a different representation with a different color for each model. Take a look at the NGL Viewer manual: https://nglviewer.org/ngl/api/manual/index.html (it is written for NGL Viewer, but NGLView is similar). You can even embed JS code in your Jupyter Notebook for more complex representations.
- [X] **Q: Upon installation of BioBB, is it possible to select the GROMACS version? And is GROMACS 2022 already available with BioBB?**
- It is possible to use any version of GROMACS with BioBB. You just have to point the property
*gmxpath (str) - (“gmx”) Path to the GROMACS executable binary.*
to the binary that you want to use. However, if the conda installer is used, the default version will be GROMACS 2019.1.
- [X] **Q: And when will GROMACS 2022 be available from the conda installers?**
- You can always install the latest Gromacs from the conda installer (well, not really true, now the [Conda package for Gromacs](https://anaconda.org/bioconda/gromacs) contains the version 2021.3... They are trying to catch up with the latest versions, but sometimes there's a delay... :) )
- [X] **Q: Is it possible to import only one chain from a PDB file if there are multiple chains? E.g. if there are two chains, A and B, and I need only one chain to run the simulation.**
- Yes, you have the biobb_structure_utils package with several tools to edit PDB files: https://github.com/bioexcel/biobb_structure_utils/tree/master/biobb_structure_utils/utils As you can see, one of them is *extract_chain*
- [x] **Q: Is there any module to load local gro files?**
- You can use gro files with most of the GROMACS tools.
- [x] **Q: Can the workflow also fix missing backbone atoms?**
- Yes, with the biobb_model.fix_side_chain tool
- [x] **Q: Does that also fix the backbone?**
- You can use [biobb_model.fix_backbone](https://biobb-model.readthedocs.io/en/latest/model.html#module-model.fix_backbone) for backbone atoms.
- [x] **Q: Is it possible to add a certain number of water molecules apart from adding water layer around the solute?**
- We are working on a new module (biobb_cmip) that includes this possibility
- [x] **Q: Is there an interface to add restraints / constraints?**
- We have a block to generate restraints [genrestr](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.genrestr) but at the moment we don't have any GUI to generate restraints.
- [x] **Q: What’s the benefit of using gmx genion (needs a tpr file) vs. using gmx insert-molecules that just needs the structure (gro)?**
- Did genion evaluate the energy of insertion to place ions on energetically favourable positions?
- [x] **Q: Different force fields have very different mdp files, how is this taken into account?**
- Our mdp presets are just “best effort” values but you can always use your own mdp file or overwrite any mdp parameter from the properties dict
- [x] **Q: Does BioBB also have compilation of various analysis tools required for analysing membrane simulation e.g., order parameter, tilt angle, density profile etc.**
- Not at this moment.
- Will be included with a possible biobb membrane module.
- [x] **Q: How should we allocate the CSC HPC resources if we want to run our own simulations using the details shown by Adam?**
- Well, this is a good question! For a small setup, I think using the one node container is good. For real, parallelized production work we're still looking for a good solution. A conda installation on Lustre will be very slow and also disturb the performance of Puhti in general. Also, we'd like to be sure we're not launching too many Slurm jobs (or job steps) as they burden the Slurm scheduling/accounting system. Workflows are still a fairly new thing in massively parallel computing, and new bottlenecks are being discovered. A current rule of thumb at CSC would be that if you launch hundreds of jobs (or job steps) a day, you might need to rethink how you're doing it. Please contact us and we'll look for a solution together. In practice, use cases tend to be different.
- [x] **Q:what is the difference between BioBB and gmxapi in terms of implementation?**
- BioBB is not GROMACS-specific. It is designed to build portable and reproducible workflows (packaging the whole workflow + dependencies with Conda), with adapters to workflow managers like PyCOMPSs, Common Workflow Language (CWL) and Galaxy.
- gmxapi is motivated primarily as an attempt to establish a roadmap for more efficient integration with such managers. gmxapi includes some basic workflow and data flow management, but it is more oriented towards being executed by Python-native workflow software (rather than providing such functionality). However, gmxapi also attempts to demonstrate that a user interface that seems more natural to domain specific tools is possible, while preserving the abstractions necessary to optimize work scheduling and data transfer in a middleware layer. In short, gmxapi *lives closer* to the GROMACS library and provides a lower level interface than BioBB.
- [x] **Q: Is there any significance for using `cmd.output.keys_list.result()` and `cmd.output.values_list.result()` instead of returning the lists by using `cmd.result()`?**
- The short answer is that it is not clear what `type` should be returned from such a call. One of my near-term interests is in updating the way Operations are expressed so that it is easy and obvious how to define an Operation in conjunction with its natural return type.
- [x] **Q: Can GMX api be used to modify mdrun behavior?**
- Before run time, `gmxapi.modify_input` can edit some simulation parameters. In the coming development cycle, it should also be able to edit much more of the simulation input, such as replacing coordinates from numpy arrays.
- The plugin framework illustrated in https://gitlab.com/gromacs/gromacs/-/tree/master/python_packaging/sample_restraint can apply forces or cause the simulation to stop according to user-provided code. Sample code is in C++, but it also illustrates how to call out to arbitrary Python functions.
- [x] **Q: Is there any way to dynamically link the installed binary? Eg.: I have two installations for gmx (2018.6 and 2021.4). I use each for different projects as per the requirement.**
- The initial design decision was to err on the side of preventing accidental use of an unintended GROMACS installation. The idea is that if you want to use a different tool kit, use a different Python `venv`. This is potentially problematic if, for some reason, you need multiple GROMACS installations for a single workflow. There is active discussion in this area. See https://gitlab.com/gromacs/gromacs/-/issues/4334 and https://gitlab.com/gromacs/gromacs/-/issues/4335
- [x] **Q: Can GMX api provide a way to hack into gmx tools? For example to modify gmx select or gmx insert-molecules to add some advanced functionality?**
- Unfortunately, this is blocked on additional C++ development to redefine the ways that libgromacs modules express their interfaces in a less commandline-centric way (and less coupled to filesystem I/O). We hope that some of the simulation input preparation tools will have real Python bindings in next year's release. For the moment, the new option to provide *stdin* to *commandline_wrapper* will help a little bit.
- [x] **Q:Is it possible to access the mdrun memory while it runs? So you could inject without stopping and obtaining results? I feel like you couldn't but maybe I am wrong here.**
- We don't currently provide a call-back feature, but that seems like a plausible future feature. Currently, we try to make it easy to write C++ code that can be attached to an (unmodified) GROMACS installation at run time, so you can run efficient code during the MD loop and only call out to Python if you absolutely have to. See https://gitlab.com/gromacs/gromacs/-/tree/master/python_packaging/sample_restraint and https://doi.org/10.1093/bioinformatics/bty484
- [x] **Q: What about compiled python like numba? Would that be runnable within gmxapi.mdrun?**
- The sample_restraint code linked from the paper and available in the gromacs repo (https://gitlab.com/gromacs/gromacs/-/tree/master/python_packaging/sample_restraint/src) illustrates that the C++ code can call out to Python fairly easily, if needed. But the main goal was to guarantee that researchers can attach external code to gromacs with close to zero overhead. Our primary example cases were to demonstrate that research projects that previously required thousands of lines of C++ to patch gromacs in unsustainable ways could be implemented in just a few dozen lines of C++ and Python with (nearly) immeasurable performance cost.
- [x] **Q: How should simulations be set up to be interrupted at the Python level? The example used `-maxh`... has `-nsteps=-1` been completely dropped?**
- I honestly don't know off the top of my head. (The core gromacs devs have passionate and differing opinions on that.)
As an added convenience, gmxapi 0.3 now allows arbitrary command line arguments to be passed (whatever is supported by the current gromacs installation). But the long term conservative approach is leaning towards finding ways to pre-define simulation parameters so that each invocation of mdrun is well-defined. In other words, we're leaning towards extending simulations one chunk at a time rather than running indefinitely and interrupting. But it's an ongoing discussion.
- [x] **Q: Should we always define `qmmm-cp2k-active = true` in the input to activate the QM/MM MDModule?**
- `qmmm-cp2k-active = true` activates the QM/MM module; by default it is false (not active).
-----
### Questions for Wednesday
- [x] **Q: What are the options (and possible behaviors) for randomized velocity generation? When are the velocities generated and under what circumstances might the initially generated velocities not get used? E.g. how do we deterministically prescribe an ensemble of initial randomized (MB) velocities?**
- the following happens in `grompp`
* If `gen-vel` is not specified, it is interpreted as "no", and `grompp` does not generate velocities.
* If `gen-seed` is not specified, the random number seed defaults to `-1`.
* If the seed is `-1`, GROMACS does its best to randomly generate a (64-bit unsigned integer) *seed*. This replacement seed value should appear in the output mdp file and the grompp logging output, but not in the TPR file.
* If the seed is `0`, it is randomized, and the new value is reported on the terminal, but not recorded in the generated output file.
* If `gen-vel` is "yes", `grompp` reproducibly generates pseudo-random velocities for the seed and `gen-temp` (default 300K) and writes the velocities to the TPR file.
- The `-multi` option to `mdrun` was removed in the 2019 release, and so the confusing ambiguities I mentioned about re-randomizing velocities should no longer be a concern. The `-multidir` option requires separate input files for each simulation.
- gmxapi 0.3 does not provide a way to replace the generated velocities after grompp has been run, but the next update to `gmxapi.modify_input()` should allow numpy arrays to be provided to replace positions or velocities.
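
One way to prescribe a reproducible ensemble of initial velocities is to give every ensemble member its own fixed `gen-seed` before running `grompp`. The sketch below only generates the mdp lines; the helper names and the seed scheme are hypothetical, and restricting seeds to positive values is a conservative assumption based on the `-1` and `0` cases described above.

```python
def velocity_mdp_lines(seed, temperature=300.0):
    """mdp lines asking grompp to generate velocities from a fixed seed."""
    if seed <= 0:
        # -1 and 0 request a randomized seed, which is not reproducible.
        raise ValueError("use a positive seed for reproducible velocities")
    return [
        "gen-vel  = yes",
        f"gen-temp = {temperature:g}",
        f"gen-seed = {seed}",
    ]

def ensemble_mdp_fragments(n_members, base_seed=1000):
    """One mdp fragment per ensemble member, each with its own fixed seed."""
    return ["\n".join(velocity_mdp_lines(base_seed + i)) for i in range(n_members)]

for member, fragment in enumerate(ensemble_mdp_fragments(3)):
    print(f"; member {member}")
    print(fragment)
```

Each fragment would be appended to the mdp file of one member, for example one per `-multidir` directory, so that rerunning `grompp` reproduces the same starting velocities.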
- [x] **Q: What will be better option? To use 20cpu cores and 2 GPU or 1 full node with 4GPUs, which will work more efficiently?...**
- If you need a single trajectory to be as long as possible, the second option will be better (given the system is large enough to populate 4 GPUs). If you need many trajectories or an ensemble, the first option will be better (or even running several trajectories per GPU).
- [x] **Q: How does -multidir treat the end of the simulation: do all trajectories end at the same simulation time (if they proceed at slightly different speeds) or at the same moment? (i.e. do some cores idle at the end)**
- You will have separate log files for each trajectory in the respective folder. The ns/day number should be similar there (but not identical; trajectories can be slightly out of sync).
- [x] **Q: How to deal with DLB issues when using more than 1 OpenMP rank?**
- DLB checks the time it takes to run one step and makes assumptions based on it.
- [x] **Q: Some codes like Amber and OpenMM run purely on a GPU. Why do you want to keep the CPUs involved? GPU-only runs would make sense on a multi-GPU workstation.**
- It depends on the GPU/CPU balance. Now, even with 4 GPUs, you can have as many as 32 physical cores per GPU (like on Mahti). It is a waste not to use them.
- [x] **Q: What about doing regular -tunepme runs, say once every hour during a long simulation? I've sometimes witnessed weird tuning results (perhaps due to temporary network traffic?), and this can then decrease the performance of the entire run. Alternatively, if the system evolves a lot, the tuning might provide a different solution later into the simulation.**
- This is a very nice idea. We are currently trying to improve PME tuning and load balancing in general. It is very complicated with modern CPUs and the heterogeneity of using GPUs as well (e.g. some steps can be significantly slower because coordinates are needed on the CPU for some reason).
- [x] **Q: '-update gpu' has a major effect on performance, but doesn't support Nose-Hoover (which is the recommended thermostat for C36). Any plans to add this support?**
- I don't know of plans to support this. But a Bussi thermostat (v-rescale in GROMACS) is a better thermostat. I think the recommendation for C36 is either out of date or meant for CHARMM, which might not support Bussi (but I don't know much about the CHARMM package).
Info: GROMACS 2022 BioExcel webinar: https://bioexcel.eu/webinar-whats-new-in-gromacs-2022-2022-02-22/
- [x] **Q: If only 1 GPU Gromacs simulations on LUMI are possible (e.g. at the start) what would be the best approach to sample a very large system? (~2M particles)**
- It is possible to run on more than one GPU with SYCL, not yet 100% optimized though. Since you'll anyway need several trajectories you can either use AWH or individual simulations. For large systems using multiple GPUs can be advantageous.
- [ ] **Q: I am interested in running a series of steered MD simulations, simulating the passage of a small molecule through a lipid double layer. On which timescale are you advising to approach this? And can you advise any script I could start with for optimization? Thanks in advance and sorry for a probably banal question.**
- The GROMACS pull code allows you to control the distance of a molecule to the center of a bilayer using the "cylinder" geometry option. You can fix the distance and look at conformations. You can fix it at several distances, do umbrella sampling and get a PMF using WHAM. There are a lot of papers on that. You can also use AWH.
:::info
**This is the end of the document, WRITE ABOVE THIS LINE ^^**
HackMD can feel slow if tens of participants are editing at the same time: If you do not need to write, please switch to "view mode" by clicking the eye icon on top left :eye:
:::