# autoD3
This document:
Repository: https://github.com/Rafael-G-C/pyDFTD3
We meet at: https://meet.jit.si/autod3
Manuscript: https://www.overleaf.com/2279434875gbzjvjhgyrkz
## 2022-03-16
- We discussed where to move the repository. We agreed to stay on GitHub (better integrations) but we don't know if it fits well under the `openrsp` organization.
### TODOs for next meeting
- [ ] **Radovan** describe current API
- [x] **Magnus** describe desired API
## 2022-02-22
- Minutes:
- We are all in agreement about going for JOSS
- We will base ourselves on the reviewer checklist of this journal to plan our work
- A quick discussion suggested that we can get pretty far with a) having a well-produced README file reflect our work, give technical background, user instructions etc., and b) giving some thought to the API and how we can make it nice and user-friendly.
- It may be wisest to consider b) first, at least before writing elaborate user guides in a)
- We have a new meeting in a while to make a more detailed division of work
### TODOs for next meeting
- [X] **Magnus** Make "skeleton" of README file, maybe get started on technical background
- [x] **Magnus and others** Think about API and what it should offer
## 2022-02-16
- Main point for minutes: We are discussing switching to a more software-focused journal like JOSS:
- https://joss.readthedocs.io/en/latest/review_checklist.html
- https://github.com/openjournals/joss-reviews/issues
- https://joss.theoj.org/papers/published
- Discuss new results, system?, journal
- Update on grid weight derivatives
### TODOs for next meeting
- [x] **Magnus** Update everyone, if green light then set new meeting to plan remaining tasks. Use JOSS reviewer checklist to guide this work.
## 2021-11-22
## 2021-11-08
### TODOs for next meeting
- [x] **Magnus** Complete num diff stuff for cubic, quartic force constants
- [ ] **Rafael** Small fix for script, then D3 versus no D3 spectroscopy calculations for water dimer, move on to methane dimer
## 2021-10-25
- Debugging of spectroscopy
### TODOs for next meeting
- [x] **Rafael** Tensor reader/writer, add D3 corrections to OpenRSP tensor
- [x] **Roberto, Magnus** Assist with debugging
## 2021-10-18
- Debugging of spectroscopy. The problem was that we saved the data to plot at too fine a resolution.
### TODOs for next meeting
- [ ] Teach D3 code to output results in `rsp_tensor` format.
## 2021-09-27
- Autodiff section on manuscript done, but will need to be polished/shortened.
- Num diff done but needs better grid settings
- Norway now at decreased alert level, travel regulations could be clarified soon
### TODOs for next meeting
- [x] **Magnus** Assist with calculations, do an iteration on manuscript, num diff with better grid
- [ ] **Radovan** Read through manuscript and edit/add
## 2021-09-06
- Just getting back on track after summer: Testing, calculations, manuscript
- How to bypass stop: https://gitlab.com/dalton/lsdalton/-/merge_requests/183
- There should be a `master` branch image.
- To run a singularity container with an env-var you can do:
```
$ singularity run --env VAR="value" image.sif
```
### TODOs for next meeting
- [x] **Magnus(, Roberto, Radovan)** Get the CFF and QFF with DFT (B3LYP) tested (see notes above, when block is on look for error msg to tell how to fix, tell Radovan if not clear)
- [x] **Radovan** prepare SLURM batch script to submit LSDALTON jobs to the queue.
- see chat
- [x] **Radovan** connect the new code to testing, coordinate with Rafael for running
- https://github.com/Rafael-G-C/pyDFTD3/pull/16
## 2021-07-15
Travel: Postponing until vaccination or easing of restrictions, Rafael will keep posted about own vaccination progress. Also we won't forget about the possibility of meeting all of us in Stockholm if that will be possible sooner than the alternative.
Next meeting: To be decided later (tentative second half of August)
## 2021-07-01
### TODOs for next meeting
- [x] **Magnus** Num diff of DFT for cubic, quartic force constants (master branch of LSDalton) to test this functionality
- [x] **Magnus** Contact Kenneth and Stig about, respectively, entry restrictions to Norway and booking of stay
- [ ] **Rafael** Get started with some first calculations
Useful resources to figure out travel restrictions:
- https://reopen.europa.eu/en
- https://canitravel.net/
## 2021-06-14
:::info
### Items to discuss
* Travel dates: 25/8 to 10/10 - speculative plan B is to have the visit take place in Stockholm and Magnus goes there too
### TODOs for next meeting
- [x] **Rafael** Find one or a few sensible air travel options
- [ ] **Radovan** prepare SLURM batch script to submit LSDALTON jobs to the queue.
- [x] **Roberto** prepare LSDALTON Singularity image off of the `master` branch.
- [ ] **Magnus** Num diff of DFT for cubic, quartic force constants (master branch of LSDalton) to test this functionality
- [ ] **Radovan** connect the new code to testing
:::
:::info
### Items to discuss
* Next steps
* LSDalton Singularity (has room for improvement): https://gitlab.com/dalton/singularity
- This is what Stig Rune has for MRChem (MPI+OpenMP and OpenMP only) He uses a Python module to generate the recipes. https://github.com/MRChemSoft/mrchem-singularity
### TODOs for next meeting
- [x] **Roberto, (Radovan)** Get a singularity image with OpenMP support to run on Fram
- (Roberto) I have a meeting with Simen tomorrow, maybe we have aligned interests and we can work together.
- (Roberto) I also prepared the MPI+OpenMP image, but this doesn't work with OpenRSP, right?
- [ ] **Radovan** prepare SLURM batch script to submit LSDALTON jobs to the queue.
- [ ] **Rafael** Acquaint with geo optimization and try to run locally, try to work on scripts to process data into format we want (combined OpenRSP + our D3 results) - ask Magnus if any problems
- [ ] **Rafael** Find suitable travel dates (don't worry about ticket prices) within around one week
- [ ] **Radovan** connect the new code to testing
:::
## 2021-05-03
- I have added a draft of the contributions' table following this: https://www.natureindex.com/news-blog/researchers-embracing-visual-tools-contribution-matrix-give-fair-credit-authors-scientific-papers
:::info
### Items to discuss
* Move the code to a more "institutional" organization. I suggest we move it under `openrsp`.
* How to integrate Radovan's rewrite. https://github.com/bast/d3
- We should absolutely ditch my work on `multiprocessing` since it's slower than Radovan's approach.
- Pros: Using `jacfwd` (and/or `jacrev`) instead of `grad` lets us compute all derivatives up to and including a certain order: asking for 3rd order, we can get 2nd and 1st orders without bookkeeping!
- Cons: asking for one single element at a given order might not be as straightforward.
* Travel: I got the OK for booking tickets for September - accommodation can be settled by Stig (office manager)
:::
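
A minimal sketch of the `jacfwd` point above, assuming a toy scalar function in place of the D3 energy (`toy_energy` and `derivatives_up_to` are hypothetical names, not part of `pyDFTD3` or Radovan's rewrite). Nesting `jax.jacfwd` yields the full derivative tensor at each order, so the lower orders come out of the same loop:

```python
import jax
import jax.numpy as jnp

# Double precision, since high-order derivatives are sensitive to round-off.
jax.config.update("jax_enable_x64", True)


def toy_energy(x):
    # Hypothetical stand-in for the D3 energy: a smooth scalar function
    # of a flat coordinate vector.
    return jnp.sum(x ** 4) + jnp.prod(x)


def derivatives_up_to(f, x, max_order):
    """Return [f(x), gradient, Hessian, ...] as full tensors up to max_order."""
    results = [f(x)]
    g = f
    for _ in range(max_order):
        g = jax.jacfwd(g)  # each nesting adds one tensor index
        results.append(g(x))
    return results


x = jnp.array([1.0, 2.0, 3.0])
tensors = derivatives_up_to(toy_energy, x, 3)
for order, tensor in enumerate(tensors):
    print(order, tensor.shape)
```

Each order is evaluated by its own call here, so this illustrates the convenience rather than any saving in cost. Picking out one single element still means building the whole tensor and indexing into it, which is the drawback noted in the cons.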
:::info
### TODOs for next meeting
- [x] **Magnus** Adapt num diff functionality around Radovan's rewritten code
- [x] **Radovan** open a draft pull request towards Rafael's repo with code additions (later we take care of I/O, functionals, CLI, code deletions)
- [x] **Rafael** Anharmonic calculations, test logging
:::
## 2021-04-21
- Travel: Summer booking not forbidden but discouraged, will also be more people in office come fall so better chance to get to know people, quarantine lifting unlikely during summer, not sure if entry will be allowed but less unlikely than quarantine lifting
- DECISION: We plan to do the travel in September, Magnus will get final green-light to book
- Manuscript: Skeleton at https://www.overleaf.com/2279434875gbzjvjhgyrkz
- Multiprocessing for numdiff: Simple implementation made
- Radovan would like to look into improving the algorithm from N^2 to something better, but is unsure when he will get to look at it
- First calculation candidate: Methane dimer, then possibly other strong dispersion systems from the parallelogram figure. Also consider nucleobase dimers from Barone paper (for numerical vs autodiff accuracy comparison)
:::info
### TODOs for next meeting
- [ ] **Roberto** Write autodiff section of paper draft
- [ ] **Roberto** More benchmarking of higher-order derivatives
- [x] **Magnus** Get final OK for fall travel for Rafael
- [ ] **Magnus** Tidy up parallelized numdiff functionality and submit PR with it
- [ ] **Rafael** Try to get started with methane dimer, in the beginning only do geo. optimization, basis set cc-pVTZ, theory DFT/B3LYP
- [x] **Radovan** check whether energy algorithm can be optimized and whether any optimization would carry over also to derivatives
:::
## 2021-04-07
- Technical updates from previous TODOs
- Optimization target: D3 gradient and derivatives no more than 5-10 x more expensive than corresponding nuc. repulsion derivatives. Main avenues: Optimize D3 code itself, parallelize. Secondary: Optimize the JAXification process.
- Travel options
- See TODOs
- JPCL paper
- We are not too concerned about our current performance
- Plan our own paper
- For publication: Have the QM code and our code produce data separately and print to file, then combine and add them, we bring the OpenRSP data into a script as a numpy array
- We want to publish sooner rather than later due to other groups also sniffing around in this field
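
The publication workflow noted above (QM code and D3 code each produce their data, a script adds them) could look roughly like this. It is a sketch only: `combine_contributions` is a hypothetical helper, the numbers are made up, and reading the actual OpenRSP output into an array is left out.

```python
import numpy as np


def combine_contributions(qm_tensor, d3_tensor):
    """Add a D3 derivative tensor to the matching QM tensor, element by element."""
    qm = np.asarray(qm_tensor, dtype=float)
    d3 = np.asarray(d3_tensor, dtype=float)
    if qm.shape != d3.shape:
        # Guard against mixing tensors of different order or atom count.
        raise ValueError(f"shape mismatch: QM {qm.shape} vs D3 {d3.shape}")
    return qm + d3


# Made-up numbers for a two-atom gradient stored as a (3N,) vector.
qm_gradient = np.array([0.0, 0.0, 0.012, 0.0, 0.0, -0.012])
d3_gradient = np.array([0.0, 0.0, -0.001, 0.0, 0.0, 0.001])
total_gradient = combine_contributions(qm_gradient, d3_gradient)
print(total_gradient)
```

The shape check is the only real logic: both codes must agree on atom ordering, units, and tensor layout before the sum means anything.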
:::info
### TODOs for next meeting
- [ ] **Roberto**, **Rafael** Sherrill group: Lots of papers about dispersion-affected systems, try to find a small system to run for our paper.
- Bringo? https://pubs.acs.org/doi/10.1021/acs.jctc.8b00114
- [x] **Magnus** Contact head of office abt travel, Rafael is fine with later travel, is fine with quarantine, can stretch visit to 6 wks in case of quarantine
- [x] **Magnus**, **Rafael** Start Overleaf manuscript (Magnus), write "skeleton", Rafael gets acquainted with this and thinks about which parts to take charge of
- [x] **Magnus** Parallelize the num diff functionality
- [x] **Radovan** (Roberto did the work) Profile the D3 code for a larger system
:::
## 2021-03-16
:::info
### TODOs for next meeting
- [x] **Magnus** Follow up with Kenneth about travel costs
- [x] **Magnus** Fix num diff functionality, parallelize num diff and run for at least 3rd order (NOT PARALLELIZED YET)
- [ ] **Rafael** Make interface to request specific tensor components by allowing the `D3_derivatives` to accept a tuple of indices.
- [ ] **Radovan**: profile the Python code, Roberto uses results to inform optimization effort
- Based on https://gist.github.com/robertodr/b482cd51389217863005866c9b68c040 but Radovan will try larger molecule
:::
### Notes from Radovan
A couple of different ways to run tasks in parallel without doing parallel programming: https://github.com/ResearchSoftwareHour/demo-parallel-tasks/blob/master/README.md
### Note from Roberto
- Computing derivatives is slow. I presume it's because we call a JAX-ified function many times in a loop: $3N$ times for the gradient, $(3N)^2$ for the Hessian, and so on (with $N$ the number of atoms). Some ideas for optimization:
- Profile. There are some suggestions on tools to use on the JAX website. I am not sure if these apply only to GPU profiling.
- We should refactor the `d3` function a little bit more to move the initialization of the coefficients read from the tables outside: these steps do not need to be JAX-ified. This should also help with readability of the code.
- Use just-in-time (JIT) compilation. In principle, the many calls all do the same thing: with JIT we pay a higher price for the first (few?) call(s) and a (much?) lower price for all the subsequent ones.
- If none of the above helps, I can ask a colleague for further suggestions.
- User-friendliness of the code is not stellar. If I ask for the $n$-th order derivative, I won't get all lower orders (because autodiff can target the $n$-th order directly). The `order` parameter to the front-end function should be made to accept:
- An integer, to mean "compute only the energy and the derivative of the given order".
- A string, for example `4-`, to mean "compute the energy and all derivatives up to and including the fourth".
- A list, for example `[1, 2, 4]`, to mean "compute the energy and the derivatives to the specified orders".
- A list of tuples, for example `[(1, 3, z)]`, to mean compute the first derivative of atom number 3 with respect to the z coordinate.
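
One possible way to normalize the proposed `order` argument, as a sketch: `parse_order` is a hypothetical helper, not existing `pyDFTD3` code. It covers the integer, string, and list forms from the list above; the list-of-tuples form for single elements is left out because its exact layout is not settled in these notes.

```python
def parse_order(order):
    """Turn the `order` argument into a sorted list of derivative orders."""
    if isinstance(order, bool):
        raise TypeError("order must be an int, a string like '4-', or a list of ints")
    if isinstance(order, int):
        # A bare integer: only that order (plus the energy).
        orders = [order]
    elif isinstance(order, str):
        # A string such as "4-": every order from 1 up to and including 4.
        if not order.endswith("-") or not order[:-1].isdigit():
            raise ValueError(f"cannot parse order string {order!r}")
        orders = list(range(1, int(order[:-1]) + 1))
    elif isinstance(order, (list, tuple)):
        # An explicit list such as [1, 2, 4]: exactly those orders.
        if not all(isinstance(o, int) and not isinstance(o, bool) for o in order):
            raise TypeError("a list of orders must contain only integers")
        orders = sorted(set(order))
    else:
        raise TypeError("order must be an int, a string like '4-', or a list of ints")
    if not orders or any(o < 1 for o in orders):
        raise ValueError("derivative orders must be positive integers")
    return orders


print(parse_order(3))          # [3]
print(parse_order("4-"))       # [1, 2, 3, 4]
print(parse_order([4, 1, 2]))  # [1, 2, 4]
```

Normalizing to a plain list early keeps the front-end function simple: everything downstream only has to loop over the requested orders.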
:::info
### TODOs for next meeting
- [ ] **Magnus** Follow up with Kenneth about travel costs
- [x] **Rafael and Roberto** With PyDFTD3 returning only a single value (not contributions separately), get a JAXified gradient of the D3 contributions
- [x] **Rafael and Roberto** Tell Magnus for num diff testing
- [ ] **Rafael and Roberto** Proceed to look at the Hessian. If there are errors, put it on a separate branch and ping Radovan for him to look at it
:::
## 2021-03-02
:::info
### TODOs for next meeting
- [ ] **Magnus** Make num diff using numerical script once D3 autodiff derivatives are implemented - but before that, make num diff results for just the gradient connecting directly to the pyDFTD3 functionality
- [ ] **Rafael and Roberto** Continue implementing autodiff of D3, tell Magnus once this is ready (will then connect to num diff code)
:::
## 2021-02-16
- [X] **Magnus** Try to look for existing code to get num diff reference data for D3 high-order derivatives (2nd order or higher) FOUND WAYS BUT DIDN'T IMPLEMENT ANYTHING YET
- [x] **Magnus** Give Rafael access to cluster IN MOTION
- [x] **Magnus** LSDalton tutorial for Rafael
- [x] **Magnus and Rafael**: We plan the travel as if it will happen (keeping in mind the possible restrictions) IN MOTION
- [x] **Rafael** implement Radovan's idea to avoid recursion; *i.e.*, generate all combinations of derivatives up to a given order up front. This can be combined with filtering to avoid computing the whole tensor when we're interested in just a fraction of its elements. See Zulip chat for details.
- [x] **Rafael** start work on the [GitHub tracking issue](https://github.com/Rafael-G-C/pyDFTD3/issues/5) to make pyDFTD3 differentiable with JAX.
- [x] **Magnus and Rafael**: install Singularity (this is how Radovan installs it: https://github.com/bast/til/blob/master/containers/singularity-installation.md; requires installing "go" using your package manager. On Debian-like: `sudo apt-get install golang`). Test it with `singularity --version`. Also try: `singularity pull --name hello-world.sif shub://vsoch/hello-world` and then `singularity exec hello-world.sif cat /etc/os-release`. Compare the output with `cat /etc/os-release`
- [x] **Radovan**: prepare a Singularity recipe file and send calendar invite for Feb 1, 16:00 CET and there show how to work with it
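
The idea of generating all derivative combinations up front, instead of recursing, can be sketched with the standard library. This is an illustration under assumptions, not the code from the Zulip chat: `derivative_index_sets` and its `keep` filter are hypothetical names. Because derivative tensors of a smooth function are symmetric under index permutation, combinations with replacement enumerate each unique element once.

```python
from itertools import combinations_with_replacement


def derivative_index_sets(n_coords, max_order, keep=None):
    """Map each order to the unique index tuples of its derivative tensor.

    n_coords is the number of Cartesian coordinates (3N for N atoms).
    keep is an optional predicate used to retain only the wanted elements.
    """
    index_sets = {}
    for order in range(1, max_order + 1):
        combos = combinations_with_replacement(range(n_coords), order)
        index_sets[order] = [c for c in combos if keep is None or keep(c)]
    return index_sets


# Two atoms, i.e. six coordinates, up to third order.
all_sets = derivative_index_sets(6, 3)
print({order: len(combos) for order, combos in all_sets.items()})

# Filtering: only elements that involve coordinate 0.
only_x0 = derivative_index_sets(6, 2, keep=lambda c: 0 in c)
print(only_x0[2])
```

Each tuple names one element, for example `(0, 0, 4)` for the third derivative taken twice with respect to coordinate 0 and once with respect to coordinate 4.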
### Obtaining reference numbers for the gradient from LSDALTON
You can use these input files for a single-point evaluation of the molecular gradient.
```
**WAVE FUNCTIONS
.DFT
B3LYP
*DFT INPUT
.DFT-D3
*DENSOPT
.ARH DAVID
.CONVDYN
TIGHT
**RESPONS
*MOLGRA
*DIPOLE
*END OF INPUT
```
```
BASIS
STO-3G
Formic acid dimer
Distances in Angstrom
Atomtypes=3 Nosymmetry Angstrom
Charge=8.0 Atoms=4
O 1.25764 -1.09534 -0.03767
O 1.71416 1.07077 0.01796
O -1.35350 1.12953 -0.00726
O -1.61696 -1.06874 0.03430
Charge=6.0 Atoms=2
C 2.12399 -0.22275 0.00219
C -2.13901 0.18274 -0.01751
Charge=1.0 Atoms=4
H 3.22302 -0.31388 0.02838
H 0.74467 1.12248 -0.00573
H -3.24113 0.17485 -0.06355
H -0.64726 -1.03322 0.07411
```
LSDALTON gives the following D3 contribution `-0.005259449380`. The Python code gives `-0.005259458983232145`. I believe it will be OK to mix the values for higher derivatives from the Python code and the ones from LSDALTON :smiley: We can probably push the difference even further down by using atomic units in the LSDALTON `.mol` file.
Note that by default, the D3 contribution is **not** printed out explicitly. You can of course obtain it by subtracting all other terms from the total, but a quick workaround is to apply the following patch to LSDALTON to print it directly:
```diff=
diff --git a/src/LSint/II_dft_dftd.F90 b/src/LSint/II_dft_dftd.F90
index 0f1108d9..c49b59b0 100644
--- a/src/LSint/II_dft_dftd.F90
+++ b/src/LSint/II_dft_dftd.F90
@@ -178,12 +178,12 @@ CONTAINS
GRAD(3,IATOM) = GRAD(3,IATOM) + GRADFTD(ISCOOR+3)
END DO
- !WRITE(LUPRI,*)'DISPERSION CONTRIB TO MOL. GRAD.'
- !DO IATOM = 1, NATOMS
- ! ISCOOR = (IATOM-1)*3
- ! WRITE(LUPRI,'(I6,1X,3(F20.12))') IATOM,GRADFTD(ISCOOR+1), &
- !& GRADFTD(ISCOOR+2), GRADFTD(ISCOOR+3)
- !ENDDO
+ WRITE(LUPRI,*)'DISPERSION CONTRIB TO MOL. GRAD.'
+ DO IATOM = 1, NATOMS
+ ISCOOR = (IATOM-1)*3
+ WRITE(LUPRI,'(I6,1X,3(F20.12))') IATOM,GRADFTD(ISCOOR+1), &
+ & GRADFTD(ISCOOR+2), GRADFTD(ISCOOR+3)
+ ENDDO
END IF
call mem_dft_dealloc(C6AB)
```
This could be achieved in a cleaner way, by changing the whole printout routine for the gradient.
## 2021-01-28
:::warning
- Rafael should start running some LSDALTON calculations:
* Learn how to get the code and how to compile it.
* Learn how to run geometry optimizations.
* Learn how to extract data from the output.
- I think we should move the `pyDFTD3` fork either under the `openrsp` or the `dev-cafe` GitHub organizations.
- I have opened an [issue](https://github.com/Rafael-G-C/pyDFTD3/issues/5) to keep track of the missing bits to make the D3 code differentiable with JAX.
:::
:::info
### TODOs for next meeting
This is a copy-paste from the Zulip chat.
- [x] **Rafael** implement 3rd derivatives with JAX. These we will test against reference results computed elsewhere.
- [X] **Roberto** compile Magnus' Fortran code and get reference results up to 4th order derivatives.
- [x] **Roberto** tweak our pyDFTD3 fork to accept an input file in JSON format as input (see below for schema)
:::
## 2020-12-23 :santa:
- Roberto prepared a [Deepnote notebook](https://deepnote.com/project/0acf5b58-5908-4ab9-8a21-8ab28b67da70) showing how to compute derivatives to any order with respect to each single variable separately.
- Rafael will work on a wrapper around the `derv` function Radovan wrote (copy-pasted from `xcauto`) to obtain all unique elements of gradient and Hessian.
- Next meeting sometime in January. Roberto will ask around for availability after *January 6th*.
## 2020-11-11
Status: Testing for Hessian is done, works for the two-atom case. Next step is polyatomic for all of energy, gradient, Hessian.
:::info
Follow-up actions:
- The 3 R's will reschedule for next week for pull request tutorial.
- Roberto: Ask Radovan about potential for recursion to general geometrical orders.
:::
:::warning
**To consider**
- Rafael asked how we are going to populate the arrays of atomic coordinates and atomic charges. I think we could use JSON format, consistently with [QCSchema](https://molssi-qc-schema.readthedocs.io/en/latest/auto_topology.html):
```json
{
"molecule": {
"geometry": array[float], # (3 * nat, ) vector of XYZ coordinates [a0] of the atoms.
"symbols": array[string], # (nat, ) atom symbols in title case.
"molecular_charge": integer, # The overall charge of the molecule.
"molecular_multiplicity": integer, # The overall multiplicity of the molecule.
"atomic_numbers": array[integer] # (nat, ) atomic numbers, nuclear charge for atoms.
},
"model": {
"method": string, # name of the DFT functional
"basis": string # name of the basis
}
}
```
It's easy to read in JSON files in Python. This will require some work to get some of LSDALTON's output to JSON with [this library](https://github.com/jacobwilliams/json-fortran).
:::
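
A small sketch of reading such an input with Python's standard `json` module. The example content (an H2 molecule with a made-up geometry) and the `read_molecule` helper are assumptions for illustration; only the key names follow the schema quoted above.

```python
import json

# Made-up example input; the keys follow the schema quoted above.
EXAMPLE_INPUT = """
{
  "molecule": {
    "geometry": [0.0, 0.0, 0.0, 0.0, 0.0, 1.4],
    "symbols": ["H", "H"],
    "molecular_charge": 0,
    "molecular_multiplicity": 1,
    "atomic_numbers": [1, 1]
  },
  "model": {
    "method": "b3lyp",
    "basis": "sto-3g"
  }
}
"""


def read_molecule(text):
    """Return (atomic numbers, per-atom coordinates, functional name)."""
    data = json.loads(text)
    molecule = data["molecule"]
    geometry = molecule["geometry"]
    numbers = molecule["atomic_numbers"]
    if len(geometry) != 3 * len(numbers):
        raise ValueError("geometry must hold three coordinates per atom")
    # Reshape the flat (3 * nat,) vector into one (x, y, z) tuple per atom.
    coordinates = [tuple(geometry[3 * i:3 * i + 3]) for i in range(len(numbers))]
    return numbers, coordinates, data["model"]["method"]


numbers, coordinates, functional = read_molecule(EXAMPLE_INPUT)
print(numbers, coordinates, functional)
```

For a file on disk, `json.load` on an open file handle does the same job as `json.loads` on a string.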
## 2020-10-28
Link to videomeeting venue:
https://gather.town/app/eW82AYYFFNhw7gqW/My%20Home%20Space
:::danger
Pre-meeting notes:
- Nuclear repulsion working for 2 atoms, tested up to Hessian.
However, the function is not ideal because it takes 6 scalars for the positions of the two nuclei. This solution will not scale to $N$ nuclei.
- **TODO** How to pass differentiation variables?
- **TODO** Compute nuclear repulsion energy for $N$ nuclei:
$$
E_{\mathrm{NN}} = \sum_{A<B} \frac{Z_{A}Z_{B}}{R_{AB}}.
$$
- **TODO** Compute gradient and Hessian for the $E_{\mathrm{NN}}$ and check they are correct.
:::
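
A sketch of how the $N$-nuclei version might look in JAX, assuming the coordinates are passed as one `(N, 3)` array instead of six scalars; that also answers the question of how to pass the differentiation variables, since `jax.grad` and `jax.hessian` then differentiate with respect to the whole array. The function name and the three-atom geometry are made up for illustration. Each pair of nuclei is counted once.

```python
import jax
import jax.numpy as jnp

jax.config.update("jax_enable_x64", True)


def nuclear_repulsion(coords, charges):
    """Sum of Z_A * Z_B / R_AB over unique pairs; coords has shape (N, 3), in bohr."""
    n_atoms = coords.shape[0]
    # Index pairs with A < B, so each pair appears once and A == B never does.
    a, b = jnp.triu_indices(n_atoms, k=1)
    distances = jnp.linalg.norm(coords[a] - coords[b], axis=1)
    return jnp.sum(charges[a] * charges[b] / distances)


# Made-up three-atom geometry (bohr) and nuclear charges.
coords = jnp.array([[0.0, 0.0, 0.0],
                    [0.0, 1.4, 1.1],
                    [0.0, -1.4, 1.1]])
charges = jnp.array([8.0, 1.0, 1.0])

energy = nuclear_repulsion(coords, charges)
gradient = jax.grad(nuclear_repulsion)(coords, charges)    # shape (N, 3)
hessian = jax.hessian(nuclear_repulsion)(coords, charges)  # shape (N, 3, N, 3)
print(energy, gradient.shape, hessian.shape)
```

One quick correctness check that needs no reference data: the energy is translationally invariant, so the gradient summed over atoms should vanish.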
## 2020-10-07
1. Update since last meeting
* Rafael: Did tests advised by Roberto
Follow-up:
- Roberto and Rafael: Try first to get Jax to calculate gradient (as vectorial quantity) of ~~electron~~ nuclear repulsion, if successful try to calculate higher-order derivatives
- Rafael: Make update of new tasks in Zulip
- Roberto: Look at pull request
- Magnus: Set up next meeting (include Radovan if available)
## 2020-09-23
1. Update since last meeting
* Magnus: Did all points except HPC access
* Rafael: Working on getting familiar with Jax, follow up with Roberto. Handling pull request with Radovan.
:::info
Follow-up actions:
- Rafael: document the steps for running the test of the pyDFTD3 code.
- Radovan: pick up the pull request (PR) so that we get the testing scripted and connected with github actions
- Roberto: help with JAX for multivariate functions (what's the input, what's the output)
- Roberto: forward email from Rob Paton to everyone. It has some useful info on the code.
- Magnus: Q&A session on response theory. Tentatively scheduled for 1st week of October.
- Magnus: Set up next meeting, two weeks from now, start half an hour earlier (7:30 AM for Rafael; 14:30 for Magnus & Roberto). Combine with rsp theory lecture.
- Radovan: update fork to get the license in
- Radovan: add info to readme about customizations they applied in upstream repo, after email forward
:::
## 2020-09-09
1. Round of introductions
2. Look at the plan: https://docs.google.com/document/d/12OhX2XkOYUpbyrFgH_gxYODFmBwvUNCNHwQuHJn0C5g/
* Lit-review
* `pyDFTD3` Rafael made a fork of the repo: https://github.com/Rafael-G-C/pyDFTD3
* TODO for Rafael: learn how to work with pull requests on GitHub. Should contact Radovan for help.
* Rafael changed the code to read in functional and geometry in plain text format. TODO for Rafael: add a test.
* TODO for Roberto: supervise the autodiff portion of the project.
* TODO for Roberto: Settle the license for `pyDFTD3`
* TODO for Magnus: check whether DALTON or LSDALTON does have the gradient with D3 contributions for the geometry optimization.
* Yes, it appears that at least LSDalton has got it: See src/LSint/II_dft_dftd.F90 - subroutine DFTD3_GRAD
* TODO for Magnus: Get Rafael access to a Norwegian HPC cluster
* TODO for Magnus: Find out about travel funding for Rafael DONE
* TODO for all (?) lit-review for systems that could be interesting to run calculations on.
* TODO for Magnus: Set up meeting two weeks from now, same time, send formal calendar invite DONE
### Using JAX
Radovan suggests starting with a "sandbox". Define a function of very many variables and test how JAX deals with it:
* Does it work?
* Does it give the correct result?
* How does the output look?
Other suggestions on how to get started:
* JAX works with functions, so if `pyDFTD3` works with classes, there will need to be a wrapper around.
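
A possible sandbox along these lines, as a sketch: `ToyModel` is a hypothetical class standing in for class-based code, and `energy_fn` is the plain-function wrapper that JAX differentiates. The test function has a known analytic gradient, which answers the "does it give the correct result?" question directly.

```python
import jax
import jax.numpy as jnp

jax.config.update("jax_enable_x64", True)


class ToyModel:
    """Hypothetical class-based code, standing in for a class-based interface."""

    def __init__(self, scale):
        self.scale = scale

    def energy(self, x):
        return self.scale * jnp.sum(jnp.sin(x) * x ** 2)


def energy_fn(x, scale=0.5):
    # Plain-function wrapper: array in, scalar out, which is what jax.grad expects.
    return ToyModel(scale).energy(x)


# A function of 1000 variables.
x = jnp.linspace(0.1, 10.0, 1000)
grad = jax.grad(energy_fn)(x)

# Analytic gradient of 0.5 * sum(sin(x) * x**2) for comparison.
analytic = 0.5 * (jnp.cos(x) * x ** 2 + 2.0 * x * jnp.sin(x))
max_error = float(jnp.max(jnp.abs(grad - analytic)))
print(grad.shape, max_error)
```

The output of `jax.grad` has the same shape as the input array, so a function of 1000 variables returns a gradient with 1000 entries.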