Design Document

# Design Document ## Code Structure ![](https://i.imgur.com/mhRe3TE.png) *Figure 1: Block chart demonstrating the module dependencies of the backend Fortran program. The orange circles label the module calling order by the the main program.* The Fortran main program at the core of the software will depend on five Fortran modules, as depicted in Fig.1 above. Of these, the "Shared Data" module will not be called directly by the "Main". Rather, it will contain declarations for the global variables that need to be accessed by all the other modules. It will also define a Derived Type to represent the particles moved in the Metropolis Monte Carlo simulation (see Sec.2.1 below for algorithmic details). This Type will store the nature (ie. electron or proton) and the position of the particle it describes. Such a Derived Type is necessary when we implement the extension of relaxing the Born-Oppenheimer approximation (BO), as the electron and proton positions need to be sampled from Gaussians of different standard deviations. The first module used by the Fortran "Main" is the "Parse Inputs" module. The functions in this module will read in the text file with user inputs, parse this data, and assign the input data to the corresponding variables declared in "Shared Data". Next, the "Main" will use the "Assign Problem" module, which will contain the functions for local energy $E_L$ and transition probability ratio $P_t=\frac{P(x_{i+1})}{P(x)}$ for each problem that the software is expected to solve. The expressions for these functions are in Sec.2.2 below. This module will include a function "assign_problem()" that returns pointers to the $E_L$ and $P_t$ functions corresponding to the problem specified by the user. This function will also determine the number of parameters, electrons, protons, and movable particles (depending on whether BO is used) of the user-specified problem. Having set up the problem, the "Main" moves on to the "Run VQMC" module, which will primarily contain two functions. The first is "run_mmc()" that executes Metropolis Monte Carlo, the central algorithm of the software that is detailed in Sec.2.1 below. The second is an envelope function "run_vqmc()" that determines the grid of parameters to be optimized over, and calls run_mmc() for each combination of parameters on the grid. The “Run VMC” module will also have two additional functions that perform form Thermalization and Resampling, which are two data-processing methods that are elaborated in Sec.2.1 below. These functions will allow the developers to determine the optimal equilibration step value and resampling step size respectively, for each problem in the software. These optimized values will then be hard coded into the data processing, so the functions will not be necessary when the software is released. They will however be left in the module in case the user wishes to cross-check the hard coded values, or adapt the code for new problems beyond this software’s specification. Finally, the last module in the calling order is the "Write NetCDF" module, which will contain functions that write out the result arrays from the VQMC and the final optimized parameters, to a NetCDF file. An additional module "Test Data" will be included, which will contain the inputs and expected outputs for test problems. This can be used to check the whole code. If the user input specifies the problem type to be 'test', the main program will read the test inputs from the "Test Data" module, run through the calling order of the program, and then check the results against the expected test outputs stored in the "Test Data" module. The user will be notified about the outcome of the test in the stdout and logfile. The Fortran backend code shown in Fig. 1 will need to be used with Python frontend codes that handle pre- and post-formatting. The zoomed-out calling order of the scripts is as shown in Fig.2 below. The Python "Input Generator" will guide the user through generating the input text file in a format that can be processed by the Fortran "Parse Input" module. The Python "Output Visualizer" will read the NetCDF output file written by the Fortran’s “Write NetCDF” module, and will visualize this data for the user. ![](https://i.imgur.com/rAse2Wv.png) *Figure 2: Block chart demonstrating the calling order of the software's scripts.* ## Algorithms The code will consist of two main algorithmic parts. The first is a Metropolis Monte Carlo algorithm, defined by the 'Run VQMC' module above, that evaluates the overall energy of a given wavefunction at specified parameters. The second is a problem-dependent quantum solver, defined in the 'Assign Problem' module above, that is capable of calculating transition probabilities to be used in the Monte Carlo algorithm and local energies that will be averaged over to get the aforementioned overall energy. ### Metropolis Monte Carlo Algorithm A generic Metropolis Monte Carlo algorithm draws samples from a target distribution $P(x)$ by following a random walk. The algorithm can be described in succinct steps: - Initialise a starting point for the random walk $x_0$ - For each step in the chain, propose a change to the configuration $x_i \rightarrow x_{i+1}$ - Calculate the acceptance probability of this transition $$ a = min\left(1 , \frac{P(x_{i+1})}{P(x)} \right) $$ - Draw from a uniform distribution $u = U(0,1)$ and: -- if $u < a$ accept the transition -- if $u > a$ reject the transition, set $x_{i+1} = x_i$ - Calculate observables (position, energy, ...) The output of this algorithm is a chain of points $({x_0 , x_1 , ... , x_n})$ exploring the target distribution $P(x)$ , and lists of observable quantities from the chain. #### Acceptance Rate The acceptance rate (how often transitions to new points are accepted in the chain) for this algorithm must be tuned carefully. A low acceptance rate will not explore the target space, whereas a high acceptance rate will ignore the target space entirely. This rate can be effectively tuned by the size of the proposed change $x_i \rightarrow x_{i+1}$ , and we aim for an acceptance rate around the theoretically optimal value of approximately 30%. #### Thermalisation and Resampling Initial points in MCMC walk may not have thermalised around the correct value - it is entirely possible that the initial configuration is far from the average. Removal of the first points in the chain will give more accurate results. Subsequent steps in the Metropolis MCMC are correlated and therefore not independent. It is required to resample the points in the chain into independent points. This can be done by calculating the correlation time $t_{corr}$ within the chain, and then thinning or resampling the data so that statistics or further analyses are performed on a new set of points, which would be separated by $t_{corr}$ in the original chain. To find $t_{corr}$ we compute the autocorrelation function for an observable $Z_{i}$ measured at the $i^{th}$ point in the chain, with mean $\mu$ for varying lag length $k$ $$ A(k) = \sigma^{-2} (Z_{i} - \mu)(Z_{i+k} - \mu) $$ where $\sigma$ is the standard deviation of the observable distribution, and then search for the value of $k=t_{corr}$ such that $A(t_{corr}) \approx 0$. Once $t_{corr}$ is found, the data is thinned so that one in every $t_{corr}$ values is 'kept', and observables are calculated only for those values. An initial run for each problem will be performed to find the correct number of points to burn to ensure thermalisation, and the correlation length for thinning. These will be passed as constant values to the full simulation. ### Quantum Solvers #### Quantum Harmonic Oscillator The first system to be implemented within the code will be a quantum harmonic oscillator. This system has a simple analytical solution for both the wavefunction and the energy, and so will be useful as an early test against analytic results to check if the VMC algorithm is working properly. The analytic solution of this problem (in atomic units) takes the form: $$ \psi_{0}(x)=e^{-\frac{x^{2}}{2}} $$ with Hamiltonian: $$ \hat{H}=-\frac{1}{2}\frac{d^{2}}{dx^{2}}+\frac{1}{2}x^{2} $$ and ground state energy: $$ E_{0}=\frac{1}{2} $$ A trial wavefunction can therefore be suggested: $$ \psi^{\alpha}_{T}(x)=e^{-\alpha x^{2}} $$ The local energy $E^{\alpha}_{L}(x)$ is evaluated to be $$ E^{\alpha}_{L}(x)=\frac{\hat{H}\psi^{\alpha}_{T}(x)}{\psi^{\alpha}_{T}(x)}=\alpha+x^{2}\left ( \frac{1}{2}-2\alpha^{2} \right ) $$ and the probability distribution to be used in the Metropolis algorithm is $$ \frac{P(x_{i+1})}{P(x)} = \frac{\psi^{\alpha}_{T}(x_{i+1})}{\psi^{\alpha}_{T}(x)}=e^{-2\alpha(x^{2}_{i+1}-x^{2}_{i})} $$ By iterating over walkers and updating positions with the specified probability, the energy of the trial wavefunction can be evaluated. The ground state wavefunction can then be found by varying $\alpha$ until a minimum energy is found. The code should evaluate $\alpha$ to be equal to 0.5, in agreement with the analytic solution of the problem. #### $H_{2}^{+}$ This problem requires that a single electron at position $\mathbf{r}$ be interacting with two nuclei, $A$ and $B$, frozen at positions $\mathbf{R}_{A}$ and $\mathbf{R}_{B}$ respectively. The Hamiltonian of this system is as follows: $$ \hat{H}=-\frac{1}{2}\nabla^{2} - \frac{1}{\left | \mathbf{r}-\mathbf{R}_{A} \right |} - \frac{1}{\left | \mathbf{r}-\mathbf{R}_{B} \right |} + \frac{1}{\left | \mathbf{R}_{A}-\mathbf{R}_{B} \right |} $$ The simplest way of forming a trial wavefunction for this system is to take a linear combination of two hydrogenic $1s$ orbitals: $$ \psi^{c}_{T}(\mathbf{r})=c\phi_{1s}^{A}(\mathbf{r}) + \sqrt{1-c^{2}}\phi_{1s}^{B}(\mathbf{r}) $$ where $\phi_{1s}^{A}(\mathbf{r})$ and $\phi_{1s}^{B}(\mathbf{r})$ take the form $$ \phi_{1s}^{A}(\mathbf{r})=\frac{1}{\sqrt{\pi}}e^{-\left |\mathbf{r-\mathbf{R}_{A}} \right |} $$ From this, formulae for the local energy $E_{L}^{c}(\mathbf{r})$ and the acceptance probability $a$ can be calculated and implemented directly as functions to be called by the Monte Carlo algorithm. If differentiability of the hydrogenic wavefunctions becomes a problem, it may be better to approximate the exponential by a Taylor expansion in a polynomial basis. This will be investigated as the code is implemented. #### $H_{2}$ This problem requires that two electron at positions $\mathbf{r}_{1}$ and $\mathbf{r}_{2}$ interact with two nuclei, $A$ and $B$, frozen at positions $\mathbf{R}_{A}$ and $\mathbf{R}_{B}$ respectively. The Hamiltonian of this system is as follows: $$ \hat{H}=-\frac{1}{2}\nabla^{2} -\frac{1}{\left | \mathbf{r}_{1}-\mathbf{R}_{A} \right |} - \frac{1}{\left | \mathbf{r}_{1}-\mathbf{R}_{B} \right |} - \frac{1}{\left | \mathbf{r}_{2}-\mathbf{R}_{A} \right |} - \frac{1}{\left | \mathbf{r}_{2}-\mathbf{R}_{B} \right |} + \frac{1}{\left | \mathbf{r}_{1}-\mathbf{r}_{2} \right |} $$ In order to satisfy antisymmetry under exchange, a simple trial wavefunction for this problem is $$ \Psi_{T}(\mathbf{r}_1, \mathbf{r}_{2}) = \phi(\mathbf{r}_{1}) \phi(\mathbf{r}_{2}) \psi(\mathbf{r}_1, \mathbf{r}_{2}) $$ where $$ \phi(\mathbf{r}_{1}) = e^{-\frac{\left | \mathbf{r}_{1A} \right |}{a}} + e^{-\frac{\left | \mathbf{r}_{1B} \right |}{a}} = \phi_{1A} + \phi_{1B} $$ $$ \phi(\mathbf{r}_{2}) = e^{-\frac{\left | \mathbf{r}_{2A} \right |}{a}} + e^{-\frac{\left | \mathbf{r}_{2B} \right |}{a}} = \phi_{2A} + \phi_{2B} $$ and $\psi(\mathbf{r}_1, \mathbf{r}_{2})$ is the Jastrow function with the form $$ \psi(\mathbf{r}_1, \mathbf{r}_{2}) = e^{\frac{\left | \mathbf{r}_{12} \right |}{\alpha \left ( 1 + \beta \left | \mathbf{r}_{12} \right | \right )}} $$ Again, $E_{L}^{c}(\mathbf{r})$ and $a$ can be evaluated from this point to be called directly from the Monte Carlo algorithm. ## Parallelism methods Parallelising the code will be straightforward as each walker has to be updated independently. This naturally lends itself to MPI parallelism, with each MPI rank taking responsibility for a subset of the total number of walkers. To ensure that the MPI ranks generate independent walker trajectories, the random number generator on each rank will be given a rank-dependent seed. For every Monte Carlo step, each MPI rank will find the sum of the local energies of its walkers. When all ranks finish executing the Monte Carlo steps, a MPI-Reduce-Sum operation will be used compile the data in the MPI rank 0. This compiled result is a 1D array, with each array entry containing the sum of the local energies of all walkers at the corresponding Monte Carlo step. The subsequent operations on this array are performed only on MPI rank 0. These operation comprise dividing the array entries by the total number of walkers to get average local energies, resampling the array, finding its mean and variance, and writing the final results to a NetCDT file. If performance-intensive loops are identified within the code, OpenMP parallelism may be implemented to enable a hybrid MPI/OpenMP code. As OpenMP will need to fork and merge threads for every Monte Carlo step, this is likely to lead to significant overhead. However, it could provide an overall performance increase if used correctly. Benchmarking will be undertaken across all implemented problems to identify the correct division of processors between MPI and OpenMP parts of the code, if this hybrid approach gets taken. ## Error Handling In order to preempt any compilation-specific errors with the code, a module of unit testing functions will be written for the Fortran backend code. These functions will operate on each of the main functions of the code with set inputs, and ensure that the expected results are returned by each function. These tests will be callable by setting the problem defined in the input file to a specific test value, which will be detailed in the documentation and the README. The quantum harmonic oscillator system will be usable as a regression test, as it is a simple enough system that if the random number generators are given a fixed seed, the same results should be achievable given the same input values. To this end, a further argument to the input problem will be accepted that sets the range of $\alpha$ to be tested across to a predefined value and fixes the random number generator seed. This will then produce a known result, which will be comparable to an example given in the documentation. In order to eliminate errors due to unsanitised inputs, the Python frontend script handling input file generation will enforce that the user select variables that should be specific (eg. the quantum problem being solved, as there will only be a fixed set of strings accepted by the backend) from a list of allowed values. It will also guide the user through input creation in an ordered manner, to ensure that the structure of the input file is generated correctly. In the event of an error within the backend code, the program will print a basic error message to stdout, alongside an error code. The code will then stop, to avoid further propagation of this error. Each error code will be explained in the documentation, with possible causes and steps to correct the error also supplied. ## Error Messages Everyone would agree that the best kind of error message is the kind that never shows up. However, regardless of how well-designed a software package is, errors are always unavoidable. When users interact with a software package, eventually, something is bound to go wrong. The error message is a critical component of a good user experience. One way for users to evaluate the quality of a software package is by judging the error messages’ quality. Confusing and non-informative error messages are annoying to the user. This is why most developers would agree that good error message can increase the speed of software package learning and increase users’ subjective satisfaction in the process. ### Clarity Clarity is the top priority when writing error messages. Developers need to describe clearly what had happened, why it happened, and what the user can do to resolve the errors. Hence, developers should always write the error message in plain language to ensure the users could easily understand both the error and the solution. Developers should never write abstract error messages as they do not contain adequate information regarding the problems. It is no good to state that something went wrong in the software package but does not help users understand its root cause. As a rule, developers should never assume people will fully understand the context of an error message. Hence, it is a good practice to always be explicit and indicate what exactly has gone wrong in the error messages. Error messages that contain technical terminologies will certainly confuse most users. The error message should always describe the errors in terms of users’ actions. Never assume the users are tech-savvy. Hence, it is also a good practice to use non-technical terminologies that can easily understand by everyone. ### Be Concise and Precise The majority of users rarely read technical software documents word-by-word. The more text on a page, the more likely the users will not read the text at all. This is why it is vital to reduce the text down, including error messages. Developers’ goal is to write concise but precise error messages. Error messages should enable users to resolve errors efficiently by only containing the information required to help users. Hence, developers should eliminate unnecessary details that do not help the users achieve this objective. ### Never Blame Users Tones play a tremendous impact on how users comprehend the error message. Developers should adopt a more friendly tone to help users relate better to a situation and understand the errors. It is essential to avoid phrases that involve second-person pronouns. Poorly written error messages often accuse the user of making errors. Errors are frustrating enough, and there is no need to add to the frustration with unintended judgment from the developers. Developers need to acknowledge that these error messages are essential to communicate and build relationships with the users. Remember never to use upper case text and exclamation mark in the error messages, as they both will contribute to an awful user experience. Bad error message example: ` ERROR: You have entered an incorrect login or password! ` Friendly error message example: ` Error: Oops, login and password do not match. :-( ` ### Error Messages Examples Group A designed its error messages based on the error message design guideline in the previous three sub-sections. Here are some actual codes that were taken within the software package to be illustrated as examples: ``` Error: Oops, your response is invalid, please enter "w" or "c", without quotes. :-( Please have another go. :-) ``` ``` Error: Oops, there is no param.txt file in the current working directory. :-( Please have another go. :-) ``` ## Building & Distribution The Fortran source code will be hosted on a GitHub repository. A Makefile will be provided alongside this for easy compilation of this code. The Python scripts for input file generation and output results visualization will also be included in the GitHub repository. A README file will be present as well, to guide the user through the installation process and direct them on to further documentation. ### Libraries For the Fortran backend, it is anticipated that only two libraries will be required - MPI and NetCDF. Appropriate directions for installation of these libraries will be provided within the README file, alongside the versions tested during development so the user can be assured that these versions are functional with the code. For the Python frontend, the following packages will need to be installed: * NumPy * MatPlotLib * NetCDF (depends on HDF5) A `setup.py` script will be provided, with these packages listed in a `requirements.txt` file. When run, this script will take advantage of Python's `setuptools` package to automatically install the packages via `pip`, so that the user's Python environment will be ready to support the frontend scripts. ## Documentation ### Code Documentation Developer-friendly documentation covering the composite functions of the driver code and the output visualisation code will be produced using Doxygen. Information on the benchmark tests for said functions will also be included. These tests comprise the expected input(s) to each individual function, details of the processes, and the output data’s shape. Doxygen is free software that can generate high-quality documentation based on the source files. Moreover, its add-on extensions will allow users to integrate it into Visual Studio IDE or generate code diagrams. Doxygen is compatible with both the languages used in this software - Python and Fortran. To aid Doxygen, developers will write detailed comments in their codes. ### Help Files Detailed help files will be developed and should act as tutorials for the end-user. These help files will be aimed towards users with a postgraduate level of knowldge in a similar field. A formative tutorial exercise will also be included, that will guide the users through the steps. ## Collaboration mechanisms Given today's virtual work environment, Group A has agreed to collaborate over multiple online platforms to enable communication and maximize work efficiency. Slack will connect group members either in the office or on the go. With Slack, each group member can send draft emails, draft codes, share files, and clarify quick questions with their group mates. Another valuable feature of Slack is its ability to create polls. Polls facilitate effective scheduling and compilation of opinions across the group. Microsoft One Drive will be used to store and co-develope schedule spreadsheets (on MS Excel), presentation slides (on MS Powerpoint) and text documents (on MS Word). For typeset documents involving equations, HackMD will be used for collaborative document development. Microsoft Teams is will be used for group meetings. These virtual group meetings will be scheduled atleast biweekly during the five-week software development period. Lastly and most importantly, Github will be used to host the software developement phases. GitHub is an excellent tool for eliminating software version issues by keeping track of all the changes that have been pushed into the repository. Every group member will have access to a version history of the codes; hence, previous versions will not be lost when changes are made. Github also allows for branching of the code, which will be useful for part of the team to trial extended preliminary studies or new extensions on branches, while others focus on the stable master version of the code.